quartz/content/BigData/Big Data Intro.md
2025-07-23 20:36:04 +03:00


Data vs. Information
  • Data is just raw facts (like the number 42).
    • On its own, 42 could mean age, shoe size, stock amount, etc.
  • Information is data that has been given meaning.
    • Example: “Age = 42” adds context, so the number becomes useful.
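
A minimal sketch of the same idea in Python. The `Measurement` class and its field names are made up for illustration; the point is only that the same raw number turns into different information once context is attached.

```python
from dataclasses import dataclass

raw_value = 42  # data: a bare fact, no meaning attached yet


@dataclass(frozen=True)
class Measurement:
    """A raw value plus the context that turns it into information."""
    meaning: str
    value: int
    unit: str


# The same raw number, given two different meanings.
age = Measurement(meaning="age", value=raw_value, unit="years")
stock = Measurement(meaning="stock amount", value=raw_value, unit="items")

print(age.value == stock.value)  # True: the data is identical
print(age == stock)              # False: the information is not
```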
Big Data Implementations
  • Delta: sentiment analysis (e.g., of customer feedback).
  • Netflix: user behavioral analysis (e.g., what you watch and when).
  • Time Warner: customer segmentation (dividing customers into groups).
  • Volkswagen: predictive support (e.g., predicting car issues).
  • Visa: fraud detection.
  • Chinese government: security intelligence (national security).
  • Weather forecasting: prediction models that forecast the weather.
  • Hospitals: diagnosing diseases using machine learning on images.
  • Amazon: price optimization.
  • Facebook: targeted advertising using user profiling.
Design Principles for Big Data
  1. Horizontal Growth: add more machines instead of stronger ones.
  2. Distributed Processing: split work across machines.
  3. Process Where the Data Is: don't move the data, move the code to it.
  4. Simplicity of Code: keep logic understandable.
  5. Recover from Failures: systems should self-heal.
  6. Idempotency: running the same job twice shouldn't change the results.
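
A rough sketch of principles 2, 3 and 6 in plain Python, using a single process to stand in for a cluster. The function names, the in-memory `store` dict and the job key `"wordcount-run-1"` are all hypothetical. Each chunk is counted "locally" and only the small partial result is merged, and the final result is written under a fixed key so a rerun overwrites it instead of adding to it.

```python
from collections import Counter


def partition(records, n_workers):
    """Split records into n_workers chunks (round-robin)."""
    return [records[i::n_workers] for i in range(n_workers)]


def count_words(chunk):
    """Work done next to one chunk; only the small Counter moves on."""
    return Counter(word for line in chunk for word in line.split())


def run_job(records, n_workers, store, job_id):
    """Idempotent job: the output is keyed by job_id and overwritten on rerun."""
    partials = [count_words(chunk) for chunk in partition(records, n_workers)]
    total = sum(partials, Counter())  # merge the partial results
    store[job_id] = dict(total)       # overwrite, never append
    return store


records = ["big data", "data lake", "big big data"]
store = {}

run_job(records, 2, store, "wordcount-run-1")
first_run = dict(store)

run_job(records, 2, store, "wordcount-run-1")  # same job, run twice
print(store == first_run)  # True: the rerun did not change the results
```

If the job appended its counts to the existing entry instead of overwriting it, the second run would double every count, which is exactly what idempotency is meant to prevent.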
Big Data SLA (Service Level Agreement)

An SLA defines the performance expectations for the system:

  • Reliability: will the data be there?
  • Consistency: is the data accurate across systems?
  • Availability: is the system always accessible?
  • Freshness: how up-to-date is the data?
  • Response time: how fast do queries return?
  • Other concerns:
    • Cost
    • Scalability
    • Performance
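
As an illustration, two of these dimensions (freshness and response time) can be checked against thresholds. The limits below (15 minutes, 2 seconds) and the `check_sla` helper are assumptions for the sketch, not values from any real agreement.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical SLA targets; real values come from the agreement itself.
MAX_STALENESS = timedelta(minutes=15)
MAX_RESPONSE = timedelta(seconds=2)


def check_sla(last_update, now, query_duration):
    """Report which of the two SLA dimensions are currently met."""
    return {
        "freshness": now - last_update <= MAX_STALENESS,
        "response_time": query_duration <= MAX_RESPONSE,
    }


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
report = check_sla(
    last_update=now - timedelta(minutes=20),  # data is 20 minutes old
    now=now,
    query_duration=timedelta(seconds=1),      # query took 1 second
)
print(report)  # {'freshness': False, 'response_time': True}
```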

Next: Cloud Services