quartz/content/BigData/Big Data Intro.md
2025-07-23 20:36:04 +03:00

##### Data vs. Information
- **Data** is just raw facts (like the number 42).
- But 42 could mean: age, shoe size, stock amount, etc.
- **Information** is data that has been given meaning through context.
- Example: “Age = 42” gives context and becomes useful.
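
The distinction above can be sketched in a few lines of Python. This is only an illustration: the `Measurement` class and its `field` and `unit` attributes are hypothetical names chosen for this example, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """A raw value plus the context that turns it into information."""
    value: int
    field: str  # what the number describes (hypothetical label)
    unit: str   # how the number should be read


def describe(m: Measurement) -> str:
    """Render a measurement together with its context."""
    return f"{m.field} = {m.value} {m.unit}"


raw = 42  # data: a bare fact with no meaning on its own

# The same raw value becomes two different pieces of information.
age = Measurement(value=raw, field="age", unit="years")
stock = Measurement(value=raw, field="stock amount", unit="items")

print(describe(age))
print(describe(stock))
```

The raw value is identical in both cases; only the attached context differs, and that context is what makes the value useful.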
##### Big Data implementations
- **Delta**: *Sentiment analysis* (e.g., of customer feedback).
- **Netflix**: *User behavioral analysis* (e.g., what you watch and when).
- **Time Warner**: *Customer segmentation* (dividing customers into groups).
- **Volkswagen**: *Predictive support* (e.g., predicting car issues).
- **Visa**: *Fraud detection*.
- **Chinese government**: *Security intelligence* (national security).
- **Weather forecasting**: *Weather prediction models* for forecasting the weather.
- **Hospitals**: Diagnosing diseases using *machine learning* on images.
- **Amazon**: *Price optimization*.
- **Facebook**: Targeted advertising using *user profiling*.
##### Design Principles for Big Data
1. **Horizontal Growth**: Add more machines instead of stronger ones.
2. **Distributed Processing**: Split work across machines.
3. **Process where Data is**: Don't move data, move the code.
4. **Simplicity of Code**: Keep logic understandable.
5. **Recover from Failures**: Systems should self-heal.
6. **Idempotency**: Running the same job twice shouldn't break results.
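
Two of the principles above, distributed processing and idempotency, can be sketched on a single machine. This is a minimal illustration under stated assumptions: worker threads stand in for separate machines, a plain `dict` stands in for a result store, and `count_words`, `run_job`, and the job id are hypothetical names made up for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(partition: list[str]) -> Counter:
    """Work done on one partition (stands in for one machine)."""
    counts: Counter = Counter()
    for line in partition:
        counts.update(line.split())
    return counts


def run_job(partitions: list[list[str]], job_id: str, store: dict) -> None:
    """Process partitions in parallel, then merge and save the result.

    The result is written under a fixed key (overwrite, not append),
    so running the job again leaves the store in the same state.
    """
    with ThreadPoolExecutor(max_workers=4) as pool:
        partials = list(pool.map(count_words, partitions))

    total: Counter = Counter()
    for partial in partials:
        total.update(partial)

    store[job_id] = dict(total)  # idempotent write


partitions = [["big data big"], ["data moves code"], ["code stays simple"]]
store: dict = {}

run_job(partitions, "wordcount-batch-1", store)
first = dict(store)

run_job(partitions, "wordcount-batch-1", store)  # rerun, e.g. after a failure
print(store == first)
```

Had `run_job` added its counts to whatever was already in the store, the second run would have doubled every count. Overwriting by key is one simple way to make a rerun safe.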
##### Big Data SLA (Service Level Agreement)
An SLA defines the performance expectations for a system:
- **Reliability**: Will the data be there?
- **Consistency**: Is the data accurate across systems?
- **Availability**: Is the system always accessible?
- **Freshness**: How up-to-date is the data?
- **Response time**: How fast do queries return?

Other concerns:
- **Cost**
- **Scalability**
- **Performance**
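
Some of the SLA questions above (freshness, response time, availability) can be expressed as measurable checks. The sketch below is an assumption-laden illustration, not a standard API: `SlaTargets`, `check_sla`, and all threshold values are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class SlaTargets:
    """Hypothetical SLA thresholds for a data system."""
    max_data_age: timedelta       # freshness
    max_response_time: timedelta  # response time
    min_availability: float       # fraction of successful health checks, 0..1


def check_sla(
    targets: SlaTargets,
    last_update: datetime,
    now: datetime,
    response_time: timedelta,
    checks_ok: int,
    checks_total: int,
) -> dict[str, bool]:
    """Return, per SLA dimension, whether the measured value meets its target."""
    availability = checks_ok / checks_total if checks_total else 0.0
    return {
        "freshness": now - last_update <= targets.max_data_age,
        "response_time": response_time <= targets.max_response_time,
        "availability": availability >= targets.min_availability,
    }


targets = SlaTargets(
    max_data_age=timedelta(hours=1),
    max_response_time=timedelta(seconds=2),
    min_availability=0.999,
)

now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
report = check_sla(
    targets,
    last_update=now - timedelta(minutes=90),  # data is 90 minutes old
    now=now,
    response_time=timedelta(milliseconds=800),
    checks_ok=9995,
    checks_total=10000,
)
print(report)
```

In this made-up measurement the data is older than the one-hour target, so the freshness check fails while response time and availability pass.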
> Next: [[Cloud Services]]