##### Data vs. Information
- **Data** is just raw facts (like the number 42).
	- But 42 could mean: age, shoe size, stock amount, etc.
- **Information** is data that has been given meaning.
	- Example: “Age = 42” gives context and becomes useful.

##### Big Data implementations
- **Delta** – *Sentiment analysis* (e.g., of customer feedback).
- **Netflix** – *User behavioral analysis* (e.g., what you watch and when).
- **Time Warner** – *Customer segmentation* (dividing customers into groups).
- **Volkswagen** – *Predictive support* (e.g., predicting car issues).
- **Visa** – *Fraud detection*.
- **Chinese government** – *Security intelligence* (national security).
- **Weather forecasting** – *Weather prediction models*.
- **Hospitals** – Diagnosing diseases using *machine learning* on images.
- **Amazon** – *Price optimization*.
- **Facebook** – Targeted advertising using *user profiling*.

##### Design Principles for Big Data
1. **Horizontal Growth** – Add more machines instead of stronger ones.
2. **Distributed Processing** – Split work across machines.
3. **Process where Data is** – Don’t move data, move the code.
4. **Simplicity of Code** – Keep logic understandable.
5. **Recover from Failures** – Systems should self-heal.
6. **Idempotency** – Running the same job twice should give the same result as running it once.

##### Big Data SLA (Service Level Agreement)
SLAs define performance expectations:
- **Reliability** – Will the data be there?
- **Consistency** – Is the data accurate across systems?
- **Availability** – Is the system always accessible?
- **Freshness** – How up-to-date is the data?
- **Response time** – How fast do queries return?

* Other concerns:
	- **Cost**
	- **Scalability**
	- **Performance**

> Next [[Cloud Services]]
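
The **Idempotency** principle listed under the design principles can be sketched in a few lines. This is only a minimal illustration, not how any particular Big Data system does it: the in-memory dict `store`, the function name `run_daily_job`, and the sample events are all hypothetical. The idea shown is that the job overwrites its own output for a given date instead of appending to it, so a rerun (e.g., after a failure) leaves the result unchanged.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory "table": (date, customer) -> total amount.
Store = Dict[Tuple[str, str], float]


def run_daily_job(store: Store, date: str, events: List[Tuple[str, float]]) -> None:
    """Aggregate one day's events and overwrite that day's output."""
    totals: Dict[str, float] = {}
    for customer, amount in events:
        totals[customer] = totals.get(customer, 0.0) + amount

    # Remove any earlier output for this date, then write the fresh totals.
    # Overwriting (rather than adding to existing values) is what makes
    # a second run harmless.
    for key in [k for k in store if k[0] == date]:
        del store[key]
    for customer, total in totals.items():
        store[(date, customer)] = total


if __name__ == "__main__":
    store: Store = {}
    events = [("alice", 10.0), ("bob", 5.0), ("alice", 2.5)]
    run_daily_job(store, "2024-01-01", events)
    run_daily_job(store, "2024-01-01", events)  # rerun: output is unchanged
    print(store)
```

A job that instead did `store[key] += total` would double the totals on every rerun, which is exactly the kind of breakage the principle warns about.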