[[Database History]]
[[RDBMS]] - Relational Models
[[Hadoop]]
##### **Big Data Challenges**
Examples of tasks that are simple to state but hard to run on very large datasets:
1. Count the **most frequent words** in Wikipedia.
2. Find the **hottest November** per country from weather data.
3. Find the **day with the most critical errors** in company logs.

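These problems share two traits:
- **Huge data**: the input is too large to store or process comfortably on a single machine.
- **Efficient distributed computing**: the work has to be split across many machines and the partial results combined.
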
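To make the first task concrete, here is a minimal sketch of word counting split into a "map" step (count each chunk on its own) and a "reduce" step (merge the partial counts). It runs on a single machine in plain Python and is not Hadoop's API. The `chunks` list and the function names `map_chunk` and `merge_counts` are hypothetical stand-ins for pieces of a large corpus and the per-piece work.

```python
from collections import Counter
from functools import reduce


def map_chunk(chunk: str) -> Counter:
    """Map step: count the words in one chunk of text."""
    return Counter(chunk.lower().split())


def merge_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge two partial counts into one."""
    return left + right


# Hypothetical toy data: each string stands in for a chunk of a huge corpus
# that would live on a different machine.
chunks = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "a dog and a cat",
]

# Each chunk is counted independently, so this step could run in parallel.
partial_counts = [map_chunk(chunk) for chunk in chunks]

# The partial results are then combined into a single total.
total = reduce(merge_counts, partial_counts, Counter())

top_words = total.most_common(2)
print(top_words)  # [('the', 4), ('cat', 3)]
```

The point of the split is that no step needs the whole dataset at once: each chunk is counted separately, and only the small partial counts are merged.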
#### [[RDBMS]] vs. [[Hadoop]]

| **Feature**    | **RDBMS**           | **Hadoop**                    |
| -------------- | ------------------- | ----------------------------- |
| Data structure | Structured (tables) | Any (structured/unstructured) |
| Scalability    | Limited             | Highly scalable               |
| Speed          | Fast (small data)   | Designed for huge data        |
| Access         | SQL                 | Code (e.g., Java, Python)     |
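The "Access" row can be illustrated with a small, hedged sketch: the same aggregation written once as a SQL query and once as hand-written code. It uses Python's built-in `sqlite3` as a stand-in for an RDBMS and a plain loop as a stand-in for the code you would write for Hadoop; it does not use Hadoop itself. The `readings` data and the `weather` table are hypothetical, and the task is simplified to the maximum November temperature per country.

```python
import sqlite3

# Hypothetical toy readings: (country, month, temperature in degrees Celsius).
readings = [
    ("NO", 11, 3.5),
    ("NO", 11, 5.0),
    ("EG", 11, 27.0),
    ("EG", 12, 21.0),
    ("EG", 11, 29.5),
]

# RDBMS-style access: describe the result in SQL and let the engine compute it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather (country TEXT, month INTEGER, temp REAL)")
conn.executemany("INSERT INTO weather VALUES (?, ?, ?)", readings)
sql_result = dict(
    conn.execute(
        "SELECT country, MAX(temp) FROM weather WHERE month = 11 GROUP BY country"
    )
)
conn.close()

# Code-style access: write the filter, grouping, and aggregation by hand.
code_result = {}
for country, month, temp in readings:
    if month == 11:
        code_result[country] = max(temp, code_result.get(country, temp))

print(sql_result == code_result)  # True
```

Both versions return the same answer on this toy input. The difference is who writes the filtering and grouping logic: the query engine in the SQL case, the programmer in the code case.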