mirror of
https://github.com/jackyzha0/quartz.git
synced 2025-12-24 05:14:06 -06:00
Add my Obsidian notes
parent 7b7a97b7cf
commit b8edcf9d55
@@ -32,6 +32,7 @@ aliases:
 ![[Screenshot 2025-07-23 at 18.27.30.png|]]
 
 ##### Hive Usage
+{% raw %}
 ```
 #Start a hive shell:
 $hive
@@ -57,3 +58,4 @@ $hive -e 'SELECT name FROM mta;'
 #Execute script from file
 $hive -f hive_script.txt
 ```
+{% endraw %}
@@ -36,6 +36,7 @@ aliases:
 - Each [[RDD]] keeps track of how it was derived. If a node fails, Spark **recomputes only the lost partition** from the original transformations.
 
 ##### Writing Spark Code in Python
+{% raw %}
 ```
 # Spark Context Initialization
 from pyspark import SparkConf, SparkContext
@@ -52,6 +53,8 @@ distData = sc.parallelize(data)
 distFile = sc.textFile("data.txt")
 distFile = sc.textFile("folder/*.txt")
 ```
+{% endraw %}
+
 ##### **RDD Transformations (Lazy)**
 These create a new RDD from an existing one.
 
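The two Spark notes in this diff (transformations are lazy, and a lost partition is recomputed from recorded lineage) can be illustrated with a toy model in plain Python. This is only a sketch and not Spark's implementation or API: `ToyRDD`, `compute_partition`, and the sample data are hypothetical names made up for illustration, and no `pyspark` install is assumed.

```python
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not the pyspark class).

    Each instance remembers its parent and the function that derives it,
    which is the "lineage" the note refers to.
    """

    def __init__(self, partitions=None, parent=None, fn=None):
        self._source = partitions  # only set on the root dataset
        self.parent = parent       # the ToyRDD this one was derived from
        self.fn = fn               # per-partition function, applied on demand

    def map(self, f):
        # Transformation: returns a new ToyRDD, runs nothing yet.
        return ToyRDD(parent=self, fn=lambda part: [f(x) for x in part])

    def filter(self, pred):
        # Transformation: returns a new ToyRDD, runs nothing yet.
        return ToyRDD(parent=self, fn=lambda part: [x for x in part if pred(x)])

    def num_partitions(self):
        node = self
        while node.parent is not None:
            node = node.parent
        return len(node._source)

    def compute_partition(self, i):
        # Rebuild partition i by replaying the lineage from the root.
        if self.parent is None:
            return list(self._source[i])
        return self.fn(self.parent.compute_partition(i))

    def collect(self):
        # Action: this is the point where work actually happens.
        out = []
        for i in range(self.num_partitions()):
            out.extend(self.compute_partition(i))
        return out


evaluated = []  # records every element that square() is actually applied to


def square(x):
    evaluated.append(x)
    return x * x


base = ToyRDD(partitions=[[1, 2], [3, 4], [5, 6]])
even_squares = base.map(square).filter(lambda v: v % 2 == 0)

nothing_ran_yet = len(evaluated) == 0  # transformations alone did no work
result = even_squares.collect()        # the action triggers the computation

# Pretend partition 1 was lost: only that partition is replayed.
evaluated.clear()
recovered = even_squares.compute_partition(1)
```

After the simulated loss, `evaluated` contains only the elements of partition 1, which mirrors the "recomputes only the lost partition" claim. Real Spark additionally handles scheduling, shuffles, and caching, none of which this sketch models.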
@@ -21,6 +21,7 @@ Stores huge files (Typical file size GB-TB) across multiple machines.
 - Parquet Files - Yet another RC file
 
 ##### HDFS Command Line
+{% raw %}
 ```
 # List files
 hadoop fs -ls /path
@@ -34,7 +35,7 @@ hadoop fs -cat /file
 # Upload file
 hadoop fs -copyFromLocal file.txt hdfs://...
 ```
-
+{% endraw %}
 #### HDFS Architecture – Main Components
 ##### **1.** NameNode (Master Node)
 - **Stores metadata** about the filesystem: