Add my Obsidian notes

DeGefen 2025-07-23 21:00:46 +03:00
parent 7b7a97b7cf
commit b8edcf9d55
3 changed files with 7 additions and 1 deletion

View File

@@ -32,6 +32,7 @@ aliases:
 ![[Screenshot 2025-07-23 at 18.27.30.png|]]
 ##### Hive Usage
+{% raw %}
 ```
 #Start a hive shell:
 $hive
@@ -57,3 +58,4 @@ $hive -e 'SELECT name FROM mta;'
 #Execute script from file
 $hive -f hive_script.txt
 ```
+{% endraw %}

View File

@@ -36,6 +36,7 @@ aliases:
 - Each [[RDD]] keeps track of how it was derived. If a node fails, Spark **recomputes only the lost partition** from the original transformations.
 ##### Writing Spark Code in Python
+{% raw %}
 ```
 # Spark Context Initialization
 from pyspark import SparkConf, SparkContext
@@ -52,6 +53,8 @@ distData = sc.parallelize(data)
 distFile = sc.textFile("data.txt")
 distFile = sc.textFile("folder/*.txt")
 ```
+{% endraw %}
 ##### **RDD Transformations (Lazy)**
 These create a new RDD from an existing one.

View File

@@ -21,6 +21,7 @@ Stores huge files (Typical file size GB-TB) across multiple machines.
 - Parquet Files - Yet another RC file
 ##### HDFS Command Line
+{% raw %}
 ```
 # List files
 hadoop fs -ls /path
@@ -34,7 +35,7 @@ hadoop fs -cat /file
 # Upload file
 hadoop fs -copyFromLocal file.txt hdfs://...
 ```
+{% endraw %}
 #### HDFS Architecture Main Components
 ##### **1.** NameNode (Master Node)
 - **Stores metadata** about the filesystem:
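
Review note: every hunk in this commit makes the same edit, placing `{% raw %}` before a fenced code block and `{% endraw %}` after it, presumably so that a Liquid-style template engine leaves the code untouched. If more notes need the same treatment, the edit could be scripted. The following is only a minimal sketch, not part of the commit: the function name `wrap_fences` is hypothetical, and it assumes fences are plain three-backtick lines that are never nested.

```python
FENCE = "`" * 3          # three backticks, built this way to keep this example readable
RAW_OPEN = "{% raw %}"
RAW_CLOSE = "{% endraw %}"


def wrap_fences(text: str) -> str:
    """Wrap every fenced code block in raw/endraw tags (hypothetical helper).

    Blocks that are already wrapped are left alone, so running the
    function twice gives the same result as running it once.
    """
    lines = text.split("\n")
    out = []
    in_fence = False
    for i, line in enumerate(lines):
        is_fence = line.strip().startswith(FENCE)
        if is_fence and not in_fence:
            # Opening fence: add the raw tag unless the previous line already is one.
            if not (out and out[-1].strip() == RAW_OPEN):
                out.append(RAW_OPEN)
            out.append(line)
            in_fence = True
        elif is_fence and in_fence:
            # Closing fence: add the endraw tag unless the next line already is one.
            out.append(line)
            next_line = lines[i + 1].strip() if i + 1 < len(lines) else ""
            if next_line != RAW_CLOSE:
                out.append(RAW_CLOSE)
            in_fence = False
        else:
            out.append(line)
    return "\n".join(out)
```

Applied to a note's full text, this returns the text with the tags added. Writing the result back to the file is left out on purpose, so the output can be reviewed as a diff first.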