quartz/content/BigData/Hadoop/Yarn.md
2025-07-23 20:36:04 +03:00

YARN (Yet Another Resource Negotiator) is Hadoop's cluster resource management system.

  • Multiple jobs run simultaneously
  • Multiple jobs share the same resources (disk, CPU, memory)
  • Resources must be assigned to jobs and tasks exclusively
YARN is in charge of:
  1. Allocating resources
  2. Scheduling jobs
    • assigns priorities to jobs according to a policy: FIFO scheduler, Fair scheduler, or Capacity scheduler
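
To make the difference between the three policies concrete, here is a toy sketch in Python. It is not YARN's scheduler code: the `Job` class, the queue names, and the three functions are hypothetical, and the policies are heavily simplified (for example, the real Capacity scheduler can lend idle capacity to other queues, which this sketch ignores).

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    queue: str    # hypothetical queue label, only used by the capacity policy
    demand: int   # number of containers the job asks for


def fifo_allocate(jobs, total):
    """FIFO: jobs are served in submission order; early jobs take what they need first."""
    grants = {}
    remaining = total
    for job in jobs:
        give = min(job.demand, remaining)
        grants[job.name] = give
        remaining -= give
    return grants


def fair_allocate(jobs, total):
    """Fair: containers are handed out one at a time, round-robin, to jobs with unmet demand."""
    grants = {job.name: 0 for job in jobs}
    remaining = total
    while remaining > 0:
        progressed = False
        for job in jobs:
            if remaining > 0 and grants[job.name] < job.demand:
                grants[job.name] += 1
                remaining -= 1
                progressed = True
        if not progressed:  # every job is fully served
            break
    return grants


def capacity_allocate(jobs, total, shares_percent):
    """Capacity: each queue owns a fixed percentage of the cluster; FIFO inside a queue."""
    grants = {}
    for queue, percent in shares_percent.items():
        queue_total = total * percent // 100
        queue_jobs = [job for job in jobs if job.queue == queue]
        grants.update(fifo_allocate(queue_jobs, queue_total))
    return grants


jobs = [Job("etl", "prod", 8), Job("report", "prod", 4), Job("adhoc", "dev", 4)]
print(fifo_allocate(jobs, 10))   # {'etl': 8, 'report': 2, 'adhoc': 0}
print(fair_allocate(jobs, 10))   # {'etl': 4, 'report': 3, 'adhoc': 3}
print(capacity_allocate(jobs, 10, {"prod": 70, "dev": 30}))  # {'etl': 7, 'report': 0, 'adhoc': 3}
```

With 10 containers available, FIFO lets the first job starve the last one, the fair policy spreads containers evenly, and the capacity policy guarantees the `dev` queue its share even though `prod` jobs were submitted first.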
Components:
  • ResourceManager

    • oversees resource allocation across the cluster
  • NodeManager

    • Each node in the cluster runs a NodeManager.
    • This component manages the execution of containers on its node.
  • ApplicationMaster

    • one per application; manages the lifecycle of that application.
    • negotiates containers from the ResourceManager, schedules the application's tasks, and monitors their progress.
  • Resource Container

    • a logical bundle of resources (e.g., CPU, Memory) that is allocated by the ResourceManager
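
The interplay of the four components can be sketched as a small in-memory model. This is only an illustration under simplifying assumptions, not the YARN API: the class and method names (`can_fit`, `start`, `stop`, `allocate`, `request`) are hypothetical, placement is a simple first-fit, and the real ApplicationMaster asks the NodeManager to launch a granted container rather than the ResourceManager doing it.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(frozen=True)
class Container:
    """A logical bundle of resources on one node."""
    container_id: int
    node: str
    memory_mb: int
    vcores: int


@dataclass
class NodeManager:
    """Tracks free resources and running containers on a single node."""
    name: str
    free_memory_mb: int
    free_vcores: int
    containers: dict = field(default_factory=dict)

    def can_fit(self, memory_mb, vcores):
        return self.free_memory_mb >= memory_mb and self.free_vcores >= vcores

    def start(self, container):
        # Deduct the resources so no other container can claim them (exclusive use).
        self.free_memory_mb -= container.memory_mb
        self.free_vcores -= container.vcores
        self.containers[container.container_id] = container

    def stop(self, container_id):
        # Return the container's resources to the node.
        container = self.containers.pop(container_id)
        self.free_memory_mb += container.memory_mb
        self.free_vcores += container.vcores


class ResourceManager:
    """Oversees allocation across all nodes (first-fit placement in this sketch)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._ids = count(1)

    def allocate(self, memory_mb, vcores):
        for node in self.nodes:
            if node.can_fit(memory_mb, vcores):
                container = Container(next(self._ids), node.name, memory_mb, vcores)
                node.start(container)
                return container
        return None  # the cluster has no node with enough free resources


class ApplicationMaster:
    """Per-application component that requests containers for its tasks."""

    def __init__(self, rm):
        self.rm = rm

    def request(self, n, memory_mb, vcores):
        granted = []
        for _ in range(n):
            container = self.rm.allocate(memory_mb, vcores)
            if container is None:
                break
            granted.append(container)
        return granted


nodes = [NodeManager("node1", 4096, 2), NodeManager("node2", 4096, 2)]
rm = ResourceManager(nodes)
am = ApplicationMaster(rm)
granted = am.request(3, memory_mb=2048, vcores=1)
print([c.node for c in granted])  # ['node1', 'node1', 'node2']
print(rm.allocate(4096, 2))       # None
```

The last call returns `None` because `node1` is full and `node2` has only 2048 MB and 1 vcore left, which shows how allocated resources stay exclusive to their container until it is stopped.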

!Screenshot 2025-07-23 at 13.29.37.png

YARN ecosystem

YARN can run applications other than Hadoop MapReduce that integrate with the Hadoop ecosystem:
  • Apache Storm (data streaming engine)
  • Apache Spark (batch and streaming data engine)
  • Apache Solr (search platform)