Explain YARN Architecture – Hadoop Tutorial

8/22/2025


Yet Another Resource Negotiator (YARN) is a core component of the Hadoop ecosystem, introduced in Hadoop 2.x to overcome the limitations of the classic MapReduce (MRv1) framework, in which a single JobTracker handled both resource management and job scheduling. YARN acts as the cluster resource management system, providing efficient resource allocation, job scheduling, and monitoring for big data applications. In this Hadoop tutorial, we’ll explain the YARN architecture, its components, and how it works to manage Hadoop clusters.



What is YARN?

YARN is the resource management layer of Hadoop that separates resource management from job scheduling and monitoring. This separation makes Hadoop more flexible, scalable, and capable of running data processing engines beyond MapReduce, such as Apache Spark, Tez, and Flink.

Key Benefits of YARN:

  • Scalability: Manages thousands of nodes in a cluster.

  • Multi-Tenancy: Runs multiple applications simultaneously.

  • Efficiency: Allocates cluster resources dynamically.

  • Flexibility: Supports various data processing frameworks.


YARN Architecture Overview

The YARN architecture is designed around four main components:

  1. ResourceManager (RM)

  2. NodeManager (NM)

  3. ApplicationMaster (AM)

  4. Containers

These components work together to manage resources, monitor workloads, and execute distributed applications.


Components of YARN Architecture

1. ResourceManager (RM)

The ResourceManager is the master daemon responsible for managing and allocating resources across the cluster; one active instance runs per cluster.

  • Handles resource allocation for all applications.

  • Consists of two parts:

    • Scheduler: Allocates resources to running applications according to a pluggable policy (Capacity Scheduler, Fair Scheduler, or FIFO); it is a pure scheduler and does not monitor or restart tasks.

    • ApplicationsManager: Accepts application submissions, negotiates the first container for each application’s ApplicationMaster, and restarts the ApplicationMaster if it fails.
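
To make this concrete, here is a minimal client-side sketch (assuming the hadoop-yarn-client dependency on the classpath and a yarn-site.xml pointing at your cluster) that asks the ResourceManager for its cluster-wide metrics and the queues exposed by its Scheduler:

  import java.util.List;

  import org.apache.hadoop.yarn.api.records.QueueInfo;
  import org.apache.hadoop.yarn.api.records.YarnClusterMetrics;
  import org.apache.hadoop.yarn.client.api.YarnClient;
  import org.apache.hadoop.yarn.conf.YarnConfiguration;

  public class ResourceManagerInfo {
    public static void main(String[] args) throws Exception {
      // Connect to the ResourceManager defined in yarn-site.xml on the classpath.
      YarnClient yarnClient = YarnClient.createYarnClient();
      yarnClient.init(new YarnConfiguration());
      yarnClient.start();

      // Cluster-wide view held by the ResourceManager.
      YarnClusterMetrics metrics = yarnClient.getYarnClusterMetrics();
      System.out.println("Active NodeManagers: " + metrics.getNumNodeManagers());

      // Queues configured for the Scheduler (capacity/fair/FIFO policies live here).
      List<QueueInfo> queues = yarnClient.getAllQueues();
      for (QueueInfo q : queues) {
        System.out.println("Queue " + q.getQueueName() + " capacity=" + q.getCapacity());
      }

      yarnClient.stop();
    }
  }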

2. NodeManager (NM)

The NodeManager is a per-node agent that runs on every worker node in the cluster (typically alongside the HDFS DataNode).

  • Monitors resource usage (CPU, memory, disk, network).

  • Manages containers launched on the node.

  • Reports node health and resource availability to the ResourceManager.
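
The sketch below, under the same assumptions as the previous example, lists what each running NodeManager has reported to the ResourceManager: its containers, used and total resources, and health report:

  import org.apache.hadoop.yarn.api.records.NodeReport;
  import org.apache.hadoop.yarn.api.records.NodeState;
  import org.apache.hadoop.yarn.client.api.YarnClient;
  import org.apache.hadoop.yarn.conf.YarnConfiguration;

  public class NodeManagerStatus {
    public static void main(String[] args) throws Exception {
      YarnClient yarnClient = YarnClient.createYarnClient();
      yarnClient.init(new YarnConfiguration());
      yarnClient.start();

      // Each NodeManager heartbeats its health and resource usage to the
      // ResourceManager; getNodeReports returns that view for the given states.
      for (NodeReport node : yarnClient.getNodeReports(NodeState.RUNNING)) {
        System.out.println(node.getNodeId()
            + " containers=" + node.getNumContainers()
            + " used=" + node.getUsed()
            + " capacity=" + node.getCapability()
            + " health=" + node.getHealthReport());
      }

      yarnClient.stop();
    }
  }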

3. ApplicationMaster (AM)

Each application has its own ApplicationMaster that:

  • Negotiates resources with the ResourceManager.

  • Works with NodeManagers to launch tasks inside containers.

  • Monitors the execution and handles job failures.
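
Inside an ApplicationMaster, this negotiation is typically done with the AMRMClient API. The following simplified sketch (a real ApplicationMaster would also handle security tokens, container failures, and completed-container events) registers with the ResourceManager, requests two small containers, and waits until the Scheduler grants them:

  import java.util.List;

  import org.apache.hadoop.yarn.api.records.Container;
  import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
  import org.apache.hadoop.yarn.api.records.Priority;
  import org.apache.hadoop.yarn.api.records.Resource;
  import org.apache.hadoop.yarn.client.api.AMRMClient;
  import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
  import org.apache.hadoop.yarn.conf.YarnConfiguration;

  public class SimpleApplicationMaster {
    public static void main(String[] args) throws Exception {
      AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
      rmClient.init(new YarnConfiguration());
      rmClient.start();

      // Step 1: register this ApplicationMaster with the ResourceManager.
      rmClient.registerApplicationMaster("", 0, "");

      // Step 2: ask for two containers of 1 GB memory and 1 vcore each.
      Resource capability = Resource.newInstance(1024, 1);
      Priority priority = Priority.newInstance(0);
      for (int i = 0; i < 2; i++) {
        rmClient.addContainerRequest(new ContainerRequest(capability, null, null, priority));
      }

      // Step 3: heartbeat until the Scheduler has granted the containers.
      int granted = 0;
      while (granted < 2) {
        List<Container> allocated = rmClient.allocate(0.1f).getAllocatedContainers();
        granted += allocated.size();
        Thread.sleep(1000);   // allocation is asynchronous; poll via heartbeats
      }

      // Step 4: tell the ResourceManager the application finished successfully.
      rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "done", "");
      rmClient.stop();
    }
  }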

4. Containers

Containers are the fundamental units of resource allocation in YARN.

  • Each container runs a specific task.

  • Each container is granted a fixed amount of resources, primarily memory and CPU (vcores).

  • Launched and managed by the NodeManager on instructions from the ApplicationMaster.
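
Once a container has been granted, the ApplicationMaster asks the NodeManager that owns it to start a process by sending a ContainerLaunchContext. A minimal sketch is shown below; the echo command and output paths are purely illustrative:

  import java.util.Collections;

  import org.apache.hadoop.yarn.api.records.Container;
  import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
  import org.apache.hadoop.yarn.client.api.NMClient;
  import org.apache.hadoop.yarn.conf.YarnConfiguration;

  public class ContainerLauncher {
    // Launch a shell command inside a container the ResourceManager has already allocated.
    static void launch(Container container) throws Exception {
      NMClient nmClient = NMClient.createNMClient();
      nmClient.init(new YarnConfiguration());
      nmClient.start();

      // The launch context carries local resources, environment variables and the command line.
      ContainerLaunchContext ctx = ContainerLaunchContext.newInstance(
          Collections.emptyMap(),          // local resources (jars, files) to localize
          Collections.emptyMap(),          // environment variables
          Collections.singletonList(       // command executed inside the container
              "echo 'hello from a YARN container' 1>/tmp/stdout 2>/tmp/stderr"),
          null, null, null);

      // The NodeManager that owns this container starts the process.
      nmClient.startContainer(container, ctx);
    }
  }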


How YARN Works (Execution Flow)

  1. Job Submission: The client submits an application to the ResourceManager.

  2. ApplicationMaster Launch: The ResourceManager allocates a container on one of the nodes and launches the application’s ApplicationMaster inside it.

  3. Resource Negotiation: The ApplicationMaster requests resources from the ResourceManager.

  4. Task Execution: NodeManagers launch containers to run tasks.

  5. Monitoring & Completion: ApplicationMaster monitors progress and reports status back to the client.
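
On the client side, steps 1, 2, and 5 look roughly like the sketch below. It assumes an ApplicationMaster jar and launch command have already been prepared elsewhere (the setAMContainerSpec call is left as a placeholder), so it illustrates the flow rather than a complete, ready-to-run job:

  import org.apache.hadoop.yarn.api.records.ApplicationId;
  import org.apache.hadoop.yarn.api.records.ApplicationReport;
  import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
  import org.apache.hadoop.yarn.api.records.Resource;
  import org.apache.hadoop.yarn.api.records.YarnApplicationState;
  import org.apache.hadoop.yarn.client.api.YarnClient;
  import org.apache.hadoop.yarn.client.api.YarnClientApplication;
  import org.apache.hadoop.yarn.conf.YarnConfiguration;

  public class SubmitApplication {
    public static void main(String[] args) throws Exception {
      YarnClient yarnClient = YarnClient.createYarnClient();
      yarnClient.init(new YarnConfiguration());
      yarnClient.start();

      // 1. Ask the ResourceManager for a new application id.
      YarnClientApplication app = yarnClient.createApplication();
      ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
      appContext.setApplicationName("demo-app");
      appContext.setResource(Resource.newInstance(1024, 1));   // container for the ApplicationMaster
      appContext.setQueue("default");
      // appContext.setAMContainerSpec(...) would describe how to launch the ApplicationMaster.

      // 2. Submit; the ResourceManager launches the ApplicationMaster in its container.
      ApplicationId appId = yarnClient.submitApplication(appContext);

      // 5. Poll the ResourceManager until the application reaches a terminal state.
      ApplicationReport report = yarnClient.getApplicationReport(appId);
      while (report.getYarnApplicationState() != YarnApplicationState.FINISHED
          && report.getYarnApplicationState() != YarnApplicationState.FAILED
          && report.getYarnApplicationState() != YarnApplicationState.KILLED) {
        Thread.sleep(1000);
        report = yarnClient.getApplicationReport(appId);
      }
      System.out.println("Final status: " + report.getFinalApplicationStatus());

      yarnClient.stop();
    }
  }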


Example of YARN in Action

Suppose you submit a Spark job on a Hadoop cluster using YARN:

  • The ResourceManager allocates resources.

  • The ApplicationMaster coordinates with NodeManagers.

  • Containers are launched on multiple nodes to process different tasks.

  • Once completed, results are returned to the client.
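
As a rough illustration, the sketch below submits a Spark application to YARN programmatically using Spark’s SparkLauncher API; the jar path and main class are placeholders, and the same submission is more commonly done from the command line with spark-submit --master yarn:

  import org.apache.spark.launcher.SparkAppHandle;
  import org.apache.spark.launcher.SparkLauncher;

  public class SubmitSparkOnYarn {
    public static void main(String[] args) throws Exception {
      // YARN is selected as the cluster manager; the ResourceManager starts the
      // Spark ApplicationMaster, which then requests executor containers.
      SparkAppHandle handle = new SparkLauncher()
          .setMaster("yarn")
          .setDeployMode("cluster")
          .setAppResource("/path/to/my-spark-job.jar")   // placeholder jar
          .setMainClass("com.example.MySparkJob")        // placeholder class
          .setConf(SparkLauncher.EXECUTOR_MEMORY, "2g")
          .setConf(SparkLauncher.EXECUTOR_CORES, "2")
          .startApplication();

      // Wait until YARN reports a terminal state for the application.
      while (!handle.getState().isFinal()) {
        Thread.sleep(1000);
      }
      System.out.println("Spark job finished in state: " + handle.getState());
    }
  }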


Advantages of YARN

  • Improved Resource Utilization through dynamic allocation.

  • Supports Multiple Processing Engines such as Spark, Hive, and Tez.

  • Fault Tolerance via monitoring and restarting failed containers.

  • Better Scalability for large clusters.


Conclusion

The YARN architecture plays a vital role in the Hadoop ecosystem by efficiently managing cluster resources, scheduling tasks, and supporting multiple processing frameworks. With its ResourceManager, NodeManager, ApplicationMaster, and Container-based architecture, YARN enables Hadoop to scale seamlessly while providing high availability and efficiency.