Docker
1. docker-compose.yaml
The docker-compose.yaml file sets up a Hadoop cluster with four key services: a NameNode, a DataNode, a ResourceManager, and a NodeManager. Each service uses the apache/hadoop:3 image, with configuration files mounted for proper setup and operation.
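A minimal sketch of such a compose file is shown below. The service names, commands, and mount paths here are illustrative assumptions (only the image name and the exposed ports 9870, 9000, and 8088 are taken from this report), not the exact file used:

```yaml
services:
  namenode:
    image: apache/hadoop:3
    command: ["hdfs", "namenode"]
    ports:
      - "9870:9870"   # NameNode web UI
      - "9000:9000"   # HDFS RPC endpoint (fs.defaultFS)
    volumes:
      - ./core-site.xml:/opt/hadoop/etc/hadoop/core-site.xml
  datanode:
    image: apache/hadoop:3
    command: ["hdfs", "datanode"]
  resourcemanager:
    image: apache/hadoop:3
    command: ["yarn", "resourcemanager"]
    ports:
      - "8088:8088"   # ResourceManager web UI
  nodemanager:
    image: apache/hadoop:3
    command: ["yarn", "nodemanager"]
```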
2. core-site.xml
fs.defaultFS: Configures the default file system as HDFS with the address
hdfs://namenode:9000. This points to the NameNode service running on port 9000.
dfs.datanode.data.dir: Defines where the DataNodes store their block data (/tmp/hadoop-root/dfs/data).
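For reference, a fragment matching the two properties described above (dfs.datanode.data.dir is conventionally placed in hdfs-site.xml, but it is reproduced here as this report describes it):

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode:9000</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/tmp/hadoop-root/dfs/data</value>
  </property>
</configuration>
```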
3. mapred-site.xml
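The report does not reproduce this file's contents. A typical mapred-site.xml for a YARN-backed cluster looks like the following (a standard configuration, assumed rather than taken from the actual deployment): it directs MapReduce jobs to run on YARN.

```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```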
4. yarn-site.xml
This file configures the YARN (Yet Another Resource Negotiator) settings.
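A sketch of a typical yarn-site.xml for this kind of setup (property names are standard; the hostname value assumes a compose service called resourcemanager, which is an assumption, not a detail from the report):

```xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>resourcemanager</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```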
After running docker-compose up, the Hadoop cluster was successfully deployed with four services:
NameNode, DataNode, ResourceManager, and NodeManager. Configuration files were mounted, and
ports were exposed for the NameNode (9870) and ResourceManager (8088) web interfaces. The cluster
is now fully operational for distributed data storage and processing.
When running docker-compose up, Docker pulls the apache/hadoop:3 image, creates the NameNode,
DataNode, ResourceManager, and NodeManager containers, and starts them. The logs display real-
time service initialization. The web interfaces are accessible at:
NameNode: http://localhost:9870
ResourceManager: http://localhost:8088
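A quick way to confirm both web interfaces respond from the host (a sketch using curl; returns HTTP 200 when the UIs are up):

```shell
# Check the NameNode and ResourceManager web UIs
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:9870
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8088
```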
Command: docker ps
As part of the Hadoop deployment, the docker ps command was used to check the status of the running
Docker containers. This command revealed that the Hadoop components are successfully running across
multiple containers. These components are essential for the distributed file system (HDFS) and resource
management functionalities of the Hadoop ecosystem.
Tests:
To verify the health and status of the Hadoop containers, I used the following command to check if all
containers are up and running: docker-compose ps
As shown below, all four key Hadoop components are listed with the status "Up," confirming that they
are running correctly:
hadoop-datanode-1: the DataNode, which handles block storage in HDFS, is running and operating in conjunction with the NameNode.
This output confirms that all services required for the Hadoop environment (NameNode, DataNode,
ResourceManager, and NodeManager) are running as expected.
Command: To interact with the Hadoop container, I executed the following command to access the
running NameNode container: docker exec -it b260b8e4e5ec bash
This command allows me to open an interactive shell inside the hadoop-namenode-1 container
(container ID b260b8e4e5ec).
Once inside the container, I used the following commands to interact with HDFS:
This command creates a new directory called /test1 in the Hadoop distributed file system
(HDFS).
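A sketch of the commands in question, using the standard hdfs dfs client inside the container:

```shell
# Create a new directory in HDFS, then list the root to verify it
hdfs dfs -mkdir /test1
hdfs dfs -ls /
```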
This output verifies that the HDFS is functional, as the directory /test1 was created successfully and is
visible when listing the directory contents.
This test was conducted to ensure that YARN is properly configured and able to execute distributed jobs in the Hadoop environment. To verify this, I ran the Hadoop MapReduce example job that estimates the value of Pi.
Command: I executed the following command to run a sample MapReduce job using YARN: yarn jar /opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 16 1000 (here 16 is the number of map tasks and 1000 the number of samples per map).
The job was executed successfully, and its output reported the estimated value of Pi.
This section outlines where the NameNode stores its file system metadata (fsimage and edit logs) and
where the DataNode stores the blocks of data in the Hadoop Distributed File System (HDFS). These
locations are configured in Hadoop’s configuration files.
Configuration File Used: The following configuration file specifies the storage directories for both the
NameNode and DataNode:
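The file itself is not reproduced in this report; a typical fragment specifying these directories uses the standard property names below, with the values taken from the paths observed in this deployment:

```xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/tmp/hadoop-root/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/tmp/hadoop-root/dfs/data</value>
  </property>
</configuration>
```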
Verification: By accessing the NameNode and DataNode configuration directories, the following was
observed:
NameNode Metadata (fsimage and edit logs): The fsimage and edit log files were found in the configured directory /tmp/hadoop-root/dfs/name. These files are critical for recovering the HDFS state.
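The inspection above can be reproduced inside the NameNode container along these lines (a sketch; the current subdirectory is where Hadoop keeps the active metadata files):

```shell
# List the NameNode metadata directory
ls /tmp/hadoop-root/dfs/name/current
# Expect files such as fsimage_*, edits_*, seen_txid and VERSION
```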