Hadoop Installation
Step 8: Before using the Hadoop environment, the required environment variables must be set up by editing the ".bashrc" file of the user "hduser".
Open the .bashrc file using the command
sudo gedit /home/hduser/.bashrc
Append the following lines to the file and save it
# Set JAVA_HOME (adjust the path to match the JDK installed on your machine) and HADOOP_HOME
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export HADOOP_HOME=/usr/local/hadoop
# Add Hadoop bin and sbin directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
The above lines configure the Hadoop environment so that the user "hduser" can run Hadoop commands from any directory.
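The exports take effect only in newly opened terminals; to apply them to the current session, reload the file and verify the set-up (a quick check, assuming Hadoop was extracted to /usr/local/hadoop as above)
source /home/hduser/.bashrc
echo $HADOOP_HOME    # should print /usr/local/hadoop
hadoop version       # should print the version of the installed Hadoop distribution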
Step 9: Move to the Hadoop configuration folder, which contains the configuration files for HDFS, the general Hadoop environment etc.
cd /usr/local/hadoop/etc/hadoop
Now configure the hadoop-env.sh file, where the path to the Java installation is to be specified
sudo gedit hadoop-env.sh
Append the following line to the file and save it
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
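If you are unsure of the exact JDK path on your system, the following command prints the location of the active Java binary; strip the trailing /jre/bin/java (or /bin/java) from its output to obtain the value for JAVA_HOME
readlink -f $(which java)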
Step 9.A. Now configure the core-site.xml file. Open it, append the following properties and save the file.
sudo gedit core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
  </property>
</configuration>
The above configuration file specifies the default file system and the port on which it listens (here, HDFS on localhost:54310), as well as the base location in which Hadoop stores its working data (hadoop.tmp.dir). On Hadoop 2.x and later the preferred name for the first property is fs.defaultFS; the older fs.default.name is deprecated but still accepted.
Remember that every XML tag must be correctly opened and closed. Also note that several default port numbers changed from Hadoop 2.x to 3.x in order to mitigate port conflict issues.
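The directory named in hadoop.tmp.dir must exist and be owned by the Hadoop user before HDFS is started. A minimal sketch, assuming "hduser" belongs to a group named "hadoop" (adjust the group to your own set-up):
sudo mkdir -p /app/hadoop/tmp
sudo chown hduser:hadoop /app/hadoop/tmp    # the group "hadoop" is an assumption
sudo chmod 750 /app/hadoop/tmp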
Step 9.B. Now configure the hdfs-site.xml file. Open it, append the following properties and save the file.
sudo gedit hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
  </property>
  <property>
    <name>dfs.block.size</name>
    <value>104857600</value>
  </property>
</configuration>
The above configuration file specifies the replication factor you intend to use for your Hadoop set-up, the locations of the NameNode and DataNode storage folders, and the HDFS block size.
Note: The DataNode is where the data blocks are stored and where the actual computations run; the NameNode stores only the file-system metadata and tracks the DataNodes through periodic heartbeats and block reports. Specifying the block size is not mandatory: for Hadoop 1.x the default is 64 MB, whereas for 2.x and higher it is 128 MB. When it is specified explicitly, the value is given in bytes, as in the XML file above (104857600 bytes = 100 × 1024 × 1024 bytes = 100 MB). The right block size depends purely on the application. On Hadoop 2.x and later the preferred property name is dfs.blocksize; dfs.block.size is deprecated but still accepted.
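The NameNode and DataNode folders named above must also be created and handed over to the Hadoop user before the file system is formatted. A short sketch, again assuming the group "hadoop":
sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
sudo chown -R hduser:hadoop /usr/local/hadoop_store    # the group "hadoop" is an assumption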
Step 9.C. Configure the yarn-site.xml file. Open the file, add the following properties and save it.
sudo gedit yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>localhost</value>
  </property>
</configuration>
Note: YARN stands for "Yet Another Resource Negotiator", which takes care of resource management and job scheduling. The ResourceManager has two main components, namely the Scheduler and the ApplicationsManager. The task of the Scheduler is to allocate resources to the various running applications, while the ApplicationsManager accepts job submissions and negotiates containers.
The properties specified in the above XML file give the host on which your Hadoop (the YARN ResourceManager) works and the auxiliary service, the MapReduce shuffle, that the NodeManagers are to provide.
Step 9.D. Now configure the mapred-site.xml file (in Hadoop 2.x, first create it from the template with cp mapred-site.xml.template mapred-site.xml if it does not exist yet). Open the file, add the following properties and save it.
sudo gedit mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.admin.user.env</name>
    <value>HADOOP_MAPRED_HOME=$HADOOP_COMMON_HOME</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=$HADOOP_COMMON_HOME</value>
  </property>
</configuration>
The above file holds the configuration for MapReduce tasks and can be used to override the default values of MapReduce parameters. The configuration includes the MapReduce environment path for the current user and the name of the framework that handles the jobs (YARN).
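As a final sanity check, the XML syntax of each edited file can be verified before the daemons are started. A quick sketch using xmllint (provided by the libxml2-utils package on Ubuntu; install it with apt-get if it is missing):
cd /usr/local/hadoop/etc/hadoop
xmllint --noout core-site.xml hdfs-site.xml yarn-site.xml mapred-site.xml    # prints nothing when every file is well-formed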