Commands in Hadoop
1. ls:
This command is used to list all the files in a directory. Use -ls -R (or the older -lsr) for a recursive listing; it is useful when we want the hierarchy of a folder.
Syntax:
Example:
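A typical form, for reference (the /user path is illustrative):
hadoop fs -ls <path>
hadoop fs -ls /user
hadoop fs -ls -R /user   # recursive listing of the folder hierarchy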
2. mkdir:
To create a directory. In Hadoop fs there is no home directory by default. So let’s first create it.
Syntax:
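A typical form, assuming we want a directory named /user at the HDFS root (the name is illustrative):
hadoop fs -mkdir <folder name>
hadoop fs -mkdir /user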
3. touchz:
This command creates an empty (zero-byte) file at the given path in HDFS.
Syntax:
Example:
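A typical form (the file name myfile.txt is illustrative):
hadoop fs -touchz <file path>
hadoop fs -touchz /user/myfile.txt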
4. copyFromLocal (or put):
To copy files/folders from the local file system to the HDFS store. This is the most important command. Local file system means the files present on the OS.
Syntax:
Example: Let’s suppose we have a file AI.txt on Desktop which we want to copy to folder user
present on hdfs.
OR
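Both variants, for reference, using the Desktop file from the example above (the relative local path is illustrative):
hadoop fs -copyFromLocal <local file path> <dest (on hdfs)>
hadoop fs -copyFromLocal ../Desktop/AI.txt /user
hadoop fs -put ../Desktop/AI.txt /user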
5. cat:
This command prints the contents of a file stored in HDFS to the console.
Syntax:
Example:
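A typical form (the file path is illustrative; AI.txt is the file copied in the previous step):
hadoop fs -cat <path>
hadoop fs -cat /user/AI.txt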
6. copyToLocal (or get):
To copy files/folders from the HDFS store to the local file system.
Syntax:
Example:
(OR)
hadoop fs -get /user/data.txt ../Desktop
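The copyToLocal form of the same example, for reference (data.txt and the Desktop path are as above):
hadoop fs -copyToLocal <src (on hdfs)> <local destination>
hadoop fs -copyToLocal /user/data.txt ../Desktop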
7. cp:
This command is used to copy files within HDFS. Let's copy the folder user to user_copied.
Syntax:
Example:
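A typical form and the example described above (the exact destination path is illustrative):
hadoop fs -cp <src (on hdfs)> <dest (on hdfs)>
hadoop fs -cp /user /user_copied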
8. mv:
This command is used to move files within HDFS. Let's cut-paste a file myfile.txt from the user folder to user_copied.
Syntax:
Example:
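A typical form and the example described above (paths are illustrative):
hadoop fs -mv <src (on hdfs)> <dest (on hdfs)>
hadoop fs -mv /user/myfile.txt /user_copied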
9. du:
This command shows the disk usage, i.e. the size of each file/directory under the given path in HDFS.
Syntax:
Example:
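A typical form (the directory name is illustrative):
hadoop fs -du <dirName>
hadoop fs -du /user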
10. dus:
This command gives the summarized (total) size of a directory, like du with the -s flag; in newer Hadoop releases it is deprecated in favour of hadoop fs -du -s.
Syntax:
hadoop fs -dus <dirName>
Example:
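For example (the directory name is illustrative):
hadoop fs -dus /user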
Step 1: Create a file with the name word_count_data.txt and add some data to it.
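For instance, from a terminal (the sample sentences are only placeholders; any text will do):
cd Documents/
echo big data is fun and big data is powerful > word_count_data.txt
echo hadoop makes big data processing simple >> word_count_data.txt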
Step 2: Create a mapper.py file that implements the mapper logic. It will read the data from STDIN (standard input), split each line into words, and emit every word together with a count of 1.
#!/usr/bin/env python
# import sys because we need to read and write data to STDIN and STDOUT
import sys

# read each line from STDIN (standard input)
for line in sys.stdin:
    # remove leading and trailing whitespace
    line = line.strip()
    # split the line into words
    words = line.split()
    # emit every word with a count of 1, separated by a tab
    for word in words:
        print('%s\t%s' % (word, 1))
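The mapper can be sanity-checked locally before involving Hadoop, simply by piping a sample line through it (the sample text is illustrative):
echo big data is fun | python mapper.py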
Step 3: Create a reducer.py file that implements the reducer logic. It will read the output of mapper.py from STDIN (standard input), aggregate the occurrences of each word, and write the final output to STDOUT.
#!/usr/bin/env python
import sys

current_word = None
current_count = 0
word = None

# read the mapper's output from STDIN; Hadoop sorts it by word before this step
for line in sys.stdin:
    line = line.strip()
    # parse the "word<TAB>count" pair produced by mapper.py
    word, count = line.split('\t', 1)
    count = int(count)
    # identical words arrive consecutively, so we can sum them as we go
    if current_word == word:
        current_count += count
    else:
        if current_word:
            print('%s\t%s' % (current_word, current_count))
        current_count = count
        current_word = word

# emit the count for the last word
if current_word == word:
    print('%s\t%s' % (current_word, current_count))
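The whole pipeline can also be simulated locally; the sort step stands in for Hadoop's shuffle/sort phase (this assumes a Unix-like shell and the data file from Step 1):
cat word_count_data.txt | python mapper.py | sort -k1,1 | python reducer.py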
Step 4: Now let’s start all our Hadoop daemons with the below command.
start-all.cmd
Now make a directory word_count_in_python in the root directory of our HDFS, which will store our word_count_data.txt file, with the below command.
hdfs dfs -mkdir /word_count_in_python
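The data file itself still has to be copied into that HDFS directory; a sketch using copyFromLocal, assuming word_count_data.txt sits in the Documents folder as above:
hdfs dfs -copyFromLocal Documents/word_count_data.txt /word_count_in_python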
Let’s give executable permission to our mapper.py and reducer.py with the help of the below commands.
cd Documents/
chmod 777 mapper.py reducer.py   # read, write and execute permission for user, group and others
Step 5: Now download the latest hadoop-streaming jar file and place it in a location from which you can easily access it.
Now let’s run our python files with the help of the Hadoop streaming utility as shown below.
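A sketch of the streaming invocation, assuming the jar was saved as /hadoop-streaming.jar and the scripts are in the current directory (the jar path, output directory and the backslash line continuations, which assume a bash-style shell, are illustrative):
hadoop jar /hadoop-streaming.jar \
  -input /word_count_in_python/word_count_data.txt \
  -output /word_count_in_python/output \
  -mapper "python mapper.py" \
  -reducer "python reducer.py" \
  -file mapper.py \
  -file reducer.py
Once the job finishes, the word counts can be inspected with:
hdfs dfs -cat /word_count_in_python/output/*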