05 Movies Data Analysis Using Mapreduce
Tushar B. Kute,
http://tusharkute.com
What is MapReduce?
• Movies.java
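To illustrate the map and reduce phases that a program such as Movies.java goes through, here is a minimal sketch in plain Java with no Hadoop dependency. It rests on two assumptions that are not stated in this text: that each line of u.data holds four tab-separated fields (user ID, movie ID, rating, timestamp), and that the analysis computes the average rating per movie. The class name MovieRatingSketch and the sample rows are hypothetical; this is not the Movies.java used in the steps below.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone sketch of the MapReduce flow (Java 8 or later).
class MovieRatingSketch {

    // Map phase: turn one input line into a (movieId, rating) pair.
    // Assumed layout: userId <TAB> movieId <TAB> rating <TAB> timestamp.
    static Map.Entry<String, Integer> map(String line) {
        String[] fields = line.split("\t");
        return new AbstractMap.SimpleEntry<>(fields[1], Integer.parseInt(fields[2]));
    }

    // Reduce phase: average all ratings collected for one movie.
    static double reduce(List<Integer> ratings) {
        int sum = 0;
        for (int rating : ratings) {
            sum += rating;
        }
        return (double) sum / ratings.size();
    }

    // Driver: map every line, group values by key (the shuffle step),
    // then reduce each group. A TreeMap keeps the keys sorted.
    static Map<String, Double> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            Map.Entry<String, Integer> pair = map(line);
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample rows in the assumed format.
        List<String> sample = Arrays.asList(
            "1\t10\t3\t0",
            "2\t10\t5\t0",
            "3\t20\t4\t0");
        run(sample).forEach((movie, avg) -> System.out.println(movie + "\t" + avg));
    }
}
```

In a real Hadoop job the grouping step is performed by the framework between the Mapper and the Reducer; here it is done by hand so that the whole flow fits in one file.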
Compilation and Execution
• Step 2
Download hadoop-core-1.2.1.jar, which is used to compile
and execute the MapReduce program, from the following
link:
http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-core/1.2.1
Let us assume the download folder is /home/rashmi/movies.
• Step 3
The following commands are used for compiling the
Movies.java program and creating a jar for the program.
$ javac -classpath hadoop-core-1.2.1.jar movies/Movies.java
$ jar -cvf movies.jar -C movies/ .
• Step 4
– The following command is used to create an input directory in
HDFS.
– $ hadoop fs -mkdir /input
• Step 5
– The following command is used to copy the input dataset file to
HDFS.
– $ hadoop fs -put u.data /input
• Step 6
– The following command is used to verify the files in the input
directory.
– $ hadoop fs -ls /input
• Step 7
– The following command is used to run the Movies
application, taking the input files from the input
directory.
– $ hadoop jar movies.jar Movies /input /output
– Wait until the job finishes. After execution, the
console output reports the number of input splits,
the number of map tasks, the number of reducer
tasks, etc. The output directory must not already
exist.
• Step 8
– The following command is used to verify the
resultant files in the output folder.
– $ hadoop fs -ls /output
• Step 9
– The following command is used to see the output in
the part-r-00000 file. This file is written to HDFS by
the MapReduce job.
– $ hadoop fs -cat /output/part-r-00000
• Step 10
– The following command is used to copy
the output file from HDFS to the local file
system for analysis.
– $ hadoop fs -get /output/part-r-00000
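Once part-r-00000 is on the local file system, it can be analyzed with an ordinary program. The sketch below is one possible approach, assuming the job wrote one key and one numeric value per line separated by a tab (the default text output layout for Hadoop jobs). The class name OutputAnalysisSketch, the helper KeyValue, and the choice to list the ten largest values are illustrative assumptions, not part of the original workflow.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for inspecting a local copy of the reducer output.
class OutputAnalysisSketch {

    // One line of reducer output: key <TAB> numeric value (assumed layout).
    static final class KeyValue {
        final String key;
        final double value;

        KeyValue(String key, double value) {
            this.key = key;
            this.value = value;
        }
    }

    // Parse "key<TAB>value" lines, skipping blank ones.
    static List<KeyValue> parse(List<String> lines) {
        List<KeyValue> records = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            String[] parts = line.split("\t", 2);
            records.add(new KeyValue(parts[0], Double.parseDouble(parts[1].trim())));
        }
        return records;
    }

    // Return the n records with the largest values, highest first.
    static List<KeyValue> top(List<KeyValue> records, int n) {
        List<KeyValue> sorted = new ArrayList<>(records);
        sorted.sort(Comparator.comparingDouble((KeyValue r) -> r.value).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    public static void main(String[] args) throws IOException {
        // Path to the local copy; defaults to the file fetched in Step 10.
        Path path = Paths.get(args.length > 0 ? args[0] : "part-r-00000");
        if (!Files.exists(path)) {
            System.err.println("File not found: " + path);
            return;
        }
        List<String> lines = Files.readAllLines(path, StandardCharsets.UTF_8);
        for (KeyValue r : top(parse(lines), 10)) {
            System.out.println(r.key + "\t" + r.value);
        }
    }
}
```

Run it from the folder that holds the downloaded file, or pass the file path as the first argument.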
Thank you
This presentation is created using LibreOffice Impress 5.1.6.2, can be used freely as per GNU General Public License
Web Resources
http://mitu.co.in
http://tusharkute.com
tushar@tusharkute.com
contact@mitu.co.in