Hadoop Course Content
Origins of BIGDATA
BIGDATA processing and storage problems
Platforms for BIGDATA processing and storage: Hadoop & Spark
What is Hadoop? Benefits of Hadoop
Overview of Hadoop Ecosystem
2. Hadoop Installation
Single-node Hadoop setup using images
Multi-node Hadoop cluster setup with public repositories
Multi-node Hadoop cluster setup with private repositories
3. HDFS
What is HDFS? Why do we need HDFS?
HDFS Architecture
Concepts of HDFS: Block, NameNode, DataNode, Secondary NameNode
Replication
Read & Write request walkthrough
Understanding Pipelining
Live demonstration of Fault Tolerance & Self-Healing
HDFS as a service - WebHDFS
HDFS Integration with other services/applications
HDFS Shell access for administration
Lab Session & Assignments
4. MapReduce
What is MapReduce? Why do we need MapReduce?
MapReduce Architecture
JobTracker and TaskTracker setup
Steps to develop MapReduce Jobs
Internal execution of MapReduce Jobs
Shuffle, Sort & Partitioning
Speculative Execution
Input/Output formats
Writing & Debugging MR programs in Java
MR programming in Python/C++/Ruby using Hadoop Streaming
Lab Session & Assignments
5. PIG
What is PIG? Why do we need PIG?
PIG installation
PIG internal architecture
PIG Latin scripting
PIG internal optimization
PIG interaction via Grunt shell (Local & Hadoop mode)
Writing PIG UDFs
Working with open PIG libraries like Piggybank, DataFu & Elephant Bird
Lab Session & Assignments
6. HIVE
What is HIVE? Why do we need HIVE?
HIVE installation
HIVE internal architecture
Understanding HIVE metastore
Data model
Managing tables
HIVE Query Language
HIVE UDFs
HIVE partitioning & bucketing
Lab Session & Assignments
7. Sqoop
Integrating RDBMS server data with Hadoop
What is Sqoop? Benefits of Sqoop
Importing data from RDBMS servers to HDFS/HIVE
Exporting data from HDFS/HIVE to RDBMS servers
Working with Sqoop jobs
Lab Session & Assignments
8. Flume
Integrating streaming data sources with Hadoop
What is Flume? Benefits of Flume
Internal Architecture of Flume NG
Understanding & working with Sources
Understanding & working with Sinks
Understanding & working with Channels