Hadoop Course Content


1. Hadoop Overview
Origins of BIGDATA
BIGDATA processing and storage problems
Platforms for BIGDATA processing and storage: Hadoop & Spark
What is Hadoop? Benefits of Hadoop
Overview of Hadoop EcoSystem

2. Hadoop Installation
Single-node Hadoop Setup using Images
Multi-node Hadoop cluster setup with public repositories
Multi-node Hadoop cluster setup with private repositories

3. HDFS
What is HDFS? Why do we need HDFS?
HDFS Architecture
Concepts of HDFS: Block, NameNode, DataNode, Secondary NameNode
Replication
Read & Write request walkthrough
Understanding Pipelining
Demonstration of Fault Tolerance & Self-Healing in reality
HDFS as a service - WebHDFS
HDFS Integration with other services/applications
HDFS Shell access for administration
Lab Session & Assignments

4. MapReduce
What is MapReduce? Why do we need MapReduce?
MapReduce Architecture
JobTracker and TaskTracker setup
Steps to develop MapReduce Jobs
Internal execution of MapReduce Jobs
Shuffle, Sort & Partitioning
Speculative Execution
Input/Output formats
Writing & Debugging MR programs in Java
MR programming with Python/C++/Ruby with HadoopStreaming
Lab Session & Assignments

5. PIG
What is PIG? Why do we need PIG?
PIG installation
PIG internal architecture
PIG Latin scripting
PIG internal optimization
PIG interaction via grunt shell (Local & Hadoop mode)
Writing PIG UDFs
Working with open PIG libraries like piggybank, datafu & elephant bird
Lab Session & Assignments

6. HIVE
What is HIVE? Why do we need HIVE?
HIVE installation
HIVE internal architecture
Understanding HIVE metastore
Datamodel
Managing tables
HIVE Query Language
HIVE UDFs
HIVE partitioning & bucketing
Lab Session & Assignments

7. Sqoop
Integrating RDBMS servers data with Hadoop
What is Sqoop? Benefits of Sqoop
Importing data from RDBMS servers to HDFS/HIVE
Exporting data from HDFS/HIVE to RDBMS servers
Working with sqoop jobs
Lab Session & Assignments

8. Flume
Integrating Streaming data sources with Hadoop
What is Flume? Benefits of Flume
Internal Architecture of FlumeNG
Understanding & working with Sources
Understanding & working with Sinks
Understanding & working with Channels

9. Oozie
What is Oozie? Benefits of Oozie
Oozie architecture
Workflows, coordinators & bundles
Creating, deploying & monitoring Oozie workflows
Understanding & working with datasets
Time & data availability based workflow automation
Lab Session & Assignments

10. Zookeeper
What is Zookeeper? Benefits of Zookeeper
Zookeeper architecture
Read and write walkthrough
Data arrangement in Zookeeper
Lab Session & Assignments

11. NoSQL Databases


Limitations of RDBMS
Why do we need NoSQL databases
Types of NoSQL databases
Understanding one NoSQL DB (HBASE/MongoDB/Cassandra)
Installation of NoSQL database clusters
Interacting with NoSQL database clusters
