Big Data HDP Introduction
[Figure: HDP component stack. Layers: Governance, Integration, Tools, Security, Operations, Data workflow, Data Access. Data Access engines: Batch, Script, SQL, NoSQL, Stream, Search, In-Mem, Others. Components named: Atlas, ZooKeeper, HDFS, Oozie, Sqoop, Encryption.]
• Kafka is often used in place of traditional JMS- or AMQP-based message
brokers because of its higher throughput, reliability, and replication.
• Sqoop can also be used to extract data from Hadoop and export it to
relational databases and enterprise data warehouses
• Sqoop helps offload tasks such as ETL from the enterprise data warehouse
to Hadoop for lower-cost, more efficient execution
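The export direction described above (delimited records in Hadoop loaded into a relational table) can be illustrated with a minimal sketch. This is not Sqoop itself: the in-memory string stands in for a file written by a Hadoop job, SQLite stands in for the target database, and the names `HDFS_PART_FILE`, `export_rows`, and `results` are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for a comma-delimited part file written by a Hadoop job.
HDFS_PART_FILE = "1,alice,42.5\n2,bob,17.0\n"


def export_rows(delimited_text, conn, table):
    """Parse delimited records and insert them into a relational table.

    Returns the number of rows exported.
    """
    rows = [
        (int(rec[0]), rec[1], float(rec[2]))
        for rec in csv.reader(io.StringIO(delimited_text))
    ]
    # One transaction for the whole batch: commit on success, roll back on error.
    with conn:
        conn.executemany(
            f"INSERT INTO {table} (id, name, score) VALUES (?, ?, ?)", rows
        )
    return len(rows)


# SQLite in memory plays the role of the relational target (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, name TEXT, score REAL)")
exported = export_rows(HDFS_PART_FILE, conn, "results")
```

The target table has to exist before the load, and the records are mapped to columns by position, which is why the sketch creates `results` first and converts each field explicitly.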
• Hive includes HCatalog
▪ A global metadata management layer that exposes Hive table metadata to
other Hadoop applications.
• Accumulo features:
▪ Server-side programming
▪ Designed to scale
▪ Cell-based access control
▪ Stable
• Storm is useful when milliseconds of latency matter and Spark isn't fast
enough
▪ Has been benchmarked at over a million tuples processed per second per
node
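The cell-based access control feature listed above can be sketched as follows, assuming the feature list describes Apache Accumulo, where each cell carries a visibility expression and a scan returns only the cells the reader's authorizations satisfy. This is a simplified pure-Python model, not the Accumulo API; `visible`, `scan`, and the sample labels are hypothetical, and mixed `&` and `|` are expected to be parenthesized.

```python
import re


def visible(expression, authorizations):
    """Return True if the authorizations satisfy the visibility expression.

    Supports labels, '&' (and), '|' (or) and parentheses. Operators are
    applied left to right, so mixed '&' and '|' should be parenthesized.
    An empty expression is treated as visible to everyone.
    """
    tokens = re.findall(r"[A-Za-z0-9_]+|[&|()]", expression)
    pos = 0

    def parse_expr():
        nonlocal pos
        result = parse_term()
        while pos < len(tokens) and tokens[pos] in ("&", "|"):
            op = tokens[pos]
            pos += 1
            rhs = parse_term()
            result = (result and rhs) if op == "&" else (result or rhs)
        return result

    def parse_term():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = parse_expr()
            pos += 1  # skip the closing ")"
            return value
        return tok in authorizations

    return not tokens or parse_expr()


def scan(cells, authorizations):
    """Return (row, column, value) for every cell the reader may see."""
    return [
        (row, col, value)
        for row, col, value, vis in cells
        if visible(vis, authorizations)
    ]


# Each cell is (row, column, value, visibility expression); sample data only.
cells = [
    ("row1", "name", "Alice", ""),
    ("row1", "ssn", "123-45-6789", "admin&audit"),
    ("row1", "salary", "50000", "admin|(hr&manager)"),
]
hr_view = scan(cells, {"hr", "manager"})
```

Because the check is made per cell rather than per table, two readers scanning the same row can receive different subsets of its columns.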
• Using the Ranger console, administrators can easily manage access
policies for files, folders, databases, tables, or columns
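As a rough illustration of what a policy over databases, tables, and columns amounts to, the sketch below models policies as plain records and checks a request against them. It is a simplified model under stated assumptions, not Ranger's API or its full evaluation logic: `Policy`, `is_allowed`, the wildcard matching, and the deny-by-default rule are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy record: a resource pattern plus who may do what."""
    database: str
    table: str
    column: str
    users: frozenset
    accesses: frozenset


def is_allowed(policies, user, access, database, table, column):
    """Allow the request if any policy matches it; deny by default."""
    for p in policies:
        resource_matches = (
            fnmatch(database, p.database)
            and fnmatch(table, p.table)
            and fnmatch(column, p.column)
        )
        if resource_matches and user in p.users and access in p.accesses:
            return True
    return False


# Sample policies: full read access to one table, one column of another.
policies = [
    Policy("sales", "orders", "*", frozenset({"analyst"}), frozenset({"select"})),
    Policy("sales", "customers", "name", frozenset({"analyst"}), frozenset({"select"})),
]

can_read_orders = is_allowed(policies, "analyst", "select", "sales", "orders", "total")
can_read_ssn = is_allowed(policies, "analyst", "select", "sales", "customers", "ssn")
```

Keeping the policies as data separate from the check is what lets a central console edit them without touching the components that enforce them.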