Unit II: Big Data Architecture
Big Data has quickly become one of the most in-demand topics
in the industry.
The main business drivers behind the rising demand for Big Data
Analytics are:
1. The digitization of society
2. The drop in technology costs
3. Connectivity through cloud computing
4. Increased knowledge about data science
5. Social media applications
6. The rise of the Internet of Things (IoT)
Example: Several companies that put Big Data at the core of
their strategy, such as Apple, Amazon, Facebook and Netflix,
became very successful at the beginning of the 21st century.
1. Ingestion:
The ingestion layer is the first step, in which raw data is
pulled in.
The data comes from internal sources, relational databases,
non-relational databases, social media, emails, phone calls, etc.
There are two kinds of ingestion:
Batch, in which large groups of data are gathered and delivered
together.
Streaming, in which data arrives as a continuous flow. This is
necessary for real-time data analytics.
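The difference between the two ingestion modes can be illustrated with a minimal Python sketch. The function names `ingest_batch` and `ingest_stream`, the `batch_size` parameter and the sample `events` are hypothetical and chosen only for illustration. A real pipeline would read from an actual source such as a database or a message queue.

```python
from typing import Iterable, Iterator, List


def ingest_batch(records: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Gather records into groups of batch_size and deliver each group together."""
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Deliver the final group even if it is smaller than batch_size.
        yield batch


def ingest_stream(records: Iterable[dict]) -> Iterator[dict]:
    """Deliver each record individually, as soon as it arrives."""
    for record in records:
        yield record


# Hypothetical sample data standing in for a real source.
events = [{"id": i, "source": "sensor"} for i in range(5)]

batches = list(ingest_batch(events, batch_size=2))
streamed = list(ingest_stream(events))
```

In batch mode the consumer receives nothing until a whole group is ready, while in streaming mode each record is available immediately, which is why streaming suits real-time analytics.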
2. Storage:
In the storage layer, the converted data is kept in a data lake
or data warehouse, where it is eventually processed.
The data lake/warehouse is the most essential component of a
big data ecosystem.
It should contain only complete, relevant data so that the
insights are as valuable as possible.
It must be efficient, with as little redundancy as possible, to
allow for quicker processing.
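The two storage requirements above (complete, relevant data and little redundancy) can be sketched as a simple cleaning step that runs before records are written. This is only an illustrative sketch: the function `prepare_for_storage`, the `REQUIRED_FIELDS` schema and the sample records are assumptions, not part of any particular data lake or warehouse product.

```python
from typing import Iterable, List, Set

# Hypothetical schema: fields every stored record is assumed to need.
REQUIRED_FIELDS = ("id", "value")


def prepare_for_storage(records: Iterable[dict]) -> List[dict]:
    """Drop incomplete records and duplicates before they reach storage."""
    seen_ids: Set[int] = set()
    stored: List[dict] = []
    for record in records:
        # Completeness: skip records with a missing or empty required field.
        if any(record.get(field) is None for field in REQUIRED_FIELDS):
            continue
        # Redundancy: skip records whose id has already been stored.
        if record["id"] in seen_ids:
            continue
        seen_ids.add(record["id"])
        stored.append(record)
    return stored


raw = [
    {"id": 1, "value": 10},
    {"id": 1, "value": 10},    # duplicate of the first record
    {"id": 2, "value": None},  # incomplete record
    {"id": 3, "value": 7},
]
clean = prepare_for_storage(raw)
```

Here only the records with ids 1 and 3 are kept, so later processing works on a smaller data set with no duplicates.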
3. Analysis:
In the analysis layer, data is passed through several tools
that shape it into actionable insights.
There are four types of analytics on big data: