Chapter 1
Big Data refers to data sets so large that traditional data storage and
processing systems cannot handle them. Many multinational companies use it
to process data and support the business of many organizations. The data
flow would exceed 150 exabytes per day before replication.
There are five V's of Big Data that explain its characteristics.
5 V's of Big Data
o Volume
o Veracity
o Variety
o Value
o Velocity
Volume
The name Big Data itself refers to enormous size. Big Data is the vast
'volumes' of data generated daily from many sources, such as business
processes, machines, social media platforms, networks, human
interactions, and many more.
Facebook, for example, generates approximately a billion messages, records
4.5 billion clicks of the "Like" button, and receives more than 350 million
new posts each day. Big Data technologies are designed to handle such large
amounts of data.
Variety
Big Data can be structured, unstructured, or semi-structured, and it is
collected from different sources. In the past, data was collected only
from databases and spreadsheets, but these days it comes in an array of
forms, such as PDFs, emails, audio, social media posts, photos, and videos.
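The difference between the three kinds of data can be illustrated with a small Python sketch. The sample values and variable names below are invented for illustration and do not come from any real system; the point is only that each kind of data needs a different way of reading it.

```python
import csv
import io
import json

# Hypothetical sample data, invented purely for illustration.
structured_csv = "order_id,amount\n1001,25.50\n1002,40.00\n"
semi_structured_json = (
    '{"user": "alice", "tags": ["sale", "mobile"], "device": {"os": "android"}}'
)
unstructured_text = "Loved the new phone, but delivery took far too long."

# Structured: fixed columns, so every row has the same fields.
rows = list(csv.DictReader(io.StringIO(structured_csv)))
total_amount = sum(float(row["amount"]) for row in rows)

# Semi-structured: self-describing keys, with nesting and optional fields.
record = json.loads(semi_structured_json)
device_os = record.get("device", {}).get("os", "unknown")

# Unstructured: no schema at all; here we only count the words.
word_count = len(unstructured_text.split())

print(total_amount, device_os, word_count)
```

Structured data can be summed directly by column name, semi-structured data needs defensive lookups because fields may be missing, and unstructured text needs further processing before it yields anything beyond simple counts.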
Big data in the real world
Big data analytics helps companies and governments
make sense of data and make better, informed decisions.
• Entertainment: Providing a personalised recommendation of movies and
music according to a customer’s preferences has been transformative for the
entertainment industry (think Spotify and Netflix).
• Education: Big data helps schools and educational technology companies
develop new curriculums while improving existing plans based on needs and
demands.
• Health care: Monitoring patients’ medical histories helps doctors detect and
prevent diseases.
• Government: Big data can be used to collect data from CCTV and traffic
cameras, satellites, body cameras and sensors, emails, calls, and more, to help
manage the public sector.
• Marketing: Customer information and preferences can be used to create
targeted advertising campaigns with a high return on investment (ROI).
• Banking: Data analytics can help detect and monitor money laundering.
The Big Data Framework consists of the following six main elements:
1. Big Data Strategy
Data has become a strategic asset for most organisations. The capability to
analyse large data sets and discern patterns in the data can provide
organisations with a competitive advantage. Netflix, for example, looks at user
behaviour in deciding what movies or series to produce. Alibaba, the Chinese
sourcing platform, became one of the global giants by identifying which
suppliers to lend money to and recommend on its platform. Big Data has
become Big Business.
In order to achieve tangible results from investments in Big Data, enterprise
organisations need a sound Big Data strategy. How can a return on investment
be realised, and where should effort in Big Data analysis and analytics be
focused? The possibilities for analysis are practically endless, and
organisations can easily get lost
in the zettabytes of data. A sound and structured Big Data strategy is the first
step to Big Data success.
2. Big Data Architecture
To work with massive data sets, organisations need the capability to store
and process large quantities of data, which in turn requires a suitable
underlying IT infrastructure. Enterprises should therefore have a comprehensive
Big Data architecture to facilitate Big Data analysis. How should enterprises design and
set up their architecture to facilitate Big Data? And what are the requirements
from a storage and processing perspective?
The Big Data Architecture element of the Big Data Framework considers the
technical capabilities of Big Data environments. It discusses the various roles
that are present within a Big Data Architecture and looks at the best practices
for design. In line with the vendor-independent structure of the Framework,
this section will consider the Big Data reference architecture of the National
Institute of Standards and Technology (NIST).
3. Big Data Algorithms
A fundamental capability of working with data is to have a thorough
understanding of statistics and algorithms. Big Data professionals therefore
need to have a solid background in statistics and algorithms to deduce insights
from data. Algorithms are unambiguous specifications of how to solve a class
of problems. Algorithms can perform calculations, data
processing and automated reasoning tasks. By applying algorithms to large
volumes of data, valuable knowledge and insights can be obtained.
The Big Data algorithms element of the framework focuses on the (technical)
capabilities of everyone who aspires to work with Big Data. It aims to build a
solid foundation that includes basic statistical operations and provides an
introduction to different classes of algorithms.
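As a rough illustration of the "basic statistical operations" mentioned above, the following Python sketch computes a mean, a median, and a standard deviation, then flags unusual values. The order counts are invented sample data, and the two-standard-deviation threshold is an assumption chosen for the example, not a rule taken from the Big Data Framework.

```python
import statistics

# Hypothetical daily order counts, invented for illustration.
daily_orders = [120, 135, 128, 410, 131, 126, 133]

mean_orders = statistics.mean(daily_orders)
median_orders = statistics.median(daily_orders)
stdev_orders = statistics.stdev(daily_orders)  # sample standard deviation

# Assumed rule of thumb: flag values more than two standard deviations
# away from the mean as possible outliers.
outliers = [x for x in daily_orders if abs(x - mean_orders) > 2 * stdev_orders]

print(mean_orders, median_orders, round(stdev_orders, 2), outliers)
```

Here the single unusual day pulls the mean well above the median, which is one reason analysts look at more than one summary statistic before drawing conclusions.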
4. Big Data Processes
To make Big Data successful in enterprise organisations, it is necessary
to consider more than just skills and technology. Processes can help
enterprises to focus their direction. Processes bring structure and measurable
steps, and can be effectively managed on a day-to-day basis. Additionally,
processes embed Big Data expertise within the organization by following
similar procedures and steps, establishing it as ‘a practice’ of the organization.
Analysis becomes less dependent on individuals, which greatly enhances
the chances of capturing value in the long term.
5. Big Data Functions
Big Data functions are concerned with the organisational aspects of managing
Big Data in enterprises. This element of the Big Data framework addresses how
organisations can structure themselves to set up Big Data roles and discusses
roles and responsibilities in Big Data organisations. Organisational culture,
organisational structures and job roles have a large impact on the success of
Big Data initiatives. We will therefore review some ‘best practices’ in setting up
enterprise big data
In the Big Data Functions section of the Big Data Framework, the non-technical
aspects of Big Data are covered. You will learn how to set up a Big Data Center
of Excellence (BDCoE). The section also addresses critical success factors for
starting a Big Data project in the organization.
6. Artificial Intelligence
The last element of the Big Data Framework addresses Artificial Intelligence
(AI). One of the major areas of interest in the world today, AI provides a whole
world of potential. In this part of the framework, we address the relation
between Big Data and Artificial Intelligence and outline key characteristics of
AI.
Many organisations are keen to start Artificial Intelligence projects, but most
are unsure where to start their journey. The Big Data Framework takes a
functional view of AI in the context of bringing business benefits to enterprise
organisations. The last section of the framework therefore showcases how AI
follows as a logical next step for organisations that have built up the other
capabilities of the Big Data Framework. The last element of the Big Data
Framework has been depicted as a lifecycle on purpose. Artificial Intelligence
can continuously learn from the Big Data in the organization in order to
provide long-lasting value.
Challenges of Big Data Analytics
Data is a very valuable asset in the world today. The economics of data is based
on the idea that value can be extracted from data through analytics. Though big
data and analytics are still in their initial growth stage, their importance
should not be underestimated. As big data expands, the importance of big data
analytics will continue to grow in everyday personal and business life. In
addition, the size and volume of data are increasing daily, making it important
to address big data continuously. This section discusses the challenges of big
data analytics.
According to surveys, many companies are opening up to using big data
analytics in their daily functioning. With the rising popularity of big data
analytics, investing in it is widely seen as a way to support the future
growth of companies and brands.
The key to data value creation is big data analytics, so it is important to focus
on that aspect. Companies use different methods to employ
big data analytics, and there is no magic solution to implementing it
successfully. While data is important, even more important is the process through
which companies gain insights from it. Gaining insights from data is
the goal of big data analytics, so investing in a system that can deliver those
insights is crucial. Successful
implementation of big data analytics therefore requires a combination of skills,
people, and processes that work in close synchronization with each other.
Some of the major challenges that big data analytics programs are facing today
include the following:
1. Uncertainty of Data Management Landscape: Because big data is
continuously expanding, new companies and technologies are developed
every day. A big challenge for companies is to find out which technology
works best for them without introducing new risks and problems.
2. The Big Data Talent Gap: While Big Data is growing, very few experts are
available. This is because Big Data is a complex field, and people who
understand its complexity and intricate nature are few and far
between. This talent gap is a major challenge for the industry.
3. Getting data into the big data platform: Data is increasing every single
day. This means that companies have to tackle a limitless amount of data
on a regular basis. The scale and variety of data available today can
overwhelm any data practitioner, which is why it is important to make
data accessibility simple and convenient for brand managers and owners.
4. Need for synchronization across data sources: As data sets become
more diverse, they must be incorporated into an analytical platform. If
this is ignored, it can create gaps and lead to wrong insights and messages.
5. Getting important insights through the use of Big data analytics: It is
important that companies gain proper insights from big data analytics
and that the correct department has access to this
information. A major challenge in big data analytics is bridging this gap
effectively.
The rest of this section looks at these challenges more closely and at
how companies can tackle them effectively.
• Challenge 1
The challenge of rising uncertainty in data management: In a world of big
data, the more data you have, the easier it is to gain insights from it.
However, there are many disruptive big data technologies in the world
today, and choosing among them can be a tough task. Big data
systems need to support both the operational and, to a great extent, the
analytical processing needs of a company. The newer approaches are generally
grouped into the NoSQL category, which differs from the conventional relational
database management system.
• Challenge 2
The gap in experts in big data analytics: An industry completely depends on
the resources it has access to, whether human or material. Tools for big
data analytics range from traditional relational database tools with alternative
data layouts, designed to increase access speed while decreasing the storage
footprint, to in-memory analytics, NoSQL data management frameworks, and the
broad Hadoop ecosystem. With so many systems and frameworks, there is a
growing and immediate need for application developers who have knowledge
of all these systems. Although these technologies are developing at
a rapid pace, there is a lack of people who possess the required technical skills.
• Challenge 3
The challenge of getting data into the big data platform: Every company is
different and has different amounts of data to deal with. While some
companies are completely data-driven, others might be less so. That is why it is
important to understand these distinctions before implementing a
data plan. Also, not all companies understand the full implications of big
data analytics. Assuming that every company is knowledgeable about the
benefits and growth strategy of business data analytics would seriously harm
the success of such an initiative. Business analytics should therefore be
implemented with the knowledge and understanding of the company.
• Challenge 4
The challenge of the need for synchronization across data sources: Once data
is integrated into a big data platform, data copies are migrated from different
sources at different rates. Schedules can sometimes be out of sync within the
entire system. There are different types of synchrony. It is important that data
is in sync; otherwise, the entire process can be affected. With so many
conventional data marts and data warehouses, and sequences of data extractions,
transformations, and migrations, there is always a risk of data being
unsynchronized.
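One simple way to detect sources that have fallen out of sync is to compare each source's last load time with the freshest one. The sketch below is only illustrative: the source names, timestamps, the `find_stale_sources` helper, and the one-hour tolerance are all hypothetical, and a real platform would read this information from its own load logs or metadata.

```python
from datetime import datetime, timedelta

# Hypothetical last-load timestamps per source, invented for illustration.
last_loaded = {
    "crm_database": datetime(2024, 1, 10, 6, 0),
    "web_clickstream": datetime(2024, 1, 10, 5, 45),
    "warehouse_export": datetime(2024, 1, 9, 6, 0),
}


def find_stale_sources(last_loaded, max_lag):
    """Return sources whose last load trails the freshest source by more than max_lag."""
    newest = max(last_loaded.values())
    return sorted(
        name for name, loaded_at in last_loaded.items()
        if newest - loaded_at > max_lag
    )


# Assumed tolerance: sources may lag the freshest one by at most one hour.
stale = find_stale_sources(last_loaded, max_lag=timedelta(hours=1))
print(stale)
```

Any source reported here would be a candidate for reloading before reports that combine the sources are produced.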
• Challenge 5
The challenge of getting important insights through Big data analytics: Data is
valuable only as long as companies can gain insights from it. Big data analytics
needs to be comprehensive and insightful, augmenting the existing data storage
and giving end users access to the results. The data tools must not only give
companies access to the required information but also eliminate the need for
custom coding. As data grows within the organization, it is important that
companies understand this need and handle it effectively. As data size increases
over time and with each cycle, ensuring that the data is properly adapted is a
critical factor in the success of any company.