Big Data Analytics Seminar Report 2020-21
ABSTRACT
Big data is a new driver of global economic and societal change. The world’s
data collection is reaching a tipping point for major technological changes that can
bring new approaches to decision making and to managing our health, cities, finance and
education. While data complexity is increasing in volume,
variety, velocity and veracity, the real impact hinges on our ability to uncover the
‘value’ in the data through Big Data Analytics technologies. Big Data Analytics
poses a grand challenge for the design of highly scalable algorithms and systems that
integrate the data and uncover large hidden values from datasets that are diverse,
complex, and of a massive scale. Potential breakthroughs include new algorithms,
methodologies, systems and applications in Big Data Analytics that discover useful
and hidden knowledge from Big Data efficiently and effectively.
Big data analytics must also be a team effort cutting across academic institutions,
government, society and industry, and involving researchers from multiple disciplines
including computer science and engineering, health, data science, and social and
policy areas.
CONTENTS
INTRODUCTION
What is Data, Big Data, Big Data Analytics
Benefits using Big Data Analytics
History and Evolution of Big Data Analytics
Why is Big Data Analytics Important
Types of Big Data
Characteristics of Big Data
Applications of Big Data
Advantages and Disadvantages of Big Data
Tools used in Big Data Analytics
The sources of Big Data
Impact of Big Data on Business
How it works and key technologies
Big Data Analytics uses and challenges
Lifecycle of Big Data Analytics
Different types of Big Data Analytics
CONCLUSION
REFERENCES
INTRODUCTION
Big data analytics is the process of examining large data sets containing a variety of
data types (i.e., big data) to uncover hidden patterns, unknown correlations, market
trends, customer preferences and other useful business information. The analytical
findings can lead to more effective marketing, new revenue opportunities, better
customer service, improved operational efficiency, competitive advantages over
rival organizations and other business benefits.
The primary goal of big data analytics is to help companies make more informed
business decisions by enabling data scientists, predictive modelers and other
analytics professionals to analyze large volumes of transaction data, as well as other
forms of data that may be untapped by conventional business intelligence (BI)
programs. That could include Web server logs and Internet clickstream data, social
media content and social network activity reports, text from customer emails and
survey responses, mobile-phone call detail records and machine data captured by
sensors connected to the Internet of Things.
With the launch of Web 2.0, a large amount of valuable business data started being
generated beyond the organization by consumers and, generally, by web users. This
data can be structured or unstructured, and can come from multiple sources such as
social networks, products viewed in virtual stores, information read by sensors, GPS
signals from mobile devices, IP addresses, cookies, bar codes, etc.
What is Data?
The quantities, characters, or symbols on which operations are performed by a
computer, which may be stored and transmitted in the form of electrical signals and
recorded on magnetic, optical, or mechanical recording media.
Data also exists in different formats: structured, semi-structured, and
unstructured. For example, data in a regular Excel sheet is classified as
structured data because it has a definite format. In contrast, emails are
semi-structured, and pictures and videos are unstructured data. All this data
combined makes up Big Data.
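The three formats can be illustrated with a minimal Python sketch. The sample records below (an order table, an email-like JSON object, and a few raw bytes) are made-up values chosen only for illustration, and the sketch uses nothing beyond the standard library.

```python
import csv
import io
import json

# Structured: rows and columns with a fixed schema (like a spreadsheet).
structured_text = "order_id,amount\n1,250\n2,125\n"
rows = list(csv.DictReader(io.StringIO(structured_text)))
total_amount = sum(int(row["amount"]) for row in rows)

# Semi-structured: self-describing fields, but records need not share a schema.
semi_structured_text = (
    '{"from": "a@example.com", "subject": "Invoice", "tags": ["billing"]}'
)
email = json.loads(semi_structured_text)

# Unstructured: raw bytes with no schema; meaning must be extracted separately.
unstructured_bytes = bytes([0xFF, 0xD8, 0xFF, 0xE0])  # made-up image-like bytes

print(total_amount)             # columns can be summed directly
print(email["subject"])         # fields can be looked up by name
print(len(unstructured_bytes))  # only the size is known without further analysis
```

The structured rows can be aggregated directly, the semi-structured record can be queried by field name, and the unstructured bytes reveal nothing without additional processing.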
Benefits using Big Data Analytics
The new benefits that big data analytics brings to the table are speed and
efficiency. Whereas a few years ago a business would have gathered information,
run analytics and unearthed information that could be used for future decisions,
today that business can identify insights for immediate decisions. The ability to
work faster – and stay agile – gives organizations a competitive edge they didn’t
have before.
1. Cost reduction. Big data technologies such as Hadoop and cloud-based analytics
bring significant cost advantages when it comes to storing large amounts of data –
plus they can identify more efficient ways of doing business.
2. Faster, better decision making. With the speed of Hadoop and in-memory
analytics, combined with the ability to analyze new sources of data, businesses are
able to analyze information immediately – and make decisions based on what
they’ve learned.
3. New products and services. With the ability to gauge customer needs and
satisfaction through analytics comes the power to give customers what they want.
Davenport points out that with big data analytics, more companies are creating new
products to meet customers’ needs.
Types of Big Data
Big Data is generally categorized into three different varieties. They are as shown
below:
Structured Data
Semi-Structured Data
Unstructured Data
Characteristics of Big Data
Volume
Facebook alone can generate about a billion messages, record 4.5 billion clicks of the “like”
button, and receive over 350 million new posts each day. Such a
huge amount of data can only be handled by Big Data technologies.
Variety
Veracity
Veracity means the degree of reliability that the data has to offer. Since a
major part of the data is unstructured and irrelevant, Big Data systems need a
way to filter out or translate such data, because reliable data is crucial to business
development.
Value
Value is the major issue that we need to concentrate on. It is not just the amount of
data that we store or process. It is actually the amount of valuable, reliable and
trustworthy data that needs to be stored, processed, and analyzed to find insights.
Velocity
Last but not least, Velocity plays a major role: there is no
point in investing so much only to end up waiting for the data. So, a major aspect of
Big Data is to provide data on demand and at a faster pace.
The sources of Big Data
Social data comes from the Likes, Tweets & Retweets, Comments, Video Uploads,
and general media that are uploaded and shared via the world’s favorite social media
platforms. This kind of data provides invaluable insights into consumer behavior and
sentiment and can be enormously influential in marketing analytics. The public web
is another good source of social data, and tools like Google Trends can be used to
good effect to increase the volume of big data.
Transactional data is generated from all the daily transactions that take place both
online and offline. Invoices, payment orders, storage records, delivery receipts – all
are characterized as transactional data. Yet data alone is almost meaningless, and most
organizations struggle to make sense of the data that they are generating and how it
can be put to good use.
Big data technologies help companies store large volumes of data while enabling
significant cost benefits. Such technologies include cloud-based analytics and
Hadoop. They help businesses analyze information and improve decision-making.
Furthermore, data breaches create a need for enhanced security, which the right
technology can help address.
Big data has the potential to bring social and economic benefits to businesses.
Therefore, several government agencies have formulated policies for promoting the
development of big data.
Over the years, big data analytics has evolved with the adoption of agile technologies
and an increased focus on advanced analytics. There is no single technology that
encompasses big data analytics. Several technologies work together to help
companies obtain optimum value from their information. Among them are machine
learning, artificial intelligence, quantum computing, Hadoop, in-memory analytics,
and predictive analytics. These technology trends are likely to spur the demand for
big data analytics in the coming years.
Earlier, big data was mainly deployed by businesses that could afford the
technologies and channels used to gather and analyze data. Nowadays, both large
and small business enterprises are increasingly relying on big data for intelligent
business insights, which in turn boosts the demand for big data.
Enterprises in all industries are considering how big data can be used in
business. Its uses are poised to improve productivity, identify customer needs, offer
a competitive advantage, and create scope for sustainable economic development.
How it works and key technologies
Data management. Data needs to be high quality and well-governed before it can
be reliably analyzed. With data constantly flowing in and out of an organization, it's
important to establish repeatable processes to build and maintain standards for data
quality. Once data is reliable, organizations should establish a master data
management program that gets the entire enterprise on the same page.
Data mining. Data mining technology helps you examine large amounts of data to
discover patterns in the data – and this information can be used for further analysis
to help answer complex business questions. With data mining software, you can sift
through all the chaotic and repetitive noise in data, pinpoint what's relevant, use that
information to assess likely outcomes, and then accelerate the pace of making
informed decisions.
Hadoop. This open source software framework can store large amounts of data and
run applications on clusters of commodity hardware. It has become a key technology
for doing business due to the constant increase of data volumes and varieties, and its
distributed computing model processes big data fast. An additional benefit is that
Hadoop's open source framework is free and uses commodity hardware to store large
quantities of data.
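The distributed computing model mentioned above is commonly described in terms of map, shuffle and reduce phases. The following is a minimal single-process Python sketch of that programming model, assuming a word-count task. It is not Hadoop code and does not use any Hadoop API; the function names and sample chunks are hypothetical.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the grouped values to get one count per word."""
    return {key: sum(values) for key, values in grouped.items()}


# Each chunk stands in for a block of a large file stored on a different node.
chunks = ["big data big insights", "data drives decisions"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
```

In a real cluster each chunk would be mapped on the node that stores it, so the work scales out by adding commodity machines rather than by buying a larger one.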
Text mining. With text mining technology, you can analyze text data from the web,
comment fields, books and other text-based sources to uncover insights you hadn't
noticed before. Text mining uses machine learning or natural language
processing technology to comb through documents – emails, blogs, Twitter feeds,
surveys, competitive intelligence and more – to help you analyze large amounts of
information and discover new topics and term relationships.
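As a rough illustration of how term relationships can be surfaced, the sketch below counts term frequencies and per-document term co-occurrences using only the Python standard library. The sample reviews and the stop-word list are invented for the example, and real text mining tools apply far richer language processing than this.

```python
import re
from collections import Counter
from itertools import combinations

# A tiny, hypothetical stop-word list; real lists are much longer.
STOP_WORDS = {"the", "is", "and", "a", "to", "was", "but"}


def tokenize(text):
    """Lowercase the text, split it into words and drop stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


documents = [
    "The delivery was late and the packaging was damaged",
    "Late delivery but great support",
    "Support was great and the delivery is fast",
]

term_counts = Counter()
pair_counts = Counter()
for doc in documents:
    tokens = tokenize(doc)
    term_counts.update(tokens)
    # Count each unordered pair of distinct terms once per document.
    pair_counts.update(combinations(sorted(set(tokens)), 2))

print(term_counts.most_common(3))
print(pair_counts.most_common(3))
```

Frequent terms hint at topics, while frequently co-occurring pairs (such as "delivery" with "late") hint at relationships worth investigating.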
Big Data Analytics uses and challenges
Big data analytics applications often include data from both internal systems and
external sources, such as weather data or demographic data on consumers compiled
by third-party information services providers. In addition, streaming analytics
applications are becoming common in big data environments as users look to
perform real-time analytics on data fed into Hadoop systems through stream
processing engines, such as Spark, Flink and Storm.
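A common building block of streaming analytics is the time window. The sketch below groups a stream of events into fixed, non-overlapping (tumbling) windows in plain Python. It assumes events arrive in timestamp order and does not use the Spark, Flink or Storm APIs; the function name and the sample events are hypothetical.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Yield (window_start, Counter) per window; events must be in time order."""
    current_start = None
    counts = Counter()
    for timestamp, key in events:
        window_start = timestamp - (timestamp % window_seconds)
        if current_start is None:
            current_start = window_start
        if window_start != current_start:
            # The previous window is complete, so emit it and start a new one.
            yield current_start, counts
            current_start, counts = window_start, Counter()
        counts[key] += 1
    if current_start is not None:
        yield current_start, counts


# (timestamp in seconds, event type)
events = [(0, "click"), (3, "view"), (7, "click"),
          (12, "click"), (14, "view"), (21, "click")]
windows = list(tumbling_window_counts(events, 10))
for start, counts in windows:
    print(start, dict(counts))
```

Because each window is emitted as soon as the next one begins, results become available while the stream is still running rather than after a batch job finishes.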
Early big data systems were mostly deployed on premises, particularly in large
organizations that collected, organized and analyzed massive amounts of data. But
cloud platform vendors, such as Amazon Web Services (AWS) and Microsoft, have
made it easier to set up and manage Hadoop clusters in the cloud. The same goes for
Hadoop suppliers such as Cloudera-Hortonworks, which supports the distribution of
the big data framework on the AWS and Microsoft Azure clouds. Users can now
spin up clusters in the cloud, run them for as long as they need and then take them
offline with usage-based pricing that doesn't require ongoing software licenses.
Big data has become increasingly beneficial in supply chain analytics. Big supply
chain analytics utilizes big data and quantitative methods to enhance decision
making processes across the supply chain. Specifically, big supply chain analytics
expands datasets for increased analysis that goes beyond the traditional internal data
found on enterprise resource planning (ERP) and supply chain management (SCM)
systems. Also, big supply chain analytics implements highly effective statistical
methods on new and existing data sources. The insights gathered facilitate better
informed and more effective decisions that benefit and improve the supply chain.
Potential pitfalls of big data analytics initiatives include a lack of internal analytics
skills and the high cost of hiring experienced data scientists and data engineers to
fill the gaps.
Lifecycle of Big Data Analytics
Stage 1 - Business case evaluation - The Big Data analytics lifecycle begins with a
business case, which defines the reason and goal behind the analysis.
Stage 2 - Identification of data - Here, a broad variety of data sources are identified.
Stage 3 - Data filtering - All of the identified data from the previous stage is filtered
here to remove corrupt data.
Stage 4 - Data extraction - Data that is not compatible with the tool is extracted and
then transformed into a compatible form.
Stage 5 - Data aggregation - In this stage, data with the same fields across different
datasets is integrated.
Stage 6 - Data analysis - Data is evaluated using analytical and statistical tools to
discover useful information.
Stage 7 - Visualization of data - With tools like Tableau, Power BI, and QlikView,
Big Data analysts can produce graphic visualizations of the analysis.
Stage 8 - Final analysis result - This is the last step of the Big Data analytics lifecycle,
where the final results of the analysis are made available to business stakeholders
who will take action.
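Stages 3 to 6 of the lifecycle above can be sketched as a small Python pipeline. The two data sources, their formats and all function names are assumptions made for illustration; a real pipeline would read from actual systems and handle many more failure cases.

```python
from collections import defaultdict

# Stage 2 (assumed): two hypothetical sources with different formats.
web_orders = [
    {"customer": "c1", "amount": "120.50"},
    {"customer": "c2", "amount": None},  # corrupt: missing amount
    {"customer": "c1", "amount": "30.00"},
]
store_orders = ["c2;45.25", "c3;80.00", "bad-record"]


def filter_web(records):
    """Stage 3: drop records with missing fields."""
    return [r for r in records if r.get("customer") and r.get("amount")]


def extract_store(lines):
    """Stages 3-4: drop malformed lines and convert the rest to dict form."""
    out = []
    for line in lines:
        parts = line.split(";")
        if len(parts) == 2:
            out.append({"customer": parts[0], "amount": parts[1]})
    return out


def aggregate(*datasets):
    """Stage 5: combine datasets that share the same fields."""
    return [dict(r, amount=float(r["amount"])) for ds in datasets for r in ds]


def analyze(records):
    """Stage 6: compute total spend per customer."""
    totals = defaultdict(float)
    for r in records:
        totals[r["customer"]] += r["amount"]
    return dict(totals)


clean = aggregate(filter_web(web_orders), extract_store(store_orders))
spend_per_customer = analyze(clean)
print(spend_per_customer)
```

The per-customer totals are the kind of result that would then be visualized (Stage 7) and handed to stakeholders (Stage 8).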
Different types of Big Data Analytics
Descriptive Analytics
This summarizes past data into a form that people can easily read. This helps in
creating reports, like a company’s revenue, profit, sales, and so on. Also, it helps
in the tabulation of social media metrics.
Use Case: The Dow Chemical Company analyzed its past data to increase
facility utilization across its office and lab space. Using descriptive analytics,
Dow was able to identify underutilized space. This space consolidation helped
the company save nearly US $4 million annually.
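In its simplest form, descriptive analytics reduces past records to a few readable figures. A minimal Python sketch follows, using made-up monthly revenue values rather than data from any real company.

```python
from statistics import mean

# Hypothetical monthly revenue, in thousands.
monthly_revenue = {"Jan": 120, "Feb": 135, "Mar": 150, "Apr": 110}

summary = {
    "total": sum(monthly_revenue.values()),
    "average": mean(monthly_revenue.values()),
    "best_month": max(monthly_revenue, key=monthly_revenue.get),
    "worst_month": min(monthly_revenue, key=monthly_revenue.get),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The summary describes what happened without explaining why, which is the question diagnostic analytics takes up.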
Diagnostic Analytics
This is done to understand what caused a problem in the first place. Techniques
like drill-down, data mining, and data discovery are all examples. Organizations
use diagnostic analytics because it provides in-depth insight into a particular
problem.
Use Case: An ecommerce company’s report shows that its sales have gone
down, although customers are adding products to their carts. There can be
various reasons: the form didn’t load correctly, the shipping fee is too high, or
there are not enough payment options available. This is where you can use
diagnostic analytics to find the reason.
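One way to drill down into a case like this is to measure where sessions leave the checkout funnel. The Python sketch below assumes a four-step funnel and a made-up list recording the last step each session reached; both are hypothetical and stand in for real clickstream data.

```python
from collections import Counter

# Hypothetical funnel steps, in order.
funnel = ["cart", "shipping", "payment", "purchased"]

# Hypothetical data: the last step each checkout session reached.
sessions = ["cart", "shipping", "payment", "shipping",
            "purchased", "shipping", "cart", "payment"]

last_step = Counter(sessions)

# Sessions that reached a step = sessions that stopped there or later.
reached = {
    step: sum(last_step[s] for s in funnel[i:])
    for i, step in enumerate(funnel)
}

# Drop-off at a step = share of sessions that reached it but went no further.
drop_off = {step: last_step[step] / reached[step] for step in funnel[:-1]}
worst_step = max(drop_off, key=drop_off.get)

print(reached)
print(drop_off)
print(worst_step)
```

The step with the highest drop-off is where the investigation should focus first, for example on payment options if most sessions end at the payment step.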
Predictive Analytics
This type of analytics looks into historical and present data to make
predictions about the future. Predictive analytics uses data mining, AI, and
machine learning to do so.
It works on predicting customer trends, market trends, and so on.
Use Case: PayPal determines what kind of precautions they have to take to protect
their clients against fraudulent transactions. Using predictive analytics, the
company uses all the historical payment data and user behavior data and builds
an algorithm that predicts fraudulent activities.
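The basic idea of learning from historical data to predict a future value can be shown with a least-squares trend line. This is a minimal Python sketch with made-up sales figures; it is a deliberately simple stand-in and not the fraud-detection approach used by PayPal or any other company.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: month number and units sold.
months = [1, 2, 3, 4, 5]
sales = [100, 110, 120, 130, 140]

slope, intercept = fit_line(months, sales)
forecast_month_6 = slope * 6 + intercept
print(forecast_month_6)
```

Production systems replace the straight line with machine learning models trained on many more features, but the workflow of fitting on past data and scoring new cases is the same.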
Prescriptive Analytics
CONCLUSION
Big Data Analytics is a security-enhancing tool of the future. The amount of
information that can be gathered, organized, and applied to users in a personalized
fashion would take a human days, weeks, or even months to process. In a
capitalistic market such as that of the United States of America, competition is key. Time
cannot be wasted gathering information and making decisions on incidents that have
already taken place. Stopping incidents in their tracks, completing investigative
work, and quarantining threatening sources need to happen immediately so that
administrators and management can make on-the-spot decisions. With big data
analytics, more educated decisions can be made and focus can remain on business
operations moving forward.
The availability of Big Data, low-cost commodity hardware, and new information
management and analytic software have produced a unique moment in the history
of data analysis. The convergence of these trends means that we have the capabilities
required to analyze astonishing data sets quickly and cost-effectively for the first
time in history. These capabilities are neither theoretical nor trivial. They represent
a genuine leap forward and a clear opportunity to realize enormous gains in terms of
efficiency, productivity, revenue, and profitability. The Age of Big Data is here, and
these are truly revolutionary times if both business and technology professionals
continue to work together and deliver on the promise.
REFERENCES
www.123seminarsonly.com
www.wikipedia.com
www.edureka.co