DataAnalyticsCh 1
Dr. Rashmi M
Department of Computer Science, GFGC T. Dasarahalli
NEP V Sem Data Analytics
UNIT 1
Introduction to Data Analytics
Content: Evolution of Data Analytics, Data Analytics Overview, Types of Data Analytics (Descriptive
Analytics, Diagnostic Analytics, Predictive Analytics, Prescriptive Analytics), Importance and Benefits of
Data Analytics, Different Applications of Analytics in Business, Text Analytics and Web Analytics, Skills for
Business Analytics.
Definition: Data Analytics is a strategy or method to investigate, analyse, and present data in order to
find useful information and support decisions. Data analysis involves extraction, cleaning,
transformation, analysis, modelling and visualization of data, with the objective of extracting vital and
useful information from which conclusions can be drawn and decisions made. Hence Data Analytics is
known as a Data-Driven Decision Making strategy that supports business growth. Data experts in many
companies use data analytics in their core research. Data Mining is a step (subset) of Data Analytics
because it is the process of exploring and analysing large volumes of data to discover hidden patterns
and rules. Hence Data Mining is also known as Knowledge Discovery in Databases (KDD).
Data analytics includes numerous types of data analysis. Any type of data can be subjected to data
analytics techniques, and the resulting insights can be used to improve the business and its growth.
For example:
1. Game companies can use data analytics to recommend new games to players based on their past
gaming behaviour. This can help to increase player engagement and retention.
2. Data analytics can be used to balance game mechanics and difficulty levels to ensure that the game
is fun and challenging for all kinds of players.
3. Game companies can use data analytics to detect and prevent fraud, such as cheating and account
hacking.
4. Retailers use data analytics to track customer behaviour and identify trends in order to optimize
their product offerings and marketing campaigns.
5. Financial sectors use data analytics to evaluate risk, detect fraud, and make investment decisions.
6. Healthcare providers use data analytics to improve patient care, develop new treatments, and
reduce costs.
7. Manufacturers use data analytics to optimize production processes, improve quality control, and
reduce waste.
1. Descriptive analytics: Descriptive analytics is the simplest type of data analysis and the foundation
on which the other types are built. It allows you to pull trends from raw data (for example, classifying
customers into groups based on their product-choice patterns) to describe what happened or is
currently happening. It mines historical data to understand the success or failure that occurred.
Hence we say Descriptive analytics deals with what happened in the past or is happening currently.
Most management reports (sales, marketing, operations, finance), data queries, data dashboards, and
descriptive statistics use this kind of analysis.
Examples:
A retail company uses descriptive analytics to track sales data and identify which
products are selling well and which products are not.
By tracking the cases and deaths in a COVID-19 dataset, descriptive analysis can
identify the infected population of a country.
A social media company uses descriptive analytics to track user engagement data and identify
which types of content users browse.
A healthcare provider uses descriptive analytics to track patient data and identify trends
in disease prevalence and treatment outcomes.
Statistical Summary: Provides statistical descriptions of a given business metric, e.g.
mean, median, standard deviation, percentiles, and interquartile range.
Z-Score: Tells us how far a particular value x lies from the mean, measured in standard
deviations: z = (x - mean) / standard deviation.
Coefficient of Variation: The ratio of the standard deviation to the mean.
Interquartile Range: The difference between the third and first quartiles; an important measure of
the variation in the dataset.
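The measures listed above can be computed with Python's standard `statistics` module. The following is only a minimal sketch: the `sales` figures are invented for illustration, the population standard deviation is assumed, and `statistics.quantiles` uses its default ("exclusive") method, so other tools may report slightly different quartiles.

```python
import statistics

# Hypothetical daily sales figures, invented for illustration.
sales = [100, 120, 140, 160, 180]

mean = statistics.mean(sales)
stdev = statistics.pstdev(sales)  # population standard deviation (an assumption)


def z_score(x, mean, stdev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / stdev


# Coefficient of variation: standard deviation divided by the mean.
coeff_variation = stdev / mean

# Interquartile range: third quartile minus first quartile.
q1, _median, q3 = statistics.quantiles(sales, n=4)
iqr = q3 - q1

print("z-score of 180:", round(z_score(180, mean, stdev), 3))
print("coefficient of variation:", round(coeff_variation, 3))
print("interquartile range:", iqr)
```

A z-score near 0 means the value is close to the mean; a large positive or negative z-score marks a value that is unusual for this dataset.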
Data Dashboard: A tool used to track, organise, visualize, and analyse data. Its overall purpose is to
make it easier for data analysts, decision makers and average users to understand their data, gain
deeper insights and make better data-driven decisions.
Descriptive Statistics: Includes the central tendency, variability, and frequency distribution of the
dataset. The frequency distribution records how often each value occurs, central tendency records the
centre point of the distribution, and variability records the degree of dispersion of the dataset.
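As a rough illustration of these three components, the sketch below computes a frequency distribution, measures of central tendency, and measures of variability for a small list. The `ratings` values are hypothetical, and the choice of range and population variance as the variability measures is an assumption made for this example.

```python
from collections import Counter
import statistics

# Hypothetical customer ratings on a 1 to 5 scale, invented for illustration.
ratings = [4, 5, 3, 4, 4, 2, 5, 4, 3, 5]

# Frequency distribution: how often each rating occurs.
frequency = Counter(ratings)

# Central tendency: the centre point of the distribution.
centre = {
    "mean": statistics.mean(ratings),
    "median": statistics.median(ratings),
    "mode": statistics.mode(ratings),
}

# Variability: the degree of dispersion.
spread = {
    "range": max(ratings) - min(ratings),
    "variance": statistics.pvariance(ratings),
}

for rating in sorted(frequency):
    print(rating, "occurs", frequency[rating], "times")
print(centre)
print(spread)
```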
2. Diagnostic analytics: Diagnostic analytics takes descriptive analytics a step further by trying to
understand why something happened. It uses a variety of statistical techniques to identify patterns,
relationships, and dependencies in the data for a particular problem. Hence Diagnostic analytics deals
with why something happened in the past.
Examples:
A marketing company uses diagnostic analytics to identify which marketing campaigns or
promotions are most effective at driving sales (for example, a particular promotion month or
a theme tied to a particular region).
A footwear company uses diagnostic analytics to find out why April has the highest sales.
It finds that children's beach footwear receives the most reviews, because April is a
vacation month for children.
A manufacturing company uses diagnostic analytics to identify the root cause of product
defects.
A financial institution uses diagnostic analytics to identify customers who are at risk of
defaulting on their loans.
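One common statistical technique for finding relationships in data is the Pearson correlation coefficient. The sketch below is an illustration only: the `ad_spend` and `sales` figures are invented, and `pearson` is a helper written for this example. A value close to +1 or -1 suggests a strong linear relationship, but correlation alone does not prove that one variable causes the other.

```python
import math

# Hypothetical monthly figures, invented for illustration.
ad_spend = [10, 12, 15, 18, 20, 25]
sales = [100, 115, 140, 170, 185, 230]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


r = pearson(ad_spend, sales)
print("correlation between ad spend and sales:", round(r, 3))
```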
3. Predictive analytics: Predictive analytics uses historical data to predict future outcomes. It uses a
variety of machine learning techniques to develop models that can predict things like customer
churn, product demand, and fraud risk. Hence we say Predictive analytics deals with what will
happen in the future.
Examples:
An e-commerce company uses predictive analytics to recommend products to
customers based on their past purchase history.
Random Forest: A widely used predictive technique that takes an ensemble approach to a
problem: it builds a large number of decision trees and combines their predictions. Its
accuracy is generally better than that of a single tree.
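The core idea of predictive analytics, learning from historical data to estimate a future value, can be sketched with a least-squares trend line. This is deliberately a much simpler technique than Random Forest and is shown only to illustrate the idea. The `months` and `demand` values are invented, and `fit_line` is a helper written for this example.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for the line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical demand per month, invented for illustration.
months = [1, 2, 3, 4, 5, 6]
demand = [100, 110, 120, 130, 140, 150]

slope, intercept = fit_line(months, demand)

# Predict demand for month 7 by extending the fitted trend.
forecast_month_7 = slope * 7 + intercept
print("forecast for month 7:", round(forecast_month_7, 1))
```

Real demand data is rarely this regular, which is why practical models use more features and more flexible techniques.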
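4. Prescriptive analytics: Prescriptive analytics builds on the other three types to recommend which
action to take. Hence we say Prescriptive analytics deals with what should be done.
Examples: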
A retailer uses prescriptive analytics to optimize their inventory levels and pricing
strategies.
A manufacturing company uses prescriptive analytics to optimize their production
processes and supply chain management.
Data analytics is a powerful tool that can be used to improve decision-making in all industries. By
understanding the different types of data analytics and how they can be used, businesses can gain
valuable insights from their data and make better decisions about how to allocate resources, improve
products and services, and grow their business.
Data analytics is the process of collecting, cleaning, and analyzing data to extract meaningful insights. It
is a broad field that encompasses a variety of techniques and tools, and it is used in a wide range of
industries. The data analytics process can be broadly divided into the following steps:
1. Data collection: The first step is to collect the data that will be analysed. This data can come
from a variety of sources, such as internal databases, customer surveys, and social media.
2. Data cleaning: Once the data has been collected, it needs to be cleaned to remove any errors
or inconsistencies. This may involve correcting typos, filling in missing values, and removing
outliers.
3. Data preparation: Once the data has been cleaned, it needs to be prepared for analysis. This
may involve converting the data to a different format or splitting the data into different
subsets.
4. Data analysis: This is the step where the data is actually analysed to extract meaningful
insights. This can be done using a variety of statistical and machine learning techniques.
5. Data visualization: Once the data has been analysed, the insights need to be communicated
to the people who need them in a clear and concise way. This can be done using data
visualization tools to create charts, graphs, and other visuals.
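The steps above can be sketched end to end on a tiny dataset. This is a minimal illustration under stated assumptions: the `raw` records are invented, rows with missing sales are dropped rather than filled in, and a text bar chart stands in for a real visualization tool.

```python
import statistics

# Collection: hypothetical raw records, invented for illustration.
raw = [
    {"region": "north", "sales": "120"},
    {"region": "North ", "sales": "130"},  # inconsistent spelling
    {"region": "south", "sales": ""},      # missing value
    {"region": "south", "sales": "90"},
    {"region": "south", "sales": "110"},
]

# Cleaning: normalise region names and drop rows with missing sales.
# (Dropping is an assumption here; filling in a value is another option.)
cleaned = [
    {"region": row["region"].strip().lower(), "sales": row["sales"]}
    for row in raw
    if row["sales"].strip()
]

# Preparation: convert text to numbers and split the data into subsets by region.
by_region = {}
for row in cleaned:
    by_region.setdefault(row["region"], []).append(float(row["sales"]))

# Analysis: average sales per region.
summary = {region: statistics.mean(values) for region, values in by_region.items()}

# Visualization: a simple text bar chart, one '#' per 10 units of average sales.
for region, average in sorted(summary.items()):
    print(f"{region:<6} {'#' * int(average // 10)} {average:.1f}")
```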
The evolution of data analytics can be broadly divided into four eras:
1. Era 1 (1960s to 1980s): This era was dominated by early data processing technologies, such
as punch cards and mainframe computers. Data analytics was largely limited to descriptive
analytics, which involved using simple statistical techniques to analyze historical data.
2. Era 2 (1990s to early 2000s): The rise of relational databases and business intelligence tools
made it possible to analyse larger and more complex datasets. This led to the development
of more sophisticated data analytics techniques, such as diagnostic and predictive analytics.
3. Era 3 (mid-2000s to early 2010s): The emergence of big data and cloud computing
made it possible to analyse unprecedented volumes of data. This led to the development of
new data analytics techniques, such as machine learning and deep learning.
4. Era 4 (present day): Data analytics is now becoming increasingly pervasive and accessible. AI-
powered data analytics tools are enabling businesses of all sizes to extract insights from their
data and make better decisions.
Some of the key trends that have driven the evolution of data analytics:
The rise of big data: The volume, velocity, and variety of data generated today are
extraordinary. This has created a need for new data analytics tools and techniques that
can handle big data.
The rapid growth of cloud computing: Cloud computing has made it easier and more
affordable to access and analyse large datasets. This has democratized data analytics and
made it available to businesses of all sizes.
The evolution of data analytics is having a major impact on businesses of all sizes. Data analytics is now
being used to improve decision-making in all aspects of business, from marketing and sales to product
development and operations.
1. Improved decision-making: Data analytics can help businesses make better decisions by
providing them with insights into their data. For example, a company can use data
analytics to identify which marketing campaigns are most effective or which products are
most popular with customers. This information can then be used to make better
decisions about how to allocate resources and improve business operations.
2. Increased efficiency: Data analytics can help businesses automate tasks and streamline
processes. For example, a company can use data analytics to automate customer service
tasks or to optimize production schedules. This can free up employees to focus on more
strategic initiatives.
3. Reduced costs: Data analytics can help businesses identify and reduce costs. For
example, a company can use data analytics to identify areas where they are wasting
money or to identify opportunities to negotiate better deals with suppliers.
4. Improved customer satisfaction: Data analytics can help businesses improve customer
satisfaction by providing them with deeper insights into their customers' needs and
preferences. For example, a company can use data analytics to identify which products or
services are most popular with customers or to identify areas where they can improve
the customer experience.
5. New product development: Data analytics can help businesses to develop new products
and services that meet the needs of their customers. For example, a technology company
can use data analytics to identify which features are most important to their customers
and to prioritize the development of new features. This information can then be used to
develop new products and services that are more likely to be successful.
1. In Business: Data analytics can be used in a variety of ways to improve business performance.
Here are a few examples:
Marketing and sales: Data analytics can be used to understand customer behaviour,
segment customers, target customers with relevant marketing campaigns, and measure
the effectiveness of marketing campaigns.
Product development: Data analytics can be used to identify customer needs, prioritize
product features, and test new product concepts.
Finance: Data analytics can be used to assess risk, detect fraud, and make investment
decisions.
Human resources: Data analytics can be used to identify top talent, improve employee
engagement, and reduce turnover.
2. In Text Analytics: Text analytics is a type of data analytics that focuses on extracting insights
from unstructured text data. Unstructured text data can come from a variety of sources, such as
social media posts, customer reviews, and product descriptions. Text analytics can be used to:
Understand customer sentiment: Text analytics can be used to identify the overall
sentiment of customer feedback, as well as the specific topics that customers are most
concerned about.
Identify emerging trends: Text analytics can be used to identify emerging trends in the
market, such as new products or services that customers are interested in.
Improve customer service: Text analytics can be used to identify customer support issues
and to develop targeted solutions.
Improve marketing campaigns: Text analytics can be used to improve the effectiveness of
marketing campaigns by identifying the keywords and phrases that are most likely to
resonate with customers.
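A very simple form of text analytics counts keywords and scores sentiment with a word list. The sketch below is only illustrative: the `reviews` are invented, and the `POSITIVE` and `NEGATIVE` word lists are tiny hand-made lexicons assumed for this example. Real text analytics tools use far richer language models.

```python
from collections import Counter
import re

# Hypothetical customer reviews, invented for illustration.
reviews = [
    "Great phone, great battery, fast delivery",
    "Battery is bad and the screen is slow",
    "Fast delivery and great price",
]

# Tiny hand-made sentiment lexicons (an assumption for this sketch).
POSITIVE = {"great", "fast", "good"}
NEGATIVE = {"bad", "slow", "poor"}
LEXICON = POSITIVE | NEGATIVE


def tokenize(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment(text):
    """Positive word count minus negative word count."""
    words = tokenize(text)
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


scores = [sentiment(review) for review in reviews]

# Which sentiment keywords appear most often across all reviews?
keyword_counts = Counter(
    word for review in reviews for word in tokenize(review) if word in LEXICON
)

print("sentiment scores:", scores)
print("most common keywords:", keyword_counts.most_common(2))
```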
3. In Web Analytics: Web analytics is a type of data analytics that focuses on extracting insights
from website data. Website data can include things like page views, visitor demographics, and
traffic sources. Web analytics can be used to:
Understand website traffic: Web analytics can be used to identify which pages are most
visited, which pages are leading to conversions (turning targeted visitors into actual
customers), and where visitors are coming from.
Improve website performance: Web analytics can be used to identify areas where the
website can be improved, such as pages that are loading slowly or pages that have a high
bounce rate.
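The web metrics mentioned above can be computed from session logs. The sketch below makes several assumptions for illustration: the `sessions` data is invented, a bounce is taken to be a session with exactly one page view, and a conversion is taken to be any session that reaches the hypothetical `/checkout` page.

```python
from collections import Counter

# Hypothetical sessions: each is the ordered list of pages one visitor viewed.
sessions = [
    ["/home"],
    ["/home", "/product", "/checkout"],
    ["/product"],
    ["/home", "/product"],
]

# Page views: how many times each page was visited.
page_views = Counter(page for session in sessions for page in session)

# Bounce rate: share of sessions that viewed only one page (assumed definition).
bounce_rate = sum(len(session) == 1 for session in sessions) / len(sessions)

# Conversion rate: share of sessions that reached the checkout page.
conversion_rate = sum("/checkout" in session for session in sessions) / len(sessions)

print("page views:", dict(page_views))
print("bounce rate:", bounce_rate)
print("conversion rate:", conversion_rate)
```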
4. Skills for Business Analytics: Business analysts need a variety of skills to be successful. Here
are a few of the most important:
Technical skills: Business analysts need to have strong technical skills, including
knowledge of statistical analysis, machine learning, and data visualization tools.
Problem-solving skills: Business analysts need to be able to identify and solve complex
business problems.
Note: Datum means "one piece of information" or "one numerical result." Data is the plural form of
datum and, strictly speaking, should not be used as a singular noun. Data are a collection of facts from
which conclusions are drawn, and a datum is an item of factual information derived from measurement or
research.
Structured data: also known as quantitative data, is information that is highly organized and readable
by machine learning algorithms, making it easier to search, manipulate, and analyse. Structured data
exists in a predefined format. You find structured data in relational databases that contain tables,
rows, and columns.
Unstructured data: also known as qualitative data, meaning the information it contains is subjective
and traditional analytics tools and methods cannot easily handle it. Unstructured data comes in the
form of a photo, audio, video, engineering CAD drawing, social media text stream, HTML document, or any
other data that is not captured in a fixed-record, field-defined format.
Unstructured data lacks any predefined model or format. It requires a lot of storage space, and it is
hard to secure. It cannot be represented in a data model or schema, which is why managing, analysing,
or searching unstructured data is hard. It is qualitative in nature and is sometimes stored in a
non-relational (NoSQL) database.
Examples of human-generated unstructured data are text files, email, social media, media files, mobile
data, business applications, and others. Machine-generated unstructured data includes satellite
images, scientific data, sensor data, digital surveillance, and many more.
Semi-structured data: occupies the middle ground between structured and unstructured data. It has
some degree of organization (for example, JSON or XML documents) but is not fully organized into the
fixed record format found in a traditional system or database.
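The difference between semi-structured and structured data can be seen by flattening JSON records into fixed columns. This is a minimal sketch: the records, the `COLUMNS` layout, and the `to_row` helper are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical semi-structured records: the fields vary from record to record.
raw = """[
  {"id": 1, "name": "Asha", "contact": {"email": "asha@example.com"}},
  {"id": 2, "name": "Ravi", "tags": ["new"]}
]"""

# Structured target: every row has the same fixed columns.
COLUMNS = ("id", "name", "email")


def to_row(record):
    """Map one JSON record onto the fixed columns; missing fields become None."""
    email = record.get("contact", {}).get("email")
    return (record["id"], record["name"], email)


rows = [to_row(record) for record in json.loads(raw)]

print(COLUMNS)
for row in rows:
    print(row)
```

Fields that do not fit the fixed layout, such as `tags` here, are dropped, which shows the trade-off of forcing flexible data into a rigid table.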
Note: Structured data can be extracted from unstructured data using business intelligence (BI) tools that
rely on artificial intelligence (AI) and natural language processing (NLP).
Data Science: Data science and data analytics are closely related, but there are key differences between
the two fields. While both involve working with data to gain insights, Data Analytics tends to focus
more on analysing past data to inform decisions in the present, while Data Science focuses on using data
to build models that can predict future outcomes.
Data science is a broad field that encompasses data analytics and includes other areas such as data
engineering and machine learning. Data scientists use statistical and computational methods to extract
insights from data, build predictive models, and develop new algorithms. Data analytics involves
analysing data to gain insights and derive business decisions.
Big data plays a crucial role in data analysis solutions by providing organizations with large amounts of data that
can be used to uncover insights and support decision-making. Big data can be integrated with other data sources
such as structured data, semi-structured data, and unstructured data, to form a holistic view of the organization’s
data landscape, which can lead to more accurate predictions, better decision-making, and more effective
outcomes. The purpose of Big Data is to store and process huge volumes of data, whereas the purpose of
Data Analytics is to analyse the raw data and extract insights from it.
Data Analytics and Business Analytics: Data Analytics and Business Analytics share a common goal, but the
skills needed and the strategies used are different. Data Analytics focuses on processing data and drawing
conclusions (or deriving decisions), whereas Business Analytics focuses on implementing changes and
communicating the results. Data analysts are more likely to work independently, while business analysts need to
work directly with people in different departments and roles. A data analyst should have knowledge of data
structures (or patterns), whereas a business analyst should have knowledge of business structures.
Business analysts use data to identify problems and solutions, but do not perform a deep technical analysis of
the data. They operate at a conceptual level, defining strategy and communicating with stakeholders, and are
concerned with the business implications of data. Data analysts, on the other hand, spend the majority of their
time gathering raw data from various sources, cleaning and transforming it, and applying a range of specialized
techniques to extract useful information and develop conclusions.
Business analysts typically have extensive domain or industry experience in areas such as e-commerce,
manufacturing, or healthcare. People in this role rely less on the technical aspects of analysis than data
analysts, although they do need a working knowledge of statistical tools, common programming languages,
networks, and databases.