Unit-2 Data Analytics
Unit-2
Data Analytics: Introduction to Analytics,
Introduction to Tools and Environment,
Application of Modeling in Business,
Databases & Types of Data and Variables,
Data Modeling Techniques, Missing
Imputations etc., Need for Business
Modeling.
Introduction to Analytics
• In today's data-driven world, enormous
amounts of data are generated daily from
various sources, such as social media,
business transactions, and online activities.
• Extracting meaningful insights from this data
has become essential for individuals and
organizations to make informed decisions.
• Data Analytics plays a vital role in
identifying patterns, improving operations,
and driving success.
• 4 main factors which signify the
need for Data Analytics are:
i. Gather Hidden Insights
ii. Generate Reports
iii. Perform Market Analysis
iv. Improve Requirements and Experience
i. Gather Hidden Insights:
• Data often holds valuable information that
is not immediately visible.
• By analyzing data, we can uncover patterns
and insights that help solve problems or
identify opportunities and make strategic
decisions.
Example: A streaming platform like Netflix
analyzes user viewing patterns to
recommend shows or movies.
ii. Generate Reports:
• Reports present analyzed data in a
structured manner, helping organizations
and teams make better decisions.
Example: Schools can generate reports from
student performance data to identify
subjects where students need additional
support.
iii. Perform Market Analysis:
• Analyzing market trends helps
organizations understand customer
preferences and stay competitive.
Example: A smartphone company
analyzes market trends to decide
which features to prioritize in its next
release.
iv. Improve Requirements and
Experience:
• Understanding customer or user
behavior through data analytics allows
for better services and improved
experiences.
Example: E-commerce platforms analyze
customer purchase patterns to suggest
personalized product recommendations.
Data Analytics:
• It involves techniques for analyzing data
to enhance productivity and achieve
business gains.
• Data is extracted from various sources,
cleaned, categorized, and analyzed to
uncover behavioral patterns and trends.
• The techniques and tools vary based on
organizational needs.
Common Techniques:
i. Data Mining: Extracting patterns from large datasets.
ii. Statistical Analysis: Applying mathematical models to
analyze data.
iii. Predictive Analytics: Using historical data to predict
future trends.
iv. Machine Learning: Automating data analysis to
discover insights.
Example: A bank may use machine learning models to
predict customer churn and identify clients who are
likely to leave the bank. By offering targeted promotions,
the bank can retain these customers.
Role of Data Analysts
• Data Analysts play a crucial role in transforming data
into valuable insights. They collect, process, and
analyze data, then present their findings in reports or
dashboards that help decision-makers.
Example Workflow:
• Collect Data: Gather information from various
sources, such as databases or surveys.
• Clean Data: Remove duplicates and errors to ensure
data accuracy.
• Analyze Data: Use tools and techniques to find
patterns.
• Generate Reports: Present insights through charts,
tables, and written summaries.
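The four workflow steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the order records, the field names (`order_id`, `region`, `amount`), and the helper functions `clean`, `analyze`, and `report` are all hypothetical and do not come from the slides.

```python
from collections import defaultdict

# Collect Data: hypothetical raw records, as they might arrive from a survey or database export.
raw = [
    {"order_id": 1, "region": "North", "amount": "120.50"},
    {"order_id": 2, "region": "South", "amount": "80.00"},
    {"order_id": 2, "region": "South", "amount": "80.00"},  # duplicate row
    {"order_id": 3, "region": "North", "amount": "n/a"},    # invalid amount
    {"order_id": 4, "region": "South", "amount": "45.25"},
]


def clean(records):
    """Clean Data: drop duplicate order IDs and rows whose amount is not a number."""
    seen, cleaned = set(), []
    for record in records:
        if record["order_id"] in seen:
            continue
        try:
            amount = float(record["amount"])
        except ValueError:
            continue
        seen.add(record["order_id"])
        cleaned.append({**record, "amount": amount})
    return cleaned


def analyze(records):
    """Analyze Data: total the amount per region."""
    totals = defaultdict(float)
    for record in records:
        totals[record["region"]] += record["amount"]
    return dict(totals)


def report(totals):
    """Generate Reports: format the totals as a small text table."""
    lines = ["Region  Total"]
    for region, total in sorted(totals.items()):
        lines.append(f"{region:<7} {total:>6.2f}")
    return "\n".join(lines)


cleaned = clean(raw)
totals = analyze(cleaned)
print(report(totals))
```

In practice each step is usually handled by dedicated tools (databases, spreadsheets, dashboards); the sketch only shows how the stages feed into one another.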
Fig. Data Analytics
https://www.wallstreetmojo.com/data-analytics/#what-is-data-analytics
Types of Analytics and Human
Knowledge Involvement
i. Descriptive Analytics
ii. Diagnostic Analytics
iii. Predictive Analytics
iv. Prescriptive Analytics
v. Cognitive Analytics
Fig. Data and Human Knowledge Involvement
https://www.sv-europe.com/blog/10-reasons-organisation-ready-prescriptive-analytics/
i. Descriptive Analytics: Provides an understanding of past
data and helps answer "what happened?"
Example: Monthly sales reports showing revenue trends.
• Human Input: High human interpretation is required
to summarize the data and understand its context.
ii. Diagnostic Analytics: Examines data to determine the
causes of events and answer "Why did it happen?"
Example: Identifying why sales dropped by analyzing
customer feedback, marketing campaigns, and competitor
actions.
• Human Input: Moderate, as analysts must interpret
correlations and identify root causes.
iii. Predictive Analytics: Predicts future outcomes based on
historical data.
Example: Forecasting demand for seasonal products.
• Human Input: Less human intervention is needed;
algorithms handle most of the prediction tasks.
iv. Prescriptive Analytics: Provides recommendations for
optimal decision-making.
Example: A logistics company may use prescriptive analytics
to determine the most efficient delivery routes.
• Human Input: Minimal or no human input is required, as
automated systems handle decision-making.
v. Cognitive Analytics: Mimics human thought processes to
analyze data and provide insights. It combines artificial
intelligence, machine learning, and natural language
processing.
Example: A virtual assistant like Siri or Alexa analyzing user
requests and providing relevant information.
• Human Input: Very minimal, as cognitive systems
operate autonomously and learn from data over
time.
Introduction to Analytics
• Data Analytics has become essential for
businesses to stay competitive and thrive in
the data-driven world.
• Understanding the different types of
analytics and how they require varying
levels of human knowledge can help
organizations make better decisions and
achieve operational excellence.
Introduction to Analytics
https://uwex.wisconsin.edu/stories-news/data-science-vs-data-analytics/
Introduction to Tools and Environment
Data Analytics typically involves three main
components:
i. Subject Knowledge: Understanding the business
or field where the analysis is being applied (e.g.,
healthcare, marketing, or education).
ii. Statistical Knowledge: Applying mathematical
techniques to analyze data and draw meaningful
conclusions.
iii. Technical Knowledge: Using tools and
programming languages to clean, analyze, and
visualize data effectively.
https://www.wallstreetmojo.com/data-analytics/#what-is-data-analytics
• A Data Analyst must be proficient in all
three areas to generate valuable
insights for businesses.
Advantages:
• Reduces data duplication
• Easy to update and retrieve data
• Simple to understand
relationships
iii. Network Data Modeling Technique
• Data is represented using nodes
(entities) and edges
(relationships).
• Unlike hierarchical models, an
entity can have multiple parents.
Example 1: Hospital Management
System
• A hospital database where
patients can visit multiple
doctors, and doctors can have
multiple patients.
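The many-to-many relationship in this hospital example can be sketched as a small graph in Python. The patient and doctor names, the `visits` list, and the two lookup tables are hypothetical and serve only to show that a node may be linked to several nodes in both directions.

```python
from collections import defaultdict

# Hypothetical edges of the network: each pair links a patient node to a doctor node.
visits = [
    ("Patient A", "Dr. X"),
    ("Patient A", "Dr. Y"),
    ("Patient B", "Dr. X"),
]

# Index the same edges from both sides so either entity can be looked up.
doctors_of = defaultdict(set)
patients_of = defaultdict(set)
for patient, doctor in visits:
    doctors_of[patient].add(doctor)
    patients_of[doctor].add(patient)

# Patient A is linked to two doctors; Dr. X is linked to two patients.
print(sorted(doctors_of["Patient A"]))
print(sorted(patients_of["Dr. X"]))
```

A strict hierarchy could not store "Patient A" under both doctors without duplicating the record, which is the limitation the network model removes.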
Example 2:
Fig. Network Model Structure
Example 3:
Advantages:
• More flexible than hierarchical
models
• Best for complex relationships
iv. Entity-Relationship (ER) Data
Modeling Technique
• Uses diagrams to represent entities,
attributes, and relationships in a
database.
Example: College Student Database
Disadvantages:
• Does not preserve relationships
between variables.
• Not accurate if data has outliers.
• Does not work for categorical data.
iii. Imputation Using Most Frequent /
Zero / Constant Values
• This method is used for categorical
variables.
• The missing values are replaced
with:
  • Most frequent value (Mode)
  • Zero or constant values
Example: Categorical Data
S.No.  Gender  City
1      M       Hyderabad
2      F       Bangalore
3      NaN     Bangalore
4      F       NaN
5      F       Mumbai
Example:
Using Most Frequent (Mode) → Replace NaN
Gender Mode = "F" (Female)
City Mode = "Bangalore"
S.No.  Gender  City
1      M       Hyderabad
2      F       Bangalore
3      F       Bangalore
4      F       Bangalore
5      F       Mumbai
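The mode replacement shown in the two tables can be reproduced with a short Python sketch. The helper names `impute_mode` and `impute_constant` are hypothetical, and `None` stands in for NaN; the records are the ones from the example table.

```python
from collections import Counter

# Table from the example above; None marks a missing value.
records = [
    {"sno": 1, "gender": "M", "city": "Hyderabad"},
    {"sno": 2, "gender": "F", "city": "Bangalore"},
    {"sno": 3, "gender": None, "city": "Bangalore"},
    {"sno": 4, "gender": "F", "city": None},
    {"sno": 5, "gender": "F", "city": "Mumbai"},
]


def impute_mode(records, column):
    """Fill missing values in `column` with its most frequent observed value."""
    observed = [r[column] for r in records if r[column] is not None]
    mode_value = Counter(observed).most_common(1)[0][0]
    return [{**r, column: mode_value if r[column] is None else r[column]} for r in records]


def impute_constant(records, column, value):
    """Fill missing values in `column` with a fixed constant such as 0 or "Unknown"."""
    return [{**r, column: value if r[column] is None else r[column]} for r in records]


filled = impute_mode(impute_mode(records, "gender"), "city")
for row in filled:
    print(row["sno"], row["gender"], row["city"])
```

If two values are equally frequent, `Counter.most_common` returns the one seen first, so the fill value in that case depends on row order.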
Advantages:
• Works well for categorical data.
Disadvantages:
• Can bias the data toward the most
frequent value.
• Does not preserve relationships
between variables.
iv. K-Nearest Neighbors (KNN)
Imputation
• KNN predicts missing values based
on the closest K neighbors in the
dataset.
How It Works:
• Finds the K closest neighbors
(similar rows).
• Uses the average of neighbors to
fill missing values.
Example:
S.No.  Age  Salary  Experience
1      25   50,000  3
2      28   60,000  5
3      30   NaN     6
4      35   80,000  10
5      40   90,000  NaN
Step 1: Find the 2 nearest neighbors (K = 2) for S.No. 3
(missing salary)
• Closest to S.No. 2 & S.No. 4
• Take their average: (60,000 + 80,000) / 2 =
70,000
• Replace NaN with 70,000
Step 2: Find the 2 nearest neighbors (K = 2) for S.No. 5
(missing experience)
• Closest to S.No. 3 & S.No. 4
• Take their average: (6 + 10) / 2 = 8
• Replace NaN with 8
Example:
S.No.  Age  Salary  Experience
1      25   50,000  3
2      28   60,000  5
3      30   70,000  6
4      35   80,000  10
5      40   90,000  8
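The worked example can be approximated with a small Python sketch. It rests on two assumptions that the slides do not state: distance is measured on Age alone (the only column without missing values), and ties are broken in favor of the later row. On Age alone, S.No. 1 and S.No. 4 are equally far from S.No. 3, so a different tie-breaking rule or distance measure would pick S.No. 1 and give a different salary. The function `knn_impute` and its parameters are hypothetical.

```python
from statistics import mean

# Table from the worked example; None marks a missing value.
rows = [
    {"sno": 1, "age": 25, "salary": 50_000, "experience": 3},
    {"sno": 2, "age": 28, "salary": 60_000, "experience": 5},
    {"sno": 3, "age": 30, "salary": None, "experience": 6},
    {"sno": 4, "age": 35, "salary": 80_000, "experience": 10},
    {"sno": 5, "age": 40, "salary": 90_000, "experience": None},
]


def knn_impute(rows, column, k=2, distance_column="age"):
    """Return a copy of `rows` with missing `column` values set to the mean of the k nearest rows."""
    filled = [dict(r) for r in rows]
    # Only rows that have a value in `column` can act as neighbors.
    donors = [(i, r) for i, r in enumerate(rows) if r[column] is not None]
    for i, row in enumerate(rows):
        if row[column] is not None:
            continue
        # Assumption: distance is the absolute difference in `distance_column`;
        # ties go to the later row (a hypothetical rule chosen to match the example).
        nearest = sorted(
            donors,
            key=lambda d: (abs(d[1][distance_column] - row[distance_column]), -d[0]),
        )[:k]
        filled[i][column] = mean(r[column] for _, r in nearest)
    return filled


result = knn_impute(knn_impute(rows, "salary"), "experience")
for row in result:
    print(row["sno"], row["age"], row["salary"], row["experience"])
```

Library implementations typically compute distance across all available features and often scale them first, so their results on this table may differ from the hand-worked values.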
Advantages:
• More accurate than mean/median
imputation.
• Preserves relationships between variables.
Disadvantages:
• Sensitive to outliers.
• Computationally expensive for large
datasets.
Which Imputation Method Should
You Use?
• For numerical data: Use
Mean/Median if speed is important.
Use KNN if accuracy is needed.
• For small missing data: Ignoring
missing values may be fine.
• For categorical data: Use Mode
(most frequent value).
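For the numerical case, a minimal Python sketch of mean and median imputation is given below. The salary figures are hypothetical, with one deliberately extreme value to show how strongly an outlier can pull the mean; the helper `impute_numeric` is also hypothetical.

```python
from statistics import mean, median


def impute_numeric(values, strategy="mean"):
    """Fill None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


# Hypothetical salaries; the last value is an outlier.
salaries = [50_000, 60_000, None, 80_000, 900_000]

mean_filled = impute_numeric(salaries, "mean")
median_filled = impute_numeric(salaries, "median")
print(mean_filled[2], median_filled[2])
```

Here the mean fill is pulled far above the typical salaries by the single outlier, while the median fill stays close to them, which is one reason to prefer the median when outliers are present.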
Need for Business Modeling
• Business modeling is the process of
representing the structure,
operations, and policies of a
business in a systematic way.
• It provides a clear blueprint for
understanding how a business
creates, delivers, and captures value.
It includes:
i. Business Goals & Objectives – What
the business wants to achieve.
ii. Processes & Workflows – How
different tasks are carried out.
iii. Data & Information Flow – How data
is stored, shared, and utilized.
iv. Stakeholders & Roles – Who is
involved in different processes.
Why is Business Modeling Important?
Business modeling is crucial for organizations
because it:
i. Helps in Decision Making
• Business models provide data-driven insights,
allowing managers to make strategic decisions
about investments, expansion, and resource
allocation.
Example: An e-commerce company uses a
business model to decide whether to expand
into international markets by analyzing customer
demand, costs, and potential revenue.
ii. Improves Operational Efficiency
• By mapping workflows and processes,
businesses can identify inefficiencies,
bottlenecks, and redundant steps.
Example: A manufacturing company may
use business process modeling to find
ways to reduce production costs and
improve delivery times.
iii. Aligns Business Goals with IT Systems
• Business models bridge the gap
between business strategy and
technology implementation.
Example: A bank uses a business model
to determine how a new AI-driven loan
approval system aligns with its goal of
reducing loan processing time.
iv. Enhances Risk Management
• Modeling helps businesses predict
potential risks and develop
contingency plans.
Example: A supply chain model helps a
retail company prepare for delays in
product delivery by identifying
alternative suppliers.
v. Supports Data-Driven
Decision Making
• Modern business modeling often
integrates data analytics and AI to
enhance decision-making.
Example: A retail store analyzes
customer purchasing behavior and
adjusts its business model to introduce a
personalized recommendation system.
vi. Supports Innovation & Growth
• Business modeling helps companies
identify new opportunities, test new
ideas, and adapt to market changes.