Overview of The Subject: Honours - TE - Mathematics For Data Science (Theory: HDSC501)
Shyamala Mathi
https://wikidocs.net/185538
1. What is Exploratory Data Analysis in Data Science?
2. Objective of Exploratory Data Analysis
3. Role of EDA in Data Science
4. Types of Exploratory Data Analysis
5. Steps Involved in Exploratory Data Analysis (EDA)
6. Exploratory Data Analysis Tools
7. Advantages of Using EDA
8. Example of Exploratory Data Analysis
9. Conclusion
• Data analysis involves cleaning, transforming, and analyzing data, and building models to extract specific, relevant insights.
• These insights support important business decisions in real-time situations.
What is Exploratory Data Analysis in Data Science?
• Exploratory Data Analysis (EDA) is a technique for extracting the key features and trends that machine learning and deep learning models in Data Science rely on.
• Data Scientists use EDA widely to analyze and investigate data sets and to discover patterns or characteristics in the data, often in visual form.
Need for EDA
No. of persons (70-79 years) = 87
No. of persons (Hearing loss) = 17
No. of persons (mobility issues) = 46
A run chart is a line graph of data plotted over time. By collecting and charting data over time, you can find trends or patterns in the process.
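As a minimal sketch of the idea, the Python snippet below prepares the data for a run chart. The weekly defect counts and both function names are hypothetical, made up for illustration. Drawing the median as a center line is a common run chart convention and is assumed here; the plotting helper assumes matplotlib is installed.

```python
from statistics import median


def run_chart_data(values):
    """Return the median center line and the side each point falls on."""
    center = median(values)
    sides = [
        "above" if v > center else "below" if v < center else "on"
        for v in values
    ]
    return center, sides


def plot_run_chart(values, center):
    """Draw the run chart (assumes matplotlib is available)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    # Observations in time order, joined by a line.
    ax.plot(range(1, len(values) + 1), values, marker="o")
    # Median center line for spotting shifts and trends.
    ax.axhline(center, linestyle="--")
    ax.set_xlabel("Week")
    ax.set_ylabel("Defects")
    plt.show()


# Hypothetical weekly defect counts, listed in time order.
weekly_defects = [12, 15, 11, 14, 18, 21, 19, 22]
center, sides = run_chart_data(weekly_defects)
print(center, sides)
```

In this made-up series the first four points fall below the median and the last four above it, which is the kind of pattern a run chart is meant to expose. Calling `plot_run_chart(weekly_defects, center)` would draw the chart itself.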
Steps Involved in Exploratory Data Analysis (EDA)
EDA is carried out through the following main steps:
1. Collecting the Data
2. Finding all Variables and Understanding Them
3. Cleaning the Dataset
4. Identifying Correlated Variables
5. Choosing the Right Statistical Methods
6. Visualizing and Analyzing Results
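The step of identifying correlated variables can be sketched with the Pearson correlation coefficient. This is one possible measure, chosen here for illustration; the slides do not name a specific method. The column names and values below are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical dataset: one list of values per variable.
columns = {
    "hours": [1, 2, 3, 4, 5],
    "score": [52, 58, 61, 70, 74],
    "absences": [9, 7, 6, 3, 2],
}

# Correlation for every pair of variables.
correlations = {
    (a, b): pearson_r(columns[a], columns[b])
    for a, b in combinations(columns, 2)
}

for pair, r in correlations.items():
    print(pair, round(r, 3))
```

Values of r close to +1 or -1 indicate a strong linear relationship, while values near 0 indicate a weak one. In this made-up data, hours and score rise together, and absences fall as hours rise.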
One can follow these steps:
1. Look at the structure of the data: number of data points, number of features, feature names, data types, etc.
2. When dealing with multiple data sources, check for consistency across datasets.
3. Identify what the data signifies (the measures) for each data point, and keep this in mind when computing metrics.
4. Calculate key metrics for each variable (summary analysis):
a. Measures of central tendency (Mean, Median, Mode)
b. Measures of dispersion (Range, Quartile Deviation, Mean Deviation, Standard Deviation)
c. Measures of skewness and kurtosis
5. Investigate visuals:
a. Histogram for each variable
b. Scatterplot to correlate variables
6. Calculate metrics and visuals per category for categorical variables (nominal, ordinal).
7. Identify outliers and mark them. Based on context, either discard outliers or analyze them separately.
8. Estimate missing points using data imputation techniques.
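Steps 4, 7, and 8 above can be sketched with Python's standard `statistics` module. The ages below are hypothetical, with `None` marking a missing value. The 1.5 × IQR rule for outliers and median imputation are common choices assumed for illustration, not methods prescribed by the slides.

```python
import statistics as st


def summarize(values):
    """Summary metrics for one numeric variable (no missing values)."""
    n = len(values)
    mean = st.mean(values)
    q1, _, q3 = st.quantiles(values, n=4)  # quartiles
    # Population central moments, used for skewness and kurtosis.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": st.median(values),
        "mode": st.mode(values),
        "range": max(values) - min(values),
        "quartile_deviation": (q3 - q1) / 2,
        "mean_deviation": sum(abs(v - mean) for v in values) / n,
        "std_dev": st.stdev(values),  # sample standard deviation
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


def iqr_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed outlier rule)."""
    q1, _, q3 = st.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def impute_median(values):
    """Replace None with the median of the observed values."""
    fill = st.median([v for v in values if v is not None])
    return [fill if v is None else v for v in values]


# Hypothetical ages; None marks a missing observation.
ages = [23, 25, 25, 27, 29, 31, None, 95]
observed = [a for a in ages if a is not None]

summary = summarize(observed)
outliers = iqr_outliers(observed)  # -> [95]
imputed = impute_median(ages)      # the None becomes 27

print(summary)
print(outliers)
print(imputed)
```

Whether the flagged value 95 should be discarded or analyzed separately depends on context, as step 7 notes. The median is used for imputation here because it is less sensitive to such extreme values than the mean.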
Thank You!
(shyamalae@sies.edu.in)