0% found this document useful (0 votes)

91 views85 pages

Unit I

The document discusses exploratory data analysis including its fundamentals, significance, and steps. It covers differentiating between numerical and categorical data, data transformation techniques, and visualization aids like line charts, bar charts, and histograms.

Uploaded by

palaniappan.cse

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

91 views85 pages

Unit I

Uploaded by

palaniappan.cse

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 85

AD3301

DATA EXPLORATION AND

VISUALIZATION

1
Dr S Palaniappan , Associate Professor , Department of ADS
UNIT I

EXPLORATORY DATA ANALYSIS

2
Dr S Palaniappan , Associate Professor , Department of ADS
Lesson Plan
Highest Cognitive
Planned Hour Description of Portion to be Covered Relevant CO Nos
level**
EDA fundamentals - Understanding data science –
1 CO1 K1
Significance of EDA
1 Making sense of data CO1 K1
Comparing EDA with classical and Bayesian analysis -
1 CO1 K1
Software tools for EDA
1 Visual Aids for EDA CO2 K1
1 Visual Aids for EDA CO2 K1
Data transformation techniques-merging database, reshaping
1 CO2 K1
and pivoting
1 Transformation techniques CO2 K1
1 Grouping Datasets - data aggregation CO2 K1
1 Pivot tables and cross-tabulations CO2 K1
CO 1 Understand the fundamentals of exploratory data analysis.
Suresh Kumar Mukhiya, Usman Ahmed, “Hands-On Exploratory Data Analysis with Python”, Packt
T1
Publishing,2020.

Dr S Palaniappan , Associate Professor , Department of ADS 3

The Fundamentals of
EDA
• Data encompasses a collection of discrete objects, numbers, words, events, facts,
measurements, observations, or even descriptions of things.

• Processing such data elicits useful information and processing such information
generates useful knowledge

• How can we generate meaningful and useful information from such data?

• EDA is a process of examining the available dataset to discover patterns, spot

anomalies, test hypotheses, and check assumptions using statistical measures

Dr S Palaniappan , Associate Professor , Department of ADS 4

Understanding data science
• CRoss-Industry Standard Process for data mining (CRISP -DM)

Dr S Palaniappan , Associate Professor , Department of ADS 5

Stages Of Data Analysis And Data Mining
• Data requirements: What type of data is required for the organization to be
collected, curated, and stored

• Data collection : What are the ways we can collect the data ?Data collected from
several sources should be Stored in a correct form

• Data processing: Preprocessing involves the process of pre-curating the dataset

before actual analysis

• Data cleaning: Incompleteness check, duplicates check, error check, and missing
value check

Dr S Palaniappan , Associate Professor , Department of ADS 6

• EDA : We actually start to understand the message contained in the data

• Modelling and algorithm : The cause and effect – Dependent and Independent
variable

• Data Product : Mode developed during data analysis

• Communication: Disseminating the results to end stakeholders to use the result for
business intelligence

Dr S Palaniappan , Associate Professor , Department of ADS 7

The significance of EDA
• Different fields - accumulate and store data primarily in electronic databases

• EDA allows us to visualize data to understand it as well as to create hypotheses for

further analysis

• Key components - summarizing data, statistical analysis, and visualization of data

• Python provides exploratory analysis, with pandas for summarizing;

• scipy, along with others, for statistical analysis;
• and matplotlib and plotly for visualizations.

Dr S Palaniappan , Associate Professor , Department of ADS 8

Steps in EDA
• Problem definition
• Defining the main objective of the analysis,
• defining the main deliverables,
• outlining the main roles and responsibilities,
• obtaining the current status of the data,
• defining the timetable, and
• performing cost/benefit analysis

• Data preparation
• understand the main characteristics of the data,
• clean the dataset,
• delete non-relevant datasets,
• transform the data, and
• divide the data into required chunks for analysis.
Dr S Palaniappan , Associate Professor , Department of ADS 9
• Data analysis:
• summarizing the data,
• finding the hidden correlation and relationships among the data,
• Developing predictive models,
• evaluating the models, and
• calculating the accuracies.

• Development and representation of the results: Presenting the dataset to the target
audience in the form of
• graphs,
• summary tables,
• Maps and diagrams
Dr S Palaniappan , Associate Professor , Department of ADS 10
Making sense of data
• It is crucial to identify the type of data under analysis.
• Different disciplines store different kinds of data for different purposes.

Dr S Palaniappan , Associate Professor , Department of ADS 11

Two Groups of Data
• Numerical data and Categorical data.

• Numerical data - This data has a sense of measurement involved in it

• This data is often referred to as quantitative data in statistics

• Discrete data
• This is data that is countable and its values can be listed out

• Continuous data
• A variable that can have an infinite number of numerical values within a specific
range is classified as continuous data. Whose value is obtained by measuring.

Dr S Palaniappan , Associate Professor , Department of ADS 12

Guess the type of data

Dr S Palaniappan , Associate Professor , Department of ADS 13

• Categorical data
• This type of data represents the characteristics of an object
• Gender, Marital Status, Type Of Address, Or Categories Of The Movies,
Blood Type

• Qualitative datasets as per statistics

• Dichotomous variable – binary data - categorical variable

• Polytomous variables - more than two possible values.

Dr S Palaniappan , Associate Professor , Department of ADS 14

Measurement scales
• Four different types of measurement scales
• Nominal, Ordinal, Interval, And Ratio.

• Nominal
• These are practiced for labeling variables without any quantitative value. The
scales are generally referred to as labels

• The languages that are spoken in a particular country, Hair color, Gender,
nationalities, Profession

• Frequency, Proportion, Percentage and Vizualize

Dr S Palaniappan , Associate Professor , Department of ADS 15

• Ordinal

• In ordinal scales, the order of the values is a significant factor

Dr S Palaniappan , Associate Professor , Department of ADS 16

• Interval
• In interval scales, both the order and exact differences between the values are
significant
• Interval scales are widely used in statistics, for example, in the measure
of central tendencies—mean, median, mode, and standard deviations.

• Ratio data
• It is just like interval data in that it can be categorized and ranked, and there are
equal intervals between the data points
• Income, height, weight, annual sales, market share
• zero means none and it is not possible to have negative values

Dr S Palaniappan , Associate Professor , Department of ADS 17

Dr S Palaniappan , Associate Professor , Department of ADS 18
Comparing EDA with classical and Bayesian
analysis

Dr S Palaniappan , Associate Professor , Department of ADS 19

Software tools available for EDA
• Python:
• R programming language
• Weka
• KNIME

Dr S Palaniappan , Associate Professor , Department of ADS 20

Visual Aids for EDA
• Line chart
• Bar chart
• Scatter plot
• Area plot and stacked plot
• Pie chart
• Table chart
• Polar chart
• Histogram
• Lollipop chart
• Choosing the best chart
• Other libraries to explore

Dr S Palaniappan , Associate Professor , Department of ADS 21

Line chart
• A line chart is used to illustrate the relationship between two or more continuous
variables.
• Line charts are the simplest form of representing quantitative data between two
variables that are shown with the help of a line that can either be straight or curved

Dr S Palaniappan , Associate Professor , Department of ADS 22

Bar Chart
• Bars can be drawn horizontally or vertically to represent categorical variables.

• A bar chart is a statistical approach to represent given data using vertical and
horizontal rectangular bars.

• It has a uniform width and varying heights.

• The length of each bar is proportional to the value they represent.

• It is basically a graphical representation of data with the help of horizontal or vertical

bars with different heights.

Dr S Palaniappan , Associate Professor , Department of ADS 23

Scatter Plots
• Scatter plots are used to observe relationship between variables and uses dots to
represent the relationship between them

Dr S Palaniappan , Associate Professor , Department of ADS 24

Area plot and stacked plot
• The stacked plot owes its name to the fact that it represents the area under a line plot
and that several such plots can be stacked on top of one another

Dr S Palaniappan , Associate Professor , Department of ADS 25

Pie chart
• A pie chart (or a circle chart) is a circular statistical graphic, which is divided into
slices to illustrate numerical proportion

Dr S Palaniappan , Associate Professor , Department of ADS 26

Table chart
• A table chart combines a bar chart and a table.

Dr S Palaniappan , Associate Professor , Department of ADS 27

Polar chart
• Polar area charts are similar to pie charts, but each segment has the same angle -
the radius of the segment differs depending on the value.

• This type of chart is often useful when we want to show a comparison data similar to
a pie chart, but also show a scale of values for context.

Dr S Palaniappan , Associate Professor , Department of ADS 28

Histogram
• Histograms are a type of bar plot for numeric data that group the data into bins.
• Histogram plots are used to depict the distribution of any continuous variable
• It is used to represent grouped frequency distribution with continuous classes.

Dr S Palaniappan , Associate Professor , Department of ADS 29

• The lollipop chart is a composite chart with bars and circles. It is a variant of the
bar chart with a circle at the end, to highlight the data value.

Dr S Palaniappan , Associate Professor , Department of ADS 30

Data Transformation
• Data wrangling - data cleaning, data remediation, or data munging
• Transform raw data into more readily used formats

• Data deduplication • Data derivation

• Key restructuring • Data aggregation

• Data integration
• Data cleansing
• Data filtering
• Data validation
• Data joining
• Format revisioning

Dr S Palaniappan , Associate Professor , Department of ADS 31

Merging database-style dataframes

• Append / Concat

• Merge or Join

Dr S Palaniappan , Associate Professor , Department of ADS 32

Append
• Pandas dataframe.append() function is used to append rows of other dataframe to
the end of the given dataframe, returning a new dataframe object.
Student ID ScoreSE
1 89
Student ID ScoreSE Student ID ScoreSE 3 39
1 89 17 71 5 50
3 39 19 91 7 97
5 50 9 22
21 56 11 66
7 97 23 32 13 31
9 22 25 52 15 51
11 66 27 73 17 71
13 31 29 92 19 91
15 51 21 56
23 32
25 52
27 73
92 33
Dr S Palaniappan , Associate Professor , Department of ADS
29
Concat
Student ID ScoreSE Student ID ScoreSE
1 89 2 98
3 39 4 93
5 50 6 44
7 97 8 77
9 22 10 69
11 66 12 56
13 31 14 31
15 51 16 53
17 71 18 78
19 91 20 93
21 56 22 56
23 32 24 77
25 52 26 33
27 73 28 56
Dr S Palaniappan , Associate Professor , Department of ADS
29 92 34
• dataframe = pd.concat([dataFrame1, dataFrame2],
ignore_index=True)

• print(dataframe)

• The argument ignore_index creates new index and its absense

keeps the original indices.

Dr S Palaniappan , Associate Professor , Department of ADS 35

• pd.concat([dataFrame1, dataFrame2], axis=1)

• A DataFrame object has two axes: “axis 0” and “axis

1”. “axis 0” represents rows and “axis 1” represents
columns

Dr S Palaniappan , Associate Professor , Department of ADS 36

First Option in merging Multiple dataset
• Concatenating along with an axis

# Option 1
dfSE = pd.concat([df1SE, df2SE],
ignore_index=True)

dfML = pd.concat([df1ML, df2ML],

ignore_index=True)

df = pd.concat([dfML, dfSE], axis=1)

print(df)

Dr S Palaniappan , Associate Professor , Department of ADS 37

Using df.merge with an inner join

• dataframe.merge(right, how, on, left_on, right_on, left_index, right_index,

sort, suffixes, copy, indicator, validate)

Dr S Palaniappan , Associate Professor , Department of ADS 38

# Option 2

dfSE = pd.concat([df1SE, df2SE], ignore_index=True)

dfML = pd.concat([df1ML, df2ML], ignore_index=True)
df = dfSE.merge(dfML, how='inner')
Df

Here, you will perform inner join with each dataframe.

That is to say, if an item exists on the both dataframe, will be included in the new dataframe.

This means, we will get the list of students who are appearing in both the courses. (Student ID
30 will not be there in the list)

Dr S Palaniappan , Associate Professor , Department of ADS 39

StudentID ScoreSE ScoreML
9 22 52
11 66 86
13 31 41
15 51 77
17 71 73
19 91 51
21 56 86
23 32 82
25 52 92
27 73 23
29 92 49
2 98 93
4 93 44
6 44 78
8 77 97
10 69 87
12 56 89
14 31 39
16 53 43
18 78 88
20
Dr S Palaniappan , Associate Professor , Department of ADS 93 78 40
• The inner join takes the intersection from two or more dataframes.
• It corresponds to the INNER JOIN in Structured Query Language (SQL).

• The outer join takes the union from two or more dataframes.
• It corresponds to the FULL OUTER JOIN in SQL.

• The left join uses the keys from the left-hand dataframe only.
• It corresponds to the LEFT OUTER JOIN in SQL.

• The right join uses the keys from the right-hand dataframe only.
• It corresponds to the RIGHT OUTER JOIN in SQL.

Dr S Palaniappan , Associate Professor , Department of ADS 41

• # Option 3
dfSE = pd.concat([df1SE, df2SE], ignore_index=True)
dfML = pd.concat([df1ML, df2ML], ignore_index=True)

df = dfSE.merge(dfML, how='left')
df

Dr S Palaniappan , Associate Professor , Department of ADS 42

StudentID ScoreSE ScoreML
2 98 93.0
4 93 44.0
6 44 78.0
8 77 97.0
10 69 87.0
12 56 89.0
14 31 39.0
16 53 43.0
18 78 88.0
20 93 78.0
22 56 NaN
24 77 NaN
26 33 NaN
28 56 NaN
30 27 NaN
Dr S Palaniappan , Associate Professor , Department of ADS 43
• # Option 4
dfSE = pd.concat([df1SE, df2SE], ignore_index=True)
dfML = pd.concat([df1ML, df2ML], ignore_index=True)

df = dfSE.merge(dfML, how='right')
df

Dr S Palaniappan , Associate Professor , Department of ADS 44

StudentID ScoreSE ScoreML
1 NaN 39
3 NaN 49
5 NaN 55
7 NaN 77
9 22.0 52
11 66.0 86
13 31.0 41
15 51.0 77
17 71.0 73
19 91.0 51
21 56.0 86

Dr S Palaniappan , Associate Professor , Department of ADS 45

• # Option 5
dfSE = pd.concat([df1SE, df2SE], ignore_index=True)
dfML = pd.concat([df1ML, df2ML], ignore_index=True)

df = dfSE.merge(dfML, how='outer')
df.tail(10)

Dr S Palaniappan , Associate Professor , Department of ADS 46

StudentID ScoreSE ScoreML
20 93 78
22 56 NaN
24 77 NaN
26 33 NaN
28 56 NaN
30 27 NaN
1 NaN 39
3 NaN 49
5 NaN 55
7 NaN 77

Dr S Palaniappan , Associate Professor , Department of ADS 47

Reshaping and pivoting
• Stacking: Stack rotates from any particular column in the data to the rows.
• Unstacking: Unstack rotates from the rows into the column.

Dr S Palaniappan , Associate Professor , Department of ADS 48

data = np.arange(15).reshape((3,5))
indexers = ['Rainfall', 'Humidity', 'Wind']
dframe1 = pd.DataFrame(data, index=indexers, columns=['Bergen', 'Oslo',
'Trondheim', 'Stavanger', 'Kristiansand'])

print(dframe1)

Index Bergen Oslo Trondheim Stavanger Kristiansand

Rainfall 0 1 2 3 4
Humidity 5 6 7 8 9
Wind 10 11 12 13 14

Dr S Palaniappan , Associate Professor , Department of ADS 49

stacked = dframe1.stack()
Stacked

stacked.unstack()

Dr S Palaniappan , Associate Professor , Department of ADS 50

Transformation techniques
• Data deduplication
• Replacing values
• Handling missing data
• NaN values in pandas objects
• Dropping missing values
• Mathematical operations with NaN
• Filling missing values
• Backward and forward filling
• Interpolating missing values
• Renaming axis indexes
• Discretization and binning

Dr S Palaniappan , Associate Professor , Department of ADS 51

Data deduplication
• Remove the duplicate rows from the DataFrame.

frame3 = pd.DataFrame({'column 1': ['Looping'] * 3 + ['Functions'] * 4, 'column 2': [1

0, 10, 22, 23, 23, 24, 24]})

frame3.duplicated()

Dr S Palaniappan , Associate Professor , Department of ADS 52

Remove the duplicates
frame4 = frame3.drop_duplicates()
frame4

frame3['column 3'] = range(7)

frame5 = frame3.drop_duplicates(['column 2'])
frame5

Dr S Palaniappan , Associate Professor , Department of ADS 53

Replacing values
• Find and Replace some values inside a dataframe

import numpy as np
replaceFrame = pd.DataFrame({'column 1': [200, 3000, -786,3000, 234, 444, -786,
332, 3332 ], 'column 2': range(9)})

DataFrame.replace(to_replace=None, value=None, inplace=False, limit=None,

regex=False, method=’pad’, axis=None)

Dr S Palaniappan , Associate Professor , Department of ADS 54

replaceFrame.replace(to_replace =-786, value= np.nan)

to_replace =[-786, 200], value= [np.nan, 2]

to_replace : [str, list, dict, Series, numeric, or None]

Dr S Palaniappan , Associate Professor , Department of ADS 55

Handling missing data
• dataframe.isnull()

replaceFrame.isnull()
replaceFrame.notnull()

Dr S Palaniappan , Associate Professor , Department of ADS 56

Counting the null
• replaceFrame.isnull().sum()

• replaceFrame.isnull().sum().sum()
•5

• replaceFrame.count() --→ Count the number of reported values .

• What is alternate way to count the reported values ???

Dr S Palaniappan , Associate Professor , Department of ADS 57

Dropping missing values
• If we want to display column_1 without null values

replaceFrame.column_1[replaceFrame.column_1.notnull()]
replaceFrame.column_1.dropna()
If we want to drop all null values in all the columns
replaceFrame.dropna()

replaceFrame .dropna()

Dr S Palaniappan , Associate Professor , Department of ADS 58

Dropping by rows and column
• replaceFrame.loc[replaceFrame.column_1 < 0, 'column_1'] = np.nan
• replaceFrame.loc[replaceFrame.column_2 >= 0 , 'column_2'] = np.nan

• Suppose we want drop those rows which are NaN

• replaceFrame.dropna(how = 'all‘,axis = 0)
• # row 2 and 6 dropped

Dr S Palaniappan , Associate Professor , Department of ADS 59

• Suppose we want drop those columns which are NaN

• replaceFrame.dropna(how = 'all',axis = 1)
• # column_2 dropped

Dr S Palaniappan , Associate Professor , Department of ADS 60

Mathematical operations with NaN
• replaceFrame.mean()

• replaceFrame.sum()

Dr S Palaniappan , Associate Professor , Department of ADS 61

Filling missing values
fillnaDF = replaceFrame.fillna(0)
fillnaDF

fillnaDF.mean()

Dr S Palaniappan , Associate Professor , Department of ADS 62

x = np.random.randint(100, size=(7))
ind = ['apple', 'banana', 'kiwi', 'grapes', 'mango','pineapple','gauva']
dfx = pd.DataFrame(x,index = ind ,columns=['store1'])

dfx.loc[dfx.store1 < 50, 'store1'] = np.nan

print(dfx)

Dr S Palaniappan , Associate Professor , Department of ADS 63

Backward and forward filling
• dfx["store1"].fillna(method='ffill',inplace = True)
• print(dfx)

• fillna(method=‘bfill)

Dr S Palaniappan , Associate Professor , Department of ADS 64

Linear Interpolation
• ser3 = pd.Series([100, np.nan, np.nan, np.nan, 292])
• ser3.interpolate()

Dr S Palaniappan , Associate Professor , Department of ADS 65

Renaming axis indexes

dframe1.index = ['Rain','Moisture','Breeze']

Dr S Palaniappan , Associate Professor , Department of ADS 66

Outlier detection
• Outliers are data points that are far from other data points

Dr S Palaniappan , Associate Professor , Department of ADS 67

Grouping Datasets
• It is often essential to cluster or group data together based on certain criteria
• e-commerce store might want to group all the sales at various stores

Understanding groupby()
Groupby mechanics
Data aggregation
Pivot tables and cross-tabulations

Dr S Palaniappan , Associate Professor , Department of ADS 68

Groupby()
• We can group the city dwellers into different gender groups and calculate their mean
weight

Dr S Palaniappan , Associate Professor , Department of ADS 69

groupby()
• Categorizing a dataset into multiple categories or groups is often essential.

• Pandas groupby is used for grouping the data according to the categories and
apply a function to the categories

• Any groupby operation involves one of the following operations on the original
object. They are −
• Splitting the Object

• Applying a function

• Combining the results

Dr S Palaniappan , Associate Professor , Department of ADS 70
Let’s Create a Data set Index Gender Height
0 M 160
mylist = ['M','F'] 1 F 181
myheight = random.sample(range(150, 185),20) 2 M 151
3 F 156
mygender = random.choices(mylist, k = 20)
4 M 157
5 M 155
data = {'Gender':mygender,'Height':myheight} 6 F 163
df = pd.DataFrame(data) 7 F 172
df 8 M 173
9 F 179
10 F 162
Dr S Palaniappan , Associate Professor , Department of ADS 71
• How many Unique value does the Gender column has?

print(df.groupby('Gender').groups.keys())

• To fine the count of each category

print(df.groupby('Gender').count())

Dr S Palaniappan , Associate Professor , Department of ADS 72

Splitting the data
style = df.groupby('Gender')

male = style.get_group("M")
print(male)

female = style.get_group(“F")
print(female)

Dr S Palaniappan , Associate Professor , Department of ADS 73

Applying the operation
• Average height in each category

f_avg = male['Height'].mean()

m_avg = female['Height'].mean()

print(f_avg,m_avg)

Dr S Palaniappan , Associate Professor , Department of ADS 74

Combine the results of Average
df_output = pd.DataFrame({'Gender':['F','M'],'Height':[f_avg,m_avg]})
df_output

Dr S Palaniappan , Associate Professor , Department of ADS 75

Selecting a subset of columns
Let’s Create a Data set

mylist = ['M','F']
sub = ['EDA',"ML",'DBMS']
myscore = random.sample(range(0, 100),20)
mygender = random.choices(mylist, k = 20)
mysub = random.choices(sub, k = 20)
data = {'Gender':mygender,'Score':myscore,'Subject':mysub}
df = pd.DataFrame(data)
df

Dr S Palaniappan , Associate Professor , Department of ADS 76

double_grouping = df.groupby(["Gender","Subject"])
double_grouping.mean()

double_grouping.max()

double_grouping.min()

Dr S Palaniappan , Associate Professor , Department of ADS 77

• How many people attended the courses in gender wise

double_grouping['Subject'].count()

Dr S Palaniappan , Associate Professor , Department of ADS 78

Data aggregation
• Aggregation is the process of implementing any mathematical operation on a dataset
or a subset of it.

• The Dataframe.aggregate() function is used to apply aggregation across one or more

columns.

• Some of the most frequently used aggregations are as follows:

• sum: Returns the sum of the values for the requested axis
• min: Returns the minimum of the values for the requested axis
• max: Returns the maximum of the values for the requested axis

Dr S Palaniappan , Associate Professor , Department of ADS 79

Let’s Create a Data set

myscore1 = random.sample(range(0, 100),20)

myscore2 = random.sample(range(0, 100),20)
myscore3 = random.sample(range(0, 100),20)

data = {'DBMS':myscore1,'AI':myscore2,'EDA':myscore3}
df = pd.DataFrame(data)
df

Dr S Palaniappan , Associate Professor , Department of ADS 80

• We can apply aggregation in a DataFrame, df, as df.aggregate() or df.agg()

df.agg('count')

df.agg(‘mean')

df.agg(["min",'max','mean'])

Dr S Palaniappan , Associate Professor , Department of ADS 81

Pivot tables and cross-tabulations
• Pandas offers several options for grouping and summarizing data

• pivot_table and crosstab also provides a method of groupby’s

• The pivot table takes simple column-wise data as input, and groups the entries into a
two-dimensional table that provides a multidimensional summarization of the data.

Dr S Palaniappan , Associate Professor , Department of ADS 82

Selecting a subset of columns
Let’s Create a Data set

Dr S Palaniappan , Associate Professor , Department of ADS 83

table = pd.pivot_table(data=df,index=['Subject'])
table

table = pd.pivot_table(data=df,index=['Gender','Subject'])
table

By default it displays the average

Dr S Palaniappan , Associate Professor , Department of ADS 84

Cross-tabulations
• Also known as contingency tables or cross tabs, cross tabulation groups variables to
understand the correlation between different variables.

• pd.crosstab(df["Gender"], df["Subject"])

• pd.crosstab(df["Gender"], df["Subject"],values=df["Score"],aggfunc='mean')

Dr S Palaniappan , Associate Professor , Department of ADS 85

STA112_Lecture_1_Content_Probability 1
No ratings yet
STA112_Lecture_1_Content_Probability 1
42 pages
CCS369 - TSS-Unit 2
No ratings yet
CCS369 - TSS-Unit 2
56 pages
Nptel Swayam DWDM Slides
No ratings yet
Nptel Swayam DWDM Slides
406 pages
UNIT-4
No ratings yet
UNIT-4
29 pages
Unit 3
No ratings yet
Unit 3
222 pages
Chapter 1 Introduction To Visualization
No ratings yet
Chapter 1 Introduction To Visualization
53 pages
Feature Engg Pre Processing Python
No ratings yet
Feature Engg Pre Processing Python
68 pages
Unit Iii
No ratings yet
Unit Iii
108 pages
Fundamentals of Data Science and Analytics On Descriptive Analysis
No ratings yet
Fundamentals of Data Science and Analytics On Descriptive Analysis
53 pages
Basic Statistical Descriptions of Data: Dr. Amiya Ranjan Panda
No ratings yet
Basic Statistical Descriptions of Data: Dr. Amiya Ranjan Panda
35 pages
Unit 3 Ids Notes
No ratings yet
Unit 3 Ids Notes
31 pages
Data Science
No ratings yet
Data Science
74 pages
probability and stat unit 1
No ratings yet
probability and stat unit 1
12 pages
Unit II Notes
No ratings yet
Unit II Notes
36 pages
Unit 4 Part A
No ratings yet
Unit 4 Part A
51 pages
Data Science Regular Handout
No ratings yet
Data Science Regular Handout
25 pages
Dev Answer Key
100% (1)
Dev Answer Key
17 pages
Unit-1
No ratings yet
Unit-1
52 pages
Programming in C - CS3251 - HandWritten Notes - Un_250316_200237
No ratings yet
Programming in C - CS3251 - HandWritten Notes - Un_250316_200237
38 pages
FDSA UNIT 5
No ratings yet
FDSA UNIT 5
48 pages
Data Preprocessing
No ratings yet
Data Preprocessing
77 pages
DBMS Unit 2
No ratings yet
DBMS Unit 2
48 pages
IV_AI-DS_AD3491_FDSA_Unit4
No ratings yet
IV_AI-DS_AD3491_FDSA_Unit4
30 pages
FDSA UNIT 3
No ratings yet
FDSA UNIT 3
42 pages
Unit Iv
No ratings yet
Unit Iv
8 pages
FDS Unit 1
No ratings yet
FDS Unit 1
21 pages
IV_AI-DS_AD3491_FDSA_Unit3
No ratings yet
IV_AI-DS_AD3491_FDSA_Unit3
35 pages
Sat - 13.Pdf - Child Mortality Prediction Using Machine Learning
No ratings yet
Sat - 13.Pdf - Child Mortality Prediction Using Machine Learning
11 pages
Data Science PPT PD41
100% (1)
Data Science PPT PD41
8 pages
Unit V Big Data Analytics
No ratings yet
Unit V Big Data Analytics
47 pages
Notes - EDA-Unit1 (2)
No ratings yet
Notes - EDA-Unit1 (2)
34 pages
Unit V Data Visualization
No ratings yet
Unit V Data Visualization
49 pages
UNIT2
No ratings yet
UNIT2
25 pages
6 1 Mining Complex Data
No ratings yet
6 1 Mining Complex Data
69 pages
II Cse Cs3352 Fds QB Unit2
No ratings yet
II Cse Cs3352 Fds QB Unit2
5 pages
Exploratory Data Analysis
No ratings yet
Exploratory Data Analysis
9 pages
Unit 2 - Knowledge Delivery
No ratings yet
Unit 2 - Knowledge Delivery
31 pages
Modelling in R
No ratings yet
Modelling in R
47 pages
Data Wrangling (Data Preprocessing) : Practical Assessment 1
No ratings yet
Data Wrangling (Data Preprocessing) : Practical Assessment 1
5 pages
Introduction To Tree Methods
No ratings yet
Introduction To Tree Methods
15 pages
Proposal
No ratings yet
Proposal
5 pages
Data Mining-Outlier Analysis
No ratings yet
Data Mining-Outlier Analysis
6 pages
Tableau Lab Manual
No ratings yet
Tableau Lab Manual
6 pages
Unit-2 Solution
No ratings yet
Unit-2 Solution
22 pages
Data Analytics Unit III
No ratings yet
Data Analytics Unit III
15 pages
Data Science New
No ratings yet
Data Science New
9 pages
2.1 Exploratory Data Analysis Using Python
No ratings yet
2.1 Exploratory Data Analysis Using Python
12 pages
AD3461 ML lab manual
No ratings yet
AD3461 ML lab manual
32 pages
DM DT Solved Example 02 - Unlocked
No ratings yet
DM DT Solved Example 02 - Unlocked
3 pages
Data Preprocessing: L1+ Freq
No ratings yet
Data Preprocessing: L1+ Freq
13 pages
12-Exploratory Data Analysis, Anomaly Detection-28!03!2023
No ratings yet
12-Exploratory Data Analysis, Anomaly Detection-28!03!2023
79 pages
Chi Merge
No ratings yet
Chi Merge
5 pages
Busa2001 2023 Sem2 Newcastle
No ratings yet
Busa2001 2023 Sem2 Newcastle
6 pages
Data Mining Techniques and Applications
No ratings yet
Data Mining Techniques and Applications
16 pages
Revised CS8383 (Eee) Oop Lab Man
No ratings yet
Revised CS8383 (Eee) Oop Lab Man
85 pages
Genetic Algorithm
No ratings yet
Genetic Algorithm
14 pages
Semi-Automated Exploratory Data Analysis (EDA) in Python - by Destin Gong - Mar, 2021 - Towards Data
No ratings yet
Semi-Automated Exploratory Data Analysis (EDA) in Python - by Destin Gong - Mar, 2021 - Towards Data
3 pages
FDS Lab Manual
No ratings yet
FDS Lab Manual
48 pages
Ad3411 - Student
No ratings yet
Ad3411 - Student
27 pages
Pattern Recognition
No ratings yet
Pattern Recognition
3 pages

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.