Data Science Fundamentals


DATA SCIENCE FUNDAMENTALS

LAB EXERCISE
PROGRAMS AND OUTPUTS
EX:1 PACKAGES FOR DATA SCIENCE IN PYTHON

AIM : TO DOWNLOAD, INSTALL AND EXPLORE THE FEATURES OF PYTHON FOR DATA ANALYTICS.
ALGORITHM :
STEP 1 :
STEP 2 :
STEP 3 :
STEP 4 :
STEP 5 :
PROGRAM :
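One way to explore an installation is to check which data science packages are present and at what version. The following is a minimal sketch, not the original program for this exercise: the helper name package_report and the list of packages (taken from the imports used later in this manual) are assumptions made for illustration.

```python
import importlib.util
from importlib import metadata

# Packages imported elsewhere in this manual; the list itself is an assumption.
PACKAGES = ["numpy", "pandas", "matplotlib", "seaborn"]


def package_report(names):
    """Map each package name to its installed version, or None if it is missing."""
    report = {}
    for name in names:
        if importlib.util.find_spec(name) is None:
            # The package cannot be imported in this environment.
            report[name] = None
            continue
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Importable, but no distribution metadata was found.
            report[name] = "unknown"
    return report


report = package_report(PACKAGES)
for name, version in report.items():
    status = version if version is not None else "not installed"
    print(f"{name:<12} {status}")
```

A package reported as "not installed" can usually be added with pip install followed by the package name.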

OUTPUT :
AIM : WORKING WITH NUMPY ARRAYS

ALGORITHM :
STEP 1 :
STEP 2 :
STEP 3 :
STEP 4 :
STEP 5 :

PROGRAM :
import numpy as np

list_1 = [1, 2, 3, 4]
list_2 = [5, 6, 7, 8]
list_3 = [9, 10, 11, 12]
# Stack the three lists as rows of a 3 x 4 two-dimensional array
sample_array = np.array([list_1,
                         list_2,
                         list_3])
print("Numpy multi dimensional array in python\n",
      sample_array)
OUTPUT :
Numpy multi dimensional array in python
 [[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
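Beyond printing the array, its attributes and indexing can be explored directly. The following is a minimal sketch that rebuilds the same 3 x 4 array; the variable names other than sample_array are chosen here for illustration.

```python
import numpy as np

sample_array = np.array([[1, 2, 3, 4],
                         [5, 6, 7, 8],
                         [9, 10, 11, 12]])

# Basic attributes of the array
shape = sample_array.shape        # (rows, columns)
ndim = sample_array.ndim          # number of dimensions

# Indexing and slicing
element = sample_array[1, 2]      # row 1, column 2 (zero-based)
first_column = sample_array[:, 0] # every row, first column

# Reductions along an axis
column_sums = sample_array.sum(axis=0)  # one sum per column
row_sums = sample_array.sum(axis=1)     # one sum per row

print(shape, ndim, element)
print(first_column, column_sums, row_sums)
```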
AIM : WORKING WITH PANDAS DATA FRAMES
ALGORITHM :
STEP 1 :
STEP 2 :
STEP 3 :
STEP 4 :
STEP 5 :
PROGRAM :
import pandas as pd

# Each dictionary key becomes a column of the data frame
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Age': [27, 24, 22, 32],
        'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
        'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}
df = pd.DataFrame(data)
# Select two columns by passing a list of column names
print(df[['Name', 'Qualification']])
OUTPUT :
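The column selection in the program is one of several common ways to pull data out of a data frame. The following is a minimal sketch on the same data showing a few others; the variable names subset, ages, over_25 and address are chosen here for illustration.

```python
import pandas as pd

data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Age': [27, 24, 22, 32],
        'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
        'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}
df = pd.DataFrame(data)

# A list of column names returns a DataFrame
subset = df[['Name', 'Qualification']]

# A single column name returns a Series
ages = df['Age']

# A boolean condition keeps only the matching rows
over_25 = df[df['Age'] > 25]

# .loc selects by row label and column name
address = df.loc[1, 'Address']

print(subset.shape)
print(over_25)
print(address)
```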
EX.NO 5. USE THE DIABETES DATA SET FROM UCI AND PIMA INDIANS DIABETES DATA SET FOR PERFORMING THE FOLLOWING:
DATE:

A) UNIVARIATE ANALYSIS: FREQUENCY, MEAN, MEDIAN, MODE, VARIANCE, STANDARD DEVIATION, SKEWNESS AND KURTOSIS.

AIM:
To explore various commands for performing univariate analysis on the UCI and Pima Indians Diabetes data set.
ALGORITHM:
STEP 1: Start the program.
STEP 2: Download the UCI and Pima Indians Diabetes data set from Kaggle.
STEP 3: Read the data from the UCI and Pima Indians Diabetes data set.
STEP 4: Find the frequency, mean, median, mode, variance, standard deviation, skewness and kurtosis of the given data set.
STEP 5: Display the output.
STEP 6: Stop the program.
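To make the measures in STEP 4 concrete, the following is a minimal sketch that computes them with the Python standard library on a small hypothetical sample (the numbers are not taken from the diabetes data set). The helper names skewness and excess_kurtosis are assumptions for illustration and use the plain moment-based definitions. The pandas methods skew() and kurtosis() apply bias corrections, so their results differ slightly on small samples.

```python
import statistics


def central_moment(values, order):
    """Average of (x - mean) ** order over all values."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** order for x in values) / len(values)


def skewness(values):
    """Moment-based skewness: m3 / m2 ** 1.5 (no bias correction)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2 ** 2 - 3 (normal distribution gives 0)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical sample, for illustration only
sample = [1, 2, 2, 3, 4, 7, 9]

print("mean     :", statistics.mean(sample))
print("median   :", statistics.median(sample))
print("mode     :", statistics.mode(sample))
print("variance :", statistics.variance(sample))  # sample variance (n - 1)
print("std dev  :", statistics.stdev(sample))     # sample standard deviation
print("skewness :", skewness(sample))
print("kurtosis :", excess_kurtosis(sample))
```

A positive skewness, as in this sample, indicates a longer tail on the right; a symmetric sample such as [1, 2, 3, 4, 5] gives zero.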
PROGRAM:
# Import the libraries used for analysis and plotting
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('darkgrid')
# Jupyter notebook magic: draw plots inside the notebook
%matplotlib inline
from matplotlib.ticker import FormatStrFormatter
import warnings
warnings.filterwarnings('ignore')

# Read the data set and inspect its structure
df = pd.read_csv('C:/Users/kirub/Documents/Learning/Untitled Folder/diabetes.csv')
df.head()
df.shape
df.dtypes
# Outcome is a yes/no flag, so store it as a boolean
df['Outcome'] = df['Outcome'].astype('bool')
df.dtypes['Outcome']
df.info()
df.describe().T

# Frequency: count of each unique value in Outcome
df1 = df['Outcome'].value_counts()
# displaying df1
print(df1)

# mean
df.mean()
# median
df.median()
# mode
df.mode()
# variance
df.var()
# standard deviation
df.std()

# kurtosis
df.kurtosis(axis=0, skipna=True)
df['Outcome'].kurtosis(axis=0, skipna=True)

# skewness along the index axis (one value per column), skipping the NA values
df.skew(axis=0, skipna=True)
# skewness in each row, skipping the NA values
df.skew(axis=1, skipna=True)

# Pregnancy variable: frequency table with counts and percentages
preg_proportion = np.array(df['Pregnancies'].value_counts())
preg_month = np.array(df['Pregnancies'].value_counts().index)
preg_proportion_perc = np.array(np.round(preg_proportion / sum(preg_proportion), 3) * 100, dtype=int)

preg = pd.DataFrame({'month': preg_month,
                     'count_of_preg_prop': preg_proportion,
                     'percentage_proportion': preg_proportion_perc})
preg.set_index(['month'], inplace=True)
preg.head(10)

# Count of each Outcome class
sns.countplot(x=df['Outcome'])

# Distribution of Pregnancies (distplot is deprecated in recent seaborn;
# histplot with kde=True gives a comparable plot)
sns.histplot(df['Pregnancies'], kde=True)

# Box plot of Pregnancies
sns.boxplot(y=df['Pregnancies'])
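The frequency table for the Pregnancies variable can also be built more compactly with value_counts(normalize=True). The following is a minimal sketch of that alternative, not the program's own method; the Series values are hypothetical stand-ins for df['Pregnancies'], and the names freq_table, counts and percent are chosen here for illustration.

```python
import pandas as pd

# Hypothetical values standing in for df['Pregnancies'] (not from the data set)
pregnancies = pd.Series([0, 1, 1, 2, 1, 0, 3, 1], name="Pregnancies")

# Absolute frequency of each distinct value
counts = pregnancies.value_counts()

# Relative frequency, expressed as a percentage rounded to one decimal place
percent = (pregnancies.value_counts(normalize=True) * 100).round(1)

# Combine both into one table indexed by the distinct values
freq_table = pd.DataFrame({"count": counts, "percentage": percent})
freq_table.index.name = "Pregnancies"
print(freq_table)
```

Rounding the percentage, rather than casting it to an integer, avoids losing a unit when a floating-point value falls just below a whole number.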
OUTPUT:
