Expt - No.2. RUSHYA


4/30/24, 11:40 AM Expt.No.2. Plotting of probability distribution using different dataset. .ipynb - Colab

Department of Electronics and Telecommunication


Name of Subject = Data Analytic Lab (22ESET4050L)
Academic Year : 2023-24

Name = Om Indrasing Chavan

Class: S.Y. B.Tech

Div: A

Roll No = 59

Name of Experiment: Plotting of probability distribution using different dataset.

Perform Date = 13/02/2024

Checking Date = 20/02/2024

from google.colab import drive


drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).

import pandas as pd
import numpy as np
import seaborn as sns
from scipy import stats
import matplotlib.pyplot as plt
#from empiricaldist import Pmf , Cdf
from matplotlib.ticker import PercentFormatter

df_titanic = pd.read_csv('/content/drive/MyDrive/train.csv')
df_house = pd.read_csv('/content/drive/MyDrive/train (3)advanced.csv')
df_police = pd.read_csv('/content/drive/MyDrive/police_project.csv.zip')
df_olympic = pd.read_csv('/content/drive/MyDrive/athlete_events.csv.zip')

def label_graph(ticksfont , x_label , y_label , title_label , fontsize):
    """Set tick, axis-label, and title font sizes on the current Matplotlib axes."""
    plt.xticks(fontsize = ticksfont)
    plt.yticks(fontsize = ticksfont)

    plt.xlabel(x_label, fontsize = fontsize)
    plt.ylabel(y_label , fontsize = fontsize)
    plt.title(title_label, fontsize = fontsize)

#fig, ax = plt.subplots(figsize=(12,8))
#sns.set_style("whitegrid")

#cdf = Cdf.from_seq(df_house['SalePrice'])
#cdf.plot()

#ax.annotate("25% of houses <= 129900$ ", xy=(140000, 0.24), xytext=(150000, 0.06) , fontsize = 18 ,
#arrowprops={'arrowstyle': '-|>', 'lw': 2 , 'color' : 'b'})

#plt.plot(129900 , 0.25 , marker = 'o' , color = 'r' , markersize = 15)

#label_graph(18 ,'Sale Price' , 'CDF' , " " , 20 )

#print('The probability of 100000$ is : ' + str(cdf(100000)))


#print("The value of probability 25% is : " + str(cdf.inverse(0.25)))

def cdf(data):
    """Compute CDF for a one-dimensional array of measurements."""
    # Number of data points: n
    n = len(data)

    # x-data for the ECDF: x
    x = np.sort(data)

    # y-data for the ECDF: y
    y = np.arange(1, n+1) / n

    return x, y
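The commented-out cell above reads a probability off the CDF (`cdf(100000)`) and inverts it (`cdf.inverse(0.25)`) using `empiricaldist`. A minimal NumPy-only sketch of the same two lookups might look like the following. The helper names `ecdf_at` and `ecdf_inverse` and the sample values are hypothetical and chosen for illustration; they are not part of the original notebook, and the rounding convention used for the inverse is only one of several reasonable choices.

```python
import numpy as np

def ecdf_at(data, value):
    """Fraction of observations less than or equal to `value`."""
    x = np.sort(np.asarray(data))
    # side="right" returns how many sorted elements are <= value
    return np.searchsorted(x, value, side="right") / len(x)

def ecdf_inverse(data, p):
    """Smallest observed value whose ECDF is at least `p` (assumes 0 < p <= 1)."""
    x = np.sort(np.asarray(data))
    # The k-th sorted point (1-based) has ECDF value k/n, so take k = ceil(p * n)
    k = int(np.ceil(p * len(x))) - 1
    return x[max(k, 0)]

# Made-up values standing in for a column such as df_house['SalePrice']
prices = [100, 150, 150, 200, 250, 300, 400, 500]

print(ecdf_at(prices, 150))        # share of values <= 150
print(ecdf_inverse(prices, 0.25))  # value at the 25% level
```

With the real data, the same helpers could be called on `df_house['SalePrice']` to annotate the plot, for example marking the 25% point.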

https://colab.research.google.com/drive/1UEuYHKnd6InNNgYRTYinubPGgEcKMQQx#printMode=true 1/3
fig, ax = plt.subplots(figsize=(10,6))
x_price , y_price = cdf(df_house['SalePrice'])
plt.plot(x_price , y_price)
label_graph(10 ,'Sale Price' , 'CDF' , " " , 10 )

#fig, ax = plt.subplots(figsize=(12,8))
#pmf = Pmf.from_seq(df_house['BedroomAbvGr'])
#pmf.bar()

#label_graph(18 ,'Number Of Bedrooms' , 'PMF' , 'Probability of each room' , 20 )
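The PMF cell above is commented out because it depends on `empiricaldist`. As a rough stand-in, the probability of each distinct value can be computed with NumPy alone by dividing the count of each value by the total. The function name `pmf` and the sample data below are assumptions made for this sketch, not the notebook's own code.

```python
import numpy as np

def pmf(data):
    """Return the distinct values and the relative frequency of each."""
    values, counts = np.unique(np.asarray(data), return_counts=True)
    # Dividing by the total count makes the probabilities sum to 1
    return values, counts / counts.sum()

# Made-up values standing in for a column such as df_house['BedroomAbvGr']
bedrooms = [2, 3, 3, 3, 4, 4, 1, 3]

values, probs = pmf(bedrooms)
for v, p in zip(values, probs):
    print(v, p)
```

The two arrays could then be passed to `plt.bar(values, probs)` and labelled with `label_graph` to draw a bar chart similar to the one the commented-out cell intended.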

fig, ax = plt.subplots(figsize=(10,6))

# Cdf using seaborn :


sns.ecdfplot(data=df_house, x="SalePrice")

label_graph(10 ,'Sale Price' , 'CDF' , " " , 10 )
