b21 DSBDA Assignment No 3

The notebook reads a CSV file of mall customer data and computes summary statistics on it: mean, median, mode, standard deviation, minimum and maximum. It then groups the data by gender and displays the first row of each group. The same process is repeated for the Iris flower data: the rows are grouped by species, and summary statistics are displayed for each species by filtering the data.

Uploaded by

Prachi

import pandas as pd

import numpy as np

df=pd.read_csv("C:\\Users\\bhend\\OneDrive\\Desktop\\dataset\\Mall_Customers.csv")
df

     CustomerID   Genre  Age  Annual Income (k$)  Spending Score (1-100)
0             1    Male   19                  15                      39
1             2    Male   21                  15                      81
2             3  Female   20                  16                       6
3             4  Female   23                  16                      77
4             5  Female   31                  17                      40
...         ...     ...  ...                 ...                     ...
195         196  Female   35                 120                      79
196         197  Female   45                 126                      28
197         198    Male   32                 126                      74
198         199    Male   32                 137                      18
199         200    Male   30                 137                      83

200 rows × 5 columns

df.mean()

C:\Users\bhend\AppData\Local\Temp\ipykernel_8504\3759992827.py:1: FutureWarning: Dropping of nuisance columns in DataFrame reductions (with 'numeric_only=None') is deprecated; in a future version this will raise TypeError. Select only valid columns before calling the reduction.
  df.mean()

CustomerID                100.50
Age                        38.85
Annual Income (k$)         60.56
Spending Score (1-100)     50.20
dtype: float64

df.median()

C:\Users\bhend\AppData\Local\Temp\ipykernel_8504\4151078170.py:1: FutureWarning: Dropping of nuisance columns in DataFrame reductions (with 'numeric_only=None') is deprecated; in a future version this will raise TypeError. Select only valid columns before calling the reduction.
  df.median()

CustomerID                100.5
Age                        36.0
Annual Income (k$)         61.5
Spending Score (1-100)     50.0
dtype: float64

df.std()

C:\Users\bhend\AppData\Local\Temp\ipykernel_8504\1561050584.py:1: FutureWarning: Dropping of nuisance columns in DataFrame reductions (with 'numeric_only=None') is deprecated; in a future version this will raise TypeError. Select only valid columns before calling the reduction.
  df.std()

CustomerID                57.879185
Age                       13.969007
Annual Income (k$)        26.264721
Spending Score (1-100)    25.823522
dtype: float64
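
The FutureWarning on these three calls comes from the string column `Genre`, which cannot be averaged; recent pandas versions raise a TypeError here instead of dropping the column. A minimal sketch of the explicit form follows, assuming a five-row stand-in frame named `sample` (a hypothetical name, built from the first five rows shown above rather than the full CSV):

```python
import pandas as pd

# Hypothetical stand-in: the first five rows of the table shown above.
sample = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4, 5],
    "Genre": ["Male", "Male", "Female", "Female", "Female"],
    "Age": [19, 21, 20, 23, 31],
    "Annual Income (k$)": [15, 15, 16, 16, 17],
    "Spending Score (1-100)": [39, 81, 6, 77, 40],
})

# numeric_only=True restricts each reduction to the numeric columns,
# so the string column Genre is skipped explicitly instead of implicitly.
means = sample.mean(numeric_only=True)
medians = sample.median(numeric_only=True)
stds = sample.std(numeric_only=True)

print(means)
print(medians)
print(stds)
```

On the full dataset the same keyword would apply unchanged, for example `df.mean(numeric_only=True)`.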

df.min()

CustomerID                     1
Genre                     Female
Age                           18
Annual Income (k$)            15
Spending Score (1-100)         1
dtype: object

df.max()

CustomerID                 200
Genre                     Male
Age                         70
Annual Income (k$)         137
Spending Score (1-100)      99
dtype: object
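
Unlike the mean, `min()` and `max()` keep `Genre`, because strings can be ordered alphabetically; that is why the result has dtype `object`. If only the numeric extremes are wanted, one option is to select the numeric columns first. A small sketch, again assuming a hypothetical stand-in frame `sample` built from the first five rows of the customer table:

```python
import pandas as pd

# Hypothetical stand-in: the first five rows of the customer table.
sample = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4, 5],
    "Genre": ["Male", "Male", "Female", "Female", "Female"],
    "Age": [19, 21, 20, 23, 31],
    "Annual Income (k$)": [15, 15, 16, 16, 17],
    "Spending Score (1-100)": [39, 81, 6, 77, 40],
})

# Keep only the numeric columns, then take column-wise extremes.
numeric = sample.select_dtypes(include="number")
col_min = numeric.min()
col_max = numeric.max()

print(col_min)
print(col_max)
```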

df["Age"].mean()

38.85

df["Age"].mode()

0 32
Name: Age, dtype: int64
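
`mode()` returns a Series rather than a single number because several values can tie for the highest count. The sketch below illustrates this with a made-up list of ages (hypothetical values, not taken from the dataset):

```python
import pandas as pd

# Hypothetical ages in which 19 and 32 each appear twice.
ages = pd.Series([32, 32, 19, 19, 40], name="Age")

# mode() returns every value that ties for the highest count, in sorted order.
modes = ages.mode()
print(modes)

# Take one value explicitly when a single number is needed.
first_mode = modes.iloc[0]
```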

df["Age"].median()

36.0

df["Age"].std()

13.969007331558883

gk=df.groupby(["Genre"])

gk.first()

        CustomerID  Age  Annual Income (k$)  Spending Score (1-100)
Genre
Female           3   20                  16                       6
Male             1   19                  15                      39
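
Note that `first()` returns the first non-null value of each column within a group, which matches the first row only when that row has no missing values. To get statistics per gender rather than a single row, the grouped object can be aggregated. A minimal sketch, assuming a hypothetical stand-in frame `sample` built from the first five rows of the customer table:

```python
import pandas as pd

# Hypothetical stand-in: the first five rows of the customer table.
sample = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4, 5],
    "Genre": ["Male", "Male", "Female", "Female", "Female"],
    "Age": [19, 21, 20, 23, 31],
    "Annual Income (k$)": [15, 15, 16, 16, 17],
    "Spending Score (1-100)": [39, 81, 6, 77, 40],
})

# One row per gender, one column per statistic of Age.
by_genre = sample.groupby("Genre")["Age"].agg(["mean", "median", "std"])
print(by_genre)
```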

df_iris=pd.read_csv("C:\\Users\\bhend\\OneDrive\\Desktop\\dataset\\Iris.csv")
df_iris

      Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm         Species
0      1            5.1           3.5            1.4           0.2     Iris-setosa
1      2            4.9           3.0            1.4           0.2     Iris-setosa
2      3            4.7           3.2            1.3           0.2     Iris-setosa
3      4            4.6           3.1            1.5           0.2     Iris-setosa
4      5            5.0           3.6            1.4           0.2     Iris-setosa
...  ...            ...           ...            ...           ...             ...
145  146            6.7           3.0            5.2           2.3  Iris-virginica
146  147            6.3           2.5            5.0           1.9  Iris-virginica
147  148            6.5           3.0            5.2           2.0  Iris-virginica
148  149            6.2           3.4            5.4           2.3  Iris-virginica
149  150            5.9           3.0            5.1           1.8  Iris-virginica

150 rows × 6 columns

gk=df_iris.groupby('Species')

gk.first()

                  Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm
Species
Iris-setosa        1            5.1           3.5            1.4           0.2
Iris-versicolor   51            7.0           3.2            4.7           1.4
Iris-virginica   101            6.3           3.3            6.0           2.5

gk.describe()

                    Id                                                       SepalLengthCm ...PetalLengthCm
                 count   mean       std    min     25%    50%     75%    max  count   mean ...     75%  max  count   mean       std  min  25%  50%
Species
Iris-setosa       50.0   25.5  14.57738    1.0   13.25   25.5   37.75   50.0   50.0  5.006 ...   1.575  1.9   50.0  0.244  0.107210  0.1  0.2
Iris-versicolor   50.0   75.5  14.57738   51.0   63.25   75.5   87.75  100.0   50.0  5.936 ...   4.600  5.1   50.0  1.326  0.197753  1.0  1.2
Iris-virginica    50.0  125.5  14.57738  101.0  113.25  125.5  137.75  150.0   50.0  6.588 ...   5.875  6.9   50.0  2.026  0.274650  1.4  1.8

3 rows × 40 columns
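
The 40-column result is too wide to display in full, so part of it is elided. One way to read it is to describe a single measurement at a time. The sketch below assumes a hypothetical stand-in frame `iris` containing only the ten rows visible in the table above (five setosa, five virginica), not the full 150-row file:

```python
import pandas as pd

# Hypothetical stand-in: the ten rows visible in the truncated display.
iris = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5, 146, 147, 148, 149, 150],
    "SepalLengthCm": [5.1, 4.9, 4.7, 4.6, 5.0, 6.7, 6.3, 6.5, 6.2, 5.9],
    "SepalWidthCm": [3.5, 3.0, 3.2, 3.1, 3.6, 3.0, 2.5, 3.0, 3.4, 3.0],
    "PetalLengthCm": [1.4, 1.4, 1.3, 1.5, 1.4, 5.2, 5.0, 5.2, 5.4, 5.1],
    "PetalWidthCm": [0.2, 0.2, 0.2, 0.2, 0.2, 2.3, 1.9, 2.0, 2.3, 1.8],
    "Species": ["Iris-setosa"] * 5 + ["Iris-virginica"] * 5,
})

# Selecting one column before describe() gives a compact table:
# one row per species, eight statistic columns.
petal_length = iris.groupby("Species")["PetalLengthCm"].describe()
print(petal_length)
```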

iris_Set=(df_iris['Species'] == "Iris-setosa")

print("Iris-setosa")

Iris-setosa

print(df_iris[iris_Set].describe())

             Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm
count  50.00000       50.00000     50.000000      50.000000      50.00000
mean   25.50000        5.00600      3.418000       1.464000       0.24400
std    14.57738        0.35249      0.381024       0.173511       0.10721
min     1.00000        4.30000      2.300000       1.000000       0.10000
25%    13.25000        4.80000      3.125000       1.400000       0.20000
50%    25.50000        5.00000      3.400000       1.500000       0.20000
75%    37.75000        5.20000      3.675000       1.575000       0.30000
max    50.00000        5.80000      4.400000       1.900000       0.60000

iris_Vir=(df_iris['Species'] == "Iris-virginica")
print(df_iris[iris_Vir].describe())

              Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm
count   50.00000       50.00000     50.000000      50.000000      50.00000
mean   125.50000        6.58800      2.974000       5.552000       2.02600
std     14.57738        0.63588      0.322497       0.551895       0.27465
min    101.00000        4.90000      2.200000       4.500000       1.40000
25%    113.25000        6.22500      2.800000       5.100000       1.80000
50%    125.50000        6.50000      3.000000       5.550000       2.00000
75%    137.75000        6.90000      3.175000       5.875000       2.30000
max    150.00000        7.90000      3.800000       6.900000       2.50000

iris_Ver=(df_iris['Species'] == "Iris-versicolor")
print(df_iris[iris_Ver].describe())

              Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm
count   50.00000      50.000000     50.000000      50.000000     50.000000
mean    75.50000       5.936000      2.770000       4.260000      1.326000
std     14.57738       0.516171      0.313798       0.469911      0.197753
min     51.00000       4.900000      2.000000       3.000000      1.000000
25%     63.25000       5.600000      2.525000       4.000000      1.200000
50%     75.50000       5.900000      2.800000       4.350000      1.300000
75%     87.75000       6.300000      3.000000       4.600000      1.500000
max    100.00000       7.000000      3.400000       5.100000      1.800000
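
The three filter-and-describe cells repeat the same steps for each species. They could be written as a single loop over the groups, as in this sketch. It assumes a hypothetical stand-in frame `iris` holding only the ten rows visible in the earlier table display, and it drops the `Id` column on the assumption that statistics of a row identifier are not of interest:

```python
import pandas as pd

# Hypothetical stand-in: the ten rows visible in the truncated display.
iris = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5, 146, 147, 148, 149, 150],
    "SepalLengthCm": [5.1, 4.9, 4.7, 4.6, 5.0, 6.7, 6.3, 6.5, 6.2, 5.9],
    "SepalWidthCm": [3.5, 3.0, 3.2, 3.1, 3.6, 3.0, 2.5, 3.0, 3.4, 3.0],
    "PetalLengthCm": [1.4, 1.4, 1.3, 1.5, 1.4, 5.2, 5.0, 5.2, 5.4, 5.1],
    "PetalWidthCm": [0.2, 0.2, 0.2, 0.2, 0.2, 2.3, 1.9, 2.0, 2.3, 1.8],
    "Species": ["Iris-setosa"] * 5 + ["Iris-virginica"] * 5,
})

# Iterating over a groupby yields (group key, sub-frame) pairs,
# so each species is summarised without a hand-written boolean mask.
summaries = {}
for species, group in iris.drop(columns="Id").groupby("Species"):
    summaries[species] = group.describe()
    print(species)
    print(summaries[species])
```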
