Pandas 1
Pandas 1
Pandas is a Python library used for working with data sets. It has functions for analyzing, cleaning, exploring,
and manipulating data. The name "Pandas" has a reference to both "Panel Data", and "Python Data Analysis"
and was created by Wes McKinney in 2008.
In [1]:
import pandas as pd
import numpy as np
In [2]:
Out[2]:
'2.2.2'
In [3]:
dt = {"Name":['Ritu','Taniya','abhishek','tanmay','mayank'],
"Branch":['it','cse','it','ece','cse'],
"Class":[2,3,4,1,2],
"City":['jaipur','agra','vridavan','mathura','gokul']}
dt
Out[3]:
DATAFRAME-------- A dataframe is a data structure constructed with rows and columns, similar to a database
or Excel spreadsheet. It consists of a dictionary of lists in which the list each have their own identifiers or keys,
such as “last name” or “food group.”
In [4]:
#np.array()
df = pd.DataFrame(dt)
df
Out[4]:
0 Ritu it 2 jaipur
2 abhishek it 4 vridavan
Out[5]:
pandas.core.frame.DataFrame
In [6]:
df['Name']
Out[6]:
0 Ritu
1 Taniya
2 abhishek
3 tanmay
4 mayank
Name: Name, dtype: object
In [7]:
In [8]:
# String,categorical data values represent as Object
df.dtypes
Out[8]:
Name object
Branch object
Class int64
City object
dtype: object
In [9]:
Out[9]:
(5, 4)
In [10]:
# df['Name']
# OR
df.Name
Out[10]:
0 Ritu
1 Taniya
2 abhishek
3 tanmay
4 mayank
4 mayank
Name: Name, dtype: object
In [11]:
Branch Name
0 it Ritu
1 cse Taniya
2 it abhishek
3 ece tanmay
4 cse mayank
In [12]:
# It is used to show the DataFrame/Table
df
Out[12]:
0 Ritu it 2 jaipur
2 abhishek it 4 vridavan
In [13]:
Out[13]:
Branch Class
0 it 2
1 cse 3
2 it 4
3 ece 1
4 cse 2
To fetch a row ---> Use these methods .loc[]: Used for label-based indexing. It accepts a single label or a list of
labels. .iloc[]: Used for positional indexing. It accepts integers or boolean arrays.
In [14]:
# Select rows from index 1 to 2 (3 is exclusive)
df[1:3]
Out[14]:
2 abhishek it 4 vridavan
Name Branch Class City
In [15]:
# It showed only 1st row
df[1:2]
Out[15]:
In [16]:
df[1:3][['Name','Branch']]
Out[16]:
Name Branch
1 Taniya cse
2 abhishek it
loc $ iloc
In [17]:
#loc-> stopping pnt is inclusive
df.loc[1:2,['Name','Branch']]
Out[17]:
Name Branch
1 Taniya cse
2 abhishek it
In [18]:
#iloc-> stoping pnt is not inclusive
df.iloc[1:3,0:2]
Out[18]:
Name Branch
1 Taniya cse
2 abhishek it
In [19]:
#Selected Subset (Rows 3 onwards, Columns 1 onwards):
df.iloc[3:,1:]
Out[19]:
3 ece 1 mathura
4 cse 2 gokul
In [20]:
# Select rows with labels 3 to 4 and columns 'Branch', 'Class', 'City'
df.loc[3:4,['Branch','Class','City']]
df.iloc[3:,1:]
Out[20]:
3 ece 1 mathura
4 cse 2 gokul
Use read_csv(): The pd.read_csv() function is used to read the CSV file into a pandas DataFrame. You need to
provide the path to your CSV file within the function call. If your CSV file is in a different directory, specify the
full path or relative path accordingly.
In [22]:
df = pd.read_csv('Used_Bikes.csv')
df
Out[22]:
0 TVS Star City Plus Dual Tone 110cc 35000.0 Ahmedabad 17654.0 First Owner 3.0 110.0 TVS
1 Royal Enfield Classic 350cc 119900.0 Delhi 11000.0 First Owner 4.0 350.0 Royal Enfield
2 Triumph Daytona 675R 600000.0 Delhi 110.0 First Owner 8.0 675.0 Triumph
3 TVS Apache RTR 180cc 65000.0 Bangalore 16329.0 First Owner 4.0 180.0 TVS
4 Yamaha FZ S V 2.0 150cc-Ltd. Edition 80000.0 Bangalore 10000.0 First Owner 3.0 150.0 Yamaha
32643 Hero Passion Pro 100cc 39000.0 Delhi 22000.0 First Owner 4.0 100.0 Hero
32644 TVS Apache RTR 180cc 30000.0 Karnal 6639.0 First Owner 9.0 180.0 TVS
32645 Bajaj Avenger Street 220 60000.0 Delhi 20373.0 First Owner 6.0 220.0 Bajaj
32646 Hero Super Splendor 125cc 15600.0 Jaipur 84186.0 First Owner 16.0 125.0 Hero
32647 Bajaj Pulsar 150cc 22000.0 Pune 60857.0 First Owner 13.0 150.0 Bajaj
Displaying the DataFrame: Optionally, you can print the first few rows of the DataFrame using .head() to verify
that the CSV file was read correctly.
In [23]:
0 TVS Star City Plus Dual Tone 110cc 35000.0 Ahmedabad 17654.0 First Owner 3.0 110.0 TVS
1 Royal Enfield Classic 350cc 119900.0 Delhi 11000.0 First Owner 4.0 350.0 Royal Enfield
2 Triumph Daytona 675R 600000.0 Delhi 110.0 First Owner 8.0 675.0 Triumph
3 TVS Apache RTR 180cc 65000.0 Bangalore 16329.0 First Owner 4.0 180.0 TVS
4 Yamaha FZ S V 2.0 150cc-Ltd. Edition 80000.0 Bangalore 10000.0 First Owner 3.0 150.0 Yamaha
In [24]:
0 TVS Star City Plus Dual Tone 110cc 35000.0 Ahmedabad 17654.0 First Owner 3.0 110.0 TVS
1 Royal Enfield Classic 350cc 119900.0 Delhi 11000.0 First Owner 4.0 350.0 Royal Enfield
2 Triumph Daytona 675R 600000.0 Delhi 110.0 First Owner 8.0 675.0 Triumph
3 TVS Apache RTR 180cc 65000.0 Bangalore 16329.0 First Owner 4.0 180.0 TVS
4 Yamaha FZ S V 2.0 150cc-Ltd. Edition 80000.0 Bangalore 10000.0 First Owner 3.0 150.0 Yamaha
5 Yamaha FZs 150cc 53499.0 Delhi 25000.0 First Owner 6.0 150.0 Yamaha
6 Honda CB Hornet 160R ABS DLX 85000.0 Delhi 8200.0 First Owner 3.0 160.0 Honda
7 Hero Splendor Plus Self Alloy 100cc 45000.0 Delhi 12645.0 First Owner 3.0 100.0 Hero
8 Royal Enfield Thunderbird X 350cc 145000.0 Bangalore 9190.0 First Owner 3.0 350.0 Royal Enfield
9 Royal Enfield Classic Desert Storm 500cc 88000.0 Delhi 19000.0 Second Owner 7.0 500.0 Royal Enfield
Displaying the DataFrame: Optionally, you can print the last few rows of the DataFrame using .tail() to verify that
the CSV file was read correctly.
In [25]:
Out[25]:
32638 Yamaha Fazer 25 250cc 123000.0 Kadapa 14500.0 First Owner 4.0 250.0 Yamaha
32639 Royal Enfield Classic 350cc 95500.0 Delhi 18000.0 First Owner 8.0 350.0 Royal Enfield
32640 Hero Passion Pro 100cc 32000.0 Delhi 12000.0 First Owner 6.0 100.0 Hero
32641 Bajaj Avenger 220cc 41000.0 Delhi 20245.0 Second Owner 11.0 220.0 Bajaj
32642 Hero Passion 100cc 15000.0 Perumbavoor 35000.0 Second Owner 19.0 100.0 Hero
32643 Hero Passion Pro 100cc 39000.0 Delhi 22000.0 First Owner 4.0 100.0 Hero
32644 TVS Apache RTR 180cc 30000.0 Karnal 6639.0 First Owner 9.0 180.0 TVS
32645 Bajaj Avenger Street 220 60000.0 Delhi 20373.0 First Owner 6.0 220.0 Bajaj
32646 Hero Super Splendor 125cc 15600.0 Jaipur 84186.0 First Owner 16.0 125.0 Hero
32647 Bajaj Pulsar 150cc 22000.0 Pune 60857.0 First Owner 13.0 150.0 Bajaj
In [26]:
0 TVS
1 Royal Enfield
2 Triumph
3 TVS
4 Yamaha
...
32643 Hero
32644 TVS
32645 Bajaj
32646 Hero
32647 Bajaj
Name: brand, Length: 32648, dtype: object
In [27]:
In [27]:
#The .nunique() method in pandas is used to count the number of unique elements in a Ser
ies or DataFrame column.
df['brand'].nunique()
Out[27]:
23
In [28]:
#the unique() function is used to extract the unique elements from a Series. It returns
a NumPy array containing only the unique values present in the Series.
df['brand'].unique()
Out[28]:
array(['TVS', 'Royal Enfield', 'Triumph', 'Yamaha', 'Honda', 'Hero',
'Bajaj', 'Suzuki', 'Benelli', 'KTM', 'Mahindra', 'Kawasaki',
'Ducati', 'Hyosung', 'Harley-Davidson', 'Jawa', 'BMW', 'Indian',
'Rajdoot', 'LML', 'Yezdi', 'MV', 'Ideal'], dtype=object)
In [29]:
Out[29]:
brand
Bajaj 11213
Hero 6368
Royal Enfield 4178
Yamaha 3916
Honda 2108
Suzuki 1464
TVS 1247
KTM 1077
Harley-Davidson 737
Kawasaki 79
Hyosung 64
Benelli 56
Mahindra 55
Triumph 26
Ducati 22
BMW 16
Jawa 10
MV 4
Indian 3
Ideal 2
Rajdoot 1
Yezdi 1
LML 1
Name: count, dtype: int64
In [30]:
# Example: Filter rows where 'brand' column equals 'Royal Enfield'
bullete = df[df['brand'] == "Royal Enfield"] # single cndn
In [31]:
#bullete.head() will output the first 5 rows (by default) of the DataFrame bullete, whic
h contains only the rows where 'brand' is 'Royal Enfield'.
bullete.head()
Out[31]:
1 Royal Enfield Classic 350cc 119900.0 Delhi 11000.0 First Owner 4.0 350.0 Royal Enfield
8 Royal Enfield Thunderbird X 350cc 145000.0 Bangalore 9190.0 First Owner 3.0 350.0 Royal Enfield
9 Royal Enfield Classic Desert Storm 500cc 88000.0 Delhi 19000.0 Second Owner 7.0 500.0 Royal Enfield
23 Royal Enfield Classic Chrome 500cc 121700.0 Kalyan 24520.0 First Owner 5.0 500.0 Royal Enfield
36 Royal Enfield Classic 350cc 98800.0 Kochi 39000.0 First Owner 5.0 350.0 Royal Enfield
In [32]:
(4178, 8)
In [33]:
38 Royal Enfield Thunderbird X 500cc 190500.0 Samastipur 4550.0 First Owner 2.0 500.0 Royal Enfield
81 Royal Enfield Interceptor 650cc 260000.0 Navi Mumbai 3800.0 First Owner 2.0 650.0 Royal Enfield
139 Royal Enfield Himalayan 410cc Fi ABS 173300.0 Vadodara 14000.0 First Owner 2.0 410.0 Royal Enfield
157 Royal Enfield Himalayan 410cc Fi ABS 173300.0 Vadodara 14000.0 First Owner 2.0 410.0 Royal Enfield
194 Royal Enfield Electra 350cc 145000.0 Bangalore 4000.0 First Owner 2.0 350.0 Royal Enfield
7694 Royal Enfield Classic Chrome 500cc ABS 215000.0 Delhi 417.0 First Owner 2.0 500.0 Royal Enfield
7706 Royal Enfield Classic Chrome 500cc ABS 215000.0 Delhi 417.0 First Owner 2.0 500.0 Royal Enfield
8139 Royal Enfield Thunderbird X 350cc ABS 169000.0 Bangalore 4411.0 First Owner 2.0 350.0 Royal Enfield
8192 Royal Enfield Thunderbird 350cc ABS 145000.0 Ghaziabad 12400.0 First Owner 2.0 350.0 Royal Enfield
8836 Royal Enfield Thunderbird X 350cc ABS 170200.0 Mumbai 1000.0 First Owner 2.0 350.0 Royal Enfield
84 rows × 8 columns
In [34]:
# Apply the filter to get specific rows
bullete = df[(df['brand']=="Royal Enfield") & (df['age']<=2) & (df['owner']=="First Own
er")]
In [35]:
bullete.head()
Out[35]:
38 Royal Enfield Thunderbird X 500cc 190500.0 Samastipur 4550.0 First Owner 2.0 500.0 Royal Enfield
81 Royal Enfield Interceptor 650cc 260000.0 Navi Mumbai 3800.0 First Owner 2.0 650.0 Royal Enfield
139 Royal Enfield Himalayan 410cc Fi ABS 173300.0
bike_name price Vadodara 14000.0 First Owner
city kms_driven 2.0 power
owner age 410.0 Royal Enfield
brand
157 Royal Enfield Himalayan 410cc Fi ABS 173300.0 Vadodara 14000.0 First Owner 2.0 410.0 Royal Enfield
194 Royal Enfield Electra 350cc 145000.0 Bangalore 4000.0 First Owner 2.0 350.0 Royal Enfield
In [36]:
(84, 8)
In [37]:
Out[37]:
28
In [38]:
Out[38]:
array(['Samastipur', 'Navi Mumbai', 'Vadodara', 'Bangalore',
'Hamirpur(hp)', 'Mumbai', 'Delhi', 'Guwahati', 'Haldwani',
'Ahmedabad', 'Bardhaman', 'Silchar', 'Sibsagar', 'Kharar',
'Baripara', 'Sonipat', 'Pune', 'Farukhabad', 'Sultanpur',
'Hyderabad', 'Gurgaon', 'Faridabad', 'Kota', 'Thane', 'Nellore',
'Alipore', 'Ghaziabad', 'Noida'], dtype=object)
In [39]:
Out[39]:
city
Delhi 18
Mumbai 10
Bangalore 6
Ahmedabad 6
Vadodara 4
Guwahati 4
Faridabad 4
Hyderabad 3
Ghaziabad 3
Sibsagar 2
Bardhaman 2
Navi Mumbai 2
Gurgaon 2
Alipore 2
Farukhabad 2
Kharar 2
Silchar 1
Haldwani 1
Hamirpur(hp) 1
Samastipur 1
Sultanpur 1
Pune 1
Baripara 1
Sonipat 1
Thane 1
Kota 1
Nellore 1
Noida 1
Noida 1
Name: count, dtype: int64
In [40]:
bullete = df[(df['city']=="Jaipur") & (df['age']<=2) & (df['owner']=="First Owner")]
In [41]:
bullete.shape
Out[41]:
(13, 8)
In [42]:
df
Out[42]:
0 TVS Star City Plus Dual Tone 110cc 35000.0 Ahmedabad 17654.0 First Owner 3.0 110.0 TVS
1 Royal Enfield Classic 350cc 119900.0 Delhi 11000.0 First Owner 4.0 350.0 Royal Enfield
2 Triumph Daytona 675R 600000.0 Delhi 110.0 First Owner 8.0 675.0 Triumph
3 TVS Apache RTR 180cc 65000.0 Bangalore 16329.0 First Owner 4.0 180.0 TVS
4 Yamaha FZ S V 2.0 150cc-Ltd. Edition 80000.0 Bangalore 10000.0 First Owner 3.0 150.0 Yamaha
32643 Hero Passion Pro 100cc 39000.0 Delhi 22000.0 First Owner 4.0 100.0 Hero
32644 TVS Apache RTR 180cc 30000.0 Karnal 6639.0 First Owner 9.0 180.0 TVS
32645 Bajaj Avenger Street 220 60000.0 Delhi 20373.0 First Owner 6.0 220.0 Bajaj
32646 Hero Super Splendor 125cc 15600.0 Jaipur 84186.0 First Owner 16.0 125.0 Hero
32647 Bajaj Pulsar 150cc 22000.0 Pune 60857.0 First Owner 13.0 150.0 Bajaj
In [43]:
df['brand'].unique()
Out[43]:
array(['TVS', 'Royal Enfield', 'Triumph', 'Yamaha', 'Honda', 'Hero',
'Bajaj', 'Suzuki', 'Benelli', 'KTM', 'Mahindra', 'Kawasaki',
'Ducati', 'Hyosung', 'Harley-Davidson', 'Jawa', 'BMW', 'Indian',
'Rajdoot', 'LML', 'Yezdi', 'MV', 'Ideal'], dtype=object)
In [44]:
#The "|" operator performs element-wise OR operation between the two boolean Series. Thi
s results in a combined boolean Series where each element is True if the corresponding r
ow satisfies either condition
bike = df[(df['brand'] == "KTM") | (df['brand'] == "Jawa")]
bike
Out[44]:
33 KTM RC 390cc 180000.0 Pune 17700.0 First Owner 4.0 390.0 KTM
35 KTM Duke 200cc 70000.0 Nashik 100000.0 Second Owner 8.0 200.0 KTM
39 KTM RC 200cc ABS 179000.0 Bangalore 3400.0 First Owner 2.0 200.0 KTM
65 KTM Duke 200cc 94700.0 Baripara 32700.0 First Owner 4.0 200.0 KTM
83 KTM Duke 250cc 130000.0 Gandhidham 17500.0 Second Owner 4.0 250.0 KTM
83 KTM Duke 250cc 130000.0 Gandhidham 17500.0 Second Owner 4.0 250.0 KTM
bike_name price city kms_driven owner age power brand
... ... ... ... ... ... ... ... ...
32541 KTM RC 390cc 196700.0 Mumbai 13216.0 First Owner 4.0 390.0 KTM
32560 KTM RC 390cc 196700.0 Mumbai 13216.0 First Owner 4.0 390.0 KTM
32579 KTM RC 390cc 196700.0 Mumbai 13216.0 First Owner 4.0 390.0 KTM
32598 KTM RC 390cc 196700.0 Mumbai 13216.0 First Owner 4.0 390.0 KTM
32636 KTM RC 390cc 196700.0 Mumbai 13216.0 First Owner 4.0 390.0 KTM
In [45]:
Out[45]:
brand
KTM 1077
Jawa 10
Name: count, dtype: int64
Conditional Check:
In [46]:
if "Jaipur" in bike['city'].unique():
print("present")
else:
print("not present")
present
In [47]:
bike['city'].value_counts()
Out[47]:
city
Mumbai 703
Delhi 65
Bangalore 55
Pune 36
Hyderabad 27
...
Challakere 1
Hooghly 1
Bhubaneswar 1
Raigarh(mh) 1
Berhampore 1
Name: count, Length: 91, dtype: int64
city== jaipur
In [48]:
#bike[bike['city'] == "Jaipur" ]
In [49]:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 32648 entries, 0 to 32647
Data columns (total 8 columns):
# Column Non-Null Count Dtype
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 bike_name 32648 non-null object
1 price 32648 non-null float64
2 city 32648 non-null object
3 kms_driven 32648 non-null float64
4 owner 32648 non-null object
5 age 32648 non-null float64
6 power 32648 non-null float64
7 brand 32648 non-null object
dtypes: float64(4), object(4)
memory usage: 2.0+ MB
In [50]:
df.shape #actual
Out[50]:
(32648, 8)
In [51]:
df.duplicated().sum() # duplicates
Out[51]:
np.int64(25324)
df.drop_duplicates() method in pandas is used to remove duplicate rows from a DataFrame df.
In [52]:
df.drop_duplicates()
Out[52]:
0 TVS Star City Plus Dual Tone 110cc 35000.0 Ahmedabad 17654.0 First Owner 3.0 110.0 TVS
1 Royal Enfield Classic 350cc 119900.0 Delhi 11000.0 First Owner 4.0 350.0 Royal Enfield
2 Triumph Daytona 675R 600000.0 Delhi 110.0 First Owner 8.0 675.0 Triumph
3 TVS Apache RTR 180cc 65000.0 Bangalore 16329.0 First Owner 4.0 180.0 TVS
4 Yamaha FZ S V 2.0 150cc-Ltd. Edition 80000.0 Bangalore 10000.0 First Owner 3.0 150.0 Yamaha
9362 Hero Hunk Rear Disc 150cc 25000.0 Delhi 48587.0 First Owner 8.0 150.0 Hero
9369 Bajaj Avenger 220cc 35000.0 Bangalore 60000.0 First Owner 9.0 220.0 Bajaj
9370 Harley-Davidson Street 750 ABS 450000.0 Jodhpur 3430.0 First Owner 4.0 750.0 Harley-Davidson
9371 Bajaj Dominar 400 ABS 139000.0 Hyderabad 21300.0 First Owner 4.0 400.0 Bajaj
9372 Bajaj Avenger Street 220 80000.0 Hyderabad 7127.0 First Owner 5.0 220.0 Bajaj
In [53]:
In [54]:
df.shape
df.shape
Out[54]:
(7324, 8)
In [55]:
32648 - 25324
Out[55]:
7324
In [56]:
bullet = df[df['brand'] == "Royal Enfield"]
In [57]:
bullet.shape
Out[57]:
(1346, 8)
In [58]:
Out[58]:
Royal
5811 Royal Enfield Thunderbird 350cc 33500.0 Delhi 49463.0 First Owner 16.0 350.0
Enfield
Royal
9183 Royal Enfield Thunderbird 350cc 35800.0 Bangalore 90408.0 Third Owner 18.0 350.0
Enfield
Royal
4664 Royal Enfield Bullet Electra 350cc 41000.0 Noida 120000.0 First Owner 17.0 350.0
Enfield
Royal
2371 Royal Enfield Bullet 350 cc 45000.0 Gurgaon 40000.0 Second Owner 20.0 350.0
Enfield
Royal
5912 Royal Enfield Interceptor 650cc 265500.0 Nellore 12000.0 First Owner 2.0 650.0
Enfield
Royal
1931 Royal Enfield Interceptor 650cc 270000.0 Ahmedabad 6500.0 First Owner 2.0 650.0
Enfield
Royal
277 Royal Enfield Interceptor 650cc 280000.0 Bangalore 1500.0 First Owner 2.0 650.0
Enfield
Royal
2228 Royal Enfield Interceptor 650cc 280000.0 Mumbai 5000.0 First Owner 2.0 650.0
Enfield
In [59]:
bullet.sort_values(by="price").head(10)
Out[59]:
7038 Royal Enfield Bullet Electra 350cc 35000.0 Delhi 60000.0 Fourth Owner Or More 18.0 350.0 Royal Enfield
9183 Royal Enfield Thunderbird 350cc 35800.0 Bangalore 90408.0 Third Owner 18.0 350.0 Royal Enfield
4664 Royal Enfield Bullet Electra 350cc 41000.0 Noida 120000.0 First Owner 17.0 350.0 Royal Enfield
2371 Royal Enfield Bullet 350 cc 45000.0 Gurgaon 40000.0 Second Owner 20.0 350.0 Royal Enfield
5918 Royal Enfield Thunderbird 350cc 45000.0 Bangalore 93108.0 Third Owner 18.0 350.0 Royal Enfield
8475 Royal Enfield Thunderbird 350cc 45000.0 Delhi 45710.0 First Owner 16.0 350.0 Royal Enfield
6489 Royal Enfield Thunderbird 350cc 45918.0 Bangalore 51396.0 Second Owner 12.0 350.0 Royal Enfield
385 Royal Enfield Thunderbird 350cc 46000.0 Chennai 35000.0 First Owner 16.0 350.0 Royal Enfield
1227 Royal Enfield Thunderbird 350cc 47000.0 Faridabad 21000.0 First Owner 9.0 350.0 Royal Enfield
In [60]:
# Setting ascending=False means you want to sort the DataFrame in descending order of 'p
rice'
bullet.sort_values(by="price",ascending=False)
Out[60]:
Royal
277 Royal Enfield Interceptor 650cc 280000.0 Bangalore 1500.0 First Owner 2.0 650.0
Enfield
Royal
2228 Royal Enfield Interceptor 650cc 280000.0 Mumbai 5000.0 First Owner 2.0 650.0
Enfield
Royal
1931 Royal Enfield Interceptor 650cc 270000.0 Ahmedabad 6500.0 First Owner 2.0 650.0
Enfield
Royal
5912 Royal Enfield Interceptor 650cc 265500.0 Nellore 12000.0 First Owner 2.0 650.0
Enfield
Royal
2371 Royal Enfield Bullet 350 cc 45000.0 Gurgaon 40000.0 Second Owner 20.0 350.0
Enfield
Royal
4664 Royal Enfield Bullet Electra 350cc 41000.0 Noida 120000.0 First Owner 17.0 350.0
Enfield
Royal
9183 Royal Enfield Thunderbird 350cc 35800.0 Bangalore 90408.0 Third Owner 18.0 350.0
Enfield
Royal
5811 Royal Enfield Thunderbird 350cc 33500.0 Delhi 49463.0 First Owner 16.0 350.0
Enfield
In [61]:
# Sort by 'price' in descending order and get the top 10 rows
bullet.sort_values(by="price",ascending=False).head(10)
Out[61]:
4912 Royal Enfield Continental GT 650cc 285000.0 Hyderabad 4500.0 First Owner 2.0 650.0 Royal Enfield
277 Royal Enfield Interceptor 650cc 280000.0 Bangalore 1500.0 First Owner 2.0 650.0 Royal Enfield
2228 Royal Enfield Interceptor 650cc 280000.0 Mumbai 5000.0 First Owner 2.0 650.0 Royal Enfield
1931 Royal Enfield Interceptor 650cc 270000.0 Ahmedabad 6500.0 First Owner 2.0 650.0 Royal Enfield
1931 Royal Enfield Interceptor 650cc 270000.0 Ahmedabad 6500.0 First Owner 2.0 650.0 Royal Enfield
bike_name price city kms_driven owner age power brand
5912 Royal Enfield Interceptor 650cc 265500.0 Nellore 12000.0 First Owner 2.0 650.0 Royal Enfield
1396 Royal Enfield Interceptor 650cc 265000.0 Delhi 11000.0 First Owner 2.0 650.0 Royal Enfield
428 Royal Enfield Interceptor 650cc 265000.0 Bangalore 12900.0 First Owner 2.0 650.0 Royal Enfield
2332 Royal Enfield Interceptor 650cc 265000.0 Delhi 8500.0 First Owner 2.0 650.0 Royal Enfield
81 Royal Enfield Interceptor 650cc 260000.0 Navi Mumbai 3800.0 First Owner 2.0 650.0 Royal Enfield
1251 Royal Enfield Interceptor 650cc 250000.0 Silchar 920.0 First Owner 2.0 650.0 Royal Enfield
find out or filter all the yamaha brand bike top 10 lowest kms
In [62]:
df.brand.value_counts()
Out[62]:
brand
Bajaj 2081
Royal Enfield 1346
Hero 1142
Honda 676
Yamaha 651
TVS 481
KTM 375
Suzuki 203
Harley-Davidson 91
Kawasaki 61
Hyosung 53
Mahindra 50
Benelli 46
Triumph 21
Ducati 20
BMW 10
Jawa 7
Indian 3
MV 3
Rajdoot 1
LML 1
Yezdi 1
Ideal 1
Name: count, dtype: int64
In [63]:
In [64]:
bullet.sort_values(by="kms_driven").head(10)
Out[64]:
628 Yamaha FZs 150cc 77500.0 Delhi 45.0 First Owner 3.0 150.0 Yamaha
2264 Yamaha YZF-R1M 1000cc 600000.0 Faridabad 152.0 First Owner 5.0 1000.0 Yamaha
2037 Yamaha YZF-R1M 1000cc 600000.0 Surat 183.0 First Owner 5.0 1000.0 Yamaha
8693 Yamaha FZ25 ABS 250cc 142000.0 Delhi 233.0 First Owner 2.0 250.0 Yamaha
4765 Yamaha YZF-R1 1000cc 850000.0 Guwahati 285.0 First Owner 3.0 1000.0 Yamaha
4200 Yamaha YZF-R1 1000cc 700000.0 Delhi 285.0 First Owner 3.0 1000.0 Yamaha
123 Yamaha YZF R6 600cc 500000.0 Bangalore 285.0 First Owner 5.0 600.0 Yamaha
2265 Yamaha RX135 135cc 4-Speed 85000.0 Mumbai 300.0 Third Owner 24.0 135.0 Yamaha
8591 Yamaha YZF-R1 1000cc 1450000.0 Mumbai 340.0 First Owner 3.0 1000.0 Yamaha
5775 Yamaha FZ S Vbike_name
2.0 150cc 65000.0
price Pune 394.0
city kms_driven First Owner
owner 6.0 power
age 150.0 Yamaha
brand
In [65]:
Out[65]:
1 Royal Enfield Classic 350cc 119900.0 Delhi 11000.0 First Owner 4.0 350.0 Royal Enfield
8 Royal Enfield Thunderbird X 350cc 145000.0 Bangalore 9190.0 First Owner 3.0 350.0 Royal Enfield
23 Royal Enfield Classic Chrome 500cc 121700.0 Kalyan 24520.0 First Owner 5.0 500.0 Royal Enfield
36 Royal Enfield Classic 350cc 98800.0 Kochi 39000.0 First Owner 5.0 350.0 Royal Enfield
9261 Royal Enfield Classic 500cc 146006.0 Guwahati 8575.0 First Owner 4.0 500.0 Royal Enfield
9319 Royal Enfield Classic 350cc 100000.0 Chennai 25000.0 First Owner 10.0 350.0 Royal Enfield
9337 Royal Enfield Himalayan 410cc 120000.0 Gurgaon 8492.0 First Owner 5.0 410.0 Royal Enfield
9338 Royal Enfield Himalayan 410cc 138000.0 Delhi 5000.0 First Owner 5.0 410.0 Royal Enfield
9344 Royal Enfield Bullet Twinspark 350cc 80000.0 Delhi 56968.0 First Owner 8.0 350.0 Royal Enfield
In [66]:
In [67]:
#export csv file
jaipur_bullet.to_csv('jaipur_bullet.csv', index=False)
In [68]:
jaipur_bullet
Out[68]:
166 Royal Enfield Classic 350cc 123600.0 Jaipur 7702.0 First Owner 3.0 350.0 Royal Enfield
807 Royal Enfield Classic 350cc 115000.0 Jaipur 25000.0 First Owner 6.0 350.0 Royal Enfield
3009 Royal Enfield Himalayan 410cc 120000.0 Jaipur 14290.0 First Owner 5.0 410.0 Royal Enfield
3332 Royal Enfield Classic 350cc Dual Disc 120000.0 Jaipur 21000.0 First Owner 4.0 350.0 Royal Enfield
4497 Royal Enfield Classic 350cc 121000.0 Jaipur 20000.0 First Owner 4.0 350.0 Royal Enfield
4630 Royal Enfield Classic 350cc 90000.0 Jaipur 15437.0 First Owner 7.0 350.0 Royal Enfield
5193 Royal Enfield Electra 350cc 90000.0 Jaipur 14330.0 First Owner 3.0 350.0 Royal Enfield
6326 Royal Enfield Classic 350cc 120000.0 Jaipur 2069.0 First Owner 7.0 350.0 Royal Enfield
6655 Royal Enfield Classic Chrome 500cc 175000.0 Jaipur 7000.0 First Owner 5.0 500.0 Royal Enfield
6807 Royal Enfield Classic Chrome 500cc 160000.0 Jaipur 14000.0 First Owner 4.0 500.0 Royal Enfield
6897 Royal Enfield Thunderbird 350cc 135000.0 Jaipur 13500.0 First Owner 3.0 350.0 Royal Enfield
7137 Royal Enfield Classic 350cc 132000.0
bike_name price Jaipur 3467.0
city kms_driven First Owner
owner 6.0 power
age 350.0 Royal Enfield
brand
7345 Royal Enfield Thunderbird 350cc 105000.0 Jaipur 1543.0 First Owner 5.0 350.0 Royal Enfield
7352 Royal Enfield Thunderbird 350cc 55000.0 Jaipur 4779.0 First Owner 11.0 350.0 Royal Enfield
7974 Royal Enfield Classic 350cc 114000.0 Jaipur 38000.0 First Owner 7.0 350.0 Royal Enfield
8067 Royal Enfield Thunderbird 350cc 128000.0 Jaipur 8500.0 First Owner 5.0 350.0 Royal Enfield
8100 Royal Enfield Classic 350cc 100000.0 Jaipur 22600.0 First Owner 8.0 350.0 Royal Enfield
8367 Royal Enfield Thunderbird 350cc 90000.0 Jaipur 9902.0 First Owner 7.0 350.0 Royal Enfield
8461 Royal Enfield Classic Squadron Blue 500cc 160000.0 Jaipur 8000.0 First Owner 5.0 500.0 Royal Enfield
8708 Royal Enfield Bullet Twinspark 350cc 90000.0 Jaipur 34548.0 First Owner 7.0 350.0 Royal Enfield
8818 Royal Enfield Thunderbird 500cc 120000.0 Jaipur 32814.0 Second Owner 8.0 500.0 Royal Enfield
9197 Royal Enfield Classic 350cc 90000.0 Jaipur 17000.0 First Owner 7.0 350.0 Royal Enfield
9198 Royal Enfield Bullet Electra 350cc 83800.0 Jaipur 21540.0 Second Owner 10.0 350.0 Royal Enfield
In [111]:
jaipur_bullet[['bike_name','price','age']]
Out[111]:
In [70]:
166 Royal Enfield Classic 350cc 123600.0 Jaipur 7702.0 3.0 350.0 Royal Enfield
807 Royal Enfield Classic 350cc 115000.0 Jaipur 25000.0 6.0 350.0 Royal Enfield
3009 Royal Enfield Himalayan 410cc 120000.0 Jaipur 14290.0 5.0 410.0 Royal Enfield
3332 Royal Enfield Classic 350cc Dual Disc 120000.0 Jaipur 21000.0 4.0 350.0 Royal Enfield
4497 Royal Enfield Classic 350cc 121000.0 Jaipur 20000.0 4.0 350.0 Royal Enfield
4630 Royal Enfield Classic 350cc 90000.0 Jaipur 15437.0 7.0 350.0 Royal Enfield
5193 Royal Enfield Electra 350cc 90000.0 Jaipur 14330.0 3.0 350.0 Royal Enfield
6326 Royal Enfield Classic 350cc 120000.0 Jaipur 2069.0 7.0 350.0 Royal Enfield
6655 Royal Enfield Classic Chrome 500cc 175000.0 Jaipur 7000.0 5.0 500.0 Royal Enfield
6807 Royal Enfield Classic Chrome 500cc 160000.0 Jaipur 14000.0 4.0 500.0 Royal Enfield
6897 Royal Enfield Thunderbird 350cc 135000.0 Jaipur 13500.0 3.0 350.0 Royal Enfield
7137 Royal Enfield Classic 350cc 132000.0 Jaipur 3467.0 6.0 350.0 Royal Enfield
7345 Royal Enfield Thunderbird 350cc 105000.0 Jaipur 1543.0 5.0 350.0 Royal Enfield
7352 Royal Enfield Thunderbird 350cc 55000.0 Jaipur 4779.0 11.0 350.0 Royal Enfield
7974 Royal Enfield Classic 350cc 114000.0 Jaipur 38000.0 7.0 350.0 Royal Enfield
8067 Royal Enfield Thunderbird 350cc 128000.0 Jaipur 8500.0 5.0 350.0 Royal Enfield
8100 Royal Enfield Classic 350cc 100000.0 Jaipur 22600.0 8.0 350.0 Royal Enfield
8367 Royal Enfield Thunderbird 350cc 90000.0 Jaipur 9902.0 7.0 350.0 Royal Enfield
8461 Royal Enfield Classic Squadron Blue 500cc 160000.0 Jaipur 8000.0 5.0 500.0 Royal Enfield
8708 Royal Enfield Bullet Twinspark 350cc 90000.0 Jaipur 34548.0 7.0 350.0 Royal Enfield
8818 Royal Enfield Thunderbird 500cc 120000.0 Jaipur 32814.0 8.0 500.0 Royal Enfield
9197 Royal Enfield Classic 350cc 90000.0 Jaipur 17000.0 7.0 350.0 Royal Enfield
9198 Royal Enfield Bullet Electra 350cc 83800.0 Jaipur 21540.0 10.0 350.0 Royal Enfield
In [71]:
df
Out[71]:
0 TVS Star City Plus Dual Tone 110cc 35000.0 Ahmedabad 17654.0 First Owner 3.0 110.0 TVS
1 Royal Enfield Classic 350cc 119900.0 Delhi 11000.0 First Owner 4.0 350.0 Royal Enfield
2 Triumph Daytona 675R 600000.0 Delhi 110.0 First Owner 8.0 675.0 Triumph
3 TVS Apache RTR 180cc 65000.0 Bangalore 16329.0 First Owner 4.0 180.0 TVS
4 Yamaha FZ S V 2.0 150cc-Ltd. Edition 80000.0 Bangalore 10000.0 First Owner 3.0 150.0 Yamaha
9362 Hero Hunk Rear Disc 150cc 25000.0 Delhi 48587.0 First Owner 8.0 150.0 Hero
9369 Bajaj Avenger 220cc 35000.0 Bangalore 60000.0 First Owner 9.0 220.0 Bajaj
9370 Harley-Davidson Street 750 ABS 450000.0 Jodhpur 3430.0 First Owner 4.0 750.0 Harley-Davidson
9371 Bajaj Dominar 400 ABS 139000.0 Hyderabad 21300.0 First Owner 4.0 400.0 Bajaj
9372 Bajaj Avenger Street 220 80000.0 Hyderabad 7127.0 First Owner 5.0 220.0 Bajaj
7324 rows × 8 columns
In [114]:
jaipur_bullet.drop(['owner','city','power']), axis='columns',inplace=True)
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[114], line 4
1 #delete multiple col
2 #Coln--> owner, city, power
----> 4 jaipur_bullet.drop(('owner','city','power'), axis='columns',inplace=True)
File c:\Users\Krati\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\cor
e\frame.py:5581, in DataFrame.drop(self, labels, axis, index, columns, level, inplace, e
rrors)
5433 def drop(
5434 self,
5435 labels: IndexLabel | None = None,
(...)
5442 errors: IgnoreRaise = "raise",
5443 ) -> DataFrame | None:
5444 """
5445 Drop specified labels from rows or columns.
5446
(...)
5579 weight 1.0 0.8
5580 """
-> 5581 return super().drop(
5582 labels=labels,
5583 axis=axis,
5584 index=index,
5585 columns=columns,
5586 level=level,
5587 inplace=inplace,
5588 errors=errors,
5589 )
File c:\Users\Krati\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\cor
e\generic.py:4788 , in NDFrame.drop(self, labels, axis, index, columns, level, inplace, e
rrors)
4786 for axis, labels in axes.items():
4787 if labels is not None:
-> 4788 obj = obj._drop_axis(labels, axis, level=level, errors=errors)
4790 if inplace:
4791 self._update_inplace(obj)
File c:\Users\Krati\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\cor
e\generic.py:4830 , in NDFrame._drop_axis(self, labels, axis, level, errors, only_slice)
4828 new_axis = axis.drop(labels, level=level, errors=errors)
4829 else:
-> 4830 new_axis = axis.drop(labels, errors=errors)
4831 indexer = axis.get_indexer(new_axis)
4833 # Case for non-unique axis
4834 else:
File c:\Users\Krati\AppData\Local\Programs\Python\Python312\Lib\site-packages\pandas\cor
e\indexes\base.py:7070 , in Index.drop(self, labels, errors)
7068 if mask.any():
7069 if errors != "ignore":
-> 7070 raise KeyError(f"{labels[mask].tolist()} not found in axis")
7071 indexer = indexer[~mask]
7072 return self.delete(indexer)
In [73]:
jaipur_bullet
Out[73]:
166 Royal Enfield Classic 350cc 123600.0 7702.0 3.0 Royal Enfield
807 Royal Enfield Classic 350cc 115000.0 25000.0 6.0 Royal Enfield
3009 Royal Enfield Himalayan 410cc 120000.0 14290.0 5.0 Royal Enfield
3332 Royal Enfield Classic 350cc Dual Disc 120000.0 21000.0 4.0 Royal Enfield
4497 Royal Enfield Classic 350cc 121000.0 20000.0 4.0 Royal Enfield
4630 Royal Enfield Classic 350cc 90000.0 15437.0 7.0 Royal Enfield
5193 Royal Enfield Electra 350cc 90000.0 14330.0 3.0 Royal Enfield
6326 Royal Enfield Classic 350cc 120000.0 2069.0 7.0 Royal Enfield
6655 Royal Enfield Classic Chrome 500cc 175000.0 7000.0 5.0 Royal Enfield
6807 Royal Enfield Classic Chrome 500cc 160000.0 14000.0 4.0 Royal Enfield
6897 Royal Enfield Thunderbird 350cc 135000.0 13500.0 3.0 Royal Enfield
7137 Royal Enfield Classic 350cc 132000.0 3467.0 6.0 Royal Enfield
7345 Royal Enfield Thunderbird 350cc 105000.0 1543.0 5.0 Royal Enfield
7352 Royal Enfield Thunderbird 350cc 55000.0 4779.0 11.0 Royal Enfield
7974 Royal Enfield Classic 350cc 114000.0 38000.0 7.0 Royal Enfield
8067 Royal Enfield Thunderbird 350cc 128000.0 8500.0 5.0 Royal Enfield
8100 Royal Enfield Classic 350cc 100000.0 22600.0 8.0 Royal Enfield
8367 Royal Enfield Thunderbird 350cc 90000.0 9902.0 7.0 Royal Enfield
8461 Royal Enfield Classic Squadron Blue 500cc 160000.0 8000.0 5.0 Royal Enfield
8708 Royal Enfield Bullet Twinspark 350cc 90000.0 34548.0 7.0 Royal Enfield
8818 Royal Enfield Thunderbird 500cc 120000.0 32814.0 8.0 Royal Enfield
9197 Royal Enfield Classic 350cc 90000.0 17000.0 7.0 Royal Enfield
9198 Royal Enfield Bullet Electra 350cc 83800.0 21540.0 10.0 Royal Enfield
In [74]:
import pandas as pd
import numpy as np
In [75]:
df = pd.read_csv('Used_Bikes.csv')
df
Out[75]:
0 TVS Star City Plus Dual Tone 110cc 35000.0 Ahmedabad 17654.0 First Owner 3.0 110.0 TVS
1 Royal Enfield Classic 350cc 119900.0 Delhi 11000.0 First Owner 4.0 350.0 Royal Enfield
2 Triumph Daytona 675R 600000.0 Delhi 110.0 First Owner 8.0 675.0 Triumph
3 TVS Apache RTR 180cc 65000.0 Bangalore 16329.0 First Owner 4.0 180.0 TVS
4 Yamaha FZ S V 2.0 150cc-Ltd. Edition 80000.0 Bangalore 10000.0 First Owner 3.0 150.0 Yamaha
32643 Hero Passion Pro 100cc 39000.0 Delhi 22000.0 First Owner 4.0 100.0 Hero
32644 TVS Apache RTR 180cc 30000.0 Karnal 6639.0 First Owner 9.0 180.0 TVS
32644 TVS Apache RTR 180cc 30000.0 Karnal 6639.0 First Owner 9.0 180.0 TVS
bike_name price city kms_driven owner age power brand
32645 Bajaj Avenger Street 220 60000.0 Delhi 20373.0 First Owner 6.0 220.0 Bajaj
32646 Hero Super Splendor 125cc 15600.0 Jaipur 84186.0 First Owner 16.0 125.0 Hero
32647 Bajaj Pulsar 150cc 22000.0 Pune 60857.0 First Owner 13.0 150.0 Bajaj
In [76]:
data = {'A':[2,5,np.nan,8,np.nan,9],
'B':[np.nan,45,np.nan,89,63,np.nan],
'C':[np.nan,74,np.nan,np.nan,85,4],
'D':[10,20,30,40,50,60]}
data
Out[76]:
In [77]:
df2=pd.DataFrame(data)
df2
Out[77]:
A B C D
Missing value Handling 1- Remove the records ---> if data 3-4% out of total is missing then we use 2- fill the
records --->
In [78]:
df2.dropna() # by default remove all those rows that contains missing values
#df2.dropna(inplace=True)----> changes done permanently
Out[78]:
A B C D
In [79]:
df2.dropna(axis=1) #(axis='columns)
Out[79]:
D
0 10
D
1 20
2 30
3 40
4 50
5 60
In [80]:
Out[80]:
A 2
B 3
C 3
D 0
dtype: int64
In [81]:
df2
Out[81]:
A B C D
In [82]:
In [83]:
Out[83]:
In [84]:
Out[84]:
A 33.333333
B 50.000000
C 50.000000
D 0.000000
dtype: float64
In [85]:
Out[85]:
A B C D
In [86]:
df2
Out[86]:
A B C D
In [87]:
# A=500
# B=600
# C=700
df2['A'].fillna(500)
Out[87]:
0 2.0
1 5.0
2 500.0
3 8.0
4 500.0
5 9.0
Name: A, dtype: float64
In [88]:
df2['B'].fillna(600)
Out[88]:
0 600.0
1 45.0
2 600.0
3 89.0
4 63.0
5 600.0
Name: B, dtype: float64
In [89]:
In [89]:
df2['C'].fillna(700)
Out[89]:
0 700.0
1 74.0
2 700.0
3 700.0
4 85.0
5 4.0
Name: C, dtype: float64
In [90]:
df2['A'].mean()
Out[90]:
np.float64(6.0)
In [91]:
df2['A'].fillna(6.0)
#or
#df2['A'].fillna(df2['A'].mean())
Out[91]:
0 2.0
1 5.0
2 6.0
3 8.0
4 6.0
5 9.0
Name: A, dtype: float64
In [92]:
df2['A'].fillna(df2['A'].median())
Out[92]:
0 2.0
1 5.0
2 6.5
3 8.0
4 6.5
5 9.0
Name: A, dtype: float64
In [93]:
Groupby
In [94]:
df
Out[94]:
0 TVS Star City Plus Dual Tone 110cc 35000.0 Ahmedabad 17654.0 First Owner 3.0 110.0 TVS
1 Royal Enfield Classic 350cc 119900.0 Delhi 11000.0 First Owner 4.0 350.0 Royal Enfield
2 Triumph Daytona 675R 600000.0 Delhi 110.0 First Owner 8.0 675.0 Triumph
3 TVS Apache RTR 180cc 65000.0 Bangalore 16329.0 First Owner 4.0 180.0 TVS
4 Yamaha FZ S V 2.0 150cc-Ltd. Edition 80000.0 Bangalore 10000.0 First Owner 3.0 150.0 Yamaha
4 Yamaha FZ S V 2.0 150cc-Ltd. Edition 80000.0 Bangalore 10000.0 First Owner 3.0 150.0 Yamaha
bike_name price city kms_driven owner age power brand
... ... ... ... ... ... ... ... ...
32643 Hero Passion Pro 100cc 39000.0 Delhi 22000.0 First Owner 4.0 100.0 Hero
32644 TVS Apache RTR 180cc 30000.0 Karnal 6639.0 First Owner 9.0 180.0 TVS
32645 Bajaj Avenger Street 220 60000.0 Delhi 20373.0 First Owner 6.0 220.0 Bajaj
32646 Hero Super Splendor 125cc 15600.0 Jaipur 84186.0 First Owner 16.0 125.0 Hero
32647 Bajaj Pulsar 150cc 22000.0 Pune 60857.0 First Owner 13.0 150.0 Bajaj
In [95]:
Out[95]:
np.float64(4400.0)
In [96]:
Out[96]:
np.float64(1900000.0)
In [97]:
Out[97]:
np.float64(68295.41763660868)
In [98]:
df[df['brand']=="TVS"]['price'].min()
Out[98]:
np.float64(5800.0)
In [99]:
df['brand'].unique()
Out[99]:
In [100]:
# Group by 'brand'
brand_group = df.groupby('brand')
brand_group
Out[100]:
In [101]:
brand_group.get_group('TVS')
Out[101]:
Out[101]:
0 TVS Star City Plus Dual Tone 110cc 35000.0 Ahmedabad 17654.0 First Owner 3.0 110.0 TVS
3 TVS Apache RTR 180cc 65000.0 Bangalore 16329.0 First Owner 4.0 180.0 TVS
52 TVS Apache RTR 160cc 60000.0 Mumbai 30000.0 First Owner 5.0 160.0 TVS
114 TVS Apache RTR 160 4V Disc 69900.0 Delhi 8700.0 First Owner 3.0 160.0 TVS
130 TVS Phoenix Disc 125cc 21500.0 Barasat 10500.0 First Owner 5.0 125.0 TVS
32549 TVS Apache RTR 180cc 30000.0 Karnal 6639.0 First Owner 9.0 180.0 TVS
32568 TVS Apache RTR 180cc 30000.0 Karnal 6639.0 First Owner 9.0 180.0 TVS
32587 TVS Apache RTR 180cc 30000.0 Karnal 6639.0 First Owner 9.0 180.0 TVS
32606 TVS Apache RTR 180cc 30000.0 Karnal 6639.0 First Owner 9.0 180.0 TVS
32644 TVS Apache RTR 180cc 30000.0 Karnal 6639.0 First Owner 9.0 180.0 TVS
In [102]:
Out[102]:
2 Triumph Daytona 675R 600000.0 Delhi 110.0 First Owner 8.0 675.0 Triumph
93 Triumph Street Triple 765 790000.0 Mumbai 19000.0 First Owner 3.0 765.0 Triumph
284 Triumph Street Triple ABS 675cc 599999.0 Mumbai 7800.0 Third Owner 5.0 675.0 Triumph
614 Triumph Speed Triple 1050cc 750000.0 Koppal 9500.0 First Owner 4.0 1050.0 Triumph
1319 Triumph Tiger 800 XRX 750000.0 Delhi 10000.0 First Owner 6.0 800.0 Triumph
1362 Triumph Thunderbird Storm 1700cc 700000.0 Guwahati 176.0 First Owner 5.0 1700.0 Triumph
1375 Triumph Tiger 800 XCA 1300000.0 Ludhiana 9500.0 First Owner 4.0 800.0 Triumph
1440 Triumph Tiger 800 XCA 1300000.0 Ludhiana 9500.0 First Owner 4.0 800.0 Triumph
1867 Triumph Street Twin 900cc 790000.0 Delhi 2500.0 First Owner 3.0 900.0 Triumph
1927 Triumph Street Triple 765 860000.0 Udaipur 2100.0 First Owner 3.0 765.0 Triumph
1998 Triumph Street Triple 765 860000.0 Udaipur 2100.0 First Owner 3.0 765.0 Triumph
2107 Triumph Street Twin 900cc 790000.0 Delhi 2500.0 First Owner 3.0 900.0 Triumph
2157 Triumph Tiger 800 XR 1170000.0 Pune 31500.0 First Owner 5.0 800.0 Triumph
2244 Triumph Bonneville T100 865cc 550000.0 Bangalore 25536.0 First Owner 7.0 865.0 Triumph
3007 Triumph Thruxton R 1200cc 1000000.0 Delhi 1500.0 First Owner 4.0 1200.0 Triumph
3054 Triumph Thruxton R 1200cc 1000000.0 Delhi 1500.0 First Owner 4.0 1200.0 Triumph
3095 Triumph Street Triple ABS 675cc 650000.0 Mumbai 16000.0 First Owner 4.0 675.0 Triumph
3138 Triumph Bonneville T100 900cc 640000.0 Delhi 7500.0 Second Owner 4.0 900.0 Triumph
4597 Triumph Bonneville T100 865cc 500000.0 Delhi 5600.0 Second Owner 6.0 865.0 Triumph
4662 Triumph Bonneville T100 865cc 500000.0 Delhi 5600.0 Second Owner 6.0 865.0 Triumph
4828 Triumph Thruxton R 1200cc 950000.0 Delhi 4600.0 First Owner 4.0 1200.0 Triumph
4917 Triumph Speed Triple 1050cc 900000.0 Bangalore 8000.0 Second Owner 5.0 1050.0 Triumph
6793 Triumph Street Triple 765 789000.0 Pune 13000.0 First Owner 4.0 765.0 Triumph
6794 Triumph Tiger 800 XCA 1200000.0 Hyderabad 25800.0 First Owner 4.0 800.0 Triumph
6795 Triumph Street Triple 765 875000.0 Rajkot 750.0 First Owner 2.0 765.0 Triumph
6795 Triumph Street Triple 765 875000.0 Rajkot 750.0 First Owner 2.0 765.0 Triumph
bike_name price city kms_driven owner age power brand
7621 Triumph Daytona 675 ABS 699000.0 Delhi 9000.0 Third Owner 5.0 675.0 Triumph
In [103]:
brand_group[['price']].min()
Out[103]:
price
brand
BMW 255000.0
Bajaj 6400.0
Benelli 110700.0
Ducati 380000.0
Harley-Davidson 250000.0
Hero 5000.0
Honda 10000.0
Hyosung 120000.0
Ideal 100000.0
Indian 700000.0
Jawa 146000.0
KTM 55000.0
Kawasaki 110000.0
LML 4400.0
MV 950000.0
Mahindra 17800.0
Rajdoot 75000.0
Suzuki 8000.0
TVS 5800.0
Triumph 500000.0
Yamaha 9400.0
Yezdi 68000.0
In [104]:
brand_group[['price']].max()
Out[104]:
price
brand
BMW 1800000.0
Bajaj 195000.0
Benelli 785000.0
Ducati 1500000.0
Harley-Davidson 1100000.0
Hero 104000.0
Honda 800000.0
Hyosung 493500.0
Hyosung 493500.0
price
Ideal 100000.0
brand
Indian 1900000.0
Jawa 223000.0
KTM 860000.0
Kawasaki 1100000.0
LML 4400.0
MV 1500000.0
Mahindra 175000.0
Rajdoot 75000.0
Suzuki 1260000.0
TVS 224000.0
Triumph 1300000.0
Yamaha 1550000.0
Yezdi 68000.0
In [105]:
Out[105]:
brand
brand
In [106]:
brand_group['kms_driven'].agg(min_dis='min',max_dis='max')
Out[106]:
min_dis max_dis
brand
MV 1911.0 6898.0
In [107]:
df['owner'].value_counts()
Out[107]:
owner
First Owner 29964
Second Owner 2564
Third Owner 108
Fourth Owner Or More 12
Name: count, dtype: int64
In [108]:
brand_group = df.groupby('owner')
brand_group
Out[108]:
In [109]:
In [109]:
brand_group[['kms_driven'].agg(min_dis='min',max_dis='max')][['price'].agg(min_price='m
in',max_price='max',avg_price='mean')]
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[109], line 1
----> 1 brand_group[['kms_driven'].agg(min_dis='min',max_dis='max')][['price'].agg(min_p
rice='min',max_price='max',avg_price='mean')]
In [ ]:
df.select_dtypes(include='O')
Out[ ]:
0 TVS Star City Plus Dual Tone 110cc Ahmedabad First Owner TVS
In [ ]:
cat_col=df.select_dtypes(include='O')
cat_col
Out[ ]:
0 TVS Star City Plus Dual Tone 110cc Ahmedabad First Owner TVS
In [ ]:
df
df
Out[ ]:
0 TVS Star City Plus Dual Tone 110cc 35000.0 Ahmedabad 17654.0 First Owner 3.0 110.0 TVS
1 Royal Enfield Classic 350cc 119900.0 Delhi 11000.0 First Owner 4.0 350.0 Royal Enfield
2 Triumph Daytona 675R 600000.0 Delhi 110.0 First Owner 8.0 675.0 Triumph
3 TVS Apache RTR 180cc 65000.0 Bangalore 16329.0 First Owner 4.0 180.0 TVS
4 Yamaha FZ S V 2.0 150cc-Ltd. Edition 80000.0 Bangalore 10000.0 First Owner 3.0 150.0 Yamaha
32643 Hero Passion Pro 100cc 39000.0 Delhi 22000.0 First Owner 4.0 100.0 Hero
32644 TVS Apache RTR 180cc 30000.0 Karnal 6639.0 First Owner 9.0 180.0 TVS
32645 Bajaj Avenger Street 220 60000.0 Delhi 20373.0 First Owner 6.0 220.0 Bajaj
32646 Hero Super Splendor 125cc 15600.0 Jaipur 84186.0 First Owner 16.0 125.0 Hero
32647 Bajaj Pulsar 150cc 22000.0 Pune 60857.0 First Owner 13.0 150.0 Bajaj
In [ ]:
num_col = df.select_dtypes(exclude='O')
num_col.head()
Out[ ]:
In [ ]:
pd.concat([cat_col,num_col],axis='columns')
Out[ ]:
0 TVS Star City Plus Dual Tone 110cc Ahmedabad First Owner TVS 35000.0 17654.0 3.0 110.0
1 Royal Enfield Classic 350cc Delhi First Owner Royal Enfield 119900.0 11000.0 4.0 350.0
2 Triumph Daytona 675R Delhi First Owner Triumph 600000.0 110.0 8.0 675.0
3 TVS Apache RTR 180cc Bangalore First Owner TVS 65000.0 16329.0 4.0 180.0
4 Yamaha FZ S V 2.0 150cc-Ltd. Edition Bangalore First Owner Yamaha 80000.0 10000.0 3.0 150.0
32643 Hero Passion Pro 100cc Delhi First Owner Hero 39000.0 22000.0 4.0 100.0
32644 TVS Apache RTR 180cc Karnal First Owner TVS 30000.0 6639.0 9.0 180.0
32645 Bajaj Avenger Street 220 Delhi First Owner Bajaj 60000.0 20373.0 6.0 220.0
32646 Hero Super Splendor 125cc Jaipur First Owner Hero 15600.0 84186.0 16.0 125.0
32647 Bajaj Pulsar 150cc Pune First Owner Bajaj 22000.0 60857.0 13.0 150.0
Out[ ]:
In [ ]: