Pandas Cheat Sheet

by Justin1209 (Justin1209) via cheatography.com/101982/cs/21202/

Import the Pandas Module

import pandas as pd

Create a DataFrame

# Method 1: from a dictionary of columns
df1 = pd.DataFrame({
    'name': ['John Smith', 'Jane Doe'],
    'address': ['13 Main St.', '46 Maple Ave.'],
    'age': [34, 28]
})
# Method 2: from a list of rows plus column names
df2 = pd.DataFrame([
    ['John Smith', '123 Main St.', 34],
    ['Jane Doe', '456 Maple Ave.', 28],
    ['Joe Schmo', '9 Broadway', 51]
    ],
    columns=['name', 'address', 'age'])

Loading and Saving CSVs

# Load a CSV file into a DataFrame
df = pd.read_csv('my-csv-file.csv')
# Save a DataFrame to a CSV file
df.to_csv('new-csv-file.csv')
# Load a DataFrame in chunks (for large datasets)
# Initialize reader object: urb_pop_reader
urb_pop_reader = pd.read_csv('ind_pop_data.csv', chunksize=1000)
# Get the first DataFrame chunk: df_urb_pop
df_urb_pop = next(urb_pop_reader)

Inspect a DataFrame

df.head(5)    First 5 rows
df.info()     Statistics of columns (row count, null values, datatype)

Reshape (for Scikit)

import numpy as np
nums = np.array(range(1, 11))
-> [ 1 2 3 4 5 6 7 8 9 10]
nums = nums.reshape(-1, 1)
-> [ [1],
     [2],
     [3],
     [4],
     [5],
     [6],
     [7],
     [8],
     [9],
     [10]]

You can think of reshape() as rotating this array. Rather than one big row of numbers, nums is now a big column of numbers - there's one number in each row.

Converting Datatypes

# Convert argument to numeric type
pandas.to_numeric(arg, errors="raise")
errors:
"raise" -> raise an exception
"coerce" -> invalid parsing will be set as NaN

DataFrame for Select Columns / Rows

df = pd.DataFrame([
    ['January', 100, 100, 23, 100],
    ['February', 51, 45, 145, 45],
    ['March', 81, 96, 65, 96],
    ['April', 80, 80, 54, 180],
    ['May', 51, 54, 54, 154],
    ['June', 112, 109, 79, 129]],
    columns=['month', 'east', 'north', 'south', 'west']
)

Select Columns

# Select one Column
clinic_north = df.north
--> Reshape values for Scikit learn: clinic_north.values.reshape(-1, 1)
# Select multiple Columns
clinic_north_south = df[['north', 'south']]

Make sure that you have a double set of brackets [[ ]], or this command won't work!
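
The note on double brackets in Select Columns can be checked with a short sketch. It reuses the first three rows of the month table from this sheet; the variable names one_column, two_columns, column_2d and single_brackets_failed are made up for illustration.

```python
import pandas as pd

# First three rows of the cheat sheet's month table
df = pd.DataFrame(
    [['January', 100, 100, 23, 100],
     ['February', 51, 45, 145, 45],
     ['March', 81, 96, 65, 96]],
    columns=['month', 'east', 'north', 'south', 'west'],
)

one_column = df['north']               # single brackets, one name: a Series
two_columns = df[['north', 'south']]   # double brackets: a DataFrame

# Single brackets with two names are read as one key ('north', 'south'),
# which is not a column of this DataFrame, so pandas raises a KeyError.
try:
    df['north', 'south']
    single_brackets_failed = False
except KeyError:
    single_brackets_failed = True

# Turn the one-dimensional column into a 2-D array with one value per row
column_2d = one_column.values.reshape(-1, 1)

print(type(one_column).__name__, type(two_columns).__name__, column_2d.shape)
```

The inner brackets build a Python list of column names, and the outer brackets do the selection, which is why both pairs are needed.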

By Justin1209 (Justin1209), cheatography.com/justin1209/
Published 23rd November, 2019. Last updated 31st January, 2020.

Select Rows

# Select one Row
march = df.iloc[2]
# Select multiple Rows
jan_feb_march = df.iloc[:3]
feb_march_april = df.iloc[1:4]
may_june = df.iloc[-2:]
# Select Rows with Logic
january = df[df.month == 'January']
-> <, >, <=, >=, !=, ==
march_april = df[(df.month == 'March') | (df.month == 'April')]
-> &, |
january_february_march = df[df.month.isin(['January', 'February', 'March'])]
-> column_name.isin([" ", " "])

Selecting a subset of a DataFrame often results in non-consecutive indices.
Using .reset_index() will create a new DataFrame and move the old indices into a new column called index.
Use .reset_index(drop=True) if you don't need the index column.
Use .reset_index(inplace=True) to prevent a new DataFrame from being created.

Adding a Column

df = pd.DataFrame([
    [1, '3 inch screw', 0.5, 0.75],
    [2, '2 inch nail', 0.10, 0.25],
    [3, 'hammer', 3.00, 5.50],
    [4, 'screwdriver', 2.50, 3.00]
    ],
    columns=['Product ID', 'Description', 'Cost to Manufacture', 'Price']
)
# Add a Column with specified row-values
df['Sold in Bulk?'] = ['Yes', 'Yes', 'No', 'No']
# Add a Column with same value in every row
df['Is taxed?'] = 'Yes'
# Add a Column with calculation
df['Revenue'] = df['Price'] - df['Cost to Manufacture']

Performing Column Operation

df = pd.DataFrame([
    ['JOHN SMITH', 'john.smith@gmail.com'],
    ['Jane Doe', 'jdoe@yahoo.com'],
    ['joe schmo', 'joeschmo@hotmail.com']
    ],
    columns=['Name', 'Email'])
# Changing a column with an Operation
df['Name'] = df.Name.apply(str.lower)
-> str.lower, str.upper
# Perform a lambda Operation on a Column
get_last_name = lambda x: x.split(" ")[-1]
df['last_name'] = df.Name.apply(get_last_name)

Performing an Operation on Multiple Columns

df = pd.DataFrame([
    ["Apple", 1.00, "No"],
    ["Milk", 4.20, "No"],
    ["Paper Towels", 5.00, "Yes"],
    ["Light Bulbs", 3.75, "Yes"],
    ],
    columns=["Item", "Price", "Is taxed?"])
# Lambda Function
df['Price with Tax'] = df.apply(lambda row:
    row['Price'] * 1.075
    if row['Is taxed?'] == 'Yes'
    else row['Price'],
    axis=1
)

We apply a lambda to rows, as opposed to columns, when we want to perform functionality that needs to access more than one column at a time.
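
As a runnable check of the row-wise lambda (axis=1), here is a minimal sketch using the sheet's grocery table. The second column, 'Price with Tax (vectorized)', is not part of the cheat sheet. It is one possible alternative using Series.where, added only to show that both routes give the same numbers.

```python
import pandas as pd

df = pd.DataFrame(
    [["Apple", 1.00, "No"],
     ["Milk", 4.20, "No"],
     ["Paper Towels", 5.00, "Yes"],
     ["Light Bulbs", 3.75, "Yes"]],
    columns=["Item", "Price", "Is taxed?"],
)

# Row-wise: with axis=1 the lambda receives one row at a time,
# so it can read both 'Price' and 'Is taxed?' for that row.
df["Price with Tax"] = df.apply(
    lambda row: row["Price"] * 1.075 if row["Is taxed?"] == "Yes" else row["Price"],
    axis=1,
)

# Column-wise alternative (an assumption, not from the sheet):
# keep the price where the item is NOT taxed, otherwise use price * 1.075.
df["Price with Tax (vectorized)"] = df["Price"].where(
    df["Is taxed?"] != "Yes", df["Price"] * 1.075
)

print(df)
```

On large tables the column-wise form is usually faster, while the row-wise lambda is often easier to read when the rule has several branches.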


Rename Columns

# Method 1
df.columns = ['NewName_1', 'NewName_2', 'NewName_3', '...']
# Method 2
df.rename(columns={
    'OldName_1': 'NewName_1',
    'OldName_2': 'NewName_2'
}, inplace=True)

Using inplace=True lets us edit the original DataFrame.

Series vs. Dataframes

# Dataframe and Series
print(type(clinic_north))
# <class 'pandas.core.series.Series'>
print(type(df))
# <class 'pandas.core.frame.DataFrame'>
print(type(clinic_north_south))
# <class 'pandas.core.frame.DataFrame'>

In Pandas
- a series is a one-dimensional object that contains any type of data.
- a dataframe is a two-dimensional object that can hold multiple columns of different types of data.

A single column of a dataframe is a series, and a dataframe is a container of one or more series objects.

Column Statistics

Mean = Average              df.column.mean()
Median                      df.column.median()
Minimal Value               df.column.min()
Maximum Value               df.column.max()
Number of Values            df.column.count()
Number of Unique Values     df.column.nunique()
Standard Deviation          df.column.std()
List of Unique Values       df.column.unique()

Don't forget reset_index() at the end of a groupby operation

Calculating Aggregate Functions

# Group By
grouped = df.groupby(['col1', 'col2']).col3.measurement().reset_index()
# -> group by column1 and column2, calculate values of column3
# Percentile (requires: import numpy as np)
high_earners = (df.groupby('category').wage
    .apply(lambda x: np.percentile(x, 75))
    .reset_index())
# np.percentile can calculate any percentile over an array of values

Don't forget reset_index()

Pivot Tables

orders = pd.read_csv('orders.csv')
shoe_counts = orders.groupby(['shoe_type', 'shoe_color']).id.count().reset_index()
shoe_counts_pivot = shoe_counts.pivot(
    index = 'shoe_type',
    columns = 'shoe_color',
    values = 'id').reset_index()

We have to build a temporary table where we group by the columns we want to include in the pivot table

Merge (Same Column Name)

sales = pd.read_csv('sales.csv')
targets = pd.read_csv('targets.csv')
men_women = pd.read_csv('men_women_sales.csv')
# Method 1
sales_targets = pd.merge(sales, targets, how=" ")
# how: "inner"(default), "outer", "left", "right"
# Method 2 (Method Chaining)
all_data = sales.merge(targets).merge(men_women)
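
The groupby-then-pivot pattern from Pivot Tables can be sketched without a CSV file. The five rows below are made-up sample data standing in for orders.csv, which is not shown in this sheet; the column names id, shoe_type and shoe_color are the ones the sheet uses.

```python
import pandas as pd

# Made-up stand-in for orders.csv
orders = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'shoe_type': ['sandals', 'sandals', 'boots', 'boots', 'boots'],
    'shoe_color': ['black', 'black', 'black', 'brown', 'brown'],
})

# Step 1: temporary table with one row per (shoe_type, shoe_color) pair.
# reset_index() turns the group keys back into ordinary columns.
shoe_counts = (
    orders.groupby(['shoe_type', 'shoe_color']).id.count().reset_index()
)

# Step 2: spread the colors across columns, one row per shoe_type.
shoe_counts_pivot = shoe_counts.pivot(
    index='shoe_type',
    columns='shoe_color',
    values='id',
).reset_index()

print(shoe_counts)
print(shoe_counts_pivot)
```

Combinations that never occur in the data (brown sandals here) show up as NaN in the pivoted table.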


Inner Merge (Different Column Name)

orders = pd.read_csv('orders.csv')
products = pd.read_csv('products.csv')
# Method 1: Rename Columns
orders_products = pd.merge(orders,
    products.rename(columns={'id': 'product_id'}),
    how=" ").reset_index()
# how: "inner"(default), "outer", "left", "right"
# Method 2:
orders_products = pd.merge(orders, products,
    left_on="product_id",
    right_on="id",
    suffixes=["_orders", "_products"])

Method 2:
With the left_on / right_on syntax, both tables contribute a column called id.
A merge does not keep two columns with the same name, so by default pandas renames them to id_x and id_y.
We can make the names more useful with the keyword suffixes.

Concatenate

bakery = pd.read_csv('bakery.csv')
ice_cream = pd.read_csv('ice_cream.csv')
menu = pd.concat([bakery, ice_cream])

Melt

pandas.melt(DataFrame, id_vars, value_vars, var_name, value_name='value')

id_vars: Column(s) to use as identifier variables.
value_vars: Column(s) to unpivot. If not specified, uses all columns that are not set as id_vars.
var_name: Name to use for the 'variable' column.
value_name: Name to use for the 'value' column.

Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.

Assert Statements

# Test if country is of type object
# (the original used np.object, which was removed in NumPy 1.24; use the builtin object)
assert gapminder.country.dtype == object
# Test if year is of type int64
assert gapminder.year.dtype == np.int64
# Test if life_expectancy is of type float64
assert gapminder.life_expectancy.dtype == np.float64
# Assert that country does not contain any missing values
assert pd.notnull(gapminder.country).all()
# Assert that year does not contain any missing values
assert pd.notnull(gapminder.year).all()
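
The effect of the suffixes keyword in Inner Merge (Different Column Name) can be seen in a small sketch. The two tables are made-up stand-ins for orders.csv and products.csv; the quantity and description columns are assumptions added only to give the tables some content.

```python
import pandas as pd

# Made-up stand-ins for orders.csv and products.csv
orders = pd.DataFrame({
    'id': [1, 2, 3],
    'product_id': [10, 20, 10],
    'quantity': [2, 1, 5],
})
products = pd.DataFrame({
    'id': [10, 20],
    'description': ['hammer', 'screwdriver'],
})

# Both tables have a column called id, so the default suffixes apply.
default_merge = pd.merge(orders, products,
                         left_on='product_id', right_on='id')

# The same merge with explicit suffixes for the clashing column names.
named_merge = pd.merge(orders, products,
                       left_on='product_id', right_on='id',
                       suffixes=['_orders', '_products'])

print(list(default_merge.columns))
print(list(named_merge.columns))
```

Only columns whose names clash receive a suffix; product_id, quantity and description keep their names in both results.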
