Panda Cheatsheet

This document provides a cheat sheet for commonly used Python Pandas functions and operations. It covers topics like importing and exporting data, selecting, filtering, and sorting data, grouping and joining data, handling missing values, and visualizing data. The cheat sheet acts as a quick reference guide for working with Pandas.

Uploaded by

Adevair Junior


Subscribe to community at www.decodingdatascience.com to
get more useful documents, ebooks, courses & job tips like this.

Python Pandas
Cheat sheet

This cheat sheet covers some of the most common functions and operations you
will use when working with Pandas.

Importing and Exporting Data

Importing Pandas

To use Pandas, you will first need to import the library:

import pandas as pd

Reading a CSV file

You can read a CSV file into a Pandas DataFrame using the read_csv function:

df = pd.read_csv('filename.csv')
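
A minimal sketch of read_csv in action. To stay self-contained it reads from an in-memory string through io.StringIO rather than a file on disk; the column names and values are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would live in a file such as 'filename.csv'.
csv_text = "name,age,gender\nAlice,34,F\nBob,28,M\nCarol,41,F\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # column types inferred from the data
```

Checking shape and dtypes right after loading is a quick way to confirm the file was parsed as expected.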

Displaying the DataFrame

To view the data in a DataFrame, you can use the head function to display the
first few rows:

df.head()

You can also use the tail function to display the last few rows:

df.tail()

To display the entire DataFrame, you can simply print it:

print(df)

Selecting Columns

You can select a single column of a DataFrame by using the [] operator and the
column name:

df['column_name']

You can also select multiple columns by passing a list of column names:

df[['column_1', 'column_2']]

Filtering Rows

You can filter the rows of a DataFrame using a boolean expression. For example,
to select all rows where the value in the 'age' column is greater than 30:

df[df['age'] > 30]
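
Filters can also be combined. The sketch below assumes a small made-up DataFrame and shows the element-wise operators & (and), | (or), and ~ (not), plus the isin function; each comparison needs its own parentheses.

```python
import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Carol', 'Dan'],
    'age': [34, 28, 41, 52],
    'gender': ['F', 'M', 'F', 'M'],
})

# Rows where age is above 30 AND gender is 'F'.
over_30_women = df[(df['age'] > 30) & (df['gender'] == 'F')]

# Rows where the name is one of several allowed values.
named = df[df['name'].isin(['Bob', 'Dan'])]

# Rows where the condition is NOT met.
not_over_30 = df[~(df['age'] > 30)]
```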

Sorting Data

You can sort the rows of a DataFrame by one or more columns using the
sort_values function. For example, to sort the DataFrame by the 'age' column in
ascending order:

df.sort_values(by='age')

To sort in descending order, set the ascending parameter to False:

df.sort_values(by='age', ascending=False)
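
Sorting by more than one column works the same way. This sketch, on made-up data, passes a list to by and a matching list to ascending so that each column gets its own direction.

```python
import pandas as pd

# Made-up sample data with ties in the 'age' column.
df = pd.DataFrame({
    'name': ['Dan', 'Bob', 'Carol', 'Alice'],
    'age': [28, 28, 34, 34],
})

# Sort by age descending, then break ties by name ascending.
sorted_df = df.sort_values(by=['age', 'name'], ascending=[False, True])

# sort_values returns a new DataFrame and keeps the old index labels;
# reset_index(drop=True) renumbers the rows from 0.
sorted_df = sorted_df.reset_index(drop=True)
```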

Grouping Data

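You can group a DataFrame by one or more columns and apply a function to
each group using the groupby function. For example, to group the DataFrame by
the 'gender' column and compute the mean of each numeric column per group
(numeric_only=True skips non-numeric columns, which otherwise raise an error
in pandas 2.0 and later):

df.groupby('gender').mean(numeric_only=True)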
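
As a hedged illustration of grouping, the sketch below selects a single column before aggregating and then computes several statistics at once with agg. The data is made up.

```python
import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Carol', 'Dan'],
    'gender': ['F', 'M', 'F', 'M'],
    'age': [34, 28, 42, 52],
})

# Selecting the column first avoids aggregating non-numeric columns.
mean_age = df.groupby('gender')['age'].mean()

# agg accepts a list of aggregation names and returns one column per statistic.
summary = df.groupby('gender')['age'].agg(['mean', 'min', 'max'])
print(summary)
```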

Joining DataFrames

You can join two DataFrames using the merge function. For example, to join two
DataFrames on the 'user_id' column:

df1.merge(df2, on='user_id')
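
By default merge performs an inner join, keeping only keys present in both DataFrames. The sketch below, on made-up data, contrasts that with a left join selected through the how parameter.

```python
import pandas as pd

# Made-up sample data: user 1 has no country, user 4 has no name.
df1 = pd.DataFrame({'user_id': [1, 2, 3], 'name': ['Alice', 'Bob', 'Carol']})
df2 = pd.DataFrame({'user_id': [2, 3, 4], 'country': ['US', 'BR', 'IN']})

# Inner join (the default): only user_id values found in both frames.
inner = df1.merge(df2, on='user_id')

# Left join: every row of df1 is kept; missing matches become NaN.
left = df1.merge(df2, on='user_id', how='left')
```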

Pivot Tables

You can create a pivot table from a DataFrame using the pivot_table function.
For example, to create a pivot table with the 'gender' column as the rows, the
'country' column as the columns, and the 'age' column as the values:

df.pivot_table(index='gender', columns='country', values='age')
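
pivot_table aggregates with the mean unless told otherwise. A small sketch on made-up data, showing the default and a count through the aggfunc parameter:

```python
import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({
    'gender': ['F', 'F', 'M', 'M', 'F'],
    'country': ['US', 'BR', 'US', 'BR', 'US'],
    'age': [30, 40, 50, 20, 34],
})

# Default aggregation is the mean of 'age' for each gender/country pair.
table = df.pivot_table(index='gender', columns='country', values='age')

# aggfunc switches the aggregation, here to a count of rows per pair.
counts = df.pivot_table(index='gender', columns='country', values='age',
                        aggfunc='count')
print(table)
```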

Handling Missing Values

Pandas includes functions for handling missing values. To drop rows with
missing values:

df.dropna()

To fill missing values with a specific value, you can use the fillna function:

df.fillna(value=0)

You can also fill missing values in numeric columns with the mean of each
column by combining the fillna and mean functions (numeric_only=True is needed
in pandas 2.0 and later when the DataFrame also has non-numeric columns):

df.fillna(df.mean(numeric_only=True))
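
A short sketch of the missing-value workflow on made-up data: count the gaps with isna, then compare dropping rows with filling by the column mean. None of these calls modify df; each returns a new object.

```python
import pandas as pd

# Made-up sample data; None in a numeric column is stored as NaN.
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Carol'],
    'age': [30.0, None, 40.0],
})

# Number of missing values per column.
missing_per_column = df.isna().sum()

# Option 1: drop every row that contains a missing value.
dropped = df.dropna()

# Option 2: fill numeric gaps with the column mean.
filled = df.fillna(df.mean(numeric_only=True))
```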

Converting Data Types

You can convert the data type of a column using the astype function. For
example, to convert the 'age' column to a string:

df['age'] = df['age'].astype(str)

Applying Functions

You can apply a function to each element of a column using the apply function.
For example, to apply the len function to the 'name' column:

df['name'].apply(len)

You can also apply a custom function by defining it and passing it to the apply
function. For example:

def reverse_name(name):
    return name[::-1]

df['name'].apply(reverse_name)

Exporting Data

You can export a DataFrame to a CSV file using the to_csv function. For
example:

df.to_csv('output.csv')

You can also export to other file formats, such as Excel, by using the to_excel
function:

df.to_excel('output.xlsx', sheet_name='Sheet1')
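
By default to_csv also writes the row index as an extra first column. The sketch below passes index=False to leave it out and reads the text back to check the round trip; it writes to an in-memory buffer instead of 'output.csv' so that it runs without touching the disk.

```python
import io

import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({'name': ['Alice', 'Bob'], 'age': [34, 28]})

# Write to an in-memory text buffer; index=False omits the row index.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# Reading the text back should give the same columns and values.
round_trip = pd.read_csv(io.StringIO(csv_text))
```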

Summary Statistics

You can compute summary statistics for a DataFrame using the describe
function, which returns a new DataFrame with statistical information about the
columns:

df.describe()

You can also compute specific summary statistics by using the corresponding
function. For example, to compute the mean of the 'age' column:

df['age'].mean()

Other summary statistics functions include min, max, median, and mode.

Visualizing Data

You can use the plot function of a DataFrame to create various types of plots.
For example, to create a line plot:

df.plot()

You can specify the type of plot using the kind parameter. For example, to
create a bar plot:

df.plot(kind='bar')

You can also use the plot.bar function to create a bar plot:

df.plot.bar()

To customize the plot, you can use various parameters of the plot function. For
example, to specify the x and y axis data and the title:

df.plot(x='column_1', y='column_2', title='Title')

Indexing and Selection

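You can select rows by index label using the loc attribute, and columns using
the [] operator and column names.

For example, to select a single row by its index label:

df.loc[0]

To select a range of rows by label (unlike ordinary Python slices, both
endpoints are included, so this returns the rows labeled 0, 1, and 2):

df.loc[0:2]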

To select a single column:

df['column_name']

To select multiple columns:

df[['column_1', 'column_2']]

You can also use the iloc attribute to select rows and columns by integer
position. For example, to select the first row:

df.iloc[0]

To select a range of rows by position (the end position is excluded, so this
returns the first two rows):

df.iloc[0:2]

To select a single column:

df.iloc[:, 0]

To select multiple columns:

df.iloc[:, 0:2]
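
The difference between loc and iloc is easiest to see when the index labels are not 0, 1, 2, and so on. The sketch below assumes a made-up DataFrame with labels 10 to 40.

```python
import pandas as pd

# Made-up sample data with a non-default integer index.
df = pd.DataFrame(
    {'name': ['Alice', 'Bob', 'Carol', 'Dan'], 'age': [34, 28, 41, 52]},
    index=[10, 20, 30, 40],
)

# loc slices by label and includes both endpoints: labels 10, 20, 30.
by_label = df.loc[10:30]

# iloc slices by position and excludes the end: positions 0 and 1.
by_position = df.iloc[0:2]

# Both can address a single cell: (row label, column name) vs (row pos, col pos).
cell = df.loc[20, 'age']
same_cell = df.iloc[1, 1]
```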

Adding and Removing Columns

You can add a new column to a DataFrame by assigning a list or array to a new
column name. The list must have one value per row (here the DataFrame is
assumed to have three rows). For example:

df['new_column'] = [1, 2, 3]

You can also use an existing column to create a new one. For example:

df['new_column'] = df['column_1'] + df['column_2']

To remove a column, you can use the drop function with the axis parameter set
to 1:

df.drop('column_name', axis=1)
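
Note that drop returns a new DataFrame and leaves the original untouched. A minimal sketch on made-up data, also showing the columns keyword as an alternative spelling of axis=1:

```python
import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({'column_1': [1, 2], 'column_2': [3, 4], 'column_name': [5, 6]})

# The result is discarded here, so df still has all three columns.
df.drop('column_name', axis=1)

# Assign the result to keep it; columns=... is equivalent to axis=1.
trimmed = df.drop(columns='column_name')
```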

Adding and Removing Rows

You can add a new row to a DataFrame by wrapping it in a one-row DataFrame and
passing both to the concat function (the older DataFrame.append method was
removed in pandas 2.0):

pd.concat([df, pd.DataFrame([{'column_1': 1, 'column_2': 2}])], ignore_index=True)

To remove a row, you can use the drop function with the index parameter:

df.drop(index=0)

Renaming Columns

You can rename the columns of a DataFrame using the rename function and the
columns parameter. For example:

df.rename(columns={'old_name': 'new_name'})

You can also use the columns attribute to rename the columns in place:

df.columns = ['new_name_1', 'new_name_2']

Iterating Over a DataFrame

You can use a for loop to iterate over the rows of a DataFrame. For example:

for index, row in df.iterrows():
    print(row['column_1'], row['column_2'])

You can also use the apply function to apply a function to each row or column:

df.apply(lambda row: row['column_1'] + row['column_2'], axis=1)
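
For simple arithmetic, operating on whole columns usually gives the same result as a row-wise apply and is typically much faster on large DataFrames. A sketch on made-up data:

```python
import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({'column_1': [1, 2, 3], 'column_2': [10, 20, 30]})

# Row-wise: the lambda is called once per row.
row_wise = df.apply(lambda row: row['column_1'] + row['column_2'], axis=1)

# Vectorized: the addition is applied to both columns in one operation.
vectorized = df['column_1'] + df['column_2']
```

Row-wise apply and iterrows remain useful when the per-row logic cannot be expressed with column operations.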

Conditional Selection

To select rows based on a condition, you can use the loc attribute and a boolean
expression. For example, to select rows where the value in the 'age' column is
greater than 30:

df.loc[df['age'] > 30]

You can also use the where function with a condition. Unlike loc, where keeps
every row and replaces the values in rows that do not meet the condition with
NaN. For example:

df.where(df['age'] > 30)
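
The two approaches return differently shaped results, which this sketch on made-up data makes visible: loc drops the rows that fail the condition, while where keeps them and blanks their values.

```python
import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({'name': ['Alice', 'Bob', 'Carol'], 'age': [34, 28, 41]})

# loc with a boolean expression keeps only the matching rows.
selected = df.loc[df['age'] > 30]

# where keeps all rows; values in non-matching rows become NaN.
masked = df.where(df['age'] > 30)

# Dropping the NaN rows afterwards leaves the same rows that loc selected.
masked_then_dropped = masked.dropna()
```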

To select columns based on a condition, you can use the select_dtypes function
and pass the data type as an argument. For example, to select all columns with
numerical data:

df.select_dtypes(include='number')

You can also use the select_dtypes function to exclude columns with a specific
data type. For example, to exclude object columns:

df.select_dtypes(exclude=['object'])

Resetting the Index

You can reset the index of a DataFrame using the reset_index function. This will
create a new column with the old index as its values and set the index to a
default integer index starting from 0. For example:

df.reset_index()

You can also give the new index a name using the index.name attribute. Assign
the result of reset_index first, because a name set on the temporary result of
df.reset_index() is discarded:

df = df.reset_index()
df.index.name = 'new_index_name'
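
A sketch of common reset_index variations on made-up data. The drop parameter discards the old index instead of turning it into a column, and rename_axis is one way to choose the name of the column that receives the old index; the name 'user' is a hypothetical choice.

```python
import pandas as pd

# Made-up sample data with string index labels.
df = pd.DataFrame({'age': [34, 28, 41]}, index=['a', 'b', 'c'])

# The old index becomes a column named 'index' by default.
with_column = df.reset_index()

# drop=True discards the old index instead of keeping it as a column.
dropped = df.reset_index(drop=True)

# Naming the index first controls the name of the resulting column.
named = df.rename_axis('user').reset_index()
```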

Casting a Column to a Different Data Type

You can cast a column of a DataFrame to a different data type using the astype
function. For example, to cast the 'age' column to an integer:

df['age'] = df['age'].astype(int)

You can also specify the data type using a string. For example:

df['age'] = df['age'].astype('int')

Duplicate Rows

To identify duplicate rows in a DataFrame, you can use the duplicated function.
This will return a boolean Series indicating whether each row is a duplicate. For
example:

df.duplicated()

You can then use this Series to select the duplicate rows:

df[df.duplicated()]

To drop the duplicate rows, you can use the drop_duplicates function:

df.drop_duplicates()

You can also specify which columns to consider when determining whether a
row is a duplicate using the subset parameter:

df.drop_duplicates(subset=['column_1', 'column_2'])
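
Both duplicated and drop_duplicates treat the first occurrence as the original by default. The sketch below, on made-up data, shows that default and the keep parameter, which can retain the last occurrence instead.

```python
import pandas as pd

# Made-up sample data: rows 0, 1, and 3 are identical.
df = pd.DataFrame({
    'column_1': [1, 1, 2, 1],
    'column_2': ['a', 'a', 'b', 'a'],
})

# True marks a row that repeats an earlier row.
flags = df.duplicated()

# Default keep='first': the first copy of each row survives.
deduped = df.drop_duplicates()

# keep='last': the last copy of each row survives instead.
last_kept = df.drop_duplicates(keep='last')
```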

Concatenating DataFrames

You can concatenate multiple DataFrames using the concat function. For
example:

pd.concat([df1, df2, df3])

You can also specify the axis to concatenate along using the axis parameter. By
default, the concat function concatenates along the rows (axis=0). To
concatenate along the columns (axis=1), you can set the axis parameter to 1:

pd.concat([df1, df2, df3], axis=1)
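
A small sketch of both concat directions on made-up DataFrames. ignore_index=True renumbers the rows of the stacked result so that the original labels are not repeated.

```python
import pandas as pd

# Made-up sample data for illustration.
df1 = pd.DataFrame({'a': [1, 2]})
df2 = pd.DataFrame({'a': [3, 4]})
df3 = pd.DataFrame({'b': [5, 6]})

# axis=0 (default): stack rows; ignore_index=True renumbers them 0..3.
stacked = pd.concat([df1, df2], ignore_index=True)

# axis=1: place the frames side by side, aligned on the row index.
side_by_side = pd.concat([df1, df3], axis=1)
```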

Online Courses Available : https://decodingdatascience.com/courses/

Subscribe to the Community:

https://decodingdatascience.com/community/

