Python Cheat Sheet Code Academy

This document provides a cheat sheet for the Pandas library in Python. It summarizes key functions for importing and exporting data, selecting and filtering data, cleaning and transforming data, joining/combining data, and descriptive statistics. Some important functions covered include reading/writing CSV/Excel files, selecting columns/rows, dropping null values, grouping/pivoting data, concatenating DataFrames, and calculating means, medians, and standard deviations. The cheat sheet is intended to be a handy reference for common Pandas tasks.

LEARN DATA SCIENCE ONLINE

Start Learning For Free - www.dataquest.io

Data Science Cheat Sheet


Pandas

KEY
We'll use this shorthand in the cheat sheet:
df - A pandas DataFrame object
s - A pandas Series object

IMPORTS
Import these to start:
import pandas as pd
import numpy as np
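
As a running example for the sections below, here is a small, made-up DataFrame; the names df, s and the columns col1/col2/col3 are placeholders used only for illustration, not part of the cheat sheet itself.

import pandas as pd
import numpy as np

# Hypothetical sample data used by the examples that follow
df = pd.DataFrame({
    'col1': ['a', 'a', 'b', 'b'],
    'col2': [1.0, 2.0, np.nan, 4.0],
    'col3': [10, 20, 30, 40],
})
s = df['col2']  # a pandas Series with one missing value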

IMPORTING DATA
pd.read_csv(filename) - From a CSV file
pd.read_table(filename) - From a delimited text file (like TSV)
pd.read_excel(filename) - From an Excel file
pd.read_sql(query, connection_object) - Reads from a SQL table/database
pd.read_json(json_string) - Reads from a JSON-formatted string, URL or file
pd.read_html(url) - Parses an HTML URL, string or file and extracts tables to a list of DataFrames
pd.read_clipboard() - Takes the contents of your clipboard and passes it to read_table()
pd.DataFrame(dict) - From a dict: keys for column names, values for data as lists
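
A minimal read sketch; the file name and URL here are hypothetical, used only to show the calls.

# Read a CSV file into a DataFrame (file name is made up)
sales = pd.read_csv('sales.csv')

# read_html returns a *list* of DataFrames, one per <table> on the page
# tables = pd.read_html('https://example.com/tables.html')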

EXPORTING DATA
df.to_csv(filename) - Writes to a CSV file
df.to_excel(filename) - Writes to an Excel file
df.to_sql(table_name, connection_object) - Writes to a SQL table
df.to_json(filename) - Writes to a file in JSON format
df.to_html(filename) - Saves as an HTML table
df.to_clipboard() - Writes to the clipboard

CREATE TEST OBJECTS
Useful for testing
pd.DataFrame(np.random.rand(20,5)) - 5 columns and 20 rows of random floats
pd.Series(my_list) - Creates a Series from an iterable my_list
df.index = pd.date_range('1900/1/30', periods=df.shape[0]) - Adds a date index
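
A quick throwaway object built from these entries (the start date is arbitrary):

# 20 rows x 5 columns of random floats, indexed by consecutive dates
test_df = pd.DataFrame(np.random.rand(20, 5))
test_df.index = pd.date_range('1900/1/30', periods=test_df.shape[0])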

VIEWING/INSPECTING DATA
df.head(n) - First n rows of the DataFrame
df.tail(n) - Last n rows of the DataFrame
df.shape - Number of rows and columns (shape is an attribute, not a method)
df.info() - Index, datatype and memory information
df.describe() - Summary statistics for numerical columns
s.value_counts(dropna=False) - Views unique values and counts
df.apply(pd.Series.value_counts) - Unique values and counts for all columns
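
A quick inspection pass over the sample df from the setup sketch:

print(df.head(2))    # first two rows
print(df.shape)      # (rows, columns), here (4, 3)
df.info()            # index, dtypes, non-null counts, memory usage
print(s.value_counts(dropna=False))  # unique values and counts, including NaN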

SELECTION
df[col] - Returns column with label col as a Series
df[[col1, col2]] - Returns columns as a new DataFrame
s.iloc[0] - Selection by position
s.loc[0] - Selection by index label
df.iloc[0,:] - First row
df.iloc[0,0] - First element of first column
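
For example, against the sample df (column names are placeholders):

col2_series = df['col2']        # one column as a Series
subset = df[['col1', 'col3']]   # two columns as a new DataFrame
first_row = df.iloc[0, :]       # first row by position
first_cell = df.iloc[0, 0]      # first element of the first column
by_label = df.loc[0, 'col1']    # row with index label 0, column 'col1'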

DATA CLEANING
df.columns = ['a','b','c'] - Renames columns
pd.isnull() - Checks for null values, returns a Boolean array
pd.notnull() - Opposite of pd.isnull()
df.dropna() - Drops all rows that contain null values
df.dropna(axis=1) - Drops all columns that contain null values
df.dropna(axis=1,thresh=n) - Drops all columns that have fewer than n non-null values
df.fillna(x) - Replaces all null values with x
s.fillna(s.mean()) - Replaces all null values with the mean (mean can be replaced with almost any function from the statistics section)
s.astype(float) - Converts the datatype of the Series to float
s.replace(1,'one') - Replaces all values equal to 1 with 'one'
s.replace([1,3],['one','three']) - Replaces all 1 with 'one' and 3 with 'three'
df.rename(columns=lambda x: x + 1) - Mass renaming of columns
df.rename(columns={'old_name': 'new_name'}) - Selective renaming
df.set_index('column_one') - Changes the index
df.rename(index=lambda x: x + 1) - Mass renaming of index
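
Continuing with the sample df, which has one missing value in col2:

no_missing_rows = df.dropna()                   # drop rows containing any null
filled = df.fillna(0)                           # replace nulls with a constant
col2_filled = s.fillna(s.mean())                # replace nulls with the column mean
renamed = df.rename(columns={'col1': 'group'})  # selective column rename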

FILTER, SORT, & GROUP BY
df[df[col] > 0.5] - Rows where the values in col are greater than 0.5
df[(df[col] > 0.5) & (df[col] < 0.7)] - Rows where 0.7 > col > 0.5
df.sort_values(col1) - Sorts values by col1 in ascending order
df.sort_values(col2,ascending=False) - Sorts values by col2 in descending order
df.sort_values([col1,col2], ascending=[True,False]) - Sorts values by col1 in ascending order, then col2 in descending order
df.groupby(col) - Returns a groupby object for values from one column
df.groupby([col1,col2]) - Returns a groupby object for values from multiple columns
df.groupby(col1)[col2].mean() - Returns the mean of the values in col2, grouped by the values in col1 (mean can be replaced with almost any function from the statistics section)
df.pivot_table(index=col1, values=[col2,col3], aggfunc='mean') - Creates a pivot table that groups by col1 and calculates the mean of col2 and col3
df.groupby(col1).agg(np.mean) - Finds the average across all columns for every unique col1 group
df.apply(np.mean) - Applies a function across each column
df.apply(np.max, axis=1) - Applies a function across each row
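
Putting a few of these together on the sample df:

large = df[df['col2'] > 1.5]                      # boolean filtering on one column
ordered = df.sort_values(['col1', 'col2'], ascending=[True, False])
group_means = df.groupby('col1')['col2'].mean()   # mean of col2 per col1 group
pivot = df.pivot_table(index='col1', values=['col2', 'col3'], aggfunc='mean')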

JOIN/COMBINE
df1.append(df2) - Adds the rows of df2 to the end of df1 (columns should be identical); append has been removed in recent pandas versions, so prefer pd.concat([df1, df2])
pd.concat([df1, df2],axis=1) - Adds the columns of df2 to the end of df1 (rows should be identical)
df1.join(df2,on=col1,how='inner') - SQL-style join of the columns in df1 with the columns of df2 where the rows for col1 have identical values; how can be one of 'left', 'right', 'outer', 'inner'
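
A small combine sketch with two made-up DataFrames (df1, df2 and the key column are hypothetical):

df1 = pd.DataFrame({'key': ['a', 'b'], 'x': [1, 2]})
df2 = pd.DataFrame({'key': ['a', 'b'], 'y': [3, 4]})

stacked = pd.concat([df1, df2])               # rows of df2 below the rows of df1
side_by_side = pd.concat([df1, df2], axis=1)  # columns of df2 after the columns of df1
joined = df1.join(df2.set_index('key'), on='key', how='inner')  # SQL-style join on 'key'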

STATISTICS
These can all be applied to a Series as well.
df.describe() - Summary statistics for numerical columns
df.mean() - Returns the mean of all columns
df.corr() - Returns the correlation between columns in a DataFrame
df.count() - Returns the number of non-null values in each DataFrame column
df.max() - Returns the highest value in each column
df.min() - Returns the lowest value in each column
df.median() - Returns the median of each column
df.std() - Returns the standard deviation of each column
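
For example, on the numeric columns of the sample df (selecting the numeric columns first avoids errors on the text column in newer pandas):

numeric = df[['col2', 'col3']]   # restrict to numeric columns
print(numeric.describe())        # count, mean, std, min, quartiles, max
print(numeric.mean())            # mean of each column
print(numeric.corr())            # pairwise correlations between columns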

