Pandas - Cheatsheet
Import the Pandas Module
import pandas as pd

Create a DataFrame
# Method 2
df2 = pd.DataFrame([
    ['John Smith', '123 Main St.', 34],
    ['Jane Doe', '456 Maple Ave.', 28],
    ['Joe Schmo', '9 Broadway', 51]
    ],
    columns=['name', 'address', 'age'])

… 100],
    ['February', 51, 45, 145, 45],
    ['March', 81, 96, 65, 96],
    ['April', 80, 80, 54, 180],
    ['May', 51, 54, 54, 154],
    ['June', 112, 109, 79, 129]],
    columns=['month', 'east', 'north', 'south', 'west']
)

Select Columns
# Select one Column
clinic_north = df.north
--> Reshape values for scikit-learn: clinic_north.values.reshape(-1, 1)
# Select multiple Columns
clinic_north_south = df[['n…
Make sure that you have a double set of brackets [[ ]], or this command won't work!

import numpy as np
nums = np.array(range(1, 11))
-> [ 1 2 3 4 5 6 7 8 9 10]
nums = nums.reshape(-1, 1)
-> [[1],
    [2],
    [3],
    [4],
    [5],
    [6],
    [7],
    [8],
    [9],
    [10]]
You can think of reshape() as rotating this array. Rather than one big row of numbers, nums is now a big column of numbers - there's one number in each row.

Converting Datatypes
# Convert argument to numeric type
pandas.to_numeric(arg, errors="raise")

Loading and Saving CSVs
# Load a CSV File into a DataFrame
df = pd.read_csv('my-csv-file.csv')
# Saving DataFrame to a CSV File
# Load DataFrame in Chunks (For large Datasets)
# Initialize reader object: urb_pop_reader
urb_pop_reader = pd.read_csv('ind_pop_data.csv', chunksize=1000)
# Get the first DataFrame chunk: df_urb_pop
df_urb_pop = next(urb_pop_reader)
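
A minimal sketch of chunked loading, assuming a small in-memory CSV in place of 'ind_pop_data.csv'. The column names and values are made up for illustration. With chunksize set, read_csv returns an iterator of DataFrames instead of one DataFrame, so you can take the first chunk with next() and loop over the rest.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for 'ind_pop_data.csv'
csv_text = (
    "country,year,population\n"
    "A,2000,10\n"
    "B,2000,20\n"
    "C,2001,30\n"
    "D,2001,40\n"
    "E,2002,50\n"
)

# chunksize=2 yields DataFrames of at most 2 rows each
reader = pd.read_csv(io.StringIO(csv_text), chunksize=2)

# Pull the first chunk explicitly
first_chunk = next(reader)
total = first_chunk["population"].sum()

# Iterate over the remaining chunks and keep a running total
for chunk in reader:
    total += chunk["population"].sum()

print(len(first_chunk), total)
```

Only one chunk is held in memory at a time, which is the point of chunking for large datasets.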
    left_on="product_id",
    right_on="id",
    suffixes=["_orders", "_products"])

Method 2:
If we use this syntax, we'll end up with two columns called id.

Assert Statements
# Test if country is of type object
assert gapminder.country.dtypes == np.object_
# Test if year is of type int64
assert gapminder.year.dtypes == np.int64
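
A small sketch of the dtype checks above, assuming a made-up two-row stand-in for the gapminder DataFrame (the real data is not shown here). It converts a text column with pd.to_numeric and then asserts the resulting dtypes. Note that the np.object alias was removed in NumPy 1.24, so the sketch uses np.object_.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the gapminder DataFrame (values are made up).
# dtype=object is set explicitly so the example does not depend on
# the pandas default dtype for text columns.
gapminder = pd.DataFrame({
    "country": pd.Series(["Chile", "Kenya"], dtype=object),
    "year": pd.Series(["1990", "2000"], dtype=object),
})

# Convert the text years to integers; errors="raise" fails loudly on bad values
gapminder["year"] = pd.to_numeric(gapminder["year"], errors="raise")

# Test if country is of type object
assert gapminder["country"].dtype == np.object_
# Test if year is of type int64
assert gapminder["year"].dtype == np.int64
# A more tolerant check that accepts any integer width
assert pd.api.types.is_integer_dtype(gapminder["year"])
```

If an assertion fails, Python raises AssertionError, which makes these checks handy as quick guards after a cleaning step.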