0% found this document useful (0 votes)
5 views

Pandas Notes

This document provides notes on data handling and analysis using Pandas, covering key operations such as reading, cleaning, filtering, analyzing, grouping, merging, and exporting data. It includes code examples for each operation, demonstrating how to manipulate DataFrames effectively. The notes serve as a quick reference guide for performing common data tasks in Python with Pandas.

Uploaded by

mehul garje
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
5 views

Pandas Notes

This document provides notes on data handling and analysis using Pandas, covering key operations such as reading, cleaning, filtering, analyzing, grouping, merging, and exporting data. It includes code examples for each operation, demonstrating how to manipulate DataFrames effectively. The notes serve as a quick reference guide for performing common data tasks in Python with Pandas.

Uploaded by

mehul garje
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 2

Pandas Notes - Data Handling & Analysis

Reading Data

- Use `pd.read_csv('file.csv')` to read CSV files.


- Use `pd.read_excel('file.xlsx')` to read Excel files.
- Use `df.head()` to view the first 5 rows.
- Use `df.tail()` to view the last 5 rows.
- Use `df.info()` to see data types and non-null counts.
Example:
df = pd.read_csv('data.csv')

Cleaning Data

- Use `df.isnull().sum()` to check missing values.


- Fill missing data: `df.fillna(value)`.
- Drop missing rows: `df.dropna()`.
- Rename columns: `df.rename(columns={'old':'new'})`.
Example:
df['age'].fillna(df['age'].mean(), inplace=True)

Filtering Data

- Single condition: `df[df['age'] > 30]`


- Multiple conditions: `df[(df['age'] > 25) & (df['marks'] > 80)]`
- Equality: `df[df['name'] == 'Bob']`
- `isin()`: `df[df['name'].isin(['Alice', 'David'])]`
- String match: `df[df['name'].str.startswith('A')]`
Example:
df.query('age > 30 and marks < 90')

Analyzing Data

- `df.describe()` gives statistical summary.


- Column stats: `mean()`, `max()`, `min()`, `mode()`.
- `value_counts()` for frequency count.
- `df.groupby('col')['val'].mean()` for grouped mean.
- `df.corr()` for correlation.
Example:
df.groupby('department')['marks'].agg(['min', 'max', 'mean'])

Grouping Data

- Use `groupby()` to group and aggregate.


- Average marks: `df.groupby('department')['marks'].mean()`
- Multiple stats: `agg(['min', 'max'])`
- Group by multiple: `df.groupby(['dept', 'name'])`
Pandas Notes - Data Handling & Analysis
- Reset index: `reset_index()` to flatten result.
Example:
df.groupby('department')['marks'].sum().reset_index()

Merging Data

- `pd.merge(df1, df2, on='id')` for inner join.


- `how='left'`, `'right'`, `'outer'` for other joins.
- Merge on multiple keys: `on=['id', 'name']`.
Example:
pd.merge(students, marks, on='id', how='left')

Exporting Data

- To CSV: `df.to_csv('file.csv', index=False)`


- To Excel: `df.to_excel('file.xlsx', index=False)`
- To JSON: `df.to_json('file.json')`
Example:
df.to_csv('cleaned_data.csv', index=False)

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy