Pandas_Tutorial

The document provides an introduction to Pandas, a data manipulation library in Python, detailing its primary data structures: Series and DataFrame. It covers basic operations, data manipulation, handling missing data, data aggregation, merging DataFrames, and advanced operations like pivot tables and applying functions. The conclusion emphasizes the importance of practice and using documentation for mastering Pandas.

Introduction to Pandas

1. Introduction to Pandas

Pandas is built on top of NumPy and provides two primary data structures: Series and DataFrame.

Series

A Series is a one-dimensional labeled array capable of holding any data type.

import pandas as pd

# Creating a Series

s = pd.Series([1, 3, 5, 7, 9])

print(s)
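The labels are what set a Series apart from a plain array. As a minimal sketch (the index labels 'a' through 'e' are made up for illustration and are not part of the original example), a custom index can be passed at construction and then used for lookups:

```python
import pandas as pd

# Same values as above, but with explicit labels (hypothetical, for illustration)
s = pd.Series([1, 3, 5, 7, 9], index=['a', 'b', 'c', 'd', 'e'])

# Look up a single value by its label
third = s['c']

# Label-based slices include both endpoints
middle = s.loc['b':'d']

print(third)
print(middle)
```

Without an explicit index, pandas falls back to integer labels starting at 0, which is what the first example shows.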

DataFrame

A DataFrame is a two-dimensional labeled data structure with columns of potentially different types.

# Creating a DataFrame

data = {
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 24, 35, 32],
    'City': ['New York', 'Paris', 'Berlin', 'London']
}

df = pd.DataFrame(data)

print(df)

2. Basic Operations on DataFrames

Viewing Data

- head(): View the first few rows of the DataFrame.

- tail(): View the last few rows of the DataFrame.

- info(): Get a summary of the DataFrame.

- describe(): Get descriptive statistics.

print(df.head())

print(df.tail())

df.info()  # info() prints its summary itself and returns None, so no print() is needed

print(df.describe())
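A slightly fuller sketch of the viewing methods, rebuilding the same DataFrame so that it runs on its own. The `shape` and `dtypes` attributes are additions not mentioned in the list above, but they are often useful alongside it:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 24, 35, 32],
    'City': ['New York', 'Paris', 'Berlin', 'London'],
})

# head() and tail() return 5 rows by default; pass n to change that
first_two = df.head(2)
last_one = df.tail(1)

# shape is a (rows, columns) tuple
n_rows, n_cols = df.shape

# describe() summarizes numeric columns by default, so only 'Age' appears here
stats = df.describe()

print(first_two)
print(last_one)
print(n_rows, n_cols)
print(df.dtypes)  # data type of each column
print(stats)
```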

Selecting Data

- Using column names.

- Using row positions with iloc and row labels with loc.

# Select a column

print(df['Name'])

# Select multiple columns

print(df[['Name', 'City']])

# Select rows by integer position (the end is excluded: rows 1 and 2)

print(df.iloc[1:3])

# Select rows and columns by labels (the end label is included: rows 0, 1, and 2)

print(df.loc[0:2, ['Name', 'City']])
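With the default index, positions and labels happen to coincide, which can hide the difference between iloc and loc. The sketch below uses a made-up index (10, 20, 30, 40) purely to separate the two:

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['John', 'Anna', 'Peter', 'Linda'],
     'City': ['New York', 'Paris', 'Berlin', 'London']},
    index=[10, 20, 30, 40],  # hypothetical labels, chosen to differ from positions
)

# iloc counts positions: rows at positions 1 and 2 (end excluded)
by_position = df.iloc[1:3]

# loc uses labels: rows labeled 20 through 30 (end included)
by_label = df.loc[20:30, ['Name']]

print(by_position)
print(by_label)
```

Both selections return the rows for Anna and Peter, but they get there by different rules.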

3. Data Manipulation

Adding and Dropping Columns

- Adding new columns.

- Dropping columns.

# Adding a new column

df['Country'] = ['USA', 'France', 'Germany', 'UK']

print(df)

# Dropping a column

df = df.drop(columns=['Country'])

print(df)
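A new column can also be derived from an existing one. The following is a small sketch, with the column name `AgeInMonths` invented for illustration; it also shows that `assign()` and `drop()` return new DataFrames rather than changing the original:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Anna'], 'Age': [28, 24]})

# assign() returns a new DataFrame with the extra column; df itself is unchanged
with_months = df.assign(AgeInMonths=df['Age'] * 12)

# drop() also returns a new DataFrame; with_months keeps its 'Age' column
without_age = with_months.drop(columns=['Age'])

print(with_months)
print(without_age)
print(list(df.columns))
```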

Filtering Data

- Using conditions to filter rows.

# Filtering rows where Age > 30

filtered_df = df[df['Age'] > 30]

print(filtered_df)
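Conditions can be combined. As a minimal sketch on the same data, `&` and `|` join boolean conditions (each wrapped in parentheses), and `isin()` tests membership in a list:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 24, 35, 32],
    'City': ['New York', 'Paris', 'Berlin', 'London'],
})

# Combine conditions with & (and) or | (or); each condition needs parentheses
over_30_not_berlin = df[(df['Age'] > 30) & (df['City'] != 'Berlin')]

# isin() keeps rows whose value appears in the given list
in_paris_or_london = df[df['City'].isin(['Paris', 'London'])]

print(over_30_not_berlin)
print(in_paris_or_london)
```

Python's `and` and `or` keywords do not work here, because they cannot operate element by element on a Series.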

4. Handling Missing Data

- Checking for missing data.

- Filling missing data.

- Dropping missing data.

# Creating a DataFrame with missing values

data = {
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, None, 35, 32],
    'City': ['New York', 'Paris', None, 'London']
}

df = pd.DataFrame(data)

# Checking for missing data

print(df.isnull())

# Filling missing data

df_filled = df.fillna({'Age': df['Age'].mean(), 'City': 'Unknown'})

print(df_filled)

# Dropping missing data

df_dropped = df.dropna()

print(df_dropped)
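Two variations that are often handy, sketched on the same data: counting missing values per column, and dropping rows only when a particular column is missing. The choice of 'Age' as the required column is an assumption for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, None, 35, 32],
    'City': ['New York', 'Paris', None, 'London'],
})

# Count missing values in each column
missing_counts = df.isnull().sum()

# Drop only the rows where 'Age' is missing; a missing 'City' is tolerated
has_age = df.dropna(subset=['Age'])

print(missing_counts)
print(has_age)
```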

5. Data Aggregation and Grouping

- Using groupby to group data and perform aggregation.

data = {
    'Category': ['A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40]
}

df = pd.DataFrame(data)

# Grouping by 'Category' and calculating the sum of 'Value'

grouped_df = df.groupby('Category').sum()

print(grouped_df)
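More than one aggregate can be computed per group. A minimal sketch using `agg()` on the same data:

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'B', 'A', 'B'], 'Value': [10, 20, 30, 40]})

# Compute several aggregates of 'Value' for each category in one call
summary = df.groupby('Category')['Value'].agg(['sum', 'mean', 'count'])

print(summary)
```

Selecting `['Value']` before aggregating keeps the result limited to that column, which matters once the DataFrame has columns that should not be summed.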

6. Merging and Joining DataFrames

- Concatenation.

- Merging based on keys.

# Concatenation

df1 = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']})

df2 = pd.DataFrame({'A': ['A2', 'A3'], 'B': ['B2', 'B3']})

result = pd.concat([df1, df2])

print(result)
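Note that plain concatenation keeps each input's original row labels, so the result above has the labels 0, 1, 0, 1. A short sketch of the `ignore_index` option, which renumbers the rows instead:

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']})
df2 = pd.DataFrame({'A': ['A2', 'A3'], 'B': ['B2', 'B3']})

# Default: row labels are carried over from each input
stacked = pd.concat([df1, df2])

# ignore_index=True discards the old labels and numbers the rows 0..n-1
renumbered = pd.concat([df1, df2], ignore_index=True)

print(list(stacked.index))
print(list(renumbered.index))
```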

# Merging

left = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'A': ['A0', 'A1', 'A2']})

right = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'B': ['B0', 'B1', 'B2']})

result = pd.merge(left, right, on='key')

print(result)
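In the example above every key appears in both tables, so the join type does not matter. The sketch below changes the right-hand keys (a hypothetical variation, not from the original) to show how the default inner join differs from `how='outer'`:

```python
import pandas as pd

left = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'A': ['A0', 'A1', 'A2']})
# Hypothetical right table: 'K0' is absent and 'K3' is new
right = pd.DataFrame({'key': ['K1', 'K2', 'K3'], 'B': ['B1', 'B2', 'B3']})

# Inner join (the default): only keys present in both tables
inner = pd.merge(left, right, on='key')

# Outer join: all keys from either table, with NaN where a side has no match
outer = pd.merge(left, right, on='key', how='outer')

print(inner)
print(outer)
```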

7. Advanced Data Operations

Pivot Tables

- Creating pivot tables to summarize data.

data = {
    'Date': ['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04'],
    'City': ['New York', 'Paris', 'Berlin', 'London'],
    'Sales': [200, 150, 300, 250]
}

df = pd.DataFrame(data)

pivot_table = df.pivot_table(values='Sales', index='City', columns='Date')

print(pivot_table)
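Because each city has a single date in the data above, that pivot table is mostly NaN. A pivot table is more informative when several rows share the same index and column values. The sketch below uses made-up data with a `Quarter` column (an assumption for illustration) and sums the sales in each cell:

```python
import pandas as pd

# Hypothetical data in which some (City, Quarter) pairs repeat
df = pd.DataFrame({
    'City': ['Paris', 'Paris', 'Berlin', 'Berlin', 'Paris'],
    'Quarter': ['Q1', 'Q2', 'Q1', 'Q1', 'Q1'],
    'Sales': [100, 150, 200, 50, 20],
})

# aggfunc='sum' adds up repeated pairs (the default is the mean);
# fill_value=0 replaces cells that have no data
totals = df.pivot_table(values='Sales', index='City', columns='Quarter',
                        aggfunc='sum', fill_value=0)

print(totals)
```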

Applying Functions

- Using apply to apply functions to data.

# Applying a lambda function to a column

df['Sales'] = df['Sales'].apply(lambda x: x * 1.1)

print(df)
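For simple arithmetic like this, operating on the whole column usually gives the same result as apply and tends to be faster. The sketch below, on a small made-up DataFrame, compares the two and also shows `apply` with `axis=1`, which passes each row to the function:

```python
import pandas as pd

df = pd.DataFrame({'City': ['Paris', 'Berlin'], 'Sales': [150, 300]})

# Element-wise apply on a single column
via_apply = df['Sales'].apply(lambda x: x * 2)

# Vectorized arithmetic on the whole column gives the same values
vectorized = df['Sales'] * 2

# axis=1 calls the function once per row, so several columns can be combined
labels = df.apply(lambda row: f"{row['City']}: {row['Sales']}", axis=1)

print(via_apply.equals(vectorized))
print(labels)
```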

Conclusion

This is a brief overview of some of the basic and intermediate functionalities of pandas. As you work more with pandas, you'll discover many more powerful features and methods that can help you manipulate and analyze data efficiently. Practice is key, so try to work on different datasets and use the pandas documentation for further reference.
