What Can You Do with DataFrames Using Pandas? Pandas Is a High-Level Data Manipulation Tool Developed by Wes McKinney
DataFrames allow you to store and manipulate tabular data in rows of observations and
columns of variables.
Pandas is an open-source Python package that is widely used for data science, data analysis, and machine learning tasks.
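A minimal sketch of that idea (the column names and values here are illustrative, not part of the examples that follow):
import pandas as pd
# each row is one observation, each column one variable
pd.DataFrame({'name': ['ASHA', 'RAVI'], 'age': [30, 25]})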
Series: a Series is similar to a NumPy array, except that we can give it a named or datetime index instead of the default numerical index (a datetime-indexed sketch follows the basic examples below).
import numpy as np
import pandas as pd
labels = ['a', 'b', 'c']
lst = [10, 20, 30]
arr = np.array([10, 20, 30])
d = {'a': 10, 'b': 20, 'c': 30}
pd.Series(lst)            # default numerical index
pd.Series(lst, labels)    # named index
pd.Series(arr, labels)    # works the same with a NumPy array
pd.Series(d)              # dict keys become the index
pd.Series([sum, print, len])   # a Series can even hold Python functions
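As noted above, the index can also hold datetimes rather than labels; a minimal sketch (the dates are illustrative):
pd.Series([10, 20, 30], index=pd.to_datetime(['2017-01-01', '2017-01-02', '2017-01-03']))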
ser1 = pd.Series([1, 2, 3, 4], ['USA', 'CHINA', 'FRANCE', 'GERMANY'])
ser2 = pd.Series([1, 2, 3, 4], ['USA', 'CHINA', 'INDIA', 'SINGAPORE'])
ser1
ser2
ser1['USA']
ser1 + ser2   # addition aligns on index labels; labels present in only one Series give NaN
DataFrames are built directly on top of Series and are widely used for working with financial data.
import numpy as np
import pandas as pd
from numpy.random import randn
np.random.seed(101)
df = pd.DataFrame(randn(5, 4), ['A', 'B', 'C', 'D', 'E'], ['W', 'X', 'Y', 'Z'])
df['W']
type(df['W'])
type(df)
df.W   # attribute-style column access; equivalent to df['W'], but bracket notation is safer
df[['W','X']]
df['new'] =df['Y']+df['Z']
df.drop('new')           # KeyError: drop defaults to the row axis (axis=0)
df.drop('new', axis=1)   # drops the column, but not in place
df
df.drop('new', axis=1, inplace=True)
df.drop('E')   # not in place, so df keeps row E (needed for the STATE column below)
df.loc['A']
df.iloc[2]
df.loc[['A','B']]
df.loc[['A','B'],['W','Y']]
df.iloc[2:,:]
df.iloc[2:,2:]
df.iloc[2:,:2]
df.iloc[:2,:2]
df.iloc[1:3, 1:3]
df.iloc[-2:,-2:]
df.iloc[0:2,0:2]
df > 0
booldf = df > 0   # boolean DataFrame
df[booldf]        # NaN wherever the condition is False
df[df > 0]        # same thing in one step
df['W']>0
df[df['W']>0]
resultdf =df[df['W']<0]
resultdf
resultdf[['X','Z']]
df[df['W']<0][['X','Z']]
df.reset_index()
lst=['TN','AP','KA','MH','TS']
df['STATE']=lst
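Presumably the new STATE column is meant to become the index; a minimal sketch:
df.set_index('STATE')   # returns a re-indexed copy; pass inplace=True to modify df itself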
outside =['G1','G1','G1','G2','G2','G2']
inside =[1,2,3,1,2,3]
hier_index=list(zip(outside,inside))
hier_index=pd.MultiIndex.from_tuples(hier_index)
df =pd.DataFrame(randn(6,2),hier_index,['A','B'])
df.loc['G1']
df.loc['G1']['A']
df.index.names
df.index.names=['Groups','Num']
df.loc['G2'].loc[2]['B']
Cross Section
df.xs('G1')
df.xs(1,level='Num')
df.xs(('G1',2))
Missing data
d ={'A':[1,2,np.nan],'B':[5,np.nan,np.nan],'C':[1,2,3]}
df = pd.DataFrame(d)
df.dropna()           # drop rows containing any NaN
df.dropna(axis=1)     # drop columns containing any NaN
df.dropna(thresh=2)   # keep rows with at least 2 non-NaN values
Fill values
df.fillna(value=0)
df['A'].fillna(df['A'].mean())
df['A'].fillna(df['A'].mean(),inplace=True)
Grouping
d ={'Company':['GOOG','GOOG','MSFT','MSFT','FB','FB'],
'Person':['RAM','SHAM','SUNIL','SUDEEP','RAHEEM','SHEETAL'],
'Sales':[250,400,200,150,350,100]}
df =pd.DataFrame(d)
bycomp = df.groupby('Company')   # group rows by the Company column
bycomp.mean()
bycomp.max()
bycomp.std()
bycomp.min()
bycomp.sum()
bycomp.sum().loc['FB']
bycomp.describe()
bycomp.describe().transpose()
df.groupby('Company').describe().transpose()['FB']
Merging, joining, and concatenation
df1 =pd.DataFrame({'A':['A0','A1','A2','A3'],
'B':['B0','B1','B2','B3'],
'C':['C0','C1','C2','C3']},
index =[0,1,2,3])
df2=pd.DataFrame({'A':['A4','A5','A6','A7'],
'B':['B4','B5','B6','B7'],
'C':['C4','C5','C6','C7']},
index =[4,5,6,7])
df3=pd.DataFrame({'A':['A8','A9','A10','A11'],
'B':['B8','B9','B10','B11'],
'C':['C8','C9','C10','C11']},
index =[8,9,10,11])
Concatenate
pd.concat([df1, df2, df3])           # stack the frames vertically
pd.concat([df1, df2, df3], axis=1)   # align them side by side on the index
left =pd.DataFrame({'key':['K0','K1','K2','K3'],
'A':['A0','A1','A2','A3'],
'B':['B0','B1','B2','B3']})
right =pd.DataFrame({'key':['K0','K1','K2','K3'],
'C':['C0','C1','C2','C3'],
'D':['D0','D1','D2','D3']})
pd.merge(left,right,how='inner',on='key')
emp = pd.DataFrame({'EMPNO': ['E001', 'E002', 'E003', 'E004'],
                    'ENAME': ['BABJEE', 'RAM', 'SUNIL', 'SHAM'],
                    'DEPTNO': [10, 10, 20, 30]})
dept = pd.DataFrame({'DNAME': ['Accounts', 'Admin', 'IT'], 'DEPTNO': [10, 20, 50]})
pd.merge(emp, dept, how='inner', on='DEPTNO')   # only DEPTNOs present in both survive
emp = pd.DataFrame({'EMPNO': ['E001', 'E002', 'E003', 'E004'],
                    'ENAME': ['BABJEE', 'RAM', 'SUNIL', 'SHAM']},
                   index=[10, 10, 20, 30])
dept = pd.DataFrame({'DNAME': ['Accounts', 'Admin', 'IT'],
                     'LOCATION': ['CHENNAI', 'MUMBAI', 'PUNE']},
                    index=[10, 20, 50])
emp.join(dept,how='inner')
emp.join(dept,how='outer')
df =pd.DataFrame({'Col1':[1,2,3,4],
'Col2':[444,555,666,444],
'Col3':['abc','def','ghi','xyz']})
df.head(2)
df.tail(2)
df['Col2'].unique()
len(df['Col2'].unique())
df['Col2'].nunique()
df['Col2'].value_counts()
df[df['Col1']>2]
df['Col1'].sum()
Custom functions
def times2(x):
    return x * 2

df['Col1'].apply(times2)   # apply a custom function element-wise
df['Col3'].apply(len)
df['Col2'].apply(lambda x: x *x)
df.drop('Col1',axis=1)
df.columns
df.index
df.sort_values(by='Col2',ascending=False)
df.isnull()
pwd   # IPython magic: print the current working directory
pd.read_csv('d:/demo/example.csv')
pd.read_excel('d:/demo/example.xlsx')
df.to_csv('d:/demo/myoutput.csv', index=False)
pd.read_excel('d:/demo/example.xlsx', sheet_name='Sheet1')
df.to_excel('d:/demo/example1.xlsx', sheet_name='Sheet2', index=False)
import pandas as pd
from sqlalchemy import create_engine

# connection string: dialect+driver://user:password@host:port/database
cnx = create_engine('mysql+pymysql://root:admin123@localhost:3306/demo').connect()
sql = 'SELECT * FROM mytable'   # hypothetical query string; substitute your own SQL
df = pd.read_sql(sql, cnx)
The pandas-datareader is a subpackage that allows one to create a DataFrame from various internet data sources, currently including:
Yahoo! Finance
Google Finance
St. Louis Fed (FRED)
Kenneth French's data library
World Bank
Google Analytics
import datetime as dt
import pandas_datareader.data as web

start = dt.datetime(2015, 1, 1)
end = dt.datetime(2015, 12, 31)
facebook = web.DataReader('FB', 'yahoo', start, end)
Datetime index
import pandas as pd
import numpy as np
from datetime import datetime

first_two = [datetime(2017, 1, 1), datetime(2017, 1, 2)]
dt_ind = pd.DatetimeIndex(first_two)
data = np.random.randn(2, 2)
df = pd.DataFrame(data, dt_ind, ['a', 'b'])
df.index.argmax()   # integer position of the latest timestamp
df.index.argmin()   # integer position of the earliest timestamp
df.index.max()      # the latest timestamp itself
Time resampling
df = pd.read_csv('d:/demo/walmart_stock.csv')
df.head()
df.info()
df['Date']=pd.to_datetime(df['Date'])
df.info()
df.set_index('Date',inplace=True)
df.index   # now a DatetimeIndex (index is an attribute, not a method)
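With a DatetimeIndex in place, the data can be resampled; a minimal sketch, assuming the Walmart file has numeric price columns including 'Close':
df.resample('A').mean()           # yearly (year-end) mean of every numeric column
df['Close'].resample('M').max()   # monthly maximum of the assumed 'Close' column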