On Data Handling Using Pandas-I

1. Pandas is the most popular Python library for data analysis. It contains powerful data structures called Series (1D) and DataFrames (2D).
2. DataFrames can be created from lists, dictionaries, NumPy arrays, other DataFrames, and CSV/text files. They allow labeled access similar to relational databases.
3. Common operations on DataFrames include selecting, adding, deleting, and renaming rows and columns, iterating over rows and columns, merging, joining, and concatenating DataFrames.

Uploaded by

Rayansh Chauhan


Unit 1

Data Handling using Pandas-1

Module: A module is a file that contains Python functions. It is a
.py file holding executable Python code or statements.
Package: A package is a namespace that contains multiple
packages or modules. It is a directory that contains a special
file, __init__.py.
The __init__.py file tells Python to treat the directory that contains it
as a package.
Library: A library is a collection of various packages. Conceptually, there is no
difference between a package and a Python library.

Framework: A framework is a collection of various libraries that defines the
architecture and flow of the code.
Pandas:
Pandas is one of the most popular open-source Python libraries used for
data analysis.
Pandas provides two main data structures for analysing data:

● Series
● DataFrames
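Series:
A Series is a one-dimensional labelled array defined in pandas that can store
data of any type.

Syntax:

<Series name> = pd.Series(<list name>, ...)
Here pd is the alias under which pandas was imported (import pandas as pd).

Example (the values of a Series):
5 15 16 4 34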

Properties of Series:
• A Series contains data of a homogeneous type.
• The size of a Series is immutable.
• The values in a Series are mutable.
Creation of Series:
We can create a pandas Series in the following ways:

● From arrays
● From lists
● From dictionaries
● From a scalar value
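The four creation routes listed above can be sketched as follows. This is a minimal illustration; the variable names and sample values are assumptions chosen for the example, not taken from the original slides.

```python
import numpy as np
import pandas as pd

# From a list: pandas assigns the default index 0, 1, 2, ...
s_list = pd.Series([5, 15, 16, 4, 34])

# From a NumPy array, with explicit index labels
s_array = pd.Series(np.array([10, 20, 30]), index=["a", "b", "c"])

# From a dictionary: the keys become the index labels
s_dict = pd.Series({"x": 1.5, "y": 2.5})

# From a scalar: the value is repeated once for every index label
s_scalar = pd.Series(7, index=[0, 1, 2])

print(s_list, s_array, s_dict, s_scalar, sep="\n\n")
```

When a scalar is used, an index must be supplied so that pandas knows how many times to repeat the value.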
Mathematical Operations on Series:
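As a hedged sketch of arithmetic on Series: operations are applied element by element, and values are matched by index label rather than by position. The Series below are illustrative assumptions.

```python
import pandas as pd

s1 = pd.Series([10, 20, 30], index=["a", "b", "c"])
s2 = pd.Series([1, 2, 3], index=["a", "b", "d"])

# Labels are aligned first; "c" and "d" have no partner, so the result is NaN there
total = s1 + s2

# add() with fill_value treats a missing partner as 0 instead of producing NaN
filled = s1.add(s2, fill_value=0)

# A scalar operand is applied to every element
scaled = s1 * 2

print(total)
print(filled)
print(scaled)
```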
Head and Tail functions on Series:
The head() and tail() functions return the first and last n rows respectively.
Syntax:
<Series name>.head(n)
<Series name>.tail(n)
n: number of rows
The default value of n is 5.
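A short illustration of head() and tail(), using an assumed ten-element Series:

```python
import pandas as pd

s = pd.Series(range(10, 20))  # values 10, 11, ..., 19

first_five = s.head()    # n defaults to 5
last_three = s.tail(3)   # explicit n

print(first_five)
print(last_three)
```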
Selection, Indexing and Slicing on Series:
Selection: We can select a value from a Series by using its
corresponding index.
Syntax:
<Series name>[<index>]
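A minimal sketch of selection, assuming one Series with the default integer index and one with string labels (the data is illustrative):

```python
import pandas as pd

# Default index: the index number is the label
s = pd.Series([5, 15, 16, 4, 34])
third = s[2]

# Custom labels: select with the label itself
stock = pd.Series([20, 30, 52, 10],
                  index=["pencils", "notebooks", "scales", "erasers"])
scales = stock["scales"]

# A list of labels returns a smaller Series
subset = stock[["pencils", "erasers"]]

print(third, scales)
print(subset)
```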
Indexing:
The Series.index attribute is used to get or set the index labels of the
given Series.

Syntax:
<Series name>.index
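The index attribute can be both read and assigned. A small sketch with assumed labels; the new list of labels must have the same length as the Series:

```python
import pandas as pd

s = pd.Series([101, 102, 103, 104])
print(s.index)            # the default index covering positions 0 to 3

# Replace the labels; four values need exactly four labels
s.index = ["w", "x", "y", "z"]
print(s.index)
print(s["y"])
```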
Slicing:
A slicing operation extracts part of a Series based on the given
parameters.
Syntax:
<Series name>[<start>:<stop>:<step>]
Note: start, stop and step are optional.
Default values: start=0, stop=n (the length of the Series, so the last element is included), step=1
Note: slicing with integers selects by position (0, 1, 2, ...), not by index label.
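A brief sketch of slicing with the start:stop:step syntax, using an assumed five-element Series. As with Python lists, the stop position itself is excluded.

```python
import pandas as pd

s = pd.Series([5, 15, 16, 4, 34])

middle = s[1:4]        # positions 1, 2 and 3
every_other = s[::2]   # positions 0, 2 and 4
backwards = s[::-1]    # a negative step walks from the end to the start
everything = s[:]      # all parameters omitted: the whole Series

print(middle.tolist(), every_other.tolist(), backwards.tolist())
```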
1. What is the significance of the Pandas library?
2. Name some common data structures of Python's pandas
library.
3. Write the syntax and description of the min, sum, describe
and idxmax functions of a pandas Series.
4. What output will be produced by the following code?

import pandas as pd
stationary = ['pencils', 'notebooks', 'scales', 'erasers']
S = pd.Series([20, 30, 52, 10], index=stationary)
S2 = pd.Series([17, 13, 32, 21], index=stationary)
print(S + S2)
S = S + S2
print(S + S2)

5. Find the error in the following code fragment:

S2=pd.Series([101,102,102,104])

S2.index=[0,1,2,3,4,5]

S2[5]=220

print(S2)
Write a Pandas program to multiply and divide two Pandas Series. Sample Series:
[2, 4, 8, 10], [1, 3, 7, 9]
import pandas as pd
ds1 = pd.Series([2, 4, 8, 10])
ds2 = pd.Series([1, 3, 7, 9])
print("Multiply two Series:")
ds = ds1 * ds2
print(ds)
print("Divide Series1 by Series2:")
ds = ds1 / ds2
print(ds)
Write a Pandas program to convert a dictionary to a Pandas series. Sample
dictionary: d1 = {'a': 100, 'b': 200, 'c':300}
import pandas as pd
d1 = {'a': 100, 'b': 200, 'c':300}
print("Original dictionary:")
print(d1)
new_series = pd.Series(d1)
print("Converted series:")
print(new_series)
Write a Pandas program to sort a given Series. Sample Series:
[400, 300.12, 100, 200]
import pandas as pd
s = pd.Series([400, 300.12, 100, 200])
print("Original Data Series:")
print(s)
# sort_values() returns a new Series sorted in ascending order
new_s = s.sort_values()
print("Sorted Data Series:")
print(new_s)
Data Frames:
A DataFrame is a two-dimensional (2-D) data structure defined in
pandas that consists of rows and columns.
A DataFrame stores an ordered collection of columns, and each column can
store data of a different type.

Example:
S.No.  Name   Age  Marks
1      Ravi   25   99
2      Kunal  26   98
Characteristics of Data Frames:
➢ It has two indices (two axes)
○ Row index (axis=0) -> known as the index
○ Column index (axis=1) -> known as the column name
➢ Each value in a DataFrame is identified by the
combination of its row index and column index.
➢ Indices can be of any type.
➢ Different columns can hold data of different types.
➢ Values are mutable.
➢ Size is mutable.
Creation of Data Frames:
Syntax:
<DataFrame name> = pandas.DataFrame(<2D data structure>, columns=<column sequence>, index=<index sequence>, ...)
We can create a DataFrame in many ways, such as from:
(i) Two-dimensional dictionaries
(ii) Two-dimensional ndarrays (NumPy arrays)
(iii) Series type objects
(iv) Another DataFrame object
(v) Text/CSV files
A DataFrame can also be created from a list.
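The following sketch shows DataFrame creation from a list, a NumPy array, a Series and another DataFrame. The first example reuses the sample table from earlier in these notes; the other values and all variable names are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# From a list of lists: each inner list is one row
df_list = pd.DataFrame([[1, "Ravi", 25, 99], [2, "Kunal", 26, 98]],
                       columns=["S.No.", "Name", "Age", "Marks"])

# From a two-dimensional NumPy array
df_array = pd.DataFrame(np.array([[1, 2], [3, 4]]), columns=["a", "b"])

# From a Series: the Series becomes a single column named after it
df_series = pd.DataFrame(pd.Series([10, 20, 30], name="score"))

# From another DataFrame: copy=True makes the new object independent
df_copy = pd.DataFrame(df_list, copy=True)

print(df_list, df_array, df_series, df_copy, sep="\n\n")
```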
(i) Two-dimensional dictionaries
We can create a DataFrame from two-dimensional dictionaries in two ways:

➢ Creating a DataFrame from a list of dictionaries

➢ Creating a DataFrame from a dictionary of Series
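Both dictionary-based routes can be sketched as below. The data is illustrative; note how keys or labels that are missing on one side are filled with NaN.

```python
import pandas as pd

# List of dictionaries: each dictionary is one row, keys become column names
rows = [{"Name": "Ravi", "Age": 25},
        {"Name": "Kunal", "Age": 26, "Marks": 98}]
df_rows = pd.DataFrame(rows)     # Ravi has no "Marks" key, so that cell is NaN

# Dictionary of Series: each Series is one column, aligned on index labels
cols = {"Age": pd.Series([25, 26], index=["Ravi", "Kunal"]),
        "Marks": pd.Series([99], index=["Ravi"])}
df_cols = pd.DataFrame(cols)     # Kunal has no Marks value, so that cell is NaN

print(df_rows)
print(df_cols)
```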
(v) Text/CSV files:
We can create a DataFrame from text/CSV files by using the
read_csv() function.
Syntax:
<DataFrame name> = pandas.read_csv(filepath_or_buffer, sep=',',
delimiter=None, header='infer', names=None,
index_col=None, usecols=None, …)
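A hedged example of read_csv() with some of the parameters from the syntax above. To keep the example self-contained, the CSV text is held in memory with io.StringIO instead of a file on disk; the content is the sample table used earlier in these notes.

```python
import io
import pandas as pd

# In-memory stand-in for a CSV file (an assumption made so the example runs anywhere)
csv_text = "S.No.,Name,Age,Marks\n1,Ravi,25,99\n2,Kunal,26,98\n"

df = pd.read_csv(
    io.StringIO(csv_text),
    sep=",",                              # field separator
    header="infer",                       # take column names from the first line
    index_col="S.No.",                    # use this column as the row index
    usecols=["S.No.", "Name", "Marks"],   # read only these columns
)
print(df)
```

With a real file, the first argument would simply be the path to that file.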
Accessing values in a DataFrame:
Accessing a particular value:
<DataFrame name>[<column name>][<index>]

Accessing a group of values:

<DataFrame name>.loc[<index>, <column name>]
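A minimal sketch of both access styles, using the sample table with S.No. as an assumed row index:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Ravi", "Kunal"], "Age": [25, 26], "Marks": [99, 98]},
                  index=[1, 2])

# A particular value: pick the column first, then the row label
single = df["Name"][1]

# The same value through .loc, which takes the row label and column name together
same = df.loc[1, "Name"]

# A group of values: lists of row labels and column names
group = df.loc[[1, 2], ["Name", "Marks"]]

print(single, same)
print(group)
```

For assigning values, the single .loc[row, column] form is generally the safer choice, because the chained form may operate on a temporary copy.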
NaN in Python:
NaN, standing for "not a number", is a special floating-point value used to
represent a result that is undefined or unrepresentable. For
example, 0/0 is undefined as a real number and is, therefore,
represented by NaN in floating-point arithmetic. Pandas also uses NaN to mark missing values.
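A short sketch of how NaN behaves in pandas, with assumed sample values. Note that plain Python raises ZeroDivisionError for 0/0, so the example takes NaN from NumPy instead.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

# NaN is not equal to anything, including itself, so use isna() to detect it
nan_equals_itself = (np.nan == np.nan)
missing = s.isna()

# Reductions such as sum() skip NaN by default
total = s.sum()

print(nan_equals_itself)
print(missing.tolist())
print(total)
```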
Iteration on DataFrames:

In pandas we can iterate over a DataFrame in two ways:

● Iterating over rows

● Iterating over columns
Iterating over rows:

To iterate over a DataFrame, we can use the
following functions:
● iterrows(): iterates over the rows as (index, Series) pairs
● items(): iterates over the columns as (column name, Series) pairs; it was called iteritems() in older pandas versions, and that name was removed in pandas 2.0
● itertuples(): iterates over the rows as namedtuples
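The three iteration functions can be sketched as follows on an assumed two-row DataFrame. The example uses items(), since iteritems() is no longer available in current pandas releases.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Ravi", "Kunal"], "Marks": [99, 98]})

# iterrows(): each step yields (row index, row as a Series)
row_names = [row["Name"] for index, row in df.iterrows()]

# items(): each step yields (column name, column as a Series)
column_labels = [label for label, column in df.items()]

# itertuples(): each step yields a namedtuple with one field per column
tuple_marks = [record.Marks for record in df.itertuples()]

print(row_names, column_labels, tuple_marks)
```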
Iterating over columns: To iterate over columns, we
create a list of the DataFrame's columns and then iterate
through that list to pull out each column.
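A minimal sketch of column iteration as described above, computing the maximum of each column of an assumed DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"Age": [25, 26], "Marks": [99, 98]})

column_max = {}
for name in list(df.columns):      # list of column names
    column = df[name]              # pull out one column as a Series
    column_max[name] = int(column.max())

print(column_max)
```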
Operations on rows and columns:

● Add

● Select

● Delete

● Rename
Column operations: selection, addition, deletion and rename.
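The four column operations can be sketched as below. The "Rank" column and its values are assumptions added purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Ravi", "Kunal"], "Age": [25, 26], "Marks": [99, 98]})

# Selection: one column as a Series
names = df["Name"]

# Addition: assigning to a new column name creates the column
df["Rank"] = [1, 2]

# Deletion: drop() returns a new DataFrame without the column
df = df.drop(columns=["Age"])

# Rename: map old column names to new ones
df = df.rename(columns={"Marks": "Score"})

print(names.tolist())
print(df)
```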
Row operations: selection, addition, deletion and rename.
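A hedged sketch of the same four operations applied to rows. The added row ("Sample", 90) is made-up data used only to show the mechanics.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Ravi", "Kunal"], "Marks": [99, 98]}, index=[1, 2])

# Selection: .loc with a row label returns that row as a Series
first = df.loc[1]

# Addition: assigning to a label that does not exist yet appends a row
df.loc[3] = ["Sample", 90]

# Deletion: drop() with index= removes rows by label
df = df.drop(index=2)

# Rename: map old row labels to new ones
df = df.rename(index={3: 2})

print(first)
print(df)
```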
Head and Tail functions in Data Frames:

head(n):
Returns the first n rows.
tail(n):
Returns the last n rows.
The default value of n is 5.
Indexing using Labels in Data Frames: We can make one of
the columns the row index of the DataFrame by using the
function set_index().
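A brief illustration of set_index() on the sample table. By default it returns a new DataFrame and leaves the original unchanged.

```python
import pandas as pd

df = pd.DataFrame({"S.No.": [1, 2], "Name": ["Ravi", "Kunal"], "Marks": [99, 98]})

# Use the Name column as the row index
by_name = df.set_index("Name")

# Rows can now be looked up by name
kunal_marks = by_name.loc["Kunal", "Marks"]

print(by_name)
print(kunal_marks)
```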
Boolean indexing in Data Frames: Boolean indexing helps us
to select the data from the Data Frames using a boolean vector.
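Boolean indexing can be sketched as follows: a comparison on a column produces a boolean Series, and passing that Series in square brackets keeps only the rows marked True. The thresholds are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Ravi", "Kunal"], "Age": [25, 26], "Marks": [99, 98]})

# Boolean vector: one True/False value per row
mask = df["Marks"] > 98
top = df[mask]

# Conditions are combined with & (and) or | (or), each wrapped in parentheses
both = df[(df["Age"] > 24) & (df["Marks"] < 99)]

print(mask.tolist())
print(top)
print(both)
```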
Joining, Merging and Concatenation on Data Frames:
Merge:
The pandas.merge() function is used for merging two DataFrames.
Its main arguments are:
● The two DataFrames to merge
● how: the type of merge, such as left, right, inner or outer (inner is the default)
● on: the name of the common column to merge on
Join: The join() method uses the index of the DataFrames.
Use <dataframe 1>.join(<dataframe 2>) to join.
Concatenation: Use pandas.concat(<list of DataFrames>) to concatenate.
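The three ways of combining DataFrames can be sketched as below. The two small tables, including the row with S.No. 3, are assumptions made for the example.

```python
import pandas as pd

students = pd.DataFrame({"S.No.": [1, 2], "Name": ["Ravi", "Kunal"]})
marks = pd.DataFrame({"S.No.": [2, 3], "Marks": [98, 75]})

# merge: match rows on the common column
inner = pd.merge(students, marks, how="inner", on="S.No.")  # only S.No. 2 is in both
left = pd.merge(students, marks, how="left", on="S.No.")    # all students, NaN where no marks

# join: match rows on the index (keeps every row of the calling DataFrame by default)
joined = students.set_index("S.No.").join(marks.set_index("S.No."))

# concat: stack DataFrames on top of each other
stacked = pd.concat([students, students], ignore_index=True)

print(inner, left, joined, stacked, sep="\n\n")
```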
Importing/Exporting Data between CSV files and Data
Frames:
Import data from a CSV file to a DataFrame: We can import data
from a CSV file into a DataFrame by using the read_csv() function.
Export data from a DataFrame to a CSV file: We can export data
from a DataFrame to a CSV file by using the to_csv() function.
Syntax:
<DataFrame name>.to_csv(<file path>, ...)
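A hedged sketch of an export followed by a re-import. When to_csv() is called without a file path it returns the CSV text as a string, which keeps this example free of files on disk; passing a path instead writes the same text to that file.

```python
import io
import pandas as pd

df = pd.DataFrame({"Name": ["Ravi", "Kunal"], "Marks": [99, 98]})

# Export: index=False leaves the row index out of the CSV
csv_text = df.to_csv(index=False)
print(csv_text)

# Import the exported text again to check that the data survives the round trip
round_trip = pd.read_csv(io.StringIO(csv_text))
print(round_trip)
```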
