Attachment 3 Python for Data Analysis Lyst9850 (1)

The document provides an overview of Python libraries NumPy and Pandas for data analysis, covering installation, array creation, basic operations, and data structures. It explains key features such as NumPy's n-dimensional arrays, broadcasting, and Pandas' Series and DataFrames for handling and analyzing data. Additionally, it discusses methods for managing missing data, grouping, merging, and input/output operations with various file formats.

Uploaded by

kalpeshboratkar

SKILLATHON.

PYTHON FOR DATA ANALYSIS
© www.skillathon.co
Content

NumPy
✔ Introduction
✔ Installation
✔ NumPy Arrays
✔ How to create ndarrays?
✔ random() methods
✔ Shape of arrays
✔ Reshaping arrays
✔ Operation on arrays
✔ Arithmetic
✔ Broadcasting

Pandas
✔ Introduction
✔ Series
✔ DataFrames
✔ Missing Data
✔ Groupby
✔ Aggregate Functions
✔ Merging, joining and concatenating
✔ Operations
✔ Data Input and output

❑ NumPy stands for Numerical Python.

❑ It is the fundamental package for scientific computing in Python.

❑ It is fast because its core array operations are implemented in compiled C code.

❑ It is part of the SciPy stack.

❑ Many other libraries rely on NumPy as one of their building blocks.

❑ Installing the Anaconda distribution is recommended, since it keeps NumPy and its underlying dependencies in sync.

❑ If you have Anaconda, install NumPy from the terminal or command prompt:

conda install numpy

❑ If you don't have Anaconda, use pip instead:

pip install numpy

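❑ NumPy's core object is the ndarray: a fast n-dimensional array whose elements all have the same type.

❑ Dimensions are called axes.

Note

✔ Indexing starts at 0.
✔ Unlike lists, arrays support broadcasting (element-wise operations between arrays of different but compatible shapes).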

❑ To start using the NumPy package, we need to import it.

>>> import numpy as np ### the alias np saves typing

❑ NumPy arrays can be created directly with the np.array() function.

>>> arr1 = np.array([1, 2, 3]) ### passing a simple list as the argument

>>> arr1
array([1, 2, 3]) ### a 1-D array
>>> arr2 = np.array([[1, 2, 3], [2, 3, 4]]) ### passing a nested list
>>> arr2
array([[1, 2, 3],
       [2, 3, 4]]) ### a 2-D array
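
The points about axes and a single element type can be checked with a short sketch. The variable name `mixed` is only illustrative and is not part of the original slides.

```python
import numpy as np

arr1 = np.array([1, 2, 3])                # 1-D array: one axis
arr2 = np.array([[1, 2, 3], [2, 3, 4]])   # 2-D array: two axes

print(arr1.ndim, arr2.ndim)   # number of axes of each array
print(arr2[0, 1])             # indexing starts at 0: row 0, column 1

# All elements share one type, so mixing ints and a float
# makes NumPy store every element as a float.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)
```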

❑ NumPy arrays can be generated quickly with the np.arange() function.

np.arange(start, stop, step)

❑ Example:
>>> a = np.arange(0, 5) ### generates array([0, 1, 2, 3, 4]); stop is excluded
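
A minimal sketch of how the step argument changes the result; the names `evens` and `countdown` are illustrative only.

```python
import numpy as np

a = np.arange(0, 5)              # start included, stop excluded: 0, 1, 2, 3, 4
evens = np.arange(0, 10, 2)      # step of 2: 0, 2, 4, 6, 8
countdown = np.arange(5, 0, -1)  # negative step counts down: 5, 4, 3, 2, 1

print(a, evens, countdown)
```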
How to create numpy arrays (continued)

❑ To generate an array of zeros:

>>> np.zeros(shape)

❑ To generate an array of ones:

>>> np.ones(shape)

❑ To create an identity matrix of size n × n:

>>> np.eye(n)

❑ To create an array of evenly spaced points:

>>> np.linspace(start, stop, num)

linspace is similar to arange, but its third argument is the number of points rather than the step size, and stop is included by default.
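
The four constructors can be tried together as follows. This is a small sketch with arbitrarily chosen shapes and values.

```python
import numpy as np

zeros = np.zeros((2, 3))       # 2x3 array filled with 0.0 (shape passed as a tuple)
ones = np.ones(4)              # 1-D array of four 1.0 values
identity = np.eye(3)           # 3x3 identity matrix
points = np.linspace(0, 1, 5)  # 5 evenly spaced points from 0 to 1, both ends included

print(zeros.shape, ones.shape, identity.shape)
print(points)                  # values 0, 0.25, 0.5, 0.75 and 1
```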

❑ NumPy provides functions to generate arrays with random elements.

np.random.rand(d0, d1, ...) : returns an array of the given dimensions filled with random numbers from a uniform distribution over [0, 1).

np.random.randn(d0, d1, ...) : returns an array of the given dimensions sampled from the standard normal (Gaussian) distribution, with mean 0 and standard deviation 1.

np.random.randint(low, high, size) : returns an array of the given size filled with random integers from the given range.

Note:
✔ In randint(), the lower limit is inclusive and the upper limit is exclusive.
✔ rand() and randn() take the dimensions as separate arguments, not as a single shape tuple.
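
A short sketch of the three functions. The fixed seed is an assumption added only so that repeated runs give the same numbers, and the dice example is illustrative.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed, only to make the run repeatable

uniform = np.random.rand(2, 3)            # 2x3 array, values in [0, 1)
normal = np.random.randn(4)               # 4 samples from the standard normal
dice = np.random.randint(1, 7, size=10)   # integers 1..6, since 7 is excluded

print(uniform.shape, normal.shape)
print(dice.min(), dice.max())
```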

❑ To get the shape of a NumPy array, use its shape attribute.

>>> a = np.array([7, 2, 9, 10])

>>> a.shape
(4,)
>>> b = np.array([[2, 4, 6], [1, 3, 5]])
>>> b.shape
(2, 3)

Note:
✔ No parentheses, since shape is an attribute, not a method.

❑ The shape of an array can be changed.

❑ The reshape() method returns the array's data arranged in new dimensions; the total number of elements must stay the same.

❑ Example:
>>> a = np.random.rand(4, 4)
>>> a.reshape(2, 2, 4) ### 4*4 = 2*2*4 = 16 elements
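
A minimal sketch of reshaping, using np.arange instead of random values so the result is easy to inspect. The use of -1 to let NumPy infer one dimension is an addition not covered in the slides.

```python
import numpy as np

a = np.arange(16)            # 16 elements: 0..15
grid = a.reshape(4, 4)       # 4 * 4 = 16
cube = a.reshape(2, 2, 4)    # 2 * 2 * 4 = 16
auto = a.reshape(2, -1)      # -1 asks NumPy to infer the missing dimension (8 here)

print(grid.shape, cube.shape, auto.shape)
print(a.shape)               # the original array keeps its shape
```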

❑ NumPy provides methods to perform basic operations on an array.

ndarray.max() : returns the largest element of the array.

>>> a = np.array([2, 4, 12, 83, 1])
>>> a.max()
83
ndarray.min() : returns the smallest element of the array.
>>> a.min()
1
ndarray.argmax() : returns the index of the largest element.
>>> a.argmax()
3
ndarray.argmin() : returns the index of the smallest element.
>>> a.argmin()
4
ndarray.sum() : returns the sum of all elements of the array.
>>> a.sum()
102
Basic Operations : statistics

❑ We can calculate the mean, median or standard deviation directly with NumPy.

>>> a = np.array([1, 2, 3, 3])
>>> a.mean() ### mean of a
2.25
>>> np.median(a) ### median; this is a function, since ndarray has no median() method
2.5
>>> a.std() ### standard deviation
0.829156...
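
To make clear what std() computes, the sketch below rebuilds it by hand. By default NumPy returns the population standard deviation (dividing by n); the `ddof=1` variant, shown for contrast, divides by n - 1 and is an addition to the slides.

```python
import numpy as np

a = np.array([1, 2, 3, 3])

mean = a.mean()                                   # (1 + 2 + 3 + 3) / 4 = 2.25
manual_std = np.sqrt(((a - mean) ** 2).mean())    # square root of the mean squared deviation
sample_std = a.std(ddof=1)                        # sample standard deviation (divides by n - 1)

print(manual_std, a.std(), sample_std)
```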

❑ Many arithmetic operations can be done with NumPy arrays.

❑ With scalars:
>>> a = np.array([1, 2, 3])
>>> a + 1 ### adds 1 to each element
array([2, 3, 4])
>>> a ** 2 ### squares each element
array([1, 4, 9])
❑ With another array:
>>> b = np.ones(3) ### generates array([1., 1., 1.])
>>> a + b
array([2., 3., 4.])
>>> a - b
array([0., 1., 2.])
>>> a * b ### element-wise product, not matrix multiplication; use np.dot(a, b) for that
array([1., 2., 3.])

Note: These operations are much faster than the equivalent loops in pure Python.
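
The difference between the element-wise product and np.dot can be sketched as follows. The 2x2 matrix `m` and the `@` operator are illustrative additions.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.ones(3)

elementwise = a * b     # multiplies matching elements: 1*1, 2*1, 3*1
dot = np.dot(a, b)      # sums those products: 1 + 2 + 3 = 6.0

m = np.array([[1, 2], [3, 4]])
squared_elements = m * m   # each element squared
matrix_product = m @ m     # matrix multiplication, same as np.dot(m, m)

print(elementwise, dot)
print(squared_elements)
print(matrix_product)
```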

❑ Comparisons can be done element-wise between two arrays (a and b as defined above).

>>> a == b ### returns an array of Booleans
array([ True, False, False])
>>> a > b
array([False,  True,  True])
❑ Comparing two whole arrays:
>>> np.array_equal(a, b) ### returns a single Boolean
False
❑ Logical operations:
>>> a = np.array([1, 0, 0, 1], dtype=bool)
>>> b = np.array([0, 1, 0, 1], dtype=bool)
>>> np.logical_or(a, b)
array([ True,  True, False,  True])
>>> np.logical_and(a, b)
array([False, False, False,  True])

❑ Broadcasting is useful when we want to do element-wise operations on NumPy arrays with different shapes.
❑ Operations on arrays of different sizes are possible if NumPy can stretch them to a common shape; this conversion is called broadcasting.
❑ It does this without making needless copies of data, which usually leads to efficient implementations.

Note:
✔ If both arrays are two-dimensional, the sizes along each axis must either be equal or one of them must be 1.
Broadcasting : example
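
A minimal sketch of broadcasting with made-up arrays: a row and a column are each stretched to match a 2x3 matrix, and a shape that cannot be stretched raises an error.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])        # shape (2, 3)
row = np.array([10, 20, 30])          # shape (3,): added to every row
col = np.array([[100], [200]])        # shape (2, 1): added to every column

plus_row = matrix + row   # [[11, 22, 33], [14, 25, 36]]
plus_col = matrix + col   # [[101, 102, 103], [204, 205, 206]]

# Shapes (2, 3) and (2,) do not line up (3 vs 2 on the last axis),
# so NumPy refuses to broadcast them.
bad = np.array([1, 2])
try:
    matrix + bad
    compatible = True
except ValueError:
    compatible = False

print(plus_row)
print(plus_col)
print(compatible)
```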

❑ Pandas is one of the richest libraries in Python.

❑ It can be used to analyze and visualize data.

❑ Pandas provides two high-performance data structures:

Series : 1-D labeled vector
DataFrame : 2-D spreadsheet-like structure

❑ These data structures are fast since they are built on top of NumPy.

❑ SQL-like functionality: GroupBy, joining / merging, etc.

❑ Missing data handling

❑ A Series is a one-dimensional object similar to an array, a list or a column in a table.

❑ Each item in a Series is assigned an index label.
❑ The index labels can be integers or strings.
❑ By default, the items receive integer index labels from 0 to n-1.
❑ Values can be heterogeneous.

❑ Dictionaries can be converted into Series; the keys become the index.

❑ To grab a value from a Series, use its index label.
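
The Series properties listed above can be sketched as follows; all values and labels are made up for illustration.

```python
import pandas as pd

s = pd.Series([10, 20, 30])                                # default index labels 0, 1, 2
labeled = pd.Series([10, 20, 30], index=["a", "b", "c"])   # string index labels
from_dict = pd.Series({"x": 1.5, "y": 2.5})                # dict keys become the index
mixed = pd.Series([1, "two", 3.0])                         # heterogeneous values

print(s.loc[0])          # look up by index label
print(labeled["b"])
print(from_dict["y"])
print(mixed.dtype)       # heterogeneous values are stored with the generic object dtype
```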

❑ A DataFrame is a tabular data structure made up of rows and columns, like a spreadsheet, a database table, or R's data.frame object.
❑ It can be thought of as a group of Series objects that share the same index.

❑ It is the most commonly used pandas object.

❑ To create a DataFrame, pd.DataFrame() is used.

❑ Like Series, DataFrame accepts many different kinds of input:
Dict of 1-D ndarrays, lists, dicts, or Series
2-D numpy.ndarray
Structured or record ndarray
A Series
Another DataFrame

Note:
✔ Along with the data, you can optionally pass index (row labels) and columns (column labels) arguments.
✔ If axis labels are not passed, they are constructed from the input data using sensible defaults.
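
Two of the input kinds listed above, sketched with made-up data: a dict of lists, and a 2-D ndarray with explicit row and column labels.

```python
import numpy as np
import pandas as pd

# From a dict of lists: keys become column labels, rows get labels 0, 1, 2.
df_dict = pd.DataFrame({"name": ["Asha", "Ben", "Chen"],
                        "score": [82, 91, 77]})

# From a 2-D ndarray, passing the optional index and columns arguments.
df_array = pd.DataFrame(np.arange(6).reshape(2, 3),
                        index=["r1", "r2"],
                        columns=["a", "b", "c"])

print(df_dict)
print(df_array)
```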
DataFrames : Columns and rows

❑ To select a column in a DataFrame, we simply write:

dataframe_name['Column_name']
dataframe_name[['Column_name_1', 'Column_name_2']] ### to select multiple columns
❑ To create a new column:
dataframe_name['New_column_name'] = values ### one value per row, or a single scalar
❑ We can also remove a column from the dataset:
dataframe_name.drop('Column_name', axis=1, inplace=True)
Note: axis=1 tells drop() to look for the label among the columns (the default, axis=0, looks among the rows). drop() returns a new DataFrame by default; pass inplace=True to modify the original permanently.
❑ To select rows in a DataFrame by label, we use the loc attribute:
dataframe_name.loc['row_name']
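
The column and row operations above, sketched on a small made-up DataFrame. The iloc lookup by position is an addition not mentioned in the slides.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=["x", "y", "z"])

col = df["a"]                        # one column, returned as a Series
sub = df[["a", "b"]]                 # several columns, returned as a DataFrame
df["total"] = df["a"] + df["b"]      # new column computed from existing ones

dropped = df.drop("b", axis=1)       # returns a copy without column b; df is unchanged
row = df.loc["y"]                    # row selected by label
first = df.iloc[0]                   # row selected by position

print(dropped)
print(row)
```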

❑ Datasets often contain missing data.

❑ Pandas provides functions to deal with them:

df.dropna() : returns the object with the labels on the given axis omitted where any or all of the data are missing.

df.fillna() : fills NA/NaN values with the specified value or method.
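
A small sketch of both functions on made-up data. Filling with the column means is one common choice, shown here only as an example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0],
                   "b": [np.nan, np.nan, 6.0]})

rows_any = df.dropna()              # drops rows with any missing value: only row 2 is left
rows_all = df.dropna(how="all")     # drops rows where every value is missing: row 1 only
filled = df.fillna(0)               # replaces every NaN with 0
mean_filled = df.fillna(df.mean())  # replaces each NaN with its column's mean

print(rows_any)
print(rows_all)
print(mean_filled)
```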

❑ The groupby() method groups rows together based on the values in one or more columns (or index levels).

❑ After grouping, aggregate functions can be applied to each group for analysis.

❑ There are many aggregate functions available, such as:

sum()
std()
mean()
min()
max()
describe()

Note: describe() is a good one to try first, since it already reports the count, mean, std (standard deviation), min, max, etc. for the numerical columns of the DataFrame.
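
Grouping followed by aggregation, sketched on a made-up sales table; the column names `region` and `units` are hypothetical.

```python
import pandas as pd

sales = pd.DataFrame({
    "region": ["East", "West", "East", "West"],
    "units": [10, 20, 30, 40],
})

grouped = sales.groupby("region")["units"]

totals = grouped.sum()                  # East: 10 + 30, West: 20 + 40
means = grouped.mean()                  # East: 20.0, West: 30.0
summary = grouped.describe()            # count, mean, std, min, quartiles, max per group
min_max = grouped.agg(["min", "max"])   # several aggregates at once

print(totals)
print(summary)
```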
Merging and Concatenation

❑ Concatenation glues DataFrames together along an axis; it works best when their dimensions match along the other axis (for example, the same columns when stacking rows).
❑ Pandas provides the pd.concat() function for this.
❑ The merge function, pd.merge(), combines DataFrames using logic similar to joining SQL tables.
[Figures: Concat and Merging illustrations]
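
A minimal sketch of both operations on made-up tables. The `how` values shown ("inner" and "left") correspond to the SQL join types of the same names.

```python
import pandas as pd

q1 = pd.DataFrame({"id": [1, 2], "units": [10, 20]})
q2 = pd.DataFrame({"id": [3, 4], "units": [30, 40]})

# Concatenation: stack the rows of two frames that have the same columns.
stacked = pd.concat([q1, q2], ignore_index=True)

names = pd.DataFrame({"id": [1, 2, 5], "name": ["Asha", "Ben", "Eve"]})

# Merge: match rows on the shared "id" column, like a SQL join.
inner = pd.merge(stacked, names, on="id", how="inner")  # only ids found in both frames
left = pd.merge(stacked, names, on="id", how="left")    # every row of stacked; NaN where no name

print(stacked)
print(inner)
print(left)
```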

❑ Using pandas we can read and write files in various formats, such as:
.csv
.json
.xml
.html
And many more…
❑ Functions to read a file:
pd.read_csv('file_name')
pd.read_json('file_name')
pd.read_excel('file_name')
❑ Methods to write a file (called on a DataFrame, here df):
df.to_csv('file_name')
df.to_excel('file_name')
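
A CSV round trip can be sketched without touching the disk by writing to an in-memory text buffer. The use of io.StringIO in place of a file name is an assumption made to keep the example self-contained.

```python
import io

import pandas as pd

df = pd.DataFrame({"name": ["Asha", "Ben"], "score": [82, 91]})

buffer = io.StringIO()            # stands in for a file on disk
df.to_csv(buffer, index=False)    # index=False leaves the row labels out of the file
buffer.seek(0)                    # rewind before reading back

loaded = pd.read_csv(buffer)
print(loaded)
```

With a real file, the same calls take a path instead, for example `df.to_csv('file_name')` followed by `pd.read_csv('file_name')`.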
