Python Libraries


PYTHON LIBRARIES: NUMPY, PANDAS, SCIPY & SCIKIT-LEARN, MATPLOTLIB, SEABORN
Python Libraries for Data Science
Many popular Python toolboxes/libraries:
• NumPy
• SciPy
• Pandas
• SciKit-Learn

All these libraries are installed on the SCC.

Visualization libraries
• matplotlib
• Seaborn

and many more …


NumPy:
• NumPy introduces objects for multidimensional arrays and matrices, as well
as functions that make it easy to perform advanced mathematical and
statistical operations on those objects.

• It provides vectorization of mathematical operations on arrays and matrices,
which significantly improves performance.

• Many other Python libraries are built on NumPy.

• The core of NumPy is its "ndarray" (n-dimensional array) data structure.
These arrays are strided views on memory.

Link: http://www.numpy.org/
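
A minimal sketch of what vectorization and strided views mean in practice. The sample data and variable names are made up for illustration and are not part of the original slides.

```python
import numpy as np

# Hypothetical sample data: 1,000 evenly spaced values in [0, 1].
x = np.linspace(0.0, 1.0, 1000)

# Pure-Python loop: the expression is evaluated one element at a time.
loop_result = [3.0 * v ** 2 + 1.0 for v in x]

# Vectorized: the same arithmetic is applied to the whole array at once.
vec_result = 3.0 * x ** 2 + 1.0

# Both approaches produce the same numbers.
same = np.allclose(loop_result, vec_result)

# A slice is a strided view on the same memory, not a copy.
every_other = x[::2]
shares_memory = np.shares_memory(x, every_other)
```

The vectorized form is usually much faster on large arrays because the loop runs in compiled code rather than in the Python interpreter.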

• Here are some of the functions defined in the NumPy library:

• 1. zeros(shape[, dtype, order]) - Return a new array of given shape and type,
filled with zeros.

• 2. array(object[, dtype, copy, order, subok, ndmin]) - Create an array.

• 3. asarray(a[, dtype, order]) - Convert the input to an array.

• 4. asanyarray(a[, dtype, order]) - Convert the input to an ndarray, but pass
ndarray subclasses through.
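
The four functions listed above can be tried out as follows. This is a small sketch with made-up inputs. The masked array is used only because it is a convenient ndarray subclass for showing the difference between the last two functions.

```python
import numpy as np

z = np.zeros((2, 3), dtype=np.int64)   # 2x3 array filled with zeros
a = np.array([1, 2, 3], ndmin=2)       # ndmin=2 forces at least 2 dimensions

# asarray does not copy when the input is already a suitable ndarray.
b = np.asarray(z)

# A masked array is an ndarray subclass.
m = np.ma.masked_array([1, 2, 3], mask=[False, True, False])
as_base = np.asarray(m)      # converted to a plain ndarray
as_any = np.asanyarray(m)    # subclass passed through unchanged
```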

SciPy:
• SciPy is a collection of algorithms for linear algebra, differential equations,
numerical integration, optimization, statistics and more

• part of SciPy Stack

• built on NumPy

Link: https://www.scipy.org/scipylib/

Features of SciPy:

• The main feature of the SciPy library is that it is built on NumPy and works
directly with NumPy arrays.

• In addition, SciPy provides efficient numerical routines such as optimization
and numerical integration, organized into task-specific submodules.

Where Is SciPy Used?

• SciPy is a library that uses NumPy to solve mathematical problems. SciPy uses
NumPy arrays as the basic data structure, and comes with modules for various
commonly used tasks in scientific programming.

• Tasks including linear algebra, integration (calculus), ordinary differential
equation solving and signal processing are handled easily by SciPy.
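
As a rough illustration of the tasks named above, the sketch below uses a few SciPy submodules on small made-up problems with known answers. The specific problems are assumptions chosen for the example, not taken from the slides.

```python
import numpy as np
from scipy import integrate, linalg, optimize

# Numerical integration: integral of sin(x) from 0 to pi (exact value is 2).
area, abs_error = integrate.quad(np.sin, 0.0, np.pi)

# Optimization: minimize (x - 3)^2, whose minimum is at x = 3.
result = optimize.minimize_scalar(lambda x: (x - 3.0) ** 2)

# Linear algebra: solve the linear system A @ v = b.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
v = linalg.solve(A, b)

# ODE solving: dy/dt = -y with y(0) = 1, so y(1) should be close to exp(-1).
ode = integrate.solve_ivp(lambda t, y: -y, (0.0, 1.0), [1.0], rtol=1e-8, atol=1e-10)
y_at_1 = ode.y[0, -1]
```

All inputs and outputs here are plain NumPy arrays or floats, which is what "built on NumPy" means in practice.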
Pandas:
• adds data structures and tools designed to work with table-like data (its
Series and DataFrame objects are similar to vectors and data frames in R)

• provides tools for data manipulation: reshaping, merging, sorting, slicing,
aggregation etc.

• allows handling missing data

Link: http://pandas.pydata.org/

Key Features of Pandas

• Fast and efficient DataFrame object with default and customized indexing.
• Tools for loading data into in-memory data objects from different file formats.
• Data alignment and integrated handling of missing data.
• Reshaping and pivoting of data sets.
• Label-based slicing, indexing and subsetting of large data sets.
• Columns from a data structure can be deleted or inserted.
• Group by data for aggregation and transformations.
• High performance merging and joining of data.

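Several of the pandas features listed above can be sketched on a toy table. The column names and values below are invented for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; names and numbers are made up.
staff = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy", "Dee"],
    "dept": ["A", "B", "A", "B"],
    "salary": [100.0, np.nan, 120.0, 90.0],
})
depts = pd.DataFrame({"dept": ["A", "B"], "building": ["North", "South"]})

# Missing data: fill the gap with the mean of the known salaries.
staff["salary"] = staff["salary"].fillna(staff["salary"].mean())

# Insert a derived column, then delete it again.
staff["salary_k"] = staff["salary"] / 1000
staff = staff.drop(columns="salary_k")

# Group by department and aggregate.
mean_by_dept = staff.groupby("dept")["salary"].mean()

# Merge (join) two tables on a shared key.
merged = staff.merge(depts, on="dept", how="left")

# Label-based slicing with a custom index (both end labels are included).
by_name = merged.set_index("name")
subset = by_name.loc["Bob":"Cy", ["dept", "building"]]

# Reshape: pivot departments into columns.
wide = merged.pivot_table(index="building", columns="dept", values="salary")
```
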
SciKit-Learn:
• provides machine learning algorithms: classification, regression, clustering,
model validation etc.

• built on NumPy, SciPy and matplotlib

Link: http://scikit-learn.org/

• It features various classification, regression and clustering algorithms
including support vector machines, random forests, gradient boosting, k-means
and DBSCAN, and is designed to interoperate with the Python numerical and
scientific libraries NumPy and SciPy.

Advantages of using Scikit-Learn:
• Scikit-learn provides a clean and consistent interface to many different
models.
• It provides you with many options for each model, but also chooses sensible
defaults.
• Its documentation is exceptional, and it helps you to understand the models
as well as how to use them properly.
• It is also actively being developed.

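The "consistent interface" and "sensible defaults" points can be illustrated with a short sketch. It trains two of the model types mentioned above on synthetic data generated only for this example. The dataset sizes and random seeds are arbitrary assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic classification data, generated only for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Different models, same fit / score interface, default hyperparameters.
scores = {}
for model in (SVC(), RandomForestClassifier(random_state=0)):
    model.fit(X_train, y_train)
    scores[type(model).__name__] = model.score(X_test, y_test)
```

Swapping in another estimator generally only requires changing the constructor call, since the fit and score calls stay the same.
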
matplotlib:
• Python 2D plotting library which produces publication-quality figures in a
variety of hardcopy formats

• a set of functionalities similar to those of MATLAB

• line plots, scatter plots, bar charts, histograms, pie charts etc.

• relatively low-level; some effort needed to create advanced visualizations

Link: https://matplotlib.org/
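
A minimal sketch of a line plot and a histogram written out as a PNG image. The data is made up, and the non-interactive "Agg" backend is an assumption so that the example runs without a display. In a Jupyter notebook the figure would normally be shown inline instead.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)
rng = np.random.default_rng(0)
samples = rng.normal(size=500)  # made-up data for the histogram

fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))
ax_line.plot(x, np.sin(x), label="sin(x)")
ax_line.set_xlabel("x")
ax_line.set_ylabel("y")
ax_line.legend()
ax_hist.hist(samples, bins=20)
ax_hist.set_title("Histogram")
fig.tight_layout()

# Write the figure to an in-memory PNG; a file name would work the same way.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
plt.close(fig)
```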

Seaborn:
• based on matplotlib

• provides high-level interface for drawing attractive statistical graphics

• Similar (in style) to the popular ggplot2 library in R

Link: https://seaborn.pydata.org/

• The main aim of Seaborn is to make visualization a central part of exploring
and understanding data. Its dataset-oriented plotting functions operate on
arrays and data frames containing whole datasets. The library is ideal for
examining relationships among multiple variables.

Highlights:
• Automatic estimation and plotting of linear regression models
• Convenient views of the overall structure of complex datasets
• High-level abstractions for structuring multi-plot grids, which ease building
complex visualizations
• Options for visualizing univariate or bivariate distributions
• Specialized support for categorical variables

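Two of the highlights above, regression plots and categorical variables, can be sketched as follows. The data frame is randomly generated for the example, and the "Agg" backend is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data: y depends roughly linearly on x, plus a categorical column.
rng = np.random.default_rng(0)
data = pd.DataFrame({"x": rng.uniform(0, 10, 60)})
data["y"] = 2.0 * data["x"] + rng.normal(scale=1.0, size=60)
data["group"] = np.where(data["x"] > 5, "high", "low")

# Scatter plot with an automatically fitted linear regression line.
ax_reg = sns.regplot(data=data, x="x", y="y")

# Categorical plot: one box per group, drawn on its own figure-level grid.
grid = sns.catplot(data=data, x="group", y="y", kind="box")
```

Both functions take the whole data frame plus column names, which is the dataset-oriented style described above.
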
Loading Python Libraries
In [ ]: #Import Python Libraries
import numpy as np
import scipy as sp
import pandas as pd
import matplotlib as mpl
import seaborn as sns

Press Shift+Enter to execute the Jupyter cell

Reading data using pandas
In [ ]: #Read csv file
df = pd.read_csv("http://rcs.bu.edu/examples/python/data_analysis/Salaries.csv")

Note: The above command has many optional arguments to fine-tune the data import process.

There are a number of pandas commands to read other data formats:

pd.read_excel('myfile.xlsx', sheet_name='Sheet1', index_col=None, na_values=['NA'])
pd.read_stata('myfile.dta')
pd.read_sas('myfile.sas7bdat')
pd.read_hdf('myfile.h5','df')
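
As a sketch of the optional arguments that fine-tune a CSV import, the example below reads a small made-up CSV from an in-memory string instead of a file or URL. The column names, separator and values are assumptions chosen for illustration.

```python
import io

import pandas as pd

# Made-up CSV text standing in for a file on disk.
csv_text = "name;salary;start\nAnn;100;2020-01-15\nBob;NA;2021-06-01\n"

df_example = pd.read_csv(
    io.StringIO(csv_text),
    sep=";",                 # column separator used in this file
    na_values=["NA"],        # strings to treat as missing values
    parse_dates=["start"],   # convert this column to datetime
    index_col="name",        # use this column as the row index
)
```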
Exploring data frames
In [3]: #List first 5 records
df.head()

Out[3]:
