13_Data Visualization

The document provides an overview of various Python libraries used for data visualization and analysis, including Matplotlib, NumPy, Pandas, SciPy, Scikit-learn, Seaborn, TensorFlow, Keras, and Statsmodels. It discusses their functionalities, advantages, and applications in data science, along with techniques for visualizing data such as box plots, histograms, heat maps, and scatter plots. Additionally, it highlights the importance of data visualization in understanding trends and patterns in large datasets.

Uploaded by

eeshasingh2501

Matplotlib

Matplotlib is responsible for plotting numerical data, which is why it is
widely used in data analysis. It is an open-source library that produces
high-quality figures such as pie charts, scatter plots, box plots, and line
graphs, among others.

NumPy

NumPy is one of the most widely used open-source Python packages,
focusing on mathematical and scientific computation. It has built-in
mathematical functions for convenient computation and supports large
matrices and multidimensional data. It can be used for various things,
including linear algebra and as an N-dimensional container for numerical
data. The NumPy array object (ndarray) defines an N-dimensional array; in
two dimensions it has rows and columns. Along with this, it can be used as
a random number generator.

In Python, NumPy arrays are recommended over lists for numerical work
because they use less memory, are faster, and are more convenient.
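The memory claim can be checked with a small sketch. The array length and the way the list is measured (the container plus the integer objects it points to) are choices made for this illustration, not part of the original notes.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_array = np.arange(n)

# A list stores pointers to separate int objects, so count both the
# list container and the objects it refers to.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)
# An ndarray stores the raw values in one contiguous buffer.
array_bytes = np_array.nbytes

print(f"list:  {list_bytes} bytes")
print(f"array: {array_bytes} bytes")

# Element-wise arithmetic needs no explicit loop with an array.
squares_list = [v * v for v in py_list]
squares_array = np_array * np_array
```

The exact byte counts depend on the Python build and the array's integer type, so only the comparison between the two numbers is meaningful.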

Images, sound waves, and other binary raw streams can be represented as
multidimensional arrays of real values using the NumPy interface for
visualization. Developers who want to use machine learning libraries should
be familiar with NumPy, since those libraries build on it.

Pandas

Pandas is an open-source library licensed under the Berkeley Software
Distribution (BSD) license. It is a well-known library that is widely used
in the domain of data science, mostly for the analysis, manipulation, and
cleaning of data. Pandas allows us to perform simple data modelling and
analysis without having to switch to another language such as R.
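A minimal sketch of the cleaning and analysis steps described above. The table, its column names, and the choice to fill the gap with the column mean are all hypothetical and only serve the illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical records with one missing score
df = pd.DataFrame({
    "course": ["C", "Python", "Python", "Java", "C"],
    "score": [62.0, 88.0, np.nan, 75.0, 70.0],
})

# Cleaning: count the missing values, then fill them with the column mean
n_missing = int(df["score"].isna().sum())
df["score"] = df["score"].fillna(df["score"].mean())

# Manipulation and analysis: average score per course
mean_by_course = df.groupby("course")["score"].mean()
print(n_missing)
print(mean_by_course)
```

Filling with the mean is only one possible strategy; dropping incomplete rows with `dropna()` is another common choice.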

SciPy

SciPy is an open-source Python library designed for scientific computing,
information processing, and high-level computing. It includes a large number
of user-friendly methods and functions for quick and convenient computation.
SciPy can be used for mathematical computations alongside NumPy.

cluster, fftpack, constants, integrate, io, linalg, interpolate, ndimage, odr,
optimize, signal, spatial, special, sparse, and stats are just a few of the
subpackages available in SciPy.
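As a brief illustration of two of these subpackages, the sketch below uses `integrate` and `optimize`. The functions being integrated and minimized are arbitrary examples chosen because their answers are known in advance.

```python
from scipy import integrate, optimize

# integrate: area under x**2 between 0 and 1 (the exact value is 1/3)
area, abs_error = integrate.quad(lambda x: x ** 2, 0.0, 1.0)

# optimize: minimum of a simple parabola, located at x = 2
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2 + 1.0)

print(area)
print(result.x)
```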

Scikit-learn

Scikit-learn is also an open-source machine learning library based on
Python. It supports both supervised and unsupervised learning. It ships
with many popular algorithms, is built on NumPy and SciPy, and works well
alongside Matplotlib. One frequently cited application of Scikit-learn is
music recommendation at Spotify.
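The difference between the two learning modes can be sketched on the iris dataset that ships with the library. The choice of `LogisticRegression` and `KMeans`, the split ratio, and the random seeds are assumptions made for this example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Supervised learning: fit a classifier on labelled data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

# Unsupervised learning: group the same samples without using the labels
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

print(accuracy)
```

The cluster numbers returned by `KMeans` are arbitrary identifiers, so they should not be compared directly with the class labels in `y`.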

Seaborn

Seaborn makes it possible to visualize statistical models. The library
is largely based on Matplotlib and enables the creation of statistical
graphics via:

• A dataset-oriented API for comparing variables.
• Easy creation of complex visualisations, including multi-plot grids.
• Univariate and bivariate visualisations for comparing subsets of data.
• A variety of colour palettes for revealing patterns.
• Automatic estimation and plotting of linear regression.
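The automatic regression plotting mentioned in the list can be sketched with `regplot`. The data here are synthetic and the column names `x` and `y` are hypothetical.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic data: y depends linearly on x, plus noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
df = pd.DataFrame({"x": x, "y": 2.0 * x + 1.0 + rng.normal(0, 2, 100)})

fig, ax = plt.subplots(figsize=(7, 3.5))
# regplot draws the scatter points and the fitted regression line
sns.regplot(data=df, x="x", y="y", ax=ax, color="purple")
ax.set_title("Linear regression fitted by seaborn")
```

In an interactive session, finish with `plt.show()` to display the figure.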

TensorFlow

TensorFlow is a high-performance, open-source library for numerical
computation. It is widely used for deep learning and other machine learning
algorithms. It was developed by researchers on the Google Brain team within
Google's AI organisation and is now widely used for complex mathematical
computations by researchers in mathematics, physics, and machine learning.

Keras

Keras is a Python-based open-source neural network library that makes it
possible for us to build and experiment with deep neural networks. As deep
learning becomes more common, Keras emerges as a viable option because,
according to its creators, it is an API (Application Programming Interface)
designed for humans, not machines. Keras has been widely adopted in both
the research community and industry. Before installing Keras, the user
should first install a backend engine such as TensorFlow.

Statsmodels

Statsmodels is a Python library that helps with the estimation and analysis
of statistical models. The library is also used to run statistical tests and
other statistical tasks, and it reports detailed results for each fitted
model.

Python itself offers a user-friendly interface. The language is widely
used in many real-world applications. Because it is a high-level,
dynamically typed language, it is also growing rapidly in areas such as
error debugging. Python is increasingly used in well-known applications
such as YouTube and Dropbox. Thanks to the availability of Python
libraries, users can also perform many tasks without having to write the
code themselves.

Data Visualization Techniques
Data visualization is a graphical representation of information and data. By
using visual elements like charts, graphs, and maps, data visualization tools provide
an accessible way to see and understand trends, outliers, and patterns in data. This
study on data visualization techniques will help you understand detailed techniques
and benefits.

In the world of Big Data, data visualization tools and technologies are essential to
analyse massive amounts of information and make data-driven decisions.

Advantages of data visualization

Data visualization is used in the following ways:

• It is a powerful way to explore data and produce presentable results.
• Its primary use is in the pre-processing portion of the data mining process.
• It supports the data cleaning process by finding incorrect and missing values.
• It supports variable derivation and selection, that is, determining which variables
to include in the analysis and which to discard.
• It also plays a role in combining categories as part of the data reduction process.

Disadvantages

While there are many advantages, some of the disadvantages may seem less obvious.
For example, when viewing a visualization with many different data points, it’s easy to
make an inaccurate assumption. Sometimes the visualization is simply designed
badly, so that it is biased or confusing.

Some other disadvantages include:

• Biased or inaccurate information.
• Correlation shown in a chart can be mistaken for causation.
• Core messages can get lost in translation.

Data visualization for One-dimensional (1-D)

import numpy as np
from matplotlib import pyplot as plt
plt.rcParams["figure.figsize"] = [7.00, 3.50]
plt.rcParams["figure.autolayout"] = True
y_value = 1
x = np.arange(10)
y = np.zeros_like(x) + y_value
plt.plot(x, y, ls='dotted', c='red', lw=5)
plt.show()

Data visualization for 2-D

import numpy as np
import matplotlib.pyplot as plt

image = np.random.rand(30, 30)

plt.imshow(image, cmap=plt.cm.hot)
plt.colorbar()
plt.show()

Data visualization for 3-D

We can easily plot 3-D figures in matplotlib. The example below draws a 3-D
scatter plot of random points.

from mpl_toolkits.mplot3d import axes3d  # registers the 3-D projection on older Matplotlib
import matplotlib.pyplot as plt
from matplotlib import style
import numpy as np

# setting a custom style to use
style.use('ggplot')

# create a new figure for plotting
fig = plt.figure()

# create a new subplot on our figure and set projection as 3d
ax1 = fig.add_subplot(111, projection='3d')

# defining x, y, z co-ordinates
x = np.random.randint(0, 10, size=20)
y = np.random.randint(0, 10, size=20)
z = np.random.randint(0, 10, size=20)

# plotting the points on subplot
ax1.scatter(x, y, z, c='m', marker='o')

# setting labels for the axes
ax1.set_xlabel('x-axis')
ax1.set_ylabel('y-axis')
ax1.set_zlabel('z-axis')

# function to show the plot
plt.show()

General Types of Visualizations:

• Chart: Information presented in a tabular or graphical form with data displayed along
two axes. Can be in the form of a graph, diagram, or map.
• Table: A set of figures displayed in rows and columns.
• Graph: A diagram of points, lines, segments, curves, or areas that represents certain
variables in comparison to each other, usually along two axes at a right angle.
• Geospatial: A visualization that shows data in map form using different shapes and
colors to show the relationship between pieces of data and specific locations.
• Infographic: A combination of visuals and words that represent data. Usually uses
charts or diagrams.
• Dashboards: A collection of visualizations and data displayed in one place to help
with analysing and presenting data.

Data Visualization Techniques

• Box plots
• Histograms
• Heat maps
• Charts
• Tree maps
• Kernel density estimates

Box Plots

A boxplot is a standardized way of displaying the
distribution of data based on a five-number summary (“minimum”, first quartile (Q1),
median, third quartile (Q3), and “maximum”). It can tell you about your outliers and
what their values are. It can also tell you if your data is symmetrical, how tightly your
data is grouped, and if and how your data is skewed.

A box plot is a graph that gives you a good indication of how the values in the data are
spread out. Although box plots may seem primitive in comparison to
a histogram or density plot, they have the advantage of taking up less space, which is
useful when comparing distributions between many groups or datasets. For some
distributions/datasets, you will find that you need more information than the measures
of central tendency (median, mean, and mode). You need to have information on the
variability or dispersion of the data.

# Import libraries
import matplotlib.pyplot as plt
import numpy as np

# Creating dataset
np.random.seed(10)
data = np.random.normal(100, 20, 200)

fig = plt.figure(figsize =(10, 7))

# Creating plot
plt.boxplot(data)

# show plot
plt.show()

Five Number Summary of Box Plot

Minimum: Q1 - 1.5*IQR
First quartile (Q1/25th percentile): the middle number between the smallest number
(not the “minimum”) and the median of the dataset
Median (Q2/50th percentile): the middle value of the dataset
Third quartile (Q3/75th percentile): the middle value between the median and the
highest value (not the “maximum”) of the dataset
Maximum: Q3 + 1.5*IQR
Interquartile range (IQR): the range from the 25th to the 75th percentile
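The quantities in this summary can be computed directly with NumPy. The sketch below reuses the synthetic dataset from the box plot example and takes the IQR as Q3 minus Q1; the variable names are chosen for this illustration.

```python
import numpy as np

# Same synthetic dataset as the box plot example
np.random.seed(10)
data = np.random.normal(100, 20, 200)

q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1

# Whisker limits as defined in the summary: points beyond them are outliers
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
outliers = data[(data < lower_limit) | (data > upper_limit)]

print(f"Q1={q1:.2f}, median={median:.2f}, Q3={q3:.2f}, IQR={iqr:.2f}")
print(f"limits=({lower_limit:.2f}, {upper_limit:.2f}), outliers={outliers.size}")
```

Note that Matplotlib draws each whisker at the most extreme data point that still lies inside these limits, so the plotted whisker ends are usually slightly inside the computed values.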

Histograms
A histogram is a graphical display of data using bars of different heights. In a
histogram, each bar groups numbers into ranges. Taller bars show that more data falls
in that range. A histogram displays the shape and spread of continuous sample data.
It is a plot that lets you discover, and show, the underlying frequency distribution
(shape) of a set of continuous data. This allows the inspection of the data for its
underlying distribution (e.g., normal distribution), outliers, skewness, etc. It
represents the distribution of numerical data and relates to only one variable. To
build one, the entire range of values is divided into a series of intervals, called
bins or buckets, and the number of values that fall into each interval is counted.

Bins are consecutive, non-overlapping intervals of a variable. As the adjacent bins
leave no gaps, the rectangles of a histogram touch each other to indicate that the
original variable is continuous.

# Import libraries
import matplotlib.pyplot as plt
import numpy as np
np.random.seed(256)
x = 10*np.random.rand(200,1)
y = (0.2 + 0.8*x) * np.sin(2*np.pi*x) + np.random.randn(200,1)
plt.hist(y, bins=20, color='purple')
plt.show()

Histograms are based on area, not height of bars

In a histogram, the height of the bar does not necessarily indicate how many
occurrences of scores there were within each bin. It is the area of the bar, the
height multiplied by the width of the bin, that indicates the frequency of
occurrences within that bin. Bar height is often read as the frequency because many
histograms use equally wide bins, and under these circumstances the height is
proportional to the frequency.
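The area interpretation is easiest to see with unequal bins. The sketch below assumes a density-normalised histogram (`density=True`), where each bar's area equals the fraction of samples in its bin; the bin edges are arbitrary values chosen for the illustration.

```python
import numpy as np

rng = np.random.default_rng(256)
values = rng.normal(0, 1, 1000)

# Deliberately unequal bin widths
edges = np.array([-5.0, -1.0, -0.5, 0.0, 0.5, 1.0, 5.0])
counts, _ = np.histogram(values, bins=edges)
density, _ = np.histogram(values, bins=edges, density=True)
widths = np.diff(edges)

# Area of each bar = height * width = fraction of samples in that bin
areas = density * widths
print(counts)
print(areas.sum())   # total area of a density histogram is 1
```

Passing the same `bins=edges` and `density=True` arguments to `plt.hist` draws the corresponding bars.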

Heat Maps
A heat map is a data visualization technique that uses colour the way a bar graph
uses height and width: to encode values.

If you’re looking at a web page and you want to know which areas get the most
attention, a heat map shows you in a visual way that’s easy to assimilate and make
decisions from. It is a graphical representation of data where the individual values
contained in a matrix are represented as colours. Heat maps are especially useful for
two purposes: visualizing correlation tables and visualizing missing values in the
data. In both cases, the information is conveyed in a two-dimensional table.

Note that heat maps are useful when examining a large number of values, but they
are not a replacement for more precise graphical displays, such as bar charts,
because colour differences cannot be perceived accurately.

# importing the modules
import numpy as np
import seaborn as sn
import matplotlib.pyplot as plt

# generating 2-D 10x10 matrix of random integers
# from 1 to 99 (the upper bound `high` is exclusive)
data = np.random.randint(low=1, high=100, size=(10, 10))
print("The data to be plotted:\n")
print(data)

# plotting the heatmap
hm = sn.heatmap(data=data)

# displaying the plotted heatmap
plt.show()
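For the correlation-table use case mentioned earlier, a sketch along the following lines may help. The three columns are synthetic and their names are hypothetical; `annot=True` prints each value in its cell, which partly offsets the imprecision of reading colours.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic table: b follows a closely, c is independent noise
rng = np.random.default_rng(1)
a = rng.normal(size=200)
df = pd.DataFrame({
    "a": a,
    "b": a + rng.normal(scale=0.3, size=200),
    "c": rng.normal(size=200),
})

corr = df.corr()  # pairwise Pearson correlations, values in [-1, 1]

fig, ax = plt.subplots(figsize=(5, 4))
# Fix the colour scale to [-1, 1] and print each value in its cell
sns.heatmap(corr, vmin=-1, vmax=1, cmap="coolwarm", annot=True, ax=ax)
ax.set_title("Correlation matrix")
```

In an interactive session, finish with `plt.show()` to display the figure.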

List of Charts to Visualize Data

• Bar Graph: It has rectangular bars whose lengths are proportional to the
values they represent.

import numpy as np
import matplotlib.pyplot as plt

# creating the dataset

data = {'C':20, 'C++':15, 'Java':30,
'Python':35}
courses = list(data.keys())
values = list(data.values())

fig = plt.figure(figsize = (10, 5))

# creating the bar plot

plt.bar(courses, values, color ='maroon',
width = 0.4)

plt.xlabel("Courses offered")
plt.ylabel("No. of students enrolled")
plt.title("Students enrolled in different courses")
plt.show()

• Area Chart: It combines the line chart and bar chart to show how the numeric
values of one or more groups change over the progression of a second variable.

import plotly.express as px

df = px.data.iris()

fig = px.area(df, x="sepal_width", y="sepal_length",

color="species",
hover_data=['petal_width'],)

fig.show()

• Line Graph: The data points are connected by straight line segments,
creating a representation of the changing trend.

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 1, 201)
y = np.sin((2*np.pi*x)**2)
plt.plot(x, y, 'purple')
plt.show()

• Pie Chart: It is a chart in which the components of a data set are presented as
slices of a pie, each representing its proportion of the entire data set.

import matplotlib.pyplot as plt

import numpy as np
y = np.array([35, 25, 25, 15])

plt.pie(y)
plt.show()

Scatter Charts

Another common visualization technique is the scatter plot, a two-dimensional plot
representing the joint variation of two data items. Each marker (a symbol such as a
dot, square, or plus sign) represents an observation. The marker position indicates
the values for that observation. When you assign more than two measures, a scatter
plot matrix is produced: a series of scatter plots displaying every possible pairing
of the measures that are assigned to the visualization. Scatter plots are used for
examining the relationship, or correlation, between X and Y variables.

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(256)
x = 10*np.random.rand(200,1)
y = (0.2 + 0.8*x) * np.sin(2*np.pi*x) + np.random.randn(200,1)
plt.scatter(x, y, color='purple')
plt.show()

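A scatter plot matrix for more than two measures can be sketched with `pandas.plotting.scatter_matrix`. The three measures below are synthetic and their names are hypothetical.

```python
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Three synthetic measures; "weight" is constructed to depend on "height"
rng = np.random.default_rng(256)
height = rng.normal(170, 10, 200)
df = pd.DataFrame({
    "height": height,
    "weight": 0.9 * height - 85 + rng.normal(0, 5, 200),
    "age": rng.uniform(18, 65, 200),
})

# One panel per pair of measures; the diagonal shows each measure's histogram
axes = scatter_matrix(df, figsize=(7, 7), diagonal="hist")
print(axes.shape)
```

The function returns a square grid of Matplotlib axes, one row and one column per measure, which can be displayed with `plt.show()` after importing `matplotlib.pyplot`.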
Tree Map

A treemap is a visualization that displays hierarchically organized data as a set of
nested rectangles, parent elements being tiled with their child elements. The sizes and
colours of rectangles correspond to the values of the data points they represent.
A leaf node rectangle has an area proportional to the specified dimension of the data.
Depending on the choice, the leaf node is coloured, sized or both according to chosen
attributes. Treemaps make efficient use of space and can thus display thousands of
items on the screen simultaneously.

# notebook command: install squarify quietly
!pip install squarify -qqq

import squarify
import matplotlib.pyplot as plt

labels = ['nepal', 'america', 'india']
sizes = [2, 3, 4]
colors = ['red', 'blue', 'red']

squarify.plot(sizes=sizes,
              label=labels,
              color=colors,
              alpha=.7,
              bar_kwargs=dict(linewidth=1, edgecolor="#222222"))
plt.show()

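To make the "area proportional to value" idea concrete, the sketch below lays out rectangles by hand. It uses a simple slice layout (vertical strips), which is not the squarified algorithm that squarify implements; the function name `slice_layout` and the 100 by 100 canvas are hypothetical choices for this illustration.

```python
def slice_layout(sizes, x=0.0, y=0.0, width=100.0, height=100.0):
    """Split one rectangle into vertical strips with areas proportional to sizes."""
    total = float(sum(sizes))
    rects = []
    left = x
    for size in sizes:
        strip_width = width * size / total
        rects.append({"x": left, "y": y, "dx": strip_width, "dy": height})
        left += strip_width
    return rects


sizes = [2, 3, 4]   # same values as the squarify example
rects = slice_layout(sizes)
areas = [r["dx"] * r["dy"] for r in rects]
for size, area in zip(sizes, areas):
    print(size, round(area, 1))
```

A squarified layout keeps the same area rule but also tries to make the rectangles close to square, which makes them easier to compare by eye.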
Kernel density estimate (KDE) plot

A kernel density estimate (KDE) plot is a method for visualizing the
distribution of observations in a dataset, analogous to a
histogram. KDE represents the data using a continuous
probability density curve in one or more dimensions.

# importing the libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import KernelDensity
from sklearn.model_selection import GridSearchCV

def generate_data(seed=17):
    # Fix the seed to reproduce the results
    rand = np.random.RandomState(seed)
    x = []
    # first component: 1000 log-normal samples
    dat = rand.lognormal(0, 0.3, 1000)
    x = np.concatenate((x, dat))
    # second component: 1000 normal samples centred at 3
    dat = rand.normal(3, 1, 1000)
    x = np.concatenate((x, dat))
    return x

# reshape to a column vector, the layout scikit-learn expects
x_train = generate_data()[:, np.newaxis]

fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(10, 5))
plt.subplot(121)
plt.scatter(np.arange(len(x_train)), x_train, c='red')
plt.xlabel('Sample no.')
plt.ylabel('Value')
plt.title('Scatter plot')
plt.subplot(122)
plt.hist(x_train, bins=50)
plt.show()
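The two scikit-learn imports suggest a density-fitting step. One way it could look is sketched below; the smaller sample, the Gaussian kernel, the bandwidth grid, and the 5-fold cross-validation are assumptions made for this illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KernelDensity

# Smaller bimodal sample in the spirit of generate_data()
rand = np.random.RandomState(17)
x = np.concatenate((rand.lognormal(0, 0.3, 200), rand.normal(3, 1, 200)))
x_train = x[:, np.newaxis]

# Choose the bandwidth by 5-fold cross-validated log-likelihood
bandwidths = np.linspace(0.1, 1.0, 10)
search = GridSearchCV(KernelDensity(kernel="gaussian"),
                      {"bandwidth": bandwidths}, cv=5)
search.fit(x_train)
kde = search.best_estimator_

# score_samples returns log-density, so exponentiate to get the density curve
grid = np.linspace(-3.0, 10.0, 1000)
density = np.exp(kde.score_samples(grid[:, np.newaxis]))

dx = grid[1] - grid[0]
print(search.best_params_["bandwidth"])
print((density * dx).sum())   # numerical integral of the curve, close to 1
```

Plotting `grid` against `density` with `plt.plot` gives the smooth KDE curve, which can be overlaid on a histogram drawn with `density=True` for comparison.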
