
DV - Unit 2

The document discusses various types of graphs and charts that can be created in R including pie charts, bar charts, histograms, line charts, and scatter plots. It provides the syntax and parameters for generating each type of graph using functions like pie(), barplot(), hist(), plot(), and explains how to customize aspects like colors, labels, titles. Examples are given for each graph type.

Uploaded by

Narenkumar. N

Data visualization

Data visualization: working with R graphics

Pie chart:
In R, a pie chart is created with the pie() function, which takes a vector of
non-negative numbers as input. Additional parameters control the labels,
colors, title, and so on.

Syntax
pie(x, labels, radius, main, col, clockwise)
Parameter description:

x is a vector containing the numeric values used in the pie chart.
labels gives a description for each slice.
radius is the radius of the circle of the pie chart (a value between −1 and +1).
main is the title of the chart.
col is the color palette used for the slices.
clockwise is a logical value indicating whether the slices are drawn clockwise or
anticlockwise.
# Create a pie chart
x <- c(60, 70, 80, 55)
labels <- c("product", "sales", "advertise", "mark")
pie(x, labels)

# Add a title with the main parameter
pie(x, labels, main = "Departments")

# Set the slice colors with the col parameter
colour <- c("pink", "blue", "yellow", "white")

# init.angle = 90 starts the first slice at 90 degrees
pie(x, labels, init.angle = 90, main = "Departments", col = colour)
Legend function:
To add a key that explains each slice, use the legend() function.

# Add a legend; the fill colors should match the col vector passed to pie()
legend("bottomright", c("product", "sales", "advertise", "mark"), cex = 0.7,
       fill = colour)
Bar chart:

• A bar chart is a pictorial representation in which the numerical values of
variables are represented by the length or height of lines or rectangles of equal
width. A bar chart is used to summarize a set of categorical data.

• R uses the barplot() function to create bar charts. R can draw both
vertical and horizontal bars, and each bar can be given a different color.
Syntax

barplot(H, xlab, ylab, main, names.arg, col)

The parameters are described below:

• H is a vector or matrix containing the numeric values used in the bar chart.
• xlab is the label for the x-axis.
• ylab is the label for the y-axis.
• main is the title of the bar chart.
• names.arg is a vector of names appearing under each bar.
• col gives colors to the bars in the graph.
# Create a bar chart
a <- c(15, 30, 45, 60)   # bar heights
barplot(a)
• The vector b holds the category names shown along the x-axis (A, B, C, D).
• The vector a holds the bar heights measured on the y-axis (15, 30, 45, 60).
• The barplot() function then draws a bar chart of the values.
• names.arg sets the name shown under each bar.
a <- c(15, 30, 45, 60)        # bar heights (y-axis)
b <- c("A", "B", "C", "D")    # bar names (x-axis), passed as names.arg
barplot(a, names.arg = b)

# Bar chart with axis labels, a title, a fill color and a border color
barplot(a, names.arg = b, xlab = "letters", ylab = "value",
        col = "blue", main = "bar chart", border = "orange")
# Create the input vectors.
colors <- c("pink", "white", "blue")
months <- c("Mar", "Apr", "May", "Jun", "Jul")
regions <- c("one", "two", "three")

# Create the matrix of the values (one row per region, one column per month).
Values <- matrix(c(2,9,3,11,9,4,8,7,3,12,5,2,8,10,11), nrow = 3, ncol = 5, byrow = TRUE)

# Create the bar chart (each bar stacks the three regions for one month).
barplot(Values, main = "bar chart", names.arg = months,
        xlab = "month", ylab = "value", col = colors)

# Add the legend to the chart
legend("topleft", regions, cex = 0.3, fill = colors)
Histogram:

• A histogram is a type of bar chart that shows how many values fall into each
of a set of value ranges. It is used to show the distribution of a variable,
whereas a bar chart is used to compare different entities. In a histogram,
the height of each bar represents the number of values present in the given
range.

• R creates histograms using the hist() function. This function takes a vector
as input and accepts further parameters that control the plot.
Syntax:
hist(v, main, xlab, xlim, ylim, breaks, col, border)

Description of the parameters:

• v is a vector containing the numeric values used in the histogram.
• main is the title of the chart.
• col sets the color of the bars.
• border sets the border color of each bar.
• xlab is the label for the x-axis.
• xlim specifies the range of values on the x-axis.
• ylim specifies the range of values on the y-axis.
• breaks controls the bins: either a suggested number of bins or a vector of
break points between bins. It therefore determines the width of each bar.
# Create a histogram
a <- c(45, 24, 23, 15, 18, 30, 44, 18, 16, 20)
hist(a, xlab = "value", ylab = "order")

# Histogram with a title, a bar color and a border color
hist(a, xlab = "value", ylab = "order", main = "histogram", col = "orange",
     border = "green")
• Range of X and Y values

To specify the range of values shown on the x-axis and y-axis, use the
xlim and ylim parameters.
The number of bins, and hence the width of each bar, is controlled with breaks.

# Using the xlim, ylim and breaks parameters
# (a single number for breaks is only a suggestion; R picks rounded break points)
hist(a, xlab = "value", ylab = "point", main = "histogram", col = "orange",
     border = "green", xlim = c(0, 40), ylim = c(0, 5), breaks = 3)
Example of Histogram:
• A histogram is an area diagram. It shows the frequency of each class
interval. In the context of histograms, the class intervals (ranges of values)
are known as bins or classes. Each bar indicates the number of data points
within a specific class, so the higher the frequency of a class, the taller
its bar.

The frequency table below gives the heights of trees in a region. A histogram
can be drawn directly from it.

Height of trees (ft)    No. of trees
60-65                   3
65-70                   3
70-75                   8
75-80                   10
80-85                   5
85-90                   2

• Here the heights of the trees are continuous data, the class intervals are
the bins, and the number of trees in each interval is the frequency.
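The frequency table above can be turned into a rough text histogram. The following is a minimal sketch in Python (the language used for the Matplotlib part of this document), not the author's method; the names tree_heights and text_histogram are hypothetical and introduced only for illustration.

```python
# Frequency table from the text: (lower bound, upper bound, number of trees)
tree_heights = [
    (60, 65, 3),
    (65, 70, 3),
    (70, 75, 8),
    (75, 80, 10),
    (80, 85, 5),
    (85, 90, 2),
]


def text_histogram(table):
    """Return one text row per class interval, with one '#' per tree."""
    return [f"{lo}-{hi} ft | {'#' * count} ({count})" for lo, hi, count in table]


# Total number of observations and the class with the highest frequency
total_trees = sum(count for _, _, count in tree_heights)
modal_class = max(tree_heights, key=lambda row: row[2])

for row in text_histogram(tree_heights):
    print(row)
print("total trees:", total_trees)
print("most frequent class:", f"{modal_class[0]}-{modal_class[1]} ft")
```

The longest row corresponds to the tallest bar of the histogram, here the 75-80 ft class.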
Histogram:
• Histograms vs Bar Charts
• In a bar graph, each bar represents one value or category. In a histogram,
each bar represents a range of continuous data.

• In a bar graph, the x-axis need not be numerical; it can also be a category.
In a histogram, the x-axis is always quantitative and continuous.

• Because of this, a histogram can reveal a pattern in the data, such as a
tendency for values to fall toward the low end or the high end. A bar chart
cannot show this.
Line chart:

• A line chart is a graph that connects a series of points by drawing line
segments between them. The points are ordered by one of their coordinates
(usually the x-coordinate). Line charts are usually used to identify trends
in data.
• The plot() function in R is used to create a line graph.
• Syntax:

plot(v, type, col, xlab, ylab, main)
Description of the parameters

• v is a vector containing the numeric values.
• type takes the value "p" to draw only the points, "l" to draw only the lines
and "o" to draw both points and lines.
• xlab is the label for the x-axis.
• ylab is the label for the y-axis.
• main is the title of the chart.
• col gives colors to both the points and lines.
Create line chart:
# Create a line chart with both points and lines
x <- c(12, 14, 16, 18, 15, 29, 23, 34)
plot(x, type = "o")
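Different types of line charts:

x <- c(12, 14, 16, 18, 15, 29, 23, 34)
plot(x, type = "p")   # points only
plot(x, type = "l")   # lines only
plot(x, type = "o")   # points and lines, overplotted
plot(x, type = "b")   # both points and lines, with gaps around the points
plot(x, type = "c")   # only the line segments of type "b"
plot(x, type = "h")   # histogram-like vertical lines
plot(x, type = "s")   # stair steps (horizontal first)
plot(x, type = "S")   # stair steps (vertical first)
plot(x, type = "n")   # no plotting, only the axes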
# Line chart with axis labels, a color and a title
# (border is not a parameter of plot(), so it is omitted here)
plot(x, type = "o", xlab = "points", ylab = "value",
     col = "pink", main = "Line chart")
Scatter plot
A scatter plot is a type of plot used to display the relationship between two
numerical variables; it plots one dot for each observation.
• Each point represents the values of two variables. One variable is plotted
on the horizontal axis and the other on the vertical axis.

• A simple scatter plot is created using the plot() function.

Syntax
plot(x, y, main, xlab, ylab, xlim, ylim, axes)
Description of the parameters:

• x is the data set whose values are the horizontal coordinates.
• y is the data set whose values are the vertical coordinates.
• main is the title of the graph.
• xlab is the label for the horizontal axis.
• ylab is the label for the vertical axis.
• xlim gives the limits of the x values used for plotting.
• ylim gives the limits of the y values used for plotting.
• axes indicates whether both axes should be drawn on the plot.
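Loading a dataset from a CSV file:
library(data.table)   # fread() comes from the data.table package
data <- fread("C://Users/Admin/Downloads/archive/cardio.csv")
data

# Keep only the Age and Usage columns
file <- data[, c("Age", "Usage")]
head(file)

plot(x = file$Age, y = file$Usage, main = "Scatter plot", xlab = "Age",
     ylab = "Usage", col = "pink")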
(or)
# Create a dataset for the scatter plot

data <- data.frame(weight = c(3, 5, 4, 2, 2, 5),
                   mileage = c(15, 30, 45, 60, 75, 80))
data
Output:
  weight mileage
1      3      15
2      5      30
3      4      45
4      2      60
5      2      75
6      5      80
# Plot the dataset: the first column (weight) goes on the x-axis
# and the second column (mileage) on the y-axis
plot(data, xlab = "weight", ylab = "mileage", main = "scatterplot", col = "red")
The different point symbols commonly used in R
• pch = 0, square
• pch = 1, circle
• pch = 2, triangle point up
• pch = 3, plus
• pch = 4, cross
• pch = 5, diamond
• pch = 6, triangle point down
• pch = 7, square cross
• pch = 8, star
• pch = 9, diamond plus
• pch = 10, circle plus
• pch = 11, triangles up and down
• pch = 12, square plus
• pch = 13, circle cross
• pch = 14, square and triangle down
• pch = 15, filled square
• pch = 16, filled circle
• pch = 17, filled triangle point up
• pch = 18, filled diamond
• pch = 19, solid circle
• pch = 20, bullet (smaller circle)
• pch = 21, filled circle (fill color set with bg)
• pch = 22, filled square (fill color set with bg)
• pch = 23, filled diamond (fill color set with bg)
• pch = 24, filled triangle point up (fill color set with bg)
• pch = 25, filled triangle point down (fill color set with bg)
# pch = 2 (triangle point up)
plot(data, xlab = "weight", ylab = "mileage", main = "scatterplot",
     col = "red", pch = 2)
# pch = 18 (filled diamond)
# plot(data, xlab = "weight", ylab = "mileage", main = "scatterplot",
#      col = "red", pch = 18)
# Apply limits to the x and y axes
plot(data, xlab = "weight", ylab = "mileage", main = "scatterplot",
     col = "red", xlim = c(3, 5), ylim = c(30, 60))
Scatterplot Matrices

• When we have more than two variables and want to see how each variable
relates to the remaining ones, we use a scatterplot matrix.
• We use the pairs() function to create matrices of scatterplots.

• Syntax
pairs(formula, data)

formula lists the variables to be plotted in pairs.
data is the data set from which the variables are taken.
input <- data.frame(weight = c(3, 5, 4, 2, 2, 5),
                    mileage = c(15, 30, 45, 60, 75, 80),
                    cyl = c(12, 14, 8, 23, 45, 60),
                    km = c(20, 30, 45, 35, 40, 48))
input

Output:
  weight mileage cyl km
1      3      15  12 20
2      5      30  14 30
3      4      45   8 45
4      2      60  23 35
5      2      75  45 40
6      5      80  60 48

# Scatterplot matrix of every pair of variables
pairs(~weight + mileage + cyl + km, data = input)
# Line graph using the data set
plot(input$cyl, input$km, type = "l", xlab = "cycle", ylab = "kilometer",
     main = "Graph", col = "blue")
# Bar chart using the data set
x <- input$cyl
y <- input$km
barplot(x, names.arg = y, xlab = "first", ylab = "second",
        col = "red", border = "green", main = "barplot")
ggplot2 package:

R allows us to create graphics declaratively, and the ggplot2 package exists
for this purpose. The package is known for its elegant, high-quality graphs,
which set it apart from other visualization packages.

• Always start by calling the ggplot() function.
• Then specify the data object. It has to be a data frame. For a bar chart, it
needs one numeric and one categorical variable.
• Then set the aesthetics in the aes() function: use the categorical
variable for the x-axis and the numeric variable for the y-axis.
Installation:

install.packages("<package-name>")
install.packages("ggplot2")

library("ggplot2")

qplot() is a ggplot2 function for creating quick plots (recent ggplot2
releases mark it as deprecated in favor of ggplot()):

# Bar graph using ggplot2:
# qplot() is a function from the ggplot2 library
# (names.arg belongs to barplot(), not qplot(), so it is omitted here)
qplot(x = input$cyl,
      geom = "bar",
      xlab = "vehicle",
      ylab = "distance",
      colour = I("green"),   # I() sets a fixed outline color
      main = "ggplot graph")
# Histogram using ggplot2:
qplot(input$mileage, geom = "histogram", xlab = "vehicle", ylab = "distance",
      fill = I("red"), main = "ggplot graph")
The Jupyter Notebook is an open-source web application that allows you
to create and share documents that contain live code, equations,
visualizations and narrative text.
Uses include data cleaning and transformation, numerical simulation,
statistical modeling, data visualization, machine learning, and much more.

Matplotlib is one of the most popular Python packages used for data
visualization. It is a cross-platform library for making 2D plots from data in
arrays.
• Matplotlib

Matplotlib is a low-level Python library used for data visualization. It is
easy to use and emulates MATLAB-like graphs and visualizations.
This library is built on top of NumPy arrays and offers several plot types,
such as line charts, bar charts and histograms. It provides a lot of
flexibility, but at the cost of writing more code.

We will use the pip command to install this module.

• You can start plotting with the plot() function. When you are done,
remember to display your plot using the show() function. Matplotlib
is written in Python and makes use of NumPy, the numerical
mathematics extension of Python.

It supports several plot types:

• Line
• Bar
• Scatter
• Histogram
• And many more
Installation
Matplotlib can be installed using the Python package manager, pip. To
install it, open a terminal window and type:

pip install matplotlib

# importing the pyplot module of matplotlib
from matplotlib import pyplot as plt
• Pyplot
Pyplot is a Matplotlib module that provides a MATLAB-like interface.
Matplotlib is designed to be as usable as MATLAB.

Each pyplot function makes some change to a figure: e.g., creates a figure,
creates a plotting area in a figure, plots some lines in a plotting area,
decorates the plot with labels, etc.
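The steps listed above can be sketched one pyplot call at a time. This is a minimal illustration rather than a required workflow; the "Agg" backend line is an assumption added so the sketch runs without a display, and the data values are made up.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt

fig = plt.figure()                      # creates a figure
ax = plt.axes()                         # creates a plotting area in that figure
lines = plt.plot([1, 2, 3], [2, 4, 6])  # plots a line in the plotting area
plt.xlabel("x")                         # decorates the plot with labels
plt.ylabel("y")
plt.title("demo")

# pyplot keeps track of the "current" axes, so every call above
# acted on the same plotting area.
print(plt.gca() is ax)
```

Because pyplot remembers the current figure and axes, the calls do not need to be told which plot to modify.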
Creating a plot using matplotlib:
import matplotlib.pyplot as plt

x = [12, 15, 18, 20, 23]
y = [21, 24, 26, 28, 30]

plt.plot(x, y)
plt.show()   # show is a function and must be called with parentheses
• Adding Title
The title() function in the matplotlib.pyplot module sets the title of the
visualization.

import matplotlib.pyplot as plt

x = [12, 15, 18, 20, 23]
y = [21, 24, 26, 28, 30]

plt.plot(x, y)
plt.title("graph")
plt.show()
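Adding font size, color, labels, ticks, legend and grid:

import matplotlib.pyplot as plt

x = [12, 15, 18, 20, 23]
y = [21, 24, 26, 28, 30]

plt.plot(x, y)
# plt.title("graph", fontsize=50, color="red")
plt.xlabel("x axis")
plt.ylabel("y axis")
# limit the range of the y-axis
plt.ylim(24, 28)
# set the tick label names on the x-axis
plt.xticks(x, labels=["a", "b", "c", "d", "e"])
# legend
plt.legend(["ABC"])
# grid lines along each axis
plt.grid(axis="x")
plt.grid(axis="y")
plt.show()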
• Creating a bar plot

The Matplotlib API provides the bar() function, which can be used in the
MATLAB-style pyplot interface or through the object-oriented API. The syntax
of the bar() function is as follows:

plt.bar(x, height, width, bottom, align)

import matplotlib.pyplot as plt
import numpy as np

x = np.array(["DT", "DV", "CLOUD", "PYTHON"])
y = np.array([12, 14, 16, 18])
fig = plt.figure(figsize=(8, 4))
plt.bar(x, y, width=0.5, color="pink")

plt.xlabel("Subject")
plt.ylabel("Duration")
plt.title("bar chart")
plt.show()
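The text mentions that bar() can also be used through the object-oriented API. A minimal sketch of the same chart in that style follows; the "Agg" backend and the in-memory buffer are assumptions made so the sketch runs without a display and without writing a file, and the variable names are illustrative only.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt

subjects = ["DT", "DV", "CLOUD", "PYTHON"]
durations = [12, 14, 16, 18]

# plt.subplots() returns the figure and its axes; the axes object
# carries the plotting methods instead of the pyplot module.
fig, ax = plt.subplots(figsize=(8, 4))
bars = ax.bar(subjects, durations, width=0.5, color="pink")
ax.set_xlabel("Subject")
ax.set_ylabel("Duration")
ax.set_title("bar chart")

# Render the figure to PNG bytes in memory instead of showing it.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

Here ax.set_xlabel(), ax.set_ylabel() and ax.set_title() play the role of plt.xlabel(), plt.ylabel() and plt.title() in the pyplot version.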
Histogram:
To create a histogram, first divide the whole range of the values into a
series of intervals (bins), and then count the values that fall into each
interval.
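The binning step described above can be sketched in plain Python. This is an illustration under stated assumptions, not how Matplotlib implements it internally: the function name count_into_bins and the sample ages are hypothetical, and the bins are assumed to be equal-width with the last bin including the maximum value.

```python
def count_into_bins(values, num_bins):
    """Split [min, max] into num_bins equal-width intervals and count values per bin."""
    low, high = min(values), max(values)
    if low == high:
        raise ValueError("values must span a non-zero range")
    width = (high - low) / num_bins
    edges = [low + i * width for i in range(num_bins + 1)]
    counts = [0] * num_bins
    for value in values:
        # The last bin is closed on the right so that the maximum is counted.
        index = min(int((value - low) / width), num_bins - 1)
        counts[index] += 1
    return edges, counts


ages = [18, 21, 22, 25, 29, 30, 34, 38]  # hypothetical sample data
edges, counts = count_into_bins(ages, 4)
print(edges)   # bin boundaries
print(counts)  # number of values in each bin
```

Each entry of counts becomes the height of one histogram bar, and consecutive entries of edges give the left and right boundary of that bar.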

The following list shows parameters accepted by the matplotlib.pyplot.hist()
function:
• Customization available for the histogram:
• bins: number of equal-width bins
• color: the face color of the bars
• edgecolor: the color of the bar edges
• linestyle: the line style of the bar edges
• alpha: blending value, between 0 (transparent) and 1 (opaque)
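The customization parameters above can be combined in a single hist() call. The following is a small sketch with made-up data; the "Agg" backend is an assumption so that it runs without a display, and the ages list is hypothetical rather than taken from the CSV file used in this document.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt

ages = [18, 21, 22, 25, 29, 30, 34, 38]  # hypothetical sample data

fig, ax = plt.subplots(figsize=(10, 5))
counts, edges, patches = ax.hist(
    ages,
    bins=4,             # number of equal-width bins
    color="orange",     # face color of the bars
    edgecolor="green",  # color of the bar edges
    linestyle="--",     # dashed edge lines
    alpha=0.7,          # slightly transparent bars
)
ax.set_xlabel("age")
ax.set_ylabel("frequency")
ax.set_title("Histogram")

print(counts)  # number of values in each bin
print(edges)   # bin boundaries
```

hist() returns the bin counts, the bin edges and the drawn bars, which is useful for checking that the bins match what was intended.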
• Example:
# Load a CSV file and create a histogram using matplotlib
import matplotlib.pyplot as plt
import pandas as pd

data = pd.read_csv("C://Users/Admin/Downloads/archive/cardio.csv")
print(data.head())   # inspect the first rows

x = data["Age"]
fig = plt.figure(figsize=(10, 5))
plt.hist(x, rwidth=0.5)   # rwidth: bar width as a fraction of the bin width
plt.xlabel("order")
plt.ylabel("frequency")
plt.title("Histogram")
plt.show()
Thank
You!
