02 Visualize Slides

This document provides an overview of visualizing data in R using ggplot2. It introduces the core concepts of initializing plots with ggplot(), adding layers of different geom types using functions like geom_point() to visualize the data, and mapping different aesthetic properties like color, size, and shape to variables in the data. It also demonstrates how to add labels, titles, facets, and customize the overall visualization. The goal is to illustrate the modular and flexible grammar of graphics approach in ggplot2 to create informative data visualizations.

Uploaded by

andy kc
Visualize Data with

Slides CC BY-SA RStudio


(Applied) Data Science

[Figure: the data science workflow, with the stages Import, Tidy, Transform, Visualize, Model, Communicate, and Program]

"The simple graph has brought more
information to the data analyst’s
mind than any other device."

- John Tukey
mpg
Fuel economy data for 38 models of car.

Quiz

What relationship do you expect to see between engine size (displ) and mileage (hwy)?

No peeking ahead!

Your Turn 1
Run this code in 02-Visualize-Exercises.Rmd to make a graph. Pay strict attention to spelling, capitalization, and parentheses!

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

1. "Initialize" a plot with ggplot()
2. Add layers with geom_ functions

Pro tip: Always put the + at the end of a line, never at the start.

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

• mpg: data
• +: before new line
• geom_point: type of layer
• aes(): wraps the mappings
• displ: x variable
• hwy: y variable

A template

ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))

Mappings

"The greatest value of a picture is
when it forces us to notice what we
never expected to see."

- John Tukey
How can we test the theory?

Why do these cars get better mileage?

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))
Aesthetics

Visual Space    Data Space
color           class
Red             2seater
Brown           compact
Green           midsize
Aqua            minivan
Blue            pickup
Violet          subcompact
Pink            suv

Aesthetics
Inside aes(), write <aesthetic property> = <variable to map it to>:

ggplot(mpg) + geom_point(aes(x = displ, y = hwy, color = class))
ggplot(mpg) + geom_point(aes(x = displ, y = hwy, size = class))
ggplot(mpg) + geom_point(aes(x = displ, y = hwy, shape = class))
ggplot(mpg) + geom_point(aes(x = displ, y = hwy, alpha = class))

Your Turn 2
In the next chunk, add color, size, alpha, and shape aesthetics to your graph. Experiment.
Do different things happen when you map aesthetics to discrete and continuous variables?
What happens when you use more than one aesthetic?

[Figure: how the Color, Size, and Shape aesthetics look for Discrete versus Continuous variables]

Legend added automatically

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class))

Error! Here color = class sits outside aes(), so it is no longer a mapping to the class variable.

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = class)
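
The difference between mapping an aesthetic and setting it can be sketched as follows. This is an illustration rather than part of the original slides: it assumes only ggplot2 and its built-in mpg data, and the fixed color "blue" is an arbitrary choice. Inside aes(), color is mapped to a variable; outside aes(), color must be a fixed value.

```r
library(ggplot2)

# Mapping: color varies with the class variable, and a legend is added.
p_mapped <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class))

# Setting: every point gets the same fixed color, so there is nothing
# to show in a legend. "blue" is an arbitrary illustrative value.
p_set <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

# layer_data() returns the values ggplot2 computed for a layer.
set_colors    <- unique(layer_data(p_set)$colour)
mapped_colors <- unique(layer_data(p_mapped)$colour)

set_colors             # the single fixed color
length(mapped_colors)  # one color per class
```

Printing p_mapped and p_set side by side should show the same points, colored by class in the first plot and uniformly in the second.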
Facets
Subplots that display subsets of the data.
[Figure: hwy versus displ, drawn as one panel per class: 2seater, compact, midsize, minivan, pickup, subcompact, suv]
Help me
What do facet_grid and facet_wrap do?

q <- ggplot(mpg) + geom_point(aes(x = displ, y = hwy))

q + facet_grid(cols = vars(cyl))
q + facet_grid(rows = vars(drv))
q + facet_grid(rows = vars(drv), cols = vars(cyl))
q + facet_wrap(facets = vars(class))

Summary

facet_grid() - 2D grid, one variable in rows, one variable in columns
facet_wrap() - 1D ribbon wrapped into 2D

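The two layouts can be compared by counting the panels each one produces. The sketch below is an illustration, not part of the original slides; it assumes the built-in mpg data, and ncol = 4 is an arbitrary choice for wrapping.

```r
library(ggplot2)

q <- ggplot(mpg) + geom_point(aes(x = displ, y = hwy))

# facet_grid(): one panel for every combination of drv (rows) and
# cyl (columns), including combinations that contain no cars.
p_grid <- q + facet_grid(rows = vars(drv), cols = vars(cyl))

# facet_wrap(): one panel per class, laid out as a ribbon that wraps
# onto a new row after ncol panels (ncol = 4 is illustrative).
p_wrap <- q + facet_wrap(facets = vars(class), ncol = 4)

# The built plot records one row per panel in its layout table.
n_grid <- nrow(ggplot_build(p_grid)$layout$layout)
n_wrap <- nrow(ggplot_build(p_wrap)$layout$layout)

n_grid  # number of drv values times number of cyl values
n_wrap  # number of class values
```
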
A ggplot2 template
Make any plot by filling in the parameters of this template

ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>)) +
  <FACET_FUNCTION>

Your Turn 3
Add the black code to your graph. What does it do?

ggplot(data = mpg) +
  geom_point(mapping = aes(displ, hwy, color = class)) +
  labs(title = "Fuel Efficiency by Engine Size",
       subtitle = "Data facetted by class",
       x = "Engine Size (displacement in liters)",
       y = "Fuel Efficiency (MPG)",
       color = "Class of\nAutomobile",
       caption = "Data from the EPA")

[Figure: the resulting plot, annotated to show where the title, subtitle, color legend title, caption, and x label appear]

Geoms

How are these plots similar?
Same: x var, y var, data

How are these plots different?
Different: geometric object (geom), i.e. the visual object used to represent the data

geoms

ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))

https://posit.co/resources/cheatsheets/
Click Cheat Sheets in the LEARN & SUPPORT tab.

geom_ functions
Each requires a mapping argument.

Your Turn 4
Decide how to replace this scatterplot with one that draws boxplots. Use the cheatsheet. Try your best guess.

[Figure: scatterplot of hwy versus class]

ggplot(mpg) + geom_point(aes(class, hwy))

ggplot(data = mpg) +
  geom_boxplot(mapping = aes(x = class, y = hwy))

Your Turn 5
Make the histogram of hwy below. Use the cheatsheet. Hint: do not supply a y variable.

[Figure: histogram of hwy, with count on the y axis]

ggplot(data = mpg) +
  geom_histogram(mapping = aes(x = hwy))

Quiz
What is the difference?

[Figure: two histograms of hwy, each with count on the y axis]

"Help" pages
To open the documentation for a function, type ? followed by the function name (no parentheses):

?geom_histogram

Tips
• scan page for relevant info
• ignore things that don't make sense
• try out the examples

Your Turn 6
Use the help page for geom_histogram to make the bins 2 mpg wide.

[Figure: histogram of hwy, with count on the y axis]

ggplot(data = mpg) +
  geom_histogram(mapping = aes(x = hwy), binwidth = 2)
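
One way to confirm what binwidth does is to look at the bins ggplot2 actually computed. This is a small sketch rather than part of the original slides; it assumes the built-in mpg data and uses layer_data(), which returns one row per bin for a histogram layer.

```r
library(ggplot2)

p <- ggplot(data = mpg) +
  geom_histogram(mapping = aes(x = hwy), binwidth = 2)

# One row per bin, with the bin edges in xmin and xmax and the
# number of cars in count.
bins <- layer_data(p)

# Every bin should span exactly binwidth units of hwy.
bin_widths <- bins$xmax - bins$xmin
unique(round(bin_widths, 6))

# The bin counts add up to the number of rows in mpg.
total_count <- sum(bins$count)
total_count == nrow(mpg)
```
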
ggplot2.tidyverse.org
Your Turn 7
Make the bar chart of class below. Use the cheatsheet. Hint: do not supply a y variable.

[Figure: bar chart of count by class, with the bars colored by class and a class legend]

color colors only the outline of each bar:

ggplot(data = mpg) +
  geom_bar(mapping = aes(x = class, color = class))

fill colors the inside of each bar:

ggplot(data = mpg) +
  geom_bar(mapping = aes(x = class, fill = class))

[Figure: bar chart of count by class, with each bar stacked and filled by drv (4, f, r)]

ggplot(data = mpg) +
  geom_bar(mapping = aes(x = class, fill = drv))
Quiz
What will this code do?

ggplot(mpg) +
  geom_point(aes(displ, hwy)) +
  geom_smooth(aes(displ, hwy))

Each new geom adds a new layer

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  geom_smooth(mapping = aes(x = displ, y = hwy))

global vs. local

Mappings (and data) that appear in ggplot() will apply globally to every layer

ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth()

Mappings (and data) that appear in a geom_ function will add to or override the global mappings for that layer only

ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point(mapping = aes(color = drv)) +
  geom_smooth()

data can also be set locally or globally

# filter() here is dplyr's filter(), so dplyr (or the tidyverse) must be loaded
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point(mapping = aes(color = drv)) +
  geom_smooth(data = filter(mpg, drv == "f"))
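
The effect of global and local settings can be checked by inspecting what each layer was built from. The following is a sketch under stated assumptions, not part of the original slides: it uses base R's subset() in place of dplyr's filter() so that it runs with ggplot2 alone, and it relies on layer_data() to return the computed values for a given layer.

```r
library(ggplot2)

# x and y are global mappings, so both layers inherit them.
# color = drv is local to geom_point(), so only the points are colored.
# The smooth layer gets its own local data: front-wheel-drive cars only.
front <- subset(mpg, drv == "f")  # base R stand-in for filter(mpg, drv == "f")

p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point(mapping = aes(color = drv)) +
  geom_smooth(data = front)

points <- layer_data(p, 1)  # layer 1: the points
smooth <- layer_data(p, 2)  # layer 2: the fitted curve

nrow(points)                   # one row per car in mpg
length(unique(points$colour))  # one color per drv value
length(unique(smooth$colour))  # the curve does not inherit color = drv
range(smooth$x)                # stays within the displ range of `front`
```
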
Quiz
What is different about this plot? Run the code!

p <- ggplot(mpg) +
  geom_point(aes(displ, hwy)) +
  geom_smooth(aes(displ, hwy))

library(plotly)
ggplotly(p)

interactivity
Plotly
Tools for making interactive plots. plot.ly/ggplot2/
Saving graphs

GUI method
Right click on the plot

Code method
ggsave() saves the last plot.
Uses size on screen:

ggsave("my-plot.pdf")
ggsave("my-plot.png")

Specify size in inches:

ggsave("my-plot.pdf", width = 6, height = 6)
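
Because ggsave() defaults to the last plot displayed, it can be safer in scripts to pass the plot explicitly. The sketch below is an illustration, not part of the original slides; the temporary file path is an assumption made so that the example leaves nothing behind in the working directory.

```r
library(ggplot2)

p <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

# Write to a temporary file; substitute your own path in practice.
out <- tempfile(fileext = ".pdf")

# plot = p avoids relying on "the last plot"; the file type is taken
# from the extension, and width and height are given in inches.
ggsave(out, plot = p, width = 6, height = 6, units = "in")

file.exists(out)
```
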

Plotly
Save with htmlwidgets::saveWidget()
Grammar of Graphics

mpg   cyl  disp   hp
21.0  6    160.0  2
21.0  6    160.0  2
22.8  4    108.0  1
21.4  6    258.0  2
18.7  8    360.0  3
18.1  6    225.0  2
14.3  8    360.0  5
24.4  4    146.7  1
22.8  4    140.8  1
19.2  6    167.6  2
17.8  6    167.6  2
16.4  8    275.8  3
17.3  8    275.8  3
15.2  8    275.8  3
10.4  8    472.0  4
10.4  8    460.0  4
14.7  8    440.0  4
32.4  4    78.7   1
30.4  4    75.7   1
33.9  4    71.1   1

[Figure sequence: mappings connect variables in the data to aesthetic properties of a geom (fill, shape, x, y); the same data and mappings can be drawn with different geoms: points, lines, bars]

To make a graph

ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))

1. Pick a data set
2. Choose a geom to display cases
3. Map aesthetic properties to variables
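
The three steps can be sketched in code as follows. This is an illustration, not part of the original slides: it assumes R's built-in mtcars data set, which has mpg, cyl, disp, and hp columns, and the particular geoms and mappings are arbitrary choices.

```r
library(ggplot2)

# 1. Pick a data set: mtcars ships with R.
# 2. Choose a geom to display cases: points.
# 3. Map aesthetic properties to variables: x, y, and color.
p_points <- ggplot(data = mtcars) +
  geom_point(mapping = aes(x = disp, y = mpg, color = factor(cyl)))

# Same data and x/y mappings, different geom: the cases are now
# connected by a line instead of drawn as points.
p_line <- ggplot(data = mtcars) +
  geom_line(mapping = aes(x = disp, y = mpg))

# Each car in the data becomes one row of the points layer.
n_points <- nrow(layer_data(p_points))
n_points == nrow(mtcars)
```
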
What else?

Position Adjustments
How overlapping objects are arranged
position_*()

Themes
Visual appearance of non-data elements
+ theme_*()

Scales
Customize color scales, other mappings
+ scale_*()

Facets
Subplots that display subsets of the data.
+ facet_*()

Coordinate systems
+ coord_*()

Titles and captions
+ labs()
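
Of the components listed above, position adjustments are perhaps the easiest to see on the bar chart of class filled by drv. The sketch below is an illustration, not part of the original slides; it assumes ggplot2's built-in position names "dodge" and "fill" and uses layer_data() to inspect the computed bar heights.

```r
library(ggplot2)

# Default for geom_bar(): the bars for each drv are stacked within a class.
p_stack <- ggplot(data = mpg) +
  geom_bar(mapping = aes(x = class, fill = drv))

# position = "dodge" places the bars side by side instead.
p_dodge <- ggplot(data = mpg) +
  geom_bar(mapping = aes(x = class, fill = drv), position = "dodge")

# position = "fill" stacks the bars and rescales every stack to height 1,
# which shows proportions rather than counts.
p_fill <- ggplot(data = mpg) +
  geom_bar(mapping = aes(x = class, fill = drv), position = "fill")

# Height of the tallest bar (or stack) under each adjustment.
top_stack <- max(layer_data(p_stack)$ymax)  # largest class total
top_dodge <- max(layer_data(p_dodge)$ymax)  # largest class-by-drv count
top_fill  <- max(layer_data(p_fill)$ymax)   # always 1 after rescaling
```
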
ggplot2 template
Make any plot by filling in the parameters of this template

[Figure: data + geom (x = F, y = A, color = F, size = A) + coordinate system = plot]

Complete the template below to build a graph.

ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>),
                  stat = <STAT>, position = <POSITION>) +
  <COORDINATE_FUNCTION> +
  <FACET_FUNCTION> +
  <SCALE_FUNCTION> +
  <THEME_FUNCTION>

<DATA>, <GEOM_FUNCTION>, and <MAPPINGS> are required. The other parameters are not required; sensible defaults are supplied.

ggplot(data = mpg, aes(x = cty, y = hwy)) begins a plot that you finish by adding layers to. Add one geom function per layer.
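
As a worked illustration, every slot of the template can be filled in at once. The sketch below is not part of the original slides: the particular geom, stat, position, coordinate, facet, scale, and theme functions are arbitrary choices, and faceting by the year column of mpg is an assumption made only to show the slot in use.

```r
library(ggplot2)

p <- ggplot(data = mpg) +                              # <DATA>
  geom_bar(mapping = aes(x = class, fill = drv),       # <GEOM_FUNCTION>, <MAPPINGS>
           stat = "count", position = "dodge") +       # <STAT>, <POSITION>
  coord_flip() +                                       # <COORDINATE_FUNCTION>
  facet_wrap(vars(year)) +                             # <FACET_FUNCTION>
  scale_fill_brewer(palette = "Set2") +                # <SCALE_FUNCTION>
  theme_minimal()                                      # <THEME_FUNCTION>

# Building the plot applies every component; the layout table has one
# row per facet panel.
built <- ggplot_build(p)
n_panels <- nrow(built$layout$layout)
n_panels == length(unique(mpg$year))
```

Dropping any of the optional lines should still give a valid plot, since ggplot2 then falls back to its defaults for that component.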
[Example graphic] David B Sparks, http://bit.ly/hn54NW
[Example graphic with a "Violent Crime Density" legend ranging from 400 to 1400] David Kahle, https://dl.dropbox.com/u/24648660/ggmap%20useR%202012.pdf
[Example graphic] James Cheshire, http://bit.ly/xqHhAs
Useful resources

https://exts.ggplot2.tidyverse.org/gallery/
https://ggforce.data-imaginist.com
https://github.com/dkahle/ggmap
https://eliocamp.github.io/ggnewscale/
https://www.rayshader.com/

https://ggplot2-book.org
https://r4ds.hadley.nz
Your Turn

Navigate up to the 03-Transform folder.
Open 03-Transform-Exercises.Rmd