02 Visualize Slides
Visualize
Program
"The simple graph has brought more
information to the data analyst’s
mind than any other device."
- John Tukey
mpg
Fuel economy data for 38 models of car.
Quiz
No peeking ahead!
Your Turn 1
Run this code in 02-Visualize-Exercises.Rmd to make a
graph. Pay strict attention to spelling, capitalization, and
parentheses!
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy))
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy))
1. "Initialize" a plot with ggplot()
2. Add layers with geom_ functions
Pro tip: Always put the + at the end
of a line, never at the start
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy))
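Why the placement matters: R treats a line that already forms a complete expression as finished, so a + at the start of the next line is never attached to the plot. A minimal sketch (the object name p is only for illustration):

```r
library(ggplot2)

# Works: the trailing + tells R that the expression continues on the next line.
p <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

# Does not work: R evaluates ggplot(data = mpg) as a complete expression,
# then raises an error on the line that starts with +.
# ggplot(data = mpg)
#   + geom_point(mapping = aes(x = displ, y = hwy))

p
```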
Callouts: data, + before new line
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy))
A template
ggplot(data = <DATA>) +
<GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy))
Mappings
"The greatest value of a picture is
when it forces us to notice what we
never expected to see."
- John Tukey
How can we test
the theory?
Why do these
cars get better
mileage?
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy))
Aesthetics
Visual Space    Data Space
color           class
Red             2seater
Brown           compact
Green           midsize
Aqua            minivan
Blue            pickup
Violet          subcompact
Pink            suv
Aesthetics
Inside aes(): aesthetic property = variable to map it to
Your Turn 2
In the next chunk, add color, size, alpha, and shape
aesthetics to your graph. Experiment.
Do different things happen when you map aesthetics to
discrete and continuous variables?
What happens when you use more than one aesthetic?
[Figure: example plots of the Color, Size, and Shape aesthetics for Discrete and Continuous variables]
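One way to explore the discrete versus continuous question is sketched here. The choice of class as the discrete variable and cty as the continuous one, and the object names, are assumptions for illustration; any columns of mpg would do.

```r
library(ggplot2)

# Discrete variable (class): each category gets its own distinct color.
p_discrete <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class))

# Continuous variable (cty): color varies along a gradient.
p_continuous <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = cty))

# More than one aesthetic at once: color and size in the same layer.
p_both <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class, size = cty))
```

Printing each object draws the plot. Mapping a continuous variable to shape is expected to fail when the plot is drawn, because shapes have no natural order.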
Legend added
automatically
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy, color = class))
Error!
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy), color = class)
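The call above fails because color = class sits outside aes(), where class is no longer looked up as a column of mpg. A short sketch of the two valid forms, mapping versus setting (the value "blue" and the object names are arbitrary choices for illustration):

```r
library(ggplot2)

# Map: inside aes(), color is tied to a variable and a legend is added.
p_mapped <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class))

# Set: outside aes(), color takes one fixed value for every point.
p_set <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue")
```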
Facets
Subplots that display subsets of the data.
[Figure: scatterplots with displ on the x-axis, one panel per class (panels shown include 2seater, compact, midsize, suv)]
CC by RStudio
Help me
What do facet_grid and facet_wrap do?
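As a starting point for that question, here is a minimal sketch of both functions on the mpg scatterplot. The faceting variables (class, drv, cyl) and the object names are choices made for illustration.

```r
library(ggplot2)

base <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

# facet_wrap(): one panel per level of a single variable, wrapped into rows.
p_wrap <- base + facet_wrap(~ class)

# facet_grid(): a grid of panels defined by two variables (rows ~ columns).
p_grid <- base + facet_grid(drv ~ cyl)
```

The help pages (?facet_wrap, ?facet_grid) describe the remaining arguments.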
summary
A ggplot2 template
Make any plot by filling in the parameters of this template
ggplot(data = <DATA>) +
<GEOM_FUNCTION>(mapping = aes(<MAPPINGS>)) +
<FACET_FUNCTION>
Your Turn 3
Add the black code to your graph. What does it do?
ggplot(data = mpg) +
  geom_point(mapping = aes(displ, hwy, color = class)) +
  labs(title = "Fuel Efficiency by Engine Size",
       subtitle = "Data facetted by class",
       x = "Engine Size (displacement in liters)",
       y = "Fuel Efficiency (MPG)",
       color = "Class of\nAutomobile",
       caption = "Data from the EPA")
[Figure: the finished plot, with callouts for Title, Subtitle, color, caption, and x]
Geoms
How are these plots similar?
Same: x var, y var, data
How are these plots different?
Different: geometric object (geom), i.e. the visual object used to represent the data
geoms
ggplot(data = <DATA>) +
<GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))
https://posit.co/resources/cheatsheets/
Click Cheatsheets in the Learn & Support tab
geom_ functions
Each requires a mapping
argument.
Your Turn 4
Decide how to replace this scatterplot with one that draws
boxplots. Use the cheatsheet. Try your best guess.
[Figure: scatterplot with hwy on the y-axis]
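One possible answer is sketched below. It assumes class is the variable on the x-axis, which is not recoverable from the slide; only hwy on the y-axis is.

```r
library(ggplot2)

# Assumption: class on the x-axis; swap in whichever variable your
# scatterplot uses.
p_box <- ggplot(data = mpg) +
  geom_boxplot(mapping = aes(x = class, y = hwy))

p_box
```

Only the geom function changes; the data and the mappings stay the same as in the scatterplot.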
[Figure: histogram of hwy (x-axis: hwy, y-axis: count)]
ggplot(data = mpg) +
geom_histogram(mapping = aes(x = hwy))
Quiz
What is the difference?
[Figure: two histograms of hwy (x-axis: hwy, y-axis: count)]
"Help" pages
To open the documentation
for a function, type
?geom_histogram
ggplot(data = mpg) +
geom_histogram(mapping = aes(x = hwy), binwidth = 2)
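A small sketch of the ways to control binning in geom_histogram(). The specific values (2 and 10) and the object names are arbitrary choices for illustration.

```r
library(ggplot2)

# Default: 30 bins. ggplot2 prints a message suggesting you pick a
# binwidth when the plot is drawn.
p_default <- ggplot(data = mpg) +
  geom_histogram(mapping = aes(x = hwy))

# binwidth sets the width of each bin in data units (here, 2 mpg).
p_binwidth <- ggplot(data = mpg) +
  geom_histogram(mapping = aes(x = hwy), binwidth = 2)

# bins sets the number of bins instead of their width.
p_bins <- ggplot(data = mpg) +
  geom_histogram(mapping = aes(x = hwy), bins = 10)
```

Note that binwidth and bins are set outside aes(), because they are fixed values and not variables in the data.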
ggplot2.tidyverse.org
Your Turn 7
Make the bar chart of class below. Use the cheatsheet. Hint: do
not supply a y variable.
[Figure: bar chart of class (y-axis: count), with a legend listing 2seater, compact, midsize, minivan, pickup, subcompact, suv]
ggplot(data = mpg) +
geom_bar(mapping = aes(x = class, color = class))
ggplot(data = mpg) +
geom_bar(mapping = aes(x = class, fill = class))
[Figure: bar chart of count by class (2seater, compact, midsize, minivan, pickup, subcompact, suv), with bars filled by drv (4, f, r)]
ggplot(data = mpg) +
geom_bar(mapping = aes(x = class, fill = drv))
Quiz
What will this code do?
ggplot(mpg) +
geom_point(aes(displ, hwy)) +
geom_smooth(aes(displ, hwy))
Each new geom
adds a new layer
ggplot(data = mpg) +
geom_point(mapping = aes(x = displ, y = hwy)) +
geom_smooth(mapping = aes(x = displ, y = hwy))
global vs. local
Mappings (and data)
that appear in ggplot()
will apply globally to
every layer
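A minimal sketch of the difference: the x and y mappings are written once in ggplot() and inherited by both layers, while the color mapping is local to geom_point() only. The object name p is only for illustration.

```r
library(ggplot2)

p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point(mapping = aes(color = class)) +  # local: applies to this layer only
  geom_smooth()                               # inherits x and y from ggplot()

p
```

Because color is local to the points, geom_smooth() draws a single curve for all cars instead of one curve per class.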
p <- ggplot(mpg) +
geom_point(aes(displ, hwy)) +
geom_smooth(aes(displ, hwy))
library(plotly)
ggplotly(p)
interactivity
Plotly
Tools for making interactive plots. plot.ly/ggplot2/
Saving
graphs
GUI method
Right click on the plot
Code method
ggsave() saves the last plot.
Uses size on screen:
ggsave("my-plot.pdf")
ggsave("my-plot.png")
Specify size in inches
ggsave("my-plot.pdf", width = 6, height = 6)
Interactive plots (widgets): save with htmlwidgets::saveWidget()
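A hedged sketch of the code method. By default ggsave() saves the last plot displayed; passing the plot explicitly, as done here, is an optional alternative. The temporary directory and the object names are assumptions made so the example leaves no files in your project.

```r
library(ggplot2)

p <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

# Write to a temporary directory so the sketch does not clutter the
# working directory; replace with any path you like.
out_file <- file.path(tempdir(), "my-plot.png")

# width and height are in inches by default; the file type is taken
# from the extension.
ggsave(out_file, plot = p, width = 6, height = 6)

file.exists(out_file)
```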
Grammar
of Graphics
mpg cyl disp hp
21,0 6 160,0 2
21,0 6 160,0 2
22,8 4 108,0 1
21,4 6 258,0 2
18,7 8 360,0 3
18,1 6 225,0 2
14,3 8 360,0 5
24,4 4 146,7 1
22,8 4 140,8 1
19,2 6 167,6 2
17,8 6 167,6 2
16,4 8 275,8 3
17,3 8 275,8 3
15,2 8 275,8 3
10,4 8 472,0 4
10,4 8 460,0 4
14,7 8 440,0 4
32,4 4 78,7 1
30,4 4 75,7 1
33,9 4 71,1 1
[Diagram: data connects to a geom (points, lines, bars) through mappings to aesthetics such as x, y, shape, and fill]
To make a graph
[template]
ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))
1. Pick a data set
2. Choose a geom to display cases
3. Map aesthetic properties to variables
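The three steps can be sketched as follows. This assumes that R's built-in mtcars data set stands in for the table shown on the slide (its columns mpg, cyl, and disp match); the choice of points and of the specific mappings is for illustration only.

```r
library(ggplot2)

# 1. Pick a data set (assumption: mtcars stands in for the table shown).
# 2. Choose a geom to display cases: here, points.
# 3. Map aesthetic properties to variables: x, y, and color.
p <- ggplot(data = mtcars) +
  geom_point(mapping = aes(x = disp, y = mpg, color = factor(cyl)))

p
```

Wrapping cyl in factor() treats the cylinder count as discrete, so each value gets its own color.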
What else?
Position Adjustments
How overlapping objects are arranged
position_*()
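A brief sketch of position adjustments, using the bar chart of class filled by drv. The object names are for illustration; the position values are the standard string shortcuts for position_stack(), position_dodge(), and position_fill().

```r
library(ggplot2)

# Default for geom_bar(): bars for each drv are stacked.
p_stack <- ggplot(data = mpg) +
  geom_bar(mapping = aes(x = class, fill = drv), position = "stack")

# Dodge: bars for each drv sit side by side.
p_dodge <- ggplot(data = mpg) +
  geom_bar(mapping = aes(x = class, fill = drv), position = "dodge")

# Fill: stacked bars rescaled to the same height, showing proportions.
p_fill <- ggplot(data = mpg) +
  geom_bar(mapping = aes(x = class, fill = drv), position = "fill")
```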
Themes
Visual appearance of non-data elements
+ theme_*()
Scales
Customize color scales, other mappings
+ scale_*()
Facets
Subplots that display subsets of the data.
+ facet_*()
Coordinate systems
+ coord_*()
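Two common coordinate functions, sketched on plots from earlier in the deck. The choice of coord_flip() and coord_polar() and the object names are for illustration; other coord_ functions exist.

```r
library(ggplot2)

# coord_flip(): swap the x and y axes, useful for long category labels.
p_flipped <- ggplot(data = mpg) +
  geom_boxplot(mapping = aes(x = class, y = hwy)) +
  coord_flip()

# coord_polar(): draw the same bar chart in polar coordinates.
p_polar <- ggplot(data = mpg) +
  geom_bar(mapping = aes(x = class, fill = class)) +
  coord_polar()
```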
Titles and captions
+ labs()
ggplot2 template
Make any plot by filling in the parameters of this template
Complete the template below to build a graph.
ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>),
    stat = <STAT>, position = <POSITION>) +
  <COORDINATE_FUNCTION> +
  <FACET_FUNCTION> +
  <SCALE_FUNCTION> +
  <THEME_FUNCTION>
Required: <DATA>, <GEOM_FUNCTION>, <MAPPINGS>. Not required (sensible defaults supplied): everything else.
ggplot(data = mpg, aes(x = cty, y = hwy)) begins a plot that you finish by adding layers to. Add one geom function per layer.
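To show how the full template fits together, here is one possible way to fill in every slot with the mpg data. All of the specific choices (geom_bar, "dodge", coord_flip, faceting by year, the "Set2" palette, theme_minimal) are assumptions for illustration.

```r
library(ggplot2)

p <- ggplot(data = mpg) +                              # <DATA>
  geom_bar(mapping = aes(x = class, fill = drv),       # <GEOM_FUNCTION>, <MAPPINGS>
           stat = "count", position = "dodge") +       # <STAT>, <POSITION>
  coord_flip() +                                       # <COORDINATE_FUNCTION>
  facet_wrap(~ year) +                                 # <FACET_FUNCTION>
  scale_fill_brewer(palette = "Set2") +                # <SCALE_FUNCTION>
  theme_minimal()                                      # <THEME_FUNCTION>

p
```

Dropping any of the optional lines still gives a valid plot, because each slot has a default.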
David B Sparks, http://bit.ly/hn54NW
[Figure: plot of violent crime density (legend: Violent Crime Density, 400 to 1400)]
https://exts.ggplot2.tidyverse.org/gallery/
https://ggforce.data-imaginist.com
https://github.com/dkahle/ggmap
https://eliocamp.github.io/ggnewscale/
https://www.rayshader.com/
https://ggplot2-book.org
https://r4ds.hadley.nz
Your Turn
Visualize Data with