3 Styling Ggplot2 Graphics
3 Styling Ggplot2 Graphics
17/10/2022
Introduction
We know the three key components of a statistical graphic (data, aesthetics and geometries) and how they
fit in the ggplot2 framework. But these three key components are not enough to produce graphics used in
the real world, and ggplot2 does have additional components. In this lab, we explore how to change the
appearance of graphics.
Style a graphic
Let’s investigate if city MPG and highway MPG are correlated. We map highway mileage (hwy) to the
x-axis and city mileage (cty) to the y-axis, and add a scatter plot layer using geom_jitter. The jitter geom
is a shortcut for geom_point(position = "jitter") and adds a small amount of random variation to the
location of each point to avoid overplotting (i.e. multiple points on the exact same location). Cf. [https:
//ggplot2.tidyverse.org/reference/geom_jitter.html]. We also fit a trend line using linear regressions(method
= "lm") with geom_smooth().
Don’t forget to load the package ggplot2 first.
library(ggplot2)
1
50
40
hwy
30
20
10 15 20 25 30 35
cty
2
50
40
hwy
30
20
10
10 20 30
cty
Let’s freshen up this graphic a bit. We can set any aesthetic to a fixed setting, e.g. make the dots pink. To
do this, we need to pass the aesthetic as a parameter in the geom_* function (instead of the aes() function).
In the next example, we’re colouring the dots a bit bigger and pink, and making the smoothed line a dashed
black line. Visit Aesthetic specifications for a overview of the available values for each aesthetic.
3
50
40
hwy
30
20
10
10 20 30
cty
Refer to R graph gallery or the figure below for a list of all available colours in R.
4
white aliceblue antiquewhite antiquewhite1 antiquewhite2
antiquewhite3 antiquewhite4 aquamarine aquamarine1 aquamarine2
aquamarine3 aquamarine4 azure azure1 azure2
azure3 azure4 beige bisque bisque1
bisque2 bisque3 bisque4 black blanchedalmond
blue blue1 blue2 blue3 blue4
blueviolet brown brown1 brown2 brown3
brown4 burlywood burlywood1 burlywood2 burlywood3
burlywood4 cadetblue cadetblue1 cadetblue2 cadetblue3
cadetblue4 chartreuse chartreuse1 chartreuse2 chartreuse3
chartreuse4 chocolate chocolate1 chocolate2 chocolate3
chocolate4 coral coral1 coral2 coral3
coral4 cornflowerblue cornsilk cornsilk1 cornsilk2
cornsilk3 cornsilk4 cyan cyan1 cyan2
cyan3 cyan4 darkblue darkcyan darkgoldenrod
darkgoldenrod1 darkgoldenrod2 darkgoldenrod3 darkgoldenrod4 darkgray
darkgreen darkgrey darkkhaki darkmagenta darkolivegreen
darkolivegreen1 darkolivegreen2 darkolivegreen3 darkolivegreen4 darkorange
darkorange1 darkorange2 darkorange3 darkorange4 darkorchid
darkorchid1 darkorchid2 darkorchid3 darkorchid4 darkred
darksalmon darkseagreen darkseagreen1 darkseagreen2 darkseagreen3
darkseagreen4 darkslateblue darkslategray darkslategray1 darkslategray2
darkslategray3 darkslategray4 darkslategrey darkturquoise darkviolet
deeppink deeppink1 deeppink2 deeppink3 deeppink4
deepskyblue deepskyblue1 deepskyblue2 deepskyblue3 deepskyblue4
The x- and y-axis labels aren’t very useful in the graphic above. To specify labels, use the labs() function.
Any parameters of the label() function are optional.
5
Highway by city MPG
Are highway and city consumption correlated?
50
40
Highway MPG
30
20
10
10 20 30
City mileage
I ♥ ggplot2
6
Legends
ggplot2 automatically creates legends when needed. In the next example, we create a graphic where we
encode four variables: engine displacement and highway mileage in the position (x- and y-axes), city mileage
as colour and fuel type as shape.
Because city mileage is a continuous scale, ggplot creates a colour bar legend. Fuel type is discrete, therefore
ggplot2 creates a standard legend.
cty
40 35
30
25
20
15
30 10
hwy
fl
c
d
20
e
p
r
2 3 4 5 6 7
displ
7
City MPG
40 35
30
25
20
15
Highway MPG
30 10
Fuel type
c
d
20 e
p
r
2 3 4 5 6 7
Engine displacement
Finally, let’s style the legend. The function theme() is used to control the appearance of any element of the
graphic.
8
City MPG
40 35
30
25
20
15
Highway MPG
30 10
Fuel type
c
d
20
e
p
r
10
2 3 4 5 6 7
Engine displacement
9
Scales
The values for fuel type are not descriptive, let’s change them by adding a custom scale. We use the
scale_shape_manual() function to specify specific shapes (shapes in R are numbered, we use shapes 15
to 19) and the labels for each category. We also make the shapes a bit larger and translucent (size = 3,
alpha = 0.5).
Fuel type
40
CNG
Diesel
E85
Premium
Highway MPG
30 Regular
City MPG
35
30
25
20
20
15
10
2 3 4 5 6 7
Engine displacement
Let’s also specify a custom scale for the colour aesthetic. This time, we use a gradient scale, where we specify
the “low” and “high” colour.
10
Fuel type
40
CNG
Diesel
E85
Premium
Highway MPG
Regular
30
City MPG
35
30
25
20
20
15
10
2 3 4 5 6 7
Engine displacement
Logarithmic scales
One commonly used scale is the logarithmic scale. Note how we can specify different scales for the x and
y-axes.
11
40
30
hwy
20
2 3 5
displ
12
10000
3000
price
1000
If we want to use a different logarithm, e. g. base2, we need to use scale_*_continuous and specify the
transformation:
13
32
hwy
16
7 6 5 4 3 2
displ
Annotations
Sometimes we need to add a label to a chart to explain something. ggplot provides the annotate() function
to add an annotation layer.
14
40
class
2seater
minivan
pickup
subcompact
20
suv
10
2 3 4 5 6 7
displ
15
# Facets
While ggplot2 supports several aesthetics, encoding to many variables in one graphic will produce a confusing
chart. Sometimes, it is easier to produce several graphics instead. ggplot2 makes it easy to create a series of
charts with faceting. We just need to add one of the facet functions (facet_wrap() or facet_grid()) and
pass the respective variable wrapped in a vars() call.
c d e
40
30
20 cty
35
30
hwy
2 3 4 5 6 7 25
p r
20
40 15
10
30
20
2 3 4 5 6 7 2 3 4 5 6 7
displ
But now, the labels for each chart are meaningless again. To define custom labels, we first need to define a
vector with the labels. Then we set the name of each element to the value in the data.
And of course, we can also still add custom scales.
16
CNG Diesel E85
40
30
20 cty
35
30
hwy
2 3 4 5 6 7 25
Premium Regular
20
40 15
10
30
20
2 3 4 5 6 7 2 3 4 5 6 7
displ
17
2seater compact midsize minivan pickup subcompact suv
40
CNG
30
20
40
Diesel
30
20 cty
35
40 30
hwy
E85
30 25
20 20
15
40
Premium
10
30
20
40
Regular
30
20
Themes
Themes are a way to change the look of a graphics completely.
18
40
class
2seater
compact
30
midsize
hwy
minivan
pickup
subcompact
suv
20
2 3 4 5 6 7
displ
ggplot2 comes with a some predefined themes. More themes are defined in the package ggthemes. You’ll
need to install it first:
install.packages("ggthemes")
library(ggthemes)
ls("package:ggthemes")
19
## [25] "ptol_pal" "scale_color_calc"
## [27] "scale_color_canva" "scale_color_colorblind"
## [29] "scale_color_continuous_tableau" "scale_color_economist"
## [31] "scale_color_excel" "scale_color_excel_new"
## [33] "scale_color_few" "scale_color_fivethirtyeight"
## [35] "scale_color_gdocs" "scale_color_gradient_tableau"
## [37] "scale_color_gradient2_tableau" "scale_color_hc"
## [39] "scale_color_pander" "scale_color_ptol"
## [41] "scale_color_solarized" "scale_color_stata"
## [43] "scale_color_tableau" "scale_color_wsj"
## [45] "scale_colour_calc" "scale_colour_canva"
## [47] "scale_colour_colorblind" "scale_colour_economist"
## [49] "scale_colour_excel" "scale_colour_excel_new"
## [51] "scale_colour_few" "scale_colour_fivethirtyeight"
## [53] "scale_colour_gdocs" "scale_colour_gradient_tableau"
## [55] "scale_colour_gradient2_tableau" "scale_colour_hc"
## [57] "scale_colour_pander" "scale_colour_ptol"
## [59] "scale_colour_solarized" "scale_colour_stata"
## [61] "scale_colour_tableau" "scale_colour_wsj"
## [63] "scale_fill_calc" "scale_fill_canva"
## [65] "scale_fill_colorblind" "scale_fill_continuous_tableau"
## [67] "scale_fill_economist" "scale_fill_excel"
## [69] "scale_fill_excel_new" "scale_fill_few"
## [71] "scale_fill_fivethirtyeight" "scale_fill_gdocs"
## [73] "scale_fill_gradient_tableau" "scale_fill_gradient2_tableau"
## [75] "scale_fill_hc" "scale_fill_pander"
## [77] "scale_fill_ptol" "scale_fill_solarized"
## [79] "scale_fill_stata" "scale_fill_tableau"
## [81] "scale_fill_wsj" "scale_linetype_stata"
## [83] "scale_shape_calc" "scale_shape_circlefill"
## [85] "scale_shape_cleveland" "scale_shape_few"
## [87] "scale_shape_stata" "scale_shape_tableau"
## [89] "scale_shape_tremmel" "show_linetypes"
## [91] "show_shapes" "smart_digits"
## [93] "smart_digits_format" "solarized_pal"
## [95] "stat_fivenumber" "stata_linetype_pal"
## [97] "stata_pal" "stata_shape_pal"
## [99] "StatFivenumber" "tableau_color_pal"
## [101] "tableau_div_gradient_pal" "tableau_gradient_pal"
## [103] "tableau_seq_gradient_pal" "tableau_shape_pal"
## [105] "theme_base" "theme_calc"
## [107] "theme_clean" "theme_economist"
## [109] "theme_economist_white" "theme_excel"
## [111] "theme_excel_new" "theme_few"
## [113] "theme_fivethirtyeight" "theme_foundation"
## [115] "theme_gdocs" "theme_hc"
## [117] "theme_igray" "theme_map"
## [119] "theme_pander" "theme_par"
## [121] "theme_solarized" "theme_solarized_2"
## [123] "theme_solid" "theme_stata"
## [125] "theme_tufte" "theme_wsj"
## [127] "tremmel_shape_pal" "wsj_pal"
20
Economist
base.plot +
theme_economist() + scale_colour_economist() +
scale_y_continuous(position = "right")
40
30
hwy
20
2 3 4 5 6 7
displ
Solarized light
21
40
class
2seater
compact
30
midsize
hwy
minivan
pickup
subcompact
20 suv
2 3 4 5 6 7
displ
Excel 2003
22
40
class
2seater
compact
30
midsize
hwy
minivan
pickup
subcompact
20 suv
10
2 3 4 5 6 7
displ
Tableau
23
40
class
2seater
compact
30
midsize
hwy
minivan
pickup
subcompact
20 suv
2 3 4 5 6 7
displ
base.plot + scale_colour_colorblind()
24
40
class
2seater
compact
30
midsize
hwy
minivan
pickup
subcompact
20 suv
2 3 4 5 6 7
displ
Tufte
25
40
class
2seater
compact
30
midsize
hwy
minivan
pickup
subcompact
20 suv
10
2 3 4 5 6 7
displ
26
Exercises
CO2 uptake
Use the built-in data set CO2 for the following exercises.
1. Create a scatter plot showing the CO2 uptake by ambient CO2 concentrations. Should you use jitter
or not?
2. Add a dotted trend line using linear regression(“lm”) in your favourite colour. Change the dots to sky
blue large triangles.
3. Encode the origin of the plant in the colour of the triangles. Make sure you still have only one trend
line.
4. Label the chart, axes and legend meaningfully.
5. Use two trend lines (one for each origin). Colour plants from Quebec blue and plants from Mississippi
red.
6. Create the chart above for each Plant using facet_wrap().
7. Create the following chart. Tip: You can hide the legend using the theme() function and its
legend.position parameter.
CO2 uptake
nonchilled chilled
40
CO2 uptake rates (umol/m^2 sec)
30
Quebec
20
10
40
Mississippi
30
20
10
Mammals sleep
Use the msleep data set for the following exercises.
1. Investigate if there is a correlation between body weight and brain weight. You might need to use
logarithmic scales.
27
2. Investigate if the food has an influence on body/brain weight.
3. Does the body weight have an influence on the total sleep? What else has an influence on the sleep
behaviour?
28
Solutions
CO2 uptake
1. Create a scatter plot showing the CO2 uptake by ambient CO2 concentrations. Should you use jitter
or not?
40
30
uptake
20
10
2. Add a dotted trend line using linear regression(“lm”) in your favourite colour. Change the dots to sky
blue large triangles.
29
40
30
uptake
20
10
3. Encode the origin of the plant in the colour of the triangles. Make sure you still have only one trend
line.
30
40
30 Type
uptake
Quebec
Mississippi
20
10
31
CO2 uptake
40
CO2 uptake rates (umol/m^2 sec)
30 Origin
Quebec
Mississippi
20
10
5. Use two trend lines (one for each origin). Colour plants from Quebec blue and plants from Mississippi
red.
32
CO2 uptake
50
CO2 uptake rates (umol/m^2 sec)
40
Origin
30 Quebec
Mississippi
20
10
33
CO2 uptake
Qn1 Qn2 Qn3 Qc1
60
40
CO2 uptake rates (umol/m^2 sec)
20
60 Origin
40 Quebec
20 Mississippi
60
40
20
250 500 750 1000 250 500 750 1000 250 500 750 1000 250 500 750 1000
ambient CO2 concentrations (mL/L)
7. Create the following chart. Tip: You can hide the legend using the theme() function and its
legend.position parameter.
34
CO2 uptake
nonchilled chilled
40
CO2 uptake rates (umol/m^2 sec)
30
Quebec
20
10
40
Mississippi
30
20
10
Mammals sleep
1. Investigate if there is a correlation between body weight and brain weight. You might need to use
logarithmic scales.
35
1e+01
Brain weight
1e-01
1e-03
36
1e+01
vore
carni
Brain weight
1e-01
herbi
insecti
omni
NA
1e-03
3. Does the body weight have an influence on the total sleep? What else has an influence on the sleep
behaviour?
37
Sleep and body weight by food source
Carnivore Herbivore
1e+04
1e+01
1e-02
Body weight
Insectivore Omnivore
1e+04
1e+01
1e-02
5 10 15 20 5 10 15 20
Sleep total
38