
Data Visualization 2.1

1. ggplot2 is a data visualization package for R based on the grammar of graphics, which builds graphs from data, a coordinate system, and geoms.
2. Geoms represent data points; their aesthetic properties map variables to visual properties such as color, size, and x and y location.
3. Common geoms include points, lines, histograms, and boxplots, which are used to visualize one or two variables from the data.


Data Visualization with ggplot2 :: CHEAT SHEET

Basics

ggplot2 is based on the grammar of graphics, the idea that you can build every graph from the same components: a data set, a coordinate system, and geoms, visual marks that represent data points.

[Figure: data + geom (x = F, y = A) + coordinate system = plot]

To display values, map variables in the data to visual properties of the geom (aesthetics) like size, color, and x and y locations.

[Figure: data + geom (x = F, y = A, color = F, size = A) + coordinate system = plot]

Complete the template below to build a graph. The data, the geom function, and the mappings are required; the remaining parts are not required, and sensible defaults are supplied.

ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>),
    stat = <STAT>, position = <POSITION>) +
  <COORDINATE_FUNCTION> +
  <FACET_FUNCTION> +
  <SCALE_FUNCTION> +
  <THEME_FUNCTION>

ggplot(data = mpg, aes(x = cty, y = hwy))
  Begins a plot that you finish by adding layers to. Add one geom function per layer.

qplot(x = cty, y = hwy, data = mpg, geom = "point")
  Creates a complete plot with given data, geom, and mappings. Supplies many useful defaults.

last_plot()
  Returns the last plot.

ggsave("plot.png", width = 5, height = 5)
  Saves last plot as a 5 in x 5 in file named "plot.png" in the working directory. Matches file type to file extension.

Geoms

Use a geom function to represent data points, use the geom's aesthetic properties to represent variables. Each function returns a layer. The aesthetics each geom understands are listed under its example.

GRAPHICAL PRIMITIVES

a <- ggplot(economics, aes(date, unemploy))
b <- ggplot(seals, aes(x = long, y = lat))

a + geom_blank()
  (Useful for expanding limits)
b + geom_curve(aes(yend = lat + 1, xend = long + 1), curvature = 1)
  x, xend, y, yend, alpha, angle, color, curvature, linetype, size
a + geom_path(lineend = "butt", linejoin = "round", linemitre = 1)
  x, y, alpha, color, group, linetype, size
a + geom_polygon(aes(group = group))
  x, y, alpha, color, fill, group, linetype, size
b + geom_rect(aes(xmin = long, ymin = lat, xmax = long + 1, ymax = lat + 1))
  xmax, xmin, ymax, ymin, alpha, color, fill, linetype, size
a + geom_ribbon(aes(ymin = unemploy - 900, ymax = unemploy + 900))
  x, ymax, ymin, alpha, color, fill, group, linetype, size

LINE SEGMENTS

common aesthetics: x, y, alpha, color, linetype, size

b + geom_abline(aes(intercept = 0, slope = 1))
b + geom_hline(aes(yintercept = lat))
b + geom_vline(aes(xintercept = long))
b + geom_segment(aes(yend = lat + 1, xend = long + 1))
b + geom_spoke(aes(angle = 1:1155, radius = 1))

ONE VARIABLE

continuous
c <- ggplot(mpg, aes(hwy)); c2 <- ggplot(mpg)

c + geom_area(stat = "bin")
  x, y, alpha, color, fill, linetype, size
c + geom_density(kernel = "gaussian")
  x, y, alpha, color, fill, group, linetype, size, weight
c + geom_dotplot()
  x, y, alpha, color, fill
c + geom_freqpoly()
  x, y, alpha, color, group, linetype, size
c + geom_histogram(binwidth = 5)
  x, y, alpha, color, fill, linetype, size, weight
c2 + geom_qq(aes(sample = hwy))
  x, y, alpha, color, fill, linetype, size, weight

discrete
d <- ggplot(mpg, aes(fl))

d + geom_bar()
  x, alpha, color, fill, linetype, size, weight

TWO VARIABLES

continuous x, continuous y
e <- ggplot(mpg, aes(cty, hwy))

e + geom_label(aes(label = cty), nudge_x = 1, nudge_y = 1)
  x, y, label, alpha, angle, color, family, fontface, hjust, lineheight, size, vjust
e + geom_jitter(height = 2, width = 2)
  x, y, alpha, color, fill, shape, size
e + geom_point()
  x, y, alpha, color, fill, shape, size, stroke
e + geom_quantile()
  x, y, alpha, color, group, linetype, size, weight
e + geom_rug(sides = "bl")
  x, y, alpha, color, linetype, size
e + geom_smooth(method = lm)
  x, y, alpha, color, fill, group, linetype, size, weight
e + geom_text(aes(label = cty), nudge_x = 1, nudge_y = 1, check_overlap = TRUE)
  x, y, label, alpha, angle, color, family, fontface, hjust, lineheight, size, vjust

discrete x, continuous y
f <- ggplot(mpg, aes(class, hwy))

f + geom_col()
  x, y, alpha, color, fill, group, linetype, size
f + geom_boxplot()
  x, y, lower, middle, upper, ymax, ymin, alpha, color, fill, group, linetype, shape, size, weight
f + geom_dotplot(binaxis = "y", stackdir = "center")
  x, y, alpha, color, fill, group
f + geom_violin(scale = "area")
  x, y, alpha, color, fill, group, linetype, size, weight

discrete x, discrete y
g <- ggplot(diamonds, aes(cut, color))

g + geom_count()
  x, y, alpha, color, fill, shape, size, stroke

continuous bivariate distribution
h <- ggplot(diamonds, aes(carat, price))

h + geom_bin2d(binwidth = c(0.25, 500))
  x, y, alpha, color, fill, linetype, size, weight
h + geom_density2d()
  x, y, alpha, colour, group, linetype, size
h + geom_hex()
  x, y, alpha, colour, fill, size

continuous function
i <- ggplot(economics, aes(date, unemploy))

i + geom_area()
  x, y, alpha, color, fill, linetype, size
i + geom_line()
  x, y, alpha, color, group, linetype, size
i + geom_step(direction = "hv")
  x, y, alpha, color, group, linetype, size

visualizing error
df <- data.frame(grp = c("A", "B"), fit = 4:5, se = 1:2)
j <- ggplot(df, aes(grp, fit, ymin = fit - se, ymax = fit + se))

j + geom_crossbar(fatten = 2)
  x, y, ymax, ymin, alpha, color, fill, group, linetype, size
j + geom_errorbar()
  x, ymax, ymin, alpha, color, group, linetype, size, width (also geom_errorbarh())
j + geom_linerange()
  x, ymin, ymax, alpha, color, group, linetype, size
j + geom_pointrange()
  x, y, ymin, ymax, alpha, color, fill, group, linetype, shape, size

maps
data <- data.frame(murder = USArrests$Murder, state = tolower(rownames(USArrests)))
map <- map_data("state")
k <- ggplot(data, aes(fill = murder))

k + geom_map(aes(map_id = state), map = map) + expand_limits(x = map$long, y = map$lat)
  map_id, alpha, color, fill, linetype, size

THREE VARIABLES

seals$z <- with(seals, sqrt(delta_long^2 + delta_lat^2)); l <- ggplot(seals, aes(long, lat))

l + geom_contour(aes(z = z))
  x, y, z, alpha, colour, group, linetype, size, weight
l + geom_raster(aes(fill = z), hjust = 0.5, vjust = 0.5, interpolate = FALSE)
  x, y, alpha, fill
l + geom_tile(aes(fill = z))
  x, y, alpha, color, fill, linetype, size, width
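As a minimal sketch of the Basics template with every slot filled in, the example below uses the mpg data set that ships with ggplot2. The particular choices (a jittered point layer, a facet on drv, the "Dark2" palette, theme_bw()) are illustrative assumptions, not recommendations from the cheat sheet.

```r
library(ggplot2)

# Required: a data set, a geom function, and aesthetic mappings.
p <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = cty, y = hwy, color = drv),
             stat = "identity", position = "jitter") +
  coord_cartesian() +                      # optional: coordinate system
  facet_wrap(vars(drv)) +                  # optional: facets
  scale_color_brewer(palette = "Dark2") +  # optional: scale
  theme_bw()                               # optional: theme

# Typing p at the console (or calling print(p)) draws the plot.
```

Dropping any of the optional lines should still give a valid plot, since ggplot2 falls back to its defaults for the coordinate system, facets, scales, and theme.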

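The "visualizing error" geoms expect ymin and ymax to be present in the data. The sketch below shows one way such a summary table might be built from raw data with base R before plotting. Summarising hwy by drv and the names fit, se, p_error, and built are assumptions made for illustration; the cheat sheet itself uses a hand-typed two-row data frame.

```r
library(ggplot2)

# Mean and standard error of highway mileage for each drive type.
fit <- tapply(mpg$hwy, mpg$drv, mean)
se  <- tapply(mpg$hwy, mpg$drv, function(v) sd(v) / sqrt(length(v)))
df  <- data.frame(drv = names(fit), fit = as.numeric(fit), se = as.numeric(se))

# Same mapping pattern as the cheat sheet's j plot: ymin/ymax are fit -/+ se.
j <- ggplot(df, aes(drv, fit, ymin = fit - se, ymax = fit + se))
p_error <- j + geom_pointrange()

# layer_data() returns the data the layer will draw, including ymin and ymax.
built <- layer_data(p_error)
```

Swapping geom_pointrange() for geom_errorbar(), geom_linerange(), or geom_crossbar() reuses the same mappings.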
RStudio® is a trademark of RStudio, Inc. • CC BY SA RStudio • info@rstudio.com • 844-448-1212 • rstudio.com • Learn more at http://ggplot2.tidyverse.org • ggplot2 3.1.0 • Updated: 2018-12
Stats

An alternative way to build a layer.

A stat builds new variables to plot (e.g., count, prop).

[Figure: data (fl, cty, cyl) + stat (x, ..count..) + geom (x = x, y = ..count..) + coordinate system = plot]

Visualize a stat by changing the default stat of a geom function, geom_bar(stat = "count"), or by using a stat function, stat_count(geom = "bar"), which calls a default geom to make a layer (equivalent to a geom function). Use ..name.. syntax to map stat variables to aesthetics.

i + stat_density2d(aes(fill = ..level..), geom = "polygon")
  stat_density2d is the stat function, aes() holds the geom mappings, ..level.. is a variable created by the stat, and geom = "polygon" is the geom to use.

Each entry below lists the aesthetics the stat uses, followed after | by the variables it creates.

c + stat_bin(binwidth = 1, origin = 10)
  x, y | ..count.., ..ncount.., ..density.., ..ndensity..
c + stat_count(width = 1)
  x, y | ..count.., ..prop..
c + stat_density(adjust = 1, kernel = "gaussian")
  x, y | ..count.., ..density.., ..scaled..
e + stat_bin_2d(bins = 30, drop = T)
  x, y, fill | ..count.., ..density..
e + stat_bin_hex(bins = 30)
  x, y, fill | ..count.., ..density..
e + stat_density_2d(contour = TRUE, n = 100)
  x, y, color, size | ..level..
e + stat_ellipse(level = 0.95, segments = 51, type = "t")
l + stat_contour(aes(z = z))
  x, y, z, order | ..level..
l + stat_summary_hex(aes(z = z), bins = 30, fun = max)
  x, y, z, fill | ..value..
l + stat_summary_2d(aes(z = z), bins = 30, fun = mean)
  x, y, z, fill | ..value..
f + stat_boxplot(coef = 1.5)
  x, y | ..lower.., ..middle.., ..upper.., ..width.., ..ymin.., ..ymax..
f + stat_ydensity(kernel = "gaussian", scale = "area")
  x, y | ..density.., ..scaled.., ..count.., ..n.., ..violinwidth.., ..width..
e + stat_ecdf(n = 40)
  x, y | ..x.., ..y..
e + stat_quantile(quantiles = c(0.1, 0.9), formula = y ~ log(x), method = "rq")
  x, y | ..quantile..
e + stat_smooth(method = "lm", formula = y ~ x, se = T, level = 0.95)
  x, y | ..se.., ..x.., ..y.., ..ymin.., ..ymax..
ggplot() + stat_function(aes(x = -3:3), n = 99, fun = dnorm, args = list(sd = 0.5))
  x | ..x.., ..y..
e + stat_identity(na.rm = TRUE)
ggplot() + stat_qq(aes(sample = 1:100), dist = qt, dparam = list(df = 5))
  sample, x, y | ..sample.., ..theoretical..
e + stat_sum()
  x, y, size | ..n.., ..prop..
e + stat_summary(fun.data = "mean_cl_boot")
h + stat_summary_bin(fun.y = "mean", geom = "bar")
e + stat_unique()

Scales

Scales map data values to the visual values of an aesthetic. To change a mapping, add a new scale.

(n <- d + geom_bar(aes(fill = fl)))

n + scale_fill_manual(
  values = c("skyblue", "royalblue", "blue", "navy"),
  limits = c("d", "e", "p", "r"), breaks = c("d", "e", "p", "r"),
  name = "fuel", labels = c("D", "E", "P", "R"))

  The function name combines scale_, the aesthetic to adjust (fill), and the prepackaged scale to use (manual). The scale-specific arguments follow:
  limits: range of values to include in mapping
  breaks: breaks to use in legend/axis
  name: title to use in legend/axis
  labels: labels to use in legend/axis

GENERAL PURPOSE SCALES

Use with most aesthetics

scale_*_continuous() - map continuous values to visual ones
scale_*_discrete() - map discrete values to visual ones
scale_*_identity() - use data values as visual ones
scale_*_manual(values = c()) - map discrete values to manually chosen visual ones
scale_*_date(date_labels = "%m/%d", date_breaks = "2 weeks") - treat data values as dates
scale_*_datetime() - treat data x values as date times. Use same arguments as scale_x_date(). See ?strptime for label formats.

X & Y LOCATION SCALES

Use with x or y aesthetics (x shown here)

scale_x_log10() - Plot x on log10 scale
scale_x_reverse() - Reverse direction of x axis
scale_x_sqrt() - Plot x on square root scale

COLOR AND FILL SCALES (DISCRETE)

n <- d + geom_bar(aes(fill = fl))

n + scale_fill_brewer(palette = "Blues")
  For palette choices: RColorBrewer::display.brewer.all()
n + scale_fill_grey(start = 0.2, end = 0.8, na.value = "red")

COLOR AND FILL SCALES (CONTINUOUS)

o <- c + geom_dotplot(aes(fill = ..x..))

o + scale_fill_distiller(palette = "Blues")
o + scale_fill_gradient(low = "red", high = "yellow")
o + scale_fill_gradient2(low = "red", high = "blue", mid = "white", midpoint = 25)
o + scale_fill_gradientn(colours = topo.colors(6))
  Also: rainbow(), heat.colors(), terrain.colors(), cm.colors(), RColorBrewer::brewer.pal()

SHAPE AND SIZE SCALES

p <- e + geom_point(aes(shape = fl, size = cyl))

p + scale_shape() + scale_size()
p + scale_shape_manual(values = c(3:7))
p + scale_radius(range = c(1, 6))
p + scale_size_area(max_size = 6)

Coordinate Systems

r <- d + geom_bar()

r + coord_cartesian(xlim = c(0, 5))
  xlim, ylim
  The default cartesian coordinate system
r + coord_fixed(ratio = 1/2)
  ratio, xlim, ylim
  Cartesian coordinates with fixed aspect ratio between x and y units
r + coord_flip()
  xlim, ylim
  Flipped Cartesian coordinates
r + coord_polar(theta = "x", direction = 1)
  theta, start, direction
  Polar coordinates
r + coord_trans(ytrans = "sqrt")
  xtrans, ytrans, limx, limy
  Transformed cartesian coordinates. Set xtrans and ytrans to the name of a transformation function.
π + coord_quickmap()
π + coord_map(projection = "ortho", orientation = c(41, -74, 0))
  projection, xlim, ylim
  Map projections from the mapproj package (mercator (default), azequalarea, lagrange, etc.)

Position Adjustments

Position adjustments determine how to arrange geoms that would otherwise occupy the same space.

s <- ggplot(mpg, aes(fl, fill = drv))

s + geom_bar(position = "dodge")
  Arrange elements side by side
s + geom_bar(position = "fill")
  Stack elements on top of one another, normalize height
e + geom_point(position = "jitter")
  Add random noise to X and Y position of each element to avoid overplotting
e + geom_label(position = "nudge")
  Nudge labels away from points
s + geom_bar(position = "stack")
  Stack elements on top of one another

Each position adjustment can be recast as a function with manual width and height arguments.

s + geom_bar(position = position_dodge(width = 1))

Themes

r + theme_bw()
  White background with grid lines
r + theme_gray()
  Grey background (default theme)
r + theme_dark()
  Dark for contrast
r + theme_classic()
r + theme_light()
r + theme_linedraw()
r + theme_minimal()
  Minimal themes
r + theme_void()
  Empty theme

Faceting

Facets divide a plot into subplots based on the values of one or more discrete variables.

t <- ggplot(mpg, aes(cty, hwy)) + geom_point()

t + facet_grid(cols = vars(fl))
  facet into columns based on fl
t + facet_grid(rows = vars(year))
  facet into rows based on year
t + facet_grid(rows = vars(year), cols = vars(fl))
  facet into both rows and columns
t + facet_wrap(vars(fl))
  wrap facets into a rectangular layout

Set scales to let axis limits vary across facets

t + facet_grid(rows = vars(drv), cols = vars(fl), scales = "free")
  x and y axis limits adjust to individual facets
  "free_x" - x axis limits adjust
  "free_y" - y axis limits adjust

Set labeller to adjust facet labels

t + facet_grid(cols = vars(fl), labeller = label_both)
  fl: c   fl: d   fl: e   fl: p   fl: r
t + facet_grid(rows = vars(fl), labeller = label_bquote(alpha ^ .(fl)))
  α^c   α^d   α^e   α^p   α^r

Labels

t + labs(x = "New x axis label", y = "New y axis label",
  title = "Add a title above the plot",
  subtitle = "Add a subtitle below title",
  caption = "Add a caption below plot",
  <AES> = "New <AES> legend title")
  Use scale functions to update legend labels.

t + annotate(geom = "text", x = 8, y = 9, label = "A")
  geom is the geom to place; the remaining arguments are manual values for the geom's aesthetics.

Legends

n + theme(legend.position = "bottom")
  Place legend at "bottom", "top", "left", or "right"
n + guides(fill = "none")
  Set legend type for each aesthetic: colorbar, legend, or none (no legend)
n + scale_fill_discrete(name = "Title", labels = c("A", "B", "C", "D", "E"))
  Set legend title and labels with a scale function.

Zooming

Without clipping (preferred)

t + coord_cartesian(xlim = c(0, 100), ylim = c(10, 20))

With clipping (removes unseen data points)

t + xlim(0, 100) + ylim(10, 20)
t + scale_x_continuous(limits = c(0, 100)) + scale_y_continuous(limits = c(0, 100))
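The difference between zooming with and without clipping can be checked by inspecting the data each plot would draw. This is a sketch under stated assumptions: the limits c(10, 20) and the names zoomed, clipped, n_zoomed, and n_clipped are chosen here for illustration, and layer_data() is used to look at a layer's data after the scales have been applied.

```r
library(ggplot2)

t <- ggplot(mpg, aes(cty, hwy)) + geom_point()

# Without clipping: the coordinate system zooms in, the data stay intact.
zoomed <- t + coord_cartesian(xlim = c(10, 20))

# With clipping: scale limits turn out-of-range x values into NA.
clipped <- t + xlim(10, 20)

# Count the points that still carry a usable x value in each plot.
n_zoomed  <- sum(!is.na(layer_data(zoomed)$x))
n_clipped <- sum(!is.na(layer_data(clipped)$x))
```

Here n_zoomed should equal the number of rows in mpg, while n_clipped counts only rows whose cty lies within the limits. This matters most for layers that compute summaries, such as geom_smooth(), because clipped points are dropped before the stat sees them.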

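Returning to the Stats section: the claim that a stat function and a geom function with the matching stat build equivalent layers can be illustrated by comparing the variables each layer computes. The names via_geom, via_stat, counts_geom, and counts_stat are hypothetical, and layer_data() is used only to read back the computed count variable.

```r
library(ggplot2)

d <- ggplot(mpg, aes(fl))

# Two routes to the same layer: geom with its stat, or stat with its geom.
via_geom <- d + geom_bar(stat = "count")
via_stat <- d + stat_count(geom = "bar")

# The stat creates a count variable: one value per fuel type.
counts_geom <- layer_data(via_geom)$count
counts_stat <- layer_data(via_stat)$count
```

Both vectors should match the frequencies given by table(mpg$fl), which is what ..count.. refers to when it is mapped to an aesthetic.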