Chapter 1
experimental design
EXPERIMENTAL DESIGN IN R
Joanne Xiong
Data Scientist
Intro to experimental design
Starts with a question (hypothesis)
Collecting & analyzing the data
Steps of an experiment
Planning
dependent variable = outcome
Analysis
Key components of an experiment
Randomization
Replication
Blocking
Randomization
Evenly distributes any variability in outcome due to outside factors across treatment groups
Example:
double-blind medical trials
neither the patient nor the doctor knows which treatment group the patient has been assigned to
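A minimal sketch of random assignment in base R. The 20 subjects, the group labels, and the seed are hypothetical and chosen only for illustration; the point is that chance alone decides which group each subject lands in.

```r
set.seed(42)  # fixed seed only so the sketch is reproducible

# Hypothetical study: 20 subjects, two treatment groups of equal size
subjects <- data.frame(id = 1:20)

# Shuffle a balanced vector of labels so chance alone decides each assignment
subjects$group <- sample(rep(c("treatment", "control"), each = 10))

# Check that the groups are balanced
table(subjects$group)
```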
Recap: t-tests
t-tests help answer research questions
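data("mtcars")
library(broom)
# Illustrative test (mu = 20 is an arbitrary value, not from the original slide):
# is the mean mpg different from 20?
# tidy() converts the test result into a one-row data frame
tidy(t.test(mtcars$mpg, mu = 20))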
Replication and blocking
Replication
Must repeat an experiment to fully assess variability
If we only conduct a drug efficacy experiment on one person, how can we properly
generalize those results? (We can't!)
library(dplyr)
mtcars %>%
count(cyl)
cyl n
1 4 11
2 6 7
3 8 14
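Replicates are what make it possible to estimate variability at all. As a rough sketch in base R, the same mtcars groups can be summarized with their replicate counts, mean mpg, and the standard error of that mean; the variable names here are illustrative.

```r
data("mtcars")

# Number of replicates, mean, and standard error of mpg per cylinder group
n_rep    <- tapply(mtcars$mpg, mtcars$cyl, length)
mean_mpg <- tapply(mtcars$mpg, mtcars$cyl, mean)
se_mpg   <- tapply(mtcars$mpg, mtcars$cyl, sd) / sqrt(n_rep)

replicates <- data.frame(
  cyl      = names(n_rep),
  n        = as.vector(n_rep),
  mean_mpg = as.vector(mean_mpg),
  se_mpg   = as.vector(se_mpg)
)
replicates
```

With a single observation per group, sd() would return NA and no standard error could be computed.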
Blocking
Helps control variability by making treatment groups more alike
Within blocks, differences will be minimal; across blocks, differences will be larger.
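One common way to use blocks is to randomize treatments separately inside each block. The sketch below assumes a hypothetical study with 12 units in three blocks of four; the block names and group labels are made up for illustration.

```r
set.seed(42)  # fixed seed only so the sketch is reproducible

# Hypothetical study: 12 units in 3 blocks (e.g. clinics) of 4 units each
units <- data.frame(id = 1:12, block = rep(c("A", "B", "C"), each = 4))

# Randomize within each block so every block gets a balanced set of labels
units$group <- NA_character_
for (b in unique(units$block)) {
  idx <- units$block == b
  units$group[idx] <- sample(rep(c("treatment", "control"), length.out = sum(idx)))
}

# Each block should contain 2 treatment and 2 control units
table(units$block, units$group)
```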
Boxplots
# Boxplot of MPG by Car Cylinders
library(ggplot2)
ggplot(mtcars, aes(x = as.factor(cyl), y = mpg)) +
  geom_boxplot(fill = "slateblue", alpha = 0.2) +
  xlab("cyl")
Functions for modeling
Linear models
anova(object,...)
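As a small sketch of how these functions fit together, a linear model can be fit with lm() and then passed to anova() as its object argument. The choice of mpg as the outcome and cylinder count as the factor is an assumption made here for illustration, reusing the mtcars data.

```r
data("mtcars")

# Work on a copy and treat cylinder count as a categorical factor
cars <- mtcars
cars$cyl_f <- factor(cars$cyl)

# Fit a linear model, then build its ANOVA table
fit <- lm(mpg ~ cyl_f, data = cars)
anova_table <- anova(fit)
anova_table
```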
Hypothesis testing
Breaking down hypothesis testing:
Null hypothesis:
there is no change
Alternative hypothesis:
there is a change
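To make the two hypotheses concrete, here is a hedged sketch using a two-sample t-test on mtcars. The question (does mean mpg differ between automatic and manual transmissions?) and the 0.05 significance level are assumptions chosen for illustration. The null hypothesis is that the two means are equal; the alternative is that they differ.

```r
data("mtcars")

alpha <- 0.05  # assumed significance level

# Null: mean mpg is the same for am = 0 and am = 1
# Alternative: the means differ (two-sided Welch t-test, the t.test() default)
result  <- t.test(mpg ~ am, data = mtcars)
p_value <- result$p.value

if (p_value < alpha) {
  message("Reject the null hypothesis: the data suggest a change")
} else {
  message("Fail to reject the null hypothesis: no change detected")
}
```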
Power and sample size
Power: probability that the test correctly rejects the null hypothesis when the alternative
hypothesis is true.
Sample size: How many experimental units you need to survey to detect the desired
difference at the desired power.
Power and sample size calculations
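library(pwr)
# Leave exactly one argument NULL; pwr.anova.test() solves for it (here: power)
pwr.anova.test(k = 3,            # number of groups
               n = 20,           # observations per group
               f = 0.2,          # effect size
               sig.level = 0.05, # significance level
               power = NULL)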
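The same function can be turned around to solve for sample size instead: set n = NULL and supply a target power. The target of 0.8 below is an assumed value, a common convention rather than something stated in the slides.

```r
library(pwr)

# Solve for the per-group sample size needed to reach the assumed target power
needed <- pwr.anova.test(k = 3,
                         n = NULL,
                         f = 0.2,
                         sig.level = 0.05,
                         power = 0.8)

# Round up, since a fraction of an experimental unit cannot be sampled
ceiling(needed$n)
```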