What Does A Statistical Test Do

Statistical tests calculate a test statistic and p-value to determine if there is a statistically significant relationship between predictor and outcome variables. The test statistic measures how much the observed relationship differs from the null hypothesis of no relationship, while the p-value estimates the likelihood of observing that relationship if the null hypothesis is true. Different statistical tests are used depending on the types of variables (e.g. continuous, categorical) and whether assumptions like normality are met. Parametric tests like regression, t-tests, and ANOVA analyze continuous variables, while nonparametric alternatives like Spearman's r are used when assumptions are violated.


What does a statistical test do?

Statistical tests work by calculating a test statistic – a number that describes how much the relationship between
variables in your test differs from the null hypothesis of no relationship.

It then calculates a p value (probability value). The p-value estimates how likely it is that you would see the difference
described by the test statistic if the null hypothesis of no relationship were true.

If the value of the test statistic is more extreme than the statistic calculated from the null hypothesis, then you
can infer a statistically significant relationship between the predictor and outcome variables.

If the value of the test statistic is less extreme than the one calculated from the null hypothesis, then you can infer no
statistically significant relationship between the predictor and outcome variables.
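As a concrete sketch of this logic: for a test whose statistic follows a standard normal distribution under the null hypothesis, the two-tailed p value and the resulting decision can be computed with Python's standard library (the z value here is invented for illustration):

```python
import math

def two_tailed_p_from_z(z):
    """P(|Z| >= |z|) when the test statistic Z is standard normal under the null."""
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

z = 2.5                      # hypothetical test statistic
p = two_tailed_p_from_z(z)
alpha = 0.05                 # conventional significance threshold
print(round(p, 4))           # ~0.0124
print("statistically significant" if p < alpha else "not statistically significant")
```

The smaller the p value, the more extreme the observed statistic is relative to what the null hypothesis would produce.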

When to perform a statistical test

You can perform statistical tests on data that have been collected in a statistically valid manner – either through
an experiment, or through observations made using probability sampling methods.

For a statistical test to be valid, your sample size needs to be large enough to approximate the true distribution of the
population being studied.

To determine which statistical test to use, you need to know:

 whether your data meets certain assumptions.

 the types of variables that you’re dealing with.

Statistical assumptions

Statistical tests make some common assumptions about the data they are testing:

1. Independence of observations (a.k.a. no autocorrelation): the observations/variables you include in your test are not related (for example, multiple measurements of a single test subject are not independent, while measurements of multiple different test subjects are independent).

2. Homogeneity of variance: the variance within each group being compared is similar among all groups. If one
group has much more variation than others, it will limit the test’s effectiveness.

3. Normality of data: the data follows a normal distribution (a.k.a. a bell curve). This assumption applies only
to quantitative data.

If your data do not meet the assumptions of normality or homogeneity of variance, you may be able to perform
a nonparametric statistical test, which allows you to make comparisons without any assumptions about the data
distribution.

If your data do not meet the assumption of independence of observations, you may be able to use a test that accounts
for structure in your data (repeated-measures tests or tests that include blocking variables).

Types of variables

The types of variables you have usually determine what type of statistical test you can use.

Quantitative variables represent amounts of things (e.g. the number of trees in a forest). Types of quantitative variables
include:

 Continuous (aka ratio variables): represent measures and can usually be divided into units smaller than one (e.g.
0.75 grams).
 Discrete (aka integer variables): represent counts and usually can’t be divided into units smaller than one (e.g. 1
tree).

Categorical variables represent groupings of things (e.g. the different tree species in a forest). Types of categorical
variables include:

 Ordinal: represent data with an order (e.g. rankings).

 Nominal: represent group names (e.g. brands or species names).

 Binary: represent data with a yes/no or 1/0 outcome (e.g. win or lose).

Choose the test that fits the types of predictor and outcome variables you have collected (if you are doing
an experiment, these are the independent and dependent variables). Consult the tables below to see which test best
matches your variables.

Choosing a parametric test: regression, comparison, or correlation

Parametric tests usually have stricter requirements than nonparametric tests, and are able to make stronger inferences
from the data. They can only be conducted with data that adheres to the common assumptions of statistical tests.

The most common types of parametric test include regression tests, comparison tests, and correlation tests.

Regression tests

Regression tests look for cause-and-effect relationships. They can be used to estimate the effect of one or more
continuous variables on another variable.

Simple linear regression
 Predictor variable: continuous (1 predictor)
 Outcome variable: continuous (1 outcome)
 Research question example: What is the effect of income on longevity?

Multiple linear regression
 Predictor variable: continuous (2 or more predictors)
 Outcome variable: continuous (1 outcome)
 Research question example: What is the effect of income and minutes of exercise per day on longevity?

Logistic regression
 Predictor variable: continuous
 Outcome variable: binary
 Research question example: What is the effect of drug dosage on the survival of a test subject?

Comparison tests

Comparison tests look for differences among group means. They can be used to test the effect of a categorical variable
on the mean value of some other characteristic.

T-tests are used when comparing the means of precisely two groups (e.g., the average heights of men and
women). ANOVA and MANOVA tests are used when comparing the means of more than two groups (e.g., the average
heights of children, teenagers, and adults).

Paired t-test
 Predictor variable: categorical (1 predictor)
 Outcome variable: quantitative (groups come from the same population)
 Research question example: What is the effect of two different test prep programs on the average exam scores for students from the same class?

Independent t-test
 Predictor variable: categorical (1 predictor)
 Outcome variable: quantitative (groups come from different populations)
 Research question example: What is the difference in average exam scores for students from two different schools?

ANOVA
 Predictor variable: categorical (1 or more predictors)
 Outcome variable: quantitative (1 outcome)
 Research question example: What is the difference in average pain levels among post-surgical patients given three different painkillers?

MANOVA
 Predictor variable: categorical (1 or more predictors)
 Outcome variable: quantitative (2 or more outcomes)
 Research question example: What is the effect of flower species on petal length, petal width, and stem length?

Correlation tests

Correlation tests check whether variables are related without hypothesizing a cause-and-effect relationship.

These can be used to test whether two variables you want to use in (for example) a multiple regression test are
autocorrelated.

Pearson's r
 Variables: 2 continuous variables
 Research question example: How are latitude and temperature related?

Choosing a nonparametric test

Non-parametric tests don’t make as many assumptions about the data, and are useful when one or more of the
common statistical assumptions are violated. However, the inferences they make aren’t as strong as with parametric
tests.

Spearman's r
 Predictor variable: quantitative
 Outcome variable: quantitative
 Use in place of: Pearson's r

Chi square test of independence
 Predictor variable: categorical
 Outcome variable: categorical
 Use in place of: Pearson's r

Sign test
 Predictor variable: categorical
 Outcome variable: quantitative
 Use in place of: one-sample t-test

Kruskal–Wallis H
 Predictor variable: categorical (3 or more groups)
 Outcome variable: quantitative
 Use in place of: ANOVA

ANOSIM
 Predictor variable: categorical (3 or more groups)
 Outcome variable: quantitative (2 or more outcome variables)
 Use in place of: MANOVA

Wilcoxon Rank-Sum test
 Predictor variable: categorical (2 groups)
 Outcome variable: quantitative (groups come from different populations)
 Use in place of: independent t-test

Wilcoxon Signed-rank test
 Predictor variable: categorical (2 groups)
 Outcome variable: quantitative (groups come from the same population)
 Use in place of: paired t-test
Reporting Statistics in APA Style | Guidelines & Examples

The APA Publication Manual is commonly used for reporting research results in the social and natural sciences. This
article walks you through APA Style standards for reporting statistics in academic writing.

Statistical analysis involves gathering and testing quantitative data to make inferences about the world. A statistic is any
number that describes a sample: it can be a proportion, a range, or a measurement, among other things.

When reporting statistics, use these formatting rules and suggestions from APA where relevant.

Numbers and measurements

In general, APA advises using words for numbers under 10 and numerals for 10 and greater. However, always spell out a
number that appears at the start of a sentence (or rephrase).

You should always use numerals for:

 Exact numbers before units of measurement or time

 Mathematical equations

 Percentages and percentiles

 Ratios, decimals, and uncommon fractions

 Scores and points on scales (e.g., 7-point scale)

 Exact amounts of money

Units of measurement and time

Report exact measurements using numerals, and use symbols or abbreviations for common units of measurement when
they accompany exact measurements. Include a space between the number and the abbreviation.

When stating approximate figures, use words to express numbers under 10, and spell out the names of units of
measurement.

Examples: Reporting exact and approximate figures

 The ball weighed 7 kg.

 The ball weighed approximately seven kilograms.

Measurements should be reported in metric units. If you recorded measurements in non-metric units, include metric
equivalents in your report as well as the original units.

Percentages

Use numerals for percentages along with the percent symbol (%). Don’t insert a space between the number and the
symbol.

Words for “percent” or “percentage” should only be used in text when numbers aren’t used, or when a percentage
appears at the start of a sentence.

Examples: Reporting percentages

 Of these respondents, 15% agreed with the statement.

 Fifteen percent of respondents agreed with the statement.

 The percentage was higher in 2020.


Decimal places and leading zeros

The number of decimal places to report depends on what you’re reporting. Generally, you should aim to round numbers
while retaining precision. It’s best to present fewer decimal digits to aid easy understanding.

The following guidelines are usually applicable.

One decimal place:
 Means
 Standard deviations
 Descriptive statistics based on discrete data

Two decimal places:
 Correlation coefficients
 Proportions
 Inferential test statistics such as t values, F values, and chi-squares

Use two or three decimal places and report exact values for all p values greater than .001. For p values smaller
than .001, report them as p < .001.

Leading zeros

A leading zero is zero before the decimal point for numbers less than one. In APA Style, it’s only used in some cases.

Use a leading zero only when the statistic you’re describing can be greater than one. If it can never exceed one, omit the
leading zero.

Use a leading zero:
 Variables that can be greater than 1 (e.g., height or weight)
 Cohen's d
 t values
 F values
 z values

Don't use a leading zero:
 p values
 Pearson correlation coefficient
 Coefficient of determination
 Cronbach's alpha

Examples: Use of decimal places and leading zeros

 Consumers reported high satisfaction with the services (M = 4.1, SD = 0.8).

 The correlation was medium-sized (r = .35).

 Although significant results were obtained, the effect was relatively small (p = .015, d = 0.11).
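These rules are mechanical enough to script. A small Python helper, offered as a sketch rather than an official APA tool (the threshold and decimal choices follow the guidelines above):

```python
def apa_p(p):
    """Format a p value APA-style: no leading zero, exact to 3 decimals, '< .001' below that."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".replace("0.", ".", 1)

def apa_stat(symbol, value, can_exceed_one):
    """Use a leading zero only for statistics that can exceed 1 (e.g., d), not for r or p."""
    s = f"{value:.2f}"
    if not can_exceed_one:
        s = s.replace("0.", ".", 1)
    return f"{symbol} = {s}"

print(apa_p(0.015))               # p = .015
print(apa_p(0.0004))              # p < .001
print(apa_stat("r", 0.35, False)) # r = .35
print(apa_stat("d", 0.11, True))  # d = 0.11
```

A helper like this is handy when generating results text programmatically from analysis output.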

Formatting mathematical formulas

Provide formulas only when you use new or uncommon equations. For short equations, present them within one line in
the main text whenever possible.

Make the order of operations as clear as possible by using parentheses (round brackets) for the first step, brackets
[square brackets] for the second step, and braces {curly brackets} for the third step, where necessary.

Example: Short mathematical formula
We used the formula c = [(x − 1)/b]-1 in our analysis.

More complex equations, or equations that take more than one line, should be displayed on their own lines. Equations
should be displayed and numbered if you will reference them later on, regardless of their complexity. Number equations
by placing the numbers in parentheses near the right edge of the page.
Example: Numbering mathematical formulas
[Displayed, numbered equations omitted.]

Formatting statistical terms

When reporting statistical results, present information in easily understandable ways. You can use a mix of text, tables,
and figures to present data effectively when you have a lot of numbers to report.

In your main text, use helpful words like “respectively” or “in order”  to aid understanding when listing several statistics
in a sequence.

The APA manual provides guidelines for dealing with statistical terms, symbols and abbreviations.

Symbols and abbreviations

Population parameters are often represented with Greek letters, while sample statistics are often represented with
italicized Latin letters.

Use the population symbol (N) for the total number of elements in a sample, and use the sample symbol (n) for the
number of elements in each subgroup of the full sample.

In general, abbreviations should be defined on first use, but this isn’t always the case for common statistical
abbreviations.

Define Don’t define

 Abbreviations that do not represent statistics:  Statistical symbols or abbreviations: M, SD, F, t, df,


ANOVA, CI, CFA p, N, n, OR

 Non-standard abbreviations that appear in tables  Greek letters: α, β, χ2


and figures, even if they are already defined in the
text.

Use symbols for statistical terms when directly referring to a numerical quantity or operator: M = 5.41.
Use words for statistical terms in the main text: “the mean accuracy was higher…”

Capitalization, italicization and hyphenation

Statistical terms such as t test, z test, and p value always begin with a lowercase, italicized letter. Never begin a sentence
with lowercase statistical abbreviations.

These statistical terms should only be hyphenated when they modify a subsequent word (e.g., “z-test results” versus
results of “z tests”).

You can form plurals of statistical symbols (e.g., M or p) by adding a non-italicized “s” to the end with no apostrophe
(e.g., Ms or ps).

In general, the following guidelines apply.

Italicize Don’t italicize

 Letters when they are statistical symbols or  Greek letters: σ or Χ2


algebraic variables: Cohen’s d, SD, p value, t test
 Subscripts for statistical symbols: Mcontrol

 Trigonometric terms: sin, cos


Italicize Don’t italicize

 Vectors or matrices (boldface these instead): V, X 

Capitalize Don’t capitalize

Names of effects or variables only when they appear with Lowercase statistical terms: t test, p value
multiplication signs: Age × Sex effect

Parentheses vs. brackets

Always aim to avoid nested parentheses and brackets when reporting statistics. Instead, you should use commas to
separate related statistics.

Use parentheses (round brackets) for:
 Degrees of freedom
 Statistical values when they aren't already in parentheses

Use [square] brackets for:
 Confidence interval limits
 Statistics in text that's already enclosed within parentheses

Examples: Reporting values in parentheses

 Scores improved between the pretest and posttest (p < .001).

 Significant differences in test scores were recorded, F(1, 30) = 4.67, p = .003.

 (A previous meta-analysis highlighted low effect sizes [d = 0.1] in the field).

Reporting means and standard deviations

Report descriptive statistics to summarize your data. Quantitative data is often reported using means and standard
deviations, while categorical data (e.g., demographic variables) is reported using proportions.

Means and standard deviations can be presented in the main text and/or in parentheses. You don’t need to repeat the
units of measurement (e.g., centimeters) for statistics relating to the same data.

Examples: Reporting mean and standard deviation

 Average sample height was 136.4 cm (SD = 15.1).

 The height of the initial sample was relatively low (M = 125.9 cm, SD = 16.6).

 Height significantly varied between children aged 5–7, 8–10, and 11–13. The means were 115.3, 133.5, and
149.1 cm, respectively.

Reporting chi-square tests

To report the results of a chi-square test, include the following:

 the degrees of freedom (df) in parentheses

 the chi-square (Χ2) value (also referred to as the chi-square test statistic)

 the p value

Example: Reporting chi-square test results

 A chi-square test of independence revealed a significant association between gender and product preference,
Χ2(8) = 19.7, p = .012.

 Based on a chi-square test of goodness of fit, Χ2(4) = 11.34, p = .023, the sample's distribution of religious
affiliations matched that of the population.
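The Χ2 statistic itself comes from comparing observed counts with the counts expected under independence. A Python standard-library sketch with invented counts:

```python
def chi_square(table):
    """Chi-square statistic and degrees of freedom for a contingency table (rows = groups)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

chi2, df = chi_square([[30, 10], [20, 40]])  # hypothetical 2x2 counts
print(f"X2({df}) = {chi2:.2f}")              # X2(1) = 16.67
```

The degrees of freedom reported in parentheses are (rows − 1) × (columns − 1), exactly as computed here.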


Reporting z tests and t tests

For z tests

To report the results of a z test, include the following:

 the z value (also referred to as the z statistic or z score)

 the p value

Example: Reporting z test results

 The participants’ scores were higher than the population average, z = 2.48, p = .013.

 Higher scores were obtained on the new 20-item scale compared to the previous 40-item scale, z =
2.67, p = .007.

For t tests

To report the results of a t test, include the following:

 the degrees of freedom (df) in parentheses

 the t value (also referred to as the t statistic)

 the p value

Example: Reporting t test results

 Older adults experienced significantly more loneliness than younger adults, t(32) = 2.94, p = .006.

 Reaction times were significantly faster for mice in the experimental condition, t(53) = 5.94, p < .001.

Reporting analysis of variance (ANOVAs)

To report the results of an ANOVA, include the following:

 the degrees of freedom (between groups, within groups) in parentheses

 the F value (also referred to as the F statistic)

 the p value

Example: Reporting ANOVA results

 A one-way ANOVA demonstrated that the effect of leadership style was significant for employee
engagement, F(2, 78) = 4.58, p = .013.

 We found a statistically significant main effect of age group on social media use, F(3, 117) = 3.19, p = .026.

Reporting correlations

To report the results of a correlation, include the following:

 the degrees of freedom in parentheses

 the r value (the correlation coefficient)

 the p value

Example: Reporting correlation results

 We found a strong correlation between average temperature and new daily cases of COVID-19, r(357) = .42, p < .001.
Reporting regressions

Results of regression analyses are often displayed in a table because the output includes many numbers.

To report the results of a regression analysis in the text, include the following:

 the R2 value (the coefficient of determination)

 the F value (also referred to as the F statistic)

 the degrees of freedom in parentheses

 the p value

The format is usually:

Example: Reporting regression results

 SAT scores predicted college GPA, R2 = .34, F(1, 416) = 6.71, p = .009.

Reporting confidence intervals

You should report confidence intervals of effect sizes (e.g., Cohen’s d) or point estimates where relevant.

To report a confidence interval, state the confidence level and use brackets to enclose the lower and upper limits of the
confidence interval, separated by a comma.

Example: Reporting confidence intervals

 Older adults experienced significantly more loneliness than younger adults, t(32) = 2.94, p = .006, d = 0.81, 95%
CI [0.6, 1.02].

 On average, the treatment resulted in a 30% reduction in migraine frequency, 99% CI [26.5, 33.5].

When presenting multiple confidence intervals with the same confidence levels in a sequence, don’t repeat the
confidence level or the word “CI.”
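As a sketch of where those limits come from: for a mean, the interval is the point estimate plus or minus a critical value times the standard error. A Python standard-library version using the normal approximation (a t critical value would be more exact for small samples; the data are invented):

```python
import statistics

def ci_mean(data, level=0.95):
    """Normal-approximation confidence interval for a sample mean."""
    n = len(data)
    mean = statistics.mean(data)
    std_error = statistics.stdev(data) / n ** 0.5
    z = statistics.NormalDist().inv_cdf((1 + level) / 2)  # e.g. ~1.96 for 95%
    return mean - z * std_error, mean + z * std_error

low, high = ci_mean([4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3])
print(f"95% CI [{low:.2f}, {high:.2f}]")  # 95% CI [4.86, 5.14]
```

The two returned numbers are the lower and upper limits that go inside the square brackets when reporting.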
An Introduction to t Tests | Definitions, Formula and Examples

A t test is a statistical test that is used to compare the means of two groups. It is often used in hypothesis testing to
determine whether a process or treatment actually has an effect on the population of interest, or whether two groups
are different from one another.

t test example: You want to know whether the mean petal length of iris flowers differs according to their species. You find two different species of irises growing in a garden and measure 25 petals of each species. You can test the difference between these two groups using a t test with null and alternative hypotheses.

 The null hypothesis (H0) is that the true difference between these group means is zero.

 The alternate hypothesis (Ha) is that the true difference is different from zero.

When to use a t test

A t test can only be used when comparing the means of two groups (a.k.a. pairwise comparison). If you want to compare
more than two groups, or if you want to do multiple pairwise comparisons, use an ANOVA test or a post-hoc test.

The t test is a parametric test of difference, meaning that it makes the same assumptions about your data as other
parametric tests. The t test assumes your data:

1. are independent

2. are (approximately) normally distributed

3. have a similar amount of variance within each group being compared (a.k.a. homogeneity of variance)

If your data do not fit these assumptions, you can try a nonparametric alternative to the t test, such as the Wilcoxon signed-rank test (in place of a paired t test) or the Wilcoxon rank-sum test (in place of an independent t test).

What type of t test should I use?

When choosing a t test, you will need to consider two things: whether the groups being compared come from a
single population or two different populations, and whether you want to test the difference in a specific direction.

One-sample, two-sample, or paired t test?

 If the groups come from a single population (e.g., measuring before and after an experimental treatment),
perform a paired t test. This is a within-subjects design.

 If the groups come from two different populations (e.g., two different species, or people from two separate
cities), perform a two-sample t test (a.k.a. independent t test). This is a between-subjects design.

 If there is one group being compared against a standard value (e.g., comparing the acidity of a liquid to a neutral
pH of 7), perform a one-sample t test.

One-tailed or two-tailed t test?

 If you only care whether the two populations are different from one another, perform a two-tailed t test.

 If you want to know whether one population mean is greater than or less than the other, perform a one-
tailed t test.

t test example: In your test of whether petal length differs by species:

 Your observations come from two separate populations (separate species), so you perform a two-sample t test.

 You don’t care about the direction of the difference, only whether there is a difference, so you choose to use a
two-tailed t test.
Performing a t test

The t test estimates the true difference between two group means using the ratio of the difference in group means over
the pooled standard error of both groups. You can calculate it manually using a formula, or use statistical analysis
software.

T test formula

The formula for the two-sample t test (a.k.a. the Student's t test) is:

t = (x̄1 − x̄2) / √( s² (1/n1 + 1/n2) )

In this formula, t is the t value, x̄1 and x̄2 are the means of the two groups being compared, s² is the pooled variance of the two groups (the full denominator is the pooled standard error of the difference), and n1 and n2 are the number of observations in each of the groups.

A larger t value shows that the difference between group means is greater than the pooled standard error, indicating a
more significant difference between the groups.

You can compare your calculated t value against the values in a critical value chart (e.g., Student’s t table) to determine
whether your t value is greater than what would be expected by chance. If so, you can reject the null hypothesis and
conclude that the two groups are in fact different.
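The t value computation itself translates directly into code. The document's worked example uses R; here is an equivalent Python standard-library sketch of the pooled-variance (Student's) t value, with invented measurements:

```python
import math
import statistics

def two_sample_t(group1, group2):
    """Student's two-sample t value using a pooled variance estimate."""
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = statistics.mean(group1), statistics.mean(group2)
    # pool the two sample variances, weighted by their degrees of freedom
    pooled_var = ((n1 - 1) * statistics.variance(group1) +
                  (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

t = two_sample_t([1.4, 1.5, 1.6], [5.4, 5.5, 5.7])  # hypothetical petal lengths
print(round(t, 2))  # -38.26
```

Statistical software then compares this t value against the t distribution with n1 + n2 − 2 degrees of freedom to produce the p value.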

T test function in statistical software

Most statistical software (R, SPSS, etc.) includes a t test function. This built-in function will take your raw data and
calculate the t value. It will then compare it to the critical value, and calculate a p-value. This way you can quickly see
whether your groups are statistically different.

In your comparison of flower petal lengths, you decide to perform your t test using R. The code looks like this:

t.test(Petal.Length ~ Species, data = flower.data)

Interpreting test results

If you perform the t test for your flower hypothesis in R, you will receive the following output:
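The printout itself does not survive in this copy; reconstructed from the values discussed in the list below, an R t.test printout for this comparison has roughly this shape (layout approximate):

```
	Welch Two Sample t-test

data:  Petal.Length by Species
t = -33.719, df = 30.196, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -4.331 -3.836
sample estimates:
mean in group 1  mean in group 2
          1.456            5.540
```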

The output provides:

1. An explanation of what is being compared, called data in the output table.

2. The t value: -33.719. Note that it’s negative; this is fine! In most cases, we only care about the absolute value of
the difference, or the distance from 0. It doesn’t matter which direction.

3. The degrees of freedom: 30.196. Degrees of freedom is related to your sample size, and shows how many ‘free’
data points are available in your test for making comparisons. The greater the degrees of freedom, the better
your statistical test will work.
4. The p value: 2.2e-16 (i.e., 0.00000000000000022, a 2.2 preceded by 15 zeros after the decimal point). This describes the probability that you would see a t value at least as extreme as this one if the null hypothesis were true.

5. A statement of the alternative hypothesis (Ha). In this test, the Ha is that the difference is not 0.

6. The 95% confidence interval. This is a range of values that, across repeated samples, would contain the true difference in means 95% of the time. The level can be changed from 95% if you want a wider or narrower interval, but 95% is very commonly used.

7. The mean petal length for each group.

t test example: From the output table, we can see that the difference in means for our sample data is −4.084 (1.456 − 5.540), and the confidence interval shows that the true difference in means is likely to lie between −4.331 and −3.836. Because this interval does not include 0, and because our p value of 2.2e-16 is much smaller than 0.05, we can reject the null hypothesis of no difference and say with a high degree of confidence that the true difference in means is not equal to zero.

Presenting the results of a t test

When reporting your t test results, the most important values to include are the t value, the p value, and the degrees of
freedom for the test. These will communicate to your audience whether the difference between the two groups
is statistically significant (a.k.a. that it is unlikely to have happened by chance).

You can also include the summary statistics for the groups being compared, namely the mean and standard deviation. In
R, the code for calculating the mean and the standard deviation from the data looks like this:

library(dplyr)  # the %>% pipe, group_by(), and summarize() come from dplyr

flower.data %>%
  group_by(Species) %>%
  summarize(mean_length = mean(Petal.Length),
            sd_length = sd(Petal.Length))

In our example, you would report the results like this:

The difference in petal length between iris species 1 (M = 1.46, SD = 0.206) and iris species 2 (M = 5.54, SD = 0.569) was significant (t(30) = −33.719, p < 2.2e-16).
One-way ANOVA | When and How to Use It

ANOVA, which stands for Analysis of Variance, is a statistical test used to analyze the difference between the means of
more than two groups.

A one-way ANOVA uses one independent variable, while a two-way ANOVA uses two independent variables.

One-way ANOVA example: As a crop researcher, you want to test the effect of three different fertilizer mixtures on crop yield. You can use a one-way ANOVA to find out if there is a difference in crop yields between the three groups.

When to use a one-way ANOVA

Use a one-way ANOVA when you have collected data about one categorical independent variable and one quantitative
dependent variable. The independent variable should have at least three levels (i.e. at least three different groups or
categories).

ANOVA tells you if the dependent variable changes according to the level of the independent variable. For example:

 Your independent variable is social media use, and you assign groups to low, medium, and high levels of social
media use to find out if there is a difference in hours of sleep per night.

 Your independent variable is brand of soda, and you collect data on Coke, Pepsi, Sprite, and Fanta to find out if
there is a difference in the price per 100ml.

 Your independent variable is type of fertilizer, and you treat crop fields with mixtures 1, 2, and 3 to find out if
there is a difference in crop yield.

The null hypothesis (H0) of ANOVA is that there is no difference among group means. The alternative hypothesis (Ha) is
that at least one group differs significantly from the overall mean of the dependent variable.

If you only want to compare two groups, use a t test instead.

How does an ANOVA test work?

ANOVA determines whether the groups created by the levels of the independent variable are statistically different by
calculating whether the means of the treatment levels are different from the overall mean of the dependent variable.

If any of the group means is significantly different from the overall mean, then the null hypothesis is rejected.

ANOVA uses the F test for statistical significance. This allows for comparison of multiple means at once, because the
error is calculated for the whole set of comparisons rather than for each individual two-way comparison (which would
happen with a t test).

The F test compares the variance between the group means with the variance within the groups. If the variance within groups is smaller than the variance between groups, the test will return a higher F value, and therefore a higher likelihood that the difference observed is real and not due to chance.
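In code, that comparison is the ratio of the between-group mean square to the within-group mean square. A Python standard-library sketch with invented groups:

```python
import statistics

def one_way_f(groups):
    """F statistic for a one-way ANOVA: between-group MS over within-group MS."""
    k = len(groups)                                  # number of groups
    n = sum(len(g) for g in groups)                  # total observations
    grand_mean = statistics.mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

f = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f, 2))  # 13.0
```

An F value near 1 means group means vary no more than chance would predict; a large F value suggests at least one group mean differs from the rest.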

Assumptions of ANOVA

The assumptions of the ANOVA test are the same as the general assumptions for any parametric test:

1. Independence of observations: the data were collected using statistically valid sampling methods, and there are
no hidden relationships among observations. If your data fail to meet this assumption because you have
a confounding variable that you need to control for statistically, use an ANOVA with blocking variables.

2. Normally-distributed response variable: The values of the dependent variable follow a normal distribution.

3. Homogeneity of variance: The variation within each group being compared is similar for every group. If the
variances are different among the groups, then ANOVA probably isn’t the right fit for the data.
Performing a one-way ANOVA

While you can perform an ANOVA by hand, it is difficult to do so with more than a few observations. We will perform
our analysis in the R statistical program because it is free, powerful, and widely available. For a full walkthrough of this
ANOVA example, see our guide to performing ANOVA in R.

The sample dataset from our imaginary crop yield experiment contains data about:

 fertilizer type (type 1, 2, or 3)

 planting density (1 = low density, 2 = high density)

 planting location in the field (blocks 1, 2, 3, or 4)

 final crop yield (in bushels per acre).

This gives us enough information to run several different ANOVA tests and see which model is the best fit for the data.
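A data frame with this structure might be built as follows (the column names and the simulated values here are illustrative stand-ins, not the actual dataset from the guide):

```r
# Hypothetical crop.data frame matching the variables described above
set.seed(42)
crop.data <- data.frame(
  fertilizer = factor(rep(1:3, each = 32)),          # fertilizer type 1, 2, or 3
  density    = factor(rep(rep(1:2, each = 16), 3)),  # 1 = low, 2 = high
  block      = factor(rep(1:4, 24)),                 # planting location in the field
  yield      = rnorm(96, mean = 177, sd = 0.6)       # bushels per acre
)
str(crop.data)
```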

For the one-way ANOVA, we will only analyze the effect of fertilizer type on crop yield.

After loading the dataset into our R environment, we can use the command aov() to run an ANOVA. In this example we
will model the differences in the mean of the response variable, crop yield, as a function of type of fertilizer.

One-way ANOVA R code:

one.way <- aov(yield ~ fertilizer, data = crop.data)

Interpreting the results

To view the summary of a statistical model in R, use the summary() function.

One-way ANOVA model summary R code:

summary(one.way)

The summary of an ANOVA test in R is a table with one row for the independent variable and one for the residuals, and
columns for Df, Sum Sq, Mean Sq, F value, and Pr(>F).

The ANOVA output provides an estimate of how much of the variation in the dependent variable can be explained by the
independent variable.

 The first column lists the independent variable along with the model residuals (aka the model error).

 The Df column displays the degrees of freedom for the independent variable (the number of levels within the
variable minus 1) and the degrees of freedom for the residuals (the total number of observations minus 1, minus the
degrees of freedom used by the independent variables).

 The Sum Sq column displays the sum of squares (a.k.a. the total variation) between the group means and the
overall mean explained by that variable. The sum of squares for the fertilizer variable is 6.07, while the sum of
squares of the residuals is 35.89.

 The Mean Sq column is the mean of the sum of squares, which is calculated by dividing the sum of squares by
the degrees of freedom.

 The F value column is the test statistic from the F test: the mean square of each independent variable divided by
the mean square of the residuals. The larger the F value, the more likely it is that the variation associated with
the independent variable is real and not due to chance.
 The Pr(>F) column is the p value of the F statistic. This shows how likely it is that the F value calculated from the
test would have occurred if the null hypothesis of no difference among group means were true.
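These columns can also be pulled out of the summary table programmatically, which makes the relationships among them explicit. The sketch below uses a simulated stand-in for crop.data, so its Sum Sq and F values will differ from those quoted above:

```r
# Simulated stand-in for the crop data (illustrative values only)
set.seed(7)
crop.data <- data.frame(
  fertilizer = factor(rep(1:3, each = 32)),
  yield = rnorm(96, mean = 177, sd = 0.3) + rep(c(0, 0.2, 0.5), each = 32)
)
one.way <- aov(yield ~ fertilizer, data = crop.data)
tab <- summary(one.way)[[1]]

tab[["Df"]]       # (levels - 1) for fertilizer, then the residual df
tab[["Sum Sq"]]   # sums of squares
tab[["Mean Sq"]]  # Sum Sq divided by Df
tab[["F value"]]  # Mean Sq (fertilizer) / Mean Sq (residuals)
tab[["Pr(>F)"]]   # p value of the F statistic

# Mean Sq is literally Sum Sq / Df:
all.equal(tab[["Mean Sq"]], tab[["Sum Sq"]] / tab[["Df"]])
```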

Because the p value of the independent variable, fertilizer, is statistically significant (p < 0.05), we can conclude
that fertilizer type has a significant effect on average crop yield.

Post-hoc testing

ANOVA will tell you whether there are differences among the levels of the independent variable, but not which of those
differences are significant. To find out how the treatment levels differ from one another, perform a Tukey's Honestly
Significant Difference (TukeyHSD) post-hoc test.

Tukey test R code:

TukeyHSD(one.way)

The Tukey test runs pairwise comparisons among each of the groups, and uses a conservative error estimate to find the
groups which are statistically different from one another.
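A minimal sketch of running and inspecting the Tukey test (again with a simulated stand-in for crop.data, so the exact numbers will differ from those discussed below):

```r
# Simulated stand-in for the crop data (illustrative values only)
set.seed(3)
crop.data <- data.frame(
  fertilizer = factor(rep(1:3, each = 32)),
  yield = rnorm(96, mean = 177, sd = 0.3) + rep(c(0, 0.1, 0.5), each = 32)
)
one.way <- aov(yield ~ fertilizer, data = crop.data)
tukey <- TukeyHSD(one.way)

tukey$fertilizer   # diff, lwr, upr, and adjusted p value for each pair of groups
plot(tukey)        # pairs whose confidence interval excludes 0 differ significantly
```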

The output of TukeyHSD() contains the following information:

First, the table reports the model being tested (‘Fit’). Next it lists the pairwise differences among groups for the
independent variable.

Under the ‘$fertilizer’ section, we see the mean difference between each fertilizer treatment (‘diff’), the lower and
upper bounds of the 95% confidence interval (‘lwr’ and ‘upr’), and the p value, adjusted for multiple pairwise
comparisons.

The pairwise comparisons show that fertilizer type 3 has a significantly higher mean yield than both fertilizer 2 and
fertilizer 1, but the difference between the mean yields of fertilizers 2 and 1 is not statistically significant.

Reporting the results of ANOVA

When reporting the results of an ANOVA, include a brief description of the variables you tested, the  F value, degrees of
freedom, and p values for each independent variable, and explain what the results mean.

Example: Reporting the results of a one-way ANOVA
We found a statistically significant difference in average crop yield according to fertilizer type (F(2) = 9.073,
p < 0.001). A Tukey post-hoc test revealed significant pairwise differences between fertilizer types 3 and 2, with an
average difference of 0.42 bushels/acre (p < 0.05), and between fertilizer types 3 and 1, with an average difference of
0.59 bushels/acre (p < 0.01).

If you want to provide more detailed information about the differences found in your test, you can also include a graph
of the ANOVA results, with grouping letters above each level of the independent variable to show which groups are
statistically different from one another:
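One way to sketch such a graph in base R is a boxplot with letters placed above each group. The letters here follow the Tukey result reported above (fertilizers 1 and 2 share a letter because they do not differ significantly; fertilizer 3 gets its own); the data are a simulated stand-in:

```r
# Simulated stand-in for the crop data (illustrative values only)
set.seed(5)
crop.data <- data.frame(
  fertilizer = factor(rep(1:3, each = 32)),
  yield = rnorm(96, mean = 177, sd = 0.3) + rep(c(0, 0.1, 0.5), each = 32)
)

boxplot(yield ~ fertilizer, data = crop.data,
        xlab = "Fertilizer type", ylab = "Yield (bushels/acre)")

# Grouping letters taken from the Tukey pairwise comparisons:
# groups sharing a letter are not significantly different
letters_by_group <- c("a", "a", "b")
group_max <- tapply(crop.data$yield, crop.data$fertilizer, max)
text(x = 1:3, y = group_max + 0.2, labels = letters_by_group)
```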
