Statistics Cheat Sheet

Statistics in Behavioral Sciences Cheat Sheet

by SH (Sana_H) via

Statistics

the branch of mathematics in which data are used descriptively or inferentially to find or support answers for scientific and other quantifiable questions.
It encompasses various techniques and procedures for recording, organizing, analyzing, and reporting quantitative information.

Difference - parametric test & non-parametric test

property | PARAMETRIC | NON-PARAMETRIC
assumptions | YES | NO
value for central tendency | mean | median/mode
probability distribution | normally distributed | user specific
population knowledge | required | not required
used for | interval data | nominal, ordinal data
correlation | pearson | spearman
tests | t test, z test, f test, ANOVA | Kruskal Wallis H test, Mann-Whitney U, Chi-square

Correlation Coefficient

a statistical measure of the strength of the relationship between the relative movements of two variables
value ranges from -1 to +1
-1 = perfect negative or inverse correlation
+1 = perfect positive correlation or direct relationship
0 = no linear relationship

Alternatives

PARAMETRIC | NON-PARAMETRIC
one sample z test, one sample t test | one sample sign test
one sample z test, one sample t test | one sample Wilcoxon signed rank test
two way ANOVA | Friedman test
one way ANOVA | Kruskal Wallis test
independent sample t test | Mann-Whitney U test
one way ANOVA | Mood's median test
pearson correlation | spearman correlation

Paired t-test

to compare means of two related groups
ex. compare weight of 20 mice before and after treatment
two conditions:
- pre post treatment
- two diff conditions ex two drugs
assumptions:
- random selection
- normally distributed
- no extreme outliers
$t = \dfrac{m}{s/\sqrt{n}}$
m = sample mean of differences
s = sample SD of differences, n = number of pairs
df = n - 1

t-distribution

aka Student's t-distribution = probability distribution similar to normal distribution but has heavier tails
used to estimate pop parameters for small samples
tail heaviness is determined by degrees of freedom; compared with the normal distribution it gives lower probability to the centre and higher to the tails, and has higher kurtosis and larger spread around 0
symmetrical, unimodal, centred at 0
df = n - 1
above 30 df, use z-distribution
t-score = no of SD from mean in a t-distribution
we find:
- upper and lower boundaries
- p value
TO BE USED WHEN:
- small sample
- SD is unknown
ASSUMPTIONS
- cont or ordinal scale
- random selection
- equal SD for indep two-sample t-test
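As a rough illustration of the paired t-test formula on this sheet, t = m / (s/√n) with df = n - 1, here is a minimal sketch using only the Python standard library. The mouse weights are made-up numbers for illustration, and `paired_t` is a hypothetical helper name, not part of any library.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, df) for a paired t-test: t = m / (s / sqrt(n)), df = n - 1."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    m = statistics.mean(diffs)   # sample mean of the differences
    s = statistics.stdev(diffs)  # sample SD of the differences (n - 1 denominator)
    return m / (s / math.sqrt(n)), n - 1

# Hypothetical weights (g) of 5 mice before and after treatment.
before = [20.0, 22.0, 19.0, 24.0, 21.0]
after = [22.0, 25.0, 20.0, 27.0, 22.0]
t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```

Here the differences are 2, 3, 1, 3, 1, so m = 2, s = 1 and t = 2√5 ≈ 4.472 with df = 4. The two-tailed critical value for df = 4 at α = 0.05 is 2.776, so under these made-up numbers H0 would be rejected.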


Two-sample z-test

to determine if means of two independent populations are equal or to find out if there is a significant difference between two populations by comparing sample means
knowledge of:
SD and sample >30 in each group
eg. compare performance of 2 students, average salaries, employee performance, compare IQ, etc
$z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$
s = SD
$z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
(µ₁ - µ₂) = hypothesized difference between pop means

Point Biserial correlation

measures relationship between two variables
r_pb = correlation coefficient
one continuous variable (ratio/interval scale)
one naturally binary variable
FORMULA:
$r_{pb} = \dfrac{M_1 - M_0}{S_n}\sqrt{pq}$
Sn = SD
M1, M0 = means of the continuous variable in the two groups
p, q = proportions of cases in the two groups

z-test

for hypothesis testing
to check whether means of two populations are equal to each other when pop variance is known
we have knowledge of:
- SD/population variance and/or sample n=30 or more
if both unknown -> t-test
REJECT NULL HYPOTHESIS IF Z STATISTIC IS STATISTICALLY SIGNIFICANT
z-statistic / z-score = number representing result from z-test
z critical value divides graph into acceptance and rejection regions
if z stat falls in rejection region -> H0 can be rejected
TYPES
- One-sample z-test
- Two-sample z-test

ANOVA

Analysis of Variance
comparing several sets of scores
to test if means of 3 or more groups are equal
comparison of variance between and within groups
to check if sample groups are affected by the same factors and to the same degree
compare differences in means and variance of distribution
ONE-WAY ANOVA: "one-way" = no of IVs
tests whether a single IV with different (2 or more) levels/variations has a measurable effect on the DV
compare means of 2 or more indep groups
aka:
- one-factor ANOVA
- one-way analysis of variance
- between subjects ANOVA
ASSUMPTIONS:
- independent samples
- equal sample sizes in groups/levels
- normally distributed
- equal variance
F test is used to check statistical significance
higher F value --> higher likelihood that difference observed is real and not due to chance
used in field studies, experiments, quasi-exp
- min 6 subjects
- same no of samples in each group
H0: µ1=µ2=µ3 ... µk i.e. all pop means are equal
Ha: at least one µi is different i.e. at least one of the k pop means is not equal to the others
µi is the pop mean of group i
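The sheet says ANOVA compares variance between and within groups and uses an F test, but it does not spell out the statistic. The sketch below assumes the usual one-way definition, F = (SS between / (k - 1)) / (SS within / (N - k)); the function name `one_way_anova_f` and the scores are hypothetical.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of scores around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scores for three levels of one IV, equal group sizes.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

For these numbers SS between = 26 and SS within = 6, giving F(2, 6) = 13. The F value would then be compared with an F table at the chosen alpha level.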

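The two-sample z formula with a hypothesized difference, z = ((x̄₁ - x̄₂) - (µ₁ - µ₂)) / √(σ₁²/n₁ + σ₂²/n₂), can be sketched as follows. The two-tailed p-value uses the standard normal CDF from `statistics.NormalDist`; the IQ-style summary numbers and the helper name `two_sample_z` are assumptions for illustration.

```python
import math
from statistics import NormalDist

def two_sample_z(mean1, mean2, sd1, sd2, n1, n2, hypothesized_diff=0.0):
    """Return (z, two-tailed p) for two independent samples with known SDs."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # standard error of the difference
    z = ((mean1 - mean2) - hypothesized_diff) / se
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed

# Hypothetical summary data: two groups of 50, both with SD 15.
z, p = two_sample_z(mean1=105.0, mean2=100.0, sd1=15.0, sd2=15.0, n1=50, n2=50)
print(f"z = {z:.3f}, p = {p:.4f}")
```

With these inputs the standard error is 3 and z = 5/3 ≈ 1.667, which falls short of the two-tailed critical value of 1.96 at α = 0.05, so H0 would not be rejected.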

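For the point-biserial formula, r_pb = (M1 - M0)/Sn × √(pq), a small sketch follows. It assumes Sn is the SD of all scores computed with n in the denominator (the sheet only says "Sn = SD") and that the binary variable is coded 0/1; `point_biserial` is a hypothetical helper name.

```python
import math
import statistics

def point_biserial(scores, labels):
    """Point-biserial r for continuous scores and 0/1 group labels."""
    group1 = [x for x, g in zip(scores, labels) if g == 1]
    group0 = [x for x, g in zip(scores, labels) if g == 0]
    p = len(group1) / len(scores)      # proportion of cases in group 1
    q = 1 - p                          # proportion of cases in group 0
    s_n = statistics.pstdev(scores)    # assumption: SD with n in the denominator
    return (statistics.mean(group1) - statistics.mean(group0)) / s_n * math.sqrt(p * q)

# Hypothetical data: four scores, two cases in each group.
r_pb = point_biserial([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1])
print(f"r_pb = {r_pb:.4f}")
```

Under this choice of Sn the result equals the ordinary Pearson r between the scores and the 0/1 labels, which is 2/√5 ≈ 0.8944 for this data.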
Spearman Correlation

non-parametric version of Pearson correlation coefficient
named after Charles Spearman
denoted by ρ (rho)
determine the strength and direction of the monotonic relationship between two variables measured at ordinal, interval or ratio levels & whether they are correlated or not
monotonic function = one variable never increases or never decreases as its IV changes
- monotonically increasing = as X increases, Y never decreases
- monotonically decreasing = as X increases, Y never increases
- not monotonic = as X increases, Y sometimes decreases and sometimes increases
for analysis with: ordinal data, continuous data
uses ranks instead of assumptions of normality
aka Spearman Rank order test
$\rho = 1 - \dfrac{6\sum d_i^2}{n(n^2 - 1)}$
di = difference between two ranks of each observation
n = number of observations
-1 to +1
+1 = perfect association of ranks
0 = no association
-1 = perfect negative association of ranks
closer the value to 0, weaker the association
Value Ranges
0 to 0.3 = weak monotonic relationship
0.4 to 0.6 = moderate strength monotonic relationship
0.7 to 1 = strong monotonic relationship

Parametric and Non-parametric test

Fixed set of parameters, certain assumptions about the distribution
PARAMETRIC - prior knowledge of pop distribution i.e. NORMAL DISTRIBUTION
NON-PARAMETRIC - no assumptions, do not depend on population, DISTRIBUTION FREE tests, values found on nominal or ordinal level
easy to apply, understand, low complexity
decision based on - distribution of population, size of sample
parametric - mean & <30 sample
non-parametric - median/mode & >30 sample or regardless of size

Advantages & Disadvantages - NON-PARAMETRIC TESTS

ADVANTAGES | DISADVANTAGES
simple, easy to understand | less powerful than parametrics
no assumptions | counterpart parametric, if exists, is more powerful
more versatile | not as efficient as parametric tests
easier to calculate | may waste information
hypothesis tested may be more accurate | requires larger sample to be as powerful as parametric test
small sample sizes are okay | difficult to compute large samples by hand
can be used for all types of data (nominal, ordinal, interval) | tabular format of data required that may not be readily available
can be used with data having outliers |

Application

PARAMETRIC TESTS | NON-PARAMETRIC TESTS
- quantitative & continuous data | - mixed data
- normally distributed | - unknown distribution
- data is estimated on ratio or interval scales | - different kinds of measurement scales

degrees of freedom

independent values in the data sample that have freedom to vary
no of values in a data set minus 1
df = N-1

t-test

statistical test to determine if there is a significant difference between avg scores of two groups
1908 - William Sealy Gosset - Student's t-test and t-distribution
for hypothesis testing
knowledge of:
distribution - normally distributed
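The Spearman formula from this page, ρ = 1 - 6Σdᵢ² / (n(n² - 1)), can be tried out with a short sketch. It assumes there are no tied values (the shortcut formula is only exact in that case); `ranks` and `spearman_rho` are hypothetical helper names and the data are made up.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman's rho via 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

x = [10, 20, 30, 40, 50]
print(spearman_rho(x, [1, 4, 9, 16, 25]))     # monotonically increasing
print(spearman_rho(x, [25, 16, 9, 4, 1]))     # monotonically decreasing
print(spearman_rho(x, [12, 15, 11, 20, 18]))  # not monotonic
```

The first two calls give +1 and -1 even though the relationship is not linear, because only the ranks matter. The third gives 0.6 (Σdᵢ² = 8, n = 5).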


t-test (cont)

no knowledge of SD
TYPES:
one-sample t-test - single group
$t = \dfrac{m - \mu}{s/\sqrt{n}}$
m = sample mean, µ = population mean
SD FORMULA:
$\sigma = \sqrt{\dfrac{\sum (X - \mu)^2}{N}}$
$s = \sqrt{\dfrac{\sum (X - m)^2}{n - 1}}$
independent two-sample t-test - two groups
paired/dependent samples t-test - sig diff in paired measurements, compares means from same group at diff times (test-retest sample)
H0: no effective difference = measured diff is due to chance
Ha: two-tailed/one-tailed - nonequivalent means / smaller or larger than hypothesized mean
PERFORM two-tailed test: to find out difference between two populations
one-tailed: one pop mean is > or < other

Independent two-sample t-test

aka unpaired t-test
to compare mean of two independent groups
ex. avg weight of males and females
two forms:
- Student's t-test : assumes SD is equal
- Welch's t-test : less restrictive, no assumption of equal SD
both provide more or less similar results
ASSUMPTIONS:
- normally distributed
- SD is same
- independent groups
- randomly selected
- independent observations
- measured on interval or ratio scale
FORMULA:
$t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$
df = n1 + n2 - 2
$S = \sqrt{\dfrac{\sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2}{n_1 + n_2 - 2}}$
S = pooled SD, used in place of both s₁ and s₂ in Student's t-test

One-sample z-test

to check if there is a difference between sample mean & population mean when SD is known
FORMULA:
$z = \dfrac{\bar{x} - \mu}{SE}$
$SE = \sigma/\sqrt{n}$
z score is compared to a z table (includes % under NPC, the normal probability curve, between mean and z score), tells us whether the z score is due to chance or not
conditions:
knowledge of:
- pop mean
- SD
- simple random sample
- normal distribution
two approaches to reject H0:
- p-value approach - p-value is the smallest level of significance at which H0 can be rejected; smaller p-value, stronger evidence
- critical value approach - comparing z stat to critical values, which indicate boundary regions where stat is highly improbable to lie = critical regions/rejection regions
if z stat is in critical region -> reject H0
based on:
significance level (0.1, 0.05, 0.01), alpha level, Ha

Biserial correlation

to measure relationship between quantitative variables and binary variables
given by Pearson - 1909
biserial correlation coeff varies between -1 and 1
0 = no association
ex. IQ scores and pass/fail correlation
continuous variable and binary variable (dichotomised to create binary variable)
rbis or rb = correlation index estimating strength of relationship between artificially dichotomous variable and a true continuous variable
ASSUMPTIONS:
- data measured on continuous scale
- one variable to be made dichotomous
- no outliers
- approx normally distributed
- equal variances (SD)
$r_b = \dfrac{M_1 - M_0}{SD_t} \cdot \dfrac{pq}{y}$
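To make the two forms of the independent two-sample t-test concrete, the sketch below computes Student's t with the pooled SD (df = n1 + n2 - 2) and Welch's t with unpooled variances. The Welch-Satterthwaite degrees of freedom are not given on this sheet and are included here as an assumption about the usual approach; function names and data are hypothetical.

```python
import math
import statistics

def student_t(x1, x2):
    """Student's t-test statistic with pooled SD; returns (t, df)."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = statistics.mean(x1), statistics.mean(x2)
    ss1 = sum((x - m1) ** 2 for x in x1)
    ss2 = sum((x - m2) ** 2 for x in x2)
    pooled_sd = math.sqrt((ss1 + ss2) / (n1 + n2 - 2))
    t = (m1 - m2) / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def welch_t(x1, x2):
    """Welch's t-test statistic; df from the Welch-Satterthwaite approximation."""
    n1, n2 = len(x1), len(x2)
    v1, v2 = statistics.variance(x1), statistics.variance(x2)
    se_squared = v1 / n1 + v2 / n2
    t = (statistics.mean(x1) - statistics.mean(x2)) / math.sqrt(se_squared)
    df = se_squared ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

# Hypothetical scores for two independent groups.
group_a = [5.0, 7.0, 9.0]
group_b = [1.0, 2.0, 3.0]
t_s, df_s = student_t(group_a, group_b)
t_w, df_w = welch_t(group_a, group_b)
print(f"Student: t = {t_s:.3f}, df = {df_s}")
print(f"Welch:   t = {t_w:.3f}, df = {df_w:.3f}")
```

With equal group sizes the two t values coincide (√15 ≈ 3.873 here); the forms differ in their degrees of freedom (4 versus 50/17 ≈ 2.94), which is where unequal SDs make Welch's test more conservative.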

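The two approaches to rejecting H0 in a one-sample z-test (p-value and critical value) can be sketched side by side. The example assumes a two-tailed test and uses `statistics.NormalDist` in place of a printed z table; `one_sample_z` and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def one_sample_z(sample_mean, pop_mean, pop_sd, n, alpha=0.05):
    """Two-tailed one-sample z-test; returns (z, p_value, z_critical, reject_h0)."""
    se = pop_sd / math.sqrt(n)                         # SE = sigma / sqrt(n)
    z = (sample_mean - pop_mean) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))       # p-value approach
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)   # critical value approach
    reject_h0 = abs(z) > z_critical
    return z, p_value, z_critical, reject_h0

# Hypothetical example: sample of 100 with mean 103, population mean 100, SD 15.
z, p_value, z_critical, reject_h0 = one_sample_z(103.0, 100.0, 15.0, 100)
print(f"z = {z:.2f}, p = {p_value:.4f}, critical = {z_critical:.3f}, reject H0: {reject_h0}")
```

Both approaches agree by construction: z = 2.0 lies beyond the critical value of about 1.96, and the p-value of about 0.0455 is below α = 0.05.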

Biserial correlation (cont)

M1 = mean of grp 1
M0 = mean of grp 2
p = ratio of grp 1
q = ratio of grp 2
SDt = total SD
y = ordinate of the normal curve at the point dividing p and q

Pearson Correlation

measures strength and direction of a linear relationship between two variables
how two data sets are correlated
gives us info about the direction of the slope of the line
aka:
- Pearson's r
- bivariate correlation
- Pearson product-moment correlation coefficient (PPMCC)
cannot determine dependence of variables & cannot assess nonlinear associations
r value variation:
-0.1 to -0.3 / 0.1 to 0.3 = weak correlation
-0.3 to -0.5 / 0.3 to 0.5 = average/moderate correlation
-0.5 to -1.0 / 0.5 to 1.0 = strong correlation
$r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2]\,[n\sum y^2 - (\sum y)^2]}}$

One-way ANOVA test

[three figure panels; images not recoverable]

Mann-Whitney U test

non-parametric test to test the significance of difference between two independently drawn groups OR compare outcomes between two independent groups
non-parametric equivalent of the unpaired t test
No NPC assumption, small sample size >30 with min 5 in each group, continuous data (able to take any no in range), randomly selected samples
aka:
- Mann-Whitney Test
- Wilcoxon Rank Sum test
H0: the two pop are equal
Ha: the two pop are not equal
denoted by U
$U_1 = n_1 n_2 + \dfrac{n_1(n_1 + 1)}{2} - R_1$
$U_2 = n_1 n_2 + \dfrac{n_2(n_2 + 1)}{2} - R_2$
R = sum of ranks of group
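The Mann-Whitney formulas, U1 = n1n2 + n1(n1+1)/2 - R1 and U2 = n1n2 + n2(n2+1)/2 - R2, depend on ranking the pooled data. The sketch below assumes tied values share the mean of their ranks, a common convention that this sheet does not state; the helper names and data are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(group1, group2):
    """Return (U1, U2) computed from the rank sums R1 and R2 of the pooled data."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))
    r1 = sum(ranks[:n1])
    r2 = sum(ranks[n1:])
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u1, u2

# Hypothetical data: every value in the first group is below every value in the second.
u1, u2 = mann_whitney_u([1, 2, 3], [4, 5, 6])
print(u1, u2)
```

Here R1 = 6 and R2 = 15, so U1 = 9 and U2 = 0, and U1 + U2 always equals n1 × n2. The smaller of the two values is the one usually compared with the critical value in a U table.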

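The computational Pearson formula on this page, r = (nΣxy - ΣxΣy) / √([nΣx² - (Σx)²][nΣy² - (Σy)²]), translates almost directly into code. This is a minimal sketch with made-up data; `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(x, y):
    """Pearson's r from raw sums, following the computational formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(f"r = {r:.4f}")
```

For this data the numerator is 30 and the denominator is √1500, so r ≈ 0.7746, which the ranges above would class as a strong positive correlation.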
By SH (Sana_H). Published 17th January, 2023. Last updated 16th January, 2023.

