Statistics Cheat Sheet

the branch of mathematics in which data are used descriptively or inferentially to find or support answers for scientific and other quantifiable questions.
It encompasses various techniques and procedures for recording, organizing, analyzing, and reporting quantitative information.

Paired t-test
to compare means of two related groups
ex. compare weight of 20 mice before and after treatment
two conditions:
- pre and post treatment
- two different conditions, ex. two drugs
ASSUMPTIONS
- random selection
- normally distributed
- no extreme outliers
FORMULA
t = m / (s/√n)
m = sample mean of differences
s = SD of the differences, n = number of pairs
df = n - 1
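Where the FORMULA above might look in code: a minimal sketch assuming NumPy/SciPy; the before/after arrays are invented example measurements, not data from the source.

```python
import numpy as np
from scipy import stats

# hypothetical before/after measurements for paired subjects (made-up data)
before = np.array([20.1, 22.3, 19.8, 21.5, 23.0, 20.7, 22.9, 21.1])
after  = np.array([21.4, 23.1, 20.5, 22.8, 23.9, 21.2, 23.5, 22.0])

d = after - before                 # differences for each pair
n = len(d)
m = d.mean()                       # sample mean of differences
s = d.std(ddof=1)                  # SD of the differences

t = m / (s / np.sqrt(n))           # t = m / (s/√n)
df = n - 1
p = 2 * stats.t.sf(abs(t), df)     # two-tailed p value

# cross-check with SciPy's built-in paired t-test
t_check, p_check = stats.ttest_rel(after, before)
print(t, p, t_check, p_check)
```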
Difference - parametric test & non-parametric test

PROPERTIES                    PARAMETRIC                      NON-PARAMETRIC
assumptions                   yes                             no
value for central tendency    mean                            median/mode
probability distribution      normally distributed            user specific
population knowledge          required                        not required
used for                      interval data                   nominal, ordinal data
correlation                   Pearson                         Spearman
tests                         t-test, z-test, F-test, ANOVA   Kruskal-Wallis H test, Mann-Whitney U, Chi-square
t-distribution
aka Student's t-distribution = a probability distribution similar to the normal distribution but with heavier tails
used to estimate pop parameters for small samples
Tail heaviness is determined by degrees of freedom: gives lower probability to the centre and higher to the tails than the normal distribution; also higher kurtosis; symmetrical, unimodal, centred at 0, larger spread around 0
df = n - 1
above 30 df, use the z-distribution
t-score = number of SD from the mean in a t-distribution
we find:
- upper and lower boundaries
- p value
TO BE USED WHEN:
- small sample
- SD is unknown
ASSUMPTIONS
- continuous or ordinal scale
- random selection
- NPC
- equal SD for independent two-sample t-test
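A minimal sketch, assuming SciPy, of how the t-distribution gives the upper and lower boundaries and the p value mentioned above; the sample values and the hypothesized mean of 5.0 are invented.

```python
import numpy as np
from scipy import stats

x = np.array([4.8, 5.2, 5.5, 4.9, 5.1, 5.4, 5.0])   # made-up small sample
n, df = len(x), len(x) - 1
mean, sd = x.mean(), x.std(ddof=1)

# upper and lower boundaries: 95% confidence interval for the mean
t_crit = stats.t.ppf(0.975, df)                      # two-tailed critical t value
half_width = t_crit * sd / np.sqrt(n)
lower, upper = mean - half_width, mean + half_width

# p value for a one-sample t-test against a hypothesized mean of 5.0
t_stat = (mean - 5.0) / (sd / np.sqrt(n))
p_value = 2 * stats.t.sf(abs(t_stat), df)
print(lower, upper, p_value)
```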
Correlation Coefficient
a statistical measure of the strength of the relationship between the relative movements of two variables
value ranges from -1 to +1
-1 = perfect negative or inverse correlation
+1 = perfect positive correlation or direct relationship
0 = no linear relationship
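A brief sketch, assuming NumPy/SciPy, of computing a (Pearson) correlation coefficient in the -1 to +1 range described above; the two variables are invented example data.

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.1, 2.9, 4.2, 5.1, 6.0])      # made-up variable 1
y = np.array([2.0, 4.3, 5.9, 8.1, 9.8, 12.2])     # made-up variable 2

r, p = stats.pearsonr(x, y)         # r lies in [-1, +1]; p tests H0: no linear relationship
r_matrix = np.corrcoef(x, y)[0, 1]  # same coefficient via the correlation matrix
print(r, p, r_matrix)
```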
Alternatives

PARAMETRIC                              NON-PARAMETRIC
one-sample z-test, one-sample t-test    one-sample sign test
one-sample z-test, one-sample t-test    one-sample Wilcoxon signed-rank test
two-way ANOVA                           Friedman test
one-way ANOVA                           Kruskal-Wallis test
independent-sample t-test               Mann-Whitney U test
one-way ANOVA                           Mood's median test
Pearson correlation                     Spearman correlation
z-test
for hypothesis testing
to check whether means of two populations are equal to each other when pop variance is known
we have knowledge of:
- SD/population variance and/or sample n=30 or more
if both unknown -> t-test
REJECT NULL HYPOTHESIS IF Z STATISTIC IS STATISTICALLY SIGNIFICANT WHEN COMPARED WITH CRITICAL VALUE
z-statistic / z-score = number representing the result of a z-test
z critical value divides the graph into acceptance and rejection regions
if the z statistic falls in the rejection region -> H0 can be rejected
TYPES
One-sample z-test
Two-sample z-test
each can be left-tailed, right-tailed, or two-tailed
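A compact sketch, assuming NumPy/SciPy, of a one-sample z-test and the critical-value comparison just described; the sample, hypothesized mean, and population SD are all assumed values.

```python
import numpy as np
from scipy import stats

# assumed setup: sample of n >= 30 with known population SD
rng = np.random.default_rng(42)
sample = rng.normal(loc=103, scale=15, size=36)
mu0, sigma = 100, 15                      # hypothesized pop mean, known pop SD

z_stat = (sample.mean() - mu0) / (sigma / np.sqrt(len(sample)))

alpha = 0.05
z_crit = stats.norm.ppf(1 - alpha / 2)    # two-tailed critical value (about 1.96)

# H0 is rejected if the z statistic falls in the rejection region
reject_h0 = abs(z_stat) > z_crit
p_value = 2 * stats.norm.sf(abs(z_stat))
print(z_stat, z_crit, reject_h0, p_value)
```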
Point Biserial correlation
measures the relationship between two variables:
- one continuous variable (ratio/interval scale)
- one naturally binary variable
rpbi = correlation coefficient
FORMULA:
rpb = ((M1 - M0) / Sn) * √(pq)
M1, M0 = means of the continuous variable in the two groups of the binary variable
Sn = SD (of the continuous variable)
p, q = proportions of cases in the two groups
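A short sketch, assuming SciPy, comparing the formula above with scipy.stats.pointbiserialr; the score and group arrays are invented example data.

```python
import numpy as np
from scipy import stats

# made-up data: a continuous score and a naturally binary variable (0/1)
score = np.array([12.0, 15.5, 11.2, 18.3, 14.1, 19.0, 13.4, 17.2])
group = np.array([0, 1, 0, 1, 0, 1, 0, 1])

# rpb = ((M1 - M0) / Sn) * √(pq)
m1 = score[group == 1].mean()
m0 = score[group == 0].mean()
sn = score.std(ddof=0)             # SD of the whole continuous variable
p = (group == 1).mean()
q = 1 - p
rpb_manual = ((m1 - m0) / sn) * np.sqrt(p * q)

# SciPy equivalent
rpb, pval = stats.pointbiserialr(group, score)
print(rpb_manual, rpb, pval)
```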
Two-sample z-test
to determine if means of two independent populations are equal or different
to find out if there is a significant difference between two populations by comparing sample means
knowledge of:
- SD and sample >30 in each group
eg. compare performance of 2 students, average salaries, employee performance, compare IQ, etc
FORMULA:
z = (x̄₁ - x̄₂) / √(s₁²/n₁ + s₂²/n₂)
s = SD
or, with a hypothesized difference between population means:
z = ((x̄₁ - x̄₂) - (µ₁ - µ₂)) / √(σ₁²/n₁ + σ₂²/n₂)
(µ₁ - µ₂) = hypothesized difference between pop means
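A minimal sketch of the two-sample z formula above, using NumPy and the normal distribution from SciPy; the two groups are fabricated illustrative samples.

```python
import numpy as np
from scipy import stats

# made-up samples from two independent groups (n > 30 in each group)
g1 = np.random.default_rng(0).normal(loc=52, scale=8, size=40)
g2 = np.random.default_rng(1).normal(loc=48, scale=9, size=45)

m1, m2 = g1.mean(), g2.mean()
s1, s2 = g1.std(ddof=1), g2.std(ddof=1)
n1, n2 = len(g1), len(g2)

# z = (x̄₁ - x̄₂) / √(s₁²/n₁ + s₂²/n₂), hypothesized difference = 0
z = (m1 - m2) / np.sqrt(s1**2 / n1 + s2**2 / n2)
p = 2 * stats.norm.sf(abs(z))      # two-tailed p value
print(z, p)
```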
ANOVA
Analysis of Variance
comparing several sets of scores
to test if means of 3 or more groups are equal
comparison of variance between and within groups
to check if sample groups are affected by the same factors and to the same degree
compare differences in means and variance of distribution

ONE-WAY ANOVA
"one-way" = number of IVs: a single IV with different (2 or more) levels/variations has a measurable effect on the DV
compare means of 2 or more independent groups
aka:
- one-factor ANOVA
- one-way analysis of variance
- between subjects ANOVA
Assumptions
- independent samples
- equal sample sizes in groups/levels
- normally distributed
- equal variance
F test is used to check statistical significance
higher F value -> higher likelihood that the difference observed is real and not due to chance
used in field studies, experiments, quasi-experiments
CONDITIONS:
- min 6 subjects
- same no of samples in each group
H0: µ1 = µ2 = µ3 ... = µk, i.e. all pop means are equal
Ha: at least one µi is different, i.e. at least one of the k pop means is not equal to the others
µi is the pop mean of group i
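A short sketch, assuming SciPy, of a one-way ANOVA with the F test described above; the three group arrays are invented example scores (6 subjects each, equal sample sizes).

```python
import numpy as np
from scipy import stats

# made-up scores for three independent groups (equal sample sizes, min 6 subjects)
g1 = np.array([23, 25, 21, 27, 24, 26])
g2 = np.array([30, 28, 32, 29, 31, 27])
g3 = np.array([22, 20, 24, 23, 21, 25])

# H0: µ1 = µ2 = µ3; a larger F (smaller p) suggests at least one group mean differs
f_stat, p_value = stats.f_oneway(g1, g2, g3)
print(f_stat, p_value)
```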
Pearson Correlation
Mann-Whitney U test