
Normal Distribution and Statistical Hypothesis

1. The normal distribution is a bell-shaped symmetric distribution that is completely determined by its mean and standard deviation.
2. Key characteristics of a normal distribution: the mean, median, and mode are equal, it is symmetric around the mean, and the total area under the curve is 1.
3. The normal distribution is important in statistics and epidemiology because many biological phenomena can be explained using it. It also plays a role in statistical inference and hypothesis testing due to properties like the central limit theorem.

Uploaded by

jeling.zabala

BIOSTATISTICS AND EPIDEMIOLOGY

- Normal distribution is a distribution that is symmetric about the mean.
- One of the earliest distributions in statistics to be well studied
- The first equation was derived by the French-born mathematician Abraham de Moivre, who worked in England
- It became famous through the work of Carl Friedrich Gauss, hence the name Gaussian curve
- skewness – imbalance in the distribution of the data relative to the mean

➢ Negatively skewed
  o mean is less than the median, and the median is less than the mode
  o many outliers on the lower end (long left tail)
➢ Normal
  o mean is equal to the median, and the median is equal to the mode
  o also called the Gaussian curve, normal distribution curve, or probability distribution curve
  o the graphical representation of the probability that the collected data distribute themselves normally
  o the spread of the data follows a consistent pattern on a bell-shaped curve
➢ Positively skewed
  o mode is less than the median, and the median is less than the mean
  o many outliers on the higher end (long right tail)

- kurtosis – measures the peakedness or flatness of the data set

  o Leptokurtic distribution
    ▪ peakedness
    ▪ positive excess kurtosis (kurtosis greater than 3)
    ▪ higher peak and longer tails
  o Normal distribution
    ▪ mesokurtic
    ▪ should have a kurtosis value of 3 (excess kurtosis of 0)
  o Platykurtic distribution
    ▪ flatness
    ▪ negative excess kurtosis (kurtosis less than 3)
    ▪ lower peak and shorter tails

Characteristics of a Normal Distribution

1. bell shaped and symmetric about the mean
2. mean = median = mode
3. total area under the curve is 1 or 100%
4. has long tapering tails that extend infinitely in either direction, approaching but never touching the x-axis (the x-axis is an asymptote)
5. completely determined by two parameters, its mean (μ) and standard deviation (σ)
   o mean: location
   o standard deviation: spread
   o as σ increases, the distribution becomes wider
   o as σ decreases, the distribution becomes thinner
6. Areas:
   o mean ± 1 SD covers about 68% of the distribution
   o mean ± 2 SD covers about 95% of the distribution
   o mean ± 3 SD covers about 99.7% of the distribution

Importance of the Normal Distribution
• useful for explaining many biological phenomena
  o even if the distribution of the variable is not normal, it can often be transformed using a log, square root, or other transformation to make it approximate the normal distribution
• plays an important role in statistical inference because
  o other important distributions (binomial, t-distribution) can be approximated by the normal, especially when the sample size is large enough
  o the sampling distribution of the mean is approximately normal if the sample size is large enough (central limit theorem)
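
The 68%, 95%, and 99.7% areas and the central limit theorem mentioned in these notes can be checked numerically. The sketch below is only an illustration using Python's standard library. The function names `area_within` and `clt_coverage`, the exponential population, the sample size of 50, and the seed are all assumptions chosen for the demonstration and are not from the notes.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def area_within(k: float) -> float:
    """Area under the standard normal curve between -k and +k SD."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(k) - std_normal.cdf(-k)


def clt_coverage(n: int = 50, reps: int = 2000, seed: int = 0) -> float:
    """Share of sample means lying within 1.96 standard errors of the true mean.

    Samples are drawn from an exponential distribution (mean 1, SD 1), which
    is strongly right-skewed. Coverage near 95% suggests the sampling
    distribution of the mean is already close to normal at this sample size.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    standard_error = 1.0 / sqrt(n)
    hits = 0
    for _ in range(reps):
        sample_mean = fmean(rng.expovariate(1.0) for _ in range(n))
        if abs(sample_mean - 1.0) <= 1.96 * standard_error:
            hits += 1
    return hits / reps


areas = {k: area_within(k) for k in (1, 2, 3)}
for k, area in areas.items():
    print(f"within {k} SD: {area:.4f}")
print(f"share of skewed-sample means within 1.96 SE: {clt_coverage():.3f}")
```

The three areas come out to roughly 0.6827, 0.9545, and 0.9973, so the 68-95-99.7 figures are rounded values. An area of exactly 95% corresponds to about 1.96 SD rather than 2 SD.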

THE STANDARD NORMAL DISTRIBUTION

• has mean = 0 and SD = 1
• the capital letter Z is traditionally used to represent the standard normal random variable
• the small letter z is used to represent a particular value of Z
• any value x from a normal distribution can be transformed into a standard normal value z using the formula

$$z = \frac{x - \mu}{\sigma}$$

• areas under the standard normal curve are tabulated

SOLVING THE NORMAL DISTRIBUTION
Steps:
1. compute the equivalent z value of the given x value
2. sketch the graph
3. get the area from the standard normal table

Example: Suppose the height of 5-year-old Filipino boys is normally distributed with a mean of 110 cm and a standard deviation of 5.61 cm. What proportion of these boys are taller than 121 cm? Find P(X > 121).

Compute the equivalent z value of the given x value

$$z = \frac{x - \mu}{\sigma} = \frac{121 - 110}{5.61} \approx 1.96$$

Sketch the graph
[Figure: normal curve marked at z = 1.96]

Get the area from the standard normal table
For z = 1.96, the area to the right is 0.025.

Answer: 2.5% of 5-year-old Filipino boys are taller than 121 cm.

STATISTICAL HYPOTHESIS
- statement about the value of a population parameter (mean, median, mode, variance, SD, proportion, total)
- assertion or proposition about the relationship between 2 or more variables
- formulated as a result of years of observation and research

HYPOTHESIS TESTING
- method of making decisions using data, whether from a controlled experiment or an observational study (not controlled)
- set of procedures that culminates in either rejection or non-rejection of the null hypothesis
- involves the comparison of two hypotheses:
  o null
  o alternative

PROCESS OF HYPOTHESIS TESTING

- set an alpha (cut-off value which tells us whether the p value is low or high)
- the p value is the probability of obtaining a sample result at least as extreme as the one observed, assuming the null hypothesis is true (it is not the probability that the null hypothesis is true)
  o if p ≤ α, reject the null hypothesis
    ▪ the probability of the occurrence of the sample result is low
  o if p > α, do not reject the null hypothesis
    ▪ the probability of the occurrence of the sample result is high
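
The height example (mean 110 cm, SD 5.61 cm, cut-off 121 cm) can be reproduced without a printed table. This is a minimal sketch that assumes Python's standard-library `statistics.NormalDist` stands in for the standard normal table. The variable names are illustrative only.

```python
from statistics import NormalDist

# Values taken from the worked example in the notes.
mean_height_cm = 110.0
sd_height_cm = 5.61
cutoff_cm = 121.0

# Step 1: convert x to a standard normal z value.
z = (cutoff_cm - mean_height_cm) / sd_height_cm

# Step 3: area to the right of z under the standard normal curve.
proportion_taller = 1.0 - NormalDist().cdf(z)

# Cross-check: same area computed directly on the original (unstandardized) scale.
direct = 1.0 - NormalDist(mean_height_cm, sd_height_cm).cdf(cutoff_cm)

print(f"z = {z:.2f}")
print(f"P(X > 121) = {proportion_taller:.4f}")
```

Standardizing first and working on the original scale give the same area, which is why a single table for Z is enough for any normal distribution.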

STEPS OF HYPOTHESIS TESTING

1. State the null (H0) & alternative (H1 or HA) hypotheses
➢ Null hypothesis (H0)
  o hypothesis of no difference
  o statement of equality
  o often used to signify zero treatment effect, the equivalence of population parameters, or equality to a specified value
➢ Alternative hypothesis (HA)
  o usually the research hypothesis (the hypothesis that the investigator believes in)
  o uses the <, > or ≠ sign
  o if the “P” is low, then the H0 must go
➢ Non-directional (two-tailed test)
  o simply states that there is a difference between the groups being compared
  o that the true value of the parameter is not equal to the hypothesized value
  o H0: θ = θ0 vs HA: θ ≠ θ0
➢ Directional (one-tailed test)
  o specifies the direction of disagreement with H0
  o H0: θ = θ0 vs HA: θ > θ0
  o H0: θ = θ0 vs HA: θ < θ0
  o example: The average length of stay of patients in the Pediatrics department is longer than that in the OB Gyne department.

Example:
Suppose a proposed study will compare the performance in anatomy of two groups of students, those using cadavers for demonstration and those using models. If the parameter for evaluating student performance is the proportion who obtain a grade of 2.0 or better, how do we write H0 and HA?
• H0: PM = PC (“no difference”) – The proportion of students who obtain a grade of 2.0 or better among those using models for demonstration, PM, is equal to the proportion among those using cadavers, PC
• HA: PM ≠ PC (“two-tailed test”) – The proportion of students who obtain a grade of 2.0 or better among those using models for demonstration, PM, is not equal to the proportion among those using cadavers, PC
• HA: PM > PC (“one-tailed test”) – The proportion of students who obtain a grade of 2.0 or better among those using models for demonstration, PM, is greater than the proportion among those using cadavers, PC

2. State the level of significance
- Level of significance (α)
  o set in advance by the investigator, before the data collection
  o probability of occurrence
  o the probability level that is considered too low to warrant support of the hypothesis being tested
  o probability of committing a Type I error
    ▪ rejecting the null hypothesis when, in fact, the null hypothesis is true
  o usually set as 0.05, 0.1 or 0.01

Decision Making Errors

Statistical decision based on sample data | H0 is true | H0 is not true
--- | --- | ---
Do NOT reject | Correct decision | β or type II error
Reject | α or type I error | Correct decision

➢ Type I error (α)
  o error of rejecting a true null hypothesis
  o alpha is kept at a low level when it is important not to make the mistake of rejecting a true H0
➢ Type II error (β)
  o error of not rejecting a false null hypothesis
  o beta is kept at a low level when it is important not to retain a false H0
  o for a fixed sample size, it is inversely related to alpha (lowering α raises β)

3. Select the appropriate test statistic
• factors to be considered:
  o objectives of the study
  o design of the study
  o types of variables
  o level of measurement
  o whether the samples are related or independent
  o assumptions of the test

TYPES OF LEVELS OF MEASUREMENT
Tests are grouped by sample/aim, then by level of measurement (Nominal, Ordinal, Interval/Ratio).

I. One sample case
➢ Determine if sample is from a population with pre-specified P or μ
  o Nominal: Binomial test; z-test for 1 proportion
  o Ordinal: Kolmogorov-Smirnov one-sample test
  o Interval/Ratio: z-test for 1 mean; t-test for one mean
➢ Determine if sample is from a population with pre-specified distribution
  o Nominal: Chi-square goodness-of-fit test
➢ Test of randomness
  o Runs test

II. Two sample case (comparison of 2 groups)
➢ Related (matched) samples
  o Nominal: Marginal Chi; McNemar change test
  o Ordinal: Sign test; Wilcoxon signed-ranks test
  o Interval/Ratio: Paired t-test
➢ Independent samples
  o Nominal: Fisher exact test for 2x2 tables; z-test for 2 proportions; Chi-square test for r x 2 table
  o Ordinal: Wilcoxon-Mann-Whitney test; Robust rank-order test; Kolmogorov-Smirnov two-sample test; Siegel-Tukey test for scale differences
  o Interval/Ratio: z-test for 2 means; Independent t-test for 2 means

III. K-sample case (comparison of more than 2 groups)
➢ Related (matched) samples
  o Nominal: Cochran Q-test
  o Ordinal: Friedman two-way ANOVA by ranks; Page test for ordered alternatives
  o Interval/Ratio: Two-way ANOVA (F-test)
➢ Independent samples
  o Nominal: Chi-square test for r x k tables
  o Ordinal: Extension of the median test; Kruskal-Wallis one-way ANOVA; Jonckheere test for ordered alternatives
  o Interval/Ratio: One-way ANOVA (F-test)

IV. Measurement of correlation
  o Nominal: Cramer coefficient; Phi coefficient; the Kappa coefficient of agreement; asymmetric association (the lambda statistic)
  o Ordinal: Spearman rank-order correlation coefficient; Kendall rank-order correlation coefficient
  o Interval/Ratio: Pearson moment correlation coefficient (simple & multiple)
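
For the models-versus-cadavers example (H0: PM = PC vs HA: PM ≠ PC), the notes list a z-test for 2 proportions among the tests for independent samples. The sketch below shows one common form of that test, using a pooled standard error. The student counts are hypothetical numbers invented for illustration, and the function name `two_proportion_z_test` is an assumption rather than anything defined in the notes.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x_m: int, n_m: int, x_c: int, n_c: int) -> tuple[float, float]:
    """Two-tailed z-test of H0: PM = PC using the pooled proportion.

    x_m, x_c: number of students with a grade of 2.0 or better in each group
    n_m, n_c: group sizes
    Returns the z statistic and the two-tailed p-value.
    """
    p_m = x_m / n_m
    p_c = x_c / n_c
    p_pooled = (x_m + x_c) / (n_m + n_c)
    standard_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n_m + 1 / n_c))
    z_stat = (p_m - p_c) / standard_error
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return z_stat, p_two_tailed


alpha = 0.05  # level of significance, fixed before looking at the data

# Hypothetical data: 45 of 60 students using models and 36 of 60 using cadavers.
z_stat, p_val = two_proportion_z_test(x_m=45, n_m=60, x_c=36, n_c=60)
decision = "reject H0" if p_val <= alpha else "do not reject H0"
print(f"z = {z_stat:.3f}, p = {p_val:.4f}, decision: {decision}")
```

With these made-up counts the p-value is above 0.05, so H0 is not rejected at the 5% level. That is a statement of insufficient evidence for a difference, not proof that the two proportions are equal.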

4. Determine the critical region
- “region of rejection”
- set of values of the test statistic which leads to the rejection of the null hypothesis
- the values correspond to α
- usually found at the tail ends of the distribution
- depends on the level of significance set by the investigator
- if the test statistic computed from the sample data falls in this region, then there is a basis for rejecting the null hypothesis
- can be one-tailed or two-tailed based on the alternative hypothesis
- easier to reject the null hypothesis with a one-tailed test when the choice of the direction is correct
- how to determine the critical region:
  o based on the alpha that is set
  o the way that the alternative is stated
- do not reject the null hypothesis if the computed test statistic falls in the non-rejection region

5. Compute the test statistic
- it is important that before this step, the level of significance (α) and the critical region have been set, to avoid manipulating them after the test statistic has been computed in order to obtain a desired outcome
- the computation differs depending on the test statistic used
- basic formula of the test statistic:

$$\text{test statistic} = \frac{\text{observed statistic} - \text{expected parameter under } H_0}{\text{standard error}}$$

6. Make statistical decision
- whether to reject or not to reject the null hypothesis
- always towards the null hypothesis

2 Ways of making statistical decision:

➢ Compare test statistic with the critical region
  o If the computed value falls in the CR, then we reject the null hypothesis; otherwise we cannot reject the null hypothesis
➢ Compare the p-value with α (level of significance)
  o statistical software automatically gives this value
  o p-value
    ▪ the probability of obtaining a result as extreme as or more extreme than the value observed from the sample, given that the null hypothesis is true
    ▪ If p-value ≤ α
      • the sample does not support the null hypothesis, which is thus rejected
      • thus, if p-value ≤ α, reject H0
      • Example: if α = 0.05, then reject H0 if p ≤ 0.05
    ▪ If p > α
      • do not reject H0

7. Draw conclusion
- always towards the alternative hypothesis
- Rejection of the Null Hypothesis (H0) leads to a conclusion stated this way:
  o “There is sufficient evidence to say that (alternative hypothesis)”
- Non-rejection of the Null Hypothesis (H0) leads to a conclusion stated this way:
  o “There is no sufficient evidence to say that (alternative hypothesis)”
- Reject H0
  o conclude HA
  o sample does not come from a population with the same parameter values defined by the null hypothesis
  o sample value cannot support / is not consistent with the null hypothesis
- Do not reject H0
  o conclude that there is no sufficient evidence to say that HA is true
  o H0 is not accepted
- Non-rejection of H0
  o not proof that the null hypothesis is correct
  o factors:
    ▪ inadequate sample size
    ▪ measurement problems
  o rather than accepting H0, it is more accurate to say that the researcher fails to reject H0
  o insufficient proof to conclude HA rather than proof of H0
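
The two ways of making the statistical decision (critical region and p-value) lead to the same answer for a given test statistic. The sketch below illustrates this for a z statistic only. The function names, the `tail` argument values ("two", "right", "left"), and the sample z values are assumptions made for the illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, SD 1


def reject_by_critical_region(z: float, alpha: float, tail: str = "two") -> bool:
    """True if z falls in the critical (rejection) region for the given alpha."""
    if tail == "two":
        return abs(z) >= STD_NORMAL.inv_cdf(1 - alpha / 2)
    if tail == "right":
        return z >= STD_NORMAL.inv_cdf(1 - alpha)
    if tail == "left":
        return z <= STD_NORMAL.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right', or 'left'")


def p_value(z: float, tail: str = "two") -> float:
    """Probability of a result at least as extreme as z, assuming H0 is true."""
    if tail == "two":
        return 2 * (1 - STD_NORMAL.cdf(abs(z)))
    if tail == "right":
        return 1 - STD_NORMAL.cdf(z)
    if tail == "left":
        return STD_NORMAL.cdf(z)
    raise ValueError("tail must be 'two', 'right', or 'left'")


def reject_by_p_value(z: float, alpha: float, tail: str = "two") -> bool:
    """True if the p-value is at or below alpha."""
    return p_value(z, tail) <= alpha


alpha = 0.05
for z_obs, tail in [(2.10, "two"), (1.70, "two"), (1.70, "right"), (-2.50, "left")]:
    by_region = reject_by_critical_region(z_obs, alpha, tail)
    by_p = reject_by_p_value(z_obs, alpha, tail)
    print(f"z = {z_obs:+.2f} ({tail}-tailed): reject H0 = {by_region}, p = {p_value(z_obs, tail):.4f}")
```

The z = 1.70 case is not rejected with a two-tailed test but is rejected with a right-tailed test, which matches the note that rejection is easier with a one-tailed test when the direction is chosen correctly.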
