5 & 6 - BIOSTATISTICS V & VI Inferential Statistics I & II


BIOSTATISTICS

Inferential Statistics
DR. RAIMA ASIF
Let's recap
Identify the graph
Is the distribution left skewed or right?
Learning outcomes
Key elements of inferential statistics
Concept of generalizing sample result to population
Concept of SE and CI
What increases or decreases SE and CI
Steps of hypothesis testing
What’s this all about?

It's all about inference


Inferential Statistics
Bread & butter of statistics
Quantitative, analytical studies
Inferential statistics are used to draw conclusions about a
population by examining the sample

POPULATION

Sample
Inferential Statistics
Accuracy of inference depends on
representativeness of sample from population
Inference is based on the principle of probability
It helps researchers
◦ test hypotheses and answer research questions
◦ derive meaning from the results: do they hold true for the
population or not
Inferential Statistics…
Inferential comes from the word infer. To infer is to conclude
or judge from premises or evidence and not to prove.
Inferential statistics frequently involves
Estimation (i.e., guessing the characteristics of a population from a
sample of the population)

Hypothesis testing (i.e., finding evidence for or against an explanation
or theory).
Inferential Statistics…
Making comparison
Hypothesis testing
Evaluation of the role of chance
Relationship…correlation
Predictions…regression
Inference of results as….
Statistically significant or not

Significant difference exists or not

Clinically significant
Generalization of result of a sample over the whole
population

Once we calculate the mean of a sample, we are interested in
knowing the mean of the population as well.
To estimate the population mean from the sample mean we have to
calculate
◦ Standard error
◦ Confidence interval.

Standard error (SE)
It is a measure of the extent to which the sample mean deviates from the population mean.
It is the standard deviation of the sampling distribution of means
Also called the standard error of the mean (SEM)
Standard Error = S/√n
where S = standard deviation and n = sample size
Standard Error depends upon:
1. Standard deviation
2. Sample size
If the standard deviation is larger, the SE is also larger
If the sample size is larger, the SE is smaller
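The dependence of the SE on the standard deviation and the sample size can be checked numerically. The sketch below simply evaluates SE = S/√n; the function name and the numbers are hypothetical and chosen only for illustration.

```python
import math

def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: SE = S / sqrt(n)."""
    return sd / math.sqrt(n)

# Hypothetical values chosen only to show the two dependencies.
print(standard_error(sd=10, n=25))   # baseline: 10 / 5 = 2.0
print(standard_error(sd=20, n=25))   # larger SD -> larger SE (4.0)
print(standard_error(sd=10, n=100))  # larger n  -> smaller SE (1.0)
```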
Difference between SD & SE
SD quantifies the variation within a sample
SE quantifies the variation in the means
from multiple sets of measurements/samples
Or variation of sample means
Or sampling distribution of sample means
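A small simulation may make the SD versus SE distinction concrete. This is only a sketch: the population (mean 100, SD 15), the sample size and the number of repeated samples are all assumptions made for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 100.0, 15.0   # hypothetical population
N, N_SAMPLES = 25, 2000          # sample size and number of repeated samples

samples = [[random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
           for _ in range(N_SAMPLES)]

# SD: spread of individual values within one sample.
sd_within_one_sample = statistics.stdev(samples[0])

# SE: spread of the sample means across many samples.
sample_means = [statistics.mean(s) for s in samples]
sd_of_sample_means = statistics.stdev(sample_means)

theoretical_se = POP_SD / N ** 0.5   # 15 / 5 = 3.0

print(sd_within_one_sample, sd_of_sample_means, theoretical_se)
```

The SD within a single sample stays close to the population SD, while the SD of the sample means comes out close to S/√n.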
Summary
Descriptive
Inferential
◦ Estimation
  ◦ Point estimate
  ◦ Interval estimate
◦ Hypothesis testing
Point estimate
A point estimate is a specific numerical value
estimate of a parameter.
The best point estimate of the population mean
is the sample mean
Point estimate
Suppose a college president wishes to estimate the
average age of students attending classes this
semester. The president could select a random
sample of 100 students and find the average age of
these students, say, 22.3 years (the mean).
This type of estimate is called a point estimate.
Interval estimate
The sample mean usually differs from the population mean
The point estimate fails to indicate how close the estimate is
to population parameter.
This flaw can be remedied by use of a confidence interval
estimate
A range of values for which we have a specified degree of
assurance that it captures the value of the parameter.
Interval estimate…
Either the interval contains the parameter or it does
not.
A degree of confidence (usually a percent) can be
assigned before an interval estimate is made.
90%, 95% , 99%
Estimating Confidence Intervals
Confidence Interval - Definition
A range of values for a variable constructed so that this
range has a specified probability of including the true value
of the variable

A measure of the study’s precision

Figure: lower limit, point estimate, upper limit
How confident are you?
The range in which we are fairly certain that the
population mean lies.
Formula:
x̄ = sample mean, SE = standard error
CI 68% = x̄ ± 1 SE
CI 95% = x̄ ± 2 SE
CI 99.7% = x̄ ± 3 SE
In general, CI = mean ± z (SE)
At 95%, z = 1.96; for calculations use 2
At 99%, z = 2.58; for calculations use 2.5
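The z multipliers quoted above can be recovered from the standard normal distribution. The following is a minimal sketch using Python's standard library; `z_for` and `confidence_interval` are hypothetical helper names, and the normal approximation is assumed throughout.

```python
import math
from statistics import NormalDist

def z_for(confidence: float) -> float:
    """Two-sided z multiplier for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def confidence_interval(mean, sd, n, confidence=0.95):
    """Return (lower, upper) limits of mean ± z * SE."""
    se = sd / math.sqrt(n)
    z = z_for(confidence)
    return mean - z * se, mean + z * se

print(round(z_for(0.95), 2))  # 1.96
print(round(z_for(0.99), 2))  # 2.58
```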
Knowledge check!
Mean and SD for height of 50 boys were 150 cm and 7 cm
respectively. Could this sample be from the universe with a population
mean of 154cm?
For this you need to calculate the SE and the 95% CI
SE = 7/√50 ≈ 0.99
CI = x ± 2 SE
CI = 150 ± 2 (0.99)
CI = 148.02 to 151.98
154 lies outside this interval, so the sample is unlikely to have been drawn from a population with a mean of 154.
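The same calculation can be reproduced in a few lines. This sketch uses the values from the knowledge check and the rule-of-thumb multiplier of 2; intermediate rounding of the SE shifts the limits by a few hundredths, which does not change the conclusion.

```python
import math

mean, sd, n = 150.0, 7.0, 50          # values from the knowledge check
claimed_population_mean = 154.0

se = sd / math.sqrt(n)                # about 0.99
lower = mean - 2 * se                 # rule of thumb: 2 x SE for 95%
upper = mean + 2 * se

print(round(se, 2), round(lower, 2), round(upper, 2))
print(lower <= claimed_population_mean <= upper)   # False: 154 lies outside
```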
Imp terms
Confidence Interval:
Range of values for a point estimate that has a specified
probability of including the true value of the parameter.
Confidence Level:
usually expressed as a percentage (e.g. 95%)
Confidence Limits:
The upper and lower end points of the confidence interval.

How wide should the CI be?
CI = mean ± z (S/√n)
S and n can be varied
Narrow confidence intervals are of the greatest
value in making estimates, because they allow us
to estimate an unknown parameter with little room
for error.
Width of CI reflects precision
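To see how the width of the interval responds to the sample size, one can evaluate the full width 2 × z × S/√n for a few values of n. The numbers below (S = 10, z = 1.96) are hypothetical and serve only as an illustration.

```python
import math

def ci_width(sd: float, n: int, z: float = 1.96) -> float:
    """Full width of the interval mean ± z * (S / sqrt(n))."""
    return 2 * z * sd / math.sqrt(n)

# Hypothetical sample sizes with S held fixed at 10.
for n in (25, 100, 400):
    print(n, round(ci_width(sd=10, n=n), 2))
```

Each fourfold increase in n halves the width, so the estimate becomes more precise.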
Hypothesis testing
A systematic way to test claims or ideas about a
group or population.
Hypothesis testing…
In medical research, through test of significance :
Compare sample mean with population mean
Estimate unknown population parameter
Comparing two or more statements to reach a conclusion
General physician making a diagnosis & statisticians testing a
hypothesis
“The procedure which leads to acceptance or rejection of a
specific statement about population parameters” is called
hypothesis testing
Hypothesis
Testable theory
Assumption
Educated guess
Statement of belief
Types of statistical hypotheses
Alternative/research hypothesis (HA)…
◦ An effect (that you predict)
Null hypothesis (HO) …
◦ No difference
◦ Observed difference is due to chance alone
Hypotheses
Hypothesis: there is a relationship between age and hypertension
Ho: there is no relationship
HA: there is a relationship
This is a two-tailed hypothesis, as no direction is predicted
When to formulate hypothesis?
Prior to conducting the research.

Must be mentioned in
Protocol/synopsis/proposal/research plan.
examples
A company claims that the average time to make a
ready-mix pie is 5 min…
Claim….
Test it…
Ho: µ = 5 min
Ha: µ ≠ 5 min (two-tailed), or µ > 5 min or µ < 5 min (one-tailed)
Types of alternative hypothesis
1. One-tailed (directional or one-sided)
Group A is either greater than or less than
group B
2. Two-tailed (non-directional or two-sided)
The two groups are not the same
Example
A researcher thinks that if expectant mothers
use vitamin pills, the birth weight of the babies
will increase. The average birth weight of the
population is 8.6 pounds.
Ho: wt = 8.6
H1: wt > 8.6
Directional or one-sided
Example
Doctors believe that the average teen sleeps no
longer than 10 hours per day…
A researcher believes that teens on average sleep
longer.
Ho: µ = 10 hours
Ha: µ > 10 hours
Steps of hypothesis testing
1. Statistical hypothesis
◦ Null hypothesis
◦ Alternate hypothesis
2. Decision Rule or level of significance
3. Apply test of statistical significance
4. Comparison of p-value with Alpha
5. Decision
13 May 2024 36
Steps of Hypothesis testing…
1. State the null hypothesis, Ho
2. State the alternative hypothesis, H1 or HA
3. Establish a level of significance or Alpha = 0.05
Steps of Hypothesis testing…
4. Choose a test of significance (based upon data) and
calculate the test statistic…t calc
5. Compare with t crit…..Obtain p value
p < Alpha --> reject Ho
p > Alpha --> fail to reject Ho
Steps of Hypothesis testing…
6. State the conclusion
failed to accept null hypothesis- rejected Ho?
failed to reject null hypothesis- accepted Ho?
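The steps can be walked through in code. The sketch below reuses the height data from the earlier knowledge check (mean 150, SD 7, n = 50, claimed µ = 154) and, as a simplifying assumption, uses a z test with the normal approximation instead of a t test, because the Python standard library has no t distribution. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, sd, n, mu0, alpha=0.05):
    """Two-tailed test of Ho: mu = mu0 using a normal approximation."""
    se = sd / math.sqrt(n)
    z = (sample_mean - mu0) / se                 # step 4: test statistic
    p = 2 * (1 - NormalDist().cdf(abs(z)))       # step 5: two-tailed p value
    decision = "reject Ho" if p < alpha else "fail to reject Ho"  # step 6
    return z, p, decision

z, p, decision = one_sample_z_test(sample_mean=150, sd=7, n=50, mu0=154)
print(round(z, 2), decision)
```

Note that the conclusion is worded as "reject" or "fail to reject" Ho, never as "accept".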
Level of significance or alpha
Alpha (α) is a predetermined point on the probability
scale – against which the obtained probability (after
applying a test of significance) is tested
When α = 0.05, there is a 5% chance of rejecting a
true null hypothesis
When α = 0.01, there is a 1% chance of rejecting a
true null hypothesis.
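The meaning of alpha can be illustrated by simulation: if samples are drawn repeatedly from a population in which Ho is true, about 5% of tests at α = 0.05 should reject it. This is only a sketch; the population values, sample size and number of trials are assumptions, and a z test with known SD is used for simplicity.

```python
import random
from statistics import NormalDist, mean

random.seed(1)                        # fixed seed so the run is repeatable
MU0, SIGMA, N = 50.0, 10.0, 30        # hypothetical population where Ho is true
ALPHA, TRIALS = 0.05, 2000
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)   # about 1.96

rejections = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU0, SIGMA) for _ in range(N)]
    z = (mean(sample) - MU0) / (SIGMA / N ** 0.5)
    if abs(z) > z_crit:               # Ho rejected although it is true
        rejections += 1

type_i_rate = rejections / TRIALS     # should be close to ALPHA
print(type_i_rate)
```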
Tests of significance
obtaining p - value
It is a procedure used to establish the validity of a claim by
determining whether the test statistic falls in the critical region,
in which case the result is called significant
Choosing an appropriate test of significance depends upon
◦ Type of Data (Nominal, Ordinal, Continuous)
◦ Paired or unpaired observations
Types of tests of significance

Steps of Hypothesis testing…
Tests of Significance
Parametric vs non-parametric
Depends upon scale of data
◦ Categorical data
  ◦ Chi-square test / SE of proportion
◦ Continuous data
  ◦ One sample
    ◦ SEM
  ◦ Two samples
    ◦ Paired t test / dependent sample t test
    ◦ Unpaired / independent samples t test / two samples t test
  ◦ More than two samples
    ◦ ANOVA
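As an illustration of one entry in this list, the paired t test can be computed by hand from the within-subject differences. The data below are hypothetical before/after Hb values for five subjects, and the critical value is taken from a standard t table (df = 4, two-tailed, α = 0.05).

```python
import math
import statistics

# Hypothetical Hb values for the same five subjects.
before = [10, 11, 9, 12, 10]
after = [11, 13, 10, 13, 12]

# A paired test works on the within-subject differences.
diffs = [a - b for a, b in zip(after, before)]
mean_diff = statistics.mean(diffs)
se_diff = statistics.stdev(diffs) / math.sqrt(len(diffs))
t_calc = mean_diff / se_diff

T_CRIT = 2.776   # t table: df = 4, two-tailed, alpha = 0.05
print(round(t_calc, 2), abs(t_calc) > T_CRIT)
```

Here t calc exceeds t crit, so Ho (no mean change) would be rejected for this made-up data set.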
SE of mean
We obtained a random sample of 100 individuals
with mean Hb levels of 12 gm and SD of 2gm.
Assuming normal distribution of Hb in population,
What possible range of mean of Hb level could we
expect within 95% confidence limits?
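A worked sketch of this exercise, using the rule-of-thumb multiplier of 2 for 95% confidence limits (an assumption consistent with the earlier CI formula):

```python
import math

mean_hb, sd_hb, n = 12.0, 2.0, 100    # values from the exercise

se = sd_hb / math.sqrt(n)             # 2 / 10 = 0.2
lower = mean_hb - 2 * se              # 12 - 0.4
upper = mean_hb + 2 * se              # 12 + 0.4

print(round(se, 2), round(lower, 2), round(upper, 2))
```

Under these assumptions the mean Hb would be expected to lie between about 11.6 and 12.4.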
SE of proportion
In a population of 10000, the working people are
5200. A random sample of 100 individuals was
taken and the proportion of working people was
40%.
What range for the percentage of working people would we
expect in a sample of 100 with 95% confidence?
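One way to approach this exercise is sketched below. It assumes the standard error of a proportion is computed from the population percentage as √(pq/n), with q = 100 − p, and uses the rule-of-thumb multiplier of 2 for 95% limits; the variable names are illustrative.

```python
import math

population, working = 10_000, 5_200   # values from the exercise
n = 100
observed_pct = 40.0

p = 100 * working / population        # 52% working in the population
q = 100 - p                           # 48% not working
se = math.sqrt(p * q / n)             # about 5 percentage points
lower, upper = p - 2 * se, p + 2 * se

print(round(se, 2), round(lower, 1), round(upper, 1))
print(lower <= observed_pct <= upper)
```

Under these assumptions the expected range is roughly 42% to 62%, so an observed 40% falls just outside it.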
Choice of test
Knowledge check
When comparison is based on quantitative
data (comparison of mean)
t-Test
When comparison is based on qualitative
data (comparison of proportions)
Chi-Square Test
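For the comparison of proportions, a chi-square statistic for a 2 × 2 table can be computed directly. The sketch below uses the shortcut formula without continuity correction and obtains the p value for 1 degree of freedom from the normal distribution; the counts and the function name are hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-square with 1 df
    return chi2, p

# Hypothetical counts: rows = two groups, columns = outcome present / absent.
chi2, p = chi_square_2x2(30, 70, 15, 85)
print(round(chi2, 2), p < 0.05)
```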
FUN TIME
Which of the following tests is appropriate for
testing whether the proportion of type II diabetes is similar
in men and women?
a. χ² test
b. t test
c. ANOVA
FUN TIME
Which of the following tests is appropriate for testing the difference in mean
Hb levels among female students before and after
iron supplementation?
a. χ² test
b. Paired/dependent t test
c. Unpaired/independent samples t test
d. ANOVA
P-value
•Indicates the probability or likelihood of obtaining a
result at least as extreme as that observed in a study
by chance alone, assuming that there is truly no
association between exposure and outcome under
consideration.
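• Often loosely described as the probability of finding the result by chance alone
• It is calculated assuming Ho is true; it is not the probability that Ho is true
• It is compared with alpha, the accepted probability of falsely rejecting Ho (Type I or Alpha error)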
Interpretation of p - value
The p – value that is obtained after having applied
a relevant test of significance is compared with the
predetermined level of significance that is ‘Alpha’.
If this was set at 0.05 then:
p ≤ 0.05: reject the null hypothesis
p > 0.05: accept the null hypothesis
(exact term: fail to reject the null hypothesis)

Interpretation of p – value…
The p value is the probability of a result at least as extreme as the one
observed when the null hypothesis is true
p > 0.05: result not significant
0.01 < p ≤ 0.05: result significant at 0.05 but not at 0.01
p < 0.01 - result highly significant
p < 0.001- result very highly significant
p < 0.0003 – means it is significant at all levels
p < 0.0000 in journals/articles -??
Please note!
The size of the p-value does not indicate the
importance of the results
Results may be statistically significant but be
clinically unimportant
Results that are not statistically significant
may still be important
Difference between alpha and p value
The p value is obtained after applying a test of significance to the
sample outcome; it is then compared to the level of significance (alpha).
The level of significance (alpha) is predetermined, that is, defined before
applying the test.
Significance, or statistical significance, describes a decision made
concerning a value stated in the null hypothesis.
When the null hypothesis is rejected, we reach significance. When the
null hypothesis is retained, we fail to reach significance.
Forming Conclusions
•Every hypothesis test ends with the experimenters
(you and I) either
•Rejecting the Null Hypothesis, or
•Failing to Reject the Null Hypothesis
•As strange as it may seem, you never accept the
Null Hypothesis.
•The best you can ever say about the Null Hypothesis
is that you don’t have enough evidence, based on a
sample, to reject it!
Type I (Alpha) and Type II (beta) Errors
Study decision           | Reality: null hypothesis was true                        | Reality: null hypothesis was false
Null hypothesis rejected | Type I error, α (observe a difference when none exists)  | Correct decision
Null hypothesis accepted | Correct decision                                         | Type II error, ß (fail to observe a difference when one exists)
Knowledge check!
If null hypothesis is rejected, potentially
……error can occur
Power of study
Ability of study to detect the effect/difference
for which it was designed.
Probability that false null hypothesis will be
rejected.
Preventing type II error
Power = 1 − ß, conventionally set at 80%
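Power can be computed for a simple case to show how it grows with sample size. The sketch below assumes a two-sided one-sample z test with known SD; the effect size (a true difference of 3 units), the SD of 7 and the sample sizes are hypothetical, and the function name is illustrative.

```python
import math
from statistics import NormalDist

def power_two_sided_z(delta, sd, n, alpha=0.05):
    """Probability of rejecting Ho when the true mean differs from mu0 by delta."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    shift = abs(delta) * math.sqrt(n) / sd       # true difference in SE units
    # Rejection happens in either tail of the test statistic's distribution.
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)

for n in (20, 50, 100):                          # hypothetical sample sizes
    print(n, round(power_two_sided_z(delta=3, sd=7, n=n), 2))
```

When the true difference is zero the function returns alpha, and for a fixed difference the power rises as n increases, which is how a larger sample helps prevent a Type II error.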
P value and CI
If the 95% confidence interval does NOT include
the null value (1.0 for ratio measures such as relative risk
or odds ratio), then p < 0.05 and we declare a
“statistically significant” association.
If the 95% confidence interval includes the null
value of 1.0, then the test result is “not statistically
significant.”
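As an illustration, the rule can be applied to a relative risk from a 2 × 2 table. The sketch below uses the usual log-scale standard error for a relative risk; the counts are hypothetical and the function name is illustrative.

```python
import math

def relative_risk_ci(a, b, c, d, z=1.96):
    """RR and 95% CI: exposed group (a events, b none), unexposed (c, d)."""
    rr = (a / (a + b)) / (c / (c + d))
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

rr, lower, upper = relative_risk_ci(30, 70, 15, 85)   # hypothetical counts
significant = not (lower <= 1.0 <= upper)             # CI excludes the null value?
print(round(rr, 2), round(lower, 2), round(upper, 2), significant)
```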
Contd…
The C.I. provides an idea of the likely
magnitude of the effect and the random
variability of the point estimate.
On the other hand, the p-value reveals nothing
about the magnitude of the effect or the random
variability of the point estimate.
When results are significant at p<0.05
Results are unlikely to be due to chance
Likelihood of having results due to chance is
only 0.05 or less or 1 in 20.
Correlation
Statistical technique to establish and quantify
the strength & direction of relationship between two
variables
Test of significance
Expressed in terms of correlation coefficient
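A correlation coefficient can be computed from paired observations as follows. This sketch assumes Pearson's r is the coefficient meant here; the data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]          # hypothetical independent variable
y = [2, 4, 5, 4, 5]          # hypothetical dependent variable
r = pearson_r(x, y)
print(round(r, 2))           # 0.77
```

The sign of r gives the direction of the relationship and its absolute value (between 0 and 1) gives the strength.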
Correlation
Scatter diagram (figure: points plotted against an X axis and a Y axis)
• Relationship between two quantitative variables
on the x-y plane.
• One variable is called independent (X) and the
second is called dependent (Y)
• Points are not joined …bivariate
• This is the first step in correlation analysis
