Dissertation: Testing OF Hypothesis
TOPIC - TESTING OF HYPOTHESIS
In this study, we examine how hypothesis testing is used in various cases and
how it guides us in drawing conclusions from research or theory.
Let us first understand a few keywords -
*Reference links -
https://www.simplilearn.com/tutorials/statistics-tutorial/hypothesis-testing-in-statistics
https://www.sciencedirect.com/topics/mathematics/statistical-hypothesis-testing
https://www.statlect.com/glossary/estimator#:~:text=In%20statistics%2C%20an%20estimator%20is,Estimators%20as%20random%20variables
ASSUMPTIONS
NULL HYPOTHESIS
A statistician or decision-maker should be completely impartial and should not
let personal opinion influence the decision. A decision-maker should take a
neutral attitude towards the outcome of the test.
The null hypothesis is a statement of no effect or no significant difference
between a population parameter and its hypothesized value.
Any difference observed in the sample is attributed merely to fluctuation in
sampling from the same population. It is a statement about a population
parameter that is assumed to be true until the evidence indicates otherwise.
⇒ H₀: µ = µ₀
where H₀ is the null hypothesis, µ is the population mean, and
µ₀ is the hypothesized value of the population mean.
ALTERNATE HYPOTHESIS
The main reason for testing the null hypothesis is that we suspect it is
wrong. The acceptance or rejection of the null hypothesis is meaningful to the
decision-maker, hence the suspicion.
The alternate hypothesis is a statement that the population parameter differs
significantly from the value specified in the null hypothesis.
The alternate hypothesis is accepted when the null hypothesis is rejected and
is rejected when the null hypothesis is accepted.
⇒ H₁: µ > µ₀ or µ < µ₀ or µ ≠ µ₀
where H₁ is the alternate hypothesis, µ is the population mean,
and µ₀ is the value of the population mean specified under the null hypothesis.
SIGNIFICANCE
Level of significance (α)
The level of significance is the probability threshold against which the
strength of the evidence is measured before the null hypothesis is rejected. It
is the maximum probability of rejecting a true null hypothesis that the
decision-maker is willing to tolerate. This level is fixed before conducting
the experiment. It is also known as alpha or α. Since the significance level is
a probability, it ranges from 0 to 1; common choices are 0.05 and 0.01.
Confidence level
For scientific experiments, confidence levels are used to express confidence in
the result of the experiment. The confidence level is expressed as a percentage.
It is the long-run proportion of confidence intervals, computed from repeated
experiments, that contain the true population parameter. If the confidence
level is 95%, this would mean that if we were to repeat our experiment 100
times and compute 100 corresponding confidence intervals, approximately 95 of
the confidence intervals would contain the population mean.
Confidence level = 1 - α
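The repeated-experiment reading of the confidence level can be checked with a small simulation. The sketch below is illustrative only: the population mean, standard deviation, sample size and number of repetitions are assumed values, the population is assumed normal with known standard deviation, and the function name ci_coverage is hypothetical.

```python
import random
from statistics import NormalDist, mean


def ci_coverage(mu=50.0, sigma=10.0, n=40, alpha=0.05, repeats=1000, seed=0):
    """Fraction of z-based confidence intervals that contain the true mean.

    All default values are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        # Count the interval if it covers the true population mean.
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / repeats


coverage = ci_coverage()
print(f"Proportion of intervals containing the mean: {coverage:.3f}")
```

With α = 0.05 the printed proportion should be close to 0.95, in line with Confidence level = 1 - α.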
Critical Region
The set of values of the test statistic that leads to the rejection of the
null hypothesis is called the critical region or rejection region.
Acceptance Region
The set of values of the test statistic that leads to the acceptance of
the null hypothesis is called the acceptance region.
Error
The decision to accept or reject the null hypothesis H₀ is based on the
information available from the sample. The conclusion drawn is therefore not
necessarily true of the population, which leads to errors in decision-making.
These errors are of two types, namely -
Type I Error – The error of rejecting the null hypothesis when it is true is
known as a Type I error.
The probability of a Type I error is denoted by α, which is the significance
level of the test.
Type II Error – The error of accepting the null hypothesis when it is false is
known as a Type II error.
The probability of a Type II error is denoted by β.
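The link between the Type I error and α can be illustrated by simulation: when the null hypothesis is true, a test at level α should reject it in roughly a proportion α of repeated samples. The sketch below assumes a normal population with known standard deviation and a two-tailed z-test; the numerical values and the function name type_i_error_rate are assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean


def type_i_error_rate(mu0=100.0, sigma=15.0, n=25, alpha=0.05,
                      repeats=2000, seed=1):
    """Share of samples drawn under a true H0 for which H0 is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    rejections = 0
    for _ in range(repeats):
        # The sample is drawn with mean mu0, so H0: mu = mu0 is true here.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (mean(sample) - mu0) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # a Type I error
    return rejections / repeats


rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

The observed rate should lie near the chosen α of 0.05.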
Power of a test
In hypothesis testing, the power of a test is the probability of avoiding a
Type II error, i.e. of correctly rejecting a null hypothesis that is false.
Power of a test = 1 - β
It ranges from 0 to 1 since it is a probability. Factors such as the
significance level, the sample size, and the size of the true effect affect
the power. The closer the power is to 1, the better the test is at detecting a
false null hypothesis.
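How the sample size affects the power can be seen in a simple case. The sketch below assumes a right-tailed z-test of H₀: µ = µ₀ with known standard deviation, for which β = Φ(z₁₋α − (µ_true − µ₀)√n / σ); the function name z_test_power and all numbers are hypothetical.

```python
from statistics import NormalDist


def z_test_power(mu0, mu_true, sigma, n, alpha=0.05):
    """Power (1 - beta) of a right-tailed z-test with known sigma."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)         # right-tail critical value
    shift = (mu_true - mu0) * n ** 0.5 / sigma     # standardized true effect
    beta = std_normal.cdf(z_crit - shift)          # P(Type II error)
    return 1 - beta


# Power for an assumed true mean of 105 against mu0 = 100, sigma = 15.
for n in (10, 30, 100):
    print(n, round(z_test_power(100, 105, 15, n), 3))
```

Under these assumptions the power rises as n increases, and when the true mean equals µ₀ the rejection probability reduces to α.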
Single-Tailed Test
A one-tailed test is a statistical test in which the critical region of the
distribution lies entirely in one tail, so that the null hypothesis is rejected
only for values greater than a certain value or only for values less than a
certain value, but not both. If the test statistic falls into the one-sided
critical region, the alternative hypothesis is accepted instead of the null
hypothesis. The one-tailed test gets its name from testing the area under one
of the tails of the sampling distribution, for example a normal distribution.
*Graph of a single tail
4. Make a decision
Once the p-value has been calculated, compare it with the significance level
(α). Equivalently, compare the calculated test statistic with the tabulated
(critical) value and check whether it lies in the rejection region (or critical
region) or in the acceptance region.
Reject the null hypothesis H₀ if the p-value is less than or equal to α, i.e.
the test statistic lies in the rejection region (accept the alternate
hypothesis H₁). Accept the null hypothesis H₀ if the p-value is greater than α,
i.e. the test statistic lies in the acceptance region (reject the alternate
hypothesis H₁).
5. Conclusion
State whether the null hypothesis is accepted or rejected. Based on this
decision, conclude whether there is a significant difference between the
sample statistic and the population parameter.
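The decision and conclusion steps can be sketched in a few lines. This is a minimal illustration, assuming a two-tailed test whose statistic follows the standard normal distribution under H₀; the function names two_sided_p_value and decide are hypothetical, as are the example z-values.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Two-tailed p-value for a standard normal test statistic z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def decide(p_value, alpha=0.05):
    """Compare the p-value with the significance level alpha."""
    if p_value <= alpha:
        return "reject H0"
    return "accept H0"


# Two assumed test statistics, one far from zero and one close to it.
for z in (2.5, 1.0):
    p = two_sided_p_value(z)
    print(f"z = {z}: p-value = {p:.4f} -> {decide(p)}")
```

The wording "accept H0" follows this document's terminology; many texts prefer "fail to reject H0" for the same outcome.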
IMPORTANCE OF TESTING OF HYPOTHESIS
3. Business Analysis
By being more data-driven, a business can improve by identifying new
opportunities. Hypothesis testing helps in optimizing marketing strategies and
in ensuring product quality. For example, an e-commerce company can use
hypothesis testing to compare sales data from customers who received free
shipping offers and those who didn't.
Hypothesis testing can help businesses reduce the risk of costly mistakes by
basing business decisions on data, not hunches.
In manufacturing, hypothesis testing can help ensure that production
processes are within specified limits. In financial analysis, hypothesis testing
can help measure investment performance and identify anomalies in financial
data.
4. Understanding Relationships
Hypothesis testing helps in understanding relationships between variables,
identifying patterns, and drawing inferences based on statistical evidence. It
helps in identifying whether there is a significant relationship between
variables and how strong that relationship is.
5. Decision Making
With the help of hypothesis testing, one can reduce the risk of errors in
decision making. Hypothesis testing helps analysts and researchers to make
informed decisions based on the evidence.
TYPES OF HYPOTHESIS TESTING
In statistics, various tests are used to compare different samples or groups and
draw conclusions about populations using the sample drawn. These tests,
known as statistical tests, focus on analysing the likelihood or probability of
obtaining the observed data under specific assumptions or hypotheses. They
provide a framework for assessing evidence in support of or against a particular
hypothesis.
There are different statistical tests, such as -
i) T-test
ii) Z-test
iii) F-test
iv) Chi-square test
v) ANOVA
vi) MANOVA
T-TEST
The term "t-statistic" is abbreviated from "hypothesis test statistic".
The t-test is based on Student's t-distribution, which William Sealy Gosset
first published in English in 1908 in the scientific journal Biometrika under
the pen name "Student".
Although the name "Student" comes from Gosset's pseudonym, it was actually
through the work of Ronald Fisher that the distribution became well known as
"Student's distribution" and the test as "Student's t-test".
TYPES OF T-TESTS
1. One-Sample t-test
2. Two-sample t-test
3. Paired sample t-test
ONE - SAMPLE T-TEST
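This test can be used when the sample size is small (i.e. n < 30), the data are
collected randomly, and the data are approximately normally distributed. The
test statistic is calculated as:
$$ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} $$
Where,
t = t-value
x̄ = sample mean
μ = population mean (the value specified under the null hypothesis)
s = sample standard deviation
n = sample size
and,
$$ s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2} $$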
Result: For a two-tailed test, the null hypothesis is rejected if the absolute value of the
calculated t-value is greater than the tabulated t-value at n − 1 degrees of freedom. We can
then conclude that there exists a significant difference. Otherwise (i.e. when it is less
than or equal to the tabulated t-value) the null hypothesis is accepted.
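The one-sample calculation can be worked through on a small data set. The ten measurements and the hypothesized mean of 12.0 below are invented for illustration, and the function name one_sample_t is hypothetical; the critical value 2.262 is the standard tabulated two-tailed t-value for α = 0.05 and 9 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """t = (x_bar - mu0) / (s / sqrt(n)), with s using the n - 1 divisor."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    return (x_bar - mu0) / (s / sqrt(n))


# Hypothetical measurements; H0: mu = 12.0 against H1: mu != 12.0.
sample = [12.1, 11.8, 12.4, 12.0, 12.3, 11.9, 12.2, 12.5, 12.1, 12.0]
t_value = one_sample_t(sample, mu0=12.0)

T_CRIT = 2.262  # tabulated two-tailed value, alpha = 0.05, df = n - 1 = 9
decision = "reject H0" if abs(t_value) > T_CRIT else "accept H0"
print(f"t = {t_value:.3f}, decision: {decision}")
```

For these assumed data the calculated t-value is smaller in absolute value than the tabulated value, so the null hypothesis is accepted at the 5% level.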
TWO-SAMPLE T-TEST (INDEPENDENT)
$$ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} $$
where,
x̄₁ and x̄₂ are the means of the two sample groups.
s₁ and s₂ are the standard deviations of the two sample groups.
n₁ and n₂ are the sample sizes of the two groups.
and,