Project Work
LIST OF CONTENTS
Chapter 1: Introduction
1.1 Origin and Development
1.2 Terminologies and Definitions
1.3 Single-Tailed Test
1.4 Two-Tailed Test
Chapter 2: Parametric Tests
2.1 Introduction
2.2 Assumptions
2.3 Types of Tests
2.4 Methodology
2.5 Application
Chapter 3: Small Sample Tests
Chapter 4: Large Sample Tests
Chapter 5: Non-Parametric Tests
CHAPTER 1
INTRODUCTION
1.1 Origin and Development
In today’s data-driven world, decisions are based on data all the time.
Hypotheses play a crucial role in that process, whether in business, the
health sector, academia, agriculture, or quality improvement. Without
hypotheses and hypothesis tests, you risk drawing the wrong conclusions and
making bad decisions.
Hypothesis testing is a systematic procedure for deciding whether the
results of a research study support a particular theory.
Hypothesis testing or significance testing is a method for testing a claim or
hypothesis about a parameter in a population, using data measured in a sample.
In this method, we test some hypotheses by determining the likelihood that a
sample statistic could have been selected, if the hypothesis regarding the
population parameter were true.
Hypothesis testing is the method of using samples to learn more about the
characteristics of a given population. It involves making an assumption
about a population parameter or distribution and testing it further.
While hypothesis testing was popularized early in the 20th century, early forms
were used in the 1700s. The first use is credited to John Arbuthnot (1710),
followed by Pierre-Simon Laplace (1770s), in analyzing the human sex ratio at
birth.
In 1778, Pierre Laplace compared the birthrates of boys and girls in multiple
European cities. He stated: "It is natural to conclude that these possibilities are
very nearly in the same ratio". Thus, the null hypothesis in this case is that the
birthrates of boys and girls should be equal.
In 1900, Karl Pearson developed the chi-square test to determine ‘whether a
given form of frequency curve will effectively describe the samples drawn from
a given population.’ Thus, the null hypothesis is that a population is described
by some distribution predicted by theory.
In 1904, Karl Pearson developed the concept of "contingency" to determine
whether outcomes are independent of a given categorical factor. Here the null
hypothesis is by default that two things are unrelated (e.g. scar formation and
death rates from smallpox). In this case, the null hypothesis is no longer
predicted by theory or conventional wisdom, but instead by the principle of
indifference that led Fisher and others to dismiss the use of "inverse
probabilities".
Modern significance testing is largely the product of Karl Pearson (p-value,
Pearson's chi-squared test), William Sealy Gosset (Student's t-distribution), and
Ronald Fisher ("null hypothesis", analysis of variance, "significance test"),
while hypothesis testing was developed by Jerzy Neyman and Egon Pearson
(son of Karl). Ronald Fisher began his life in statistics as a Bayesian (Zabell
1992), but Fisher soon grew disenchanted with the subjectivity involved
(namely use of the principle of indifference when determining prior
probabilities) and sought to provide a more "objective" approach to inductive
inference.
1.2.2 Sample
A subset of data selected from the population is known as a sample. The
method or technique used for selecting data from the population is referred
to as sampling. Sampling can easily be related to the example of forming a subset
from an original set. Since the analysis of the whole population is impractical or
impossible (due to size, cost, or time constraints), samples are drawn. Sampling
is required to estimate a large population's characteristics or behaviour with a
sample's help. There are several techniques for sampling data which include
simple random sampling, systematic sampling, stratified sampling, cluster
sampling, etc.
1.2.3 Estimation
The process of drawing inferences about a population using its sample is known
as estimation. Estimation in statistics is a method to calculate the value of some
property of a population from the observation of a sample drawn from the same
population.
1.2.4 Parameter
A parameter is a numerical value that explains the characteristics of a
population. A few common parameters used in testing a hypothesis are mean,
variance, standard deviation, proportion, etc. It is a key characteristic of a
population, providing a foundation for analysis, estimation, and decision-
making in research.
1.2.5 Sample statistic
Sample statistics are functions that describe a sample's characteristics; their
values are computed from the sample data. In statistics, a function or
mathematical formula that uses sample data to calculate an estimate of an
unknown parameter or quantity is known as an estimator.
1.2.11 Error
The decision to accept or reject the null hypothesis H₀ is made based on
information available from the observation of the sample. The conclusion
drawn is therefore not necessarily always true in respect of the population.
This leads to errors in decision-making. These errors are of two types, namely:
Type I Error – The error of rejecting the null hypothesis even though it is true.
The probability of a Type I error is denoted by α, which is also known as the
significance level.
Type II Error – The error of accepting the null hypothesis even though it is
false. The probability of a Type II error is denoted by β.
1.2.12 Power of a test
In hypothesis testing, the power of a test is the probability of avoiding a
Type II error, i.e. of correctly rejecting a false null hypothesis.
The power of a test is (1 − β), that is, the probability of rejecting the null
hypothesis when it is false, and can be calculated by following the procedures
outlined by Cohen (1988).
It ranges from 0 to 1, since it is a probability. Factors such as the
significance level, sample size, etc. affect the power. The closer the power of
a test is to 1, the better the test.
1.2.13 p-Value
A p-value is the probability of obtaining a sample outcome at least as extreme
as the one observed, given that the value stated in the null hypothesis is true.
It ranges from 0 to 1. Typically, the significance level is set at 5%. If the
p-value is small, we are more likely to reject the null hypothesis; if the
p-value is large, we are more likely to retain it. A p-value can be obtained
using test distribution tables.
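Instead of distribution tables, a p-value can also be computed in software. A minimal sketch using Python's standard library for a standard-normal (z) test statistic; the function name `two_sided_p_value` is ours, for illustration:

```python
from statistics import NormalDist

def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a standard-normal test statistic z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

print(round(two_sided_p_value(1.96), 3))  # 0.05
```

For z = 1.96 this reproduces the familiar 5% threshold of the two-tailed normal test.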
1.2.14 Degrees of Freedom (df)
The number of independent items in a data sample randomly selected from the
population is known as the degrees of freedom. Usually, the degrees of freedom
are one less than the total number of items in the sample data. The degrees of
freedom are used to ensure that the sample data are statistically valid for
tests.
df = N − 1
ii) ALTERNATE HYPOTHESIS
The important reason for testing the null hypothesis is that we
suspect it is wrong. The acceptance or rejection of the null hypothesis
is meaningful to the decision-maker, hence the suspicion.
The alternate hypothesis is a statement about the population parameter
that states there is a significant difference between the population
parameter and the hypothesized value.
The alternate hypothesis is accepted when the null hypothesis is
rejected and is rejected when the null hypothesis is accepted.
⇒ H₁: µ>µ₀ or µ<µ₀ or µ≠µ₀
where H₁ is the alternate hypothesis, µ is the population mean, and µ₀
is the hypothesized value of the population mean.
1.4 Two-Tailed Test or Double–Tailed Test
A two-tailed test, in statistics, is a method in which the critical area of a
distribution is two-sided and tests whether a sample is greater than or less than a
certain range of values. If the sample being tested falls into either of the critical
areas, the alternative hypothesis is accepted instead of the null hypothesis. The
two-tailed test gets its name from testing the area under both tails of a normal
distribution.
CHAPTER 2
PARAMETRIC TESTS
2.1 Introduction
Parametric statistics is a branch of statistics that leverages models based on a
fixed (finite) set of parameters. A parametric test makes explicit assumptions
about the distribution of the population, such as normality, and draws
inferences about the parameters of that distribution. Parametric statistics was
mentioned by R. A. Fisher in his work Statistical Methods for Research Workers
in 1925, which created the foundation for modern statistics.
When choosing a test for our hypothesis, we need to know what type of
(outcome) data we have and the characteristics of the data. Parametric tests are
the most common statistical tests for continuous outcome variables. They are
often easier to calculate. They are named "parametric" because they are based
on Gaussian parameters, such as the mean and standard deviation.
When conducting hypothesis testing with parametric data, several assumptions
must be met to ensure the validity of statistical tests.
2.2 Assumptions
The following are the key assumptions while performing parametric tests:
2.2.1 Random Sampling
All statistical hypothesis tests assume that the sampling is random.
2.2.2 Distribution
It is important that the distribution of the population be known, or that the
sample size be large enough.
2.2.3 Normality
The data should be normally distributed, especially for smaller sample sizes.
This means that the distribution of the population from which the samples are
drawn should resemble a bell-shaped curve.
For larger sample sizes (typically n≥30), the Central Limit Theorem suggests
that the sampling distribution of the sample mean will be approximately normal,
even if the data itself is not.
2.2.4 Independence
The samples drawn from the populations should be independent of each other.
2.5 Methodology
A method of comparing a claim or statement observed regarding the population
with the help of a sample drawn from the same population is known as the
testing of a hypothesis. Statistical tests are used in hypothesis testing which can
be used to determine whether a predictor or estimator variable has a statistically
significant relationship with an outcome.
Steps involved in Testing of Hypothesis-
2.5.1 Formulate the null hypothesis
Depending upon the given or collected sample data, wisely choose the
test for further observation and inference. The test can be selected
based on the distribution of the sample data; the appropriate test
should be used for a better study of the design of the sample data or
its variation.
Perform the statistical test to compute the test statistic and calculate
the p-value. The p-value is calculated using the sampling distribution
of the test statistic, the sample data, and the type of test being performed.
The method of calculating the p-value differs from test to test.
2.5.5 Conclusion
2.6 Application
We have already discussed that hypothesis testing is a method used to decide
whether the null hypothesis is to be accepted or rejected. There are various
fields where testing of hypothesis is used. The following are some examples-
2.6.1 Evaluate the strength of a claim
It helps to determine whether the evidence supports a specific claim
or theory, ensuring that conclusions are based on empirical data rather
than assumptions. Hypothesis testing is a way of assessing the
strength of a claim or assumption before using it in a data set.
A company can use hypothesis testing to compare sales data from
customers who received free shipping offers and those who didn't.
Hypothesis testing can help businesses reduce the risk of costly
mistakes by basing business decisions on data, not hunches.
In manufacturing, hypothesis testing can help ensure that production
processes are within specified limits. In financial analysis, hypothesis
testing can help measure investment performance and identify
anomalies in financial data.
CHAPTER 3
SMALL SAMPLE TESTS
When the sample size is small (n < 30), the central limit theorem cannot
be relied upon to make the sampling distribution of the statistic
approximately normal. When the sample size is small, special probability
distributions are used to determine the critical value for the test statistic.
The two types of small sample tests are –
i) t-test
ii) F-test
3.1 t-test
The term "t-statistic" is abbreviated from "hypothesis test statistic".
The t-test is based on Student's t-distribution, which William Sealy
Gosset first published in English in 1908 in the scientific journal
Biometrika under the pen name "Student".
Although it was Gosset after whom the term "Student" is penned, it was
largely through the work of Ronald Fisher that the distribution became
well known as "Student's distribution" and "Student's t-test".
In a t-test, the degrees of freedom (df = n − 1) are the number of
values in the sample that are free to vary after estimating the sample
mean.
TYPES OF t-TESTS
1. One-Sample t-test
2. Two-sample t-test
3. Paired sample t-test
One-Sample t-test: We can use this when the sample size is small
(i.e. n < 30), the data is collected randomly, and it is approximately
normally distributed. The test statistic is calculated as:

t = (x̄ − μ) / (s / √n)

Where,
t = t-value
x̄ = sample mean
μ = population mean under the null hypothesis
s = sample standard deviation
n = sample size
and,

s = √[ (1/(n−1)) Σᵢ₌₁ⁿ (xᵢ − x̄)² ]
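The one-sample t formula can be sketched with Python's standard library; the sample values and hypothesized mean below are invented for illustration:

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """t = (x̄ − μ₀) / (s / √n), with s the sample standard deviation."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))

# Hypothetical small sample (n = 6 < 30) tested against μ₀ = 12.0
t = one_sample_t([12.1, 11.8, 12.4, 12.0, 11.9, 12.3], 12.0)
```

The resulting t-value would then be compared with the critical value from the t-table at n − 1 = 5 degrees of freedom.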
Two-Sample t-test: In this case, the degrees of freedom (df) = n₁ + n₂ − 2,
where n₁ and n₂ are the sample sizes of the two groups. This is because two
parameters (the two sample means) are estimated in a two-sample t-test.
Given two independent random samples xᵢ (i = 1, 2, 3, …, n₁) and
yᵢ (i = 1, 2, 3, …, n₂) of sizes n₁ and n₂, with means x̄ and ȳ and standard
deviations s₁ and s₂, drawn from normal populations with the same variance,
we have to test the hypothesis that the population means are the same.
The test statistic is given by:

t = (x̄ − ȳ) / [ s √(1/n₁ + 1/n₂) ]

where the pooled variance is

s² = [ (n₁ − 1)s₁² + (n₂ − 1)s₂² ] / (n₁ + n₂ − 2)

and

x̄ = (1/n₁) Σᵢ₌₁ⁿ¹ xᵢ and ȳ = (1/n₂) Σᵢ₌₁ⁿ² yᵢ
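Under the stated assumptions (independent samples, equal population variances), the pooled two-sample statistic can be sketched as:

```python
from math import sqrt
from statistics import mean, variance

def pooled_two_sample_t(x, y):
    """Two-sample t with pooled variance; returns (t, degrees of freedom)."""
    n1, n2 = len(x), len(y)
    # s² = [(n₁−1)s₁² + (n₂−1)s₂²] / (n₁ + n₂ − 2)
    s2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t = (mean(x) - mean(y)) / sqrt(s2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2
```

`statistics.variance` already uses the n − 1 divisor, matching s₁² and s₂² in the formula.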
Paired t-test: This test compares two measurements to determine if they are
significantly different. Paired measurements can be taken from the same
individual, object, or related units at different times or under various
conditions. If the groups come from a single population (e.g., measuring
before and after an experimental treatment), perform a paired t-test.
The two samples are of the same size, say equal to n, and the data are
paired: (xᵢ, yᵢ) corresponds to the same i-th sample unit. The problem is to
test if the sample means differ significantly or not. Here we take the null
hypothesis as: there is no significant difference between the sample means
over time. In the paired t-test, the degrees of freedom (df) = n − 1.
The test statistic is given by:

t = d̄ / (s / √n)

Where,

d̄ = (1/n) Σᵢ₌₁ⁿ dᵢ,  with dᵢ = xᵢ − yᵢ (i = 1, 2, 3, …, n)

and

s = √[ (1/(n−1)) Σᵢ₌₁ⁿ (dᵢ − d̄)² ]
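A minimal sketch of the paired statistic, with invented before/after measurements:

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(x, y):
    """Paired t: t = d̄ / (s / √n) with dᵢ = xᵢ − yᵢ."""
    d = [xi - yi for xi, yi in zip(x, y)]
    return mean(d) / (stdev(d) / sqrt(len(d)))

# Hypothetical before/after scores for the same 4 units
t = paired_t([5, 6, 7, 8], [4, 6, 6, 7])  # t = 3.0 here
```

Here df = n − 1 = 3, so t would be compared with the t-table at 3 degrees of freedom.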
3.2 F – Test
The F-test is a statistical test used in hypothesis testing to determine
whether or not the variances of two populations or two samples are equal.
The F-statistic is a ratio of two variances; variance is a measure of how
spread out the data are from the mean.
The history of the F-test involves the work of two statisticians:
Sir Ronald Fisher and George W. Snedecor.
In the 1920s, Fisher developed the F-statistic as a variance ratio. He
also provided the form of the F-distribution. In 1934, Snedecor
tabulated the F-distribution and named the test statistic "F" in
honor of Fisher. Snedecor also coined the name "F-test".
F-Distribution: The F-distribution was developed by Fisher to study
the behaviour of the ratio of two variances from random samples taken
from two independent normal populations. In applied problems we may be
interested in knowing whether the population variances are equal,
based on the response of the random samples. The F-distribution
or F-ratio, also known as Snedecor's F distribution or
the Fisher–Snedecor distribution (after Ronald Fisher and George
W. Snedecor), is a continuous probability distribution that arises
frequently as the null distribution of a test statistic.
Several assumptions are used in the F-test equation. For the F-test
formula to be utilized, the population distribution needs to be normal,
and the test samples should be independent events. Apart from this, the
following points should also be kept in mind:
i) It is simpler to calculate right-tailed tests. By putting the
bigger variance in the numerator, the test is forced to be right-tailed.
ii) In two-tailed tests, alpha is divided by two before the critical
value is determined.
iii) Variances are the squares of the standard deviations.
In this case, the degrees of freedom are (n₁ − 1) and (n₂ − 1), where n₁ and
n₂ are the sample sizes of the two groups. The shape of the F-distribution
is determined by its degrees of freedom. It is a right-skewed distribution,
meaning it has a longer tail on the right side. As the degrees of freedom
increase, the F-distribution becomes more symmetric and approaches a bell
shape.
The test statistic is

F = S₁² / S₂²

where,

S₁² = (1/(n₁−1)) Σᵢ₌₁ⁿ¹ (xᵢ − x̄)²  and  S₂² = (1/(n₂−1)) Σᵢ₌₁ⁿ² (yᵢ − ȳ)²
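The ratio of sample variances can be sketched directly; following the right-tailed convention noted above, the larger variance goes in the numerator:

```python
from statistics import variance

def f_statistic(x, y):
    """F = S₁²/S₂² with the larger sample variance in the numerator."""
    v1, v2 = variance(x), variance(y)
    if v1 < v2:
        v1, v2 = v2, v1  # force a right-tailed test
    return v1 / v2
```

The result is then compared with the F-table value at the corresponding (n₁ − 1, n₂ − 1) degrees of freedom.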
3.3 ANOVA
3.3.1 Introduction
We have seen that the test of significance of the difference of two
means tells us whether two samples differ significantly with respect to
some property or not. In actual practice, however, it often happens that
more than two samples are involved. For example:
In an agricultural experiment, 4 different chemical treatments of soil
A, B, C, and D produce mean wheat yields of 22, 24, 18, and 24
bushels per acre respectively. If we want to test whether there is a
significant difference in these means or whether it is due to chance, we
cannot use a single t-test. One way of using t-tests is to form pairs and
test them separately: AB, AC, AD, BC, BD, and CD.
The conclusions are also drawn separately. In other words, a t-test would
be applied 6 times and still no joint, overall test would be available.
The t-test is not suitable in this case because we want a test that
provides inference for all 4 samples at once. Such problems can be solved
by using an important technique, ANALYSIS OF VARIANCE.
As the name indicates, the method consists of analysing the variance of
the sample into useful components. We know that variability may arise
from a large number of causes, and the total variation in the data may be
the sum of many small deviations produced by these factors and causes
forming a homogeneous system. The variation in the data may also arise
due to other random causes, such as lack of homogeneity of some raw
material, error, chance, or fluctuation.
ANALYSIS OF VARIANCE is a method to estimate the contribution
made by each factor to the total variation.
3.3.2 History
In the 1770s, Laplace was performing hypothesis testing. Around
1800, Laplace and Gauss developed the least-squares method for
combining observations. Ronald Fisher introduced the term
“variance” and proposed its formal analysis in 1918. Fisher’s first
application of ANOVA to data analysis was published in 1921 in
Studies in Crop Variation I. Later came Studies in Crop
Variation II, written with Winifred Mackenzie and
published in 1923, which studied the variation in yield across plots sown
with different varieties and fertilizer treatments. Analysis of
variance became widely known after being included in Fisher’s 1925
book “Statistical Methods for Research Workers”. Since ANOVA was
developed by Fisher, it is also known as Fisher’s analysis of variance. It
uses the F-distribution to test two or more sample variances.
3.3.3 Assumptions
When the ANOVA technique is used the following assumptions
should be met –
i) The total variance of various sources of variance should be
additive i.e. contribution made by different factors or sources is
additive.
ii) The individuals in various subgroups should be selected based
on random sampling from a normally distributed population.
iii) The variance of the subgroups/samples should be homogeneous:
σ₁² = σ₂² = σ₃² = … = σₙ²
iv) The errors attached to each observation are independently and
normally distributed with mean = 0 and variance = σ².
v) The observations are independent and are distributed about
the true unknown mean.
vi) There should be at least two observations in each subgroup
otherwise ANOVA cannot be applied.
3.3.4 One – Way ANOVA
Let us assume that n random observations are classified into k
different classes or groups such that the i-th class contains nᵢ
observations (i = 1, 2, 3, …, k). We shall assume that all the observations
are independent and that the distribution from which the observations are
taken is normal with mean μᵢ and variance σ².
The mean μᵢ may be different for each sample, but the variance σ² remains
the same for the different groups or samples.
Consider the following arrangement of the n observations in k classes:
class Aᵢ contains the observations yᵢ₁, yᵢ₂, …, y_{i nᵢ}, with

Σᵢ₌₁ᵏ nᵢ = n
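The decomposition of the total variation into between-class and within-class components can be sketched as follows; the two small groups used in the usage example are invented:

```python
from statistics import mean

def one_way_anova_f(groups):
    """One-way ANOVA F = between-group mean square / within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

f = one_way_anova_f([[22, 21, 23], [24, 25, 23], [18, 17, 19]])
```

The statistic is then compared with the F-table value at (k − 1, n − k) degrees of freedom.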
CHAPTER 4
LARGE SAMPLE TEST
Large sample tests are used when the sample size is greater than 30
(n > 30). The central limit theorem states that the sampling distribution
of the mean becomes more nearly normal as the sample size increases. This
means that for large sample sizes, the sampling distributions of statistics
are approximately normal.
The different types of large sample tests –
i) Z – Test
ii) Welch’s F - Test
4.1 Z – TEST
A z-test is a statistical test used to determine whether two population
means are different when the variances are known and the sample size
is large. It can also be used to compare one mean to a hypothesized
value. It is commonly used when the sample size is large (typically
n > 30).
The Z-test gained more widespread use as statistical theory matured,
especially as statisticians emphasized the importance of standardizing
data and making inferential claims using the standard normal
distribution. The work on central limit theorem played a critical role
in justifying the use of the Z-test for large samples. This theorem
states that the distribution of the sample mean approaches a normal
distribution as the sample size becomes larger, regardless of the
population distribution, as long as the variance is finite. It is
commonly used in fields such as economics, medicine, psychology,
and social sciences to compare sample data to population means or
compare the means of two large independent samples.
4.1.2 ASSUMPTIONS
When the Z-test is used, the following assumptions should be met –
The Z-test is typically used when the sample size is large,
generally considered n>30. This is because the central limit
theorem states that for sufficiently large sample sizes, the
distribution of the sample mean approaches a normal distribution
regardless of the shape of the population distribution.
The population standard deviation (σ) should be known for
the Z-test to be valid. If the population variance is unknown but
the sample size is large, the sample standard deviation can
sometimes be used as an approximation.
For smaller sample sizes, the underlying data should be
approximately normally distributed. However, this is less critical
for large samples due to the central limit theorem.
The sample should be randomly drawn from the population.
The data should be measured on an interval or ratio scale, where
the differences between values are meaningful (e.g., height,
weight, or temperature).
4.1.3 TYPES OF Z – TESTS
Following are the various types of Z-tests used in statistical
hypothesis testing-
i) One Sample Z – Test
ii) Two Sample Z – Test
iii) Z – Test for Proportions
4.1.4 ONE SAMPLE Z – TEST
A one-sample z-test is used to test whether the mean of a population
is less than, greater than, or equal to some specific value. The Z-test
can be left-tailed, right-tailed, or two-tailed.
In a left-tailed test, the region of rejection is located at the extreme
left of the distribution. Here the null hypothesis is that the population
mean is greater than or equal to the claimed value.
In a right-tailed test, the region of rejection is located at the extreme
right of the distribution. Here the null hypothesis is that the population
mean is less than or equal to the claimed value.
The test statistic is

z = (x̄ − μ₀) / (σ / √n)

where x̄ is the sample mean, μ₀ is the hypothesized population mean,
σ is the population standard deviation, and n is the sample size.
Suppose we want to test H₀: μ = μ₀
Against,
H_A: μ > μ₀, μ < μ₀, or μ ≠ μ₀
1) If z > z_α, H₀ is to be rejected.
2) If z < −z_α, H₀ is to be rejected.
3) If |z| > z_{α/2}, H₀ is to be rejected.
For two independent samples,

x̄₁ ∼ N(μ₁, σ₁²/n₁) and x̄₂ ∼ N(μ₂, σ₂²/n₂)

so that

x̄₁ − x̄₂ ∼ N(μ₁ − μ₂, σ₁²/n₁ + σ₂²/n₂)

Where,

E(x̄₁ − x̄₂) = E(x̄₁) − E(x̄₂) = μ₁ − μ₂

And,

V(x̄₁ − x̄₂) = V(x̄₁) + V(x̄₂) − 2 cov(x̄₁, x̄₂) = σ₁²/n₁ + σ₂²/n₂ − 0

since the samples are independent. The test statistic is therefore

z = [(x̄₁ − x̄₂) − (μ₁ − μ₂)] / √(σ₁²/n₁ + σ₂²/n₂)

Where,
x̄₁, x̄₂ = means of the two samples
Result: If the calculated Z-test statistic falls within the critical region
(exceeds the critical value), reject the null hypothesis. Otherwise, fail
to reject the null hypothesis.
And,

V(p₁) = P₁Q₁/n₁ and V(p₂) = P₂Q₂/n₂

Since, for large samples, p₁ and p₂ are normally distributed, (p₁ − p₂) is
also normally distributed.
Thus, the standard variable corresponding to (p₁ − p₂) is given by:

z = [(p₁ − p₂) − E(p₁ − p₂)] / √V(p₁ − p₂)

Now, under the null hypothesis H₀: P₁ = P₂,

E(p₁ − p₂) = P₁ − P₂ = 0

and

V(p₁ − p₂) = V(p₁) + V(p₂) = P₁Q₁/n₁ + P₂Q₂/n₂ = PQ(1/n₁ + 1/n₂),

since under H₀: P₁ = P₂ = P. Therefore,

z = (p₁ − p₂) / √[PQ(1/n₁ + 1/n₂)]

Against the alternatives P₁ > P₂, P₁ < P₂, or P₁ ≠ P₂:
If z > z_α, H₀ is to be rejected at the α% level of significance.
If z < −z_α, H₀ is to be rejected at the α% level of significance.
If |z| > z_{α/2}, H₀ is to be rejected at the α% level of significance.
Where,
p₁ = X₁/n₁ and p₂ = X₂/n₂
Heterogeneous Variances: Unlike ANOVA, Welch’s F-test does not
assume that the variances of the groups are equal.
4.2.3 METHODOLOGY
Welch's F-test uses a more complex calculation for the degrees of
freedom to account for unequal variances across groups. The degrees
of freedom are calculated using the Welch–Satterthwaite equation,
which adjusts based on the variances and sample sizes of each group.
The degrees of freedom (df) for the F-statistic in Welch’s test are
adjusted using the group variances and sample sizes. For two groups with
sample variances s₁², s₂² and sizes n₁, n₂, the Welch–Satterthwaite
equation gives

df = (s₁²/n₁ + s₂²/n₂)² / [ (s₁²/n₁)²/(n₁ − 1) + (s₂²/n₂)²/(n₂ − 1) ]

For k groups, the error degrees of freedom are obtained analogously from
the variances sᵢ² and sizes nᵢ of all the groups,
where,
k : Number of groups.
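For the two-group case, the Welch–Satterthwaite adjustment can be sketched as:

```python
from statistics import variance

def welch_df(x, y):
    """Welch–Satterthwaite degrees of freedom for two independent samples."""
    v1 = variance(x) / len(x)  # s₁²/n₁
    v2 = variance(y) / len(y)  # s₂²/n₂
    return (v1 + v2) ** 2 / (v1 ** 2 / (len(x) - 1) + v2 ** 2 / (len(y) - 1))
```

When the two groups have equal sizes and equal sample variances, this reduces to n₁ + n₂ − 2, the ordinary pooled degrees of freedom.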
CHAPTER 5
NON-PARAMETRIC TESTS
5.1 Introduction
A non-parametric test is a type of statistical test that does not require
the data to follow a specific distribution (e.g., a normal distribution).
These tests are particularly useful when you cannot assume or do not
know the distribution of the population from which your sample is
drawn. They are also applied when dealing with ordinal data or when
the sample size is too small to satisfy the assumptions of parametric
tests.
Nonparametric tests serve as an alternative to parametric tests such as
the t-test or ANOVA, which can be employed only if the underlying data
satisfy certain criteria and assumptions. Note that nonparametric
tests are used as an alternative method to parametric tests, not as their
substitutes. In other words, if the data meet the required assumptions
for performing the parametric tests, the relevant parametric test must
be applied.
5.3 Importance
In order to achieve the correct results from the statistical analysis, we
should know the situations in which the application of nonparametric
tests is appropriate. The main reasons to apply the nonparametric test
include the following:
1. The underlying data do not meet the assumptions about the
population sample
Generally, the application of parametric tests requires various
assumptions to be satisfied. For example, the data follows a normal
distribution and the population variance is homogeneous. However,
some data samples may show skewed distributions.
The skewness makes the parametric tests less powerful because the
mean is no longer the best measure of central tendency; it is
strongly affected by extreme values. Nonparametric tests, on the other
hand, work well with skewed distributions and with distributions that
are better represented by the median.
2. The population sample size is too small
The sample size is an important assumption in selecting the
appropriate statistical method. If a sample size is reasonably large, the
applicable parametric test can be used. However, if a sample size is
too small, it is possible that you may not be able to validate the
distribution of the data. Thus, the application of nonparametric tests is
the only suitable option.
3. The analyzed data is ordinal or nominal
Unlike parametric tests that can work only with continuous data,
nonparametric tests can be applied to other data types such as ordinal
or nominal data. For such types of variables, the nonparametric tests
are the only appropriate solution.