Business Mathematics and Statistics
1st YEAR
CONTENTS
Sr No. Topic
1 DEFINITION OF MATRIX
2 MULTIPLICATION OF A MATRIX BY ANOTHER MATRIX
3 DETERMINANT OF A MATRIX
4 SET THEORY
5 PROBABILITY
6 CONDITIONAL EVENTS
7 CONCEPT AND USES OF DISTRIBUTION
8 POISSON DISTRIBUTION
9 STATISTICAL INFERENCES
10 HYPOTHESIS
11 SMALL SAMPLES
12 CHI SQUARE TEST
13 REGRESSION AND REGRESSION MODEL
14 FITTING OF SIMPLE REGRESSION EQUATION
15 TESTING AND INTERVAL ESTIMATION OF REGRESSION COEFFICIENT
16 MULTIPLE LINEAR REGRESSION
17 STATISTICAL DECISION THEORY
18 CHI - SQUARE TEST IN CONTINGENCY TABLE
19 MEASURES OF ASSOCIATION
20 NON-PARAMETRIC TEST
21 TIME SERIES AND DETERMINATION OF TREND
22 SEASONAL INDICES IN TIME SERIES ANALYSIS
23 STATISTICAL QUALITY CONTROL
24 BUSINESS FORECASTING
LESSON - 1
DEFINITION OF MATRIX
A matrix is a rectangular array of ordered numbers arranged in rows and columns and enclosed by
square brackets [ ], parentheses ( ), or a pair of double vertical lines || ||. The numbers are
known as the elements of the matrix. A matrix consisting of m rows and n columns is written in the
form:
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$
The first subscript of an element refers to its row and the second subscript refers to its
column. Thus a12 is the element in the 1st row and 2nd column of the matrix. Similarly, a21 is the
element in the 2nd row and 1st column, and aij refers to the element in the ith row and jth
column. The dimension or order of a matrix is determined by the number of rows (m) and the number
of columns (n) it has. A matrix is denoted by a capital letter such as A, B, C. Some examples are:
3 x 4 matrix has 3 rows and 4 columns
Order of Matrix
If a matrix has m x n elements arranged in m rows and n columns, it is said to be of the order
"m by n", and is written as an m x n matrix. Remember that in a matrix...
b. Two matrices of the same order are equal if their corresponding elements are the same.
For example,
c. If the matrix consists of only one row as [ a1 b1 c1 ], it is a Row Matrix or Row Vector.
e. A matrix is said to be a Zero Matrix or Null Matrix if and only if each of its elements is
zero. For example,
f. A matrix containing the number of rows equal to the number of columns is known as a
Square Matrix. For example,
In this, the elements 1, 9 and 8 are called the diagonal elements, and the diagonal is
called the principal diagonal.
g. A square matrix in which the diagonal elements are non-zero and every element off the
principal diagonal is zero is called a diagonal matrix. For example,
h. In a diagonal matrix, if all the diagonal elements are equal, it is known as scalar matrix.
For example,
ALGEBRA OF MATRIX
The addition of two or more matrices is possible only if they have the same order. The sum of
matrices is obtained by adding the corresponding elements of the matrices.
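The rule above can be sketched in a few lines of Python. This is only an illustration: the helper name `add_matrices` and the two 2 x 3 matrices are assumptions made for the example and do not come from the text.

```python
def add_matrices(a, b):
    """Return the element-wise sum of two matrices given as lists of rows."""
    # Addition is defined only when both matrices have the same order.
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same order to be added")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical 2 x 3 matrices, chosen only for illustration.
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[6, 5, 4],
     [3, 2, 1]]

total = add_matrices(A, B)
print(total)  # [[7, 7, 7], [7, 7, 7]]
```

Passing matrices of different orders raises an error, which mirrors the statement that their sum is not defined.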
Illustration 1
Find A+B.
Solution
A+B=
If two matrices have different orders, their addition is not defined. For example, the following
matrices cannot be added:
iii. If, for a matrix A of order m x n, there exists another matrix B of the same order such
that A + B = B + A = 0 (the null matrix), then B is known as the Additive Inverse or Negative
of A, and is denoted by -A.
Illustration 2
Illustration 3
Solution
The two matrices are of the same order 2x2, and are equal. This means that each element of one
matrix is equal to the corresponding element of the second matrix. Therefore,
X+Y = 2
Z-Y = 0
X+W = 4
Illustration 4
Reddy Company produces products A, B and C. The turnover (in 100 units) of these products by
region for 1993 and 1994 is given below. Find the total turnover for the two years.
For 1993
[Table: turnover by Product (rows) and Region (columns)]
For 1994
[Table: turnover by Product (rows) and Region (columns)]
Solution
By adding the figures for both years, the total turnover of each product for the two years can
be calculated:
[Table: total turnover by Product (rows) and Region (columns)]
Illustration 5
Find
(i) A + B + C;
(ii) (A + B) + C;
(iii) A + (B + C)
Solution
(ii) (A +B) +C =
(iii) A + (B + C) =
Illustration 6
Find A + B.
Solution
Illustration 7
Find A + B
Solution
Here, A + B = 0 (or B+ A = 0 ).
Like addition, the subtraction of two or more matrices is possible only if they have the same
order. It is obtained by subtracting the corresponding elements of the given matrices.
Illustration 8
Therefore A + B = B + A, but in general A - B ≠ B - A.
(C) Multiplication of Matrix
Illustration 9
Three persons intend to buy clothes at a shop. The prices of the clothes are given below:
Pant Rs. 80
Shirt Rs. 60
Bush-shirt Rs. 30
Tie Rs. 20
Let us write the prices of these clothes in the form of a column, so that it becomes a column
matrix:
Suppose person A wishes to buy 2, 2, 1, and 1 units of pant, shirt, bush-shirt, and tie
respectively, this can be written as a row matrix as [ 2 2 1 1 ]. Then, the total amount of
purchases by A can be obtained by multiplying the two matrices - matrix of units of clothes with
the matrix of prices, as this...
= 160 + 120 + 30 + 20
= Rs. 330.
If person B wishes to buy 1, 2, 2, and 1 units of pant, shirt, bush-shirt, and tie respectively, the
total amount spent by B can be calculated by multiplying the following matrices...
= 80 + 120 + 60 + 20
= Rs. 280.
Similarly, if person C wishes to buy 3, 2, 2, 1 units of the respective clothing items, the total
amount paid by C is...
= 240 + 120 + 60 + 20
= Rs. 440.
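Alternatively, instead of doing the three calculations separately, they can be done in one go by
stacking the three rows of quantities into a single matrix and multiplying it by the column of
prices (the pant price of Rs. 80 follows from 2 pants costing Rs. 160):
$$\begin{bmatrix} 2 & 2 & 1 & 1 \\ 1 & 2 & 2 & 1 \\ 3 & 2 & 2 & 1 \end{bmatrix} \begin{bmatrix} 80 \\ 60 \\ 30 \\ 20 \end{bmatrix} = \begin{bmatrix} 330 \\ 280 \\ 440 \end{bmatrix}$$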
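The purchase calculation in Illustration 9 can be reproduced with a small sketch. The function name `matmul` is an assumption made for this example, and the pant price of Rs. 80 is inferred from the term 160 = 2 x 80 in the worked sums.

```python
def matmul(a, b):
    """Multiply matrix a (m x n) by matrix b (n x q), both given as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("number of columns of a must equal number of rows of b")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Prices per unit (Rs.) of pant, shirt, bush-shirt and tie as a 4 x 1 column matrix.
prices = [[80], [60], [30], [20]]

# One row per person (A, B, C): units of pant, shirt, bush-shirt and tie.
units = [[2, 2, 1, 1],
         [1, 2, 2, 1],
         [3, 2, 2, 1]]

amounts = matmul(units, prices)
print(amounts)  # [[330], [280], [440]]
```

Each row of the result is the total paid by one person, matching Rs. 330, Rs. 280 and Rs. 440.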
- End of Chapter -
LESSON - 2
Illustration 10
Three persons intend to buy clothes at shops R1, R2, R3 and R4. The prices are given below.
Price Matrix:
Shops
Clothes Matrix:
Assuming that each person must buy the whole lot from the same shop, which shop should A prefer?
Solution
Shops
The first row gives the amounts to be paid by A for the same lot at the four shops (R1, R2, R3,
R4); similarly, the second and third rows give the amounts to be paid by B and C at the four
shops.
For A and C it would be better to buy the whole lot from shop R4, and for B from shop R3, as the
payment would be the least.
(i) Not every pair of matrices can be multiplied with each other, and matrix multiplication is in
general not commutative, i.e., A x B is not equal to B x A.
Remark:
Given matrix A of order m x n, and matrix B of order p x q, the ordered pair of matrices (A, B)
is said to be conformable for multiplication only if n = p (i.e. the number of columns of A =
the number of rows of B). If p and n are not equal, the product AxB is not defined. If AxB is
conformable, it does not follow that BxA is also conformable.
A is a 3x3 matrix, and B is a 3x2 matrix. So, m=3, n=3, p=3, q=2.
Is BA defined? No, because the number of columns of B (q = 2) is not equal to the number of
rows in A (m = 3).
i.e., A x (B x C) = (A x B) x C
i.e., A x (B + C) = A x B + A x C
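The properties listed above can be checked numerically. The sketch below uses hypothetical matrices A, B, C and D and illustrative helper names (`matmul`, `add`, `conformable`); none of them come from the text.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("matrices are not conformable for multiplication")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def add(a, b):
    """Element-wise sum of two matrices of the same order."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def conformable(a, b):
    """True when a x b is defined: columns of a equal rows of b."""
    return len(a[0]) == len(b)


# Hypothetical matrices chosen only for illustration.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]
D = [[1, 2, 3], [4, 5, 6]]  # a 2 x 3 matrix

AB = matmul(A, B)
BA = matmul(B, A)
print(AB, BA)                # [[2, 1], [4, 3]] [[3, 4], [1, 2]]
print(AB == BA)              # False: multiplication is not commutative
print(matmul(A, matmul(B, C)) == matmul(matmul(A, B), C))          # True
print(matmul(A, add(B, C)) == add(matmul(A, B), matmul(A, C)))     # True
print(conformable(A, D), conformable(D, A))                        # True False
```

The last line shows a case where A x D is defined but D x A is not.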
Illustration 11
In family R, men, women and children are 3, 2, 1 respectively; and in family Q they are 1, 1, 2
respectively. The recommended daily calories are 2000, 1800 and 1500 for men, women and
children, and proteins recommended are 50, 45, 35 respectively. Calculate the total requirement
of calories as well as proteins for each of the two families.
Solution
Calorie Protein
The dimensions / order of two matrices A and B satisfy the rule of multiplication (no. of
columns in A = no. of rows in B).
By multiplying A and B, we can get the total requirement of calories and proteins for R and Q
families.
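As a check on Illustration 11, the product can be computed with a short sketch. The figures are those given in the problem; the variable and function names are assumptions made for this example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Rows: families R and Q. Columns: men, women, children.
families = [[3, 2, 1],
            [1, 1, 2]]

# Rows: men, women, children. Columns: calories, proteins.
requirements = [[2000, 50],
                [1800, 45],
                [1500, 35]]

totals = matmul(families, requirements)
print(totals)  # [[11100, 275], [6800, 165]]
```

Family R therefore needs 11100 calories and 275 units of protein, and family Q needs 6800 calories and 165 units of protein.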
Illustration 12
A company produces three products A, B and C which it sells in two markets X and Y. Data given
is...
If the unit sales price of A, B and C are Rs.3, Rs.2, and Rs.1 respectively, find the total revenue in
X and Y market with matrix algebra.
Solution
The total revenue received by selling the products in markets X and Y is Rs.39000 and
Rs.42000 respectively. The revenue of the company from both the markets is Rs.81000:
Illustration 13
If the unit cost of the above three products (given in Illustration 12) are Rs. 2.50, Rs. 1.75 and
Rs. 0.80 respectively, find the profits.
Solution
Profit by selling products in X market and Y market is Rs. 6550 and Rs. 6400 respectively. Total
profit earned by the PQR Company is Rs. 12950.
Illustration 14
A factory employs 40 skilled workers and 20 unskilled workers. The daily wages to skilled and
unskilled are Rs. 35 and Rs. 22 respectively. Using matrices, find
Solution
Illustration 15
Solution
which means
13X1 = 10
3(0.77) + 2X2 = 7
A square matrix A is said to be symmetric if aij = aji for all values of i and j.
A square matrix A is said to be skew symmetric if aij = -aji for all values of i and j.
Transpose of a matrix:
If A is a matrix, the matrix obtained by changing its rows into columns and its columns into rows
is known as the transpose of the matrix, and is denoted by A' or $A^T$.
Illustration 17
Let
Illustration 18
Prove that (AB)' = B'A'
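A numerical check is not a proof, but it can make the identity concrete. The sketch below uses hypothetical matrices and illustrative helper names (`transpose`, `is_symmetric`, `is_skew_symmetric`), none of which come from the text.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Turn rows into columns and columns into rows."""
    return [list(col) for col in zip(*m)]


def is_symmetric(m):
    return m == transpose(m)


def is_skew_symmetric(m):
    return m == [[-v for v in row] for row in transpose(m)]


# Hypothetical matrices: A is 2 x 3 and B is 3 x 2.
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[1, 0],
     [2, 1],
     [0, 3]]

left = transpose(matmul(A, B))              # (AB)'
right = matmul(transpose(B), transpose(A))  # B'A'
print(left, right)  # [[5, 14], [11, 23]] [[5, 14], [11, 23]]

print(is_symmetric([[1, 7], [7, 2]]))        # True
print(is_skew_symmetric([[0, 5], [-5, 0]]))  # True
```

Note the reversed order on the right-hand side: A'B' would be a 3 x 3 matrix here and could not equal (AB)', which is 2 x 2.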
- End of Chapter -
LESSON - 3
DETERMINANT OF A MATRIX
The determinant of a square matrix A is a single number obtained from its elements as a
difference between products; it is denoted by either | A | or det. A, and is written as the group
of elements enclosed by two vertical lines. For example, if a, b, c, d are the elements of a
determinant of order 2,
$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = (a \times d) - (b \times c),$$
that is, the difference between the products of the elements of the two diagonals.
Illustration 19
= 6 x 4 – 8 x 2
= 24 – 16 = 8
A determinant is defined only for a square matrix and may be of order 1, 2, 3, and so on. Let us
define two important concepts, the minor and the cofactor, which are used to find the
determinants of matrices of order 3 and above.
Minor
The 'minor of an element' in a determinant is the determinant of one lower order obtained by
deleting the row and the column containing that element. Thus, in a determinant of order 3, each
minor is a determinant of order 2. The minor (M) of the element amn is obtained by deleting the
mth row and the nth column. So, in the given matrix, the minor of element a11 is obtained by
deleting the 1st row and 1st column of the matrix...
Cofactor
The Cofactor $C_{ij}$ of an element $a_{ij}$ is defined as $(-1)^{i+j} M_{ij}$, where $M_{ij}$ is the minor of the element $a_{ij}$.
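The definitions of minor and cofactor translate directly into a short recursive sketch. The function names and the 3 x 3 matrix M are assumptions made for this example; rows and columns are numbered from 1, as in the text, and the determinant is expanded along the first row (any row or column gives the same value).

```python
def minor(m, i, j):
    """Minor M_ij: determinant left after deleting row i and column j (1-based)."""
    sub = [row[:j - 1] + row[j:] for k, row in enumerate(m, start=1) if k != i]
    return det(sub)


def cofactor(m, i, j):
    """Cofactor C_ij = (-1)^(i+j) * M_ij (1-based indices)."""
    return (-1) ** (i + j) * minor(m, i, j)


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum(m[0][j - 1] * cofactor(m, 1, j) for j in range(1, len(m) + 1))


# The 2 x 2 determinant of Illustration 19: 6 x 4 - 8 x 2.
print(det([[6, 8], [2, 4]]))  # 8

# Hypothetical 3 x 3 matrix chosen only for illustration.
M = [[1, 2, 3],
     [0, 4, 5],
     [1, 0, 6]]
print(minor(M, 1, 1))     # 24
print(cofactor(M, 1, 2))  # 5
print(det(M))             # 22
```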
Illustration 20
Solution
Expand the determinant by using the elements of the first column. We get:
Cramer's Rule
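Cramer's rule expresses the solution of a set of simultaneous linear equations in terms of
determinants. Consider two simultaneous equations, written here with right-hand sides c1 and c2:
$$a_1 x + b_1 y = c_1 \quad \text{(i)}$$
$$a_2 x + b_2 y = c_2 \quad \text{(ii)}$$
To solve the equations, multiply equation (i) by b2 and equation (ii) by b1 and subtract, to get
$$x = \frac{c_1 b_2 - c_2 b_1}{a_1 b_2 - a_2 b_1}$$
Similarly, multiply equation (i) by a2 and equation (ii) by a1 and subtract, to get
$$y = \frac{a_1 c_2 - a_2 c_1}{a_1 b_2 - a_2 b_1}$$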
Solution for x and y in matrix form is...
Illustration 21
4x + 2y = 2
3x – 5y = 21
Solution
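We have,
$$D = \begin{vmatrix} 4 & 2 \\ 3 & -5 \end{vmatrix} = -26, \quad D_x = \begin{vmatrix} 2 & 2 \\ 21 & -5 \end{vmatrix} = -52, \quad D_y = \begin{vmatrix} 4 & 2 \\ 3 & 21 \end{vmatrix} = 78$$
So,
$$x = \frac{D_x}{D} = 2, \qquad y = \frac{D_y}{D} = -3$$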
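The arithmetic for Illustration 21 can be checked with a minimal sketch of Cramer's rule for two equations. The function name `cramer_2x2` and the parameter names (with c1 and c2 standing for the right-hand sides) are assumptions made for this example.

```python
from fractions import Fraction


def cramer_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    d = a1 * b2 - a2 * b1          # determinant of the coefficients
    if d == 0:
        raise ValueError("coefficient determinant is zero; no unique solution")
    x = Fraction(c1 * b2 - c2 * b1, d)
    y = Fraction(a1 * c2 - a2 * c1, d)
    return x, y


# Illustration 21: 4x + 2y = 2 and 3x - 5y = 21.
x, y = cramer_2x2(4, 2, 2, 3, -5, 21)
print(x, y)  # 2 -3
```

Substituting back gives 4(2) + 2(-3) = 2 and 3(2) - 5(-3) = 21, as required.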
Illustration 22
x + 6y –z = 10
2x + 3y + 3z =17
Solution:
For equations,
solution is
Hence, the solution for the given equations is:
1. The value of a determinant remains unchanged even if columns are changed into rows,
and rows changed into columns.
2. If two columns (or rows) of a determinant are interchanged, the value of the determinant
so obtained is negative of the value of the original determinant.
3. If each element in any one row or any one column of a determinant is multiplied by a
constant, say K, then the value of determinant so calculated is K times the value of the
original determinant.
4. If any two columns (or any two rows) in a determinant are equal, the value of the
determinant is zero. The rows (or columns) of such a determinant are said to be linearly
dependent; when the determinant is non-zero, they are linearly independent.
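The four properties can be verified on a small example. In this sketch the matrix M, the constant K and the helper names are hypothetical, chosen only for illustration.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, value in enumerate(m[0]):
        sub = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * value * det(sub)
    return total


def transpose(m):
    return [list(col) for col in zip(*m)]


M = [[1, 2, 3],
     [0, 4, 5],
     [1, 0, 6]]
K = 3

swapped = [M[1], M[0], M[2]]                  # rows 1 and 2 interchanged
scaled = [[K * v for v in M[0]], M[1], M[2]]  # row 1 multiplied by K
repeated = [M[0], M[0], M[2]]                 # rows 1 and 2 equal

print(det(M))             # 22
print(det(transpose(M)))  # 22  (property 1)
print(det(swapped))       # -22 (property 2)
print(det(scaled))        # 66  (property 3)
print(det(repeated))      # 0   (property 4)
```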
Inverse of Matrix
In matrix theory, the concept of dividing one matrix directly by another does not exist. However,
a square matrix A whose determinant is non-zero has an inverse, found by a process called
"inversion of a matrix", which plays the role of a reciprocal. In algebra, if X x Y = 1, then
X = 1/Y, and we say that X is the inverse of Y (and Y is the inverse of X). In matrices, the
inverse of a matrix A is represented by $A^{-1}$. The product of A and $A^{-1}$ must be equal to
the identity matrix (denoted by I):
$A \times A^{-1} = I$
so that, loosely speaking, $A^{-1}$ stands in for I / A.
The inverse matrix concept is very useful in solving simultaneous equations in input-output analysis as
well as regression analysis in economics.
Illustration 23
Solution
= 3x2 - 4x1 = 6 - 4 = 2
(For a 2 x 2 matrix, Adj A is obtained by interchanging the principal diagonal elements and
reversing the signs of the elements of the other diagonal in A)
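The adjoint rule for a 2 x 2 matrix can be sketched as follows. The matrix A below is an assumption: it is one matrix whose determinant is 3 x 2 - 4 x 1 = 2, as in the calculation above, but the matrix actually used in the illustration is not reproduced in the text. The function name `inverse_2x2` is also illustrative.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix as Adj A divided by |A|."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular; it has no inverse")
    # Swap the principal diagonal, reverse the signs of the other diagonal.
    adj = [[d, -b], [-c, a]]
    return [[Fraction(v, det) for v in row] for row in adj]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


A = [[3, 4],
     [1, 2]]  # assumed matrix with |A| = 2

A_inv = inverse_2x2(A)
print(A_inv == [[1, -2], [Fraction(-1, 2), Fraction(3, 2)]])  # True
print(matmul(A, A_inv) == [[1, 0], [0, 1]])                   # True
```

Exact fractions are used so that the product A x A^-1 comes out as the identity matrix without rounding error.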
So,
Remember that:
Rank of Matrix
The rank of a matrix is the maximum number of linearly independent rows (or columns) in the
matrix. Equivalently, the rank of a matrix A is the order of the largest non-zero minor of A.
1. The rank of a matrix is not related in any way to the number of zero elements in it.
2. The rank cannot exceed the number of its rows or columns, whichever is lesser.
3. The rank is at least 1, unless the matrix has all zero elements.
4. The rank of a column matrix (single column) having any number of rows, say m x 1
matrix, is at most 1; rank of a 3 x 50 matrix is at most 3.
5. Rank of the transpose of matrix A is the same as that of A.
Note:
The straight way to find the rank of any matrix is to look for non-zero determinant of the highest
order which the given matrix contains.
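For larger matrices, checking every minor by hand becomes tedious. The sketch below computes the rank by row reduction instead, which is a different technique from the minor search described above but gives the same number. The function name `rank` and the example matrices are assumptions made for illustration.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (list of rows) by reduction to row-echelon form."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(n_cols):
        # Look for a non-zero entry in this column at or below row r.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Clear the entries below the pivot.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


print(rank([[1, 2, 3], [2, 4, 6], [1, 0, 1]]))  # 2 (row 2 is twice row 1)
print(rank([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))  # 3
print(rank([[0, 0], [0, 0]]))                   # 0 (null matrix)
print(rank([[1], [2], [3]]))                    # 1 (column matrix)
```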
Illustration 24
Solution
We see in the above calculation that each minor of the highest order is zero. Therefore, the rank
can be at most 2.
Now,
Again, the matrix is not a null matrix (its elements are non-zero). So, the 1st order
determinants are non-zero.
Simultaneous Equations
Matrix algebra is useful in solving a set of linear simultaneous equations involving more than
two variables. The procedure for getting the solutions is as :
x+y+z=4
2x + 5y -2z = 3
or AX = B, where
A is known as the coefficient matrix, in which the coefficients of x are written in the first
column, the coefficients of y in the second column, and the coefficients of z in the third column.
B is the matrix formed from the right-hand terms of the equations.
Note:
Linear Equations
If matrix B is a zero matrix, the system becomes AX = 0 and is said to be a homogeneous system;
otherwise the system is said to be non-homogeneous.
(2) Multiplication (or division) of the elements of any row (or column) by any non-zero number,
(3) Addition, to the elements of any row (or column), of the corresponding elements of any other
row (or column) multiplied by any number.
To solve homogeneous linear equations, the Gauss-Jordan method, also called the Triangular Term
Reduction method, is applied. In this method, the given system of linear equations is reduced to
an equivalent, simpler system; the approach applies to both homogeneous and non-homogeneous
equations. The simplified system has the form
x2 + c2x3 = d2
x3 = d3
The non-homogeneous linear equations can be solved by either (i) the Matrix method, (ii)
Cramer's method, or (iii) the Gauss-Jordan method.
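The triangular reduction described above can be sketched as follows. This is a minimal version that reduces the system to triangular form and then substitutes back, rather than a full Gauss-Jordan reduction to the identity matrix. The function name and the system of three equations are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def solve_by_elimination(a, b):
    """Solve AX = B by reduction to triangular form and back-substitution."""
    n = len(a)
    # Augmented matrix [A | B] with exact fractions.
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next((i for i in range(col, n) if aug[i][col] != 0), None)
        if pivot is None:
            raise ValueError("system has no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]  # leading coefficient becomes 1
        for i in range(col + 1, n):
            factor = aug[i][col]
            aug[i] = [x - factor * y for x, y in zip(aug[i], aug[col])]
    # Back-substitution, starting from the last equation (x3 = d3).
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        x[i] = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
    return x


# Hypothetical system:
#    x +  y +  z =  6
#        2y + 5z = -4
#   2x + 5y -  z = 27
A = [[1, 1, 1],
     [0, 2, 5],
     [2, 5, -1]]
B = [6, -4, 27]

solution = solve_by_elimination(A, B)
print(solution == [5, 3, -2])  # True
```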
Matrix Method
Let AX = B be the given system of linear equations, and $A^{-1}$ be the inverse of A.
Pre-multiplying both sides of the equation by $A^{-1}$, we get
$(A^{-1}A) X = A^{-1} B$
$I X = A^{-1} B$
$X = A^{-1} B$
X gives the solution to the given set of simultaneous equations. It is thus calculated by first
finding $A^{-1}$ and then post-multiplying $A^{-1}$ by B.
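As a small worked example of the matrix method, consider the two equations 4x + 2y = 2 and 3x - 5y = 21 from Illustration 21. The choice of this system is only for illustration; the same steps apply to three or more unknowns, where the adjoint is built from cofactors.

```latex
A = \begin{bmatrix} 4 & 2 \\ 3 & -5 \end{bmatrix}, \quad
X = \begin{bmatrix} x \\ y \end{bmatrix}, \quad
B = \begin{bmatrix} 2 \\ 21 \end{bmatrix}

|A| = (4)(-5) - (2)(3) = -26, \qquad
\operatorname{Adj} A = \begin{bmatrix} -5 & -2 \\ -3 & 4 \end{bmatrix}

A^{-1} = \frac{1}{|A|} \operatorname{Adj} A
       = -\frac{1}{26} \begin{bmatrix} -5 & -2 \\ -3 & 4 \end{bmatrix}

X = A^{-1} B
  = -\frac{1}{26} \begin{bmatrix} (-5)(2) + (-2)(21) \\ (-3)(2) + (4)(21) \end{bmatrix}
  = -\frac{1}{26} \begin{bmatrix} -52 \\ 78 \end{bmatrix}
  = \begin{bmatrix} 2 \\ -3 \end{bmatrix}
```

So x = 2 and y = -3, which agrees with the values obtained by Cramer's rule for the same equations.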
Illustration 25
Solve the equations and find the values by linear equations using the matrix inverse method.
Solution
The equations for three days cost can be written like this:
X = A-1B
Proof
5000+750+1200=6950
6950=6950
Illustration 26
--------------------------------------------
        C1     C2     C3     Available
--------------------------------------------
R1       2      3      4        29
R2       1      1      2        13
R3       3      2      1        16
--------------------------------------------
- End of Chapter -
LESSON - 4
SET THEORY
The set is a fundamental concept, and all mathematical objects and constructions rest on set
theory. In many fields, scientists construct their objects in terms of sets. In the analysis of
trade, industry and commerce, we have sets of data, sets of items produced, sets of outcomes of
decisions, and the like. Words such as family, association, group and crowd are often used in
daily life to convey the idea of a set. Enumerations and calculations that lead to numbers also
form sets, and these give good insight into the nature of things. Sets are written with symbols,
on which operations can be performed.
Notation
A collection of any type of numbers, things, or objects is referred to as a "set". Its
constituent members are termed its "elements". The elements may be presented by listing them
within brackets, or may be described by a statement of their properties. Sets are denoted by
capital letters A, B, C, ... and their elements by lower case letters a, b, c, .... A set is
known by its elements. A set may be presented by the tabular method (or roster method), the
descriptive phrase method, or the rule method (or set-builder method). These are discussed with
examples.
A = {2, 4, 6, 8} is the set of the even numbers 2, 4, 6, 8. It means A is the set of all even
numbers between 1 and 9, and 2, 4, 6, 8 are the elements of A. Here the set is described by
listing each of its elements within the brackets. This method is known as the tabular method.
A = {even numbers between 1 and 9}. Here a phrase describing the elements of the set is simply
placed within the brackets. This method is called the descriptive phrase method.
A = {x / x is an even integer, 1 < x < 9}. Here the set is determined by a rule that its elements
must satisfy, rather than by listing them. This method is known as the rule method. However, it
should be noted that the set A remains the same whichever of the three methods is used.
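The tabular and rule methods have direct counterparts in Python's built-in set type, which makes for a quick check that both describe the same set. The variable names below are illustrative, and the range searched by the rule is an arbitrary choice that comfortably covers the interval 1 < x < 9.

```python
# Tabular (roster) method: list every element.
A_roster = {2, 4, 6, 8}

# Rule (set-builder) method: keep every integer that satisfies the rule.
A_rule = {x for x in range(-100, 100) if x % 2 == 0 and 1 < x < 9}

print(A_roster == A_rule)          # True: both methods give the same set
print(A_roster == {8, 6, 4, 2})    # True: the order of elements is immaterial
print(4 in A_roster, 5 in A_roster)  # True False
print(len(A_roster))               # 4, the number of elements n(A)
```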
Venn Diagram
A diagram is said to be a Venn diagram if the elements of a set are represented by points inside
a circle or a similar closed curve. For example, V = {a, e, i, o, u}. The Venn diagram shows the
letters a, e, i, o, u, which are the elements of set V. A set remains unchanged even if the order
of its elements is changed. Thus, the above set can be written as:
V = {e, o, i, u, a}
Illustration 28:
v. The three smallest integers greater than 20, and three smallest integers less than 20.
Solution
ii. {e, n, g, l, i, s, h}
iii. {200}
Illustration 29
ii. B = {b /b>4}
iii. C = {c (integer)/c<0}
Solution
i. Set A consists of the elements x such that x is a college having more than 200 students.
ii. Set B consists of the elements b such that b is a number greater than 4.
iii. Set C consists of the elements c such that c is an integer less than 0, i.e. C is the set
of all negative integers.
If a set A has a finite number of elements, A is said to be a finite set. The number of elements
is denoted by n(A). A set is infinite if it has an infinite number of elements; the elements of
such a set cannot be counted by a finite number. A set of points along a line or in a plane is
known as a point set.
Illustration 30
i. A= {a, b, c, y, 1, 4, 6, r}
Solution
v. No
vi. Yes
Types of Sets
a. A set which contains no elements is called a null set or empty set, and is denoted by Ø (Phi).
b. A set which contains only one element is known as a unit set.
c. A set which contains the totality of elements under consideration is called the universal set.
It is often drawn as a rectangle and is denoted by E.
d. If two sets have no elements in common, they are disjoint sets.
e. If two sets have some elements in common, they are known as overlapping sets; the overlapping
portion contains the points common to the two sets.
g. If some elements of a set have a common special property, they can be said to belong to a
subset; every element of a subset is also an element of the original set.
h. If each and every element of one set is also an element of another set, and vice versa, the
two sets are said to be equal or identical sets.
i. Two sets are said to be equivalent if there is a one-to-one correspondence between the
elements of the two sets. The order in which the elements appear in each set is immaterial.
Equivalence may be expressed symbolically as A = B or A⇔B
j. The class of all subsets of a set A is called the power set of A. It is denoted by P(A).
Illustration 31
i. A = {x /x is a married bachelor}
ii. A = {x/x is a tomato growing a mango}
iii. A = {a}
iv. A = {0}
v. E = {a, b, c, d, ..., z}
vi. E = {tail, head}
vii. E = {rise in supply, constant in supply, fall in supply}
viii. A = {odd numbers} and B = {even numbers}
ix. A = {0, 1, 2, 3} and B = {5, 6}
x. A = {0, 1, 2, 3, 4, 6} and B = {2, 4, 6, 8}
xi. If x belongs to a set A and if x does not belong to a set A
xii. A = {3, 5, 7} and B = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
xiii. A = {commerce, economics} and B = {chemistry, mathematics, commerce, economics}
xiv. A = {2, 4, 6, 8} and B = {4, 8, 2, 6}
xv. A = {x / x is a letter of the word total} and B = {o, t, a, l, t}
Solution
iv. A = {0} is a unit set containing an element zero. A is not a null set
v. E = {a, b, c, d, ..., z} is a universal set (it covers all the letters of the English alphabet)
vi. E = {tail, head} is a universal set (it covers all the possibilities of tossing a fair coin)
vii. E = {rise in supply, constant in supply, fall in supply} is a universal set (it covers all the
possibilities)
viii. If A = {odd numbers} and B = {even numbers}, then A and B are disjoint sets
ix. If A = {0, 1, 2, 3} and B = {5, 6}, then A and B are disjoint sets
x. If A = {0, 1, 2, 3, 4, 6} and B = {2, 4, 6, 8}, the sets are overlapping (the common points
are 2, 4, 6)
xi. If x belongs to a set A, and if x does not belong to a set A, then we write x ∈ A and x ∉ A
respectively
Operations on Sets
A set may be combined and operated in various ways to get new sets. The basic operations on
sets are classified into four. They are (a) Complementation, (b) Difference, (c) Intersection and
(d) Union. These are discussed below with suitable examples.
a. Complementation
If A is a subset of the universal set E, then the set of elements of E that do not belong to A, is
known as the complement of A.
The unshaded portion of E in the above Venn diagram represents the complement of A. Therefore
the complement of a set A in a universal set E is the set of all elements in E which are not
elements of A. The complement of A may be written as A'.
Illustration 32
Find the complement of A, given the universal set as E = {1, 2, 3, 4, 5, 6}, if:
i. A = {2, 4, 6}
ii. A = {1, 2, 5, 7}
Solution
b. Difference
The difference between two sets A and B, written A - B, is the set of all elements that belong to A but not to B.
Illustration 33
Solution
To get A - B, take set A = {a, b, x, y} and delete from it the elements that are present in B, which
are x and y.
Similarly, to get B - A, take set B = {c, d, x, y} and delete from it the elements that are present in
A, which are x and y.
c. Intersection
The intersection of two sets A and B is the set of all elements which belong to both A and B.
Symbolically,
A ∩ B = {x / x ∈ A and x ∈ B}
Illustration 34
Solution
The elements that are present in both A and B are c and d. Hence, A ∩ B = {c, d}
d. Union
The Union of two sets A and B is the set of all elements that belong to A or B or both A and B.
Illustration 35
i) A ∩ B
ii) A ∪ B
Solution
ii) A ∪ B = {0, 1, 2, 3, 4} (elements that are present in A and B put together)
Illustration 36
Solution
A ∩ B = {c}
A - B = {f}
A ∪ B = {c, d, f, g}
A ∩ B = Ø
A - B = {c, d}
A ∪ B = {c, d, e, f, g}
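The four operations map directly onto Python's set operators, which gives a convenient way to check answers. The sketch below reuses the universal set of Illustration 32 (i) and the sets of Illustration 33; the variable names are illustrative.

```python
# Complement: universal set E and A from Illustration 32 (i).
E = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}
complement_A = E - A
print(sorted(complement_A))  # [1, 3, 5]

# Difference, intersection and union: the sets of Illustration 33.
P = {"a", "b", "x", "y"}
Q = {"c", "d", "x", "y"}

print(sorted(P - Q))  # ['a', 'b']           difference P - Q
print(sorted(Q - P))  # ['c', 'd']           difference Q - P
print(sorted(P & Q))  # ['x', 'y']           intersection
print(sorted(P | Q))  # ['a', 'b', 'c', 'd', 'x', 'y']  union
```

The results are sorted only to make the printed order predictable; a set itself has no order.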
Exercise
1. Define matrix. Explain the operations on matrices.
4. Explain how the matrix algebra is useful in solving a set of linear simultaneous equations.
5. Find the rate of commission on items A, B, C from the data given below:
7. What is a Venn diagram? Explain complement, union, difference and intersection with Venn
diagrams.
viii. Symmetric matrix
ix. Set
- End of Chapter -
LESSON - 5
PROBABILITY
Probability theory has its origin in the games of chance pertaining to gambling. Jerome Cardan
(Gerolamo Cardano), an Italian mathematician, was the first to write a book on the subject,
'Games of Chance', which was published posthumously in 1663. Credit goes to the French
mathematicians Blaise Pascal and Pierre de Fermat, who developed a systematic and scientific
procedure for probability theory while solving the problem of dividing the stake in an incomplete
gambling match, posed by the notable French gambler and nobleman Chevalier de Mere. The Swiss
mathematician James Bernoulli made an extensive study of probability, which was a major
contribution to the theory of probability and combinatorics. Further contributions to the theory
of probability were made by Abraham de Moivre (1667-1754) in 'The Doctrine of Chances' and by the
Reverend Thomas Bayes (1702-61) on inverse probability. In the 19th century, Pierre Simon de
Laplace carried out extensive research on all the early ideas and published his monumental work
'Theory of Analytical Probability' in 1812, which became a landmark in the theory of probability.
Meaning of Probability
The word probability is very commonly used in day-to-day conversation, and people have a rough
idea about its meaning. We often come across statements like "probably it may rain today". This
means that we are not sure it will rain, but there is a possibility of rain. What do we mean when
we say the probability of winning the match is 0.75? Do we mean that we will win three-fourths of
the match? No. We actually mean that the conditions indicate that the likelihood of winning the
match is 75 percent. Hence, probability is a sensible numerical expression of uncertainty. In
brief, probability is a numerical value attached to uncertainty, allowing a calculated risk. If
an event is certain not to occur, its probability is zero, and if it is certain to occur, its
probability is one.
The subject probability has been developed to a great extent and today, no discipline in social,
physical or natural sciences is left without the use of probability. It is widely and popularly used
in the quantitative analysis of business and economic problems; and is an essential tool in
statistical inferences which form the basis for the decision theory. In other words, the role
played by probability in modern science is of a substitute for certainty. Thus, the probability
theory is a part of our everyday life.
Terminology
The various terms which are used in defining probability under different approaches are
discussed below.
experiment.
b. Trial and Event:- Performing a random experiment is called a trial, and the outcome is
referred to as an event. For example, when a coin is tossed repeatedly the result is not
unique: we may get either head or tail. Thus, tossing a coin is a trial and getting head or
tail is an event.
c. Independent and Dependent Events:- If the occurrence of an event does not affect
the occurrence of another, the events are said to be independent. For example, in
throwing a die repeatedly, the event of getting number 3 in the 1st throw is independent of
getting number 3 in the 2nd, 3rd or subsequent throws. Events are said to be dependent if
the occurrence of one affects the occurrence of the other. For example, when we
draw a single card from a deck of cards, the probability of it being a King is 4/52. If this
card is a King and we do not replace it, the probability of getting a King in the next draw
is affected (it becomes 3/51).
d. Mutually Exclusive Events :- Two or more events are said to be mutually exclusive if
the occurrence of any one of them excludes the occurrence of all others. For example, if a
die is thrown in a trial, by getting an outcome of say number 6, the occurrence of
remaining numbers is excluded in that trial. Hence, all the outcomes are mutually
exclusive.
e. Equally Likely Events:- The outcomes are said to be equally likely or equally probable
if none of them is expected to occur in preference to other. For example, if an unbiased
/fair coin is tossed, the outcomes heads and tails are equally likely to happen in each
trial.
f. Exhaustive Events :- The total number of possible outcomes of a random experiment
is called the exhaustive events (or exhaustive cases). For example, if a coin is tossed, we
can get head (H) or tail (T). Hence the exhaustive cases are 2. If two coins are tossed
together, the possible outcomes are HH, HT, TH, TT, where HT means heads on the first coin
and tails on the second coin, and so on. Thus in a toss of two coins the exhaustive cases are
4, i.e. $2^2$. In a throw of n coins, the exhaustive events are $2^n$.
g. Simple and Compound Events :- An event is said to be simple when it concerns the
happening or not happening of a single event. For example, the probability of drawing a red
ball from a bag containing 8 white and 7 red balls. The joint occurrence of two or more
events is termed a compound event. For example, if a bag contains 9 white and 6 red balls
and two successive draws of 3 balls are made, we may want to find the probability of
getting 3 white balls in the first draw and 3 red balls in the second draw.
h. Complementary Events :- Let there be two events A and B. A is called
complementary event of B (and vice-versa) if A and B are mutually exclusive and
exhaustive events. For example, when a die is thrown, occurrence of an even number (2,
4, 6) and odd number (1, 3, 5) are complementary events.
i. Probability Tree:- A diagram showing the possible outcomes of a series of experiments
and their respective probabilities is known as a Probability Tree.
j. Sample Space :- The set of all possible outcomes of a random experiment is known as
the sample space and is denoted by S. In other words, the sample space is the set of all
exhaustive cases of the random experiment. The elements of the sample space are the
outcomes.
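Two of the ideas above, the sample space of n coins and the dependent card draws, can be sketched in a few lines. The function name `coin_sample_space` and the variable names are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product


def coin_sample_space(n):
    """List every outcome of tossing n coins, e.g. 'HT' for heads then tails."""
    return ["".join(outcome) for outcome in product("HT", repeat=n)]


two_coins = coin_sample_space(2)
print(two_coins)                  # ['HH', 'HT', 'TH', 'TT']
print(len(coin_sample_space(3)))  # 8, i.e. 2 ** 3 exhaustive cases

# Dependent events: drawing two cards without replacement.
p_king_first = Fraction(4, 52)
p_king_second_given_first = Fraction(3, 51)  # one King already removed
p_two_kings = p_king_first * p_king_second_given_first
print(p_two_kings)  # 1/221
```

The second probability is 3/51 only because the first card drawn was a King and was not replaced; with replacement it would remain 4/52.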
Algebra of Sets: The union of two sets A and B, denoted by A ∪ B, is defined as the set of
elements which belong to either A or B or both. Symbolically,
A ∪ B = {x / x ∈ A or x ∈ B}
For example, if
The intersection of two sets A and B, denoted by A ∩ B, is defined as the set of those elements
that belong to both A and B. Symbolically,
A ∩ B = {x / x ∈ A and x ∈ B}
A ∩ B = {3, 4}
Two sets A and B are said to be disjoint or mutually exclusive if they do not have any common
elements. That is, A ∩ B = Ø.
In the figure given below, we represent the union, difference, and intersection of two events by
means of Venn diagrams. The region enclosed by a rectangle is taken to represent the sample
space whereas given events are represented by ovals within the rectangle.
The word permutation means arrangement and the word combination means group or selection.
For example, let us take three letters A, B and C. The permutations of these three letters taken
two at a time will be AB, AC, BC, BA,CA and CB i.e. 6 in all, whereas the combinations of three
letters taken two at a time will be AB, BC and CA i.e. 3 in all. The order of elements is immaterial
in combinations, while in permutations the order of elements matters.
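The distinction can be seen by listing both for the letters A, B and C. This sketch uses Python's standard `itertools` module; the variable names are illustrative.

```python
from itertools import combinations, permutations

letters = "ABC"

# Ordered arrangements of two letters: AB and BA count separately.
perms = ["".join(p) for p in permutations(letters, 2)]
print(perms)  # ['AB', 'AC', 'BA', 'BC', 'CA', 'CB']

# Unordered selections of two letters: AC and CA are the same group.
combs = ["".join(c) for c in combinations(letters, 2)]
print(combs)  # ['AB', 'AC', 'BC']

print(len(perms), len(combs))  # 6 3
```

The combination printed as AC is the same selection that the text writes as CA.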
Permutation
A permutation of n different objects taken r at a time is an ordered arrangement of only r objects
out of the n objects. In other words, it is the number of ways of arranging n things taken r at a time.
It is denoted symbolically as nPr, where n is the total number of elements in the set, r is the number
of elements taken at a time, and P is the symbol for permutation. Thus,
nPr = n! / (n-r)!
For example,
Solution
We are given n=4, r=4. Number of possible ways of arrangement (or permutations) = 4! / (4-4)!
= 4!/0! = (4 x 3 x 2 x 1) / 1 (because 0! is 1)
= 24 permutations.
Solution
n=5, r=2
nPr = 5P2 = 5! / (5-2)! = 5! / 3! = (5 x 4 x 3 x 2 x 1) / (3 x 2 x 1) = 20 permutations
Solution
n=8, r=8
nPr = 8P8 = 8! / (8-8)! = 8! / 0! = (8 x 7 x 6 x 5 x 4 x 3 x 2 x 1) / 1 = 40320
Combination
nCr = n! / [(n-r)! x r!]
Solution
n=8, r=3
nCr = 8C3 = 8! / [(8-3)! x 3!] = 8! / (5! x 3!) = (8 x 7 x 6 x 5 x 4 x 3 x 2 x 1) / [(5 x 4 x 3 x 2 x 1) x (3 x 2 x 1)]
= (8 x 7 x 6) / (3 x 2 x 1) = 56 combinations
Solution
n=20, r=4
nCr = 20C4 = 20! / [(20-4)! x 4!] = 20! / (16! x 4!) = (20 x 19 x 18 x 17) / (4 x 3 x 2 x 1) = 4845 combinations
6. In how many ways can a committee of 4 men and 3 women be selected out of 9 men and 6
women.
Solution
So the number of ways of selecting 4 men from 9 and 3 women from 6 = 126 x 20 = 2520 ways.
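The worked values above can be verified with a short Python sketch. It assumes Python 3.8 or later, where math.perm and math.comb are available; the helper n_p_r is a hypothetical name that simply spells out the formula nPr = n! / (n-r)!.

```python
import math

def n_p_r(n, r):
    """nPr computed directly from the factorial formula n! / (n-r)!."""
    return math.factorial(n) // math.factorial(n - r)

# Permutations from the worked examples.
p_4_4 = math.perm(4, 4)   # 24
p_5_2 = math.perm(5, 2)   # 20
p_8_8 = math.perm(8, 8)   # 40320

# Combinations from the worked examples.
c_8_3 = math.comb(8, 3)    # 56
c_20_4 = math.comb(20, 4)  # 4845

# Committee of 4 men out of 9 and 3 women out of 6.
committee_ways = math.comb(9, 4) * math.comb(6, 3)  # 126 x 20 = 2520

print(p_4_4, p_5_2, p_8_8, c_8_3, c_20_4, committee_ways)
```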
Probability Theorems
The computation of probabilities can become easy and be facilitated to a great extent by the two
fundamental theorems of probability - the Addition Theorem and the Multiplication Theorem.
(a) Addition Theorem - Mutually Exclusive Events:
The probability of occurrence of either event A or event B (where A and B are mutually exclusive
events) is the sum of the individual probabilities of A and B. Symbolically,
P(A or B) = P(A) + P(B)
Proof
If an event A can happen in a1 ways and B in a2 ways, then the number of ways in which either of
the events can happen is a1 + a2. If the total number of possibilities is n, then by definition the
probability of either A or B event happening is:
P(A or B) = (a1 + a2) / n = a1/n + a2/n
But,
a1/n = P(A)
and,
a2/n = P(B)
Hence, P(A or B) = P(A) + P(B).
The theorem can be extended to three or more mutually exclusive events. Thus,
P(A or B or C) = P(A) + P(B) + P(C)
If the events are not mutually exclusive, the above procedure discussed no longer holds. For
example, if the probability of buying a pen is 0.6 and that of pencil is 0.3, we cannot calculate
the probability of buying either pen or pencil by adding the two probabilities because the events
are not mutually exclusive. When the events are not mutually exclusive, the theorem
has to be modified. The probability of occurrence of at least one of the two events A and B, which
are not mutually exclusive, is given by:
P(A or B) = P(A) + P(B) - P(A and B)
By subtracting P(A and B), i.e. the proportion of events counted twice in P(A) + P(B), the
addition theorem is reconstructed so that no outcome is counted more than once.
In case of three events :
P(A or B or C) = P(A) + P(B) + P(C) – P(A and B) – P(A and C) – P(B and C) + P(A
and B and C)
Illustration 1
A bag contains 25 balls marked 1 to 25. One ball is drawn at random. What is the probability
that it is marked with a number that is multiple of 5 or 7?
Solution
i. The possible sample points that are multiples of 5 are 5 ...(5, 10, 15, 20, 25)
ii. The possible sample points that are multiples of 7 are 3 ...(7, 14, 21)
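So the probability that the ball picked at random is marked with a multiple of either 5 or 7
= 5/25 + 3/25 = 8/25 (no number from 1 to 25 is a multiple of both 5 and 7, so the two events are mutually exclusive)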
Illustration 2
A bag contains 20 balls numbered 1 to 20. One ball is drawn at random. What is the probability
that it is marked with a number that is a multiple of 3 or 4?
Solution
i. The possible outcomes where the ball with a multiple of 3 = 6 ...(3, 6, 9, 12, 15, 18)
ii. The possible outcomes where the ball has a multiple of 4 = 5 ...(4, 8, 12, 16, 20)
So the probability that the ball picked at random is either marked with a multiple of 3 or 4 is not
just adding the two probabilities as calculated above. Since there are some numbers which are
multiples of both 3 and 4, we need to exclude them so that they don't get counted twice.
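iii. The possible outcomes where the ball has a multiple of both 3 and 4 = 1 ...(12)
Hence, for these two events, which are not mutually exclusive, the probability of the common outcome = 1/20,
and the probability that the ball is marked with a multiple of 3 or 4 = 6/20 + 5/20 - 1/20 = 10/20 = 1/2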
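Illustrations 1 and 2 can be checked by listing the sample points. The sketch below is a minimal Python example; the function name prob_multiple_of_either is hypothetical and the approach (direct enumeration) is only one way to verify the addition theorem.

```python
from fractions import Fraction

def prob_multiple_of_either(n_balls, a, b):
    """P(multiple of a or b) for one ball drawn from balls numbered 1..n_balls."""
    balls = range(1, n_balls + 1)
    in_a = {x for x in balls if x % a == 0}
    in_b = {x for x in balls if x % b == 0}
    p_a = Fraction(len(in_a), n_balls)
    p_b = Fraction(len(in_b), n_balls)
    p_both = Fraction(len(in_a & in_b), n_balls)
    # Addition theorem: P(A or B) = P(A) + P(B) - P(A and B)
    return p_a + p_b - p_both

illustration_1 = prob_multiple_of_either(25, 5, 7)  # no common multiples
illustration_2 = prob_multiple_of_either(20, 3, 4)  # 12 is counted only once

print(illustration_1, illustration_2)
```

When the two events have no common sample points, the subtracted term is zero and the formula reduces to the simple sum used for mutually exclusive events.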
Illustration 3
From a pack of 52 cards, what is the probability of drawing one card that is either a King or a
Queen?
Solution
Probability of drawing a King = P(K) = 4/52 (as there are 4 Kings in a pack of 52 cards) = 1/13
Probability of drawing a Queen = P(Q) = 4/52 (as there are 4 Queens in a pack of 52 cards) =
1/13
Probability of getting either a King or Queen in the draw = P(K or Q) = P(K) + P(Q) = 1/13 +
1/13 = 2/13
Illustration 4
A bag contains 30 balls numbered from 1 to 30. One ball is drawn at random. Find the
probability that the number on the ball is a multiple of
(a) 5 or 9, and
(b) 5 or 7
Solution
a) Probability of getting a ball that has a multiple of 5, i.e. 5, 10, 15, 20, 25 or 30 = 6/30
Probability of getting a ball that has a multiple of 9, i.e. 9, 18, 27 = 3/30
Then, the probability of getting a ball that has either multiple of 5 or 9 = 6/30 + 3/30 = 9/30
= 3/10
b) Probability of getting a ball that has a multiple of 5, i.e. 5, 10, 15, 20, 25 or 30 = 6/30
Probability of getting a ball that has a multiple of 7, i.e. 7, 14, 21, 28 = 4/30
Then, the probability of getting a ball that has either multiple of 5 or 7 = 6/30 + 4/30 = 10/30
= 1/3
Illustration 5
The probability that a student passes a Chemistry test is 2/3 and the probability that he passes
both Chemistry and English tests is 14/45. The probability that he passes at least one test is 4/5.
What is the probability that he passes the English test?
Solution
Probability that the student passes the Chemistry test = P(C) = 2/3
Probability that the student passes both Chemistry and English tests = P(C and E) = 14/45
Probability that the student passes either Chemistry or English test = P(C or E) = 4/5
Since P(C or E) = P(C) + P(E) - P(C and E),
P(E) = P(C or E) + P(C and E) - P(C) = 4/5 + 14/45 - 2/3 = (36 + 14 - 30)/45 = 20/45 = 4/9
Illustration 6
Let A and B be the two possible outcomes of an experiment and suppose that P(A) = 0.4, P(A or
B) = 0.8,
Solution
i. We know that
= P - 0.4
If A and B are to be mutually exclusive, then there should be no outcomes common to both.
That is, P(A and B) = 0
⇒ P - 0.4 = 0
⇒ P = 0.4
ii. If A and B are to be independent, then P(A and B) should be equal to P(A) x P(B)
⇒ P – 0.4 = 0.4 x P
⇒ P - 0.4P = 0.4
⇒ 0.6P = 0.4
⇒ P = 0.4 / 0.6 = 0.667
Illustration 7
A person is known to hit the target in 3 out of 4 shots, whereas another person is known to hit
the target in 2 out of 3 shots. Find the probability of the target being hit at all when they both
try.
Solution
= 3/4 + 2/3 - (3/4) x (2/3) = (9+8)/12 - 1/2 = 17/12 - 6/12 = 11/12
Illustration 8
Solution
a. Let P(X), P(Y) and P(Z) be the respective probabilities of components of X, Y and Z being
defective. We are given that P(X) = 0.01, P(Y) = 0.02, P(Z) = 0.05.
= 0.01 + 0.02 + 0.05 - 0.01 x 0.02 - 0.01 x 0.05 - 0.02 x 0.05 + 0.01 x 0.02 x 0.05 = 0.07831
The probability that the assembled product will not be defective = 1 - Probability that it
will be defective
b. Let P(A) be the probability that an item will have A-type of defect, and P(B) be the
probability that an item will have B-type of defect. We are given that P(A) = 20% = 0.2,
P(B) = 10% = 0.1, P(A and B) = 6% = 0.06.
Illustration 9
A salesman has 65 per cent chance of making a sale to a customer. The behaviour of each
successive customer is independent. If three customers A, B and C enter together, what is the
probability that the salesman will make a sale to at least one of the customers.
Solution
Let the probability of making sale to A = P(A), probability of making sale to B = P(B), and
probability of making sale to C = P(C)
= 0.65 + 0.65 + 0.65 - 0.65 x 0.65 - 0.65 x 0.65 – 0.65 x 0.65 + 0.65 x 0.65 x 0.65
= 1.95 - 1.2675 + 0.274625 = 0.957125 = 0.957
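The same answer can be reached more quickly through the complementary event "no sale to any of the three customers". The following Python sketch compares the two routes; it assumes, as the illustration states, that the three customers behave independently, and the variable names are illustrative.

```python
p = 0.65  # probability of a sale to any one customer

# Addition theorem for three events that are not mutually exclusive.
by_addition = 3 * p - 3 * p**2 + p**3

# Complement: 1 - P(no sale to A) x P(no sale to B) x P(no sale to C).
by_complement = 1 - (1 - p) ** 3

print(round(by_addition, 6), round(by_complement, 6))
```

Both expressions give 0.957125, since 1 - (1 - p)^3 expands to 3p - 3p^2 + p^3.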
(b) Multiplication Theorem - Independent Events:
The probability of occurrence of two independent events A and B together is equal to the product of their
individual probabilities. Symbolically,
P(A and B) = P(A) x P(B)
Proof
If an event A can happen in n1 ways of which a1 are successful, and an event B can happen in n2 ways
of which a2 are successful, we can combine each successful
event in the first with each successful event in the second. Thus, the total number of successful
events in both cases is a1 x a2. Similarly, the total number of possible cases is n1 x n2. By
definition, the probability of occurrence of both events is:
P(A and B) = (a1 x a2) / (n1 x n2) = (a1/n1) x (a2/n2) = P(A) x P(B)
Illustration 10
Four cards are drawn at random from a pack of 52 cards. Find the probability that
vi. There are two cards of Clubs and two cards of Diamonds
Solution
i. A pack of 52 cards consists of four cards each of King, Queen, Jack and Ace. So, a King or a
Queen or a Jack or an Ace can be drawn in 4C1 = 4 ways. Since drawing a King is independent
of the ways of getting a Queen, a Jack and an Ace, the sample points or the favorable number of
cases are 4C1 x 4C1 x 4C1 x 4C1.
Probability = No. of favourable cases / Total number of exhaustive cases
ii. Probability of getting two Kings and two Queens = (4C2 x 4C2) / 52C4 = 36 / 270725 = 0.00013
iii. A pack of 52 cards contains 13 Diamond cards. So we can draw 4 out of 13 Diamond cards in
13C4 ways.
iv. Probability of getting two reds and two blacks = ( 26C2 x 26C2) /52C4 = 0.390
v. In a pack of 52 cards, there are 13 cards of each suit. We can draw one card of each suit in 13C1
x 13C1 x 13C1 x 13C1 ways.
Probability of getting one card of each suit = (13C1 x 13C1 x 13C1 x 13C1) / 52C4 = 0.1055
vi. In a pack of 52 cards, there are 13 Club cards and 13 Diamond cards. So, we can draw two
cards of Clubs and two cards of Diamonds in 13C2 x 13C2 ways.
Probability of getting two cards of Diamonds and two cards of Clubs = (13C2 x 13C2) / 52C4 =
0.0225
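The card probabilities quoted in parts (ii), (iv), (v) and (vi) can be recomputed with a few lines of Python. This is a sketch only, using math.comb (Python 3.8 or later); the variable names are illustrative.

```python
from math import comb

total = comb(52, 4)  # exhaustive number of cases when 4 cards are drawn

two_kings_two_queens = comb(4, 2) * comb(4, 2) / total
two_red_two_black = comb(26, 2) * comb(26, 2) / total
one_of_each_suit = comb(13, 1) ** 4 / total
two_clubs_two_diamonds = comb(13, 2) * comb(13, 2) / total

print(total)
print(round(two_kings_two_queens, 5))
print(round(two_red_two_black, 3))
print(round(one_of_each_suit, 4))
print(round(two_clubs_two_diamonds, 4))
```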
Illustration 11
A bag contains 8 white and 6 red balls. Two balls are drawn in succession at random. What is
the probability that one of them is white and the other is red?
Solution
Number of favourable cases for both white and red balls = 8 x 6 = 48 ways
Illustration 12
An urn contains 8 white and 3 red balls. If two balls are drawn at random, find the probability
that
(ii) both are red
Solution
(iii) 1 white ball and 1 red ball can be drawn out of 8 white and 3 red balls in 8C1 x 3C1 ways = 24
ways
Illustration 13
A bag contains 10 white, 6 red, 4 black and 7 blue balls. 5 balls are drawn at random. What is the
probability that 2 are red, 2 are white, and 1 is blue?
Solution
Total balls = 27
2 red balls can be drawn out of 6 red balls in 6C2 ways = 15 ways
2 white balls can be drawn out of 10 white balls in 10C2 ways = 45 ways
1 blue ball can be drawn out of 7 blue balls in 7C1 ways = 7 ways
Probability of getting 2 red balls, 2 white balls and 1 blue ball = (15 x 45 x 7) / 80730 = 4725 / 80730 = 0.0585,
where 80730 = 27C5 is the number of ways of drawing 5 balls out of 27.
Illustration 14
A bag contains 8 red and 5 white balls. Two successive drawings of 3 balls are made. Find the
probability that the first drawing will give 3 white and the second 3 red balls, if
(i) the balls were replaced before the second trial,
(ii) the balls were not replaced before the second trial.
Solution
(i) Since the balls were replaced before the second draw, events A and B are independent events.
3 white balls out of 5 white balls can be drawn in 5C3 ways (10 ways).
3 red balls out of 8 red balls can be drawn in 8C3 ways (56 ways).
Hence, probability of getting 3 white and 3 red balls P(AB) = P(A) x P(B) = (10/286) x
(56/286) = 0.006846
(ii) Since the balls were not replaced before the second draw, events A and B are dependent
events.
3 white balls out of 5 white balls can be drawn in 5C3 ways (10 ways).
3 red balls out of 8 red balls can be drawn in 8C3 ways (56 ways).
After the first draw only 10 balls remain, so the second draw of 3 balls can be made in 10C3 ways (120 ways).
Hence, probability of getting 3 white and 3 red balls P(AB) = P(A) x P(B/A) = (10/286) x (56/120)
= 0.01632
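The effect of replacement on the second draw can be seen by computing both cases side by side. The following Python sketch assumes the counts given in Illustration 14 (5 white and 8 red balls, 3 balls per draw); the variable names are illustrative.

```python
from math import comb

white, red, drawn = 5, 8, 3
total = white + red

# First draw: 3 white balls out of 13 balls.
p_a = comb(white, drawn) / comb(total, drawn)            # 10/286

# Second draw with replacement: the bag again holds 13 balls.
p_b_independent = comb(red, drawn) / comb(total, drawn)  # 56/286

# Second draw without replacement: only 10 balls remain, all 8 red still present.
p_b_given_a = comb(red, drawn) / comb(total - drawn, drawn)  # 56/120

with_replacement = p_a * p_b_independent
without_replacement = p_a * p_b_given_a

print(round(with_replacement, 6), round(without_replacement, 5))
```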
Illustration 15
An urn contains 5 white, 3 black, and 6 red balls. 3 balls are drawn at random. Find the
probability that:
Solution
Total balls in the urn are 14. 3 balls can be drawn out of 14 balls in 14C3 ways, i.e., the exhaustive
number of cases = 364.
i. 2 white balls can be drawn out of 5 in 5C2 ways and another ball can be drawn from the
remaining 9 balls in 9C1 ways. So, the favourable cases are 5C2 x 9C1 = 90.
ii. We can draw 3 different coloured balls in 5C1 x 3C1 x 6C1 = 90 ways.
iii. We can draw 3 balls out of the 11 non-black balls in 11C3 = 165 ways.
Illustration 16
The probability that India wins a cricket test match against Australia is given to be 1/3. If India
and Australia play 4 test matches, what is the probability that
(ii) India will win at least one test match.
Solution
In notation,
(i) Probability that India will lose all 4 test matches = 2/3 x 2/3 x 2/3 x 2/3 = 16/81 = 0.1975
(ii) Probability that India will win at least one test match
= 1 - Probability that India will lose all 4 test matches = 1 - 16/81 = 65/81 = 0.8025
Illustration 17
A University has to select an examiner from a list of 60 persons, 25 of them women and 35 men;
15 of them knowing Hindi and 45 not, 18 of them being teachers and the remaining 42 not. What
is the probability of the University selecting a Hindi-knowing woman teacher?
Solution
Probability of the selected person being a teacher = P(T) = 18/60
Illustration 18
The MCom class consists of 60 students, 12 of them are girls and 48 boys, 10 of them are rich, 15
of them are fair complexioned. What is the probability of selecting a fair complexioned rich girl?
Solution
Illustration 19
The following table shows the subjects and educational qualifications of 40 teachers of S.K.
University. Of the total 40, one is nominated at random to be the University Executive. Find the
probability that
(ii) the teacher has a PhD degree and is from Commerce subject,
(iii) the teacher has a PhD degree and is from Economics subject,
Solution
(i) Probability of nominating a teacher with only PG degree
= 12 / 40 = 0.3
(ii) Probability of nominating a teacher with PhD degree and Commerce subject
= 9/40 = 0.225
(iii) Probability of nominating a teacher with PhD degree and Economics subject
= 12/40 = 0.3
- End of Chapter -
LESSON - 6
CONDITIONAL EVENTS
Conditional Theorem:
If two events A and B are dependent, the probability of the second event occurring will be
affected by the outcome of the first that has already occurred. The term conditional probability
is used to describe this situation. It is symbolically denoted by P(B/A), which is read as the
probability of occurring B, given that A has already occurred. Robert L. Birte defined the concept
of conditional probability as: "A conditional probability indicates that the probability that an
event will occur is subject to the condition that another event has already occurred."
Symbolically,
P(B/A) = P(A and B) / P(A)
and
P(A/B) = P(A and B) / P(B)
Bayes' Rule
Computation of unknown probabilities on the basis of information supplied by past records
or experiments is one of the most important applications of conditional probability. The
probability of occurrence of an event B when event A is known to have occurred (or vice versa) is said to
be a conditional probability, which is denoted by P(B/A), in other words, the probability of B given A.
The conditional probability that an observed outcome was due to a particular event or reason is called its reverse or
posteriori probability. The posteriori probability is computed by Bayes' Rule, named after the
British mathematician Thomas Bayes. The revision of given, i.e. old, probabilities
in the light of the additional information supplied by the experiment or past records is
of extreme help to business and management executives in making valid decisions in the face of
uncertainties. Bayes' theorem is defined as:
P(Ai/B) = P(Ai) x P(B/Ai) / [P(A1) x P(B/A1) + P(A2) x P(B/A2) + ... + P(An) x P(B/An)]
By observing the following diagram, the derivation of the above formula can be followed.
(Observe the diagram. The part of B which is within A1 represents the area (A1 and B), and the part
of B which is within A2 represents the area (A2 and B).)
F.R. Jolliffe, in his book 'Commonsense Statistics for Economists and Others', has
described Bayes' theorem as a way of twisting conditional probabilities the other way
round, i.e. given probabilities of the form P(B/Ai), to find conditional probabilities of the form
P(Ai/B). Here the P(Ai) can be described as prior probabilities, i.e. probabilities known before
anything happens, and the probabilities P(Ai/B) are described as posterior probabilities, i.e.
probabilities found after something has happened and supported by experimental evidence.
Probability before revision by Bayes' rule is called a priori or simply prior probability, since
it is determined before the sample information is taken into account. A probability which has
undergone revision via Bayes' rule is called posterior probability because, it represents a
probability computed after the sample information is taken into account.
Posterior probability is also called revised probability in the sense that it is obtained by
revising the prior probability with the sample information. Posterior probability is always
conditional probability, the conditional event being the sample information. Thus, a prior
probability which is unconditional probability becomes a posterior probability, which is
conditional probability, by using Bayes' rule.
Illustration 20
Two sets of candidates are competing for positions on the Board of Directors of a company. The
probability that the first and second sets will win are 0.65 and 0.35 respectively. If the first set
wins, the probability of introducing a new product is 0.8, and the corresponding probability if
the second set wins is 0.3. What is the probability that the product will be introduced.
Solution
Illustration 21
A factory has two machines. Past records show that the first machine produces 40 per cent of
output and the second machine produces 60 per cent of output. Further, 4 per cent and 2 per
cent of products produced by the first machine and the second machine respectively, were
defectives. If a defective item is drawn at random, what is the probability that the defective item
was produced by the first machine or the second machine.
Solution
A2 = the event of drawing an item produced by the second machine
P(B/A1) = 4% = 0.04
P(B/A2) = 2% = 0.02
P(A1/B) = P(A1 and B) / P(B) = P(A1) x P(B/A1) / P(B) = 0.4 x 0.04 / 0.028 = 0.571
P(A2/B) = P(A2 and B) / P(B) = P(A2) x P(B/A2) / P(B) = 0.6 x 0.02 / 0.028 = 0.429
The values derived above can also be computed in a tabular form as below.
Conclusion
With additional information i.e. conditional probability, the probability of defective items
produced by the first machine is 0.5714 or 57.14 per cent and that by the second machine is
0.4286 or 42.86 per cent. And we may say that the defective item is more likely drawn from the
output produced by the first machine.
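The revision of prior probabilities in Illustration 21 can be sketched in Python as follows. The function name bayes_posteriors is hypothetical; it takes the prior probabilities P(Ai) and the conditional probabilities P(B/Ai) and returns P(B) together with the posterior probabilities P(Ai/B).

```python
def bayes_posteriors(priors, likelihoods):
    """Return P(B) and the list of posteriors P(Ai/B)."""
    # Joint probabilities P(Ai and B) = P(Ai) x P(B/Ai)
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # Total probability of B is the sum of the joint probabilities.
    p_b = sum(joint)
    return p_b, [j / p_b for j in joint]

# Machine shares of output (priors) and their defective rates (likelihoods).
p_defective, posteriors = bayes_posteriors([0.4, 0.6], [0.04, 0.02])

print(round(p_defective, 3))
print([round(x, 4) for x in posteriors])
```

The posteriors always add up to 1, because the defective item must have come from one of the two machines.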
Illustration 22
The probability that management trainee will remain with a company is 0.60. The probability
that an employee earns more than Rs. 10,000 per year is 0.50. The probability that an employee
is a management trainee who remained with the company or who earns more than Rs. 10000
per year is 0.70. What is the probability that an employee earns more than Rs. 10,000 per year
given that he is a management trainee who stayed with the company.
Solution
Probability of an employee earns more than Rs. 10000 per year given that he is a management
trainee is 0.67.
Illustration 23
Suppose that one of the three men, a politician, a businessman and an academician will be
appointed as the Vice Chancellor of a University. Their probability of appointments respectively
are 0.40, 0.25 and 0.35. The probabilities that research activities will be promoted by these
people if they are appointed are 0.5, 0.4 and 0.8 respectively. What is the probability that
research will be promoted by the new Vice-Chancellor?
Solution
A3 = Event that the academician will be appointed as the Vice-Chancellor
The probability that research will be promoted by the new Vice-Chancellor, P(B)
Now,
By random variable we mean a variable whose value is determined by the outcome of a random
experiment. In brief, a random variable is a function which assigns a unique value to each
sample point of the sample space. A random variable is also called a chance variable or
stochastic variable. A random variable may be continuous or discrete. If the random variable
takes on all values within a certain interval, then it is called a continuous random variable,
while if the random variable takes on only isolated values such as the integers 0, 1, 2, 3, ..... then it is known as a
discrete random variable.
The function p(x) is known as the probability function of the random variable X, and the set of all
possible ordered pairs (x, p(x)) is called the probability distribution of the random variable. The concept of
probability distribution is analogous to that of frequency distribution. While the frequency
distribution tells how the total frequency is distributed among different classes of the variable,
the probability distribution tells how the total probability of 1 is distributed among
the various values which the random variable can take. In brief, the word frequency is replaced by
probability.
Illustration 24
A dealer of Allwyn refrigerators estimates from his past experience the probabilities of his
selling refrigerators in a day. These are as follows:
Probability : 0.03 0.20 0.23 0.25 0.12 0.10 0.07
Solution
Illustration 25
A die is tossed twice. Getting an odd number is termed as a success. Find the probability
distribution of number of success.
Solution
Let us denote success by S and failure by F. In two throws of a die, the number of successes X is a
random variable and takes the values 0, 1, 2. This means, in the two throws, we can get either no
odd numbers, or 1 odd number, or both odd numbers.
i. P(X=0) = P(F in both throws) = P(FF) = P(F) x P(F) = 1/2 x 1/2 = 1/4
ii. P(X=1) = P(S in first throw and F in second throw) or P(F in first throw and S in second
throw) = P(SF) or P(FS)
= P(S) x P(F) + P(F) x P(S) = 1/2 x 1/2 + 1/2 x 1/2 = 1/4 + 1/4 = 2/4 = 1/2
iii. P(X=2) = P(S in both throws) = P(SS) = P(S) x P(S) = 1/2 x 1/2 = 1/4
Illustration 26
Four bad apples are mixed accidentally with 26 good apples. Obtain the probability distribution
of the number of bad apples in a draw of 3 apples at random.
Solution
Denote X as the number of bad apples drawn. Now X is a random variable which takes values of
0, 1, 2, 3. There are 30 apples in all (26 good + 4 bad), and the exhaustive number of cases
drawing three apples is 30C3. We get,
P(X=0) = 0 bad apples and 3 good apples = (4C0 x 26C3) / 30C3 = (1 x 26 x 25 x 24) / (30 x 29 x 28) =
2600 / 4060
P(X=1) = 1 bad apple and 2 good apples = (4C1 x 26C2) / 30C3 = (4 x 26 x 25 x 6) / (2 x 30 x 29 x 28) =
1300 / 4060
P(X=2) = 2 bad apples and 1 good apple = (4C2 x 26C1) / 30C3 = (4 x 3 x 26 x 6) / (2 x 30 x 29 x 28) =
156 / 4060
P(X=3) = 3 bad apples and 0 good apples = (4C3 x 26C0) / 30C3 = (4 x 1 x 6) / (30 x 29 x 28) = 4 / 4060
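The whole distribution can be generated in one step and checked to add up to 1. The following is a minimal Python sketch using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

bad, good, drawn = 4, 26, 3
total_ways = comb(bad + good, drawn)  # 30C3 = 4060

# P(X = x): choose x bad apples out of 4 and the rest out of 26 good apples.
distribution = {
    x: Fraction(comb(bad, x) * comb(good, drawn - x), total_ways)
    for x in range(drawn + 1)
}

for x, p in distribution.items():
    print(x, p, round(float(p), 4))
print("total probability:", sum(distribution.values()))
```

Fraction reduces each value to lowest terms, so 2600/4060 is printed as 130/203.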
Illustration 27
An insurance company offers a 40 year old man a Rs. 1200 one year term insurance policy for an
annual premium of Rs. 15. Assume that the number of deaths per one thousand is four persons
in this group. What is the expected gain for the insurance company on a policy of this type?
Solution
Accordingly,
P(X) = 4/1000 = 0.004
= Premium charged x Probability of no death - Net amount payable on death x Probability
of death
Illustration 28
Gopi, a newspaper vendor purchases the Hindu newspaper at a special concessional rate of Rs.
1.80 per copy against the selling price of Rs. 2.30. Any unsold copies are, however, a dead loss.
Gopi has estimated the following probability distribution for the number of copies demanded.
How many copies should he order so that the expected profit is the maximum.
Solution
Expected profit = Profit per copy x Probability of paper distribution x No. of papers sold.
20 0.05 0.50 20 x 0.05 x 0.5 = 0.50
21 0.18 0.50 21 x 0.18 x 0.5 = 1.89
25 0.08 0.50 25 x 0.08 x 0.5 = 1.00
By purchasing and selling 22 copies of the newspaper, Gopi will get maximum expected profit of
Rs. 3.30
Illustration 29
(a) The monthly demand for Allwyn watches is known to have the following probability
distribution:
Demand : 1 2 3 4 5 6 7 8
Probability : 0.08 0.12 0.19 0.24 0.16 0.10 0.07 0.04
Determine the expected demand for watches. Also compute the variance.
Solution
E(X) = 1 x 0.08 + 2 x 0.12 + 3 x 0.19 + 4 x 0.24 + 5 x 0.16 + 6 x 0.10 + 7 x 0.07 + 8 x 0.04
= 0.08 + 0.24 + 0.57 + 0.96 + 0.80 + 0.60 + 0.49 + 0.32 = 4.06
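The variance asked for in the question follows from Var(X) = E(X^2) - [E(X)]^2. The Python sketch below computes both quantities; it assumes the demand values 1 to 8 that appear in the E(X) computation above, and the variable names are illustrative.

```python
demand = [1, 2, 3, 4, 5, 6, 7, 8]
prob = [0.08, 0.12, 0.19, 0.24, 0.16, 0.10, 0.07, 0.04]

# E(X): each value weighted by its probability.
expected_demand = sum(x * p for x, p in zip(demand, prob))

# E(X^2), needed for the variance.
expected_square = sum(x * x * p for x, p in zip(demand, prob))

# Var(X) = E(X^2) - [E(X)]^2
variance = expected_square - expected_demand ** 2

print(round(expected_demand, 2), round(expected_square, 2), round(variance, 4))
```

This gives E(X) = 4.06, E(X^2) = 19.70 and a variance of 19.70 - 16.4836 = 3.2164.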
Illustration 30
Ravali, proprietor of a food stall has invented a new item of food delicacy which she calls R-
foods. She has calculated the cost of manufacturing as Rs. 2 per piece, and because of its
quality it would be sold for Rs. 3 per piece. It is, however, perishable and any goods unsold at
the end of the day are dead loss. She expects the demand to be variable and has drawn up the
following probability distribution:
No. of pieces demanded 10 11 12 13 14 15
ii. Find an expression for her net profit or loss if she manufactures 'm' pieces and demand is 'n'
pieces.
Consider separately the two cases -- 'n' lesser than or equal to 'm', and 'n' greater than 'm'.
iii. Find the net profit or loss, assuming that she manufactures 12 pieces.
Solution
Now we can compute the probability distribution of the demand for any day as this:
No. of pieces demanded : 10 11 12 13 14 15
ii. If she manufactures 'm' pieces on any day, the cost is Rs.2m. If the number of pieces
demanded on any day 'n' is less than or equal to the pieces produced 'm', then all the pieces
demanded are sold, and the sale proceeds is Rs.3n. But, if the number of pieces demanded on
any day 'n' is greater than the pieces produced 'm', then the maximum sales is limited to 'm' and
thus the sale proceeds is Rs.3m. Hence the net profit is Rs.(3n - 2m) when n is less than or equal
to m, and Rs.(3m - 2m) = Rs.m when n is greater than m.
iii. No. of pieces produced, m = 12. Let us apply the finding in (ii).
13 Rs.12
14 Rs.12
15 Rs.12
We notice that for the first three cases (where n = 10, 11, 12), n is less than or equal to m (12).
For the remaining three cases (where n = 13, 14, 15), n is greater than m.
= 6 x 0.07 + 9 x 0.10 + 12 x 0.23 + Rs.12 x 0.38 + Rs.12 x 0.12 + Rs.12 x 0.10
v. Expected net profit for each level of production:
Production, m (pieces)     : 10     11     12     13     14     15
Expected net profit (Rs.)  : 10.00  10.79  11.28  11.08  9.74   8.04
From the expected net profit given in table, we conclude that the maximum expected profit is
Rs. 11.28 which happens when production (m) is 12. Hence, the production of 12 pieces per day
will optimise Ravali's food stall enterprise's expected profit.
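The expected net profit for every production level can be reproduced with a short Python sketch. It assumes the demand probabilities 0.07, 0.10, 0.23, 0.38, 0.12 and 0.10 that appear in the computation for m = 12 above; the function names are hypothetical.

```python
COST, PRICE = 2, 3  # Rs. per piece

# Demand probabilities as used in the expected profit computation for m = 12.
demand_probs = {10: 0.07, 11: 0.10, 12: 0.23, 13: 0.38, 14: 0.12, 15: 0.10}

def net_profit(m, n):
    """Net profit when m pieces are made and n are demanded; unsold pieces are a dead loss."""
    return PRICE * min(m, n) - COST * m

def expected_profit(m):
    """Expected net profit for production level m."""
    return sum(p * net_profit(m, n) for n, p in demand_probs.items())

table = {m: round(expected_profit(m), 2) for m in range(10, 16)}
best_m = max(table, key=table.get)

print(table)
print("best production level:", best_m)
```

Using min(m, n) covers both cases of part (ii) at once: sales equal demand when n does not exceed m, and are capped at m otherwise.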
The concept of mathematical expectation, also called the expected value, occupies an important
place in statistical analysis. The expected value of a random variable is the weighted arithmetic
mean of the values that the variable can possibly assume, with the probabilities as weights. Robert L. Brite has
defined the mathematical expectation as: It is the expected value of outcome in the long run. In
other words, it is the sum of each particular value within the set (X) multiplied by its
probability. Symbolically,
E(X) = P1 X1 + P2 X2 + ... + Pn Xn
The concept of mathematical expectation was originally applied to games of chance and
lotteries, but the notion of an expected value has become a common term in everyday parlance.
This term is popularly used in business situations which involve the consideration of expected
values.
Illustration 31
Mr. Reddy, owner of petrol bunk sells an average of Rs. 80,000 worth of petrol on rainy days
and an average of Rs. 100,000 on clear days. Statistics from the Meteorological Department
show that the probability is 0.83 for clear weather and 0.17 for rainy weather. Find the expected
value of petrol sale and variance.
Solution
We are given
X1 = 80,000
P1= 0.17
X2 = 100,000
P2 = 0.83
E(X) = P1 X1 + P2 X2 = 0.17 x 80,000 + 0.83 x 100,000 = 13,600 + 83,000 = Rs. 96,600
Illustration 32
There are three alternative proposals before a businessman to start a new project.
Proposal A - Profit of Rs. 4 lakhs with probability of 0.6 or a loss of Rs. 70,000 with probability
of 0.35
Proposal B - Profit of Rs. 8 lakhs with probability of 0.4 or a loss of Rs. 2 lakhs with probability
of 0.6
Proposal C - Profit of Rs. 4.5 lakhs with probability of 0.8 or a loss of Rs. 50,000 with
probability of 0.2
For maximising profits and minimising loss which proposal he should prefer?
Solution
Formula
Proposal A:
Expected value = Rs. 400,000 x 0.6 - Rs. 70,000 x 0.35 = Rs. 240,000 - 24,500 = Rs.
215,500
Proposal B :
Expected value = Rs.800,000 x 0.4 - Rs. 200,000 x 0.6 = Rs. 320,000 - Rs. 120,000 = Rs.
200,000
Proposal C :
Expected value = Rs. 450,000 x 0.8 - Rs. 50,000 x 0.2 = Rs. 360,000 - Rs. 10,000 = Rs.
350,000
The maximum expected value is Rs. 350,000 which is in case of proposal C. Hence the business
man should prefer proposal C.
Illustration 33
The probability that there is at least one error in an accounts statement prepared by A is 0.3,
and for B and C the probabilities are 0.5 and 0.4 respectively. A, B and C prepared 10, 16 and 20
statements respectively. Find the expected number of correct statements in all and the standard
deviation.
Solution
We are given,
X1 = 10; P1 = 1 - 0.3 = 0.7
= (10)^2 x 0.7 + (16)^2 x 0.5 + (20)^2 x 0.6 = 70 + 128 + 240 = 438
- End of Chapter -
LESSON - 7
CONCEPT AND USES OF DISTRIBUTION
Inferences about the characteristics of population can be drawn through (a) observed or
experimental frequency distributions and (b) theoretical or probability distributions. In
frequency distribution, measures like average, dispersion, correlation, etc. are studied based on
the actual data so that it is possible to deduce inferences which indicate both (i) the nature and
form of the sample data and (ii) help in formulating certain ideas about the characteristics of
population. In a population, the values of a variable may be distributed according to some definite
probability law, and the corresponding probability distribution is known as a Theoretical
Probability Distribution. Such probability distributions are based on prior considerations or
previous experience and offer a more scientific approach to drawing inferences about the characteristics
of a population by fitting a mathematical model or a function of the form Y = P(X) to the
given data.
We have defined the mathematical expectation, random variable, and probability distribution
function and also discussed these. In the present section, we will cover the following univariate
probability distributions:
The first two distributions are discrete probability distributions and the third one is a continuous
probability distribution.
Binomial Distribution
Binomial distribution is named after the Swiss mathematician James Bernoulli who introduced
it. The binomial distribution is used to determine the probability of a given number of successes in a
set of trials, each of which has only two mutually exclusive outcomes. This distribution
can be used under a specific set of assumptions:
i. The random experiment is performed under a finite and fixed number of trials.
iii. All the trials are independent in the sense that the outcome of any trial is not affected by the
preceding or succeeding trials.
iv. The probability of success or failure remains constant from trial to trial.
The probability of success of an event is denoted by 'p' and that of its failure by 'q'. Since the binomial
distribution is a set of dichotomous alternatives, i.e. successes or failures, p + q = 1. Hence, (q + p)^n is
the binomial, where n is the number of trials. By expanding the binomial terms, we obtain the probability distribution
which is called the binomial probability distribution or simply the binomial distribution.
It is defined as
P(r) = P(X=r) = nCr q^(n-r) p^r
or
(ii) The binomial coefficients for n + 1 terms of the distribution are symmetrical, ascending up to
the middle of the series, and then descending. When n is odd number, n + 1 becomes even and
the binomial coefficients of the two central terms are identical.
β1 = (q - p)^2 / npq
β2 = 3 + (1 - 6pq) / npq
Step 1. Determine the values of p and q. If one of these values is known, the other can be found
by their relationship -
p = (1 - q) and q = (1 - p)
Step 3. Multiply each term of the expanded binomial by N (total frequency) in order to obtain
the expected frequency in each category.
Illustration 34
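A coin is tossed six times. What is the probability of obtaining four or more heads?
Solution
In a toss of an unbiased coin, the probabilities of a head and of a tail are equal, i.e., p = q = 1/2.
By expanding $(q + p)^6$, we get the probabilities of all the possible events. The terms for four, five and six heads are
$^{6}C_{4}\, q^{2} p^{4} = 15/64$
$^{6}C_{5}\, q\, p^{5} = 6/64$
$p^{6} = (1/2)^{6} = 1/64$
Hence the probability of four or more heads is 15/64 + 6/64 + 1/64 = 22/64 = 11/32.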
Illustration 35
Eight coins are thrown simultaneously. Show that the probability of obtaining at least 6 heads is 37/256.
Solution
The probabilities of getting a head and a tail are denoted by p and q respectively. For an unbiased coin, p = q = 1/2.
The probability of r successes, i.e., getting r heads in 8 trials, is given by
$p(r) = {}^{8}C_{r}\, q^{8-r} p^{r} = {}^{8}C_{r} (1/2)^{8-r} (1/2)^{r} = {}^{8}C_{r} (1/2)^{8} = {}^{8}C_{r} / 256$
The probability of at least 6 heads is
$p(6) + p(7) + p(8) = \dfrac{1}{256}\left({}^{8}C_{6} + {}^{8}C_{7} + {}^{8}C_{8}\right) = \dfrac{1}{256}(28 + 8 + 1) = \dfrac{37}{256}$
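The coin-tossing results above can be checked with exact fractions. The sketch below is an illustration only; the helper names `binomial_pmf` and `prob_at_least` are assumptions, not names used in the text.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, r, p):
    """P(X = r) = nCr * q^(n-r) * p^r."""
    q = 1 - p
    return comb(n, r) * q ** (n - r) * p ** r

def prob_at_least(n, k, p):
    """P(X >= k): sum of the terms for r = k, k+1, ..., n."""
    return sum(binomial_pmf(n, r, p) for r in range(k, n + 1))

half = Fraction(1, 2)
print(prob_at_least(8, 6, half))  # at least 6 heads with 8 coins
print(prob_at_least(6, 4, half))  # four or more heads in 6 tosses
```

Using `Fraction` rather than floating point keeps the answers in the same form as the hand calculation.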
Illustration 36
The incidence of an occupational disease in an industry is such that the workmen have a 20% chance of suffering from it. What is the probability that out of six workmen, three or more will contract the disease?
Solution
p = 20/100 = 1/5
q = 1 - p = 1 - 1/5 = 4/5
The probability of 3 or more men suffering from the disease is the sum of the terms for r = 3, 4, 5 and 6 in the binomial expansion of $(4/5 + 1/5)^6$:
$\dfrac{20 \times 64 + 15 \times 16 + 6 \times 4 + 1}{15625} = \dfrac{1545}{15625} = 0.09888$
Illustration 37
ii) the variance
Solution
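np = 7 and npq = 11
(7)q = 11
q = 11/7
Since q = 11/7 is greater than 1, it cannot be a probability; a binomial distribution with np = 7 and npq = 11 is therefore not possible.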
Illustration 38
Nine coins are tossed at a time, 512 times. The number of heads observed is recorded at each throw, and the results are given below. Find the expected frequencies. What are the theoretical values of the mean and standard deviation? Also calculate the mean and standard deviation of the observed frequencies.
No. of heads (X): 0 1 2 3 4 5 6 7 8 9
Frequency (F): 4 10 45 115 139 105 65 19 8 2
Solution
Mean = np = 9 × 1/2 = 4.5
Standard Deviation = $\sqrt{npq} = \sqrt{9 \times 1/2 \times 1/2} = 1.5$
Now, we calculate the mean and standard deviation of observed or actual frequency distribution.
Arrange data:
X       dx      F       F.dx    F.dx²
0       -5      4       -20     100
1       -4      10      -40     160
2       -3      45      -135    405
3       -2      115     -230    460
4       -1      139     -139    139
5       0       105     0       0
6       +1      65      +65     65
7       +2      19      +38     76
8       +3      8       +24     72
9       +4      2       +8      32
Total   -5      512     -429    1509
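The column totals in the table can be turned into the mean and standard deviation by the assumed-mean method (assumed mean 5, so dx = X - 5). The Python sketch below is an illustration under that reading of the table; the function name `mean_sd_assumed_mean` is an assumption, and the theoretical values take the coins to be unbiased (p = 1/2).

```python
from math import sqrt

heads = list(range(10))                          # X = 0, 1, ..., 9
freq = [4, 10, 45, 115, 139, 105, 65, 19, 8, 2]  # observed frequencies F

def mean_sd_assumed_mean(xs, fs, assumed):
    """Mean and standard deviation from deviations dx = x - assumed."""
    n_total = sum(fs)
    sum_fd = sum(f * (x - assumed) for x, f in zip(xs, fs))
    sum_fd2 = sum(f * (x - assumed) ** 2 for x, f in zip(xs, fs))
    mean = assumed + sum_fd / n_total
    sd = sqrt(sum_fd2 / n_total - (sum_fd / n_total) ** 2)
    return mean, sd

obs_mean, obs_sd = mean_sd_assumed_mean(heads, freq, assumed=5)

# Theoretical binomial values for 9 unbiased coins
n, p = 9, 0.5
theo_mean = n * p
theo_sd = sqrt(n * p * (1 - p))

print(round(obs_mean, 4), round(obs_sd, 4), theo_mean, theo_sd)
```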
Remark
For the observed frequency distribution, the mean and standard deviation are 4.1621 and 1.4984 respectively, while the figures for the theoretical (probability) frequency distribution are 4.50 and 1.50. The theoretical distribution is a mathematical model and its results are internally consistent, whereas values estimated from observed data need not be. For example, if we substitute the mean and variance of the observed frequency distribution, we get
Mean = np = 4.1621, so p = 4.1621/9 = 0.462455
Variance = npq = (1.4984)², so q = (1.4984)²/4.1621 = 0.539439
In a binomial distribution p + q = 1. But in our example, the value of (p + q) calculated from the observed frequency distribution is (0.462455 + 0.539439) = 1.001894.
Hence, the binomial probability distribution is the more scientific description of the real-life behaviour of an event.
Illustration 39
The data below show the number of seeds germinating out of 10 on damp filter for 100 sets of seeds. Fit a binomial distribution.
X = 0 1 2 3 4 5 6 7 8 9 10
F = 6 20 30 16 12 8 4 3 1 0 0
Solution
X       F       Fx
0       6       0
1       20      20
2       30      60
3       16      48
4       12      48
5       8       40
6       4       24
7       3       21
8       1       8
9       0       0
10      0       0
Total   100     269
Mean = np = ΣFx / N = 269 / 100 = 2.69
p = np/n = 2.69 / 10 = 0.269
q = 1 - p = 1 - 0.269 = 0.731
Hence, the fitted binomial distribution is $100\,(0.731 + 0.269)^{10}$. Expanding $100\,(0.731 + 0.269)^{10}$ gives the expected frequencies for 0, 1, 2, …, 10 germinating seeds.
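The expansion can be carried out mechanically. The following Python sketch is an illustration only (variable names are assumptions): it estimates p from the observed mean and then evaluates each term of the expansion.

```python
from math import comb

seeds = list(range(11))                            # X = 0, 1, ..., 10
observed = [6, 20, 30, 16, 12, 8, 4, 3, 1, 0, 0]   # F

total = sum(observed)                                        # N
mean = sum(x * f for x, f in zip(seeds, observed)) / total   # np
n = 10
p = mean / n
q = 1 - p

# Each term of N * (q + p)^n is the expected frequency for r germinating seeds
expected = [total * comb(n, r) * q ** (n - r) * p ** r for r in seeds]
for r, e in zip(seeds, expected):
    print(r, round(e, 2))
```

The printed values are unrounded expectations; in a fitted table they would normally be rounded to whole numbers.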
- End of Chapter –
LESSON - 8
POISSON DISTRIBUTION
The Poisson distribution was derived by the French mathematician Simeon D. Poisson. It describes the behaviour of rare events and has been known as the Law of Improbable Events. The Poisson distribution is a discrete probability distribution and is widely used in statistical inference. The binomial distribution can be used only when the number of trials n is known, while the Poisson distribution can be used when we know the mean number of occurrences of an event without knowing the number of trials. Mathematically, the Poisson distribution is the limiting form of the binomial distribution as n (the number of trials) tends to infinity and p (the probability of success) approaches zero in such a way that np = m remains constant. Such situations are fairly common. Under these conditions, the binomial distribution function tends to the Poisson probability function given below (the definition of the Poisson probability distribution).
$P(X = x) = \dfrac{e^{-m} m^{x}}{x!}, \quad x = 0, 1, 2, \ldots$
where x is the number of successes (occurrences of an event), m = np and e = 2.71828 (the base of natural logarithms). m is called the parameter of the Poisson distribution; it is the mean of the distribution, and the standard deviation is √m.
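The limiting relationship between the two distributions can be seen numerically. In the sketch below (an illustration; the value m = 2 and the function names are assumptions), n is increased while np = m is held fixed, and the binomial probability of exactly 3 successes approaches the Poisson value.

```python
from math import comb, exp, factorial

def binomial_pmf(n, x, p):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * (1 - p) ** (n - x) * p ** x

def poisson_pmf(m, x):
    """Poisson probability e^(-m) * m^x / x!."""
    return exp(-m) * m ** x / factorial(x)

m = 2.0  # assumed mean, held constant while n grows
for n in (10, 100, 10000):
    p = m / n
    print(n, round(binomial_pmf(n, 3, p), 5), round(poisson_pmf(m, 3), 5))
```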
The Poisson distribution can explain the behaviour of discrete random variables where the probability of occurrence of an event is very small and the number of trials is sufficiently large. As such, this distribution has found application in many fields like Queuing theory, Insurance, Biology, Physics, Business, Economics, Industry etc. Some practical areas where the Poisson distribution can be used are listed below. It is used in...
4. Physics, to count the number of disintegrations of a radioactive element per unit of time.
In addition to the above, the Poisson distribution can also be used for counting the number of accidents taking place per day, the number of suicides on a particular day, the number of persons dying due to a rare cause such as heart attack, cancer, snake bite or plague, the number of typographical errors per page in typed or printed material, etc.
Illustration 40
The average number of phone calls per minute into the switch-board of Reddy Company Limited between the hours of 10 AM and 1 PM is 2.5. Find the probability that during one particular minute there will be (i) no phone calls at all, (ii) exactly 3 calls and (iii) at least 5 calls.
Solution
Let us denote the number of telephone calls per minute by X. Then X follows a Poisson distribution with mean m = 2.5. The Poisson probability function is:
$P(X = x) = \dfrac{e^{-2.5}\,(2.5)^{x}}{x!}$
Note:
Refer to the table for $e^{-2.5}$. If the table has no entry for 2.5, first find the values of $e^{-2.0}$ (= 0.13534) and $e^{-0.5}$ (= 0.6065), and then multiply them to get 0.08208.
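The three probabilities asked for can be evaluated directly from the Poisson probability function with m = 2.5. The Python sketch below is an illustration only; the function and variable names are assumptions.

```python
from math import exp, factorial

def poisson_pmf(m, x):
    """Poisson probability e^(-m) * m^x / x!."""
    return exp(-m) * m ** x / factorial(x)

m = 2.5  # mean number of calls per minute

p_none = poisson_pmf(m, 0)                                       # (i) no calls
p_three = poisson_pmf(m, 3)                                      # (ii) exactly 3 calls
p_at_least_five = 1 - sum(poisson_pmf(m, x) for x in range(5))   # (iii) 1 - P(0..4)

print(round(p_none, 4), round(p_three, 4), round(p_at_least_five, 4))
```

"At least 5" is obtained as a complement, since adding the infinitely many terms for 5, 6, 7, ... directly is not practical.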
Illustration 41
The number of mistakes per page was observed in a book. Fit a Poisson distribution to the data, which are:
No. of mistakes per page (x) 0 1 2 3 4
occurred (F)
Solution
$m = \sum Fx / N$
$e^{-0.44} = 0.644$
Expected frequencies:
Illustration 42
The components processed by a machine have been found to have some defects. 40 components
were selected at random and the number of defects in each of them were noted. The data is:
(a) Determine the probability distribution of the random variable, number of defects in a
component and frequency distribution based on the Poisson distribution.
Solution
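The expected frequency of components containing 'r' defects according to the Poisson law is given by
$N P(r) = 40 \times \dfrac{e^{-m} m^{r}}{r!}$
With m = 1.9, the factor $40 \times e^{-1.9}$ is taken as 5.9833, so that
No. of defects (r)    NP(r)                     Expected frequency
0                     5.9833                    6
1                     5.9833 x (1.9)^1          12
2                     5.9833 x (1.9)^2/2!       11
3                     5.9833 x (1.9)^3/3!       7
4                     5.9833 x (1.9)^4/4!       3
5                     5.9833 x (1.9)^5/5!       1
Total                                           40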
Chi Square
The calculated value is less than the table value, hence Poisson distribution provides a good fit to
the data.
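The expected frequencies under the Poisson law can be recomputed as a check. The sketch below is an illustration, taking N = 40 and m = 1.9 from the working above; the variable names are assumptions.

```python
from math import exp, factorial

N = 40    # number of components examined
m = 1.9   # mean number of defects per component, as used in the working

# Expected frequency of components with r defects: N * e^(-m) * m^r / r!
expected = [N * exp(-m) * m ** r / factorial(r) for r in range(6)]
for r, e in enumerate(expected):
    print(r, round(e, 2), round(e))
```

Plain rounding gives 11 for r = 1, whereas the table shows 12; this may be an adjustment so that the expected frequencies total 40, but that is an inference and is not stated in the text.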
Note:
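Without referring to the table, we can calculate the value of $e^{-m}$ using common logarithms. For example, for $e^{-0.2}$:
$\log_{10} e^{-0.2} = -0.2 \log_{10} e = -0.2 \times 0.4343 = -0.08686$
and the antilogarithm of -0.08686 gives $e^{-0.2} = 0.8187$.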
Normal Distribution
The Binomial and Poisson distributions discussed in the preceding paragraphs are the most useful theoretical distributions for discrete variables, i.e. the occurrence of distinct events. A more suitable distribution for dealing with a variable whose magnitude is continuous is the normal distribution. It is also called the normal probability distribution.
Uses
1. It aids in solving many business and economic problems, including problems in the social and physical sciences. Hence, it is a cornerstone of modern statistics.
2. It provides a basis for knowing how far away, and in what direction, a variable is from its population mean.
3. It is symmetrical. Hence the mean, median and mode are identical.
4. It has only one maximum point, at the mean, and hence it is unimodal (i.e. it has only one mode).
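Definition
The normal curve is given by
$Y = Y_0\, e^{-x^2 / 2\sigma^2}$
Where,
x = deviation of the variable from the mean, and σ = standard deviation
π = 3.14159 (approximately 22/7)
e = 2.71828
and where
$Y_0 = \dfrac{N}{\sigma\sqrt{2\pi}}$ is the maximum ordinate, at the mean.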
If the normal curve is expressed in terms of standard deviation units, the variable is called the normal deviate. The normal deviate at the mean will be zero, viz. Z = x/σ = 0/σ = 0, where x is the deviation from the mean and σ the standard deviation. At deviations from the mean equal to 1σ, 2σ and 3σ (for instance 10, 20 and 30 when σ = 10), the normal deviate will be 1, 2 and 3 respectively. To the left of the mean these units will be negative, as the values of the variate are less than the mean, and to the right of the mean they will be positive, as the values are greater than the mean. This is known as changing to the standardized scale. In equation form, the change to the standardized scale is written as
$Z = \dfrac{X - \bar{x}}{\sigma}$
The normal curve is distributed as under:
Method of Ordinates
To draw a curve on a graph, we need the frequencies, which are represented on the ordinate (Y-axis), and the values of the variable, which are represented on the abscissa (X-axis). Hence, in order to fit a curve we must know the ordinates (i.e. frequencies) at the various points of the abscissa scale. Find the mean x̄, the standard deviation σ, the total frequency N and the class interval i, if any, of the observed distribution. Then calculate Yo:
$Y_0 = \dfrac{N i}{\sigma\sqrt{2\pi}} = \dfrac{N i}{2.5066\,\sigma}$
$Y_0 = 0.399\,(N i / \sigma)$
Illustration 41
The customer accounts at the Departmental Store have an average balance of Rs. 120 and a standard deviation of Rs. 40. Assuming that the account balances are normally distributed, find
i. What proportion of the accounts is over Rs. 150.
ii. What proportion of the accounts is between Rs. 100 and Rs. 150.
iii. What proportion of the accounts is between Rs. 60 and Rs. 90.
Solution
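We are given
x̄ = 120 and σ = 40 (standard deviation)
Formula
$Z = \dfrac{X - \bar{x}}{\sigma}$
i. Proportion of the accounts over Rs. 150
$Z = \dfrac{150 - 120}{40} = 0.75$
Referring to the Z-table, the area between the mean and Z = 0.75 is 0.2734.
We have to find the probability of items falling to the right of Z = 0.75, i.e. over Rs. 150. Deduct 0.2734 from the total probability to the right of the origin: 0.5 - 0.2734 = 0.2266. Hence, 22.66 per cent of the accounts have a balance in excess of Rs. 150.
ii. Proportion of the accounts between Rs. 100 and Rs. 150
$Z_{100} = \dfrac{100 - 120}{40} = -0.50$ and $Z_{150} = \dfrac{150 - 120}{40} = 0.75$
Referring to the Z-table, the area between Z100 = (-)0.50 and Z150 = 0.75 is 0.1915 + 0.2734 = 0.4649.
Therefore, 46.49 per cent of the accounts have a balance between Rs. 100 and Rs. 150.
iii. Proportion of the accounts between Rs. 60 and Rs. 90
$Z_{60} = \dfrac{60 - 120}{40} = -1.50$ and $Z_{90} = \dfrac{90 - 120}{40} = -0.75$
Referring to the Z-table, the area between Z60 = (-)1.50 and Z90 = (-)0.75 is 0.4332 - 0.2734 = 0.1598.
Thus, 15.98 per cent of the accounts have a balance between Rs. 60 and Rs. 90.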
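The table look-ups in this illustration can be cross-checked with Python's standard library, which provides the normal distribution in `statistics.NormalDist` (Python 3.8 or later). This is an illustrative sketch; the variable names are assumptions, and small differences from the table values are due to rounding in the Z-table.

```python
from statistics import NormalDist

# Account balances: mean Rs. 120, standard deviation Rs. 40
balances = NormalDist(mu=120, sigma=40)

over_150 = 1 - balances.cdf(150)                         # i. over Rs. 150
between_100_150 = balances.cdf(150) - balances.cdf(100)  # ii. Rs. 100 to Rs. 150
between_60_90 = balances.cdf(90) - balances.cdf(60)      # iii. Rs. 60 to Rs. 90

print(round(over_150, 4), round(between_100_150, 4), round(between_60_90, 4))
```

`cdf(x)` returns the area to the left of x, so the area between two balances is the difference of two `cdf` values.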
Illustration 42
In a public examination, 6000 students appeared for statistics. Their average mark was 62 and the standard deviation was 10. Assuming the distribution is normal, obtain the number of students who might have (i) obtained 80 per cent or more, (ii) obtained a first class (i.e. 60 per cent or more), (iii) secured less than 40 per cent, and (iv) if there are only 150 vacancies, find the minimum mark that one should secure to get selected against a vacancy.
Solution
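We are given
x̄ = 62 and σ = 10 (standard deviation)
(i) For 80 per cent, Z = (80 - 62) / 10 = 1.8, and the Z-table area between the mean and Z = 1.8 is 0.4641. The proportion of students who secure 80 per cent or more is 0.5 - 0.4641 = 0.0359, i.e. 3.59 per cent of students. Thus, 6000 x 0.0359 = 215 students secured 80 per cent or more.
(ii) For 60 per cent, Z = (60 - 62) / 10 = -0.2, and the Z-table area between the mean and Z = 0.2 is 0.0793. The proportion of students who secure 60 per cent or more is 0.5 + 0.0793 = 0.5793. Thus, 6000 x 0.5793 = 3476 students obtained a first class.
(iii) For 40 per cent, Z = (40 - 62) / 10 = -2.2, and the Z-table area between the mean and Z = 2.2 is 0.4861. Hence, the number of students who scored less than 40 per cent is 6000 x (0.5 - 0.4861) = 6000 x 0.0139 = 83.
(iv) The 150 selected students form the top 150/6000 = 0.025 of the distribution.
The area between the mean and the cut-off mark is therefore 0.5 - 0.025 = 0.475. The Z value corresponding to an area of 0.475 in the Z-table is 1.96 (see Table).
Z = (X - 62) / 10 = 1.96
X = 19.6 + 62 = 81.6
The minimum mark that one should obtain to get selected against a vacancy is 81.6.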
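The four parts of this illustration can be recomputed without a Z-table. The sketch below is an illustration using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are assumptions, and the results are unrounded, so they may differ slightly from table-based figures.

```python
from statistics import NormalDist

marks = NormalDist(mu=62, sigma=10)   # mean mark 62, standard deviation 10
students = 6000

n_80_or_more = students * (1 - marks.cdf(80))    # (i) 80 per cent or more
n_first_class = students * (1 - marks.cdf(60))   # (ii) 60 per cent or more
n_below_40 = students * marks.cdf(40)            # (iii) less than 40 per cent

# (iv) mark exceeded by only the top 150 of 6000 students
cutoff = marks.inv_cdf(1 - 150 / students)

print(round(n_80_or_more), round(n_first_class), round(n_below_40), round(cutoff, 1))
```

`inv_cdf` is the reverse of the table look-up: it returns the mark below which a given proportion of students fall.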
EXERCISES
1. What do you mean by probability? Discuss the importance of probability in statistics.
2. What is meant by mathematical expectation? Explain it with the help of an example.
4. What is meant by the Poisson distribution? What are its uses?
6. 3 balls from an urn containing 6 white and 4 black balls are drawn. Find the probability that 2
are white and 1 is black...(i) if each ball is returned before the next is drawn, (ii) if the three balls
are drawn successively without replacement. (Ans: i. 1/2; ii. 1/6)
7. A bag contains 8 white, 6 red and 4 black balls. Three balls are drawn at random. Find the probability that they will all be white. (Ans: 56/816)
8. A bag contains 4 white and 8 red balls, and a second bag 3 white and 5 black balls. One of the bags is chosen at random and a draw of 2 balls is made from it. Find the probability that one is white and the other is black. (Ans: 0.510)
9. A class consists of 100 students: 25 of them are girls and the remaining are boys, 35 of them are rich and 65 poor, and 20 of them are fair and glamorous. What is the probability of selecting a fair, glamorous, rich girl? (Ans: 0.0175)
10. Three persons A, B and C are being considered for appointment as Vice-Chancellor of a University; their chances of being selected for the post are in the proportion 5:3:2 respectively. The probability that A, if selected, will introduce democratisation in the University structure is 0.3; the corresponding probabilities for B and C are 0.5 and 0.7 respectively. What is the probability that democratisation will be introduced in the University? (Ans: 0.44)
11. The probability that a trainee will remain with a company is 0.65. The probability that an employee earns more than Rs. 18000 per year is 0.60. The probability that an employee is a trainee who remained with the company or who earns more than Rs. 18000 per year is 0.75. What is the probability that an employee earns more than Rs. 18000 per year given that he is a trainee who stayed with the company?
12. In a bolt factory, machines A, B, C produce 30 per cent, 40 per cent and 30 per cent
respectively. Of their output 3, 4, 2 per cents are defective bolts. A bolt is drawn at random from
the product and is found to be defective. What are the probabilities that it was produced by
machines A, B and C. (Ans: 0.29, 0.52, 0.19)
13. A factory produces a certain type of output by two types of machines. The daily production figures are: Machine I - 4000 units and Machine II - 5500 units. Past records show that the defectives in the output produced by Machine I and Machine II are 1.4 per cent and 1.9 per cent respectively. An item is drawn at random from the day's production and is found to be defective. What is the probability that it comes from the output of (i) Machine I and (ii) Machine II? (Ans: (i) 0.3489 (ii) 0.6511)
14. Dayal Company estimates the net profit on a new product it is launching to be Rs. 5 lakhs during the first year if it is successful, and Rs. 3 lakhs if it is moderately successful. The probabilities the company assigns to the first-year prospects for the product are: successful - 0.18, moderately successful - 0.22. What are the expected profit and the standard deviation of first-year net profit for this product? (Ans: Profit 0.66 lakhs; S.D. Rs. 2.719 lakhs)
15. A systematic sample of 100 pages was taken from the Concise Oxford Dictionary and the observed frequency distribution of foreign words per page was found to be as follows:
No. of foreign words per page (x) : 0 1 2 3 4 5 6
Calculate the expected frequencies using Poisson distribution. Also calculate the variance of
fitted distribution. (Ans: 37, 37, 18, 16, 2, 0, 0; Variance = 0.99)
16. The incomes of a group of 10000 persons were found to be normally distributed with mean Rs. 750 per month and standard deviation Rs. 50. Of this group, about 95 per cent had income exceeding Rs. 668 and only 5 per cent had income exceeding Rs. 832. What was the lowest income among the richest 100? (Ans: Rs. 866)
-End of Chapter -
LESSON - 9
STATISTICAL INFERENCES
Introduction
Human welfare, including the daily actions of human beings, leans heavily on statistics. Drawing valid conclusions for decision making requires the study of statistics and the application of statistical methods in almost every field of human activity. Statistics, therefore, is regarded as the science of decision making. Statisticians commonly categorise the diverse techniques of statistics into (a) descriptive statistics and (b) inferential statistics (or inductive statistics). The former describes the characteristics of numerical data while the latter deals with judgments based on statistical analysis. In other words, the former is a process of analysis whereas the latter is a scientific device for inferring conclusions. Both are systematic methods of drawing valid conclusions about the totality (i.e. the population) on the basis of examining a part of the population, termed the sample. The process of studying the sample and then generalising the results to the population calls for scientific investigation.
The word population is a technical term in statistics, not necessarily referring to people. It is the totality of objects under consideration. In other words, it refers to the whole collection of objects or items from which a selection is made for investigation. This term is sometimes called the universe. Figure 1 shows the concept of a population and a sample in the form of a Venn diagram, where the population is shown as the universal set and a sample is shown as a proper subset of the population.
A population containing a finite number of objects, say the students in a college, is called a finite population. A population having an infinite number of objects, say the heights, weights or ages of people in the country, the stars in the sky etc., is known as an infinite population. A population of concrete objects, say the number of books in a library, the number of buses or scooters in a district, etc., is called an existent population. If the population consists of imaginary objects, say the throws of a die or a coin an infinite number of times, it is referred to as a hypothetical population.
For a social scientist, it is often difficult, in fact impossible, to collect information from all the objects or units of a population. He is, therefore, interested in obtaining sample data. Selection of a few objects or units forming a true representative of the population is termed sampling, and the objects or units selected are termed the sample. From the analysis of the sample data, he generalises to the entire population from which the sample is drawn. Sampling has two objectives: (a) obtaining optimum information and (b) getting the best possible estimates of (population) parameters.
The statistical constants of the population such as population size (N), population mean (μ), population variance (σ²), population correlation coefficient (ρ), etc. are called parameters. In other words, the values that are derived using population data are known as parameters. Similarly, the values that are derived using sample data are termed statistics, not to be confused with the word statistics meaning data or the science of statistics. Examples of statistics are the sample mean (x̄), sample variance (s²), sample correlation coefficient (r), sample size (n), etc. Statistics are functions of the sample data whereas parameters are functions of the population data. In brief, a population constant is called a parameter while a sample constant is known as a statistic.
Random Sample
Sampling refers to the method of selecting a sub-set of the population for investigation. Selection of objects or units in such a way that each and every object or unit in the population has a chance of being selected is called random sampling. The number of objects or units in the sample is termed the sample size. This size should be neither too big nor too small, but optimum. Over the census method, the sample method has distinct merits, which R.A. Fisher sums up thus: speed, economy, adaptability and scientific approach. The right type of sampling plan is of paramount importance in the execution of a sample survey. In accordance with the objectives and scope of the investigation, sampling techniques are broadly classified into (a) random sample, (b) non-random sample and (c) mixed sample.
Random (or probability) sampling is a very widely applicable technique for selecting a sample from the population. All the objects or units in the universe have an equal chance of being included in the random sample. In other words, every unit or object is as likely to be considered as any other. The process is random in character and the sample is usually representative. Simple random sampling means selecting n units out of N in such a way that every one of the $^{N}C_{n}$ possible samples has an equal chance of being selected. This is done in two ways: (a) random sampling with replacement (rswr) and (b) random sampling without replacement (rswor). The former permits a selected unit to be replaced before the next draw while the latter does not.
Let x stand for the life (in hours) of a television set produced by Konark Company under essentially identical conditions (with the same set of workers working on the same machine using the same type of materials and the same technique). If x1, x2, x3, …, xn are the lives of n such television sets, then x1, x2, x3, …, xn may be regarded as a random sample from the distribution of x. The number n is called the size of the random sample.
A random sample may be selected either by drawing chits or by the use of random numbers. The former is a random method but is subject to biases (as the chits can be identified). The latter is the better method, as the numbers are drawn randomly.
For example, suppose a population consists of 15 units and a sample of size 6 is to be selected. Since 15 is a two-digit figure, the units are numbered 00, 01, 02, 03, …, 14. Six random numbers are obtained from a two-digit random number table. They are 69, 36, 75, 91, 44 and 86. On dividing 69 by 15, the remainder is 9; hence select the unit with serial number 09. Likewise divide 36, 75, 91, 44 and 86 by 15. The respective remainders are 6, 0, 1, 14 and 11. Hence select the units with serial numbers 09, 06, 00, 01, 14 and 11. These selected units form the sample.
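The remainder method used in this example is easy to express in code. The sketch below is an illustration; the function name `select_units` is an assumption.

```python
def select_units(random_numbers, population_size):
    """Map each random number to a serial number by taking the remainder."""
    return [number % population_size for number in random_numbers]

drawn = [69, 36, 75, 91, 44, 86]   # two-digit random numbers from the example
sample = select_units(drawn, 15)   # population of 15 units numbered 00 to 14
print(sample)
```

One caveat, not discussed in the text: two-digit numbers run from 00 to 99, and 100 is not a multiple of 15, so remainders 0 to 9 each arise from seven of the hundred numbers while remainders 10 to 14 arise from only six. The method is therefore only approximately equal-chance unless numbers from 90 to 99 are discarded and redrawn.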
Sampling Distribution
A statistic, being a function of the random variables x1, x2, x3, …, xn, is itself a random variable and hence has a probability distribution. This probability distribution of a statistic is known as the sampling distribution of the statistic. It describes how the statistic varies from sample to sample. In practice, the sampling distributions most commonly used are those of the sample mean and the sample variance. These give rise to a number of test statistics for hypothesis testing.
Suppose a simple random sample of size n is picked from a population. Then the sample mean, represented by x̄, is defined as
$\bar{x} = \dfrac{1}{n} \sum_{i=1}^{n} x_i$
Suppose a simple random sample of size n is chosen from a population. The sample variance is used to estimate the population variance. In equation form,
$s^2 = \dfrac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
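As a small illustration, the sample mean and sample variance can be computed with Python's `statistics` module. The five values below are hypothetical lives (in hours) invented for this sketch, and the variance uses the divisor n - 1, which is what `statistics.variance` implements.

```python
from statistics import mean, variance

# Hypothetical sample of n = 5 lives in hours (invented for illustration)
lives = [1020, 980, 1005, 995, 1000]

x_bar = mean(lives)    # sum of the values divided by n
s2 = variance(lives)   # sum of squared deviations divided by n - 1

print(x_bar, s2)
```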
Standard Error
The standard deviation measures the variability of a variable. The standard deviation of a sampling distribution is referred to as the standard error (S.E.). It measures only the sampling variability which occurs due to chance or random forces in estimating a population parameter. The word error is used in place of deviation to emphasize that the variation among sample statistics is due to sampling errors. For the sample mean, the standard error is $\sigma / \sqrt{n}$. If σ is not known, we use the standard error given by:
$S.E.(\bar{x}) = \dfrac{s}{\sqrt{n}}$
The standard error is useful in the following ways:
1. It provides an idea about the reliability of the sample. The smaller the standard error, the smaller the variation of the population value from the expected (sample) value, and hence the greater the reliability of the sample.
2. It helps to determine the confidence limits within which the parameter value is expected to lie. For a large sample, the sampling distribution tends to be close to the normal distribution. In a normal distribution, the ranges mean ± 1 standard error, mean ± 2 standard errors and mean ± 3 standard errors cover 68.27 per cent, 95.45 per cent and 99.73 per cent of values respectively. The chance of a value lying outside ± 3 S.E. is only 0.27 per cent, i.e. approximately 3 in 1000.
3. It aids in testing hypotheses and in interval estimation.
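The percentages quoted for the normal distribution can be verified numerically. The following sketch is an illustration; the helper names `standard_error` and `coverage` are assumptions, and `statistics.NormalDist` requires Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist

def standard_error(s, n):
    """Standard error of the mean, s / sqrt(n)."""
    return s / sqrt(n)

def coverage(k):
    """Share of a normal distribution within mean +/- k standard errors."""
    z = NormalDist()   # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * coverage(k), 2))
```

For instance, a sample standard deviation of 10 with n = 100 gives `standard_error(10, 100)` = 1.0, so a range of mean ± 2 covers roughly 95 per cent of sample means.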
Estimation Theory
Statistical Inference is an important branch of statistics that provides techniques for generalizing the results of a sample to the population and for estimating population parameters along with a degree of confidence. In other words, it is the process of inferring information about a population from a sample. Statistical inference deals with two main problems, namely (a) estimation and (b) testing of hypotheses.
(a) Estimation:
The estimation of population parameters such as the mean, variance, proportion, etc., from the corresponding sample statistics is an important function of statistical inference. Parameter estimation is very much needed for decision making. For example, the manufacturer of electric tubes may be interested in knowing the average life of his product, the scientist may be eager to estimate the average life span of human beings, and so on. Due to the practical and relative merits of the sample method over the census method, scientists prefer the former.
A specific observed value of a sample statistic is called an estimate. A sample statistic which is used to estimate a population parameter is known as an estimator. In other words, the sample value is an estimate and the method of estimation (statistical measure) is termed an estimator. The theory of Estimation was founded by Prof. R.A. Fisher. Estimation is studied under Point Estimation and Interval Estimation.
Good Estimation
A good estimator is one which is as close to the true value of the population parameter as possible. A good estimator possesses the following features:
(a) Unbiasedness : An estimate is said to be unbiased if its expected value is equal to the parameter. For example, if x̄ is an estimate of μ, x̄ will be an unbiased estimate only if $E(\bar{x}) = \mu$
(See Illustration 1)
(b) Consistency : An estimator is said to be consistent if the estimate tends to approach the parameter as the sample size increases. For any distribution, i.e., symmetrical or skewed, the sample mean, sample variance and sample proportion are consistent estimators of the population mean, population variance and population proportion respectively.
(c) Efficiency : An estimate is said to be efficient if its variance is minimum. An estimator with less variability, together with consistency, is more reliable than the other.
(d) Sufficiency : An estimator which uses all the relevant information in its estimation is said to be sufficient. If the estimator captures all the information in the sample, then considering any other estimator is unnecessary.
A Point Estimate is a single statistic which is used to estimate a population parameter. We shall now see that the sample mean and sample variance are unbiased estimates of the corresponding population parameters.
In Point Estimation, a single value of a statistic is used as the estimate of the population parameter. Sometimes, this point estimate may not disclose the true parameter value. Having computed a statistic from a given random sample, can we make reasonable probability statements about the unknown parameter of the population from which the sample is drawn? The answer is provided by the technique of Interval Estimation. The interval within which the unknown value of the parameter is expected to lie is called the Confidence Interval or Fiducial Interval (the terms used by Neyman and Fisher respectively). The limits so determined are called Confidence Limits or Fiducial Limits, and the required precision of the estimate, say 95 per cent, is known as the Confidence Coefficient. Thus, a confidence interval indicates the probability that the population parameter lies within a specified range of values. To compute a confidence interval we require:
Z-Distribution
Interval estimation for large samples is based on the assumption that if the size of the sample is large, the sample value tends to be very close to the population value. In other words, if the size of the sample is sufficiently large, the sampling distribution is approximately normal in shape. This is the feature of the central limit theorem. Therefore, the sample value can be used in place of the population value in the estimation of the standard error. The Z-distribution is used in the case of large samples to estimate confidence limits. For small samples, t-values are used instead of Z-values to estimate the confidence limits. One has to know the confidence level before calculating confidence limits. Confidence level means the level of accuracy required. For example, a 99 per cent confidence level means that the actual population mean lies within the range of the estimated values with a probability of 99 per cent. The risk is one per cent.
To find the Z-value corresponding to the 99 per cent confidence level, divide the confidence level by 2, i.e. 99/2, which gives 49.5; in probability terms it is 0.495. Locate this value in the body of the Z-table. The Z-value corresponding to it is read from the left-most column and the top-most row. The confidence coefficient for the 99 per cent confidence level is 2.58. Thus 99 per cent of items or cases fall within x̄ ± 2.58 S.E., which means this range of the sampling distribution contains 99 per cent of the cases.
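The table look-up described above can be reproduced with the inverse of the normal distribution function. The sketch below is an illustration using `statistics.NormalDist` (Python 3.8 or later); the function name `z_for_confidence` is an assumption, and the values it returns are unrounded, so they may differ in the last digit from a printed Z-table.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """Z-value such that mean +/- Z standard errors covers `level` of cases."""
    # Half of the confidence level lies on each side of the mean, so the
    # area to the left of +Z is 0.5 + level / 2.
    return NormalDist().inv_cdf(0.5 + level / 2)

for level in (0.95, 0.99):
    print(level, round(z_for_confidence(level), 3))
```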
Of the Point Estimate Method and the Interval Estimate Method, the former provides only a single value from the sample, with no tolerance or confidence level attached to it. The latter provides the accuracy of the estimate at a given confidence level. Further, it helps in hypothesis testing and becomes a basis for decision-making under conditions of uncertainty. The interval estimate, therefore, is superior to the point estimate in practical application.
Illustration 1
A universe consists of four numbers 3, 5, 7 and 9. Consider all possible samples of size two which can be drawn with replacement from the universe. Calculate the mean and variance. Further, examine whether the statistics are unbiased for the corresponding parameters. What are the sampling mean and sampling variance?
Solution:
Any one of the four numbers 3, 5, 7 and 9 drawn in the first draw can be associated with any one of these four numbers drawn at random with replacement in the subsequent (second) draw. Hence, the total number of possible samples of size 2 is 4 × 4 = 16, given by the cross product (3, 5, 7, 9) × (3, 5, 7, 9), as shown below.
= 250
Thus,
We conclude that the expected values of the sample mean and sample variance are the same as the corresponding population parameters. Hence, the sample statistics are unbiased estimates of the corresponding population parameters.
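The conclusion of this illustration can be checked by enumerating all 16 samples. The Python sketch below is an illustration only; it assumes that the sample variance is computed with the divisor n - 1 (which is what `statistics.variance` does) and the population variance with the divisor N (`statistics.pvariance`).

```python
from itertools import product
from statistics import mean, pvariance, variance

universe = [3, 5, 7, 9]
samples = list(product(universe, repeat=2))    # 16 ordered samples, with replacement

sample_means = [mean(s) for s in samples]
sample_vars = [variance(s) for s in samples]   # divisor n - 1

pop_mean = mean(universe)        # population mean
pop_var = pvariance(universe)    # population variance, divisor N

print(len(samples), mean(sample_means), pop_mean)
print(mean(sample_vars), pop_var)
print(pvariance(sample_means))   # variance of the sample mean
```

The average of the sample means equals the population mean, and the average of the sample variances equals the population variance, which is what unbiasedness means. The variance of the sample mean itself comes out as the population variance divided by n.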
Illustration 2
Consider a hypothetical universe of three numbers 2, 5 and 8. Draw all possible samples of size 2 and examine whether the statistics are unbiased for the corresponding parameters.
Solution
The given universe consists of three values, namely 2, 5 and 8. The total number of possible samples of size 2 (drawn with replacement) is 3 x 3 = 9, given by the cross product (2, 5, 8) x (2, 5, 8). Thus there are 9 samples of size 2. They are shown in the following table.
Illustration 3
Consider a population of 5 units with values 1, 2, 3, 4 and 5. Write down all possible samples of size 2 without replacement and verify that the sample mean is an unbiased estimate of the population mean. Also calculate the sampling variance and verify that (i) it agrees with the formula for the variance of the sample mean, and (ii) this variance is less than the variance obtained from sampling with replacement; (iii) find the standard error.
Solution
Thus, the variance of the sampling distribution of the mean agrees with the formula for the variance of the sample mean under sampling without replacement.
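The verification asked for in this illustration can be sketched in Python by listing all samples of size 2 drawn without replacement. This is an illustration only; the finite-population formula used, $\dfrac{\sigma^2}{n} \cdot \dfrac{N - n}{N - 1}$, is the standard one and is assumed to be the formula the text refers to.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pvariance

population = [1, 2, 3, 4, 5]
N, n = len(population), 2

samples = list(combinations(population, n))   # all samples without replacement
sample_means = [mean(s) for s in samples]

pop_var = pvariance(population)                   # population variance
var_without = pvariance(sample_means)             # variance of the sample mean
var_formula = (pop_var / n) * (N - n) / (N - 1)   # finite-population formula
var_with = pop_var / n                            # sampling with replacement
std_error = sqrt(var_without)

print(len(samples), mean(sample_means), mean(population))
print(var_without, var_formula, var_with, round(std_error, 3))
```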
Illustration 4
The Golden Cigarettes Company has developed a new blended tobacco product. The marketing department has yet to determine the factory price. A sample of 100 wholesalers was selected and asked about the price. Determine the sample mean for the following prices supplied by the wholesalers.
5.00
Solution
Thus, using the mean of the prices quoted by the sampled wholesalers as an estimator, the point
estimate of the mean price of Golden cigarettes is Rs. 5.50. Both the buyer and the seller can
accept this point estimate as a basis for fixing the price. The point estimate can save time and
expense for the producer of cigarettes.
Illustration 5
Sensing the downward trend in demand for a product, the financial manager was considering
shifting his company's resources to a new product area. He selected a sample of 10 firms in the
textile industry and recorded their earnings (in %) on investment. Find the point estimates of
the mean and variance of the population from the data given below.
Solution
Thus, the point estimates of the mean and of the variance of the population from which the
sample was drawn are 16 and 4.69 respectively.
Illustration 6
A random sample of 600 apples was taken from a large consignment and 66 of them were
found to be bad. Find the limits within which the proportion of bad apples lies at 99 per cent
confidence level.
Solution
We are given,
n = 600
Number of bad apples in the sample = 66; proportion of bad apples in the sample,
p = 66/600 = 0.11, so that q = 1 - p = 0.89.
SE of proportion = √(pq/n) = √(0.11 × 0.89/600) = 0.0128
Limits = p ± 3 SE = 0.11 ± 0.0384, i.e. 0.1484 and 0.0716
Hence, the proportion of bad apples in the consignment lies between the limits of 7.16 per cent
and 14.84 per cent.
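The limits in Illustration 6 can be reproduced with a small sketch. The function name is our own, and the multiplier is left as a parameter because the printed limits (0.0716 and 0.1484) correspond to p ± 3 SE, whereas the usual 99 per cent multiplier of 2.58 gives a slightly narrower interval.

```python
from math import sqrt

def proportion_limits(successes, n, z):
    """Return (lower, upper) limits p - z*SE and p + z*SE, with SE = sqrt(pq/n)."""
    p = successes / n
    se = sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se

# 66 bad items in a sample of 600.
low_3se, high_3se = proportion_limits(66, 600, 3.0)     # matches the printed limits
low_258, high_258 = proportion_limits(66, 600, 2.58)    # usual 99 per cent multiplier

print(round(low_3se, 4), round(high_3se, 4))
print(round(low_258, 4), round(high_258, 4))
```

The first pair differs from the printed figures only in the fourth decimal place, because the text rounds the standard error to 0.0128 before multiplying.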
Illustration 7
Out of 20,000 customers' ledger accounts, a sample of 500 accounts was taken to test the
accuracy of posting and balancing, wherein 40 mistakes were found. Assign limits within which
the number of defective cases can be expected to lie at 95 per cent confidence.
We are given,
Therefore,
= 0.0121
The 95 per cent confidence limits for the proportion of mistakes are given by p ± 1.96 SE
(the value for 95 per cent confidence is 1.96).
Illustration 8
In a sample of 1000 TV viewers, 330 watched a particular programme. Find 99 per cent
confidence limits for the proportion of TV viewers who watch this programme.
Solution
In notation
Illustration 9
Out of 1200 M.Com. students, a sample of 150 was selected at random to test the accuracy of
solving a problem in Quantitative Methods, and 10 made mistakes. Assign limits within which
the number of students who solved the problem wrongly in the whole universe of 1200 students
can be expected to lie at 99 per cent confidence level.
Solution
We are given
- End Of Chapter -
LESSON - 10
HYPOTHESIS
For a decision to be fruitful, one should collect a random sample as evidence for or against some
point of view or proposition. Such a point of view or proposition is termed a hypothesis. A
hypothesis is a proposition which can be put to test to determine its validity. A hypothesis, in
statistical parlance, is a statement about the nature of a population which is to be tested on the
basis of the outcome of a random sample.
Testing Hypothesis
The formulation of a hypothesis about a population parameter is the first step in testing a
hypothesis. The process of accepting or rejecting a null hypothesis on the basis of sample results
is called testing of hypothesis. The two hypotheses in a statistical test are normally referred to
as the null hypothesis and the alternative hypothesis.
A proposition set up for possible rejection is called the null hypothesis. In other words, it
asserts that there is no true difference between the sample and the population, and that the
difference found is accidental and unimportant, arising out of fluctuations of sampling. Hence
the word null, which means invalid, void or amounting to nothing. The decision-maker should
always adopt the null attitude regarding the outcome of the sample.
A null hypothesis consists of only a single parameter value and is simple while the alternative
hypothesis is usually composite. In any statistical test, there are four possibilities which are
termed as exhaustive decisions.
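They are:
(i) accepting Ho when Ho is true (correct decision),
(ii) rejecting Ho when Ho is true (Type I error),
(iii) accepting Ho when Ho is false (Type II error), and
(iv) rejecting Ho when Ho is false (correct decision).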
The decisions are expressed in the following dichotomous table:
The error of rejecting Ho when it is true is called Type I error and that of accepting Ho when Ho
is false is known as Type II error. The probability of Type I error is denoted by α (pronounced as
alpha) and the probability of Type II error is denoted by β (pronounced as beta). In practice, in
business and social science problems, it is more risky to reject a correct hypothesis than to
accept a wrong hypothesis. In other words, the consequences of Type I error are likely to be
more serious than the consequences of Type II error.
The quantity of risk tolerated in hypothesis testing is called the level of significance. It is
commonly set at 5 per cent or 1 per cent, which respectively account for moderate and high
precision.
The most commonly used tests are the t-test, F-ratio and Chi-square. The estimated value of the
parameter depends on the number of observations. The sample size, therefore, plays an
important role in testing of hypothesis and is taken care of by degrees of freedom. Degrees of
freedom are the number of independent observations in a set.
(e) Conclusion:
A statistical decision is a decision either to accept or to reject the null hypothesis, based on the
computed value in comparison with the critical value at the given level of significance. If the
computed value of the test statistic is less than the critical value, the difference is insignificant
and the null hypothesis is accepted; if it is more than the critical value, the difference is
significant and the null hypothesis is rejected at the given level of significance.
Test of Significance
The tests of significance available to know the significance or otherwise of variables in various
situations are
(i) comparing observation with expectation and thereon finding how far the deviation of one
from the other can be attributed to variations of sampling,
It is difficult to draw a line of demarcation between large and small samples, but a common
view among statisticians is that a sample is to be regarded as large only if its size exceeds 30;
if the sample size is less than 30, it is regarded as a small sample. The tests of significance used
for large samples are different from those for small samples, the reason being that the
assumptions made in the case of large samples do not hold good for small samples. The
assumptions made in dealing with problems relating to large samples are: (a) the random
sampling distribution of a statistic is approximately normal and (b) the values given by sample
data are sufficiently close to the population values and can be used in their place for calculating
the standard error of the estimate.
In the case of small samples, the above assumptions no longer hold good. It should be noted
that the estimates will vary from sample to sample if we work with very small samples. We
must be satisfied with relatively wide confidence intervals. Of course, the wider the interval, the
less is the precision. An inference drawn from a large sample is far more precise in the
confidence limits it sets up than an inference based on a much smaller sample. Though drawing
a precise line of demarcation between the large sample and the small sample is not always easy,
the division of their theories is a very real one. As a rule, the theory and methods of small
samples are applicable to large samples, but the reverse is not true.
Large Samples
Illustration 10
Compute the standard error of mean from the following data showing the amount paid by 100
firms on the occasion of Deepavali.
50-60 10
80-90 25
Solution
Illustration 11
For a random sample of 100, the mean height is 63 inches. The standard deviation of the height
distribution of the population is known to be 3 inches. Test the statement that the mean height
of the population is 66 inches at 0.05 level of significance. Also set up 99 per cent confidence
limits (0.01 level) for the mean height of the population.
Solution
Since the difference is more than 1.96 SE at 0.05 level, it could not have arisen due to
fluctuations of sampling. Hence, the hypothesis is rejected. In other words, the mean height of
the population could not be 66 inches.
99 per cent confidence limits: x̄ ± Z SE = 63 ± 2.58 × 0.3 = 63 ± 0.774
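The working of Illustration 11 can be sketched in Python as follows. The function name is our own; the critical values 1.96 and 2.58 are the normal-curve values used in the text.

```python
from math import sqrt

def z_test_mean(sample_mean, mu0, sigma, n):
    """Return (z, SE) for testing a mean when the population s.d. is known."""
    se = sigma / sqrt(n)
    return (sample_mean - mu0) / se, se

z, se = z_test_mean(63, 66, 3, 100)      # SE = 3 / sqrt(100) = 0.3
reject_at_5_per_cent = abs(z) > 1.96

margin_99 = 2.58 * se                    # 0.774
limits_99 = (63 - margin_99, 63 + margin_99)

print(round(z, 2), reject_at_5_per_cent)
print(round(limits_99[0], 3), round(limits_99[1], 3))
```

Here z is about -10, far beyond 1.96, so the hypothesis of a 66-inch mean is rejected, and the 99 per cent limits work out to 62.226 and 63.774 inches.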
Illustration 12
A sample of 900 items is taken from a normal population whose mean is 6 and variance is 6. If
the sample mean is 6.60, can the sample be regarded as a truly random sample? Give necessary
justification for your conclusions.
Solution
Let us take the hypothesis that there is no difference between the sample mean and the
population mean.
Since the difference is more than 1.96 SE at 0.05 level, it could not have arisen due to variations
of sampling. Hence, the sample cannot be regarded as truly random sample.
Illustration 13
If it costs a rupee to draw one member of a sample, how much would it cost in sampling from a
universe with mean 100 and standard deviation 9 to take a sufficient number to ensure that the
mean of the sample would be within 0.015 per cent of the true value at 0.05 level? Find the extra
cost necessary to double this precision.
Solution
The difference between sample mean and population mean = 0.015 (given)
For 95 percent confidence, difference between sample mean and population mean should be
equal to 1.96 SE.
0.015 = 1.96 SE
Therefore,
Hence, to double the precision, the extra cost would be Rs. 41,48,928.
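The sample-size arithmetic of Illustration 13 can be checked with the sketch below. It rests on the relation stated in the solution, margin = 1.96 SE with SE = σ/√n, so that n = (1.96 σ / margin)²; the function name is our own.

```python
def required_sample_size(sigma, margin, z=1.96):
    """Sample size n such that z * sigma / sqrt(n) equals the permitted margin."""
    return (z * sigma / margin) ** 2

# 0.015 per cent of the mean 100 is a margin of 0.015.
n_original = round(required_sample_size(9, 0.015))
# Doubling the precision halves the permitted margin.
n_doubled = round(required_sample_size(9, 0.0075))

cost_original = n_original            # Re. 1 per unit drawn
extra_cost = n_doubled - n_original

print(n_original, n_doubled, extra_cost)   # 1382976 5531904 4148928
```

Halving the margin quadruples the sample size, so the extra cost is three times the original cost, i.e. Rs. 41,48,928.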
Illustration 14
The average number of defective articles in a factory is claimed to be less than for all the
factories whose average is 30.5. A random sample showed the following distribution.
Calculate the mean and standard deviation of the sample and use it to test the claim that the
average is less than the figure for all the factories at 0.05 level of significance.
Solution
Since |Z| is more than 1.96, it is significant at 0.05 level of significance. Hence we reject the
null hypothesis and conclude that the sample mean and population mean differ significantly. In
other words, the manufacturer's claim that the average number of defectives in his product is
less than the average figure for all the factories is valid.
Illustration 15
A random sample of 100 articles selected from a batch of 2000 articles shows that the
average diameter of the articles is 0.354 with a standard deviation of 0.048. Find the 95 per cent
confidence interval for the average diameter of this batch of 2000 articles.
Solution
We are given,
Illustration 16
A sample of 400 male students is found to have a mean height of 171.38 cms. Can it reasonably
be regarded as a sample from a large population with mean height of 171.17 cms. and standard
deviation 3.30 cms?
Solution
Ho: μ = 171.17
We are given,
Since |Z| is less than 1.96 at 0.05 level of significance, we accept the null hypothesis. In other
words, the sample of 400 has come from the population with mean height of 171.17.
Illustration 17
Mrs. P, an insurance agent in Anantapur Division, has claimed that the average age of
policy-holders who insure through her is less than the average for all the agents, which is 30.5
years. A random sample of 60 policy-holders who had insured through her gave the following
age distribution.
Calculate the mean and standard deviation of this distribution and use these values to test her
claim at the 5 per cent level of significance (95 per cent confidence). You are given that Z at 95
per cent confidence is 1.96.
Solution
Since |Z| is less than 1.96 at 0.05 level of significance, the difference could have arisen due to
sampling variation. Hence, the difference is insignificant. In other words, Mrs. P's claim that the
average age of policy-holders who insure through her is less than the average for all the agents,
30.5 years, is not supported by the sample.
Illustration 18
A random sample of 1000 workers from South India shows a mean wage of Rs. 47 per
week with a standard deviation of Rs. 28. A random sample of 1500 workers from North India
gives a mean wage of Rs. 49 per week with a standard deviation of Rs. 40. Is there any
significant difference between their mean wages?
Solution
We are given two independent samples. Let their means be denoted by x̄1 and x̄2, and their
sizes by n1 and n2 respectively. The given values are:
Illustration 19
Mrs. Lahari has selected two markets, A and B, at different locations of a city in order to make a
survey of the buying habits of customers. 400 women shoppers are chosen at random in market
A. Their average monthly expenditure on food is found to be Rs. 1050 with a standard deviation
of Rs. 44. The corresponding figures are Rs. 1020 and Rs. 56 respectively in market B, where
also 400 women shoppers are chosen at random. Test at 1 per cent level of significance whether
the average monthly food expenditures of the two populations of shoppers are equal.
Solution
Since |Z| is 8.43, which is much greater than 2.58, the value of Z at 1 per cent level of
significance, it is highly significant. Hence, the data do not provide any evidence to accept the
hypothesis. In other words, the monthly expenditures of the two populations of shoppers in
markets A and B differ significantly.
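The Z value of Illustration 19 can be verified with a short sketch. It assumes the usual large-sample standard error of the difference between two means, √(s1²/n1 + s2²/n2); the function name is our own.

```python
from math import sqrt

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Z for the difference between the means of two large independent samples."""
    se_difference = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se_difference

z_markets = two_sample_z(1050, 44, 400, 1020, 56, 400)
significant_at_1_per_cent = abs(z_markets) > 2.58

print(round(z_markets, 2), significant_at_1_per_cent)
```

The unrounded computation gives Z of about 8.42; the 8.43 quoted above presumably reflects rounding the standard error to 3.56 before dividing. Either way the value is far beyond 2.58.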
Illustration 20
Solution
Let the hypothesis be that there is no significant difference between the performance of the
students at colleges A and B.
Conclusion
i. At 0.05 level of significance: since |Z| is 2.56, which is greater than 1.96, the difference is
significant and therefore we reject the hypothesis. In other words, we conclude that the mean
grades of the students of colleges A and B are different at 0.05 level of significance.
ii. At 0.01 level of significance: the value of |Z| is 2.56, which is less than the value of Z at 1 per
cent level of significance, i.e. 2.58. Thus, the data are consistent with the hypothesis and we
conclude that the mean grades of the students of colleges A and B are almost the same.
Illustration 21
Random samples drawn from two States give the following data relating to the heights of adult
males.
Solution
Let the hypothesis be that there is no significant difference in the mean height of adult males of
the two States.
Since |Z| value is less than 1.96 at 0.05 level of significance, we accept the hypothesis.
Illustration 22
A sample survey of 121 boys about their intelligence gives a mean of 84, with a standard
deviation of 10. The population standard deviation is 11. Has the sample come from the
population?
Solution
Since |Z| is less than 1.96 at 0.05 level of significance, we conclude that the sample with
standard deviation of 10 has come from the population.
Illustration 23
The mean yield of two sets of plots and their variability are given below. Examine whether the
difference in the variability of yields is significant.
Solution
Since the |Z| value is much greater than 2.58 at 0.01 level of significance, the difference could
not have arisen due to variations of sampling. We may reject the hypothesis.
Illustration 24
A sample of heights of 6400 Englishmen has a mean of 67.85 inches and a standard deviation of
2.56 inches, while a sample of heights of 1600 Australians has a mean of 68.55 inches and a
standard deviation of 2.52 inches. Do the data indicate that Australians are, on an average,
taller than Englishmen?
Solution
- End Of Chapter -
LESSON - 11
SMALL SAMPLES
If the sample size is less than 30, it is termed a small sample. The greatest contributions to the
theory of small samples are those of W.S. Gosset (t-test), R.A. Fisher (F-test) and Karl Pearson
(Chi-square test).
(a) t-test
(i) the parent population from which the sample is drawn is normally distributed,
(ii) the sample observations are random and independent of each other and
Single Mean
We calculate the t statistic to determine whether the mean of a sample drawn from a normal
population deviates significantly from a stated value (the hypothetical population mean). The
statistic is defined as:
t = (x̄ - μ) √n / s, where s² = Σ(x - x̄)² / (n - 1)
If the calculated |t| is more than the tabulated t for n-1 degrees of freedom at a certain level of
significance, we say it is significant and Ho is rejected. If the calculated |t| is less than the
tabulated t, Ho may be accepted at the adopted level of significance.
Illustration 25
A machine is designed to produce insulating washers for electrical devices of average thickness
of 0.025 cms. A random sample of 10 washers was found to have an average thickness of 0.024
cms with a standard deviation of 0.002 cms. Test the significance of the deviation.
Solution
We are given,
μ = 0.025 cms.
Since |t| is 1.58, which is less than the tabulated value for 9 d.f. at 0.05 level of significance, the
deviation is not significant. Ho is accepted.
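The t value of Illustration 25 can be reproduced with the sketch below. It assumes t = (x̄ - μ)/(s/√n), which is the form that yields the 1.58 quoted above, and uses the two-tailed table value 2.262 for 9 d.f. at 0.05; the function and constant names are our own.

```python
from math import sqrt

def one_sample_t(sample_mean, mu0, s, n):
    """t = (sample_mean - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    return (sample_mean - mu0) / (s / sqrt(n))

TABLE_T_9DF_5PC = 2.262   # two-tailed table value for 9 d.f. at 0.05

t_washers = one_sample_t(0.024, 0.025, 0.002, 10)
significant = abs(t_washers) > TABLE_T_9DF_5PC

print(round(abs(t_washers), 2), significant)   # 1.58 False
```

Since 1.58 is below 2.262, the deviation of the sample mean from 0.025 cms is not significant.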
Illustration 26
A random sample of 16 values from a normal population showed a mean of 41.5 inches and the
sum of squares of deviations from this mean equal to 135 square inches. Show that the
assumption of a mean of 43.5 inches for the population is not reasonable. Obtain 95 per cent
confidence limits for the same.
Solution
Since |t| is more than the table value of 2.131 at t0.05 for 15 d.f., Ho is rejected. We
conclude that the assumption of a mean of 43.5 inches for the population is not reasonable.
41.5 ± 1.598
Therefore,
Illustration 27
Ten individuals are chosen at random from a population and their heights (in inches) are found
to be: 62, 63, 66, 68, 69, 71, 70, 68, 71 and 66. In the light of these data, mentioning the null
hypothesis, discuss the suggestion that the mean height in the population is 66 inches.
Solution
The table value for 9 d.f. at t0.05 is 2.262, which is more than the calculated value. The
difference is not significant. Hence the hypothesis may be accepted at 0.05 level of significance.
We conclude that the mean height in the population may be regarded as 66 inches.
Illustration 28
A group of 5 patients treated with medicine A weigh 42, 39, 48, 60 and 41 kgs; a second group of
7 patients from the same hospital treated with medicine B weigh 38, 42, 56, 64, 68, 69 and 62
kgs. Do you agree with the claim that medicine B increases the weight significantly?
Solution
Since the calculated value (1.70) is less than the table value of 2.228 at t0.05 for 10 d.f., the
difference is insignificant. We conclude that medicine A and medicine B do not differ
significantly as regards their effect on increase in weight.
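The calculation behind Illustration 28 can be sketched as follows. It assumes the pooled estimate S² = [Σ(x - x̄)² + Σ(y - ȳ)²]/(n1 + n2 - 2) and t = (x̄ - ȳ)/(S √(1/n1 + 1/n2)), which reproduce the 1.70 quoted above; the function name is our own.

```python
from math import sqrt
from statistics import mean

def pooled_t(x, y):
    """Two-sample t with a pooled standard deviation; returns (t, degrees of freedom)."""
    n1, n2 = len(x), len(y)
    m1, m2 = mean(x), mean(y)
    ss1 = sum((v - m1) ** 2 for v in x)   # sum of squared deviations, sample 1
    ss2 = sum((v - m2) ** 2 for v in y)   # sum of squared deviations, sample 2
    s_pooled = sqrt((ss1 + ss2) / (n1 + n2 - 2))
    t = (m1 - m2) / (s_pooled * sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

medicine_a = [42, 39, 48, 60, 41]
medicine_b = [38, 42, 56, 64, 68, 69, 62]

t_value, df = pooled_t(medicine_a, medicine_b)
print(round(abs(t_value), 2), df)   # 1.7 10
```

The sample means are 46 and 57 kgs, the pooled variance is 121.6, and |t| of about 1.70 on 10 d.f. falls short of the table value 2.228.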
Illustration 29
The average numbers of articles produced by two machines per day are 200 and 250, with
standard deviations of 20 and 25 respectively, on the basis of records of 25 days' production.
Can you regard both the machines as equally efficient at 0.01 level of significance?
S (pooled estimate of standard deviation on the basis of given standard deviations)
Since the calculated |t| is more than the table value (1.96 at 0.05 for 48 d.f.), it is highly
significant and hence we reject Ho. We conclude that the performance of the two machines
differs significantly.
In the previous test, the two samples were independent. But there are many situations in which
this condition does not hold true, in the sense that we have dependent samples. Two samples are
said to be dependent in nature if the elements in one sample are related to those in the other.
The t-test for paired observations is defined as:
The calculated |t| is less than the table value of 2.776 at t0.05 for 4 d.f. and hence the
hypothesis is accepted. We conclude that the students have not benefited by the training
programme.
Illustration 31
A group of 10 children were tested to find out how many digits they could repeat from memory
after hearing them once. They were given practice for a week and were then tested again. Is the
difference between the performances of the 10 children at the two tests significant?
Solution
Ho: There is no significant difference between the performance of the children with practice
and without practice.
(b) F-Test
The main object of the F-test is to discover whether the samples have come from the same
universe. We can get the answer to this problem by the study of Analysis of Variance,
abbreviated as ANOVA. For testing the difference among more than two samples, the F-test is
an alternative to the t-test, which can be applied to test the difference between only two
samples. The test consists of classifying and cross-classifying statistical results, and testing the
significance of the difference between the sample statistics as well as among the sample
statistics. The analysis of variance is studied by (a) one-way classification and (b) two-way
classification.
If we observe the variation of the variables with one factor, it is known as one-way classification.
If we observe the variation of the variables with two factors, it is called two-way classification.
The estimate of population variance which is based on variation between the groups is known as
the mean square between groups. The estimate of population variance that is based on the
variation within groups is known as the mean square within groups. Since the F-test is based on
the ratio of two variances, it is also known as the Variance Ratio test. The variance ratio is
denoted by F. It is given by:
F = (mean square between groups) / (mean square within groups)
If the calculated value of F is less than the table value, the difference between the groups is not
significant and all the groups may be regarded as drawn from the same normal population. To
calculate the variance ratio, one must determine the following values:
Illustration 32
To test the significance of possible variation in performance in a test between the grammar
schools of a city, a common test was given to a number of students taken at random from
the fifth class of each of the four schools concerned. The results are given below. Is there any
significant difference between the means of the samples?
(iv) Sum of squares within groups i.e. Schools
= 258 - 50 = 208
The above information may be presented in the form of a table, known as ANOVA table.
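The quantities that go into a one-way ANOVA table can be sketched in Python as below. The three groups are hypothetical figures chosen only to keep the arithmetic simple; they are not the data of the four schools, and the function name is our own.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (SS between, SS within, d.f. between, d.f. within, F) for a one-way layout."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio

# Hypothetical scores for three groups (illustrative only).
hypothetical_groups = [[8, 10, 12], [7, 9, 11], [12, 14, 16]]
ss_b, ss_w, df_b, df_w, f_ratio = one_way_anova(hypothetical_groups)

print(ss_b, ss_w, df_b, df_w, f_ratio)   # 42 24 2 6 5.25
```

The total sum of squares (66 here) always splits into the between-groups and within-groups parts, just as 258 splits into 50 and 208 in the illustration above. The computed F is then compared with the table value of F for (2, 6) degrees of freedom.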
Illustration 33
A tea company appoints four salesmen A, B, C and D; and observes their sales in three seasons -
Summer, Winter and Monsoon. The figures (in lakhs) are given as :
Solution
Let us take the hypothesis that there is no significant variation in sales between the salesmen.
Conclusion
The calculated value of F between salesmen is 0.67, which is less than the table value. Hence we
conclude that the sales of different salesmen do not differ significantly.
- End Of Chapter -
LESSON - 12
CHI SQUARE TEST
The significance for small samples has been tested by t and F on the assumption that the
samples were drawn from a normally distributed population. Since testing the significance
requires an assumption about the parameters (i.e. population values such as mean, standard
deviation, correlation etc.), the t and F tests are called PARAMETRIC tests. In reality, all
distributions of variables pertaining to social, economic and business situations may not be
normal. This limitation of the t and F tests has led to the development of a group of alternative
techniques (tests) known as NON-PARAMETRIC or distribution-free methods.
In the study of non-parametric tests, no assumption about the parameters of the population
from which we draw samples is required. The increasing use of non-parametric tests in
economics and business is on account of three reasons, namely:
(ii) they are computationally easier to handle and understand than parametric tests; and
(iii) they can be used with types of measurements that prohibit the use of parametric tests.
Of course, the non-parametric tests are popular in application but are not superior to the
parametric methods. In fact, in situations where both tests apply, the parametric tests are more
desirable than the non-parametric tests.
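The compatibility between theory and experiment is tested by the chi-square statistic, which is
defined as
χ² = Σ (Oi - Ei)² / Ei
where Oi refers to observed frequencies and Ei refers to expected frequencies. It is computed by
(i) taking the difference between an observed frequency and expected frequency,
(ii) squaring the difference,
(iii) dividing the squared difference by the expected frequency, and
(iv) adding the quotients over all the classes.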
Then compare this value with the table value for the desired degrees of freedom. By degrees of
freedom we mean the number of classes to which the values can be assigned arbitrarily. The
degrees of freedom, denoted by d.f., are n-1 for n classes. In a contingency table, the d.f. are
(r-1)(c-1), where r and c refer to the number of rows and columns respectively. The expected
frequency for any cell is E = RT x CT/N, where RT is the row total, CT is the column total and N
is the total frequency.
If the calculated value of chi-square is less than the table value of chi-square, it is said to be
non-significant at the required level of significance. This implies that the discrepancy between
observed (experiment) values and expected (theory) values may be attributed to chance, i.e.
variations of sampling. In other words, the data do not provide any evidence against the
hypothesis, which may be accepted. In brief, we may conclude that there is good correspondence
or good fit between theory and experiment. On the other hand, if the calculated value is greater
than the tabulated value, it is said to be significant. In other words, we conclude that the
discrepancy between observed and expected frequencies cannot be attributed to chance; hence
the experiment does not support the theory.
i. The total frequency should be sufficiently large, say greater than 50.
ii. The sample observations should be independent in the sense no individual item should be
included twice or more in the same sample.
iii. The constraints must be linear. Equations containing no squares or higher powers of the
frequencies are linear constraints (such as ΣO = ΣE).
iv. The expected frequency of each cell should preferably be 10 or more, and in no case less than
5. If it is less than 5, the technique of pooling is to be applied.
Illustration 34
The numbers of automobile accidents per month were as follows: 48, 32, 80, 10, 54, 42, 60, 26,
38, 20. Are these frequencies in agreement with the belief that accident conditions were the
same during the 10-month period?
Solution
Since the calculated value is more than the table value of 16.919 for 9 d.f. at 0.05, it is significant
and null hypothesis is rejected. We conclude that the accidents are certainly not uniform during
the period of 10 months.
Illustration 35
A sample analysis of examination results of 500 students was made. It was found that 220
students had failed, 170 had secured a third class, 90 were placed in second class and 20 got a
first class. Are these figures commensurate with the general examination result policy which is
in the ratio of 4:3:2:1 for the various categories respectively.
Solution
Ho:
The observed figures do not differ significantly from the ratio of 4 : 3 : 2 : 1. The frequency
distribution of the results of 500 students is as follows:
Since the calculated value of chi-square (23.667) is more than the table value of 7.815 for 3 d.f. at
0.05, the difference is significant and Ho is rejected. We conclude that the data are not
commensurate with the general examination result policy.
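The chi-square values of Illustrations 34 and 35 can be checked with the following sketch. It assumes the expected frequencies are obtained by splitting the total frequency in the hypothesised ratio (equal shares for the accident data); the function names are our own.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def expected_from_ratio(total, ratio):
    """Split a total frequency in the given ratio."""
    return [total * r / sum(ratio) for r in ratio]

# Illustration 35: results of 500 students against the 4:3:2:1 policy.
results = [220, 170, 90, 20]
chi_results = chi_square(results, expected_from_ratio(sum(results), [4, 3, 2, 1]))

# Illustration 34: monthly accidents against equal expected frequencies.
accidents = [48, 32, 80, 10, 54, 42, 60, 26, 38, 20]
chi_accidents = chi_square(accidents, expected_from_ratio(sum(accidents), [1] * 10))

print(round(chi_results, 3))     # 23.667
print(round(chi_accidents, 2))   # 93.12
```

For the examination results the expected frequencies are 200, 150, 100 and 50, giving 23.667 against the table value 7.815 for 3 d.f. For the accidents the expected frequency is 41 per month, and the computed value far exceeds 16.919 for 9 d.f.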
Illustration 36
Records of the number of male and female births in 800 families having four children are
given below.
No. of male births : 0 1 2 3 4
Frequency : 32 178 290 236 64
Test whether the data are consistent with the hypothesis that the binomial law holds and the
chance of male birth is equal to that of a female birth.
How to fit binomial distribution: Suppose a random experiment consists of n trials satisfying
the conditions of the binomial distribution, and suppose this experiment is repeated N times.
Then the frequency of r successes is given by the formula
P(r) = N × p(r) = N × nCr × p^r × q^(n-r)
r = 0, 1, 2, 3, ..., n
Since the calculated value of chi-square is 19.63, which is greater than the table value of 9.488
for 4 d.f. at 0.05, it is significant. Thus, the difference between the observed and expected
frequencies is significant, hence the null hypothesis is rejected. We conclude that the hypothesis
of equal chances of male and female births is not tenable and the binomial distribution with
p = q = 0.5 is not a good fit to the given data.
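The fitting of the binomial distribution in Illustration 36 can be sketched as below, using the formula N × nCr × p^r × q^(n-r) with N = 800, n = 4 and p = q = 0.5. The function name is our own.

```python
from math import comb

def binomial_expected(N, n, p):
    """Expected frequencies N * nCr * p^r * q^(n-r) for r = 0, 1, ..., n."""
    q = 1 - p
    return [N * comb(n, r) * p ** r * q ** (n - r) for r in range(n + 1)]

observed = [32, 178, 290, 236, 64]
expected = binomial_expected(800, 4, 0.5)
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(expected)           # [50.0, 200.0, 300.0, 200.0, 50.0]
print(round(chi_sq, 2))   # 19.63
```

Because p = q, the expected frequencies are symmetric, so the result does not depend on whether the classes are counted by male or by female births.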
Illustration 37
A set of 5 coins is tossed 3200 times, and the number of heads appearing each time is noted. The
results are given below.
No. of heads : 0 1 2 3 4 5
Frequency : 80 570 1100 900 500 50
Solution
χ² = 58.80
The calculated value is more than the table value of 11.070 for 5 d.f. at 0.05. Hence the
hypothesis is rejected. We conclude that the coins are biased.
Illustration 38
In an experiment on pea-breeding, Reddy got the following frequencies of seeds: 315 round and
yellow, 101 wrinkled and yellow, 108 round and green, 32 wrinkled and green, total 556. Theory
predicts that the frequencies should be in the proportion of 9:3:3:1. Examine the
correspondence between theory and experiment.
Solution
Ho:
Result:
Contingency Table
A table having R rows and C columns is known as a contingency table. Each row corresponds to
a level of one variable, each column to a level of another variable. Entries in the body are the
frequencies with which each variable combination occurs. A table having 3 rows and 2 columns
is called a 3 x 2 contingency table, while one having 2 rows and 3 columns is known as a 2 x 3
contingency table. Thus, the order of the table depends on the number of rows and columns.
Illustration 39
Out of a sample of 120 persons in a village, 76 persons were administered a new drug for
preventing influenza and out of them, 24 were attacked by influenza. Out of those who were not
administered the new drug, 12 persons were not affected by influenza. (i) Prepare a 2 x 2 table
showing actual and expected frequencies, and (ii) use the chi-square test to find out whether the
new drug is effective or not.
= 18.9687
Since the calculated value is greater than the table value of 3.84 for 1 d.f. at 0.05, the null
hypothesis is rejected. We conclude that the new drug is effective in controlling, i.e. preventing,
the disease (influenza).
The computation of chi-square as above is quite a tedious and time-consuming process. The
chi-square can be conveniently computed by using the alternative formula.
The chi-square value of 18.96 is approximately the same as that obtained earlier.
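Both routes to chi-square for a 2 x 2 table can be sketched as follows. The cell values are derived from the statement of Illustration 39 (drug: 24 attacked, 52 not attacked; no drug: 32 attacked, 12 not attacked), and the shortcut is assumed to be the standard one, χ² = N(ad - bc)² / [(a+b)(c+d)(a+c)(b+d)]; the function name is our own.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square for the table [[a, b], [c, d]] by the long method and by the shortcut."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    observed = ((a, b), (c, d))
    # Long method: expected frequency of each cell = row total x column total / N.
    long_form = sum(
        (observed[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
        for i in range(2) for j in range(2)
    )
    shortcut = n * (a * d - b * c) ** 2 / (rows[0] * rows[1] * cols[0] * cols[1])
    return long_form, shortcut

long_form, shortcut = chi_square_2x2(24, 52, 32, 12)
print(round(long_form, 2), round(shortcut, 2))   # 18.96 18.96
```

The two methods are algebraically identical; without intermediate rounding both give about 18.96, so the small gap between 18.9687 and 18.96 in the worked solution is presumably due to rounding of the expected frequencies.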
The chi-square test is also used to know whether a random sample has been drawn from a
normal population with mean μ and variance σ². The statistic
Illustration 40
The weights (in kgs) of 10 students are given below:
38, 40, 45, 53, 47, 43, 55, 48, 52, 49. Can we say that the variance of the distribution of weights
of all students from which the above sample of 10 students was drawn is equal to 20 square kgs?
Solution
Now , we have
The calculated value is less than the table value; it is not significant and hence we accept the
null hypothesis. The given data are consistent with the hypothesis that the variance of the
distribution of weights of all students in the population is 20 square kgs.
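The test of Illustration 40 can be sketched as below, assuming the statistic χ² = Σ(x - x̄)² / σ² with n - 1 = 9 degrees of freedom and the table value 16.919 for 9 d.f. at 0.05 quoted earlier in this lesson. The variable names are our own.

```python
from statistics import mean

weights = [38, 40, 45, 53, 47, 43, 55, 48, 52, 49]
sigma2_hypothesised = 20            # hypothesised population variance (square kgs)
TABLE_CHI2_9DF_5PC = 16.919         # table value for 9 d.f. at 0.05

x_bar = mean(weights)
sum_sq_dev = sum((w - x_bar) ** 2 for w in weights)
chi_sq = sum_sq_dev / sigma2_hypothesised
accept_h0 = chi_sq < TABLE_CHI2_9DF_5PC

print(x_bar, sum_sq_dev, chi_sq, accept_h0)   # 47 280 14.0 True
```

The sample mean is 47 kgs, the sum of squared deviations is 280, and χ² = 280/20 = 14, which is below the table value.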
Illustration 41
A random sample of size 20 from a population gives the sample standard deviation of 6. Test the
hypothesis that the population standard deviation is 9.
Solution
n = 20 and s = 6
The calculated value is less than the table value of 30.144 for 19 d.f. at 0.05, hence Ho may be
accepted. In other words, the population standard deviation of 9 may be accepted.
Illustration 42
Test the hypothesis that variance = 64 given that sample variance 100 for a random sample of
size 51.
Solution
Since Z is greater than 1.96, the null hypothesis is rejected. In other words, the
random sample cannot be regarded as drawn from the population with variance 64.
EXERCISES
2. What do you mean by Point Estimation and Interval Estimation? Explain with
illustrations.
3. What is standard error? State the application of standard error in large sample test of
statistical hypothesis.
5. What is meant by the sampling distribution of a statistic? Describe briefly the sampling
distribution of mean and variance.
6. What do you understand by small sample tests? How are they different from large
sample tests?
7. What do you understand by t-test, F-ratio and Chi-square? Discuss their application.
8. A sample of 100 families selected at random gives an average income of Rs. 8000 with a
standard deviation of Rs. 3000. Estimate the confidence interval of mean income of
families at 95 per cent and 99 per cent. Ans: at 95%: Rs. 7412 and Rs. 8588; at 99%:
Rs. 7226 and Rs. 8774
9. 50 out of 500 machine parts tested are found to be defective. Estimate the 95 per cent
confidence interval for the proportion of defective machine parts.
REFERENCE BOOKS
Agarwal, B.L. "Basic Statistics", Wiley Eastern Limited, New Delhi, 1994.
Gupta, S.P. "Statistical Methods", Sultan Chand and Sons, New Delhi, 1992.
Lehmann, E.L. "Testing Statistical Hypotheses", Wiley Eastern Limited, New Delhi, 1976.
Levin, R.I. "Statistics for Management", Prentice-Hall of India, New Delhi, 1987.
Reddy, C.R. "Quantitative Methods for Management Decision", Himalaya Publishing House,
Bombay, 1990.
Rao, C.R. "Linear Statistical Inference and its Applications", Wiley Eastern Limited, New Delhi,
1977.
- End Of Chapter -
LESSON - 13
Introduction
The study of a single variable with regard to its distribution, moments etc. is one facet of
statistical studies. Another aspect of statistical analysis is the joint study of two or more
variables in respect of their interdependence or functional relationships. While studying the
interdependence, we study the correlation between two or more variables. Statistics does
not explain the cause-and-effect relationship between the variables; that has to be explored through other
considerations. The theory of correlation and association between variables was largely
developed by Karl Pearson and G. Udny Yule in the early twentieth century. The idea of dependence
between two or more quantitative normal variates further led to the theory of regression. In
regression analysis, one has to deal with two or more interdependent variables. Among the
variables under consideration, we have two sets of variables: one is known as the dependent variable
and the other as the independent variable(s).
For instance, the weight of a person is related to his height. So we may find the regression of height
on weight or of weight on height. Age of wife and age of husband at the time of marriage is another
example of interdependent variables. But such situations seldom prevail. Mostly, there is a
variable which depends on one or more other variables. For instance, the yield of a crop depends on the
fertilizer dose. In this situation, yield is the dependent variable and fertilizer dose is the
independent variable. The production cost of a unit depends on the cost of raw material,
labour cost, cost of electricity, transportation etc. In this example, production cost is the
dependent variable and the cost of raw material, labour cost, cost of electricity,
transportation cost etc. are independent variables.
From the above discussion it is apparent that in regression analysis we deal with two types of
variables, the dependent variable and the independent variable(s). Without going beyond the
scope of the curriculum, it would not be wrong to say that there is one dependent variable, called the
response variable, which depends on one or more variables, called independent variables.
The independent variables are also termed regressors, predictors, explanatory variables etc.
The concept of regression was first developed by Francis Galton in a study of the inheritance of
stature in human beings. To prove this biometrical fact, Karl Pearson found the regression of
son's height on father's height. But soon the use of the regression technique was extended to a large
number of sciences like Economics, Sociology, Psychology, Medical Sciences, Zoology, Breeding,
Agronomy, Management etc. Nowadays it is one of the most frequently used tools of statistics.
Objective
The literal meaning of the word regression is to go back or to step back. Sir Francis Galton used
this word in the sense of 'regression towards mediocrity in hereditary stature'. But nowadays it is used much more widely, without
any heed to its old conceptual sense. The objectives of regression can be delineated
in the following manner.
1. The foremost objective of regression is to establish the actual functional relationship
between the dependent and independent variables.
2. To estimate the value of the dependent variable for a given value of X.
3. The regression equation can very well be used as a prediction equation. A value of the
dependent variable can easily be estimated for the given value(s) of the independent
variable or variables.
4. It helps to find the trend in the analysis of time series.
5. Regression is used for projections such as population projection, production of cereal crops
etc.
Some writers who throw light on the objectives of regression are quoted here.
Regression analysis measures the nature and extent of this relation, thus enabling
us to make predictions.
It is often more important to find out what the relation actually is, in order to estimate or
predict one variable (the dependent variable); and the statistical technique appropriate to such a
case is called regression analysis.
The device used for estimating the value of one variable from the value of the other consists of a
line through the points, drawn in such a manner as to represent the average relationship between
the two variables. Such a line is called the line of regression. Once the objective of regression
analysis is clear, the problem one faces is to establish a suitable and appropriate
statistical model, which has to be fitted using the actual data.
Regression Model
A statistical model for the regression equation between a dependent variable Y and the independent
variables X1, X2, ..., Xk, involving the parameters θ0, θ1, θ2, ..., θk and the error term, can in general
be given as:
Is a linear regression equation and of the type
In the case of linear regression with two independent variables X1 and X2, the regression model is
given as
In the same way any other linear or curvilinear regression model can be given.
Scatter diagram
When we wish to set up a regression equation between two variables Y and X, an idea about the
type of relationship between the two can easily be obtained by plotting the n paired values (X1,
Y1), (X2, Y2), ..., (Xn, Yn) on graph paper. If these points lie on a straight line or in the close
vicinity of a line, then a linear relationship is considered appropriate. If some other known
pattern is observed, the regression equation has to be chosen accordingly. While plotting the
points (Xi, Yi) for i = 1, 2, ..., n, one should always take Y, the dependent variable, along the ordinate (Y-
axis) and the independent variable X along the abscissa (X-axis), choosing suitable scales.
If the plotted points do not show any pattern, then the two variables are considered to be
independent. In this situation, rarely do more than two or three points lie on a straight line or in the pattern
of a known curve.
Now we give below two scatter diagrams and discuss them in brief.
In fig. 13.1, a few points lie on the straight line and the others are just above or below the line. A line
of best fit will be one for which the distances of all the points from the line are
minimum. If the distances of points above the line are taken as positive and those below the line as negative,
the sum of the distances is almost zero.
In fig. 13.2, it is easy to note that hardly any three points lie on a straight line or show any known
pattern. Hence, the variables X and Y are treated as independent and no regression equation is
possible. In such a situation, no path is discernible.
Example 1:
Following are the scores out of 100 obtained in a test by the sales representatives and their sales
performance in lac rupees.
The figure reveals that the points lie on and around a straight line. Hence, simple linear
regression line can be fitted to the data.
The paired observations (x, y) plotted on the graph paper clearly indicate that there is no set
pattern shown by the points and hence no equation can be chosen to fit the data.
QUESTIONS
10. The heights (in cms) and weights (in kgs) of a random sample of twelve adult males are given
below:
- End Of Chapter -
LESSON - 14
A linear equation between two variables X and Y represents a straight line. A simple regression
line involves two variables Y and X. If Y is the dependent (response) variable and X is the
independent variable, the simple regression line is given by the equation Y = a + bX.
The choice of the dependent and independent variables depends on the nature of the variables.
For example, the age of persons and their I.Q. are two variables. Out of these two, age is the
independent variable because it goes on increasing with the passage of time and does not
depend on any factor, while I.Q. increases with age as a person learns many things through
education and experience. Hence, in this case age has to be taken as the independent variable X
and I.Q. as the dependent variable Y.
independent variable. X.
It is not necessary that we always find the regression of Y on X. There are many
situations when the variables Y and X are such that either of the two variables can be taken as the
dependent variable and the other as the independent variable. For example, height and weight of
persons are two variables in which height depends on weight and weight depends on height. So
in such a situation we can find the regression of height on weight or of weight on height. In this
way, we get two regression equations, one of height (Y) on weight (X) and the other of weight
(X) on height (Y). A word of caution is important at this juncture. A regression of Y on
Fitting of the regression equation means the estimation of the parameters a and b through the
paired observations (x, y). By paired observations we mean observations taken on the same
item or individual, or on already paired items. For instance, the height and weight of the same
person are to be taken as a pair; the income and expenditure of the same person form a paired
observation, etc. The best estimates of a and b will be those which minimise the error. As we all
know, the least error is zero. But such an ideal situation is hard to achieve. Hence, an effort is made
to estimate a and b such that the error is minimum. There are many methods of estimation, but
we shall discuss here only the least squares method of fitting the regression equation. The least
squares method of estimation was given by the mathematician Legendre.
Let there be n pairs of observations, (X1, Y1), (X2, Y2), (X3, Y3), ..., (Xn, Yn). From these n pairs, we
will estimate the values of a and b so that the error is minimum.
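The derivation of the estimates is not reproduced in this text. As a minimal sketch, the standard least squares formulae b = [Σxy - (Σx)(Σy)/n] / [Σx² - (Σx)²/n] and a = ȳ - b x̄ can be coded as below. The function name and the five data pairs are hypothetical and serve only as an illustration.

```python
def fit_line(xs, ys):
    """Least squares estimates (a, b) for the line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    b = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
    a = sum_y / n - b * sum_x / n
    return a, b

# Hypothetical paired observations (x, y)
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
print(a, b)
print(sum(residuals))   # the errors about a least squares line add up to (nearly) zero
```

The errors cannot all be made zero, but the fitted line balances the positive and negative errors so that their sum vanishes, while the sum of their squares is the smallest possible.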
The crown or hat (^) over y indicates that it is the estimated value of y, not the actual value. Once
we have obtained the numerical values of a and b, they can be substituted in the equation y
= a + bx.
In the equation y = a + bx, a and b are then known values. If we substitute the value of x for which y is
to be estimated, the estimate of y is easily obtained.
Here, for any given value of Y, X can be estimated.
So, when there is perfect correlation between X and Y, the two lines of regression are either
coincident or parallel to each other. Since both the lines pass through the point of means (X̄, Ȳ), they
cannot be parallel. Hence, when there is perfect correlation between the variables X and Y, the
two lines of regression are coincident.
(iii) Fundamental property:
The geometric mean of the two regression coefficients is equal to the correlation
coefficient between the two variables. The sign of the correlation coefficient will be the same
as that of byx and bxy. This is known as the fundamental property of regression
coefficients. It is trivial to derive the relation. We know,
byx = r σy/σx and bxy = r σx/σy, so that byx · bxy = r² and r = ± √(byx · bxy).
The arithmetic mean of the positive regression coefficients is greater than or equal to the correlation
coefficient between the two variables. This is known as the mean property of the regression
coefficients. Notationally, (byx + bxy)/2 ≥ r.
If one of the regression coefficients byx or bxy is greater than 1, the other is less than 1. This
property of regression coefficients is known as the magnitude property. The proof is very simple and direct.
We know that byx · bxy = r² ≤ 1, so both coefficients cannot exceed 1 at the same time.
If X and Y are independent, r = 0, and hence byx = 0 and bxy = 0.
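The properties above can be checked numerically. The sketch below uses the test scores and sales figures that appear in Example 1 of this lesson; the helper name `corrected_sum` is hypothetical.

```python
import math

scores = [40, 45, 65, 55, 70, 85, 35, 60, 75, 80]   # x
sales = [12, 14, 22, 18, 31, 34, 15, 20, 24, 30]    # y

def corrected_sum(us, vs):
    """Sum of products of deviations from the means."""
    n = len(us)
    return sum(u * v for u, v in zip(us, vs)) - sum(us) * sum(vs) / n

sxy = corrected_sum(scores, sales)
sxx = corrected_sum(scores, scores)
syy = corrected_sum(sales, sales)

b_yx = sxy / sxx                  # regression coefficient of y on x
b_xy = sxy / syy                  # regression coefficient of x on y
r = sxy / math.sqrt(sxx * syy)    # correlation coefficient

print(b_yx, b_xy, r)
print(math.sqrt(b_yx * b_xy))     # geometric mean, equal to r when r is positive
print((b_yx + b_xy) / 2)          # arithmetic mean, not smaller than r
```

Here bxy is greater than 1 and byx is less than 1, as the magnitude property requires.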
Residual Variance
Consider the regression line of Y on X.
The result shows that the correlation coefficient between y and ŷ is the same as the correlation
coefficient between x and y.
Example - 1.
Taking the data of Example 1 of Lesson 13 regarding test scores (x) and sales performance (y),
we fit the regression line of y on x and also estimate the sales performance of a sales
representative who secures a score of 50 in the test. We also check whether the regression line is a
good fit.
Scores (X) : 40 45 65 55 70 85 35 60 75 80
Sales (Y) : 12 14 22 18 31 34 15 20 24 30
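The hand computations for this example were lost with the tables. As a rough cross-check only, the following sketch applies the least squares formulae to the data above; the variable names are assumptions, and the printed figures may differ slightly from the original worked solution because of rounding.

```python
scores = [40, 45, 65, 55, 70, 85, 35, 60, 75, 80]   # x, test scores
sales = [12, 14, 22, 18, 31, 34, 15, 20, 24, 30]    # y, sales in lac rupees
n = len(scores)

sum_x, sum_y = sum(scores), sum(sales)
sxy = sum(x * y for x, y in zip(scores, sales)) - sum_x * sum_y / n
sxx = sum(x * x for x in scores) - sum_x ** 2 / n
syy = sum(y * y for y in sales) - sum_y ** 2 / n

b = sxy / sxx                     # regression coefficient of y on x
a = sum_y / n - b * sum_x / n     # intercept
estimate_at_50 = a + b * 50       # estimated sales for a score of 50
r_squared = sxy ** 2 / (sxx * syy)

print(round(a, 3), round(b, 4))
print(round(estimate_at_50, 2))
print(round(r_squared, 4))
```

A value of r² close to 1 indicates that the straight line accounts for most of the variation in sales.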
To check whether the regression line is a good fit, we calculate r². Firstly, we calculate r by the
formula.
To work out the various terms in the formula for byx, we prepare the following table.
Making use of the partial calculations, the value
Since the value of r2 is near to zero, we conclude that linear regression is not a good fit to the
given data.
Example - 3.
= 34.96
20.6
Regression estimates through coding of data
We discuss coding directly for the calculation of the intercept 'a' and the regression coefficient 'b'.
In regression analysis, we are concerned with two variables x and y. Let a constant c1 be subtracted
from x and c2 from y. Then the reduced variables (x - c1) and (y - c2) are divided by d1 and d2
respectively. Thus, the coded values for x and y are
dx = (x - c1)/d1 and dy = (y - c2)/d2.
Now, to calculate the regression coefficient byx, we prepare the following table.
Now we calculate the regression coefficient from the coded variables dx and dy and denote the
regression coefficient of y on x from the coded variables as bc. The formula for bc is,
Note: 1.
The values of c1, c2, d1 and d2 are chosen arbitrarily, looking at the data, so as to give the
most convenient coded values. In a given situation, they may all be different, all the same,
or some the same and others different.
2. The values of d1 and d2 should never be taken as zero, otherwise all variate values will
Example – 5
A company wanted to see the impact of advertising on sales of a product. The company collected the data
which were as follows. (Fictitious data)
Advt. Expdt (X) ('000 Rs.) : 6 12 18 24 30 36 42 48
Sales (Y) (Lac Rs.) : 5 10 25 30 35 45 65 70
(ii) Estimate the sales of the product for advertising worth Rs. 25,000.
Solution :
Fitting of the regression line can easily be done by using the method of coding. For this we
prepare the following table by choosing c1 = 24, d1 = 6, c2 = 25 and d2 = 5.
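The computation table is missing from this text. The sketch below reproduces the idea with the data and constants of this example, assuming the usual relation byx = (d2/d1) · bc between the coded and the original regression coefficients. The helper `slope` is a hypothetical name, and the result is compared with a direct fit on the uncoded data.

```python
def slope(us, vs):
    """Least squares regression coefficient of v on u."""
    n = len(us)
    suv = sum(u * v for u, v in zip(us, vs)) - sum(us) * sum(vs) / n
    suu = sum(u * u for u in us) - sum(us) ** 2 / n
    return suv / suu

x = [6, 12, 18, 24, 30, 36, 42, 48]    # advertising expenditure ('000 Rs.)
y = [5, 10, 25, 30, 35, 45, 65, 70]    # sales (lac Rs.)
c1, d1, c2, d2 = 24, 6, 25, 5

dx = [(xi - c1) / d1 for xi in x]      # coded x values
dy = [(yi - c2) / d2 for yi in y]      # coded y values

b_c = slope(dx, dy)                    # coefficient from the coded values
b = b_c * d2 / d1                      # back on the original scale
a = sum(y) / len(y) - b * sum(x) / len(x)
estimate = a + b * 25                  # advertising worth Rs. 25,000

print(round(b_c, 4), round(b, 4), round(a, 3))
print(round(estimate, 2))
```

The coded values are small integers, which is what makes the hand calculation convenient; the regression coefficient recovered from them agrees with the one obtained from the raw data.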
Fit a straight line trend and estimate the sales for the year 1993. (Take the year 1988 as the working
origin.)
Solution :
The years, which are the values of the independent variate, cannot be used as such. Hence the years are
coded taking 1988 as origin. Problems of this type are usually faced in finding the trend
line. Let the trend line be Y = a + bx.
We can get the values of 'a' and 'b' in two ways. One is to substitute the values of the terms in the
formulae for 'a' and 'b' and the other is to write the normal equations and solve them directly for 'a'
and 'b'. Both lead to the same result.
By the formula,
The readers can solve these equations and verify that the same values of 'a' and 'b' are obtained.
Curvilinear Regression : The relation between the dependent and independent variable(s) is
not necessarily linear. It is often curvilinear. The curve may be of any type: a parabola, a cubic, a
polynomial of degree K, an astroid, a hyperbola or any other type of curve. But confining ourselves to the
requirements of the course, we will consider some well-known curves which can be reduced to
linear relations under transformation.
Here we note that the log term appears only for Y but not for X. If we put log Y = z, log a = α and log
b = β, the equation takes the form,
z = α + βx
So in the fitting of the growth curve, first the variate values of Y are transformed to log
values and then the linear equation z = α + βx is fitted in the same manner as we do for the
regression equation of Y on X. Here it is a simple linear regression of z on X. The formulae for α
and β in the like manner for n paired values (x1, z1), (x2, z2), ..., (xn, zn) can be given as
β = [Σxz - (Σx)(Σz)/n] / [Σx² - (Σx)²/n] and α = z̄ - β x̄.
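The transformation can be sketched in code as follows. The curve is assumed here to have the form Y = a·b^x, which is consistent with the substitutions log a and log b described above; the data are hypothetical and were generated from a = 2 and b = 3, so the fit should recover those values.

```python
import math

# Hypothetical data lying exactly on Y = 2 * 3**x
xs = [0, 1, 2, 3, 4]
ys = [2, 6, 18, 54, 162]

zs = [math.log10(y) for y in ys]     # z = log Y
n = len(xs)

sxz = sum(x * z for x, z in zip(xs, zs)) - sum(xs) * sum(zs) / n
sxx = sum(x * x for x in xs) - sum(xs) ** 2 / n

beta = sxz / sxx                          # estimate of log b
alpha = sum(zs) / n - beta * sum(xs) / n  # estimate of log a

a_hat = 10 ** alpha                  # taking antilogs
b_hat = 10 ** beta
print(a_hat, b_hat)
```

With real data the points will not lie exactly on the curve, and the estimates minimise the squared errors in log Y rather than in Y itself.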
Example 7 The data given below show the atmospheric pressure at various heights above the sea
level.
= 1.4096 + 0.4504
= 1.86
Taking antilog,
Logarithmic curve
Let the estimated values of y and b be a and b respectively. The formulae for a and b for n paired
observations can be given as
Now,
The comparison of the estimated values with the observed values reveals that the fitted logarithmic
curve is a very good fit to the data.
The Reciprocal curve
Example - 9
The production of potato (Million Tonnes) and prices per quintal during the last thirteen years
are as follows :
By the formula (55)
QUESTIONS
18. Given the two regression lines,
3x + 2y = 12
5x + y = 13
(i) Find which of the two lines represents the regression line of y on x and which that of x on y.
(ii) Find the correlation coefficient between x and y.
(iii) At what point do the two lines intersect?
19. The area under tea cultivation from 1945 to 1951 is given below
1949 710
- End Of Chapter -
LESSON -15
Preamble
The fitting of a regression equation is done to estimate the dependent variable Y through the
independent variable X. The regression parameters a and b play the complete role in the estimation
process. When regression analysis is done, some values of the parameters a and b are obtained.
Now it becomes essential to know whether the contribution of these parametric values in
estimating Y is significant or not. If the value of b is non-significant, then it shows that the
estimation of Y through X is not meaningful, since if b = 0, then r = 0, which means that the two variables
are not linearly related. Hence a simple linear equation is not fit for estimating Y through X. For
instance, suppose we want to estimate the sale of a product on the basis of the test scores of the salesmen. If b
comes out to be non-significant, there is no sense in relating the sales with the test scores of the
salesmen. Also, if a = 0, it means that the line passes through the origin. In this case, there is no
intercept in reality and its contribution in estimating Y is insignificant. Therefore, one should
test the significance of the regression parameters prior to estimating the variate values.
With a little algebraic manipulation, it is easy to show that
Once we know Se², it is easy to find Sb² and thereby Sb. Thus,
To make a decision about H0, compare the calculated value of t with the tabulated value of t for
(n-2) degrees of freedom and α level of significance. The decision criterion is: reject H0 if t cal > t
α, (n-2), and otherwise accept H0. Accepting H0 means that the change in Y corresponding to
a unit change in X is of no significance.
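The formulae for this test are missing from the text, so the following sketch relies on the standard results Se² = (Σ(y - ȳ)² - b·Σ(x - x̄)(y - ȳ))/(n - 2), Sb² = Se²/Σ(x - x̄)² and t = b/Sb, which are assumed to be the ones intended. It is illustrated with the scores and sales data of Lesson 14, Example 1; 2.306 is the two-tailed 5 per cent table value of t for 8 d.f.

```python
import math

scores = [40, 45, 65, 55, 70, 85, 35, 60, 75, 80]   # x
sales = [12, 14, 22, 18, 31, 34, 15, 20, 24, 30]    # y
n = len(scores)

sxy = sum(x * y for x, y in zip(scores, sales)) - sum(scores) * sum(sales) / n
sxx = sum(x * x for x in scores) - sum(scores) ** 2 / n
syy = sum(y * y for y in sales) - sum(sales) ** 2 / n

b = sxy / sxx
residual_ss = syy - b * sxy          # sum of squares about the regression line
se_sq = residual_ss / (n - 2)        # residual variance Se^2
sb = math.sqrt(se_sq / sxx)          # standard error of b
t_cal = b / sb                       # tests H0: regression coefficient is zero

t_table = 2.306                      # t for 8 d.f., 5 % level, two-tailed
print(round(t_cal, 2), t_cal > t_table)
```

Since the calculated t is far larger than the table value, H0 would be rejected for these data.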
Note: The test procedure in the case of the regression line of X on Y remains the same, except that
X is replaced by Y and Y by X, or u by v and v by u.
The hypothesis,
can be tested by F-test also with the help of analysis of variance (ANOVA) table.
ANOVA Table
To decide about H0, compare the calculated value of F with the tabulated value of F for (1, n-2)
d.f. and α level of significance. If F cal > F α, (1, n-2), reject H0. It means that the regression
coefficient b is significant and thus makes a significant contribution in estimating Y through X.
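The ANOVA table itself is missing here. A minimal sketch of the usual partition is given below, assuming the regression sum of squares is b·Σ(x - x̄)(y - ȳ) with 1 d.f. and the residual sum of squares has n - 2 d.f. The data are again those of Lesson 14, Example 1, and 5.32 is the 5 per cent table value of F for (1, 8) d.f.

```python
scores = [40, 45, 65, 55, 70, 85, 35, 60, 75, 80]   # x
sales = [12, 14, 22, 18, 31, 34, 15, 20, 24, 30]    # y
n = len(scores)

sxy = sum(x * y for x, y in zip(scores, sales)) - sum(scores) * sum(sales) / n
sxx = sum(x * x for x in scores) - sum(scores) ** 2 / n
total_ss = sum(y * y for y in sales) - sum(sales) ** 2 / n   # n - 1 d.f.

b = sxy / sxx
regression_ss = b * sxy                      # 1 d.f.
residual_ss = total_ss - regression_ss       # n - 2 d.f.

ms_regression = regression_ss / 1
ms_residual = residual_ss / (n - 2)
f_cal = ms_regression / ms_residual

f_table = 5.32                               # F for (1, 8) d.f. at the 5 % level
print(round(f_cal, 2), f_cal > f_table)
```

For a simple regression line the F value equals the square of the t value for the regression coefficient, so the two tests lead to the same decision.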
(ii) Estimate Y for X = 50.
(iii) Test the significance of the regression coefficient.
(iv) Test the significance of the intercept.
To fit in the regression line of profit (y) on capital (X), we prepare the following computation
table.
Other values required for fitting the regression equation and testing are computed in the
following manner.
Thus, the regression line is
The Statistic,
The tabulated value of t for 7 d.f. and 5 % level of significance for a two-tailed test is 2.365. Since t cal
> 2.365, we reject H0. It means that the regression coefficient is significant.
(i) Taking 1988 as origin, fit in the trend line y = a + bx
Solution:
To fit in the trend line and test of hypothesis we prepare the following computation table.
With the help of the above computations,
The trend line is
X = 5, hence
Interval Estimation
The confidence limits for a parameter θ corresponding to the confidence probability (1 - α) are
given by the formula,
(i) Fits in the two regression lines of y on x.
Solution :
The regression line of X on Y is
To Test
The residual variance,
The statistic value is
Thus,
Note: If we consider the regression of X on Y, all the formulae and procedures can be followed
in the like manner simply by interchanging X and Y.
QUESTIONS
2. If the intercept comes out to be non-significant, what do you infer from it?
3. How will you perform the test of significance for the regression coefficient?
5. How will you test the significance of the intercept a in a regression line of Y on X?
6. What changes would you make in testing the significance of the regression coefficient of X on
Y?
7. On the basis of the figures recorded below for supply and price for nine years, build a
regression of price on supply
Supply : 80 82 86 91 83 85 89 96 93
Price : 145 140 130 124 133 127 120 110 116
- End Of Chapter -
LESSON - 16
Introduction
There are a number of situations where a variable depends on more than one independent variable. In
such a situation, the estimation of the dependent variable through a single variable cannot yield
satisfactory results. For instance, the selling price of a finished product depends on the cost of
the raw material, labour cost, cost of energy (electricity), transportation, advertising cost etc.
The production cost of a cereal depends on the cost of the feed, fertilizer, labour, irrigation etc.
In all such situations, the cost of production can be estimated through a number of
variables and hence one has to choose a multiple regression model instead of a simple linear
regression model.
A multiple linear regression model with a dependent variable Y and K independent variables x1,
x2, ..., xk can be given as,
Y = b0 + b1x1 + b2x2 + ... + bkxk + e ……………. (1)
Now we consider the linear regression model with two independent variables X1 and X2
Y= b0 + b1 x1 + b2x2 +e ……………. (2)
We know that all the triplets will satisfy the equation if they belong to it. Hence, for the i th
triplet, the equation (2) can be written as
Rearranging the above equations, we get.
Substituting the value of b0 in equations (8) and (9) from (10), we get.
Now we have to solve the equations (11) and (12) for b1 and b2.
Method 1 : One way is to solve them by the method of elimination. Thus, in this way we get,
Method 2 : Another simple way of solving the equations (11) and (12) could be through the
determinants. In this approach we directly write the equation by taking unknowns in the
numerator and determinants in the denominator leaving its known coefficients in a particular
order as given below :
and (14). Regression equation (15) stands as such.
Method 3: The earlier two methods are simple and applicable for two unknowns. But the modern
approach is to solve the equations through the matrix approach.
We know that the equations (11) and (12) can be written in matrix notation in the following
manner.
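Equations (11) and (12) are not reproduced in this text. The sketch below assumes they are the usual normal equations in deviation form, b1·S11 + b2·S12 = S1y and b1·S12 + b2·S22 = S2y, where S denotes a sum of products of deviations from the means, and solves them by determinants as in Method 2. The data are hypothetical and were generated from Y = 1 + 2x1 + 3x2 without error, so the solution should return those coefficients.

```python
def dev_sum(us, vs):
    """Sum of products of deviations from the respective means."""
    n = len(us)
    return sum(u * v for u, v in zip(us, vs)) - sum(us) * sum(vs) / n

# Hypothetical data generated from Y = 1 + 2*x1 + 3*x2
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [9, 8, 19, 18, 29]
n = len(y)

s11, s22, s12 = dev_sum(x1, x1), dev_sum(x2, x2), dev_sum(x1, x2)
s1y, s2y = dev_sum(x1, y), dev_sum(x2, y)

det = s11 * s22 - s12 * s12            # determinant of the coefficient matrix
b1 = (s1y * s22 - s12 * s2y) / det     # Cramer's rule
b2 = (s11 * s2y - s12 * s1y) / det
b0 = sum(y) / n - b1 * sum(x1) / n - b2 * sum(x2) / n

print(b0, b1, b2)
```

The same quantities form the 2 x 2 coefficient matrix of the matrix approach; for more than two regressors a matrix routine is more practical than expanding determinants by hand.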
Definition:
(iv) The 95 per cent confidence interval for b1 is given by the formula
10.71 ± 2.33 × 2.365
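The arithmetic of these limits can be checked as below. The reading of the three figures is an assumption based on the surrounding text: 10.71 is taken as the estimate b1, 2.33 as its standard error and 2.365 as the table value of t for 7 d.f. at the 5 per cent level.

```python
b1 = 10.71          # estimated partial regression coefficient (assumed reading)
std_error = 2.33    # standard error of b1 (assumed reading)
t_table = 2.365     # t for 7 d.f., 5 % level, two-tailed

margin = std_error * t_table
lower, upper = b1 - margin, b1 + margin
print(round(lower, 2), round(upper, 2))
```

Since the interval does not include zero, it is consistent with a significant coefficient at the 5 per cent level.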
Addendum :
Here it may be added that the equations (7) through (9) can be solved to obtain b0, b1 and
b2 without taking the deviations from the respective means. The matrix approach is the best. The
readers are referred to 'Basic Statistics' by B.L. Agarwal, Chapter 14.
Matrix A is a symmetric matrix and A⁻¹ will also be a symmetric matrix. Once we know A⁻¹, we
can write (33) in the expanded form as,
Multicollinearity:
The term multicollinearity is used to describe a situation where the explanatory variables
X1, X2, ..., Xk are correlated. If multicollinearity is present, the regression model will not
yield good results. The problem of multicollinearity is usually confronted in time series
data.
If perfect multicollinearity is present, the coefficient matrix is singular, i.e. the
determinant of the coefficient matrix is zero. Consider the case of two regressor variables X1 and
X2. If X1 and X2 are perfectly correlated, say X1 = cX2, the coefficient matrix,
The above situation exists in the case of perfect collinearity, which rarely exists in practical life. What
we usually come across is high collinearity.
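A small numerical sketch of perfect collinearity is given below. The data are hypothetical, with one regressor an exact multiple of the other, and the coefficient matrix is assumed to be the matrix of sums of squares and products of deviations used in the normal equations.

```python
def dev_sum(us, vs):
    """Sum of products of deviations from the respective means."""
    n = len(us)
    return sum(u * v for u, v in zip(us, vs)) - sum(us) * sum(vs) / n

x1 = [1, 2, 3, 4]
x2 = [2 * v for v in x1]        # perfectly collinear: x2 = 2 * x1

s11 = dev_sum(x1, x1)
s22 = dev_sum(x2, x2)
s12 = dev_sum(x1, x2)

determinant = s11 * s22 - s12 ** 2
print(s11, s22, s12, determinant)
```

With a zero determinant the normal equations have no unique solution, so b1 and b2 cannot be estimated separately. With high but imperfect collinearity the determinant is close to zero and the estimates become unstable.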
In case of high collinearity, the best remedy is to redefine the mix of the variables by either
discarding or combining some of the variables. Usually, dropping a variable introduces bias in
the regression coefficients, and by combining one or two variables, one loses information.
QUESTIONS
6. The daily rainfall of 14 selected places along with the altitude and distance from the sea
level was as follows:
(i) Fit in the linear regression of rainfall on altitude and distance.
(iii) Estimate the rainfall for given altitude = 600 and distance = 600.
(iv) Establish 99 per cent confidence limits for the partial regression coefficients
7. The following table gives the number of leaves per plant, height of plant and height of main
stem of mung at eleven places.
8. How can you test the significance of a partial regression coefficient?
9. How do you find the confidence limits of a partial regression coefficient?
12. How can one tackle the problem of multicollinearity?
- End Of Chapter -
LESSON - 17
Preamble
There is hardly any moment in life without decision making. One starts taking decisions right
from the morning: whether to get up early or to continue sleeping till late,
whether to buy an article or not and, if so, from which shop to buy
it and what quality of product to buy. All such decisions are common decisions and are
based purely on the judgment, experience, liking and will of an individual.
In scientific work, testing of hypotheses comes under the category of classical decision
theory. In this case, we have an assertion about population parameters and have to decide
whether to accept or reject the hypothesis on the basis of an appropriate statistical test at
a certain level of significance. Classical decision theory suffers from three shortcomings.
Firstly, it allows only two actions about the null hypothesis vis-a-vis the alternative hypothesis:
to accept or to reject the null hypothesis.
Secondly, it does not take into account any information pertaining to the decision other than the
empirical data collected through the sampling process.
Thirdly, there are economic consequences that result from making a wrong decision. Although
such consequences are covered by choosing a desired level of significance
for the test, they are never an explicit part of the decision model or procedure. Prior
information, by contrast, is utilized under Bayesian decision theory, which is based upon a direct
evaluation of the payoff for each alternative course of action.
The Bayesian decision theory, or simply decision theory, removes the above shortcomings and
enables one to take optimal decisions. The reasons for the decisions being optimal are that:
(i) It provides a model for decision making in situations that involve multiple states of
the parameter, which is termed nature in the parlance of decision theory.
(iii) It utilises the information pertaining to the decision making which exists prior to
any sampling or experiment. The prior information may be in the form of empirical data
or a subjective type of information considered useful by the decision maker. For
instance, suppose one has to undergo an operation. He takes into consideration the survival rate
of patients after the operation and also the expertise of the doctor. The survival rate is
empirical data whereas the expertise of the doctor is a subjective consideration.
Decision making is extremely useful in business. A wrong decision may put a business
organisation into heavy losses, and the company or business may even fail. How much stock
should be maintained, what percentage of profit should be fixed and what sort of items should be manufactured
are all parts of decision making in business.
The likelihood principle: It is another principle which is largely used in decision making. This
involves a likelihood function, which is defined as the function L(θ, x), where the sample x has
been observed. This is considered as a function of θ and is called the likelihood function. The
function L(θ, x) = f(x/θ).
The intuitive reason for the name likelihood function is that a θ for which f(x/θ) is large is more
likely to be the true value than a θ for which f(x/θ) is small.
In making decisions about θ when x is observed, all relevant information is contained in f(x/θ).
Hence, this principle is of considerable importance. The details are omitted as they do not
constitute part of the syllabus.
A business organisation has to make decisions every day with regard to its expansion, the number
of units to be produced, price fixing, whether to replace the old plant with a new one etc. All
this has to be decided on some criteria. The various steps under consideration are called the
ingredients of the decision problem, which are discussed in brief below.
The decision problem arises only when there are different courses of action at our disposal and the
decision maker has to choose one out of many. Let there be K actions a1, a2, a3, ..., ak at the
disposal of the decision maker, who has to choose one out of these alternative actions. The set
of K actions is known as the action space.
If the action selected does not fulfil the objective, it results in a waste of time and causes
heavy losses. So, by making use of all the available information, an action has to be chosen
on the basis of statistical decision procedures which lead to an optimal decision, one which
fulfils the objectives, i.e. which minimizes the loss and/or maximises the gain etc.
2. Uncertainty:
It is not possible to predict the outcome of an experiment with certainty. Hence, the outcome is said to be
uncertain. There are many outcomes for an event, which are called states of nature in decision
theory. It is, however, possible to predict the state of nature in terms of probabilities. In the usual
manner, K states of nature are represented by θ1, θ2, ..., θk. The totality of the states of nature is
called the state space and is denoted by Θ. If an action leads to four outcomes θ1, θ2, θ3 and θ4,
then Θ = (θ1, θ2, θ3, θ4).
For instance, a product liked by 100 per cent of customers is denoted by θ1, a product liked by 50
per cent of persons is denoted by θ2, the product liked by 25 per cent of buyers is denoted by θ3 and
the product not liked by anyone is denoted by θ4.
As a matter of fact, decision making under 'risk' and decision making under uncertainty are
not synonymous. They differ in the sense that when the state of nature is unknown but
objective or empirical data are available which enable one to assign probabilities to the various states
of nature, the procedure is referred to as decision making under 'risk'. But if the state of nature
is unknown and there are no objective or empirical data available to assign probabilities to the various
states of nature, then the decision procedure is referred to as decision making under uncertainty.
However, if the probabilities are assigned on an intuitive basis even under uncertainty, the decision
procedures under risk and under uncertainty are equivalent.
3. Pay Offs :
Usually one evaluates the consequences of a course of action for each event in terms of monetary
value or time. A number of consequences result from each action under different states of
nature. For ‘m’ possible acts and 'n' states of nature, there will be m x n consequences. The
consequences are usually evaluated in monetary terms such as,
i. in terms of profit
ii. in terms of cost
iii. in terms of opportunity loss
iv. unit of utility
In table 1.1, Pij represents the payoff as a consequence of act aj when the state of nature is θi,
for i = 1, 2, ..., n and j = 1, 2, ..., m. With the help of the payoff table, a decision maker can reach an
optimum solution of the problem in respect of an event which is going to occur. Since the
outcome which is going to occur is unknown, a forecast has to be made in terms of probabilities
assigned to the events. The last step is to make use of these probabilities for calculating
the expected payoff or expected monetary value for each course of action. A decision maker has
to choose an optimal act which results in the maximum expected payoff.
An alternative way to decide about an act for the events is on the basis of expected opportunity
loss (EOL). This criterion yields the same results as are obtained by expected profits.
Since the payoffs depend on the choice of a particular act and are conditional on the
occurrence of an event, they are often termed conditional values and the payoff table is termed
the conditional value table.
The payoff for any cell of the payoff table 1.1 calculated by the formula
Expected monetary value (EMV) is also called expected payoff. Suppose Xi denotes the payoff of the ith event
under an act and Pi is the probability that the ith event takes place; then the expected monetary value of the act is
given as,
$$\mathrm{EMV} = \sum_{i=1}^{n} P_i X_i$$
To make a decision, the rule is to select the act which has the maximum expected monetary
value. The criterion of selecting the act with maximum EMV is often referred to as Bayes'
decision rule, named after Thomas Bayes.
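The EMV rule can be sketched in a few lines of Python. This is only an illustration: the function name `expected_monetary_value`, the two acts and all the numbers below are hypothetical and are not taken from table 1.1.

```python
def expected_monetary_value(payoffs, probabilities):
    """Return EMV = sum of P_i * X_i for a single act."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("event probabilities must sum to 1")
    return sum(p * x for p, x in zip(probabilities, payoffs))


# Hypothetical conditional payoffs: one row per act, one column per event.
payoff_table = {
    "act_1": [40, 10, -5],
    "act_2": [25, 20, 5],
}
event_probabilities = [0.3, 0.5, 0.2]  # hypothetical probabilities of the events

emv = {act: expected_monetary_value(row, event_probabilities)
       for act, row in payoff_table.items()}
best_act = max(emv, key=emv.get)  # the rule: choose the act with maximum EMV

for act, value in emv.items():
    print(f"EMV({act}) = {value:.2f}")
print("Selected act:", best_act)
```

With these made-up figures the second act is selected, even though the first act has the single largest payoff, because the rule weighs every payoff by its probability.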
Risk function
If θ is the state of nature and 'a' is an act, the function R(θ, a) is known as the risk function and
denotes the risk involved in taking a decision about θ when the act 'a' has been adopted. The risk
R(θ, a) is nothing but the expected loss incurred in taking the act 'a' about θ. Hence,
$$R(\theta, a) = E[L(\theta, a)]$$
Loss is usually taken to mean opportunity loss (OL) of money, time, fuel, etc. The loss function
reflects the pessimistic view taken by statisticians. In contrast, economists take an optimistic view
and talk of a utility function. It amounts to the same thing whether one minimises the loss function or
maximises the utility function.
Inadmissible act
An act is called inadmissible if it is dominated by some other act. An inadmissible act in a
payoff table is one whose payoffs are less than the payoffs of some other act
for every event. In this situation, the inadmissible act is discarded from the payoff
table. This process of elimination saves the labour of calculation and simplifies the process of
decision making.
Example-1.1: Consider the following payoff table with five acts and four events.
From the above table it is apparent that the payoffs for the act A2 are, for all events, less than the
payoffs for the act A3. Therefore, the act A2 is inadmissible and can be removed from the
table. In this way the payoff table reduces to order 4 x 4 for further analysis.
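The elimination of inadmissible acts can be sketched as follows. The payoff table of Example-1.1 is not reproduced here, so the table below is hypothetical: it merely has five acts and four events, with A2 dominated by A3. The function name `dominated_acts` is likewise our own.

```python
def dominated_acts(payoffs):
    """Return the acts whose payoff is lower than some other act's for every event."""
    dominated = []
    for act, row in payoffs.items():
        for other, other_row in payoffs.items():
            if other != act and all(x < y for x, y in zip(row, other_row)):
                dominated.append(act)
                break  # one dominating act is enough
    return dominated


# Hypothetical payoff table: rows are acts, columns are the four events.
payoff_table = {
    "A1": [4, 0, -5, 3],
    "A2": [0, 1, 2, 1],
    "A3": [2, 3, 4, 5],
    "A4": [5, -2, 6, 0],
    "A5": [-1, 5, 1, 6],
}

inadmissible = dominated_acts(payoff_table)
reduced_table = {act: row for act, row in payoff_table.items()
                 if act not in inadmissible}
print("Inadmissible acts:", inadmissible)
print("Acts kept for further analysis:", list(reduced_table))
```

The comparison is strict ("less than" for every event), following the definition given above.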
As the name indicates, the loss incurred by missing a better opportunity is termed
opportunity loss, and its expected value is the expected opportunity loss (EOL). The EOL approach is an alternative to the EMV approach.
The opportunity loss for an outcome is the difference between the best payoff for an
event and the payoff for that event under the chosen act. In other words, it is the loss
incurred because of the gain missed, which could have been earned by making the right choice of act for
the event. This is shown in the following example.
Example-1.2 : Suppose a particular type of sweet costs a shopkeeper Rs.5 each to prepare and sells
for Rs.6 each. The problem before the shopkeeper is that any sweets left unsold are a net
loss, as it is a perishable item. So how many should he prepare? If the shopkeeper expects a sale
of 100 items per day with probability 0.5, 150 items with probability 0.4 and 200 items with
probability 0.1, how many should be prepared so that the expected opportunity loss is
minimum? The loss table (in Rs.) can be prepared and displayed in the following manner.
Act                      Sale of 100    Sale of 150    Sale of 200
                         (prob. 0.5)    (prob. 0.4)    (prob. 0.1)
A1 : prepare 100         0              50             100
A2 : prepare 150         250            0              50
A3 : prepare 200         500            250            0
In the above table, the loss for the act 'prepare 100' when 100 items are sold is zero, as the
shopkeeper prepared 100 items and sold all of them, so there is no loss of opportunity. If, however, he prepared
150 items and sold only 100, 50 items are left unsold, and he has incurred a direct loss of Rs.250.
Again, if the shopkeeper has prepared only 100 items whereas he could have sold 150, he has lost
the opportunity of the profit on 50 items, an opportunity loss of Rs.50. The
other entries are made likewise. The expected opportunity loss for each act is then
calculated as the sum of the products of the losses and their corresponding probabilities.
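EOL (prepare 100) = 0 x 0.5 + 50 x 0.4 + 100 x 0.1 = 0 + 20 + 10 = 30
EOL (prepare 150) = 250 x 0.5 + 0 x 0.4 + 50 x 0.1 = 125 + 0 + 5 = 130
EOL (prepare 200) = 500 x 0.5 + 250 x 0.4 + 0 x 0.1 = 250 + 100 + 0 = 350
Since the minimum expected opportunity loss is Rs.30, for the act 'prepare 100 items', the
shopkeeper should prepare 100 items.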
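The loss table and the EOL figures of Example-1.2 can be rebuilt from the cost, the selling price and the demand probabilities alone. The following Python sketch does this; the names `opportunity_loss`, `loss_table` and `eol` are our own and the code is only an illustration of the calculation described above.

```python
COST, PRICE = 5, 6                 # Rs. per item, from Example-1.2
levels = [100, 150, 200]           # items prepared (acts) and items demanded (events)
probabilities = [0.5, 0.4, 0.1]    # demand probabilities from Example-1.2


def opportunity_loss(prepared, demand):
    """Loss relative to the best act for the given demand."""
    if prepared >= demand:
        return (prepared - demand) * COST        # unsold items are a direct loss
    return (demand - prepared) * (PRICE - COST)  # profit missed on unmet demand


loss_table = {q: [opportunity_loss(q, d) for d in levels] for q in levels}
eol = {q: sum(p * loss for p, loss in zip(probabilities, row))
       for q, row in loss_table.items()}
best_quantity = min(eol, key=eol.get)  # act with minimum expected opportunity loss

for q in levels:
    print(f"prepare {q}: losses {loss_table[q]}, EOL = {eol[q]:.0f}")
print("Prepare", best_quantity, "items")
```

The first two EOL values agree with the figures 30 and 130 worked out above.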
Optimal decisions
Decision theory helps to select an action which minimises the loss or maximises the gain. As we
know, no single act is always best under all situations (states of nature). An
act may be the best for one state of nature and the worst for another. So a
decision maker has to choose an act under uncertainty, and therefore needs some
criterion on the basis of which he can choose the best act out of the many at his disposal. Here,
we shall discuss three principles which are popularly used, namely:
Maximin principle
It is the simplest of all the principles for choosing an optimal action when the payoffs
are given in terms of profits. According to the maximin principle, a decision maker first finds the
minimum payoff of each act over the various possible states of nature. He then selects that action for
which this minimum payoff is maximum. This principle guards against the worst that can happen and
prepares him to face it. In short, under the maximin principle a decision maker chooses the act
which maximises min Pij.
Example-1.3 : For the problem given in example-1.2, we prepare the following payoff table
P12 = 100 x 6- 150 x 5 = -150
Similarly
In the above payoff table, the act A1 has minimum payoff 100, the act A2 has minimum payoff -150,
and the act A3 has minimum payoff -400. Out of the three minimum payoffs 100, -150 and -400,
the payoff 100 is the maximum, so the shopkeeper should choose to prepare 100 items. This is the
same decision which we obtained on the basis of expected opportunity loss.
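As a cross-check of Example-1.3, the payoff table can be generated from the data of Example-1.2 and the maximin act picked mechanically. The sketch below assumes, as in the calculation of P12 above, that the payoff is the revenue on the items sold minus the cost of all items prepared; the function and variable names are ours.

```python
COST, PRICE = 5, 6          # Rs. per item, from Example-1.2
levels = [100, 150, 200]    # items prepared (acts) and items demanded (events)


def payoff(prepared, demand):
    """Revenue on items actually sold minus cost of all items prepared."""
    sold = min(prepared, demand)
    return sold * PRICE - prepared * COST


payoff_table = {q: [payoff(q, d) for d in levels] for q in levels}
worst_case = {q: min(row) for q, row in payoff_table.items()}   # minimum payoff per act
maximin_act = max(worst_case, key=worst_case.get)               # largest of the minima

for q in levels:
    print(f"prepare {q}: payoffs {payoff_table[q]}, minimum = {worst_case[q]}")
print("Maximin choice: prepare", maximin_act, "items")
```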
Minimax principle
This principle is used when the payoffs are given in terms of opportunity losses. Here we
minimise the maximum opportunity loss. A decision maker first observes the maximum
opportunity loss of each act over all states of nature. He then chooses the action for which the maximum
opportunity loss is minimum. This principle guards against the maximum loss.
Example-1.4 :
Now we consider the decision problem given in example-1.2 and wish to select the best act on
the basis of the minimax principle. From the loss table, the maximum loss for the act A1 is 100, for
the act A2 is 250 and for the act A3 is 500. The minimum of these
maximum losses is Rs.100, for the act A1. Hence, one should choose the act A1. Again, the same result is
obtained as we got from the maximin principle.
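The minimax choice can also be reached starting from the payoffs rather than from a ready-made loss table. In the sketch below the opportunity loss (regret) of each cell is taken as the best payoff for that event minus the payoff of the act; the payoff figures are those implied by Example-1.2 and Example-1.3, and the variable names are our own.

```python
# Payoffs (Rs.) for the acts 'prepare 100/150/200' under sales of 100, 150 and 200.
payoff_table = {
    100: [100, 100, 100],
    150: [-150, 150, 150],
    200: [-400, -100, 200],
}

# Best payoff attainable for each event (column-wise maximum).
best_per_event = [max(column) for column in zip(*payoff_table.values())]

# Opportunity loss = best payoff for the event minus payoff of the act.
regret_table = {act: [best - x for best, x in zip(best_per_event, row)]
                for act, row in payoff_table.items()}
max_regret = {act: max(row) for act, row in regret_table.items()}
minimax_act = min(max_regret, key=max_regret.get)

print("Maximum losses:", max_regret)
print("Minimax choice: prepare", minimax_act, "items")
```

The maximum losses come out as 100, 250 and 500, the figures quoted in Example-1.4.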
Bayes' principle
Bayes' principle is based on the use of a priori probabilities. A major advantage of the Bayesian
approach is that a decision maker selects an action on a rational basis, since he uses subjective
probabilities ascertained from his own experience, past performance or intuition. Here we
explain the Bayesian principle of decision making in brief.
To make use of the Bayesian principle, a decision maker should first assign prior probabilities to
each of the states of nature. It should be borne in mind that the sum of these probabilities is
equal to 1. These probabilities reflect the belief of the decision maker about which states of
nature will occur. The set of probabilities along with the states of nature constitutes a probability
distribution known as the prior distribution. Let the probability density function (p.d.f.) of the prior
distribution for the state of nature θ be g(θ). For the state of nature θ = θi, the p.d.f. is g(θi). Let
us assume that L(θ, a) is the loss function, which is non-negative.
It is commonly called the probability density function of the posterior distribution of θ, given
that x has been observed. Having an idea of the prior and posterior distributions, we return
directly to the decision problem.
Bayes' approach states that the expected payoff for each act in a set of acts be computed and that, on
the basis of these payoffs, the best act be chosen. The best act is the one for which
the expected profit is maximum or, in the case of losses, the one for which the expected loss is minimum. This is also known as the expected
monetary value (EMV) criterion: the best act is the one for which EMV is maximum. The procedure
for this is:
Once the prior distribution is determined, Bayes' principle is applied to make a decision about
selecting the optimal action. The Bayesian analysis is carried out in three phases, namely (i)
prior analysis, (ii) preposterior analysis and (iii) posterior analysis. Each of these is discussed in
brief.
Example-1.5 : We take up again the problem discussed in example-1.2 and make a decision about
the selection of an optimal action through the Bayesian principle. The table of prior probabilities
and states of nature θi can be displayed as given below.
The expected payoffs for each of the acts a1, a2 and a3 can be computed in the following manner.
The EP for the act a1 is maximum, and hence the action a1, i.e. prepare 100 items, is preferable.
The reader can repeat the above exercise by preparing the opportunity loss table and calculating EOL;
it leads to the same action as EP.
ii. Preposterior analysis: If the decision maker feels that his a priori probabilities are not
fully reliable, he may try to obtain more information about the states of nature. For this,
he can collect information from a clairvoyant and use it either to
maximise his profits or to minimise his losses. The expected profit obtained by making use of the clairvoyant's
information is known as the expected payoff of perfect information (EPPI). EPPI is also often
called the expected value of payoffs under certainty. The perfect prediction by the clairvoyant
reduces the opportunity losses due to uncertainty to zero. The difference EPPI - EP is
called the expected value of perfect information (EVPI). EVPI is the maximum amount which the
decision maker can pay to the clairvoyant for a perfect prediction. It is worth pointing out that
EVPI is equal to the EOL of the optimal action under uncertainty.
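For the shopkeeper's problem, EP, EPPI and EVPI can be computed together. The sketch below uses the payoffs implied by Example-1.2 and Example-1.3 and the prior probabilities 0.5, 0.4 and 0.1; the helper `expectation` and the variable names are our own, and the code is meant only as a check on the definitions above.

```python
probabilities = [0.5, 0.4, 0.1]   # prior probabilities of selling 100, 150, 200 items

# Payoffs (Rs.) for the acts 'prepare 100/150/200' under sales of 100, 150 and 200.
payoff_table = {
    100: [100, 100, 100],
    150: [-150, 150, 150],
    200: [-400, -100, 200],
}


def expectation(values):
    """Probability-weighted average of one value per state of nature."""
    return sum(p * v for p, v in zip(probabilities, values))


ep = {act: expectation(row) for act, row in payoff_table.items()}
best_act = max(ep, key=ep.get)

# With perfect information the best act is chosen separately for every state.
best_per_state = [max(column) for column in zip(*payoff_table.values())]
eppi = expectation(best_per_state)
evpi = eppi - ep[best_act]

print("EP per act:", {act: round(v, 2) for act, v in ep.items()})
print(f"EPPI = {eppi:.2f}, EVPI = {evpi:.2f}")
```

Here EVPI comes out as Rs.30, the same as the minimum EOL found in Example-1.2, which illustrates the remark that EVPI equals the EOL of the optimal action.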
In a nutshell, preposterior analysis leads one to decide whether it is profitable to acquire a perfect
prediction or not. Consider the maximum payoffs for the states of nature θ1, θ2 and θ3 under the
acts a1, a2 and a3. Then,
EPPI = 100 x 0.5 + 150 x 0.4 + 200 x 0.1 = 130
EVPI = EPPI - EP = 130 - 100 = 30
iii. Posterior analysis:
For posterior probabilities, we borrow here example 8.7 of Basic Statistics by B.L.
Agarwal.
Example : 1.7 : A professional economist approaches the contractor and offers him the use of his forecast.
The consultant does not give the exact probabilities of a fixed percentage price rise,
but only tells about the trend, i.e. whether there will be a fast rise in prices or whether prices will
rise at a slow rate during the contract. For brevity, we write the forecasts as 'fast' and 'slow'. Along with
this information, the economist also gives a reliability statement for the various price rises.
For a 5% price rise, the probability of the forecast 'fast' is 0.2, i.e. P (Fast / 5%) = 0.2, and of the
forecast 'slow' is 0.8, i.e.
P (Slow / 5%) = 0.8
For an 8% price rise, the probability of the forecast 'fast' is 0.7, i.e. P (Fast / 8%) = 0.7, and of the
forecast 'slow' is 0.3, i.e. P (Slow / 8%) = 0.3. For a 10% price rise, the probability of the forecast 'fast' is
0.9, i.e. P (Fast / 10%) = 0.9, and of the forecast 'slow' is 0.1, i.e. P (Slow / 10%) = 0.1. We know that the
prior probabilities assessed by the contractor in the EVPI analysis are
P (5%) = P (5% price rise) = 0.4
It can be verified that the sum of the probabilities :
Once we have the posterior probabilities, there is no sense in confining ourselves to the
prior probabilities. Hence, we calculate the expected monetary values under the forecasts 'Fast'
and 'Slow' respectively, for the fixed price and the cost plus percentage contracts.
For the forecast 'Fast'
EMV (Fixed price) = 22500 x 0.1538 + 6000 x 0.6731 – 5000 x 0.1731
= 6633.60
EMV (Cost plus) = 19550 x 0.1538 + 19880 x 0.6731 + 20100 x 0.1731
= 19867.33
EMV for the cost plus percentage contract is greater than EMV for the fixed price contract under the
forecast 'Fast', even after the use of posterior probabilities.
For the forecast 'Slow'
EMV (Fixed price) = 22500 x 0.6667 + 6000 x 0.3125 – 5000 x 0.0208
= 16771.75
EMV (Cost plus) = 19550 x 0.6667 + 19880 x 0.3125 + 20100 x 0.0208
= 19664.56
Again, for the price rise forecast 'Slow', the cost plus percentage contract is better than the fixed price
contract. We can also calculate the expected value of selecting the cost plus percentage contract under
both kinds of forecast. For the contractor's problem the expected value is,
To avoid confusion, it is worthwhile to point out that a situation may arise in which a decision
maker selects one course of action with the forecast 'Fast' and another course of
action with the forecast 'Slow'.
If the contractor has to pay a fee to the consultant, say Rs.500, this amount has to be deducted
from the expected value. In the present example, there is no gain after paying any fee, and so the
contractor would not like to buy the forecast.
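The posterior analysis of this example can be retraced with Bayes' rule. In the sketch below the reliabilities P(forecast / price rise) and the contract payoffs are those quoted in the example. Only the prior P(5%) = 0.4 is stated in the text; the priors 0.5 and 0.1 for the 8% and 10% price rises are an assumption, back-solved so that the posterior probabilities quoted above (0.1538, 0.6731, 0.6667, 0.3125, 0.0208) are reproduced. All function and variable names are ours.

```python
# Prior probabilities: 0.4 is from the text; 0.5 and 0.1 are back-solved assumptions.
priors = {"5%": 0.4, "8%": 0.5, "10%": 0.1}

# Reliability of the economist: P(forecast | price rise), from the example.
likelihood = {
    "fast": {"5%": 0.2, "8%": 0.7, "10%": 0.9},
    "slow": {"5%": 0.8, "8%": 0.3, "10%": 0.1},
}

# Contract payoffs for each price rise, as used in the EMV lines of the example.
payoffs = {
    "fixed price": {"5%": 22500, "8%": 6000, "10%": -5000},
    "cost plus": {"5%": 19550, "8%": 19880, "10%": 20100},
}


def posterior(forecast):
    """Return P(forecast) and the posterior P(price rise | forecast)."""
    joint = {s: priors[s] * likelihood[forecast][s] for s in priors}
    marginal = sum(joint.values())
    return marginal, {s: j / marginal for s, j in joint.items()}


def emv(contract, probs):
    return sum(probs[s] * payoffs[contract][s] for s in probs)


value_with_forecast = 0.0
for forecast in likelihood:
    marginal, post = posterior(forecast)
    emvs = {c: emv(c, post) for c in payoffs}
    best = max(emvs, key=emvs.get)
    value_with_forecast += marginal * emvs[best]
    print(forecast, {s: round(p, 4) for s, p in post.items()},
          {c: round(v, 2) for c, v in emvs.items()}, "->", best)

value_without_forecast = max(emv(c, priors) for c in payoffs)
gain = value_with_forecast - value_without_forecast
print(f"Expected gain from the forecast = {gain:.2f}")
```

Under these assumptions the cost plus contract is chosen under both forecasts, so the forecast does not change the decision and the expected gain from it is zero. The EMV figures differ slightly from those in the text because the text rounds the posterior probabilities to four decimals.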
QUESTIONS
1. Payoff table showing profits (Lakhs of rupees) for various sizes of plants and demand levels is
given below:
What capacity plant should he install?
> 40,000 .3 50 50 60
3. An icecream costs a shopkeeper Rs.3.00 and he sells it for Rs.4.00. The demand pattern
of icecreams per day with their respective probabilities is as follows:
Monetary returns
Size of Hotel Good demand Medium demand Poor demand
50 rooms 50,000 30,000 10,000
100 rooms 80,000 60,000 20,000
200 rooms 1,00,000 90,000 40,000
5. A shipbuilding company has launched a programme for the construction of a new class of
ships. Certain spare units, like the prime mover, each costing Rs.2,00,000, have to be purchased.
If these units are not available when needed, a very serious loss is incurred, which is of the order
of Rs. 10,000,000 in each instance. Requirements of the spares with corresponding
probabilities are given below:
No. of Spares : 0 1 2 3 4 5
Probability of requirement : 0.876 0.062 0.041 0.015 0.005 0.001
How many spares should the company buy in order to optimize inventory decision?
6. Two companies, Hindustan Electro-carbon Ltd. and Poly Chemicals Ltd., expect to
announce their plans for next year's operations on the same day. One vital issue that the shareholders of
each company, as well as the general public, have an interest in is the position that each of the
companies will take regarding the problem of pollution. If one company, for example, declares
its intent to take action towards stopping pollution, its public image will be greatly improved.
But on the other hand such action could increase its costs and put it in a bad position with
respect to its competitor, if the competitor chooses not to take the same course of action. Each
company can take any of the following three actions:
3. Intention to continue as in the past. The payoffs for the actions are:
Determine the optimal course of action for each company.
(i) Maximin (ii) Minimax (iii) Maximax (iv) Laplace (v) regret criterion is applied.
9. YZ Co. Ltd. wants to go in for a public share issue of Rs.10 lakhs (1 lakh shares of Rs.10 each)
as part of its effort to raise the capital needed for its expansion programme. The company is
optimistic that if the issue were made now it would be fully taken up at a price of Rs.30 per
share.
However, the company is facing two crucial situations, both of which may influence the share
price in the near future, namely:
i. An impending wage dispute with assembly workers, which could lead to a strike in the
whole factory and could have an adverse effect on the share price.
ii. The possibility of substantial business in the export market, which would increase the
share price. The four possible events and their expected effects on the company's share
price are envisaged as:
E3 : No strike and export business lost - share price hovers around Rs.32.
E4 : Strike and export business lost - share price drops to Rs.16.
The management has identified three possible strategies that the company could adopt, viz.,
S2 : Issue 1,00,000 shares only after the outcomes of (i) and (ii) are known.
S3 : Issue 50,000 shares now and 50,000 shares after the outcomes of (i) and (ii) are known.
1. Draw up a payoff table for the company and determine the minimax regret solution. What
alternative criteria might be used?
2. Determine the optimum policy for the company using the criterion of maximising expected
pay-off, given the estimate that the probability of a strike is 55 % and there is a 65 % chance of
getting the export business, these probabilities being independent.
3. Determine the expected value of perfect information for the company.
10. A group of volunteers of a service organisation raises money each year by selling gift articles
outside the stadium after a football match between Team X and Team Y. They can buy any of three
different types of gift articles from a dealer. Their sales are mostly dependent on which team
wins the match. A conditional pay-off table is as under:
ii. Which type of gift article should the volunteers buy if the probability of Team X winning is
0.8?
REFERENCES
Agarwal, B.L. : 'Basic Statistics', Wiley Eastern Ltd., New Delhi, 2nd ed., 1991.
Byrkit, D.R. : 'Elementary Business Statistics', D. Van Nostrand Company, New York.
Hoel, P.G. and Jessen, R.J. : 'Basic Statistics for Business and Economics', John Wiley, New
York, 1982.
Lapin, L.L. : 'Quantitative Methods for Business Decisions', Harcourt Brace Jovanovich, New
York, 2nd ed., 1981.
Richard, L.E. and Lacava, J.J. : 'Business Statistics', McGraw-Hill Book Company, New
York, 1978.
- End Of Chapter -
LESSON - 18
Preamble
Testing of hypotheses is an important tool of Statistics. Many tests are used in statistical
analysis, but some are used more frequently than others. The chi-square test is one of the most
frequently used tests, because it is applicable in a large number of fields such as
Biology, Agriculture, Psychology, Education, Management, etc. The test makes use of the
chi-square distribution, which is why it is called the chi-square test. The chi-square distribution is
used to determine the critical values of the chi-square variate at various levels of significance.
Like other tests, the chi-square test entails null and alternative hypotheses, the two types of error in
testing a hypothesis, which lead to the level of significance and the power of the test, and degrees of freedom. The
details of these are omitted here.
The chi-square test is applicable to testing a hypothesis about the variance of a normal population,
to testing the goodness of fit of a theoretical distribution, and to testing the independence of attributes when the
frequencies are presented in a two-way table, known as the contingency table, according to two attributes
classified in various categories.
The chi-square test dates back to 1900, when Karl Pearson gave the test statistic for frequencies
classified into k mutually exclusive categories.
Chi-square Statistic
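Suppose there are k mutually exclusive classes and theoretically it is expected that they are likely to occur
in the ratio
r1 : r2 : r3 : … : rk
Let O1, O2, O3, …, Ok be the observed frequencies in the k classes C1, C2, C3, …, Ck respectively. Also
suppose E1, E2, E3, …, Ek are the expected (theoretical or hypothetical) frequencies under the null
hypothesis, calculated by the formula
$$E_i = N \cdot \frac{r_i}{r_1 + r_2 + \dots + r_k}, \qquad N = \sum_{i=1}^{k} O_i$$
The chi-square statistic is then
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
which, under the null hypothesis, has (k - 1) degrees of freedom.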
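The calculation of expected frequencies from a hypothesised ratio, and of the statistic itself, can be sketched in Python as below. We assume the usual Pearson form, the sum of (O - E)² / E over the k classes. The observed counts and the 9 : 3 : 3 : 1 ratio are illustrative only and do not come from the text; the critical value 7.815 for 3 d.f. at the 5% level is the one quoted in the exercises of this lesson.

```python
def expected_from_ratio(observed, ratio):
    """Split the total observed frequency among the classes in the given ratio."""
    total = sum(observed)
    return [total * r / sum(ratio) for r in ratio]


def chi_square(observed, expected):
    """Pearson's statistic: sum over classes of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [315, 101, 108, 32]   # illustrative counts, not from the text
ratio = [9, 3, 3, 1]             # hypothesised ratio r1 : r2 : r3 : r4

expected = expected_from_ratio(observed, ratio)
statistic = chi_square(observed, expected)
degrees_of_freedom = len(observed) - 1

CRITICAL_3DF_5PCT = 7.815        # tabulated chi-square, 3 d.f., 5% level
reject = statistic > CRITICAL_3DF_5PCT

print("Expected frequencies:", expected)
print(f"chi-square = {statistic:.3f} on {degrees_of_freedom} d.f.")
print("Reject H0" if reject else "Accept H0")
```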
Properties
Contingency table
The data are often based on a count of the persons, items, units or individuals which possess
certain attributes. Here we categorise individuals according to two attributes, say A and B. Suppose
the attribute A has p categories and B has q categories. For example, the attribute A may represent
the father's height, categorised as tall, medium and short, and the attribute B the son's height, categorised
as tall, medium and short. The frequencies are to be observed for all combinations of father's
height and son's height.
A contingency table with attributes A and B having p and q categories is displayed below.
Attribute A is taken along the rows and B along the columns. Oij is the frequency of the (i, j)th cell, which
represents the number of individuals in the group who possess the attributes Ai and Bj, for i = 1,
2, …, p and j = 1, 2, …, q. A contingency table with p > 2 and/or q > 2 is called a manifold
contingency table. A contingency table with p rows and q columns is known as a contingency
table of order (p x q).
Table 2-1 : Contingency table
The contingency table holds certain relations
Also, each row total or column total is known as a marginal total.
The chi-square test is a test of independence of attributes in a contingency table. From table 2-1,
it is clear that a contingency table is a rectangular array having rows and columns arranged according to the
categories of the attributes A and B.
The test statistic for the chi-square test in the case of a contingency table of order (p x q) is,
$$\chi^2 = \sum_{i=1}^{p} \sum_{j=1}^{q} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
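The observed frequency Oij is already available in the contingency table. The question that
remains is how to obtain the expected frequency Eij corresponding to each Oij.
Under H0, the independence of attributes, the expected frequency is
$$E_{ij} = \frac{(\text{total of the } i\text{th row}) \times (\text{total of the } j\text{th column})}{n} \qquad (2.3)$$
where n is the total frequency.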
Decision criteria:
The calculated value of chi-square is compared with the tabulated value of χ² for (p-1)(q-1) d.f.
and the prefixed level of significance α. If the calculated value of χ² is greater than the tabulated
value of chi-square for (p-1)(q-1) d.f. and α level of significance, reject H0. It means that the
attributes are dependent on each other, i.e. whether a unit possesses the attribute Ai has a bearing on whether it possesses the attribute Bj. On the
other hand, if the calculated value of χ² is less than the tabulated value of χ² for (p-1)(q-1) d.f.
and α level of significance, then H0 is accepted. It shows that the presence of the attribute A has no
bearing on the presence of the attribute B.
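The whole procedure, from expected frequencies to the decision, can be sketched as follows. We assume the usual rule that the expected frequency of a cell is its row total times its column total divided by n. The 2 x 3 table of observed frequencies is hypothetical; the 5% critical values are those quoted in this lesson for 1, 2 and 3 d.f.

```python
def expected_frequencies(table):
    """E_ij = (row total i) * (column total j) / n under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return the chi-square statistic and its degrees of freedom (p-1)(q-1)."""
    expected = expected_frequencies(table)
    statistic = sum((o - e) ** 2 / e
                    for obs_row, exp_row in zip(table, expected)
                    for o, e in zip(obs_row, exp_row))
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


observed = [[20, 30, 50],
            [30, 30, 40]]        # hypothetical 2 x 3 contingency table

statistic, dof = chi_square_independence(observed)
CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815}   # tabulated values at the 5% level
decision = "reject H0" if statistic > CRITICAL_5PCT[dof] else "accept H0"

print("Expected:", expected_frequencies(observed))
print(f"chi-square = {statistic:.3f} on {dof} d.f. -> {decision}")
```

Note that the expected frequencies in each row and column add up to the same totals as the observed ones, which is a useful check on the arithmetic.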
Example 2-1:
Suppose a survey is conducted to know the opinion of the workers of a factory on whether the various
types of incentive have any relationship with the category of worker. The data collected
through the survey are displayed in the table below:
Ho : Choice of the type of incentive scheme is independent of the category of worker.
against
To test Ho we apply the chi-square test. For this, we first calculate the expected frequencies for all
the cells by the formula (2.3). The expected frequencies are computed below and entered in the
above contingency table in parentheses. The frequencies are also rounded off to the nearest unit
value.
Similarly,
E31 = 98.9 ≈ 99; E32 = 39.78 ≈ 40; E33 = 76.3 ≈ 76
Here it should be kept in mind that the sum of the expected frequencies for each row and column is
the same as the sum of the observed frequencies.
The degrees of freedom for the given contingency table of order 4x3 are 3x2 = 6. Let the prefixed level of
significance be α = 0.05. Readers are referred to appendix table VI of Basic Statistics
by B.L. Agarwal. From the table of the χ² distribution, χ² for 6 d.f. and 5% level of
significance is 12.59. The calculated value of χ² = 131.16 is greater than 12.59. Hence we reject H0 and
conclude that the choice of the type of incentive scheme is associated with the type of worker.
Example 2-2:
The following table gives the number of breakdowns of three machines in three shifts during a
month.
The hypothesis that the number of breakdowns on the machines is independent of the shift can
be tested by the chi-square test.
Similarly,
The expected frequencies are displayed in the contingency table itself in parentheses.
For the given contingency table of order (3x3), the degrees of freedom for the statistic χ² are 4.
The tabulated value of chi-square for 4 d.f. and α = 0.05, from appendix table VI of Basic Statistics
by B.L. Agarwal, is 9.488, which is greater than the calculated value of χ² = 3.08. Hence, we accept
H0. It leads to the conclusion that the breakdowns have nothing to do with the shifts.
There are many situations in which the contingency table has 2 rows and 2 columns. In the case of a
contingency table of order 2x2, a short cut method, i.e. a direct formula for chi-square, can be
used. This formula is derived from the direct approach.
Suppose the contingency table of order (2x2) is,
                a          b          a + b
                c          d          c + d
Total           a + c      b + d      n
where n is the sample size. One can of course calculate the value of chi-square by calculating
the expected frequencies. But by algebraic manipulation, the direct formula for the chi-square
statistic is
$$\chi^2 = \frac{n\,(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)} \qquad (2.4)$$
χ² has 1 d.f.
Decision about the independence of the attributes A and B can be taken in the usual way, i.e. if
χ²cal > χ²(α, 1), reject H0. It means that the attributes A and B are dependent. Again, if χ²cal < χ²(α, 1), accept
H0. It leads to the conclusion that the attributes A and B are independent.
Example 2-3 : A sample group of 50 persons was vaccinated against malaria. A control
group of 50 persons was also observed in the same colony. The following results were obtained.
Here we test,
Ho : Vaccination has nothing to do with the prevalence of malaria against H1 : Vaccination
prevents the occurrence of malaria. The value of chi-square by the formula (2.4) is,
The tabulated value of χ² for 1 d.f. and α = 0.05 level of significance is 3.841, which is
less than the calculated value of χ² = 21.23. Hence, we reject H0. It leads to the conclusion that
vaccination prevents malaria.
Example 2-4:
In a survey of 200 children, of which 80 were intelligent, 40 of the intelligent children had skilled fathers, while 85 of the
unintelligent children had unskilled fathers. Does this information support the hypothesis that
skilled fathers have intelligent children?
The hypothesis,
against H1 : Skilled fathers have intelligent children can be tested by the χ²-test.
Calculated χ² = 8.9 is greater than the tabulated value of chi-square for 1 d.f. and α = 0.05, i.e.
3.841. Hence, we reject H0. It means that skilled fathers have intelligent children.
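The value 8.9 of Example 2-4 can be checked in Python. The sketch assumes that formula (2.4) is the standard shortcut for a 2x2 table, n(ad - bc)² divided by the product of the four marginal totals, and that the table is read from the wording of the example as follows: of the 80 intelligent children 40 had skilled and 40 unskilled fathers, and of the 120 unintelligent children 35 had skilled and 85 unskilled fathers. The function names are ours.

```python
def chi_square_2x2(a, b, c, d):
    """Shortcut chi-square for the 2x2 table [[a, b], [c, d]] (1 d.f.)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi_square_long_hand(a, b, c, d):
    """Same statistic via expected frequencies, as a cross-check."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    observed = (a, b, c, d)
    expected = [r * col / n for r in rows for col in cols]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Example 2-4, as read from its wording (rows: intelligent, unintelligent;
# columns: skilled father, unskilled father).
a, b = 40, 40
c, d = 35, 85

statistic = chi_square_2x2(a, b, c, d)
long_hand = chi_square_long_hand(a, b, c, d)
print(f"shortcut = {statistic:.3f}, via expected frequencies = {long_hand:.3f}")
print("Reject H0" if statistic > 3.841 else "Accept H0")
```

Both routes give the same value, which rounds to the 8.9 quoted in the example.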
Yates' correction
Alternative approach
Instead of adding and subtracting 0.5 from the cell frequencies, a formula for chi-square has
been developed which already incorporates these adjustments. In this way, the bother
of adding and subtracting 0.5 and dealing with fractional frequencies is avoided. The
formula for the contingency table of order (2x2) is
$$\chi^2 = \frac{n\,\left(|ad - bc| - \frac{n}{2}\right)^2}{(a+b)(c+d)(a+c)(b+d)} \qquad (2.5)$$
The quantity |ad-bc| represents the absolute value of the difference (ad-bc). It means that whether
the difference is positive or negative, it has to be taken as positive.
Here it is worth pointing out that the value of the statistic χ² obtained by adjusting the cell
frequencies and calculating χ² by the formula (2.4) is always the same as the value obtained directly from the
formula (2.5).
Example 2-5 :
The following data give the number of persons, in a sample of 200 addicts, classified by habit of
drug addiction and survival after ten years.
To test whether survival and addiction are independent or not, we use the chi-square test. Since
a cell frequency is only 3, we will use Yates' correction. We solve the above example both by adjusting
the frequencies and directly by the formula.
To implement Yates' correction, we add 0.5 to the cell frequency 3 and subtract 0.5 from 117, and
again add 0.5 to 13 and subtract 0.5 from 67. In this way, the adjusted contingency table is,
The test statistic χ² can be calculated by the formula (2.4) from the above table.
The two approaches lead to the same value of the statistic χ². The tabulated value of chi-square for 1 d.f.
and 5 per cent level of significance is 3.841. The calculated value of χ² is greater than the
tabulated value. Hence we reject H0. It leads to the conclusion that addiction and survival are
related to each other.
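The agreement between the two routes to Yates' correction can be demonstrated numerically. The sketch below assumes the standard forms of the two formulas: the shortcut n(ad - bc)² / [(a+b)(c+d)(a+c)(b+d)] applied to the adjusted cells, and the corrected formula with (|ad - bc| - n/2)² in the numerator applied to the original cells. Each cell is moved by 0.5 towards its expected frequency, which leaves the marginal totals unchanged. The cell frequencies used are hypothetical, since the table of Example 2-5 is not reproduced here.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected shortcut formula for a 2x2 table."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi_square_yates(a, b, c, d):
    """Direct formula with Yates' correction for continuity."""
    n = a + b + c + d
    return (n * (abs(a * d - b * c) - n / 2) ** 2
            / ((a + b) * (c + d) * (a + c) * (b + d)))


def adjust_cells(a, b, c, d):
    """Move every cell by 0.5 towards its expected frequency."""
    n = a + b + c + d
    expected = [(a + b) * (a + c) / n, (a + b) * (b + d) / n,
                (c + d) * (a + c) / n, (c + d) * (b + d) / n]
    return [o + 0.5 if o < e else o - 0.5
            for o, e in zip((a, b, c, d), expected)]


cells = (2, 18, 9, 11)            # hypothetical 2x2 table with one small frequency
adjusted = adjust_cells(*cells)
via_adjustment = chi_square_2x2(*adjusted)
via_formula = chi_square_yates(*cells)

print("Adjusted cells:", adjusted)
print(f"by adjustment = {via_adjustment:.4f}, by direct formula = {via_formula:.4f}")
```

The sketch does not handle the degenerate case in which an observed frequency equals its expected frequency exactly.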
Example 2-6 :
The number of companies classified by licensed and unlicensed companies and the proportion
of profits were as tabulated below:
The hypothesis
H0 : Proportion of profit and licensor type are independent.
against
Since a cell frequency is only 2, one has to make use of Yates' correction for continuity.
Again we calculate the statistic χ² by adjusting the frequencies as well as by the direct formula,
to show that both approaches lead to the same value. The contingency table after
adjustment comes out to be
χ²cal = 0.375 is less than the tabulated value χ²(0.05, 1) = 3.841. So we accept H0. We conclude
that the type of the company has no bearing on the proportion of profits.
Coefficient of contingency
Rejection of the independence of two factors reveals that the factors are associated with each
other, but it fails to delineate the strength of dependence. This can be measured by the
coefficient of contingency. The formula for the coefficient of contingency is,
$$C = \sqrt{\frac{\chi^2}{\chi^2 + n}}$$
where χ² is the calculated value of the chi-square statistic and n is the sample size.
If the value of chi-square is zero, C = 0. If χ² is large and n is small, the value of C is near unity, but
it never attains 1. A value of C near zero shows a poor degree of dependence,
while a value of C nearing unity shows a high degree of dependence between the two
factors. For a contingency table of order 5x5, the maximum value of C is 0.894.
It should be kept in mind that if the χ² test shows independence, the coefficient of contingency should
not be calculated.
Example 2-7:
We calculate the value of the coefficient of contingency for the example 2-1. The value of χ² = 131.16
The value of C = 0.916 is near to unity. Hence, there is a strong association between the type of
workers and incentive schemes.
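A short Python sketch of the coefficient of contingency follows. It assumes Pearson's form, C = sqrt(χ² / (χ² + n)). The upper bound sqrt((k - 1)/k) for a square k x k table is a standard result that is not stated in the text; for k = 5 it reproduces the value 0.894 quoted above. The illustration uses the figures of Example 2-4 (χ² of about 8.89 with n = 200), and the function names are ours.

```python
import math


def contingency_coefficient(chi_square, n):
    """Pearson's coefficient of contingency C = sqrt(chi2 / (chi2 + n))."""
    return math.sqrt(chi_square / (chi_square + n))


def max_coefficient(k):
    """Largest attainable C for a square k x k table: sqrt((k - 1) / k)."""
    return math.sqrt((k - 1) / k)


c_value = contingency_coefficient(8.889, 200)   # figures of Example 2-4
print(f"C = {c_value:.3f}")
print(f"Maximum C for a 5x5 table = {max_coefficient(5):.3f}")
print(f"Maximum C for a 2x2 table = {max_coefficient(2):.3f}")
```

Because the attainable maximum depends on the order of the table, a value of C is best judged against the maximum for a table of the same order.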
QUESTIONS
1. A survey was conducted to investigate the views of male and female
groups separately about the size of family. Their opinions are tabulated below:
Test the hypothesis that opinion about the size of family and sex are independent. If H0 is
rejected, find the value of the coefficient of contingency.
2. A sample of respondents classified by social class and political thinking is tabulated below:
Socialist 24 106 20
Capitalist 10 34 56
3. The candidates from rural and urban areas appeared in a competition and the results
pertaining to their selection or non-selection were as follows:
Result
Area Selected Not selected
Rural 40 110
Urban 80 20
4. The following table provides the results of a survey on a group of persons to find out whether
smoking causes lung cancer.
Can it be concluded that smoking and lung cancer are independent?
5. Let 'A' represent the new therapy and 'a' the old one, and let 'B' represent those who die and
'b' those who remain alive. The information about 500 subjects is put in the following
table.
A a
B 28 72
b 112 288
6. The table given below shows the data obtained during an epidemic of cholera:
[Given X2.05 = 3.841 for 1 d.f, 5.991 for 2 d.f, 7.815 for 3 d.f.]
7. The following table reveals the condition of the house and the condition of the children.
Using the chi-square test, find out whether the condition of the house affects the condition of the
children.
8. The following table gives the joint distribution of 120 fathers and sons with respect to hair
colour.
Hair Grey 7 18 20 45
Colour Brown 10 10 25 45
Total 25 40 55 120
On the hypothesis of chance, find out if there is significant association between fathers and sons
with respect to hair colour. Use the coefficient of contingency C.
- End Of Chapter -
LESSON - 19
MEASURES OF ASSOCIATION
Introduction
There are two types of studies, one regarding variables and the other regarding attributes.
The association between variables is usually measured through correlation studies, whereas the
association between two attributes is measured through chi-square. The association
between two or more attributes (qualitative factors) is also measured through specially defined
coefficients of association, like Yule's coefficient, the coefficient of colligation, etc.
The point to emphasise here is that correlation can be worked out for variables or characters
which can be measured quantitatively. On the other hand, association of attributes is applicable
to those factors or characters for which we can determine only the presence or absence of an
attribute.
Notations
Since we are dealing with characters of units showing the absence or presence of attributes,
various classes are to be formed. When we are dealing with one attribute, say A, we have one
class denoted by A, showing the presence of the attribute, and the other denoted by 'α' or 'a', showing the
absence of the attribute. If there are two attributes, say A and B, four classes are formed
out of the presence of the attributes A and B and their absence α and β. The classes will be AB,
Aβ, αB and αβ. Similarly, three attributes A, B and C will have eight classes out of the presence of the
attributes A, B and C and their absence α, β and γ, namely ABC, ABγ, AβC, Aβγ, αBC, αBγ, αβC and αβγ.
The frequencies of the various classes are denoted by enclosing the classes in parentheses. For
instance, the number of units showing the presence of the attribute A is denoted by (A) and the number showing
the absence of A by (α). Similarly, the frequency of the class Aβ is denoted by (Aβ), that of the class
αβ by (αβ), and so on.
Terminology
Class frequency:
The number of units or individuals belonging to any class is known as its class frequency. As
stated, the class frequency (B) denotes the number of units or subjects possessing the attribute B.
The class frequency (Ab) denotes the frequency of class Ab i.e. the number of the units or
subjects possessing the attribute A and not possessing the attribute B. In the same manner any
other class frequency can be defined.
In the classification of units or subjects in a group under consideration, the frequencies of the
classes of the highest order are called ultimate class frequencies. For example, if only one
attribute A is considered for the units or subjects, then (A) and (a) are the ultimate class
frequencies. For two attributes A and B, (AB), (Ab), (aB) and (ab) are the ultimate class
frequencies. Similarly, for three attributes A, B and C, the frequencies (ABC), (ABγ), ..., (abγ) are
the ultimate class frequencies, and so on.
Another point to emphasize is that any lower order frequency can be expressed in terms of
ultimate frequencies. For example, in the case of two attributes, the first order frequencies can be
expressed in terms of the second order (ultimate) frequencies, e.g.
(a) = (aB) + (ab)
and, in the case of three attributes,
(b) = (AbC) + (Abγ) + (abC) + (abγ)
and so on.
Example 3-1:
Given the ultimate class frequencies for three attributes A, B and C as follows:
(ABC) = 120, (ABγ) = 638, (AbC) = 220, (Abγ) = 760, (aBC) = 200, (aBγ) = 1162, (abC) = 161,
(abγ) = 1500
Similarly,
(C) = (ABC) + (AbC) + (aBC) + (abC) = 120 + 220 + 200 + 161 = 701
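The same bookkeeping can be sketched in a few lines of Python. This is only an illustration of the rule that a lower order frequency is the sum of the ultimate frequencies containing it. The dictionary keys and the helper name `class_frequency` are assumptions made for this sketch, with the letter "g" standing in for γ.

```python
# Ultimate class frequencies of Example 3-1.
# Assumption for this sketch: "g" stands for γ, the absence of C.
ultimate = {
    "ABC": 120, "ABg": 638, "AbC": 220, "Abg": 760,
    "aBC": 200, "aBg": 1162, "abC": 161, "abg": 1500,
}

def class_frequency(symbols, ultimate):
    """Sum the ultimate frequencies of every class containing all given symbols."""
    return sum(freq for cls, freq in ultimate.items()
               if all(s in cls for s in symbols))

N = sum(ultimate.values())                 # total number of units
freq_A = class_frequency("A", ultimate)    # (A)
freq_B = class_frequency("B", ultimate)    # (B)
freq_C = class_frequency("C", ultimate)    # (C)
freq_AB = class_frequency("AB", ultimate)  # (AB)

print(N, freq_A, freq_B, freq_C, freq_AB)
```

The value obtained for (C) agrees with the 701 worked out above.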
The order of a class depends on the number of attributes involved in defining it. A class
having one attribute, say A, is known as a class of the first order. A class involving two
attributes, such as AB, Ab, aB or ab, is called a class of the second order. Similarly, classes like ABC,
ABγ, AbC etc. are called classes of the third order, and so on.
Number of frequencies:
For two attributes, the number of frequencies = 3² = 9. The nine frequencies for two attributes A
and B, covering the positive, negative and ultimate classes, can usefully be displayed in a two-way table
as follows:
From the above table it is obvious that
(A) = (AB) + (Ab)
(B) = (AB) + (aB)
Note :
From the above we can make the general statement that a higher order class frequency can never be
greater than the lower order class frequency to which it belongs.
In like manner, we can give the class frequencies for three attributes A, B and C. The number of
frequencies in this case will be 3³ = 27.
Inconsistency of data :
As a rule, no class frequency can be negative under any circumstances. If one is negative, the
data are said to be inconsistent. For example, if any of the following inequalities holds, the data
will be inconsistent.
When three attributes are under consideration and any of the following inequalities holds, the
data will be inconsistent.
ii) (ABC) < (AB) + (AC) - (A) => (Abγ) will be -ve.
iii) (ABC) < (AB) + (BC) - (B) => (aBγ) will be -ve.
iv) (ABC) < (AC) + (BC) - (C) => (abC) will be -ve.
Inconsistent data are not suitable for finding a measure of association. Hence, such data
should either be corrected, if possible, or rejected.
Consistency of data :
In contrast to inconsistency of data, if none of the ultimate class frequencies
associated with the same population is negative, the data are said to be consistent. It means
that the data are in conformity with each other and are suitable for further analysis. For the test
of consistency of data, the signs of the inequalities given for inconsistency of data should be
reversed and the word negative (-ve) changed to positive (+ve). Hence it can be stated that,
for consistency of data, no class frequency should be negative.
Example 3-2;
Given the class frequencies as below = (A)60, (b) =175, (AB) - 160, and N = 250
We can test whether the data are consistent or not by preparing the following two-way table.
         A              a               Total
B        (AB) = 220     (aB) = -45
Total    (A) = 380      (a) = -130      N = 250
In the above table the frequencies of the classes (aB) and (a) are negative. Hence, the given data are
inconsistent.
         A              a              Total
B        (AB) = 70      (aB) = 20      (B) = 90
b        (Ab) = 155     (ab) = 55      (b) = 210
Total    (A) = 225      (a) = 75       N = 300
Since no class frequency is negative, we conclude that the data are consistent.
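A consistency check of this kind is easy to automate. The sketch below fills in the four ultimate frequencies of a two-way table from (A), (B), (AB) and N and reports whether any of them is negative. The function names are illustrative choices, and the second data set is hypothetical.

```python
def two_way_table(A, B, AB, N):
    """Return the ultimate class frequencies implied by (A), (B), (AB) and N."""
    return {
        "AB": AB,
        "Ab": A - AB,          # (Ab) = (A) - (AB)
        "aB": B - AB,          # (aB) = (B) - (AB)
        "ab": N - A - B + AB,  # (ab) = N - (A) - (B) + (AB)
    }

def is_consistent(table):
    """Data are consistent when no ultimate class frequency is negative."""
    return all(freq >= 0 for freq in table.values())

# Figures from the table above: (A) = 225, (B) = 90, (AB) = 70, N = 300.
table = two_way_table(A=225, B=90, AB=70, N=300)
print(table, is_consistent(table))

# Hypothetical inconsistent data: (AB) cannot exceed (B).
bad = two_way_table(A=60, B=50, AB=55, N=100)
print(bad, is_consistent(bad))
```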
Example 3-4:
The following information was supplied by a tabulator about three attributes A, B and C.
N = 900, (A) = 200, (B) = 80, (C) = 10, (AB) = 180, (AC) = 140,
The consistency of the data supplied by the tabulator can be tested by using the inequality
if it holds, the data are consistent and if not, the data are inconsistent.
= 910
Given (ABC) = 300 is less than the value 910 of the class frequency (ABC) obtained from the
inequality. Hence, the data are inconsistent.
Type of association
Positive association :
Two attributes A and B are said to be positively associated if the presence of an attribute is
accompanied by the presence of the other. For instance, health and cleanliness are positively
associated attributes.
Negative association :
Two attributes are said to be negatively associated if the presence of one ensures the absence of
the other.
Independence :
Two attributes are called independent if the presence or absence of one attribute has nothing to
do with the presence or absence of the other.
If two attributes A and B are independent, then the proportion of B's amongst the A's is expected to be
the same as the proportion amongst the a's, and vice versa, i.e.
(AB)/(A) = (aB)/(a) and (AB)/(B) = (Ab)/(b)
Hence, A and B are independent if the proportion of AB in the population is equal to the product
of the proportions of A and B, i.e. (AB)/N = [(A)/N] × [(B)/N].
Methods for measures of association
i. Frequency method
ii. Proportion method
iii. By Yule's coefficient
iv. By Yule's coefficient of colligation
Frequency method :
In this method we compare the observed frequency of the joint class for two attributes A and B
with its expected frequency. If the observed frequency is greater than the
expected frequency, the association between the attributes is taken to be positive, and if
less, negative. In case the observed frequency of the joint class is equal to its expected
frequency, the two attributes are independent.
Consider two attributes A and B in a population of size N with respective class frequencies (A),
(B) and (AB). The expected frequency of the joint class (AB), by the multiplicative law of
probability under independence, is
(A)(B)/N
and so on for the other joint classes.
The main drawback of this method is that we cannot find the degree of association between the
two attributes. What we get is the kind of association only.
Example 3-5:
Given the following data, find out whether the attributes A and B are independent, positively
associated or negatively associated.
Therefore the expected frequency of (AB) = (A)(B)/N = (310)(610)/N
= 343.8
Example 3-7:
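Given the class frequencies as (AB) = 15, (Ab) = 25, (aB) = 45, (ab) = 5, find the type of
association between A and B.
We know,
N = (AB) + (Ab) + (aB) + (ab) = 15 + 25 + 45 + 5 = 90
(A) = (AB) + (Ab) = 15 + 25 = 40
(B) = (AB) + (aB) = 15 + 45 = 60
Expected frequency of (AB) = (A)(B)/N = (40)(60)/90 = 26.7
(AB) = 15 < 26.7
Since the observed frequency is less than the expected frequency, A and B are negatively associated.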
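The comparison of observed and expected frequency used in this example can be sketched as a small function. This is a minimal illustration, not a prescribed procedure. The function name is an assumption, and the comparison is done on the cross-products (AB)·N and (A)·(B) so that exact independence is not missed through rounding.

```python
def frequency_method(AB, Ab, aB, ab):
    """Classify the association of A and B from the four ultimate frequencies."""
    N = AB + Ab + aB + ab
    A = AB + Ab
    B = AB + aB
    expected = A * B / N  # expected (AB) under independence
    # Compare (AB) * N with (A) * (B) using integers to avoid rounding issues.
    if AB * N > A * B:
        kind = "positive"
    elif AB * N < A * B:
        kind = "negative"
    else:
        kind = "independent"
    return expected, kind

# Data of Example 3-7.
expected, kind = frequency_method(AB=15, Ab=25, aB=45, ab=5)
print(round(expected, 1), kind)
```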
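Proportion method :
Two attributes A and B are said to be unassociated if the proportion of A's amongst the B's is the same as
amongst the b's. If the proportion of A's amongst the B's is greater than amongst the b's, there is a positive
association between A and B, and if it is less, A and B are negatively associated. Symbolically, the two
attributes are positively associated, independent or negatively associated according as
(AB)/(B) >, = or < (Ab)/(b)
The same conclusions hold when the proportion of B's amongst the A's is compared with that amongst the a's,
i.e. according as (AB)/(A) >, = or < (aB)/(a).
In the proportion method too, one obtains only the kind of association and not the degree of
association.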
Example 3-8:
In a study of 200 persons, 120 are poor. Out of 100 persons who suffered from T.B., 50 are poor.
Find the kind of association between poverty and T.B.
Suppose A represents the attribute poor and B represents the attribute T.B. From the given question,
N = 200, (A) = 120, (B) = 100 and (AB) = 50.
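The comparison of proportions for this example can be sketched as follows. The sketch assumes the reading given above, namely N = 200, (A) = 120 poor, (B) = 100 T.B. patients and (AB) = 50 poor T.B. patients. The function name is illustrative.

```python
def proportion_method(N, A, B, AB):
    """Compare the proportion of A's amongst the B's with that amongst the b's."""
    b = N - B    # (b): units without attribute B
    Ab = A - AB  # (Ab): units with A but without B
    p_in_B = AB / B
    p_in_b = Ab / b
    # Cross-multiplied integer comparison of (AB)/(B) with (Ab)/(b).
    if AB * b > Ab * B:
        kind = "positive"
    elif AB * b < Ab * B:
        kind = "negative"
    else:
        kind = "independent"
    return p_in_B, p_in_b, kind

p_in_B, p_in_b, kind = proportion_method(N=200, A=120, B=100, AB=50)
print(p_in_B, p_in_b, kind)
```

Under these figures the proportion of poor persons is 0.5 amongst T.B. patients and 0.7 amongst the others, which points to a negative association.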
Yule's coefficient of association :
Yule's coefficient is the most popular measure of association. The main advantage of Yule's
coefficient is that it tells not only the kind of association but also its degree.
For two attributes A and B the class frequencies can be displayed in a two-way table as given
below:
The value of Q lies between -1 and +1. If Q = 1, there is a perfect positive association. If Q = -1,
there is a perfect negative association.
A high positive value, i.e. greater than 0.5, shows a high degree of positive association between A
and B. A negative value of Q less than -0.5 shows a high degree of negative association between A
and B. Other values of Q can be interpreted similarly.
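As a hedged illustration, the sketch below uses the standard textbook definitions, Q = [(AB)(ab) - (Ab)(aB)] / [(AB)(ab) + (Ab)(aB)] for Yule's coefficient of association and Y = [√((AB)(ab)) - √((Ab)(aB))] / [√((AB)(ab)) + √((Ab)(aB))] for the coefficient of colligation. The function names are assumptions, and the data are the class frequencies of Example 3-7.

```python
import math

def yule_q(AB, Ab, aB, ab):
    """Yule's coefficient of association Q (standard definition)."""
    return (AB * ab - Ab * aB) / (AB * ab + Ab * aB)

def yule_y(AB, Ab, aB, ab):
    """Yule's coefficient of colligation Y (standard definition)."""
    same = math.sqrt(AB * ab)   # square root of the product of like classes
    cross = math.sqrt(Ab * aB)  # square root of the product of unlike classes
    return (same - cross) / (same + cross)

# Class frequencies of Example 3-7: (AB) = 15, (Ab) = 25, (aB) = 45, (ab) = 5.
Q = yule_q(15, 25, 45, 5)
Y = yule_y(15, 25, 45, 5)
print(round(Q, 3), round(Y, 3))

# The two coefficients are linked by Q = 2Y / (1 + Y^2).
print(round(2 * Y / (1 + Y ** 2), 3))
```

Both coefficients come out negative here, in line with the negative association found for the same data by the frequency method.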
Example 3-9 :
We prepare a two-way table of frequencies for two attributes A and B and calculate Yule's
coefficient of association, given that
N m 800, (fl) = 330, (A) = 420, (AB) = 110 with the help of the given data, the two way
frequency table is,
Q = 0.79 leads to the conclusion that the habits of smoking and alcohol drinking are highly
positively associated.
Coefficient of colligation:
Here the value of Q is same as obtained in example (3-10)
Partial association :
The association between two attributes A and B may sometimes be due to the presence of a third
factor C. Hence it is germane to find out the association between A and B in the sub-populations
C and γ. The associations between A and B in the sub-populations C and γ are
called the partial associations and are denoted by QAB.C and QAB.γ. The formulae of partial associations are
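In their usual form, the partial coefficients apply Yule's Q within each sub-population, e.g. QAB.C = [(ABC)(abC) - (AbC)(aBC)] / [(ABC)(abC) + (AbC)(aBC)], with γ in place of C for QAB.γ. The sketch below assumes this standard form. The function name and dictionary keys are illustrative, "g" stands for γ, and the data are the ultimate class frequencies of Example 3-1.

```python
def partial_q(ultimate, third):
    """Yule's Q between A and B inside the sub-population `third` ("C" or "g")."""
    AB = ultimate["AB" + third]
    Ab = ultimate["Ab" + third]
    aB = ultimate["aB" + third]
    ab = ultimate["ab" + third]
    return (AB * ab - Ab * aB) / (AB * ab + Ab * aB)

# Ultimate class frequencies of Example 3-1 ("g" stands for γ).
ultimate = {
    "ABC": 120, "ABg": 638, "AbC": 220, "Abg": 760,
    "aBC": 200, "aBg": 1162, "abC": 161, "abg": 1500,
}

q_in_C = partial_q(ultimate, "C")  # association of A and B amongst the C's
q_in_g = partial_q(ultimate, "g")  # association of A and B amongst the γ's
print(round(q_in_C, 3), round(q_in_g, 3))
```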
Spurious association :
An association between two attributes A and B that is due only to some other factor which has not
been taken into consideration is known as spurious association.
QUESTIONS
A, B and C as,
Find the association between marital status and the success in the examination both for boys
and girls.
4. In a class test in which 135 candidates were examined for proficiency in English and
Economics, it was discovered that 75 students failed in English, 80 failed in Economics, and 50
failed in both. Find if there is any association between failing in English and Economics and also
state the magnitude of association.
5. Given the following frequencies of the positive classes: (A) = 950, (B) = 1100, (C) = 590,
(AB) = 450, (AC) = 250, (BC) = 200, (ABC) = 120 and N = 10,000.
Find the frequencies of the remaining classes and test the consistency of data.
6. Out of 1000 people consulted, 811 liked chocolates; 752 liked toffees and 418 liked sweets,
570 liked chocolates and sweets and 348 liked toffees and sweets; 297 liked all three. Is this
information correct?
7. Calculate the coefficients of partial association between A and B for the sub-populations C and
γ from the following frequencies.
8. State the condition for two attributes A and B to be independent. Given the following
information.
9. Prepare a 2×2 table from the following information and calculate Yule's coefficient of
colligation.
REFERENCES
Agarwal, B.L., 'Basic Statistics', Wiley Eastern Ltd., New Delhi, 2nd ed., 1991.
Ansari, M.A., Gupta, O.E and Chaudhari, S.S., 'Applied Statistics', Kedar Nath Ram Nath
and Co., Meerut, 1980.
Garg, N.L., 'Practical Problems in Statistics', Ramesh Book Depot, Jaipur, 1978.
Gupta, S.C. and Kapoor, V.K., 'Fundamentals of Mathematical Statistics', Sultan Chand,
New Delhi, 7th ed., 1980.
Sancheti, D.C. and Kapoor, V.K., 'Statistics', Sultan Chand, New Delhi, 1978.
- End Of Chapter -
LESSON - 20
NON-PARAMETRIC TEST
Introduction
We are well versed with the theory of parametric tests, of which the t, Z and F tests are the most commonly
used. But these tests are valid only under certain assumptions. The most common assumption
is that the variable follows a normal distribution, i.e. the sample has come from a normal
population. But the assumption of normality is not always true. Even then, parametric tests are
used freely under the umbrella of the central limit theorem. If the distribution is highly
skewed or the sample size is not large enough for the central limit theorem to hold, either the
parametric tests will not be reliable or the results obtained from them will be erroneous.
Usually the population parameters, mean and variance, are estimated through sample values.
But such estimation is not meaningful if the observations are on a nominal scale, like good,
satisfactory, bad etc., or on an ordinal scale. In this situation too, nonparametric tests yield good
results.
The tests which do not require knowledge of the shape of the distribution are known as distribution free tests. The
tests which do not depend on the parameters of the distribution, like mean and variance, are
termed nonparametric tests. In practice both these terms are used as synonyms.
Spearman's rank correlation was a breakthrough in the theory of nonparametric statistics, which
was taken up by Harold Hotelling, and later the testing of hypotheses was initiated by Wilcoxon,
who proposed a test for the two sample case. The chi-square test is an interesting case which is
categorized as a parametric as well as a nonparametric test. Now a large number of tests have been
developed and nonparametric statistics is gaining ground day by day.
Advantages of nonparametric tests
1. Nonparametric tests do not require any assumptions about the population distribution,
particularly about the normality of the population.
2. The calculations are simple.
3. Nonparametric tests are not complicated to understand.
4. Nonparametric tests are applicable to all types of data: quantitative, nominal, ordinal etc.
5. Many nonparametric tests are applicable even to small samples.
6. Nonparametric tests are based on a few mild assumptions.
Objectives
Generally, the hypotheses tested are about the median of the distribution of the population, the
randomness of the sample, or whether populations have the same or hypothetical distributions.
Problem of ties
If the variable is continuous there is, in theory, no question of ties amongst the observations. But
owing to limitations of measurement, rounding of figures etc., ties do occur. Two observations are said
to be tied if they are equal. In ranking the observations, the problem arises of how to award ranks
to equal observations. Suppose the ith group consists of ri tied observations.
In the mid-rank method all the tied observations are arranged in order and ranked as if they were not tied.
Then the average of the ranks of all the tied observations of a group is found and each tied
observation is given the same rank, equal to this average. This is the simplest
and most frequently used method of dealing with ties.
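The averaging of ranks within a tied group can be sketched as below. This is a plain-Python illustration with a hypothetical function name and made-up observations.

```python
def mid_ranks(values):
    """Rank values in ascending order, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the group of observations tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Positions i..j would have received ranks i+1..j+1; use their average.
        average_rank = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

# Hypothetical observations with one pair of ties.
print(mid_ranks([10, 20, 20, 30]))
```

Here the two tied observations would have taken ranks 2 and 3, so each receives the rank 2.5.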
In the average statistic method the tied observations are arranged in all possible ways and ranked. The value of
the test statistic is calculated for each arrangement and the average of these values is used as
the final value of the test statistic to take a decision about H0. But this method is seldom used since
it needs too lengthy calculations.
Instead of using the average value of the statistic over all possible arrangements, one may choose the
value which minimizes the probability of rejection. This minimizes the probability of
type I error.
Another method is simply to omit the tied observations. This is the simplest method but entails loss of
information, as the sample size is reduced by
the number of tied values. The method can be used only when the number of tied values is small
as compared to the sample size.
Non-parametric tests are based on a few mild assumptions which are given below:
1. The first assumption is about the continuity of the distribution function. This is required to
determine the sampling distribution.
One sample case:
Runs test:
It is common practice to state that the sample drawn is random, or that people standing in a
queue are in random order. But the assumption of randomness may not hold in many cases.
Hence, it becomes necessary to perform a test for randomness. Before we discuss the runs test, it
is worthwhile to define a run.
Definition of run:
A run is a sequence of like symbols preceded and followed by different kind of symbol(s) or no
symbol(s).
Different runs may be exhibited by separating them with small vertical lines. A pattern of runs
having a systematic arrangement of symbols shows a lack of randomness. For instance, a queue with
all ladies ahead and all men behind them, or one in which a lady and a gent alternate,
shows a lack of randomness. If we denote a lady by F and a gent by M, sequences of the
type
F F F F M M M M
M F M F M F M F
are considered to be nonrandom, whereas a sequence of the type
M M F F F M F F M M F M M M M F F F
is considered to be random.
As a general principle, too many runs or too few runs are a mark of nonrandomness, whereas an
adequate number of runs in a sequence confirms randomness.
Now we consider the problem of testing randomness in the case of one sample of size n having two
kinds of symbols, a and b, numbering n1 and n2 respectively, so that n = n1 + n2. The null hypothesis H0: the
symbols a and b occur in random order in the sequence, against the alternative H1: the symbols a
and b do not occur in random order, can be tested by the runs test. Suppose the number of runs
of symbol a is r1 and that of symbol b is r2, and let r1 +
r2 = r. In order to perform a test of hypothesis based on the random variable R, the total number of runs, we need to
know the probability distribution of R under H0.
When r is even
For r even, the number of runs of both types must be the same, i.e. r1 = r2 = r/2.
When r is odd
For r odd, r1 = r2 ± 1. In this situation, the sum is taken over two pairs of values, r1 = (r - 1)/2
and r2 = (r + 1)/2, and vice versa.
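For reference, the exact null distribution of R in its standard combinatorial form can be sketched as follows. With C(p, q) the binomial coefficient, P(R = 2k) = 2 C(n1-1, k-1) C(n2-1, k-1) / C(n, n1), and P(R = 2k+1) = [C(n1-1, k) C(n2-1, k-1) + C(n1-1, k-1) C(n2-1, k)] / C(n, n1). The function name is an assumption of this sketch.

```python
from math import comb

def runs_pmf(r, n1, n2):
    """P(R = r) under H0 for a sequence of n1 symbols a and n2 symbols b."""
    total = comb(n1 + n2, n1)  # number of distinguishable arrangements
    if r < 2:
        return 0.0
    if r % 2 == 0:
        k = r // 2  # r1 = r2 = k
        ways = 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1)
    else:
        k = (r - 1) // 2  # one symbol has k + 1 runs, the other k
        ways = (comb(n1 - 1, k) * comb(n2 - 1, k - 1)
                + comb(n1 - 1, k - 1) * comb(n2 - 1, k))
    return ways / total

# Small check: with n1 = n2 = 2 there are 6 arrangements and R is 2, 3 or 4.
print([runs_pmf(r, 2, 2) for r in (2, 3, 4)])

# The probabilities over all possible r add up to one.
print(sum(runs_pmf(r, 8, 10) for r in range(2, 19)))
```

Critical values such as those tabulated by Swed and Eisenhart are obtained by cumulating these probabilities from either tail.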
Decision criteria :
To decide about H0, the observed number of runs is compared with the lower and
upper critical numbers of runs. The critical values of the number of runs can be seen in appendix
Tables XI-Fi and XI-Fii of Basic Statistics by B.L. Agarwal for the level of significance α and n1 and n2
symbols.
If the observed value of r lies between the critical values, the hypothesis of randomness is
accepted; otherwise it is rejected.
Example 4-1 :
For the sequence
aa bbb aa bb a bbbb aaa b
we test the null hypothesis H0: the symbols 'a' and 'b' occur in random order.
n1 = 8, n2 = 10 and n = 18
From the tables, the lower and upper critical numbers of runs are 5
and 15. The observed number of runs = 8, which lies between 5 and 15. Hence we conclude
that the symbols 'a' and 'b' are in random order.
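Counting runs by hand is error-prone for long sequences, so a small sketch may help. It relies on `itertools.groupby`, which groups consecutive equal items. The function name is an illustrative choice, and the critical values 5 and 15 are the tabulated ones quoted in the text.

```python
from itertools import groupby

def count_runs(sequence):
    """Number of runs, i.e. maximal blocks of identical consecutive symbols."""
    return sum(1 for _ in groupby(sequence))

# The sequence aa bbb aa bb a bbbb aaa b, written without spaces.
sequence = "aabbbaabbabbbbaaab"
n1 = sequence.count("a")
n2 = sequence.count("b")
r = count_runs(sequence)
print(n1, n2, r)

# Two sided decision with the tabulated critical values quoted in the text.
lower, upper = 5, 15
random_order = lower < r < upper
print(random_order)
```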
Further remark : In many cases the observations are taken one after the other and we take it
for granted that they are taken randomly. To make sure, the runs test can be applied even to
quantitative observations.
Such data are dichotomized by taking the deviations from the median. A
positive difference is symbolised by 'a' and a negative difference by 'b'. In this way we have a
sequence of a's and b's. The runs test can then be applied in the usual way.
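The dichotomization about the median can be sketched as below. The data are hypothetical, the function name is illustrative, and dropping observations equal to the median is an assumption of this sketch rather than a rule stated in the text.

```python
from itertools import groupby
from statistics import median

def dichotomize(values):
    """Replace each value by 'a' (above the median) or 'b' (below the median)."""
    m = median(values)
    # Assumption: observations exactly equal to the median are omitted.
    return "".join("a" if v > m else "b" for v in values if v != m)

# Hypothetical observations in the order in which they were taken.
observations = [3, 8, 12, 15, 6, 4, 14, 11]
symbols = dichotomize(observations)
runs = sum(1 for _ in groupby(symbols))
print(median(observations), symbols, runs)
```

The resulting counts n1, n2 and the number of runs are then referred to the runs tables in the usual way.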
Example 4-2 :
21.6, 15.6, 18.4, 30.8, 25.9, 17.6, 31.4, 33.8, 9.5, 11.6, 28.9, 40.2, 12.7, 7.4, 19.6, 54.3
The hypothesis of randomness of the set of observations can be tested in the following manner.
The observations arranged in ascending order are
7.4, 9.5, 11.6, 15.6, 17.6, 18.4, 19.6, 21.6, 25.9, 28.9, 30.8, 31.4, 33.8, 40.2, 51.8, 54.3
Now taking the deviations from the median and denoting the +ve and -ve differences by the symbols a and b, the
sequence of symbols comes out to be
bbbaabaabbaabbba
n1 = 7, n2 = 9
The critical values at α = 0.05 for n1 = 7 and n2 = 9 from the appendix Tables XI-Fi and XI-Fii of
Basic Statistics by B.L. Agarwal are 4 and 14. The number of runs in the sample, i.e. 8, lies
between 4 and 14. Hence, it can be said that the sample is a random sample.
Two sample case:
Wald-Wolfowitz runs test:
This test is applied to test the identity of two populations. If X and Y are two variables
measuring the same character, then we want to test
H0 : FX(x) = FY(x) for all x against H1 : FX(x) ≠ FY(x) for some x.
Here we shall be testing H0 against H1 on the basis of runs obtained from two samples drawn
from the two populations. Let two samples of sizes m and n be drawn from population X and population Y,
with distribution functions FX(x) and FY(x) respectively. Suppose the
two independent samples are
Procedure :
Combine both the samples and arrange the observations in ascending order. Under the
assumption of continuous variables, no ties should occur. In the combined sample, a record must be
kept of which observation belongs to which sample. For this, the observations belonging to a
particular sample may be underlined or marked with some other identification.
Let the combined sequence of order statistics with m = 7 and n = 8 be as follows:
In the above sequence, there are 4 runs of X's and 5 runs of Y's. In this way we have in all 9
runs.
Now we define a random variable R denoting the number of runs in the combined sequence of m
X's and n Y's. As we know, too few runs tend to reject H0, as too many runs do. The Wald-Wolfowitz
runs test of size α has a critical region given by the inequality
R < rα
In the two sample case, we have the letters X and Y instead of 'a' and 'b' in the one sample case. So the
distribution of R is the same as that found in the one sample case given by (3.1), with n1 = m and n2 = n. The
only difference between the one sample and two sample runs tests is that in the Wald-Wolfowitz runs
test we use only a one sided test, whereas in the one sample case we use a two sided test.
Decision criteria :
To decide about H0, we compare the value R = r with the critical value of r at level of significance α
and sample sizes n1 and n2 (m and n), obtained from the tables prepared by Swed and Eisenhart.
The same table is reproduced in Table XI-F(i) of Basic Statistics by B.L. Agarwal. If the observed
number of runs r is less than the critical value rα for n1, n2 (m, n), we
reject H0. It means that the two populations are not identical. If r
> rα, we accept H0.
In case of large samples, we can apply the normal deviate test. When m and n are both
greater than 10, a normal approximation may be made, assuming that R is approximately normally
distributed with mean E(R) = 2mn/(m + n) + 1 and variance
V(R) = 2mn(2mn - m - n)/[(m + n)²(m + n - 1)], so that
Z = [R - E(R)]/√V(R)
Decision criteria :
If the calculated value of |Z| is greater than Zα, which is 1.96 for the 5% level of significance, reject H0.
It means the two populations are different. If |Z| < Zα, it can be concluded that the two
populations are identical.
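The large sample statistic can be sketched with the standard mean and variance of the number of runs, E(R) = 2mn/(m + n) + 1 and V(R) = 2mn(2mn - m - n)/[(m + n)²(m + n - 1)]. The function name and the observed count r = 9 below are hypothetical values chosen only for illustration.

```python
from math import sqrt

def runs_z(r, m, n):
    """Normal deviate for r observed runs in a combined sequence of m X's and n Y's."""
    mean = 2 * m * n / (m + n) + 1
    variance = 2 * m * n * (2 * m * n - m - n) / ((m + n) ** 2 * (m + n - 1))
    return (r - mean) / sqrt(variance)

# Hypothetical case: m = n = 12 and r = 9 observed runs.
z = runs_z(r=9, m=12, n=12)
print(round(z, 2))

# Decision at the 5% level using the critical value 1.96 quoted in the text.
reject = abs(z) > 1.96
print(reject)
```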
Problem of ties:
Theoretically, no ties should occur. If they occur within a sample, they cause no
problem. But if they occur across samples, a problem arises. The solution to the problem of
ties in this case is to break the ties in all possible ways and compute r in each case.
Choose the largest value of r to take a decision about H0. The largest value of r is preferred because
it is the value least favourable to the rejection of H0.
Example 4-3 :
A sample each was drawn from two populations one of smokers and the other of nonsmokers.
Their clinical tests were conducted to measure the effects on their blood, cells, lungs etc. and the
scores were as follows:
Whether there is a significant difference between the scores of the smoker and nonsmoker
populations can be tested by the Wald-Wolfowitz test in the following manner.
Here we test
To test H0 against H1, we combine the two samples and put them in an ordered sequence.
__ 18 12 20 21 22 23 25 26 27 30
The total number of runs in the combined sequence is three. Compare it with the critical value of
R at α = 0.05 with n1 = m = 7 and n2 = n = 8. The critical value of r is 4. The calculated value of r
is less than the critical value. Hence, we reject H0. This shows that the scores of smokers and
nonsmokers are not identically distributed.
Example 4-4 : The production of an ore during the months of the years 1990 and 1991 was as
follows:
Can it be regarded that the production of ore during the months of the two years is a random process?
The hypothesis,
This can be tested by the Wald-Wolfowitz runs test. Since m and n are both 12, which is more than 10,
we will apply the large sample test.
Now we combine the samples and arrange them in an ordered sequence (ascending order).
The calculated value of z = 1.67 is less than the tabulated value of z = 1.96 at 5% level of
significance. Hence, we accept that the production of ore during the months is a random
process.
Runs up and runs down test:
Before we describe the test, it is logical to define runs up and runs down. In this
method the magnitude of each observation of a set is compared not with a single value but with
the one immediately preceding it in the given sequence of observations. If the preceding value is
smaller, a run up starts, and if it is greater, a run down commences. Let a run up be denoted by
a +ve sign and a run down by a -ve sign. So the sequence of +ve and -ve signs is reflected in the
form of runs. For example, consider a sequence of seven observations 5, 8, 3, 9, 6, 7, 10.
The sequence of runs up and runs down will be + - + - + +. In the given sequence, there are four
runs up and two runs down.
Test procedure
The hypothesis
Ho : sequence is random
can be tested by the runs up and runs down test. From the above discussion it is clear that if there
are n observations there will be in all (n-1) runs up and runs down, i.e. (n-1) +ve and -ve signs.
Let us suppose that there are V runs in the sequence of +ve and -ve signs.
The decision about H0 can be taken by comparing the value of the probability obtained from
Table M of Nonparametric Methods for Quantitative Analysis by
J.D. Gibbons with the predecided level of significance α. Usually we take α = 0.05 or 0.01. If the
tabular probability for n and V is less than α, we reject H0; otherwise we accept H0.
Note : For a two tailed test the probability obtained from Table M should be doubled.
Example 4-5 :
5, 8, 3, 9, 6, 7, 10
The hypothesis
H0 : sequence is random
can be tested by the runs up and runs down test in the following manner. The runs of signs are
| + | - | + | - | + + |
Here n = 7, V = 5
The probability P for n = 7, V = 5 from Table M as referred to above is 0.4417. Thus, 2P = 0.8834,
which is greater than the predecided level of significance α = 0.05. Hence, we accept H0. It means
the sequence is random.
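The sign sequence and the number of runs V can be obtained mechanically, as in the sketch below. The function name is illustrative, and the sketch assumes that no two consecutive observations are equal. The tabular probability itself still has to be read from Table M.

```python
from itertools import groupby

def up_down_signs(values):
    """'+' where an observation exceeds the preceding one, '-' where it is smaller."""
    # Assumption: consecutive observations are never equal.
    return "".join("+" if later > earlier else "-"
                   for earlier, later in zip(values, values[1:]))

observations = [5, 8, 3, 9, 6, 7, 10]
signs = up_down_signs(observations)
V = sum(1 for _ in groupby(signs))  # number of runs of like signs
print(signs, len(observations), V)
```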
QUESTIONS
1. In a queue for a bus in Delhi, the following arrangement of adults according to sex
was observed:
On the basis of the above sequence of males and females, denoted by M and F respectively, can it
be concluded that there was a lack of randomness?
2. The test scores gained by boys and girls in an oral examination were as follows:
Test by the Wald-Wolfowitz test whether the distributions of scores earned by boys and girls are
identical.
3. A row of cotton plants in a field had the following sequence of healthy (H) and diseased (D)
plants.
HHDHHHDDHDDDHHHH
Test the hypothesis of randomness against the alternative of clustering (one tailed test).
4. The number of births occurring on seven days of a week in a hospital were as given below:
Births (X): 15 22 38 28 18 25 18
By the method of runs up and runs down, test that the number of births during the week days is
a random process against clustering. (Use one tailed test).
7 2 .0250 5 .4417
3 .1909 4 .8091
5. The test scores in shooting competition were as given below out of 500 shots.
6. The efficiency of eight workers in a factory was measured in terms of time consumed by them
in completing a task before and after the lunch.
Time in (Minutes)
Before Lunch: 8.6, 11.5, 12.2, 9.6, 10.4, 12.4, 13.5, 8.8
After Lunch : 10.2, 13.5, 10.5, 12.5, 11.6, 15.2, 16.2, 17.5
REFERENCES
Agarwal, B.L., 'Basic Statistics', Wiley Eastern Ltd., New Delhi, 2nd ed., 1991.
Daniel, W.W., 'Applied Nonparametric Statistics', Houghton Mifflin Company, Boston, 1978.
Gibbons, J.D., 'Nonparametric Methods for Quantitative Analysis', Holt, Rinehart and
Winston, 1976.
- End Of Chapter -
LESSON - 21
TIME SERIES AND DETERMINATION OF TREND
Introduction
An economic system is dynamic and changes with time. National income, national
production, demand, supply, wages etc. move with time. Data pertaining to economic
variable(s) collected in chronological order are called a time series. The study of the movement of
economic factor(s) in chronological order leads to many conclusions for present policy
making, planning and management. Estimates can also be made well in advance so as to
regulate production, supply, prices etc. A large number of economists and statisticians have
defined time series in their own way. Firstly we give the definition of time series alone.
The study of a time series to evaluate the changes occurring over time and to look for the causes
for these changes is known as the analysis of time series. Analysis of time series defined by
various workers is also quoted below:
1. W.Z. Hirsch : A main objective in analysing time series is to understand, interpret and
evaluate changes in economic phenomena in the hope of more correctly anticipating the
course of future events.
2. P.H. Karmel : The analysis of time series has developed in the main as a result of
investigations into the nature and causes of those fluctuations in economic activity called
trade cycles. Economic theory has suggested various explanations of trade cycles.
Analysis of time series has attempted to test the plausibility or otherwise of these
theories. At the same time, such analysis may suggest new hypotheses for economic
theorists to work on.
The changes in a time series may be due to long term or short term factors. The factors
responsible for short and long term changes should be studied separately. Also, while analysing
time series data quantitatively, the analyst should always use his own judgement and logic
before taking a decision. Now we study the components of a time series and their analysis.
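(i) Trend or secular trend (T)
(ii) Seasonal variation (S)
(iii) Cyclical variation (C)
(iv) Irregular variation (I)
The original data 'O' influenced by these four components can be expressed by the following
models.
Multiplicative model:
O = T × S × C × I (1.1)
Additive model:
O = T + S + C + I (1.2)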
The analysis of time series is an attempt to segregate these components from the time series data
and find their influence. If we remove the influence of any one or more, the data are left with the
remaining ones. If we eliminate the influence of T, S and C, it is easy to infer that whatever
variation is present in the data is due to irregular movements and not due to assignable causes
like trend, seasonal or cyclical changes.
Secular trend :
Secular trend depicts the long-term movement. It is one of the main components of a time
series. It measures the slow changes occurring in a time series over a long period. For example,
we may find the rate of increase or decrease of food production over the last 20 years, or see the
changes in GNP over the last two decades. Secular trend, or simply trend, does not take into
consideration the short term fluctuations but takes care only of the long term movement.
Trend is found out not only at the national level but also by individual companies and industries to
regulate their finances, marketing, production etc. There are various methods of estimating
trend. They are named below:
Graphical methods
Mathematical method
iii. Fitting of a Parabola
Editing of data :
Before we analyse a time series data, it is necessary to edit the data keeping in view the purpose
of study. In the analysis, usually a value over time is compared with the other. So the data
should be adjusted to ensure comparability. Generally, the data are adjusted for:
iii. Population changes : Consumption figures alone cannot be a criterion for the
demand of a particular item. They have to be viewed in the light of population. The
liking or demand for an item may be decreasing but the total demand may be increasing
due to the increased population. Hence, demand should be measured in terms of
consumption per head, and while comparing the figures of various periods in a time
series an adjustment for population changes should be made in the data.
iv. Miscellaneous changes : While comparing the figures in time series data, one
should take care of the type of items also. In the beginning there were no colour TV sets.
Hence to compare the production of colour TV sets in value with that of black and white
sets will not be proper. Hence, suitable adjustments have to be made.
Also the units of measurement often change. Miles and yards are no longer in use; the units
are now kilometres, metres or kilograms. Hence, while comparing the figures of two periods in
which the units are not the same, the figures should be reduced to the same units.
Fitting of Trend
Freehand method:
This is one of the graphical methods of fitting a trend line. Here we plot the time series data on
graph paper, choosing suitable scales to represent time along the X-axis and variate values along
the Y-axis. Once all the data are plotted in the form of dots or crosses on the graph, a straight
line is drawn with a transparent scale between the points in such a way that almost half of the
points are above the line and half are below it, and as many points as possible lie on it. The line
will be a good fit if the sum of all the vertical deviations of the points from the line (taken with
their signs) is zero. But this much accuracy is not easily attainable. Usually the best line is
fitted visually.
The freehand method is not much preferred because, for the same data, the line of best fit will
vary from person to person. A trend line drawn by the freehand method therefore provides only
rough estimates and is not suitable for predictions.
Example 1-1 :
Gross National Product (GNP) at 1980-81 prices for public administration, defence and other
services is given below for the years 1977-78 to 1988-89.
Years      GNP
1977-78    2.7
1978-79    4.3
1979-80    7.3
1980-81    4.1
1981-82    3.5
1982-83    7.8
1983-84    3.6
1984-85    7.3
1985-86    7.5
1986-87    7.9
1987-88    8.0
1988-89    5.6
For the data given above, the trend line can be fitted on the graph by freehand method in the
following manner.
Take years on the abscissa at intervals of 1 cm and let 1 cm = 1 unit of GNP on the ordinate.
Then draw a trend line with a transparent scale as depicted in Fig. 21.1 below.
Semi-average method :
The problem of drawing the trend merely by the judgement of the investigator is avoided in this
method. The time series is divided into two equal halves, consisting of the first half of the years
and the last half of the years, and the average of each half series is calculated. The time series
data are plotted in the usual way on graph paper, and the average values are also plotted against
the mid periods of the corresponding half series. The points representing the average values are
joined by a straight line, and the line may be extended beyond the points also. This line
represents the trend line.
(i) The number of time periods is odd and in this situation it is not possible to divide the series
into two equal halves. The problem can be resolved by either including the middle most value in
both the halves or neglecting it.
(ii) The number of years (periods) in half of the series is even. In that situation no year is mid-
year of the half series. To resolve this problem, the average value should be plotted against the
mid-point of the two middle years in both the half series.
ii) It does not depict the true trend line.
iii) It does not ensure that the short term and long term fluctuations are
eliminated.
Example 1-2 : For the data given in example 1-1, the trend line by semi-average method can be
fitted in the following manner.
Since the number of years in each half series is six, which is an even number, no year will be the
middle year of a half series. The average values will have to be plotted against 1st July 1979 and
1st July 1985.
As per practice, the original data should be joined through dotted lines and the trend line by a
smooth dark line.
The graph showing the trend line by semi-average method is given below.
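The two semi-averages for this example can be checked with a few lines of Python. This is only a sketch: the function name semi_averages and the choice to drop the middle value of an odd-length series are illustrative assumptions, not part of the original worked solution.

```python
# GNP series from Example 1-1 (1977-78 to 1988-89).
gnp = [2.7, 4.3, 7.3, 4.1, 3.5, 7.8, 3.6, 7.3, 7.5, 7.9, 8.0, 5.6]


def semi_averages(values):
    """Return the averages of the first and the last half of a series.

    If the number of periods is odd, the middle value is left out
    (one of the two options mentioned in the text).
    """
    half = len(values) // 2
    first_half = values[:half]
    second_half = values[len(values) - half:]
    return sum(first_half) / half, sum(second_half) / half


first_avg, second_avg = semi_averages(gnp)

# The two averages sit at the mid-points of the half series, which are
# `half` years apart, so the yearly rise of the trend line is:
half = len(gnp) // 2
yearly_increment = (second_avg - first_avg) / half

print(round(first_avg, 2), round(second_avg, 2), round(yearly_increment, 3))
```

The trend line is then the straight line through the two plotted averages; extending it by the yearly increment gives rough trend values for other years.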
The semi-average method cannot eliminate short term fluctuations except seasonals which are
of little or no interest. But moving average method removes short term fluctuations as well.
Moving average method is one of the most popular methods in time series analysis. Firstly we
define a moving average.
A moving average is a series of averages of the variate values corresponding to sequences of a
fixed number of years (periods). Each sequence is formed by deleting the first year of the previous
sequence and adding the subsequent year. The process continues till all the years are exhausted.
Now the question that arises is to determine the years to be taken to form the first group, so that
the trend is reflected as a straight line. There is no hard and fast rule for it. As a principle, the
minimum number of years be taken together in a group, which result in a straight line for the
trend. As a thumb rule, the minimum number of years in a group, should be equal to the
number of years (periods) which form a business cycle. An idea of cycles can be got by studying
the short term fluctuations as adjudged by plotting the time series data. Once the number of
years to be included in the first group is finalised, the method follows mechanically.
The moving averages are plotted on graph paper by taking the averages along the Y-axis and the
years along the X-axis on a suitable scale. The points so plotted are joined sequentially.
The resulting graph provides the trend. If the points lie nearly on a straight line, the moving
average method gives a very clear picture of the trend. On the contrary, if they do not fall in a
line, we should search for a curvilinear trend. For a linear trend, the cycles should be regular in
amplitude and periodicity.
Merit:
The greatest advantage is that the moving average method reduces the influence of extreme values.
Demerits :
i. The moving averages are not available for some of the beginning and end years. Hence,
this method is not suitable for projections.
ii. There is hardly any series which has regular cycles. Hence to use the same number of
years for moving averages is not very logical.
iii. The method is not appropriate for the comparison of two series
iv. There is no hard and fast rule to decide the number of years to be taken in a group.
v. The method is not based on sound mathematical footing.
Example 1-3 :
For the data given in Example 1-1, we fit the trend by the method of moving averages,
taking 3 years in a cycle.
In the table below we reproduce the data along with the three-year moving averages.
The average values in the above table are plotted on the graph paper against the years at which
they are entered. The graph joining these points depicts the trend as shown in Fig. 21.3.
The trend is not linear.
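The three-year moving averages for the GNP series of Example 1-1 can be recomputed as follows. This is a minimal sketch; the helper name moving_average is an illustrative assumption.

```python
# GNP series and year labels from Example 1-1.
years = ["1977-78", "1978-79", "1979-80", "1980-81", "1981-82", "1982-83",
         "1983-84", "1984-85", "1985-86", "1986-87", "1987-88", "1988-89"]
gnp = [2.7, 4.3, 7.3, 4.1, 3.5, 7.8, 3.6, 7.3, 7.5, 7.9, 8.0, 5.6]


def moving_average(values, k):
    """Simple k-period moving averages (len(values) - k + 1 of them)."""
    return [sum(values[i:i + k]) / k for i in range(len(values) - k + 1)]


ma3 = moving_average(gnp, 3)

# Each 3-year average is entered against the middle year of its group,
# so the first and the last year get no moving average.
for year, avg in zip(years[1:-1], ma3):
    print(year, round(avg, 2))
```

Note that with 12 years and a 3-year period only 10 averages are obtained, which illustrates the first demerit listed above.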
Moving average method when the number of years in the cycle is even.
In this method the moving average is entered against the mid position of the two middle years.
Since this moving average does not correspond to any of the years given in the data, we again find
the moving average of these averages, taking two averages at a time, and enter them against the
mid position, which is a year of the given data. Once we have got the moving averages, they are
plotted on graph paper in the usual way and the trend is obtained by joining the plotted points
sequentially.
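The two-stage averaging for an even period can be sketched as below. The GNP series of Example 1-1 is reused purely for illustration, and the function name centred_moving_average is an assumption of this sketch.

```python
# GNP series and year labels from Example 1-1, reused for illustration.
years = ["1977-78", "1978-79", "1979-80", "1980-81", "1981-82", "1982-83",
         "1983-84", "1984-85", "1985-86", "1986-87", "1987-88", "1988-89"]
gnp = [2.7, 4.3, 7.3, 4.1, 3.5, 7.8, 3.6, 7.3, 7.5, 7.9, 8.0, 5.6]


def centred_moving_average(values, k):
    """Centred moving averages for an even period k."""
    if k % 2 != 0:
        raise ValueError("k must be even; odd periods need no centring")
    # First pass: k-period averages, which fall between two years.
    first_pass = [sum(values[i:i + k]) / k for i in range(len(values) - k + 1)]
    # Second pass: average successive pairs so each value falls on a year.
    return [(first_pass[i] + first_pass[i + 1]) / 2
            for i in range(len(first_pass) - 1)]


cma4 = centred_moving_average(gnp, 4)

# The j-th centred average belongs to the year at position j + k // 2.
for year, avg in zip(years[2:], cma4):
    print(year, round(avg, 3))
```

With a 4-year period the first two and the last two years are left without a centred average.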
Example 1-4 :
Net availability of per capita pulses per day (in gms) from 1972 to 1986 was as follows:
The trend by the moving average method, taking four-year moving averages, is given below. Prepare
the following table for moving averages.
The moving averages given in column (iv) of the above table are shown in Fig. 21.4.
A curvilinear trend is observed from the graph.
This method is totally mathematical and is free from all sorts of subjectivity. It is a very
appropriate and reliable method and is extensively practised. The procedure is exactly the
same as described for the fitting of a regression line or a regression curve.
Fitting a line or a curve means estimating the parameters of the equation and establishing a
prediction equation. The only difference between a regression line and a trend line is that in
the latter the X-variable is always the time period and the corresponding variate values stand for
the Y-variable.
Merits:
Demerits
1) In the model it is assumed that Y depends on time alone, which is not true in a large number of
cases. For instance, the increase in demand for various commodities is due to the population
increasing from year to year and not due to the time factor. If the population growth rate is
reduced, the least squares estimate will not be applicable.
Example 1-5:
The supply of electricity due to thermal projects from 1981 to 1989 ('000 M.W.) was as given
below:
Year   : 1981 1982 1983 1984 1985 1986 1987 1988 1989
Supply : 17.6 19.3 21.4 24.4 27.0 30.0 31.8 35.6 39.7
The trend line by the method of least squares can be fitted by preparing the following
computation table.
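The computation can be cross-checked with the short sketch below. It assumes the usual coding of time with the middle year (1985) as origin, so that the sum of x is zero and the least squares estimates of the line Y = a + bx reduce to a = ΣY/n and b = ΣxY/Σx²; the variable and function names are illustrative.

```python
# Data of Example 1-5: supply of electricity ('000 M.W.), 1981 to 1989.
years = list(range(1981, 1990))
supply = [17.6, 19.3, 21.4, 24.4, 27.0, 30.0, 31.8, 35.6, 39.7]

# Code the time variable about the middle year so that sum(x) == 0.
origin = years[len(years) // 2]          # 1985
x = [year - origin for year in years]    # -4, -3, ..., 3, 4
n = len(x)

# With sum(x) == 0 the normal equations give a and b directly.
a = sum(supply) / n
b = sum(xi * yi for xi, yi in zip(x, supply)) / sum(xi * xi for xi in x)


def trend(year):
    """Trend value from the fitted line Y = a + b * (year - origin)."""
    return a + b * (year - origin)


print(round(a, 3), round(b, 3))
print([round(trend(year), 2) for year in years])
```

The shortcut formulas hold only because the coded x values sum to zero; with an uncoded time variable the full pair of normal equations would have to be solved.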
Logarithmic trend line:
When the time series records per cent or proportional yearly changes, a logarithmic linear
trend is a better proposition. The equation of the logarithmic trend is,
Y = ab^X (1.9)
Taking logarithms, log Y = log a + X log b, or
Z = a1 + b1X (1.9.1)
where Z = log Y, a1 = log a and b1 = log b. This is a straight line equation, so (1.9) can be
fitted by getting the values of log a and log b.
Under the system of coding discussed in the case of the linear equation, the sum of odd powers of
x is zero. Therefore, Σx = Σx³ = 0. Using these relations, the normal equations for the parabola
Y = a + bx + cx² reduce to,
ΣY = na + cΣx²
ΣxY = bΣx²
Σx²Y = aΣx² + cΣx⁴
QUESTIONS
(i) Fit in a trend line by free hand
3 Below are the figures of production (in thousand tonnes) of a sugar factory
Years : 1981 1982 1983 1984 1985 1986 1987
Production : 70 80 85 82 90 100 96
('000 Tonnes)
4 The following data relate to the number of scooters (in lakhs) sold by a manufacturing
company during the years 1985 to 1992.
Year : 1985 1986 1987 1988 1989 1990 1991 1992
Number : 6 6.1 5.2 5 4.6 4.8 4.1 6.2
(in Lakhs)
Fit a straight line trend and estimate the sales for the year 1993. (Take the year 1988 as working
origin).
5 The following are data on the production (in '000 units) of a commodity from the years 1980
to 1986.
Year : 1980 1981 1982 1983 1984 1985 1986
('000 units)
Fit the trend of the type Y = a + bx + cx² to the above data (take 1983 as the year of origin).
7 Fit a linear trend equation to the following data by the method of least squares:
('000 tonnes)
- End Of Chapter -
LESSON - 22
SEASONAL INDICES IN TIME SERIES ANALYSIS
Objectives
Seasonal variations occurring in a time series refer to those changes which occur during the
periods (seasons) within a year. Their study gives an idea, for example, about the sales of
woollens in winter. Similarly, one can make an estimate of the sales of ice cream during summer.
Seasons have a lot of impact on sales, and unless an executive knows the quantum of sales or the
nature of the seasonal variations, policy decisions cannot be taken. The objectives of the study of
seasonal variations are twofold.
i. To isolate the effect of seasonality in order to evaluate the effect of seasonal factors on a
time series.
ii. To make the time series free from the effect of seasonal variation. This operation
facilitates to make long term forecasts. The elimination of seasonal effects from a time
series is known as deseasonalisation.
To isolate the seasonal variations, one will have to remove the trend, cyclical and irregular
variations. For a multiplicative model,
S = O / (T x C x I)
The study of seasonal variations refers to two type of seasonals. They are:
(ii) On the other hand the average of specific seasonality over a number of years is known as
typical seasonal.
Also, a season in a year is not just any run of consecutive months; it refers to the months in
which a particular season really occurs. For instance, the rainy season occurs from July to
September and not otherwise.
There are various methods of calculating seasonal indices, which are named below and then
explained.
(a) Simple average method
(b) Ratio to trend method
(c) Ratio to moving average method
(d) Link relative method
(i) Arrange for each year according to the quarters, months etc. for which the seasonal indices
are to be calculated.
(iv) Obtain the seasonal indices in percentages by dividing each month's average by the
overall average and multiplying by 100.
Check:
The exactness of seasonal indices can be checked by taking the sum of the indices. For monthly
data, it will be 1200 and for quarterly data 400.
(i) The upswing and downswing of cycles in the series are fairly balanced.
(ii) If any irregular movements are present, they are of a random nature and compensate each
other under averaging.
But such assumptions do not strictly hold, and hence this method provides only rough estimates.
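The simple average method can be sketched in a few lines. The quarterly figures below are hypothetical and chosen only to illustrate the arithmetic; the function name simple_average_indices is likewise an assumption of this sketch.

```python
# Hypothetical quarterly figures for three years (rows = years, columns = Q1..Q4).
data = [
    [30, 40, 36, 34],
    [34, 52, 50, 44],
    [40, 58, 54, 48],
]


def simple_average_indices(table):
    """Seasonal indices (in per cent) by the simple average method."""
    n_years = len(table)
    n_seasons = len(table[0])
    # Average of each season (quarter or month) over all the years.
    season_means = [sum(row[j] for row in table) / n_years
                    for j in range(n_seasons)]
    # Overall average of the seasonal averages.
    overall = sum(season_means) / n_seasons
    # Each seasonal average as a percentage of the overall average.
    return [100 * m / overall for m in season_means]


indices = simple_average_indices(data)
print([round(i, 2) for i in indices], round(sum(indices), 2))
```

Because every index is a percentage of the overall average, the indices for quarterly data add up to 400, which is the check described above.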
Example 2-1 :
The monthly production of fertilizer of a factory for four years was as given below:
The monthly seasonal indices can be calculated in the following manner. Calculated values are
given in the last three columns first to save space.
Check :
The sum of the seasonal indices is a little more than 1200. For greater accuracy the indices can
be adjusted by multiplying each seasonal index by the quantity 1200/(sum of indices).
(b) Ratio to trend method :
This is also called the percentage to trend method. This method is better than the simple average
method in the sense that it does not assume that the seasonal variation for any month or quarter
is a constant factor of the trend. Calculating seasonal indices by the ratio to trend method
involves the following steps:
i. Obtain the trend values by the method of least squares for an appropriate trend equation
(usually a linear trend).
ii. Divide each original observation by the corresponding (estimated) trend value and multiply
by 100. Each of these values is expressed as a percentage of the trend value. Assuming the
multiplicative model, the other steps are almost the same as in the simple average method.
iii. Find the average of the percentages for each quarter or month as the case may be. This
eliminates the cyclic and irregular variations. These averages are the seasonal indices.
In some situations the data possess extreme values. In that case the median should be found
instead of the average.
iv. If the indices for the months or quarters do not sum up to 1200 or 400 as the case may be,
they can be adjusted by multiplying each index by the factor 1200/(sum of the monthly
indices) or 400/(sum of the quarterly indices) respectively.
Now we fit the trend line Yt = a + bx
Thus, the trend line is
Yt = 12 + 0.95 x
The trend values have been calculated by putting x equal to -2, -1, 0, 1 and 2 respectively and
entered in the last column of the above table. Each trend value lies at the middle of its year. So
the trend value for the 2nd quarter will be (trend value - c/2) and for the 3rd quarter
(trend value + c/2). Similarly, for the 1st and 4th quarters it will be (trend value - 3c/2) and
(trend value + 3c/2), where c is the quarterly increment in trend. Using this adjustment we write
the trend values for each quarter and for all the years as follows:
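The quarterly adjustment can be illustrated with the fitted line Yt = 12 + 0.95x. The sketch below assumes that the yearly figures used for the fit are averages of the quarterly values, so that the quarterly increment is c = b/4; if yearly totals were used instead, c would be different. The function name is illustrative.

```python
def quarterly_trend(a, b, x, c):
    """Trend values for the four quarters of the year coded as x.

    The yearly trend value a + b*x is centred at mid-year, i.e. between
    the 2nd and 3rd quarters, and c is the quarterly increment.
    """
    mid_year = a + b * x
    offsets = (-1.5, -0.5, 0.5, 1.5)   # Q1, Q2, Q3, Q4 relative to mid-year
    return [mid_year + k * c for k in offsets]


a, b = 12.0, 0.95        # trend line from the example
c = b / 4                # ASSUMPTION: yearly data are quarterly averages

for x in (-2, -1, 0, 1, 2):
    print(x, [round(v, 3) for v in quarterly_trend(a, b, x, c)])
```

The four quarterly values of a year average back to the yearly trend value, which is a quick check on the adjustment.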
Similarly, the quarterly trend values for the other years have been calculated and entered in the
above table.
Now we calculate the trend eliminated values, which are obtained by dividing each quarterly value
by the corresponding trend value and multiplying by 100. These are displayed in the table below:
The sum of the seasonal indices = 397.15. Now, to bring the sum to 400, each seasonal index is
multiplied by 400/397.15 and entered in the last row. It can be verified that the sum of the
adjusted seasonal indices is 400.
Note: The same method can be applied to monthly seasonal indices. In this situation, year-wise
monthly data have to be treated in the same way as we did for quarters.
Just as for trend, the moving average method is very popular for finding out the seasonal
indices. The main advantage of this method is that it eliminates periodic changes if the period of
the moving average is equal to the period of the cycles which are to be eliminated. In the moving
average method we take 12-month or 4-quarter moving averages for monthly or quarterly data
respectively. Thus, this eliminates the seasonal variations totally, provided they are constant in
their amplitude and direction. The method involves the following steps.
Step-2: Find the 12-month or 4-quarter average for the first year and enter it at the
mid-position of the 12 months (between June and July) or of the 4 quarters (between the II and
III quarters).
Step-3 : Delete the January value of the first year, add the January value of the next year,
find the average again and enter it between July and August. In the case of quarterly data,
delete the first quarter value and add the next year's first quarter value. Find the average of
this new set of values and enter it between the III and IV quarters. Continue this process until
all the monthly or quarterly data are exhausted.
Step-4 : Since the averages under Step-3 are not entered against any month or quarter, find
the moving average of each pair of successive averages obtained under Step-3 and enter it against
the month or quarter lying between them (July or the III quarter for the first pair).
Step-5 : Calculate the ratio of each monthly (quarterly) value to the corresponding moving
average obtained under Step-4, multiply it by 100, and enter it against the month (quarter) to
which it belongs.
Step-6 : Now prepare a two-way table displaying the years and the monthly (quarterly) ratios
obtained under Step-5.
Step-7: Find the median for each month or quarter. These medians are nothing but the seasonal
indices.
Step-8: If the sum of these indices is not 1200 or 400 in the case of monthly or quarterly data
respectively, it should be adjusted by multiplying each index by 1200/(sum of indices) or
400/(sum of indices) as the case may be.
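The steps above can be sketched for quarterly data as follows. The series is hypothetical and is assumed to start with a first quarter and to cover complete years; the function name ratio_to_moving_average is an illustrative choice.

```python
from statistics import median

# Hypothetical quarterly data for four years, in time order (Q1..Q4 of each year).
series = [68, 62, 61, 63,
          65, 58, 66, 61,
          68, 63, 63, 67,
          70, 59, 56, 62]


def ratio_to_moving_average(values, period=4):
    """Seasonal indices by the ratio to moving average method (multiplicative model)."""
    half = period // 2
    # Steps 2-3: moving averages over `period` consecutive values.
    ma = [sum(values[i:i + period]) / period
          for i in range(len(values) - period + 1)]
    # Step 4: centre them by averaging successive pairs.
    centred = [(ma[i] + ma[i + 1]) / 2 for i in range(len(ma) - 1)]
    # Steps 5-6: percentage ratios, grouped by season.
    # centred[i] lines up with values[i + half].
    ratios = {s: [] for s in range(period)}
    for i, cma in enumerate(centred):
        t = i + half
        ratios[t % period].append(100 * values[t] / cma)
    # Step 7: median ratio of each season.
    raw = [median(ratios[s]) for s in range(period)]
    # Step 8: rescale so that the indices add up to 100 * period.
    factor = 100 * period / sum(raw)
    return [r * factor for r in raw]


indices = ratio_to_moving_average(series)
print([round(i, 2) for i in indices], round(sum(indices), 2))
```

For monthly data the same function can be called with period=12, provided the series starts in January.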
Remark :
The above method is suitable if the multiplicative model is used. In practice, mostly the
multiplicative model is used.
In case of the additive model, instead of the percentage ratio to moving average, we have to
compute deviations from the moving averages.
Example 2-4 :
Given the following quarterly data for four years, calculate the seasonal indices by the method of
moving averages using multiplicative as well as additive model.
Assuming multiplicative model, the ratio to moving averages and then seasonal indices are
calculated and displayed in the following table. The values in the body of the following table
represent the percentage ratios as (Y/corresponding moving average) x 100.
The sum of the seasonal indices = 409.00. The adjusted seasonal indices are entered in the last
row which are obtained by multiplying each index by the fraction (400/409). After adjustment,
the sum of the indices reduces to 400.
If the additive model is assumed, the trend eliminated values are entered in the following table
and the seasonal indices for the quarters are calculated by taking the average of the trend
eliminated values.
Sum of the quarterly indices = 0.375
To obtain the adjusted seasonal indices, subtract the adjustment factor C = 0.375/4 = 0.09375
from each of the quarterly averages. The adjusted values are entered in the last row of the above
table.
Remark :
The same procedure can be adopted for monthly data and monthly seasonals can be obtained.
This method was invented by Karl Pearson. Link relative method for finding out the seasonal
indices has been explained through the following steps.
Step-1: Express each periodic or seasonal value as a percentage of the preceding value of the
time series. The percentages so obtained are the link relatives (L.R.). This eliminates the
influence of the trend. Besides the trend, the cyclic effects are also eliminated to a great
extent.
Step-2 : Calculate the median of the link relatives for each period or season (month, quarter
etc.). This eliminates the irregular effect. These medians are not seasonal indices but only the
medians of the link relatives.
Step-3 : Calculate the chain relative (C.R.) medians. Take the chain relative of the first season
(month, quarter etc.) as 100 and obtain the other seasons' chain relatives from it in succession.
The formula for chain relatives is,
C.R. of a season = (median L.R. of that season x C.R. of the preceding season) / 100
Since the first season and the last C.R. are far apart, some error is induced which needs
correction. Hence, a correction factor is introduced, which is expressed in Step-5.
Step-5 : The adjusted value of the chain relative median for the first season is taken equal to
100. For this, the adjustment factor C is obtained by the formula,
The number of seasons for monthly data is 12. Hence the correction factors for the seasons from
the first to the last are 0xC, 1xC, 2xC, ... respectively. The correction factors are added to the
seasons' chain relative medians.
Step-6 : Find the mean of the adjusted medians. Divide each season's adjusted median by the mean
of the adjusted medians and multiply by 100. The resulting values are the seasonal indices. The
advantage of this operation is that the sum of the final indices reduces to 1200 or 400 etc. as
the case may be.
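The link relative procedure can be sketched as below. Two points are assumptions of this sketch rather than statements from the text: the chain relative of a season is taken as (median L.R. x preceding C.R.)/100, and the adjustment factor is taken as C = (100 - second chain relative of the first season)/(number of seasons), which is the form consistent with adding 0xC, 1xC, 2xC, ... to the chain relatives. The data are hypothetical.

```python
from statistics import median

# Hypothetical quarterly data (rows = years, columns = Q1..Q4).
table = [
    [30, 26, 22, 31],
    [35, 28, 22, 36],
    [31, 29, 28, 32],
    [31, 31, 25, 35],
]


def link_relative_indices(rows):
    """Seasonal indices by the link relative method (see assumptions above)."""
    n = len(rows[0])                       # number of seasons
    flat = [v for row in rows for v in row]
    # Step 1: link relatives = value as a percentage of the preceding value.
    lr = {s: [] for s in range(n)}
    for t in range(1, len(flat)):
        lr[t % n].append(100 * flat[t] / flat[t - 1])
    # Step 2: median link relative of each season.
    med = [median(lr[s]) for s in range(n)]
    # Step 3: chain relatives, starting from 100 for the first season.
    cr = [100.0]
    for s in range(1, n):
        cr.append(med[s] * cr[s - 1] / 100)
    # Second chain relative of the first season, chained from the last season.
    cr_first_again = med[0] * cr[-1] / 100
    # Step 5: spread the discrepancy evenly over the seasons.
    c = (100 - cr_first_again) / n
    adjusted = [cr[s] + s * c for s in range(n)]
    # Step 6: express each adjusted value as a percentage of their mean.
    mean_adj = sum(adjusted) / n
    return [100 * a / mean_adj for a in adjusted]


indices = link_relative_indices(table)
print([round(i, 2) for i in indices], round(sum(indices), 2))
```

If the source defines C with the opposite sign and subtracts it, the adjusted chain relatives and the final indices are the same.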
Compute seasonal indices for the following quarterly data by the method of link relatives.
Firstly we calculate the link relatives and enter them in the following table.
Isolation of cyclic and irregular variations
Once we remove the trend and seasonality from the data, it is left with cyclic and irregular
factors. Removing seasonal effect from the data is known as deseasonalisation.
There are various methods of isolating cyclic and irregular variations, but they are not discussed
as they are not part of the syllabus.
QUESTIONS
1 The prices of manufactured products from the year 1980 to 1986 are as follows:
Prices : (Crore Rs.) 108 112 114 119 114 125 120
Fit in the trend line by (i) graphical method (ii) least square method.
2 The following tables gives the number of workers employed in a small industry during the
years 1979-88. Calculate the four yearly moving average.
Year: 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988
No. of : 430 470 450 460 480 470 470 500 430 480
Workers
3 Explain briefly, how the seasonal element in a time series data is isolated and eliminated.
4 The number (in hundreds) of letters posted in a certain city on each day in a typical period of
five weeks was as follows:
Obtain the indices of seasonal variation, within the weeks, by the simple average method.
Year Jan. Feb. Mar. Apr. May June July Aug. Sep. Oct. Nov. Dec.
1980 18 20 18 17 15 16 17 19 19 23 23 24
1981 20 22 10 18 15 20 24 23 23 24 24 26
1982 22 18 20 18 17 18 24 25 26 24 25 27
1983 24 24 22 20 18 22 25 26 27 26 27 29
By the method of (i) 4 months moving average method (ii) ratio to trend method (iii) Link
relative method for multiplicative model.
6 Calculate seasonal indices by the ratio to moving average method from the following data:
8 The data given below gives the average quarterly prices of a commodity for four years:
- End Of Chapter -
LESSON - 23
Introduction
In nature, no two things are exactly alike; as Bernard Shaw wrote, "Nature never repeats". In a
manufacturing process it is expected that all items will be alike, but experience has dispelled
this belief. If we take measurements on the manufactured items of a factory, it is found
that they too differ. If the difference is minute, it can be neglected; if it is large, it has to
be taken care of. So there was a great problem of deciding whether to accept a lot for marketing
or not. If the product does not meet the specifications and is marketed, it will affect the
reputation of the company. Again, if the lot is rejected even for a slight variation from the
specifications, the company might have to undergo unbearable losses. Here statistical quality
control comes to the rescue of the manufacturer. Under this, two limits are set statistically, the
lower and upper limits for a particular characteristic, and if all items measured lie within these
limits, the process is said to be under control; if not, the process is considered to be out of
control. In that case the fault has to be removed and a particular lot may be rejected.
Objectives
i. Chance causes
ii. Assignable causes
variation and can be amended. Such causes which are detectable and removable, are
known as assignable causes.
Statistical quality control is meant to collect data from the manufacturing process at various
stages or at the final stage and to analyse the data for the purpose of ascertaining whether the
process is under control or not and, if not, what the possible assignable causes are. Statistical
quality control methods are mostly based on control charts.
Control Charts :
Control charts are devices to show the pattern of variation in the units inspected. Control
charts delimit the range or band within which the variability of a product is under tolerance
limits; outside these limits the variability is not tolerable. The most frequently used control
charts are the mean (X̄) chart, the range (R) chart, the standard deviation (σ) chart and the
number of defects (C) chart. Control charts lead one to conclude whether the process is under
control or not and, if not, what the assignable causes for it are. These defects or faults can be
removed and the manufacturing process brought under control.
We know that for a normal population with mean μ and standard deviation σ, 99.73% of the
units lie within the limits μ - 3σ and μ + 3σ. If the population parameters μ and σ are not known,
their estimated values, the sample mean X̄ and the sample standard deviation s, can be used. The
process is considered under control if the observed relevant values lie within these limits. If
the observed values are outside the limits, the process is said to be out of control.
A control chart is essentially a graphic method of presenting data so as to reveal the extent of
variation from the limits at a glance. A control chart contains three lines, namely:
(i) the central line, which shows the standard which one wants to maintain in respect of a certain
characteristic of a product;
(ii) the lower control limit; (iii) the upper control limit. These limits provide a band: values
lying within it show that the process is under control, and any point lying beyond these limits
shows that the process is not under control. A sketch of the control chart is given below in
Fig. (3.1).
Control charts for statistical quality control are classified into two types:
(i) control charts for variables
(ii) control charts for attributes
A control chart for variables is prepared when the variable is continuous. In this situation, the
mean, standard deviation and range are found for the samples drawn at regular intervals of the
production process, and thus X̄, σ and R charts are prepared.
Control charts for attributes are prepared in situations where the sampled units are inspected
to find out whether a unit is defective or non-defective. Often it is also checked how many
defects per unit there are. In these situations d or C charts are usually prepared.
If any of the plotted points lies outside the control limits, the process is considered to be out
of control; the assignable causes should then be traced and the necessary corrective measures
taken to bring the process under control. This saves the manufacturer from heavy losses. Again,
if all the points (dots) lie within the control limits, the process is considered to be under
control and whatever variation is observed is due to chance (random) causes.
Sometimes the points (dots) may lie within the control limits and still show a peculiar
pattern. For example, all the points may lie below the central line or above it, or they may
follow a peculiar path. Such arrangements should not be left unnoticed. They are also danger
signals which may indicate a change in the production process. Hence, the control charts should
not be scrutinized only for dots lying within or outside the control limits but also for any
special pattern. Control charts were first developed by Dr. Walter A. Shewhart of the Bell
Telephone Company in 1924; since then they have been in constant use. Mainly X̄, σ, R and C
charts are used, which are discussed in succession.
Remark :
If the lower control limit comes out to be negative it has to be taken as zero, since the standard
deviation can never be negative.
R-chart
It has been experienced that in the case of small samples, the standard deviation and the range
fluctuate together. Also, in general only small samples are drawn in quality control
inspection. Hence, one can use the range in place of the standard deviation to set up the control
limits, the reason being that the computation of the range is much easier than that of the
standard deviation. Therefore, the use of the range is preferable to the standard deviation even
at a little loss of efficiency. The relations between the range R and the standard deviation σ
from the sampling distribution of the range are given as
Example 3.1 : The following table gives the weight in grams of the packets sampled on 15 days.
The packets were sampled for 25 grams each. Every day a sample of 5 packets was selected at
random and weighed. The weights were as tabulated below:
Whether the process is under control or not is checked by constructing (i) the X̄-chart, (ii) the
R-chart and (iii) the s-chart.
Now we mark the control limits on the graph paper. Also plot the mean values against the
sample numbers. Now check whether any point lies outside the control limits or not.
Since some points lie below the lower control limit and some beyond the upper control limit, we
conclude that the process is not under control.
(ii) The control limits for the R-chart from (3.21.1) are,
The value of R̄ from the above table is 11.87. Also, from the table of constants, for n = 5,
D3 = 0 and D4 = 2.115.
Now we mark the control limits on the graph paper and plot the values of the range against the
sample numbers. The graph is as given below.
Since some of the plotted points lie above the upper control limit, we conclude that the process
is not under control.
(iii) When the population value of the standard deviation is not known, the control limits are
given by (3.15)
= 11.11
Now we mark the control limits on the graph paper and also plot the values of the standard
deviations against the sample numbers as shown in graph (3.4).
After a check of the graph, we find that no point lies beyond the upper control limit. Hence, we
come to the conclusion that the process is under control in respect of the standard deviation as
the quality characteristic.
Example 3-2 : The following data show the values of sample mean X and range R for the
samples of size 5 each. Calculate the value for central line and control limits for mean chart and
range chart and determine whether the process is in control.
Sample No : 1 2 3 4 5 6 7 8 9 10
Mean(X) : 11.2 11.8 10.8 11.6 11.0 9.6 10.4 9.6 10.6 10.0
Range (R) : 7 4 8 5 7 4 8 4 7 9
Solution:
Substituting the values of the grand mean, R̄ and A2 we get, for the mean chart,
LCL = 7.03
UCL = 14.29
and for the range chart,
UCL = 13.32
None of the sample means lies outside the limits 7.03 and 14.29. Hence, the X̄-chart reveals that
the process is under control. Again, all the sample ranges lie within the control limits of 0 to
13.32. Hence, we conclude that the process is under control.
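The limits of Example 3-2 can be recomputed with the sketch below. It assumes the usual forms, grand mean ± A2·R̄ for the mean chart and D3·R̄, D4·R̄ for the range chart. D3 = 0 and D4 = 2.115 for n = 5 are the constants quoted in Example 3.1; A2 = 0.577 is taken from a standard table of control chart constants and is an assumption here.

```python
# Sample means and ranges from Example 3-2 (samples of size 5).
means = [11.2, 11.8, 10.8, 11.6, 11.0, 9.6, 10.4, 9.6, 10.6, 10.0]
ranges = [7, 4, 8, 5, 7, 4, 8, 4, 7, 9]

# Control chart constants for n = 5 (A2 is an assumed standard table value).
A2, D3, D4 = 0.577, 0.0, 2.115

grand_mean = sum(means) / len(means)    # central line of the mean chart
r_bar = sum(ranges) / len(ranges)       # central line of the range chart

xbar_lcl = grand_mean - A2 * r_bar
xbar_ucl = grand_mean + A2 * r_bar
r_lcl = D3 * r_bar
r_ucl = D4 * r_bar

means_in_control = all(xbar_lcl <= m <= xbar_ucl for m in means)
ranges_in_control = all(r_lcl <= r <= r_ucl for r in ranges)

print(round(grand_mean, 2), round(r_bar, 2))
print(round(xbar_lcl, 2), round(xbar_ucl, 2), round(r_lcl, 2), round(r_ucl, 2))
print(means_in_control, ranges_in_control)
```

The computed limits agree with the values quoted above to within rounding of the last digit, and both checks confirm that the process is under control.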
In the beginning it has been given that there are charts for variables and the other for attributes.
The charts for variables have already been discussed. Now we discuss the chart for attributes.
This covers two types of charts:
(i) the charts for number of defectives, so called p-charts
(ii) the chart for number of defects per unit known as c-chart.
p-chart
If there are n units in the sample and d items are defective, then the proportion p of defectives
is d/n. Since we are dealing with two types of items, namely defectives and non-defectives, the
population is dichotomous. Hence, it can be taken to follow the binomial distribution. Using
the mean and variance formulae for a binomial population, we can construct the control limits for
the p-chart. In the construction of control limits, we use the following notation. If the
proportion ‘P’ of the binomial is prefixed, then its standard
The decision criteria remain the same. If the proportion defectives in each sample plotted
against sample numbers lie within the control limits, the process is under control otherwise not.
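A p-chart calculation can be sketched as follows for the case where the standard proportion is not prefixed and is estimated from the samples. The sketch assumes the usual binomial 3σ limits, p̄ ± 3·sqrt(p̄(1 - p̄)/n), with a negative lower limit replaced by zero; the counts of defectives are hypothetical.

```python
from math import sqrt

# Hypothetical numbers of defectives in 10 samples of n = 100 units each.
n = 100
defectives = [4, 6, 3, 5, 7, 2, 5, 4, 6, 8]


def p_chart_limits(counts, sample_size):
    """Central line and 3-sigma limits for the proportion defective."""
    k = len(counts)
    p_bar = sum(counts) / (sample_size * k)          # average proportion defective
    sigma = sqrt(p_bar * (1 - p_bar) / sample_size)  # binomial standard error
    lcl = max(0.0, p_bar - 3 * sigma)                # negative limit taken as zero
    ucl = p_bar + 3 * sigma
    return p_bar, lcl, ucl


p_bar, lcl, ucl = p_chart_limits(defectives, n)
proportions = [d / n for d in defectives]
in_control = all(lcl <= p <= ucl for p in proportions)

print(round(p_bar, 4), round(lcl, 4), round(ucl, 4), in_control)
```

Multiplying the central line and both limits by n gives the corresponding limits for the number of defectives per sample.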
Control limits for the average number of defectives per sample of equal size 'n can also be
constructed. If there are K samples drawn at regular intervals from the lot of manufactured
products. If p is the proportion of defective, then
(i) The 3σ control limits for the proportion of defectives can be constructed in the following
manner.
Now we mark the control limits on the graph paper and plot the proportion of defective values
against the sample numbers.
From the graph it is apparent that no point lies outside the control limits. Hence, the process is
under control.
(ii) Now we establish the 3σ control limits for the number of defectives per sample and test
whether the process is under control or not.
From the given data, the average number of defectives per sample is,
Now we draw the control limits on the graph and plot the number of defectives against sample
numbers as depicted below :
The graph reveals that no plotted point lies outside the control band. Hence, we conclude that
the process is under control.
Remark:
It should be remembered that if the lower control limit comes out to be negative, it has to be
taken equal to zero.
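The construction of the limits for the proportion of defectives and for the number of defectives per sample, together with the above remark on negative limits, can be sketched in Python as follows. The sketch assumes the usual binomial 3σ formulae with the proportion estimated from the samples. The data (8 samples of 50 items each) and the function names are hypothetical and serve only as an illustration.

```python
import math


def p_chart_limits(defectives, n):
    """3-sigma limits (centre, LCL, UCL) for the proportion of defectives."""
    k = len(defectives)
    p_bar = sum(defectives) / (n * k)            # pooled proportion of defectives
    sigma = math.sqrt(p_bar * (1 - p_bar) / n)   # binomial standard error
    lcl = max(0.0, p_bar - 3 * sigma)            # a negative limit is taken as zero
    return p_bar, lcl, p_bar + 3 * sigma


def np_chart_limits(defectives, n):
    """3-sigma limits (centre, LCL, UCL) for the number of defectives per sample."""
    k = len(defectives)
    p_bar = sum(defectives) / (n * k)
    centre = n * p_bar                           # average number of defectives
    sigma = math.sqrt(n * p_bar * (1 - p_bar))
    return centre, max(0.0, centre - 3 * sigma), centre + 3 * sigma


# Hypothetical data: number of defectives in 8 samples of 50 items each.
defectives = [4, 6, 3, 5, 7, 2, 5, 8]
n = 50

p_bar, p_lcl, p_ucl = p_chart_limits(defectives, n)
np_centre, np_lcl, np_ucl = np_chart_limits(defectives, n)
in_control = all(p_lcl <= d / n <= p_ucl for d in defectives)

print("p-chart :", round(p_bar, 4), round(p_lcl, 4), round(p_ucl, 4))
print("np-chart:", round(np_centre, 3), round(np_lcl, 3), round(np_ucl, 3))
print("Process under control:", in_control)
```

With these illustrative figures the computed lower limits are negative and are therefore set to zero, which is the situation the remark describes.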
C-chart
If there is any defect in an item, the item is counted as defective. In the counting of
defectives, it hardly matters whether a unit has one defect or several. But it is also important
to know how many defects there are per item. Hence, we establish the control limits for the
number of defects per item. The charts used for the number of defects are known as C-charts.
Examples are the number of rivets missing in an aeroplane, the number of defective seeds in a
packet etc. Since the occurrence of a defect in an item is a rare event, the number of defects per
item follows the Poisson distribution, and the control limits for the C-chart are based on this
distribution.
If the standard value C of the nonconformities is given, then the control limits for the C-chart are
$$\text{CL} = C, \qquad \text{UCL} = C + 3\sqrt{C}, \qquad \text{LCL} = C - 3\sqrt{C}$$
In case the standard value C of nonconformities is not given, we have to use the estimated value
of C, which is equal to the average number of defects per unit on the basis of all the samples. If
there are K samples and $C_i$ is the number of defects for the i-th sample unit, then the average
number of defects per unit is,
$$\bar{C} = \frac{1}{K}\sum_{i=1}^{K} C_i$$
and the control limits are obtained by replacing C with $\bar{C}$ in the formulae above.
Example 3-4 :
Now we demarcate the control limits on the graph paper and plot the number of defects against
sample (Car) Nos. as displayed below.
From the chart, it is easy to conclude that the process is under control, as no plotted point lies
above the upper control limit.
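The C-chart calculation can be sketched in the same way. The sketch below assumes the Poisson-based 3σ limits, with the average number of defects per unit used when no standard value is supplied. The defect counts and the name c_chart_limits are hypothetical.

```python
import math


def c_chart_limits(defect_counts, c_standard=None):
    """3-sigma limits (centre, LCL, UCL) for the number of defects per unit.

    If no standard value is given, the average of the observed counts is used.
    """
    if c_standard is not None:
        c = c_standard
    else:
        c = sum(defect_counts) / len(defect_counts)
    sigma = math.sqrt(c)                 # Poisson: variance equals the mean
    return c, max(0.0, c - 3 * sigma), c + 3 * sigma


# Hypothetical data: number of defects found on each of 8 inspected units.
defects = [3, 5, 2, 4, 6, 1, 3, 4]

centre, lcl, ucl = c_chart_limits(defects)
out_of_control = [i + 1 for i, c in enumerate(defects) if not lcl <= c <= ucl]

print("CL =", centre, " LCL =", lcl, " UCL =", round(ucl, 3))
print("Units outside the limits:", out_of_control)
```

An empty list of units outside the limits corresponds to the conclusion that the process is under control.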
Control charts are of great use in industries and big companies. Statistical quality control is of
great help to the producer as well as to the buyer.
QUESTIONS
1 Draw the control charts for mean and range for the observations taken on the diameter of lead
pellets. Samples of size five were selected at random on 15 occasions. The observations were as
given below:
Also draw a chart for the above data.
2 The number of defects in match boxes of 12 samples drawn from the bundles of match boxes
were as follows:
5, 7, 8, 2, 0, 12, 4, 5, 3, 1, 6, 4,
Construct the control limits for the number of defects and draw control chart.
3 The number of defective plugs in the 10 boxes of 20 plugs each were as follows.
2, 3, 4, 0, 1, 5, 2, 3, 6, 4, ...
Establish the control limits for number of defectives and draw p-chart.
4 In a glass factory the task of quality control was done with the help of mean ($\bar{X}$) and
standard deviation (σ) charts. 18 samples of 10 items each were chosen and the values of
$\sum \bar{X}$ and $\sum s$ were found to be 595.8 and 8.28 respectively. Determine the 3σ limits
for the standard deviation chart.
You may use the following control factors for your calculations:
5 Given below are the values of sample mean (X) and the range (R) for ten samples of size 5
each.
Draw the mean and range charts and comment on the state of control of the process.
Sample No. 1 2 3 4 5 6 7 8 9 10
X: 43 49 37 44 45 37 51 46 43 47
R: 5 6 5 7 7 4 8 6 4 6
6 In a manufacturing concern producing radio transistors, lots of 250 items are inspected at a
time. Considering the number of defectives in 20 lots shown in the table below, draw suitable
control chart and write a brief report based on the evidence of the chart.
Lot No.           : 1  2  3  4  5  6  7  8  9  10
No. of defectives : 25 47 23 36 24 34 39 32 35 22
Lot No.           : 11 12 13 14 15 16 17 18 19 20
No. of defectives : 45 40 32 35 21 40 15 28 2':' 42
7 A daily sample of 30 items was taken over a period of 14 days in order to establish attributes
control limits. If 21 defectives were found, what should be the upper and lower control limits of
the proportion of defectives.
8 A drilling machine bores holes with a mean diameter of 0.5230 cm and a standard deviation
of 0.0032 cm. Calculate the 2-sigma and 3-sigma upper and lower control limits for means of
samples of size 4 and prepare a control chart.
9 Samples of 100 tubes are drawn randomly from the output of a process that produces several
thousand units daily. Sample items are inspected for quality and defective tubes are rejected.
The results of 15 samples are shown below:
On the basis of the information given above, prepare a control chart for fraction defective. What
conclusion do you draw from the control chart?
REFERENCES
Agarwal, B.L., 'Basic Statistics', Wiley Eastern Ltd., New Delhi, 2nd ed., 1991.
Burr, I.W., 'Engineering Statistics and Quality Control', McGraw-Hill Book Company, New
York, 1960.
Cowden, D.J., 'Statistical Methods in Quality Control', Asia Publishing House, Bombay, 1960.
Duncan, A.J., 'Quality Control and Industrial Statistics', Richard D. Irwin, Homewood, 1953.
- End Of Chapter -
LESSON - 24
BUSINESS FORECASTING
Introduction
Every businessman wants to be successful in his business. Success in business can be
interpreted as earning the maximum gains and guarding against all likely losses. This is
possible only if he plans the course of his business well. For this, most of the middle class
businessmen proceed on the basis of their experience and judgement. But this is not enough. The
big business houses cannot depend merely upon guesswork but have to analyse the past data and
use it for predictions. Forecasts also forewarn against the eventualities which are likely to
occur, so that a businessman can prepare himself. For instance, if a slump period is likely to
come, a businessman may arrange the finances, godowns etc. and wait for the boom period. Here,
we quote the views of some writers on forecasting.
Forecasting is using the knowledge we have at one time to estimate what will happen at some
future moment in time.
Business forecasting refers to the statistical analysis of the past and current movements in a
given time series, so as to obtain clues about the future pattern of the movements.
Leo Barnes:
H.J. Wheldon:
Business forecasting is not so much the estimation of certain figures of sales, production, profits
etc. as the analysis of known data, internal and external, in a manner which will enable policy to
be determined to meet probable future conditions to the best advantage.
Forecasting aims to predict the future course of an activity on the basis of an analysis of the
available statistical data and of the circumstances in which the activity took place. It seeks to
minimise the risk of error in predictions through probability measures.
Forecasts warn a businessman about future mishaps and also foretell the events which are
helpful in the expansion of his business. Business activity is not a uniform phenomenon. There
are always some periods of boom and of depression. So, through business forecasting one
predicts the periods in which a boom is expected and those in which a depression is expected,
as we do in time series analysis also. But in business forecasting one is not confined to
statistical analysis only but also takes into consideration the qualitative and circumstantial
factors. Now we give some of the forecasting methods, barring those which have already been
discussed like extrapolation, regression analysis, time series analysis etc. The forecasting
techniques are basically classified under three categories:
i. Naive methods
ii. Barometric methods
iii. Analytical methods
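Each of these methods covers a number of techniques and theories, which are listed below and
discussed one by one.
1) Naive method: economic rhythm theory
2) Barometric methods: specific historical analogy, lead-lag relationship, diffusion index,
action-reaction theory
3) Analytical methods: factor listing method, cross-cut analysis theory, opinion polling,
exponential smoothing, econometric method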
Economic rhythm theory:
Under this theory a businessman analyses the time series data of his own firm for trend,
seasonal variation and cycles and makes the forecast himself. He usually forecasts his sales,
production, dividend etc. The forecast is based on the projection obtained by the analysis of
the time series data. Such an analysis helps one to decide whether an increase or decrease in
demand is cyclical or is a continuing process. The businessman has to adjust his stocks
accordingly. If there is a continued increase in demand, a manufacturer can think of expanding
his plant or installing a new plant etc. In case of decreasing demand, he may reduce his
production or make arrangements to hold the stocks.
We know that time series analysis cannot exactly forecast the turning points and the amplitude
of the cycles. Hence, a prominent approach is to predict the cycle of one series with the help of
another series that leads it.
The forecasts made by a company under this method are valid only for the company itself and are
not applicable to any other business house. Moreover, the forecasts under economic rhythm
theory are not very reliable as they do not take into consideration other qualitative factors like
government policies, the liking of people, etc. Standard and Poor's Trade and Securities Service,
New York, has some faith in this theory.
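As a minimal illustration of projecting a firm's own series, the following Python sketch fits a straight-line trend by least squares and extends it one period ahead. It deals with the trend component only; seasonal and cyclical movements are deliberately left out. The sales figures and the name linear_trend are hypothetical.

```python
def linear_trend(values):
    """Least-squares line y = a + b*t fitted to values observed at t = 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    b = sxy / sxx                # average change per period
    a = y_mean - b * t_mean      # trend value at t = 0
    return a, b


# Hypothetical annual sales of a firm for six consecutive years.
sales = [120, 132, 141, 155, 163, 178]

a, b = linear_trend(sales)
next_year = a + b * len(sales)   # projection for the seventh year

print("Trend: y =", round(a, 2), "+", round(b, 2), "* t")
print("Projected sales for next year:", round(next_year, 2))
```

A positive slope that persists over several years would point to a continued increase in demand rather than a cyclical swing, which is the distinction the businessman has to make before adjusting stocks or plant capacity.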
Specific historical analogy:
This technique is based on the principle that history repeats itself. It means that a situation
which occurred in the past in economic activity will be repeated in time to come. Hence, it is
expected that a present series will have certain similarities to a past series, and the
conclusions drawn from the past series are taken to be directly applicable to the present series
to express the state of the economy. The method of historical analogy can be applied to a series
in general.
As a matter of fact, it is rare to find a situation in the past which is exactly similar to the
present one. Hence, the past series which resembles the present one most closely is searched
out. Inferences are drawn to make a forecast, giving allowance for the dissimilarities between
the two situations. This method of forecasting is not very accurate.
Lead-lag relationship:
This method is also known as the sequence method. It is based on the principle that economic
activities take place in succession, or in other words that changes occur with a time lag.
For instance, in a state of inflation the exchange rate is adversely affected. The fall in the
exchange rate raises wholesale prices, which consequently brings a rise in retail prices.
When retail prices increase, people have to be paid higher wages and salaries. So a change in
one economic activity brings changes in many other activities one after the other. Under this
theory, an effort is made to determine the time gap between a series and the general business
cycle. So a turning point observed in one series leads to a forecast of what is going to happen
to the general business activity and connected events.
The lead-lag technique of forecasting takes into account some hypothetical or observed
relationship among variables. These relationships are usually established by the inspection of
graphs of the various series and by correlation studies between the series. The famous Harvard
index of general business conditions comprised three curves: (a) speculation, the leading
series, (b) business, the coincident series and (c) money, the lagging series. The movements of
the three curves are related to one another. The ups and downs of the speculation curve were
used to forecast the movement of the business curve. Also, the speculation and money curves
move in opposite directions. A rise in one leads to a fall in the other. An upturn in the money
curve indicates a recession in business within a few months.
So, in the lead-lag approach three types of indicators are involved, namely lead indicators like
exchange rate and money reserves, coincident indicators like employment, industrial production
index, total foreign traffic etc. and lag indicators like sales, business loans etc.
The main difficulty with the lead-lag approach is that it is not easy to interpret the movements
of the indicators and to establish their relationship. Moreover, the choice of indicators is
itself a difficult task. So this method alone is not capable of making a reliable forecast but can
supplement other methods of forecasting. The National Bureau of Economic Research,
Massachusetts, follows the lead-lag relationship theory.
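The correlation studies mentioned above can be illustrated with a small Python sketch that correlates a candidate leading series with later values of a business series and picks the lag giving the highest correlation. This is only one simple way of looking for a lead. Both series are hypothetical, and the business series has been constructed so that it follows the indicator with a lag of two periods.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_lead(leading, target, max_lag):
    """Return the lag (in periods) at which `leading` correlates best with
    later values of `target`, together with the correlation at every lag."""
    scores = {}
    for lag in range(max_lag + 1):
        x = leading[:len(leading) - lag]   # indicator at time t
        y = target[lag:]                   # business series at time t + lag
        scores[lag] = correlation(x, y)
    return max(scores, key=scores.get), scores


# Hypothetical monthly indicator and a business series that follows it
# two months later (the first two business values are arbitrary).
indicator = [10, 12, 15, 13, 11, 14, 18, 16, 12, 13, 17, 15]
business = [30, 28] + [2 * v + 5 for v in indicator[:-2]]

lag, scores = best_lead(indicator, business, max_lag=4)
print("Estimated lead of the indicator:", lag, "months")
for k in sorted(scores):
    print("lag", k, "correlation", round(scores[k], 3))
```

In practice the series are far noisier than in this constructed example, so the lag with the highest correlation is a suggestion to be checked against the graphs rather than a definite result.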
Diffusion index:
This method is based on the proposition that all the factors affecting a business do not reach
their peaks or troughs simultaneously. The method does not require identifying which series
leads and which lags. A broad group of series is studied without bothering about any individual
series. The diffusion index shows the percentage of the series which are expanding at regular
intervals (usually monthly). So it gives an idea about the general movement of business activity.
All series do not expand or contract simultaneously. One series may be expanding at a point of
time while another may be contracting at that time. In such a situation, if more than 50 per cent
of the series are expanding, the business is taken to be in a state of boom, otherwise in a state
of contraction. The activity of the series is expressed in terms of percentages on a monthly basis.
The National Bureau of Economic Research used 400 time series at a time to determine the
diffusion index. Diffusion indices can be constructed by taking any group of variables of business
activity like prices, profits, production, working hours etc. or by considering a group of industries.
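The calculation of a diffusion index can be sketched as follows. The sketch takes a series to be expanding in a month if its value is higher than in the preceding month, which is an assumption made for simplicity. The four series and their values are hypothetical.

```python
def diffusion_index(series_by_name):
    """Percentage of series that rose over the previous month,
    for each month after the first."""
    all_series = list(series_by_name.values())
    months = len(all_series[0])
    index = []
    for m in range(1, months):
        expanding = sum(1 for s in all_series if s[m] > s[m - 1])
        index.append(100.0 * expanding / len(all_series))
    return index


# Hypothetical monthly values of four business series.
series = {
    "prices":        [100, 102, 105, 104],
    "profits":       [50, 53, 52, 51],
    "production":    [200, 198, 205, 203],
    "working hours": [40, 41, 42, 43],
}

index = diffusion_index(series)
states = ["expanding" if v > 50 else "contracting" for v in index]

for month, (value, state) in enumerate(zip(index, states), start=2):
    print("Month", month, ": diffusion index =", value, "->", state)
```

The 50 per cent rule from the text is applied in the last step to label each month.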
Action-reaction theory:
This theory is based on Newton's third law of motion, that to every action there is an equal and
opposite reaction. Reliance has been placed on the action-reaction theory with regard to
business activity as well. Normal conditions prevail for longer periods. Any upsurge is bound to
be followed by a recession, and a recession by an upsurge, of almost the same amplitude.
The method faces the difficulty of deciding the line of normal business activity. Another problem
is to determine the phase through which a company is actually passing at the time of the forecast.
Even then, some forecasting agencies follow this theory for business forecasts. Business
Statistics Organisation, formerly known as Babson's Statistical Organisation, follows the
action-reaction theory.
Factor listing method:
This is a non-mathematical approach to forecasting. In this method, all the factors which are
considered to affect a business activity are analysed. Each factor is analysed individually to
ascertain whether it is favourable to the business activity or not, and then a cumulative picture
is drawn by a mental process. The forecast is made on the basis of this cumulative picture.
This method is completely flexible with regard to the number of factors, their relative
importance etc. The method is very easy and handy. But its main drawback is that it is totally
subjective. So under this method different persons may give different forecasts. Hence, this
method is not very trustworthy.
Cross-cut analysis theory:
This theory is just the opposite of the historical analogy approach. Under this theory it is
believed that past cycles cannot be thrust upon future cycles. Hence, a past series cannot be a
guide for forecasting. Under this approach, present policies, demand, technological changes,
availability of inputs, styles, fads etc. are considered to see their joint impact on economic
activity. Also the views of economists, executives, consumers etc. are taken to reach a clear
understanding for forecasting. These forecasts are usually short term forecasts.
This method suffers from the lacuna that it is very difficult to see the impact of the factors and
then to get a cumulative picture. Also people do not see.
Opinion polling
This method of forecasting is based entirely on the opinions of the personnel involved in
business, like sales officers, production managers and executives, and also on the opinions
expressed in magazines. The opinions are analysed and the forecasts are made. This approach is
usually very suitable for short term forecasting.
Exponential smoothing:
This is a totally mathematical approach to forecasting. The trend by the moving average method
has been discussed under time series analysis, where equal weights are assigned to all items.
But under this method, the weights assigned are in geometric progression. More weightage is
given to recent observations and less to distant observations.
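Let $X_t$ denote the observed value and $\hat{X}_t$ the smoothed value at time t. With the weights
$w, w(1-w), w(1-w)^2, \ldots$ assigned to the current and the successively earlier observations,
$$\hat{X}_{t+1} = wX_{t+1} + w(1-w)X_t + w(1-w)^2X_{t-1} + \cdots \qquad (4.1)$$
$$\hat{X}_t = wX_t + w(1-w)X_{t-1} + w(1-w)^2X_{t-2} + \cdots \qquad (4.2)$$
Taking n large, neglecting higher powers of (1-w) and doing algebraic manipulation, the relation
between $\hat{X}_{t+1}$ and $\hat{X}_t$ can be established with the help of (4.1) and (4.2) as
$$\hat{X}_{t+1} = wX_{t+1} + (1-w)\hat{X}_t \qquad (4.3)$$
In (4.3), $\hat{X}_{t+1}$ is known as the smoothed value at time (t+1). Now, to make a forecast,
the smoothed values are used to find the change in each period. The constant w is called the
smoothing coefficient and (1-w)/w the trend factor. Since in the above process only one constant
w is used, it is known as single parameter exponential smoothing. The forecast for the first
period is taken from some old forecast or is assumed by the forecaster.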
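That the step-by-step smoothing relation is equivalent to assigning weights in geometric progression can be checked numerically. The sketch below assumes the recursive form in which the new smoothed value equals w times the current observation plus (1-w) times the previous smoothed value, and compares it with the explicit weighted sum. The observations, the starting value and the function names are hypothetical.

```python
def smooth_recursive(values, w, initial):
    """Apply S_t = w*X_t + (1 - w)*S_(t-1), starting from the assumed value `initial`."""
    s = initial
    for x in values:
        s = w * x + (1 - w) * s
    return s


def smooth_weighted(values, w, initial):
    """Same quantity written as a weighted sum with weights
    w, w(1-w), w(1-w)^2, ... on the latest, previous, ... observations."""
    n = len(values)
    total = sum(w * (1 - w) ** k * x for k, x in enumerate(reversed(values)))
    return total + (1 - w) ** n * initial   # leftover weight on the starting value


observations = [20.0, 22.0, 21.0, 25.0, 24.0]   # hypothetical data
w = 0.3
start = 20.0                                     # assumed first smoothed value

recursive = smooth_recursive(observations, w, start)
weighted = smooth_weighted(observations, w, start)
weights = [w * (1 - w) ** k for k in range(len(observations))]

print("Recursive form :", round(recursive, 4))
print("Weighted sum   :", round(weighted, 4))
print("Weights (latest first):", [round(v, 4) for v in weights])
```

The weights fall off by the factor (1-w) at each step back in time, so the most recent observation always carries the largest weight, and the weight left on the starting value shrinks as more observations are added.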
A situation is also faced where the trend and the forecast move in opposite directions. To
minimise this effect, the calculations are made under trend adjusted exponential smoothing.
Here the forecasts are adjusted according to the trend. Writing $X_t$ for the actual value and
$\hat{X}_t$ for the smoothed value at time t,
$$\hat{X}_{t+1} = w(X_{t+1} - \hat{X}_t) + \hat{X}_t$$
or, for period t,
$$\hat{X}_t = w(X_t - \hat{X}_{t-1}) + \hat{X}_{t-1} \qquad (4.4)$$
The quantity $(X_t - \hat{X}_{t-1})$ is called the error. Thus from (4.4), the forecast at time t is
the sum of the preceding forecast and w times the error. The trend coefficient required for
making the forecast is obtained by the formula,
$$\theta_t = w \times (\text{change in smoothed value}) + (1 - w) \times (\text{preceding trend coefficient})$$
The value of w is chosen arbitrarily. There is no rule which governs the choice of w. However,
some guidance may be provided. If the fluctuations in economic factors like sales, production,
demand, profits etc. are of a random nature, a small value of w is preferred. Also, if the actual
values turn down and the forecasts do not turn down, a large value of w is usually chosen, and
vice-versa.
Example 4-1 :
The monthly off-take of edible oils by the public distribution system during the year 1989-90
was as follows:
Months                    : Apr.  May   Jun.  Jul.  Aug.  Sep.  Oct.  Nov.  Dec.
Edible oils ('000 tonnes) : 24.2  24.4  23.0  28.8  34.9  37.7  38.3  42.9  26.0
The smoothed values and trend adjusted forecasts for the above time series can be calculated in
the following manner. Let us take w = 0.4. The values are calculated step by step and entered in
the table given below:
Also suppose that the first smoothed value $\hat{X}_1$ = 25.0. The smoothed forecasts from May to
Mar by the formula (4.4) are,
Change in smoothed values $\hat{X}_{t+1} - \hat{X}_t$ is calculated by subtracting the smoothed
value of the preceding period from that of the next period.
For July, $\theta_4 = 0.4 \times 1.898 + (1 - 0.4)(-0.339) = 0.556$
Note:
Exponential smoothing can be applied with more than one trend coefficient also. But the details
are kept out of this chapter.
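The step-by-step calculations of Example 4-1 can be reproduced with the following Python sketch, using the nine monthly values shown in the example, w = 0.4 and the first smoothed value 25.0. Two points are assumptions and not taken from the text: the trend coefficient for the first month is set to zero, and the trend adjusted forecast is computed as the smoothed value plus the trend factor (1-w)/w times the trend coefficient. The function name is illustrative.

```python
def trend_adjusted_smoothing(values, w, first_smoothed, first_trend=0.0):
    """Return smoothed values, trend coefficients and trend adjusted forecasts.

    Smoothed value : S_t = S_(t-1) + w * (X_t - S_(t-1))
    Trend coeff.   : T_t = w * (S_t - S_(t-1)) + (1 - w) * T_(t-1)
    Forecast       : S_t + ((1 - w) / w) * T_t   (assumed form, see lead-in)
    """
    smoothed = [first_smoothed]
    trend = [first_trend]
    for x in values[1:]:
        new_s = smoothed[-1] + w * (x - smoothed[-1])
        change = new_s - smoothed[-1]
        trend.append(w * change + (1 - w) * trend[-1])
        smoothed.append(new_s)
    adjusted = [s + (1 - w) / w * t for s, t in zip(smoothed, trend)]
    return smoothed, trend, adjusted


months = ["Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
offtake = [24.2, 24.4, 23.0, 28.8, 34.9, 37.7, 38.3, 42.9, 26.0]

smoothed, trend, adjusted = trend_adjusted_smoothing(offtake, w=0.4, first_smoothed=25.0)

for m, x, s, t, f in zip(months, offtake, smoothed, trend, adjusted):
    print(m, x, round(s, 3), round(t, 3), round(f, 3))
```

For July the sketch gives a change in smoothed value of about 1.898 and a trend coefficient of about 0.556, in agreement with the figures that survive in the worked example.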
Econometric method :
The manner in which an economic system behaves depends on a number of variables that
influence it. The inter-relationship between the variables is expressed by a set of equations.
These variables are of two types, namely (i) endogenous variables and (ii) exogenous
variables. Endogenous variables are those which belong to the economic system itself, like
production, sales, prices, employment, wages etc. Exogenous variables are those which
affect the economic system but do not belong to it, like politics, fads, styles etc. But econometric
methods deal only with the quantitative variables which influence an economic phenomenon.
On the basis of econometric analysis one can forecast economic changes with a certain
probability level.
Econometric analysis is fundamentally concerned with model building. Building a model means
estimating the parameters involved in the model from time series data, much as we do in
regression and time series analysis. There are hundreds of econometric models applicable in
various areas of economic activity. Here, we discuss only one model, just to explain the idea
lying behind econometric methods.
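Now we consider a model for Gross National Product (GNP) at a time period t.
$$Y_t = C_t + I_t + G_t \qquad (4.8)$$
Where,
$Y_t$ = GNP at time t
$C_t$ = consumption at time t
$I_t$ = gross investment at time t
$G_t$ = government expenditure at time t
In model (4.8), each factor on the right hand side is a function of other variables. We know that
the consumption $C_t$ is a function of GNP. The increment in consumption per unit increment in
GNP, irrespective of the initial value of GNP, is known as the marginal propensity to consume and
is denoted by $\beta$. Also there will be some consumption even if the GNP is zero. Let it be
denoted by $\alpha$. Taking the consumption in period t to depend on the GNP of the preceding
period, the model for $C_t$ is,
$$C_t = \alpha + \beta Y_{t-1} \qquad (4.9)$$
The gross investment is equal to the total of the induced investment made by the private sector
and the autonomous investment. The induced investment in any period t is proportional to the
difference between the consumption in period t and that in the preceding period. Let the
constant of proportionality be denoted by i. Also, the amount invested for replacement and
adoption of new technology comes under the category of autonomous investment. Let this amount
be denoted by k. Hence, the model for $I_t$ is,
$$I_t = k + i(C_t - C_{t-1}) \qquad (4.10)$$
Government expenditure is taken to be constant. Therefore,
$$G_t = G_0 \qquad (4.11)$$
Now the model (4.8), after substituting the values of $C_t$, $I_t$ and $G_t$ from (4.9), (4.10) and
(4.11), becomes,
$$Y_t = \alpha + \beta Y_{t-1} + k + i(C_t - C_{t-1}) + G_0 \qquad (4.12)$$
Again, from (4.9),
$$C_t - C_{t-1} = \beta(Y_{t-1} - Y_{t-2}) \qquad (4.13)$$
so that
$$Y_t = (\alpha + k + G_0) + \beta(1 + i)Y_{t-1} - i\beta Y_{t-2} \qquad (4.14)$$
Substituting the values of $Y_{t-1}$, $Y_{t-2}$ and $G_0$ for any time period t in the equation
(4.14), we obtain the value of $Y_t$ for the forecast period t.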
Econometric methods are good and reliable but not exact, owing to sampling errors. Moreover, a
forecast will be good only when the econometric model is suitable for the data. Also, the
forecasts may not coincide with the values actually realised in times to come. Even so, they
provide good guidelines for planning.
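The way such a model produces forecasts can be illustrated with a small Python sketch. It assumes that consumption depends on the GNP of the preceding period, that induced investment is proportional to the change in consumption, and that government expenditure is constant. The symbols alpha and beta, all the parameter values and the two starting GNP figures are hypothetical; in practice the parameters would be estimated from time series data.

```python
def forecast_gnp(y_prev, y_prev2, alpha, beta, k, i, g0, periods):
    """Iterate Y_t = C_t + I_t + G_t for the given number of periods, with
         C_t = alpha + beta * Y_(t-1)       (consumption, assumed lagged form)
         I_t = k + i * (C_t - C_(t-1))      (autonomous plus induced investment)
         G_t = g0                           (constant government expenditure)
    """
    path = []
    for _ in range(periods):
        c_t = alpha + beta * y_prev
        c_prev = alpha + beta * y_prev2
        i_t = k + i * (c_t - c_prev)
        y_t = c_t + i_t + g0
        path.append(y_t)
        y_prev2, y_prev = y_prev, y_t      # move one period forward
    return path


# Hypothetical parameter values and the two most recent GNP figures.
path = forecast_gnp(y_prev=1000.0, y_prev2=980.0,
                    alpha=50.0, beta=0.6, k=40.0, i=0.5, g0=300.0, periods=3)

for step, value in enumerate(path, start=1):
    print("Forecast for period t +", step - 1, ":", round(value, 2))
```

Each forecast needs only the GNP of the two preceding periods, so the forecasts can be rolled forward period by period, with the errors of earlier forecasts carried into later ones.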
Forecasting in India has not been taken up professionally and is still in its infancy. Business
houses are realising its importance and establishing forecasting cells in their own companies.
QUESTIONS
8. Differentiate between long term and short term forecasting. Name the methods which are
suitable for long term and short term forecasting.
9. Examine critically the time lag and the action-reaction theory of business forecasting. Which
of these, in your opinion is better and why?
10. Per capita consumption of tea (in gms) from 1976-77 to 1986-87 is given in the table below:
Years     : 1976-77 1977-78 1978-79 1979-80 1980-81 1981-82
Tea (gms) : 455     469     479     498     518     487
Years     : 1982-83 1983-84 1984-85 1985-86 1986-87
Tea (gms) : 464     367     399     422     420
Find the forecast values by the method of exponential smoothing taking the initial forecast as
450 and w = 0.5.
Find the forecast values by exponential smoothing method taking w = 0.6 and initial forecast as
525.
REFERENCES
Agarwal, B.L., 'Basic Statistics', Wiley Eastern Ltd., New Delhi, 2nd ed., 1991.
Chou, Y.L., 'Applied Business and Economic Statistics', Holt, Rinehart and Winston, New York,
1963.
Firth, M., 'Forecasting Methods in Business and Management', Edward Arnold, London, 1977.
- End Of Chapter -