Statistics First Year 2024 . Alhamd
Written by:-
(Lecturer)
CHISHTIAN SCIENCE COLLEGES, CHISHTIAN
Chapter No.1
Statistics:
Statistics is a science of systematic collection, presentation, analysis and
interpretation of the numerical data.
Two branches of Statistics
i- Descriptive statistics
ii- Inferential Statistics
Descriptive Statistics:
A branch of statistics which deals with the collection, presentation and analysis of
data is called descriptive statistics.
Inferential Statistics:
A branch of statistics which deals with drawing inferences about a population on
the basis of sample data is called inferential statistics.
Meaning of Statistics:
The word statistics is used in the following three senses:
i- Singular sense
ii- Plural Sense
iii- Plural of the word Statistic
Singular sense
In the singular sense, statistics means the methods used for the systematic collection,
presentation, analysis and interpretation of numerical data.
Plural Sense
In Plural sense statistics means data systematically collected and presented in the
form of tables or charts etc.
Plural of the word Statistic
The word “statistics” is the plural of “statistic”, which means a numerical
quantity calculated from sample data.
Main Departments where Statistics is Applied
i- Banks ii- Insurance Companies
iii- Medical iv- Research Institutes
v- Agriculture vi- Bureau of Statistics
Uses (Importance) of Statistics
i- Statistics are the eyes of administration.
ii- Statistics is necessary for social studies.
iii- Statistics are an aid to supervision.
iv- Planning without Statistics is impossible.
v- It is used for forecasting.
vi- It is used to present data in simple form.
vii- It is used to determine uncertainties of events.
Characteristics of Statistics
i- Statistics are aggregates of facts.
ii- Statistics are numerically expressed.
iii- It deals with aggregate of values.
iv- Its results are true on the average.
v- It simplifies the complex mass of data.
vi- It deals with variations.
Limitations of Statistics
i- Statistics deals only with aggregates of facts.
ii- Statistical results are valid only on the average.
iii- Statistics is liable to be misused.
iv- Different methods give different results.
v- It does not deal with a single value.
Functions of Statistics
i- Statistics presents facts in a numerical form.
ii- Statistics helps in formulating the policies.
iii- Statistics helps in testing the laws of other sciences.
iv- Statistics studies relationship among different facts.
v- Statistics facilitates comparison of data.
Population
The total group under study, from which we wish to get information, is
called the population or universe. For example, all the students in a college or all the
schools in a city. A population may be finite or infinite.
Sample
A small part selected from the population is called a sample. It is used to
estimate the values of the population parameters. A question paper is an example of a
sample.
Parameter
Any value computed from a population is called a parameter. Parameters are
denoted by Greek letters, e.g. $\mu$ (mean) and $\sigma^2$ (variance). A parameter is a constant.
Statistic
Any value computed from a sample is called a statistic. Statistics are denoted by
Roman letters, e.g. $\bar{X}$ (mean) and $S^2$ (variance). A statistic is a variable.
Ratio:
The ratio is a fraction between two values. If A and B are two values, then their ratio
is the fraction A/B. For example, if there are 200 boys and 300 girls, the boys-girls ratio
is 200/300.
Proportion:
The ratio of a part to its total is called a proportion. For example, if there are
500 students in a college, 300 boys and 200 girls, then the proportions of boys and girls
are 300/500 and 200/500 respectively.
Order Statistic:
The order statistics (OS) of the data $X_1, X_2, X_3, \dots, X_n$ are the data arranged
in order of magnitude. They are denoted by $X_{(1)}, X_{(2)}, X_{(3)}, \dots, X_{(n)}$.
Model:
It is a mathematical statement used in studying the results of an experiment.
The model in the simplest form is $Y_i = \mu + \varepsilon_i$, where $\varepsilon_i$ is the random error.
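Sigma ($\Sigma$):
The Greek letter $\Sigma$ (sigma) is used as shorthand notation for the summation of a
sequence of observations. The sum of the values $X_1, X_2, X_3, \dots, X_n$ is written
$\sum_{i=1}^{n} X_i = X_1 + X_2 + X_3 + \dots + X_n$.
Product ($\Pi$):
The Greek letter $\Pi$ (capital pi) is used as shorthand notation for the product of a
sequence of observations. The product of the values $X_1, X_2, X_3, \dots, X_n$ is written
$\prod_{i=1}^{n} X_i = X_1 \times X_2 \times X_3 \times \dots \times X_n$.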
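The summation and product notations can be tried out on a small data set. The sketch below is only an illustration: the values in `x` are made up, and `math.prod` assumes Python 3.8 or later.

```python
import math

# Hypothetical observations X1, X2, ..., Xn (not taken from the notes)
x = [2, 3, 5, 4]

total = sum(x)          # sigma notation: X1 + X2 + ... + Xn
product = math.prod(x)  # pi notation:    X1 * X2 * ... * Xn

print("Sum of the values:", total)
print("Product of the values:", product)
```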
Continuous Data:
A set of values that belong to a continuous variable is called continuous data,
e.g. the weights of different students in kg: 50.5, 60.7, 55.3, 40.8 and 70.5.
Grouped Data:
When ungrouped data are arranged into classes or groups with
their corresponding frequencies, they are called grouped data. This arrangement is also called a frequency
distribution.
Variable:
Any quantity or quality which varies from person to person or object to object is
called variable. For example height, weight, age, speed, beauty, liking or disliking etc.
Discrete Variable:
A variable which takes a countable number of values is called a discrete variable. It
is also called a counting variable. For example, number of boys in a family, number of
girls in a class, number of colleges in a city.
Continuous Variable:
A variable which takes measurable values is called a continuous variable. It is
also called a measurable variable. For example, age, height, weight, speed etc.
Quantitative Variable:
A variable which varies only in quantity from individual to individual or object to
object is called a quantitative variable.
For example, age, height, weight, speed, price, imports and exports, production etc.
Constant:
A quantity which can assume only one value is called a constant,
for example the number of days in a week. A constant is usually denoted by small
letters a, b, c etc.
Written by:-
ABDUL WAHEED
Lecturer:
Chishtian Science Colleges, Chishtian
Cell #0300-6982680, 0308-4991880
Chapter No.2
Presentation of Data:
The raw data arranged and reduced into a form which is easy to understand,
analyze and interpret is known as presentation of data.
Methods of Presentation of data are
i- Classification ii- Tabulation
iii- Graphs iv- Diagrams or Charts
Classification:
The process of arranging the data in groups or classes according to their
similarities is called classification. For example sorting of letters in a post office city
wise, town wise or mohallah wise etc.
Types Of Classification.
i- One Way Classification. ii- Two Way Classification.
iii- Three Way Classification. iv- Many Way Classification.
Tabulation:
The process of making tables or arrangement of data into rows and columns is
called tabulation.
Types of Tabulation.
i- One Way Tabulation.(Simple Tabulation )
ii- Two Way Tabulation. (Double Tabulation )
iii- Three Way Tabulation.(Complex Tabulation )
The general sketch of a table, indicating its necessary parts, has the title at the top
followed by the prefatory note.
Parts of a Table:
i- Title. A title is a heading (in capital letters) at the top of the table describing its
contents.
ii- Prefatory Note. The prefatory note is given after the title of the table and is used
to explain the contents of the table.
iii- Stub. The headings of the different rows, i.e. row captions.
iv- Box-head. The headings of the different columns, i.e. column captions.
v- Body of table. The main part of the table, where the classified data are written.
vi- Foot Note. The footnote is given at the end of the table. It gives additional
information about the given data and is usually indicated by a reference symbol.
vii- Source Note. The source note is given at the end of the table. It tells us about
the source of the given data.
Graph:
A drawing representing the relationship between data sets is called the graph.
Chart:
A chart plots data and shows the results of a process over a period of time.
Diagram:
A diagram is a simplified and structured visual representation of concepts,
ideas, relations, statistical data, etc.
Array:
Arrangement of data in ascending or descending order is called an array.
For example, array the following data in ascending and descending order:
4, 8, 1, 9, 3, 7, 10, 2, 5, 6.
Array (Ascending order): 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
Array (Descending order): 10, 9, 8, 7, 6, 5, 4, 3, 2, 1.
Frequency:
The number of values falling in a class is called a frequency or class-frequency.
It is denoted by “f”.
Frequency distribution:
Tabular arrangement of data into different classes with their frequencies is
called a frequency distribution. It is also called grouped data.
Class limits:
The values of the classes or groups are called the class limits. The smaller value
is the lower class limit and the larger value is the upper class limit.
Class Boundaries:
The true values which describe the actual or true class limits of a class are
called class boundaries. The smaller true value is called the lower class boundary and
the larger one the upper class boundary of the group or class. The upper class
boundary of a class coincides with the lower class boundary of the next class.
Class Mark or Mid Point:
The class mark or mid point is the mean of the lower and upper class limits or
class boundaries. It is usually denoted by X or Y, i.e. $X = \frac{L + U}{2}$,
where L = lower class limit and U = upper class limit.
Size of Class Interval (Class width):
The difference between the upper and lower class boundaries (not between the class
limits) of a class or group, or the difference between two successive mid points, is called
the class interval. It is denoted by ‘h’ or ‘i’.
Relative Frequency:
If the frequency of a class is divided by the sum of frequencies (total frequency),
the result is called the relative frequency or proportion or probability of that class. The sum
of all relative frequencies is one. $\text{R.f} = \frac{f}{\sum f}$
Cumulative Frequency:
The total frequency of all classes less than the upper class boundary of a given
class is called the cumulative frequency of that class.
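Relative and cumulative frequencies can be computed directly from a list of class frequencies. The sketch below is only an illustration with made-up frequencies for five classes; the variable names are not from the notes.

```python
from itertools import accumulate

# Hypothetical class frequencies f for five consecutive classes
frequencies = [2, 5, 8, 4, 1]
total_frequency = sum(frequencies)

# Relative frequency of each class: f divided by the total frequency
relative = [f / total_frequency for f in frequencies]

# Cumulative frequency: running total of the frequencies up to each class
cumulative = list(accumulate(frequencies))

for f, rf, cf in zip(frequencies, relative, cumulative):
    print(f, round(rf, 2), cf)
```

The last cumulative frequency equals the total frequency, and the relative frequencies add up to one.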
Multiple Bar Diagram:
A multiple bar chart is used to represent two or more sets of data having
common characteristics in the variable values.
Pie Chart:
A pie chart is the sub-division of a circle into sectors whose areas are
proportional to the values they represent. The angles of the sectors add up to the total angle of a circle, i.e. 360°.
Historigram:
The graph of a time series is called a historigram. It is constructed by taking
time along the X-axis and the observed values along the Y-axis.
Histogram:
The graph of a frequency distribution in the form of adjacent rectangles, in
which the area of each rectangle is proportional to its class frequency, is called a
histogram.
Frequency Polygon:
The graph of a frequency distribution obtained by plotting the class frequencies against the mid
points of the classes and joining the points by straight lines is called a frequency
polygon.
Frequency Curve:
If a frequency polygon is smoothed out, the resulting graph is called a
frequency curve.
Cumulative Frequency Curve or Ogive:
The graph of a cumulative frequency distribution is called a cumulative frequency
curve or ogive. It is constructed by plotting the cumulative frequencies against the
upper class boundaries and joining each point to the next by a smooth curve.
Simple Bar Chart:
A simple bar chart consists of horizontal or vertical bars of equal widths
and heights (or lengths) proportional to the values they represent.
Chapter No.3
Central Tendency:
A single numerical value which represents the data as a whole is called an
average. Since averages tend to lie in the center of the data, they are called measures of
central tendency. They are also called measures of location.
Average:
A single numerical value which represents the data as a whole is called an average.
Properties of an Ideal (good) Average:
i- It should be clearly defined.
ii- It should be easy to calculate.
iii- It should be easy to understand and easy to explain.
iv- It should be based on all the values.
v- It should be capable of mathematical treatments.
vi- It should have sampling stability.
vii- It should not be affected by extreme values.
Types of Average:
i- Arithmetic mean.
ii- Geometric Mean.
iii- Harmonic Mean.
iv- Median.
v- Mode.
Arithmetic Mean:
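The arithmetic mean is defined as the sum of all the values divided by their
number. It is denoted by $\bar{X}$ (X-bar).
Arithmetic mean for ungrouped data
i- $\bar{X} = \frac{\sum X}{n}$ (Direct Method)
ii- $\bar{X} = A + \frac{\sum D}{n}$, where $D = X - A$ (Short Cut Method)
iii- $\bar{X} = A + \frac{\sum U}{n} \times h$, where $U = \frac{X - A}{h}$ (Coding Method or Step Deviation Method)
Arithmetic mean for grouped data
i- $\bar{X} = \frac{\sum fX}{\sum f}$ (Direct Method)
ii- $\bar{X} = A + \frac{\sum fD}{\sum f}$ (Short Cut Method)
iii- $\bar{X} = A + \frac{\sum fU}{\sum f} \times h$ (Coding Method or Step Deviation Method)
Here A is an assumed (provisional) mean and h is a common factor or class interval.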
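The three methods for ungrouped data give the same arithmetic mean. The sketch below is only an illustration: the data, the assumed mean `A` and the common factor `h` are made-up values, and the short cut and coding formulas are taken in their usual textbook form, with D = X - A and U = (X - A)/h.

```python
# Hypothetical ungrouped data
x = [52, 54, 56, 58, 60]
n = len(x)

A = 54  # assumed (provisional) mean, chosen arbitrarily
h = 2   # common factor used for coding, chosen arbitrarily

# Direct method: sum of the values divided by their number
direct = sum(x) / n

# Short cut method: deviations D = X - A
D = [value - A for value in x]
short_cut = A + sum(D) / n

# Coding (step deviation) method: coded values U = (X - A) / h
U = [(value - A) / h for value in x]
coding = A + (sum(U) / n) * h

print(direct, short_cut, coding)
```

All three results agree, which is why the short cut and coding methods are used only to simplify the arithmetic.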
WEIGHTED MEAN (MEAN OF VALUES OF UNEQUAL IMPORTANCE)
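Let $x_1, x_2, x_3, \dots, x_n$ be the values of the variable X having weights
(relative importance) $w_1, w_2, w_3, \dots, w_n$; then the weighted arithmetic mean is
$\bar{X}_w = \frac{\sum w_i x_i}{\sum w_i}$, where i = 1, 2, 3, ..., n
If k groups have means $\bar{X}_1, \bar{X}_2, \dots, \bar{X}_k$ based on $n_1, n_2, \dots, n_k$ values, then their combined
mean is given by
$\bar{\bar{X}} = \frac{n_1\bar{X}_1 + n_2\bar{X}_2 + \dots + n_k\bar{X}_k}{n_1 + n_2 + \dots + n_k}$ OR $\bar{\bar{X}} = \frac{\sum n_i \bar{X}_i}{\sum n_i}$, i = 1, 2, 3, ..., k
vi- The product of the mean ($\bar{X}$) and the number of values (n) is equal to the sum of all the
values, i.e. $\sum X = n\bar{X}$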
Geometric Mean (G.M):
Geometric mean is the nth positive root of the product of ‘n’ positive values.
G.M for ungrouped data
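Basic definition formula
$G.M = \sqrt[n]{X_1 \times X_2 \times X_3 \times \dots \times X_n}$ OR $G.M = (X_1 \times X_2 \times X_3 \times \dots \times X_n)^{1/n}$
Using logarithm formula
$G.M = \text{antilog}\left[\frac{\sum \log X}{n}\right]$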
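The basic definition and the logarithm formula lead to the same geometric mean. The sketch below is only an illustration with made-up values; it takes the logarithm formula as the antilog of the mean of the base-10 logarithms, and `math.prod` assumes Python 3.8 or later.

```python
import math

# Hypothetical positive values
x = [2, 4, 8]
n = len(x)

# Basic definition: nth positive root of the product of the n values
gm_root = math.prod(x) ** (1 / n)

# Logarithm formula: antilog of (sum of log X divided by n), using base-10 logs
gm_log = 10 ** (sum(math.log10(value) for value in x) / n)

print(round(gm_root, 4), round(gm_log, 4))
```

The logarithm form is the convenient one for hand calculation, because it replaces a large product by a sum.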
Advantages and Disadvantages of Harmonic Mean.
Advantages(Merits)
i- It is well defined.
ii- It is based on all the values.
iii- It is capable of further algebraic treatment.
iv- It is not affected by the extreme values.
v- It is the average of ratios and percentages.
vi- It is stable in repeated experiments.
Disadvantages (Demerits)
i- It is not easy to calculate.
ii- It is not easy to interpret.
iii- It does not exist if any value is zero.
iv- It gives high weightage to small values.
v- It is not computed for every kind of data.
Relation between A.M, G.M & H.M
i- If all the positive values are not equal, then A.M > G.M > H.M
ii- If all the positive values are equal (constant), then A.M = G.M = H.M
iii- For two positive values, $G.M = \sqrt{A.M \times H.M}$
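These relations can be checked numerically. The sketch below is only an illustration with two made-up positive values; it uses the standard library `statistics` module (`geometric_mean` assumes Python 3.8 or later).

```python
import math
from statistics import mean, geometric_mean, harmonic_mean

# Two hypothetical positive, unequal values
values = [4, 9]

am = mean(values)            # arithmetic mean
gm = geometric_mean(values)  # geometric mean
hm = harmonic_mean(values)   # harmonic mean

print(am, round(gm, 4), round(hm, 4))
print("A.M > G.M > H.M:", am > gm > hm)
print("G.M = sqrt(A.M x H.M):", math.isclose(gm, math.sqrt(am * hm)))

# When all the values are equal, the three averages coincide
equal_values = [5, 5, 5]
print(mean(equal_values), geometric_mean(equal_values), harmonic_mean(equal_values))
```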
Median
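The value which divides the arrayed data into two equal parts is called the median.
It is the middle value if ‘n’ is odd and the mean of the two middle values if ‘n’ is even.
It is denoted by $\tilde{X}$ (X-tilde).
Median for Ungrouped Data.
Median = value of the $\left(\frac{n+1}{2}\right)$th item of the array data
Median for Grouped Data.
Median = $L + \frac{h}{f}\left(\frac{n}{2} - C\right)$
where L = lower class boundary of the median class, h = class interval of the median class,
f = frequency of the median class, C = cumulative frequency of the class preceding the median class and n = $\sum f$.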
Quartiles
The values which divide the arrayed data into four equal parts are called quartiles.
There are three quartiles, denoted by Q1, Q2 and Q3.
Where Q1 = First or lower quartile.
Q2 = Second quartile or Median.
Q3 = Third or upper quartile.
For ungrouped data
$Q_j = \left[\frac{j(n+1)}{4}\right]$th item of array data
For grouped data
$Q_j = L + \frac{h}{f}\left(\frac{jn}{4} - C\right)$, where j = 1, 2, 3.
Here L = lower class boundary, h = class interval and f = frequency of the class containing the quartile,
C = cumulative frequency of the preceding class and n = $\sum f$.
Deciles
The values which divide the arrayed data into ten equal parts are called deciles.
There are nine deciles, denoted by D1, D2, D3, ..., D9.
For grouped data
$D_k = L + \frac{h}{f}\left(\frac{kn}{10} - C\right)$, where k = 1, 2, 3, ..., 9.
Percentiles
The values which divide the arrayed data into hundred equal parts are called
percentiles. There are 99 percentiles, denoted by P1, P2, P3, ..., P99.
For grouped data
$P_i = L + \frac{h}{f}\left(\frac{in}{100} - C\right)$, where i = 1, 2, 3, ..., 99.
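For ungrouped data, quartiles, deciles and percentiles can all be located with one position rule. The sketch below is only an illustration: the helper `quantile` and its arguments are hypothetical names, it assumes the j(n+1)/parts position rule, and the linear interpolation used when the position is not a whole number is an assumption, since the notes do not say how fractional positions are handled.

```python
def quantile(data, j, parts):
    """Return the j-th value dividing the arrayed data into `parts` equal parts.

    Uses the position rule j(n + 1) / parts on the sorted data, with linear
    interpolation between neighbouring items for fractional positions.
    """
    array = sorted(data)                  # arrange the data as an array
    n = len(array)
    position = j * (n + 1) / parts        # 1-based position in the array
    position = min(max(position, 1), n)   # keep the position inside the array
    lower = int(position)                 # whole part of the position
    fraction = position - lower           # fractional part of the position
    if lower >= n:
        return float(array[-1])
    return array[lower - 1] + fraction * (array[lower] - array[lower - 1])


# Hypothetical ungrouped data
data = [7, 1, 5, 3, 9, 13, 11]

q1 = quantile(data, 1, 4)   # first quartile
q2 = quantile(data, 2, 4)   # second quartile (median)
q3 = quantile(data, 3, 4)   # third quartile
d5 = quantile(data, 5, 10)  # fifth decile, which equals the median

print(q1, q2, q3, d5)
```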
Disadvantages (Demerits)
i- It is not based on all the values.
ii- It is not capable of further algebraic treatment.
iii- It cannot be interpolated.
iv- For large number of values arrangement of data is difficult.
Mode
The most repeated value of the data is called the mode. It is denoted by $\hat{X}$ (X-hat).
It is possible for a distribution to have no mode, two modes, or more than two
modes. A distribution having one mode is called uni-modal, one having two modes is called
bi-modal, and one having more than two modes is called multi-modal.
For Ungrouped Data
Mode = the most repeated value
For Grouped Data
Mode = $L + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$
The class or group which has the maximum frequency is called the modal class or group.
Where L = lower C.B of the modal group, $f_m$ = freq. of the modal group,
$f_1$ = freq. of the group preceding the modal group, $f_2$ = freq. of the group following the modal group,
h = C.I of the modal group
For Discrete Frequency Distribution
Mode = the value having the maximum frequency
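The grouped-data mode can be worked out from the quantities L, fm, f1, f2 and h defined above. The sketch below is only an illustration: it assumes the usual textbook formula Mode = L + (fm - f1) / ((fm - f1) + (fm - f2)) × h, and the function name and the figures are made up.

```python
def grouped_mode(L, fm, f1, f2, h):
    """Mode of grouped data.

    L  : lower class boundary of the modal group
    fm : frequency of the modal group
    f1 : frequency of the group preceding the modal group
    f2 : frequency of the group following the modal group
    h  : class interval of the modal group
    """
    return L + (fm - f1) / ((fm - f1) + (fm - f2)) * h


# Hypothetical modal group 29.5 to 39.5 with frequency 12,
# preceded by a frequency of 8 and followed by a frequency of 6
mode = grouped_mode(L=29.5, fm=12, f1=8, f2=6, h=10)
print(round(mode, 2))
```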
Advantages and Disadvantages of Mode
Advantages (Merits)
i- It can be located graphically.
ii- It can be useful for qualitative data.
iii- It is not affected by extreme values.
iv- It can be computed from an open-end frequency table.
Disadvantages (Demerits)
i- It is ill defined.
ii- It is not based on all the values.
iii- It is not capable of further algebraic treatment.
iv- It may not exist in small samples.
v- Sometimes a distribution may have more than one mode.
Empirical Relation between the Mean, Median and Mode.
The empirical relation depends upon the shape of the distribution. The empirical relations
between the mean, median and mode of a uni-modal frequency distribution are:
i- In a symmetrical distribution, Mean = Median = Mode.
ii- In a positively skewed distribution, Mean > Median > Mode
iii- In a negatively skewed distribution, Mean < Median < Mode
iv- In a moderately skewed distribution, Mode = 3 Median - 2 Mean
Suitable measure of Average
Arithmetic mean: It is used when distribution is approximately symmetrical. For
example, distribution of heights, weights of persons.
Geometric Mean: It is used when the data are in the form of percentages. For example,
the average rate of increase in income.
Harmonic Mean: It is used for averaging certain rates, such as speeds.
Median: It is used when the distribution is skewed. For example, distribution of wages,
wealth, skill etc.
Mode: It is used when the most repeated value is required or the data are qualitative.
Q. No Short Questions
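1. Find $\sum X$ if $\bar{X}$ = … and n = 5.
2. Find n if $\bar{X}$ = 15 and $\sum X$ = 75.
3. If you have $\sum fX$ = 308 and $\sum f$ = 30, can we find the mean, and what is it?
4. If $\sum fX$ = 400 and $\bar{X}$ = 16, find $\sum f$.
5. Given $\sum X$ = 2.8 and n = 5, find the sample mean.
6. Given $\sum (X - 10)$ = 0 and n = 5, find the mean.
7. Find the mean if n = 10, $\sum U$ = 100 and h = 2, where D = X - 50.
8. Given U = …, $\sum fU$ = 100 and $\sum f$ = 200, find the A.M.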
Chapter No.4
Dispersion:
The degree to which numerical data tend to spread about an average value is
called the dispersion or variation.
There are two main types of dispersion.
i- Absolute Dispersion.
ii- Relative Dispersion.
Absolute Dispersion.
An absolute dispersion is the actual variation present in the set of data. It is measured
in the same units as the data. The absolute measures of dispersion are
i- Range ii- Quartile deviation
iii- Mean Deviation iv- Standard Deviation & Variance
Relative Dispersion
The relative dispersion is the ratio between the absolute dispersion and the
average. It is expressed in the form of a ratio, coefficient or percentage and is
independent of the units of measurement.
The relative measures of dispersion are
i- Coefficient of range ii- Coefficient of quartile deviation
iii- Coefficient of dispersion iv- Coefficient of variation.
Range for Grouped data
Range is the difference between the maximum and minimum mid values of the data.
OR
The difference between the upper class boundary of the last group and the lower
class boundary of the first group.
Range = $X_m - X_0$, where $X_m$ is the largest value and $X_0$ is the smallest value.
Co-efficient of Range:
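The ratio between the difference and the sum of the largest and smallest values of the
data is called the co-efficient of range or co-efficient of dispersion.
Co-efficient of Range = $\frac{X_m - X_0}{X_m + X_0}$, where $X_m$ is the largest value and $X_0$ is the smallest value.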
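The range and the co-efficient of range follow directly from the largest and smallest values. The sketch below is only an illustration for ungrouped data; the function names and the data are made up.

```python
def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Co-efficient of range: (largest - smallest) / (largest + smallest)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


# Hypothetical ungrouped data
data = [8, 3, 15, 10, 6, 13, 10, 9]

data_range = value_range(data)
data_coefficient = coefficient_of_range(data)

print(data_range, round(data_coefficient, 4))
```

The range is in the units of the data, while the co-efficient of range is a pure number, which makes it suitable for comparing different data sets.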
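M.D$(\tilde{X}) = \frac{\sum |X - \tilde{X}|}{n}$ (M.D from Median, for ungrouped data)
M.D$(\tilde{X}) = \frac{\sum f|X - \tilde{X}|}{\sum f}$ (M.D from Median, for grouped data)
M.D$(\hat{X}) = \frac{\sum |X - \hat{X}|}{n}$ (M.D from Mode, for ungrouped data)
M.D$(\hat{X}) = \frac{\sum f|X - \hat{X}|}{\sum f}$ (M.D from Mode, for grouped data)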
Disadvantages
i- It has a mathematical flaw of ignoring negative signs in its calculations.
ii- It is not capable of algebraic treatment.
iii- When computed from median or mode it is not so accurate measure of
dispersion.
iv- It is not a satisfactory measure for skewed distributions.
Variance:
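The arithmetic mean of the squared deviations taken from the mean is called the variance.
It is denoted by $S^2$.
Variance
Direct Method:
For ungrouped data: $S^2 = \frac{\sum (X - \bar{X})^2}{n}$
For grouped data: $S^2 = \frac{\sum f(X - \bar{X})^2}{\sum f}$
Indirect Method:
For ungrouped data: $S^2 = \frac{\sum X^2}{n} - \left[\frac{\sum X}{n}\right]^2$
For grouped data: $S^2 = \frac{\sum fX^2}{\sum f} - \left[\frac{\sum fX}{\sum f}\right]^2$
Short Cut Method:
For ungrouped data: $S^2 = \frac{\sum D^2}{n} - \left[\frac{\sum D}{n}\right]^2$
For grouped data: $S^2 = \frac{\sum fD^2}{\sum f} - \left[\frac{\sum fD}{\sum f}\right]^2$
Coding Method:
For ungrouped data: $S^2 = h^2\left[\frac{\sum U^2}{n} - \left(\frac{\sum U}{n}\right)^2\right]$
For grouped data: $S^2 = h^2\left[\frac{\sum fU^2}{\sum f} - \left(\frac{\sum fU}{\sum f}\right)^2\right]$
Here $D = X - A$ and $U = \frac{X - A}{h}$, where A is an assumed mean and h is a common factor or class interval.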
Properties of Variance
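i- The variance is independent of origin,
i.e. if “a” is any constant then $\text{Var}(X + a) = \text{Var}(X)$ and $\text{Var}(X - a) = \text{Var}(X)$.
ii- The variance of a constant is equal to zero,
i.e. if “a” is any constant then $\text{Var}(a) = 0$.
iii- Variance is always non-negative, i.e. $\text{Var}(X) \geq 0$.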
iv- If be the n observations, then
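v- When all the values are multiplied or divided by a constant, the variance is
multiplied or divided by the square of the constant, i.e.
$\text{Var}(aX) = a^2\,\text{Var}(X)$ or $\text{Var}\left(\frac{X}{a}\right) = \frac{\text{Var}(X)}{a^2}$, where “a” is any constant.
vi- The variance of the sum or difference of two independent variables is equal to
the sum of their respective variances, i.e. $\text{Var}(X \pm Y) = \text{Var}(X) + \text{Var}(Y)$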
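The first properties of variance can be verified on a small data set. The sketch below is only an illustration with made-up values; it uses `statistics.pvariance`, which divides by n and so matches the definition of variance as the arithmetic mean of squared deviations.

```python
from statistics import pvariance

# Hypothetical data
x = [2, 4, 6, 8]

var_x = pvariance(x)

# Change of origin: adding a constant leaves the variance unchanged
var_shifted = pvariance([value + 10 for value in x])

# Change of scale: multiplying by a constant multiplies the variance by its square
var_scaled = pvariance([3 * value for value in x])

# The variance of a constant is zero
var_constant = pvariance([7, 7, 7, 7])

print(var_x, var_shifted, var_scaled, var_constant)
```

Taken together, the shift and scale results give Var(3X + 10) = 9 Var(X) for these values.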
Standard Deviation:
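The positive square root of the arithmetic mean of the squared deviations taken from the
mean is called the standard deviation. It is denoted by S. OR
The positive square root of the variance is called the standard deviation.
Standard Deviation
Direct Method:
For ungrouped data: $S = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}$
For grouped data: $S = \sqrt{\frac{\sum f(X - \bar{X})^2}{\sum f}}$
Indirect Method:
For ungrouped data: $S = \sqrt{\frac{\sum X^2}{n} - \left[\frac{\sum X}{n}\right]^2}$
For grouped data: $S = \sqrt{\frac{\sum fX^2}{\sum f} - \left[\frac{\sum fX}{\sum f}\right]^2}$
Short Cut Method:
For ungrouped data: $S = \sqrt{\frac{\sum D^2}{n} - \left[\frac{\sum D}{n}\right]^2}$
For grouped data: $S = \sqrt{\frac{\sum fD^2}{\sum f} - \left[\frac{\sum fD}{\sum f}\right]^2}$
Coding Method:
For ungrouped data: $S = h\sqrt{\left[\frac{\sum U^2}{n} - \left(\frac{\sum U}{n}\right)^2\right]}$
For grouped data: $S = h\sqrt{\left[\frac{\sum fU^2}{\sum f} - \left(\frac{\sum fU}{\sum f}\right)^2\right]}$
Here $D = X - A$ and $U = \frac{X - A}{h}$, where A is an assumed mean and h is a common factor or class interval.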
Properties of Standard Deviation
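i- The standard deviation is independent of origin,
i.e. if “a” is any constant then $\text{S.D}(X + a) = \text{S.D}(X)$ and $\text{S.D}(X - a) = \text{S.D}(X)$.
ii- The standard deviation of a constant is equal to zero,
i.e. if “a” is any constant then $\text{S.D}(a) = 0$.
iii- Standard deviation is always non-negative, i.e. $\text{S.D}(X) \geq 0$.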
iv- If be the n observations then
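v- When all the values are multiplied or divided by a constant, the standard
deviation is multiplied or divided by the absolute value of the constant, i.e. $\text{S.D}(aX) = |a|\,\text{S.D}(X)$ or
$\text{S.D}\left(\frac{X}{a}\right) = \frac{\text{S.D}(X)}{|a|}$, where “a” is any constant.
vi- The standard deviation of the sum or difference of two independent variables
is equal to the positive square root of the sum of their respective variances,
i.e. $\text{S.D}(X + Y) = \sqrt{\text{Var}(X) + \text{Var}(Y)}$ and $\text{S.D}(X - Y) = \sqrt{\text{Var}(X) + \text{Var}(Y)}$
Combined standard deviation of k groups:
$S_c = \sqrt{\frac{\sum n_i\left[S_i^2 + (\bar{X}_i - \bar{\bar{X}})^2\right]}{\sum n_i}}$, i = 1, 2, 3, ..., k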
Co-efficient of Standard deviation.
The ratio between the standard deviation and the arithmetic mean is called the co-efficient of
standard deviation, i.e. Co-efficient of S.D = $\frac{S}{\bar{X}}$
Co-efficient of variation:
The percentage ratio between the standard deviation and the arithmetic mean is
called the co-efficient of variation, i.e.
C.V = $\frac{S}{\bar{X}} \times 100$
Uses of Co-efficient of variation.
i- Comparing the variation of two series.
ii- Comparing the consistency of two or more sets of data. The data set having the
smaller co-efficient of variation is called more consistent.
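The use of the co-efficient of variation for comparing consistency can be seen on two small series. The sketch below is only an illustration: the two series are made up, and `statistics.pstdev` (divisor n) is used for the standard deviation.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation as a percentage of the arithmetic mean."""
    return pstdev(values) / mean(values) * 100


# Two hypothetical series with the same mean but different spread
series_a = [48, 50, 52]
series_b = [30, 50, 70]

cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)

print(round(cv_a, 2), round(cv_b, 2))
print("More consistent series:", "A" if cv_a < cv_b else "B")
```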
Moments
Moments are of great importance in the study of the symmetry and normality of a
distribution. They are defined as the arithmetic means of powers of deviations.
Moments About Mean (Central Moments)
The arithmetic means of the various powers of deviations taken from the
arithmetic mean are called moments about mean. They are denoted by m1, m2, m3,
m4, etc. The rth moment about mean is defined as
For ungrouped data: $m_r = \frac{\sum (X - \bar{X})^r}{n}$
For grouped data: $m_r = \frac{\sum f(X - \bar{X})^r}{\sum f}$
where r = 1, 2, 3, 4, ...
Note:
*The first moment about mean is zero.
*The first moment about zero(origin) is mean.
*The second moment about mean is variance.
Moments about an arbitrary mean
The arithmetic means of the various powers of deviations taken from any arbitrary mean
(a) are called moments about an arbitrary mean or raw moments. They are denoted by
$m'_1, m'_2, m'_3, m'_4$, etc. The rth moment about an arbitrary mean is defined as
For ungrouped data: $m'_r = \frac{\sum (X - a)^r}{n}$
For grouped data: $m'_r = \frac{\sum f(X - a)^r}{\sum f}$
where r = 1, 2, 3, 4, ...
Moments About Zero (Origin)
If $X_1, X_2, X_3, \dots, X_n$ are n observations on a variable X, then the rth moment about
zero (origin) is defined as
For ungrouped data: $m'_r = \frac{\sum X^r}{n}$
For grouped data: $m'_r = \frac{\sum fX^r}{\sum f}$
where r = 1, 2, 3, 4, ...
*The first moment about zero(origin) is mean.
Relation between Moments about mean and Moments
about arbitrary mean or Moments about origin.
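$m_1 = m'_1 - m'_1 = 0$
$m_2 = m'_2 - (m'_1)^2$ = Variance
$m_3 = m'_3 - 3m'_2 m'_1 + 2(m'_1)^3$
$m_4 = m'_4 - 4m'_3 m'_1 + 6m'_2 (m'_1)^2 - 3(m'_1)^4$
where $m'_r$ denotes the rth moment about the arbitrary mean or about the origin.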
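The conversion from moments about an arbitrary value to moments about the mean can be checked numerically. The sketch below is only an illustration: the data, the arbitrary value `a` and the function names are made up, and the conversion uses the standard relations m2 = m'2 - (m'1)², m3 = m'3 - 3m'2m'1 + 2(m'1)³ and m4 = m'4 - 4m'3m'1 + 6m'2(m'1)² - 3(m'1)⁴.

```python
import math


def moments_about(values, point, max_order=4):
    """First `max_order` moments of ungrouped data about the given point."""
    n = len(values)
    return [sum((x - point) ** r for x in values) / n
            for r in range(1, max_order + 1)]


def central_from_raw(r1, r2, r3, r4):
    """Convert moments about an arbitrary value into moments about the mean."""
    m1 = r1 - r1
    m2 = r2 - r1 ** 2
    m3 = r3 - 3 * r2 * r1 + 2 * r1 ** 3
    m4 = r4 - 4 * r3 * r1 + 6 * r2 * r1 ** 2 - 3 * r1 ** 4
    return [m1, m2, m3, m4]


# Hypothetical data and arbitrary value
data = [1, 2, 3, 6]
a = 2

raw = moments_about(data, a)             # moments about the arbitrary value a
mean = sum(data) / len(data)
direct = moments_about(data, mean)       # moments about the mean, computed directly
converted = central_from_raw(*raw)       # moments about the mean, via the relations

print(raw)
print(direct)
print(converted)
print(all(math.isclose(d, c, abs_tol=1e-9) for d, c in zip(direct, converted)))
```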
Moment Ratios
The ratios between moments about mean are called moment ratios. The important
moment ratios are
$\beta_1 = \frac{m_3^2}{m_2^3}$ & $\beta_2 = \frac{m_4}{m_2^2}$
Symmetry
A distribution in which the deviations below the mean exactly equal the
corresponding deviations above the mean is said to have symmetry.
Symmetrical distribution
A distribution in which both sides are at equal distance from the mean is called a
symmetrical distribution. In a symmetrical distribution:
i- The graph of the series will be bell shaped.
ii- Mean = median = mode.
iii- $\beta_1$ is always equal to zero.
iv- $Q_3 - \text{Median} = \text{Median} - Q_1$
v- Right tail of curve from peak = Left tail of curve from peak
Skewness
The lack of symmetry is called skewness; in other words, any departure from symmetry is
called skewness. A skewed distribution may be positively or negatively skewed.
Positive Skewness
If the curve has a long tail towards the right, then the skewness will be positive. In such
cases:
i- Mean > Median > Mode
ii- $m_3 > 0$ and Sk > 0 (Positive)
iii- $Q_3 - \text{Median} > \text{Median} - Q_1$
iv- Right tail of curve from peak > Left tail of curve from peak
Negative Skewness.
If the curve has a long tail towards the left, then the skewness will be negative. In such
cases:
i- Mean < Median < Mode
ii- $m_3 < 0$ and Sk < 0 (Negative)
iii- $Q_3 - \text{Median} < \text{Median} - Q_1$
iv- Right tail of curve from peak < Left tail of curve from peak
Co-efficient of Skewness
The measure of skewness is called co-efficient of skewness.
The important formulas of co-efficient of skewness are
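Pearson’s first co-efficient of skewness,
$Sk = \frac{\text{Mean} - \text{Mode}}{\text{S.D}}$, -3 ≤ Sk ≤ +3
Pearson’s second co-efficient of skewness,
$Sk = \frac{3(\text{Mean} - \text{Median})}{\text{S.D}}$, -3 ≤ Sk ≤ +3
Bowley’s co-efficient of skewness (Quartile co-efficient of skewness),
$Sk = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$, -1 ≤ Sk ≤ +1
Moment co-efficient of skewness,
$Sk = \sqrt{\beta_1} = \frac{m_3}{\sqrt{m_2^3}}$, -2 ≤ Sk ≤ +2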
Kurtosis
Kurtosis measures the degree of peakedness of a symmetrical
distribution. $\beta_2$ is the measure of kurtosis, which tells us the shape of the symmetrical
distribution.
If $\beta_2 = 3$, then the distribution is called mesokurtic (normal).
If $\beta_2 > 3$, then the distribution is called leptokurtic.
If $\beta_2 < 3$, then the distribution is called platykurtic, where $\beta_2 = \frac{m_4}{m_2^2}$
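The moment ratios and the kurtosis classification can be computed from the moments about mean. The sketch below is only an illustration: it assumes the usual definitions β1 = m3²/m2³ and β2 = m4/m2², and the function names and the moment values are made up.

```python
def moment_ratios(m2, m3, m4):
    """Return (beta1, beta2) from the second, third and fourth moments about mean."""
    beta1 = m3 ** 2 / m2 ** 3
    beta2 = m4 / m2 ** 2
    return beta1, beta2


def kurtosis_type(beta2):
    """Classify a distribution by comparing beta2 with 3."""
    if beta2 > 3:
        return "leptokurtic"
    if beta2 < 3:
        return "platykurtic"
    return "mesokurtic"


# Hypothetical moments about mean
beta1, beta2 = moment_ratios(m2=2, m3=1, m4=16)

print(beta1, beta2, kurtosis_type(beta2))
```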
Q# Short Questions(Numerical)
1. If X1=-6, X2=3, X3=-4, X4=0, X5=5, X6=7, X7=6. Find the Range and Co-efficient of Range.
2. If 8, 3, 15, 10, 6, 13, 10, 9. Find Range and Co-efficient of Range
3. Find the Range and Co-efficient of Range of -1, -3, 0, 2, 3.
4. Find the range of: -1,-4,0,7,4
5. Compute the value of var(Y) if Y=3X+10 and var(X)=2.
62. The first three moments about x=4 are 1, 5 and 12. Find the co-efficient of skewness.
63. The first three moments about mean are 0, 3 and -5. Find the co-efficient of skewness.
64. The first two moments about x=2 are 1 and 4. Find the first two moments about mean.
65. The first four moments about mean are 0, 6, 19 and 42. Find β1 and β2.
66. The first three moments about mean are µ1=0, µ2=8, µ3=0. Find β1.
67. What are the values of β1 and β2 in a symmetrical distribution?
68. The second moment about mean is 25. What must be the value of m4 if the
distribution is leptokurtic?
Chapter No.5
Index Numbers
Index numbers are series of numbers which measure the relative change
occurring in the data from time to time or place to place. Index numbers are also
called economic barometers. Index numbers are usually constructed for
variables such as prices, quantities, wages etc.
Types of index numbers.
1 - Simple Index Number.
2 - Composite Index Number.
1- Simple Index Number.
An index number is called a simple index number when it measures a relative change
in a single variable with respect to a base year. For example, index numbers for wages
of workers or index numbers of wheat prices.
Following two different methods are used to compute simple index number.
i- Fixed Base Method.
ii- Chain Base Method.
Fixed Base Method.
In this method a particular economically stable year or average of some years is
chosen as the base period that remains fixed (unchanged).
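Price Relatives: $P_{0n} = \frac{p_n}{p_0} \times 100$, where $p_n$ is the price in the current year and $p_0$ is the price in the base year.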
(i) Un-weighted index numbers.
An index number of two or more than two commodities when their relative
importance is not considered is called un-weighted index number. Its types are:
(a) Simple Aggregate index numbers.
(b) Simple Average of Relatives index numbers.
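Fisher’s ideal index: $P_{0n} = \sqrt{\frac{\sum p_n q_0}{\sum p_0 q_0} \times \frac{\sum p_n q_n}{\sum p_0 q_n}} \times 100$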
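Fisher's index is the geometric mean of the Laspeyres and Paasche indices, so the three can be computed together. The sketch below is only an illustration: the prices and quantities are made up, the helper `sum_product` is a hypothetical name, and the Laspeyres (base-year weights) and Paasche (current-year weights) formulas are used in their standard form.

```python
import math


def sum_product(prices, quantities):
    """Sum of price times quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))


# Hypothetical data for three commodities
p0 = [6, 2, 4]      # base year prices
q0 = [50, 100, 60]  # base year quantities
pn = [9, 3, 5]      # current year prices
qn = [40, 120, 70]  # current year quantities

# Laspeyres index: base year quantities as weights
laspeyres = sum_product(pn, q0) / sum_product(p0, q0) * 100

# Paasche index: current year quantities as weights
paasche = sum_product(pn, qn) / sum_product(p0, qn) * 100

# Fisher's ideal index: geometric mean of the Laspeyres and Paasche indices
fisher = math.sqrt(laspeyres * paasche)

print(round(laspeyres, 2), round(paasche, 2), round(fisher, 2))
```

Because it is a geometric mean, Fisher's index always lies between the Laspeyres and Paasche indices.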
Cost of living index number or consumer price index numbers (C.P.I)
This index number is designed to measure the change in the composite price of
goods and services during the given period. The goods and services are
i- Food and beverages.
ii- Clothing and footwear.
iii- Fuel and lighting.
iv- Housing
v- Services of teachers, doctors advocate, etc.
Following two different methods are used to compute C.P.I.
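Aggregate Expenditure Method.
Laspeyre’s index: $P_{0n} = \frac{\sum p_n q_0}{\sum p_0 q_0} \times 100$
Family Budget Method (Household Budget Method)
$P_{0n} = \frac{\sum WI}{\sum W}$, where $I = \frac{p_n}{p_0} \times 100$ & $W = p_0 q_0$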
Price index numbers.
These index numbers show relative changes in the wholesale or retail prices of
a variable or a group of related variables with respect to base.
Quantity index numbers.
These index numbers show relative changes in the volume (quantity) of a
variable or a group of related variables with respect to base.
Chain Index Numbers.
The link relatives are converted back to a fixed base by multiplying together all
the link relatives (without the factor 100) involved between the two years. This
process of conversion is called the chaining process, and the indices so determined
are called chain indices.
Base Period.
A period with which the other periods are compared is called the base
period.
$p_0$ = price in the base year (period)
$q_0$ = quantity in the base year (period)
Current Period.
A period which is compared with the base period is called the current period.
$p_n$ = price in the current year (period)
$q_n$ = quantity in the current year (period)
Price Relative.
Price relatives are obtained by dividing the price in the current year by the price in the base
year and multiplying by 100,
i.e. Price Relative = $\frac{p_n}{p_0} \times 100$
Link Relative.
Link relatives are obtained by dividing the price in the current year by the price in the
preceding year and multiplying by 100, i.e. Link Relative = $\frac{p_n}{p_{n-1}} \times 100$
11. If $\sum p_1 q_0$ = 3123, $\sum p_0 q_0$ = 2902 and Paasche’s index number is 140, find Fisher’s index number.
12. If Laspeyre’s index no. is 125.4 and Fisher’s index no. is 137.5, find Paasche’s index number.
13. If Fisher’s index no. is 110.10 and Paasche’s index no. is 137.5, find Laspeyre’s index number.
14. If Paasche’s index no. is 103.2 and Laspeyre’s index no. is 105.4, find Fisher’s index number.
15. If $\sum p_n q_n$ = 460 and $\sum p_0 q_n$ = 115, find Paasche’s index number.
16. Given $\sum p_1 q_0$ = 9000 and $\sum p_0 q_0$ = 8490, find the C.P.I and write the name of the method.
17. Given the following information: $\sum WI$ = 8074.5, $\sum W$ = 60.25. Find the consumer price index number.
18. Given $p_0$ = 6, 2, 4 and $q_0$ = 50, 100, 60, find $\sum W$.
Chapter No.6
Probability
Subjective or Personalistic Definition of Probability:
It is defined as a measure of uncertainty or degree of belief in a particular
statement or uncertain problem.
0 ≤ P(A) ≤ 1
Conditional Probability:
When the sample space is reduced due to the occurrence of some outcomes, then
the probability of an event related to this reduced sample space is called conditional
probability.
$P(A/B) = \frac{P(A \cap B)}{P(B)}$, where P(B) > 0
$P(B/A) = \frac{P(A \cap B)}{P(A)}$, where P(A) > 0
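Conditional probability can be illustrated by counting sample points in a reduced sample space. The sketch below is only an illustration: it assumes one roll of a fair die with equally likely outcomes, and the events A and B are made up.

```python
from fractions import Fraction

# Sample space for one roll of a fair die (equally likely outcomes assumed)
sample_space = {1, 2, 3, 4, 5, 6}

A = {2, 4, 6}  # hypothetical event: an even number appears
B = {4, 5, 6}  # hypothetical event: a number greater than 3 appears


def probability(event):
    """Probability of an event as favourable outcomes over total outcomes."""
    return Fraction(len(event), len(sample_space))


# P(A/B) = P(A and B) / P(B), P(B/A) = P(A and B) / P(A)
p_a_given_b = probability(A & B) / probability(B)
p_b_given_a = probability(A & B) / probability(A)

print(p_a_given_b, p_b_given_a)
```

The same answer is obtained by treating B as the reduced sample space: two of its three points also belong to A.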
Factorial:
The numbers 1, 2, 3, 4……….n, are called natural numbers. The product of first n
natural numbers is called n factorial and denoted by n!
i.e. n! = 1×2×3×………… (n-2) (n-1) (n)
or n! = n(n-1) (n-2)……… 3×2×1
Note: 0!=1
Permutations:
An ordered arrangement of ‘n’ objects taken ‘r’ at a time is called a
permutation. The number of permutations of ‘n’ different objects taken ‘r’ at a time is given by
$^nP_r = \frac{n!}{(n-r)!}$
Combinations:
A selection of ‘n’ objects taken ‘r’ at a time without regard to their
order is called a combination. The number of combinations of ‘n’ objects taken ‘r’ at a time
is given by
$^nC_r = \frac{n!}{r!\,(n-r)!}$
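Permutations and combinations can be computed from factorials. The sketch below is only an illustration: it assumes the standard formulas nPr = n!/(n-r)! and nCr = n!/(r!(n-r)!), uses made-up values of n and r, and checks the results against `math.perm` and `math.comb` (Python 3.8 or later).

```python
import math

n, r = 5, 2  # hypothetical values

# Number of permutations of n different objects taken r at a time
permutations = math.factorial(n) // math.factorial(n - r)

# Number of combinations of n objects taken r at a time
combinations = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(permutations, math.perm(n, r))
print(combinations, math.comb(n, r))

# Selecting r objects is the same as leaving out n - r objects
print(math.comb(10, 4), math.comb(10, 6))
```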
Experiment:
Any action performed or process to get some results is called an experiment.
For example: tossing a fair coin, rolling a die.
Trial:
A single performance of an experiment is called a trial. For example if we toss a
coin 5 times it will be an experiment and one time tossing will be a trial.
Outcomes:
The results obtained from an experiment or trials are called outcomes.
For example if we toss a coin Head and Tail are possible outcomes.
Sample Space (S):
A set of all possible outcomes of a random experiment is called sample space.
It is denoted by S. for example when we toss a coin
S= {Head, Tail}
Sample Point:
Each possible outcome of a random experiment is called sample point.
For example when we toss a coin Head or Tail is sample point.
Event:
Any sub set of the sample space is called an event. Events are denoted by capital
letters A, B,C etc.
A = {Head}
B = {Tail}
Simple Event:
An event which contains only one sample point is called simple event.
e.g A={a} , B={6} , C={Head} are called simple event.
Compound Event.
An event which contains more than one sample points is called compound event.
e.g. A= {1, 3, 5} , B={2, 4, 6} are called compound events.
Equally Likely Events:
Two or more events are said to be equally likely events when each of them is as
likely to occur as any other. If P(A) = P(B), then A and B are equally likely events.
Mutually Exclusive Events:
Two events are said to be mutually exclusive or disjoint events if they cannot
occur together, i.e. A ∩ B = ∅. For example, A = {1, 2, 3} and B = {4, 5, 6} are mutually
exclusive events.
Not Mutually Exclusive Events:
Two events are said to be not mutually exclusive or joint events if they can
occur together, i.e. A ∩ B ≠ ∅. For example, A = {1, 2, 3} and B = {3, 4, 5} are not
mutually exclusive events.
Exhaustive Events:
Two or more events are said to be exhaustive if they collectively form the
entire sample space. If A ∪ B = S, then A and B are said to be exhaustive events.
Independent Events:
The two events are said to be independent if the occurrence of one event does
not affect the occurrence of the other.
P(A ∩ B) = P(A) P(B)
Dependent Events:
Two events are said to be dependent if the occurrence of one event affects the
occurrence of the other.
P(A ∩ B) = P(A/B) P(B)
Sure Event:
An event which contains all sample points of sample space is called sure
event. The probability of a sure event is always one. P(S) = 1
Impossible Event:
An event which contains no sample point of the sample space is called
impossible event. The probability of an impossible event is always zero. P(∅) = 0
Law (Theorem) of Probability:
i- Addition Law of Mutually Exclusive Events.
If two events A and B are mutually exclusive, then
P(A ∪ B) = P(A) + P(B)
ii- Addition Law of Not Mutually Exclusive Events.
If two events A and B are not mutually exclusive, then
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Note:
i- Two independent events with non-zero probabilities can never be mutually exclusive.
ii- Two mutually exclusive events with non-zero probabilities can never be independent.
iii- Two events that are not mutually exclusive may be independent or dependent.
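The addition laws and the independence condition can be verified by counting sample points. The sketch below assumes one roll of a fair die, so every sample point is equally likely; the helper prob and the events C, E and F are illustrative choices, not part of the notes.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # sample space for one roll of a fair die

def prob(event: set) -> Fraction:
    """Classical probability: favourable sample points / total sample points."""
    return Fraction(len(event & S), len(S))

A, B = {1, 2, 3}, {4, 5, 6}   # mutually exclusive: A ∩ B is empty
C = {3, 4, 5}                 # A and C share the sample point 3

# Addition law for mutually exclusive events
mutually_exclusive_ok = prob(A | B) == prob(A) + prob(B)

# Addition law for not mutually exclusive events
general_ok = prob(A | C) == prob(A) + prob(C) - prob(A & C)

# Independence: P(E ∩ F) = P(E) P(F)
E, F = {2, 4, 6}, {1, 2}
independent = prob(E & F) == prob(E) * prob(F)

print(mutually_exclusive_ok, general_ok, independent)  # True True True
```

Here E and F overlap (they share the point 2) and still satisfy the product rule, which illustrates that events that are not mutually exclusive may be independent.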
Q# Short Questions(Numerical)
1. a) If P(A) = 0.6, find P(Ā) b) If P(B̄) = 0.45, find P(B)
2. Compute and
3. What is the relation between and
4. Show that =
5. Verify that: 10C4=10C6
21. If A and B are two independent events such that P(A)=0.2 and P(B)=0.15 then
evaluate P( )?
Distribution of Playing Cards
      Red cards           Black cards
Heart     Diamond     Club      Spade     Total Cards   Probability of cards
Ace       Ace         Ace       Ace       4             4/52
2         2           2         2         4             4/52
3         3           3         3         4             4/52
4         4           4         4         4             4/52
5         5           5         5         4             4/52
6         6           6         6         4             4/52
7         7           7         7         4             4/52
8         8           8         8         4             4/52
9         9           9         9         4             4/52
10        10          10        10        4             4/52
Jack      Jack        Jack      Jack      4             4/52
Queen     Queen       Queen     Queen     4             4/52
King      King        King      King      4             4/52
Number of RED CARDS = 26, probability = 26/52
Number of BLACK CARDS = 26, probability = 26/52
TOTAL NUMBER OF PLAYING CARDS = 52
Picture Cards = King, Queen, Jack: 12 cards, probability = 12/52
Face Cards = Ace, King, Queen, Jack: 16 cards, probability = 16/52
Honor Cards = Ace, 10, King, Queen, Jack: 20 cards, probability = 20/52
Bridge Hand = A game of four players each having 13 cards.
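The card counts above can be reproduced by building a 52-card deck and counting. This is only a sketch: it follows the groupings used in these notes (picture cards = King, Queen, Jack; face cards = Ace, King, Queen, Jack; honor cards = Ace, 10, King, Queen, Jack), and the variable and function names are illustrative.

```python
from fractions import Fraction
from itertools import product

suits = ["Heart", "Diamond", "Club", "Spade"]
ranks = ["Ace", "2", "3", "4", "5", "6", "7", "8", "9", "10",
         "Jack", "Queen", "King"]
deck = list(product(ranks, suits))  # 13 ranks x 4 suits = 52 cards

def prob_of_ranks(wanted: set) -> Fraction:
    """Probability that one card drawn at random has a rank in `wanted`."""
    favourable = sum(1 for rank, _ in deck if rank in wanted)
    return Fraction(favourable, len(deck))

picture = prob_of_ranks({"King", "Queen", "Jack"})             # 12/52
face = prob_of_ranks({"Ace", "King", "Queen", "Jack"})         # 16/52
honor = prob_of_ranks({"Ace", "10", "King", "Queen", "Jack"})  # 20/52
red = Fraction(sum(1 for _, s in deck if s in {"Heart", "Diamond"}), len(deck))

print(len(deck), picture, face, honor, red)  # 52 3/13 4/13 5/13 1/2
```

Fraction reduces each result to lowest terms, so 12/52 is printed as 3/13.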
Chapter No.7
Random Number
Random numbers are defined as the numbers which are obtained by some random
process. The basic random numbers are the ten one-digit
numbers 0, 1, 2, 3, …, 9. Each of these numbers has an equal chance (probability) of
being selected. These digits are combined into two-digit, three-digit, four-digit, …
numbers according to use.
Generation of Random Numbers:
The random numbers are generated by the following methods
i- By playing Cards. ii- From Random numbers table.
iii- From Computer iv- By revolving a wheel machine.
Applications of Random Numbers:
i- Random numbers are used to select random samples.
ii- Random numbers are used wherever random selection is needed.
iii- Random number table can also be used to generate data without performing
the actual experiment.
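Generating random numbers by computer and using them to draw a random sample can be sketched as follows. The seed 2024, the list roll_numbers (a hypothetical class of 50 students) and the sample size 5 are assumptions made purely for illustration.

```python
import random

rng = random.Random(2024)  # fixed seed (arbitrary) so the run can be repeated

# Ten basic one-digit random numbers, each of 0..9 equally likely
digits = [rng.randint(0, 9) for _ in range(10)]

# Combine consecutive digits in pairs to form two-digit random numbers (00..99)
two_digit = [10 * a + b for a, b in zip(digits[0::2], digits[1::2])]

# Select a simple random sample of 5 students without replacement
roll_numbers = list(range(1, 51))
sample = rng.sample(roll_numbers, 5)

print(digits)
print(two_digit)
print(sample)
```

Computer-generated numbers of this kind are pseudo-random: the same seed always reproduces the same sequence.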
Random Experiment:
An experiment in which outcomes vary from trial to trial is called a random
experiment, for example tossing a coin or throwing a die.
Random Variable:
A random variable is a numerical quantity whose value is determined by the outcome of
a random experiment. A random variable is also called a chance variable, a stochastic
variable or simply a variate. Random variables are denoted by capital letters such as
X, Y, Z and their values by small letters, i.e. x, y, z, etc. There are
two types of random variable:
i- Discrete Random variable
ii- Continuous Random variable.
Discrete Random variable:
A random variable which can assume only a countable number of values is
called a discrete random variable or discontinuous variable.
For example, the number of students in a class, the number of colleges in a city.
Continuous Random Variable:
A random variable which can assume all possible values in the given interval is called
a continuous random variable. For example age, height, weight and speed etc.
Chapter No.8
Probability Distribution (Discrete Probability Distribution)
If in a table all possible values of a discrete random variable are given with
their probabilities, then this table is called a probability distribution. For example, if
we toss a fair coin and X denotes the number of heads, we get the following probability distribution.
X      0     1
P(x)   1/2   1/2
Distribution Function
The sum of the probabilities of all values less than or equal to a specified value x of X is called
the distribution function (d.f) or cumulative distribution function (c.d.f). It is denoted by
F(x) = P(X ≤ x)
Properties of Distribution Function
i- F(x) is non-negative, i.e. F(x) ≥ 0
ii- It lies between 0 and 1, i.e. 0 ≤ F(x) ≤ 1
iii- F(-∞) = 0 and F(+∞) = 1
iv- It is a non-decreasing function.
Example: Distribution function
X P(X) F(X)=P(X ≤ x)
1 0.1 0.1
2 0.3 0.4
3 0.4 0.8
4 0.2 1
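The F(X) column of this example is a running total of the P(X) column. The sketch below recomputes it; exact fractions are used only to avoid floating-point rounding, and the variable names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

xs = [1, 2, 3, 4]
probs = [Fraction("0.1"), Fraction("0.3"), Fraction("0.4"), Fraction("0.2")]

# F(x) = P(X <= x): cumulative sum of the probabilities
cdf = list(accumulate(probs))

for x, p, F in zip(xs, probs, cdf):
    print(x, float(p), float(F))
```

The last value of F is 1 because the probabilities of a distribution add up to one, and the values never decrease from one row to the next.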
Mathematical Expectation
The sum of the products of the values of a discrete random variable X with their probabilities is called
its expectation or expected value, i.e. E(X) = ∑XP(X).
E(X) is also called the mean of X and is denoted by µ.
Properties of Mathematical Expectation
i- If C is a constant then E(C)=C
ii- If X and Y are two random variables then
E(X+Y) =E(X) +E(Y) and E(X-Y) =E(X)-E(Y)
iii- If X is a random variable and ‘a’ & ‘b’ are constants then
E(aX+b) = aE(X) + b and E(aX-b) = aE(X) - b
iv- If X and Y are two independent random variables then
E(XY) = E(X) E(Y)
v- Expectation of deviations from the mean is zero, i.e. E[X-E(X)] = 0
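These properties can be checked on a small distribution. The sketch below reuses the values X = 1, 2, 3, 4 with probabilities 0.1, 0.3, 0.4, 0.2 from the distribution-function example; the function name expectation and the constants a = 2, b = 1 are illustrative assumptions.

```python
from fractions import Fraction

# Probability distribution: value -> probability
dist = {1: Fraction("0.1"), 2: Fraction("0.3"),
        3: Fraction("0.4"), 4: Fraction("0.2")}

def expectation(dist, g=lambda x: x):
    """E[g(X)] = sum of g(x) * P(x) over all values x."""
    return sum(g(x) * p for x, p in dist.items())

mean = expectation(dist)                          # E(X)
linear = expectation(dist, lambda x: 2 * x + 1)   # E(2X + 1)
deviation = expectation(dist, lambda x: x - mean) # E[X - E(X)]

print(mean, linear, deviation)  # 27/10 32/5 0
```

Here E(2X + 1) = 2(27/10) + 1 = 32/5, in agreement with property iii, and the expected deviation from the mean is zero, in agreement with property v.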
Written by:-
ABDUL WAHEED
Lecturer:
Chishtian Science Colleges, Chishtian
Cell # 0300-6982680, 0308-4991880
36. Given E(X)=0.55, var(X)=1.35 and Y= 2X+1. Find E(Y) and var(Y)?
Chapter No.9
Bernoulli Trial
A trial which has only two possible results and in which the probability of success is
constant is called a Bernoulli trial, e.g. tossing a coin.
Bernoulli Process
An experiment that results in one of two mutually exclusive outcomes on each
trial is called Bernoulli process.
Properties of Bernoulli Trial
i- The result of each trial can be classified as success or failure.
ii- The probability of success is constant for all trials.
iii- The repeated trials are independent.
iv- The experiment is performed a single time. i.e n=1
Binomial Experiment
An experiment having successive independent trials in which the outcome can
always be classified as success or failure and the probability of success remains
constant from trial to trial is called a binomial experiment.
Properties of Binomial Experiment
i- The result of each trial can be classified as success or failure.
ii- The probability of success ‘p’ remains constant for all trials.
iii- The repeated trials are independent.
iv- The experiment is repeated a fixed number of times say ‘n’.
Binomial Random Variable
The random variable which denotes the number of successes in a binomial experiment
is called a binomial random variable. It is a discrete variable and its range is zero to ‘n’,
i.e. x = 0, 1, 2, 3, …, n
Binomial Probability Distribution
The formula used to find the probabilities of a binomial random variable X is
called the binomial probability distribution or binomial probability mass function.
It is given by P(X = x) = nCx p^x q^(n-x), for x = 0, 1, 2, 3, …, n, where q = 1 - p.
The binomial probability distribution has two parameters, n and p, i.e. b(x; n, p), where
n = number of trials and p = probability of success in a single trial.
∗ In the binomial distribution, successive trials are with replacement.
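The binomial probabilities P(X = x) = nCx p^x q^(n-x) can be tabulated directly. The sketch below assumes a fair coin tossed four times (n = 4, p = 1/2) with X = number of heads; the function name binomial_pmf is illustrative.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x: int, n: int, p: Fraction) -> Fraction:
    """P(X = x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

n, p = 4, Fraction(1, 2)
table = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

for x, prob in table.items():
    print(x, prob)   # probabilities 1/16, 1/4, 3/8, 1/4, 1/16

mean = sum(x * prob for x, prob in table.items())
variance = sum(x**2 * prob for x, prob in table.items()) - mean**2
print(mean, variance)  # 2 1
```

The computed mean and variance agree with the standard results np = 2 and npq = 1 for this choice of n and p.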
Binomial Frequency Distribution
If the binomial probability distribution is multiplied by N (the number of experiments),
the resulting distribution is called the binomial frequency distribution.
N · nCx p^x q^(n-x), for x = 0, 1, 2, 3, …, n
Hypergeometric Experiment
An experiment in which successive trials are dependent, the outcome can
always be classified as success or failure, and the probability of success changes from
trial to trial is called a hypergeometric experiment.
Hypergeometric Random Variable
The random variable which denotes the number of successes in a
hypergeometric experiment is called a hypergeometric random variable. It is a discrete
variable and its range is zero to ‘n’, i.e. x = 0, 1, 2, 3, …, n.
Hypergeometric Probability Distribution:
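The formula used to find the probabilities of a hypergeometric random variable
X is called the hypergeometric probability distribution or hypergeometric probability
mass function. It is given by
P(X = x) = [kCx · (N-k)C(n-x)] / NCn, for x = 0, 1, 2, 3, …, n
where N = population size, n = sample size and k = number of successes in the population.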
i- Mean = nk/N
ii- Variance = n (k/N) ((N-k)/N) ((N-n)/(N-1))
iv-
v- The mean is always greater than the variance.
vi- Var(Hypergeometric) < Var(Binomial)
vii- If N → ∞, the hypergeometric probability distribution → binomial distribution
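A hypergeometric probability can be computed from the formula P(X = x) = [kCx · (N-k)C(n-x)] / NCn. The sketch below assumes, for illustration, ten cans of which five contain tomatoes, with five cans selected at random without replacement (N = 10, n = 5, k = 5); the function name hypergeom_pmf is illustrative.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x: int, N: int, n: int, k: int) -> Fraction:
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N that contains k successes."""
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))

N, n, k = 10, 5, 5
p3 = hypergeom_pmf(3, N, n, k)   # exactly 3 of the 5 selected cans hold tomatoes
print(p3)                        # 25/63

dist = {x: hypergeom_pmf(x, N, n, k) for x in range(n + 1)}
mean = sum(x * p for x, p in dist.items())
print(mean)                      # 5/2
```

The mean obtained from the full distribution equals nk/N = 5/2, the standard formula for the hypergeometric mean.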
Q# Short Questions(Numerical)
1. What is the sum of p and q
2. If n=10 & P=0.6 find Mean and variance
3. Find Mean, standard deviation and variance of (q+p)^6
4. Identify the parameters of binomial distribution b(x; n, p)
5. If Mean=10 , P=0.5 find n.
6. If Mean=38 , q=0.83, find n & p
7. If p=q , n= 10. find Mean and variance
8. In binomial distribution mean=2.4 and S.D=1.2 find n?
9. In a binomial distribution n=7 and P= 0.7 Find coefficient of variation
10. Discuss the statement that binomial distribution has mean=4 and S.D=3
11. Is it possible to have binomial distribution with mean=3 and S.D=9
12. Is it possible to have binomial distribution with mean=4 and variance=2
13. Find binomial distribution whose mean is 12 and standard deviation is 3.
14. Sketch the curve of binomial distribution if P=1/2, P>1/2 & P<1/2
15. Find the parameters of binomial distribution if mean is 3 and S.D is 1.5
16. Is it possible to have binomial distribution (i) P(X=3.8)=0.16 (ii) P(X>1)=1.27
17. If Mean=10, n=20 find p and q.
18. If X ~ b(10, 0.6) then find mean and variance of (i) Y=X+10 (ii) Y=2X-5
19. If in a binomial distribution n=5 then find P(X≥0), P(X≤5) & P(X>5)
20. If in a binomial distribution P=1/2, then find m3
21. If n=7 and q=0.6, then find Mean and variance of binomial distribution
22. Identify the parameters of hypergeometric distribution h(x; N, n, k)
23. If N=15, n=7, k=4 then find Mean and S.D of hypergeometric distribution
24. If N=10, n=5, P=0.4 then find Mean and variance of hypergeometric distribution
25. If N=8, n=2 and k=3 find P(X=0)
26. If N=5, n=3 and k=2 find probability distribution of X
27. Given N= 10 , n= 4 and k= 5 find E(X)
28. Given N= 10 , n= 5 and k= 3 find the mean of hypergeometric distribution
29. If N=6, n=4 and k=3 find P(X=1)
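Several of the binomial questions above ask whether a given mean and standard deviation are possible. One way to reason about them uses the standard results mean = np and variance = npq, so that q = variance/mean, p = 1 - q and n = mean/p. The sketch below applies that reasoning; the function name binomial_params is illustrative, and it treats a case as impossible when the variance is not smaller than the mean or when n is not a whole number.

```python
from fractions import Fraction

def binomial_params(mean, variance):
    """Return (n, p) if the mean and variance fit a binomial distribution,
    otherwise None. Uses mean = np and variance = npq."""
    mean, variance = Fraction(mean), Fraction(variance)
    # For a binomial with 0 < p < 1, variance = mean * q < mean
    if mean <= 0 or variance <= 0 or variance >= mean:
        return None
    q = variance / mean
    p = 1 - q
    n = mean / p
    if n.denominator != 1:   # n must be a whole number of trials
        return None
    return int(n), p

print(binomial_params(4, 9))           # mean 4, S.D 3   -> None
print(binomial_params(12, 9))          # mean 12, S.D 3  -> (48, Fraction(1, 4))
print(binomial_params(4, 2))           # mean 4, var 2   -> (8, Fraction(1, 2))
print(binomial_params("2.4", "1.44"))  # mean 2.4, S.D 1.2 -> (6, Fraction(2, 5))
```

Note that the function takes the variance, so a standard deviation must be squared before it is passed in.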
Statistics Ist – Annual - 2023 Session (2020-22) to (2022-2)
[C]
(Objective) Inter (Part –I) Time: 20 Minutes Marks : 17
Note : Four possible choices A , B , C ,D to each question are given. Which choice is
correct fill that circle in front of that Question No. Use Marker or Pen to fill the circles.
Cutting or filling two or more circles will result in Zero Mark in that Question.
Q.No.1
(1) ∑(Y - Ȳ)² is …………:
(A) 0 (B) LEAST (C) 1 (D) VARIANCE
(2) In a table, the part containing row captions is called:
(A) Box head (B) Title (C) Stub (D) Body
(3) Questionnaire is a ………... source:
(A) Primary (B) Secondary (C) Original (D) Local
(4) If D = X - 15 and ∑D = 20 for 10 observations, then X̄ is:
(A) 2 (B) 5 (C) -13 (D) 17
(5) Standard Deviation of a set of data is 4, then its variance is:
(A) 16 (B) 04 (C) 02 (D) -4
(6) If β2 = 3 then the Distribution is called:
(A) Leptokurtic (B) Platykurtic (C) Mesokurtic (D) Symmetrical
(7) Most Central Value of an Arrayed Set of Data is called:
(A) Mode (B) Median (C) A.M (D) G.M
(8) First moment about Origin is:
(A) 0 (B) 1 (C) Variance (D) MEAN
(9) For two independent events A and B, P(A ∩ B) =:
(A) 0 (B) P(A) P(B/A) (C) P(B) P(A/B) (D) P(A) P(B)
(10) Fisher’s Ideal Index No. is the ………. of Laspeyres’ and Paasche’s Index Numbers.
(A) G.M (B) A.M (C) H.M (D) Median
(11) Index Number for Base Year is:
(A) 0 (B) 50 (C) 100 (D) Not possible
(12) When two dice are rolled, elements in Sample Space are:
(A) 6 (B) 12 (C) 36 (D) 16
(13) Variance of Hypergeometric Distribution:
(A) ( ) (B) npq (C) (D)
(14) In Hypergeometric Distribution successive trials are:
(A) Independent (B) Dependent (C) Not Associated (D) Continuous
(15) Which is not possible in a probability Distribution:
(A) p(x) =0.5 (B) p(x) = (C) p(x) = 0.05 (D) p(x) =
(16) Binomial Distribution is:
(A) Continuous (B) Qualitative (C) Symmetrical (D) Discrete
(17) E[x - E(x)]² = 49, then S.D(x) = ------------
(A) 49 (B) 07 (C) 13 (D) 36
MCQ’s # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
Answer B C A D A C B D D A C C A B D D B
Statistics Roll no. Ist – Annual - 2023 Session (2020-22) to (2022-24)
(Subjective) Inter (Part –I) Time: 2 : 40 Hours Marks : 68
Note: It is compulsory to attempt any (8-8) Parts each from Q.No.2 and Q.No.3 and
any (6) Parts from Q.No.4. Attempt any (3) Questions from Part – ii.
Write the same Question No. and its Part No. as given in the Question Paper.
Part - i
Q.No.2
(i) Describe any two limitations of “Statistics”. (ii) Define Statistics in Singular Form.
(xi) What are Equally likely Events? (xii) What is Conditional Probability?
Q.No.4
(i) Define Bernoulli Trial. (ii) How Random Numbers are generated?
Part - ii (22 x 2= 44)
Q.No.7 (a) From the following Prices, Construct Chain Indices using Geometric Mean as
an Average: (04)
Years   Sugar   Wheat   Rice
2010    50      39      62
2011    53      41      65
2012    57      42      68
2013    70      48      76
P(X) (04)
Q.No.9 (a) A Fair coin is tossed four times. Find the Probability Distribution of Number of
Heads (04)
(b) Ten vegetable cans, all of the same size, have lost their labels. It is known that
5 contain tomatoes and 5 contain corn. If 5 cans are selected at random, what is
the probability that 3 contain tomatoes? (04)
Statistics L.K.NO. 1542
Paper I ( Subjective type ) Inter ( 1st - A – Exam – 2024
Time : 2:40 Hours Inter (Part-1)
Marks : 68 Session (2022-24) &
(2023 – 25)
(ix) If N=10 ,n= 5,k=3 find Mean of the Hyper geometric Distribution.
Part-II (3 x 8 = 24)
Q.NO.5 (a) Find the Geometric Mean for following data:
Age(years) 11-20 21-30 31-40 41-50 51-60 (04)
f 6 7 9 6 4
(b) The Average Wage of 4 men is Rs 17/- per hour. What is the Average Wage (04)
of further 6 Men if the Average Wage of all 10 men is RS 20/-?
Q.NO.6 (a) Find coefficient of Quartile deviation from the following table: (04)
Weight (grams)   160-170 170-180 180-190 190-200 200-210 210-220
No. of apples    7       13      30      42      35      23
(b) Given that ∑f=120,∑fx=296, Mode=2.944 and Second Moment about Mean is (04)
1.4884. calculate coefficient of skewness.
(b) A pair of dice is rolled. Let “A” denote the event that the sum shown is “6”
and “B” be the event that the two dice show the same number. Find (i) P(A/B) (ii) P(B/A) (04)
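A conditional probability of this kind can be checked by listing the 36 equally likely outcomes of two dice and using P(E/F) = n(E ∩ F) / n(F). The sketch below is one way to do the enumeration; the helper name cond is illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space for a pair of dice: 36 ordered pairs
S = list(product(range(1, 7), repeat=2))

A = {s for s in S if sum(s) == 6}     # sum of the two dice is 6
B = {s for s in S if s[0] == s[1]}    # both dice show the same number

def cond(E: set, F: set) -> Fraction:
    """P(E/F) = n(E ∩ F) / n(F) when all outcomes are equally likely."""
    return Fraction(len(E & F), len(F))

p_A_given_B = cond(A, B)
p_B_given_A = cond(B, A)
print(p_A_given_B, p_B_given_A)  # 1/6 1/5
```

Only the outcome (3, 3) belongs to both events, so it is divided by the 6 doubles in one case and by the 5 outcomes with sum 6 in the other.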
Q.No.8 (a) The Probability distribution of Discrete Random Variable ‘x’ is given by (04)
f(x) = )( ( , x=0,1,2,3 Find E(x) and E( )
(b) Three balls are drawn from a bag containing 5 white and 3 black balls. If ‘x’ (04)
denotes the number of white balls, then find the Probability Distribution of
‘x’ and find its Mean.
Statistics [A] L.K.NO. 1542 Paper Code No. 6181
Paper I ( Objective type ) Inter ( 1st - A – Exam – 2024
Time : 20 Minutes Inter (Part-1)
Marks : 17 Session (2022-24) & (2023 – 25)
Note : Four possible choices A , B , C ,D to each question are given. Which choice is correct
fill that circle in front of that Question No. Use Marker or Pen to fill the circles. Cutting or
filling two or more circles will result in Zero Mark in that Question.
Q.NO.1
(1) A quantity calculated from a population is called:
(A) Frequency (B) Statistic (C) Parameter (D) Sample
(2) Measurements usually provide:
(A) Qualitative Data (B) Discrete Data (C) Primary Data (D) Continuous Data
(3) Cumulative Frequency curve is also called:
(A) Histogram (B) Frequency curve (C) Ogive (D) Historigram
(4) In a statistical table, the column captions are called:
(A) Box head (B) Stub (C) Body (D) Title
(5) The values of the data lying between Q1 and Q3 are:
(A) 50% (B) 25% (C) 75% (D) 100%
(6) Var(2x+3) is:
(A) 5 var(x) (B) 4 var(x) (C) 4var(x)+3 (D) 4var(x)+9
(7) The sum of squares of deviations is least from:
(A) Median (B) Mean (C) Mode (D) G.M
(8) Mean Deviation is least, if deviations are calculated from:
(A) Median (B) Mode (C) Mean (D) G.M
(9) In fixed Base Method, the base period should be:
(A) Abnormal (B) Middle (C) Normal (D) Far Distant
(10) Simple Index Number involves Commodities:
(A) 2 (B) 3 (C) 4 (D) 1
(11) A coin and a die can be thrown together in:
(A) 12 ways (B) 6 ways (C) 2 ways (D) 36 ways
(12) Probability of drawing a card of Ace is:
(A) (B) (C) (D)
(13) E(x²) = 29 and E(x) = 4 then Var(x) =:
(A) 25 (B) 5 (C) 13 (D) 33
(14) A discrete probability distribution may be presented by:
(A) Table (B) Mathematical Equation (C) Diagram (D) All of these
(15) In Binomial Distribution n=10, p=0.5 then Mean is:
(A) 0.5 (B) 5 (C) 10 (D) 2.5
(16) The number of parameters of Hypergeometric Distribution is:
(A) 3 (B) 2 (C) 1 (D) 4
(17) The sum of p and q is always:
(A) 0 (B) 2 (C) 1 (D) 4
MCQ’s # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
Answer C D C A A B B A C D A B C D B A C
GOOD LUCK