Basic Statistics

This document provides an introduction to basic statistics. It defines statistics as both the collection of numerical data (plural sense) and the methods used to analyze data (singular sense). There are two main types of statistics: descriptive statistics, which involves summarizing and presenting data; and inferential statistics, which involves making generalizations from samples to populations. The stages of a statistical investigation are outlined as data collection, organization, presentation, analysis, and interpretation. Variables are defined as characteristics that can assume different values and are classified as qualitative or quantitative, with quantitative variables further divided into discrete and continuous.

Uploaded by

አንተነህ የእናቱ

Chapter One: Introduction

1.1. Definition and Classification of Statistics
Statistics is the art of learning from data. It is concerned with the collection of data, their subsequent
description, and their analysis, which often leads to the drawing of conclusions. It is the scientific study of
numerical data based on variation in nature (Sokal and Rohlf). Statistics can be defined in two senses:
plural (as Statistical Data) and singular (as Statistical Methods).
a. Plural sense:
Statistics are collections of facts (figures). This meaning of the word is widely used when reference is made to facts
and figures on sales, employment or unemployment, accidents, weather, deaths, education, etc.
Examples: Sales Statistics, Labor Statistics, Employment Statistics, etc. In this sense the word Statistics serves
simply as data, but not all numerical data are statistics.
b. Singular sense:
Statistics is the science that deals with the methods of data collection, organization, presentation, analysis and
interpretation of data. It refers to the subject area that is concerned with extracting relevant information from available
data with the aim to make sound decisions. According to this meaning, statistics is concerned with the development
and application of methods and techniques for collecting, organizing, presenting, analyzing and interpreting
statistical data.

1.2. Stages in Statistical Investigation

According to the singular sense definition of statistics, a statistical investigation involves five stages: data
collection, organization, presentation, analysis and interpretation of results.
1. Collection of Data: Data collection is the first stage in any statistical investigation. It involves the
process of obtaining (gathering) a set of related measurements or counts to meet predetermined
objectives. Data may be available from existing published sources which may have already been
organized in some presentable form. Such information is commonly referred to as secondary data. On
the other hand, the investigator may actually collect his or her own data. This is usually necessary
when information about some area of inquiry has not been ascertained. In such cases, the data are said
to be of primary form.
2. Organization of Data: It is usually not possible to derive any conclusion about the main features of
the data from direct inspection of the observations. The properties of the data must first be described in
summary form. Editing is the first step in the organization of data since there may be inconsistencies,
ambiguity, irrelevant answers and recording errors. Once the data are edited, the second step is
classification, that is, arranging the collected data according to some common characteristics. Such
classified data can more easily be presented. The last step of the organization of data is presenting the
classified data in tabular form, using rows and columns (tabulation).
3. Presentation of Data: The purpose of data presentation is to have an overview of what the data
actually look like, and to facilitate statistical analysis. Data presentation can be done using diagrams
and graphs, which are easy to remember and facilitate comparison.
4. Analysis of Data: The analysis of data is the extraction of summarized and comprehensive numerical
description in order to reach conclusions or provide answers to a problem. That is, the basic purpose of
data analysis is to make the data useful for drawing conclusions. This analysis may require anything from
simple to sophisticated mathematical techniques.
5. Interpretation of Results: This is the last stage of statistical investigation. Once the data has been
analyzed, some numerical value(s) can be obtained. The main job consists of attaching physical
meaning or interpretation to these numerical results. The interpretation must be faithful to the results. No
preconceived ideas should be imposed on the numerical results obtained from the analysis of the
data. Also, no attempt should be made to draw more conclusions than the results actually support.

Classifications of Statistics
Data can be used in different ways. The body of knowledge called statistics is sometimes divided into two
main areas, depending on how data are used. The two areas are
1. Descriptive statistics
2. Inferential statistics
Descriptive statistics: consists of the collection, organization, summarization, and presentation of data. In
descriptive statistics the statistician tries to describe a situation. Descriptive statistics are numbers that are
used to summarize and describe data. The word “data” refers to the information that has been collected
from an experiment, a survey, an historical record, etc. Descriptive statistics are just descriptive. They do
not involve generalizing beyond the data at hand. Generalizing from our data to another set of cases is the
business of inferential statistics.
The part of statistics concerned with the description and summarization of data is called descriptive
statistics.
Inferential Statistics: The second area of statistics is called inferential statistics. Inferential statistics
consists of generalizing from samples to populations, performing estimations and hypothesis tests,
determining relationships among variables, forecasting and making predictions.
Here, the statistician tries to make inferences from samples to populations. Inferential statistics uses
probability, i.e., the chance of an event occurring. You may be familiar with the concepts of probability
through various forms of gambling. Statements in inferential statistics often use words such as will, would, may,
might, can, could, at least, and chance of, because they usually concern uncertain or future conditions.
Inferential statistics can be defined as the science of using probability to make decisions.
[Diagram: The nature of this discipline. Descriptive Statistics, Probability, Inferential Statistics]
The part of statistics concerned with the drawing of conclusions from data is called inferential statistics.
Examples: Classify the following statements as descriptive or inferential statistics.
1. The average age of the students in this class is 21 years.
2. There is a strong association between smoking and lung cancer.
3. The price of wheat will increase by 5% in the coming year.
4. Of the students enrolled in Oda Bultum University this year, 74% are male and 26% are female.
1.3. Some Basic Terminologies in Statistics
Data: Figures or facts from which conclusions can be drawn. Data are the values (measurements or observations)
that the variables can assume.
Variable: It is any characteristic of an object that can be represented as a number. It is a characteristic or attribute
that can assume different values.
Population: It is the totality of all elements under study (such as objects, items, people, etc). It consists of all
subjects (human or otherwise) that are being studied.
Sample: It is a portion or part of the population taken so that some generalization about the population can be made.
It is the subset of the population which is assumed to be the representative of the population.
Sampling: The process or method of sample selection from the population.
Sample size: The number of elements or observations to be included in the sample.
Census: Complete enumeration or observation of the elements of the population, that is, the collection of data from
every element in a population.
Parameter: It is a numerical characteristic of an entire population (usually denoted by Greek letters).
Statistic: It is a numerical characteristic of a sample (usually denoted by Latin letters).
1.4. Types of Variables
A variable is a characteristic or attribute that can assume different values. Variables can be classified as qualitative or
quantitative.
Qualitative variables: are variables that can be placed into distinct categories, according to some characteristic
or attribute. For example, if subjects are classified according to gender (male or female), then the variable gender is
qualitative. Other examples of qualitative variables are religious preference and geographic locations.
Quantitative variables: are numerical and can be ordered or ranked. For example, the variable age is numerical, and
people can be ranked in order according to the value of their ages. Other examples of quantitative variables are
heights, weights, and body temperatures. A quantitative variable is determined when the description of the
characteristic of interest results in a numerical value. When a measurement is required to describe the characteristic
of interest or it is necessary to perform a count to describe the characteristic, a quantitative variable is defined.
Quantitative variables can be further classified into two groups: discrete and continuous.
Discrete variables can be assigned values such as 0, 1, 2, 3 and are said to be countable. Variables such as number
of children in a household are called discrete variables since the possible scores are discrete points on the scale. For
example, a household could have three children or six children, but not 4.53 children. Other examples of discrete
variables are the number of children in a family, the number of students in a classroom, and the number of calls
received by a switchboard operator each day for a month. Discrete variables assume values that can be counted.
Continuous variables, by comparison, can assume an infinite number of values in an interval between any two
specific values. Temperature, for example, is a continuous variable, since the variable can assume an infinite
number of values between any two given temperatures.
Continuous variables can assume an infinite number of values between any two specific values. They are obtained
by measuring. They often include fractions and decimals.
1.5. Measurement Scales
In addition to being classified as qualitative or quantitative, variables can be classified by how they are categorized,
counted, or measured. This type of classification uses measurement scales, and four common types of scales are
used: nominal, ordinal, interval, and ratio.
Nominal: The first level of measurement is called the nominal level of measurement. A sample of college
instructors classified according to subject taught (e.g., English, history, psychology, or mathematics) is an example
of nominal-level measurement. Classifying survey subjects as male or female is another example of nominal-level
measurement. No ranking or order can be placed on the data. Other examples of nominal-level data are political
party (Democratic, Republican, Independent, etc) and marital status (single, married, divorced, widowed,
separated). The nominal level of measurement classifies data into mutually exclusive (no overlapping) categories in
which no order or ranking can be imposed on the data.
Ordinal: The next level of measurement is called the ordinal level. Data measured at this level can be placed into
categories, and these categories can be ordered, or ranked. For example, from student evaluations, guest speakers
might be ranked as superior, average, or poor. For instance, when people are classified according to their build
(small, medium, or large), a large variation exists among the individuals in each class. Other examples of ordinal
data are letter grades (A, B, C, D, and F). The ordinal level of measurement classifies data into categories that can
be ranked; however, precise differences between the ranks do not exist.
Interval: The third level of measurement is called the interval level. This level differs from the ordinal level in that
precise differences do exist between units. For example, many standardized psychological tests yield values
measured on an interval scale. IQ is an example of such a variable. There is a meaningful difference of 1 point
between an IQ of 109 and an IQ of 110. Temperature is another example of interval measurement, since there is a
meaningful difference of 1°F between each unit, such as 72°F and 73°F. One property is lacking in the interval scale:
there is no true zero. For example, IQ tests do not measure people who have no intelligence. For temperature, 0°F
does not mean there is no heat at all. The interval level of measurement ranks data, and precise differences
between units of measure do exist; however, there is no absolute zero.
Ratio: The final level of measurement is called the ratio level. Examples of ratio scales are those used to measure
height, weight, area, and number of phone calls received. Ratio scales have differences between units (1 inch, 1
pound, etc.) and an absolute zero. In addition, the ratio scale contains a true ratio between values. For example, if
one farmer gets a yield of 200 tons and another gets 100 tons, then the ratio between them is 2 to 1. Put
another way, the first farmer gets twice as much as the second farmer. The ratio level of measurement possesses
all the characteristics of interval measurement, and there exists an absolute zero. In addition, true ratios exist when
the same variable is measured on two different members of the population.

1.6. Application, uses and limitation of statistics

Application areas of statistics
There is hardly any walk of life which has not been affected by statistics, ranging from a simple
household to big business and the government. Hence, in modern times, statistical information plays
a very important role in a wide range of fields. Some of the areas where the knowledge of statistics is
usually applied are as follows:
• In research work
• In cost and budget management
• In engineering and the physical sciences
• In economics and the biological sciences
• In the social sciences and politics
• In industry, especially in quality control
Uses of statistics
Statistics is used in almost all fields of human activity, and it serves government bodies, private business firms and
research agencies as a major tool. In brief, statistics condenses and summarizes complex data, helps in formulating
and testing hypotheses and developing new theories, and helps to predict future trends. Some of the uses are:
• Reduction and summarization of data: Statistics condenses and summarizes a large mass of data and
presents facts in a few presentable, understandable and precise numerical figures. The raw data, as
usually available, are voluminous and haphazard. It is generally not possible to draw any conclusions from the
raw data as collected. Hence it is necessary and desirable to express these data in a few numerical values.
• Facilitating comparison of data: Arrangement of data with respect to different characteristics facilitates
comparison. Statistical devices such as averages, percentages, ratios, etc. are used for this purpose.
• Determining functional relationships between two or more phenomena: Statistical techniques
such as correlation analysis assist in establishing the degree of association between two or more variables.
• Formulation and testing of hypotheses: For instance, hypotheses such as whether a new medicine is effective
in curing a disease, or whether there is an association between variables, can be tested using statistical tools.
• Prediction: Statistical methods are highly useful tools for analyzing past data and predicting future
trends.
Exercises:
1) Identify the following as nominal level, ordinal level, interval level, or ratio level data.
1. Flavors of frozen yogurt ________________
2. Amount of money in savings accounts________________
3. Students classified by their reading ability: Above average, Below average, Normal _____________
4. Letter grades on an English essay ________________
5. Religions ________________
6. Commuting times to work ____________
7. Ages (in years) of art students ________________
8. Ice cream flavor preference ________________
9. Years of important historical events ________________
10. Instructors classified as: Easy, Difficult or Impossible ________________
2) Identify whether the statement describes inferential statistics or descriptive statistics:
a) The average age of the students in a statistics class is 21 years.
b) The chances of winning the California Lottery are one chance in twenty-two million.
c) There is a relationship between smoking cigarettes and getting emphysema.
d) From past figures, it is predicted that 39% of the registered voters in California will vote in the June
primary.

II. Multiple Choice-Variable Types

1- Number of bicycles sold in one year by a large sporting goods store is an example of what type of data?
A) Qualitative B) Quantitative
2- Colors of baseball caps in a store is an example of what type of data?
A) Qualitative B) Quantitative
3- Time to cut a lawn is an example of what type of data?
A) Qualitative B) Quantitative
4- Number of doughnuts sold each day by a doughnut shop is an example of what type of data?
A) Discrete B) Continuous C) Ordinal D) Nominal
5- Water temperatures of six swimming pools in Jeddah on a given day are an example of what type of data?

Chapter Two: Methods of Data Collection and Presentation

Data Types
Data are observations of random variables made on the elements of a population or sample.
• Data are the quantities, numbers or qualities (attributes) measured or observed that are to be collected and/or
analyzed.
• The word data is plural; datum is singular.
• A collection of data is often called a data set.
Based on the source, data can be classified into two: Primary Data and Secondary Data.
• Primary data are data collected for the first time, either through direct observation or by questioning individuals.
The term refers to data collected either by or under the direct supervision and instruction of the researcher.
• Secondary data are data obtained from published or unpublished sources such as newspapers, journals, official
records, etc. Secondary data should be used with the utmost care. Before using such data, the following three points
should be considered:
  • Whether the data are suitable for the purpose of investigation. This can be judged in the light of the nature and
  scope of the investigation.
  • Whether the data are adequate for the purpose of investigation, once they have been found suitable.
  • Whether the data are reliable. The data obtained should be checked for accuracy.
Based on the role of time, data can be classified as Cross-sectional and Time series.
• Cross-sectional data: a set of observations taken at a point in time.
• Time series data: a set of observations collected over a sequence of time, usually at equal intervals.

Methods of Data Collection

The first and foremost task in statistical investigation is data collection. Before data collection, four important points
should be considered. These are the purpose of data collection (why we need to collect data), the data to be
collected (what kind of data to collect), the source of data (where we can get the data) and the methods of data
collection (how we can collect the data). These points are called the why, what, where and how of data
collection.
Primary data are collected from primary sources and secondary data from secondary sources. Primary data can be
collected through experimental methods in laboratory and through survey method. The survey methods of data
collection are personal interview, telephone interview, mailed questionnaire and personal observation.

Telephone surveys have an advantage over personal interview surveys in that they are less costly. Also, people may
be more candid in their opinions since there is no face-to-face contact. A major drawback to the telephone survey is
that some people in the population will not have phones or will not answer when the calls are made; hence, not all
people have a chance of being surveyed. Also, many people now have unlisted numbers and cell phones, so they
cannot be surveyed.
Mailed questionnaire surveys can be used to cover a wider geographic area than telephone surveys or personal
interviews since mailed questionnaire surveys are less expensive to conduct. Also, respondents can remain
anonymous if they desire. Disadvantages of mailed questionnaire surveys include a low number of responses and
inappropriate answers to questions. Another drawback is that some people may have difficulty reading or
understanding the questions.
Personal interview surveys have the advantage of obtaining in-depth responses to questions from the person being
interviewed. One disadvantage is that interviewers must be trained in asking questions and recording responses,
which makes the personal interview survey more costly than the other two survey methods. Another disadvantage is
that the interviewer may be biased in his or her selection of respondents.
Data can also be collected in other ways, such as surveying records or direct observation of situations.

Types of Surveys
In general there are two methods of data collection: Census Survey and Sample Survey Method.

• Census Survey (complete enumeration): a study that covers all the elements in the population under
consideration. In this method we resort to a 100% inspection of the population, and each and every unit of the
population is enumerated. It makes it possible to obtain information about each and every element in the population.
• Sample Survey: a survey in which some elements that are representative of the population (a sample)
are taken in order to infer about the whole population. It is a statistical process in which we select and examine a
sample instead of considering the whole population.
The sampling method has many advantages over the census method.
1. Sampling reduces the cost of data collection.
2. Greater speed, i.e., it enables us to obtain results on time.
3. Greater accuracy. It helps us to get data of good quality: as the number of enumerators decreases, we can
train and supervise them well in the process of data collection.
4. Greater scope (under circumstances where human and material resources are limited).
5. A census may be destructive. Samples reduce the damage caused by some tests in quality control. For
example, when cooking, mothers check whether the food has enough salt, spices, butter and so on by
taking a small amount and tasting it. What would happen if the test used up everything in the dish?
6. Complete enumeration may be impossible or impractical (for example, when the population is infinite), in
which case sampling is the only way.

Frequency Distributions
To describe situations, draw conclusions, or make inferences about events, the researcher must organize
the data in some meaningful way. The most convenient method of organizing data is to construct a
frequency distribution.
A frequency distribution is the organization of raw data in table form, using classes and frequencies.
Definition of some terms
Class: is a description of a group of similar numbers in a data set.
Frequency: is the number of times a variable value is repeated.
Class frequency: the number of observations belonging to a certain class.
Types of Frequency Distribution: - There are three types of frequency distributions; categorical, ungrouped
(discrete or frequency array) and grouped (continuous) frequency distributions.
Categorical Frequency Distribution: - a Frequency Distribution in which the data is qualitative i.e. either nominal
or ordinal. Each category of the variable represents a single class and the number of times each category repeats
represents the frequency of that class (category).
Example: The blood types of 25 students are given below.
A B B AB O A
O O B AB B A B
B B O A O AB
A O O O AB O

Class (Blood type)    Frequency (number of students)
A                     5
B                     7
AB                    4
O                     9
Total                 25
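
The tally in the table above can be checked with a short script. This is only a minimal sketch: the list `blood_types` is typed in from the 25 observations listed in the example, and the use of Python's `collections.Counter` is an illustrative choice, not part of the original notes.

```python
from collections import Counter

# The 25 observations from the blood type example, in the order listed.
blood_types = [
    "A", "B", "B", "AB", "O", "A",
    "O", "O", "B", "AB", "B", "A", "B",
    "B", "B", "O", "A", "O", "AB",
    "A", "O", "O", "O", "AB", "O",
]

# Each category is one class; its count is the class frequency.
frequency = Counter(blood_types)

for blood_type in ["A", "B", "AB", "O"]:
    print(f"{blood_type:<5} {frequency[blood_type]}")
print(f"Total {sum(frequency.values())}")
```

The same approach works for any qualitative variable, since every distinct category simply becomes its own class.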

Example 2: Construct a frequency distribution for the following letter grades of 25 students.

A B C C C
C B B A D
A C C A B
F C C A B
Ungrouped FD (Frequency Array):- A FD of numerical data (quantitative) in which each value of a variable
represents a single class (i.e. the values of the variable are not grouped) and the number of times each value repeats
represents the frequency of that class.
Example: Number of children for 21 families.
2 3 5 4 3 3 2
3 1 0 4 3 2 2
1 1 1 4 2 2 2

Class (Number of children)    Frequency (Number of families)
0                             1
1                             4
2                             7
3                             5
4                             3
5                             1
Total                         21

Grouped (Continuous) Frequency Distribution: - A Frequency Distribution of numerical data in which several
values of a variable are grouped into one class. The number of observations belonging to the class is the frequency
of the class.
Example: Consider the following age groups and number of persons.

Class Limits (Age in years)    Class Boundaries (Age in years)    Frequency (number of persons)
1-25                           0.5-25.5                           20
26-50                          25.5-50.5                          18
51-75                          50.5-75.5                          20
76-100                         75.5-100.5                         10
101-125                        100.5-125.5                        2
Total                                                             70

Class Limits:-The lowest and highest values that can be included in a class are called Class Limits. The lowest
values are called Lower Class Limits and the highest values are called Upper Class Limits.
Class limits for the first class: 1-25
Lower class limit (LCL): 1
Upper class limit (UCL): 25
Class Boundaries: the class limits adjusted so that there is no gap between the UCL of one class and the LCL of the
next class. The lowest values are called Lower Class Boundaries (LCB) and the highest values are called Upper Class
Boundaries (UCB).
Class boundaries for the first class: 0.5-25.5
Lower class boundary: 0.5
Upper class boundary: 25.5
Class Width (Class Size): the difference between the UCB and the LCB of a class. It is also the difference between the
lower limits of two consecutive classes, or the difference between the upper limits of two consecutive classes.
$W = UCB - LCB$ or $W = LCL_i - LCL_{i-1}$ or $W = UCL_i - UCL_{i-1}$
For the above example, $W = 25.5 - 0.5 = 25$ or $W = 26 - 1 = 25$ or $W = 50 - 25 = 25$.
Class Mark (Class Midpoint): the halfway point between the class limits or the class boundaries.
$CM = \frac{LCL + UCL}{2}$ or $CM = \frac{LCB + UCB}{2}$
Note that $W = CM_i - CM_{i-1}$.
Class Limits    Class Boundaries    Class Mark    Frequency
1-25            0.5-25.5            13            20
26-50           25.5-50.5           38            18
51-75           50.5-75.5           63            20
76-100          75.5-100.5          88            10
101-125         100.5-125.5         113           2
Total                                             70
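
As a rough check of the definitions above, the boundaries, class marks and class width can be derived from the class limits alone. The sketch below assumes a unit of measurement `U = 1`, which is inferred from the 0.5 offsets in the table rather than stated in the notes; the variable names are illustrative.

```python
# Class limits (LCL, UCL) from the age example.
limits = [(1, 25), (26, 50), (51, 75), (76, 100), (101, 125)]

# Assumption: ages are recorded in whole years, so the unit of measurement is 1.
U = 1

# Boundaries close the gap between consecutive classes by half a unit on each side.
boundaries = [(lcl - U / 2, ucl + U / 2) for lcl, ucl in limits]

# The class mark is the midpoint of the class limits.
class_marks = [(lcl + ucl) / 2 for lcl, ucl in limits]

# Class width from the boundaries of the first class: W = UCB - LCB.
width = boundaries[0][1] - boundaries[0][0]

for (lcl, ucl), (lcb, ucb), cm in zip(limits, boundaries, class_marks):
    print(f"{lcl}-{ucl}  {lcb}-{ucb}  {cm}")
print("W =", width)
```

Taking the difference between consecutive entries of `class_marks` gives the same width, which matches the note that W equals the difference between consecutive class marks.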

Relative frequency: - is the ratio of class frequency to the total frequency (total number of observations).
Percentage frequency: Relative frequency × 100

Class Limits    Class Boundaries    Class Mark    Frequency    Relative frequency    Percentage frequency
1-25            0.5-25.5            13            20           20/70 ≈ 0.2857        28.57
26-50           25.5-50.5           38            18           18/70 ≈ 0.2571        25.71
51-75           50.5-75.5           63            20           20/70 ≈ 0.2857        28.57
76-100          75.5-100.5          88            10           10/70 ≈ 0.1429        14.29
101-125         100.5-125.5         113           2            2/70 ≈ 0.0286         2.86
Total                                             70           70/70 = 1             100
Cumulative frequency: is the sum of frequencies below or above a certain value.
Less than Cumulative Frequency: is the total number of values of a variable below a certain UCB.
More than Cumulative Frequency: - is the total number of values of a variable above a certain LCB.
Class Limits    Class Boundaries    Class Mark    Frequency    Less than Cum. Freq.    More than Cum. Freq.
1-25            0.5-25.5            13            20           20                      2+10+20+18+20=70
26-50           25.5-50.5           38            18           20+18=38                2+10+20+18=50
51-75           50.5-75.5           63            20           20+18+20=58             2+10+20=32
76-100          75.5-100.5          88            10           20+18+20+10=68          2+10=12
101-125         100.5-125.5         113           2            20+18+20+10+2=70        2
Total                                             70
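
The relative, percentage and cumulative frequencies follow mechanically from the class frequencies. The following is a small sketch of those calculations for the age example; the variable names are illustrative, and `itertools.accumulate` is used here simply as a convenient way to form running totals.

```python
from itertools import accumulate

# Class frequencies from the age example, in class order.
freqs = [20, 18, 20, 10, 2]
total = sum(freqs)

# Relative frequency = class frequency / total frequency.
relative = [f / total for f in freqs]

# Percentage frequency = relative frequency x 100.
percentage = [100 * r for r in relative]

# "Less than" cumulative frequency: running total from the first class down.
less_than = list(accumulate(freqs))

# "More than" cumulative frequency: running total from the last class up.
more_than = list(accumulate(reversed(freqs)))[::-1]

for f, r, p, lcf, mcf in zip(freqs, relative, percentage, less_than, more_than):
    print(f"{f:>3}  {r:.4f}  {p:6.2f}  {lcf:>3}  {mcf:>3}")
```

The last "less than" value and the first "more than" value both equal the total frequency, which is a quick consistency check on any such table.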
Construction of Grouped Frequency Distribution
1. Arrange the data in an array form (increasing or decreasing order).
2. Find the Unit of Measurement (U).
U is the smallest difference between any two distinct values of the data.
3. Find the Range(R)R is the maximum numerical difference in the data set, i.e. the difference between the
largest and the smallest values of the variable.
4. Determine the number of classes (K) using Sturges Rule. ==> K=1+3.322logN where N is the total number
of observations.
5. Specify the class width(W) W= R
K
6. Put the smallest value of the data set as the LCL of the first class. To obtain the LCL of the second class add
the class width W to the LCL of the first class. Continue adding until you get K classes.
Let X be the smallest observation
LCL1=X
LCLi=LCLi-1+W for i=2, 3… K.
7. Obtain the UCLs of the FD by adding W-U to the corresponding LCLs.
UCLi=LCLi+(W-U) for i=1,2…K.
8. Generate the class boundaries.
LCBi = LCLi- 1 U and UCBi=UCLi+ 1 U for i=1,2…K.
2 2
Example 1: Marks of 50 students (out of 40)
16, 21, 26, 24, 11, 17, 25, 26, 13, 27, 24, 26, 3, 27, 23, 24, 15, 22, 22, 12, 22, 29, 18, 22, 28, 25, 7, 17, 22, 28, 19,
23, 23, 22, 3, 19, 13, 31, 23, 28, 24, 9, 20, 33, 30, 23, 20, 8, 21, 24
Construct grouped frequency distribution.
Solution
1. The array form of the data (increasing order)
3, 3, 7, 8, 9, 11, 12, 13, 13, 15, 16, 17, 17, 18, 19, 19, 20, 20, 21, 21, 22, 22, 22, 22, 22, 22, 23, 23, 23, 23, 23, 24,
24, 24, 24, 24, 25, 25, 26, 26, 26, 27, 27, 28, 28, 28, 29, 30, 31, 33
2. $U = 9 - 8 = 1$
3. $R = L - S = 33 - 3 = 30$
4. $K = 1 + 3.322 \log_{10} N = 1 + 3.322 \log_{10} 50 = 6.64 \approx 7$
5. $W = R/K = 30/6.64 = 4.5 \approx 5$
6. $W - U = 5 - 1 = 4$

Class Limits    Class Boundaries    Class Mark    Frequency    Relative Frequency    Percentage Frequency    LCF    MCF
3-7             2.5-7.5             5             3            3/50=0.06             6                       3      50
8-12            7.5-12.5            10            4            4/50=0.08             8                       7      47
13-17           12.5-17.5           15            6            6/50=0.12             12                      13     43
18-22           17.5-22.5           20            13           13/50=0.26            26                      26     37
23-27           22.5-27.5           25            17           17/50=0.34            34                      43     24
28-32           27.5-32.5           30            6            6/50=0.12             12                      49     7
33-37           32.5-37.5           35            1            1/50=0.02             2                       50     1
Total                                             50           1                     100
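
The construction steps can be traced in code on the same 50 marks. This is a sketch under stated assumptions rather than a definitive procedure: the function name `grouped_fd` is hypothetical, both K and W are rounded up to whole numbers (the notes only show 6.64≈7 and 4.5≈5, without naming a rounding rule), and W is computed from the unrounded K as in the worked solution.

```python
import math

# The 50 marks from Example 1.
marks = [16, 21, 26, 24, 11, 17, 25, 26, 13, 27, 24, 26, 3, 27, 23, 24, 15,
         22, 22, 12, 22, 29, 18, 22, 28, 25, 7, 17, 22, 28, 19, 23, 23, 22,
         3, 19, 13, 31, 23, 28, 24, 9, 20, 33, 30, 23, 20, 8, 21, 24]


def grouped_fd(data):
    """Build a grouped frequency distribution following steps 1 to 8."""
    values = sorted(data)                                    # step 1: array form
    distinct = sorted(set(values))
    u = min(b - a for a, b in zip(distinct, distinct[1:]))   # step 2: unit U
    r = values[-1] - values[0]                               # step 3: range R
    k_exact = 1 + 3.322 * math.log10(len(values))            # step 4: Sturges' rule
    k = math.ceil(k_exact)      # assumption: round K up
    w = math.ceil(r / k_exact)  # step 5; assumption: round W up
    rows = []
    for i in range(k):
        lcl = values[0] + i * w                              # step 6: lower limits
        ucl = lcl + (w - u)                                  # step 7: upper limits
        rows.append({
            "limits": (lcl, ucl),
            "boundaries": (lcl - u / 2, ucl + u / 2),        # step 8: boundaries
            "mark": (lcl + ucl) / 2,
            "freq": sum(lcl <= v <= ucl for v in values),
        })
    return rows


table = grouped_fd(marks)
for row in table:
    print(row["limits"], row["boundaries"], row["mark"], row["freq"])
```

For other data sets the K classes produced this way may not reach the largest value, in which case an extra class or a wider W would be needed; the sketch does not handle that case.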

Properties of Classes
Classes should be
• Complete and non-overlapping
Complete: the classes should include all of the data.
Non-overlapping: no observation should belong to two classes.
• Clear and properly set: W and K should be calculated properly, and W should be the same for all
classes.
• Standardized: classes should follow a logical (increasing) order.
• Suitable in number: the number of classes should be between 5 and 20, i.e., 5≤K≤20. K depends on N: the larger N
is, the larger K is. However, the data set should be condensed into easily manageable classes with minimum loss of
information.
• Continuous: even if there are no values in a class, the class must be included in the frequency distribution.
Advantages and disadvantages of frequency distributions
a. Advantages
• It condenses a large mass of data into a comparatively small table.
• It attracts the attention of even a layperson and gives an insight into the nature of the distribution.
• It helps further statistical analysis of the data, such as central tendency, scatter, and symmetry.
b. Disadvantages
• In grouped frequency distributions, the identity of the observations is lost. We know only the number of
observations in a class and do not know what the values are.
• Because the selection of the class width and the lower class limit of the first class is to a certain extent
arbitrary, different frequency distributions may be constructed for the same data and hence may give contradictory
impressions.
2.2. Graphical and Diagrammatical Representation of the Data

Definition of graph:
 The word graph comes from the Greek word meaning ‘’to draw or write.’’
 We define a graph as a pictorial representation of a set of data.
 Many types of graphs are employed in statistics, depending on the nature of the data involved and the
purpose for which the graph is intended.
Pictorial representation comes after the raw data set has been cleaned and organized.
The most common and simple forms of pictorial representation of data are
 Bar chart
 Pie chart
 Histogram

 Frequency polygon
 Cumulative frequency curve (Ogive curve)
The three most commonly used graphs in research are
1. The histogram.
2. The frequency polygon.
3. The cumulative frequency graph or ogive (pronounced o-jive).
1. Histogram: The histogram is a graph that displays the data by using contiguous vertical bars (unless the
frequency of a class is 0) of various heights to represent the frequencies of the classes. The classes are
marked on the X axis (horizontal axis) and the frequencies are marked along the Y axis (vertical axis).
 The height of each bar represents the class frequencies and the width of the bar represents the class
width.
 The bars are drawn adjacent to each other.
Example: Construct a histogram to represent the data shown for the record high temperatures for fifty states in the
following frequency distribution
Class boundaries Frequency
99.5-104.5 2
104.5-109.5 8
109.5-114.5 18
114.5-119.5 13
119.5-124.5 7
124.5-129.5 1
129.5-134.5 1
Solution
Step 1: Draw and label the x and y axes. The x axis is always the horizontal axis, and the y axis is always the
vertical axis.
Step 2: Represent the frequency on the y axis and the class boundaries on the x axis.
Step 3: Using the frequencies as the heights, draw vertical bars for each class. See Figure 2–2.

[Figure 2–2: Histogram of the record high temperatures. x axis: Temperature (°F), marked at the class boundaries from 99.5° to 134.5°; y axis: Frequency.]
As the histogram shows, the class with the greatest number of data values (18) is 109.5–114.5, followed
by 13 for 114.5–119.5. The graph also has one peak with the data clustering around it.

2. Frequency Polygon
Another way to represent the same data set is by using a frequency polygon. The frequency polygon is a
graph that displays the data by using lines that connect points plotted for the frequencies at the midpoints
of the classes. The frequencies are represented by the heights of the points. It is a graph that consists of
line segments connecting the intersection of the class marks and the frequencies. It can be constructed
from Histogram by joining the mid-points of each bar.

Example: Using the frequency distribution given in previous example construct a frequency polygon.
Step 1: Find the midpoints of each class. Recall that midpoints are found by adding the upper and lower
boundaries and dividing by 2:
Class boundaries Midpoints Frequency
99.5–104.5 102 2
104.5–109.5 107 8
109.5–114.5 112 18
114.5–119.5 117 13
119.5–124.5 122 7
124.5–129.5 127 1
129.5–134.5 132 1
Step 2: Draw the x and y axes. Label the x axis with the midpoint of each class, and then use a suitable
scale on the y axis for the frequencies.
Step 3: Using the midpoints for the x values and the frequencies as the y values, plot the points.
Step 4: Connect adjacent points with line segments. Draw a line back to the x axis at the beginning and
end of the graph, at the same distance that the previous and next midpoints would be located, as shown in
Figure below.
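The points to be plotted in Steps 1 to 4 can be computed as in the following Python sketch. The variable names are illustrative assumptions; the two extra points with frequency 0 are the closing points described in Step 4.

```python
# Class boundaries and frequencies from the record high temperature example
boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Step 1: midpoint = (lower boundary + upper boundary) / 2
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]

# Step 4: close the polygon one class width before the first midpoint
# and one class width after the last midpoint, both with frequency 0
width = boundaries[1] - boundaries[0]
polygon_x = [midpoints[0] - width] + midpoints + [midpoints[-1] + width]
polygon_y = [0] + frequencies + [0]

points = list(zip(polygon_x, polygon_y))
print(points)
```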

3. The Ogive
The third type of graph that can be used represents the cumulative frequencies for the classes. This type of graph is
called the cumulative frequency graph, or ogive. The cumulative frequency is the sum of the frequencies
accumulated up to the upper boundary of a class in the distribution. The ogive is a graph that represents the
cumulative frequencies for the classes in a frequency distribution.
Example: Construct a less than ogive for the frequency distribution described in the previous example
Solution
Step 1: Find the cumulative frequency for each class.

Less than cumulative frequency

Less than 99.5 0
Less than 104.5 2
Less than 109.5 10
Less than 114.5 28
Less than 119.5 41
Less than 124.5 48
Less than 129.5 49
Less than 134.5 50
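Step 2: Draw the x and y axes. Label the x axis with the class boundaries, and use a suitable scale on the
y axis for the cumulative frequencies.
Step 3: Plot the cumulative frequency at each upper class boundary.
Step 4: Starting with the first upper class boundary, 104.5, connect adjacent points with line segments. Then
extend the graph to the first lower class boundary, 99.5, on the x axis.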
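The less than cumulative frequencies used for the ogive are running totals of the class frequencies. A minimal Python sketch, with illustrative variable names, is:

```python
from itertools import accumulate

# Upper class boundaries and frequencies from the temperature example
upper_boundaries = [104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Running total of the frequencies up to each upper class boundary
cumulative = list(accumulate(frequencies))

# The ogive starts at the first lower class boundary with cumulative frequency 0
ogive_points = [(99.5, 0)] + list(zip(upper_boundaries, cumulative))
print(ogive_points)
```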

The steps for drawing these three types of graphs are summarized below.
Constructing Statistical Graphs
Step 1: Draw and label the x and y axes.
Step 2: Choose a suitable scale for the frequencies or cumulative frequencies, and label it
on the y axis.
Step 3: Represent the class boundaries for the histogram or ogive, or the midpoint for the
frequency polygon, on the x axis.
Step 4: Plot the points and then draw the bars or lines.

Diagrams
1. Bar Diagram: It is the simplest and most commonly used diagrammatic representation of a frequency
distribution. It is appropriate for presenting qualitative data (nominal/ordinal). It uses a series of separated and
equally spaced bars in which the width of the bars is constant and the height of each bar corresponds to the
frequency of the category. The bars are separated by a constant distance.
a. Simple Bar Diagram: is a diagram in which categories of a variable are marked on the X axis and the
frequencies of the categories are marked on the Y axis.
It is applicable for discrete variables, that is, for data given according to some period, places and timings. These
periods and timings are represented on the base line (X-axis) at regular interval and the corresponding frequencies
are represented on the Y-axis.
 The width of the rectangle represents nothing (it is meaningless), but it should be equal
for all rectangles.
 Each rectangle is separated by an equal space.
 It can also represent some magnitude (on the Y axis) over time, space, groups, etc. (on
the X axis).
Example1:
Marital Status Number of individuals
Single 100
Married 70
Divorced 30
Total 200

[Figure: Simple bar diagram of the marital status data. x axis: Marital Status (Single, Married, Divorced); y axis: Frequency.]

b. Component Bar Diagram: is used to show how a total or aggregate is divided into its
component parts. The bars represent the total value of a variable, with each total broken into its component parts,
and different colors are used for identification. In such diagrams, a bar is subdivided into parts in proportion
to the size of each subdivision. These subdivided rectangles are shaded differently by lines, dots and colors so
that the components are easy to compare.

Sometimes the volumes of different attributes may be greatly different. For making meaningful comparisons, the
components of the attributes are reduced to percentages. In that case each attribute will have 100 as its maximum
volume. This sort of component bar diagram is known as percentage bar-diagram.

Each rectangle represents total value of a variable and is broken into its component parts.
Example:
Marital Status Male Female Total
Single 90 10 100
Married 30 40 70
Divorced 1 29 30

c. Multiple Bars Diagram: used to display data on more than one variable. In the multiple bars diagram two
or more sets of inter-related data are represented side by side so that they can be compared.
Example:
Year    Coffee    Butter    Sugar    Total
1997    120       127       75
1998    25        98        87
1999    100       120       75
2000    198       98        60

2. Pie chart: A pie chart is popularly used in practice to show the percentage breakdown of data. It is a
circle representing the total, divided into sectors (slices) in proportion to the number of items in each
category, so that it shows the proportional sizes of the different data groups. The angle of each sector is the
category's share of the total multiplied by 360°.
Example:
Marital Status Number of individuals Percentage Degree
Single 100 50 180
Married 70 35 126
Divorced 30 15 54
Total 200 100 360
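The Percentage and Degree columns follow directly from the counts. The following Python sketch (names are illustrative assumptions) shows the arithmetic:

```python
# Number of individuals in each marital status category
counts = {"Single": 100, "Married": 70, "Divorced": 30}
total = sum(counts.values())

# Percentage = share of the total * 100; Degree = share of the total * 360
percentages = {name: 100 * c / total for name, c in counts.items()}
degrees = {name: 360 * c / total for name, c in counts.items()}

for name in counts:
    print(name, counts[name], percentages[name], degrees[name])
```

The degrees always add up to 360 and the percentages to 100, which is a useful check before drawing the chart.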

Exercises
The number of calories per serving for selected ready-to-eat cereals is listed here. Construct a frequency distribution
using 7 classes. Draw a histogram, a frequency polygon, and an ogive for the data, using relative frequencies.
130 190 140 80 100 120 220 220 110 100
210 130 100 90 210 120 200 120 180 120
190 210 120 200 130 180 260 270 100 160
190 240 80 120 90 190 200 210 190 180
115 210 110 225 190 130

Chapter 3: Measures of Central Tendency and variation

3.1. Measures of Central Tendency

A single value which can be considered as typical or representative of a set of observations and around
which the observations can be considered as centered is called an ’average’ (or average value or center of
location). Since, such typical values tend to lie centrally within a set of observations when arranged
according to magnitudes; averages are called measures of central tendency.
Objectives of Measures of Central Tendency
1. To condense a mass of data into one single value. That is, to get a single value which is the best
representative of the data (one that describes the characteristics of the entire data). Measures of central
tendency, by condensing masses of data into one single value, enable us to get an idea of the entire data. Thus
one value can represent thousands of observations or even more.
2. To facilitate comparison. Measures of central tendency, by condensing masses of data into one single
value, facilitate comparison. Example: to compare two classes A and B, instead of comparing each
student's result, which is infeasible, we can compare the average marks of the two classes.
3. To comprehend or understand the data easily.
There are many types of measures of central tendency, each possessing particular properties and each being typical
in some unique way. The most frequently encountered ones are
I. Computed averages
 Mean (Arithmetic Mean, Geometric Mean and Harmonic Mean)
II. Positional averages
 Median
 Quantiles (Quartiles, Deciles, Percentiles)
III. Mode
Properties of Good Measures of Central Tendency
A measure of central tendency is good or satisfactory if it possesses the following characteristics.
1. It should be calculated based on all observations.
2. It should not be affected by extreme values. It should be as close to the maximum number of observed
values as possible.
3. It should be defined rigidly which means it should have a definite value (it should be unique).
4. It should always exist.
5. It should be easy to understand and calculate.
6. It should be stable with regard to sampling.
7. It should be capable of further algebraic treatment.
Mean: Arithmetic Mean
A. Simple Arithmetic Mean: is the sum of all observations divided by the total number of observations. For a sample of n
observations X1, X2, …, Xn the sample mean is denoted by \( \bar{X} \) (X-bar) and calculated as follows.
\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \dots + X_n}{n} \]
For a frequency array (ungrouped FD),
\[ \bar{X} = \frac{\sum fX}{\sum f} = \frac{f_1 X_1 + f_2 X_2 + \dots + f_K X_K}{f_1 + f_2 + \dots + f_K} \]
For a grouped FD,
\[ \bar{X} = \frac{\sum fX}{\sum f} = \frac{f_1 X_1 + f_2 X_2 + \dots + f_K X_K}{f_1 + f_2 + \dots + f_K} \]
where X represents the class mark.
Examples:
(a) Find the arithmetic mean of 2, 5,7 and 8.
(b) Find the mean of the following grouped frequency distribution.
Class boundaries Frequency
99.5–104.5 2
104.5–109.5 8
109.5–114.5 18
114.5–119.5 13
119.5–124.5 7
124.5–129.5 1
129.5–134.5 1
Total 50
Solution
a) \[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{2 + 5 + 7 + 8}{4} = 5.5 \]
b) \[ \bar{X} = \frac{\sum fX}{\sum f} \]
Class boundaries    Midpoints (Xi)    Frequency (fi)    fiXi
99.5–104.5          102               2                 204
104.5–109.5         107               8                 856
109.5–114.5         112               18                2016
114.5–119.5         117               13                1521
119.5–124.5         122               7                 854
124.5–129.5         127               1                 127
129.5–134.5         132               1                 132
Total                                 ∑fi=50            ∑fiXi=5710

\[ \bar{X} = \frac{5710}{50} = 114.2 \]
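The grouped mean in part (b) can be checked with a few lines of Python. This is a sketch under the stated formula; the function name grouped_mean is an illustrative assumption.

```python
# Class boundaries and frequencies from part (b)
class_bounds = [(99.5, 104.5), (104.5, 109.5), (109.5, 114.5), (114.5, 119.5),
                (119.5, 124.5), (124.5, 129.5), (129.5, 134.5)]
frequencies = [2, 8, 18, 13, 7, 1, 1]

def grouped_mean(bounds, freqs):
    """Mean of a grouped FD: sum(f * class mark) / sum(f)."""
    midpoints = [(lo + hi) / 2 for lo, hi in bounds]
    total_fx = sum(f * x for f, x in zip(freqs, midpoints))
    return total_fx / sum(freqs)

mean_temp = grouped_mean(class_bounds, frequencies)
print(mean_temp)
```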
B. Combined Mean
If there are p different groups (having the same unit of measurement) with means \( \bar{X}_1, \bar{X}_2, \dots, \bar{X}_p \) and numbers of
observations n1, n2, …, np respectively, then the mean of all the groups, i.e. the combined mean \( \bar{X}_C \), is given by
\[ \bar{X}_C = \frac{\sum n_i \bar{X}_i}{\sum n_i} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2 + \dots + n_p \bar{X}_p}{n_1 + n_2 + \dots + n_p} \]
Example: The mean weight of 50 women working in a factory is 48 kilograms. The mean weight of 75
men working in the same factory is 58 kilograms. Find the mean weight of all workers in the factory.
Solution: \( n_w = 50,\ \bar{X}_w = 48,\ n_m = 75,\ \bar{X}_m = 58,\ \bar{X}_c = ? \)
\[ \bar{X}_c = \frac{n_w \bar{X}_w + n_m \bar{X}_m}{n_w + n_m} = \frac{50 \times 48 + 75 \times 58}{50 + 75} = \frac{2400 + 4350}{125} = \frac{6750}{125} = 54 \]
Hence the mean weight of all workers in the factory is 54 kilograms.
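The combined mean formula, and its rearrangement for finding the mean of one group when the overall mean is known, can be sketched in Python as follows. The function names are illustrative assumptions.

```python
def combined_mean(sizes, means):
    """Combined mean = sum(n_i * mean_i) / sum(n_i)."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

def missing_group_mean(total_n, total_mean, known_n, known_mean):
    """Solve the two-group combined mean formula for the unknown group mean."""
    return (total_n * total_mean - known_n * known_mean) / (total_n - known_n)

# Factory example: 50 women (mean 48 kg) and 75 men (mean 58 kg)
factory = combined_mean([50, 75], [48, 58])     # (2400 + 4350) / 125

# Class of 50 students with mean 72, of whom 35 boys have mean 75
girls = missing_group_mean(50, 72, 35, 75)      # (3600 - 2625) / 15

print(factory, girls)
```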
Exercises
i. The mean mark in statistics of 50 students in a class was 72 and that of the 35 boys was 75. Find the
mean mark of the girls in the class. Ans:65
ii. The mean salary of 100 laborers working in a factory, running in two shifts of 40 and 60 workers
respectively is birr 380. The mean salary of the 40 laborers working in the morning shift is 350. Find the
mean salary of the 60 laborers working in the evening shift. Answer = 400
C. Weighted Arithmetic Mean
In calculating the simple arithmetic mean, equal importance is given to all values. But there are cases
where the relative importance is not the same for all items. When this is the case, it is necessary to assign them weights
(i.e. relative importance) and then calculate a weighted arithmetic mean. Let X1, X2, …, Xn be the values and
W1, W2, …, Wn be the corresponding weights; then the weighted arithmetic mean, denoted by \( \bar{X}_W \), is given by
\[ \bar{X}_W = \frac{\sum W_i X_i}{\sum W_i} = \frac{W_1 X_1 + W_2 X_2 + \dots + W_n X_n}{W_1 + W_2 + \dots + W_n} \]
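As an illustration of the weighted mean, the sketch below uses weights 2, 3 and 5 for a quiz, a mid-term and a final exam with marks 90, 50 and 60. The function name is an illustrative assumption.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean = sum(W_i * X_i) / sum(W_i)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

marks = [90, 50, 60]      # quiz, mid-term, final exam
weights = [2, 3, 5]       # relative importance of each assessment

average = weighted_mean(marks, weights)   # (180 + 150 + 300) / 10
print(average)
```

With equal weights the function returns the simple arithmetic mean, so the simple mean is a special case of the weighted mean.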
Arithmetic mean fulfills almost all characteristics of good measures of central tendency with the exception that it is
highly affected by extreme values. And it cannot be calculated for a FD with open-ended classes (a FD with no
lower class boundary of the first class or with no upper class boundary of the last class or with both).
Properties of AM

 The algebraic sum of the deviations of each value from the arithmetic mean is zero. That is, \( \sum (X - \bar{X}) = 0 \).
 The sum of the squares of the deviations from the mean is less than the sum of the squares of the deviations about
any other value A. That is, \( \sum (X - \bar{X})^2 < \sum (X - A)^2 \) for \( A \neq \bar{X} \).
 If a constant C is added to or subtracted from each value in a distribution, then the new mean will be
\( \bar{X}_{new} = \bar{X}_{old} \pm C \) respectively.
 If each value of a distribution is multiplied by a constant C, the new mean will be the original mean multiplied by C.
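These properties can be verified numerically. The sketch below uses the values 1, 2, 3, 4, 5 and the constants 10 and 3, which are arbitrary choices for illustration.

```python
from statistics import mean

data = [1, 2, 3, 4, 5]
m = mean(data)

# Property 1: the deviations from the mean sum to zero
sum_of_deviations = sum(x - m for x in data)

# Property 2: squared deviations are smallest when taken about the mean
def sum_squared_deviations(a):
    return sum((x - a) ** 2 for x in data)

# Properties 3 and 4: adding or multiplying by a constant shifts or scales the mean
shifted_mean = mean([x + 10 for x in data])
scaled_mean = mean([3 * x for x in data])

print(sum_of_deviations, sum_squared_deviations(m), shifted_mean, scaled_mean)
```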
Exercise:
1. Find the arithmetic mean of A) 1, 2, 3, 4, 5. B) 1, 2, 3, 4, 100. Is there a great difference between the
mean of A and that of B?
2. A teacher attaches 2 to Quiz, 3 to Mid-term and 5 for Final exam. If a student gets 90, 50 and 60 for
Quiz, Mid-term and Final-exam respectively, what is his/her average academic performance?
3. The mean weight of 50 women workers in a factory is 48 kg. The mean weight of 75 men working in the
same factory is 58 kg. Find the mean weight of all workers in the factory.
4. The mean of 200 items was found to be 40. Later on it was discovered that two items were wrongly read
as 92 and 8 instead of 192 and 88 respectively. Find the correct mean.
5. The mean salary of 100 laborers working in a factory , running in two shifts of 40 and 60 workers
respectively is birr 380. The mean salary of the 40 laborers working in the morning shift is birr 350. Find
the mean salary of the 60 laborers working in the evening shift.

Median
It has been pointed out that mean cannot be calculated when there is frequency distribution with open-
ended classes. Also the mean is to a great extent affected by the extreme values. For instance, there are
eight persons getting salaries as Birr 150, 225, 240, 260,275, 290, 300 and 1500. The mean salary of the
persons is Birr 405. This value is not a good measure of central tendency because out of the eight people,
seven get Birr 300 or less. Hence, some better measure is preferable and median is one of them.
Median is the half-way point in a data set; it is the midpoint of the data array. It divides a data
set into two equal parts such that half of the values are less than the median and half are greater than the
median. Graphically, the median is the x-coordinate of the point where the less than and more than cumulative
frequency curves intersect.
The median (denoted by \( \tilde{X} \)) of a set of n observations X1, X2, …, Xn arranged in ascending order of magnitude is the
middle value if n is odd, or the arithmetic mean of the two middle values if n is even.
That is,
\[ \tilde{X} = \left(\frac{n+1}{2}\right)^{th} \text{ value, if n is odd} \]
\[ \tilde{X} = \frac{\left(\frac{n}{2}\right)^{th} \text{ value} + \left(\frac{n}{2} + 1\right)^{th} \text{ value}}{2} \text{, if n is even} \]
Steps in computing the median of a data array
Step 1: Arrange the data in order.
Step 2: Select the middle point.
Median for continuous grouped data: for grouped frequency distributions the median is given by the formula
\[ \tilde{X} = L_{\tilde{X}} + \left(\frac{\frac{n}{2} - F_{\tilde{X}-1}}{f_{\tilde{X}}}\right) w \]
Where n = ∑f = sum of frequencies
\( L_{\tilde{X}} \) is the LCB of the median class.
\( F_{\tilde{X}-1} \) is the less than cumulative frequency of the class just before the median class.
\( f_{\tilde{X}} \) is the frequency of the median class.
w is the class width.
First obtain the less than cumulative frequencies. The median class is the class whose less than cumulative
frequency is the smallest one that is greater than or equal to n/2.

Median is not influenced by extreme values. It can be calculated for FD with open-ended classes, even it can be
located if the data is incomplete.
Advantages of the median
 It always exists
 Median is a positional average and hence is not influenced by extreme observations, i.e. it is
insensitive to outliers (extreme values).
 It is unique
 It can be computed for an open-ended frequency distribution.
 It can be computed for ratio, interval and ordinal data.
 It always guarantees that 50% of the data values are on either side of the median.
Disadvantages
 It doesn’t take each and every value into consideration.
 It requires arranging the data in order.
 It does not allow algebraic manipulation (e.g. it is not possible to calculate the combined median of two or
more groups).
Examples
1. The number of rooms in the seven hotels in downtown Pittsburgh is 713, 300, 618, 595, 311, 401, and
292. Find the median.
Solution
Step 1: Arrange the data in order: 292, 300, 311, 401, 595, 618, 713
Step 2: Select the middle value (the 4th of the 7 values): 292, 300, 311, 401 (median), 595, 618, 713
Hence, the median is 401 rooms.
2. The number of cloudy days for the top 10 cloudiest cities is shown. Find the median.
209, 223, 211, 227, 213, 240, 240, 211, 229, 212
Solution
Arrange the data in order.
209, 211, 211, 212, 213, 223, 227, 229, 240, 240
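\[ \text{Median} = \frac{\left(\frac{n}{2}\right)^{th} \text{ value} + \left(\frac{n}{2} + 1\right)^{th} \text{ value}}{2} = \frac{5^{th} \text{ value} + 6^{th} \text{ value}}{2} = \frac{213 + 223}{2} = 218 \]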
Hence, the median is 218 days
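The two steps (arrange, then select the middle) can be written as a short Python function. This is a sketch of the rule given above; Python's standard library also provides statistics.median, which follows the same rule.

```python
def median(values):
    """Middle value of the ordered data, or the mean of the two middle values."""
    ordered = sorted(values)          # Step 1: arrange the data in order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # n odd: the ((n+1)/2)th value
        return ordered[mid]
    # n even: mean of the (n/2)th and (n/2 + 1)th values
    return (ordered[mid - 1] + ordered[mid]) / 2

rooms = [713, 300, 618, 595, 311, 401, 292]
cloudy_days = [209, 223, 211, 227, 213, 240, 240, 211, 229, 212]

print(median(rooms), median(cloudy_days))
```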
3. Find the median for the following frequency distribution
ClassBoundaries Frequency
10.5-14.5 4
14.5-18.5 7
18.5-22.5 8
22.5-26.5 10
26.5-30.5 12
30.5-34.5 7
34.5-38.5 8
Total 56
Solution:
First calculate less than cumulative frequency of the frequency distribution and identify the median class.
ClassBoundaries fi LCF(Fi)
10.5-14.5 4 4
14.5-18.5 7 11
18.5-22.5 8 19
22.5-26.5 10 29
26.5-30.5 12 41
30.5-34.5 7 48
34.5-38.5 8 56
Total 56
The median class is the class having the less than cumulative frequency containing the value
n/2=56/2=28. This implies 22.5-26.5 is the median class.

\[ \tilde{X} = L_{\tilde{X}} + \left(\frac{\frac{n}{2} - F_{\tilde{X}-1}}{f_{\tilde{X}}}\right) w = 22.5 + \left(\frac{28 - 19}{10}\right) \times 4 = 22.5 + 3.6 = 26.1 \]
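The grouped median calculation can be sketched in Python as follows. The function name and argument layout are illustrative assumptions; the median class is taken to be the first class whose less than cumulative frequency reaches n/2.

```python
from itertools import accumulate

def grouped_median(lower_bounds, freqs, width):
    """Median of a grouped FD: L + ((n/2 - F) / f) * w."""
    n = sum(freqs)
    cumulative = list(accumulate(freqs))
    # Median class: first class whose cumulative frequency is at least n/2
    k = next(i for i, c in enumerate(cumulative) if c >= n / 2)
    before = cumulative[k - 1] if k > 0 else 0   # LCF just before the median class
    return lower_bounds[k] + (n / 2 - before) / freqs[k] * width

lower_bounds = [10.5, 14.5, 18.5, 22.5, 26.5, 30.5, 34.5]
frequencies = [4, 7, 8, 10, 12, 7, 8]

result = grouped_median(lower_bounds, frequencies, width=4)
print(round(result, 2))
```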
Other Measures of Location
As discussed before, the median divides a given data set into two equal parts. There are also other positional
measures that divide a given data set into more than two equal parts. These measures are collectively
known as Quantiles. Quantiles include quartiles, deciles and percentiles.
Quartiles: are values that divide a dataset into four equal parts. These values are denoted by Q1, Q2 and Q3 such that
25% of the data fall below Q1, 50% below Q2 and 75% below Q3.
Deciles: are values that divide the data into ten equal parts. These values are denoted by D1, D2, …, D9 such that
10% of the data fall below D1, 20% below D2, …, 90% below D9.
Percentiles: are values that divide a dataset into 100 equal parts. These values are denoted by P1, P2, …, P99.
Methods of calculation
a. Ungrouped (individual) series: Arrange the values in ascending order. Then
 Quartiles: Let Qi be the ith quartile (i=1,2,3), then
\[ Q_i = \left(\frac{i(n+1)}{4}\right)^{th} \text{ value} \]
 Deciles: Let Di be the ith decile (i=1,2,…,9)
\[ D_i = \left(\frac{i(n+1)}{10}\right)^{th} \text{ value} \]
 Percentiles: Let Pi be the ith percentile (i=1,2,…,99)
\[ P_i = \left(\frac{i(n+1)}{100}\right)^{th} \text{ value} \]
b. Grouped (continuous) data:
 Quartiles:
\[ Q_i = L_{Q_i} + \left(\frac{\frac{in}{4} - F_{Q_i - 1}}{f_{Q_i}}\right) w, \quad i = 1, 2, 3. \]
 Deciles:
\[ D_i = L_{D_i} + \left(\frac{\frac{in}{10} - F_{D_i - 1}}{f_{D_i}}\right) w, \quad i = 1, 2, \dots, 9. \]
 Percentiles:
\[ P_i = L_{P_i} + \left(\frac{\frac{in}{100} - F_{P_i - 1}}{f_{P_i}}\right) w, \quad i = 1, 2, \dots, 99. \]
Where n = ∑f = sum of frequencies
L is the LCB of the ith (quartile, decile or percentile) class.
F is the less than cumulative frequency of the class just before the ith (quartile, decile or percentile) class.
f is the frequency of the ith (quartile, decile or percentile) class.
w is the class width.
Relationship between median, quartiles, deciles and percentiles
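 \( \tilde{X} = Q_2 = D_5 = P_{50} \)
 \( Q_i = P_{25i} \), i.e. Q1 = P25, Q2 = P50 and Q3 = P75
 \( D_i = P_{10i} \), i.e. D1 = P10, D2 = P20, …, D9 = P90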
Examples:
1. Given the data: 420, 430, 435, 438, 441, 449, 490, 500, 510 and 515. Find
(a) All the quartiles. (b) The 1st and 7th deciles. (c) The 40th and 75th percentiles.
Solution
Data array: 420, 430, 435, 438, 441, 449, 490, 500, 510 and 515.
a) Quartiles: \( Q_i = \left(\frac{i(n+1)}{4}\right)^{th} \) value, i=1,2,3
\[ Q_1 = \left(\frac{10+1}{4}\right)^{th} \text{ value} = 2.75^{th} \text{ value} = 2^{nd} \text{ value} + 0.75(3^{rd} \text{ value} - 2^{nd} \text{ value}) = 430 + 0.75(435 - 430) = 433.75 \]
\[ Q_2 = \left(2 \times \frac{10+1}{4}\right)^{th} \text{ value} = 5.5^{th} \text{ value} = 5^{th} \text{ value} + 0.5(6^{th} \text{ value} - 5^{th} \text{ value}) = 441 + 0.5(449 - 441) = 445 \]
\[ Q_3 = \left(3 \times \frac{10+1}{4}\right)^{th} \text{ value} = 8.25^{th} \text{ value} = 8^{th} \text{ value} + 0.25(9^{th} \text{ value} - 8^{th} \text{ value}) = 500 + 0.25(510 - 500) = 502.5 \]
b) Deciles: \( D_i = \left(\frac{i(n+1)}{10}\right)^{th} \) value, i=1,2,…,9
\[ D_1 = \left(1 \times \frac{10+1}{10}\right)^{th} \text{ value} = 1.1^{th} \text{ value} = 1^{st} \text{ value} + 0.1(2^{nd} \text{ value} - 1^{st} \text{ value}) = 420 + 0.1(430 - 420) = 421 \]
\[ D_7 = \left(7 \times \frac{10+1}{10}\right)^{th} \text{ value} = 7.7^{th} \text{ value} = 7^{th} \text{ value} + 0.7(8^{th} \text{ value} - 7^{th} \text{ value}) = 490 + 0.7(500 - 490) = 497 \]
c) Percentiles: \( P_i = \left(\frac{i(n+1)}{100}\right)^{th} \) value, i=1,2,…,99
\[ P_{40} = \left(40 \times \frac{10+1}{100}\right)^{th} \text{ value} = 4.4^{th} \text{ value} = 4^{th} \text{ value} + 0.4(5^{th} \text{ value} - 4^{th} \text{ value}) = 438 + 0.4(441 - 438) = 439.2 \]
\[ P_{75} = \left(75 \times \frac{10+1}{100}\right)^{th} \text{ value} = 8.25^{th} \text{ value} = 8^{th} \text{ value} + 0.25(9^{th} \text{ value} - 8^{th} \text{ value}) = 500 + 0.25(510 - 500) = 502.5 \]
2. Calculate all quartiles, the 5th and 8th deciles, and the 30th and 80th percentiles for the following
frequency distribution and interpret the results.
Class Boundaries    Frequency
10.5-14.5           4
14.5-18.5           7
18.5-22.5           8
22.5-26.5           10
26.5-30.5           12
30.5-34.5           7
34.5-38.5           8
Total               56

Solutions: calculate the less than cumulative frequency for each class (4, 11, 19, 29, 41, 48, 56).
Quartiles:
\[ Q_i = L_{Q_i} + \left(\frac{\frac{in}{4} - F_{Q_i - 1}}{f_{Q_i}}\right) w, \quad i = 1, 2, 3. \]
Q1 class: n/4 = 56/4 = 14, so the Q1 class is 18.5-22.5
\[ Q_1 = L_{Q_1} + \left(\frac{\frac{n}{4} - F_{Q_1 - 1}}{f_{Q_1}}\right) w = 18.5 + \left(\frac{14 - 11}{8}\right) \times 4 = 18.5 + 1.5 = 20 \]
Q2 class: 2n/4 = 28, so the Q2 class is 22.5-26.5
\[ Q_2 = L_{Q_2} + \left(\frac{\frac{2n}{4} - F_{Q_2 - 1}}{f_{Q_2}}\right) w = 22.5 + \left(\frac{28 - 19}{10}\right) \times 4 = 22.5 + 3.6 = 26.1 \]
Q3 class: 3n/4 = 42, so the Q3 class is 30.5-34.5
\[ Q_3 = L_{Q_3} + \left(\frac{\frac{3n}{4} - F_{Q_3 - 1}}{f_{Q_3}}\right) w = 30.5 + \left(\frac{42 - 41}{7}\right) \times 4 = 30.5 + 0.57 = 31.07 \]
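Since the quartile, decile and percentile formulas for grouped data differ only in the divisor (4, 10 or 100), one function can cover all three. The following Python sketch uses illustrative names and takes the quantile class to be the first class whose less than cumulative frequency reaches in/parts.

```python
from itertools import accumulate

def grouped_quantile(lower_bounds, freqs, width, i, parts):
    """i-th quantile of a grouped FD: L + ((i*n/parts - F) / f) * w."""
    n = sum(freqs)
    target = i * n / parts
    cumulative = list(accumulate(freqs))
    # Quantile class: first class whose cumulative frequency reaches the target
    k = next(idx for idx, c in enumerate(cumulative) if c >= target)
    before = cumulative[k - 1] if k > 0 else 0
    return lower_bounds[k] + (target - before) / freqs[k] * width

lower_bounds = [10.5, 14.5, 18.5, 22.5, 26.5, 30.5, 34.5]
frequencies = [4, 7, 8, 10, 12, 7, 8]

q1 = grouped_quantile(lower_bounds, frequencies, 4, 1, 4)
q2 = grouped_quantile(lower_bounds, frequencies, 4, 2, 4)
q3 = grouped_quantile(lower_bounds, frequencies, 4, 3, 4)
d5 = grouped_quantile(lower_bounds, frequencies, 4, 5, 10)
print(round(q1, 2), round(q2, 2), round(q3, 2), round(d5, 2))
```

The same call with parts=10 or parts=100 gives the deciles and percentiles asked for in the question; D5 coincides with Q2, as the relationship between the quantiles requires.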

The Mode
The mode, \( \hat{X} \), is the most frequently occurring value in a set of observations, i.e. the value with the highest
frequency. A data set may have one mode (uni-modal), two modes (bi-modal), more than two modes (multi-
modal) or no mode at all (i.e. when all observations are equally frequent).
Ungrouped (individual series): Arrange the data in ascending order and take the value appearing most
frequently (the most frequent value).
Grouped (continuous) series: In a frequency distribution, the mode is located in the class with the highest
frequency, and that class is the modal class. Then the formula for the mode is
\[ \hat{X} = L_{\hat{X}} + \left(\frac{f_{\hat{X}} - f_{\hat{X}-1}}{(f_{\hat{X}} - f_{\hat{X}-1}) + (f_{\hat{X}} - f_{\hat{X}+1})}\right) w \]
where \( L_{\hat{X}} \) is the LCB of the modal class, \( f_{\hat{X}} \) is the frequency of the modal class, \( f_{\hat{X}-1} \) and \( f_{\hat{X}+1} \) are the
frequencies of the classes just before and just after the modal class, and w is the class width.
Mode is not affected by extreme values and can be calculated for open-ended classes. But it often does not
exist and its value may not be unique.
**The mode is the only measure of central tendency that can be used in finding the most typical case when the
data are nominal or categorical.

Examples:
1. Find the mode of the following data sets.
(a) 110, 113, 116, 116, 118, 118, 118, 121 and 123.
(b) 2, 3, 5, 7 and 8.
(c) 15, 18, 18, 18, 20, 22, 24, 24, 24, 26 and 26
(d) 5, 6, 6, 7, 9, 9, 10, 12 and 12.
(e) 1, 1, 0, 1, 0, 0, 0, 2, 4 and 3.
Solutions: Find the value having the highest frequency.
(a) Since 118 occurs more often than the other values, the mode is 118.
(b) Each value occurs once (all are equally frequent), so the data has no mode.
(c) 18 and 24 each occur three times; hence the modal values are 18 and 24 (bi-modal).
(d) Tri-modal (multi-modal): 6, 9 and 12.
(e) The modal value here is 0, as it occurs more times than the other values.
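Counting frequencies by hand is error-prone for longer lists. The Python sketch below finds the modal values with collections.Counter; returning an empty list when every value is equally frequent is a choice made here to match the "no mode" convention of these notes, and the function assumes a non-empty list.

```python
from collections import Counter

def modes(values):
    """Return the modal values, or [] when all values are equally frequent."""
    counts = Counter(values)
    highest = max(counts.values())
    # No mode: more than one distinct value and all share the same frequency
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)

a = [110, 113, 116, 116, 118, 118, 118, 121, 123]
b = [2, 3, 5, 7, 8]
c = [15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26]
d = [5, 6, 6, 7, 9, 9, 10, 12, 12]
e = [1, 1, 0, 1, 0, 0, 0, 2, 4, 3]

for data in (a, b, c, d, e):
    print(modes(data))
```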
2. Find the mode for the following frequency distribution
Class Boundaries Frequency
10.5-14.5 4
14.5-18.5 7
18.5-22.5 8
22.5-26.5 10
26.5-30.5 12
30.5-34.5 7
34.5-38.5 8
Total 56
Solution: The class having the highest frequency is 26.5−30.5, hence it is the modal class.
\[ \hat{X} = L_{\hat{X}} + \left(\frac{f_{\hat{X}} - f_{\hat{X}-1}}{(f_{\hat{X}} - f_{\hat{X}-1}) + (f_{\hat{X}} - f_{\hat{X}+1})}\right) w = 26.5 + \frac{12 - 10}{(12 - 10) + (12 - 7)} \times 4 = 26.5 + 1.14 = 27.64 \]
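The grouped mode formula can be sketched in Python as below. The function name is illustrative; treating a missing neighbouring class as having frequency 0 and picking the first class when several share the highest frequency are assumptions of this sketch.

```python
def grouped_mode(lower_bounds, freqs, width):
    """Mode of a grouped FD: L + d1 / (d1 + d2) * w."""
    # Modal class: the class with the highest frequency (first one if tied)
    k = max(range(len(freqs)), key=lambda idx: freqs[idx])
    prev_f = freqs[k - 1] if k > 0 else 0                 # assumed 0 if no class before
    next_f = freqs[k + 1] if k < len(freqs) - 1 else 0    # assumed 0 if no class after
    d1 = freqs[k] - prev_f     # excess over the class before the modal class
    d2 = freqs[k] - next_f     # excess over the class after the modal class
    return lower_bounds[k] + d1 / (d1 + d2) * width

lower_bounds = [10.5, 14.5, 18.5, 22.5, 26.5, 30.5, 34.5]
frequencies = [4, 7, 8, 10, 12, 7, 8]

mode_value = grouped_mode(lower_bounds, frequencies, width=4)
print(round(mode_value, 2))
```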

3.2. Measures of Variation

Variation or dispersion may be defined as the extent to which the values are scattered around a measure of
central tendency. Thus, a measure of dispersion tells us the extent to which the values of a variable
vary about the measure of central tendency.
Objectives of Measures of Dispersion
To have an idea about the reliability of the measure of central tendency. If the degree of scatter is large,
an average is less reliable. If the value of the dispersion is small, it indicates that a central value is a good
representative of all the values in the data set.

To compare two or more sets of data with regard to their variability. Two or more data sets can be compared
by calculating the same measure of dispersion having the same unit of measurement. A set with a smaller value
possesses less variability, i.e. it is more uniform (or more consistent).
To provide information about the structure of the data. The value of a measure of dispersion gives an idea about the
spread of the observations. Further, one can surmise about the limits of the expansion of the values in the data set.
To pave the way for the use of other statistical measures. Measures of dispersion, especially variance and standard
deviation, lead to many statistical techniques like correlation, regression and analysis of variance.
Types of Measures of Variation
Absolute measures of variation: A measure of variation is said to be an absolute form when it shows the
actual amount of variation of an item from a measure of central tendency and are expressed in concrete
units in which the data have been expressed.
Relative measure of variation: It is the quotient obtained by dividing the absolute measure by a quantity
in respect to which absolute deviation has been computed. Relative measure of variation is a pure number
and used for making comparisons between different distributions.
Absolute Measures Relative Measures
Range Coefficient of Range
Quartile Deviation Coefficient of Quartile Deviation
Mean Deviation Coefficient of Mean Deviation
Variance and Standard Deviation Coefficient of Variation
Before giving the details of these measures of dispersion, it is worthwhile to point out that a measure of
dispersion (variation) is to be judged on the basis of the same properties as a good measure of central
tendency. Hence, they are not repeated here.
Range
It is the simplest and crudest measure of dispersion. Range is defined as the difference between the largest and the
smallest values in the data.
Ungrouped Data: R = L − S, where L is the largest and S is the smallest value.
Grouped Data: R = UCL_last − LCL_first
Coefficient of Range (CR)
For raw data: \[ CR = \frac{L - S}{L + S} \]
For grouped data: \[ CR = \frac{UCL_{last} - LCL_{first}}{UCL_{last} + LCL_{first}} \]
Range hardly satisfies any property of good measure of dispersion as it is based on two extreme values only,
ignoring the others. It is not liable to further algebraic treatment.
Quartile Deviation
Quartile deviation, denoted QD, is sometimes known as the Semi-inter-quartile Range (SIR).
Inter-quartile Range = Q3 − Q1
\[ QD = \frac{Q_3 - Q_1}{2}, \qquad \text{Coefficient of QD} = \frac{Q_3 - Q_1}{Q_3 + Q_1} \]
QD involves only the middle 50% of the observations, excluding the observations below the lower quartile and
the observations above the upper quartile. Note that QD does not take into account all the individual values
occurring between Q1 and Q3, so it gives no precise idea of the variation of even the middle 50% of the values.
It does provide some idea when the values are uniformly distributed between Q1 and Q3. It can be
calculated for open-ended classes.
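The range, the quartile deviation and their coefficients are simple enough to compute in a few lines. In the Python sketch below the function names are illustrative; the raw data are the ten values 420, …, 515 used in the quantile example, and the quartiles Q1 = 20 and Q3 = 31.07 are taken from the grouped frequency distribution example.

```python
def data_range(values):
    """R = L - S (largest minus smallest value)."""
    return max(values) - min(values)

def coefficient_of_range(values):
    """CR = (L - S) / (L + S)."""
    return (max(values) - min(values)) / (max(values) + min(values))

def quartile_deviation(q1, q3):
    """QD = (Q3 - Q1) / 2."""
    return (q3 - q1) / 2

def coefficient_of_qd(q1, q3):
    """Coefficient of QD = (Q3 - Q1) / (Q3 + Q1)."""
    return (q3 - q1) / (q3 + q1)

data = [420, 430, 435, 438, 441, 449, 490, 500, 510, 515]
print(data_range(data), round(coefficient_of_range(data), 4))

q1, q3 = 20, 31.07
print(round(quartile_deviation(q1, q3), 3), round(coefficient_of_qd(q1, q3), 4))
```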
Mean Deviation
The measures of variation discussed so far are not satisfactory in the sense that they lack most of the
requirements of a good measure. Mean deviation is a better measure than range and quartile deviation.
It is the arithmetic mean of the absolute values of the deviations from some measure of central tendency, usually the
mean or the median of a distribution. Hence we have mean deviation about the mean, \( MD(\bar{X}) \), and mean deviation
about the median, \( MD(\tilde{X}) \).
Ungrouped Data:
\[ MD(\bar{X}) = \frac{\sum |X - \bar{X}|}{n}, \qquad MD(\tilde{X}) = \frac{\sum |X - \tilde{X}|}{n} \]
Grouped Data:
\[ MD(\bar{X}) = \frac{\sum f|X - \bar{X}|}{\sum f}, \qquad MD(\tilde{X}) = \frac{\sum f|X - \tilde{X}|}{\sum f} \]
Coefficient of Mean Deviation:
\[ CMD(\bar{X}) = \frac{MD(\bar{X})}{\bar{X}}, \qquad CMD(\tilde{X}) = \frac{MD(\tilde{X})}{\tilde{X}} \]
MD is less affected by extreme values than the variance and standard deviation, because the deviations are not
squared. Its main drawback is that the algebraic negative signs of the deviations are
ignored. MD is minimum when the deviations are taken from the median.
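The following Python sketch computes the mean deviation about the mean and about the median for the eight salaries (in Birr) used earlier in the discussion of the median. The function name is an illustrative assumption.

```python
from statistics import mean, median

def mean_deviation(values, center):
    """Arithmetic mean of the absolute deviations from `center`."""
    return sum(abs(x - center) for x in values) / len(values)

salaries = [150, 225, 240, 260, 275, 290, 300, 1500]

md_about_mean = mean_deviation(salaries, mean(salaries))       # center = 405
md_about_median = mean_deviation(salaries, median(salaries))   # center = 267.5

print(md_about_mean, md_about_median)
```

The deviation about the median comes out smaller than the deviation about the mean, in line with the statement that MD is minimum when taken from the median.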
Variance
The variance and standard deviation are the most widely used measures of dispersion, and both
measure the average dispersion of the observations around the mean.
For a population containing N elements, the population variance (\( \sigma^2 \)) is calculated by using the formula
\[ \sigma^2 = \frac{\sum (X - \mu)^2}{N} \text{ for ungrouped data and } \sigma^2 = \frac{\sum f(X - \mu)^2}{\sum f} \text{ for grouped data,} \]
where \( \mu \) is the population mean.
For a sample of n elements, the sample variance (\( S^2 \)) is calculated by using the formula
\[ S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} \text{ for ungrouped data and } S^2 = \frac{\sum f(X - \bar{X})^2}{\sum f - 1} \text{ for grouped data.} \]
The variance removes most of the shortcomings of the MD. Its first main demerit is that its unit is the square
of the unit of measurement of the variable values. For example, the sample variance of 2 m, 6 m and 4 m is
4 m². Interpreting this as "on average, each value differs from the mean by 4 m²" is wrong for two reasons:
first, the unit of measurement of the variance is not the same as that of the data set; second, the variation of
the data is exaggerated from two to four because the deviations are squared. Squaring also gives more weight to
the extreme values than to those which are near the mean.

Standard Deviation
Standard deviation is the positive square root of variance.
Population Standard Deviation: \( \sigma = \sqrt{\sigma^2} \)        Sample Standard Deviation: \( S = \sqrt{S^2} \)
Standard deviation is considered to be the best measure of dispersion because its unit of measurement is the same
as that of the data set, and the exaggeration made by the variance is eliminated by taking the square root.
If the standard deviation of the data is small, the values are concentrated near the mean, and if it is large, the
values are scattered away from the mean.

Even though standard deviation is better than variance, there is one difficulty with it. If there are two or more
distributions of different variables (having different units of measurement), their variability cannot be compared by
comparing the values of the standard deviation.
Examples:
1. Find the variance and standard deviation of: 20, 28, 40, 12, 30, 15 and 50.
a. Take the data as a population
b. Take the data as a sample
2. Find the variance and standard deviation for the following frequency distribution
Class Boundaries    Frequency
10.5-14.5           4
14.5-18.5           7
18.5-22.5           8
22.5-26.5           10
26.5-30.5           12
30.5-34.5           7
34.5-38.5           8
Total               56
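
The two examples above can be checked numerically. The following Python sketch is one possible way to do so, not part of the original notes: it uses the standard-library `statistics` module for the ungrouped data and, for the frequency distribution, assumes that each class is represented by its midpoint. The helper name `grouped_variance` is hypothetical.

```python
import math
import statistics

# Example 1: ungrouped data
data = [20, 28, 40, 12, 30, 15, 50]

pop_var = statistics.pvariance(data)   # divides by N (data as a population)
samp_var = statistics.variance(data)   # divides by n - 1 (data as a sample)
pop_sd = math.sqrt(pop_var)
samp_sd = math.sqrt(samp_var)

# Example 2: grouped data; each class is represented by its midpoint (assumption)
boundaries = [(10.5, 14.5), (14.5, 18.5), (18.5, 22.5), (22.5, 26.5),
              (26.5, 30.5), (30.5, 34.5), (34.5, 38.5)]
freqs = [4, 7, 8, 10, 12, 7, 8]


def grouped_variance(boundaries, freqs, sample=True):
    """Variance of a frequency distribution computed from class midpoints."""
    mids = [(lo + hi) / 2 for lo, hi in boundaries]
    total = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, mids)) / total
    squared_dev = sum(f * (x - mean) ** 2 for f, x in zip(freqs, mids))
    return squared_dev / (total - 1 if sample else total)


grouped_samp_var = grouped_variance(boundaries, freqs, sample=True)
grouped_pop_var = grouped_variance(boundaries, freqs, sample=False)

print(f"Ungrouped, population: variance = {pop_var:.2f}, SD = {pop_sd:.2f}")
print(f"Ungrouped, sample:     variance = {samp_var:.2f}, SD = {samp_sd:.2f}")
print(f"Grouped, sample:       variance = {grouped_samp_var:.2f}, "
      f"SD = {math.sqrt(grouped_samp_var):.2f}")
```

Note that the sample variance is always somewhat larger than the population variance of the same data, because it divides by n - 1 rather than N.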


Coefficient of Variation (CV)

All absolute measures of dispersion have units. If two or more distributions differ in their units of measurement,
their variability cannot be compared by any of the absolute measures given before. Also, the size of these measures
of dispersion depends on the size of the values: the larger the values, the larger the absolute measures tend to be.
Hence, when two or more data sets have different units of measurement, or their means differ substantially in size,
absolute measures fail to be appropriate.
The coefficient of variation is a relative measure of dispersion based on the standard deviation. It is the ratio of
the standard deviation to the mean, expressed as a percentage.
$CV = \frac{\sigma}{\mu} \times 100\%$ for a population, and $CV = \frac{S}{\bar{X}} \times 100\%$ for a sample.
It is used for comparing the variability of two or more distributions. The distribution having less CV is said to be
less variable or more consistent or more uniform.
Since absolute measures depend on the units of measurement of the data, they fail to be appropriate for comparing
two or more groups if
1. The groups have different units of measurement.
2. The magnitudes of the values in the groups are not similar.
When either of these two conditions holds, we have to use relative measures of variation. CV is a unitless
measure of variation that also takes into account the size of the means of the distributions.
Example: Given Data Set A: 2 meters, 4 meters, 6 meters
Data Set B: 1000 liters, 800 liters, 900 liters
Compare the variability of the two data sets using standard deviation and coefficient of variation.
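
A minimal Python sketch of this comparison follows. It assumes both data sets are treated as samples (so `statistics.stdev`, which divides by n - 1, is used); the function name `coefficient_of_variation` is illustrative.

```python
import statistics

set_a = [2, 4, 6]            # meters
set_b = [1000, 800, 900]     # liters


def coefficient_of_variation(values):
    """Sample CV in percent: (S / mean) * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100


sd_a, sd_b = statistics.stdev(set_a), statistics.stdev(set_b)
cv_a, cv_b = coefficient_of_variation(set_a), coefficient_of_variation(set_b)

print(f"A: S = {sd_a:.2f} m, CV = {cv_a:.2f}%")
print(f"B: S = {sd_b:.2f} L, CV = {cv_b:.2f}%")
```

The standard deviations (in meters and in liters) cannot be compared directly because their units differ. On the unitless CV scale, Data Set B has the smaller CV and is therefore the more consistent of the two.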


Chapter 4: Probability and Probability distributions

Probability is a branch of mathematics that deals with calculating the likelihood of a given event's
occurrence, which is expressed as a number between 0 and 1. An event with a probability of 1 can be
considered a certainty: for example, the probability of a coin toss resulting in either "heads" or "tails" is 1,
because there are no other options, assuming the coin lands flat. An event with a probability of 0.5 is
equally likely to occur or not to occur: for example, the probability of a fair coin toss
resulting in "heads" is 0.5, because the toss is equally likely to result in "tails." An event with a
probability of 0 can be considered an impossibility: for example, the probability that the coin will land (flat)
without either side facing up is 0, because either "heads" or "tails" must be facing up.

As a general concept, probability is the measure of the chance that something will occur.
Probability is
• A quantitative (numerical) measure of uncertainty.
• A measure of the strength of belief in the occurrence of something (an event).
• A measure of the degree of chance of an uncertain event.
• A numerical measure with a value between 0 (0%) and 1 (100%), where a probability of 0 indicates
that the given event cannot occur and a probability of 1 (100%) assures certainty of such an occurrence.

Set theory
In order to discuss the theory of probability, it is essential to be familiar with some ideas and concepts of
the mathematical theory of sets. A set is a collection of well-defined objects and is denoted by a capital letter
such as A, B, C, etc.
In describing which objects are contained in set A, two common methods are available. These methods are:
1. Listing all objects of A. For example, A = {1, 2, 3, 4} describes the set consisting of the positive integers
1, 2, 3 and 4.
2. Describing a set in words, for example, set A consists of all real numbers between 0 and 1, inclusive. It
can be written as A = {x: 0 ≤ x ≤ 1}, that is, A is the set of all x's where x is a real number between 0 and
1, inclusive.

If A = {a1, a2, ..., an}, then each object ai (i = 1, 2, ..., n) belonging to set A is called a member or an
element of set A, i.e., ai ∈ A. A set consisting of all possible elements under consideration is called a universal
set (denoted by U). On the other hand, a set containing no element is called an empty set (denoted by ∅ or
{}).
If every element of set A is also an element of set B, A is said to be a subset of B and we write A ⊂ B. Every
set is a subset of itself, i.e., A ⊂ A. The empty set is a subset of every set. If A ⊂ B and B ⊂ C, then A ⊂ C. If A
⊂ B and B ⊂ A, then A and B are said to be equal.

Now let us see some methods of combining sets in order to form a new set and develop the main properties.
• Empty set (denoted by ∅ or {})
   - A set containing no element.
• Universal set (denoted by U above; written S in what follows)
   - A set containing all possible elements.
• Complement (Not) (A')
   - The complement of a set A is the set containing all elements of S that are not in A.
• Intersection (And) (A ∩ B)
   - A set containing all elements that are in both A and B.
• Union (Or) (A ∪ B)
   - A set containing all elements in A or B or both.
   - If A and B are finite sets, then n(A ∪ B) = n(A) + n(B) - n(A ∩ B).
• Mutually exclusive or disjoint sets
   - Sets having no element in common, that is, whose intersection is the empty set, are known
as mutually exclusive sets.


Example: In a survey conducted among 200 statistics major students, the number of students who visited historical,
religious and both sites are found to be 150, 130 and 80 respectively. Find the number of students who visited none
of the sites.
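
One way to work this example is to apply n(A ∪ B) = n(A) + n(B) - n(A ∩ B) and then subtract from the total. The short Python sketch below does exactly that; the variable names are illustrative.

```python
total_students = 200
visited_historical = 150
visited_religious = 130
visited_both = 80

# n(H ∪ R) = n(H) + n(R) - n(H ∩ R)
visited_at_least_one = visited_historical + visited_religious - visited_both

# Students who visited neither kind of site are the complement of H ∪ R
visited_none = total_students - visited_at_least_one

print("Visited at least one site:", visited_at_least_one)
print("Visited none of the sites:", visited_none)
```

With these figures every student visited at least one kind of site, so the number who visited none is zero.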

Definition & some terminologies of probability

We shall introduce some of the basic concepts of probability theory by defining random experiments and
terminologies relating to random experiments (i.e., Outcomes, Events and Sample Spaces).
1. Experiment: - It is an activity or a trial that leads to well-defined results called outcomes.
• It is doing or observing something happening under certain conditions, resulting in some final outcome. It
could be physical, chemical or social.
• It is the process by which an observation (or measurement) is obtained.
• It is any activity, process, measurement, or observation that yields or generates well-defined outcomes or
sample points. Each repetition of the experiment is also known as a trial.
Example: The following are some experiments performed in many practical situations.
• Tossing a fair coin twice.
• Drawing a ball from an urn.
• Picking a king from a well-shuffled standard deck of playing cards.
There are two types of experiments that we observe in natural phenomena:
(i). Deterministic (also non-random/non-probabilistic/ non-stochastic) experiments: They are a kind of
experiments whose outcome can be predicted or determined exactly in advance.
Example: an experiment: hydrogen + oxygen = water
(ii). Random or non-deterministic or probabilistic or statistical or stochastic experiments:
A random experiment is an experiment that can result in different outcomes, which cannot be
predicted with certainty even though the experiment is repeated in the same manner every time or the entire past
history of the experiment is known.
Example: If a coin is tossed once, it is possible to list all the possible outcomes, i.e., S = {H, T}, but it is not
possible to predict which outcome will occur.
In a random experiment, there is always uncertainty as to whether a particular event or phenomenon will
occur. The likelihood or chance of the occurrence of an event or a phenomenon resulting from such a statistical
experiment is evaluated by probability.
Example: Throwing an unbiased coin one time. Here one can list the possible outcomes exactly: either a head
(H) or a tail (T). Nevertheless, one cannot predict before the throw which of those outcomes will show up.
The outcome cannot be known with certainty, and hence the experiment is referred to as a random experiment.
2. Outcome: - is a result of a single trial (experiment).
 The individual result of a random experiment is said to be an outcome or a sample point.
 An outcome is usually denoted by lowercase letters, such as x, y, t, w, etc.
Example: Determine the outcomes of each of the following random experiments.
a) Roll a die and observe the number that shows on top.
b) Toss a coin four times and observe the sequence of heads and tails obtained.
Solution:
a) The outcomes of the experiment are 1, 2, 3, 4, 5 or 6.
b) The outcomes of the experiment are HHHH, HHHT, HHTH, HTHH, THHH, HHTT, HTHT, THTH, HTTH, THHT,
TTHH, HTTT, THTT, TTHT, TTTH, and TTTT.
3. Sample Space: - is the collection of all possible outcomes of an experiment.
• You may think of a sample space as the set of all values that a variable may assume.
Note: The sample space of an experiment plays the role of the universal set. It can never be empty; it must have at
least one outcome.

Example: The following are random experiments with their respective sample spaces:


No  Random Experiment                                  Sample Space

1   Tossing an unbiased coin                           S = {H, T}
2   Tossing an unbiased coin twice                     S = {HH, HT, TH, TT}
3   Tossing two unbiased coins                         S = {HH, HT, TH, TT}
4   Rolling a balanced die                             S = {1, 2, 3, 4, 5, 6}
5   Answering a true-false question by guessing        S = {True, False}
6   Counting the number of defects on a molded part    S = {0, 1, 2, 3, . . .}
7   Selecting an item from a production lot            S = {Defective, Non-defective}
4. Event: - is an outcome or a set of outcomes (having common characteristics) of an experiment. For
example, getting exactly one head when tossing two coins simultaneously would be an event, E = {HT, TH}. Getting
an even number in rolling a die is the event E = {2, 4, 6}.
Often we are not interested in a single outcome of a sample space of a random experiment but in whether or not one of a group
or a subset of outcomes occurs. Such groups or subsets of the sample space, S, are called events.
Remark:
• If a sample space, S, contains n outcomes, then it has exactly 2^n subsets, and hence 2^n events.
• E.g., we can have 2^6 = 64 possible events from S = {1, 2, 3, 4, 5, 6}.
Example 1: The event that the sum of two dice is 10 or more is
A = {(4, 6), (5, 5), (5, 6), (6, 4), (6, 5), (6, 6)}.
Example 2: Consider a random experiment which consists of tossing a fair coin one time. The possible sample
points of this random experiment are head (H) and tail (T). Thus, the sample space S is the set containing
those outcomes, i.e., S = {H, T}. Here, n(S) = 2. Hence, there are four possible events of this sample space,
namely: ∅, {H}, {T} and {H, T}, in which case the number of events is 2^2 = 4.
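
The count of 2^n events can be checked by listing every subset of a small sample space. The sketch below is one way to do this with the standard library; the function name `all_events` is hypothetical.

```python
from itertools import chain, combinations


def all_events(sample_space):
    """Return every subset (event) of a finite sample space."""
    outcomes = list(sample_space)
    subsets = chain.from_iterable(
        combinations(outcomes, size) for size in range(len(outcomes) + 1)
    )
    return [set(s) for s in subsets]


coin_events = all_events(["H", "T"])
die_events = all_events([1, 2, 3, 4, 5, 6])

print(coin_events)         # the empty event is displayed by Python as set()
print(len(coin_events))    # number of events for one coin toss
print(len(die_events))     # number of events for one roll of a die
```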
Example 3: In tossing a coin twice, write the event consisting of
a) At least one head
b) Two heads
c) Two tails
d) Identify each of a), b) and c) as an elementary or a compound event.
e) From a), b) and c), which are mutually exclusive events?
Solution: S = {HH, HT, TH, TT}.
a) Let A = an event consisting of at least one head; hence the possible outcomes of event A are {HH, HT, TH},
which is a compound event.
b) Let B = an event consisting of two heads, so the possible outcomes of event B,
B = {HH}, consist of only one outcome and B is therefore an elementary event.
c) Let C = an event consisting of two tails, so C = {TT}, which is an elementary event.
e) A and C are mutually exclusive, and so are B and C, since they share no outcome. A and B are not
mutually exclusive because both contain HH.
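
As an illustration, the events of Example 3 can be built by enumeration, which also makes it easy to check which pairs are mutually exclusive. This is a sketch, not the notes' own method; outcomes are encoded as two-letter strings such as "HT".

```python
from itertools import product

# Sample space for tossing a coin twice
S = {"".join(p) for p in product("HT", repeat=2)}

A = {s for s in S if s.count("H") >= 1}   # at least one head
B = {s for s in S if s.count("H") == 2}   # two heads
C = {s for s in S if s.count("T") == 2}   # two tails

events = {"A": A, "B": B, "C": C}

for name, event in events.items():
    kind = "elementary" if len(event) == 1 else "compound"
    print(name, sorted(event), kind)

# Two events are mutually exclusive when their intersection is empty
names = sorted(events)
exclusive_pairs = [
    (x, y)
    for i, x in enumerate(names)
    for y in names[i + 1:]
    if events[x].isdisjoint(events[y])
]
print("Mutually exclusive pairs:", exclusive_pairs)
```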
Classifications of Probability: Interpretation of probability (Method of assigning probability)
There are two basic views or interpretations of probability: objective and subjective. The objective perspective requires a
random process that is repeatable under similar conditions (regularity), like the flip of a coin or the roll of a die. Subjectivists believe
probability is linked to one's state of mind with regard to knowledge about the event in question. Subjectivists constantly
update their beliefs as new knowledge arrives.
Objective probability
Objectivists subscribe to
• the classical (a priori) approach ==> the axiomatic (mathematical basis) approach to probability
• the relative frequency (empirical or a posteriori) approach


I. Classical (Mathematical) Probability: Suppose there are N equally likely possible outcomes in the sample space S of an
experiment. If, out of these N outcomes, only n are favorable to the event E, then the probability that the event E will
occur is
$P(E) = \frac{n(E)}{n(S)} = \frac{n}{N}$, that is, $P(E) = \frac{\text{number of ways } E \text{ can occur}}{\text{total number of sample points}}$
Definition: Suppose a random experiment with N equally likely outcomes is conducted and, out of those,
$N_A$ outcomes are favorable to the occurrence of the event A. Then the probability of the event A, denoted by P(A),
is defined as:
$P(A) = \frac{N_A}{N}$
This, in other words, means that given a finite sample space S of a random experiment with equally likely
outcomes and an event A, then
$P(A) = \frac{n(A)}{n(S)}$

Example 1: Let a fair coin be tossed two times. Find the probability of getting at least one tail.
Solution: First identify the sample space, say S.
• The sample space for this experiment is S = {HH, HT, TH, TT}, so n(S) = 4.
• Let A = an event consisting of at least one tail; hence A = {HT, TH, TT}, so n(A) = 3.
Since the sample space is finite and its outcomes are equally likely, we can use the classical approach as
follows to find P(A):
$P(A) = \frac{n(A)}{n(S)} = \frac{3}{4} = 0.75$
Example 2: A fair die is rolled once. What is the probability of getting:
a) Number 4?
b) An odd number?
c) Number 8?
Solutions: First identify the sample space, say S.
$S = \{1, 2, 3, 4, 5, 6\}$, so $N = n(S) = 6$.
a) Let A be the event of number 4.
$A = \{4\}$, so $N_A = n(A) = 1$ and $P(A) = \frac{n(A)}{n(S)} = \frac{1}{6}$
b) Let A be the event of odd numbers.
$A = \{1, 3, 5\}$, so $N_A = n(A) = 3$ and $P(A) = \frac{n(A)}{n(S)} = \frac{3}{6} = 0.5$
c) Let A be the event of number 8.
$A = \emptyset$, so $N_A = n(A) = 0$ and $P(A) = \frac{n(A)}{n(S)} = \frac{0}{6} = 0$
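
The classical definition translates directly into a small function. The Python sketch below assumes a finite sample space with equally likely outcomes, as the definition requires; the name `classical_probability` is hypothetical, and `Fraction` is used so that results stay exact.

```python
from fractions import Fraction


def classical_probability(event, sample_space):
    """P(A) = n(A) / n(S), assuming equally likely outcomes."""
    sample_space = set(sample_space)
    favorable = set(event) & sample_space   # outcomes outside S cannot occur
    return Fraction(len(favorable), len(sample_space))


S = {1, 2, 3, 4, 5, 6}
p_four = classical_probability({4}, S)
p_odd = classical_probability({1, 3, 5}, S)
p_eight = classical_probability({8}, S)

print(p_four, p_odd, p_eight)
```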
Example 3: A family has three children. What is the probability that:
a. At most one of them is a boy?
b. All are girls?
c. None of them is a girl?
d. At least two of them are boys?


Solution: First, let us determine the sample space of this random experiment, i.e., the possible sexes of the three
children. Let "F" denote a female child and "M" a male child.
Hence, the sample space for the sexes of the children in this family is
S = {FFF, FFM, FMF, MFF, FMM, MFM, MMF, MMM}, which implies n(S) = 8. Thus,
a. If A is the event that at most one of the children is a male, then A = {FFF, FFM, FMF, MFF},
and therefore, by the above definition, its probability is
$P(A) = \frac{n(A)}{n(S)} = \frac{4}{8} = 0.5$
b. If B is the event that all children in the family are females, then B = {FFF}, and
therefore, by the above definition,
$P(B) = \frac{n(B)}{n(S)} = \frac{1}{8}$
c. Similarly, if C is the event that none of the children is a female, then all of the children
are males, so C = {MMM}, and therefore
$P(C) = \frac{n(C)}{n(S)} = \frac{1}{8}$
d. Let D be the event that at least two of the children are males. Then D = {FMM, MFM, MMF, MMM},
and therefore
$P(D) = \frac{n(D)}{n(S)} = \frac{4}{8} = 0.5$
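
The same answers can be reproduced by enumerating the sample space. This is a minimal sketch that assumes, as the solution does, that the eight sequences are equally likely; the helper `prob` is illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space: sex of each of the three children, F or M
S = ["".join(p) for p in product("FM", repeat=3)]


def prob(condition):
    """Classical probability of the outcomes in S that satisfy `condition`."""
    favorable = [s for s in S if condition(s)]
    return Fraction(len(favorable), len(S))


p_at_most_one_boy = prob(lambda s: s.count("M") <= 1)
p_all_girls = prob(lambda s: s.count("F") == 3)
p_no_girls = prob(lambda s: s.count("F") == 0)
p_at_least_two_boys = prob(lambda s: s.count("M") >= 2)

print(p_at_most_one_boy, p_all_girls, p_no_girls, p_at_least_two_boys)
```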
Do you think that the approach is applicable for any event? No, because
it has its own limitations. These are:
1. It is not applicable if the outcomes are not equally likely.
2. It is applicable only for a finite sample space.
Exercise
1. What is the probability of getting number 6 in rolling a die? Answer: P(getting number 6) = 1/6
2. What is the probability of getting two heads in tossing two coins? Answer: P(HH) = 0.25
3. Two dice are rolled. Describe the sample space. What is the probability of getting
i. A sum of 10 or more.
ii. A pair in which at least one number is 3.
iii. A sum of 8, 9 or 10.
iv. One number less than 4.

II. The Relative frequency/empirical approach
A second interpretation of probability is called the relative frequency concept of probability; this is an
empirical approach to probability. If an experiment is repeated a large number of times and event E occurs
30% of the time, then .30 should be a very good approximation to the probability of event E. Symbolically,
if an experiment is conducted n different times and if event E occurs on $n_e$ of these trials, then the
probability of event E is approximately
$P(\text{event } E) \approx \frac{n_e}{n}$
Given a frequency distribution, the probability of an event being in a given class is $P(E) = \frac{f}{\sum f}$, where f
is the class frequency and $\sum f = n$ is the total number of observations.
The difference between classical and empirical probability is that the former uses the sample space to
determine the numerical probability while the latter is based on a frequency distribution.
$P(A) \approx \frac{\text{number of occurrences of } A}{\text{total number of trials}} = \frac{\text{number of successes}}{\text{total number of trials}}$
We say “approximate” because we think of the actual probability P(event E) as the relative frequency of
the occurrence of event E over a very large number of observations or repetitions of the phenomenon.
The fact that we can check probabilities that have a relative frequency interpretation (by simulating many
repetitions of the experiment) makes this interpretation very appealing and practical. This approach rests
on the law of large numbers, i.e., on trials performed a large number of times. Suppose A is an event of
an experiment which can be repeated a large number of times. If $n_A$ is the number of occurrences of the
event A in n repetitions of the experiment, then
i. $n_A$, the number of occurrences of the event A, is called the frequency of A, and
ii. $f_n(A) = \frac{n_A}{n}$ is called the relative frequency of the event A.
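
The idea of checking a probability by simulating many repetitions can be sketched in Python as follows. The example simulates tosses of a fair coin with the standard-library pseudo-random generator; the function name `relative_frequency` and the fixed seed are assumptions made for reproducibility, not part of the text.

```python
import random


def relative_frequency(n_trials, seed=0):
    """Relative frequency of heads in n_trials simulated fair-coin tosses."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_trials))
    return heads / n_trials


# The relative frequency should settle near 0.5 as the number of trials grows
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, relative_frequency(n))
```

For small n the relative frequency can be far from 0.5; for large n it is typically close, which is why the empirical probability is described as an approximation.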
Example: In a sample of 50 people, 21 had type O blood, 22 had type A blood, 5 had type B blood, and 2
had type AB blood. Set up a frequency distribution and find the following probabilities.
a) A person has type O blood.
b) A person has type A or type B blood.
c) A person has neither type A nor type O blood.
d) A person does not have type AB blood.
Solution:
Type Frequency
A 22
B 5
AB 2
O 21

a) P(O) = 21/50 = 0.42

b) P(A or B) = 22/50 + 5/50 = 27/50 = 0.54

c) P(neither A nor O) = 5/50 + 2/50 = 7/50 = 0.14 (neither A nor O means a person has either type B or type AB blood)

d) P(not AB) = 1 - P(AB) = 1 - 2/50 = 48/50 = 0.96
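
These empirical probabilities can be computed directly from the frequency table. The following is a small sketch in Python; the helper `p` is a hypothetical name and `Fraction` keeps the results exact.

```python
from fractions import Fraction

blood_counts = {"A": 22, "B": 5, "AB": 2, "O": 21}
n = sum(blood_counts.values())   # total number of observations


def p(*types):
    """Empirical probability that a person has one of the given blood types."""
    return Fraction(sum(blood_counts[t] for t in types), n)


p_o = p("O")
p_a_or_b = p("A", "B")
p_neither_a_nor_o = p("B", "AB")   # neither A nor O means B or AB
p_not_ab = 1 - p("AB")

for label, value in [("P(O)", p_o), ("P(A or B)", p_a_or_b),
                     ("P(neither A nor O)", p_neither_a_nor_o),
                     ("P(not AB)", p_not_ab)]:
    print(label, "=", value, "=", float(value))
```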
Exercise: Given the following frequency distribution.
Grade A B C D F
No of students 10 20 50 15 5
What is the probability of selecting a student who scored B?
III. Subjective or personal Probability

A probability that is determined based on an individual's own judgment, experience, information, and
belief is called a subjective probability.
In everyday life we say or hear statements such as
• "Probably Hana will miss the bus"
• "The likelihood that Arsenal will win the next match is high"
• "There is a small probability that the rain will come tonight" etc.
Such statements can be made more precise by saying:
• "The probability that Hana will miss the bus is 70%"
• "The likelihood that Arsenal will win the next match is 84%"
• "There is a 10% probability that the rain will come tonight" etc.
Here, 70%, 84%, etc. measure one's belief, guess or estimate of the occurrence of the event and are called
subjective probabilities, in the sense that different persons may assign probabilities other than those
stated. Therefore, in this approach, the probability values of events are usually based on
the common sense, educated guess, judgment or opinion of a person or group.
Definition If A is an event (or a statement), the subjective probability of A is
P(A) = Degree of belief that A is true

Subjective probability assigns a probability based on an educated guess, experience or evaluation of a
problem. For example, a physician might say that on the basis of his/her diagnosis, there is a 30% chance
the patient will need an operation.
In this view, the probability of A is treated as a quantifiable level of belief ranging from 0 (complete
disbelief) to 1 (complete belief). For instance, an experienced physician may say "this patient has a 50%
chance of recovery." Presumably, this is based on an understanding of the relative frequency of what might
occur in similar cases. Although this view of probability is subjective, it permits a constructive way of
dealing with uncertainty.

Properties (Rules) of Probability

1. The probability of an event E is in between 0 and 1 inclusive. It can never be negative or greater than
one. 0≤ P(E) ≤1.
P(E)=0, means it is sure that E can never happen.
P(E)=1, means the event E is certain to occur (E occurs surely).
Ex: What is the probability of getting:
a. Number 9 in rolling a die?
b. A number less than 7 in rolling a die?
2. If the probability that an event E will occur is P(E), then the probability that this event will not
occur is P(E’), where P(E’)=1-P(E).
3. The sum of the probabilities of each outcome in the sample space S is 1 i.e. ∑Pi=1.
Eg: Rolling a die
Outcome 1 2 3 4 5 6
Probability 1/6 1/6 1/6 1/6 1/6 1/6
∑Pi=1/6+1/6+1/6+1/6+1/6+1/6=1
4. If there are two events E1 and E2, the probability that at least one of these events will occur is the
sum of the probability that each event will occur minus the probability that both events will occur at the
same time (simultaneously).
P(E1 ∪ E2) = P(E1) + P(E2) - P(E1 ∩ E2)
Ex: A part-time student is taking two courses, namely Economics and Statistics. The probability that the
student will pass the economics course is 0.60 and the probability of passing the statistics course is 0.70. The
probability that the student will pass both courses is 0.50. Find the probability that the student
a. Will pass at least one course.
b. Will fail both courses.

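Solution
Let Ec = passing the economics course
Sc = passing the statistics course
• P(Ec) = 0.6
P(Sc) = 0.7
P(Ec ∩ Sc) = 0.5
• P(Ec ∪ Sc) = P(Ec) + P(Sc) - P(Ec ∩ Sc) = 0.6 + 0.7 - 0.5 = 0.8
b. Will fail both courses. Failing both is the complement of passing at least one, so
P(fail both) = 1 - P(Ec ∪ Sc) = 1 - 0.8 = 0.2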
Counting Techniques
Counting techniques are mathematical methods used to determine the number of possible ways
of arranging or selecting objects. They are used to find the size of a sample space that is
too large to list. Example: What is the size of the sample space if a coin is tossed a large number of times,
say 20 or more? (For 20 tosses it is already 2^20 = 1,048,576.)
Objectives: Upon successful completion of this topic, the student will be able to:
 Distinguish the various counting rules and
 Use these counting techniques to evaluate probability
In order to calculate probabilities, we have to know
 The number of elements of an event
 The number of elements of the sample space.
That is in order to judge what is probable, we have to know what is possible.
In order to determine the number of outcomes, one can use several rules of counting.
Generally, we have the following types of counting techniques or methods:-
i. Fundamental counting principle, which encompasses
a) the Addition principle, and
b) the Multiplication Principle
ii. Permutation, and
iii. Combination
Addition Rule
Suppose there are k procedures (p1, p2, …,pk), in which the ith procedure can be done in ni; i = 1, 2, …,k ways.
Hence, the total number of ways of performing p1 or p2 or …or pk is n1+n2+…+nk, provided that no two procedures
can be performed at the same time or one after the other.
Example: Suppose that we are planning a trip and are deciding between bus and train transportation. If there are 2 bus routes and
3 train routes to go from A to B, how many routes are available for the trip?
Solution: We have two procedures, using either bus or train, hence k = 2.
• The first procedure (bus transportation) has 2 routes (ways), so n1 = 2.
• The second procedure (train transportation) has 3 routes (ways), so n2 = 3.
• Therefore, we have n1 + n2 = 2 + 3 = 5 possible routes for someone to go from city A to city B.
Addition rules are important in probability. These rules provide us with a way to calculate the probability
of the event "A or B", provided that we know the probability of A and the probability of B. Sometimes
the "or" is replaced by U, the symbol from set theory that denotes the union of two sets. The precise
addition rule to use is dependent upon whether event A and event B are mutually exclusive or not.
Addition Rule for Mutually Exclusive Events: If events A and B are mutually exclusive, then the probability
of A or B is the sum of the probability of A and the probability of B. We write this compactly as follows:
P(A or B) = P(A) + P(B)
The above formula can be generalized for situations where events may not necessarily be mutually exclusive. For
any two events A and B, the probability of A or B is the sum of the probability of A and the probability of B minus
the shared probability of both A and B:
P (A or B) = P(A) + P(B) - P(A and B)
Sometimes the word "and" is replaced by ∩, which is the symbol from set theory that denotes the intersection of two
sets.
The addition rule for mutually exclusive events is really a special case of the generalized rule. This is because
if A and B are mutually exclusive, then the probability of both A and B is zero.
When two events, A and B, are mutually exclusive, the probability that A or B will occur is the sum of the
probability of each event. P(A or B) = P(A) + P(B)

 Example: A single 6-sided die is rolled. What is the probability of rolling a 2 or a 5?
$P(2) = \frac{1}{6}$
$P(5) = \frac{1}{6}$
$P(2 \text{ or } 5) = P(2) + P(5) = \frac{1}{6} + \frac{1}{6} = \frac{2}{6} = \frac{1}{3}$
Example 2: You are going to roll two dice.
Find P(sum that is even or sum that is a multiple of 3).
The addition rule says we need to find P(even) + P(multiple of 3) - P(both).
In rolling two dice we have 36 possibilities: S = {(1, 1), (1, 2), ..., (6, 6)}.
For P(even), count the ways to roll a sum of 2, 4, 6, 8, 10, or 12.
P(even) = 18/36
For P(multiple of 3), count the ways to roll a sum of 3, 6, 9 or 12.
P(multiple of 3) = 12/36
P(both) is the overlap. Notice that sums of 6 and 12 occur in both lists and have been counted
twice. We need to subtract those out.
P(both) = 6/36
So P(sum that is even or a multiple of 3) = 18/36 + 12/36 - 6/36 = 24/36 = 2/3.
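
The counts in this example can be verified by enumerating all 36 ordered pairs. The sketch below assumes two fair, distinguishable dice; the helper `prob` is illustrative.

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))   # 36 ordered pairs (first die, second die)


def prob(condition):
    """Probability that the sum of the two dice satisfies `condition`."""
    favorable = sum(1 for a, b in S if condition(a + b))
    return Fraction(favorable, len(S))


p_even = prob(lambda total: total % 2 == 0)
p_mult3 = prob(lambda total: total % 3 == 0)
p_both = prob(lambda total: total % 6 == 0)    # even and a multiple of 3
p_either = prob(lambda total: total % 2 == 0 or total % 3 == 0)

print(p_even, p_mult3, p_both, p_either)
print(p_either == p_even + p_mult3 - p_both)   # the general addition rule
```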
Multiplication Rule
Suppose there is a sequence of k events, in which the ith event has ni (i = 1, 2, ..., k) possibilities. Then
the total number of possibilities of the whole sequence is n1 × n2 × ... × nk.
• In general, if a task consists of k steps or operations, of which the first, o1, can be made in n1
ways; for each of those, the second, o2, can be made in n2 ways; ...; and for each of those, the
kth operation, ok, can be made in nk ways, then the whole task or choice can be made in
$n_1 \times n_2 \times n_3 \times \cdots \times n_k = \prod_{i=1}^{k} n_i$ ways.
Example 1: An airline has 4 flights from A to B, and 2 flights from B to C per day. If the flights are to be made on
separate days, how many different ways of travelling from A to C can the airline offer?
Solution: We have k = 2.
• In operation 1 there are 4 flights from A to B, so n1 = 4.
• In operation 2 there are 2 flights from B to C, so n2 = 2.
• Altogether there are n1 × n2 = 4 × 2 = 8 possible ways to fly from A to C.
Example 2: Of the books in a college library, suppose 20 are chemistry, 25 are mathematics and 15 are physics.
In how many ways can you choose three books if you must take one of each kind?
Solution: Here, you have 3 operations or tasks, so k = 3:
• Choosing one book from 20 chemistry books. So, we have 20 different possible ways: n1 = 20.
• Choosing one book from 25 mathematics books. So, we have 25 different possible ways: n2 = 25.
• Choosing one book from 15 physics books. So, we have 15 different possible ways: n3 = 15.
• Therefore, using the multiplication principle, you will have n1 × n2 × n3 = 20 × 25 × 15 = 7,500
different ways to choose three books, one of each kind.
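
The multiplication principle can be illustrated in Python either by multiplying the counts or by listing the combined choices explicitly. In the sketch below the flight labels such as "AB1" are hypothetical placeholders, not data from the text.

```python
import math
from itertools import product

# Example 1: 4 flights from A to B, then 2 flights from B to C
ab_flights = ["AB1", "AB2", "AB3", "AB4"]   # hypothetical flight labels
bc_flights = ["BC1", "BC2"]
itineraries = list(product(ab_flights, bc_flights))
print(len(itineraries), "ways from A to C")

# Example 2: one chemistry, one mathematics and one physics book
book_choices = math.prod([20, 25, 15])
print(book_choices, "ways to choose the three books")
```

Listing the pairs with `itertools.product` is practical only for small cases; for larger ones, multiplying the counts with `math.prod` gives the same number without building the list.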
Example 3: Suppose that in a medical study patients are classified according to their blood type as A, B, AB, and
O; according to their Rh factor as + or -; and according to their blood pressure as high, normal or low. In
how many different ways can a patient be classified?
Solution: We have k = 3.
• The 1st classification is done in 4 ways, so n1 = 4,
• the 2nd classification is done in 2 ways, so n2 = 2, and
• the 3rd classification is done in 3 ways, so n3 = 3.
• Thus a patient can be classified in n1 × n2 × n3 = 4 × 2 × 3 = 24 different ways.
Permutation: is the arrangement or selection of objects in a specific order. The arrangement of n
distinct objects in a specific order using r objects at a time is called a permutation of n objects
taking r objects at a time, written nPr, where
$_nP_r = \frac{n!}{(n-r)!}, \quad 0 \le r \le n$
• It is an arrangement of objects (without repetition) in a definite order, i.e., with attention given to
the order of arrangement.
• In permutation, the order of the arrangement is very important.
• The arrangement of objects with attention given to the order of their arrangement is called a
permutation.
• It is an arrangement of all or part of a set of objects with regard to order.
Example 1: Suppose a photographer must arrange 3 people in a row for a photograph. In how many different ways
can the arrangement be done?
Solution: The photographer uses all 3 people and can arrange them in the following orders:
ABC, BCA, ACB, CAB, BAC, CBA.
Using the permutation formula, the photographer can arrange them in $_3P_3 = \frac{3!}{(3-3)!} = 3! = 6$ different ways.
Example 2: In how many ways can 6 persons be arranged (sit) in a row?
Solution: In $_6P_6 = \frac{6!}{(6-6)!} = 6! = 720$ ways.
Example 3: In how many ways can 9 books be arranged on a shelf having 4 places?
Solution: In $_9P_4 = \frac{9!}{(9-4)!} = 9 \times 8 \times 7 \times 6 = 3024$ ways.
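
These permutation counts can be checked with the Python standard library. The sketch assumes Python 3.8 or later, where `math.perm(n, r)` computes n!/(n-r)!; `itertools.permutations` lists the arrangements themselves.

```python
import math
from itertools import permutations

# Example 1: all orderings of three people
people = ["A", "B", "C"]
rows = ["".join(p) for p in permutations(people)]
print(rows)

n_photo = math.perm(3, 3)        # 3 people, all 3 used
n_row_of_six = math.perm(6, 6)   # 6 persons in a row
n_books = math.perm(9, 4)        # 9 books, 4 places on the shelf

print(n_photo, n_row_of_six, n_books)
```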
Combination: is the selection of objects without regard to order. Here, order
does not matter. The number of combinations of r objects selected from n objects is denoted by
nCr or $\binom{n}{r}$, where
$_nC_r = \binom{n}{r} = \frac{n!}{r!\,(n-r)!}, \quad 0 \le r \le n$
Note: The difference between permutation and combination is that in combination, the order of
objects being selected (arranged) is not important, but order matters in permutation.
Definition:
• A combination is a distinct selection or grouping of objects without regard to the order of the arrangement.
• The number of ways in which r objects can be selected from a set of n distinct objects, equivalently the
number of combinations of n distinct objects taken r at a time, where r ≤ n, is denoted by $\binom{n}{r}$,
$_nC_r$ or $C(n, r)$, read as "n combination r", and is defined as:
$\binom{n}{r} = {}_nC_r = \frac{n(n-1)(n-2)\cdots(n-r+1)}{r!} = \frac{n!}{r!\,(n-r)!} = \frac{{}_nP_r}{r!}$
Computational Hint: Note that
$\binom{a}{b} = \frac{a!}{b!\,(a-b)!} = \frac{a!}{(a-b)!\,b!} = \binom{a}{a-b}$
So, for example,
$\binom{5}{3} = \binom{5}{2}$ and $\binom{9}{4} = \binom{9}{5}$

Remark: When to use combinations?
1. Combinations are only applied when
• repetitions are not allowed, and
• order is not important.
2. $_nC_n = \frac{n!}{n!\,(n-n)!} = \frac{n!}{n!\,0!} = 1 = {}_nC_0$ and $_nC_r = \frac{n!}{r!\,(n-r)!} = \frac{n!}{(n-r)!\,r!} = {}_nC_{n-r}$
Example 1: Given the letters a, b, c and d, list the permutations and combinations for selecting 2 letters.
Solution:
Permutations: nPr = 4P2 = 12. These are:
ab, ba, ac, ca, ad, da, bc, cb, bd, db, cd, dc
Combinations: nCr = 4C2 = 6. These are:
ab, ac, ad, bc, bd, cd
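
The two lists can also be generated with `itertools`, which makes the difference between permutations (order matters) and combinations (order ignored) concrete. A minimal sketch:

```python
from itertools import combinations, permutations

letters = "abcd"

# Ordered selections of 2 letters: "ab" and "ba" are different
perms = ["".join(p) for p in permutations(letters, 2)]

# Unordered selections of 2 letters: "ab" and "ba" are the same
combs = ["".join(c) for c in combinations(letters, 2)]

print(len(perms), perms)
print(len(combs), combs)
```

Each combination of 2 letters corresponds to 2! = 2 permutations, which is why 4P2 is twice 4C2.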
Example 2: In how many ways can a committee of 5 people be chosen out of 9 people?
Solution: Since order is not important, we use combination with n = 9, r = 5:
$\binom{n}{r} = \frac{n!}{(n-r)!\,r!} = \frac{9!}{4!\,5!} = 126$ ways
Example 3: suppose the geneticist decides to do a single experiment that will utilize 3 stocks of
Drosophila at once and, therefore, order is not important. How many experimental protocols are
possible from his 5 available stocks?
Solution: n = 5, r = 3
5C3 = 10, so there are only 10 ways to design this new experiment because order is not
important.
Example 4: Out of 5 Chemists and 7 Statisticians, a committee consisting of 2 chemists and 3 statisticians is to be
formed. In how many ways can this be done? If
a. Any chemist and statistician can be included?
b. One particular statistician must be included in the committee?
c. Two particular chemists cannot be included in the committee?
Solution:
a. We select 2 chemists out of 5 in $_5C_2$ ways and 3 statisticians out of 7 in $_7C_3$ ways. Therefore there are $\binom{5}{2}\binom{7}{3} = 10 \times 35 = 350$ committees (ways).
b. One statistician is already taken, so 2 more are chosen from the remaining 6: $\binom{5}{2}\binom{1}{1}\binom{6}{2} = 10 \times 1 \times 15 = 150$ ways.
c. Two chemists out of 5 are out of the competition, so 2 are chosen from the remaining 3: $\binom{2}{0}\binom{3}{2}\binom{7}{3} = 1 \times 3 \times 35 = 105$ ways.
Exercises:
1. In how many different ways can a secretary, a president and a manager be selected from 5 persons?
2. A committee of 3 persons is to be selected from 5 persons. In how many different ways can this be
done?

3. A committee of 5 persons must be selected from 5 men and 8 women. How many ways can the
selection be done if there are at least 3 women in the committee?
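The committee counts in Example 4 above can be checked with `math.comb` from the Python standard library. This is only a sketch of the arithmetic; the variable names are assumptions introduced for illustration.

```python
from math import comb

chemists, statisticians = 5, 7

# a. Any 2 chemists and any 3 statisticians.
any_members = comb(chemists, 2) * comb(statisticians, 3)

# b. One particular statistician is fixed, so pick 2 more from the other 6.
one_fixed = comb(chemists, 2) * comb(statisticians - 1, 2)

# c. Two particular chemists are excluded, so pick 2 from the remaining 3.
two_excluded = comb(chemists - 2, 2) * comb(statisticians, 3)

print(any_members, one_fixed, two_excluded)
```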

Mutually Exclusive Events

Two events are said to be mutually exclusive if they cannot occur at the same time, that is, if the occurrence of one prevents the occurrence of the other. As a result, $P(A \cap B) = 0$.
- In everyday life, there are events that cannot happen at the same time. We call these mutually exclusive events.
=> Because mutually exclusive events cannot happen together, the probability that both events will
happen together is equal to zero.
- If A and B are mutually exclusive events, then P(A or B) = P(A) + P(B).
=> Because two mutually exclusive events cannot occur at the same time, they have no outcome in common, so
the probability that either event happens is the sum of the probabilities of the events A and B.

Conditional Probability
Many times the probability of an event will depend on external factors. Think about the probability that a high school student will go to college. We know that many factors, such as socio-economic status (SES), can affect this probability. Conditional probability lets us think about the probability of an event given that something else is true.
• We use the notation $P(A \mid B)$ to refer to the conditional probability that event A occurs given that B is true.
• For example, the probability that someone who is low in SES goes to college would be written as $P(\text{going to college} \mid \text{low SES})$.
Conditional Events: If the occurrence of one event affects the probability of occurrence of the other event, then the two events are conditional or dependent events.
Definition: Let A and B be two events in S. The conditional probability of an event A given that (or on the condition that) an event B has occurred, denoted by $P(A \mid B)$, is defined as
$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$, provided that $P(B) > 0$   (1)
Similarly, the conditional probability of B given A, denoted by $P(B \mid A)$, is defined as
$P(B \mid A) = \dfrac{P(A \cap B)}{P(A)}$, provided that $P(A) > 0$
Note: $P(A \mid B)$ satisfies all the basic axioms of probability, namely:
• $0 \le P(A \mid B) \le 1$
• $P(S \mid B) = 1$
• $P(A_1 \cup A_2 \mid B) = P(A_1 \mid B) + P(A_2 \mid B)$, provided that the events $A_1$ and $A_2$ are mutually exclusive.
Example 1: If the probability that a research project will be well planned is 0.60 and the probability that it will be
well planned and well executed is 0.54, what is the probability that it will be well executed given that it is well
planned?
Solution: Let A = the event that a research project will be well planned and B = the event that a research project will be well executed.
Given: $P(A) = 0.60$, $P(A \cap B) = 0.54$
Required: $P(B \mid A)$
$P(B \mid A) = \dfrac{P(A \cap B)}{P(A)} = \dfrac{0.54}{0.60} = 0.90$
Example 2: A bin contains 5 defective (that immediately fail when put in use), 10 partially defective (that fail after
a couple of hours of use), and 25 acceptable transistors. A transistor is chosen at random from the bin and put into
use. If it does not immediately fail, what is the probability it is acceptable?
Solution:
Let
• A: the selected transistor does not immediately fail (i.e. it is either partially defective or acceptable)
• B: the selected transistor is acceptable
Then, we have:
$P(A) = P(\text{partially defective} \cup \text{acceptable}) = \dfrac{10}{40} + \dfrac{25}{40} = \dfrac{35}{40}$
$P(A \cap B) = P((\text{partially defective} \cup \text{acceptable}) \cap \text{acceptable}) = P(\text{acceptable}) = \dfrac{25}{40}$
Thus
$P(B \mid A) = \dfrac{P(A \cap B)}{P(A)} = \dfrac{25/40}{35/40} = \dfrac{25}{35} = \dfrac{5}{7}$
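The transistor calculation can be repeated with exact fractions. The sketch below uses only the counts stated in Example 2; the dictionary keys and variable names are illustrative assumptions.

```python
from fractions import Fraction

# Counts of transistors in the bin, taken from Example 2.
counts = {"defective": 5, "partially defective": 10, "acceptable": 25}
total = sum(counts.values())

# A: does not fail immediately (partially defective or acceptable).
p_a = Fraction(counts["partially defective"] + counts["acceptable"], total)

# A and B: acceptable (every acceptable transistor also belongs to A).
p_a_and_b = Fraction(counts["acceptable"], total)

# Conditional probability P(B | A) = P(A and B) / P(A).
p_b_given_a = p_a_and_b / p_a
print(p_b_given_a)
```

The result equals 25/35 in lowest terms, which is the same as restricting the sample space to the 35 transistors that do not fail immediately.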
Note:
In case when all the outcomes are equally likely, it is sometimes easier to find conditional probabilities directly,
without having to apply equation (1). If we already know that B has happened, we need only to consider outcomes
in B, thus reducing our sample space to B. Then,
$P(A \mid B) = \dfrac{\text{Number of outcomes in } (A \cap B)}{\text{Number of outcomes in } B} = \dfrac{n(A \cap B)}{n(B)}$   (2)
Example 3: Suppose we toss a fair die once, so that S = {1, 2, 3, 4, 5, 6} is the sample space of the experiment. Let A = the event that the number 3 turns up, and B = the event that an odd number turns up. Find $P(A \mid B)$.
Solution:
$A = \{3\}$, $B = \{1, 3, 5\}$, $A \cap B = \{3\}$
Hence, by equation (2), $P(A \mid B) = \dfrac{n(A \cap B)}{n(B)} = \dfrac{1}{3}$
Example 4: Let A and B be two events of the sample space with $P(A \mid B) = 0.3$, $P(B \mid A) = 0.6$ and $P(A \cap B) = 0.3$. Find $P(A)$ and $P(B)$.
Solution:
$P(A \cap B) = P(B \mid A) \cdot P(A) \Rightarrow P(A) = \dfrac{P(A \cap B)}{P(B \mid A)} = \dfrac{0.3}{0.6} = 0.5$
$P(A \cap B) = P(A \mid B) \cdot P(B) \Rightarrow P(B) = \dfrac{P(A \cap B)}{P(A \mid B)} = \dfrac{0.3}{0.3} = 1$
Independent and Dependent Events
Two events A and B in the same sample space S are defined to be independent if the probability that one event occurs is not affected by whether the other event has or has not occurred, that is, $P(A \mid B) = P(A)$ and $P(B \mid A) = P(B)$. It then follows that two events A and B are independent if and only if
$P(A \cap B) = P(A) \cdot P(B)$
Rationale
According to the multiplication theorem of probability, we have:
$P(A \cap B) = P(A) \cdot P(B \mid A)$
Putting $P(B \mid A) = P(B)$, we obtain
$P(A \cap B) = P(A) \cdot P(B)$
The events A and B are defined to be DEPENDENT if $P(A \cap B) \ne P(A) \cdot P(B)$.
This means that the occurrence of one of the events in some way affects the probability of the occurrence of the other event. Speaking of independent events, it is to be emphasized that two independent events that both have nonzero probability can NEVER be mutually exclusive, because then $P(A \cap B) = P(A) \cdot P(B) > 0$.
• When A and B are independent:
• P(A and B) = $P(A \cap B)$ = P(A)P(B)
• P(A) = P(A | B)
• P(B) = P(B | A)
Example 1: A coin is flipped & a die is rolled. Find the probability of getting a head on the coin & a 6 on the die.
Solution:
$P(\text{head} \cap 6) = P(\text{head}) \cdot P(6) = \dfrac{1}{2} \cdot \dfrac{1}{6} = \dfrac{1}{12}$
Example 2: What is the probability that a female student will be selected at random from a class of seven males and
three female students, in each of the next two class meetings?
Solution: Define the events:
• A: the first student selected is a female
• B: the second student selected is a female
Since the selections at the two class meetings are independent,
$P(A \cap B) = P(A) \cdot P(B) = \dfrac{3}{10} \cdot \dfrac{3}{10} = \dfrac{9}{100} = 0.09$
Example 3: A box contains four black and six white balls. What is the probability of getting two black
balls in drawing one after the other under the following conditions?
a. The first ball drawn is not replaced
b. The first ball drawn is replaced
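Solution: Let A = the first ball drawn is black and B = the second ball drawn is black. Required: $P(A \cap B)$.
a. $P(A \cap B) = P(B \mid A) \cdot P(A) = \dfrac{3}{9} \cdot \dfrac{4}{10} = \dfrac{2}{15}$
b. $P(A \cap B) = P(A) \cdot P(B) = \dfrac{4}{10} \cdot \dfrac{4}{10} = \dfrac{4}{25}$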
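The difference between the two cases in Example 3 is whether the second draw depends on the first. A minimal sketch with exact fractions, using illustrative variable names:

```python
from fractions import Fraction

black, white = 4, 6
total = black + white

# a. Without replacement: the second draw sees one fewer black ball
#    and one fewer ball overall, so the draws are dependent.
p_without = Fraction(black, total) * Fraction(black - 1, total - 1)

# b. With replacement: the box is restored, so the draws are independent.
p_with = Fraction(black, total) * Fraction(black, total)

print(p_without, p_with)
```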
4.2. Probability Distributions
A random variable is a variable whose values are determined by chance or with some probability.
• It is a function or rule that assigns a numerical value to each simple event in a sample space.
• Given a random variable, say X, the set of all possible values that X can assume is called the range of X and is denoted by $R_X$.
Remark:
• Random variables are often designated by capital letters such as $X, Y, Z, \ldots$ or $X_1, X_2, X_3, \ldots$; the values of these random variables are denoted by lower-case letters, like $x, y, z, \ldots$ or $x_1, x_2, x_3, \ldots$
Types of random variable
Random variables can be classified into two categories based on their values or range spaces $R_X$: discrete and continuous random variables. To better understand the idea behind these types of variables, let us deal with them one by one.
Discrete random variable
Let X be a random variable.
• If the number of possible values of X (i.e. the size of its range $R_X$) is finite or countably infinite, we say X is a discrete random variable.
• It is a random variable that assumes only certain clearly separated values, such as whole numbers.
Examples of discrete random variables are:
1. The number of defective transistors out of 100 inspected ones.
2. The number of bugs in a computer program.
3. The number of heads in tossing coin three times.
Continuous random variable:
Let X be a random variable.
• If the random variable X can assume all values in some interval, say (c, d), where c and d are real numbers, then we say X is a continuous random variable.
• It can assume any value between two specific values.
Examples of continuous random variables include the following:
1. The decimal value between 1 and 2.
2. The time to assemble a product (e.g. a chair).

A probability distribution is a listing of all possible values of a random variable together with their corresponding probabilities.
• It is an assignment of probabilities to each distinct value of a discrete r.v. or to each interval of values of a continuous r.v.
• It is the distribution of all possible values of the r.v. and their related probabilities.
Discrete Probability Distribution

• The probability distribution of a discrete random variable is known as a probability function or probability mass function (pmf). It can be given as a graph, table, formula or other device that specifies the probability associated with each possible value the r.v. can assume.
In general, we can characterize the distribution of a discrete random variable in terms of its
• Probability mass function (pmf)
• Cumulative distribution function (cdf)

Probability mass function (pmf):

For a discrete random variable X, we define the probability mass function $P(x_i)$ of X by

$P(x_i) = P(X = x_i)$, satisfying the following conditions:

i. $P(x_i) \ge 0$ for all i (non-negative probability).

ii. $\sum_{i=1}^{\infty} P(x_i) = 1$

Note that if X is a discrete random variable taking integer values, then

$P(a < X < b) = \sum_{x=a+1}^{b-1} P(x)$,   $P(a \le X < b) = \sum_{x=a}^{b-1} P(x)$
$P(a < X \le b) = \sum_{x=a+1}^{b} P(x)$,   $P(a \le X \le b) = \sum_{x=a}^{b} P(x)$

Example 1: Toss a coin twice, and let X be the number of heads observed.
a. Construct the probability distribution of X.
b. Is P a legitimate probability function? (Verify that P(X) is a pmf.)
Solution: Let us construct it using a table.
S    HH   TH   HT   TT
X    2    1    1    0
Hence, X is a discrete random variable with finitely many possible values, $R_X = \{0, 1, 2\}$.
• $P(0) = P(X = 0) = P(TT) = 1/4 = 0.25$
• $P(1) = P(X = 1) = P(HT \text{ or } TH) = P(HT) + P(TH) = 1/4 + 1/4 = 0.5$
• $P(2) = P(X = 2) = P(HH) = 1/4 = 0.25$
a) Therefore the probability distribution of X in table form is as follows:
$x_i$       0      1     2
$P(x_i)$    0.25   0.5   0.25
b) To verify that $P(x_i)$ is a pmf we have to check the following two criteria:
i) $P(x_i) \ge 0$ for all i. From the table above, every $P(x_i) \ge 0$.
ii) $\sum_i P(x_i) = P(0) + P(1) + P(2) = 0.25 + 0.5 + 0.25 = 1$
Therefore P is a legitimate probability function.
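The table in Example 1 can also be built by listing the sample space and counting heads. The following sketch assumes a fair coin, so each of the four outcomes has probability 1/4; the names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space of two tosses: HH, HT, TH, TT (equally likely).
sample_space = list(product("HT", repeat=2))

# X = number of heads in each outcome; count how often each value occurs.
head_counts = Counter(outcome.count("H") for outcome in sample_space)

# pmf: P(X = x) = (number of outcomes with x heads) / (number of outcomes).
pmf = {x: Fraction(k, len(sample_space)) for x, k in sorted(head_counts.items())}

# The two pmf conditions: non-negative values that sum to 1.
is_pmf = all(p >= 0 for p in pmf.values()) and sum(pmf.values()) == 1
print(pmf, is_pmf)
```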

Exercises:
1. Construct a probability distribution for the number of heads observed in tossing a coin two times.
2. Construct a probability distribution for the number of heads observed in tossing a coin three times.
3. Construct a probability distribution for the number of girls if a family plans to have four children.

Continuous Probability Distribution

Probability distribution of continuous random variable (probability density function/pdf)

Lecture Note Basic Statistics Stat1011

The probability distribution of a continuous random variable is typically described by a density curve. The curve is
defined so that the probability of any event is equal to the area under the curve for the values that make up the event,
and the total area under the curve is equal to 1.

The graphical form of the probability distribution for a continuous random variable X is a smooth curve. This curve, a function of x, is denoted by the symbol f(x) and is variously called a probability density function (pdf), a frequency function, or a probability distribution.

We can characterize the distribution of a continuous random variable in terms of its

 Probability density function (p d f)
 Cumulative Distribution Function (c d f)

Probability Density Function (p d f):

The function f(x) is a probability density function (pdf) for the continuous random variable X, defined over the set of real numbers R, if
i. $f(x) \ge 0$ for all $x \in R$

ii. $\int_{-\infty}^{\infty} f(x)\,dx = 1$, i.e., the total area under the curve equals 1


Remark:
• $P(X = a) = P(a \le X \le a) = \int_a^a f(x)\,dx = 0$, so the probability at a point is zero for a continuous r.v.
• $P(X = a) = 0$ for any $a \in R$
• $P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$
• For the continuous case, $P(A) = 0$ does not imply that $A = \emptyset$.
• These results do not hold in general for discrete random variables.
Example 1: Pick a point x randomly between 0 and 1 such that
$f(x) = 1$ if $0 \le x \le 1$, and $f(x) = 0$ otherwise.
Then,
• $P(x_1 \le X \le x_2)$ = area under f(x) from $x_1$ to $x_2$
• $P(X = 1) = 0$
Example 2: Let X be a continuous random variable with pdf
$f(x) = 2x$ if $0 \le x \le 1$, and $f(x) = 0$ otherwise.
a) Verify that f is a probability density function (pdf).
b) Find $P(1/2 \le X \le 3/4)$.
Solution:
a) To verify this, f must satisfy the following two conditions:
i. $f(x) \ge 0$ for $0 \le x \le 1$ (and $f(x) = 0$ elsewhere)
ii. $\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{0} 0\,dx + \int_{0}^{1} 2x\,dx + \int_{1}^{\infty} 0\,dx = 2\left[\dfrac{x^2}{2}\right]_0^1 = 1$
Thus, it is a pdf. Or you can write simply $\int_{-\infty}^{\infty} f(x)\,dx = \int_0^1 f(x)\,dx = 1$.
b) $P(1/2 \le X \le 3/4) = \int_{1/2}^{3/4} f(x)\,dx = \int_{1/2}^{3/4} 2x\,dx = \left[x^2\right]_{1/2}^{3/4} = \dfrac{9}{16} - \dfrac{4}{16} = \dfrac{5}{16}$
Example 3: Suppose X is a continuous random variable with probability density function
$f(x) = cx^2$ if $0 < x < 2$, and $f(x) = 0$ elsewhere.
Determine
a) the value of the constant c
b) $P(0 < X < 1)$
c) $P(0 < X < 1 \mid 1/2 < X < 3/2)$
Solution:
a) Since f(x) is a pdf, $\int_{-\infty}^{\infty} f(x)\,dx = 1$. However, $\int_{-\infty}^{\infty} f(x)\,dx = \int_0^2 cx^2\,dx = \dfrac{8}{3}c = 1$, so $c = \dfrac{3}{8}$.
b) $P(0 < X < 1) = \int_0^1 f(x)\,dx = \int_0^1 \dfrac{3}{8}x^2\,dx = \dfrac{1}{8}$.
c) $P(0 < X < 1 \mid 1/2 < X < 3/2) = \dfrac{P(0 < X < 1 \text{ and } 1/2 < X < 3/2)}{P(1/2 < X < 3/2)} = \dfrac{\int_{1/2}^{1} \frac{3}{8}x^2\,dx}{\int_{1/2}^{3/2} \frac{3}{8}x^2\,dx} = \dfrac{7/64}{26/64} = \dfrac{7}{26}$.
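The integrals in Examples 2 and 3 above can be checked numerically. The sketch below uses a simple midpoint rule written for this purpose (`integrate` is a hypothetical helper, not a library function), so the results are approximations rather than exact values.

```python
def integrate(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def f2(x):
    # pdf of Example 2: 2x on [0, 1], zero elsewhere.
    return 2 * x if 0 <= x <= 1 else 0.0

def f3(x):
    # pdf of Example 3 with c = 3/8: (3/8) x^2 on (0, 2), zero elsewhere.
    return 3 / 8 * x * x if 0 < x < 2 else 0.0

area2 = integrate(f2, 0, 1)            # should be close to 1
prob2 = integrate(f2, 0.5, 0.75)       # close to 5/16
area3 = integrate(f3, 0, 2)            # close to 1
prob3b = integrate(f3, 0, 1)           # close to 1/8
# Conditional probability: the intersection of (0, 1) and (1/2, 3/2) is (1/2, 1).
prob3c = integrate(f3, 0.5, 1) / integrate(f3, 0.5, 1.5)  # close to 7/26

print(area2, prob2, area3, prob3b, prob3c)
```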

A continuous probability distribution is represented by the probability density function (pdf), which has the following characteristics. Suppose X is continuous on an interval [a, b].
i. $f(x) \ge 0$ for all $x \in [a, b]$
ii. $\int_a^b f(x)\,dx = 1$
iii. $P(c \le X \le d) = \int_c^d f(x)\,dx$ for any $a \le c \le d \le b$

Exercises
1. Show that the following functions are pdfs.
a. $f(x) = 1$ if $0 < x < 1$, and $f(x) = 0$ otherwise
b. $f(x) = e^{-x}$ if $x > 0$, and $f(x) = 0$ otherwise
2. Find the value of b for which the following function is a pdf: $f(x) = bx^2$ if $0 < x < 1$, and $f(x) = 0$ otherwise.


Expectations and Variance

Since probability distributions are the key to statistical inference, it is helpful to study some of their characteristics.
Two useful characteristics of a probability distribution are its expected value and its variance. Expected value is a
measure of the location of the distribution, while variance is a measure of its spread.
Expectation of random variable
The expected value (theoretical mean) of a random variable indicates its average or central value. It is a useful
summary value (a number) of the variable's distribution. Stating the expected value gives a general impression of the
behavior of some random variable without giving full details of its pmf (if it is discrete) or its pdf (if it is continuous).
For Discrete Random Variable
Let X be a discrete random variable with possible values $x_1, x_2, \ldots, x_n, \ldots$. Let $p(x_i) = P(X = x_i)$, $i = 1, 2, 3, \ldots, n, \ldots$ be its probability function. Then the expected value of X (also called the mean of X or the mathematical expectation of X), denoted by E(X), is defined as

$E(X) = \sum_{i=1}^{\infty} x_i\,p(x_i)$

provided the series converges absolutely, i.e., if $\sum_{i=1}^{\infty} |x_i|\,p(x_i) < \infty$. If this sum does not converge absolutely, then we say that X does not have an expected value.

The mean of a random variable X is known as the expected value of X, denoted by E(X):
$E(X) = \sum_x x\,P(x)$ if X is a discrete r.v., and $E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$ if X is a continuous r.v.
Example 1: Find the expected value of the random variable X defined as the number of heads in tossing a coin three times.
Solution: The sample space of this experiment is
S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}.
Let X = number of heads. This is a discrete random variable with finitely many possible values, i.e., $R_X = \{0, 1, 2, 3\}$. Moreover, the probability function of X is
$x_i$       0     1     2     3
$P(x_i)$    1/8   3/8   3/8   1/8
Therefore,
$E(X) = \sum_{i=1}^{4} x_i\,p(x_i) = 0(1/8) + 1(3/8) + 2(3/8) + 3(1/8) = 1.5$
That is, if the three tosses are repeated many times, the average number of heads per repetition will be close to the theoretical mean of 1.5.
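The expected value in Example 1 can be obtained by enumerating the eight outcomes. This is a minimal sketch assuming a fair coin; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=3))   # 8 equally likely outcomes
p_outcome = Fraction(1, len(outcomes))

# Build the pmf of X = number of heads.
pmf = {}
for outcome in outcomes:
    x = outcome.count("H")
    pmf[x] = pmf.get(x, Fraction(0)) + p_outcome

# E(X) = sum of x * P(X = x) over all possible values x.
expected = sum(x * p for x, p in pmf.items())
print(pmf, expected)
```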

Exercises
1. Find the mean number of heads observed in tossing a coin two times.
2. Find the average number of girls if a family plans to have four children.

For Continuous Random Variable

Let X be a continuous random variable with pdf f(x). The expected value of X is defined as

$E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$

Example 1: Let X be a continuous random variable with pdf
$f(x) = \dfrac{1}{(1500)^2}\,x$ if $0 \le x \le 1500$; $f(x) = -\dfrac{1}{(1500)^2}\,(x - 3000)$ if $1500 < x \le 3000$; and $f(x) = 0$ otherwise.
Find the expectation of X.
Solution: Since X is a continuous random variable, by the definition given above, $E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$. Thus,
$E(X) = \int_0^{1500} x \cdot \dfrac{1}{(1500)^2}\,x\,dx + \int_{1500}^{3000} x \cdot \dfrac{-1}{(1500)^2}\,(x - 3000)\,dx$
$= \dfrac{1}{(1500)^2}\int_0^{1500} x^2\,dx - \dfrac{1}{(1500)^2}\int_{1500}^{3000} (x^2 - 3000x)\,dx = 500 + 1000 = 1500$
Example 2: Determine the mean value of a continuous random variable X having each of the following probability density functions.
a. $f(x) = 2e^{-2x}$ if $x > 0$, and $f(x) = 0$ otherwise
b. $f(x) = 2(1 - x)$ if $0 < x < 1$, and $f(x) = 0$ elsewhere
Solution: By the definition given above, $E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$. Thus,
a. $E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx = \int_0^{\infty} x \cdot 2e^{-2x}\,dx = 2\int_0^{\infty} x e^{-2x}\,dx$
Now, using integration by parts with $u = x$ and $dv = e^{-2x}\,dx$, so that $du = dx$ and $v = -\frac{1}{2}e^{-2x}$, we have:
$E(X) = 2\left(\left[-\frac{1}{2}x e^{-2x}\right]_0^{\infty} + \frac{1}{2}\int_0^{\infty} e^{-2x}\,dx\right) = \left[-x e^{-2x} - \frac{1}{2}e^{-2x}\right]_0^{\infty} = (0 - 0) - \left(0 - \frac{1}{2}\right) = \frac{1}{2}$
b. $E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx = \int_0^1 2x(1 - x)\,dx = 2\int_0^1 (x - x^2)\,dx = 2\left[\dfrac{x^2}{2} - \dfrac{x^3}{3}\right]_0^1 = \dfrac{1}{3}$
Exercise: Find the mean of the following probability distributions.
a. $f(x) = 1$ if $0 < x < 1$, and $f(x) = 0$ otherwise
b. $f(x) = 3x^2$ if $0 < x < 1$, and $f(x) = 0$ otherwise
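The expected values found in Examples 1 and 2 for continuous random variables can be checked numerically. The sketch below approximates $\int x f(x)\,dx$ with a midpoint rule; `integrate` and `mean` are hypothetical helpers written for this illustration, and the infinite upper limit in the exponential case is truncated at 40, where the remaining tail is negligible.

```python
import math

def integrate(g, a, b, n=200_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def mean(pdf, a, b):
    """E(X) = integral of x * f(x) over the interval where f is non-zero."""
    return integrate(lambda x: x * pdf(x), a, b)

def triangular(x):
    # pdf of Example 1: rises on [0, 1500], falls on (1500, 3000].
    if 0 <= x <= 1500:
        return x / 1500**2
    if 1500 < x <= 3000:
        return -(x - 3000) / 1500**2
    return 0.0

mean_triangular = mean(triangular, 0, 3000)                      # about 1500
mean_exponential = mean(lambda x: 2 * math.exp(-2 * x), 0, 40)   # about 1/2
mean_linear = mean(lambda x: 2 * (1 - x), 0, 1)                  # about 1/3

print(mean_triangular, mean_exponential, mean_linear)
```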
Properties of Expectation
Assume that the expected value of a random variable X exists.
Property 1: Let X = c, where c is a constant. Then E(X) = E(c) = c.
Property 2: Let X be a random variable and c be a constant. Then E(cX) = cE(X).
Property 3: Let X and Y be random variables. If Y = aX + b, where a and b are constants, then E(Y) = aE(X) + b.
Property 4: Let X and Y be any random variables. Then E(X + Y) = E(X) + E(Y).
Property 5: If X and Y are independent random variables, then E(XY) = E(X)E(Y).
Variance of Random Variable
In the previous chapter, we noted that the variance, denoted by $\sigma^2$, is used to summarize the dispersion in a data set. Here again we use it to summarize the variability in the values of a r.v.
Let X be a random variable (discrete or continuous). Then the variance of X, denoted by Var(X), V(X) or $\sigma_X^2$, is defined by the following general formula, valid for both discrete and continuous random variables:
$Var(X) = E[(X - E(X))^2] = E[(X - \mu)^2]$
The positive square root of Var(X) is called the standard deviation of X and is denoted by $\sigma_X$.
Theorem: $Var(X) = E(X^2) - [E(X)]^2$. This is the shortcut formula for the variance.

Properties of variance of random variables

Property 1: If C is any constant, then $Var(X + C) = Var(X)$
Property 2: If C is any constant, then $Var(CX) = C^2\,Var(X)$
Exercise: Find the variance of the random variable with pdf $f(x) = 1$ if $0 < x < 1$, and $f(x) = 0$ otherwise.
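To see that the definition of the variance and the shortcut formula agree, the sketch below evaluates both for the number of heads in three coin tosses, using the pmf 1/8, 3/8, 3/8, 1/8 from the expectation example. The helper `expectation` is a hypothetical name introduced here.

```python
from fractions import Fraction

# pmf of X = number of heads in three tosses of a fair coin.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete random variable given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

mu = expectation(pmf)

# Definition: Var(X) = E[(X - mu)^2]
var_definition = expectation(pmf, lambda x: (x - mu) ** 2)

# Shortcut: Var(X) = E(X^2) - [E(X)]^2
var_shortcut = expectation(pmf, lambda x: x**2) - mu**2

print(mu, var_definition, var_shortcut)
```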
Discrete probability distributions are used when the sample space is discrete. The following is a list of discrete probability distributions:
• Discrete uniform
• Binomial and multinomial
• Hypergeometric
• Negative binomial
• Geometric
• Poisson

Continuous probability distributions are used when the sample space is continuous. The following is a list of continuous probability distributions:
• Uniform
• Normal (or Gaussian)
• Gamma
• Beta
• t distribution
• F distribution
• $\chi^2$ distribution

Common Discrete Probability Distribution

The Binomial Distribution
The binomial distribution is one of the simplest and most frequently used discrete probability distributions and is very useful in many practical situations involving either/or types of events.
Properties of Binomial Experiment
1. Each trial has only two mutually exclusive outcomes, or outcomes that can be reduced to two. One of the outcomes is labeled Success and the other Failure.
2. The outcome of each trial is independent of the others.
3. The probability of Success remains the same from trial to trial.
4. The experiment (trial) is performed a fixed number of times, say n.
Let X be the number of successes. Then X follows a binomial distribution with parameters n (the number of trials performed) and p (the probability of success), and we write X ~ Bin(n, p).
A discrete random variable X is said to follow a binomial distribution with parameters n and p if its distribution is given by
$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$ for $x = 0, 1, 2, \ldots, n$, and $P(X = x) = 0$ elsewhere
Where
• n = number of trials (a whole number)
• x = number of successes
• (n − x) = number of failures
• p = probability of success, $0 \le p \le 1$
Notation: If the random variable X has a binomial distribution with parameters n and p, then we write X ~ Bin(n, p) or X ~ b(n, p), or simply b(x; n, p).
Theorem: Let X be a binomially distributed random variable with parameter p, based on n repetitions of an experiment. Then,
i. $\sum_{x=0}^{n} P(X = x) = 1$
ii. The expected value (mean) of X is given by $E(X) = \mu_X = np$
iii. The variance of X is given by $Var(X) = \sigma_X^2 = npq$, where $q = 1 - p$
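The three statements of the theorem can be checked for particular values of n and p. The following sketch implements the pmf directly from the formula using `math.comb`; the function name `binomial_pmf` and the choice n = 10, p = 0.5 are assumptions made for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p); zero outside 0..n."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.5
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(probs)                                            # should be 1
mean = sum(x * px for x, px in enumerate(probs))              # should be n * p
variance = sum((x - mean) ** 2 * px for x, px in enumerate(probs))  # n * p * q

print(total, mean, variance)
```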
Example 1: A coin is flipped 10 times. Each outcome is either a head or a tail. The variable X is the number of heads among those 10 flips, our count of "successes." On each flip, the probability of success, "head," is 0.5. The number X of heads among 10 flips has the binomial distribution Bin(n = 10, p = 0.5).
Example 2: A coin is tossed three times. What is
i. the probability of getting exactly two heads? Exactly three heads?
ii. the expected value and the variance of the number of heads?
Solution:
Method 1 (using the notion of classical probability)
The sample space of the random experiment is S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.
• If A is the event of getting exactly two heads, then A = {HHT, HTH, THH}. Thus, $P(A) = \dfrac{n(A)}{n(S)} = \dfrac{3}{8}$
• If B is the event of getting exactly three heads, then B = {HHH}. Thus, $P(B) = \dfrac{n(B)}{n(S)} = \dfrac{1}{8}$
Method 2 (using the notion of binomial distribution)
Here, it is given that
• n = 3 (the number of trials),
• p = 1/2 (probability of success) and
• q = 1 − p = 1/2 (probability of failure)
Then,
i. $P(\text{exactly 2 heads}) = P(X = 2) = \binom{3}{2} p^2 (1 - p)^{3-2} = \binom{3}{2}\left(\dfrac{1}{2}\right)^2\left(\dfrac{1}{2}\right)^1 = \dfrac{3}{8}$
$P(\text{exactly 3 heads}) = P(X = 3) = \binom{3}{3} p^3 (1 - p)^{3-3} = \binom{3}{3}\left(\dfrac{1}{2}\right)^3\left(\dfrac{1}{2}\right)^0 = \dfrac{1}{8}$
ii. $E(X) = np = 3 \cdot \dfrac{1}{2} = \dfrac{3}{2}$ and $Var(X) = npq = 3 \cdot \dfrac{1}{2} \cdot \dfrac{1}{2} = \dfrac{3}{4}$
Example 3: A multiple-choice test with 20 questions has five possible answers for each question. A completely unprepared student picks the answer for each question at random and independently. Then,
• Suppose X is the number of questions that the student answers correctly. We identify each question with a Bernoulli trial and a correct answer as a success. Since there are 20 questions and the student picks his answer at random from five choices, X ~ Bin(n, p) with n = 20 and p = 1/5 = 0.2.
• We can now answer any question we want about X.
For instance:
a. $P(\text{the student gets every answer wrong}) = P(X = 0) = \binom{20}{0}(0.2)^0(1 - 0.2)^{20} = 0.0115$
b. $P(\text{the student gets every answer right}) = P(X = 20) = \binom{20}{20}(0.2)^{20}(1 - 0.2)^0 = 1.05 \times 10^{-14}$
• Suppose the instructor has decided that it will take at least 13 correct answers to pass this test. Then,
c. $P(\text{the student will pass}) = \sum_{x=13}^{20} \binom{20}{x}(0.2)^x(1 - 0.2)^{20-x} = 0.000015$

Example 4: The probability that a seed will germinate is 0.3. What is the probability that out of 6 seeds
a. one b. two c. none d. at least two will germinate?
Solution: X = number of seeds that germinate, n = 6, p = 0.3, q = 0.7
a) $P(X = 1) = \binom{6}{1} p^1 q^{6-1} = \binom{6}{1}(0.3)^1(0.7)^5 = 0.3025$
b) $P(X = 2) = \binom{6}{2} p^2 q^{6-2} = \binom{6}{2}(0.3)^2(0.7)^4 = 0.3241$
c) $P(X = 0) = \binom{6}{0} p^0 q^{6-0} = \binom{6}{0}(0.3)^0(0.7)^6 = 0.11765$
d) $P(X \ge 2) = 1 - P(X < 2) = 1 - [P(X = 0) + P(X = 1)] = 1 - (0.11765 + 0.3025) = 0.58$
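The binomial probabilities in Examples 3 and 4 can be recomputed directly from the pmf formula. This is a sketch only; `binomial_pmf` is a helper defined here, not a library function.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Example 4: 6 seeds, each germinating with probability 0.3.
p_one = binomial_pmf(1, 6, 0.3)
p_two = binomial_pmf(2, 6, 0.3)
p_none = binomial_pmf(0, 6, 0.3)
p_at_least_two = 1 - (p_none + p_one)   # complement of "fewer than two"

# Example 3: 20 questions, guessing with p = 0.2, pass mark 13.
p_pass = sum(binomial_pmf(x, 20, 0.2) for x in range(13, 21))

print(round(p_one, 4), round(p_two, 4), round(p_none, 5), round(p_at_least_two, 2))
print(p_pass)
```

Using the complement in part d avoids summing the five probabilities P(X = 2) through P(X = 6).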
Exercises
1. Suppose a coin is tossed 10 times. What is the probability of getting
a. Exactly 3 heads b. At most 3 heads c. At least 3 heads

d. More than 3 heads e. No head
Find the average and variance of the number of heads.
2. The probability of a man kicking into the goal is 2/3. If a person kicks 5 times, what is the
probability of scoring
a. At least one goal.
b. At most 3 goals.
Find the average, variance and standard deviation of the number of goals.

Poisson distribution
The Poisson probability distribution is useful for determining the probability of a number of
occurrences over a given period of time or within a given area or volume. That is, the Poisson
random variable counts occurrences over a continuous interval of time or space.
Properties
1. The probability of success, p, is very small.
2. The experiment is performed indefinitely (n is very large).
3. The average number of events per unit of time (  ) is known.
Can you list some real situations that can be analyzed by Poisson distribution?
When an event occurs rarely, the number of occurrences of such an event may be assumed to follow a Poisson distribution. The following are some examples that can be analyzed using the Poisson distribution:
 The number of Bacteria in a given volume of water.
 The number of wars in a given year
 The number of telephone calls received at a telephone exchange in a given time interval.
 The number of defective articles in a packet of 200.
 The number of printing errors at each page of a book.
 The number of road accidents reported in a city per day.
 The number of deaths in a given period of time, etc.
A random variable X is said to have a Poisson distribution with parameter λ > 0 if its distribution is given by
$P(X = x) = \dfrac{e^{-\lambda}\lambda^x}{x!}$, $x = 0, 1, 2, \ldots$
• where X = the count of the number of events that occur in a certain time interval or spatial area (or in a Poisson process).
• We write X ~ Poiss(λ), where λ is the average number of occurrences of the event per unit length of interval or distance.
Theorem: If X has a Poisson distribution with parameter λ, then E(X) = λ and Var(X) = λ.
Example 1: Suppose the receptionist's phone rings on average four times an hour. Find
a) The probability of no calls in a randomly selected hour.
b) The probability of exactly 2 calls in an hour.
c) The probability of 3 or more calls in a period of 2 hours.
Solution:
a) Let X be the number of times the receptionist's phone rings per hour. Thus, we have λ = 4 and hence
$P(X = 0) = \dfrac{e^{-4} \cdot 4^0}{0!} = e^{-4} \approx 0.0183$
b) Let X be as defined in part (a). Then λ = 4, X = 2 and hence
$P(X = 2) = \dfrac{e^{-4} \cdot 4^2}{2!} \approx 0.1465$
c) Let X be the number of times the receptionist's phone rings per 2 hours. Hence, we have λ = 2 × 4 = 8 and
$P(X \ge 3) = 1 - P(X < 3)$
But $P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2) = \dfrac{e^{-8} \cdot 8^0}{0!} + \dfrac{e^{-8} \cdot 8^1}{1!} + \dfrac{e^{-8} \cdot 8^2}{2!} = 41e^{-8} \approx 0.014$
Thus $P(X \ge 3) = 1 - P(X < 3) = 1 - 0.014 = 0.986$
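The Poisson probabilities of Example 1 follow from the pmf $P(X = x) = e^{-\lambda}\lambda^x / x!$. The sketch below evaluates it with the standard `math` module; `poisson_pmf` is a helper written for this illustration.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)

# a) and b): calls per hour, lambda = 4.
p_no_calls = poisson_pmf(0, 4)
p_two_calls = poisson_pmf(2, 4)

# c): calls per two hours, lambda = 8; use the complement of X < 3.
p_three_or_more = 1 - sum(poisson_pmf(x, 8) for x in range(3))

print(round(p_no_calls, 4), round(p_two_calls, 4), round(p_three_or_more, 3))
```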
Example 2: On average, two bacteria occur in one cubic centimeter of water. What is the probability that 3 bacteria will occur in two cubic centimeters of water?
Solution: Let X be the number of bacteria per two cubic centimeters of water.
• X ~ Poiss(λ), in which case λ = 2 × 2 = 4, since if 2 bacteria occur on average in 1 cm³ then 4 will occur on average in 2 cm³. Thus, X ~ Poiss(λ = 4) and hence
$P(X = 3) = \dfrac{e^{-4} \cdot 4^3}{3!} \approx 0.1954$

Example 3: If a typist commits 3 typographic errors per page on average,

a. What is the probability that she will commit at most 3 errors on the following page?
b. What is the average number of errors per page?
c. What is the standard deviation of the number of errors per page?
Solution: Suppose X is the number of errors per page committed by the typist. Then X ~ Poiss(λ = 3).
A) $P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = \dfrac{e^{-3} \cdot 3^0}{0!} + \dfrac{e^{-3} \cdot 3^1}{1!} + \dfrac{e^{-3} \cdot 3^2}{2!} + \dfrac{e^{-3} \cdot 3^3}{3!} = e^{-3}\left(1 + 3 + \dfrac{9}{2} + \dfrac{27}{6}\right) = 13e^{-3} \approx 0.647$
B) $E(X) = \lambda = 3$
C) $Var(X) = \lambda = 3 \Rightarrow S.D.(X) = \sqrt{Var(X)} = \sqrt{\lambda} = \sqrt{3} \approx 1.732$

Continuous Probability Distributions: Normal Distribution

The most important and widely used probability distribution is normal distribution. It is
also known as Gaussian distribution. This distribution plays a very important and pivotal
role in statistical theory and practice, particularly in the area of statistical inference.

A normal distribution is
• A continuous, symmetric, bell-shaped distribution of a variable.
• It is not a single distribution but a family of distributions, each of which is determined by its mean and standard deviation.
• It is used to represent the probability distribution of a continuous random variable, such as the life expectancy of a product or the volume of a shipping container.
If a random variable X, with mean $\mu$ and variance $\sigma^2$, is normally distributed, its probability density function is given by
$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$, $-\infty < x < \infty$   (1.1)
Where:
• $\pi$ = a constant approximately equal to 3.1416 (roughly 22/7)
• e = the Napierian base, approximately 2.7183
• $\mu$ = population mean
• $\sigma$ = population standard deviation
• x = a given value of the r.v. in the range $-\infty < x < \infty$
• f(x) = pdf of the normal distribution
Note:
• Once the values of $\mu$ and $\sigma$ are known, one can draw the normal curve using the above formula.
• The location of the hump (or peak) is given by $\mu$, and the thickness of the symmetric tails (or the spread of the curve) is given by $\sigma$.
• The graph of the normal distribution is important because the area under the curve above a given interval represents the probability that a measurement will lie in that interval.
• When X has the density (1.1), we say that X is distributed as $N(\mu, \sigma^2)$, or simply $X \sim N(\mu, \sigma^2)$.
• The graph of the normal probability density function $f(x; \mu, \sigma^2)$ is the familiar bell-shaped curve known as the normal curve.
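To make the density formula concrete, the sketch below evaluates the normal pdf for a hypothetical mean of 10 and standard deviation of 2 (values chosen only for illustration) and checks numerically that the curve is symmetric about the mean and encloses an area of about 1.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

mu, sigma = 10.0, 2.0   # hypothetical parameters, not taken from the notes

# The peak is at x = mu and the curve is symmetric about mu.
peak = normal_pdf(mu, mu, sigma)
left = normal_pdf(mu - 1.5, mu, sigma)
right = normal_pdf(mu + 1.5, mu, sigma)

# Midpoint-rule area over mu +/- 8 sigma (the tails beyond are negligible).
n = 100_000
a, b = mu - 8 * sigma, mu + 8 * sigma
h = (b - a) / n
area = h * sum(normal_pdf(a + (i + 0.5) * h, mu, sigma) for i in range(n))

print(peak, left, right, area)
```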

Properties of normal distribution

• A normal distribution curve is bell-shaped.
• The mean, median and mode are equal and are located at the center of the distribution.
• A normal distribution curve is unimodal (i.e., it has only one mode).
• The curve is symmetric about the mean and reaches its maximum at \(x = \mu\).
• The curve is continuous (i.e., there are no gaps or holes).
• The curve never touches the X axis; it only approaches it in both directions.
• The area under the part of a normal curve that lies within 1 SD of the mean, \((\mu - \sigma, \mu + \sigma)\), is approximately 0.68 or 68%; within 2 SD, about 0.95 or 95%; and within 3 SD, about 0.997 or 99.7%.
• If several independent random variables are normally distributed, then their sum is also normally distributed.
• The total area under the density curve over \(-\infty < x < \infty\) is exactly 1. That is,

\[ \int_{-\infty}^{\infty} f(x)\, dx = 1 \]
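The 68%, 95% and 99.7% figures in the properties above can be checked numerically. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `area_within` and the values mean 50 and SD 8 are hypothetical choices for illustration.

```python
from statistics import NormalDist


def area_within(k, mu, sigma):
    """Area under the N(mu, sigma^2) curve within k SDs of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


# Hypothetical parameters; the areas do not depend on them.
within = {k: area_within(k, mu=50.0, sigma=8.0) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} SD of the mean: {area:.4f}")
```

Changing the mean or the SD in the call leaves the three areas unchanged, which is why a single table for the standard normal distribution is enough in practice.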

The standard normal distribution

Since each normally distributed variable has its own mean and SD, the shape and location of these curves will vary. In practical applications, one would therefore need a separate table of areas under the curve for each variable. To simplify this situation, statisticians use what is called the standard normal distribution.
The standard normal distribution is
• A normal distribution with a mean of 0 and an SD of 1.
• An important special case of the normal distribution: if we substitute \(\mu = 0\) and \(\sigma^{2} = 1\) into the pdf of the normal distribution (1.1), we get the pdf of the standard normal distribution as follows:

\[ f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^{2}}, \quad -\infty < x < \infty \tag{1.2} \]

• When \(X\) has the density (1.2), we say that \(X\) is distributed as \(N(0, 1)\), or simply \(X \sim N(0, 1)\).
How to convert a normal distribution to the standard normal distribution
Let \(X\) be a normal random variable. To standardize it, we use the following formula:

\[ z = \frac{x - \mu}{\sigma} \tag{1.3} \]

Here, \(z\) is the standardized form of the normal variable.

If we substitute \(\mu = 0\) and \(\sigma = 1\) into (1.3), we get \(z = x\); i.e., we can write equation (1.2) as:

\[ f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^{2}}, \quad -\infty < z < \infty \]

Where:
• \(z\) = standardized normal variable.
• The distribution of the standard normal variable is known as the standard normal distribution.
• Graphically, it is as shown in the figure below.

A normal distribution with \(\mu = 0\) and \(\sigma = 1\) is referred to as a standard normal distribution (and a random variable with this distribution is usually denoted \(Z\)).
Important result: If \(X\) is a random variable distributed as \(N(\mu, \sigma^{2})\), then

\[ Z = \frac{X - \mu}{\sigma} \sim N(0, 1) \]

The process of subtracting the mean and dividing by the standard deviation is referred to as standardisation:

General Normal: \(X \sim N(\mu, \sigma^{2})\)  →  \(z = \frac{x - \mu}{\sigma}\)  →  Standard Normal: \(Z \sim N(0, 1)\)
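Standardisation can be sketched in a few lines of Python. The function name `standardize` and the example distribution (mean 70, SD 10) are hypothetical and serve only to illustrate the formula; `statistics.NormalDist` is from the Python standard library.

```python
import math
from statistics import NormalDist


def standardize(x, mu, sigma):
    """Return z = (x - mu) / sigma for a value x of a normal variable."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Hypothetical example: X is normal with mean 70 and SD 10.
mu, sigma = 70.0, 10.0
z_values = [standardize(x, mu, sigma) for x in (50.0, 70.0, 85.0)]
print(z_values)  # [-2.0, 0.0, 1.5]

# Standardising does not change probabilities: P(X < 85) equals P(Z < 1.5).
p_x = NormalDist(mu, sigma).cdf(85.0)
p_z = NormalDist(0.0, 1.0).cdf(1.5)
print(math.isclose(p_x, p_z))  # True
```

A value equal to the mean always maps to z = 0, and each unit of z corresponds to one standard deviation away from the mean.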

Finding the Area under the Standard Normal Distribution Curve
For the solution of problems using the standard normal distribution, a two-step process is recommended:
Step 1. Draw the normal distribution curve and shade the area.
Step 2. Find the appropriate figure and follow the directions given.
Note:
1. The z-table gives only the area between 0 and a positive value of z, up to about 3.8 (or, by symmetry, between -3.8 and 0).
2. The area between 0 and +z equals the area between -z and 0.
3. The total area under the normal distribution curve is 1, which means that the area from \(-\infty\) to 0 is 0.5 and the area from 0 to \(\infty\) is 0.5.
Example 1: Find the area to the left of z = 2.06.
Solution:
Step 1. Draw the figure.

Step 2. We are looking for the area under the standard normal distribution to the left of z = 2.06. Look up the area in the table between 0 and 2.06; it is 0.4803. The area between \(-\infty\) and 0 is 0.5.
Therefore, the total area is 0.5 + 0.4803 = 0.9803. Hence, 98.03% of the area lies to the left of z = 2.06.

Example 2: Find the area between z = 1.68 and z = -1.37.

Solution
Step 1. Draw the figure as shown.

Step 2. Since the two z values lie on opposite sides of 0, find the area between 0 and 1.68 and the area between 0 and 1.37 (which, by symmetry, equals the area between -1.37 and 0), then add the two areas. The area for z between 0 and 1.68 is 0.4535, and the area for z between 0 and 1.37 is 0.4147. Adding the two areas gives 0.4535 + 0.4147 = 0.8682, or 86.82%.
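The two table look-ups above can be mirrored in code. The sketch below assumes the second example uses z values on opposite sides of 0 (-1.37 and 1.68), which is what adding the two table areas implies. The helper name `table_area` is hypothetical; it returns the area between 0 and z, the same quantity the z-table lists.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, SD 1


def table_area(z):
    """Area between 0 and |z|, the quantity listed in the z-table."""
    return Z.cdf(abs(z)) - 0.5


# Area to the left of z = 2.06: the left half (0.5) plus the area from 0 to 2.06.
left_of_206 = 0.5 + table_area(2.06)

# Area between z = -1.37 and z = 1.68: add the two areas measured from 0.
between = table_area(1.68) + table_area(-1.37)

print(f"{left_of_206:.4f} {between:.4f}")
```

Because the curve is symmetric, `table_area` takes the absolute value of z, just as one reads a negative z from the table by ignoring its sign.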

Computation of probabilities: Suppose \(a\) and \(b\) are any numbers with \(a < b\), and we want to compute \(P(a < X < b)\), where \(X \sim N(\mu, \sigma^{2})\). We use the principle of standardization as follows:

\[ P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{b - \mu}{\sigma}\right) = P\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right) \]
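The standardization step can be written as a small function. This is only a sketch: the names `phi` and `prob_between` are hypothetical, the standard normal cumulative probability is expressed through the error function `math.erf` instead of a table look-up, and the numbers in the example (mean 10, SD 2) are made up for illustration.

```python
import math


def phi(z):
    """P(Z <= z) for a standard normal Z, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_between(a, b, mu, sigma):
    """P(a < X < b) for a normal X, computed by standardizing a and b."""
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    return phi(z_b) - phi(z_a)


# Hypothetical example: X has mean 10 and SD 2, so
# P(7 < X < 13) = P(-1.5 < Z < 1.5).
p = prob_between(7.0, 13.0, 10.0, 2.0)
print(f"{p:.4f}")
```

With a printed table, the same probability would be obtained as twice the tabulated area between 0 and 1.5.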
Example 1: If \(X\) is a normal r.v. with parameters \(\mu = 3\) and \(\sigma^{2} = 9\), find
i. \(P(2 < X < 5)\)
ii. \(P(X > 0)\)
iii. \(P(X > 9)\)
Solution: First we have to standardize the r.v. \(X\). Here \(\sigma = \sqrt{9} = 3\).

i. \( P(2 < X < 5) = P\left(\frac{2 - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{5 - \mu}{\sigma}\right) = P\left(\frac{2 - 3}{3} < Z < \frac{5 - 3}{3}\right) = P(-0.33 < Z < 0.67) = 0.1293 + 0.2486 = 0.3779 \) (from table)

ii. \( P(X > 0) = P\left(\frac{X - \mu}{\sigma} > \frac{0 - 3}{3}\right) = P(Z > -1) = 0.5 + 0.3413 = 0.8413 \)

iii. \( P(X > 9) = P\left(\frac{X - \mu}{\sigma} > \frac{9 - 3}{3}\right) = P(Z > 2) = 0.5 - 0.4772 = 0.0228 \)
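The three probabilities for a normal variable with mean 3 and variance 9 can be cross-checked with `statistics.NormalDist` from the Python standard library. This is a sketch rather than part of the notes, and the variable names are illustrative. The first result comes out near 0.378 instead of exactly 0.3779, because the table method rounds the z values to two decimal places.

```python
from statistics import NormalDist

X = NormalDist(mu=3.0, sigma=3.0)  # variance 9, so the SD is 3

p_i = X.cdf(5.0) - X.cdf(2.0)   # P(2 < X < 5)
p_ii = 1.0 - X.cdf(0.0)         # P(X > 0)
p_iii = 1.0 - X.cdf(9.0)        # P(X > 9)

for label, p in (("i", p_i), ("ii", p_ii), ("iii", p_iii)):
    print(f"{label}. {p:.4f}")
```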

Example 2: The weekly income of salaried workers is normally distributed with mean $ 200 and variance 2500 (so the standard deviation is $ 50).
1. If an individual is chosen at random, what is the probability that his/her income is
i. At least $ 325
ii. Less than $ 100
iii. Between $ 150 and $ 300
2. If the number of employees is 300, how many of them have a salary between $ 150 and $ 300?
Solution
1. Let \(X\) be the weekly income of salaried workers. Then \(X \sim N(\mu, \sigma^{2}) = N(200, 2500)\), so \(\sigma = 50\).
i. \( P(X \ge 325) = P\left(\frac{X - \mu}{\sigma} \ge \frac{325 - \mu}{\sigma}\right) = P\left(Z \ge \frac{325 - 200}{50}\right) = P(Z \ge 2.5) = 0.5 - 0.4938 = 0.0062 \)
ii. \( P(X < 100) = P\left(\frac{X - \mu}{\sigma} < \frac{100 - \mu}{\sigma}\right) = P\left(Z < \frac{100 - 200}{50}\right) = P(Z < -2) = 0.5 - 0.4772 = 0.0228 \)
iii. \( P(150 < X < 300) = P\left(\frac{150 - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{300 - \mu}{\sigma}\right) = P\left(\frac{150 - 200}{50} < Z < \frac{300 - 200}{50}\right) = P(-1 < Z < 2) = 0.3413 + 0.4772 = 0.8185 \)
2. The number of employees with a salary between $ 150 and $ 300 is simply
\( 300 \times P(150 < X < 300) = 300 \times 0.8185 = 245.55 \approx 246 \)
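The income example can be reproduced in a few lines, including the final step of turning a probability into an expected head count. This is a sketch under the stated parameters (mean 200, variance 2500, 300 employees); the variable names are illustrative only.

```python
from statistics import NormalDist

# Variance 2500 means the SD is 50.
income = NormalDist(mu=200.0, sigma=2500.0 ** 0.5)

p_at_least_325 = 1.0 - income.cdf(325.0)               # P(X >= 325)
p_less_than_100 = income.cdf(100.0)                    # P(X < 100)
p_150_to_300 = income.cdf(300.0) - income.cdf(150.0)   # P(150 < X < 300)

# Expected number of employees, out of 300, in the 150 to 300 band.
n_employees = 300
expected_count = round(n_employees * p_150_to_300)

print(f"{p_at_least_325:.4f} {p_less_than_100:.4f} {p_150_to_300:.4f}")
print(expected_count)  # 246
```

For a continuous variable, "at least 325" and "more than 325" have the same probability, so no correction is needed at the boundary.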
Exercise
The IQ score of students is normally distributed with a mean of 120 and variance 400. What is the
probability that a student will have an IQ
a. Between 100 and 130.
b. Above 140.
c. Below 150.
d. Between 140 and 150


The z-Table
The following z-table gives the area under the z-curve (standard normal curve) between the vertical centre line at z = 0 and a given positive value of z. The row gives z to one decimal place, and the column gives the second decimal place.

For example, P(0 < z < 1.45) = 0.4265 (row 1.4, column 0.05).
The full z-table is given below.
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
3.1 0.4990 0.4991 0.4991 0.4991 0.4992 0.4992 0.4992 0.4992 0.4993 0.4993
3.2 0.4993 0.4993 0.4994 0.4994 0.4994 0.4994 0.4994 0.4995 0.4995 0.4995
3.3 0.4995 0.4995 0.4995 0.4996 0.4996 0.4996 0.4996 0.4996 0.4996 0.4997
3.4 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4998
3.5 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998
3.6 0.4998 0.4998 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999
3.7 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999
3.8 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999 0.4999
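Entries of a table like this one can be regenerated or spot-checked with a short script. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper names `table_entry` and `table_row` are hypothetical. A recomputed entry may occasionally differ from a printed one in the last digit because of rounding.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, SD 1


def table_entry(row, col):
    """Area between 0 and z = row + col, rounded to four decimals."""
    z = round(row + col, 2)
    return round(Z.cdf(z) - 0.5, 4)


def table_row(row):
    """The ten entries of one table row, for second decimals 0.00 to 0.09."""
    return [table_entry(row, c / 100) for c in range(10)]


print(table_entry(1.4, 0.05))  # area between 0 and 1.45
print(" ".join(f"{v:.4f}" for v in table_row(0.9)))
```

Calling `table_row` for each of 0.0, 0.1, and so on up to 3.8 would rebuild the whole table one row at a time.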
