
MBAStat Unit 1 Ques Ans

Statistics is the science of collecting, organizing, analyzing, and interpreting quantitative and qualitative data. There are four main components of statistics: data collection, presentation, analysis, and interpretation. Data can be primary, collected directly by researchers, or secondary, obtained from other sources. Common methods for primary data collection include surveys, interviews, questionnaires, observation, and document review.

Uploaded by

Sharma Gsr
Copyright
© All Rights Reserved

Introduction of Statistics

Dr Kamalanathan S
Assistant Professor of Mathematics
E.G.S. Pillay Engineering College (Autonomous), Nagapattinam

November 21, 2022


Statistics for Managers Introduction of Statistics

1.1 Introduction
Sir Ronald Aylmer Fisher was a British Statistician and Biologist. He was known as the Father of Modern
Statistics and Experimental Design. Fisher did experimental agricultural research, which saved millions from
starvation. He was awarded the Linnean Society of London’s prestigious Darwin-Wallace Medal in 1958.

Statistics is the science of collecting, organising, analysing and interpreting data in order to make decisions.
In everyday life, we come across a wide range of quantitative and qualitative information, and it has a profound
impact on our lives.

Data means the facts, mostly numerical, that are gathered; statistics implies collection of data. We analyse
the data to make decisions. The methods of statistics are tools to help us in this.

1.2 Components of Statistics


There are four main components of statistics that are also known as the process of statistics. Croxton and
Cowden defined these four components:

(i) Collection of data.

(ii) Presentation of data.

(iii) Analysis of data.

(iv) Interpretation of data.

1.3 Data Collection


Statistics is a tool for converting data into information.

Data collection is the process of gathering, measuring, and analyzing accurate data from a variety of relevant
sources to find answers to research problems, answer questions, evaluate outcomes, and forecast trends
and probabilities.

1.3.1 Types of Data


There are two types of data:
(i) Primary Data
(ii) Secondary Data

Primary Data Collection Method


When an investigator collects data personally, following a definite plan or design, the data is known
as primary data. Results derived from primary data are generally accurate because the researcher gathers
the information first-hand. The main disadvantage of primary data collection is that it is time-consuming
and expensive.

Primary or raw data is obtained directly from the first-hand source through experiments, surveys, or
observations. The primary data collection method is further classified into two types, given below:

Quantitative Data Collection Methods

Qualitative Data Collection Methods

Quantitative Data Collection Methods


The term ‘quantity’ refers to a specific number. Quantitative data collection methods express the data in
numbers, using traditional or online data collection methods. Once this data is collected, the results can be
calculated using statistical methods and mathematical tools. Quantitative data collection methods include
probability sampling, surveys and interviews.

Qualitative Data Collection Methods


The qualitative method does not involve any mathematical calculations. This method is closely connected with
elements that are not quantifiable. The qualitative data collection method includes several ways to collect this
type of data, and they are given below:

Interview Method
As the name suggests, data is collected through verbal conversation, by interviewing people in person, over
the telephone, or with a computer-aided system. This is one of the methods most often used by researchers.
A brief description of each type of interview is given below:

Personal or Face-to-Face Interview: In this type of interview, questions are put directly to the respondent
in person. The researcher can use an online survey form to take note of the answers.

Telephonic Interview: In this method, questions are asked over a telephone call. Data is collected directly
from people by recording their views or opinions.

Computer-Assisted Interview: The computer-assisted interview is the same as a personal interview, except
that the interviewer and the person being interviewed carry it out on a desktop or laptop. The data collected
is entered directly into a database, which makes the process quicker and easier and eliminates much of the
paperwork otherwise needed to record the data.

Questionnaire Method of Collecting Data


The questionnaire method consists of conducting a survey with a fixed set of quantitative research questions.
The questions are usually prepared with online survey-creation software, which also helps the survey appear
legitimate and trustworthy to respondents. Some types of questionnaire methods are given below:

Web-Based Questionnaire: The researcher sends a survey link to the selected respondents, who click on the
link to open the survey questionnaire. This method is cost-efficient and quick, and respondents can complete
the survey at a time convenient to them and on any device, which makes it flexible.

Mail-Based Questionnaire: Questionnaires are sent to the selected audience via email. Incentives are
sometimes offered for completing the survey, which attracts responses. The advantages of this method
are that the respondent’s name remains confidential to the researchers and that respondents have flexibility
in when they complete the survey.

Observation Method
As the word ‘observation’ suggests, data is collected by directly observing the subject of study, for example
by counting the number of people or the number of events in a particular time frame. The method is generally
effective in small-scale scenarios. The primary skill needed is observing carefully and recording the numbers
correctly. Structured observation is the type of observation method in which a researcher looks for certain
specific behaviours.

Document Review Method


The document review method is a data aggregation method used to collect data from existing documents with
data about the past. There are two types of documents from which we can collect data. They are given below:

Public Records: Data collected within an organisation, such as annual reports and sales information for
past months, are used for future analysis.

Personal Records: As the name suggests, documents about an individual, such as type of job, designation,
and interests, are taken into account.

Secondary Data Collection Method


Data that the investigator does not collect personally but instead obtains from published or unpublished
sources is secondary data. Secondary data is collected by an individual or an institution for one purpose and
is then used by someone else in another context. Although secondary data is cheaper to obtain, it raises
concerns about accuracy: because the data is second-hand, one cannot fully rely on the information being
authentic.

Secondary data is readily available and does not require any particular collection method. It is available in
the form of historical archives, government data, organisational records etc. It can be obtained directly from
the company or organisation where the research is being carried out, or from outside sources.

The internal sources of secondary data include company documents, financial statements, annual
reports, team member information, and reports received from customers or dealers. The external sources
include books, journals, magazines, the census taken by the government, and research information
available on the internet. The main advantage of this method is that the data is easy to collect, since it is
readily accessible.

Some Other Methods of Collecting Data

• Surveys

• Transactional Tracking

• Interviews and Focus Groups

• Observation

• Online Tracking

• Forms

• Social Media Monitoring

Data Sort Out

There are various ways to represent data after gathering it. The most popular method is to tabulate the
data using tally marks and then present the counts in a frequency distribution table. Tally marks are a
simple numeral system used for counting: each item is recorded as a vertical line, and every fifth item is
recorded as a cross line drawn over the previous four lines, so that each bundle stands for 5.

Example

Construction of a frequency distribution table for a jar containing pieces of bread of different colours.
[Figure not reproduced.]

When data are first collected, before they are edited or processed for use, they are known as raw data.
Raw data are of little direct use because they are too unwieldy for the human eye to analyse.

For example, study the marks obtained by 50 students in mathematics in an examination, given below:
61 60 44 49 31 60 79 62 39 51 67 65 43 54 51 42
52 43 46 40 60 63 72 46 34 55 76 55 30 67 44 57
62 50 65 58 25 35 54 59 43 46 58 58 56 59 59 45
42 44

In this data, it is not easy to locate the five highest marks or to find the rank of any particular mark.
Arranging the marks in an array makes the job simpler. With some difficulty it may be noted from the list
that 79 is the highest mark and 25 is the lowest. Using these, the data can be subdivided into convenient
classes, with each mark placed in the appropriate class.

Class Interval    Marks
25 - 30           30, 25
31 - 35           31, 34, 35
36 - 40           39, 40
41 - 45           44, 43, 42, 43, 44, 43, 45, 42, 44
46 - 50           49, 46, 46, 50, 46
51 - 55           51, 54, 51, 52, 55, 55, 54
56 - 60           60, 60, 60, 57, 58, 59, 58, 58, 56, 59, 59
61 - 65           61, 62, 65, 63, 62, 65
66 - 70           67, 67
71 - 75           72
76 - 80           79, 76
Total             50 marks

From this table can you answer the questions raised above? To answer a question such as “how many scored
below 56”, the actual marks are not needed, only “how many” there were. For such cases, which often
occur in a study, the table can be modified slightly to record just how many items there are in each
class. This gives a simpler and more useful arrangement, as in the table below.

Class Interval   25-30  31-35  36-40  41-45  46-50  51-55  56-60  61-65  66-70  71-75  76-80
No. of Items     2      3      2      9      5      7      11     6      2      1      2

For instance, the number of students who scored below 56 is 2 + 3 + 2 + 9 + 5 + 7 = 28.
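The grouping step above can also be sketched in code. The following is a minimal illustration in Python (not part of the original notes); the names `edges`, `class_of` and `below_56` are chosen here for illustration, and the class limits are taken from the table as inclusive on both ends.

```python
from collections import Counter

# The 50 marks listed in the example
marks = [
    61, 60, 44, 49, 31, 60, 79, 62, 39, 51, 67, 65, 43, 54, 51, 42,
    52, 43, 46, 40, 60, 63, 72, 46, 34, 55, 76, 55, 30, 67, 44, 57,
    62, 50, 65, 58, 25, 35, 54, 59, 43, 46, 58, 58, 56, 59, 59, 45,
    42, 44,
]

# Class limits as in the table: 25-30 first, then 31-35, 36-40, ..., 76-80
edges = [(25, 30)] + [(lo, lo + 4) for lo in range(31, 80, 5)]


def class_of(mark):
    """Return the (lower, upper) class containing the mark (inclusive limits)."""
    for lo, hi in edges:
        if lo <= mark <= hi:
            return (lo, hi)
    raise ValueError(f"mark {mark} falls outside every class")


freq = Counter(class_of(m) for m in marks)
counts = [freq[c] for c in edges]

# Students scoring below 56: every class whose upper limit is under 56
below_56 = sum(freq[(lo, hi)] for lo, hi in edges if hi < 56)

for (lo, hi), n in zip(edges, counts):
    print(f"{lo}-{hi}: {n}")
print("below 56:", below_56)
```

Counting with `Counter` plays the role of the tally marks: each mark adds one to its class.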

1.4 Measures of Central Tendency


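A measure of central tendency (also referred to as a measure of centre or central location) is a summary
measure that attempts to describe a whole set of data with a single value that represents the middle or centre
of its distribution.

There are three main measures of central tendency: the mode, the median and the mean. Each of these
measures gives a different indication of the typical or central value in the distribution.

Arithmetic Mean: The mean is the sum of the values of all observations in a dataset divided by the
number of observations. This is also known as the arithmetic average.

$$\bar{x} = \frac{\sum x}{n} \quad \text{(Raw data)}$$

$$\bar{x} = \frac{\sum fx}{\sum f} \quad \text{(Discrete frequency distribution)}$$

$$\bar{x} = A + \frac{\sum fd}{\sum f} \times C \quad \text{(Continuous frequency distribution)}$$

Here $x$ is the mid-value of a class, $A$ is an assumed mean, $C$ is the class width and $d = (x - A)/C$.
If the deviations are taken simply as $d = x - A$, the factor $C$ is dropped.

Median: The median is the middle value in a distribution when the values are arranged in ascending or
descending order.

Raw data:

$$\text{Median} = \left(\frac{n+1}{2}\right)\text{th item, if } n \text{ is odd}$$

and the median is the mean of the two values in the middle if $n$ is even.

Discrete frequency distribution: find the cumulative frequency just greater than $\frac{N+1}{2}$; the median is
the corresponding value of $x$.

Continuous frequency distribution:

$$\text{Median} = l + \frac{\frac{N}{2} - m}{f} \times C$$

where $N = \sum f$, $l$ is the lower limit of the median class, $m$ is the cumulative frequency of the class
preceding the median class, $f$ is the frequency of the median class and $C$ is the class width.

Mode: The mode is the most commonly occurring value in a distribution.

Mode = The value which occurs most frequently (Raw data)
Mode = The value of x corresponding to the maximum frequency (Discrete frequency distribution)

$$\text{Mode} = l + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times C \quad \text{(Continuous frequency distribution)}$$

where $l$ is the lower limit of the modal class, $f_1$ is its frequency, $f_0$ and $f_2$ are the frequencies of the
preceding and succeeding classes, and $C$ is the class width.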

Find the arithmetic mean of the frequency distribution.

x 1 2 3 4 5 6 7
f (x) 5 9 12 17 14 10 6

Solution

x f fx
1 5 5
2 9 18
3 12 36
4 17 68
5 14 70
6 10 60
7 6 42
Total 73 299
$$\text{Arithmetic mean, } \bar{x} = \frac{\sum fx}{\sum f} = \frac{299}{73} \approx 4.10$$
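As a cross-check of the calculation, the discrete-frequency mean can be computed in a few lines of Python. This is a minimal sketch; the function name `frequency_mean` is an illustrative choice, not something defined in the notes.

```python
def frequency_mean(values, freqs):
    """Mean of a discrete frequency distribution: sum(f * x) / sum(f)."""
    total_f = sum(freqs)
    total_fx = sum(f * x for x, f in zip(values, freqs))
    return total_fx / total_f


x = [1, 2, 3, 4, 5, 6, 7]
f = [5, 9, 12, 17, 14, 10, 6]

mean = frequency_mean(x, f)   # evaluates 299 / 73, about 4.096
print(round(mean, 2))
```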

Calculate the arithmetic mean of the marks from the following table

Marks             0 - 10   10 - 20   20 - 30   30 - 40   40 - 50   50 - 60
No. of Students   12       18        27        20        17        6

Solution

Marks     f     Mid x   fx
0 - 10    12    5       60
10 - 20   18    15      270
20 - 30   27    25      675
30 - 40   20    35      700
40 - 50   17    45      765
50 - 60   6     55      330
Total     100           2800

$$\text{Arithmetic mean, } \bar{x} = \frac{\sum fx}{\sum f} = \frac{2800}{100} = 28$$

Calculate the arithmetic mean of the marks from the following table.

x 1 2 3 4 5 6 7 8 9 10
f 48 65 43 31 57 37 60 59 49 77

Solution
x 1 2 3 4 5 6 7 8 9 10 Total
f 48 65 43 31 57 37 60 59 49 77 526
fx 48 130 129 124 285 222 420 472 441 770 3041
$$\text{Arithmetic mean, } \bar{x} = \frac{\sum fx}{\sum f} = \frac{3041}{526} \approx 5.78$$

Find the value of the median for the following data

Marks 10 23 18 38 65 92 40 58
No of Students 8 12 16 12 10 18 4 1

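Solution

The cumulative frequencies must be computed after arranging the marks in ascending order:

Marks                  10   18   23   38   40   58   65   92
No. of Students        8    16   12   12   4    1    10   18
Cumulative frequency   8    24   36   48   52   53   63   81

Here N = 81, so

$$\text{Median} = \left(\frac{N+1}{2}\right)\text{th item} = \frac{82}{2} = 41\text{st item}$$

The cumulative frequency just greater than 41 is 48, which corresponds to x = 38.
Median = 38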
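The procedure for a discrete frequency distribution can be sketched as follows. This is an illustrative Python version, with the hypothetical helper name `frequency_median`; it sorts the values first and then walks the cumulative frequencies until the $\frac{N+1}{2}$th position is reached (the comparison uses "at least", which is an assumption made here so that a position landing exactly on a cumulative frequency is handled).

```python
def frequency_median(values, freqs):
    """Median of a discrete frequency distribution via cumulative frequencies."""
    pairs = sorted(zip(values, freqs))   # arrange the values in ascending order
    n = sum(freqs)
    position = (n + 1) / 2               # the ((N + 1) / 2)th item
    cumulative = 0
    for x, f in pairs:
        cumulative += f
        if cumulative >= position:       # first class that contains the position
            return x
    raise ValueError("empty distribution")


marks = [10, 23, 18, 38, 65, 92, 40, 58]
students = [8, 12, 16, 12, 10, 18, 4, 1]

median = frequency_median(marks, students)
print(median)
```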

Find the value of the median for the following data

Wages 100-110 110-120 120-130 130-140 140-150


No. of workers 15 23 38 24 10

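Solution

Wages            100-110   110-120   120-130   130-140   140-150
No. of workers   15        23        38        24        10
cf               15        38        76        100       110

$$\text{Median} = l + \frac{\frac{N}{2} - m}{f} \times C$$

Here $N = 110$, so $\frac{N}{2} = \frac{110}{2} = 55$. The first cumulative frequency exceeding 55 is 76, so the
median class is 120-130, with $l = 120$, $C = 10$, $m = 38$, $f = 38$.

$$\text{Median} = 120 + \frac{55 - 38}{38} \times 10 \approx 124.47$$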
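The grouped-data median formula translates directly into code. The sketch below is illustrative only: `grouped_median` is a name chosen here, and it assumes equal class widths with classes given by their lower limits.

```python
def grouped_median(lower_limits, width, freqs):
    """Median of a continuous frequency distribution: l + (N/2 - m) / f * C."""
    half = sum(freqs) / 2
    cumulative = 0                       # m: cumulative frequency before the class
    for l, f in zip(lower_limits, freqs):
        if cumulative + f >= half:       # this is the median class
            return l + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("empty distribution")


wages = [100, 110, 120, 130, 140]        # lower class limits
workers = [15, 23, 38, 24, 10]

median = grouped_median(wages, 10, workers)
print(round(median, 2))
```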

Find the value of the Mode for the following data.

Marks 1 2 3 4 5 6 7 8
No of Students 4 9 16 25 22 15 7 3

Solution
Maximum frequency = 25

Mode = 4

Calculate the mode for the following data

Wages 0-10 10-20 20-30 30-40 40-50


No. of workers 10 14 19 17 13

Solution
Maximum frequency = 19, so the modal class is 20-30, with $l = 20$, $f_1 = 19$, $f_0 = 14$, $f_2 = 17$, $C = 10$.

$$\text{Mode} = l + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times C = 20 + \frac{19 - 14}{2(19) - 14 - 17} \times 10 = 20 + \frac{50}{7} \approx 27.14$$
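A short Python sketch of the grouped-mode formula is given below. The function name `grouped_mode` is illustrative, and treating a missing neighbouring class as having frequency 0 is an assumption made here for the edge case where the modal class is first or last.

```python
def grouped_mode(lower_limits, width, freqs):
    """Mode of a continuous frequency distribution:
    l + (f1 - f0) / (2*f1 - f0 - f2) * C."""
    i = max(range(len(freqs)), key=freqs.__getitem__)    # index of the modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                    # preceding class (assumed 0 if absent)
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0       # succeeding class (assumed 0 if absent)
    return lower_limits[i] + (f1 - f0) / (2 * f1 - f0 - f2) * width


wages = [0, 10, 20, 30, 40]              # lower class limits
workers = [10, 14, 19, 17, 13]

mode = grouped_mode(wages, 10, workers)
print(round(mode, 2))
```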

The ages of a sample of faculty members selected from the school of business administration are
shown below. Compute the average age, mode and median age.

Faculty   1    2    3    4    5    6    7    8
Age       42   30   73   50   51   37   42   54

Solution

$$\text{Arithmetic mean, } \bar{x} = \frac{\sum x}{n} = \frac{42 + 30 + 73 + 50 + 51 + 37 + 42 + 54}{8} = \frac{379}{8} \approx 47.38$$

Ascending order: 30, 37, 42, 42, 50, 51, 54, 73

$$\text{Median} = \left(\frac{N+1}{2}\right)\text{th item} = \frac{8+1}{2} = 4.5\text{th item}$$

Since N = 8 is even, the median is the mean of the 4th and 5th items:

$$\text{Median} = \frac{42 + 50}{2} = 46$$

Mode = most frequent value

Mode = 42
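For raw data such as these ages, Python's standard `statistics` module provides the three measures directly. The sketch below is only a cross-check; summing the eight listed ages gives 379, so the mean evaluates to 379 / 8.

```python
import statistics

ages = [42, 30, 73, 50, 51, 37, 42, 54]

mean_age = statistics.mean(ages)       # sum of the ages divided by 8
median_age = statistics.median(ages)   # mean of the two middle values (n is even)
mode_age = statistics.mode(ages)       # most frequent value

print(mean_age, median_age, mode_age)
```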

Calculate the mean, median and mode for the following data:

Marks 20-30 30-40 40-50 50-60 60-70


No. of students 3 5 20 10 5

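Solution
Assumed mean, A = 45

CI      f    Mid x   d = x − A = x − 45   fd    cf
20-30   3    25      -20                  -60   3
30-40   5    35      -10                  -50   8
40-50   20   45      0                    0     28
50-60   10   55      10                   100   38
60-70   5    65      20                   100   43
Total   43                                90

Because the deviations $d = x - A$ are not divided by the class width here, no factor $C$ is needed:

$$\text{Arithmetic mean, } \bar{x} = A + \frac{\sum fd}{\sum f} = 45 + \frac{90}{43} \approx 47.09$$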

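$$\text{Median} = l + \frac{\frac{N}{2} - m}{f} \times C$$

Total frequency N = 43. Since N is odd, the position $\frac{N+1}{2} = \frac{44}{2} = 22$ is used here in place of $\frac{N}{2}$.
The median class is 40-50, with $l = 40$, $C = 10$, $m = 8$, $f = 20$.

$$\text{Median} = 40 + \frac{22 - 8}{20} \times 10 = 47$$

(Using $\frac{N}{2} = 21.5$ directly in the formula gives 46.75.)

Maximum frequency = 20 = $f_1$, so the modal class is 40-50, with $l = 40$, $f_1 = 20$, $f_0 = 5$, $f_2 = 10$, $C = 10$.

$$\text{Mode} = l + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times C = 40 + \frac{20 - 5}{2(20) - 5 - 10} \times 10 = 46$$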

Calculate the mean, median and mode for the following data:

Marks 0-10 10-20 20-30 30-40 40-50 50-60 60-70 70-80 80-90 90-100
No. of students 3 5 7 10 12 15 14 4 2 8

Solution

Data     f    x    fx     cf
0-10     3    5    15     3
10-20    5    15   75     8
20-30    7    25   175    15
30-40    10   35   350    25
40-50    12   45   540    37
50-60    15   55   825    52
60-70    14   65   910    66
70-80    4    75   300    70
80-90    2    85   170    72
90-100   8    95   760    80
Total    80        4120

$$\text{Arithmetic mean, } \bar{x} = \frac{\sum fx}{\sum f} = \frac{4120}{80} = 51.5$$

$$\text{Median} = l + \frac{\frac{N}{2} - m}{f} \times C$$

Total frequency N = 80, which is even, so $\frac{N}{2} = \frac{80}{2} = 40$. The median class is 50-60, with
$l = 50$, $C = 10$, $m = 37$, $f = 15$.

$$\text{Median} = 50 + \frac{40 - 37}{15} \times 10 = 52$$

Maximum frequency = 15 = $f_1$, so the modal class is 50-60, with $l = 50$, $f_1 = 15$, $f_0 = 12$, $f_2 = 14$, $C = 10$.

$$\text{Mode} = l + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times C = 50 + \frac{15 - 12}{2(15) - 12 - 14} \times 10 = 57.5$$

1.5 Probability
A probability is a number that reflects the chance or likelihood that a particular event will occur. Probabilities
can be expressed as proportions that range from 0 to 1, and they can also be expressed as percentages ranging
from 0% to 100%.

$$\text{Probability of an event happening} = \frac{\text{Number of ways it can happen}}{\text{Total number of outcomes}}$$

Experiment: A repeatable procedure with a set of possible results.

Example 1. Throwing a die: We can throw the die again and again, so it is repeatable.
The set of possible results from any single throw is {1, 2, 3, 4, 5, 6}.

Outcome: A possible result of an experiment.

Example 2. Getting a “6”.

Sample Space: The set of all possible outcomes of an experiment. It is denoted by S.

Example 3. Choosing a card from a deck: There are 52 cards in a deck (not including Jokers),
so the sample space is all 52 possible cards: Ace of Hearts, 2 of Hearts, etc.

Sample Point: Just one of the possible outcomes.

Event: Any subset of the sample space S.

Example 4. An event can be just one outcome: getting a Tail when tossing a coin, or rolling a “5”.
An event can also include more than one outcome: choosing a “King” from a deck of cards (any of the 4 Kings),
or rolling an “even number” (2, 4 or 6).

$$\therefore \text{Probability, } P(E) = \frac{n(E)}{n(S)}$$

where n(E) = number of outcomes favourable to the event E and
n(S) = total number of outcomes in the sample space S, all outcomes being equally likely.

Axioms of Probability

(i) $P(A) \ge 0$ (Probability is a non-negative number)

(ii) $P(S) = 1$ (Probability of the whole sample space is unity)

(iii) If $A \cap B = \emptyset$, then $P(A \cup B) = P(A) + P(B)$

Independent Events
Two events A and B are independent, if and only if P (A ∩ B) = P (A) · P (B)

Addition of Probability
If A and B are two events in a probability experiment, then the probability that either one of the events will
occur is:
P (A ∪ B) = P (A) + P (B) − P (A ∩ B)

Conditional Probability
Conditional probability is used to determine how two events are related; that is, we can determine the
probability of one event given the occurrence of another related event.
The conditional probability of A given B is written $P(A|B)$, read as “the probability of A given B”, and is
calculated as

$$P(A|B) = \frac{P(A \cap B)}{P(B)}, \quad \text{where } P(B) \ne 0$$

Bayes’ Theorem
Let $B_1, B_2, \ldots, B_n$ be exhaustive and mutually exclusive events of a random experiment, and let $A$ be an
event related to the $B_i$. Then

$$P(B_i|A) = \frac{P(B_i)\,P(A|B_i)}{\sum_{j=1}^{n} P(B_j)\,P(A|B_j)}$$

If A and B are two independent events, then $P(A \cap B) = P(A)\,P(B)$.
Two events A and B are said to be mutually exclusive events if $A \cap B = \emptyset$.

If A and B are two events in S, then $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

A coin is tossed 3 times. Find the sample space and the probability of at least 2 heads.

Solution S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}

Total number of outcomes = 8
Outcomes with at least two heads = {HHH, HHT, HTH, THH}
Number of favourable outcomes = 4

$$p = \frac{4}{8} = 0.5$$
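The sample space for three tosses is small enough to enumerate by machine. The following Python sketch (variable names are illustrative) lists every outcome and counts those with at least two heads, assuming a fair coin so that all outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

# Every sequence of three tosses, e.g. "HHT"
sample_space = ["".join(t) for t in product("HT", repeat=3)]

# Event: at least two heads
favourable = [s for s in sample_space if s.count("H") >= 2]

p = Fraction(len(favourable), len(sample_space))
print(sample_space)
print(favourable, p)
```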

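What is the chance that a non-leap year selected at random will contain 53 Sundays?

Solution 1 year = 365 days = 52 weeks + 1 day

The 52 complete weeks contain 52 Sundays, so the year has 53 Sundays only if the one remaining day is a Sunday.
The remaining day can be any of S = {Sun, Mon, Tue, Wed, Thu, Fri, Sat}
Number of outcomes = 7
Number of favourable outcomes = 1

$$p = \frac{1}{7} \approx 0.14$$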

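What is the chance that a leap year selected at random will contain 53 Sundays?

Solution 1 year = 366 days = 52 weeks + 2 days

The 52 complete weeks contain 52 Sundays, so the year has 53 Sundays only if one of the two remaining
consecutive days is a Sunday.
S = {(Sun, Mon), (Mon, Tue), (Tue, Wed), (Wed, Thu), (Thu, Fri), (Fri, Sat), (Sat, Sun)}
Number of outcomes = 7
Favourable outcomes are (Sun, Mon) and (Sat, Sun), so the number of favourable outcomes = 2

$$p = \frac{2}{7} \approx 0.29$$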
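Both Sunday problems can be checked by brute force. The sketch below is illustrative Python, with the hypothetical function name `prob_53_sundays`; it assumes, as the worked solutions do, that the year is equally likely to begin on any of the seven weekdays.

```python
from fractions import Fraction


def prob_53_sundays(days_in_year):
    """Probability that a year of the given length contains 53 Sundays,
    assuming each starting weekday is equally likely (weekday 0 = Sunday)."""
    hits = 0
    for start in range(7):               # weekday of the first day of the year
        sundays = sum(1 for d in range(days_in_year) if (start + d) % 7 == 0)
        if sundays == 53:
            hits += 1
    return Fraction(hits, 7)


print(prob_53_sundays(365))   # non-leap year
print(prob_53_sundays(366))   # leap year
```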
From a set of 17 cards numbered 1, 2, 3, ..., 17, one is drawn at random. What is the chance that (i) its
number is a multiple of 3 or 7? (ii) its number is a multiple of 3 or 5 or both?

Solution S = {1, 2, 3, ..., 17}

Number of outcomes = 17
(i) Favourable cases = {3, 6, 9, 12, 15, 7, 14}
Number of favourable outcomes = 7

$$p = \frac{7}{17} \approx 0.41$$

(ii) Favourable cases = {3, 6, 9, 12, 15, 5, 10} (15 is a multiple of both 3 and 5 and is counted once)
Number of favourable outcomes = 7

$$p = \frac{7}{17} \approx 0.41$$
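A single card is drawn from a pack of cards. What is the probability that it is either a spade
or a king?

Solution Total no. of cards = 52

Let A be the event of drawing a spade and B be the event of drawing a king. Exactly one card, the king of
spades, belongs to both events.

$$P(A) = \frac{13}{52}, \quad P(B) = \frac{4}{52} \quad \text{and} \quad P(A \cap B) = \frac{1}{52}$$

$$P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{13}{52} + \frac{4}{52} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13}$$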

One person is known to hit a target in 3 out of 4 shots, whereas another person is known to hit the
target in 2 out of 3 shots. Find the probability of the target being hit at all when both persons try.
Solution
Let A be the event that the first shooter hits the target and B be the event that the second shooter hits it.
The two shots are independent, so $P(A \cap B) = P(A) \cdot P(B)$.

$$P(A) = \frac{3}{4} \quad \text{and} \quad P(B) = \frac{2}{3}$$

$$P(A \cup B) = P(A) + P(B) - P(A) \cdot P(B) = \frac{3}{4} + \frac{2}{3} - \frac{3}{4} \cdot \frac{2}{3} = \frac{3}{4} + \frac{2}{3} - \frac{1}{2} = \frac{11}{12}$$

A single card is drawn from a pack of 52 cards. What is the probability of drawing a red card
or a king?
Solution Total no. of cards = 52
Let A be the event of drawing a red card and B be the event of drawing a king. The two red kings belong to
both events.

$$P(A) = \frac{26}{52}; \quad P(B) = \frac{4}{52} \quad \text{and} \quad P(A \cap B) = \frac{2}{52}$$

$$P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{26}{52} + \frac{4}{52} - \frac{2}{52} = \frac{28}{52} = \frac{7}{13}$$
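The addition rule used in the card problems can be verified by building the deck explicitly. The sketch below is illustrative Python; the rank and suit labels are naming choices made here, and every card is assumed equally likely to be drawn.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))                      # 52 (rank, suit) pairs

red = {c for c in deck if c[1] in ("hearts", "diamonds")}
spades = {c for c in deck if c[1] == "spades"}
kings = {c for c in deck if c[0] == "K"}


def prob(event):
    return Fraction(len(event), len(deck))


# Direct count of the union ...
p_red_or_king = prob(red | kings)
p_spade_or_king = prob(spades | kings)

# ... agrees with P(A) + P(B) - P(A and B)
by_rule = prob(red) + prob(kings) - prob(red & kings)
print(p_red_or_king, p_spade_or_king, by_rule)
```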

If P (A|B) = 0.2, P (B) = 0.4, then find P (A ∩ B)


Solution
P (A ∩ B) = P (A|B) · P (B)
= 0.2 × 0.4
P (A ∩ B) = 0.08

A box contains 4 bad and 6 good tubelights. Two of them are drawn out from the box at a time. One
of them is tested and found to be good. What is the probability that the other one is also good?
Solution Let A be the event that the first tubelight chosen is good and B be the event that the second
tubelight chosen is good.
Total number of tubelights = 10; 4 bad and 6 good.

$$\text{To find } P(B|A) = \frac{P(A \cap B)}{P(A)}$$

$$P(A \cap B) = \frac{{}^{6}C_{2}}{{}^{10}C_{2}} = \frac{15}{45} = \frac{1}{3} \quad \text{and} \quad P(A) = \frac{{}^{6}C_{1}}{{}^{10}C_{1}} = \frac{3}{5}$$

$$P(B|A) = \frac{1/3}{3/5} = \frac{5}{9} \approx 0.56$$

A bag contains 3 red and 4 white balls. Two of them are drawn without replacement. What is
the probability that both the balls are red?
Solution Let A be the event of drawing a red ball first and B be the event of drawing a red ball second. The
balls are chosen without replacement.
Total number of balls = 7; 3 red and 4 white.
To find $P(A \cap B)$:

$$P(A) = \frac{{}^{3}C_{1}}{{}^{7}C_{1}} = \frac{3}{7} \quad \text{and} \quad P(B|A) = \frac{{}^{2}C_{1}}{{}^{6}C_{1}} = \frac{1}{3}$$

$$P(A \cap B) = P(B|A) \cdot P(A) = \frac{3}{7} \times \frac{1}{3} = \frac{1}{7}$$

A bag contains 5 white balls and 3 black balls. Two balls are drawn at random one after the
other without replacement. Find the probability that both balls drawn are black.
Solution Let A be the event of drawing a black ball first and B be the event of drawing a black ball second.
The balls are chosen without replacement.
Total number of balls = 8; 3 black and 5 white.
To find $P(A \cap B)$:

$$P(A) = \frac{{}^{3}C_{1}}{{}^{8}C_{1}} = \frac{3}{8} \quad \text{and} \quad P(B|A) = \frac{{}^{2}C_{1}}{{}^{7}C_{1}} = \frac{2}{7}$$

$$P(A \cap B) = P(B|A) \cdot P(A) = \frac{3}{8} \times \frac{2}{7} = \frac{3}{28}$$
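Drawing without replacement can be checked by listing every ordered pair of distinct balls. The Python sketch below is illustrative (the helper name `prob_both` is chosen here) and assumes each ordered pair is equally likely.

```python
from fractions import Fraction
from itertools import permutations


def prob_both(colour, bag):
    """Probability that two balls drawn in turn, without replacement,
    are both of the given colour."""
    draws = list(permutations(range(len(bag)), 2))   # ordered pairs of distinct balls
    hits = sum(1 for i, j in draws if bag[i] == colour and bag[j] == colour)
    return Fraction(hits, len(draws))


bag_black_white = ["black"] * 3 + ["white"] * 5
bag_red_white = ["red"] * 3 + ["white"] * 4

p_black = prob_both("black", bag_black_white)
p_red = prob_both("red", bag_red_white)
print(p_black, p_red)
```

The counts match the multiplication rule $P(A \cap B) = P(A)\,P(B|A)$: for the black balls, 3 × 2 favourable ordered pairs out of 8 × 7.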

In a bolt factory, machines A, B and C manufacture respectively 25%, 35% and 40% of the total
output. Of their outputs, 5%, 4% and 2% respectively are defective. A bolt is
drawn at random from the total output and is found to be defective. What is the probability
that it was manufactured by machine A, B or C?

Solution Let $A_1, A_2, A_3$ be the events that the bolt is manufactured by machines A, B, C respectively.

$$P(A_1) = \frac{25}{100} = 0.25, \quad P(A_2) = \frac{35}{100} = 0.35, \quad P(A_3) = \frac{40}{100} = 0.40$$

Let E be the event of drawing a defective bolt.

$$P(E|A_1) = \frac{5}{100} = 0.05, \quad P(E|A_2) = \frac{4}{100} = 0.04, \quad P(E|A_3) = \frac{2}{100} = 0.02$$

The probability that the defective bolt selected at random was manufactured by machine B is $P(A_2|E)$.
By Bayes’ theorem,

$$P(A_2|E) = \frac{P(A_2)\,P(E|A_2)}{\sum_{i=1}^{3} P(A_i)\,P(E|A_i)}$$

$$P(A_2|E) = \frac{0.35 \times 0.04}{(0.25 \times 0.05) + (0.35 \times 0.04) + (0.4 \times 0.02)} = \frac{0.014}{0.0345} \approx 0.406$$

Similarly,

$$P(A_1|E) = \frac{0.25 \times 0.05}{(0.25 \times 0.05) + (0.35 \times 0.04) + (0.4 \times 0.02)} = \frac{0.0125}{0.0345} \approx 0.362$$

$$P(A_3|E) = \frac{0.4 \times 0.02}{(0.25 \times 0.05) + (0.35 \times 0.04) + (0.4 \times 0.02)} = \frac{0.008}{0.0345} \approx 0.232$$
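Bayes' theorem for a finite set of causes is a short computation. The Python sketch below is illustrative (the function name `posteriors` is chosen here); it takes the prior probabilities $P(A_i)$ and the likelihoods $P(E|A_i)$ and returns the posterior probabilities $P(A_i|E)$.

```python
def posteriors(priors, likelihoods):
    """Posterior probabilities P(A_i | E) from priors P(A_i) and likelihoods P(E | A_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(A_i) * P(E | A_i)
    total = sum(joint)                                     # P(E), by total probability
    return [j / total for j in joint]


# Machines A, B, C: share of output and defect rate
bolt = posteriors([0.25, 0.35, 0.40], [0.05, 0.04, 0.02])
print([round(p, 3) for p in bolt])
```

Because the three posteriors share one denominator, they always sum to 1, which is a quick sanity check on hand calculations.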

A manufacturing firm produces steel pipes in 3 plants with daily production of 500, 1000 and
2000 units respectively. According to past experience, the fractions of defective output
are 0.005, 0.008 and 0.010 respectively. A pipe is selected from a day’s total production and found to be
defective. What is the probability that it comes from the 1st, 2nd or 3rd plant?

Solution Let A, B, C be the events that the pipe is manufactured by plants 1, 2, 3 respectively.

$$P(A) = \frac{500}{3500} \approx 0.1429, \quad P(B) = \frac{1000}{3500} \approx 0.2857, \quad P(C) = \frac{2000}{3500} \approx 0.5714$$

Let E be the event of drawing a defective pipe.

$$P(E|A) = 0.005, \quad P(E|B) = 0.008, \quad P(E|C) = 0.010$$

The probability that the defective pipe selected at random was manufactured by plant 1 is $P(A|E)$.
By Bayes’ theorem,

$$P(A|E) = \frac{P(A)\,P(E|A)}{P(A)\,P(E|A) + P(B)\,P(E|B) + P(C)\,P(E|C)}$$

$$P(A|E) = \frac{0.1429 \times 0.005}{(0.1429 \times 0.005) + (0.2857 \times 0.008) + (0.5714 \times 0.010)} \approx 0.0820$$

$$P(B|E) = \frac{0.2857 \times 0.008}{(0.1429 \times 0.005) + (0.2857 \times 0.008) + (0.5714 \times 0.010)} \approx 0.2623$$

$$P(C|E) = \frac{0.5714 \times 0.010}{(0.1429 \times 0.005) + (0.2857 \times 0.008) + (0.5714 \times 0.010)} \approx 0.6557$$

1.6 Random Variables


Random Variables: A real-valued function defined on the outcomes of a probability experiment is called a
random variable (i.e., it assigns a numerical value to each outcome of a particular experiment).

According to the range space, random variables are of two types:

(i) Discrete Random Variables

(ii) Continuous Random Variables

According to the number of variables, random variables are of two types:

(i) One Dimensional Random Variables

(ii) Two Dimensional Random Variables

Discrete Random Variables


A random variable X is said to be discrete if it takes a finite number of values or a countably infinite number
of values, i.e., its range $R_X$ is finite or countably infinite.
The values of a discrete random variable X can be listed as $x_1, x_2, x_3, \ldots$, i.e., $R_X = \{x_1, x_2, x_3, \ldots\}$, and
$R_X$ is also called the spectrum of X.

Probability function or Probability mass function


Let X be a discrete random variable which takes the values $x_1, x_2, x_3, \ldots$. Let $P(X = x_i) = p(x_i)$ be the
probability of $x_i$. Then the function p is called the probability mass function of X if the numbers $p(x_i)$
satisfy the conditions

(i) $p(x_i) \ge 0, \ \forall i = 1, 2, 3, \ldots$

(ii) $\sum_{i=1}^{\infty} p(x_i) = 1$

Note:

(i) If x1 < x2 < x3 < ... < xi < ..., then P (X ≤ xi ) = P (X = x1 ) + P (X = x2 ) + P (X = x3 ) + .... +
P (X = xi )

(ii) P (X > xi ) = 1 − P (X ≤ xi ) and

(iii) P (X < xi ) = 1 − P (X ≥ xi )

Continuous Random Variables


A random variable X is said to be continuous if its range space $R_X$ is an uncountable set of real numbers, i.e.,
the random variable assumes values in an interval (a, b) or in a union of intervals.

Probability density function of a continuous random variable


A function f, defined for all $x \in (-\infty, \infty)$, is called the probability density function of a continuous random
variable X, if

(i) $f(x) \ge 0, \ \forall x \in (-\infty, \infty)$

(ii) $\int_{-\infty}^{\infty} f(x)\,dx = 1$

Note:

(i) $P(X < a) = \int_{-\infty}^{a} f(x)\,dx$

(ii) $P(X > a) = \int_{a}^{\infty} f(x)\,dx$

(iii) $P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$

(iv) $P(a < X \le b) = P(a \le X < b) = P(a \le X \le b) = P(a < X < b)$

Expectation of a discrete random variable


Let X be a discrete random variable taking values $x_1, x_2, x_3, \ldots, x_n, \ldots$ with probabilities
$p(x_1), p(x_2), \ldots, p(x_n), \ldots$. Then the expected value of X is

$$E(X) = \sum_{i} x_i\,p(x_i)$$

Expectation of a continuous random variable


Let X be a continuous random variable with pdf f defined in $(-\infty, \infty)$, then the expected value of X is defined
as

$$E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$$

Note:

(i) $E(C) = C$, where C is a constant

(ii) $E(aX + b) = aE(X) + b$

(iii) $E(aX - b) = aE(X) - b$

(iv) $E(aX) = aE(X)$

Let X be a random variable with mean $E(X)$, then the variance of X is defined as $E(X^2) - [E(X)]^2$. It is
denoted by $Var(X)$ or $\sigma_X^2$.

Note:

(i) $Var(aX) = a^2\,Var(X)$

(ii) $Var(aX + b) = a^2\,Var(X)$


1.6.1 Moment Generating Function



The moment generating function of a random variable X is defined as $E(e^{tX})$ for all $t \in (-\infty, \infty)$ for
which the expectation exists. It is denoted by $M(t)$ or $M_X(t)$.

(a) If X is a discrete random variable with values $x_1, x_2, x_3, \ldots$ and probability function $p(x_i)$, then

$$M_X(t) = \sum_{i} e^{t x_i}\,p(x_i)$$

(b) If X is a continuous random variable with pdf $f(x)$, $x \in (-\infty, \infty)$, then

$$M_X(t) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx$$

If $X_1, X_2, X_3, \ldots, X_n$ are n independent random variables, then

$$M_{X_1 + X_2 + \ldots + X_n}(t) = M_{X_1}(t) \cdot M_{X_2}(t) \cdot \ldots \cdot M_{X_n}(t)$$
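Find the moment generating function of the discrete distribution whose probability function is
given by $P(X = x) = \frac{1}{2^x}$, $x = 1, 2, 3, \ldots$. Find the mean and variance of the distribution. Also find
(i) P(X = even); (ii) P(X = odd); (iii) P(X ≥ 4); (iv) P(X is a multiple of 3)

Solution Given $P(X = x) = \frac{1}{2^x}$, $x = 1, 2, 3, \ldots$

$$M_X(t) = \sum_{x=1}^{\infty} e^{tx}\,P(X = x) = \sum_{x=1}^{\infty} e^{tx}\,\frac{1}{2^x} = \sum_{x=1}^{\infty} \left(\frac{e^t}{2}\right)^x$$

$$= \frac{e^t}{2} + \left(\frac{e^t}{2}\right)^2 + \left(\frac{e^t}{2}\right)^3 + \ldots$$

$$= \frac{e^t}{2}\left[1 - \frac{e^t}{2}\right]^{-1} = \frac{e^t}{2}\left[\frac{2 - e^t}{2}\right]^{-1}$$

$$M_X(t) = \frac{e^t}{2 - e^t}, \quad e^t < 2$$

Differentiating, $M_X'(t) = \frac{2e^t}{(2 - e^t)^2}$ and $M_X''(t) = \frac{2e^t(2 + e^t)}{(2 - e^t)^3}$, so

$$E(X) = M_X'(0) = 2, \quad E(X^2) = M_X''(0) = 6, \quad Var(X) = 6 - 2^2 = 2$$

(i) P(X = even) = P(X = 2) + P(X = 4) + P(X = 6) + ...

$$= \frac{1}{2^2} + \frac{1}{2^4} + \frac{1}{2^6} + \ldots = \frac{1}{2^2}\left[1 + \frac{1}{2^2} + \frac{1}{2^4} + \ldots\right] = \frac{1}{2^2}\left[\frac{1}{1 - \frac{1}{2^2}}\right]$$

$$P(X = \text{even}) = \frac{1}{3}$$

(ii) $P(X = \text{odd}) = 1 - P(X = \text{even}) = 1 - \frac{1}{3}$

$$P(X = \text{odd}) = \frac{2}{3}$$

(iii) P(X ≥ 4) = 1 − P(X < 4)
= 1 − [P(X = 1) + P(X = 2) + P(X = 3)]

$$= 1 - \left[\frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3}\right] = 1 - \frac{4 + 2 + 1}{8}$$

$$P(X \ge 4) = \frac{1}{8}$$

(iv) P(X is a multiple of 3) = P(X = 3) + P(X = 6) + P(X = 9) + ...

$$= \frac{1}{2^3} + \frac{1}{2^6} + \frac{1}{2^9} + \ldots = \frac{1}{2^3}\left[1 + \frac{1}{2^3} + \frac{1}{(2^3)^2} + \ldots\right] = \frac{1}{8}\left[\frac{1}{1 - \frac{1}{8}}\right]$$

$$P(X \text{ is a multiple of } 3) = \frac{1}{7}$$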
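The series results for $P(X = x) = 1/2^x$ can be checked numerically by truncating the infinite sums. The Python sketch below is illustrative only; the cut-off `TERMS = 200` is an arbitrary choice made here (the neglected tail is far below floating-point precision), and the closed form compared against is $e^t/(2 - e^t)$, which is what the geometric series starting at $x = 1$ sums to for $e^t < 2$.

```python
import math

TERMS = 200                      # truncation point for the infinite sums (assumption)
xs = range(1, TERMS + 1)


def pmf(x):
    return 0.5 ** x              # P(X = x) = 1 / 2^x


def mgf(t):
    """Truncated series for E(e^{tX}); converges only when e^t < 2."""
    return sum(math.exp(t * x) * pmf(x) for x in xs)


total = sum(pmf(x) for x in xs)
mean = sum(x * pmf(x) for x in xs)
variance = sum(x * x * pmf(x) for x in xs) - mean ** 2

p_even = sum(pmf(x) for x in xs if x % 2 == 0)
p_odd = 1 - p_even
p_ge_4 = 1 - (pmf(1) + pmf(2) + pmf(3))
p_mult_3 = sum(pmf(x) for x in xs if x % 3 == 0)

print(total, mean, variance)
print(p_even, p_odd, p_ge_4, p_mult_3)
print(mgf(0.1), math.exp(0.1) / (2 - math.exp(0.1)))
```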

A random variable X has the following probability function

X 0 1 2 3 4 5 6 7 8
p(x) a 3a 5a 7a 9a 11a 13a 15a 17a

Determine: (i) the value of ’a’; (ii) find P (X < 3), P (X ≥ 3), P (0 < X < 3); (iii) find the distribu-
tion function of X

P8
Solution We know that i=1 p (xi ) = 1
1
P (0) + P (1) + P (2) + ...P (8) = 81a = 1 ⇒ a =
81

X 0 1 2 3 4 5 6 7 8
1 3 5 7 9 11 13 15 17
p(x)
81 81 81 81 81 81 81 81 81

9
P (X < 3) = P (0) + P (1) + P (2) =
81
9 72
P (X ≥ 3) = 1 − P (X < 3) = 1 − =
81 81
24
P (0 < X < 3) = P (1) + P (2) + P (3) + P (4) =
81
Distribution function of X: $F(x) = P(X \le x)$
\[
\begin{aligned}
F(0) &= P(0) = \frac{1}{81} \\
F(1) &= P(X \le 1) = P(0) + P(1) = \frac{1}{81} + \frac{3}{81} = \frac{4}{81} \\
F(2) &= P(X \le 2) = F(1) + P(2) = \frac{4}{81} + \frac{5}{81} = \frac{9}{81} \\
F(3) &= P(X \le 3) = F(2) + P(3) = \frac{9}{81} + \frac{7}{81} = \frac{16}{81} \\
F(4) &= P(X \le 4) = F(3) + P(4) = \frac{16}{81} + \frac{9}{81} = \frac{25}{81} \\
F(5) &= P(X \le 5) = F(4) + P(5) = \frac{25}{81} + \frac{11}{81} = \frac{36}{81} \\
F(6) &= P(X \le 6) = F(5) + P(6) = \frac{36}{81} + \frac{13}{81} = \frac{49}{81} \\
F(7) &= P(X \le 7) = F(6) + P(7) = \frac{49}{81} + \frac{15}{81} = \frac{64}{81} \\
F(8) &= P(X \le 8) = F(7) + P(8) = \frac{64}{81} + \frac{17}{81} = \frac{81}{81} = 1
\end{aligned}
\]
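
The normalisation step and the running sums that build the distribution function can be sketched as follows. This is a minimal illustration using exact fractions; the variable names are hypothetical, and `p_between` is computed for the event 0 < X < 3 exactly as worded in the question.

```python
from fractions import Fraction
from itertools import accumulate

xs = range(9)                        # support 0, 1, ..., 8
weights = [2 * x + 1 for x in xs]    # multiples of a: 1, 3, 5, ..., 17
a = Fraction(1, sum(weights))        # normalisation: 81a = 1
pmf = [w * a for w in weights]
cdf = list(accumulate(pmf))          # F(x) = P(X <= x)

p_lt_3 = sum(pmf[:3])                # P(X < 3)
p_ge_3 = 1 - p_lt_3                  # P(X >= 3)
p_between = sum(pmf[1:3])            # P(0 < X < 3) = P(1) + P(2)

for x, F in zip(xs, cdf):
    print(x, F)                      # fractions print in lowest terms
```

Because the weights are consecutive odd numbers, the running sums are perfect squares over 81, so F(x) = (x + 1)^2 / 81.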

A random variable X has the following probability function

x      0    1    2     3     4     5      6       7
p(x)   0    k    2k    2k    3k    k^2    2k^2    7k^2 + k

Find: (i) the value of k; (ii) P(X < 6), P(X ≥ 6) and P(0 < X < 5); (iii) the minimum value of a such that $P(X \le a) > \frac{1}{2}$; (iv) the distribution function of X.
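Solution (i) We know that $\sum_{i=0}^{7} p(x_i) = 1$.
\[
P(0) + P(1) + P(2) + \cdots + P(7) = 10k^2 + 9k = 1
\]
\[
\Rightarrow 10k^2 + 9k - 1 = 0 \;\Rightarrow\; (10k - 1)(k + 1) = 0
\]
By solving, we get $k = \frac{1}{10}$ or $k = -1$; since p(x) cannot be negative,
\[
k = \frac{1}{10}
\]

(ii)
\[
P(X < 6) = P(0) + P(1) + P(2) + \cdots + P(5) = \frac{81}{100}
\]
\[
P(X \ge 6) = 1 - P(X < 6) = 1 - \frac{81}{100} = \frac{19}{100}
\]
\[
P(0 < X < 5) = P(1) + P(2) + P(3) + P(4) = \frac{4}{5}
\]

(iii) $P(X \le 3) = \frac{5}{10} = \frac{1}{2}$, which is not greater than $\frac{1}{2}$, whereas
\[
P(X \le 4) = \frac{8}{10} > \frac{1}{2}
\]
Hence the minimum value is a = 4.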

(iv) Distribution function of X: $F(x) = P(X \le x)$
\[
\begin{aligned}
F(0) &= P(0) = 0 \\
F(1) &= P(X \le 1) = P(0) + P(1) = 0 + \frac{1}{10} = \frac{1}{10} \\
F(2) &= P(X \le 2) = F(1) + P(2) = \frac{1}{10} + \frac{2}{10} = \frac{3}{10}
\end{aligned}
\]
Similarly, the distribution function is

x      0    1      2      3      4      5        6        7
F(x)   0    1/10   3/10   5/10   8/10   81/100   83/100   1
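
The whole chain (solve the quadratic for k, build the probability function, accumulate it, and search for the smallest a) can be sketched in a few lines. The names below are hypothetical, and the use of `isqrt` relies on the discriminant 121 being a perfect square, which holds for this particular problem only.

```python
from fractions import Fraction
from itertools import accumulate
from math import isqrt

# Coefficients of 10k^2 + 9k - 1 = 0, obtained from sum p(x) = 1
A, B, C = 10, 9, -1
disc = B * B - 4 * A * C                  # 81 + 40 = 121
root = isqrt(disc)                        # 11, since 121 is a perfect square
candidates = [Fraction(-B + root, 2 * A), Fraction(-B - root, 2 * A)]
k = next(c for c in candidates if c > 0)  # probabilities must be non-negative

pmf = [0, k, 2 * k, 2 * k, 3 * k, k**2, 2 * k**2, 7 * k**2 + k]
cdf = list(accumulate(pmf))               # F(x) = P(X <= x) for x = 0..7

p_lt_6 = sum(pmf[:6])                     # P(X < 6)
p_0_to_5 = sum(pmf[1:5])                  # P(0 < X < 5)

# Smallest a with P(X <= a) strictly greater than 1/2
a = next(x for x, F in enumerate(cdf) if F > Fraction(1, 2))
```

Using a strict inequality in the last line matters here, because F(3) equals exactly 1/2 and must be rejected.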

The density function of a random variable X is given by f(x) = Kx(2 − x), 0 ≤ x ≤ 2. Find K, the mean, the variance and the r-th moment.
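Solution If f(x) is a probability density function, then
\[
\int_{-\infty}^{\infty} f(x)\,dx = 1 \;\Rightarrow\; \int_{0}^{2} Kx(2 - x)\,dx = K\left[x^2 - \frac{x^3}{3}\right]_0^2 = \frac{4K}{3} = 1 \;\Rightarrow\; K = \frac{3}{4}
\]
\[
\text{Mean} = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{0}^{2} x \cdot \frac{3}{4}x(2 - x)\,dx \;\Rightarrow\; E(X) = 1
\]
\[
E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_{0}^{2} x^2 \cdot \frac{3}{4}x(2 - x)\,dx \;\Rightarrow\; E(X^2) = \frac{6}{5}
\]
\[
\text{Variance, } Var(X) = E(X^2) - [E(X)]^2 = \frac{6}{5} - 1^2 \;\Rightarrow\; Var(X) = \frac{1}{5}
\]

To find the r-th moment:
\[
E(X^r) = \int_{-\infty}^{\infty} x^r f(x)\,dx = \frac{3}{4}\int_{0}^{2} x^r \cdot x(2 - x)\,dx = \frac{3}{4}\int_{0}^{2} \left(2x^{r+1} - x^{r+2}\right)dx
\]
\[
= \frac{3}{4}\left[\frac{2^{r+3}}{r+2} - \frac{2^{r+3}}{r+3}\right]
\]
\[
E(X^r) = \frac{6 \cdot 2^r}{(r + 2)(r + 3)}
\]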
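
As a check on the r-th moment formula, the integral can be evaluated term by term with exact fractions and compared with the closed form. This is only a sketch under the assumption that r is a non-negative integer; the function names `moment` and `closed_form` are hypothetical.

```python
from fractions import Fraction

K = Fraction(3, 4)   # normalising constant of f(x) = K x (2 - x) on [0, 2]

def moment(r: int) -> Fraction:
    """E(X^r) = K * integral_0^2 (2 x^(r+1) - x^(r+2)) dx, integrated term by term."""
    upper = Fraction(2)
    term1 = 2 * upper ** (r + 2) / (r + 2)   # antiderivative of 2 x^(r+1) at x = 2
    term2 = upper ** (r + 3) / (r + 3)       # antiderivative of x^(r+2) at x = 2
    return K * (term1 - term2)

def closed_form(r: int) -> Fraction:
    """The r-th moment formula 6 * 2^r / ((r + 2)(r + 3))."""
    return Fraction(6 * 2 ** r, (r + 2) * (r + 3))

mean = moment(1)
variance = moment(2) - mean ** 2
print(moment(0), mean, moment(2), variance)
```

Setting r = 0 recovers the normalisation condition, so the same function also confirms the value of K.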

The monthly demand for Allwyn watches is known to have the following probability distribution

Demand        1     2     3     4     5     6     7     8
Probability   0.08  0.12  0.19  0.24  0.16  0.10  0.07  0.04

Find the expected demand for watches. Also compute the variance.
Solution Let X be the random variable denoting the monthly demand for Allwyn watches.
\[
\text{Expected demand, } E(X) = \sum_{i=1}^{8} x_i\, p(x_i) = 4.06
\]
\[
E(X^2) = \sum_{i=1}^{8} x_i^2\, p(x_i) = 19.7
\]
\[
Var(X) = E(X^2) - [E(X)]^2 = 19.7 - (4.06)^2
\]
\[
Var(X) \approx 3.22
\]
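
The expectation and variance of a discrete distribution reduce to two weighted sums, sketched below. The two-decimal probabilities in the code are an assumption of this sketch: they were chosen because they sum to 1 and reproduce the stated value E(X^2) = 19.7.

```python
from fractions import Fraction

demand = [1, 2, 3, 4, 5, 6, 7, 8]
# Assumed two-decimal probabilities (they sum to 1 and give E(X^2) = 19.7)
probs = [Fraction(s) for s in
         ("0.08", "0.12", "0.19", "0.24", "0.16", "0.10", "0.07", "0.04")]

total = sum(probs)                                        # should be exactly 1
expected = sum(x * p for x, p in zip(demand, probs))      # E(X)
second_moment = sum(x * x * p for x, p in zip(demand, probs))  # E(X^2)
variance = second_moment - expected ** 2                  # Var(X) = E(X^2) - [E(X)]^2

print(float(expected), float(second_moment), round(float(variance), 2))
```

Parsing the probabilities from strings keeps the arithmetic exact, so the check that they sum to 1 is not affected by floating-point rounding.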

The amount of time in hours that a computer function before probability density function given
x
 −
by f (x) = λe 100 ; x > 0 . What is the probability that (i) a computer will function between 50
0; x < 0

and 150 hours before breaking down; (ii) it will function less than 100 hours?

Solution If f(x) is a probability density function, then
\[
\int_{-\infty}^{\infty} f(x)\,dx = 1 \;\Rightarrow\; \int_{0}^{\infty} \lambda e^{-x/100}\,dx = 1
\]
\[
\lambda \int_{0}^{\infty} e^{-x/100}\,dx = 100\lambda = 1 \;\Rightarrow\; \lambda = \frac{1}{100}
\]
(i) Hence the probability that a computer will function between 50 and 150 hours before breaking down is given by
\[
P(50 < X < 150) = \int_{50}^{150} \lambda e^{-x/100}\,dx = \int_{50}^{150} \frac{1}{100} e^{-x/100}\,dx = e^{-1/2} - e^{-3/2}
\]
\[
\Rightarrow P(50 < X < 150) \approx 0.383
\]
(ii) The probability that a computer will function less than 100 hours is
\[
P(X < 100) = \int_{0}^{100} \lambda e^{-x/100}\,dx = \int_{0}^{100} \frac{1}{100} e^{-x/100}\,dx = 1 - e^{-1}
\]
\[
\Rightarrow P(X < 100) \approx 0.632
\]
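
Both probabilities follow from the distribution function F(x) = 1 − e^{−x/100} of this exponential density. A minimal numerical sketch (the names `RATE` and `cdf` are hypothetical):

```python
import math

RATE = 1 / 100   # lambda, from the normalisation condition

def cdf(x: float) -> float:
    """F(x) = P(X <= x) = 1 - exp(-x/100) for x >= 0, and 0 for x < 0."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0

p_between_50_150 = cdf(150) - cdf(50)   # e^{-1/2} - e^{-3/2}
p_less_100 = cdf(100)                   # 1 - e^{-1}

print(round(p_between_50_150, 4), round(p_less_100, 4))
```

Working with differences of the distribution function avoids re-doing the integral for each interval.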

Let X be a random variable with probability density function
\[
f(x) = \begin{cases} \dfrac{1}{3} e^{-x/3}, & x \ge 0 \\ 0, & x < 0 \end{cases}
\]
Find (i) P(X > 3); (ii) the MGF of X; (iii) E(X); (iv) Var(X)
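Solution
(i)
\[
P(X > 3) = \int_{3}^{\infty} f(x)\,dx = \int_{3}^{\infty} \frac{1}{3} e^{-x/3}\,dx = e^{-1} \;\Rightarrow\; P(X > 3) \approx 0.368
\]

(ii)
\[
M_X(t) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx = \int_{0}^{\infty} e^{tx} \cdot \frac{1}{3} e^{-x/3}\,dx = \frac{1}{3}\int_{0}^{\infty} e^{-\left(\frac{1}{3} - t\right)x}\,dx
\]
\[
\Rightarrow M_X(t) = \frac{1}{1 - 3t}, \quad t < \frac{1}{3}
\]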

(iii) $E(X) = \left[M_X'(t)\right]_{t=0}$
\[
M_X'(t) = \frac{d}{dt}\left[\frac{1}{1 - 3t}\right] = \frac{d}{dt}(1 - 3t)^{-1} = -(1 - 3t)^{-2}(-3) = 3(1 - 3t)^{-2} = \frac{3}{(1 - 3t)^2}
\]
\[
E(X) = \left[\frac{3}{(1 - 3t)^2}\right]_{t=0} \;\Rightarrow\; E(X) = 3
\]

(iv) $E(X^2) = \left[M_X''(t)\right]_{t=0}$
\[
M_X''(t) = \frac{d^2}{dt^2}\left[M_X(t)\right] = \frac{d}{dt}\left[\frac{3}{(1 - 3t)^2}\right] = 18(1 - 3t)^{-3} = \frac{18}{(1 - 3t)^3}
\]
\[
E(X^2) = \left[\frac{18}{(1 - 3t)^3}\right]_{t=0} = 18
\]
\[
Var(X) = E(X^2) - [E(X)]^2 = 18 - 3^2 \;\Rightarrow\; Var(X) = 9
\]
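
The moments read off from the MGF can be sanity-checked numerically by approximating the derivatives of M_X(t) = 1/(1 − 3t) at t = 0 with central finite differences. This is a rough sketch, not a symbolic derivation; the step size `h` is an arbitrary assumption.

```python
import math

def mgf(t: float) -> float:
    """M_X(t) = 1 / (1 - 3t), valid for t < 1/3."""
    return 1.0 / (1.0 - 3.0 * t)

h = 1e-4  # step for central finite differences (an arbitrary small value)

first_moment = (mgf(h) - mgf(-h)) / (2 * h)                  # approximates M'(0) = E(X)
second_moment = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h ** 2   # approximates M''(0) = E(X^2)
variance = second_moment - first_moment ** 2

tail = math.exp(-1)   # P(X > 3) = e^{-3/3}

print(round(first_moment, 3), round(second_moment, 3), round(variance, 3), round(tail, 3))
```

The finite-difference values agree with E(X) = 3, E(X^2) = 18 and Var(X) = 9 up to the truncation error of the scheme.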

1.7 One Dimensional Random Variables


1.7.1 Binomial Distribution
A random variable X is said to follow a binomial distribution with parameters n and p if its probability mass function is given by
\[
P(X = x) = \begin{cases} {}^{n}C_{x}\, p^x q^{n-x}, & x = 0, 1, 2, \ldots, n \\ 0, & \text{otherwise} \end{cases}
\]
where q = 1 − p. X is called a binomial random variable.

\[
\text{MGF, } M_X(t) = \left(q + pe^t\right)^n
\]
\[
\text{Mean, } E(X) = np
\]
\[
\text{Variance, } Var(X) = npq
\]

1.7.2 Poisson Distribution

A random variable X is said to follow a Poisson distribution with parameter λ > 0 if its probability mass function is given by
\[
P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!}, \quad x = 0, 1, 2, \ldots
\]
X is called a Poisson random variable.

\[
\text{MGF, } M_X(t) = e^{\lambda\left(e^t - 1\right)}
\]
\[
\text{Mean, } E(X) = \lambda
\]
\[
\text{Variance, } Var(X) = \lambda
\]
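
The property that a Poisson random variable has equal mean and variance can be illustrated by summing the probability mass function directly. In the sketch below, the parameter value 2.5 and the truncation of the support at 60 are assumptions chosen only for illustration.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) = e^{-lam} * lam^x / x!"""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam = 2.5             # illustrative parameter value (an assumption)
support = range(60)   # truncation point; the tail beyond it is negligible for this lam

total = sum(poisson_pmf(x, lam) for x in support)
mean = sum(x * poisson_pmf(x, lam) for x in support)
second_moment = sum(x * x * poisson_pmf(x, lam) for x in support)
variance = second_moment - mean ** 2

print(round(total, 6), round(mean, 6), round(variance, 6))
```

For larger λ the truncation point would need to grow, since the probability mass moves further to the right.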

1.7.3 Uniform Distribution


A continuous random variable X is said to follow a uniform distribution over an interval (a, b) if its probability density function is given by
\[
f(x) = \begin{cases} k, & a < x < b \\ 0, & \text{otherwise} \end{cases}
\]
where $k = \dfrac{1}{b - a}$ is a constant, and a and b are the parameters.

\[
\text{MGF, } M_X(t) = \frac{1}{t(b - a)}\left(e^{tb} - e^{ta}\right), \quad t \ne 0
\]
\[
\text{Mean, } E(X) = \frac{b + a}{2}
\]
\[
\text{Variance, } Var(X) = \frac{(b - a)^2}{12}
\]

1.7.4 Normal Distribution


A continuous random variable X is said to follow a normal distribution with parameters µ and σ², if its probability density function is given by
\[
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty, \; \sigma > 0
\]
X is called a normal random variable.

\[
\text{MGF, } M_X(t) = e^{\mu t + \frac{t^2\sigma^2}{2}}
\]
\[
\text{Mean, } E(X) = \mu
\]
\[
\text{Variance, } Var(X) = \sigma^2
\]

A sales representative can convert a customer into a potential buyer with a probability of 70%. If he is able to meet 10 customers in a day, find the probability of converting

(i) exactly one customer

(ii) not even a single customer

(iii) at least one customer

Solution Let X be the random variable denoting the number of customers converted.

Given $p = 70\% = \frac{70}{100} = 0.7$, ∴ q = 1 − p = 1 − 0.7 = 0.3; n = 10
\[
P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{10}C_{x}\,(0.7)^x (0.3)^{10-x}
\]

(i) P(exactly one customer),
\[
P(X = 1) = {}^{10}C_{1}\,(0.7)^1 (0.3)^{10-1} = 10\,(0.7)(0.3)^9
\]
P(exactly one customer) ≈ 0.000138

(ii) P(not even a single customer),
\[
P(X = 0) = {}^{10}C_{0}\,(0.7)^0 (0.3)^{10-0} = (1)(1)(0.3)^{10}
\]
P(not even a single customer) ≈ 0.0000059

(iii) P(at least one customer),
\[
P(X \ge 1) = 1 - P(X < 1) = 1 - P(X = 0) = 1 - (0.3)^{10} = 1 - 0.0000059
\]
P(at least one customer) ≈ 0.999994
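
The three binomial probabilities are small enough that rounding to four decimals hides them, so it helps to evaluate them directly. A minimal sketch using the standard-library `math.comb` (the function name `binomial_pmf` is hypothetical):

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) = nCx * p^x * q^(n-x), with q = 1 - p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.7   # 10 customers met, 70% conversion probability

p_exactly_one = binomial_pmf(1, n, p)
p_none = binomial_pmf(0, n, p)
p_at_least_one = 1 - p_none          # complement of "no customer converted"

print(f"{p_exactly_one:.6f} {p_none:.7f} {p_at_least_one:.6f}")
```

Computing "at least one" through the complement avoids summing the ten terms P(X = 1) through P(X = 10).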

Four coins were tossed 150 times and the following results were obtained.

No. of heads         0    1    2    3    4
Observed frequency   28   62   46   10   4

Fit a binomial distribution.

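Solution Total number of tosses, N = 150
Number of coins tossed each time, n = 4
To find p, first compute the mean.

x       f     fx
0       28    0
1       62    62
2       46    92
3       10    30
4       4     16
Total   150   200

\[
\bar{x} = \frac{\sum fx}{\sum f} = \frac{200}{150} = 1.333
\]
\[
\text{Mean} = np = 1.333 \;\Rightarrow\; 4p = 1.333 \;\Rightarrow\; p = \frac{1.333}{4} = \frac{1}{3} \approx 0.333
\]
\[
q = 1 - p = \frac{2}{3} \approx 0.667
\]
\[
P(X = x) = {}^{n}C_{x}\, p^x q^{n-x} = {}^{4}C_{x} \left(\frac{1}{3}\right)^x \left(\frac{2}{3}\right)^{4-x}
\]
\[
P(0) = \left(\frac{2}{3}\right)^4 = \frac{16}{81} = 0.198
\]
\[
P(1) = {}^{4}C_{1} \left(\frac{1}{3}\right)^1 \left(\frac{2}{3}\right)^3 = \frac{32}{81} = 0.395
\]
\[
P(2) = {}^{4}C_{2} \left(\frac{1}{3}\right)^2 \left(\frac{2}{3}\right)^2 = \frac{24}{81} = 0.296
\]
\[
P(3) = {}^{4}C_{3} \left(\frac{1}{3}\right)^3 \left(\frac{2}{3}\right)^1 = \frac{8}{81} = 0.099
\]
\[
P(4) = \left(\frac{1}{3}\right)^4 = \frac{1}{81} = 0.012
\]
The expected frequencies are N × P(X = x):

x                    0      1      2      3      4
P(X = x)             0.198  0.395  0.296  0.099  0.012
Expected frequency   29.6   59.3   44.4   14.8   1.9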
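
Fitting a binomial distribution by matching the sample mean to np can be sketched as below. The sketch takes the number of trials n to be the number of coins, 4, which is an assumption stated here explicitly; the variable names are hypothetical.

```python
from fractions import Fraction
from math import comb

observed = {0: 28, 1: 62, 2: 46, 3: 10, 4: 4}   # number of heads -> observed frequency
n = 4                                            # coins per toss (assumed number of trials)

N = sum(observed.values())                       # total number of tosses
mean = Fraction(sum(x * f for x, f in observed.items()), N)
p = mean / n                                     # method of moments: mean = n * p
q = 1 - p

# Fitted probabilities and expected frequencies N * P(X = x)
fitted = {x: comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)}
expected = {x: float(N * prob) for x, prob in fitted.items()}

for x in range(n + 1):
    print(x, observed[x], round(expected[x], 1))
```

Printing observed and expected frequencies side by side makes it easy to judge the quality of the fit by eye.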

At Global Green Company, the average yearly sales of an outlet is Rs. 25,00,000 and the SD is Rs. 5,00,000. If the sales follow a normal distribution, find

(i) the probability of sales greater than Rs. 30,00,000

(ii) the probability of sales between Rs. 20,00,000 and Rs. 30,00,000

(iii) the number of outlets, out of the 1,000 outlets the company possesses, with sales less than Rs. 15,00,000

Solution Working in lakhs of rupees: average, µ = 25; standard deviation, σ = 5.
\[
z = \frac{x - \mu}{\sigma} = \frac{x - 25}{5}
\]
\[
\text{When } x = 30, \; z = \frac{30 - 25}{5} = \frac{5}{5} = 1
\]
\[
\text{When } x = 20, \; z = \frac{20 - 25}{5} = \frac{-5}{5} = -1
\]
\[
\text{When } x = 15, \; z = \frac{15 - 25}{5} = \frac{-10}{5} = -2
\]
(i) Probability of sales greater than Rs. 30 lakhs
\[
P(x > 30) = P(z > 1) = 0.5 - P(0 < z < 1) = 0.5 - 0.3413 = 0.1587
\]
(ii) Probability of sales between Rs. 20 lakhs and Rs. 30 lakhs
\[
P(20 < x < 30) = P(-1 < z < 1) = 2P(0 < z < 1) = 2 \times 0.3413 = 0.6826
\]
(iii) Probability of sales less than Rs. 15 lakhs
\[
P(x < 15) = P(z < -2) = P(z > 2) = 0.5 - P(0 < z < 2) = 0.5 - 0.4772 = 0.0228
\]
Number of outlets with sales less than Rs. 15 lakhs = 1000 × 0.0228 = 22.8 ≈ 23
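
The table look-ups can be reproduced with `statistics.NormalDist` from the Python standard library (available from Python 3.8). This is a sketch in which sales are expressed in lakhs of rupees; the variable names are hypothetical.

```python
from statistics import NormalDist

sales = NormalDist(mu=25, sigma=5)    # yearly sales in lakhs of rupees

p_above_30 = 1 - sales.cdf(30)                   # P(x > 30)
p_20_to_30 = sales.cdf(30) - sales.cdf(20)       # P(20 < x < 30)
p_below_15 = sales.cdf(15)                       # P(x < 15)
outlets_below_15 = 1000 * p_below_15             # expected count among 1,000 outlets

print(round(p_above_30, 4), round(p_20_to_30, 4), round(p_below_15, 4))
```

The values differ from the table-based results only in the fourth decimal place, which reflects the rounding of the printed normal table.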

In a test of 2000 electric bulbs, it was found that the life of a particular make was normally distributed with an average life of 2,040 hours and an SD of 60 hours. Estimate the number of bulbs likely to burn for
(i) more than 2150 hours
(ii) less than 1950 hours
(iii) more than 1920 hours but less than 2160 hours
Solution Given: average, µ = 2040; standard deviation, σ = 60.
\[
z = \frac{x - \mu}{\sigma} = \frac{x - 2040}{60}
\]
\[
\text{When } x = 2150, \; z = \frac{2150 - 2040}{60} = \frac{110}{60} = 1.833
\]
\[
\text{When } x = 1950, \; z = \frac{1950 - 2040}{60} = \frac{-90}{60} = -1.5
\]
\[
\text{When } x = 2160, \; z = \frac{2160 - 2040}{60} = \frac{120}{60} = 2
\]
\[
\text{When } x = 1920, \; z = \frac{1920 - 2040}{60} = \frac{-120}{60} = -2
\]
(i) Probability that a bulb burns for more than 2150 hours
\[
P(x > 2150) = P(z > 1.833) = 0.5 - P(0 < z < 1.833) = 0.5 - 0.4664 = 0.0336
\]
(ii) Probability that a bulb burns for less than 1950 hours
\[
P(x < 1950) = P(z < -1.5) = P(z > 1.5) = 0.5 - P(0 < z < 1.5) = 0.5 - 0.4332 = 0.0668
\]
(iii) Probability that a bulb burns for more than 1920 but less than 2160 hours
\[
P(1920 < x < 2160) = P(-2 < z < 2) = 2P(0 < z < 2) = 2 \times 0.4772 = 0.9544
\]
Number of bulbs likely to burn for
(i) more than 2150 hours = 0.0336 × 2000 ≈ 67
(ii) less than 1950 hours = 0.0668 × 2000 ≈ 134
(iii) more than 1920 but less than 2160 hours = 0.9544 × 2000 ≈ 1909
