Statistics Project


3.4 CONDITIONAL PROBABILITY AND BAYES’ THEOREM

CONDITIONAL PROBABILITY
- the likelihood of an event occurring given that another event has already
occurred.
 FORMULA
$$P(A/B) = \frac{P(A \cap B)}{P(B)}$$
 EXAMPLE PROBLEM
There are 500 students in a certain school. 150 students are enrolled
in an algebra course and 80 students are enrolled in a chemistry course. There are 30
students who are taking both. If a student is chosen at random, (a.) what is the
probability that the student is taking algebra? (b.) what is the probability that the
student is taking chemistry given that the student is also taking algebra? (c.) what is
the probability that the student is taking algebra given that the student is also taking
chemistry?
SOLUTION:
A.) $P(A) = 150/500 = 0.3$ or 30%
B.) $P(C/A) = \frac{30/500}{150/500} = \frac{1}{5} = 0.2$ or 20%
C.) $P(A/C) = \frac{30/500}{80/500} = \frac{30}{80} = \frac{3}{8} = 0.375$ or 37.5%
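
The arithmetic above can be checked with a short Python sketch. The variable names are illustrative and not part of the original solution; exact fractions are used so the results match the hand calculation.

```python
from fractions import Fraction

# Counts taken from the example problem
total = 500       # students in the school
algebra = 150     # enrolled in algebra
chemistry = 80    # enrolled in chemistry
both = 30         # enrolled in both courses

p_a = Fraction(algebra, total)        # P(A)
p_c = Fraction(chemistry, total)      # P(C)
p_a_and_c = Fraction(both, total)     # P(A and C)

# Conditional probability: P(X/Y) = P(X and Y) / P(Y)
p_c_given_a = p_a_and_c / p_a
p_a_given_c = p_a_and_c / p_c

print(p_a, p_c_given_a, p_a_given_c)  # 3/10 1/5 3/8
```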

BAYES’ THEOREM
-states that the conditional probability of an event, given the occurrence of
another event, is equal to the likelihood of the second event given the first event
multiplied by the probability of the first event, divided by the probability of the
second event. This theorem was named after the English mathematician Thomas Bayes
(1701-1761).
 FORMULA
$$P(A/B) = \frac{P(B/A)\,P(A)}{P(B)}$$

 EXAMPLE PROBLEM
In a pain clinic, 10% of patients are prescribed narcotic pain killers. 5% of
the clinic patients are addicted to narcotics. Out of all people prescribed pain pills,
8% are addicts. If a patient is an addict, what is the probability that they will be
prescribed pain pills?
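SOLUTION: Let A be the event that a patient is prescribed pain pills and B the event
that a patient is an addict.
P(A) = 10%
P(B) = 5%
P(B/A) = 8%
$$P(A/B) = \frac{P(B/A)\,P(A)}{P(B)} = \frac{(0.08)(0.10)}{0.05} = 0.16 \text{ or } 16\%$$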
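
As a quick check of the pain clinic example, Bayes' theorem can be written as a one-line function. This is a minimal sketch; the function name `bayes` and its argument names are illustrative choices, not something defined in the text.

```python
def bayes(p_b_given_a, p_a, p_b):
    """Return P(A/B) = P(B/A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# A = prescribed pain pills, B = addict (values from the example problem)
p_prescribed_given_addict = bayes(p_b_given_a=0.08, p_a=0.10, p_b=0.05)

print(round(p_prescribed_given_addict, 4))  # 0.16
```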
4. DISCRETE PROBABILITY DISTRIBUTIONS

4.1 PROPERTIES OF DISCRETE PROBABILITY DISTRIBUTIONS


PROPERTIES OF DISCRETE PROBABILITY DISTRIBUTIONS
1. The probability of each value of a discrete random variable is between 0 and 1
inclusive. 0 ≤ P ( x ) ≤ 1
2. The sum of all probabilities is 1. Σ P(x) = 1
EXAMPLE:
X      5     10    15    20
P(X)   1/8   3/8   1/2   0

1. Is 0 ≤ P(x) ≤ 1 for every value? YES
2. Is Σ P(x) = 1? YES: Σ P(x) = 1/8 + 3/8 + 1/2 + 0 = 1
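
The two properties can also be tested programmatically. The sketch below assumes the distribution is stored as a dictionary of exact fractions; the helper name `is_valid_distribution` is hypothetical.

```python
from fractions import Fraction

def is_valid_distribution(probs):
    """Check both properties: each P(x) in [0, 1] and the sum equal to 1."""
    probs = list(probs)
    return all(0 <= p <= 1 for p in probs) and sum(probs) == 1

# The example table: values of X mapped to P(X)
table = {5: Fraction(1, 8), 10: Fraction(3, 8), 15: Fraction(1, 2), 20: Fraction(0)}

print(is_valid_distribution(table.values()))  # True
```

Exact fractions avoid the rounding error that could make a floating-point sum differ slightly from 1.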

4.2 BINOMIAL DISTRIBUTION


BINOMIAL DISTRIBUTION
-A binomial distribution can be thought of as simply the probability of a
SUCCESS or FAILURE outcome in an experiment or survey that is repeated multiple
times. The binomial is a type of distribution that has two possible outcomes (the
prefix “bi” means two, or twice). For example, a coin toss has only two possible
outcomes: heads or tails and taking a test could have two possible outcomes: pass or
fail.

CRITERIA OF BINOMIAL DISTRIBUTION


1.) The number of observations or trials is fixed. In other words, you can only
figure out the probability of something happening if you know how many times it is
attempted. This is common sense: if you toss a coin once, your probability of getting
a tails is 50%. If you toss a coin 20 times, your probability of getting at least one
tails is very, very close to 100%.
2.) Each observation or trial is independent. In other words, none of your trials
has an effect on the probability of the next trial.
3.) The probability of success (tails, heads, fail or pass) is exactly the same from
one trial to another.

Once you know that your distribution is binomial, you can apply the binomial
distribution formula to calculate the probability.

 FORMULA
$$P(x) = {}_nC_x \, p^{x} q^{\,n-x}$$
Where:
P(x) = binomial probability
x = number of times for a specific outcome within n trials
nCx = number of combinations
p = probability of success on a single trial
q = probability of failure on a single trial (q = 1 − p)
n = number of trials

 EXAMPLE PROBLEM
A six-sided die is rolled 12 times. What is the probability of getting a 4 five times?
SOLUTION:
n = 12
x = 5
p = 1/6
q = 5/6
$${}_nC_r = \frac{n!}{(n-r)!\,r!}$$
$${}_{12}C_5 = \frac{12!}{(12-5)!\,5!} = 792$$
$$P(5) = 792\left(\frac{1}{6}\right)^{5}\left(\frac{5}{6}\right)^{12-5} = 0.028425 \text{ or } 2.84\%$$
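
The die example can be reproduced with Python's standard library. This is a small sketch; `binomial_pmf` is an illustrative helper name, and `math.comb` supplies the number of combinations.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

print(comb(12, 5))                         # 792
print(round(binomial_pmf(5, 12, 1/6), 6))  # 0.028425
```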
4.3 POISSON DISTRIBUTION
POISSON DISTRIBUTION
-It is a probability distribution that is used to show how many times an event
is likely to occur over a specified period. In other words, it is a count distribution.
Poisson distributions are often used to understand independent events that occur at
a constant average rate within a given interval of time. It was named after the French
mathematician Siméon Denis Poisson.
The Poisson distribution is a discrete function, meaning that the variable can
only take specific values in a list. Put differently, the variable cannot take all values in
any continuous range. For the Poisson distribution, the variable can only take
non-negative whole number values (0, 1, 2, ...), with no decimals or fractions.
 FORMULA
$$P(X = x) = \frac{(\mu')^{x}\, e^{-\mu'}}{x!}$$
Where:

e = Euler’s number (e ≈ 2.71828)
x = number of occurrences
x! = factorial of x
μ' = mean or average

 EXAMPLE PROBLEM
A small business receives, on average, 12 customers per day. (a) What is the
probability that the business will receive exactly 8 customers in one day?

SOLUTION :
μ' = 12
x = 8
e = 2.71828

$$P(X = x) = \frac{(\mu')^{x}\, e^{-\mu'}}{x!}$$
$$P(X = 8) = \frac{12^{8} \cdot 2.71828^{-12}}{8!} = 0.065523 \text{ or } 6.55\%$$
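
A minimal Python sketch of the same calculation follows. The helper name `poisson_pmf` is illustrative; `math.exp` is used in place of the rounded value of e, so the result agrees with the hand calculation only to the digits shown above.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) = mu^x * e^(-mu) / x! for a Poisson mean mu."""
    return mu**x * exp(-mu) / factorial(x)

# 12 customers per day on average, exactly 8 in one day
p_eight = poisson_pmf(8, 12)
print(round(p_eight, 4))  # about 0.0655
```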
4.4 POISSON APPROXIMATION TO BINOMIAL
POISSON APPROXIMATION TO BINOMIAL
-When the value of n in a binomial distribution is large and the value of p is
very small, the binomial distribution can be approximated by a Poisson distribution.
If n > 20 and np < 5 OR nq < 5 then the Poisson is a good approximation.

The Binomial distribution tables given with most examinations only have n values up
to 10 and values of p from 0 to 0.5
The similarity between the two distributions can be seen by plotting both as
vertical line graphs: a binomial distribution with n = 10 and p = 0.2 (black) and a
Poisson distribution with λ = 2 (red).

The value of the mean needed for the Poisson approximation is λ = np


To summarise:
For large values of n and small values of p, the Poisson distribution approximates the
binomial distribution
Test n > 20, np < 5 OR nq < 5

New parameters λ = np
 EXAMPLE PROBLEM
A factory puts biscuits into boxes of 100. The probability that a biscuit is broken is
0.03. Find the probability that a box contains 2 broken biscuits

 SOLUTION
This is a binomial distribution with n = 100 and p = 0.03.
These values are outside the range of the tables and involve lengthy
calculations.
Using the Poisson approximation (test: n = 100 > 20 and np = 100 × 0.03 = 3,
which is less than 5)
Let X be the random variable of the number of broken biscuits
The mean λ = np = 100 × 0.03 = 3
P(X = 2) = 0.224 (from tables)
The probability that a box contains two broken biscuits is 0.224.
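
To see how close the approximation is for the biscuit example, the exact binomial probability and the Poisson value can be computed side by side. This is a sketch using only the standard library; the variable names are illustrative.

```python
from math import comb, exp, factorial

n, p, k = 100, 0.03, 2   # box size, P(broken), number of broken biscuits
lam = n * p              # Poisson mean, lambda = np

# Exact binomial probability of exactly k broken biscuits
exact = comb(n, k) * p**k * (1 - p)**(n - k)

# Poisson approximation with mean lambda
approx = lam**k * exp(-lam) / factorial(k)

print(round(exact, 3), round(approx, 3))
```

The two values agree to about two decimal places (roughly 0.225 exact against 0.224 approximate), which is the kind of agreement the test n > 20, np < 5 is meant to guarantee.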

5. CONTINUOUS PROBABILITY DISTRIBUTIONS


5.1 PROPERTIES OF CONTINUOUS PROBABILITY DISTRIBUTION
PROPERTIES OF CONTINUOUS PROBABILITY DISTRIBUTION
-A continuous distribution describes a random variable whose possible values
form an infinite, uncountable range. Time is a typical example: a duration can be
any value from 0 seconds upward, including every fraction in between.
-In a continuous probability distribution the random variable X can take on any
value in a range. Because there are infinitely many values that X could assume, the
probability of X taking on any one specific value is zero. Therefore we speak in
ranges of values (for example, P(X > 0) = 0.50). The normal distribution is one
example of a continuous distribution. The probability that X falls between two values
(a and b) equals the integral (area under the curve) of the probability density
function f(x) from a to b:
$$P(a \le X \le b) = \int_a^b f(x)\,dx$$

5.2 UNIFORM DISTRIBUTION


UNIFORM DISTRIBUTION
-The uniform distribution defines equal probability over a given range for a
continuous distribution. For this reason, it is important as a reference distribution.
-One of the most important applications of the uniform distribution is in the
generation of random numbers. That is, almost all random number generators
generate random numbers on the (0,1) interval. For other distributions, some
transformation is applied to the uniform random numbers.

 FORMULA
$$f(x) = \frac{1}{B - A} \quad \text{for } A \le x \le B$$

where A is the location parameter and (B - A) is the scale parameter. The case
where A = 0 and B = 1 is called the standard uniform distribution. The equation for
the standard uniform distribution is

$$f(x) = 1 \quad \text{for } 0 \le x \le 1$$
Since the general form of probability functions can be expressed in terms of
the standard distribution, all subsequent formulas in this section are given for the
standard form of the function.
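
The sketch below illustrates the density and the simplest transformation of a standard uniform random number: rescaling it from (0, 1) to (A, B). The function names are illustrative, and Python's `random.random()` stands in for the random number generator mentioned above.

```python
import random

def uniform_pdf(x, a, b):
    """f(x) = 1 / (B - A) inside [A, B], and 0 outside."""
    return 1 / (b - a) if a <= x <= b else 0.0

def uniform_sample(a, b, rng=random):
    """Rescale a standard uniform draw U to the interval between A and B."""
    u = rng.random()          # standard uniform random number
    return a + (b - a) * u    # location A plus scale (B - A) times U

print(uniform_pdf(0.5, 0, 1))   # 1.0 (standard uniform)
print(uniform_pdf(3, 2, 6))     # 0.25
print(uniform_sample(2, 6))     # some value between 2 and 6
```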

5.3 EXPONENTIAL DISTRIBUTION


EXPONENTIAL DISTRIBUTION
-In probability theory and statistics, the exponential distribution is a
continuous probability distribution that often concerns the amount of time until
some specific event happens. It describes the waiting time in a process in which
events happen continuously and independently at a constant average rate. The
exponential distribution has the key property of being memoryless. For an
exponential random variable, small values are more frequent and large values are
less frequent. For example, the amount of money spent by a customer on one trip to
the supermarket is often modeled with an exponential distribution.

 FORMULA

$$f_X(x \mid \lambda) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x > 0 \\ 0 & \text{for } x \le 0 \end{cases}$$

λ is called the distribution rate

 EXAMPLE PROBLEM
The amount of time spouses shop for anniversary cards can be
modeled by an exponential distribution with the average amount of time
equal to eight minutes. Write the distribution, state the probability density
function, and graph the distribution.

ANSWER: The rate is λ = 1/8 = 0.125, so $X \sim \text{Exp}(0.125)$;
$f(x) = 0.125\,e^{-0.125x}$ for $x > 0$
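
The density from the answer, and the memoryless property mentioned earlier, can be explored with a short sketch. The function names are illustrative, and the cumulative form 1 − e^(−λx) is the standard result of integrating the density; it is not stated in the text above.

```python
from math import exp

rate = 1 / 8   # lambda = 0.125 per minute (mean of eight minutes)

def exp_pdf(x, lam):
    """f(x) = lam * e^(-lam * x) for x > 0, and 0 otherwise."""
    return lam * exp(-lam * x) if x > 0 else 0.0

def exp_survival(x, lam):
    """P(X > x) = e^(-lam * x) for x > 0, and 1 otherwise."""
    return exp(-lam * x) if x > 0 else 1.0

# Memoryless check: P(X > s + t given X > s) equals P(X > t)
s, t = 5, 3
conditional = exp_survival(s + t, rate) / exp_survival(s, rate)
print(round(conditional, 6), round(exp_survival(t, rate), 6))
```

Both printed numbers are the same, which is what "memoryless" means: having already waited s minutes does not change the distribution of the remaining wait.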

5.4 NORMAL DISTRIBUTION


NORMAL DISTRIBUTION
-Normal distribution, also known as the Gaussian distribution, is
a probability distribution that is symmetric about the mean, showing that data near
the mean are more frequent in occurrence than data far from the mean.
In graphical form, the normal distribution appears as a "bell curve".

The normal distribution is the most common type of distribution assumed in
technical stock market analysis and in other types of statistical analyses. The
normal distribution has two parameters: the mean and the standard deviation. The
standard normal distribution is the special case with mean 0 and standard
deviation 1.
The normal distribution model is important in statistics and is key to the Central
Limit Theorem (CLT). This theorem states that averages calculated from independent,
identically distributed random variables have approximately normal distributions
when the sample is large enough, regardless of the type of distribution from which
the variables are sampled (provided it has finite variance).

The normal distribution is one type of symmetrical distribution. Symmetrical
distributions occur where a dividing line produces two mirror images. Not all
symmetrical distributions are normal, since some data could appear as two humps
or a series of hills rather than the bell curve that indicates a normal distribution.

The normal distribution has several key features and properties that define it.

First, its mean (average), median (midpoint), and mode (most frequent observation)
are all equal to one another. Moreover, these values all represent the peak, or
highest point, of the distribution. The distribution then falls symmetrically around
the mean, the width of which is defined by the standard deviation.

For all normal distributions, about 68.3% of the observations will appear within plus
or minus one standard deviation of the mean; 95.4% of the observations will fall
within +/- two standard deviations; and 99.7% within +/- three standard deviations.
This fact is sometimes referred to as the "empirical rule," a heuristic that describes
where most of the data in a normal distribution will appear.

This means that data falling outside of three standard deviations ("3-sigma") would
signify rare occurrences.
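
The percentages in the empirical rule can be recovered numerically. The sketch below uses the standard identity that, for a normal distribution, the probability of falling within k standard deviations of the mean is erf(k/√2); the helper name `within_k_sigma` is illustrative.

```python
from math import erf, sqrt

def within_k_sigma(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(within_k_sigma(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```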

 FORMULA

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$
WHERE:

f(x) = probability density function
σ = standard deviation
μ = mean
x = value of the variable or data being examined

 EXAMPLE
Many naturally-occurring phenomena appear to be normally-distributed.
Take, for example, the distribution of the heights of human beings. The average
height is found to be roughly 175 cm (5' 9"), counting both males and females.

Most people are close to that average. Meanwhile, taller and shorter people
exist, but with decreasing frequency in the population.
According to the empirical rule, 99.7% of all people will fall within +/- three
standard deviations of the mean, or between 154 cm (5' 0") and 196 cm (6' 5").
Those taller or shorter than this would be quite rare (just 0.15% of the
population each).
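
The height example can be checked with `statistics.NormalDist` from the Python standard library. Note the assumption: the text gives the mean (175 cm) and the three-sigma limits (154 cm and 196 cm) but never states the standard deviation, so σ = 7 cm is inferred here from (196 − 175) / 3.

```python
from statistics import NormalDist

# sigma = 7 cm is an inferred value, not one stated in the text
heights = NormalDist(mu=175, sigma=7)

within = heights.cdf(196) - heights.cdf(154)   # share between 154 cm and 196 cm
taller = 1 - heights.cdf(196)                  # share taller than 196 cm

print(round(within, 4))   # close to 0.9973
print(round(taller, 5))   # close to 0.00135
```

The upper tail comes out near 0.135%, which the text rounds to 0.15% on each side.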

5.5 NORMAL APPROXIMATION TO THE BINOMIAL


NORMAL APPROXIMATION TO THE BINOMIAL
-If X is a random variable that follows a binomial distribution with n trials
and p probability of success on a given trial, then we can calculate the mean (μ) and
standard deviation (σ) of X using the following formulas:
 μ = np
 σ = √(np(1-p))
It turns out that if n is sufficiently large then we can actually use the normal
distribution to approximate the probabilities related to the binomial distribution.
This is known as the normal approximation to the binomial.

For n to be “sufficiently large” it needs to meet the following criteria:

 np ≥ 5
 n(1-p) ≥ 5

When both criteria are met, we can use the normal distribution to answer
probability questions related to the binomial distribution.

However, the normal distribution is a continuous probability distribution while the
binomial distribution is a discrete probability distribution, so we must apply a
continuity correction when calculating probabilities.

In simple terms, a continuity correction is the name given to adding or subtracting
0.5 to a discrete x-value.

For example, suppose we would like to find the probability that a coin lands on
heads less than or equal to 45 times during 100 flips. That is, we want to find P(X ≤
45). To use the normal distribution to approximate the binomial distribution, we
would instead find P(X ≤ 45.5).

The following table shows when you should add or subtract 0.5, based on the type of
probability you’re trying to find:

Using Binomial Distribution    Using Normal Distribution with Continuity Correction
X = 45                         44.5 < X < 45.5
X ≤ 45                         X < 45.5
X < 45                         X < 44.5
X ≥ 45                         X > 44.5
X > 45                         X > 45.5

The following step-by-step example shows how to use the normal distribution to
approximate the binomial distribution.

Example: Normal Approximation to the Binomial

Suppose we want to know the probability that a coin lands on heads less than or
equal to 43 times during 100 flips.

In this situation we have the following values:

 n (number of trials) = 100


 X (number of successes) = 43
 p (probability of success on a given trial) = 0.50
To calculate the probability of the coin landing on heads less than or equal to 43
times, we can use the following steps:

Step 1: Verify that the sample size is large enough to use the normal
approximation.

First, we must verify that the following criteria are met:

 np ≥ 5
 n(1-p) ≥ 5

In this case, we have:

 np = 100*0.5 = 50
 n(1-p) = 100*(1 – 0.5) = 100*0.5 = 50

Both numbers are greater than 5, so we’re safe to use the normal approximation.

Step 2: Determine the continuity correction to apply.

Referring to the table above, we see that we should add 0.5 when we’re working
with a probability in the form of X ≤ 43. Thus, we will be finding P(X< 43.5).

Step 3: Find the mean (μ) and standard deviation (σ) of the binomial distribution.

μ = n*p = 100*0.5 = 50

σ = √(n*p*(1-p)) = √(100*0.5*(1-0.5)) = √25 = 5

Step 4: Find the z-score using the mean and standard deviation found in the
previous step.

z = (x – μ) / σ = (43.5 – 50) / 5 = -6.5 / 5 = -1.3

Step 5: Find the probability associated with the z-score.

We can use a normal CDF calculator or a z-table to find that the area under the
standard normal curve to the left of -1.3 is .0968.

Thus, the probability that a coin lands on heads less than or equal to 43 times during
100 flips is .0968.
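
The five steps can be condensed into a short Python sketch that also computes the exact binomial probability for comparison. The variable names are illustrative; `statistics.NormalDist` provides the standard normal CDF in place of a calculator or table.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 100, 0.5, 43   # trials, P(heads), and the bound in P(X <= 43)

# Step 1: sample size check
assert n * p >= 5 and n * (1 - p) >= 5

# Steps 2-4: continuity correction, mean, standard deviation, z-score
mu = n * p
sigma = sqrt(n * p * (1 - p))
z = (k + 0.5 - mu) / sigma

# Step 5: area to the left of z under the standard normal curve
approx = NormalDist().cdf(z)

# Exact binomial probability, summed term by term
exact = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

print(round(z, 2), round(approx, 4))  # -1.3 0.0968
print(round(exact, 3))                # close to the approximation
```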
5.6 TRIANGULAR DISTRIBUTION
TRIANGULAR DISTRIBUTION
-The triangular distribution is a continuous probability distribution with a
probability density function shaped like a triangle.
It is defined by three values:

 The minimum value a


 The maximum value b
 The peak value c

This distribution is extremely useful in the real world because we can often
estimate the minimum value (a), the maximum value (b), and the most likely value (c)
that a random variable will take on, so we can model the behavior of the random
variable with a triangular distribution using just these three values.

The triangular distribution has the following properties:

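PDF:
$$f(x) = \begin{cases} \dfrac{2(x-a)}{(b-a)(c-a)} & \text{for } a \le x < c \\ \dfrac{2}{b-a} & \text{for } x = c \\ \dfrac{2(b-x)}{(b-a)(b-c)} & \text{for } c < x \le b \\ 0 & \text{otherwise} \end{cases}$$

CDF:
$$F(x) = \begin{cases} 0 & \text{for } x \le a \\ \dfrac{(x-a)^{2}}{(b-a)(c-a)} & \text{for } a < x \le c \\ 1 - \dfrac{(b-x)^{2}}{(b-a)(b-c)} & \text{for } c < x < b \\ 1 & \text{for } x \ge b \end{cases}$$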
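
A minimal sketch of the triangular density and cumulative distribution in Python follows, using the standard textbook formulas for minimum a, maximum b, and peak c. The function names and the sample values a = 0, b = 10, c = 4 are hypothetical and chosen only for illustration.

```python
def tri_pdf(x, a, b, c):
    """Triangular density with minimum a, maximum b, and peak c."""
    if x < a or x > b:
        return 0.0
    if x < c:
        return 2 * (x - a) / ((b - a) * (c - a))
    if x == c:
        return 2 / (b - a)
    return 2 * (b - x) / ((b - a) * (b - c))

def tri_cdf(x, a, b, c):
    """P(X <= x) for the triangular distribution."""
    if x <= a:
        return 0.0
    if x <= c:
        return (x - a) ** 2 / ((b - a) * (c - a))
    if x < b:
        return 1 - (b - x) ** 2 / ((b - a) * (b - c))
    return 1.0

# Hypothetical values: minimum 0, maximum 10, most likely value 4
print(tri_pdf(4, 0, 10, 4))   # 0.2, the height of the peak
print(tri_cdf(4, 0, 10, 4))   # 0.4
print(tri_cdf(7, 0, 10, 4))   # 0.85
```

For simulation, Python's standard library also offers `random.triangular(low, high, mode)`, which draws samples from the same distribution.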
PROJECT IN

STATISTICAL ANALYSIS WITH

SOFTWARE APPLICATION
