Statistics Project
CONDITIONAL PROBABILITY
- examines the likelihood of an event occurring given that another event has
already occurred.
FORMULA
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
EXAMPLE PROBLEM
There are 500 students in a certain school. 150 students are enrolled
in an algebra course and 80 students are enrolled in a chemistry course. There are 30
students who are taking both. If a student is chosen at random, (a.) what is the
probability of the student taking algebra? (b.) what is the probability of the student
taking chemistry given that the student is also taking algebra? (c.) what is the
probability of the student taking algebra given that the student is also taking chemistry?
SOLUTION:
A.) $P(A) = 150/500 = 0.3$ or 30%
B.) $P(C \mid A) = \frac{30/500}{150/500} = \frac{1}{5} = 0.2$ or 20%
C.) $P(A \mid C) = \frac{30}{80} = \frac{3}{8}$
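The three answers above can be checked with a short script. This is a minimal sketch, assuming the counts from the problem statement; the helper name `conditional_probability` is made up for illustration. It uses exact fractions so that 30/80 stays as 3/8.

```python
from fractions import Fraction


def conditional_probability(n_both, n_given):
    """P(A|B) from counts: members in both A and B, divided by members in B."""
    return Fraction(n_both, n_given)


# Counts taken from the example problem.
total, algebra, chemistry, both = 500, 150, 80, 30

p_algebra = Fraction(algebra, total)                          # (a) P(A)
p_chem_given_alg = conditional_probability(both, algebra)     # (b) P(C|A)
p_alg_given_chem = conditional_probability(both, chemistry)   # (c) P(A|C)

print(p_algebra, p_chem_given_alg, p_alg_given_chem)
```

Dividing the counts directly gives the same result as dividing the probabilities, because the common factor of 500 cancels.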
BAYES’ THEOREM
- states that the conditional probability of an event, given the occurrence of
another event, is equal to the likelihood of the second event given the first event
multiplied by the probability of the first event, divided by the probability of the
second event. The theorem is named after the English mathematician Thomas Bayes
(1701-1761).
FORMULA
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
EXAMPLE PROBLEM
In a pain clinic, 10% of patients are prescribed narcotic painkillers and 5% of
the clinic's patients are addicted to narcotics. Out of all people prescribed pain pills,
8% are addicts. If a patient is an addict, what is the probability that they will be
prescribed pain pills?
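SOLUTION:
Let A = the patient is prescribed pain pills and B = the patient is an addict.
P(A) = 10%
P(B) = 5%
P(B | A) = 8%
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} = \frac{(0.08)(0.10)}{0.05} = 0.16 \text{ or } 16\%$$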
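The same calculation can be written as a small function. This is a minimal sketch, assuming the three percentages from the problem; the function name `bayes` and the variable names are illustrative only.

```python
def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_b_given_a * p_a / p_b


p_prescribed = 0.10               # P(A): patient is prescribed pain pills
p_addict = 0.05                   # P(B): patient is an addict
p_addict_given_prescribed = 0.08  # P(B|A)

p_prescribed_given_addict = bayes(p_addict_given_prescribed, p_prescribed, p_addict)
print(round(p_prescribed_given_addict, 4))
```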
4. DISCRETE PROBABILITY DISTRIBUTIONS
1. Is 0 ≤ P(x) ≤ 1 for every x? YES
2. Is Σ P(x) = 1? YES: $\sum P(x) = \frac{1}{8} + \frac{3}{8} + \frac{1}{2} + 0 = 1$
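Both conditions can be tested programmatically. This is a minimal sketch, assuming the four probabilities from the check above (1/8, 3/8, 1/2, and 0); the function name `is_valid_distribution` is made up for illustration. Exact fractions avoid floating-point rounding when testing whether the sum equals 1.

```python
from fractions import Fraction


def is_valid_distribution(probs):
    """Check that every P(x) lies in [0, 1] and that the P(x) sum to 1."""
    each_in_range = all(0 <= p <= 1 for p in probs)
    sums_to_one = sum(probs) == 1
    return each_in_range and sums_to_one


probs = [Fraction(1, 8), Fraction(3, 8), Fraction(1, 2), Fraction(0)]
print(is_valid_distribution(probs))
```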
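Once you know that your distribution is binomial, you can apply the binomial
distribution formula to calculate the probability.
FORMULA
$$P(x) = {}_{n}C_{x}\, p^{x} q^{n-x}$$
Where:
$P(x)$ = binomial probability
${}_{n}C_{x}$ = number of combinations
$n$ = number of trials
$x$ = number of successes
$p$ = probability of success on a single trial
$q = 1 - p$ = probability of failure on a single trial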
EXAMPLE PROBLEM
A six-sided die is rolled 12 times. What is the probability of getting a 4 five times?
SOLUTION:
n = 12
x = 5
p = 1/6
q = 5/6
$${}_{n}C_{r} = \frac{n!}{(n-r)!\,r!}$$
$$\binom{12}{5} = \frac{12!}{(12-5)!\,5!} = 792$$
$$P(5) = 792 \left(\frac{1}{6}\right)^{5} \left(\frac{5}{6}\right)^{12-5} = 0.028425 \text{ or } 2.84\%$$
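The die example can be reproduced with Python's standard library. This is a minimal sketch; the function name `binomial_pmf` is illustrative, and `math.comb` supplies the number of combinations.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)


combinations = comb(12, 5)        # number of ways to choose which 5 rolls show a 4
prob = binomial_pmf(5, 12, 1 / 6)
print(combinations, round(prob, 6))
```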
4.3 POISSON DISTRIBUTION
POISSON DISTRIBUTION
- It is a probability distribution that shows how many times an event is likely
to occur over a specified period. In other words, it is a count distribution.
Poisson distributions are often used to model independent events that occur at
a constant rate within a given interval of time. It is named after the French
mathematician Siméon Denis Poisson.
The Poisson Distribution is a discrete function, meaning that the variable can
only take specific values in a list. Put differently, the variable cannot take all values in
any continuous range. For the Poisson distribution, the variable can only take whole
number values, with no decimals or fractions.
FORMULA
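$$P(X = x) = \frac{\mu^{x} e^{-\mu}}{x!}$$
Where:
$\mu$ = average number of events in the interval
$x$ = number of events whose probability is being found
$e$ = 2.71828 (Euler's number)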
EXAMPLE PROBLEM
A small business receives, on average, 12 customers per day. (a) What is the
probability that the business will receive exactly 8 customers in one day?
SOLUTION :
μ = 12
x = 8
e = 2.71828
$$P(X = x) = \frac{\mu^{x} e^{-\mu}}{x!}$$
$$P(X = 8) = \frac{12^{8} \cdot 2.71828^{-12}}{8!} = 0.065523 \text{ or } 6.55\%$$
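The customer example can be verified numerically. This is a minimal sketch; the function name `poisson_pmf` is illustrative, and `math.exp` is used in place of the rounded constant 2.71828.

```python
from math import exp, factorial


def poisson_pmf(x, mu):
    """Probability of exactly x events when the mean count is mu."""
    return mu**x * exp(-mu) / factorial(x)


p_eight_customers = poisson_pmf(8, 12)
print(round(p_eight_customers, 6))
```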
4.4 POISSON APPROXIMATION TO BINOMIAL
POISSON APPROXIMATION TO BINOMIAL
-When the value of n in a binomial distribution is large and the value of p is
very small, the binomial distribution can be approximated by a Poisson distribution.
If n > 20 and np < 5 OR nq < 5 then the Poisson is a good approximation.
The binomial distribution tables given with most examinations only have n values up
to 10 and values of p from 0 to 0.5.
The similarities between the two distributions can be seen in the vertical line graph
below.
New parameter: λ = np
EXAMPLE PROBLEM
A factory puts biscuits into boxes of 100. The probability that a biscuit is broken is
0.03. Find the probability that a box contains 2 broken biscuits.
SOLUTION
This is a binomial distribution with n = 100 and p = 0.03.
These values are outside the range of the tables and would involve lengthy
calculations.
Using the Poisson approximation (test: np = 100 × 0.03 = 3, which is less
than 5):
Let X be the random variable for the number of broken biscuits.
The mean is λ = np = 100 × 0.03 = 3.
P(X = 2) = 0.224 (from tables)
The probability that a box contains two broken biscuits is 0.224.
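To see how close the approximation is, the exact binomial value can be computed next to the Poisson value. This is a minimal sketch using the numbers from the biscuit example; the variable names are illustrative.

```python
from math import comb, exp, factorial

n, p, x = 100, 0.03, 2

# Exact binomial probability of exactly 2 broken biscuits.
exact = comb(n, x) * p**x * (1 - p)**(n - x)

# Poisson approximation with lambda = n * p.
lam = n * p
approx = lam**x * exp(-lam) / factorial(x)

print(round(exact, 4), round(approx, 4))
```

The two values differ by roughly one thousandth, which is why the table lookup is acceptable here.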
FORMULA
$$f(x) = \frac{1}{B - A} \quad \text{for } A \le x \le B$$
$$f(x) = 1 \quad \text{for } 0 \le x \le 1$$
Since the general form of probability functions can be expressed in terms of
the standard distribution, all subsequent formulas in this section are given for the
standard form of the function.
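The general and standard forms of this density can be expressed in one function. This is a minimal sketch; the name `uniform_pdf` and the sample endpoints 2 and 6 are hypothetical values chosen for illustration.

```python
def uniform_pdf(x, a=0.0, b=1.0):
    """Density 1 / (b - a) on [a, b] and 0 elsewhere; defaults give the standard form."""
    if b <= a:
        raise ValueError("b must be greater than a")
    return 1.0 / (b - a) if a <= x <= b else 0.0


print(uniform_pdf(0.3))      # standard form on [0, 1]
print(uniform_pdf(3, 2, 6))  # general form on [2, 6]
print(uniform_pdf(7, 2, 6))  # outside the interval
```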
FORMULA
$$f_X(x \mid \lambda) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x > 0 \\ 0 & \text{for } x \le 0 \end{cases}$$
EXAMPLE PROBLEM
The amount of time spouses shop for anniversary cards can be
modeled by an exponential distribution with the average amount of time
equal to eight minutes. Write the distribution, state the probability density
function, and graph the distribution.
ANSWER: Since the mean is 8 minutes, the rate is λ = 1/8 = 0.125, so
$X \sim \mathrm{Exp}(0.125)$;
$f(x) = 0.125\, e^{-0.125x}$ for x > 0.
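The density in the answer can be tabulated at a few points, which is enough to sketch the graph by hand. This is a minimal sketch, assuming the rate is the reciprocal of the 8-minute mean; the function name `exponential_pdf` and the chosen x values are illustrative.

```python
from math import exp


def exponential_pdf(x, lam):
    """Density lam * exp(-lam * x) for x > 0, and 0 for x <= 0."""
    return lam * exp(-lam * x) if x > 0 else 0.0


mean_minutes = 8
lam = 1 / mean_minutes  # rate parameter, 0.125 per minute

# Points along the curve; the density decays steadily as x grows.
for x in (1, 4, 8, 16, 24):
    print(x, round(exponential_pdf(x, lam), 4))
```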
The normal distribution has several key features and properties that define it.
For all normal distributions, about 68.3% of the observations will appear within plus
or minus one standard deviation of the mean; 95.4% of the observations will fall within
+/- two standard deviations; and 99.7% within +/- three standard deviations. This
fact is sometimes referred to as the "empirical rule," a heuristic that describes
where most of the data in a normal distribution will appear.
This means that data falling outside of three standard deviations ("3-sigma")
signify rare occurrences.
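The percentages in the empirical rule can be derived from the standard normal distribution with the error function. This is a minimal sketch; the function name `within_k_sigma` is illustrative, and it relies on the identity P(|Z| ≤ k) = erf(k / √2).

```python
from math import erf, sqrt


def within_k_sigma(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return erf(k / sqrt(2))


for k in (1, 2, 3):
    print(k, round(100 * within_k_sigma(k), 2))
```

The printed percentages are slightly more precise than the rounded figures quoted in the text.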
FORMULA
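$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}$$
WHERE:
$x$ = value of the variable
$\mu$ = mean
$\sigma$ = standard deviation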
EXAMPLE
Many naturally occurring phenomena appear to be normally distributed.
Take, for example, the distribution of the heights of human beings. The average
height is found to be roughly 175 cm (5' 9"), counting both males and females.
As the chart below shows, most people conform to that average. Meanwhile,
taller and shorter people exist, but with decreasing frequency in the population.
According to the empirical rule, 99.7% of all people will fall within +/- three
standard deviations of the mean, or between 154 cm (5' 0") and 196 cm (6' 5").
Those taller and shorter than this would be quite rare (just 0.15% of the
population each).
np ≥ 5
n(1-p) ≥ 5
When both criteria are met, we can use the normal distribution to answer
probability questions related to the binomial distribution.
For example, suppose we would like to find the probability that a coin lands on
heads less than or equal to 45 times during 100 flips. That is, we want to find P(X ≤
45). To use the normal distribution to approximate the binomial distribution, we
would instead find P(X ≤ 45.5).
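The following table shows when you should add or subtract 0.5, based on the type of
probability you’re trying to find:
Using the binomial distribution    Using the normal distribution with continuity correction
X = 45                             44.5 < X < 45.5
X ≤ 45                             X < 45.5
X < 45                             X < 44.5
X ≥ 45                             X > 44.5
X > 45                             X > 45.5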
The following step-by-step example shows how to use the normal distribution to
approximate the binomial distribution.
Suppose we want to know the probability that a coin lands on heads less than or
equal to 43 times during 100 flips.
Step 1: Verify that the sample size is large enough to use the normal
approximation.
np ≥ 5
n(1-p) ≥ 5
np = 100*0.5 = 50
n(1-p) = 100*(1 – 0.5) = 100*0.5 = 50
Both numbers are greater than 5, so we’re safe to use the normal approximation.
Step 2: Determine the continuity correction to apply.
Referring to the table above, we see that we should add 0.5 when we’re working
with a probability in the form of X ≤ 43. Thus, we will be finding P(X < 43.5).
Step 3: Find the mean (μ) and standard deviation (σ) of the binomial distribution.
μ = n*p = 100*0.5 = 50
σ = √(n*p*(1-p)) = √(100*0.5*(1-0.5)) = √25 = 5
Step 4: Find the z-score using the mean and standard deviation found in the
previous step.
z = (x - μ) / σ = (43.5 - 50) / 5 = -6.5 / 5 = -1.3
Step 5: Find the probability associated with the z-score.
We can use the Normal CDF Calculator to find that the area under the standard
normal curve to the left of -1.3 is .0968.
Thus, the probability that a coin lands on heads less than or equal to 43 times during
100 flips is .0968.
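The worked example can be reproduced end to end and compared against the exact binomial sum. This is a minimal sketch; the helper `normal_cdf` is written here with `math.erf` as a stand-in for the Normal CDF Calculator, and the variable names are illustrative.

```python
from math import comb, erf, sqrt


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


n, p, k = 100, 0.5, 43

# Check the sample-size conditions for the approximation.
assert n * p >= 5 and n * (1 - p) >= 5

# Continuity correction for P(X <= k) adds 0.5 to k.
mu = n * p
sigma = sqrt(n * p * (1 - p))
z = (k + 0.5 - mu) / sigma
approx = normal_cdf(z)

# Exact binomial probability for comparison.
exact = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

print(round(z, 2), round(approx, 4), round(exact, 4))
```

The approximate and exact values agree closely, which is the practical justification for the continuity correction.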
5.6 TRIANGULAR DISTRIBUTION
TRIANGULAR DISTRIBUTION
-The triangular distribution is a continuous probability distribution with a
probability density function shaped like a triangle.
It is defined by three values:
- the minimum value a
- the maximum value b
- the peak (most likely) value c
The name of the distribution comes from the fact that the probability density
function is shaped like a triangle.
This distribution is useful in practice because we can often estimate the minimum
value (a), the maximum value (b), and the most likely value (c) that a random variable
will take on. With just these three values, we can model the behavior of the random
variable using a triangular distribution.
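PDF:
$$f(x) = \begin{cases} \frac{2(x-a)}{(b-a)(c-a)} & \text{for } a \le x \le c \\ \frac{2(b-x)}{(b-a)(b-c)} & \text{for } c < x \le b \\ 0 & \text{otherwise} \end{cases}$$
CDF:
$$F(x) = \begin{cases} 0 & \text{for } x \le a \\ \frac{(x-a)^{2}}{(b-a)(c-a)} & \text{for } a < x \le c \\ 1 - \frac{(b-x)^{2}}{(b-a)(b-c)} & \text{for } c < x < b \\ 1 & \text{for } x \ge b \end{cases}$$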
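The triangular PDF and CDF in their standard form can be sketched in code, with a, b, and c as the minimum, maximum, and most likely values. This is a minimal sketch; the function names and the sample values a = 2, b = 10, c = 4 are hypothetical, and the code assumes a < c < b so that no denominator is zero.

```python
def triangular_pdf(x, a, b, c):
    """Density of a triangular distribution with minimum a, maximum b, peak c."""
    if not a < c < b:
        raise ValueError("this sketch requires a < c < b")
    if x < a or x > b:
        return 0.0
    if x <= c:
        return 2 * (x - a) / ((b - a) * (c - a))  # rising edge
    return 2 * (b - x) / ((b - a) * (b - c))      # falling edge


def triangular_cdf(x, a, b, c):
    """Probability that the variable is less than or equal to x."""
    if not a < c < b:
        raise ValueError("this sketch requires a < c < b")
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= c:
        return (x - a) ** 2 / ((b - a) * (c - a))
    return 1 - (b - x) ** 2 / ((b - a) * (b - c))


a, b, c = 2, 10, 4  # hypothetical minimum, maximum, and most likely value
print(triangular_pdf(c, a, b, c), triangular_cdf(c, a, b, c), triangular_cdf(7, a, b, c))
```

The density peaks at x = c with height 2 / (b - a), and the CDF at c equals (c - a) / (b - a).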
PROJECT IN
SOFTWARE APPLICATION