Unit4 ML
Frequency distribution
A data set is made up of a distribution of values, or scores. In tables or graphs, you can
summarize the frequency of every possible value of a variable in numbers or percentages.
This is called a frequency distribution.
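As a minimal sketch of the idea above, the following Python snippet builds a frequency distribution with counts and percentages. The scores are hypothetical values made up for illustration; they do not come from any data set in these notes.

```python
from collections import Counter

# Hypothetical survey responses (made up for illustration only).
scores = [3, 5, 4, 3, 5, 5, 2, 4, 5, 3]

counts = Counter(scores)   # value -> number of times it occurs
total = len(scores)

# Frequency table: value -> (count, percentage of all responses).
freq_table = {
    value: (count, 100 * count / total)
    for value, count in sorted(counts.items())
}

for value, (count, percent) in freq_table.items():
    print(f"{value}: {count} ({percent:.0f}%)")
```

The same table could be drawn as a bar chart, with one bar per value and the count as its height.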
Measures of variability
Measures of variability give you a sense of how spread out the response values are. The
range, standard deviation and variance each reflect different aspects of spread.
Range
The range gives you an idea of how far apart the most extreme response scores are. To find
the range, simply subtract the lowest value from the highest value.
Standard deviation
The standard deviation (s or SD) is the average amount of variability in your dataset. It tells
you, on average, how far each score lies from the mean. The larger the standard deviation, the
more variable the data set is.
There are six steps for finding the standard deviation:
1. List each score and find their mean.
2. Subtract the mean from each score to get the deviation from the mean.
3. Square each of these deviations.
4. Add up all of the squared deviations.
5. Divide the sum of the squared deviations by N – 1.
6. Find the square root of the number you found.
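The six steps above can be sketched in Python as follows. The scores are hypothetical values chosen for illustration, and the final check against the standard library's statistics.stdev is included only to confirm that the steps reproduce the sample standard deviation.

```python
import math
import statistics

# Hypothetical scores (made up for illustration only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Step 1: find the mean.
mean = sum(scores) / len(scores)
# Step 2: deviation of each score from the mean.
deviations = [x - mean for x in scores]
# Step 3: square each deviation.
squared = [d ** 2 for d in deviations]
# Step 4: add up the squared deviations.
sum_of_squares = sum(squared)
# Step 5: divide by N - 1 (this is the sample variance).
variance = sum_of_squares / (len(scores) - 1)
# Step 6: take the square root.
sd = math.sqrt(variance)

# Range, for comparison: highest value minus lowest value.
value_range = max(scores) - min(scores)

# The manual steps agree with the standard library's sample SD.
assert math.isclose(sd, statistics.stdev(scores))
print(mean, variance, sd, value_range)
```

Step 5 produces the variance as an intermediate result, so squaring the final standard deviation returns the same number.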
Variance
The variance is the average of squared deviations from the mean. Variance reflects the degree
of spread in the data set. The more spread the data, the larger the variance is in relation to the
mean.
To find the variance, simply square the standard deviation. The symbol for variance is s².
Compound Probability
Compound probability is the probability of two or more independent events occurring
together. Compound probability can be calculated for two types of compound events, namely,
mutually exclusive and mutually inclusive compound events. The formulas to calculate the
compound probability for both types of events are different.
Compound probability is a concept that is widely used in the finance industry to assess risks
and assign premiums to various policies. This section covers compound
probability, its formulas, and how to determine it.
What is Compound Probability?
The compound probability of compound events (mutually inclusive or mutually
exclusive) can be defined as the likelihood of occurrence of two or more independent events
together. An independent event is one whose outcome is not affected by the outcome of other
events. Mutually inclusive events are events that can occur at the same time,
while mutually exclusive events are events that cannot take place at the same time. The
compound probability will always lie between 0 and 1.
Compound Probability Formulas
There are two formulas to calculate the compound probability depending on the type
of events that occur. For independent events, the probability that all of them occur is found
by multiplying the probability of the first event by the probability of the second event and
so on. The probability that at least one of two events occurs is given by the compound
probability formulas below:
Mutually Exclusive Events Compound Probability
P(A or B) = P(A) + P(B)
Using set theory this formula is given as,
P(A ∪ B) = P(A) + P(B)
Mutually Inclusive Events Compound Probability
P(A or B) = P(A) + P(B) - P(A and B)
P(A ∪ B) = P(A) + P(B) - P(A ⋂ B)
where, for two independent events A and B, P(A and B) = P(A) × P(B)
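The two formulas above can be checked by counting outcomes. The sketch below assumes a standard 52-card deck as the sample space; the deck and the events (king, queen, heart) are illustrative choices, not examples taken from these notes.

```python
from fractions import Fraction
from itertools import product

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))  # 52 equally likely (rank, suit) outcomes

def prob(event):
    """Probability of an event as (favorable outcomes) / (all outcomes)."""
    return Fraction(len(event), len(deck))

kings = {card for card in deck if card[0] == "K"}
queens = {card for card in deck if card[0] == "Q"}
hearts = {card for card in deck if card[1] == "hearts"}

# Mutually exclusive: a card cannot be both a king and a queen.
p_king_or_queen = prob(kings) + prob(queens)

# Mutually inclusive: the king of hearts belongs to both events.
p_king_or_heart = prob(kings) + prob(hearts) - prob(kings & hearts)

# Rank and suit are independent here, so P(A and B) = P(A) x P(B).
assert prob(kings & hearts) == prob(kings) * prob(hearts)

print(p_king_or_queen, p_king_or_heart)  # 2/13 4/13
```

Using Fraction keeps the results exact, which makes it easy to compare them with hand calculations.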
Notes
Compound probability is the likelihood of occurrence of two independent compound
events together.
Compound probability can be calculated for mutually exclusive and mutually
inclusive compound events.
P(A or B) = P(A) + P(B) and P(A or B) = P(A) + P(B) - P(A and B) are the compound
probability formulas.
Conditional Probability
The conditional probability, as its name suggests, is the probability of an event
occurring given that some condition holds. For example, assume that the probability of a boy playing
tennis in the evening is 95% (0.95) whereas the probability that he plays given that it is a
rainy day is less which is 10% (0.1). Then the former case is just normal probability whereas
the latter case is the conditional probability. In this example, we represent the two
probabilities as P(Play tennis) = 0.95 and P(Play tennis | Rainy day) = 0.1.
Let us learn more about conditional probability along with its formula, examples, and
practice questions.
What Is Conditional Probability?
Conditional probability is one of the important concepts in probability and statistics. The
"probability of A given B" (or) the "probability of A with respect to the condition B" is
denoted by P(A | B), P(A / B), or P_B(A). Thus, P(A | B)
represents the probability of A given that event B has already happened. The
probability of an event may change when a condition is given.
Definition of Conditional Probability
If A and B are two events associated with the same sample space of a random
experiment, the conditional probability of event A given that B has occurred is given by
P(A/B) = P( A ∩ B)/ P (B), provided P(B) ≠ 0.
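Let us understand conditional probability with an example. Let us find the conditional
probability of getting at least two tails (event A) given that the first toss is a head
(event B) when 3 coins are tossed. The sample space, S (the list of all outcomes) when 3 coins
are tossed is given as follows:
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Here B = {HHH, HHT, HTH, HTT} and A ∩ B = {HTT}, so P(A | B) = (1/8) / (4/8) = 1/4.
In the conditional probability formula: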
P(A | B) = The probability of A given B (or) the probability that A happens given that B has happened
P(B | A) = The probability of B given A (or) the probability that B happens given that A has happened
P(A ∩ B) = The probability that both A and B happen
P(A) = The probability of A
P(B) = The probability of B
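The three-coin example above can be reproduced by listing the sample space and counting. This is a minimal sketch; the helper name prob is an illustrative choice.

```python
from fractions import Fraction
from itertools import product

# All 8 outcomes of tossing 3 coins, e.g. "HTT".
sample_space = ["".join(toss) for toss in product("HT", repeat=3)]

A = {s for s in sample_space if s.count("T") >= 2}  # at least two tails
B = {s for s in sample_space if s[0] == "H"}        # head on the first toss

def prob(event):
    """Probability of an event, assuming equally likely outcomes."""
    return Fraction(len(event), len(sample_space))

# P(A | B) = P(A ∩ B) / P(B), valid because P(B) != 0.
p_a_given_b = prob(A & B) / prob(B)
print(p_a_given_b)  # 1/4
```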
Derivation of Conditional Probability
Note that the elements of B which favor the event A are the common elements of A and B.
i.e. the sample points of A ∩ B.
Thus P(A/B) = Number of outcomes favorable to A ∩ B ÷ Number of outcomes favorable to B = n(A ∩ B) / n(B).
Dividing the numerator and the denominator by n(S),
P(A/B) = [n(A ∩ B) / n(S)] / [n(B) / n(S)]
Thus P(A | B) = P(A ∩ B) / P(B)
Properties of Conditional Probability
Here are some properties of conditional probability along with their proofs (derivations)
which we may need to use while solving the problems. All these properties depend on the
conditional probability formula (which is mentioned in the previous section).
Property 1
Let S be the sample space of an experiment and A be any event with P(A) ≠ 0. Then P(S | A) = P(A | A) = 1.
Proof:
By the formula of conditional probability,
P(S | A) = P(S ∩ A) / P(A) = P(A) / P(A) = 1
P(A | A) = P(A ∩ A) / P(A) = P(A) / P(A) = 1
Hence property 1 is proved.
Property 2
Let S be the sample space of an experiment and A and B be any two events. Let E be any
other event such that P(E) ≠ 0. Then P((A ⋃ B) | E) = P(A | E) + P(B | E) - P((A ∩ B) | E).
Proof:
By the formula of conditional probability,
P((A ⋃ B) | E) = [P((A ⋃ B) ∩ E)] / P(E)
= P((A ∩ E) ⋃ (B ∩ E)) / P(E) (using a property of sets)
= [P(A ∩ E) + P(B ∩ E) - P(A ∩ B ∩ E)] / P(E) (using addition theorem of probability)
= P(A ∩ E) / P(E) + P(B ∩ E) / P(E) - P(A ∩ B ∩ E) / P(E)
= P(A | E) + P(B | E) - P((A ∩ B) | E) (By conditional probability formula)
Hence property 2 is proved.
Property 3
P(A' | B) = 1 - P(A | B), where A' is the complement of the set A.
Proof:
By Property 1, we have P(S | B) = 1.
We know that S = A ⋃ A'. Thus by the above property,
P( A ⋃ A' | B) = 1
Since A and A' are disjoint events,
P(A | B) + P(A' | B) = 1
P(A' | B) = 1 - P(A | B)
Hence property 3 is proved.
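As a sanity check rather than a proof, the three properties can be verified by counting on a small sample space. The sketch below assumes three coin tosses with equally likely outcomes; the events A, B and E are illustrative choices.

```python
from fractions import Fraction
from itertools import product

S = {"".join(toss) for toss in product("HT", repeat=3)}  # 8 outcomes

def cond(event, given):
    """P(event | given) by counting outcomes; `given` must be non-empty."""
    return Fraction(len(event & given), len(given))

A = {s for s in S if s.count("T") >= 2}  # at least two tails
B = {s for s in S if s[0] == "H"}        # head on the first toss
E = {s for s in S if s[2] == "T"}        # tail on the third toss

# Property 1: P(S | A) = P(A | A) = 1
assert cond(S, A) == cond(A, A) == 1
# Property 2: P((A ∪ B) | E) = P(A | E) + P(B | E) - P((A ∩ B) | E)
assert cond(A | B, E) == cond(A, E) + cond(B, E) - cond(A & B, E)
# Property 3: P(A' | E) = 1 - P(A | E), where A' = S - A
assert cond(S - A, E) == 1 - cond(A, E)
```

Counting works here only because every outcome is equally likely; otherwise each outcome would need its own probability weight.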
Dependent and Independent Events
The definition of independent and dependent events is connected to conditional probability.
Let us see the definitions of independent and dependent events along with their formulas.
Dependent Events
Dependent events, as the name suggests, are any two events in which the happening of one
event depends on the happening of the other event.
If A depends on B, then the probability of A is P(A | B).
If B depends on A, then the probability of B is P(B | A).
By the conditional probability formulas,
P(A | B) = P(A ∩ B) / P(B) ⇒ P(A ∩ B) = P(A | B) · P(B)
P(B | A) = P(A ∩ B) / P(A) ⇒ P(A ∩ B) = P(B | A) · P(A)
Thus, for two dependent events A and B, the probability that both occur is given by either of the following:
P(A ∩ B) = P(A | B) · P(B) (or)
P(A ∩ B) = P(B | A) · P(A)
Independent Events
Independent events, as the name suggests, are any two events in which the happening of one
event does not depend on the happening of the other event. i.e., if A and B are independent
then P(A | B) = P(A) and P(B | A) = P(B). Thus, to get the formula of independent events, we
just need to replace P(A | B) with P(A) (or P(B | A) with P(B)) in one of the above
(dependent events) formulas. Hence, two events are said to be independent if
P(A ∩ B) = P(A) · P(B)
This is also called the multiplication rule of probability for independent events.
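The condition P(A ∩ B) = P(A) · P(B) gives a direct test for independence. The sketch below applies it to two rolls of a fair die; the events and the helper names prob and independent are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))  # 36 outcomes of two die rolls

def prob(event):
    """Probability of an event, assuming equally likely outcomes."""
    return Fraction(len(event), len(S))

def independent(X, Y):
    """True when P(X ∩ Y) = P(X) · P(Y)."""
    return prob(X & Y) == prob(X) * prob(Y)

first_even = {(a, b) for a, b in S if a % 2 == 0}
sum_is_7 = {(a, b) for a, b in S if a + b == 7}
sum_is_8 = {(a, b) for a, b in S if a + b == 8}

print(independent(first_even, sum_is_7))  # True
print(independent(first_even, sum_is_8))  # False
```

A sum of 7 is equally likely whatever the first roll is, whereas a sum of 8 is impossible when the first roll is 1, which is why only the second pair is dependent.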
Notes:
The probability of A given B is called the conditional probability and it is calculated
using the formula P(A | B) = P(A ∩ B) / P(B).
Conditional probability matters most for dependent events: if P(A | B) differs from
P(A), then A and B are dependent.
If two events A and B are independent, then P(A | B) = P(A) and P(B | A) = P(B).
For any two events A and B, P(A ∩ B) = P(A | B) · P(B). This is called the multiplication
theorem of probability. For independent events it reduces to P(A ∩ B) = P(A) · P(B).
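Example 1: Find P(D | N) from a table of 100 observations in which the row for N has 5
observations in D and 45 not in D, and the row Overweight (N') has the counts 17 and 33.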
Solution:
From the given table, P(N) = (5+45) / 100 = 50/100.
P(D ∩ N) = 5/100.
By the conditional probability formula,
P(D | N) = P(D ∩ N) / P(N)
= (5/100) / (50/100)
= 5/50
= 1/10
Answer: P(D | N) = 1/10.
Example 2: The probability that it will be sunny on Friday is 4/5. The probability that
an ice cream shop will sell ice creams on a sunny Friday is 2/3 and the probability that
the ice cream shop sells ice creams on a non-sunny Friday is 1/3. Then find the
probability that it will be sunny and the ice cream shop sells the ice creams on Friday.
Solution:
Let S be the event that Friday is sunny and I be the event that the ice cream shop
sells ice creams. Then,
P(S) = 4/5.
P(I | S) = 2/3.
P(I | S') = 1/3.
We have to find P(S ∩ I).
We can see that S and I are dependent events. By using the dependent events' formula of
conditional probability,
P(S ∩ I) = P(I | S) · P(S) = (2/3) · (4/5) = 8/15.
Answer: The required probability = 8/15.
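The same multiplication rule gives the probability of every combination of weather and sales, not only the one asked for. A minimal sketch using the numbers of Example 2 (the variable names and outcome labels are illustrative):

```python
from fractions import Fraction

p_sunny = Fraction(4, 5)                 # P(S)
p_sell_given_sunny = Fraction(2, 3)      # P(I | S)
p_sell_given_not_sunny = Fraction(1, 3)  # P(I | S')

# P(X ∩ Y) = P(Y | X) · P(X) for each of the four combinations.
joint = {
    ("sunny", "sells"): p_sell_given_sunny * p_sunny,
    ("sunny", "no sale"): (1 - p_sell_given_sunny) * p_sunny,
    ("not sunny", "sells"): p_sell_given_not_sunny * (1 - p_sunny),
    ("not sunny", "no sale"): (1 - p_sell_given_not_sunny) * (1 - p_sunny),
}

# The four cases cover every possibility, so they sum to 1.
assert sum(joint.values()) == 1
print(joint[("sunny", "sells")])  # 8/15
```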
Example 3: A fair die is rolled twice and the numbers that face up are observed. Find the
conditional probability that the sum of the numbers is 7, given that the first number is
2.
Solution:
Let us determine the sample space of rolling a die twice. S = {(1,1), (1,2), (1,3), (1,4),
(1,5), (1,6), (2,1), (2,2), (2,3), (2,4), (2,5), (2,6), (3,1), (3,2), (3,3), (3,4), (3,5), (3,6), (4,1),
(4,2), (4,3), (4,4), (4,5), (4,6), (5,1), (5,2), (5,3), (5,4), (5,5), (5,6), (6,1), (6,2), (6,3), (6,4),
(6,5), (6,6) }
Considering events A and B as given: we have
A : the sum of the numbers is 7. Thus set A = {(1,6),(2,5), (3,4), (4,3), (5,2),(6,1) }
B: the first number is 2. Thus set B = {(2,1), (2,2), (2,3), (2,4), (2,5), (2,6)}
A ∩ B: {(2,5)}
By the conditional probability, we know that
P(A | B) = P(A ∩ B) / P(B)
P(A | B) = (1/36) / (6/36)
P(A | B) = 1/6
Answer: The conditional probability that the sum of the numbers is 7, given that the
first number is 2 is 1/6
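The result of Example 3 can be confirmed by enumerating the 36 outcomes instead of writing them out by hand. A minimal sketch:

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))   # 36 ordered outcomes
A = {(a, b) for a, b in S if a + b == 7}  # the sum of the numbers is 7
B = {(a, b) for a, b in S if a == 2}      # the first number is 2

p_b = Fraction(len(B), len(S))            # 6/36
p_a_and_b = Fraction(len(A & B), len(S))  # 1/36
p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)  # 1/6
```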
Bayes Theorem
Bayes theorem is a theorem in probability and statistics, named after the Reverend
Thomas Bayes, that helps in determining the probability of an event that is based on some
event that has already occurred. Bayes theorem has many applications, such as Bayesian
inference and, in the healthcare sector, estimating the chances of developing health
problems with increasing age. Here, we will aim at understanding the use
of the Bayes theorem in determining the probability of events, its statement, formula, and
derivation with the help of examples.
What is Bayes Theorem?
Bayes theorem, in simple words, determines the conditional probability of an event A
given that event B has already occurred. Bayes theorem is also known as the Bayes Rule or
Bayes Law. It is a method to determine the probability of an event based on the occurrences
of prior events. It is used to calculate conditional probability. Bayes theorem calculates the
probability of a hypothesis given the observed evidence. Now, let us state the theorem. Bayes
theorem states that the conditional probability of an event A, given the occurrence of another
event B, is equal to the product of the likelihood of B given A and the probability of A,
divided by the probability of B. It is given as:
P(A|B) = P(B|A) P(A) / P(B), provided P(B) ≠ 0
Here, P(A) = how likely A is to happen (Prior) - the probability that a hypothesis is true
before any evidence is present.
P(B) = how likely B is to happen (Marginal probability) - the probability of observing the evidence.
P(A|B) = how likely A is to happen given that B has happened (Posterior) - the probability that a
hypothesis is true given the evidence.
P(B|A) = how likely B is to happen given that A has happened (Likelihood) - the probability of
seeing the evidence if the hypothesis is true.
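To make the roles of prior, likelihood and evidence concrete, here is a minimal sketch of the formula as a Python function. It reuses the numbers from the earlier ice cream example (sunny Friday with probability 4/5) to ask the reverse question: given that the shop sold ice creams, how likely is it that the day was sunny? The function name bayes and the variable names are illustrative assumptions.

```python
from fractions import Fraction

def bayes(prior, likelihood, evidence):
    """Return P(A | B) = P(B | A) * P(A) / P(B); requires P(B) != 0."""
    if evidence == 0:
        raise ValueError("P(B) must be non-zero")
    return likelihood * prior / evidence

p_sunny = Fraction(4, 5)                 # prior P(S)
p_sell_given_sunny = Fraction(2, 3)      # likelihood P(I | S)
p_sell_given_not_sunny = Fraction(1, 3)  # P(I | S')

# Evidence P(I), obtained by summing over sunny and non-sunny days.
p_sell = (p_sell_given_sunny * p_sunny
          + p_sell_given_not_sunny * (1 - p_sunny))

posterior = bayes(p_sunny, p_sell_given_sunny, p_sell)
print(posterior)  # 8/9
```

The posterior 8/9 is higher than the prior 4/5 because sales are more likely on sunny days, so observing a sale is evidence for sunshine.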
Example 2: Assume that the chances of a person having a skin disease are 40%. Assume
that using skin creams and drinking enough water reduces the risk of skin disease by 30% and
that the prescription of a certain drug reduces it by 20%. At a time, a patient can choose any
one of the two options with equal probabilities. It is given that after picking one of the
options, the patient selected at random has the skin disease. Find the probability that the
patient picked the option of skin creams and drinking enough water using the Bayes
theorem.
Solution: Assume E1: The patient uses skin creams and drinks enough water; E2: The patient
uses the drug; A: The selected patient has the skin disease
P(E1) = P(E2) = 1/2
Using the probabilities known to us, we have
P(A|E1) = 0.4 × (1-0.3) = 0.28
P(A|E2) = 0.4 × (1-0.2) = 0.32
Using Bayes Theorem, the probability that the selected patient uses skin creams and drinks
enough water is given by,
P(E1|A) = P(A|E1)P(E1) / [P(A|E1)P(E1) + P(A|E2)P(E2)]
= (0.28 × 0.5)/(0.28 × 0.5 + 0.32 × 0.5)
= 0.14/(0.14 + 0.16)
≈ 0.47
Answer: The probability that the patient picked the first option is 0.47
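When there are several competing hypotheses, the denominator of Bayes theorem is the sum of prior × likelihood over all of them. The sketch below applies this to the skin disease example; the function name posteriors and the dictionary layout are illustrative assumptions.

```python
def posteriors(priors, likelihoods):
    """Return P(hypothesis | evidence) for each hypothesis.

    priors[h] is P(h) and likelihoods[h] is P(evidence | h).
    """
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # P(evidence) by total probability
    return {h: joint[h] / evidence for h in joint}

priors = {"E1": 0.5, "E2": 0.5}
likelihoods = {
    "E1": 0.4 * (1 - 0.3),  # P(A | E1) = 0.28
    "E2": 0.4 * (1 - 0.2),  # P(A | E2) = 0.32
}

result = posteriors(priors, likelihoods)
print(round(result["E1"], 2))  # 0.47
```

The two posteriors sum to 1, so the probability that the patient chose the drug is about 0.53.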
Example 3: A man is known to speak the truth 3 out of 4 times. He draws a card and reports that it
is a king. Find the probability that it is actually a king.
Solution:
Let E be the event that the man reports that king is drawn from the pack of cards
A be the event that the king is drawn
B be the event that the king is not drawn.
Then we have P(A) = probability that a king is drawn = 4/52 = 1/13 (a standard pack of 52 cards has 4 kings)
P(B) = probability that a king is not drawn = 12/13
P(E/A) = Probability that the man tells the truth, reporting a king when a king is actually
drawn = P(truth) = 3/4
P(E/B) = Probability that the man lies, reporting a king when a king is not drawn = P(lie)
= 1/4
Then according to Bayes theorem, the probability that it is actually a king = P(A/E)
= P(A)P(E|A) / [P(A)P(E|A) + P(B)P(E|B)]
= [1/13 × 3/4] ÷ [(1/13 × 3/4) + (12/13 × 1/4)]
= 3/52 ÷ 15/52
= 3/52 × 52/15
= 1/5 = 0.2
Answer: Thus the probability that the drawn card is actually a king = 0.2