Bayes' Rule and Its Use
Bayes’ Theorem
We again consider the conditional probability statement:
P(A|B) = P(A ∩ B) / P(B) = P(A ∩ B) / [P(B|A1)P(A1) + P(B|A2)P(A2) + ··· + P(B|An)P(An)]
in which we have used the Theorem of Total Probability to replace P(B). Now
P(A ∩ B) = P(B ∩ A) = P(B|A) × P(A)
Substituting this in the expression for P(A|B) we immediately obtain the result
P(A|B) = P(B|A) × P(A) / [P(B|A1)P(A1) + P(B|A2)P(A2) + ··· + P(B|An)P(An)]
This is true for any event A, so replacing A by Ai gives the result, known as Bayes' Theorem:
P(Ai|B) = P(B|Ai) × P(Ai) / [P(B|A1)P(A1) + P(B|A2)P(A2) + ··· + P(B|An)P(An)]
This equation is known as Bayes' rule (also Bayes' law or Bayes' theorem).
This simple equation underlies most modern AI systems for probabilistic inference.
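As an illustration, the computation can be expressed directly in a few lines of Python (a minimal sketch; the function name posterior and the example numbers are ours, not from the text):

```python
from typing import Sequence

def posterior(priors: Sequence[float], likelihoods: Sequence[float], i: int) -> float:
    """P(Ai|B) via Bayes' Theorem: priors[k] = P(Ak), likelihoods[k] = P(B|Ak)."""
    # Denominator: Theorem of Total Probability, P(B) = sum over k of P(B|Ak)P(Ak).
    p_b = sum(p * l for p, l in zip(priors, likelihoods))
    return likelihoods[i] * priors[i] / p_b

# Example: two mutually exclusive, exhaustive events A1, A2.
print(posterior([0.3, 0.7], [0.9, 0.2], 0))  # P(A1|B) = 0.27/0.41 ≈ 0.659
```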
Simple Example Problem
• A doctor knows that the disease meningitis causes the patient to have a
stiff neck, say, 50% of the time. The doctor also knows some unconditional
facts: the prior probability of a patient having meningitis is 1/50,000, and
the prior probability of any patient having a stiff neck is 1/20. Letting S be
the proposition that the patient has a stiff neck and M be the proposition
that the patient has meningitis, Bayes' rule gives
P(M|S) = P(S|M) × P(M) / P(S) = (0.5 × 1/50,000) / (1/20) = 0.0002
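Plugging the numbers into Bayes' rule confirms this posterior (a quick Python check; the variable names are ours):

```python
# Bayes' rule for the meningitis example: P(M|S) = P(S|M) P(M) / P(S).
p_s_given_m = 0.5        # P(S|M): stiff neck given meningitis
p_m = 1 / 50_000         # P(M): prior probability of meningitis
p_s = 1 / 20             # P(S): prior probability of a stiff neck

p_m_given_s = p_s_given_m * p_m / p_s
print(p_m_given_s)       # 0.0002, i.e. 1 in 5,000
```

So even though meningitis causes a stiff neck half the time, only about 1 in 5,000 stiff-neck patients is expected to have meningitis, because the prior P(M) is so small.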
Example Problem
The training data are shown in the table above. The data tuples are described by the attributes age,
income, student, and credit rating. The class label attribute, buys computer, has two distinct
values (namely, yes and no). Let C1 correspond to the class buys computer = yes and C2
correspond to buys computer = no. The tuple we wish to classify is X = (age = youth,
income = medium, student = yes, credit rating = fair).
• We need to maximize P(X|Ci)P(Ci) for i = 1, 2. P(Ci), the prior probability of
each class, can be computed from the training tuples, as in the sketch below:
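A minimal naive Bayes sketch in Python that estimates P(Ci) and the class-conditional probabilities from training tuples and picks the maximizing class. The five tuples below are illustrative placeholders, not the actual table, and the factorization P(X|Ci) = ∏k P(xk|Ci) assumes class-conditional independence:

```python
from collections import Counter

# Illustrative training tuples (placeholders standing in for the referenced
# table): (age, income, student, credit_rating, class)
data = [
    ("youth", "high", "no", "fair", "no"),
    ("youth", "medium", "yes", "fair", "yes"),
    ("middle_aged", "high", "no", "excellent", "yes"),
    ("senior", "medium", "yes", "fair", "yes"),
    ("senior", "low", "yes", "excellent", "no"),
]

def classify(x):
    """Return the class Ci maximizing P(X|Ci)P(Ci), with P(X|Ci) factored
    into a product of per-attribute probabilities (naive Bayes assumption)."""
    n = len(data)
    class_counts = Counter(row[-1] for row in data)
    best_class, best_score = None, -1.0
    for c, count in class_counts.items():
        score = count / n  # prior P(Ci) estimated from the training tuples
        rows_c = [row for row in data if row[-1] == c]
        for k, value in enumerate(x):
            # P(xk|Ci): fraction of class-Ci tuples with this attribute value
            score *= sum(1 for row in rows_c if row[k] == value) / len(rows_c)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

X = ("youth", "medium", "yes", "fair")
print(classify(X))  # prints the class with the larger P(X|Ci)P(Ci)
```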