Quantifying Uncertainty
Digression: The Monty Hall Problem
You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?
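One way to settle the question empirically is to simulate the game. Below is a minimal sketch (the function name and trial count are my own choices, not from the slides):

import random

def monty_hall(switch, trials=100_000):
    """Estimate the win rate of a fixed strategy by simulation."""
    wins = 0
    for _ in range(trials):
        car = random.randrange(3)           # door hiding the car
        pick = random.randrange(3)          # contestant's initial pick
        # Host opens a goat door that is neither the pick nor the car.
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            # Switch to the one remaining closed door.
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += pick == car
    return wins / trials

print("stay:  ", monty_hall(switch=False))   # converges to ~1/3
print("switch:", monty_hall(switch=True))    # converges to ~2/3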
Uncertainty
• Agents need to handle uncertainty, whether due to partial
observability, non-determinism, or a combination of the two.
Problems (let At = "leave for the airport t minutes before the flight"):
“A25 will get me there on time if there's no accident on the bridge, and it doesn't rain, and my tires remain intact, etc.”
(A1440 might reasonably be said to get me there on time, but I'd have to stay overnight in the airport …)
Uncertainty
• Consider a trivial example of uncertain reasoning for medical diagnosis:
(*) Toothache ⇒ Cavity (this rule is faulty: not all toothaches are caused by cavities)
We amend it:
(*) Toothache ⇒ Cavity ∨ GumProblem ∨ Abscess ∨ …
but an almost unlimited list of possible causes would be needed to make the rule true.
Subjective probability:
• Probabilities relate propositions to the agent's own state of knowledge,
e.g., P(A25 | no reported accidents) = 0.06
Axioms of probability
• 0 ≤ P(a) ≤ 1 for any proposition a
• P(true) = 1 and P(false) = 0
• P(¬a) = 1 − P(a)
• Inclusion-Exclusion (probability of a disjunction):
P(a ∨ b) = P(a) + P(b) − P(a ∧ b)
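As a concrete check of these identities, the sketch below (a toy example of my own, using a fair six-sided die) verifies inclusion-exclusion and the negation rule by brute-force enumeration:

from fractions import Fraction

# Sample space: a fair six-sided die.
omega = range(1, 7)

def p(event):
    """P(event) = (# favorable outcomes) / (# outcomes)."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

a = lambda w: w % 2 == 0        # "roll is even"
b = lambda w: w >= 4            # "roll is at least 4"

# Inclusion-exclusion: P(a or b) = P(a) + P(b) - P(a and b)
assert p(lambda w: a(w) or b(w)) == p(a) + p(b) - p(lambda w: a(w) and b(w))
# Negation: P(not a) = 1 - P(a)
assert p(lambda w: not a(w)) == 1 - p(a)
print("axioms check out:", p(a), p(b))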
• Joint probability distribution for a set of random variables gives the probability of
every atomic event on those random variables
• e.g., P(Weather, Cavity) is a 4 × 2 table of values, one entry per combination of Weather ∈ {sunny, rainy, cloudy, snow} and Cavity ∈ {true, false}
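As an illustration, the sketch below builds such a joint distribution as a Python dict and recovers a marginal by summing rows. The numbers are illustrative values chosen only so the table sums to 1, not data from the slides:

# Joint distribution P(Weather, Cavity), stored as a dict.
# Entries are illustrative and sum to 1.
joint = {
    ("sunny",  True): 0.144, ("sunny",  False): 0.576,
    ("rainy",  True): 0.020, ("rainy",  False): 0.080,
    ("cloudy", True): 0.016, ("cloudy", False): 0.064,
    ("snow",   True): 0.020, ("snow",   False): 0.080,
}

# Marginalization: P(Cavity = true) = sum over Weather of P(Weather, cavity)
p_cavity = sum(p for (w, c), p in joint.items() if c)
print("P(cavity) =", round(p_cavity, 3))   # 0.2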
• Frequentists: parameters are fixed constants of a (Platonic) true model; probability is long-run frequency over repeated samples.
• Bayesians: the data are fixed (they are the realized sample); prior beliefs are encoded explicitly, and parameters are described probabilistically.
• Frequentists commonly use the MLE (maximum likelihood estimate) as a point estimate.
Potential issues with the frequentist approach: the philosophical reliance on long-run ‘frequencies’, the problem of induction (Hume) and the black swan paradox, and the fact that exact solutions exist only for a small class of settings.
Bayesian and Frequentist Probability
In the Bayesian framework, conversely, probability is regarded as a measure of uncertainty
pertaining to the practitioner’s knowledge about a particular phenomenon.
The prior belief of the experimenter is not ignored but rather encoded in the process of
calculating probability.
As the Bayesian gathers new information from experiments, this information is used, in
conjunction with prior beliefs, to update the measure of certainty related to a specific outcome.
These ideas are summarized elegantly in the familiar Bayes’ Theorem:
P(H | D) = P(D | H) P(H) / P(D)
where H connotes ‘hypothesis’ and D connotes ‘data’; the left-hand side is referred to as the posterior (of the hypothesis), the numerator factors are called the likelihood (of the data) and the prior (on the hypothesis), respectively, and the denominator P(D) is referred to as the marginal likelihood.
Typically, the point estimate for a parameter used in Bayesian statistics is the mode of the posterior distribution, known as the maximum a posteriori (MAP) estimate, which is given as:
θ_MAP = argmax_θ P(θ | D) = argmax_θ P(D | θ) P(θ)
(the marginal likelihood P(D) drops out because it does not depend on θ).
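To make the MLE/MAP contrast concrete, here is a minimal sketch of my own (coin flips with a Beta prior, a standard conjugate pair; the particular counts and prior are assumptions for illustration):

# Coin-flip example: k heads in n flips.
n, k = 10, 9

# MLE: argmax of the likelihood, which for a Bernoulli model is k/n.
mle = k / n

# MAP with a Beta(a, b) prior on the heads probability.
# The posterior is Beta(k + a, n - k + b); its mode is the MAP estimate.
a, b = 2, 2   # prior mildly favoring fairness (illustrative choice)
map_est = (k + a - 1) / (n + a + b - 2)

print(f"MLE = {mle:.3f}, MAP = {map_est:.3f}")  # MAP shrinks toward 1/2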
Practice Problems
(1) Derive Inclusion-Exclusion from Equations (13.1) and (13.2) in the text.
(2) Consider the set of all possible five-card poker hands dealt fairly (i.e. randomly) from a
single, standard deck.
(i) How many atomic events are there in the joint probability distribution?
(ii) What is the probability of each atomic event?
(iii) What is the probability of being dealt a royal flush?
(iv) Four of a kind?
(v) Given that my first two cards are aces, what is the probability that my total hand consists
of four aces?
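If you want to sanity-check your answers, the counts can be computed directly with binomial coefficients. A minimal sketch (assuming a standard 52-card deck and unordered 5-card hands):

from math import comb

hands = comb(52, 5)                  # (i) number of atomic events
print(hands)                         # 2,598,960; each has probability 1/hands  (ii)

royal_flushes = 4                    # (iii) exactly one royal flush per suit
print(royal_flushes / hands)

four_of_a_kind = 13 * 48             # (iv) choose the rank, then the 5th card
print(four_of_a_kind / hands)

# (v) Given two aces in hand, the other 3 cards come from the remaining 50;
# we need both remaining aces among them.
print(comb(2, 2) * comb(48, 1) / comb(50, 3))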
Inference by enumeration
• Start with the joint probability distribution:
• Typically, we want the posterior distribution of the query variables Y given observed values e for the evidence variables E; the required summation of joint entries is done by summing out the hidden variables H (a code sketch follows this list):
P(Y | E = e) = α P(Y, E = e) = α Σh P(Y, E = e, H = h)
• The terms in the summation are joint entries because Y, E, and H together exhaust the set of random variables
• Obvious problems:
1. Worst-case time complexity O(d^n), where d is the largest arity and n is the number of variables
2. Space complexity O(d^n) to store the joint distribution
3. How to find the numbers for O(d^n) entries?
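A minimal sketch of enumeration over a dentist-domain joint, computing P(Cavity | toothache) by summing out the hidden variable Catch. The specific probabilities are illustrative values assumed here (chosen to sum to 1), not numbers from the slides:

# Full joint over (Cavity, Toothache, Catch); illustrative values.
joint = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def query_cavity(toothache):
    """P(Cavity | Toothache=toothache) by summing out the hidden var Catch."""
    dist = {}
    for cavity in (True, False):
        dist[cavity] = sum(joint[(cavity, toothache, catch)]
                           for catch in (True, False))
    alpha = 1 / sum(dist.values())        # normalize
    return {c: alpha * p for c, p in dist.items()}

print(query_cavity(toothache=True))   # P(cavity | toothache) = 0.6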
Independence
• A and B are independent iff
P(A|B) = P(A) or P(B|A) = P(B) or P(A, B) = P(A) P(B)
Conditional independence
• If I have a cavity, the probability that the probe catches in it doesn't depend on whether I have a toothache:
(1) P(catch | toothache, cavity) = P(catch | cavity)
• An equivalent statement (a quick numeric check of the independence definition follows below):
P(Toothache | Catch, Cavity) = P(Toothache | Cavity)
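A small sketch (toy numbers of my own) that checks the product-rule definition of independence, P(A, B) = P(A) P(B), on a discrete joint:

# A toy joint over two binary variables A and B, built to factor exactly.
p_a, p_b = 0.3, 0.6
joint = {(a, b): (p_a if a else 1 - p_a) * (p_b if b else 1 - p_b)
         for a in (True, False) for b in (True, False)}

def independent(joint):
    """Check P(A, B) = P(A) P(B) for every cell of the joint."""
    pa = {a: sum(p for (x, _), p in joint.items() if x == a) for a in (True, False)}
    pb = {b: sum(p for (_, y), p in joint.items() if y == b) for b in (True, False)}
    return all(abs(joint[(a, b)] - pa[a] * pb[b]) < 1e-12
               for a in (True, False) for b in (True, False))

print(independent(joint))   # True, by construction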
Practice Problems
(2) Suppose that X, Y are independent random variables; let Z be a function of X and Y. Must X and Y be conditionally independent, given Z? Explain.
(3) Suppose you are given a bag containing n unbiased coins. You are told that n – 1 of these
coins are “normal”, with heads on one side and tails on the other, whereas one coin is a fake,
with heads on both sides.
Consider the scenario in which you reach into the bag, pick out a coin at random, flip it, and get
a head. What is the (conditional) probability that the coin you chose is fake?
Bayes' Rule
• Product rule: P(a ∧ b) = P(a | b) P(b) = P(b | a) P(a)
• Bayes' rule: P(a | b) = P(b | a) P(a) / P(b)
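As a worked application of Bayes' rule, the sketch below revisits the coin-bag setup from problem (3) above, with an assumed bag size of n = 10 so you can compare against your hand derivation:

from fractions import Fraction

n = Fraction(10)                      # assumed bag size, for illustration

p_fake = Fraction(1) / n              # prior: one fake among n coins
p_normal = 1 - p_fake
p_heads_given_fake = Fraction(1)      # two-headed coin always shows heads
p_heads_given_normal = Fraction(1, 2)

# Bayes' rule: P(fake | heads) = P(heads | fake) P(fake) / P(heads)
p_heads = p_heads_given_fake * p_fake + p_heads_given_normal * p_normal
posterior = p_heads_given_fake * p_fake / p_heads
print(posterior)    # 2/11 when n = 10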