Decision 1
DECISION THEORY
Brian Weatherson
2015
Contents
1 Introduction
  1.1 Decisions and Games
  1.2 Previews
  1.3 Example: Newcomb
  1.4 Example: Sleeping Beauty
3 Uncertainty
  3.1 Likely Outcomes
  3.2 Do What’s Likely to Work
  3.3 Probability and Uncertainty
4 Measures
  4.1 Probability Defined
  4.2 Measures
  4.3 Normalised Measures
  4.4 Formalities
  4.5 Possibility Space
5 Truth Tables
  5.1 Compound Sentences
  5.2 Equivalence, Entailment, Inconsistency, and Logical Truth
  5.3 Two Important Results
9 Expected Utility
  9.1 Expected Values
  9.2 Maximise Expected Utility Rule
  9.3 Structural Features
11 Understanding Probability
  11.1 Kinds of Probability
  11.2 Frequency
  11.3 Degrees of Belief
12 Objective Probabilities
  12.1 Credences and Norms
  12.2 Evidential Probability
  12.3 Objective Chances
  12.4 The Principal Principle and Direct Inference
13 Understanding Utility
  13.1 Utility and Welfare
  13.2 Experiences and Welfare
  13.3 Objective List Theories
14 Subjective Utility
  14.1 Preference Based Theories
  14.2 Interpersonal Comparisons
  14.3 Which Desires Count
15 Declining Marginal Utilities
  15.1 Money and Utility
  15.2 Insurance
  15.3 Diversification
  15.4 Selling Insurance
16 Newcomb’s Problem
  16.1 The Puzzle
  16.2 Two Principles of Decision Theory
  16.3 Bringing Two Principles Together
  16.4 Well Meaning Friends
Introduction
1.1 Decisions and Games
This course is an introduction to decision theory. We’re interested in what to do when the
outcomes of your actions depend on some external facts about which you are uncertain.
The simplest such decision has the following structure.
State 1 State 2
Choice 1 a b
Choice 2 c d
The choices are the options you can take. The states are the ways the world can be that affect
how good an outcome you’ll get. And the variables a, b, c and d are numbers measuring how
good those outcomes are. For now we’ll simply have higher numbers representing better
outcomes, though eventually we’ll want the magnitudes of the numbers to reflect how much
better some outcomes are than others.
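This abstract structure is easy to write down directly in code. Here is a minimal Python sketch; the particular numbers, and the names `table` and `outcome`, are illustrative choices, not from the text:

```python
# A decision table maps each (choice, state) pair to a number measuring
# how good the resulting outcome is; higher numbers are better.
table = {
    ("Choice 1", "State 1"): 1,  # a
    ("Choice 1", "State 2"): 4,  # b
    ("Choice 2", "State 1"): 3,  # c
    ("Choice 2", "State 2"): 2,  # d
}

def outcome(choice, state):
    """Look up how good it is to make this choice when the world is in this state."""
    return table[(choice, state)]
```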
Let’s illustrate this with a simple example. It’s a Sunday afternoon, and you have the
choice between watching a football game and finishing a paper due on Monday. It will be
a little painful to do the paper after the football, but not impossible. It will be fun to watch
football, at least if your team wins. But if they lose you’ll have spent the afternoon watching
them lose, and still have the paper to write. On the other hand, you’ll feel bad if you skip
the game and they win. So we might have the following decision table.

                   Your Team Wins   Your Team Loses
   Watch Football        2                 1
   Work on Paper         4                 3
The numbers of course could be different if you have different preferences. Perhaps your
desire for your team to win is stronger than your desire to avoid regretting missing the
game. In that case the table might look like this.
Your Team Wins Your Team Loses
Watch Football 4 1
Work on Paper 3 2
Either way, what turns out to be for the best depends on what the state of the world is. These
are the kinds of decisions in which we’ll be interested.
Sometimes the relevant state of the world is the action of someone who is, in some
loose sense, interacting with you. For instance, imagine you are playing a game of rock-
paper-scissors. We can represent that game using the following table, with the rows for
your choices and the columns for the other person’s choices.

              Rock      Paper     Scissors
   Rock       0, 0     -1, 1      1, -1
   Paper      1, -1     0, 0     -1, 1
   Scissors  -1, 1      1, -1     0, 0
Not all games are competitive like this. Some games involve coordination. For instance,
imagine you and a friend are trying to meet up somewhere in New York City. You want
to go to a movie, and your friend wants to go to a play, but neither of you wants to go to
something on their own. Sadly, your cell phone is dead, so you’ll just have to go to either the
movie theater or the playhouse, and hope your friend goes to the same location. We might
represent the game you and your friend are playing this way.

            Movie     Play
   Movie    2, 1      0, 0
   Play     0, 0      1, 2
In each cell now there are two numbers, representing first how good the outcome is for you,
and second how good it is for your friend. So if you both go to the movies, that’s the best
outcome for you, and the second-best for your friend. But if you go to different things, that’s
the worst result for both of you. We’ll look a bit at games like this where the parties’ interests
are neither strictly allied nor strictly competitive.
Traditionally there is a large division between decision theory, where the outcome de-
pends just on your choice and the impersonal world, and game theory, where the outcome
depends on the choices made by multiple interacting agents. We’ll follow this tradition here,
focussing on decision theory for the first two-thirds of the course, and then shifting our at-
tention to game theory. But it’s worth noting that this division is fairly arbitrary. Some
decisions depend for their outcome on the choices of entities that are borderline agents,
such as animals or very young children. And some decisions depend for their outcome on
choices of agents that are only minimally interacting with you. For these reasons, among
others, we should be suspicious of theories that draw a sharp line between decision theory
and game theory.
1.2 Previews
Just thinking intuitively about decisions like whether to watch football, it seems clear that
how likely the various states of the world are is highly relevant to what you should do. If
you’re more or less certain that your team will win, and you’ll enjoy watching the win, then
you should watch the game. But if you’re more or less certain that your team will lose, then
it’s better to start working on the term paper. That intuition, that how likely the various
states are affects what the right decision is, is central to modern decision theory.
The best way we have to formally regiment likelihoods is probability theory. So we’ll
spend quite a bit of time in this course looking at probability, because it is central to good
decision making. In particular, we’ll be looking at four things.
First, we’ll spend some time going over the basics of probability theory itself. Many
people, most people in fact, make simple errors when trying to reason probabilistically. This
is especially true when trying to reason with so-called conditional probabilities. We’ll look
at a few common errors, and look at ways to avoid them.
Second, we’ll look at some questions that come up when we try to extend probability
theory to cases where there are infinitely many ways the world could be. Some issues that
come up in these cases affect how we understand probability, and in any case the issues are
philosophically interesting in their own right.
Third, we’ll look at some arguments as to why we should use probability theory, rather
than some other theory of uncertainty, in our reasoning. Outside of philosophy it is some-
times taken for granted that we should mathematically represent uncertainties as proba-
bilities, but this is in fact quite a striking and, if true, profound result. So we’ll pay some
attention to arguments in favour of using probabilities. Some of these arguments will also
be relevant to questions about whether we should represent the value of outcomes with
numbers.
Finally, we’ll look a little at where probabilities come from. The focus here will largely be
negative. We’ll look at reasons why some simple identifications of probabilities either with
numbers of options or with frequencies are unhelpful at best.
In the middle of the course, we’ll look at a few modern puzzles that have been the focus
of attention in decision theory. Later today we’ll go over a couple of examples that illustrate
what we’ll be covering in this section.
The final part of the course will be on game theory. We’ll be looking at some of the
famous examples of two person games. (We’ve already seen a version of one, the movie and
play game, above.) And we’ll be looking at the use of equilibrium concepts in analysing
various kinds of games.
We’ll end with a point that we mentioned above, the connection between decision theory
and game theory. Some parts of the standard treatment of game theory seem not to be
consistent with the best form of decision theory that we’ll look at. So we’ll want to see how
much revision is needed to accommodate our decision theoretic results.
1.3 Example: Newcomb
In front of you are two boxes, call them A and B. You can see that in box B there is $1000,
but you cannot see what is in box A. You have a choice, but not perhaps the one you were
expecting. Your first option is to take just box A, whose contents you do not know. Your
other option is to take both box A and box B, with the extra $1000.
There is, as you may have guessed, a catch. A demon has predicted whether you will
take just one box or take two boxes. The demon is very good at predicting these things –
in the past she has made many similar predictions and been right every time. If the demon
predicts that you will take both boxes, then she’s put nothing in box A. If the demon predicts
you will take just one box, she has put $1,000,000 in box A. So the table looks like this.

                      Predicts you take 1 box   Predicts you take 2 boxes
   Take just box A          $1,000,000                    $0
   Take both boxes          $1,001,000                    $1,000
There are interesting arguments for each of the two options here.
The argument for taking just one box is easy. The way the story has been set up, lots
of people have taken this challenge before you. Those that have taken 1 box have walked
away with a million dollars. Those that have taken both have walked away with a thousand
dollars. You’d prefer to be in the first group to being in the second group, so you should take
just one box.
The argument for taking both boxes is also easy. Either the demon has put the million
in box A or she hasn’t. If she has, you’re better off taking both boxes. That way you’ll get
$1,001,000 rather than $1,000,000. If she has not, you’re better off taking both boxes. That
way you’ll get $1,000 rather than $0. Either way, you’re better off taking both boxes, so you
should do that.
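The two-box argument is simple arithmetic, and can be checked mechanically. Here is a minimal Python sketch; the function name `payoff` and its boolean parameters are illustrative, but the dollar amounts are the ones from the story:

```python
# Box B always holds $1,000; box A holds $1,000,000 if the demon
# predicted one-boxing, and $0 if she predicted two-boxing.
def payoff(take_both, predicted_both):
    box_a = 0 if predicted_both else 1_000_000
    box_b = 1_000
    return box_a + box_b if take_both else box_a

# Whatever the demon predicted, two-boxing pays exactly $1,000 more:
# this is the dominance reasoning behind taking both boxes.
for predicted_both in (True, False):
    assert payoff(True, predicted_both) == payoff(False, predicted_both) + 1_000
```

Of course, this only captures the two-boxer’s side; the one-boxer’s argument turns on the demon’s predictions tracking your choice.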
Both arguments seem quite strong. The problem is that they lead to incompatible con-
clusions. So which is correct?
1.4 Example: Sleeping Beauty
Here is the puzzle. On Sunday night Sleeping Beauty is put to sleep, and a fair coin is
tossed. If it lands heads, she will be woken on Monday, and the experiment ends. If it lands
tails, she will be woken on Monday, put back to sleep with her memory of that waking
erased, and woken again on Tuesday. Each time she is woken, she is asked how probable it
is that the coin landed heads. So we can ask three questions. Question 1: on Sunday night,
just after the coin has been tossed, how probable should she think it is that it landed heads?
Question 2: when she wakes on Monday, how probable should she think it is that it landed
heads? Question 3: when she wakes on Tuesday, how probable should she think it is that it
landed heads?
It seems plausible to suggest that the answers to questions 2 and 3 should be the same.
After all, given that Sleeping Beauty will have forgotten about the Monday waking if she
wakes on Tuesday, then she won’t be able to tell the difference between the Monday and
Tuesday waking. So she should give the same answers on Monday and Tuesday. We’ll as-
sume that in what follows.
First, there seems to be a very good argument for answering 1/2 to question 1. It’s a fair
coin, so it has a probability of 1/2 of landing heads. And it has just been tossed, and there
hasn’t been any ‘funny business’. So that should be the answer.
Second, there seems to be a good, if a little complicated, argument for answering 1/3 to
questions 2 and 3. Assume that questions 2 and 3 are in some sense the same question. And
assume that Sleeping Beauty undergoes this experiment many times. Then she’ll be asked
the question twice as often when the coin lands tails as when it lands heads. That’s because
when it lands tails, she’ll be asked that question twice, but only once when it lands heads.
So only 1/3 of the time when she’s asked this question will it be true that the coin landed
heads. And plausibly, if you’re going to be repeatedly asked how probable it is that such-
and-such happened, and 1/3 of the time when you’re asked that question such-and-such will
have happened, then you should answer 1/3 each time.
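The long-run frequency argument above can be checked with a small simulation. This sketch assumes a fair coin and many repetitions of the experiment, and counts what fraction of wakings occur after a heads toss:

```python
import random

random.seed(0)

# Simulate many runs. If the coin lands heads, Beauty is woken once
# (Monday); if tails, twice (Monday and Tuesday). Across all wakings,
# count how often the coin actually landed heads.
heads_wakings = 0
total_wakings = 0
for _ in range(100_000):
    heads = random.random() < 0.5
    wakings = 1 if heads else 2
    total_wakings += wakings
    if heads:
        heads_wakings += wakings

# In the long run the fraction tends to (0.5 * 1) / (0.5 * 1 + 0.5 * 2) = 1/3.
fraction = heads_wakings / total_wakings
```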
Finally, there seems to be a good argument for answering questions 1 and 2 the same way.
After all, Sleeping Beauty doesn’t learn anything new between the two questions. She wakes
up, but she knew she was going to wake up. And she’s asked the question, but she knew
she was going to be asked the question. And it seems like a decent principle that if nothing
happens between Sunday and Monday to give you new evidence about a proposition, the
probability you assign to that proposition shouldn’t change.
But of course, these three arguments can’t all be correct. So we have to decide which one
is incorrect.
Upcoming
These are just two of the puzzles we’ll be looking at as the course proceeds. Some of these
will be decision puzzles, like Newcomb’s Problem. Some of them will be probability puzzles
that are related to decision theory, like Sleeping Beauty. And some will be game puzzles. I
hope the puzzles are somewhat interesting. I hope even more that we learn something from
them.
Chapter 2
If your team wins, you are better off working on the paper, since 4 > 2. And if your team
loses, you are better off working on the paper, since 3 > 1. So either way you are better off
working on the paper. So you should work on the paper.
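Dominance reasoning like this is mechanical enough to sketch in code. The payoffs below are the ones the argument just cited (4 > 2 if your team wins, 3 > 1 if they lose); the helper name `dominates` is an illustrative choice:

```python
# Decision table: each choice maps to its payoffs across the states
# (Your Team Wins, Your Team Loses); higher numbers are better.
table = {
    "Watch Football": [2, 1],
    "Work on Paper": [4, 3],
}

def dominates(a, b, table):
    """True if choice a is at least as good as b in every state,
    and strictly better in at least one."""
    pa, pb = table[a], table[b]
    return (all(x >= y for x, y in zip(pa, pb))
            and any(x > y for x, y in zip(pa, pb)))
```

Here `"Work on Paper"` dominates `"Watch Football"`, which is exactly why dominance says to work on the paper.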
2.2 States and Choices
Here is an example from Jim Joyce that suggests that dominance might not be as straight-
forward a rule as we suggested above.
Suppose you have just parked in a seedy neighborhood when a man approaches
and offers to “protect” your car from harm for $10. You recognize this as ex-
tortion and have heard that people who refuse “protection” invariably return
to find their windshields smashed. Those who pay find their cars intact. You
cannot park anywhere else because you are late for an important meeting. It
costs $400 to replace a windshield. Should you buy “protection”? Dominance
says that you should not. Since you would rather have the extra $10 both in
the event that your windshield is smashed and in the event that it is not, Domi-
nance tells you not to pay. (Joyce, The Foundations of Causal Decision Theory,
pp. 115–6.)
We can put this in a table to make the dominance argument that Joyce suggests clearer.

                     Broken Windshield   Unbroken Windshield
   Pay extortion          -$410               -$10
   Don’t pay              -$400                $0
In each column, the number in the ‘Don’t pay’ row is higher than the number in the ‘Pay
extortion’ row. So it looks just like the case above where we said dominance gives a clear
answer about what to do. But the conclusion is crazy. Here is how Joyce explains what goes
wrong in the dominance argument.
Of course, this is absurd. Your choice has a direct influence on the state of the
world; refusing to pay makes it likely that your windshield will be smashed while
paying makes this unlikely. The extortionist is a despicable person, but he has
you over a barrel and investing a mere $10 now saves $400 down the line. You
should pay now (and alert the police later).
This seems like a general principle we should endorse. We should define states as being,
intuitively, independent of choices. The idea behind the tables we’ve been using is that the
outcome should depend on two factors - what you do and what the world does. If the ‘states’
are dependent on what choice you make, then we won’t have successfully ‘factorised’ the
dependence of outcomes into these two components.
We’ve used a very intuitive notion of ‘independence’ here, and we’ll have a lot more
to say about that in later sections. It turns out that there are a lot of ways to think about
independence, and they yield different recommendations about what to do. For now, we’ll
try to use ‘states’ that are clearly independent of the choices we make.
2.3 Maximin and Maximax
Dominance is a (relatively) uncontroversial rule, but it doesn’t cover a lot of cases. We’ll
now start looking at rules that are more comprehensive. To start off, let’s consider rules for
optimists and pessimists respectively.
The Maximax rule says that you should maximise the maximum outcome you can get.
Basically, consider the best possible outcome, consider what you’d have to do to bring that
about, and do it. In general, this isn’t a very plausible rule. It recommends taking any kind
of gamble that you are offered. If you took this rule to Wall St, it would recommend buying
the riskiest derivatives you could find, because they might turn out to have the best results.
Perhaps needless to say, I don’t recommend that strategy.
The Maximin rule says that you should maximise the minimum outcome you can get.
So for every choice, you look at the worst-case scenario for that choice. You then pick the
option that has the least bad worst case scenario. Consider the following list of preferences
from our watch football/work on paper example.

                   Your Team Wins   Your Team Loses
   Watch Football        4                 1
   Work on Paper         3                 2
So you’d prefer your team to win, and you’d prefer to watch if they win, and work if they lose.
So the worst-case scenario if you watch the game is that they lose, and that is the worst
outcome in the whole table. The worst-case scenario if you work on the paper is also that
they lose, but that outcome isn’t as bad as watching the game and seeing them lose. So,
according to maximin, you should work on the paper.
We can change the example a little without changing the recommendation.

                   Your Team Wins   Your Team Loses
   Watch Football        4                 1
   Work on Paper         2                 3
In this example, your regret at missing the game overrides your desire for your team to win.
So if you don’t watch, you’d prefer that they lose. Still, the worst-case scenario if you don’t
watch is 2, and the worst-case scenario if you do watch is 1. So, according to maximin, you
should not watch.
Note in this case that the worst case scenario is a different state for different choices.
Maximin doesn’t require that you pick some ‘absolute’ worst-case scenario and decide on
the assumption it is going to happen. Rather, you look at different worst case scenarios for
different choices, and compare them.
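Both rules can be sketched in a few lines of Python. The table below follows the second example, where regret at missing the game overrides your desire for a win; the text only fixes the worst cases (2 and 1), so the exact values 4 and 3 in the other cells are illustrative assumptions consistent with its description:

```python
# Payoffs for each choice across the two states
# (Your Team Wins, Your Team Loses); higher is better.
table = {
    "Watch Football": [4, 1],
    "Work on Paper": [2, 3],
}

def maximax(table):
    """Optimist's rule: pick the choice with the best best case."""
    return max(table, key=lambda choice: max(table[choice]))

def maximin(table):
    """Pessimist's rule: pick the choice with the least bad worst case."""
    return max(table, key=lambda choice: min(table[choice]))
```

Maximax picks watching (best case 4), while maximin picks working on the paper (worst case 2 beats worst case 1), matching the recommendation above. Note that each choice’s worst case is taken separately, just as the text says.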
2.4 Ordinal and Cardinal Utilities
All of the rules we’ve looked at so far depend only on the ranking of the various options. They
don’t depend on how much we prefer one option over another; they just depend on the order
in which we rank the outcomes.
To use the technical language, so far we’ve looked at rules that rely only on ordinal
utilities. The term ordinal here means that we only look at the order of the options. The
rules that we’ll look at next rely on cardinal utilities. Whenever we’re associating outcomes
with numbers in a way where the magnitudes of the differences between the numbers matter,
we’re using cardinal utilities.
It is rather intuitive that something more than the ordering of outcomes should matter
to what decisions we make. Imagine that two agents, Chris and Robin, each have to make
a decision between two airlines to fly them from New York to San Francisco. One airline is
more expensive, the other is more reliable. To oversimplify things, let’s say the unreliable
airline runs well in good weather, but in bad weather, things go wrong. And Chris and Robin
have no way of finding out what the weather along the way will be. They would prefer to
save money, but they’d certainly not prefer for things to go badly wrong. So they face the
following decision table.

                      Good Weather   Bad Weather
   Cheap Airline           4              1
   Reliable Airline        3              2
If we’re just looking at the ordering of outcomes, that is the decision problem facing both
Chris and Robin.
But now let’s fill in some more details about the cheap airlines they could fly. The cheap
airline that Chris might fly has a problem with luggage. If the weather is bad, their passen-
gers’ luggage will be a day late getting to San Francisco. The cheap airline that Robin might
fly has a problem with staying in the air. If the weather is bad, their plane will crash.
Those seem like very different decision problems. It might be worth risking one’s luggage
being a day late in order to get a cheap plane ticket. It’s not worth risking, seriously risking,
a plane crash. (Of course, we all take some risk of being in a plane crash, unless we only ever
fly the most reliable airline that we possibly could.) That’s to say, Chris and Robin are facing
very different decision problems, even though the ranking of the four possible outcomes
is the same in each of their cases. So it seems like some decision rules should be sensitive
to magnitudes of differences between options. The first kind of rule we’ll look at uses the
notion of regret.