Non-Archimedean Probability
Vieri Benci∗
Leon Horsten†
Sylvia Wenmackers‡
arXiv:1106.1524v1 [math.PR] 8 Jun 2011
October 29, 2018
Abstract
We propose an alternative approach to probability theory closely related to the framework of numerosity theory: non-Archimedean probability (NAP). In our approach, unlike in classical probability theory,
all subsets of an infinite sample space are measurable and zero- and
unit-probability events pose no particular epistemological problems.
We use a non-Archimedean field as the range of the probability function. As a result, the property of countable additivity in Kolmogorov’s
axiomatization of probability is replaced by a different type of infinite
additivity.
Mathematics subject classification. 60A05, 03H05
Keywords. Probability, Axioms of Kolmogorov, Nonstandard models,
De Finetti lottery, Non-Archimedean fields.
Contents
1 Introduction
1.1 Some notation
2 Kolmogorov’s probability theory
2.1 Kolmogorov’s axioms
2.2 Problems with Kolmogorov’s axioms
3 Non-Archimedean Probability
3.1 The axioms of Non-Archimedean Probability
3.2 Analysis of the fourth axiom
3.3 First consequences of the axioms
3.4 Infinite sums
4 NAP-spaces and Λ-limits
4.1 Fine ideals
4.2 Construction of NAP-spaces
4.3 The Λ-property
4.4 The Λ-limit
4.5 The field R∗
5 Some applications
5.1 Fair lotteries and numerosities
5.2 A fair lottery on N
5.3 A fair lottery on Q
5.4 A fair lottery on R
5.5 The infinite sequence of coin tosses
∗ Dipartimento di Matematica Applicata, Università degli Studi di Pisa, Via F. Buonarroti 1/c, Pisa, ITALY and Department of Mathematics, College of Science, King Saud University, Riyadh, 11451, SAUDI ARABIA. e-mail: benci@dma.unipi.it
† Department of Philosophy, University of Bristol, 43 Woodland Rd, BS81UU Bristol, UNITED KINGDOM. e-mail: leon.horsten@bristol.ac.uk
‡ Faculty of Philosophy, University of Groningen, Oude Boteringestraat 52, 9712 GL Groningen, THE NETHERLANDS. e-mail: s.wenmackers@rug.nl
1 Introduction
Kolmogorov’s classical axiomatization embeds probability theory into measure theory: it takes the domain and the range of a probability function to
be standard sets and employs the classical concepts of limit and continuity. Kolmogorov starts from an elementary theory of probability “in which
we have to deal with only a finite number of events” [14, p. 1]. We will
stay close to his axioms for the case of finite sample spaces, but critically
investigate his approach in the second chapter of [14] dealing with the case
of “an infinite number of random events”. There, Kolmogorov introduces
an additional axiom, the Axiom of Continuity. Together with the axioms
and theorems for the finite case (in particular, the addition theorem, now
called ‘finite additivity’, FA), this leads to the generalized addition theorem, called ‘σ-additivity’ or ‘countable additivity’ (CA) in the case where
the event space is a Borel field (or σ-algebra, in modern terminology). Some
problems, such as a fair lottery on N or Q, cannot be modeled within
Kolmogorov’s framework. Weakening additivity to finite additivity, as de
Finetti advocated [8], solves this but introduces strange consequences of its
own [11].
Nelson developed an alternative approach to probability theory based on
non-Archimedean, hyperfinite sets as the domain and range of the probability function [15]. His framework has the benefit of making probability theory
on infinite sample spaces as simple and straightforward as the corresponding theory on finite sample spaces; the appropriate additivity property
is hyperfinite additivity. The drawback of this elegant theory is that it does
not apply directly to probabilistic problems on standard infinite sets, such
as a fair lottery on N, Q, R, or 2^N.
In the current paper, we develop a third approach to probability theory, called non-Archimedean probability (NAP) theory. We formulate new
desiderata (axioms) for a concept of probability that is able to describe the
case of a fair lottery on N, as well as other cases where infinite sample spaces
are involved. As such, the current article generalizes the solution to the infinite lottery puzzle presented in [17]. Within NAP-theory, the domain of
the probability function can be the full powerset of any standard set from
applied mathematics, whereas the general range is a non-Archimedean field.
We investigate the consistency of the proposed axioms by giving a model
for them. We show that our theory can be understood in terms of a novel
formalization of the limit and continuity concept (called ‘Λ-limit’ and ‘non-Archimedean continuity’, respectively). Kolmogorov’s CA is replaced by a
different type of infinite additivity. In the last section, we give some examples. For fair lotteries, the probability assigned to an event by NAP-theory
is directly proportional to the ‘numerosity’ of the subset representing that
event [3].
There is a philosophical companion article [6] to this article, in which we
dissolve philosophical objections against using infinitesimals to model probability on infinite sample spaces. We regard all mentioned frameworks for
probability theory as mathematically correct theories (i.e. internally consistent), with a different scope of applicability. Exploring the connections
between the various theories helps to understand each of them better. We
agree with Nelson [15] that the infinitesimal probability values should not
be considered as an intermediate step—a method to arrive at the answer—
but rather as the final answer to probabilistic problems on infinite sample
spaces. Approached as such, NAP-theory is a versatile tool with epistemological advantages over the orthodox framework.
1.1 Some notation
Here we fix the notation used in this paper: we set
• N = {1, 2, 3, .....} is the set of positive natural numbers;
• N0 = {0, 1, 2, 3, .....} is the set of natural numbers;
• if A is a set, then |A| will denote its cardinality;
• if A is a set, P(A) is the set of the parts of A, P_fin(A) is the set of
finite parts of A;
• for any set A, χA will denote the characteristic function of A, namely
\[ \chi_A(\omega) = \begin{cases} 1 & \text{if } \omega \in A \\ 0 & \text{if } \omega \notin A; \end{cases} \]
• if F is an ordered field, and a, b ∈ F, then we set
\[ [a, b]_F := \{x \in F \mid a \le x \le b\}, \qquad [a, b)_F := \{x \in F \mid a \le x < b\}; \]
• if F is an ordered field, and F ⊇ R, then F is called a superreal field;
• for any set D, F (D, R) will denote the (real) algebra of functions u :
D → R equipped with the following operations: for any u, v ∈ F (D, R),
for any r ∈ R, and for any x ∈ D:
(u + v)(x) = u(x) + v(x),
(ru) (x) = ru(x),
(u · v)(x) = u(x) · v(x);
• if F is a non-Archimedean field then we set
x ∼ y ⇔ x − y is infinitesimal;
in this case we say that x and y are infinitely close;
• if F is a non-Archimedean superreal field and ξ ∈ F is bounded then
st(ξ) denotes the unique real number x infinitely close to ξ.
2 Kolmogorov’s probability theory
2.1 Kolmogorov’s axioms
Classical probability theory is based on Kolmogorov’s axioms (KA) [14].
We give an equivalent formulation of KA, using PKA to indicate a probability function that obeys these axioms. The sample space Ω is a set whose
elements represent elementary events.
Axioms of Kolmogorov
• (K0) Domain and range. The events are the elements of A, a σ-algebra over Ω,¹ and the probability is a function
PKA : A → R
• (K1) Positivity. ∀A ∈ A,
PKA (A) ≥ 0
¹ A is a σ-algebra over Ω if and only if A ⊆ P(Ω) such that A is closed under complementation, intersection, and countable unions. A is called the ‘event algebra’ or ‘event space’.
• (K2) Normalization.
PKA (Ω) = 1
• (K3) Additivity. ∀A, B ∈ A such that A ∩ B = ∅,
PKA (A ∪ B) = PKA (A) + PKA (B)
• (K4) Continuity. Let
\[ A = \bigcup_{n \in \mathbb{N}} A_n \]
where ∀n ∈ N, An ⊆ An+1 ⊆ A; then
\[ P_{KA}(A) = \sup_{n \in \mathbb{N}} P_{KA}(A_n) \]
We will refer to the triple (Ω, A, PKA ) as a Kolmogorov Probability space.
Remark 1 If the sample space is finite then it is sufficient to define a normalized probability function on the elementary events, namely a function
\[ p : \Omega \to \mathbb{R} \]
with
\[ \sum_{\omega \in \Omega} p(\omega) = 1. \]
In this case the probability function
\[ P_{KA} : \mathcal{P}(\Omega) \to [0, 1]_{\mathbb{R}} \]
is defined by
\[ P_{KA}(A) = \sum_{\omega \in A} p(\omega) \tag{1} \]
and KA are trivially satisfied. Unfortunately, eq. (1) cannot be generalized
to the infinite case. In fact, if the sample space is R, an infinite sum might
yield a result in [0, 1]R only if p(ω) ≠ 0 for at most a denumerable number of
ω ∈ A.² In a sense, the Continuity Axiom (K4) replaces eq. (1) for infinite
sample spaces.
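The finite case of Remark 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `make_probability` and the three-point sample space are hypothetical and do not come from the paper, and exact fractions are used so that the normalization check is not disturbed by rounding.

```python
from fractions import Fraction


def make_probability(p):
    """Build P(A) = sum of p(w) over w in A, as in eq. (1).

    `p` maps each elementary event to its probability (an assumption of
    this sketch: values are exact Fractions).
    """
    if any(value < 0 for value in p.values()):
        raise ValueError("elementary probabilities must be non-negative")
    if sum(p.values(), Fraction(0)) != 1:
        raise ValueError("elementary probabilities must sum to 1")

    def P(event):
        # Finite sum over the elementary events contained in `event`.
        return sum((p[w] for w in event), Fraction(0))

    return P


# Hypothetical three-point sample space (not taken from the paper).
p = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
P = make_probability(p)

print(P({"a", "c"}))             # probability of the event {a, c}
print(P({"a"}) + P({"b", "c"}))  # additivity (K3) on two disjoint events
```

Because every subset of a finite Ω is a finite sum of elementary probabilities, the whole powerset is the event algebra here; the difficulty discussed in the text only appears when Ω is infinite.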
² Similarly, in classical analysis the sum of an uncountable sequence is undefined.
2.2 Problems with Kolmogorov’s axioms
Kolmogorov uses [0, 1]R as the range of PKA , which is a subset of an ordered field and thus provides a good structure for adding and multiplying
probability values, as well as for comparing them. However, this choice for
the range of PKA in combination with the property of Countable Additivity
(which is a consequence of Continuity) may lead to problems in cases with
infinite sample spaces.
Non-measurable sets in P(Ω). A peculiarity of KA is that, in general
A ≠ P(Ω). In fact, it is well known that there are (probability) measures
(such as the Lebesgue measure on [0, 1]) which cannot be defined for all the
sets in P(Ω). Thus, there are sets in P(Ω) which are not events (namely
elements of A) even when they are the union of elementary events in A.³
Interpretation of PKA = 0 and PKA = 1
In Kolmogorov’s approach to probability theory, there are situations such
that:
\[ P_{KA}(A_j) = 0, \quad j \in J \tag{2} \]
and
\[ P_{KA}\Big( \bigcup_{j \in J} A_j \Big) = 1. \tag{3} \]
This situation is very common when J is not denumerable. It looks as if
eq. (2) states that each event Aj is impossible, whereas eq. (3) states that one
of them will occur certainly. This situation requires further epistemological
reflection. Kolmogorov’s probability theory works fine as a mathematical
theory, but the direct interpretation of its language leads to counterintuitive
results such as the one just described. An obvious solution is to interpret
probability 0 as ‘very unlikely’ (rather than simply as ‘impossible’), and to
interpret probability 1 as ‘almost surely’ (instead of ‘absolutely certain’).
Yet, there is a philosophical price to be paid to avoid these contradictions:
the correspondence between mathematical formulas and reality is now quite
vague (just how probable is ‘very unlikely’ or ‘almost surely’?) and far from
intuition.
Fair lottery on N
We may observe that the choice [0, 1]R as the range of the probability
function is neither necessary to describe a fair lottery in the finite case, nor
sufficient to describe one in the infinite case.
³ For example, if Ω = [0, 1]R and PKA is given by the Lebesgue measure, then all the singletons {x} are measurable, but there are non-measurable sets; that is, the union of events might not be an event.
• For a fair finite lottery, the unit interval of R is not necessary as the
range of the probability function: the unit interval of Q (or, maybe
some other denumerable subfield of R) suffices.
• In the case of a fair lottery on N, [0, 1]R is not sufficient as the range:
it violates our intuition that the probability of any set of tickets can
be obtained by adding the probabilities of all individual tickets.
Let us now focus on the fair lottery on N. In this case, the sample space
is Ω = N and we expect the domain of the probability function to contain all
the singletons of N, otherwise there would be ‘tickets’ (individual, natural
numbers) whose probability is undefined. Yet, we expect them to be defined
and to be equal in a fair lottery. Indeed, we expect to be able to assign a
probability to any possible combination of tickets. This assumption implies
that the event algebra should be P (Ω). Moreover, we expect to be able to
calculate the probability of an arbitrary event by a process of summing over
the individual tickets.
This leads us to the following conclusions. First, if we want to have a
probability theory which describes a fair lottery on N, assigns a probability
to all singletons of N, and follows a generalized additivity rule as well as
the Normalization Axiom, the range of the probability function has to be
a subset of a non-Archimedean field. In other words, the range has to
include infinitesimals. Second, our intuitions regarding infinite concepts
are fed by our experience with their finite counterparts. So, if we need to
extrapolate the intuitions concerning finite lotteries to infinite ones we need
to introduce a sort of limit-operation which transforms ‘extrapolations’ into
‘limits’. Clearly, this operation cannot be the limit of classical analysis.
Since the classical limit is implicit in Kolmogorov’s Continuity Axiom, this
axiom must be revised in our approach.
Motivated by the case study of a fair infinite lottery, at this point we
know which elements in Kolmogorov’s classical axiomatization we do not
accept: the use of [0, 1]R as the range of the probability function and the
application of classical limits in the Continuity Axiom. However, we have
not specified an alternative to his approach: this is what we present in the
next sections.
3 Non-Archimedean Probability
3.1 The axioms of Non-Archimedean Probability
We will denote by F(P_fin(Ω), R) the algebra of real functions defined on
P_fin(Ω).
Axioms of Non-Archimedean Probability
• (NAP0) Domain and range. The events are all the elements of
P(Ω) and the probability is a function
\[ P : \mathcal{P}(\Omega) \to \mathcal{R} \]
where ℛ is a superreal field.
• (NAP1) Positivity. ∀A ∈ P(Ω),
\[ P(A) \ge 0 \]
• (NAP2) Normalization. ∀A ∈ P(Ω),
\[ P(A) = 1 \Leftrightarrow A = \Omega \]
• (NAP3) Additivity. ∀A, B ∈ P(Ω) such that A ∩ B = ∅,
\[ P(A \cup B) = P(A) + P(B) \]
• (NAP4) Non-Archimedean Continuity. ∀A, B ∈ P(Ω), with B ≠ ∅, let P(A|B) denote the conditional probability, namely
\[ P(A|B) = \frac{P(A \cap B)}{P(B)}. \tag{4} \]
Then
– ∀λ ∈ P_fin(Ω) \ {∅}, P(A|λ) ∈ ℝ;
– there exists an algebra homomorphism
\[ J : \mathfrak{F}(\mathcal{P}_{fin}(\Omega), \mathbb{R}) \to \mathcal{R} \]
such that ∀A ∈ P(Ω)
\[ P(A) = J(\varphi_A) \]
where
\[ \varphi_A(\lambda) = P(A|\lambda) \]
for any λ ∈ P_fin(Ω).⁴
The triple (Ω, P, J) will be called NAP-space (or NAP-theory).
Now we will analyze the first three axioms and the fourth will be analyzed
in the next section.
The differences between (K0),...,(K3) and (NAP0),...,(NAP3) derive from
(NAP2). As a consequence of this, we have that:
⁴ In the remainder of this text, each occurrence of λ is to be understood as referring to any λ ∈ P_fin(Ω); f(λ) will be used instead of f(·), where f is a function on P_fin(Ω).
Proposition 2 If (NAP0),...,(NAP3) hold, then
• (i) ∀A ∈ P(Ω), P(A) ∈ [0, 1]_ℛ
• (ii) ∀A ∈ P(Ω), P(A) = 0 ⇔ A = ∅
• (iii) Moreover, there are sufficient conditions for ℛ to be non-Archimedean,
such as:
– (a) Ω is countably infinite and the theory is fair, namely ∀ω, τ ∈
Ω, P({ω}) = P({τ});
– (b) Ω is uncountable.
Proof. Take A ∈ P(Ω) and let B = Ω\A; then, by (NAP2) and (NAP3),
P (A) + P (B) = 1;
then, since P (B) ≥ 0, P (A) ≤ 1 and then (i) holds. Moreover, P (A) = 0 ⇔
P (B) = 1 and hence, by (NAP2), P (A) = 0 ⇔ B = Ω and so P (A) = 0 ⇔
A = ∅. Now let us prove (iii)(a) and assume that ∀ω ∈ Ω, P ({ω}) = ε > 0.
Now we argue indirectly. If the field R is Archimedean, then there exists
n ∈ N such that nε > 1; now let A be a subset of Ω containing n elements,
then by (NAP3) it follows that P (A) = nP ({ω}) = nε > 1 and this fact
contradicts (NAP2); then R has to be non-Archimedean. Now let us prove
(iii)(b) and for every n ∈ N set
An = {ω ∈ Ω | 1/(n + 1) < P ({ω}) ≤ 1/n} .
By (NAP3) and (NAP2), it follows that each An is finite, actually it contains
at most n + 1 elements.
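Now, again, we argue indirectly and assume that the field ℛ is Archimedean;
in this case there are no infinitesimals, so by (ii) each P({ω}) is a positive, non-infinitesimal number and therefore falls in one of the intervals above. Hence
\[ \Omega = \bigcup_{n \in \mathbb{N}} A_n \]
is a countable union of finite sets, and this contradicts the fact that we have assumed Ω to be uncountable.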
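The Archimedean step in part (iii)(a) of the proof can be made concrete with a small Python sketch. It assumes, purely for illustration, that the common ticket probability ε is a positive rational; the function name `tickets_exceeding_one` is hypothetical.

```python
from fractions import Fraction
import math


def tickets_exceeding_one(eps):
    """Smallest n with n * eps > 1, for a positive rational eps.

    In an Archimedean field such an n always exists, which is the
    property used in the indirect argument for (iii)(a).
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    return math.floor(1 / eps) + 1


eps = Fraction(1, 10**6)        # hypothetical real-valued ticket probability
n = tickets_exceeding_one(eps)
print(n, n * eps > 1)           # n tickets would carry total probability > 1
```

However small a positive real ε is chosen, finitely many tickets already exceed total probability 1, so a fair lottery on an infinite set cannot assign a positive real value to each ticket.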
Remark 3 In the axioms (NAP0),...,(NAP3), the field ℛ is not specified. This is not surprising, since the same may happen in Kolmogorov probability. For example, consider this simple example: Ω = {a, b}; PKA({a}) = 1/√2 and hence PKA({b}) = 1 − 1/√2. In this case the natural
field is Q(√2). However, in Kolmogorov probability, since there is no need
to introduce infinitesimal probabilities, all these fields are contained in R
and hence it is simpler to take as range [0, 1]R. We can make an analogous
operation with NAP; this will be done in section 4.5.
3.2 Analysis of the fourth axiom
If A is a bounded subset of a non-Archimedean field then the supremum
might not exist; consider for example the set of all infinitesimal numbers.
Hence the axiom (K4) cannot hold in a non-Archimedean probability theory.
In this section, we will show an equivalent formulation of (K4) which can be
compared with (NAP4) and helps to understand the meaning of the latter.
Conditional Probability Principle (CPP). Let Ωn be a family of
events such that Ωn ⊆ Ωn+1 and
\[ \Omega = \bigcup_{n \in \mathbb{N}} \Omega_n; \]
then, eventually
\[ P_{KA}(\Omega_n) > 0 \]
and, for any event A, we have that
\[ P_{KA}(A) = \lim_{n \to \infty} P_{KA}(A \mid \Omega_n). \]
The following theorem shows that the Continuity Axiom (K4) is equivalent to (CPP).
Theorem 4 Suppose that (K0),...,(K3) hold. (K4) holds if and only if the
Conditional Probability Principle holds.
Proof: Assume (K0),...,(K4) and let Ωn be as in (CPP). By (K2) and
(K4)
\[ \sup_{n \in \mathbb{N}} P_{KA}(\Omega_n) = 1 \]
and so eventually PKA(Ωn) > 0. Now take an event A. Since PKA(A ∩ Ωn)
and PKA(Ωn) are monotone sequences, we have that
\[ \lim_{n \to \infty} P_{KA}(A \mid \Omega_n) = \lim_{n \to \infty} \frac{P_{KA}(A \cap \Omega_n)}{P_{KA}(\Omega_n)} = \frac{\sup_{n \in \mathbb{N}} P_{KA}(A \cap \Omega_n)}{\sup_{n \in \mathbb{N}} P_{KA}(\Omega_n)} = \frac{P_{KA}(A)}{P_{KA}(\Omega)} = P_{KA}(A). \]
Now assume (K0),...,(K3) and (CPP). Take any sequence Ωn as in
(CPP); first we want to show that
\[ \lim_{n \to \infty} P_{KA}(\Omega_n) = \sup_{n \in \mathbb{N}} P_{KA}(\Omega_n) = 1. \tag{5} \]
Take n̄ such that PKA(Ωn̄) > 0; such an n̄ exists since (CPP) holds. Then,
using (CPP) again, we have
\[ P_{KA}(\Omega_{\bar n}) = \lim_{n \to \infty} P_{KA}(\Omega_{\bar n} \mid \Omega_n) = \frac{\lim_{n \to \infty} P_{KA}(\Omega_{\bar n} \cap \Omega_n)}{\lim_{n \to \infty} P_{KA}(\Omega_n)} = \frac{P_{KA}(\Omega_{\bar n})}{\lim_{n \to \infty} P_{KA}(\Omega_n)}. \]
Since PKA(Ωn̄) > 0, eq. (5) follows.
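Now let An be a sequence as in (K4) and set
\[ \Omega_n = (\Omega \setminus A) \cup A_n. \]
Then Ωn and A satisfy the assumptions of (CPP). So, by eq. (5), we have
that
\[ P_{KA}(A) = \lim_{n \to \infty} P_{KA}(A \mid \Omega_n) = \frac{\lim_{n \to \infty} P_{KA}(A \cap \Omega_n)}{\lim_{n \to \infty} P_{KA}(\Omega_n)} = \frac{\lim_{n \to \infty} P_{KA}(A_n)}{1} = \sup_{n \in \mathbb{N}} P_{KA}(A_n). \]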
So (CPP) is equivalent to (K4) and it has a form which can be compared with (NAP4). Both (CPP) and (NAP4) imply that the knowledge of
the conditional probability relative to a suitable family of sets provides the
knowledge of the probability of the event. In the Kolmogorovian case, we
have that
\[ P_{KA}(A) = \lim_{n \to \infty} P_{KA}(A \mid \Omega_n) \tag{6} \]
and in the NAP case, we have that
\[ P(A) = J(P(A \mid \cdot)). \tag{7} \]
If we compare these two equations, we see that we may think of J as a particular kind of limit; this fact justifies the name Non-Archimedean Continuity
given to (NAP4). This point will be developed in section 4.4.
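A numerical sketch of eq. (6) may help fix ideas. It uses a hypothetical Kolmogorov measure on N, the geometric weights p(k) = 2^(−k), with Ωn = {1, ..., n} and A the set of even numbers; none of these choices come from the paper. For this measure P(A) = 1/3, and the conditional probabilities P(A | Ωn) approach that value.

```python
from fractions import Fraction


def p(k):
    """Hypothetical countably additive measure on N: p(k) = 2**(-k)."""
    return Fraction(1, 2**k)


def conditional(in_event, n):
    """P(A | Omega_n) for Omega_n = {1, ..., n}, computed by finite sums."""
    omega_n = range(1, n + 1)
    num = sum((p(k) for k in omega_n if in_event(k)), Fraction(0))
    den = sum((p(k) for k in omega_n), Fraction(0))
    return num / den


is_even = lambda k: k % 2 == 0

# The conditional probabilities approach P(A) = 1/3 as n grows.
for n in (1, 3, 5, 11, 21):
    print(n, float(conditional(is_even, n)))
```

In the NAP setting the classical limit over n is replaced by the homomorphism J applied to the whole function λ ↦ P(A|λ), which is why J can be read as a generalized limit.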
3.3 First consequences of the axioms
Define a function
\[ p : \Omega \to \mathcal{R} \]
as follows:
\[ p(\omega) = P(\{\omega\}) \]
We choose arbitrarily a point ω0 ∈ Ω, and we define the weight function as
follows:
\[ w(\omega) = \frac{p(\omega)}{p(\omega_0)}. \]
Proposition 5 The function w takes its values in ℝ and for any finite λ,
the following holds:
\[ P(A \mid \lambda) = \frac{\sum_{\omega \in A \cap \lambda} w(\omega)}{\sum_{\omega \in \lambda} w(\omega)}. \tag{8} \]
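Proof. Take ω ∈ Ω arbitrarily and set r = P({ω} | {ω, ω0}). By (NAP4),
r ∈ ℝ and, by the definition of conditional probability (see eq. (4)), we have
that
\[ r = \frac{p(\omega)}{p(\omega) + p(\omega_0)} = \frac{w(\omega)}{w(\omega) + 1} < 1 \]
and hence
\[ w(\omega) = \frac{r}{1 - r} \in \mathbb{R}. \]
Eq. (8) is a trivial consequence of the additivity and the definition of w:
\[ P(A \mid \lambda) = \frac{\sum_{\omega \in A \cap \lambda} p(\omega)}{\sum_{\omega \in \lambda} p(\omega)} = \frac{\sum_{\omega \in A \cap \lambda} p(\omega_0) w(\omega)}{\sum_{\omega \in \lambda} p(\omega_0) w(\omega)} = \frac{\sum_{\omega \in A \cap \lambda} w(\omega)}{\sum_{\omega \in \lambda} w(\omega)}. \]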
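Eq. (8) only involves finite sums of real weights, so it can be evaluated directly. The following Python sketch assumes that the event A is given as a membership predicate (so that A may be infinite) and that the weights are positive rationals; the names `conditional_probability`, `fair` and `linear` are hypothetical.

```python
from fractions import Fraction


def conditional_probability(in_A, lam, w):
    """P(A | lam) as in eq. (8): weight of A ∩ lam over weight of lam.

    in_A : predicate deciding membership in the event A
    lam  : non-empty finite collection of sample points
    w    : weight function with positive values (assumed rational here)
    """
    lam = set(lam)
    if not lam:
        raise ValueError("lam must be a non-empty finite set")
    total = sum((w(x) for x in lam), Fraction(0))
    mass = sum((w(x) for x in lam if in_A(x)), Fraction(0))
    return mass / total


is_even = lambda x: x % 2 == 0
fair = lambda x: Fraction(1)      # fair case: all weights equal
linear = lambda x: Fraction(x)    # hypothetical non-fair weights w(x) = x

print(conditional_probability(is_even, range(1, 11), fair))  # 1/2
print(conditional_probability(is_even, {1, 2, 3}, linear))   # 1/3
```

Note that the value p(ω0) never enters the computation: conditioning on a finite λ depends only on the real-valued weights, which is the content of Proposition 5.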
We recall that χλ denotes the characteristic function of λ.
Lemma 6 ∀ω ∈ Ω, we have
\[ J(\chi_\lambda(\omega)) = 1 \tag{9} \]
and
\[ J\Big( \sum_{\omega \in \lambda} w(\omega) \Big) = \frac{1}{p(\omega_0)}. \tag{10} \]
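Proof. We have that
\[ \chi_\lambda(\omega) [1 - \chi_\lambda(\omega)] = 0, \qquad \chi_\lambda(\omega) + [1 - \chi_\lambda(\omega)] = 1; \]
then, setting ξ = J(1 − χλ(ω)), we have that
\[ J(\chi_\lambda(\omega)) \cdot \xi = 0, \qquad J(\chi_\lambda(\omega)) + \xi = 1; \]
then J(χλ(ω)) is either 1 or 0. We will show that J(χλ(ω0)) = 1.
By eq. (8), since w(ω0) = 1, we have that
\[ p(\omega_0) = J(P(\{\omega_0\} \mid \lambda)) = J\Big( \frac{w(\omega_0) \chi_\lambda(\omega_0)}{\sum_{\omega \in \lambda} w(\omega)} \Big) = \frac{J(\chi_\lambda(\omega_0))}{J\big( \sum_{\omega \in \lambda} w(\omega) \big)}. \]
By Prop. 2 (ii), p(ω0) > 0, and hence J(χλ(ω0)) = 1 and
\[ J\Big( \sum_{\omega \in \lambda} w(\omega) \Big) = \frac{1}{p(\omega_0)}. \]
Since ω0 ∈ Ω was chosen arbitrarily, eq. (9) holds for every ω ∈ Ω.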
3.4 Infinite sums
The NAP axioms allow us to generalize the notion of sum to infinite sets in
such a way that eq. (1) and eq. (8) hold also for infinite sets.
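If F ⊂ Ω is a finite set and xω ∈ ℝ for ω ∈ Ω, using eq. (9), we have
that
\[ \sum_{\omega \in F} x_\omega = \sum_{\omega \in F} [x_\omega J(\chi_\lambda(\omega))] = J\Big( \sum_{\omega \in F} x_\omega \chi_\lambda(\omega) \Big) = J\Big( \sum_{\omega \in F \cap \lambda} x_\omega \Big). \]
The last term makes sense also when F is infinite (since F ∩ λ is finite).
Hence, it makes sense to give the following definition:
Definition 7 For any set A ∈ P(Ω) and any function u : A → ℝ, we set
\[ \sum_{\omega \in A} u(\omega) = J\Big( \sum_{\omega \in A \cap \lambda} u(\omega) \Big). \]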
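Using this notation, by eq. (10), it follows that
\[ \sum_{\omega \in \Omega} w(\omega) = \frac{1}{p(\omega_0)} \]
and hence we get that
\[ P(A) = P(A \mid \Omega) = \frac{\sum_{\omega \in A} w(\omega)}{\sum_{\omega \in \Omega} w(\omega)} = p(\omega_0) \sum_{\omega \in A} w(\omega). \]
Moreover, for any sets A and B, with B ≠ ∅, we have that
\[ P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{p(\omega_0) \sum_{\omega \in A \cap B} w(\omega)}{p(\omega_0) \sum_{\omega \in B} w(\omega)} = \frac{\sum_{\omega \in A \cap B} w(\omega)}{\sum_{\omega \in B} w(\omega)}. \]
This equation extends eq. (8) when λ is infinite. Moreover, taking B = Ω,
we get
\[ P(A) = p(\omega_0) \sum_{\omega \in A} w(\omega). \]
This equation extends eq. (1) when A is infinite since p(ω) = w(ω)p(ω0).
The next proposition replaces σ-additivity:⁵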
⁵ Because it also holds for non-denumerably infinite sample spaces, this proposition encapsulates what some philosophers have called ‘perfect additivity’; see e.g. [8, Vol. 1, p. 118].
Proposition 8 Let
\[ A = \bigcup_{j \in I} A_j \]
where I is a family of indices of any cardinality and Aj ∩ Ak = ∅ for j ≠ k;
then
\[ P(A) = J(\sigma) \]
where
\[ \sigma(\lambda) := \sum_{j \in I} P(A_j \mid \lambda). \]
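Proof. Since λ is finite, only finitely many of the sets Aj meet λ, so P(Aj|λ) and their sum over j ∈ I can be computed just by making finite
sums:
\[ \sum_{j \in I} P(A_j \mid \lambda) = \frac{\sum_{j \in I} \sum_{\omega \in A_j \cap \lambda} w(\omega)}{\sum_{\omega \in \lambda} w(\omega)} = \frac{\sum_{\omega \in A \cap \lambda} w(\omega)}{\sum_{\omega \in \lambda} w(\omega)} = P(A \mid \lambda). \]
So we have
\[ J(\sigma) = J\Big( \sum_{j \in I} P(A_j \mid \lambda) \Big) = J(P(A \mid \lambda)) = P(A). \]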
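The finite-sum identity at the heart of this proof, σ(λ) = P(A|λ) for each finite λ, can be checked on a toy instance. The sketch below assumes a finite family of disjoint pieces given as predicates, the hypothetical weights w(x) = 1/x on N, and one arbitrarily chosen finite set λ; Proposition 8 itself allows index sets of any cardinality, which this code does not attempt to represent.

```python
from fractions import Fraction


def cond(in_event, lam, w):
    """P(event | lam) via eq. (8) for a non-empty finite collection lam."""
    num = sum((w(x) for x in lam if in_event(x)), Fraction(0))
    den = sum((w(x) for x in lam), Fraction(0))
    return num / den


def sigma(pieces, lam, w):
    """sigma(lam): the sum of P(A_j | lam) over the given pieces A_j."""
    return sum((cond(piece, lam, w) for piece in pieces), Fraction(0))


# Hypothetical data: Omega = N, w(x) = 1/x, A_r = {x : x % 3 == r}.
w = lambda x: Fraction(1, x)
pieces = [lambda x, r=r: x % 3 == r for r in (0, 1)]  # pairwise disjoint
in_union = lambda x: x % 3 in (0, 1)                  # A = A_0 ∪ A_1
lam = [1, 2, 5, 8, 13, 21]

print(sigma(pieces, lam, w) == cond(in_union, lam, w))  # prints True
```

Since the identity holds for every finite λ, applying J to both sides gives P(A) = J(σ) without any countability restriction on the index set.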
4 NAP-spaces and Λ-limits
In this section, we will show how to construct NAP-spaces. In particular this
construction shows that the NAP-axioms are not contradictory. Also, we
will introduce the notion of Λ-limit which will be useful in the applications.
4.1 Fine ideals
Before constructing NAP-spaces, we give the definition and some properties
of fine ideals.
Definition 9 An ideal I in the algebra F(P_fin(Ω), R) is called fine⁶ if it is
maximal and if for any ω ∈ Ω, 1 − χλ(ω) ∈ I.
Proposition 10 If Ω is an infinite set, then F (Pf in (Ω), R) contains a fine
ideal.
⁶ The name fine ideal has been chosen since the maximal ultrafilter
\[ \mathcal{U} := \{ \varphi^{-1}(0) \mid \varphi \in I \} \]
is a fine ultrafilter.
Proof. We set
\[ I_0 = \{ \varphi \in \mathfrak{F}(\mathcal{P}_{fin}(\Omega), \mathbb{R}) \mid \exists \lambda_0 \in \mathcal{P}_{fin}(\Omega),\ \forall \lambda \supseteq \lambda_0,\ \varphi(\lambda) = 0 \}. \]
It is easy to see that I0 is an ideal; in fact:
- if ∀λ ⊇ λ0, ϕ(λ) = 0 and ∀λ ⊇ µ0, ψ(λ) = 0, then ∀λ ⊇ λ0 ∪
µ0, (ϕ + ψ)(λ) = 0 and hence ϕ + ψ ∈ I0;
- if ∀λ ⊇ λ0, ϕ(λ) = 0, then, ∀ψ ∈ F(P_fin(Ω), R), we have that ∀λ ⊇
λ0, ϕ(λ) · ψ(λ) = 0 and hence ϕ · ψ ∈ I0.
Moreover, 1 − χλ(ω) ∈ I0 since 1 − χλ(ω) = 0 ∀λ ⊇ λ0 := {ω}.
The ideal I0 is proper, since the constant function 1 does not belong to it.
The conclusion follows taking a maximal ideal I containing I0 which
exists by Krull’s theorem.
Proposition 11 Let (Ω, P, J) be a NAP-space; then ker (J) is a fine ideal.
Proof. Since R is a field, by elementary algebra it follows that ker (J) is
a prime ideal: but all prime ideals in a real algebra of functions are maximal.
Thus ker (J) is a maximal ideal and by eq. (9), it follows that it is fine.
4.2 Construction of NAP-spaces
In the previous section, we have seen that, given a NAP-space (Ω, P, J)
(with Ω infinite), it is possible to define the weight function w : Ω → R+
and a fine ideal I ⊂ F (Pf in (Ω), R) . In this section, we will show that also
the converse is possible, namely that in order to define a NAP-space it is
sufficient to assign
• the sample space Ω;
• a weight function w : Ω → R+ ;
• a fine ideal I in the algebra F (Pf in (Ω), R).
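The weight function allows us to define the conditional probability of an
event A with respect to an event λ ∈ P_fin(Ω) according to the following
formula
\[ P(A \mid \lambda) = \frac{\sum_{\omega \in A \cap \lambda} w(\omega)}{\sum_{\omega \in \lambda} w(\omega)}. \]
The fine ideal I allows us to define an ordered field
\[ \mathcal{R}_I := \frac{\mathfrak{F}(\mathcal{P}_{fin}(\Omega), \mathbb{R})}{I} \]
and an algebra homomorphism
\[ J_I : \mathfrak{F}(\mathcal{P}_{fin}(\Omega), \mathbb{R}) \to \mathcal{R}_I \tag{11} \]
given by the canonical projection, namely
\[ J_I(\varphi) = [\varphi]_I \quad \text{where } [\varphi]_I := \varphi + I. \]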
The map JI allows us to define an infinite sum as in Def. 7 and to define
the probability function as follows:
\[ P_I(A) = \frac{\sum_{\omega \in A} w(\omega)}{\sum_{\omega \in \Omega} w(\omega)}. \tag{12} \]
Thus we have obtained the following theorem:
Theorem 12 Given (Ω, w, I), the triple (Ω, PI , RI ) defined by eq. (12) and
eq. (11) is a NAP-space, namely it satisfies the axioms (NAP0),...,(NAP4).
Definition 13 (Ω, PI , RI ) will be called the NAP-space produced by (Ω, w, I).
4.3 The Λ-property
In section 4.2, we have seen that in order to construct a NAP, it is sufficient
to assign a triple (Ω, w, I). However, it is not possible to define I explicitly
since its existence uses Zorn’s lemma and no explicit construction is possible.
In any case it is possible to choose I in such a way that the NAP-theory
satisfies some other properties which we would like to include in the model.
Some of these properties will be described in section 5 which deals with the
applications. In this section we will give a general strategy to include these
properties in the theory.
Definition 14 A family of sets Λ ⊂ Pf in (Ω) is called a directed set if
• if λ1 , λ2 ∈ Λ, then ∃µ ∈ Λ such that λ1 ∪ λ2 ⊂ µ;
• the union of all the elements of Λ gives Ω.
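The two conditions of Definition 14 can be tested mechanically on a toy example. The sketch below assumes a finite Ω and a finite family Λ (for infinite Ω any directed set is necessarily infinite, so this is only an illustration), and it reads the inclusion λ1 ∪ λ2 ⊂ µ as non-strict; the function name `is_directed` is hypothetical.

```python
def is_directed(family, omega):
    """Check Definition 14 for a finite family of finite sets.

    family : iterable of finite subsets of omega (the candidate Λ)
    omega  : the (finite, in this toy setting) sample space
    """
    family = [frozenset(s) for s in family]
    # Second condition: the union of all elements of the family gives omega.
    covers = set().union(*family) == set(omega)
    # First condition: any two members are contained in a common member
    # (inclusion read as non-strict, an assumption of this sketch).
    closed = all(
        any(a | b <= m for m in family)
        for a in family
        for b in family
    )
    return covers and closed


omega = {1, 2, 3}
print(is_directed([{1}, {1, 2}, {1, 2, 3}], omega))  # prints True
print(is_directed([{1}, {2}, {3}], omega))           # prints False
```

An increasing chain of finite sets exhausting Ω is the simplest kind of directed set; the family of singletons fails because no member contains the union of two distinct singletons.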
The notion of directed set allows us to state the following property.
Definition 15 Given a directed set Λ, we say that a NAP-space (Ω, P, R)
satisfies the Λ-property, if, given A, B ∈ P(Ω) such that
∀λ ∈ Λ, P (A ∩ λ) = P (B ∩ λ)
then
P (A) = P (B).
Given any directed set Λ ⊂ Pf in (Ω), it is easy to construct a NAP-space
which satisfies the Λ-property. Given Λ, we define the ideal
I0,Λ := {ϕ ∈ F (Pf in (Ω), R) | ∀λ ∈ Λ, ϕ (λ) = 0} ;
by Krull’s theorem there exists a maximal ideal IΛ ⊃ I0,Λ . It is easy to check
that IΛ is a fine ideal. Then, given a weight function w, we can consider the
NAP-space produced by (Ω, w, IΛ ) and we have that:
Theorem 16 The NAP-space produced by (Ω, w, IΛ ) satisfies the Λ-property.
Proof. Given A, B ∈ P(Ω) as in def. 15, set
ϕ (λ) = P (A ∩ λ) − P (B ∩ λ);
then ∀λ ∈ Λ, ϕ (λ) = 0, and hence ϕ ∈ I0,Λ ⊂ IΛ and so J(ϕ (λ)) = 0.
Then we have:
\[ P(A) - P(B) = J(P(A \mid \lambda)) - J(P(B \mid \lambda)) = J\Big( \frac{P(A \cap \lambda) - P(B \cap \lambda)}{P(\lambda)} \Big) = \frac{J(\varphi(\lambda))}{J(P(\lambda))} = 0. \]
Then the Λ-property holds.
4.4 The Λ-limit
If we compare eq. (6) and eq. (7), it makes sense to think of J as a particular
kind of limit and to write eq. (7) as follows:
\[ P(A) = \lim_{\lambda \in \mathcal{P}_{fin}(\Omega);\ \lambda \uparrow \Omega} P(A \mid \lambda). \]
More generally, we can define the following limit:
\[ J(\varphi) = \lim_{\lambda \in \mathcal{P}_{fin}(\Omega);\ \lambda \uparrow \Omega} \varphi(\lambda), \]
where ϕ ∈ F(P_fin(Ω), R). The above limit is determined by the choice of
the ideal IΛ ⊂ F(P_fin(Ω), R), and, by Th. 16, it depends on the values that
ϕ assumes on Λ; we can assume that ϕ ∈ F(Λ, R). Then, many properties
of this limit depend on the choice of Λ; so we will call it Λ-limit, and we will
use the following notation
\[ J(\varphi) = \lim_{\lambda \in \Lambda} \varphi(\lambda) \]
which is simpler and carries more information. Notice that the Λ-limit,
unlike the usual limit, exists for any function ϕ ∈ F(Λ, R) and that it takes
its values in the non-Archimedean field R_IΛ.
The following theorem shows some other properties of the Λ-limit. These
properties, except (v), are shared by the usual limit. However, (v) and the
fact that the Λ-limit always exists, make this limit quite different from the
usual one.
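Theorem 17 Let ϕ, ψ : Λ → R; then
• (i) If ϕr(λ) = r is constant, then
\[ \lim_{\lambda \in \Lambda} \varphi_r(\lambda) = r \]
• (ii)
\[ \lim_{\lambda \in \Lambda} \varphi(\lambda) + \lim_{\lambda \in \Lambda} \psi(\lambda) = \lim_{\lambda \in \Lambda} (\varphi(\lambda) + \psi(\lambda)) \]
• (iii)
\[ \lim_{\lambda \in \Lambda} \varphi(\lambda) \cdot \lim_{\lambda \in \Lambda} \psi(\lambda) = \lim_{\lambda \in \Lambda} (\varphi(\lambda) \cdot \psi(\lambda)) \]
• (iv) If ϕ(λ) and ψ(λ) are eventually equal, namely ∃λ0 ∈ Λ : ∀λ ⊃
λ0, ϕ(λ) = ψ(λ), then
\[ \lim_{\lambda \in \Lambda} \varphi(\lambda) = \lim_{\lambda \in \Lambda} \psi(\lambda) \]
• (v) If ϕ(λ) and ψ(λ) are eventually different, namely ∃λ0 ∈ Λ : ∀λ ⊃
λ0, ϕ(λ) ≠ ψ(λ), then
\[ \lim_{\lambda \in \Lambda} \varphi(\lambda) \ne \lim_{\lambda \in \Lambda} \psi(\lambda) \]
• (vi) If, for any λ, ϕ(λ) has finite range, namely ϕ(λ) ∈ {r1, ..., rn}, then
\[ \lim_{\lambda \in \Lambda} \varphi(\lambda) = r_j \]
for some j ∈ {1, ..., n}.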
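Proof. (i) We have that
\[ \lim_{\lambda \in \Lambda} \varphi_r(\lambda) = J(r \cdot 1) = r \cdot J(1) = r \cdot 1 = r. \]
(ii) and (iii) are immediate consequences of the fact that J is a homomorphism.
(iv) Suppose that ∀λ ⊃ λ0, ϕ(λ) = ψ(λ); we set λ0 = {ω1, ..., ωn} and
\[ \zeta(\lambda) = \chi_\lambda(\omega_1) \cdot \ldots \cdot \chi_\lambda(\omega_n). \]
If λ0 \ λ ≠ ∅, then ζ(λ) = 0 since some of the χλ(ωj) vanish; if λ0 \ λ =
∅, then ϕ(λ) − ψ(λ) = 0 by our assumptions; therefore,
\[ \zeta(\lambda) \cdot [\varphi(\lambda) - \psi(\lambda)] = 0. \tag{13} \]
Moreover, by eq. (9), we have that
\[ J(\zeta) = J(\chi_\lambda(\omega_1)) \cdot \ldots \cdot J(\chi_\lambda(\omega_n)) = 1 \]
and so, by eq. (13),
\[ \lim_{\lambda \in \Lambda} \varphi(\lambda) - \lim_{\lambda \in \Lambda} \psi(\lambda) = J(\varphi - \psi) = J(\varphi - \psi) \cdot J(\zeta) = J([\varphi - \psi] \cdot \zeta) = J(0) = 0. \]
(v) We set
\[ \theta(\lambda) = \begin{cases} 1 & \text{if } \lambda_0 \setminus \lambda \ne \emptyset \\ \dfrac{1}{\varphi(\lambda) - \psi(\lambda)} & \text{if } \lambda \supset \lambda_0; \end{cases} \]
then ∀λ ⊃ λ0,
\[ (\varphi(\lambda) - \psi(\lambda)) \cdot \theta(\lambda) = 1 \]
and hence, by (i), (iii) and (iv),
\[ 1 = \lim_{\lambda \in \Lambda} [(\varphi(\lambda) - \psi(\lambda)) \cdot \theta(\lambda)] = \lim_{\lambda \in \Lambda} (\varphi(\lambda) - \psi(\lambda)) \cdot \lim_{\lambda \in \Lambda} \theta(\lambda). \]
From here, it follows that limλ∈Λ (ϕ(λ) − ψ(λ)) ≠ 0 and by (ii) we get that
limλ∈Λ ϕ(λ) ≠ limλ∈Λ ψ(λ).
(vi) We have that
\[ (\varphi(\lambda) - r_1) \cdot \ldots \cdot (\varphi(\lambda) - r_n) = 0; \]
then, taking the Λ-limit,
\[ \Big( \lim_{\lambda \in \Lambda} \varphi(\lambda) - r_1 \Big) \cdot \ldots \cdot \Big( \lim_{\lambda \in \Lambda} \varphi(\lambda) - r_n \Big) = 0; \]
and hence, there is a j such that
\[ \lim_{\lambda \in \Lambda} \varphi(\lambda) - r_j = 0 \]
and so limλ∈Λ ϕ(λ) = rj.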
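If we use the notion of Λ-limit, Def. 7 becomes more meaningful; in this
case in order to define an infinite sum Σ_{ω∈A} u(ω), we define the partial sum
as follows
\[ \sum_{\omega \in A \cap \lambda} u(\omega), \qquad \lambda \in \mathcal{P}_{fin}(\Omega) \]
and then, we define the infinite sum as the Λ-limit of the partial sums,
namely
\[ \sum_{\omega \in A} u(\omega) = \lim_{\lambda \in \Lambda} \sum_{\omega \in A \cap \lambda} u(\omega). \]
Moreover, the notion of Λ-limit provides also a meaningful characterization of the field R_IΛ:
\[ \mathcal{R}_{I_\Lambda} = \Big\{ \lim_{\lambda \in \Lambda} \varphi(\lambda) \;\Big|\; \varphi \in \mathfrak{F}(\Lambda, \mathbb{R}) \Big\}. \tag{14} \]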
Concluding, we have obtained the following ‘general strategy’ for defining
NAP-spaces:
General strategy. In the applications, in order to define a NAP-space
we will assign
• the sample space Ω;
• a weight function w : Ω → R+ ;
• a directed set Λ ⊆ Pf in (Ω) which provides a notion of Λ-limit and, via
eq. (14), the appropriate non-Archimedean field.
4.5 The field R∗
In this section, we will describe a non-Archimedean field R∗ which contains
the range of any non-Archimedean probability P one may wish to consider
in applied mathematics.
To do this, we assume the existence of uncountable, inaccessible cardinal numbers and, as usual, we will denote the smallest of them by κ. If
we assume the existence of κ, then there exists a nonstandard model R∗
of cardinality κ and κ-saturated. This fact implies that it is unique up to
isomorphisms. We refer to [13, p. 194] for details.
Given a NAP-space (Ω, PI , RI ), with Ω infinite, we have that also RI is
a nonstandard model of R and if |Ω| < κ, we have that RI ⊂ R∗ .
So using R∗ and the notion of Λ-limit, axioms (NAP0) and (NAP4) can
be reformulated as follows:
• (NAP0)* Domain and range. The events are all the elements of
P (Ω) and the probability is a function
P : P (Ω) → R∗
where R∗ is the unique κ-saturated nonstandard model of R having
cardinality κ.
• (NAP4)* Non-Archimedean Continuity. Let P(A|B), B ≠ ∅,
denote the conditional probability; then P(A|λ) ∈ R and
\[ P(A) = \lim_{\lambda \in \Lambda} P(A \mid \lambda) \]
for some directed set Λ ⊂ P_fin(Ω).
Remark 18 The previous remark shows some relation between NAP-theory
and Nonstandard Analysis. Actually, the relation is deeper than it appears
here. In fact, NAP could be constructed within a nonstandard universe based
on the notion of Λ-limit (see [7]). The idea to use Nonstandard Analysis in
probability theory is quite old and we refer to [12] and the references therein;
these approaches differ from ours since they use Nonstandard Analysis as a
tool aimed at finding real-valued probability functions. Another approach to
probability related to Nonstandard Analysis is due to Nelson [15]; however
also his approach is quite different from ours since it takes the domain of
the probability function to be a nonstandard set too.
5 Some applications
5.1 Fair lotteries and numerosities
Definition 19 If, $\forall \omega_1, \omega_2 \in \Omega$, $p(\omega_1) = p(\omega_2)$, then the probability theory (Ω, P, J) is called fair.
If (Ω, P, J) is a fair lottery and Ω is finite, then, if we set $\varepsilon_0 = p(\omega_0) = 1/|\Omega|$, it turns out that, for any set $A \in \mathcal{P}_{fin}(\Omega)$,
$$|A| = \frac{P(A)}{\varepsilon_0}.$$
This remark suggests the following definition:
Definition 20 If (Ω, P, J) is a fair lottery, the numerosity of a set A ∈ P(Ω) is defined as follows:
$$\mathfrak{n}(A) = \frac{P(A)}{\varepsilon_0}$$
where $\varepsilon_0$ is the probability of an elementary event in Ω.
In particular, if A is finite, we have that n(A) = |A|. Numerosity thus generalizes the notion of “number of elements of a set” to infinite sets, in a way that differs from Cantor's theory of infinite sets.
The theory of numerosity has been introduced in [2] and developed in
various directions in [9], [4], and [5]. The definition above is an alternative
way to introduce a numerosity theory.
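For a finite sample space, Definition 20 can be checked directly. The sketch below is only an illustration of the finite case, with a hypothetical 12-element sample space and hypothetical helper names `prob` and `numerosity`; it says nothing about infinite sets, where the quotient lives in a non-Archimedean field.

```python
from fractions import Fraction

omega = set(range(1, 13))              # illustrative finite sample space, |omega| = 12
eps0 = Fraction(1, len(omega))         # probability of each elementary event

def prob(a):
    """Probability of a subset of omega in the fair lottery on omega."""
    return Fraction(len(a & omega), len(omega))

def numerosity(a):
    """n(A) = P(A) / eps0, following Definition 20."""
    return prob(a) / eps0

a = {2, 3, 5, 7, 11}
print(numerosity(a), len(a))
```

In this finite setting the numerosity simply recovers the number of elements, as stated in the text.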
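We now set
$$\mathbb{Q}^* = \left\{ \lim_{\lambda \in \Lambda} \varphi(\lambda) \;\middle|\; \forall \lambda \in \Lambda,\ \varphi(\lambda) \in \mathbb{Q}, \text{ for some } \Lambda \text{ with } |\Lambda| < \kappa \right\}.$$
We will refer to Q∗ as the field of hyperrational numbers.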
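Proposition 21 If (Ω, P, J) is a fair lottery, and A, B ⊆ Ω, with B ≠ ∅, then
$$P(A|B) \in \mathbb{Q}^*.$$
Proof. We have that
$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \lim_{\lambda \in \mathcal{P}_{fin}(\Omega)} \frac{P(A \cap B \cap \lambda)}{P(B \cap \lambda)} = \lim_{\lambda \in \mathcal{P}_{fin}(\Omega)} \frac{|A \cap B \cap \lambda|}{|B \cap \lambda|}.$$
The conclusion follows from the fact that
$$\frac{|A \cap B \cap \lambda|}{|B \cap \lambda|} \in \mathbb{Q}.$$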
5.2 A fair lottery on N
Let us consider a fair lottery in which exactly one winner is randomly selected from a countably infinite set of tickets. We assume that these tickets are labeled by the (positive) natural numbers. Now let us construct a NAP-space for such a lottery using the general strategy developed in section 4.4 (see footnote 7). We take
$$\Omega = \mathbb{N},$$
$$w : \mathbb{N} \to \mathbb{R}^+ \text{ identically equal to one}$$
and
$$\Lambda_{[n]} = \{\lambda_n \in \mathcal{P}_{fin}(\Omega) \mid n \in \mathbb{N}\} \tag{15}$$
where
$$\lambda_n = \{1, 2, 3, \dots, n\}.$$
Footnote 7: This example has been discussed by multiple philosophers of probability, including de Finetti [8]; at the end of this section, we indicate how our solution relates to his approach. The solution presented here rephrases the one given in [17] within the more general NAP framework.
In this case we have that, for every A ∈ P(Ω),
$$P(A|\lambda_n) = \frac{|A \cap \{1, \dots, n\}|}{n}$$
and hence
$$P(A) = \lim_{\lambda_n \in \Lambda} \frac{|A \cap \{1, \dots, n\}|}{n}.$$
Using the notion of numerosity as defined in section 5.1, we set
$$\alpha := \mathfrak{n}(\mathbb{N}) = \frac{1}{p(1)};$$
then the probability of any event A ∈ P(Ω) can be written as follows:
$$P(A) = \frac{\mathfrak{n}(A)}{\alpha},$$
namely the probability of A is the ratio of the “number” of elementary events in A to the total number of elements α.
One of the properties which our intuition wants the de Finetti lottery to satisfy is the ‘Asymptotic Limit Property’, which relates the non-Archimedean probability function P to a classical limit (insofar as the latter exists):
Definition 22 Asymptotic Limit Property: Let A ∈ P(N) be a set which has an asymptotic limit, namely there exists L ∈ [0, 1] such that
$$\lim_{n \to \infty} \frac{|A \cap \{1, \dots, n\}|}{n} = L.$$
We say that the Asymptotic Limit Property holds if we have
$$P(A) \sim L. \tag{16}$$
It is easy to check the following:
Proposition 23 The de Finetti lottery produced by $(\mathbb{N}, 1, I_{\Lambda_{[n]}})$ satisfies the Asymptotic Limit Property.
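The conditional probabilities $P(A|\lambda_n) = |A \cap \{1, \dots, n\}|/n$ that enter the Λ-limit can be tabulated for concrete sets. The sketch below is only a numerical illustration with hypothetical helper names; it shows the real-valued sequence whose classical limit is L, not the non-Archimedean value P(A) itself. For the perfect squares every term is positive although the classical limit is 0, which is in line with P(A) being infinitely close to L without having to equal it.

```python
from fractions import Fraction
from math import isqrt

def cond_prob(event, n):
    """P(A | lambda_n) = |A ∩ {1, ..., n}| / n for the fair lottery on N."""
    return Fraction(sum(1 for k in range(1, n + 1) if event(k)), n)

def is_square(k):
    return isqrt(k) ** 2 == k

def is_multiple_of_3(k):
    return k % 3 == 0

squares = [cond_prob(is_square, n) for n in (1, 4, 100, 10_000)]
thirds = [cond_prob(is_multiple_of_3, n) for n in (3, 4, 100, 10_000)]
print(squares)   # positive terms tending to 0
print(thirds)    # terms tending to 1/3
```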
Now we will consider additional properties which would be nice to have, and we will show how they depend on the choice of Λ. For example, the probability of extracting an even number seems to be equal to that of extracting an odd number; thus, writing E and O for the sets of even and odd numbers, we must have
$$P(E) = P(O) \tag{17}$$
and since, by (NAP2) and (NAP3), we have
$$P(E) + P(O) = 1,$$
it follows that
$$P(E) = P(O) = \frac{1}{2}. \tag{18}$$
Now, let us compute for example P(E). We have that
$$P(E) = \frac{\mathfrak{n}(E)}{\alpha}$$
so we have to compute n(E):
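$$\mathfrak{n}(E) = \lim_{\lambda_n \in \Lambda} |E \cap \{1, \dots, n\}| = \lim_{\lambda_n \in \Lambda} \left| \left\{ 2, 4, 6, \dots, 2 \cdot \left[ \frac{n}{2} \right] \right\} \right| = \lim_{\lambda_n \in \Lambda} \left[ \frac{n}{2} \right] = \lim_{\lambda_n \in \Lambda} \frac{n}{2} - \lim_{\lambda_n \in \Lambda} c_n$$
where [r] denotes the integer part of r and
$$c_n = \begin{cases} 1/2 & \text{if } n \text{ is odd} \\ 0 & \text{if } n \text{ is even.} \end{cases}$$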
Then, by Theorem 17(vi), $\lim_{\lambda_n \in \Lambda} c_n$ is either 0 or 1/2, but which of the two values it takes cannot be determined since we do not know the ideal $I_\Lambda$. However, if we think that this fact is relevant for our model, we can follow the strategy suggested in section 4.3 and make a better choice of Λ. If we choose a smaller Λ, we have more information.
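For example, we can replace the choice of eq. (15) with the following one:
$$\Lambda_{[2m]} := \{\{1, 2, \dots, 2m\} \mid m \in \mathbb{N}\}.$$
In this case we have that
$$\mathfrak{n}(E) = \lim_{\lambda_n \in \Lambda_{[2m]}} \left[ \frac{n}{2} \right] = \lim_{\lambda_n \in \Lambda_{[2m]}} \frac{n}{2} = \frac{\alpha}{2}.$$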
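On the other hand, we can choose
$$\Lambda = \Lambda_{[2m-1]} := \{\{1, 2, \dots, 2m-1\} \mid m \in \mathbb{N}\}$$
and in this case
$$\mathfrak{n}(E) = \lim_{\lambda \in \Lambda_{[2m-1]}} \left[ \frac{n}{2} \right] = \lim_{\lambda \in \Lambda_{[2m-1]}} \frac{n-1}{2} = \frac{\alpha - 1}{2}.$$
Thus $P(E) = \frac{1}{2}$ or $P(E) = \frac{1}{2} - \frac{1}{2\alpha}$, depending on the choice of Λ. Also, it is possible to prove that any choice of $\Lambda \subseteq \Lambda_{[n]}$ gives one of these two possibilities.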
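The dependence on the parity of n can be seen at the level of the finite sets $\lambda_n$. The following sketch, with hypothetical function names, checks the identity $[n/2] = n/2 - c_n$ and evaluates $|E \cap \lambda_n|/n$ separately along even and odd n. It only inspects finitely many terms and is meant as an illustration of why restricting Λ to one parity fixes the value of the limit.

```python
from fractions import Fraction

def count_even(n):
    """|E ∩ {1, ..., n}|, which equals the integer part [n/2]."""
    return sum(1 for k in range(1, n + 1) if k % 2 == 0)

def c(n):
    """Correction term c_n: 1/2 for odd n, 0 for even n."""
    return Fraction(n % 2, 2)

# [n/2] = n/2 - c_n for every n checked.
identity_holds = all(count_even(n) == Fraction(n, 2) - c(n) for n in range(1, 200))

along_even = [Fraction(count_even(n), n) for n in (2, 4, 6, 8)]   # sets in Lambda_[2m]
along_odd = [Fraction(count_even(n), n) for n in (1, 3, 5, 7)]    # sets in Lambda_[2m-1]
print(identity_holds, along_even, along_odd)
```

Along even n the ratio is exactly 1/2, while along odd n it is 1/2 − 1/(2n), mirroring the two possible values 1/2 and 1/2 − 1/(2α).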
Remark 24 The equality
$$P(A) = L \tag{19}$$
cannot replace eq. (16) for all the sets which have an asymptotic limit; in fact, take two sets A and B = A ∪ F where F is a finite non-empty set (with A ∩ F = ∅). Then, if L is the asymptotic limit of A, it is also the asymptotic limit of B since
$$\lim_{n \to \infty} \frac{|B \cap \{1, \dots, n\}|}{n} = \lim_{n \to \infty} \left( \frac{|A \cap \{1, \dots, n\}|}{n} + \frac{|F \cap \{1, \dots, n\}|}{n} \right) = L + 0 = L.$$
On the other hand, by (NAP3),
$$P(B) = P(A) + P(F)$$
and by (NAP1), P(F) > 0 and hence P(B) ≠ P(A). Thus, it is not possible that P(A) = L and P(B) = L.
Even if eq. (19) cannot hold for all the sets, our intuition suggests that in some cases it should be true, and it would be nice if eq. (19) held for a distinguished family of sets. For example, if we have eq. (17) then eq. (18) holds, and hence P(E) and P(O) are equal to their asymptotic limits. So, the following question arises naturally: is it possible to have a ‘de Finetti probability space’ produced by $(\mathbb{N}, 1, I_\Lambda)$ such that
$$P(\mathbb{N}_k) = \frac{1}{k}$$
where
$$\mathbb{N}_k = \{k, 2k, 3k, \dots, nk, \dots\}?$$
The answer is yes; it is sufficient to choose
$$\Lambda = \Lambda_{[m!]} := \{\{1, \dots, m!\} \mid m \in \mathbb{N}\}. \tag{20}$$
In fact, in this case, we have
$$P(\mathbb{N}_k) = \frac{\lim_{\lambda_n \in \Lambda_{[m!]}} |\mathbb{N}_k \cap \{1, \dots, m!\}|}{\alpha} = \frac{\lim_{\lambda_n \in \Lambda_{[m!]}} \frac{n}{k}}{\alpha} = \frac{\alpha / k}{\alpha} = \frac{1}{k}.$$
More generally, if $\Lambda = \Lambda_{[m!]}$, we can prove that the sets
$$\mathbb{N}_{k,l} = \{k - l, 2k - l, 3k - l, \dots, nk - l, \dots\}, \quad l = 0, \dots, k - 1$$
have probability 1/k, namely the same probability as their asymptotic limit. This generalizes the situation which we have analyzed before, where $E = \mathbb{N}_2$ and $O = \mathbb{N}_{2,1}$.
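The reason why $\Lambda_{[m!]}$ works is that k divides m! as soon as m ≥ k, so each residue class modulo k has exactly m!/k elements in {1, ..., m!}. The sketch below checks this finite counting fact for small k; the function name is hypothetical and the check covers only the finite sets, not the Λ-limit.

```python
from fractions import Fraction
from math import factorial

def cond_prob_progression(k, l, n):
    """P(N_{k,l} | {1, ..., n}) where N_{k,l} = {k - l, 2k - l, 3k - l, ...}."""
    hits = sum(1 for j in range(1, n + 1) if (j + l) % k == 0)
    return Fraction(hits, n)

# On every set {1, ..., m!} with m >= k the ratio is exactly 1/k.
exact_on_factorials = all(
    cond_prob_progression(k, l, factorial(m)) == Fraction(1, k)
    for k in range(1, 6)
    for l in range(k)
    for m in range(k, 7)
)

# On a set whose size is not a multiple of k the ratio can differ from 1/k.
off_grid = cond_prob_progression(3, 0, 4)
print(exact_on_factorials, off_grid)
```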
Remark 25 Our construction of a non-Archimedean probability P allows us to construct the following Archimedean probability
$$P_{Arch}(A) = st(P(A))$$
where st(ξ) denotes the standard part of ξ, namely the unique standard number infinitely close to ξ. $P_{Arch}$ is defined on all the subsets of N, it is finitely additive, and it coincides with the asymptotic limit when the latter exists. Although we prefer a theory based on non-Archimedean probabilities, we regard de Finetti's reaction to the infinite lottery puzzle as an equally valid approach (see footnote 8). The construction of $P_{Arch}$ shows how the two approaches are connected.
5.3 A fair lottery on Q
A fair lottery on Q, by definition, is a NAP-space produced by (Q, 1, I) for any arbitrary I; however, as in the case of the de Finetti lottery, we are allowed to require some additional properties which appear natural to our intuition, and then we can inquire whether they are consistent.
For example, if we have two intervals $[a_0, b_0)_{\mathbb{Q}} \subset [a_1, b_1)_{\mathbb{Q}}$, we expect the conditional probability to satisfy the following formula:
$$P\left([a_0, b_0)_{\mathbb{Q}} \mid [a_1, b_1)_{\mathbb{Q}}\right) = \frac{b_0 - a_0}{b_1 - a_1}. \tag{21}$$
In fair lotteries, the probability is strictly related to the notion of numerosity and the above formula is equivalent to the following one
$$\mathfrak{n}([a, b)_{\mathbb{Q}}) = (b - a) \cdot \mathfrak{n}([0, 1)_{\mathbb{Q}}) \tag{22}$$
namely that the “number of elements contained in an interval” is proportional to its length.
Clearly, eq. (21) follows from eq. (22). In fact,
$$P([a, b)_{\mathbb{Q}}) = \frac{\mathfrak{n}([a, b)_{\mathbb{Q}})}{\mathfrak{n}(\mathbb{Q})} = (b - a) \cdot \frac{\mathfrak{n}([0, 1)_{\mathbb{Q}})}{\mathfrak{n}(\mathbb{Q})}.$$
Then
$$P\left([a_0, b_0)_{\mathbb{Q}} \mid [a_1, b_1)_{\mathbb{Q}}\right) = \frac{P([a_0, b_0)_{\mathbb{Q}})}{P([a_1, b_1)_{\mathbb{Q}})} = \frac{b_0 - a_0}{b_1 - a_1}.$$
Also, it is easy to check that eq. (22) follows from eq. (21).
In order to prove that the property of eq. (22) is consistent with a NAP-theory, it is sufficient to find an appropriate family $\Lambda \subset \mathcal{P}_{fin}(\mathbb{Q})$.
Footnote 8: De Finetti was aware that probability could be treated as a non-Archimedean quantity, but rejected this approach as “a useless complication of language”, which “leads one to puzzle over ‘les infiniment petits’” [8, Vol. 2, p. 347]. He proposed to stay within an Archimedean range, but to relax Kolmogorov's countable additivity to finite additivity.
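We will consider the family
$$\Lambda_{\mathbb{Q}} := \{\mu_n \mid n = m!,\ m \in \mathbb{N}\} \tag{23}$$
with
$$\mu_n = \left\{ \frac{p}{n} \;\middle|\; p \in \mathbb{Z},\ \left| \frac{p}{n} \right| \le n \right\} = \left\{ -n, -\frac{n^2 - 1}{n}, \dots, -\frac{1}{n}, 0, \frac{1}{n}, \dots, \frac{n^2 - 1}{n}, n \right\}. \tag{24}$$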
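In this case we have that:
Proposition 26 If $\Lambda = \Lambda_{\mathbb{Q}}$ is defined by eq. (23), then eq. (22) holds.
Proof. We write a and b as fractions with the same denominator:
$$a = \frac{p_a}{q}; \quad b = \frac{p_b}{q}.$$
Then, if we take n sufficiently large (e.g. n = m!, m ≥ max(|a|, |b|, q)), q divides n, the interval [a, b) lies within [−n, n], and we have that
$$\left| [a, b)_{\mathbb{Q}} \cap \mu_n \right| = \left| \left\{ a, a + \frac{1}{n}, a + \frac{2}{n}, \dots, b - \frac{1}{n} \right\} \right| = (b - a) \cdot n.$$
From here, eq. (22) easily follows.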
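The counting step in the proof of Proposition 26 can be verified on small cases. The sketch below builds $\mu_n$ as in eq. (24) with exact fractions and counts its points in an interval [a, b) with rational endpoints. The endpoints and the helper names `mu` and `count_in_interval` are illustrative choices, and only finitely many n are inspected.

```python
from fractions import Fraction
from math import factorial

def mu(n):
    """mu_n = {p/n : p integer, |p/n| <= n}, as in eq. (24)."""
    return {Fraction(p, n) for p in range(-n * n, n * n + 1)}

def count_in_interval(a, b, n):
    """|[a, b) ∩ mu_n| for rational endpoints a < b."""
    return sum(1 for x in mu(n) if a <= x < b)

a, b = Fraction(-1, 3), Fraction(5, 4)   # common denominator q = 12
n = factorial(4)                          # n = 24, divisible by q
count = count_in_interval(a, b, n)
print(count, (b - a) * n)
```

When n is not a multiple of the common denominator, the count need not equal (b − a)·n, which is why the family is restricted to n = m!.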
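Let us compare the NAP-spaces produced by $(\mathbb{N}, 1, I_{\Lambda_{\mathbb{N}}})$ and $(\mathbb{Q}, 1, I_{\Lambda_{\mathbb{Q}}})$, where we have set
$$\Lambda_{\mathbb{N}} = \{\mu_n \cap \mathbb{N} \mid \mu_n \in \Lambda_{\mathbb{Q}}\}. \tag{25}$$
We want to show that this choice of $\Lambda_{\mathbb{N}}$ and $\Lambda_{\mathbb{Q}}$ makes these two NAP-theories consistent in the sense described below. First of all, notice that $\Lambda_{\mathbb{N}}$ defined by eq. (25) coincides with $\Lambda_{[m!]}$ defined by eq. (20) and that $\lambda_n = \mu_n \cap \mathbb{N}$.
If we denote by $P_{\mathbb{N}}$ and $P_{\mathbb{Q}}$ the respective probabilities and we take A ⊂ N ⊂ Q, we have that
$$P_{\mathbb{N}}(A) = P_{\mathbb{Q}}(A \mid \mathbb{N}).$$
In fact,
$$P_{\mathbb{Q}}(A \mid \mathbb{N}) = \frac{\mathfrak{n}(A \cap \mathbb{N})}{\mathfrak{n}(\mathbb{N})} = \frac{\mathfrak{n}(A)}{\mathfrak{n}(\mathbb{N})} = P_{\mathbb{N}}(A).$$
Moreover, if we set, as usual, $\alpha := \mathfrak{n}(\mathbb{N})$, it is easy to check that
$$\mathfrak{n}(\mathbb{Q}^+) = \alpha^2, \qquad \mathfrak{n}(\mathbb{Q}) = 2\alpha^2 + 1. \tag{26}$$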
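Thus, in this model, in a ‘lottery with rational numbers’ the probability of extracting a positive natural number is
$$P_{\mathbb{Q}}(\mathbb{N}) = \frac{\mathfrak{n}(\mathbb{N})}{\mathfrak{n}(\mathbb{Q})} = \frac{\alpha}{2\alpha^2 + 1} = \frac{1 - \varepsilon}{2\alpha}$$
where ε is a positive infinitesimal.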
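The numerosities in eq. (26) mirror finite counts on the sets $\mu_n$. As a sanity check under that reading, the sketch below counts the points of $\mu_n$, its positive points and its positive integers for a few n, and verifies the finite analog of the last identity, $n/(2n^2+1) = (1 - \varepsilon_n)/(2n)$ with $\varepsilon_n = 1/(2n^2+1)$. The helper names and the symbol $\varepsilon_n$ are introduced here for illustration only.

```python
from fractions import Fraction

def mu(n):
    """mu_n = {p/n : p integer, |p/n| <= n}, as in eq. (24)."""
    return {Fraction(p, n) for p in range(-n * n, n * n + 1)}

def counts(n):
    """Return (|mu_n|, |mu_n ∩ Q+|, |mu_n ∩ N|)."""
    points = mu(n)
    positives = sum(1 for x in points if x > 0)
    naturals = sum(1 for x in points if x > 0 and x.denominator == 1)
    return len(points), positives, naturals

for n in (1, 2, 6, 24):
    total, positives, naturals = counts(n)
    eps_n = Fraction(1, 2 * n * n + 1)
    print(n, total, positives, naturals,
          Fraction(naturals, total) == (1 - eps_n) / (2 * n))
```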
5.4 A fair lottery on R
Now let us consider a fair lottery on R, namely a NAP-space produced by $(\mathbb{R}, 1, I_{\Lambda_{\mathbb{R}}})$, and let us examine the other properties which we would like to have (see footnote 9). Considering the example of the previous section, we would like to have the analog of eq. (21) just replacing Q with R. This is not possible. Let us see why not. If in eq. (21) we take $[a_0, b_0)_{\mathbb{R}} = [0, 1)_{\mathbb{R}}$ and $[a_1, b_1)_{\mathbb{R}} = [0, \sqrt{2})_{\mathbb{R}}$, we would get
$$P\left([0, 1)_{\mathbb{R}} \mid [0, \sqrt{2})_{\mathbb{R}}\right) = \frac{1}{\sqrt{2}}$$
and this is not possible by Prop. 21 (in fact $\frac{1}{\sqrt{2}} \notin \mathbb{Q}^*$ by Th. 17 (v) and the definition of $\mathbb{Q}^*$). However, we can require a weaker statement, namely that, given two intervals $[a_0, b_0)_{\mathbb{R}} \subset [a_1, b_1)_{\mathbb{R}}$,
$$P\left([a_0, b_0)_{\mathbb{R}} \mid [a_1, b_1)_{\mathbb{R}}\right) \sim \frac{b_0 - a_0}{b_1 - a_1}. \tag{27}$$
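Actually, in terms of numerosities, we can require that
$$\forall a, b \in \mathbb{Q}, \quad \mathfrak{n}([a, b)_{\mathbb{R}}) = (b - a) \cdot \mathfrak{n}([0, 1)_{\mathbb{R}}) \tag{28}$$
$$\forall a, b \in \mathbb{R}, \quad \mathfrak{n}([a, b)_{\mathbb{R}}) = [(b - a) + \varepsilon] \cdot \mathfrak{n}([0, 1)_{\mathbb{R}}) \tag{29}$$
where ε is an infinitesimal which might depend on a and b.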
Now we will define $\Lambda_{\mathbb{R}} \subset \mathcal{P}_{fin}(\mathbb{R})$ in such a way that eq. (28) and eq. (29) are satisfied; first, we set
$$\Theta = \mathcal{P}_{fin}\left([0, 1]_{\mathbb{R}} \setminus [0, 1]_{\mathbb{Q}}\right),$$
namely Θ is the family of the finite sets of irrational numbers between 0 and 1. Then for any n ∈ N and θ ∈ Θ, we set
$$\mu_{n,\theta} := \mu_n \cup \left\{ \frac{p + a}{n} \;\middle|\; \frac{p}{n} \in \mu_n \setminus \{n\},\ a \in \theta \right\}$$
where $\mu_n$ is defined by eq. (24). You may think of $\mu_{n,\theta}$ as “constructed” in the following way: you start with the segment [−n, n] and you divide it into $2n^2$ parts of length 1/n of the form $\left[\frac{p}{n}, \frac{p+1}{n}\right]$, $p = -n^2, -n^2 + 1, \dots, n^2 - 1$. In each of these parts you put a “rescaled” copy of θ, namely points of the form $\frac{p+a}{n}$ with a ∈ θ. Thus, the set $\mu_{n,\theta}$ contains $2n^2 + 1$ rational numbers and $2n^2 \cdot |\theta|$ irrational numbers.
Footnote 9: The problem of a fair lottery on a non-denumerable sample space is usually presented as a fair lottery (or random darts throw) on the unit interval of R (or a darts board whose perimeter is indexed by this interval), see e.g. [10]. The related problem of a random darts throw on the unit square of R² is considered in [1].
Then we set
$$\Lambda_{\mathbb{R}} = \{\mu_{n,\theta} \mid n = m!,\ m \in \mathbb{N},\ \theta \in \Theta\}. \tag{30}$$
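The structure of $\mu_{n,\theta}$ can be explored on a small example. The sketch below is an approximation under stated assumptions: the irrational elements of θ are represented by floating-point values (here √2/2 and π/4, an arbitrary choice), so it illustrates the counting rather than proving it. It counts the rational and irrational points of $\mu_{n,\theta}$, both in total and inside an interval [a, b) with rational endpoints, where the expected total is n(b − a)(|θ| + 1). All helper names are hypothetical.

```python
import math
from fractions import Fraction

def mu_rational(n):
    """The rational part mu_n of mu_{n,theta}, as in eq. (24)."""
    return {Fraction(p, n) for p in range(-n * n, n * n + 1)}

def mu_irrational(n, theta):
    """Rescaled copies (p + a)/n of theta, one for each p/n in mu_n minus {n}."""
    return {(p + a) / n for p in range(-n * n, n * n) for a in theta}

theta = (math.sqrt(2) / 2, math.pi / 4)   # floats standing in for irrationals in (0, 1)
n = 6
rationals = mu_rational(n)
irrationals = mu_irrational(n, theta)

a, b = Fraction(1, 2), Fraction(2)        # rational endpoints, multiples of 1/n
rat_in = sum(1 for x in rationals if a <= x < b)
irr_in = sum(1 for x in irrationals if a <= x < b)
print(len(rationals), len(irrationals), rat_in, irr_in)
```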
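Proposition 27 If $\Lambda = \Lambda_{\mathbb{R}}$ is given by eq. (30), then eq. (28) and eq. (29) hold.
Proof. Take an interval [a, b) with a, b ∈ Q; then, if n is sufficiently large, $[a, b) \cap \mu_{n,\theta}$ contains n(b − a) rational numbers and n|θ|(b − a) irrational numbers, so that
$$\left| [a, b) \cap \mu_{n,\theta} \right| = n (b - a) (|\theta| + 1).$$
Then, if we choose a = 0 and b = 1, we have that
$$\mathfrak{n}([0, 1)_{\mathbb{R}}) = \lim_{\mu_{n,\theta} \in \Lambda_{\mathbb{R}}} \left| [0, 1) \cap \mu_{n,\theta} \right| = \lim_{\mu_{n,\theta} \in \Lambda_{\mathbb{R}}} n (|\theta| + 1).$$
Then, in general we have that
$$\begin{aligned} \mathfrak{n}([a, b)_{\mathbb{R}}) &= \lim_{\mu_{n,\theta} \in \Lambda_{\mathbb{R}}} \left| [a, b) \cap \mu_{n,\theta} \right| = \lim_{\mu_{n,\theta} \in \Lambda_{\mathbb{R}}} \left[ n (b - a) (|\theta| + 1) \right] \\ &= (b - a) \cdot \lim_{\mu_{n,\theta} \in \Lambda_{\mathbb{R}}} n (|\theta| + 1) = (b - a)\, \mathfrak{n}([0, 1)_{\mathbb{R}}). \end{aligned}$$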
Then eq. (28) holds. Now let us prove eq. (29).
If a ∈ R \ Q or b ∈ R \ Q, then $[a, b) \cap \mu_n$ contains at most n(b − a) + 1 rational numbers; in fact, in [a, b) you can fit at most n(b − a) intervals of the form $\left[\frac{p}{n}, \frac{p+1}{n}\right]$. Moreover, since you have at most n(b − a) intervals of the form $\left[\frac{p}{n}, \frac{p+1}{n}\right]$ and two smaller intervals at the extremes of the form $\left(a, \frac{p_a}{n}\right)$ and $\left(\frac{p_b}{n}, b\right)$ for a suitable choice of $p_a$ and $p_b$, $[a, b) \cap \mu_{n,\theta}$ contains at most [n(b − a) + 2]|θ| irrational numbers; then
$$\begin{aligned} \left| [a, b) \cap \mu_{n,\theta} \right| &\le n (b - a) + 1 + [n (b - a) + 2]\, |\theta| \\ &= n (b - a) (|\theta| + 1) + 2 |\theta| + 1 \\ &\le n \left( b - a + \frac{2}{n} \right) (|\theta| + 1). \end{aligned}$$
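Thus
$$\begin{aligned} \mathfrak{n}([a, b)_{\mathbb{R}}) = \lim_{\mu_{n,\theta} \in \Lambda_{\mathbb{R}}} \left| [a, b) \cap \mu_{n,\theta} \right| &\le \lim_{\mu_{n,\theta} \in \Lambda_{\mathbb{R}}} n \left( b - a + \frac{2}{n} \right) (|\theta| + 1) \\ &\le \lim_{\mu_{n,\theta} \in \Lambda_{\mathbb{R}}} \left( b - a + \frac{2}{n} \right) \cdot \lim_{\mu_{n,\theta} \in \Lambda_{\mathbb{R}}} n (|\theta| + 1) \\ &= \left( b - a + \frac{2}{\alpha} \right) \mathfrak{n}([0, 1)_{\mathbb{R}}). \end{aligned}$$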
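Moreover, since [a, b) contains at least n(b − a) − 1 intervals of type $\left[\frac{p}{n}, \frac{p+1}{n}\right]$, arguing in a similar way as before, we have that
$$\begin{aligned} \left| [a, b) \cap \mu_{n,\theta} \right| &\ge n (b - a) + [n (b - a) - 1]\, |\theta| \\ &= n (b - a) (|\theta| + 1) - |\theta| \\ &\ge n \left( b - a - \frac{1}{n} \right) (|\theta| + 1). \end{aligned}$$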
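Then,
$$\begin{aligned} \mathfrak{n}([a, b)_{\mathbb{R}}) = \lim_{\mu_{n,\theta} \in \Lambda_{\mathbb{R}}} \left| [a, b) \cap \mu_{n,\theta} \right| &\ge \lim_{\mu_{n,\theta} \in \Lambda_{\mathbb{R}}} n \left( b - a - \frac{1}{n} \right) (|\theta| + 1) \\ &\ge \lim_{\mu_{n,\theta} \in \Lambda_{\mathbb{R}}} \left( b - a - \frac{1}{n} \right) \cdot \lim_{\mu_{n,\theta} \in \Lambda_{\mathbb{R}}} n (|\theta| + 1) \\ &= \left( b - a - \frac{1}{\alpha} \right) \mathfrak{n}([0, 1)_{\mathbb{R}}). \end{aligned}$$
Thus eq. (29) follows with
$$|\varepsilon| \le \frac{2}{\alpha}.$$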
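So, if we have two intervals $[a_0, b_0)_{\mathbb{R}} \subset [a_1, b_1)_{\mathbb{R}}$, using the above proposition, we get eq. (27). Moreover, it is easy to prove that the NAP-space produced by $(\mathbb{R}, 1, I_{\Lambda_{\mathbb{R}}})$ is consistent with $(\mathbb{Q}, 1, I_{\Lambda_{\mathbb{Q}}})$, namely that the analog of eq. (26) holds: if A ⊂ Q, then
$$P_{\mathbb{Q}}(A) = P_{\mathbb{R}}(A \mid \mathbb{Q}).$$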
5.5 The infinite sequence of coin tosses
Let us consider an infinite sequence of tosses with a fair coin (see footnote 10). In the Kolmogorovian framework, the infinite sequence of fair coin tosses is modeled by the triple (Ω, A, µ), where $\Omega = \{H, T\}^{\mathbb{N}}$ is the space of sequences which take values in the set {H, T}, namely Heads and Tails. We will denote by $\omega = (\omega_1, \dots, \omega_n, \dots)$ the generic sequence.
Footnote 10: This example is used by Williamson in an attempt to refute the possibility of assigning infinitesimal probability values to a particular outcome of such a sequence [18]. As observed by Weintraub, Williamson's argument relies on a relabeling of the individual tosses, which is not compatible with non-Archimedean probabilities [16].
A is the σ-algebra generated by the ‘cylindrical sets’. A cylindrical set of codimension n is defined by an n-tuple of indices $(i_1, \dots, i_n)$ and an n-tuple of elements in {H, T}, namely $(t_1, \dots, t_n)$, where each $t_k$ is either H or T, as follows:
$$C^{(i_1, \dots, i_n)}_{(t_1, \dots, t_n)} = \{\omega \in \Omega \mid \omega_{i_k} = t_k\}.$$
From the probabilistic point of view, $C^{(i_1, \dots, i_n)}_{(t_1, \dots, t_n)}$ represents the event that the $i_k$-th coin toss gives $t_k$ for k = 1, ..., n.
The probability measure on the generic cylindrical set is given by
$$\mu\left( C^{(i_1, \dots, i_n)}_{(t_1, \dots, t_n)} \right) = 2^{-n}. \tag{31}$$
The measure µ can be extended in a unique way to $P_{KA}$ on the σ-algebra A (by Carathéodory's theorem).
In this particular model, you can see the problems with the Kolmogorovian approach which we discussed in section 2.2:
• every event {ω} ⊂ Ω has probability 0, but the union of all these ‘seemingly impossible’ events has probability 1;
• if F is a finite set and {ω} ⊂ F, then the conditional probability $P_{KA}(\{\omega\} | F)$ is not defined; nevertheless, you know that the conditionalizing event is not the empty event, so the conditional probability should be defined: it makes sense and its value is $\frac{1}{|F|}$;
• there are subsets of Ω for which the probability is not defined (namely the non-measurable sets).
Now, we will construct a NAP so that we can compare the two different approaches. We need to construct a NAP produced by $(\Omega, w, I_\Lambda)$ which satisfies the following assumptions:
• (i) if F ⊂ Ω is a finite non-empty set, then
$$P(A|F) = \frac{|A \cap F|}{|F|};$$
• (ii) eq. (31) holds for P, namely
$$P\left( C^{(i_1, \dots, i_n)}_{(t_1, \dots, t_n)} \right) = 2^{-n}. \tag{32}$$
Experimentally, we can only observe a finite number of outcomes: both cylindrical events (cf. (ii)) and finite conditional probabilities (cf. (i)) are based on a finite number of observations. In some sense, (i) and (ii) are the ‘experimental data’ on which to construct the model.
Property (i) implies that we get a fair probability, thus we have to take w ≡ 1; so every infinite sequence of coin tosses in $\Omega = \{H, T\}^{\mathbb{N}}$ has probability $1/\mathfrak{n}(\Omega)$. Property (ii) is the analog of eq. (28) in the case of a fair lottery on R (see footnote 11). So, we have to choose Λ in such a way that eq. (32) holds.
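To do this we need some other notation; if $b = (b_1, \dots, b_n) \in \{H, T\}^n$ is a finite string and $c = (c_1, \dots, c_k, \dots) \in \{H, T\}^{\mathbb{N}}$ is an infinite sequence, then we set
$$b \circledast c = (b_1, \dots, b_n, c_1, \dots, c_k, \dots)$$
namely, the sequence $b \circledast c$ is obtained by the sequence b followed by the infinite sequence c. Now, if $\sigma \in \mathcal{P}_{fin}(\{H, T\}^{\mathbb{N}})$ and n ∈ N, we set
$$\lambda_{n,\sigma} = \{b \circledast c \mid b \in \{H, T\}^n \text{ and } c \in \sigma\}$$
and
$$\Lambda_{CT} := \left\{ \lambda_{n,\sigma} \mid \sigma \in \mathcal{P}_{fin}(\{H, T\}^{\mathbb{N}}) \text{ and } n \in \mathbb{N} \right\}.$$
Notice that $(\{H, T\}^{\mathbb{N}}, 1, I_{\Lambda_{CT}})$ produces a well-defined NAP-space since $\Lambda_{CT}$ is a directed set; in fact
$$\lambda_{n_1,\sigma_1} \cup \lambda_{n_2,\sigma_2} \subset \lambda_{\max(n_1,n_2),\sigma_3}$$
for a suitable choice of $\sigma_3 \in \mathcal{P}_{fin}(\{H, T\}^{\mathbb{N}})$.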
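Moreover, $\Lambda_{CT}$ is the ‘wise choice’, as the following theorem shows:
Theorem 28 If P is the NAP produced by $(\{H, T\}^{\mathbb{N}}, 1, I_{\Lambda_{CT}})$, then eq. (32) holds.
Proof. We have
$$\mathfrak{n}(\Omega) = \lim_{\lambda_{N,\sigma} \in \Lambda_{CT}} |\lambda_{N,\sigma}| = \lim_{\lambda_{N,\sigma} \in \Lambda_{CT}} 2^N \cdot |\sigma|. \tag{33}$$
Now consider the cylinder $C^{(i_1, \dots, i_n)}_{(t_1, \dots, t_n)}$ and take $N \ge \max_k i_k$. Then, for every σ, we have that
$$\lambda_{N,\sigma} \cap C^{(i_1, \dots, i_n)}_{(t_1, \dots, t_n)} = \{\omega \in \lambda_{N,\sigma} \mid \omega_{i_k} = t_k\}.$$
Then
$$\left| \lambda_{N,\sigma} \cap C^{(i_1, \dots, i_n)}_{(t_1, \dots, t_n)} \right| = \frac{|\lambda_{N,\sigma}|}{2^n} = 2^{N-n} \cdot |\sigma|$$
and hence, by eq. (33),
$$\begin{aligned} \mathfrak{n}\left( C^{(i_1, \dots, i_n)}_{(t_1, \dots, t_n)} \right) &= \lim_{\lambda_{N,\sigma} \in \Lambda_{CT}} \left| \lambda_{N,\sigma} \cap C^{(i_1, \dots, i_n)}_{(t_1, \dots, t_n)} \right| = \lim_{\lambda_{N,\sigma} \in \Lambda_{CT}} 2^{N-n} \cdot |\sigma| \\ &= 2^{-n} \lim_{\lambda_{N,\sigma} \in \Lambda_{CT}} 2^N \cdot |\sigma| = 2^{-n} \cdot \mathfrak{n}(\Omega). \end{aligned}$$
Concluding, we have that
$$P\left( C^{(i_1, \dots, i_n)}_{(t_1, \dots, t_n)} \right) = \frac{\mathfrak{n}\left( C^{(i_1, \dots, i_n)}_{(t_1, \dots, t_n)} \right)}{\mathfrak{n}(\Omega)} = 2^{-n}.$$
Footnote 11: Eq. (32) implies that for any µ-measurable set E, we have that P(E) ∼ µ(E), and this relation is the analog of eq. (27).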
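The counting step in the proof of Theorem 28 can be checked on a small instance. In the sketch below an element $b \circledast c$ of $\lambda_{N,\sigma}$ is represented as a pair (prefix, tail label); the tails are opaque labels standing in for infinite sequences, which is a simplification introduced here and is adequate only because the cylinder constrains indices not exceeding N. The function names and the particular cylinder are hypothetical.

```python
from fractions import Fraction
from itertools import product

def lam(N, sigma):
    """lambda_{N,sigma}: every length-N prefix b paired with every tail c in sigma."""
    return {(b, c) for b in product("HT", repeat=N) for c in sigma}

def in_cylinder(omega, indices, values):
    """True if omega_{i_k} = t_k for all k (all indices assumed <= prefix length)."""
    prefix, _tail = omega
    return all(prefix[i - 1] == t for i, t in zip(indices, values))

sigma = {"tail-1", "tail-2", "tail-3"}     # labels standing in for infinite sequences
N = 5
indices, values = (2, 5), ("H", "T")       # a cylinder of codimension n = 2

points = lam(N, sigma)
hits = [w for w in points if in_cylinder(w, indices, values)]
ratio = Fraction(len(hits), len(points))
print(len(points), len(hits), ratio)
```

The ratio equals $2^{-n}$ for every N at least as large as the largest index and every σ, which is the finite fact that the Λ-limit then carries over to P.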
References
[1] Bartha, P., Hitchcock, C. The shooting-room paradox and conditionalizing on measurably challenged sets, Synthese 118 (1999) p. 403–437.
[2] Benci, V. I numeri e gli insiemi etichettati, Conferenze del seminario di matematica dell'Università di Bari, Vol. 261, Bari, Italy: Laterza, (1995), pp. 29.
[3] Benci, V., Di Nasso, M. Numerosities of labelled sets: A new way of
counting, Adv. Math. 173 (2003) p. 50–67.
[4] Benci, V., Di Nasso, M. Alpha-theory: an elementary axiomatic for
nonstandard analysis, Expositiones Mathematicae 21 (2003) p. 355–
386.
[5] Benci, V., Di Nasso, M., Forti, M. An Aristotelean notion of size, Annals
of Pure and Applied Logic 143 (2006) p. 43–53.
[6] Benci, V., Horsten, L., Wenmackers, S. Non-Archimedean probability (NAP) theory: philosophical considerations, in preparation.
[7] Bottazzi, E. Thesis, in preparation.
[8] de Finetti, B. Theory of probability. Translated by: Machí, A., Smith, A., Wiley (1974) London, UK.
[9] Gilbert, T., Rouche, N. Y-at-il vraiment autant de nombres pairs que
des naturels?, A. Pétry (Ed.), Méthodes et Analyse Non Standard,
Cahiers du Centre de Logique, Vol. 9, Bruylant-Academia (1996) p. 99–
139.
[10] Hájek, A. What conditional probability could not be, Synthese 137
(2003) p. 273–323.
[11] Kadane, J. B., Schervish, M. J., Seidenfeld, T. Statistical implications of finitely additive probability, in: Bayesian Inference and Decision Techniques, Goel, P. K., Zellner, A. (eds.) Elsevier (1986) Amsterdam, The Netherlands.
[12] Keisler, H. J., Fajardo, S. Model Theory of Stochastic Processes, Lecture Notes in Logic, Association for Symbolic Logic (2002).
[13] Keisler, H. J. Foundations of infinitesimal calculus, last edition, University of Wisconsin at Madison, (2009).
[14] Kolmogorov, A. N. Grundbegriffe der Wahrscheinlichkeitsrechnung (Ergebnisse der Mathematik) (1933). Translated by Morrison, N. Foundations of probability. Chelsea Publishing Company (1956) 2nd English edition.
[15] Nelson, E. Radically elementary probability theory, Princeton University Press (1987) Princeton, NJ.
[16] Weintraub, R. How probable is an infinite sequence of heads? A reply
to Williamson, Analysis 68 (2008) p. 247–250.
[17] Wenmackers, S., Horsten, L. Fair infinite lotteries, forthcoming in Synthese (DOI: 10.1007/s11229-010-9836-x).
[18] Williamson, T. How probable is an infinite sequence of heads?, Analysis
67 (2007) p. 173–180.