Time in Physics (2017)

Tutorials, Schools, and Workshops in the Mathematical Sciences

Editors: Renato Renner, Sandra Stupar
The series is the primary resource for the publication of results and developments
presented at summer or winter schools, workshops, tutorials, and seminars. Written
in an informal and accessible style, the volumes present important and emerging topics
in scientific research for PhD students and researchers. Filling a gap between
traditional lecture notes, proceedings, and standard textbooks, the titles included
in TSWMS present material from the forefront of research.
Time in Physics

Editors: Renato Renner and Sandra Stupar
Institute for Theoretical Physics, ETH Zurich, Zurich, Switzerland
Preface
Some of the questions tackled are: Is an arrow of time physical and does it stem from
the second law of thermodynamics as usually believed? What does time represent
on a cosmological level and can we connect it to entanglement? How are causality
and time related, and what can we deduce from causal relations? How intertwined
are the notions of free will and time in physics?
This book is aimed at students and scientists learning about the concept of time
and related areas. The choice of topics represents different approaches that are
currently followed by researchers working in the field. The readers can find answers
to questions about time, but more likely they will also find themselves puzzled by
the many fundamental questions that are still open. The book may therefore also
serve as a starting point for new research into the subject. We would like to thank all
the authors for the time and knowledge they shared when writing chapters for this
book. We are grateful to the referees whose comments helped shape the articles into
their final form. Our thanks go to Clemens Heine from Birkhäuser Basel for his help
and encouragement during the preparation and realisation of this book. We are also
thankful to Luca Sidler and the whole editorial team of Birkhäuser for the smooth
editing and publishing procedure.
Time Really Passes, Science Can’t Deny That
Nicolas Gisin
Abstract Today’s science provides quite a lean picture of time as a mere geometric
evolution parameter. I argue that time is much richer. In particular, I argue that
besides the geometric time, there is creative time, when objective chance events
happen. The existence of the latter follows straight from the existence of free-will.
Following the French philosopher Lequyer, I argue that free-will is a prerequisite for
the possibility of rational argumentation, hence can't be denied. Consequently,
science can't deny the existence of creative time and thus that time really passes.
1 Introduction
What is free-will for a physicist? This is a very personal question. Most physicists
pretend they don’t care, that it is not important to them, at least not in their
professional life. But if pressed during some evening free discussions, after a few
beers, surprising answers come out. Everything from “obviously I enjoy free-will”
to “obviously I don’t have any free-will” can be heard. Similarly, questions about
time lead to vastly different, though generally quite lean, discussions: “Time is a mere
evolution parameter” and “Time is geometrical” are standard claims that illustrate how
poorly today’s physics understands time. Consequently, a theory of quantum gravity
that will have to incorporate time in a much more subtle and rich way will remain a
dream as long as we don’t elaborate deeper notions of time.
I like to argue that some relevant aspect of time is not independent of free-will
and that free-will is necessary for rational thinking, hence for science. Consequently,
this aspect of time, that I’ll name creative time—or Heraclitus-time—is necessary
for science. For different arguments in favor of the passage of time, see, e.g., [1, 2].
The identification of time with (classical) clocks is likely to be misleading (sorry
Einstein). Clocks do not describe our internal feeling of the passage of time, nor
the objective chance events that characterize disruptive times—the creative time—
when something beyond the mere unfolding of a symmetry happens. Indeed, clocks
describe only one aspect of time: the geometric, boring, Parmenides-time.

N. Gisin
Group of Applied Physics, University of Geneva, 1211 Geneva 4, Switzerland
e-mail: nicolas.gisin@unige.ch
But let’s start from the beginning. Before thinking of time and even before
physics and philosophy, we need the possibility to decide what we’ll consider as
correct statements that we trust and believe and which statements we don’t trust and
thus don’t buy. Hence:
Free-Will comes first, in the logical order; and all the rest follows from this
premise.
Free-will is the possibility to choose between several possible futures, the
possibility to choose what to believe and what to do (and thus what not to believe
and not to do). This is in tension with scientific determinism,1 according to which,
all of today’s facts were necessary given the past and the laws of nature. Notice that
the past could be yesterday or the big-bang billions of years ago. Indeed, according
to scientific determinism, nothing truly new ever happens, everything was set and
determined at the big-bang.2 This is the view today’s physics offers and I always
found it amazing that many people, including clever people, do really believe in this
[3]. Time would merely be an enormous illusion, nothing but a parameter labeling
an extraordinary unraveling of some pre-existing initial (or final) conditions, i.e. the
unfolding of some symmetry. What is the explanatory power of such a view? What
is the explanatory power of the claim that everything was set at the beginning—
including our present day feelings about free-will—and that there is nothing more to
add because there is no possibility to add anything. Clearly, I am not a compatibilist
[4], i.e. not among those who believe that free-will is merely the fact that we always
happen to “choose” what was already pre-determined to occur, hence that nothing
goes against our apparently free choices.3 I strongly believe that we truly make
choices among several possible futures.
Before elaborating on all this, let me summarize my argument. The following
sections do then develop the successive points of my reasoning.
1 For physicists, scientific determinism is an extraordinarily strong view: everything is determined by the initial state of the atoms and quanta that make up the world; nothing beyond that has any independent existence.
2 Equally, one may claim that everything is set by tomorrow's state; a fact that illustrates that time in such a deterministic world is a mere illusion [3].
3 Compatibilism is quite fashionable among philosophers. They argue that it is our character, reasons and power that determine our actions [4]. But for a physicist, there is nothing like characters, reasons or power above the physical state of the atoms and quanta that make up our brain, body and all the universe. Hence, if the physical state evolves deterministically, then there is nothing left; everything is determined. In such a case the difference between a human and a laundry machine would only be a matter of complexity, nothing fundamental.
1. Free-Will comes first in the logical order. Indeed, without free-will there is no
way to make sense of anything, no way to decide which arguments to buy and
which to reject. Hence, there would be no rational thinking and no science. In
particular, there would be no understanding.
2. Since free-will is the possibility to choose between several possible futures, point
1 implies that the world is not entirely deterministic.
3. Non-determinism implies that time really exists and really passes: today there
are facts that were not necessary yesterday,4 i.e. the future is open.
4. In addition to the geometrical time, there is also creative time. One may like to
call the first one Parmenides-time, and the second concept of time Heraclitus-
time [5]. Both exist.
5. The tension between free-will and creative time on one side and scientific
determinism on the other side dissolves once one realizes that the so-called real
numbers are not really real: there is no infinite amount of information in any
finite space volume, hence initial conditions and parameters defining evolution
laws are not ultimately defined, i.e. the real numbers that theories use as initial
conditions and parameters are not physically real. Hence, neither Newtonian, nor
relativity, nor quantum physics are ultimately deterministic.
6. Consequently, neither philosophy nor science nor any rational argument can ever
disprove the existence of free-will, hence of the passage of time.
4 Admittedly, I use the primitive concepts of today and yesterday to get the direction of time, but the existence of creative time is a direct consequence of non-determinism.
5 Some may believe that a computer can think rationally, possibly that computers are optimal in terms of rationality. But, even if one limits oneself to mathematics, a highly rational field, how could a computer decide to add or not the axiom of choice to the basic Zermelo-Fraenkel axioms of mathematics? Consistency doesn't help, as both assuming the axiom of choice and assuming its negation lead to consistent sets of axioms. Hence, a choice has to be made, a choice that has consequences, hence impacts what makes sense to us. Most mathematicians accept the axiom of choice because it allows them to prove more theorems. Why not. But I reject this axiom because some of its consequences are absurd to me [6]. This is an example where free-will is necessary to make a sensible decision. Note that one's decision may evolve over time.

Fig. 1 Jules Lequyer was born in 1814 in the village Quintin (see inset), in Brittany, France, in this house. He died in 1862, probably committing suicide by swimming away in the sea

understanding. Furthermore, without free-will one could not decide when and how
to test scientific theories. Hence, one could not falsify theories and science, in the
sense of Popper [7], would be impossible.

I was very pleased to learn that my basic intuition, expressed above, was
shared and anticipated by a poorly known French philosopher, Jules Lequyer in the
nineteenth century, who wanted to simultaneously validate Science and free-will [8],
Fig. 1. As Lequyer emphasized: “without free-will the certainty of scientific truths
would become illusory”. And (my addition) the consistency of rational arguments
would equally become illusory. Lequyer continues: “Instead of asking whether free-
will is certain, let’s realize that certainty requires free-will” [8].6
6 In the original French: “Au lieu de nous demander si la liberté est une certitude, prenons conscience que la certitude a pour condition la liberté.”
Lequyer also emphasized that free-will doesn’t create any new possibilities; it
only makes some pre-existing potentialities become actual, a view very reminiscent
of Heisenberg’s interpretation of quantum theory. However, Lequyer continues,
free-will is also the rejection of chance. For Lequyer—and for me—our acts of
free-will are beginnings of chains of consequences. Hence, the future is open,
determinism is wrong; a point on which I’ll elaborate in the next two sections.
Lequyer didn’t publish anything. But, fortunately, he had an enormous influence
on another French philosopher, a close friend, Charles Renouvier, who wrote about
Lequyer’s ideas and published some of Lequyer’s notes [8, 9]. In turn, Renouvier
had a great influence on the famous American philosopher and psychologist William
James, considered one of the most influential American psychologists.
William James wrote “After reading Renouvier, my first act of free-will shall be to
believe in free-will”. This may sound bizarre, but, in fact, is perfectly coherent: once
one realizes that everything rests on free-will, then one acts accordingly.
The existence of genuine free-will, i.e. the possibility to choose among several
possible futures, naturally implies that the world is not entirely deterministic. In
other words, today there are facts that were not necessary, i.e. facts that were not
predetermined from yesterday, and even less from the big-bang.
Recall that according to scientific determinism everything was set at the begin-
ning, let’s say at the big-bang, and since then everything merely unfolds by
necessity, without any possible choice. Philosophers include in the initial state not
only the physical state of the universe, but possibly also the character of humans—
and living beings. Hence, let’s recall that according to physical determinism
everything is fully determined by the initial state of all the atoms and quanta at
any time (or time-like hypersurface) and the laws of physics. For example, given the
state of the universe a nanosecond after the big-bang, everything that ever happened
and will ever happen—including the characters, desires and reasons of all humans—
was entirely determined by this initial condition. In other words, nothing truly new
happens, as everything was already necessary a nanosecond after the big-bang.
But how can one reconcile ideas about free-will, such as summarized in the
previous sections, with scientific determinism? Or even with quantum randomness?
This difficulty led many philosophers and scientists to doubt the very existence
of free-will. These so-called compatibilists changed the definition of free-will in
order to make it compatible with determinism [4]. Free-will, they argue, is merely
the fact that we are determined never to choose anything that doesn’t necessarily
happen. Nevertheless, compatibilists argue, we have the feeling that our “necessary
choices” are free. This sounds to me like a game of words, a desperate attempt
to save both our inner feeling of free-will and scientific determinism. But, as Lequyer
anticipated, free-will comes first, hence there is no way to rationally argue against
its existence, for rational arguing requires that one can freely buy or not buy the
argument: genuine compatibilists must freely decide to buy the compatibilists’
argument, hence compatibilists must enjoy free-will in Lequyer’s sense. Moreover,
and this is my main point, scientific determinism is wrong, hence there is no need
to squeeze free-will in a deterministic world-view.
Let me emphasize that since free-will comes first, i.e. the possibility to choose
between several possible futures comes first, and since this is incompatible with
scientific determinism, the latter is necessarily wrong: the future has to be open, as
we show in the next section.
Before explaining why physics, including classical Newtonian physics, is not
deterministic, let me first address two related questions: When do random (undeter-
mined) events happen? What triggers random events?
Already when I was a high school student, long before thinking seriously about
free-will, the concept of randomness and indeterminism puzzled me a lot [10]. When
can a random event happen? What triggers its occurrence? If randomness is only a
characteristic of long sequences, as my teachers told me, then what characterizes
individual random events? What is the probability of a singular event? Aren’t long
sequences merely the accumulation of individual events?7
The only interesting answer to the question “when do random events happen?”
I could find was given by yet another nineteenth century French philosopher
(there is no way to escape from one’s cultural environment), Antoine A. Cournot
[11]. His idea was that chance happens when causal chains meet. This is a nice
idea, illustrated, e.g., by quantum chance which happens when a quantum system
encounters a measuring device.8
This idea can be illustrated by everyday chance events. Imagine that two persons,
Alice and Bob, meet by chance in the street (an example taken from [12]). This might happen,
for example, because Alice was going to the restaurant further down the same street
and Bob to see a friend who lives in the next street. From the moment they decide
to go on foot, by the shortest possible path, to the restaurant for Alice and to see
his friend for Bob, their meeting was predictable. This is an example of two causal
chains of events, the paths followed by Alice and Bob, which cross one another and
thus produce what looks like a chance encounter to each of them. But that encounter
7 A long sequence of pseudo-random bits is entirely given at once, because it is entirely determined by the initial condition, i.e. by the seed. In such a case I have no problem with the idea that the pseudo-randomness is a characteristic of the entire sequence. But what about long sequences of truly random bits, produced one after the other, let’s say one per second? Each one is a little act of creation and the sequence nothing but an accumulation of individual random bits. Accordingly, randomness of truly random bits must be a characteristic of the individual events, not of the sequence [10]. Notice that in the case of pseudo-randomness only the geometric-boring-time is relevant, but in the case of true randomness that concept of time is insufficient, as the creative-time is at work (but without any free-will).
8 Note that this doesn’t solve the quantum measurement problem, i.e. doesn’t answer the question “which configurations of atoms constitute a measurement device?”.
was predictable for someone with a sufficiently global view. The apparently chance-
like nature of the meeting was thus only due to ignorance: Bob did not know where
Alice was going, and conversely. But what was the situation before Alice decided to
go to the restaurant? If we agree that she enjoys the benefits of free-will, then before
she made this decision, the meeting was truly unpredictable. True chance is like
this. True chance therefore does not have a cause in the same sense as events in a
deterministic world. A result subject to true chance is not predetermined in any way.
But we need to qualify this assertion, because a truly chance-like event may have a
cause. It is just that this cause does not determine the result; only the probabilities
of a range of different possible results are determined. In other words, it is only the
propensity of a certain event to be realised that is actually predetermined, not which
event obtains [10].
Let’s take a more physicist’s look at this. First, consider two colliding classical
particles, see Fig. 2. Next, consider a unitary quantum evolution in an arbitrary
Hilbert space, see Fig. 3. Look for a while at the latter; it is especially
boring: nothing happens, it is just a symmetry that displays itself. Possibly the
symmetry is complex and the Hilbert space very large, but frankly, nothing happens
as the equivalence between the Schrödinger and the Heisenberg pictures clearly
demonstrates. Likewise, for a bunch of classical harmonic oscillators nothing
happens. Somehow, there is no time (or only the boring geometric time that merely
labels the evolution). Similarly, as long as the classical particles of Fig. 2 merely
move straight at a constant speed, nothing happens: in another reference frame they
are at rest. It is only when the classical particles collide, or when the quantum system
meets a measuring apparatus, that something happens, as Cournot claimed.
But one may object that in phase space the point that represents the two particles
doesn’t meet anything. In phase space, there is no collision, as collisions require
at least two objects and in phase space there is only one object, i.e. one point.
Moreover, the collision in real space and the consequences of that collision are
already entirely determined by the initial conditions: in phase space it’s again only
a symplectic symmetry that displays itself.

Fig. 2 Sketch of two colliding classical particles. Initially they merely move along straight lines; nothing happens. Next, they collide; the very details of this process depend on infinitesimal digits of the initial conditions and of their shapes. Finally, the two particles continue again along boring straight lines
And even if one assumes that each particle is initially “independent”, whatever
that could mean, after colliding the two particles get correlated. Hence, for
Cournot’s idea to work, one would need a “correlation sink”. This is a bit similar
to the collapse postulate of quantum theory which breaks correlations, i.e. resets
independence (separability).
In summary, Cournot’s idea is attractive, but not entirely satisfactory; it doesn’t
seem to fit with scientific determinism. It took me a very long time to realize what
is wrong with that last claim, namely with scientific determinism.
Consider a finite volume of space, e.g. a one millimeter radius ball containing
finitely many particles. Can this finite volume of space hold infinitely many bits
of information? Classical and quantum theories’ answer is a clear “yes”. But why
should we buy this assertion? The idea that a finite volume of space can hold but
a finite amount of information is quite intuitive. However, theoretical physics uses
real numbers (and complex numbers, but let’s concentrate on the reals, this suffices
for my argument). Hence the question: are so-called real numbers really real? Are
they physically real?
For sure, it is not because Descartes (yet another French philosopher, but this
time a well-known one) named the so-called real numbers “real” that they are really
real.
Actually, the idea that real numbers are truly real is absurd: a single real number
contains an infinite number of bits and could thus, for example, contain all the
answers to all questions one could possibly formulate in any human language [13].
Indeed, there are only finitely many languages, each with finitely many letters or
symbols, hence there are only countably many sequences of letters. Most of them
don’t make any sense, but one could enumerate all sequences of letters as successive
bits of one real number 0.b1 b2 b3 ... bn ..., first the sequences of length 1, next of
length 2 and so on. The first bit after each sequence tells whether the sequence
corresponds to a binary question and, if so, the following bit provides the answer.
Such a single real number would contain an infinite amount of information, in
particular, as said, it would contain the answer to all possible questions one can
formulate in any human language. No doubt, real numbers are true monsters!
Moreover, almost all so-called real numbers are uncomputable. Indeed, there are
only countably many computer programs, hence real numbers are uncomputable
with probability one. In other words, almost all real numbers are random in the
sense that their sequences of digits (or bits) are random. Let me emphasize that they
are as random as the outcome of measurements on half a singlet,9 the archetype
of quantum randomness. And these random numbers (a better name for “real”
numbers) should be at the basis of scientific determinism? Come on, that’s just not
serious!
Imagine that at school you had learned to name the so-called real numbers
using the more appropriate terminology of random numbers. Would you believe
that these random numbers are at the basis of scientific determinism? To name
“random numbers” “real numbers” is the greatest scam and trickery of science;
it is also a great source of confusion in the philosophy of science.
Note that not all real numbers are random. Some, but only countably many, are
computable, like all rational numbers and numbers like π and √2. Actually, all
numbers one may explicitly encounter are computable, i.e. are exceptional.
The use of real numbers in physics, and other sciences, is an extremely efficient
and useful idealization, e.g. to allow for differential equations. But one should
not make the mistake of believing that this idealization implies that nature is
deterministic. A deterministic theoretical model of physics doesn’t imply that nature
is deterministic. Again, real numbers are extremely useful for doing theoretical physics
and calculations, but they are not physically real.
9 That is, on a spin-1/2 maximally entangled with another spin-1/2.
The fact that so-called real numbers have in fact random digits, after the first few
ones, has especially important consequences in chaotic dynamical systems. After
a pretty short time, the future evolution would depend on the thousandth digit of
the initial condition. But that digit doesn’t really exist.10 Consequently, the future
of classical chaotic systems is open and Newtonian dynamics is not deterministic.
Actually most classical systems are chaotic, at least the interesting ones, i.e. all
those that are not equivalent to a bunch of harmonic oscillators. Hence, classical
mechanics is not deterministic, contrary to standard claims and widely held beliefs.
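This sensitivity to far-away digits is easy to check numerically. Here is a minimal sketch (Python; the specific map and numbers are illustrative, not from the text) using the logistic map x → 4x(1−x), a standard chaotic system: a perturbation of 10⁻¹² in the initial condition, i.e. deep in the decimal expansion, grows to an order-one difference within a few dozen iterations.

```python
# Logistic map x -> 4x(1-x): a standard chaotic system whose small
# errors roughly double at each step. A disturbance in the twelfth
# decimal of the initial condition becomes macroscopic after a few
# dozen iterations, so the far digits decide the future.
def logistic_orbit(x, steps):
    orbit = [x]
    for _ in range(steps):
        x = 4.0 * x * (1.0 - x)
        orbit.append(x)
    return orbit

a = logistic_orbit(0.2, 200)
b = logistic_orbit(0.2 + 1e-12, 200)

assert abs(a[10] - b[10]) < 1e-4                     # early on: indistinguishable
assert max(abs(p - q) for p, q in zip(a, b)) > 0.1   # later: macroscopically different
```

If, as argued above, the thousandth digit of the initial condition doesn't physically exist, then after enough iterations the trajectory of such a system is simply not determined by anything.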
Note that the non-deterministic nature of physics may leave room for emergent
phenomena, e.g. phenomena that could produce top-down causes, in contrast to
the usual bottom-up causes we are used to in physics [14]. A well-known example
of a set of phenomena that emerges from classical mechanics is thermodynamics
which can be deduced in the so-called thermodynamical limit. But, rather than going
to infinite systems, it suffices to merely understand that classical mechanics is not
ultimately deterministic, neither in the initial condition, nor in the set of boundary
conditions and potentials required to define the evolution equations.
What about quantum theory? Well, if one accepts that the measurement problem
is a real physics problem—as I do—then this theory is also clearly not deterministic
[12]. If, on the contrary, one believes in some form of a many-worlds view, then
the details of the enormously entangled wave function of the Universe depend
again on infinitesimal details, as in classical chaotic systems. Note that although
quantum dynamics has no hyper-sensitivity to initial conditions, it shares with
classical chaotic systems hyper-sensitivity to the parameters that characterize
that dynamics, e.g. the Hamiltonian. Furthermore, open quantum systems recover
classical trajectories also in the case of chaotic systems, see Fig. 4. Hence, quantum
dynamics is not deterministic. Finally, Bohmian quantum mechanics is again hyper-
sensitive to the initial condition of the positions of the Bohmian particles; hence,
like chaotic classical systems, Bohmian mechanics is not deterministic.
Admittedly, one may object that now we have an analog of the measurement
problem in classical physics, as it is unclear when and how the non-existing
digits necessary to define the future of chaotic systems get determined. This is
correct and, in my opinion, inevitable. First, because free-will comes first, next
because mathematical real numbers are physical random numbers. Finally, because
physics—and science in general—is the human activity aiming at describing and
understanding how Nature does it. For this purpose one needs to describe also
how humans interact with nature, how we question nature [16]. Including the
observer inside the description results, at best, in a tautology without any possible
understanding: there would be no way to freely decide which description
provides explanations, which argument to buy or not to buy.
10 It’s not that there is a sharp limit on the number of digits; they merely fade off.
Fig. 4 Poincaré section of the forced and damped quantum Duffing oscillator in the chaotic regime, described by the Quantum State Diffusion model of open quantum systems [15]. Note that the axes represent quantum expectation values of position and momentum. This strange attractor is essentially identical to its classical analog
Fig. 5 The real or physical world versus Pythagoras’ mathematical world should not be confused
So far we saw that free-will comes first in the logical order, hence all its con-
sequences are necessary. In particular one can’t argue rationally against free-will
and its natural consequence, namely that time really passes. We also saw that
this is not in contradiction with any scientific fact. Actually, quite the opposite,
it is in accordance with the natural assumption that no finite region of space can
contain more than a finite amount of information. The widely held faith in scientific
determinism is nothing but excessive scientism.
This can be summarized with the simple chain of implications: free-will ⇒ non-determinism ⇒ the future is open ⇒ time really passes.
Let us look closer at the implications for time. There is no doubt that time as an
evolution parameter exists. To get convinced it suffices to look at a bunch of classical
harmonic oscillators (like classical clocks), or the unitary evolution of a closed
quantum system, or at the inertial motion of a classical particle as in Fig. 2. This time
is the boring time, the time when nothing truly new happens, the time when things
merely are, time when what matters is being, i.e. Parmenides-time. One could also
name this Einstein’s time.11 But let’s look at the collision between the two particles
of Fig. 2. The detail of the consequences of such a collision depends on non-existing
infinitesimal digits, i.e. on mathematically real but physically random numbers. To
get convinced just imagine a series of such collisions; this leads to chaos, hence
each collision is the place of some indeterminism, that is of some creative time,
time when what matters is change. Hence we call this creative time Heraclitus-time
[5]. This creative time is extraordinarily poorly understood by today’s science, in
particular by today’s physics. This doesn’t mean that it doesn’t exist, or that it is
not important. On the contrary, it means that there are huge and fascinating open
problems in front of us, scientists, physicists and philosophers.
Notice that this is closely related to Cournot’s idea that random events happen
when independent causal chains meet, e.g. when two classical particles meet. The
two particles are independent, at least not fully correlated, because their initial
conditions are not fully determined. And their future, after the collision, is not
predetermined, but contains a bit of chance.
Similarly, quantum chance happens when a quantum system meets a measure-
ment apparatus, as described by standard textbooks. Admittedly, we don’t know
what a measurement apparatus is, i.e. we don’t know which configurations of atoms
constitute a measurement apparatus. This is the so-called quantum measurement
problem. According to what we saw, there is a similar problem in classical
11 Einstein identified time with classical clocks, i.e. with classical harmonic oscillators. But what about clocks based on Heraclitus’ creative time, i.e. clocks based on chaotic or quantum systems?
mechanics: despite the indeterminism in the initial conditions and evolution param-
eters, things get determined as creative time passes (as discussed near the end of the
previous section).
7 Conclusion
Neither philosophy nor science can ever disprove the existence of free-will. Indeed,
free-will is a prerequisite for rational thinking and for understanding, as emphasized
by Jules Lequyer. Consequently, neither philosophy nor science can ever disprove
that time really passes. Indeed, the fact that time really passes is a necessary
consequence of the existence of free-will.
The fact that today’s science—including classical Newtonian mechanics—is not
deterministic may come as a huge surprise to many readers (including myself
20 years ago). Indeed, the fact that Descartes named “real” numbers that are
actually physically random had enormous consequences. This, together with the
tendency of many scientists to elevate their findings to some sort of quasi-religious
ultimate truth—i.e. scientism—led to great confusion, as illustrated by Laplace’s
famous claim about determinism and by believers in some form of the many-worlds
interpretation of quantum mechanics, based respectively on the determinism of
Newton’s equation and on the linearity of Schrödinger’s equation.
Once one realizes that science is not entirely deterministic, though it clearly
contains deterministic causal chains, one faces formidable opportunities. This might
seem frightening, though I always prefer challenges and open problems to the claim
that everything is solved.
Non-determinism implies that time really passes, most likely at the junction of
causal chains, i.e. when creative time is at work. This leaves room for emerging
phenomena, like thermodynamics of finite systems. It may also leave room for
top-down causality: the initial indeterminism must become determined before
indeterminism hits large scale, much in the spirit of quantum measurements.
As a side conclusion, note that robots based on digital electronics will never leave
room for free-will, hence the central thesis of hard artificial intelligence (the claim
that sufficiently sophisticated robots will automatically become conscious and
human-like) is fundamentally wrong.
So, am I a dualist? Possibly, though it depends on what is meant by that. For
sure I am not a materialist. Note that today's physics already includes systems
that are not material in the sense that they have no mass, like electromagnetic
radiation, light and photons. What about physicalism? If this means that everything
can be described and understood by today's physics, then physicalism is trivially
wrong, as today's theories describe at best 5% of the content of the universe. More
interestingly, if physicalism means that everything can be understood using the tools
of physics, then I adhere to this view, though the fact that free-will comes first
implies that, although physics will make endless progress, it will never reach a final
point. We will understand much more, in particular about time and about free-will,
though we'll never get a full rational description and understanding of free-will.
Just imagine this debate a century ago. How naive anyone claiming at that time
that physics provides a fairly complete description of nature would appear today.12
The same will apply to anyone making a similar claim today.
Let me make a last comment, a bit off-track. Free-will is often analyzed in
a context involving human responsibility, “How could we be responsible for our
actions if we don’t enjoy free-will?”. There is another side to this aspect of the
free-will question: “How could we prevent humans from destroying humanity if
we claim we are nothing more than sophisticated robots?”, and “How could one
argue that human life has some superior value if we pretend we are nothing but
sophisticated robots?”.
Acknowledgements This work profited from numerous discussions, mostly with myself over
many decades during long and pleasant walks. I should also thank my old friend Jean-Claude
Zambrini for introducing me to Cournot's idea when we were both students in Geneva. Thanks are
due to Chris Fuchs, who introduced me to Jules Lequyer, and to the many participants of the
workshop on Time in Physics organized at ETH Zurich by Sandra Rankovic, Daniela Frauchiger
and Renato Renner.
12. Remember a few decades ago, when biology was claiming that genes fully determine all living
beings. This was considered a major and final finding. It was a major finding, indeed, but clearly
not a final one. Today, epigenetics shows that much more than genes influences living beings.

References
N. Gisin: Time Really Passes, Science Can't Deny That
Julian Barbour
Abstract Entropy and the second law of thermodynamics were discovered through
study of the behaviour of gases in confined spaces. The related techniques developed
in the kinetic theory of gases have failed to resolve the apparent conflict between the
time-reversal symmetry of all known laws of nature and the existence of arrows of
time that at all times and everywhere in the universe all point in the same direction.
I will argue that the failure may be due to unconscious application to the universe of
the conceptual framework developed for confined systems. If, as seems plausible,
the universe is an unconfined system, new concepts are needed.
1 Introduction
In this paper, I will not attempt to cover all the ground of my talk
(http://www.video.ethz.ch/conferences/2015/d-phys.html) at the Time Conference in
Zurich in September 2015 or the material in [1–3] on which my talk was based. Instead, taking
an historical perspective, I want to indicate why I think the traditional understanding
of entropy needs to be modified if it is to be applied to the universe. The main
reason is that thermodynamics and its interpretation by statistical mechanics were
developed for confined systems whereas the universe appears to be unconfined.
This, I believe, has far-reaching implications for all questions relating to the various
arrows of time.
Simple examples explain what I mean by confined and unconfined systems. In
the ideal-gas model, many particles move inertially apart from short-range elastic
interactions. They are confined to a box at rest in an inertial frame and bounce
elastically off its walls. That's a confined system. The same particles without the
box form an unconfined system. Pointlike particles that interact solely through
J. Barbour ()
College Farm, South Newington, Banbury, Oxon OX15 4JG, UK
University of Oxford, Oxford, UK
e-mail: BarbourJ@physics.ox.ac.uk
Newtonian gravity can model an unconfined ‘island universe’, but the ideal gas will
already indicate the need for new concepts. Proper application of entropic ideas to
the universe will surely need inclusion of gravity. My collaborators present ideas
about that in [3] and about the quantum mechanics of unconfined systems in [1,
Sect. 4].
My survey of the arrow-of-time literature failed to identify any study that highlights
the distinction between confined and unconfined systems. True, the universe’s
expansion, aided by gravity, is often mooted (see, e.g., [4, 5]) as the ‘master arrow’
for the other arrows, but one finds little suggestion that the very concept of entropy
needs reexamination in unconfined systems. The unconfined ideal gas shows that it
does.
For this the heterogeneity of its degrees of freedom (dofs) is important: N
particles in Euclidean space have 3N Cartesian coordinates. Three locate the centre
of mass, three define the orientation and one the overall size. If r_a^cm is the
centre-of-mass position of particle a, the centre-of-mass moment of inertia (half the
trace of the inertia tensor)

    I_{cm} = \sum_{a=1}^{N} m_a\, r_a^{cm} \cdot r_a^{cm}
           = \frac{1}{m_{tot}} \sum_{a<b} m_a m_b\, r_{ab}^2,
    \qquad m_{tot} = \sum_a m_a,                                        (1)

or its square root (divided by the total mass), which is the root-mean-square length

    \ell_{rms} = \sqrt{\frac{\sum_{a<b} m_a m_b\, r_{ab}^2}{m_{tot}^2}},        (2)
measures the size. The remaining dofs describe the shape of the instantaneous
configuration. This paper is about the different behaviours of the scale and shape
dofs in confined and unconfined systems. It is revealing that even when general-
relativistic cosmological models of unconfined universes have been considered, an
important consequence of the shape/scale difference has hardly ever been noted,
as I shall explain. In books that do not consider cosmology, I have not once
seen attention drawn to the shape/scale difference. As in many dynamical-systems
studies, virtually all authors use Lagrange's generalized coordinates, which are
simply denoted q_1, ..., q_n. This hides all trace of the shape/scale difference.
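The equivalence of the two forms in (1), and the definition (2), are easy to check numerically. A minimal Python sketch (arbitrary masses and positions of my own choosing, assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5
m = rng.uniform(1.0, 3.0, size=N)      # arbitrary particle masses
r = rng.normal(size=(N, 3))            # arbitrary positions in Euclidean space

m_tot = m.sum()
r_cm = (m[:, None] * r).sum(axis=0) / m_tot
r_rel = r - r_cm                       # centre-of-mass positions r_a^cm

# Eq. (1), first form: I_cm = sum_a m_a r_a^cm . r_a^cm
I_direct = float(np.sum(m * np.einsum('ij,ij->i', r_rel, r_rel)))

# Eq. (1), second form: (1/m_tot) sum_{a<b} m_a m_b r_ab^2
I_pair = sum(m[a] * m[b] * np.sum((r[a] - r[b]) ** 2)
             for a in range(N) for b in range(a + 1, N)) / m_tot

# Eq. (2): the root-mean-square length, equal to sqrt(I_cm / m_tot)
ell_rms = np.sqrt(I_pair / m_tot)

print(np.isclose(I_direct, I_pair))    # -> True
```

The agreement of the two sums is an exact algebraic identity; the check only confirms it to floating-point precision.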
Arrows of Time in Unconfined Systems 19
Suppose that at t0 all the particles of the ideal-gas model have random velocities
and are in a small cloud in the centre of the box. The particles will spread out.
Their elastic collisions with each other and the box walls will soon establish thermal
equilibrium. Coarse graining will permit definition of a Boltzmann entropy SB . It
will be low at t0 and then grow to a more or less stable maximum value. Since
the system is perfectly isolated (affected by no external forces), it will be subject
to Poincaré recurrence. Mostly, SB will exhibit very small fluctuations about its
maximum with rare deep fluctuations. If at any time t > t0 all the velocities are
exactly reversed, the system will retrace its evolution back to the state of low SB
at t0 , after which SB will rise again and embark on the typical Poincaré-recurrence
behaviour of the forward time direction. The complete S_B(t) curve, like the flanks
of the entropy dips within it, will be qualitatively symmetric with respect to the
direction of time. Note that the purely dynamical moment of inertia I_cm (or ℓ_rms)
behaves just like the statistical S_B(t).
In this scenario, three factors create the S_B curve: the initial spreading of the
particles into more phase-space cells; the interparticle collisions; the particle-box
collisions. Without the box, the interparticle collisions would soon cease and the
sole cause of S_B growth would be the growth of ℓ_rms.
4 Conceptual Inertia
Tolman showed that in such comoving volumes the entropies of both blackbody
radiation and an ideal non-relativistic gas in thermal equilibrium remain constant.
Provided the departure from homogeneity remains small and does not significantly
perturb the microwave background, it is possible to give a sensible estimate of the
entropy and its growth within our Hubble radius.
The other example of self-containment is associated with horizons: most con-
fidently with the event horizon of black holes and rather less with the particle
horizon in de Sitter space and the Rindler-wedge horizon of a uniformly accelerated
observer in Minkowski space. The entropic interpretation associated with horizons
in these cases relies to some extent on information-type arguments: the entropy
is said to represent an observer’s ignorance of what is on the unobservable ‘other
side’ of the horizon. Although few theoreticians doubt the existence of a deep
connection between gravity, thermodynamics and entropy, it may be noted that the
beautiful proofs which lead to this confidence rely in part on assumptions that can be
questioned. In particular, it is often assumed that space is asymptotically flat. Also
the black hole event horizon is not so much impenetrable as semipermeable (matter
can fall into the black hole) and requires a subtle definition involving the state of the
complete universe long after the collapse of matter that leads to the formation of the
black hole.

1. Fermi's definition of the entropy [6] of out-of-equilibrium systems is illuminating.
They must consist of subsystems each in equilibrium and separated by heat-insulating
walls.
2. This is also the most important condition required for Poincaré's recurrence
theorem to hold.
3. Gibbs noted that this restriction has a counterpart in thermodynamics, in which
"there is no thermodynamic equilibrium of a (finite) mass of gas in an infinite space".
The universally recognized problem is a general definition of gravitational
entropy. This is widely attributed to the breakdown of homogeneity at the end of the
era well described by FLRW cosmologies. I will suggest that a much more serious
problem is the very definition of entropy in a system that can expand freely. Since
the universe is manifestly far from equilibrium once it becomes inhomogeneous,
thermodynamic concepts which rely on equilibrium will not help. We must see how
far we can get with concepts taken from statistical mechanics.
5 Unconfined Systems
To this end, let us now consider unconfined systems, starting with the very simplest:
two particles moving inertially. This system exhibits a feature that will occupy a
central position in my discussion, both conceptually and literally: in every solution
there is a unique instant that divides it into two halves. This is the
instant at which the two particles are closer to each other than at any other time.
This is a very trivial system, but it already exhibits the feature I want to highlight.
We get a more illuminating example if we add Newtonian gravity, for which the two-
body solutions are of three kinds depending on the total centre-of-mass energy E_cm:
elliptical (E_cm < 0), parabolic (E_cm = 0) and hyperbolic (E_cm > 0) motion of each
particle about the common centre of mass. The elliptical case is periodic and quite
different to the other two but does have successive points of closest approach, each
of which divides the current orbit in half. In the other two cases, there is always a
unique point of closest approach. Even the case of collision can be regularized by a
bounce, which maintains the rule.
The N-body problem, N ≥ 3, is much more interesting. It hardly ever enters
university dynamics courses, which pass directly from two-body problems to rigid-
body theory and then to Lagrangian and Hamiltonian theory. This may explain why
a fact with a possibly deep connection with the second law of thermodynamics has
escaped attention. I recall first that a potential V(r_a) is homogeneous of degree k if,
for \alpha > 0, V(\alpha r_a) = \alpha^k V(r_a). For any such potential, Newton's second law
leads to the relation

    \frac{1}{2}\ddot{I}_{cm} = 2E_{cm} - (k+2)V.                        (4)

For the Newton potential V_{New}, k = -1. Thus, in the N-body problem,
\frac{1}{2}\ddot{I}_{cm} = 2E_{cm} - V_{New}. In addition, V_{New} is negative definite,
so if E_{cm} ≥ 0 then \ddot{I}_{cm} > 0.
This means that the graph of I_cm as a function of the time t is concave upwards
and tends to infinity in both time directions. This fact, first discovered for 3-body
motions in 1772 by Lagrange and later generalized to the N-body problem by Jacobi,
was the first qualitative discovery made in dynamics and played an important role in
the history of dynamics because it showed that the N-body problem with E_cm ≥ 0 is
unstable: at least one particle must escape to infinity. This then raised the question
of whether the solar system, for which E_cm < 0, is stable, the study of which
led to Poincaré's discovery of chaos. Another important consequence of (4) is the
monotonicity of \dot{I}_{cm}:

    \frac{1}{2}\dot{I}_{cm} = D.                                        (5)
The monotonic quantity (5), which by its close analogy with angular momentum
may be called the dilational momentum, is a Lyapunov variable; its existence
immediately shows that there can be no periodic motions or Poincaré recurrence
in the N-body problem with non-negative energy. For inertial motion, for which
V = const and k = 0, (4) holds in this case too and I_cm has a unique minimum.
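For purely inertial motion, I_cm(t) is in fact an exact upward parabola in t, so each solution has a single central minimum and unbounded growth in both time directions. A minimal numerical sketch of this (random masses, positions and velocities of my own choosing, assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 8
m = rng.uniform(1.0, 2.0, size=N)      # arbitrary masses
r0 = rng.normal(size=(N, 3))           # positions at t = 0
v = rng.normal(size=(N, 3))            # constant velocities

def I_cm(t):
    """Centre-of-mass moment of inertia for inertial motion r_a(t) = r_a^0 + v_a t."""
    r = r0 + v * t
    d = r - (m[:, None] * r).sum(axis=0) / m.sum()
    return float(np.sum(m * np.einsum('ij,ij->i', d, d)))

ts = np.linspace(-50.0, 50.0, 2001)
I = np.array([I_cm(t) for t in ts])

# Concave upwards everywhere, with a single interior minimum: the central point
# that divides the solution into two halves.
print(bool(np.all(np.diff(I, 2) > 0)), 0 < int(np.argmin(I)) < len(ts) - 1)
```

The second differences of an exact parabola sampled on a uniform grid are a positive constant, which is what the first check confirms numerically.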
In [3], my collaborators and I coined the expression Janus point for the minimum
of Icm and Janus-point systems for unconfined dynamical systems for which every
solution divides into two (qualitatively similar) halves at a unique central point.
Moreover, as pointed out in [1–3], the evolution in either direction away from
the Janus point J is time-asymmetric even though the governing equation is time-
reversal symmetric. This can be seen very easily in purely inertial motion, in which
the position vector of each particle satisfies r_a(t) = r_a^0 + v_a t, where r_a^0 is the
initial position and v_a the (constant) velocity. With the passage of time (in either
direction t → ±∞), the contribution of the velocity term must become dominant.
Moreover, because the particles with greater velocities get ever further from the
slower particles, the rate of separation \dot{r}_ab of any two particles a and b tends to
become ever more closely proportional to their mutual separation r_ab: \dot{r}_ab ∝ r_ab.
This Hubble-type expansion will occur not only in inertial motion but also for an
ideal gas if the confining box is suddenly removed.
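The approach to \dot{r}_ab ∝ r_ab can be checked directly: for inertial motion every pair's ratio \dot{r}_ab / r_ab tends to the common value 1/t, so the spread of the ratios across pairs shrinks as t grows. A sketch with arbitrary random initial data (my own illustration, assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 6
r0 = rng.normal(size=(N, 3))           # initial positions
v = rng.normal(size=(N, 3))            # constant velocities

def ratio_spread(t):
    """Relative spread over all pairs of rdot_ab / r_ab at time t."""
    r = r0 + v * t
    ratios = []
    for a in range(N):
        for b in range(a + 1, N):
            sep = r[a] - r[b]
            r_ab = np.linalg.norm(sep)
            rdot_ab = np.dot(sep, v[a] - v[b]) / r_ab   # d(r_ab)/dt
            ratios.append(rdot_ab / r_ab)
    ratios = np.array(ratios)
    return float(ratios.std() / abs(ratios.mean()))

# The pairwise ratios bunch ever more tightly around the common 'Hubble rate' 1/t.
print(ratio_spread(1000.0) < ratio_spread(100.0) < ratio_spread(10.0))
```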
The time asymmetry either side of the minimal I_cm at J is therefore manifested
in the ever greater tendency to Hubble-type expansion away from J. Moreover, the
system is always in its most disordered state around J. In the N-body problem, the
effect is much more striking because bound clusters are formed and move away
from each other in Hubble-type expansion. This causes growth (between bounds
that grow as t → ±∞) of a scale-invariant quantity called complexity in [1–3].
There is a deep reason for the time-asymmetric behaviour: Liouville's theorem.
In accordance with what I said about degrees of freedom, the total phase-space
volume is divided into parts: an orientational part (which we can ignore), a shape
part and the scale part. At J, the scale variable ℓ_rms takes its minimal value and
increases monotonically in both directions away from J. Given a Gibbs ensemble of
identical systems at J, the phase-space scale part must increase as t → ±∞. This
means that the shape part must decrease: dynamical attractors must act on the shape
degrees of freedom.
As noted at the end of Sect. 3, if all the particles of an ideal gas are situated at t_0 in a
small region within a much larger box, three factors contribute to the t > t_0 behaviour
of S_B: the initial more or less free growth of ℓ_rms with some interparticle collisions;
the particle collisions with the box walls once ℓ_rms is large enough; thereafter regular
interparticle and particle-wall collisions with essentially constant ℓ_rms.
There is a common intuition that entropy increase corresponds to growth of
disorder. Random motion of particles in a confined region seems much more
disordered than free Hubble-type expansion. In the previous paragraph’s scenario,
disorder-increasing interparticle and particle-box collisions rapidly erase the initial
expansion’s disorder-decreasing effect. However, in the absence of a box the latter
rapidly becomes the dominant effect. This simple observation suggests that entropic
concepts need reconsideration if they are to be applied to a freely expanding
universe.
This can be seen especially clearly if we include gravity and model the universe
by the N-body problem with Ecm D 0. As we have seen, it is an immediate
consequence of Liouville’s theorem that the shape of the system is attracted to
ever smaller regions of the system’s space of possible shapes (shape space S)
with increasing distance from the Janus point J. Intuitively, this is anti-entropic
behaviour. Indeed, in [3] my collaborators and I use the scale-invariant complexity
mentioned earlier as a state function to define a Boltzmann-type count of microstates
we call entaxy (to avoid confusion with the entropy concept that can be meaningfully
used for confined systems). We argue that entaxy, not entropy, must be used to
characterize the typicality of the universe's state. What is more, the entaxy always
has its greatest value near J and decreases in both directions away from it. At the
same time, the universe becomes more structured because bound subsystems form
and separate from each other in Hubble-type expansion.

4. That growth of the scale part of phase space must reduce the part corresponding to
the remaining degrees of freedom was noted in connection with inflation in [8].
5. This was the main motivation for the development of shape dynamics [9, 10].
Thus, as the universe evolves in both directions away from J, its complexity
increases while its entaxy decreases. There is nothing mysterious about this
inversion of normal entropic behaviour. It is due to the difference, enhanced by
gravity, between confined and unconfined systems. We also point out in [3] that the
subsystems which gravity creates become more or less ‘self-confined’. As I noted
earlier, this is the sine qua non for application of Gibbs-type statistical-mechanical
arguments based on conventional entropic notions. In fact, we are able to show that
the subsystems form with some given Boltzmann entropy SB , which then increases.
Moreover, the overwhelming majority of these subsystem entropies all increase in
the same direction as the universe’s entaxy decreases. This shows how local entropy
increase—the tendency of a confined system’s state to become less special—is
compatible with the simultaneous tendency of the universe to become more special.
This also casts light on our experienced direction of time. Boltzmann argued that
it is aligned with the direction of increasing entropy. The apparent conflict with the
growth of records and structure we see around us is widely said to be perfectly
compatible with the second law: a decrease of SB here is more than compensated by
an increase elsewhere. This is often stated without proof. When one is given, it often
invokes refrigerators, in which the cooling is more than offset by the heating of the
environment. But if this is to be quantified, the environment must be confined, since
otherwise its increase in T and SB cannot be determined. In the absence of physical
insulating walls, we are back to the problem of defining the universe’s entropy.6
The mismatch between the universe’s increasing structure and the entropic arrow is
resolved in [3]. Entaxy determines the master arrow. In a self-gravitating universe it
creates more or less stably bound subsystems. In turn, these are born with a certain
SB that in the overwhelming majority of cases then increases in the same direction
as the master arrow which gave birth to them. Moreover, the Janus-point structure
(and with it the oppositely pointing master arrows) is a dynamical necessity. It is not
imposed by a special selection principle. It merely requires a non-negative energy
and an unconfined system.
In discussing ‘conceptual inertia’, I noted that collisions tend to increase
disorder but growth of `rms has the opposite effect. Could it be that the almost
exclusive concentration on confined systems in statistical mechanics has allowed
this difference to escape notice? I have not studied the literature exhaustively, but I
found few discussions of the entropy of a freely expanding gas.
6. Planck's well-known statement of the second law shows how essential it is to have
complete control over the environment: "It is impossible to construct an engine which
will work in a complete cycle and produce no effect except the raising of a weight and
cooling of a heat reservoir."
Gibbs, as we saw, ruled out systems in infinite space in order to avoid unnormal-
izable probability functions. However, Tolman [11], having noted that in confined
systems entropy will increase to an equilibrated maximum, then continued "in
the case of unconfined gases . . . a final state of infinite dilution and complete
dissociation into atoms would be one of maximum entropy". Davies [4, p. 33],
discussing the explosive escape of gas from a cylinder, says "the second law becomes
an expression of the principle that a gas will explode into a vacuum, but will never
spontaneously implode into a smaller volume”. Two comments can be made here.
First, the gas under consideration forms a subsystem of the universe; it does not
serve as a model of the whole universe, in which (for a given choice of the nominal
time direction) spontaneous implosion (followed by explosion) does occur. Second,
Davies does not say explicitly that the entropy of the exploding gas increases, only
that, in being irreversible, the process is an expression of the second law. Finally,
discussing the inertial model considered here and in [3], Carroll and Guth, in the
recent [12], say the model exhibits a "two-headed arrow of time" in which entropy
increases in both limits t → ±∞ (see also [13]).
That Janus-point solutions exhibit oppositely directed arrows of time can hardly
be doubted, but whether one can say entropy increases in the direction of the arrows
seems very questionable. I have already noted that traditional thermodynamics of
the universe cannot exist because the universe is not a thermodynamic system
whose state can be changed and measured. Application of conventional statistical
mechanics to universes that can expand is also highly problematic because of the
problem pointed out by Gibbs: probability distributions are only meaningful if
they can be normalized, which means that they must be defined on a space with
a bounded measure.
At this point I will stop. My main point—the need to think about the entropy
and statistics of universes differently—has been made. I will only say that the
greatest difficulty to which I have drawn attention, the unbounded phase space of an
expanding universe, may suggest [1–3] its solution. For Liouville’s theorem directs
us to the attractor-induced arrows on the space S of possible shapes of the universe,
and S is obtained by quotienting the Newtonian configuration space by translations,
rotations and dilatations. Due to these last, the resulting space is compact, so that
one can define on it a bounded measure. As explained in
http://www.video.ethz.ch/conferences/2015/d-phys.html and in [1–3], this meets Gibbs'
requirement for meaningful definition of probability distributions and opens up the
possibility of creating a theory of the statistics of universes.
Acknowledgements My thanks to Tim Koslowski and Flavio Mercati for the stimulating and
fruitful collaboration that led to [1–3].
References
Vlatko Vedral
Abstract We present arguments to the effect that time and temperature can be
viewed as a form of quantum entanglement. Furthermore, if temperature is thought
of as arising from the quantum mechanical tunneling probability, this then offers
us a way of dynamically "converting" time into temperature, based on the entan-
glement between the transmitted and reflected modes. We then show how similar
entanglement-based logic can be applied to the dynamics of cosmological inflation
and discuss the possibility of having observable effects of the early gravitational
entanglement at the level of the universe.
V. Vedral ()
Clarendon Laboratory, University of Oxford, Parks Road, Oxford OX1 3PU, UK
Centre for Quantum Technologies, National University of Singapore, 3 Science Drive 2,
Singapore 117543, Singapore
Department of Physics, National University of Singapore, 2 Science Drive 3, Singapore 117542,
Singapore
e-mail: vlatko.vedral@gmail.com
Here I would like to ask: can what we sometimes think of as different (cosmologi-
cally) relevant physical parameters actually be thought of as just different instances
of quantum entanglement?
In what follows I would like to recount the arguments that time and temperature
can indeed be thought of as forms of entanglement. This is exciting for two reasons.
One is that these potentially completely different entities can be seen to have the
same common origin (in entanglement). It is always pleasing to be able to postulate
no more phenomena than one needs to account for all observations (Occam).
Secondly, however, claiming that entanglement is at the root of these quantities
might lead us to some observable consequences especially and most excitingly
at the cosmological level. We will explore this in the second half of this paper.
Finally, we outline how the fluctuations in the Cosmic Microwave Background
(CMB) radiation can be used to witness entanglement at the cosmological level. The
following two sections are largely a review of the existing material, though mainly
from the author's own perspective. The last three sections present new material, by
first unifying the arguments of the preceding two sections and then extending them
to cosmology and witnessing entanglement.
First of all, I would like to set the scene by explaining the picture that is
affectionately referred to (by the quantum information community) as the Church of
Higher Hilbert Space. This picture is the expression of the fact that any mixed state
(here written in its eigen-expansion)

    \rho_1 = \sum_n r_n |r_n\rangle\langle r_n|                         (1)

can (at least in principle) be represented as a reduction from a pure state existing on
a higher Hilbert space,

    \rho_1 = \mathrm{tr}_2 |\Psi\rangle\langle\Psi|_{12},               (2)

where

    |\Psi\rangle_{12} = \sum_n \sqrt{r_n}\, |r_n\rangle_1 \otimes |n\rangle_2.      (3)

The entropy of \rho_1, S(\rho_1) = -\sum_n r_n \ln r_n, quantifies the entanglement between the
system (labeled by index 1) and the extension (which itself is non-unique and is
labeled by index 2). The reason for using the word “Church” is that, though the
statement that “we can always write a mixed state as a reduction of a pure one”
looks like a tautology (and hence always true), it is actually an expression of our
belief that this extension to purity could always be performed in practice (this, of
course, is an open question since we might run out of resources to perform the
required purification).
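This purification logic is easy to illustrate numerically. In the sketch below (my own arbitrary three-level example, assuming numpy), the pure state of Eq. (3) is built from the eigen-expansion of ρ₁; tracing out the extension returns ρ₁, and the entanglement entropy is −Σ_n r_n ln r_n:

```python
import numpy as np

# Arbitrary mixed state rho_1 = sum_n r_n |r_n><r_n|  (eigen-expansion of Eq. (1))
r = np.array([0.5, 0.3, 0.2])                        # eigenvalues r_n
d = len(r)
Q = np.linalg.qr(np.random.default_rng(3).normal(size=(d, d)))[0]  # random orthonormal |r_n>
rho_1 = sum(r[n] * np.outer(Q[:, n], Q[:, n]) for n in range(d))

# Purification |Psi>_12 = sum_n sqrt(r_n) |r_n>_1 (x) |n>_2  (Eq. (3))
psi = sum(np.sqrt(r[n]) * np.kron(Q[:, n], np.eye(d)[n]) for n in range(d))

# Reduction (Eq. (2)): tracing out subsystem 2 recovers rho_1
rho_red = np.outer(psi, psi).reshape(d, d, d, d).trace(axis1=1, axis2=3)

# Entanglement between 1 and 2 = entropy of the reduction, S = -sum_n r_n ln r_n
S = float(-np.sum(r * np.log(r)))

print(np.allclose(rho_red, rho_1), round(S, 3))      # -> True 1.03
```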
The Hilbert Space extension is an important mathematical technique when
proving many results in quantum information theory, ranging from calculating
Time, (Inverse) Temperature and Cosmological Inflation as Entanglement 29
2 Time as Entanglement
The method of viewing time as entanglement simply encapsulates the fact that
we never observe time directly. We usually observe the position (of the hand
of the clock, the sun or the stars) or some other observable of a periodically
evolving system. Therefore, when we are timing the evolution of the system under
consideration, we are always talking about the system’s states with respect to the
state of the clock. The clock in this case will provide the extending Hilbert space
within which nothing ever evolves. However, as we show below, the state of the
system will evolve relative to the state of the clock. Here we follow the work of
Page and Wootters [3], although essentially the same logic is built into arguments of
Banks [4] and Brout [5] (for a pedagogical review see [6]). The germs of this idea
go back to a paper by Mott [7], where he used the time-independent Schrödinger
equation to derive trajectories of alpha particles in a cloud chamber (the point being
that the background atoms in the chamber act as a clock recording the position and
hence the time of the passing alpha particle).
Suppose therefore that we are in an eigenstate |Ψ_sc⟩ of a Hamiltonian H
consisting of two different subsystems, call them the system (s) and the clock (c).
The reason why we want an overall eigenstate is that there is no dynamics at the
global level of the system and the clock. Suppose further that the interaction between
the system and the clock is negligible, so that H = H_s + H_c (this is what in fact
defines a good clock, namely that it is, at least to a high degree, independent of the
system). We assume without any loss of generality that H|Ψ_sc⟩ = 0 (all this does is
set the overall phase, which is in any case an unobservable quantity).
Imagine furthermore that the state |Ψ_sc⟩ has a special, suitably chosen form:

    |\Psi_{sc}\rangle = \sum_\tau |\psi_s(\tau)\rangle \otimes |\phi_c(\tau)\rangle,    (4)

with clock states satisfying

    i\hbar \frac{d}{d\tau} |\phi_c(\tau)\rangle = H_c |\phi_c(\tau)\rangle,             (5)

i.e. the clock Hamiltonian generates shifts between one clock time and the immediate
next clock time (note that this is just a mathematical property of the states
with respect to the Hamiltonian; there is actually no real temporal evolution taking
place yet). An obvious clock to choose is a quantized rigid rotor, but our discussion
is completely generic and does not require us to confine ourselves to anything that
resembles the traditional classical clock.
Now we look at the evolution of the system relative to the states of the clock (the
relative state of the system in the same Everett sense [8]):

    i\hbar \frac{d}{d\tau} |\psi_s(\tau)\rangle
        = i\hbar \frac{d}{d\tau} \langle\phi_c(\tau)|\Psi_{sc}\rangle                   (6)
        = -\langle\phi_c(\tau)| H_c |\Psi_{sc}\rangle                                   (7)
        = \langle\phi_c(\tau)| (H_s - H) |\Psi_{sc}\rangle                              (8)
        = \langle\phi_c(\tau)| H_s |\Psi_{sc}\rangle                                    (9)
        = H_s |\psi_s(\tau)\rangle,                                                     (10)
and so the system undergoes the Schrödinger type evolution relative to the ticking
of the clock. Time therefore arises internally without the need for any global time.
This kind of argument is therefore potentially important in cosmology where there
are presumably no clocks to measure time outside of the universe. The cosmological
time itself then has to emerge from within, as in the calculation above.
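The relative-state construction of (4)–(10) can be mimicked in a toy finite model (my own sketch: a qubit system entangled with a D-level clock register, with the global state built directly from the branches rather than obtained as an exact H|Ψ⟩ = 0 eigenstate): conditioning the fixed global state on successive clock readings reproduces Schrödinger evolution of the system.

```python
import numpy as np

def U(H, t):
    """exp(-i H t) for Hermitian H, via its eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * w * t)) @ V.conj().T

H_s = np.array([[1.0, 0.3], [0.3, -1.0]])            # arbitrary system Hamiltonian
psi0 = np.array([1.0, 0.0], dtype=complex)           # system state at clock reading 0
D, dt = 50, 0.1                                      # clock dimension and tick size

# Global state |Psi> = (1/sqrt(D)) sum_tau |psi_s(tau)> (x) |tau>; a single fixed
# vector -- nothing in it ever 'evolves'.
Psi = sum(np.kron(U(H_s, k * dt) @ psi0, np.eye(D)[k]) for k in range(D)) / np.sqrt(D)

def relative_state(tau):
    """The system state conditioned on the clock reading tau: <tau|Psi>."""
    return Psi.reshape(2, D)[:, tau]

# Relative states obey Schrodinger evolution from one clock tick to the next.
ok = all(np.allclose(relative_state(k + 1), U(H_s, dt) @ relative_state(k))
         for k in range(D - 1))
print(ok)   # -> True
```

The check is tautological in the sense discussed above: the state is built so that the conditioned branches evolve, which is exactly the point of the timeless picture.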
An important subtlety is that the clock need not encompass the rest of the
universe, though it can include it if required (as in [5]). This means that the above
argument would work even if the state of the system and the clock was mixed when
the rest of the universe was traced out. All that matters is the relative state of the
system with respect to the clock. Next we show how temperature can likewise arise
without the need for having an overall temperature.
3 Temperature as Entanglement

A thermal (Gibbs) state

\[ \rho = \sum_n p_n |E_n\rangle\langle E_n| \,, \tag{11} \]

where p_n = e^{−βE_n}/Z (β = 1/k_B T), can always be obtained from an extension of the
form

\[ |\Psi\rangle = \sum_n \sqrt{p_n}\, |E_n\rangle \otimes |\phi_n\rangle \,. \tag{12} \]
Time, (Inverse) Temperature and Cosmological Inflation as Entanglement 31
From what we said before it follows that temperature T and entanglement (as
measured by the entropy of the reduced states) are directly related: the higher the
temperature, the higher the entanglement between the two subsystems.
This simple argument can, in fact, be made to resemble the “timeless time”
argument even further. The bonus will be that the Gibbs-Boltzmann distribution
will arise naturally, provided we make a few assumptions (to be detailed in what
follows).
Imagine we divide the total universe into a small system (s) and a large rest (r).
The attributes “small” and “large” will be quantified below. Let us again assume that
the interaction between the system and the rest is small enough to be negligible and
that the total state is a zero-energy eigenstate, (H_s + H_r)|Ψ_sr⟩ = 0. The reason for
this will become transparent shortly (we recall that in the clock argument this was
needed because a good clock neither affects nor is affected by the evolution of the
system—at least to within a good approximation).
Now construct |Ψ_sr⟩ as a superposition of energy eigenstates of the system |E_n⟩
correlated to the states of the rest with energy −E_n (since the sum has to add up
to zero; here is where we need the assumption that the interaction Hamiltonian
between the two vanishes). The total state can be written as

\[ |\Psi_{sr}\rangle = \sum_n |E_n\rangle \otimes \sum_{m=1}^{D(E_n)} |\phi_{nm}\rangle \,, \tag{13} \]
where the states of the system are not normalized, so that ⟨E_m|E_n⟩ = N δ_nm. The
index m for the rest takes into account the fact that the rest is huge compared with
the system and there may be many degenerate states whose energy is −E_n. The
degree of degeneracy will be labeled as D(E_n). To obtain the state of the system,
ρ_s, we trace out the rest, i.e.

\[ \rho_s = \sum_n |E_n\rangle\langle E_n| \sum_m \mathrm{tr}\left( |\phi_{nm}\rangle\langle \phi_{nm}| \right) \tag{14} \]
\[ = \sum_{nm} \langle \phi_{nm} | \phi_{nm} \rangle \, |E_n\rangle\langle E_n| \tag{15} \]
\[ = \sum_n D(E_n)\, |E_n\rangle\langle E_n| \,. \tag{16} \]
We now assume that the energies E_n are small enough that we can expand to
first order (this is one of the two central assumptions leading to the Gibbs-
Boltzmann weights, as we will shortly see):

\[ D(E_n) = D(0) + \frac{dD}{dE_n} E_n = D(0)\left( 1 + \frac{1}{D(0)} \frac{dD}{dE_n} E_n \right) \approx D(0)\, \exp\left( \frac{1}{D(0)} \frac{dD}{dE_n} E_n \right) \,. \tag{17} \]
32 V. Vedral
The second central assumption is that the function whose first-order expansion is
f(x) = 1 − x + … is in fact the exponential e^{−x} (there are of course infinitely
many functions that have the same first-order Taylor expansion; the exponential can
be further justified by requiring that f(x + y) = f(x)f(y), namely that densities of
independent systems get multiplied). We can now define

\[ \beta = -\frac{1}{D(0)} \frac{dD}{dE_n} \,, \tag{18} \]
which is our effective inverse temperature, so that (17) reads D(E_n) ≈ D(0) e^{−βE_n}. We
can rewrite this in an even more transparent way as

\[ \beta = -\frac{d \ln D}{dE_n} \,, \tag{19} \]

where we now have the standard statistical definition of inverse temperature as the
derivative of the entropy of the rest with respect to its energy (recall that the rest
carries energy −E_n). The state of the system now emerges to be

\[ \rho_s = \sum_n \frac{e^{-\beta E_n}}{Z} |E_n\rangle\langle E_n| \,, \tag{20} \]

where Z = \sum_n e^{-\beta E_n} is the partition function, which arises from the normalization
N. Just like we noted in the case of time, there is here no need to start from an
entangled state of the system and the rest; a mixture will suffice just as well for the
above argument [9]. However, one can always assume that the state is purified to
include everything in the universe, so that the rest is indeed the rest of the universe
excluding the system. We will revisit this argument when we discuss cosmological
inflation.
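The counting argument behind Eqs. (13)–(20) can be made concrete in a few lines of code (an illustration with assumed numbers, not from the chapter): choosing the degeneracies D(E_n) of the rest to fall off as D(0)e^{−βE_n}, building the global pure state, and tracing out the rest reproduces the Gibbs weights D(E_n)/Σ_n D(E_n):

```python
import numpy as np

# Illustrative system energies and rest degeneracies D(E_n); beta and D0 are
# assumed numbers chosen so that D(E_n) ~ D(0) exp(-beta*E_n), cf. Eq. (17).
beta, D0 = 0.7, 200
E = np.array([0.0, 1.0, 2.0, 3.0])
D = np.round(D0*np.exp(-beta*E)).astype(int)

# Global pure state of Eq. (13): |E_n> of the system paired with D(E_n)
# orthonormal states of the rest, all with equal amplitude.
Psi = np.zeros((len(E), D.sum()))
col = 0
for n, d in enumerate(D):
    Psi[n, col:col+d] = 1.0
    col += d
Psi /= np.linalg.norm(Psi)

# Reduced state of the system (Eqs. (14)-(16)): populations are D(E_n)/sum(D),
# i.e. the Gibbs weights of Eq. (20) up to rounding of the degeneracies.
rho_s = Psi @ Psi.T
pops = np.diag(rho_s)
print(np.allclose(pops, D/D.sum()))   # -> True
```

The effective temperature here is entirely a property of how the degeneracies of the rest vary with the system's energy, exactly as in the derivation above.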
We have now seen that both time and temperature can arise from entanglements
between the system under consideration and another suitably chosen system. But
could the two (time and temperature) be related more directly? Namely, is there
a physical process that can convert time into temperature (or vice versa)? An
interesting possibility is to view a system that dynamically tunnels through a
potential barrier. The state of the system is a superposition of the transmitted and
the reflected wave. However, suppose that we only have access to the transmitted
wave. Then, we actually need to trace out the reflected wave in which case the
transmitted state is a mixed one (and can therefore be thought of as being at some
finite temperature).
It is the process of tracing over part of an entangled system that gives us an
effective temperature for the remaining part even though the total system is in a
pure state. The time, being a measure of entanglement between the system and the
rest, becomes the temperature after ignoring the rest. In other words, we can think
of an observer (existing within the rest of the universe) choosing to measure the
hand of the clock (also within the rest of the universe) or the energy of the system
(thereby effectively tracing over the rest) which leads to the emergence of either
time or temperature respectively.
The presence of entanglement in this example is a bit more subtle. It can be
seen to arise from the second quantized notation of the tunneling particle having an
amplitude to tunnel (i.e. to be transmitted through the barrier) and another amplitude
not to tunnel (i.e. to be reflected by the barrier). The state can then be written as

\[ \sqrt{r}\, |1\rangle_r |0\rangle_t + \sqrt{t}\, |0\rangle_r |1\rangle_t \,, \tag{21} \]
where the r and t subscripts indicate the reflected and transmitted modes respec-
tively. When we trace out the reflected mode, we obtain a mixed state of the
transmitted mode. It is interesting that this process of conversion of time into
temperature by quantum tunneling was recently employed by Parikh and Wilczek
[10] to explain the Hawking radiation [11] and the resulting temperature of a black
hole.
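A minimal sketch of this point (with illustrative values of r and t, which are our own assumptions): tracing the reflected mode out of the state (21) leaves the transmitted mode in the mixed state diag(r, t), and matching the population ratio to a Boltzmann factor assigns it an effective temperature:

```python
import numpy as np

# Assumed reflection/transmission probabilities for the demo (r + t = 1)
r, t = 0.9, 0.1

# Two-mode state of Eq. (21): sqrt(r)|1>_r|0>_t + sqrt(t)|0>_r|1>_t,
# as a matrix indexed by [reflected occupation, transmitted occupation]
psi = np.zeros((2, 2))
psi[1, 0] = np.sqrt(r)
psi[0, 1] = np.sqrt(t)

# Tracing out the reflected mode leaves the transmitted mode mixed: diag(r, t)
rho_t = psi.T @ psi
print(np.allclose(np.diag(rho_t), [r, t]))   # -> True

# Matching p1/p0 = exp(-E/(kB*T_eff)) for a transmitted mode of energy E
E, kB = 1.0, 1.0
T_eff = E/(kB*np.log(r/t))
print(round(T_eff, 4))   # -> 0.4551
```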
We now briefly summarize the argument by Parikh and Wilczek [10]. We imagine
that a particle-anti-particle pair was created inside the black hole, close to the event
horizon, and that the particle is then able to tunnel out. We proceed to calculate
the probability for this to happen. The inverse of this process is the creation of the
pair outside of the event horizon and that the anti-particle tunnels into the black
hole. The two processes will have the same probability (since their amplitudes are
presumably complex conjugates of one another). We now proceed to explain how
this is calculated.
The main ingredient is the formula for quantum tunneling. The reader will recall
that, in the WKB approximation, the trial solution to the Schrödinger equation

\[ \frac{d^2\psi(x)}{dx^2} + k^2(x)\,\psi(x) = 0 \tag{22} \]

is of the form

\[ \psi(x) \approx \exp\left( \pm i \int^x k(x')\, dx' \right) \,. \tag{23} \]

This assumes that |k'| ≪ k² (i.e. a small de Broglie wavelength). The tunneling rate
is, within this approximation, defined as the ratio of the outgoing to the incoming
intensity of particles, and this is given by

\[ \Gamma = \exp\left( -2 \int_{r_{in}}^{r_{out}} k(r)\, dr \right) \,. \tag{24} \]
Here r_in and r_out are the boundaries of the potential. It is now clear that, since this is
an exponential dependence of the probability on the wavenumber, we can equate it to
the Boltzmann weight exp(−E/k_B T) and thereby obtain an “effective” temperature.
This is the gist of the Parikh-Wilczek argument.
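As a quick numerical check of Eq. (24) (our own example, using an assumed square barrier with ħ = 1 and 2m = 1, so that k = √(V₀ − E) inside the barrier), the WKB rate reproduces the exact square-barrier transmission up to the well-known thick-barrier prefactor 16E(V₀ − E)/V₀²:

```python
import numpy as np
from scipy.integrate import quad

# Assumed square barrier: V(x) = V0 on 0 < x < a, zero outside (hbar = 1, 2m = 1)
E, V0, a = 1.0, 4.0, 3.0
kappa = np.sqrt(V0 - E)                 # k(x) inside the barrier

# Eq. (24): WKB tunneling rate Gamma = exp(-2 * integral of k across the barrier)
integral, _ = quad(lambda x: np.sqrt(V0 - E), 0.0, a)
gamma_wkb = np.exp(-2.0*integral)

# Exact square-barrier transmission coefficient for comparison
T_exact = 1.0/(1.0 + V0**2*np.sinh(kappa*a)**2/(4.0*E*(V0 - E)))

# For a thick barrier, WKB captures the exponential suppression; the residual
# factor is the standard prefactor 16 E (V0 - E) / V0**2
prefactor = 16.0*E*(V0 - E)/V0**2
print(round(T_exact/(prefactor*gamma_wkb), 3))   # -> 1.0
```

The exponential factor, which is all that matters for reading off an effective temperature, is captured exactly by the WKB integral.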
To illustrate how to obtain the Hawking temperature we will now apply this
formula to the scenario where a particle (antiparticle) tunnels out of (into) a black
hole of mass M. In the former case, r_in = 2M and r_out = 2(M − ω) are the
Schwarzschild radii before and after the particle (of frequency ω = m, where we
have set ħ = c = k_B = G = 1; we will reintroduce the constants shortly below) has
tunneled out, respectively. We will assume an otherwise flat potential. Evaluating the
integral:

\[ \int_{r_{in}}^{r_{out}} k(r)\, dr = \mathrm{Im} \int_{r_{in}}^{r_{out}} \int_0^{k} dk'(r)\, dr = \mathrm{Im} \int_{r_{in}}^{r_{out}} \int_0^{\omega} \frac{d\omega'\, dr}{1 - \sqrt{2(M - \omega')/r}} \,. \]

Here we have used the fact that dH/dk = \dot{r}, that dH = −dω' and that, finally,
\dot{r} = 1 − \sqrt{2(M - \omega)/r}.
This integral can be solved (by using e.g. the calculus of residues) to yield:

\[ \Gamma = \exp\{ -8\pi\omega (M - \omega/2) \} \,. \tag{25} \]

Equating this to the thermal distribution exp{−ħω/k_B T}, and ignoring the ω² contribution,
gives us the following temperature (with all the relevant constants now in
place):

\[ T = \frac{\hbar c^3}{8\pi G k_B M} \,, \tag{26} \]
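Plugging SI constants into Eq. (26) for a solar-mass black hole (a standard sanity check, not part of the text) gives a temperature of order 10⁻⁷ K:

```python
import numpy as np

# SI constants
hbar, c, G, kB = 1.0545718e-34, 2.99792458e8, 6.674e-11, 1.380649e-23
M_sun = 1.989e30

# Eq. (26): Hawking temperature of a solar-mass black hole
T = hbar*c**3/(8*np.pi*G*kB*M_sun)
print(f"{T:.2e} K")   # -> 6.17e-08 K
```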
5 Cosmological Inflation
In quantum cosmology, we treat the universe as a quantum system, but then use the
resulting model to compute some macroscopically observable parameter, such as
the temperature of the Universe, or its density fluctuations (ultimately also measured
by the temperature fluctuations in the CMB). If the universe is a closed system, its
state ought to be pure, and it is then impossible to assign a (non-zero) temperature
to it. However, if we imagine that the pure state belongs to the universe as a whole
and we only observe a small section of the whole universe (as the theory of inflation
might suggest) then it is clear how the observable universe could be in a mixed state
to which it is then appropriate to assign a temperature. The suggestion to treat the
observable universe as an open quantum system comes from Prigogine [12], though
the line of argument we use here will be entirely different.
The plan is to apply the temperature-as-the-result-of-tunneling argument to the
universe. The evolution of the universe follows two equations that are derived from
Einstein’s field equations assuming the cosmological principle, which tells us that
the universe is homogeneous and isotropic (i.e. the same for all observers, which in
turn fixes the metric to be used).
The two equations describing the evolution of the universe are known as the
Friedmann equations:

\[ 3\left(\frac{\dot a}{a}\right)^2 + \frac{3kc^2}{a^2} = \frac{8\pi G \rho}{c^2} \tag{27} \]

\[ 2\frac{\ddot a}{a} + \left(\frac{\dot a}{a}\right)^2 + \frac{kc^2}{a^2} = -\frac{8\pi G p}{c^2} \tag{28} \]
Here a is the scale factor of the universe, k its curvature (not to be confused with
the Boltzmann constant k_B), G Newton's gravitational constant, c the speed of light,
ρ the density of the universe and p the pressure. These are two equations with
three unknowns (the pressure, the density and the scale factor). Usually we assume
an equation of state relating the pressure and volume and then solve for the scale
factor. Here however we will follow a different route.
We now present an argument for the temperature of the universe that mirrors
the Parikh-Wilczek black hole calculation. Imagine that the temperature is a
consequence of matter tunneling into the observable universe (this argument was
originally used to calculate the probability for the universe to tunnel into its own
existence [13, 14]). This process can be described by a quantized version of the first
of the Friedmann equations (the quantum version is known as the Wheeler-DeWitt
equation) [15]:
\[ \left\{ \frac{\partial^2}{\partial a^2} - \left( \frac{3\pi}{2G} \right)^2 a^2 \left( 1 - \frac{a^2}{a_0^2} \right) \right\} |\psi(a)\rangle = 0 \,, \tag{29} \]
where a_0² = 3/(8πGρ) (assumed for simplicity) is the size of the tunneling barrier (as is
customary we have again set c = 1). The solution of the above equation is known
as the wave-function of the universe [16]. The probability to tunnel through the
potential (3π/2G)² a² (1 − a²/a_0²) is given by
0
\[ p = \exp\left\{ -2\, \frac{3\pi}{2G} \int_0^{a_0} da\, a \sqrt{1 - \frac{a^2}{a_0^2}} \right\} \,, \tag{30} \]

which, since \(\int_0^{a_0} a\sqrt{1 - a^2/a_0^2}\, da = a_0^2/3\), evaluates to

\[ p \approx e^{-\pi c^5 / (\hbar G H^2)} \,, \tag{31} \]
where H D aP =a is the Hubble parameter. Note that this is usually applied to the
beginning of the universe with some fixed initial value of H. In our case, this formula
holds at all times and describes the tunneling between the observable universe and
the rest. Writing this in the exponential Gibbs-Boltzmann form

\[ p \approx e^{-M_u c^2 / (k_B T_u)} \,, \tag{32} \]

where M_u is the mass of the observable universe and T_u its temperature, and using the
fact that M_u = c³/4GH [17], we obtain

\[ T_u = \frac{\hbar H}{4\pi k_B} \,. \tag{33} \]
As already noted, this temperature is time dependent (as the result of the time
dependence of H). We briefly point out that the mass of the universe can be arrived
at by different methods to be about 10^53 kg (see e.g. [17]), which is in pretty
good agreement with the formula used here (and which can almost be obtained
by dimensional analysis by combining c, G and H into a quantity with dimensions
of mass).
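The quoted mass estimate is easy to reproduce numerically (our own check, assuming a round present-day value H₀ ≈ 70 km/s/Mpc):

```python
import numpy as np

hbar, c, G, kB = 1.0545718e-34, 2.99792458e8, 6.674e-11, 1.380649e-23

# Assumed round present-day Hubble parameter: 70 km/s/Mpc
Mpc = 3.0857e22
H = 70e3/Mpc                 # ~ 2.27e-18 1/s

Mu = c**3/(4*G*H)            # mass of the observable universe
Tu = hbar*H/(4*np.pi*kB)     # Eq. (33)
print(f"{Mu:.1e} kg")        # -> 4.4e+52 kg, consistent with the quoted ~1e53 kg
print(f"{Tu:.1e} K")         # -> 1.4e-30 K
```

The temperature (33) evaluated today is tiny, but H was far larger in the early universe, which is the regime relevant for the inflationary discussion below.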
We will now use this temperature and assume the universe to be a black body
(here we follow the argument given in [18]). This will then be inserted into the
continuity equation (which is basically the First law of thermodynamics and is
derivable from the Friedmann equations), which is of the form:

\[ \frac{dQ}{dt} = \frac{d}{dt}(\rho V) + p\, \frac{dV}{dt} \,, \tag{34} \]
where V, p and ρ are the volume, pressure and density of the universe respectively. If
the universe is truly an isolated system then dQ D 0 and the left hand side of above
would vanish (which is what is normally assumed). However, if we think of just the
observable universe then the theory that best fits the current observation is based on
the idea of inflation. Namely, at the very beginning the universe was supposed to
have undergone a rapid expansion which stretched the space-time fabric faster than
the speed of light. As a consequence what we call the observable universe is only a
part of the total universe, the rest of it being outside of our light cone. If we suppose
that the universe has always been quantum mechanical, then the observable universe
should be entangled with the rest.
Furthermore and in line with the Church of Higher Hilbert Space picture, all the
mixedness (entropy) in our observable universe comes from tracing out the rest. This
would then provide us with the term dQ=dt (see also the discussion in [12]). The rest
in this case is the component of the universe that lies outside the horizon, and we
assume that the universe is at a temperature derived from the tunneling argument
above. The change in time comes from the fact that the universe is evolving, which
in turn affects the entanglement between the observable universe and the rest (and
therefore leads to a changing temperature). Since we are assuming that Q has the
black body spectrum, we can then use the Stefan-Boltzmann law to write the rate of
change of heat as
\[ \frac{dQ}{dt} = \sigma A T^4 \,, \tag{35} \]

where σ is the Stefan-Boltzmann constant and A is the area of the horizon.
Using the first of the Friedmann equations (with the curvature term k set to zero) and
the fact that H = ȧ/a, we can transform the above to:

\[ \dot\rho + 3(1+\omega)\, \frac{\dot a}{a}\, \rho = 3\omega_c\, \frac{\dot a}{a} \,, \tag{37} \]
where p = ωρ is the equation of state relating the pressure and density of the
universe, and ω_c = πħG²ρ²/45c⁷ is the (time-dependent) critical density. We now see
that the term due to dQ/dt (which is on the right-hand side of the equation) effectively
acts as a negative pressure (countering the second term on the left-hand side). We can
solve this equation for ρ to obtain:

\[ \rho = \frac{D\, a^{-3(1+\omega)}}{1 + \frac{\alpha D}{1+\omega}\, a^{-3(1+\omega)}} \,, \tag{38} \]
where D is just a constant and α = πħG²/45c⁷. If we assume the equation of state
for ordinary radiation (ω = 1/3) and that αD ≫ a⁴, we obtain

\[ \rho = \frac{4}{3\alpha} \,, \tag{39} \]

which is a constant density. This allows us to integrate the first Friedmann equation,
leading to an exponential expansion of the scale factor.
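One can verify Eqs. (37)–(39) numerically (a sketch with arbitrary illustrative values of α, D and ω = 1/3; units suppressed): the closed form (38) satisfies the modified continuity equation rewritten in the variable ln a, with the right-hand side expressed through ω_c = αρ², and it flattens to the constant 4/(3α) at early times:

```python
import numpy as np

# Illustrative values (units suppressed); omega = 1/3 is radiation
alpha, D, w = 2.0, 5.0, 1.0/3.0
n = 3*(1 + w)                                  # = 4

def rho(a):
    # Closed-form solution, Eq. (38)
    return D*a**(-n)/(1 + (alpha*D/(1 + w))*a**(-n))

# Check that Eq. (38) solves d(rho)/d(ln a) + n*rho = 3*alpha*rho**2,
# i.e. Eq. (37) with the right-hand side written via omega_c = alpha*rho**2
a = np.logspace(-3, 1, 1000)
lna = np.log(a)
resid = np.gradient(rho(a), lna) + n*rho(a) - 3*alpha*rho(a)**2
print(np.max(np.abs(resid)) < 1e-2)            # -> True (finite-difference accuracy)

# Early-time plateau, Eq. (39): rho -> 4/(3*alpha) as a -> 0
print(round(rho(1e-6), 6), round(4/(3*alpha), 6))   # -> 0.666667 0.666667
```

The constant early-time density is exactly what drives the exponential (inflationary) growth of a when inserted into the first Friedmann equation.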
The possible effects of quantum gravity on the CMB spectrum are very much
discussed and analyzed (see e.g. [22]). Here we follow the logic of constructing
macroscopically observable entanglement witnesses that might be inferred from the
CMB. We expect that the effects of quantum physics and gravity were important
in the very early stages of the universe and were then possibly amplified by the
process of inflation. Cosmologists in fact believe that all the structure in the universe
comes from the original quantum fluctuations, whose effect was then amplified by
gravity. But how do we know that the correlations we observe are due to quantum
correlations and not just of an entirely classical origin? (After all we said that both
time and temperature can arise in the same way from a mixed, classically correlated
state). Here we present a simple argument.
Suppose that we are given a thermal state ρ_T = p|Ψ₀⟩⟨Ψ₀| + (1 − p)ρ_rest, where
|Ψ₀⟩ is the ground state, p = exp(−E₀/k_B T)/Z is the usual Boltzmann weight and
ρ_rest involves all higher levels. A very simple entanglement witness can now be
derived by noting that if

\[ p > e^{-E(|\Psi_0\rangle\langle\Psi_0|)} \,, \]

where E(ρ) = min_{ρ_sep} S(ρ||ρ_sep) is the relative entropy of entanglement [23, 24]
and S(ρ||σ) is the quantum relative entropy [23], then the state ρ_T must be
entangled (as it is closer to |Ψ₀⟩ than the closest separable state, which we denoted as
ρ_sep). The entanglement we are talking about here is within the system itself and
between the subsystems comprising the system.

After a few simple steps, the above inequality leads to another inequality,
satisfied by entangled thermal states ρ_T. Since

\[ p = \frac{e^{-E_0/k_B T}}{Z} = e^{-(E_0 + k_B T \ln Z)/k_B T} \ge e^{-(U + k_B T \ln Z)/k_B T} = e^{-S/k_B} \,, \]

we obtain the condition

\[ S < k_B\, E(|\Psi_0\rangle\langle\Psi_0|) \,, \]
which, if satisfied, implies that ρ_T is entangled. We now have a very simple criterion,
which can be expressed as follows: if the entropy of a thermal state is lower than the
relative entropy of entanglement of its ground state (multiplied by the Boltzmann
constant k_B), then this thermal state contains some form of entanglement.
Here we are not concerned with the type of entanglement we have (e.g. bipartite
or multipartite, distillable or bound); we only want to confirm that the state is
not fully separable. It is also very clear that if the ground state is not entangled,
this witness will never detect any entanglement (since entropy is always a non-
negative quantity), even though the state may in reality be entangled for some range
of temperatures.
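The witness is easy to test on a small example (our own illustration): for a two-qubit Heisenberg antiferromagnet the ground state is the singlet, whose relative entropy of entanglement is ln 2, and the thermal state is entangled exactly when it fails the partial-transpose test. The entropy witness fires at low temperature but, as just noted, misses entanglement at intermediate temperatures:

```python
import numpy as np
from scipy.linalg import expm

# Two-qubit Heisenberg antiferromagnet (illustrative choice, J = kB = 1):
# the ground state is the singlet, with relative entropy of entanglement ln 2.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
H = sum(np.kron(s, s) for s in (sx, sy, sz))

def thermal(T):
    rho = expm(-H/T)
    return rho/np.trace(rho)

def entropy(rho):                 # von Neumann entropy in nats (kB = 1)
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-(w*np.log(w)).sum())

def ppt_entangled(rho):           # partial-transpose test, exact for two qubits
    pt = rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)
    return bool(np.linalg.eigvalsh(pt).min() < -1e-12)

for T in (1.0, 3.0):
    rho = thermal(T)
    witness = entropy(rho) < np.log(2)    # S < kB * E(ground state)
    print(T, witness, ppt_entangled(rho))
# -> 1.0 True True    (witness detects the entanglement)
# -> 3.0 False True   (still entangled, but the witness misses it)
```

This makes the one-sidedness of the criterion explicit: the witness is sufficient but not necessary for entanglement.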
The entanglement witness based on entropy, though at first sight very simple, is
nevertheless rather powerful as it allows us to talk very generally about temperatures
below which we should start to detect entanglement in a very generic macroscopic
system. Since entropy is lower at low temperatures, this is the regime where we
expect the witness to show entanglement. Let us look at the typical examples of
ideal bosonic and fermionic gases. Non-ideal systems behave very similarly, with
some corrections that are unimportant for us. At low T, the entropy scales as (see e.g. [25])

\[ S \sim N k_B \left( \frac{k_B T}{\hbar \tilde\omega_{F,B}} \right)^{p_{F,B}} \,, \tag{44} \]
where \(\tilde\omega_{F,B}\) is a characteristic frequency and p_{F,B} the relevant power for
fermions and bosons respectively. The corresponding heat capacity is

\[ C = T \frac{\partial S}{\partial T} \,. \tag{46} \]

In terms of the heat capacity, Eq. (45) implies that entanglement should be witnessed for
values of the heat capacity below

\[ C_{crit} = N k_B \left( \frac{k_B T}{\hbar \tilde\omega_{F,B}} \right)^{p_{F,B}} \,. \tag{47} \]
Acknowledgements The author acknowledges funding from the National Research Foundation
(Singapore), the Ministry of Education (Singapore), the EPSRC (UK), the Templeton Foundation,
the Leverhulme Trust, the Oxford Martin School, the Oxford Fell Fund and the European Union
(the EU Collaborative Project TherMiQ, Grant Agreement 618074).
References
A.J. Short
1 Introduction
Special relativity [1] lies at the heart of modern physics, and has played a central role
in advancing the subject over the last century. It also inspired a fundamental shift in
our picture of reality, from a spatial state evolving in time to a static block universe.
This conceptual shift raises some deep issues, particularly concerning causality and
complexity, which this paper seeks to highlight and address. In light of these issues,
we will consider whether relativity could emerge naturally without requiring such
a large conceptual shift. For simplicity, we will focus mainly on special relativity,
but similar arguments could be applied to the Hamiltonian formulation of general
relativity [2], in which space-time can be described in terms of a space-like surface
evolving in time.
If reality consists of a state evolving in time via physical laws, causality follows
naturally—from an exact description of the state at a particular time, we can
determine the state at any future time by applying the physical laws (even if the
laws are probabilistic, we can characterise the final probability distribution). If, on
the other hand, reality is described by a block universe, a four-dimensional ‘box’
containing a static structure, it seems highly surprising that we would be able to
predict the entire contents of the box from one slice through it. Indeed, causality
makes the block universe a highly redundant object. Considering all possible laws
for constructing block universes, it seems that causality itself then requires a deeper
explanation.
The block universe also seems an intrinsically very complex object to exist
without some mechanism for its construction, whereas in the evolving state picture,
one could easily imagine that both the initial state and physical laws are simple,
and complexity is only generated dynamically (indeed, this could also explain the
apparent asymmetry in the universe’s boundary conditions). Note that complexity
here refers to the amount of computational time required to generate an object, as
well as to the compressibility of its description.1 The intuition is that any substantial
computation must be done within the universe, rather than prior to the universe
existing.
There are further issues with the block universe in quantum theory, where most
interpretations favour the evolving state picture. Finally, the block universe conflicts
strongly with our intuition that the ‘present’ is special.
None of these issues are definitive, and it is certainly possible that these concerns
about the block universe can be addressed. However, it is also interesting to consider
whether the evolving state picture yields a more natural view of reality even in light
of relativity. Formally, it is entirely consistent with special relativity for there to exist
a preferred reference frame in which the true state evolves. The issue is that this
reference frame would be undetectable, and that relativity then seems unnatural—
why should the laws of physics be the same in any inertial frame when only one is
‘real’?
A possible solution would be to derive special relativity from a different set of
assumptions, such that it emerges naturally even in the evolving state picture. Recent
work on particles in discrete space-time suggests that this is highly plausible—
relativistic evolution laws emerge naturally there at large scales despite the existence
of a preferred frame [4–12]. The key is the existence of a bounded speed of
information propagation, which is an appealing assumption in any picture. Can this
form part of a natural alternative set of assumptions from which to derive special
relativity?
Amongst the general public, the most widely held view about space-time is that only
the present is real, and that it changes with time. However, this view encountered
a serious problem with the advent of special relativity, which showed that different
observers (in particular, those in relative motion with each other) disagree about
what constitutes the present. There are two natural options at this point.

1 Note that this differs from Kolmogorov complexity [3], which only captures compressibility.
The Kolmogorov complexity would generally be small for a block universe, as one could write
a compact program to generate it by iterating the physical laws on the initial state.

Re-evaluating Space-Time 45

The first is to claim that the present in some particular reference frame is real, and to explain why
observers moving with respect to this frame reach ‘mistaken’ beliefs about reality
[13, 14]. We will return to this approach in Sect. 3. However, this goes against the
central principle of relativity that all inertial frames are equivalent with respect to
all of the laws of physics.
The second approach, which is the almost universal strategy adopted by theo-
retical physicists, is to move to a reference-frame independent picture of a block
universe. In the block universe approach, reality is a four dimensional space-time
manifold [15] in which all events from the beginning to the end of the universe are
contained. Describing the universe from the perspective of one particular inertial
frame then involves foliating the universe into space-like slices in a particular way.
This viewpoint has been hugely successful, and played a key role in the development
of general relativity. However, in this section we will highlight some important
conceptual issues raised by this shift.
2.1 Causality
All of our physical investigations into the universe so far have confirmed its causal
nature—that the future state of the universe can be predicted from its present
state via the application of physical laws. In quantum theory, these predictions are
generally probabilistic rather than deterministic, but even in this case the probability
distribution of any measurement’s outcome can be accurately predicted using the
Born rule.2 This causal structure is present at the most basic level in the evolving
state picture of reality, as the future state of the universe is indeed generated
from the present one via physical laws. However, in the block universe approach,
causality does not seem inevitable. Indeed, special relativity appears to formally
allow tachyons [18] which travel faster than light (and thus backwards in time
according to some observers), and general relativity permits the existence of closed
time-like curves [19]. Moving further away from the specific theories describing
our universe, if we consider general rules for describing the contents of a four-
dimensional ‘box’, it seems plausible that most such theories would not be causal.
For example, one might imagine rules for constructing and linking four dimensional
loops inside the box. Perhaps anthropic arguments can be made that universes
without at least approximate causality cannot support intelligent life, or it can be
shown that causality follows from a natural local differential structure of the physical
laws, or that given a more general structure one can always find coordinates and a time
direction for which it is causal. However, a significant advantage of the evolving
state picture over the block universe approach is that it offers a simple explanation
of observed causality.

2 The idea of retrocausality can be helpful in explaining quantum effects, particularly in cases
involving post-selection, such as in the two-state vector formalism [16]. However, a standard causal
explanation is also possible. There are also interesting recent results on quantum causal models
[17].
2.2 Simplicity
Although it is difficult to speculate about the origins of the universe, one potential
issue with the block universe approach is that it requires the entire complex structure
of the universe, for all time, to ‘come into existence’ without any mechanism
by which it is created. In this view, physical laws themselves are also somewhat
redundant, as they arguably just describe some particular properties of the block
universe.
By contrast, in the evolving state picture, all that has to ‘come into existence’ is
a simple initial state for the universe and a simple set of physical laws. All the later
complexity of the universe is then generated dynamically from this starting point,
and one can argue that this evolution explains the thermodynamic arrow of time
[20].
Note that by ‘simple’ here, we mean something which could be generated on a
computer with parallel processing capabilities by a short program in a short time.
For example, an array of zeroes would be simple (as they could all be generated in
parallel), but the block universe would be complicated (as one would have to either
store the entire structure in memory or compute it from the initial state). It would be
interesting to develop this idea further in future work.
Some alternative pictures, such as a growing block universe [21], include an
explicit process by which the block universe is formed, and would also count as
simple models in the sense described here. However, these do not seem to offer any
particular advantages over the evolving state picture.
Another apparent advantage of the evolving state picture is that the ‘present’ is real
and changing, and this fits intuitively with our conscious perception of reality. By
contrast, in the block universe picture there is no objective present, and young and
old versions of each individual co-exist and are presumably all conscious, and all
experiencing their own subjective ‘now’. It is interesting that many people seem
happy to accept a block universe view of spacetime, but reject the ‘parallel worlds’
of Everettian quantum theory [22]. Although it is certainly possible that the reality
of the present, and the dynamic nature of reality, are subjective illusions, a picture
of reality closer to our conscious perceptions is appealing.
The arguments above could be applied to both classical and quantum theory.
However, the block universe picture arguably fits less well in the quantum case.
Most discussions of quantum theory are carried out in the evolving state picture,
and this viewpoint is adopted in many of the standard interpretations of quantum
theory, including the Copenhagen, Everettian [22], Collapse [23], and Bohmian [24]
approaches. In contrast, approaches highlighting a block universe view of quantum
reality include the consistent histories approach [25], the two state-vector picture
[26–28], and Kent’s work on Lorentzian models of quantum reality [29, 30].
In quantum field theory, it is standard to consider the algebra of observables
associated with each space-time point, which naturally fits into a relativistic block
universe picture. However, if we consider reality to be composed of this set of
observables, then one suffers even more from the simplicity argument above, as
one must consider a set of operators on infinite dimensional Hilbert space for every
space time point, each of which has a complicated structure. The interplay between
the information contained in the observables and the (static) initial state is also subtle
here, and difficult to interpret directly.
Following an Everettian approach, one could also foliate space-time into a set
of space-like quantum states at different times, and note that different foliations
yield the same physical predictions. However, it is not clear how to describe the
underlying un-foliated reality.
Finally, the Wheeler-DeWitt [31] equation of quantum gravity leads one to con-
sider time as represented by correlations in an essentially spatial state. Recent work
has further developed this viewpoint [32], and it offers an interesting alternative
picture of reality to explore further, but is similar in spirit to the block universe
picture, and many of the issues raised above would also apply to this model.
Given the issues raised in the previous section, it is interesting to explore alternatives
to the block universe picture. One possibility which is entirely consistent with
the predictions of special relativity (if not its spirit) is to assume that a preferred
reference frame exists, and that only the spatial state corresponding to a particular
moment of time in this frame is real. Time evolution then becomes a fundamental
property of reality describing how the spatial state of the universe changes. This
viewpoint is known as presentism, in contrast with the eternalism of the block
universe picture.3
3 A similar alternative is the ‘moving spotlight’ view of time. In this picture the entire block universe
exists, but in addition a particular spatial slice representing an objective present is ‘highlighted’,
If all of the physical laws in the preferred frame are consistent with relativity (i.e.
Lorentz covariant), then it would be impossible to detect from within the universe
what the preferred reference frame was. The existence of an undetectable property
of the universe is philosophically unappealing, but does not seem to be a compelling
argument against this view. More concerning is that the relativistic symmetry of the
physics laws then seems unnatural—why should the laws of physics be the same in
any inertial frame when only one is real?
One way of addressing this concern would be to derive special relativity from an
alternative set of assumptions which do not include the principle that all reference
frames are equivalent. In particular, this seems more plausible given recent work
on quantum particle dynamics in discrete space-time [4–12], in which relativistic
symmetries emerge naturally in the continuum limit despite the underlying discrete
model having a preferred frame (for example a lattice of spatial points and discrete
time steps). In particular, it has been shown that the simplest quantum walks on a
lattice behave like massless relativistic particles at scales much larger than the lattice
scale, given some natural non-relativistic assumptions [4, 9, 11]. Similar results have
also been obtained for discrete quantum cellular automata models of quantum fields
(and thus multiple particles) [6–8, 12], and discrete versions of Lorentz transforms
have also been constructed [5, 10]. Note that in all of these cases, relativity is not
assumed initially, but emerges from the other assumptions used to construct the
models.
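As an illustration of how relativistic dispersion can emerge from a discrete model with a preferred lattice frame, here is a minimal numerical sketch. It is not taken from [4, 9, 11]; the two-component "coin" walk and the mass angle θ are standard textbook choices, used here only to show the effect.

```python
import numpy as np

def walk_step_momentum(k, theta):
    """One step of a 1D two-component quantum walk in momentum space:
    a spin-dependent shift followed by a 'mass' coin rotation."""
    shift = np.diag([np.exp(1j * k), np.exp(-1j * k)])
    coin = np.array([[np.cos(theta), -1j * np.sin(theta)],
                     [-1j * np.sin(theta), np.cos(theta)]])
    return coin @ shift

def dispersion(k, theta):
    """Quasi-energy omega(k) from the eigenvalues e^{±i omega} of the step unitary."""
    eigvals = np.linalg.eigvals(walk_step_momentum(k, theta))
    return np.abs(np.angle(eigvals)).min()

theta = 0.05          # plays the role of the particle mass (lattice units)
for k in [0.01, 0.05, 0.1]:
    omega = dispersion(k, theta)
    relativistic = np.sqrt(k**2 + theta**2)   # E = sqrt(p^2 + m^2), c = 1
    print(k, omega, relativistic)
# For k, theta << 1 the walk reproduces the relativistic dispersion,
# and for theta = 0 it is exactly omega = |k| (the massless case).
```

The exact dispersion here is cos ω = cos θ cos k, which reduces to ω² ≈ k² + θ² for small momenta and mass angle, so Lorentz-invariant behaviour appears only in the continuum regime, exactly as the cited results describe.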
A key ingredient in these results is the finite speed of causal influence, in which
particles only move by a finite distance (e.g. by one lattice site) in each time-step.4
Even starting from a presentist viewpoint, the principle that causal influences travel
at a bounded speed seems a very natural property, which would have warranted
investigation even without any consideration of relativity. In particular, this property
means that in order to determine the state in a finite region after a finite time, one
only needs to know the initial state of a larger finite region, and not the state of
the entire universe. Furthermore, it means that the state of the universe in different
regions can be evolved ‘efficiently’ in parallel. Note that these approaches do not
address the non-locality of quantum measurements highlighted by Bell’s theorem;
however, this need not require any non-local influences if an Everettian approach is
adopted, and in any case such phenomena cannot be used to transmit information.
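The strict "light cone" described in this paragraph is easy to exhibit numerically. The sketch below assumes a simple two-component quantum walk as a stand-in for the lattice models cited; the parameters are purely illustrative.

```python
import numpy as np

def step(psi, theta):
    """One walk step on a line: the upper component moves one site right,
    the lower one site left, then a 'mass' coin mixes them on each site."""
    up, down = psi
    up = np.roll(up, 1)       # right-mover advances one site
    down = np.roll(down, -1)  # left-mover retreats one site
    c, s = np.cos(theta), -1j * np.sin(theta)
    return np.array([c * up + s * down, s * up + c * down])

N, steps, theta = 201, 40, 0.3
psi = np.zeros((2, N), dtype=complex)
psi[0, N // 2] = 1.0          # particle localized at the centre
for _ in range(steps):
    psi = step(psi, theta)

prob = (np.abs(psi) ** 2).sum(axis=0)
outside = prob[np.abs(np.arange(N) - N // 2) > steps].sum()
print(outside)   # exactly zero: no amplitude escapes the causal cone
```

After 40 steps the state is supported on at most 40 sites either side of the origin, so to predict the state in a finite region one indeed only needs the initial data in a correspondingly larger finite region, as the text argues.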
4 Or more generally, that operators localised in a spatial region only evolve into operators on a slightly larger region.
Re-evaluating Space-Time 49
4 Conclusions
The block universe picture of reality leads to a radically different notion of time
to our everyday intuitions. In this paper, we have highlighted some issues raised
by this conceptual shift—in particular how such a complex structure could come to
exist without evolving, why such a model should lead to the observed causality of
our universe, how it fits with our subjective perception of time, and the role played
by time in interpretations of quantum theory.
In light of these issues, we reconsider the presentist view of reality as a spatial
state evolving in time. Can an alternative explanation be found for the emergence
of relativistic behaviour even when reality has a preferred frame? Results showing
the emergence of approximate Lorentz symmetry for models of particles in discrete
space and time suggest this may be possible, and it would be very interesting to
generalise these results.
Understanding relativity as an emergent symmetry would not only allow us
to recover a more natural view of reality as a time-evolving spatial state, but
would provide a basis for further research into models in which relativity is only
approximate, including discrete models of space and time. This may prove crucial
in opening new research directions in quantum gravity and particle physics.
Acknowledgements AJS acknowledges support from the FQXi ‘Physics of What Happens’ grant
program, via the SVCF.
References
1. A. Einstein, Zur Elektrodynamik bewegter Körper. Ann. Phys. 17, 891 (1905); English
translation On the electrodynamics of moving bodies, G.B. Jeffery, W. Perrett (1923)
2. R. Arnowitt, S. Deser, C. Misner, Dynamical structure and definition of energy in general
relativity. Phys. Rev. 116, 1322–1330 (1959)
3. A. Kolmogorov, On tables of random numbers. Sankhyā Ser. A 25, 369–375 (1963). MR
178484
4. I. Bialynicki-Birula, Weyl, Dirac, and Maxwell equations on a lattice as unitary cellular
automata. Phys. Rev. D 49, 6920 (1994)
5. G.M. D’Ariano, A. Tosini, Emergence of space-time from topologically homogeneous causal
networks. Stud. Hist. Phil. Sci. B: Stud. Hist. Phil. Mod. Phys. 44, 294-299 (2013)
6. G.M. D’Ariano, P. Perinotti, Derivation of the Dirac equation from principles of information
processing. Phys. Rev. A 90, 062106 (2014)
7. A. Bisio, G.M. D’Ariano, A. Tosini, Quantum field as a quantum cellular automaton: the Dirac
free evolution in one dimension. Ann. Phys. 354, 244 (2015)
8. G.M. D’Ariano, N. Mosco, P. Perinotti, A. Tosini, Path-integral solution of the one-dimensional
Dirac quantum cellular automaton (2014). arXiv:1406.1021
9. G.M. D’Ariano, N. Mosco, P. Perinotti, A. Tosini, Discrete Feynman propagator for the Weyl
quantum walk in 2+1 dimensions (2014). arXiv:1410.6032
10. A. Bisio, G.M. D’Ariano, P. Perinotti, Lorentz symmetry for 3d quantum cellular automata
(2015). arXiv:1503.01017
11. T.C. Farrelly, A.J. Short, Discrete spacetime and relativistic quantum particles. Phys. Rev. A
89, 062109 (2014)
12. T.C. Farrelly, A.J. Short, Causal fermions in discrete space-time. Phys. Rev. A 89, 012302
(2014)
13. G.F. FitzGerald, The ether and the earth’s atmosphere. Science 13(328), 390 (1889)
14. H.A. Lorentz, The relative motion of the earth and the aether. Zittingsverlag Akad. V. Wet. 1,
74–79 (1892)
15. H. Minkowski, Raum und Zeit (English translation: space and time). Jahresberichte der
Deutschen Mathematiker-Vereinigung, 75–88 (1909)
16. Y. Aharonov, P.G. Bergmann, J.L. Lebowitz, Time symmetry in the quantum process of
measurement. Phys. Rev. B 134, 1410–1416, (1964)
17. J.-M.A. Allen, J. Barrett, D.C. Horsman, C.M. Lee, R.W. Spekkens, Quantum common causes
and quantum causal models (2016). arXiv:1609.09487
18. G. Feinberg, Possibility of faster-than-light particles. Phys. Rev. 159, 1089–1105 (1967)
19. K. Gödel, An example of a new type of cosmological solution of Einstein’s field equations of
gravitation. Rev. Mod. Phys. 21, 447–450 (1949)
20. D.Z. Albert, Time and Chance (Harvard University Press, Harvard, 2003)
21. M. Tooley, Time, Tense, and Causation (Clarendon Press, Oxford, 1997)
22. H. Everett, Relative state formulation of quantum mechanics. Rev. Mod. Phys. 29, 454–462
(1957)
23. G.C. Ghirardi, A. Rimini, T. Weber, A model for a unified quantum description of macroscopic
and microscopic systems, in Quantum Probability and Applications, ed. by L. Accardi et al.
(Springer, Berlin, 1985)
24. D. Bohm, A suggested interpretation of the quantum theory in terms of “hidden” variables. I
& II. Phys. Rev. 85, 166–193 (1952)
25. R.B. Griffiths, Consistent histories and the interpretation of quantum mechanics. J. Stat. Phys.
36, 219–272 (1984)
26. S. Watanabe, Symmetry of physical laws. Part III. Prediction and retrodiction. Rev. Mod. Phys.
27(2), 179 (1955)
27. Y. Aharonov, P.G. Bergmann, J.L. Lebowitz, Time symmetry in the quantum process of
measurement. Phys. Rev. B 134(6), 1410–1416 (1964)
28. Y. Aharonov, S. Popescu, J. Tollaksen, Each instant of time a new Universe (2013).
arXiv:1305.1615
29. A. Kent, Path integrals and reality (2013). arXiv:1305.6565
30. A. Kent, Solution to the Lorentzian quantum reality problem. Phys. Rev. A 90, 012107 (2014)
31. B.S. DeWitt, Quantum theory of gravity. I. The canonical theory. Phys. Rev. 160, 1113–1148
(1967)
32. V. Giovannetti, S. Lloyd, L. Maccone, Quantum time. Phys. Rev. D 92, 045033 (2015)
Relativistic Quantum Clocks

M.P.E. Lock and I. Fuentes
Abstract The conflict between quantum theory and the theory of relativity is
exemplified in their treatment of time. We examine the ways in which their
conceptions differ, and describe a semiclassical clock model combining elements
of both theories. The results obtained with this clock model in flat spacetime
are reviewed, and the problem of generalizing the model to curved spacetime is
discussed, before briefly describing an experimental setup which could be used
to test the model. Taking an operationalist view, where time is that which is
measured by a clock, we discuss the conclusions that can be drawn from these
results, and what clues they contain for a full quantum relativistic theory of time.
When an experiment is carried out, the experimenter hopes to gain some information
about nature through her controlled interaction with the system under study. In
classical physics, systems possess a set of measurable properties with definite
values, which can in principle be interrogated simultaneously, to arbitrary accuracy,
and without affecting the values of those properties. Any uncertainty in the
measurements arises from some lack of knowledge on the part of the experimenter
(for example due to imperfect calibration of the apparatus) which could, in principle,
The author “Ivette Fuentes” was previously known as Fuentes-Guridi and Fuentes-Schuller.
M.P.E. Lock ()
Department of Physics, Imperial College, SW7 2AZ London, UK
Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
e-mail: maximilian.lock12@imperial.ac.uk
I. Fuentes
Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
School of Mathematical Sciences, University of Nottingham, University Park, Nottingham NG7
2RD, UK
e-mail: ivette.fuentes.guridi@univie.ac.at
is always possible to measure and remove acceleration effects and recover an ideal
clock. Finally, the fourth issue is that, given the locality of the equivalence principle
(i.e. that it only holds exactly when we consider a pointlike observer), it is unclear to
what extent it applies to quantum objects, which do not follow pointlike trajectories.
We investigate the interplay of these four issues, seeking to answer the following
questions: what time does a quantum clock measure as it travels through spacetime,
and what factors affect its precision? What are the fundamental limitations imposed
by quantum theory on the measurement of time, and are these affected by the
motion of the clock? To answer these questions, we cannot in general rely on the
Schrödinger equation, as we must use a particular time parameter therein, which in
turn requires the use of a particular classical trajectory.1 The relativistic clock model
detailed in Sect. 3 gives a compromise; its boundaries follow classical trajectories,
but the quantum field contained therein, and hence the particles of that field, do not.
In Sect. 4 we examine the extent to which this clock has allowed the four issues
discussed above to be addressed, and possible future progress.
Given the difference in the scales at which quantum theory and GR are usually
applied, one may ask what we expect to gain by examining their overlap. Our
response to such a question is threefold. Firstly, we note that optical clocks have
reached a precision where gravitational time dilation as predicted by GR has been
measured over scales accessible within a single laboratory [15]. Indeed, optical
clocks are now precise enough that they are sensitive to a height change of 2 cm
at the Earth’s surface [16]. Given the rate of improvement of this technology
(see Figure 1 of [17], for example), one can anticipate an even greater sensitivity
in the near future. The detection of a nuclear transition in thorium-229 [18],
proposed as a new frequency standard [19], means that we may soon enter an
era of “nuclear clocks”, surpassing that which is achievable with clocks based
on electronic transitions. Considering this ever-increasing precision together with
proposals to exploit quantum effects for superior timekeeping (e.g. [20, 21]), we
argue that a consideration of GR alongside quantum theory will become not simply
possible, but in fact necessary in order to accurately describe the outcomes of
experiments.
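For scale, the quoted 2 cm sensitivity corresponds to a fractional frequency shift given by the standard weak-field formula gΔh/c². A quick back-of-envelope check (textbook formula, illustrative constants, not from [16]):

```python
g = 9.81        # m/s^2, surface gravity
c = 2.998e8     # m/s, speed of light
dh = 0.02       # 2 cm height change
shift = g * dh / c**2
print(f"fractional frequency shift: {shift:.2e}")  # ~2.2e-18
```

A shift of roughly 2 × 10⁻¹⁸ sits right at the stated precision frontier of current optical clocks, which is why a 2 cm height change is resolvable.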
Our second response is to point out the possibility of new technologies and exper-
iments. There are already suggestions exploiting the clock sensitivity mentioned
above, such as the proposal to use changes in time dilation for earthquake prediction
and volcanology [22]. On the other hand, there are proposals to use effects which
are both quantum and relativistic in order to measure the Schwarzschild radius
of the Earth [23], or to make an accelerometer [24], for example. See [25] for a
review of experiments carried out or proposed which employ both quantum and
general relativistic features. Beyond specific proposals, there are practical questions
which we cannot answer with quantum mechanics and GR separately; for example,
what happens if we distribute entanglement across regions with differing spacetime curvature?
1 Ignoring this difficulty, and naively picking some time coordinate, one finds that the Schrödinger equation for a free particle does not possess the necessary symmetry; it is invariant under Galilean (rather than Lorentz) transformations.
54 M.P.E. Lock and I. Fuentes
The clock model introduced in [45] allows us to integrate aspects of both general
relativity and quantum mechanics. It consists of a particular mode of a localized
quantum field; the boundaries confining the field define the spatial extent of the
clock, and the clock time is given by the phase of a single-mode Gaussian state. This
gives a clock that can undergo classical relativistic trajectories, but whose dynamics
are described by QFTCS. The former property means that we can compare this to
a pointlike clock by considering a classical observer following the trajectory of the
center of the cavity, while the latter property allows us to consider the effect of the
spacetime curvature on the whole extent of the quantum field, instead of relying on
the Schrödinger equation. The transformation of the quantum state of a localized
field due to boundary motion is a well-studied problem in flat spacetime [37, 46],
particularly the generation of particles due to the DCE [13]. Since the frequencies of
the field modes depend on the length between the boundaries, one must be careful to
choose the trajectories in such a way that the comparison with the pointlike classical
clock is a fair one. One must also be careful to distinguish between classical effects
arising purely from the spatial extent of the clock, and novel quantum effects due to
mode-mixing and particle creation.
To analyze the effect of non-inertial motion and spacetime curvature on the clock,
we first need to describe their effect on its quantum state, giving us the change in
phase (i.e. clock time). Since the phase is subject to a quantum uncertainty relation
with respect to the particle number (see [47], for example), a change in the state of
the field will in general modify the precision with which the phase can be estimated.
Once these changes have been determined, one can compare the overall phase with
the corresponding classical result to find quantum relativistic shifts in the clock
time, and one can see how the precision of the clock is affected by considering the
change in phase estimation precision. Before discussing the results obtained using
this clock model, we give a brief overview of the framework underpinning it.
The simplest quantum field theory is that of the massless scalar field. This can be
used, for example, to approximate the electromagnetic field when polarization can
be ignored [48], or phononic excitations in a proposed relativistic BEC setup [49].
For simplicity, we consider one spatial and one temporal dimension. In a general
1+1D spacetime, the massless scalar field satisfies the Klein-Gordon equation [50]

$$\Box \hat{\Phi} = 0, \qquad \text{with} \quad \Box := g^{\mu\nu} \nabla_\mu \nabla_\nu. \tag{1}$$

In some coordinate system $(t, x)$, imposing the boundary conditions $\hat{\Phi}(t, x_1) = 0$
and $\hat{\Phi}(t, x_2) = 0$ for a given $x_1$ and $x_2$, we describe either an electromagnetic field
in a cavity or the phonons of a BEC trapped in an infinite square well. After finding
a set of mode solutions to Eq. (1), which we denote $\phi_m(t, x)$, one can (under certain
conditions, discussed briefly in Sect. 3.4) associate particles with the modes, and
quantize the field by introducing creation and annihilation operators $a_m$ and $a_m^\dagger$.
These satisfy the usual bosonic commutation relations, $[a_m, a_n^\dagger] = \delta_{mn}$, and can be
used to define the vacuum and Fock states in the usual way. The total scalar field is
then given by

$$\hat{\Phi}(t, x) = \sum_m \left[ a_m \phi_m(t, x) + a_m^\dagger \phi_m^*(t, x) \right]. \tag{2}$$
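For concreteness, the mode solutions in this setting are the familiar Dirichlet standing waves. The explicit form below is a standard result (written in units where $c = 1$, with the normalization fixed by the Klein-Gordon inner product), not taken from the chapter itself:

```latex
\phi_m(t,x) = \frac{1}{\sqrt{m\pi}}\,
  \sin\!\left(\frac{m\pi\,(x - x_1)}{L}\right) e^{-i\omega_m t},
\qquad
\omega_m = \frac{m\pi}{L},
\qquad
L := x_2 - x_1,
\quad m = 1, 2, \dots
```

The boundary conditions at $x_1$ and $x_2$ quantize the frequencies, which is what makes the cavity length $L$ enter the clock's tick rate directly.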
If the field can be described in terms of a second set of mode solutions, we can
relate these to the first set by means of a Bogoliubov transformation. Denoting the
creation and annihilation operators associated with the new set of solutions by $b_m$
and $b_m^\dagger$, the Bogoliubov transformation can be written as

$$b_m = \sum_n \left( \alpha_{mn}^* a_n - \beta_{mn}^* a_n^\dagger \right), \tag{3}$$
where $\alpha_{mn}$ and $\beta_{mn}$ are known as the Bogoliubov coefficients, and can be computed
using an inner product between the first and second set of mode solutions (see [50]
for details). These transformations can be used, for example, to represent changes
in coordinate system between inertial and non-inertial observers, or the effect of
Gaussian operations or of spacetime dynamics. Mixing between modes due to the
transformation is determined by the $\alpha_{mn}$, while the $\beta_{mn}$ correspond to the generation
of particles. The fact that the $\beta_{mn}$ are non-zero for Bogoliubov transformations
between inertial and non-inertial observers leads to the Unruh effect and the DCE.
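The inner-product computation of Bogoliubov coefficients can be sketched for the simplest case: a sudden contraction of the cavity length. This "sudden approximation" example is ours, not the chapter's; matching the field and its time derivative at the jump gives the α and β below, and the Bogoliubov identity Σₙ(|α|² − |β|²) = 1 provides a consistency check.

```python
import numpy as np

L_old, L_new, N = 1.0, 0.7, 100   # sudden contraction; N-mode truncation
x = np.linspace(0.0, L_old, 40001)
dx = x[1] - x[0]
ns = np.arange(1, N + 1)

def modes(L):
    """Spatial Dirichlet mode profiles, zero outside [0, L] (rows = mode number)."""
    inside = x <= L
    return np.where(inside, np.sqrt(2.0 / L) * np.sin(np.outer(ns, np.pi * x / L)), 0.0)

u = modes(L_old)                  # old cavity modes
v = modes(L_new)                  # new cavity modes
w_old = np.pi * ns / L_old        # mode frequencies (c = 1)
w_new = np.pi * ns / L_new

V = v @ u.T * dx                  # overlaps V[m, n] = integral of v_m * u_n
ratio = np.sqrt(w_old[None, :] / w_new[:, None])
alpha = 0.5 * (ratio + 1.0 / ratio) * V   # mode-mixing coefficients
beta = 0.5 * (ratio - 1.0 / ratio) * V    # particle-creation coefficients

# Bogoliubov identity: sum_n |alpha_mn|^2 - |beta_mn|^2 = 1 for each new mode m
check = (alpha**2 - beta**2).sum(axis=1)
print(check[:3])                  # ≈ [1, 1, 1] up to truncation/grid error
```

Because the old and new frequencies differ, the β coefficients are non-zero: even this instantaneous, purely "mechanical" change of the boundary creates particles, the same mechanism behind the DCE.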
The relativistic clock model described in Sect. 3.1 makes use of only a single
mode of the field after the transformation. It is then very advantageous to work
with the covariance matrix formalism, which greatly simplifies the process of
taking a partial trace over field modes. In doing so, we restrict ourselves to the
consideration of Gaussian states of the field. The set of such states is closed
under Bogoliubov transformations. Defining the quadrature operators for mode $n$
by $X_{2n-1} := \frac{1}{2}\left(a_n + a_n^\dagger\right)$ and $X_{2n} := \frac{1}{2i}\left(a_n - a_n^\dagger\right)$, a Gaussian state is completely
determined by the first moments $q^{(n)} := \langle X_{2n-1} \rangle$ and $p^{(n)} := \langle X_{2n} \rangle$, and the second
moments, i.e. the covariance matrix

$$\sigma_{ij} = \frac{1}{2} \left\langle \left\{ X_i, X_j \right\} \right\rangle - \langle X_i \rangle \langle X_j \rangle. \tag{4}$$
To take a partial trace over some modes, one simply removes the corresponding
rows and columns from the covariance matrix. Let $k$ and $\sigma^{(k)}$ denote respectively
a mode of interest and the reduced covariance matrix of that mode. Now consider
some initial state with first moments $q_0^{(k)}$ and $p_0^{(k)}$, and reduced covariance matrix
$\sigma_0^{(k)}$. After a Bogoliubov transformation, the first and second moments are given
by [42, 43]

$$\begin{pmatrix} q^{(k)} \\ p^{(k)} \end{pmatrix} = M_{kk} \begin{pmatrix} q_0^{(k)} \\ p_0^{(k)} \end{pmatrix} \quad \text{and} \quad \sigma^{(k)} = M_{kk}\, \sigma_0^{(k)} M_{kk}^T + \frac{1}{4} \sum_{n \neq k} M_{kn} M_{kn}^T, \tag{5}$$
with

$$M_{mn} := \begin{pmatrix} \mathrm{Re}\left(\alpha_{mn} - \beta_{mn}\right) & \mathrm{Im}\left(\alpha_{mn} + \beta_{mn}\right) \\ -\mathrm{Im}\left(\alpha_{mn} - \beta_{mn}\right) & \mathrm{Re}\left(\alpha_{mn} + \beta_{mn}\right) \end{pmatrix}. \tag{6}$$

A single-mode Gaussian state can equivalently be described by its phase $\theta$, purity $P$,
squeezing magnitude $r$ and squeezing angle $\varphi$, which are given in terms of the
moments by

$$\tan \theta = \frac{p^{(k)}}{q^{(k)}}, \tag{7b}$$

$$P = \frac{1}{4\sqrt{\det \sigma^{(k)}}}, \tag{7c}$$
$$r = \frac{1}{2} \operatorname{arctanh}\left( \frac{\sqrt{\left(\sigma_{11}^{(k)} - \sigma_{22}^{(k)}\right)^2 + 4\left(\sigma_{12}^{(k)}\right)^2}}{\sigma_{11}^{(k)} + \sigma_{22}^{(k)}} \right), \tag{7d}$$

$$\tan\left(2(\theta + \varphi)\right) = \frac{2\sigma_{12}^{(k)}}{\sigma_{11}^{(k)} - \sigma_{22}^{(k)}}. \tag{7e}$$
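Equations (7b)-(7d) can be checked against a state whose parameters are known in advance. The sketch below is ours, assuming the convention in which the vacuum covariance matrix is I/4 (consistent with the quadrature definitions above); the example state and its squeezing value are arbitrary.

```python
import numpy as np

def gaussian_params(q, p, cov):
    """Recover phase theta, purity P and squeezing r from the first moments
    and 2x2 covariance matrix of a single-mode Gaussian state (vacuum cov = I/4)."""
    theta = np.arctan2(p, q)                              # Eq. (7b)
    purity = 1.0 / (4.0 * np.sqrt(np.linalg.det(cov)))    # Eq. (7c)
    s11, s22, s12 = cov[0, 0], cov[1, 1], cov[0, 1]
    r = 0.5 * np.arctanh(np.sqrt((s11 - s22)**2 + 4 * s12**2) / (s11 + s22))  # Eq. (7d)
    return theta, purity, r

# A displaced squeezed state: squeezing r0 = 0.6 along the quadrature axes
r0 = 0.6
cov = 0.25 * np.diag([np.exp(-2 * r0), np.exp(2 * r0)])
theta, P, r = gaussian_params(q=1.0, p=1.0, cov=cov)
print(theta, P, r)   # theta = pi/4, P = 1 (pure state), r = 0.6
```

The recovered values match the ones used to build the state, confirming that the parametrization and the covariance-matrix convention are mutually consistent.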
The precision $\Delta\lambda$ with which a parameter $\lambda$ can be estimated from $M$ independent
measurements is limited by the quantum Cramér-Rao bound

$$\Delta\lambda \geq \frac{1}{\sqrt{M H_\lambda}}, \tag{8}$$
where H is the quantum Fisher information (QFI). One can therefore use the
QFI to quantify the precision with which a parameter can be measured: a greater
QFI implies a greater precision. We note, however, that the QFI is obtained by
an unconstrained optimization over all generalized measurements [51], and as
such gives the theoretical maximum precision, without any consideration of the
feasibility of the measurement process required to achieve it.
In recent years there has been an interest in using squeezed light to improve
the sensitivity of gravitational measurements such as in the LIGO gravitational
wave detector [53], and in atom interferometric measurements of gravitational
field gradients [54]. Typically, proposals consider non-relativistic quantum theory
and Newtonian physics, while others include some corrections due to GR [55].
In [42, 43], quantum metrology was considered using QFTCS, giving a fully
relativistic application of quantum metrology. Applying these ideas, we consider
the estimated parameter $\lambda$ to be encoded into the Bogoliubov coefficients, and thus into the matrices $M_{mn}$
given by Eq. (6). From the corresponding transformation of the first and second
moments (Eq. (5)), and the expression of the Gaussian state parameters in terms
of these moments (Eq. (7)), one can see how the parameters encode $\lambda$. We apply
quantum metrology to the estimation of the phase of a single-mode Gaussian state,
i.e. $\lambda = \theta$. The QFI for the phase, written in terms of the other Gaussian state
parameters (the displacement $\alpha$, purity $P$, squeezing $r$ and squeezing angle $\varphi$), is given by [56]

$$H_\theta = 4\alpha^2 P \left[ \cosh(2r) + \sinh(2r) \cos\varphi \right] + \frac{4\sinh^2(2r)}{1 + P^2}. \tag{9}$$
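Eq. (9) can be used to compare Gaussian clock states at a fixed mean particle number. The sketch below uses the symbols as in Eq. (9); the coherent-state value H = 4⟨N⟩ and the squeezed-vacuum advantage of roughly 8⟨N⟩(⟨N⟩+1) are standard metrology results, reproduced here only as a plausibility check.

```python
import numpy as np

def qfi_phase(alpha, r, P=1.0, phi=0.0):
    """Phase QFI of a single-mode Gaussian state, Eq. (9):
    displacement alpha, squeezing r, purity P, squeezing angle phi."""
    return (4 * alpha**2 * P * (np.cosh(2 * r) + np.sinh(2 * r) * np.cos(phi))
            + 4 * np.sinh(2 * r)**2 / (1 + P**2))

N_avg = 2.0  # fixed mean particle number for the comparison
coherent = qfi_phase(alpha=np.sqrt(N_avg), r=0.0)              # H = 4<N>
squeezed = qfi_phase(alpha=0.0, r=np.arcsinh(np.sqrt(N_avg)))  # <N> = sinh^2 r
print(coherent, squeezed)   # 8.0 vs 48.0: squeezed vacuum wins at fixed <N>
```

This is the quantitative content of the statement below that, for a given average particle number, the squeezed vacuum is the best Gaussian state for phase estimation.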
To describe an accelerating clock in flat spacetime, one can make use of so-called
Rindler coordinates. An observer at any fixed spatial Rindler coordinate
undergoes a constant proper acceleration and has a proper time linearly proportional
to the Rindler time coordinate. Furthermore, an extended object which is stationary
in Rindler coordinates satisfies a number of desirable properties, including Born
rigidity [57] and a constant “radar length” (the length as measured by the round-trip-
time of a light pulse) [6]. By judiciously connecting together Rindler coordinates
corresponding to different proper accelerations, the Bogoliubov transformation
corresponding to a continuously varying (finite-duration) proper acceleration can
be calculated [46]. For the results described in this section however, it suffices to
join segments of constant proper acceleration with segments of inertial motion, as
detailed in [37].
In [45], the effect of non-inertial motion on the clock time was investigated in the
famous twin-paradox scenario. In this scenario, one clock remains motionless while
another undergoes a round trip, and the stationary clock registers more time passing
than the round-trip clock. The round-trip trajectory was composed of periods of
constant proper acceleration a interspersed with periods of inertial motion (see
Fig. 1, and reference [45] for more details). The clocks were initialized in the same
coherent state. First considering the purely classical deviation (i.e. in the absence
of mode-mixing and particle creation), between a pointlike and a spatially extended
clock, one finds a difference only during the periods of acceleration. During a period
of proper acceleration, the time $\tau_{\text{cav}}^a$ measured by the cavity-clock can be related to
the proper time $\tau_{\text{point}}^a$ of a pointlike observer by

$$\frac{\tau_{\text{cav}}^a}{\tau_{\text{point}}^a} = 1 - \frac{1}{12} \left( \frac{aL}{c^2} \right)^2 + O\left( \left( \frac{aL}{c^2} \right)^4 \right), \tag{10}$$

where $L$ is the length of the cavity.
Recalling that less time passes for the accelerated pointlike “twin” than the
stationary one, we see from Eq. (10) that the classical effect of the clock’s nonzero
spatial extent is to increase this disparity. If we now include mode-mixing and
particle creation effects due to the motion, as determined by the Bogoliubov
transformation, we find a non-trivial relation between the time as measured by
the relativistic quantum clock model and a pointlike clock. This is illustrated in
Fig. 2 using experimentally feasible parameters for the superconducting quantum
interference device (SQUID) setup discussed in Sect. 3.5. The left inset of Fig. 2
shows the difference between the quantum clock and a pointlike clock, both with
and without mode-mixing and particle-creation effects, as a function of the clock
size L. The right inset gives the percentage of the effect due to particle creation
alone, again as a function of the clock size. Particle creation being a purely quantum
effect, this gives a new quantum contribution to the relativistic phenomenon of time
dilation. The complicated oscillatory behavior of this contribution is due to the non-
trivial L-dependence of numerous complex terms which are added together to give
Fig. 1 The twin paradox trajectory. One “twin”, Alice, remains stationary in some inertial
reference frame while the other, Rob, undergoes a round trip. Rob’s trajectory consists of segments
of proper acceleration of magnitude a (red) and segments of inertial motion (blue). The dashed
lines give the trajectories of the corresponding pointlike observers. Figure taken from [45]
Fig. 2 Time dilation, classical and quantum relativistic acceleration effects using feasible parameters for a SQUID setup, repeating the scenario 500 times. Unless variable, the parameters are $t_a = 1$ ns, $t_i = 0$ ns, $L = 1.1$ cm, and $a = 1.7 \times 10^{15}$ m/s². Main plot: phase difference between the twins, using spatially-extended relativistic quantum clocks ($h := aL/c^2$). Left inset: time difference between Rob using a pointlike and using a spatially extended clock, with (red) and without (blue) mode-mixing and particle-creation effects, as a percentage of the total time dilation between the twins. Right inset: percentage of the total time dilation between twins due exclusively to particle creation. Figure taken from [45]
the relevant Bogoliubov coefficients (see the appendix of [45] for details). The main
plot of Fig. 2 gives the relative phase shift between the twins’ quantum clocks.
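To get a feel for the size of the classical correction in Eq. (10), one can evaluate its leading term with the parameters quoted in the Fig. 2 caption. A rough sketch (the numbers are the caption's, the arithmetic is ours):

```python
a = 1.7e15      # proper acceleration, m/s^2 (Fig. 2 parameters)
L = 0.011       # clock size, m (1.1 cm)
c = 2.998e8     # speed of light, m/s

h = a * L / c**2                  # dimensionless parameter h = aL/c^2
classical_shift = h**2 / 12.0     # leading correction in Eq. (10)
print(f"h = {h:.2e}, fractional shift = {classical_shift:.1e}")
```

Even at this enormous proper acceleration the fractional correction is only of order 10⁻⁹, which is why repeating the scenario many times (500 in Fig. 2) is needed to build up an observable phase difference.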
In [56], the effect of non-inertial motion on the precision of the clock was
investigated. This depends on the state in which the clock is initialized. The QFI
for the phase of a Gaussian state was given in Eq. (9). From this we see that, for
$\varphi \notin (\pi/2, 3\pi/2)$ and a given purity, the precision of phase estimation increases
with the real displacement parameter $\alpha$ and the magnitude $r$ of the squeezing. For a
given average particle number $\langle N \rangle$, the squeezed vacuum state is the best Gaussian
state for phase estimation [58].
for coherent and squeezed vacuum states is depicted. In particular, one can see the
separability of the mode-mixing and particle creation effects. Mode-mixing acts to
decrease the QFI, and therefore the precision of the clock, more so for the squeezed
vacuum than for the coherent state, though in the regime considered there is no
point at which the coherent state gives a better clock than the squeezed vacuum.
Particle creation, on the other hand, can either ameliorate or exacerbate this effect,
depending on the initial phase $\theta_0$ of the clock. For large $\langle N \rangle$, the degradation due
to mode-mixing dominates, but as $\langle N \rangle$ decreases, one arrives at a regime where
particle-creation effects dominate. For low enough $\langle N \rangle$ and a careful choice of
parameters one can even find cases where the QFI is improved as a result of the
generation of the appropriate squeezing, though the set of such cases is relatively
small. One can therefore conclude that the typical effect of non-inertial motion is to
decrease the precision of the clock.
Fig. 3 The change in the QFI (given as a percentage of its pre-motion value) after non-inertial motion with $h := aL/c^2$, for (a) a coherent initial state, and (b) a squeezed vacuum initial state with $\langle N \rangle = 1$ (blue), $\langle N \rangle = 5$ (red) and $\langle N \rangle = 10$ (green). The phase accrued during each $t_a$ of acceleration was $\theta_a = \pi$. The solid curves give the effect of mode-mixing alone, while the dotted and dashed curves incorporate the effect of particle creation for an initial phase of $\theta_0 = 0$ and $\theta_0 = \pi/2$ respectively. Figure taken from [56]
gravitational field, tidal forces will reveal the curvature of the spacetime. Likewise,
one can equate a pointlike object at rest in a gravitational field with one undergoing
some proper acceleration in flat spacetime, and one finds again that this equivalence
breaks down for a system with finite extent. This is illustrated in [59], for example,
where it is shown that a reference frame at rest in a uniform gravitational field is
not equivalent to a uniformly accelerating one. Given these considerations, when
seeking to apply the results discussed above to curved spacetimes, one can only
invoke the equivalence principle in a limited sense. Here, we illustrate this in the
Schwarzschild spacetime, though a similar argument can be applied to any static
spacetime.
In the work discussed in Sect. 3.3, Rindler coordinates were used to represent the
accelerated observer. One example of such coordinates, $(\eta, \chi)$, can be obtained from
inertial coordinates $(T, X)$ by the transformation $T = \chi \sinh \eta$, $X = \chi \cosh \eta$, so that
an observer at fixed $\chi$ undergoes a constant proper acceleration $a_R \propto 1/\chi$. The
exterior Schwarzschild spacetime, by contrast, has the radial line element

$$ds^2 = -f(r)\, dt^2 + \frac{1}{f(r)}\, dr^2 \quad \text{with} \quad f(r) := 1 - \frac{r_s}{r} \quad \text{and} \quad r_s := \frac{2GM}{c^2}. \tag{12}$$

In this case, observers at fixed $r$ experience the constant proper acceleration [60]

$$a_S = \frac{r_s}{2r^2 \sqrt{f(r)}}, \tag{13}$$
which is evidently different from the Rindler case. Since the clock has non-
negligible extent, we cannot equate these two circumstances in general. Close to the
event horizon at $r = r_s$, however, one can approximate the spacetime experienced
by stationary Schwarzschild observers using Rindler coordinates [60], giving an
approximate equality between $a_R$ and $a_S$, and in this case one can import the method
discussed in Sect. 3.3 into an investigation in curved spacetime.
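A numerical sketch of Eq. (13) makes the near-horizon Rindler approximation concrete. Here the factor of c² is restored to give SI units, and the Earth and solar-mass figures are illustrative; the proper-distance formula ρ = 2√(r_s(r − r_s)) is the standard near-horizon expansion.

```python
import numpy as np

G, c, M_sun = 6.674e-11, 2.998e8, 1.989e30

def a_schwarzschild(r, M):
    """Proper acceleration of a static observer at radius r, Eq. (13), SI units."""
    rs = 2 * G * M / c**2
    f = 1 - rs / r
    return c**2 * rs / (2 * r**2 * np.sqrt(f))

# Sanity check: for the Earth this recovers the familiar ~9.8 m/s^2
M_earth, R_earth = 5.972e24, 6.371e6
print(a_schwarzschild(R_earth, M_earth))   # ≈ 9.8

# Near the horizon of a solar-mass black hole, a_S diverges and the geometry
# approaches Rindler form: a_S ≈ c^2 / rho, with rho = 2 sqrt(rs (r - rs))
# the proper distance to the horizon.
rs = 2 * G * M_sun / c**2
r = rs * 1.0001
rho = 2 * np.sqrt(rs * (r - rs))
print(a_schwarzschild(r, M_sun), c**2 / rho)   # the two agree near the horizon
```

Away from the horizon the two expressions diverge from one another, which is the quantitative sense in which the equivalence between $a_R$ and $a_S$ only holds in the near-horizon limit.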
To examine more general situations, we need to be able to describe the effect
of general boundary motion through curved spacetime on the quantum state of the
field. In [61], we provided a method for describing the effect of a finite period of
cavity motion through a static curved spacetime for a broad class of trajectories. This
provides us with the means to explore the effect of gravity on the clock, namely how
deviations from the proper-time prescription of relativity depend on the spacetime
curvature, and how the precision of the clock is affected.
There remain, however, certain challenges. In the flat spacetime case, there was
an unambiguous notion of length which could be adopted, determined by demanding
that an observer accelerating with the clock measure a constant length. This results
in a number of desirable properties, such as Born rigidity (a lack of stresses on
the clock support system), constant radar distance (the distance as measured by
timing classical light pulses), and constant proper length. In curved spacetime,
however, such notions do not necessarily coincide, and there is no unambiguous
generalization of Rindler coordinates. Fermi-Walker coordinates are a candidate for
such a generalization, but it is unclear if this theoretical construction is in keeping
with the operationalism which we have until now adopted (for example by defining
time as that which is measured by a clock). We are currently investigating different
notions of length in curved spacetime, and how the choice of which notion to adopt
affects the measurement of time.
The discussion above considered only static spacetimes. Now including the
possibility of non-static ones, we can ask how the spacetime dynamics themselves
affect the clock. This question brings with it an added complication: in order to
associate a set of solutions to the field equations with particle modes, we require
that the spacetime admits a timelike Killing vector field, which is by no means
guaranteed for a nonstationary spacetime. Without such a vector field, there is an
ambiguity in the concept of particles [50]. Nonetheless, there are some cases in
which these issues can be overcome, such as in the usual calculation of particle
creation due to an expanding universe [62, 63], leaving us free to apply the quantum
clock model.
As noted in Sect. 3.2.1, the scalar field used in the clock model described above can
represent light in an optical cavity (neglecting polarization), or the phonons of a
BEC under certain conditions [49]. We only consider the former implementation
here. Subjecting the mirrors of an optical cavity to the necessary non-inertial
motion2 is technically infeasible [66]. To circumvent this requirement, a novel
solution was proposed in [67]; by placing a SQUID at one or both ends of a
waveguide, one can create effective mirrors whose position is determined by the
inductance of the SQUID, which is in turn controlled by an external magnetic field.
Modulating the external magnetic field therefore allows the experimenter to control
the position of this effective mirror. By making one mirror oscillate at a particle-
creation resonance, this setup was used to observe the DCE for the first time [30].
In [45], the authors analyzed the feasibility of implementing the trajectory detailed
in Sect. 3.3 using a comparable setup, concluding that the experiment would be
challenging but possible.
2
Note that it is not acceleration but rather its time-derivative (the “jerk”) which produces the
effect [64, 65].
64 M.P.E. Lock and I. Fuentes
4 Conclusion
The results discussed above demonstrate both a deviation from the proper-time
prescription of relativity when one considers a quantum clock with some finite
extent, and a relativistic change in the quantum uncertainty associated with its
measurement of time. Though these results are so far limited to flat spacetime, the
main challenge to applying the model in curved spacetime, i.e. calculating the effect
of motion through curved spacetime on the localized field, has now been overcome.
In Sect. 1, we noted four problems arising in the overlap of quantum mechanics
and relativity. For clarity we repeat them here, before discussing each of them in
turn:
1. finding the constraints imposed by quantum theory on clocks in GR;
2. reconciling the proper-time prescription of GR with the impossibility of pointlike
quantum trajectories;
3. investigating the validity of the clock hypothesis;
4. examining the applicability of the equivalence principle to a non-pointlike
quantum clock.
To address the first problem, the quantum uncertainty of the clock measurement
was quantified using the tools of quantum metrology, and in particular the Cramér-
Rao bound. One finds that the change in precision due to relativistic motion
depends upon the quantum state in which the clock was initialized, as one might
expect. While some states were more robust than others, except in very particular
circumstances the motion had the effect of decreasing the QFI for all initial states,
largely due to mode-mixing. In the example considered, the more nonclassical
the state, the greater its fragility with respect to the motion. A key goal of our
ongoing work is to determine how spacetime curvature affects this.
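The metrological chain can be made concrete with a generic single-mode toy model (our own illustration, not the cavity-field computation discussed in the text): for a pure state whose phase is generated by the number operator, the QFI is F = 4 Var(n̂), and the quantum Cramér-Rao bound limits the phase uncertainty over M independent runs to Δθ ≥ 1/√(MF). A more nonclassical state can carry a larger QFI while, as noted above, being more fragile under mode-mixing:

```python
import math
import numpy as np

DIM = 60                      # Fock-space truncation (illustrative)
n = np.arange(DIM)

def qfi_phase(amplitudes):
    """QFI for phase estimation with generator n-hat: F = 4 Var(n) for a pure state."""
    p = np.abs(amplitudes) ** 2
    mean = np.sum(n * p)
    return 4.0 * (np.sum(n ** 2 * p) - mean ** 2)

# Coherent state |alpha>: Poissonian number statistics, Var(n) = |alpha|^2, F = 4|alpha|^2.
alpha = 2.0
c = np.array([alpha ** k / math.sqrt(math.factorial(k)) for k in range(DIM)])
coherent = c / np.linalg.norm(c)

# Nonclassical superposition (|0> + |N>)/sqrt(2): Var(n) = N^2/4, hence F = N^2.
N = 6
superpos = np.zeros(DIM)
superpos[0] = superpos[N] = 1.0 / math.sqrt(2.0)

F_coh, F_sup = qfi_phase(coherent), qfi_phase(superpos)

# Cramér-Rao lower bound on the phase uncertainty after M independent runs.
M = 1000
delta_theta = 1.0 / math.sqrt(M * F_sup)
```

With these parameters the superposition state outperforms the coherent state (F = 36 vs. F = 16), illustrating why the precision gain and the fragility both grow with nonclassicality.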
With regard to the second issue, we have attempted to move away from the
proper-time prescription of GR in favor of an operationalist view, instead defining
time as the result of a measurement performed on a quantum clock. This is in
keeping with the Machean view that a physical theory should be based entirely
on directly observable properties [68]. We have succeeded to some extent, in that
the particles of the field do not follow well-defined trajectories, and the clock-time
is determined by the quantum evolution of the system and not simply the length
along a curve. However, we are still bound by the proper-time view, as we must
choose a classical observer whose proper time parametrizes the evolution of the
quantum field. Furthermore, the phase of the field, whose measurement we take as
time, has a definite, noncontextual value in this model, and so is not treated as a fully
quantum observable. Nonetheless, this value gives a different clock readout from
the corresponding proper time, and this difference is a highly non-linear function
of clock size (see the insets in Fig. 2), demonstrating the non-trivial effect of the
clock’s non-pointlike and quantum nature.
Concerning the clock hypothesis, we can clearly state that, with the clock model
employed here, one finds effects beyond the instantaneous-velocity-induced time
Relativistic Quantum Clocks 65
dilation (a finding which is corroborated in [14]). These effects modify both the
time measured by the clock, and the precision of this measurement. This is a
strong indication that, in a quantum theory of spacetime, the clock hypothesis is
not satisfied.
For the fourth problem, we discussed in Sect. 3.4 the applicability of the
equivalence principle in the current model. To fully investigate this, we first need
to study trajectories in curved spacetime. One would expect the clock to be subject
to a tidal effect from the difference in gravitational field across the extent of the
clock system, and for this to therefore depend on the clock size and the underlying
curvature. However, it seems unlikely that this will allow us to address the issue of
incorporating the physical insight of the equivalence principle into a non-pointlike
quantum theory.
We now note some limitations of the model and our analysis. Firstly, the QFI
is obtained by optimizing over all physically allowable measurements, with no
regard to their accessibility to an experimentalist, nor to the available energy. A
consideration of the latter, for example its effect on the spacetime which the clock
measures, could result in a greater clock uncertainty.
Another potential limitation is the possibility that the results discussed here are
not fundamental, but in fact particular to the specific clock model. However, the
model is rather general for QFTCS: we seek a localized field, which therefore
demands some kind of potential, and we justify the use of boundaries (i.e. infinite
potential barriers) by noting that the shape of this potential should not play
a fundamental role. One can nonetheless make this more general, by instead
considering some trapping potential, or by making the boundaries only reflective to
certain frequency ranges. This results in a motion-induced coupling between trapped
‘local’ modes and global ones, the latter spanning the entire spacetime, and such a
coupling would therefore likely reduce the precision of the clock. If this is true, the
choice of boundaries used here can be seen as optimizing the clock precision over
all possible localizing potentials.
As a final remark, we note that this clock model is, in effect, a quantum version
of the common light-clock thought experiment often used to illustrate relativistic
time dilation (including by Einstein himself [69]).
Acknowledgements MPEL acknowledges support from the EPSRC via the Controlled Quantum
Dynamics CDT (EP/G037043/1), and IF acknowledges support from FQXi via the ‘Physics of the
observer’ award ‘Quantum Observers in a Relativistic World’.
References
4. A. Peres, Measurement of time by quantum clocks. Am. J. Phys. 48, 552 (1980)
5. S.L. Braunstein, C.M. Caves, G. Milburn, Generalized uncertainty relations: theory, examples,
and Lorentz invariance. Ann. Phys. 247, 135–173 (1996)
6. W. Rindler, Relativity: Special, General, and Cosmological (Oxford University Press, Oxford,
2006)
7. S. Hossenfelder, Minimal length scale scenarios for quantum gravity. Living Rev. Relativ. 16,
90 (2013)
8. H. Salecker, E. Wigner, Quantum limitations of the measurement of space-time distances.
Phys. Rev. 109, 571 (1958)
9. L. Burderi, T. Di Salvo, R. Iaria, Quantum clock: a critical discussion on spacetime. Phys.
Rev. D 93, 064017 (2016)
10. C.W. Misner, K.S. Thorne, J.A. Wheeler, Gravitation (Macmillan, London, 1973)
11. S.A. Fulling, Nonuniqueness of canonical field quantization in Riemannian space-time. Phys.
Rev. D 7, 2850 (1973)
12. W.G. Unruh, Notes on black-hole evaporation. Phys. Rev. D 14, 870 (1976)
13. G.T. Moore, Quantum theory of the electromagnetic field in a variable-length one-dimensional
cavity. J. Math. Phys. 11, 2679–2691 (1970)
14. K. Lorek, J. Louko, A. Dragan, Ideal clocks–a convenient fiction. Classical Quantum Gravity
32, 175003 (2015)
15. C.-W. Chou, D. Hume, T. Rosenband, D. Wineland, Optical clocks and relativity. Science 329,
1630–1633 (2010)
16. T.L. Nicholson, A new record in atomic clock performance. Ph.D. Thesis, University of
Colorado (2015)
17. N. Poli, C.W. Oates, P. Gill, G.M. Tino, Optical atomic clocks. Riv. Nuovo Cimento 36,
555–624 (2013)
18. L. von der Wense et al., Direct detection of the 229Th nuclear clock transition. Nature 533,
47–51 (2016)
19. C.J. Campbell et al., Single-ion nuclear clock for metrology at the 19th decimal place. Phys.
Rev. Lett. 108, 120802 (2012)
20. P. Komar et al., A quantum network of clocks. Nat. Phys. 10, 582–587 (2014)
21. O. Hosten, N.J. Engelsen, R. Krishnakumar, M.A. Kasevich, Measurement noise 100 times
lower than the quantum-projection limit using entangled atoms. Nature 529(7587), 505–508
(2016)
22. M. Bondarescu, R. Bondarescu, P. Jetzer, A. Lundgren, The potential of continuous, local
atomic clock measurements for earthquake prediction and volcanology, in EPJ Web of
Conferences, vol. 95 (EDP Sciences, Les Ulis, 2015), 04009
23. D.E. Bruschi, A. Datta, R. Ursin, T.C. Ralph, I. Fuentes, Quantum estimation of the
Schwarzschild spacetime parameters of the Earth. Phys. Rev. D 90, 124001 (2014)
24. A. Dragan, I. Fuentes, J. Louko, Quantum accelerometer: distinguishing inertial Bob from his
accelerated twin Rob by a local measurement. Phys. Rev. D 83, 085020 (2011)
25. R. Howl, L. Hackermüller, D.E. Bruschi, I. Fuentes, Gravity in the quantum lab. arXiv preprint
arXiv:1607.06666 (2016)
26. A. Derevianko, M. Pospelov, Hunting for topological dark matter with atomic clocks. Nat.
Phys. 10, 933–936 (2014)
27. M.D. Gabriel, M.P. Haugan, Testing the Einstein equivalence principle: atomic clocks and
local Lorentz invariance. Phys. Rev. D 41, 2943 (1990)
28. P.C. Davies, Quantum mechanics and the equivalence principle. Classical Quantum Gravity
21, 2761 (2004)
29. S. Reynaud, C. Salomon, P. Wolf, Testing general relativity with atomic clocks. Space Sci. Rev.
148, 233–247 (2009)
30. C. Wilson et al., Observation of the dynamical Casimir effect in a superconducting circuit.
Nature 479, 376–379 (2011)
31. P. Lähteenmäki, G.S. Paraoanu, J. Hassel, P.J. Hakonen, Dynamical Casimir effect in a
Josephson metamaterial. Proc. Natl. Acad. Sci. 110, 4234–4238 (2013)
32. J.L. Ball, I. Fuentes-Schuller, F.P. Schuller, Entanglement in an expanding spacetime. Phys.
Lett. A 359, 550–554 (2006)
33. I. Fuentes, R.B. Mann, E. Martín-Martínez, S. Moradi, Entanglement of Dirac fields in an
expanding spacetime. Phys. Rev. D 82, 045030 (2010)
34. I. Fuentes-Schuller, R.B. Mann, Alice falls into a black hole: entanglement in noninertial
frames. Phys. Rev. Lett. 95, 120404 (2005)
35. P.M. Alsing, I. Fuentes, Observer-dependent entanglement. Classical Quantum Gravity 29,
224001 (2012)
36. N. Friis, D.E. Bruschi, J. Louko, I. Fuentes, Motion generates entanglement. Phys. Rev. D 85,
081701 (2012)
37. D.E. Bruschi, I. Fuentes, J. Louko, Voyage to Alpha Centauri: entanglement degradation of
cavity modes due to motion. Phys. Rev. D 85, 061701 (2012)
38. G. Adesso, I. Fuentes-Schuller, M. Ericsson, Continuous-variable entanglement sharing in
noninertial frames. Phys. Rev. A 76, 062112 (2007)
39. N. Friis et al., Relativistic quantum teleportation with superconducting circuits. Phys. Rev.
Lett. 110, 113602 (2013)
40. N. Friis, M. Huber, I. Fuentes, D.E. Bruschi, Quantum gates and multipartite entanglement
resonances realized by nonuniform cavity motion. Phys. Rev. D 86, 105003 (2012)
41. D.E. Bruschi, A. Dragan, A.R. Lee, I. Fuentes, J. Louko, Relativistic motion generates quantum
gates and entanglement resonances. Phys. Rev. Lett. 111, 090504 (2013)
42. M. Ahmadi, D.E. Bruschi, I. Fuentes, Quantum metrology for relativistic quantum fields. Phys.
Rev. D 89, 065028 (2014)
43. M. Ahmadi, D.E. Bruschi, C. Sabín, G. Adesso, I. Fuentes, Relativistic quantum metrology:
exploiting relativity to improve quantum measurement technologies. Sci. Rep. 4, 4996 (2014)
44. C. Sabín, D.E. Bruschi, M. Ahmadi, I. Fuentes, Phonon creation by gravitational waves. New
J. Phys. 16, 085003 (2014). http://stacks.iop.org/1367-2630/16/i=8/a=085003
45. J. Lindkvist et al., Twin paradox with macroscopic clocks in superconducting circuits. Phys.
Rev. A 90, 052113 (2014)
46. D.E. Bruschi, J. Louko, D. Faccio, I. Fuentes, Mode-mixing quantum gates and entanglement
without particle creation in periodically accelerated cavities. New J. Phys. 15, 073052 (2013)
47. T. Opatrný, Number-phase uncertainty relations. J. Phys. A Math. Gen. 28, 6961 (1995)
48. N. Friis, A.R. Lee, J. Louko, Scalar, spinor, and photon fields under relativistic cavity motion.
Phys. Rev. D 88, 064028 (2013)
49. S. Fagnocchi, S. Finazzi, S. Liberati, M. Kormos, A. Trombettoni, Relativistic Bose–Einstein
condensates: a new system for analogue models of gravity. New J. Phys. 12, 095012 (2010)
50. N.D. Birrell, P.C.W. Davies, Quantum Fields in Curved Space (Cambridge University Press,
Cambridge, 1984)
51. V. Giovannetti, S. Lloyd, L. Maccone, Advances in quantum metrology. Nat. Photonics 5,
222–229 (2011). http://www.nature.com/nphoton/journal/v5/n4/full/nphoton.2011.35.html
52. H.M. Wiseman, G.J. Milburn, Quantum Measurement and Control (Cambridge University
Press, Cambridge, 2009)
53. J. Aasi et al., Enhanced sensitivity of the LIGO gravitational wave detector by using squeezed
states of light. Nat. Photonics 7, 613–619 (2013)
54. S.S. Szigeti, B. Tonekaboni, W.Y.S. Lau, S.N. Hood, S.A. Haine, Squeezed-light-enhanced
atom interferometry below the standard quantum limit. Phys. Rev. A 90, 063630 (2014)
55. B. Altschul et al., Quantum tests of the Einstein equivalence principle with the STE-QUEST
space mission. Adv. Space Res. 55, 501–524 (2015)
56. J. Lindkvist, C. Sabín, G. Johansson, I. Fuentes, Motion and gravity effects in the precision of
quantum clocks. Sci. Rep. 5, 10070 (2015).
57. M. Born, The theory of the rigid electron in the kinematics of the relativity principle. Ann.
Phys. (Leipzig) 30, 1 (1909)
58. A. Monras, Optimal phase measurements with pure Gaussian states. Phys. Rev. A 73, 033821
(2006)
59. E.A. Desloge, Nonequivalence of a uniformly accelerating reference frame and a frame at rest
in a uniform gravitational field. Am. J. Phys. 57, 1121–1125 (1989)
60. F. Dahia, P.F. da Silva, Static observers in curved spaces and non-inertial frames in Minkowski
spacetime. Gen. Relativ. Gravit. 43, 269–292 (2011)
61. M. Lock, I. Fuentes, Dynamical Casimir effect in curved spacetime. New J. Phys. 19, 073005
(2017)
62. L.E. Parker, The creation of particles in an expanding universe. Ph.D. Thesis, Harvard
University (1966)
63. L. Parker, Particle creation and particle number in an expanding universe. J. Phys. A Math.
Theor. 45, 374023 (2012)
64. S. Fulling, P. Davies, Radiation from a moving mirror in two dimensional space-time:
conformal anomaly, in Proceedings of the Royal Society of London A: Mathematical, Physical
and Engineering Sciences, vol. 348, 393–414 (The Royal Society, London, 1976)
65. L. Ford, A. Vilenkin, Quantum radiation by moving mirrors. Phys. Rev. D 25, 2569 (1982)
66. C. Braggio et al., A novel experimental approach for the detection of the dynamical Casimir
effect. Europhys. Lett. 70, 754 (2005)
67. J.R. Johansson, G. Johansson, C. Wilson, F. Nori, Dynamical Casimir effect in a supercon-
ducting coplanar waveguide. Phys. Rev. Lett. 103, 147003 (2009)
68. J. Barbour, The End of Time: The Next Revolution in Physics (Oxford University Press, Oxford,
2001)
69. A. Einstein, Zur Elektrodynamik bewegter Körper. Ann. Phys. 322, 891–921 (1905)
Causality–Complexity–Consistency:
Can Space-Time Be Based on Logic
and Computation?
What is causality?—The notion has been defined in different ways and turned out
to be highly problematic, both in Physics and Philosophy. This observation is not
new, as is nicely shown by Bertrand Russell’s quote [37] from more than a century
ago:
The law of causality [. . . ] is a relic of a bygone age, surviving, like the monarchy, only
because it is erroneously supposed to do no harm.
Indeed, a number of attempts have been made to abandon causality and replace
global by only local assumptions (see, e.g., [33]). A particular motivation is
given by the difficulty of explaining quantum non-local correlations according to
Reichenbach’s principle [36]. The latter states that in a given (space-time) causal
structure, correlations stem from a common cause (in the common past) or a direct
influence from one of the events to the other. In the case of violations of Bell’s
inequalities, a number of results indicate that explanations through some mechanism
as suggested by Reichenbach’s principle either fail to explain the correlations [11]
or are unsatisfactory since they require infinite speed [3, 4, 19, 39] or precision [46].
All of this may serve as a motivation for dropping the assumption of a global causal
structure in the first place.
Closely related to causality is the notion of randomness: In [17], a piece of
information is called freely random if it is statistically independent from all other
pieces of information except the ones in its future light cone. Clearly, when the
assumption of an initially given causal structure is dropped, such a definition is
not possible any longer. One may choose to consider freely random pieces of
information as being more fundamental than a space-time structure—in fact, the
latter can then be seen as emerging from the former: If a piece of information is
free, then any piece correlated to it is in its causal future.1 But how can we define
the randomness of an object purely intrinsically and independently of any context?
For further motivation, note that Colbeck and Renner’s definition of random-
ness [17] is consistent with full determinism: A random variable with trivial
distribution is independent of every other (even itself). How can we exclude this
and additionally ask for the possibility in principle of a counterfactual outcome,
i.e., that the random variable X could have taken a value different from the one it
actually took? Intuitively, this is a necessary condition for freeness. The question
whether the universe (or a closed system) starting from a given state A always ends
up in the same state B seems to be meaningless: Even if rewinding were possible,
and two runs could be performed, the outcomes B1 and B2 that must be compared
never exist in the same reality since rewinding erases the result of the rewound
run (Renner, personal communication, 2013): “B1 = B2?” is not merely a question which
cannot be answered in principle, but one that cannot even be formulated precisely. In
summary, defining freeness of a choice or a random event, understood as the actual
possibility of two (or more) well-distinguishable options, seems hard even when a
causal structure is in place.2
1
This change of perspective reflects the debate, three centuries ago, between Newton and Leibniz
on the nature of space and time, in particular on how fundamental this causal structure is to be
considered.
2
In this context and as a reply to [25], we feel that the notion of a choice between different possible
futures by an act of free will put forward there is not only hard to formalize but also not much more
We look for an intrinsic definition of randomness that takes into account only the
“factuality,” i.e., the state of the closed system in question. Clearly, such a definition
is hard to imagine for a single bit, but it can be defined in a natural way for (long)
strings of bits, namely its length minus the work value (normalized through dividing
by kT) of a physical representation of the string with respect to some extraction
device; we relate this quantity to the string’s “best compression.”
We test the alternative view of randomness for physical meaning. More specifi-
cally, we find it to be functional in the context of non-local correlations: A reasoning
yielding a mechanism similar to that of the probabilistic regime can be realized, with
the conceptual advantage of not requiring us to relate the outcomes of measurements that
cannot all actually be carried out. That mechanism is: Random inputs to a non-local
system plus no-signaling guarantee random outputs.
In the second half of this text, we consider consequences of abandoning (space-
time) causality as being fundamental. In a nutshell, we put logical reversibility
at the center of our attention here. We argue that if a computation on a Turing
machine is logically reversible, then a “second law” emerges: The complexity of the
tape’s content cannot decrease in time. This law holds without failure probability,
in contrast to the “usual” second law, and implies the latter. In the same spirit, we
propose to define causal relations between physical points, modeled by bit strings,
as given by the fact that “the past is entirely contained in the future,” i.e., nothing is
forgotten.3 We also study the relationship between full causality (which
we aim at dropping) and mere logical consistency (which we never wish to abandon)
in the complexity view: They are different from each other as soon as more than two
parties are involved.
2 Preliminaries
Let U be a fixed universal Turing machine (TM).4 For a finite or infinite string s, the
Kolmogorov complexity [29, 31] K(s) = K_U(s) is the length of the shortest program
for U such that the machine outputs s. Note that K(s) can be infinite if s is.
Let a = (a_1, a_2, …) be an infinite string. Then

a_{[n]} := (a_1, …, a_n, 0, …).
innocent than Everettian relative states [21]—after all, the latter are real (within their respective
branches of the wave function). We have become familiar with the ease of handling probabilities
and cease to realize how delicate they are ontologically.
3
It has been argued that quantum theory violates the causal law due to random outcomes of
measurements. Hermann [27] argued that the law of causality does not require the past to determine
the future, but vice versa. This is in accordance with our view of logical reversibility: There can be
information growth, but there can be no information loss.
4
The introduced asymptotic notions are independent of this choice.
72 Ä. Baumeler and S. Wolf
We call a string a with this property incompressible. We also use K(a_{[n]}) = Θ(n),
as well as

K(a) ≈ 0  :⟺  lim_{n→∞} K(a_{[n]})/n = 0  ⟺  K(a_{[n]}) = o(n).

Note that computable strings a satisfy K(a) ≈ 0, and that incompressibility is, in
this sense, the extreme case of uncomputability.
Generally, for functions f(n) and g(n) ≉ 0, we write f ≈ g if f/g → 1.
Independence of a and b is then5

K(a | b) ≈ K(a)

or, equivalently,

K(a, b) ≈ K(a) + K(b).

If we introduce the conditional versions of these notions, conditional independence
of a and b given c reads

K(a | b, c) ≈ K(a | c)

or, equivalently,

K(a, b | c) ≈ K(a | c) + K(b | c).
5
This is inspired by Cilibrasi and Vitányi [16], where (joint) Kolmogorov complexity—or, in
practice, any efficient compression method—is used to define a distance measure on sets of bit
strings (such as literary texts or genetic information of living beings). The resulting structure in
that case is a distance measure, and ultimately a clustering as a binary tree.
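The idea of footnote 5, replacing the uncomputable K by a practical compressor, can be sketched in a few lines. Here zlib's compressed length stands in for K (a crude proxy of our own choosing, so the ≈-relations hold only as rough tendencies, not exactly):

```python
import random
import zlib

def C(s: bytes) -> int:
    """Compressed length: a crude, computable stand-in for Kolmogorov complexity."""
    return len(zlib.compress(s, 9))

rng = random.Random(0)
a = bytes(rng.getrandbits(8) for _ in range(10_000))   # incompressible-looking string
b = bytes(rng.getrandbits(8) for _ in range(10_000))   # generated independently of a

# Independence in the sense K(a, b) ≈ K(a) + K(b): the joint barely compresses...
assert C(a + b) > 0.9 * (C(a) + C(b))
# ...whereas maximal dependence gives K(a, a) ≈ K(a): the second copy is nearly free.
assert C(a + a) < 1.2 * C(a)
```

The same substitution underlies the clustering application of Cilibrasi and Vitányi mentioned in the footnote.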
6
The Church-Turing thesis, first formulated by Kleene [28], states that any physically possible
process can be simulated by a universal Turing machine.
State of the Art Bennett [13] claimed the fuel value of a string S to be its length
minus K(S),

W(S) ≈ (len(S) − K(S)) kT ln 2 ,

whereas the entropy-based analysis of [20] puts it at

W(S) ≈ (len(S) − D(S)) kT ln 2 ,

where the “defect” D(S) is bounded from above and below by a smooth Rényi
entropy of the distribution of S from the demon’s viewpoint, modeling her igno-
rance. The authors of [20] do not consider the algorithmic aspects of the demon’s
actions extracting the free energy, but the effect of the demon’s a priori knowledge on S.
If we model the demon as an algorithmic apparatus, then we should specify the
form of that knowledge explicitly: Vanishing conditional entropy means that S is
uniquely determined from the demon’s viewpoint. Does this mean that the demon
possesses a copy of S, or the ability to produce such a copy, or pieces of information
that uniquely determine S? This question sits at the origin of the gap between the two
described groups of results; it is maximal when the demon fully “knows” S which,
however, still has maximal complexity even given her internal state (see below for
an example). In this case, the first result claims W(S) to be 0, whereas the second
claims W(S) ≈ len(S).
If the demon holds a copy of S (even an incompressible one), the pair (S, S) has fuel value

W(S, S)/kT ≈ len(S, S) − K(S, S) ≈ 2 · len(S) − K(S) ≈ len(S).
In this case, knowledge has immediate work value.
The Model We assume the demon to be a universal Turing machine U the memory
tape of which is sufficiently long for the tasks and inputs in question, but finite.
The tape initially contains S, the string the fuel value of which is to be determined,
X, a finite string modeling the demon’s knowledge about S, and 0’s for the rest of
the tape. After the extraction computation, the tape contains, at the bit positions
initially holding S, a (shorter) string P plus 0^{len(S)−len(P)}, whereas the rest of the tape
is (again) the same as before work extraction. The demon’s operations are logically
reversible and can, hence, be carried out thermodynamically reversibly [23]. Logical
reversibility in our model is the ability of the same demon to carry out the backward
computation step by step, i.e., from P‖X to S‖X.7 We denote by E(S|X) the maximal
amount of 0-bits extractable logically reversibly from S given the knowledge X, i.e.,

W(S|X) = E(S|X) kT ln 2.
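The conversion factor kT ln 2 is the Landauer work per bit; plugging in numbers shows the scales involved (the temperature below is our illustrative assumption; the Boltzmann constant is the exact SI value):

```python
import math

k_B = 1.380649e-23        # Boltzmann constant in J/K (exact SI value)
T = 300.0                 # assumed operating temperature in kelvin (illustrative)
per_bit = k_B * T * math.log(2)   # work per extracted 0-bit: kT ln 2

# e.g. a fully known, incompressible-looking string of one megabyte:
E = 8 * 10**6             # number of extractable 0-bits
W = E * per_bit           # total extractable work in joules
```

At 300 K this gives roughly 2.9 × 10⁻²¹ J per bit, so even the fully known megabyte is worth only about 2.3 × 10⁻¹⁴ J, which illustrates how thermodynamically tiny the fuel value of information is.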
7
Note that this is the natural way of defining logical reversibility in our setting with a fixed input
and output but no sets nor bijective maps between them.
such that
is computable and bijective. Given the two (possibly irreversible) circuits computing
the compression and its inverse, one can obtain a reversible circuit realizing the
function in which no further input or output bits are involved. This can be achieved
by first implementing all logical operations with Toffoli gates and uncomputing all
junk [12] in both of the circuits. Both resulting circuits then still have the property
that the input is part of the output. As a second step, we can simply combine
the two, where the first circuit’s first output becomes the second’s second input,
and vice versa. Roughly speaking, the first circuit computes the compression and
the second reversibly uncomputes the raw data. The combined circuit has only the
compressed data (plus the 0’s) as output, on the bit positions carrying the input
previously. (The depth of this circuit is roughly the sum of the depths of the two
irreversible circuits for the compression and for the decompression, respectively.)
We assume that circuit to be hard-wired in the demon’s head. A typical example of
a compression algorithm that can be used is that of Ziv and Lempel [47].
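The two-circuit combination can be sketched abstractly. In the toy model below (our own illustration, not the authors' construction), each Toffoli-plus-uncomputation circuit is modeled as a map that keeps its input alongside its output; chaining the compression circuit with the reversed decompression circuit, wires swapped in between, erases the raw data reversibly. The Gray-code map stands in for a bijective compression:

```python
def f(x: int) -> int:
    """Toy bijective 'compression': the binary Gray code (illustrative stand-in)."""
    return x ^ (x >> 1)

def f_inv(g: int) -> int:
    """Inverse Gray code, i.e. the 'decompression'."""
    x = 0
    while g:
        x ^= g
        g >>= 1
    return x

# After the Toffoli/uncomputation step, each circuit keeps its input as part of its output:
def circuit_compress(x):        # x -> (x, f(x))
    return (x, f(x))

def circuit_decompress(p):      # p -> (p, f_inv(p))
    return (p, f_inv(p))

def circuit_decompress_reversed(pair):
    """Step-by-step reversal of circuit_decompress: (p, f_inv(p)) -> p."""
    p, x = pair
    assert x == f_inv(p), "only defined on outputs of circuit_decompress"
    return p

def reversible_compress(x: int) -> int:
    raw, compressed = circuit_compress(x)
    # swap the wires, then run the decompression circuit backwards to erase the raw data
    return circuit_decompress_reversed((compressed, raw))

# The combined map is a bijection: only the compressed value remains, yet nothing is lost.
assert all(f_inv(reversible_compress(x)) == x for x in range(256))
```

The point of the construction survives the toy setting: the combined circuit's output occupies only the positions previously carrying the input, exactly as described above.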
Upper Bound on the Fuel Value We have the following upper bound on E(S|X):

E(S|X) ≤ len(S) − K_U(S|X).

The reason is that the demon is only able to carry out the computation in question
(logically, hence, thermodynamically) reversibly if she is able to carry out the
reverse computation as well. Therefore, the string P must be at least as long as
the shortest program for U generating S if X is given.
Although the same is not true in general, this upper bound is tight if K_U(S|X) = 0.
The latter means that X itself is a program for generating an additional copy
of S. The demon can then bit-wise XOR this new copy onto the original S on the
tape, hereby producing 0^{len(S)} reversibly to replace the original S (at the same time
preserving the new one, as reversibility demands). When Bennett’s “uncomputing
trick” is used—allowing for making any computation by a Turing machine logically
Fig. 3 Knowing S: (a) possessing a copy of S; (b) possessing a program P that generates S;
(c) possessing a description (“S = Ω_N”) that determines S uniquely without enabling its generation
reversible [12]—then a history string H is written to the tape during the computation
of S from X such that after the XORing, the demon can, going back step by step,
uncompute the generated copy of S and end up in the tape’s original state—except
that the original S is now replaced by 0^{len(S)}: This results in a maximal fuel value
matching the (in this case trivial) upper bound. Note that this harmonizes with [20]
if vanishing conditional entropy is so established.
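The XOR-erasure step can be made concrete in a few lines (our own illustration: the "program X" is modeled as a Python callable that regenerates S, and the bit values are arbitrary; the history-keeping of Bennett's trick is implicit in the copy remaining available for the backward pass):

```python
def run_X() -> list:
    """Stand-in for X, a program producing a copy of S (contents are illustrative)."""
    return [1, 0, 1, 1, 0, 0, 1, 0]

S = run_X()              # tape region initially holding S
copy = run_X()           # the demon computes a fresh copy of S from X

# Bit-wise XOR of the copy onto S: since s XOR s = 0, this leaves 0^len(S) behind,
# and the step is reversible because XOR is its own inverse.
tape = [s ^ c for s, c in zip(S, copy)]
assert tape == [0] * len(S)

# Backward pass: XORing the (still preserved) copy in again restores S exactly,
# after which the copy itself can be uncomputed step by step via the history string.
restored = [t ^ c for t, c in zip(tape, copy)]
assert restored == S
```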
Discussion We contrast our bounds with the entropy-based results of [20]: Accord-
ing to the latter, a demon having complete knowledge of S is able to extract maximal
work: E(S) ≈ len(S). What does “knowing S” mean? (see Fig. 3). We have seen
that the results are in accordance with ours if the demon’s knowledge consists of
(a) a copy of S, or at least of (b) the ability to algorithmically reconstruct S, based
on a known program P, as discussed above. It is, however, possible (c) that the
demon’s knowledge is of different nature, merely determining S uniquely without
providing the ability to build S. For instance, let the demon’s knowledge about S
be: “S equals the first N bits Ω_N of the binary expansion of Ω.” Here, Ω is the
so-called halting probability [15] of a fixed universal Turing machine (e.g., the
demon U itself). Although there is a short description of S in this case, and S is thus
uniquely determined in an entropic sense, there is no set of instructions shorter than
S enabling the demon to generate S—which would be required for work extraction
from S according to our upper bound. In short, this gap reflects the one between the
“unique-description complexity”8 and the Kolmogorov complexity.
8
A diagonal argument, called Berry paradox, shows that the notion of “description complexity”
cannot be defined generally for all strings.
X ⊕ Y = A · B. (1)
This system is no-signaling, i.e., the joint input-output behavior is useless for
message transmission. (Interestingly, on the other hand, the non-locality of the
correlation means that classically speaking, signaling would be required to explain
the behavior since shared classical information is insufficient.) According to a
result by Fine [22], the non-locality of the system (i.e., conditional distribution)
P_{XY|AB}, which means that it cannot be written as a convex combination of products

Fig. 4 The traditional (a) vs. the new (b) view: non-locality à la Popescu/Rohrlich (PR) plus
no-signaling leads to the output inheriting randomness (a) or complexity (b), respectively, from the
input
P_{X|A} · P_{Y|B}, is equivalent to the fact that there exists no “roof distribution” P′_{X_0 X_1 Y_0 Y_1}
such that

P′_{X_i Y_j} = P_{XY|A=i, B=j}

for all (i, j) ∈ {0, 1}². In this view, non-locality means that the outputs to alternative
inputs cannot consistently coëxist. The counterfactual nature of this reasoning has
already been pointed out by Specker [38]: “In einem gewissen Sinne gehören aber
auch die scholastischen Spekulationen über die Infuturabilien hieher, das heisst
die Frage, ob sich die göttliche Allwissenheit auch auf Ereignisse erstrecke, die
eingetreten wären, falls etwas geschehen wäre, was nicht geschehen ist.”—“In some
sense, this is also related to the scholastic speculations on the infuturabili, i.e., the
question whether divine omniscience even extends to what would have happened if
something had happened that did not happen.” Zukowski and Brukner [48] suggest
that non-locality is to be understood in terms of such infuturabili, called there
“counterfactual definiteness.”
We intend to challenge this view. Let us first restate in more precise terms the
counterfactual reasoning. Such reasoning intrinsically assumes or concludes
statements of the kind that some piece of classical information, such as a bit U, exists
or does not exist. What does this mean? Classicality of information is an idealized
notion implying that it can be measured without disturbance and that the outcome of
a measurement is always the same (an idealization which requires the classical bit
to be represented in a redundantly extended way over an infinite number of degrees
of freedom). It thus makes sense to say that a classical
bit U exists, i.e., has taken a definite value.
In this way of speaking, Fine's theorem [22] reads: "The outputs cannot exist
before the inputs do." Let us make this qualitative statement more precise. We
assume a perfect PR box, i.e., a system always satisfying $X \oplus Y = A\cdot B$. Note
that this equation alone does not uniquely determine $P_{XY|AB}$ since the marginal of $X$,
for instance, is not determined. If, however, we additionally require no-signaling,
then the marginals, such as $P_{X|A=0}$ or $P_{Y|B=0}$, must be perfectly unbiased under the
assumption that all four $(X,Y)$-combinations, i.e., $(0,0)$, $(0,1)$, $(1,0)$, and $(1,1)$,
are possible. To see this, assume on the contrary that $P_{X|A=0,B=0}(0) > 1/2$. By
the PR condition (1), we can conclude the same for $Y$: $P_{Y|A=0,B=0}(0) > 1/2$.
By no-signaling, we also have $P_{X|A=0,B=1}(0) > 1/2$. Using symmetry, and no-signaling
again, we obtain both $P_{X|A=1,B=1}(0) > 1/2$ and $P_{Y|A=1,B=1}(0) > 1/2$.
This contradicts the PR condition (1) since two bits which are both biased towards $0$
cannot differ with certainty. Therefore, our original assumption was wrong: The
outputs must be perfectly unbiased. Altogether, this means that $X$ as well as $Y$
cannot exist (i.e., take a definite value—actually, there cannot even exist a classical
value arbitrarily weakly correlated with one of them) before, for some nontrivial
deterministic function $f\colon \{0,1\}^2 \to \{0,1\}$, the classical bit $f(A,B)$ exists. The
paradoxical aspect of non-locality—at least if a causal structure is in place—now
consists of the fact that fresh pieces of information come to existence in a spacelike-separated
way but are nonetheless perfectly correlated.
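The unbiased-marginals argument can be checked mechanically on the canonical PR-box distribution; the following is a minimal Python sketch (the probability table is the standard one, nothing in it is specific to this text):

```python
from itertools import product

# Canonical PR box: P(x, y | a, b) = 1/2 if x XOR y = a AND b, and 0 otherwise.
def pr(x, y, a, b):
    return 0.5 if (x ^ y) == (a & b) else 0.0

# Marginal of X given inputs (a, b); by symmetry the same holds for Y.
def marg_x(x, a, b):
    return sum(pr(x, y, a, b) for y in (0, 1))

for a, b in product((0, 1), repeat=2):
    # No-signaling: the marginal of X does not depend on b ...
    assert marg_x(0, a, b) == marg_x(0, a, 1 - b)
    # ... and, as argued above, it is perfectly unbiased.
    assert marg_x(0, a, b) == 0.5
print("PR box: no-signaling and unbiased marginals check out")
```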
80 Ä. Baumeler and S. Wolf
$$x_i \oplus y_i = a_i \cdot b_i\,. \tag{2}$$
Obviously, the intuition is that the strings stand for the inputs and outputs of a PR
box. Yet, no dynamic meaning is attached to the strings anymore (or to the "box,"
for that matter) since there is no free choice of an input—i.e., a choice that "could
also have been different" (a notion we discussed and suspect to be hard to define
precisely in the first place)—and no generation of an output as a function of an input;
all we have are four fixed strings satisfying the PR condition (2). However, nothing
prevents us from defining this (static) situation to be no-signaling:
$$K(x \mid a) \approx K(x \mid a, b) \quad\text{and}\quad K(y \mid b) \approx K(y \mid a, b)\,. \tag{3}$$
Recall the mechanism which the maximal non-locality displayed by the PR box
enables: If the inputs are not entirely fixed, then the outputs must be completely unbi-
ased as soon as the system is no-signaling. We can now draw a similar conclusion,
yet entirely within actual—and without having to refer to counterfactual—data:
If the inputs are incompressible and independent, and no-signaling holds, then
the outputs must be uncomputable.
For a proof of this, let $(a,b,x,y) \in (\{0,1\}^{\mathbb N})^4$ with $x \oplus y = a\cdot b$ (bit-wise),
no-signaling (3), and
$$K(a,b) \approx 2n\,.$$
Then
$$K(a\cdot b \mid b) \approx n/2\,:$$
Note first that $b_i = 0$ implies $a_i\cdot b_i = 0$, and second that any further compression
of $a\cdot b$, given $b$, would lead to "structure in $(a,b)$," i.e., a possibility of describing
(programming) $a$, given $b$, in fewer than $n$ bits and, hence, $(a,b)$ in fewer than $2n$ bits.
Observe now that, by the PR condition,
$$K(x \mid a, b) \approx K(y \mid a, b)\,,$$
which implies, together with no-signaling (3),
$$K(x \mid a) \approx 0 \;\Longrightarrow\; K(y \mid b) \approx 0$$
and
$$K(y \mid b) \approx 0 \;\Longrightarrow\; K(x \mid a) \approx 0\,,$$
i.e., the two expressions vanish simultaneously. We show that, in fact, they both fail
to be of order $o(n)$. In order to see this, assume $K(x \mid a) \approx 0$ and $K(y \mid b) \approx 0$.
Hence, there exist programs $P_n$ and $Q_n$ (both of length $o(n)$) for functions $f_n$ and $g_n$
with
$$x_{[n]} = f_n(a_{[n]}) \quad\text{and}\quad y_{[n]} = g_n(b_{[n]})\,. \tag{6}$$
For fixed (families of) functions $f_n$ and $g_n$, asymptotically how many $(a_{[n]}, b_{[n]})$ can at
most exist that satisfy (6)? The question boils down to a parallel-repetition analysis
of the PR game: A result by Raz [35] states that when a game which cannot be
won with certainty is repeated in parallel, then the success probability for all runs
together is exponentially (in the number of repetitions) decreasing; this implies in
our case that the number in question is of order $(2-\Theta(1))^{2n}$. Therefore, the two
programs $P_n$ and $Q_n$ together with the index, of length
$$(1-\Theta(1))\cdot 2n\,,$$
of the correct pair $(a,b)$ within the list of length $(2-\Theta(1))^{2n}$ lead to a program,
generating $(a,b)$, that has length
$$o(n) + (1-\Theta(1))\cdot 2n\,,$$
in contradiction to $K(a,b) \approx 2n$.
$$X \oplus Y = [A{=}m]\cdot[B{=}1]\,, \tag{7}$$
and
$$K(a,b) \approx (\log m + 1)\,n\,,$$
i.e., the string $a\|b$ is maximally incompressible given the promise; the system is no-signaling
(3); the fraction of quadruples $(a_i, b_i, x_i, y_i)$, $i = 1,\ldots,n$, satisfying (7) is
of order $1-\Theta(1/m^2)$. Then $K(x) = \Theta(n)$.
Let us prove this statement. First, $K(a,b)$ being maximal implies
$$K(a_{=m,b=1} \mid b) \gtrsim \frac{n}{m}\,. \tag{8}$$
The fraction of 1's in $b$ must, asymptotically, be $1/m$ due to the string's incompressibility.
If we condition on these positions, the string $a_{=m,b=1}$ is incompressible,
since otherwise there would be the possibility of compressing $(a,b)$.
Now, we have
$$K(a_{=m,b=1} \mid b) \lesssim K(x) + K(y \mid b) + h\big(\Theta(1/m^2)\big)\,n\,,$$
since one possibility for "generating" the string $a_{=m,b=1}$, from position $1$ to $n$, is
to generate $x_{[n]}$ and $y_{[n]}$ as well as the string indicating the positions where (7) is
violated, the complexity of the latter being at most⁹
$$\log\binom{n}{\Theta(1/m^2)\,n} \approx h\big(\Theta(1/m^2)\big)\,n\,.$$
Let us compare this with $1/m$: Although the binary entropy function has unbounded slope
at $0$, we have
$$h\big(\Theta(1/m^2)\big)\,n \leq \frac{n}{3m}$$
if $m$ is sufficiently large. To see this, observe first that the dominant term of $h(x)$ for
small $x$ is $-x\log x$, and second that $\Theta(1/m^2)\cdot\log\Theta(m^2) = \Theta(\log m/m^2) = o(1/m)$.
Together with (8), this yields
$$K(y \mid b) \gtrsim \frac{2n}{3m} - K(x) \tag{9}$$
if m is chosen sufficiently large. On the other hand,
⁹ Here, $h$ is the binary entropy $h(p) = -p\log p - (1-p)\log(1-p)$. Usually, $p$ is a probability, but
$h$ is invoked here merely as an approximation for binomial coefficients.
$$K(x \mid a) \not\approx 0 \quad\text{or}\quad K(y \mid b) \not\approx 0\,.$$
We illustrate the argument with the example of the magic-square game [2]: Let
$(a,b,x,y) \in (\{1,2,3\}^{\mathbb N})^2 \times (\{1,2,3,4\}^{\mathbb N})^2$ be the quadruple of the inputs and
outputs, respectively, and assume that the pair $(a,b)$ is incompressible as well
as $K(x \mid a) \approx 0 \approx K(y \mid b)$. Then there exist $o(n)$-length programs $P_n$, $Q_n$ such
that $x_{[n]} = P_n(a_{[n]})$ and $y_{[n]} = Q_n(b_{[n]})$. The parallel-repetition theorem [35]
implies that the length of a program generating $(a_{[n]}, b_{[n]})$ is, including the employed
sub-routines $P_n$ and $Q_n$, of order $(1-\Theta(1))\cdot\mathrm{len}(a_{[n]}, b_{[n]})$—in contradiction to the
incompressibility of $(a,b)$.
An All-or-Nothing Flavor to the Church-Turing Hypothesis Our lower bound on
$K(x \mid a)$ or on $K(y \mid b)$ means that if the experimenters are given access to an
incompressible number (such as $\Omega$) for choosing their measurement bases, then
the measured photon (in at least one of the two labs) is forced to generate an
uncomputable number as well, even given the string determining its basis choices.
Roughly speaking, there is either no incompressibility at all in the world, or it is
full of it. We can interpret that as an all-or-nothing flavor attached to the Church-Turing
hypothesis: Either no physical system at all can carry out "beyond-Turing"
computations, or even a single photon can.
General Definition of (Non-)locality Without Counterfactuality We propose the
following definition of when a no-signaling quadruple $(a,b,x,y) \in (\{0,1\}^{\mathbb N})^4$
(where $a, b$ are the "inputs" and $x, y$ the outputs) is local: There must exist $\lambda \in
(\{0,1\}^{\mathbb N})^{\mathbb N}$ with $K(a,b \mid \lambda) \approx K(a,b)$ such that
$$K(x \mid a, \lambda) \approx 0 \quad\text{and}\quad K(y \mid b, \lambda) \approx 0\,. \tag{11}$$
Locality trivially holds if
$$K(a,b) \approx 0 \quad\text{or}\quad K(x,y) \approx 0\,,$$
since we can set $\lambda := (x,y)$. At the other end of the scale, we expect that for any non-local
"system," the fact that $K(a,b)$ is maximal implies that $x$ or $y$ is conditionally
uncomputable, given $a$ and $b$, respectively.
It is a natural question whether the given definition harmonizes with the
probabilistic understanding. Indeed, the latter can be seen as a special case of the
former: If the (fixed) strings are typical sequences of a stochastic process, our non-locality
definition implies non-locality of the corresponding conditional distribution.
The reason is that a hidden variable of the distribution immediately gives rise,
through sampling, to a $\lambda$ in the sense of (11). Note, however, that our formalism
is strictly more general since asymptotically, almost all strings fail to be typical
sequences of such a process.
It has already been observed that the notion of Kolmogorov complexity can
allow, in principle, for thermodynamics independent of probabilities or ensembles:
Zurek [49] defines physical entropy $H_p$ to be
$$H_p = K(M) + H(S \mid M)\,,$$
where $M$ stands for the collected data at hand while $H(S \mid M)$ is the remaining
conditional Shannon entropy of the microstate $S$ given $M$. That definition of a
macrostate ($M$) is subjective since it depends on the available data. How instead can
the macrostate—and entropy, for that matter—be defined objectively? We propose
to use the Kolmogorov sufficient statistics [24] of the microstate: For any $k \in \mathbb N$, let
$M_k$ be the smallest set such that $S \in M_k$ and $K(M_k) \leq k$ hold. Let further $k_0$ be the
value of $k$ at which the log-size of the set, $\log|M_k|$, becomes linear with slope $-1$.
Intuitively speaking, $k_0$ is the point beyond which there is no more "structure" to
exploit for describing $S$ within $M_{k_0}$: $S$ is a "typical element" of the set $M_{k_0}$. We define
$M(S) := M_{k_0}$ to be $S$'s macrostate. It yields a program generating $S$ of minimal
length
$$K(S) \approx K(M(S)) + \log|M(S)|\,.$$
The fuel value (as discussed in Sect. 3) of a string $S \in \{0,1\}^N$ is now related to the
macrostate $M(S) \ni S$ by
$$E(S) \approx N - \big(K(M(S)) + \log|M(S)|\big)$$
(see Fig. 5): Decisive is neither the complexity of the macrostate nor its log-size
alone, but their sum.
A notion defined in a related way is the sophistication or interestingness
as discussed by Aaronson [1], who investigates the process in which milk is poured
into coffee (see Fig. 6). Whereas the initial and final states are "simple" and
"uninteresting," the intermediate (non-equilibrium) states display a rich structure;
here, the sophistication—and also $K(M)$ for our macrostate $M$—becomes maximal.
[Fig. 5: plot of $\log|M_k|$ versus $k$, indicating $k_0$, $K(S)$, $K(M)$, $\log|M|$, and $E(S)$]
During the process under consideration, neither the macrostate's complexity nor
its size is monotonic in time: Whereas $K(M)$ has a maximum in the non-equilibrium
phase of the process, $\log|M|$ has a minimum there (see Fig. 7).
On the other hand, the complexity of the microstate, and with it the entropy, can
drop spontaneously only with exponentially small probability:
$$\mathrm{Prob}[\,S_1 - S_2 \geq s\,k\ln 2\,] = 2^{-s}\,,$$
where $S_1$ and $S_2$ denote the entropies before and after the (alleged) drop.
time can arise. A context-free definition of randomness (or free will, for that matter)
has the advantage of not depending on the "possibility that something could have been
different from how it was," a metaphysical condition we came to prefer to avoid.
The Traditional Second Law from Complexity Increase It is natural to ask what
the connection between logical reversibility and complexity on one side and the
traditional second law on the other is. We show that the latter emerges from
increasing complexity—including the exponential error probabilities.
Let $x_1$ and $x_2$ be the microstates of a closed system at times $t_1 < t_2$ with
$K(x_2) \gtrsim K(x_1)$. If the macrostates $M_1$ and $M_2$ of $x_1$ and $x_2$, respectively, have small
Kolmogorov complexity (such as traditional thermodynamical equilibrium states
characterized by global parameters like volume, temperature, pressure, etc.), then
$$|M_1| \lesssim |M_2|\,:$$
If the macrostates are simple, then their size is non-decreasing. Note that this
law is still compatible with the exponentially small error probability ($2^{-\Theta(n)}$) in the
traditional view of the second law for a spontaneous immediate drop of entropy
by $\Theta(n)$: The gap opens when the simple thermodynamical equilibrium macrostate
of a given microstate differs from our macrostate defined through the Kolmogorov
statistics. This can occur if, say, the positions and momenta of the molecules of
some (innocent-, i.e., general-looking) gas encode, e.g., $\pi$, and have essentially zero
complexity.
We can now close a logical circle: We started from the converse of Landauer's
principle, went through work extraction, and ended up with a complexity-theoretic
view of the second law; we have returned to our starting point.
Landauer's Principle, Revisited The (immediate) transformation of a string $S$ to the
$0$-string of the same length requires free energy at least
$$K(S)\,kT\ln 2\,,$$
which is then dissipated as heat to the environment. For every concrete lossless
compression algorithm $C$,
$$\mathrm{len}(C(S))\,kT\ln 2 + \Theta(1)$$
is, on the other hand, an upper bound on the required free energy.
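This upper bound is computable with any off-the-shelf compressor; a minimal Python sketch, using zlib as the algorithm $C$ and an assumed room temperature $T$ (the sample string is arbitrary):

```python
import zlib

k_B = 1.380649e-23   # Boltzmann constant, J/K
T = 300.0            # assumed room temperature, K

S = b"the quick brown fox jumps over the lazy dog " * 20

# The exact Landauer cost involves the uncomputable K(S); a concrete compressor
# gives a computable upper bound: compress S, then erase the shorter string.
raw_bits = 8 * len(S)
compressed_bits = 8 * len(zlib.compress(S, 9))
LN2 = 0.6931471805599453
upper_bound_J = compressed_bits * k_B * T * LN2   # bits * kT ln 2

print(f"erasing {raw_bits} raw bits costs at most {upper_bound_J:.3e} J "
      f"(naive bound: {raw_bits * k_B * T * LN2:.3e} J)")
```

For a highly redundant string such as this one, the compressor-based bound is far below the naive `raw_bits * kT ln 2` estimate.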
Finally, Landauer’s principle can be combined with its converse and generalized
as follows.
Generalized Landauer's Principle Let $A$ and $B$ be two bit strings of the same length.
The (immediate) transformation from $A$ to $B$ costs at least
$$\big(K(A) - K(B)\big)\,kT\ln 2 \tag{12}$$
free energy, or it releases at most the absolute value of (12) if this is negative.
If the Turing machine is a closed physical system, then this principle reduces
to the complexity-non-decrease stated above. This suggests that the physical system
possibly simulated by the machine—in the spirit of the Church-Turing hypothesis—
also follows the second law (e.g., since it is a closed system as well). The
fading boundaries between what the machine is and what is simulated by it are
in accordance with Wheeler’s [42] “it from bit:” Every “it” — every particle,
every field of force, even the spacetime continuum itself — derives its function, its
meaning, its very existence entirely [. . . ] from the apparatus-elicited answers to
yes or no questions, binary choices, “bits.” If we try to follow the lines of such
a view further, we may model the environment as a binary string R as well. The
goal is a unified discourse that avoids speaking about complexity with respect to one
system and about free energy, heat, and temperature with respect to the other. The transformation
addressed by Landauer’s principle and its converse then looks as in Fig. 9: The low-
complexity zero-string can be swapped with “complexity” in the environment which
in consequence becomes more redundant, i.e., cools down but receives free energy,
for instance in the form of a weight having been lifted.
[Fig. 9: the zero-string $00\ldots0$ is swapped with complexity in the environment $R$; the environment receives free (e.g., potential) energy, for instance a lifted 1 kg weight]
Let us start with a finite set C of strings on which we would like to find a causal
structure arising from inside, i.e., from the properties of, and relations between,
these strings. The intuition is that an x 2 C encodes the totality of momentary
local physical reality in a “point,” i.e., parameters such as mass, charge, electric
and magnetic field density.
Let $C \subseteq \{0,1\}^{\mathbb N}$ be finite. We define the following order relation on $C$¹⁰:
$$x \preceq y \;:\Longleftrightarrow\; K(x \mid y) \approx 0\,.$$
We say that $x$ is a cause of $y$, and that $y$ is an effect of $x$. So, $y$ is in $x$'s future exactly if
$y$ contains the entire information about $x$; no information is ever lost. The intuition is
that any "change" in the cause affects each one of its effects—if sufficient precision
is taken into account. We write $x \doteq y$ if $x \preceq y$ as well as $y \preceq x$ hold. If $x \not\preceq y$ and
$y \not\preceq x$, we call $x$ and $y$ spacelike separated. We call the pair $(C, \preceq)$
a causal structure.
For a set $\{x_i\} \subseteq C$ and $y \in C$, we say that $y$ is the first common effect of the $x_i$
if it is the least upper bound: $x_i \preceq y$ holds for all $x_i$, and for any $z$ with $x_i \preceq z$ for
all $x_i$, also $y \preceq z$ holds. The notion of last common cause is defined analogously.
¹⁰ In this section, conditional complexities are understood as follows: In $K(x \mid y)$, for instance, the
condition $y$ is assumed to be the full (infinite) string, whereas the asymptotic process runs over $x_{[n]}$.
The reason is that very insignificant bits of y (intuitively: the present) can be in relation to bits of x
(the past) of much higher significance. The past does not disappear, but it fades.
We call a causal structure deterministic if every $y \in C$ is fully determined by the set $\{x_i\}$ of its causes:
$$K(y \mid x_1, x_2, \ldots) \approx 0\,.$$
Observe first that every deterministic causal structure which has a big bang is
trivial: We have
$$x \doteq y \quad\text{for all } x, y \in C\,.$$
This can be seen as follows. Let $b$ be the big bang, i.e., $b \preceq x$ for all $x \in C$. On the
other hand, $K(x \mid z_i) \approx 0$ if $\{z_i\}$ is the set of predecessors of $x$. Since the same is true
for each of the $z_i$, we can continue this process and, ultimately, end up with only $b$:
$K(x \mid b) \approx 0$, i.e., $x \preceq b$, and thus $b \doteq x$ for all $x \in C$. In this case, we obviously
cannot expect to be able to explain space-time. (Note, however, that there can still
exist deterministic $C$'s—without big bang—with non-trivial structure.) However, the
world as it presents itself to us—with both big bang and arrow of time—seems to
direct us away from determinism (in support of [25]).
The situation is very different in probabilistic causal structures: Here, the partial
order relation gives rise to a non-trivial picture of causal relations and, ideally,
causal space-time including the arrow of time. Obviously, the resulting structure
depends crucially on the set C. Challenging open problems are to understand
the relationship between sets of strings and causal structures: Can every partially
ordered set be implemented by a suitable set of strings? What is the property of a
set of strings that gives rise to the “usual” space-time of relativistic light-cones?
Is it helpful to introduce a metric instead of just an order relation? As a first step,
it appears natural to define K.y j x/ as the distance of x from the set of effects of y.
In case y is an effect of x, this quantity intuitively measures the time by which x
happens before y.
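In the spirit of clustering by compression [16], the relation $\preceq$ can be approximated for finite strings by replacing $K$ with a standard compressor; the following toy Python sketch does this with zlib (the threshold `eps` and the pseudo-random sample data are arbitrary choices for illustration):

```python
import hashlib
import zlib

def C(s: bytes) -> int:
    """Compressed length: a computable upper bound on K(s)."""
    return len(zlib.compress(s, 9))

def cond(s: bytes, t: bytes) -> int:
    """Crude stand-in for K(s | t): extra description cost of s once t is known."""
    return max(C(t + s) - C(t), 0)

def precedes(x: bytes, y: bytes, eps: int = 256) -> bool:
    """x precedes y in the sense above: K(x | y) is (approximately) zero,
    i.e., y retains essentially all information about x."""
    return cond(x, y) <= eps

# Deterministic pseudo-random data as a stand-in for incompressible strings.
def rnd(tag: bytes, blocks: int) -> bytes:
    return b"".join(hashlib.sha256(tag + bytes([i])).digest() for i in range(blocks))

x = rnd(b"past", 32)          # the "cause"
y = x + rnd(b"fresh", 32)     # an "effect": the cause plus fresh information

print(precedes(x, y), precedes(y, x))
```

Here `y` extends `x`, so `x` costs almost nothing given `y`, while the fresh second half of `y` remains expensive given `x` alone: the toy order relation points from past to future.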
Generally in such a model, what is a “second law,” and under what condition
does it hold? Can it—and the arrow of time—be compatible even with determinism
(as long as there is no big bang)?
What singles out the sets displaying quantum non-local correlations as observed
in the lab? (What is the significance of Tsirelson’s bound in the picture?)
A recent framework for quantum [33] and classical [7] correlations without causal
order is based on local assumptions only. These are the local validity of quantum
or classical probability theory, that laboratories are closed (parties can only interact
through the environment), and that the probabilities of the outcomes are linear in the
choice of local operation. The global assumption of a fixed global causal order is
replaced by the assumption of logical consistency: All probabilities must be non-
negative and sum up to 1. Some correlations—termed non-causal—that can be
obtained in this picture cannot arise from global quantum or classical probability
theory. Similarly to how the discovery of non-local correlations revealed a world
between the local and the signaling, we discuss here a territory that lies between
what is causal and what is logically inconsistent: It is not empty.
In the spirit of Sect. 4, where we studied the consequences of non-locality, we
show that the results from non-causal correlations carry over to the picture of
(conditional) compressibility of bit strings, where we do not employ probabilities,
but consider actual data only. In that sense, these are the non-counterfactual versions
of results on non-causal correlations.
one party, we consider only those input and output bit strings that satisfy some
global relation. These relations are, as in Sect. 4, to be understood to act locally
on the involved strings: A relation involves only a finite number of instances (bit
positions), and it is repeated $n\,(\to\infty)$ times for obtaining the global relation.
For two parties $A$ and $B$, we say that $A$ is in the causal past of $B$, $A \preceq B$, if and
only if
$$K(B_I \mid A_O) \not\approx K(B_I)\,, \tag{13}$$
i.e., if $B$'s input is correlated with $A$'s freely chosen output.
Fig. 11 If a variable $T$ is correlated to another variable $S$, and $T$ is free but $S$ is not, then $T$ is in
the causal past of $S$
94 Ä. Baumeler and S. Wolf
Causal scenarios describe input and output strings of the parties where the resulting
causal relations reflect a partial ordering of the parties (see Fig. 12a).¹¹ In the most
general case, the partial ordering among the parties of a set $S$, who are all in the
causal future of some other party $A \notin S$, i.e., for all $B \in S\colon A \preceq B$, can depend
on (i.e., satisfy some relation with) the bit strings of $A$ [6, 32]. A causal scenario, in
particular, implies that at least one party is not in the causal future of any other
party. If no such partial ordering of the parties arises, then the scenario is called non-causal
(see Fig. 12b).
A trivial example of a causal scenario is a communication channel over which
a bit is perfectly transmitted from one party to another. This channel, formulated as a
global relation, is $f(x,y) = (0,x)$, with $x, y \in \{0,1\}$, and where the first bit belongs
to $A$ (sender) and the second to $B$ (receiver) (see Fig. 13a). Consider the $n\,(\to\infty)$-
Fig. 12 (a) Example of a causal scenario among four parties with $(A,B) \preceq C$ and $C \preceq D$. (b)
Example of a non-causal scenario with $(A,B) \preceq C$, $C \preceq D$, and $D \preceq B$. Arrows point into the
direction of the causal future
Fig. 13 (a) The global relation f describes a channel from A to B. (b) The input to party A is, as
defined by the global relation g, identical to the output from party B, and the input to party B is
identical to the output from party A
¹¹ Transitivity arises from the assumption of a fixed causal structure within a party, where the input
is causally prior to the output.
fold sequential repetition of this global relation, and assume that both output bit
strings are incompressible and independent: $K(A_O, B_O) \approx 2n$. The bit string $A_I$
is $(0,0,0,\ldots)$ according to the global relation. In contrast, $B_I$ is equal to $A_O$.
Since $K(B_I) \approx n$ and $K(B_I \mid A_O) \approx 0$, the causal relation $A \preceq B$ holds, restating
that $A$ is in the causal past of $B$. Conversely, $K(A_I) \approx 0$ and, therefore, $B \not\preceq A$: The
sender is not in the causal future of the receiver.
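The same verdict can be reached with a compression proxy for the conditional complexities; a toy Python sketch, with zlib standing in for $K$ and pseudo-random data standing in for incompressible output strings (the threshold `eps` is an arbitrary illustrative choice):

```python
import hashlib
import zlib

def C(s: bytes) -> int:
    return len(zlib.compress(s, 9))          # computable stand-in for K

def correlated(s: bytes, t: bytes, eps: int = 256) -> bool:
    """Proxy for K(s | t) being noticeably smaller than K(s)."""
    saving = C(s) - max(C(t + s) - C(t), 0)
    return saving > eps

# Channel f(x, y) = (0, x), repeated: A's output is free, B receives it.
a_out = b"".join(hashlib.sha256(b"free" + bytes([i])).digest() for i in range(32))
b_in = a_out                 # B_I = A_O
a_in = bytes(len(a_out))     # A_I = 00...0

print(correlated(b_in, a_out), correlated(a_in, a_out))
```

Knowing `a_out` collapses the description cost of `b_in` (so `A` precedes `B`), while the all-zero `a_in` is trivially compressible on its own, so no causal dependence of `A` on `B` is detected.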
Consider now the global relation
$$g(x,y) = (y,x)\,, \tag{14}$$
which describes a two-way channel: $A$'s output is equal to $B$'s input and $B$'s output
is equal to $A$'s input (see Fig. 13b). This global relation can describe a non-causal
scenario. If $K(A_O, B_O) \approx 2n$, then indeed, the causal relations that we obtain
are $A \preceq B$ and $B \preceq A$. What we want to underline here is that for this particular
choice of local operations of the parties, input bit strings that are consistent with
the relation (14) exist. In stark contrast, if we fix the local operations of the
parties to be $A_O = \neg A_I$ (the output equals the bit-wise flipped input) for party $A$
and $B_O = B_I$ for party $B$, then no choice of inputs $A_I$ and $B_I$ satisfies the desired
global relation (14). This inconsistency is also known as the grandfather antinomy.
If no satisfying input and output strings exist, then we say that the global relation
is inconsistent with respect to the local operations. Otherwise, the global relation is
consistent with respect to the local operations.
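The distinction can be made concrete for the single-bit version of the two-way channel (14); a small Python sketch enumerating the loop from outputs through the relation and the local operations back to outputs:

```python
from itertools import product

# Two-way channel (14) on single bits: A's input is B's output and vice versa.
def consistent(op_a, op_b):
    """Is there any output pair that survives the loop
    outputs -> relation (14) -> inputs -> local operations -> outputs?"""
    for a_out, b_out in product((0, 1), repeat=2):
        a_in, b_in = b_out, a_out
        if (op_a(a_in), op_b(b_in)) == (a_out, b_out):
            return True
    return False

copy = lambda v: v       # output equals input
flip = lambda v: 1 - v   # output equals negated input

print(consistent(copy, copy))   # consistent: e.g. both bits 0
print(consistent(flip, copy))   # the grandfather antinomy: no solution
```

With both parties copying, any equal pair of bits is a solution; once one party flips, the loop demands a bit equal to its own negation, and no assignment exists.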
For studying bit-wise global relations, i.e., global relations that relate single
output bits with single input bits, and that are consistent regardless of the local operations,
we set the local operation to incorporate all possible operations on bits. These are
the constants $0$ and $1$ as well as the identity and bit-flip operations. The parties
additionally hold incompressible and independent strings that define which of these
four relations is in place at a given bit position. For party $P$, let this additional bit
string be $P_C$. Formally, if we have $k$ parties $A, B, C, \ldots$, then
$$K(A_C, B_C, C_C, \ldots) \approx kn\,.$$
The local operation of a party $P$ then reads, for every bit position $i$,
$$P_O^{(i)} = \begin{cases} 0 & \text{if } \big(P_C^{(2i-1)}, P_C^{(2i)}\big) = (0,0)\,,\\ 1 & \text{if } \big(P_C^{(2i-1)}, P_C^{(2i)}\big) = (1,1)\,,\\ P_I^{(i)} & \text{if } \big(P_C^{(2i-1)}, P_C^{(2i)}\big) = (0,1)\,,\\ \neg P_I^{(i)} & \text{if } \big(P_C^{(2i-1)}, P_C^{(2i)}\big) = (1,0)\,, \end{cases} \tag{15}$$
where a superscript $(i)$ selects the $i$th bit of a string. Depending on pairs of bits
of $P_C$, the relation (15) states that a given output bit is either equal to $0$ or $1$, or
equal to or different from the corresponding input bit. An example is presented in
Fig. 14. Since all pairs of bits appear equally often in $P_C$ (asymptotically speaking),
in half the cases bits of the output string are identical to bits of the (incompressible)
string $P_C$ in the respective positions. Thus, the output satisfies $K(P_O) = \Theta(n)$.
We call the local operation of Eq. (15) of a party the universal local operation. If a
global relation is consistent with respect to universal local operations, then we call it
logically consistent. If we consider all (bit-wise) operations, the global relation (14)
becomes inconsistent: No input and output strings exist that satisfy the desired
global relation (14). To see this, note that since we are in the asymptotic case, there
exist positions $i$ where the relation of $A$ states that the $i$th output bit is equal to the $i$th
input bit, and the relation of $B$ states that the $i$th output bit is equal to the negated $i$th
input bit, which results in a contradiction—the global relation cannot be satisfied.
In more detail, there exists an $i$ such that the bit string $A_C$ contains the pair $(0,1)$
at position $2i$, and such that the bit string $B_C$ contains the pair $(1,0)$ at the same
position:
$$A_C = (\ldots, 0, 1, \ldots)\,,$$
$$B_C = (\ldots, 1, 0, \ldots)\,.$$
On the one hand, the input to $A$ has a value $a$ on the $i$th position, and, because of $A_C$,
the same value is on the $i$th position of $A_O$:
$$A_I = (\ldots, a, \ldots)\,, \quad A_O = (\ldots, a, \ldots)\,.$$
The input and output bit strings of $B$, on the other hand, must, due to $B_C$, have
opposite bits on the $i$th position:
$$B_I = (\ldots, b, \ldots)\,, \quad B_O = (\ldots, b \oplus 1, \ldots)\,.$$
A contradiction arises: No choice of bits $a$ and $b$ exists that satisfies the global
relation (14). The global relation (14), which is depicted in Fig. 13b, is logically
inconsistent.
We show that there exist logically consistent global relations that are non-causal
[9]. Suppose we are given three parties $A$, $B$, and $C$ with universal local
operations. There exist global relations where the input to any party is a function
of the outputs from the remaining two parties. An example [7] of such a global
relation is
$$x = \neg b \wedge c\,, \quad y = a \wedge \neg c\,, \quad z = \neg a \wedge b\,, \tag{16}$$
where all variables represent bits, and where $x, y, z$ is the input to $A, B, C$ and $a, b, c$
is the output from $A, B, C$, respectively. This global relation can be understood as
follows: Depending on the majority of the output bits, the relation either describes
the identity channel from $A$ to $B$ to $C$, and back to $A$, or it describes the bit-flip
channel from $A$ to $C$ to $B$, and back to $A$ (see Fig. 15). We study the causal relations
that emerge from $n\,(\to\infty)$ sequential repetitions of this global relation, i.e., infinite
strings that satisfy the global relation (16). The input to party $A$ is uncomputable,
even $K(A_I) \not\approx 0$, because some bit positions of the outputs from $B$ and $C$ are
uncomputable. Yet, the outputs from $B$ and $C$ completely determine the input to $A$,
i.e., $K(A_I \mid B_O, C_O) \approx 0$. Therefore, the causal relation $(B,C) \preceq A$ holds. Due to
symmetry, the causal relations $(A,C) \preceq B$ and $(A,B) \preceq C$ hold as well. Altogether,
these imply that every party is in the causal future of some other parties—the scenario
is non-causal. On the other hand, it is logically consistent: There exist input and
output bit strings that satisfy the global relation (16) at every bit position.
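The logical consistency of (16) can be verified by brute force at a single bit position: for every choice of deterministic local operations (the four bit operations from above), a consistent assignment exists, and in fact exactly one. A Python sketch:

```python
from itertools import product

# Global relation (16): inputs (x, y, z) of A, B, C from outputs (a, b, c).
def relation(a, b, c):
    return (int(not b and c), int(a and not c), int(not a and b))

# The four deterministic single-bit local operations.
OPS = {
    "const0": lambda v: 0,
    "const1": lambda v: 1,
    "id":     lambda v: v,
    "not":    lambda v: 1 - v,
}

def fixed_points(op_a, op_b, op_c):
    """Output triples consistent with the loop
    outputs -> relation (16) -> inputs -> local operations -> outputs."""
    sols = []
    for a, b, c in product((0, 1), repeat=3):
        x, y, z = relation(a, b, c)
        if (OPS[op_a](x), OPS[op_b](y), OPS[op_c](z)) == (a, b, c):
            sols.append((a, b, c))
    return sols

counts = [len(fixed_points(*ops)) for ops in product(OPS, repeat=3)]
print(f"{counts.count(1)} of {len(counts)} operation choices have exactly one solution")
```

No choice of operations produces zero solutions (no grandfather antinomy) or several (no "self-creating" bit), in contrast to the two-party relation (14).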
In the probability view, there exists an example of a randomized process that
results in non-causal correlations [9], shown in Fig. 16. In every run, the process
[Fig. 15: two circular channels among $A$, $B$, $C$, for $\mathrm{maj}(a,b,c) = 0$ and $\mathrm{maj}(a,b,c) = 1$]
Fig. 15 The left channel is chosen if the majority of the output bits is 0; otherwise, the right
channel is chosen
[Fig. 16: the two circular channels of Fig. 15, each taken with probability $1/2$]
Fig. 16 The circular identity channel uniformly mixed with the circular bit-flip channel
models with probability $1/2$ the clockwise identity channel or the clockwise bit-flip
channel. In the probabilistic view, both channels appear with equal probability. This
leads to every party's inability to influence its past. For instance, if parties $B$ and $C$
copy the input to the output and party $A$ has $a$ on the output, then $A$ has a random
bit on the input—party $A$ cannot influence its past, and the grandfather antinomy
does not arise. If, however, the probabilities of the mixture are altered slightly, a
contradiction arises [7]. Furthermore, the process from Fig. 16 cannot be embedded
into a process with more inputs and outputs such that the larger process becomes
deterministic and remains logically consistent [10]. In the view studied here, however,
we look at single runs without probabilities: Either the global relation of the
left or that of the right channel must hold. Thus, if all parties use the universal local
operation, a contradiction always arises, showing the inconsistency of the process.
Discussion In Sect. 1 we saw that one consequence of dropping the notion of an
a priori causal structure is that randomness becomes hard to define. Thus, we are
forced to take the “factual-only view:” No probabilities are involved. Here, we
show two facts that we formulate without considering counterfactuals. The first is
that causal relations among parties can be derived by considering fixed bit strings
only, without the use of the probability language. These causal relations are an
inherent property of the bit strings of the parties. In other words, these strings are
understood to be logically prior to the causal relations (just as in Sect. 6). The second
is that the causal relations that stem from certain strings can describe
non-causal scenarios. This means that logical consistency does not imply a causal
scenario: Causality is strictly stronger than logical consistency.
8 Conclusions
Whereas for Parmenides of Elea, time was a mere illusion—“No was nor will,
all past and future null”—Heraclitus saw space-time as the pre-set stage on
which his play of permanent change starts and ends. The follow-up debate—two
millennia later and three centuries ago—between Newton and Leibniz about how
fundamental space and time, and hence causality, are to be seen was decided by the
course of science in favor of Newton: In this view, space and time can be imagined
as fundamental and given a priori. (This applies also to relativity theory, where space
and time get intertwined and dynamic but remain fundamental instead of becoming
purely relational in the sense of Mach’s principle.) Today, we have more reason
to question a fundamental causal structure—such as the difficulty of explaining
quantum non-local correlations according to Reichenbach’s principle. So motivated,
we care to test refraining from assuming space-time as initially given; this has a
number of consequences and implications, some of which we address in this text.
When causality is dropped, the usual definitions of randomness stop making
sense. Motivated by this, we test the use of intrinsic, context-independent “ran-
domness” measures such as a string’s length minus its (normalized) fuel value.
References
1. S. Aaronson, http://www.scottaaronson.com/blog/?p=762, 2012
2. P.K. Aravind, Bell’s theorem without inequalities and only two distant observers. Found. Phys.
Lett. 15(4), 397–405 (2002)
3. J.-D. Bancal, S. Pironio, A. Acín, Y.-C. Liang, V. Scarani, N. Gisin, Quantum non-locality
based on finite-speed causal influences leads to superluminal signalling. Nat. Phys. 8, 867–870
(2012)
4. T.J. Barnea, J.-D. Bancal, Y.-C. Liang, N. Gisin, Tripartite quantum state violating the hidden
influence constraints. Phys. Rev. A 88, 022123 (2013)
5. J. Barrett, L. Hardy, A. Kent, No-signalling and quantum key distribution. Phys. Rev. Lett. 95,
010503 (2005)
6. Ä. Baumeler, S. Wolf, Perfect signaling among three parties violating predefined causal
order, in Proceedings of IEEE International Symposium on Information Theory 2014 (IEEE,
Piscataway, 2014), pp. 526–530
7. Ä. Baumeler, S. Wolf, The space of logically consistent classical processes without causal
order. New J. Phys. 18, 013036 (2016)
8. Ä. Baumeler, S. Wolf, Non-causal computation avoiding the grandfather and information
antinomies. arXiv preprint, arXiv:1601.06522 [quant-ph], 2016; accepted for publication in
New J. Phys. (2016)
9. Ä. Baumeler, A. Feix, S. Wolf, Maximal incompatibility of locally classical behavior and
global causal order in multi-party scenarios. Phys. Rev. A 90, 042106 (2014)
10. Ä. Baumeler, F. Costa, T.C. Ralph, S. Wolf, M. Zych, Reversible time travel with freedom of
choice. Preprint (2017). arXiv:1703.00779 [quant-ph]
11. J.S. Bell, On the Einstein-Podolsky-Rosen paradox. Physics 1, 195–200 (1964)
12. C.H. Bennett, Logical reversibility of computation. IBM J. Res. Dev. 17(6), 525–532 (1973)
13. C.H. Bennett, The thermodynamics of computation. Int. J. Theor. Phys. 21(12), 905–940
(1982)
14. G. Brassard, A. Broadbent, A. Tapp, Quantum pseudo-telepathy. arXiv preprint, arXiv:quant-
ph/0407221 (2004)
15. G. Chaitin, A theory of program size formally identical to information theory. J. ACM 22,
329–340 (1975)
16. R. Cilibrasi, P. Vitányi, Clustering by compression. IEEE Trans. Inf. Theory 51(4), 1523–1545
(2005)
17. R. Colbeck, R. Renner, No extension of quantum theory can have improved predictive power.
Nat. Commun. 2, 411 (2011)
18. R. Colbeck, R. Renner, Free randomness can be amplified. Nat. Phys. 8, 450–454 (2012)
19. S. Coretti, E. Hänggi, S. Wolf, Nonlocality is transitive. Phys. Rev. Lett. 107, 100402 (2011)
20. O. Dahlsten, R. Renner, E. Rieper, V. Vedral, The work value of information. New J. Phys. 13,
053015 (2011)
21. H. Everett, “Relative state” formulation of quantum mechanics. Rev. Mod. Phys. 29(3), 454–
462 (1957)
22. A. Fine, Hidden variables, joint probability, and the Bell inequalities. Phys. Rev. Lett. 48, 291–
295 (1982)
23. E. Fredkin, T. Toffoli, Conservative logic. Int. J. Theor. Phys. 21(3–4), 219–253 (1982)
24. P. Gàcs, J.T. Tromp, P.M.B. Vitányi, Algorithmic statistics. IEEE Trans. Inf. Theory 47(6),
2443–2463 (2001)
25. N. Gisin, Time really passes, science can't deny that. arXiv preprint, arXiv:1602.01497 [quant-
ph], 2016; in Proceedings of the Workshop on “Time in Physics,” ETH Zurich, 2015 (2016)
26. E. Hänggi, R. Renner, S. Wolf, Efficient information-theoretic secrecy from relativity theory,
in Proceedings of EUROCRYPT 2010. Lecture Notes in Computer Science (Springer, Berlin,
2010)
27. G. Hermann, Die naturphilosophischen Grundlagen der Quantenmechanik. Abh. Fries’schen
Schule, Band 6, 69–152 (1935)
28. S.C. Kleene, Introduction to Metamathematics (North-Holland, Amsterdam, 1952)
29. A.N. Kolmogorov, Three approaches to the quantitative definition of information. Problemy
Peredachi Informatsii 1(1), 3–11 (1965)
30. R. Landauer, Information is inevitably physical. Feynman and Computation 2 (Perseus Books,
Reading, 1998)
31. M. Li, P. Vitányi, An Introduction to Kolmogorov Complexity and Its Applications (Springer,
Berlin, 2008)
32. O. Oreshkov, C. Giarmatzi, Causal and causally separable processes. arXiv preprint,
arXiv:1506.05449 [quant-ph] (2015)
33. O. Oreshkov, F. Costa, C. Brukner, Quantum correlations with no causal order. Nat. Commun.
3, 1092 (2012)
34. S. Popescu, D. Rohrlich, Quantum non-locality as an axiom. Found. Phys. 24, 379–385 (1994)
35. R. Raz, A parallel repetition theorem. SIAM J. Comput. 27(3), 763–803 (1998)
36. H. Reichenbach, The principle of the common cause, in The Direction of Time, Chap. 19
(University of California Press, Berkeley, 1956), pp. 157–167
37. B. Russell, On the notion of cause. Proc. Aristot. Soc. New Ser. 13, 1–26 (1912)
38. E. Specker, Die Logik nicht gleichzeitig entscheidbarer Aussagen. Dialectica 14, 239–246
(1960)
39. A. Stefanov, H. Zbinden, N. Gisin, A. Suarez, Quantum correlations with spacelike separated
beam splitters in motion: experimental test of multisimultaneity. Phys. Rev. Lett. 88, 120404
(2002)
40. T.E. Stuart, J.A. Slater, R. Colbeck, R. Renner, W. Tittel, An experimental test of all theories
with predictive power beyond quantum theory. Phys. Rev. Lett. 109, 020402 (2012)
41. L. Szilárd, Über die Entropieverminderung in einem thermodynamischen System bei Ein-
griffen intelligenter Wesen (On the reduction of entropy in a thermodynamic system by the
intervention of intelligent beings). Z. Phys. 53, 840–856 (1929)
42. J.A. Wheeler, Information, physics, quantum: the search for links, in Proceedings III Interna-
tional Symposium on Foundations of Quantum Mechanics, pp. 354–368 (1989)
43. L. Wittgenstein, Logisch-philosophische Abhandlung. Annalen der Naturphilosophie, vol. 14
(Veit and Company, Leipzig, 1921)
44. S. Wolf, Non-locality without counterfactual reasoning. Phys. Rev. A 92(5), 052102 (2015)
45. J. Woodward, Making Things Happen: A Theory of Causal Explanation (Oxford University
Press, Oxford, 2003)
46. C. Wood, R. Spekkens, The lesson of causal discovery algorithms for quantum correlations:
causal explanations of Bell-inequality violations require fine-tuning. New J. Phys. 17, 033002
(2015)
47. J. Ziv, A. Lempel, Compression of individual sequences via variable-rate coding. IEEE Trans.
Inf. Theory 24(5), 530–536 (1978)
48. M. Zukowski, C. Brukner, Quantum non-locality - It ain’t necessarily so. . . . J. Phys. A Math.
Theor. 47, 424009 (2014)
49. W.H. Zurek, Algorithmic randomness and physical entropy. Phys. Rev. A 40(8), 4731–4751
(1989)
Causal Structures and the Classification
of Higher Order Quantum Computations
Paolo Perinotti
Abstract Quantum operations are the most widely used tool in the theory of
quantum information processing, representing elementary transformations of quan-
tum states that are composed to form complex quantum circuits. The class of
quantum transformations can be extended by including transformations on quantum
operations, and transformations thereof, and so on up to the construction of a
potentially infinite hierarchy of transformations. In the last decade, a sub-hierarchy,
known as quantum combs, was exhaustively studied, and characterised as the most
general class of transformations that can be achieved by quantum circuits with open
slots hosting variable input elements, to form a complete output quantum circuit.
The theory of quantum combs proved successful for the optimisation of
information-processing tasks that are otherwise intractable. In recent years the study
of maps from combs to combs has intensified, thanks to interesting examples showing
of maps from combs to combs has increased, thanks to interesting examples showing
how this next order of maps requires entanglement of the causal order of operations
with the state of a control quantum system, or, even more radically, superpositions
of alternate causal orderings. Some of these non-circuital transformations are known
to be achievable and have even been achieved experimentally, and were proved
to provide some computational advantage in various information-processing tasks
with respect to quantum combs. Here we provide a formal language to form all
possible types of transformations, and use it to prove general structure theorems for
transformations in the hierarchy. We then provide a mathematical characterisation
of the set of maps from combs to combs, hinting at a route for the complete
characterisation of maps in the hierarchy. The classification is strictly related to
the way in which the maps manipulate the causal structure of input circuits.
P. Perinotti (✉)
Dipartimento di Fisica and INFN, QUIT Group, via Bassi 6, 27100 Pavia, Italy
e-mail: paolo.perinotti@unipv.it
1 Introduction
The explosion of the field of quantum information theory [1], and quantum com-
putation in particular, is largely based on the framework of quantum circuits [2, 3],
which provides an abstract language for the representation of quantum algorithms—
sequences of quantum operations performed in a precise order on a given input
state. The building blocks of quantum circuits are quantum gates, elementary unitary
operations on one or more qubits, along with very special operations corresponding
to the preparation of a reset state or measurement in the so-called computational
basis.
While standard quantum circuits evolve pure quantum states unitarily, this
language can be generalised to encompass evolution of mixed states via irreversible
channels [4]. Thus, in the generalised framework the primary notion becomes that of
a quantum instrument, a collection of transformations labeled by an outcome—the
value of a classical variable—representing a conditional evolution within a chosen
test. The quantum instrument provides the description of what is generally referred
to as state reduction after a quantum measurement.
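A minimal numerical sketch of a quantum instrument, assuming the simplest case of a projective computational-basis measurement on a qubit (any collection of Kraus operators per outcome would do, as long as completeness holds):

```python
import numpy as np

# Instrument for a projective measurement in the computational basis:
# one completely positive map (here a single Kraus operator each) per outcome.
instrument = [[np.diag([1.0, 0.0])],   # outcome 0
              [np.diag([0.0, 1.0])]]   # outcome 1

# Completeness: the outcome maps must sum up to a trace-preserving channel.
total = sum(K.conj().T @ K for kraus in instrument for K in kraus)
assert np.allclose(total, np.eye(2))

def apply_instrument(rho, instrument):
    """Return (probability, conditional post-measurement state) per outcome."""
    results = []
    for kraus in instrument:
        sigma = sum(K @ rho @ K.conj().T for K in kraus)
        p = sigma.trace().real
        results.append((p, sigma / p if p > 1e-12 else sigma))
    return results

plus = np.full((2, 2), 0.5)                 # the state |+><+|
outcomes = apply_instrument(plus, instrument)
probs = [p for p, _ in outcomes]            # [0.5, 0.5]
# State reduction: the conditional post-state for outcome 0 is |0><0|.
```

Summing the outcome maps (i.e., forgetting the classical outcome) recovers the unconditional irreversible channel mentioned above.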
Quantum circuits are then the language for description of input-output flow
of information in the processing of a quantum state. The classical counterpart of
such a processing is a function (here we consider general, possibly irreversible
computations, and thus the function can be more generally a probabilistic map) that
transforms input bit strings to output strings. One normally identifies the abstract
input-output flow in a circuit with the time evolution of the corresponding systems
implementing the algorithm. The identification of time evolution with the input-
output direction is a consequence of causality [5–7], the property of quantum theory
(and of classical information theory as well) that forbids communication from the
output towards the input.
What is peculiar about quantum channels and instruments is that one can define
them axiomatically—as maps on states that must only comply with the requirement of
providing positive and normalised probability distributions when used in a closed
circuit. No further requirement is necessary to identify physical transformations,
since all the conceivable quantum instruments satisfy a realisation theorem in
terms of standard unitary evolutions and projective quantum measurements [8–10],
granting that at least in principle they all correspond to implementable processes.
What happens if we now consider abstract maps from quantum channels to
quantum channels, or from quantum instruments to quantum instruments? Is it
sufficient for such a map to respect the properties of probabilities to be feasible in
practice? And what if we continue constructing higher and higher orders of maps?
What is known so far is that for a sub-hierarchy of maps—called quantum combs
[11, 12] and encompassing all conceivable strategies in a quantum game [13]—
compatibility with probability theory is sufficient for feasibility.
2 Mathematical Preliminaries

denoting a choice of canonical orthonormal bases in H_A and H_{A′}, and X^T denotes the
transposition of the operator X in the canonical basis. The main reason of interest
in the Choi-Jamiołkowski isomorphism is that it provides a necessary and sufficient
condition for complete positivity: a linear map is completely positive if and only if
its Choi-Jamiołkowski image is a positive semidefinite operator, R ≥ 0. The trace
non-increasing completely positive maps are precisely those whose Choi-
Jamiołkowski image satisfies, in addition, Tr_{A′}[R] ≤ I_A.
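A numerical sketch of these conditions (the Choi construction is standard; the depolarizing channel, the index ordering, and the tolerances are illustrative choices):

```python
import numpy as np

d = 2  # qubit

def choi(kraus):
    """Choi operator R = sum_{ij} |i><j| (x) C(|i><j|) of the channel C with
    the given Kraus operators (input factor first, output factor second)."""
    R = np.zeros((d * d, d * d), dtype=complex)
    for i in range(d):
        for j in range(d):
            E = np.zeros((d, d), dtype=complex)
            E[i, j] = 1.0
            R += np.kron(E, sum(K @ E @ K.conj().T for K in kraus))
    return R

# Depolarizing channel rho -> (1 - p) rho + p I/2, via its Kraus operators.
p = 0.3
I = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0, -1.0])
kraus = [np.sqrt(1 - 3 * p / 4) * I] + [np.sqrt(p / 4) * P for P in (X, Y, Z)]

R = choi(kraus)
cp = np.all(np.linalg.eigvalsh(R) > -1e-12)   # complete positivity: R >= 0
# Trace preservation: tracing out the output factor must give the identity.
tr_out = np.einsum('ikjk->ij', R.reshape(d, d, d, d))
tp = np.allclose(tr_out, np.eye(d))
```

For a merely trace non-increasing map the last check would relax to I − Tr_out[R] being positive semidefinite.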
3 The Hierarchy
We will now introduce higher order computation, by enlarging the class of trans-
formations that we consider. In particular, this is obtained by enriching the way in
which we can compose systems to get new systems. The new composition rule was
implicitly already used when we introduced transformations, to which we attributed
a type A → B. However, since we now want to use objects of type A → B as inputs
and outputs of a new type of transformations, we need to define the construction of
new types thoroughly. This is achieved by the following recursive definition.
Given two types x, y, one can form the type x → y. In particular, as a shorthand
notation, x̄ := x → I will denote positive linear functionals bounded by 1 on
elements of type x. We also formally define a new composition law ⊗ of types
as follows. For every couple x, y,

x ⊗ y := (x → ȳ)‾.

On the level of Hilbert spaces one has

H_{x→y} = H_y ⊗ H_x, (4)

H_{x̄} = H_x, (5)

and finally

H_{x⊗y} = H_y ⊗ H_x. (6)
108 P. Perinotti
The convex set of deterministic events T₁(x) also determines the convex set of
events of type x, denoted by T(x), as the set of Choi representatives of admissible
maps dominated by Z ∈ T₁(x). From this point of view the cone T₊(x) := {K ∈
Herm(H_x) | ∃λ > 0, R ∈ T(x) : K = λR} is not sufficient to specify a type. As a
trivial example, consider the cones T₊(I → A) and T₊(A → I): they are the same,
but the types I → A (states of A) and A → I (effects of A) are different because
of very different normalisation constraints. We then introduce the following identity
criterion for types.

Definition 2 We say that two types x and y are equivalent, and denote it as x = y,
if T₊(x) = T₊(y) and T₁(x) = T₁(y).
Given this definition, we can show that A ⊗ B is the parallel composition of
systems AB.

Lemma 1 The type A ⊗ B coincides with the parallel composition of systems AB.

Proof Let us first determine the most general map A → B̄. Its Choi is a positive
operator on H_A ⊗ H_B = H_AB. A deterministic map of this kind corresponds to a
positive operator Q that yields probability 1 on every σ ≥ 0 with Tr[σ] = 1.
This means that for every ρ ⊗ σ with Tr[ρ] = Tr[σ] = 1 one has

Tr[Q(ρ ⊗ σ^T)] = 1,

and by the polarisation identity this implies Q = I_AB. Thus, events of type A → B̄
are positive operators bounded by I, namely they coincide with the set of effects of
AB. Finally, positive functionals bounded by 1 on these events coincide with states
of the system AB.
The construction of types through the composition rule “→” allows us to prove
properties P of types by induction, by proving them for every elementary type A,
namely P(A) = 1, and then proving that P(x) = P(y) = 1 ⇒ P(x → y) = 1. As an
example, we now prove two crucial lemmas.

Lemma 2 The convex set T₁(x) of deterministic events of type x is the set of all
positive operators of the form

X = λ_x I_x + T, (7)

where λ_x is a suitable constant λ_x > 0, and the operators T span a suitable subspace
Λ_x of the real space of traceless selfadjoint operators on H_x. In particular, the
operator λ_x I_x represents a deterministic event.

Proof The thesis is true for elementary systems A, since a state ρ_A can be expressed
as λ_A I_A + T, with λ_A = 1/d_A, and the set of possible traceless T in this case spans
the whole T₀(A). Now, let the thesis be true for the types x, y. Then, since for every
and thus Tr[R] = λ_y d_y / λ_x independently of R. This implies that R = λ_{x→y} I_{x→y} + T′,
where

Tr[T′] = 0,  λ_{x→y} := λ_y / (λ_x d_x),

and

Tr[(S_y ⊗ T_x) T′] = 0,  ∀ T_x ∈ Λ_x, S_y ∈ Λ_y.

In other words, T′ is traceless and orthogonal to all products S_y ⊗ T_x. Thus there
exists ε > 0 such that ε T′ ≥ −λ_{x→y} I_{x→y}, and then λ_{x→y} I_{x→y} + ε T′ =: R ≥ 0.
Clearly, for X ∈ T₁(x) one has

Proof Since λ_x I_x ∈ T₁(x), and for every T ∈ P(H_x) there exists μ > 0 such that
μ T ≤ λ_x I_x, one has T ∈ T₊(x).

Corollary 2 It is X ∈ T₁(x) iff X^T ∈ T₁(x).

From now on, given a deterministic event X ∈ T₁(x), we will denote the traceless
operator in the decomposition (7) of X as T_X. Clearly, T_X^T = T_{X^T}.
One can now easily prove the following lemma.

Lemma 3 An element X ∈ T₊(x) is a deterministic event of type x if and only if

Tr[XY] = 1 (8)

for every Y ∈ T₁(x̄). In particular, since λ_{x̄} I_x ∈ T₁(x̄),

λ_{x̄} Tr[X] = 1. (9)

Since by Lemma 2 one also has that Y ∈ T₁(x̄) has the form Y = λ_{x̄} I_x + T_Y with
T_Y ∈ Λ_{x̄}, by Eq. (9) one has Tr[X T_Y] = 0, for every Y, namely

Tr[XT] = 0,  ∀ T ∈ Λ_{x̄},

and

X = λ_x I_x + T_X,
Λ_x = Λ_{x̄}^⊥ ∩ T₀(H_x), (10)

so that

Herm(H_x) = ⟨I_x⟩ ⊕ Λ_x ⊕ Λ_{x̄}.
We will now prove that ⊗ is associative. For this purpose we need the following
lemma.

Lemma 6 The set of deterministic events T₁(x ⊗ y) is the intersection T₊(x ⊗ y) ∩ A
of the cone T₊(x ⊗ y) with the affine hull A of the product events, for which one has
the inclusion

Λ_{x→ȳ} ⊇ A_C.
On the other hand, suppose that the above inclusion is strict. Then there is a
nonzero traceless operator

T ∈ Λ_{x→ȳ} ∩ [A_C]^⊥.

Setting

Z := λ_{x→ȳ} I_{x→ȳ} + T,
W := λ_{x→ȳ} I_{x→ȳ} + ε T,

for suitably small ε > 0 one has

W ∈ T₁(x → ȳ),

while

Tr[ZW] = 1 + ε Tr[T²] ≠ 1,

which is a contradiction. Hence

Λ_{x→ȳ} = A_C,

and

(A ∩ T₊(x ⊗ y)) = T₁(x ⊗ y).
Corollary 4 x ⊗ y = y ⊗ x.

As a consequence of Corollary 4, x → ȳ = y → x̄. Substituting ȳ for y we obtain
the following identity:

x → y = ȳ → x̄. (14)

Lemma 7 The following two conditions are equivalent:

(x ⊗ y) ⊗ z = x ⊗ (y ⊗ z)  ∀x, y, z, (15)
x → (y → z) = (x ⊗ y) → z  ∀x, y, z. (16)

Proof Let us suppose that Eq. (15) holds. By definition, we then have
(x ⊗ y) → z̄ = x → (y → z̄) for all x, y, z, namely (substituting z̄ for z)

(x ⊗ y) → z = x → (y → z)  ∀x, y, z. (17)

Conversely, if Eq. (16) holds, the same manipulations yield

(x ⊗ y) ⊗ z = x ⊗ (y ⊗ z). (18)
We now prove associativity of ⊗, which then trivially implies the uncurrying
identity.

Lemma 8 For every triple x, y, z, (x ⊗ y) ⊗ z = x ⊗ (y ⊗ z).

Proof Since T₊([(x ⊗ y) ⊗ z]) = T₊([x ⊗ (y ⊗ z)]), it is sufficient to prove that
T₁([(x ⊗ y) ⊗ z]) = T₁([x ⊗ (y ⊗ z)]). For this purpose, we consider a generic
affine combination

V = Σ_{pqr} a_{pqr} X_p ⊗ Y_q ⊗ Z_r,  Σ_{pqr} a_{pqr} = 1.

Let c_r := Σ_{p′q′} a_{p′q′r}, and b^r_{pq} := a_{pqr}/c_r. It is clear that Σ_r c_r = 1, and
Σ_{pq} b^r_{pq} = 1 for every r. Thus we have

V = Σ_{pqr} c_r b^r_{pq} X_p ⊗ Y_q ⊗ Z_r = Σ_r c_r T_r ⊗ Z_r, (22)

where T_r := Σ_{pq} b^r_{pq} X_p ⊗ Y_q ∈ T₁(x ⊗ y). This proves that V ∈ A, and then
C ⊆ A. A similar proof clearly holds also for B, thus providing the thesis, since
A = C = B.
Corollary 5 For every triple x, y, z the following type equality holds: x → (y → z) =
(x ⊗ y) → z.
Every new type x in the hierarchy comes from a couple y, z through one of the
two type compositions. This allows us to introduce a binary relation ⪯ on types as
follows.

One can easily verify that the relation ⪯ between equivalence classes is well defined.
Indeed, by Lemma 9 one has [x] = {x, x̄}. Thus, if x ⪯ y one also has x̄ ⪯ y, x ⪯ ȳ
and x̄ ⪯ ȳ.

Lemma 10 The relation ⪯ between equivalence classes is reflexive and antisymmetric.

Proof Reflexivity is simply proved, because for any x we have x ⊗ I = x. Suppose
now that [x] ⪯ [y] and [y] ⪯ [x]. Then x ⪯ y and y ⪯ x, namely x R y and thus
[x] = [y].

In the following we will denote by ≼ the transitive closure of the relation ⪯
between equivalence classes. It is then clear that ≼ is a partial ordering on the
quotient of types modulo R.

Since every type x is obtained from elementary types by subsequent applications
of ⊗ and the bar, we can prove a property P of types by induction with respect to the
ordering ≼, by proving it for every elementary type A, namely P(A) = 1, and then
proving that P(x) = 1 ⇒ P(x̄) = 1, and P(x) = P(y) = 1 ⇒ P(x ⊗ y) = 1.
The above induction technique will be used to prove the main result of the paper
in Sect. 4.
Finally, we now define the notion of intersection of types.

Definition 5 Let z be a type such that H_z = H_x = H_y for two types x, y, and
T₁(z) = T₁(x) ∩ T₁(y). We say that the type z is the intersection of types x and y,
and write z = x ∩ y.

This definition bears the following elementary consequences.

Lemma 11 The type z is the intersection of x, y if and only if H_z = H_x = H_y,
λ_z = λ_x = λ_y =: λ, and Λ_z = Λ_x ∩ Λ_y.

Proof By definition, if z = x ∩ y it must be λ_z = λ_x = λ_y. Moreover, for every
Z ∈ T₁(z) one has

Z = λ I_z + T_Z.

Now suppose that the inclusion Span(Λ_{x̄} ∪ Λ_{ȳ}) ⊆ Λ_{z̄} is strict; a nonzero
T′ ∈ Λ_{z̄} orthogonal to Span(Λ_{x̄} ∪ Λ_{ȳ}) then satisfies

Tr[T′ T_W] = 0,  ∀ T_W ∈ Λ_{z̄},
Tr[T′ T_X] = Tr[T′ T_Y] = 0,  ∀ T_X ∈ Λ_{x̄}, T_Y ∈ Λ_{ȳ}.

The equalities in the second line imply that T′ ∈ Span(Λ_{x̄} ∪ Λ_{ȳ})^⊥, while the one
on the first implies T′ ∈ Λ_{z̄}^⊥. Thus, we have T′ ∈ Λ_{z̄} ∩ Λ_{z̄}^⊥ = {0}, contrarily
to the hypothesis. Then, it must be

Span(Λ_{x̄} ∪ Λ_{ȳ}) = Λ_{z̄}.
Moreover, we have the two following important lemmas.

Lemma 13 For every pair of types x, y, every element of T₁(x ∩ y) has the form

T = λ_{x∩y} I + Z,  Z ∈ Span(Λ_x ∪ Λ_y).

Moreover, for every z,

(x ∩ y) ⊗ z = (x ⊗ z) ∩ (y ⊗ z). (27)
Proof of Eq. (27): one has

T₁[(x ∩ y) ⊗ z] = T₊[(x ∩ y) ⊗ z]
∩ Aff{W ⊗ Z | W ∈ T₁(x ∩ y), Z ∈ T₁(z)}
= T₊[(x ∩ y) ⊗ z]
∩ Aff{W ⊗ Z | W ∈ T₁(x), W ∈ T₁(y), Z ∈ T₁(z)}
= T₊(x ⊗ z) ∩ Aff{W ⊗ Z | W ∈ T₁(x), Z ∈ T₁(z)}
∩ T₊(y ⊗ z) ∩ Aff{W ⊗ Z | W ∈ T₁(y), Z ∈ T₁(z)}
= T₁(x ⊗ z) ∩ T₁(y ⊗ z).

Moreover, at the level of the Λ spaces,

Λ_{(x∩y)→z} = Λ_{(x∩y)⊗z},  Λ_{(x∩y)⊗z} = Λ_{(x⊗z)∩(y⊗z)}.
Corollary 7 For every pair of types x, y and every z, one has (x ∩ y) → z =
(x → z) ∩ (y → z).
In the following we will prove results that depend on the structure of a type x rather
than on the dimensions of the specific elementary systems A_i that compose it. For
example, we will treat on the same footing transformations A_0 → B_0 and A_1 → B_1,
even if d_{A_0} ≠ d_{A_1} or d_{B_0} ≠ d_{B_1}. For the purposes of the present section, it is
convenient to introduce a notation which is at the same time insightful and efficient.
Given a Hilbert space H_x = H_n ⊗ H_{n−1} ⊗ … ⊗ H_0, one can expand any operator on
H_x on the basis {S_i = S^{(n)}_{i_n} ⊗ S^{(n−1)}_{i_{n−1}} ⊗ … ⊗ S^{(0)}_{i_0}}, where S^{(j)}_0 := I_{H_j}, and for every
j it is Tr[S^{(j)}_l] = 0 for l > 0. In the following we will denote T^{(j)}_l := S^{(j)}_l for l > 0.
An important role in our analysis is played by those special subspaces of Herm(H_x)
having the following property: they are spanned by a subset of {S_i} such that for
every j, either all the S_i in the subset have S^{(j)}_{i_j} = I_{H_j}, or they all have Tr[S^{(j)}_{i_j}] = 0.
As an example, let H_x = H_1 ⊗ H_0. Then we have four subspaces of interest:

L_{00} := Span({T_i ⊗ T_j}),
L_{01} := Span({T_i ⊗ I}),
L_{10} := Span({I ⊗ T_j}),
L_{11} := Span({I ⊗ I}).

In the general case, we will define the space L_b, where b is a string of bits of length
n + 1, as follows: L_b is the largest subspace spanned by S_i’s such that for all those
values of j for which b_j = 1 one has i_j = 0, i.e. S^{(j)}_{i_j} = I_{H_j}, while for all those values
of j for which b_j = 0 one has i_j > 0, i.e. Tr[T^{(j)}_{i_j}] = 0.
As a consequence of the definition, one has the following remarkable identity for
a string b = b_1 b_0.

Proof The statement is trivial when b₀ = b₁. Let us then focus on the case b₀ ≠ b₁.
One easily realises that every element of the basis of L_{b₀} is orthogonal to every
element of the basis of L_{b₁} in the Hilbert-Schmidt sense. Indeed, b₀ ≠ b₁ implies
that there exists some j such that (b₀)_j ≠ (b₁)_j. Let us suppose without loss of
This implies that the two subspaces L_{b₀} and L_{b₁} are orthogonal, and then the thesis
follows.
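For two qubits the basis {S_i} can be taken to be the Pauli basis (S₀ = I, and X, Y, Z as the traceless T_l), and the mutual Hilbert-Schmidt orthogonality of the four subspaces L₀₀, …, L₁₁ can be checked directly — a small sketch under that qubit assumption:

```python
import numpy as np
from itertools import product

I2 = np.eye(2, dtype=complex)
paulis = [np.array([[0, 1], [1, 0]], dtype=complex),     # X
          np.array([[0, -1j], [1j, 0]], dtype=complex),  # Y
          np.diag([1.0 + 0j, -1.0])]                     # Z: the traceless T_l

def basis_of_Lb(b):
    """Basis of L_b on H_1 (x) H_0, reading b = (b_1, b_0): bit 1 puts the
    identity on that tensor factor, bit 0 puts a traceless Pauli."""
    choices = [[I2] if bit == 1 else paulis for bit in b]
    return [np.kron(a, c) for a, c in product(*choices)]

strings = [(0, 0), (0, 1), (1, 0), (1, 1)]
dims = {b: len(basis_of_Lb(b)) for b in strings}  # 9 + 3 + 3 + 1 = 16

def hs(A, B):
    """Hilbert-Schmidt inner product Tr[A† B]."""
    return np.trace(A.conj().T @ B)

# Distinct L_b are mutually orthogonal, so their sum is direct (Corollary 8).
ortho = all(abs(hs(A, B)) < 1e-12
            for b0 in strings for b1 in strings if b0 != b1
            for A in basis_of_Lb(b0) for B in basis_of_Lb(b1))
```

The dimensions add up to 16 = dim Herm(H_1 ⊗ H_0), illustrating that the L_b decompose the whole operator space.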
Corollary 8 The sum L_{b₁} + L_{b₂} for b₁ ≠ b₂ is a direct sum L_{b₁} ⊕ L_{b₂}.

Lemma 17 Let J ⊆ {0, 1}^N, and let L be the following direct sum of spaces L_b:

L = ⊕_{b∈J} L_b. (31)

Then

Herm(H_x) = L ⊕ L^⊥,
x D L1 : (33)
TrŒ D 1; (35)
A particularly relevant sub-hierarchy, that was studied extensively in [12, 19], is that
of combs, given by the following recursive definition.

Definition 6
1. The type 1_{01} of 1-combs on H_{A_1} ⊗ H_{A_0} is A_0 → A_1. The set T₁(1_{01}) of
deterministic 1-combs on H_{A_1} ⊗ H_{A_0} is the set of Choi operators of channels
in A_0 → A_1.
2. The type n_{01…(2n−1)} of n-combs on H_{A_{2n−1}} ⊗ H_{A_{2n−2}} ⊗ … ⊗ H_{A_0} is
(n−1)_{1…(2n−2)} → 1_{0(2n−1)}. The elements R of the set T₁(n_{01…(2n−1)}) are
Choi operators of CP maps that transform elements of T₁((n−1)_{1…(2n−2)}) to
elements of T₁(1_{0(2n−1)}).

The pair of spaces H_{2j}, H_{2j+1} identifies the (j+1)-th tooth of a comb, where the
nomenclature is due to the graphical representation of combs as in Fig. 1 (see [12,
20]).
The main theorems in the theory of combs are the following.

Theorem 2 A positive operator R on H_{A_{2n−1}} ⊗ H_{A_{2n−2}} ⊗ … ⊗ H_{A_0} belongs to
T₁(n_{01…(2n−1)}) iff it satisfies the constraint

Tr_{2k−1}[R^{(k)}] = I_{2k−2} ⊗ R^{(k−1)},  k = 1, …, n,

where each R^{(k)} is a suitable positive operator on H_{2k−1} ⊗ … ⊗ H_0, R^{(0)} = 1, and

R^{(n)} := R. (37)

Fig. 1 Graphical representation of a comb, with teeth labelled 0, …, 4 and wires H_0, …, H_9
Eq. (38) (pictorial): realisation of an n-comb as a sequence of channels 1, …, n on the
wires H_0, …, H_{2n−1}, connected through internal memory systems A_1, …, A_{n−1}. (38)
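The recursive constraint of Theorem 2 can be checked numerically in the simplest, memoryless case: placing an independent unitary channel on each tooth gives a valid 2-comb, and tracing out the last output must leave the identity on the last input tensored with the remaining 1-comb. A sketch (the wire ordering and gate choices are illustrative assumptions):

```python
import numpy as np

d = 2

def choi(U):
    """Unnormalised Choi operator of rho -> U rho U†, with wire ordering
    H_out (x) H_in: R[(k,i),(l,j)] = U[k,i] * conj(U[l,j])."""
    return np.einsum('ki,lj->kilj', U, U.conj()).reshape(d * d, d * d)

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard on the first tooth
T = np.diag([1.0, np.exp(1j * np.pi / 4)])     # T gate on the second tooth

R_A = choi(H)   # 1-comb (a channel) on H_1 (x) H_0
R_B = choi(T)   # 1-comb on H_3 (x) H_2

# 1-comb condition: tracing the output wire gives the identity on the input.
ok_1comb = np.allclose(np.einsum('kikj->ij', R_A.reshape(d, d, d, d)),
                       np.eye(d))

# Memoryless 2-comb on H_3 (x) H_2 (x) H_1 (x) H_0: tensor the two teeth.
R = np.kron(R_B, R_A)

# Recursion step: Tr over H_3 must give I on H_2 tensored with the 1-comb R_A.
tr_H3 = np.einsum('kakb->ab', R.reshape(d, d ** 3, d, d ** 3))
ok_2comb = np.allclose(tr_H3, np.kron(np.eye(d), R_A))
```

A comb with internal memory would not factorise as a tensor product, but it satisfies the same recursive trace conditions.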
In the following we will prove characterisation theorems that depend on the depth
of combs, summarised by the integer n, and are independent of the particular
dimensions of the spaces H_0, H_1, …, H_{2n−1}. For this reason, we will often refer to the
general class of n-combs by the type n, dropping the labels of spaces. Moreover, it
will be useful to consider classes of n-combs on the same spaces, but with permuted
teeth. For this reason, we will introduce the notation n_σ, meaning that for any
given space H_{2n−1} ⊗ H_{2n−2} ⊗ … ⊗ H_0, n_σ encompasses the type of n-combs on
H_{2σ(n−1)+1} ⊗ H_{2σ(n−1)} ⊗ … ⊗ H_{2σ(0)}.
The next step is to prove a characterisation theorem for maps from combs to combs.
For this purpose, it is useful to prove some preliminary lemmas, providing a clearer
picture of the structure of the maps. In particular, the results presented in this section
are useful in identifying the general structure of the spaces Λ_{m→n}, which only depend
on the numbers m and n of teeth, and not on the dimensions d_{A_i} of the involved
systems A_i.
The first result that we need is a characterisation of the space Λ_m in terms of the
spaces L_b.

Lemma 19 The space Λ_m is the direct sum

Λ_m = ⊕_{b∈E_1} L_b, (39)

where E_1 is the set of binary strings that start with an even number of 1’s and that
have at least one 0.

Proof This characterisation immediately follows from Theorem 2.
Corollary 9 Let p = m + n. Then one has

Λ_p = (Λ_m ⊗ Λ_n) ⊕ (Λ_m ⊗ Λ_{n̄}) ⊕ (Λ_{m̄} ⊗ Λ_n) ⊕ (Λ_{m̄} ⊗ Λ_{n̄}). (40)
Let us now consider the types m ⊗ n. By Eq. (13) the general element of T₁(m ⊗ n)
is an affine combination of tensor products M ⊗ N, with M ∈ T₁(m) and N ∈ T₁(n).
Considering each term of the affine combination separately, it is easy to check that if
we arrange the m teeth of the first comb to the left and the n teeth of the second to the
right, elements of T₁(m ⊗ n) satisfy condition (37), and thus belong to T₁(p) with
p = m + n. Moreover, the same result holds if we permute the teeth of the p-comb
in such a way that the ordering of the teeth of the m-comb and that of the teeth of the
n-comb are preserved. We denote the set of these permutations as Σ_{m,n}. For example,
let m = n = 2. In this case we have two combs, both having two teeth. Let us
label the teeth of the first comb by 0, 1 and those of the second by 2, 3. The starting
arrangement is thus 0, 1, 2, 3. The allowed permutations are all the permutations that
do not bring tooth 1 to the left of 0, or tooth 3 to the left of 2, namely

0, 1, 2, 3;
0, 2, 1, 3;
0, 2, 3, 1;
2, 0, 1, 3;
2, 0, 3, 1;
2, 3, 0, 1;
that is
We can now evaluate the cardinality of Σ_{m,n} through the following lemma.

Lemma 21 Let m and n be two comb types. The cardinality of Σ_{m,n} is

|Σ_{m,n}| = (m + n)! / (m! n!). (43)
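The count in Lemma 21 is the number of ways of interleaving two internally ordered sequences of teeth, which can be verified by brute force (tooth labels as in the m = n = 2 example above):

```python
from itertools import combinations
from math import comb

def interleavings(left, right):
    """All orderings of left + right that preserve the internal order of each
    sequence: choose which of the len(left)+len(right) slots 'left' occupies."""
    total = len(left) + len(right)
    result = []
    for pos in combinations(range(total), len(left)):
        li, ri = iter(left), iter(right)
        result.append(tuple(next(li) if p in pos else next(ri)
                            for p in range(total)))
    return result

# m = n = 2: teeth 0, 1 of the first comb, teeth 2, 3 of the second.
sigma_22 = interleavings([0, 1], [2, 3])   # the six orderings listed above

count_ok = all(len(interleavings(list(range(m)), list(range(m, m + n))))
               == comb(m + n, n)
               for m in range(1, 5) for n in range(1, 5))
```

Each choice of slots for the first comb's teeth fixes the whole ordering, which is exactly why the count is a binomial coefficient.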
We now use the above theorem to prove the main result in this section, which
provides a characterisation of maps from m-combs to n-combs.

Theorem 5 For maps of type m → n one has

T₁(m → n) = T₊(m → n)
∩ Aff(T₁[(m + n − 1) → 1] ∪ T₁[(m + n − 1)_$ → 1]), (47)

where $ is the permutation that exchanges the m-comb with the (n − 1)-comb
representing the input type of the output n-comb.

Proof First of all, we remind that n = (n − 1) → 1, and by Corollary 5,

m → n = m → [(n − 1) → 1] = [m ⊗ (n − 1)] → 1.
Fig. 2 Graphical representation of the structure of a map from 2- to 3-combs. The possible
orderings of the 2 → 3 map correspond to the six white combs (a)–(f) on the bottom, with tooth
orderings 0,1,2,3; 0,2,1,3; 0,2,3,1; 2,0,1,3; 2,0,3,1; 2,3,0,1. The input 2-comb
is the one represented in dark grey, with teeth labelled 0, 1. The output 3-comb is a map acting on
the light grey 2-comb with teeth labelled 2, 3. According to Theorem 5, the only structures that
are necessary to define a map 2 → 3 are those of diagrams (a) and (f)
m ⊗ (n − 1) = (m + n − 1) ∩ (m + n − 1)_$,

and thus

T₁(m → n) = T₊(m → n)
∩ Aff{T₁[(m + n − 1) → 1] ∪ T₁[(m + n − 1)_$ → 1]}.
Thanks to Lemma 13, we can figure out the meaning of the above theorem as
follows. The most general maps from m-combs to n-combs are represented by affine
combinations of (m + n − 1)-combs, with orderings given by those permutations
of teeth that are compatible both with the teeth ordering of the input m-combs and
with that of the output n-combs. A more intuitive picture of the general map m → n
is provided in Fig. 2 for the case m = 2, n = 3.
5 Conclusion
We reviewed the main points of the theory of combs, i.e. maps from quantum circuits
into quantum channels (or more generally quantum operations), reporting the crucial
realisation theorem, which asserts that combs are physically obtained by circuits
with open slots. We then focused our attention on the hierarchy of all mathematical
maps, from combs to combs and maps thereof, that are admissible, that is to say
consistent with the properties of probabilities. We introduced a language of types
and appropriate typing rules, with a partial ordering of types that allows for proofs
by induction, and used induction to prove general structure theorems for the set of
admissible maps of any type. In particular, we showed that maps at every order in the
hierarchy inherit normalisation constraints from the first-level causality constraints.
However, most higher-order maps require indefinite causal structures for their
implementation. We then restricted attention to maps from combs to combs. We
first showed that such maps can be seen as maps from tensor products of combs into
channels. We then characterised them as those maps that can be represented as affine
combinations of quantum combs with two different orderings, the first one treating the
input tensor product A ⊗ B as a comb where the teeth of A precede those of B, and
the other one treating A ⊗ B as a comb where the teeth of B precede those of A. This
result provides a great simplification of the general structure of maps from combs
to combs.
The surprising issue with the hierarchy of higher order quantum maps is that,
while for quantum combs the admissibility constraints are necessary and sufficient
for the existence of an implementation scheme, in the case of higher-order maps
such equivalence seems to be beyond our present understanding of physics, and
possibly requires a theory that encompasses quantum information theory and a
theory of indefinite causal orderings, such as general relativity. The problem of
implementation thus remains open, leaving three different possibilities: (1) all
admissible maps are achievable in a futuristic quantum-gravity scenario; (2) there
is some polynomially computable constraint beyond admissibility that separates
feasible from unfeasible maps; (3) the distinction is given by a non-computable
constraint, which essentially means that, given the Choi representation of an
admissible higher-order map, it is not possible to say a priori whether it represents
a feasible computation, and the answer can be given only in some special case.
The last situation represents to some extent a generalisation of the problem of
determining whether a given density matrix describes a quantum state that is
entangled or separable.
Acknowledgements The author is grateful to Alessandro Bisio for carefully reading the first version
and pointing out an important imprecision, and to Aleks Kissinger and Fabio Costa for insightful
discussions. This publication was made possible through the support of a grant from the John
Templeton Foundation (Grant ID# 60609). The opinions expressed in this publication are those of
the author and do not necessarily reflect the views of the John Templeton Foundation.
References
1. M.A. Nielsen, I.L. Chuang, Quantum Computation and Quantum Information (Cambridge
University Press, Cambridge, 2010)
2. D. Deutsch, Proc. R. Soc. Lond. Ser. A Math. Phys. Sci. 425, 73–90 (1989)
3. A.C.-C. Yao, in Proceedings of the Thirty-Fourth IEEE Symposium on Foundations of Computer
Science (FOCS 1993), pp. 352–361 (1993)
Statistical Asymmetries Between Cause and Effect

Dominik Janzing
Abstract Recent progress in the field of machine learning suggests that the
joint distribution of two variables X, Y sometimes contains information about the
underlying causal structure, e.g., whether X is the cause of Y or Y the cause of X, given
that exactly one of the alternatives is true. To provide an idea of these statistical
asymmetries I show some intuitive examples, both hypothetical toy scenarios and
scatter plots from real-world data. I sketch some recent approaches to infer
the causal direction based on these asymmetries and give some pointers to physics
literature that relate them to the thermodynamic arrow of time.
D. Janzing (✉)
Max Planck Institute for Intelligent Systems, Spemannstr. 34, 72076 Tübingen, Germany
e-mail: dominik.janzing@tuebingen.mpg.de
Fig. 1 The three possible causal explanations of statistical dependences between two random
variables X and Y. Every joint distribution of X and Y can be explained by any of these three causal
structures
X ⊥⊥ Y | Z. (1)
This may be one of the earliest statements relating causality with statistics. Reichen-
bach further noticed that reversing the arrows in case (b) yields a different pattern of
statistical dependences: a common effect Z would not explain dependences between
X and Y although a common cause does. Hence, X and Y would then be independent,
but can get dependent when conditioning on the common effect Z. Reichenbach
linked this asymmetry between the common cause and the common effect scenario
to the statistical physics of mixing processes, which already suggests that in general
asymmetries between cause and effect are related with asymmetries between past
and future in the sense of the thermodynamic arrow of time.
To clarify terminology, note that we also call case (b) a causal relation (which is
sometimes also called ‘acausal’). Thus, Reichenbach’s postulate implies that every
statistical dependence is due to some kind of causal relation. Note also that the
cases are not exclusive. For instance, (a) and (b) or (b) and (c) will often occur
together. (a) and (c) can only occur together if X and Y do not refer to measurements
that correspond to well-defined locations in space time: whenever the measurements
are time-like separated, there can only be influence from the variable that has been
measured earlier. On the other hand, when they are space-like separated, case (b) is
the only possible one.1 Many scientific disciplines, however, deal with variables that
do not refer to well-defined time instances. For instance, if psychologists investigate
the statistical relation between the motivation X and the performance Y of students,
1 If X and Y refer to quantum measurements at two particles coming from a common source, there
may not be such a variable Z that screens off the dependences in the sense of (1) due to quantum
entanglement and the violation of Bell's inequality. Nevertheless, one may visualize the scenario
as a causal relation of type (b) and replace Z with some joint quantum state |ψ⟩ to indicate that the
common cause is no longer a random variable.
the categories of space-like vs. time-like separation do not make sense. It is then
likely that both quantities influence each other.
The fact that the cases (a), (b) and (c) are usually not mutually exclusive is an
additional obstacle for causal inference. We will now restrict our attention to the
easiest task of distinguishing between (a) and (c), that is, inferring the causal direc-
tion, which is already challenging. Section 2 argues that our intuition sometimes
recognizes cause and effect without knowing any formal criteria. Section 3 sketches
the principles of the conventional approach to causal inference based on conditional
independences. Section 4 explains why additional formal inference principles are
required and Sect. 5 describes a foundation for new inference methods. Section 6
sketches a particularly simple example of a new inference method. Pointers to
related problems from more standard branches of theoretical physics are spread over
the sections.
2 Intuitive Approach
[Scatter plots of Fig. 2 omitted from this extraction; the four panels plot pairs of measured variables, with visible axis labels including altitude, temperature, solar radiation, and income.]
Fig. 2 Four examples of cause-effect pairs from the database [2]. (a), (c), (d) are examples where
the cause is on the x-axis and the effect on the y-axis. (b) is an example where the cause is on the
y-axis and the effect on the x-axis
The problem of distinguishing cause and effect from the joint distribution of two
variables has been considered unsolvable for a long time. This opinion is certainly
based on reasonable arguments: as displayed by the equations in Fig. 1, every joint
distribution p(x, y) can in principle be generated by each of the three possible causal
structures. To see this, just imagine a communication scenario with two random
generators. First, there is a random generator that outputs x-values according to
p(x). Then, think of a second random generator that outputs y-values according to
p(y|x) whenever it receives x as input. It is clear that in this scenario X causes Y and
not vice versa because changing the y-values has no impact on the x-values, while
changing the x-values obviously changes the distribution of y-values. Likewise, one
can construct a scenario with two random generators where Y is the cause and X the
effect and also one with three random generators where Z is the cause of both X and
Y.
This shows that there are no ‘hard’ restrictions excluding any of the three types
of causal explanations in Fig. 1.2 In Sect. 4 we will explain why there are still ‘soft’
criteria that render one of the causal directions more plausible. ‘Hard’ constraints
are known from causal structures with at least three variables. An important example
has already been given above: whenever Z is the cause of X and Y, this entails a
constraint on the joint distribution of X, Y, Z, namely the conditional independence
displayed in (1).
For the joint distribution of n variables whose causal relations are described
by a directed acyclic graph (DAG), the causal Markov condition [6, 7] states that
every variable is conditionally independent of its non-descendants, given its parents.
The independence displayed in (1) is one of the simplest implications of the causal
Markov condition. It is obtained from applying the Markov condition to the DAG (b)
in Fig. 1: Z is the (only) parent of X. Therefore, given this parent, X is conditionally
independent of its non-descendants, which are Z and Y.
According to the Markov condition, the joint probability density3 of n variables
factorizes according to

p(x_1, …, x_n) = ∏_{j=1}^{n} p(x_j | pa_j),  (2)

where pa_j denotes the values of the parents according to the causal DAG (i.e.,
the direct causes) of x_j. Note that each conditional probability density p(x_j | pa_j)
2
This is different, however, in quantum theory: a joint operator describing two systems that are
correlated by a common cause is positive, while an operator describing correlations of cause and
effect has positive partial transpose [4]. Therefore, one can sometimes tell apart cases (a) and (c)
versus case (b) in Fig. 1, see [5].
3
Here we have implicitly assumed that the joint probability distribution has a joint density to
simplify explanations.
describes how the distribution of a variable depends on its direct causes. Therefore,
these conditionals represent the causal mechanisms that generate the statistical
dependences.
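The factorization (2) and Reichenbach's common-effect asymmetry can be checked by exact enumeration. The following sketch uses a hypothetical noisy-OR collider X → Z ← Y (all numbers invented): the joint distribution is built from the Markov factorization, X and Y come out marginally independent, yet conditioning on the common effect Z makes them dependent.

```python
import itertools

# Hypothetical noisy-OR collider X -> Z <- Y (all numbers invented):
# X and Y are independent fair coins; Z copies OR(X, Y) with probability 0.9.
def p_z_given(z, x, y):
    likely = 1 if (x or y) else 0
    return 0.9 if z == likely else 0.1

# Joint distribution via the Markov factorization p(x, y, z) = p(x) p(y) p(z | x, y).
joint = {(x, y, z): 0.5 * 0.5 * p_z_given(z, x, y)
         for x, y, z in itertools.product([0, 1], repeat=3)}

# Marginally, X and Y are independent: p(x, y) = p(x) p(y) = 1/4.
p_xy = {(x, y): joint[(x, y, 0)] + joint[(x, y, 1)] for x in (0, 1) for y in (0, 1)}
marginally_independent = all(abs(p - 0.25) < 1e-12 for p in p_xy.values())

# Conditioning on the common effect Z = 1 induces a dependence between X and Y.
p_z1 = sum(p for (x, y, z), p in joint.items() if z == 1)
p_xy_z1 = {(x, y): joint[(x, y, 1)] / p_z1 for x in (0, 1) for y in (0, 1)}
p_x_z1 = {x: p_xy_z1[(x, 0)] + p_xy_z1[(x, 1)] for x in (0, 1)}
p_y_z1 = {y: p_xy_z1[(0, y)] + p_xy_z1[(1, y)] for y in (0, 1)}
conditionally_dependent = any(abs(p_xy_z1[(x, y)] - p_x_z1[x] * p_y_z1[y]) > 1e-3
                              for x in (0, 1) for y in (0, 1))

print(marginally_independent, conditionally_dependent)  # True True
```

The conditional distributions here play exactly the role described above: each p(x_j | pa_j) is one causal mechanism, and the joint distribution is their product.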
The second cornerstone of the conditional independence based approach is
given by the postulate of causal faithfulness [6, 7] stating that no conditional
independences occur apart from those implied by the causal Markov condition. In
other words, one postulates that the causal mechanisms p(x_j | pa_j) are chosen in a
'generic' way in the sense that there are no conditional independences apart from
those that follow from the factorization (2).
As an example for a 'non-generic' choice of causal conditionals p(x_j | pa_j)
violating causal faithfulness assume that the joint distribution of three variables
X, Y, Z connected by the causal DAG in Fig. 3 satisfies X ⊥⊥ Y. Then the direct
and the indirect influence of X on Y would need to compensate each other, which
requires 'fine-tuning' of parameters. Assume, for instance, the variables are related
by the linear model

Z = αX + N_Z
Y = γX + βZ + N_Y,

where α, β, γ ∈ ℝ are coefficients and N_Y and N_Z are noise terms such that N_Y, N_Z, X
are jointly statistically independent. Then we have X ⊥⊥ Y whenever

αβ + γ = 0.  (3)
Causal faithfulness rejects (3) as an unlikely coincidence (unless there are good
reasons to believe in a mechanism that has done this tuning).4 A more natural
explanation of X ⊥⊥ Y would be that X and Y are causes of Z. This causal hypothesis
does not require fine-tuning of parameters and is therefore considered more likely.
4
Note that causal faithfulness has also been discussed in the context of Bell’s inequality: Wood
and Spekkens [8] argue that classical explanations of the EPR-scenario (with superluminal
communication, for instance) require similar fine-tuning of parameters.
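The fine-tuning in (3) can be made concrete numerically. In this hypothetical sketch (all coefficients invented), γ is chosen as −αβ so that the direct path X → Y exactly cancels the indirect path X → Z → Y; in this linear-Gaussian setting independence coincides with vanishing correlation, so X and Y come out uncorrelated even though X influences Y along two routes:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical fine-tuned linear model (coefficients invented):
# Z = alpha*X + N_Z, Y = gamma*X + beta*Z + N_Y, with gamma chosen so that
# the direct path X -> Y cancels the indirect path X -> Z -> Y, i.e. (3) holds.
alpha, beta = 2.0, 1.0
gamma = -alpha * beta

x = rng.normal(size=n)
z = alpha * x + rng.normal(size=n)
y = gamma * x + beta * z + rng.normal(size=n)

corr_xz = np.corrcoef(x, z)[0, 1]   # clearly nonzero: X influences Z
corr_xy = np.corrcoef(x, y)[0, 1]   # approximately zero despite X -> Y
print(round(corr_xz, 2), round(abs(corr_xy), 2))  # 0.89 0.0
```

Perturbing any one coefficient destroys the cancellation, which is why faithfulness treats (3) as a non-generic coincidence.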
Fig. 4 Left: joint distribution of a real-valued variable Y and a binary variable X which suggests
that Y is the cause and X the effect. Right: if X were the cause, changing p(x) while keeping p(y|x)
yields a rather strange joint distribution, which shows that the observed p(x) is 'non-generic'.
Example and images are taken from [9]
The example in Fig. 4 suggests that there are additional causal inference rules to
be discovered. This is because we want to be able to reject cases of fine-tuning of
parameters that do not yield additional conditional statistical independences. As a
theoretical foundation for new causal inference rules, Janzing and Schölkopf [10]
and Lemeire and Janzing [11] postulate the so-called Algorithmic Independence
of Conditionals stating that the shortest description of p(x_1, …, x_n) is given by
separate descriptions of p(x_j | pa_j). Here, description length is measured in terms of
Kolmogorov complexity [12]. For the case of two variables, this amounts to saying
that p(cause) and p(effect|cause) are algorithmically independent, that is, knowing
p(effect|cause) does not admit a shorter description of p(cause) and vice versa. This
principle is violated for the causal hypothesis X → Y in Fig. 4 because p(x) is the
unique distribution for which Σ_{x=0,1} p(x)p(y|x) is Gaussian. Hence, knowing p(y|x)
admits a shorter description of p(x).
To show that these ideas are related to physics, I want to mention that [13]
describes a common root of the principle of Algorithmically Independent Condi-
tionals and the arrow of time in thermodynamics.
Y = αX + N_Y,  (4)

where N_Y ⊥⊥ X. One can then show [15] that there is no such linear model with
additive noise in the backward direction unless both X and N_Y are Gaussian. This
means that any model of the form X = f(Y, N_X) either requires a non-linear function
f or a noise term N_X that is not independent of Y. For short, we then state that there
is no linear model from Y to X.
To infer the causal direction, one fits a linear function in both directions and
checks whether the error term is statistically independent of the input variable.5 If
independence of the noise term holds in one direction but not the other, then the
former is considered the causal direction. A non-linear generalization where (4) is
replaced with Y = f(X) + N_Y, for some general non-linear function f, has been
proposed in [16] and extensively tested in [17]. Janzing and Steudel [18] justify
this kind of additive-noise-based causal inference method via the Principle of
Algorithmically Independent Conditionals.
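A minimal sketch of this fitting procedure, assuming an invented toy pair with a linear model X → Y and uniform (non-Gaussian) noise. In place of a proper independence test such as HSIC, it uses a crude proxy for higher-order dependence (correlation between squared residual and squared input), which suffices for this example:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Toy cause-effect pair (invented): linear model X -> Y with *uniform*
# (non-Gaussian) noise, the case in which the direction is identifiable.
x = rng.uniform(-1.0, 1.0, n)
y = x + rng.uniform(-0.5, 0.5, n)

def dependence_score(inp, out):
    """Least-squares fit out ~ a*inp + b, then a crude proxy for higher-order
    dependence between residual and input: |corr(residual^2, input^2)|.
    (A real implementation would use a proper independence test such as HSIC.)"""
    a, b = np.polyfit(inp, out, 1)
    resid = out - (a * inp + b)
    return abs(np.corrcoef(resid**2, inp**2)[0, 1])

forward = dependence_score(x, y)    # X -> Y: residual is independent of input
backward = dependence_score(y, x)   # Y -> X: residual necessarily dependent
inferred = "X -> Y" if forward < backward else "Y -> X"
print(inferred)  # X -> Y
```

Mere vanishing correlation would not work here (cf. footnote 5): least-squares residuals are uncorrelated with the input by construction in both directions, so the test must probe higher-order dependence.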
A similar method has been used by Peters et al. [19] to infer the time direction
of empirical time series from various domains such as brain research and finance.
Although the problem is of no obvious practical interest, it can be used to test
concepts of causal inference. Given a time series (X_t)_{t∈ℤ} whose time direction has
possibly been inverted, the idea is to fit a linear autoregressive moving average
(ARMA) model

X_t = Σ_{j=1}^{p} α_j X_{t−j} + Σ_{j=1}^{q} β_j N_{t−j} + N_t,  (5)

and test whether the residual terms N_j are statistically independent and not only
uncorrelated. It turned out that the empirical time series under consideration more
often admitted a linear model from the past to the future than from the future
to the past. For first-order Markov processes, for instance, this means that the
'forward time conditional' p(x_t | x_{t−1}) admits a linear model while the 'backward
time conditional' p(x_{t−1} | x_t) does not.
Janzing [20] describes a toy model that relates this asymmetry to the usual
thermodynamic arrow of time. The idea is that X_t is a physical observable measured
at some fixed system S at time t. The noise N_j is provided by a beam of incoming
particles interacting with S. Furthermore, one assumes linear dynamics of the joint
system consisting of S and the particle beam. Then (5) is obtained by restricting
the joint dynamics to S and independence of the noise N_j follows whenever the
incoming particles are independent. Since the outgoing particles are dependent in
the generic case, running the process backwards in time would assume dependent
particles that become independent after interacting with S. Janzing [20] shows that
the backward scenario thus requires fine-tuning the dependences of the incoming
particles in order to ensure that interacting with S makes them independent.
Janzing [20] further argues that the example shows why the linearity of the joint
dynamics of S plus its environment is inherited by the forward time conditional, but
not by the backward time conditionals. The backward conditional is less simple
5
Note that vanishing correlation is not enough because linear least square regression automatically
yields uncorrelated residuals. Instead, one needs a statistical dependence test that is able to detect
higher order dependences.
due to the dependent noise terms. Likewise, one can argue that ‘causal’ con-
ditionals p.effectjcause/ are substantially different from ‘anticausal’ conditionals
p.causejeffect/ because the former inherit properties from the physical dynamics
that the latter may not inherit. The idea is that physical noise (permanently provided
by incoming particles and radiation) tends to be independent of what happened
before (the cause), but not independent of what happens then (the effect).
References
At What Time Does a Quantum Experiment Have a Result?
Thomas Pashby
Abstract This paper provides a general method for defining a generalized quantum
observable (or POVM) that supplies properly normalized conditional probabilities
for the time of occurrence (i.e., of detection). This method treats the time of occur-
rence as a probabilistic variable whose value is to be determined by experiment and
predicted by the Born rule. This avoids the problematic assumption that a question
about the time at which an event occurs must be answered through instantaneous
measurements of a projector by an observer, common to both Rovelli [22] and
Oppenheim et al. [17]. I also address the interpretation of experiments purporting to
demonstrate the quantum Zeno effect, used by Oppenheim et al. [17] to justify an
inherent uncertainty for measurements of times.
1 Introduction
This paper provides a critical guide to the literature concerning the answer to the
question: when does a quantum experiment have a result? This question was posed
and answered by Rovelli [22] and his proposal was critiqued by Oppenheim et al.
[17], who also suggest another approach that (as they point out) leads to the quantum
Zeno effect. What these two approaches have in common is the idea that a question
about the time at which an event occurs can be answered through the instantaneous
measurement of a projector (in Rovelli’s case, a single measurement; in that of
Oppenheim et al. [17], a repeated measurement). However, the interpretation of a
projection as an instantaneous operation that can be performed on a system at a time
of the experimenter’s choosing is problematic, particularly when it is the time of the
outcome of the experiment that is at issue.
In classical probability theory, illustrated in Sect. 4 with the simple case of an
exponential decay law, if the time of an event is a random variable (to be determined
by the result of experiment) then one integrates over a probability density to
find the probability for the occurrence of the event within a time interval (not
T. Pashby ()
Department of Philosophy, University of Chicago, 1115 E. 58th St., Chicago, IL 60637, USA
e-mail: pashby@uchicago.edu
The previous attempts to address the question posed here made use of the idea that
a quantum measurement is an instantaneous process that changes the state of the
system. It is habitually assumed in discussions of quantum foundations that the
observer is free to choose which observable to measure.1 However, the freedom
of choice allowed by the quantum formalism concerning the time at which the
observable is measured is seldom emphasized. It is this second kind of freedom
that I aim to challenge.
In orthodox quantum mechanics a self-adjoint operator A corresponds to a
physical quantity that can be measured by an experiment, i.e., a Schrödinger picture
observable. In the Heisenberg picture, the system state remains constant while the
observables vary with time, which means that to measure A at time t we must choose
A_t = U_t† A U_t from a whole family of time-indexed observables (each self-adjoint),
each of which measures A. This choice matters: in general, A_t and A_t′ will have
different expectation values for the same state (when t ≠ t′). But since nothing
in the formalism determines at what time the observable is to be measured, the
experimenter is apparently able to choose freely at what time she will act on the
system by measuring A_t, for some t.
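A small numerical illustration of this time-indexed family, for a hypothetical qubit (the choices H = σ_z, A = σ_x and state |+⟩ are invented for illustration): the Heisenberg observables A_t = U_t† A U_t assign different expectation values to the very same state at different times, here ⟨A_t⟩ = cos 2t.

```python
import numpy as np

# Hypothetical qubit illustration (H, A and the state are invented choices):
# H = sigma_z, A = sigma_x, state |+> = (|0> + |1>)/sqrt(2).
sz_diag = np.array([1.0, -1.0])          # eigenvalues of H = sigma_z
sx = np.array([[0.0, 1.0], [1.0, 0.0]])  # A = sigma_x
plus = np.array([1.0, 1.0]) / np.sqrt(2)

def heisenberg(A, t):
    # A_t = U_t^dagger A U_t with U_t = exp(-i H t); H is diagonal here.
    U = np.diag(np.exp(-1j * sz_diag * t))
    return U.conj().T @ A @ U

# <+| A_t |+> = cos(2t): the *same* state gives different expectation
# values depending on the time at which A is taken to be measured.
exp_0 = np.real(plus @ heisenberg(sx, 0.0) @ plus)        # cos(0)    = 1
exp_t = np.real(plus @ heisenberg(sx, np.pi / 4) @ plus)  # cos(pi/2) = 0
print(round(exp_0, 6), round(exp_t, 6))
```

Nothing in this formalism singles out one member of the family {A_t}; picking t is left entirely to the experimenter, which is precisely the freedom at issue.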
This idea can be traced back to the first edition of Dirac’s classic Principles
of Quantum Mechanics. Working in the Heisenberg picture, Dirac describes the
situation as follows:
A system, when once prepared in a given state, remains in that state so long as it remains
undisturbed. . . . [I]t is sometimes purely a matter of convenience whether we are to regard
a system as being disturbed by a certain outside influence, so that its state gets changed,
or whether we are to regard the outside influence as forming part of and coming in the
definition of the system . . . There are, however, two cases where we are in general obliged
to consider the disturbance as causing a change in the state of the system, namely, when the
disturbance is an observation and when it consists in preparing the system so as to be in a
1
Some surprisingly powerful results have followed from this assumption. Consider the role of
parameter independence in Bell’s Theorem or, more controversially, the so-called Free Will
Theorem [7].
given state. . . . This requires that the specification of an observation shall include a definite
time at which the observation is to be made . . . [8, p. 9]
There is, however, a real disconnect here between experimental physics and the
quantum formalism. In a typical quantum experiment one uses a detector (or several)
to collect results, and in practice an experimenter has no real control over the time
at which the detector will detect something (by clicking, or fluorescing, or what
have you). In a classic two-slit experiment, for example, the screen lights up in
a fairly well-defined place at a fairly well-defined time and there is little that the
experimenter can do to influence the location of this detection event in time (or
space).
Now, the time at which a quantum particle (e.g., a photon) is emitted can be
fairly tightly controlled (by a pulsed laser) and this control allows for measurement
of time of arrival, often used as a proxy for energy. But the time of arrival (i.e.,
the time of detection minus the time of emission) is not the sort of thing that the
experimenter can control with precision, and the kind of control that one might exert
(by increasing the energy, say) would be best described theoretically as changing
the state. In practice, it seems there is nothing the experimenter may choose to do in
order to ‘make a measurement’ at a specific instant of time; there is no action that
the experimenter performs to bring about the detection of a particle—these detection
events happen of their own accord.
The discussion above indicates that the orthodox interpretation of the formalism is
on the wrong track. At the root of the problems with trying to make sense of the
‘time of measurement’ or ‘time of collapse’ in quantum theory is, I contend, the
idea that the experimenter has control of precisely when to apply an observable. My
suggestion is that we can avoid these problems by asking questions instead about the
time of detection (or registration)—a time that actually can be recorded in a quantum
experiment. This time may be taken as a proxy for the time of a microscopic event
(like the ionization of an atom, say) but the key point is that predictions made for
the distribution in time of these events can be compared with actual experimental
statistics.
To correctly describe these experimental statistics requires a probability distri-
bution for the time of detection. In classical probability theory, this means treating
the time of detection as a random variable. In quantum theory, then, one must treat
the time of detection as an observable. Now, as is well-known, Pauli’s Theorem
implies that there is no self-adjoint operator with the requisite properties [24]. But
what is actually established by this result is that if the Hamiltonian has a semi-
bounded spectrum then there can exist no time-translation covariant Projection
Valued Measure (PVM). However, (as is also well-known) this does not prevent
the use of generalized observables, known as Positive Operator Valued Measures
(POVMs) to represent the time of an event [5, 9].2 The problem one then faces is to
pick out a particular POVM as being empirically significant.
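What it takes for a POVM to return a valid probability distribution can be made concrete. The sketch below checks the two defining conditions (positive effects summing to the identity) for a hypothetical three-outcome qubit POVM, an unsharp spin measurement with an invented efficiency parameter, and applies the Born rule Pr(i) = Tr[E_i ρ]:

```python
import numpy as np

# Hypothetical three-outcome qubit POVM (numbers invented): unsharp spin-up,
# unsharp spin-down, and a 'no detection' effect with efficiency eta.
eta = 0.8
effects = [eta * np.diag([1.0, 0.0]),
           eta * np.diag([0.0, 1.0]),
           (1 - eta) * np.eye(2)]

# Defining conditions of a (discrete) POVM: each effect is positive
# semidefinite and the effects sum to the identity.
assert np.allclose(sum(effects), np.eye(2))
assert all(np.linalg.eigvalsh(E).min() >= -1e-12 for E in effects)

# Born rule: Pr(outcome i) = Tr[E_i rho]; a valid probability distribution.
rho = np.array([[0.7, 0.2], [0.2, 0.3]])  # an invented density matrix
probs = [float(np.trace(E @ rho).real) for E in effects]
print([round(p, 2) for p in probs], round(sum(probs), 6))  # [0.56, 0.24, 0.2] 1.0
```

None of these effects is a projector, yet the probabilities are well defined, which is the point of footnote 2: self-adjointness (a PVM) is sufficient but not necessary for a valid observable.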
Spatiotemporal symmetries have traditionally played a significant role in
attempts to pick out experimentally relevant POVMs. Wightman [27] argued that
the position observable could be (in effect) uniquely determined by requiring that
the projectors of the associated PVM provide a representation of the Euclidean
group. Over 20 years later, Werner [26] showed how to generalize this method to
find POVMs appropriate to the symmetry group of any given three-dimensional
hyperplane (spatial or spatiotemporal). In doing so, he subsumed both the position
observable and the quantum time of arrival (see [9, 13]) within the more general
notion of a screen observable.
Inspired by Mackey [14], Wightman had interpreted the projectors of his position
PVM as 'experimental questions'. Underlying this idea is the notion that a projector,
P, represents a (possible) property of the system. An experimental question asks:
"Does the system have property P?" Each projector P defines a Heisenberg picture
family of observables P_t = U_t† P U_t corresponding to asking this experimental
question of the system at time t. Here, again, the time at which the question is asked
is apparently the free choice of the experimenter.
In the case of the position PVM, Δ ↦ P_Δ, a projector P_Δ corresponds to
the property of being located (or "localized") in spatial region Δ. According to
Wightman, measuring the observable P_Δ at time t asks the experimental question,
"is the system located in Δ at time t?" However, from the perspective of an
actual experiment—operationally, that is—this interpretation is problematic. If we
interpret P_Δ in terms of a detector located in the region Δ, this question asks: "if
the detector, sensitive in region Δ, is turned on for an instant t, does it register the
system?" But very few experiments (if any) involve detectors that are sensitive for a
mere instant of time.
Much more common are detectors that are sensitive over an extended period of
time, like a luminescent screen. With Werner's definition of a screen observable,
this is modeled by a spatiotemporal three-dimensional hyperplane that has one
dimension of time and two of space3 which leads to a POVM, Σ × I ↦ E_{Σ×I}.
Following Wightman, measurement of E_{Σ×I} could be thought of as asking, "if the
detector, sensitive in area Σ, is turned on for an interval of time I = [t_1, t_2], does it
register the system?"
2
In particular, requiring that an observable must correspond to a self-adjoint operator (and thus a
PVM) is sufficient to guarantee that it returns a valid probability distribution, but this requirement
is not necessary. What is necessary for an observable to return a valid probability distribution
from the quantum state is that it defines a POVM [4], and every PVM is a POVM. Given Pauli’s
Theorem, then, an event time observable will have to be a POVM that is not a PVM (and thus its
first moment defines an operator that is symmetric but not self-adjoint).
3 That is, Werner models a screen as a two-dimensional spatial plane, Σ, that extends indefinitely
in time.
4 Note that this can be seen to subsume the idea of a screen observable: Hoge [10] explicitly
demonstrates that this returns the appropriate screen observable as the volume becomes an area,
|Δ| → 0.
Pr(E|F) = Tr[Eρ′] = Tr[FρFE] / Tr[Fρ],  (1)

and thus the following state transition is associated with this conditionalization:

ρ → ρ′ = FρF / Tr[Fρ].  (2)

However, the time of occurrence necessarily defines a POVM rather than a PVM
(due to Pauli's Theorem) so it cannot supply the projectors to which Lüders' Rule
would apply. When dealing with such a POVM, the standard move is to make use
of the associated Lüders Operation instead, which conditions the system using a
positive operator [6]. The Lüders Operation is obtained simply by replacing the
projector F with the positive operator5 A^{1/2}:

ρ → ρ̂ = A^{1/2} ρ A^{1/2} / Tr[Aρ].  (3)

The corresponding conditional probability is

W(B|A) = Tr[Bρ̂] = Tr[A^{1/2} ρ A^{1/2} B] / Tr[Aρ].  (4)

However, this only obtains if A² = A, in which case A is a projector and the Lüders Operation
reduces to Lüders' Rule.
Therefore, if the Lüders Operation is to lead to a well-formed conditional
probability for the time of occurrence POVM, it must do so by other means.6
I resolve this problem in Sect. 5 by showing how to derive a valid conditional
probability distribution from the Born Rule (applied at an instant) that supplies an
alternative way to relate the time of occurrence POVMs of Brunetti and Fredenhagen
5 Any bounded positive operator A has a unique positive square root A^{1/2} such that (A^{1/2})² = A. A
projector P is just a positive operator for which P² = P and thus P^{1/2} = P.
6 I first raised this difficulty in [19], which also contains a history of time in early quantum theory.
Here we examine how classical probability theory provides the probability distri-
bution for the time of an event whose occurrence follows an exponential decay
law. For a radioactive atom, one might justify such a law phenomenologically as
follows. Taking a uniform sample of a radioactive isotope we observe that the
number of nuclei that will decay in a given interval of time is proportional to
the original number of nuclei in the sample. Furthermore, the proportion of the
original sample of N0 nuclei that remains after a time t is seen to follow the simple
rule
N_t / N_0 = e^{−t/T} = e^{−λt},

or equivalently, in terms of the half-life t_{1/2},

N_t / N_0 = e^{−(ln 2) t / t_{1/2}} = e^{−λt}.
And if our sample consists of a single atom then this fraction may be interpreted as
the probability of decay after time t.
To calculate the probability that the time of decay of a single atom, t_d, lies within
some time interval [t_1, t_2], where t_1 ≥ 0 and t_2 > t_1, we integrate the corresponding
probability density, λe^{−λt}, as follows:

Pr(t_1 ≤ t_d < t_2) = ∫_{t_1}^{t_2} λe^{−λt} dt = (1/T) ∫_{t_1}^{t_2} e^{−t/T} dt.  (6)
These time intervals contain the values of a random variable T_d (in the sense of
probability theory) corresponding to the time of decay.
Using this assignment of probabilities to time intervals, we have the means to
answer the question: “Has the atom already decayed?” To this question, asked at
time t, there corresponds the probability assignment
\[
\Pr(0 \le t_d < t) = \int_{0}^{t} \lambda e^{-\lambda s}\, ds,
\]

and in the limit t → ∞,

\[
\Pr(0 \le t_d < \infty) = \int_{0}^{\infty} \lambda e^{-\lambda s}\, ds = 1.
\]

This simply says that the decay is certain to occur during the time interval [0, ∞).
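As a numerical check (my own sketch, with an arbitrary decay constant), integrating the density λe^{-λs} over [0, t] reproduces the closed form 1 − e^{-λt}, and the probability tends to 1 as t grows:

```python
import math

lam = 0.5  # illustrative decay constant (arbitrary units)

def pr_decay_before(t, n=100_000):
    """Pr(0 <= t_d < t): midpoint-rule integral of lam*exp(-lam*s)
    over [0, t]; the closed form is 1 - exp(-lam*t)."""
    h = t / n
    return sum(lam * math.exp(-lam * (i + 0.5) * h) * h for i in range(n))

# Quadrature agrees with the closed form...
assert abs(pr_decay_before(3.0) - (1 - math.exp(-lam * 3.0))) < 1e-6
# ...and tends to 1 for large t: decay is certain over [0, inf).
assert abs(pr_decay_before(50.0) - 1.0) < 1e-4
```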
Note, therefore, that it is simply built into this assignment of probabilities that
the atom will decay, which shows that these are conditional probabilities, as I now
explain. Consider the following simple example. In assigning probabilities to the
results of a coin toss occurring at some later time t, the obvious normalization
implies that the probability of the event that either the coin lands heads or the
coin lands tails is 1, and thus the coin toss is certain to occur.7 Therefore, if this
probability assignment is correct, then the future occurrence of the coin toss is
guaranteed.
This may make it sound as if we have ensured the future occurrence of this event
simply by assigning probabilities to its results, but of course we have not. All this
says is that the probabilities are conditioned on the occurrence of the coin toss and
apply just in case it does take place, though it need not—if the coin toss does not
occur then the condition is not met and there are no events to which the probability
assignment applies.
In the case of the time of decay the only difference is that the events (in the
sense of probability theory) to which we assign probabilities describe the decay
of the atom at different time intervals. These are the possible results of the decay
process, and to the event of decay at some time we assign probability one. But,
again, this is just to say that this is an assignment of conditional probabilities.
No event is forced to happen simply by making a probability assignment con-
ditioned on its occurrence, and if in fact no decay subsequently occurred then
there were simply no events to which the conditional probability assignment
applied.
7. This is just the assumption that we have a well-formed event space.
Taking the above treatment of the time of decay as a model, I now show how
to arrive at a well-formed probability distribution for the time of occurrence by
treating the real number supplied by the Born Rule as a probability density.
We begin with a time-indexed family of Heisenberg projectors P_t, where P
represents the occurrence of the outcome in question (such as the click of a
detector). These projectors correspond to a family of propositions stating that
"the outcome occurs at t". We will assume that an outcome occurs at some
t ∈ ℝ, and only once.8 With this assumption the occurrence of an outcome
at t and the occurrence of an outcome at t′ are mutually exclusive events (for
t ≠ t′).
These assumptions amount to the claim that we are dealing with conditional
probabilities, conditioned on the fact that the experiment has a unique outcome at
some unknown time. This condition will be reflected in the proper normalization
of our probability distribution by means of a suitably formed probability density.
Assuming the Heisenberg state of the system is a pure state |ψ⟩, an application
of the Born Rule to the Heisenberg projectors P_t returns the following function
of t:

\[
f(t) := \langle\psi|P_t\,\psi\rangle. \tag{8}
\]

This is not yet suitable for use as a probability density; if we integrate f(t) over t we
find

\[
\int_{-\infty}^{\infty} f(t)\, dt = \int_{-\infty}^{\infty} \langle\psi|P_t\,\psi\rangle\, dt, \tag{9}
\]
which is evidently unnormalized and may not even converge. However, if the
integral is finite (at least for some |ψ⟩) we may use it as a normalization factor in
defining the genuine probability density

\[
f'(t) = \left(\int_{-\infty}^{\infty} f(t)\, dt\right)^{-1} f(t) = \tau\, f(t), \tag{10}
\]

where the normalization τ is determined by the operator9

\[
S := \int_{-\infty}^{\infty} P_t\, dt. \tag{11}
\]
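The normalization step can be illustrated numerically. The Gaussian below is a purely hypothetical stand-in for the Born-rule function f(t) = ⟨ψ|P_t ψ⟩; any non-negative integrable curve would serve the same purpose:

```python
import math

# Hypothetical stand-in for f(t) = <psi|P_t psi>: a non-negative,
# integrable curve (this particular Gaussian is purely illustrative).
def f(t):
    return 0.3 * math.exp(-(t - 2.0) ** 2)

def integrate(g, a, b, n=100_000):
    """Midpoint-rule quadrature of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) * h for i in range(n))

# Normalization factor <S> = integral of f over the whole line
# (truncated to a window outside which f is negligible).
S = integrate(f, -30.0, 30.0)

def f_norm(t):          # the genuine probability density f'(t) = f(t)/<S>
    return f(t) / S

# f' integrates to 1, so Pr(t1 <= t_o < t2) is its integral over [t1, t2].
assert abs(integrate(f_norm, -30.0, 30.0) - 1.0) < 1e-6
p = integrate(f_norm, 1.0, 3.0)
assert 0.0 < p < 1.0
```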
8. It must be admitted that this restricts the application of this technique to experiments where only a single outcome of some type is expected. However, these experiments are precisely those where probabilistic answers to the question "when does the experiment have an outcome?" make sense.
150 T. Pashby
Where defined, this operator S is positive and thus symmetric (on an appropriate
domain). Note that there is a good reason to be interested only in those states for
which the expectation of S is strictly positive and finite (otherwise we are dealing
with an experiment without an outcome), in which case we may write10:

\[
\tau = \frac{1}{\langle S\rangle} = \frac{1}{\langle\psi|S\,\psi\rangle}. \tag{12}
\]
This allows for the definition of a probability distribution for the time of the
outcome, t_o, which depends on the state |ψ⟩. If f'(t) is the correct probability density
then in the state |ψ⟩ the probability that the statement "the outcome occurs during
[t_1, t_2]" is true is given by

\[
\Pr(t_1 \le t_o < t_2) = \int_{t_1}^{t_2} f'(t)\, dt = \frac{1}{\langle S\rangle}\int_{t_1}^{t_2} \langle\psi|P_t\,\psi\rangle\, dt. \tag{13}
\]
Again, in quantum mechanics one would expect that this probability is the expec-
tation of a positive operator, and the linear operator F_{[t_1,t_2]} defined by varying
φ, ψ ∈ H in the following expression suggests itself for this purpose:

\[
\langle\varphi|F_{[t_1,t_2]}\,\psi\rangle := \int_{t_1}^{t_2} \langle\varphi|P_t\,\psi\rangle\, dt. \tag{14}
\]
This operator is positive and bounded, and one can see that the operators F_{[t_1,t_2]}
obtained by varying [t_1, t_2] are apt to form an unnormalized POVM, Δ ↦ F_Δ, of
which S is the first moment, i.e. F_ℝ = S. Using this operator, we can rewrite the
probability for time of occurrence as

\[
\Pr(t_1 \le t_o < t_2) = \int_{t_1}^{t_2} f'(t)\, dt = \frac{\langle\psi|F_{[t_1,t_2]}\,\psi\rangle}{\langle\psi|F_{\mathbb{R}}\,\psi\rangle}. \tag{15}
\]
9. Note that this operator depends on the Hamiltonian H through the Heisenberg picture family P_t = U_t† P U_t, where U_t = e^{-iHt}. It is not surprising that an operator related to the time of an event should depend on the dynamics.
10. For reasonable choices of P and H, Brunetti and Fredenhagen [3] demonstrate that this definition does lead to a positive operator S defined on the domain obtained by taking the orthogonal complement of the subspace of states for which the expectation of S is 0 or infinity.
To see how a normalized POVM arises naturally from the expression above, we first
consider how S acts to condition the state through its associated Lüders Operation
(Eq. (3)),

\[
\rho \;\to\; \hat\rho = \frac{S^{1/2}\,\rho\, S^{1/2}}{\operatorname{Tr}[S\rho]}. \tag{16}
\]
In Sect. 3.1, I pointed out that a naive interpretation of the Lüders Operation does not
lead to a conditional probability. We have avoided this problem here by adopting the
expression above (15) as our definition of a conditional probability. Indeed, applied
to the state ρ, (15) leads to:

\[
\Pr(t_1 \le t_o < t_2) = W'(F_{[t_1,t_2]}|S) := \frac{\operatorname{Tr}[F_{[t_1,t_2]}\,\rho]}{\operatorname{Tr}[S\rho]}.
\]
Here the probabilities are properly conditioned since obviously W'(F_ℝ|F_ℝ) = 1,
although the connection to Lüders' Rule is not yet clear. This connection is provided
by making use of Brunetti and Fredenhagen's operator normalization technique,
which converts Δ ↦ F_Δ to a properly normalized POVM Δ ↦ E_Δ by defining

\[
E_\Delta := S^{-1/2}\, F_\Delta\, S^{-1/2}.
\]

The expectation of E_{[t_1,t_2]} in the conditioned state ρ̂ of Eq. (16) then
returns the probability for the occurrence of the outcome described by the projector
P during the time interval [t_1, t_2]. This probability is conditioned on the occurrence
of the outcome at some time t_o ∈ ℝ during the entire course of the experiment.11
In Sect. 2, I argued that the orthodox account of quantum mechanics leads to the idea
that the measurement of an observable is an instantaneous process that takes place
at the behest of the experimenter. By rejecting this idea in favor of an interpretation
of time-indexed Heisenberg projectors as describing the occurrence of a detection
11. In case this seems only tangentially related to Lüders' Rule, note that in the temporally extended Hilbert space this expression becomes identical to Lüders' Rule. See [18, §8.3].
If the initial state of system and apparatus is (c_a|a⟩ + c_b|b⟩) ⊗ |init⟩,
where |c_a|^2 + |c_b|^2 = 1 and |init⟩ is the 'ready' state of the apparatus, then the final,
post-measurement state is

\[
c_a|a\rangle\otimes|O_a\rangle + c_b|b\rangle\otimes|O_b\rangle,
\]

where |O_a⟩ and |O_b⟩ are states of the apparatus that indicate results a and b,
respectively.
Rovelli defines his projector M by specifying its action as follows: M acts as the
identity on the correlated states |a⟩⊗|O_a⟩ and |b⟩⊗|O_b⟩, and if |φ⟩ is a state of the
apparatus such that ⟨φ|O_a⟩ = 0 and ⟨φ|O_b⟩ = 0 then M annihilates |a⟩⊗|φ⟩ and
|b⟩⊗|φ⟩. On this definition,

\[
P(t) = \langle\Psi(t)|M\,\Psi(t)\rangle
\]

"will be a smooth function that goes monotonically from 0 to 1 in the time interval
0 to T" (1037).12
To find out whether or not the measurement of O has happened (as of time t), we
must perform a measurement of M on the combined system whose result indicates
either a perfect correlation between system and apparatus, or no correlation at all.
Rovelli’s proposal, then, involves two distinct types of measurement: there is the
measurement of O by the apparatus, described by unitary evolution of the joint
system, and the measurement of the projector M by the observer at a time of her
choosing, described according to von Neumann’s collapse process.
But if collapse (due to measurement of O) has already happened when M is
measured then P.t/ gives the wrong probabilities since it does not account for this
fact. And if it has not, then Rovelli’s interpretation of M as answering the question
“has a measurement of O already occurred?” cannot be sustained. That is, Rovelli’s
proposal faces the following dilemma:
When measuring M at time t, either the measurement of O already occurred at
some time t′ < t, in which case P(t) does not give the correct probabilities, or
the measurement of O occurs at time t, in which case the result M = 1 cannot be
interpreted as saying that a measurement has already happened.
On neither of these alternatives does Rovelli’s proposal do the job as advertised,
and on the second we again reach the problematic conclusion that the experimenter
may choose exactly when a measurement occurs. This problem is avoided in taking
12. Now, Oppenheim et al. [17] point out that there is a technical problem here since the system may be subject to the backflow effect, in which case P(t) can decrease with increasing t and thus fails to be monotonic. There is also the further problem that no unitary evolution can take a pure state to a mixture. My concern, however, is conceptual rather than technical.
the approach that I suggested above, which defines the time of occurrence as a
conditional probability for occurrence within a time interval. My way of thinking
is also suggested by one of Rovelli’s own formulations of the question we are trying
to answer:
When is it precisely that a quantum event “happens?” Namely, when precisely can we
replace the statement “this may happen with probability p” with the statement “this has
happened?” (1032)
His second formulation can be answered as follows: the statement “this (event)
has happened” is true at times later than the occurrence of the event (as indicated
by the use of the past tense). Furthermore, the statement “this event has happened”
becomes true not because the experimenter does something to the system at some
specific time, but simply because the event in question does occur in the course
of the experiment, and the time at which it does so is not under her control. So
instead of looking for an ‘experimental question’ that we ask at a specific time of
the system, we should seek to investigate the question, “at what time does the event
occur?” since it is after this time that we can truly say that the event has happened.
What Rovelli’s analysis of the situation here simply misses, however, is that the
probabilistic answers given to questions about the time of occurrence for some event
take it for granted that the event does occur (at some unknown time), and so the
statement “this event will happen” remains true until then.13 The reason that this
alternative is hard to see is because the method of asking and answering questions
at a time is so ingrained in the von Neumann-Mackey formalism. However, the idea
that we should ask questions about the time of an event (rather than at a time) arose
naturally in giving a probabilistic analysis of atomic decay—a canonical example
of a truly quantum process—and, in the previous section, I showed how to derive
a probability distribution for the time of any given registration event arising from a
quantum process.
13. In the background here is a simple picture of temporal passage in which a future event becomes
present and then past, and tensed statements about that event change their truth value in response.
(See [21] for the canonical statement of this view.) For example, if I light the fuse of a firework at
time tb and say “the firework will explode,” that statement is true and is made true by the occurrence
of the explosion at a later time te > tb . After the explosion, at times t > te , the statement “the
firework will explode” is false while the statement “the firework did explode” is true, and is made
true by the occurrence of the explosion at an earlier time.
To localize this time more precisely, Oppenheim, Reznick and Unruh consider a
family of repeated measurements of M. The idea behind this is simple: to get a better
read on this time, we record the results of many successive measurements of M at
times {t_1, t_2, t_3, ...}. Looking at a typical string of results of these measurements
we will see something like {0, 0, 1, ...}, which in this case indicates that the outcome
occurred sometime between t_2 and t_3.
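In code, this localization rule is just a matter of reading the record. The times and results below are made up for illustration:

```python
# Given results of successive M-measurements at known times, the outcome is
# localized to the interval between the last 0 and the first 1.
times = [1.0, 2.0, 3.0, 4.0]   # pre-determined measurement times (illustrative)
record = [0, 0, 1, 1]          # e.g. the string {0, 0, 1, ...}

first_one = record.index(1)
interval = (times[first_one - 1], times[first_one])
assert interval == (2.0, 3.0)  # outcome occurred between t_2 and t_3
```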
This, they claim, provides a more appropriate answer to the question “when did
the measurement occur?” than Rovelli’s single measurement of M.14 To get the
correct measurement statistics for successive measurements of M, however, requires
taking into account the results of each prior measurement. These joint probabilities
require distinctive measurement statistics, arrived at by successive applications of
Lüders’ Rule (1).
In particular, if the state is pure then ρ = |φ⟩⟨φ| for some unit vector |φ⟩ and so
from (1) we obtain

\[
\Pr(E|F) = \langle\varphi'|E\,\varphi'\rangle = \frac{\langle\varphi|FEF\,\varphi\rangle}{\langle\varphi|F\,\varphi\rangle}, \tag{25}
\]

which leads to

\[
|\varphi\rangle \;\to\; |\varphi'\rangle = \frac{F|\varphi\rangle}{\sqrt{\langle\varphi|F\,\varphi\rangle}}. \tag{26}
\]
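Equations (25) and (26) can be checked on a toy qubit example of my own (real amplitudes suffice here): conditioning on F by collapse and then taking the expectation of E gives the same number as the joint formula.

```python
import math

# Minimal 2x2 linear algebra (real entries suffice for this sketch).
def mat_vec(M, v):
    return [M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1]]

def inner(u, v):
    return u[0]*v[0] + u[1]*v[1]

# Projectors: F onto |0>, E onto |+> = (|0> + |1>)/sqrt(2).
F = [[1.0, 0.0], [0.0, 0.0]]
E = [[0.5, 0.5], [0.5, 0.5]]

phi = [math.cos(0.3), math.sin(0.3)]   # an arbitrary unit vector

# Joint route, Eq. (25): Pr(E|F) = <phi|FEF phi> / <phi|F phi>
pr_joint = (inner(phi, mat_vec(F, mat_vec(E, mat_vec(F, phi))))
            / inner(phi, mat_vec(F, phi)))

# Collapse route, Eq. (26): condition on F first, then measure E.
Fphi = mat_vec(F, phi)
norm = math.sqrt(inner(phi, Fphi))
phi_prime = [x / norm for x in Fphi]
pr_collapse = inner(phi_prime, mat_vec(E, phi_prime))

assert abs(pr_joint - pr_collapse) < 1e-12
assert abs(pr_joint - 0.5) < 1e-12   # collapse to |0>, then <0|E|0> = 1/2
```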
14. They also point out that their analysis is completely general and applies to any projector.
15. That is, the subspace onto which I − P projects is just the orthogonal complement of the range of P.
Oppenheim, Reznick and Unruh's proposal is just a special case of this general
rule for calculating joint probabilities: according to their proposal, the probability
that the time of the outcome of the experiment lies between two times is given by the
application of Lüders’ Rule to a uniform sequence of instantaneous measurements
of M on the joint system (with original state |Ψ⟩). That is, one chooses a time-
resolution for the measurements and calculates the probability that the transition
between M = 0 and M = 1 occurs within a given time interval.16
However, the events to which one is really assigning probabilities are sets of
outcomes of a series of instantaneous measurements of M taking place at pre-
determined times. This is not a probability distribution for the time of an event.17
Nonetheless, in taking this approach, which gives a joint probability for the results
of series of instantaneous measurements taking place at pre-defined times, they do
16. One of the more counterintuitive aspects of this proposal is that successive measurements of M need not agree—even after an outcome has ostensibly obtained—and thus a sequence of results such as {0, 0, 1, 1, 0} is possible (in the sense that it is assigned non-zero probability). If a measurement of M with result M = 1 takes place at t_0, consider the probability of measuring M = 1 at some later time t > t_0. Assume for contradiction that the second measurement of M is guaranteed to return M = 1. Then P(t) = 1 for all t > t_0 and thus M_{t_0}|Ψ⟩ is an eigenstate of M_t with eigenvalue 1 for any t > t_0. Since |Ψ⟩ was arbitrary, this is true for any |Ψ⟩ ∈ H. In that case, it follows that M_{t_0} ≤ M_t (i.e., M_{t_0} projects onto a subspace of M_t) and so they commute. But, as Oppenheim, Reznick and Unruh point out (p. 133), in general two projections from the same family M_t, M_{t_0} with t ≠ t_0 do not commute. (And, as they argue, exceptions to this claim will be rare. For example, if the Hamiltonian is periodic then only if |t − t_0| is equal to (a multiple of) the period will M_t, M_{t_0} commute (since in that case M_t = M_{t_0}).) Therefore, in general there is a non-zero probability that a measurement of M_t will return M = 0. (Note that we can run the same argument using the Heisenberg picture family M′_t = U_t† M′ U_t = U_t†(I − M)U_t = I − M_t, or any such family.)
17. Oppenheim, Reznick and Unruh critique Rovelli's proposal on the grounds that:
His scheme only answers the first question: “has the measurement occurred already at a
certain time?”, but does not answer the more difficult question “when did the measurement
occur?” In other words, it does not provide a proper probability distribution for the time of
an event. (108)
I agree, but note that their proposal is subject to precisely this latter critique.
avoid the assumption that the measurement process takes place at a time chosen by
the experimenter. But this leads instead to the idea that the apparatus is repeatedly
measuring M of its own accord, which results in the (so-called) quantum Zeno effect.
Since unstable particles do seem to decay, the obvious inference to make is that such
particles are not under continuous observation (in this sense). Misra and Sudarshan
make several suggestions along these lines, one of which is that the sort of apparatus
actually used to make observations in the lab (like a bubble chamber) is better
modeled by the discrete repeated measurement process rather than its continuous
limit.
This appears to be the opinion of Oppenheim, Reznick and Unruh, who extend
this idea to suggest that there is “always an inherent inaccuracy when measuring the
time that the event (of measurement) occurred” [17, p. 177]. That is, the accuracy
to which one can determine the time of measurement depends on the frequency
with which one can measure M, and this frequency cannot be increased arbitrarily
without running into the quantum Zeno effect (and thus ensuring that M never comes
to have value M D 1). However, the conclusion that the quantum Zeno effect limits
the accuracy to which one can determine the time of an event in this way depends
crucially on the assumption that the experiment in question is best modeled by the
repeated application of a projector to the system state.
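The quantum Zeno effect itself is easy to exhibit in the simplest projective model (a textbook sketch of my own, not a model of the experiments discussed here): for a two-level system whose survival probability after free evolution for time t is cos²(ωt), interposing n projections onto the initial state during a total time T gives survival [cos²(ωT/n)]ⁿ, which tends to 1 as n grows.

```python
import math

# Two-level system: after free evolution for time t, the probability of
# still being in the initial state is cos^2(omega * t). Projecting back
# onto the initial state n times during total time T gives survival
# [cos^2(omega * T / n)]^n, which tends to 1 as n -> infinity.
omega = 1.0
T = math.pi / 2      # free evolution would fully empty the initial state

def survival(n):
    return math.cos(omega * T / n) ** (2 * n)

assert survival(1) < 1e-12               # a single measurement at T: state has left
assert survival(100) > 0.95              # frequent projection freezes the decay
assert survival(10_000) > survival(100)  # approaching 1 in the continuous limit
```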
18. They make two additional non-trivial assumptions. First, they assume that the projection P commutes with the one-parameter semi-group of time translations generated by the dynamics. Second, they assume that the semi-group representing the evolution of the system under continuous observation (that results from taking the n → ∞ limit) is continuous for t ≥ 0, not just t > 0.
Here I disagree: although some experiments are suitably modeled in this way
(see [2]), these are not experiments where the time of an outcome is allowed to vary.
What the numerous purported demonstrations of the quantum Zeno effect really
demonstrate is that this effect only comes into play under special experimental
circumstances involving strong interactions between system and apparatus, often
taking place at pre-determined times. However, these circumstances do not arise
when using registration devices such as photodetectors, Geiger counters, scintilla-
tors, bubble chambers, etc., and so a typical detector is poorly modeled by these
means.
Sudbery [25] argues for a similar conclusion by comparing the seminal demon-
stration of the quantum Zeno effect of Itano et al. [11] with similar experiments
aimed at displaying ‘quantum jumps’ [16]. In the former case, one ‘measures’ the
presence of a system (a Be⁺ ion) in its ground state through the application of a
pulsed laser (which induces fluorescence) and finds that as the frequency of pulses
within some time period increases, the system is less likely to have made it out of
the ground state during that period. This is given the interpretation that the dynamics
are ‘frozen’ by the act of repeated measurement—the quantum Zeno effect.
In the quantum jump experiment of Nagourney et al. [16], however, the presence
of the system in its ground state is detected similarly (through fluorescence)
but the laser is not pulsed and couples less strongly. In this case, the system
displays the characteristically quantum behavior of ‘jumping’ between the two
energetically favored states at times that are fairly well-defined experimentally
(although the distribution of these times, being essentially random, can only be
predicted probabilistically). Here, the continuous application of the laser has no
Zeno-like effect.
Sudbery attributes the difference between these two cases to the fact that in the
quantum jump experiment the laser is steady and so can be modeled by a time-
independent interaction, whereas in the quantum Zeno experiment the pulsed laser
requires a time-dependent interaction. However, this cannot be the whole story since
then we would expect to see no Zeno effect when the laser is left on permanently
rather than pulsed, which is not what happens in the quantum Zeno experiment.19
Instead, I want to endorse Ballentine’s suggestion that, in the experiment of Itano
et al. [11],
the quantum Zeno effect is not really a consequence of continuous observation, but rather
of an excessively strong perturbation by an inappropriate apparatus [1, p. 5166]
This point of view is confirmed by the recent experiment of Patil et al. [20] which
displays a clear separation between a “Zeno regime” of strong coupling and a weak
coupling regime.
By design, then, a detector must be weakly coupled to the experimental system,
and this coupling can be modeled by a time-independent interaction—crucially,
19. In fact, Itano et al. [11] describe how they did exactly this to prepare the system in the ground state.
without breaking the time translation symmetry of the system by applying repeated
projections at pre-determined times. In that case, we may apply the method from
Sect. 5 to derive a conditional probability distribution for the time of occurrence
from an effect that describes the event in question.20 Here, while the probability of
detection in unit time will vary with time, it varies as a result of unitary evolution of
the system rather than an explicit time-dependence of the coupling.21
This allows us to resolve the difficulties with predicting the time of an outcome
along the lines suggested above: rather than thinking of a measurement as something
one does to the system, we may think of an experiment as something that happens—
a process that results in a special kind of quantum jump which leads to a registration
event.22 This allows us to replace the idea that a measurement result arises from
the application of an observable to the system at a pre-specified time with the idea
that a registration event occurs as the result of a quantum jump of the detector at
an essentially random (although probabilistically predictable) time. In this way, the
time translation symmetry of the system is broken by nature, not the experimenter.
References
1. L.E. Ballentine, Comment on “quantum Zeno effect”. Phys. Rev. A 43(9), 5165 (1991)
2. A. Beige, G.C. Hegerfeldt, Projection postulate and atomic quantum Zeno effect. Phys. Rev. A
53(1), 53 (1996)
3. R. Brunetti, K. Fredenhagen, Time of occurrence observable in quantum mechanics. Phys.
Rev. A 66(4), 044101 (2002)
4. P. Busch, Quantum states and generalized observables: a simple proof of Gleason’s theorem.
Phys. Rev. Lett. 91(12), 120403 (2003)
5. P. Busch, M. Grabowski, P.J. Lahti, Time observables in quantum theory. Phys. Lett. A 191(5),
357–361 (1994)
6. P. Busch, M. Grabowski, P.J. Lahti, Operational Quantum Physics (Springer, Berlin, 1995)
7. J. Conway, S. Kochen, The free will theorem. Found. Phys. 36(10), 1441–1473 (2006)
8. P.A.M. Dirac, The Principles of Quantum Mechanics. (Clarendon Press, Oxford, 1930)
9. I. Egusquiza, J. Muga, A. Baute, “Standard” quantum–mechanical approach to times of arrival,
in Time in Quantum Mechanics (Springer, New York, 2002), pp. 305–332
20. I note that Ruschhaupt et al. [23] provide a theoretical basis for modeling a detector as involving
quantum jumps in a related way, and also make use of the operator normalization technique of
Brunetti and Fredenhagen [3] to derive time of arrival POVMs. It seems likely that these POVMs
could be interpreted as conditional probabilities (as suggested above), although Ruschhaupt et al.
[23] do not interpret them as such.
21. To illustrate: if the interaction term of the Hamiltonian is position dependent then the strength of
the coupling may vary with time (through the Schrödinger picture evolution of the joint state) but
the Hamiltonian for the joint system will be time-independent.
22. The focus of this paper is the time of this event, which arises from the weak coupling of two
systems. An alternative question concerns the correct description of the occurrence of many such
events as a stochastic process. The use of a stochastic master equation to describe continuous weak
measurements seems promising in this regard [12, §4].
10. M.O. Hoge, Relationale Zeit in der Quantenphysik. Master’s Thesis, University of Hamburg
(2008)
11. W.M. Itano, D.J. Heinzen, J.J. Bollinger, D.J. Wineland, Quantum Zeno effect. Phys. Rev. A
41(5), 2295 (1990)
12. K. Jacobs, D.A. Steck, A straightforward introduction to continuous quantum measurement.
Contemp. Phys. 47(5), 279–303 (2006). http://dx.doi.org/10.1080/00107510601101934
13. J. Kijowski, On the time operator in quantum mechanics and the Heisenberg uncertainty
relation for energy and time. Rep. Math. Phys. 6(3), 361–386 (1974)
14. G.W. Mackey, The Mathematical Foundations of Quantum Theory (WA Benjamin, New York,
1963)
15. B. Misra, E.C.G. Sudarshan, The Zeno's paradox in quantum theory. J. Math. Phys. 18(4),
756–763 (1977)
16. W. Nagourney, J. Sandberg, H. Dehmelt, Shelved optical electron amplifier: observation of
quantum jumps. Phys. Rev. Lett. 56(26), 2797 (1986)
17. J. Oppenheim, B. Reznik, W.G. Unruh, When does a measurement or event occur? Found.
Phys. Lett. 13(2), 107–118 (2000)
18. T. Pashby, Time and the foundations of quantum mechanics. Ph.D. Thesis, University of
Pittsburgh (2014). http://philsci-archive.pitt.edu/10723/
19. T. Pashby, Time and quantum theory: a history and a prospectus. Stud. Hist. Phil. Sci. Part B:
Stud. Hist. Phil. Mod. Phys. 52, 24–38 (2015)
20. Y.S. Patil, S. Chakram, M. Vengalattore, Measurement-induced localization of an ultracold
lattice gas. Phys. Rev. Lett. 115(14), 140402 (2015)
21. A.N. Prior, Changes in Events and Changes in Things. (Department of Philosophy, University
of Kansas, Lawrence, 1962)
22. C. Rovelli, "Incerto tempore, incertisque loci": can we compute the exact time at which a
quantum measurement happens? Found. Phys. 28(7), 1031–1043 (1998)
23. A. Ruschhaupt, J. Gonzalo Muga, G.C. Hegerfeldt, Detector models for the quantum time of
arrival, in Time in Quantum Mechanics-Vol. 2 (Springer, Berlin, 2009), pp. 65–96
24. M. Srinivas, R. Vijayalakshmi, The ‘time of occurrence’ in quantum mechanics. Pramana
16(3), 173–199 (1981)
25. A. Sudbery, Diese verdammte Quantenspringerei. Stud. Hist. Phil. Sci. Part B: Stud. Hist. Phil.
Mod. Phys. 33(3), 387–411 (2002)
26. R. Werner, Screen observables in relativistic and nonrelativistic quantum mechanics. J. Math.
Phys. 27, 793 (1986)
27. A.S. Wightman, On the localizability of quantum mechanical systems. Rev. Mod. Phys. 34,
845–872 (1962)