Special Relativity and Maxwell Equations
by
Bernd A. Berg
Department of Physics
Copyright © by the author.
Chapter 1
An introduction to the theory of special relativity is given, which provides the space-time
frame for classical electrodynamics. Historically [2], special relativity emerged out of
electromagnetism. Nowadays, it deserves to be emphasized that special relativity severely
restricts the possibilities for electromagnetic equations.
Here and in the following, abbreviations for units are placed in brackets [ ]. For most of the
20th century the second was defined in terms of the rotation of the earth as
$\frac{1}{60} \times \frac{1}{60} \times \frac{1}{24}$ of the mean solar day. Nowadays the most accurate time
measurements rely on atomic clocks. They work by tuning an electric frequency into resonance
with an atomic transition. The second has been defined so that the frequency of the radiation
from the transition between the two hyperfine levels of the ground state of the cesium
$^{133}$Cs atom is exactly 9,192,631,770 cycles per second.
Special relativity is founded on two basic postulates:
1. Galilei invariance: The laws of nature are independent of any uniform, translational
motion of the reference frame.
This postulate gives rise to a triply infinite set of reference frames moving with constant
velocities relative to one another. They are called inertial frames. For a freely moving
body, i.e., a body which is not acted upon by an external force, inertial systems exist. The
differential equations which describe physical laws take the same form in all inertial frames
(form invariance). Galilei invariance was known long before Einstein.
2. The speed $c$ of light in empty space is independent of the motion of its source.
The second postulate was introduced by Einstein in 1905 [2]. It implies that $c$ takes the same
constant value in all inertial frames. Transformations between inertial frames are implied,
which have far-reaching physical consequences.
$$c = 1 \tag{1.4}$$
holds. The advantage of natural units is that factors of $c$ disappear in calculations. The
disadvantage is that for converting back to conventional units the appropriate factors have
to be recovered by dimensional analysis. For instance, if time is given in seconds, $x = t$ in
natural units converts to $x = c\,t$ with $x$ in meters and $c$ given by (1.3).
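The conversion just described can be sketched in a few lines. This is only an illustration: equation (1.3) is not reproduced here, so the sketch assumes it quotes the SI value c = 299 792 458 m/s, and the function names are hypothetical.

```python
# Assumed value of c in m/s (the exact SI value); (1.3) itself is not shown in this section.
C_M_PER_S = 299_792_458.0


def natural_length_to_meters(x_natural_seconds: float) -> float:
    """Restore the factor of c: a length given as a time in seconds (c = 1) becomes meters."""
    return C_M_PER_S * x_natural_seconds


def meters_to_natural_length(x_meters: float) -> float:
    """Inverse conversion: a length in meters expressed as a time in seconds (c = 1)."""
    return x_meters / C_M_PER_S
```

In natural units a length of one second is the distance light travels in one second, so `natural_length_to_meters(1.0)` returns the numerical value of c in meters.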
$$|\vec x_1| = c\,\Delta t/2 ,$$
where $\Delta t$ is the time light needs to travel to $O_1$ and back. This determines $\vec x_1$ and $O_0$ signals
this information to $O_1$. By repeating the measurement, he can make sure that $O_1$ is not
moving with respect to $K$. For an idealized, force-free environment the observers will then
never start moving with respect to one another. $O_1$ synchronizes her clock by setting it to
$$t^r_1 = t^e_1 + |\vec x_1|/c ,$$
where $O_0$ emits (superscript $e$) the signal at $t^e_1$ and $O_1$ receives (superscript $r$) it at $t^r_1$. When
$O_0$ later flashes his instantaneous time $t^e_2$ over to $O_1$, the clock of $O_1$ will show time
$t^r_2 = t^e_2 + |\vec x_1|/c$ when receiving the signal. In the same way the time $t$ can be defined at
any desired point $\vec x$ in $K$.
Now we consider an inertial frame $K'$ with coordinates $(t', \vec x\,')$, moving with constant
velocity $\vec v$ with respect to $K$. The origin of $K'$ is defined through an observer $O'_0$. How does
one know that $O'_0$ moves with constant velocity $\vec v$ with respect to $O_0$? At times $t^e_1$ and $t^e_2$
observer $O_0$ may flash light signals at $O'_0$, which are reflected and arrive back after time
intervals $\Delta t_1$ and $\Delta t_2$ on the clock of $O_0$. From principle 2 it follows that the reflected light
needs the same time to travel from $O'_0$ to $O_0$ as it needed to travel from $O_0$ to $O'_0$. Hence,
$O_0$ concludes that $O'_0$ received the signals at
$$t^r_i = t^e_i + \Delta t_i/2, \quad (i = 1, 2)$$
in the $O_0$ time. This simple equation becomes more complicated for non-relativistic physics,
because the speed on the return path would then be distinct from that on the arrival path
(consider elastic scattering of a very light particle on a heavy surface). The constant velocity
of light implies that relativistic distance measurements are simpler than such non-relativistic
measurements. For observer $O_0$ the vector positions $\vec x_1$ and $\vec x_2$ of $O'_0$ at times $t^r_1$ and $t^r_2$,
respectively, are completely defined by the angles $(\theta_i, \phi_i)$ at which the light comes back and
the magnitude
$$|\vec x_i| = \Delta t_i\, c/2, \quad (i = 1, 2) . \tag{1.6}$$
For the assumed force-free environment observer $O_0$ can conclude that $O'_0$ moves with respect
to him with uniform velocity
$$\vec v = (\vec x_2 - \vec x_1)/(t^r_2 - t^r_1) . \tag{1.7}$$
Actually, one measurement is sufficient to obtain the velocity when one employs the
relativistic Doppler effect as discussed later in section 1.1.8. $O_0$ may repeat the procedure
to check that $O'_0$ indeed moves with uniform velocity.
The equation of motion for the origin of $K'$ as observed by $O_0$ is
$$\vec x\,(\vec x\,' = 0) = \vec x_0 + \vec v\, t , \tag{1.8}$$
with $\vec x_0 = \vec x_1 - \vec v\, t^r_1$, expressing the fact that for $t = t^r_1$ observer $O'_0$ is at $\vec x_1$. Shifting his
space convention by a constant vector, observer $O_0$ can achieve $\vec x_0 = 0$, so that equation
(1.8) becomes
$$\vec x\,(\vec x\,' = 0) = \vec v\, t .$$
Similarly, observer $O'_0$ finds out that $O_0$ moves with velocity $\vec v\,' = -\vec v$. According to
principle 1, observers in $K'$ can now go ahead to define $t'$ for any point $\vec x\,'$ in $K'$. Observer
$O'_0$ can choose his space convention so that
$$\vec x\,'(\vec x = 0) = -\vec v\, t'$$
holds.
$$c^2 t^2 - x^2 - y^2 - z^2 = 0 \quad \text{in } K , \tag{1.9}$$
and by
$$c^2 t'^2 - x'^2 - y'^2 - z'^2 = 0 \quad \text{in } K' . \tag{1.10}$$
We define 4-vectors ($\alpha = 0, 1, 2, 3$) by
$$(x^\alpha) = \begin{pmatrix} ct \\ \vec x \end{pmatrix} \quad \text{and} \quad (x_\alpha) = (ct, -\vec x) . \tag{1.11}$$
Due to a more general notation, which is explained in section 1.1.5, the components
$x^\alpha$ are called contravariant and the components $x_\alpha$ covariant. In matrix notation the
contravariant 4-vector $(x^\alpha)$ is represented by a column and the covariant 4-vector $(x_\alpha)$ by a
row. The contravariant vectors are those encountered in non-relativistic mechanics, where
unfortunately the indices are usually down instead of up.
The Einstein summation convention is defined by
$$x_\alpha x^\alpha = \sum_{\alpha=0}^{3} x_\alpha x^\alpha = (x^0)^2 - \vec x^{\,2} , \tag{1.12}$$
and will be employed from here on. Equations (1.9) and (1.10) then read
$$x_\alpha x^\alpha = x'_\alpha x'^\alpha = 0 . \tag{1.13}$$
$$x_\alpha x^\alpha = x'_\alpha x'^\alpha = s^2 . \tag{1.14}$$
This equation implies (1.13), but the reverse is not true. An additional transformation,
which leaves (1.13) but not (1.14) invariant, is the scale transformation $x'^\alpha = \lambda\, x^\alpha$.
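The difference between (1.13) and (1.14) under a scale transformation can be checked numerically. The following is a minimal sketch with hypothetical helper names and arbitrarily chosen sample vectors; it assumes units in which the components are plain numbers.

```python
def lower(x_up):
    """Covariant components (x^0, -x^1, -x^2, -x^3) of a contravariant 4-vector, cf. (1.11)."""
    return [x_up[0], -x_up[1], -x_up[2], -x_up[3]]


def interval(x_up):
    """Sum over alpha of x_alpha x^alpha, cf. (1.12)."""
    return sum(lo * up for lo, up in zip(lower(x_up), x_up))


def scale(x_up, lam):
    """Scale transformation x'^alpha = lam * x^alpha."""
    return [lam * component for component in x_up]


light_like = [5.0, 3.0, 4.0, 0.0]  # on the light cone: 25 - 9 - 16 = 0
time_like = [2.0, 1.0, 0.0, 0.0]   # s^2 = 4 - 1 = 3
```

Scaling `light_like` leaves the interval at zero, whereas scaling `time_like` by a factor λ multiplies s² by λ², so the scale transformation respects (1.13) but not (1.14).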
[Figure 1.1 is not reproduced here. It is a space-time diagram with axes $x^0$ and $x^1$, the regions
labeled Future, Past and Elsewhere, the point A at the origin and the point B in the future.]
Figure 1.1: Minkowski space: Seen from the spacetime point A at the origin, the spacetime
points in the forward light cone are in the future, those in the backward light cone are in
the past and the spacelike points are "elsewhere", because their time-ordering depends on
the inertial frame chosen. Paths of two clocks which separate at the origin (the straight line
one stays at rest) and merge again at a future space-time point B are also indicated. For the
paths shown the clock moved along the curved (in the figure longer!) path will, at B, show
an elapsed time of about 70% of the elapsed time shown by the other clock, which stays at
rest.
If the initial condition $t' = 0$ and $\vec x\,(\vec x\,' = 0) = \vec x\,'(\vec x = 0) = 0$ for $t = 0$ is replaced by an
arbitrary one, the equation $(x_\alpha - y_\alpha)(x^\alpha - y^\alpha) = (x'_\alpha - y'_\alpha)(x'^\alpha - y'^\alpha)$ still holds. Inhomogeneous
Lorentz or Poincaré transformations are defined as the group of transformations which leave
$$s^2 = (x_\alpha - y_\alpha)(x^\alpha - y^\alpha) \quad \text{invariant.} \tag{1.15}$$
In contrast to the Lorentz transformations the Poincaré transformations include invariance
under translations
$$x^\alpha \to x^\alpha + a^\alpha \quad \text{and} \quad y^\alpha \to y^\alpha + a^\alpha \tag{1.16}$$
where $a^\alpha$ is a constant vector. Independently of Einstein, Poincaré had developed similar
ideas, but pursued a more cautious approach.
A fruitful concept is that of a 4-dimensional space-time, called Minkowski space. Equation
(1.15) gives the invariant metric of this space. Compared to the norm of 4-dimensional
Euclidean space, the crucial difference is the relative minus sign between time and space
components. The light cone of a 4-vector $x^\alpha_0$ is defined as the set of vectors $x^\alpha$ which satisfy
$$(x - x_0)^2 = (x_\alpha - x_{0\alpha})(x^\alpha - x^\alpha_0) = 0 .$$
The light cone separates events which are timelike and spacelike with respect to $x^\alpha_0$, namely
$$(x - x_0)^2 > 0 \quad \text{for timelike}$$
and
$$(x - x_0)^2 < 0 \quad \text{for spacelike.}$$
We shall see soon, note equation (1.22), that the time-ordering of spacelike points differs
between inertial frames, whereas it is the same for timelike points. For the choice $x^\alpha_0 = 0$
this Minkowski space situation is depicted in figure 1.1. On the abscissa we have the
projection of the three-dimensional Euclidean space on $r = |\vec x|$. The regions future and
past of this figure are the timelike points with respect to $x_0 = 0$, whereas elsewhere are the
spacelike points.
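The classification above amounts to checking the sign of $(x - x_0)^2$. A minimal sketch follows; the function names and the numerical tolerance used to detect the light cone are assumptions made for illustration.

```python
def separation_squared(x, x0=(0.0, 0.0, 0.0, 0.0)):
    """(x - x0)^2 with the relative minus sign between time and space components."""
    d = [a - b for a, b in zip(x, x0)]
    return d[0] ** 2 - d[1] ** 2 - d[2] ** 2 - d[3] ** 2


def classify(x, x0=(0.0, 0.0, 0.0, 0.0), tol=1e-12):
    """Return 'timelike', 'spacelike' or 'lightlike' for x relative to x0."""
    s2 = separation_squared(x, x0)
    if abs(s2) <= tol:
        return "lightlike"
    return "timelike" if s2 > 0 else "spacelike"


def region(x, tol=1e-12):
    """Region of figure 1.1 (relative to the origin) that contains the point x."""
    kind = classify(x, tol=tol)
    if kind == "spacelike":
        return "elsewhere"
    if kind == "lightlike":
        return "light cone"
    return "future" if x[0] > 0 else "past"
```

For instance, the point (2, 1, 0, 0) lies in the future of the origin, while (1, 2, 0, 0) lies elsewhere.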
To understand special relativity in some depth, we have to explore Lorentz and Poincaré
transformations. Before we come to this, we consider the two-dimensional case and introduce
some relevant calculus in the next two sections.
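In equation (1.19) we choose the usual convention $d = -\sinh(\zeta)$ and end up with
$$\begin{pmatrix} a & b \\ d & e \end{pmatrix} = \begin{pmatrix} \cosh(\zeta) & -\sinh(\zeta) \\ -\sinh(\zeta) & \cosh(\zeta) \end{pmatrix} , \tag{1.21}$$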
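where $\zeta$ is called rapidity or boost variable and has the interpretation of an angle in a
hyperbolic geometry. For our present purposes no knowledge of hyperbolic geometries is
required. In components (1.18) then reads
$$x'^0 = \cosh(\zeta)\, x^0 - \sinh(\zeta)\, x^1 , \tag{1.22}$$
$$x'^1 = -\sinh(\zeta)\, x^0 + \cosh(\zeta)\, x^1 . \tag{1.23}$$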
An interesting feature of equation (1.22) is that for spacelike points, say $x^1 > x^0 > 0$, a value
$\zeta_0$ for the rapidity exists, so that
$$0 = \cosh(\zeta_0)\, x^0 - \sinh(\zeta_0)\, x^1$$
and, therefore,
$$\mathrm{sign}(x'^0) = -\mathrm{sign}(x^0)$$
for $\zeta > \zeta_0$, so that the time-ordering becomes reversed, whereas for timelike points such a
reversal of the time-ordering is impossible as then $|x^0| > |x^1|$. In figure 1.1 this is emphasized
by calling the spacelike (with respect to $x_0 = 0$) region elsewhere in contrast to future and
past.
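The reversal of the time-ordering can be made concrete with a short numerical sketch. It assumes the transformed time is $x'^0 = \cosh(\zeta)\,x^0 - \sinh(\zeta)\,x^1$, as in the relation defining $\zeta_0$ above; the function names are hypothetical.

```python
import math


def boosted_time(x0, x1, zeta):
    """Time component after a boost with rapidity zeta: cosh(zeta) x^0 - sinh(zeta) x^1."""
    return math.cosh(zeta) * x0 - math.sinh(zeta) * x1


def critical_rapidity(x0, x1):
    """Rapidity zeta_0 with tanh(zeta_0) = x^0 / x^1; it exists only for |x^0| < |x^1|."""
    return math.atanh(x0 / x1)


# Spacelike point x^1 > x^0 > 0: the boosted time changes sign beyond zeta_0.
zeta_0 = critical_rapidity(1.0, 2.0)
before = boosted_time(1.0, 2.0, 0.5 * zeta_0)  # still positive
after = boosted_time(1.0, 2.0, 2.0 * zeta_0)   # negative: time-ordering reversed
```

For a timelike point such as $(x^0, x^1) = (2, 1)$, `math.atanh` raises a `ValueError` in `critical_rapidity`, which reflects that no such $\zeta_0$ exists, and `boosted_time` stays positive for every rapidity.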
The physical interpretation is straightforward. Seen from $K$, the origin $x'^1 = 0$ of $K'$
moves with constant velocity $v$. In $K$ this corresponds to the equation
$$0 = -\sinh(\zeta)\, x^0 + \cosh(\zeta)\, x^1$$
$$\beta = \frac{v}{c} = \frac{x^1}{x^0} = \frac{\sinh(\zeta)}{\cosh(\zeta)} = \tanh(\zeta) . \tag{1.24}$$
These equations are called Lorentz transformations. Lorentz discovered them first in his
studies of electrodynamics, but it remained for Einstein [2] to understand their physical
meaning. We may perform two subsequent Lorentz transformations with rapidities $\zeta_1$ and $\zeta_2$.
They combine as follows:
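$$\begin{pmatrix} +\cosh(\zeta_2) & -\sinh(\zeta_2) \\ -\sinh(\zeta_2) & +\cosh(\zeta_2) \end{pmatrix} \begin{pmatrix} +\cosh(\zeta_1) & -\sinh(\zeta_1) \\ -\sinh(\zeta_1) & +\cosh(\zeta_1) \end{pmatrix} = \begin{pmatrix} +\cosh(\zeta_2 + \zeta_1) & -\sinh(\zeta_2 + \zeta_1) \\ -\sinh(\zeta_2 + \zeta_1) & +\cosh(\zeta_2 + \zeta_1) \end{pmatrix} . \tag{1.28}$$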
The rapidities add up as
$$\zeta = \zeta_1 + \zeta_2 \tag{1.29}$$
in the same way as velocities under Galilei transformations or angles for rotations about the
same axis do. Note that the inverse of the transformation with rapidity $\zeta_1$ is obtained for
$\zeta_2 = -\zeta_1$. The relativistic addition of velocities follows from (1.29). Let $\beta_1 = \tanh(\zeta_1)$ and
$\beta_2 = \tanh(\zeta_2)$, then
$$\beta = \tanh(\zeta_1 + \zeta_2) = \frac{\beta_1 + \beta_2}{1 + \beta_1 \beta_2} \tag{1.30}$$
holds. Another immediate consequence of the Lorentz transformations is the time dilatation:
A moving clock ticks slower. In $K$ the position of the origin of $K'$ is given by
$$x^1 = v\, x^0/c = \tanh(\zeta)\, x^0$$
This also works the other way round. In $K'$ the position of the origin of $K$ is given by
$$x'^1 = -\tanh(\zeta)\, x'^0$$
and with this relation between $x'^1$ and $x'^0$ the inverse Lorentz transformation gives
$$x^0 = x'^0 / \cosh(\zeta) .$$
There is no paradox, because equal times at separate points in one frame are not equal in
another (remember that the definition of time in one frame already relies on the constant
speed of light). In particle physics the effect is observed routinely for the lifetimes of
unstable particles. To test time dilatation for macroscopic clocks, we have to send a clock
on a round trip. For this an infinitesimal form of equation (1.31) is needed.
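The composition rule (1.28), the addition theorem (1.30) and the factor $1/\cosh(\zeta)$ that appears in the time dilatation can be verified numerically. The sketch below uses only the two-dimensional boost matrix (1.21); the function names are hypothetical.

```python
import math


def boost(zeta):
    """Two-dimensional boost matrix of (1.21) for rapidity zeta."""
    c, s = math.cosh(zeta), math.sinh(zeta)
    return [[c, -s], [-s, c]]


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def add_velocities(beta1, beta2):
    """Relativistic addition of collinear velocities, cf. (1.30)."""
    return (beta1 + beta2) / (1.0 + beta1 * beta2)


def dilation_factor(beta):
    """1 / cosh(zeta) for beta = tanh(zeta), which equals sqrt(1 - beta^2)."""
    return 1.0 / math.cosh(math.atanh(beta))
```

Two boosts with rapidities 0.3 and 0.4 multiply to a single boost with rapidity 0.7, and two velocities of half the speed of light combine to 0.8 c rather than c.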
Allowing that $x^0 = x^1 = 0$ does not have to coincide with $x'^0 = x'^1 = 0$, we consider
Poincaré transformations. The light radiation may originate in $K$ at $(x^0_0, x^1_0)$ and in $K'$ at
$(x'^0_0, x'^1_0)$. This generalizes equation (1.17) to
$$(x'^0 - x'^0_0)^2 - (x'^1 - x'^1_0)^2 = (x^0 - x^0_0)^2 - (x^1 - x^1_0)^2 ,$$
This means, $x'^\alpha$ is a function of four variables and, when it is needed, this function is assumed
to be sufficiently often differentiable with respect to each of its arguments. In the following
we consider the transformation properties of various quantities (scalars, vectors and tensors)
under $x \to x'$.
A scalar is a single quantity whose value is not changed under the transformation (1.43).
The proper time is an example.
A 4-vector $A^\alpha$, ($\alpha = 0, 1, 2, 3$) is said to be contravariant if its components transform
according to
$$A'^\alpha = \frac{\partial x'^\alpha}{\partial x^\beta}\, A^\beta . \tag{1.44}$$
An example is $A^\alpha = dx^\alpha$, where (1.44) reduces to the well-known rule for the differential of
a function of several variables ($f^\alpha(x) = x'^\alpha(x)$):
$$dx'^\alpha = \frac{\partial x'^\alpha}{\partial x^\beta}\, dx^\beta .$$
Remark: In this general framework the vector $x^\alpha$ itself is not always contravariant (for a
discussion see books on General Relativity like [10]). When a linear transformation
$$x'^\alpha = a^\alpha{}_\beta\, x^\beta$$
holds with space-time independent coefficients $a^\alpha{}_\beta$, then $x^\alpha$ is contravariant and one finds
$$\frac{\partial x'^\alpha}{\partial x^\beta} = a^\alpha{}_\beta .$$
In special relativity we are only interested in linear transformations. Space-time dependent
transformations lead into general relativity.
A 4-vector is said to be covariant when it transforms like
$$B'_\alpha = \frac{\partial x^\beta}{\partial x'^\alpha}\, B_\beta . \tag{1.45}$$
An example is
$$B_\alpha = \partial_\alpha = \frac{\partial}{\partial x^\alpha} , \tag{1.46}$$
because of
$$\frac{\partial}{\partial x'^\alpha} = \frac{\partial x^\beta}{\partial x'^\alpha}\, \frac{\partial}{\partial x^\beta} .$$
The inner or scalar product of two vectors is defined as the product of the components of a
covariant and a contravariant vector:
$$B \cdot A = B_\alpha A^\alpha . \tag{1.47}$$
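It follows from (1.44) and (1.45) that the scalar product is an invariant under the
transformation (1.43):
$$B' \cdot A' = \frac{\partial x^\beta}{\partial x'^\alpha}\, \frac{\partial x'^\alpha}{\partial x^\gamma}\, B_\beta A^\gamma = \delta^\beta{}_\gamma\, B_\beta A^\gamma = B \cdot A .$$
$$G'_{\alpha\beta} = \frac{\partial x^\gamma}{\partial x'^\alpha}\, \frac{\partial x^\delta}{\partial x'^\beta}\, G_{\gamma\delta} .$$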
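The inner product or contraction with respect to a pair of indices, either on the same tensor
or between different tensors, is defined as in (1.47). One index has to be contravariant and
the other covariant.
A tensor $S^{\ldots\alpha\ldots\beta\ldots}$ is said to be symmetric in $\alpha$ and $\beta$ when
$$S^{\ldots\alpha\ldots\beta\ldots} = S^{\ldots\beta\ldots\alpha\ldots} ,$$
and a tensor $A^{\ldots\alpha\ldots\beta\ldots}$ is said to be antisymmetric in $\alpha$ and $\beta$ when
$$A^{\ldots\alpha\ldots\beta\ldots} = -A^{\ldots\beta\ldots\alpha\ldots} .$$
The contraction of a symmetric tensor with an antisymmetric tensor over these two indices
vanishes:
$$S^{\ldots\alpha\ldots\beta\ldots} A_{\ldots\alpha\ldots\beta\ldots} = 0 . \tag{1.49}$$
Proof:
$$S^{\ldots\alpha\ldots\beta\ldots} A_{\ldots\alpha\ldots\beta\ldots} = -S^{\ldots\beta\ldots\alpha\ldots} A_{\ldots\beta\ldots\alpha\ldots} = -S^{\ldots\alpha\ldots\beta\ldots} A_{\ldots\alpha\ldots\beta\ldots} ,$$
and consequently zero. The first step exploits symmetry and antisymmetry, and the second
step renames the summation (dummy) indices. Every tensor can be written as a sum of its
symmetric and antisymmetric parts in two of its indices,
$$T^{\ldots\alpha\ldots\beta\ldots} = T_S^{\ldots\alpha\ldots\beta\ldots} + T_A^{\ldots\alpha\ldots\beta\ldots} ,$$
by simply defining
$$T_S^{\ldots\alpha\ldots\beta\ldots} = \frac{1}{2} \left( T^{\ldots\alpha\ldots\beta\ldots} + T^{\ldots\beta\ldots\alpha\ldots} \right) \quad \text{and} \quad T_A^{\ldots\alpha\ldots\beta\ldots} = \frac{1}{2} \left( T^{\ldots\alpha\ldots\beta\ldots} - T^{\ldots\beta\ldots\alpha\ldots} \right) . \tag{1.51}$$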
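The decomposition (1.51) and the vanishing contraction (1.49) can be illustrated for a rank-two tensor stored as a nested list. This is a sketch with hypothetical names and an arbitrary example tensor. For simplicity the contraction is taken directly over the array components; lowering both indices with a symmetric metric would not change the symmetry properties, so the argument is unaffected.

```python
def symmetric_part(t):
    """T_S = (T^{ab} + T^{ba}) / 2, cf. (1.51)."""
    n = len(t)
    return [[0.5 * (t[i][j] + t[j][i]) for j in range(n)] for i in range(n)]


def antisymmetric_part(t):
    """T_A = (T^{ab} - T^{ba}) / 2, cf. (1.51)."""
    n = len(t)
    return [[0.5 * (t[i][j] - t[j][i]) for j in range(n)] for i in range(n)]


def full_contraction(s, a):
    """Sum over both index pairs of s[i][j] * a[i][j]."""
    n = len(s)
    return sum(s[i][j] * a[i][j] for i in range(n) for j in range(n))


# Arbitrary 4x4 example tensor with entries 1, 2, ..., 16.
T = [[float(4 * i + j + 1) for j in range(4)] for i in range(4)]
S = symmetric_part(T)
A = antisymmetric_part(T)
```

Adding `S` and `A` element by element reproduces `T`, and `full_contraction(S, A)` vanishes.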
So far the results and definitions are general. We now specialize to Poincaré transformations.
The specific geometry of the space-time of special relativity is defined by the invariant
distance $s^2$, see equation (1.15). In differential form, the infinitesimal interval $ds$ defines the
proper time $c\, d\tau = ds$,
$$(ds)^2 = c^2 (d\tau)^2 = (dx^0)^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2 . \tag{1.52}$$
Here we have used superscripts on the coordinates in accordance with our insight that $dx^\alpha$ is
a contravariant vector. Introducing a metric tensor $g_{\alpha\beta}$ we rewrite equation (1.52) as
$$(ds)^2 = g_{\alpha\beta}\, dx^\alpha dx^\beta . \tag{1.53}$$
Comparing (1.52) and (1.53) we see that for special relativity $g_{\alpha\beta}$ is diagonal:
$$g_{00} = 1, \quad g_{11} = g_{22} = g_{33} = -1 \quad \text{and} \quad g_{\alpha\beta} = 0 \ \text{for} \ \alpha \ne \beta . \tag{1.54}$$
Comparing (1.53) with the invariant scalar product (1.47), we conclude that
$$x_\alpha = g_{\alpha\beta}\, x^\beta .$$
The covariant metric tensor lowers the indices, i.e., transforms a contravariant into a
covariant vector. Correspondingly the contravariant metric tensor $g^{\alpha\beta}$ is defined to raise
indices:
$$x^\alpha = g^{\alpha\beta}\, x_\beta .$$
The last two equations and the symmetry of $g_{\alpha\beta}$ imply
$$g_{\alpha\gamma}\, g^{\gamma\beta} = \delta_\alpha{}^\beta$$
for the contraction of the contravariant with the covariant metric tensor. This equation
yields $g^{\alpha\beta}$, called the normalized co-factor of $g_{\alpha\beta}$. For the diagonal matrix (1.54) the result
is simply
$$g^{\alpha\beta} = g_{\alpha\beta} . \tag{1.55}$$
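Raising and lowering indices with the diagonal metric can be sketched as follows. The sketch assumes the metric diag(1, −1, −1, −1), which is the sign convention fixed by (1.12); the function names are hypothetical.

```python
# Metric tensor of special relativity, assumed diag(1, -1, -1, -1) as implied by (1.12).
G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]


def lower_index(v_up):
    """x_alpha = g_{alpha beta} x^beta."""
    return [sum(G[a][b] * v_up[b] for b in range(4)) for a in range(4)]


def raise_index(v_low):
    """x^alpha = g^{alpha beta} x_beta; the same matrix is used because of (1.55)."""
    return [sum(G[a][b] * v_low[b] for b in range(4)) for a in range(4)]


def metric_product():
    """g_{alpha gamma} g^{gamma beta}, which should be the Kronecker delta."""
    return [[sum(G[a][c] * G[c][b] for c in range(4)) for b in range(4)] for a in range(4)]
```

Lowering the index of (2, 1, 0, 3) gives (2, −1, 0, −3), and raising it again returns the original vector.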
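Raising and lowering indices with $g_{\alpha\beta}$ and $g^{\alpha\beta}$, the equations
$$(A^\alpha) = \begin{pmatrix} A^0 \\ \vec A \end{pmatrix} , \quad (A_\alpha) = (A^0, -\vec A)$$
$$\partial^\alpha A_\alpha = \partial_\alpha A^\alpha = \frac{\partial A^0}{\partial x^0} + \nabla \cdot \vec A$$
and the d'Alembert (4-dimensional Laplace) operator
$$\Box = \partial_\alpha \partial^\alpha = \left( \frac{\partial}{\partial x^0} \right)^2 - \nabla^2$$
are invariants. Sometimes the notation $\Delta = \nabla^2$ is used for the (3-dimensional) Laplace
operator.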
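In matrix notation
$$\tilde A\, g\, A = g , \tag{1.58}$$
where $g = (g_{\beta\alpha})$ is given by (1.54),
$$A = (a^\beta{}_\alpha) = \begin{pmatrix} a^0{}_0 & a^0{}_1 & a^0{}_2 & a^0{}_3 \\ a^1{}_0 & a^1{}_1 & a^1{}_2 & a^1{}_3 \\ a^2{}_0 & a^2{}_1 & a^2{}_2 & a^2{}_3 \\ a^3{}_0 & a^3{}_1 & a^3{}_2 & a^3{}_3 \end{pmatrix} , \tag{1.59}$$
and $\tilde A = (\tilde a_\beta{}^\alpha)$ with $\tilde a_\beta{}^\alpha = a^\alpha{}_\beta$ is the transpose of the matrix $A = (a^\beta{}_\alpha)$, explicitly
$$\tilde A = (\tilde a_\beta{}^\alpha) = \begin{pmatrix} \tilde a_0{}^0 & \tilde a_0{}^1 & \tilde a_0{}^2 & \tilde a_0{}^3 \\ \tilde a_1{}^0 & \tilde a_1{}^1 & \tilde a_1{}^2 & \tilde a_1{}^3 \\ \tilde a_2{}^0 & \tilde a_2{}^1 & \tilde a_2{}^2 & \tilde a_2{}^3 \\ \tilde a_3{}^0 & \tilde a_3{}^1 & \tilde a_3{}^2 & \tilde a_3{}^3 \end{pmatrix} = \begin{pmatrix} a^0{}_0 & a^1{}_0 & a^2{}_0 & a^3{}_0 \\ a^0{}_1 & a^1{}_1 & a^2{}_1 & a^3{}_1 \\ a^0{}_2 & a^1{}_2 & a^2{}_2 & a^3{}_2 \\ a^0{}_3 & a^1{}_3 & a^2{}_3 & a^3{}_3 \end{pmatrix} . \tag{1.60}$$
For this definition of the transpose matrix the row indices are contravariant and the column
indices are covariant, vice versa to the definition (1.11) for vectors and, similarly, ordinary
matrices. Certain properties of the transformation matrix $A$ can be deduced from (1.58).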
Taking the determinant on both sides gives us $\det(\tilde A g A) = \det(g) \det(A)^2 = \det(g)$. Since
$\det(g) = -1$, we obtain
$$\det(A) = \pm 1 . \tag{1.61}$$
One distinguishes two classes of transformations. Proper Lorentz transformations are
continuously connected with the identity transformation $A = 1$. All other Lorentz
transformations are called improper. Proper transformations necessarily have $\det(A) = 1$.
To have an improper Lorentz transformation it is sufficient, but not necessary, to have
$\det(A) = -1$. For instance $A = -1$ (space and time inversion) is an improper Lorentz
transformation with $\det(A) = +1$.
Next, the number of parameters needed to identify a transformation in the group follows
from (1.58). Since $A$ and $g$ are $4 \times 4$ matrices, we have 16 equations for $4^2 = 16$ elements
of $A$. But they are not all independent because of symmetry under transposition. The
off-diagonal equations are identical in pairs. Therefore, we have $4 + 6 = 10$ linearly independent
equations for the 16 elements of $A$. This leaves six free parameters, i.e., the Lorentz group
is a six-parameter group.
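The defining relation (1.58) and the determinant statement (1.61) can be checked for concrete matrices. The sketch below is an illustration with hypothetical helper names; it assumes the metric diag(1, −1, −1, −1) and uses a boost along the first spatial axis built from hyperbolic functions, together with the space and time inversion mentioned above.

```python
import math

# Metric tensor, assumed diag(1, -1, -1, -1).
G = [[1.0 if i == j == 0 else (-1.0 if i == j else 0.0) for j in range(4)] for i in range(4)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def det(a):
    """Determinant by Laplace expansion along the first row (fine for 4x4 matrices)."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))


def is_lorentz(a, tol=1e-12):
    """Check the matrix relation (transpose of A) g A = g."""
    lhs = matmul(matmul(transpose(a), G), a)
    return all(abs(lhs[i][j] - G[i][j]) < tol for i in range(4) for j in range(4))


def boost_x1(zeta):
    c, s = math.cosh(zeta), math.sinh(zeta)
    return [[c, -s, 0.0, 0.0], [-s, c, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]


MINUS_IDENTITY = [[-1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
```

Both `boost_x1(0.7)` and `MINUS_IDENTITY` satisfy (1.58) with determinant +1, which illustrates that the determinant alone does not distinguish proper from improper transformations.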
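In the 19th century Lie invented the subsequent procedure to handle these parameters.
Let us now consider only proper Lorentz transformations. To construct $A$ explicitly, Lie
made the ansatz
$$A = e^L = \sum_{n=0}^{\infty} \frac{L^n}{n!} ,$$
Note that $\det(A) = +1$ implies that $L$ is traceless. Equation (1.58) can be written
$$g \tilde A g = A^{-1} .$$
From the definition of $L$, $\tilde L$ and the fact that $g^2 = 1$ we have (note $(g \tilde L g)^n = g \tilde L^n g$ and
$1 = \left( \sum_{n=0}^{\infty} L^n/n! \right) \left( \sum_{n=0}^{\infty} (-L)^n/n! \right)$)
$$\tilde A = e^{\tilde L} , \quad g \tilde A g = e^{g \tilde L g} \quad \text{and} \quad A^{-1} = e^{-L} , \quad \text{so that} \quad g \tilde L g = -L .$$
The matrix $gL$ is thus antisymmetric and it is left as an exercise to show that the general
form of $L$ is:
$$L = \begin{pmatrix} 0 & l^0{}_1 & l^0{}_2 & l^0{}_3 \\ l^0{}_1 & 0 & l^1{}_2 & l^1{}_3 \\ l^0{}_2 & -l^1{}_2 & 0 & l^2{}_3 \\ l^0{}_3 & -l^1{}_3 & -l^2{}_3 & 0 \end{pmatrix} . \tag{1.64}$$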
where the commutator of two matrices is defined by $[A, B] = AB - BA$ and $\epsilon^{ijk}$ is the
completely antisymmetric Levi-Civita tensor. Its definition in $n$ dimensions is
$$\epsilon^{i_1 i_2 \ldots i_n} = \begin{cases} +1 & \text{for } (i_1, i_2, \ldots, i_n) \text{ being an even permutation of } (1, 2, \ldots, n), \\ -1 & \text{for } (i_1, i_2, \ldots, i_n) \text{ being an odd permutation of } (1, 2, \ldots, n), \\ 0 & \text{otherwise.} \end{cases} \tag{1.68}$$
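To get the physical interpretation of equation (1.65) for $A$, it is suitable to work out simple
examples. First, let $\vec\zeta = 0$, $\phi_1 = \phi_2 = 0$ and $\phi_3 = \phi$. Then (left as exercise)
$$A = e^{-\phi S_3} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\phi & \sin\phi & 0 \\ 0 & -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \tag{1.69}$$
which describes a rotation by the angle $\phi$ (in the anti-clockwise sense) around the $\hat e_3$ axis.
Next, let $\vec\phi = 0$, $\zeta_2 = \zeta_3 = 0$ and $\zeta_1 = \zeta$. Then
$$A = e^{-\zeta K_1} = \begin{pmatrix} \cosh\zeta & -\sinh\zeta & 0 & 0 \\ -\sinh\zeta & \cosh\zeta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{1.70}$$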
is obtained, where $\zeta$ is known as the boost parameter or rapidity. The structure is reminiscent
of a rotation, but with hyperbolic functions instead of circular ones, basically because of the
relative negative sign between the space and time terms in eqn. (1.52). "Rotations" in the
$x^0$-$x^i$ planes are boosts and governed by a hyperbolic geometry, whereas rotations in
the $x^i$-$x^j$ ($i \ne j$) planes are governed by the ordinary Euclidean geometry.
Finally, note that the parameters $\phi_i$, $\zeta_i$, ($i = 1, 2, 3$) turn out to be real, as equation
(1.57) implies that the elements of $A$ have to be real. In the next subsection relativistic
kinematics is discussed in more detail.
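The two examples (1.69) and (1.70) can be reproduced by summing the exponential series numerically. The definitions of the generators $S_3$ and $K_1$ are not part of this section, so the matrices below are assumptions, chosen so that the exponentials match (1.69) and (1.70); the helper names are hypothetical and the series is simply truncated after 40 terms.

```python
import math


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def expm(m, terms=40):
    """Truncated exponential series: sum of m^n / n! for n < terms."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]  # now m^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def scaled(m, factor):
    return [[factor * x for x in row] for row in m]


# Assumed generators (not shown in this section), fixed by matching (1.69) and (1.70).
S3 = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
K1 = [[0.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]


def rotation_3(phi):
    """A = exp(-phi S3), cf. (1.69)."""
    return expm(scaled(S3, -phi))


def boost_1(zeta):
    """A = exp(-zeta K1), cf. (1.70)."""
    return expm(scaled(K1, -zeta))
```

For moderate parameter values the truncated series agrees with the closed forms to machine precision, e.g. the upper left block of `boost_1(0.5)` reproduces cosh(0.5) and −sinh(0.5).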
To find the transformation law of an arbitrary vector $\vec A$ for a general relative velocity $\vec v$, it
is convenient to decompose $\vec A$ into components parallel and perpendicular to $\vec\beta = \vec v/c$. Let
$\hat\beta$ be the unit vector in $\vec\beta$ direction,
$$\vec A = A_\parallel\, \hat\beta + \vec A_\perp \quad \text{with} \quad A_\parallel = \hat\beta \cdot \vec A .$$
$$x'^i = c^{-1} u'^i\, x'^0 .$$
What is its velocity $\vec u$ with respect to $K$? Let us first assume that the velocity $\vec v$ between
the frames is in $\hat x^1$ direction and rederive (1.30). Substituting (1.72) for $x'^1$ and (1.71) for
$x'^0$ gives
$$\gamma\, (x^1 - \beta x^0) = c^{-1} u'^1\, \gamma\, (x^0 - \beta x^1) .$$
Sorting with respect to $x^1$ and $x^0$ yields
$$\gamma \left( 1 + \frac{u'^1 v}{c^2} \right) x^1 = c^{-1} \gamma\, (u'^1 + v)\, x^0$$
$$u^1 = c\, \frac{x^1}{x^0} = \frac{u'^1 + v}{1 + u'^1 v/c^2} . \tag{1.77}$$
Dividing by $x^0$ gives
$$u^i = \frac{u'^i}{\gamma\, (1 + u'^1 v/c^2)} , \quad (i = 2, 3). \tag{1.78}$$
To derive these equations, $\vec v$ was chosen along the $x^1$-axis. For general $\vec v$ one only has to
decompose $\vec u$ into its components parallel and perpendicular to $\vec v$,
$$\vec u = u_\parallel\, \hat v + \vec u_\perp ,$$
From this addition theorem of velocities it is obvious that the velocity itself is not part
of a 4-vector. The relativistic generalization is given in subsection 1.1.9. It is left as an
exercise to relate these equations to the addition theorem for the rapidity (1.29).
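Equations (1.77) and (1.78) translate directly into a small function. The sketch assumes, as in the derivation, that the relative velocity $v$ points along the $x^1$-axis; the function name and the default $c = 1$ are choices made for illustration.

```python
import math


def add_velocity(u_prime, v, c=1.0):
    """Velocity in K of a particle with velocity u_prime = (u'^1, u'^2, u'^3) in K',
    where K' moves with velocity v along the x^1-axis; cf. (1.77) and (1.78)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    denom = 1.0 + u_prime[0] * v / c ** 2
    return ((u_prime[0] + v) / denom,
            u_prime[1] / (gamma * denom),
            u_prime[2] / (gamma * denom))


def speed(u):
    return math.sqrt(sum(component ** 2 for component in u))
```

A light ray moving along the $x^2$ direction in $K'$, `add_velocity((0.0, 1.0, 0.0), 0.6)`, has the components (0.6, 0.8, 0) in $K$ and still moves with speed $c = 1$, in agreement with the second postulate.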
The concepts of world lines in Minkowski space and proper time (eigenzeit) generalize
immediately to 4D. Assume the particle moves with velocity $\vec v(t)$, then $d\vec x = \vec\beta\, dx^0$ holds,
and the infinitesimal invariant along its world line is
$$(ds)^2 = (c\, d\tau)^2 = (dx^0)^2 - (d\vec x)^2 = (1 - \vec\beta^{\,2})\, (dx^0)^2 .$$
When $(k^\alpha)$ is a 4-vector, it follows that the phase is a scalar, invariant under Lorentz
transformations
$$\Phi'(x') = k'_\alpha x'^\alpha = k_\alpha x^\alpha = \Phi(x) . \tag{1.84}$$
That this is correct can be seen as follows: For an observer at a fixed position $\vec x$ (note the
term $\vec k \cdot \vec x$ is then constant) the wave performs a periodic motion with period
$$T = \frac{2\pi}{\omega} = \frac{1}{\nu} , \tag{1.85}$$
where $\nu$ is the frequency. In particular, the phase (and hence the wave) takes identical
values on the two-dimensional hyperplanes perpendicular to $\vec k$. Let $\hat k$ be the unit vector in $\vec k$
direction. Decomposing $\vec x$ into components parallel and perpendicular to $\vec k$, $\vec x = x_\parallel\, \hat k + \vec x_\perp$,
the phase becomes
$$\Phi = \omega\, t - k\, x_\parallel , \tag{1.86}$$
where $k = |\vec k|$ is the length of the vector $\vec k$. Phases which differ by multiples of $2\pi$ give the
same values for the wave $W$. For example, when we take $V_0 = 0$, the real part of the wave
becomes
$$W_x = U_0 \cos(\omega\, t - k\, x_\parallel)$$
and $\Phi = 0, 2\pi n$, $n = \pm 1, \pm 2, \ldots$ describes the wave crests. From (1.86) it follows that the
crests pass our observer with speed $\vec u = u\, \hat k$, where
$$u = \frac{\omega}{k} \quad \text{as for } \Phi = 0 \text{ we have} \quad x_\parallel = \frac{\omega}{k}\, t . \tag{1.87}$$
Let our observer count the number of wave crests passing by. How does the wave (1.81)
have to be described in another inertial frame $K'$? An observer in $K'$ counting the number of
wave crests passing through the same space-time point at which our first observer counts
must get the same number. The coordinates are just labels and the physics is the same in all
systems. When in frame $K$ the wave takes its maximum at the space-time point $(x^\alpha)$, it must
also be at its maximum in $K'$ at the same space-time point in appropriately transformed
coordinates $(x'^\alpha)$. More generally, this holds for every value of the phase, because it is a
scalar.
As $(k^\alpha)$ is a 4-vector, the transformation law for angular frequency and wave vector is
just a special case of equations (1.74), (1.75) and (1.76)
say an electron or proton. This has remained too inaccurate. The mass unit has resisted
modernization and is still defined through a standard object, a cylinder of platinum alloy
which is kept at the International Bureau of Weights and Measures at Sèvres, France.
Let us consider a point-like particle in its rest frame and denote its mass there by $m_0$.
In any other frame the rest mass of the particle is still $m_0$, which in this way is defined
as a scalar. It may be noted that most books in particle and nuclear physics simply use
$m$ to denote the rest mass, whereas some books on special relativity employ the notation
$m = \gamma\, m_0$ for a mass which is proportional to the energy, i.e., the zero component of the
energy-momentum vector introduced below. To avoid confusion, we use $m_0$ for the rest mass.
In the non-relativistic limit the momentum is defined by $\vec p = m_0\, \vec u$. We want to define
$\vec p$ as part of a relativistic 4-vector $(p^\alpha)$. Consider a particle at rest in frame $K$, i.e., $\vec p = 0$.
Assume now that frame $K'$ is moving with a small velocity $\vec v$ with respect to $K$. Then the
non-relativistic limit is correct, and $\vec p\,' = -m_0\, \vec v$ has to hold approximately. On the other
hand, the transformation laws (1.74), (1.75) and (1.76) for vectors (note $\vec p \parallel \vec\beta = \vec v/c$) imply
$$\vec p\,' = \gamma\, (\vec p - \vec\beta\, p^0) .$$
$p^0 = c\, m_0$ in the rest frame, so that we get $\vec p\,' = -m_0\, \gamma\, \vec v$. Consequently, for a particle
moving with velocity $\vec u$ in frame $K$
$$\vec p = m_0\, \gamma\, \vec u \tag{1.91}$$
is the relation between relativistic momentum and velocity. Due to the invariance of the
scalar product, $p_\alpha p^\alpha = (p^0)^2 - \vec p^{\,2} = p'_\alpha p'^\alpha = m_0^2 c^2$ holds and
$$p^0 = +\sqrt{c^2 m_0^2 + \vec p^{\,2}} \tag{1.92}$$
follows, which is of course consistent with calculating $p^0$ via the Lorentz transformation law
(1.74). As $c\, p^0$ has the dimension of energy, the relativistic energy of a particle is
$$E = c\, p^0 = +\sqrt{c^4 m_0^2 + c^2 \vec p^{\,2}} = c^2 m_0 + \frac{\vec p^{\,2}}{2 m_0} + \ldots , \tag{1.93}$$
where the second term is just the non-relativistic kinetic energy $T = \vec p^{\,2}/(2 m_0)$. The first
term shows that (rest) mass and energy can be transformed into one another [3]. In processes
where the mass is conserved we just do not notice it. Using the mass definition of special
relativity books like [7], $m = p^0/c$, together with (1.93) we obtain at this point the famous
equation $E = m\, c^2$. Avoiding this definition of $m$, because it is not the mass found in particle
tables, where the mass of a particle is an invariant scalar, the essence of Einstein's equation
is captured by
$$E_0 = m_0\, c^2 ,$$
where $E_0$ is the energy of a massive body (or particle) in its rest frame.
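The relations (1.91) to (1.93) can be checked numerically for motion along one axis. This is a sketch with hypothetical function names and the default $c = 1$.

```python
import math


def gamma_factor(u, c=1.0):
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def momentum(m0, u, c=1.0):
    """Relativistic momentum p = m0 * gamma * u, cf. (1.91), for motion along one axis."""
    return m0 * gamma_factor(u, c) * u


def energy(m0, p, c=1.0):
    """Relativistic energy E = sqrt(c^4 m0^2 + c^2 p^2), cf. (1.93)."""
    return math.sqrt(c ** 4 * m0 ** 2 + c ** 2 * p ** 2)


def nonrelativistic_energy(m0, p, c=1.0):
    """First two terms of the expansion in (1.93): rest energy plus p^2 / (2 m0)."""
    return m0 * c ** 2 + p ** 2 / (2.0 * m0)
```

For $m_0 = 1$ and $u = 0.6$ one finds $p = 0.75$ and $E = 1.25 = \gamma\, m_0 c^2$, while for small momenta the exact energy and the two-term expansion agree up to terms of order $p^4$.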
Non-relativistic momentum conservation $\vec p_1 + \vec p_2 = \vec q_1 + \vec q_2$, where $\vec p_i$, ($i = 1, 2$) are the
momenta of two incoming, and $\vec q_i$, ($i = 1, 2$) are the momenta of two outgoing particles,
becomes relativistic energy-momentum conservation:
$$p_1^\alpha + p_2^\alpha = q_1^\alpha + q_2^\alpha .$$
some measurement prescription. From a theoretical point of view the electrical charge unit
is best defined by the magnitude of the charge of a single electron (fundamental charge unit).
In more conventional units this reads
where the errors are given in parentheses. Definitions of the electric charge through
measurement prescriptions presently rely on the current unit ampere [A] and are given in
elementary physics textbooks like [8]. The numbers of (1.100) are from 1998 [1]. The website
of the National Institute of Standards and Technology (NIST) is given in this reference.
Consult it for up-to-date information.
The choice of constants in the inhomogeneous Maxwell equations defines units for the
electric and magnetic field. The given conventions $4\pi\rho$ and $(4\pi/c)\, \vec J$ are customarily used in
connection with Gaussian units, where the charge is defined in electrostatic units (esu).
In the next subsections the concepts of fields and currents are discussed in the relativistic
context and the electromagnetic field equations follow in the last subsection.
$$T^{\ldots\alpha\ldots\beta\ldots} = T^{\ldots\alpha\ldots\beta\ldots}(x) .$$
It is called static when there is no time dependence. For instance $\vec E(\vec x)$ in electrostatics would
be a static vector field in three dimensions. We are here, of course, primarily interested in
contravariant or covariant fields in four dimensions, like vector fields $A^\alpha(x)$.
Suppose $n$ electric charge units are contained in a small volume $v$, so that we can talk
about the position $\vec x$ of this volume. The corresponding electrical charge density at the
position of that volume is then just $\rho = n/v$ and the electrical current is defined as the
charge that passes per unit time through a surface element of such a volume. We demand
now that the electric charge density $\rho$ and the electric current $\vec J$ form a 4-vector:
$$(J^\alpha) = \begin{pmatrix} c\rho \\ \vec J \end{pmatrix} .$$
The factor $c$ is introduced for dimensional reasons and we have suppressed the space-time
dependence, i.e., $J^\alpha = J^\alpha(x)$ forms a vector field. It is left as an exercise to write down the
4-current for a point particle of elementary charge $q_e$.
The continuity equation takes the simple invariant form
$$\partial_\alpha J^\alpha = 0 . \tag{1.101}$$
$$c^2 q_0^2 = J_\alpha J^\alpha .$$
$$F^{\alpha\beta} = -F^{\beta\alpha} , \tag{1.102}$$
the number of independent parameters is reduced to precisely six. The diagonal elements
now vanish,
$$F^{00} = F^{11} = F^{22} = F^{33} = 0$$
and the other elements follow from $F^{\alpha\beta}$ with $\alpha < \beta$ through (1.102). As desired, this gives
$(16 - 4)/2 = 6$ independent elements.
Up to an overall factor, which is chosen by convention, the only way to obtain a 4-vector
through differentiation of $F^{\alpha\beta}$ is
$$\partial_\alpha F^{\alpha\beta} = \frac{4\pi}{c}\, J^\beta . \tag{1.103}$$
This is the inhomogeneous Maxwell equation in covariant form. Note that it determines
the physical dimensions of the electric fields; the factor $4\pi/c$ on the right-hand side
corresponds to Gaussian units. The continuity equation (1.101) is a simple consequence
of the inhomogeneous Maxwell equation
$$\frac{4\pi}{c}\, \partial_\beta J^\beta = \partial_\beta \partial_\alpha F^{\alpha\beta} = 0$$
because the contraction of the symmetric tensor $(\partial_\beta \partial_\alpha)$ with the antisymmetric tensor $F^{\alpha\beta}$
is zero.
To relate the elements of the $F^{\alpha\beta}$ tensor to the $\vec E$ and $\vec B$ fields, let us choose $\beta = 0, 1, 2, 3$
and compare equation (1.103) with the inhomogeneous Maxwell equations in their standard
form (1.98). For instance, $\partial_\alpha F^{\alpha 0} = \nabla \cdot \vec E = 4\pi\rho$ yields $F^{i0} = E^i$, the first column of the
$F^{\alpha\beta}$ tensor. Extending this comparison to the other $\beta$ values, the final result is
$$(F^{\alpha\beta}) = \begin{pmatrix} 0 & -E^x & -E^y & -E^z \\ E^x & 0 & -B^z & B^y \\ E^y & B^z & 0 & -B^x \\ E^z & -B^y & B^x & 0 \end{pmatrix} . \tag{1.104}$$
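Or, in components
$$F^{i0} = E^i \quad \text{and} \quad F^{ij} = -\sum_k \epsilon^{ijk} B^k \ \Leftrightarrow \ B^k = -\frac{1}{2} \sum_i \sum_j \epsilon^{kij} F^{ij} . \tag{1.105}$$
Consequently,
$$(F_{\alpha\beta}) = \begin{pmatrix} 0 & E^x & E^y & E^z \\ -E^x & 0 & -B^z & B^y \\ -E^y & B^z & 0 & -B^x \\ -E^z & -B^y & B^x & 0 \end{pmatrix} . \tag{1.106}$$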
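The step from (1.104) to (1.106) is a lowering of both indices with the diagonal metric, $F_{\alpha\beta} = g_{\alpha\gamma}\, g_{\beta\delta}\, F^{\gamma\delta}$. A minimal sketch, assuming the metric diag(1, −1, −1, −1) and using hypothetical function names and arbitrary sample field values:

```python
# Diagonal of the metric tensor, assumed diag(1, -1, -1, -1).
G_DIAG = (1.0, -1.0, -1.0, -1.0)


def field_tensor_upper(E, B):
    """Contravariant field tensor F^{alpha beta} of (1.104)."""
    ex, ey, ez = E
    bx, by, bz = B
    return [[0.0, -ex, -ey, -ez],
            [ex, 0.0, -bz, by],
            [ey, bz, 0.0, -bx],
            [ez, -by, bx, 0.0]]


def lower_both(f_upper):
    """F_{alpha beta} = g_{alpha alpha} g_{beta beta} F^{alpha beta} for a diagonal metric."""
    return [[G_DIAG[a] * G_DIAG[b] * f_upper[a][b] for b in range(4)] for a in range(4)]


F_UPPER = field_tensor_upper((1.0, 2.0, 3.0), (4.0, 5.0, 6.0))
F_LOWER = lower_both(F_UPPER)
```

The mixed time-space components change sign while the space-space components stay the same, which is exactly the pattern of (1.106).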
$$\partial^\alpha F^{\beta\gamma} + \partial^\beta F^{\gamma\alpha} + \partial^\gamma F^{\alpha\beta} = 0 . \tag{1.110}$$
The proof is left as an exercise to the reader. Let us mention that the homogeneous Maxwell
equation (1.109) or (1.110), and hence our demand that the field can be written in the form
(1.107), excludes magnetic monopoles.
The elements of the dual tensor may be calculated from their definition (1.108). For
example,
$${}^*F^{02} = \epsilon^{0213} F_{13} = -F_{13} = -B^y ,$$
where the first step exploits the antisymmetries $\epsilon^{0231} = -\epsilon^{0213}$ and $F_{31} = -F_{13}$. Calculating
six components, and exploiting antisymmetry of ${}^*F^{\alpha\beta}$, we arrive at
$$({}^*F^{\alpha\beta}) = \begin{pmatrix} 0 & -B^x & -B^y & -B^z \\ B^x & 0 & E^z & -E^y \\ B^y & -E^z & 0 & E^x \\ B^z & E^y & -E^x & 0 \end{pmatrix} . \tag{1.111}$$
The homogeneous Maxwell equations in their form (1.99) provide a non-trivial consistency
check for (1.109), which is of course passed.
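The dual tensor (1.111) can be generated mechanically. Definition (1.108) is not reproduced in this section, so the sketch assumes ${}^*F^{\alpha\beta} = \frac{1}{2}\, \epsilon^{\alpha\beta\gamma\delta} F_{\gamma\delta}$ with $\epsilon^{0123} = +1$, which is consistent with the worked example for ${}^*F^{02}$ above. The function names and sample field values are hypothetical.

```python
from itertools import permutations


def parity(perm):
    """+1 for an even permutation of (0, 1, 2, 3), -1 for an odd one."""
    inversions = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return 1 if inversions % 2 == 0 else -1


def field_tensor_lower(E, B):
    """Covariant field tensor F_{alpha beta} of (1.106)."""
    ex, ey, ez = E
    bx, by, bz = B
    return [[0.0, ex, ey, ez],
            [-ex, 0.0, -bz, by],
            [-ey, bz, 0.0, -bx],
            [-ez, -by, bx, 0.0]]


def dual_tensor(f_lower):
    """Assumed definition: *F^{ab} = (1/2) eps^{abcd} F_{cd}, with eps^{0123} = +1."""
    dual = [[0.0] * 4 for _ in range(4)]
    for perm in permutations(range(4)):  # eps vanishes unless all four indices differ
        a, b, c, d = perm
        dual[a][b] += 0.5 * parity(perm) * f_lower[c][d]
    return dual


DUAL = dual_tensor(field_tensor_lower((1.0, 2.0, 3.0), (4.0, 5.0, 6.0)))
```

With $\vec E = (1, 2, 3)$ and $\vec B = (4, 5, 6)$ the result reproduces the pattern of (1.111): the first row is $(0, -B^x, -B^y, -B^z)$ and the space-space block contains the components of $\vec E$.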
A notable observation is that equation (1.107) does not determine the potential uniquely.
Under the transformation
$$A^\alpha \mapsto A'^\alpha = A^\alpha + \partial^\alpha \psi , \tag{1.112}$$
where $\psi = \psi(x)$ is an arbitrary scalar function, the electromagnetic field tensor is invariant:
$F'^{\alpha\beta} = F^{\alpha\beta}$, as follows immediately from $\partial^\alpha \partial^\beta \psi - \partial^\beta \partial^\alpha \psi = 0$. The transformations (1.112)
are called gauge transformations. The choice of a convenient gauge is at the heart of many
calculations.
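Using the explicit form (1.70) of $A = (a^\alpha{}_\beta)$ for boosts in the $x^1$ direction and (1.104) for the
relation to $\vec E$ and $\vec B$ fields, it is left as a straightforward exercise to derive the transformation
laws
$$\vec E\,' = \gamma \left( \vec E + \vec\beta \times \vec B \right) - \frac{\gamma^2}{\gamma + 1}\, \vec\beta \left( \vec\beta \cdot \vec E \right) , \tag{1.114}$$
and
$$\vec B\,' = \gamma \left( \vec B - \vec\beta \times \vec E \right) - \frac{\gamma^2}{\gamma + 1}\, \vec\beta \left( \vec\beta \cdot \vec B \right) . \tag{1.115}$$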
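The transformation laws (1.114) and (1.115) are easy to evaluate numerically, and a useful sanity check is that the combinations $\vec E \cdot \vec B$ and $\vec E^{\,2} - \vec B^{\,2}$ do not change under a boost. The following sketch uses hypothetical function names and arbitrary sample values; fields are in Gaussian units and $\vec\beta = \vec v/c$.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def boost_fields(E, B, beta):
    """Fields in a frame moving with velocity beta (in units of c), cf. (1.114), (1.115)."""
    gamma = 1.0 / math.sqrt(1.0 - dot(beta, beta))
    k = gamma ** 2 / (gamma + 1.0)
    beta_x_B, beta_x_E = cross(beta, B), cross(beta, E)
    beta_E, beta_B = dot(beta, E), dot(beta, B)
    E_new = [gamma * (E[i] + beta_x_B[i]) - k * beta[i] * beta_E for i in range(3)]
    B_new = [gamma * (B[i] - beta_x_E[i]) - k * beta[i] * beta_B for i in range(3)]
    return E_new, B_new
```

A purely electric field along $y$ seen from a frame moving along $x$ with $\beta = 0.6$ acquires a magnetic component along $z$: `boost_fields((0, 1, 0), (0, 0, 0), (0.6, 0, 0))` gives $\vec E\,' = (0, 1.25, 0)$ and $\vec B\,' = (0, 0, -0.75)$.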
the conservation of energy and momentum and the fact that an electromagnetic field carries
energy as well as momentum. Here we are content with finding the Lorentz covariant force.
We consider a charged point particle in an external electromagnetic field $F^{\alpha\beta}$. Here external
means that the field stems from sources other than the point particle itself and that the influence of the point
particle on these other sources (possibly causing a change of the field $F^{\alpha\beta}$) is neglected. The
infinitesimal change of the 4-momentum of the point particle is $dp^\alpha$ and assumed to be
proportional to (i) its charge $q$ and (ii) the external electromagnetic field $F^{\alpha\beta}$. This means
we have to contract $F^{\alpha\beta}$ with some infinitesimal covariant vector to get $dp^\alpha$. The simplest
choice is $dx_\beta$, which means that the amount of 4-momentum change is proportional to the
space-time length over which the particle experiences the electromagnetic field. Hence, we
have determined $dp^\alpha$ up to a proportionality constant, which depends on the choice of units.
Gaussian units are defined by choosing $c^{-1}$ for this proportionality constant and we have
$$
dp^\alpha = \pm \frac{q}{c}\, F^{\alpha\beta}\, dx_\beta . \tag{1.116}
$$
As discussed in the next section, it is a consequence of energy conservation, which is in this
context known as Lenz’s law, that the force between charges of equal sign has to be repulsive.
This corresponds to the plus sign and we arrive at
$$
dp^\alpha = \frac{q}{c}\, F^{\alpha\beta}\, dx_\beta . \tag{1.117}
$$
Experimental measurements are of course in agreement with this sign. The remarkable
point is that energy conservation and the general structure of the theory already imply
that the force between charges of equal sign has to be repulsive. Therefore, despite the
similarity of Coulomb’s inverse square force law with Newton’s law, it is impossible to build
a theory of gravity along the lines of this chapter, i.e., to use the 4-momentum $p^\alpha$ as source
in the inhomogeneous equation (1.103). The resulting force would necessarily be repulsive.
Experiments show also that positive and negative electric charges exist and deeper insight
about their origin comes from the relativistic Lagrange formulation, which includes Dirac’s
equation for electrons and leads to Quantum Electrodynamics.
Taking the derivative with respect to the proper time, we obtain the 4-force acting on a
charged particle, called Lorentz force,
$$
f^\alpha = \frac{dp^\alpha}{d\tau} = \frac{q}{c}\, F^{\alpha\beta}\, U_\beta . \tag{1.118}
$$
As in equation (1.97), $f^\alpha = m_0\, dU^\alpha / d\tau$ holds for non-zero rest mass and the definition of the
contravariant velocity is given by equation (1.96).
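Using the representation (1.104) of the electromagnetic field, the time component of the
relativistic Lorentz force, which describes the change in energy, is
$$
f^0 = \frac{dp^0}{d\tau} = -\frac{q}{c} \sum_{j=1}^{3} E^j\, U_j = \frac{q}{c}\, \vec E \cdot \vec U , \tag{1.119}
$$
where $\vec U = (U^1, U^2, U^3)$ collects the contravariant space components, $U^j = -U_j$.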
To get the space components of the Lorentz force we use (1.104) and (1.105) and get the
equality
$$
\frac{q}{c} \sum_{j=1}^{3} F^{ij}\, U_j = -\frac{q}{c} \sum_{j=1}^{3} \sum_{k=1}^{3} \epsilon^{ijk}\, B^k\, U_j .
$$
The space components combine into the well-known Lorentz force
$$
\vec f = q\, \gamma\, \vec E + \frac{q}{c}\, \vec U \times \vec B , \tag{1.120}
$$
which reveals that the relativistic velocity (1.96) of the charge $q$ and not its velocity $\vec v$
enters the force equation. This allows, for instance, correct force calculations for fast flying
electrons in a magnetic field. The equation (1.120) for $\vec f$ can be used to define a measurement
prescription for an electric charge unit.
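The step from the covariant form (1.118) to the 3-vector form (1.120) can be illustrated numerically. The sketch below is an assumption-laden example rather than part of the original derivation: it takes (1.104) to have the standard Gaussian form given in `field_tensor`, uses the metric signature $(+,-,-,-)$ to lower the index of $U^\beta$, and sets $c = 1$ purely for convenience. All names are hypothetical.

```python
import math

C = 1.0  # speed of light; c = 1 is chosen only for this illustration


def cross(a, b):
    """Euclidean cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def field_tensor(E, B):
    """Contravariant F^{alpha beta}; assumed sign convention for (1.104)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, -Bz, By],
            [Ey, Bz, 0.0, -Bx],
            [Ez, -By, Bx, 0.0]]


def four_velocity(v, c=C):
    """Contravariant U^alpha = gamma * (c, v) for a 3-velocity v, |v| < c."""
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in v) / c ** 2)
    return [gamma * c, gamma * v[0], gamma * v[1], gamma * v[2]]


def four_force(q, E, B, v, c=C):
    """Equation (1.118): f^alpha = (q/c) F^{alpha beta} U_beta."""
    F = field_tensor(E, B)
    U = four_velocity(v, c)
    U_low = [U[0], -U[1], -U[2], -U[3]]  # lower the index with the metric
    return [q / c * sum(F[a][b] * U_low[b] for b in range(4))
            for a in range(4)]


def three_force(q, E, B, v, c=C):
    """Equation (1.120): f = q gamma E + (q/c) U x B."""
    U = four_velocity(v, c)
    gamma = U[0] / c
    U_x_B = cross(U[1:], B)
    return [q * gamma * E[i] + q / c * U_x_B[i] for i in range(3)]
```

Under these assumptions the space components of `four_force` agree with `three_force`, and the contraction $f^\alpha U_\alpha$ vanishes because $F^{\alpha\beta}$ is antisymmetric.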
$$
\nabla \times \vec E + \frac{1}{c} \frac{\partial \vec B}{\partial t} = 0 . \tag{1.121}
$$
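This equation is the differential form of Faraday’s law: A changing magnetic field induces
an electric field. In the following we derive the integral form, which is needed for circuits
of macroscopic extent. We integrate over a simply connected surface $S$ and use Stokes’
theorem to convert the integral over $\nabla \times \vec E$ into a closed line integral along the boundary $C$
of $S$:
$$
\int_S ( \nabla \times \vec E ) \cdot d\vec a = \oint_C \vec E \cdot d\vec l = -\frac{1}{c} \int_S \frac{\partial \vec B}{\partial t} \cdot d\vec a .
$$
On the right-hand side we eliminate the partial derivative $\partial/\partial t$ using
$$
\frac{d}{dt} = \frac{\partial}{\partial t} + \vec v \cdot \nabla
\qquad \left( \text{note } \vec v = \frac{\partial \vec x}{\partial t} = \sum_{i=1}^{3} \hat e_i\, \frac{\partial x^i}{\partial t} \right)
$$
to get
$$
\oint_C \vec E \cdot d\vec l = -\frac{1}{c} \frac{d}{dt} \int_S \vec B \cdot d\vec a + \frac{1}{c} \int_S ( \vec v \cdot \nabla )\, \vec B \cdot d\vec a . \tag{1.122}
$$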
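Using the other homogeneous Maxwell equation, $\nabla \cdot \vec B = 0$, and that the $\partial/\partial x^i$ derivatives
of $\vec v$ vanish (e.g., $(\partial/\partial x^1)\,(\partial x^1/\partial t) = (\partial/\partial t)\,(\partial x^1/\partial x^1) = 0$), the vector identity
$$
\nabla \times ( \vec a \times \vec b ) = ( \nabla \cdot \vec b )\, \vec a + ( \vec b \cdot \nabla )\, \vec a - ( \nabla \cdot \vec a )\, \vec b - ( \vec a \cdot \nabla )\, \vec b
$$
gives, with $\vec a = \vec B$ and $\vec b = \vec v$,
$$
\nabla \times ( \vec B \times \vec v ) = -\vec v\, ( \nabla \cdot \vec B ) + ( \vec v \cdot \nabla )\, \vec B = ( \vec v \cdot \nabla )\, \vec B .
$$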
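Applying Stokes’ theorem to $\nabla \times ( \vec B \times \vec v )$ turns the last term of (1.122) into a line integral,
$$
\frac{1}{c} \int_S ( \vec v \cdot \nabla )\, \vec B \cdot d\vec a = \frac{1}{c} \int_S \left[ \nabla \times ( \vec B \times \vec v ) \right] \cdot d\vec a = -\oint_C ( \vec\beta \times \vec B ) \cdot d\vec l ,
$$
where $\vec\beta = \vec v / c$. We rewrite equation (1.122) with both
line integrals on the left-hand side
$$
\oint_C \left( \vec E + \vec\beta \times \vec B \right) \cdot d\vec l = -\frac{1}{c} \frac{d}{dt}\, \Phi_m , \tag{1.123}
$$
where $\Phi_m$ is called the magnetic flux and is defined by
$$
\Phi_m = \int_S \vec B \cdot d\vec a . \tag{1.124}
$$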
Equation (1.123) is the fully relativistic version of Faraday’s law. The velocity $\vec\beta = \vec v / c$ in
equation (1.123) refers to the velocity of the line element $d\vec l$ with respect to the inertial frame
in which the calculation is done. In the frame co-moving with the apparatus, normally the Lab
frame, the velocity differences between different line element sections are small so that we
can neglect the $\vec\beta \times \vec B$ contribution:
$$
\mathrm{emf} = \oint_C \vec E \cdot d\vec l = -\frac{1}{c} \frac{d}{dt}\, \Phi_m , \tag{1.125}
$$
where emf stands for electromotive force. In this approximation Faraday’s Law of
Induction is stated in most textbooks. Due to our initial treatment of special relativity
we do not face the problem of working out its relativistic generalization, but instead obtain
(1.125) as an approximation of the generally correct law (1.123).
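To make (1.125) concrete, the following sketch evaluates the emf for a loop at rest. The setup is an illustrative assumption, not taken from the text: a circular loop of radius `radius` (in cm) sits in a spatially uniform field $B(t) = B_0 \cos(\omega t)$ (in gauss) perpendicular to the loop plane, so that $\Phi_m = \pi r^2 B(t)$. The time derivative is approximated by a central difference and compared with the closed-form result. All names are hypothetical.

```python
import math

C_CGS = 2.99792458e10  # speed of light in cm/s (Gaussian units)


def flux(t, radius, b0, omega):
    """Magnetic flux (1.124) through the loop for a uniform field b0*cos(omega*t)."""
    return math.pi * radius ** 2 * b0 * math.cos(omega * t)


def emf_numeric(t, radius, b0, omega, dt=1e-6):
    """Equation (1.125) with d(Phi_m)/dt taken as a central difference."""
    dphi_dt = (flux(t + dt, radius, b0, omega)
               - flux(t - dt, radius, b0, omega)) / (2.0 * dt)
    return -dphi_dt / C_CGS


def emf_exact(t, radius, b0, omega):
    """Closed form of -(1/c) d(Phi_m)/dt for the same setup."""
    return math.pi * radius ** 2 * b0 * omega * math.sin(omega * t) / C_CGS
```

The factor $1/c$ is specific to Gaussian units. The sign of the result changes with the sign of $d\Phi_m/dt$, which is the content of the minus sign in (1.125).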
becomes a magnet with north pole towards the bar magnet. The result is a repulsive force
between bar magnet and loop. Work against this force is responsible for the induced current
and its associated heat in the loop. Were the sign of the induced current different, an
attractive force would result and the resulting acceleration of the bar magnet as well as the
heat in the loop would violate energy conservation. Note that pulling the bar magnet out of
the loop also produces energy.
In our treatment the sign of Faraday’s law is already given by the electromagnetic field
equation and energy conservation determines the sign in equation (1.116) for the Lorentz
force.
Bibliography
[1] P.J. Mohr and B.N. Taylor, CODATA Recommended Values of the Fundamental
Physical Constants: 1998, J. of Physical and Chemical Reference Data, to appear.
See the website of the National Institute of Standards and Technology (NIST) at
physics.nist.gov/constants.
[2] A. Einstein, Zur Elektrodynamik bewegter Körper, Annalen der Physik 17 (1905) 891–
921.
[3] A. Einstein, Ist die Trägheit eines Körpers von seinem Energieinhalt abhängig?, Annalen
der Physik 18 (1905) 639–641.
[5] J.D. Jackson, Classical Electrodynamics, Second Edition, John Wiley & Sons, 1975.
[8] P.A. Tipler, Physics for Scientists and Engineers, Worth Publishers, 1995.
[10] S. Weinberg, Gravitation and Cosmology - Principles and Applications of the General
Theory of Relativity, John Wiley & Sons, 1972.