Lagrangian Mechanics Examples
That being said, here we're going to cover two fairly quick examples first, the simple
pendulum and the spherical pendulum. These will illustrate the general process of how
Lagrangian mechanics is applied as well as lay out the basics we need to understand first.
We will then go over the full solution to the gravitational two-body problem, which will lead us
to various interesting physical results, such as Kepler's laws, why planetary orbits are
elliptical and many other things.
The first example we'll consider is the simple pendulum. This is a typical Lagrangian
mechanics example in almost all textbooks and courses that's often used to introduce the
concepts of generalized coordinates and so on.
While this example is quite common and perhaps a little boring, it illustrates the concepts
we're going to need later very well. So, it's worth going over this example as well.
Anyway, the simple pendulum is essentially a mass at the end of a rigid rod that's swinging
back and forth in a plane under the influence of gravity (a constant acceleration g pointing
downwards). We'll call the mass of the "pendulum bob" m and the length of the rod l.
It's often easiest to start in Cartesian coordinates before choosing any generalized
coordinates, so let's imagine placing the pendulum in an x,y -coordinate system:
We're interested in describing how the position of the pendulum (its x,y -coordinates) change
with time. However, we can equivalently do this, not by using the x,y -coordinates but an
angle 𝜃 (which in this case will be chosen relative to the y-axis).
Now, this angle 𝜃 (which is a function of time, 𝜃 = 𝜃(t)) characterizes all of the information
we wish to know about the pendulum's position. Therefore, we will choose this as our
generalized coordinate, meaning that q = 𝜃 (this is also the only generalized coordinate we
need).
It is, however, still easiest to first construct the kinetic and potential energies in Cartesian
coordinates and then rewrite them in terms of the generalized coordinate 𝜃. By some
simple trigonometry, we can determine the x,y-coordinates of the pendulum in terms of this
angle 𝜃:
$$x = l\sin\theta, \qquad y = -l\cos\theta$$
Note that the y-coordinate here is negative because the bob hangs below the origin.
Anyway, we can now find the components of the velocity by taking the time derivatives of the
x- and y-coordinates (note that since 𝜃 is a function of time, you have to use the chain rule
here, so first differentiate sin and cos, then multiply by the time derivative of 𝜃):
$$\frac{dx}{dt} = \frac{d}{dt}\left(l\sin\theta\right) = l\cos\theta\,\frac{d\theta}{dt}$$
$$\frac{dy}{dt} = \frac{d}{dt}\left(-l\cos\theta\right) = l\sin\theta\,\frac{d\theta}{dt}$$
Now I'll switch to using the dot notation for these time derivatives, so $dx/dt = \dot{x}$ and
similarly for y and 𝜃 also:
$$\dot{x} = l\dot{\theta}\cos\theta$$
$$\dot{y} = l\dot{\theta}\sin\theta$$
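The kinetic energy is then just $\frac{1}{2}m$ times the sum of the squares of these velocity
components, which we can write in terms of 𝜃 and $\dot{\theta}$ now:
$$T = \frac{1}{2}m\left(\dot{x}^2 + \dot{y}^2\right) = \frac{1}{2}m\left(l^2\dot{\theta}^2\cos^2\theta + l^2\dot{\theta}^2\sin^2\theta\right)$$
We can factor out this $l^2\dot{\theta}^2$ here and use $\cos^2\theta + \sin^2\theta = 1$ to get:
$$T = \frac{1}{2}ml^2\dot{\theta}^2\left(\cos^2\theta + \sin^2\theta\right) = \frac{1}{2}ml^2\dot{\theta}^2$$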
That's the kinetic energy. Now, the potential energy of the mass m is just V = mgh, where h
is its height. We're free to choose the zero level of the potential energy anywhere we want, so
I'll choose it at y = 0. Therefore, the height is just the y-coordinate, $y = -l\cos\theta$, which
gives $V = mgy = -mgl\cos\theta$. The Lagrangian is then:
$$L = T - V = \frac{1}{2}ml^2\dot{\theta}^2 + mgl\cos\theta$$
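We can now construct the Euler-Lagrange equation, in this case, for the generalized
coordinate q = 𝜃:
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{\theta}}\right) - \frac{\partial L}{\partial\theta} = 0$$
$$\Rightarrow\quad \frac{d}{dt}\left(ml^2\dot{\theta}\right) + mgl\sin\theta = 0$$
$$\Rightarrow\quad ml^2\ddot{\theta} + mgl\sin\theta = 0$$
$$\Rightarrow\quad \ddot{\theta} = -\frac{g}{l}\sin\theta$$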
This is the equation of motion for a simple pendulum! It describes the acceleration of the 𝜃-
coordinate and in principle, we could solve this to find 𝜃(t), which would determine the entire
motion of the pendulum.
However, a nice way to approximate a solution is by the so-called small angle approximation,
where sin 𝜃 ≈ 𝜃. This is valid as long as the angle 𝜃 is small, in which case this gives a
pretty good approximation.
So, for small oscillations of the pendulum, the differential equation reduces to:
$$\ddot{\theta} = -\frac{g}{l}\theta$$
This happens to have the general solution $\theta(t) = A\sin\left(\sqrt{\frac{g}{l}}\,t + \varphi\right)$,
where A and 𝜑 are arbitrary constants fixed by the initial conditions. You can check that this
is a valid solution simply by plugging this into the differential equation above if you wish.
The angular frequency 𝜔 of the pendulum's oscillations, in this case, is the thing in front of
the time t, which is $\omega = \sqrt{g/l}$. Therefore, the period of the pendulum (for small
oscillations) is given by:
$$T = \frac{2\pi}{\omega} = 2\pi\sqrt{\frac{l}{g}}$$
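As a rough numerical cross-check of the small-angle period, here is a minimal sketch that integrates the full equation of motion (with sin 𝜃, not the approximation) using a hand-written Runge-Kutta step. The function name `pendulum_period`, the step size and the default values g = 9.81 and l = 1 are illustrative assumptions, not something from the derivation above.

```python
import math

def pendulum_period(theta0, g=9.81, l=1.0, dt=1e-4):
    """Estimate the period of theta'' = -(g/l) sin(theta) for a bob
    released from rest at angle theta0 > 0 (all values are illustrative)."""
    def accel(theta):
        return -(g / l) * math.sin(theta)

    theta, omega, t = theta0, 0.0, 0.0
    while True:
        # One classical RK4 step for the first-order system (theta, omega).
        k1t, k1w = omega, accel(theta)
        k2t, k2w = omega + 0.5 * dt * k1w, accel(theta + 0.5 * dt * k1t)
        k3t, k3w = omega + 0.5 * dt * k2w, accel(theta + 0.5 * dt * k2t)
        k4t, k4w = omega + dt * k3w, accel(theta + dt * k3t)
        new_theta = theta + dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        new_omega = omega + dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
        if new_theta <= 0.0:
            # The bob reaches theta = 0 after a quarter period;
            # interpolate linearly inside the last step.
            frac = theta / (theta - new_theta)
            return 4.0 * (t + frac * dt)
        theta, omega, t = new_theta, new_omega, t + dt

small_angle_period = 2 * math.pi * math.sqrt(1.0 / 9.81)
print(pendulum_period(0.05), small_angle_period)  # nearly equal for a small swing
print(pendulum_period(1.0))                       # noticeably longer for a large swing
```

For a small release angle the two numbers should agree closely, while a large release angle gives a longer period, which is one way to see where the small-angle approximation stops being accurate.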
The simple pendulum is one of those examples that's necessary to go through once but after
that, it's not so interesting anymore. So, let's consider something a bit more complicated: a
spherical pendulum.
This is essentially a pendulum (mass m and length l again) that can swing, not just in a plane
but in all three dimensions and the motion of the pendulum bob can get very complicated, but
also interesting.
We'll first derive the equations of motion for the spherical pendulum and after that, we'll do
some analysis of its motion. This will also lead us quite smoothly to the next example, the
Kepler problem.
To begin with, we can describe the position of the mass m by placing the pendulum in a
Cartesian x,y,z -coordinate system:
Now, Cartesian coordinates are certainly not the best choice here. Instead we will choose two
angles, 𝜃 and 𝜙, as our generalized coordinates (see picture below). These are indeed the
only necessary coordinates we need to specify the pendulum bob's position (since its
distance from the origin is fixed to just l).
The angle 𝜙 here essentially "rotates" in the x,y-plane and 𝜃 is the angle between the rod
and the vertical z-axis (it's essentially the same angle as in the simple pendulum case).
Now, let's try to express the x, y and z coordinates of the bob in terms of these two angles.
First, let's look at the situation from the "side" or in other words, let's look at this triangle you
see in the picture, from which we can determine the z-coordinate.
Now let's look at the situation from above or in other words, let's look at what happens in the
x,y -plane to determine the x and y coordinates:
x = l sin 𝜃 cos 𝜙
y = l sin 𝜃 sin 𝜙
z = -l cos 𝜃
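We can then take the time derivatives of these (note that since both 𝜃 and 𝜙 are functions of
time, you have to use both the product and chain rules when differentiating x and y):
$$\dot{x} = \frac{d}{dt}\left(l\sin\theta\cos\phi\right) = l\cos\phi\,\frac{d}{dt}\sin\theta + l\sin\theta\,\frac{d}{dt}\cos\phi = l\dot{\theta}\cos\phi\cos\theta - l\dot{\phi}\sin\theta\sin\phi$$
$$\dot{y} = \frac{d}{dt}\left(l\sin\theta\sin\phi\right) = l\sin\phi\,\frac{d}{dt}\sin\theta + l\sin\theta\,\frac{d}{dt}\sin\phi = l\dot{\theta}\sin\phi\cos\theta + l\dot{\phi}\sin\theta\cos\phi$$
$$\dot{z} = \frac{d}{dt}\left(-l\cos\theta\right) = l\dot{\theta}\sin\theta$$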
Now, the kinetic energy of the pendulum bob is again $\frac{1}{2}m$ times the sum of the squares
of these velocity components, $T = \frac{1}{2}m\left(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\right)$.
Writing out all these squares is somewhat tedious, so I've simply done that with a calculator.
You can do this by hand as an exercise if you wish.
What you end up with after a lot of terms cancelling and applying some trigonometric
identities a few times (namely, $\cos^2\theta + \sin^2\theta = 1$ and the same for 𝜙) is the
following kinetic energy:
$$T = \frac{1}{2}ml^2\left(\dot{\theta}^2 + \dot{\phi}^2\sin^2\theta\right)$$
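If you would rather not expand the squares by hand, a quick numerical spot check is one way to gain confidence in the result. The sketch below evaluates the three Cartesian velocity components for arbitrary sample values and compares the sum of their squares with the compact expression in terms of the angles. The function names and the sample numbers are made up for illustration.

```python
import math

def speed_squared_cartesian(l, theta, phi, theta_dot, phi_dot):
    """Sum of squared Cartesian velocity components of the bob."""
    x_dot = (l * theta_dot * math.cos(phi) * math.cos(theta)
             - l * phi_dot * math.sin(theta) * math.sin(phi))
    y_dot = (l * theta_dot * math.sin(phi) * math.cos(theta)
             + l * phi_dot * math.sin(theta) * math.cos(phi))
    z_dot = l * theta_dot * math.sin(theta)
    return x_dot**2 + y_dot**2 + z_dot**2

def speed_squared_angles(l, theta, theta_dot, phi_dot):
    """The simplified form l^2 (theta_dot^2 + phi_dot^2 sin^2 theta)."""
    return l**2 * (theta_dot**2 + phi_dot**2 * math.sin(theta)**2)

# Arbitrary sample state; both expressions should agree to rounding error.
sample = dict(l=1.3, theta=0.7, phi=2.1, theta_dot=0.4, phi_dot=-1.2)
print(speed_squared_cartesian(**sample))
print(speed_squared_angles(sample["l"], sample["theta"],
                           sample["theta_dot"], sample["phi_dot"]))
```

Multiplying either value by m/2 gives the kinetic energy, so agreement here is equivalent to agreement of the two forms of T.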
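Now, the potential energy is much simpler. It is again just V = mgh, with the height being the
z-coordinate ($z = -l\cos\theta$), so $V = -mgl\cos\theta$ and the Lagrangian is:
$$L = T - V = \frac{1}{2}ml^2\left(\dot{\theta}^2 + \dot{\phi}^2\sin^2\theta\right) + mgl\cos\theta$$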
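In this case, since we have two generalized coordinates, we will also have two equations of
motion and thus, two Euler-Lagrange equations. The first one is the Euler-Lagrange equation
for the coordinate 𝜃:
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{\theta}}\right) - \frac{\partial L}{\partial\theta} = 0$$
$$\Rightarrow\quad \frac{d}{dt}\left(ml^2\dot{\theta}\right) - \frac{1}{2}ml^2\dot{\phi}^2\frac{\partial}{\partial\theta}\sin^2\theta - mgl\frac{\partial}{\partial\theta}\cos\theta = 0$$
$$\Rightarrow\quad ml^2\ddot{\theta} - ml^2\dot{\phi}^2\sin\theta\cos\theta + mgl\sin\theta = 0$$
$$\Rightarrow\quad \ddot{\theta} = -\frac{g}{l}\sin\theta + \dot{\phi}^2\sin\theta\cos\theta$$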
This is the equation of motion for the 𝜃-coordinate. Notice that the first term on the right-hand
side is the same as for the simple pendulum, but the second term comes from the fact that
the pendulum is also swinging in the 𝜙-direction (meaning a non-zero $\dot{\phi}$).
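Let's now construct the equation of motion for the 𝜙-coordinate. Since nothing in the
Lagrangian depends explicitly on the 𝜙-coordinate, we have $\partial L/\partial\phi = 0$ and we get:
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{\phi}}\right) - \frac{\partial L}{\partial\phi} = 0 \quad\Rightarrow\quad \frac{d}{dt}\left(ml^2\dot{\phi}\sin^2\theta\right) = 0$$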
This thing inside the parentheses is simply the generalized momentum associated with the
𝜙-coordinate, so $p_\phi = ml^2\dot{\phi}\sin^2\theta$. The statement $dp_\phi/dt = 0$ then means
that $p_\phi$ (which actually happens to be the z-component of the angular momentum) is conserved!
We can actually use this to express the velocity $\dot{\phi}$ as (this is useful because $p_\phi$ is
simply some constant determined by initial conditions as it is conserved):
$$p_\phi = ml^2\dot{\phi}\sin^2\theta \quad\Rightarrow\quad \dot{\phi} = \frac{p_\phi}{ml^2\sin^2\theta}$$
Then, inserting this expression for $\dot{\phi}$ into the 𝜃-equation of motion, we get:
$$\ddot{\theta} = -\frac{g}{l}\sin\theta + \frac{p_\phi^2\cos\theta}{m^2l^4\sin^3\theta}$$
We now have an equation of motion completely in terms of the 𝜃-coordinate that determines
the motion of the pendulum! In principle, this could be used to, for example, simulate the
pendulum's motion using a computer or numerically solve for its motion.
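To make the remark about simulation concrete, here is a minimal sketch of such a numerical solution. It assumes the reduced 𝜃-equation takes the form $\ddot{\theta} = -(g/l)\sin\theta + p_\phi^2\cos\theta/(m^2l^4\sin^3\theta)$, which is what you get by substituting the expression for $\dot{\phi}$ into the 𝜃-equation of motion. The integrator (a plain fourth-order Runge-Kutta step), the function names and all parameter values are illustrative choices.

```python
import math

def theta_accel(theta, p_phi, m=1.0, l=1.0, g=9.81):
    """Right-hand side of the reduced theta-equation of motion."""
    return (-(g / l) * math.sin(theta)
            + p_phi**2 * math.cos(theta) / (m**2 * l**4 * math.sin(theta)**3))

def energy(theta, theta_dot, p_phi, m=1.0, l=1.0, g=9.81):
    """Total energy T + V with phi_dot eliminated in favour of p_phi."""
    return (0.5 * m * l**2 * theta_dot**2
            + p_phi**2 / (2 * m * l**2 * math.sin(theta)**2)
            - m * g * l * math.cos(theta))

def simulate(theta0, theta_dot0, p_phi, t_end=5.0, dt=1e-3):
    """Integrate the theta-equation with RK4; returns a list of (theta, theta_dot)."""
    theta, w = theta0, theta_dot0
    states = [(theta, w)]
    for _ in range(round(t_end / dt)):
        k1t, k1w = w, theta_accel(theta, p_phi)
        k2t, k2w = w + 0.5 * dt * k1w, theta_accel(theta + 0.5 * dt * k1t, p_phi)
        k3t, k3w = w + 0.5 * dt * k2w, theta_accel(theta + 0.5 * dt * k2t, p_phi)
        k4t, k4w = w + dt * k3w, theta_accel(theta + dt * k3t, p_phi)
        theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        w += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
        states.append((theta, w))
    return states

states = simulate(theta0=1.0, theta_dot0=0.0, p_phi=1.0)
print(min(th for th, _ in states), max(th for th, _ in states))  # range of theta
print(energy(*states[0], 1.0), energy(*states[-1], 1.0))         # should match closely
```

Comparing the energy at the start and at the end of the run is a cheap sanity check on both the integrator and the equation of motion. Once 𝜃(t) is known, 𝜙(t) follows by integrating $\dot{\phi} = p_\phi/(ml^2\sin^2\theta)$ along the same trajectory.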
Let's now look at some aspects relating to the motion of the spherical pendulum. In particular,
we'll construct a so-called effective potential for the spherical pendulum, which allows us to
look at some interesting details. This will also lead us nicely to the next example, which also
utilizes the notion of an effective potential.
The way we'll do this is by finding the total energy of the pendulum first. Simply put, the total
energy (which is conserved) is just the sum of the kinetic and potential energy, T + V .
Inserting these (which we calculated earlier), the energy is:
$$E = T + V = \frac{1}{2}ml^2\left(\dot{\theta}^2 + \dot{\phi}^2\sin^2\theta\right) - mgl\cos\theta$$
We can now use the expression we found, $\dot{\phi} = \dfrac{p_\phi}{ml^2\sin^2\theta}$, and insert this:
$$E = \frac{1}{2}ml^2\left(\dot{\theta}^2 + \frac{p_\phi^2}{m^2l^4\sin^4\theta}\sin^2\theta\right) - mgl\cos\theta$$
$$E = \frac{1}{2}ml^2\dot{\theta}^2 + \frac{p_\phi^2}{2ml^2\sin^2\theta} - mgl\cos\theta$$
Now, here we essentially have a kinetic energy term $\frac{1}{2}ml^2\dot{\theta}^2$ and two terms
that don't have any velocity-dependence anymore and only depend on the position 𝜃.
Generally speaking, potentials are position-dependent, so we could say that the energy can
be expressed as the sum of a kinetic term and an "effective" potential that is a function of 𝜃
(again, this only makes sense since we've eliminated the $\dot{\phi}$-dependence and replaced it
with something that only involves 𝜃):
$$E = \frac{1}{2}ml^2\dot{\theta}^2 + V_{\text{eff}}(\theta), \quad\text{where}\quad V_{\text{eff}}(\theta) = \frac{p_\phi^2}{2ml^2\sin^2\theta} - mgl\cos\theta.$$
Now, what's the point of this exactly? Well, the effective potential is an extremely useful tool
for qualitatively analyzing how a given system behaves. For example, by plotting the effective
potential, we can literally see how a system may behave.
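By plotting this effective potential for the spherical pendulum, we can see that there is a
minimum point on the graph (here I've plotted the values for 0 ⩽ 𝜃 ⩽ 𝜋). If the pendulum sits
exactly at this minimum, $\theta = \theta_0$, then 𝜃 stays constant and the bob moves around a
horizontal circle. The minimum is found by setting the derivative of the effective potential
to zero:
$$\frac{dV_{\text{eff}}}{d\theta}\bigg|_{\theta_0} = 0$$
$$-\frac{p_\phi^2\cos\theta_0}{ml^2\sin^3\theta_0} + mgl\sin\theta_0 = 0$$
$$\Rightarrow\quad p_\phi^2\cos\theta_0 = m^2gl^3\sin^4\theta_0$$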
Now, solving this for $\theta_0$ is incredibly messy and will not give you any kind of
nice looking or illuminating result. The actual value of $\theta_0$ is not so important here, but
rather what it implies in terms of the properties of the motion.
So, we can instead solve for $p_\phi$ here (you'll see why soon):
$$p_\phi = m\sin^2\theta_0\sqrt{\frac{gl^3}{\cos\theta_0}}$$
Now, since the pendulum bob is moving in circular motion, its angular frequency 𝜔 is a
constant and for circular motion this can be expressed in terms of the period T (time it takes
to move once around the circle) as $\omega = 2\pi/T$.
This angular frequency is simply the rate of rotation around the circle, which is just
$\dot{\phi}(\theta_0)$, in other words, the rate of change of 𝜙 for the particular value
$\theta_0$ (which as a reminder, is the value of 𝜃 that minimizes the effective potential). So,
we can solve for the period of the "circular" pendulum from this:
$$\omega = \dot{\phi}(\theta_0) = \frac{p_\phi}{ml^2\sin^2\theta_0} = \frac{2\pi}{T} \quad\Rightarrow\quad T = \frac{2\pi ml^2\sin^2\theta_0}{p_\phi}$$
We can insert $p_\phi = m\sin^2\theta_0\sqrt{\dfrac{gl^3}{\cos\theta_0}}$ into this to obtain:
$$T = 2\pi\sqrt{\frac{l\cos\theta_0}{g}}$$
This is the period of the pendulum bob in circular motion. Notice that in the limit
$\theta_0 \to 0$ (where the circle shrinks and the motion looks more and more like a small swing
about the vertical), this reduces to $T = 2\pi\sqrt{l/g}$, which is the same as the period of a
simple pendulum for small oscillations.
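The chain of steps for the circular case (minimum of the effective potential, then $p_\phi$, then the period) can be sketched numerically as follows. The function names and the sample values m = l = 1, g = 9.81 and $\theta_0 = 0.6$ are assumptions for illustration only.

```python
import math

def circular_p_phi(theta0, m=1.0, l=1.0, g=9.81):
    """p_phi that places the minimum of V_eff at theta0."""
    return m * math.sin(theta0)**2 * math.sqrt(g * l**3 / math.cos(theta0))

def veff_slope(theta, p_phi, m=1.0, l=1.0, g=9.81):
    """dV_eff/dtheta for the spherical pendulum."""
    return (-p_phi**2 * math.cos(theta) / (m * l**2 * math.sin(theta)**3)
            + m * g * l * math.sin(theta))

def period_from_p_phi(theta0, p_phi, m=1.0, l=1.0):
    """T = 2 pi / phi_dot evaluated at theta0."""
    return 2 * math.pi * m * l**2 * math.sin(theta0)**2 / p_phi

def period_closed_form(theta0, l=1.0, g=9.81):
    """T = 2 pi sqrt(l cos(theta0) / g)."""
    return 2 * math.pi * math.sqrt(l * math.cos(theta0) / g)

theta0 = 0.6
p = circular_p_phi(theta0)
print(veff_slope(theta0, p))                                      # close to zero at the minimum
print(period_from_p_phi(theta0, p), period_closed_form(theta0))  # same value two ways
```

Both routes to the period should give the same number, and the slope of the effective potential should vanish at $\theta_0$ up to rounding error.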
The gravitational two-body problem, also called the Kepler problem, essentially consists of
two masses, we'll call them $m_1$ and $m_2$, that interact with each other through an
inverse-square gravitational force ($F \propto 1/r^2$) in three dimensions.
Essentially, the solution to such a problem describes all possible gravitational interactions
between two bodies (not accounting for general relativity). To put it simply, this problem
consists of describing what happens when two massive bodies are both allowed to move in 3D
with a force that depends on the distance between them.
We'll see that this problem can more or less be solved using Lagrangian mechanics and we'll
discover many interesting things like why orbits happen in a plane, why orbits are generally
elliptic and so on.
Generally speaking, if the initial velocities of the two bodies are in different directions (which
they generally will be), the bodies will orbit each other in some complicated manner. Our goal
is to find exactly how they will orbit.
To begin with, let's place both of our objects in a 3D Cartesian coordinate system. I'll also
draw some vectors, r 1 and r 2 , to describe the positions of the masses m 1 and m 2 :
Now, the problem with this is that we now have 6 degrees of freedom (6 coordinates we
need), the x,y,z -coordinates for both of the masses. This makes things quite complicated but
luckily, we can simplify this a lot.
First of all, I'm going to draw in two more vectors, a vector r that points from one of the
masses to the other and a vector R that is the position vector of the center of mass (which
lies somewhere between the two masses):
We can now do a bit of vector algebra first before constructing the Lagrangian. Since the
vectors r and R both have three components, together they contain 6 degrees of freedom.
Moreover, the force only acts along the vector r .
Based on this, knowing just the displacement vector between the masses, r , and the center
of mass, described by R, is enough to specify everything about the system and we will also
see that this simplifies the Lagrangian and the equations of motion a lot.
Anyway, we now need to express the vectors $\mathbf{r}_1$ and $\mathbf{r}_2$ in terms of the
vectors $\mathbf{r}$ and $\mathbf{R}$. To do this, we can draw a little vector diagram:
The vector $\mathbf{r}$ here can be expressed as $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$. The
center of mass position vector can be calculated from the formula (this is the definition of the
center of mass):
$$\mathbf{R} = \sum_{i=1}^{2}\frac{m_i\mathbf{r}_i}{M} = \frac{m_1\mathbf{r}_1 + m_2\mathbf{r}_2}{M},$$
where $M = m_1 + m_2$ is the total mass of the system.
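We now want to solve for $\mathbf{r}_1$ and $\mathbf{r}_2$ from these:
$$\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2 \quad\Rightarrow\quad \mathbf{r}_1 = \mathbf{r} + \mathbf{r}_2$$
$$\mathbf{R} = \frac{m_1\mathbf{r}_1 + m_2\mathbf{r}_2}{M} \quad\Rightarrow\quad \mathbf{r}_2 = \frac{M}{m_2}\mathbf{R} - \frac{m_1}{m_2}\mathbf{r}_1$$
Inserting the second equation into the first and also the first equation into the second, we
obtain two equations, which we can again solve:
$$\mathbf{r}_1 = \mathbf{r} + \frac{M}{m_2}\mathbf{R} - \frac{m_1}{m_2}\mathbf{r}_1 \quad\Rightarrow\quad \mathbf{r}_1 = \frac{1}{1 + \frac{m_1}{m_2}}\mathbf{r} + \frac{\frac{M}{m_2}}{1 + \frac{m_1}{m_2}}\mathbf{R}$$
$$\mathbf{r}_2 = \frac{M}{m_2}\mathbf{R} - \frac{m_1}{m_2}\left(\mathbf{r} + \mathbf{r}_2\right) \quad\Rightarrow\quad \mathbf{r}_2 = \frac{\frac{M}{m_2}}{1 + \frac{m_1}{m_2}}\mathbf{R} - \frac{\frac{m_1}{m_2}}{1 + \frac{m_1}{m_2}}\mathbf{r}$$
Since $M = m_1 + m_2$, we can simplify both of these a little bit by writing
$1 + \dfrac{m_1}{m_2} = \dfrac{M}{m_2}$. After these simplifications, we have:
$$\mathbf{r}_1 = \mathbf{R} + \frac{m_2}{M}\mathbf{r}$$
$$\mathbf{r}_2 = \mathbf{R} - \frac{m_1}{M}\mathbf{r}$$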
So, we then have the positions of both of the masses expressed in terms of the center of
mass position and the displacement vector between the masses.
We can now calculate the velocities of both of the bodies by taking the time derivatives of
these (taking the time derivative of a vector simply means differentiating each
component, so for example $\dot{\mathbf{r}}_1 = (\dot{x}_1, \dot{y}_1, \dot{z}_1)$):
$$\dot{\mathbf{r}}_1 = \dot{\mathbf{R}} + \frac{m_2}{M}\dot{\mathbf{r}}$$
$$\dot{\mathbf{r}}_2 = \dot{\mathbf{R}} - \frac{m_1}{M}\dot{\mathbf{r}}$$
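Let's now consider the squares of the magnitudes of these velocity vectors, which are
calculated by taking dot products:
$$\dot{\mathbf{r}}_1^2 = \dot{\mathbf{r}}_1\cdot\dot{\mathbf{r}}_1 = \left(\dot{\mathbf{R}} + \frac{m_2}{M}\dot{\mathbf{r}}\right)\cdot\left(\dot{\mathbf{R}} + \frac{m_2}{M}\dot{\mathbf{r}}\right) = \dot{\mathbf{R}}\cdot\dot{\mathbf{R}} + 2\frac{m_2}{M}\dot{\mathbf{R}}\cdot\dot{\mathbf{r}} + \frac{m_2^2}{M^2}\dot{\mathbf{r}}\cdot\dot{\mathbf{r}}$$
$$\dot{\mathbf{r}}_2^2 = \dot{\mathbf{r}}_2\cdot\dot{\mathbf{r}}_2 = \left(\dot{\mathbf{R}} - \frac{m_1}{M}\dot{\mathbf{r}}\right)\cdot\left(\dot{\mathbf{R}} - \frac{m_1}{M}\dot{\mathbf{r}}\right) = \dot{\mathbf{R}}\cdot\dot{\mathbf{R}} - 2\frac{m_1}{M}\dot{\mathbf{R}}\cdot\dot{\mathbf{r}} + \frac{m_1^2}{M^2}\dot{\mathbf{r}}\cdot\dot{\mathbf{r}}$$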
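We can now begin constructing our Lagrangian (though we'll still simplify this later). The total
kinetic energy of the system will be:
$$T = \frac{1}{2}m_1\dot{\mathbf{r}}_1^2 + \frac{1}{2}m_2\dot{\mathbf{r}}_2^2$$
$$= \frac{1}{2}m_1\dot{\mathbf{R}}\cdot\dot{\mathbf{R}} + \frac{m_1m_2}{M}\dot{\mathbf{R}}\cdot\dot{\mathbf{r}} + \frac{1}{2}\frac{m_1m_2^2}{M^2}\dot{\mathbf{r}}\cdot\dot{\mathbf{r}} + \frac{1}{2}m_2\dot{\mathbf{R}}\cdot\dot{\mathbf{R}} - \frac{m_2m_1}{M}\dot{\mathbf{R}}\cdot\dot{\mathbf{r}} + \frac{1}{2}\frac{m_2m_1^2}{M^2}\dot{\mathbf{r}}\cdot\dot{\mathbf{r}}$$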
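Notice that these "cross-terms" cancel here. We can also combine some terms to get:
$$T = \frac{1}{2}(m_1 + m_2)\dot{\mathbf{R}}\cdot\dot{\mathbf{R}} + \frac{1}{2}\frac{m_1m_2^2 + m_2m_1^2}{M^2}\dot{\mathbf{r}}\cdot\dot{\mathbf{r}}$$
$$T = \frac{1}{2}M\dot{\mathbf{R}}^2 + \frac{1}{2}\mu\dot{\mathbf{r}}^2,$$
where I've defined $\mu = \dfrac{m_1m_2}{M}$ as the reduced mass.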
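As a sanity check on this algebra, the split of the kinetic energy into a center-of-mass part and a relative part can be tested numerically for arbitrary velocities. The sketch below uses plain tuples for vectors; the helper names (`dot`, `kinetic_direct`, `kinetic_split`) and the sample masses and velocities are illustrative assumptions.

```python
import math

def dot(a, b):
    """Dot product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))

def kinetic_direct(m1, m2, v1, v2):
    """T = (1/2) m1 v1^2 + (1/2) m2 v2^2."""
    return 0.5 * m1 * dot(v1, v1) + 0.5 * m2 * dot(v2, v2)

def kinetic_split(m1, m2, v1, v2):
    """T = (1/2) M R_dot^2 + (1/2) mu r_dot^2 with r = r1 - r2."""
    M = m1 + m2
    mu = m1 * m2 / M
    R_dot = tuple((m1 * a + m2 * b) / M for a, b in zip(v1, v2))
    r_dot = tuple(a - b for a, b in zip(v1, v2))
    return 0.5 * M * dot(R_dot, R_dot) + 0.5 * mu * dot(r_dot, r_dot)

v1, v2 = (1.0, -2.0, 0.5), (0.3, 0.4, -1.0)
print(kinetic_direct(2.0, 5.0, v1, v2), kinetic_split(2.0, 5.0, v1, v2))
print(math.isclose(kinetic_direct(2.0, 5.0, v1, v2),
                   kinetic_split(2.0, 5.0, v1, v2)))
```

The two values agree for any choice of masses and velocities, which is just the statement that the cross-terms cancel.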
The potential energy is just the gravitational potential energy, which only depends on the
distance r (magnitude of the $\mathbf{r}$-vector) between the two masses:
$$V = -\frac{Gm_1m_2}{r}$$
The Lagrangian is then:
$$L = T - V = \frac{1}{2}M\dot{\mathbf{R}}^2 + \frac{1}{2}\mu\dot{\mathbf{r}}^2 + \frac{Gm_1m_2}{r}$$
Now, this isn't our final form for the Lagrangian since we can find something more suitable for
analyzing the actual physics.
However, using this Lagrangian, we notice something particularly useful. Let's look at the
Euler-Lagrange equation for the $\mathbf{R}$-coordinate. Since nothing in the Lagrangian depends
explicitly on $\mathbf{R}$, we get:
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{\mathbf{R}}}\right) - \frac{\partial L}{\partial\mathbf{R}} = 0 \quad\Rightarrow\quad \frac{d}{dt}\left(M\dot{\mathbf{R}}\right) = 0$$
This tells us that the velocity of the center of mass ($\dot{\mathbf{R}}$) is constant! But what
constant? Well, we're essentially free to choose that by simply looking at the situation from a
different coordinate system.
In particular, I'll choose the coordinate system such that the origin is always located at the
position of m 2 . This means that our coordinate system essentially "moves" with the location
of m 2 .
The justification for why we can do this is that the center of mass is moving with constant
velocity. This is then an "inertial coordinate system" and we're free to use any inertial
coordinate system that moves with constant velocity as all inertial systems are equivalent.
Anyway, the nice thing about doing this is that the displacement $\mathbf{r}$ now points from the
origin to $m_1$ (we'll see why this is useful soon). In this coordinate system, the center of mass
velocity $\dot{\mathbf{R}}$ is just some constant, let's just call it V (not to be confused with the
potential energy).
Our Lagrangian, by doing this, is then (notice that the magnitudes of all the vectors stay
exactly the same):
$$L = \frac{1}{2}MV^2 + \frac{1}{2}\mu\dot{\mathbf{r}}^2 + \frac{Gm_1m_2}{r}$$
There is still one more thing we want to do; express this Lagrangian in spherical coordinates
in this new coordinate system with m 2 located at the origin. This is going to allow us to
derive for example, the conservation of angular momentum and the orbit equation quite
straightforwardly.
Now, spherical coordinates in general are defined by the distance from the center r, and two
angles, 𝜃 and 𝜙.
The coordinates (x,y,z) of any point in spherical coordinates can be expressed as (notice that
these are very similar to the coordinate relations we had for the spherical pendulum, but with
l now being another coordinate, namely r):
x = r sin 𝜃 cos 𝜙
y = r sin 𝜃 sin 𝜙
z = r cos 𝜃
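To understand how our Lagrangian $L = \frac{1}{2}MV^2 + \frac{1}{2}\mu\dot{\mathbf{r}}^2 + \dfrac{Gm_1m_2}{r}$
changes when we switch to spherical coordinates, we need to recall what this
$\dot{\mathbf{r}}^2$ really means (the vector $\mathbf{r}$ was the displacement vector between the
two masses):
$$\dot{\mathbf{r}}^2 = \dot{\mathbf{r}}\cdot\dot{\mathbf{r}} = \left(\dot{\mathbf{r}}_1 - \dot{\mathbf{r}}_2\right)\cdot\left(\dot{\mathbf{r}}_1 - \dot{\mathbf{r}}_2\right) = \dot{\mathbf{r}}_1\cdot\dot{\mathbf{r}}_1 - 2\dot{\mathbf{r}}_1\cdot\dot{\mathbf{r}}_2 + \dot{\mathbf{r}}_2\cdot\dot{\mathbf{r}}_2$$
However, in our current coordinates, the mass $m_2$ is always at the origin, so
$\dot{\mathbf{r}}_2 = 0$ and we have $\dot{\mathbf{r}}^2 = \dot{\mathbf{r}}_1\cdot\dot{\mathbf{r}}_1$.
Now, the vector $\mathbf{r}_1$ is just the position vector of mass $m_1$ with components
$\mathbf{r}_1 = (x_1, y_1, z_1)$. We can express these components in spherical coordinates as:
$$x_1 = r\sin\theta\cos\phi$$
$$y_1 = r\sin\theta\sin\phi$$
$$z_1 = r\cos\theta$$
The squared velocity is then the sum of the squares of the time derivatives of these components:
$$\dot{\mathbf{r}}^2 = \dot{\mathbf{r}}_1\cdot\dot{\mathbf{r}}_1 = \dot{x}_1^2 + \dot{y}_1^2 + \dot{z}_1^2$$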
Now, squaring the above expressions for the velocity components is extremely tedious, so
I'm not going to do that by hand here (you can if you wish). What you get as the end result,
after a lot of annoying algebra, is:
$$\dot{\mathbf{r}}^2 = \dot{r}^2 + r^2\dot{\theta}^2 + r^2\dot{\phi}^2\sin^2\theta$$
This may look weird, but the $\dot{\mathbf{r}}$ on the left-hand side is the full velocity vector
(so $\dot{\mathbf{r}}^2$ is its squared magnitude), while the $\dot{r}$ on the right is only the rate
of change of the r-coordinate, so these are two different things. However, this isn't a problem
as we won't need the vector $\dot{\mathbf{r}}$ on the left anymore soon.
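If you want to skip the algebra but still convince yourself of this identity, one option is a numerical check that does not rely on the hand-derived velocity components at all. The sketch below differentiates the Cartesian position with a central finite difference and compares the squared speed with the spherical-coordinate expression. The function names, the step h and the sample state are illustrative assumptions.

```python
import math

def cartesian(r, theta, phi):
    """Cartesian components of a point given in spherical coordinates."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def speed_squared_formula(r, theta, r_dot, theta_dot, phi_dot):
    """r_dot^2 + r^2 theta_dot^2 + r^2 phi_dot^2 sin^2(theta)."""
    return r_dot**2 + r**2 * theta_dot**2 + r**2 * phi_dot**2 * math.sin(theta)**2

def speed_squared_numeric(r, theta, phi, r_dot, theta_dot, phi_dot, h=1e-6):
    """Squared speed from a central difference of the Cartesian position."""
    ahead = cartesian(r + h * r_dot, theta + h * theta_dot, phi + h * phi_dot)
    behind = cartesian(r - h * r_dot, theta - h * theta_dot, phi - h * phi_dot)
    return sum(((a - b) / (2 * h))**2 for a, b in zip(ahead, behind))

state = (2.0, 0.8, 1.1, 0.3, -0.5, 0.7)  # r, theta, phi and their rates
print(speed_squared_numeric(*state))
print(speed_squared_formula(2.0, 0.8, 0.3, -0.5, 0.7))
```

The two printed values should agree to several digits; the small remaining difference comes from the finite-difference step, not from the identity.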
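Inserting this into the Lagrangian, we get:
$$L = \frac{1}{2}MV^2 + \frac{1}{2}\mu\dot{\mathbf{r}}^2 + \frac{Gm_1m_2}{r} \quad\Rightarrow\quad L = \frac{1}{2}MV^2 + \frac{1}{2}\mu\left(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\dot{\phi}^2\sin^2\theta\right) + \frac{Gm_1m_2}{r}$$
We may still clean this up a bit, namely by noting that the $\frac{1}{2}MV^2$-term is just a constant.
Constants in a Lagrangian, however, do not matter in any way as these go to zero anyway
when we calculate the Euler-Lagrange equations. Therefore, we may as well just discard this
term as it doesn't contribute anything to the equations of motion:
$$L = \frac{1}{2}\mu\left(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\dot{\phi}^2\sin^2\theta\right) + \frac{Gm_1m_2}{r}$$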
This is now the Lagrangian that describes the position (r, 𝜃, 𝜙 )-coordinates of mass m 1
relative to the other mass, m 2 . This is exactly enough to describe everything we may want to
know about the two-body system.
Note also the fact that we've effectively reduced this whole problem from needing 6 degrees
of freedom originally (the (x, y, z )-coordinates of both the masses) to only having 3 degrees
of freedom (the (r, 𝜃, 𝜙 )-coordinates of one of the bodies relative to the other).
Next, we will calculate the equations of motion and find that we can actually reduce the whole
problem to just 1 degree of freedom. How incredible is that!
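Our generalized coordinates are now the radial coordinate, r, and the two angles, 𝜃 and 𝜙.
Therefore, we will have one Euler-Lagrange equation for each coordinate. Let's begin with
the r-equation:
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{r}}\right) - \frac{\partial L}{\partial r} = 0$$
$$\Rightarrow\quad \frac{d}{dt}\left(\mu\dot{r}\right) - \frac{1}{2}\mu\frac{\partial}{\partial r}\left(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\dot{\phi}^2\sin^2\theta\right) - \frac{\partial}{\partial r}\left(\frac{Gm_1m_2}{r}\right) = 0$$
$$\Rightarrow\quad \mu\ddot{r} - \mu r\dot{\theta}^2 - \mu r\dot{\phi}^2\sin^2\theta + \frac{Gm_1m_2}{r^2} = 0$$
Dividing by 𝜇 (and using $m_1m_2/\mu = M$), we get the equation of motion for r:
$$\ddot{r} = r\dot{\theta}^2 + r\dot{\phi}^2\sin^2\theta - \frac{GM}{r^2}$$
In the same way, the Euler-Lagrange equation for 𝜃, after dividing by 𝜇, gives the equation of
motion for 𝜃 (it's worth leaving this in this form without explicitly writing out the time
derivative here, you'll see why soon):
$$\frac{d}{dt}\left(r^2\dot{\theta}\right) = r^2\dot{\phi}^2\sin\theta\cos\theta$$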
Lastly, we have the equation for 𝜙 (notice that nothing in the Lagrangian depends explicitly
on 𝜙, so $\partial L/\partial\phi = 0$):
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{\phi}}\right) - \frac{\partial L}{\partial\phi} = 0$$
$$\Rightarrow\quad \frac{d}{dt}\left(\mu r^2\dot{\phi}\sin^2\theta\right) = 0$$
Since this is a time derivative equal to zero, the thing inside the parentheses must be a
constant. This is the generalized momentum associated with the 𝜙-coordinate
($p_\phi = \partial L/\partial\dot{\phi}$), which happens to be conserved here. We can use this to
solve for $\dot{\phi}$, which then gives us the equation of motion for the 𝜙-coordinate:
$$\mu r^2\dot{\phi}\sin^2\theta = p_\phi \quad\Rightarrow\quad \dot{\phi} = \frac{p_\phi}{\mu r^2\sin^2\theta}$$
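We can now insert this into the 𝜃-equation of motion, which will give us something quite
remarkable. Doing this, we get:
$$\frac{d}{dt}\left(r^2\dot{\theta}\right) = r^2\dot{\phi}^2\sin\theta\cos\theta$$
$$\Rightarrow\quad \frac{d}{dt}\left(r^2\dot{\theta}\right) = r^2\frac{p_\phi^2}{\mu^2r^4\sin^4\theta}\sin\theta\cos\theta$$
$$\Rightarrow\quad \frac{d}{dt}\left(r^2\dot{\theta}\right) = \frac{p_\phi^2}{\mu^2r^2\sin^3\theta}\cos\theta$$
Multiplying both sides by $r^2\dot{\theta}$ then gives:
$$r^2\dot{\theta}\,\frac{d}{dt}\left(r^2\dot{\theta}\right) = \frac{p_\phi^2}{\mu^2\sin^3\theta}\cos\theta\,\dot{\theta}$$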
If you now stare at this for a minute, you might notice that the left-hand side here can be
written as the time derivative of the square of this $r^2\dot{\theta}$-thing, i.e.
$r^2\dot{\theta}\,\frac{d}{dt}\left(r^2\dot{\theta}\right) = \frac{1}{2}\frac{d}{dt}\left(r^2\dot{\theta}\right)^2 = \frac{1}{2}\frac{d}{dt}\left(r^4\dot{\theta}^2\right)$.
You might also notice that the thing on the right-hand side can also be written as the time
derivative of something, namely as $-\frac{d}{dt}\left(\dfrac{p_\phi^2}{2\mu^2\sin^2\theta}\right)$,
because $\frac{d}{dt}\left(\dfrac{1}{\sin^2\theta}\right) = -\dfrac{2\cos\theta}{\sin^3\theta}\dot{\theta}$.
Anyway, we can then write the whole equation as:
$$\frac{1}{2}\frac{d}{dt}\left(r^4\dot{\theta}^2\right) = -\frac{d}{dt}\left(\frac{p_\phi^2}{2\mu^2\sin^2\theta}\right) \quad\Rightarrow\quad \frac{d}{dt}\left(r^4\dot{\theta}^2 + \frac{p_\phi^2}{\mu^2\sin^2\theta}\right) = 0$$
So, the thing inside these parentheses has to be constant and the only way for this to be
generally true is if 𝜃 is a constant (if this is the case, then $\dot{\theta} = 0$ and
$\sin^2\theta$ = constant, so the whole thing is constant).
We've then found something very interesting; the motion of the two bodies has to occur in
a plane, namely a plane of constant 𝜃. This is a simple consequence of angular momentum
conservation (the thing inside the parentheses is actually the squared magnitude of the
Newtonian angular momentum vector, divided by $\mu^2$), which we've just proven using
Lagrangian mechanics!
This is remarkable, among other things, because it allows us to choose any orbital plane we
wish, without any loss of generality. This is because we've just proven that the motion occurs
in a plane, so any orbital plane is equivalent to any other orbital plane.
What this practically means is that we can choose any constant value for the 𝜃-coordinate,
since the orbital plane is determined by the value of 𝜃. The typical choice here is the
equatorial plane, which is at $\theta = \pi/2$. The motion of the two bodies is then restricted to
the xy-plane:
Now, using this fact that $\theta = \pi/2$, we get that the equation of motion for 𝜙 becomes:
$$\dot{\phi} = \frac{p_\phi}{\mu r^2\sin^2\frac{\pi}{2}} = \frac{p_\phi}{\mu r^2}$$
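We can then insert this into the equation of motion for the r-coordinate to get (and also using
$\theta = \pi/2$, which means that $\dot{\theta} = 0$):
$$\ddot{r} = r\dot{\theta}^2 + r\dot{\phi}^2\sin^2\theta - \frac{GM}{r^2}$$
$$\Rightarrow\quad \ddot{r} = r\frac{p_\phi^2}{\mu^2r^4}\sin^2\frac{\pi}{2} - \frac{GM}{r^2}$$
$$\Rightarrow\quad \ddot{r} = \frac{p_\phi^2}{\mu^2r^3} - \frac{GM}{r^2}$$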
This is our final equation of motion. Notice that everything else here is just a constant except
for the r-coordinate and indeed, this r-coordinate is the only coordinate we need to determine
the entire motion of the system; we've now reduced the whole two-body problem from six
degrees of freedom down to just one degree of freedom!
Granted, this equation of motion cannot actually be solved in closed form for the r-coordinate
as a function of time. However, it is possible to solve this for the r-coordinate as a function of
the angle 𝜙.
This will give us something called the orbit equation, which determines the shapes of all
gravitational orbits (also proving the fact that planetary orbits are generally elliptical).
There are generally two ways for us to "solve" for how the r-coordinate changes here. The
first way we'll do this is to actually solve the problem or obtain an analytical solution to the
equation of motion. This will give us the orbit equation, which can be used to, for example,
prove Kepler's laws.
The second way is by constructing an effective potential, similarly to how we did with the
spherical pendulum. This will also allow us to understand the two-body problem better.
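Let's begin by deriving the orbit equation first. To do this, let's take the equation of motion for
r and multiply it by $\dot{r}$:
$$\ddot{r}\dot{r} = \frac{p_\phi^2}{\mu^2r^3}\dot{r} - \frac{GM}{r^2}\dot{r}$$
We'll now use the fact that $\frac{d}{dt}\dot{r}^2 = 2\dot{r}\ddot{r} \;\Rightarrow\; \dot{r}\ddot{r} = \frac{1}{2}\frac{d}{dt}\dot{r}^2$
(which simply comes from the chain rule). Using this and also explicitly writing
$\dot{r} = dr/dt$, we have:
$$\frac{1}{2}\frac{d}{dt}\dot{r}^2 = \frac{p_\phi^2}{\mu^2r^3}\frac{dr}{dt} - \frac{GM}{r^2}\frac{dr}{dt}$$
Multiplying by dt and integrating both sides, we get:
$$\int\frac{1}{2}\,d\left(\dot{r}^2\right) = \int\frac{p_\phi^2}{\mu^2r^3}\,dr - \int\frac{GM}{r^2}\,dr \quad\Rightarrow\quad \frac{1}{2}\dot{r}^2 = -\frac{p_\phi^2}{2\mu^2r^2} + \frac{GM}{r} + E \qquad (*)$$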
Here I've added an integration constant E (which is actually the energy; we'll come back to
this equation, so keep this at the back of your mind; this is why I've marked the equation with
a star).
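Even though r(t) has no closed-form solution, the radial equation is easy to integrate numerically, and the starred equation gives a built-in consistency check: the combination $\frac{1}{2}\dot{r}^2 + p_\phi^2/(2\mu^2r^2) - GM/r$ should stay equal to E along the whole trajectory. The sketch below is a minimal illustration. It works with the shorthand h = $p_\phi/\mu$, uses units where GM = 1, and the function names and initial values are assumptions chosen only for the example.

```python
import math

def radial_accel(r, h, GM):
    """r'' = h^2 / r^3 - GM / r^2, with h = p_phi / mu."""
    return h**2 / r**3 - GM / r**2

def first_integral(r, r_dot, h, GM):
    """The conserved quantity E from the starred equation."""
    return 0.5 * r_dot**2 + h**2 / (2 * r**2) - GM / r

def integrate_radial(r0, r_dot0, h, GM=1.0, t_end=20.0, dt=1e-3):
    """RK4 integration of the radial equation; returns a list of (r, r_dot)."""
    r, v = r0, r_dot0
    samples = [(r, v)]
    for _ in range(round(t_end / dt)):
        k1r, k1v = v, radial_accel(r, h, GM)
        k2r, k2v = v + 0.5 * dt * k1v, radial_accel(r + 0.5 * dt * k1r, h, GM)
        k3r, k3v = v + 0.5 * dt * k2v, radial_accel(r + 0.5 * dt * k2r, h, GM)
        k4r, k4v = v + dt * k3v, radial_accel(r + dt * k3r, h, GM)
        r += dt * (k1r + 2 * k2r + 2 * k3r + k4r) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        samples.append((r, v))
    return samples

samples = integrate_radial(r0=1.0, r_dot0=0.0, h=1.1)
E0 = first_integral(1.0, 0.0, 1.1, 1.0)
drift = max(abs(first_integral(r, v, 1.1, 1.0) - E0) for r, v in samples)
print(E0, drift)                                                    # drift should be tiny
print(min(r for r, _ in samples), max(r for r, _ in samples))      # r oscillates between two radii
```

For a negative E the radius oscillates between a smallest and a largest value, which is the numerical signature of a bound orbit.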
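We'll now do something clever here. Let's take the equation for $\dot{\phi}$ from earlier, which is
$\dot{\phi} = \dfrac{d\phi}{dt} = \dfrac{p_\phi}{\mu r^2}$. We can write $d\phi/dt$ using the chain rule as:
$$\frac{d\phi}{dt} = \frac{d\phi}{dr}\frac{dr}{dt} = \frac{d\phi}{dr}\dot{r}$$
Then, using the fact that this should be equal to $\dfrac{p_\phi}{\mu r^2}$, we can get another
expression for $\dot{r}$:
$$\frac{d\phi}{dr}\dot{r} = \frac{p_\phi}{\mu r^2} \quad\Rightarrow\quad \dot{r} = \frac{p_\phi}{\mu r^2}\frac{dr}{d\phi}$$
Inserting this into the equation marked with a star (*), we get:
$$\frac{1}{2}\dot{r}^2 = -\frac{p_\phi^2}{2\mu^2r^2} + \frac{GM}{r} + E \quad\Rightarrow\quad \frac{1}{2}\left(\frac{p_\phi}{\mu r^2}\frac{dr}{d\phi}\right)^2 = -\frac{p_\phi^2}{2\mu^2r^2} + \frac{GM}{r} + E$$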
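Before we do anything, let's make a substitution of the form $u = 1/r$. The reason for this is
that this allows us to get an integral that can be solved analytically. Anyway, the expression
for $dr/d\phi$ would then, by the chain rule, become:
$$\frac{dr}{d\phi} = \frac{d}{d\phi}\left(\frac{1}{u}\right) = -\frac{1}{u^2}\frac{du}{d\phi}$$
Inserting this together with $r = 1/u$, we have:
$$\frac{1}{2}\left(\frac{p_\phi}{\mu}u^2\left(-\frac{1}{u^2}\frac{du}{d\phi}\right)\right)^2 = -\frac{p_\phi^2}{2\mu^2}u^2 + GMu + E$$
$$\Rightarrow\quad \frac{1}{2}\frac{p_\phi^2}{\mu^2}\left(\frac{du}{d\phi}\right)^2 = -\frac{p_\phi^2}{2\mu^2}u^2 + GMu + E$$
$$\Rightarrow\quad \frac{du}{d\phi} = \sqrt{-u^2 + \frac{2GM\mu^2}{p_\phi^2}u + \frac{2E\mu^2}{p_\phi^2}}$$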
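This is now in a form we can actually integrate. To do this, let's move all the u-stuff to the
left-hand side and the $d\phi$ to the right and then integrate both sides:
$$\frac{1}{\sqrt{-u^2 + \frac{2GM\mu^2}{p_\phi^2}u + \frac{2E\mu^2}{p_\phi^2}}}\,du = d\phi$$
$$\Rightarrow\quad \int\frac{1}{\sqrt{-u^2 + \frac{2GM\mu^2}{p_\phi^2}u + \frac{2E\mu^2}{p_\phi^2}}}\,du = \int d\phi$$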
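Now, this may look complicated but it can still be solved for u. The integral on the left is a
standard one, and carrying it out gives (with $\phi_0$ an integration constant):
$$\arctan\left(\frac{u - \frac{GM\mu^2}{p_\phi^2}}{\sqrt{-u^2 + \frac{2GM\mu^2}{p_\phi^2}u + \frac{2E\mu^2}{p_\phi^2}}}\right) = \phi + \phi_0$$
In particular, let's take the cosine of both sides here and use the identity
$\cos(\arctan x) = \dfrac{1}{\sqrt{1 + x^2}}$:
$$\cos\left(\arctan\left(\frac{u - \frac{GM\mu^2}{p_\phi^2}}{\sqrt{-u^2 + \frac{2GM\mu^2}{p_\phi^2}u + \frac{2E\mu^2}{p_\phi^2}}}\right)\right) = \cos(\phi + \phi_0)$$
$$\Rightarrow\quad \frac{1}{\sqrt{1 + \dfrac{\left(u - \frac{GM\mu^2}{p_\phi^2}\right)^2}{-u^2 + \frac{2GM\mu^2}{p_\phi^2}u + \frac{2E\mu^2}{p_\phi^2}}}} = \cos(\phi + \phi_0)$$
After a little bit of algebra, we end up with the following expression for u:
$$u = \frac{GM\mu^2}{p_\phi^2} + \sqrt{\frac{G^2M^2\mu^4}{p_\phi^4} + \frac{2E\mu^2}{p_\phi^2}}\,\sqrt{1 - \cos^2(\phi + \phi_0)}$$
Now, pulling out some terms from inside the square root and using
$\sqrt{1 - \cos^2(\phi + \phi_0)} = \sin(\phi + \phi_0)$, we get:
$$u = \frac{GM\mu^2}{p_\phi^2}\left(1 + \sqrt{1 + \frac{2Ep_\phi^2}{G^2M^2\mu^2}}\,\sin(\phi + \phi_0)\right)$$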
This square root is called the eccentricity and it's commonly denoted by
$e = \sqrt{1 + \dfrac{2Ep_\phi^2}{G^2M^2\mu^2}}$. Also, since $\phi_0$ is an arbitrary constant, we
might as well choose it to be $\phi_0 = \pi/2$ (for conventional reasons). Inserting these and
$u = 1/r$, we have:
$$\frac{1}{r} = \frac{GM\mu^2}{p_\phi^2}\left(1 + e\sin\left(\phi + \frac{\pi}{2}\right)\right) = \frac{GM\mu^2}{p_\phi^2}\left(1 + e\cos\phi\right)$$
$$r = \frac{p_\phi^2}{GM\mu^2}\,\frac{1}{1 + e\cos\phi}$$
This describes the distance between the two bodies as a function of the angle 𝜙. Moreover,
this factor $\dfrac{p_\phi^2}{GM\mu^2}$ is usually called the semi-latus rectum and denoted by ℓ. So,
defining $\ell = \dfrac{p_\phi^2}{GM\mu^2}$, we have the final result called the orbit equation.
Orbit equation:
$$r(\phi) = \frac{\ell}{1 + e\cos\phi}$$
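One way to double-check the orbit equation is to plug it back into the starred equation. Using $\dot{r} = \frac{p_\phi}{\mu r^2}\frac{dr}{d\phi}$, the combination $\frac{1}{2}\dot{r}^2 + p_\phi^2/(2\mu^2r^2) - GM/r$ evaluated on the curve $r(\phi)$ should return the same E at every angle. The sketch below does this numerically; the function names and the sample values of E, $p_\phi$, 𝜇 and GM are arbitrary assumptions for illustration.

```python
import math

def orbit_parameters(E, p_phi, mu, GM):
    """Semi-latus rectum and eccentricity from E (energy per reduced mass) and p_phi."""
    ell = p_phi**2 / (GM * mu**2)
    e = math.sqrt(1 + 2 * E * p_phi**2 / (GM**2 * mu**2))
    return ell, e

def orbit_radius(phi, ell, e):
    """The orbit equation r(phi) = ell / (1 + e cos(phi))."""
    return ell / (1 + e * math.cos(phi))

def first_integral_on_orbit(phi, p_phi, mu, GM, ell, e):
    """Evaluate (1/2) r_dot^2 + p_phi^2 / (2 mu^2 r^2) - GM / r on the orbit."""
    r = orbit_radius(phi, ell, e)
    dr_dphi = ell * e * math.sin(phi) / (1 + e * math.cos(phi))**2
    r_dot = p_phi / (mu * r**2) * dr_dphi
    return 0.5 * r_dot**2 + p_phi**2 / (2 * mu**2 * r**2) - GM / r

E, p_phi, mu, GM = -0.3, 1.2, 0.8, 2.0
ell, e = orbit_parameters(E, p_phi, mu, GM)
print(ell, e)
for phi in (0.0, 1.0, 2.5, 4.0):
    print(phi, first_integral_on_orbit(phi, p_phi, mu, GM, ell, e))  # each close to E
```

Getting the same value back at every angle is a good sign that the eccentricity and the semi-latus rectum have been identified correctly.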
Another way to solve the Kepler problem would be by the use of an effective potential,
similarly to how we did with the spherical pendulum. To do this, let's go back to the equation
from before that I marked with a star:
$$\frac{1}{2}\dot{r}^2 = -\frac{p_\phi^2}{2\mu^2r^2} + \frac{GM}{r} + E$$
$$E = \frac{1}{2}\dot{r}^2 + \frac{p_\phi^2}{2\mu^2r^2} - \frac{GM}{r}$$
This E is actually the total energy of the system divided by the reduced mass, which can be
proven for example, by constructing the Hamiltonian from the Lagrangian. You can read how
to do this from my introduction to Hamiltonian mechanics. Anyway, we should therefore add a
factor of 𝜇 here to get the actual total energy:
$$E = \frac{1}{2}\mu\dot{r}^2 + \frac{p_\phi^2}{2\mu r^2} - \frac{Gm_1m_2}{r},$$
where I've also used the fact that $M\mu = m_1m_2$.
Now, this kind of looks like a kinetic energy, $\frac{1}{2}\mu\dot{r}^2$, plus something that
depends only on the radial distance r between the two bodies. We can essentially imagine this
as an "effective potential", such that the total energy has the form:
$$E = \frac{1}{2}\mu\dot{r}^2 + V_{\text{eff}}(r), \quad\text{where}\quad V_{\text{eff}}(r) = \frac{p_\phi^2}{2\mu r^2} - \frac{Gm_1m_2}{r}$$
Again, this effective potential is only really a construct and not an actual potential energy.
However, the construction of such a quantity is incredibly useful as it allows us to essentially
analyze the interplay between the gravitational force (related to the potential
$-\dfrac{Gm_1m_2}{r}$) and the centrifugal force (related to this "potential" term
$\dfrac{p_\phi^2}{2\mu r^2}$) arising from the angular momentum $p_\phi$.
This even allows us to determine the shapes of planetary orbits simply by just looking at the
graph of the effective potential, which tells us about the behaviour of the coordinate r:
I've actually written an entire article on this exact topic, which you can read here. The article
essentially uses the effective potential to analyze Newtonian orbits (which are solutions to
this Kepler problem), but also to analyze orbits around black holes using a relativistic form of
the effective potential.
I recommend you read the article linked above for the full analysis of this. However, one
interesting thing we can look at here is circular orbits, which occur at the minimum value
of the effective potential:
This r 0 here is the radius of the circular orbit. Now, these circular orbits can only occur at this
minimum, so at one value of r. The reason for this is that the effective potential only has one
minimum (interestingly, the relativistic effective potential has two possible values for circular
orbits), which is the only possible point at which the potential can be at a single, stable value;
thus, the radial coordinate stays constant, which indeed characterizes a circular orbit.
To find the radius of this circular orbit, $r_0$, we need to differentiate the effective potential
and set it equal to zero (which is because at the minimum, the derivative i.e. the slope is zero):
$$\frac{dV_{\text{eff}}}{dr} = 0 \quad\Rightarrow\quad \frac{d}{dr}\left(\frac{p_\phi^2}{2\mu r^2} - \frac{Gm_1m_2}{r}\right) = 0$$
We can then solve for the value of r, which gives us the radius of the circular orbit, $r_0$:
$$r_0 = \frac{p_\phi^2}{GM\mu^2} = \ell,$$
where ℓ is the semi-latus rectum that appears in the orbit equation (see above).
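The location of this minimum is easy to confirm numerically. The sketch below evaluates the effective potential around the predicted radius $r_0 = p_\phi^2/(GM\mu^2)$ and checks that the slope vanishes there and that the potential is higher on either side. The function names, the choice G = 1 and the sample masses and angular momentum are illustrative assumptions.

```python
import math

def v_eff(r, p_phi, m1, m2, G=1.0):
    """Effective potential p_phi^2 / (2 mu r^2) - G m1 m2 / r."""
    mu = m1 * m2 / (m1 + m2)
    return p_phi**2 / (2 * mu * r**2) - G * m1 * m2 / r

def circular_radius(p_phi, m1, m2, G=1.0):
    """Predicted radius of the circular orbit, r0 = p_phi^2 / (G M mu^2)."""
    M = m1 + m2
    mu = m1 * m2 / M
    return p_phi**2 / (G * M * mu**2)

def slope(r, p_phi, m1, m2, h=1e-6):
    """Central-difference estimate of dV_eff/dr."""
    return (v_eff(r + h, p_phi, m1, m2) - v_eff(r - h, p_phi, m1, m2)) / (2 * h)

p_phi, m1, m2 = 1.5, 3.0, 1.0
r0 = circular_radius(p_phi, m1, m2)
print(r0, slope(r0, p_phi, m1, m2))                  # slope close to zero at r0
print(v_eff(0.99 * r0, p_phi, m1, m2), v_eff(r0, p_phi, m1, m2),
      v_eff(1.01 * r0, p_phi, m1, m2))               # middle value is the lowest
```

Because the potential rises on both sides of $r_0$, a small radial nudge leads to oscillation about the circular orbit rather than escape.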
Using this, we could also derive, for example, the period of a circular orbit. To do this, we
know that for a circular orbit, the angular frequency $\dot\phi$ is a constant and can be
expressed as:
$$\dot\phi = \frac{2\pi}{T} \;\Rightarrow\; T = \frac{2\pi}{\dot\phi}$$
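However, we also know that $\dot\phi$ can be expressed, in this case, as
$\dot\phi = p_\phi/(\mu r_0^2)$ (this is essentially the equation of motion for the
$\phi$-coordinate we found earlier). Moreover, we can use the value for the radius of this
circular orbit to get an expression for $p_\phi$:
$$r_0 = \frac{p_\phi^2}{GM\mu^2} \;\Rightarrow\; p_\phi = \sqrt{GM\mu^2 r_0}$$
Inserting both of these into the expression for the period, we get:
$$T = \frac{2\pi}{\dot\phi} = \frac{2\pi\mu r_0^2}{p_\phi} = \frac{2\pi\mu r_0^2}{\mu\sqrt{GM r_0}} = 2\pi\sqrt{\frac{r_0^3}{GM}}$$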
Interestingly, the period of an elliptical orbit turns out to be quite similar to this, which we will
see soon.
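To make the circular-orbit period concrete, here is a small sketch that evaluates it in two ways: directly from $T = 2\pi/\dot\phi$ with $\dot\phi = p_\phi/(\mu r_0^2)$, and from the closed form $2\pi\sqrt{r_0^3/(GM)}$ that follows from combining the expressions above. The Sun and Earth values are rounded, and treating Earth's orbit as a perfect circle is an assumption made only for illustration.

```python
import math

# Rounded physical constants (assumed values for illustration)
G = 6.674e-11         # gravitational constant, m^3 kg^-1 s^-2
m_sun = 1.989e30      # kg
m_earth = 5.972e24    # kg
r0 = 1.496e11         # m, orbit radius, treating Earth's orbit as circular

M = m_sun + m_earth               # total mass
mu = m_sun * m_earth / M          # reduced mass

# Angular momentum of the circular orbit: p_phi = sqrt(G M mu^2 r0)
p_phi = math.sqrt(G * M * mu**2 * r0)

# Constant angular frequency: phi_dot = p_phi / (mu r0^2)
phi_dot = p_phi / (mu * r0**2)

T_from_phi_dot = 2.0 * math.pi / phi_dot
T_closed_form = 2.0 * math.pi * math.sqrt(r0**3 / (G * M))

days = T_closed_form / 86400.0
print(f"T from phi_dot : {T_from_phi_dot:.4e} s")
print(f"T closed form  : {T_closed_form:.4e} s ({days:.1f} days)")
```

Both routes give the same number up to floating-point rounding, and with these rounded inputs the period comes out close to one year.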
Now, one more thing to note here is that we could also have derived the same thing from the
orbit equation. In particular, for a circular orbit the eccentricity e happens to be zero (this is
explained soon). In this case, the orbit equation would be:
$$r(\phi) = \frac{\ell}{1 + 0 \cdot \cos\phi} = \ell = r_0$$
Personally, I think an interesting application of everything we've derived so far is Kepler's
laws. Historically, Kepler's laws were constructed empirically from astronomical
observations, but by using the orbit equation and the various other things we've derived
here, we can indeed prove Kepler's laws completely analytically.
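Starting with Kepler's first law, recall the orbit equation we derived earlier:
$$r(\phi) = \frac{\ell}{1 + e\cos\phi}$$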
This equation indeed describes curves that are conic sections. If these conic sections happen
to be bound orbits, then they are indeed ellipses. This can be seen more clearly if we write the
radial distance $r = \sqrt{x^2 + y^2}$ and $\cos\phi = x/\sqrt{x^2 + y^2}$ in terms of Cartesian
coordinates:
$$\sqrt{x^2 + y^2} = \frac{\ell}{1 + e\,\dfrac{x}{\sqrt{x^2 + y^2}}}$$
$$\Rightarrow\; \left(x^2 + y^2\right)\left(1 + e\,\frac{x}{\sqrt{x^2 + y^2}}\right)^2 = \ell^2$$
$$\Rightarrow\; \left(1 - e^2\right)x^2 + 2e\ell x + y^2 = \ell^2$$
This describes what are known as conic sections, of which the special case $0 < e < 1$
corresponds to ellipses in the $xy$-plane.
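One way to convince yourself that the polar and Cartesian forms describe the same curve is to sample points from the orbit equation and plug them into the Cartesian equation. The sketch below does this for illustrative values of the semi-latus rectum and eccentricity (both chosen arbitrarily here); the function names are hypothetical helpers, not anything defined earlier.

```python
import math


def orbit_radius(phi, ell, e):
    """Orbit equation r(phi) = ell / (1 + e cos(phi))."""
    return ell / (1.0 + e * math.cos(phi))


def conic_residual(x, y, ell, e):
    """Left side minus right side of (1 - e^2) x^2 + 2 e ell x + y^2 = ell^2."""
    return (1.0 - e**2) * x**2 + 2.0 * e * ell * x + y**2 - ell**2


# Illustrative parameters (assumptions): a bound orbit with 0 < e < 1
ell, e = 1.5, 0.6

residuals = []
for k in range(12):
    phi = 2.0 * math.pi * k / 12
    r = orbit_radius(phi, ell, e)
    x, y = r * math.cos(phi), r * math.sin(phi)   # polar -> Cartesian
    residuals.append(conic_residual(x, y, ell, e))

max_residual = max(abs(value) for value in residuals)
print(f"largest residual over the sampled points: {max_residual:.2e}")
```

The residual should vanish up to floating-point rounding at every sampled angle, which is what we expect if the two equations are equivalent.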
As a reminder, we've chosen our coordinates so that the mass $m_2$ always sits at the origin.
Now, to prove that the mass $m_2$ is at a focal point of the ellipse, let's consider the distance
$d$ between the center of the ellipse and the mass $m_2$.
This distance can be calculated by the formula $d = |a - r(\pi)|$, where $a$ is the semi-major
axis (the distance from the center of the ellipse to either "edge" along its long axis) and
$r(\pi)$ is the value of the radius at the angle $\phi = \pi$.
We can find $r(\pi)$ from the orbit equation:
$$r(\pi) = \frac{\ell}{1 + e\cos\pi} = \frac{\ell}{1 - e}$$
Now, there's a useful equation that relates the semi-latus rectum ($\ell$) of an ellipse to its
semi-major axis ($a$) and to the eccentricity of the ellipse ($e$), which is
$\ell = a\left(1 - e^2\right)$. Plugging this in, we have
$$r(\pi) = \frac{a\left(1 - e^2\right)}{1 - e} = \frac{a(1 + e)(1 - e)}{1 - e} = a + ae$$
$$d = |a - r(\pi)| = |a - (a + ae)| = ae$$
Now, the distance of the focal point of an ellipse to the center of the ellipse is exactly $ae$.
Therefore, the mass $m_2$ indeed lies at one of the focal points of this ellipse. This then proves
Kepler's first law: planetary orbits are ellipses with the Sun at one focal point (by imagining
the mass $m_2$ as the Sun and $m_1$ as some planet orbiting the Sun).
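The focus property can also be checked numerically. The sketch below uses illustrative values of the semi-latus rectum and eccentricity (assumptions), computes $d = |a - r(\pi)|$, and then tests the standard defining property of an ellipse: the distances from any point on the curve to the two foci add up to $2a$. Placing the second focus at $x = -2ae$ follows from the origin being one focus and the center lying a distance $ae$ away on the negative $x$-axis, where the far point $r(\pi)$ sits.

```python
import math

# Illustrative parameters (assumptions): a bound orbit with 0 < e < 1
ell, e = 1.5, 0.6

a = ell / (1.0 - e**2)        # semi-major axis from ell = a (1 - e^2)
r_pi = ell / (1.0 - e)        # radius at phi = pi (farthest point)
d = abs(a - r_pi)             # distance from the center to the origin

# One focus is the origin (where m_2 sits); the other is mirrored
# through the center, which lies at x = -a e.
second_focus_x = -2.0 * a * e

max_deviation = 0.0
for k in range(24):
    phi = 2.0 * math.pi * k / 24
    r = ell / (1.0 + e * math.cos(phi))
    x, y = r * math.cos(phi), r * math.sin(phi)
    dist_to_origin = r
    dist_to_second_focus = math.hypot(x - second_focus_x, y)
    deviation = abs(dist_to_origin + dist_to_second_focus - 2.0 * a)
    max_deviation = max(max_deviation, deviation)

print(f"d = {d:.6f}, a*e = {a * e:.6f}")
print(f"largest deviation of the focal-distance sum from 2a: {max_deviation:.2e}")
```

If the origin were not a focus, the sum of the two distances would vary around the orbit instead of staying at $2a$.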
Kepler's second law states that in an equal amount of time anywhere during the orbit, the
area swept by the planet's radius is the same:
An equivalent way of stating this is that the rate of change of the area, dA / dt , is constant all
throughout the orbit. To prove this, we can consider an "infinitesimal" area dA that is swept
out in a time dt and an angle d𝜙:
Given that this $dA$ here is small enough (which it is, if it's infinitesimally small), we
essentially just have a triangle with base $r$ and height $r\,d\phi$:
$$dA = \frac{1}{2}\,r \cdot r\,d\phi = \frac{1}{2}\,r^2\,d\phi$$
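Consider now the equation for $\dot\phi = d\phi/dt$ from earlier, which we can solve for $d\phi$:
$$\frac{d\phi}{dt} = \frac{p_\phi}{\mu r^2} \;\Rightarrow\; d\phi = \frac{p_\phi}{\mu r^2}\,dt$$
Inserting this into the expression for $dA$, the factors of $r^2$ cancel:
$$dA = \frac{1}{2}\,r^2\,\frac{p_\phi}{\mu r^2}\,dt = \frac{p_\phi}{2\mu}\,dt \;\Rightarrow\; \frac{dA}{dt} = \frac{p_\phi}{2\mu}$$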
Since the angular momentum $p_\phi$ is conserved (it's a constant), the rate of change of the
area is indeed constant, which proves Kepler's second law.
The interesting thing is that Kepler's second law is only valid because the angular momentum
is conserved, so really, Kepler's second law is nothing but the conservation of angular
momentum in disguise. Equivalently, we could have started with Kepler's second law and
then derived the conservation of angular momentum from it.
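The equal-areas statement can be tested independently of the derivation by integrating the relative motion directly in Cartesian coordinates and measuring the swept area in equal time windows. The sketch below is one way to do this under stated assumptions: units with $GM = 1$, an arbitrary bound initial condition, and a velocity Verlet integrator (a choice made here because it preserves angular momentum for central forces; it is not part of the article's method).

```python
# Relative motion r'' = -GM r / |r|^3 in units where GM = 1 (assumption)
GM = 1.0
dt = 1.0e-3
steps_per_window = 500
n_windows = 8


def acceleration(x, y):
    """Gravitational acceleration of the relative coordinate."""
    r_cubed = (x * x + y * y) ** 1.5
    return -GM * x / r_cubed, -GM * y / r_cubed


# Illustrative initial condition giving a bound, non-circular orbit
x, y = 1.0, 0.0
vx, vy = 0.0, 0.8
ax, ay = acceleration(x, y)

# Angular momentum per unit reduced mass, p_phi / mu = x*vy - y*vx
specific_angular_momentum = x * vy - y * vx

areas = []
for window in range(n_windows):
    swept = 0.0
    for step in range(steps_per_window):
        # Velocity Verlet update
        x_new = x + vx * dt + 0.5 * ax * dt * dt
        y_new = y + vy * dt + 0.5 * ay * dt * dt
        ax_new, ay_new = acceleration(x_new, y_new)
        vx += 0.5 * (ax + ax_new) * dt
        vy += 0.5 * (ay + ay_new) * dt
        # Area of the thin triangle between successive position vectors
        swept += 0.5 * (x * y_new - x_new * y)
        x, y, ax, ay = x_new, y_new, ax_new, ay_new
    areas.append(swept)

# dA/dt = p_phi / (2 mu), so each window should sweep this much area
expected_area = 0.5 * specific_angular_momentum * steps_per_window * dt

for index, area in enumerate(areas):
    print(f"window {index}: swept area = {area:.12f}")
print(f"expected from p_phi/(2 mu): {expected_area:.12f}")
```

Each window covers a different part of the orbit, where the radius and speed differ, yet the swept areas should match each other and the value predicted by $dA/dt = p_\phi/(2\mu)$.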
Now, Kepler's third law states that the square of the orbital period of a planet is proportional
to the semi-major axis cubed:
$$T^2 \propto a^3$$
We can actually prove this quite easily by using Kepler's second law we just proved. To do
this, we're going to integrate over the area of the ellipse, which gives us the total area of the
ellipse, A:
$$A = \int dA$$
$$A = \int \frac{p_\phi}{2\mu}\,dt$$
The integration limits here should be from 0 to T, the period of one orbit, since this completes
one total revolution around the ellipse. We then get:
$$A = \int_0^T \frac{p_\phi}{2\mu}\,dt = \frac{p_\phi}{2\mu}\,T \;\Rightarrow\; T = \frac{2\mu}{p_\phi}\,A$$
Let's now go back to the definition of the semi-latus rectum, which we can use to solve for
$p_\phi$:
$$\ell = \frac{p_\phi^2}{GM\mu^2} \;\Rightarrow\; p_\phi = \mu\sqrt{GM\ell}$$
For an ellipse, there is a nice relation between the semi-latus rectum $\ell$, the semi-major
axis $a$ and the semi-minor axis $b$, which is $\ell = b^2/a$. Inserting this into $p_\phi$, we
have:
$$p_\phi = \mu b\sqrt{\frac{GM}{a}}$$
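For readers who want to see where the relation $\ell = b^2/a$ comes from, here is a short worked step. It assumes the standard ellipse identity $b = a\sqrt{1 - e^2}$ (not derived in this article) together with the relation $\ell = a\left(1 - e^2\right)$ quoted in the proof of the first law:

```latex
\begin{align*}
b &= a\sqrt{1 - e^2} \\
\Rightarrow\quad b^2 &= a^2\left(1 - e^2\right) = a \cdot a\left(1 - e^2\right) = a\,\ell \\
\Rightarrow\quad \ell &= \frac{b^2}{a}
\end{align*}
```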
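Now, the total area of an ellipse is just $A = \pi ab$, so by inserting this and the expression
for $p_\phi$, we get the period:
$$T = \frac{2\mu}{p_\phi}\,A \;\Rightarrow\; T = \frac{2\mu}{\mu b\sqrt{\dfrac{GM}{a}}}\,\pi ab = 2\pi a\sqrt{\frac{a}{GM}}$$
Squaring both sides gives:
$$T^2 = \frac{4\pi^2}{GM}\,a^3$$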
This is indeed exactly Kepler's third law, $T^2 \propto a^3$. The nice thing is that we also get
this proportionality constant $4\pi^2/(GM)$ for free, giving us a valid formula for the period of
any elliptical orbit.
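As an illustration of how this formula can be used, the sketch below evaluates the period for a few planets. The semi-major axes and constants are rounded values, and the planet masses are neglected so that $M$ is approximated by the Sun's mass alone; both are assumptions made for this example. The helper name `orbital_period` is hypothetical.

```python
import math

# Rounded constants (assumed values for illustration)
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M = 1.989e30         # kg, Sun's mass only (planet masses neglected)

# Rounded semi-major axes in metres
semi_major_axes = {
    "Earth": 1.496e11,
    "Mars": 2.279e11,
    "Jupiter": 7.785e11,
}


def orbital_period(a):
    """Period from Kepler's third law, T = 2 pi sqrt(a^3 / (G M))."""
    return 2.0 * math.pi * math.sqrt(a**3 / (G * M))


periods_in_days = {}
ratios = {}
for name, a in semi_major_axes.items():
    T = orbital_period(a)
    periods_in_days[name] = T / 86400.0
    ratios[name] = T**2 / a**3      # should equal 4 pi^2 / (G M) for every planet

kepler_constant = 4.0 * math.pi**2 / (G * M)

for name in semi_major_axes:
    print(f"{name}: T = {periods_in_days[name]:.1f} days, T^2/a^3 = {ratios[name]:.4e}")
print(f"4 pi^2 / (G M) = {kepler_constant:.4e}")
```

The ratio $T^2/a^3$ is the same for every planet, which is exactly the content of the third law, while the periods themselves grow with the semi-major axis.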
This now completes the proofs for all of Kepler's three laws of planetary motion.