Extremal Problems - Past and Present
Vladimir M. Tihomirov
[Fig. 1, Fig. 2]
Communicated at the "Archimedes" Seminar, Belgrade, 07.10.2003
[Fig. 3, Fig. 4]
Historical comment. Fermat did not know the concept of derivative, but in his letters to Roberval and Mersenne in 1638 (who were used by French scientists of the time for scientific correspondence, since journals did not exist) he literally explained the idea of "the main linear part" of a function and said that it has to be equal to zero.
The concept of derivative was introduced by Newton and Leibniz. To Newton, the derivative was the measure of the velocity of variation of a process. He expressed the result of Theorem 1 thus: "at the moment when a quantity attains its maximum or minimum, it does not flow, either forwards or backwards."
To Leibniz, the derivative was the slope of the tangent. So, in his words, Fermat's theorem says that "the tangent to the graph of a function at an extremal point has to be horizontal" (Fig. 5). Notice that even Kepler, in his "Stereometry", had a sentence expressing the essence of Fermat's theorem: he wrote that "on both sides of the place of maximum, the decrease is inessential."
Notice also that the equality $f'(\tilde{x}) = 0$ is a necessary, but not a sufficient condition for an extremum. The point 0 is neither a point of maximum nor a point of minimum for the function $g(x) = x^3$, although $g'(0) = 0$.
Let us solve Euclid's problem using Fermat's theorem. Let us look again at Fig. 1, where $a = AC$, $x = AF = DE$, $H$ is the altitude of $ABC$ and $h$ the altitude of $DBE$. Using the similarity of the triangles $DBE$ and $ABC$, we obtain $\frac{x}{a} = \frac{h}{H}$. The area of the parallelogram is
$$x(H - h) = \frac{H}{a}\,x(a - x).$$
So, the problem is reduced to finding the maximum of the function $f_0(x) = x(a - x)$ (we neglected the constant factor $H/a$), with the constraint $0 < x < a$. But we can neglect this constraint too, and consider a problem without constraints
$$f_0(x) = x(a - x) \to \max.$$
At the point of maximum, the equality $f_0'(\tilde{x}) = 0$, i.e., $\tilde{x} = a/2$, has to take place. And then
$$f_0(\tilde{x} + x) = \left(\frac{a}{2} + x\right)\left(\frac{a}{2} - x\right) = \frac{a^2}{4} - x^2 = f_0(\tilde{x}) - x^2,$$
i.e., $f_0$ attains its maximum at the point $\tilde{x}$ without any constraints and, a fortiori, with our constraint. We have solved the problem: the point $F$ has to be the midpoint of the segment $AC$.
Let us continue by considering extremal problems for functions of several variables. Consider Kepler's problem. Suppose that we have constructed orthogonal axes, and denote the variables by $x_1$, $x_2$ and $x_3$ (Fig. 6). Then the sphere with radius 1 can be written as
$$x_1^2 + x_2^2 + x_3^2 - 1 = 0.$$
Let a vertex of a parallelepiped lying on the sphere have the coordinates $(x_1, x_2, x_3)$, and denote it simply by $x$. Then the volume of the parallelepiped is equal to $8|x_1 x_2 x_3|$.
We have just mentioned two examples of functions of three variables:
$$f_1(x) = 8x_1 x_2 x_3 \quad \text{and} \quad f_2(x) = x_1^2 + x_2^2 + x_3^2 - 1.$$
For such functions, Fermat's theorem has the same formulation as for functions of one variable, only the derivative is now not a single number, but a collection of numbers. For instance, in the case of three variables, if $x = (x_1, x_2, x_3)$, denote by $|x|$ the expression $\sqrt{x_1^2 + x_2^2 + x_3^2}$. Let $f$ be a function of three variables (then we write $f\colon \mathbf{R}^3 \to \mathbf{R}$), and let the increment $f(\tilde{x} + x) - f(\tilde{x})$ at the point $\tilde{x}$ be representable as the sum of the linear part $a_1 x_1 + a_2 x_2 + a_3 x_3$ and a remainder $r(x)$, small when compared with $x$. More precisely, let
$$f(\tilde{x} + x) = f(\tilde{x}) + a \cdot x + r(x),$$
where $a = (a_1, a_2, a_3)$ is a collection of three numbers, $a \cdot x$ denotes the "scalar product" of $a$ and $x$, i.e., $a \cdot x = a_1 x_1 + a_2 x_2 + a_3 x_3$, and $\lim_{|x| \to 0} \frac{|r(x)|}{|x|} = 0$. Then we say that the function $f$ is differentiable at $\tilde{x}$ and that $a$ is the derivative of $f$ at the point $\tilde{x}$; we denote it by $f'(\tilde{x})$. The derivative $f'(\tilde{x})$ is the collection of the three numbers $(f'_{x_1}(\tilde{x}), f'_{x_2}(\tilde{x}), f'_{x_3}(\tilde{x}))$, where $f'_{x_1}(\tilde{x})$ is the derivative at zero of the function of one variable $g_1(x) = f(\tilde{x}_1 + x, \tilde{x}_2, \tilde{x}_3)$; similarly $f'_{x_2}(\tilde{x})$ and $f'_{x_3}(\tilde{x})$ are defined.
Consider the problem without constraints
$$(P_1') \qquad f(x) \to \mathrm{extr},$$
where $x = (x_1, x_2, x_3)$, or even $x = (x_1, x_2, \ldots, x_n)$ (a function of $n$ variables). The following theorem is valid.
Theorem 1'. If the function $f$ is differentiable at the point $\tilde{x}$ and this point is a solution of problem $(P_1')$, then $f'(\tilde{x}) = 0$ (or, in the three-dimensional case, $f'_{x_1}(\tilde{x}) = f'_{x_2}(\tilde{x}) = f'_{x_3}(\tilde{x}) = 0$).
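In computational terms, the derivative is just the collection of partial derivatives. A tiny sympy illustration (mine, not the article's) for the function $f_1$ above:

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
f1 = 8 * x1 * x2 * x3

# f1'(x) as the collection of the three partial derivatives
print([sp.diff(f1, v) for v in (x1, x2, x3)])
# -> [8*x2*x3, 8*x1*x3, 8*x1*x2]
```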
But there are few interesting problems without constraints of this kind. As a rule, problems with constraints are more important (Kepler's problem is one of them). A general method for solving problems with constraints belongs to Lagrange.
3. Finite-dimensional problems with constraints.
Lagrange's multipliers rule (1801)
Consider the problem
$$(P_2) \qquad f_0(x) \to \mathrm{extr}, \quad f_1(x) = 0,$$
where $f_0$ and $f_1$ are functions of $n$ variables: $x = (x_1, x_2, \ldots, x_n)$ (then we write $f\colon \mathbf{R}^n \to \mathbf{R}$). There is a way of solving problems of the kind $(P_2)$, belonging to Lagrange, by which one has to form the function $\mathcal{L}(x, \lambda) = \lambda_0 f_0(x) + \lambda_1 f_1(x)$ with indefinite multipliers $\lambda_0$ and $\lambda_1$ (this function is called the Lagrange function, and $\lambda = (\lambda_0, \lambda_1)$ is a collection of Lagrange multipliers), and "then search for maxima and minima", as Lagrange wrote, "as if the variables were independent", i.e., one has to apply Fermat's theorem to the problem $\mathcal{L}(x, \lambda) \to \mathrm{extr}$ without constraints. More precisely, the following theorem is valid.
Theorem 2. (Lagrange's multipliers rule) Let the functions $f_i$ be continuously differentiable in a neighbourhood of $\tilde{x}$, and let this point be a solution of problem $(P_2)$. Then there is a collection of Lagrange multipliers $\lambda = (\lambda_0, \lambda_1)$, distinct from zero ($|\lambda_0| + |\lambda_1| \neq 0$), such that
$$(2) \qquad \mathcal{L}'_x(\tilde{x}, \lambda) = 0, \quad \text{i.e.} \quad \mathcal{L}'_{x_1}(\tilde{x}, \lambda) = 0, \ \mathcal{L}'_{x_2}(\tilde{x}, \lambda) = 0, \ \ldots, \ \mathcal{L}'_{x_n}(\tilde{x}, \lambda) = 0.$$
Applied to Kepler's problem, this means that the only possible solution is the cube, for which $\tilde{x}_1 = \tilde{x}_2 = \tilde{x}_3 = 1/\sqrt{3}$. It is proved in Mathematical Analysis that a solution of problem (i) exists; but then it must satisfy Theorem 2, and there are eight solutions of equation (2) for this problem, all of them vertices of the cube. Consequently, the cube is the solution of Kepler's problem.
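As a sanity check, the multiplier computation for Kepler's problem can be reproduced with sympy; this is a sketch under the normalization $\lambda_0 = 1$ (legitimate here), with names of my own choosing:

```python
import sympy as sp

x1, x2, x3, lam = sp.symbols('x1 x2 x3 lam', real=True)

f0 = 8 * x1 * x2 * x3                    # volume (for a vertex in the open octants)
f1 = x1**2 + x2**2 + x3**2 - 1           # sphere constraint
L = f0 + lam * f1                        # Lagrange function with lam0 = 1

# Fermat's theorem for L "as if the variables were independent", plus the constraint
eqs = [sp.diff(L, v) for v in (x1, x2, x3)] + [f1]
sols = sp.solve(eqs, [x1, x2, x3, lam], dict=True)

# besides degenerate stationary points on the coordinate planes, the system
# has exactly the eight vertices with |x1| = |x2| = |x3| = 1/sqrt(3)
vertices = [s for s in sols if all(s[v] != 0 for v in (x1, x2, x3))]
print(len(vertices), vertices[0])
```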
Historical comments. Lagrange described his solution in the book "Theory of Analytic Functions" in 1801.
4. Problems of the Calculus of Variations. Euler's theorem (1744)
It may appear strange that, when stating the names of the mathematicians who contributed to the theory of extrema, I mentioned Lagrange before his elder colleague Euler, as if violating the chronological order of things: the multipliers rule is dated 1801, and Euler's equation 1744. The reason is that the theory of extrema really made an unexpected jump: it passed from functions of one variable straight to functions whose arguments are curves, i.e., to functions of an infinite number of variables. Let us make this clearer on the example of the brachistochrone.
Let us direct the Oy-axis vertically down, put the point $A$ at the origin, and let the coordinates of the point $B$ be $(x_1, y_1)$ (Fig. 4). Let $y(\cdot)$ be a certain curve ($y(\cdot)$ is the symbol for the function itself, $y(x)$ is the value of this function at the point $x$). According to Galileo's law, a body of mass $m$, descending along the curve $y(\cdot)$, starting from the origin under the force of gravity, attains at the point $M(x, y)$ the velocity $\sqrt{2gy(x)}$, regardless of the mass $m$ and of the path it followed to come to the point $M$. Consequently, when moving along the curve $y(\cdot)$ from the point $M(x, y(x))$ to the point $(x + dx, y(x + dx))$, for small $dx$, the path passed is approximately equal to
$$\sqrt{dx^2 + dy^2} = \sqrt{dx^2 + y'^2(x)\,dx^2} = \sqrt{1 + y'^2(x)}\,dx,$$
and so the time $dt$ for passing this path is approximately equal to the ratio of the path and the velocity, i.e.,
$$dt = \frac{\sqrt{1 + y'^2(x)}\,dx}{\sqrt{2gy(x)}}.$$
And so, the full time of passing
the path from $A$ to $B$ is equal to
$$\int_0^{x_1} \frac{\sqrt{1 + y'^2(x)}}{\sqrt{2gy(x)}}\,dx.$$
We have reformulated the brachistochrone problem: one has to find the minimum of the stated integral (considering all curves $y(\cdot)$ satisfying the conditions $y(0) = 0$, $y(x_1) = y_1$); in other words:
$$\int_0^{x_1} \frac{\sqrt{1 + y'^2(x)}}{\sqrt{y(x)}}\,dx \to \min, \qquad y(0) = 0, \quad y(x_1) = y_1$$
(we neglected the constant factor $1/\sqrt{2g}$, which does not depend on $x$).
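For a concrete feeling of this functional, here is a small numerical sketch (mine, not the article's) that evaluates the travel time for two curves joining $A = (0,0)$ and $B = (\pi, 2)$: the straight chord, and the cycloid, which is the known solution of the brachistochrone problem:

```python
import numpy as np
from scipy.integrate import quad

g = 9.81
x1, y1 = np.pi, 2.0            # endpoint chosen so the cycloid with r = 1 passes through B

def travel_time(y, yprime):
    # the functional: integral of sqrt(1 + y'^2) / sqrt(2*g*y) dx from 0 to x1
    integrand = lambda x: np.sqrt(1 + yprime(x)**2) / np.sqrt(2 * g * y(x))
    return quad(integrand, 1e-9, x1)[0]

k = y1 / x1                    # straight chord y = k*x
t_chord = travel_time(lambda x: k * x, lambda x: k)

# for the cycloid x = r(t - sin t), y = r(1 - cos t) the time is sqrt(r/g)*t_B exactly
t_cycloid = np.sqrt(1.0 / g) * np.pi
print(t_chord, t_cycloid)      # about 1.19 s vs 1.00 s: the cycloid wins
```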
Euler developed a method of solving problems of the form
$$(P_3) \qquad J(y(\cdot)) = \int_{x_0}^{x_1} f(x, y(x), y'(x))\,dx \to \mathrm{extr}, \quad y(x_0) = y_0, \ y(x_1) = y_1.$$
The function $f$ is called the integrand of problem $(P_3)$. Problem $(P_3)$ is called the simplest problem of the Calculus of Variations. We have
Theorem 3. If the function $f$ is differentiable (as a function of three variables), and the function $\tilde{y}(\cdot)$ is a solution of problem $(P_3)$, then the following differential equation is satisfied:
$$(3) \qquad -\frac{d}{dx} f_z(x, \tilde{y}(x), \tilde{y}'(x)) + f_y(x, \tilde{y}(x), \tilde{y}'(x)) = 0.$$
Equation (3) is usually called Euler's equation.
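Equation (3) can be produced mechanically with sympy; a minimal sketch for a sample integrand of my own choosing, $f(x, y, z) = z^2/2 + y$:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

x = sp.Symbol('x')
y = sp.Function('y')

f = y(x).diff(x)**2 / 2 + y(x)        # sample integrand f(x, y, y') = y'^2/2 + y
print(euler_equations(f, y(x), x))
# -> [Eq(1 - Derivative(y(x), (x, 2)), 0)], i.e. -d/dx f_z + f_y = -y'' + 1 = 0
```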
If $f$ does not depend on $x$, then equation (3) has an integral
$$(3') \qquad \tilde{y}'(x)\,f_z(\tilde{y}(x), \tilde{y}'(x)) - f(\tilde{y}(x), \tilde{y}'(x)) = \mathrm{const}.$$
Let us apply $(3')$ to the brachistochrone. We have
$$y' f_z - f = \frac{y'^2}{\sqrt{1 + y'^2}\,\sqrt{y}} - \frac{\sqrt{1 + y'^2}}{\sqrt{y}} = -\frac{1}{\sqrt{y}\,\sqrt{1 + y'^2}},$$
so along a solution $y(x)\bigl(1 + y'^2(x)\bigr) = \mathrm{const}$; the curves satisfying this equation are cycloids.
solve it, and try to satisfy the condition $I(y(\cdot)) = 1$. (Equations of the type $(3')$ are called Euler-Lagrange equations.)
Let us show how these equations can be applied in the following example:
$$\int_0^{\pi} y^2\,dx \to \max, \qquad \int_0^{\pi} y'^2\,dx = 1, \quad y(0) = y(\pi) = 0.$$
Here, equation $(3')$ has the form $-\lambda_1 y'' + \lambda_0 y = 0$, or $y'' + \lambda y = 0$, where $\lambda = -\lambda_0/\lambda_1$. The boundary conditions are satisfied by the sequence $y_n(x) = \sqrt{2/\pi}\,\frac{\sin nx}{n}$. The maximum value is attained for the function $y_1 = \sqrt{2/\pi}\,\sin x$.
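A quick sympy verification (mine) that each $y_n$ satisfies the integral constraint and that the value of the maximized functional, $1/n^2$, is largest for $n = 1$:

```python
import sympy as sp

x = sp.Symbol('x')
n = sp.Symbol('n', positive=True, integer=True)
y_n = sp.sqrt(2/sp.pi) * sp.sin(n*x) / n       # the candidate extremals

print(sp.integrate(sp.diff(y_n, x)**2, (x, 0, sp.pi)))   # constraint: 1
print(sp.integrate(y_n**2, (x, 0, sp.pi)))               # value: 1/n**2, maximal at n = 1
```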
And two hundred years later, during the 1940s and 1950s, it was the needs of Control Theory, Economics, Space Navigation and the Military industry that brought the necessity to supplement the theory of the Calculus of Variations with new constraints: constraints containing inequalities on the control variable. When applied to problem $(P_3)$, such constraints are imposed on the derivative of the function $y(x)$. We can write this down in the form $y'(x) \in U$, where $U$ is a certain subset of $\mathbf{R}$ (say, a finite segment $[a, b]$, or the semiaxis $\mathbf{R}_+ = \{x \in \mathbf{R} \mid x \geq 0\}$, or even a finite set of points).
The first problem of optimal control was, without doubt, Newton's aerodynamical problem. In his "Mathematical Principles", he just stated the answer, without formalization or solution. Two of his contemporaries, Johann Bernoulli and his student l'Hospital, formalized the problem and tried to solve it analytically. They directed the body along the x-axis, but if they had directed it along the y-axis (Fig. 7), they would have come to an easier expression for the integrand, and they would have come to the problem
$$\text{(ii)} \qquad I(y(\cdot)) = \int_0^{x_1} \frac{x\,dx}{1 + y'^2(x)} \to \min, \quad y(0) = 0, \ y(x_1) = y_1.$$
Note, however, that Newton's problem was not a variational problem, but a problem of optimal control, since the monotonicity of the profile, i.e., the inequality $y' \geq 0$, was understood. And Newton's solution was absolutely right!
The following problem is a particular, but important example of an optimal control problem:
$$(P_4) \qquad I(y(\cdot)) = \int_{x_0}^{x_1} f(x, y(x), y'(x))\,dx \to \min, \quad y(x_0) = y_0, \ y(x_1) = y_1, \ y'(x) \in U,$$
where $U$ is a certain subset of $\mathbf{R}$. In Newton's problem, $f(x, z) = \frac{x}{1 + z^2}$ and $U = \mathbf{R}_+$. The following theorem is valid.
Theorem 4. (Pontryagin's maximum principle for problem $(P_4)$) If the function $f$ is differentiable as a function of three variables, and $\tilde{y}$ is a solution of problem $(P_4)$, then the following conditions are satisfied: there exists a function $p(\cdot)$ such that
$$(4) \qquad -p'(x) + f_y(x, \tilde{y}(x), \tilde{y}'(x)) = 0,$$
$$(4') \qquad \max_{u \in U}\,\bigl(p(x)u - f(x, \tilde{y}(x), u)\bigr) = p(x)\tilde{y}'(x) - f(x, \tilde{y}(x), \tilde{y}'(x)).$$
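To illustrate the maximum condition $(4')$, here is a small numerical sketch (mine; the sample values of $x$ and of the adjoint $p(x)$ are arbitrary) that maximizes $p\,u - f(x, u)$ over $U = \mathbf{R}_+$ for Newton's integrand:

```python
import numpy as np

def f(x, z):
    return x / (1.0 + z**2)                # Newton's integrand

def best_control(x, p, u_max=10.0, n=200001):
    # crude grid search for the maximizer of p*u - f(x, u) over [0, u_max]
    us = np.linspace(0.0, u_max, n)
    return us[np.argmax(p * us - f(x, us))]

for x, p in [(1.0, -0.7), (1.0, -0.3)]:    # arbitrary sample points
    print(f"x={x}, p={p}: u* = {best_control(x, p):.3f}")
# -> u* = 0 in the first case and u* ~ 1.45 in the second: the maximizer is
#    either 0 or >= 1, never strictly between, which reflects the corner
#    in Newton's optimal profile
```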