Extremal Problems - Past and Present
Vladimir M. Tihomirov
[Fig. 1, Fig. 2]
Communicated at the "Archimedes" Seminar, Belgrade, 07.10.2003
[Fig. 3, Fig. 4]
Historical comment. Fermat did not know the concept of derivative, but in his letters to Roberval and Mersenne in 1638 (who were used by French scientists of the time for scientific correspondence, since journals did not exist) he literally explained the idea of "the main linear part" of a function and said that it has to be equal to zero.
The concept of derivative was introduced by Newton and Leibniz. To Newton, the derivative was the measure of the velocity of variation of a process. He expressed the result of Theorem 1 thus: "at the moment when a quantity attains its maximum or minimum, it does not flow, either forwards or backwards."
To Leibniz, the derivative was the slope of the tangent. So, in his words, Fermat's theorem says that "the tangent to the graph of a function at an extremal point has to be horizontal" (Fig. 5). Notice that even Kepler, in his "Stereometry", had a sentence expressing the essence of Fermat's theorem: he wrote that "on both sides of the place of maximum, the decrease is inessential."
Notice also that the equality $f'(\tilde{x}) = 0$ is a necessary, but not a sufficient condition for an extremum. The point 0 is neither a point of maximum nor a point of minimum for the function $g(x) = x^3$, although $g'(0) = 0$.
Let us solve Euclid's problem using Fermat's theorem. Let us look again at Fig. 1, where $a = AC$, $x = AF = DE$, $H$ is the altitude of $ABC$ and $h$ the altitude of $DBE$. Using the similarity of the triangles $DBE$ and $ABC$, we obtain $\frac{x}{a} = \frac{h}{H}$. The area of the parallelogram is
$$x(H - h) = \frac{H}{a}\,x(a - x).$$
So, the problem is reduced to finding the maximum of the function $f_0(x) = x(a - x)$ (we neglected the constant factor $H/a$), with the constraint $0 < x < a$. But we can neglect this constraint too, and consider a problem without constraints
$$f_0(x) = x(a - x) \to \max.$$
At the point of maximum, the equality $f_0'(\tilde{x}) = 0$, i.e., $\tilde{x} = a/2$, has to take place. And then
$$f_0(\tilde{x} + x) = \left(\frac{a}{2} + x\right)\left(\frac{a}{2} - x\right) = \frac{a^2}{4} - x^2 = f_0(\tilde{x}) - x^2,$$
i.e., $f_0$ attains its maximum at the point $\tilde{x}$ without any constraints and, a fortiori, with our constraint. We have solved the problem: the point $F$ has to be the midpoint of the segment $AC$.
Let us continue by considering extremal problems for functions of several variables. Consider Kepler's problem. Suppose that we have constructed orthogonal axes, and denote the variables by $x_1$, $x_2$ and $x_3$ (Fig. 6). Then the sphere with radius 1 can be written as
$$x_1^2 + x_2^2 + x_3^2 - 1 = 0.$$
Let a vertex of a parallelepiped lying on the sphere have the coordinates $(x_1, x_2, x_3)$, and denote it simply by $x$. Then the volume of the parallelepiped is equal to $8|x_1 x_2 x_3|$.
We have just mentioned two examples of functions of three variables:
$$f_1(x) = 8x_1 x_2 x_3 \quad \text{and} \quad f_2(x) = x_1^2 + x_2^2 + x_3^2 - 1.$$
For such functions, Fermat's theorem has the same formulation as for functions of one variable, only the derivative is now not a single number, but a collection of numbers. For instance, in the case of three variables, if $x = (x_1, x_2, x_3)$, denote by $|x|$ the expression $\sqrt{x_1^2 + x_2^2 + x_3^2}$. Let $f$ be a function of three variables (then we write $f\colon \mathbf{R}^3 \to \mathbf{R}$), and let the increment $f(\tilde{x} + x) - f(\tilde{x})$ at the point $\tilde{x}$ be representable as the sum of the linear part $a_1 x_1 + a_2 x_2 + a_3 x_3$ and a remainder $r(x)$, small when compared with $x$. More precisely, let
$$f(\tilde{x} + x) = f(\tilde{x}) + a \cdot x + r(x),$$
where $a = (a_1, a_2, a_3)$ is a collection of three numbers, $a \cdot x$ denotes the "scalar product" of $a$ and $x$, i.e., $a \cdot x = a_1 x_1 + a_2 x_2 + a_3 x_3$, and $\lim_{|x| \to 0} \frac{|r(x)|}{|x|} = 0$. Then we say that the function $f$ is differentiable at $\tilde{x}$ and that $a$ is the derivative of $f$ at the point $\tilde{x}$; we denote it by $f'(\tilde{x})$. The derivative $f'(\tilde{x})$ is the collection of the three numbers $(f'_{x_1}(\tilde{x}), f'_{x_2}(\tilde{x}), f'_{x_3}(\tilde{x}))$, where $f'_{x_1}(\tilde{x})$ is the derivative at zero of the function of one variable $g_1(x) = f(\tilde{x}_1 + x, \tilde{x}_2, \tilde{x}_3)$; similarly $f'_{x_2}(\tilde{x})$ and $f'_{x_3}(\tilde{x})$ are defined.
Consider the problem without constraints
$$(P_1') \qquad f(x) \to \mathrm{extr},$$
where $x = (x_1, x_2, x_3)$, or even $x = (x_1, x_2, \ldots, x_n)$ (a function of $n$ variables). The following theorem is valid.
Theorem 1'. If the function $f$ is differentiable at the point $\tilde{x}$ and this point is a solution of problem $(P_1')$, then $f'(\tilde{x}) = 0$ (or, in the three-dimensional case, $f'_{x_1}(\tilde{x}) = f'_{x_2}(\tilde{x}) = f'_{x_3}(\tilde{x}) = 0$).
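In computational terms, the derivative is just the collection of partial derivatives. A tiny sympy illustration (mine, not the article's) for the function $f_1$ above:

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
f1 = 8 * x1 * x2 * x3

# f1'(x) as the collection of the three partial derivatives
print([sp.diff(f1, v) for v in (x1, x2, x3)])
# -> [8*x2*x3, 8*x1*x3, 8*x1*x2]
```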
But there are few interesting problems without constraints of this kind. As a rule, problems with constraints are more important (Kepler's problem is one of them). A general method for solving problems with constraints belongs to Lagrange.
3. Finite-dimensional problems with constraints.
Lagrange's multipliers rule (1801)
Consider the problem
$$(P_2) \qquad f_0(x) \to \mathrm{extr}, \quad f_1(x) = 0,$$
where $f_0$ and $f_1$ are functions of $n$ variables: $x = (x_1, x_2, \ldots, x_n)$ (then we write $f\colon \mathbf{R}^n \to \mathbf{R}$). There is a way of solving problems of the kind $(P_2)$, belonging to Lagrange, by which one has to form the function $\mathcal{L}(x, \lambda) = \lambda_0 f_0(x) + \lambda_1 f_1(x)$ with indefinite multipliers $\lambda_0$ and $\lambda_1$ (this function is called the Lagrange function, and $\lambda = (\lambda_0, \lambda_1)$ is a collection of Lagrange multipliers), and "then search for maxima and minima", as Lagrange wrote, "as if the variables were independent", i.e., one has to apply Fermat's theorem to the problem $\mathcal{L}(x, \lambda) \to \mathrm{extr}$ without constraints. More precisely, the following theorem is valid.
Theorem 2. (Lagrange's multipliers rule) Let the functions $f_i$ be continuously differentiable in a neighbourhood of $\tilde{x}$, and let this point be a solution of problem $(P_2)$. Then there is a collection of Lagrange multipliers $\lambda = (\lambda_0, \lambda_1)$, distinct from zero ($|\lambda_0| + |\lambda_1| \neq 0$), such that
$$(2) \qquad \mathcal{L}'_x(\tilde{x}, \lambda) = 0, \quad \text{i.e.} \quad \mathcal{L}'_{x_1}(\tilde{x}, \lambda) = 0, \ \mathcal{L}'_{x_2}(\tilde{x}, \lambda) = 0, \ \ldots, \ \mathcal{L}'_{x_n}(\tilde{x}, \lambda) = 0.$$
Applied to Kepler's problem, this means that the only possible solution is the cube, for which $\tilde{x}_1 = \tilde{x}_2 = \tilde{x}_3 = 1/\sqrt{3}$. It is proved in Mathematical Analysis that a solution of problem (i) exists; but then it must satisfy Theorem 2, and there are eight solutions of equation (2) for this problem, all of them vertices of the cube. Consequently, the cube is the solution of Kepler's problem.
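As a sanity check, the multiplier computation for Kepler's problem can be reproduced with sympy; this is a sketch under the normalization $\lambda_0 = 1$ (legitimate here), with names of my own choosing:

```python
import sympy as sp

x1, x2, x3, lam = sp.symbols('x1 x2 x3 lam', real=True)

f0 = 8 * x1 * x2 * x3                    # volume (for a vertex in the open octants)
f1 = x1**2 + x2**2 + x3**2 - 1           # sphere constraint
L = f0 + lam * f1                        # Lagrange function with lam0 = 1

# Fermat's theorem for L "as if the variables were independent", plus the constraint
eqs = [sp.diff(L, v) for v in (x1, x2, x3)] + [f1]
sols = sp.solve(eqs, [x1, x2, x3, lam], dict=True)

# besides degenerate stationary points on the coordinate planes, the system
# has exactly the eight vertices with |x1| = |x2| = |x3| = 1/sqrt(3)
vertices = [s for s in sols if all(s[v] != 0 for v in (x1, x2, x3))]
print(len(vertices), vertices[0])
```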
Historical comments. Lagrange described his solution in the book "Theory of Analytic Functions" in 1801.
4. Problems of the Calculus of Variations. Euler's theorem (1744)
It may appear strange that, when stating the names of the mathematicians who contributed to the theory of extrema, I mentioned Lagrange before his elder colleague Euler, as if violating the chronological order of things: the multipliers rule is dated 1801, and Euler's equation 1744. The reason is that the theory of extrema really made an unexpected jump: it passed from functions of one variable straight to functions whose arguments are curves, i.e., to functions of an infinite number of variables. Let us make this clearer on the example of the brachistochrone.
Let us direct the Oy-axis vertically down, put the point $A$ at the origin, and let the coordinates of the point $B$ be $(x_1, y_1)$ (Fig. 4). Let $y(\cdot)$ be a certain curve ($y(\cdot)$ is the symbol for the function itself, $y(x)$ is the value of this function at the point $x$). According to Galileo's law, a body of mass $m$, descending along the curve $y(\cdot)$, starting from the origin under the force of gravity, attains at the point $M(x, y)$ the velocity $\sqrt{2gy(x)}$, regardless of the mass $m$ and of the path it followed to come to the point $M$. Consequently, when moving along the curve $y(\cdot)$ from the point $M(x, y(x))$ to the point $(x + dx, y(x + dx))$, for small $dx$, the path passed is approximately equal to
$$\sqrt{dx^2 + dy^2} = \sqrt{dx^2 + y'^2(x)\,dx^2} = \sqrt{1 + y'^2(x)}\,dx,$$
and so the time $dt$ for passing this path is approximately equal to the ratio of the path and the velocity, i.e.,
$$dt = \frac{\sqrt{1 + y'^2(x)}\,dx}{\sqrt{2gy(x)}}.$$
And so, the full time of passing
the path from $A$ to $B$ is equal to
$$\int_0^{x_1} \frac{\sqrt{1 + y'^2(x)}}{\sqrt{2gy(x)}}\,dx.$$
We have reformulated the brachistochrone problem: one has to find the minimum of the stated integral (considering all curves $y(\cdot)$ satisfying the conditions $y(0) = 0$, $y(x_1) = y_1$); in other words:
$$\int_0^{x_1} \frac{\sqrt{1 + y'^2(x)}}{\sqrt{y(x)}}\,dx \to \min, \qquad y(0) = 0, \quad y(x_1) = y_1$$
(we neglected the constant factor $1/\sqrt{2g}$, which does not depend on $x$).
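For a concrete feeling of this functional, here is a small numerical sketch (mine, not the article's) that evaluates the travel time for two curves joining $A = (0,0)$ and $B = (\pi, 2)$: the straight chord, and the cycloid, which is the known solution of the brachistochrone problem:

```python
import numpy as np
from scipy.integrate import quad

g = 9.81
x1, y1 = np.pi, 2.0            # endpoint chosen so the cycloid with r = 1 passes through B

def travel_time(y, yprime):
    # the functional: integral of sqrt(1 + y'^2) / sqrt(2*g*y) dx from 0 to x1
    integrand = lambda x: np.sqrt(1 + yprime(x)**2) / np.sqrt(2 * g * y(x))
    return quad(integrand, 1e-9, x1)[0]

k = y1 / x1                    # straight chord y = k*x
t_chord = travel_time(lambda x: k * x, lambda x: k)

# for the cycloid x = r(t - sin t), y = r(1 - cos t) the time is sqrt(r/g)*t_B exactly
t_cycloid = np.sqrt(1.0 / g) * np.pi
print(t_chord, t_cycloid)      # about 1.19 s vs 1.00 s: the cycloid wins
```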
Euler developed a method of solving problems of the form
$$(P_3) \qquad J(y(\cdot)) = \int_{x_0}^{x_1} f(x, y(x), y'(x))\,dx \to \mathrm{extr}, \quad y(x_0) = y_0, \ y(x_1) = y_1.$$
The function $f$ is called the integrand of problem $(P_3)$. Problem $(P_3)$ is called the simplest problem of the Calculus of Variations. We have
Theorem 3. If the function $f$ is differentiable (as a function of three variables), and the function $\tilde{y}(\cdot)$ is a solution of problem $(P_3)$, then the following differential equation is satisfied:
$$(3) \qquad -\frac{d}{dx} f_z(x, \tilde{y}(x), \tilde{y}'(x)) + f_y(x, \tilde{y}(x), \tilde{y}'(x)) = 0.$$
Equation (3) is usually called Euler's equation.
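Equation (3) can be produced mechanically with sympy; a minimal sketch for a sample integrand of my own choosing, $f(x, y, z) = z^2/2 + y$:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

x = sp.Symbol('x')
y = sp.Function('y')

f = y(x).diff(x)**2 / 2 + y(x)        # sample integrand f(x, y, y') = y'^2/2 + y
print(euler_equations(f, y(x), x))
# -> [Eq(1 - Derivative(y(x), (x, 2)), 0)], i.e. -d/dx f_z + f_y = -y'' + 1 = 0
```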
If $f$ does not depend on $x$, then equation (3) has an integral
$$(3') \qquad \tilde{y}'(x)\,f_z(\tilde{y}(x), \tilde{y}'(x)) - f(\tilde{y}(x), \tilde{y}'(x)) = \mathrm{const}.$$
Let us apply $(3')$ to the brachistochrone. We have
$$y' f_z - f = \frac{y'^2}{\sqrt{1 + y'^2}\,\sqrt{y}} - \frac{\sqrt{1 + y'^2}}{\sqrt{y}} = -\frac{1}{\sqrt{y}\,\sqrt{1 + y'^2}},$$
so along a solution $y(x)\bigl(1 + y'^2(x)\bigr) = \mathrm{const}$; the curves satisfying this equation are cycloids.
solve it, and try to satisfy the condition $I(y(\cdot)) = 1$. (Equations of the type $(3')$ are called Euler-Lagrange equations.)
Let us show how these equations can be applied in the following example:
$$\int_0^{\pi} y^2\,dx \to \max, \qquad \int_0^{\pi} y'^2\,dx = 1, \quad y(0) = y(\pi) = 0.$$
Here, equation $(3')$ has the form $-\lambda_1 y'' + \lambda_0 y = 0$, or $y'' + \lambda y = 0$, where $\lambda = -\lambda_0/\lambda_1$. The boundary conditions are satisfied by the sequence $y_n(x) = \sqrt{2/\pi}\,\frac{\sin nx}{n}$. The maximum value is attained for the function $y_1 = \sqrt{2/\pi}\,\sin x$.
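A quick sympy verification (mine) that each $y_n$ satisfies the integral constraint and that the value of the maximized functional, $1/n^2$, is largest for $n = 1$:

```python
import sympy as sp

x = sp.Symbol('x')
n = sp.Symbol('n', positive=True, integer=True)
y_n = sp.sqrt(2/sp.pi) * sp.sin(n*x) / n       # the candidate extremals

print(sp.integrate(sp.diff(y_n, x)**2, (x, 0, sp.pi)))   # constraint: 1
print(sp.integrate(y_n**2, (x, 0, sp.pi)))               # value: 1/n**2, maximal at n = 1
```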
And two hundred years later, during the 1940s and 1950s, it was the needs of Control Theory, Economics, Space Navigation and the Military industry that brought the necessity to supplement the theory of the Calculus of Variations with new constraints: constraints containing inequalities on the control variable. When applied to problem $(P_3)$, such constraints are imposed on the derivative of the function $y(x)$. We can write this down in the form $y'(x) \in U$, where $U$ is a certain subset of $\mathbf{R}$ (say, a finite segment $[a, b]$, or the semiaxis $\mathbf{R}_+ = \{x \in \mathbf{R} \mid x \geq 0\}$, or even a finite set of points).
The first problem of optimal control was, without doubt, Newton's aerodynamical problem. In his "Mathematical Principles", he just stated the answer, without formalization or solution. Two of his contemporaries, Johann Bernoulli and his student l'Hospital, formalized the problem and tried to solve it analytically. They directed the body along the x-axis, but if they had directed it along the y-axis (Fig. 7), they would have come to an easier expression for the integrand, and they would have come to the problem
$$\text{(ii)} \qquad I(y(\cdot)) = \int_0^{x_1} \frac{x\,dx}{1 + y'^2(x)} \to \min, \quad y(0) = 0, \ y(x_1) = y_1.$$
Note, however, that Newton's problem was not a variational problem, but a problem of optimal control, since the monotonicity of the profile, i.e., the inequality $y' \geq 0$, was understood. And Newton's solution was absolutely right!
The following problem is a particular, but important example of an optimal control problem:
$$(P_4) \qquad I(y(\cdot)) = \int_{x_0}^{x_1} f(x, y(x), y'(x))\,dx \to \min, \quad y(x_0) = y_0, \ y(x_1) = y_1, \ y'(x) \in U,$$
where $U$ is a certain subset of $\mathbf{R}$. In Newton's problem, $f(x, z) = \frac{x}{1 + z^2}$ and $U = \mathbf{R}_+$. The following theorem is valid.
Theorem 4. (Pontryagin's maximum principle for problem $(P_4)$) If the function $f$ is differentiable as a function of three variables, and $\tilde{y}$ is a solution of problem $(P_4)$, then the following conditions are satisfied: there exists a function $p(\cdot)$ such that
$$(4) \qquad -p'(x) + f_y(x, \tilde{y}(x), \tilde{y}'(x)) = 0,$$
$$(4') \qquad \max_{u \in U}\,\bigl(p(x)u - f(x, \tilde{y}(x), u)\bigr) = p(x)\tilde{y}'(x) - f(x, \tilde{y}(x), \tilde{y}'(x)).$$
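To illustrate the maximum condition $(4')$, here is a small numerical sketch (mine; the sample values of $x$ and of the adjoint $p(x)$ are arbitrary) that maximizes $p\,u - f(x, u)$ over $U = \mathbf{R}_+$ for Newton's integrand:

```python
import numpy as np

def f(x, z):
    return x / (1.0 + z**2)                # Newton's integrand

def best_control(x, p, u_max=10.0, n=200001):
    # crude grid search for the maximizer of p*u - f(x, u) over [0, u_max]
    us = np.linspace(0.0, u_max, n)
    return us[np.argmax(p * us - f(x, us))]

for x, p in [(1.0, -0.7), (1.0, -0.3)]:    # arbitrary sample points
    print(f"x={x}, p={p}: u* = {best_control(x, p):.3f}")
# -> u* = 0 in the first case and u* ~ 1.45 in the second: the maximizer is
#    either 0 or >= 1, never strictly between, which reflects the corner
#    in Newton's optimal profile
```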