Advanced Mathematics For Engineers by W Ertel
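$n_0$. Then the series $\sum a_n$ converges. If, from an index $n_0$, $\left|\frac{a_{n+1}}{a_n}\right| \ge 1$, then the series is divergent.

Proof: Apply the comparison test to $\sum a_n$ and $\sum |a_0| q^n$.

Example 3.4 […] converges. Proof: […]

3.2.2 Power series

Theorem 3.8 For each $x \in \mathbb{R}$ the power series
$$\exp(x) := \sum_{n=0}^{\infty} \frac{x^n}{n!}$$
is convergent.

Proof: The ratio test gives
$$\left|\frac{a_{n+1}}{a_n}\right| = \left|\frac{x^{n+1}\, n!}{(n+1)!\, x^n}\right| = \frac{|x|}{n+1} \le \frac{1}{2} \quad \text{for } n \ge 2|x| - 1.$$

$\exp(x)$ is called exponential function.

Theorem 3.9 (Remainder)
$$\exp(x) = \sum_{n=0}^{N} \frac{x^n}{n!} + R_N(x) \quad (N\text{-th approximation})$$
with
$$|R_N(x)| \le 2\,\frac{|x|^{N+1}}{(N+1)!} \quad \text{for } |x| \le 1 + \frac{N}{2}.$$

3.2.2.1 Practical computation of exp(x)

$$\sum_{n=0}^{N} \frac{x^n}{n!} = 1 + x\left(1 + \frac{x}{2}\left(1 + \dots + \frac{x}{N-1}\left(1 + \frac{x}{N}\right)\dots\right)\right)$$
$$e = 1 + 1 + \frac{1}{2}\left(\dots\left(1 + \frac{1}{N-1}\left(1 + \frac{1}{N}\right)\right)\dots\right) + R_N \quad \text{with } R_N \le \frac{2}{(N+1)!}$$
For $N = 15$: $|R_{15}| \le \frac{2}{16!} < 10^{-13}$, and
$$e = 2.71828182846 \pm 2 \cdot 10^{-11} \quad (\text{rounding error 5 times } 10^{-12}).$$

Theorem 3.10 The functional equation of the exponential function: $\forall x, y \in \mathbb{R}$
$$\exp(x + y) = \exp(x) \cdot \exp(y).$$

Proof: The proof of this theorem is via the series representation (definition 3.5). It is not easy, because it requires another theorem about the product of series (not covered here).

Conclusions:
1. $\forall x \in \mathbb{R}$: $\exp(-x) = (\exp(x))^{-1} = \frac{1}{\exp(x)}$
2. $\forall x \in \mathbb{R}$: $\exp(x) > 0$
3. $\forall n \in \mathbb{Z}$: $\exp(n) = e^n$

Also for real numbers $x \in \mathbb{R}$ we write $e^x := \exp(x)$.

Proof:
1. $\exp(x) \cdot \exp(-x) = \exp(x - x) = \exp(0) = 1 \;\Rightarrow\; \exp(-x) = \frac{1}{\exp(x)}$
2. Case 1, $x \ge 0$: $\exp(x) = 1 + x + \frac{x^2}{2} + \dots \ge 1 > 0$. Case 2, $x < 0$: $-x > 0 \Rightarrow \exp(-x) > 0 \Rightarrow \exp(x) = \frac{1}{\exp(-x)} > 0$.
3. Induction: $\exp(1) = e$; $\exp(n) = \exp(n - 1 + 1) = \exp(n-1) \cdot e = e^{n-1} \cdot e = e^n$.

Note: for large $x := n + h$, $n \in \mathbb{N}$: $\exp(x) = \exp(n + h) = e^n \cdot \exp(h)$ (for large $x$ faster than series expansion).

3.3 Continuity

Functions are often characterized in terms of "smoothness". The weakest form of smoothness is continuity.

Definition 3.6 Let $D \subset \mathbb{R}$, $f : D \to \mathbb{R}$ a function and $a \in \mathbb{R}$. We write
$$\lim_{x \to a} f(x) = C,$$
if for each sequence $(x_n)_{n \in \mathbb{N}}$, $x_n \in D$ with $\lim_{n \to \infty} x_n = a$ we have $\lim_{n \to \infty} f(x_n) = C$.

Definition 3.7 For $x \in \mathbb{R}$ the expression $\lfloor x \rfloor$ denotes the unique integer number $n$ with $n \le x < n + 1$.

[…]

Theorem 3.11 Let $f(x) = x^k + a_1 x^{k-1} + \dots + a_k$ be a polynomial with $k \ge 1$. It holds
$$\lim_{x \to \infty} f(x) = \infty \quad \text{and} \quad \lim_{x \to -\infty} f(x) = \begin{cases} \infty & \text{if } k \text{ even} \\ -\infty & \text{if } k \text{ odd} \end{cases}$$

Proof: for $x \ne 0$ we have $f(x) = x^k (1 + g(x))$ with $g(x) = \frac{a_1}{x} + \dots + \frac{a_k}{x^k}$. Since $\lim_{x \to \infty} g(x) = 0$, it follows $\lim_{x \to \infty} f(x) = \lim_{x \to \infty} x^k = \infty$.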
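Application: The asymptotic behavior for $x \to \infty$ of polynomials is always determined by the highest power in $x$.

Definition 3.8 (Continuity) Let $f : D \to \mathbb{R}$ be a function and $a \in D$. The function $f$ is called continuous at point $a$, if
$$\lim_{x \to a} f(x) = f(a).$$
$f$ is called continuous in $D$, if $f$ is continuous at every point of $D$.

(Figure: for this function $\lim_{x \to a} f(x) \ne f(a)$; $f$ is discontinuous at the point $a$.)

Example 3.6
• $f : x \mapsto c$ (constant function) is continuous on whole $\mathbb{R}$.
• The exponential function is continuous on whole $\mathbb{R}$.
• The identity function $f : x \mapsto x$ is continuous on whole $\mathbb{R}$.

Theorem 3.12 Let $f, g : D \to \mathbb{R}$ be functions that are continuous at $a \in D$ and let $r \in \mathbb{R}$. Then the functions $f + g$, $rf$, $f \cdot g$ are continuous at point $a$, too. If $g(a) \ne 0$, then $\frac{f}{g}$ is continuous at $a$.

Proof: Let $(x_n)$ be a sequence with $x_n \in D$ and $\lim_{n \to \infty} x_n = a$. To show:
$$\lim_{n \to \infty} (f + g)(x_n) = (f + g)(a), \quad \lim_{n \to \infty} (rf)(x_n) = (rf)(a), \quad \lim_{n \to \infty} (f \cdot g)(x_n) = (f \cdot g)(a), \quad \lim_{n \to \infty} \left(\frac{f}{g}\right)(x_n) = \left(\frac{f}{g}\right)(a).$$
This holds because of the rules for sequences.

Definition 3.9 Let $A, B, C$ be subsets of $\mathbb{R}$ with the functions $f : A \to B$ and $g : B \to C$. Then $g \circ f : A \to C$, $x \mapsto g(f(x))$ is called the composition of $f$ and $g$.

Example 3.7
1.) $f \circ g(x) = f(g(x))$
2.) $\sqrt{\;} \circ \sin(x) = \sqrt{\sin(x)}$
3.) $\sin \circ \sqrt{\;}(x) = \sin(\sqrt{x})$

Theorem 3.13 Let $f : A \to B$ be continuous at $a \in A$ and $g : B \to C$ continuous at $y = f(a)$. Then the composition $g \circ f$ is continuous in $a$, too.

Proof: to show: $\lim_{n \to \infty} x_n = a \;\Rightarrow\; \lim_{n \to \infty} g(f(x_n)) = g(f(a))$. We have
$$\lim_{n \to \infty} x_n = a \;\Rightarrow\; \lim_{n \to \infty} f(x_n) = f(a) \;\Rightarrow\; \lim_{n \to \infty} g(f(x_n)) = g(f(a)).$$

Example 3.8 […] is continuous on whole $\mathbb{R}$, because $f(x) = x^2$, $g(x) = f(x) + a$ and the quotient $h(x) = \frac{[…]}{g(x)}$ are continuous.

Theorem 3.14 ($\varepsilon$-$\delta$ Definition of Continuity) A function $f : D \to \mathbb{R}$ is continuous at $x_0 \in D$ iff:
$$\forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x \in D \;\; (|x - x_0| < \delta \;\Rightarrow\; |f(x) - f(x_0)| < \varepsilon)$$

[…]

2. The function $f : \mathbb{R}^+ \to \mathbb{R}^+$, $x \mapsto x^k$ is continuous and strictly increasing. The inverse function $f^{-1} : \mathbb{R}^+ \to \mathbb{R}^+$, $x \mapsto \sqrt[k]{x}$ is continuous and strictly increasing.

Theorem 3.16 (Intermediate Value) Let $f : [a, b] \to \mathbb{R}$ be continuous with $f(a) < 0$ and $f(b) > 0$. Then there exists a $p \in [a, b]$ with $f(p) = 0$.

(Figure: $f$ discontinuous, no zero!)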
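The intermediate value theorem also suggests a simple numerical procedure: halve the interval repeatedly and keep the half on which the sign change occurs. The following is only a minimal sketch of this bisection idea, not part of the original notes; the function name `bisect_zero`, the tolerance and the test function $x^2 - 2$ on $[1, 2]$ (over the reals) are illustrative assumptions.

```python
def bisect_zero(f, a, b, tol=1e-12, max_iter=200):
    """Approximate a zero of a continuous f on [a, b] with f(a) < 0 < f(b).

    Hypothetical helper for illustration: the interval is halved until
    its half-width is below tol (or an exact zero is hit).
    """
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 and f(b) > 0")
    for _ in range(max_iter):
        p = (a + b) / 2
        fp = f(p)
        if fp == 0 or (b - a) / 2 < tol:
            return p
        if fp < 0:
            a = p      # sign change lies in [p, b]
        else:
            b = p      # sign change lies in [a, p]
    return (a + b) / 2


# f(1) = -1 < 0 and f(2) = 2 > 0, so a zero exists in [1, 2].
root = bisect_zero(lambda x: x * x - 2, 1.0, 2.0)
print(root)
```

If $f(a) > 0$ and $f(b) < 0$, the same sketch can be applied to $-f$.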
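Note: if $f(a) > 0$, $f(b) < 0$, take $-f$ instead of $f$ and apply the intermediate value theorem.

Example 3.10 $D = \mathbb{Q}$: $x \mapsto x^2 - 2 = f(x)$, $f(1) = -1$, $f(2) = 2$, but there is no $p \in D$ with $f(p) = 0$.

Corollary 3.3.1 If $f : [a, b] \to \mathbb{R}$ is continuous and $\bar{y}$ is any number between $f(a)$ and $f(b)$, then there is at least one $\bar{x} \in [a, b]$ with $f(\bar{x}) = \bar{y}$.

Note: Now it is clear that every continuous function on $[a, b]$ assumes every value in the interval $[f(a), f(b)]$!

3.3.1 Discontinuity

Definition 3.10 We write $\lim_{x \searrow a} f(x) = c$ ($\lim_{x \nearrow a} f(x) = c$), if for every sequence $(x_n)$ with $x_n > a$ ($x_n < a$) […]

3.4 Taylor-Series

[…] Let $f : I \to \mathbb{R}$ be $(n+1)$-times continuously differentiable and $x_0, x \in I$. Then there is a $z$ between $x_0$ and $x$ such that
$$R_n(x) = \frac{f^{(n+1)}(z)}{(n+1)!}(x - x_0)^{n+1}.$$

Example 3.13 $f(x) = e^x$. Theorems 3.18 and 3.19 yield
$$e^x = \sum_{k=0}^{n} \frac{x^k}{k!} + \underbrace{\frac{e^z}{(n+1)!}\, x^{n+1}}_{= R_n(x)} \quad \text{for } |z| < |x|.$$
Convergence: with $|R_n(x)| \le \frac{e^{|x|}\, |x|^{n+1}}{(n+1)!} =: b_n$, the ratio test implies convergence of $\sum_{n=0}^{\infty} b_n$
$$\Rightarrow \lim_{n \to \infty} b_n = 0 \;\Rightarrow\; \lim_{n \to \infty} R_n(x) = 0 \quad \text{for all } x \in \mathbb{R}.$$
Thus the Taylor series for $e^x$ converges to $f(x)$ for all $x \in \mathbb{R}$!

Example 3.14 Evaluation of the integral
$$\int_0^1 \sqrt{1 + x^3}\, dx.$$
As the function $f(x) = \sqrt{1 + x^3}$ has no simple antiderivative (primitive function), it cannot be symbolically integrated. We compute an approximation for the integral by integrating the third order Taylor polynomial
$$\sqrt{1 + x^3} \approx 1 + \frac{x^3}{2}, \qquad \int_0^1 \left(1 + \frac{x^3}{2}\right) dx = \left[x + \frac{x^4}{8}\right]_0^1 = 1.125.$$
The exact value of the integral is about 1.11145, i.e. our approximation error is about 1%.

Definition 3.11 The series
$$T_f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(x_0)}{k!}(x - x_0)^k$$
is called Taylor series of $f$ with expansion point (point of approximation) $x_0$.

Note:
1. For $x = x_0$ every Taylor series converges.
2. But for $x \ne x_0$ not all Taylor series converge!
3. A Taylor series converges for exactly these $x \in I$ to $f(x)$ for which the remainder term from theorem 3.18 (3.19) converges to zero.
4. Even if the Taylor series of $f$ converges, it does not necessarily converge to $f$. (→ example in the exercises.)

Example 3.15 (Logarithm series) For $0 < x \le 2$:
$$\ln(x) = (x - 1) - \frac{(x-1)^2}{2} + \frac{(x-1)^3}{3} - \frac{(x-1)^4}{4} \pm \dots$$

Proof:
$$\ln'(x) = \frac{1}{x}, \quad \ln''(x) = -\frac{1}{x^2}, \quad \ln'''(x) = \frac{2}{x^3}, \quad \ln^{(4)}(x) = -\frac{6}{x^4}, \quad \ln^{(n)}(x) = (-1)^{n-1}\frac{(n-1)!}{x^n}$$
Induction:
$$\ln^{(n+1)}(x) = \left(\ln^{(n)}(x)\right)' = \left((-1)^{n-1}\frac{(n-1)!}{x^n}\right)' = (-1)^n \frac{n!}{x^{n+1}}$$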
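Expansion at $x_0 = 1$:
$$T_{\ln,1}(x) = \sum_{k=0}^{\infty} \frac{\ln^{(k)}(1)}{k!}(x - 1)^k = (x - 1) - \frac{(x-1)^2}{2} + \frac{(x-1)^3}{3} - \frac{(x-1)^4}{4} \pm \dots$$
This series converges only for $0 < x \le 2$ (without proof).

Definition 3.12 If a Taylor series converges for all $x$ in an interval $I$, we call $I$ the convergence area. If $I = [x_0 - r, x_0 + r]$ or $I = (x_0 - r, x_0 + r)$, $r$ is the convergence radius of the Taylor series.

Example 3.16 Relativistic mass increase. Einstein: total energy $E = mc^2$, kinetic energy $E_{kin} = (m - m_0)c^2$ with
$$m(v) = \frac{m_0}{\sqrt{1 - \left(\frac{v}{c}\right)^2}}.$$
To be shown: for $v \ll c$ we have $E_{kin} \approx \frac{1}{2} m_0 v^2$.
$$E_{kin} = (m - m_0)c^2 = \left(\frac{1}{\sqrt{1 - x}} - 1\right) m_0 c^2 \quad \text{with } x = \frac{v^2}{c^2},$$
$$\frac{1}{\sqrt{1 - x}} = 1 + \frac{1}{2}x + \frac{3}{8}x^2 + \dots \quad \text{for } |x| < 1.$$
For $v \ll c$: $E_{kin} \approx \frac{1}{2}\, x\, m_0 c^2 = \frac{1}{2} m_0 v^2$.

3.5 Differential Calculus in many Variables

$$f : \mathbb{R}^n \to \mathbb{R}, \quad (x_1, x_2, \dots, x_n) \mapsto y = f(x_1, x_2, \dots, x_n) \quad \text{or} \quad x \mapsto y = f(x)$$

3.5.1 The Vector Space $\mathbb{R}^n$

In order to "compare" vectors, we use a norm:

Definition 3.13 Any mapping $\|\cdot\| : \mathbb{R}^n \to \mathbb{R}$, $x \mapsto \|x\|$ is called norm if and only if
1. $\|x\| = 0$ iff $x = 0$
2. $\|\lambda x\| = |\lambda|\, \|x\|$ $\forall \lambda \in \mathbb{R}, x \in \mathbb{R}^n$
3. $\|x + y\| \le \|x\| + \|y\|$ $\forall x, y \in \mathbb{R}^n$ (triangle inequality)

[…]

A mapping $\mathbb{N} \to \mathbb{R}^n$, $n \mapsto a_n$ is called sequence. Notation: $(a_n)_{n \in \mathbb{N}}$

Example 3.17 […]

Definition 3.16 A sequence $(a_n)_{n \in \mathbb{N}}$ of vectors $a_n \in \mathbb{R}^n$ converges to $a \in \mathbb{R}^n$, if
$$\forall \varepsilon > 0 \;\; \exists N(\varepsilon) \in \mathbb{N} : \quad \|a_n - a\| < \varepsilon \quad \forall n \ge N(\varepsilon).$$
Notation: $\lim_{n \to \infty} a_n = a$

Theorem 3.21 A (vector) sequence $(a_n)_{n \in \mathbb{N}}$ converges to $a$ if and only if all its coordinate sequences converge to the respective coordinates of $a$. (Proof as exercise.)

Notation: $a_k = (a_1^k, \dots, a_n^k)^T$, $(a_k)_{k \in \mathbb{N}}$, $a_k \in \mathbb{R}^n$.

Note: Theorem 3.21 enables us to lift most properties of sequences of real numbers to sequences of vectors.

3.5.3 Functions from $\mathbb{R}^n$ to $\mathbb{R}^m$

$m = 1$: Functions $f$ from $D \subset \mathbb{R}^n$ to $B \subset \mathbb{R}$ have the form
$$f : D \to B, \quad x \mapsto f(x).$$

Example 3.18 $f(x_1, x_2) = \sin(x_1 + \ln x_2)$

$m \ne 1$: Functions $f$ from $D \subset \mathbb{R}^n$ to $B \subset \mathbb{R}^m$ have the form
$$f : D \to B, \quad x \mapsto f(x) = \begin{pmatrix} f_1(x_1, \dots, x_n) \\ \vdots \\ f_m(x_1, \dots, x_n) \end{pmatrix}$$

Example 3.19
1. $f : \mathbb{R}^3 \to \mathbb{R}^2$, […]
2. Weather parameters: temperature, air pressure and humidity at any point on the earth,
$$f : [0°, 360°] \times [-90°, 90°] \to [-270, \infty] \times [0, \infty] \times [0, 100\%], \quad (\Theta, \Phi) \mapsto \begin{pmatrix} \text{temperature}(\Theta, \Phi) \\ \text{airpressure}(\Theta, \Phi) \\ \text{humidity}(\Theta, \Phi) \end{pmatrix}$$

Note: The components $f_1(x), \dots, f_m(x)$ can be viewed (analysed) independently.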
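Thus, in the following we can restrict ourselves to $f : \mathbb{R}^n \to \mathbb{R}$.

3.5.3.1 Contour Plots

Definition 3.17 Let $D \subset \mathbb{R}^2$, $B \subset \mathbb{R}$, $c \in B$, $f : D \to B$. The set $\{(x_1, x_2) \mid f(x_1, x_2) = c\}$ is called contour of $f$ to the niveau $c$.

Example 3.20 $f(x_1, x_2) = x_1 x_2$
$$x_1 x_2 = c; \quad \text{for } x_1 \ne 0: \; x_2 = \frac{c}{x_1} \;\; \text{(hyperbolas)}; \quad c = 0 \;\Leftrightarrow\; x_1 = 0 \vee x_2 = 0$$

ContourPlot[x y, {x,-3,3}, {y,-3,3}, Contours -> {0,1,2,3,4,5,6,7,8,9,-1,-2,-3,-4,-5,-6,-7,-8,-9}, PlotPoints -> 60]

Plot3D[x y, {x,-3,3}, {y,-3,3}, PlotPoints -> 30]

3.5.4 Continuity in $\mathbb{R}^n$

Analogous to continuity of functions in one variable:

Definition 3.18 Let $f : D \to \mathbb{R}^m$ be a function and $a \in \mathbb{R}^n$. If there is a sequence $(a_n)$ (maybe more than one sequence) with $\lim_{n \to \infty} a_n = a$, we write
$$\lim_{x \to a} f(x) = c,$$
if for any sequence $(x_n)$, $x_n \in D$ with $\lim_{n \to \infty} x_n = a$:
$$\lim_{n \to \infty} f(x_n) = c.$$

Definition 3.19 (Continuity) Let $f : D \to \mathbb{R}^m$ be a function and $a \in D$. The function $f$ is continuous in $a$, if $\lim_{x \to a} f(x) = f(a)$. $f$ is continuous in $D$, if $f$ is continuous in all points in $D$.

Note: These definitions are analogous to the one-dimensional case.

Theorem 3.22 If $f : D \to \mathbb{R}^m$, $g : D \to \mathbb{R}^m$, $h : D \to \mathbb{R}$ are continuous in $x_0 \in D$, then $f + g$, $f - g$, $f \cdot g$ and $\frac{f}{h}$ (if $h(x_0) \ne 0$) are continuous in $x_0$.

3.5.5 Differentiation of Functions in $\mathbb{R}^n$

3.5.5.1 Partial Derivatives

Example 3.21 $f : \mathbb{R}^2 \to \mathbb{R}$, $f(x_1, x_2) = 2 x_1^2 x_2^3$

Keep $x_2 = $ const., and compute the 1-dim. derivative of $f$ w.r.t. $x_1$:
$$\frac{\partial f}{\partial x_1}(x_1, x_2) = f_{x_1}(x_1, x_2) = 4 x_1 x_2^3$$
Analogous with $x_1 = $ const.:
$$\frac{\partial f}{\partial x_2}(x_1, x_2) = 6 x_1^2 x_2^2$$
Second derivatives:
$$\frac{\partial}{\partial x_2}\frac{\partial f}{\partial x_1}(x_1, x_2) = 12 x_1 x_2^2, \qquad \frac{\partial}{\partial x_1}\frac{\partial f}{\partial x_2}(x_1, x_2) = 12 x_1 x_2^2 \quad \Rightarrow \quad \frac{\partial}{\partial x_1}\frac{\partial f}{\partial x_2} = \frac{\partial}{\partial x_2}\frac{\partial f}{\partial x_1}$$

Example 3.22 $\Phi(u, v, w) = uv + \cos w$
$$\Phi_u(u, v, w) = v, \quad \Phi_v(u, v, w) = u, \quad \Phi_w(u, v, w) = -\sin w$$

Definition 3.20 If $f(x) = \begin{pmatrix} f_1(x_1, \dots, x_n) \\ \vdots \\ f_m(x_1, \dots, x_n) \end{pmatrix}$ is partially differentiable in $x = x_0$, i.e. all partial derivatives $\frac{\partial f_i}{\partial x_k}(x_0)$ ($i = 1, \dots, m$, $k = 1, \dots, n$) exist, then the matrix
$$f'(x_0) = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x_0) & \frac{\partial f_1}{\partial x_2}(x_0) & \cdots & \frac{\partial f_1}{\partial x_n}(x_0) \\ \vdots & \vdots & & \vdots \\ \frac{\partial f_m}{\partial x_1}(x_0) & \frac{\partial f_m}{\partial x_2}(x_0) & \cdots & \frac{\partial f_m}{\partial x_n}(x_0) \end{pmatrix}$$
is called Jacobian matrix.

Example 3.23 Linearisation of a function $f : \mathbb{R}^2 \to \mathbb{R}^3$ in $x_0$:
$$f(x) = \begin{pmatrix} 2 x_2 \\ \sin(x_1 + x_2) \\ \ln(x_1) + x_2 \end{pmatrix}, \qquad f'(x) = \begin{pmatrix} 0 & 2 \\ \cos(x_1 + x_2) & \cos(x_1 + x_2) \\ \frac{1}{x_1} & 1 \end{pmatrix}$$
1-dimensional: $f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}$.

Linearisation $g$ of $f$ in $x_0 = \begin{pmatrix} \pi \\ 0 \end{pmatrix}$:
$$g(x_1, x_2) = f(\pi, 0) + f'(\pi, 0)\left[\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} - \begin{pmatrix} \pi \\ 0 \end{pmatrix}\right]$$
$$\Rightarrow g(x_1, x_2) = \begin{pmatrix} 0 \\ 0 \\ \ln \pi \end{pmatrix} + \begin{pmatrix} 0 & 2 \\ -1 & -1 \\ \frac{1}{\pi} & 1 \end{pmatrix} \begin{pmatrix} x_1 - \pi \\ x_2 \end{pmatrix} = \begin{pmatrix} 2 x_2 \\ -x_1 - x_2 + \pi \\ \frac{x_1}{\pi} + x_2 + \ln \pi - 1 \end{pmatrix}$$

Note: For $x \to x_0$, i.e. close to $x_0$, the linearisation $g$ is a good approximation to $f$ (under which condition?).

Example 3.24 We examine the function $f : \mathbb{R}^2 \to \mathbb{R}$ with
$$f(x, y) = \begin{cases} \frac{xy}{\sqrt{x^2 + y^2}} & \text{if } (x, y) \ne (0, 0) \\ 0 & \text{if } (x, y) = (0, 0) \end{cases}$$

Differentiability: $f$ is differentiable on $\mathbb{R}^2 \setminus \{(0, 0)\}$ since it is built up of differentiable functions by sum, product and division.
$$\frac{\partial f}{\partial x}(x, y) = \frac{y}{\sqrt{x^2 + y^2}} - \frac{x^2 y}{(x^2 + y^2)^{3/2}} = \frac{y^3}{(x^2 + y^2)^{3/2}}$$
$$\frac{\partial f}{\partial x}(0, y) = \frac{y^3}{|y|^3} = 1 \text{ for } y > 0, \qquad \frac{\partial f}{\partial x}(x, 0) = 0$$
$$\Rightarrow \lim_{y \searrow 0} \frac{\partial f}{\partial x}(0, y) \ne \lim_{x \to 0} \frac{\partial f}{\partial x}(x, 0)$$
⇒ the partial derivative $\frac{\partial f}{\partial x}$ is not continuous in $(0, 0)$. ⇒ $f$ is in $(0, 0)$ not differentiable.

Symmetries:
1. $f$ is symmetric w.r.t. exchange of $x$ and $y$, i.e. w.r.t. the plane $y = x$.
2. $f$ is symmetric w.r.t. exchange of $x$ and $-y$, i.e. w.r.t. the plane $y = -x$.
3. $f(-x, y) = -f(x, y)$, i.e. $f$ is symmetric w.r.t. the $y$-axis.
4. $f(x, -y) = -f(x, y)$, i.e. $f$ is symmetric w.r.t. the $x$-axis.

Contours:
$$\frac{xy}{\sqrt{x^2 + y^2}} = c \;\Leftrightarrow\; xy = c\sqrt{x^2 + y^2} \;\Rightarrow\; x^2 y^2 = c^2 (x^2 + y^2) \;\Leftrightarrow\; y^2 (x^2 - c^2) = c^2 x^2$$
$$\Rightarrow y = \begin{cases} -\sqrt{\frac{c^2 x^2}{x^2 - c^2}} & \text{if } c > 0, x < 0 \text{ (3. Quadr.) and } c < 0, x > 0 \text{ (4. Quadr.)} \\ +\sqrt{\frac{c^2 x^2}{x^2 - c^2}} & \text{if } c > 0, x > 0 \text{ (1. Quadr.) and } c < 0, x < 0 \text{ (2. Quadr.)} \end{cases}$$

(Figure: signs of $f$ in the quadrants.)

Continuity: $f$ is continuous on $\mathbb{R}^2 \setminus \{(0, 0)\}$, since it is built up of continuous functions by sum, product and division.

Continuity in $(0, 0)$: Let $\varepsilon > 0$ such that $\|x\| = \varepsilon$, i.e. $\varepsilon = \sqrt{x^2 + y^2} \;\Leftrightarrow\; y = \pm\sqrt{\varepsilon^2 - x^2}$
$$\Rightarrow f(x, y) = \pm \frac{x \sqrt{\varepsilon^2 - x^2}}{\varepsilon} = \pm x \sqrt{1 - x^2 / \varepsilon^2}$$
From $|x| \le \varepsilon$ we get $|f(x, y)| \le \varepsilon$ and $\lim_{(x, y) \to (0, 0)} f(x, y) = 0$. Thus $f$ is continuous in $(0, 0)$.

3.5.5.2 The Gradient

Definition 3.21 $f : D \to \mathbb{R}$ ($D \subset \mathbb{R}^n$). The vector
$$\operatorname{grad} f(x) := f'(x)^T = \begin{pmatrix} \frac{\partial f}{\partial x_1}(x) \\ \vdots \\ \frac{\partial f}{\partial x_n}(x) \end{pmatrix}$$
is called gradient of $f$. The gradient of $f$ points in the direction of the steepest ascent of $f$.

Example 3.25 $f(x, y) = x^2 + y^2$
$$\frac{\partial f}{\partial x}(x, y) = 2x, \quad \frac{\partial f}{\partial y}(x, y) = 2y \quad \Rightarrow \quad \operatorname{grad} f(x, y) = \begin{pmatrix} 2x \\ 2y \end{pmatrix} = 2 \begin{pmatrix} x \\ y \end{pmatrix}$$

3.5.5.3 Higher Partial Derivatives

Let $f : D \to \mathbb{R}^m$ ($D \subset \mathbb{R}^n$). Thus $\frac{\partial f}{\partial x_i}(x)$ is again a function mapping from $D$ to $\mathbb{R}^m$ and
$$\frac{\partial}{\partial x_k}\left(\frac{\partial f}{\partial x_i}\right)(x) =: \frac{\partial^2 f}{\partial x_k \partial x_i}(x) = f_{x_i, x_k}(x)$$
is well defined.

Theorem 3.23 Let $D \subset \mathbb{R}^n$ be open and $f : D \to \mathbb{R}^m$ two times continuously partially differentiable.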
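Then we have for all $x_0 \in D$ and all $i, j = 1, \dots, n$
$$\frac{\partial^2 f}{\partial x_i \partial x_j}(x_0) = \frac{\partial^2 f}{\partial x_j \partial x_i}(x_0).$$

Consequence: If $f : D \to \mathbb{R}^m$ ($D \subset \mathbb{R}^n$ open) is $k$-times continuously partially differentiable, then
$$\frac{\partial^k f}{\partial x_{i_k} \cdots \partial x_{i_1}} = \frac{\partial^k f}{\partial x_{i_{\Pi(k)}} \cdots \partial x_{i_{\Pi(1)}}}$$
for any permutation $\Pi$ of the numbers $1, \dots, k$.

3.5.5.4 The Total Differential

If $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable, then the tangential mapping $f_t(x) = f(x_0) + f'(x_0)(x - x_0)$ represents a good approximation to the function $f$ in the neighborhood of $x_0$, which can be seen in
$$f_t(x) - f(x_0) = f'(x_0)(x - x_0).$$
With
$$df(x) := f_t(x) - f(x_0) \quad \text{and} \quad dx = \begin{pmatrix} dx_1 \\ \vdots \\ dx_n \end{pmatrix} := x - x_0$$
we get $df(x) = f'(x_0)\, dx$, or
$$df(x) = \sum_{k=1}^{n} \frac{\partial f}{\partial x_k}(x_0)\, dx_k = \frac{\partial f}{\partial x_1}(x_0)\, dx_1 + \dots + \frac{\partial f}{\partial x_n}(x_0)\, dx_n.$$

Definition 3.22 The linear mapping $df = \sum_{k=1}^{n} \frac{\partial f}{\partial x_k}(x_0)\, dx_k$ is called total differential of the function $f$ in $x_0$.

Note: Since in a neighborhood of $x_0$, $f_t$ is a good approximation of the function $f$, we have for all $x$ close to $x_0$:
$$df(x) \approx f(x) - f(x_0).$$
Thus $df(x)$ gives the approximate deviation of the function value $f(x)$ from $f(x_0)$, when $x$ deviates from $x_0$ a little bit.

3.5.5.5 Application: The Law of Error Propagation

Example 3.26 For a distance of $s = 10$ km a runner needs the time of $t = 30$ min, yielding an average speed of $v = \frac{s}{t} = 20\,\frac{\text{km}}{\text{h}}$. Let the measurement error for the distance $s$ be $\Delta s = \pm 1$ m, and for the time we have $\Delta t = \pm 1$ sec. Give an upper bound on the propagated error $\Delta v$ for the average speed!

This can be solved as follows. To the given measurements $x_1, \dots, x_n$, a function $f : \mathbb{R}^n \to \mathbb{R}$ has to be applied. The measurement error for $x_1, \dots, x_n$ is given as $\pm\Delta x_1, \dots, \pm\Delta x_n$ ($\Delta x_i > 0$ $\forall i = 1, \dots, n$). The law of error propagation gives as a rough upper bound for the error $\Delta f(x)$ of $f(x_1, \dots, x_n)$ the assessment
$$\Delta f(x_1, \dots, x_n) < \left|\frac{\partial f}{\partial x_1}(x)\right| \Delta x_1 + \dots + \left|\frac{\partial f}{\partial x_n}(x)\right| \Delta x_n.$$

Definition 3.23 We call
$$\Delta f_{max}(x_1, \dots, x_n) := \left|\frac{\partial f}{\partial x_1}(x)\right| \Delta x_1 + \dots + \left|\frac{\partial f}{\partial x_n}(x)\right| \Delta x_n$$
the maximum error of $f$. The ratio $\frac{\Delta f_{max}(x)}{f(x)}$ is the relative maximum error.

Note: $\Delta f_{max}$ typically gives a too high estimate for the error of $f$, because this value only occurs if all measurement errors $dx_1, \dots, dx_n$ add up with the same sign. This formula should be applied for about $n < 5$.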
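The maximum error formula is easy to evaluate numerically. The following is a small sketch for the runner of Example 3.26, under the assumption that the measurement errors are $\Delta s = 1$ m and $\Delta t = 1$ sec; the helper name `max_error` is hypothetical and only chosen for illustration.

```python
def max_error(partials, deltas):
    """Maximum error: sum over |df/dx_i| * delta_x_i (hypothetical helper)."""
    return sum(abs(p) * d for p, d in zip(partials, deltas))


# Runner example, all quantities converted to km and h (assumed values).
s, t = 10.0, 0.5             # distance in km, time in h
ds, dt = 0.001, 1 / 3600     # 1 m in km, 1 sec in h

# Partial derivatives of v(s, t) = s / t.
dv_ds = 1 / t                # dv/ds
dv_dt = -s / t**2            # dv/dt

dv_max = max_error([dv_ds, dv_dt], [ds, dt])
print(f"v = {s / t:.0f} km/h, maximum error about {dv_max:.3f} km/h")
```

The time error dominates here: its contribution $\frac{s}{t^2}\Delta t$ is roughly five times larger than the contribution $\frac{1}{t}\Delta s$ of the distance error.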
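Definition 3.24 When the number of measurements $n$ becomes large, a better estimate for the error $\Delta f$ is given by the formula
$$\Delta f_{mean}(x_1, \dots, x_n) = \sqrt{\left(\frac{\partial f}{\partial x_1}(x)\right)^2 \Delta x_1^2 + \dots + \left(\frac{\partial f}{\partial x_n}(x)\right)^2 \Delta x_n^2}$$
for the mean error of $f$.

Example 3.27 Solution for example 3.26. Application of the maximum error formula leads to
$$\Delta v(s, t) = \left|\frac{\partial v}{\partial s}(s, t)\right| \Delta s + \left|\frac{\partial v}{\partial t}(s, t)\right| \Delta t = \left|\frac{1}{t}\right| \Delta s + \left|-\frac{s}{t^2}\right| \Delta t$$
$$= \frac{0.001\ \text{km}}{0.5\ \text{h}} + \frac{10\ \text{km}}{0.25\ \text{h}^2} \cdot \frac{1}{3600}\ \text{h} = \left(0.002 + \frac{40}{3600}\right) \frac{\text{km}}{\text{h}} = 0.013\ \frac{\text{km}}{\text{h}}$$
This can be compactly written as the result $v = (20 \pm 0.013)\,\frac{\text{km}}{\text{h}}$.

Definition 3.25 Let $f : D \to \mathbb{R}$ be two times continuously differentiable. The $n \times n$ matrix
$$(\operatorname{Hess} f)(x) := \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2}(x) & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n}(x) \\ \vdots & & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1}(x) & \cdots & \frac{\partial^2 f}{\partial x_n^2}(x) \end{pmatrix}$$
is the Hessian matrix of $f$ in $x$.

Note: $\operatorname{Hess} f$ is symmetric, since
$$\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}.$$

3.5.6 Extrema without Constraints

Again we appeal to your memories of one-dimensional analysis: How do you determine extrema of a function $f : \mathbb{R} \to \mathbb{R}$? This is just a special case of what we do now.

Definition 3.26 Let $D \subset \mathbb{R}^n$ and $f : D \to \mathbb{R}$ a function. A point $x \in D$ is a local maximum (minimum) of $f$, if there is a neighborhood $U \subset D$ of $x$ such that
$$f(x) \ge f(y) \quad (f(x) \le f(y)) \quad \forall y \in U.$$
Analogously, we have an isolated local maximum (minimum) in $x$, if there is a neighborhood $U \subset D$ of $x$ such that
$$f(x) > f(y) \quad (\text{resp. } f(x) < f(y)) \quad \forall y \in U, \; y \ne x.$$
All these points are called extrema. If the mentioned neighborhood $U$ of an extremum is the whole domain, i.e. $U = D$, then the extremum is global.

Give all local, global, isolated and non-isolated maxima and minima of the function shown in the following graphs:

Plot3D[f[x,y], {x,-5,5}, {y,-5,5}, PlotPoints -> 30]

ContourPlot[f[x,y], {x,-5,5}, {y,-5,5}, PlotPoints -> 60, ContourSmoothing -> True, ContourShading -> False]

Theorem 3.24 Let $D \subset \mathbb{R}^n$ be open and $f : D \to \mathbb{R}$ partially differentiable. If $f$ has a local extremum in $x \in D$, then $\operatorname{grad} f(x) = 0$.

Proof: Reduction on the 1-dim. case: For $i = 1, \dots, n$ define $g_i(h) := f(x_1, \dots, x_i + h, \dots, x_n)$. If $f$ has a local extremum in $x$, then all $g_i$ have a local extremum in 0.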
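Thus we have for all $i$: $g_i'(0) = 0$. Since $g_i'(0) = \frac{\partial f}{\partial x_i}(x)$ we get
$$\operatorname{grad} f(x) = \begin{pmatrix} \frac{\partial f}{\partial x_1}(x) \\ \vdots \\ \frac{\partial f}{\partial x_n}(x) \end{pmatrix} = 0.$$

Note:
• Theorem 3.24 represents a necessary condition for local extrema.
• Why is the proposition of Theorem 3.24 false if $D \subset \mathbb{R}^n$ is no open set?

Linear Algebra Reminder:

Definition 3.27 Let $A$ be a symmetric $n \times n$ matrix of real numbers.
$A$ is positive (negative) definite, if all eigenvalues of $A$ are positive (negative).
$A$ is positive (negative) semidefinite, if all eigenvalues are $\ge 0$ ($\le 0$).
$A$ is indefinite, if all eigenvalues are $\ne 0$ and there exist positive as well as negative eigenvalues.

Theorem 3.25 (Criterion of Hurwitz) Let $A$ be a real valued symmetric matrix. $A$ is positive definite, if and only if for $k = 1, \dots, n$
$$\begin{vmatrix} a_{11} & \cdots & a_{1k} \\ \vdots & & \vdots \\ a_{k1} & \cdots & a_{kk} \end{vmatrix} > 0.$$
$A$ is negative definite if and only if $-A$ is positive definite.

Theorem 3.26 For $D \subset \mathbb{R}^n$ open and two times continuously differentiable $f : D \to \mathbb{R}$ with $\operatorname{grad} f(x) = 0$ for $x \in D$ the following holds:
a) $(\operatorname{Hess} f)(x)$ positive definite ⇒ $f$ has in $x$ an isolated minimum
b) $(\operatorname{Hess} f)(x)$ negative definite ⇒ $f$ has in $x$ an isolated maximum
c) $(\operatorname{Hess} f)(x)$ indefinite ⇒ $f$ has in $x$ no local extremum.

Note: Theorem 3.26 is void if $(\operatorname{Hess} f)(x)$ is positive or negative semidefinite.

Procedure for the application of theorems 3.24 and 3.26 to search local extrema of a function $f : (D \subset \mathbb{R}^n) \to \mathbb{R}$:
1. Computation of $\operatorname{grad} f$
2. Computation of the zeros of $\operatorname{grad} f$
3. Computation of the Hessian matrix $\operatorname{Hess} f$
4. Evaluation of $\operatorname{Hess} f(x)$ for all zeros $x$ of $\operatorname{grad} f$

Example 3.28 Some simple functions $f : \mathbb{R}^2 \to \mathbb{R}$:

1. $f(x, y) = x^2 + y^2 + c$
$$\operatorname{grad} f(x, y) = \begin{pmatrix} 2x \\ 2y \end{pmatrix} \;\Rightarrow\; \operatorname{grad} f(0, 0) = \begin{pmatrix} 0 \\ 0 \end{pmatrix} = 0, \qquad \operatorname{Hess} f = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$$
is positive definite on all $\mathbb{R}^2$ ⇒ $f$ has an isolated local minimum in 0 (paraboloid).

2. $f(x, y) = -x^2 - y^2 + c$
$$\operatorname{grad} f(0, 0) = 0, \qquad \operatorname{Hess} f = \begin{pmatrix} -2 & 0 \\ 0 & -2 \end{pmatrix}$$
⇒ isolated local maximum in 0 (paraboloid).

3. $f(x, y) = ax + by + c$, $a, b \ne 0$
$$\operatorname{grad} f = \begin{pmatrix} a \\ b \end{pmatrix} \ne 0 \quad \forall x \in \mathbb{R}^2$$
⇒ no local extremum.

4. […] ⇒ $\operatorname{Hess} f$ indefinite ⇒ $f$ has no local extremum.

5. $f(x, y) = x^2 + y^4$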
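$$\operatorname{grad} f = \begin{pmatrix} 2x \\ 4y^3 \end{pmatrix} \;\Rightarrow\; \operatorname{grad} f(0, 0) = 0, \qquad \operatorname{Hess} f(0, 0) = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}$$
⇒ $\operatorname{Hess} f$ positive semidefinite, but $f$ has in 0 an isolated minimum.

6. $f(x, y) = x^2$
$$\operatorname{grad} f = \begin{pmatrix} 2x \\ 0 \end{pmatrix} \;\Rightarrow\; \operatorname{grad} f(0, y) = 0, \qquad \operatorname{Hess} f(0, 0) = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}$$
⇒ $\operatorname{Hess} f$ positive semidefinite, but $f$ has a (non isolated) local minimum. All points on the $y$-axis ($x = 0$) are local minima.

7. $f(x, y) = x^2 + y^3$
$$\operatorname{grad} f(x, y) = \begin{pmatrix} 2x \\ 3y^2 \end{pmatrix} \;\Rightarrow\; \operatorname{grad} f(0, 0) = 0, \qquad \operatorname{Hess} f(0, 0) = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}$$
⇒ $\operatorname{Hess} f$ positive semidefinite, but $f$ has no local extremum.

3.5.7 Extrema with Constraints

Example 3.29 Which rectangle (length $x$, width $y$) has maximal area, given the perimeter $U$?

Area $f(x, y) = xy$. The function $f(x, y)$ has no local maximum on $\mathbb{R}^2$!

Constraint: $U = 2(x + y)$ or $x + y = \frac{U}{2}$, substituted in $f(x, y) = xy$:
$$g(x) := f\left(x, \frac{U}{2} - x\right) = \frac{U}{2}x - x^2, \qquad g'(x) = \frac{U}{2} - 2x = 0 \;\Rightarrow\; x = \frac{U}{4} \;\Rightarrow\; y = \frac{U}{4}, \qquad g''(x) = -2$$
⇒ $x = y = U/4$ is (the unique) maximum of the area for constant perimeter $U$!

In many cases substitution of constraints is not feasible!

Wanted: Extremum of a function $f(x_1, \dots, x_n)$ under the $p$ constraints
$$h_1(x_1, \dots, x_n) = 0, \quad \dots, \quad h_p(x_1, \dots, x_n) = 0.$$

Theorem 3.27 Let $f : D \to \mathbb{R}$ and $h : D \to \mathbb{R}^p$ be continuously differentiable functions on an open set $D \subset \mathbb{R}^n$, $n > p$, and let the matrix $h'(x)$ have rank $p$ for all $x \in D$. If $x_0 \in D$ is an extremum of $f$ under the constraint(s) $h(x) = 0$, there exist real numbers $\lambda_1, \dots, \lambda_p$ with
$$\frac{\partial f}{\partial x_i}(x_0) + \sum_{k=1}^{p} \lambda_k \frac{\partial h_k}{\partial x_i}(x_0) = 0 \quad \forall i = 1, \dots, n$$
and $h_k(x_0) = 0$ $\forall k = 1, \dots, p$.

Illustration: For $p = 1$, i.e. only one given constraint, the theorem implies that for an extremum $x_0$ of $f$ under the constraint $h(x_0) = 0$ we have
$$\operatorname{grad} f(x_0) + \lambda \operatorname{grad} h(x_0) = 0.$$
• $\operatorname{grad} f$ and $\operatorname{grad} h$ are parallel in the extremum $x_0$!
• ⇒ Contours of $f$ and $h$ for $h(x) = 0$ are parallel in $x_0$.
• The numbers $\lambda_1, \dots, \lambda_p$ are the Lagrange multipliers.

Note: We have to solve $n + p$ equations with $n + p$ unknowns. Among the solutions of this (possibly nonlinear) system the extrema have to be determined. Not all solutions need to be extrema of $f$ under the constraint(s) $h(x_0) = 0$ (necessary but not sufficient condition for extrema).

Definition 3.28 Let $f, h$ be given as in theorem 3.27. The function $L : D \to \mathbb{R}$,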
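$$L(x_1, \dots, x_n) = f(x_1, \dots, x_n) + \sum_{k=1}^{p} \lambda_k h_k(x_1, \dots, x_n)$$
is called Lagrange function.

Conclusion: The equations to be solved in theorem 3.27 can be represented as:
$$\frac{\partial L}{\partial x_i}(x) = 0 \quad (i = 1, \dots, n), \qquad h_k(x) = 0 \quad (k = 1, \dots, p)$$

Example 3.30 Extrema of $f(x, y) = x^2 + y^2 + 3$ under the constraint $h(x, y) = x^2 + y - 2 = 0$.

(Figure: contours of $x^2 + y^2 + 3$ and the constraint $x^2 + y - 2 = 0$.)

$$L(x, y) = x^2 + y^2 + 3 + \lambda(x^2 + y - 2)$$
$$\frac{\partial L}{\partial x}(x, y) = 2x + 2\lambda x, \qquad \frac{\partial L}{\partial y}(x, y) = 2y + \lambda$$
$\operatorname{grad} L(x, y) = 0$, $h(x, y) = 0$:
(1) $2x + 2\lambda x = 0$
(2) $2y + \lambda = 0$
(3) $x^2 + y - 2 = 0$
(2) in (1): $2x - 4xy = 0$ (4)
(3) ⇒ $y = 2 - x^2$ (3a)
(3a) in (4): $2x - 4x(2 - x^2) = 0$

First solution: $x_1 = \begin{pmatrix} 0 \\ 2 \end{pmatrix}$ is a maximum.
$$2 - 8 + 4x^2 = 0 \;\Rightarrow\; 4x^2 = 6 \;\Rightarrow\; x_{2,3} = \pm\sqrt{\frac{3}{2}}$$
$x_2 = \begin{pmatrix} \sqrt{3/2} \\ 1/2 \end{pmatrix}$ and $x_3 = \begin{pmatrix} -\sqrt{3/2} \\ 1/2 \end{pmatrix}$ are minima.

Example 3.31 Extrema of the function $f(x, y) = 4x^2 - 3xy$ on the disc $\bar{K}_{0,1} = \{(x, y) \mid x^2 + y^2 \le 1\}$.

1. Local extrema inside the disc:
$$\operatorname{grad} f(x, y) = \begin{pmatrix} 8x - 3y \\ -3x \end{pmatrix} = 0 \;\Rightarrow\; x = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \text{ is the unique zero of the gradient.}$$
$$\operatorname{Hess} f = \begin{pmatrix} 8 & -3 \\ -3 & 0 \end{pmatrix}, \qquad |8| = 8 > 0, \quad \begin{vmatrix} 8 & -3 \\ -3 & 0 \end{vmatrix} = 0 - 9 = -9$$
⇒ $\operatorname{Hess} f$ is neither positive nor negative definite.

Eigenvalues of $\operatorname{Hess} f =: A$:
$$Ax = \lambda x \;\Leftrightarrow\; (A - \lambda)x = 0 \;\Leftrightarrow\; \det \begin{pmatrix} 8 - \lambda & -3 \\ -3 & -\lambda \end{pmatrix} = 0$$
$$\Leftrightarrow (8 - \lambda)(-\lambda) - 9 = 0 \;\Leftrightarrow\; \lambda^2 - 8\lambda - 9 = 0, \qquad \lambda_{1,2} = 4 \pm \sqrt{16 + 9}, \quad \lambda_1 = 9, \; \lambda_2 = -1$$
⇒ $\operatorname{Hess} f$ is indefinite ⇒ $f$ has no local extremum on any open set $D$ ⇒ in particular $f$ has no extremum inside the disc!

2. Local extrema on the margin, i.e. on $\partial \bar{K}_{0,1}$: local extrema of $f(x, y) = 4x^2 - 3xy$ under the constraint $x^2 + y^2 - 1 = 0$.

Lagrange function $L = 4x^2 - 3xy + \lambda(x^2 + y^2 - 1)$
$$\frac{\partial L}{\partial x} = 8x - 3y + 2\lambda x = (2\lambda + 8)x - 3y, \qquad \frac{\partial L}{\partial y} = -3x + 2\lambda y$$
Equations for $x, y, \lambda$:
(1) $8x - 3y + 2\lambda x = 0$
(2) $-3x + 2\lambda y = 0$
(3) $x^2 + y^2 - 1 = 0$
(1)·$y$ − (2)·$x$ = (4): $8xy - 3y^2 + 3x^2 = 0$
(3) ⇒ (3a): $y = \pm\sqrt{1 - x^2}$
(3a) in (4): $\pm 8x\sqrt{1 - x^2} - 3(1 - x^2) + 3x^2 = 0$
Subst. $x^2 = u$: $\pm 8\sqrt{u}\sqrt{1 - u} = 3(1 - u) - 3u = 3 - 6u$
Squaring: $64u(1 - u) = 9 - 36u + 36u^2$
$$-64u^2 + 64u - 36u^2 + 36u - 9 = 0$$
$$-100u^2 + 100u - 9 = 0 \;\Rightarrow\; u_{1,2} = \frac{1}{2} \pm \frac{2}{5}, \quad u_1 = 0.1, \; u_2 = 0.9$$

(Figure: contours of $f$.)

⇒ $f(x, y)$ has on $\bar{K}_{0,1}$ in $x_1 = \begin{pmatrix} 3/\sqrt{10} \\ -1/\sqrt{10} \end{pmatrix}$ and $x_2 = \begin{pmatrix} -3/\sqrt{10} \\ 1/\sqrt{10} \end{pmatrix}$ isolated local maxima.
⇒ $f(x, y)$ has on $\bar{K}_{0,1}$ in $x_3 = \begin{pmatrix} 1/\sqrt{10} \\ 3/\sqrt{10} \end{pmatrix}$ and $x_4 = \begin{pmatrix} -1/\sqrt{10} \\ -3/\sqrt{10} \end{pmatrix}$ isolated local minima.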
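The result for the margin in Example 3.31 can be cross-checked numerically without the Lagrange formalism: on the unit circle the constraint is satisfied automatically by the parametrisation $x = \cos\varphi$, $y = \sin\varphi$, so one can simply scan $\varphi$. This is only a brute-force sanity check under that assumed parametrisation, not the method of the notes; the grid size is an arbitrary choice.

```python
import math


def f(x, y):
    """Objective of the example: f(x, y) = 4x^2 - 3xy."""
    return 4 * x**2 - 3 * x * y


# Sample the unit circle x^2 + y^2 = 1 on a fine grid of angles.
n = 100_000
values = []
for k in range(n):
    phi = 2 * math.pi * k / n
    x, y = math.cos(phi), math.sin(phi)
    values.append((f(x, y), x, y))

f_max, x_max, y_max = max(values)
f_min, x_min, y_min = min(values)

print(f"max f = {f_max:.4f} near ({x_max:.4f}, {y_max:.4f})")
print(f"min f = {f_min:.4f} near ({x_min:.4f}, {y_min:.4f})")
```

The scan reports one of the two maxima and one of the two minima; the others follow from the symmetry $f(-x, -y) = f(x, y)$.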
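3.5.7.1 The Bordered Hessian

In order to check whether a candidate point for a constrained extremum is a maximum or minimum, we need a sufficient condition, similarly to the definiteness of the Hessian in the unconstrained case. Here we need the Bordered Hessian
$$\overline{\operatorname{Hess}} := \begin{pmatrix} 0 & \cdots & 0 & \frac{\partial h_1}{\partial x_1} & \cdots & \frac{\partial h_1}{\partial x_n} \\ \vdots & & \vdots & \vdots & & \vdots \\ 0 & \cdots & 0 & \frac{\partial h_p}{\partial x_1} & \cdots & \frac{\partial h_p}{\partial x_n} \\ \frac{\partial h_1}{\partial x_1} & \cdots & \frac{\partial h_p}{\partial x_1} & \frac{\partial^2 L}{\partial x_1^2} & \cdots & \frac{\partial^2 L}{\partial x_1 \partial x_n} \\ \vdots & & \vdots & \vdots & & \vdots \\ \frac{\partial h_1}{\partial x_n} & \cdots & \frac{\partial h_p}{\partial x_n} & \frac{\partial^2 L}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 L}{\partial x_n^2} \end{pmatrix}$$
This matrix can be used to check on local minima and maxima by computing certain subdeterminants. Here we show this only for the two dimensional case with one constraint where the bordered Hessian has the form
$$\overline{\operatorname{Hess}} = \begin{pmatrix} 0 & \frac{\partial h}{\partial x} & \frac{\partial h}{\partial y} \\ \frac{\partial h}{\partial x} & \frac{\partial^2 L}{\partial x^2} & \frac{\partial^2 L}{\partial x \partial y} \\ \frac{\partial h}{\partial y} & \frac{\partial^2 L}{\partial y \partial x} & \frac{\partial^2 L}{\partial y^2} \end{pmatrix}$$
and the sufficient criterion for local extrema is (in contrast to the unconstrained case!) the following simple determinant condition: Under the constraint $h(x, y) = 0$ the function $f$ has in $(x, y)$ a
• local maximum, if $|\overline{\operatorname{Hess}}(x, y)| > 0$
• local minimum, if $|\overline{\operatorname{Hess}}(x, y)| < 0$.
If $|\overline{\operatorname{Hess}}(x, y)| = 0$, we cannot decide on the properties of the stationary point $(x, y)$.

Application to example 3.30 yields
$$\operatorname{grad} L(x, y) = \begin{pmatrix} 2x(1 + \lambda) \\ 2y + \lambda \end{pmatrix}, \qquad \overline{\operatorname{Hess}}(x, y) = \begin{pmatrix} 0 & 2x & 1 \\ 2x & 2(1 + \lambda) & 0 \\ 1 & 0 & 2 \end{pmatrix}$$
Substitution of the first solution of $\operatorname{grad} L = 0$, which is $x = 0$, $y = 2$, $\lambda = -4$, into this matrix gives
$$|\overline{\operatorname{Hess}}(0, 2)| = \begin{vmatrix} 0 & 0 & 1 \\ 0 & -6 & 0 \\ 1 & 0 & 2 \end{vmatrix} = 6 > 0,$$
which proves that we indeed have a maximum in $(0, 2)$.

3.5.7.2 Extrema under Inequality Constraints

We recall that for finding an extremum of a function $f(x)$ under the constraint $g(x) = 0$, we have to find a stationary point of the Lagrange function
$$L(x, \lambda) = f(x) + \lambda g(x)$$
by solving
$$\operatorname{grad} L(x, \lambda) = 0.$$
In example 3.31 we had to find an extremum under an inequality constraint. We now want to develop a general method for finding a maximum (minimum) of a function $f(x)$ under a constraint $g(x) \ge 0$ as shown in figure 3.1.

(Figure 3.1: $\operatorname{grad} f(x)$ at the margin $g(x) = 0$.)

As in the example we have to consider two cases. Either $g(x) > 0$, i.e. the constraint is inactive, or $g(x) = 0$ and the constraint is active. If $g(x) > 0$ the condition for an extremum is simply $\operatorname{grad} f(x) = 0$.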
This is equivalent to solving $\operatorname{grad} L(x, \lambda) = 0$ with $\lambda = 0$.

If the solution lies on the margin, i.e. $g(x) = 0$, we can apply the Lagrange formalism as shown above and get $\lambda \ne 0$. Now $f(x)$ will only have its maximum on the margin if $\lambda > 0$. If $\lambda < 0$, $f$ will be at a minimum on the margin (c.f. exercise 3.32). Thus, for finding a maximum of $f(x)$ under the constraint $g(x) \ge 0$, in both cases we have $\lambda g(x) = 0$.

Thus, to find a maximum of $f(x)$ under the constraint $g(x) \ge 0$, we have to maximize the Lagrange function with respect to $x$ and $\lambda$ under the so called Karush-Kuhn-Tucker conditions
$$g(x) \ge 0, \qquad \lambda \ge 0, \qquad \lambda g(x) = 0.$$

3.6 Exercises

Sequences, Series, Continuity

Exercise 3.1 Prove (e.g. with complete induction) that for $p \in \mathbb{R}$ it holds:
$$\sum_{k=0}^{n} (p + k) = \frac{(n + 1)(2p + n)}{2}$$

Exercise 3.2
a) Calculate $\sqrt{1 + \sqrt{1 + \sqrt{1 + \dots}}}$, i.e. the limit of the sequence $(a_n)_{n \in \mathbb{N}}$ with $a_0 = 1$ and $a_{n+1} = \sqrt{1 + a_n}$. Give an exact solution as well as an approximation with a precision of 10 decimal places.
b) Prove that the sequence $(a_n)_{n \in \mathbb{N}}$ converges.

Exercise 3.3 Calculate $1 + \cfrac{1}{1 + \cfrac{1}{1 + \dots}}$, i.e. the limit of the sequence $(a_n)_{n \in \mathbb{N}}$ with $a_0 = 1$ and $a_{n+1} = 1 + 1/a_n$. Give an exact solution as well as an approximation with a precision of 10 decimal places.

Exercise 3.4 Calculate the number of possible draws in the German lottery which result in having three correct numbers. In the German lottery, 6 balls are drawn out of 49. The 49 balls are numbered from 1 to 49. A drawn ball is not put back into the pot. In each lottery ticket field, the player chooses 6 numbers out of 49. Then, what is the probability to have three correct numbers?

Exercise 3.5 Investigate the sequence $(a_n)_{n \in \mathbb{N}}$ with $a_n := 1 + […]$ regarding convergence.

Exercise 3.6 Calculate the infinite sum […].

Exercise 3.7 Prove: A series $\sum_{k=0}^{\infty} a_k$ with $\forall k : a_k \ge 0$ converges if and only if the sequence of the partial sums is bounded.
Exercise 3.8 Calculate an approximation (if possible) for the following series and investigate their convergence.
a) […]
b) […]

Exercise 3.9 Investigate the following functions $f : \mathbb{R} \to \mathbb{R}$ regarding continuity (give an outline for each graph): […]

Exercise 3.10 Show that $f : \mathbb{R} \to \mathbb{R}$ with
$$f(x) = \begin{cases} 0 & \text{if } x \text{ rational} \\ 1 & \text{if } x \text{ irrational} \end{cases}$$
is not continuous in any point.

Taylor-Series

Exercise 3.11 Calculate the Taylor series of sine and cosine with $x_0 = 0$. Prove that the Taylor series of sine converges towards the sine function.

Exercise 3.12 Try to expand the function $f(x) = \sqrt{x}$ at $x_0 = 0$ and $x_0 = 1$ into a Taylor series. Report about possible problems.

Exercise 3.13 Let $f$ be expandable into a Taylor series on the interval $(-r, r)$ around 0 ($r > 0$). Prove:
a) If $f$ is an even function ($f(x) = f(-x)$) for all $x \in (-r, r)$, then only even exponents appear in the Taylor series of $f$, it has the form $\sum_{k=0}^{\infty} a_{2k} x^{2k}$.
b) If $f$ is an odd function ($f(x) = -f(-x)$) for all $x \in (-r, r)$, then only odd exponents appear in the Taylor series of $f$, it has the form $\sum_{k=0}^{\infty} a_{2k+1} x^{2k+1}$.

Exercise 3.14 Calculate the Taylor series of the function
$$f(x) = \begin{cases} e^{-\frac{1}{x^2}} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$$
at $x_0 = 0$ and analyse the series for convergence. Justify the result!

Exercise 3.15 Calculate the Taylor series of the function arctan in $x_0 = 0$. Use the result for the approximate calculation of $\pi$. (Use for this for example $\tan(\pi/4) = 1$.)

Functions from $\mathbb{R}^n$ to $\mathbb{R}^m$

Exercise 3.16 Prove that the dot product of a vector $x$ with itself is equal to the square of its length (norm).

Exercise 3.17
a) Give a formal definition of the function $f : \mathbb{R} \to \mathbb{R}^+ \cup \{0\}$ with $f(x) = |x|$.
b) Prove that for all real numbers $x, y$: $|x + y| \le |x| + |y|$.

Exercise 3.18
a) In industrial production in the quality control, components are measured and the values $x_1, \dots, x_n$ determined.
The vector $d = x - s$ indicates the deviation of the measurements from the nominal values $s_1, \dots, s_n$. Now define a norm on $\mathbb{R}^n$ such that $\|d\| < \varepsilon$ holds, iff all deviations from the nominal value are less than a given tolerance $\varepsilon$.
b) Prove that the norm defined in a) satisfies all axioms of a norm.

Exercise 3.19 Draw the graph of the following functions $f : \mathbb{R}^2 \to \mathbb{R}$ (first manually and then by the computer!): $f_1(x, y) = […]$, $f_2(x, y) = […]$, $f_3(x, y) = […]$

Exercise 3.20 Calculate the partial derivatives $\frac{\partial f}{\partial x_1}$, $\frac{\partial f}{\partial x_2}$, $\frac{\partial f}{\partial x_3}$ of the following functions $f : \mathbb{R}^3 \to \mathbb{R}$:
a) $f(x) = |x|$
b) […]
c) […]
d) […]
e) $f(x) = \sin(x_1 + a x_2)$

Exercise 3.21 Build a function $f : \mathbb{R}^2 \to \mathbb{R}$, which generates roughly the following graph:

Plot3D[f[x,y], {x,-5,5}, {y,-5,5}, PlotPoints -> 30]

ContourPlot[f[x,y], {x,-5,5}, {y,-5,5}, PlotPoints -> 60, ContourSmoothing -> True, ContourShading -> False]

Exercise 3.22 Calculate the Jacobian matrix of the function $f(x_1, x_2, x_3) = […]$.

Exercise 3.23 For $f(x, y) = […]$, find the tangent plane at $x_0 = […]$.

Exercise 3.24 Draw the graph of the function
$$f(x, y) = \begin{cases} y\left(1 + \cos\frac{\pi x}{y}\right) & \text{for } |y| > |x| \\ 0 & \text{else} \end{cases}$$
Show that $f$ is continuous and partially differentiable in $\mathbb{R}^2$, but not in $0$.

Exercise 3.25 Calculate the gradient of the function $f(x, y) = […]$ and draw it as an arrow at different places in a contour lines image of $f$.

Exercise 3.26 The viscosity $\eta$ of a liquid is to be determined with the formula $K = 6\pi\eta v r$. Measured: $r = 3$ cm, $v = 5$ cm/sec, $K = 1000$ dyn. Measurement error: $|\Delta r| \le 0.1$ cm, $|\Delta v| \le 0.003$ cm/sec, $|\Delta K| \le 0.1$ dyn. Determine the viscosity $\eta$ and its error $\Delta\eta$.

Extrema

Exercise 3.27 Examine the following functions for extrema and specify whether it is a local, global, or an isolated extremum:
a) $f(x, y) = […]$
b) $g(x, y) = x^k + (x + y)^2$ ($k = 0, 3, 4$)

Exercise 3.28 Given the function $f : \mathbb{R}^2 \to \mathbb{R}$, $f(x, y) = (y - x^2)(y - 3x^2)$.
a) Calculate $\operatorname{grad} f$ and show: $\operatorname{grad} f(x, y) = 0 \Leftrightarrow x = y = 0$.
b) Show that $(\operatorname{Hess} f)(0)$ is semi-definite and that $f$ has an isolated minimum on each straight line through 0.
c) Nevertheless, $f$ has no local extremum at 0 (to be shown!).

Exercise 3.29 Given the functions $\Phi(x, y) = y^2 x - x^3$, $f(x, y) = x^2 + y^2 - 1$.
a) Examine $\Phi$ for extrema.
b) Sketch all contour lines $h = 0$ of $\Phi$.
c) Examine $\Phi$ for local extrema under the constraint $f(x, y) = 0$.

Exercise 3.30 The function
$$f(x, y) = \frac{\sin(2x^2 + 3y^2)}{x^2 + y^2}$$
has at $(0, 0)$ a definition gap. This can be remedied easily by defining e.g. $f(0, 0) := […]$.
a) Show that $f$ is continuous on all $\mathbb{R}^2$ except at $(0, 0)$. Is it possible to define the function at the origin so that it is continuous?
b) Calculate all local extrema of the function $f$ and draw (sketch) a contour line image (not easy).
c) Determine the local extrema under the constraint (not easy):
i) $x = 0.1$
ii) $y = 0.1$
iii) $x^2 + y^2 = […]$

Exercise 3.31 Show that $\operatorname{grad}(fg) = g \operatorname{grad} f + f \operatorname{grad} g$.

Exercise 3.32 Prove: When searching for an extremum of a function $f(x)$ under an inequality constraint $g(x) \ge 0$, the function $f(x)$ will only have its maximum on the margin $g(x) = 0$ if $\lambda > 0$ in the equation $\operatorname{grad} L(x, \lambda) = 0$. If $\lambda < 0$, $f$ will be at a minimum on the margin.

Chapter 4 Statistics and Probability Basics

Based on samples, statistics deals with the derivation of general statements on certain features.

4.1 Recording Measurements in Samples

Discrete feature: finite amount of values.
Continuous feature: values in an interval of real numbers.

Definition 4.1 Let $X$ be a feature (or random variable). A series of measurements $x_1, \dots, x_n$ for $X$ is called a sample of the length $n$.

Example 4.1 For the feature $X$ (grades of the exam Mathematics I in WS 97/98) the following sample has been recorded:

1.0 1.3 2.2 2.2 2.2 2.5 2.9 2.9 2.9 2.9 2.9 2.9 2.9 3.0 3.0 3.0 3.3 3.3 3.4 3.7 3.9 3.9 4.1 4.7

Let $g(x)$ be the absolute frequency of the value $x$. Then
$$h(x) = \frac{1}{n}\, g(x)$$
is called relative frequency or empirical density of $X$.
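Absolute and relative frequencies are straightforward to compute from a recorded sample. The sketch below is only an illustration: the list `sample` is an assumption, reconstructed from the frequency table of Example 4.1 (24 grades), and the variable names `g` and `h` simply mirror the notation of the definition.

```python
from collections import Counter

# Grades of Example 4.1 (reconstructed from the frequency table, 24 values).
sample = (
    [1.0, 1.3, 2.2, 2.2, 2.2, 2.5]
    + [2.9] * 7
    + [3.0] * 3
    + [3.3, 3.3, 3.4, 3.7, 3.9, 3.9, 4.1, 4.7]
)

n = len(sample)                          # sample length
g = Counter(sample)                      # absolute frequency g(x)
h = {x: g[x] / n for x in sorted(g)}     # relative frequency h(x) = g(x) / n

for x in h:
    print(f"{x:.1f} | {g[x]} | {h[x]:.3f}")
```

By construction the relative frequencies sum to 1, which is a useful check after tabulating a sample by hand.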
Grade X | Absolute frequency g(a) | Relative frequency A(z) | TO T 0.042 13 1 0.042 22 3 0.13 25 1 0.042 29 7 0.29 30 3 0.13 33 2 0.083 34 1 0.042 37 1 0.042 39 2 0.083 41 1 0.042 47 1 0.042 The content of this chapter is strongly leaned on [2]. Therefore, [2] isthe ideal book to read4.1 Recording Measurements in Samples 63 Izy < 22 2 variables, the data But one can determine the covariances between two Thus all eigenvalues are If dependencies among different variables are to be compared, a4.3. Multidimensional Samples 67 Initial Math-Test Results of this Lecture aS ee Correlation of Pre-Test and Initial Math-Test CORRELATIONS, MM: 0.38 (blue), EEM: 0.15 (rec!) a £ ° = 7 2 ak 4 ¢ Bell Peal eal Bh ee lh, tle Ponts in presence tast Appendicitis database of 473 patients Example 4.3 In a medical database of 473 patients*with a surgical removal of their ap- pendix, 15 different symptoms as well as the diagnosis (appendicitis negative/positive) have "Tike data was obtained fom the hospital 14 Nothelfer in Weingarten with the friendly assistance of Dr. Ramp. Mr. Kuchelmeister used the data for the development of an expert system in his diploma thesis68 4. Statisties and Probability Basies been recorded, age gender_( D pain quadranti_(Omnein__ pain_quadrant?_(O-nein, pain_quadrant3_(O-nein, pain_quadrant4_(O=nein. guarding. (O=nein..1=ja) rebound tenderness_(0=nein_1=}a) pein.on_tapping__(O=nein_.1=}a) vibration_(O-nein_1=ja) rectal_pain_(Onein__1=ja) ‘tonp_ax tonp_re: Leukocytes: anitus_( appendicivis_ ‘The first 3 data records w% 10010101 w20010101 wm io0010000 10 10 00 Correlation matrix for the data of all 473 patients: continuous continuous continuous continuous ot on 37.9 38.8 23100 0 1 36.9 37.4 8100 0 0 36.7 36.9 9600 0 1 Loon 01 bows 034 0x7 os Loos 37 TON OT Toe TORO OUS Uo OUT OU OMT OOS OT OOO 1. 00074-0019 Dos O.968-017 O.0084-0.17 “O14 018 DOT DOM Oe O01 02 008-0081 -024 “1. 
The matrix structure is more apparent if the numbers are illustrated as a density plot.* In the left diagram, bright stands for positive and dark for negative. The right plot shows the absolute values. Here, white stands for a strong correlation between two variables and black for no correlation.

* The first two images have been rotated by 90°. Therefore, the fields in the density plot correspond to the matrix elements.

It is clearly apparent that most of the variable pairs have no or only a very low correlation, whereas the two temperature variables are highly correlated.

4.4 Probability Theory

The purpose of probability theory is to determine the probability of certain possible events within an experiment.

Example 4.4 When throwing a die once, the probability for the event "throwing a six" is 1/6, whereas the probability for the event "throwing an odd number" is 1/2.

Definition 4.4 Let $\Omega$ be the set of possible outcomes of an experiment. Each $\omega \in \Omega$ stands for a possible outcome of the experiment.
If the $\omega_i \in \Omega$ exclude each other, but cover all possible outcomes, they are called elementary events.

Example 4.5 When throwing a die once, $\Omega = \{1, 2, 3, 4, 5, 6\}$, because no two of these events can occur at the same time. Throwing an even number $\{2, 4, 6\}$ is not an elementary event, and neither is throwing a number lower than 5, $\{1, 2, 3, 4\}$, because $\{2, 4, 6\} \cap \{1, 2, 3, 4\} = \{2, 4\} \ne \emptyset$.

Definition 4.5 Let $\Omega$ be a set of elementary events. $\bar{A} = \Omega \setminus A = \{\omega \in \Omega \mid \omega \notin A\}$ is called the complementary event to $A$. A subset $\mathcal{A}$ of $2^\Omega$ is called event algebra over $\Omega$, if
1. $\Omega \in \mathcal{A}$.
2. With $A$, $\bar{A}$ is also in $\mathcal{A}$.
3. If $(A_n)_{n \in \mathbb{N}}$ is a sequence in $\mathcal{A}$, then $\bigcup_{n=1}^{\infty} A_n$ is also in $\mathcal{A}$.

Every event algebra contains the sure event $\Omega$ as well as the impossible event $\emptyset$.
When throwing a die, one could choose $\mathcal{A} = 2^\Omega$ and $\Omega = \{1, 2, 3, 4, 5, 6\}$. Thus $\mathcal{A}$ contains any possible event of a throw.
If one is only interested in throwing a six, one would consider $A = \{6\}$ and $\bar{A} = \{1, 2, 3, 4, 5\}$ only, where the algebra results in $\mathcal{A} = \{\emptyset, A, \bar{A}, \Omega\}$.

The term probability should give us a description, as objective as possible, of our "belief" or "conviction" about the outcome of an experiment. As numeric values, all real numbers in the interval $[0, 1]$ shall be possible, whereby 0 is the probability for the impossible event and 1 the probability for the sure event.

4.4.1 The Classical Probability Definition

Let $\Omega = \{\omega_1, \omega_2, \ldots, \omega_n\}$ be finite. No elementary event is preferred, that means we assume a symmetry regarding the frequency of occurrence of all elementary events. The probability $P(A)$ of the event $A$ is defined by
$$P(A) = \frac{|A|}{|\Omega|} = \frac{\text{Amount of outcomes favourable to } A}{\text{Amount of possible outcomes}}$$
It is obvious that any elementary event has the probability $1/n$. The assumption of the same probability for all elementary events is called the Laplace assumption.

Example 4.6 Throwing a die, the probability for an even number is
$$P(\{2, 4, 6\}) = \frac{|\{2, 4, 6\}|}{|\{1, 2, 3, 4, 5, 6\}|} = \frac{3}{6} = \frac{1}{2}.$$

4.4.2
The Axiomatic Probability Definition

The classical definition is suitable for a finite set of elementary events only. For infinite sets a more general definition is required.

Definition 4.6 Let $\Omega$ be a set and $\mathcal{A}$ an event algebra on $\Omega$. A mapping $P : \mathcal{A} \to [0, 1]$ is called probability measure if:
1. $P(\Omega) = 1$.
2. If the events $A_n$ of the sequence $(A_n)_{n \in \mathbb{N}}$ are pairwise inconsistent, i.e. for $i, j \in \mathbb{N}$ with $i \ne j$ it holds $A_i \cap A_j = \emptyset$, then
$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i).$$
For $A \in \mathcal{A}$, $P(A)$ is called probability of the event $A$.

From this definition, some rules follow directly:

Theorem 4.1
1. $P(\emptyset) = 0$, i.e. the impossible event has the probability 0.
2. For pairwise inconsistent events $A$ and $B$ it holds $P(A \cup B) = P(A) + P(B)$.
3. For a finite amount of pairwise inconsistent events $A_1, A_2, \ldots, A_k$ it holds
$$P\left(\bigcup_{i=1}^{k} A_i\right) = \sum_{i=1}^{k} P(A_i).$$
4. For two complementary events $A$ and $\bar{A}$ it holds $P(A) + P(\bar{A}) = 1$.
5. For any events $A$ and $B$ it holds $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
6. For $A \subseteq B$ it holds $P(A) \le P(B)$.
Proof: as exercise.

4.4.3 Conditional Probabilities

Example 4.7 In the Doggenriedstraße in Weingarten the speed of 100 vehicles is measured. At each measurement it is recorded if the driver was a student or not. The results are as follows:
The results are as follows [Brat Frequency | Relative frequency Vehicle observed 100 T Driver is a student (S) 30 03 Speed too high (@) 10 on Driver is a student and speeding (SG) 5 0.05 We now ask the following question: Do students speed more frequently than the average person, or than non-students ‘The answer is given by the probability P(G|S) for speeding under the condition that the driver is a student Driver is a student and specding| 1 ‘Driver is a student] ~ 30° 6 P(G\S) = Definition 4.7 For two events A and B, the probability for A under the condition B (conditional probability) is defined by P(ANB) PAB) = At example 4.7 one can recognize that in the case of a finite event set the conditional probability P(A|B) can be treated as the probability of A, when regarding only the event 7 The determined probabilities can only be used for further statements ifthe sample (100 vehicles) is representative. Otherwise, one can only make a statement about the observed 100 vehicles.72 4. Statisties and Probability Basies B, i.e. as puja) = 408) Definition 4.8 If two events A and B behave as P(A|B) = P(A), then these events are called independent. A and B are independent, if the probability of the event A is not influenced by the event B. ‘Theorem 4.2 From this definition, for the independent events A and B follows P(ANB) = P(A) - P(B) Proof: P(ANB) P(AIB) PAI) = Bay =P(A) = P(ANB)= P(A): P(B) Example 4.8 The probability for throwing two sixes with two dice is 1/36 if the dice are independent, because 1 an 6 36 = P(die 1 = six die P(die 1 ix) - P(ai six), whereby the last equation applies only if the two dice are independent. 
If for example by magic power die 2 always falls like die 1, it holds
$$P(\text{die 1} = \text{six} \cap \text{die 2} = \text{six}) = \frac{1}{6}.$$

4.4.4 The Bayes Formula

Since equation (4.7) is symmetric in $A$ and $B$, one can also write
$$P(A|B) = \frac{P(A \cap B)}{P(B)} \quad \text{as well as} \quad P(B|A) = \frac{P(A \cap B)}{P(A)}.$$
Rearranging by $P(A \cap B)$ and equating results in the Bayes formula
$$P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}$$

Bayes Formula, Example
A very reliable alarm system warns at burglary with a certainty of 99%. So, can we infer from an alarm to burglary with high certainty?
No, because if for example $P(A|B) = 0.99$, $P(A) = 0.1$, $P(B) = 0.001$ holds, then the Bayes formula returns
$$P(B|A) = \frac{P(A|B) P(B)}{P(A)} = \frac{0.99 \cdot 0.001}{0.1} \approx 0.01$$

4.5 Discrete Distributions

Definition 4.9 A random variable with finite or countably infinite range of values is called discrete random variable.

Example 4.9 Throwing a die, the number $X$ is a discrete random variable with the values $\{1, 2, 3, 4, 5, 6\}$, this means in the example it holds $x_1 = 1, \ldots, x_6 = 6$. If the die does not prefer any number, then
$$p_i = P(X = x_i) = 1/6,$$
this means the numbers are uniformly distributed. The probability to throw a number $\le 5$ is
$$P(X \le 5) = \sum_{i: x_i \le 5} p_i = 5/6$$

In general, one defines

Definition 4.10 The function which assigns a probability $p_i$ to each $x_i$ of the random variable $X$ is called the discrete density function of $X$.

Definition 4.11 For any real number $x$, the function
$$x \mapsto P(X \le x) = \sum_{i: x_i \le x} p_i$$
is called distribution function of $X$.

Like the empirical distribution function, $P(X \le x)$ is a monotonically increasing step function. Analogous to the mean value and variance of samples are the following definitions.

Definition 4.12 The number
$$E(X) = \sum_i x_i p_i$$
is called expected value. The variance is given by
$$\operatorname{Var}(X) = E((X - E(X))^2) = \sum_i (x_i - E(X))^2 p_i$$
whereby $\sqrt{\operatorname{Var}(X)}$ is called standard deviation.

It is easy to see that $\operatorname{Var}(X) = E(X^2) - E(X)^2$ (exercise).

4.5.1 Binomial Distribution

Let a soccer player's scoring probability at penalty kicking be $p = 0.9$.
The probability always to score at 10 independent kicks is
$$B_{10,0.9}(10) = 0.9^{10} \approx 0.35.$$
It is very unlikely that the player scores only once; the probability is
$$B_{10,0.9}(1) = 10 \cdot 0.1^9 \cdot 0.9 = 0.000000009$$
We might ask the question which amount of scores is the most frequent at 10 kicks.

Definition 4.13 The distribution with the density function
$$B_{n,p}(x) = \binom{n}{x} p^x (1 - p)^{n - x}$$
is called binomial distribution.

Thus, the binomial distribution indicates the probability that with $n$ independent tries of a binary event of the probability $p$ the result will be $x$ times positive. Therefore, we obtain
$$B_{10,0.9}(k) = \binom{10}{k} 0.1^{10 - k} \cdot 0.9^k$$
The following histograms show the densities for our example for $p = 0.9$ as well as for $p = 0.5$.

[Figure: histograms of the binomial densities for $p = 0.9$ and $p = 0.5$.]

For the binomial distribution it holds
$$E(X) = \sum_{x=0}^{n} x \binom{n}{x} p^x (1 - p)^{n - x} = np$$
and
$$\operatorname{Var}(X) = np(1 - p).$$

4.5.2 Hypergeometric Distribution

Let $N$ small balls be placed in a box. $K$ of them are black and $N - K$ white. When drawing $n$ balls, the probability to draw $x$ black is
$$H_{N,K,n}(x) = \frac{\binom{K}{x} \binom{N - K}{n - x}}{\binom{N}{n}}$$
The left of the following graphs shows $H_{100,30,10}(x)$, the right one $H_{N,0.3N,10}(x)$. This corresponds to $N$ balls in the box and 30% black balls. It is apparent that for $N = 10$ the density has a sharp maximum, which becomes flatter with $N > 10$.

[Figure: plots of the hypergeometric densities described above.]

4.6 Continuous Distributions

Definition 4.14 A random variable $X$ is called continuous, if its value range is a subset of the real numbers and if for the density function $f$ and the distribution function $F$ it holds
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\, dt$$

With the requirements $P(\Omega) = 1$ and $P(\emptyset) = 0$ (see def. 4.6) we obtain
$$\lim_{x \to -\infty} F(x) = 0 \quad \text{and} \quad \lim_{x \to \infty} F(x) = 1.$$

4.6.1 Normal Distribution

The most important continuous distribution for real applications is the normal distribution with the density
$$\varphi_{\mu,\sigma}(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

Theorem 4.3 For a normally distributed variable $X$ with the density $\varphi_{\mu,\sigma}$ it holds $E(X) = \mu$ and $\operatorname{Var}(X) = \sigma^2$.

For $\mu = 0$ and $\sigma = 1$ one obtains the standard normal distribution $\varphi_{0,1}$.
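Theorem 4.3 can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the function names `phi` and `moments` are chosen here, and the integrals are approximated by a simple midpoint rule on the interval $[\mu - 12\sigma, \mu + 12\sigma]$, outside of which the density is negligibly small.

```python
import math

def phi(x, mu=0.0, sigma=1.0):
    """Normal density phi_{mu,sigma}(x)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def moments(mu, sigma, steps=200_000, width=12.0):
    """Midpoint-rule estimates of the total probability, E(X) and Var(X)."""
    a = mu - width * sigma                 # left end of the integration interval
    dx = 2 * width * sigma / steps         # step size
    mass = mean = second = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * dx             # midpoint of the i-th subinterval
        w = phi(x, mu, sigma) * dx         # probability mass of the subinterval
        mass += w
        mean += x * w
        second += x * x * w
    return mass, mean, second - mean ** 2  # Var(X) = E(X^2) - E(X)^2

mass, mean, var = moments(mu=1.0, sigma=2.0)
print(mass, mean, var)
```

For $\mu = 1$ and $\sigma = 2$ the three estimates should be close to 1, 1 and 4, in agreement with $E(X) = \mu$ and $\operatorname{Var}(X) = \sigma^2$.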
With $\sigma = 2$ one obtains the flatter and broader density $\varphi_{0,2}$.

Example 4.10 Let the waiting times at a traffic light on a country road at lower traffic be uniformly distributed. We now want to estimate the mean waiting time by measuring the waiting time $T$ 200 times.

[Figure: histogram of the empirical frequencies of the waiting times; horizontal axis: $t$ [sec].]

The empirical frequency of the waiting times is shown in the image. The mean value lies at 60.165 seconds. The frequencies and the mean value indicate a uniform distribution of times between 0 and 120 sec.

Due to the finiteness of the sample, the mean value does not lie exactly at the expected value of 60 seconds. We now might ask the question if the mean value is reliable, more precisely with what probability such a measured mean differs from the expected value by