
SMSTC (2020/21)

INVERSE PROBLEMS
Lecture 3: Generalised Inverses and Mixed-determined problems
Anya Kirpichnikova, University of Stirling^a

www.smstc.ac.uk

Contents
3.1 Generalised Inverses
    3.1.1 Data Resolution matrix
    3.1.2 Model resolution matrix
    3.1.3 Measures of goodness
    3.1.4 Non-uniqueness without a priori information
3.2 Mixed determined problems
    3.2.1 Model and data
    3.2.2 Coordinate transformation m' = Tm
    3.2.3 Purely underdetermined case: what is expected from T?
    3.2.4 Purely overdetermined case: what is expected from T?
    3.2.5 Householder transformation T for the constrained least squares
3.3 Constructing Householder transformation
3.4 Solution of the mixed-determined problem
    3.4.1 Natural solution

3.1 Generalised Inverses


We have been solving Km = d by minimising either the prediction error e or the solution simplicity (length), so that in general we expect

m^{est} = Md + v,  where M and v are independent of d.

Thus our goal has shifted from K to M, which is an "inverse" of K in some sense. We denote this M by K^{-g} and call it the generalised inverse.
For the overdetermined case: K^{-g} = [K^T K]^{-1} K^T (the least squares (LS) solution).
For the underdetermined case: K^{-g} = K^T [K K^T]^{-1} (the minimum length (ML) solution).
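As an illustration of these two formulas, here is a minimal numpy sketch; the kernels and data below are made up for illustration, and numpy.linalg.pinv is used only as an independent check.

```python
import numpy as np

# Overdetermined example: more data than model parameters (N > M)
K_over = np.array([[1.0, 0.0],
                   [1.0, 1.0],
                   [1.0, 2.0]])          # hypothetical 3x2 kernel
d_over = np.array([1.1, 1.9, 3.2])

# Least squares generalised inverse: K^{-g} = (K^T K)^{-1} K^T
Kg_ls = np.linalg.inv(K_over.T @ K_over) @ K_over.T
m_ls = Kg_ls @ d_over

# Underdetermined example: fewer data than model parameters (N < M)
K_under = np.full((1, 4), 0.25)          # hypothetical 1x4 averaging kernel
d_under = np.array([2.0])

# Minimum length generalised inverse: K^{-g} = K^T (K K^T)^{-1}
Kg_ml = K_under.T @ np.linalg.inv(K_under @ K_under.T)
m_ml = Kg_ml @ d_under

# In these two pure cases both agree with the Moore-Penrose pseudoinverse
assert np.allclose(Kg_ls, np.linalg.pinv(K_over))
assert np.allclose(Kg_ml, np.linalg.pinv(K_under))
```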

3.1.1 Data Resolution matrix


Assume Km = d and a generalised inverse K^{-g}, so that m^{est} = K^{-g} d^{obs} (we take v = 0 in m^{est} = Md + v above).
To check how well the predictions fit the data d, we substitute m^{est} back into Km = d to get

d^{pre} = K m^{est} = K [K^{-g} d^{obs}] = [K K^{-g}] d^{obs} = N d^{obs},  where N = K K^{-g} is the data resolution matrix.   (3.1)

The data resolution matrix N is an N × N matrix that describes how well the predictions match the data:
• if N = I, then d^{obs} = d^{pre} and the prediction error vector e = 0;
• if N ≠ I, then the prediction error is not zero.
The rows of N describe how well neighbouring data can be independently predicted (resolved); N is a function of K and of the a priori information, but not of the data d.
^a anya@cs.stir.ac.uk


Example 1

Assume d possesses some natural ordering, so that "neighbouring" makes sense. Assume N is diagonal with unit diagonal entries, i.e. the i-th row of N is

(0, 0, ..., 0, 1, 0, ..., 0),  with the 1 in the i-th position;

then

d_i^{pre} = \sum_{j=1}^{N} N_{ij} d_j^{obs} = d_i^{obs}.

If this holds for every i, the data d are predicted exactly.

Example 2

Assume d possesses some natural ordering, so that "neighbouring" makes sense. Assume N is nearly diagonal, with its elements concentrated near the main diagonal, i.e. the i-th row of N is

(0, 0, ..., 0, 0.2, 0.7, 0.1, 0, ..., 0),  with the 0.7 in the i-th position;

then

d_i^{pre} = \sum_{j=1}^{N} N_{ij} d_j^{obs} = 0.2 d_{i-1}^{obs} + 0.7 d_i^{obs} + 0.1 d_{i+1}^{obs},

i.e. the predicted value d_i^{pre} is a weighted average of three neighbouring observed data.

Definition. n = diag(N) is called the importance of the data.
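A small numerical check of the data resolution matrix and the importance of the data, using a hypothetical straight-line-fit kernel (d_i = m_1 + m_2 x_i):

```python
import numpy as np

# Hypothetical overdetermined kernel: fit a straight line d = m1 + m2*x
x = np.array([0.0, 1.0, 2.0, 3.0])
K = np.column_stack([np.ones_like(x), x])          # N = 4 data, M = 2 parameters

Kg = np.linalg.inv(K.T @ K) @ K.T                  # LS generalised inverse
N_mat = K @ Kg                                     # data resolution matrix N = K K^{-g}
importance = np.diag(N_mat)                        # n = diag(N), the importance of the data

print(np.round(N_mat, 3))
print("importance:", np.round(importance, 3))      # the importances sum to M = 2 (trace of N)
```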

3.1.2 Model resolution matrix


The model resolution matrix describes whether individual model parameters can be resolved [5].
Assume m^{true} is a solution of K m^{true} = d^{obs}; the question is how close the estimate m^{est} is to m^{true}. We have

m^{est} = K^{-g} d^{obs},   K m^{true} = d^{obs},

so together

m^{est} = K^{-g} d^{obs} = K^{-g} [K m^{true}] = [K^{-g} K] m^{true} = R m^{true},

where R = K^{-g} K is the M × M model resolution matrix:

• if R = I_M, then each m_i, i ∈ {1, ..., M}, is uniquely determined;

• if R ≠ I_M, then the estimates of the model parameters are really weighted averages of the true model parameters.
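For instance, for a hypothetical one-datum averaging kernel (the same kind as in the example later in this section), the ML generalised inverse gives R ≠ I:

```python
import numpy as np

K = np.full((1, 4), 0.25)                      # hypothetical averaging kernel, N = 1, M = 4
Kg = K.T @ np.linalg.inv(K @ K.T)              # ML generalised inverse
R = Kg @ K                                     # model resolution matrix

print(np.round(R, 3))                          # every entry is 0.25: each estimate is an
                                               # equal-weight average of all true parameters
```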

3.1.3 Measures of goodness


Notice the symmetry in the two extreme cases: mixed-determined problems will have data and model resolution matrices that lie between these two extremes. We have been measuring the goodness of model parameters by the overall prediction error and by the solution simplicity. How do we check the goodness of the resolution? Resolution is best when the resolution matrices are the identity, hence we quantify the spread (the size of the off-diagonal elements) based on the L2 norm:

spread(N) = \| N - I \|_2^2 = \sum_{i=1}^{N} \sum_{j=1}^{N} [N_{ij} - \delta_{ij}]^2   (3.2)

spread(R) = \| R - I \|_2^2 = \sum_{i=1}^{M} \sum_{j=1}^{M} [R_{ij} - \delta_{ij}]^2   (3.3)

where \delta_{ik} = 0 when i ≠ k and \delta_{ik} = 1 when i = k (the Kronecker delta). When R = I_M, spread(R) = 0.
Both spreads can be minimised using calculus similar to what we did in Section 2. For the k-th row,

J_k = \sum_{i=1}^{N} (N_{ki} - \delta_{ki})^2 = \sum_{i=1}^{N} N_{ki}^2 - 2 \sum_{i=1}^{N} N_{ki} \delta_{ki} + \sum_{i=1}^{N} \delta_{ki}^2 → min

and since each J_k is non-negative, we can minimise each of them separately to minimise the total spread:

spread(N) = \sum_k J_k → min  as each  J_k → min.

The minimum is attained where the derivatives vanish.
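A short sketch of the two Dirichlet spread measures, reusing the hypothetical kernels from the earlier sketches:

```python
import numpy as np

def spread(A):
    """Dirichlet spread: squared L2 distance of a resolution matrix from the identity."""
    return np.sum((A - np.eye(A.shape[0]))**2)

# Overdetermined (LS): R = I, so spread(R) = 0 and only spread(N) can be nonzero
K = np.column_stack([np.ones(4), np.arange(4.0)])
Kg = np.linalg.inv(K.T @ K) @ K.T
print(spread(Kg @ K), spread(K @ Kg))    # ~0.0 and a positive value

# Underdetermined (ML): N = I, so spread(N) = 0 and only spread(R) can be nonzero
K = np.full((1, 4), 0.25)
Kg = K.T @ np.linalg.inv(K @ K.T)
print(spread(K @ Kg), spread(Kg @ K))    # ~0.0 and a positive value
```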

Least squares (purely overdetermined case)

K^{-g} = [K^T K]^{-1} K^T,   N = K K^{-g} = K [K^T K]^{-1} K^T,   R = K^{-g} K = [K^T K]^{-1} K^T K = I.

For this case spread(R) = 0, so we minimise spread(N):

N = K K^{-g}:   \partial J_k / \partial K^{-g}_{qr} = 0,

and the latter gives

K^T K K^{-g} = K^T  ⇒  K^{-g} = [K^T K]^{-1} K^T,

which is exactly the least squares (LS) generalised inverse.
Conclusion. The LS generalised inverse can be interpreted as

• the inverse that minimises the L2 norm of the prediction error, or

• the inverse that minimises the Dirichlet spread of the data resolution.

Minimum Length (purely underdetermined case)

K^{-g} = K^T [K K^T]^{-1},   N = K K^{-g} = K K^T [K K^T]^{-1} = I,   R = K^{-g} K = K^T [K K^T]^{-1} K,

i.e. N = I and spread(N) = 0, so we minimise the spread of R by the same method. Finally,

K^{-g} = K^T [K K^T]^{-1}

is exactly the minimum length (ML) generalised inverse.
Conclusion. The ML generalised inverse can be interpreted as

• the inverse that minimises the L2 norm of the solution length, or

• the inverse that minimises the Dirichlet spread of the model resolution.

3.1.4 Non-uniqueness without a priori information


To fight non-uniqueness of solutions of Km = d without adding a priori information, we can shift our focus from solutions that estimate model parameters to solutions that estimate weighted averages of the model parameters.
If Km = d does not have a unique solution, then there exists a non-trivial solution (some m_i ≠ 0) of Km = 0, i.e. a null vector (prove this, together with its converse, as an exercise).
Remark: K m^{null} = 0 implies that m^{null} is orthogonal (perpendicular) to each row of K (its dot product with every row of K is zero), so no linear combination of the rows of K can be a null vector.
If m^{particular} is any non-null solution (say, the minimum length solution) of Km = d, then

m^{particular} + α m^{null}  is also a solution, with the same prediction error, for any scalar α.

Recall that null vectors are only distinct when they are linearly independent [4].
If the inverse problem has q distinct null solutions, then the general solution is

m^{general} = m^{particular} + \sum_{i=1}^{q} α_i m^{null(i)},   0 ≤ q ≤ M.   (3.4)

The latter inequality means that there can be no more linearly independent null vectors than there are unknowns (no proof here).

Example

Consider the following underdetermined problem, in which we measure the mean value of a set of model parameters:

Km = \begin{pmatrix} \frac14 & \frac14 & \frac14 & \frac14 \end{pmatrix} \begin{pmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \end{pmatrix} = d_1.

Here K^T K is the 4 × 4 matrix with every entry 1/16, so det(K^T K) = 0, while K K^T = 0.25.
The minimum length generalised inverse is

K^{-g} = K^T [K K^T]^{-1} = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}  ⇒  m^{ML} = K^{-g} d = \begin{pmatrix} d_1 \\ d_1 \\ d_1 \\ d_1 \end{pmatrix}.

The null solutions satisfy

K \begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix} = 0  ⇒  \frac14 x + \frac14 y + \frac14 z + \frac14 w = 0,

so three linearly independent null solutions are

m^{null(1)} = \begin{pmatrix} 1 \\ -1 \\ 0 \\ 0 \end{pmatrix},   m^{null(2)} = \begin{pmatrix} 1 \\ 0 \\ -1 \\ 0 \end{pmatrix},   m^{null(3)} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ -1 \end{pmatrix},

and the most general solution is

m^{general} = \begin{pmatrix} d_1 \\ d_1 \\ d_1 \\ d_1 \end{pmatrix} + α_1 \begin{pmatrix} 1 \\ -1 \\ 0 \\ 0 \end{pmatrix} + α_2 \begin{pmatrix} 1 \\ 0 \\ -1 \\ 0 \end{pmatrix} + α_3 \begin{pmatrix} 1 \\ 0 \\ 0 \\ -1 \end{pmatrix}.

What about a particular solution? We need to choose the α_i; for example, the minimum length solution corresponds to α_i = 0. In fact, the minimum length solution never contains any null vectors (discussed in Sec. 2).
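A quick numerical check of this example (the datum d_1 and the coefficients α_i below are arbitrary):

```python
import numpy as np

K = np.full((1, 4), 0.25)
d = np.array([2.0])                               # hypothetical datum d_1

Kg = K.T @ np.linalg.inv(K @ K.T)                 # minimum length generalised inverse
m_ml = (Kg @ d).ravel()                           # [d1, d1, d1, d1]

nulls = np.array([[1, -1, 0, 0],
                  [1, 0, -1, 0],
                  [1, 0, 0, -1]], dtype=float)    # the three null vectors above

assert np.allclose(K @ nulls.T, 0.0)              # each null vector satisfies K m = 0

alphas = np.array([0.3, -1.2, 0.5])               # arbitrary coefficients
m_general = m_ml + alphas @ nulls
assert np.allclose(K @ m_general, d)              # same (zero) prediction error as m_ml
```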

3.2 Mixed determined problems


Looking at Fig. 3.1 from an X-ray tomography perspective: if the purple block has opacity m_1 and the yellow one has opacity m_2, the individual opacities might be underdetermined, but the average opacity

m'_1 = (m_1 + m_2)/2  might be overdetermined,

so we can, for example, consider determining two linear combinations of the model parameters instead. The difference in opacity

m'_2 = (m_1 - m_2)/2  is, however, completely underdetermined.

Main goal: we therefore want to partition our equation

Km = d  ↦  K' m' = d'

Figure 3.1: The average opacity of these two bricks is overdetermined, but since each path has an equal length in either brick, the individual opacities are underdetermined.

where m' consists of two separate parts (upper and lower):

m' = \begin{pmatrix} m'^{o} \\ m'^{u} \end{pmatrix},  where m'^{o} is completely overdetermined and m'^{u} is completely underdetermined.

If this is achieved, the equation becomes

\begin{pmatrix} K'^{o} & 0 \\ 0 & K'^{u} \end{pmatrix} \begin{pmatrix} m'^{o} \\ m'^{u} \end{pmatrix} = \begin{pmatrix} d'^{o} \\ d'^{u} \end{pmatrix}.

In this case we can apply different approaches to the two parts. If the second part is not too underdetermined, some approximate approaches are useful (considered in Week 2). In all other cases, this partition can be achieved through the singular value decomposition (SVD).

3.2.1 Model and data


It is useful to understand both the algebraic representation (a quantity that is manipulated) and the geometric representation (direction and length) of the vectors we use. Assume S(d) and S(m) are the vector spaces (of very high dimension in our case) containing our data d and model parameters m, see Fig. 3.2, Fig. 3.3. Hence the equation

(forward problem)  Km = d  can be considered as a mapping from S(m) to S(d),

and its solution

(inverse problem)  m^{est} = K^{-g} d  is a mapping from S(d) to S(m).

Figure 3.2: S(d): data vector space with data vector d = [d_1, d_2, d_3]^T, N = 3.
Figure 3.3: S(m): model parameter space with model parameter vector m = [m_1, m_2, m_3]^T, M = 3.

We can use various coordinate axes in S(m): any set of vectors that spans the space will serve as coordinate axes. The space S(m) is M-dimensional, and thus is spanned by any M linearly independent vectors, say m^{(i)}. Each vector m^* ∈ S(m) can be expressed as a sum of these M basis vectors,

m^* = \sum_{i=1}^{M} α_i m^{(i)},

where the α_i are the components of m^* in the new coordinate system {m^{(i)}}, i.e. m'^*_i = α_i. In the original coordinates,

m^*_i = \sum_{j=1}^{M} m^{(j)}_i m'^*_j = \sum_{j=1}^{M} M_{ij} m'^*_j,

where M is the matrix formed from the basis vectors, M_{ij} = m^{(j)}_i.
If the m^{(i)} are not linearly independent, then they lie in a subspace (or hyperplane) of S(m).

3.2.2 Coordinate transformation m' = Tm


Consider a transformation of the coordinate systems of S(m) and S(d): if m is the representation of a vector in one coordinate system, then m' is its representation in another, and in this case

m' = Tm  and  m = T^{-1} m',

where T is the transformation matrix. If the new basis vectors are also mutually orthogonal unit vectors, then T represents rotations and reflections of the coordinate axes (sometimes the basis vectors might not be unit vectors).
We are looking for a (simple) transformation with certain properties. The exact properties depend on the question at hand, and we consider our needs in the underdetermined (minimum length) and overdetermined (least squares) cases separately.

3.2.3 Purely underdetermined case: what is expected from T?


Assume Km = d with M > N, and we look for the minimum length solution by minimising L = m^T m. The new transformation is m' = Tm, and together they give

d = Km = K T^{-1} T m = [K T^{-1}][Tm] = K' m',

where K' = K T^{-1} is the data kernel in the new coordinate system. If we choose T unitary, i.e. T^{-1} = T^T, this choice of transformation preserves the length of m, i.e. L = m^T m = m'^T m' (exercise to check).
The latter means that m^T m and m'^T m' have the same form in both coordinate systems, namely the sum of squares of the elements of the vector. The transformations that do not change length, the unitary transformations, are rotations and reflections of the coordinate axes.
Hence we want T

• to be unitary, T^{-1} = T^T;
• additionally, to make K' = K T^{-1} lower triangular.

Why would we benefit from K' = K T^{-1} being lower triangular in the purely underdetermined case?

K' m' = d  ⟺
\begin{pmatrix}
K'_{11} & 0 & 0 & 0 & \cdots & 0 & 0 \\
K'_{21} & K'_{22} & 0 & 0 & \cdots & 0 & 0 \\
K'_{31} & K'_{32} & K'_{33} & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots \\
K'_{N1} & K'_{N2} & \cdots & K'_{NN} & \cdots & 0 & 0
\end{pmatrix}
\begin{pmatrix} m'_1 \\ m'_2 \\ \vdots \\ m'_N \\ m'_{N+1} \\ \vdots \\ m'_M \end{pmatrix}
=
\begin{pmatrix} d_1 \\ \vdots \\ d_N \end{pmatrix}   (3.5)

Two main points can be highlighted by looking at the matrix in Eq. (3.5):

Point 1) Uncontrolled elements of the model parameters. The structure of the matrix shows that the last components of the model parameter vector, m'_i, i ∈ {N+1, ..., M}, cannot be determined, since the data kernel does not involve them (their coefficients are all zero). Decision: to minimise m'^T m', we set all the uncontrolled components m'^{est}_i = m'_i to zero, i ∈ {N+1, ..., M}.

Point 2) Controlled elements of the model parameters. We can solve for m'_i, i ∈ {1, ..., N}, by back-solving the triangular system and hence obtain the remaining components:

m'^{est}_1 = d_1 / K'_{11},
m'^{est}_2 = (d_2 - K'_{21} m'_1) / K'_{22},
m'^{est}_3 = (d_3 - K'_{31} m'_1 - K'_{32} m'_2) / K'_{33},
...

Therefore we obtain the solution of the original system as m^{est} = T^{-1} m'^{est} (equivalent to the minimum length solution).
To summarise, we are looking for a transformation T that separates the determined and undetermined linear combinations of the model parameters.
Remark [1]. With this transformation it is easy to determine the null vectors of the inverse problem. In the transformed coordinate system they are the vectors whose first N elements are zero and whose last M - N elements are zero except for one element. There are clearly M - N such vectors, so a purely underdetermined problem has exactly M - N linearly independent null vectors (and, in particular, never more than M). The null vectors can easily be transformed back into the original coordinate system by premultiplication by T^{-1}. As all but one element of each transformed null vector is zero, this operation just selects a column of T^{-1} (or, equivalently, a row of T).
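A minimal sketch of this procedure, under the assumption that a suitable orthogonal T is available; here T is obtained from a QR factorisation of K^T, which is one way to make K' = K T^{-1} lower triangular. The kernel and data are made up for illustration.

```python
import numpy as np

# Hypothetical underdetermined problem: N = 2 data, M = 4 model parameters
K = np.array([[1.0, 1.0, 1.0, 1.0],
              [1.0, 2.0, 3.0, 4.0]])
d = np.array([4.0, 12.0])
N_, M_ = K.shape

# One way to build a suitable orthogonal T: QR-factorise K^T and take T = Q^T,
# so that K' = K T^{-1} = K Q = R^T is lower triangular with zero last columns.
Q, R = np.linalg.qr(K.T, mode='complete')
Kp = K @ Q                      # K' = K T^{-1}, lower triangular (N x M)

# Solve the leading N x N triangular block; set the uncontrolled components to zero
mp = np.zeros(M_)
for i in range(N_):
    mp[i] = (d[i] - Kp[i, :i] @ mp[:i]) / Kp[i, i]

m_est = Q @ mp                  # m^est = T^{-1} m'
assert np.allclose(K @ m_est, d)
# Agrees with the minimum length solution K^T (K K^T)^{-1} d
assert np.allclose(m_est, K.T @ np.linalg.solve(K @ K.T, d))
```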

3.2.4 Purely overdetermined case: what is expected from T?


Assume Km = d with M < N, and we look for the minimum of E = e^T e, i.e. we look for a transformation T, with e' = Te, such that

• minimising e'^T e' is equivalent to minimising e^T e,
• it transforms the data kernel into upper triangular form:

e' = Te = T[d - Km] = Td - TKm = d' - K' m,

where d' = Td is the transformed data and, as required, K' = TK is upper triangular, i.e.
 0
K11 K012 K013 . . . K01M  

m1
 0 K022 K023 . . . K02M 
K0 =  K033 . . . K03M   m2 
   
 0 0  ... 
 ... ... ... ... ... 
mM
0 0 0 ... ...
then
e01 d10
   

e02
 0
K012 K013 K01M    0 

  K11 ...  d2 
m1
K022 K023 0
 
 ...
 0 ... K2M     . . . 
 
e0M K033 K03M   m2   0 
   
 0  = − 0 0 ...   . . .  +  d0 M 
  
e   ... ... ... ... ...  d 
 M+1  mM  M+1 
 ...  0 0 0 ... ...  ... 
e0N dN0
The last N - M rows of the error vector are not affected by the choice of m^{est}; the first M elements can be made zero, e'_i = d'_i - [K' m]_i = 0, i ∈ {1, ..., M}, by back-solving, since the top M × M block of K' is triangular.
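Again a minimal sketch, this time letting a QR factorisation of K play the role of T (so K' = TK = R is upper triangular), with made-up K and d:

```python
import numpy as np

# Hypothetical overdetermined problem: N = 4 data, M = 2 model parameters
K = np.column_stack([np.ones(4), np.arange(4.0)])
d = np.array([1.1, 1.9, 3.2, 3.9])
M_ = K.shape[1]

# K = Q R with Q orthogonal; take T = Q^T so that K' = T K = R is upper triangular
Q, R = np.linalg.qr(K, mode='complete')
dp = Q.T @ d                        # d' = T d

# Back-substitute on the first M rows; the last N-M rows only contribute to the error
m_est = np.zeros(M_)
for i in range(M_ - 1, -1, -1):
    m_est[i] = (dp[i] - R[i, i+1:M_] @ m_est[i+1:]) / R[i, i]

assert np.allclose(m_est, np.linalg.lstsq(K, d, rcond=None)[0])
print("||e|| =", np.linalg.norm(dp[M_:]))   # leftover part of d' is the residual norm
```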

3.2.5 Householder transformation T for the constrained least squares


The transformation T we are looking for is a Householder transformation. It allows us to introduce zeros in place of the sub-diagonal elements. Before we actually construct it, let us have another look at its usefulness. Consider a constrained least squares problem:

Km = d  together with p linear equality constraints  Hm = h.

The Householder transformation will separate

• those linear combinations of the elements of m that are completely determined by the constraints,
• those linear combinations of the elements of m that are completely underdetermined.

Thus we again look for a transformation T that triangulates Hm = h:

h = Hm = [H T^{-1}][Tm] = H' m',   d = Km = [K T^{-1}][Tm] = K' m',

where H' is triangular, while K' is not. The first p elements of m'^{est} can be obtained by back-solving the triangular system. Thus we have a partition K' = [K'_1, K'_2], where K'_1 multiplies the p determined model parameters and K'_2 multiplies the M - p unknown model parameters. We have

d = [K'_1, K'_2] \, [ (m'^{est}_1, ..., m'^{est}_p), (m'_{p+1}, ..., m'_M) ]^T,

which we rearrange to get

K'_2 [m'_{p+1}, ..., m'_M]^T = d - K'_1 [m'^{est}_1, ..., m'^{est}_p]^T,

where the right-hand side is known and the equation is completely overdetermined in the M - p unknowns, so it can be solved by least squares. Finally,

m^{est} = T^{-1} m'^{est}.
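The same separation can be sketched numerically with a null-space method, equivalent in spirit to the Householder partition described above but built here from an SVD of H; all matrices and data below are made up for illustration.

```python
import numpy as np

# Hypothetical problem: minimise ||K m - d|| subject to H m = h
K = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])
d = np.array([1.0, 2.0, 3.0, 4.0])
H = np.array([[1.0, 1.0, 1.0]])     # p = 1 equality constraint
h = np.array([6.0])

# Split model space into the part determined by the constraints and the free part
p = np.linalg.matrix_rank(H)
m_part = np.linalg.lstsq(H, h, rcond=None)[0]       # any particular solution of H m = h
_, _, Vt = np.linalg.svd(H)
Z = Vt[p:].T                                        # columns span the null space of H

# Solve the remaining overdetermined problem for the free combinations by least squares
y = np.linalg.lstsq(K @ Z, d - K @ m_part, rcond=None)[0]
m_est = m_part + Z @ y

assert np.allclose(H @ m_est, h)                    # constraints satisfied exactly
```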

3.3 Constructing Householder transformation


The Householder transformation is an orthogonal transformation (a reflection). We will follow [3] for the construction.
Consider the transformation

T = I - 2 v v^T / (v^T v);

show that T is unitary (exercise). Since T is unitary, it preserves length (seen above and in the exercises).
Remark: the projection of a vector k onto v is

proj_v k = ((v · k)/(v · v)) v.

Gram–Schmidt orthogonalisation is rather unstable and is usually avoided in numerical work.
Gram-Schmidt orthogonalisation is pretty unstable and is usually avoided in numerics.
Goal: find v such that T triangulates a given matrix K; namely, we find T^{(i)} that converts the bottom elements of a column vector k into zeros. Consider the i-th column of K, k = [K_{1i}, K_{2i}, ..., K_{Ni}]^T, and v = [v_1, ..., v_N]^T. Our goal is then

k' = T^{(i)} k = [K'_{1i}, K'_{2i}, ..., K'_{ii}, 0, ..., 0]^T,

i.e. the last N - i elements of the column become zero. Write

k' = T^{(i)} k = [I - 2 v v^T / (v^T v)] k = k - (2 v^T k / (v^T v)) v.

Note that 2 v^T k / (v^T v) is a scalar. To make {K'_{(i+1)i}, ..., K'_{Ni}} zero we have to choose the last N - i components of v such that

(2 v^T k / (v^T v)) [v_{i+1}, ..., v_N]^T = [K_{(i+1)i}, ..., K_{Ni}]^T;

the remaining elements of v follow from normalisation: the first i - 1 are taken to be zero, and the i-th element comes from the constraint

v_i = K_{ii} - α_i,   α_i = ± \left( \sum_{j=i}^{N} K_{ji}^2 \right)^{1/2},

where the sign of α_i is usually chosen opposite to that of K_{ii} to avoid loss of significance.
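A sketch of this construction: a standard Householder step that zeroes the sub-diagonal part of column i of a matrix (0-based indexing in the code).

```python
import numpy as np

def householder_step(K, i):
    """Return T^(i) = I - 2 v v^T / (v^T v) that zeroes the entries below K[i, i] in column i."""
    N = K.shape[0]
    k = K[:, i].astype(float)
    norm = np.sqrt(np.sum(k[i:]**2))
    if norm == 0.0:
        return np.eye(N)                         # nothing to zero out
    alpha = -norm if k[i] >= 0 else norm         # sign chosen to avoid cancellation
    v = np.zeros(N)
    v[i] = k[i] - alpha
    v[i+1:] = k[i+1:]
    return np.eye(N) - 2.0 * np.outer(v, v) / (v @ v)

# Example: triangularise a made-up 4x3 matrix column by column
K = np.array([[4.0, 1.0, 2.0],
              [2.0, 3.0, 0.0],
              [1.0, 1.0, 5.0],
              [3.0, 2.0, 1.0]])
T_total = np.eye(4)
for i in range(K.shape[1]):
    Ti = householder_step(T_total @ K, i)
    T_total = Ti @ T_total

print(np.round(T_total @ K, 6))                      # upper triangular
assert np.allclose(T_total @ T_total.T, np.eye(4))   # T is orthogonal
```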

3.4 Solution of the mixed-determined problem


3.4.1 Natural solution
In this case some linear combinations of the model parameters can be determined; we assume they belong to a subspace, m_p ∈ S_p(m). The other linear combinations of the model parameters cannot be determined (no information about them is provided, they are "unilluminated" by Km = d); they belong to a subspace m_0 ∈ S_0(m), the null space of K.
Km might not be able to span S(d) (if the problem is overdetermined): no matter what m we choose, at best it spans some subspace S_p(d) ⊂ S(d); let those data be d_p ∈ S_p(d). Let S_0(d) be the remaining part of S(d), which is not S_p(d). No part of S_0(d) can be fitted by any m, d_0 ∈ S_0(d). Then

K(m_p + m_0) = d_p + d_0

and

L = m^T m = [m_p + m_0]^T [m_p + m_0] = m_p^T m_p + m_p^T m_0 + m_0^T m_p + m_0^T m_0 = m_p^T m_p + m_0^T m_0,

since m_0 is orthogonal to m_p. In the right-hand side above, m_p^T m_p is determined by d, while m_0^T m_0 can only be determined by a priori information. The total overall error is then

E = [d_p + d_0 - K(m_p + m_0)]^T [d_p + d_0 - K(m_p + m_0)] = [d_p + d_0 - K m_p]^T [d_p + d_0 - K m_p]

due to K m_0 = 0; simplifying further, with d_0 orthogonal to d_p, we have

E = [d_p - K m_p]^T [d_p - K m_p] + d_0^T d_0,

where [d_p - K m_p] is controlled by m_p, while d_0 cannot be reduced by any choice of m.

Solve [d_p - K m_p] = 0 and set m_0 = 0 to get the natural solution.
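The natural solution is what a truncated SVD (equivalently, the Moore–Penrose pseudoinverse) produces: invert the problem only on the illuminated subspaces S_p(m) and S_p(d), and set the null-space part m_0 to zero. A short sketch with a made-up rank-deficient kernel:

```python
import numpy as np

# Hypothetical mixed-determined kernel: 3 data, 3 parameters, rank 2
K = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
d = np.array([1.0, 2.0, 3.0])

U, s, Vt = np.linalg.svd(K)
p = np.sum(s > 1e-10 * s[0])                  # number of non-zero singular values

# Natural solution: invert only the "illuminated" part, set the null-space part to zero
m_nat = Vt[:p].T @ ((U[:, :p].T @ d) / s[:p])

assert np.allclose(m_nat, np.linalg.pinv(K) @ d)
print("m_nat =", m_nat)                       # its null-space component is zero
print("d_0 part:", d - K @ m_nat)             # residual that no m can reduce
```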

Exercises
3–1. Show that a unitary transformation T preserves the length of the model parameter vector.

3–2. Show that the transformation

T = I - 2 v v^T / (v^T v)

is unitary.
3–3. Show that if Km = d does not have a unique solution, then there exists a non-trivial solution (some m_i ≠ 0) of Km = 0, i.e. a null vector; prove the converse as well.

References
[1] W. Menke, Geophysical Data Analysis: Discrete Inverse Theory, 3rd edition, Elsevier, 2012.

[2] P. C. Hansen, Discrete Inverse Problems: Insight and Algorithms, SIAM, 2010.

[3] R. L. Burden and J. D. Faires, Numerical Analysis, 8th edition, Thomson Brooks/Cole, 2004, ISBN 9780534392000.

[4] C. Wunsch and J. F. Minster, Methods for box models and ocean circulation tracers: mathematical programming and non-linear inverse theory, J. Geophys. Res., 87 (1982), pp. 5647–5662.

[5] R. A. Wiggins, The general linear inverse problem: Implication of surface waves and free oscillations for Earth structure, Rev. Geophys. Space Phys., 10 (1972), pp. 251–285.
