SMSTC Lecture Notes, Lecture 3
INVERSE PROBLEMS
Lecture 3: Generalised Inverses and Mixed-determined problems
Anya Kirpichnikova, University of Stirling
www.smstc.ac.uk
Contents
3.1 Generalised Inverses
    3.1.1 Data resolution matrix
    3.1.2 Model resolution matrix
    3.1.3 Measures of goodness
    3.1.4 Non-uniqueness without a priori information
3.2 Mixed-determined problems
    3.2.1 Model and data
    3.2.2 Coordinate transformation m′ = Tm
    3.2.3 Purely underdetermined case: what is expected from T?
    3.2.4 Purely overdetermined case: what is expected from T?
    3.2.5 Householder transformation T for the constrained least squares
3.3 Constructing Householder transformation
3.4 Solution of the mixed-determined problem
    3.4.1 Natural solution
The data resolution matrix N = K K^{-g} is an N × N matrix that describes how well the predictions match the data:

• if N = I, then d^obs = d^pre and the prediction error vector e = 0;
• if N ≠ I, then the prediction error is nonzero.

The rows of N describe how well neighbouring data can be independently predicted (resolved); N is a function of K and of the a priori information, but not of the data d.
Email: anya@cs.stir.ac.uk
Example 1

Assume d possesses some natural ordering, so that "neighbouring" makes sense, and assume N is diagonal, i.e. the ith row of N is

[0, 0, . . . , 0, 1/3, 0, . . . , 0]   (with the 1/3 in the ith position);

then

d_i^pre = Σ_{j=1}^{N} N_ij d_j^obs = (1/3) d_i^obs.
Example 2

Assume d possesses some natural ordering, so that "neighbouring" makes sense, and assume N is nearly diagonal, with the elements concentrated near the main diagonal, i.e. the ith row of N is

[0, . . . , 0, 0.2, 0.7, 0.1, 0, . . . , 0]   (with the 0.7 in the ith position);

then

d_i^pre = Σ_{j=1}^{N} N_ij d_j^obs = 0.2 d_{i−1}^obs + 0.7 d_i^obs + 0.1 d_{i+1}^obs,

i.e. the predicted value d_i^pre is a weighted average of three neighbouring observed data.
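As a quick numerical illustration of Example 2, the sketch below applies a near-diagonal resolution matrix to an observed data vector. The 4 × 4 matrix (in particular its boundary rows) and the data values are illustrative assumptions, not taken from the notes.

```python
def apply_resolution(N, d_obs):
    """Compute d_pre = N d_obs by plain matrix-vector multiplication."""
    return [sum(Nij * dj for Nij, dj in zip(row, d_obs)) for row in N]

# Near-diagonal N: interior rows weight three neighbouring data,
# boundary rows are an assumed two-point average.
N = [
    [0.9, 0.1, 0.0, 0.0],
    [0.2, 0.7, 0.1, 0.0],
    [0.0, 0.2, 0.7, 0.1],
    [0.0, 0.0, 0.1, 0.9],
]
d_obs = [1.0, 2.0, 3.0, 4.0]
d_pre = apply_resolution(N, d_obs)
# each interior prediction is 0.2*d[i-1] + 0.7*d[i] + 0.1*d[i+1]
```

Each predicted value is a smeared version of the neighbouring observations, exactly as the row structure of N suggests.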
Definition. The vector n = diag(N) is called the importance of the data.
Suppose the data are generated by a true model, d^obs = K m^true; then

m^est = K^{-g} d^obs = K^{-g} [K m^true] = [K^{-g} K] m^true = R m^true,

where R = K^{-g} K is the M × M model resolution matrix:
spread(R) = ‖R − I‖_2^2 = Σ_{i=1}^{M} Σ_{j=1}^{M} [R_ij − δ_ij]^2    (3.3)

where δ_ik = 0 when i ≠ k and δ_ik = 1 otherwise (the Kronecker delta). When R = I_M, spread(R) = 0.
Both spreads can be minimised by using calculus similar to what we did in Section 2:

J_k = Σ_{i=1}^{N} (N_ki − δ_ki)^2 = Σ_{i=1}^{N} N_ki^2 − 2 Σ_{i=1}^{N} N_ki δ_ki + Σ_{i=1}^{N} δ_ki^2 → min
and since each of the J_k is nonnegative, we can minimise them separately to minimise the total spread:

∂J_k / ∂K^{-g}_{qr} = 0,   where N = K K^{-g}.

Solving these equations gives

K^{-g} = K^T [K K^T]^{-1},

which is exactly the minimum length (ML) generalised inverse. Then

N = K K^{-g} = K K^T [K K^T]^{-1} = I,   R = K^{-g} K = K^T [K K^T]^{-1} K,

i.e. N = I and spread(N) = 0; the spread of R is minimised by the same method.

Conclusion. The ML generalised inverse can be interpreted as the inverse that minimises the Dirichlet spread of the data resolution matrix.
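The identity N = K K^{-g} = I for the ML generalised inverse can be checked numerically. The sketch below uses an assumed toy kernel K with full row rank; the matrix helpers are minimal implementations written for this example only.

```python
def matmul(A, B):
    """Multiply matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inv2(A):
    """Inverse of a 2x2 matrix via the cofactor formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

K = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]                                      # N = 2 data, M = 3 parameters
K_g = matmul(transpose(K), inv2(matmul(K, transpose(K))))  # K^{-g} = K^T [K K^T]^{-1}
N = matmul(K, K_g)                                         # data resolution matrix
# N equals the 2x2 identity up to rounding, so spread(N) = 0
```

Because K has full row rank, K K^T is invertible and the data are fitted exactly, which is precisely the N = I case described above.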
If m^null is a null vector of K (i.e. K m^null = 0), then m^particular + α m^null is also a solution with the same error for any scalar α. Recall that null vectors are only counted as distinct when they are linearly independent [4].
If the inverse problem has q distinct null solutions, then the general solution is

m^general = m^particular + Σ_{i=1}^{q} α_i m^{null(i)},   0 ≤ q ≤ M.    (3.4)
The latter inequality means that there can be no more linearly independent null solutions than there are unknowns (no proof here).
Example

Consider the following underdetermined problem, where we measure the mean value of a set of model parameters:

Km = [1/4  1/4  1/4  1/4] [m_1, m_2, m_3, m_4]^T = d_1.

Here

K^T K = (1/16) · (the 4 × 4 matrix of ones)  ⇒  det(K^T K) = 0,   while   K K^T = 0.25.

The minimum length generalised inverse is

K^{-g} = K^T [K K^T]^{-1} = [1, 1, 1, 1]^T   ⇒   m^ML = K^{-g} d = [d_1, d_1, d_1, d_1]^T,

and the three linearly independent null vectors are

m^{null(1)} = [1, −1, 0, 0]^T,   m^{null(2)} = [1, 0, −1, 0]^T,   m^{null(3)} = [1, 0, 0, −1]^T.
What about a particular solution? We need to choose the α_i; for example, for the minimum length solution, α_i = 0. Indeed, the minimum length solution never contains any null vectors (discussed in Sec. 2).
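The averaging example above can be verified directly in a few lines; the observed mean value d_1 = 2 is an arbitrary assumption.

```python
# K = [1/4 1/4 1/4 1/4], so K K^T = 0.25 and K^{-g} = K^T / 0.25 = [1, 1, 1, 1]^T
K = [0.25, 0.25, 0.25, 0.25]
d1 = 2.0                               # an assumed observed mean value
KKT = sum(k * k for k in K)            # scalar K K^T = 0.25
K_g = [k / KKT for k in K]             # minimum length generalised inverse
m_ML = [kg * d1 for kg in K_g]         # [d1, d1, d1, d1]

# the three linearly independent null vectors satisfy K m_null = 0
null_vectors = [[1, -1, 0, 0], [1, 0, -1, 0], [1, 0, 0, -1]]
for m_null in null_vectors:
    assert abs(sum(k * m for k, m in zip(K, m_null))) < 1e-12
```

Adding any combination of the null vectors to m^ML changes the solution but not the predicted mean, which is the non-uniqueness described by Eq. (3.4).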
We look for a change of coordinates taking the original problem to a transformed one: Km = d ↦ K′m′ = d′.
Figure 3.1: The average opacity of these two bricks is overdetermined, but since each path has an equal length in either brick, the individual opacities are underdetermined.
In this case we could apply different approaches to the two parts. If the second part is not too underdetermined, some approximate approaches would be useful (considered in Week 2). In all other cases, this partition can be achieved through the singular value decomposition (SVD).
Figure 3.2: S(d): data vector space with data vector d = [d_1, d_2, d_3]^T, N = 3.
Figure 3.3: S(m): model parameter space with model parameter vector m = [m_1, m_2, m_3]^T, M = 3.
We can use various coordinate axes in S(m): any set of vectors that spans the space will serve as coordinate axes. The space S(m) is M-dimensional, and thus is spanned by any M linearly independent vectors, say m^{(i)}. Each vector m^* ∈ S(m) can be expressed as a sum of these M basis vectors:

m^* = Σ_{i=1}^{M} α_i m^{(i)},

where the α_i are the components of m^* in the new coordinate system {m^{(i)}}, i.e. m′_{*i} = α_i.
In the original coordinates,

m_{*i} = Σ_{j=1}^{M} m_i^{(j)} m′_{*j} = Σ_{j=1}^{M} M_ij m′_{*j},

where M is the matrix whose columns are the basis vectors, M_ij = m_i^{(j)}.
If the m^{(i)} are not linearly independent, then they lie in a subspace (or hyperplane) of S(m) and do not span the whole space.
m′ = Tm   and   m = T^{-1} m′,

where T is the transformation matrix. If the new basis vectors are also mutually orthogonal unit vectors, then T represents rotations and reflections of the coordinate axes (in general, the basis vectors need not be unit vectors).
We are looking for a (simple) transformation that has certain properties. The exact properties depend on the question, and we consider our needs in the underdetermined (minimum length) and overdetermined (least squares) cases separately.
d = Km = K I m = [K T^{-1}][T m] = K′ m′,

where K′ = K T^{-1} is the data kernel in the new coordinate system. If we choose T unitary, i.e. T^{-1} = T^T, this choice of transformation preserves the length of m, i.e. L = m^T m = m′^T m′ (exercise to check). The latter means that m^T m and m′^T m′ have the same form in both coordinate systems, namely the sum of squares of the elements of the vector. The transformations that do not change length, the unitary transformations, are rotations and reflections of the coordinate axes.
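This length-preserving property (also Exercise 3–1) can be checked numerically; the rotation angle and model vector below are arbitrary assumptions.

```python
import math

theta = 0.3
T = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]   # a rotation, so T^T T = I

m = [1.0, 2.0]
m_prime = [sum(T[i][j] * m[j] for j in range(2)) for i in range(2)]   # m' = Tm
L = sum(x * x for x in m)                   # m^T m
L_prime = sum(x * x for x in m_prime)       # m'^T m'
# L and L_prime agree up to rounding
```

The same check works for any orthogonal T, including reflections and products of Householder transformations.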
Hence we want T

• to be unitary, T^{-1} = T^T;
• additionally, to make K′ lower triangular.
Why would we benefit from K′ = K T^{-1} being lower triangular in the purely underdetermined case?
K′m′ = d  ⟺

[ K′_11  0      0      ...  0      0  ...  0 ] [ m′_1     ]   [ d_1 ]
[ K′_21  K′_22  0      ...  0      0  ...  0 ] [ m′_2     ]   [ ... ]
[ K′_31  K′_32  K′_33  ...  0      0  ...  0 ] [ ...      ] = [ ... ]    (3.5)
[ ...    ...    ...    ...  ...    ...   ... ] [ m′_N     ]   [ d_N ]
[ K′_N1  K′_N2  ...         K′_NN  0  ...  0 ] [ m′_{N+1} ]
                                               [ ...      ]
                                               [ m′_M     ]
Two main points can be highlighted by looking at the matrix in Eq. (3.5):
Point 1) Uncontrolled elements of model parameters. The structure of the matrix suggests that the latter components of the model parameter vector, i.e. m′_i, i ∈ {N+1, . . . , M}, cannot be controlled, since the matrix kernel does not affect them. Decision: to minimise m′^T m′, we set all the uncontrolled m′^est_i = m′_i to zero, i ∈ {N+1, . . . , M}.
Point 2) Controlled elements of model parameters. We can solve for m′_i, i ∈ {1, . . . , N}, by back-solving, and hence get the remaining components:

m′^est_1 = d_1 / K′_11,
m′^est_2 = (d_2 − K′_21 m′_1) / K′_22,
m′^est_3 = (d_3 − K′_31 m′_1 − K′_32 m′_2) / K′_33,
...
Therefore we get the solution to the original system, m^est = T^{-1} m′^est (equivalent to the minimum length solution). Finally, we are looking for a transformation T that separates determined and undetermined linear combinations of parameters.
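The recipe in Points 1 and 2 can be sketched as forward substitution on the lower-triangular block, with the uncontrolled components held at zero. The matrix K′ and data d below are illustrative assumptions (N = 3, M = 5).

```python
def solve_underdetermined(K_prime, d, M):
    """First N components by forward substitution; the rest stay zero."""
    N = len(d)
    m = [0.0] * M                       # uncontrolled components m'_{N+1..M} = 0
    for i in range(N):                  # back-solve the lower triangle
        s = sum(K_prime[i][j] * m[j] for j in range(i))
        m[i] = (d[i] - s) / K_prime[i][i]
    return m

K_prime = [[2.0, 0.0, 0.0, 0.0, 0.0],
           [1.0, 4.0, 0.0, 0.0, 0.0],
           [0.5, 1.0, 2.0, 0.0, 0.0]]
d = [4.0, 6.0, 5.0]
m_est = solve_underdetermined(K_prime, d, M=5)
# K' m_est reproduces d exactly, and the last M - N components are zero
```

Setting the uncontrolled components to zero is precisely what makes this the minimum length choice among all solutions of K′m′ = d.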
Remark [1]. With this transformation it is easy to determine the null vectors of the inverse problem. In the transformed coordinate system they are the set of vectors whose first N elements are zero and whose last M − N elements are zero except for one element. There are clearly M − N such vectors, so we have established that there are never more than M − N null vectors in a purely underdetermined problem. The null vectors can easily be transformed back into the original coordinate system by premultiplication by T^{-1}. As all but one element of each transformed null vector is zero, this operation just selects a column of T^{-1} (or, equivalently, a row of T).
[ e′_1     ]   [ d′_1     ]   [ K′_11  K′_12  K′_13  ...  K′_1M ]
[ e′_2     ]   [ d′_2     ]   [ 0      K′_22  K′_23  ...  K′_2M ] [ m_1 ]
[ ...      ] = [ ...      ] − [ 0      0      K′_33  ...  K′_3M ] [ m_2 ]
[ e′_M     ]   [ d′_M     ]   [ ...    ...    ...    ...  K′_MM ] [ ... ]
[ e′_{M+1} ]   [ d′_{M+1} ]   [ 0      0      0      ...  0     ] [ m_M ]
[ ...      ]   [ ...      ]   [ ...    ...    ...    ...  ...   ]
[ e′_N     ]   [ d′_N     ]   [ 0      0      0      ...  0     ]
The last N − M rows of the error vector are not affected by our choice of m^est; the first M elements can be driven to zero, e′_i = d′_i − [K′m]_i = 0, i ∈ {1, . . . , M}, by back-solving, since the upper block of K′ is triangular.
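The overdetermined counterpart of the earlier forward substitution is back-substitution on the upper M × M triangle, which zeroes the first M components of e′ while the last N − M components remain as irreducible error. K′ and d′ below are assumed toy values (N = 4, M = 2).

```python
def solve_overdetermined(K_prime, d_prime, M):
    """Back-substitution on the upper MxM triangular block of K'."""
    m = [0.0] * M
    for i in reversed(range(M)):
        s = sum(K_prime[i][j] * m[j] for j in range(i + 1, M))
        m[i] = (d_prime[i] - s) / K_prime[i][i]
    return m

K_prime = [[2.0, 1.0],
           [0.0, 3.0],
           [0.0, 0.0],
           [0.0, 0.0]]
d_prime = [5.0, 6.0, 0.5, -0.5]
m_est = solve_overdetermined(K_prime, d_prime, M=2)
e = [d_prime[i] - sum(K_prime[i][j] * m_est[j] for j in range(2)) for i in range(4)]
# the first M residuals vanish; the remaining error is d'_3, d'_4
```

Since the transformation is unitary, ‖e′‖ = ‖e‖, so zeroing the first M transformed residuals yields the least squares solution of the original problem.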
In the mixed-determined case, we want the transformation to separate the model parameters into two groups:

• those linear combinations of elements of m that are completely determined by the constraints;
• those linear combinations of elements of m that are completely underdetermined.
where H′ is triangular, while K′ is not. The first p elements of m^est can be obtained by back-solving the triangular system; thus we have a partition K′ = [K′_1, K′_2], where K′_1 multiplies the p determined model parameters and K′_2 multiplies the M − p undetermined model parameters. We have

d = [K′_1, K′_2] [ [m′^est_1, . . . , m′^est_p], [m′^est_{p+1}, . . . , m′^est_M] ]^T.
Note that 2v^T k / (v^T v) is a scalar (here k denotes the ith column of K). To make {K′_{(i+1)i}, . . . , K′_{Ni}} zero, we have to choose the last N − i components of v such that

(2 v^T k / v^T v) [v_{i+1}, . . . , v_N]^T = [K_{(i+1)i}, . . . , K_{Ni}]^T;

the other elements of v follow from normalisation (take the first i − 1 as zeros), and then the ith element comes from a constraint:

v_i = K_ii − α_i,   α_i = ± ( Σ_{j=i}^{N} K_ji^2 )^{1/2}.
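The construction above can be sketched as follows. The sign of α_i is chosen opposite to that of K_ii to avoid cancellation (a standard numerical choice), and the test matrix is an arbitrary assumption.

```python
import math

def householder_step(K, i):
    """Apply T = I - 2 v v^T / (v^T v) to K, zeroing column i below the diagonal."""
    N = len(K)
    col = [K[j][i] for j in range(N)]
    alpha = math.sqrt(sum(col[j] ** 2 for j in range(i, N)))
    if col[i] > 0:                      # sign choice avoids cancellation
        alpha = -alpha
    v = [0.0] * N                       # first i - 1 components are zero
    v[i] = col[i] - alpha               # ith component from the constraint
    for j in range(i + 1, N):
        v[j] = col[j]                   # last N - i components copy the column
    vtv = sum(x * x for x in v)
    for c in range(len(K[0])):          # apply T to every column of K
        s = 2.0 * sum(v[j] * K[j][c] for j in range(N)) / vtv
        for j in range(N):
            K[j][c] -= s * v[j]
    return K

K = [[3.0, 1.0], [4.0, 2.0], [0.0, 5.0]]
householder_step(K, 0)
# entries below the diagonal in column 0 are now (numerically) zero
```

Repeating the step for i = 1, 2, ... triangularises K one column at a time, which is exactly how the transformation T of the previous section is built as a product of Householder reflections.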
Decompose the model into a part m_p that is determined by the data and a null-space part m_0 with K m_0 = 0, and decompose the data correspondingly as d = d_p + d_0. Then

K (m_p + m_0) = d_p + d_0

and

L = m^T m = [m_p + m_0]^T [m_p + m_0] = m_p^T m_p + m_p^T m_0 + m_0^T m_p + m_0^T m_0 = m_p^T m_p + m_0^T m_0,

since m_0 is orthogonal to m_p. On the right-hand side, m_p^T m_p is determined by d, and m_0^T m_0 is determined by a priori information. The total overall error is then

E = [d_p + d_0 − K m_p]^T [d_p + d_0 − K m_p],

since K m_0 = 0. Simplifying further, with d_0 orthogonal to d_p (and to K m_p), we have

E = [d_p − K m_p]^T [d_p − K m_p] + d_0^T d_0.
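The length split L = m_p^T m_p + m_0^T m_0 can be checked numerically using the averaging kernel from the earlier example; the model vector m is an assumed illustration.

```python
K = [0.25, 0.25, 0.25, 0.25]          # the averaging kernel from the example
m = [3.0, 1.0, 2.0, 2.0]              # an assumed model vector
mean = sum(m) / len(m)
m_p = [mean] * 4                      # part along the row space of K
m_0 = [mi - mean for mi in m]         # null-space part: K m_0 = 0

assert abs(sum(k * x for k, x in zip(K, m_0))) < 1e-12       # K m_0 = 0
assert abs(sum(p * x for p, x in zip(m_p, m_0))) < 1e-12     # m_0 orthogonal to m_p
L = sum(x * x for x in m)
L_split = sum(x * x for x in m_p) + sum(x * x for x in m_0)
```

Because the cross terms vanish, the data constrain only m_p^T m_p, and any reduction of the total length must come from the a priori choice m_0 = 0, which is the natural solution.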
Exercises
3–1. Show that a unitary transformation T preserves the length of the model parameter vector.
References
[1] William Menke, Geophysical Data Analysis: Discrete Inverse Theory, 3rd edition, Elsevier, 2012.
[2] Per Christian Hansen, Discrete Inverse Problems: Insight and Algorithms, SIAM, 2010.
[3] Richard Burden, Douglas Faires, Numerical Analysis, 8th edition, Thomson Brooks/Cole, 2004. ISBN 9780534392000.
[4] C. Wunsch, J.F. Minster, Methods for box models and ocean circulation tracers: mathematical programming and non-linear inverse theory, J. Geophys. Res., 87 (1982), pp. 5647–5662.
[5] R.A. Wiggins, The general linear inverse problem: Implication of surface waves and free oscillations for Earth structure, Rev. Geophys. Space Phys., 10 (1972), pp. 251–285.