Objective Functions For Full Waveform Inversion: William Symes


Objective functions for full waveform inversion

William Symes

The Rice Inversion Project

EAGE 2012
Workshop: From Kinematic to Waveform Inversion
A Tribute to Patrick Lailly
Agenda

Overview

Challenges for FWI

Extended modeling

Summary
Full Waveform Inversion

M = model space, D = data space

F : M → D forward model

Least squares inversion (“FWI”): given d ∈ D, find m ∈ M to minimize

$$J_{LS}[m] = \|F[m] - d\|^2 \; [+ \text{ regularizing terms}]$$

($\|\cdot\|^2$ = mean square)
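A minimal sketch of the least-squares objective, assuming a toy forward map. The convolutional `forward`, the three-point `wavelet`, and the Tikhonov-style `reg_weight` term are hypothetical stand-ins for illustration, not the wave-equation modeling used in the talk.

```python
import numpy as np

def j_ls(forward, m, d, reg_weight=0.0):
    """Mean-square misfit ||F[m] - d||^2 plus an optional (assumed) Tikhonov term."""
    residual = forward(m) - d
    return np.mean(residual ** 2) + reg_weight * np.mean(m ** 2)

# Hypothetical forward map: convolve the model with a short wavelet.
wavelet = np.array([1.0, -2.0, 1.0])
forward = lambda m: np.convolve(m, wavelet, mode="same")

m_true = np.zeros(50)
m_true[20] = 1.0
d = forward(m_true)                       # noise-free synthetic data

print(j_ls(forward, m_true, d))           # misfit at the true model
print(j_ls(forward, np.zeros(50), d))     # misfit at a wrong model
```

Any callable that maps a model array to a data array can be passed as `forward`, which mirrors the remark that the formulation accommodates any modeling physics.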
Full Waveform Inversion

+ accommodates any modeling physics, data geometry, spatial variation on all scales (Bamberger, Chavent & Lailly 79, ...)
+ close relation to prestack migration via local optimization (Lailly 83, Tarantola 84)
+ gains in hardware/software and algorithm efficiency ⇒ feasible data processing method
++ some spectacular successes with 3D field data (keep listening!)
± with regularizations pioneered by Pratt and others, applicable to surface data given sufficient (i) low-frequency s/n and (ii) long offsets
- reflection data still a challenge
Full Waveform Inversion

Why are

• low frequencies important?
• long offsets (diving waves, transmission) easier than short offsets (reflections)?

What alternatives to standard FWI = output least squares?

• different error measures, domains: time vs. Fourier vs. Laplace, L1, logarithmic (other talks today; survey: Virieux & Operto 09)
• model extensions: migration velocity analysis as a linearization, nonlinear MVA
Nonlinear Challenges: Why low frequencies are important

Well-established observation, based on heuristic arguments (“cycle skipping”) and numerical evidence: forward modeling operator is more linear [objective function is more quadratic] at lower frequencies

Leads to widely-used frequency continuation strategy (Kolb, Collino, & Lailly 86)

Why?
Nonlinear Challenges: Why low frequencies are important
Visualizing the shape of the objective: scan from model m0 to model m1

$$f(h) = J_{LS}[(1 - h)m_0 + h m_1]$$

Example: data = simulation of Marmousi data (Versteeg & Gray 91), with bandpass filter source.
[Figure: two panels, bulk modulus (color scale 20-60 GPa) vs. offset (0-8 km) and depth (0-2.5 km).]

m0 = smoothed Marmousi, m1 = Marmousi (bulk modulus displayed)
Nonlinear Challenges: Why low frequencies are important

[Figure: normalized MS error (0 to 1.0) vs. scan parameter h (0 to 1.0).]

Red: [2,5,40,50] Hz data. Blue: [2,4,8,12] Hz data
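The same kind of scan can be sketched on a toy problem. The following assumes a one-parameter "model" (the delay of a single Ricker pulse) instead of Marmousi; the delays `m0`, `m1`, the peak frequencies 2 Hz and 20 Hz, and the helper names are all hypothetical choices made for illustration.

```python
import numpy as np

def ricker(t, f):
    """Ricker wavelet with peak frequency f (Hz)."""
    a = (np.pi * f * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def j_ls(tau, tau_true, f, t):
    """Mean-square misfit between traces delayed by tau and tau_true."""
    return np.mean((ricker(t - tau, f) - ricker(t - tau_true, f)) ** 2)

t = np.arange(0.0, 2.0, 0.001)       # time axis (s)
m0, m1 = 0.85, 1.0                   # hypothetical start and target delays (s)
h = np.linspace(0.0, 1.0, 101)       # scan parameter

def scan(f):
    """f(h) = J_LS[(1 - h) m0 + h m1], normalized by its value at h = 0."""
    vals = np.array([j_ls((1 - hk) * m0 + hk * m1, m1, f, t) for hk in h])
    return vals / vals[0]

f_low, f_high = scan(2.0), scan(20.0)
print("low band monotone: ", bool(np.all(np.diff(f_low) <= 0)))
print("high band monotone:", bool(np.all(np.diff(f_high) <= 0)))
```

In this toy setting the low-frequency scan decreases steadily toward h = 1, while the high-frequency scan has a bump before the minimum, which is the one-parameter analogue of cycle skipping.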
Nonlinear Challenges: Why low frequencies are important

Origin of this phenomenon in math of symmetric hyperbolic systems:

$$A \frac{\partial u}{\partial t} + P u = f$$

u = dynamical field vector, A = symm. positive operator, P = skew-symm. differential operator in space variables, f = source

Example: for acoustics, $u = (p, v)^T$, $A = \mathrm{diag}(1/\kappa, \rho)$, and

$$P = \begin{pmatrix} 0 & \mathrm{div} \\ \mathrm{grad} & 0 \end{pmatrix}$$
Nonlinear Challenges: Why low frequencies are important

Theoretical development, including non-smooth A: Blazek, Stolk & S. 08, Stolk 00, after Bamberger, Chavent & Lailly 79, Lions 68.

Sketch of linearization analysis, after Lavrientiev, Romanov, & Shishatski 79, also Ramm 86:

δu = perturbation in dynamical fields corresponding to perturbation δA in parameters

$$A \frac{\partial \delta u}{\partial t} + P\,\delta u = -\delta A \frac{\partial u}{\partial t}$$

and

$$\left(A \frac{\partial}{\partial t} + P\right) \frac{\partial u}{\partial t} = \frac{\partial f}{\partial t}$$
Nonlinear Challenges: Why low frequencies are important

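Similarly for linearization error: h > 0, $u_h$ = fields corresponding to A + hδA,

$$e = \frac{u_h - u}{h} - \delta u$$

$$A \frac{\partial e}{\partial t} + P e = -\delta A \frac{\partial}{\partial t}(u_h - u)$$

$$\left(A\frac{\partial}{\partial t} + P\right)\frac{\partial}{\partial t}(u_h - u) = -h\,\delta A \frac{\partial^2 u_h}{\partial t^2}$$

$$\left((A + h\,\delta A)\frac{\partial}{\partial t} + P\right)\frac{\partial^2 u_h}{\partial t^2} = \frac{\partial^2 f}{\partial t^2}$$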
Nonlinear Challenges: Why low frequencies are important

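Use causal Green’s (inverse) operator:

$$\delta u = -\left(A\frac{\partial}{\partial t} + P\right)^{-1} \delta A \left(A\frac{\partial}{\partial t} + P\right)^{-1} \frac{\partial f}{\partial t}$$

$$e = h\left(A\frac{\partial}{\partial t} + P\right)^{-1} \delta A \left(A\frac{\partial}{\partial t} + P\right)^{-1} \delta A \left((A + h\,\delta A)\frac{\partial}{\partial t} + P\right)^{-1} \frac{\partial^2 f}{\partial t^2}$$

pass to frequency domain ($\partial/\partial t \to -i\omega$):

$$\widehat{\delta u} = -[-i\omega A + P]^{-1}\,\delta A\,[-i\omega A + P]^{-1}(-i\omega)\hat f$$

$$\hat e = h[-i\omega A + P]^{-1}\,\delta A\,[-i\omega A + P]^{-1}\,\delta A\,[-i\omega A + P]^{-1}(-i\omega)^2 \hat f + O(h^2\omega^2)$$

Nonlinear Challenges: Why low frequencies are important
So for small ω,

$$\widehat{\delta u} = i\omega P^{-1}\delta A P^{-1}\hat f + O(\omega^2)$$

$$\hat e = -h\omega^2 P^{-1}\delta A P^{-1}\delta A P^{-1}\hat f + O(\omega^3)$$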


$\hat f(0) \neq 0$ ⇒ there exist δA for which

• $P^{-1}\delta A P^{-1}\hat f \neq 0$: δA is resolved at zero frequency

⇒ for such δA

• (energy in e) < $O(\|\delta A\|\langle\omega\rangle)$ (energy in δu)

So: linearization error is small ⇒ $J_{LS}$ is near-quadratic, for sufficiently low frequency source and/or sufficiently small δA.

Further analysis: quadratic directions ∼ large-scale features
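The low-frequency scaling can be checked numerically on a small matrix analogue. This is a sketch under stated assumptions: the 4×4 matrices `A`, `dA`, `P`, the source vector `f_hat`, and the step `h` are hypothetical stand-ins (A symmetric positive, P skew-symmetric and invertible), not a discretized wave equation.

```python
import numpy as np

# Hypothetical 4x4 stand-ins: A symmetric positive, P skew-symmetric and invertible.
A = np.diag([1.0, 2.0, 3.0, 4.0])
dA = np.diag([0.3, -0.2, 0.1, 0.4])
D = np.array([[2.0, 1.0], [0.0, 3.0]])
Z = np.zeros((2, 2))
P = np.block([[Z, D], [-D.T, Z]])
f_hat = np.ones(4)
h = 0.1

def solve(A_op, omega, rhs):
    """Apply [-i*omega*A_op + P]^{-1} to rhs."""
    return np.linalg.solve(-1j * omega * A_op + P, rhs)

def error_ratio(omega):
    """Norm of the linearization error divided by norm of the linearized perturbation."""
    u = solve(A, omega, f_hat)
    u_h = solve(A + h * dA, omega, f_hat)
    du = 1j * omega * solve(A, omega, dA @ u)   # derivative of u_h in h at h = 0
    e = (u_h - u) / h - du                      # linearization error
    return np.linalg.norm(e) / np.linalg.norm(du)

for omega in (1e-2, 1e-3):
    print(omega, error_ratio(omega))
```

In this toy setting, reducing ω by a factor of ten reduces the ratio by about the same factor, consistent with the error being one order higher in ω than the linearized perturbation.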


Linear Challenges: Why reflection is hard

Relative difficulty of reflection vs. transmission

• numerical examples: Gauthier, Virieux & Tarantola 86
• spectral analysis of layered traveltime tomography: Baek & Demanet 11

Spectral analysis of reflection per se: Virieux & Operto 09

Linear Challenges: Why reflection is hard

Reproduction of “Camembert” Example (GVT 86) (thanks: Dong Sun)

Circular high-velocity zone in 1 km × 1 km square background, 2% Δv.

Transmission configuration: 8 sources at corners and side midpoints, 400 receivers (100 per side) surround anomaly.

Reflection configuration: all 8 sources, 100 receivers on one side (“top”).

Modeling details: 50 Hz Ricker source pulse, density fixed and constant, staggered grid FD modeling, absorbing boundaries.
Linear Challenges: Why reflection is hard
Transmission inversion, 2% anomaly: Initial MS resid = 2.56×10^7; Final after 5 LBFGS steps = 2.6×10^5

[Figure: two panels, bulk modulus (color scale 2.45-2.60 ×10^4 MPa) vs. x (0-0.8 km) and z (0-0.8 km).]

Bulk modulus: Left, model; Right, inverted


Linear Challenges: Why reflection is hard
Reflection configuration: initial MS resid = 3629; final after 5 LBFGS steps = 254

[Figure: two panels, bulk modulus (color scale 2.45-2.60 ×10^4 MPa) vs. x (0-0.8 km) and z (0-0.8 km).]

Bulk modulus: Left, model; Right, inverted


Linear Challenges: Why reflection is hard

Message: in reflection case, “the Camembert has melted”.

Small anomaly ⇒ linear phenomenon

Linear resolution analysis (e.g. Virieux & Operto 09): narrow aperture data does not resolve low spatial wavenumbers

Resolution analysis of phase (traveltime tomography) in layered case: Baek & Demanet 11

• model ↦ traveltime map = composition of (i) increasing rearrangement, (ii) invertible algebraic transformation, (iii) linear operator
• factor (iii) has singular values decaying like $n^{-1/2}$ for diving wave traveltimes, exponentially decaying for reflected wave traveltimes.
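To see what the two decay rates mean in practice, the sketch below counts how many singular values stay above a relative noise floor. The noise level of 3% and the unit exponential rate are assumptions chosen for illustration; the cited analysis gives the decay laws, not these constants.

```python
import numpy as np

n = np.arange(1, 100001)
noise = 0.03                                  # assumed relative noise level

sv_diving = n ** -0.5                         # algebraic decay, n^(-1/2)
sv_reflect = np.exp(-(n - 1.0))               # exponential decay; unit rate is an assumption

# Modes whose singular value exceeds the noise floor are, roughly, recoverable.
n_diving = int(np.sum(sv_diving >= noise))
n_reflect = int(np.sum(sv_reflect >= noise))
print("recoverable modes, diving waves:   ", n_diving)
print("recoverable modes, reflected waves:", n_reflect)
```

Under these assumptions the algebraic decay leaves orders of magnitude more usable modes than the exponential decay, which is one way to read "reflection is hard" as a linear statement.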
Linear Challenges: Why reflection is hard
Putting it all together: “Large” Camembert (20% anomaly) with 0-60 Hz lowpass filter source. Continuation in frequency after Kolb, Collino & Lailly 86: 5 stages, starting with 0-2 Hz.

[Figure: two panels, bulk modulus (color scale 2.2-2.7 ×10^4 MPa) vs. x (0-0.8 km) and z (0-0.8 km).]

Bulk modulus: Left, model; Right, inverted
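Frequency continuation can be illustrated on a one-parameter toy problem. The sketch below assumes a single Ricker pulse with unknown delay, a hypothetical five-stage schedule of peak frequencies, and a simple step-halving local search standing in for L-BFGS; none of these are the settings used for the Camembert result.

```python
import numpy as np

def ricker(t, f):
    """Ricker wavelet with peak frequency f (Hz)."""
    a = (np.pi * f * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

t = np.arange(0.0, 2.0, 0.001)       # time axis (s)
tau_true, tau_start = 1.0, 0.85      # hypothetical true and initial delays (s)

def misfit(tau, f):
    """Mean-square data misfit for trial delay tau at peak frequency f."""
    return np.mean((ricker(t - tau, f) - ricker(t - tau_true, f)) ** 2)

def local_descent(tau, f, step=0.01, tol=1e-12):
    """Move to a neighbor only if the misfit drops by more than tol; else halve the step."""
    while step > 1e-6:
        j0 = misfit(tau, f)
        if misfit(tau + step, f) < j0 - tol:
            tau += step
        elif misfit(tau - step, f) < j0 - tol:
            tau -= step
        else:
            step *= 0.5
    return tau

# Direct attempt at the highest frequency vs. continuation from low to high.
tau_direct = local_descent(tau_start, 20.0)
tau_cont = tau_start
for f in (2.0, 4.0, 8.0, 16.0, 20.0):
    tau_cont = local_descent(tau_cont, f)
print("direct at 20 Hz:", tau_direct)
print("continuation:   ", tau_cont)
```

In this toy setting the direct high-frequency search does not leave the starting delay, while the staged search reaches the true delay at the first (lowest-frequency) stage and the later stages keep it there.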


Extended Models and Differential Semblance
Inversion of reflection data: difficulty relative to transmission is linear in origin, so look to migration velocity analysis for useful ideas

Prestack migration as approximate inversion: fits subsets of data with non-physical extended models (= image volume), so all data matched; no serendipitous local matches!

Transfer info from small to large scales by demanding coherence of extended models

Familiar concept from depth-domain migration velocity analysis: independent models (images) grouped together as image gathers, coherence ⇒ good velocity model

Exploit for automatic model estimation: residual moveout removal (Biondi & Sava 04, Biondi & Zhang 12), van Leeuwen & Mulder 08 (data domain VA), differential waveform inversion (Chauris, poster session), differential semblance (image domain VA) S. 86 ...
Extended Models and Differential Semblance
Differential semblance, version 1:

• group data d into gathers d(s) that can be fit perfectly (more or less), indexed by s ∈ S (source position, offset, slowness, ...)
• extended models $\bar M = \{\bar m : S \to M\}$
• extended modeling $\bar F : \bar M \to D$ by

$$\bar F[\bar m](s) = F[\bar m(s)]$$

• s finely sampled ⇒ coherence criterion is $\partial \bar m/\partial s = 0$.

The DS objective:

$$J_{DS} = \|\bar F[\bar m] - d\|^2 + \sigma^2 \left\|\frac{\partial \bar m}{\partial s}\right\|^2 + \dots$$
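A minimal sketch of the DS objective, assuming the extended model is stored as a 2D array with one row per value of s. The convolutional `forward` is a hypothetical stand-in for F, and the s-derivative is approximated by a first difference between neighboring rows.

```python
import numpy as np

wavelet = np.array([1.0, -2.0, 1.0])
forward = lambda m: np.convolve(m, wavelet, mode="same")   # hypothetical F

def j_ds(m_ext, d, sigma, ds=1.0):
    """||Fbar[m_ext] - d||^2 + sigma^2 ||d m_ext / ds||^2, rows indexed by s."""
    fit = sum(np.sum((forward(m_s) - d_s) ** 2) for m_s, d_s in zip(m_ext, d))
    dm_ds = np.diff(m_ext, axis=0) / ds        # finite-difference s-derivative
    return fit + sigma ** 2 * np.sum(dm_ds ** 2)

m_true = np.zeros(30)
m_true[10] = 1.0

# Coherent extended model: the same physical model for every s.
coherent = np.tile(m_true, (5, 1))
d_coh = np.array([forward(m) for m in coherent])

# Incoherent extended model: fits its own data gather by gather, but varies with s.
incoherent = np.array([np.roll(m_true, i) for i in range(5)])
d_inc = np.array([forward(m) for m in incoherent])

print(j_ds(coherent, d_coh, sigma=1.0))
print(j_ds(incoherent, d_inc, sigma=0.0))
print(j_ds(incoherent, d_inc, sigma=1.0))
```

Both extended models fit their data exactly, so with σ = 0 the objective cannot tell them apart; only the semblance term penalizes the s-dependent one.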
Extended Models and Differential Semblance

Continuation method (σ : 0 → ∞): theoretical justification in Gockenbach, Tapia & S. ’95; limits to $J_{LS}$ as σ → ∞.

“Starting” problem: σ → 0, minimizing $J_{DS}$ equivalent to

$$\min_{\bar m} \left\|\frac{\partial \bar m}{\partial s}\right\|^2 \quad \text{subj. to } \bar F[\bar m] \simeq d$$

Relation to MVA:

• separate scales: $m_0$ = macro velocity model (physical), δm = short scale reflectivity model
• linearize: $\bar m = m_0 + \delta\bar m$, $\bar F[\bar m] \simeq F[m_0] + D\bar F[m_0]\,\delta\bar m$
• approximate inversion of $\delta d = d - F[m_0]$ by migration: $\delta\bar m = D\bar F[m_0]^{-1}(d - F[m_0]) \simeq D\bar F[m_0]^T(d - F[m_0])$
Extended Models and Differential Semblance

⇒ MVA via optimization:

$$\min_{m_0} \left\|\frac{\partial}{\partial s}\left[D\bar F[m_0]^T (d - F[m_0])\right]\right\|^2$$

Many implementations with various approximations of $D\bar F^T$, choices of s: S. & collaborators early 90’s - present, Chauris-Noble 01, Mulder-Plessix 02, de Hoop & collaborators 03-07.

Bottom line: works well when hypotheses are satisfied: linearization (no multiples), scale separation (no salt), simple kinematics (no multipathing)
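The idea of velocity estimation by gather flattening can be sketched with a purely kinematic toy. The example assumes a single reflector under a constant-velocity overburden, takes s = offset, and uses normal-moveout correction as a stand-in for the migration operator; the wavelet, geometry, and trial velocity grid are hypothetical.

```python
import numpy as np

def ricker(t, f=15.0):
    """Ricker wavelet with peak frequency f (Hz)."""
    a = (np.pi * f * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

t = np.arange(0.0, 2.0, 0.002)            # time axis (s)
x = np.linspace(0.0, 1.0, 51)             # offsets (km)
v_true, t0 = 2.0, 1.0                     # hypothetical velocity (km/s), zero-offset time (s)

# One reflection event per offset, following the hyperbolic moveout curve.
data = np.array([ricker(t - np.sqrt(t0**2 + (xi / v_true) ** 2)) for xi in x])

def image_gather(v):
    """NMO-correct each trace with trial velocity v (kinematic stand-in for migration)."""
    return np.array([np.interp(np.sqrt(t**2 + (xi / v) ** 2), t, tr)
                     for xi, tr in zip(x, data)])

def ds_objective(v):
    """Sum of squared differences between neighboring offsets of the image gather."""
    return np.sum(np.diff(image_gather(v), axis=0) ** 2)

trial = np.linspace(1.6, 2.4, 9)
values = np.array([ds_objective(v) for v in trial])
v_best = trial[np.argmin(values)]
print("velocity minimizing the DS objective:", v_best)
```

The objective only asks that the image not change with offset; no picking of events is involved, which is what makes this kind of criterion suitable for automatic optimization.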
Nonlinear DS with LF control

Drop scale separation, linearization assumptions

Cannot use independent long-scale model as control, as in MVA: “low spatial frequency” not well defined, depends on velocity.

However, temporal passband is well-defined, and lacks very low frequency energy (0-3, 0-5, ... Hz) with good s/n

Generally, inversion is unambiguous if data d is not band-limited (good s/n to 0 Hz): F̄ is nearly one-to-one; extended models m̄ fitting same data d differ by tradeoff between params, controllable by DS term

So: find a way to supply the low-frequency data, as ersatz for long-scale model; in fact, generate from auxiliary model!
Nonlinear DS with LF control

Define low-frequency source complementary to data passband, low-frequency (extended) modeling op $F_l$ ($\bar F_l$)

Given low frequency control model $m_l \in M$, define extended model $\bar m = \bar m[d, m_l]$ by minimizing over $\bar m$

$$J_{DS}[\bar m; d, m_l] = \|\bar F[\bar m] + \bar F_l[\bar m] - (d + F_l[m_l])\|^2 + \sigma^2\left\|\frac{\partial \bar m}{\partial s}\right\|^2$$

Determine $m_l$ ⇒ minimize

$$J_{LF}[d, m_l] = \left\|\frac{\partial}{\partial s}\bar m[d, m_l]\right\|^2$$

(NB: nested optimizations!)


Nonlinear DS with LF control

$$\min_{m_l} J_{LF}[d, m_l] = \left\|\frac{\partial}{\partial s}\bar m[d, m_l]\right\|^2$$

$m_l$ plays same role as migration velocity model, but no linearization, scale separation assumed

$\bar m[d, m_l]$ analogous to prestack migrated image volume

Initial exploration: Dong Sun PhD thesis, SEG 12, plane wave 2D modeling, simple layered examples, steepest descent with quadratic backtrack.

Greatest challenge: efficient and accurate computation of gradient = solution of auxiliary LS problem
Example: DS Inversion with LF control, free surface

[Figure: bulk modulus (color scale 0.6-1.2 ×10^4 MPa) vs. x (0-2 km) and z (0-0.4 km).]

Three layer bulk modulus model. Top surface pressure free, other boundaries absorbing
Example: DS Inversion with LF control, free surface

[Figure: horizontal axis sign(p)*p^2 (−0.09 to 0.08 s^2/km^2), vertical axis time (0-1.0 s), color scale −1000 to 1000.]

Plane wave data, free surface case


Example: DS Inversion with LF control, free surface
[Figure: horizontal axis sign(p)*p^2 (−0.08 to 0.08 s^2/km^2), vertical axis z (0-0.5 km), color scale −500 to 500 MPa.]

Extended model LS gradient at homogeneous initial model (prestack image volume)
Example: DS Inversion with LF control, free surface
[Figure: horizontal axis sign(p)*p^2 (−0.09 to 0.08 s^2/km^2), vertical axis z (0-0.5 km), color scale 0.2-1.2 ×10^4 MPa.]

Inverted gather $\bar m[d, m_l]$, $m_l$ = homogeneous model, x = 1.5 km


Example: DS Inversion with LF control, free surface

Inverted gather m̄[d, ml ], 3rd DS iteration, x = 1.5 km


Example: DS Inversion with LF control, free surface
Standard FWI using stack of optimal DS m̄ as initial data
(one-step homotopy σ = 0 → ∞)

153 L-BFGS iterations, final RMS error = 6%, final gradient norm
< 1 % of original
Space Shift DS

Defect in version 1 of DS already known in MVA context:

Image gathers generated from individual surface data bins may not be flat, even when migration velocity is optimally chosen (Nolan & S. 97, Stolk & S. 04)

Source of kinematic artifacts obstructing flatness: multiple ray paths connecting sources, receivers with reflection points.

Therefore version 1 of DS is only suitable for mild lateral heterogeneity. Must use something else to identify complex refracting structures
Space Shift DS
For MVA, remedy is known: use space-shift image gathers δm̄ (de Hoop, Stolk & S. 09)

Claerbout’s imaging principle (71): velocity is correct if energy in δm̄(x, h) is focused at h = 0 (h = subsurface offset)

Quantitative measure of focus: choose P(h) so that P(0) = 0, P(h) > 0 if h ≠ 0, minimize

$$\sum_{x,h} |P(h)\,\delta\bar m[m_0](x, h)|^2$$

(e.g. P(h) = |h|).

MVA based on this principle by Shen, Stolk, & S. 03, Shen et al. 05, Albertin 06, 11, Kubir et al. 07, Fei & Williamson 09, 10, Tang & Biondi 11, others; survey in Shen & S. 08. Gradient issues: Fei & Williamson 09, Vyas 09.
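The focus measure above is simple to evaluate once an image volume is available. The sketch below assumes the image is stored as a 2D array indexed by (x, h) and uses two synthetic gathers; the grids, the Gaussian smear width, and the function name `focus_penalty` are hypothetical.

```python
import numpy as np

def focus_penalty(image, h, weight=np.abs):
    """Sum over x and h of |P(h) * image(x, h)|^2, with P(h) = |h| by default."""
    return np.sum((weight(h)[None, :] * image) ** 2)

h = 0.01 * np.arange(-20, 21)            # hypothetical subsurface offsets (km), h[20] = 0
x = np.linspace(0.0, 1.0, 11)            # hypothetical lateral positions (km)

# Perfectly focused gather: all energy at h = 0.
focused = np.zeros((x.size, h.size))
focused[:, 20] = 1.0

# Defocused gather: energy smeared over nonzero offsets.
smeared = np.exp(-(h / 0.05) ** 2)[None, :] * np.ones((x.size, 1))

print(focus_penalty(focused, h))
print(focus_penalty(smeared, h))
```

Because P(0) = 0, energy at zero offset is free, and any energy at h ≠ 0 is penalized in proportion to its distance from focus.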
Space Shift DS

Extension to nonlinear problems: how is δm̄[x, h] the output of an adjoint derivative?

Answer: Replace coefficients m in wave equation with operators m̄: e.g. $\bar\kappa[u](x) = \int dh\, \bar\kappa(x, h)\, u(x + h)$. Physical case: multiplication operators $\bar\kappa(x, h) = \kappa(x)\delta(h)$. Then

$$\delta\bar m[m_0] = D\bar F[\bar m_0]^T (d - F[m_0])$$

for resulting extended forward map $\bar F$

⇒ Version 2 of nonlinear DS. Physical case = no-action-at-a-distance principle of continuum mechanics = nonlinear version of Claerbout’s imaging principle (S. 08). Mathematical foundation: Blazek, Stolk & S. 08.
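A discrete sketch of an operator coefficient, assuming a 1D grid, a finite set of integer shifts, and zero padding outside the grid. The function name, grid sizes, and the choice of a discrete delta equal to 1/dh at h = 0 are assumptions made for illustration.

```python
import numpy as np

def apply_extended_coefficient(kbar, u, shifts, dh):
    """Discretize kbar[u](x) = integral dh kbar(x, h) u(x + h); u is zero outside the grid."""
    n = u.size
    out = np.zeros(n)
    for j, s in enumerate(shifts):
        idx = np.arange(n) + s               # grid index of x + h
        valid = (idx >= 0) & (idx < n)
        out[valid] += kbar[valid, j] * u[idx[valid]] * dh
    return out

n, dh = 64, 0.01                             # hypothetical grid size and spacing
shifts = np.arange(-3, 4)                    # offsets h = shifts * dh
kappa = 1.0 + 0.5 * np.sin(np.linspace(0.0, np.pi, n))
u = np.cos(np.linspace(0.0, 4.0, n))

# Physical case: kbar(x, h) = kappa(x) * delta(h), with the discrete delta = 1/dh at h = 0.
kbar_phys = np.zeros((n, shifts.size))
kbar_phys[:, 3] = kappa / dh

# Non-physical case: same total weight spread evenly over all offsets.
kbar_spread = np.tile((kappa / (shifts.size * dh))[:, None], (1, shifts.size))

print(np.allclose(apply_extended_coefficient(kbar_phys, u, shifts, dh), kappa * u))
print(np.allclose(apply_extended_coefficient(kbar_spread, u, shifts, dh), kappa * u))
```

When all weight sits at h = 0 the operator reduces to pointwise multiplication by κ; spreading weight over h produces action at a distance, which is exactly what the focusing penalty is meant to suppress.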
Summary

• restriction to low frequency data makes FWI objective more quadratic, just like you always thought
• transmission inversion is easier than reflection for linear reasons, so MVA seems like a good place to look for reflection inversion approaches
• extended modeling provides a formalism for expressing MVA objectives that extend naturally to nonlinear FWI, via continuation: provision of starting models, route to FWI solution
• positive early experience with “gather flattening” nonlinear differential semblance
• “survey sinking” NDS involves wave equations with operator coefficients
• Patrick’s fingerprints are all over this subject
Thanks to...

• Florence Delprat and other organizers, EAGE
• my students and postdocs, particularly Dong Sun, Peng Shen, Chris Stolk, Kirk Blazek, Joakim Blanch, Cliff Nolan, Sue Minkoff, Mark Gockenbach, Roelof Versteeg, Michel Kern
• National Science Foundation
• Sponsors of The Rice Inversion Project
• Patrick Lailly, for inspired, inspiring, and fundamental contributions to this field
