RFIC Symposium CDR

RTUIF-04
A 12.5Gbps Analog Timing Recovery System for PRML Optical Receivers

Jomo Edwards¹, Chris Gill¹, Kelvin Tran¹, Lloyd Linder¹, Denis Zelenin², Dalius Baranauskas², Matthias Bussmann¹, Salam Elahmadi¹, Harry Tan¹
¹ Menara Networks, Irvine, CA, 92612, USA
² Pacific Microchip, Los Angeles, CA, 90045, USA
978-1-4244-3376-6/978-1-4244-3378-0/09/$25.00 © 2009 IEEE. 2009 IEEE Radio Frequency Integrated Circuits Symposium.

Abstract: This paper describes a timing recovery system (TRS) based on an analog approximation of the minimum mean squared error (MMSE) algorithm. The TRS has been fabricated in a 0.18µm, 150GHz SiGe BiCMOS process as part of a high performance Class-2 Partial Response Maximum Likelihood (PRML) dispersion tolerant optical receiver. This decision directed clock recovery architecture was implemented for the $(1+D)^2$ partial response polynomial. The TRS supports all XFI data rates [9.95, 11.09]Gbps and has been verified up to 12.5Gbps. The TRS is functional at OSNRs as low as 11dB and has been verified in the full PRML receiver at a BER of $10^{-4}$ over 400km of uncompensated single mode fiber (SMF). It dissipates 372mW from a dual 3.3V and 1.8V supply and complies with or exceeds all XFI receiver jitter specifications for Telecom (SONET OC-192 and G.709 “OTU-2”) and Datacom (Ethernet 802.3ae or Fibre Channel).

Index Terms: CDR, MLSE, partial response signaling, PLL, PRML, PRS, timing recovery.

I. INTRODUCTION

Maximum-likelihood sequence estimation (MLSE) has been shown to be a powerful technique for electronic compensation of both chromatic and polarization mode dispersion [3]-[4]. For the first time, to our knowledge, we propose a new kind of receiver based on the principles of partial-response maximum likelihood equalization techniques implemented in the analog signal domain [5]. This paper describes the timing recovery system of the receiver.

Fig. 1 shows a simplified block diagram of the TRS. The received NRZ data is first passed through an AGC controlled VGA and equalized with a gm-C continuous-time filter (CTF). The CTF approximates a linear phase or constant group delay response up to approximately 1.2 times its programmable cutoff frequency, which is optimized along with the Finite-Impulse-Response (FIR) filter transfer function for equalization of the input signal to the target partial response polynomial, $(1+D)^2$. The primary function of the TRS is to provide the FIR filter with the optimal clock phase for proper equalization. Phase information for the data samples is contained in the difference between the samples and their ideal value [1]. The TRS decision directed phase detector (DDPD) employs the MMSE algorithm to achieve this.

II. TIMING RECOVERY ARCHITECTURE

The architecture consists of three components: a time-interleaved DDPD, a frequency lock loop, and a digital controller or lock detect block (LD). The frequency lock loop is a traditional phase-frequency detector type-2 charge-pump PLL. The initial step in the frequency acquisition process is a coarse calibration of the multiple band differentially tuned LC-VCO. The VCO has a 4-bit coarse tuning control giving the LC-Tank sixteen distinct frequency bands. During coarse tuning, the VCO is set to the center frequency of each band. Then, the LD sequentially traverses each zone, comparing the center frequency (divided by 64) of each band to the reference frequency, and chooses the zone with the smallest frequency offset. Frequency acquisition is completed by the PLL before switching control to the DDPD.

The phase acquisition process can be understood by referring to the block diagram in Fig. 1. The FIR interleaves the full-rate CTF output into two partial response data vectors: the 0° phase $y_k(A)$ vector and the 180° phase $y_k(B)$ vector. In contrast to standard binary phase detection, the discrete time analog PRS has five possible levels; for our Class-2 target polynomial $(1+D)^2$, these levels are {0, ±1, ±2}. At sample time k, the DDPD uses an ideal PRS generator (the 5-level quantizer, or slicer, shown in Fig. 1) to produce an ideal sample $y^*_k$ for each sample $y_k$. An error signal between the ideal and actual sample is generated and delayed ($e_{k-1} = y^*_{k-1} - y_{k-1}$). This error signal (along with other PLL loop parameters) defines the magnitude of the analog phase adjustment. The polarity of the phase adjustment is determined by the slope ($M_k = y_k - y_{k-2}$) at each sample point of the discrete time analog PRS. The product of the slope and amplitude error produces the phase error ($\Phi_k$) for the sample data at time k. The amplitude error is computed in the opposite phase interleave with respect to its slope calculation. The voltage domain phase errors from both phases are summed together by a differential transconductor (Gm), which drives the loop filter and VCO.

The amplitude error for sample k is

$$e_k = y^*_k - y_k \tag{1}$$
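As a rough illustration of the coarse calibration step described in Section II, the band search can be sketched as a nearest-frequency lookup. This is a minimal sketch, not the authors' controller logic: the function name `select_vco_band`, the band center frequencies, and the half-rate VCO target are all hypothetical values chosen for illustration; only the sixteen bands, the divide-by-64 comparison, and the smallest-offset rule come from the text.

```python
def select_vco_band(band_center_hz, ref_hz, divide_ratio=64):
    """Return (band_index, offset_hz) for the band whose divided center
    frequency lies closest to the reference frequency."""
    best_band, best_offset = None, None
    for band, f_center in enumerate(band_center_hz):
        # The lock detector compares the divided VCO clock to the reference.
        offset = abs(f_center / divide_ratio - ref_hz)
        if best_offset is None or offset < best_offset:
            best_band, best_offset = band, offset
    return best_band, best_offset


# Hypothetical band plan: sixteen bands (4-bit control), 100 MHz apart.
band_centers = [4_600_000_000 + 100_000_000 * n for n in range(16)]

# Assumed target: a half-rate clock for 10.71 Gb/s data, seen through /64.
ref = 5_355_000_000 / 64

band, offset = select_vco_band(band_centers, ref)
print(f"selected band {band}, residual offset {offset:.0f} Hz at the divider output")
```

The residual offset left by the chosen band is what the PLL then has to pull in before control is handed to the DDPD.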
[Fig. 1: block diagram. Labeled blocks and signals: line-in NRZ, VGA with AGC, two 7-bit offset correction DACs, Gm-C CTF (continuous-time PRS), FIR with output $\sum_{i=0}^{4} k_i(n)\,x(n-i)$ at time n, T/H stages, samples $y_k(A)$, $y_{k-1}(A)$, $y_k(B)$, $y_{k-1}(B)$, 5-level quantizer, DDPD, error signals $e_{k-1}(A)$ and $e_{k-1}(B)$ with $\mathrm{sgn}\{e_{k-1}(A)\}$ and $-\mathrm{sgn}\{e_{k-1}(B)\}$, linear transconductor (Gm) with Gm load, programmable $I_\Phi$ offset correction DAC, $V_\Phi$, linear V2I converter, VCM, reference clock, PFD, charge pump, loop_sel (0/1), multi-band LC-VCO, ÷2 and ÷32 dividers, lock-detection digital controller (VCO N-bit control and ZONE, status signals), test clock, recovered CLK (fs/2).]
Fig. 1. Timing Recovery Functional Diagram.

III. MMSE BASED TIMING RECOVERY

[Fig. 2: conceptual diagram. Labeled blocks and signals: CTF output $y(t)$ sampled at time $kT + (n_k + T_j)$, 5-tap FIR filter, sample $y_k(n_k)$, ideal PRS generator output $y^*_k(n_k)$, error $e_k(n_k)$, slope calculation $My_k = y_{k+1} - y_{k-1}$, timing error detector output $z_k(n_k)$, $H_{lf}(s)$.]
Fig. 2. MMSE timing recovery conceptual diagram.

Fig. 2 depicts the basic concept of the MMSE timing recovery algorithm. MMSE based timing recovery is the application of the Least-Mean-Square (LMS) adaptation technique to acquire phase in a sampled data clock recovery system [2]. Simply stated, the timing information is obtained as the instantaneous gradient, with respect to phase, of an error signal that is proportional to the phase. The received NRZ data is passed through the CTF, whose impulse response is such that the output $y(t)$, when sampled at the appropriate phase, yields a sequence that corresponds to the target polynomial. The MMSE input signal $y(t)$ is sampled at time $kT + (n_k + T_j)$, where T is the sample interval, $n_k$ represents the optimal timing offset for the optimal sample phase of sample k, and $T_j$ is timing jitter. To arrive at this optimal timing, the phase error of the loop is derived in magnitude by the amplitude error $e_k$, which is the difference between the equalized sample $y_k$ and the quantized or ideal estimated value $y^*_k$, as shown in (1).

The mean squared error and the gradient of the squared error with respect to the timing offset are

$$\mathrm{MSE} = E\{e_k^2\} = \int_{-\infty}^{\infty} e_k^2 \, \mathrm{PDF}(y_k)\, dy_k \tag{2}$$

$$\frac{d e_k^2}{d n_k} = -2\, e_k(n_k) \left[\frac{d}{dt}\, y(t)\right]_{t = kT + n_k} \tag{3}$$

Modulation of the sample phase will produce a corresponding change in $y_k$; therefore the minimum of the MSE (2) is found by setting the derivative of (2) with respect to the sampling time to zero and setting the timing jitter to zero. The minimum occurs when the product of the amplitude error for sample k and the slope of the sample at time k is equal to zero. So, if the phase of the VCO is modulated with (4),

$$z_k = e_k \times (y_{k+1} - y_{k-1}), \tag{4}$$

the expected squared error will be minimized by adjusting the sample timing phase in the data path feedback loop.

IV. PHASE DETECTOR FUNCTIONALITY

The interleave topology of the fully differential DDPD is shown in Fig. 3. The sample under measurement ($y_n$) is sent to two track and hold based current mode logic (CML) processing paths. One path computes the amplitude error (partial response target error); the other determines the instantaneous slope of the sample whose target error is being computed in the opposite interleave with $y_n$, a data symbol under timing error measurement.

Fig. 3. DDPD interleave architecture.

The top half circuitry calculates the amplitude error $Ae_n$ by sending $y_n$ through the slicer. The slicer is a 5-level quantizer, whose digital output words are concatenated to drive a reconstruction DAC. The output of the slicer is the best estimate of the input sample to the target polynomial. Ideal equalization to the $(D^2 + 2D + 1)$ target polynomial yields five discrete equally spaced signal levels, {0, ±1, ±2}; the slicer maps $y_n$ to one of these signal levels. The difference between the quantized value ($y^*_n$) and the received sample ($y_n$) is computed and retimed in phase with its slope. The delayed error $Ae_{n-1}$ is then sent to a 6.0 dB linear amplifier which drives the analog input of a saturated multiplier. The digital input to the multiplier is the sign of the slope {±1}. The difference between the next sample and the previous sample is computed and quantized with a signed comparator. The product of the amplitude error and the sign of the slope forms the phase error for one phase of the PRS. The outputs of both interleaves are summed together in the transconductor to produce a continuous time differential phase error; Fig. 5 annotates this sum as $I_{Gm} = G_m \times (\Phi_{Ae} + \Phi_{Be})$.
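The sign-of-slope phase detector described above can be sketched behaviorally. This is an illustrative model under stated assumptions, not the fabricated circuit: the helper names (`slicer`, `pr2_sequence`, `sample_with_offset`, `ddpd_output`) are hypothetical, the equalized waveform is assumed to be piecewise linear between ideal samples (a triangular pulse model chosen only to keep the sketch short), a zero slope is treated as contributing nothing, and the 6.0 dB amplifier and Gm gain are omitted.

```python
import random

LEVELS = (-2.0, -1.0, 0.0, 1.0, 2.0)  # ideal (1+D)^2 levels


def slicer(y):
    """5-level quantizer: map a sample to the nearest ideal level."""
    return min(LEVELS, key=lambda level: abs(level - y))


def pr2_sequence(num_bits, seed=1):
    """Ideal (1+D)^2 samples for random +/-1 data, scaled to {0, +/-1, +/-2}."""
    rng = random.Random(seed)
    a = [rng.choice((-1.0, 1.0)) for _ in range(num_bits)]
    return [0.5 * a[k] + a[k - 1] + 0.5 * a[k - 2] for k in range(2, num_bits)]


def sample_with_offset(ideal, offset_ui):
    """Sample the assumed piecewise-linear waveform offset_ui (|offset| < 1 UI)
    away from the ideal instant; positive means late."""
    frac = abs(offset_ui)
    step = 1 if offset_ui >= 0 else -1
    return [(1.0 - frac) * ideal[k] + frac * ideal[k + step]
            for k in range(1, len(ideal) - 1)]


def ddpd_output(samples):
    """Sum of the mean phase errors of the two interleaves (A and B)."""
    sums, counts = [0.0, 0.0], [0, 0]
    for k in range(1, len(samples) - 1):
        amplitude_error = slicer(samples[k]) - samples[k]  # e_k = y*_k - y_k
        # The slope uses the neighbouring samples, which belong to the
        # opposite interleave of the sample whose error is measured.
        slope = samples[k + 1] - samples[k - 1]
        slope_sign = (slope > 0) - (slope < 0)
        lane = k % 2
        sums[lane] += amplitude_error * slope_sign
        counts[lane] += 1
    return sums[0] / counts[0] + sums[1] / counts[1]


ideal = pr2_sequence(2000)
early = ddpd_output(sample_with_offset(ideal, -0.1))
on_time = ddpd_output(sample_with_offset(ideal, 0.0))
late = ddpd_output(sample_with_offset(ideal, 0.1))
print(early, on_time, late)
```

Under this model the detector output is zero at the ideal sampling instant and changes sign with the direction of the timing offset, which is the property the loop relies on to steer the VCO phase.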
[Fig. 4: two plots versus phase offset (s): phase transfer (V) on top and SDR (dB) on the bottom.]
Fig. 4. DDPD phase transfer characteristic, PTC, (top) and signal to distortion ratio, SDR, (bottom) as a function of phase offset for model (ideal), schematic, and layout parasitic extracted netlist.

In addition to the traditional specifications for clock and data recovery systems (e.g., jitter tolerance and jitter transfer), our MMSE based interleaved analog architecture must satisfy a strict linearity specification. The linearity metric is a measure of the amount of “mis-equalization” the TRS introduces in the timing recovery process. The “mis-equalization” is a deviation from the ideal sample point, resulting in a static phase offset, which causes the system to lock to some point less than the maximum SDR value.

Fig. 4 shows the PTC and SDR vs. phase offset for 11.1Gbps PRBS-7 data across a span of 160km of SMF under nominal conditions. The MMSE should be zero at the maximum SDR. The theoretical maximum SDR possible at the PRML detector input for this input signal is ~20.6dB (Fig. 4 bottom), and Fig. 4 (top) shows that our circuit LMS implementation finds the minimum at 440fs offset with respect to the mathematical model. This 440fs static phase offset results in less than a 0.2 dB penalty in SDR. Simulations show a worst-case static phase offset, over PVT and offset, of 780fs.

Fig. 5. (a) The linear transconductor input and interleave summing connection and (b) the load block.

V. TRANSCONDUCTOR AND LOOP DYNAMICS

The Gm continuously sums the interleaved voltage signals in the current domain via two independent fully differential voltage-to-current converters sharing a common load, as shown in Fig. 5. The PMOS current mirror load composes the top half or current “source” of the Gm output stage. The transconductance of the Gm is controlled digitally by emitter degeneration. A 2-bit binary control is converted into a 1-of-n code, where the three resulting bits are used to selectively shunt unit resistance segments. The programmable degeneration defines the overall input voltage versus output current linearity range and sets the transconductance.

Since the phase transfer of the DDPD is a linear function of phase error, this LMS based phase detector is seen as a linear phase detector in the PLL loop dynamics. For this to remain valid, the Gm also needs to be a linear function of its input phase error. The 2-bit emitter degeneration DAC therefore gives 4 distinct loop bandwidth settings covering a 4X range. To further enhance programmability, a 3-bit binary DAC scales the final output current and final transconductance for each of the degeneration settings. The Gm output current is divided into eighths: one eighth is sent directly to the loop filter, and the remaining 87.5% of $I_{Gm}$ is linearly distributed over the 3-bit binary DAC. An additional 7-bit binary current steering DAC is also implemented to correct asymmetry or systematic offset that may be present in the timing error data path. The digitally controlled emitter degeneration (2 bits) defines the overall input voltage vs. output current linearity range and sets the initial transconductance. The 3-bit DAC only scales the final output current and, hence, the final transconductance, for a given degeneration setting. This permits another degree of freedom,
since the linearity range can now be set independently of the overall transconductance. This is beneficial, since the gain that was traded for linearity can now be recovered.

[Fig. 6 panel annotations (JDSU ONT-506 tester, jitter generation measurement filter passband [10 kHz, 80 MHz], $2^{31}-1$ PRBS over 240km of SMF): one panel shows f3dB ≈ 2.3 MHz, JGEN = 4.9 mUI (RMS), JPEAKING < 0.01 dB at a data rate of 10.71Gb/s; the other shows f3dB ≈ 7.6 MHz, JGEN = 6.4 mUI (RMS), JPEAKING < 0.01 dB at a data rate of 11.09Gb/s.]
Fig. 6. Jitter Transfer response (top) and Recovered Clock & FIR eye for two different bandwidth settings (a) 2.3MHz and (b) 7.6MHz.

VI. MEASUREMENT RESULTS

The TRS, along with the analog Viterbi Decoder, comprises the clock and data recovery unit of a fully integrated XFI receiver. The recovered clock is accessible via a high-speed linear Test-Point MUX and buffer. The Jitter Tolerance was verified for a 240km SMF span at 11.09Gbps with the TRS programmed to a bandwidth of 8.0MHz. The actual measured bandwidth for this setting was ~7.6MHz. A JDSU ONT-506 Tester was used for this measurement, and the results along with the SONET Jitter Tolerance mask are shown in Fig. 7(a). The compliance is normalized for a BER of $10^{-6}$. The top of Fig. 6(a) shows the measured Jitter Transfer, Peaking, and Jitter Generation for this measurement: f3dB ~ 7.6MHz, JGEN = 6.4mUIrms, and the Jitter Peaking is < 0.01 dB. The recovered clock and one phase of the FIR output or Viterbi input are shown in the lower half. Fig. 6(b) shows the results for the same Jitter metrics at 10.71Gbps for a lower loop bandwidth setting. The loop bandwidth was set to 3.6MHz (measured ~2.3MHz) and the measured Jitter Generation is 4.9mUIrms. Fig. 7(b)-(d) show the VGA input, one phase of the FIR, and the recovered clock and data for a $2^{31}-1$ PRBS input signal over 320km of SMF at 10.71Gbps. The BER was > $10^{-5}$ and the input OSNR was 22dB.

Fig. 7. (a) Measured jitter tolerance vs. mask, (b) 320km input eye, (c) 320km FIR output or Viterbi input, (d) 320km recovered clock and NRZ data.

VII. CONCLUSION

A fully differential time-interleaved high-speed analog timing recovery system for a high performance PRML Class-2 dispersion tolerant optical receiver has been implemented in 0.18µm SiGe BiCMOS. The TRS circuit employs a type-2 charge pump PLL for frequency acquisition, and a decision-directed LMS algorithm for phase acquisition. The TRS architecture was implemented for the $D^2 + 2D + 1$ PRS, but is suitable for any partial-response polynomial. The circuit supports all XFI data rates [9.95, 11.09]Gbps, and data rates as high as 12.5Gbps. The TRS provides SONET compliant jitter generation, tolerance and transfer and dissipates 372mW from a dual 3.3V and 1.8V supply.

REFERENCES

[1] P. Roo, R. Spencer, and P. Hurst, “A CMOS Analog Timing Recovery Circuit for PRML Detectors,” IEEE J. Solid-State Circuits, vol. 35, no. 1, pp. 56-65, Jan. 2000.
[2] S. U. H. Qureshi, “Timing recovery for equalized partial-response systems,” IEEE Trans. Commun., vol. 24, no. 12, pp. 1326-1330, Dec. 1976.
[3] R. Griffin et al., “Combination of InP MZM Transmitter and monolithic CMOS 8-state MLSE Receiver for dispersion tolerant 10 Gb/s transmission,” OThO2, OFC/NFOEC 2008.
[4] A. Faerbert, “Application of Digital Equalization in Optical Transmission Systems,” OTuE5, OFC/NFOEC 2006.
[5] S. Elahmadi et al., “A Monolithic One-Sample/Bit Partial-Response Maximum Likelihood SiGe Receiver for Electronic Dispersion Compensation of 10.7G Fiber Links,” JWA34, OFC/NFOEC 2009.
