Application: V.34 Transmitter and Receiver Implementation On The TMS320C50 DSP
Application: V.34 Transmitter and Receiver Implementation On The TMS320C50 DSP
Application: V.34 Transmitter and Receiver Implementation On The TMS320C50 DSP
This application report consists of one of the entries in a contest, The 1995 TI DSP Solutions
Challenge. The report was designed, prepared, and tested by university students who are not
employees of, or otherwise associated with, Texas Instruments. The user is solely responsible for
verifying this application prior to implementation or use in products or systems.
SPRA159
June 1997
Texas Instruments (TI) reserves the right to make changes to its products or to discontinue any
semiconductor product or service without notice, and advises its customers to obtain the latest
version of relevant information to verify, before placing orders, that the information being relied
on is current.
TI warrants performance of its semiconductor products and related software to the specifications
applicable at the time of sale in accordance with TI’s standard warranty. Testing and other quality
control techniques are utilized to the extent TI deems necessary to support this warranty.
Specific testing of all parameters of each device is not necessarily performed, except those
mandated by government requirements.
Certain applications using semiconductor products may involve potential risks of death,
personal injury, or severe property or environmental damage (“Critical Applications”).
Inclusion of TI products in such applications is understood to be fully at the risk of the customer.
Use of TI products in such applications requires the written approval of an appropriate TI officer.
Questions concerning potential risk applications should be directed to TI through a local SC
sales office.
In order to minimize risks associated with the customer’s applications, adequate design and
operating safeguards should be provided by the customer to minimize inherent or procedural
hazards.
List of Figures
1 Block Diagram of the Transmitter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Mapping Frame Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3 Scrambler Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
4 240-Point Quarter Superconstellation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
5 16-State Convolutional Encoder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
6 Block Diagram of Receiver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
7 Symbol Clock Recovery Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
8 Equalizer Adaptation-Loop Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
9 Fast Equalizer Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
VIII–1 Constellation Shaping With Shell Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-3
VIII–2 Plot of Function in Summation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-15
VIII–3 Concentric Shaping Rings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-22
IX–3 The 2D Partition Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-2
iv SPRA159
V.34 Transmitter and Receiver Implementation on the
TMS320C50 DSP
ABSTRACT
This application report presents the design of an efficient V.34 transmitter and
receiver pair. The algorithms behind the advanced encoding and decoding schemes
of the V.34 recommendation are described, and the assembly language functions
that implement these algorithms are referenced. The entire assembly language
source code of the project is provided with full documentation of the details of the
implementation. C source code from V.34 modem simulations is also provided. It is
found that the TMS320C50 digital signal processor (DSP) is exceptionally
well-suited to the task of encoding and decoding V.34 modem signals.
1 Introduction
The International Telecommunications Union (ITU) modem
recommendation V.34 represents the state of the art in modern modem
design [1]. Through the use of advanced coding techniques, modems
conforming to the V.34 standard can achieve data communication rates
previously thought unattainable on standard general-switched telephone
networks (GSTNs). Such modems are rapidly becoming the method of
choice for transmitting data quickly and reliably over standard telephone
lines. Because of this popularity, the practical design and implementation of
a V.34 class modem using conventional digital signal processing hardware
is of prime importance.
This project submission details the implementation of a V.34
transmitter-and-receiver pair using standard TMS320C50 evaluation
modules (EVMs). The project was designed using two unmodified EVMs on
identical Hewlett Packard Vectra 486 PCs. The project consists of two
components: the V.34 transmitter and the V.34 receiver.
The transmitter has two parts: an EVM-compatible object file called tran.out
and a PC-based front-end called transmit.exe.To run the transmitter, one
must first load and run the object file on the EVM, and then run the PC
executable file on the host PC.
The receiver has two analogous parts: recv.out and receive.exe. The
procedure for running them is the same as for the transmitter. To
demonstrate their operation, the output of the EVM running the transmitter
should be connected to the input of the EVM running the receiver.
The programs for both the transmitter and receiver are written entirely in
TMS320C50 assembly language. They implement a large subset of the ITU
recommendation V.34. The transmitter operates at 9.6-, 14.4-, and
19.2-kbps data rates only. However, this speed limitation is imposed only by
the choice of a 2400-baud symbol rate. The routines developed in this
project are fully general and can be used to transmit at up to the full speed
of the V.34 specification (if the maximum 3200-baud symbol rate is utilized).
The PC front-end programs are written in C and provide a rudimentary user
interface to the DSP programs. In particular, they display the transmitted and
received point constellations on the PC graphics screen. These displays are
especially useful for debugging purposes, but they also illustrate the
operation of the transmitter-receiver pair nicely. Data typed at the terminal
also can be transmitted over the data channel once a connection has been
made.
The bulk of this paper documents the algorithms used in the two programs.
The assembly language routines that implement the particular algorithms
are referenced throughout the text. The source files contain explanations of
the ’C50 implementation particulars. Therefore, these source files should be
examined in addition to the text of the project submission.
2 SPRA159
Introduction
01010101
Data Stream
{Qi,j,k,1,Qi,j,k,2,Qi,j,k,3, , Qi,j,k,q}
Trellis
Encoder
Modulate Precode
2 V.34 Transmitter
The transmitter consists of four logical units that correspond to the stages
through which the data flows as it is encoded for transmission. A block
diagram of the basic V.34 transmitter is shown in Figure 1. The first of these
units, called parse, accepts a stream of binary input data, scrambles it, and
then partitions these scrambled bits into different groups to be passed to the
next unit. The second logical unit, point-select, uses the parsed bits to select
signal points from a constellation of 2-dimensional (2D) points that has been
specified for use in V.34. The third logical unit, precode, applies a precoding
filter to the signal points to compensate for the noise-whitening filter present
in the V.34 receiver. This unit also contains the trellis encoder, connected in
a feedback configuration, which ensures that the transmitted points
correspond to a proper trellis sequence. The final unit, modulate, performs
standard quadrature amplitude modulation (QAM) of the signal points to
construct the final output waveform.
4 SPRA159
V.34 Transmitter
Output Stream
16D: j = 0 1 2 3
4D: k = 0 1
2D Symbols
2.2 Parse
The parse logical unit serves two major purposes in the transmitter. First, it
scrambles the input data to eliminate long, repetitive sequences of bits.
Such sequences could cause a periodic output signal, eliciting loss of
symbol clock tracking, or unstable behavior in the adaptive filters of the V.34
receiver. Second, it groups the scrambled bits into discrete packets for
further processing by the point-select stage of the transmitter.
Input x(n)
+ D1 D2 D18 D19 D23
Output y(n)
6 SPRA159
V.34 Transmitter
(Si,1,Si,2,...,Si,K)
(I1i,0,I2i,0,I3i,0),(Qi,0,0,1,Qi,0,0,2,...,Qi,0,0,q),(Qi,0,1,1,Qi,0,1,2,...,Qi,0,1,q),
(I1i,1,I2i,1,I3i,1),(Qi,1,0,1,Qi,1,0,2,...,Qi(,1,0,q),(Qi,1,1,1,Qi,1,1,2,...,Qi,1,1,q),
(I1i,2,I2i,2,I3i,2),(Qi,2,0,1,Qi,2,0,2,...,Qi,2,0,q),(Qi,2,1,1,Qi,2,1,2,...,Qi,2,1,q),
(I1i,3,I2i,3,I3i,3),(Qi,3,0,1,Qi,3,0,2,...,Qi,3,0,q),(Qi,3,1,1,Qi,3,1,2,...,Qi,3,1,q).
The parsing operation is carried out by the parsedata function found in the
file parse.asm. This function of the transmitter differs for each set of
constants K and q. Therefore, the actual parsing of data is carried out by one
of three sub-functions: parsedata96, parsedata144, or parsedata192. The
choice of which parsing function to use is determined at the beginning of the
execution of the program and is dynamically executed by using the bacc
assembly language instruction.
2.3 Point-Select
The unique method of 2D point selection used by the V.34 transmitter is one
of its main advancements over previous generations of modems. It is
because of these advancements that data-transmission rates twice those
of older modems are now achievable.
The V.34 transmitter selects 2D points from a subset of the 2Z2+(1,1) lattice.
This subset is called the superconstellation, and it consists of 960 total
points. A 240-point subset of the superconstellation, on the 4Z2+(1,1) lattice,
is shown in Figure 4. The full superconstellation can be obtained from this
subset by rotating it by 0_, 90_, 180_, and 270_ degrees. Each combination
of symbol rate and data rate uses a particular subset of the
superconstellation; rarely is the full superconstellation utilized. In this report,
these subsets of the superconstellation are referred to as transmission
constellations.
234 206 185 173 164 162 170 181 197 220
29 29
226 193 165 146 133 123 121 125 137 154 179 207
25 25
229 189 156 131 110 96 87 83 92 100 117 140 172 208
21 21
205 176 150 130 114 107 105 109 120 136 161 191 227
-23 -23
215 184 169 153 145 143 151 159 178 202 231
-27 -27
239
-35 -35
8 SPRA159
V.34 Transmitter
The V.34 point-selection scheme has three parts that correspond to the
three sets of bits generated in the parse unit. First, the shell mapper uses
the K S bits to generate a sequence of eight rings. Second, the eight groups
of q uncoded Q bits are used to select a point within each ring from the
one-quarter superconstellation (see Figure 4). Third, the four groups of I bits
(along with output bits U0 from the trellis encoder) are used by the differential
encoder to rotate the four pairs of 2D points. The 90_ rotations applied by
the differential encoder result in points from the full superconstellation.
The easiest and fastest way to implement shell mapping is by table lookup.
However, the memory needed to store an 8 × 228-word lookup table far
exceeds that of most modem systems (as well as the 64K of SRAM on the
’C50 evaluation module that was used for this project). In light of this fact,
a divide-and-conquer approach, which combines smaller lookup tables with
some computational effort, is used. The ITU recommendation V.34 provides
the actual algorithm that was used in this implementation of a transmitter
(see Appendix A). A very brief description of how the algorithm works
follows.
Assume that the input to the shell mapper is a K-bit number R0. Three lookup
tables are used by the algorithm. The table g2(p) is the number of ring
sequences of length two with weight p. Likewise, table g4(p) is the number
of ring sequences of length four with weight p. Finally, the table z8(p) is the
number of ring sequences of length eight with weight less than p. The first
step in the algorithm is to determine the value p such that z8(p) < R0 <
z8(p+1). This value of p is the weight of the sequence of eight rings
corresponding to R0, and the difference R0–z8(p) is its offset into the shell
of equal-weight sequences. Using the value of the offset, the algorithm then
uses the g4 table to determine the weights and offsets of the two
sub-sequences of length four, which make up the final sequence. In a similar
fashion, the g2 table is used to find the weights of the four sub-sequences
of length two that constitute the final output sequence. Given the weight of
a length-two sequence, the two rings of the sequence can be determined by
a simple conditional statement. After finding the two rings in each of the four
sub-sequences, the final sequence is found by concatenation. A more
complete description of the algorithm is beyond the scope of this paper; refer
to Appendix A for a more detailed explanation.
The shell mapping operation is accomplished by the assembly language
function called shell_map in the source file named shell.asm. It accepts an
input in the accumulator and returns a sequence of eight shells in the array
mjk. Although the shell mapping algorithm remains invariant for any number
of shells, the lookup tables change with the number of shells. In this
implementation, the 9.6-kbps data rate requires M=6 shells, while the 14.4-
and 19.2-kbps data rates require M=12 shells. Two sets of tables are stored
to accommodate both configurations. The function shell_map dynamically
loads the addresses of all tables, so it can function correctly with any number
of shells (provided that the proper tables are pre-computed).
10 SPRA159
V.34 Transmitter
2.3.2 Mapper
Mapping the eight sets of uncoded Q bits to a point within a shell is a
straightforward operation. The points in Figure 4 are each labeled with a
number ranging from 0 to 239. They are numbered in order of increasing
power, and these numbers are the point indices into a lookup table. To select
a point within shell m with a given set of q Q bits = {Q1,Q2,...,Qq}, the following
equation is used:
In essence, the Q bits are taken as a binary number, and they are added to
the index of the first point within the shell. This addition results in the index
of a point within the quarter superconstellation. This point can then be
retrieved from the lookup table for further processing by the differential
encoder.
Assume that the input to the differential encoder is {I3, I2, I1, U0}, and that the
encoder has a pair of initial points (i.e., one 4D symbol) to encode. Then, by
considering the bit pair (I3,I2) to be a 2-digit binary number, the differential
encoder calculates the modulo-4 sum
Zm + [(I , I ) ) Z
3 2 m–1] mod 4,
where Zm±1 is the previously calculated value of Zm. The 2-bit number Zm
can assume the values (0,1,2,3) and represents a rotation of Zm*90o
clockwise. The first point of the 4D pair then is rotated by the amount
specified by Zm. The second point in the 4D pair is rotated in a similar
manner, with the rotation factor computed as
Wm + [(Z ) (I , U )] mod 4,
m 1 0
where (I1,U0) is considered a 2-digit binary number. The second point then
is rotated Wm*90o clockwise.
The functions of the differential encoder are performed by three assembly
language routines:
• Compute_rotation_factorZ
• Compute_rotation_factorW
• Rotate90_cw
These routines are found in the source file rotate.asm and provide
straightforward implementations of the operations described previously.
2.4 Precode
The precode unit consists of the combined nonlinear precoder and trellis
encoder. The innovative combination of the two is another feature unique to
the V.34 modem. By including the trellis encoder in a feedback loop (as
shown in Figure 1), it is possible to compensate for the noise-whitening
prediction-error filter of the V.34 receiver without destroying the shaping gain
provided by the shell mapper.
12 SPRA159
V.34 Transmitter
The V.34 transmitter contains a local replica of this filter, the inverse of which
is used to precode the transmitted signal points. This precoding has the
effect of pre-emphasizing the points in a certain way so that they will be
properly decoded at the receiver after the noise-whitening process. The
downside of this precoding is that it destroys the trellis sequence (see next
section) that is transmitted. To remedy this situation, V.34 provides for a
correction bit to be computed by the nonlinear precoder to preserve the
validity of the trellis sequence. This correction bit, C0, is computed once for
every 4D symbol. It is then exclusive-ORed with the output Y0 of the trellis
encoder to form the bit U0, one of the input bits to the differential encoder.
The functions of the nonlinear precoder are performed by a variety of short
assembly routines in the transmitter. The filtering process is done by a pair
of routines: precoder_filter_real and precoder_filter_imag. The strict
rounding procedures dictated by the V.34 recommendation are carried out
by two functions: round_precoder_output and quantize. The function
quantize is actually a wrapper for three other functions, quantize96,
quantize144, and quantize192, each of which executes slightly different
rounding techniques for the different speeds of the transmitter. The
computation of the correction bit is done by the routine compute_C0. These
routines are found in the source files pre_filt.asm, quantize.asm, and
conv_enc.asm.
There are three 4D trellis encoders specified for use with V.34: a 16-state
code, a 32-state code, and a 64-state code. In this transmitter, however, only
the 16-state 4D code invented by L-F Wei is implemented. The heart of the
trellis encoder is the 16-state convolutional encoder shown in Figure 5. In
a typical configuration, the convolutional encoder’s two inputs would be
taken from the differentially encoded I bits. However, when configured as in
the V.34 transmitter, the inputs are computed from the output points after the
nonlinear precoding has been applied. The algorithm for computing these
inputs to the convolutional encoder from the precoded output points is given
in Appendix A. This algorithm yields the inputs for all three types of encoders
for V.34. Modifying the transmitter to use one of the other codes would
require only inserting a new state table to implement the convolutional
encoder.
Y2(m) Y2(m) Y1(m)
Y0(m)
D2 D2 D2 D2
14 SPRA159
V.34 Transmitter
2.5 Modulate
The QAM of the transmitter is done in an efficient manner. The
digital-to-analog (D/A) converter on the EVM is configured to operate at
9600 samples per second. With a symbol rate of 2400 symbols/sec, this
results in four samples per symbol. A window of eight 2D symbols is stored
at any one time, so the modulator uses a baseband-shaping filter
32-samples wide whose output is modulated up to the passband around the
carrier frequency of 1800 Hz. The shaping filter used is a raised cosine filter
with excess bandwidth factor equal to 0.12. However, to save computation,
two passband-shaping filters are used: one for the in-phase component, and
the other for the quadrature component, both of which have been
pre-modulated by the carrier frequency. The 2D symbols must be modulated
into the passband before application of the filters. Because of the particular
choice of symbol rate and carrier frequency (2400 symbols/second and
1800 Hz), this modulation of the symbols simplifies to a rotation by a multiple
of 90o. This operation is performed by a short-jump table routine.
To save further computation, the passband-shaping filters are implemented
as four banks of 8-tap interpolation filters. Instead of
256 multiply-and-accumulate instructions per output symbol (32 taps ×
4 samples × 2 channels), only 64 (8 taps × 4 samples × 2 channels) are
required.
The QAM in the transmitter is performed by two functions: QAM_mod and
QAM_wait, located in the qam_mod.asm source file. QAM_mod performs
the actual filter calculations using repeated mac and macd instructions,
while QAM_wait simply pauses the execution of the program so that the
interrupt-driven output procedure can “sync up” with the main transmitter
routines. Since the ’C50 DSP is idle over 95% of the time that the transmitter
is executing, the QAM_wait function is essential for synchronizing the
program to the symbol rate.
Demodulate
Analog
Signal
Passband x(n) Prediction y( n)
Demod
Equalizer Error Filter
Decode
Output
y(n) u( n) Bits
Viterbi Inverse Inverse
Decoder Precoder Map
16 SPRA159
V.34 Receiver
3 V.34 Receiver
The goal of the V.34 receiver is to sample the sequence of 2D points that was
transmitted by the V.34 transmitter and to perform the inverse of the
transmitter’s encoder operations on this sequence. This goal is simple in
theory but difficult to achieve in practice because the transmitted QAM signal
is distorted by a variety of nonlinearities during its journey to the receiver. In
typical modem connections, these distortions result from traversing
numerous consumer phone lines and switching stations. For the
transmitter-receiver pair of this project, the distortions arise as the
transmitted signal travels through a coaxial cable, into an oscilloscope, and
finally to the receiver. Clearly, the distortions experienced by this signal are
(presumably) less than those found on a standard phone line, but they are
significant nonetheless.
The ideal situation for any modem receiver is to sample the exact
transmitted analog signal at the precise symbol rate. By doing so, the
sequence of 2D symbols can be retrieved exactly. There are two primary
problems that prevent the receiver from doing this directly: first, the sampling
rate of the receiver may not match exactly the symbol rate of the transmitter;
and second, the received signal may not have the same “shape” as the
transmitted signal (i.e., even if the sampling rate were exact, the sampled
points would not be exactly those transmitted). The first problem is caused
by small, but significant differences in the environment of the transmitter and
the receiver. Temperature differences, clock crystal differences, and
numerous other factors contribute to slight mismatches in the symbol rates
between two modems. The second problem is due to the channel distortions
mentioned above. These variations are unpredictable characteristics of the
channel through which the signal is transmitted.
3.2 Demodulate
Given an out-of-sync and distorted analog signal, it is the responsibility of
the demodulate unit to produce solid, accurate data for the decoder unit.
This is accomplished in two separate steps. First, the demodulate unit must
compensate for any errors in the signal’s phase. This is accomplished by the
symbol clock recovery section of the demodulator. This section of the
demodulator continuously modifies the sampling rate of the analog interface
chip (AIC) to ensure that the samples taken by the analog-to-digital (A/D)
converter will be as accurate as possible. The second problem, that of
channel distortion, is handled in the adaptive equalizer section of the
demodulator. This section maintains an array of coefficients which, when
convolved with the analog input signal, enables the demodulator to undo the
effects of channel distortion. Between these two tools, the demodulator is
able to transform an analog input into digital points for use by later parts of
the receiver.
18 SPRA159
V.34 Receiver
Hu(w)
Gu(1/7200 + t)
r(t) Inpbuffer
A/D * Im(Gu * Gl)
Gl(1/7200 + t)
v(n) - Timing
Phase Error
Hl(w)
AC & DC
Filter / Hard
Limiter
v(n)
+
Threshold
+ Delay (1 Baud)
Detector
dB
dh
+ dhr 1 dhi1
NJ
dB ) j * dB + –2 * E e(nT) * inpbuffer ǒ ǓNj
nT–m T
3
where e(nT) is the passband error, inpbuffer is the 144-word buffer where
the sampled received data are stored, n is the time instant, T is one over the
baud rate, and m runs from 0 to 143.
20 SPRA159
V.34 Receiver
In the equalize routine, found in the file eq.asm, the real filter hr1 is initialized
as a unit impulse function, so that its coefficients are zero everywhere except
for the center filter element, which is a one. The imaginary filter, hi1, is
initialized to be the Hilbert transform of hr1. First, the data in the input buffer
(inpbuffer) is convolved with real and imaginary filter taps. The convolved
output, represented in the program by two arrays sigR and sigI such that the
output is equal to (sigR + j*sigI), is then demodulated. This is accomplished
by performing a complex multiplication with the angle supplied by the carrier
tracking loop. The demodulated output (modR + j*modI) is then quantized
to the nearest ideal constellation point (cnR+j*cnI) in the slicer routine. Next,
the baseband error (diffR+ j*diffI) is computed by finding the difference
between the actual and the ideal points. Finally, this error is remodulated,
by again using the output from the carrier tracking loop to form the passband
error (errorR +j*errorI).
The routine eq_adapt, also found in eq.asm, uses the computed passband
error to update the filter taps (hr1 + j*hi1) in double precision. A block
diagram of the tap update process is shown in Figure 8. During any given
baud, only 36 real and imaginary filter taps of the 144 total real and imaginary
filter taps are updated. This is to ensure that the routine never runs for more
than one-third of a baud (2777 cycles). After testing the modem, it was noted
that by the time the initial training period is finished, the taps have become
very stable, and the updating can be done even less frequently.
Input 144-Tap
X Decode
Adaptive Slicer
Unit
Equalizer
144 144
Real Imaginary -
Tap Update Carrier
Tracking
Loop
Inpbuffer
fr hr1
144 Point Spectral
fi Unscrambler hi1
Complex FFT Division
fr fi
Real Equalizer
Coefficients
144-Point
Circular Copy Unscramble2
Complex IFFT
Imaginary Equalizer
Coefficients
22 SPRA159
V.34 Receiver
48) * D(i)Ȁ
C(i) + D(i mod 48) ) D(48B()i mod
2 i mod 48) ) D(96 ) i mod 48)
2 2
where i runs from 0 to 143, and B(i) is the FFT of the periodic sequence
transmitted by the transmitter. Next, the FFT routine is used to compute the
inverse Fourier transform of C to get the equalizer coefficients in time
domain. Then, unscramble2 is called to unscramble the FFT output, fr. The
output is placed into gr, which is copied into hr1 in a circular fashion, placing
the maximum value from gr into the center of hr1. Finally, unscramble2 is
called again on fi, and the result, gr, is copied circularly into hi1 such that it
is rotated by the same amount. Through this process, the fast equalizer is
able to completely fill all real and imaginary coefficients in a single pass.
3.3 Decoder
The output of the demodulate unit is a sequence of noise-corrupted 2D
points. The first step of the decoder unit is to decide to which ideal
constellation points these distorted points correspond. This task is
accomplished by the Viterbi decoder. Next, groups of eight points (i.e., one
mapping frame) are passed through the inverse precoder and the inverse
mapper to be decoded into the output data stream.
The Viterbi algorithm is initiated once every time the receiver has
demodulated two noise-corrupted 2D symbols (i.e., once every 4D symbol
period). It is assumed that the receiver has been operating for some time,
and that the Viterbi table has already been filled with valid data. The first step
in the algorithm is to quantize the noise-corrupted points to the closest point
in each of the eight possible 4D subsets. The squared errors for each of
these eight subsets, called the branch errors, are recorded. Next, the
algorithm iterates through each of the 16 state entries in the Viterbi table that
corresponds to the current 4D symbol. For each state, it seeks to update the
cumulative path metric, described in the following paragraph.
Each state has a previous cumulative path metric and a current cumulative
path metric (which is currently being computed). These metrics are the total
sum of all the individual branch errors that make up the trellis path
terminating at that state. A small cumulative path metric indicates that the
branches composing the path are very likely to be the correct ones. A large
path metric indicates just the opposite. Since the path branches correspond
directly to certain 4D symbols, finding the path with the smallest cumulative
path metric will result in the most likely sequence of points.
To compute the current cumulative path metric for a single state, the Viterbi
algorithm checks the four possible state transitions into that state. Each of
these transitions, or branches, has a branch error associated with it that was
calculated at the beginning of the algorithm. The current cumulative path
metric is chosen to be the minimum of the sums of the four previous path
metrics and their associated branch errors. The Viterbi algorithm updates
the current path metrics for all 16 states in this manner.
24 SPRA159
V.34 Receiver
Once the current cumulative path metrics have been computed, the
algorithm can determine a 4D output symbol. It does this by following the
trellis path starting at the state with the smallest cumulative path metric back
until it reaches the oldest state in the table. The 4D symbol located at this
state then is output for further processing by the later stages of the decode
unit.
The assembly language routines for implementing the Viterbi algorithm are
found in the source file viterbi.asm. Quantizing to the 4D subsets and
calculating the branch errors are accomplished by the routines quantize4,
calculate_errors, and calc_branch_errors. Updating the cumulative path
metrics is done in two passes, eight states each, by the routines
update_path_metrics and upm_finish. Tracing back the most likely trellis
path is done by the function trace_back.
(I 3, I 2) + [Z m–1–Z m]mod 4,
(I 1, I 0) + [Z m–W m] mod 4,
to determine the bits (I3, I2, I1) for each received 4D point.
The Q bits and the ring index of each point are found through the use of an
inverse constellation point lookup table. This two-dimensional table is
indexed by the x and y coordinates of a constellation point, and its entries
contain the point indices of the point table in the transmitter. After consulting
the lookup table, the lower q bits of the index are masked off to obtain the
q Q bits. The remaining high-order bits after the first q constitute the ring
index of the point, and they are stored in an array until a block of eight ring
indices has been obtained. When eight ring indices have been stored, the
end of a mapping frame has been reached, and the unshell-mapper can be
called to compute the final K S bits of the original bit sequence. At this point,
the eight groups of Q bits have been retrieved.
Upon decoding the S, I, and Q bits, the inverse mapper reverses the
operation of the parser to combine these groups of bits into one final output
array. This array is then unscrambled using a descrambler algorithm with the
same generator polynomial as that found in the transmitter. If all went well,
the unscrambled output is the same sequence of bits that was transmitted.
The inverse mapper routines are found in the source files invmap.asm and
unshell.asm. The simple decoding of the differential encoder is done using
the routine get_binary_subset_label to determine the rotation factor of the
points and the routine compute_ibits to retrieve the I bits. The function
find_point_index returns the index of the unrotated 2D constellation point as
well as its uncoded Q bits and ring index. The routine unshell_map performs
the inverse shell mapping algorithm to determine the S bits of the mapping
frame.
26 SPRA159
Summary
4 Summary
This project is implemented on two TMS320C50 EVM systems. It consists
of two parts: a V.34 transmitter and a V.34 receiver. One-way communication
links at speeds of 9.6, 14.4, and 19.2 kbps can be established, and the
performance of the ’C50 DSP has been evaluated.
Four distinct operational units comprise the V.34 transmitter. The parse unit
transforms the input data stream into a form usable by the rest of the
transmitter. The point-select unit chooses a sequence of constellation points
in a manner that minimizes transmitted signal power and maximizes the
probability of correct decoding. The precode unit precodes the transmitted
points to compensate for a prediction error filter in the receiver and ensures
that these precoded points still conform to a valid trellis sequence. The final
unit, modulate, uses QAM to construct the final output waveform.
The V.34 receiver has been similarly divided into distinct units: the
demodulate front-end and the decode back-end. The demodulate unit is
responsible for correctly sampling the input analog waveform at the symbol
instants. It contains functions for symbol clock recovery as well as adaptive
channel equalization. The decode unit performs the inverse of the
transmitter’s encoding operations. It uses the Viterbi algorithm to determine
the most likely trellis sequence.
The operations required to perform the encoding and decoding functions for
V.34 are ideally suited to fast DSPs such as the ’C50. By implementing the
algorithms entirely in assembly language, it is possible to exploit the unique
hardware features of the ’C50. Simulations of the transmitter program
indicate that the DSP is idle (i.e., waiting for interrupts to occur) over 95%
of the execution time. Although the task of the receiver is more complex
computationally, even the most conservative estimates show that the DSP
is working less than 42% of the time. These numbers demonstrate that the
next phase of the implementation of a V.34 modem, the combination of the
transmitter and receiver functions onto one DSP, is easily achievable on a
single ’C50 DSP. Additional features of V.34, such as channel separation by
echo cancellation techniques, could be accommodated with the excess
processor time.
References
1. ITU Study Group 14, “Draft Recommendation V.34 for a Modem
Operating at Data Signalling Rates of up to 28800 bit/s for Use on the
General Switched Telephone Network and on Leased Point-to-Point
2-Wire Telephone-Type Circuits”, Document 57-E, June 1994.
2. S. A. Tretter, “Fundamentals of Trellis Shaping and Precoding”,
unpublished notes.
3. R. Laroia, N. Farvardin, and S. A. Tretter, “On Optimal Shaping of
Multidimensional Constellations”, IEEE Trans. Inform. Theory,
Vol. IT-40, pp. 1044-1056, July 1994.
4. L. F. Wei, “Trellis Coded Modulation with Multidimensional
Constellations”, IEEE Trans. Inform. Theory, vol. IT-33, pp. 483-501,
July 1987.
5. S. A. Tretter, Communication System Design Using DSP Algorithms:
With Laboratory Experiments for the TMS320C30, Plenum Publishing
Corp., 1995.
6. R. D. Gitlin, J. H. Hayes, S. B. Weinstein, Data Communication
Principles, Plenum Publishing Corp, 1992.
7. P. R. Chevillat, D. M. Maiwald, and G. Ungerboeck, “Rapid Training of
a Voiceband Data-Modem Receiver Employing an Equalizer with
Fractional-T Spaced Coefficients”, IEEE Trans. Commun., vol. COM-35,
pp. 869±876, Sept. 1987.
8. v.34 transmitter and receiver code at:
ftp://ftp.ti.com/pub/tms320bbs/c5xfiles/apprep34.exe
28 SPRA159
Constellation Shaping by Shell Mapping
A-2 SPRA159
Constellation Shaping by Shell Mapping
A. System Description
The block diagram of the transmitter for a system using shell mapping for
constellation shaping is shown in Figure VIII–1. The diagram shows the Wei
16-state 4D channel code but can easily be generalized to use any channel
code. As described in Section IV, the transmitted 2D constellation is a subset
of z2+(1/2,1/2) and this constellation is partitioned into four 2D subsets A, B,
C, and D. The Wei encoder takes in 3 bits every 4D symbol and generates
4 output bits per 4D symbol. The first and second pairs of output bits specify
the pair of 2D symbols that form the 4D subset selected by the Wei encoder.
K Bits per Blocks of N
Shaping Block of Ring Indices
N 2D Symbols (r1, r2, . . . , rN)
Shell Mapping Ring
Sequence Algorithm
L + 4M 2 + M 2 )
u u 2 (VIII–1)
The ring closest to the origin should contain the 2u+2 constellation points of
least energy. The next ring should contain the next 2u+2 points in order of
energy, etc.
2K vM N
or 2 KńN vM (VIII–2)
[2
) )
N(u 2) K
]
ń
1 N
+2 ) ) ń
u 2 K N (VIII–4)
CER s + M 2 ) ń2 ) ) ń + M 2
u 2 u 2 K N ń
–K N (VIII–5)
M v 1.5 2 ń K N (VIII–6)
Combining (VIII–2) and (VIII–6), we see that M must be limited to the range
8. G.D. Forney and L-F. Wei, “Multidimensional Constellations, Part I,” IEEE J. SAC,
Vol. 7, No. 6, August 1989, p. 887.
A-4 SPRA159
Constellation Shaping by Shell Mapping
L +M 2 ) +M 2
u 2 n–p (VIII–10)
The number of uncoded bits must be positive, so from (VIII–9) we find that
u +n*2*pw0
or 0 v p v n–2 (VIII–11)
and n w 2
v
M to increase and results in greater complexity. For a block size of N=8,
Motorola recommends using p 3.
B. Weights of Ring Blocks, the Basic Concept of Shell Mapping, and Some
Data Tables Required for Efficient Implementations
Suppose that the rings are labelled from 0 to M–1 with ring 0 being closest
to and ring M–1 furthest from the origin. Let a block of ring indices be denoted
by
r + [r , r , @@@ , r ]
1 2 N
(VIII–12)
with r iŮ AAA
{0, 1, , M–1} for i =1, ... , N. Each ring must be assigned an integer
weight W1 (ri ) where the subscript 1 indicates that the argument is a
one-dimensional vector. For example, assuming that the constellation point
coordinates are integers, the ring weight could be the average squared
distance of points in the ring from the origin. Motorola7 suggests using the
weight function w1(i) = i for i = 0, ... , M–1. Justification is given in
Appendix VIII–A. The ring weights must form a nondecreasing sequence,
that is,
w N( r ) + ȍ+
N
i 1
w 1(r i) (VIII–14)
The basic idea behind shell mapping is that the MN possible ring blocks are
arranged in an ordered list with lower weight blocks appearing closer to the
beginning of the list. There may be many blocks with the same weight and
these are called a shell. We will see how to lexicographically order the blocks
in a shell. The block at the beginning of the list will be labelled or indexed by
0 and the one at the end by MN – 1. The 2K blocks closest to the beginning
of the list which are also the 2K lowest weight blocks are used as the shell
sequences. Encoding is performed by using the binary K-tuples of shaping
bits as the indexes of the list elements. Decoding is performed, basically, by
observing that given a ring block, its index is the number of blocks below it
on the list. We will see how to achieve reasonable table storage
requirements by a “divide and conquer” approach where the N dimensional
problem is divided into two N/2 dimensional problems, etc. The problem,
then, is to efficiently assign shaping K-tuples to ring blocks and vice versa.
+ NJr AAA , r Nj
To perform shell mapping, the number of vectors with a given weight must
be known. Let Mp(j) be the number of p-tuples of ring indices r
ǒ Ǔ ) AAA ) w ǒr Ǔ. The weight generating
1, p
ȍ
function is defined to be
For example, with the Motorola weight function w1(i)=i, the number of
1-tuples of weight j is M1(j) = 1 for j = 0, ... , M–1 and is 0 otherwise. The
corresponding generating function is
A-6 SPRA159
Constellation Shaping by Shell Mapping
G 1 (z) +
k
ȍ+
M–1
0
zk + 11–z–z
M (VIII–16)
M 2p (j) + ȍ
+
j
k 0
M p(k) M p(j–k) (VIII–17)
G 2p (z) +G 2
p (z) (VIII–18)
C N (j) + ȍ+
j
k 0
M N (k) (VIII–19)
Binary shaping K-tuples can be assigned to ring blocks in many ways. The
method we will examine is based on a splitting algorithm that successively
divides blocks into pairs of half the length until blocks of length 1 are
reached. Therefore, we will assume that the original block length is a power
of 2, that is, N = 2q.
At each step, the i-tuples of rings will be ordered according to rules which
will be given shortly. The decoding, ordering, or indexing function on i-tuples
will be called Di(·).
The ordering of 1-tuples is easy. According to (VIII–13), the ring weights
must form a nondecreasing sequence. Therefore, 1-tuples will be listed in
numerical order. That is, the one-dimensional decoding function is
This is the starting point for building higher dimensional decoding functions.
As a matter of notation, let an i-tuple of ring indexes be
r [i ] + ƪr , ..., r ƫ
1 i (VIII–22)
r
[i]
1
+ [r , AAA ,
1 r iń2] and r
[i]
2
+ [r ń ) , AAA , r ]
i 2 1 i (VIII–23)
We will now see how to define a decoding function for i-tuples in terms of one
for i/2-tuples. Assume Di/2(·) is known. First, order i-tuples of rings according
to the following rules for the three possible cases:
A-8 SPRA159
Constellation Shaping by Shell Mapping
Case 3: Equal Weight i-tuples. Indexes of 1st Halves are the same
ǒ Ǔ ǒ Ǔ
When w i u [i ] = w i v [i ] and u 1 [i ] = v 1 [i ] , then list u [i ] below v [i ] if
[i]
D iń2 (u )
2
t [i]
D iń2 (v )
2 .
ǒ Ǔ+
D i v [i] {number of i-tuples below v [i]}
+ {number of i-tuples u [i] with w i(u [i]) t w (v )} [i]
ƪ ǒ Ǔ ƫ) ǒ Ǔ
and u [i] below v [i]}
+C i w i v [i ] – 1 N i v [i ]
The quantities Di(·) for i < N do not have to be computed. Also, Ci(·) is needed
only for i=N. Finally, we will see in the next paragraph how to recursively
compute Ni(·) for i = 2, 4, ... , N using Ni/2(·) and Mi/2(·). Then DN(·) can be
computed by (VIII–25) for i=N.
The vectors counted in Ni(v[i]) can be partitioned into the following three
types:
NJŤ Nj
Consider the set
u [i] w i (u [i]) + w (v ) i
[i]
Ɛ [i ]
w iń2 (u )
1
t wń i 2
[i]
(v )
1
(VIII–26)
ȍ+
[i]
w iń2 (v1 ) –1
k 0
NJŤ
Consider the set
According to (VIII–24), the number of choices for u1[i] is Ni/2(v1[i]). The total
weight must be wi(v[i]), so the number of choices for u2[i] is Mi/2[wi(v[i]) –
wi/2(v1[i])]. Thus the number of type 2 vectors is
a2 + Nń i 2
[i]
1
[i]
(v ) M iń2 [w i (v [i])–w iń2 (v )]
1
(VIII–29)
NJ ǒ Ǔ t ǒ ǓNj
Type 3: 1st Halves Identical
The number of vectors in the set
ǒǓ
is
a3 + Nń i 2 v
[i]
2
(VIII–31)
ǒǓ ǒǓ
Notice, also, that the weight of v [2i] is
w iń2 v
[i]
2
+w i
ǒv Ǔ * w ń
[i]
i 2 v
[i]
1
(VIII–32)
A-10 SPRA159
Constellation Shaping by Shell Mapping
ȍ+
[i]
ƪ ǒ Ǔ* ƫ
w iń2 (v1 )–1
ǒ Ǔ+
ǒ Ǔ ƪ ǒ Ǔƫ ǒ Ǔ
N i v [i] M iń2 ( k ) M iń2 w i v [i] k
k 0 (VIII–33)
)Nń i 2 v
[i]
1
M iń2 W iń2 v
[i]
2
)Nń i 2 v
[i]
2
w 1 (r ) +r + 0, AAA , M * 1
for r (VIII–34)
NJ
weight is
v1 [2]
+ ƪr ƫ 1 and v 2 [2]
+ ƪr ƫ 2
N2 vǒ Ǔ+
[2] ȍ+
ǒ Ǔ*1
w1 r1
ƪ ǒ Ǔ* ƫ
M 1 ( k ) M 1 w 2 v [2] k
(VIII–36)
ǒ Ǔ ƪ ǒ Ǔ * ǒ Ǔƫ ) ǒ Ǔ
k 0
)N 1 r 1 M 1 w 2 v [2] w1 r1 N1 r2
Since there is a single 1-tuple of each weight, N1(r) = 0, so the last line of
(VIII–36) is zero and
N2 v ǒ Ǔ+
[2] ȍ
r1
+
k 0
*1
ǒǓ
M1 k M1 r1 ǒ ) r * kǓ +
2
ȍ
r1
+
k 0
*1
ǒ ) r * kǓ
M1 r1 2
(VIII–37)
NJ
From Figure VIII–2 it can be seen that this sum is
N2 ǒƪ ƫǓ +
r 1, r 2
r1
M *1*r 2
for 0
for M
vr )r vM*1
vr )r vr )M*2
1
1
2
2 1
(VIII–38)
0 0 M1(0) = 1
1 1 M1(1) = 1
2 2 M1(2) = 1
D2(r) r
0 00 M2(0) = 1
–––––––––––– –––––––––––
1 01 M2(1) = 2
2 10
–––––––––––– –––––––––––
3 02 M2(2) = 3
4 11
5 20
–––––––––––– –––––––––––
6 12 M2(3) = 2
7 21
–––––––––––– –––––––––––
8 22 M2(4) = 1
D4(r) r
0 0000 M4(0) = 1
–––––––––––– –––––––––––
1 0001 M4(1) = 4 C4(0) = 1
2 0010
3 0100
4 1000
A-12 SPRA159
Constellation Shaping by Shell Mapping
–––––––––––– –––––––––––
5 0002 M4(2) = 10 C4(1) = 5
6 0011
7 0020
8 0101
9 0110
10 1001
11 1010
12 0200
13 1100
14 2000
–––––––––––– –––––––––––
15 0012 M4(3) = 16 C4(2) = 15
16 0021
17 0102
18 0111
19 0120
20 1002
21 1011
22 1020
23 0201
24 0210
25 1101
26 1110
27 2001
28 2010
29 1200
30 2100
–––––––––––– –––––––––––
31 0022 M4(4) = 19 C4(3) = 31
32 0112
33 0121
34 1012
35 1021
36 0202
37 0211
38 0220
39 1102
40 1111
41 1120
42 2002
43 2011
44 2020
45 1201
46 1210
47 2101
48 2110
49 2200
–––––––––––– –––––––––––
50 0122 M4(5) = 16 C4(4) = 50
51 1022
52 0212
53 0221
54 1112
55 1121
56 2012
57 2021
58 1202
59 1211
60 1220
61 2102
62 2111
63 2120
64 2201
65 2210
–––––––––––– –––––––––––
66 0222 M4(6) = 10 C4(5) = 66
67 1122
68 2022
69 1212
70 1221
71 2112
72 2121
73 2202
74 2211
75 2220
–––––––––––– –––––––––––
76 1222 M4(7) = 4 C4(6) = 76
A-14 SPRA159
Constellation Shaping by Shell Mapping
77 2122
78 2212
79 2221
–––––––––––– –––––––––––
80 2222 M4(8) = 1 C4(7) = 80
–––––––––––– –––––––––––
C4(8) = 81
M1 (r1 + r2 – k)
r1 + r2 – (M – 1) 0 r1 – 1 r1 + r2
k
(a) 0 ≤ r1 + r2 ≤ M – 1
M1 (r1 + r2 – k)
0 r1 + r2 – (M – 1) r1 – 1 r1 + r2
k
(b) M ≤ r1 + r2 ≤ r1 + M – 2
ǒ Ǔ + C ƪ w ǒv Ǔ * ƫ ) N ǒv Ǔ
D N v [N] N N
[N]
1 N
[N] (VIII–39)
A-16 SPRA159
Constellation Shaping by Shell Mapping
Remember that NN(v[N]) is the number of blocks with the same weight as v[N]
that are below it on the list. Also, MN(wN(v[N])) is the number of blocks with
the weight of v[N]. Therefore,
0 vN N
ǒv Ǔ t M ƪw ǒv Ǔƫ
[N]
N N
[N] (VIII–40)
Also,
ƪ ǒ Ǔƫ + ƪ ǒ Ǔ * Ǔ ) ƪ ǒ Ǔƫ
C N w N v [N] C N w N v [N] 1 M N w N v [N] (VIII–41)
Thus,
ƪ ǒ Ǔ ƫ v ǒ Ǔ t ƪ ǒ Ǔƫ
C N w N v [N] –1 D N v [N ] C N w N v [N ] (VIII–42)
This shows that the index DN will always fall in an interval bracketed by a pair
of successive CN’s.
Encoding Step 1: Compute the Weight of v[ N]
Based on (VIII–42), the weight of v[ N] is
ǒ Ǔ+
w N v [N]
max
n n NJŤ CN (n * 1) v D N
ǒv ǓNj
[N] (VIII–43)
ǒ Ǔ + D ǒv Ǔ * C ƪ w ǒv Ǔ * ƫ
N N v [N] N
[N]
N N
[N]
1 (VIII–44)
According to (VIII–33),
N ǒv
ȡ
ȧ ǒ Ǔ*
Ǔ + ȥȧ ȍ ń
w Nń2 v 1[N] 1
ƪ ǒ Ǔ ƫ ȣ
* ȧȦȧ
Ȣ + ń
Ȥ
[N]
MN (k ) MN w N v [N] k (VIII–45)
N 2 2
)NJ ń ǒ Ǔ ń ƪ ǒ Ǔ ƫNj
k 0
NN 2 v [1N ] M N 2 W Nń 2 v ( N )
) NJ ń ǒ ǓNj
NN 2 v 2[N]
Of the weight wN(v[N]) N-tuples, the number with the same weight as v[N] in
+
ƪ ǒ Ǔƫ ƪ ǒ Ǔƫ
the first half, that is, w Nń2(u [N]
1
) w Nń2(v [i]
1
), is
[N] [N ]
M Nń 2 w Nń 2 v M Nń 2 w Nń 2 v (VIII–46)
1 2
0 vN ń N 2
ǒ Ǔ ƪ ǒ Ǔƫ ) ǒ Ǔ t ƪ ǒ Ǔƫ ƪ ǒ Ǔƫ
v [1N] M Nń2 w Nń2 v [2N] N Nń2 v [2n] M Nń2 w Nń2 v 1[N] M Nń2 w N v 2[N] (VIII–47)
NJ ƪ ǒ Ǔƫ ) ǒ ǓNj * ƪ ǒ Ǔƫ ƪ ǒ Ǔƫ
Rearranging this last inequality gives
ǒ Ǔ
N Nń2 v 1[N] M Nń2 w Nń2 v [2N] N Nń2 v [2N] M Nń2 w Nń2 v 1[N] M Nń2 w N v 2[N] (VIII–48)
ȡ
+ ȧȥ ǒ Ǔ*
ń ǒ Ǔ*
ȍ+
w N 2 v [N]
ƪ ǒ Ǔ * ƫȣȧȦȧ
ȧȢ
1
1
NN v [N] M Nń2 ( k ) MNń2 wN v [N] k
Ȥ
ƪ ǒ Ǔ ƫ ƪ ǒ Ǔƫ t
k 0
NJ Ťȍ
From the inequality (VIII–48), we see that the weight of the first half of v[N]
Nj
must be
ǒ Ǔ + maxn
W Nń2 v 1[N] n
n–1
+
k 0
ƪƫ ƪ ǒ Ǔ * k ƫ v N ǒv Ǔ
M N ń2 k M N ń2 w N ń2 v [ N ] N
[N] (VIII–49)
ǒ Ǔ+
w Nń2 v [2N] w N v [N]ǒ Ǔ*w ń N 2
ǒ Ǔ
v [1N] (VIII–50)
A-18 SPRA159
Constellation Shaping by Shell Mapping
ǒ Ǔ ń ƪ ń ǒ Ǔƫ ) ń ǒ Ǔ
Now the following partial offset d can be computed:
ǒ Ǔ*
+N N
ǒv Ǔ *
[N] ȍ+
w Nń2 v 1[N]
k 0
1
Encoding Step 5: Compute the Offsets of the 1st and 2nd Halves
Remember that N Nń2 (v 2[N]) is the number of weight w Nń2 (v [2N]) N/2-tuples
below v [2N]. The total number of N/2-tuples with weight w Nń2 (v [2N]) is
M Nń2 [w Nń2 (v [2N])], so
0 vN ń N 2
ǒ Ǔ t ń ƪ ń ǒ Ǔƫ
v [2N] MN 2 wN 2 v [2N] (VIII–52)
Using the same reasoning, we see that the following inequality must also be
true:
0 vN ń N 2
ǒ Ǔ t ń ƪ ń ǒ Ǔƫ
v [1N] MN 2 wN 2 v [1N] (VIII–53)
Using the Euclidean division algorithm, N Nń2 (v [2N]) and N Nń2 (v 1[N]) can be
found by dividing d by M Nń2 [w Nń2 (v 2[N])]. N Nń2 (v 1[N]) is the remainder and
N Nń2 (v [2N]) is the quotient.
ǒǓ ƪ ǒ ǓƫȣȦȤ
Thus, to complete step 5, compute the following:
ȡȥ
+ int dńM ń
Ȣ
[N] [N]
N Nń 2 v N 2 w Nń 2 v (VIII–54)
1 2
ǒǓ ƪ ǒ Ǔƫ ǒ Ǔ
and
N Nń 2 v
[N]
2
+ d*M ń N 2 w Nń 2 v
[N]
2
N N ń2 v
[N]
1
(VIII–55)
NJ
can be found from (VIII–38). Let v = [r1,r2] so w2(v) = r1 + r2. Then from the
first part of Example 1, it follows that
+ N 2 (v ) for w 2 (v )vM*1
r1 W 2 (v ) )N 2 (v ) * (M * 1) for M 1 * t W (v ) 2
(VIII–56)
r2 +w 2 (v ) *r 1
ǒ Ǔ+ ǒ Ǔ+ ǒ Ǔ+
Thus
w 2 r [4]
1
2 and w 2 r [4]
1
ǒ Ǔ*
w 4 r [4] w 2 r [4]
1
2 *2+0
Step 4
The partial offset is d = 8 – 7 = 1
A-20 SPRA159
Constellation Shaping by Shell Mapping
Step 5
Find Offsets of 1st and 2nd Halves
ǒ ǒ ǓǓ +
M2 w2 r2 [4]
M 2 (0) +1
N2 r1ǒ Ǔ + NJ d ń M ǒ w ǒ r ǓǓ +
[4]
int 2 2 2
[4]
1
N2 ǒr Ǔ + d * M ǒw ǒr ǓǓ N ǒr Ǔ +
2
[4]
2 2 2
[4]
2 1
[4]
1 *1 1 +0
Using (VIII–56) we find that
r1 +N 2
ǒr Ǔ +
1
[4]
1, r2 +w 2
ǒr Ǔ * r
1
[4]
1 +2*1+1
r3 +N 2
ǒr Ǔ +
2
[4]
0, r4 +w 2
ǒr Ǔ * r
2
[4]
3 +0*0+0
So the encoded ring block is r[4] = [ 1100 ]
Appendix VIII-A
Justification for using the Motorola weight function is given in this appendix.
Suppose that the rings are formed from M concentric circles as shown in
Figure VIII–3.
RM – 1 RM = R
R1
R2
M–1
A0 + pR 21 + pR ń M 2 (VIII–57)
so
2
R1 +R ńM 2 (VIII–58)
A-22 SPRA159
Constellation Shaping by Shell Mapping
Ak + pR 2K ) 1 + (k ) 1) A + (k ) 1) p R 21
0
(VIII–59)
so
) 1 + (k ) 1) R 1 + (k ) 1) R ń M
2 2 2 (VIII–60)
RK
Pk + ŕ
R k) 1
Rk
r 2 dA
Ak
+ ŕ ǒ
R k) 1
Rk
r2
p Rk
2
2pr dr
)1
2
– Rk Ǔ (VIII–61)
Rk ) 1 ) Rk
ǒ Ǔ
)
4 4 2 2
Rk
1 – Rk
+ 2 + 2
2 Rk ) 1 – Rk
2
Pk + RM (k ) 0.5)
2
(VIII–62)
ǒ Ǔ
The average power in a ring block r= [r1,r2,...,rN ] is
+ȍ + )ȍ
N N
P(r ) P rk R2 0.5N rk (VIII–63)
+ k 1
M
+ k 1
Thus, any ring sequence with the same sum has the same average power.
This justifies using the sum of the ring values as a weight function.
A-24 SPRA159
Appendix B.
Selected excerpts from Dr. Steven A. Tretter’s “Fundamentals of Trellis Shaping and
Precoding” on the subject of obtaining the inputs to the convolutional encoder. Used
with permission.
A.30 A Method for Determining the Binary Subset Label From the
Coordinates of a 2D Point
The combined precoding and trellis coding scheme for V.34 will be
presented in Chapter X. Finding the input to the trellis encoder involves
finding the subset binary labels for 2D constellation points. A simple method
for finding these labels will now be presented.
Let (x,y) be a 2D constellation point. First translate this point to a lattice point
by forming
According to (IX–8)
The sum of the components of any point in 2RZ2 is some even number 2n.
Thus,
x0 ) y + 2n ) 2ƪJ ę ǒJ @ J Ǔƫ ) 2J ) J
0 2 0 1 1 0 (IX–11)
J0 + modǒx ) y , 2 Ǔ
0 0 (IX–13)
J0 = 0 J0 = 1
J1 = 0 J1 = 1 J1 = 0 J1 = 1
J2 = 0 J2 = 1 J2 = 0 J2 = 1 J2 = 0 J2 = 1 J2 = 0 J2 = 1
a A c C b B d D
2 RZ2 2 RZ2 + (2, 0) 2 RZ2 + (1, 1) 2 RZ2 + (3, 1) 2 RZ2 + (2, 1) 2 RZ2 + (0, 1) 2 RZ2 + (1, 2) 2 RZ2 + (3, 2)
+ 2n ) 2ƪ J ę ǒJ Ǔƫ ) J
Therefore, it follows from (IX–10) that
x0 1 2 0 ·J 1 1 (IX–14)
J1 + modǒx , 2 Ǔ + lsb of x
0 0 (IX–15)
Once J0 and J1 have been determined, their effects can be subtracted out
to form
ǒx , y Ǔ + ǒx , y Ǔ – J (1,2 1) – J (0, 1)
1 1
0 0 1 0
) ƪJ ę ǒJ Ǔƫ (2, 0) + RZ ) ƪJ ę ǒJ ·J Ǔƫ (1, 0)
(IX–16)
Ů
2RZ 2 2 0 ·J 1
2
2 2 0 1
The sum of the coordinates of any point in RZ 2 must be some even number
) y + 2n ) ƪJ ę ǒJ Ǔƫ
2n3. Therefore,
x1 1 3 2 0 ·J 1 (IX–17)
so
B-2 SPRA159
ƪ ę ǒ Ǔƫ + ǒ
J2 J 0 ·J 1 mod x 1 ) y , 2Ǔ
1 (IX–18)
and
J2 + modǒx ) y , 2 Ǔ ę ǒJ
1 1 0 ·J 1Ǔ (IX–19)