Low-Latency Ordered Statistics Decoding of BCH Codes
Abstract—This paper proposes a low-latency ordered statistics decoding (OSD) algorithm for BCH codes. The OSD latency is mainly caused by the Gaussian elimination (GE) that produces a systematic generator matrix of the code. Since BCH codes are binary subcodes of Reed-Solomon (RS) codes, we show that the BCH codeword candidates can be produced through the systematic generator matrix of the corresponding RS code. The systematic generator matrix of an RS code can be formed by generating the linearly independent RS codewords in parallel, replacing the GE process and enabling a low OSD latency. This paper further proposes a segmented variant that facilitates the decoding by reducing the number of test error patterns (TEPs). Complexity of the proposed OSD is also analyzed. Our simulation results show that the proposed decoding can achieve a performance similar to that of the conventional OSD, but with a lower decoding complexity. The decoding latency can be reduced substantially over the conventional OSD.

Index Terms—BCH codes, low-latency, subfield subcode, maximum likelihood decoding, ordered statistics decoding

I. INTRODUCTION

The realization of ultra-reliable low-latency communication (URLLC) requires the support of competent short-to-medium length channel codes. The transmission limit of a finite length coded system has been characterized in [1]. Recent research on short-to-medium length codes has shown that ordered statistics decoding (OSD) of BCH codes can yield a performance that is close to the transmission limit [2]–[3]. In OSD, the codeword candidates are generated through the re-encoding of test messages that are formed by altering the decisions of the most reliable independent positions (MRIPs) in a codeword. The re-encoding process requires Gaussian elimination (GE) that produces a systematic generator matrix of the code. However, due to the sequential nature of GE, its latency cannot easily be reduced, which is a long-standing challenge for OSD [4]. In order to reduce the OSD complexity, several skipping and stopping rules have been proposed in [5]–[8]. They facilitate the decoding by identifying the unpromising test error patterns (TEPs) and the maximum likelihood (ML) codeword candidate within the decoding output list, respectively, so that the unpromising TEPs are skipped or the decoding is terminated earlier. The box-and-match algorithm [9] trades off time and space complexity by considering the TEPs of small weights. Moreover, the MRIPs segmentation approach was proposed in [10], dividing the OSD operation into several segments to reduce the decoding complexity. On another front, multiple information sets generated by randomly biased log-likelihood ratios (LLRs) were proposed in [11]–[12] in order to improve the OSD performance.

However, the GE latency challenge remains, and it is addressed in this work. Since BCH codes are binary subcodes of Reed-Solomon (RS) codes, their codeword candidates can be generated through the corresponding RS codewords, which requires the RS systematic generator matrix. This matrix can be formed by generating the linearly independent RS codewords in parallel, underpinning a low decoding latency. In particular, an (n, k) BCH code is a binary subcode of an (n, k′) RS code that is defined over a binary extension field, where n is their codeword length and the dimension of the RS code is greater than that of the BCH code, i.e., k′ > k. The k′ linearly independent RS codewords can be generated in parallel using the Lagrange interpolation polynomials, forming the RS systematic generator matrix. The BCH codeword candidates are then obtained by generating the binary RS codewords with this matrix. In order to further reduce the decoding complexity, a segmented low-latency OSD is also proposed. By segmenting the original TEPs, a near-ML decoding performance can still be achieved with fewer TEPs, resulting in a lower decoding complexity. Complexity of the proposed OSD is analyzed. Our simulation results show that the decoding latency (in microseconds) can be substantially reduced over the conventional OSD. The proposed decoders yield a decoding performance similar to that of the conventional OSD with a smaller decoding output list, resulting in fewer floating point operations for identifying the most likely codeword from the list.

II. PRELIMINARIES

A. Ordered Statistics Decoding

Let F_q denote a finite field of size q, and let its extension field be denoted as F_{q^m}, where m > 1. Let f = (f0, f1, ..., fk−1) ∈ F_2^k and c = (c0, c1, ..., cn−1) ∈ F_2^n denote the message vector and codeword vector of an (n, k) BCH code, respectively, and let d denote its minimum Hamming distance. Its generator matrix G is a k × n binary matrix G = [g0, g1, ..., gn−1], where g0, g1, ..., gn−1 are column vectors of length k. Let us assume that a BCH codeword c is transmitted using BPSK modulation: 0 ↦ 1 and 1 ↦ −1. The modulated symbol sequence is x = (x0, x1, ..., xn−1), where xj ∈ {−1, 1} and j = 0, 1, ..., n − 1.
After a memoryless channel, the received symbol sequence is r = (r0, r1, ..., rn−1) ∈ R^n. Let Pr(rj | cj = 0) and Pr(rj | cj = 1) denote the channel observations of cj; its received LLR is defined as

    Lj = ln [ Pr(rj | cj = 0) / Pr(rj | cj = 1) ].   (1)

Subsequently, the hard-decision received word y = (y0, y1, ..., yn−1) ∈ F_2^n can be obtained: if Lj > 0, yj = 0; otherwise, yj = 1. Since a greater |Lj| indicates that the received information of cj is more reliable, the reliability of the received information for all coded bits can be ordered based on |Lj|, yielding a refreshed bit index sequence j0, j1, ..., jn−1 with |Lj0| ≥ |Lj1| ≥ ··· ≥ |Ljn−1|. A permuted received word can further be obtained as

    y′ = Π(y) = (yj0, yj1, ..., yjn−1),   (2)

where Π denotes the permutation function. Applying the same permutation to the columns of G yields

    G′ = Π(G) = [gj0, gj1, ..., gjn−1].   (3)
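For concreteness, the following is a minimal Python sketch of steps (1)–(3) for BPSK over an AWGN channel; the noise-variance argument and the function name are illustrative assumptions rather than part of the paper.

```python
import numpy as np

def order_statistics_setup(r, G, noise_var):
    """Sketch of (1)-(3): LLRs, hard decisions and reliability ordering.

    Assumes BPSK mapping 0 -> +1, 1 -> -1 over AWGN with the given noise
    variance, for which eq. (1) reduces to L_j = 2 * r_j / noise_var."""
    L = 2.0 * r / noise_var              # received LLRs, eq. (1)
    y = (L <= 0).astype(int)             # hard decisions: L_j > 0 -> 0, else 1
    perm = np.argsort(-np.abs(L))        # index sequence j_0, j_1, ... by decreasing |L_j|
    y_perm = y[perm]                     # permuted received word y' of eq. (2)
    G_perm = G[:, perm]                  # column-permuted generator matrix G' of eq. (3)
    return L, y, perm, y_perm, G_perm
```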
GE will then be performed on G′, reducing the columns gj0, gj1, ..., gjk−1 to weight one and yielding a systematic generator matrix

    G′′ = [g′j0, g′j1, ..., g′jn−1],   (4)

where the columns g′j0, g′j1, ..., g′jk−1 form a k × k identity submatrix. However, this cannot be achieved if the first k columns are not linearly independent. In this case, a second permutation is needed, and the GE is conducted again. This adjustment continues until the first k columns of G′ are linearly independent. Note that if a second permutation is needed, y′ will also be updated accordingly. Without further mention, we assume that the first k columns of G′ have been ensured with this property.
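Below is a minimal sketch of this GE step over GF(2). Handling a dependent pivot column by swapping in a later column is one common way to realize the second permutation; it is an assumption of this sketch rather than the paper's exact procedure, and a complete implementation would also record the swaps so that the overall permutation can be inverted later.

```python
import numpy as np

def systematic_form_gf2(G_perm, y_perm):
    """Reduce the first k columns of G' to an identity submatrix (eq. (4))."""
    G = (G_perm % 2).astype(int)
    y = y_perm.copy()
    k, n = G.shape
    for i in range(k):
        # find the first column j >= i with a nonzero entry in rows i..k-1
        pivot = next(j for j in range(i, n) if G[i:, j].any())
        if pivot != i:                      # "second permutation": swap columns i and pivot
            G[:, [i, pivot]] = G[:, [pivot, i]]
            y[[i, pivot]] = y[[pivot, i]]
        r = i + int(np.argmax(G[i:, i]))    # a row with a 1 in column i
        G[[i, r]] = G[[r, i]]               # row swap
        for rr in range(k):                 # clear column i in all other rows
            if rr != i and G[rr, i]:
                G[rr] ^= G[i]
    return G, y                             # G is G'' of eq. (4); y is the updated y'
```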
Consequently, after ensuring that the first k columns of G′ are linearly independent, the first k positions in y′ are called the MRIPs, and their index set is denoted as Υ = {j0, j1, ..., jk−1}. Let f = (yj0, yj1, ..., yjk−1) denote a message and e(ω) = (e(ω)j0, e(ω)j1, ..., e(ω)jk−1) ∈ F_2^k denote a TEP that will be used to update f, where ω = 1, 2, ..., Σ_{λ=0}^{τ} (k choose λ) and each e(ω) has at most τ nonzero entries. The test messages can be generated by

    f(ω) = f + e(ω).   (5)

The corresponding codeword candidate can be generated by

    ĉ(ω) = (ĉ0(ω), ĉ1(ω), ..., ĉn−1(ω)) = Π^{−1}(f(ω) · G′′),   (6)

where ĉ(ω) ∈ F_2^n and Π^{−1} is the inverse of the permutation function Π. Let us further define the correlation distance between y and ĉ(ω) as

    d(y, ĉ(ω)) = Σ_{j: yj ≠ ĉj(ω)} |Lj|.   (7)
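As an illustration of (5)–(7), here is a minimal Python sketch of the re-encoding loop that reuses the outputs of the earlier sketches; enumerating the TEPs with itertools and assuming that no additional column permutation was needed in the GE step are implementation choices of this sketch, not the paper's prescription.

```python
import numpy as np
from itertools import combinations

def conventional_osd_candidates(y, L, y_perm, G_sys, perm, tau):
    """Sketch of (5)-(7): enumerate TEPs, re-encode, and score candidates."""
    k, n = G_sys.shape
    f = y_perm[:k].copy()                      # hard decisions on the MRIPs
    inv = np.empty(n, dtype=int)
    inv[perm] = np.arange(n)                   # inverse permutation used in eq. (6)
    candidates = []
    for w in range(tau + 1):                   # TEP weights 0, 1, ..., tau
        for pos in combinations(range(k), w):
            f_w = f.copy()
            f_w[list(pos)] ^= 1                # test message of eq. (5)
            c_perm = (f_w @ G_sys) % 2         # re-encoding with the systematic matrix
            c_hat = c_perm[inv]                # candidate codeword of eq. (6)
            dist = np.abs(L[y != c_hat]).sum() # correlation distance of eq. (7)
            candidates.append((dist, c_hat))
    return candidates
```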
A codeword candidate with a smaller correlation distance to y is more likely to be the transmitted codeword. Let Sω = {Lj | yj = ĉj(ω)}; the elements Lj of Sω can be reordered as

    |Lξ0| ≤ |Lξ1| ≤ ··· ≤ |Lξ(n−dω−1)|,   (8)

where dω denotes the Hamming distance between y and ĉ(ω). The ML criterion is [5]

    d(y, ĉ(ω)) ≤ Σ_{j=0}^{d−dω−1} |Lξj|.   (9)

If ĉ(ω) satisfies (9), it is the ML codeword, and the OSD decoding can be terminated once it is found. Otherwise, the candidate that yields the smallest correlation distance to y is selected as the decoding output ĉopt.

Note that the GE that produces the systematic generator matrix G′′ is a sequential process, which is the source of the OSD latency challenge.
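A minimal sketch of the stopping rule in (8)–(9) follows; it assumes the minimum Hamming distance d and the LLRs are available, and the function name is illustrative.

```python
import numpy as np

def is_ml_codeword(y, L, c_hat, d):
    """Sufficient ML condition of eqs. (8)-(9) for a candidate codeword c_hat."""
    diff = (y != c_hat)
    d_w = int(diff.sum())                 # Hamming distance d_omega between y and c_hat
    dist = np.abs(L[diff]).sum()          # correlation distance d(y, c_hat), eq. (7)
    agree = np.sort(np.abs(L[~diff]))     # |L_j| over S_omega in ascending order, eq. (8)
    bound = agree[:max(d - d_w, 0)].sum() # right-hand side of eq. (9)
    return dist <= bound
```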
B. BCH Codes and RS Codes

The subfield subcode relationship between BCH codes and RS codes is stated as follows.

Definition 1 ([13]): Given two linear block codes C and C′ of length n that are defined over F_q and F_{q^m}, respectively, if C = C′ ∩ F_q^n, then C is a subcode of C′ over F_q.

Lemma 1 ([14]): An (n, k) t-error-correcting BCH code defined over F_2 is a subcode of an (n, k′) t-error-correcting RS code defined over F_{2^m}.

Note that RS codes are maximum distance separable (MDS) codes. With the same error correction capability, the RS code dimension is greater than that of the BCH subcode, i.e., k′ > k.

III. LOW-LATENCY ORDERED STATISTICS DECODING

A. RS Systematic Generator Matrix

With the permuted received word y′ of (2), let us define Θ = {j0, j1, ..., jk′−1} as the index set of its k′ most reliable positions (MRPs), and let Θc = {jk′, jk′+1, ..., jn−1} denote its complementary set. Note that since the OSD is discussed under the binary BCH code paradigm, it is assumed that y′ ∈ F_2^n; otherwise, for an RS code, y′ ∈ F_{2^m}^n. Picking up the received symbols indexed by Θ, an initial message u = (yj0, yj1, ..., yjk′−1) ∈ F_2^{k′} can be formed. We also denote the support of its symbol indices that are realized in y′ as supp(u) = {j0, j1, ..., jk′−1}. With u, the message polynomial of the (n, k′) RS code can be defined as

    Hu(x) = Σ_{j ∈ supp(u)} yj Lj(x),   (10)

where

    Lj(x) = Π_{j′ ∈ supp(u), j′ ≠ j} (x − αj′) / (αj − αj′)   (11)

is the Lagrange interpolation polynomial associated with position j.
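To make the construction concrete, the sketch below evaluates the Lagrange basis polynomials of (11) over GF(2^6), the symbol field of the (63, 57) RS code used in the later example. It assumes the evaluation-map view of the RS code with locators αj = α^j and a standard primitive polynomial; it is not a reproduction of the paper's G_RS expression (eq. (14) is not included in this excerpt).

```python
# Minimal GF(2^6) arithmetic via exp/log tables (primitive polynomial x^6 + x + 1).
M, PRIM_POLY = 6, 0b1000011
Q = 1 << M                                    # field size, 64
EXP, LOG = [0] * (2 * Q), [0] * Q
e = 1
for i in range(Q - 1):
    EXP[i] = e
    LOG[e] = i
    e <<= 1
    if e & Q:
        e ^= PRIM_POLY
for i in range(Q - 1, 2 * Q):
    EXP[i] = EXP[i - (Q - 1)]

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def gf_inv(a):
    return EXP[(Q - 1) - LOG[a]]

def rs_systematic_rows(mrp_set, n=Q - 1):
    """Row j of an RS systematic generator matrix: the Lagrange basis
    polynomial L_j(x) of eq. (11) evaluated at all n locators.  Each row is
    1 at locator j and 0 at the other MRPs, so the k' rows are systematic on
    the MRPs, and each row can be computed independently of the others."""
    locators = [EXP[i] for i in range(n)]     # assumed locators alpha^0, ..., alpha^(n-1)
    rows = {}
    for j in mrp_set:
        row = []
        for t in range(n):                    # evaluate L_j at locator alpha^t
            val = 1
            for jp in mrp_set:
                if jp == j:
                    continue
                num = locators[t] ^ locators[jp]   # x - alpha_{j'}; addition is XOR
                den = locators[j] ^ locators[jp]   # alpha_j - alpha_{j'}
                val = gf_mul(val, gf_mul(num, gf_inv(den)))
            row.append(val)
        rows[j] = row
    return rows
```

Because each row depends only on its own index j, the k′ rows can be produced by independent threads or hardware units, which is where the latency saving over the sequential GE of the conventional OSD comes from.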
Algorithm 1 Low-Latency OSD of BCH Codes
Input: Received symbol sequence r, order τ;
Output: v̂opt;
1: Compute the LLRs as in (1), and determine y;
2: Define the MRPs and u, and let dmin = +∞;
3: Generate GRS as in (14);
4: Generate the initial codeword v̂(0) as in (16);
5: For each TEP e′(ω), do
6:   Test if the codeword v̂(ω) is binary as in (22);
7:   If v̂(ω) is binary
8:     Determine d(y, v̂(ω)) as in (7);
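Equation (22) and the surrounding derivation are not included in this excerpt, but the intent of steps 5–8 can be illustrated with a small sketch: an RS codeword candidate is kept only if every symbol outside the MRPs lies in {0, 1}, and the test can abort at the first non-binary symbol. The helper below is an assumed, straightforward realization of that idea rather than the paper's exact condition.

```python
def is_binary_candidate(parity_symbols):
    """Keep an RS candidate only if all of its non-MRP symbols are binary.

    parity_symbols: the candidate's symbols on the complementary set Theta^c,
    in the order in which they are produced; checking them one at a time lets
    most TEPs be discarded after the first symbol (cf. Table II)."""
    for s in parity_symbols:
        if s not in (0, 1):
            return False
    return True
```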
significantly, resulting in a reduced decoding complexity. Note that the partition point in the MRIPs can be adjusted more flexibly to achieve a better complexity reduction, but this process remains heuristic. More numerical results on this will be provided in Section VI.

V. COMPLEXITY ANALYSIS

This section analyzes the complexity of the proposed OSD and compares it with the conventional OSD. In the conventional OSD, both binary operations and floating point operations are needed. The GE process requires n · (min{n − k, k})² binary operations.
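As a worked evaluation of this count for the (63, 45) BCH code considered in Section VI:

    n · (min{n − k, k})² = 63 · (min{18, 45})² = 63 · 18² = 20412 ≈ 2.0 × 10^4.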
For the segmented low-latency OSD, it is parameterized by (τ1 | l, τ2), where l denotes the length of the first segment. That is, e′1(ω) = (e′(ω)j0, e′(ω)j1, ..., e′(ω)jl−1) and e′2(ω) = (e′(ω)jl, e′(ω)jl+1, ..., e′(ω)jk′−1).
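For illustration, the following sketch enumerates segmented TEPs for a parameterization (τ1 | l, τ2), assuming that τ1 and τ2 bound the weights of the first and second segments, respectively; this reading matches the notation above but is not spelled out in this excerpt, and the function name is illustrative.

```python
from itertools import combinations

def segmented_teps(k_prime, l, tau1, tau2):
    """Yield TEPs e' = (e'_1 | e'_2) over the k' MRPs.

    e'_1 covers positions 0..l-1 with weight at most tau1, and
    e'_2 covers positions l..k'-1 with weight at most tau2."""
    for w1 in range(tau1 + 1):
        for pos1 in combinations(range(l), w1):
            for w2 in range(tau2 + 1):
                for pos2 in combinations(range(l, k_prime), w2):
                    tep = [0] * k_prime
                    for p in pos1 + pos2:
                        tep[p] = 1
                    yield tep
```

Restricting the two segments in this way enumerates fewer TEPs than an unsegmented search of the same overall order, which is consistent with the complexity reduction reported below.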
The performances of the Berlekamp-Massey (BM) decoding [15] and the conventional OSD [3] are presented as benchmarks. The ML decoding performances were obtained in [16]. Our results show that the low-latency OSD performance can approach that of the conventional OSD, but requires a larger decoding order. This is because k′ > k and |Θ| > |Υ|, so more errors will be introduced in the MRPs of the low-latency OSD. However, our results also show that the segmented variant can yield a similar decoding performance with a smaller order.

[Figure: FER versus SNR (dB) curves for BM decoding, the conventional OSD (1), ML decoding, the low-latency OSDs of orders 1 to 3, and the segmented low-latency OSDs (1 | 21, 1), (1 | 45, 3) and (1 | 47, 2).]
Fig. 2. Performance of the (63, 45) BCH code.

B. Decoding Complexity and Latency

As pointed out in Section V, the complexity of the proposed OSD depends on Nj′. Table II shows our numerical results of Nj′ in decoding the (63, 45) BCH code with τ = 3. Note that the BCH code is a binary subcode of the (63, 57) RS code. It can be seen that the assessment of Theorem 2 can effectively eliminate the nonbinary codewords. For example, after assessing the first symbol in Θc, i.e., v̂(ω)jk′, there are only 957 TEPs that can possibly produce BCH codewords. Moreover, the decoding output list cardinality N6 is only 7, which is far smaller than that of the conventional OSD with τ = 1. This will result in the complexity advantage of the proposed OSDs, as discussed below.

TABLE II
NUMERICAL RESULTS OF Nj′ IN DECODING THE (63, 45) BCH CODE WITH τ = 3.

j′   | 0     | 1   | 2  | 3 | 4 | 5 | 6
Nj′  | 30914 | 957 | 36 | 9 | 8 | 7 | 7

TABLE III
NUMERICAL RESULTS OF COMPLEXITY AND LATENCY IN DECODING THE (63, 45) BCH CODE.

Algorithms                    | SNR (dB) | F2/F64 oper. | Floating oper. | Latency (µs)
OSD (1)                       | 4        | 2.78 × 10^4  | 81             | 6.58 × 10^2
                              | 5        | 2.60 × 10^4  | 19             | 5.34 × 10^2
                              | 6        | 2.56 × 10^4  | 8              | 5.06 × 10^2
Low-Lat. OSD (3)              | 4        | 1.81 × 10^4  | 15             | 1.99 × 10^3
                              | 5        | 5.21 × 10^3  | 8              | 4.36 × 10^2
                              | 6        | 2.58 × 10^3  | 7              | 1.32 × 10^2
Seg. Low-Lat. OSD (1 | 45, 3) | 4        | 3.69 × 10^3  | 8              | 2.71 × 10^2
                              | 5        | 2.64 × 10^3  | 7              | 1.44 × 10^2
                              | 6        | 2.45 × 10^3  | 7              | 1.17 × 10^2

Although the proposed OSDs incur more TEPs, Table II shows that the binary codeword assessment of Theorem 2 helps eliminate the redundant ones effectively, resulting in a relatively low number of finite field operations. This assessment also leads to fewer floating point operations being required by the ML criterion. Finally, Table III also vindicates the latency advantage of the proposed OSDs. Our simulations were performed with an Intel Core i7-10710U CPU. In the proposed OSDs, each row of GRS is generated in parallel. In all OSDs, the TEPs are decoded in a serial manner. It can be seen that both the low-latency OSD and its segmented variant can effectively reduce the decoding latency over the conventional OSD, confirming the latency advantage of our designs.

ACKNOWLEDGEMENT

This work is sponsored by the National Natural Science Foundation of China (NSFC) with project ID 62071498.

REFERENCES

[1] Y. Polyanskiy, H. V. Poor, and S. Verdú, "Channel coding rate in the finite blocklength regime," IEEE Trans. Inf. Theory, vol. 56, no. 5, pp. 2307–2359, 2010.