Dorsa Fathollahi, Nariman Farsad, Seyyed Ali Hashemi, Marco Mondelli
Reed-Muller (RM) codes are one of the oldest families of codes. Recently, a recursive projection aggregation (RPA) decoder has been proposed, which achieves performance close to that of the maximum-likelihood decoder for short-length RM codes. One of its main drawbacks, however, is the large number of computations it requires. In this paper, we devise a new algorithm that lowers the computational budget while keeping performance close to that of the RPA decoder. The proposed approach consists of multiple sparse RPA decoders, each of which performs only a selection of the projections. A cyclic redundancy check (CRC) is then used to decide between the output codewords. Simulation results show that our proposed approach reduces the RPA decoder's computations by up to $80\%$ with negligible performance loss.
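The final selection step can be illustrated with a minimal sketch. It assumes each sparse decoder outputs a bit vector of the form message followed by CRC bits; the generator polynomial `CRC_POLY` and the helper names are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import List, Optional, Sequence

# Hypothetical generator polynomial x^8 + x^2 + x + 1, most significant bit first.
CRC_POLY = [1, 0, 0, 0, 0, 0, 1, 1, 1]


def crc_remainder(bits: Sequence[int], poly: Sequence[int] = CRC_POLY) -> List[int]:
    """Remainder of bits(x) * x^r divided by poly(x) over GF(2), r = deg(poly)."""
    r = len(poly) - 1
    reg = list(bits) + [0] * r
    for i in range(len(bits)):
        if reg[i]:
            for j, p in enumerate(poly):
                reg[i + j] ^= p
    return reg[-r:]


def passes_crc(word: Sequence[int], poly: Sequence[int] = CRC_POLY) -> bool:
    """True if the trailing r bits of word equal the CRC of the leading bits."""
    r = len(poly) - 1
    return crc_remainder(word[:-r], poly) == list(word[-r:])


def select_candidate(candidates: Sequence[Sequence[int]],
                     poly: Sequence[int] = CRC_POLY) -> Optional[List[int]]:
    """Return the first candidate that satisfies the CRC, or None if none does."""
    for cand in candidates:
        if passes_crc(cand, poly):
            return list(cand)
    return None


# Toy usage: two corrupted candidates and one valid candidate.
message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
valid = message + crc_remainder(message)
corrupt_a = list(valid)
corrupt_a[3] ^= 1          # error in the message part
corrupt_b = list(valid)
corrupt_b[-1] ^= 1         # error in the CRC part
chosen = select_candidate([corrupt_a, corrupt_b, valid])
```

In this sketch the first candidate passing the check wins; a real decoder might instead rank passing candidates by a reliability metric.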
Seyyed Ali Hashemi, Nghia Doan, Warren J. Gross, John Cioffi, Andrea Goldsmith
A low-complexity tree search approach is presented that achieves the maximum-likelihood (ML) decoding performance of Reed-Muller (RM) codes. The proposed approach generates a bit-flipping tree that is traversed to find the ML decoding result by performing successive-cancellation decoding after each node visit. A depth-first search (DFS) and a breadth-first search (BFS) scheme are developed and a log-likelihood-ratio-based bit-flipping metric is utilized to avoid redundant node visits in the tree. Several enhancements to the proposed algorithm are presented to further reduce the number of node visits. Simulation results confirm that the BFS scheme provides a lower average number of node visits than the existing tree search approach to decode RM codes.
Nghia Doan, Seyyed Ali Hashemi, Marco Mondelli, Warren J. Gross
A novel recursive list decoding (RLD) algorithm for Reed-Muller (RM) codes based on successive permutations (SP) of the codeword is presented. A low-complexity SP scheme applied to a subset of the symmetry group of RM codes is first proposed to carefully select a good codeword permutation on the fly. Then, the proposed SP technique is integrated into an improved RLD algorithm that initializes different decoding paths with random codeword permutations, which are sampled from the full symmetry group of RM codes. Finally, efficient latency and complexity reduction schemes are introduced that virtually preserve the error-correction performance of the proposed decoder. Simulation results demonstrate that at the target frame error rate of $10^{-3}$ for the RM code of length $256$ with $163$ information bits, the proposed decoder reduces the computational complexity by $6\%$ and the decoding latency by $22\%$ relative to the state-of-the-art semi-parallel simplified successive-cancellation decoder with fast Hadamard transform (SSC-FHT) that uses $96$ permutations from the full symmetry group of RM codes, while largely maintaining the error-correction performance and memory consumption of the semi-parallel permuted SSC-FHT decoder.
Seyyed Ali Hashemi, Carlo Condo, Marco Mondelli, Warren J. Gross
Polar codes have gained extensive attention during the past few years and have recently been selected for the next generation of wireless communications standards (5G). Successive-cancellation-based (SC-based) decoders, such as SC list (SCL) and SC flip (SCF), provide reasonable error performance for polar codes at the cost of low decoding speed. Fast SC-based decoders, such as Fast-SSC, Fast-SSCL, and Fast-SSCF, identify the special constituent codes in a polar code graph offline, produce a list of operations, store the list in memory, and feed the list to the decoder to decode the constituent codes in order efficiently, thus increasing the decoding speed. However, the list of operations depends on the code rate, and a new list must be produced whenever the rate changes, so fast SC-based decoders are not rate-flexible. In this paper, we propose a completely rate-flexible fast SC-based decoder by creating the list of operations directly in hardware, with low implementation complexity. We further propose a hardware architecture implementing the proposed method and show that the area occupation of the rate-flexible fast SC-based decoder in this paper is only $38\%$ of the total area of the memory-based baseline decoder when 5G code rates are supported.
Nghia Doan, Seyyed Ali Hashemi, Marco Mondelli, Warren J. Gross
Polar codes are a channel coding scheme for the next-generation wireless communications standard (5G). The belief propagation (BP) decoder allows for parallel decoding of polar codes, making it suitable for high-throughput applications. However, the error-correction performance of polar codes under BP decoding is far from the requirements of 5G. It has been shown that the error-correction performance of BP can be improved if the decoding is performed on multiple permuted factor graphs of polar codes. However, a different BP decoding schedule is required for each factor graph permutation, which results in the design of a different decoder for each permutation. Moreover, the factor graph permutations are selected at random, which prevents the decoder from achieving a desirable error-correction performance with a small number of permutations. In this paper, we first show that the permutations on the factor graph can be mapped into suitable permutations on the codeword positions. As a result, we can make use of a single decoder for all the permutations. In addition, we introduce a method to construct a set of predetermined permutations which can provide the correct codeword if the decoding fails on the original permutation. We show that for the 5G polar code of length $1024$, the error-correction performance of the proposed decoder is more than $0.25$ dB better than that of the BP decoder with the same number of random permutations at the frame error rate of $10^{-4}$.
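One way to picture the mapping from factor-graph permutations to codeword-position permutations is the following sketch. It assumes that a permutation of the $n=\log_2 N$ stages acts on a position by permuting the bits of its binary index, and it uses the polar transform without bit reversal; the function names are hypothetical. The check at the end confirms that, under this assumption, permuting the input and then transforming gives the same result as transforming and then permuting, which is why a single decoder can serve all permutations.

```python
from typing import List, Sequence


def polar_transform(bits: Sequence[int]) -> List[int]:
    """Multiply by the n-fold Kronecker power of [[1, 0], [1, 1]] over GF(2)."""
    x = list(bits)
    step = 1
    while step < len(x):
        for i in range(0, len(x), 2 * step):
            for j in range(i, i + step):
                x[j] ^= x[j + step]
        step *= 2
    return x


def permute_index(index: int, stage_perm: Sequence[int]) -> int:
    """Move bit k of the binary index to bit position stage_perm[k]."""
    out = 0
    for k, target in enumerate(stage_perm):
        out |= ((index >> k) & 1) << target
    return out


def position_permutation(stage_perm: Sequence[int]) -> List[int]:
    """Codeword-position permutation induced by a permutation of the stages."""
    n = len(stage_perm)
    return [permute_index(i, stage_perm) for i in range(1 << n)]


# Toy usage for N = 8 with a hypothetical stage permutation.
stage_perm = [2, 0, 1]
sigma = position_permutation(stage_perm)
u = [0, 1, 1, 0, 1, 0, 0, 1]
x = polar_transform(u)
permute_then_transform = polar_transform([u[s] for s in sigma])
transform_then_permute = [x[s] for s in sigma]
```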
Seyyed Ali Hashemi, Carlo Condo, Warren J. Gross
Polar codes have gained a significant amount of attention during the past few years and have been selected as a coding scheme for the next-generation mobile broadband standard. Among decoding schemes, successive-cancellation list (SCL) decoding provides a reasonable trade-off between the error-correction performance and hardware implementation complexity when used to decode polar codes, at the cost of limited throughput. The simplified SCL (SSCL) and its extension SSCL-SPC increase the speed of decoding by removing redundant calculations when encountering particular information and frozen bit patterns (rate one and single parity check codes), while keeping the error-correction performance unaltered. In this paper, we improve SSCL and SSCL-SPC by proving that the list size imposes a specific number of bit estimations required to decode rate one and single parity check codes. Thus, the number of estimations can be limited while guaranteeing exactly the same error-correction performance as if all bits of the code were estimated. We call the new decoding algorithms Fast-SSCL and Fast-SSCL-SPC. Moreover, we show that the number of bit estimations in a practical application can be tuned to achieve the desired speed, while keeping the error-correction performance almost unchanged. Hardware architectures implementing both algorithms are then described and implemented: it is shown that our design can achieve 1.86 Gb/s throughput, higher than the best state-of-the-art decoders.
Seyyed Ali Hashemi, Marco Mondelli, John Cioffi, Andrea Goldsmith
A two-part successive syndrome-check decoding of polar codes is proposed with the first part successively refining the received codeword and the second part checking its syndrome. A new formulation of the successive-cancellation (SC) decoding algorithm is presented that allows for successively refining the received codeword by comparing the log-likelihood ratio value of a frozen bit with its predefined value. The syndrome of the refined received codeword is then checked for possible errors. In case there are no errors, the decoding process is terminated. Otherwise, the decoder continues to refine the received codeword. The proposed method is extended to the case of SC list (SCL) decoding by terminating the decoding process when the syndrome of the best candidate in the list indicates no errors. Simulation results show that the proposed method reduces the time-complexity of SC and SCL decoders and their fast variants, especially at high signal-to-noise ratios.
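The syndrome-check half of this idea can be sketched as follows. This is a simplified illustration, not the paper's formulation: it assumes the syndrome is obtained by taking hard decisions on the channel log-likelihood ratios, applying the polar transform (which is its own inverse over GF(2)), and reading off the frozen positions. The frozen pattern and helper names are hypothetical.

```python
from typing import List, Sequence


def polar_transform(bits: Sequence[int]) -> List[int]:
    """Polar transform without bit reversal; applying it twice gives the input."""
    x = list(bits)
    step = 1
    while step < len(x):
        for i in range(0, len(x), 2 * step):
            for j in range(i, i + step):
                x[j] ^= x[j + step]
        step *= 2
    return x


def syndrome(llrs: Sequence[float], frozen: Sequence[int]) -> List[int]:
    """Values found at the frozen positions after inverting the hard decisions."""
    hard = [0 if llr >= 0 else 1 for llr in llrs]
    u_hat = polar_transform(hard)
    return [u for u, is_frozen in zip(u_hat, frozen) if is_frozen]


def can_terminate(llrs: Sequence[float], frozen: Sequence[int]) -> bool:
    """True if every frozen position is zero, so the hard decision is a codeword."""
    return not any(syndrome(llrs, frozen))


# Toy usage: N = 8 with a hypothetical frozen pattern (1 marks a frozen bit).
frozen = [1, 1, 1, 0, 1, 0, 0, 0]
u = [0, 0, 0, 1, 0, 1, 1, 0]
codeword = polar_transform(u)
clean_llrs = [1.0 - 2.0 * bit for bit in codeword]
noisy_llrs = list(clean_llrs)
noisy_llrs[5] = -noisy_llrs[5]   # one channel error
```

When `can_terminate` returns `True` the decoder can stop immediately; otherwise it would continue refining, as the abstract describes.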
Nghia Doan, Seyyed Ali Hashemi, Warren J. Gross
A novel permuted fast successive-cancellation list decoding algorithm with fast Hadamard transform (FHT-FSCL) is presented. The proposed decoder initializes $L$ $(L\ge1)$ active decoding paths with $L$ random codeword permutations sampled from the full symmetry group of the codes. The path extension in the permutation domain is carried out until the first constituent RM code of order $1$ is visited. Conventional path extension of the successive-cancellation list decoder is then utilized in the information bit domain. The simulation results show that for an RM code of length $512$ with $46$ information bits, running $20$ parallel permuted FHT-FSCL decoders with $L=4$ reduces the computational complexity by $72\%$, the decoding latency by $22\%$, and the memory consumption by $84\%$ relative to the state-of-the-art simplified successive-cancellation decoder that uses $512$ permutations sampled from the full symmetry group of the code, with similar error-correction performance at the target frame error rate of $10^{-4}$.
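The fast Hadamard transform step for a constituent RM code of order $1$ can be sketched as below. This covers only the standard correlation decoder for first-order RM codes, not the full FHT-FSCL algorithm; it assumes the codeword bit at position $j$ is $a_0 \oplus \langle a, j\rangle$ and that a positive log-likelihood ratio favors bit $0$. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def fht(values: Sequence[float]) -> List[float]:
    """Fast (Walsh-)Hadamard transform of a length-2^m sequence."""
    v = list(values)
    h = 1
    while h < len(v):
        for i in range(0, len(v), 2 * h):
            for j in range(i, i + h):
                x, y = v[j], v[j + h]
                v[j], v[j + h] = x + y, x - y
        h *= 2
    return v


def encode_rm1(a0: int, a: int, m: int) -> List[int]:
    """First-order RM codeword: bit j equals a0 XOR parity(a AND j)."""
    return [a0 ^ (bin(a & j).count("1") & 1) for j in range(1 << m)]


def decode_rm1(llrs: Sequence[float]) -> Tuple[int, int]:
    """Pick the codeword with the largest correlation; returns (a0, a)."""
    spectrum = fht(llrs)
    a = max(range(len(spectrum)), key=lambda k: abs(spectrum[k]))
    a0 = 0 if spectrum[a] >= 0 else 1
    return a0, a


# Toy usage: length-8 codeword with one sign error in the received LLRs.
sent = encode_rm1(1, 5, 3)
llrs = [1.0 - 2.0 * bit for bit in sent]
llrs[2] = -llrs[2]
decoded = decode_rm1(llrs)
```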
Elie Ngomseu Mambou, Thibaud Tonnellier, Seyyed Ali Hashemi, Warren J. Gross
Visible light communication (VLC) provides a short-range optical wireless communication through light-emitting diode (LED) lighting. Light beam flickering and dimming are among the challenges to be addressed in VLC. Conventional methods for generating flicker-free codes in VLC are based on run-length limited codes that have poor error correction performance, use lookup tables which are memory consuming, and have low transmission rates. In this paper, we propose an efficient construction of flicker-free forward error correction codes to tackle the issue of flickering in VLC. Our simulation results show that by using polar codes and at a dimming ratio of 50%, the proposed system generates flicker-free codes without using lookup tables, while having lower complexity and higher transmission rates than the standard VLC methods. For an information block length of 256, the error correction performance of the proposed scheme is $1.8$ dB and $0.9$ dB better than that of the regular schemes at the bit error rate of $10^{-6}$ for a rate of 0.44 and 0.23, respectively.
Marco Mondelli, Seyyed Ali Hashemi, John Cioffi, Andrea Goldsmith
This work analyzes the latency of the simplified successive cancellation (SSC) decoding scheme for polar codes proposed by Alamdar-Yazdi and Kschischang. It is shown that, unlike conventional successive cancellation decoding, where latency is linear in the block length, the latency of SSC decoding is sublinear. More specifically, the latency of SSC decoding is $O(N^{1-1/\mu})$, where $N$ is the block length and $\mu$ is the scaling exponent of the channel, which captures the speed of convergence of the rate to capacity. Numerical results demonstrate the tightness of the bound and show that most of the latency reduction arises from the parallel decoding of subcodes of rate $0$ or $1$.
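The effect of handling rate-$0$ and rate-$1$ subcodes in one step can be seen in a minimal sketch. It assumes min-sum updates, natural bit order, and a hypothetical frozen pattern for $N=8$. Node visits are counted as a rough stand-in for latency, which is a simplification of the latency model analyzed in the paper.

```python
from typing import Dict, List, Optional, Sequence


def f(a: float, b: float) -> float:
    """Min-sum check-node update."""
    sign = -1.0 if (a < 0) != (b < 0) else 1.0
    return sign * min(abs(a), abs(b))


def g(a: float, b: float, bit: int) -> float:
    """Variable-node update given the partial-sum bit from the left subtree."""
    return b + (1 - 2 * bit) * a


def sc_decode(llrs: Sequence[float], frozen: Sequence[int], simplified: bool = False,
              stats: Optional[Dict[str, int]] = None) -> List[int]:
    """Return the estimated codeword of this subtree; count visited nodes."""
    if stats is not None:
        stats["nodes"] += 1
    n = len(llrs)
    if simplified and all(frozen):          # rate-0 subcode: all-zero codeword
        return [0] * n
    if simplified and not any(frozen):      # rate-1 subcode: hard decisions
        return [0 if llr >= 0 else 1 for llr in llrs]
    if n == 1:
        return [0] if frozen[0] else [0 if llrs[0] >= 0 else 1]
    half = n // 2
    a, b = llrs[:half], llrs[half:]
    left = sc_decode([f(x, y) for x, y in zip(a, b)], frozen[:half], simplified, stats)
    right = sc_decode([g(x, y, s) for x, y, s in zip(a, b, left)],
                      frozen[half:], simplified, stats)
    return [l ^ r for l, r in zip(left, right)] + right


# Toy usage: a codeword of the code defined by the hypothetical frozen pattern below.
frozen = [1, 1, 1, 0, 1, 0, 0, 0]           # 1 marks a frozen bit
codeword = [1, 0, 0, 1, 0, 1, 1, 0]
magnitudes = [2.0, 3.1, 1.7, 2.6, 0.9, 3.4, 2.2, 1.3]
llrs = [(1 - 2 * bit) * mag for bit, mag in zip(codeword, magnitudes)]
plain_stats, fast_stats = {"nodes": 0}, {"nodes": 0}
plain = sc_decode(llrs, frozen, simplified=False, stats=plain_stats)
fast = sc_decode(llrs, frozen, simplified=True, stats=fast_stats)
```

Both calls return the same codeword, while the simplified variant visits fewer nodes because whole subtrees are resolved at once.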
Carlo Condo, Seyyed Ali Hashemi, Arash Ardakani, Furkan Ercan, Warren J. Gross
In blind detection, a set of candidates has to be decoded within a strict time constraint, to identify which transmissions are directed at the user equipment. Blind detection is required by the 3GPP LTE/LTE-Advanced standard, and it will be required in the 5th generation wireless communication standard (5G) as well. Polar codes have been selected for use in 5G: thus, the issue of blind detection of polar codes must be addressed. We propose a polar code blind detection scheme where the user ID is transmitted instead of some of the frozen bits. A first, coarse decoding phase helps select a subset of candidates that is decoded by a more powerful algorithm: an early stopping criterion is also introduced for the second decoding phase. Simulation results show good missed detection and false alarm rates, along with substantial latency gains thanks to early stopping. We then propose an architecture to implement the devised blind detection scheme, based on a tunable decoder that can be used for both phases. The architecture is synthesized and implementation results are reported for various system parameters. The reported area occupation and latency, obtained in 65 nm CMOS technology, are able to meet 5G requirements, and are guaranteed to meet them with even less resource usage in the latest technology nodes.
Seyyed Ali Hashemi, Marco Mondelli, S. Hamed Hassani, Rudiger Urbanke, Warren J. Gross
Polar codes represent one of the major recent breakthroughs in coding theory and, because of their attractive features, they have been selected for the incoming 5G standard. As such, a lot of attention has been devoted to the development of decoding algorithms with good error performance and efficient hardware implementation. One of the leading candidates in this regard is represented by successive-cancellation list (SCL) decoding. However, its hardware implementation requires a large amount of memory. Recently, a partitioned SCL (PSCL) decoder has been proposed to significantly reduce the memory consumption. In this paper, we examine the paradigm of PSCL decoding from both theoretical and practical standpoints: (i) by changing the construction of the code, we are able to improve the performance at no additional computational, latency or memory cost, (ii) we present an optimal scheme to allocate cyclic redundancy checks (CRCs), and (iii) we provide an upper bound on the list size that allows MAP performance.
Carlo Condo, Seyyed Ali Hashemi, Warren J. Gross
In blind detection, a set of candidates has to be decoded within a strict time constraint, to identify which transmissions are directed at the user equipment. Blind detection is an operation required by the 3GPP LTE/LTE-Advanced standard, and it will be required in the 5th generation wireless communication standard (5G) as well. We propose a blind detection scheme based on polar codes, where the radio network temporary identifier (RNTI) is transmitted instead of some of the frozen bits. A low-complexity decoding stage decodes all candidates, selecting a subset that is decoded by a high-performance algorithm. Simulation results show good missed detection and false alarm rates that meet the system specifications. We also propose an early stopping criterion for the second decoding stage that can reduce the number of operations performed, improving both average latency and energy consumption. The detection speed is analyzed and different system parameter combinations are shown to meet the stringent timing requirements, leading to various implementation trade-offs.
Nghia Doan, Seyyed Ali Hashemi, Elie Ngomseu Mambou, Thibaud Tonnellier, Warren J. Gross
Polar codes are the first class of error-correcting codes that provably achieve the channel capacity at infinite code length. They were selected for use in the fifth generation of cellular mobile communications (5G). In practical scenarios such as 5G, a cyclic redundancy check (CRC) is concatenated with polar codes to improve their finite-length performance. This is mostly beneficial for sequential successive-cancellation list decoders. However, for parallel iterative belief propagation (BP) decoders, the CRC is only used as an early stopping criterion with incremental error-correction performance improvement. In this paper, we first propose a CRC-polar BP (CPBP) decoder by exchanging the extrinsic information between the factor graph of the polar code and that of the CRC. We then propose a neural CPBP (NCPBP) algorithm which improves the CPBP decoder by introducing trainable normalizing weights on the concatenated factor graph. Our results on a 5G polar code of length 128 show that at the frame error rate of $10^{-5}$ and with a maximum of 30 iterations, the error-correction performance of CPBP and NCPBP is approximately 0.25 dB and 0.5 dB better than that of the conventional CRC-aided BP decoder, respectively, while introducing almost no latency overhead.
Seyyed Ali Hashemi, Nghia Doan, Thibaud Tonnellier, Warren J. Gross
A deep-learning-aided successive-cancellation list (DL-SCL) decoding algorithm for polar codes is introduced, with deep-learning-aided successive-cancellation (DL-SC) decoding being a specific case of it. The DL-SCL decoder works by allowing additional rounds of SCL decoding when the first SCL decoding attempt fails, using a novel bit-flipping metric. The proposed bit-flipping metric exploits the inherent relations between the information bits in polar codes that are represented by a correlation matrix. The correlation matrix is then optimized using emerging deep-learning techniques. Performance results on a polar code of length 128 with 64 information bits concatenated with a 24-bit cyclic redundancy check show that the proposed bit-flipping metric in the proposed DL-SCL decoder requires up to 66% fewer multiplications and up to 36% fewer additions, without any need to evaluate transcendental functions, while providing almost the same error-correction performance as the state of the art.
Seyyed Ali Hashemi, Marco Mondelli, Arman Fazeli, Alexander Vardy, John Cioffi, Andrea Goldsmith
This paper characterizes the latency of the simplified successive-cancellation (SSC) decoding scheme for polar codes under hardware resource constraints. In particular, when the number of processing elements $P$ that can perform SSC decoding operations in parallel is limited, as is the case in practice, the latency of SSC decoding is $O\left(N^{1-1/\mu}+\frac{N}{P}\log_2\log_2\frac{N}{P}\right)$, where $N$ is the block length of the code and $\mu$ is the scaling exponent of the channel. Three direct consequences of this bound are presented. First, in a fully-parallel implementation where $P=\frac{N}{2}$, the latency of SSC decoding is $O\left(N^{1-1/\mu}\right)$, which is sublinear in the block length. This recovers a result from our earlier work. Second, in a fully-serial implementation where $P=1$, the latency of SSC decoding scales as $O\left(N\log_2\log_2 N\right)$. The multiplicative constant is also calculated: we show that the latency of SSC decoding when $P=1$ is given by $\left(2+o(1)\right) N\log_2\log_2 N$. Third, in a semi-parallel implementation, the smallest $P$ that gives the same latency as that of the fully-parallel implementation is $P=N^{1/\mu}$. The tightness of our bound on SSC decoding latency and the applicability of the foregoing results are validated through extensive simulations.
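As a worked check, the first two consequences follow by direct substitution into the bound. The shorthand $T(N,P)$ for the expression inside the $O(\cdot)$ is introduced here only for illustration.

$$T(N,P) = N^{1-1/\mu}+\frac{N}{P}\log_2\log_2\frac{N}{P}$$

$$P=\tfrac{N}{2}:\quad \frac{N}{P}=2,\quad \log_2\log_2 2=\log_2 1=0 \;\Longrightarrow\; T\left(N,\tfrac{N}{2}\right)=N^{1-1/\mu}$$

$$P=1:\quad T(N,1)=N^{1-1/\mu}+N\log_2\log_2 N = O\left(N\log_2\log_2 N\right)$$

The last step holds because $1-1/\mu<1$, so the term $N^{1-1/\mu}$ grows more slowly than $N$ and is dominated by $N\log_2\log_2 N$ for large $N$.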
Nghia Doan, Seyyed Ali Hashemi, Warren Gross
In this paper, we address the problem of selecting factor-graph permutations of polar codes under belief propagation (BP) decoding to significantly improve the error-correction performance of the code. In particular, we formalize the factor-graph permutation selection as the multi-armed bandit problem in reinforcement learning and propose a decoder that acts like an online-learning agent that learns to select good factor-graph permutations during the course of decoding. We use state-of-the-art algorithms for the multi-armed bandit problem and show that for a 5G polar code of length 128 with 64 information bits, the proposed decoder has an error-correction performance gain of around 0.125 dB at the target frame error rate of $10^{-4}$, when compared to the approach that randomly selects the factor-graph permutations.
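The bandit formulation can be sketched as follows. The abstract does not name the bandit algorithm used, so this sketch uses UCB1 as one well-known choice. Each arm stands for one factor-graph permutation, and the reward callback `toy_decoder` is a hypothetical stand-in for running BP decoding on that permutation and reporting success.

```python
import math
from typing import Callable, List


def ucb1_select(counts: List[int], rewards: List[float], t: int) -> int:
    """Pick the arm with the largest UCB1 index; untried arms go first."""
    for arm, pulls in enumerate(counts):
        if pulls == 0:
            return arm
    return max(
        range(len(counts)),
        key=lambda a: rewards[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


def run_bandit(try_permutation: Callable[[int], float],
               num_arms: int, rounds: int) -> List[int]:
    """Run UCB1 for a number of decoding attempts; return pulls per arm."""
    counts = [0] * num_arms
    rewards = [0.0] * num_arms
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, rewards, t)
        counts[arm] += 1
        rewards[arm] += try_permutation(arm)
    return counts


def toy_decoder(arm: int) -> float:
    """Hypothetical reward: permutation 2 always decodes, the others never do."""
    return 1.0 if arm == 2 else 0.0


pulls = run_bandit(toy_decoder, num_arms=4, rounds=200)
best_arm = pulls.index(max(pulls))
```

In this toy setting the agent concentrates its attempts on the one permutation that decodes successfully, while still occasionally exploring the others.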
Nghia Doan, Seyyed Ali Hashemi, Warren J. Gross
This work presents a fast successive-cancellation list flip (Fast-SCLF) decoding algorithm for polar codes that addresses the high latency issue associated with the successive-cancellation list flip (SCLF) decoding algorithm. We first propose a bit-flipping strategy tailored to the state-of-the-art fast successive-cancellation list (FSCL) decoding that avoids tree-traversal in the binary tree representation of SCLF, thus reducing the latency of the decoding process. We then derive a parameterized path selection error model to accurately estimate the bit index at which the correct decoding path is eliminated from the initial FSCL decoding. The trainable parameter is optimized online based on an efficient supervised learning framework. Simulation results show that for a polar code of length 512 with 256 information bits, with similar error-correction performance and memory consumption, the proposed Fast-SCLF decoder reduces up to $73.4\%$ of the average decoding latency of the SCLF decoder with the same list size at the frame error rate of $10^{-4}$, while incurring a maximum computational complexity overhead of $27.6\%$. For the same polar code of length 512 with 256 information bits and at practical signal-to-noise ratios, the proposed decoder with list size 4 reduces $89.3\%$ and $43.7\%$ of the average complexity and decoding latency of the FSCL decoder with list size 32 (FSCL-32), respectively, while also reducing $83.2\%$ of the memory consumption of FSCL-32. The significant improvements of the proposed decoder come at the cost of $0.07$ dB error-correction performance degradation compared with FSCL-32.
Seyyed Ali Hashemi, Alexios Balatsoukas-Stimming, Pascal Giard, Claude Thibeault, Warren J. Gross
Successive-cancellation list (SCL) decoding is an algorithm that provides very good error-correction performance for polar codes. However, its hardware implementation requires a large amount of memory, mainly to store intermediate results. In this paper, a partitioned SCL algorithm is proposed to reduce the large memory requirements of the conventional SCL algorithm. The decoder tree is broken into partitions that are decoded separately. We show that with careful selection of list sizes and number of partitions, the proposed algorithm can outperform conventional SCL while requiring less memory.
Seyyed Ali Hashemi, Carlo Condo, Warren J. Gross
Polar codes are capacity-achieving error-correcting codes that can be decoded through the successive-cancellation algorithm. To improve its error-correction performance, a list-based version called successive-cancellation list (SCL) decoding has been proposed; however, it substantially increases the number of time-steps in the decoding process. The simplified SCL (SSCL) decoding algorithm exploits constituent codes within the polar code structure to greatly reduce the required number of time-steps without introducing any error-correction performance loss. In this paper, we propose a faster decoding approach to decode one of these constituent codes, the Rate-1 node. We use this Rate-1 node decoder to develop Fast-SSCL. We demonstrate that only a list-size-bound number of bits needs to be estimated in Rate-1 nodes and that Fast-SSCL exactly matches the error-correction performance of SCL and SSCL. This technique can potentially greatly reduce the total number of time-steps needed for polar code decoding: analysis on a set of case studies shows that Fast-SSCL requires up to 66.6% fewer time-steps than SSCL and up to 88.1% fewer than SCL.