Preparing your result...
Loading...
Press Esc to dismiss this message

Hardware-efficient low density parity check code for digital communications (23-Feb-2010)

Thumbnail
US Patent Publication (Source: USPTO)
Publication No. US 7669109 B2 published on 23-Feb-2010
Application No. US 11/463236 filed on 08-Aug-2006
Abstract (English)
A low density parity check (LDPC) code for a belief propagation decoder circuit is disclosed. LDPC code is arranged as a macro matrix (H) representing block columns and block rows of a corresponding parity check matrix (Hpc). Each non-zero entry corresponds to a permutation matrix with a shift corresponding to the position of the permutation matrix entry in the macro matrix. The block columns are grouped, so that only one column in the group contributes to the parity check sum in a row. A parity check value estimate memory is arranged in banks logically connected in various data widths and depths. A parallel adder generates extrinsic estimates for generating new parity check value estimates that are forwarded to bit update circuits for updating of probability values. Parallelism, time-sequencing of ultrawide parity check rows, and pairing of circuitry to handle ultrawide code rows, are also disclosed.
Inventors/Applicants
Hocevar, Dale E.
Plano, TX, US
Assignees
Texas Instruments Incorporated
Dallas, TX, US
Classifications
International (2006.01): H03M 13/00; H03M 13/03
National: 714/780; 714/794; 714/795
Field of Search: 714/780; 714/794; 714/795
Patent References
US 5406570 A Method for a maximum likelihood decoding of a convolutional code with decision weighting, and corresponding decoder Apr-1995
US 5446747 A Error-correction coding method with at least two systematic convolutional codings in parallel, corresponding iterative decoding method, decoding module and decoder Aug-1995
US 5563897 A Method for detecting information bits processed by concatenated block codes Oct-1996 [+18] [-18]
US 5630156 A Process for parallel operation of several computation units, especially in image processing, and corresponding architecture May-1997
US 6023783 A Hybrid concatenated codes and iterative decoding Feb-2000
US 6065147 A Process for transmitting information bits with error correction coding, coder and decoder for the implementation of this process May-2000
US 6108388 A Iterative-structure digital signal reception device, and module and method therefor Aug-2000
US 6119264 A Data block convolutional coding device and method, and corresponding decoding method and device Sep-2000
US 6122763 A Process for transmitting information bits with error correction coding and decoder for the implementation of this process Sep-2000
US 6473010 B1 Method and apparatus for determining error correction code failure rate for iterative decoding algorithms Oct-2002 341/107
US 6539367 B1 Methods and apparatus for decoding of general codes on probability dependency graphs Mar-2003 706/14
US 6633856 B2 Methods and apparatus for decoding LDPC codes Oct-2003 706/15
US 6895547 B2 Method and apparatus for low density parity check encoding of data May-2005 714/801
US 6938196 B2 Node processors for use in parity check decoders Aug-2005 714/752
US 7139959 B2 Layered low density parity check decoding for digital communications Nov-2006 714/752
US 7178080 B2 Hardware-efficient low density parity check code for digital communications Feb-2007 714/752
US 7181676 B2 Layered decoding approach for low density parity check (LDPC) codes Feb-2007 714/780
US 2003/0033575 A1 Methods and apparatus for decoding LDPC codes Feb-2003
US 2003/0043487 A1 Recording and reproducing apparatus, signal decoding circuit, error correction method and iterative decoder Mar-2003
US 2003/0065989 A1 Evaluating and optimizing error-correcting codes using projective analysis Apr-2003
US 2003/0104788 A1 Decoding architecture for low density parity check codes Jun-2003
Other References
Tanner et al. “A Class of Group-Structured LDPC Codes”, ISTCA-2001 Proceedings (Ambleside, England, 2001). [+7] [-7]
Richardson et al., “Design of Capacity-Approaching Irregular Low-Density Parity Check Codes”, Trans. on Information Theory, vol. 47, No. 2 (IEEE, Feb. 2001), pp. 619-637.
Zhang et al., “VLSI Implementation-Oriented (3,k)-Regular Low-Density Parity-Check Codes”, IEEE Workshop on Signal Processing Systems (Sep. 2001), pp. 25-36.
Boutillon et al., “Decoder-First Code Design”, Proc. Int'l Symp. on Turbo Codes and Related Topics (Brest, France, Sep. 2001).
Sridhara et al., “Low Density Parity Check Codes from Permutation Matrices”, 2001 Conference on Information Sciences and Systems (Johns Hopkins, Mar. 2001).
Chung et al., “Analysis of sum-product decoding of low-density parity-check codes using a Gaussian approximation”, Trans. Information Theory, vol. 47, No. 2 (IEEE, Feb. 2001), pp. 657-670.
Richardson et al., “Efficient Encoding of Low-Density Parity-Check Codes”, Trans. Information Theory, vol. 47, No. 2 (IEEE, Feb. 2001), pp. 638-656.
Mackay et al., “Comparison of Constructions of Irregular Gallager Codes”, Trans. on Communications, vol. 47, No. 10 (IEEE, 1999), pp. 1449-1454.
Prior Publications
Related Documents
Continuation of application No. US 10/329597 00, filed on 26-Dec-2002, now Pat. No. US 7178080 A. [+1] [-1]
Provisional application No. US 60/403668 00, filed on 15-Aug-2002.
Examiners
Primary: Torres, Joseph D
Attorney, Agent or Firm
Shaw, Steven A. [+2] [-2]
Brady, W. James
Telecky, Jr., Frederick J.

Supplemental Information (Source: DOCDB)
Inventors
HOCEVAR DALE E
US
Assignees/Applicants
TEXAS INSTRUMENTS INC
US
Priority
US 463236 A  08-Aug-2006 [+2] [-2]
US 329597 A  26-Dec-2002
US 403668 P  15-Aug-2002
Classifications
International (2010.01): H03M 13/00
International (2006.01): H03M 13/00; H03M 13/03; H03M 13/11; H04L 1/00
European: H04L 1/00B5H; H03M 13/11; H04L 1/00B5E5; H04L 1/00B7B
Preview up to the first 8 page images of this publication.
--- Page 1 ---
Page 1
--- Page 2 ---
Page 2
--- Page 3 ---
Page 3
--- Page 4 ---
Page 4
--- Page 5 ---
Page 5
--- Page 6 ---
Page 6
--- Page 7 ---
Page 7
--- Page 8 ---
Page 8
(Source: USPTO)
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation of copending U.S. application Ser. No. 10/329,597, filed Dec. 26, 2002, which in turn claims priority, under 35 U.S.C. §119(e), of Provisional Application No. 60/403,668, filed Aug. 15, 2002.
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
Not applicable.
BACKGROUND OF THE INVENTION
This invention is in the field of digital data communications, and is more specifically directed to redundant coding for error detection and correction in such communications.
High-speed data communications, for example in providing high-speed Internet access, is now a widespread utility for many businesses, schools, and homes. At this stage of development, such access is provided according to an array of technologies. Data communications are carried out over existing telephone lines, with relatively slow data rates provided by voice band modems (e.g., according to the current v.92 communications standards), and at higher data rates using Digital Subscriber Line (DSL) technology. Another modern data communications approach involves the use of cable modems communicating over coaxial cable, such as provided in connection with cable television services. The Integrated Services Digital Network (ISDN) is a system of digital phone connections over which data is transmitted simultaneously across the world using end-to-end digital connectivity. Localized wireless network connectivity according to the IEEE 802.11 standard has become very popular for connecting computer workstations and portable computers to a local area network (LAN), and often through the LAN to the Internet. Wireless data communication in the Wide Area Network (WAN) context, which provides cellular-type connectivity for portable and handheld computing devices, is expected to also grow in popularity.
A problem that is common to all data communications technologies is the likelihood of corruption of data due to noise. As is fundamental in the art, the signal-to-noise ratio for a communications channel is a degree of goodness of the communications carried out over that channel, as it conveys the relative strength of the signal that carries the data (as attenuated over distance and time), to the noise present on that channel. These factors relate directly to the likelihood that a data bit or symbol received over the channel will be in error relative to the data bit or symbol as transmitted. This likelihood is reflected by the error probability for the communications over the channel, commonly expressed as the Bit Error Rate (BER) ratio of errored bits to total bits transmitted. In short, the likelihood of error in data communications must be considered in developing a communications technology. Techniques for detecting and correcting errors in the communicated data must be incorporated for the communications technology to be useful.
Error detection and correction techniques are typically implemented through the use of redundant coding of the data. In general, redundant coding inserts data bits into the transmitted data stream that do not add any additional information, but that indicate whether an error is present in the received data stream. More complex codes provide the ability to deduce the true transmitted data from a received data stream, despite the presence of errors.
Many types of redundant codes that provide error correction have been developed. One type of code simply repeats the transmission, for example repeating the payload twice, so that the receiver deduces the transmitted data by applying a decoder that determines the majority vote of the three transmissions for each bit. Of course, this simple redundant approach does not necessarily correct every error, but greatly reduces the payload data rate. In this example, a predictable likelihood remains that two of three bits are in error, resulting in an erroneous majority vote despite the useful data rate having been reduced to one-third. More efficient approaches, such as Hamming codes, have been developed toward the goal of reducing the error rate while maximizing the data rate.
The well-known Shannon limit provides a theoretical bound on the optimization of decoder error as a function of data rate. The Shannon limit provides a metric against which codes can be compared, both in the absolute and relative to one another. Since the time of the Shannon proof, modern data correction codes have been developed to more closely approach the theoretical limit. An important type of these conventional codes are “turbo” codes, which encode the data stream by applying two convolutional encoders. One convolutional encoder encodes the datastream as given, while the other encodes a pseudo-randomly interleaved version of the data stream. The results from the two encoders are interwoven to produce the output encoded data stream.
Another class of known redundant codes is the Low Density Parity Check code. According to this class of codes, a sparse matrix H defines the code, with the encodings t of the payload data satisfying:
Ht=0  (1)
over Galois field GF(2). Each encoding t consists of the source message s combined with the corresponding parity check bits for that source message s. The encodings t are transmitted, with the receiving network element receiving a signal vector r=t+n, n being the noise added by the channel. Because the decoder at the receiver knows matrix H, it can compute a vector z=Hr. However, because r=t+n, and because Ht=0:
z=Hr=Ht+Hn=Hn  (2)
The decoding process thus involves finding the sparsest vector x that satisfies the equation:
Hx=z  (3)
over GF(2). The vector x becomes the best guess for noise vector n, which can be subtracted from the received signal vector r to recover encodings t, from which the original source message s is recoverable. There have been many examples of LDPC codes that are known in the art, and these LDPC codes have been described as providing code performance that approaches the Shannon limit, as described in Tanner et al., “A Class of Group-Structured LDPC Codes”, ISTCA-2001 Proc. (Ambleside, England, 2001).
In general, high-performance LDPC code decoders are difficult to implement into hardware. In contrast to Shannon's adage that random codes are good codes, it is regularity that allows efficient hardware implementation. To address this difficult tradeoff between code irregularity and hardware efficiency, the technique of belief propagation provides an iterative implementation of LDPC decoding can be made somewhat efficient, as described in Richardson, et al., “Design of Capacity-Approaching Irregular Low-Density Parity Check Codes,” IEEE Trans. on Information Theory, Vol. 47, No. 2 (February 2001), pp. 619-637; and in Zhang et al., “VLSI Implementation-Oriented (3,k)-Regular Low-Density Parity-Check Codes”, IEEE Workshop on Signal Processing Systems (September 2001), pp. 25.-36. Belief propagation decoding algorithms are also referred to in the art as probability propagation algorithms, message passing algorithms, and as sum-product algorithms.
In summary, belief propagation algorithms are based on the binary parity check property of LDPC codes. As mentioned above and as known in the art, each check vertex in the LDPC code constrains its neighboring variables to form a word of even parity. In other words, the product of the LDPC code word vector with each row of the parity check matrix sums to zero. According to the belief propagation approach, the received data are used to represent the input probabilities at each input node (also referred to as a “bit node”) of a bipartite graph having input nodes and check nodes. Within each iteration of the belief propagation method, bit probability messages are passed from the input nodes to the check nodes, updated according to the parity check constraint, with the updated values sent back to and summed at the input nodes. The summed inputs are formed into log likelihood ratios (LLRs) defined as:
L ( c ) = log ( P ( c = 0 ) P ( c = 1 ) ) ( 4 )
where c is a coded bit received over the channel.
In its conventional implementation, the belief propagation algorithm uses two value arrays, a first array L(qmj) storing the LLRs for the input nodes, and the second array Rmj storing the results of the parity check node updates, with m being the parity check row index and j being the column (or input node) index. The general operation of this conventional approach determines, in a first step, the Rmj values by estimating, for each check sum (row of the parity check matrix) the probability of the input node value from the other inputs used in that checksum. The second step of this algorithm determines the LLR L(qmj) probability values by combining, for each column, the Rmj values for that input node from parity check matrix rows in which that input node participated. A “hard” decision is then made from the resulting probability values, and is applied to the parity check matrix. This two-step iterative approach is repeated until the parity check matrix is satisfied (all parity check rows equal zero, GF(2)), or until another convergence criteria is reached, or a terminal number of iterations have been executed.
By way of further background, the code design approach described in Boutillon et al., “Decoder-First Code Design”, Proc.: Int'l Symp. on Turbo Codes and Related Topics (Brest, France, September 2001) defines the decoder architecture first, and uses this architecture to constrain the design of the LDPC code itself. Sridhara, et al., “Low Density Parity Check Codes from Permutation Matrices”, 2001 Conference on Information Sciences and Systems (Johns Hopkins University, Mar. 21-23, 2001) describes the LDPC code as constructed from shifted identity matrices (i.e., permutation matrices).
However, it has been observed in connection with this invention, that these prior approaches are somewhat limited, in that these approaches are limited to a single code or a small selection of codes. Practically useful communications receivers require some amount of flexibility in code rates, and in optimizing their operation for varying noise levels and channel conditions.
BRIEF SUMMARY OF THE INVENTION
It is therefore an object of this invention to provide an LDPC decoding scheme which can be efficiently implemented in an integrated circuit.
It is a further object of this invention to provide such a scheme that is flexible over a wide range of code rates.
It is a further object of this invention to provide such a scheme having the capability of parallelism, to provide further efficiencies in operation and construction.
Other objects and advantages of this invention will be apparent to those of ordinary skill in the art having reference to the following specification together with its drawings.
The present invention may be implemented in connection with a network receiver, having a decoder that implements a Low-Density Parity-Check (LDPC) code for retrieving the transmitted message. The LDPC code is implemented according to a parity check matrix consisting of an irregular arrangement of cyclically shifted identity matrices, resulting in an irregular LDPC code that provides performance near the Shannon limit. A decoder architecture for this code includes a group of column sum memories that receive the received input data, and that accumulate and store updated values for the input node predictions. A reversible router block forwards these column, input node, values to a parity check update block, at which multiple predictions are generated for each input node, one prediction for each parity check (row) in which the input node is involved; a prediction memory is also provided for storing these predictions. The outputs of the parity check update block are forwarded through the router, and accumulated in the column sum memories.
According to another aspect of the invention, the invention is implemented by encoding a datastream by applying a systematic block code corresponding to an irregular arrangement of circularly shifted identity matrices.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWING
FIG. 1 is a functional block diagram of communications between two OFDM transceivers, where at least the receiving transceiver is constructed according to a first preferred embodiment of the invention.
FIG. 2 is an electrical diagram, in block form, of a transceiver constructed according to the preferred embodiments of the invention.
FIG. 3 is a flow chart illustrating a method of designing an LDPC code according to the preferred embodiments of the invention.
FIGS. 4a and 4b are examples of LDPC code macro matrices according to the preferred embodiments of the invention.
FIG. 5 is an electrical diagram, in block form, of an LDPC decoder according to a first preferred embodiment of the invention.
FIG. 6 is an electrical diagram, in block form, of a parity check and update circuit in the LDPC decoder of FIG. 5, according to the first preferred embodiment of the invention.
FIG. 7 is an electrical diagram, in block form, of an example of routing circuitry in the LDPC decoder of FIG. 5, according to the first preferred embodiment of the invention.
FIG. 8 is an electrical diagram, in block form, of a bit update circuit in the LDPC decoder of FIG. 5, according to the first preferred embodiment of the invention.
FIG. 9 is an electrical diagram, in block form, of an LDPC decoder according to a second preferred embodiment of the invention.
FIG. 10 is a timing diagram, illustrating data word misalignment according to the second embodiment of the invention.
FIG. 11 is a flow chart illustrating a method for solving data word misalignment according to the second embodiment of the invention.
FIGS. 12 and 12a are electrical diagrams, in block form, of the construction of a parity check and update circuit according to an alternative embodiment of the invention.
FIG. 13 is an electrical diagram, in block form, of the construction of a parallel parity check and update circuit according to another alternative embodiment of the invention.
FIGS. 14a through 14g are electrical diagrams, in block form, of various alternative logical arrangements of memory according to the preferred embodiments of the invention and the physical circuitry for effecting these logical arrangements.
DETAILED DESCRIPTION OF THE INVENTION
The present invention will be described in connection with an example of its implementation in an exemplary transceiver, for example a wireless network adapter such as according to the IEEE 802.11 wireless standard. It will be apparent to those skilled in the art having reference to this specification that this invention is particularly well-suited for use in such an application. However, it is also contemplated that this invention will be of similar benefit in many other applications that involve error correction coding, including communications according to orthogonal frequency division multiplexing (OFDM), discrete multitone modulation (DMT) for example as used in conventional Digital Subscriber Line (DSL) modems, and other modulation and communication approaches, whether carried out as land line or wireless communications. It is therefore to be understood that these and other alternatives to and variations on the embodiment described below are contemplated to be within the scope of the invention as claimed.
FIG. 1 functionally illustrates an example of a somewhat generalized communication system into which the preferred embodiment of the invention is implemented. The illustrated system corresponds to an OFDM modulation arrangement, as useful in OFDM wireless communications as contemplated for IEEE 802.11 wireless networking. The data flow in this approach is also analogous to Discrete Multitone modulation (DMT) as used in conventional DSL communications, as known in the art. It is contemplated that this generalized arrangement is provided by way of context only. In the system of FIG. 1, only one direction of transmission (from transmitting transceiver 10 over transmission channel C to receiving transceiver 20) is illustrated. It will of course be understood by those skilled in the art that data will also be communicated in the opposite direction, in which case transceiver 20 will be the transmitting transceiver and transceiver 10 the receiving transceiver.
As shown in FIG. 1, transmitting transceiver 10 receives an input bitstream that is to be transmitted to receiving transceiver 20. The input bitstream may be generated by a computer at the same location (e.g., the central office) as transmitting transceiver 10, or alternatively and more likely is generated by a computer network, in the Internet sense, that is coupled to transmitting transceiver 10. Typically, this input bitstream is a serial stream of binary digits, in the appropriate format as produced by the data source.
The input bitstream is received by LDPC encoder function 11, according to this embodiment of the invention. LDPC encoder function 11 digitally encodes the input bitstream by applying a redundant code for error detection and correction purposes. According to this embodiment of the invention, the redundant LDPC code applied by encoder function 11 is selected in a manner that facilitates implementation and performance of the corresponding decoder in receiving transceiver 20. The specifics of the code will become apparent from the description of this decoder function, presented below relative to the description of the construction and operation of receiving transceiver 20. In general, the coded bits include both the payload data bits and also code bits that are selected, based on the payload bits, so that the application of the codeword (payload plus code bits) to the sparse LDPC parity check matrix equals zero for each parity check row. After application of the LDPC code, bit to symbol encoder function 11 groups the incoming bits into symbols having a size, for example, ranging up to as many as fifteen bits. These symbols will modulate the various subchannels in the OFDM broadband transmission.
The encoded symbols are then applied to inverse Discrete Fourier Transform (IDFT) function 14. IDFT function 14 associates each input symbol with one subchannel in the transmission frequency band, and generates a corresponding number of time domain symbol samples according to the Fourier transform. These time domain symbol samples are then converted into a serial stream of samples by parallel-to-serial converter 16. Functions 11 through 16 thus convert the input bitstream into a serial sequence of symbol values representative of the sum of a number of modulated subchannel carrier frequencies, the modulation indicative of the various data values, and including the appropriate redundant code bits for error correction. Typically, for an input of N/2 complex symbols, IDFT function 14 outputs a block of N real-valued time domain samples. Those skilled in the art having reference to this specification will readily recognize that each of functions 11 through 16 may be carried out, and preferably actually are carried out, as digital operations executed by a digital signal processor (DSP).
Filtering and conversion function 18 then processes the datastream for transmission. Function 18 applies the appropriate digital filtering operations, such as interpolation to increase sample rate and digital low pass filter for removing image components, for the transmission. The digitally-filtered datastream signal is then converted into the analog domain and the appropriate analog filtering is then applied to the output analog signal, prior to its transmission.
The output of filter and conversion function 18 is then applied to transmission channel C, for forwarding to receiving transceiver 20. The transmission channel C will of course depend upon the type of communications being carried out. In the wireless communications context, the channel will be the particular environment through which the wireless transmission takes place. Alternatively, in the DSL context, the transmission channel is physically realized by conventional twisted-pair wire. In any case, transmission channel C adds significant distortion and noise to the transmitted analog signal, which can be characterized in the form of a channel impulse response.
This transmitted signal is received by receiving transceiver 20, which, in general, reverses the processes of transmitting transceiver 10 to recover the information of the input bitstream.
FIG. 2 illustrates an exemplary construction of receiving transceiver 20, in the form of a wireless network adapter. Transceiver 20 is coupled to host system 30 by way of a corresponding bus B. Host system 30 corresponds to a personal computer, a laptop computer, or any sort of computing device capable of wireless networking in the context of a wireless LAN; of course, the particulars of host system 30 will vary with the particular application. In the example of FIG. 2, transceiver 20 may correspond to a built-in wireless adapter that is physically realized within its corresponding host system 30, to an adapter card installable within host system 30, or to an external card or adapter coupled to host computer 30. The particular protocol and physical arrangement of bus B will, of course, depend upon the form factor and specific realization of transceiver 20. Examples of suitable buses for bus B include PCI, MiniPCI, USB, CardBus, and the like.
Transceiver 20 in this example includes spread spectrum processor 31, which is bidirectionally coupled to bus B on one side, and to radio frequency (RF) circuitry 33 on its other side. RF circuitry 33, which may be realized by conventional RF circuitry known in the art, performs the analog demodulation, amplification, and filtering of RF signals received over the wireless channel and the analog modulation, amplification, and filtering of RF signals to be transmitted by transceiver 20 over the wireless channel, both via antenna A. The architecture of spread spectrum processor 31 into which this embodiment of the invention can be implemented follows that of the TNETW1100 single-chip WLAN medium access controller (MAC) available from Texas Instruments Incorporated. This exemplary architecture includes embedded central processing unit (CPU) 36, for example realized as a reduced instruction set (RISC) processor, for managing high level control functions within spread-spectrum processor 31. For example, embedded CPU 36 manages host interface 34 to directly support the appropriate physical interface to bus B and host system 30. Local RAM 32 is available to embedded CPU 36 and other functions in spread spectrum processor 31 for code execution and data buffering. Medium access controller (MAC) 37 and baseband processor 39 are also implemented within spread-spectrum processor 31 according to the preferred embodiments of the invention, for generating the appropriate packets for wireless communication, and providing encryption, decryption, and wired equivalent privacy (WEP) functionality. Program memory 35 is provided within transceiver 20, for example in the form of electrically erasable/programmable read-only memory (EEPROM), to store the sequences of operating instructions executable by spread-spectrum processor 31, including the coding and decoding sequences according to the preferred embodiments of the invention, which will be described in further detail below. Also included within wireless adapter 20 are other typical support circuitry and functions that are not shown, but that are useful in connection with the particular operation of transceiver 20.
According to the preferred embodiments of the invention, LDPC decoding is embodied in specific custom architecture hardware associated with baseband processor 39, and shown as LDPC decoder circuitry 38 in FIG. 2. LDPC decoder circuitry 38 is custom circuitry for performing the coding and decoding of transmitted and received data packets according to the preferred embodiments of the invention. Examples of the particular construction of LDPC decoder circuitry 38 according to the preferred embodiment of this invention will be described in further detail below.
Alternatively, it is contemplated baseband processor 39 itself, or other computational devices within transceiver 20, may have sufficient computational capacity and performance to implement the decoding functions described below in software, specifically by executing a sequence of program instructions. It is contemplated that those skilled in the art having reference to this specification will be readily able to construct such a software approach, for those implementations in which the processing resources are capable of timely performing such decoding.
Referring back to the functional flow of FIG. 1, filtering and conversion function 21 in receiving transceiver 20 processes the signal that is received over transmission channel C. Function 21 applies the appropriate analog filtering, analog-to-digital conversion, and digital filtering to the received signals, again depending upon the technology of the communications. In the DSL context, this filtering can also include the application of a time domain equalizer (TEQ) to effectively shorten the length of the impulse response of the transmission channel H. Serial-to-parallel converter 23 converts the filtered datastream into a number of samples that are applied to Discrete Fourier Transform (DFT) function 24. Because, in this OFDM context, the received signal is a time-domain superposition of the modulated subchannels, DFT function 24 recovers the modulating symbols at each of the subchannel frequencies, reversing the IDFT performed by function 14 in transmitting transceiver 10. DFT function 24 outputs a frequency domain representation of a block of transmitted symbols, multiplied by the frequency-domain response of the effective transmission channel. Recovery function 25 then effectively divides out the frequency-domain response of the effective channel, for example by the application of a frequency domain equalizer (FEQ), to recover an estimate of the modulating symbols. Symbol-to-bit decoder function 26 then demaps the recovered symbols, and applies the resulting bits to LDPC decoder function 28.
LDPC decoder function 28 reverses the encoding that was applied in the transmission of the signal, to recover an output bitstream that corresponds to the input bitstream upon which the transmission was based. This output bitstream is then forwarded to the host workstation or other recipient.
LDPC Decoding
The theory of operation of the preferred embodiment of the invention will now be described, following which its implementation into LDPC decoding function 28 in transceiver 20, in the form of LDPC decoder circuitry 38 operating in cooperation with baseband processor 39, will then be described.
By way of nomenclature, the LDPC code is fundamentally contained within an mxj parity check matrix Hpc that, when multiplied by the true transmitted code word vector c equals zero:
Hpc·c=0  (5)
over Galois Field (2). For a single one of the m rows in parity check matrix Hpc, this parity check amounts to:
H1c1+H2c2+ . . . +Hjcj=0  (6a)
over GF(2). In the LDPC code according to the preferred embodiments of the invention, the parity check matrix Hpc is formed from a composite of circularly shifted identity matrices represented by a macro matrix H. Each entry in macro matrix H represents a permutation matrix (e.g., a circularly shifted identity matrix), and in this example takes either a 1 or a 0 value. As will be described below, an entry with a 1 value in macro matrix H symbolizes a p×p permutation matrix at that position within parity check Hpc, while entries with a 0 value symbolize a p×p zero matrix. The parity-check equation thus logically becomes, for an exemplary row of matrix Hpc having a “1” in its columns 1, 3, 4, and 7:
c1⊕c3⊕c4⊕c7=0  (6b)
Once the coding matrix Hpc is defined, the encoding of a message frame is relatively straightforward, as known in the art, and can easily be performed by conventional programmable integrated circuits such as digital signal processors and the like. According to the preferred embodiments of the invention, the circularly shifted identity matrices are tiled within macro matrix H in an irregular manner, as will be described below, to provide excellent coding performance.
On the decoding side, one can define a set N(m) as the set of all bit indices (columns) in a given row m for which codeword bits contribute to the checksum (i.e., all bit indices for which the entries of parity check matrix Hpc in row m are 1). The checksum equation for a row of the parity check can be expressed as:
n N ( m ) c n = 0 ( 7 )
over GF(2) or, logically, the exclusive-OR of the input bits cj that correspond to column bits in the row having a 1 value. One can thus determine, for a given codeword vector c, whether an error is present by determining whether this equation is true for each row of the parity check matrix Hpc.
In practice, however, the actual input bit values rj that are recovered after demodulation and that are to be interpreted as codeword vector c by a decoder, for example by decoding function 28 in transceiver 20 of FIG. 1, are not binary values. Rather, these bit values are expressed as a fractional value, for example between zero and one, expressed in several bits (e.g., six or seven). In effect, the input bit values rj can be considered as, and converted to, probabilities that their respective bit is a 0 (or conversely a 1). As known in this art, the log likelihood ratio (LLR) is a commonly used representation for these probabilities:
L ( r j ) = log ( P ( c j = 0 ) P ( c j = 1 ) ) ( 8 )
which can of course take negative and positive values, corresponding to 1 and 0 being more likely, respectively. For this description of the preferred embodiment of the invention, one can assume that the incoming LLRs (i.e., the received data) have the form −2rj2 where σ2 represents channel noise variance.
Fundamentally, the LDPC decoding process according to the preferred embodiments of the invention involves an iterative two-step process:
    • 1. Estimate a value Rmj for each of the j input nodes, for each of the m rows of the checksum, using the current probability values from the other input nodes, setting the result of the checksum for the row to 0; and
    • 2. Update the sum L(qj) for each of the j input nodes from a combination of the m values of Rmj in the same column.
      The iterations continue until a termination criterion is reached. A preferred termination criteria is the earlier of (i) evaluation of the matrix operation Hpc·c=0 (mod 2), using “hard” decisions from the LLRs L(rj) as the codeword vector c, and (ii) completion of a specified number of iterations.
Mathematically, for the first step of estimating values Rmj for each of the j input nodes, for each of the m rows of the checksum, one can derive an amplitude Amj and a sign value smj as follows:
A mj = n N ( m ) ; n j Ψ ( L ( q mn ) ) ( 9 )
where the function Ψ is defined as:
Ψ(x)≡log(|tan h(x/2)|)=log(tan h|x/2|)  (10)
The function Ψ is its own negative inverse: Ψ(Ψ(x))=−|x|. For computational ease, one can express tanh(L/2) as:
tanh ( L 2 ) = ( L - 1 L + 1 ) ( 11 )
The sign is determined from:
s mj = n N ( m ) ; n j sgn ( L ( q mn ) ) ( 12 )
which is simply an odd/even determination of the number of negative probabilities, excluding each row's own contribution. The updated estimate of values Rmj is thus:
Rmj=−smjΨ(Amj)  (13)
The negative sign of value Rmj contemplates that the function Ψ is its own negative inverse. The value Rmj thus corresponds to an estimate of the LLR for input node j as derived from the other input nodes in the mth row of the parity check matrix, except input node j itself.
An alternative computation of the estimate values Rmj can be defined as a special summation:
R mj = n N ( m ) ; n j [ + ] L ( q mn ) ( 14 )
where the LLR addition [+] is defined as:
L ( q u ) [ + ] L ( q v ) log ( 1 + ( L ( q u ) + L ( q v ) ) L ( q u ) + L ( q v ) ) ( 15 )
This alternative determination of the estimate values Rmj may be easier to implement into some integrated circuit architectures. The selection of the computations may be made by those skilled in the art having reference to this specification, and confronted by a particular technology and decoding application.
In the second step of each decoding iteration, the LLR estimates for each input node are updated. For each column (i.e., each input node):
L ( q j ) = m M ( j ) R mj + ( - 2 r j σ 2 ) ( 16 )
where the set M(j) is the set of all check sum indices (rows) for a given column j of check sum equations to which input bit j contributes to the checksum (i.e., all row indices for which the entries of parity check matrix Hpc in column j are 1). This operation effectively sums the estimated values Rmj over the jth column, and adds in the original received input node value
- 2 r j σ 2
to form the best full estimate of the LLR for input node j in this iteration. This column estimate will be used in the hard decision check. In preparation for the next iteration, the per-row (or extrinsic) LLR probabilities are then derived:
L(qmj)=L(qj)−Rmj  (17)
for each column j in each row m. The per-row probabilities thus amount to an estimate for the probability of the input value, excluding the contribution to the estimate for each row from the row itself.
As noted above, the determination of whether the iterations have converged to an error free decoding is based on the per-column LLRs L(qj):
L(qj)≧0, [Image Omitted] cj=0  (18a)
L(qj)<0, [Image Omitted] cj=1  (18b)
The codeword vector c={c0, c1, . . . cN} is then applied to the parity check matrix H to determine if the product is zero, which as noted above indicates that codeword vector c is a valid codeword, from which the payload data portion can be readily extracted.
In practice, for those value arrays Rmj, L(qmj), Amj that are used in the algorithm, the computations performed and the non-zero array values occur only at those index positions (m,j) in parity check matrix Hpc where a “1” value appears. Also in practice, the initialization of the array L(qmj) can be arbitrary. For example, each of the L(qmj) values can be initialized to zero, or to the input values −2rj2, as desired.
According to the preferred embodiment of the invention, an LDPC code is used for encoding data to be transmitted which has the benefits of performing near the Shannon limit, while being implementable in a very efficient manner. The encoding function according to this preferred embodiment of the invention will be described in detail below. However, it is the decoding operation that requires significant computational power, considering that the received datastream is expected to include some frequency of errors, as evident by the use of redundant coding to begin with. Accordingly, the computational complexity and performance of decoding circuitry has significant practical importance, and is often the deciding factor in whether a particular code is practically useful. It will therefore be useful, for purposes of this description, to first describe the code in terms of the decoding algorithm and architecture, prior to describing the encoding function itself, following a description of the code derivation in general terms.
FIG. 3 illustrates a method of deriving an LDPC code according to the preferred embodiments of the invention. It is contemplated that the particular construction of the LDPC codes according to this invention will become apparent from the manner in which these codes are constructed. In process 40, the code rate is selected. This code rate selection of course depends upon the usual factors involved in the communications, including the expected noise level on the channel, the bit-error-rate (BER) performance that is desired, and of course the desired data rate. According to the preferred embodiments of the invention, data rates of ½ and ⅓ are contemplated. However, as will become apparent to those skilled in the art having reference to this description, this invention permits a wide range of flexibility in the selection and implementation of the redundant code, and as such a wide range of data rates are contemplated.
In process 42, the largest input node degree, or variable degree, for the code is selected. As known in the art, this maximum input node degree corresponds to the largest number of checksum rows that involves a given input node; for LDPC codes such as used in this embodiment of the invention, this input node degree corresponds to the maximum number of parity checks that any input node contributes to. While larger input node degrees are desirable, for better error correction capability, the input node degree is ultimately constrained by the hardware implementation. In the examples of the preferred embodiments of the invention described in this specification, the largest variable degree is contemplated to be on the order of ten to twenty.
In process 44, the degree distributions within parity check matrix Hpc are optimized. These degree distributions refer to (i) the input node degree distribution, which is the number of check nodes that each input node contributes to, and (ii) the check node degree distribution, which is the number of input nodes that each check node receives. Regular LDPC codes are those for which all nodes of the same type have the same degree. According to the preferred embodiment of the invention, however, irregular LDPC codes are used, to improve error rate performance, and as such each code will have a distribution of degrees over its input nodes, and possibly also over its check nodes. A preferred example of an optimization tool for process 46 is a web-based tool known as “ldpcopt”, which is readily available at http://lthcwww.epfl.ch/research/ldpcopt/ and is described in Richardson et al., “Design of Capacity-Approaching Irregular Low-Density Parity-Check Codes”, Transactions on Information Theory, Vol. 47, No. 2 (IEEE, February 2001), pp. 619-637; and in Chung, et al., “Analysis of sum-product decoding of low-density parity-check codes using a Gaussian approximation”, Transactions on Information Theory, Vol. 47, No. 2 (IEEE, February 2001), pp. 657-670; both incorporated herein by this reference. The optimization criteria used by this exemplary process minimizes the bit error rate for the code, by varying the variable degrees of the LDPC parity check matrix. The results of the optimization provide the optimum numbers of “1” values (but not position) in parity check matrix Hpc, in the form of optimum numbers of “blocks” within its defining macro matrix H.
According to the preferred embodiments of the invention, as briefly mentioned above, the parity check matrix Hpc is determined through the use of a macro matrix H. Macro matrix H is a matrix, of dimensions c by d, in which each entry (“block”) indicates the position of a p×p matrix in parity check matrix Hpc, and in which the value of each entry symbolizes the form of the corresponding p×p matrix at that position. As a result, parity check matrix Hpc has a total number of rows m=p×c and a total number of columns j=p×d. Each row of macro matrix H thus defines a “block row” of p rows in parity check matrix Hpc; conversely, each column of macro matrix H defines a “block column” of p columns in parity check matrix Hpc. As such, this description will refer to rows of macro matrix H as “block rows”, and columns of macro matrix H as “block columns”. In the preferred embodiments of this invention, a zero value of an entry in macro matrix H symbolizes a zero-valued p×p matrix (i.e., all entries in the p×p matrix are zero). A non-zero value (e.g., 1) of an entry in macro matrix Hpc symbolizes the location of a permutation matrix, which in the preferred embodiments of the invention is a cyclically (or circularly) shifted identity matrix.
The optimization of process 44 determines, for a given macro matrix H of c block rows and d block columns, the distribution of the “1” values within the matrix. Specifically, the distribution determines the number of block columns of macro matrix H that have each of a range of numbers of “1” values, and the number of block rows of macro matrix H that have each of a range of numbers of “1” values. By way of definition, the block rows and block columns refer to the illustration of macro matrix H that indicates the locations of the permutation matrices by “1” values. Once the optimization of process 44 is performed, the specific size of c block rows and d block columns is selected in process 46 as shown in FIG. 3. Alternatively, one may select the numbers of c block rows and d block columns prior to optimization process 44. In either case, the optimization of process 44 can be illustrated by way of examples.
A first example has the following parameter values for a code rate of ½:

c d p a b

12 24 193 7 49


In this example, the optimization of process 44, performed by way of the “ldpcopt” program, provides the following distributions of column blocks:

g 2 per column 3 per column 4 per column 11 per column

Optimum 10.80 8.90 0.74 3.56
Useful 11 9 1 3


In this table, the value g is the number of “1” bits in a given block column. As shown in this example, the optimization indicates that the optimum distribution of its twenty-four block columns (d=24) would have 10.80 block columns with two “1” bits, 8.9 block columns with three “1” bits, 0.74 block columns with four “1” bits, and 3.56 block columns with eleven “1” bits. Of course, this arrangement is not possible for a single macro matrix H having discrete rows and columns. As such, the “Useful” values in this table have rounded the optimum value to integer values.
Optimization process 44 also provides an optimized allocation of row blocks, by providing an optimized selection of the number of block rows that have varying numbers of “1” values. In this same first example, the optimization process provided the following results:

g 7 per row 8 per row

Optimum 5.56 6.64
Useful 10 2


As evident from this table, the optimum distribution provided for 5.56 block rows with seven “1” bits in each row, and 6.64 block rows with eight “1” bits in each row, for a total of c=12 rows. The wide variation for the “Useful” allocation from the optimum is due to the number of filled “1” blocks defined for the columns, which is incompatible with the number of filled blocks defined by the optimized allocation of block rows. In this example, the “Useful” column allocation establishes that there are eighty-six filled block matrix entries; the only combination of row allocations that satisfies that optimized column allocation, while maintaining either seven or eight filled “1” blocks in each row, is the “Useful” allocation of ten block rows with seven “1” bits each, and two block rows with eight “1” bits each, as shown in this table. Alternatively, one could arrange the matrix to satisfy the optimum row allocation and adjust the column allocation. FIG. 4a illustrates a macro matrix H constructed according to the arrangement of this example.
According to a second example, for a code rate of ⅓, the parameters are as follows:

c d p a b

16 24 241 2 44


In this example, the optimization of process 44, performed by way of the “ldpcopt” program, provides the following distributions of column blocks:

g 2 per column 3 per column 5 per column 15 per column

Optimum 13.22 5.73 3.17 1.88
Useful 13 6 3 2

This optimum distribution of the twenty-four block columns (d=24) is approximated, in this case, by relatively close rounding of the optimum counts to the nearest integer. The row allocation in this example is as follows:

g 5 per row 6 per row

Optimum 8.73 7.27
Useful 9 7


In this case, the optimum distribution of the sixteen (c=16) block rows is also closely approximated by rounding to the nearest integers, as shown in the “Useful” row of the table. In this case, the optimized block row and block column allocations are compatible. FIG. 4b illustrates an example of macro matrix H according to this optimization.
In process 46, if not previously selected prior to optimization process 44, the multiplicative orders c, d are defined, in which order value c and order value d correspond to the number of block rows and block columns, respectively, in macro matrix H, as described above.
In process 48, the code designer next constructs the particular arrangement of macro matrix H according to the optimization results of process 44, and according to additional constraints according to this embodiment of the invention. According to the preferred embodiments of the invention, the macro matrix H is arranged in groups of block columns, so that, for any given block row in macro matrix H, at most one block column within each group has a filled “1” entry. This arrangement must take into account the manner in which the constituent matrices are developed.
FIGS. 4a and 4b illustrate examples of macro matrix H, following the two optimization examples described above. Each of FIGS. 4a and 4b illustrates macro matrix H representing a matrix of matrices. Each entry of macro matrix H itself represents a square matrix of dimensions p×p. Each empty entry (“0”) of macro matrix H constitutes an empty p×p matrix (all entries are “0”). Each filled entry (“1”) of macro matrix H constitutes a permutation p×p matrix, each formed as an identity matrix with its rows cyclically shifted (modulo p), by an amount Ps,t=bsat, where s, t represent the row and column positions, respectively, of the permutation matrix within macro matrix H, and a, b are the generator values. Referring to FIG. 3, these additional code parameters p, a, b, are selected in process 49. These parameters include prime number p that defines the size of the constituent matrices within macro matrix H, and the generators a, b that are used in the definition of the constituent matrices within macro matrix H. Generator a is selected to have multiplicative order d with respect to prime p, and generator b has multiplicative order c also with respect to prime p. The set of parameters p, c, d, define the size of parity check matrix Hpc, with the total number of rows m=p×c and the total number of columns j=p×d, and thus define the appearance of the LDPC code.
As noted above, generator a is selected to have multiplicative order d, and generator b has multiplicative order c. In process 50, these cyclically shifted identity matrices are then generated for each of the filled “1” locations of macro matrix H, to produce parity check matrix Hpc. As mentioned above, parity check matrix Hpc thus has m=p×c rows and j=p×d columns, with the “1” values determined by generators a, b. The manner in which the permutation matrices are formed is described in Sridhara, et al., “Low Density Parity Check Codes from Permutation Matrices”, 2001 Conference on Information Sciences and Systems (Johns Hopkins University, Mar. 21-23, 2001), incorporated herein by this reference.
In contrast to the Sridhara approach, however, macro matrix H according to the preferred embodiments of this invention is irregular, in that it is itself relatively sparse, including many zero-valued entries. As will be evident from the following description, the irregularity included by the code designer in process 48, in combination with grouping of block columns of macro matrix H, provides a high performance LDPC code that can be efficiently implemented in hardware.
It is also contemplated, in connection with this invention, that the cyclic shifts of the identity matrices corresponding to the row and column position of macro matrix H need not follow this particular approach. Indeed, the offset Ps,t of the cyclic shift within a particular permutation matrix can be an arbitrary selection. Such arbitrary assignment, if followed, may affect the ultimate dimensions of macro matrix H.
The examples of FIGS. 4a and 4b illustrate irregular macro matrices H constructed according to this embodiment of the invention. The irregularity and the block construction is evident from these two LDPC code macro matrices H. It is contemplated that those skilled in the art, having reference to this specification, will be readily able to construct macro matrices and the resulting corresponding parity check matrices in this manner, for other code rates and performance optimization levels.
Referring now to FIG. 5, the construction of LDPC decoder 38 according to a preferred embodiment of the invention will now be described in detail. While it is also contemplated, according to this invention, that the LDPC decoding operations described herein may alternatively be performed as a software routine, for example by baseband processor 39 in the form of a DSP or another programmable logic device executing a sequence of program instructions, the exemplary architecture of FIG. 5 is especially well-suited to the irregular LDPC codes described above, and provide the important benefit of efficient and rapid performance of the iterative belief propagation decoding described above.
As shown in FIG. 5, LDPC decoder 38 includes memory 52, which is preferably a local random access memory (RAM) for storing the Rmj estimates that are derived within each iteration of the belief propagation. These Rmj estimates are packed into memory 52, so that the column positions within memory 52 do not physically align with the column positions within parity check matrix Hpc, to save chip area considering the sparseness of matrix H. R storage memory 52 has an output coupled to one input of parallel adder 54. Specifically, this output of R storage memory 52 is connected to a negative input of parallel adder 54, to provide the subtrahend for a subtraction performed by parallel adder 54. The output of parallel adder 54 is applied to parity check update circuitry 56. Parity check update circuitry 56 performs the updating of estimates Rmj for each of the parity check nodes, as will be described in further detail below. As such, the output of parity check update circuitry 56 is applied to R storage memory 52, for storage of the updated values; in addition, the output of parity check update circuitry 56 is also applied to router circuitry 58f, for use in updating the log likelihood ratios (LLRs) of the input nodes.
Router circuitry 58f is a bank of multiplexers and demultiplexers, as will be described in further detail below, that forwards the appropriate estimate terms Rmj to the corresponding bit update circuits 60. Bit update circuits 60 are effectively accumulators, by way of which current values of the LLRs of the input nodes are maintained from iteration to iteration. The number of bit update circuits 60 depends upon the maximum number of groups of block columns of macro matrix H. In the particular code; for the code example of FIG. 4a, nine bit update circuits 60 will be provided, while for the code example of FIG. 4b, seven bit update circuits 60 are necessary. The specific construction of bit update circuits 60 will be described in further detail below.
Bit update blocks 60 receive the input node data values, prior to the first iteration of the belief propagation. As mentioned above, the received input node data values are typically based on a multiple bit fractional value, for example expressed in six or seven bits, as produced after demodulation and recovery after frequency domain equalization (e.g., by functions 26 and 27 of FIG. 1). More specifically, because LDPC decoder 38 operates on LLRs, the initial input node data value is preferably expressed as the ratio
- 2 r j σ 2 ,
the value rj being the received data value. Bit update blocks 60 each forward an output to reverse router circuitry 58r, which in turn applies the output value to parallel adder 54, as minuends for the subtraction performed by that circuit. In addition, the outputs of bit update blocks 60 are also applied by reverse router circuitry 58r to parity check function 61, which performs a slicing function on these estimates, and after converting these values to “hard” decisions, determines whether the parity check equation is satisfied by the current estimates for each row of parity check matrix Hpc.
Referring now to FIG. 5 in combination with FIGS. 6 through 8, the operation of LDPC decoder 38 in performing belief propagation decoding according to the preferred embodiment of the invention will now be described in further detail. The specific construction of constituent circuit blocks and functions within decoder 38 will also be described in connection with this operational description. For the sake of clarity, this description will first be provided with respect to an arbitrary selected iteration in the process. The initialization of the belief propagation operation begins with the storage of values corresponding to ratio
- 2 r j σ 2 ,
the value rj being the received data value, stored in the appropriate memory locations as the initial estimate of the LLRs for the columns.
This description of the operation of LDPC decoder 38, and the detailed construction of its functional blocks, for this exemplary iteration, will begin at parallel adder 54. Parallel adder 54 receives the current estimates Rimj for the participating columns in a given row m of macro matrix H, from memory 52. These current estimates Rimj, which were generated in the previous iteration i of the process, are subtracted from the current LLR values L(qj) of the input nodes that participate in the current row m of parity check matrix Hpc, thus generating the LLR values L(qmj) according to Equation (17) described above. These values L(qmj) are forwarded to parity check update circuitry 56, the construction of which will be described in detail relative to FIG. 6.
The updating process as carried out by parity check update circuitry 56 begins with the application of each of the values L(qmj) in the input word received from parallel adder 54 to look-up tables (LUTs) 80, by way of which the Ψ function of Equation (10) is evaluated for each of the values L(qmj) within the current row m. The outputs of LUTs 80 are forwarded to augmented adder tree 82, which performs the summing of the values Ψ(L(qmj)) over all of the columns participating in the current row m of macro matrix H. Augmented adder tree 82 effects this summation in a manner that can be readily implemented by those skilled in the art having reference to this specification. This overall sum result is applied to an input of adders 86, one adder 86 associated with each of the columns j contributing to the current row m. Each adder 86 also receives, at a negative input, the output of its corresponding LUT 80, and thus subtracts the column's own contribution from the overall sum. The outputs of adders 86 thus present the set of amplitude values Amj corresponding to the result of Equation (9), each associated with one of the columns j that are participating in this row. The outputs of adders 86 are then again applied to corresponding LUTs 88, to again apply the Ψ function to the amplitude values Amj, according to Equation (13). Sign correction functions 90 apply the appropriate sign to the output of LUTs 88, based on the logical combination of the sign bit outputs of LUTs 80 for the corresponding column according to an odd/even determination of the number of negative probabilities, excluding each row's own contribution, as described above relative to Equation (12), and effecting the negative sign applied according to Equation (13). It is this handling of the sign bit outputs of LUTs 80 that corresponds to the augmented addition performed by augmented adder tree 82. Sign correction functions 90 thus present each of the updated estimate values Ri+1mj as updated for this, the i+1 iteration.
Alternatively, parity check update circuitry 56 may instead follow the approach described above in connection with Equations (14) and (15). According to this approach, LUTs 80, 88 for applying the Ψ function are not used, but instead a special addition function [+] of Equation (15) is applied to the L(qmj) values, and these values are summed according to Equation (14), to derive the updated estimate values Ri+1mj for the i+1 iteration. Variations of these parity check update approaches, and other alternative parity check update approaches, may also be realized within parity check update circuitry 56, within LDPC decoding circuitry 38 according to this invention. The selection of the particular arithmetic approach will depend upon the available circuitry and performance of the specific implementation.
Referring back to FIG. 5, these updated estimate values Ri+1mj for the i+1 iteration are applied to memory 52, to overwrite the previous estimate values Rimj from the prior iteration. This effectively completes the first step of the belief propagation algorithm, for this row of parity check matrix Hpc, with the updating of the estimates of the column value in a row, based on the other column values involved in the same row (i.e., the same parity check equation).
The second step of the belief propagation algorithm, in which the input node estimates (in the form of the LLRs) are updated, begins with the application of these updated estimate values Ri+1mj to the appropriate bit update circuit 60, via router and reverse router circuitry 58. The construction of router and reverse router circuitry 58 preferably depends upon the particular code arrangement, either by way of hard wiring or alternatively by way of a software controlled logic arrangement.
FIG. 7 illustrates an example of routing circuitry 58f. Reverse routing circuitry 58r can be constructed in a similar fashion, but reversed to route signals in the opposite direction, as will be apparent to those skilled in the art having reference to this embodiment of the invention. As illustrated in FIG. 7, the output word of the estimates Rmj from parity check update circuitry 56 includes several values, each for one of the columns j that are involved in the current row of parity check matrix Hpc. Referring back to FIGS. 4a and 4b, it is apparent that the number of columns involved in a particular row (i.e., the degree of the row) can vary. As such, the number of positions in the output word from parity check update circuitry 56 can also vary from row to row. As such, there may be instances in which one or more of the positions of the output word from parity check update circuitry 56 may be empty.
Routing circuitry 58f thus consists of a set of multiplexers 92, which effect the forwarding of the values Rmj of the output word to the appropriate bit update circuit 60. Knowledge of the particular code arrangement within macro matrix H defines the control of these multiplexers 92 because, according to the preferred embodiments of this invention, macro matrix H is constructed with column block grouping, by way of which only one possible column of parity check matrix Hpc is involved within each column block group, for any given row. The example of routing circuitry 58f illustrated in FIG. 7 corresponds to the code shown in FIG. 4a, which has nine groups of column blocks, and thus involves nine positions that are applied to the nine bit update circuits 601 through 609.
In this example, the left-most column block group of macro matrix H of FIG. 4a has a filled “1” for every row, considering that each “1” in the matrix of FIG. 4a corresponds to a sliding identity permutation matrix. As such, the left-most position of the output word from parity check update circuit 56 is always forwarded to bit update circuit 601. The second-most position of the output word may be forwarded either to the second bit update circuit 602 or to the third bit update circuit 603, depending on the code row. In any event, the control of multiplexers 92 is effected depending upon the contributions from the various column block groups to the parity check code, in each row; it is contemplated that multiplexers 92 will be switched to some extent as the process advances from one block row to another. According to this embodiment of the invention, the assignment of the positions of the output word to the various bit update circuits 60 can thus be greatly simplified with knowledge of the code, so that router circuitry 58f and reverse router circuitry 58r need not be overwhelmingly complex. In addition, this embodiment of the invention reduces the number of necessary bit update circuits 60 greatly, from what would otherwise be required (e.g., one bit update circuit for each of the k block columns of macro matrix H).
In any event, router circuitry 58f forwards the most recent iteration of estimates Ri+1mj to the appropriate bit update circuits 60. Bit update circuits 60 accumulate these estimates Ri+1mj with the estimates for the same input node in the same column j, from different rows, as will now be described relative to FIG. 8, which illustrates the construction of one of bit update circuits 60 according to the preferred embodiment of the invention. The others of bit update circuits 60 within LDPC decoder 38 are contemplated to be similarly constructed.
As shown in FIG. 8, bit update circuit 60 has a first adder 62 receiving an input from the router portion of router circuitry 58f via input aligner 63, and a second adder 74 that forwards its output to reverse router circuitry 58r via output aligner 75. Aligners 63, 75 are effectively shifters that can be used to align the incoming and outgoing data words as desired or necessary. Adder 62 has its output coupled to demultiplexer 64, which forwards the output of adder 62 to a selected one of column sum memories 66A, 66B. Address generator circuit 68 controls the addressing of column sum memories 66A, 66B, and received data memory 70. Received data memory 70 receives and stores channel input data, and applies this channel input data to an input of adder 74; the other input of adder 74 receives the output of a selected one of column sum memories 66A, 66B, via cross-switching multiplexer 72. The other output of cross-switching multiplexer 72 is applied to a second input of adder 72. Cross-switching multiplexer 72, in combination with demultiplexer 64, control the operation of column sum memories 66A, 66B to operate in a ping-pong buffer fashion relative to one another. One of column sum memories 66A, 66B is in an accumulation mode, by multiplexer 72 applying its output to adder 62 along with the adding its current value (via multiplexer 72) with the results from router circuitry 58, storing the result by multiplexer 64 connecting the output of adder 62 to the input of that accumulating one of column sum memories 66B, 66A. Meanwhile, cross-switching multiplexer 72 is forwarding the output of the other one of column sum memories 66A, 66B to an input of adder 74, to be summed with the contents of received data memory 70 and forwarded to reverse router circuitry 58r.
In operation, with reference to Equation (16), column sum memories 66A, 66B of bit update circuits 60 accumulate the sum of the estimated values Rmj for its corresponding input node, which is associated with a corresponding one of the blocks of columns in the appropriate code. This accumulation is carried out by adder 62 receiving the most recent estimate Ri+1mj at one input, and receiving the current accumulation of estimates Rmj for the same column j, but for different rows m, from one of column sum memories 66A, 66B, selected via cross-switching multiplexer 72. Adder 62 combines these values, and forwards the sum back to the selected column sum memory 66A, 66B, which rewrites the accumulated sum for that row and column position, expressed as the sum
m M ( j ) R mj i + 1
for iteration i+1, following Equation (16). This value is retained in the selected one of column sum memories 66A, 66B, at a memory location associated with the corresponding input node, as addressed by address generator circuit 68.
Address generator circuit 68 includes the appropriate logic and memory circuitry for maintaining and applying memory address values associated with the input nodes managed by bit update circuit 60. According to this preferred embodiment of the invention, each of the permutation matrices involved in generating parity check matrix Hpc from macro matrix H are circularly shifted identity matrices, with the particular position of the identity diagonal varying with the position of the permutation matrix within macro matrix H. Because the rows within parity check matrix H

The remainder of this text has been abbreviated because it is either very complex or very long and may not be displayed properly or efficiently by your web browser. Even with this precaution, certain browsers may display odd behaviors when rendering this document. Please download the document to view it in its entirety.
(Source: USPTO)
What is claimed is:
1. A spread spectrum processor coupled to a program memory for storing a sequence of operating instructions, said spread spectrum processor comprising: a macro matrix defining a Low Density Parity Check (LDPC) code said macro matrix having zero-valued and non-zero-valued entries arranged in block rows and block columns and in which each zero-valued entry corresponds to a p×p zero-valued matrix and each non-zero-valued entry corresponds to a p×p permutation matrix that has at most a single “1” entry in each row and each column and “0” entries elsewhere to define a parity check matrix, wherein the block columns of the macro matrix are grouped into groups of block columns so that at most one column in any group has a “1” entry in any row, at least one of the groups including more than one block column, and wherein the columns of the parity check matrix correspond to input nodes and the rows of the parity check matrix correspond to parity check sums; local memory for code execution and data buffering; an embedded central processing unit (CPU) for executing said sequence of operating instructions to perform a plurality of operations comprising: receiving a set of input values corresponding to input nodes of the macro parity check matrix; estimating, for each of the input nodes, over each of a plurality of parity check sums of the LDPC code, a check node value using values of other input nodes contributing to the parity check sum; estimating, for each of the input nodes, a probability value using the estimates of the check node values for that input node; and repeating the estimating operations to a termination criterion.
2. The spread spectrum processor of claim 1, wherein each permutation matrix corresponding to a non-zero entry of the macro matrix representing the LDPC code is a cyclically shifted identity matrix.
3. The spread spectrum processor of claim 2, wherein an offset for each of the cyclically shifted identity matrices corresponds to the block row and block column of the permutation matrix in the macro matrix representing the LDPC code.
4. The spread spectrum processor of claim 1, wherein the program instructions further comprise instructions for performing the operation of: evaluating each of the plurality of parity check sums using decisions based upon the estimated probability values to determine whether the parity check sums are satisfied; and wherein the termination criterion corresponds to each of the plurality of parity check sums being satisfied using decisions based upon the estimated probability values.
5. The spread spectrum processor of claim 1, wherein the program instructions further comprise: generating an extrinsic estimate for each input node position for each parity check matrix row to which it contributes, after the probability values is estimated for each of the input nodes; and wherein the program instructions for performing the operation of estimating a check node value for each of the input nodes over each of a plurality of parity check sums of the LDPC code comprises, for each corresponding row of the parity check matrix, program instructions for performing the operations of: applying extrinsic estimates for the input nodes contributing to the parity check sum to a first look-up table to retrieve a corresponding first function value; applying the first function values to an augmented adder to generate a full sum amplitude over the row; for each contributing input node position, subtracting the first function value from the full sum amplitude; applying the result of the subtracting step for each contributing input node position to a second look up table to retrieve a second function value; and then correcting the sign of the second function value for each contributing input node position, using a logical combination of sign bit outputs from the first function values, to produce the estimates of the check node values.
6. The spread spectrum processor of claim 1, wherein the program instructions further comprise instructions for performing the operation of: generating an extrinsic estimate for each input node position for each parity check matrix row to which it contributes, after probability values for each of the input nodes are estimated; and wherein the program instructions for performing the operation of estimating a check node value for each of the input nodes over each of a plurality of parity check sums of the LDPC code comprises, for each corresponding row of the parity check matrix, program instructions further comprise instructions for performing the operation of: performing a sum of log likelihood ratios of each of the extrinsic estimates of the contributing input node positions, the sum corresponding to a log ratio of exponentials of the extrinsic estimates.
7. The spread spectrum processor of claim 1, wherein the program instructions further comprise instructions for performing the operation of: generating an extrinsic estimate for each input node position for each parity check matrix row to which it contributes, after probability values for each of the input nodes are estimated; and wherein program instructions for performing the operation of estimating a check node value for each of the input nodes over each of a plurality of parity check sums of the LDPC code comprises, for each corresponding row of the parity check matrix, program instructions for performing the operation of: receiving first and second portions of the extrinsic estimates for the contributing input nodes in successive cycles; using the first and second portions of the extrinsic estimates to produce a sum over all of the contributing input nodes; and producing, from the sum, first and second groups of the check node estimates in successive cycles.
8. The spread spectrum processor of claim 1, program instructions for performing the operation of estimating a probability value using the estimates of the check node values for that input node comprise program instructions for performing the operation of: accumulating a plurality of check node value estimates for the input node, over each of the parity check sums to which the input node contributes; then adding a stored value corresponding to the original received input value for the input node; and forwarding the result of the adding step as a next estimate of the probability value for the input node.
9. A transceiver comprising: a program memory for storing a sequence of operating instructions; a macro matrix defining a Low Density Parity Check (LDPC) code said macro matrix having zero-valued and non-zero-valued entries arranged in block rows and block columns and in which each zero-valued entry corresponds to a p×p zero-valued matrix and each non-zero-valued entry corresponds to a p×p permutation matrix that has at most a single “1” entry in each row and each column and “0” entries elsewhere to define a parity check matrix, wherein the block columns of the macro matrix are grouped into groups of block columns so that at most one column in any group has a “1” entry in any row, at least one of the groups including more than one block column, and wherein the columns of the parity check matrix correspond to input nodes and the rows of the parity check matrix correspond to parity check sums; local memory for code execution and data buffering; a baseband processor coupled to said program memory and local memory for executing said sequence of operating instructions to perform a plurality of operations comprising: receiving a set of input values corresponding to input nodes of the macro parity check matrix; estimating, for each of the input nodes, over each of a plurality of parity check sums of the LDPC code, a check node value using values of other input nodes contributing to the parity check sum; estimating, for each of the input nodes, a probability value using the estimates of the check node values for that input node; and repeating the estimating operations to a termination criterion.
10. The transceiver of claim 9, wherein each permutation matrix corresponding to a non-zero entry of the macro matrix representing the LDPC code is a cyclically shifted identity matrix.
11. The transceiver of claim 10, wherein an offset for each of the cyclically shifted identity matrices corresponds to the block row and block column of the permutation matrix in the macro matrix representing the LDPC code.
12. The transceiver of claim 9, wherein the program instructions further comprise instructions for performing the operation of: evaluating each of the plurality of parity check sums using decisions based upon the estimated probability values to determine whether the parity check sums are satisfied; and wherein the termination criterion corresponds to each of the plurality of parity check sums being satisfied using decisions based upon the estimated probability values.
13. The transceiver of claim 9, wherein the program instructions further comprise: generating an extrinsic estimate for each input node position for each parity check matrix row to which it contributes, after the probability values is estimated for each of the input nodes; and wherein the program instructions for performing the operation of estimating a check node value for each of the input nodes over each of a plurality of parity check sums of the LDPC code comprises, for each corresponding row of the parity check matrix, program instructions for performing the operations of: applying extrinsic estimates for the input nodes contributing to the parity check sum to a first look-up table to retrieve a corresponding first function value; applying the first function values to an augmented adder to generate a full sum amplitude over the row; for each contributing input node position, subtracting the first function value from the full sum amplitude; applying the result of the subtracting step for each contributing input node position to a second look up table to retrieve a second function value; and then correcting the sign of the second function value for each contributing input node position, using a logical combination of sign bit outputs from the first function values, to produce the estimates of the check node values.
14. The transceiver of claim 9, wherein the program instructions further comprise instructions for performing the operation of: generating an extrinsic estimate for each input node position for each parity check matrix row to which it contributes, after probability values for each of the input nodes are estimated; and wherein the program instructions for performing the operation of estimating a check node value for each of the input nodes over each of a plurality of parity check sums of the LDPC code comprises, for each corresponding row of the parity check matrix, program instructions further comprise instructions for performing the operation of: performing a sum of log likelihood ratios of each of the extrinsic estimates of the contributing input node positions, the sum corresponding to a log ratio of exponentials of the extrinsic estimates.
15. The transceiver of claim 9, wherein the program instructions further comprise instructions for performing the operation of: generating an extrinsic estimate for each input node position for each parity check matrix row to which it contributes, after probability values for each of the input nodes are estimated; and wherein program instructions for performing the operation of estimating a check node value for each of the input nodes over each of a plurality of parity check sums of the LDPC code comprises, for each corresponding row of the parity check matrix, program instructions for performing the operation of: receiving first and second portions of the extrinsic estimates for the contributing input nodes in successive cycles; using the first and second portions of the extrinsic estimates to produce a sum over all of the contributing input nodes; and producing, from the sum, first and second groups of the check node estimates in successive cycles.
16. The transceiver of claim 9, program instructions for performing the operation of estimating a probability value using the estimates of the check node values for that input node comprise program instructions for performing the operation of: accumulating a plurality of check node value estimates for the input node, over each of the parity check sums to which the input node contributes; then adding a stored value corresponding to the original received input value for the input node; and forwarding the result of the adding step as a next estimate of the probability value for the input node.
(Source: USPTO)