
ADSL, VDSL, and Multicarrier Modulation. John A. C. Bingham. Copyright © 2000 John Wiley & Sons, Inc. Print ISBN 0-471-29099-8 Electronic ISBN 0-471-20072-7

8

IMPLEMENTATION OF DMT: ADSL

8.1 OVERALL SYSTEM

Figure 8.1 is a simple block diagram of an ADSL system. It is important to note the two stages of “splitting”: from the line side the POTS and the xDSL (both bidirectional) are first separated by the low-pass and high-pass filters, and then the xDSL transmit and receive signals (unidirectional) may be separated by any combination of filters, 4W/2W hybrid, and echo canceler.

In this chapter we describe most of the components and incorporated algorithms of such a system: Section 8.2 for the transmitter, Section 8.3 for transmitter/receiver interconnection, Section 8.4 for the receiver, and Section 8.5 for the algorithms. Some of the components are generic to xDSL, some specific to DMT xDSL,¹ and some even more specific to DMT ADSL or VDSL. Some

Figure 8.1 Block diagram of xDSL system.

¹ Strictly speaking, any modulation technique could be inserted in the transmitter and receiver boxes of Figure 8.1, but I am sure that readers will understand and empathize if I say “Perish the thought!” and do not talk about the “other” techniques.


components (POTS splitters, IFFTs and FFTs, and equalizers), however, deserve chapters or appendices of their own. In this chapter we deal with what has already been implemented; possible components of future xDSL modems (improved equalizers and RFI cancelers, and crosstalk cancelers) are discussed in Chapter 11.

System Timing. For the timing of transmission between two modems, one modem (the “master”)² must define the frequency and phase of all clocks, and the other modem (the “slave”) must lock itself to those clocks. In some early xDSL systems, the remote unit, for some mysterious and now obsolete reasons, was the master, but for all xDSL systems considered here the central unit (the ATU-C or VTU-O) is the master. The parts most concerned with timing, therefore, are the central transmitter and the remote receiver. The primary purpose of the timing “circuitry” in this pair is to reproduce the sampling clock (2.208 MHz for ADSL, 22.08 MHz for one VDSL proposal) in the receiver. This could be done by decision-aided operations on all the data-carrying subcarriers, but T1E1.4 took the easy way out and decided to reserve one subcarrier as an unmodulated pilot.

In addition to recovering the sampling clock:

1. The remote receiver must establish the symbol and superframe³ clocks by division of the sampling clock.
2. The remote receiver must slave its upstream transmitter to this clock, and because frequency lock is thereby assured, the central receiver need establish only phase lock.
3. The PMD layer may be required to pass an 8.0-kHz network timing reference (NTR) to the remote DTE. The NTR is a very precise clock that is used throughout a data network for voice sampling and CBR applications such as videoconferencing and VTOA (see Section 2.3). The NTR may also be up-sampled to get a bit clock that is locked to PRS (Stratum 1 CLK) and used for n × 64 kbit/s CES.

The first two are receiver functions, which are described in Section 8.4.4. The third one requires both standardization as a transmitter function and implementation in a receiver, so it is described in Sections 8.2.1 and 8.4.3.

8.1.1 The Design and Implementation Problem

As in the design of all modems (voice-band, wireless, DSL, and so on), the basic problem is to achieve as nearly as possible the theoretical relationship

² This modem may, in turn, have to accept clocks from (that is, be slaved to) higher layers of a system.
³ Superframe has a different meaning in ADSL and SDMT VDSL, but we can use the word in a generic sense here.


between data rate, error rate, and range. One way of expressing this more precisely is to define an SNR loss as the difference in decibels between the signal/unavoidable noise ratio for any loop and the achieved signal/total (i.e., unavoidable plus avoidable) noise ratio. That is,

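$$\mathrm{SUNR} = 10 \log_{10} \frac{\text{signal}}{\text{unavoidable noise}} \tag{8.1}$$

$$\mathrm{STNR} = 10 \log_{10} \frac{\text{signal}}{\text{unavoidable noise} + \text{avoidable noise}} \tag{8.2}$$

$$\mathrm{SNR_{loss}} = \mathrm{SUNR} - \mathrm{STNR} = 10 \log_{10}\left(1 + \frac{\text{avoidable noise}}{\text{unavoidable noise}}\right) \tag{8.3}$$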

A measure of the state of evolution of modem design in any medium is the SNR loss achieved by an average-to-good factory-built modem. I would estimate that voice-band modems today achieve about 1 to 2 dB, but xDSL modems, which are much less mature, are probably much worse than this (4 to 5 dB?).

The main sources of unavoidable and avoidable noise in an xDSL system are as follows:

1. Unavoidable
• Alien crosstalk
• Kindred FEXT
• AWGN

2. Avoidable (at least partially)
• Kindred NEXT, whose level depends on the cable characteristics and the number of interferers, which are unavoidable, but whose effect on transmission depends on the duplexing technique used
• RFI (AM radio and amateur radio)
• “Impulse” noise: any noise pulses of fairly short duration that occur spasmodically and unpredictably
• POTS signaling
• Linear distortion (resulting, in an MCM system, in intersymbol and interchannel distortion)
• Nonlinear distortion
• Down/up interference: leak through FDD filters and/or echo canceler
• Clipping
• Quantizing noise in DAC and ADC
• DSP round-off noise
• Noise and/or distortion introduced by clock jitter

TABLE 8.1 ADSL Basic Numbers

| Parameter | Down: G.992.1 and T1.413 | Down: G.992.2 Lite | Up |
|---|---|---|---|
| f_samp (MHz) | 2.208 | 1.104 | 0.276 |
| IFFT size | 512 | 256 | 64 |
| Cyclic prefix, without sync symbol | 40 | | 5 |
| Cyclic prefix, with sync symbol | 32 | | 4 |
| Data symbol rate | 4 kHz | 4 kHz | 4 kHz |
| On-line symbol rate | 4.0588 kHz | 4.0588 kHz | 4.0588 kHz |
| Subcarrier spacing | 4.3125 kHz | 4.3125 kHz | 4.3125 kHz |
| Superframe | 68 symbols plus one sync symbol | 68 symbols plus one sync symbol | 68 symbols plus one sync symbol |
| Used subcarriers (FDD) | 36–255 (note 1) | 36–127 (notes 1, 2) | 7–28 (note 1) |
| In-band transmit PSD (dBm/Hz) | −40 | | −38 |

Notes:
1. These numbers do not appear in any standard; they are practical recommendations to ease the filtering requirements only.
2. In G.992.2 a few subcarriers may be taken from upstream to help the downstream; the used subcarriers may be more like 33–127 and 7–25.

Equation (8.3) can then be written as

$$\mathrm{SNR_{loss}} = 10 \log_{10}\left(1 + \sum_{\text{all noise sources}} \frac{\text{avoidable noise}}{\text{unavoidable noise}}\right) \tag{8.4}$$

and each noise source should be assigned an allowable contribution (typically between 0.1 and 2.0 dB) to the total noise budget.
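
As a rough illustration of how such a budget might be tallied, the following sketch evaluates (8.4) for a list of avoidable noise powers expressed relative to the unavoidable noise. The function name and the numbers are hypothetical, chosen only to show the arithmetic; they are not measurements.

```python
import math

def snr_loss_db(unavoidable, avoidable_sources):
    """SNR loss of (8.4): all powers are linear (not dB) and in the same units."""
    return 10 * math.log10(1 + sum(avoidable_sources) / unavoidable)

# Hypothetical budget: five avoidable sources, each 10% of the unavoidable noise.
budget = [0.1, 0.1, 0.1, 0.1, 0.1]
total_loss = snr_loss_db(1.0, budget)          # about 1.76 dB
single_loss = snr_loss_db(1.0, budget[:1])     # about 0.41 dB for one source alone
```

Note that the individual contributions, each computed alone in decibels, do not simply add to the total; the sum must be taken on the linear powers inside the logarithm.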

8.1.2 Numerical Details

The important numbers for ADSL are given in Table 8.1. The operative arithmetic is

$$4 \times (512 + 40) = 4.3125 \times 512 = 2208 \tag{8.5a}$$

and

$$68 \times (512 + 40) = 69 \times (512 + 32) \tag{8.5b}$$
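
The relationships in (8.5a) and (8.5b) can be checked with exact rational arithmetic. The sketch below uses the downstream numbers from Table 8.1; the variable names are mine, and the 17-ms superframe duration is simply what the arithmetic yields.

```python
from fractions import Fraction

IFFT_SIZE = 512
PREFIX_NO_SYNC = 40              # prefix length if no sync symbol were sent
PREFIX_WITH_SYNC = 32            # prefix length actually used on line
DATA_SYMBOL_RATE_KHZ = 4
SPACING_KHZ = Fraction(69, 16)   # 4.3125 kHz

# (8.5a): both routes give the 2208-kHz sampling clock.
f_samp_from_symbols = DATA_SYMBOL_RATE_KHZ * (IFFT_SIZE + PREFIX_NO_SYNC)
f_samp_from_spacing = SPACING_KHZ * IFFT_SIZE

# (8.5b): 68 data symbols with the long prefix occupy the same number of
# samples as 69 on-line symbols (68 data plus one sync) with the short prefix.
samples_per_superframe = 69 * (IFFT_SIZE + PREFIX_WITH_SYNC)

online_symbol_rate_khz = Fraction(f_samp_from_symbols, IFFT_SIZE + PREFIX_WITH_SYNC)
superframe_ms = Fraction(samples_per_superframe, f_samp_from_symbols)
```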

Output Power Spectral Densities (PSD). As xDSL services are extended to higher and higher data rates and frequencies, crosstalk becomes more and more critical, and the transmitted PSDs have to be more tightly controlled. As discussed in Section 4.5 the specification of ADSL PSDs was tightened in Issue 2 of T1.413 to reduce crosstalk into the emerging VDSL service. The PSDs for the ATU-C and ATU-R are shown in Figures 25 and 29 of T1.413.

8.2 TRANSMITTER

Figure 8.2 is a block diagram of a DMT transmitter; it shows those parts common to Figures 2 to 5 of T1.413 Issue 2, omitting the definitions of the input data and control channels and the numbers specific to the direction of transmission. We consider each of the blocks in turn,⁴ but some of them (scrambling, FEC, and interleaving) have been well covered elsewhere, and others (the inverse discrete Fourier transform and the cyclic prefix) are discussed elsewhere in this book, so I will be brief.

8.2.1 Transport of the Network Timing Reference

As shown in Figure 8.2, the NTR is an input to the mux/sync control. Information about the NTR is included in the data stream, and it is processed thereafter like any other data. Because the basic sampling clock used in an ADSL system, 2.208 MHz, is an integer multiple of 8 kHz, it might be thought that the simplest way would be to slave the whole ADSL system to the NTR. It was quickly realized during the discussions in T1E1.4, however, that a prohibitive amount of filtering would be needed in a PLL to attenuate the high-frequency components of the output of the phase detector; it would probably be nearly impossible to keep the input to the VCXO quiet enough and the resulting

Figure 8.2 Block diagram of a DMT transmitter.

⁴ Ideally, the length of discussion of each block should be proportional to the estimated unfamiliarity of the block to the reader, but occasionally, it will be proportional to the familiarity to the author!

Figure 8.3 Transmission of the NTR phase information.

jitter on the derived 2.208 MHz low enough. Therefore, it was decided, surprisingly but wisely, to keep the ADSL system clocks independent of the NTR and to transmit information about any frequency offset between the local timing reference (LTR = 2.208 MHz ÷ 276) and the NTR. The remote unit can then recreate the NTR from its own reconstructed LTR.

Figure 8.3 is a copy of Figure 9 of T1.413 Issue 2. The NTR and the LTR are both “8-kHz” clocks that are very close in frequency but have an arbitrary phase relationship. The phase of the LTR is therefore sampled on each NTR, and the value of this is sampled into register 2 at the end of each superframe. Registers 2 and 3 therefore contain the most recent and the previous phase differences (measured in units of 1/2.208 µs) between the NTR and the LTR. The second finite difference of the phase (i.e., the frequency difference) is then defined by just four bits, and transmitted as “data” in one of the downstream overhead channels. One possible circuit for recovery of the NTR is shown in Figure 8.4, and discussed, along with all other receiver timing recovery issues, in Section 8.4.3.
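
The register arithmetic just described might be sketched as follows. This is only an illustration under stated assumptions: the phase is taken to be a count of sampling-clock cycles modulo 276, the change is wrapped to the nearest equivalent value, and the four bits are assumed to be a two's-complement number. The function names are hypothetical and the exact coding should be taken from T1.413.

```python
LTR_DIVIDER = 276   # 2.208 MHz / 276 = 8 kHz

def phase_change(prev_phase, curr_phase, modulus=LTR_DIVIDER):
    """Change in the NTR-vs-LTR phase between superframes, in sampling-clock
    cycles, wrapped into the range [-modulus/2, modulus/2)."""
    half = modulus // 2
    return (curr_phase - prev_phase + half) % modulus - half

def encode_4bit(delta):
    """Pack a small signed change into four bits (two's complement assumed)."""
    if not -8 <= delta <= 7:
        raise ValueError("phase change does not fit in four bits")
    return delta & 0xF

# Register 3 held 275 cycles, register 2 now holds 1: the phase advanced by 2.
nibble = encode_4bit(phase_change(275, 1))
```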

8.2.2 Input Multiplexer and Latency (Interleave) Path Assignment

Interleaving greatly increases the ability of the Reed–Solomon (R-S) forward error correction (FEC) coding and decoding to correct bursts of errors due to either externally generated noise impulses or internally generated clips (see Section 5.6.1), but it does increase the latency of the data. Deciding on a compromise between burst error rate and latency for each data channel is a function of the transmission convergence (TC) layer, which must combine the multiple input data channels and assign them to either the “fast” (i.e., not interleaved) or the interleaved path.

8.2.3 Scrambler

The scrambler chosen for ADSL is of the self-synchronizing type (see, e.g., [Bingham, 1988]). The descrambler for this has the effect of tripling the bit error rate, but the general opinion seems to be that this is inconsequential if an R-S FEC is used. A DMT xDSL system must be locked to a superframe clock, and loss of that lock would affect many more receiver functions than just the descrambler, so the self-synchronizing ability of this type of scrambler brings no advantage. An additive scrambler that is reinitialized with each superframe would have been better and was proposed in [Bingham, 1993], but tradition prevailed.⁵

Figure 8.4 Recovery of the NTR in an ATU-R.

8.2.4 Reed–Solomon Forward Error Correction

The only aspect of R-S FEC as used for xDSL that has not been covered in many books (e.g., [Berlekamp, 1980], [Clark and Cain, 1981], and [Lin and Costello, 1983]) is the fact that the R-S codewords are locked to the DMT symbol rate. The ideal (for ease of coding at least) would be that each codeword contain exactly one symbol of data, but at low data rates this would lead to small codewords and inefficient codes, and at very high data rates it would result in inconveniently large codewords. Therefore, the number of DMT symbols per codeword (designated by S in T1.413) may be ½, 1, 2, 4, and so on. A typical ADSL downstream transmitter at 1.6 Mbit/s would use a DMT symbol of 50 bytes and a codeword of 200 bytes; that is, S = 4.

One result of the integer constraint on other parts of the system is that since R-S codewords are composed of bytes, the number of bits per symbol must be an integer multiple of 8; for ADSL this means that the on-line data rate must be an integer multiple of 32 kbit/s. We look at an example of the design of an encoder/interleaver in the next section.
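
The byte and codeword arithmetic above can be sketched in a few lines. The helper names are hypothetical, and the sketch treats the quoted rate as the on-line rate carried by the 4-kHz data symbols.

```python
from fractions import Fraction

DATA_SYMBOL_RATE_HZ = 4000

def bytes_per_symbol(rate_bps):
    """Bytes carried by one DMT symbol; rejects rates that are not n x 32 kbit/s."""
    bits, remainder = divmod(rate_bps, DATA_SYMBOL_RATE_HZ)
    if remainder or bits % 8:
        raise ValueError("rate must be an integer multiple of 32 kbit/s")
    return bits // 8

def codeword_bytes(rate_bps, s):
    """Codeword length for S symbols per codeword (S may be 1/2, 1, 2, 4, ...)."""
    size = bytes_per_symbol(rate_bps) * Fraction(s)
    if size.denominator != 1:
        raise ValueError("codeword must contain a whole number of bytes")
    return int(size)

symbol_len = bytes_per_symbol(1_600_000)     # 50 bytes
codeword_len = codeword_bytes(1_600_000, 4)  # 200 bytes
```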

8.2.5 Interleaving⁶

The purpose of the combination of an interleaver in the transmitter and a de-interleaver in the receiver is to spread bursts of errors, which occur between the two, over many codewords, and thus reduce the number of errors in any one codeword to what can be corrected by the decoder (usually half the number of redundant bytes added in the encoder). The two important parameters for an interleaver are the number of bytes⁷ per codeword, N, and the interleave depth, D, which is defined as the minimum separation at the output of the interleaver of any two input bytes in the same codeword.⁸ Another, system-oriented, way of defining D is as the dilution ratio of errors in a codeword. If in an MCM system with S = 1 (i.e., one symbol per codeword) a symbol experiences a large noise impulse such that all its bytes contain errors, then approximately 1/D of the bytes in each codeword at the output of the de-interleaver will be in error.

⁵ I believe that there was an argument that the self-synchronizing scrambler is better for ATM transmission, but I do not remember either the argument or its source.
⁶ I am indebted to Po Tong of TI/Amati for much of this section.

Interleavers used for xDSL are convolutional interleavers. The advantages compared to traditional block interleavers are well known: for the same N and D they require half or less than half the memory (≤ 2ND for an end-to-end system, compared to 4ND) and incur less than half the end-to-end delay [N(D−1), compared to 2ND]. One small disadvantage is that N and D must be coprime; that is, their highest common factor must be unity.

The interleaver/de-interleaver pair described in [Clark and Cain, 1981] and [Lin and Costello, 1983], and shown in Figure 8.5, are “triangular” convolutional. The interleave depth D = NM + 1 (thus guaranteeing that N and D are coprime), and the memory requirement is (N−1)NM = (N−1)(D−1). The implementation is very efficient (less than a quarter of the memory needed for a block interleaver), but the constraint on D is very inconvenient for xDSL; a typical xDSL system uses D < N.

The interleaver defined in T1.413 was originally proposed in [Aslanis et al., 1992] and [Tong et al., 1993]. It is a generalization of the triangular interleaver in that N and D can be defined (almost) independently.⁹ The interleaving rule is the

Figure 8.5 Triangular interleaver.

⁷ In the literature these are often referred to as symbols, but we use “symbols” for a much larger grouping of bits.
⁸ Readers are warned that this is the definition given in some books but not in all; careful reading is needed to reconcile the various definitions.
⁹ The requirement that they be coprime is met by using only odd N values and making D a power of 2.

Figure 8.6 Interleave matrix for N = 7 and D = 4.

same as for the triangular interleaver: Each of the N bytes, B_i for i = 0 to N−1, of a codeword is delayed by i × (D−1).

Thus the first and second bytes of a codeword (indices 0 and 1) are delayed by 0 and (D−1), respectively, for a net (minimum) separation of D. The interleaver can be implemented by N × D matrices that are written into by columns and read from by rows; the economy of memory compared to block interleavers is that only one matrix is used at each end: writing into one location and reading from another are performed alternately. [Aslanis et al., 1992] showed an example in which the interleave matrix is read sequentially by rows, and read (using a complicated set of rules) by columns. Figure 8.6 shows another example for N = 7 and D = 4 (odd and a power of 2, respectively, as required by T1.413).

NOTE: The de-interleaver is both the reverse of the interleaver (substitute “rows” for “columns,” and vice versa) and the complement, in that the delay through each of them varies from byte to byte, but the total through both is constant.
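
The delay rule and its complement can be demonstrated on a finite stream. The sketch below is not a memory-efficient implementation; it simply places each byte at its delayed position in a list, so that the separation and constant-total-delay properties can be checked. The function names and the `None` filler are my own choices, and the modular inverse `pow(D, -1, N)` needs Python 3.8 or later.

```python
from math import gcd

def interleave(data, N, D, fill=None):
    """Delay byte i of every N-byte codeword by i*(D-1) byte periods."""
    if gcd(N, D) != 1:
        raise ValueError("N and D must be coprime")
    out = [fill] * (len(data) + (N - 1) * (D - 1))
    for n, byte in enumerate(data):
        i = n % N                       # index of the byte within its codeword
        out[n + i * (D - 1)] = byte     # lands at position k*N + D*i
    return out

def deinterleave(stream, N, D, fill=None):
    """Apply the complementary delay (N-1-i)*(D-1) to each received byte."""
    d_inv = pow(D, -1, N)               # exists because N and D are coprime
    out = [fill] * (len(stream) + (N - 1) * (D - 1))
    for p, byte in enumerate(stream):
        i = (p * d_inv) % N             # recover the codeword index from the position
        out[p + (N - 1 - i) * (D - 1)] = byte
    return out

N, D = 7, 4
data = list(range(8 * N))
line = interleave(data, N, D)
recovered = deinterleave(line, N, D)
total_delay = (N - 1) * (D - 1)         # the same for every byte
```

In this sketch every byte emerges from the pair delayed by (N−1)(D−1) positions, and bytes of one codeword are at least D positions apart on the line.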

A simpler way of implementing the interleaver, which requires calculating only a one-dimensional address, is with a circular buffer of circumference ND. For byte B_n

$$\text{write address} = n \bmod ND \tag{8.6}$$

$$\text{read address} = \{n + ND - (D - 1) \times (n \bmod N)\} \bmod ND \tag{8.7}$$

It can be seen that the total memory requirement is 2ND: twice that of a triangular pair. [Tong, 1998] describes the best of both worlds: independence of N and D, and a (N−1)(D−1) memory requirement. This is achieved, albeit at the cost of some fairly complicated programming, by reading from an address, writing to the same address, and then calculating the next address. One useful feature of Tong’s implementation is that the resulting interleaving is exactly the same as that performed by the single matrix method, thus allowing for different implementations that are end-to-end compatible.

An Example of an FEC/Interleaver Combination. A typical downstream ADSL signal might have the following parameters:

Data rate: 6.4 Mbit/s
Symbol rate: 4.0 kBaud
Bits/symbol: 1600
Bytes/symbol: 200
FEC redundancy bytes: 16 (8% overhead)
Correcting ability: 8 bytes/symbol

If such a system experienced a large impulse that corrupted all 200 bytes in one symbol, those 200 would need to be diluted by a factor of at least 200/8 = 25 in order to be correctable. This suggests that D = 32 would be an appropriate choice. The end-to-end delay for such an interleaver/de-interleaver pair would be 32 × 200 = 6400 bytes, which at 800 kbyte/s, would be 8 ms.
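
The same design steps can be written out as a short calculation. This is a sketch of the arithmetic in the example only; the function name is hypothetical, and the delay is approximated as D × N bytes, as in the text.

```python
import math

def interleaver_design(data_rate_bps, symbol_rate_hz, redundancy_bytes):
    """Return (bytes/symbol, required dilution, depth D, delay in bytes, delay in ms)."""
    bytes_per_symbol = data_rate_bps // symbol_rate_hz // 8
    correctable = redundancy_bytes // 2                 # R-S corrects half the redundancy
    dilution = math.ceil(bytes_per_symbol / correctable)
    depth = 1 << (dilution - 1).bit_length()            # smallest power of 2 >= dilution
    delay_bytes = depth * bytes_per_symbol
    delay_ms = delay_bytes * 8 * 1000 / data_rate_bps
    return bytes_per_symbol, dilution, depth, delay_bytes, delay_ms

design = interleaver_design(6_400_000, 4000, 16)
```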

8.2.6 Tone Ordering

As we have seen in Section 5.2, subcarriers are bit-loaded in proportion to their SNR measured at the receiver, which typically decreases rapidly with frequency. At the transmitter, however, clipping noise is impulsive in the time domain and white in the frequency domain, and would be much more damaging to those carriers that are heavily loaded if the effects of the noise were not spread out by interleaving. Tone ordering therefore arranges all the subcarriers in order of increasing numbers of loaded bits and then assigns the data in the fast (i.e., noninterleaved) path to the subcarriers in sequence followed by the data in the interleaved path to the remaining (more heavily loaded) subcarriers.
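
A minimal sketch of this ordering follows. It assumes that the bit loading is given as a list indexed by subcarrier and that a subcarrier may straddle the boundary between the two paths; the function name and the example loading are hypothetical, and the exact rules are those of T1.413.

```python
def tone_order(bits_per_tone, fast_bits):
    """Return (tone, bits from fast path, bits from interleaved path) tuples,
    with tones taken in order of increasing bit loading."""
    order = sorted(range(len(bits_per_tone)), key=lambda k: bits_per_tone[k])
    allocation = []
    remaining = fast_bits
    for tone in order:
        loaded = bits_per_tone[tone]
        if loaded == 0:
            continue                      # unused subcarrier
        from_fast = min(loaded, remaining)
        remaining -= from_fast
        allocation.append((tone, from_fast, loaded - from_fast))
    if remaining:
        raise ValueError("fast-path data exceeds the total bit loading")
    return allocation

# Hypothetical loading for five subcarriers, with 6 bits per symbol in the fast path.
plan = tone_order([0, 8, 2, 6, 4], fast_bits=6)
```

Here the lightly loaded tones 2 and 4 carry the fast data, and the heavily loaded tones 3 and 1 carry the interleaved data.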

8.2.7 Trellis Code Modulation

Trellis code modulation (TCM) has been extensively discussed elsewhere (e.g., see [Kurzweil, 1999]), and the particular one used for ADSL is defined precisely in T1.413. There are, nevertheless, several aspects of it that are specific to MCM and need to be discussed:

• An SCM TCM system encodes and decodes the symbols of data (all with the same number of bits) in time sequence. An MCM system could do the same if it processed each subcarrier separately, but both the memory requirements and the latency of the N/2 Viterbi decoders would be very large. A much better idea [Decker et al., 1990] is to encode from one subcarrier to the next.

• When encoding across subcarriers the number of bits will vary from one input to the encoder to the next. This does not cause any problems, however, because only a few of the bits (just three for two successive subcarriers with the four-dimensional code used in T1.413) are involved in the coding; the other bits, whose number varies, are passed through uncoded.

• In order that each symbol can be encoded and decoded by itself without reference to the previous symbol, it is necessary to start from the first subcarrier of each symbol with the encoder in a known state and to force the encoder into that state after the last subcarrier.

• The carrier recovery circuitry or algorithms of an SCM receiver are typically only able to resolve the phase of the carrier modulo π/2. Therefore, any trellis code that is used must be π/2 rotationally invariant; the four-dimensional nonlinear Wei code [Wei, 1984] is the best known example of such a code. An MCM system must, however, establish an absolute phase reference for every subcarrier and maintain this throughout a session. It does not therefore require the π/2 rotationally invariant property, and a simpler code could probably be used. This was proposed several times during the work on T1.413, but nevertheless, the patented Wei code was selected.

8.2.8 Pilot Tone

T1.413 specifies that one of the subcarriers (number 64) should be left unmodulated. It was argued in [Spruyt, 1997] that sensitivity to slowly varying interference would be reduced by modulating the subcarrier with random 4-QAM, but unfortunately, this good proposal came too late to be acceptable.

8.2.9 Inverse Discrete Fourier Transform

The efficient implementation of an inverse discrete Fourier transform (IDFT) as an inverse fast Fourier transform (IFFT) is discussed in Appendix C.

8.2.10 Cyclic Prefix

The form and purpose of the cyclic prefix are explained in Section 6.1. Figure 8.2 shows it graphically as the replication of inputs to the parallel-to-serial converter.

8.2.11 PAR Reduction
NOTE: The maximum permissible output signal of an ADSL transmitter is constrained by its PSD, not its total power. Therefore, an ADSL downstream
• Reducing the transmit level (4) and rotating the IFFT inputs (5) require signaling with every processed symbol.

It is probable that none of these methods will be adequate by itself; a judicious combination will be needed. We discuss each of them briefly using the background information in Section 5.6.1, but the discussion will not be conclusive; many months of study and discussion among people with complementing skills will be needed before the best solution can be found. It will be obvious which are my favorites from the amount of space I devote to each method, but in order not to keep the reader in suspense (and to increase his or her reading efficiency), I will state from the start that they are numbers 3 and 7.

1. Accepting Errors and Correcting by Retransmission. It can be seen from Figure 5.1 that with a PAR of 4.0, the probability of a clip is approximately 6 × 10⁻⁵. This means that for the 512-sample symbols used by ADSL downstream and all VDSL, the probability of a clipped symbol is approximately 1/30. This seems like a reasonable upper demand to put on any retransmission system. Therefore, for applications that can live with a 3% retransmission rate, a PAR of 4.0 (12 dB) is achievable.
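
These figures can be reproduced approximately with a short calculation. The sketch assumes, as an approximation not taken from Figure 5.1 itself, that the samples are Gaussian and independent and that the PAR is the ratio of the clip level to the rms signal level; the function names are hypothetical.

```python
import math

def clip_probability(par):
    """Probability that one Gaussian sample exceeds +/- par times its rms value."""
    return math.erfc(par / math.sqrt(2))

def clipped_symbol_probability(par, samples=512):
    """Probability that at least one of the samples in a symbol is clipped."""
    return 1 - (1 - clip_probability(par)) ** samples

p_sample = clip_probability(4.0)            # roughly 6 x 10^-5
p_symbol = clipped_symbol_probability(4.0)  # roughly 1/30
```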

NOTE: There might be problems because the R-S codewords are synchronized with the DMT symbols; it would obviously achieve nothing if the same symbol were retransmitted.

2a. Attempting to Correct the Errors by an FEC Alone. As we saw in Section 5.6.1, a small proportion of clips will cause errors on all subcarriers loaded with more than a certain number of bits. For example, from Table 5.2 we can deduce that for a PAR of 4.0, one symbol out of 30 will have a very high error rate on all subcarriers carrying more than 7 bits. If there are more than a very few of these, it is unlikely that an FEC alone will be able to correct the errors. On the other hand, in the other 29 symbols there will be no errors and the FEC will not be needed. Clearly, an FEC alone contributes almost nothing to the correction of clip-induced errors.

2b. Attempting to Correct the Errors by an FEC and Interleaving. Interleaving has the effect of diluting each burst of clip-induced errors, and, perhaps, reducing them to a proportion that can be handled by the FEC. Analysis of the combined effects of interleaving and FEC on error rates in general, and on the required PAR in particular, is very complicated, and must be considered unfinished business.

NOTE: It has been argued that the FEC is intended to correct for unpredictable impulses from outside, and should not be pre-empted to deal with “predictable” events such as clips. If, however, the number of clipped symbols is kept to only a few percent, I see no reason why the FEC should not perform the double duty.

3. Filtering the Clip Impulses. [Chow et al., 1998] A clip in the time domain generates noise that is white in the frequency domain; this is potentially very harmful to those subcarriers carrying a large number of bits but totally innocuous to those carrying a small number. For minimum average deleterious effect, the clip noise should be filtered so that the signal/clip ratio (SCR) parallels the SNR.

Zero at dc. The simplest case is when the SNR decreases monotonically with frequency from dc (e.g., an SDMT VDSL system with only kindred FEXT). The clip noise should then be filtered by

$$F(z) = -0.5z + 1 - 0.5z^{-1} \tag{8.8}$$

which

• Has a double zero at dc
• Has unity gain at f_samp/4
• Increases the total clip noise across the entire band by 1.76 dB

This is very simple to implement on the output of the parallel-to-serial converter after the cyclic prefix has been added: If the magnitude of the nth sample exceeds the predefined clip level (3.5 is a reasonable clip level to strive for), it is clipped and half the clip noise subtracted from the (n−1)th and (n+1)th samples. If the first or last sample of the symbol must be clipped, then in order to allow each symbol to be processed completely without reference to previous or subsequent symbols, the full amount of the clip should be subtracted from the second or penultimate sample, respectively. This is equivalent to only a first-order high-pass filter, but since it is invoked only rarely, the effect on the average clip filtering is negligible.

If because of line attenuation, some of the upper subcarriers are not used, successive samples will be correlated and clips may occur in pairs. Then if both the nth and the (n+1)th samples exceed the clip level, they should both be clipped and half of the sum of the clip noises subtracted from the (n−1)th and (n+2)th samples. For the reasons discussed in the note at the beginning of this section, correlated clips will be rare, but modification of the algorithm to deal with them should be included.
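
The procedure of the last two paragraphs might be sketched as follows for the filter of (8.8). The function name is hypothetical. Two points go beyond the text and are assumptions of the sketch: runs longer than two clipped samples are treated like pairs, and a run that touches the edge of the symbol has its whole summed clip noise moved to the single available neighbor. The adjusted neighbors are not re-checked against the clip level.

```python
def clip_and_filter(x, level):
    """Clip one symbol's samples and subtract the clip noise from the neighbors,
    so that the total error has a zero at dc."""
    y = [float(v) for v in x]
    last = len(x) - 1
    n = 0
    while n <= last:
        if abs(x[n]) <= level:
            n += 1
            continue
        start, noise = n, 0.0
        while n <= last and abs(x[n]) > level:    # a run of one or more clips
            clipped = level if x[n] > 0 else -level
            noise += clipped - x[n]               # clip noise = clipped - original
            y[n] = clipped
            n += 1
        left, right = start - 1, n                # samples on either side of the run
        if left >= 0 and right <= last:
            y[left] -= 0.5 * noise
            y[right] -= 0.5 * noise
        elif right <= last:                       # run begins the symbol
            y[right] -= noise
        elif left >= 0:                           # run ends the symbol
            y[left] -= noise
    return y

single = clip_and_filter([0.0, 1.0, 4.5, 1.0, 0.0], 3.5)
pair = clip_and_filter([0.0, 4.5, 5.5, 0.0], 3.5)
```

Because the subtracted amounts equal the clip noise, the sum of the samples is unchanged, which is the time-domain statement of the zero at dc.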

Zero at Some Higher Frequency. If the SNR does not decrease monotonically [e.g., if NEXT from an HDSL system(s) is the dominant noise source], or if the SCR at low frequencies is of no interest (e.g., the downstream signal in an FDD ADSL system), the zero of the high-pass filter should be moved up to subcarrier n₀:

$$F(z) = -az + 1 - az^{-1} \tag{8.9}$$

where

$$n_0 = \frac{N}{2\pi} \cos^{-1}\left(\frac{1}{2a}\right) \tag{8.10}$$

and for the sake of simple digital operations, a is constrained to values that can be expressed as (2⁻¹ + 2⁻ᵐ).
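
The properties claimed for (8.8) and (8.9) follow from the frequency response F(e^{jω}) = 1 − 2a cos ω, and can be checked numerically. The function names in this sketch are hypothetical.

```python
import math

def notch_subcarrier(n_fft, a):
    """Subcarrier index n0 of the filter zero, from (8.10)."""
    return n_fft / (2 * math.pi) * math.acos(1 / (2 * a))

def filter_gain(a, subcarrier, n_fft):
    """Magnitude of F at a given subcarrier: |1 - 2a cos(2 pi n / N)|."""
    return abs(1 - 2 * a * math.cos(2 * math.pi * subcarrier / n_fft))

def noise_increase_db(a):
    """Increase in total (white) clip noise power: sum of squared taps, 1 + 2a^2."""
    return 10 * math.log10(1 + 2 * a * a)

n0_dc = notch_subcarrier(512, 0.5)        # 0: the double zero at dc of (8.8)
n0_fdd = notch_subcarrier(512, 0.5625)    # about 38.8, with a = 2^-1 + 2^-4
mid_band_gain = filter_gain(0.5625, 128, 512)   # unity at f_samp/4 for any a
```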

Distribution of the Clip Energy. The filter acts on each clip separately: big or small clips become big or small filtered clips. The analysis in Section 5.6.1 of the distribution of clip energy is therefore still pertinent; the predicted levels on any subcarrier must just be multiplied by the filter transfer function.

Gain at f_samp/4. The generalized filter of (8.9) has unity gain at f_samp/4, so all the calculations in Section 5.5.1 of the effects of the distribution of the (unfiltered) clipped energy apply unchanged to the filtered clip at subcarrier N/4. This allows the design process to be separated into two almost distinct steps:

1. What PAR is needed to support the required number of bits at the center of the band?
2. How should the SCCR be shaped to get the greatest margin relative to the SNR at other frequencies?

Over-Sampling. The means for doing digital signal processing are improving much faster than those for analog processing, and oversampling and digital filtering of the transmit signal (particularly the upstream signal) are becoming more common. Oversampling is an extreme example of the situation described above (successive samples are very strongly correlated), and trying to correct for the clipping of one sample using (8.8) or (8.9) would fail because most of the adjacent samples would also be clipping. The algorithm must therefore be modified as follows.

Consider the case of 8× oversampling that is common for the ADSL upstream signal.¹⁰ If the sample number (8m + 1) clips, then (8m + 2) to (8m + 8) will also, and the clip must be filtered by subtracting 8a × the clip from samples 8m and (8m + 9). The resultant clip pulse [−8a 1 1 1 1 1 1 1 1 −8a] has the desired double zero at n₀, as given by (8.10). The subtraction of 8a × the clip means that samples 8m and (8m + 9) are now themselves more likely to clip, and the algorithm will perform slightly below theory; simulation is the only way to check, so an 8× oversampled ADSL upstream transmitter is included in the following simulated examples.

Some Results. Three systems were tested by simulation. A fairly conservative PAR of 3.5 (10.9 dB) was used throughout:
¹⁰ This is also discussed as decremented oversampling in Section 8.2.12.
Figure 8.7 Downstream SCCRs for a = 0.5: average unfiltered and filtered, and worst case out of 10,000 symbols.

1. A basic (EC ADSL down or VDSL) system: N = 512, subcarriers 6–255 are loaded,¹¹ and a = 0.5. The probability of a clipped symbol = 0.21. The average SCCR for the unfiltered system and for subcarrier 128 of the filtered system is 35.8 dB, but as discussed in Section 5.6.1, this is not very informative; a margin of about 18.5 dB is needed to prevent the occasional big clips and multiple clips from overwhelming the FEC. The resulting “worst-case” SCCR of 18.3 dB will support 6 bits, which is about the maximum that would be expected in the middle of the band (552 kHz for ADSL downstream, 5.52 MHz for SDMT VDSL).

Figure 8.7 shows the average unfiltered and filtered SCCRs. As predicted by theory, the former is 35.8 dB across the band, and the latter is 35.8 dB at subcarrier 128. Below subcarrier 128 the filtered SCCR increases faster than a typical SNR (and therefore, what is most important, faster than a typical bit loading). Figure 8.7 also shows the worst filtered SCCR from 10,000 symbols (2130 clipped symbols); it is approximately 14 dB worse than the average: comparable to the 12 dB predicted in Section 5.6.

¹¹ In all cases the lower five subcarriers were omitted to allow for POTS.

Figure 8.8 Downstream SCCRs for a = 0.5625 on a short loop.

2. An FDD ADSL downstream signal on a short loop: N = 512, subcarriers 36 through 255 are loaded, and a = 0.5625 (2⁻¹ + 2⁻⁴). The filter notch is approximately at the bottom edge of the used band (n₀ = 38.8). It can be seen from Figure 8.8 that compared to the system with a = 0.5, at the low end of the band there is an improvement of as much as 8 dB, and at the top end there is a deterioration of about 1 dB.

3. An 8× over-sampled upstream ADSL signal: N = 64, subcarriers 6 through 28 are loaded, and a = 0.5 and 0.53125 (2⁻¹ + 2⁻⁵). It can be seen from Figure 8.9 that the combination of oversampling and filtering improves the SCCR across the entire band, and that tailoring the filter transfer function to match the POTS-required gap at the low end further improves the SCCR by as much as 3 dB.

An FDD ADSL downstream signal on a long loop with only subcarriers 36 through 128 loaded was also investigated, but because the total power was reduced by 3.7 dB, clips were so infrequent as to be insignificant.

4. Reducing the Transmit Level of Symbols That Would Otherwise Be Clipped. The method described in [Chow et al., 1998] greatly reduces the probability of a clip by digitally detecting a potential clip and reducing the level of every sample of the symbol by one of a few predefined amounts. This, of course, increases the probability of error due to external crosstalk and noise, but for a given PAR, this increase can be balanced against the decrease in clip-induced errors. The ideal balance is with a PAR of about 3.5 and the ability to deal with signals up to 6.0; this requires a range of about 4.5 dB, which can be conveniently divided into three 1.5-dB steps.

Figure 8.9 Upstream SCCRs using 8× oversampling.

This method would require the use of two bits per symbol to signal the reduced level. Errors on these two bits would be catastrophic, but they could be made very secure by loading them by themselves on a subcarrier with a high SNR. Nevertheless, both ANSI and ITU have rejected the method as being too vulnerable.
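As a rough illustration of the level-reduction idea, the sketch below picks the smallest of four attenuation settings (0, 1.5, 3.0, and 4.5 dB) that brings a symbol's peak under the clip level, and returns the two-bit code that would have to be signaled. The function names, the interpretation of 3.5 as an amplitude relative to the rms level, and the mapping of codes to steps are assumptions made for illustration; they are not taken from [Chow et al., 1998].

```python
import math

STEP_DB = 1.5     # size of one attenuation step (from the text)
NUM_CODES = 4     # two signaling bits give codes 0..3 (0, 1.5, 3.0, 4.5 dB)

def select_level_code(peak, clip_level=3.5):
    """Return the 2-bit code of the smallest attenuation that avoids a clip.

    peak and clip_level are amplitudes relative to the rms signal level
    (an assumed convention). If even the largest attenuation is not
    enough, the largest code is returned and a clip remains possible.
    """
    if peak <= clip_level:
        return 0
    needed_db = 20.0 * math.log10(peak / clip_level)
    code = math.ceil(needed_db / STEP_DB)
    return min(code, NUM_CODES - 1)

def apply_level_code(samples, code):
    """Scale every sample of the symbol by the attenuation for this code."""
    gain = 10.0 ** (-code * STEP_DB / 20.0)
    return [s * gain for s in samples]
```

The receiver would need the same code to restore the level, which is why errors on those two bits are so damaging.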

5. Retransforming with Randomly Selected IFFT Inputs. This method was originally proposed for DMT in [Mestdagh and Spruyt, 1996], and it has been proposed for OFDM in [Müller and Huber, 1997]. In its simplest form, the method is as follows. If a sample in any symbol exceeds the clip level, the IFFT is recalculated with a new set of inputs that are related to the originals in some easily defined way but random enough that the second set of samples is independent of the first. If the new set contains a potential clip, the process is repeated.

This is not in any way a rigorous mathematical definition of the method, but the result can be simply and precisely defined. If Pr_clipsymb, as defined in (5.13), is the probability of a symbol containing a clip, the probability of requiring more than n IFFTs (i.e., the first n symbols all contained clips) is $\mathrm{Pr}_{\text{clipsymb}}^{\,n}$. For example, for a PAR of 4.0 (12 dB) the probability of requiring more than four IFFTs is approximately 10^−6. The average rate at which IFFTs must be performed is

$$R_{\text{IFFT}} \approx f_{\text{symb}}\left[1 + 2\,\mathrm{Pr}_{\text{clipsymb}} + 3\,\mathrm{Pr}_{\text{clipsymb}}^{2} + 4\,\mathrm{Pr}_{\text{clipsymb}}^{3}\right] \qquad (8.11)$$

which is typically only slightly greater than f_symb. If, however, the amount of RAM available for buffering and the system latency are both tightly limited, the ability to perform IFFTs at the worst-case rate of 4 f_symb must be provided. Because of both the large computational penalty and the vulnerability to errors in the channel that signals the number of recalculations, the method has been rejected for ADSL by both ANSI and ITU.
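To get a feel for the numbers, the following sketch evaluates the tail probability and the bracketed factor of (8.11) as printed. The clip probability used is not computed from (5.13); it is simply back-solved from the statement that more than four IFFTs are needed with probability of about 10^−6, so it should be treated as an illustrative assumption.

```python
def prob_more_than(n, pr_clipsymb):
    """Probability that more than n IFFTs are needed (first n symbols all clipped)."""
    return pr_clipsymb ** n

def ifft_rate_factor(pr_clipsymb):
    """Bracketed factor of (8.11), R_IFFT / f_symb, truncated at four IFFTs."""
    p = pr_clipsymb
    return 1.0 + 2.0 * p + 3.0 * p ** 2 + 4.0 * p ** 3

# Illustrative value only: chosen so that prob_more_than(4, p) is 1e-6.
p = 10.0 ** (-1.5)
print(prob_more_than(4, p))
print(ifft_rate_factor(p))
```

With this value the average rate is only a few percent above f_symb, whereas the worst case that must be provisioned for is four times f_symb.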

6. Adding Constrained Dummy Subcarriers. There are several variations of this method (see, e.g., [Gatherer and Polley, 1997], [Tellado and Cioffi, 1998], [Shepherd et al., 1998], and [Kschischang et al., 1998]), but they all involve using only a subset of the available (N/2 − 1) subcarriers for data and adding redundant signals on another subset of subcarriers so that the IFFT of the aggregate signal has a much lower PAR. The methods differ in how they choose the subsets¹²; in order of increasing sophistication they use:

1. Subcarriers at the edges of the band, where it is anticipated that the SNR will be low enough that the subcarriers would not be used for data anyway
2. Subcarriers randomly distributed throughout the band in a way that is generically optimal
3. Subcarriers distributed throughout the band so as to be optimal for the measured SNRs¹³

Relative to the other two, the first one has the advantage that it does not sacrifice any capacity, but it has the disadvantages that (1) a contiguous block of carriers at the edge of the band is not efficient at combating peaks, and (2) the usual reason why these subcarriers are not used is that they experience high attenuation in the channel, so the worry is that when the redundant signals are attenuated, the PAR will increase again! The other two methods have the disadvantage that they waste capacity; up to 20% of redundancy has been proposed.

¹² They also differ in how they calculate the redundant signals, but that need not concern us here.
¹³ That is, the deduced SNR values, because the calculation is done at the transmitter where only the bit loadings are known.

7. Constellation Expansion for Some Data-Carrying Subcarriers. The following is a very brief summary of the method that is described in [Tellado and Cioffi, 1998]. Consider as a simple example an eight-level (three-bit) PAM signal that might be modulated onto one dimension of one of the subcarriers.¹⁴ The signal points are conventionally defined as −7, −5, −3, −1, +1, +3, +5, and +7, but if instead of the point +3, for example, −13 were transmitted,¹⁵ it could be correctly decoded in the receiver by a modulo-16 operation. That is, the original set (the inner eight points, −7 through +7) and the “minimally expanded” set (the outer eight points) are

−15 −13 −11 −9 −7 −5 −3 −1 +1 +3 +5 +7 +9 +11 +13 +15

It can be seen that in all cases there is a change in signal value of ±16 or, for a general L-level signal, ±2L.
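The modulo operation mentioned above can be sketched in a few lines. The function names are hypothetical, and the points are kept as unnormalized odd integers; noise and slicing are ignored.

```python
def expand_point(x, levels):
    """Minimally expanded (opposite-signed) equivalent of PAM point x.

    x is an odd integer in -(levels - 1) .. +(levels - 1).
    """
    return x - 2 * levels if x > 0 else x + 2 * levels

def fold_point(y, levels):
    """Receiver side: fold a point of the expanded set back onto the original set."""
    return (y + levels) % (2 * levels) - levels

L = 8
for x in range(-(L - 1), L, 2):
    y = expand_point(x, L)
    # Every expansion moves the point by exactly 2L and decodes to the original.
    assert abs(y - x) == 2 * L and fold_point(y, L) == x
```

For L = 8 this maps +3 to −13, as in the example, and leaves unexpanded points unchanged when folded.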
To normalize the original set to unit average energy per subcarrier, the points would have to be multiplied by $1/\sqrt{21}$ or, in general, by $\sqrt{3/(L^2-1)}$. Previously, however, we have normalized each time-domain sample to unit energy, so the normalization of the signals on each subcarrier must also include a factor $1/\sqrt{N_{\text{dim}}}$, where $N_{\text{dim}} = 2N_{\text{sc}} \le N$. Then the change in baseband signal value achieved by replacing any original point by its minimally expanded equivalent is given by

$$\Delta_{\text{sig}} = \frac{2L\sqrt{3}}{\sqrt{N_{\text{dim}}\,(L^2-1)}} \qquad (8.12)$$

If the subcarrier onto which this baseband signal is to be modulated has a peak value (again normalized to unity) at the sample that is to be modified, the Δ_sig of (8.12) is also the change in the sample of the passband signal.

The basic steps of the method are therefore:

1. Identify the sample (defined as sample n_max) of largest magnitude.
2. Identify a set of cosine or sine subcarriers (each of which defines a dimension) that have a maximum or near-maximum¹⁶ at sample n_max.
3. Find one subcarrier of that set that was modulated by an appropriate outer point (i.e., appropriate in that its sign was such that it contributed to the peak).
4. Replace the point modulated onto the chosen dimension by its minimally expanded (and opposite signed) equivalent.
5. Recalculate all the samples.
6. Return to step 1 to find more points to expand so as to bring the potentially clipped signal within range.

¹⁴ This simplification is valid only for subcarriers with an even number of bits, for which the two dimensions can be considered separately, but the extension to the cross constellations used for odd numbers of bits is fairly straightforward.
¹⁵ Any point defined by (3 + 16m) could be transmitted, but the others would require much greater increases in energy.
¹⁶ The reduction in PAR per step is proportional to the magnitude of the subcarrier at n_max, so it is advantageous to use only those subcarriers that are near a maximum.
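The six steps can be sketched as follows for the simplified case of one PAM value per cosine or sine dimension. This is only an illustration under stated assumptions: the samples are synthesized directly rather than with an IFFT, the points are unnormalized odd integers, the 0.9 "near-maximum" threshold is arbitrary, and the guard that abandons a change when it fails to lower the overall peak is an addition of this sketch rather than part of the published method.

```python
import math

def basis(k, kind, n, n_samples):
    """Value at sample n of the cosine ('c') or sine ('s') dimension of subcarrier k."""
    phase = 2.0 * math.pi * k * n / n_samples
    return math.cos(phase) if kind == "c" else math.sin(phase)

def synthesize(amps, n_samples):
    """Time samples of a real multicarrier symbol from per-dimension PAM values."""
    return [sum(a * basis(k, kind, n, n_samples) for (k, kind), a in amps.items())
            for n in range(n_samples)]

def peak_of(amps, n_samples):
    """Largest sample magnitude of the synthesized symbol."""
    return max(abs(v) for v in synthesize(amps, n_samples))

def reduce_peak(amps, n_samples, levels, target_peak, max_iter=10, near_max=0.9):
    """Expand outer points, one dimension per iteration, to lower the peak."""
    amps = dict(amps)
    for _ in range(max_iter):
        x = synthesize(amps, n_samples)
        n_max = max(range(n_samples), key=lambda n: abs(x[n]))      # step 1
        peak = abs(x[n_max])
        if peak <= target_peak:
            break
        best_key, best_mag = None, 0.0
        for (k, kind), a in amps.items():
            c = basis(k, kind, n_max, n_samples)
            # Steps 2 and 3: near-maximum dimension carrying an original outer
            # point whose sign makes it contribute to the peak.
            if abs(c) >= near_max and abs(a) == levels - 1 and a * c * x[n_max] > 0:
                if abs(c) > best_mag:
                    best_key, best_mag = (k, kind), abs(c)
        if best_key is None:
            break
        trial = dict(amps)
        a = trial[best_key]
        trial[best_key] = a - 2 * levels if a > 0 else a + 2 * levels  # step 4
        if peak_of(trial, n_samples) >= peak:    # step 5, plus the added guard
            break
        amps = trial                             # step 6: iterate again
    return amps
```

A practical implementation would update the samples incrementally with one vector of sines or cosines per iteration instead of resynthesizing the whole symbol.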

Step 3 needs a little more explanation. Although all changes from an original point to a minimally expanded point change the signal level by the same amount, the increase in energy of the point resulting from that change varies considerably. If the original point was defined, before normalization, as ±(2m − 1), for m = 1 to L/2, the increase in energy is

$$\Delta_{\text{en}} = \frac{3}{N_{\text{dim}}\,(L^2-1)}\left[(2m-1-2L)^2 - (2m-1)^2\right] \qquad (8.13)$$

For L large the increase in energy from an inner point (m = 1) to its minimally expanded point is approximately 12× the average energy, but from an outer point (m = L/2) to its minimally expanded equivalent it is only approximately (12/L)× that energy. The power increase involved would therefore be minimized if only outer points of large constellations were expanded.
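The two limiting values quoted above follow directly from (8.13). The short check below expresses the energy increase as a multiple of the average energy per dimension (that is, with the common factor 1/N_dim removed); the function name and the choice L = 64 are illustrative.

```python
def energy_increase_ratio(m, levels):
    """Energy increase of (8.13) relative to the average energy per dimension."""
    L = levels
    return 3.0 * ((2 * m - 1 - 2 * L) ** 2 - (2 * m - 1) ** 2) / (L ** 2 - 1)

L = 64
print(energy_increase_ratio(1, L))        # inner point: 12L/(L + 1), close to 12
print(energy_increase_ratio(L // 2, L))   # outer point: 12L/(L^2 - 1), close to 12/L
```

Expanding the bracket gives 4L(L − 1) for m = 1 and 4L for m = L/2, which is where the closed forms in the comments come from.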

Figure 8.10 shows a 16 QAM constellation and its minimal expansion. The original points are shown bold; the 16 preferred points to be used for PAR reduction are shown full size; 32 of the rest of the minimally expanded set that could be used but at the expense of greater power increase are shown in smaller type; the corner 16 that should never be used because the sine and cosine subcarriers cannot have simultaneous maxima are shown in very small type.

This algorithm may require 10 or more iterations (using a different subcarrier each time) to get down to a PAR of the order of 10 dB, and Tellado reports¹⁷ that these 10 appear to require approximately the same amount of computation as one IFFT. There is, however, little actual computation (one vector of sines or cosines per iteration) but a lot of searching and “random logic”; the algorithm will probably have to be optimized for each different DSP system.

Unfinished Business: Combining Clip Filtering (3) and Constellation Expansion (7). The main problem with the clip filtering algorithm is that the occasional large clips have enough energy that they may cause problems even when filtered. It would seem that application of just a few iterations of the constellation expansion algorithm to eliminate the peaks beyond about 5 before filtering would be very useful.

NOTE: PAR reduction is a hot subject while I am writing this book, and everybody may have agreed on a method (or at least those parts of the method that have to be standardized) by the time the book appears. If so, I hope that the above discussion provides useful background to help in understanding whatever is decided.

¹⁷ Private conversation.

Figure 8.10 Sixteen- and minimally expanded 32-pt constellations.

8.2.12 Digital-to-Analog Converter

As discussed in Section 5.5, the high PAR of a multicarrier signal puts a lot of strain on many components in an MCM system, the digital-to-analog converter (DAC) among them. The conventional calculation of the number of bits required in the DAC goes as follows: For an M-bit DAC that accommodates a peak signal voltage¹⁸ of k (relative to a signal of unit power), the signal power/quantizing noise ratio, SQR, is given by

$$\text{SQR} = \frac{12 \times 2^{2(M-1)}}{k^2} \qquad (8.14)$$

¹⁸ This is often called the PAR of the DAC, but we must be aware of the subtle shift of meaning here; it is now the PAR of the output of the DAC, not of the MCM signal driving it; some clipping (either digitally explicit before the DAC or inherent in the DAC’s input circuitry) may have already occurred.

This noise is white, so its effect will be most serious on those subcarriers that are loaded with the most bits. For a subcarrier carrying b bits, the calculation proceeds as follows:

* The SNR required for a 10^−7 error rate is (ignoring margin and coding gain) approximately (8 + 3b) dB.
* Assume that the DAC quantizing noise is allowed just 0.1 dB out of the noise budget.
* Therefore, SQR must be 16 dB greater than the required SNR.
* Each quantizing step, δ, is given by

$$\delta = \frac{k}{2^{M-1}} \qquad (8.15)$$

* Therefore,

$$\frac{12 \times 2^{2(M-1)}}{k^2} > 10^{(8+3b+16)/10} \qquad (8.16)$$

which for a reasonable maximum b = 12 and a typical k = 5 (assuming that no PAR reduction has been performed) means that

$$M > 11.5 \qquad (8.17)$$

* Rounding M to 12 bits allows k to increase to 8.0 or b to 13.
* Increasing b to the maximum of 15 defined by T1.413 increases M to 13 bits.
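The bound on M can be reproduced by solving (8.16) for M. The function below is a small sketch of that arithmetic; its name and the `noise_backoff_db` parameter (the 16 dB by which the SQR must exceed the required SNR) are labels chosen here for illustration.

```python
import math

def min_dac_bits(b, k, noise_backoff_db=16.0):
    """Real-valued lower bound on M from (8.16) for b bits and peak voltage k."""
    required_sqr_db = 8.0 + 3.0 * b + noise_backoff_db
    required_sqr = 10.0 ** (required_sqr_db / 10.0)
    # 12 * 2**(2 * (M - 1)) / k**2 > required_sqr, solved for M
    return 1.0 + 0.5 * math.log2(required_sqr * k ** 2 / 12.0)

print(round(min_dac_bits(12, 5), 1))     # b = 12, k = 5, as in (8.17)
print(math.ceil(min_dac_bits(15, 5)))    # b = 15, the T1.413 maximum
```

Each additional bit of loading adds 3 dB to the required SQR and therefore half a bit to M, which is why going from b = 12 to b = 15 costs about one and a half DAC bits.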

Reducing the Number of DAC Bits Required. The above calculation of the number of DAC bits is very conservative. Four methods of reducing the number have been described.

1. PAR reduction (from the k = 5 assumed here to about k = 3, thereby saving 3/4 of a bit)
2. Decremented oversampling [Flowers et al., 1998], to spread the quantizing noise over a wider band
3. Run-sum filtering to reduce the quantizing noise at low frequencies, where the SNR is highest
4. Predistortion to reduce the signal at high frequencies, where the SNR is lowest

NOTE: All of these methods would be as applicable to SCM as to MCM, but they are particularly useful for MCM because of its more stringent conversion requirements.

PAR reduction has been discussed in Section 8.2.11; the others are discussed below.
Decremented Oversampling (DOS). For some DAC technologies it may be advantageous to increase the sampling frequency above that required by the transmitter (i.e., to oversample) in order to reduce the number of bits required. Oversampling is particularly appropriate for the upstream ADSL channel. The minimum sampling rate is 276 kHz, but digital oversampling by a factor of 8 (up to 2.208 MHz, the downstream sampling rate) may be convenient. If the same amount of quantizing noise were then spread over eight times the bandwidth, its level would be reduced by 9 dB, thereby saving 1½ bits.

The simplest method of oversampling is to replicate the samples seven times. This would result in a zero of the output spectrum at 276 kHz with a “sin x/x” roll-off of 3.92 dB at the 138-kHz band edge. Such replication would, however, also replicate the quantizing noise and leave most of it at low frequencies. In order to spread the quantizing noise over the whole wider band, it must be decorrelated from one oversample to the next. One (and maybe the simplest) way to do this is to decrement each successive repeated sample by a factor (1 − Δ). For M = 12 and k = 5 as in the previous example, each quantizing step is 5 × 2^−11, so Δ = 2^−8 allows for a very simple decrementing operation (shift right by eight and subtract) and ensures that each successive sample will move into another quantizing bracket.
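The figures quoted for oversampling (9 dB, 1½ bits, and the 3.92-dB roll-off) can be checked with a few lines of arithmetic, and the decrementing operation can be sketched alongside. The helper `decremented_repeats` reflects one reading of the description, in which each successive copy is the previous one scaled by (1 − Δ); it works on floating-point values rather than the fixed-point shift-and-subtract a real transmitter would use.

```python
import math

OVERSAMPLE = 8
F_BASE = 276e3     # minimum upstream sampling rate, Hz
F_EDGE = 138e3     # upstream band edge, Hz

# Spreading a fixed noise power over 8x the bandwidth lowers its density.
noise_gain_db = 10.0 * math.log10(OVERSAMPLE)
bits_saved = noise_gain_db / (20.0 * math.log10(2.0))   # about 6.02 dB per bit

# Holding each sample for 8 output periods gives a sin(x)/x response
# whose first zero falls at F_BASE.
x = math.pi * F_EDGE / F_BASE
droop_db = -20.0 * math.log10(math.sin(x) / x)

def decremented_repeats(sample, delta=2.0 ** -8, repeats=OVERSAMPLE):
    """Repeat a sample, scaling each successive copy by (1 - delta)."""
    out = [sample]
    for _ in range(repeats - 1):
        out.append(out[-1] * (1.0 - delta))
    return out

print(noise_gain_db, bits_saved, droop_db)
```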

Run-Sum Filtering (RSF). For the great majority of loops, signals, and interferers, the SNR decreases almost monotonically with frequency; as we have seen, the low frequencies often support 12 or more bits, and the channel is truncated when it can no longer support one. Clearly, quantizing noise is more harmful at low frequencies than at high frequencies, and ideally it should be high-pass filtered.

A very crude filter can be implemented by maintaining a running sum of the quantizing errors and shifting the quantized output by one step whenever the running sum exceeds half a step; that is:

* [x] = the quantization of x to the nearest integer number of steps, n × δ (where δ is the quantizing step)
* e = x − [x]
* rsum = rsum + e
* If rsum > δ/2, then [x] = (n + 1)δ and rsum = rsum − δ
* If rsum < −δ/2, then [x] = (n − 1)δ and rsum = rsum + δ
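The run-sum rules above translate almost line for line into code. The following is a minimal sketch with floating-point samples; the function name is hypothetical, and a hardware implementation would of course use fixed-point arithmetic.

```python
import math

def run_sum_quantize(samples, step):
    """Quantize samples to multiples of step, nudging the output by one step
    whenever the running sum of quantizing errors exceeds half a step."""
    rsum = 0.0
    out = []
    for x in samples:
        n = math.floor(x / step + 0.5)   # nearest integer number of steps
        rsum += x - n * step             # accumulate the quantizing error e
        if rsum > step / 2:
            n += 1
            rsum -= step
        elif rsum < -step / 2:
            n -= 1
            rsum += step
        out.append(n * step)
    return out
```

Because every nudge of the output is matched by an equal and opposite change in the running sum, the accumulated error between input and output never exceeds half a step, which is the sense in which the algorithm removes the dc component of the quantizing noise.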

This algorithm was simulated for the 12-bit DAC with the range of k = 5 that was recommended by (8.17). The conventional quantizing noise was, as would be expected, constant at −63 dB relative to the signal; the filtered quantizing noise was less than this over approximately the lower one-third of the band and 6 dB greater at the top of the band (subcarrier 255).

The algorithm zeros the dc component of the quantizing error, and it appears that the operation is equivalent to a first-order (single real pole) high-pass filter. The transfer function of this filter was found from the best match to the simulated results to be

$$H_{\text{rs}} = \frac{2\sqrt{2}\,jf}{jf + f_N} \qquad (8.18)$$

where f_N is the Nyquist frequency (= f_samp/2). The model can be justified theoretically, with the benefit of a lot of hindsight, by the following, not very rigorous, argument.

* The cutoff frequency is determined by the average rate at which the running sum approaches the reset thresholds of ±δ/2, where δ is the quantizing step.
* Since the quantizing noise is uniformly distributed over the interval −δ/2 to +δ/2, it would appear that the average number of samples to reach a threshold is two, resulting in a “cutoff” at half the sampling frequency.
* The occasional adjustment of the quantized value has the effect of doubling the total quantizing power; the infinite-frequency gain of 2√2 in (8.18) is needed to ensure this.
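As a consistency check on (8.18), the sketch below evaluates the power gain of a first-order high-pass with a gain of 2√2 and a pole at the Nyquist frequency. The function and variable names are illustrative; frequencies are normalized so that f_N = 1.

```python
import math

def rsf_power_gain(f, f_nyq):
    """|H_rs|^2 for a first-order high-pass with gain 2*sqrt(2) and pole at f_nyq."""
    return 8.0 * f ** 2 / (f ** 2 + f_nyq ** 2)

f_nyq = 1.0
# Noise shaping at the top of the band, in dB relative to unfiltered noise.
top_of_band_db = 10.0 * math.log10(rsf_power_gain(f_nyq, f_nyq))
# Frequency below which the shaped noise is lower than the unfiltered noise.
crossover = f_nyq / math.sqrt(7.0)
# Mean of |H_rs|^2 over 0..f_nyq, from the closed-form integral.
total_power_factor = 8.0 * (1.0 - math.pi / 4.0)

print(top_of_band_db, crossover, total_power_factor)
```

The model gives 6 dB of excess noise at the top of the band and a crossover a little above one-third of the band, in line with the simulation results quoted earlier; the total noise power comes out at roughly 1.7 times the unfiltered value, so "doubling" is approximate.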

The use of this RSF algorithm for a TDD VDSL system¹⁹ that experiences only FEXT is shown in Figure 8.11, which plots the SNR and the SQRs for an 11-bit conventional DAC and a 9-bit DAC using RSF. As a check on the model, it also superimposes a few points for a 9-bit DAC filtered by (8.18). It can be seen that the model fits the simulated performance very well, and (more important) the filtered SQR approximately parallels the SNR.

The effects of each of the DACs on the performance of the system (i.e., their share of the noise budgets) can be quantified as the logarithmic ratios of external noise plus quantizing noise to external noise alone:

$$\text{dB}_{\text{loss}} = 10 \log_{10}\left(1 + \frac{\text{SNR}}{\text{SQR}}\right) \qquad (8.19)$$

These are plotted in Figure 8.12. As would be expected, the conventional DAC is better at high frequencies, but both are unimportant there anyway; the run-sum-filtered DAC (with two fewer bits) is much better at low frequencies, where both are important.
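Equation (8.19) is easy to evaluate when the SNR and SQR are given in decibels. The small function below is a sketch of that calculation; its name and argument convention are choices made here.

```python
import math

def db_loss(snr_db, sqr_db):
    """Loss of (8.19) in dB, with SNR and SQR both given in dB."""
    return 10.0 * math.log10(1.0 + 10.0 ** ((snr_db - sqr_db) / 10.0))

print(db_loss(44.0, 60.0))   # SQR 16 dB above the SNR
print(db_loss(50.0, 50.0))   # quantizing noise equal to external noise
```

An SQR that is 16 dB above the SNR costs about 0.1 dB, which is the noise-budget allowance assumed in the derivation of (8.16); equal SNR and SQR cost 3 dB.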

Predistortion (Unfinished Business). The calculation of M in (8.17) is based on b = 12, which is the maximum loading that will be applied. An SQR of (8 + 3 × 12 + 16) = 60 dB is needed only at low frequencies, so the power of the full signal, which controls the amount of quantizing noise, can be reduced by digitally de-emphasizing the high frequencies. This de-emphasis, which is the complement of the run-sum (high-pass) filtering of the noise, could be done with a simple third-order digital FIR filter, and the output spectrum could be flattened with a three-pole analog filter. It might be possible to use the transformer and dc-blocking capacitor to perform this flattening. An even more interesting idea, which would be appropriate for the upstream transmitter, is to combine this with the pre-equalizer discussed in Section 8.3.4; that is, use the g_I values to shape the signal, and then both flatten the output PSD and pre-equalize the upstream band with the high-pass filter.

¹⁹ This chapter is about ADSL, but it seems appropriate to cite some VDSL results here.

Figure 8.11 SNR and SQR for VDSL system with FEXT.

Combination of the Algorithms. If the DOS and RSF algorithms are combined (DOS first) the quantization noise is reduced and much of it is swept up into the unused higher part of the band. For an 8× oversampled ADSL upstream signal, a 9-bit DAC with the DOS/RSF algorithm is better than a 12-bit conventional DAC over the lower part of the band and worse over the upper part. The relative dB_loss values [as defined by (8.19)] incurred will depend on the shape of the SNR curve, which in turn depends on the type of crosstalk (HDSL NEXT? EC or FDD ADSL?), but a saving of 3 DAC bits (approximately 1½ bits each from the two algorithms) relative to a conventional system can often be achieved. Run-sum filtering and pre-emphasis are probably alternatives that should not be combined; they both strive to shape the SQNR to match the expected SNR of the loop.

Figure 8.12 Decibel loss of conventional and run-sum filtered DAC.

8.2.13 Line Drivers

The main problem for xDSL line drivers is the required voltage swing. Even if the PAR can be reduced to 10 dB, for an average downstream output power of 100 mW (20 dBm) on a 100-Ω line, as defined in T1.413, the peak line voltage, V_max, is given by

$$\frac{V_{\max}^2}{R} = 1\ \text{W} \qquad (8.20)$$

whence

$$V_{\max} = 10.0\ \text{V} \qquad (8.21)$$

With a matched (100-Ω) source, the driver voltage must be twice this²⁰; 20 V is very difficult to get out of an integrated circuit! Similar calculations for upstream (−38 dBm/Hz across approximately 100 kHz = 12 dBm = 16 mW) would call for a driver voltage of 8.0 V. One solution to this problem is to use two push/pull amplifiers with “inherent impedances” (ratio of voltage drive to current drive capabilities) of about 30 Ω to drive the (balanced) primary of a 1:3 transformer; this is discussed in more detail in the next section.
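The voltage figures can be reproduced from the average power, the PAR, and the line impedance. The sketch below assumes a 100-Ω line and a 20-dBm downstream power; both values are inferred here from the 1 W peak power of (8.20) and the 10.0 V of (8.21) rather than quoted from T1.413, and the function name is hypothetical.

```python
import math

def peak_line_voltage(avg_power_dbm, par_db, r_ohms=100.0):
    """Peak line voltage for a given average power, PAR, and line impedance."""
    peak_power_w = 10.0 ** ((avg_power_dbm + par_db) / 10.0) / 1000.0
    return math.sqrt(peak_power_w * r_ohms)

v_down = peak_line_voltage(20.0, 10.0)   # downstream: 20 dBm average, 10 dB PAR
v_up = peak_line_voltage(12.0, 10.0)     # upstream: 12 dBm average, 10 dB PAR

# A matched source drops half the driver voltage across its own resistance.
print(2.0 * v_down, 2.0 * v_up)
```

The upstream result is slightly under 8.0 V because 12 dBm is 15.8 mW rather than the rounded 16 mW.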

²⁰ Methods have been described for incorporating the line input impedance in the feedback of an amplifier and thus saving the voltage and power that are “wasted” in the driving resistance, but it is very difficult to incorporate these circuits into the 4W/2W hybrids and/or echo canceler circuits.

8.3 FOUR-WIRE/TWO-WIRE CONVERSION AND TRANSMIT/RECEIVE SEPARATION
8.3.1 Line-Coupling Transformer
The traditional line-coupling transformer is a three-port device that performs three functions:

1. Common-mode isolation (lightning protection).
2. Unbalanced-to-balanced conversion; all internal operations are performed unbalanced (i.e., referenced to ground).
3. Four wire-to-two wire conversion with (partial) separation of transmit and receive signals; if the hybrid impedance equals the input impedance of the line (seen through the transformer), there will be infinite loss from transmitter to receiver.

For xDSL these functions must be modified:

* As discussed in the preceding section, there are advantages to using balanced line drivers, so function 2 is not needed.
* The inherent impedance of amplifiers is much less than 100 Ω, so we now need an impedance transformation.
* The shunt inductance of the transformer can perform part of the high-pass filter function needed for the POTS splitter.

One possible configuration is shown in Figure 8.13.
8.3.2 4W/2W Hybrid

The first means of separation of transmit and receive signals is the 4W/2W hybrid. The maximum amount of separation that can be achieved is the return loss (RL) between the input impedance of the line and the reference impedance (perhaps hypothetical²¹) of the hybrid, and this RL must be taken into account when designing the FDD separation filters (see Section 8.3.4). For loops without bridge taps the input impedance is approximately equal to the characteristic impedance, whose variation with frequency can be well modeled by an RRC impedance. Figure 8.14 shows an impedance that is a compromise between 24 and 26 AWG loops, and Figure 8.15 shows its RL against the input impedance of two “basic” loops defined in T1.413 and G.995: CSA 6 (9 kft of 26 AWG) and CSA 8 (12 kft of 24 AWG). These are best cases, which maintain an average RL of about 28 dB across the band.

²¹ The 4W/2W network may not contain a reference impedance per se; it may include a transfer function, which attempts to model the reflection coefficient, and a subtraction circuit. The theoretical performance limit would be the same.

Figure 8.13 Balanced line drivers and coupling transformer.
Figure 8.14 Compromise RRC model of loop input impedance.

Bridge taps near the end of a loop greatly reduce the RL at that end²² around the “notch” frequency, which is a function of the length of the bridge tap (see Section 3.5.2). Figure 8.15 also shows the return loss of CSA 7 (another one of the test loops), which has a bridge tap right at the end; the minimum return loss is about 4 dB! A conservative design approach, at the RT at least, is therefore to assume that a bridge tap can be of almost any length and at any distance from the RT. Analysis of many different loops using the program in Appendix A suggests that a worst-case RL_RT of 5 dB should be assumed across the entire band.

The situation at the CO is more controversial. Bridge taps in the feeder cable are certainly less common, but according to [AT&T, 1982] they do occur, and the test suite defined in T1.413 includes one such loop. Therefore, the conservative approach is to assume a worst-case RL_CO of 5 dB also.

²² They are almost invisible from the other end.

Figure 8.15 Return loss of RRC impedance against CSA loops 6 and 8.

An xDSL unit, of course, sees the loop through the second-order high-pass filter, and the RL value will be very low over a significant frequency range (far beyond the 30 kHz cutoff frequency). The RL could be improved either by incorporating an average L and C into the hybrid reference impedance, or by passing the transmit signal through an equivalent third-order active-RC transfer function and subtracting it from the reflected signal. Because of the problems with bridge taps, however, it is better (for FDD systems at least) to design the filters for worst-case RL values and live with reflections from the high-pass filter.

NOTE: It has been suggested that ADSL modems designed specifically for countries that do not have bridge taps in their cables might rely on the higher RLs and have simpler filters. To take advantage of the greater certainty about the input impedance of the loop, however, these modems would have to compensate for reflections through the high-pass filter, which is probably just as difficult as providing the extra filtering.

Adaptive Hybrid. The trans hybrid loss (THL) can be improved adaptively either by adjusting the balance impedance in the hybrid or by passing the transmit signal through a separate echo-emulating path and subtracting the result from the reflected signal. The latter approach seems to be preferred, but application of the method at xDSL frequencies has previously been hampered by the difficulty of implementing highly linear, controllable analog components (resistors, capacitors, multipliers, transconductance amplifiers, etc.). Recent work, reported in [Pécourt et al., 1999], appears to have solved this problem, however, and THLs >25 dB have been achieved even with the most demanding bridge taps. The next step must be the development of on-line algorithms for calculating the parameters of the emulating path.

Tuned Adaptive Hybrid. The attenuation from transmitter to receiver is provided by filter plus hybrid plus filter, where each “filter” may be the combination of analog, digital, and (I)FFT. The combined filters typically have the least attenuation around the crossover frequency, so that if the error measure used for adaptation of the hybrid is equally weighted at all frequencies, the total attenuation will have a minimum in that region. A better strategy is to weight the adaptation error more heavily in the crossover region so that the hybrid is partially “tuned.” This would allow system management to partly control the crossover frequency (perhaps between 100 and 150 kHz) in response to a combination of loop lengths, traffic needs, and binder-group management (see Section 4.6.5).

NOTE: The crossover frequency must be the same for all the pairs in the binder group to avoid kindred NEXT.
8.3.3 Echo Canceler?

In Section 4.2.1 we distinguished between EC as a duplexing strategy (that is, using band 1 for both downstream and upstream) and EC as an implementation tactic. We now need to consider both for ADSL.

EC as a Duplexing Strategy. T1.413 and G.992.1 state that EC is optional for ADSL, but my conclusion in Section 4.4 was that for most systems it is obsolescent. Now I will go so far as to say that allowing it in G.992 was a mistake. This does not, however, mean that ATUs with echo cancelers are obsolete. The ATU-Cs can simply turn off band 1, and both ATU-C and ATU-R can use the cancelers to assist in band separation. Because EC for ADSL is very complicated in both design and implementation,²³ but has nevertheless been well covered elsewhere (see the specialized bibliography at the end of the references), I will not discuss it here.

EC as an Implementational Tactic. Most designers of DMT transceivers have long realized that DMT does not like being bandlimited, and FDD filters may be a big (perhaps the biggest) contributor to distortion.²⁴ Therefore, it was thought that the sidelobes should be removed by a simplified EC instead of filters. Methods of implementing these are proprietary, but it is clear that at the very least they must use a precanceler to protect the ADC. Prudently designed filters would probably do almost as well and would be much simpler.

²³ Usually comprising three cancelers: a precanceler to protect the ADC, a time-domain canceler to remove the noncyclic part of the echo, and a frequency-domain canceler.

8.3.4 FDD Filters

T1.413 specifies the out-of-band PSDs for both transmitters, but it does not specify the separating filters needed for FDD operation; they were left as vendor discretionary because their primary purpose is to protect the “near-end” receiver. The lower end of the downstream band as specified in Figure 25 of T1.413 is assumed to be that appropriate for an EC system, and extends down to 26 kHz; the specification is much looser than that needed for an FDD system. Only for the upper end of the upstream band do the two requirements meet, and then the roll-off of the ATU-R PSD, as defined in Figure 29 of T1.413 to limit XT into other xDSL systems, is comparable to that needed for FDD.

In an ATU-C transmitter some separation is achieved merely by turning off subcarriers 1 to 35; the lower sidelobes of the used subcarriers (36 through 255) are attenuated by the IFFT. In an ATU-R receiver the reflection of the subcarriers transmitted in the low band is attenuated by the lower sidelobes of the FFT. Similar effects can be achieved in the upstream direction by performing a double-size (i.e., 128-pt) IFFT and FFT. This requires extra computation, but it provides distortion-free filtering. It is important to note that if this is not done in the ATU-R transmitter, the filtering needed to meet the upstream PSD mask (regardless of any calculations about interference with the receive signal) is very sharp.

Figure 8.16 shows the PSDs of received signals and unavoidable noise (10 HDSL and 10 ADSL crosstalkers) for CSA loop 8, assuming that as suggested in Table 8.1, subcarriers 29 through 35 are sacrificed to a guard band. Transmitter leakage into the receiver is a major impairment in ADSL systems, but filtering (particularly analog) consumes power and distorts the signal, so it is advisable to allow leakage to be as big as possible and use as much as 1.0 dB of the noise budget. This means that leakage can be about 6 dB below the unavoidable noise.

The design of filters for FDD DMT is more complicated than for SCM because the reflected transmit signal must be considered in each separate subchannel rather than as an aggregate across the band. If the design is done carefully and prudently, however, the resulting filters should be less complex than those required for SCM because of the filtering inherent in the IFFT and FFT [smoothed as shown in Figure 6.2 and modeled by (6.18)].

It is best to consider the two filters at an ATU together and to design them by iterative modeling (using poles and zeros immediately rather than templates such as Butterworth or elliptic²⁵) and analysis. The power spectrum of the signal delivered to the FFT can easily be calculated from the tandem connection of the IFFT, transmit filter, and receive filter:

$$S_{\text{ITR}}(f) = S_I(f)\,S_T(f)\,S_R(f) \qquad (8.22)$$

or more conveniently, as a function of the tone number, n:

$$S_{\text{ITR}}(n) = S_I(n)\,S_T(n\,\Delta f)\,S_R(n\,\Delta f) \qquad (8.23)$$

where, for convenience, S is written for |F|². Then the interfering transmit signal appearing at the output of the FFT in subchannel m is given by the convolution

$$S_{\text{ITRF}}(m) = \sum_{n} S_{\text{ITR}}(n)\,S_F(n - m) \qquad (8.24)$$

and the independent parameters of each filter (preferably as few as possible²⁶) can be found by iteration.

²⁴ The only explicit statement of this that I have seen, however, is in [Saltzberg, 1998].
²⁵ The poles and zeros of one of these filters could be used as a starting point for the iteration.
²⁶ A filter with a maximally-flat passband is ideal because it can be fully defined by its transmission zeros.

Figure 8.16 Received signals and “noise” (crosstalk) for CSA 8.
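The convolution of (8.24) is simple to evaluate once the per-tone leakage spectrum and the FFT sidelobe response are available. The sketch below shows the bookkeeping only; `toy_sidelobes` is a hypothetical stand-in for the FFT response and is not the model of (6.18), and the function and argument names are chosen here for illustration.

```python
def leakage_at_fft_output(s_itr, s_f, tones):
    """Evaluate (8.24): S_ITRF(m) = sum over n of S_ITR(n) * S_F(n - m).

    s_itr: dict mapping tone number n to S_ITR(n) in linear power units
    s_f:   function of the tone offset (n - m) giving the FFT sidelobe power
    tones: iterable of receive subchannels m to evaluate
    """
    return {m: sum(p * s_f(n - m) for n, p in s_itr.items()) for m in tones}

def toy_sidelobes(offset):
    """Hypothetical sidelobe response falling as 1/offset^2 (illustration only)."""
    return 1.0 if offset == 0 else 1.0 / (offset * offset)

# Two leaking transmit tones, evaluated in-band and at a nearby receive tone.
leak = leakage_at_fft_output({30: 1.0, 31: 0.5}, toy_sidelobes, [30, 36])
print(leak)
```

In an iterative filter design, S_ITR(n) would be recomputed from (8.23) for each trial set of poles and zeros and the result compared, tone by tone, with the unavoidable noise.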
Figure 8.17 Total transmit leakage compared to noise for CSA 8.

The S_ITRF value of a pair of CO filters (inverse Chebyshev with 35 dB of stopband rejection) designed this way is shown in Figure 8.17 together with the unavoidable noise for CSA 8; the goal of a 6-dB “margin” is achieved. The RT filters are a little more problematic; the receive filter would be considerably simplified if the cyclic prefix were shaped in the receiver according to method 2 of Section 6.4. Design of the filters should be postponed until it has been decided whether the extra distortion associated with the shaping can be tolerated.

8.4 RECEIVER

NOTE: Some parts of the discussion in this section may seem rather vague, but at this early stage of the technology many of the details of receivers are proprietary,²⁷ and I can say no more.

Much of the receiver (the analog-to-digital converter (ADC), the FFT, the Viterbi trellis decoder, the de-interleaver, the Reed-Solomon decoder, and the descrambler) is the mirror image of a transmitter, and furthermore, much of it is not unique to either DMT or xDSL; nevertheless, for the sake of consistency and continuity, these components are each given a (sometimes very short) section of their own.

²⁷ Receivers are not defined in any standard; they have only to demodulate and decode a defined transmit signal.

8.4.1 Analog Equalizer?

In many modems where adaptive equalization is needed, a pre-equalizer (fixed or switchable in very coarse steps) is used to reduce (1) the variation of attenuation with frequency, (2) the spread of eigenvalues of the signal input to the adaptive equalizer, and (3) the convergence time of that equalizer; this pre-equalizer is frequently analog. We have to consider whether such an equalizer should be used for xDSL. The conditions and conclusions are different in the ATU-C and ATU-R, so we will consider them separately.

ATU-R. The PSD of a downstream ADSL receive signal typically decreases rapidly and, ignoring dips due to bridge taps, almost monotonically with frequency. The average xDSL noise, on the other hand, is approximately white (NEXT increases with frequency, FEXT decreases with frequency), and the quantizing noise out of most analog-to-digital converters (ADCs) is also white. The optimum conditions for an ADC, with the SQR greater than the SNR by a constant amount at all frequencies, are thus achieved. If, however, the signal plus noise is analog equalized, the SQR at low frequencies, where the SNR and bit loadings are highest, will be much reduced. In extreme cases the ADC would need an extra two bits to achieve the same performance. The conclusion is that there should be no analog amplitude equalization²⁸ in the ATU-R.
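
The size of this SQR penalty can be estimated with a small Python sketch. It assumes white quantizing noise, an ADC whose total input power is held fixed, and an equalizer that flattens the received PSD completely; the 40-dB linear slope across 200 tones is a hypothetical profile, not measured loop data, and the conversion of about 6.02 dB per bit is the usual ideal-quantizer figure.

```python
import math


def db_to_linear(db):
    """Convert a power ratio in dB to linear units."""
    return 10.0 ** (db / 10.0)


def low_freq_sqr_loss_db(psd_db):
    """SQR lost at the strongest tone when the PSD is flattened.

    Unequalized, the strongest tone has power max(P) against a white
    quantizing-noise floor. After flattening with the same total power,
    every tone has power mean(P) against the same floor, so the loss at
    the strongest tone is max(P) / mean(P).
    """
    linear = [db_to_linear(x) for x in psd_db]
    mean_power = sum(linear) / len(linear)
    return 10.0 * math.log10(max(linear) / mean_power)


# Hypothetical received PSD: falls linearly by 40 dB across 200 tones.
TONES = 200
psd_db = [-40.0 * i / (TONES - 1) for i in range(TONES)]

loss_db = low_freq_sqr_loss_db(psd_db)
extra_bits = loss_db / (20.0 * math.log10(2.0))   # about 6.02 dB per bit
```

For this assumed profile the penalty comes out somewhat below two bits, which is in line with the "extreme cases" figure quoted above; a steeper PSD would cost more.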

ATU-C. The variation in received level across the 30 to 110 kHz received band is much less than at the ATU-R, so the considerations about SQR in the ADC are less important. On the other hand, the variation per Hertz and the resultant distortion are greater, so it might be useful to ease the task of the TEQ (Section 8.4.4) by some pre-equalization. This could be done by raising the cutoff frequency of the high-pass filter to about 100 kHz. This would have two other small benefits:

* The inductance of the transformer and the resulting distortion due to the dc current would be reduced.
* The duration of the impulse response of the total channel would be reduced.

NOTE: The slope in the high-pass response would affect the upstream transmit signal also and would have to be compensated for digitally by the g_i values (see Section 5.3).

²⁸ There might be some value in an all-pass delay equalizer; I know of no discussion of this.

8.4.2 Analog-to-Digital Converter

Some PAR reduction techniques, particularly methods 6 and 7 of Section 8.2.11, may be slightly reversed by the filtering performed by the line and/or the 4W/2W hybrid, but the PAR of a receive signal is probably not much different from that of a transmit signal. There are two conflicting factors operating here; compared with the DAC in the transmitter:

* The peak voltage-handling capability of the receiver front-end circuitry is not as important as that of the transmitter because the voltages and consumed power are much lower.
* On the other hand, bits are more expensive in an ADC than in a DAC.

A compromise decision would be to set the PAR of the ADC 1 or 2 dB higher than that of the DAC.

The high-frequency roll-off of the channel, which greatly reduces the total power of the receive signal, performs the same function for the ADC as predistortion does for the DAC. Consequently, the requirement for 11.5 conversion bits calculated in (8.17) can be much reduced; 10 bits would be ample; 9 would probably suffice; 8 would be pushing it!
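
The trade between conversion bits, PAR, and SQR can be made concrete with the textbook relation for an ideal uniform quantizer whose range just spans the signal peak: SQR ≈ 6.02b + 4.77 − PAR (all in dB). The Python sketch below uses that relation; it is offered as an assumption for illustration and is not necessarily the calculation behind (8.17). The 15-dB and 17-dB PAR values are hypothetical.

```python
import math

DB_PER_BIT = 20.0 * math.log10(2.0)          # about 6.02 dB per bit
UNIFORM_OFFSET_DB = 10.0 * math.log10(3.0)   # about 4.77 dB


def sqr_db(bits, par_db):
    """SQR of an ideal uniform quantizer with full scale set at the peak.

    par_db is the peak-to-rms ratio of the input signal in dB.
    """
    return DB_PER_BIT * bits + UNIFORM_OFFSET_DB - par_db


def bits_for_sqr(target_sqr_db, par_db):
    """Number of bits (possibly fractional) needed for a target SQR."""
    return (target_sqr_db - UNIFORM_OFFSET_DB + par_db) / DB_PER_BIT


# Hypothetical numbers: a 10-bit ADC with the clip level set for 15 dB PAR.
reference_sqr = sqr_db(10, 15.0)

# Cost, in bits, of raising the clip level by 2 dB at constant SQR.
cost_of_2db_par = bits_for_sqr(reference_sqr, 17.0) - bits_for_sqr(reference_sqr, 15.0)
```

Under this model, setting the PAR of the ADC 1 or 2 dB higher costs only a fraction of a bit, whereas each whole bit removed gives up about 6 dB of SQR.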

8.4.3 Timing Recovery and Loop Timing

It is important to note that in a DMT xDSL transmitter the clock and subcarrier frequencies are related by integers, and because there can be no frequency shift in the channel, they are similarly related in the receiver. Therefore, recovery of the “sampling clock” in the receiver is equivalent to recovery of the “carriers.” Furthermore, the only offset that the recovery circuitry has to deal with is that caused by the mismatch of two crystal oscillators: typically, less than ±100 ppm.

There is certainly sufficient information contained within any randomly modulated MCM signal to allow recovery of the sampling clock, but early ADSL systems took the easy way out: they accepted a very small loss in data rate and dedicated one unmodulated subcarrier (n_p) for use as a pilot. If this tone is considered to be real, a feedback loop can be constructed to drive its imaginary part to zero.

The imaginary part of the complex output of bin n_p from the FFT is input to a loop filter, which, via a simple DAC, delivers a control voltage to a voltage-controlled crystal oscillator (VCXO). Ideally, the loop filter would calculate the frequency of the sampling clock to be used for conversion and demodulation of the next symbol, but because of the time required to perform the FFT, the new frequency is not available until the following symbol, and an extra factor of z^{-1} must be inserted in the loop. The design of such loops is well covered in the literature (e.g., see [Lindsey, 1972] or [Gardner, 1979]).
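
The behavior of such a loop, including the extra symbol of latency, can be explored with a short noise-free simulation. The Python sketch below is a minimal model under stated assumptions, not a description of any particular modem: the pilot index n_p = 64, the proportional-plus-integral loop filter, the normalized gains `alpha` and `beta`, and the neglect of the cyclic prefix are all choices made here for illustration.

```python
import math


def simulate_pilot_loop(offset, n_symbols, n_p=64, alpha=0.1, beta=0.0025):
    """Track a fractional sampling-clock offset from the pilot's imaginary part.

    offset       free-running fractional frequency error (e.g. 50e-6 = 50 ppm)
    n_p          pilot tone index (assumed value)
    alpha, beta  normalized proportional and integral gains (assumed values)
    Returns the residual fractional frequency error during each symbol.
    """
    gain = 2.0 * math.pi * n_p      # pilot phase drift per symbol per unit offset
    kp, ki = alpha / gain, beta / gain
    phase = 0.0                     # pilot phase error, radians
    integrator = 0.0
    pipeline = [0.0, 0.0]           # corrections not yet applied (extra z^-1)
    residuals = []
    for _ in range(n_symbols):
        applied = pipeline.pop(0)   # correction computed two symbols earlier
        residual = offset - applied
        residuals.append(residual)
        phase += gain * residual    # drift accumulated over this symbol
        error = math.sin(phase)     # imaginary part of a unit-amplitude pilot
        integrator += ki * error
        pipeline.append(kp * error + integrator)
    return residuals


residuals = simulate_pilot_loop(50e-6, 2000)
```

With these assumed gains the loop is roughly critically damped and the residual offset decays to the level of floating-point rounding well within the 2000 simulated symbols; larger gains shorten acquisition but, because of the added delay, eventually make the loop unstable.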

Specification of the Recovered Sampling Clock. The control voltage for the VCXO must be maintained for one symbol period. The permissible error in this voltage (or, rather, in the induced VCXO frequency) can be calculated as follows.

The received signal is the sum of subcarriers (n₁ to n₂) randomly modulated by a(n):
$$S(k) = \sum_{n=n_1}^{n_2} a(n)\,e^{j2\pi nk/N} \tag{8.25}$$
If this is demodulated (FFTed) using an offset sampling frequency of f_samp(1 + Δ)/2, the appropriately scaled output for subcarrier m is
$$Y(m) = \frac{1}{N}\sum_{n=n_1}^{n_2}\sum_{k=0}^{N-1} a(n)\,e^{j2\pi k[n - m(1+\Delta)]/N} \tag{8.26a}$$
$$\approx \frac{1}{N}\sum_{n=n_1}^{n_2}\int_{0}^{N} a(n)\,e^{j2\pi x[n - m(1+\Delta)]/N}\,dx \tag{8.26b}$$
$$= \sum_{n=n_1}^{n_2} \frac{a(n)}{j2\pi[n - m(1+\Delta)]}\Big[e^{j2\pi[n - m(1+\Delta)]x/N}\Big]_{x=0}^{N} \tag{8.26c}$$
$$= a(m)(1 - j\pi m\Delta) + \sum_{n \neq m} a(n)\,\frac{-m\Delta}{n - m} \tag{8.26d}$$

The first term in (8.26d) is what would occur with an SCM system: the desired output and a quadrature distortion term proportional to the shift of its carrier. The second term, in-phase interchannel distortion from all the other subcarriers, occurs only with MCM.
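
The quality of the small-offset approximation can be checked numerically by evaluating the double sum of (8.26a) directly and comparing it with (8.26d). The Python sketch below does this for a hypothetical case: N = 512, eight QPSK-modulated tones (33 to 40), m = 35, and Δ = 10 ppm are all assumed values chosen for illustration.

```python
import cmath
import math


def fft_output_exact(a, m, delta, n_fft):
    """Direct evaluation of (8.26a); a maps tone index n to symbol a(n)."""
    total = 0j
    for n, a_n in a.items():
        beta = n - m * (1.0 + delta)
        total += a_n * sum(
            cmath.exp(2j * math.pi * k * beta / n_fft) for k in range(n_fft)
        )
    return total / n_fft


def fft_output_approx(a, m, delta):
    """First-order result (8.26d): wanted term plus interchannel terms."""
    wanted = a[m] * (1.0 - 1j * math.pi * m * delta)
    others = sum(a_n * (-m * delta) / (n - m) for n, a_n in a.items() if n != m)
    return wanted + others


# Hypothetical unit-power QPSK symbols on tones 33..40.
S = 1.0 / math.sqrt(2.0)
a = {
    33: S * (1 + 1j), 34: S * (1 - 1j), 35: S * (1 + 1j), 36: S * (-1 + 1j),
    37: S * (-1 - 1j), 38: S * (1 + 1j), 39: S * (1 - 1j), 40: S * (-1 + 1j),
}
M, DELTA, N_FFT = 35, 10e-6, 512

exact = fft_output_exact(a, M, DELTA, N_FFT)
approx = fft_output_approx(a, M, DELTA)
distortion = abs(exact - a[M])          # total error on the wanted symbol
approx_error = abs(exact - approx)      # error of (8.26d) itself
```

For this case the two results agree to within a small fraction of the distortion itself. The residual difference comes mainly from replacing the sum over k by an integral in (8.26b), which is least accurate for tones far from m, where the interchannel terms are in any case small.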

If each of the a(n) is considered to have unit power (i.e., E{|a|²} = 1), the total distortion is
