Network Working Group                                         Y(J) Stein
Internet-Draft                                                 I. Druker
Expires: March 2, 2003                           RAD Data Communications
                                                       September 1, 2002

The Effect of Packet Loss on Voice Quality for TDM over Pseudowires
draft-stein-pwe3-tdm-packetloss-00.txt

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on March 2, 2003.

Copyright Notice

Copyright (C) The Internet Society (2002). All Rights Reserved.

Abstract

The effect of packet loss on voice quality has been the subject of detailed study in the VoIP community, but these results are not directly applicable to TDM transport as being studied in the PWE3 WG. This document presents an analysis of packet loss for the TDM over PW case, and demonstrates that packet loss of a few percent can be tolerated. We propose that robustness to packet loss of a few percent be a requirement for any proposed method for transport of TDM over pseudowires.

Stein & Druker           Expires March 2, 2003                  [Page 1]
Internet-Draft           PWE3 TDM Packet Loss             September 2002

Table of Contents

1. Introduction
2. Measures of Voice Quality
3. Packet Loss Replacement Algorithms
4. Experimental Results
5. Discussion
6. Summary
7. References
Authors' Addresses
Full Copyright Statement

1. Introduction

There are several sources of packet loss in PSNs. Routers purposely drop packets when they detect errors in them, or when they need to manage congestion, leading to typical packet loss rates of between 1 and 20 percent. Real-time streams have an additional source of packet loss, namely the rejection of reordered packets at the PE. Non-real-time data communications are not overly affected by packet loss, due to retransmission mechanisms, but real-time constraints usually prohibit retransmission. Packet loss in voice traffic can cause gaps or artifacts that result in choppy, annoying or even unintelligible speech.

The precise effect of packet loss on voice quality, and the development of packet loss concealment algorithms, have been the subject of detailed study in the VoIP community. Their results can be summarized as follows:

1) One percent packet loss causes perceived voice quality to drop from toll-quality to cell-phone quality.

2) Above two percent, packet loss is the dominant cause of voice quality deterioration, compressed and uncompressed speech becoming comparable in quality.

3) Packet length is not a significant factor (at least for lengths typically employed in VoIP).

4) By using appropriate packet loss concealment (PLC) algorithms, five percent packet loss of uncompressed speech can be comparable to or better than cell-phone quality.

Unfortunately, these results are not directly applicable to TDM transport as being studied in the PWE3 WG [TDMoIP,CESoPSN,SONET-VT].
This is because VoIP packets typically contain between 80 samples (10 milliseconds) and 240 samples (30 milliseconds) of the speech signal, while multichannel TDM packets may contain only a single sample, or perhaps a very small number of samples, of each timeslot.

2. Measures of Voice Quality

Perceived voice quality is a psychophysical quantity that depends on the physiology and psychology of the listener. The most universally accepted subjective measure of voice quality is the mean opinion score (MOS) defined by the ITU-T for telephone quality speech in [P.800], and by the ITU-R for higher fidelity audio in [BS.1116-1]. It is found by averaging the reported grades of multiple listeners, each of whom rates the audio on a five point quality scale, with MOS=1 being unintelligible, and MOS=5 meaning excellent quality.

Due to the 4 kHz bandwidth limitation and the logarithmic amplitude characteristics of the 64 kbps DS0 digital channel, telephony voice is rated lower than 5, with 4 to 4.5 being considered "toll-quality". MOS ratings of 3.5 to 4 are considered acceptable to many listeners, and cellular telephone audio is readily accepted at about MOS=3.5 due to the added convenience of the cellular medium. Speech quality lower than MOS=3 is considered acceptable only for special applications, such as encrypted military communications.

The problem with MOS is that, being a subjective measure, it is time consuming and costly to obtain. Objective measures, ones that can be computed by algorithms based on the signal samples, are preferable if they correlate well with the subjective measures. The ITU-T has standardized two such measures for telephony quality speech, namely PSQM [P.861] and PESQ [P.862], while the ITU-R has decided on PEAQ [BS.1387] for higher fidelity radio quality audio.
These objective measures utilize models of the biological auditory system and have been shown to correlate well with subjective measurements of MOS. PSQM was developed for lab comparison of different speech codecs and does not take such factors as delay or packet loss into account. PESQ specifically performs end-to-end speech quality assessment and was therefore chosen for our experiment.

3. Packet Loss Replacement Algorithms

In this section we discuss algorithms for concealing packet loss when it occurs. For concreteness we will assume in the following discussion that packets carry single samples of each TDM timeslot. The extension to multiple samples is relatively straightforward, and turns out not to drastically change the results of the next section.

The simplest approach is to blindly insert a constant value in place of any lost speech samples. Since we can assume that the input signal is zero-mean (i.e. contains no DC component), minimal distortion is attained when this constant is chosen to be zero. This is in fact precisely what happens when a G.711 mu-law codec receives a word containing all-ones, as would be the case if AIS were to be received (unfortunately this is not true for A-law).

A slightly more sophisticated technique is to replace the missing sample with the previous one. This method is somewhat more justifiable in the VoIP case, where the quasistationarity of the speech signal means that the missing buffer is expected to be similar to the previous one. Even in the single sample case it is decidedly better than replacement by zero, due to the typical low-pass quality of speech signals, and to the fact that during intervals with significant high frequency content (e.g. fricatives) the error is less noticeable.
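
As an illustration only, the two simple replacement strategies described above can be sketched as follows. This is a minimal sketch and not the implementation used in the experiment: it assumes that the samples of one timeslot have already been expanded to linear PCM integers, that a per-sample loss flag is available, and that the output starts from zero before any sample has arrived. The function names conceal_zero and conceal_previous are hypothetical.

```python
from typing import List, Sequence


def conceal_zero(samples: Sequence[int], lost: Sequence[bool]) -> List[int]:
    """Replace every lost sample with zero (relies on a zero-mean signal)."""
    return [0 if is_lost else s for s, is_lost in zip(samples, lost)]


def conceal_previous(samples: Sequence[int], lost: Sequence[bool]) -> List[int]:
    """Replace every lost sample with the most recent output sample."""
    out: List[int] = []
    previous = 0  # assumption: output is zero before the first sample arrives
    for s, is_lost in zip(samples, lost):
        value = previous if is_lost else s
        out.append(value)
        previous = value
    return out


if __name__ == "__main__":
    # Hypothetical linear PCM values for one timeslot; the third was lost,
    # so whatever value sits in that position is ignored.
    received = [100, 120, 0, 150, 160]
    lost = [False, False, True, False, False]
    print(conceal_zero(received, lost))
    print(conceal_previous(received, lost))
```

For a slowly varying (low-pass) signal the repeated value lies much closer to the true sample than zero does, which is the reason given above for preferring previous-sample replacement.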
A packet is usually declared lost following the reception of the next packet, hence both the sample prior to the missing one and the one following it are available. This enables us to estimate the missing sample value by interpolation, the simplest type of which is linear interpolation, whereby the missing sample is replaced by the average of the two surrounding values. This serves to conceal the packet loss event. More complex interpolation, such as quadratic interpolation or splines, can be used as well, but for the purposes of this analysis we will restrict ourselves to the linear case.

More sophisticated methods of packet loss concealment are based on model-based prediction. Standardized speech compression algorithms have had integral packet loss concealment methods for some time, and more recently the ITU-T has standardized a packet loss concealment method for uncompressed speech [G.711App1]. For such algorithms to function, previous sample values must be saved in a circular buffer or re-extracted from the system jitter buffer.

For the purposes of the experiment described in Section 4, we need only to estimate the value of a single missing sample, and so relatively simple modeling is sufficient. We used an interpolation model based on second order statistics of the previous 30 samples. Details of this algorithm will be reported elsewhere.

4. Experimental Results

In order to quantify the anecdotal results we have observed in real-world deployments, we have carried out a controlled experiment to measure the effect of packet loss on voice quality. We first describe the methodology we employed.

The speech data was selected from English and American English subsets of the ITU-T P.50 Appendix 1 corpus [P.50App1] and consisted of 16 speakers, eight male and eight female. Each speaker spoke either three or four sentences, for a total of between seven and 15 seconds.
The selected files were filtered to telephony quality using modified IRS filtering and downsampled to 8 kHz.

A uniform random number generator was used to simulate packet loss. Packet loss rates of 0, 0.25, 0.5, 0.75, 1, 2, 3, 4 and 5 percent were tested. In the simulations reported here we disallowed loss of successive packets; bursty packet loss (where the probability of groups of missing samples is much higher than would be expected from the average packet loss rate) was also simulated but is not reported here.

For each file four methods of lost sample replacement were applied, and PESQ software was then used to estimate the MOS rating. A graph depicting the PESQ derived MOS as a function of packet loss for the four lost packet replacement algorithms is available in ps and pdf formats at http://www.dspcsp.com/tdmoip/pl.ps and http://www.dspcsp.com/tdmoip/pl.pdf respectively.

We obtained the following qualitative and quantitative results.

1) For all cases the MOS resulting from the use of zero insertion is less than that obtained by replacing with the previous sample, which in turn is less than that of linear interpolation, which is slightly less than that obtained by statistical interpolation.

2) Unlike the artifacts speech compression methods may produce when subject to buffer loss, packet loss here effectively produces additive white impulse noise. The subjective impression is that of static noise on AM radio stations or crackling on old phonograph records. For a given PESQ score, this type of degradation is more acceptable to listeners than the choppiness or tones common in VoIP.

3) If MOS>4 (full toll quality) is required, then the following packet losses are allowable:

      zero insertion            - 0.05 %
      previous sample           - 0.25 %
      linear interpolation      - 0.75 %
      statistical interpolation - 2 %

4) If MOS>3.75 (barely perceptible quality degradation) is tolerable, then the following packet losses are allowable:

      zero insertion            - 0.1 %
      previous sample           - 0.75 %
      linear interpolation      - 3 %
      statistical interpolation - 6.5 %

5) If MOS>3.5 (cell-phone quality) is sufficient, then the following packet losses are allowable:

      zero insertion            - 0.4 %
      previous sample           - 2 %
      linear interpolation      - 8 %
      statistical interpolation - 14 %

5. Discussion

The most undemanding approach to handling packet loss in TDM over PW is to generate Alarm Indication Signal (AIS) whenever a packet is lost. This results in insertion of constant values, and hence in extremely low tolerance to packet loss. Transport methods that respect frame structure, such as AAL1, employ "frame replay", which increases the perceived voice quality and has the added benefit that CAS signaling integrity is guaranteed.

The linear and statistical interpolation methods can only be employed when the TDM is transported in the PW in a framed and structured fashion, i.e. when the timeslot signal values are readily available for manipulation. This rules out unframed transport and non-byte-oriented transport (including some methods of transporting T1 links). In addition, complex encapsulations that impede the extraction of required samples may hinder the use of these methods.

Assuming a processor with hardware companding that can perform an addition and a shift in a single cycle (e.g. a DSP), linear interpolation requires a single cycle per timeslot per lost sample, or 8000 L instruction cycles per second, where L is the packet loss rate expressed as a fraction (e.g. L = 0.02 for 2% packet loss).
An entire 30 channel E1 link will thus require 0.24 L MIPS, and an entire 24 channel T1 link 0.192 L MIPS. For example, at 2% packet loss an average processing power of 1 MIPS will suffice for 208 E1 trunks or 260 T1 trunks. Even using a processor that requires 10 instructions to process an interpolation, dedicating 1 MIPS will enable fixing 20 E1s or 26 T1s.

The statistical interpolation method requires the computation of the energy and of the single and dual lag autocorrelations, which for a history buffer of N samples involves approximately 3N multiplications and additions. For processors with MAC operations (e.g. a DSP) this translates to 0.024 N L MIPS per timeslot (0.72 N L MIPS per E1 or 0.576 N L MIPS per T1). N must be chosen large enough to capture the signal statistics, but not so large that the statistics would be expected to change significantly in normal speech. Numbers in the range 10 to 100 are reasonable. For example, using N=30 and once again assuming 2% packet loss, the processing load would be 0.432 MIPS per E1 and 0.3456 MIPS per T1. Although statistical interpolation is consistently better than simple linear interpolation, the additional MIPS would probably only be justifiable when the packet loss rate is particularly high.

6. Summary

Packet loss is to be expected in any packet switched network, but does not degrade most data traffic since retransmission mechanisms compensate for it with no ill effects other than a reduction in effective data transfer rate. Unfortunately, real-time traffic such as TDM frequently cannot tolerate the added latency that retransmission incurs.

Conventional TDM networks dedicate highly synchronous circuits to voice calls. Hence there is never packet loss, and even individual bit slips are tightly controlled.
Telephony customers have grown accustomed to telephone service quality, and will not consent to lower quality unless there are other major advantages (e.g. mobility, significantly lower price). Market acceptance of TDM transport over PW will depend on service providers being able to offer SLAs with meaningful voice quality guarantees, while deploying networks with some reasonable amount of packet loss.

We have shown that by using simple packet loss concealment techniques, methods of transporting TDM over PW can function under a few percent packet loss without dramatic degradation of voice quality. Since voice quality is thus not a major obstacle, it is mandatory that the protocols employed not introduce additional impediments to operation at realistic packet loss rates. We therefore propose that robustness to packet loss of a few percent be a requirement for any proposed method for pseudowire transport of TDM.

7. References

[BS.1116-1]  ITU-R Recommendation BS.1116-1 (1994-1997), Methods for the Subjective Assessment of Small Impairments in Audio Systems Including Multichannel Sound

[BS.1387]    ITU-R Recommendation BS.1387 (1998), Method for Objective Measurements of Perceived Audio Quality

[CESoPSN]    draft-vainshtein-cesopsn-03.txt (2002), TDM Circuit Emulation Service over Packet Switched Network (CESoPSN), Alexander ("Sasha") Vainshtein et al, work in progress

[G.711App1]  ITU-T Recommendation G.711 - Appendix I (1999), A high quality low-complexity algorithm for packet loss concealment with G.711

[P.50App1]   ITU-T Recommendation P.50 - Appendix I (1998), Artificial Voices - Test Signals

[P.800]      ITU-T Recommendation P.800 (1996), Methods for Subjective Determination of Transmission Quality

[P.861]      ITU-T Recommendation P.861 (1998), Objective Quality Measurement of Telephone-band (300-3400 Hz) Speech Codecs

[P.862]      ITU-T Recommendation P.862 (2001), Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of narrow-band Telephone Networks and Speech Codecs

[SONET-VT]   draft-ietf-pwe3-sonet-vt-00.txt (2002), TDM Service Specification for Pseudo-Wire Emulation Edge to Edge (PWE3), Prayson Pate et al, work in progress

[TDMoIP]     draft-anavi-tdmoip-04.txt (2002), TDM over IP, Yaakov (Jonathan) Stein et al, work in progress

Authors' Addresses

Yaakov (Jonathan) Stein
RAD Data Communications
24 Raoul Wallenburg St., Bldg C
Tel Aviv 69719
ISRAEL

Phone: +972 3 6455389
EMail: yaakov_s@rad.co.il

Ilya Druker
RAD Data Communications
24 Raoul Wallenburg St., Bldg C
Tel Aviv 69719
ISRAEL

Phone: +972 3 7657061
EMail: ilya_d@rad.co.il

Full Copyright Statement

Copyright (C) The Internet Society (2002). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.