Wideband VoIP

To date, VoIP has been promoted mainly as a way to save money, and that is a legitimate pitch. But VoIP also lays the groundwork for a revolution in voice quality, which until now has been capped by the fixed-rate nature of the PSTN.

Telephones have remained unchanged for so long that most people have no idea what limitations they have been living with. This has begun to change with the recent rise of mass-market wideband technologies such as Skype. Now people everywhere are asking, "Why does Skype sound so darn good?" The answer is that Skype captures double the range of voice frequencies that standard telephones capture.

Standard PSTN, and the overwhelming majority of VoIP telecom codecs, sample at 8 kHz, which can represent audio frequencies only up to about 4 kHz. Skype can sample at 16 kHz, which extends that ceiling to about 8 kHz. The added fidelity in the high frequencies adds substantially to the realism of the reproduction, makes it possible to distinguish between otherwise easily confused sounds such as "s", "f", "c", "e", "d" and so on, and, in a way that is hard to measure, creates much more of a sense of "being there" with the other person. The comparison is analogous to that of AM and FM broadcast fidelity. (For a more complete explanation of the concept of digital sampling, including audio comparison samples, you can refer to here.)
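The reason a higher sample rate captures more of the voice is the Nyquist limit: a sampler running at a given rate can only represent tones up to half that rate. The following is a minimal sketch of that idea, not code from any VoIP product; the function names are made up for illustration. It shows that a 6 kHz tone sampled at 8 kHz produces exactly the same samples as a 2 kHz tone, so the two cannot be told apart, while at 16 kHz they remain distinct.

```python
import math


def nyquist_hz(sample_rate_hz):
    """Highest frequency that a given sample rate can represent."""
    return sample_rate_hz / 2


def sample_tone(freq_hz, sample_rate_hz, count):
    """Return `count` samples of a cosine tone taken at `sample_rate_hz`."""
    return [
        math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(count)
    ]


def indistinguishable(freq_a_hz, freq_b_hz, sample_rate_hz, count=64):
    """True if both tones yield (numerically) the same sample sequence."""
    a = sample_tone(freq_a_hz, sample_rate_hz, count)
    b = sample_tone(freq_b_hz, sample_rate_hz, count)
    return all(math.isclose(x, y, abs_tol=1e-9) for x, y in zip(a, b))


if __name__ == "__main__":
    for rate in (8000, 16000):
        print(rate, "Hz sampling -> ceiling of", nyquist_hz(rate), "Hz")
        print("  6 kHz looks like 2 kHz:", indistinguishable(6000, 2000, rate))
```

In a real capture chain, a filter removes everything above the ceiling before sampling, which is why the high-frequency detail is simply missing from an 8 kHz call rather than distorted.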

The bitrate at which this data is transmitted is a separate topic from the sample rate of the initial capture. A great codec may minimize the amount of data that must be transmitted per second (bitrate), but it can never deliver better audio than the original sampling captured (sample rate). This is the "buy low, sell high" of communications. We want the lowest bitrate possible for the highest sample rate possible... low cost, high quality. There are many codecs available for this compression / decompression process, but almost all of them are tuned for 8 kHz sampling. The simple reason is that no matter how high the fidelity of a VoIP signal, as soon as it passes into the PSTN, anything beyond what 8 kHz sampling can carry is discarded. Furthermore, a 16 kHz signal that is downsampled to 8 kHz typically doesn't sound as good as one that was captured at 8 kHz in the first place.

Thus, the critical issue: wideband VoIP only works when the call is carried as VoIP from end to end. With the prevalence of existing PSTN infrastructure in our lives, pure end to end VoIP links still make up only a small percentage of communications traffic.
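To make the difference between sample rate and bitrate concrete, here is a small arithmetic sketch. It assumes uncompressed 16-bit linear PCM as the starting point, which is an assumption for illustration rather than something any particular codec is stated to use here; the function names are likewise hypothetical.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample):
    """Bitrate of uncompressed audio in kbps (1 kbps = 1000 bits/s)."""
    return sample_rate_hz * bits_per_sample / 1000


def compression_ratio(sample_rate_hz, bits_per_sample, codec_kbps):
    """How many times smaller the codec stream is than the raw capture."""
    return pcm_bitrate_kbps(sample_rate_hz, bits_per_sample) / codec_kbps


if __name__ == "__main__":
    # Assumed 16-bit capture at the two sample rates discussed above.
    print(pcm_bitrate_kbps(8000, 16))    # raw narrowband capture
    print(pcm_bitrate_kbps(16000, 16))   # raw wideband capture
    # A hypothetical wideband codec running at 32 kbps.
    print(compression_ratio(16000, 16, 32))
```

Doubling the sample rate doubles the raw data, so a wideband codec has to compress harder than a narrowband one to fit the same bitrate. That is the "buy low, sell high" trade in numbers.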

For a list of prevalent codecs, see: http://www.speex.org/comparison

Visible in the above link are the notable exceptions to the 8 kHz rule to keep an eye on, namely G.722 and Speex. There is also G.729.1, an interesting recent codec. Unlike Speex and G.722, G.729.1 is compatible with a well-accepted and well-tested existing codec, G.729.

It seems to be prevalent folklore that Skype uses what is commonly referred to as "Wideband iLBC". While it is not inconceivable that iLBC could be used to capture at 16 kHz, I have found no credible sources citing anyone doing it. I suspect this is a misconception based on reasoning something like this:

Skype uses GIPS codecs. iLBC was created by Global IP Sound (GIPS). Skype is wideband capable. Therefore Skype uses wideband iLBC.

Let me attempt to clear this up somewhat. As confirmed to me by sources at GIPS, "'wideband iLBC' is not the preferred terminology. To be more accurate, Skype has licensed our VoiceEngine product, which is a comprehensive solution that includes all of our codecs, as well as a jitter buffer, error concealment, and echo cancellation technology."

The GIPS suite of codecs includes, among others, iLBC, which is a fixed-rate codec. However, as indicated on the Skype website, Skype varies its bit rate (see here), which could not happen with iLBC. Also in the suite is the iSAC codec. According to a GIPS engineer, "The iLBC and iSAC algorithms are pretty much unrelated. iLBC is a narrowband fixed rate codec operating at 13.3 kbps or 15.2 kbps. The algorithm is available from IETF (RFC 3951 and 3952). iSAC is an unrelated wideband variable rate codec, which can adapt its operating rate between 10 kbps and 32 kbps. The codec is proprietary and the algorithm is unavailable."
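The two iLBC rates quoted above follow directly from its two fixed frame formats. The frame sizes used below (38 bytes every 20 ms, 50 bytes every 30 ms) are taken from RFC 3951 rather than from this article, and the helper name is made up for illustration. A fixed frame size at a fixed interval is exactly what makes the codec "fixed rate".

```python
import math


def frame_bitrate_kbps(frame_bytes, frame_ms):
    """Bitrate in kbps of a codec that emits `frame_bytes` every `frame_ms`."""
    # bits per millisecond is numerically equal to kilobits per second
    return frame_bytes * 8 / frame_ms


# iLBC frame formats as given in RFC 3951 (assumption: not stated in this article)
ILBC_MODES = {
    "20 ms": (38, 20),
    "30 ms": (50, 30),
}

if __name__ == "__main__":
    for name, (size, interval) in ILBC_MODES.items():
        print(name, "mode:", round(frame_bitrate_kbps(size, interval), 1), "kbps")
```

A variable-rate codec such as iSAC has no single row in a table like this, which is why a call whose bitrate drifts between values cannot be an iLBC call.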

So although Skype may indeed use iLBC at times, at other times it must be using iSAC. Since Skype has licensed both (as well as iPCM-wb, another 16 kHz codec), it would IMHO make little sense for them to go and tweak iLBC for wideband when the others are already available for their use. (If someone can cite some evidence though, by all means, post.)

As the engineer pointed out, the wideband GIPS codecs have not been released royalty free the way iLBC has. Asterisk and other open source projects therefore show little interest in them. The good news is that alternatives exist, namely Speex, which supports 8, 16 and 32 kHz sample rates and is free, open source software. So if you are looking for wideband VoIP, look at Speex.

A caveat for Asterisk hackers: the internals of Asterisk are still substantially geared for 8 kHz sampling, so arriving wideband signals will end up downsampled. I understand this is pervasive enough in the core code that it is not likely to evolve past 8 kHz for some time to come.

Learn More

Wide Band IP Phones
HD Voice from a hosted voip provider perspective
HD Voice
WideBand Low Latency Networking for VoIP
Created by: Randomandy, Last modification: Mon 12 of Sep, 2011 (02:10 UTC) by admin