The comfort noise he is referring to is the white noise that modern digital cell phones intentionally introduce. Most cell phones have a codec that detects silence and, rather than wasting valuable bandwidth on nothing, sends a silence packet that says something like "quiet for the next 250 ms". The GSM specification for the AMR codec provides a feature called comfort noise. It turns out that if the listener on the other end hears total silence, they worry that the connection was lost. So instead the receiving phone plays white noise, specifically samples within +/- 2048 (of a 16-bit word). Since a pseudo-random number generator on the phone is typically used, the result is white noise, similar to what you hear on an old-fashioned analog phone or, as the author references, on a radio tuned to no station, where you hear random static, a percentage of which is cosmic microwave background radiation from the big bang. It is ironic that with all that digital technology and people still like the comfort of white noise in the background, not complete silence. Of course the carriers could instead use some extra BW and transmit the actual background noise of whoever you are talking to.
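For anyone curious what that looks like in code, here is a minimal sketch of the white-noise idea described above: a pseudo-random generator producing 16-bit samples within +/- 2048. The function name, the 8 kHz sample rate, and the 20 ms frame length are my assumptions for illustration; a real AMR decoder does considerably more than this.

```python
import random


def comfort_noise(num_samples, amplitude=2048, seed=None):
    """Return pseudo-random PCM samples in [-amplitude, amplitude].

    `amplitude=2048` follows the +/- 2048 figure quoted above; the
    function itself is a hypothetical illustration, not AMR code.
    """
    rng = random.Random(seed)
    return [rng.randint(-amplitude, amplitude) for _ in range(num_samples)]


# Assumed: 8 kHz sampling, so one 20 ms frame is 160 samples.
frame = comfort_noise(160, seed=1)
```

Each sample fits comfortably in a signed 16-bit word, so the noise sits well below full scale (32767) and reads as a faint hiss rather than loud static.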
Going a little deeper, the AMR codec also provides a noise colouring packet, which requires far less bandwidth than the actual voice codec packets.
This packet describes the background noise present at the sender, so that when speech resumes, the natural background noise and the fake comfort noise sound the same and there is no discontinuity between the two.
To give an idea of the bandwidth requirements, the data portion for every 20 milliseconds of audio is ~14 bytes for speech, ~6 bytes for noise colouring, and 1 byte for a silence packet. It might be implementation-specific though.
edit - sorry it was not clear, but during silent periods the silence packets and noise colouring packets are intermixed in a ratio of something like 6 to 1.
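To put rough numbers on that, here is a back-of-the-envelope calculation using the payload sizes quoted above. It assumes the 6 to 1 ratio means six silence packets for every noise colouring packet, and it counts payload bytes only (no headers), so treat the results as ballpark figures.

```python
FRAME_MS = 20        # one packet per 20 ms of audio
SPEECH_BYTES = 14    # approximate speech payload per frame
NOISE_BYTES = 6      # approximate noise colouring payload per frame
SILENCE_BYTES = 1    # silence payload per frame


def bitrate_bps(bytes_per_frame, frame_ms=FRAME_MS):
    """Convert a per-frame payload size into bits per second."""
    return bytes_per_frame * 8 * 1000 / frame_ms


speech_bps = bitrate_bps(SPEECH_BYTES)

# Assumed mix during quiet periods: 6 silence packets per 1 noise packet.
avg_quiet_bytes = (6 * SILENCE_BYTES + 1 * NOISE_BYTES) / 7
quiet_bps = bitrate_bps(avg_quiet_bytes)

print(f"speech: {speech_bps:.0f} bit/s")  # speech: 5600 bit/s
print(f"quiet:  {quiet_bps:.0f} bit/s")   # quiet:  686 bit/s
```

Under those assumptions a quiet stretch costs roughly an eighth of what speech does.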
I hadn't made the connection before, but this is basically how many industrial sensors work. They transmit currents of 4 to 20 mA, with 0 mA likely indicating power loss or a short.
The goal is essentially the same: make it easier to tell if the line's gone dead or not. I think it would be interesting if there could be a good use for comfort noise in these signals, too; at least for me, I've watched for noise levels to shift to figure out if something was amiss.
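A minimal sketch of the 4 to 20 mA idea, for comparison: the reading is scaled linearly between 4 mA and 20 mA, and anything well below 4 mA is treated as a dead line rather than a low value. The function name, the 0 to 100 output range, and the 3.5 mA fault threshold are assumptions I picked for illustration.

```python
def read_sensor(current_ma, low=0.0, high=100.0, fault_below_ma=3.5):
    """Map a loop current in mA to an engineering value.

    Returns None when the current is below `fault_below_ma`, which is
    taken here to mean the loop is dead (power loss, broken wire, ...).
    The threshold is an assumed value, not taken from any standard.
    """
    if current_ma < fault_below_ma:
        return None
    # 4 mA maps to `low`, 20 mA maps to `high`; the span is 16 mA.
    return low + (current_ma - 4.0) * (high - low) / 16.0


print(read_sensor(12.0))  # 50.0
print(read_sensor(0.0))   # None
```

Because a valid zero reading still draws 4 mA, "nothing measured" and "nothing connected" can never be confused, which is the same job comfort noise does for a call.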
> It is ironic that with all that digital technology and people still like the comfort of white noise in the background, not complete silence. Of course the carriers could instead use some extra BW and transmit the actual background noise of whoever you are talking to.
"Comfort noise" is actually useful to signal whether the call (or the audio channel) is active. Otherwise people ask each other "You still there? Can you hear me?" Hearing the noise indicates the channel is still open.
As for carriers transmitting the actual background noise, it would increase bandwidth requirements. But I am not sure if I would even like to hear the background of the person talking. It could be distracting (even if it wouldn't make a difference for bandwidth consumption).
It's less complicated and unrelated to echo cancellation; you just tell the transmitting codec to always encode whatever audio you've given it, rather than detecting silence.
(Echo cancellation is usually in the local audio path before data is passed to the codec, so the outbound codec would hear silence rather than an echo of inbound audio. At least that's my experience of working with PJSIP.)
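To make the "always encode" point concrete, here is a rough sketch of a per-frame decision with silence detection as a simple switch. This is not PJSIP's API; the function names, the mean-square energy measure, and the threshold are all hypothetical.

```python
def frame_energy(samples):
    """Mean squared amplitude of one frame of PCM samples."""
    return sum(s * s for s in samples) / len(samples)


def choose_packet(samples, detect_silence=True, threshold=1000.0):
    """Decide what kind of packet to send for one frame.

    With `detect_silence=False` every frame is encoded as speech, so the
    far end hears the real background instead of comfort noise.
    `threshold` is an arbitrary illustrative value.
    """
    if detect_silence and frame_energy(samples) < threshold:
        return "silence"
    return "speech"


quiet_frame = [0] * 160
print(choose_packet(quiet_frame))                        # silence
print(choose_packet(quiet_frame, detect_silence=False))  # speech
```

The echo canceller does not appear here at all, which matches the point above: it sits earlier in the audio path, and the encoder only sees whatever comes out of it.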