Why?

Blogging Reading Chatting Meeting. The other aspect of life.

Monday, May 26, 2008

Encoding and Transmitting Speech

The simplest way to encode speech is to use PCM, discussed above. The 8-bit, 8-kHz standard for speech is of significantly lower quality than what is required for music. Still, at the nominal 64-kb/sec rate for speech, every bit saved per sample lowers the total rate by 8 kb/sec, so methods for lowering the bit rate remain an active area of research. The ADPCM method discussed above can easily save 2 to 4 bits per sample. PCM, ADPCM, and related methods attempt to describe the waveform itself. There are other methods, such as the subband coding discussed above under MPEG. We now turn to another class of methods, called voice coders, or vocoders.
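
The bit-rate arithmetic above can be checked with a short sketch. The function name `pcm_bit_rate` is made up for this illustration, and a single channel is assumed; the only figures taken from the text are the 8-kHz sample rate, the 8 bits per sample, and the 1 to 4 bits saved.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample):
    """Bit rate in bits per second for one channel of fixed-width samples."""
    return sample_rate_hz * bits_per_sample


SAMPLE_RATE_HZ = 8000   # telephone-quality speech, from the text
PCM_BITS = 8            # bits per sample for standard speech PCM

baseline = pcm_bit_rate(SAMPLE_RATE_HZ, PCM_BITS)

# Saving 1 bit per sample, and the 2 to 4 bits the text attributes to ADPCM.
for bits_saved in (1, 2, 4):
    reduced = pcm_bit_rate(SAMPLE_RATE_HZ, PCM_BITS - bits_saved)
    print(f"save {bits_saved} bit(s)/sample: "
          f"{reduced / 1000:.0f} kb/sec, saving {(baseline - reduced) / 1000:.0f} kb/sec")
```

Because the rate is simply samples per second times bits per sample, each bit removed from the sample width is worth one full multiple of the sample rate.
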
The model of the human vocal tract can be simplified by assuming, for example, that the source of vibration for voiced sounds is not affected by the rest of the vocal tract. The series of filters that model the vocal tract can likewise be treated as independent, so that a change in one filter has no effect on the others. Under such conditions, we can calculate the voice model coefficients independently of the fundamental frequency or the voiced/unvoiced decision. We can also reasonably assume that formants change quite slowly compared to the rate of individual pulses from the vocal tract, and so transmit the filter coefficients at a slower rate.
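
The separation of source and filter described above can be sketched in a few lines. This is only an illustration under stated assumptions, not any particular vocoder: the function names, the 20-ms frame, the 100-Hz pitch, and the formant values are all hypothetical, and each formant is modeled with a generic two-pole digital resonator.

```python
import math
import random


def excitation(n, voiced, pitch_period, seed=0):
    """Source: an impulse train if voiced, white noise if unvoiced."""
    if voiced:
        return [1.0 if i % pitch_period == 0 else 0.0 for i in range(n)]
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def resonator_coeffs(center_hz, bandwidth_hz, sample_rate_hz):
    """Feedback coefficients of a two-pole resonator for one formant.

    They depend only on the formant, never on the pitch or voicing.
    """
    r = math.exp(-math.pi * bandwidth_hz / sample_rate_hz)
    theta = 2.0 * math.pi * center_hz / sample_rate_hz
    return (2.0 * r * math.cos(theta), -r * r)


def resonate(signal, coeffs):
    """Apply y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    a1, a2 = coeffs
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_frame(n, voiced, pitch_period, formants, sample_rate_hz=8000):
    """Pass the source through one independent resonator per formant."""
    signal = excitation(n, voiced, pitch_period)
    for center_hz, bandwidth_hz in formants:
        signal = resonate(signal, resonator_coeffs(center_hz, bandwidth_hz, sample_rate_hz))
    return signal


# Illustrative values only: a 20-ms frame at 8 kHz, 100-Hz pitch, two formants.
FORMANTS = [(700, 130), (1220, 70)]
voiced_frame = synthesize_frame(160, True, 80, FORMANTS)
unvoiced_frame = synthesize_frame(160, False, 80, FORMANTS)
```

Changing `pitch_period` or the `voiced` flag leaves the resonator coefficients untouched, which is the independence the paragraph relies on. In this sketch the coefficients would be recomputed once per frame rather than once per pulse, reflecting the slower update rate for the filter parameters.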
