ITUT G729 preprocessing question [Archive]

HyeeWang

04-29-2009, 09:24 AM

I have two questions about ITU-T G.729.

1. The pre-processing section of the G.729 spec says to apply a
high-pass filter with a cut-off frequency of 140 Hz. As we know, pitch
ranges from about 55 to 400 Hz. How, then, can the codec search for a
pitch below 140 Hz after such pre-processing? It seems like fishing for
the moon in the sea.

2. The G.729 pre-processing section uses a second-order pole/zero filter
with a cut-off frequency of 140 Hz:
H(z) = (0.46363718 - 0.927247058*z^-1 + 0.46363718*z^-2) /
       (1 - 1.9059465*z^-1 + 0.9114024*z^-2)
But in contrast to the spec, the G.729 codec software implementation
uses a different filter, which is described as follows. It is also a
second-order high-pass filter with a cut-off frequency of 140 Hz. To my
surprise, I find that the frequency response of that filter is bad.

Why? Why does the G.729 codec software implementation use such a bad
filter and reject a well-designed filter?
That puzzles me greatly.

/*------------------------------------------------------------------------
 * 2nd order high pass filter with cut off frequency at 140 Hz.
 * Designed with SPPACK efi command -40 dB att, 0.25 ri.
 *
 * Algorithm:
 *
 *  y[i] = b[0]*x[i] + b[1]*x[i-1] + b[2]*x[i-2]
 *                   + a[1]*y[i-1] + a[2]*y[i-2];
 *
 *     b[3] = {0.92727435E+00, -0.18544941E+01, 0.92727435E+00};
 *     a[3] = {0.10000000E+01, 0.19059465E+01, -0.91140240E+00};
 *------------------------------------------------------------------------
 */

Regards
HyeeWang

Vladimir Vassilevsky

04-30-2009, 01:02 AM

HyeeWang wrote:

> About ITUT G729, I have two questions.
>
> 1. In the processing section of G729 spec, it says to apply a high
> pass filter with a cut-off frequency of 140 Hz. As we know, the
> range of pitch is between 55 and 400 Hz. Then, how to search the pitch
> with frequency below 140Hz with such a preprocessing algo? It is as
> fetching the moon from sea.

Fundamental frequency is not required to find the pitch.

> 2. G729 preprocessing section uses a second order pole/zero filter
> with a cut-off frequency of 140 Hz.
> But ,In contrast to the spec,the G729 codec software implementation
> uses a different filter, which is described as follows. It is also
> 2nd order high pass filter with cut off frequency at 140 Hz. To my
> surpise,I find the frequency response of that filter is bad.
>
> Why? Why the G729 codec software implementation use such a bad
> filter,rejecting a designed and perfect filter?

I don't know where you got that junk implementation. Refer to the
original ITU-T annex. Anyway, the main purpose of this filter is to remove
DC from the signal, so the frequency response is not critical.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com

HyeeWang

04-30-2009, 02:18 AM

Vladimir Vassilevsky, thank you.

The two filters are actually almost the same, except for a scaling factor
of 2.
Because I did not think carefully about the coefficients of that filter, I
plotted its frequency response incorrectly. I passed the vector a[3] from
the comment straight to the freqz function and forgot to negate its second
and third elements (the code adds the a[] feedback terms, whereas freqz
expects the denominator polynomial).
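
For readers who want to check this, here is a minimal sketch in plain Python (standing in for freqz, and assuming the 8 kHz sampling rate of narrowband speech). The helper names `code_to_transfer_denominator` and `magnitude` are made up for this illustration. It evaluates both coefficient sets on the unit circle and shows that, once the sign convention is fixed, the two responses differ only by a gain of about 2.

```python
import cmath
import math

FS = 8000.0  # assumed sampling rate in Hz (narrowband speech)

# Spec form: H(z) = B(z) / A(z) with A(z) = 1 + a1*z^-1 + a2*z^-2.
SPEC_B = [0.46363718, -0.927247058, 0.46363718]
SPEC_A = [1.0, -1.9059465, 0.9114024]

# Coefficients as printed in the code comment, where the recursion ADDS
# the feedback terms: y[i] = ... + a[1]*y[i-1] + a[2]*y[i-2].
CODE_B = [0.92727435, -1.8544941, 0.92727435]
CODE_A = [1.0, 1.9059465, -0.91140240]


def code_to_transfer_denominator(a):
    """Negate a[1:] to turn 'added feedback' coefficients into A(z)."""
    return [a[0]] + [-c for c in a[1:]]


def magnitude(b, a, freq_hz, fs=FS):
    """|H(e^{jw})| of H(z) = B(z)/A(z) at one frequency."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / fs)
    num = sum(c * z_inv ** k for k, c in enumerate(b))
    den = sum(c * z_inv ** k for k, c in enumerate(a))
    return abs(num / den)


fixed_a = code_to_transfer_denominator(CODE_A)
for f in (50, 140, 300, 1000, 3000):
    spec = magnitude(SPEC_B, SPEC_A, f)
    code = magnitude(CODE_B, fixed_a, f)
    wrong = magnitude(CODE_B, CODE_A, f)  # a[] passed without negation
    print(f"{f:5d} Hz  spec={spec:.4f}  code={code:.4f}  un-negated={wrong:.4f}")
```

The "un-negated" column reproduces the plotting mistake described above: with a[1] and a[2] used as printed, the denominator is a different polynomial and the curve no longer looks like a high-pass filter.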

But another question still puzzles me.
Maybe I am confusing the concepts of pitch and fundamental frequency,
but I could not find the difference between them. After the speech
components below 140 Hz are cut off, I would think the pitch components
are lost, so the pitch search algorithm would not work.

Would you be kind enough to explain it to me?

Regards
HyeeWang

Vladimir Vassilevsky

04-30-2009, 02:35 AM

HyeeWang wrote:

> Maybe I am confused with the concept of pitch and fundmental
> frequency. But I did not find the difference of them. So after cutting
> off the frequency below 140hz of speech,I think it have lost pitch
> components thus pitch searching algo does not work.
> Would you be kind to explain it to me?

Pitch is a repetition rate. If a signal is repeated with the frequency
Fo, that doesn't mean that there are any components at the frequency Fo.
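
A small sketch of this point, under assumptions that are not from the thread: an 8 kHz sampling rate, a synthetic test signal made of harmonics 2 to 5 of 100 Hz, and a plain autocorrelation peak search (a simplification, not G.729's actual pitch analysis). The function names are hypothetical. The signal has no energy at 100 Hz, yet it repeats every 10 ms, and the autocorrelation finds that period.

```python
import cmath
import math

FS = 8000     # assumed sampling rate in Hz
F0 = 100.0    # repetition rate of the test signal in Hz
N = 800       # 100 ms of signal


def harmonics_only(n_samples=N, f0=F0, fs=FS):
    """Sum of harmonics 2..5 of f0; the component at f0 itself is absent."""
    return [
        sum(math.sin(2 * math.pi * k * f0 * n / fs) for k in range(2, 6))
        for n in range(n_samples)
    ]


def tone_magnitude(x, freq_hz, fs=FS):
    """Magnitude of the correlation of x with a complex tone at freq_hz."""
    return abs(sum(v * cmath.exp(-2j * math.pi * freq_hz * n / fs)
                   for n, v in enumerate(x)))


def autocorr_pitch(x, fs=FS, fmin=55.0, fmax=400.0):
    """Pitch estimate in Hz from the lag of the largest autocorrelation."""
    lag_min = int(fs / fmax)
    lag_max = int(fs / fmin)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        val = sum(x[n] * x[n - lag] for n in range(lag, len(x)))
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag


x = harmonics_only()
print("magnitude at 100 Hz:", round(tone_magnitude(x, 100.0), 6))
print("magnitude at 200 Hz:", round(tone_magnitude(x, 200.0), 6))
print("autocorrelation pitch estimate (Hz):", autocorr_pitch(x))
```

The same reasoning applies after a 140 Hz high-pass filter: the harmonics that survive still repeat at the pitch period.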

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com

04-30-2009, 05:34 AM

On 30 Apr., 03:18, HyeeWang <[email protected]> wrote:
> Would you be kind to explain it to me?

You better get text books about it. This way you (a) won't waste
people's time and (b) you'll learn something.
Second option: Quit your job and get a new one that doesn't require
competency in DSP.

btw: I'm the one who told you about the equivalence of the two filters
(up to a scaling factor) in another forum. Funny, how you sometimes
try to look smart by repeating other people's responses and making it
look like it was the result of your thoughts.

Cheers!
SG

HyeeWang

04-30-2009, 06:08 AM

Hi, SG.
Thank you.

It looks like you are angry that I was "repeating other people's
responses and making it look like it was the result of your thoughts."

The two filters are actually almost the same, except for a scaling factor
of 2.

That is a basic fact. I did not think it over, forgot the minus signs,
and got the response wrong.
Thank you for reminding me of that.

But I already thanked you in that forum, and that was another forum.
What would you like me to do here?
Could you tell me?

Regards
HyeeWang

HyeeWang

04-30-2009, 06:17 AM

Thanks to all the teachers on my long journey.
Maybe I should state here that it was *** who taught me everything I
know, including 1+1=2.

hehe.

Regards
HyeeWang

steveu

04-30-2009, 07:38 AM

Vladimir Vassilevsky wrote:

>Anyway the main purpose of this filter is removing
>DC from the signal, so the frequency response is not critical.

It's more than just an issue of removing DC. You need to remove the entire
bass end of the spectrum to optimise the performance of the codec. This is
true of most narrowband speech codecs. Even with an elementary codec like
G.711, filtering below something like 200 Hz dramatically improves how well
the remainder of the signal codes.

Steve

Vladimir Vassilevsky

04-30-2009, 11:34 AM

"steveu" wrote:
> >
> >
> >HyeeWang wrote:
> >
> >> 1. In the processing section of G729 spec, it says to apply a high
> >> pass filter with a cut-off frequency of 140 Hz.

> > Anyway the main purpose of this filter is removing
> >DC from the signal, so the frequency response is not critical.
>
> Its more than just an issue of removing DC. You need to remove the entire
> bass end of the spectrum to optimise the performance of the codec. This is
> true of most narrowband speech codecs. Even with an elementary codec like
> G.711, filtering below something like 200Hz dramatically improves how well
> the remainder of the signal codes.

I agree. There can be a lot of energy at low frequencies, but that part
is not very important for the perception of speech. It only wastes
bandwidth. The same idea applies to power amplification.
What has always surprised me about G.711, G.721 and the like is why they
did not use fixed pre-emphasis/de-emphasis in the analog path. It improves
the quality very noticeably.

Vladimir Vassilevsky
DSP and Mixed Signal Consultant
www.abvolt.com

HyeeWang

05-04-2009, 08:33 AM

steveu, thank you.

1. Removing DC is reasonable, since it is not part of speech at all. But
what is the reason for removing bass?
How and why can it optimise the performance? Note that G.729 is not a
waveform coder, nor a frequency-domain waveform coder; it is a
parametric vocoder.

2. G.711, as an elementary coding algorithm, uses basic A-law/mu-law
nonuniform PCM quantization. I wonder whether it can even be called a
"codec". How can the quality be dramatically improved?

Vladimir Vassilevsky, thank you.

G.711 and G.721, two elementary coding algorithms, use basic A-law/mu-law
nonuniform PCM quantization and ADPCM.

They both compress the waveform itself, which has nothing to do with the
frequency domain. Why would you want to use fixed pre-emphasis/de-emphasis
in them? How would you expect it to work?

Any comments would be appreciated.
[email protected]

Vladimir Vassilevsky

05-04-2009, 12:05 PM

Get a basic book such as Rabiner & Schafer.

VLV

Jerry Avins

05-04-2009, 02:12 PM

HyeeWang wrote:
> steveu,thank you.
>
> 1. Removing DC is reasonable,for it is not a part of speech at all.But
> what is the reason to remove bass?

Bass can contain a lot of power, but adds little to intelligibility.

> How and why it can optimise the performance ?

By removing bass, more power is available for the most effective
frequencies.

> We should be to noted that G729 is not a waveform coder,not
> a frequency domain waveform coder also, it is a parameter vocoder.

Noted. So?

> 2. G711, as an elementary coder algo, use the basic A/mu law
> nonuniform PCM quantization. I wonder whether it
> can be called "codec". How it can be dramatically improves the
> quality?

By providing 12 bits of dynamic range with an 8-bit system.
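
A rough illustration of the companding idea, assuming the continuous mu-law curve with mu = 255 rather than G.711's segmented tables (the function names are made up for the sketch). It compares the step size of a uniform 8-bit quantizer with the step next to zero after companding. The exact number of "equivalent bits" depends on the law used and on how resolution is counted, so the printed figure is only indicative.

```python
import math

MU = 255.0      # mu-law constant (assumed; A-law uses a different curve)
LEVELS = 256    # number of 8-bit code words


def compress(x):
    """Continuous mu-law compressor for x in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y):
    """Inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


# Step of a plain uniform 8-bit quantizer over [-1, 1].
uniform_step = 2.0 / LEVELS
# Input-domain size of the first companded step next to zero.
smallest_step = expand(uniform_step) - expand(0.0)
# How many extra bits a uniform quantizer would need for that resolution.
extra_bits = math.log2(uniform_step / smallest_step)

print(f"uniform 8-bit step     : {uniform_step:.6f}")
print(f"companded step near 0  : {smallest_step:.6f}")
print(f"extra resolution near 0: about {extra_bits:.1f} bits")
```

The fine steps are spent where speech samples are small, and the coarse steps where they are large, which is what gives the wide dynamic range in 8 bits.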

> Vladimir Vassilevsky. Thank you.
>
> G.711, G.721, two elementary coding algo, use basic A/mu law
> nonuniform PCM quantization and ADPCM.
>
> They all use waveform to compress data. It is nothing about frequency
> domain. Why you wanna use a
> fixed preemphasis/deemphasis in them? How u expect it work?

Preemphasis "whitens" the signal, improving SNR.
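
As a sketch of what "whitening" means here, assume a first-order pre-emphasis filter y[n] = x[n] - alpha*x[n-1] with alpha = 0.95 and an 8 kHz sampling rate (both values are assumptions, not taken from any of the codecs discussed). The filter attenuates low frequencies and boosts high ones, which flattens the typical downward tilt of a speech spectrum. Whether that improves SNR in a given coder depends on the quantizer and on the matching de-emphasis.

```python
import cmath
import math


def preemphasis(x, alpha=0.95):
    """y[n] = x[n] - alpha * x[n-1], with x[-1] taken as 0."""
    return [x[n] - alpha * (x[n - 1] if n > 0 else 0.0)
            for n in range(len(x))]


def preemphasis_gain(freq_hz, fs=8000.0, alpha=0.95):
    """Magnitude of 1 - alpha*z^-1 on the unit circle at freq_hz."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / fs)
    return abs(1 - alpha * z_inv)


for f in (100, 200, 1000, 3000):
    gain_db = 20 * math.log10(preemphasis_gain(f))
    print(f"{f:5d} Hz  gain = {gain_db:+.1f} dB")
```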

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

HyeeWang

05-05-2009, 10:02 AM

Jerry, thank you.

Thank you for your clear description of bass. I now understand that
bass components take up much of the power but contribute little to
intelligibility.

1. In a frequency-domain waveform coder such as MP3/AAC, removing bass
should improve intelligibility. That is reasonable, because more
attention/bits can go to the components that matter for intelligibility.

But G.729/G.723.1, being based on CELP, are parametric vocoders. They use
a speech model and analyse LPC parameters and pitch. They are not
waveform coders, so removing bass should not help.
Can removing bass make the LPC coefficients/pitch more effective?

2. G.711, although it is a waveform coder, operates in the time
domain, not the frequency domain. Removing bass should not help there
either, since bass is a frequency-domain characteristic.

3. Pre-emphasis "whitens" the signal, improving SNR. It does make the
spectrum flatter, so the SNR gain makes sense. But G.711/G.721
(nonuniform PCM and ADPCM) operate in the time domain. Can a larger SNR
in the frequency sense give a larger SNR in the time sense?

Thank you for your reply.

Regards
[email protected]

Sebastian Doht

05-05-2009, 12:53 PM

HyeeWang wrote:
> Jerry.Thank you.
>
> Thank you for your graceful description about bass . Thus I get that
> bass components occupy more in power,but contribute less in
> intelligibility
>
> 1.If in a frequency wavefrom coder,such as mp3/aac, removing bass
> must improve intelligibility. It is resonable for it can give more
> attention/bits to that effective for intelligibility.
>
> But, G729/G723.1,based on CELP, is a parameter vocoder. It use speech
> model and analyse LPC parameter/pitch. It is not a wavefrom coder.
> Removing bass can not work.
> Can the operation of removing Bass make the LPC coefficients/Pitch
> more effective?
>
>
> 2. G711,although it is a waveform coder,but it operate in time
> domain,not in frequency domain. Removing bass can not work also, for
> BASS is a characteristic of frequency.
>
> 3. Preemphasis "whitens" the signal, improving SNR. It did and made
> spectrum curve more flat.Thus raising SNR is right. But G711/G721
> (nonuniform PCM and ADPCM) is operating in time domain. The large SNR
> in frequency sense can give rise to large SNR in time sense?
>
>
> Thank you for your reply.
>
> Regards
> [email protected]

Before you try to understand a codec, you should get your hands on a
textbook such as "Digital Communications" by Proakis, because you lack
some basics:

1. What can be done in the frequency domain can also be done in the time
domain. Which domain to choose depends on the computational cost.

2. SNR is defined as the ratio of signal power to noise power.
It is therefore not a question of time or frequency (see Parseval's
theorem).
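
A quick numerical check of the Parseval argument, as a sketch with an arbitrary test tone and seeded pseudo-random noise (both are assumptions chosen only for the demonstration, as are the function names). The power computed from the samples and the power computed from the DFT bins agree, so the SNR is the same number in either domain.

```python
import cmath
import math
import random

N = 64


def dft(x):
    """Direct O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def time_power(x):
    """Mean power computed from the samples."""
    return sum(abs(v) ** 2 for v in x) / len(x)


def freq_power(x):
    """Mean power computed from the DFT bins (Parseval's theorem)."""
    spectrum = dft(x)
    return sum(abs(v) ** 2 for v in spectrum) / len(x) ** 2


def snr_db(signal, noise, power):
    """SNR in dB using the supplied power function."""
    return 10 * math.log10(power(signal) / power(noise))


rng = random.Random(0)
signal = [math.sin(2 * math.pi * 5 * t / N) for t in range(N)]
noise = [rng.uniform(-0.1, 0.1) for _ in range(N)]

print("SNR from samples  (dB):", round(snr_db(signal, noise, time_power), 6))
print("SNR from DFT bins (dB):", round(snr_db(signal, noise, freq_power), 6))
```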

Sebastian

Jerry Avins

05-05-2009, 05:53 PM

HyeeWang wrote:
> Jerry. Thank you.
>
> Thank you for your graceful description about bass . Thus I get that
> bass components occupy more in power,but contribute less in
> intelligibility
>
> 1.If in a frequency wavefrom coder,such as mp3/aac, removing bass
> must improve intelligibility. It is resonable for it can give more
> attention/bits to that effective for intelligibility.
>
> But, G729/G723.1,based on CELP, is a parameter vocoder. It use speech
> model and analyse LPC parameter/pitch. It is not a wavefrom coder.
> Removing bass can not work.

Think again!

> Can the operation of removing Bass make the LPC coefficients/Pitch
> more effective?

Sure. Removing the bass range means that there are fewer pitches to
encode. Think of bass as similar to font information in text in a word
processor. Removing it makes the presentation less pretty, but not much
less intelligible.

> 2. G711,although it is a waveform coder,but it operate in time
> domain,not in frequency domain. Removing bass can not work also, for
> BASS is a characteristic of frequency.

When bass is removed, there is less signal to encode, and usually a
smaller dynamic range.

> 3. Preemphasis "whitens" the signal, improving SNR. It did and made
> spectrum curve more flat.Thus raising SNR is right. But G711/G721
> (nonuniform PCM and ADPCM) is operating in time domain. The large SNR
> in frequency sense can give rise to large SNR in time sense?
>
>
> Thank you for your reply.

For what it's worth. Anyhow, you're quite welcome.

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

HyeeWang

05-06-2009, 07:41 AM

I have realized that pre-emphasis can flatten the spectrum and thus
reduce the dynamic range in the time domain.
I ran a simple pre-emphasis algorithm in a MATLAB script and
confirmed that.

Jerry, you said "When bass is removed, there is less signal to encode,
and usually a smaller dynamic range."

I know that bass is the low-frequency part of the signal, but I do not
know exactly what frequency range it covers. It cannot be the
strongest component (800 Hz or so). Since bass is neither the smallest
component nor the biggest, how can removing it reduce the dynamic
range?

Thank you all for your kind explanations.

HyeeWang

Jerry Avins

05-06-2009, 02:44 PM

HyeeWang wrote:

...

> I have recognized that the operation of pre-emphasize can make the
> spectrum curve flatness,thus reduce the dynamic range in time domain.
> With the matlab srcipt, I performed a simple pre-emphasize algo and
> confirmed that.

Interesting!

> Jerry, you said "When bass is removed, there is less signal to encode,
> and usually a smaller dynamic range. "
>
> I know Bass is the section of low frequency components. but I do not
> know the exact frequency range of bass contained.It must be not the
> biggest item (800 HZ or so). Since Bass is neither the smallest
> elements or the biggest , how can removing BASS reduce the dynamic
> range?

"Bass" is the lower end of the spectrum you deal with. There is no
precise dividing line. (On keyboard instruments, it is often considered
to be what the left hand plays.) For voice intelligibility and speaker
recognition, there's little need for frequencies below 300 Hz, although
reproduction quality improves noticeably when an octave lower is
included. (Standard telephones are typically limited to a range of 300
to 3000 Hz.)

As for dynamic range, the sum of the bass and other amplitudes often
exceeds that of the others alone.
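
A toy example of that last point, with made-up amplitudes (a 100 Hz "bass" tone at 1.0 and a 1 kHz tone at 0.3, sampled at 8 kHz; these are illustrative, not measured speech levels). The peak of the sum is well above the peak of the 1 kHz tone alone, so a quantizer sized for the sum has to cover a wider range.

```python
import math

FS = 8000   # assumed sampling rate in Hz
N = 800     # 100 ms of signal


def tone(freq_hz, amplitude, n_samples=N, fs=FS):
    """Sampled sine wave."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / fs)
            for n in range(n_samples)]


def peak(x):
    """Largest absolute sample value."""
    return max(abs(v) for v in x)


bass = tone(100.0, 1.0)    # illustrative bass component
mid = tone(1000.0, 0.3)    # illustrative speech-band component
both = [b + m for b, m in zip(bass, mid)]

extra_bits = math.log2(peak(both) / peak(mid))
print(f"peak without bass: {peak(mid):.3f}")
print(f"peak with bass   : {peak(both):.3f}")
print(f"extra range needed with bass: about {extra_bits:.1f} bits")
```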

> thank you all for kind explanation.

You're welcome.

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯ ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯