How to accurately calculate the instantaneous frequency for a clip of speech
Hi everyone,
How can I accurately calculate the instantaneous frequency for a clip of
speech, and with high frequency resolution?
For example:
[d,sr] = wavread('xx.wav'); % a clip of sound
Re: How to accurately calculate the instantaneous frequency for a clip of speech
On Jul 24, 10:29 pm, "deltuo" <del...@yahoo.com.cn> wrote:
> Hi everyone,
> How can I accurately calculate the instantaneous frequency for a clip of
> speech, and with high frequency resolution?
> For example:
> [d,sr] = wavread('xx.wav'); % a clip of sound
Can you define exactly what you mean by instantaneous frequency
in an unambiguous manner? A single point could be a part
of a sine wave of any frequency.
Re: How to accurately calculate the instantaneous frequency for a clip of speech
On Jul 25, 5:29 pm, "deltuo" <del...@yahoo.com.cn> wrote:
> Hi everyone,
> How can I accurately calculate the instantaneous frequency for a clip of
> speech, and with high frequency resolution?
> For example:
> [d,sr] = wavread('xx.wav'); % a clip of sound
>
> thank you
You would have to find the phase and differentiate it - as in FM
decoding. No simple matter when there is noise.
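A rough sketch of the "find the phase and differentiate it" idea, in plain Python so it runs without any toolbox. It builds the analytic signal with a naive DFT (slow, but fine for a short frame) and takes the wrapped backward phase difference. The function names are made up for illustration, and it only gives a meaningful answer for a single-component signal; real speech would need band-pass filtering around one harmonic first, and noise will make the result jumpy.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence x, via a naive O(N^2) DFT."""
    n = len(x)
    # Forward DFT of the real input.
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue              # keep DC (and Nyquist when n is even) as is
        elif k < (n + 1) // 2:
            spectrum[k] *= 2      # double the positive frequencies
        else:
            spectrum[k] = 0       # zero the negative frequencies
    # Inverse DFT gives x[t] + j * Hilbert{x}[t].
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def instantaneous_frequency(x, fs):
    """Instantaneous frequency in Hz from the backward phase difference."""
    z = analytic_signal(x)
    # The angle of z[t] * conj(z[t-1]) is the phase step, already wrapped
    # to (-pi, pi], so no separate unwrapping pass is needed.
    return [
        cmath.phase(z[t] * z[t - 1].conjugate()) * fs / (2 * math.pi)
        for t in range(1, len(z))
    ]


if __name__ == "__main__":
    fs = 8000
    tone = [math.cos(2 * math.pi * 500 * t / fs) for t in range(256)]
    print(instantaneous_frequency(tone, fs)[:5])
```

The output has one value fewer than the input because each estimate uses two consecutive samples.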
Re: How to accurately calculate the instantaneous frequency for a clip of speech
On Jul 25, 1:29 am, "deltuo" <del...@yahoo.com.cn> wrote:
> Hi everyone,
> How can I accurately calculate the instantaneous frequency for a clip of
> speech, and with high frequency resolution?
> For example:
> [d,sr] = wavread('xx.wav'); % a clip of sound
>
> thank you
For short time windows, speech is approximately stationary and
periodic. Try a Short-Time Fourier Transform (STFT) with a 32 ms
Hamming window -- e.g. N = 256 samples for an 8 kHz sampling rate.
Downsample to 8 kHz first if your sampling rate is higher (the
identifying characteristics of speech are below 4 kHz).
The 256-point FFT of each time frame will yield 129 frequency
components [0,128] evenly spaced on the domain [0, 4] kHz. If desired
you can zero-pad the time-domain frame to interpolate the FFT at other
frequencies. For a male speaker with a fundamental pitch around 110 Hz
you should see about 30-40 harmonic peaks below 4 kHz, and about half
that many for a female speaker with a 200 Hz pitch.
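To make the frame arithmetic concrete, here is a small plain-Python sketch of one STFT frame: Hamming window, optional zero-padding, and the bin-to-frequency mapping. The helper names and the `nfft` parameter are my own, and the direct DFT sum is used only to keep it dependency-free; a real implementation would call an FFT routine.

```python
import cmath
import math


def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_spectrum(frame, fs, nfft=None):
    """Return (freqs_hz, magnitudes) for bins 0..nfft/2 of one windowed frame.

    nfft > len(frame) is equivalent to zero-padding the frame, which
    interpolates the spectrum without adding real resolution.
    """
    n = len(frame)
    nfft = nfft or n
    windowed = [s * w for s, w in zip(frame, hamming(n))]
    bins = range(nfft // 2 + 1)
    freqs = [k * fs / nfft for k in bins]
    # Summing over only the n real samples is the same as a DFT of the
    # frame padded with nfft - n zeros.
    mags = [
        abs(sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / nfft)
                for t in range(n)))
        for k in bins
    ]
    return freqs, mags


if __name__ == "__main__":
    fs = 8000
    frame = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(256)]
    freqs, mags = frame_spectrum(frame, fs)
    peak = max(range(len(mags)), key=lambda k: mags[k])
    print(len(freqs), freqs[1], freqs[peak])
```

With N = 256 at 8 kHz this gives 129 bins spaced 31.25 Hz apart; passing `nfft=512` halves the spacing of the sampled spectrum.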
Multiplying by a Hamming window in the time domain is cyclic
convolution with its spectrum in the frequency domain. Therefore, the
spectrum of each voiced frame consists of scaled copies of the Hamming
window's spectrum at multiples of the speaker's pitch. If the lobes of
adjacent copies run together, so that there is significant spectral
leakage between harmonics, you should increase the length of the time
window. The Hamming window should span at least two pitch periods
(e.g. N = 267 samples at 8 kHz for a 60 Hz pitch).
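The window-length rule of thumb is simple arithmetic, sketched here in Python. The function name and the `periods` parameter are just for illustration.

```python
import math


def min_window_length(fs, lowest_pitch_hz, periods=2):
    """Smallest window length (in samples) spanning `periods` pitch periods."""
    return math.ceil(periods * fs / lowest_pitch_hz)


if __name__ == "__main__":
    # Two periods of a 60 Hz pitch at an 8 kHz sampling rate.
    print(min_window_length(8000, 60))
```

Two periods of 60 Hz last 33.3 ms, which is 266.7 samples at 8 kHz, hence N = 267.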
It's common to plot the STFT as a spectrogram -- a 2D gray-scale plot
of time vs. frequency interpolated from the log of the spectrum's
squared magnitude (energy). The energy values are converted to gray
scale pixels with dynamic range set between a low threshold (white)
and a high threshold (black) and with brightness varying linearly in
between (the slope determines contrast).
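The gray-scale mapping can be sketched as below in plain Python. The 0-255 pixel range, the dB scaling, and the small floor that avoids log(0) are assumptions on my part; the thresholds are whatever you choose for your data.

```python
import math


def gray_level(energy, low_db, high_db):
    """Map squared-magnitude energy to a gray level.

    255 is white (at or below low_db), 0 is black (at or above high_db),
    and the level varies linearly with log energy in between.
    """
    db = 10 * math.log10(max(energy, 1e-12))   # floor avoids log(0)
    frac = (db - low_db) / (high_db - low_db)
    frac = min(1.0, max(0.0, frac))            # clip to the dynamic range
    return round(255 * (1.0 - frac))


if __name__ == "__main__":
    for e in (1e-3, 1.0, 1e3, 1e6, 1e9):
        print(e, gray_level(e, low_db=0.0, high_db=60.0))
```

Narrowing the gap between `low_db` and `high_db` steepens the slope and so raises the contrast.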
Re: How to accurately calculate the instantaneous frequency for a clip of speech
On Jul 25, 4:33 am, kronec...@yahoo.co.uk wrote:
> On Jul 25, 5:29 pm, "deltuo" <del...@yahoo.com.cn> wrote:
>
> > Hi everyone,
> > How can I accurately calculate the instantaneous frequency for a clip of
> > speech, and with high frequency resolution?
> > For example:
> > [d,sr] = wavread('xx.wav'); % a clip of sound
>
> > thank you
>
> You would have to find the phase and differentiate it - as in FM
> decoding. No simple matter when there is noise.
>
> K.
I interpreted the OP less literally in my first reply, but you are
technically correct. In that case, the OP may be looking for pitch
tracking techniques, where pitch is the instantaneous fundamental
frequency of voiced speech. Typically this is done using a windowed
autocorrelation. But if the harmonics are filtered out, I suppose one
could estimate the pitch from the backward difference of the phase,
scaled by the sampling rate over 2*pi. Is that sensible?
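For reference, a bare-bones autocorrelation pitch estimate for one frame might look like this in plain Python. The search range (60 to 400 Hz), the voicing threshold of 0.3, and the function name are assumptions for illustration, not tuned values; a practical tracker would add peak interpolation and smoothing across frames.

```python
import math


def autocorr_pitch(frame, fs, fmin=60.0, fmax=400.0, voicing_threshold=0.3):
    """Estimate pitch in Hz from the autocorrelation peak of one frame.

    Returns None when the frame is silent or looks unvoiced.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove any DC offset
    r0 = sum(v * v for v in x)             # lag-0 autocorrelation (energy)
    if r0 == 0.0:
        return None
    lag_min = max(1, int(fs / fmax))
    lag_max = min(int(fs / fmin), n - 1)
    best_lag, best_val = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Plain (biased) autocorrelation: fewer terms at longer lags, which
        # mildly favours the first pitch period over its multiples.
        r = sum(x[t] * x[t + lag] for t in range(n - lag))
        if r > best_val:
            best_lag, best_val = lag, r
    if best_lag is None or best_val / r0 < voicing_threshold:
        return None
    return fs / best_lag


if __name__ == "__main__":
    fs = 8000
    # Synthetic "voiced" frame: 100 Hz fundamental plus two harmonics.
    frame = [
        sum(a * math.sin(2 * math.pi * 100 * h * t / fs)
            for h, a in ((1, 1.0), (2, 0.5), (3, 0.25)))
        for t in range(512)
    ]
    print(autocorr_pitch(frame, fs))
```

The resolution is limited to whole-sample lags, so the estimate is quantized to fs/lag.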
Re: How to accurately calculate the instantaneous frequency for a clip of speech
On Jul 25, 1:29 am, "deltuo" <del...@yahoo.com.cn> wrote:
> Hi everyone,
> How can I accurately calculate the instantaneous frequency for a clip of
> speech, and with high frequency resolution?
> For example:
> [d,sr] = wavread('xx.wav'); % a clip of sound
>
> thank you
Did you ever read the responses to your July 17 post?