I'm trying to find a way to detect the fundamental frequencies
(pitches) of harmonic content shown in log-scale FFTs. Basically,
all I have is the log-scale FFTs of bits of sound which may or may not
contain harmonic content, and I want the output to look like the
result of cepstral analysis. This is what the input looks like:
http://photosounder.com/misc/comp.dsp_fig1.png (from such an input,
ideally the result would be a fairly sharp peak in the place of the
leftmost peak, surrounded by noise).
Because on a log frequency scale harmonics have a spacing that is
independent of their fundamental frequency (i.e. the distance
between the Nth and the (N+1)th harmonic is always log((N+1)/N),
whatever the fundamental), I
thought a great way to get to what I want would be to convolve the log-
FFT with a kernel that would somehow stack up all the harmonic peaks
on top of each other and leave the surroundings clean. However, I
really don't have any idea how to come up with such a kernel. I
already tried cross-correlating with a series of peaks similar to the
figure above, but as one might expect all I obtained was a wide and
blurry mess. I don't _have_ to solve it using convolution; I'm open
to any alternative. It just seems to me that there should be such a
convolution kernel that would give me what I want.
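
To make the kernel idea concrete, here is a minimal sketch in Python
with NumPy. It is only an illustration under stated assumptions, not a
known solution: it assumes the log-FFT is a 1-D array of non-negative
magnitudes sampled uniformly in log2(frequency), with a known number of
bins per octave. The names harmonic_offsets, harmonic_sum,
bins_per_octave and n_harmonics are hypothetical, as is the 1/h
weighting. The kernel is a sparse comb with one tap per harmonic, placed
round(bins_per_octave * log2(h)) bins above the candidate fundamental.

```python
import numpy as np

def harmonic_offsets(bins_per_octave, n_harmonics):
    """Bin offsets of harmonics 1..n_harmonics above a fundamental
    on a log2-frequency axis (assumed uniform sampling)."""
    h = np.arange(1, n_harmonics + 1)
    return np.round(bins_per_octave * np.log2(h)).astype(int)

def harmonic_sum(log_spectrum, bins_per_octave, n_harmonics=8):
    """Cross-correlate the spectrum with a sparse harmonic comb.

    score[k] is the weighted sum of the spectrum at the bins where
    the harmonics of a fundamental located at bin k would fall.
    """
    spec = np.asarray(log_spectrum, dtype=float)
    n = spec.size
    offsets = harmonic_offsets(bins_per_octave, n_harmonics)
    # Assumption: 1/h weights, so that a candidate one octave too low
    # (which only matches on its even harmonics) scores lower.
    weights = 1.0 / np.arange(1, n_harmonics + 1)
    score = np.zeros(n)
    for off, w in zip(offsets, weights):
        if off >= n:
            break
        # Shift the spectrum down by `off` bins and accumulate.
        score[:n - off] += w * spec[off:]
    return score

# Synthetic test input: six equal-amplitude harmonics, no noise.
bins_per_octave = 48
f0_bin = 100
spectrum = np.zeros(480)
spectrum[f0_bin + harmonic_offsets(bins_per_octave, 6)] = 1.0

score = harmonic_sum(spectrum, bins_per_octave)
print(int(np.argmax(score)))  # bin of the strongest candidate
```

On this idealised input the strongest candidate is the bin of the
fundamental. A comb with only positive taps does nothing to sharpen
wide input peaks, so on real data the result may still look blurry.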
As for what it would be used on, it could be any type of input sound
really. The log-FFT might represent 30 oboes playing simultaneously,
or just a bunch of noise, or some anharmonic sound like a bell
(whose "pitch" I don't want to detect).
Ultimately I don't even need to isolate and determine precisely how
many different discrete fundamental pitches are present, it's more
about getting a cepstral visualisation. Also I don't necessarily have
access to the original sound nor the original linear-scale FFT, so
really all I have to base the analysis on is log-scale FFTs.
Thanks in advance,
Michel Rouzic