Re: How do I optimize filter coefficient bit length and signal bitlength?
On May 20, 3:04 pm, "From Sweden" <s...@sw.se> wrote:
> Hello all
>
> I have made an 8 channel 500kHz low pass IIR-filter in VHDL. The filter uses
> 32 bits for its coefficients and 32 bits for its internal signals.
>
> The filter doesn't give the same DC-gain for small vs. large input signals.
> I suspect the internal truncation of the intermediate sums and states
> affects this.
>
> But today I thought about increasing the bits for the signal and decreasing
> the bits for the coefficients. I tried it out and the filter gave better
> gain over different input signal levels.
>
> Now I wonder how I should optimize the coefficient and signal bit lengths to
> get the best result?

32 bits oughta be enough for nearly any application. a quantization
error of 1 part outa 4 billion (2^32)? i mean, holy crap!!!

the consequences of quantizing coefficients are different from those of
quantizing the signal (or some internal intermediate signal).
quantizing coefficients means that the filter you get is not precisely
the one that you designed. the poles and zeros didn't go exactly to
where you wanted them to go. but with 32-bits it should easily be
close enough. how the coefs map to the poles and zeros depends on the
filter topology. what topology are you using? Direct Form 1 or
Direct Form 2 or Lattice or Normalized Ladder or some other? (i think
there is a Gold-Rader form, there's a bunch of them, some of which
have an internal All-pass filter that the rest of the thing is built
around. i am a partisan for the Direct Form 1 in fixed-point
applications.) what you do is solve for the pole and zero locations as
a function of the coefs (that get quantized) and see how far the coef
quantization moves each pole and zero. but dividing each of the two
dimensions of the z-plane around the unit circle up into 4 billion
slices should be more than good enough.
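
a rough sketch of that check, for one 2nd-order section whose denominator is z^2 + a1*z + a2 (the Direct Form 1 and 2 case). the pole radius and angle, the fractional bit counts, and the helper names (`quantize`, `poles`, `pole_error`) are all assumptions made up for illustration, not values from the original filter:

```python
import cmath
import math


def quantize(x, frac_bits):
    """Round x to the nearest multiple of 2**-frac_bits (fixed-point grid)."""
    scale = 1 << frac_bits
    return round(x * scale) / scale


def poles(a1, a2):
    """Roots of z^2 + a1*z + a2, the denominator of a direct-form biquad."""
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    return (-a1 + disc) / 2.0, (-a1 - disc) / 2.0


def pole_error(radius, angle, frac_bits):
    """Distance from the designed upper pole to the nearest pole obtained
    after a1 and a2 are quantized to frac_bits fractional bits."""
    a1 = -2.0 * radius * math.cos(angle)  # designed coefficients
    a2 = radius * radius
    ideal = cmath.rect(radius, angle)     # designed pole (upper half plane)
    p_plus, p_minus = poles(quantize(a1, frac_bits), quantize(a2, frac_bits))
    return min(abs(p_plus - ideal), abs(p_minus - ideal))


if __name__ == "__main__":
    # hypothetical low-frequency pole pair close to z = 1
    for bits in (8, 14, 22, 30):
        print(bits, pole_error(0.999, 0.01, bits))
```

poles near z = 1 are the sensitive case for the direct forms, so a low cutoff relative to the sample rate is where coefficient word length matters most. even there, something like 30 fractional bits leaves the poles very close to where they were designed.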

consequences of quantizing the signal can range from an additive noise
model (if the signal amplitude is much, much larger than the
quantization level) to all sorts of nasties (harmonic distortion,
limit cycles). triangular PDF additive dither of 2 LSB peak-to-peak is
sufficient to get rid of that stuff. i would think that with 32 bit
words, simple 1st-order noise shaping (with a zero at DC) would
suffice (no dither necessary). this particular error or
noise shaping requires one extra state in the DF1 and has been called
"fraction saving" and Randy Yates has written about it recently in the
IEEE Sig Proc magazine.
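
a minimal sketch of the fraction saving idea, on a one-pole lowpass instead of the full filter. everything specific here is an assumption for illustration: the pole at 0.99, the 16 fractional coefficient bits, the integer test inputs, and the function name `dc_output`. the bits that truncation would throw away are kept in one extra state and added back into the next accumulation, which puts a zero at DC in the quantization error:

```python
FRAC_BITS = 16                        # assumed coefficient fraction length
ONE = 1 << FRAC_BITS
P_Q = round(0.99 * ONE)               # quantized pole coefficient
B0_Q = ONE - P_Q                      # input gain, so B0_Q + P_Q == ONE (DC gain 1)


def dc_output(x, fraction_saving, n_samples=5000):
    """Run y[n] = b0*x + p*y[n-1] on a constant non-negative integer
    input x and return the output after n_samples samples."""
    y = 0       # output state, integer (quantized) signal
    frac = 0    # saved fraction: the extra state for error feedback
    for _ in range(n_samples):
        acc = B0_Q * x + P_Q * y      # double-width accumulator
        if fraction_saving:
            acc += frac               # add back what was truncated last time
            frac = acc & (ONE - 1)    # keep the bits about to be dropped
        y = acc >> FRAC_BITS          # truncate to the signal word length
    return y


if __name__ == "__main__":
    for x in (10, 1000, 1000000):
        print(x, dc_output(x, False), dc_output(x, True))
```

with plain truncation this sketch shows the small-signal symptom from the question: a constant input of 10 never gets the output off zero, and an input of 1000 stalls at 900, while with the saved fraction the output settles at the input value in both cases.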
really, 32 bit words oughta be good enough.
r b-j