#3 06-01-2007, 04:56 AM

On May 31, 8:28 pm, "Bob" <[email protected]> wrote:
> "bngguy" <[email protected]> wrote in message
>
> news:[email protected] oups.com...
>
>
>
> > Hi,
> > I'm working on implementing an FIR Filter on a FPGA (Spartan 3E),
> > here's what i want to accomplish -->
>
> > The FIR Filter coefficients are generated on a host system using
> > LabView, these coefficients are written to a RAM / PROM on a DSP
> > card , the number of taps is constant but other parameters like
> > sampling frequency and cut off frequencies can change according to
> > requirements.
>
> > The FPGA reads these coefficients from the RAM / PROM and implements
> > the FIR Filter.There should be a single bit file that is downloaded to
> > configure the FPGA.
>
> > Any pointers in the right direction would be appreciated.
>
> > Thanks
> > Tim
>
> There's an echo in here.
>
> Bob

Hi Tim,

Your post looks like a previous post, but perhaps you didn't get the
response you were looking for ...

So, here's a more detailed response.

Spartan 3E devices range from:
4-36 hardware multipliers
2K-33K logic cells

Your choices for FIR implementation are:
Distributed Arithmetic FIR Filters
Multiplier Based FIR Filters

Distributed Arithmetic (DA) tends to give small, high-speed filters,
but changing the coefficients is harder (you have to recalculate the
ROM LUT values from the coefficients every time they change).
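To give a feel for what "calculate ROM LUT values from the coefficients" means, here's a rough sketch in Python (not HDL, just the arithmetic). The function name build_da_lut and the example coefficients are made up for illustration, and it assumes the coefficients are already quantized to integers. Each LUT address is a bit pattern selecting which coefficients get summed.

```python
def build_da_lut(coeffs):
    """Return 2**N partial sums for N integer coefficients.

    Entry `addr` holds the sum of every coeffs[k] whose bit k is set in addr.
    """
    n = len(coeffs)
    lut = []
    for addr in range(1 << n):
        total = 0
        for k in range(n):
            if (addr >> k) & 1:
                total += coeffs[k]
        lut.append(total)
    return lut

# Hypothetical quantized coefficients, for illustration only.
coeffs = [3, -1, 4, 2]
lut = build_da_lut(coeffs)
```

The table doubles in size with every tap, so this only makes sense for small groups of taps. As I understand it, real DA filters split a long filter into small groups and add the group outputs together.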

Multiplier based FIR structures can take advantage of the built-in
multipliers, and are much easier to reload, but there's a limited
number of multipliers.

From a flexibility standpoint, multiplier based FIR filters are a bit
more flexible than DA FIR structures. Multiplier structures (MAC)
range (in area vs. performance) from N multipliers at 1 clock cycle per
computation, to 1 multiplier at N clocks per computation (where N = the
number of coefficients).
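The area vs. speed trade-off for the MAC structures can be modeled roughly like this. It's a Python sketch of the scheduling only, not a hardware description; the names clocks_per_output and mac_fir_output and the sample data are my own inventions, and it assumes every multiplier produces one product per clock.

```python
import math

def clocks_per_output(num_taps, num_multipliers):
    """Clock cycles per output sample when the taps share the multipliers."""
    if num_multipliers < 1:
        raise ValueError("need at least one multiplier")
    return math.ceil(num_taps / num_multipliers)

def mac_fir_output(window, coeffs, num_multipliers):
    """Compute one output sample, doing at most num_multipliers products per clock."""
    acc = 0
    clocks = 0
    for start in range(0, len(coeffs), num_multipliers):
        stop = min(start + num_multipliers, len(coeffs))
        for k in range(start, stop):
            acc += window[k] * coeffs[k]
        clocks += 1  # one clock per batch of parallel products
    return acc, clocks

# Made-up sample window and coefficients, N = 8 taps.
window = [1, 2, 3, 4, 5, 6, 7, 8]
coeffs = [1, 1, 2, 2, 3, 3, 4, 4]
fully_parallel = mac_fir_output(window, coeffs, 8)  # N multipliers
fully_serial = mac_fir_output(window, coeffs, 1)    # 1 multiplier
```

Both calls give the same sum; only the clock count changes. Anything in between (say 3 multipliers for 8 taps) lands at ceil(N / multipliers) clocks.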

DA FIR structures range in computation rate from 1 clock (fully
parallel) to M clocks (fully serial, where M = input bit width). Input
bit widths commonly run as high as 16 bits, so the serial case usually
tops out at around 16 clocks.
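Here's a small Python model of the fully serial case, to show where the M clocks come from. It's only a sketch: da_fir_output and the sample values are hypothetical, the inputs are assumed to be M-bit two's complement integers, and the coefficients are assumed to be integers. One bit of every input sample is processed per clock, and the sign bit gets subtracted instead of added.

```python
def da_fir_output(window, coeffs, width):
    """Bit-serial distributed arithmetic: one input bit plane per clock."""
    n = len(coeffs)
    # LUT entry `addr` = sum of the coefficients whose bit is set in addr.
    lut = [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
           for addr in range(1 << n)]
    acc = 0
    clocks = 0
    for bit in range(width):
        # Gather bit `bit` of every sample into one LUT address.
        addr = 0
        for k, x in enumerate(window):
            addr |= ((x >> bit) & 1) << k
        term = lut[addr] << bit
        if bit == width - 1:
            acc -= term  # the sign bit carries negative weight
        else:
            acc += term
        clocks += 1
    return acc, clocks

# Made-up 8-bit signed samples and integer coefficients.
window = [5, -3, 7, -8]
coeffs = [3, -1, 4, 2]
result, clocks = da_fir_output(window, coeffs, 8)
expected = sum(x * c for x, c in zip(window, coeffs))
```

Note there isn't a single multiply in the loop, just LUT lookups, shifts, and adds, which is why DA maps so well onto plain logic cells.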

Naturally, there's some wiggle room in both of the above paragraphs,
as it's possible to take advantage of coefficient symmetry to cut the
number of multipliers in half for the MAC based FIR filters. In serial
distributed arithmetic FIR filters, symmetry halves the number of ROM
LUTs at the cost of one more clock cycle. There are other tricks, such
as polyphase decomposition for interpolation and decimation, which can
also reduce multiplier and ROM LUT usage.
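The symmetry trick for the MAC case looks roughly like this in Python (a sketch only; symmetric_fir_output and the numbers are made up for illustration). Since coeffs[k] equals coeffs[N-1-k] in a symmetric filter, you add the two matching samples first and multiply once.

```python
def symmetric_fir_output(window, half_coeffs, num_taps):
    """One output of a symmetric FIR, using a pre-adder before each multiply.

    half_coeffs holds only the first ceil(num_taps / 2) coefficients.
    """
    acc = 0
    multiplies = 0
    for k in range(num_taps // 2):
        # Samples that share a coefficient are added before the multiply.
        acc += (window[k] + window[num_taps - 1 - k]) * half_coeffs[k]
        multiplies += 1
    if num_taps % 2:
        # An odd-length filter has one unpaired center tap.
        mid = num_taps // 2
        acc += window[mid] * half_coeffs[mid]
        multiplies += 1
    return acc, multiplies

# Made-up symmetric 5-tap filter and sample window.
full_coeffs = [1, 3, 5, 3, 1]
window = [2, -1, 4, 0, 6]
result, multiplies = symmetric_fir_output(window, full_coeffs[:3], 5)
direct = sum(x * c for x, c in zip(window, full_coeffs))
```

Here 5 taps need only 3 multiplies. The cost is one extra bit of width on the multiplier input, because of the pre-add.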

Xilinx provides a distributed arithmetic FIR filter generator as part
of ISE, and it produces good results, but since it's basically a black
box, you'll be dependent on the vendor and may have to perform gate
level simulation.

You may be interested in looking at a new clear-text, human-readable,
Verilog-based FIR filter generator from Optunis (
http://www.optunis.com/fir_hdl_write...iter_info.html ),
which also generates a testbench for impulse, step, and random
response. It's new (still in beta) and uses the hard multipliers
built into Spartan 3E.

Best of Luck,

Tony