"mikel" <

[email protected]> wrote in message

news:

[email protected] oups.com...

> John,
>
> Actually, the LUT/ROM is replicated twice as much as I said before (18
> times 1st stage, 22 times 2nd stage). Synthesis tool was smart enough
> to reduce size of ROMs memory bits from 35840 bits to 25600 bits (there
> are few identical values inside every ROM, synthesis tool placed
> additional decoding logic for input address to reduce memory size). But
> this is still too much.
>
>> You have 20 different addresses for the 20 replications, correct?
>
> yes, I have different address for every ROM access, and I need to
> access all ROMs at the same clock cycle for performance.
>
>> Which FPGA family are you using?
>
> I want design to be generic, though I ordered Virtex 2Pro board from
> Digilent so this will be my target.
>
> Michal K

If you have 40 different 6-bit addresses for 40 different 64x14 ROMs, I
don't see how you can do better than 40 instances * 4 LUTs * 14 bits =
2240 LUTs (or 280 CLBs in your current architecture). Implementing each
ROM with fewer than 4 LUTs per bit would be possible for some 6-input,
1-output functions. Each ALM in the Stratix-II series (roughly equivalent
to a Xilinx slice, but with twice the LUT size) can provide a 64x1 ROM.
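To make that arithmetic easy to re-run with other sizes, here is a rough
sketch in Python. It only counts the LUTs that hold ROM bits, not the
MUXF5/MUXF6 output muxes. The helper name luts_per_rom is made up for
illustration, and LUTS_PER_CLB = 8 is my assumption for the Virtex-II Pro
CLB (4 slices of 2 LUTs each).

```python
import math

def luts_per_rom(depth, width, lut_inputs=4):
    """LUTs holding the bits of one depth x width ROM (output muxes not counted)."""
    bits_per_lut = 2 ** lut_inputs            # a k-input LUT stores 2**k bits
    return math.ceil(depth / bits_per_lut) * width

INSTANCES = 40          # 18 first-stage + 22 second-stage copies (quoted post)
DEPTH, WIDTH = 64, 14   # 6-bit address, 14-bit data
LUTS_PER_CLB = 8        # assumption: 4 slices x 2 LUTs per Virtex-II Pro CLB

total_luts = INSTANCES * luts_per_rom(DEPTH, WIDTH)
total_clbs = math.ceil(total_luts / LUTS_PER_CLB)
# Treat each ALM as one 64x1 ROM, i.e. a 6-input LUT
total_alms = INSTANCES * luts_per_rom(DEPTH, WIDTH, lut_inputs=6)

print(total_luts, total_clbs, total_alms)     # 2240 280 560
```

The 560 ALM figure is just 40 instances * 14 bits at one ALM per 64x1 ROM.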

You could use a BlockRAM to provide 2 ports of 14 bits each (up to 36 bits
available), with each port displacing 56 LUTs (4 LUTs * 14 bits). The
4.5 kbit Altera M4K blocks would be more "efficient" since only 64 entries
are needed in your application and there are typically many more M4K
blocks than BlockRAMs in equivalent Altera vs. Xilinx devices.
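A quick way to see what the dual-port option buys, again as a hedged
Python sketch. It assumes that all copies within a stage hold the same
contents, so that two lookups can share one dual-port memory. If the
copies differ, the sharing does not apply. The name brams_needed is
hypothetical.

```python
import math

LOOKUPS = {"stage1": 18, "stage2": 22}   # replication counts from the quoted post
PORTS_PER_BRAM = 2                        # one lookup per port per clock
LUTS_PER_LOOKUP = 4 * 14                  # 4 LUTs per bit x 14 bits = 56

def brams_needed(lookups, ports=PORTS_PER_BRAM):
    """Block RAMs needed if each port serves one lookup per clock cycle."""
    return math.ceil(lookups / ports)

brams = {stage: brams_needed(n) for stage, n in LOOKUPS.items()}
total_brams = sum(brams.values())
luts_displaced = sum(LOOKUPS.values()) * LUTS_PER_LOOKUP

print(brams, total_brams, luts_displaced)
# {'stage1': 9, 'stage2': 11} 20 2240
```

So under that assumption, 20 block memories would stand in for all 2240
LUTs, at the cost of using only 64 of each memory's entries.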

It's quite possible you could time-multiplex your 14-bit lookups at 2x, 3x,
even 4x your main design speed, since the ROM lookup time as implemented in
distributed CLB SelectRAM is one LUT plus MUXF5 plus MUXF6, which is
roughly 2 levels of logic or less in a pipelined implementation.
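To put numbers on the time-multiplexing idea, here is a small Python
sketch. It assumes the copies within a stage share contents (so one
physical ROM can serve several lookups in turn) and it ignores the extra
address muxes and holding registers that the faster clock domain would
need. The function name physical_roms is made up.

```python
import math

STAGES = (18, 22)        # lookups per stage, from the quoted post
LUTS_PER_ROM = 4 * 14    # one 64x14 ROM in 4-input LUTs

def physical_roms(clock_multiple, stages=STAGES):
    """ROM copies needed when each copy serves clock_multiple lookups per main cycle."""
    return sum(math.ceil(n / clock_multiple) for n in stages)

for k in (1, 2, 3, 4):
    copies = physical_roms(k)
    print(k, copies, copies * LUTS_PER_ROM)
# 1 40 2240
# 2 20 1120
# 3 14 784
# 4 11 616
```

Whether the 2x to 4x clock is actually reachable depends on the rest of
the design and the target device.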

The bottom line is that you have to pull out 40 unique 14-bit values. If
there is no convenient way to reduce the uniqueness, the replication has to
be there.

What does help is that each LUT or LE can give you 16 bits of ROM. Each ALM
can give you 64 bits of ROM.