FPGA Central - World's 1st FPGA / CPLD Portal


FPGA comp.arch.fpga newsgroup (usenet)

  #1 (permalink)  
Old 11-03-2007, 05:44 PM
G Iveco
Guest
 
Posts: n/a
Default How do I meet this memory IO with least resources on FPGA?

Hi, there

My design needs a 16x16 matrix of 32-bit elements. The matrix must
be read row by row or column by column, one full row or column per clock.

Direct register implementation takes a lot of resources and routing can
be difficult. Will 256 pieces of 32-bit RAM with single address work
in Xilinx?

TIA!



  #2 (permalink)  
Old 11-03-2007, 06:58 PM
KJ
Guest
 
Posts: n/a
Default Re: How do I meet this memory IO with least resources on FPGA?


"G Iveco" <[email protected]> wrote in message
news:[email protected]
> Hi, there
>
> My design needs a 16X16 matrix, each of 32-bit. The matrix must
> be read row by row or column by column each in one clock..
>
> Direct register implementation takes a lot of resources and routing can
> be difficult. Will 256 pieces of 32-bit RAM with single address work
> in Xilinx?
>


Yes. You write the code to implement whatever logic function you require.

Start writing some code and simulating it until it's functionally working
the way you want it to be. As a BACKGROUND task start running your code
through the synthesis process and look to see how what you've written is
being implemented and what sort of clock cycle performance you can expect.
If it's not to your liking then start perusing for other more elegant ways
of implementing your logic but don't get so focused on the synthesis task
that you forget to handle the primary task which is to get functionally
correct code.

KJ


  #3 (permalink)  
Old 11-03-2007, 08:08 PM
G Iveco
Guest
 
Posts: n/a
Default Re: How do I meet this memory IO with least resources on FPGA?


"KJ" <[email protected]> wrote in message
news:[email protected] net...
>
>
> Yes. You write the code to implement whatever logic function you require.
>
> Start writing some code and simulating it until it's functionally working
> the way you want it to be. As a BACKGROUND task start running your code
> through the synthesis process and look to see how what you've written is
> being implemented and what sort of clock cycle performance you can expect.
> If it's not to your liking then start perusing for other more elegant ways
> of implementing your logic but don't get so focused on the synthesis task
> that you forget to handle the primary task which is to get functionally
> correct code.
>
> KJ
>


Thank you KJ.
I understand that in large designs simulation and synthesis had better go
concurrently, to make sure the design passes both steps nicely.

But my question is about memory-based systems.
For a very large memory, a register implementation takes N times more
silicon than RAM.

For a small memory, RAM has overheads (read/write logic, sensing,
amplifiers, etc.) which may be equivalent to a few hundred registers in
terms of silicon and power.

So in the second case, how large is this RAM overhead compared to a
32-bit register in Xilinx?

If there are good comparisons, then I can skip the trouble of testing.



  #4 (permalink)  
Old 11-03-2007, 08:33 PM
Nico Coesel
Guest
 
Posts: n/a
Default Re: How do I meet this memory IO with least resources on FPGA?

"G Iveco" <[email protected]> wrote:

>Hi, there
>
>My design needs a 16X16 matrix, each of 32-bit. The matrix must
>be read row by row or column by column each in one clock..
>
>Direct register implementation takes a lot of resources and routing can
>be difficult. Will 256 pieces of 32-bit RAM with single address work
>in Xilinx?


I'd suggest using the LUT rams as much as possible. Look in the
datasheet. AFAIK Xilinx is one of the few FPGA vendors that has RAM in
the logic slices. If you use these lut rams in a smart way, you can
cram many times more logic in a device with lut ram than in a device
without lut ram.

--
Reply to [email protected] (punt=.)
Bedrijven en winkels vindt U op www.adresboekje.nl
  #5 (permalink)  
Old 11-03-2007, 09:25 PM
KJ
Guest
 
Posts: n/a
Default Re: How do I meet this memory IO with least resources on FPGA?


"G Iveco" <[email protected]> wrote in message
news:[email protected]
>
>
> But my question is about memory-based systems.
> For a very large memory, a register implementation takes N times more
> silicon than RAM.
>
> For a small memory, RAM has overheads (read/write logic, sensing,
> amplifiers, etc.) which may be equivalent to a few hundred registers in
> terms of silicon and power.
>
> So in the second case, how large is this RAM overhead compared to a
> 32-bit register in Xilinx?
>

But it really doesn't matter. When you follow the proper template, your
code can be synthesized to use internal RAM or LUTs. That decision will be
made by the synthesis tool. So look up the form of VHDL that will infer
memory, write your code in that fashion, avoid use of wizards and such, and
your code will synthesize to fit into the resources that are on the chip.
It makes no difference whether the memory gets implemented in logic cells or
memory arrays as long as it
- implements the intended function
- meets the performance requirements
- fits in the targeted device.

> If there are good comparisons, then I can skip the trouble of testing..
>

Testing which 'method' is better is pointless. Write code that can be
inferred properly for the targeted part and leave the rest for the tools to
implement.

KJ


  #6 (permalink)  
Old 11-04-2007, 03:32 AM
Peter Alfke
Guest
 
Posts: n/a
Default Re: How do I meet this memory IO with least resources on FPGA?

On Nov 3, 12:33 pm, [email protected] (Nico Coesel) wrote:
> "G Iveco" <[email protected]> wrote:
> >Hi, there

>
> >My design needs a 16X16 matrix, each of 32-bit. The matrix must
> >be read row by row or column by column each in one clock..

>
> >Direct register implementation takes a lot of resources and routing can
> >be difficult. Will 256 pieces of 32-bit RAM with single address work
> >in Xilinx?

>
> I'd suggest using the LUT rams as much as possible. Look in the
> datasheet. AFAIK Xilinx is one of the few FPGA vendors that has RAM in
> the logic slices. If you use these lut rams in a smart way, you can
> cram many times more logic in a device with lut ram than in a device
> without lut ram.
>
> --
> Reply to [email protected] (punt=.)
> Bedrijven en winkels vindt U op www.adresboekje.nl


Here is my best-case estimate:
You obviously need 16 x 32 = 512 parallel outputs
In Virtex-5 each LUT can be used with 5 address bits and 2 outputs
( 32 x 2 RAM)
That means you need 256 LUTs = 32 CLBs. And nothing else.
This optimized packing requires that the software is smart enough to
configure the LUTs appropriately.
Worst-case, that is not yet the case, and you need 64 CLBs total. And
nothing else.
Even the small 'LX50 has 3600 CLBs total, (but not all of them can be
used as memory).
Just so you can have an educated guess.
Let the software do the crunching...
Peter Alfke
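
The Virtex-5 arithmetic in the estimate above can be checked with a few lines of Python. This is only a sketch of the counting argument, not a synthesis result. The figure of 8 LUTs per CLB is an assumption (two slices of four LUTs each), and the function name `luts_and_clbs` is made up for illustration.

```python
# Back-of-the-envelope LUT RAM count for reading one 16-word row per clock.
WORDS_PER_ROW = 16   # the matrix is 16 x 16
BITS_PER_WORD = 32   # each element is 32 bits
LUTS_PER_CLB = 8     # assumption: Virtex-5 CLB = 2 slices x 4 LUTs


def luts_and_clbs(outputs_per_lut):
    """Return (LUTs, CLBs) needed to drive every output bit of one row."""
    parallel_outputs = WORDS_PER_ROW * BITS_PER_WORD  # 512 output bits
    luts = parallel_outputs // outputs_per_lut
    clbs = luts // LUTS_PER_CLB
    return luts, clbs


# Best case: each LUT is packed as a 32 x 2 RAM (two outputs per LUT).
best = luts_and_clbs(outputs_per_lut=2)
# Worst case: the tools use only one output per LUT.
worst = luts_and_clbs(outputs_per_lut=1)

print("best case  (LUTs, CLBs):", best)
print("worst case (LUTs, CLBs):", worst)
```

Note that this counts storage for parallel row reads only. Any extra logic for column access or for the CLBs that cannot be used as memory is outside the estimate.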


  #7 (permalink)  
Old 11-04-2007, 05:03 AM
G Iveco
Guest
 
Posts: n/a
Default Re: How do I meet this memory IO with least resources on FPGA?


"Peter Alfke" <[email protected]> wrote in message
news:[email protected] ups.com...
> On Nov 3, 12:33 pm, [email protected] (Nico Coesel) wrote:
>> "G Iveco" <[email protected]> wrote:
>> >Hi, there

>>
>> >My design needs a 16X16 matrix, each of 32-bit. The matrix must
>> >be read row by row or column by column each in one clock..

>>
>> >Direct register implementation takes a lot of resources and routing can
>> >be difficult. Will 256 pieces of 32-bit RAM with single address work
>> >in Xilinx?

>>
>> I'd suggest using the LUT rams as much as possible. Look in the
>> datasheet. AFAIK Xilinx is one of the few FPGA vendors that has RAM in
>> the logic slices. If you use these lut rams in a smart way, you can
>> cram many times more logic in a device with lut ram than in a device
>> without lut ram.
>>
>> --
>> Reply to [email protected] (punt=.)
>> Bedrijven en winkels vindt U opwww.adresboekje.nl

>
> Here is my best-case estimate:
> You obviously need 16 x 32 = 512 parallel outputs
> In Virtex-5 each LUT can be used with 5 address bits and 2 outputs
> ( 32 x 2 RAM)
> That means you need 256 LUTs = 32 CLBs. And nothing else.
> This optimized packing requires that the software is smart enough to
> configure the LUTs appropriately.
> Worst-case, that is not yet the case, and you need 64 CLBs total. And
> nothing else.
> Even the small 'LX50 has 3600 CLBs total, (but not all of them can be
> used as memory).
> Just so you can have an educated guess.
> Let the software do the crunching...
> Peter
> Alfke
>
>


Thank you nico and Alfke.

I coded the matrix as registers in RTL, estimated the hardware
requirement, and found that my gigantic math module can fit in a Virtex 2,
V3000. It's old technology, though.

In Virtex 4, will the XC4VLX40 be able to handle this? The documentation
for the two families is organized differently, so only the slice count can
be used as a reference.

IOs are no issue here.




  #8 (permalink)  
Old 11-04-2007, 09:52 AM
Guest
 
Posts: n/a
Default Re: How do I meet this memory IO with least resources on FPGA?

On Nov 3, 4:44 pm, "G Iveco" <[email protected]> wrote:
> Hi, there
>
> My design needs a 16X16 matrix, each of 32-bit. The matrix must
> be read row by row or column by column each in one clock..
>
> Direct register implementation takes a lot of resources and routing can
> be difficult. Will 256 pieces of 32-bit RAM with single address work
> in Xilinx?
>
> TIA!



It may be possible to implement this structure using RAMs, and thus
far fewer resources, but that will depend on a couple of things.

First, does each 16x16 matrix need to be accessed using both row and
column addressing? Second, how is data written into the matrix: do
entire rows/columns need to be written in a single cycle as well?

The first requirement is easy to deal with. The second one, in
conjunction with the first, makes life difficult.

If you can handle writing data in one element at a time then you can
construct a pair of ram structures one of which handles column
addressing while the other handles row addressing. Both get written
with the same data.

Thanks,
Andy.
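
A behavioral sketch of the two-copy idea described above, in Python rather than HDL. It only models the data organization: one memory whose words are whole rows, one whose words are whole columns, and a single-element write that updates both. The class and method names are hypothetical, and a real design would still need per-element write enables on the wide words.

```python
class DualCopyMatrix:
    """Two copies of a 16x16 matrix, one per access direction (model only)."""

    N = 16

    def __init__(self):
        # row_ram[r] models one wide word holding all 16 elements of row r.
        self.row_ram = [[0] * self.N for _ in range(self.N)]
        # col_ram[c] models one wide word holding all 16 elements of column c.
        self.col_ram = [[0] * self.N for _ in range(self.N)]

    def write(self, r, c, value):
        # One element per "cycle"; both copies are written with the same data.
        self.row_ram[r][c] = value
        self.col_ram[c][r] = value

    def read_row(self, r):
        # A full row comes out of the row-organized copy in one access.
        return list(self.row_ram[r])

    def read_col(self, c):
        # A full column comes out of the column-organized copy in one access.
        return list(self.col_ram[c])


m = DualCopyMatrix()
for r in range(16):
    for c in range(16):
        m.write(r, c, r * 16 + c)  # element value encodes its position

print(m.read_row(3))
print(m.read_col(5))
```

The cost of this arrangement is that the storage is doubled, which is the trade made for keeping both read directions to a single access.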

  #9 (permalink)  
Old 11-04-2007, 04:46 PM
Peter Alfke
Guest
 
Posts: n/a
Default Re: How do I meet this memory IO with least resources on FPGA?

On Nov 3, 9:03 pm, "G Iveco" <[email protected]> wrote:
> "Peter Alfke" <[email protected]> wrote in message
>
> news:[email protected] ups.com...
>
>
>
> > On Nov 3, 12:33 pm, [email protected] (Nico Coesel) wrote:
> >> "G Iveco" <[email protected]> wrote:
> >> >Hi, there

>
> >> >My design needs a 16X16 matrix, each of 32-bit. The matrix must
> >> >be read row by row or column by column each in one clock..

>
> >> >Direct register implementation takes a lot of resources and routing can
> >> >be difficult. Will 256 pieces of 32-bit RAM with single address work
> >> >in Xilinx?

>
> >> I'd suggest using the LUT rams as much as possible. Look in the
> >> datasheet. AFAIK Xilinx is one of the few FPGA vendors that has RAM in
> >> the logic slices. If you use these lut rams in a smart way, you can
> >> cram many times more logic in a device with lut ram than in a device
> >> without lut ram.

>
> >> --
> >> Reply to [email protected] (punt=.)
> >> Bedrijven en winkels vindt U opwww.adresboekje.nl

>
> > Here is my best-case estimate:
> > You obviously need 16 x 32 = 512 parallel outputs
> > In Virtex-5 each LUT can be used with 5 address bits and 2 outputs
> > ( 32 x 2 RAM)
> > That means you need 256 LUTs = 32 CLBs. And nothing else.
> > This optimized packing requires that the software is smart enough to
> > configure the LUTs appropriately.
> > Worst-case, that is not yet the case, and you need 64 CLBs total. And
> > nothing else.
> > Even the small 'LX50 has 3600 CLBs total, (but not all of them can be
> > used as memory).
> > Just so you can have an educated guess.
> > Let the software do the crunching...

> > Peter Alfke
>
> Thank you nico and Alfke.
>
> I tried using registers coded by RTL and estimate the hardware
> requirement and found my gigantic math module can fit in a Virtex 2,
> V3000.. It's old technology though.
>
> In Virtex 4, will XC4VLX40 be able to handle this? The documentation
> of two specs are different, only Slice count can be used as references.
>
> IOs are no issue here.

Easy.
In Virtex-4, you again need 512 outputs, each driven by a 32-bit RAM.
In Virtex-4, each 32 x 1 RAM consists of two LUTs plus a free
multiplexer.
Call out RAM32x1S as shown in the "CLB Overview" fig 5-6 on page 219
of the Virtex-4 Handbook.
That consumes 1024 LUTs, or 128 CLBs.
The LX40 has 128 x 36 CLBs, which is 36 times more than you need.
Peter Alfke, Xilinx Applications
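
The Virtex-4 numbers in this post follow from the same counting argument, and can be reproduced as below. This is a sketch of the arithmetic only. The figure of 8 LUTs per CLB is an assumption (four slices of two LUTs each), and the count ignores any restriction on which slices can be configured as RAM.

```python
PARALLEL_OUTPUTS = 16 * 32  # one 16-word row of 32-bit elements = 512 bits
LUTS_PER_RAM32X1 = 2        # per the post: two LUTs plus a free multiplexer
LUTS_PER_CLB = 8            # assumption: Virtex-4 CLB = 4 slices x 2 LUTs
LX40_CLBS = 128 * 36        # CLB array size quoted in the post

luts = PARALLEL_OUTPUTS * LUTS_PER_RAM32X1  # one 32 x 1 RAM per output bit
clbs = luts // LUTS_PER_CLB
headroom = LX40_CLBS // clbs                # how many times over the LX40 fits it

print("LUTs:", luts)
print("CLBs:", clbs)
print("LX40 capacity / requirement:", headroom)
```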



  #10 (permalink)  
Old 11-05-2007, 12:53 AM
Alvin Andries
Guest
 
Posts: n/a
Default Re: How do I meet this memory IO with least resources on FPGA?


"G Iveco" <[email protected]> wrote in message
news:[email protected]
> Hi, there
>
> My design needs a 16X16 matrix, each of 32-bit. The matrix must
> be read row by row or column by column each in one clock..
>
> Direct register implementation takes a lot of resources and routing can
> be difficult. Will 256 pieces of 32-bit RAM with single address work
> in Xilinx?
>
> TIA!
>


>>IF<< you can guarantee that all R/W accesses will be row or column based,
you can get away with 16 BRAMs arranged in a diagonal pattern.

r0c0
r1c1
r2c2
...
r15c15

Regards,
Alvin.
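
One possible reading of the diagonal arrangement is a skewed bank mapping, sketched here in Python. The mapping itself is an assumption, not something spelled out in the post: element (r, c) is placed in bank (c - r) mod 16 at address r, so bank 0 holds r0c0, r1c1, ... r15c15. The check confirms that every row and every column touches all 16 banks exactly once, which is what allows a full row or column to be accessed in one cycle.

```python
N = 16  # 16 banks, 16 x 16 matrix


def bank_and_addr(r, c):
    """Assumed skew: element (r, c) lives in bank (c - r) % N at address r."""
    return (c - r) % N, r


def banks_for_row(r):
    # Banks touched when reading all 16 elements of row r.
    return [bank_and_addr(r, c)[0] for c in range(N)]


def banks_for_col(c):
    # Banks touched when reading all 16 elements of column c.
    return [bank_and_addr(r, c)[0] for r in range(N)]


# An access is conflict-free if its 16 elements fall in 16 distinct banks.
conflict_free = all(
    len(set(banks_for_row(i))) == N and len(set(banks_for_col(i))) == N
    for i in range(N)
)

print("bank 0 holds:", [(r, c) for r in range(N) for c in range(N)
                        if bank_and_addr(r, c)[0] == 0][:3], "...")
print("all row and column accesses conflict-free:", conflict_free)
```

With this mapping a row read presents the same address to every bank, while a column read presents a different address to each bank, and the bank outputs come out rotated, so some address generation and reordering logic is still needed around the BRAMs.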

