FPGA Central - World's 1st FPGA / CPLD Portal

FPGA comp.arch.fpga newsgroup (usenet)

#1 | gretzteam | 02-24-2005, 10:26 PM
Fast 28x28 multiplier + adder in Virtex4

Hi,
We are using a Virtex4 FPGA to prototype a DSP processor to be
implemented in an ASIC. We are using the ISE flow and everything works
fine except that we can't prototype at full speed. We are only able to
run at about 65MHz, which is far from the 150MHz target. The longest
combinational path is in the MAC, which contains a 28x28 multiplier
followed by a 56-bit adder. I created the multiplier and the adder
using Core Generator.

Is there a way to speed this up? The Virtex4 has those XtremeDSP
slices, but I can't find a way to make good use of them, since our
datapath is so large.

Thank you,
David

#2 | glen herrmannsfeldt | 02-24-2005, 10:42 PM

gretzteam wrote:

> We are using a Virtex4 FPGA to prototype a DSP processor to be
> implemented in an ASIC. We are using the ISE flow and everything works
> fine except that we can't prototype at full speed. We are only able to
> run at about 65MHz, which is far from the 150MHz target. The longest
> combinational path is in the MAC, which contains a 28x28 multiplier
> followed by a 56-bit adder. I created the multiplier and the adder
> using Core Generator.

> Is there a way to speed this up? The Virtex4 has those XtremeDSP
> slices, but I can't find a way to make good use of them, since our
> datapath is so large.


Virtex4 has 18x18 multiplier hardware. Your 28x28 multiplier can be
built from those blocks, but you need to pipeline it and also add a
pipeline stage before the adder. My guess is that this gets you to
150MHz, but you will have to try it to find out.

-- glen
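
As a rough illustration of how a wide multiply can be built from narrower hardware multipliers, the Python sketch below splits each unsigned 28-bit operand into two 14-bit halves and recombines four partial products. The 14-bit split point, the unsigned operands, and the function name `mul28_from_partials` are assumptions made for illustration; this is not a description of what Core Generator actually produces.

```python
import random

HALF_BITS = 14                    # assumed split point: 28 bits = 2 x 14 bits
HALF_MASK = (1 << HALF_BITS) - 1  # a 14-bit half fits well inside an 18x18 multiplier


def mul28_from_partials(a: int, b: int) -> int:
    """Multiply two unsigned 28-bit values using four 14x14 partial products."""
    assert 0 <= a < (1 << 28) and 0 <= b < (1 << 28)
    a_lo, a_hi = a & HALF_MASK, a >> HALF_BITS
    b_lo, b_hi = b & HALF_MASK, b >> HALF_BITS

    # Four narrow multiplies; in hardware each could be its own multiplier block.
    p_ll = a_lo * b_lo
    p_lh = a_lo * b_hi
    p_hl = a_hi * b_lo
    p_hh = a_hi * b_hi

    # Shift each partial product to its weight and sum them.
    return p_ll + ((p_lh + p_hl) << HALF_BITS) + (p_hh << (2 * HALF_BITS))


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(1000):
        x, y = rng.getrandbits(28), rng.getrandbits(28)
        assert mul28_from_partials(x, y) == x * y
    print("all partial-product checks passed")
```

In hardware, registers placed after the narrow multiplies and between the partial sums would supply the pipeline stages mentioned above.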

#3 | Kevin Neilson | 02-24-2005, 10:47 PM

gretzteam wrote:
> Hi,
> We are using a Virtex4 FPGA to prototype a DSP processor to be
> implemented in an ASIC. We are using the ISE flow and everything works
> fine except that we can't prototype at full speed. We are only able to
> run at about 65MHz, which is far from the 150MHz target. The longest
> combinational path is in the MAC, which contains a 28x28 multiplier
> followed by a 56-bit adder. I created the multiplier and the adder
> using Core Generator.
>
> Is there a way to speed this up? The Virtex4 has those XtremeDSP
> slices, but I can't find a way to make good use of them, since our
> datapath is so large.
>
> Thank you,
> David
>


If you use the XtremeDSP slices properly, with all of their dedicated
interconnects, you should be able to do a 34x34 multiply using 4
pipelined slices at full rate (450-500MHz, depending upon part speed).
You might need an extra two slices to do the 56-bit accumulate. Look
for the "XtremeDSP Design Considerations" guide on the Xilinx site;
it describes how to do this. I'm not sure exactly what CoreGen is
producing, but it might not be completely optimized. It might be using
CLB fabric for some of the operations.
-Kevin

#4 | gretzteam | 02-25-2005, 03:47 PM

Right now I'm not using anything fancy. I created a 28x28 multiplier
and a 56-bit adder with CoreGen and wired them together. I used the
multiplier component, and it is supposed to use the XtremeDSP slices.
Maybe it is not smart enough to make use of the dedicated
interconnects. I will look at the "XtremeDSP Design Considerations" guide.
Thank you,
David

#5 | Falk Brunner | 02-25-2005, 05:08 PM


> Right now I'm not using anything fancy. I created a 28x28 multiplier


Pipelining is the magic word (CoreGen calls it registered inputs and
outputs).

Regards
Falk



#6 | gretzteam | 02-25-2005, 10:00 PM

I can't really use pipelining here. The MAC is all combinational; I
receive inputs at time 0, and I need an answer by time x. I don't see
how pipelining would help.
Thanks,
Dave

#7 | glen herrmannsfeldt | 02-25-2005, 10:42 PM

gretzteam wrote:
> I can't really use pipelining here. The MAC is all combinational; I
> receive inputs at time 0, and I need an answer by time x. I don't see
> how pipelining would help.


What is x?

If x is one clock cycle then you need either faster logic or
a lot more of it. I believe this can be done easily with a
three cycle pipeline, so that you get an answer out every cycle,
with each one taking three cycles.

-- glen

#8 | David | 02-27-2005, 03:04 AM

Hi,
I guess I don't understand something about pipelining. In my case, the
whole system runs on the master clock, which I would like to be 100MHz or
more. Right now, the whole MAC unit is combinational logic and needs
to produce an answer for each clock cycle (time x=1/100MHz). Are you
guys saying that if I ran the MAC at 3 times the master clock
(300MHz) with a three-stage pipeline, I could compute the answer fast
enough?

Thanks,
David

glen herrmannsfeldt wrote:
> gretzteam wrote:
> > I can't really use pipelining here. The MAC is all combinational; I
> > receive inputs at time 0, and I need an answer by time x. I don't see
> > how pipelining would help.
>
> What is x?
>
> If x is one clock cycle then you need either faster logic or
> a lot more of it. I believe this can be done easily with a
> three cycle pipeline, so that you get an answer out every cycle,
> with each one taking three cycles.
>
> -- glen

#9 | Marc Randolph | 02-27-2005, 05:14 AM

David wrote:
> Hi,
> I guess I don't understand something about pipelining. In my case, the
> whole system runs on the master clock, which I would like to be 100MHz or
> more. Right now, the whole MAC unit is combinational logic and needs
> to produce an answer for each clock cycle (time x=1/100MHz). Are you
> guys saying that if I ran the MAC at 3 times the master clock
> (300MHz) with a three-stage pipeline, I could compute the answer fast
> enough?


Howdy David,

Using different terms, let's try another analogy on this Saturday:
imagine an automobile assembly line. It puts out a certain number of
cars per hour. If you add another step in the assembly process, you
can still get the same number of cars per hour out - it just takes a
little longer for it to roll off the assembly line. Circuits work the
same way.

If your main requirement is to be able to handle a certain number of
calculations per second, you can possibly break the calculations up
into smaller parts which are easier to do in series: rather than doing
a multiply and an accumulate in the same cycle, do the multiply in one
cycle, and the addition in the next cycle. While the accumulation is
occurring during this 2nd cycle, the 2nd piece of data is being
multiplied. On the 3rd cycle, the 2nd piece of data is now in the
accumulator and a 3rd piece of data enters the multiplier. You get the
same number of calculations per second out of the circuit (or perhaps
even more, since you can meet timing now!), but each result takes 20 ns
rather than 10 ns. If you can't stand the extra delay, then you may need to
up the clock rate (and then you will sure enough have to pipeline!).

Hope that helps,

Marc
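
The assembly-line picture above can be sketched as a small cycle-by-cycle model in Python. This is only an illustration under stated assumptions: the function name `run_two_stage`, the 10 ns clock period, and the use of a fixed addend in place of a real accumulator are all hypothetical simplifications, chosen to show that a two-stage pipeline still finishes one result per clock while each result takes two clocks.

```python
CLOCK_NS = 10  # assumed 100 MHz clock, matching the 10 ns / 20 ns figures above


def run_two_stage(pairs, addend):
    """Model a multiply stage followed by an add stage, one register between them.

    Returns (finish_cycle, value) for every input pair, in input order.
    """
    product_reg = None   # pipeline register between multiplier and adder
    finished = []
    # One extra idle cycle lets the last product drain through the adder.
    for cycle, pair in enumerate(list(pairs) + [None], start=1):
        # Stage 2: add, using the product captured on the previous cycle.
        if product_reg is not None:
            finished.append((cycle, product_reg + addend))
        # Stage 1: multiply the new operands; captured at the end of this cycle.
        product_reg = None if pair is None else pair[0] * pair[1]
    return finished


if __name__ == "__main__":
    data = [(2, 3), (4, 5), (6, 7)]
    for (cycle, value), (a, b) in zip(run_two_stage(data, addend=1), data):
        print(f"{a}*{b}+1 = {value}, ready at {cycle * CLOCK_NS} ns")
```

Each operand pair enters on its own cycle and its result appears one cycle later than it would from a single-cycle design, while consecutive results remain one clock apart.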

#10 | gretzteam | 03-01-2005, 04:08 PM

Hi,
I understand what you mean. However, I don't think it works in my case
because I have a loop (it is a MAC). In order to start the next
calculation, I need an answer to the previous one. I guess the only
solution is faster logic. I thought that a Virtex4 would be able to
give us that kind of calculation speed...

Dave

#11 | glen herrmannsfeldt | 03-01-2005, 06:19 PM

gretzteam wrote:

> I understand what you mean. However, I don't think it works in my case
> because I have a loop (it is a MAC). In order to start the next
> calculation, I need an answer to the previous one. I guess the only
> solution is faster logic. I thought that a Virtex4 would be able to
> give us that kind of calculation speed...


Unless the result from the accumulator is fed back as an input to the
multiplier, it should pipeline just fine. Using the built-in
multipliers, it should be two or three stages. The answers
will come out, one per clock cycle, two or three clocks later.

-- glen
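
To make the point about the feedback loop concrete, here is a hedged Python model comparing a single-cycle MAC with one whose multiplier is pipelined. It assumes the accumulator output feeds only the adder and never the multiplier; the function names and the `mult_stages` parameter are hypothetical. The multiplier is modelled as a product passed through a chain of registers, which matches a real pipelined multiplier in timing behaviour only, not in internal structure.

```python
def combinational_mac(pairs):
    """Reference model: multiply and accumulate within the same cycle."""
    acc = 0
    history = []
    for a, b in pairs:
        acc += a * b
        history.append(acc)
    return history


def pipelined_mac(pairs, mult_stages=2):
    """MAC with a `mult_stages`-deep multiplier pipeline ahead of the accumulator.

    Only the adder sits inside the feedback loop, so a new operand pair can
    enter on every clock. Returns the accumulator value after each clock.
    """
    assert mult_stages >= 1
    pipe = [0] * mult_stages        # multiplier pipeline registers (0 = bubble)
    acc = 0
    history = []
    stream = list(pairs) + [(0, 0)] * mult_stages   # idle cycles drain the pipe
    for a, b in stream:
        acc += pipe[-1]             # feedback loop: accumulator plus oldest product
        pipe = [a * b] + pipe[:-1]  # products advance one register per clock
        history.append(acc)
    return history


if __name__ == "__main__":
    data = [(1, 2), (3, 4), (5, 6)]
    stages = 2
    # The same running sums appear, just `stages` clocks later.
    assert pipelined_mac(data, stages)[stages:] == combinational_mac(data)
    print(pipelined_mac(data, stages))
```

The running sums are identical in both models; the pipelined version simply delivers them a fixed number of clocks later, which is the behaviour described above.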

All times are GMT +1.