11-22-2005, 11:53 PM
Antti Lukats
Re: Disabling Xilinx clock enable usage...

"John_H" <[email protected]> wrote in message
news:[email protected]
> "Antti Lukats" <[email protected]> wrote in message
> news:[email protected]
>>
>> "johnp" <[email protected]> wrote in message
>> news:[email protected] ps.com...
>>> I'm working on a high speed design in a Xilinx V2Pro and I'm running
>>> into a timing problem. Instead of packing logic into LUTs, XST wants
>>> to use the Enable signal in the CLB. To use the Enable, it needs to
>>> use an extra LUT to create the Enable signal, so I get routing delays
>>> and an extra CLB delay.
>>>
>>> Here's some sample code:
>>>
>>> reg [3:0] sig4;
>>> wire [3:0] sig3;
>>>
>>> always @(posedge clk)
>>>   if (sig1 & ~sig2)
>>>     sig4 <= sig3;
>>>
>>> Xilinx could fit this into 4 CLBs total by simply using the 4 LUTs and
>>> the 4 flip-flops. Each LUT would handle one bit of sig4.
>>>
>>> Instead, XST uses a LUT to create (sig1 & ~sig2), then feeds that
>>> output to the Enable pins on 4 flip-flops. I now get the delay through
>>> the LUT and routing delays to my flip-flops.
>>>
>>> Any way to tell XST to not use the Enable signal and force it to use
>>> the LUTs for this section of logic?
>>>
>>> Thanks!
>>>
>>> John Providenza
>>>

>>
>> Hi John,
>>
>> in your example XST does exactly what it should do given your code.
>>
>> if you want the synthesis to avoid using clock enable then you should
>> rewrite your code
>>
>> antti

>
> I would respectfully disagree.
>
> A decent synthesizer should *not* produce an extra level of logic with an
> actual increase in area unless - and it's hard to see this as the case -
> the extra fanout for a heavily loaded signal causes timing problems
> elsewhere in the design.
>
> In a properly constrained design, a decent synthesizer should *not*
> produce logic that violates the timing constraints if there's an available
> solution that meets the timing. Unfortunately we have to spend much of
> our time tuning things manually to get the "obvious" to happen.


Dear John (and John),

I am glad to see someone disagree with me once in a while, but the issue
isn't that simple.

The way XST synthesizes the example in the original posting DOES NOT add
extra delay, and in most cases it is the most effective implementation. The
flip-flops are fed either by a direct connection that bypasses the LUT in
their slice, or by logic that is packed into the same slice as the FF, in
which case the delay before the FF is absolutely minimal (LUT to FF in the
same slice). In most cases the delays on the clock enable path and on the
data path overlap to some degree instead of adding up, so the clock enable
version would be faster. That is, emulating the clock enable in the D input
would make one delay path longer and the overall timing worse.
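
To make the two alternatives concrete, here is a rough sketch of both forms in Verilog. The module names (ce_form, dmux_form) and the intermediate wires (en, d) are made up for illustration, and the comments on LUT/FF mapping describe the expected result rather than anything guaranteed by the tools:

```verilog
// Form A: what XST infers from the original code.
// One LUT builds the enable term; sig3 can go straight to the D inputs.
module ce_form (
    input  wire       clk,
    input  wire       sig1,
    input  wire       sig2,
    input  wire [3:0] sig3,
    output reg  [3:0] sig4
);
    wire en = sig1 & ~sig2;        // shared enable term, expected in one LUT

    always @(posedge clk)
        if (en)
            sig4 <= sig3;          // expected to use the FF clock enable pin
endmodule

// Form B: clock enable emulated in the D path.
// Each bit depends on sig1, sig2, sig3[i] and sig4[i], i.e. four inputs,
// so each bit can fit one 4-input LUT in front of its flip-flop.
module dmux_form (
    input  wire       clk,
    input  wire       sig1,
    input  wire       sig2,
    input  wire [3:0] sig3,
    output reg  [3:0] sig4
);
    wire [3:0] d = (sig1 & ~sig2) ? sig3 : sig4;   // hold value when disabled

    always @(posedge clk)
        sig4 <= d;                 // no explicit enable in the description
endmodule
```

Both modules describe the same function, so a synthesis tool is free to map form B back onto the clock enable pin. Writing it this way is not by itself a guarantee that the enable is avoided.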

OTOH, in some cases the version without clock enable may yield better overall
timing, depending on where the critical path is, but my bet is that there is
no "decent" synthesizer that would optimize the clock enable out of the
sample code based on critical path analysis alone. It would be possible, yes,
but I would be surprised to see a synthesis tool actually do that without
explicit coding or constraining. Hm, maybe I am wrong and some synthesis tool
is that smart already.
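
As an illustration of the "constraining" route, later XST releases document a USE_CLOCK_ENABLE constraint that can be attached as a Verilog attribute. The sketch below assumes your XST version supports it, which should be checked against the XST User Guide for your ISE release. The module name no_ce_example is hypothetical:

```verilog
// Sketch only: assumes an XST version that supports USE_CLOCK_ENABLE.
module no_ce_example (
    input  wire       clk,
    input  wire       sig1,
    input  wire       sig2,
    input  wire [3:0] sig3,
    output wire [3:0] sig4_out
);
    // Ask XST not to use the dedicated clock enable for this register,
    // so the enable logic is folded into the D-input LUTs instead.
    (* use_clock_enable = "no" *)
    reg [3:0] sig4;

    always @(posedge clk)
        if (sig1 & ~sig2)
            sig4 <= sig3;

    assign sig4_out = sig4;
endmodule
```

Whether this actually improves timing still depends on where the critical path is, so it is worth comparing the timing reports with and without the attribute.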

I do AGREE that the synthesis tools do not do the best job, and that in cases
where a solution that meets timing is available, that solution is not used
automatically and needs manual 'tuning'.

Antti