FPGA Central - World's 1st FPGA / CPLD Portal


FPGA comp.arch.fpga newsgroup (usenet)

#1
11-18-2005, 06:00 PM
Sylvain Munaut
Virtex 4 FIFO16 blocks - Corruption ?

Hi,


We're faced with a strange problem ...
While investigating a bug in one design, we could only observe the
faulty behavior on the real board and not in simulation.

Using ChipScope, we finally tracked down the problem by monitoring
both the write and read ports of a FIFO16 configured as 18x1024, using the
same rd/wr clocks. That FIFO was used in a "weird" way, by setting an
ALMOSTFULL threshold very high (but still within spec), so that it turns
on very quickly. What we observed is this: we push a data word with some
parity bits set (which are not 'true' parity but some critical control)
and continue to push. The almost-full flag goes up (normal), and we still
push (we still have plenty of room). At the same time we read back, but
slower (not at each clock cycle). When we finally read back the word
where the parity bit was set, the data (15:0) are there but the parity
bit is not, it's just 0 ...

The ChipScope probes were tied directly to the FIFO signals, no logic
in between. That FIFO is supposed to cross clock domains, but for
debugging we just sent the same clock everywhere. And the behavior of
the surrounding logic is consistent with that bit being missed.

Instead of using ALMOSTFULL set to a very high value, we tried using
NOT ALMOSTEMPTY (since we're debugging with just 1 clock domain, it's
ok), and there it looks like we never observe such a miss.
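
For readers who want to check their own captures for the same symptom, here is a minimal sketch (not part of the original post) of a scoreboard comparison in Python. It assumes the write-port and read-port words have already been exported from the logic analyzer as (data, parity) tuples in capture order; the function names and the tuple layout are hypothetical, chosen only for illustration.

```python
from collections import deque


def find_mismatches(writes, reads):
    """Compare words captured at the write port with words captured at the read port.

    writes, reads: lists of (data, parity) tuples in capture order.
    A FIFO must return words in the order they were written, so the
    n-th word read is compared with the n-th word written.
    Returns a list of (read_index, written_word, read_word) for every difference.
    """
    expected = deque(writes)
    mismatches = []
    for index, got in enumerate(reads):
        if not expected:
            break  # more reads captured than writes; nothing left to compare
        want = expected.popleft()
        if want != got:
            mismatches.append((index, want, got))
    return mismatches


def classify(mismatch):
    """Label a mismatch: 'parity_lost' when the data matches but the parity bits do not."""
    _, (wdata, wpar), (rdata, rpar) = mismatch
    if wdata == rdata and wpar != rpar:
        return "parity_lost"
    return "data_differs"


if __name__ == "__main__":
    # Made-up example capture: the second word loses its parity (control) bit.
    written = [(0x1234, 0), (0xBEEF, 1), (0x0042, 0)]
    read_back = [(0x1234, 0), (0xBEEF, 0), (0x0042, 0)]
    for m in find_mismatches(written, read_back):
        print(m, classify(m))
```

A check like this only confirms what the capture shows; it says nothing about why the bit is lost.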


Has anyone ever observed such behavior?



Sylvain

#2
12-01-2005, 03:32 PM
Ray Andraka
Re: Virtex 4 FIFO16 blocks - Corruption ?

Have you got any resolution on this? Have you opened a case with
Xilinx? What does Xilinx have to say about it?

I am aware that some people have had problems with the FIFO16 not
working correctly. I had an issue when trying to use the FIFO as a
synchronous FIFO (it is async, so there is some ambiguity in the flag
latency when both clocks are the same). I have
asked Xilinx repeatedly to document this behavior prominently in the
user guide, but so far they have only quietly acknowledged that the user
has to be careful if read and write clocks are the same.

That said, your problem is different than the one I experienced and
appears to be a more serious problem in the FIFO16 logic. You are not
the first person I've heard state they had problems with the fifo16
async behavior. There may be some issues with the flag logic for
asynchronous use as well.

I do find it interesting that Altera was forthcoming with their recent
problems with dual port memories. I hope that Xilinx is equally
forthcoming if there is indeed a problem with the FIFO16 logic.

#3
12-01-2005, 05:06 PM
Austin Lesea
Re: Virtex 4 FIFO16 blocks - Corruption ?

Ray,

The bug for use of the async FIFO synchronously has been acknowledged,
and we apologize for not getting it out there more prominently. But:

In our defense, it is unusual (or at least, so far we think it is
unusual) where the read and write clocks are tied directly together (why
use a FIFO at all? I guess it is a really useful structure, so even
when used this way it is too useful to ignore....?).

The solution is to not source the two clocks from the same source
directly, but place a small delay in one, or the other.

The problem does not exist in the asynchronous case, as it takes two
consecutive clock cycles on BOTH clocks (at exactly the wrong times) to
cause the problem. As long as two adjacent clock cycles are unlikely to
arrive at exactly the same time on both clocks just as you are
getting full (or is it empty? I'm not the expert on this), it works fine.

Sometimes with problems like this (that are difficult to even cause) it
doesn't make sense to put up a billboard that it is an issue, as then
everyone comes down with the disease (mass hypochondria) when they
don't really have the problem.

Now, if the feature is just plain broke, then it is a different story,
and we will end the pain as soon as we are sure it is just plain broke.

No one is intentionally hiding anything, but we are judiciously placing
(obscure) bug information only with the hotline and support community,
rather than broadcasting it across the entire user community publicly.

If, for any reason, you feel that you have caught the disease (have a
bug we haven't shared universally), the entry of a webcase will get you
the help you need, as the hotline will search for all such issues. If
yours is there, then we will immediately share with you the solution.

These are known as "internal answers" and it isn't that we don't want to
share them, we just don't think they are likely issues for everyone.
Better to talk to you and find out what the problem is, first.

If these internal answers are made external, we imagine there would be
thousands of designers running down debug paths that are so obscure,
there is almost no chance they will find this as their problem. Then we
get a bad reputation, and the hotline is overwhelmed with folks who all
think they have this obscure problem!

I hope folks will appreciate that sometimes telling every strange and
obscure story causes more trouble than selectively understanding each
issue that arises, and dealing with it directly.

Support: it is an art.

Austin


#4
12-01-2005, 07:41 PM
Ray Andraka
Re: Virtex 4 FIFO16 blocks - Corruption ?

Austin,

You are kidding as far as the usefulness of a synchronous FIFO (one
which has both sides clocked by the same clock), right? This is a
rather common structure in pipelined designs: it is an elastic buffer.
It is useful, for example, for processing bursty data at a more relaxed
rate than the data is presented. I'd be hard pressed to find one of my
designs that does NOT have a synchronous FIFO in it.

The solution with the "small" delay is fine if you are not pushing the
performance envelope, but it will destroy timing closure in designs
that are. For example, I have a floating point FFT design with a target
clock rate of 400 MHz in an SX55-10 part... basically running at the
DSP48/memory speed. It has synchronous FIFOs in it, and there is no room
in the timing for adding small delays to clocks. This is a real
limitation of the FIFO16 design, and it has cost me several weeks of
debug and redesign time to find and work around it.

It should be prominently highlighted in the user guide under the section
that describes the use of the FIFO16. I am sure other users are going to
encounter the same issue. No one looks at the answers database until
they have a problem and have identified the source of the problem. The
synchronous FIFO issue could easily be considered a limitation rather
than an outright bug, but it does have to be made clear to the user
before he does the design, not when trying to figure out why it isn't
working. By keeping it close to your chest as an internal answer, I
suspect you'll wind up generating a heck of a lot more hotline cases
than if you put it in black and white right in the user's guide that
this is the way the FIFO16s work and that these are the things you need
to do to work around the limitation if the clocks are the same on both
sides. BTW, I don't think this is an "obscure" issue either, as anyone
attempting to use the FIFO16 as a synchronous FIFO is going to
encounter it.
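
To make the elastic-buffer idea concrete, here is a small behavioral sketch in Python (not from the original post, and not a model of the FIFO16 primitive or its flag logic). It assumes one shared clock, a producer that delivers words in bursts, and a consumer that takes at most one word every `drain_period` cycles; the function name and parameters are hypothetical.

```python
from collections import deque


def run_elastic_buffer(arrivals, drain_period, depth):
    """Simulate a single-clock FIFO used as an elastic buffer.

    arrivals[i] is True when the producer writes one word in cycle i.
    The consumer reads one word every `drain_period` cycles if the FIFO
    is not empty. Returns (words_read_in_order, peak_occupancy).
    Raises OverflowError if a write would exceed `depth` words.
    """
    fifo = deque()
    words_out = []
    peak = 0
    next_word = 0
    cycle = 0
    # Keep clocking after the last arrival until the buffer has drained.
    while cycle < len(arrivals) or fifo:
        # Read side: the relaxed-rate consumer.
        if cycle % drain_period == 0 and fifo:
            words_out.append(fifo.popleft())
        # Write side: the bursty producer.
        if cycle < len(arrivals) and arrivals[cycle]:
            if len(fifo) >= depth:
                raise OverflowError(f"FIFO overflow in cycle {cycle}")
            fifo.append(next_word)
            next_word += 1
        peak = max(peak, len(fifo))
        cycle += 1
    return words_out, peak


if __name__ == "__main__":
    # Four bursts of 8 back-to-back words, each followed by 24 idle cycles.
    pattern = ([True] * 8 + [False] * 24) * 4
    out, peak = run_elastic_buffer(pattern, drain_period=2, depth=1024)
    print(f"{len(out)} words read in order, peak occupancy {peak}")
```

The point of the sketch is only that the buffer absorbs each burst while the downstream logic runs at its own pace, so the required depth depends on burst length and drain rate.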

The flip answers regarding the synchronous FIFO (things like such a
structure is not useful, and just add delays to the clock when I've
explained that it is not a viable solution for maximum performance
designs), combined with the reluctance to make it clear to users that
this is a limitation of the FIFO16 design, makes it appear that either
Xilinx doesn't understand the issue or that they are trying to sweep it
under the rug. I presume and hope it is the former, although neither is
a particularly good outcome.

I am reluctant to enter a webcase on an issue such as this unless it has
become critical for the project. Invariably, the result of entering a
webcase is my having to generate and submit testcases to prove the
problem, and often having to come up with my own work-around because the
fix won't be available until the next major release. Nobody pays me for
the time spent doing testcases to ferret out the source of a bug in the
software or silicon. There have been months recently where I've spent
more than a quarter of my time identifying and generating test cases for
problems in the tools (not just Xilinx). Naturally, I'd like to avoid
that as much as practical.

Regarding the asynchronous FIFO behavior, I don't have any direct
experience with the FIFO16 behaving badly as an async FIFO, but I
haven't used it in that mode in a design that has made it to testing.
Sylvain's description does sound as though the FIFO may be misbehaving,
and it jibes with things I've heard from others. This is why I asked
him if he had opened a case with Xilinx and what the resolution of that
case was. It is important to know if there is a potential problem so
that I can avoid it during the design rather than discover it during
integration. I am currently working on a design that has several async
FIFO16s in it, and would like to believe that they will work for me;
however, these rumblings have me concerned, hence my asking Sylvain about
his resolution. So far, the workarounds I am aware of have used the
coregen FIFO instead of the FIFO16, which does not have the same clock
performance as the FIFO16.

I didn't intend to kick over the beehive here, I was only trying to
collect more data so that I might avoid a problem in my own design if it
does exist.





#5
12-01-2005, 08:55 PM
John_H
Re: Virtex 4 FIFO16 blocks - Corruption ?

Ray, your comments are again right on target with my own feelings about bugs
and support. The webcase submission issue especially hits home. One minor
difference for me may be that when I find unusual behavior and have it
isolated to a functional portion of the design, I may check the (externally
available) knowledge database for any information relating to my problem
area before spending a few more days to further isolate the cause. I've had
several instances where the information is in *a* database, just not one I
can get to.

For ANYONE who is concerned with whether or not to air the dirty laundry of
their EDA tools and silicon, PLEASE read through Ray's note and understand
where designers are coming from. Our company has had TOO many issues with silicon
(non-FPGA as well) and EDA tools ("you knew about this for how many
months?") that when we encounter known bugs that are "hidden" from plain
view, we are LIVID. There is no excuse to withhold information that WILL
affect designs if there is a way to communicate the issues externally.

In this instance, there is a way.

#6
12-01-2005, 09:13 PM
Austin Lesea
Re: Virtex 4 FIFO16 blocks - Corruption ?

John, Ray,

I never said, nor implied we would intentionally withhold information.

That is bad. Really bad.

I understand that you may consider our choice of distribution of
information (through answers externally available, or through answers
internally available to case workers and FAEs) to be unacceptable.

I will entertain any other solutions. One that I might suggest is that
you sign up for a push email whenever something happens that you
indicate you are interested in. Maybe it's been tried, maybe not. I
know we do have push email systems in place now. Perhaps we need to add
features?

As far as async FIFO issues go, I am getting some emails on that subject
as well. So, even though I thought the FIFO testing on V4 was extensive
(and witnessed it), the problem with a test that passes is that a test is
never the application.

I will reserve a whole-hearted endorsement of perfection until I hear
more about what the alleged issues are with async mode.

And Ray, I appreciate the use of the FIFO synchronously (my comments were
tongue firmly in cheek), I just never thought about it before. Making an
async FIFO is so much black magic that you spend all your time looking
at the async case, and no time with the sync case (obviously the problem
here).

Austin


John_H wrote:

> Ray, your comments are again right on target with my own feelings about bugs
> and support. The webcase submission issue especially hits home. One minor
> difference for me may be that when I find unusual behavior and have it
> isolated to a functional portion of the design, I may check the (externally
> available) knowledge database for any information relating to my problem
> area before spending a few more days to further isolate the cause. I've had
> several instances where the information is in *a* database, just not one I
> can get to.
>
> For ANYONE who is concerned with whether or not to air the dirty laundry of
> their EDA tools and silicon, PLEASE read through Ray's note and understand
> where designers come from. Our company has had TOO many issues with silicon
> (non-FPGA as well) and EDA tools ("you knew about this for how many
> months?") that when we encounter known bugs that are "hidden" from plain
> view, we are LIVID. There is no excuse to withhold information that WILL
> affect designs if there is a way to communicate the issues externally.
>
> In this instance, there is a way.
>
>
>
> "Ray Andraka" <[email protected]> wrote in message
> news:[email protected]
>
>>Austin,
>>
>>You are kidding as far as the usefulness of a synchronous fifo (one which
>>has both sides clocked by the same clock), right? This is a
>>rather common structure in pipelined designs, it is an elastic buffer.
>>Useful, for example, for processing bursty data at a more relaxed rate
>>than the data is presented. I'd be hard pressed to find one of my designs
>>that does NOT have a synchronous FIFO in it. The solution with the
>>"small" delay is fine if you are not pushing the performance envelope, but
>>it will destroy timing closure in designs that are. For example, I have a
>>floating point FFT design with a target clock rate of 400 MHz in an
>>SX55-10 part... basically running at the DSP48/memory speed. It has
>>synchronous FIFOs in it, and there is no room in the timing for adding
>>small delays to clocks. This is a real limitation to the FIFO16 design,
>>and has cost me several weeks of debug and redesign time to find and work
>>around it. It should be prominently highlighted in the user guide under
>>the section that describes the use of the FIFO16. I am sure other users
>>are going to encounter the same issue. No one looks at the answers
>>database until they have a problem and have identified the source of the
>>problem. The synchronous FIFO issue could easily be considered a
>>limitation rather than an outright bug, but it does have to be made clear
>>to the user before he does the design, not when trying to figure out why
>>it isn't working. By keeping it close to your chest as an internal answer,
>>I suspect you'll wind up generating a heck of a lot more hotline cases
>>than if you put it in black and white right in the user's guide that this
>>is the way the FIFO16's work and that these are the things you need to do
>>to work around the limitation if the clocks are the same on both sides.
>>BTW, I don't think this is an "obscure" issue either, as anyone attempting
>>to use the FIFO16 as a synchronous FIFO is going to encounter it.
>>
>>The flip answers regarding the synchronous FIFO (things like such a
>>structure is not useful, and just add delays to the clock when I've
>>explained that it is not a viable solution for maximum performance
>>designs), combined with the reluctance to make it clear to users that this
>>is a limitation of the FIFO16 design, makes it appear that either Xilinx
>>doesn't understand the issue or that they are trying to sweep it under the
>>rug. I presume and hope it is the former, although neither is a
>>particularly good outcome.
>>
>>I am reluctant to enter a webcase on an issue such as this unless it has
>>become critical for the project. Invariably, the result of entering a
>>webcase is my having to generate and submit testcases to prove the
>>problem, and often having to come up with my own work-around because the
>>fix won't be available until the next major release. Nobody pays me for
>>the time spent doing testcases to ferret out the source of a bug in the
>>software or silicon. There have been months recently where I've spent
>>more than a quarter of my time identifying and generating test cases for
>>problems in the tools (not just Xilinx). Naturally, I'd like to avoid
>>that as much as practical.
>>
>>Regarding the asynchronous FIFO behavior, I don't have any direct
>>experience with the FIFO16 behaving badly as an async FIFO, but I haven't
>>used it in that mode in a design that has made it to testing.
>>Sylvain's description does sound as though the FIFO may be misbehaving,
>>and it jibes with things I've heard from others. This is why I asked him
>>if he had opened a case with Xilinx and what the resolution of that case
>>was. It is important to know if there is a potential problem so that I
>>can avoid it during the design rather than discover it during integration.
>>I am currently working on a design that has several async FIFO16's in it,
>>and would like to believe that they will work for me, however these
>>rumblings have me concerned, hence my asking Sylvain about his resolution.
>>So far, the work arounds I am aware of have used the coregen FIFO instead
>>of the FIFO16, which does not have the same clock performance as the
>>FIFO16.
>>
>>I didn't intend to kick over the beehive here, I was only trying to
>>collect more data so that I might avoid a problem in my own design if it
>>does exist.
>>
>>
>>
>>
>>
>>Austin Lesea wrote:
>>
>>>Ray,
>>>
>>>The bug for use of the async FIFO synchronously has been acknowledged,
>>>and we apologize for not getting it out there more prominently. But:
>>>
>>>In our defense, it is unusual (or at least, so far we think it is
>>>unusual) where the read and write clocks are tied directly together (why
>>>use a FIFO at all? I guess it is a really useful structure, so even when
>>>used this way it is too useful to ignore....?).

>>
>>
>>>The solution is to not source the two clocks from the same source
>>>directly, but place a small delay in one, or the other.
>>>
>>>The problem does not exist in the asynchronous case, as it takes two
>>>subsequent clock cycles on BOTH clocks (at exactly the wrong times) to
>>>cause the problem. As long as two adjacent clock cycles do not come in
>>>on both clocks at exactly the same time just as you are getting full
>>>(or is it empty? I'm not the expert on this), it works fine.
>>>
>>>Sometimes with problems like this (that are difficult to even cause) it
>>>doesn't make sense to put up a billboard that it is an issue, as then
>>>everyone comes down with the disease (mass hypochondria) when they don't
>>>really have the problem.
>>>
>>>Now, if the feature is just plain broke, then it is a different story,
>>>and we will end the pain as soon as we are sure it is just plain broke.
>>>
>>>No one is intentionally hiding anything, but we are judiciously placing
>>>(obscure) bug information only with the hotline and support community,
>>>rather than broadcasting it across the entire user community publicly.
>>>
>>>If, for any reason, you feel that you have caught the disease (have a bug
>>>we haven't shared universally), the entry of a webcase will get you the
>>>help you need, as the hotline will search for all such issues. If yours
>>>is there, then we will immediately share with you the solution.
>>>
>>>These are known as "internal answers" and it isn't that we don't want to
>>>share them, we just don't think they are likely issues for everyone.
>>>Better to talk to you and find out what the problem is, first.
>>>
>>>If these internal answers are made external, we imagine there would be
>>>thousands of designers running down debug paths that are so obscure,
>>>there is almost no chance they will find this as their problem. Then we
>>>get a bad reputation, and the hotline is overwhelmed with folks who all
>>>think they have this obscure problem!
>>>
>>>I hope folks will appreciate that sometimes telling every strange and
>>>obscure story causes more trouble than selectively understanding each
>>>issue that arises, and dealing with it directly.
>>>
>>>Support: it is an art.
>>>
>>>Austin
>>>
>>>
>>>Ray Andraka wrote:
>>>
>>>
>>>>Sylvain Munaut wrote:
>>>>
>>>>
>>>>>Hi,
>>>>>
>>>>>
>>>>>We're faced with a strange problem ...
>>>>>While investigating a bug in one design, we could only observe that
>>>>>behavior on real board and not in simulation.
>>>>>
>>>>>Using chipscope, we finally traced down the problem by monitoring
>>>>>both write and read port of a FIFO16 configured as 18x1024, using the
>>>>>same rd/wr clocks. That fifo was used in a "weird" way, by setting a
>>>>>ALMOSTFULL threshold very high (but still within spec), so that it turns
>>>>>on very quickly. And what we observed is that we push a data with some
>>>>>parity bits (which are not 'true' parity but some critical control), we
>>>>>continue to push, the almost full goes up (normal), and we still push
>>>>>(we still have plenty of room) and at the same time we re-read but
>>>>>slower (not at each clock cycle) and when we finally re-read the data
>>>>>where the parity bit was set, the data (15:0) are there but the parity
>>>>>bit is not, it's just 0 ...
>>>>>
>>>>>The chipscope 'probes' were tied directly to the fifo signals, no logic
>>>>>in between. That fifo is supposed to cross clock domains but for
>>>>>debugging, we just sent the same clock everywhere. And the behavior of
>>>>>the surrounding logic is consistent with that bit being missed.
>>>>>
>>>>>Instead of using ALMOSTFULL set to a very high value, we used not
>>>>>ALMOSTEMPTY (here since we're debugging with just 1 clock domain, it's
>>>>>ok), and there it looks like we never observe such a miss.
>>>>>
>>>>>
>>>>>Has someone ever observed such a behavior ?
>>>>>
>>>>>
>>>>>
>>>>> Sylvain
>>>>
>>>>
>>>>
>>>>Have you got any resolution on this? Have you opened a case with
>>>>Xilinx? What does Xilinx have to say about it?
>>>>
>>>>I am aware that some people have had problems with the FIFO16 not
>>>>working correctly. I had an issue with trying to use the FIFO as a
>>>>synchronous fifo (it is async, so there is a possibility with some
>>>>ambiguity on the flag latency when both clocks are the same). I have
>>>>asked Xilinx repeatedly to document this behavior prominently in the
>>>>user guide, but so far they have only quietly acknowledged that the user
>>>>has to be careful if read and write clocks are the same.
>>>>
>>>>That said, your problem is different than the one I experienced and
>>>>appears to be a more serious problem in the FIFO16 logic. You are not
>>>>the first person I've heard state they had problems with the fifo16
>>>>async behavior. There may be some issues with the flag logic for
>>>>asynchronous use as well.
>>>>
>>>>I do find it interesting that Altera was forthcoming with their recent
>>>>problems with dual port memories. I hope that Xilinx is equally
>>>>forthcoming if there is indeed a problem with the FIFO16 logic.
>>>>

>
>
>

  #7 (permalink)  
Old 12-01-2005, 09:36 PM
Sylvain Munaut
Guest
 
Posts: n/a
Default Re: Virtex 4 FIFO16 blocks - Corruption ?

Hi Ray,

Ray Andraka wrote:
> Have you got any resolution on this? Have you opened a case with
> Xilinx? What does Xilinx have to say about it?


My colleague had some contact with our distributor but afaik, no news yet.

Looking at the Xilinx answer record, I saw that FIFOs using FIFO16
blocks generated with an old version of the FIFO Generator could show some
data corruption problems, and that use of the new version is recommended ...
but I didn't use CoreGen; I instantiated the FIFO16 directly (CoreGen
doesn't have first-word fall-through anyway ...)

I haven't opened a webcase myself yet ... often, before "bothering"
Xilinx people, I want to be sure ;p I've tried to reproduce the problem
with a far simpler design but so far no luck ... (even in the full
design it's quite "rare", but one occurrence suffices to lock it up ...)


> I am aware that some people have had problems with the FIFO16 not
> working correctly. I had an issue with trying to use the FIFO as a
> synchronous fifo (it is async, so there is a possibility with some
> ambiguity on the flag latency when both clocks are the same). I have
> asked Xilinx repeatedly to document this behavior prominently in the
> user guide, but so far they have only quietly acknowledged that the user
> has to be careful if read and write clocks are the same.


What exactly is the problem if the clocks are the same? (What behaviour
could occur?)

> That said, your problem is different than the one I experienced and
> appears to be a more serious problem in the FIFO16 logic. You are not
> the first person I've heard state they had problems with the fifo16
> async behavior. There may be some issues with the flag logic for
> asynchronous use as well.


Well, here we use the FIFO synchronously ... In the future they are meant
to be used asynchronously, but for testing we've put everything
on the same clock. Other parts of the design will always use them
synchronously, though, so I must get it working in both modes ...



Sylvain

  #8 (permalink)  
Old 12-01-2005, 09:49 PM
Sylvain Munaut
Guest
 
Posts: n/a
Default Re: Virtex 4 FIFO16 blocks - Corruption ?

Austin Lesea wrote:
> Ray,
>
> The bug for use of the async FIFO synchronously has been acknowledged,
> and we apologize for not getting it out there more prominently. But:


Where can I get detailed information about it? (To be sure not to run into
it, or at least that it doesn't cause trouble in my design?)



Sylvain
  #9 (permalink)  
Old 12-01-2005, 10:14 PM
Austin Lesea
Guest
 
Posts: n/a
Default Re: Virtex 4 FIFO16 blocks - Corruption ?

Sylvain,

I expect the fastest way is to open a webcase requesting the information.

As I already stated, if both read and write clocks are from the same
BUFG net, then this may (probably will) be an issue at some
process/voltage/temperature corner (hence the insidiousness of the issue).

A quick fix is to drive one of the clocks from the other edge (one
rising, one falling) which may require another BUFG resource (in order
to be sure the delay doesn't put you right back where you started).

It is my understanding that a macro will be created to instantiate the
sync FIFO with the required offset delay automatically in the best way
we can (probably using fabric resources, like a LUT, doubles, hexes, etc.).

The issue as I was told is that at the critical instant, the almost
full/almost empty flag assertions will be correct, but if the event
occurs again on the very next clock cycle, the flag will reset to 0,
which will not be correct (as the FIFO is still almost full, or almost
empty if nothing was done to read anything out, or write anything in on
that cycle).

There may be other simpler solutions (that we haven't thought of yet).

Again, the jury is out on the async case....

Austin

Sylvain Munaut wrote:

> Austin Lesea wrote:
>
>>Ray,
>>
>>The bug for use of the async FIFO synchronously has been acknowledged,
>>and we apologize for not getting it out there more prominently. But:

>
>
> Where can I get detailed information about it? (To be sure not to run into
> it, or at least that it doesn't cause trouble in my design?)
>
>
>
> Sylvain

  #10 (permalink)  
Old 12-01-2005, 10:18 PM
Ray Andraka
Guest
 
Posts: n/a
Default Re: Virtex 4 FIFO16 blocks - Corruption ?

Sylvain Munaut wrote:

> Austin Lesea wrote:
>
>>Ray,
>>
>>The bug for use of the async FIFO synchronously has been acknowledged,
>>and we apologize for not getting it out there more prominently. But:

>
>
> Where can I get detailed information about it? (To be sure not to run into
> it, or at least that it doesn't cause trouble in my design?)
>
>
>
> Sylvain


This is exactly what I mean by the problem being hidden. I searched the
answers database for FIFO16, and did not turn up anything regarding the
known synchronous behavior problem, nor any async problems. It may
still only be in the internal database, if it is even there. In
debugging stuff like this, I've always assumed the silicon is good and
that any problems are a result of the design until I can prove
otherwise. As a result, you don't suspect the FIFO itself as being the
problem. That can lead to a tremendous amount of debugging effort
before finding out there is a problem or unpublished limitation with the
silicon. Considering how much time I spent fiddling with this problem, I
suspect there are literally thousands of man-hours put into debugging the
same problem in different projects simply because Xilinx doesn't want to
advertise a limitation with their design.

The problem with the synchronous usage is that the flag circuit is an
async design. When the clock is the same to both sides, and a read and
write are done on the same clock cycle, the flag circuit displays a one
clock jitter in the timing of the flag outputs, such that the word
written in at the same time the last one is read out may or may not make
the fifo show empty. If empty does get set, it then takes something
like 3 clocks to go away, so you wind up with a non-deterministic
behavior. It is an artifact of using an async flag circuit.
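To make the non-determinism concrete, here is a toy Python model of the behavior described above. It is only a sketch under stated assumptions, not the FIFO16 flag circuit: the name `empty_flag_trace`, the `glitch` switch, and the fixed three-cycle release are all made up for illustration (the "3" is the "something like 3 clocks" figure above, not a datasheet number).

```python
# Toy model only: illustrates the two possible EMPTY outcomes, not real silicon.
EMPTY_RELEASE_CYCLES = 3  # assumption: "something like 3 clocks"


def empty_flag_trace(ops, glitch):
    """Return the per-cycle EMPTY flag for a list of (write, read) operations.

    `glitch` selects which outcome the flag logic takes when the last stored
    word is read in the same cycle that a new word is written. In hardware
    this choice is not under the designer's control.
    """
    count = 0   # words actually stored
    hold = 0    # remaining cycles EMPTY stays high after a spurious assertion
    trace = []
    for write, read in ops:
        swap_on_last_word = bool(write) and bool(read) and count == 1
        if read and count > 0:
            count -= 1          # a read from an empty FIFO does nothing
        if write:
            count += 1
        if swap_on_last_word and glitch:
            hold = EMPTY_RELEASE_CYCLES   # EMPTY asserts although a word is stored
        trace.append(count == 0 or hold > 0)
        if hold > 0:
            hold -= 1
    return trace


if __name__ == "__main__":
    # Write one word, then read it while writing the next, then idle.
    ops = [(1, 0), (1, 1), (0, 0), (0, 0), (0, 0), (0, 0)]
    print("no glitch:", empty_flag_trace(ops, glitch=False))
    print("glitch:   ", empty_flag_trace(ops, glitch=True))
```

The same input sequence gives two different EMPTY traces depending on which way the flag falls, which is why downstream logic that counts on a fixed flag latency misbehaves.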

BTW, finding stuff in the answers database is a lot like finding a
needle in a haystack, provided you even know what you are looking for.
  #11 (permalink)  
Old 12-01-2005, 10:30 PM
Ray Andraka
Guest
 
Posts: n/a
Default Re: Virtex 4 FIFO16 blocks - Corruption ?

John_H wrote:

> Ray, your comments are again right on target with my own feelings about bugs
> and support. The webcase submission issue especially hits home. One minor
> difference for me may be that when I find unusual behavior and have it
> isolated to a functional portion of the design, I may check the (externally
> available) knowledge database for any information relating to my problem
> area before spending a few more days to further isolate the cause. I've had
> several instances where the information is in *a* database, just not one I
> can get to.
>
> For ANYONE who is concerned with whether or not to air the dirty laundry of
> their EDA tools and silicon, PLEASE read through Ray's note and understand
> where designers come from. Our company has had TOO many issues with silicon
> (non-FPGA as well) and EDA tools ("you knew about this for how many
> months?") that when we encounter known bugs that are "hidden" from plain
> view, we are LIVID. There is no excuse to withhold information that WILL
> affect designs if there is a way to communicate the issues externally.
>
> In this instance, there is a way.


>

John, I also search the answers database to see if there are any matches
to the problem at hand. More often than not, even when there is
something that is close in there, it is difficult to find, and then
determine whether it matches your problem or not. The answers database
was once a very useful resource, but it has now grown big enough that
you often can't find the magic incantation to pare down the hits to ones
that match your problem.

Austin,
If it is restricted just to the internal database, it is being withheld.
Unless I am experiencing the problem, and have debugged and isolated
it, I don't even know to look for it. In the case of the
synchronous FIFO, there is no excuse to keep that to the internal
database. A simple statement in the user's guide saying that the FIFO16
is an async FIFO and has these particular limitations when used in an
application where both clocks are the same would have been sufficient to
avoid a heck of a lot of troubleshooting time, and would have put the
onus on me the designer. The way it is now, it is hidden until a
designer trips over it in the lab, and by then you've got many hours in
debug, isolation, talking to Xilinx to figure out what is going on, and
redesign to work around it. When I found out that Xilinx internally
already knew this was an issue, I was livid.
  #12 (permalink)  
Old 12-01-2005, 11:12 PM
Austin Lesea
Guest
 
Posts: n/a
Default Re: Virtex 4 FIFO16 blocks - Corruption ?

Ray,

I hear you, and understand your concern.

I also have been in the same situation you were in, and had the same
feelings.

I am not asking forgiveness, but asking for some help in how to deal
with these issues when they are really rare and unusual.

Perhaps this one isn't so rare and unusual (now). That is why I am
placing it here.

If every time we think we might have a problem, we shared it, there would
be ten times as much stuff, 9/10ths or more of it bogus (not really an
issue).

I'll give you my hot button: NBTI was indicated as an issue with DCMs
in V4. The results were based on HTOL testing at accelerated temps, and
voltages.

Every single device that failed after that torture test that was tested
in my FPGA Lab PASSED the spec. Yet, because the production tester can
not test everything at every corner of voltage and temperature (finite
test time), the tester failed these parts.

Yet although I was able to show that there was never a case where
the part actually failed, the tester folks prevailed. Now I understand
that if the tester is the only way to determine good or bad, then when
they say a part is bad, I am unable to prove it will never go bad, as I
can't test as many devices as they do.

The NBTI keep alive core, documentation, etc. was a huge effort. Lots
of work. Lots of pain. We contacted hundreds of customers. Made
trips, presentations, etc.

Now, in fairness, it might really be an issue. And if it is, we just
made it a non-issue. But, it never failed to meet specifications on the
bench!

So, we did share it, and in my own humble strange mind (tongue firmly
in cheek), I really believe it is a non-problem, never having occurred
anywhere other than a HTOL test after burn-in, with one and only one test
method, and passing when tested using bench equipment (like a scope,
freq generator, etc. rather than an automated pattern run once on a "big
iron" tester).

Austin


Ray Andraka wrote:

> John_H wrote:
>
>> Ray, your comments are again right on target with my own feelings
>> about bugs and support. The webcase submission issue especially hits
>> home. One minor difference for me may be that when I find unusual
>> behavior and have it isolated to a functional portion of the design, I
>> may check the (externally available) knowledge database for any
>> information relating to my problem area before spending a few more
>> days to further isolate the cause. I've had several instances where
>> the information is in *a* database, just not one I can get to.
>>
>> For ANYONE who is concerned with whether or not to air the dirty
>> laundry of their EDA tools and silicon, PLEASE read through Ray's note
>> and understand where designers come from. Our company has had TOO
>> many issues with silicon (non-FPGA as well) and EDA tools ("you knew
>> about this for how many months?") that when we encounter known bugs
>> that are "hidden" from plain view, we are LIVID. There is no excuse
>> to withhold information that WILL affect designs if there is a way to
>> communicate the issues externally.
>>
>> In this instance, there is a way.

>
>
>>

> John, I also search the answers database to see if there are any matches
> to the problem at hand. More often than not, even when there is
> something that is close in there, it is difficult to find, and then
> determine whether it matches your problem or not. The answers database
> was once a very useful resource, but it has now grown big enough that
> you often can't find the magic incantation to pare down the hits to ones
> that match your problem.
>
> Austin,
> If it is restricted just to the internal database, it is being withheld.
> Unless I am experiencing the problem, debugged and isolated the
> problem, I don't even know to look for it. In the case of the
> synchronous FIFO, there is no excuse to keep that to the internal
> database. A simple statement in the user's guide saying that the FIFO16
> is an async FIFO and has these particular limitations when used in an
> application where both clocks are the same would have been sufficient to
> avoid a heck of a lot of troubleshooting time, and would have put the
> onus on me the designer. The way it is now, it is hidden until a
> designer trips over it in the lab, and by then you've got many hours in
> debug, isolation, talking to xilinx to figure out what is going on, and
> redesign to work around it. When I found out that Xilinx internally
> already knew this was an issue, I was livid.

  #13 (permalink)  
Old 12-02-2005, 01:42 AM
Ray Andraka
Guest
 
Posts: n/a
Default Re: Virtex 4 FIFO16 blocks - Corruption ?

Austin Lesea wrote:

> Ray,
>
> I hear you, and understand your concern.
>
> I also have been in the same situation you were in, and had the same
> feelings.
>


Austin,

I think you did the right thing with regard to NBTI. Yes, it
apparently is a non-issue, but before you knew it was a non-problem you
notified the customers and therefore we were aware of it being a
potential problem. It is much better, IMHO, to err on the side of
caution (i.e., a false positive) rather than to under-report a real
problem. Yes, I got caught up in the NBTI thing too, and wound up
rewriting the Xilinx macro (long story), but in the end we took it out
even before Xilinx scaled back the severity because we never saw any
evidence that NBTI was affecting the design.

My point with the synchronous FIFO application is that that is a real
problem that affects real designs unless it is designed around. I'm
pretty sure Xilinx understands that it is a problem, yet there is no
public pronouncement warning a designer to look out for it. The time is
already past for getting that caution added to the user manual.

I'd like to be included on any internal email push list for issues
regarding the V4, as I'd rather find out about them from Xilinx than
have to discover them on my own after the design is done.

  #14 (permalink)  
Old 12-02-2005, 04:35 PM
Austin Lesea
Guest
 
Posts: n/a
Default Re: Virtex 4 FIFO16 blocks - Corruption ?

Ray,

OK. We will try to do better.

I will look at how email push works now, and see if there is an
opportunity here.

Meantime, everyone stay tuned for FIFO16 news, as the story isn't done yet.

For sync case, we know the delay 'fixes' the issue.

For the async case, there is a whole lot of work going on RIGHT NOW to
figure out what is the "problem" and what is the (simple) solution.

Austin

Ray Andraka wrote:

> Austin Lesea wrote:
>
>> Ray,
>>
>> I hear you, and understand your concern.
>>
>> I also have been in the same situation you were in, and had the same
>> feelings.
>>

>
> Austin,
>
> I think you did the right thing with regards to NBTI. Yes, it
> apparently is a non-issue, but before you knew it was a non-problem you
> notified the customers and therefore we were aware of it being a
> potential problem. It is much better, IMHO, to err on the side of
> caution (ie, a false positive) rather than to under-report a real
> problem. Yes, I got caught up in the NBTI thing too, and wound up
> rewriting the Xilinx macro (long story), but in the end we took it out
> even before Xilinx scaled back the severity because we never saw any
> evidence that NBTI was affecting the design.
>
> My point with the synchronous FIFO application is that that is a real
> problem that affects real designs unless it is designed around. I'm
> pretty sure Xilinx understands that it is a problem, yet there is no
> public pronouncement warning a designer to look out for it. The time is
> already past for getting that caution added to the user manual.
>
> I'd like to be included on any internal email push list for issues
> regarding the V4, as I'd rather find out about them from Xilinx than
> have to discover them on my own after the design is done.
>

  #15 (permalink)  
Old 12-02-2005, 07:49 PM
Ray Andraka
Guest
 
Posts: n/a
Default Re: Virtex 4 FIFO16 blocks - Corruption ?

Austin Lesea wrote:
> Ray,
>
>
> For sync case, we know the delay 'fixes' the issue.
>
> For the async case, there is a whole lot of work going on RIGHT NOW to
> figure out what is the "problem" and what is the (simple) solution.
>
>


Austin,
The synchronous case is a subset of the async case. If the sync case
were simply a case of an indeterminate delay of the flag output, then it
wouldn't be a problem, however since the synchronous case can screw up
the flags if you have two flag events in a row, the same holds true for
the async case when the clock edges happen to be close enough together.
The async case is actually more problematic, because you can't just add
a delay to one of the clocks to "fix" it. It is also more insidious,
since it may work fine until one day when your async clocks happen to
have the edges align just as the fifo is going empty or full.
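As a rough illustration of how often "close enough together" can happen, the Python sketch below counts write-clock edges that land within some window of a read-clock edge. Everything in it is an assumption for illustration: the function name, the idea of a fixed vulnerable window, and both clocks starting aligned at t = 0 with ideal, jitter-free integer periods in picoseconds. The real window for the FIFO16 is not stated anywhere in this thread.

```python
def close_edge_count(wr_period_ps, rd_period_ps, window_ps, n_wr_edges):
    """Count write-clock edges that fall within window_ps of a read-clock edge.

    Both clocks are idealized: no jitter, edges at integer multiples of the
    period, and aligned at t = 0.
    """
    hits = 0
    for k in range(n_wr_edges):
        phase = (k * wr_period_ps) % rd_period_ps      # time since the last read edge
        distance = min(phase, rd_period_ps - phase)    # distance to the nearest read edge
        if distance <= window_ps:
            hits += 1
    return hits


if __name__ == "__main__":
    # Two nearly equal, unrelated clocks: the edges drift slowly past each other.
    n = 100_000
    hits = close_edge_count(5000, 5003, 50, n)
    print(f"{hits} of {n} write edges within 50 ps of a read edge")
```

A near-coincident edge is only a problem if it also happens just as the FIFO is going empty or full, so the real failure rate would be some fraction of this count, which is exactly what makes it hard to catch in the lab.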

Obviously, I can't tell Xilinx what to do here, and they can easily tell
me to pound sand (yes, I am sticking my neck out here). I think Xilinx
owes its current V4 customers the courtesy of a notification that a
potential problem has surfaced with the FIFO16s, as they are a key
feature in the FPGA architecture.
  #16 (permalink)  
Old 12-02-2005, 09:42 PM
Austin Lesea
Guest
 
Posts: n/a
Default Re: Virtex 4 FIFO16 blocks - Corruption ?

Ray,

I never said the async case was bulletproof. In fact, I said the "jury
was still out."

Well, the jury is walking back in... They are sitting down...

I am not intentionally hiding anything; rather, I am holding back because
the solution for the FIFO usage is not fully tested.

I am told that the good news is that the sync case is 100% fixed by the
~1ns delay or 180 degree clock phase (whichever is easier to do).

The async case solution is also easy (I am told).

Austin

Ray Andraka wrote:

> Austin Lesea wrote:
>
>> Ray,
>>
>>
>> For sync case, we know the delay 'fixes' the issue.
>>
>> For the async case, there is a whole lot of work going on RIGHT NOW to
>> figure out what is the "problem" and what is the (simple) solution.
>>
>>

>
> Austin,
> The synchronous case is a subset of the async case. If the sync case
> were simply a case of an indeterminate delay of the flag output, then it
> wouldn't be a problem, however since the synchronous case can screw up
> the flags if you have two flag events in a row, the same holds true for
> the async case when the clock edges happen to be close enough together.
> The async case is actually more problematic, because you can't just add
> a delay to one of the clocks to "fix" it. It is also more insidious,
> since it may work fine until one day when your async clocks happen to
> have the edges align just as the fifo is going empty or full.
>
> Obviously, I can't tell Xilinx what to do here, and they can easily tell
> me to pound sand (yes, I am sticking my neck out here). I think Xilinx
> owes its current V4 customers the courtesy of a notification that a
> potential problem has surfaced with the FIFO16s, as they are a key
> feature in the FPGA architecture.

  #17 (permalink)  
Old 12-06-2005, 06:20 PM
Ray Andraka
Guest
 
Posts: n/a
Default Re: Virtex 4 FIFO16 blocks - Corruption ?

Austin,

FWIW, there is a fairly low-complexity work-around for the FIFOs that can
be made to come at least close to the FIFO16 maximum performance. It
involves operating the FIFO16 as a synchronous FIFO with one side
clocked by the rising edge and one side with the falling edge of the
same clock. This is cascaded with a small coregen async fifo. The
small (15 deep) async FIFO can be made to run at the max FIFO16 speed
with some minor modifications and/or floorplanning around the flag
counters. This looks like it will completely avoid the potential flag
issues in the FIFO16 for both async and sync operation.

For synchronous use, the FIFO16 can be clocked by opposite edges of the
same clock, which is fine for lower performance designs. For high speed
designs, either use the above async FIFO clocked on the FIFO16 side by
the falling edge of the FIFO clock, or use a double rank register with
the first rank clocked by the rising edge, passing data to the second
rank which is clocked by the falling edge, which then passes the data
onto the write side of the FIFO16. Placing the register ranks in
adjacent columns will meet the max timing of the FIFO16 without the hit
you'd normally get by running one side on the negative clock.
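
As a rough illustration of why the double rank register helps, here is a small Python sketch of the timing budget per hop. All delays and the clock period are hypothetical numbers picked for the example, setup time and clock skew are ignored, and a 50% duty cycle is assumed. It reflects one reading of the scheme above: the FIFO16 write side runs on the falling edge and the upstream logic on the rising edge.

```python
def hop_slack_ps(period_ps, launch_edge, capture_edge, path_delay_ps):
    """Slack for one register-to-register hop on a single clock.

    Same-edge hops get a full period; opposite-edge hops get half a period
    (50% duty cycle assumed, setup time and skew ignored).
    """
    budget_ps = period_ps if launch_edge == capture_edge else period_ps // 2
    return budget_ps - path_delay_ps


def double_rank_slacks(period_ps, logic_delay_ps, reg_hop_delay_ps):
    """Slack of each hop: logic -> rank 1 -> rank 2 -> FIFO16 write port."""
    return [
        hop_slack_ps(period_ps, "rise", "rise", logic_delay_ps),    # logic into rank 1
        hop_slack_ps(period_ps, "rise", "fall", reg_hop_delay_ps),  # rank 1 to rank 2
        hop_slack_ps(period_ps, "fall", "fall", reg_hop_delay_ps),  # rank 2 to FIFO16
    ]


if __name__ == "__main__":
    PERIOD_PS = 2000       # assumed clock period
    LOGIC_DELAY_PS = 1500  # assumed delay of the upstream logic path
    REG_HOP_PS = 600       # assumed delay between adjacent register columns

    # Rising-edge logic feeding a falling-edge FIFO16 write port directly.
    print("direct:", hop_slack_ps(PERIOD_PS, "rise", "fall", LOGIC_DELAY_PS))
    # Same logic with the two register ranks in between.
    print("double rank:", double_rank_slacks(PERIOD_PS, LOGIC_DELAY_PS, REG_HOP_PS))
```

With these assumed numbers the direct connection fails timing, because the logic path only gets half a period. With the two ranks in place, the only half-period hop is a short register-to-register path, which is why placing the ranks in adjacent columns matters.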

I passed this solution on to your applications folks through Jim
Simkins. Hopefully it will make it into an app-note or answer record as
a work-around.

  #18 (permalink)  
Old 12-06-2005, 09:21 PM
Austin Lesea
Guest
 
Posts: n/a
Default Re: Virtex 4 FIFO16 blocks - Corruption ?

Ray,

Thanks.

There is another solution we are also looking into for the async case
which is even simpler(?) with no performance hit, and it looks promising.

I believe all such solutions will be combined into one resource.

Jim and I have been discussing the FIFO, as well as other issues recently.

Austin

  #19 (permalink)  
Old 12-07-2005, 03:23 AM
johnp
Guest
 
Posts: n/a
Default Re: Virtex 4 FIFO16 blocks - Corruption ?

Ray -

I've got to agree with you that finding stuff in the answer database
is hit-or-miss at best. I use Xilinx parts in my designs and I like the
parts, but the support from the web site leaves a lot to be desired.
Thank goodness Peter and Austin pay attention to this group!

I often think that webmasters should be forced to sit with users for a
while so they end up understanding how slow and poor the user
experience is.

My most recent frustration was trying to find information on a DCM bug
requiring the bitgen centered option. Good luck finding info on it.

John Providenza.
