04-02-2004, 04:03 PM
Austin Lesea
Re: ML300 and GigE Experiences

Tony,

A well-designed link should be error free (i.e. many, many hours without a
single bit in error). Contact the hotline for details about MGT support
on specific ML300 series boards: some early versions were not designed
to support links above 1 Gb/s, as they were designed to show off the
405PPC(tm IBM) instead.
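
To put a rough number on "many, many hours without a single bit in error": a common rule of thumb is that n error-free bits bound the BER below -ln(1 - CL)/n at confidence level CL (about 3/n at 95%), assuming independent bit errors. The sketch below only illustrates that arithmetic. The 2 Gb/s line rate and the function names are assumptions made for the example, not values taken from any board or core documentation.

```python
import math


def ber_upper_bound(line_rate_bps, error_free_seconds, confidence=0.95):
    """Upper bound on BER after an error-free run.

    Assumes independent bit errors, so zero errors in n bits gives
    BER < -ln(1 - confidence) / n.
    """
    bits = line_rate_bps * error_free_seconds
    return -math.log(1.0 - confidence) / bits


def seconds_for_target_ber(line_rate_bps, target_ber, confidence=0.95):
    """Error-free run time needed to claim BER < target_ber."""
    bits_needed = -math.log(1.0 - confidence) / target_ber
    return bits_needed / line_rate_bps


# Assumed line rate for illustration only; substitute your own.
LINE_RATE_BPS = 2.0e9

if __name__ == "__main__":
    one_hour = ber_upper_bound(LINE_RATE_BPS, 3600)
    print(f"BER bound after 1 error-free hour: {one_hour:.2e}")
    needed = seconds_for_target_ber(LINE_RATE_BPS, 1e-12)
    print(f"Error-free seconds needed for BER < 1e-12: {needed:.0f}")
```

Under these assumptions, a link that logs several errors per second is many orders of magnitude away from what an error-free hour would demonstrate.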

So, there are a hundred things to check once you find out whether your board
was built for MGT usage, but you have to start somewhere:

1) is your refclk meeting the jitter spec? The MGTs require a very low
jitter refclk. You can check this by observing a 1,0 pattern at the
outputs of the MGTs and seeing how much jitter is there. It should be much
less than 10% of a unit interval (bit period). If it is more than this,
you have a tx jitter problem. If you loop back with a jittery clock,
everything looks OK because the receiver is getting exactly the same bad
clock to work with.

2) is your logic error free when looped back? I think you said yes, but
often timing constraints may be missing, and the fabric is the source of
errors.

3) are your errors in bursts? or single? Bursts may indicate FIFO
overflow/underflow (refclks far apart in frequency, and no means to deal
with it, or the means is not working in logic -- when looped, the same
clock is used, so no problem).

4) what is the channel? coax cables are not a differential channel, and
common mode noise will roar right into the receiver if the channel is
not differential. Usually the coax cables are used to connect the TX and RX
pairs to a XAUI adapter module that connects to the actual backplane (still
not ideal, but at least most of the channel is differential).

5) what does the received eye pattern look like? This will tell you if
you have a jitter problem, or an amplitude/loss problem. If the eye
looks fantastic, that takes you right back to the digital processing,
and takes away the analog side of things again....

6) have you tried a far end loopback? At the far end, loop the received
digital data directly from the rx back into the tx so it returns to the
near end.

7) contact an FAE, and arrange to go to one of our 15 worldwide
RocketLabs(tm) locations where we have all of the equipment and
resources to debug your board, and compare it with our own boards and
designs in the labs.
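
Checks 1 and 3 above both reduce to simple arithmetic, sketched here. The 2 Gb/s line rate, the +/-100 ppm oscillator tolerance, and all function names are assumptions chosen for illustration. The elastic-buffer model (a fixed number of characters of slack recovered per compensation event) is a simplification and not a description of how any particular core behaves.

```python
def unit_interval_ps(line_rate_bps):
    """Bit period (unit interval) in picoseconds."""
    return 1.0e12 / line_rate_bps


def jitter_within_budget(jitter_ps, line_rate_bps, budget_fraction=0.10):
    """True if measured jitter is below the given fraction of a unit interval.

    The 10% default is the ceiling from check 1; the real target is
    'much less' than that.
    """
    return jitter_ps < budget_fraction * unit_interval_ps(line_rate_bps)


def max_chars_between_compensation(ppm_tolerance, chars_per_event=1):
    """Longest spacing (in characters) between clock compensation events.

    Two independent refclks, each within +/-ppm_tolerance, can differ by
    2 * ppm_tolerance in the worst case, so the buffer drifts by one
    character every 1e6 / (2 * ppm_tolerance) characters.
    """
    worst_case_offset_ppm = 2 * ppm_tolerance
    return chars_per_event * 1_000_000 // worst_case_offset_ppm


if __name__ == "__main__":
    rate = 2.0e9  # assumed line rate, bits per second
    print("UI (ps):", unit_interval_ps(rate))
    print("20 ps jitter OK?", jitter_within_budget(20.0, rate))
    print("Max chars between compensation at +/-100 ppm:",
          max_chars_between_compensation(100))
```

If compensation events are spaced further apart than this bound, a simple elastic buffer would eventually overflow or underflow, which is one way the bursts mentioned in check 3 could arise.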

Austin

Tony wrote:

> I am curious if anyone here has had success maintaining a very low BER
> link using the fiber connections on the ML300 boards.
>
> We have implemented an Aurora Protocol PLB Core for the ML300 (adding
> interface FIFO and FSMs to the Aurora CoreGen v2 core). It is
> currently a single lane system using Gige-0 on the ML300 board (MGT
> X3Y1). We were having small issues using the 156.25 bref clock so we
> are currently using a 100 MHz clock (we are just using the PLB clock
> plb_clk out of the Clock0 module on the EDK2 reference system). Clock
> compensation occurs at about 2500 reference clocks. (tried 5000, same,
> if not worse problems). Best results were with Diffswing=800mv,
> Pre-Em=33%.
>
> Unfortunately our link has problems staying up for more than 20
> minutes (it will spontaneously lose link and channel, until a
> mgt-reset on both partners kicks them off again). Additionally, there
> are mass HARD and SOFT errors reported by the Aurora core. I do not
> send any data, just let the Aurora core auto-idle. This is the
> timing:
>
> DIFFSW=800 PREEM=33% Stays up: 30+ minutes, ~5 soft errors/sec
> DIFFSW=700 PREEM=33% Stays up: 30+ minutes, ~10 soft errors/sec
> DIFFSW=600 PREEM=33% Stays up: not tested, ~20 soft errors/sec
> (explodes to 200-300 errors/sec at about 13 minutes)
> DIFFSW=500 PREEM=33% Stays up: not tested, ~30 soft errors/sec
> (explodes to 200-300 errors/sec at about 13 minutes)
>
> DIFFSW=800 PREEM=25% Stays up: not tested, ~200-300 soft errors/sec
>
> - In loopback mode (serial or parallel) the channel/lane are crisp and
> clean as ever.
>
> - When the boards start up, the errors in each situation are small
> parts/second, but then grow over time. I don't know if this is a
> function of board/chip temperature (I put a heat sink on and it seems
> to slow the increase of the error rate), or if for some reason the
> Aurora core cannot compensate for some clock skew and jitter.
>
> Could any of you guys steer me in the right direction?
>
> Is the higher loaded plb_clk as my ref_clk a source of problem?
> Anybody able to get low error rates?
>
> Thanks,
> Tony