FPGA Central - World's 1st FPGA / CPLD Portal

FPGA Central

World's 1st FPGA Portal

 

Go Back   FPGA Groups > NewsGroup > Verilog

Verilog comp.lang.verilog newsgroup / usenet

Reply
 
LinkBack Thread Tools Display Modes
  #1 (permalink)  
Old 02-25-2004, 01:21 PM
DaveW
Guest
 
Posts: n/a
Default Leading Zero count

I came up with the following (simplified) code snippet for a leading
zero counter - question is: is this the "best" way to do it? I know it
works but I wonder if there is a more efficient/correct approach.
==============================================
:
input clk;
input [23:0] xn;
:
:
reg [23:0] lz_count_reg;
reg [4:0] count;

always @( posedge clk )
begin
lz_count_reg=24;
for( count=0; count<24; count=count+1 )
if( xn[ count ] )
lz_count_reg=23-count;
end
=================================================

Thanks in anticipation...
Reply With Quote
  #2 (permalink)  
Old 02-25-2004, 05:33 PM
John_H
Guest
 
Posts: n/a
Default Re: Leading Zero count

The counter you defined is very clean and will produce valid results.

The question I have is whether you intend to synthesize this in a very fast
design. Most synthesizers will do a reasonable job for your target device
but if you need high performance, coding the loop in a more
architecture-friendly fashion may produce better results. The code will be
harder to understand at a glance but your performance can change
significantly.


"DaveW" <[email protected]> wrote in message
news:[email protected] om...
> I came up with the following (simplified) code snippet for a leading
> zero counter - question is: is this the "best" way to do it? I know it
> works but I wonder if there is a more efficient/correct approach.
> ==============================================
> :
> input clk;
> input [23:0] xn;
> :
> :
> reg [23:0] lz_count_reg;
> reg [4:0] count;
>
> always @( posedge clk )
> begin
> lz_count_reg=24;
> for( count=0; count<24; count=count+1 )
> if( xn[ count ] )
> lz_count_reg=23-count;
> end
> =================================================
>
> Thanks in anticipation...



Reply With Quote
  #3 (permalink)  
Old 02-25-2004, 06:55 PM
Chris F Clark
Guest
 
Posts: n/a
Default Re: Leading Zero count

Dave Wooff asked:
> I came up with the following (simplified) code snippet for a leading
> zero counter - question is: is this the "best" way to do it? I know it
> works but I wonder if there is a more efficient/correct approach.


I think you can fid a variety of different implementations under the
names "find first 1 bit" and "priority encoder". Different
implementations are likely to cause the synthesizer to generate
different circuits and they are likely have different trade-offs in
terms of speed, size, etc.

Consider, for example:

casez(xn)
24'b0: count = 24;
24'b1: count = 23;
24'b1?: count = 22;
24'b1??: count = 21;
...
24'b1???????????????????????: count = 0;
endcase

or this version

if (xn[23:12] == 0)
begin
count = count + 12;
end
else
begin
xn = xn >> 12;
end;
if (xn[11:6] == 0)
begin
count = count + 6;
end
else
begin
xn = xn >> 6;
end;
if (xn[5:3] == 0)
begin
count = count + 3;
end
else
begin
xn = xn >> 3;
end;
if (xn[2:1] == 0)
begin
count = count + 2;
end
else
begin
xn = xn >> 2;
end;
if (xn[0] == 0)
begin
count = count + 1;
end

There may be other versions that are more friendly to your particular
technology and needs.

Hope this helps,
-Chris

************************************************** ***************************
Chris Clark Internet : [email protected]
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
23 Bailey Rd voice : (508) 435-5016
Berlin, MA 01503 USA fax : (978) 838-0263 (24 hours)
------------------------------------------------------------------------------
Reply With Quote
  #4 (permalink)  
Old 02-25-2004, 08:32 PM
bj Porcella
Guest
 
Posts: n/a
Default Re: Leading Zero count

Dave:

First of all the code doesn't work.
Yes lz_count_reg gets the correct value on the first 1 found in the input,
but you gotta break out of the loop now in case there is another 1 e.g.:

for( count=0; count<24 && !xn[ count ]; count=count+1 )

now this doesn't work either because the loop test comes before your
internal test --- but why not make the loop test key...

i.e. for(count=23; count>0 && !xn[ count ]; count = count-1) ;

now you don't need the lz_count_reg (count gives you the answer) - and
you don't need the subractor you will instantiate with line:
lz_count_reg=23-count.
btw why is lz_count_reg 24 bits -- it need only be 5.

That's the silly stuff you did.
More subtitle is the test of xn[ count ]; This instantiates a 24 to 1
multiplexor -- a fairly large piece logic.
It might be better to shift the input value in a shift register and only
test the top bit ---


And then you didn't mention speed --- but the result can be obtained
immediately -- at some cost in logic one reasonable
way to document:

egg count = !xn[23] ? 0 :
!xn[22] ? 1 :
!xn[21] ? 2 :
!xn[20] ? 3 :



etc -- it wouldn't be a bad idea to make the execution order of above
explicit by using parenthesis.

--
bj Porcella
http://pages.sbcglobal.net/bporcella/
"DaveW" <[email protected]> wrote in message
news:[email protected] om...
> I came up with the following (simplified) code snippet for a leading
> zero counter - question is: is this the "best" way to do it? I know it
> works but I wonder if there is a more efficient/correct approach.
> ==============================================
> :
> input clk;
> input [23:0] xn;
> :
> :
> reg [23:0] lz_count_reg;
> reg [4:0] count;
>
> always @( posedge clk )
> begin
> lz_count_reg=24;
> for( count=0; count<24; count=count+1 )
> if( xn[ count ] )
> lz_count_reg=23-count;
> end
> =================================================
>
> Thanks in anticipation...



Reply With Quote
  #5 (permalink)  
Old 02-25-2004, 10:52 PM
David Wooff
Guest
 
Posts: n/a
Default Re: Leading Zero count

John_H <[email protected]> wrote in message
news:f74%[email protected]
> The counter you defined is very clean and will produce valid results.
>
> The question I have is whether you intend to synthesize this in a very

fast
> design. Most synthesizers will do a reasonable job for your target device
> but if you need high performance, coding the loop in a more
> architecture-friendly fashion may produce better results. The code will

be
> harder to understand at a glance but your performance can change
> significantly.


Thanks. I'm using an FPGA. I come from a DSP Software background really so
I'm a little green on this subject. I guessed that this code may be
inefficient so I was interested to know if the approach I had taken had any
merit whatsoever.


Reply With Quote
  #6 (permalink)  
Old 02-25-2004, 10:58 PM
David Wooff
Guest
 
Posts: n/a
Default Re: Leading Zero count

Chris F Clark <[email protected]> wrote in message
news:[email protected]
> Dave Wooff asked:
> > I came up with the following (simplified) code snippet for a leading
> > zero counter - question is: is this the "best" way to do it? I know it
> > works but I wonder if there is a more efficient/correct approach.

>
> I think you can fid a variety of different implementations under the
> names "find first 1 bit" and "priority encoder". Different
> implementations are likely to cause the synthesizer to generate
> different circuits and they are likely have different trade-offs in
> terms of speed, size, etc.
>
> Consider, for example:......


Thanks for this, it is good food for thought. I can see how your alternate
approach tries to reduce the number of steps involved and I guess this would
result in faster logic, although I would not know for sure without
experimenting. I trust your comments in any case.


Reply With Quote
  #7 (permalink)  
Old 02-25-2004, 11:03 PM
David Wooff
Guest
 
Posts: n/a
Default Re: Leading Zero count


bj Porcella <[email protected]> wrote in message
news:IL6%[email protected] m...
> Dave:
>
> First of all the code doesn't work.


Well, it seems to on my simulator, I think you need to re-examine the code
snippet. Yes the result could be 5 bits, that was a mistake in simplifying
the code to put in my original post.


> That's the silly stuff you did.
> More subtitle is the test of xn[ count ]; This instantiates a 24 to 1
> multiplexor -- a fairly large piece logic.


It did seem to use up not an insubstantial amount of logic, but not so much
that I was worried. I take your point though.

Thankyou.



Reply With Quote
  #8 (permalink)  
Old 02-25-2004, 11:07 PM
David Wooff
Guest
 
Posts: n/a
Default Re: Leading Zero count

I think my other motivation for posting this was that this problem must be
one of the more common ones as it is used in floating point arithmetic, so I
would have thought it had been done to death and that a standard Verilog
algorithm would have resulted by now. I did search for this, however, and
didn't find all that much really, so I was a little surprised.

--
Dave Wooff
[email protected]
DaveW <[email protected]> wrote in message
news:[email protected] om...
> I came up with the following (simplified) code snippet for a leading
> zero counter - question is: is this the "best" way to do it? I know it
> works but I wonder if there is a more efficient/correct approach.
> ==============================================
> :
> input clk;
> input [23:0] xn;
> :
> :
> reg [23:0] lz_count_reg;
> reg [4:0] count;
>
> always @( posedge clk )
> begin
> lz_count_reg=24;
> for( count=0; count<24; count=count+1 )
> if( xn[ count ] )
> lz_count_reg=23-count;
> end
> =================================================
>
> Thanks in anticipation...



Reply With Quote
  #9 (permalink)  
Old 02-26-2004, 03:22 AM
Allan Herriman
Guest
 
Posts: n/a
Default Re: Leading Zero count

On Wed, 25 Feb 2004 21:52:00 -0000, "David Wooff"
<[email protected]> wrote:

>John_H <[email protected]> wrote in message
>news:f74%[email protected]
>> The counter you defined is very clean and will produce valid results.
>>
>> The question I have is whether you intend to synthesize this in a very

>fast
>> design. Most synthesizers will do a reasonable job for your target device
>> but if you need high performance, coding the loop in a more
>> architecture-friendly fashion may produce better results. The code will

>be
>> harder to understand at a glance but your performance can change
>> significantly.

>
>Thanks. I'm using an FPGA. I come from a DSP Software background really so
>I'm a little green on this subject. I guessed that this code may be
>inefficient so I was interested to know if the approach I had taken had any
>merit whatsoever.


I did a 200MHz 16 input / five output priority encoder in an FPGA a
few years back. I used a carry chain to find the first bit set, then
some straightforward logic (in a tree structure) to convert the bit
position to binary.

Regards,
Allan.
Reply With Quote
  #10 (permalink)  
Old 02-26-2004, 04:53 PM
john jakson
Guest
 
Posts: n/a
Default Re: Leading Zero count

[email protected] (DaveW) wrote in message news:<[email protected] com>...
> I came up with the following (simplified) code snippet for a leading
> zero counter - question is: is this the "best" way to do it? I know it
> works but I wonder if there is a more efficient/correct approach.
> ==============================================
> :
> input clk;
> input [23:0] xn;
> :
> :
> reg [23:0] lz_count_reg;
> reg [4:0] count;
>
> always @( posedge clk )
> begin
> lz_count_reg=24;
> for( count=0; count<24; count=count+1 )
> if( xn[ count ] )
> lz_count_reg=23-count;
> end
> =================================================
>
> Thanks in anticipation...


You need to think with a HW hat on to get really good performance. Dig
up an old TI TTL book and check out the section on Priority Encoding.
Things have't changed much since then. The natural way to think is
recursive division. Typically for large N say 16 or 64 partition into
4 ways and solve that and recombine results. When you know how to
recombine, you only need to PE 4bits and build result from there.

Should be possible to do arbitrary encoder for N (power of 2 or 4) in
about O logN time where O might be 1 or 2 gate delays. In an ASIC, O
might be closer to transit time so even faster. For an FPGA you will
be better off with adder type circuit and XST doesn't seem to synth
fast adders with ripple adder until the length is long enough say 4b,
so recursion doesn't work here. And doing math with multiple adder
fragments in one pipe will be slow too.

If your input word is all 0's but for a single 1, the solution is much
easier, then a tree decoder works quickly. If you have 1 or more 1s,
1st build a force all right 1s to 0 stage infront. This stage looks
triangular if you do it bitwise, but again recursion can help here
too, break into 4 or 8 bit sub words. The right most 1 or lsb needs to
be killed by all the 1's to the left, and big wide gates in FPGA are
also slow.

Its a solved problem, but not neccesary encoded into any tool I know
of. Also look at the big blue Verilog/VHDL book by Smith IIRC that
describes hundreds of synth code modules in both langs with
schematics. PEs should be in there. its a must have book for any HW
guy.

johnjakson_usa_com
Reply With Quote
  #11 (permalink)  
Old 02-26-2004, 06:13 PM
John_H
Guest
 
Posts: n/a
Default Re: Leading Zero count

Greetings, bj Porcella.

"bj Porcella" <[email protected]> wrote in message
news:IL6%[email protected] m...
> Dave:
>
> First of all the code doesn't work.


This form of code has worked fine for my own synthesis and simulation,
relying on the blocking operator (=) to provide a loop-based priority
encoder. The later iterations of the loop are higher in the priority chain.

> Yes lz_count_reg gets the correct value on the first 1 found in the input,
> but you gotta break out of the loop now in case there is another 1 e.g.:


The idea with the loop isn't to find the first one (note that the counter is
incrementing from 0 to find the MSbit that is one) but to find the *last*
one.

> for( count=0; count<24 && !xn[ count ]; count=count+1 )
>
> now this doesn't work either because the loop test comes before your
> internal test --- but why not make the loop test key...
>
> i.e. for(count=23; count>0 && !xn[ count ]; count = count-1) ;


This form of conditional loop exit will probably wreak havok on synthesis
though simulators would probably be okay with it.

> now you don't need the lz_count_reg (count gives you the answer) - and
> you don't need the subractor you will instantiate with line:
> lz_count_reg=23-count.


The subtraction doesn't get implemented in logic but provides a constant
that feeds the priority encoder.

> btw why is lz_count_reg 24 bits -- it need only be 5.
>
> That's the silly stuff you did.
> More subtitle is the test of xn[ count ]; This instantiates a 24 to 1
> multiplexor -- a fairly large piece logic.


The 24-stage priority encoder could be nasty in general but the case of a 24
stage priority encode of *constants* the situation is much cleaner. I
believe it was Peter Alfke who proposed using BlockRAMs instead of general
logic to do most of the work for a count like this.

> It might be better to shift the input value in a shift register and only
> test the top bit ---
>
>
> And then you didn't mention speed --- but the result can be obtained
> immediately -- at some cost in logic one reasonable
> way to document:


I love immediate logic. Even 10 ps of propagation delay is such a bother.

> egg count = !xn[23] ? 0 :
> !xn[22] ? 1 :
> !xn[21] ? 2 :
> !xn[20] ? 3 :
>
>
>
> etc -- it wouldn't be a bad idea to make the execution order of above
> explicit by using parenthesis.
>
> --
> bj Porcella
> http://pages.sbcglobal.net/bporcella/


[ original post omitted ]


Reply With Quote
  #12 (permalink)  
Old 02-27-2004, 01:04 AM
bj Porcella
Guest
 
Posts: n/a
Default Re: Leading Zero count

Dave and John:

mea culpa.... It didn't even cross my mind that you would be checking
from the bottom up. Sorry. The code does work. That's the
good news.

On the other hand that is SO wistful of time....... start checking from
the MSB and stop when you hit a 1. If data is uniformly distributed
the circuit will finish on any given check 50% of the time -- much faster
on average than going through all the bits.

And yes I agree with Dave that my "loop code" is not good for synthesis --
I would never write it that way -- but might implement the
basic idea in a different fashion. ---- Personally I never use for
loops in RTL code --- I like to think more in terms of hardware
structure...

like:

//assume begin_check is a pulse indicating good xn data and I am looking
for the MS 1 )...

always @(posedge clk)
if (begin_check) test_bit <= 23;
else if (xn[test_bit] == 0) test_bit <= (test_bit >0) ? test_bit-1 : 0;

wire done = xn[test_bit]; // when done test_bit points to MSB 1.



--
bj Porcella
http://pages.sbcglobal.net/bporcella/
"John_H" <[email protected]> wrote in message
news:WOp%[email protected]
> Greetings, bj Porcella.
>
> "bj Porcella" <[email protected]> wrote in message
> news:IL6%[email protected] m...
> > Dave:
> >
> > First of all the code doesn't work.

>
> This form of code has worked fine for my own synthesis and simulation,
> relying on the blocking operator (=) to provide a loop-based priority
> encoder. The later iterations of the loop are higher in the priority

chain.
>
> > Yes lz_count_reg gets the correct value on the first 1 found in the

input,
> > but you gotta break out of the loop now in case there is another 1

e.g.:
>
> The idea with the loop isn't to find the first one (note that the counter

is
> incrementing from 0 to find the MSbit that is one) but to find the *last*
> one.
>
> > for( count=0; count<24 && !xn[ count ]; count=count+1 )
> >
> > now this doesn't work either because the loop test comes before your
> > internal test --- but why not make the loop test key...
> >
> > i.e. for(count=23; count>0 && !xn[ count ]; count = count-1) ;

>
> This form of conditional loop exit will probably wreak havok on synthesis
> though simulators would probably be okay with it.
>
> > now you don't need the lz_count_reg (count gives you the answer) -

and
> > you don't need the subractor you will instantiate with line:
> > lz_count_reg=23-count.

>
> The subtraction doesn't get implemented in logic but provides a constant
> that feeds the priority encoder.
>
> > btw why is lz_count_reg 24 bits -- it need only be 5.
> >
> > That's the silly stuff you did.
> > More subtitle is the test of xn[ count ]; This instantiates a 24 to

1
> > multiplexor -- a fairly large piece logic.

>
> The 24-stage priority encoder could be nasty in general but the case of a

24
> stage priority encode of *constants* the situation is much cleaner. I
> believe it was Peter Alfke who proposed using BlockRAMs instead of general
> logic to do most of the work for a count like this.
>
> > It might be better to shift the input value in a shift register and only
> > test the top bit ---
> >
> >
> > And then you didn't mention speed --- but the result can be obtained
> > immediately -- at some cost in logic one reasonable
> > way to document:

>
> I love immediate logic. Even 10 ps of propagation delay is such a bother.
>
> > egg count = !xn[23] ? 0 :
> > !xn[22] ? 1 :
> > !xn[21] ? 2 :
> > !xn[20] ? 3 :
> >
> >
> >
> > etc -- it wouldn't be a bad idea to make the execution order of

above
> > explicit by using parenthesis.
> >
> > --
> > bj Porcella
> > http://pages.sbcglobal.net/bporcella/

>
> [ original post omitted ]
>
>



Reply With Quote
Reply

Bookmarks

Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On



All times are GMT +1. The time now is 06:32 AM.


Powered by vBulletin® Version 3.8.0
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.
Search Engine Friendly URLs by vBSEO 3.2.0
Copyright 2008 @ FPGA Central. All rights reserved