+ Post New Thread
Results 1 to 19 of 19
  1. #1
    Advanced Member level 5
    Points: 11,840, Level: 26

    Join Date
    Aug 2011
    Posts
    2,421
    Helped
    280 / 280
    Points
    11,840
    Level
    26

    VHDL Comparison Tree

    Hello,

    I have an input array of 128 unsigned vectors - 0 to 127.
    I need to compare all cells and return the POSITION of the cell with the highest result.
    To achieve higher throughput - I used a pipelined tree topology as follows:

    Code:
        comparison_tree : process ( IN_CLOCK , IN_RESET_ASYNCHRONOUS ) is
        begin
            if IN_RESET_ASYNCHRONOUS = '1' then
                stage_0_vote <= ( others => 0 ) ;    
                stage_1_vote <= ( others => 0 ) ;    
                stage_2_vote <= ( others => 0 ) ;    
                stage_3_vote <= ( others => 0 ) ;    
                stage_4_vote <= ( others => 0 ) ;    
                stage_5_vote <= ( others => 0 ) ;    
                stage_6_vote <= ( others => 0 ) ;    
                stage_7_vote <= 0 ;           
            elsif rising_edge ( IN_CLOCK ) then
                if IN_RESET_SYNCHRONOUS = '1' then
                    stage_0_vote <= ( others => 0 ) ;    
                    stage_1_vote <= ( others => 0 ) ;    
                    stage_2_vote <= ( others => 0 ) ;    
                    stage_3_vote <= ( others => 0 ) ;    
                    stage_4_vote <= ( others => 0 ) ;    
                    stage_5_vote <= ( others => 0 ) ;    
                    stage_6_vote <= ( others => 0 ) ;    
                    stage_7_vote <= 0 ;        
                else    
                
                    for index in 0 to 63
                    loop
                        if input ( 2 * index ) > input ( 2 * index + 1 ) then
                            stage_0_vote ( index ) <= 2 * index ;
                        else
                            stage_0_vote ( index ) <= ( 2 * index + 1 ) ;
                        end if ;    
                    end loop ;
                    
                    for index in 0 to 31
                    loop
                        if stage_0_vote ( 2 * index ) > stage_0_vote ( 2 * index + 1 ) then
                            stage_1_vote ( index ) <= stage_0_vote ( 2 * index ) ;
                        else
                            stage_1_vote ( index ) <= stage_0_vote ( 2 * index + 1 ) ;
                        end if ;    
                    end loop ;                
    
                    for index in 0 to 15
                    loop
                        if stage_1_vote ( 2 * index ) > stage_1_vote ( 2 * index + 1 ) then
                            stage_2_vote ( index ) <= stage_1_vote ( 2 * index ) ;
                        else
                            stage_2_vote ( index ) <= stage_1_vote ( 2 * index + 1 ) ;
                        end if ;    
                    end loop ;        
    
                    for index in 0 to 7
                    loop
                        if stage_2_vote ( 2 * index ) > stage_2_vote ( 2 * index + 1 ) then
                            stage_3_vote ( index ) <= stage_2_vote ( 2 * index ) ;
                        else
                            stage_3_vote ( index ) <= stage_2_vote ( 2 * index + 1 ) ;
                        end if ;    
                    end loop ;   
                    
                    
                    for index in 0 to 3
                    loop
                        if stage_3_vote ( 2 * index ) > stage_3_vote ( 2 * index + 1 ) then
                            stage_4_vote ( index ) <= stage_3_vote ( 2 * index ) ;
                        else
                            stage_4_vote ( index ) <= stage_3_vote ( 2 * index + 1 ) ;
                        end if ;    
                    end loop ;        
    
                    for index in 0 to 1
                    loop
                        if stage_4_vote ( 2 * index ) > stage_4_vote ( 2 * index + 1 ) then
                            stage_5_vote ( index ) <= stage_4_vote ( 2 * index ) ;
                        else
                            stage_5_vote ( index ) <= stage_4_vote ( 2 * index + 1 ) ;
                        end if ;    
                    end loop ;  
    
                    for index in 0 to 1
                    loop
                        if stage_5_vote ( 2 * index ) > stage_5_vote ( 2 * index + 1 ) then
                            stage_6_vote ( index ) <= stage_5_vote ( 2 * index ) ;
                        else
                            stage_6_vote ( index ) <= stage_5_vote ( 2 * index + 1 ) ;
                        end if ;    
                    end loop ;  
                    
                    if stage_6_vote ( 0 ) > stage_6_vote ( 1 ) then
                        stage_7_vote <= stage_6_vote ( 0 ) ;
                    else
                        stage_7_vote <= stage_6_vote ( 1 ) ;
                    end if ;                   
         
                end if ;
            end if ;
        end process comparison_tree ;
    Do you see a way how this can be written more generically for any even input size ?

  2. #2
    Advanced Member level 3
    Points: 4,171, Level: 15

    Join Date
    Apr 2016
    Posts
    873
    Helped
    161 / 161
    Points
    4,171
    Level
    15

    Re: VHDL Comparison Tree

    with a tree structure it's complicated, because the number of stages depends on the size.

    why don't you perform the comparison in time instead of in space? write a simple fsm that loops through the entire array, compares, stores the current highest value, advances, rinse repeat. it is much more flexible but at the cost of latency.
    Really, I am not Sam.


    1 members found this post helpful.

  3. #3
    Advanced Member level 5
    Points: 11,840, Level: 26

    Join Date
    Aug 2011
    Posts
    2,421
    Helped
    280 / 280
    Points
    11,840
    Level
    26

    Re: VHDL Comparison Tree

    Interesting workaround - but it will also consume much more registers without (my guess) a significant performance gain.



    •   Alt18th June 2017, 18:20

      advertising

        
       

  4. #4
    Super Moderator
    Points: 232,370, Level: 100
    Awards:
    1st Helpful Member

    Join Date
    Jan 2008
    Location
    Bochum, Germany
    Posts
    40,133
    Helped
    12258 / 12258
    Points
    232,370
    Level
    100

    Re: VHDL Comparison Tree

    What's the dataflow? A full array loaded every clock cycle (512 bit parallel interface), one array element updated every clock cycle, anything in between?



  5. #5
    Advanced Member level 5
    Points: 11,840, Level: 26

    Join Date
    Aug 2011
    Posts
    2,421
    Helped
    280 / 280
    Points
    11,840
    Level
    26

    Re: VHDL Comparison Tree

    What's the dataflow?
    128 words - every clock of which I have to find the location of the highest.

    - - - Updated - - -

    The "steps" topology ThisIsNotSam suggested would require a number of registers that's equivalent to the sum of an Arithmetic Sequence from 1 to 128 - I.E: 1+2+...+128 registers = 8256



    •   Alt18th June 2017, 19:41

      advertising

        
       

  6. #6
    Advanced Member level 3
    Points: 4,279, Level: 15

    Join Date
    Feb 2015
    Posts
    705
    Helped
    211 / 211
    Points
    4,279
    Level
    15

    Re: VHDL Comparison Tree

    Code:
    -- This is more generic.  It relies on the synthesis tools doing the right thing.
    -- I haven't synthesized this yet.  You should compare it to other styles.
    -- Historically, tools have treated sequentially numbered names as a bus.  This
    -- is not the case for "stages".  Likewise using variables in the loop for indexing
    -- probably synthesizes fine, but maybe not.  You can always ensure it works by
    -- creating constant integer_arrays for them.  
    
    -- I will assume a power of two input.  normalize to this if needed.
    signal stages : element_array(2*N-1-1 downto 0);
    constant kStages : integer := ceil_log2(N); -- find some version of this common function
    
    function stage(x : element_array) return element_array is
      variable x_nml : element_array(x'length-1 downto 0) := x;
      variable ret   : element_array(x'length/2 -1 downto 0);
    begin
      for i in ret'range loop
        if x_nml(2*i+1) > x_nml(2*i) then
          ret(i) := x_nml(2*i+1);
        else
          ret(i) := x_nml(2*i);
        end if;
      end loop;
      return ret;
    end function;
    
    
    process (clk, rst) is
      variable idx_low : integer := 0;
      variable idx_size : integer := N;
    begin
      if rst = '1' then
        stages := (others => kElementZero);
      else if rising_edge(clk) then
        idx_low := 0;
        idx_size := N;
        for i in 1 to kStages loop
          stages(idx_low + idx_size/2 -1 downto idx_low) <= stage(stages(idx_low+idx_size-1 downto idx_low));
          idx_low := idx_low + idx_size/2;
          idx_size := idx_size/2;
        end loop;
      end if;
    end process;
    This is one way to do this. It is similar to a software approach, although it is logically equivalent to the other style. It might not synthesize optimally, or at all. You might need to precompute more constants to ensure the tools know they are constant.

    Another generic method would use if-generate/for-generate to recursively generate components. But that generates a lot of components and probably isn't as sim-friendly.

    In terms of registers, you should compare the number of registers to the amount of logic. Registers are often treated as "free" in FPGAs. With the exception being large register-based rams.

    - - - Updated - - -

    I will also mention that neither the OP nor my code solve the problem in the first post. They both return the value of the highest element, not the position in the array.

    - - - Updated - - -

    oh, my code has bugs as well. In any case, this shows one way the code could be more generic. I might update it later as I'm interested to see what the results are.


    1 members found this post helpful.

  7. #7
    Advanced Member level 5
    Points: 11,840, Level: 26

    Join Date
    Aug 2011
    Posts
    2,421
    Helped
    280 / 280
    Points
    11,840
    Level
    26

    Re: VHDL Comparison Tree

    Another generic method would use if-generate/for-generate to recursively generate components. But that generates a lot of components and probably isn't as sim-friendly.
    Can you show an example of what you mean ?



  8. #8
    Advanced Member level 3
    Points: 5,779, Level: 18
    Achievements:
    7 years registered

    Join Date
    Jul 2010
    Location
    Sweden
    Posts
    754
    Helped
    311 / 311
    Points
    5,779
    Level
    18

    Re: VHDL Comparison Tree

    Quote Originally Posted by shaiko View Post
    128 words - every clock of which I have to find the location of the highest.
    What is the clock frequency?
    Is it possible to use a higher frequency clock to drive a state machine?



  9. #9
    Advanced Member level 5
    Points: 11,840, Level: 26

    Join Date
    Aug 2011
    Posts
    2,421
    Helped
    280 / 280
    Points
    11,840
    Level
    26

    Re: VHDL Comparison Tree

    No.
    The data input clock is already high.
    But what do you suggest ?



    •   Alt19th June 2017, 12:43

      advertising

        
       

  10. #10
    Advanced Member level 5
    Points: 11,840, Level: 26

    Join Date
    Aug 2011
    Posts
    2,421
    Helped
    280 / 280
    Points
    11,840
    Level
    26

    Re: VHDL Comparison Tree

    This one works well:
    Code:
    entity comparison_tree is
    
    generic
    (
        NUMBER_OF_WORDS	: positive  := 128 ; 
        WIDTH_WORD     	: positive  := 12 
    ) ;
    
    port 
    (
    	IN_CLOCK			  	: in	std_logic 																				; -- Global clock.
    	IN_DATA 				: in	generic_1d_unsigned_array ( 0 to NUMBER_OF_WORDS - 1 ) ( WIDTH_WORD - 1 downto 0 )   ; 
        IN_RESET_ASYNCHRONOUS	: in	std_logic 																				; -- Asynchronous Reset.
    	IN_RESET_SYNCHRONOUS	: in	std_logic 																				; -- Synchronous Reset.	
                        
    	OUT_SOURCE			  	: out 	unsigned ( positive ( ceil ( log2 ( real ( NUMBER_OF_WORDS ) ) ) ) - 1 downto 0 )   
    ) ;
    
    end entity comparison_tree ;
    
    
    
    
    
    architecture rtl_comparison_tree of comparison_tree is 
    
    signal stage_0_result : generic_1d_unsigned_array ( 0 to NUMBER_OF_WORDS / 2 - 1 ) ( WIDTH_WORD - 1 downto 0 ) ;
    signal stage_1_result : generic_1d_unsigned_array ( 0 to NUMBER_OF_WORDS / 4 - 1 ) ( WIDTH_WORD - 1 downto 0 ) ;  
    signal stage_2_result : generic_1d_unsigned_array ( 0 to NUMBER_OF_WORDS / 8 - 1 ) ( WIDTH_WORD - 1 downto 0 ) ;
    signal stage_3_result : generic_1d_unsigned_array ( 0 to NUMBER_OF_WORDS / 16 - 1 ) ( WIDTH_WORD - 1 downto 0 ) ;    
    signal stage_4_result : generic_1d_unsigned_array ( 0 to NUMBER_OF_WORDS / 32 - 1 ) ( WIDTH_WORD - 1 downto 0 ) ;
    signal stage_5_result : generic_1d_unsigned_array ( 0 to NUMBER_OF_WORDS / 64 - 1 ) ( WIDTH_WORD - 1 downto 0 ) ;
    signal stage_6_result : unsigned ( WIDTH_WORD - 1 downto 0 ) ;  
        
    signal stage_0_source : generic_1d_unsigned_array ( 0 to NUMBER_OF_WORDS / 2 - 1 ) ( positive ( ceil ( log2 ( real ( NUMBER_OF_WORDS ) ) ) ) - 1 downto 0 ) ;
    signal stage_1_source : generic_1d_unsigned_array ( 0 to NUMBER_OF_WORDS / 4 - 1 ) ( positive ( ceil ( log2 ( real ( NUMBER_OF_WORDS ) ) ) ) - 1 downto 0 ) ;  
    signal stage_2_source : generic_1d_unsigned_array ( 0 to NUMBER_OF_WORDS / 8 - 1 ) ( positive ( ceil ( log2 ( real ( NUMBER_OF_WORDS ) ) ) ) - 1 downto 0 ) ;
    signal stage_3_source : generic_1d_unsigned_array ( 0 to NUMBER_OF_WORDS / 16 - 1 ) ( positive ( ceil ( log2 ( real ( NUMBER_OF_WORDS ) ) ) ) - 1 downto 0 ) ;    
    signal stage_4_source : generic_1d_unsigned_array ( 0 to NUMBER_OF_WORDS / 32 - 1 ) ( positive ( ceil ( log2 ( real ( NUMBER_OF_WORDS ) ) ) ) - 1 downto 0 ) ;
    signal stage_5_source : generic_1d_unsigned_array ( 0 to NUMBER_OF_WORDS / 64 - 1 ) ( positive ( ceil ( log2 ( real ( NUMBER_OF_WORDS ) ) ) ) - 1 downto 0 ) ;
    signal stage_6_source : unsigned ( positive ( ceil ( log2 ( real ( NUMBER_OF_WORDS ) ) ) ) - 1 downto 0 ) ;      
        
    begin
    
    
        OUT_SOURCE <= stage_6_source ;
    
    
        comparison_tree : process ( IN_CLOCK , IN_RESET_ASYNCHRONOUS ) is
        begin
            if IN_RESET_ASYNCHRONOUS = '1' then
                stage_0_result <= ( others => ( others => '0' ) ) ;    
                stage_1_result <= ( others => ( others => '0' ) ) ;    
                stage_2_result <= ( others => ( others => '0' ) ) ;    
                stage_3_result <= ( others => ( others => '0' ) ) ;    
                stage_4_result <= ( others => ( others => '0' ) ) ;    
                stage_5_result <= ( others => ( others => '0' ) ) ;    
                stage_6_result <= ( others => '0' ) ;    
            elsif rising_edge ( IN_CLOCK ) then
                if IN_RESET_SYNCHRONOUS = '1' then
                    stage_0_result <= ( others => ( others => '0' ) ) ;    
                    stage_1_result <= ( others => ( others => '0' ) ) ;    
                    stage_2_result <= ( others => ( others => '0' ) ) ;    
                    stage_3_result <= ( others => ( others => '0' ) ) ;    
                    stage_4_result <= ( others => ( others => '0' ) ) ;    
                    stage_5_result <= ( others => ( others => '0' ) ) ;    
                    stage_6_result <= ( others => '0' ) ;    
                else    
                
                    for index in 0 to NUMBER_OF_WORDS / 2 - 1
                    loop
                        if IN_DATA ( 2 * index ) > IN_DATA ( 2 * index + 1 ) then
                            stage_0_result ( index ) <= IN_DATA ( 2 * index ) ;
                            stage_0_source ( index ) <= to_unsigned ( 2 * index , positive ( ceil ( log2 ( real ( NUMBER_OF_WORDS ) ) ) ) ) ;  
                        else
                            stage_0_result ( index ) <= IN_DATA ( 2 * index + 1 ) ;
                            stage_0_source ( index ) <= to_unsigned ( 2 * index + 1 , positive ( ceil ( log2 ( real ( NUMBER_OF_WORDS ) ) ) ) ) ;
                        end if ;    
                    end loop ;
                    
                    for index in 0 to NUMBER_OF_WORDS / 4 - 1
                    loop
                        if stage_0_result ( 2 * index ) > stage_0_result ( 2 * index + 1 ) then
                            stage_1_result ( index ) <= stage_0_result ( 2 * index ) ;
                            stage_1_source ( index ) <= stage_0_source ( 2 * index ) ;  
                        else
                            stage_1_result ( index ) <= stage_0_result ( 2 * index + 1 ) ;
                            stage_1_source ( index ) <= stage_0_source ( 2 * index + 1 ) ;
                        end if ;    
                    end loop ;                
    
                    for index in 0 to NUMBER_OF_WORDS / 8 - 1
                    loop
                        if stage_1_result ( 2 * index ) > stage_1_result ( 2 * index + 1 ) then
                            stage_2_result ( index ) <= stage_1_result ( 2 * index ) ;
                            stage_2_source ( index ) <= stage_1_source ( 2 * index ) ;  
                        else
                            stage_2_result ( index ) <= stage_1_result ( 2 * index + 1 ) ;
                            stage_2_source ( index ) <= stage_1_source ( 2 * index + 1 ) ;
                        end if ;   
                    end loop ;        
    
                    for index in 0 to NUMBER_OF_WORDS / 16 - 1
                    loop
                        if stage_2_result ( 2 * index ) > stage_2_result ( 2 * index + 1 ) then
                            stage_3_result ( index ) <= stage_2_result ( 2 * index ) ;
                            stage_3_source ( index ) <= stage_2_source ( 2 * index ) ;  
                        else
                            stage_3_result ( index ) <= stage_2_result ( 2 * index + 1 ) ;
                            stage_3_source ( index ) <= stage_2_source ( 2 * index + 1 ) ;
                        end if ;    
                    end loop ;   
                    
                    
                    for index in 0 to NUMBER_OF_WORDS / 32 - 1
                    loop
                        if stage_3_result ( 2 * index ) > stage_3_result ( 2 * index + 1 ) then
                            stage_4_result ( index ) <= stage_3_result ( 2 * index ) ;
                            stage_4_source ( index ) <= stage_3_source ( 2 * index ) ;  
                        else
                            stage_4_result ( index ) <= stage_3_result ( 2 * index + 1 ) ;
                            stage_4_source ( index ) <= stage_3_source ( 2 * index + 1 ) ;
                        end if ;    
                    end loop ;        
    
                    for index in 0 to NUMBER_OF_WORDS / 64 - 1
                    loop
                        if stage_4_result ( 2 * index ) > stage_4_result ( 2 * index + 1 ) then
                            stage_5_result ( index ) <= stage_4_result ( 2 * index ) ;
                            stage_5_source ( index ) <= stage_4_source ( 2 * index ) ;  
                        else
                            stage_5_result ( index ) <= stage_4_result ( 2 * index + 1 ) ;
                            stage_5_source ( index ) <= stage_4_source ( 2 * index + 1 ) ;
                        end if ;    
                    end loop ;  
    
                    if stage_5_result ( 0 ) > stage_5_result ( 1 ) then
                        stage_6_result <= stage_5_result ( 0 ) ;
                        stage_6_source <= stage_5_source ( 0 ) ;  
                    else
                        stage_6_result <= stage_5_result ( 1 ) ;
                        stage_6_source <= stage_5_source ( 1 ) ;
                    end if ;                  
         
                end if ;
            end if ;
        end process comparison_tree ;  
    
    
    
    
    
    end architecture rtl_comparison_tree ;



  11. #11
    Advanced Member level 3
    Points: 4,171, Level: 15

    Join Date
    Apr 2016
    Posts
    873
    Helped
    161 / 161
    Points
    4,171
    Level
    15

    Re: VHDL Comparison Tree

    Quote Originally Posted by shaiko View Post
    128 words - every clock of which I have to find the location of the highest.

    - - - Updated - - -

    The "steps" topology ThisIsNotSam suggested would require a number of registers that's equivalent to the sum of an Arithmetic Sequence from 1 to 128 - I.E: 1+2+...+128 registers = 8256
    Yes it would, and that is pretty acceptable in most cases. 8k registers is chump change in the world of multi-million gate designs.
    Really, I am not Sam.



  12. #12
    Advanced Member level 3
    Points: 4,279, Level: 15

    Join Date
    Feb 2015
    Posts
    705
    Helped
    211 / 211
    Points
    4,279
    Level
    15

    Re: VHDL Comparison Tree

    You can reduce the number of registers and the associated routing by building the address in reverse.

    The first stage needs only 1 bit for location -- even or odd. The second stage passes this as well as 1 bit for upper two or lower two. The third stage adds the next msb. And this continues.

    the log2 calculation should be a function as it is used several times in the file.


    1 members found this post helpful.

  13. #13
    Super Moderator
    Points: 26,790, Level: 39
    ads-ee's Avatar
    Join Date
    Sep 2013
    Location
    USA
    Posts
    6,092
    Helped
    1493 / 1493
    Points
    26,790
    Level
    39

    Re: VHDL Comparison Tree

    Quote Originally Posted by ThisIsNotSam View Post
    Yes it would, and that is pretty acceptable in most cases. 8k registers is chump change in the world of multi-million gate designs.
    FYI this is an FPGA design not a multi-million gate ASIC design.

    Even though 8k registers isn't a terribly large number of FFs to use for one module, it is a significant chunk of resources when you only have 400k in a medium sized FPGA (that's 2% of the total in one module).

    Of course if this is done in a large Virtex/Stratix part or an Ultrascale part then yes, I would agree it is chump change.



  14. #14
    Advanced Member level 3
    Points: 4,171, Level: 15

    Join Date
    Apr 2016
    Posts
    873
    Helped
    161 / 161
    Points
    4,171
    Level
    15

    Re: VHDL Comparison Tree

    Quote Originally Posted by ads-ee View Post
    FYI this is an FPGA design not a multi-million gate ASIC design.

    Even though 8k registers isn't a terribly large number of FFs to use for one module, it is a significant chunk of resources when you only have 400k in a medium sized FPGA (that's 2% of the total in one module).

    Of course if this is done in a large Virtex/Stratix part or an Ultrascale part then yes, I would agree it is chump change.
    Sure, that is understandable. FFs aside, the solution I have proposed would generate only one comparator, whereas the tree structure would create many. Depending on resources, size of the problem, etc., it might pay off.
    Really, I am not Sam.


    1 members found this post helpful.

  15. #15
    Advanced Member level 5
    Points: 11,840, Level: 26

    Join Date
    Aug 2011
    Posts
    2,421
    Helped
    280 / 280
    Points
    11,840
    Level
    26

    Re: VHDL Comparison Tree

    Yes it would, and that is pretty acceptable in most cases. 8k registers is chump change in the world of multi-million gate designs.
    When I said "register" I meant each register is actually a vector of DFFs.
    So it's actually 8256 * 16 (result width) = 132096 DFFs just for the "result steps".
    This will have to be coupled with a similar "steps" structure for the result source (similar in spirit to what I wrote in #10).
    8256 * log2(128) = 57792

    132096 + 57792 = 190K DFFs.



  16. #16
    Advanced Member level 3
    Points: 4,279, Level: 15

    Join Date
    Feb 2015
    Posts
    705
    Helped
    211 / 211
    Points
    4,279
    Level
    15

    Re: VHDL Comparison Tree

    It is important to keep in mind FPGA issues. For example, if you have simple logic followed by registers then the registers are free. The registers are in the same CLBs. If there is no logic then registers are not free -- unused CLBs are used for just registers. If the registers are not needed, they lead to routing congestion.

    As a result, 132096 registers is not meaningful if they are in the "basically free" category.



  17. #17
    Advanced Member level 5
    Points: 11,840, Level: 26

    Join Date
    Aug 2011
    Posts
    2,421
    Helped
    280 / 280
    Points
    11,840
    Level
    26

    Re: VHDL Comparison Tree

    Yes...I know that. But even if we do as we should and look at the total number of LEs (instead of looking at registers and LUTs separately) the "steps" topology is orders of magnitude more resource hungry than a the "tree"



    •   Alt19th June 2017, 22:10

      advertising

        
       

  18. #18
    Advanced Member level 3
    Points: 4,279, Level: 15

    Join Date
    Feb 2015
    Posts
    705
    Helped
    211 / 211
    Points
    4,279
    Level
    15

    Re: VHDL Comparison Tree

    The results should describe the same logic and should infer the same logic. Unless you do it wrong and the tools don't do any optimization.

    You should have:
    16*64 + 1*64 registers for stage 1.
    16*32 + 2*32 registers for stage 2.
    16*16 + 3*16 for stage 3
    16*8 + 4*8 for stage 4
    16*4 + 5*4 for stage 5
    16*2 + 6*2 for stage 6
    7 for stage 7, assuming you only need the location and not the value.

    This is a lot less than the 8k that you mention. If you need the 128 values to propagate, it probably makes sense to have them in a SRL. That would give 2k SRLs and a little over 2k registers.


    1 members found this post helpful.

  19. #19
    Advanced Member level 5
    Points: 11,840, Level: 26

    Join Date
    Aug 2011
    Posts
    2,421
    Helped
    280 / 280
    Points
    11,840
    Level
    26

    Re: VHDL Comparison Tree

    vGoodtimes,

    What you wrote is correct for a tree topology. The numbers I mentioned are for steps topology.



--[[ ]]--