Pipeline: For Loop comparing Module (VHDL)

Status
Not open for further replies.

yashjain

Hi, I'm trying to compare 40 values from a memory array with my input data. If any of those 40 values matches my input data, I go to the next state.
I'm using a FOR LOOP to compare the values in one clock cycle, but it limits my Fmax clock frequency.

How do I write pipelined code that checks my input data against those 40 values in a way that increases my clock frequency?

-> It doesn't matter if it takes even 40 clock cycles to do the whole thing

Code:
signal checksum_Y      : std_logic_vector(8 downto 0);    

signal y_data          : std_logic_vector(9 downto 0);   

type did_reg is array (0 to did_width-1) of std_logic_vector (9 downto 0);            -- declaring inferred memory
  signal did_arr : did_reg;  

    when DID_Y =>                                                             

              checksum_Y <= std_logic_vector(unsigned(y_data(8 downto 0)) + unsigned(checksum_Y)); -- load check sum with previous checksum and current data value 

              for i in 0 to (did_width) loop                        -- loop over the DID array
                if i = did_width then                               -- index ran past the last stored DID: no match
                  curr_state_Y       <= IDLE_Y;                     -- jump to IDLE_Y
                elsif y_data = did_arr(i) then                      -- check whether y_data matches this element of the DID array
                  mem_arr_Y(count_Y) <= y_data;                     -- on a match, store the data in memory
                  count_Y            <= count_Y+1;                  -- increment the memory address counter
                  curr_state_Y       <= SDID_Y;                     -- jump to next state, SDID_Y
                  id_pair_cnt_Y      <= i;                          -- DID/SDID pair index, used to look up the SDID
                  exit;                                             -- leave the loop at the first match
                end if;
              end loop;

Right now the Fmax is 141 MHz.

I want to increase it to its maximum value.
 

Incomplete specification. What about the compare value, is it expected to change during processing?
If not, you don't need a pipeline, just a sequential process that steps through all memory addresses, one per clock cycle.

If it changes at arbitrary times, you need to put the compare value into a pipeline register array. And you get a multidimensional state array.
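
A minimal sketch of the pipeline register array idea, under stated assumptions: the entity name `did_pipeline_match`, the `DID_COUNT` generic and the packed `did_flat` port are hypothetical and do not come from the original design. Each stage registers the travelling word and compares it with one stored DID, so the longest combinational path is a single 10-bit compare and a new word can still enter on every clock. The match flag leaves the pipeline `DID_COUNT` clocks later, aligned with the delayed word.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: one compare per pipeline stage, one new word per clock.
entity did_pipeline_match is
  generic (DID_COUNT : positive := 40);
  port (
    clk      : in  std_logic;
    din      : in  std_logic_vector(9 downto 0);
    -- Assumed layout: stored DID number i occupies bits i*10+9 downto i*10.
    did_flat : in  std_logic_vector(DID_COUNT*10-1 downto 0);
    dout     : out std_logic_vector(9 downto 0);  -- din delayed by DID_COUNT clocks
    match    : out std_logic                      -- '1' if dout equals any stored DID
  );
end entity;

architecture rtl of did_pipeline_match is
  type word_array is array (0 to DID_COUNT-1) of std_logic_vector(9 downto 0);
  signal data_pipe  : word_array := (others => (others => '0'));
  signal match_pipe : std_logic_vector(0 to DID_COUNT-1) := (others => '0');
begin
  process (clk)
    variable stage_in : std_logic_vector(9 downto 0);
    variable match_in : std_logic;
  begin
    if rising_edge(clk) then
      -- The loop only describes DID_COUNT registered stages; it is not a
      -- chain of compares inside one clock cycle.
      for i in 0 to DID_COUNT-1 loop
        if i = 0 then
          stage_in := din;               -- first stage takes the new word
          match_in := '0';
        else
          stage_in := data_pipe(i-1);    -- later stages take the previous register
          match_in := match_pipe(i-1);
        end if;
        data_pipe(i) <= stage_in;        -- the word travels with its flag
        if stage_in = did_flat(i*10+9 downto i*10) then
          match_pipe(i) <= '1';          -- this stage found a match
        else
          match_pipe(i) <= match_in;     -- otherwise pass on earlier results
        end if;
      end loop;
    end if;
  end process;

  dout  <= data_pipe(DID_COUNT-1);
  match <= match_pipe(DID_COUNT-1);
end architecture;
```

If the index of the matching entry is needed as well, it can be carried through the stages in the same way as the flag, which is where the multidimensional state array comes from.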
 

First of all, you have only provided a code snippet, not the whole code. You don't show libraries or the process so we can only assume it's a synchronous process.

If you're happy with multiple clocks for the search, remove the for loops and use a counter instead. Currently it does all compares in parallel with the lowest index match getting priority.
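
A rough sketch of the counter approach, assuming the compare value can be latched and held while the search runs. The entity name `did_seq_search`, the `start`/`done` handshake and the packed `did_flat` port are illustrative names, not part of the posted design. One stored value is compared per clock, so the search takes about `DID_COUNT` clocks in the worst case.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: steps through the stored DIDs, one compare per clock.
entity did_seq_search is
  generic (DID_COUNT : positive := 40);
  port (
    clk      : in  std_logic;
    start    : in  std_logic;                     -- one-clock pulse: latch din and begin
    din      : in  std_logic_vector(9 downto 0);
    -- Assumed layout: stored DID number i occupies bits i*10+9 downto i*10.
    did_flat : in  std_logic_vector(DID_COUNT*10-1 downto 0);
    done     : out std_logic := '0';              -- one-clock pulse when the search ends
    found    : out std_logic := '0';              -- valid while done = '1'
    index    : out natural range 0 to DID_COUNT-1 -- matching entry, valid if found = '1'
  );
end entity;

architecture rtl of did_seq_search is
  type word_array is array (0 to DID_COUNT-1) of std_logic_vector(9 downto 0);
  signal did_arr : word_array;
  signal key     : std_logic_vector(9 downto 0) := (others => '0');
  signal idx     : natural range 0 to DID_COUNT-1 := 0;
  signal busy    : std_logic := '0';
begin
  -- Unpack the flat vector into an indexable array.
  unpack : for i in 0 to DID_COUNT-1 generate
    did_arr(i) <= did_flat(i*10+9 downto i*10);
  end generate;

  process (clk)
  begin
    if rising_edge(clk) then
      done <= '0';                       -- default: no result this clock
      if busy = '0' then
        if start = '1' then
          key  <= din;                   -- hold the compare value for the whole search
          idx  <= 0;
          busy <= '1';
        end if;
      elsif did_arr(idx) = key then      -- lowest matching index wins, as in the loop
        found <= '1';
        index <= idx;
        done  <= '1';
        busy  <= '0';
      elsif idx = DID_COUNT-1 then       -- last entry checked without a match
        found <= '0';
        done  <= '1';
        busy  <= '0';
      else
        idx <= idx + 1;                  -- try the next entry on the next clock
      end if;
    end if;
  end process;
end architecture;
```

The calling state machine would pulse `start` when it reaches the DID word and then wait for `done` before choosing the next state.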
 

Hi,

Yes, I have to compare my input data (y_data) with multiple DID values, and drive an enable signal high as a flag to indicate a match.
I think a sequential process is a good option, and I agree with your suggestion of a multidimensional array.
We also need to delay the data that follows the DID value by the same number of clock cycles.

Could you help me with the code, too?

Code:
  type did_reg is array (0 to did_width-1) of std_logic_vector (9 downto 0);            -- register-based array sized by the did_width generic
  signal did_arr : did_reg;

  type did_in_delay is array (0 to did_width-1) of std_logic_vector(9 downto 0);
  signal did_in : did_in_delay  := (others => '0');

  type y_in_delay is array (0 to did_width-1) of std_logic_vector(9 downto 0);
  signal y_data_delay : y_in_delay  := (others => '0');


y_data   <= DIN(19 downto 10);
	when DID_Y =>                                                             
     
     for i in did_width loop

      did_in(i) <=  y_data;

     if did_in(i) = did_array(i) then
    
      enable = '1';
     else
     enable = '0';
  end if;

      y_data_delay(i) <= y_data_delay(i)((y_data_delay(i)'high-1) downto 0)) & did_in(i)(i);

end loop;

Can you see what the problem in the code is?
 

The problem with the code is that it is not a complete example and you do not explain exactly what problems you're having with it.
 

Hi, sorry for the late reply.

I'm trying to extract ancillary data based on its DID/SDID. Ancillary data is embedded in video data and arrives as serial data.
The ancillary data follows the SMPTE 291M protocol.

Format -> FLAG1 -> FLAG2 -> FLAG3 -> DID -> SDID -> DC -> UDW -> CS
DID/SDID -> data ID
DC -> number of UDW
UDW -> user data words (0 to 255)
CS -> checksum word (error check)

I have to compare the DID of each incoming packet with the DIDs stored in memory beforehand. If it matches, the packet is stored in memory.
Right now I compare the incoming DID with the stored DID values using a for loop. This reduces my clock frequency.
I want to pipeline this process.
I want to compare the input DID value with one stored DID value at a time, every clock cycle. So, if there are 40 DID values to compare with, we delay the data by 40 clock cycles.

*NOTE: The packets are contiguous, i.e., there is no gap between two packets. There can be 0 to 7 ancillary data packets in one vertical video line. A 1920x1020 frame has 1020 vertical video lines.
The packet size varies.

Code:
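library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
-- log2c and return_vcount_active_end_f1/f2 are defined in a project package that is not shown here

entity generic_anc_extractor is 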

generic(
        did_width  : integer ;
        sdid_width : integer   
          );                                       
port(    
  output_data_at_vcount : in integer;  --simulation specific feature. determines when to output the data.                                   
                                       -- x=0 -> output data at last vcount of frame- default value
                                       -- 0<x -> output data at every vcount sel when hav = 1 
  VIDEO_CLK    : in std_logic;
  DATA_OUT_CLK : in std_logic;
  RESET_N      : in std_logic;
  SOF          : in std_logic;    
  HAV          : in std_logic;
  V_COUNT      : in std_logic_vector(10 downto 0);
  H_COUNT      : in std_logic_vector(12 downto 0);
  DIN_VALID    : in std_logic;    
  DIN          : in std_logic_vector(19 downto 0); 
  DIN_FORMAT   : in std_logic_vector(3 downto 0); 
  DID_data     : in std_logic_vector(9 downto 0);
  SDID_data    : in std_logic_vector(9 downto 0);
  D_WR         : in std_logic;
  D_ADRS       : in std_logic_vector(log2c(did_width)-1 downto 0); 
  SD_WR        : in std_logic;
  SD_ADRS      : in std_logic_vector(log2c(sdid_width)-1 downto 0);
  DOUT_OUT     : out std_logic_vector(19 downto 0);
  DATA_VALID_Y : out std_logic;  
  DATA_VALID_C : out std_logic
);
end generic_anc_extractor;

architecture rtl of generic_anc_extractor is 

  constant C_ADF1        : std_logic_vector(9 downto 0 ) := "0000000000"; -- 0  
  constant C_ADF2        : std_logic_vector(9 downto 0 ) := "1111111111"; -- 3FF
  constant C_ADF3        : std_logic_vector(9 downto 0 ) := "1111111111"; -- 3FF
  constant main_mem_N_Y  : integer := 10000;                -- Width of the main memory, equal to max anc data to be stored
  constant main_mem_N_C  : integer := 10000;                -- Width of the main memory, equal to max anc data to be stored
  -- constant did_width     : integer := 5;
  -- constant sdid_width    : integer := 5;

  type state_extrac_y is (IDLE_Y,
                          ADF_2_Y,
                          ADF_3_Y,
                          DID_Y,
                          SDID_Y,
                          DC_Y,
                          UDW_Y,
                          CS_Y
                          );
 

  type state_extrac_c is  (IDLE_C,
					               ADF_2_C,
						             ADF_3_C,
						             DID_C,
						             SDID_C,
                         DC_C,
						             UDW_C,
						             CS_C
                         );                         

  signal curr_state_Y    : state_extrac_y;
  signal curr_state_C      : state_extrac_C;  
  
  signal dc_count_Y        : integer := 0;
  signal dc_count_C        : integer := 0;  
  
  signal reg_out_Y       : std_logic_vector(9 downto 0);
  signal reg_out_C       : std_logic_vector(9 downto 0);  

  signal data_count_Y    : integer := 0;                    -- count is equal to DC_Y, representing the amount of UDW_Y data 
  signal checksum_Y      : std_logic_vector(8 downto 0);    -- checksum is equal to the sum of DID_Y to UDW_Y, which is then checked against CS_Y 9 bits value
  signal count_Y         : integer := 0;                    -- count for adress of the register memory
  signal cnt_Y           : integer := 0;                    -- count for writing extracted data to output port
  
    signal data_count_C    : integer := 0;                  -- count is equal to DC_C, representing the amount of UDW_C data 
  signal checksum_C      : std_logic_vector(8 downto 0);    -- checksum is equal to the sum of DID_C to UDW_C, which is then checked against CS_C 9 bits value
  signal count_C         : integer := 0;                    -- count for adress of the register memory
  signal cnt_C           : integer := 0;                    -- count for writing extracted data to output port  
  signal did_cnt         : integer := 0;                    -- DID counter
  signal sdid_cnt        : integer := 0;                    -- SDID counter
  
  signal id_pair_cnt_Y   : integer := 0;                    -- Counter for the pair index for DID and SDID
  signal id_pair_cnt_C   : integer := 0;
  
  signal last_vcount_f2  : integer := 0;                   -- returns the last value of vcount of every format of frame 2 
  signal last_vcount_f1  : integer := 0;                   -- returns the last value of vcount of every format of frmae 1
  
  signal y_data          : std_logic_vector(9 downto 0);   -- upper 10 bits of the video stream(20-11 / 20)
  signal c_data          : std_logic_vector(9 downto 0);   -- lower 10 bits of the video stream(10-0 / 20)

  signal DATA_VALID_Y_d  : std_logic := '0';
  signal DATA_VALID_C_d  : std_logic := '0';
  
  type main_mem_reg_Y is array (0 to main_mem_N_Y-1) of std_logic_vector (9 downto 0);  -- register-based main memory, sized for the maximum ANC data that can be stored
  signal mem_arr_Y : main_mem_reg_Y;

  type main_mem_reg_C is array (0 to main_mem_N_C-1) of std_logic_vector (9 downto 0);  -- register-based main memory, sized for the maximum ANC data that can be stored
  signal mem_arr_C : main_mem_reg_C;

  type did_reg is array (0 to did_width-1) of std_logic_vector (9 downto 0);            -- register-based array sized by the did_width generic
  signal did_arr : did_reg;

  type sdid_reg is array (0 to sdid_width-1) of std_logic_vector (9 downto 0);          -- register-based array sized by the sdid_width generic
  signal sdid_arr : sdid_reg;

--########################################################################################
  
  -- type did_in_delay is array (0 to did_width-1) of std_logic_vector(9 downto 0);
  -- signal did_delay : did_in_delay ;

  -- type y_in_delay is array (0 to did_width-1) of std_logic_vector(9 downto 0);
  -- signal y_delay : y_in_delay;

  -- signal did_match : std_logic;
  -- signal c : std_logic_vector(8 downto 0);

  -- signal shreg_did : std_logic_vector(9 downto 0);
  -- signal shreg_y   : std_logic_vector(9 downto 0);
--############################################################################################ 
begin 
-------------------------------------------------------------------------------------Generate output data signal for extracted data from y and c data
process(DATA_OUT_CLK)
begin
  if rising_edge(DATA_OUT_CLK) then      -- Output of the data on the rising edge of clk given by the outer interface
    if unsigned(H_COUNT) >= 4096 then   -- check for hcount 0, start of active video
      DOUT_OUT <= reg_out_Y&reg_out_C;   -- output the extracted the data from y and c data
      DATA_VALID_Y <= DATA_VALID_Y_d;
      DATA_VALID_C <= DATA_VALID_C_d;
    end if;
  end if;
end process;

---------------------------------------------------------------------------------------------Loading DID_Y/SDID_Y serially to the array
process(VIDEO_CLK)
begin
                          
if rising_edge(VIDEO_CLK) then
  if D_WR = '1' then                                              -- check for did write enable signal                           
    did_cnt          <= to_integer(unsigned(D_ADRS));             -- loading the did adrs to did_cnt signal 
    did_arr(did_cnt) <= DID_data;                                 -- load the did in to did array according to the adress signal
  end if;  
 
  if SD_WR = '1' then                                             -- check for did write enable signal                           
    sdid_cnt           <= to_integer(unsigned(SD_ADRS));          -- loading the sid adrs to sdid_cnt signal 
    sdid_arr(sdid_cnt) <= SDID_data;                              -- load the did in to did array according to the adress signal
  end if;
end if;    
end process;
------------------------------------------------------------------------------------Current State Logic
--Generate reset and next state signals
--at every clock, update the curr_state value
--when reset,curr_state is IDLE_Y
	generate_state: process(VIDEO_CLK)
	begin 

    if rising_edge(VIDEO_CLK) then 
    y_data   <= DIN(19 downto 10);                   -- loading the upper 10 bits
      if RESET_N = '0' then 
        curr_state_Y    <= IDLE_Y;                   -- default state
		  else

        if DIN_VALID = '1' then                      -- check for data valid signal
      
          case curr_state_Y is 
	------  -------------------------------------------------------------------------------------------------------------IDLE_Y 						
            when IDLE_Y =>                           
            
              checksum_Y   <= (others=>'0');                                     -- Reset checksum value for next anc data packet
              DATA_VALID_Y_d <= '0';
				  
              last_vcount_f1 <= return_vcount_active_end_f1(DIN_FORMAT);        -- last vcount of frame according to the Video Format
              last_vcount_f2 <= return_vcount_active_end_f2(DIN_FORMAT);

              if y_data = C_ADF1 then                                          -- check for ADF_1 - 0
				    		curr_state_Y <= ADF_2_Y;                                       -- jump to next state when true
              else
                 curr_state_Y <= IDLE_Y;                                       -- Jump to IDLE_Y when false
              end if;
              --OUTPUTTING THE EXTRACTED DATA FROM THE MEMORY
              if output_data_at_vcount = 0 and cnt_Y< count_Y and HAV = '1' and (unsigned(V_COUNT) = last_vcount_f1 or unsigned(V_COUNT) = last_vcount_f2) then -- check for last count according to frame 1 and 2; simulation specification -> FPGA implementation, check for space in memory and active video 
                    reg_out_Y    <= mem_arr_Y(cnt_Y);                         -- output data of register buffer starting from adrs 0 to the data stored in one line
                    cnt_Y        <= cnt_Y+1;                                  -- increase the adrs cnt till all the data is read from the register
                    DATA_VALID_Y_d <= '1';                                    -- Trigger signal to indicate data valid when the data is being output                   
              elsif output_data_at_vcount >= 1 and unsigned(V_COUNT) > 0 and cnt_Y< count_Y and HAV = '1' then-- simulation specification -> Simulation implementation,-- output only after vcount 0, check for space in memory and active video                                  
                      reg_out_Y    <= mem_arr_Y(cnt_Y);                       -- output data of register buffer starting from adrs 0 to the data stored in one line
                      cnt_Y        <= cnt_Y+1;                                -- increase the adrs cnt till all the data is read from the register
                      DATA_VALID_Y_d <= '1';
              end if;         

	-----------------------------------------------------------------------------------------------------------------ADF_2_Y / ANC DATA FLAG					
				    when ADF_2_Y =>                                                     -- ADF_2_Y - 3FF
                  
              if y_data = C_ADF2 then                                           -- check for ADF_2_Y - 3FF 
				    		curr_state_Y <= ADF_3_Y;                                        -- jump to next state when true,ADF_3_Y
				    	else                        
                curr_state_Y <= IDLE_Y;                                         -- Jump to IDLE_Y when false
              end if;

	-----------------------------------------------------------------------------------------------------------------ADF_3_Y / ANC DATA FLAG					
				    when ADF_3_Y =>                                                     -- ADF_3_Y
                       
              if y_data = C_ADF3 then                                           -- check for ADF_3_Y - 3FF
				    		curr_state_Y <= DID_Y;                                          -- jump to next state when true, DID_Y
              else                         
                curr_state_Y <= IDLE_Y;                                         -- Jump to IDLE_Y when false
              end if;
  

            when DID_Y =>
            checksum_Y <= std_logic_vector(unsigned(y_data(8 downto 0)) + unsigned(checksum_Y)); -- add the current data word to the running checksum

            for i in 0 to (did_width) loop                        -- loop over the DID array
              if i = did_width then                               -- index ran past the last stored DID: no match
                curr_state_Y       <= IDLE_Y;                     -- jump to IDLE_Y
              elsif y_data = did_arr(i) then                      -- check whether y_data matches this element of the DID array
                mem_arr_Y(count_Y) <= y_data;                     -- on a match, store the data in memory
                count_Y            <= count_Y+1;                  -- increment the memory address counter
                curr_state_Y       <= SDID_Y;                     -- jump to next state, SDID_Y
                id_pair_cnt_Y      <= i;                          -- DID/SDID pair index, used to look up the SDID
                exit;                                             -- leave the loop at the first match
              end if;
            end loop;

	------  -----------------------------------------------------------------------------------------------------------------SDID_Y/ SECONDARY DATA ID          					 
            when SDID_Y =>                    

            checksum_Y <= std_logic_vector(unsigned(y_data(8 downto 0))+unsigned(checksum_Y));-- load check sum with previous checksum_C and current data value                                                                                                                      

            if y_data = sdid_arr(id_pair_cnt_Y) then                   -- compare y_data with the SDID paired with the matched DID
              mem_arr_Y(count_Y) <= y_data;                            -- on a match, store the data in memory
              count_Y            <= count_Y+1;                         -- increment the memory address counter
              curr_state_Y       <= DC_Y;                              -- jump to next state, DC_Y
            else
              curr_state_Y <= IDLE_Y;                                  -- on a mismatch, jump to the default state, IDLE_Y
              count_Y      <= count_Y-1;                               -- step the address counter back so the stored DID gets overwritten
            end if;   
          
          
          -----------------------------------------------------------------------------------------------------------------DC_Y/ DATA COUNT
          when DC_Y =>

            dc_count_Y       <= to_integer(unsigned(y_data(8 downto 0)));                    -- lower 9 bits of y_data give the number of UDW_Y in the packet

            checksum_Y         <= std_logic_vector(unsigned(y_data(8 downto 0))+unsigned(checksum_Y)); -- add the current data word to the running checksum_Y
            mem_arr_Y(count_Y) <= y_data;                                                    -- store data in memory
            count_Y            <= count_Y +1;                                                -- increment the memory address counter

            curr_state_Y    <= UDW_Y;                                                        -- jump to next state, UDW_Y
            
          -----------------------------------------------------------------------------------------------------------------UDW_Y/ USER DATA WORDS
          when UDW_Y =>

              if data_count_Y <= dc_count_Y then               -- stay here while data_count_Y is less than or equal to dc_count_Y (number of UDW_Y)
                checksum_Y    <= std_logic_vector(unsigned(y_data(8 downto 0))+unsigned(checksum_Y)); -- add the current data word to the running checksum_Y
                data_count_Y  <= data_count_Y+1;               -- data_count_Y tracks how many user data words have been stored

                mem_arr_Y(count_Y) <= y_data;                  -- store data in memory
                count_Y            <= count_Y+1;               -- increment the memory address counter

                  if data_count_Y = dc_count_Y-1 then          -- last UDW_Y reached; dc_count_Y-1 because counting starts at 0
                    data_count_Y <= 0;                         -- reset the count for the next packet
                    curr_state_Y <= CS_Y;                      -- jump to next state, CS_Y, once all UDW_Y are stored
                  end if;
              end if;
            
          -- -----------------------------------------------------------------------------------------------------------------CS_Y/CHECKSUM
          when CS_Y =>

            if checksum_Y = y_data(8 downto 0) then             -- compare the data word with the accumulated checksum_Y
              curr_state_Y       <= IDLE_Y;                     -- return to IDLE_Y when they match
              mem_arr_Y(count_Y) <= y_data;                     -- store data in memory
              count_Y            <= count_Y +1;                 -- increment the memory address counter

            else
              curr_state_Y     <= IDLE_Y;                       -- return to IDLE_Y on a mismatch as well
              count_Y          <= count_Y - dc_count_Y-3;       -- rewind the address to where the first word of this packet was stored,
            end if;                                             -- so the next packet overwrites the rejected one
          
          when others => curr_state_Y <= IDLE_Y;                -- default case
          end case;

          end if; -- dvalid
        end if; -- reset
      end if; -- VIDEO_CLK        
    end process;
end rtl;

You can safely ignore some of the unnecessary comments. I've tried several designs, but can't get them to work.
Can you help me with this, please?

The code here is the original one, using the for loop for the DID comparison.
 

Don't use a for loop. Use a counter, and wait in the state until the counter is complete.
 

Hi,

I just can't use a counter for it.

Reason: The input is serial data, 10 bits per word, so I can't pause the data. And we do not want to use a buffer-type system, but rather something based on a shifter or a small array.
 

Then you could make some kind of search unit. Or increase the clock.
Probably best to go back to the architecture/design drawing you made before you wrote the code and re-think the system design.
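
One possible shape for such a search unit, sketched under assumptions: the entity name `did_search_unit` and its ports are made up for illustration, and the two-clock latency would have to be absorbed by the surrounding state machine. The idea is to keep all compares in parallel but split the work over two registered stages, so a new word is still accepted on every clock and nothing has to be paused.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: parallel compare in stage 1, priority encode in stage 2.
entity did_search_unit is
  generic (DID_COUNT : positive := 40);
  port (
    clk      : in  std_logic;
    din      : in  std_logic_vector(9 downto 0);
    -- Assumed layout: stored DID number i occupies bits i*10+9 downto i*10.
    did_flat : in  std_logic_vector(DID_COUNT*10-1 downto 0);
    dout     : out std_logic_vector(9 downto 0);  -- din delayed by 2 clocks
    found    : out std_logic := '0';              -- '1' if dout equals a stored DID
    index    : out natural range 0 to DID_COUNT-1 -- lowest matching entry
  );
end entity;

architecture rtl of did_search_unit is
  signal hit   : std_logic_vector(0 to DID_COUNT-1) := (others => '0');
  signal din_d : std_logic_vector(9 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Stage 1: DID_COUNT independent compares, each result registered.
      for i in 0 to DID_COUNT-1 loop
        if din = did_flat(i*10+9 downto i*10) then
          hit(i) <= '1';
        else
          hit(i) <= '0';
        end if;
      end loop;
      din_d <= din;

      -- Stage 2: reduce the registered hit vector from the previous clock.
      found <= '0';
      index <= 0;
      for i in DID_COUNT-1 downto 0 loop
        if hit(i) = '1' then
          found <= '1';
          index <= i;    -- the last assignment wins, so the lowest index has priority
        end if;
      end loop;
      dout <= din_d;     -- keep the data aligned with found and index
    end if;
  end process;
end architecture;
```

Whether this reaches the target Fmax depends on the device and the tools; if the second stage is still too slow, the hit vector could be reduced in smaller groups over one more stage.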
 
"-> It doesn't matter if it takes even 40 clock cycles to do the whole thing"

"I just can't use a counter for it. Reason: The input is serial data, 10 bits per word, so I can't pause the data. And we do not want to use a buffer-type system, but rather something based on a shifter or a small array."

These two statements seem to contradict each other.
 
