Welcome to EDAboard.com

[SOLVED] Verilog generate loop problem, asymmetric FIFO variation.

Status
Not open for further replies.

rupertlssmith

Hi,

I'm trying to produce a variation on a FIFO that lets me consume less than the input width of the FIFO on the output end: anywhere between zero and the full width. The width of the FIFO is a multiple of a word size, and that multiple is a parameter to the module. To map the read and write vectors onto the FIFO registers, I wanted to use a generate loop. Xilinx ISE doesn't like it and complains that multiple inputs are driving elements of the register vector. If I hand-unroll the loop it's OK, just not with the loop in place.

Here's the module:

Code:
module fifo
  #(
    parameter ADDR_SIZE = 4, // Address size in bits.
    parameter WORD_SIZE = 8, // Word size in bits.
    parameter CHNK_EXPO = 2, // Chunk factor as a power to raise 2 by to multiply the word size by.

    // Derived parameters do not override.
    parameter CHNK_FACT = 2 ** CHNK_EXPO,
    parameter CHNK_SIZE = WORD_SIZE * CHNK_FACT,
    parameter FIFO_SIZE = 2 ** ADDR_SIZE
    )
   (
    input wire clk, reset,
    input wire rd, wr,
    input wire [CHNK_EXPO - 1 : 0] consume,
    input wire [CHNK_SIZE - 1 : 0] w_data,
    output wire empty, full,
    output wire [CHNK_EXPO - 1 : 0] avail,
    output wire [CHNK_SIZE - 1 : 0] r_data
    );

   // Signal declaration.
   reg [WORD_SIZE - 1 : 0]          array_reg [FIFO_SIZE - 1 : 0];
   reg [ADDR_SIZE - 1 : 0]          w_ptr_reg, w_ptr_next, w_ptr_succ;
   reg [ADDR_SIZE - 1 : 0]          r_ptr_reg, r_ptr_next, r_ptr_succ;
   reg [CHNK_EXPO - 1 : 0]          avail_reg, avail_next;
   reg                              full_reg, empty_reg, full_next, empty_next;
   wire                             wr_en;

   // Synchronous state update.
   genvar                           i;

   /*
   generate
      for (i = 0; i < CHNK_FACT; i = i + 1) begin : gen_write
         always @(posedge clk)
           if (wr_en)
             begin
                array_reg[w_ptr_reg + i] <= w_data[WORD_SIZE * (i + 1) - 1 : WORD_SIZE * i];
             end
      end
   endgenerate
   */

   always @(posedge clk)
     if (wr_en)
       begin
          array_reg[w_ptr_reg] <= w_data[WORD_SIZE - 1 : 0];
          array_reg[w_ptr_reg + 1] <= w_data[WORD_SIZE * 2- 1 : WORD_SIZE];
          array_reg[w_ptr_reg + 2] <= w_data[WORD_SIZE * 3 - 1 : WORD_SIZE * 2];
          array_reg[w_ptr_reg + 3] <= w_data[WORD_SIZE * 4 - 1 : WORD_SIZE * 3];
       end

   always @(posedge clk, posedge reset)
     if (reset)
       begin
          w_ptr_reg <= 0;
          r_ptr_reg <= 0;
          full_reg <= 1'b0;
          empty_reg <= 1'b1;
          avail_reg <= 0;
       end
     else
       begin
          w_ptr_reg <= w_ptr_next;
          r_ptr_reg <= r_ptr_next;
          full_reg <= full_next;
          empty_reg <= empty_next;
          avail_reg <= avail_next;
       end

   // Next state logic.
   assign wr_en = wr & ~full_reg;

   always @*
     begin
        w_ptr_succ = w_ptr_reg + CHNK_FACT;
        r_ptr_succ = r_ptr_reg + consume + 1;

        w_ptr_next = w_ptr_reg;
        r_ptr_next = r_ptr_reg;

        full_next = full_reg;
        empty_next = empty_reg;

        avail_next = avail_reg;

        case ({wr, rd})
          // 2'b00: no op
          2'b01: // read.
            if (~empty_reg)
              begin
                 r_ptr_next = r_ptr_succ;
                 full_next = 1'b0;
                 if (r_ptr_succ == w_ptr_reg)
                   empty_next = 1'b1;
                 avail_next = avail_reg - (consume + 1);
              end
          2'b10: // write.
            if (~full_reg)
              begin
                 w_ptr_next = w_ptr_succ;
                 empty_next = 1'b0;
                 if (w_ptr_succ > r_ptr_reg + CHNK_FACT)
                   full_next = 1'b1;
                 avail_next = avail_reg + CHNK_FACT;
              end
          2'b11: // read and write.
            begin
               w_ptr_next = w_ptr_succ;
               r_ptr_next = r_ptr_succ;
               if (w_ptr_succ > r_ptr_reg + CHNK_FACT)
                 full_next = 1'b1;
               avail_next = avail_reg - (consume + 1) + CHNK_FACT;
            end
        endcase
     end

   // Output logic.
   assign full = full_reg;
   assign empty = empty_reg;
   assign avail = avail_reg;

   generate
      for (i = 0; i < CHNK_FACT; i = i + 1) begin : gen_read
         assign r_data[WORD_SIZE * (i + 1) - 1 : WORD_SIZE * i] = array_reg[r_ptr_reg + i];
      end
   endgenerate

   /*
   assign r_data[WORD_SIZE - 1 : 0] = array_reg[r_ptr_reg];
   assign r_data[WORD_SIZE * 2 - 1 : WORD_SIZE] = array_reg[r_ptr_reg + 1];
   assign r_data[WORD_SIZE * 3 - 1 : WORD_SIZE * 2] = array_reg[r_ptr_reg + 2];
   assign r_data[WORD_SIZE * 4 - 1 : WORD_SIZE * 3] = array_reg[r_ptr_reg + 3];
   */

endmodule

You can see I have commented out the first generate loop, for the write vector, and am using a hand-unrolled version instead. The generate loop at the end, for the read vector, seems OK, so the generate loop is used there and the hand-unrolled version is commented out. The generate loop at the start is accepted as valid syntax but causes the "multiple inputs driving the register array" error.

Is this just an ISE quirk? or am I trying to do something more fundamentally wrong in Verilog?

Hand unrolling 4 loops is ok for now, but I'd really like to get the module parametrized to generate different widths.

Thanks for any assistance.

Rupert

---------- Post added at 23:17 ---------- Previous post was at 23:12 ----------

Also, I feel like I want to do this for the write vector generate loop:

Code:
   generate
     always @(posedge clk)
       if (wr_en)
         begin
            for (i = 0; i < CHNK_FACT; i = i + 1) begin : gen_write
                array_reg[w_ptr_reg + i] <= w_data[WORD_SIZE * (i + 1) - 1 : WORD_SIZE * i];
            end
         end
   endgenerate

That is, to put the loop inside the always block. But that doesn't seem to be legal Verilog. Is there a way of putting a generate loop inside an always block? or do I have to generate multiple always blocks?
 

Hi,

I'm trying to produce a variation on a FIFO, that lets me consume less than
the input width of the FIFO on the output end;
between zero and the full width.

honestly, I don't understand the text above,

but I have a few notes on your code:
Code:
genvar i;
 generate
  for (i = 0; i < CHNK_FACT; i = i + 1) begin : gen_write
    always @(posedge clk)
     if (wr_en)
        array_reg[w_ptr_reg + i] <= w_data[WORD_SIZE * (i + 1) - 1 : WORD_SIZE * i];
  end
 endgenerate

a generate loop does not describe a sequential action to be executed,
like processor code; it describes hardware to be generated 'N' times
and its connections. In your RTL each generated always block is a separate
process that writes array_reg through a variable index, so the synthesis
tool sees every element of the array driven from several places;

probably you mean something like this:
Code:
integer j;
   always @(posedge clk)
     for (j = 0; j < CHNK_FACT; j = j + 1)         
       if (wr_en)
         array_reg[w_ptr_reg + j] <= w_data[WORD_SIZE * (j + 1) - 1 -: WORD_SIZE];

note the 'trick' in the assignment: the part-select gives a top index and a
size instead of a top and a bottom index. Only the size has to be a constant;
with a plain [top : bottom] range both bounds must be constants, and the tool
exits with an error [index is not a constant];

written this way you do not get a FIFO in the sense of a memory with one write
and one read pointer; the tool will emulate the functionality with an array of
regs, which is not bad if the intended FIFO is small;
---
have fun
J.A
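
A minimal sketch of the indexed part-select described above, under the assumption of a Verilog-2001 tool. The module name `lane_select` and its ports are hypothetical and not taken from the thread. It picks one word out of a wide bus with a run-time index, once with `+:` (base is the lowest bit) and once with `-:` (base is the highest bit); both forms should select the same bits.

Code:
```verilog
// Hypothetical helper (not from the thread): select one WORD_SIZE-bit lane
// of a wide bus using a run-time lane index.
module lane_select
  #(
    parameter WORD_SIZE = 8,
    parameter LANES     = 4,
    parameter SEL_BITS  = 2   // assumed wide enough to count LANES
    )
   (
    input  wire [WORD_SIZE * LANES - 1 : 0] bus,
    input  wire [SEL_BITS - 1 : 0]          sel,
    output wire [WORD_SIZE - 1 : 0]         lane_up,
    output wire [WORD_SIZE - 1 : 0]         lane_down
    );

   // A plain range such as bus[WORD_SIZE*(sel+1)-1 : WORD_SIZE*sel] is not
   // legal here, because both bounds depend on the run-time value sel.

   // Ascending form: the base expression is the LSB, the width is constant.
   assign lane_up   = bus[WORD_SIZE * sel +: WORD_SIZE];

   // Descending form: the base expression is the MSB, the width is constant.
   assign lane_down = bus[WORD_SIZE * (sel + 1) - 1 -: WORD_SIZE];
endmodule
```

The same two forms work with an integer loop variable inside an always block, which is the case discussed in this thread.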
 
I'm not so sure about your for loop. I thought for loops are not synthesizable and a generate loop is? As I understand it, a generate loop is a bit like a macro to save on typing, and is elaborated before synthesis. Hence my confusion about the hand-unrolled version and the generate loop behaving differently. See:

**broken link removed**

I'll try what you suggest though, and let you know how I got on. Top index and size is an easier way to write what I wanted to express. Thanks.

Rupert
 

I thought for loops are not synthesizable
not true,
under some restrictions they are;
just as an example, a synthesizable bit-swap function:
Code:
module swap
  #( parameter N = 8 )            // declared before the ports that use it
(  input      [N-1:0] din,
   output reg [N-1:0] d_swaped  );
integer i;
always @(*)
  for (i=0; i<N; i=i+1)
    d_swaped[i] = din[(N-1)-i];   // blocking assignment for combinational logic
endmodule

As I understand it a generate loop is a bit like a macro to save on typing
true;
but when the number of instances depends on a parameter, it can be impossible
to instantiate a module the required number of times without a generate statement;
---
have fun
J.A
 

I retract my earlier scepticism about the for loop. Your trick of using a size instead of a bottom index was also necessary. This seems to be accepted:

Code:
   integer                          j;

   always @(posedge clk)
     if (wr_en)
       for (j = 0; j < CHNK_FACT; j = j + 1)
          array_reg[w_ptr_reg + j] <= w_data[WORD_SIZE * (j + 1) - 1 -: WORD_SIZE];

However, now I have this problem:

ERROR:Xst:2071 - Available block RAM resources offer a maximum of two write ports. You are apparently describing a RAM with 4 separate write ports for signal <array_reg>.

Understandable, but is there some way I can force it to not use the block RAM in that case? My register array will only ever be a small one.

Thanks for the help on the for loop. much appreciated.

Rupert

---------- Post added at 23:27 ---------- Previous post was at 23:25 ----------

Also, if I use the size instead of the bottom index, but hand unroll the loop, as below, I don't get the error about the 4 separate write ports.

Code:
   always @(posedge clk)
     if (wr_en)
       begin
          array_reg[w_ptr_reg    ] <= w_data[WORD_SIZE - 1     -: WORD_SIZE];
          array_reg[w_ptr_reg + 1] <= w_data[WORD_SIZE * 2 - 1 -: WORD_SIZE];
          array_reg[w_ptr_reg + 2] <= w_data[WORD_SIZE * 3 - 1 -: WORD_SIZE];
          array_reg[w_ptr_reg + 3] <= w_data[WORD_SIZE * 4 - 1 -: WORD_SIZE];
       end
 

I compiled your code with Quartus using these for loops:

Code:
output /*wire*/ reg  [CHNK_SIZE - 1 : 0] r_data
  // ...

integer i;
   always @(posedge clk)
     if (wr_en)
      for (i = 0; i < CHNK_FACT; i = i + 1)
         array_reg[w_ptr_reg + i] <= w_data[WORD_SIZE * (i + 1) - 1 -: WORD_SIZE];

 // ...
    always @*
      for (i = 0; i < CHNK_FACT; i = i + 1)
          r_data[WORD_SIZE * (i + 1) - 1 -: WORD_SIZE] <= array_reg[r_ptr_reg + i];

compilation ended without errors and, as expected, the tool created an array
of regs;

ISE tries to implement your 2D array as a block memory;
I haven't worked with ISE for a while, but I believe there is a switch somewhere
in the tool settings, or a synthesis attribute, which lets you guide ISE in the
desired direction;

the second possibility is to generate a single-port block RAM multiple times
[CHNK_FACT times, I guess]; this way you can emulate a memory with multiple
write ports
[in this case you need to go back to the 'generate' solution]
---
have fun
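
As a hedged illustration of the synthesis-attribute route mentioned above: XST documents a `ram_style` constraint (and a `ram_extract` constraint that turns RAM inference off), which can be attached to the array declaration in Verilog-2001 attribute syntax. The attribute names are given as recalled from the XST documentation, so check them against the guide for your ISE version. The module `small_regfile` and its ports are hypothetical.

Code:
```verilog
// Sketch only: attribute name and value as documented for Xilinx XST;
// other tools use different attributes. Module and ports are hypothetical.
module small_regfile
  #(
    parameter ADDR_SIZE = 4,
    parameter WORD_SIZE = 8
    )
   (
    input  wire                     clk,
    input  wire                     wr,
    input  wire [ADDR_SIZE - 1 : 0] addr_w,
    input  wire [ADDR_SIZE - 1 : 0] addr_r,
    input  wire [WORD_SIZE - 1 : 0] din,
    output wire [WORD_SIZE - 1 : 0] q
    );

   // Ask XST for LUT-based (distributed) RAM instead of block RAM.
   (* ram_style = "distributed" *)
   reg [WORD_SIZE - 1 : 0] array_reg [(2 ** ADDR_SIZE) - 1 : 0];

   always @(posedge clk)
     if (wr)
       array_reg[addr_w] <= din;

   // Asynchronous read, which distributed RAM supports.
   assign q = array_reg[addr_r];
endmodule
```

This sketch has a single write port. An array with CHNK_FACT simultaneous writes may still not map onto any RAM primitive, in which case plain registers are the remaining option.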
 
Hi,

I found the option to disable use of block RAM. Thanks for the pointers to it. However, now that error is fixed, it has reverted to the error I was getting previously about a signal connected to multiple drivers:

Code:
ERROR:Xst:528 - Multi-source in Unit <fifo> on signal <array_reg<15><7>>; this signal is connected to multiple drivers.
ERROR:Xst:528 - Multi-source in Unit <fifo> on signal <array_reg<15><6>>; this signal is connected to multiple drivers.
ERROR:Xst:528 - Multi-source in Unit <fifo> on signal <array_reg<15><5>>; this signal is connected to multiple drivers.
ERROR:Xst:528 - Multi-source in Unit <fifo> on signal <array_reg<15><4>>; this signal is connected to multiple drivers.
ERROR:Xst:528 - Multi-source in Unit <fifo> on signal <array_reg<15><3>>; this signal is connected to multiple drivers.
ERROR:Xst:528 - Multi-source in Unit <fifo> on signal <array_reg<15><2>>; this signal is connected to multiple drivers.
ERROR:Xst:528 - Multi-source in Unit <fifo> on signal <array_reg<15><1>>; this signal is connected to multiple drivers.
ERROR:Xst:528 - Multi-source in Unit <fifo> on signal <array_reg<15><0>>; this signal is connected to multiple drivers.
ERROR:Xst:528 - Multi-source in Unit <fifo> on signal <array_reg<14><7>>; this signal is connected to multiple drivers.
... and so on.

I'm guessing ISE is not clever enough to realise that the array_reg is not set in such a way that its components are connected to multiple drivers, even though the for loop uses a constant bound and so on.
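
One way to sidestep the multi-source report, sketched here as an assumption rather than something verified against ISE, is to turn the description around: instead of each write lane indexing the array with a variable, give every register its own always block with a constant index and let it decide whether one of the lanes currently targets it. The module name `chunk_write` and the `regs_flat` output are hypothetical; `array_reg`, `w_ptr_reg`, `wr_en` and `w_data` mirror the names in the first post. It assumes CHNK_FACT is no larger than FIFO_SIZE.

Code:
```verilog
// Sketch: per-register write decode, one driver per array element.
// Module name and regs_flat are hypothetical; not checked against ISE.
module chunk_write
  #(
    parameter ADDR_SIZE = 4,
    parameter WORD_SIZE = 8,
    parameter CHNK_EXPO = 2,
    parameter CHNK_FACT = 2 ** CHNK_EXPO,
    parameter CHNK_SIZE = WORD_SIZE * CHNK_FACT,
    parameter FIFO_SIZE = 2 ** ADDR_SIZE
    )
   (
    input  wire                                 clk,
    input  wire                                 wr_en,
    input  wire [ADDR_SIZE - 1 : 0]             w_ptr_reg,
    input  wire [CHNK_SIZE - 1 : 0]             w_data,
    output wire [WORD_SIZE * FIFO_SIZE - 1 : 0] regs_flat // all registers, for observation
    );

   reg [WORD_SIZE - 1 : 0] array_reg [FIFO_SIZE - 1 : 0];

   genvar k;
   generate
      for (k = 0; k < FIFO_SIZE; k = k + 1) begin : gen_reg
         // Distance of register k from the write pointer, modulo FIFO_SIZE
         // (the subtraction is truncated to ADDR_SIZE bits).
         wire [ADDR_SIZE - 1 : 0] off = k - w_ptr_reg;

         // Register k is written only here, with a constant index, and only
         // when one of the CHNK_FACT lanes lands on it.
         always @(posedge clk)
           if (wr_en && off < CHNK_FACT)
             array_reg[k] <= w_data[off * WORD_SIZE +: WORD_SIZE];

         assign regs_flat[WORD_SIZE * k +: WORD_SIZE] = array_reg[k];
      end
   endgenerate
endmodule
```

The cost is a lane multiplexer in front of every register, which is acceptable only for small arrays such as the 16-entry one here.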

This is prompting me to rethink my design. Perhaps I should just use a normal FIFO and a shifter to achieve my aims? I'm probably getting ahead of myself as a novice, and trying to do things in a more complicated way than I might need to.

I'm trying to create a parser for a variable width data structure, so I do not always want to consume the full width of the data available at the FIFO output. This could also be compared with an instruction decoder for an instruction set without fixed width instructions, such as x86. The instruction data is loaded in cache line widths (128 bits or is it 256 nowadays?), but may not always be fully consumed up to that width.
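
A rough sketch of the shifter half of the "normal FIFO and a shifter" idea, under one reading of it: the FIFO always delivers a full chunk, and a combinational shifter drops the words the parser has used. The module name `drop_words` and the ports `chunk` and `rest` are hypothetical; `consume` keeps the meaning from the first post, where `consume + 1` words are taken.

Code:
```verilog
// Sketch: drop the consumed low-order words from a chunk.
// Names other than consume, WORD_SIZE and CHNK_EXPO are hypothetical.
module drop_words
  #(
    parameter WORD_SIZE = 8,
    parameter CHNK_EXPO = 2,
    parameter CHNK_SIZE = WORD_SIZE * (2 ** CHNK_EXPO)
    )
   (
    input  wire [CHNK_SIZE - 1 : 0] chunk,   // full-width FIFO output
    input  wire [CHNK_EXPO - 1 : 0] consume, // consume + 1 words are used
    output wire [CHNK_SIZE - 1 : 0] rest     // unused words, moved down to bit 0
    );

   // Number of bits the parser used; the constant 1 widens the sum to 32 bits,
   // so the multiplication cannot overflow for these sizes.
   wire [31:0] used_bits = (consume + 1) * WORD_SIZE;

   // Logical right shift; the vacated upper bits are filled with zeros.
   assign rest = chunk >> used_bits;
endmodule
```

A complete design would still need a holding register and logic to refill it from the FIFO; that part is not shown.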

Thanks for all the replies.

Rupert
 

I have synthesized your code from post #1 with ISE 12.4, and I don't get any errors with it.

Let me know if you need any help to run this design.
 

Hi Rupert,

the example below is not exactly what you
intend to build, but it demonstrates how to build
a 32-bit 'fifo' where you select which 8-bit words
are written and where, with a similar selection
on the read side; I hope this points you to a useful solution,
and I believe even ISE will accept it ...
Code:
module fifo
  #( parameter ADDR_SIZE = 4, 
     parameter WORD_SIZE = 8, 
     parameter FIFO_SIZE = 2 ** ADDR_SIZE )
 (
  input                            clk, reset,
  input      [ADDR_SIZE-1:0]       w_pointer, r_pointer,
  input      [ADDR_SIZE-1:0]       sel_w, sel_r,
  input      [WORD_SIZE*4 - 1 : 0] w_data,
  output reg [WORD_SIZE*4 - 1 : 0] r_data
 );

  wire [WORD_SIZE*4-1:0] ram_data;
  
genvar i;  
  generate
   for (i = 0; i < 4; i = i + 1) 
    begin : mem 
      ram #( .ADDR_SIZE(4), .WORD_SIZE(8), .FIFO_SIZE(16) )
	  slice 
 	   ( .clk(clk), .wr(sel_w[i]), .din(w_data[WORD_SIZE*i +: WORD_SIZE]),
 	     .addr_w(w_pointer + i),   .addr_r(r_pointer + i), 
	     .q(ram_data[WORD_SIZE*i +: WORD_SIZE])  
	   );
    end
  endgenerate
 
 integer j;
 
 always @(posedge clk)
   for ( j=0; j<4; j=j+1 )
     if ( sel_r[j] )  r_data[WORD_SIZE*j +: WORD_SIZE] <= ram_data[WORD_SIZE*j +: WORD_SIZE];

endmodule//=====================================

  module ram 
     #(parameter ADDR_SIZE = 4, 
	   parameter WORD_SIZE = 8, 
	   parameter FIFO_SIZE = 16)
  (
    input                      clk,
	                           wr,
	input      [WORD_SIZE-1:0] din,
	input      [ADDR_SIZE-1:0] addr_w,
	input      [ADDR_SIZE-1:0] addr_r,
	output reg [WORD_SIZE-1:0] q

  );
reg [WORD_SIZE-1:0] array_reg [FIFO_SIZE-1:0];

  always @(posedge clk)
    begin
     if ( wr ) array_reg[addr_w] <= din;
	 q <= array_reg[addr_r];
	end
endmodule
---
have fun
 
Hi,

I get what you have done; put 4 fifos together side by side to achieve what I was trying to do with my original design. Much better! Thanks for that.

Rupert
 

Could you explain to me how to design a cache memory using the RRIC algorithm?
Please help me find some materials for this project.
 
