-
18th September 2018, 05:45 #1
reorder queue mechanism
I am confused about a reorder queue that uses the tag as a sequence number.
Does anyone have any general ideas about the following code snippet?
Code Verilog:
    assign EXT_TAG = rPos;
    assign EXT_TAG_VALID = rValid;

    // Move through tag/slot/bucket space.
    always @ (posedge CLK) begin
        if (RST) begin
            rPos <= #1 0;
            rUse <= #1 0;
            rValid <= #1 0;
        end
        else begin
            if (INT_TAG_VALID & EXT_TAG_VALID) begin
                rPos <= #1 rPos + 1'd1;
                rUse <= #1 1<<rPos;
                rValid <= #1 !rUsing[rPos + 1'd1];
            end
            else begin
                rUse <= #1 0;
                rValid <= #1 !rUsing[rPos];
            end
        end
    end

    // Update tag/slot/bucket status.
    always @ (posedge CLK) begin
        if (RST) begin
            rUsing <= #1 0;
            rFinished <= #1 0;
        end
        else begin
            rUsing <= #1 (rUsing | rUse) & ~wClear;
            rFinished <= #1 (rFinished | wFinish) & ~wClear;
        end
    end
- - - Updated - - -
Why do we need EXT_TAG_VALID in the condition if (INT_TAG_VALID & EXT_TAG_VALID) on line 12 of the above code?
-
18th September 2018, 06:16 #2
Re: reorder queue mechanism
This is a bitmask that tracks which slots are occupied. Slots are set sequentially, but they can be cleared in any order. For example, with a much smaller rUsing of 4 bits, it can go 0000, 0001, 0011, 0111, 1111, at which point the LSB would need to be cleared before another slot can be handed out. Looking at this module alone, the LSB doesn't have to be the first bit cleared: the next state could be 0001 if the other three bits are cleared first. In that case the LSB would still need to be cleared before any other bit is set, because slots are handed out in order and the next one in line is slot 0.
EXT_TAG_VALID indicates that the bit in the bitmask for the current slot is 0, and thus that the slot in RAM is unoccupied.
You would have to look at the documentation or the other modules to determine whether the protocol is always safe.
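The behaviour described in this reply can be tried out on a scaled-down model. The following is only a sketch under stated assumptions, not the original design: it shrinks the tag space to 4 slots, drops the #1 delays and rFinished, and uses hypothetical names (tag_alloc_sketch, req_i, clear_i, tag_o, valid_o) in place of the real ports. It suggests why the condition on line 12 includes EXT_TAG_VALID: without that term the pointer would advance onto a slot that is still occupied.

```verilog
// Hypothetical 4-slot tag allocator mirroring the structure of the snippet.
// Port names are made up for this sketch; they are not the original ports.
module tag_alloc_sketch (
    input  wire       clk,
    input  wire       rst,
    input  wire       req_i,    // plays the role of INT_TAG_VALID
    input  wire [3:0] clear_i,  // plays the role of wClear
    output wire [1:0] tag_o,    // plays the role of EXT_TAG
    output wire       valid_o   // plays the role of EXT_TAG_VALID
);
    reg [1:0] rPos   = 0;
    reg [3:0] rUse   = 0;
    reg [3:0] rUsing = 0;
    reg       rValid = 0;

    assign tag_o   = rPos;
    assign valid_o = rValid;

    always @(posedge clk) begin
        if (rst) begin
            rPos   <= 0;
            rUse   <= 0;
            rValid <= 0;
        end else if (req_i & valid_o) begin
            // A tag is handed out only if the slot at rPos is known to be free.
            rPos   <= rPos + 1'd1;            // 2-bit pointer wraps 3 -> 0
            rUse   <= 4'b0001 << rPos;        // one-hot "just issued" pulse
            rValid <= !rUsing[rPos + 1'd1];   // is the next slot free?
        end else begin
            rUse   <= 0;
            rValid <= !rUsing[rPos];          // keep polling the current slot
        end
    end

    always @(posedge clk) begin
        if (rst) rUsing <= 0;
        else     rUsing <= (rUsing | rUse) & ~clear_i;
    end
endmodule

// Small testbench: request continuously, then free slot 0 only.
module tag_alloc_sketch_tb;
    reg        clk = 0, rst = 1, req = 0;
    reg  [3:0] clear = 0;
    wire [1:0] tag;
    wire       valid;

    tag_alloc_sketch dut (.clk(clk), .rst(rst), .req_i(req), .clear_i(clear),
                          .tag_o(tag), .valid_o(valid));

    always #5 clk = ~clk;

    // Log every tag that is actually handed out.
    always @(posedge clk)
        if (req & valid) $display("t=%0t issued tag %0d", $time, tag);

    initial begin
        repeat (2) @(posedge clk);
        rst <= 0; req <= 1;          // keep requesting, never clear
        repeat (10) @(posedge clk);  // tags 0..3 go out, then valid_o stays low
        clear <= 4'b0001;            // free slot 0 only
        @(posedge clk);
        clear <= 0;
        repeat (4) @(posedge clk);   // slot 0 can be handed out once more
        $finish;
    end
endmodule
```

In this model the requester keeps req_i high the whole time, yet no tag is issued while valid_o is low. That gating is what keeps a slot from being reused before its clear bit arrives.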
-
18th September 2018, 08:17 #3
Re: reorder queue mechanism
But that still does not explain why we need EXT_TAG_VALID on line 12 of the above code.
- - - Updated - - -
Besides, why does it use C_PCI_DATA_WORD RAMs for packet reordering? It does not make sense to me to split the data into 32-bit chunks when the packets are being reordered by tag sequence number.
Code Verilog:
    genvar r;
    generate
        for (r = 0; r < C_PCI_DATA_WORD; r = r + 1) begin : rams
            // RAMs for packet reordering.
            (* RAM_STYLE="BLOCK" *)
            ram_1clk_1w_1r #(
                .C_RAM_WIDTH(32),
                .C_RAM_DEPTH(C_NUM_TAGS*C_DW_PER_TAG/C_PCI_DATA_WORD)
            ) ram (
                .CLK(CLK),
                .ADDRA(wWrDataAddr[C_DATA_ADDR_WIDTH*r +:C_DATA_ADDR_WIDTH]),
                .WEA(wWrDataEn[r]),
                .DINA(wWrData[32*r +:32]),
                .ADDRB(wRdDataAddr),
                .DOUTB(wRdData[32*r +:32])
            );
        end
    endgenerate
Last edited by promach; 18th September 2018 at 08:40.
-
19th September 2018, 06:42 #4
Re: reorder queue mechanism
The 32-bit chunks are probably there to provide independent write enables. Look at the design to check whether this is correct.
-
19th September 2018, 07:37 #5
Re: reorder queue mechanism
vGoodTimes wrote: "the 32b chunks is probably for the independent write enables"
I am attaching a relevant ILA waveform trace for this reorder queue. I am still trying to understand why the data is split into multiple 32-bit chunks, one per RAM.
-
19th September 2018, 17:01 #6
Re: reorder queue mechanism
DINA doesn't matter. It just splits the N 32-bit words of the input into 32-bit slices for the N RAMs, which have independent write enables.
Edit: the RAM takes, for example, a 128-bit bus, which is logically four 32-bit words, as input. It allows partial writes to the RAM, e.g. anywhere from 0 to 4 of the 32-bit words can be written in a given cycle.
-
20th September 2018, 03:36 #7
Re: reorder queue mechanism
Let's back up a bit.
It does not make sense to split the data (C_PCI_DATA_WIDTH, or 128 bits) into 32-bit chunks for the purpose of packet reordering by tag sequence number.
Note: we are not sorting the data values; we are sorting the packets that contain the data.
-
22nd September 2018, 06:05 #8
Re: reorder queue mechanism
It looks like this is actually N RAMs with independent write address, write data, and write enable. The read interface seems to have a shared address/enable and gives an N*32-bit output. You should look at the reorder input module to see what is going on.
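The structure described in this reply can be sketched as below. It is a minimal illustration, not the RIFFA code. lane_ram_sketch is a hypothetical stand-in for ram_1clk_1w_1r, whose source is not shown in this thread, and the parameter names LANES and ADDR_W are made up. One plausible motivation, which the thread does not confirm, is that the 32-bit words arriving in one cycle may belong at different rows of the reordered buffer, so each lane needs its own write address and enable, while the reader always wants a full row.

```verilog
// Hypothetical single-clock RAM with one write port and one read port.
// Stand-in for ram_1clk_1w_1r; the real module may differ.
module lane_ram_sketch #(
    parameter WIDTH  = 32,
    parameter ADDR_W = 4
) (
    input  wire              clk,
    input  wire [ADDR_W-1:0] waddr,
    input  wire              we,
    input  wire [WIDTH-1:0]  din,
    input  wire [ADDR_W-1:0] raddr,
    output reg  [WIDTH-1:0]  dout
);
    reg [WIDTH-1:0] mem [0:(1<<ADDR_W)-1];

    always @(posedge clk) begin
        if (we) mem[waddr] <= din;   // write only when this lane is enabled
        dout <= mem[raddr];          // registered read
    end
endmodule

// LANES independent 32-bit RAMs: per-lane write address and enable,
// one shared read address, LANES*32-bit read data.
module lane_bank_sketch #(
    parameter LANES  = 4,
    parameter ADDR_W = 4
) (
    input  wire                    clk,
    input  wire [LANES*ADDR_W-1:0] waddr,  // one write address per lane, packed
    input  wire [LANES-1:0]        we,     // one write enable per lane
    input  wire [LANES*32-1:0]     din,    // one 32-bit word per lane
    input  wire [ADDR_W-1:0]       raddr,  // shared by all lanes
    output wire [LANES*32-1:0]     dout
);
    genvar r;
    generate
        for (r = 0; r < LANES; r = r + 1) begin : rams
            lane_ram_sketch #(.WIDTH(32), .ADDR_W(ADDR_W)) ram (
                .clk(clk),
                .waddr(waddr[ADDR_W*r +: ADDR_W]),
                .we(we[r]),
                .din(din[32*r +: 32]),
                .raddr(raddr),
                .dout(dout[32*r +: 32])
            );
        end
    endgenerate
endmodule
```

With a single 128-bit RAM, all four words would have to share one write address and one write enable. Splitting into lanes makes it possible to write, say, word 3 at row 5 and word 0 at row 6 in the same cycle, and still read back rows as whole 128-bit values.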
-
24th October 2018, 09:07 #9
Re: reorder queue mechanism
For https://github.com/promach/riffa/blo...ut.v#L228-L229 , why does it use [2*C_TAG_WIDTH +:C_TAG_WIDTH] instead of [0*C_TAG_WIDTH +:C_TAG_WIDTH]?
Code Verilog:
    rUseCurrPos <= #1 (rTag[2*C_TAG_WIDTH +:C_TAG_WIDTH] == rTag[3*C_TAG_WIDTH +:C_TAG_WIDTH] && rValid[3]);
    rUsePrevPos <= #1 (rTag[2*C_TAG_WIDTH +:C_TAG_WIDTH] == rTag[4*C_TAG_WIDTH +:C_TAG_WIDTH] && rValid[4]);
The same question applies to lines 319, 322, 336, and 339 of the same Verilog file, which are the ADDRA and ADDRB connections of the following two ram_1clk_1w_1r instances:
Code Verilog:
    // RAM for counts
    (* RAM_STYLE="DISTRIBUTED" *)
    ram_1clk_1w_1r #(
        .C_RAM_WIDTH(C_TAG_DW_COUNT_WIDTH),
        .C_RAM_DEPTH(C_NUM_TAGS))
    countRam (
        .CLK(CLK),
        .ADDRA(rTag[2*C_TAG_WIDTH +:C_TAG_WIDTH]),
        .WEA(rValid[2]),
        .DINA(rCount),
        .ADDRB(rTag[0*C_TAG_WIDTH +:C_TAG_WIDTH]),
        .DOUTB(wCount)
    );

    // RAM for positions
    (* RAM_STYLE="DISTRIBUTED" *)
    ram_1clk_1w_1r #(
        .C_RAM_WIDTH(C_PCI_DATA_WORD*C_DATA_ADDR_STRIDE_WIDTH),
        .C_RAM_DEPTH(C_NUM_TAGS))
    posRam (
        .CLK(CLK),
        .ADDRA(rTag[4*C_TAG_WIDTH +:C_TAG_WIDTH]),
        .WEA(rValid[4]),
        .DINA(rPos),
        .ADDRB(rTag[2*C_TAG_WIDTH +:C_TAG_WIDTH]),
        .DOUTB(wPos)
    );
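As background for the question about [2*C_TAG_WIDTH +:C_TAG_WIDTH]: in Verilog-2001, vec[base +: width] is an indexed part-select that picks width bits starting at bit base and counting upward, so rTag[2*C_TAG_WIDTH +:C_TAG_WIDTH] is bits 3*C_TAG_WIDTH-1 down to 2*C_TAG_WIDTH. The sketch below assumes, purely for illustration, that rTag and rValid are shift registers that take in one new tag and valid bit per clock. The thread does not show how they are actually updated, and the module and port names are hypothetical. Under that assumption, slice 0 is the most recently captured tag and slice k is the tag captured k cycles before it, so picking slice 2 rather than slice 0 selects an older pipeline stage.

```verilog
// Sketch only: assumes rTag/rValid behave as shift registers, which the
// snippets in this thread do not confirm. Names ending in _i/_o are made up.
module tag_pipe_sketch #(
    parameter C_TAG_WIDTH = 5,
    parameter STAGES      = 5
) (
    input  wire                   clk,
    input  wire [C_TAG_WIDTH-1:0] tag_i,
    input  wire                   valid_i,
    output wire [C_TAG_WIDTH-1:0] newest_tag_o,
    output wire [C_TAG_WIDTH-1:0] stage2_tag_o,
    output wire                   use_curr_pos_o
);
    reg [STAGES*C_TAG_WIDTH-1:0] rTag   = 0;
    reg [STAGES-1:0]             rValid = 0;

    // Each clock, every tag moves up by one slice and the new tag enters slice 0.
    always @(posedge clk) begin
        rTag   <= (rTag << C_TAG_WIDTH) | tag_i;
        rValid <= (rValid << 1) | valid_i;
    end

    // Slice 0: bits [C_TAG_WIDTH-1:0], the most recently captured tag.
    assign newest_tag_o = rTag[0*C_TAG_WIDTH +: C_TAG_WIDTH];

    // Slice 2: bits [3*C_TAG_WIDTH-1:2*C_TAG_WIDTH], captured two cycles earlier.
    assign stage2_tag_o = rTag[2*C_TAG_WIDTH +: C_TAG_WIDTH];

    // Combinational version of the rUseCurrPos comparison: does the tag in
    // slice 2 match the valid tag one stage further along?
    assign use_curr_pos_o =
        (rTag[2*C_TAG_WIDTH +: C_TAG_WIDTH] == rTag[3*C_TAG_WIDTH +: C_TAG_WIDTH])
        && rValid[3];
endmodule
```

If the real design works this way, the different slice indices on ADDRA and ADDRB would simply line each RAM access up with the pipeline stage at which its data is ready. That needs to be checked against the full source file.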
-
31st October 2018, 08:14 #10
Re: reorder queue mechanism
Code Verilog:
    parameter C_DATA_ADDR_STRIDE_WIDTH = 5, // Width of max num stored data addr positions per tag
Code Verilog:
    wire [(C_DATA_ADDR_STRIDE_WIDTH*C_PCI_DATA_WORD)-1:0] wPos;
Why does wPos need a bit width of (C_DATA_ADDR_STRIDE_WIDTH*C_PCI_DATA_WORD)?
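One reading that would explain the width, offered as an assumption rather than a confirmed answer: wPos packs one C_DATA_ADDR_STRIDE_WIDTH-bit position per 32-bit word lane, which would match the per-lane write addresses in the RAM generate loop earlier in the thread. With a 128-bit bus, C_PCI_DATA_WORD is 4, so the vector is 5*4 = 20 bits wide. The module below is a hypothetical helper (pos_unpack_sketch, lane_i and lane_pos_o are made-up names) that shows how one lane's field would be pulled out of such a packed vector.

```verilog
// Sketch only: assumes wPos holds one position field per 32-bit lane.
module pos_unpack_sketch #(
    parameter C_PCI_DATA_WORD          = 4,  // 128-bit bus / 32 bits per word
    parameter C_DATA_ADDR_STRIDE_WIDTH = 5
) (
    input  wire [(C_DATA_ADDR_STRIDE_WIDTH*C_PCI_DATA_WORD)-1:0] wPos,
    input  wire [1:0]                                            lane_i,  // 2 bits cover 4 lanes
    output wire [C_DATA_ADDR_STRIDE_WIDTH-1:0]                   lane_pos_o
);
    // Lane r occupies bits [C_DATA_ADDR_STRIDE_WIDTH*(r+1)-1 : C_DATA_ADDR_STRIDE_WIDTH*r].
    assign lane_pos_o =
        wPos[C_DATA_ADDR_STRIDE_WIDTH*lane_i +: C_DATA_ADDR_STRIDE_WIDTH];
endmodule
```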
-
20th November 2018, 09:10 #11
Re: reorder queue mechanism
For https://docs.google.com/spreadsheets...it?usp=sharing , it does not make sense to me to compute rDEShift when rDE[2] and rShift are not derived from the same DATA_EN and its corresponding DATA_EN_COUNT. These two signals do not carry the same data-enable logic in the same clock cycle; look carefully at the table and the clock cycle counts. Does anyone have any comments?