27th July 2017, 21:24 #1
Need help writing Verilog code for a recurrent block
Hi,
I wrote Verilog code for two blocks, each containing some multipliers and adders. Each block has two inputs and two outputs, but the inputs of one block are the outputs of the other, which in neural networks is called a recurrent connection. Both blocks work correctly when I simulate them separately, but when I connect their outputs and inputs as in the picture, my output is 'x'. I think this is because their inputs and outputs depend on each other. Can you help me with this? I would appreciate it; I have been stuck on it for 3 days. Thanks.
Last edited by Adnan86; 27th July 2017 at 21:32.


28th July 2017, 08:01 #2
Re: Need help writing Verilog code for a recurrent block
Hi,
Please show your code; that will give a better understanding of where it goes wrong :)
Regards

28th July 2017, 12:40 #3
Re: Need help writing Verilog code for a recurrent block
I didn't post the code because it consists of several blocks and would waste your time; my question concerns only a small part of the project. I am working on an FPGA implementation of a block-based neural network. It has many blocks like the one in the picture above, and every block gets its inputs from neighboring blocks, so all of them have recurrent inputs and outputs, and this pattern repeats several times.

28th July 2017, 12:49 #4
Re: Need help writing code for a recurrent block
X usually occurs when you have multiple drivers on the same signal, so without the code we can't really tell you what's wrong.
Search through the design and make sure every signal is driven from a single source.
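As a minimal illustration of the multiple-driver case described above, here is a small simulation-only sketch. The module and signal names are made up for the example and are not taken from the poster's design.

```verilog
`timescale 1ns / 1ps
// Hypothetical demo (not from the thread): two continuous assignments
// drive the same wire, so the simulator has to resolve the two values.
module multi_driver_demo;
    reg  a, b;
    wire y;

    assign y = a;   // first driver
    assign y = b;   // second driver on the same net

    initial begin
        a = 1'b0; b = 1'b0;
        #1 $display("a=%b b=%b y=%b", a, b, y);  // drivers agree: y=0
        a = 1'b1; b = 1'b0;
        #1 $display("a=%b b=%b y=%b", a, b, y);  // drivers conflict: y=x
        $finish;
    end
endmodule
```

When the two drivers disagree, the net resolves to 'x'. The same symptom appears if a signal is assigned both by a module instance output and by another assignment elsewhere in the design.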

28th July 2017, 15:26 #5

29th July 2017, 17:09 #6
Re: Need help writing Verilog code for a recurrent block
Yes, it should be combinational.
This code is for just one block.
As I understand it, the two outputs of this block are always 'x'.
Code:
`timescale 1ns / 1ps
//////////////////////////////////////////////////////////////////////////////////
// Company:
// Engineer:
//
// Create Date:    23:56:40 07/21/2017
// Design Name:
// Module Name:    neuron
// Project Name:
// Target Devices:
// Tool versions:
// Description:
//
// Dependencies:
//
// Revision:
// Revision 0.01 - File Created
// Additional Comments:
//
//  3.306 = 0000001101001110  w11
//  3.155 = 0000001100101000  w21
//  1.823 = 0000000111010011  w12
//  2.820 = 0000001011010010  w22  bin(fi(2.820,1,16,8))
// -0.986 = 1111111100000100  b1
//  2.051 = 0000001000001101  b2
//////////////////////////////////////////////////////////////////////////////////
module neuron #(parameter NB=16,
    w11=16'b0000110100111001,//3385,
    w12=16'b0000011101001011,//1867,
    w21=16'b0000110010011111,//3231,
    w22=16'b0000101101001000,//2888,
    b1=16'b1111110000001110,//-1010,
    b2=16'b0000100000110100)(//2100 )(
    input clk,
    input rst,
    input [NB-1 : 0] x1,
    input [NB-1 : 0] x2,
    output reg ready,
    output reg [2*NB-1 : 0] y1,
    output reg [2*NB-1 : 0] y2);

    wire [2*NB-1 : 0] product1;
    wire [2*NB-1 : 0] product2;
    wire [2*NB-1 : 0] product3;
    wire [2*NB-1 : 0] product4;
    wire [2*NB-1 : 0] out_add1;
    wire [2*NB-1 : 0] out_add2;
    wire en1;
    wire en2;
    wire en3;
    wire en4;
    reg en13, en24;

    FP_mult uut1 (.clk(clk), .rst(rst), .multiplicand(x1), .multiplier(w11),
                  .ready(en1), .product(product1));
    FP_mult uut2 (.clk(clk), .rst(rst), .multiplicand(x1), .multiplier(w12),
                  .ready(en2), .product(product2));
    FP_mult uut3 (.clk(clk), .rst(rst), .multiplicand(x2), .multiplier(w21),
                  .ready(en3), .product(product3));
    FP_mult uut4 (.clk(clk), .rst(rst), .multiplicand(x2), .multiplier(w22),
                  .ready(en4), .product(product4));

    // adder
    FP_add #(9,22) uutA1 (.en(en1 & en3), .in1(product1), .in2(product3),
                          .out_add(out_add1));
    FP_add #(9,22) uutA2 (.en(en2 & en4), .in1(product2), .in2(product4),
                          .out_add(out_add2));
    FP_add #(9,22) uutA3 (.en(en13), .in1(out_add1), .in2({{16{b1[15]}},b1}),
                          .out_add(y1));
    FP_add #(9,22) uutA4 (.en(en24), .in1(out_add2), .in2({{16{b2[15]}},b2}),
                          .out_add(y2));

    always @ (posedge clk)
        if (en1 && en2 && en3 && en4) begin
            ready <= 1'b1;
            en13 <= en1 & en3;
            en24 <= en2 & en4;
        end
        else begin
            ready <= 1'b0;
            en13 <= 1'b0;
            en24 <= 1'b0;
        end

    //assign y2 = product2 + product4 + b2;
    //assign ready = (en1 && en2 && en3 && en4) ? 1'b1 : 1'b0;
endmodule
Code:
`timescale 1ns / 1ps
//////////////////////////////////////////////////////////////////////////////////
// Company:
// Engineer:
//
// Create Date:    16:21:25 07/19/2017
// Design Name:
// Module Name:    FP_mult
// Project Name:
// Target Devices:
// Tool versions:
// Description:
//
// Dependencies:
//
// Revision:
// Revision 0.01 - File Created
// Additional Comments:
//
//////////////////////////////////////////////////////////////////////////////////
module FP_mult #(parameter M=4, N=11)(
    input clk,
    input rst,
    input [M+N : 0] multiplicand,
    input [M+N : 0] multiplier,
    output ready,
    output [2*(M+N)+1 : 0] product);

    reg [2*(M+N)+1 : 0] partial_mult;
    //reg [2*(M+N)+1 : 0] extent_multiplicand;
    //extent_multiplicand = {16{multiplicand[M+N]},multiplicand};
    reg [4:0] num;
    reg flag;

    always @ (posedge clk , posedge rst ) begin
        //extent_multiplicand = {16{multiplicand[M+N]},multiplicand};
        if (rst) begin
            partial_mult <= 0;
            num <= 0;
            flag <= 0;
        end
        else if ((num <= M+N) && (flag != 1'b1)) begin
            partial_mult <= partial_mult + ({32{multiplier[num]}} & ({{16{multiplicand[M+N]}},multiplicand} << num));
            num <= num + 1'b1;
        end
        else if ((num == M+N+ 1'b1) && (flag != 1'b1)) begin
            if (multiplier[M+N]) begin
                partial_mult <= partial_mult + (~({{16{multiplicand[M+N]}},multiplicand} << num)) + 1'b1;
                flag <= 1'b1;
            end
            else
                flag <= 1'b1;
        end
    end

    assign product = flag ? partial_mult : 'bz;
    assign ready = flag ? 1'b1 : 1'b0;
endmodule
Code:
`timescale 1ns / 1ps
//////////////////////////////////////////////////////////////////////////////////
// Company:
// Engineer:
//
// Create Date:    20:43:03 07/21/2017
// Design Name:
// Module Name:    FP_add
// Project Name:
// Target Devices:
// Tool versions:
// Description:
//
// Dependencies:
//
// Revision:
// Revision 0.01 - File Created
// Additional Comments:
//
//////////////////////////////////////////////////////////////////////////////////
module FP_add #(parameter M=4, N=11)(
    input en,
    input [M+N : 0] in1,
    input [M+N : 0] in2,
    output reg [(M+N) : 0] out_add);

    reg [(M+N) : 0] tem_add;
    reg s;

    always @ (*) begin
        if (en) begin
            if (in1[M+N] == in2[M+N]) begin
                tem_add = in1[M+N-1 : 0] + in2[M+N-1 : 0];
                s = in1[M+N];
            end
            else if (in1[M+N] > in2[M+N]) begin
                if ((~in1[M+N-1 : 0]+1'b1) > in2[M+N-1 : 0]) begin
                    tem_add = ~((~in1[M+N-1 : 0] + 1'b1) - (in2[M+N-1 : 0])) + 1'b1 ;
                    s = in1[M+N];
                end
                else begin
                    tem_add = ((in2[M+N-1 : 0]) - (~in1[M+N-1 : 0] + 1'b1) ) ;
                    s = in2[M+N];
                end
            end
            else begin
                if ((~in2[M+N-1 : 0]+1'b1) > in1 [M+N-1 : 0]) begin
                    tem_add = ~((~in2[M+N-1 : 0] + 1'b1) - (in1[M+N-1 : 0])) + 1'b1 ;
                    s = in2[M+N];
                end
                else begin
                    tem_add = ((in1[M+N-1 : 0]) - (~in2[M+N-1 : 0] + 1'b1) ) ;
                    s = in1[M+N];
                end
            end
        end
        else begin
            tem_add = 'bz ;
            s = 1'bz;
        end
        out_add = {s,tem_add[M+N-1 : 0]};
    end
    //assign out_add = {s,tem_add[M+N-1 : 0]};
endmodule
y1 and y2 are my outputs, but they show 'x'.

29th July 2017, 17:11 #7
29th July 2017, 17:21 #8
Re: Need help writing code for a recurrent block
I changed the code several times. My first idea was that only the multiplier would be sequential and the other parts combinational, but after that I changed it several times trying to get a result.


29th July 2017, 20:22 #9
Re: Need help writing code for a recurrent block
I thought I had found the problem for a single neuron block. Now, when I connect two blocks together as in the picture, I get 'x' at the output. What should I do?

29th July 2017, 22:17 #10
Re: Need help writing code for a recurrent block
The block diagram is a conceptual version of how the design works. A practical design in software or hardware will have some sequence of processing and some memory elements to hold intermediate values. For your case, a practical design would create some number of processors. Each processor would contain the two blocks in your diagram. There would also be a state RAM that holds intermediate results.
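To make the idea of memory elements in the feedback path concrete, here is a rough sketch of two blocks that feed each other through registers. The module name, the width, and the arithmetic inside each block are placeholders chosen for illustration; they are not the poster's neuron.

```verilog
`timescale 1ns / 1ps
// Hypothetical sketch: two blocks in a feedback loop, with a register
// (memory element) on each block output so the loop is broken at clock edges.
// The arithmetic is a stand-in, not the multiplier/adder code from the thread.
module recurrent_pair #(parameter W = 16)(
    input                     clk,
    input                     rst,
    input  signed [W-1:0]     x_in,   // external input to block A
    output reg signed [W-1:0] a_out,  // registered output of block A
    output reg signed [W-1:0] b_out   // registered output of block B
);
    // Combinational next values. Each one reads only registered outputs,
    // so there is no purely combinational path around the loop.
    wire signed [W-1:0] a_next = x_in  + (b_out >>> 1);
    wire signed [W-1:0] b_next = a_out - (b_out >>> 2);

    always @(posedge clk) begin
        if (rst) begin
            // A defined reset value keeps 'x' from circulating in the loop.
            a_out <= {W{1'b0}};
            b_out <= {W{1'b0}};
        end else begin
            a_out <= a_next;
            b_out <= b_next;
        end
    end
endmodule
```

In simulation, rst should be held high across at least one rising clock edge before the outputs are examined. Without a known starting value in both registers, an 'x' in either one would keep feeding the other indefinitely.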
An optimization would allow each processor to represent multiple nodes in the larger neural network. This is channelization, and it allows pipelining to work in this case. In this case, the state RAM includes coefficients and intermediate results. The goal is to allow a longer pipeline in order to get a high rate of processing. Because the next operations are based on previous outputs, the rate of processing for a single channel is lowered by the pipeline delay. If the same processing is done on several independent channels, the processor can have these in the pipeline at the same time.
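A simplified sketch of the channelized arrangement might look like the following. Everything here is an assumption made for illustration: the module name, the parameters, the two-stage pipeline, and the recurrence y[n] = x[n] + ((coeff * y[n-1]) >>> FRAC). The coefficient is passed in as an input rather than stored in the state RAM to keep the example short.

```verilog
`timescale 1ns / 1ps
// Hypothetical sketch (names, widths and recurrence are assumptions).
// CH independent channels share one multiply-add datapath. A small state RAM
// keeps each channel's previous output until that channel is visited again.
module channelized_node #(
    parameter W    = 16,  // data width
    parameter FRAC = 8,   // fractional bits of the fixed-point format
    parameter CH   = 4,   // number of channels, must be >= 2 (pipeline depth)
    parameter CW   = 2    // bits needed to index CH channels
)(
    input                     clk,
    input                     rst,
    input  signed [W-1:0]     x_in,    // input for the channel issued this cycle
    input  signed [W-1:0]     coeff,   // feedback weight for that channel
    output reg signed [W-1:0] y_out,   // newest result
    output reg [CW-1:0]       y_ch,    // channel that y_out belongs to
    output reg                y_valid
);
    reg signed [W-1:0]   state_ram [0:CH-1];  // per-channel previous output
    reg [CW-1:0]         ch;                  // channel issued into stage 1

    // Stage 1 registers
    reg signed [2*W-1:0] prod_q;
    reg signed [W-1:0]   x_q;
    reg [CW-1:0]         ch_q;
    reg                  valid_q;

    // Stage 2 sum, computed at full width and then truncated to W bits
    wire signed [2*W-1:0] sum_full = x_q + (prod_q >>> FRAC);
    wire signed [W-1:0]   sum      = sum_full[W-1:0];

    integer i;
    always @(posedge clk) begin
        if (rst) begin
            ch      <= {CW{1'b0}};
            valid_q <= 1'b0;
            y_valid <= 1'b0;
            y_out   <= {W{1'b0}};
            y_ch    <= {CW{1'b0}};
            for (i = 0; i < CH; i = i + 1)
                state_ram[i] <= {W{1'b0}};   // defined start state per channel
        end else begin
            // Stage 1: fetch this channel's previous output and multiply.
            prod_q  <= coeff * state_ram[ch];
            x_q     <= x_in;
            ch_q    <= ch;
            valid_q <= 1'b1;
            ch      <= (ch == CH-1) ? {CW{1'b0}} : ch + 1'b1;

            // Stage 2: add the input and write the new state back.
            y_valid <= valid_q;
            if (valid_q) begin
                state_ram[ch_q] <= sum;
                y_out           <= sum;
                y_ch            <= ch_q;
            end
        end
    end
endmodule
```

Each channel is issued once every CH cycles, and its new state is written back one cycle after it is issued, so the feedback value is ready by the time the channel comes around again as long as CH is at least the pipeline depth. A deeper pipeline would need correspondingly more channels in flight, which is the trade-off described in the post above.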
30th July 2017, 17:15 #11
