Instantiate a distributed RAM using Core Generator in Xilinx ISE

tanish · Sep 18, 2017

Hello.
I am trying to implement a distributed RAM using the ISE IP Core Generator, but I get this warning:

WARNING:HDLCompiler:1499 - "E:\M.Sc\ISE projects\ipcore-test\test2\ipcore_dir\myram.v" Line 39: Empty module <myram> remains a black box.

My Verilog code is:

module test2( 
data_in, 
clock, 
wen, 
ce,
addr, 
data_out
);
 
input   [15:0]data_in;
input   clock;
input   wen;
input   ce;
input   [5:0]addr;
output  [15:0]data_out;
 
 
myram a1 (
  .a(addr), // input [5 : 0] a
  .d(data_in), // input [15 : 0] d
  .clk(clock), // input clk
  .we(wen), // input we
  .i_ce(ce), // input i_ce
  .spo(data_out) // output [15 : 0] spo
);
 
 
endmodule

I used the .veo file to instantiate myram.

The myram.v file generated by Core Generator is:

`timescale 1ns/1ps
 
module myram(
  a,
  d,
  clk,
  we,
  i_ce,
  spo
);
 
input [5 : 0] a;
input [15 : 0] d;
input clk;
input we;
input i_ce;
output [15 : 0] spo;
 
// synthesis translate_off
 
  DIST_MEM_GEN_V7_2 #(
    .C_ADDR_WIDTH(6),
    .C_DEFAULT_DATA("0"),
    .C_DEPTH(64),
    .C_FAMILY("spartan6"),
    .C_HAS_CLK(1),
    .C_HAS_D(1),
    .C_HAS_DPO(0),
    .C_HAS_DPRA(0),
    .C_HAS_I_CE(1),
    .C_HAS_QDPO(0),
    .C_HAS_QDPO_CE(0),
    .C_HAS_QDPO_CLK(0),
    .C_HAS_QDPO_RST(0),
    .C_HAS_QDPO_SRST(0),
    .C_HAS_QSPO(0),
    .C_HAS_QSPO_CE(0),
    .C_HAS_QSPO_RST(0),
    .C_HAS_QSPO_SRST(0),
    .C_HAS_SPO(1),
    .C_HAS_SPRA(0),
    .C_HAS_WE(1),
    .C_MEM_INIT_FILE("no_coe_file_loaded"),
    .C_MEM_TYPE(1),
    .C_PARSER_TYPE(1),
    .C_PIPELINE_STAGES(0),
    .C_QCE_JOINED(0),
    .C_QUALIFY_WE(0),
    .C_READ_MIF(0),
    .C_REG_A_D_INPUTS(1),
    .C_REG_DPRA_INPUT(0),
    .C_SYNC_ENABLE(1),
    .C_WIDTH(16)
  )
  inst (
    .A(a),
    .D(d),
    .CLK(clk),
    .WE(we),
    .I_CE(i_ce),
    .SPO(spo),
    .DPRA(),
    .SPRA(),
    .QSPO_CE(),
    .QDPO_CE(),
    .QDPO_CLK(),
    .QSPO_RST(),
    .QDPO_RST(),
    .QSPO_SRST(),
    .QDPO_SRST(),
    .DPO(),
    .QSPO(),
    .QDPO()
  );
 
// synthesis translate_on
 
endmodule

Could anyone tell me exactly what the problem is?

When I try this method on a simple 6-bit adder, everything is OK.

vGoodtimes · Sep 19, 2017

This warning should be safe to ignore. The issue is that the module has no defined content at synthesis time. This occurs when no RTL is available for the module: the tools assume that a netlist will be provided at a later stage of the build process. If no netlist is provided, those later tools would presumably fail.

You may be able to remove the warning by adding a syn_blackbox synthesis attribute to the module. Not sure exactly where this would be applied. Look up "synthesis attributes" to find applicable options.
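To make the attribute idea concrete: in the generated myram.v posted above, everything between `// synthesis translate_off` and `// synthesis translate_on` is hidden from synthesis, so XST sees a module with ports only and treats it as a black box. A minimal sketch of marking such a stub explicitly follows. It assumes XST's `box_type` attribute; Synplify uses a comment-style `syn_black_box` directive instead, so the exact spelling depends on the tool and should be checked against its documentation. The module name `myram_stub` is hypothetical.

```verilog
// Hypothetical port-only stub, marked as a black box for synthesis.
// Assumption: XST-style attribute name; other tools use different spellings.
(* box_type = "black_box" *)
module myram_stub (
  input  wire [5:0]  a,     // address
  input  wire [15:0] d,     // write data
  input  wire        clk,   // clock
  input  wire        we,    // write enable
  input  wire        i_ce,  // input clock enable
  output wire [15:0] spo    // single-port read data
);
  // Intentionally empty: the implementation is expected to come
  // from a netlist later in the build flow.
endmodule
```

With Core Generator output this is normally unnecessary, because the flow supplies the netlist for the empty module by itself.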

The build tools are annoying because they often mark optimizations and common use-cases as warnings.

tanish · Sep 19, 2017

I checked https://www.xilinx.com/support/documentation/sw_manuals/xilinx11/cgn_c_df_synthesize_verilog_design.htm

but there were no special instructions for Xilinx ISE to solve this problem.
Can you explain more about adding a syn_blackbox synthesis attribute to the module? I don't have any idea about it.

ads-ee · Sep 19, 2017

You shouldn't have to add syn_blackbox to an ISE project unless you are using a core that has no RTL to begin with, e.g. a third-party core distributed as an .ngc file.

I'm wondering if you didn't add the IP core correctly to the project.

tanish · Sep 20, 2017

Actually, I think I did it correctly, because when I change the preferred language to VHDL and then use Core Generator, it works.
But unfortunately I don't know the exact reason.
I hope someone can help me.

tanish · Sep 20, 2017

I ran XST synthesis and the maximum frequency was about 350 MHz.
But when I run a post-place-and-route simulation, there are some problems with my results.
This is my code:

module ram_top(
clk,
en,
we,
addr,
oen,
data_in,
data_out
);
 
input     clk, en, we, oen;
input     [5:0] addr;
input     [255:0] data_in;
output    [255:0] data_out;
 
 BUFG BUFG_inst (
      .O(clk1), // 1-bit output: Clock buffer output
      .I(clk)  // 1-bit input: Clock buffer input
   );
 
myram a1 (
  .clka(clk1), // input clka
  .ena(en), // input ena
  .wea(we), // input [0 : 0] wea
  .addra(addr), // input [5 : 0] addra
  .regcea(oen),
  .dina(data_in), // input [255 : 0] dina
  .douta(data_out) // output [255 : 0] douta
);
 
endmodule

And my testbench is:

module testbench;
 
    // Inputs
    reg clk;
    reg en;
    reg we;
    reg [5:0] addr;
    reg oen;
    reg [255:0] data_in;
 
    // Outputs
    wire [255:0] data_out;
 
    // Instantiate the Unit Under Test (UUT)
    ram_top uut (
        .clk(clk), 
        .en(en), 
        .we(we), 
        .addr(addr), 
        .oen(oen),
        .data_in(data_in), 
        .data_out(data_out)
    );
 
    always begin
     clk = 1'b0;
     #4;
     clk = 1'b1;
     #4;
    end
 
    initial begin
        en = 1'b1;
        oen = 1'b0;
        //#107;
        #105;
        data_in = 256'ha000000000000b000000000000c000000000000000000d0000000000000000e0;
        we = 1'b1;
        addr = 6'b000011;
      //#15;
        #8;
        data_in = 256'h000000f0000000000e000000000d00000000c0000000000b00000000000a0000;
        we = 1'b1;
        addr = 6'b001001;
      //#15;
        #8;
        
        data_in = 256'h000000100000000001011111000d000000001000000000010000000000010000;
        we = 1'b1;
        addr = 6'b111001;
      //#15;
        #8;
        
        oen = 1'b1;
        we = 1'b0;
        addr = 6'b000011;
      //#15;
        #8;
        
        we = 1'b0;
        addr = 6'b111001;
        #8;
        
        we = 1'b0;
        addr = 6'b100001;
        #8;
        
        we = 1'b0;
        addr = 6'b001001;
      //#15;
        #8;
        
    end
      
      
endmodule

And this is my result:

Why are some parts x (red)?
And why does it take so long (about 2 or 3 clocks) for each output (from an address) to become stable?
When the max frequency is 350 MHz, I think I should have the output for each address in about 3 ns, but that didn't happen!

ads-ee · Sep 21, 2017

tanish said:
And why does it take so long (about 2 or 3 clocks) for each output (from an address) to become stable?
When the max frequency is 350 MHz, I think I should have the output for each address in about 3 ns, but that didn't happen!

Uh, 3 ns is not the pipeline delay; that is the approximate clock period, which isn't the same thing.

1 clock to capture the address and read enable
1 clock to read the memory array and load it into the output buffer register

So there is a two clock delay to get read data from the RAM, which you then capture on the third clock cycle.
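The two-stage read described above can be reproduced with a small behavioural model. This is only a sketch under assumptions: `sync_ram_model` is a hypothetical stand-in with a registered address and an output register, not the generated core, and the 8-bit width is chosen for brevity. The testbench checks that read data is not yet visible after the first rising edge and is visible after the second.

```verilog
`timescale 1ns/1ps
// Hypothetical model: registered address plus output register.
module sync_ram_model (
  input  wire       clk,
  input  wire       we,
  input  wire [5:0] addr,
  input  wire [7:0] din,
  output reg  [7:0] dout
);
  reg [7:0] mem [0:63];
  reg [5:0] addr_q;
  always @(posedge clk) begin
    if (we) mem[addr] <= din;
    addr_q <= addr;          // clock 1: capture the address
    dout   <= mem[addr_q];   // clock 2: array read into the output register
  end
endmodule

module latency_tb;
  reg        clk  = 1'b0;
  reg        we   = 1'b0;
  reg  [5:0] addr = 6'd0;
  reg  [7:0] din  = 8'h00;
  wire [7:0] dout;

  sync_ram_model dut (.clk(clk), .we(we), .addr(addr), .din(din), .dout(dout));

  always #4 clk = ~clk;      // 8 ns period, as in the thread's testbench

  initial begin
    // Write two different locations so the read address is a fresh one.
    @(negedge clk); we = 1'b1; addr = 6'd3; din = 8'hA5;
    @(negedge clk);            addr = 6'd9; din = 8'h3C;
    // Present the read address.
    @(negedge clk); we = 1'b0; addr = 6'd3;
    @(posedge clk); #1;        // first edge: address captured only
    if (dout === 8'hA5) $display("FAIL: data appeared after one clock");
    @(posedge clk); #1;        // second edge: output register loaded
    if (dout === 8'hA5) $display("PASS: data valid after two clocks");
    else                $display("FAIL: dout = %h", dout);
    $finish;
  end
endmodule
```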

Also, why are you using the falling edge of the clock instead of the normal rising edge? I also don't get why you have so many extra bit transitions on your data_out bus that are not aligned with any clock edges.

vGoodtimes · Sep 21, 2017

ads-ee said:
I also don't get why you have so many extra bit transitions on your data_out bus that are not aligned with any clock edges.

tanish said:
... post place and route simulation ...

Code:

wire [255:0] data_out;

This is multiple DMEMs/registers with various simulated routing delays for this large bus.

tanish · Sep 24, 2017

vGoodtimes said:
Code:

wire [255:0] data_out;

This is multiple DMEMs/registers with various simulated routing delays for this large bus.

Could you please explain more about DMEMs/registers?
I defined data_out as an output port and didn't specify whether it's a reg or a wire. I think in this case Verilog treats it as a wire by default, doesn't it?
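On the wire-versus-reg question, a port declared without a type is indeed a wire by default (assuming the default net type has not been changed). The sketch below shows both cases side by side; the module and port names are hypothetical.

```verilog
// Hypothetical example of port types.
module port_default_demo (
  input      [3:0] a,
  output     [3:0] y_wire,  // no type given: implicitly a wire
  output reg [3:0] y_reg    // declared reg so a procedural block can assign it
);
  assign y_wire = a;        // a continuous assignment drives a wire
  always @* y_reg = ~a;     // a procedural assignment needs a reg
endmodule
```

Either kind of output shows the same per-bit delay spread in a post-place-and-route simulation; the declaration only determines how the signal may be assigned in the source.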

- - - Updated - - -

ads-ee said:
Uh, 3 ns is not the pipeline delay; that is the approximate clock period, which isn't the same thing.

1 clock to capture the address and read enable
1 clock to read the memory array and load it into the output buffer register

So there is a two clock delay to get read data from the RAM, which you then capture on the third clock cycle.

Also, why are you using the falling edge of the clock instead of the normal rising edge? I also don't get why you have so many extra bit transitions on your data_out bus that are not aligned with any clock edges.

Actually, I've used the rising edge.
And I want to know exactly why there are extra bit transitions on my data_out bus!

vGoodtimes · Sep 24, 2017

My point was that you have a wide data bus and you are modeling the delays from elements that might be spread out a little bit. If you expand the bus, you'll likely see that each bit transitions once, but with a different delay compared to the other bits. As a result, the combined bus shows many transitions.
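The effect can be reproduced in a few lines. This sketch assumes a 4-bit bus and made-up per-bit delays standing in for back-annotated routing delays: each bit changes exactly once, yet the bus as a whole steps through several intermediate values before it settles.

```verilog
`timescale 1ns/1ps
// Hypothetical demo: one transition per bit, many transitions on the bus.
module skew_demo;
  reg  [3:0] q = 4'b0000;        // value launched by the source registers
  wire [3:0] bus;

  // Assumed per-bit routing delays (illustrative values only).
  assign #0.3 bus[0] = q[0];
  assign #0.7 bus[1] = q[1];
  assign #1.1 bus[2] = q[2];
  assign #1.6 bus[3] = q[3];

  // Print every value the combined bus passes through.
  always @(bus) $display("t=%0t bus=%b", $realtime, bus);

  initial begin
    #10 q = 4'b1111;             // all source bits change at the same instant
    #5  $finish;
  end
endmodule
```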

tanish · Sep 25, 2017

vGoodtimes said:
My point was that you have a wide data bus and you are modeling the delays from elements that might be spread out a little bit. If you expand the bus, you'll likely see that each bit transitions once, but with a different delay compared to the other bits. As a result, the combined bus shows many transitions.

Yes, I know, but I will use the combination of these bits in my code!
For this reason I want to know whether there is any way to reduce this long transition. It's very important for me.

- - - Updated - - -

Actually, I have another question.
If I try code like this on an ASIC, does it have a long transition like this? Or does it happen only in FPGAs?

vGoodtimes · Sep 25, 2017

My answer is the same as the one for communications: if it looks clean, you aren't doing it fast enough. Does the design meet static timing requirements? If so, this simulation behavior is expected and doesn't matter.

ads-ee · Sep 25, 2017

tanish said:
Actually, I have another question.
If I try code like this on an ASIC, does it have a long transition like this? Or does it happen only in FPGAs?

It's even more likely to happen with even more transitions in an ASIC as an ASIC typically has far more levels of logic with far greater granularity down to individual gates (unlike FPGAs which have n-input LUTs).

As long as the transitions all finish before the setup time of the capture FF (i.e. meets static timing as vGoodtimes pointed out) then everything should work as designed.
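A small simulation sketch of that last point, under assumed delays: the bus takes up to 2 ns to settle after each launch edge, but the capture flip-flop samples it only at the next rising edge 8 ns later, so it never sees an intermediate value. The delay values and the 4-bit width are illustrative assumptions, and setup time itself is not modeled here.

```verilog
`timescale 1ns/1ps
// Hypothetical demo: skewed bus transitions are harmless when they
// settle before the capturing clock edge.
module capture_demo;
  reg clk = 1'b0;
  always #4 clk = ~clk;            // 8 ns period, as in the thread's testbench

  reg  [3:0] src = 4'b0000;        // launching register
  wire [3:0] bus;
  assign #0.5 bus[0] = src[0];     // assumed per-bit routing delays
  assign #1.0 bus[1] = src[1];
  assign #1.5 bus[2] = src[2];
  assign #2.0 bus[3] = src[3];

  reg [3:0] q   = 4'b0000;         // capture register
  integer   bad = 0;               // counts partially updated samples

  always @(posedge clk) begin
    src <= ~src;                   // launch a new value on every edge
    q   <= bus;                    // capture the value launched one edge earlier
    if (bus !== 4'b0000 && bus !== 4'b1111) bad = bad + 1;
  end

  initial begin
    #100;
    if (bad == 0) $display("PASS: only settled values were captured");
    else          $display("FAIL: %0d partially updated samples", bad);
    $finish;
  end
endmodule
```

Shrinking the clock period below the largest assumed delay makes the check fail, which is the simulation counterpart of a static timing violation.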
