Welcome to EDAboard.com

Why is the FPGA consuming this much block RAM? Comments please

Status
Not open for further replies.

hallovipin

Friends,
I have built a 12-bit 8K RAM inside an FPGA (Spartan-3A). This FPGA has 20K of block RAM inside, but what I find is that, in spite of using only 8K, my whole 20K of block RAM gets exhausted.

Have a look at my coding style.




// Write buffer: store one ADC sample per clk_adc edge while write_enable is high
always @(posedge clk_adc) begin
    if (write_enable) begin
        if (address_w == 13'b1111111111111) begin
            // Wraps without storing a sample, so location 8191 is never written
            address_w <= 13'b0000000000000;
        end
        else begin
            int_ram[address_w] <= adc_data_in;
            address_w <= address_w + 13'b0000000000001;
        end
    end
end

// Read side: walk through the buffer and accumulate the positive differences
always @(posedge clk) begin
    if (!write_enable) begin
        address_curr <= address_r;
        if (address_r == 13'b1111111111111) begin
            address_r    <= 13'b0000000000000;
            address_next <= 13'b0000000000000;
        end
        else begin
            temp_sum     <= temp_sum + neg_check;
            address_r    <= address_r + 13'b0000000000001;
            address_next <= address_r + 13'b0000000000001;
        end
    end
end

// Two locations of int_ram are read in the same cycle here
assign neg_check = (int_ram[address_next] > int_ram[address_r]) ? (int_ram[address_next] - int_ram[address_r]) : 12'd0;

For this code ISE is using 18K of RAM. Is it due to the fact that I am reading two data points in a single clock (int_ram[address_next]-int_ram[address_r]), or is there something else?

If so, how can I solve this issue? I need to have two data points to make the calculations.

Thanks
 

Not entirely sure. To be sure, there is no tri-port block RAM primitive in the Spartan, but one can be made for the 1-write, 2-read case by using twice as much block RAM. The write is the same for both sets of block RAM, but the reads are independent.
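To make the "twice as much block RAM" point concrete, here is a rough sketch of a 1-write, 2-read memory built from two copies of the same data. The module and port names (ram_1w2r, wclk, raddr_a and so on) are made up for illustration and are not from the original post; whether the tool maps each array to block RAM depends on the synthesizer and its settings.

```verilog
// Sketch only: two identical memories, written together and read separately.
// All names here are hypothetical; sizes follow the 8K x 12-bit buffer in the post.
module ram_1w2r (
    input  wire        wclk,      // write clock
    input  wire        we,        // write enable (wclk domain)
    input  wire [12:0] waddr,
    input  wire [11:0] wdata,
    input  wire        rclk,      // read clock
    input  wire [12:0] raddr_a,   // first read address
    input  wire [12:0] raddr_b,   // second read address
    output reg  [11:0] rdata_a,
    output reg  [11:0] rdata_b
);
    reg [11:0] ram_a [0:8191];    // copy feeding read port A
    reg [11:0] ram_b [0:8191];    // copy feeding read port B

    // Every write goes to both copies, so they always hold the same data
    always @(posedge wclk) begin
        if (we) begin
            ram_a[waddr] <= wdata;
            ram_b[waddr] <= wdata;
        end
    end

    // Each copy serves one independent synchronous read per clock
    always @(posedge rclk) begin
        rdata_a <= ram_a[raddr_a];
        rdata_b <= ram_b[raddr_b];
    end
endmodule
```

The cost is visible directly in the declarations: two full-size arrays for one logical buffer.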

Next, the Spartan doesn't have primitives with 12b inputs, but rather has 9b/18b/36b/72b inputs. Thus some amount of space is wasted. Coregen may be able to do a better job than the synthesis tool and reduce the amount of wasted RAM.

I suspect the design uses 20K: 16K from needing the two reads, and the rest lost due to poor mapping to the actual block RAMs.

As before, look into converting this design into FIFOs or simple dual-port block RAMs. One way to do this is to exploit the read pattern:
(0 1) (1 2) (2 3) (3 4) ...
This could be converted to "read 0", "read 1 and store", "read 2 and use the stored 1", "read 3 and use the stored 2" ...
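A minimal sketch of that read pattern, assuming one RAM read per clock with the previous sample held in a register. The names diff_accum, read_enable, curr_sample, prev_sample and valid are invented for this example, the 25-bit width of temp_sum is an assumption sized for 8191 differences of up to 4095, and clearing temp_sum when reading stops is also an assumption (the original post resets it elsewhere). read_enable is assumed to already be safe to use in the clk domain.

```verilog
// Sketch only: one write port and one read port, so a simple dual-port RAM is enough.
// Hypothetical names: diff_accum, read_enable, curr_sample, prev_sample, valid.
module diff_accum (
    input  wire        clk_adc,       // write-side clock
    input  wire        write_enable,  // clk_adc domain
    input  wire [11:0] adc_data_in,
    input  wire        clk,           // read-side clock
    input  wire        read_enable,   // clk domain (assumed, not in the original)
    output reg  [24:0] temp_sum       // width is an assumption
);
    reg [11:0] int_ram [0:8191];      // 8K x 12 bits
    reg [12:0] address_w   = 13'd0;
    reg [12:0] address_r   = 13'd0;
    reg [11:0] curr_sample = 12'd0;   // sample read on the previous clock
    reg [11:0] prev_sample = 12'd0;   // sample read one clock before that
    reg [1:0]  valid       = 2'b00;   // 2'b11 once both samples hold RAM data

    // Write port: one sample per clk_adc edge, address wraps at 8191
    always @(posedge clk_adc) begin
        if (write_enable) begin
            int_ram[address_w] <= adc_data_in;
            address_w <= address_w + 13'd1;
        end
    end

    // Read port: exactly one RAM read per clk edge
    always @(posedge clk) begin
        if (read_enable) begin
            curr_sample <= int_ram[address_r];   // "read n"
            prev_sample <= curr_sample;          // "store n-1"
            address_r   <= address_r + 13'd1;
            valid       <= {valid[0], 1'b1};
            // Accumulate only positive steps, as neg_check does in the post
            if (valid == 2'b11 && curr_sample > prev_sample)
                temp_sum <= temp_sum + (curr_sample - prev_sample);
        end
        else begin
            valid     <= 2'b00;
            address_r <= 13'd0;
            temp_sum  <= 25'd0;
        end
    end
endmodule
```

Compared with the posted code, the comparison now uses two registers instead of two RAM reads, so the memory does not need to be duplicated for a second read port.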

Failing this, it might be possible to run the read clock at 2x (or faster). Then you can read from two addresses in two clock cycles, one per cycle.

Also, you are using "write_enable" in both the clk_adc and clk processes. ISE won't give you warnings about this, but this design practice will almost always fail eventually.

Lastly, temp_sum is used without any obvious initialization or reset.
 

@ permute

Thanks a lot for replying.

Why should there be a problem if I use write_enable for both clk and clk_adc? I am writing data at low speed (clk_adc) and reading it fast (clk), so I have enough time for processing as well.

write_enable is just a flag which controls the read and write operations. Besides, my post-translate simulation is perfect. But if you still think it is a problem, what is the solution?

Also, in the original code I am resetting temp_sum every 5 clock cycles.

I tried using a dual-port RAM through Coregen, but it is giving problems because I am not sure which mode of DP RAM should be used for the above purpose.

Added after 10 minutes:

@ permute
If you have time, can you please show me how to use a FIFO for the above application, as you suggested?
 

hallovipin:
Think about how an FPGA is constructed: lots of small elements connected together with some configurable routing. Your "write_enable" signal will arrive at each element at a slightly different time. Even without this, the logic delays from "write_enable" to each affected register will be slightly different (the same effect applies to slightly different clock delays).

If the two clocks do not have a FROM-TO constraint, then the results will be indeterminate, but will have a high probability of failure over time (e.g., 0.0001% per cycle for a 1 MHz system works out to roughly one expected failure per second).

This means the clocks must have a known, rational relationship, as well as a defined phase difference. Otherwise the FROM-TO constraint wouldn't make any sense.
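If the two clocks cannot be given a fixed relationship, a commonly used alternative (not the FROM-TO approach described above, just a different standard technique) is to pass a single-bit flag such as write_enable through a two-flip-flop synchronizer before using it in the other clock domain. A minimal sketch follows; the module and signal names (sync_2ff, async_in, sync_out) are made up for illustration.

```verilog
// Sketch only: two-flip-flop synchronizer for a single-bit, slowly changing flag.
// Hypothetical names: sync_2ff, async_in, sync_out, sync_ff.
module sync_2ff (
    input  wire clk,        // destination clock (e.g. the fast read clock)
    input  wire async_in,   // flag generated in another clock domain
    output wire sync_out    // same flag, delayed, safe to use with clk
);
    reg [1:0] sync_ff = 2'b00;

    // The first flop may go metastable; the second gives it a full
    // clock period to settle before the value is used by other logic.
    always @(posedge clk)
        sync_ff <= {sync_ff[0], async_in};

    assign sync_out = sync_ff[1];
endmodule
```

All logic in the clk domain would then use sync_out only, so every register sees the same value of the flag on the same edge. This covers a single control bit; multi-bit data such as addresses still needs a FIFO or a handshake.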

I encourage you to look at what your system is trying to do. Attempt to convert it into a system that reads one new value per cycle and uses previously read values to perform the calculations.
 

A block RAM always occupies a complete block whether or not you use all of it, i.e. you can't instantiate two 10K RAMs in one 20K block. Use distributed RAMs to implement small RAMs. That's all I know; my statement may be incorrect.
 

@ umair

It's not like that; block RAM can be used partially. If I reduce my buffer size from 8K to 4K, I consume only 65% of the total block RAM.

How do I group distributed RAM together so that it works like a block RAM?
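For reference, a rough sketch of how a small memory can be described so that it lands in distributed (LUT) RAM rather than block RAM. The ram_style attribute shown is the one XST documents; other tools may use a different name or ignore it. The module name, the port names and the 64-word depth are arbitrary choices for this example.

```verilog
// Sketch only: small memory with a synchronous write and an asynchronous read,
// which is the usual shape of a distributed RAM.
// Hypothetical names and size: small_dist_ram, 64 x 12 bits.
module small_dist_ram (
    input  wire        clk,
    input  wire        we,
    input  wire [5:0]  addr_w,   // write address
    input  wire [5:0]  addr_r,   // read address
    input  wire [11:0] din,
    output wire [11:0] dout
);
    // Ask the synthesizer to build this array from LUTs (XST attribute)
    (* ram_style = "distributed" *)
    reg [11:0] mem [0:63];

    // Synchronous write
    always @(posedge clk)
        if (we)
            mem[addr_w] <= din;

    // Asynchronous (combinational) read
    assign dout = mem[addr_r];
endmodule
```

This only makes sense for small buffers; an 8K x 12-bit array built this way would take a very large number of LUTs.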
 

I agree with you, but my point was that you can't implement two separate buffers or RAMs in the same block. Implement two 2K block RAMs and see whether they use one 8K block or two 8K blocks.
 

@ Umair..

Oh my God, you are absolutely right.
Now I have found the devil.

It doesn't matter what size of RAM you build; it will consume a whole block even for a 3-bit RAM.
Thanks a lot, it really helped me.
Now I will rethink my strategy.

Added after 13 minutes:

Bottom line:

You can use 2 blocks for a single RAM, but you can't use a single block for 2 RAMs.
 
