
[SOLVED] Running parallel RAM at a different clock than CPU?

Status
Not open for further replies.

Artlav

Let's say I have a parallel static RAM that is running at one frequency (8 MHz), and a CPU (in the FPGA) that is running at a higher frequency (16 MHz).
The CPU asserts a read or write signal, the RAM gets accessed and sets a done flag for one clock, the CPU sees the flag and deasserts the request signal.

It all works fine when they both use the same clock, but if I lower the RAM clock (either by dividing the CPU clock or running it async from a PLL), things start to lock up and act weird.

I suspect this needs some sort of synchronization. I tried syncing the done flag (the running theory being that the flag stays up while the CPU carries on and messes something up on subsequent reads), and the read and write flags (just in case), as described here: http://fpga4fun.com/CrossClockDomain2.html

However, that had no effect (maybe it even made things worse, since with the sync it locks up almost reliably).
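For reference, the synchronizer pattern from that page boils down to something like this (an illustrative sketch, not my exact code; signal names are made up):

```verilog
// Typical two-flip-flop synchronizer for a single-bit flag crossing
// from one clock domain into another. This works for a slow-to-fast
// crossing where the flag is held wide enough to be seen by clk_dst.
module flag_sync (
  input  wire clk_dst,   // destination (e.g. CPU) clock
  input  wire flag_src,  // flag generated in the source (e.g. RAM) domain
  output wire flag_dst   // flag safe to use in the destination domain
);
  reg [1:0] sync;
  always @(posedge clk_dst)
    sync <= {sync[0], flag_src};   // shift the flag through two flops
  assign flag_dst = sync[1];
endmodule
```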

What could I be doing wrong?
How is this normally done?

The code for the RAM controller:
Code:
//----------------------------------------------------------------------------//
module ffram_controller(
 input rst,
 input clk,
 
 output reg [14:0] fram_addr,
 inout [15:0] fram_io,
 output reg fram_oe,
 output reg fram_we,
 output reg fram_ce,
 
 input  [31:0] addr,
 input  [31:0] data2mem,
 output reg [31:0] data2cpu,
 input  re,
 input  we,
 output reg done
);
//----------------------------------------------------------------------------//
reg  [2:0] state;
reg  [15:0] xlate_data;
wire [14:0] xlate_addr;
//----------------------------------------------------------------------------//
parameter [2:0] idle=3'd0;
parameter [2:0] re_a=3'd1;
parameter [2:0] re_b=3'd2;
parameter [2:0] we_a=3'd3;
parameter [2:0] we_b=3'd4;
parameter [2:0] we_c=3'd5;
parameter [2:0] skip=3'd6;
//----------------------------------------------------------------------------//
assign xlate_addr=addr[15:1];
//----------------------------------------------------------------------------//
always @(posedge clk)
begin
 if(rst)begin
  data2cpu<=0;
  state<=idle;
  done<=0;
  fram_addr<=15'h7FFF;
  xlate_data<=16'hFFFF;
  fram_oe<=1;
  fram_we<=1;
  fram_ce<=1;
 end else begin
  case(state)
   idle:begin
    done<=0;
    fram_oe<=1;
    fram_we<=1;
    fram_ce<=1;
    if(re)begin
     state<=re_a;  
     fram_addr<=xlate_addr;
     fram_oe<=0;
     fram_we<=1;
     fram_ce<=0;
    end
    if(we)begin
     state<=we_a;  
     fram_addr<=xlate_addr;
     xlate_data<=data2mem[15:0];
     fram_oe<=1;
     fram_we<=0;
     fram_ce<=0;
    end
   end

   re_a:begin
    data2cpu[15:0]<=fram_io;
    fram_addr<=xlate_addr+1'd1;
    state<=re_b;
   end
   re_b:begin
    data2cpu[31:16]<=fram_io;
    done<=1;
    state<=skip;
    fram_oe<=1;
    fram_we<=1;
    fram_ce<=1;
   end

   we_a:begin
    fram_we<=1;
    state<=we_b;
   end
   we_b:begin
    xlate_data<=data2mem[31:16];
    fram_addr<=xlate_addr+1'd1;
    fram_we<=0;
    state<=we_c;
   end
   we_c:begin
    done<=1;
    state<=skip;
    fram_oe<=1;
    fram_we<=1;
    fram_ce<=1;
   end
   skip:begin
    done<=0;
    state<=idle;
   end
  endcase
 end
end
//----------------------------------------------------------------------------//
assign fram_io=we?xlate_data:16'bzzzzzzzzzzzzzzzz;
//----------------------------------------------------------------------------//
endmodule
//----------------------------------------------------------------------------//
 

Let's say i have a parallel static RAM that is running at one frequency (8 MHz)

What do you mean by running at 8 MHz? Is this supposed to be a synchronous RAM (i.e. one with a clock), or a parallel asynchronous RAM with reads and writes done at 8 MHz?

I ask because I can't think of any low-speed synchronous SRAM devices; most sync SRAM products are aimed at high-speed applications in the 100s of MHz range.


Regardless, based on what you've written, it doesn't look like you have any kind of wait states to let the SRAM respond if it is run at anything other than the same clock frequency as this FSM.
 

What do you mean by running at 8 MHz? Is this supposed to be a synchronous RAM (i.e. one with a clock), or a parallel asynchronous RAM with reads and writes done at 8 MHz?
Parallel static asynchronous RAM.
The datasheet specifies only the maximum speed (70 ns access time), not any particular operating frequency. AFAIK, it can work down to near DC.

What sort of wait states do I need?
It's supposed to be clocked at a rate that gives the RAM enough time to respond between one clock and the next.
The CPU, on the other hand, would like to run much faster. Hence the problem.
 

You hold the values long enough for the RAM to operate correctly.

e.g. if the ADDR needs to be stable for 70 ns before you get data, then you output the ADDR, wait for two 16 MHz clocks (70 / 62.5 = 1.12, rounded up), and then you can capture the read data. In this case there is a one-clock-cycle wait state that the CPU needs to adhere to, to allow the RAM time to respond.

Draw a timing diagram; it will help you understand what is going on.
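As a sketch of the idea (signal names invented, assuming a 16 MHz clock and a 70 ns part), a single read with wait states on one clock might look like:

```verilog
// Illustrative sketch only: hold address/CE/OE stable for WAIT clocks,
// then sample the data. Not a drop-in replacement for your controller.
module sram_read_sketch (
  input  wire        clk,
  input  wire        rst,
  input  wire        start,       // one-cycle read request
  input  wire [14:0] addr,
  output reg  [14:0] sram_addr,
  input  wire [15:0] sram_data,   // from the SRAM's data pins
  output reg         sram_ce_n,
  output reg         sram_oe_n,
  output reg  [15:0] rdata,
  output reg         done
);
  localparam WAIT = 2;            // ceil(70 ns / 62.5 ns) at 16 MHz
  reg [1:0] cnt;
  reg       busy;

  always @(posedge clk) begin
    done <= 1'b0;
    if (rst) begin
      busy      <= 1'b0;
      sram_ce_n <= 1'b1;
      sram_oe_n <= 1'b1;
    end else if (start && !busy) begin
      sram_addr <= addr;          // drive address and enables...
      sram_ce_n <= 1'b0;
      sram_oe_n <= 1'b0;
      cnt       <= WAIT - 1;
      busy      <= 1'b1;
    end else if (busy) begin
      if (cnt == 0) begin
        rdata     <= sram_data;   // ...then sample after the access time
        sram_ce_n <= 1'b1;
        sram_oe_n <= 1'b1;
        busy      <= 1'b0;
        done      <= 1'b1;        // one-cycle done pulse back to the CPU
      end else
        cnt <= cnt - 1'b1;
    end
  end
endmodule
```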
 
@Artlav, what specific problem are you having? "things start to lock up and act weird." is less informative than I would like.

For example, a "lock up" could occur due to a software issue -- failing to read/write a loop variable from memory, for instance. It could also be caused by a desync between the two (CPU + RAM) state machines, where the CPU thinks the RAM is not in the idle state. That should be very hard to do with the interface you describe, since re/we are held until "done" is pulsed. Of course, if the CPU merely pulses re/we, then you could easily get into a state where the CPU is waiting for the RAM while the RAM isn't doing anything.

Likewise, the "done" pulse is also twice as long in terms of CPU cycles, and it isn't clear whether the CPU can deal with that: effectively, you are acknowledging an I/O operation that was never requested. If the CPU ignores the second CPU cycle of "done", you will be fine.



And yes, looking at the datasheet would help. You should be able to make the controller more efficient, if that matters.
 

e.g. if the ADDR needs to be there for 70 ns before you get data then you output the ADDR and wait for 2 16 MHz clocks (70/62.5 = 1.12) and then you can capture the read data. In this case there is a 1 clock cycle wait state that the CPU needs to adhere to to allow the RAM time to respond.
Ok, so instead of having two clocks, you have one, and explicitly wait for the time it would take the RAM to respond?

That works. Tested up to 50 MHz with no problems.
So, is that the proper way to do it - keep it all in one clock domain and insert delay states?

Precisely what worked:
Code:
//----------------------------------------------------------------------------//
module ffram_controller(
 input rst,
 input clk,
 
 output reg [14:0] fram_addr,
 inout [15:0] fram_io,
 output reg fram_oe,
 output reg fram_we,
 output reg fram_ce,
 
 input  [31:0] addr,
 input  [31:0] data2mem,
 output reg [31:0] data2cpu,
 input  re,
 input  we,
 output reg done
);
//----------------------------------------------------------------------------//
(* syn_encoding = "user" *) reg  [3:0] state;
(* syn_encoding = "user" *) reg  [3:0] next_state;
reg  [15:0] xlate_data;
wire [14:0] xlate_addr;
reg  [1:0] skip_cnt;
//----------------------------------------------------------------------------//
parameter [3:0] idle     =4'd0;

parameter [3:0] re_a     =4'd1;
parameter [3:0] re_b     =4'd2;
parameter [3:0] re_c     =4'd3;
parameter [3:0] re_d     =4'd4;

parameter [3:0] we_a     =4'd5;
parameter [3:0] we_b     =4'd6;
parameter [3:0] we_c     =4'd7;
parameter [3:0] we_0     =4'd9;
parameter [3:0] we_1     =4'd10;

parameter [3:0] skip     =4'd11;
parameter [3:0] wait_time=4'd12;
//----------------------------------------------------------------------------//
assign xlate_addr=addr[15:1];
//----------------------------------------------------------------------------//
 task do_idle(input [3:0] next);
 begin
  skip_cnt<=2'd2;
  state<=wait_time;
  next_state<=next;
 end
 endtask
//----------------------------------------------------------------------------//
always @(posedge clk)
begin
 if(rst)begin
  data2cpu<=0;
  state<=idle;
  done<=0;
  fram_addr<=15'h7FFF;
  xlate_data<=16'hFFFF;
  fram_oe<=1;
  fram_we<=1;
  fram_ce<=1;
 end else begin
  case(state)
   idle:begin
    done<=0;
    fram_oe<=1;
    fram_we<=1;
    fram_ce<=1;
    if(re)begin
     state<=re_a;
     fram_addr<=xlate_addr;
     fram_oe<=0;
     fram_we<=1;
     fram_ce<=1;
    end
    if(we)begin
     state<=we_0;
     fram_addr<=xlate_addr;
     xlate_data<=data2mem[15:0];
     fram_oe<=1;
     fram_we<=0;
     fram_ce<=1;
    end
   end

   re_a:begin
    fram_ce<=0;
    do_idle(re_b);
   end
   re_b:begin
    data2cpu[15:0]<=fram_io;
    fram_ce<=1;
    fram_oe<=1;
    fram_addr<=xlate_addr+1'd1;
    do_idle(re_c);
   end
   re_c:begin
    fram_ce<=0;
    fram_oe<=0;
    do_idle(re_d);
   end
   re_d:begin
    data2cpu[31:16]<=fram_io;
    fram_oe<=1;
    fram_we<=1;
    fram_ce<=1;
    done<=1;
    state<=skip;
   end
	

   we_0:begin
    fram_ce<=0;
    do_idle(we_a);
   end
   we_a:begin
    fram_ce<=1;
    fram_we<=1;
    do_idle(we_b);
   end
   we_b:begin
    xlate_data<=data2mem[31:16];
    fram_addr<=xlate_addr+1'd1;
    fram_we<=0;
    state<=we_1;
   end
   we_1:begin
    fram_ce<=0;
    do_idle(we_c);
   end
   we_c:begin
    done<=1;
    state<=skip;
    fram_oe<=1;
    fram_we<=1;
    fram_ce<=1;
   end
   skip:begin
    done<=0;
    state<=idle;
   end
	
   wait_time:begin
    skip_cnt<=skip_cnt-2'd1;
    if(skip_cnt==2'd0)state<=next_state;
   end
  endcase
 end
end
//----------------------------------------------------------------------------//
assign fram_io=we?xlate_data:16'bzzzzzzzzzzzzzzzz;
//----------------------------------------------------------------------------//
endmodule
//----------------------------------------------------------------------------//


"things start to lock up and act weird." is less informative than I would like.
It boils down to read/write errors, except in the case of trying to sync up the re and we signals (which caused an actual lockup).

I had some primitive RAM tests (run from the FPGA's internal memory) which showed errors on gross problems, but there was a "weird stuff happening" gap between the tests showing nothing and the code actually working.
As I improved the tests, that gap shrank to almost nothing, so the "weird stuff" was still read/write errors corrupting the code or data.
 

Ok, so instead of having two clocks, you have one, and explicitly wait for the time it would take the RAM to respond?

That works. Tested up to 50 MHz with no problems.
So, is that the proper way to do it - keep it all in one clock domain and insert delay states?

Yes, that is what you do with an asynchronous SRAM. That is why I wanted clarification of the SRAM type.

For a synchronous SRAM you would probably want to use a FIFO (the easiest solution) or some double-buffering scheme to run the SRAM interface on the SRAM clock. Or, if you can, run the SRAM off a clock generated by the FPGA that is some multiple of the CPU clock, and just use the same simple technique used on the asynchronous SRAM interface.


One change I would make in your code: add an input clock frequency parameter and calculate the ceiling of the RAM access time divided by the input clock period, so you don't have a hard-coded wait_time. This allows the number of wait cycles to be optimized for whatever CPU clock frequency drives this module.
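Something like this (a sketch; the parameter names are made up):

```verilog
// Hypothetical parameterization: derive the wait-state count from the
// clock frequency and the SRAM access time instead of hard-coding it.
module wait_calc #(
  parameter CLK_FREQ_HZ = 16_000_000,  // CPU clock driving the controller
  parameter T_ACC_NS    = 70           // SRAM access time from the datasheet
);
  // Clock period in ns, truncated down, so the cycle count rounds up
  // (slightly conservative, which is the safe direction).
  localparam CLK_PERIOD_NS = 1_000_000_000 / CLK_FREQ_HZ;           // 62 ns here
  // Ceiling of access time / clock period = wait states needed.
  localparam WAIT_CYCLES = (T_ACC_NS + CLK_PERIOD_NS - 1) / CLK_PERIOD_NS; // 2 here
endmodule
```

At 16 MHz this gives 2 wait cycles; at 50 MHz (20 ns period) it would give 4, with no code changes.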
 