Welcome to EDAboard.com

[SOLVED] Continuous sampling 3GSPS ADC and Altera FPGA interface

Status
Not open for further replies.

pcbeng25

I'm currently writing an interface for a 2.5GSPS 8-bit ADC connecting to an Altera Stratix III FPGA via 32 LVDS lines. The data rate is 625MHz on the input to the FPGA, and it is deserialized 1:4 for an internal data rate of 156.25MHz. I am currently feeding this data into a FIFO. My dilemma is that I need to process this data at 187.5MHz inside the FPGA and it takes 5 clocks for every read out of the FIFO to get a DPRAM updated (read, increment, write back). I am running into an overflow condition on the FIFO as the data is continuous, meaning it is constantly writing into the FIFO. I tried to establish a number of samples required for reliable data processing and then just write that many samples into the FIFO, but my PHB seems to think we need to keep our sampling going and we shouldn't have to stop. I'm wondering if there are any other options so I am not losing data. Currently I have already reduced the data to processing one of every 4 samples coming in, but without stopping the writes to the FIFO, I cannot process the data fast enough to keep the FIFO from overflowing, no matter how big I make it.

I've searched and searched for techniques on how people do this, but haven't come up with anything useful. If anyone can suggest some techniques for continuous sampling data processing in an FPGA, I would appreciate it. Thanks in advance.
 

It's not clear what you are doing with the DPRAM. More generally, you should first clarify what the intended data sink is and how it will sustain a continuous data rate of 128 x 187.5 Mbps. You should also explain which signal processing operations you want to perform.
 

Well, what can I say without upsetting the lawyers... Basically, I am not allowed to say too much. I am using each sample from the ADC as an address into a DPRAM, then incrementing the contents at that address to create a histogram.

The FIFO only gets 32 bits written into it (one of the 4 samples on the output of the LVDS block). I am reading out 64 bits from the FIFO. There are then 4 DPRAM blocks (one for each 8-bit sample), so for every FIFO read, I must write to the DPRAM 2x. Each update to the DPRAM takes 5 clock cycles @ 187.5MHz and since I am reading out two samples, that's 10 clock cycles between reads from the FIFO. That 10 cycles between reads is what is causing the FIFO to overflow as the bathtub is not draining fast enough.

There will be a separate state machine that will sample the histogram to do the actual processing, but that is independent of the FIFO. My problem is just getting the data counted and stored in a reasonable amount of time and not overflowing the FIFO.
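
To put numbers on the bathtub analogy, here is a rough Python sketch of the fill and drain rates using the figures quoted above. It assumes one 32-bit word is written to the FIFO on every 156.25MHz clock, which the post implies but does not state outright. All constant and function names are made up for illustration, and the last rate is a hypothetical "what if one word could be consumed per clock" case, not something measured.

```python
# Figures quoted in the thread; writing on every clock is an assumption.
SAMPLE_BITS = 8
WRITE_CLK_HZ = 156.25e6   # FIFO write clock after 1:4 deserialization
WRITE_BITS = 32           # bits written per write clock (1 of the 4 words)
READ_CLK_HZ = 187.5e6     # processing clock
READ_BITS = 64            # bits per FIFO read
CYCLES_PER_READ = 10      # two 5-cycle DPRAM updates per read


def samples_per_second(clk_hz, bits_per_transfer, cycles_per_transfer=1):
    """Sustained sample rate of a port that moves one transfer every N cycles."""
    samples_per_transfer = bits_per_transfer // SAMPLE_BITS
    return clk_hz * samples_per_transfer / cycles_per_transfer


fill_rate = samples_per_second(WRITE_CLK_HZ, WRITE_BITS)
drain_rate = samples_per_second(READ_CLK_HZ, READ_BITS, CYCLES_PER_READ)
net_growth = fill_rate - drain_rate  # samples/s piling up in the FIFO

# Hypothetical: a pipeline that consumes one 32-bit word on every read clock.
pipelined_drain_rate = samples_per_second(READ_CLK_HZ, 32)


def seconds_until_full(depth_words, word_bits=WRITE_BITS):
    """Time for an empty FIFO of the given depth to overflow at net_growth."""
    capacity_samples = depth_words * (word_bits // SAMPLE_BITS)
    return capacity_samples / net_growth


print(f"fill  : {fill_rate / 1e6:.1f} Msamples/s")
print(f"drain : {drain_rate / 1e6:.1f} Msamples/s")
print(f"a 4096-word FIFO overflows after {seconds_until_full(4096) * 1e6:.1f} us")
```

Under these assumptions the FIFO gains about 475 million samples every second, so making it deeper only delays the overflow. The drain side has to accept roughly one word per clock before any FIFO depth will do.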
 

What I describe below is for one DPRAM block. It sounds like you will need 4 or 8 identical blocks.

You need a pipeline. You must start the next operation before the previous one has finished. Since the same sample value can occur again before the processing of a previous occurrence is finished, you need multiple locations in the DPRAM for each sample value. A location cannot be reused until the previous operation on that location is guaranteed to be finished. If a complete turn takes 5 cycles, you need at least 5 DPRAM locations for each sample value.

This means that if you get the same sample value 5 times in a row, 5 different DPRAM locations will be incremented.

You will need some post-processing to get the sum of the locations corresponding to each sample value.
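
The scheme above can be tried out in a small behavioral model. This is only a Python sketch under stated assumptions, not RTL: one sample arrives per cycle, a read sees the contents as they were when the read was issued, the incremented value becomes visible `latency` cycles later, and the location for each sample value is picked round-robin by cycle number. The names `pipelined_histogram` and `merge_banks` are invented for the example.

```python
from collections import deque


def pipelined_histogram(samples, latency, banks, num_values=256):
    """Model a read-increment-write pipeline with `banks` locations per value."""
    mem = [[0] * num_values for _ in range(banks)]
    in_flight = deque()  # entries: (cycle the write is visible, bank, value, count)
    for cycle, value in enumerate(samples):
        # Commit every write-back that has completed before this cycle's read.
        while in_flight and in_flight[0][0] <= cycle:
            _, b, v, new_count = in_flight.popleft()
            mem[b][v] = new_count
        bank = cycle % banks  # rotate so a location is reused every `banks` cycles
        in_flight.append((cycle + latency, bank, value, mem[bank][value] + 1))
    # Drain the pipeline once the input stops.
    for _, b, v, new_count in in_flight:
        mem[b][v] = new_count
    return mem


def merge_banks(mem):
    """Post-processing: sum the locations that belong to each sample value."""
    return [sum(column) for column in zip(*mem)]


samples = [7] * 10 + [3] * 4          # worst case: long runs of the same value
safe = merge_banks(pipelined_histogram(samples, latency=5, banks=5))
lossy = merge_banks(pipelined_histogram(samples, latency=5, banks=1))
print("5 locations per value:", safe[7], safe[3])
print("1 location per value :", lossy[7], lossy[3])
```

With as many locations as pipeline stages, every count survives a run of identical samples. With a single location, back-to-back reads all see the same stale value and counts are lost.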
 
I see that the histogram unit is the data sink, so the high data rate effectively stops here. Even without giving more details, you have clarified the basic problem. The good thing is that you can parallelize the data path further, using more resources, if that is necessary to keep up with the data rate. Of course, the processed histogram data has to be assembled again later, as std_match mentioned.
 
The idea is:
you do not have to latch the result of the addition,
you can run the read - increment - write operation on a doubled clock,
300MHz is below the limits of today's FPGAs;
as an example: [a simple simulation of a netlist shows it might work ...]
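Code:
/.../
  input            clk, clk2x,  // clk2x: PLL-generated, phase-aligned to clk
  input      [7:0] data,
  input      [7:0] h_addr,
  output reg [7:0] hist,
/.../


// 256 histogram bins, 8 bits each
reg [7:0] ram[255:0];

// Register the incoming sample in the clk domain
reg [7:0] din;
 always @(posedge clk)
   din <= data;

// Toggles on every clk2x edge, so it is high once per clk period.
// The initial value keeps it from staying X in simulation.
reg wr = 1'b0;
 always @(posedge clk2x)
   wr <= !wr;

// Read, increment and write back within a single clk2x cycle
 always @(posedge clk2x)
    if ( wr ) ram[din] <= ram[din] + 1'b1;

// Histogram readout port
 always @(posedge clk)
    hist <= ram[h_addr];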

----
J.A
 
Sorry I haven't replied...for some reason I have not received any emails that this thread was updated. I will look over the replies and see what I can do. I appreciate the suggestions. Now I just need to map this out a bit and see where it gets me. I have been trying to figure out how to pipeline this on and off for a week. It looks like std_match may have solved that in my mind. I'll post back if I have any more trouble.
 

OK, so it seems that if I take 32 bits out of the FIFO instead of 64 and use a 5-stage pipeline, I will require 20 DPRAMs to store the data. The post-processing will have to sum the locations in those 20 RAMs into one complete histogram. My reasoning:

one 32-bit read from the FIFO actually contains 4 samples, Da, Db, Dc and Dd.

Each one of those will go into a 5 stage pipeline and write to one port of a DPRAM, so that is 5 DPRAMs for Da, 5 for Db, 5 for Dc, and 5 for Dd. The DPRAMs would be 256x16. I have thought about adding an offset to each 256 addresses to make all 5 DPRAMs implement into one DPRAM, but that requires using both ports for the RAM. See next paragraph...

In that case, one port is reading while the other is writing, but then I don't have a port free for the post-processing loop, which will be reading the locations and adding them all together. Also, how would that account for two samples which are the same? If the value in memory is 1, the read port will read out 1 (read-during-write yields the old contents) while the write port is writing a 2, so a 2 will get written again and I will miss a count.

I'm trying to get the architecture down before I start coding again. If I am missing something please chime in. I would like to not have to use 20 DPRAMs, but I just don't see how to avoid it at this point. Any suggestions to minimize the # of RAM blocks would be appreciated.
 

I think you only need 3 locations per sample value to do the pipeline with normal synchronous RAMs, if they have one write port and one read port.

Cycle 1: Start the read operation
Cycle 2: Clock out the data. Combinatorial logic will prepare the incremented value for writing in the next clock cycle
Cycle 3: Start the write operation

After these 3 cycles, you can use the same location again.

No count will be missed even if the same sample value appears several times in a row. Different RAM locations will be incremented.

What is the size of your DPRAM blocks?
How many bits do you need for the histogram accumulation?

If your DPRAMs are big enough, you can have all 3 locations in the same DPRAM for each byte of the data you clock out from the FIFO. Per byte from the FIFO you need a DPRAM size of 256 * 3 * (histogram word size) bits.

To be able to do the summing and post-processing, you can increase the DPRAM clock to create spare cycles, or duplicate all the writes to a second bank of DPRAMs.
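
As a rough illustration of keeping all 3 locations in one DPRAM, the Python sketch below uses a flat address of the form bank * 256 + sample and folds the three regions back into one histogram afterwards. The address layout, the round-robin bank choice, and the function names are assumptions made for the example; the 16-bit word is the 256x16 figure mentioned earlier in the thread.

```python
NUM_VALUES = 256   # one bin per 8-bit sample value
BANKS = 3          # locations per sample value, one per pipeline stage
WORD_BITS = 16     # histogram word size (256x16 figure from the thread)


def dpram_address(bank, sample):
    """Flat DPRAM address: each bank occupies its own 256-word region."""
    return bank * NUM_VALUES + sample


def dpram_bits(word_bits=WORD_BITS):
    """Memory needed per byte lane: 256 * 3 * (histogram word size)."""
    return NUM_VALUES * BANKS * word_bits


def accumulate(samples):
    """Count samples, rotating through the banks one cycle at a time."""
    ram = [0] * (NUM_VALUES * BANKS)
    for cycle, sample in enumerate(samples):
        ram[dpram_address(cycle % BANKS, sample)] += 1
    return ram


def fold_histogram(ram):
    """Post-processing: add the 3 locations that belong to each sample value."""
    return [
        sum(ram[dpram_address(bank, value)] for bank in range(BANKS))
        for value in range(NUM_VALUES)
    ]


ram = accumulate([9, 9, 9, 9, 200])
hist = fold_histogram(ram)
print("bits per lane:", dpram_bits())
print("count for value 9:", hist[9])
```

Four identical samples in a row land in three different regions of the RAM, and only the folded result shows the full count of 4.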
 
The DPRAM right now is 16 bits wide...may make it wider depending on how many samples we require for an accurate representation of our data...still yet to be determined, but that is an easy task. I like the idea of duplicating the writes to a second set of DPRAMs and using both ports (one for read, one for write) to create the sub-histograms. Not sure why I didn't think of that as I had done that in a previous job a few years ago...I believe the last part of that statement is why.
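
On the word-width question, a quick worst-case estimate can be sketched in Python. It assumes every sample in a lane has the same value, one sample per lane per 187.5MHz clock, and 3 locations per value sharing the hits evenly. These are illustrative assumptions and the function names are invented; real data will spread over many bins and last much longer.

```python
LANE_RATE_HZ = 187.5e6   # assumed: one sample per lane per processing clock
BANKS = 3                # locations per sample value


def word_bits_for(total_samples, banks=BANKS):
    """Bits needed so one location can hold its share of `total_samples` hits."""
    worst_count = -(-total_samples // banks)   # ceiling division
    return max(1, worst_count.bit_length())


def seconds_before_overflow(word_bits, lane_rate_hz=LANE_RATE_HZ, banks=BANKS):
    """Worst-case run time before a single location wraps around."""
    max_count = 2 ** word_bits - 1
    return max_count * banks / lane_rate_hz


print("bits for 1,000,000 samples per lane:", word_bits_for(1_000_000))
print(f"16-bit words last at least {seconds_before_overflow(16) * 1e3:.2f} ms")
```

Under these pessimistic assumptions a 16-bit location fills in roughly a millisecond, so the word size follows directly from how long the histogram must accumulate before it is read out and cleared.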

You are also correct that the counts won't be missed, because they are indexing a different part of memory...something that I came up with when I started mapping it out on paper.

Also, I was able to make my 5 clocks become 3 clocks. I no longer register the addition I was doing; it now happens combinatorially. So indeed, it is only 3 clocks: read, gather data (the increment happens here), write.

Very helpful stuff.
 

To update: I did get the data into RAM, and I am now correctly creating a histogram without missing samples and without overflowing the FIFO. In fact, I have to insert pauses when reading the data out of the FIFO to avoid underflow. Now I can move on to my processing algorithms. I want to thank all who contributed to this task and helped me clarify the concepts in my head.
 
