Welcome to EDAboard.com
How to calculate maximum possible parallel data transfer rate for 2 FPGAs?

matrixofdynamism

Suppose I connect 2 FPGAs together with a parallel bus on the same PCB. They will use a simple protocol:
1. data bus - 16 bits (or may be bigger)
2. same clock
3. rd_req
4. wr_e

One FPGA shall put an address on the bus and assert rd_req, then the other shall assert wr_e when data is on the bus. The bus is not bidirectional.

Now how can I determine the maximum possible data rate for such a system? Basically, I just want to understand what factors would impact the calculation of maximum possible data rate as a function of given clock frequency. I assume that this has to do with things outside the FPGA like the PCB track length and capacitance but am not sure exactly.
 

It has to do with the entire timing path: the clock paths to both FPGAs, the clock-to-out delay of the FFs in the first FPGA through to its pins, the board delay to the input pins of the second FPGA, and the input setup time of the FFs in the second FPGA.

This is all basic timing arc analysis of a synchronous system. Perhaps you need to brush up on that as it's the "hardware" part of digital design with FPGAs.
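
To make the timing arc concrete, here is a rough Python sketch of the setup-time budget. The function names and all the delay numbers are made up for illustration (they are not from any datasheet), and it assumes clock skew and margin simply subtract from the available period. It does not check hold timing.

```python
def max_clock_mhz(t_co_ns, t_board_ns, t_su_ns, t_skew_ns=0.0, t_margin_ns=0.0):
    """Highest clock frequency (MHz) that still meets setup at the receiver.

    Minimum period = clock-to-out of the sending FPGA
                   + board (trace) delay
                   + input setup time of the receiving FPGA
                   + worst-case clock skew between the two FPGAs (assumed to hurt)
                   + any extra margin for uncertainty.
    """
    t_min_ns = t_co_ns + t_board_ns + t_su_ns + t_skew_ns + t_margin_ns
    return 1000.0 / t_min_ns


def peak_rate_mbytes_per_s(f_mhz, bus_bits=16):
    """Peak transfer rate if one word moves on every clock cycle."""
    return f_mhz * bus_bits / 8.0


# Illustrative numbers only: 5 ns clock-to-out, 1 ns board delay,
# 2 ns setup, 1 ns skew, 1 ns margin -> 10 ns minimum period.
f = max_clock_mhz(5.0, 1.0, 2.0, t_skew_ns=1.0, t_margin_ns=1.0)
print(f, "MHz ->", peak_rate_mbytes_per_s(f), "MByte/s peak")
```

Replace the placeholder delays with the worst-case figures from your own device timing reports and board layout.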

Coming from a board design background with SSI components (7400-series ICs), I had to do this type of analysis on a regular basis (i.e. every day).
 

Assuming you can queue reads, it should be based on the IO and how complex you are going to get with it. If a read generates the next address, and there are multiple cycles of latency, then the rate would be much lower.

--edit: By this I mean that FPGAs have more and more advanced IO features to solve some of the problems with high-speed IO.
 

The maximum rate will have more to do with other decisions that you've already apparently made rather than PCB effects. Some things that you appear to have decided that will hinder high throughput:
- A parallel bus rather than a SERDES interface
- Not using a source synchronous interface to transmit the clock to the device
- Random memory access rather than blocks of data (but maybe this is something you have to provide)

Given the 18 signals between the two FPGAs that you mention (not counting the clock), if they were used in a SERDES interface it would be relatively straightforward to achieve multi-GByte/s performance. Using a 'simple' parallel bus you'll likely be limited to low hundreds of MByte/s, and that will be limited primarily by clock-to-output delays, setup times and the uncertainties in both. Next up would probably be clock skew, depending on how these two FPGAs get their clock from whatever its source is. After that the PCB delays would likely start to come into play.

Since your parallel interface appears to be a homegrown protocol, you'll probably be losing throughput between the clock cycle when the master asserts rd_req and some other clock cycle when wr_e is asserted to return the data. If you've thought ahead, the master should be able to keep rd_req asserted to make multiple requests before the slave returns the first data. There would be an upper limit of course on how many read requests could be outstanding, but presumably you know how many clock cycles of latency there could be which means you can design both ends to be able to handle the latency and get 100% utilization. If you don't design this in, then your bus utilization drops immediately to 50% for a single clock cycle of latency, 33% for two, 25% for three, etc.
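
The utilization figures above can be reproduced with a small Python sketch. This is a simplified model, not a description of any particular design: it assumes each read occupies one request cycle plus a fixed number of latency cycles, and that the master may have at most `max_outstanding` requests in flight (both names are mine).

```python
def bus_utilization(latency_cycles, max_outstanding=1):
    """Fraction of clock cycles that carry returned data.

    One read round trip takes 1 request cycle + latency_cycles.
    With max_outstanding requests in flight, that many words can be
    returned per round trip, capped at one word per cycle (100%).
    """
    round_trip = 1 + latency_cycles
    return min(1.0, max_outstanding / round_trip)


# No pipelining: one request at a time.
for latency in (1, 2, 3):
    print(latency, "cycle(s) of latency:", bus_utilization(latency))

# Pipelined: enough outstanding requests to cover the latency.
print("3 cycles, 4 outstanding:", bus_utilization(3, max_outstanding=4))
```

In this model the number of outstanding requests needed for full utilization is the latency in cycles plus one.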

Kevin Jennings