# Formula for Depth calculation for any FIFO (syncronous or asyncronous )

Status
Not open for further replies.

#### dll_fpga

##### Full Member level 3
I think depth of any FIFO can be calculated as

Syncronous FIFO or
Asyncronous FIFO: = Bwr/Brd where Bwr is the peak write side data rate or bandwdth .Brd is the peak read side data rate.

Anyone with different opinions please let me know.

Look at a 1Mbps data stream and a processor capable of processing data at 100Mbps. It would appear as if no fifo is required. However, what if the 1Mbps is coming from a 10Gbe interface as a 1Mbit burst once per second. The data is written in 100us. During this time, 10kb of data is processed. This means a fifo would need to buffer 990kb of data, which is nearly the entire size of the burst.

using your formula, 10E9 / 100E6 = 100, which is not correct because it doesn't include the size of the burst. If this system expects to buffer up 1Gb of data over the course of minutes, then burst it out, the fifo would need to be much larger.

This is a simple example, other fifo cases may involve both variable write and read speeds. For example, if the input is compressed data and must be decompressed to be processed.

The only real requirement is that the system be able to process data at a higher average rate then the average rate of the input. If this is not satisfied, then the fifo would grow over time and eventually overflow.

• dll_fpga

### dll_fpga

Points: 2
I think depth of any FIFO can be calculated as

Syncronous FIFO or
Asyncronous FIFO: = Bwr/Brd where Bwr is the peak write side data rate or bandwdth .Brd is the peak read side data rate.

Anyone with different opinions please let me know.

The implementation of a FIFO can require additional depth due to latency delays. A simple example is a dual clock fifo. In order to function, signals need to be transferred between clock domains. This transfer will take clock cycles to accomplish. During that latency time you will need to have additional storage (i.e. fifo depth) without which the fifo will overflow.

You can observe a simple test case of this by
- Creating a 1 deep dual clock fifo
- Clock both sides with the same frequency clock, perhaps phase shifting the one if you'd like

Depending on the implementation, this latency might be needed even for a single clock fifo. In any case, to correct your equation you'd need to add round trip latency delay.

Another example where your equations fails is if Bwr/Brd is not an integer. Simple case, Bwr = 1.1 MB/sec; Brd = 1 MB/sec so your fifo depth would be 1.1. Or Bwr = 1; Brd = 2 so your ratio is 0.5. Fifos have depths that are integers. This one is simple to correct: Depth = ceil(Bwr/Brd)

As permute already pointed out, if Bwr > Brd, then you need to know how long such this condition is maintained (it can't be 100% of the time obviously).

In short your equation is not correct under any of the following conditions:
- Bwr > Brd (see permute's posting for example)
- Bwr < Brd (this post, can't have fractional fifo depth)
- Bwr = Brd (this post, need to consider latency in dual clock fifos)

Kevin Jennings

• dll_fpga

### dll_fpga

Points: 2
The implementation of a FIFO can require additional depth due to latency delays. A simple example is a dual clock fifo. In order to function, signals need to be transferred between clock domains. This transfer will take clock cycles to accomplish. During that latency time you will need to have additional storage (i.e. fifo depth) without which the fifo will overflow.

You can observe a simple test case of this by
- Creating a 1 deep dual clock fifo
- Clock both sides with the same frequency clock, perhaps phase shifting the one if you'd like

Depending on the implementation, this latency might be needed even for a single clock fifo. In any case, to correct your equation you'd need to add round trip latency delay.

Another example where your equations fails is if Bwr/Brd is not an integer. Simple case, Bwr = 1.1 MB/sec; Brd = 1 MB/sec so your fifo depth would be 1.1. Or Bwr = 1; Brd = 2 so your ratio is 0.5. Fifos have depths that are integers. This one is simple to correct: Depth = ceil(Bwr/Brd)

As permute already pointed out, if Bwr > Brd, then you need to know how long such this condition is maintained (it can't be 100% of the time obviously).

In short your equation is not correct under any of the following conditions:
- Bwr > Brd (see permute's posting for example)
- Bwr < Brd (this post, can't have fractional fifo depth)
- Bwr = Brd (this post, need to consider latency in dual clock fifos)

Kevin Jennings

Hello Permute and KJ,
Thanks for the post.

Permute,
I think instead of peak bandwidth we need to consider the peak throughput(actual amount of DATA that is written/read in a given time frame). .

>>Look at a 1Mbps data stream and a processor capable of processing data at 100Mbps. It would appear as if no FIFO is required. However, what if the 1Mbps is coming from a 10Gbe interface as a 1Mbit burst once per second. The data is written in 100us. During this time, 10kb of data is processed. This means a FIFO would need to buffer 990kb of data, which is nearly the entire size of the burst.

My assumption is that the entire DATA chunk written will be processed during the read
In the above example since the data throughput of the write side is lesser than read side(1Mpbs and 100 Mbps) ,i think, a FIFO is not needed.

Similarly for variable write and read speeds we need to compute maximum write/read that can happen in a given time frame.
FIFO size = max write -max read(within a given time frame)

K-J ,
I think a correction factor of +-2 can be considered for the synchronization delay.As K-J pointed out any calculation results has to be rounded to the next highest integer value.

Thus combining both we get FIFO size = calculated size +2 .(considering worst case)

Another thing to be noted is "do not write to a full FIFO and do not read from an empty FIFO".so i think Bwr > Brd is the worst case scenario.(to avoid overflow)

so instead of Bwr/Brd ,the formul has to be modified to wr(total writes) - rd(total read)

Last edited:

Hello Permute and KJ,
Thanks for the post.

Permute,
I think instead of peak bandwidth we need to consider the peak throughput(actual amount of DATA that is written/read in a given time frame). .

>>Look at a 1Mbps data stream and a processor capable of processing data at 100Mbps. It would appear as if no FIFO is required. However, what if the 1Mbps is coming from a 10Gbe interface as a 1Mbit burst once per second. The data is written in 100us. During this time, 10kb of data is processed. This means a FIFO would need to buffer 990kb of data, which is nearly the entire size of the burst.

My assumption is that the entire DATA chunk written will be processed during the read
In the above example since the data throughput of the write side is lesser than read side(1Mpbs and 100 Mbps) ,i think, a FIFO is not needed.

Similarly for variable write and read speeds we need to compute maximum write/read that can happen in a given time frame.
FIFO size = max write -max read(within a given time frame)

K-J ,
I think a correction factor of +-2 can be considered for the synchronization delay.As K-J pointed out any calculation results has to be rounded to the next highest integer value.

Thus combining both we get FIFO size = calculated size +2 .(considering worst case)

Another thing to be noted is "do not write to a full FIFO and do not read from an empty FIFO".so i think Bwr > Brd is the worst case scenario.(to avoid overflow)

so instead of Bwr/Brd ,the formul has to be modified to wr(total writes) - rd(total read)

Assumption made: word size is same on read side and write side.
eg: wr 100Mhz and rd 10Mhz
FIFO size is (100 -10 )*10E6
Assumption made: word size is same on read side and write side.
Then again the question comes write and read number is to be computed in which time frame? Also i think we can use a FIFO full which can crate a back pressure in the write side so that FIFO size is not very large.So in this case how we deal with FIFO depth?

Last edited:

To be practical, the average write bandwidth (in bits per second) must be less or equal to the average write bandwidth. As I've shown, the actual fifo size is not based on the average rates and can be difficult to compute in some cases. Your example of 100Mbps in, 10Mbps out is again arbitrary. All you've shown is that the fifo would grow by 90Mb per second (or 905GB/day).

any flow-control will reduce the average write bandwidth. In this case, the fifo size is based on the round-trip decision time. for example, an ethernet system would need to allow for several milliseconds of delay if maximum performance is needed. This may be important when there is an output rate requirement -- eg real-time video must have data available for the next frame in a timely manner. In this case, the fifo size would also be based on the variation in latency over the network.

I guess my point is that any general formula would not be as simple as (a-b)*c or such (where a,b,c are numbers). The fifo size is generally: max over w,r,t of sum( w(t) - r(t) ) subject to w(t), r(t) valid pairing of write/read behavior for the system. This is very general, but doesn't prescribe how to choose or determine what pairs of w(t), r(t) are valid, nor how to maximize these functions. However, many practical cases allow easy fifo sizing based on knowledge of the input/output behavior.

--edit, also, for my original example a fifo would be required. I've provided the size of such a fifo as well as an explanation in the original post.

• dll_fpga

### dll_fpga

Points: 2
permute,
From your original example, i could see that the, only requirement is to backup 990kb of data in a sec.So Depth is not only the factor,that we need to consider.
That means depth*width = 990kb.
I think any solution to the above equation can correctly implement the requirement. Please let me know your views

mostly yes, and that is why I left the format in terms of bits. It is fairly common to have width/rate changing fifos.

There may be some implementation reasons to avoid some width changes. For example, it may be desirable to place the first word of a packet in an aligned location. Doing this would artificially increase the write bandwidth by a small amount. Likewise, the receiver may need additional cycles within a FSM for each received packet. This would lower the read bandwidth by a small amount unless the state machine were changed.

In my original example, the only reason why buffering 990kb was enough was that the system only received 1Mb in that second and was fully capable of processing all of it. If the system receives more data than it can process on a continual basis, the fifo would overflow eventually.

V
Points: 2
For complex cases,as indicated by permute,we need to consider the burst size and the effect of subsequent burts.
FIFO size = sigma [i=1 to n burst] ( w(t) - r(t) )

For simple systems [subject to below mentioned assumptions]
FIFO size = Time to read / Time to write. ---(1)
1/T = f ---(2)
Multiplying by both side by data width ---(3)
combining (1) , (2) and (3)
FIFO size = write bandwidth /read bandwidth

[Assumptions]

1. Read and write side width are same.
2. Subsequent bursts start only after current burst is processed.
3. Read happens once FIFO is full and thereafter the stored data is flushed on to read domain.
4. Back pressure mechanism exists on the write domain to prevent updates ,once the FIFO is full

As already pointed out, FIFO size is subject to a correction of 2 to account for synchronization delay.

Permute,
If we consider the effect of subsequent burst,we saw that the FIFO size exponentially increases.
How can this be accounted. Can you illustrate with some details/examples?

mostly yes, and that is why I left the format in terms of bits. It is fairly common to have width/rate changing fifos.

There may be some implementation reasons to avoid some width changes. For example, it may be desirable to place the first word of a packet in an aligned location. Doing this would artificially increase the write bandwidth by a small amount. Likewise, the receiver may need additional cycles within a FSM for each received packet. This would lower the read bandwidth by a small amount unless the state machine were changed.

In my original example, the only reason why buffering 990kb was enough was that the system only received 1Mb in that second and was fully capable of processing all of it. If the system receives more data than it can process on a continual basis, the fifo would overflow eventually.

Last edited:

fundamentally, Bwr/Brd, or any similar ratio, will never be correct. The fifo has a size in bits (or bytes, words, etc...). time/time, freq/freq, etc... will all be unitless.

My intent here is to show that there isn't a useful and simple formula for every fifo sizing problem. In many practical cases, it isn't difficult to size the fifo. Beyond this, sizing based on an upper-bound (eg, 1Mbit vs 990kbit) is often not significantly different than the optimum sizing. More complex fifo sizing problem might be closely upper-bounded by convex optimization problems. In other cases, it may be a combinatorial optimization problem. Though more often a loose upper bound found by simplified assumptions is used instead.

ya.Whatever computed above is unitless,this has to be mutiplied by the data width to get the actual size in bit.
FIFO size = write bandwidth /read bandwidth maybe be true only if write side frequency is very high as compared to read side frequency.
ie:By the time first clock edge of read edge is reached FIFO will be full.

fundamentally, Bwr/Brd, or any similar ratio, will never be correct. The fifo has a size in bits (or bytes, words, etc...). time/time, freq/freq, etc... will all be unitless.

My intent here is to show that there isn't a useful and simple formula for every fifo sizing problem. In many practical cases, it isn't difficult to size the fifo. Beyond this, sizing based on an upper-bound (eg, 1Mbit vs 990kbit) is often not significantly different than the optimum sizing. More complex fifo sizing problem might be closely upper-bounded by convex optimization problems. In other cases, it may be a combinatorial optimization problem. Though more often a loose upper bound found by simplified assumptions is used instead.