FIFO depth required for async FIFO

sun_ray · May 2, 2012

What is the minimum depth required to transfer data through an asynchronous FIFO that has full and empty signals? And what is the minimum depth required for an asynchronous FIFO that has neither a full nor an empty signal?

alaparthi · May 21, 2012

If we use full/empty signals, the minimum safe depth to avoid unnecessary stalling in an async FIFO is 8.
Otherwise, the minimum depth can go as low as 2.

tdminion · May 21, 2012

Your question needs to be more specific. For instance, if you have full/empty flags, your FIFO depth can be '1'. It will stall the writes, but it will still 'work.' With NO full/empty flags, it depends on the clock frequencies and how often you can push and pop the FIFO. Also, is the read data the same width as the write data?

sun_ray · May 22, 2012

alaparthi said:
If we use full/empty signals, the minimum safe depth to avoid unnecessary stalling in an async FIFO is 8.
Otherwise, the minimum depth can go as low as 2.

alaparthi

Can you please substantiate how the minimum depth can go as low as 2 when we do not have full and empty signals on the FIFO?


tdminion

I have the following four questions in general:

(a) What is the minimum depth required to transfer data through an asynchronous FIFO that has full and empty signals, when the read data has the same width as the write data?
(b) What is the minimum depth required for an asynchronous FIFO that has neither a full nor an empty signal, when the read data has the same width as the write data?
(c) What is the minimum depth required for an asynchronous FIFO that has full and empty signals, when the read data does not have the same width as the write data?
(d) What is the minimum depth required for an asynchronous FIFO that has neither a full nor an empty signal, when the read data does not have the same width as the write data?

How do you account for the read data not having the same width as the write data when calculating the depth of a FIFO that does not have full and empty signals?

tdminion · May 22, 2012

Sun_ray,

There are no shortcuts for FIFOs. First you have to understand the architecture of a simple FIFO that passes Gray-encoded pointers across the async boundaries. Then you can draw timing diagrams for the read/write pointers. FIFOs are very simple, but you have to understand the architecture and then draw timing diagrams. Generalized depths are irrelevant unless you have a dumb professor or are being interviewed by an idiot at Broadcom. It makes no sense to ask what the minimum depth is. In fact, you can design an async FIFO with full/empty flags that has a minimum depth of 3 (not 8 as the engineer in Irvine previously suggested). Simply run the read clock at a much greater frequency than the write clock so that the read path latency approaches ~zero write cycles. This is not practical, but it would work.

Do yourself a BIG favor: first learn the architecture, then draw timing diagrams for the read and write pointers. This is the only way it will make sense in the long run. Give it a try!
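
As a starting point for those timing diagrams, here is a minimal Python sketch of the Gray encoding such pointers use. It is an illustration only, not HDL and not taken from any particular FIFO design; the function names and the 3-bit pointer width are assumptions made for the example.

```python
def bin_to_gray(n: int) -> int:
    """Convert a binary pointer value to its Gray-code equivalent."""
    return n ^ (n >> 1)


def gray_to_bin(g: int) -> int:
    """Convert a Gray-code value back to binary by XOR-ing all right shifts."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_changed(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


# Assumed 3-bit pointer, chosen only to keep the table short.
WIDTH = 3
codes = [bin_to_gray(i) for i in range(1 << WIDTH)]

for i, code in enumerate(codes):
    nxt = codes[(i + 1) % len(codes)]
    # Each increment, including the wrap-around, flips exactly one bit.
    print(f"{i}: {code:0{WIDTH}b} -> next differs in {bits_changed(code, nxt)} bit")
```

Because only one bit changes per increment, a pointer sampled in the other clock domain while it is changing resolves to either the old or the new value, which is why the pointers are Gray encoded before they cross the boundary.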

sun_ray · May 23, 2012

I understand the architecture of the FIFO, including the Gray-encoded pointers across the async boundaries that you mention. I even drew timing diagrams long ago.

It is good that you suggested the right homework for me. Can you please suggest more general basic points and concepts to think about when deciding the FIFO depth for an async design?

What is the "bandwidth should be the same" concept in FIFO design? Can there be an async FIFO without full and empty signals? All of the async FIFOs I have seen have them.

I have gone through many threads on Edaboard and other VLSI forums, and also some papers on the internet, on 'FIFO depth calculation' and 'async FIFO design', but I never found a good detailed description of this topic. Can you please share any documents or good links you have on it?

Xilinx used to have some good open pages on FIFO architectures, but these days those pages are no longer accessible. Can you please provide some documents or links on FIFO architectures, if possible?

Regards,
sun_ray

aparnass · Jun 4, 2012

Hi Sun_ray
the issue of Asynchronous FIFO depth is dependent on the requirements of the surrounding logic.
typically, under the following requirements the depth of a FIFO can be determined. the requirements for a normal design may be:
1. the FIFO can read may be stalled by downstream logic and get full
2. once the FIFO is flowing, there should be no bubbles (cycles where the FIFO gets empty)
3. the FIFO should not get full unless it is stalled.

under those requirements, the FIFO depth is to be calculated for the worst case scenario: the FIFO is constantly written to, is stalled by downstream logic and gets full, and then you remove the stall and start reading as fast as possible.
in this condition you do not want the FIFO to become underrun, so you need enough entries to accommodate for the round trip from the release of the stalling to the arrival of the first data written to the FIFO after the stall on the right side is removed.

when you remove the stalling, the read pointer of the FIFO changes and the FIFO is no longer full. this information has to be converted to Gray code and driven through a 3-flop synchronizer to the write side, where it changes the value of the full signal.
after that, new data will be written to the FIFO. the new data changes the write pointer and, again, this has to be synchronized back to the read side.
if the information about the new write reaches the read side before the FIFO gets empty, there will not be any unnecessary bubbles and the data rate will re-stabilize at the intended rate.
taking into account the synchronizer's inherent extra cycle of delay, you can calculate the worst-case round trip delay, as measured in read-side clocks, to be:

4*read_cycle + 4*write_cycle*frequency_ratio

so if you are synchronizing between equal frequency asynchronous domains, you will need 8 entries, this would also apply in the case you have a faster to slower synchronization.
if you synchronize a slow domain to a fast domain, you will need some more, to get back to the expected number of bubbles on the read side.
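
A minimal Python sketch of the round-trip expression above. The thread does not define frequency_ratio at this point; the sketch assumes it is read frequency divided by write frequency, so that write-side cycles are expressed in read-side clocks. The function names and the default of 4 cycles per crossing are illustrative assumptions.

```python
import math


def round_trip_read_cycles(f_read_hz, f_write_hz, sync_cycles=4):
    """Round trip in read-side clocks: 4*read_cycle + 4*write_cycle*frequency_ratio.

    Assumption: frequency_ratio = f_read / f_write, i.e. how many read clocks
    fit into one write clock.
    """
    frequency_ratio = f_read_hz / f_write_hz
    read_side = sync_cycles                     # crossing measured in read clocks
    write_side = sync_cycles * frequency_ratio  # write clocks converted to read clocks
    return read_side + write_side


def entries_to_cover_round_trip(f_read_hz, f_write_hz, sync_cycles=4):
    """Entries needed so the reader has one word per read clock for the round trip."""
    return math.ceil(round_trip_read_cycles(f_read_hz, f_write_hz, sync_cycles))


print(entries_to_cover_round_trip(100e6, 100e6))  # equal clock frequencies
print(entries_to_cover_round_trip(200e6, 100e6))  # read clock twice as fast
```

With equal frequencies this reproduces the 8 entries quoted above, and a read clock twice as fast as the write clock gives 12. For a faster write clock the expression drops below 8 under this assumed ratio, while the post keeps 8 for that case.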

if the FIFO can not be stalled (no full indication needed), the depth of the FIFO becomes a function of the burstiness of the write side. if for some periods of time the write side is capable of writing more than the average amount of data in a short time, the FIFO should be able to accommodate that, and its depth is to be set accordingly.

in all cases, to be on the safe side, the write bandwidth should not exceed the read bandwidth or else the FIFO will overrun, no matter what the depth is. it is advisable to use some degree of speedup to prevent such cases. keeping to the bandwidth requirement allows you to read different size chunks from the FIFO and still have it function correctly.

if you are looking for a good example of a simple asynchronous FIFO you can check out this site, it has some explanation and a reference design:
https://www.rtlery.com/components/asynchronous-fifo-synchronizer

hope this helps
Amnon

sun_ray · Jun 5, 2012

What do you mean by stalling here? Is it stalling of write data or read data?

aparnass · Jun 5, 2012

The stalling refers to a situation where no reads are performed from the FIFO and it is allowed to fill up to the point where the full indication on the write side is asserted.

sun_ray · Jun 6, 2012

What do you mean by the following two cases? Can you please explain more?

1. the FIFO can read may be stalled by downstream logic and get full. What does 'FIFO can read' mean here?
2. once the FIFO is flowing, there should be no bubbles (cycles where the FIFO gets empty). What do you mean by flowing here?

Can you please write the mathematical equation for 'write bandwidth should be equal to the read bandwidth'? What is the exact requirement on the write bandwidth and read bandwidth? If we are transferring data from one clock domain to a new clock domain, then the write data width should be equal to the read data width. The read data width cannot be bigger than the write data width, because it is the same data (carrying some information) coming through, so the width cannot change. Please explain how the width can vary if you do not agree with my statement.

aparnass · Jun 6, 2012

when a FIFO is read at the same rate as it is written, it will not get full, but as soon as you stop reading, it will start to fill up, to the point where the full indication is asserted. this scenario is the basis for calculating the required FIFO depth, because once this situation is cleared (you start to read the FIFO again) you want to be able to read constantly until the FIFO returns to its normal level, without having to stop for some cycles until the newly written data is ready to be read. given that this "bubble free" behavior is a requirement, the FIFO depth should be as calculated above. by flowing, i mean that data is both written and read at the correct long term average rate.

bandwidth is measured in bits/sec or bytes/sec so when you use a clock domain crossing you should see that the rate for write is lower or equal to the rate for read. typically, you would want the read bandwidth to be higher by a bit, because if they are equal, the FIFO may build up over time and eventually overflow. the reason we need the FIFO to be deep enough is that there are short term mismatches in the rate of read and write, caused by the implementation, the downstream logic or just PLL frequency drift, so you need some entries to cover those temporary situations. moreover, if you read and write in different data widths, the logic for doing that creates distortions in the rate, and this should also be accounted for.
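
The bandwidth condition can be written as a small Python check. This is a sketch under assumptions: bandwidth is taken as clock frequency times data width times the fraction of cycles that actually carry data, and the function names and example numbers are illustrative rather than from any specific design.

```python
def bandwidth_bits_per_sec(clock_hz, width_bits, active_fraction=1):
    """Long-term bandwidth of one FIFO port.

    active_fraction is the assumed share of clock cycles that transfer a word
    (1 means a word on every cycle).
    """
    return clock_hz * width_bits * active_fraction


def long_term_rate_ok(write_bw, read_bw):
    """The write side must not outrun the read side on average."""
    return write_bw <= read_bw


# Different widths on the two sides can still balance:
# a 32-bit write port at 100 MHz against a 64-bit read port at 50 MHz.
write_bw = bandwidth_bits_per_sec(100_000_000, 32)
read_bw = bandwidth_bits_per_sec(50_000_000, 64)
print(write_bw, read_bw, long_term_rate_ok(write_bw, read_bw))
```

The check says nothing about depth. It only tells you whether a finite depth can exist at all; the depth itself comes from the short-term mismatches described above.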

in the real world, asynchronous FIFOs that transfer at equal rates on the read and write sides should sit at a steady state depth of 2-4 entries; those account for the logic itself and the minor effects discussed above.

hope this clears things up
Amnon

sun_ray · Jun 7, 2012

What do you mean by 'bubble'?

Thanks for explaining what you mean by flowing, but the explanation still needs clarification. What do you mean by 'correct long term average rate' when you write 'that data is both written and read at the correct long term average rate'? Which rate are you talking about?

You wrote 'so when you use a clock domain crossing you should see that the rate for write is lower or equal to the rate for read.' But we use FIFO when the rate of writing is higher than the rate of reading so that the FIFO is used as a queuing element/buffering element?

aparnass · Jun 7, 2012

A bubble is a cycle on the read side in which nothing is read from the FIFO because the FIFO indicates that it is empty. if the FIFO is not deep enough, this can happen at the point where the FIFO was stalled and then starts to flow again. this situation can be avoided by giving the FIFO the correct depth.

the long term rate of data flow is the same for write and for read. a FIFO, no matter how deep, can not overcome a higher long term average write bandwidth.
consider this example:
on the write side you have a write rate of 6B/sec
on the read side you have a read rate of 5B/sec

so
after 1 second you have 1B written but not read
after 2 seconds you have 2B written but not read
after 3 seconds you have 3B written but not read
after 4 seconds you have 4B written but not read
after 5 seconds you have 5B written but not read
after 6 seconds you have 6B written but not read
after 7 seconds you have 7B written but not read
........

so if this continues, to support 1 hour of constant traffic you need a FIFO of 3,600B and for a day you need 86,400B ... clearly this is not the way to go.
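
The same arithmetic as a short Python sketch, assuming constant rates on both sides; the function name is made up for the example.

```python
def backlog_bytes(write_rate, read_rate, seconds):
    """Bytes written but not yet read after `seconds` of constant traffic.

    Rates are in bytes per second. The backlog never goes below zero,
    because the reader cannot consume data that was not written.
    """
    return max(0, (write_rate - read_rate) * seconds)


# The 6 B/sec write against 5 B/sec read example.
for label, seconds in [("1 second", 1), ("1 hour", 3600), ("1 day", 86400)]:
    print(label, backlog_bytes(6, 5, seconds), "bytes")
```

The backlog grows linearly with time, so no fixed depth is enough once the average write rate exceeds the average read rate.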

clearly your statement that :

But we use FIFO when the rate of writing is higher than the rate of reading so that the FIFO is used as a queuing element/buffering element

is incorrect. you need the same long term rate on both sides to make sure the FIFO does not overflow, and as i mentioned, you may want some speedup of the read over the write to make sure it does not fill up in steady state.
the FIFO depth is used for overcoming the limitations of the logic and the frequency drift, as i explained above.

the correct long term average rate is therefore the rate at which the FIFO does not eventually overflow
Amnon

sun_ray · Jun 7, 2012

Thanks for the response. I think it is a basic requirement that the long term rate of data flow is the same for write and for read. Can you please explain this basic requirement? How is long term defined here?

Sometimes the following question is asked: a FIFO is written at 80 words/100 clock cycles and read at 8 words/10 clock cycles. What is the correct FIFO depth?

In the above case, as you may note, writing is faster than reading in the worst write case (20 nop-80-80-20 nop). Here nop stands for no operation. So in this case we need a FIFO to buffer the extra data that is being written. This is the reason I wrote "But we use FIFO when the rate of writing is higher than the rate of reading so that the FIFO is used as a queuing element/buffering element". Why do you say that it is incorrect?

Suppose a FIFO does not have full and empty signals. Then, if the FIFO depth is not proper, the reading side can read some data which has already been read from the FIFO. Is that what you call a bubble? In that case the reading side can read all zeros in those clock cycles, if the FIFO is reset to zero after all the data has been read.

You explained what you mean by "the FIFO can read may be stalled by downstream logic and get full", but it is still not clear. Did you want to say 'The FIFO reading may be stalled by downstream logic and get full'?

You wrote "in this condition you do not want the FIFO to become underrun, so you need enough entries to accommodate for the round trip from the release of the stalling to the arrival of the first data written to the FIFO after the stall on the right side is removed." What do you mean by release of stalling? Do you mean the start of stalling the read operation, or the start of reading? What do you mean by 'the stall on the right side is removed'? Do you mean the start of the read operation again after the stall?

You wrote "4*read_cycle + 4*write_cycle*frequency_ratio

so if you are synchronizing between equal frequency asynchronous domains, you will need 8 entries, this would also apply in the case you have a faster to slower synchronization."

Do you define the frequency ratio as write frequency over read frequency, or the reverse? Why do you say we need a depth of 8 in the case of a faster to slower synchronization (assuming that is what you meant by 'you will need 8 entries, this would also apply in the case you have a faster to slower synchronization')? This last question arises because, by the above formula, the depth is not eight if frequency_ratio is not equal to 1.

Thanks

aparnass · Jun 11, 2012

Hello Sun_ray

for the question:

FIFO is written at 80 words/ 100 clock cycles and read at 8 words/10 clock cycles. How much will be the correct FIFO depth?

Since the long term write and read rates are equal, the depth of the FIFO can be calculated to be a finite number (this relates to my explanation above).
this question is not necessarily an asynchronous FIFO question.
anyway, the answer is to find the worst case, which is as you mentioned above. during the 160 consecutive write clocks, 16 x 8 = 128 words would be read (16 periods of 10 clocks, with 8 words read in each).
therefore the depth would be at least 160 - 128 = 32 entries for a synchronous FIFO. in the remaining 40 clocks the FIFO would be read back to empty before it is written again.

the worst case described is clearly an issue of burstiness, or a short-term high rate of write, and the FIFO should be planned for this worst case, provided the long-term rates are equal, as they are in the example.
the question does not refer to the actual FIFO implementation; there may be extra cycles required because of data manipulations on the write or read data.
as you can see from this example, the short term write rate may be higher, but the long term write rate must be lower than or equal to the read rate for the FIFO to work properly without overflow.
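
The worst-case arithmetic above, as a minimal Python sketch. It assumes one word is written on every cycle of the burst, that the burst length is a whole number of 10-clock read periods, and an ideal synchronous FIFO with no extra latency; the function names are illustrative.

```python
def burst_depth(burst_cycles, reads_per_slot=8, slot_cycles=10):
    """Words left in the FIFO at the end of a back-to-back write burst.

    Assumes one write per cycle and a burst that spans whole read slots.
    """
    words_written = burst_cycles
    words_read = (burst_cycles // slot_cycles) * reads_per_slot
    return words_written - words_read


def drain_cycles(words, reads_per_slot=8, slot_cycles=10):
    """Clocks needed to read `words` back out, counted in whole read slots."""
    slots = -(-words // reads_per_slot)  # ceiling division
    return slots * slot_cycles


depth = burst_depth(160)     # two 80-word bursts back to back
print(depth, drain_cycles(depth))
```

For the 160-clock burst this gives 32 entries and 40 clocks to drain them, matching the numbers in the post.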

bubble refers to cycles where there are elements in the asynchronous FIFO that are not read because of the synchronization delay. those bubbles can be overcome by adding more entries, specifically for the case where the FIFO was full because reading had stopped and it starts to flow again once the read side resumes reading.
releasing of the stall means that the read side starts reading entries from the FIFO again, after a period of not reading that let the FIFO get full.

the equation assumes read is faster. this is the case where you may want more entries, to prevent unnecessary bubbles when the FIFO gets full and is then read constantly. more entries allow the FIFO to stabilize without getting empty.
it is true that the worst case burstiness would be when the write side is faster, so you may think that you need more entries in that case, but to stabilize on the long term rate after a long stall where no data is read requires only 8 entries to prevent the bubble. the slower read rate would take care of the read rate, and the FIFO would get full anyway due to the allowed burstiness. you typically can not stall a FIFO that does not have a full indication on the write side.

prady019 · May 21, 2014

This discussion helped me.
A few months back I was asked this question in an interview: "What will be the minimum and maximum FIFO depth if the write rate is 80 words per 100 clocks (80w/100clk) and the read rate is 8 words per 10 clocks (8w/10clk)?"
I did not really understand the question.
Can you help me, guys? What was he trying to find out from me?

alaparthi · May 22, 2014

This is a standard question in chip design interviews. The basic idea here is to decide the right FIFO depth so as not to lose any of the words.

Writes are happening at a rate of 80 words/100 cycles.
The immediate question to answer here is: "Where in those 100 cycles do the writes happen? Are they a continuous burst, interleaved, or at the end?"

The answer to this question yields different depths. The worst cases are as follows:

1) Data occurs in bursts: all 80 words are written in the first 80 cycles, and the last 20 cycles of each 100 are idle.

In this case, we just need a FIFO depth of 16.
As the data is always written in the first 80 cycles, the read side will have finished reading 64 words by then and has 20 more cycles to empty the remaining 16 words from the FIFO. No data loss.

2) Data occurs in bursts: all 80 words are written together, but their position within the 100 cycles is not guaranteed. That is, all of the following are possible out of 100 cycles:
(a) 10(IDLE), 80(DATA), 10(IDLE)
(b) 20(IDLE), 80(DATA)
(c) 80(DATA), 20(IDLE)
etc.

At first look, case (2) appears similar to case (1).
But for this scenario, to get the worst case, we need to look at two back-to-back writes. So in a 200 cycle period, the worst case is when 2(b) occurs in the first 100 cycles and 2(c) occurs in the second 100 cycles.
That makes 160 words written back to back, whereas the read side would have consumed only 128 of them. So the FIFO depth for this case should be 32 words.
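
Both cases can be checked with a small cycle-by-cycle model in Python. This is a sketch under assumptions not stated in the question: the read side takes a word on the first 8 cycles of every 10-cycle slot, the slots are aligned with the 100-cycle write windows, a word can be read in the same cycle it is written, and occupancy is sampled at the end of each cycle. All names are illustrative.

```python
def peak_occupancy(write_pattern, reads_per_slot=8, slot_cycles=10):
    """Highest end-of-cycle FIFO occupancy for a given write pattern.

    write_pattern is a sequence of booleans, one per clock cycle,
    True when a word is written in that cycle.
    """
    occupancy = 0
    peak = 0
    for cycle, writing in enumerate(write_pattern):
        if writing:
            occupancy += 1
        # Assumed read schedule: first 8 cycles of each 10-cycle slot.
        read_slot_open = (cycle % slot_cycles) < reads_per_slot
        if read_slot_open and occupancy > 0:
            occupancy -= 1
        peak = max(peak, occupancy)
    return peak


IDLE, DATA = False, True

# Case 1: every 100-cycle window is 80(DATA), 20(IDLE).
case_1 = ([DATA] * 80 + [IDLE] * 20) * 2
# Case 2 worst case: 2(b) followed by 2(c), giving 160 writes back to back.
case_2 = [IDLE] * 20 + [DATA] * 80 + [DATA] * 80 + [IDLE] * 20

print(peak_occupancy(case_1), peak_occupancy(case_2))
```

Under these assumptions the model peaks at 16 words for case 1 and 32 words for case 2. A real FIFO that cannot read a word in the cycle it is written, or that adds synchronizer latency, would need a few more entries.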

prady019 · May 23, 2014

This is the simple, clear explanation I was looking for.

Thanks Alaparthi

FvM · May 23, 2014

But for this scenario, to get the worst case, we need to look at two back-to-back writes. So in a 200 cycle period, the worst case is when 2(b) occurs in the first 100 cycles and 2(c) occurs in the second 100 cycles.
That makes 160 words written back to back, whereas the read side would have consumed only 128 of them. So the FIFO depth for this case should be 32 words.

You are assuming that 10 and 100 cycle timeslots are aligned. If not, there may be only 126 words consumed.
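
One way to reproduce the 126 figure is sketched below in Python. It rests on an assumed reading of the remark: the 8 reads may fall anywhere inside each 10-cycle read slot, and the 160-cycle write burst may start at any offset into a slot. The function name and parameters are illustrative.

```python
def min_reads_in_window(window, offset=0, reads_per_slot=8, slot_cycles=10):
    """Fewest reads guaranteed during `window` consecutive cycles.

    The window starts `offset` cycles into a read slot. Within each slot the
    idle cycles are assumed to land wherever they hurt most, so an overlap of
    L cycles with a slot guarantees only max(0, L - idle_cycles) reads.
    """
    idle_per_slot = slot_cycles - reads_per_slot
    start, end = offset, offset + window
    total = 0
    for slot in range(start // slot_cycles, (end - 1) // slot_cycles + 1):
        slot_start = slot * slot_cycles
        overlap = min(end, slot_start + slot_cycles) - max(start, slot_start)
        total += max(0, overlap - idle_per_slot)
    return total


aligned = min_reads_in_window(160, offset=0)
worst = min(min_reads_in_window(160, offset=x) for x in range(10))
print(aligned, worst, 160 - worst)
```

With aligned slots the window holds 16 full slots and 128 reads. With a misalignment of 2 to 8 cycles, the two partial slots at the ends guarantee only 6 reads between them, so 126 words are consumed and 34 remain in the FIFO under this reading.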
