Circular FIFO in VHDL

Binome · Feb 13, 2015

OK, thanks for the link.
Here is my new code:

Code:

library	ieee;
use		ieee.std_logic_1164.all;
use		ieee.numeric_std.all;

entity ring_fifo is
	generic(
		fifo_length	: integer:=8;
		fl_bits		: integer := 3;
		data_width	: integer:=4);
	port(
		clk			: in std_logic;
		rst			: in std_logic;
		ren			: in std_logic;
		wen			: in std_logic;
		dataout		: out std_logic_vector(data_width-1 downto 0);
		datain		: in  std_logic_vector(data_width-1 downto 0);
		empty		: out std_logic;
		err			: out std_logic;
		full		: out std_logic);
	end ring_fifo;

architecture arc of ring_fifo is

	type memory_type is array (0 to fifo_length-1) of std_logic_vector(data_width-1 downto 0);

	signal memory			: memory_type := (others => (others => '0'));
	signal readptr,writeptr	: unsigned(fl_bits-1 downto 0) := to_unsigned(0, fl_bits);
	signal full0			: std_logic := '0';
	signal empty0			: std_logic := '1';

begin
	full <= full0;
	empty <= empty0;
					
	err <= '1' when (empty0='1' and ren='1') or (full0='1' and wen='1')
				else '0';

	fifo0: process(clk,rst)
	begin
		if rst='1' then
			memory <= (others => (others => '0'));
			readptr <= to_unsigned(0, fl_bits);
			writeptr <= to_unsigned(0, fl_bits);
			full0 <= '0';
			empty0 <= '1';
		elsif rising_edge(clk) then
			if (wen='1' and full0='0') then
				memory(to_integer(writeptr)) <= datain ;
				writeptr <= writeptr+1;
			end if;

			if (ren='1' and empty0='0') then
				dataout <= memory(to_integer(readptr));
				readptr <= readptr+1;
			end if ;
			
			if (writeptr+1=readptr) and (wen='1') and (ren='0') then
				full0 <= '1';
			else
				full0 <= '0';
			end if;
			
			if (readptr=writeptr) and (ren='1') and (wen='0') then
				empty0 <= '1';
			else
				empty0 <= '0';
			end if;
				
		end if; 
	end process;
end arc;

As I understand it, full only goes to '1' after someone has tried to write and failed. What I'd like is for full to go high beforehand, so that it warns the user that a write would fail.

TrickyDicky · Feb 13, 2015

Full shouldn't be '1' because a write has failed; it should indicate that there is no more room in the FIFO.

There are two ways to handle a write that is asserted while the FIFO is full:
1. Implement write protection that blocks writes while it is full (this is what you have done). This gives slightly poorer timing performance, but users can simply leave wren high and use the full flag as back pressure (they probably have another FIFO with (not full) connected to its read enable).

2. Have no such protection. This offers better timing performance, but it requires the user to gate their own write enable signal with the full flag.
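
For option 2, the gating on the user's side could look roughly like the sketch below. The entity and signal names (write_gate, user_wen, fifo_wen) are made up for illustration and do not come from this thread.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper: names are illustrative only.
entity write_gate is
  port (
    user_wen : in  std_logic;   -- write request from the producer
    full     : in  std_logic;   -- full flag coming back from the FIFO
    fifo_wen : out std_logic);  -- gated write enable driven into the FIFO
end entity write_gate;

architecture rtl of write_gate is
begin
  -- Pass the write request through only while the FIFO reports room.
  fifo_wen <= user_wen and not full;
end architecture rtl;
```

With this in place the FIFO itself needs no "and full0='0'" term on its write path, which is where the timing gain of option 2 would come from.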

K-J · Feb 13, 2015

Your logic for the flags is not correct. Let's say the FIFO is full, but wen and ren are both 0. Your logic will report that the FIFO is not full, which is wrong: it is still full. What you want is the following:

Code:

if (rst = '1') or (ren = '1') then
    full0 <= '0';
elsif (writeptr+1=readptr) and (wen='1') then
    full0 <= '1';
end if;

Similar comments for empty.

Kevin Jennings
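
As a rough sketch, the full logic above and the matching empty logic could be combined into one clocked process as follows. This is only an illustration under some assumptions that are not from the thread: the entity name ring_fifo_flags is made up, the reset is synchronous (which is how the snippet above reads), the err output is left out, and the memory array is not reset.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity ring_fifo_flags is  -- hypothetical name, for illustration only
  generic (
    fl_bits    : integer range 2 to 15 := 3;  -- log2(fifo_length)
    data_width : integer range 1 to 32 := 4);
  port (
    clk, rst, ren, wen : in  std_logic;
    datain             : in  std_logic_vector(data_width-1 downto 0);
    dataout            : out std_logic_vector(data_width-1 downto 0);
    empty, full        : out std_logic);
end entity ring_fifo_flags;

architecture sketch of ring_fifo_flags is
  type memory_type is array (0 to 2**fl_bits-1)
    of std_logic_vector(data_width-1 downto 0);
  signal memory            : memory_type;  -- deliberately not reset
  signal readptr, writeptr : unsigned(fl_bits-1 downto 0) := (others => '0');
  signal full0             : std_logic := '0';
  signal empty0            : std_logic := '1';
begin
  full  <= full0;
  empty <= empty0;

  fifo0 : process(clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then  -- assumption: synchronous reset
        readptr  <= (others => '0');
        writeptr <= (others => '0');
        full0    <= '0';
        empty0   <= '1';
      else
        if wen = '1' and full0 = '0' then
          memory(to_integer(writeptr)) <= datain;
          writeptr <= writeptr + 1;
        end if;

        if ren = '1' and empty0 = '0' then
          dataout <= memory(to_integer(readptr));
          readptr <= readptr + 1;
        end if;

        -- full: cleared by any read, set by the write that catches up
        -- with the read pointer, held otherwise
        if ren = '1' then
          full0 <= '0';
        elsif (writeptr + 1 = readptr) and wen = '1' then
          full0 <= '1';
        end if;

        -- empty: the mirror image, with the roles of wen and ren swapped
        if wen = '1' then
          empty0 <= '0';
        elsif (readptr + 1 = writeptr) and ren = '1' then
          empty0 <= '1';
        end if;
      end if;
    end if;
  end process;
end architecture sketch;
```

Because neither flag has an "else" branch that forces it back, a full FIFO with wen and ren both at 0 stays full, which is the case the original flag logic got wrong.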

vGoodtimes · Feb 16, 2015

Some notes:
1.) It is hard to say for sure whether full should be '1' on reset. Some people use full to mean "size = capacity", others use it to mean "write will fail".
2.) With async resets you may have other, similar issues if you use that flag near a reset.
3.) (others => (others => '0')) prevents mapping to distributed memory (DMEM) or BRAM (for Xilinx). This might be OK for applications with only 8 entries, but you have no comments or limits on the generics.
4.) You might benefit from using two processes. One of the annoying parts of VHDL is that certain signals really are combinatorial and can't be put in the clocked process, even though they are key to understanding the process and are highly related to it.

My suggestion is to spend about 30 minutes to put together the two-process version that works first, then spend whatever time you want to clean it up.

Having done VHDL for a while, I like that 1 process is concise, but 2 process designs are often easier to deal with.

Binome · Feb 17, 2015

@TrickyDicky
In section 2.2 of the linked document I read that the pointers have to contain an extra bit to tell whether the FIFO is full or empty, and that was the purpose of my 'rcycle' and 'wcycle' signals. So I think I have to reintroduce them.

@K-J
I'm not sure about the new code:

Code:

library	ieee;
use		ieee.std_logic_1164.all;
use		ieee.numeric_std.all;

entity ring_fifo is
	generic(
		fl_bits		: integer range 2 to 15 := 3; -- log2(fifo_length)
		data_width	: integer range 1 to 32 := 4);
	port(
		clk			: in std_logic;
		rst			: in std_logic;
		ren			: in std_logic;
		wen			: in std_logic;
		dataout		: out std_logic_vector(data_width-1 downto 0);
		datain		: in  std_logic_vector(data_width-1 downto 0);
		empty		: out std_logic;
		err			: out std_logic;
		full		: out std_logic);
	end ring_fifo;

architecture arc of ring_fifo is

	type memory_type is array (0 to 2**fl_bits-1) of std_logic_vector(data_width-1 downto 0);

	signal memory			: memory_type := (others => (others => '0'));
	signal readptr,writeptr	: unsigned(fl_bits-1 downto 0) := to_unsigned(0, fl_bits);
	signal full0			: std_logic := '0';
	signal empty0			: std_logic := '1';

begin
	full <= full0;
	empty <= empty0;
					
	err <= '1' when (empty0='1' and ren='1') or (full0='1' and wen='1')
			else '0';

	fifo0: process(clk,rst)
	begin
		if rst='1' then
			memory <= (others => (others => '0'));
			readptr <= to_unsigned(0, fl_bits);
			writeptr <= to_unsigned(0, fl_bits);
			full0 <= '0';
			empty0 <= '1';
		elsif rising_edge(clk) then
			if (wen='1' and full0='0') then
				memory(to_integer(writeptr)) <= datain ;
				writeptr <= writeptr+1;
			end if;

			if (ren='1' and empty0='0') then
				dataout <= memory(to_integer(readptr));
				readptr <= readptr+1;
			end if ;
			
			if (writeptr+1=readptr) and (ren='0') and (wen='1') then
				full0 <= '1';
			else
				full0 <= '0';
			end if;
			
			if (readptr+1=writeptr) and (wen='0') and (ren='1') then
				empty0 <= '1';
			else
				empty0 <= '0';
			end if;
				
		end if; 
	end process;
end arc;

@vGoodtimes
3.) Now the generics have limits, but I don't understand the DMEM and BRAM matter. Should I use the (others => (others => '0')) or not?
4.) I don't understand where I could use two processes.

ads-ee · Feb 17, 2015

Binome said:
In section 2.2 of the linked document I read that the pointers have to contain an extra bit to tell whether the FIFO is full or empty, and that was the purpose of my 'rcycle' and 'wcycle' signals. So I think I have to reintroduce them.

In 2.2 of the document I posted, the extra pointer bits are for when you want to actually use all the locations in the FIFO. It's usually more resource efficient to let the full condition be one less than the depth, thereby avoiding the extra pointer logic. So for a 1024 deep memory you would have a full condition when 1023 data writes have occurred without any reads, i.e. the write pointer is at address 1023 and the read pointer is at 0. In other words, full means the write pointer is one increment away from the read pointer, and empty means the write pointer is equal to the read pointer.
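
The two pointer comparisons described above can be written as plain concurrent assignments. The following is a minimal sketch of just the flag decode; the entity name ptr_flags is hypothetical, and the pointers are assumed to be maintained elsewhere.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity ptr_flags is  -- hypothetical name, for illustration only
  generic (fl_bits : positive := 3);  -- log2 of the memory depth
  port (
    readptr, writeptr : in  unsigned(fl_bits-1 downto 0);
    empty, full       : out std_logic);
end entity ptr_flags;

architecture rtl of ptr_flags is
begin
  -- empty: both pointers address the same location
  empty <= '1' when writeptr = readptr else '0';
  -- full: the next write would land on the read pointer; the unsigned
  -- addition wraps around, so the last address followed by 0 also matches
  full  <= '1' when writeptr + 1 = readptr else '0';
end architecture rtl;
```

With fl_bits = 3 this reports full once 7 of the 8 locations are occupied, which is the "one less than the depth" trade-off: one location is given up in exchange for not needing the extra pointer bit.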

K-J · Feb 17, 2015

Binome said:
@K-J
I'm not sure about the new code:

Comments added below to your code. For the proper logic, refer to what I posted in my last post.

Binome said:

Code:

    if (writeptr+1=readptr) and (ren='0') and (wen='1') then
        full0 <= '1';
    else
        ----------------------------------------------------------------
        -- KJ:  The problem is that you'll get to this code whenever wen is 0.  Consider the following scenario:
        -- 1. ren is always 0
        -- 2. wen is 1 up until the fifo fills up and then shuts off and stays 0
        -- Obviously now the fifo has been filled up (but not overflowing).  Since the read signal is never
        -- set to 1, the fifo will remain full (ignoring a reset).  Now let's look at what your logic does:
        -- In #2, on the write cycle where the writeptr is about to wrap around and equal the readptr, you will set
        -- full to 1 with the 'if' condition that you have (so that is good).  But on the very next clock cycle, #2 says
        -- that wen will then shut off and stay 0.  So now the 'if' condition is no longer true and you end up in this
        -- 'else' branch which will set full to 0.  But nothing has been read out, so the fifo is still full so full should 
        -- be set to 1.
        -- In the code that I posted previously, it works because of the following true statements:
        --     Full can never be set to 1 if ren is also 1
        --     Full can change from 1 to 0 only if ren is also 1
        -- The same principles apply to generating the empty signal.
        ----------------------------------------------------------------
        full0 <= '0';
    end if;

@vGoodtimes
3.) Now the generics have limits, but I don't understand the DMEM and BRAM matter. Should I use the (others => (others => '0')) or not?

No, you should not.

4.) I don't understand where I could use two processes.

You should not use two processes; vGoodtimes gave you poor advice.

Kevin Jennings

vGoodtimes · Feb 18, 2015

@Binome: FPGAs have different built-in memories. Registers are the most general purpose and the fastest for small depths, but they aren't very dense. Distributed RAMs are a bit denser, but have restrictions on what can be done. Block RAMs are the largest elements, but have even more restrictions. BRAMs are typically 2kB - 4kB each, with a handful of width/depth options. Distributed RAMs are typically 16-32 bits each (but much more plentiful). If you want more info, look at Xilinx's (or Altera's) synthesis guides, which show the coding style used to infer these special elements from VHDL/Verilog. In your case, a 32k deep FIFO built from registers would use 32k registers per bit of width. A 32kB FIFO would be 256k registers (or just 8 BRAMs for Xilinx)! Even if your FPGA has 256k free registers, the map/par time to implement the design will be much longer.
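
For reference, a memory written without any reset and with a registered read is the general shape that synthesis guides describe for RAM inference. The sketch below is an assumption-laden illustration, not a vendor template: the entity name fifo_ram and its port names are made up, and whether a particular tool maps it to block RAM, distributed RAM or registers depends on the tool, the device and the sizes chosen.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity fifo_ram is  -- hypothetical name, for illustration only
  generic (
    addr_bits  : positive := 10;
    data_width : positive := 8);
  port (
    clk   : in  std_logic;
    we    : in  std_logic;
    waddr : in  unsigned(addr_bits-1 downto 0);
    raddr : in  unsigned(addr_bits-1 downto 0);
    din   : in  std_logic_vector(data_width-1 downto 0);
    dout  : out std_logic_vector(data_width-1 downto 0));
end entity fifo_ram;

architecture rtl of fifo_ram is
  type ram_type is array (0 to 2**addr_bits-1)
    of std_logic_vector(data_width-1 downto 0);
  -- No reset of the array anywhere: clearing every word in a reset
  -- branch is what forces a register implementation.
  signal ram : ram_type;
begin
  process(clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        ram(to_integer(waddr)) <= din;
      end if;
      -- Registered (synchronous) read
      dout <= ram(to_integer(raddr));
    end if;
  end process;
end architecture rtl;
```

The FIFO pointers and flags would then live in a separate process or entity and only drive waddr, raddr and we.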

Two-process designs have one process that is clocked and ONLY has simple logic for reset and "x <= next_x;" or the similar "x_reg <= next_x;". The other process is not clocked and holds the combinatorial logic.

There are several pros and cons of 1 clocked process vs 1 clocked + 1 combinatorial. It would make for a good thread on its own. (Having done both, I really _want_ the 1 process design, but the 2 process has many nice features.)

@K-J/Binome
I suggested the 2-process design because it has benefits in understanding what is going on. It gives you both current and next value visibility in sim. It makes you think about the logic for driving the registers. It also removes the "this has a different name so it is an extra register" implications that cause so many issues in single process designs.
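
To make the two-process style described above concrete, here is a small sketch that applies it to a write pointer only. The entity name wptr_two_process and the signal next_writeptr are illustrative, and a synchronous reset is assumed.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity wptr_two_process is  -- hypothetical name, for illustration only
  generic (fl_bits : positive := 3);
  port (
    clk, rst, wen, full : in  std_logic;
    writeptr            : out unsigned(fl_bits-1 downto 0));
end entity wptr_two_process;

architecture rtl of wptr_two_process is
  signal writeptr_reg  : unsigned(fl_bits-1 downto 0) := (others => '0');
  signal next_writeptr : unsigned(fl_bits-1 downto 0);
begin
  writeptr <= writeptr_reg;

  -- Clocked process: only reset and "x_reg <= next_x"
  reg : process(clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then  -- assumption: synchronous reset
        writeptr_reg <= (others => '0');
      else
        writeptr_reg <= next_writeptr;
      end if;
    end if;
  end process;

  -- Combinatorial process: all next-value logic lives here
  comb : process(writeptr_reg, wen, full)
  begin
    next_writeptr <= writeptr_reg;  -- default assignment avoids a latch
    if wen = '1' and full = '0' then
      next_writeptr <= writeptr_reg + 1;
    end if;
  end process;
end architecture rtl;
```

In simulation both writeptr_reg and next_writeptr show up in the waveform, which is the "current and next value visibility" mentioned above. The cost is the extra signal and the need for a default assignment on every path of the combinatorial process.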

TrickyDicky · Feb 18, 2015

The 2 process style is really a throwback.
In the old, old days, synthesis tools were not very good at synthesising RTL code, so they required separate processes for logic and registers. Then textbooks taught this method, and the style has persisted. Now that logic and registers are essentially free in FPGAs, you really should be registering as much as you can (which makes a single process desirable). In addition, a single clocked process has the benefit that it is impossible to produce latches, which are all too often generated by beginners. Two processes just spread related logic around your code, making it harder to follow.

Unregistered outputs can be done easily with a single process and some external assignments, but you would want to be avoiding this anyway.

It's just a shame that most teaching sources are old and stick with the 2 process style, so many people think it's a good way to learn.

vGoodtimes · Feb 20, 2015

TrickyDicky said:
Unregistered outputs can be done easily with a single process and some external assignments, but you would want to be avoiding this anyway.

This is the main reason I went back to 2-process after 6 years of mostly 1-process designs. The external assignments often duplicated logic inside the process and would not be updated by the next dev. Many of the designs I saw had _complex_ related code spread around as a result.

I have tools to deal with verbose code, so that aspect didn't really concern me.

(You can also generate latches with 1-process, e.g. dataout in the OP's code.)

FvM · Feb 20, 2015

There's by the way a third option, combining registered and combinatorial assignments in a single process. But let's start with the usual schemes.

I presume that the designer decides which signal should be registered. The single registered process scheme suggests to register all signals, except for those you don't want to get delayed by a clock cycle. They must be assigned in some way outside the clock edge sensitive condition.

This may result in more signals being registered than necessary, but usually with no or little impact on resource utilization. Timing closure is simplified on the other hand.

In a single process scheme, the combinatorial logic description is packed with register description. As in any behavioral code, the logic complexity isn't obvious at first sight.

The classical two process scheme cuts the description between combinatorial logic and registers, needing additional signals for each register input. Although it's a step towards low-level design, the behavioral description of the combinatorial logic doesn't necessarily reveal the logic complexity better than in the single process scheme. The need for extra combinational "wire" signals is probably a sufficient reason for many designers to stay away from this concept.

If you prefer the two process scheme for your design, I don't want to lead you astray. But expect contradiction if you claim its superiority.

In case of doubt I prefer the implementation that needs less code lines for the same functionality.

Binome · Feb 20, 2015

I'll use Altera's FIFO, as such a component seems too complex for me to understand.
Thank you all.

vGoodtimes · Feb 21, 2015

Oh, certainly. You really should use vendor parts where possible. If something can be exactly a builtin fifo, it is almost always best to instantiate it. The tools won't infer it and not inferring it usually means more area and more delay.

IMO the 1 vs 2 process argument is more interesting. Both have merit and both arguments stem from a weakness in the language. VHDL does allow variables, which let common subexpressions (lines > 80 chars) be combinatorial inside of a clocked process. But the failing is the inability to export next state logic out of a process. I am an experienced dev, and have no issues mixing variables and signals in a process. But I can't get next state logic out of a process without manual effort, and I know that this will only come back to me when someone tries to update my code.

So I use vim and scripts to handle the verbose two-process designs. I would love the ability to export next-state logic from a clocked process! But that will never happen with VHDL. Maybe with Verilog 3000...

For the same reason, I prefer the implementation with the fewest interesting lines. I'll use procedures liberally even though VHDL can't return procedures from functions... (failing). Code describing HW is ok. Code describing code that describes HW -- that is the devil's work...
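
The variable-based style mentioned above can be sketched as follows, again on a write pointer. The entity name wptr_variable and the variable next_ptr are made up for illustration, and a synchronous reset is assumed.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity wptr_variable is  -- hypothetical name, for illustration only
  port (
    clk, rst, wen, full : in  std_logic;
    writeptr            : out unsigned(2 downto 0));
end entity wptr_variable;

architecture rtl of wptr_variable is
  signal writeptr_reg : unsigned(2 downto 0) := (others => '0');
begin
  writeptr <= writeptr_reg;

  process(clk)
    -- next_ptr is assigned before it is read on every clock edge,
    -- so it describes combinatorial logic, not an extra register.
    variable next_ptr : unsigned(2 downto 0);
  begin
    if rising_edge(clk) then
      next_ptr := writeptr_reg;
      if wen = '1' and full = '0' then
        next_ptr := writeptr_reg + 1;
      end if;
      if rst = '1' then  -- assumption: synchronous reset
        next_ptr := (others => '0');
      end if;
      writeptr_reg <= next_ptr;
    end if;
  end process;
end architecture rtl;
```

The next value is computed once and stays next to the register it feeds, but next_ptr exists only inside the process. Anything outside that needs the same next-state value has to recompute it, which is the limitation described above.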
