First, some general remarks.
Since the receiver state machine is a clocked process, it only needs 'int_clk' in its sensitivity list, if I am not mistaken.
Note that you use 'int_clk' as a derived clock, which is not always a good thing: a clock generated in logic can introduce skew and complicates timing analysis.
If you want to use the 'master' clock (clk) instead, simply combine the baud rate counter and the receiver state machine in one process.
Something like this:
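process (Clk)
begin
  if rising_edge(Clk) then
    clk_count <= clk_count - 1;
    if (clk_count = "00000") then
      clk_count <= "10101";      -- reload; overrides the decrement above
      case State_RX_0 is         -- state machine advances once per counter wrap
        .....
      end case;
    end if;
  end if;
end process;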
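Now the state machine is clocked by the rising edge of the master clock,
but it only advances when the counter wraps.
I find your process for creating the baud rate a little difficult to
understand. If you want to toggle 'int_clk' every 21 clk cycles,
the following will do as well. Note that reloading with "10101" (21) and
counting down to zero actually gives 22 clk cycles between toggles; reload
with "10100" if you need exactly 21.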
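process (Clk)
begin
  if rising_edge(Clk) then
    clk_count <= clk_count - 1;
    if (clk_count = "00000") then
      clk_count <= "10101";      -- reload; overrides the decrement above
      int_clk <= not int_clk;
    else
      int_clk <= int_clk;        -- optional: a signal keeps its value anyway
    end if;
  end if;
end process;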
If you want 'int_clk' to be active for just one 'clk' cycle, use this instead:
process (Clk)
begin
  if rising_edge(Clk) then
    clk_count <= clk_count - 1;
    if (clk_count = "00000") then
      clk_count <= "10101";      -- reload; overrides the decrement above
      int_clk <= '1';            -- pulse for a single clk cycle
    else
      int_clk <= '0';
    end if;
  end if;
end process;
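To make the one-cycle pulse idea concrete, here is a minimal, self-contained sketch of such a pulse generator. The entity name 'baud_tick', the port 'tick' and the generic 'DIVIDER' are my own illustrative names, not taken from your design, and the counter width of 5 bits is an assumption that limits DIVIDER to 32. Like the rest of the code here, it has not been run through a simulator or synthesis tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: all names and the default DIVIDER are illustrative.
entity baud_tick is
  generic (
    DIVIDER : positive := 21        -- clk cycles between pulses (assumed, max 32)
  );
  port (
    clk  : in  std_logic;
    tick : out std_logic := '0'     -- high for one clk cycle every DIVIDER cycles
  );
end entity baud_tick;

architecture rtl of baud_tick is
  -- The initial value keeps the counter defined in simulation without a reset.
  signal clk_count : unsigned(4 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if clk_count = 0 then
        -- Reload with DIVIDER - 1 so the counter passes through exactly
        -- DIVIDER values (DIVIDER - 1 down to 0) per pulse.
        clk_count <= to_unsigned(DIVIDER - 1, clk_count'length);
        tick      <= '1';
      else
        clk_count <= clk_count - 1;
        tick      <= '0';
      end if;
    end if;
  end process;
end architecture rtl;
```

The receiver process would then stay on 'clk' and wrap its case statement in "if tick = '1' then ... end if;", so the pulse acts as a clock enable rather than as a second clock.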
You don't always need a reset signal. Even if 'clk_count' starts with an
'unknown' value, in hardware it will always count down (or up) to the value
we check, so we can be sure it runs correctly after a reasonably short time.
(In simulation, give it an initial value, otherwise it stays undefined.)
You only need a reset when synchronization or a defined initial state is
important. You then have the choice between a synchronous
and an asynchronous reset. If I remember correctly, a synchronous reset
uses fewer resources, although that depends on the target device.
Asynchronous reset:
process (Clk, reset)             -- reset must be in the sensitivity list
begin
  if (reset = '1') then
    ..
  elsif rising_edge(Clk) then
    ..
  end if;
end process;
Synchronous reset:
process (Clk)
begin
  if rising_edge(Clk) then
    if (reset = '1') then
      ..
    else
      ..
    end if;
  end if;
end process;
Note: none of the code I present here has been checked in any way.
I don't know about your test bench, but you can use a 'real' test bench
(you put in defined data and check the result), or you can just create
a very basic one and check the signals 'manually'/'visually'.
In general, if the test bench indicates that everything works properly
(and I assume then that the test bench is set up correctly), then
a real implementation should also be successful. If not, then either
the idea (what the code does) or the test bench is not correct.
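As an illustration of a 'real' (self-checking) test bench, here is a small sketch that drives the one-cycle pulse generator from above and checks the spacing between pulses with an assert. The entity name 'tb_divider', the 20 ns clock period and the 'done' signal are assumptions of mine. The expected spacing of 22 follows from reloading with "10101" (21) and counting down to zero. I have not run it myself.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical test bench: names and the clock period are illustrative.
entity tb_divider is
end entity tb_divider;

architecture sim of tb_divider is
  constant CLK_PERIOD : time := 20 ns;   -- assumed 50 MHz master clock
  signal clk       : std_logic := '0';
  signal clk_count : unsigned(4 downto 0) := (others => '0');
  signal int_clk   : std_logic := '0';
  signal done      : boolean := false;
begin
  -- Free-running clock; it stops toggling once the checks are done.
  clk <= not clk after CLK_PERIOD / 2 when not done else '0';

  -- Logic under test: the one-cycle pulse generator.
  process (clk)
  begin
    if rising_edge(clk) then
      clk_count <= clk_count - 1;
      if clk_count = "00000" then
        clk_count <= "10101";
        int_clk   <= '1';
      else
        int_clk <= '0';
      end if;
    end if;
  end process;

  -- Checker: measure the number of clk cycles between successive pulses.
  process
    variable cycles : natural;
  begin
    wait until rising_edge(clk) and int_clk = '1';   -- synchronize on a pulse
    for i in 1 to 3 loop
      cycles := 0;
      loop
        wait until rising_edge(clk);
        cycles := cycles + 1;
        exit when int_clk = '1';
      end loop;
      -- Reload value 21 plus the zero state gives 22 cycles per pulse.
      assert cycles = 22
        report "unexpected pulse spacing: " & integer'image(cycles)
        severity error;
    end loop;
    report "test bench finished";
    done <= true;
    wait;
  end process;
end architecture sim;
```

The same pattern (stimulus, a loop that samples on the clock edge, an assert on the result) carries over to the receiver: feed in a known serial frame and assert on the received byte.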
It seems that the first data bit is sampled 12 int_clk cycles after
detecting the start bit (10 * RX_0_IDLE + 1 * RX_0_START + 1 * RX_0_DATA).
Assuming 16 int_clk cycles per bit, I think this should be 16 + 8 = 24
int_clk cycles after detecting the start bit: 16 cycles to get past the
start bit plus 8 to reach the middle of the first data bit.
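The timing argument above can be sketched as a small receiver that waits 24 ticks after the start edge and then samples every 16 ticks. This is not your state machine: the entity 'rx_sketch', its ports and its state names are hypothetical, and it assumes that 'int_clk' is a one-clk-cycle enable pulse at 16 times the baud rate, 8 data bits sent LSB first, and an 'rx' input that is already synchronized to 'clk'. It is untested and leaves out a check of the start bit at its midpoint.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical receiver sketch; see the assumptions in the text above.
entity rx_sketch is
  port (
    clk     : in  std_logic;
    int_clk : in  std_logic;                      -- 16x baud enable pulse (assumed)
    rx      : in  std_logic;                      -- serial input, idle high
    data    : out std_logic_vector(7 downto 0);
    valid   : out std_logic := '0'                -- one clk cycle per good frame
  );
end entity rx_sketch;

architecture rtl of rx_sketch is
  type state_t is (IDLE, DATA_BITS, STOP);
  signal state   : state_t := IDLE;
  signal ticks   : unsigned(4 downto 0) := (others => '0');
  signal bit_idx : unsigned(2 downto 0) := (others => '0');
  signal shreg   : std_logic_vector(7 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      valid <= '0';
      if int_clk = '1' then
        case state is
          when IDLE =>
            if rx = '0' then                      -- start bit detected
              ticks   <= to_unsigned(23, ticks'length);  -- sample 24 ticks from now
              bit_idx <= (others => '0');
              state   <= DATA_BITS;
            end if;
          when DATA_BITS =>
            if ticks = 0 then                     -- middle of the current data bit
              shreg <= rx & shreg(7 downto 1);    -- LSB first: shift in from the top
              ticks <= to_unsigned(15, ticks'length);    -- next sample in 16 ticks
              if bit_idx = 7 then
                state <= STOP;
              else
                bit_idx <= bit_idx + 1;
              end if;
            else
              ticks <= ticks - 1;
            end if;
          when STOP =>
            if ticks = 0 then                     -- middle of the stop bit
              if rx = '1' then                    -- accept only a valid stop bit
                data  <= shreg;
                valid <= '1';
              end if;
              state <= IDLE;
            else
              ticks <= ticks - 1;
            end if;
        end case;
      end if;
    end if;
  end process;
end architecture rtl;
```

The two constants carry the whole argument: 23 (count of 24 ticks) covers the rest of the start bit plus half a data bit, and 15 (count of 16 ticks) steps from the middle of one bit to the middle of the next.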
For a UART example of mine, look here:
It also includes test benches.