I have a buffer that is 256 bits long, and I am using the PicoBlaze to read and write data to it.
I can only read/write in 8-bit chunks, which is acceptable. So with CPU_Addr = 0 to 31, I want the corresponding 8-bit block to be read or written.
The problem is that if I enable the write line, the number of used slices grows from 310 to 1292, and I cannot understand why.
Code:
-- Named buf here because "buffer" is a reserved word in VHDL.
signal buf : std_logic_vector(255 downto 0);

process (clk)
  variable ptrStart : integer range 0 to 255 := 0;
  variable tmpAddr  : integer range 0 to 255 := 0;
begin
  if falling_edge(clk) then
    -- Convert the byte address (0 to 31) into a bit offset into the vector.
    tmpAddr  := to_integer(unsigned(CPU_Addr));
    ptrStart := tmpAddr * 8;
    if write_strobe = '1' then
      buf(ptrStart + 7 downto ptrStart) <= data_in; -- problem with this line !!
    else
      data_out <= buf(ptrStart + 7 downto ptrStart);
    end if;
  end if;
end process;
Instead of treating VHDL as a programming language, you should think about FPGA hardware structures that can effectively handle the data processing problem.
Reading and writing 8 data bits from/to a register array with a variable bit address maps to 256 DFFs plus many more logic cells acting as a huge mux array. That is very inefficient in hardware design terms.
- - - Updated - - -
At first sight, the bit addressing is unnecessary; you can use a byte register array instead. A dual-port RAM would even allow simultaneous read and write if required.
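A minimal sketch of the byte-array idea, written in the usual style that synthesis tools can map to RAM. The entity name byte_buffer, the 5-bit width of cpu_addr and the array type mem_t are assumptions made for illustration, and the sketch clocks on the rising edge and reads on every cycle, unlike the original process. Whether a given tool actually infers distributed or block RAM from it depends on the tool and device.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: 32 x 8 byte memory addressed by byte, not by bit.
entity byte_buffer is
  port (
    clk          : in  std_logic;
    write_strobe : in  std_logic;
    cpu_addr     : in  std_logic_vector(4 downto 0);  -- assumed 5 bits: 0 to 31
    data_in      : in  std_logic_vector(7 downto 0);
    data_out     : out std_logic_vector(7 downto 0)
  );
end entity byte_buffer;

architecture rtl of byte_buffer is
  type mem_t is array (0 to 31) of std_logic_vector(7 downto 0);
  signal mem : mem_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Single write port: one byte per clock at the addressed location.
      if write_strobe = '1' then
        mem(to_integer(unsigned(cpu_addr))) <= data_in;
      end if;
      -- Registered read of the addressed byte. When the same address is
      -- written in this cycle, the previous contents are returned.
      data_out <= mem(to_integer(unsigned(cpu_addr)));
    end if;
  end process;
end architecture rtl;
```

Note that with this structure only one byte is visible per clock, so anything that needs all 256 bits at once would have to be reorganised to work byte by byte.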
Thanks for the speedy reply. The reason I need (or I think I need) the buffer is that I will be doing some calculations on the whole buffer and also performing a CRC on the whole buffer.
If I used a byte register array, could I simply convert/copy it to a 256-bit buffer in order to perform the above tasks?
What I do not understand is why reading in this fashion 'appears' to be OK but writing causes a major difference in the number of slices used. I tested it by removing the read line and adding the write line in its place, expecting the number of slices to be approximately the same, but it is not: the change of direction takes up about 1000 more slices. (It is a simplified way of looking at it, and this is probably why I am missing the obvious!)
The reason I need (or I think I need) the buffer is that I will be doing some calculations on the whole buffer and also performing a CRC on the whole buffer.
You need to design a dataflow model that takes the FPGA hardware capabilities into account. E.g., a RAM-based register file can only be accessed at one address per clock cycle (or two addresses in the case of a dual-port RAM).
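As a rough illustration of such a dataflow, a CRC can be accumulated one byte per clock as the bytes are read out of (or written into) the memory, so the full 256-bit vector is never needed at once. The thread does not say which CRC is used; the sketch below assumes a plain CRC-8 with polynomial 0x07 and zero initial value, and the entity name crc8_stream and the clear/byte_valid ports are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical byte-serial CRC-8 accumulator (polynomial and init assumed).
entity crc8_stream is
  port (
    clk        : in  std_logic;
    clear      : in  std_logic;  -- restart the CRC before a new pass
    byte_valid : in  std_logic;  -- high when byte_in holds the next byte
    byte_in    : in  std_logic_vector(7 downto 0);
    crc_out    : out std_logic_vector(7 downto 0)
  );
end entity crc8_stream;

architecture rtl of crc8_stream is
  constant POLY : std_logic_vector(7 downto 0) := x"07";  -- assumed polynomial
  signal crc : std_logic_vector(7 downto 0) := (others => '0');

  -- Fold one data byte into the running CRC, MSB first.
  function crc8_update(crc_in : std_logic_vector(7 downto 0);
                       data   : std_logic_vector(7 downto 0))
    return std_logic_vector is
    variable c : std_logic_vector(7 downto 0);
  begin
    c := crc_in xor data;
    for i in 0 to 7 loop
      if c(7) = '1' then
        c := (c(6 downto 0) & '0') xor POLY;
      else
        c := c(6 downto 0) & '0';
      end if;
    end loop;
    return c;
  end function;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if clear = '1' then
        crc <= (others => '0');
      elsif byte_valid = '1' then
        crc <= crc8_update(crc, byte_in);
      end if;
    end if;
  end process;

  crc_out <= crc;
end architecture rtl;
```

Feeding the 32 bytes through in address order takes 32 clocks and uses a single memory port, which fits the one-address-per-cycle constraint described above.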
What I do not understand is why reading in this fashion 'appears' to be OK but writing causes a major difference in the number of slices used.
Without the write enable, you simply have a ROM. With the write enable, the tool has to attach the write enable signal and the decoded address to each register, hence the extra LUTs.
This does not fit the template for a RAM, and hence it is all built out of registers.
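To make the "write enable and decoded address on each register" point concrete, the write to the 256-bit vector can be spelled out per byte. This is only a sketch of roughly what the variable-slice assignment implies, not the netlist any particular tool produces; the entity name reg_buffer, the signal name buf and the 5-bit cpu_addr are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: the 256-bit buffer written out as 32 byte lanes.
entity reg_buffer is
  port (
    clk          : in  std_logic;
    write_strobe : in  std_logic;
    cpu_addr     : in  std_logic_vector(4 downto 0);  -- assumed 5 bits: 0 to 31
    data_in      : in  std_logic_vector(7 downto 0);
    buf_out      : out std_logic_vector(255 downto 0)
  );
end entity reg_buffer;

architecture rtl of reg_buffer is
  signal buf : std_logic_vector(255 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if falling_edge(clk) then
      for i in 0 to 31 loop
        -- Each byte lane gets its own address compare and enable:
        -- 32 decoders driving the enables of 256 flip-flops.
        if write_strobe = '1' and to_integer(unsigned(cpu_addr)) = i then
          buf(8*i + 7 downto 8*i) <= data_in;
        end if;
      end loop;
    end if;
  end process;

  -- All 256 bits are available in parallel, which is why this cannot
  -- be mapped onto a RAM primitive.
  buf_out <= buf;
end architecture rtl;
```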