[MOVED] 2d convolution problem using vhdl

sreevenkjan · Nov 26, 2013

Hi all,

I am designing a filter to convolve a binary image that is stored in BRAM as a .coe file. The convolution itself works and my logic is fine. Since images are 2D, I need to filter in both the horizontal (x) direction and the vertical (y) direction. If I store the image row pixels in the block RAM, 9 bits wide and 25600 deep, then my logic filters only in the x direction. If I store a .coe file with the column pixels instead, it filters only in the y direction. Either way the output is wrong, because I need filtering in both directions. If I run the two filtering passes separately (with two different .coe files), take the simulated data from each and combine it, then the output is correct.

So my question is: do I need to store two sets of data in block RAM to do the convolution, or is there another way to proceed?

Thanks,
Sreeni.

TrickyDicky · Nov 27, 2013

I think you're making it a little complicated. All you need for 2D convolution is M-1 line buffers, each connected to a shift register with N-1 taps (as you can also use the immediate data from the line buffer), where your convolution matrix is NxM. Then you can just multiply the values and sum them together (and then do something with the result). This all assumes you are streaming the data in from some source (it could be your block RAM storage, external storage or whatever). You will need some control logic to handle the edges of the image (zero padding, edge repetition etc.). But once you've sussed it, you can easily filter video in real time.
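The line-buffer idea can be sketched roughly as below for the all-ones 3x3 kernel discussed in this thread. This is only an illustration under assumptions: the entity name `conv3x3_ones`, the port names and the `WIDTH` generic are made up for the example, it keeps three registers per window row for readability rather than the minimal N-1 taps, and it has no edge handling, so windows near the left and right borders wrap across rows. It has not been checked on hardware.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity: takes one binary pixel per valid clock in
-- raster order and outputs the sum over a 3x3 window (kernel of all ones).
entity conv3x3_ones is
  generic (
    WIDTH : positive := 480  -- pixels per image row (assumed)
  );
  port (
    clk       : in  std_logic;
    pix_valid : in  std_logic;            -- high when pix_in carries a pixel
    pix_in    : in  std_logic;            -- one binary pixel
    sum_out   : out unsigned(3 downto 0)  -- 0..9, sum of the 3x3 window
  );
end entity conv3x3_ones;

architecture rtl of conv3x3_ones is
  -- Two line buffers (M-1 for M = 3 rows), each one image row long.
  type line_t is array (0 to WIDTH - 1) of std_logic;
  signal line1, line2 : line_t := (others => '0');
  signal col : integer range 0 to WIDTH - 1 := 0;
  -- One 3-pixel shift register per window row.
  signal row0, row1, row2 : std_logic_vector(2 downto 0) := (others => '0');
begin
  window_p : process (clk)
    variable above1, above2 : std_logic;
  begin
    if rising_edge(clk) then
      if pix_valid = '1' then
        -- Pixels in this column from one and two rows earlier.
        above1 := line1(col);
        above2 := line2(col);
        -- Advance the line buffers by one pixel.
        line2(col) <= above1;
        line1(col) <= pix_in;
        -- Shift the three window rows left by one pixel.
        row0 <= row0(1 downto 0) & pix_in;
        row1 <= row1(1 downto 0) & above1;
        row2 <= row2(1 downto 0) & above2;
        if col = WIDTH - 1 then
          col <= 0;
        else
          col <= col + 1;
        end if;
      end if;
    end if;
  end process window_p;

  -- Kernel of all ones: the "multiply and sum" reduces to counting set bits.
  sum_p : process (row0, row1, row2)
    variable acc : unsigned(3 downto 0);
  begin
    acc := (others => '0');
    for i in 0 to 2 loop
      if row0(i) = '1' then acc := acc + 1; end if;
      if row1(i) = '1' then acc := acc + 1; end if;
      if row2(i) = '1' then acc := acc + 1; end if;
    end loop;
    sum_out <= acc;
  end process sum_p;
end architecture rtl;
```

The output lags the input: the window shown on `sum_out` is centred one row and one column behind the most recent pixel, so the control logic has to account for that offset.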

Even better, if your matrix is separable into Nx1 and 1xM matrices, you only need one shift register on the input before storing into M-1 line buffers, and it reduces the multiplier usage from NxM to N+M.
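For the all-ones 3x3 kernel, the separable form is a 1x3 horizontal sum followed by a 3x1 vertical sum. A minimal sketch of that arrangement follows, with the same caveats as any untested example: the entity name `conv3x3_separable`, the ports and the `WIDTH` generic are assumptions for illustration, and edge handling is left out. With unit coefficients there are no multipliers at all, so the saving shows up as fewer adders and registers.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity: separable 3x3 box sum on a binary pixel stream.
entity conv3x3_separable is
  generic (
    WIDTH : positive := 480  -- pixels per image row (assumed)
  );
  port (
    clk       : in  std_logic;
    pix_valid : in  std_logic;
    pix_in    : in  std_logic;
    sum_out   : out unsigned(3 downto 0)  -- 0..9
  );
end entity conv3x3_separable;

architecture rtl of conv3x3_separable is
  subtype hsum_t is unsigned(1 downto 0);  -- horizontal sum, 0..3
  type line_t is array (0 to WIDTH - 1) of hsum_t;
  signal line1, line2 : line_t := (others => (others => '0'));
  signal col   : integer range 0 to WIDTH - 1 := 0;
  signal prev  : std_logic_vector(1 downto 0) := "00";  -- two previous pixels
  signal sum_r : unsigned(3 downto 0) := (others => '0');

  -- Widen a single bit to a 2-bit unsigned value.
  function to_u2 (b : std_logic) return hsum_t is
  begin
    if b = '1' then
      return "01";
    else
      return "00";
    end if;
  end function to_u2;
begin
  process (clk)
    variable h              : hsum_t;
    variable above1, above2 : hsum_t;
  begin
    if rising_edge(clk) then
      if pix_valid = '1' then
        -- 1x3 pass: current pixel plus the two previous pixels in the row.
        h    := to_u2(pix_in) + to_u2(prev(0)) + to_u2(prev(1));
        prev <= prev(0) & pix_in;
        -- Line buffers hold the horizontal sums of the two previous rows.
        above1 := line1(col);
        above2 := line2(col);
        line2(col) <= above1;
        line1(col) <= h;
        -- 3x1 pass: add the three vertically aligned horizontal sums.
        sum_r <= resize(h, 4) + resize(above1, 4) + resize(above2, 4);
        if col = WIDTH - 1 then
          col <= 0;
        else
          col <= col + 1;
        end if;
      end if;
    end if;
  end process;

  sum_out <= sum_r;
end architecture rtl;
```

Note that the line buffers here store 2-bit partial sums rather than raw pixels, which is the price paid for sharing the horizontal pass across all three rows.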

sreevenkjan · Nov 27, 2013

I am not able to visualize your idea, so I am sending you my code that does the convolution. My .coe file inside the block RAM is 25600 deep and 9 bits wide. For example, number the pixels of my 480x480 binary image as [1 2 3 4 5 ... 480; 481 482 483 ... 960; 961 962 963 ... 1440; and so on]. I have stored these pixels in the block RAM as 1 2 3 4 5 6 7 8 9, 10 11 12 13 14 15 16 17 18, ... up to row 25600 of the .coe file. As you can see, I have stored the data in the x direction and not in the y direction, which would be 1 481 961 ... and so on. So when I give the data in the x direction my convolution logic works in the x direction, and when I give the data in the y direction it works in the y direction. My filter kernel is [1 1 1;1 1 1;1 1 1], which means it can be separated into [1 1 1] and [1;1;1].

my vhdl code is below

Code:

library IEEE;
library std;
use ieee.std_logic_1164.all;
--use ieee.std_logic_arith.all;
use ieee.std_logic_unsigned.all;
use ieee.numeric_std.all;
use ieee.std_logic_textio.all;
use std.textio.all;
--use std.env.all;

entity testbench is
end testbench;

architecture Behavioral of testbench is

file data:text open write_mode is "data.txt";

COMPONENT blk_mem_pic23_9bit
    PORT (
       clka  : IN STD_LOGIC;
       wea   : IN STD_LOGIC_VECTOR(0 DOWNTO 0);
       ena :    IN STD_LOGIC;
       addra : IN STD_LOGIC_VECTOR(14 DOWNTO 0);
       dina  : IN STD_LOGIC_VECTOR(8 DOWNTO 0);
       douta : OUT STD_LOGIC_VECTOR(8 DOWNTO 0)
    );
END COMPONENT;

--Temporary signal declarations for bram_test.
signal ena : std_logic;
signal wea : std_logic_vector(0 downto 0);
signal dina : std_logic_VECTOR(8 downto 0);
signal douta : std_logic_VECTOR(8 downto 0);
signal addra : std_logic_VECTOR(14 downto 0);
signal clk : std_logic;

begin

--Instantiating BRAM.
bram : blk_mem_pic23_9bit
    port map(
    clka => clk,  
    ena => ena,   
    wea => wea,   
    addra => addra, 
    dina => dina,   
    douta => douta);

--Simulation process.
bram1 : process
variable temp,temp1 : std_logic_vector(8 downto 0);
type mat is array(0 to 8) of std_logic_vector(8 downto 0);
variable value:mat;
variable dil,depth : integer := 1;
variable count: integer := 0;
variable carry : std_logic := '0';
variable carry_fwd : std_logic := '0';
variable txt:line;
begin
ena <='1';
wea <= "0";
dina <= "000000000";
value(8) := "000000000";
addra <= "000000000000000";
for dil in 1 to 4 loop
for depth in 0 to 25600 loop
wea <= "0";
temp  := douta;
            for count in 0 to 8 loop
if (count = 0) then
	if (temp(count) = '1') then
temp1(0) := temp(0) and '1';
temp1(1) := temp(1) and '1';
temp1(3) := temp(3) and '1';
temp1(4) := temp(4) and '1';
value(0) := "000011011";
    elsif (temp(count) = '0') then
	  temp1(count) := '0';
	  value(0):= value(8)+"000000000";
	end if;
elsif (count = 1) then
	if (temp(count) = '1') then
--temp1 := temp(0)*1+temp(1)*1+temp(2)*1+temp(3)*1+temp(4)*1+temp(5)*1;
temp1(0) := temp(0)and'1';
temp1(1) := temp(1)and'1';
temp1(2) := temp(2)and'1';
temp1(3) := temp(3)and'1';
temp1(4) := temp(4)and'1'; 
temp1(5) := temp(5)and'1';
value(1) := "000111111";
      elsif (temp(count) = '0') then
	  temp1 := value(0)+"000000000";
	  value(1):= "000000000";
	end if;
elsif (count = 2) then
if (temp(count) = '1') then
--temp1(count) := temp(1)*1+temp(2)*1+temp(4)*1+temp(5)*1;
temp1(1) := temp(1)and'1';
temp1(2) := temp(2)and'1';
temp1(4) := temp(4)and'1'; 
temp1(5) := temp(5)and'1';
value(2) := "000110110";
      elsif (temp(count) = '0') then
	  temp1 := value(1)+"000000000";
	  value(2):= "000000000";
	end if;
	  
elsif (count = 3) then
if (temp(count) = '1') then
--temp1(count) := temp(0)*1+temp(1)*1+temp(3)*1+temp(4)*1+temp(6)*1+temp(7)*1;
temp1(0) := temp(0)and'1';
temp1(1) := temp(1)and'1';
temp1(3) := temp(3)and'1';
temp1(4) := temp(4)and'1'; 
temp1(6) := temp(6)and'1';
temp1(7) := temp(7)and'1';
value(3) := "011011011";
      elsif (temp(count) = '0') then
	  temp1 := value(2)+"000000000";
	  value(3):= "000000000";
	end if;
	  
elsif (count = 4) then
if (temp(count) = '1') then
--temp1(count) := temp(0)*1+temp(1)*1+temp(2)*1+temp(3)*1+temp(4)*1+temp(5)*1+temp(6)*1+temp(7)*1+temp(8)*1;
temp1(0) := temp(0)and'1';
temp1(1) := temp(1)and'1';
temp1(2) := temp(2)and'1';
temp1(3) := temp(3)and'1';
temp1(4) := temp(4)and'1'; 
temp1(5) := temp(5)and'1';
temp1(6) := temp(6)and'1';
temp1(7) := temp(7)and'1';
temp1(8) := temp(8)and'1';
value(4) := "111111111";
      elsif (temp(count) = '0') then
	  temp1 := value(3)+"000000000";
	  value(4):= "000000000";
	end if;
elsif (count = 5) then
if (temp(count) = '1') then
--temp1(count) := temp(1)*1+temp(2)*1+temp(4)*1+temp(5)*1+temp(7)*1+temp(8)*1;
temp1(1) := temp(1)and'1';
temp1(2) := temp(2)and'1';
temp1(4) := temp(4)and'1'; 
temp1(5) := temp(5)and'1';
temp1(7) := temp(7)and'1';
temp1(8) := temp(8)and'1';
value(5) := "110110110";
      elsif (temp(count) = '0') then
	  temp1 := value(4)+"000000000";
	  value(5):= "000000000";
	end if;
elsif (count = 6) then
if (temp(count) = '1') then
--temp1(count) := temp(3)*1+temp(4)*1+temp(6)*1+temp(7)*1;
temp1(3) := temp(3)and'1';
temp1(4) := temp(4)and'1'; 
temp1(6) := temp(6)and'1';
temp1(7) := temp(7)and'1';
value(6) := "011011000";
      elsif (temp(count) = '0') then
	  temp1 := value(5)+"000000000";
	  value(6):= "000000000";
	end if;
elsif (count = 7) then
if (temp(count) = '1') then
--temp1(count) := temp(3)*1+temp(4)*1+temp(5)*1+temp(6)*1+temp(7)*1+temp(8)*1;
temp1(3) := temp(3)and'1';
temp1(4) := temp(4)and'1'; 
temp1(5) := temp(5)and'1';
temp1(6) := temp(6)and'1';
temp1(7) := temp(7)and'1';
temp1(8) := temp(8)and'1';
value(7) := "111111000";
      elsif (temp(count) = '0') then
	  temp1 := value(6)+"000000000";
	  value(7):= "000000000";
	end if;
elsif (count = 8) then
if (temp(count) = '1') then
--temp1(count) := temp(4)*1+temp(5)*1+temp(7)*1+temp(8)*1;
temp1(4) := temp(4)and'1'; 
temp1(5) := temp(5)and'1';
temp1(7) := temp(7)and'1';
temp1(8) := temp(8)and'1';
value(8) := "110110000";
      elsif (temp(count) = '0') then
	  temp1 := value(7)+"000000000";
	  value(8):= "000000000";
	end if;
end if;                        
end loop;
--wait for 2ns;
wea <= "1";
dina <= value(0) or value(1) or value(2) or value(3) or value(4) or value(5) or value(6) or value(7) or value(8);
if(wea = "1") then
write (txt,dina);
writeline(data,txt);
end if;
wait for 2ns;
wea <="0";
addra <= addra + "1";
if (addra = "110010000000000") then
addra <= "000000000000000";
end if;
                end loop;

             end loop;
report " Simulation finished";
wait;
            
end process bram1;  

--Clock generation - Generates 500 MHz clock with 50% duty cycle.

process
begin
    clk <= '1';
    wait for 1ns;  --"ON" time.
    clk <= '0';
    wait for 1ns;  --"OFF" time.
end process;

end Behavioral;

I have another question: is it possible to store a 480x480 pixel matrix as a matrix-type variable in VHDL, in RAM or ROM? Can it be synthesized? I do not need the whole matrix as the FPGA output; I need it only for my next algorithm, and having the data as a matrix would help me solve it faster. My FPGA output would be only a few pixels of data. I am using a Zynq FPGA.

thanks,
Sreeni

TrickyDicky · Nov 27, 2013

OK. Let's start with this bombshell: you're doomed to failure already, as you're writing VHDL as if it were C code. This code will not synthesise as it is, for many reasons.

I suggest you start over. Start by learning how to write VHDL for different digital design primitives (registers, rams, counters etc) and only when you've got to grips with the basics, then get a blank sheet of paper and draw out the circuit you intend to do convolution with. HDLs are hardware description languages. Without knowing the circuit, you have no chance of describing it.

sreevenkjan · Nov 27, 2013

Hmm, well, I do know that I am a beginner. I have not optimised the code, as you can see; there are many variables which are not used and not needed. Could you tell me where I am going wrong? I was doing a simulation, and the VHDL code above works fine there. I do know how to design counters and registers. I have not written my top module yet; I am trying to debug and fix my algorithm and get my code running. Could you tell me how I can read my data from BRAM in the y direction?

Is it possible to store 480x480 pixels of binary data in a RAM?

TrickyDicky · Nov 27, 2013

Your code is not synthesisable, so it is only a simulation model. And it is written like software, which does not translate very well (if at all) to hardware.
A 2D convolution is not really a beginner project. Start with 1D convolution, and then scale it up to 2 dimensions the way I mentioned (Nx1 and 1xM filters).

Yes, you can store 480x480 in a RAM. You can store whatever image sizes you like, assuming you have enough RAM.
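As a rough illustration of holding a 480x480 binary image (230400 bits) in memory, a RAM can be described as an array signal with a clocked write and a clocked read. The sketch below is an assumption-laden example rather than a verified design: the entity name `image_ram` and its ports are invented here, and whether a given synthesis tool maps this description onto block RAM depends on the tool and its coding guidelines.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity: single-port, 1-bit-wide image memory
-- addressed by (row, col).
entity image_ram is
  generic (
    ROWS : positive := 480;
    COLS : positive := 480
  );
  port (
    clk  : in  std_logic;
    we   : in  std_logic;                      -- write enable
    row  : in  integer range 0 to ROWS - 1;
    col  : in  integer range 0 to COLS - 1;
    din  : in  std_logic;
    dout : out std_logic                       -- valid one clock after the address
  );
end entity image_ram;

architecture rtl of image_ram is
  -- The 2D image is flattened into a 1D array in raster (row-major) order.
  type mem_t is array (0 to ROWS * COLS - 1) of std_logic;
  signal mem : mem_t := (others => '0');
begin
  process (clk)
    variable addr : integer range 0 to ROWS * COLS - 1;
  begin
    if rising_edge(clk) then
      addr := row * COLS + col;
      if we = '1' then
        mem(addr) <= din;
      end if;
      -- Synchronous read; on a simultaneous write this returns the old value.
      dout <= mem(addr);
    end if;
  end process;
end architecture rtl;
```

Only one pixel can be read per clock through a single port, which is why a true "matrix" with random access to many pixels at once does not come for free in hardware.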

sreevenkjan · Nov 27, 2013

I did understand your convolution logic using a shift register, but you have not understood my main question. Since my block RAM stores the data in the x direction, my convolution is only in the x direction. How do I do the convolution in the y direction, i.e. for the pixels stored in the columns of the image?

When you say scale it up to 2 dimensions, do you store the 1D convolved values from the Nx1 filter in a temporary matrix buffer the size of the image, and then do the convolution using the 1xM filter?

TrickyDicky · Nov 27, 2013

This is where the line buffers come in. It doesn't matter how it is stored in memory. You just start at the top left and finish at the bottom right. You only grab one pixel at a time. Then you use shift registers and line buffers to carry out the 2D convolution in real time. You can also effect a rotation, shift or any other translation function on the image simply by reading it in another way.
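To make "grab one pixel at a time" concrete for the 9-bit-wide storage described earlier in the thread, the mapping from an image coordinate to a BRAM word and a bit within that word can be sketched as two small functions. Everything named here (`pixel_map`, `word_addr`, `bit_index`, the constants) is hypothetical, and the sketch assumes that pixels are packed in raster order with the first pixel of each word in bit 0; if the .coe file packs them the other way round, the bit index has to be mirrored.

```vhdl
-- Hypothetical helper package: locate pixel (row, col) inside a memory
-- that packs WORD_BITS consecutive raster-order pixels into each word.
package pixel_map is
  constant IMG_WIDTH : positive := 480;  -- pixels per row (from the thread)
  constant WORD_BITS : positive := 9;    -- pixels per BRAM word (from the thread)

  function word_addr (row, col : natural) return natural;
  function bit_index (row, col : natural) return natural;
end package pixel_map;

package body pixel_map is
  -- Address of the word that contains the pixel.
  function word_addr (row, col : natural) return natural is
  begin
    return (row * IMG_WIDTH + col) / WORD_BITS;
  end function word_addr;

  -- Position of the pixel inside that word (bit 0 = first pixel, assumed).
  function bit_index (row, col : natural) return natural is
  begin
    return (row * IMG_WIDTH + col) mod WORD_BITS;
  end function bit_index;
end package pixel_map;
```

For example, pixel (row 1, col 0) is raster index 480, which falls in word 53 at bit 3. Stepping `col` fastest reads the image in the x direction and stepping `row` fastest reads it in the y direction, from the same stored data. In a testbench these functions can drive the address directly; in synthesised logic, division by 9 may not be supported by every tool, so a pair of counters (bit position 0 to 8, word address incremented on wrap) is the usual substitute for a plain raster scan.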

You only need to read the source image once to do ALL the convolution in 1 go.

sreevenkjan · Nov 27, 2013

In your case I would need a RAM-based shift register that stores my binary image, shifts the pixels one bit at a time, and then does the 2D convolution.

But in my case the input image is already stored as a .coe file in the BRAM. Since the data stored in BRAM is 9 bits wide, which does not match my image row size, I need the 1D convolved data in order to do the convolution in the y direction. While I read the data from the block RAM I do the 1D convolution, i.e. in the x direction.

I found a good article on doing dilation using shift registers and line buffers, but I do not clearly understand what is meant by a line buffer. How many bits of data are there in the line buffer? This is the link:

https://people.ece.cornell.edu/land/courses/ece5760/FinalProjects/s2013/xb46_jw937/xb46_jw937/

TrickyDicky · Nov 28, 2013

Are you doing two separate convolutions? Are you trying to get two sets of output data? What size is the overall convolution matrix?
It doesn't matter where the source data is, or how it is stored. You just create a pipeline to do the convolution (really just a load of multiplies and adds).

That article describes exactly what I already said - line buffers and shift registers to do a 3x3 convolution.

sreevenkjan · Nov 28, 2013

No, I am trying to do convolution 4 times on an image: after the input image is convolved, I convolve the result 3 more times, each time on the previously convolved image. My image is a 480x480 binary image, stored in the block RAM as a .coe file.

My question is: what should the size of my line buffers be? Should each entry be one bit? I do not understand the logic of just shifting the input data and doing the convolution. My convolution matrix is the 3x3 matrix [1 1 1;1 1 1;1 1 1].

TrickyDicky · Nov 28, 2013

That matrix is going to give an image gain of x9. That means each output image will need 4 extra bits (so the final image will need 20 bits per pixel). Or did you intend the values to all be 1/9? There is no reason you cannot pipeline the filters, so you only need one version of the original image and a few buffers in between, and your final output is the result of the 3 convolutions. You should only need to read the source data once.
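Adding 4 bits per pass is a safe upper bound, since 9 < 16. The exact width depends on the input width and the number of passes, and can be worked out from the largest possible value, which is (2^in_bits - 1) * 9^passes. A small sketch of that calculation follows; the package and function names are made up for illustration, and the arithmetic uses plain VHDL integers, so it only holds while the maximum value stays below 2^30 or so.

```vhdl
-- Hypothetical helper package: pixel width needed after repeated
-- convolution with a 3x3 kernel of all ones (gain of 9 per pass).
package conv_width is
  function bits_after_passes (in_bits : positive; passes : natural)
    return positive;
end package conv_width;

package body conv_width is
  function bits_after_passes (in_bits : positive; passes : natural)
    return positive is
    -- Largest value an in_bits-wide unsigned pixel can hold.
    variable max_val : natural  := 2**in_bits - 1;
    variable bits    : positive := 1;
  begin
    -- Each pass can multiply the largest pixel value by at most 9.
    for i in 1 to passes loop
      max_val := max_val * 9;
    end loop;
    -- Smallest width whose unsigned range covers max_val.
    while 2**bits - 1 < max_val loop
      bits := bits + 1;
    end loop;
    return bits;
  end function bits_after_passes;
end package conv_width;
```

For a 1-bit input and 4 passes the largest value is 9^4 = 6561, so `bits_after_passes(1, 4)` gives 13. For an 8-bit input and 3 passes the largest value is 255 * 729 = 185895, which fits in 18 bits, a little under the 20 obtained by adding 4 bits per pass.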

sreevenkjan · Nov 28, 2013

No, I intend to repeat the convolution loop 3 more times, so altogether it would have performed 4 convolutions. My question is: is it better to read one pixel at a time and have line buffers the size of the input read, or to read 9 pixels and pass them on to the shift registers?

Just for confirmation: each time my shift registers shift the convolution window, there should be data coming in to the process and also updating the line buffers, right?

TrickyDicky · Nov 29, 2013

You cannot read 9 pixels in parallel, so you will have to read them 1 at a time anyway.

Welcome to EDAboard.com
