using port map to add rows of ROM

masoud.malekzadeh · Feb 9, 2012

I have defined an adder as a component. How can I use it to add up the rows of a ROM?

mrflibble · Feb 9, 2012

Ey? By typing code using your chosen HDL?

Alternatively: please be more specific. Show us the code you have so far at least.

masoud.malekzadeh · Feb 9, 2012

The entity of the component is predefined in Xilinx (IP core generator). Now I want to add up the rows of the ROM using a port map of the adder.
Here is my code:

library IEEE;
use ieee.std_logic_textio.all;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_unsigned.ALL;
use std.textio.all;

entity ROM is

port (

clk: in std_logic

);

end ROM;

architecture Behavioral of ROM is

type ram_type is array (9 downto 0 ) of std_logic_vector(31 downto 0 );
signal ram :ram_type;

signal w:std_logic_vector(31 downto 0);

signal c:std_logic_vector(31 downto 0);
signal t:std_logic_vector(31 downto 0);
signal Prev_Sum:std_logic_vector(31 downto 0);
signal i : integer range 9 downto 0 :=0;

component adder
port (
a: in std_logic_vector(31 downto 0);
b: in std_logic_vector(31 downto 0);
clk: in std_logic;
result: out std_logic_vector(31 downto 0));
end component;

begin

process
FILE infile : TEXT is in "in_code.txt";
FILE outfile : TEXT IS OUT "out_code.txt";
VARIABLE out_line: LINE;
variable my_line : line;
variable int: std_logic_vector(31 downto 0 ) ;
begin
for i in 0 to 9 loop
readline(infile,my_line);
read (my_line,int);
ram(i)<=int;
write(out_line,int);
writeline(outfile,out_line );
end loop ;
wait; -- Waits forever

end process;

TrickyDicky · Feb 9, 2012

For a ROM implemented in hardware using internal BRAMs, you can only ever access one element at a time, so you need to create a design that reads the elements from the ROM and feeds them into the adder.
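A minimal sketch of what "reading one element at a time" could look like, offered only as an illustration. The entity name `rom_reader`, the `valid` flag and the placeholder ROM contents are assumptions and do not come from the thread. An address counter steps through the table and one word is read per clock, which is the access pattern a block RAM supports.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity: steps through a 10-word ROM, one word per clock.
entity rom_reader is
    port (
        clk   : in  std_logic;
        reset : in  std_logic;                       -- synchronous, active high
        data  : out std_logic_vector(31 downto 0);   -- word read on the last clock edge
        valid : out std_logic                        -- '1' while data holds a fresh ROM word
    );
end entity rom_reader;

architecture rtl of rom_reader is
    type rom_type is array (0 to 9) of std_logic_vector(31 downto 0);
    -- Placeholder contents (assumption); real values would come from your data.
    constant ROM_DATA : rom_type := (
        x"00000001", x"00000002", x"00000003", x"00000004", x"00000005",
        x"00000006", x"00000007", x"00000008", x"00000009", x"0000000A");
    signal addr : integer range 0 to 9 := 0;
    signal done : std_logic := '0';
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if reset = '1' then
                addr  <= 0;
                done  <= '0';
                valid <= '0';
            elsif done = '0' then
                data  <= ROM_DATA(addr);   -- synchronous read: one element per clock
                valid <= '1';
                if addr = 9 then
                    done <= '1';           -- last word has been read
                else
                    addr <= addr + 1;
                end if;
            else
                valid <= '0';              -- nothing new to present
            end if;
        end if;
    end process;
end architecture rtl;
```

The `data`/`valid` pair would then feed one input of the adder, with the running sum on the other input.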

xtcx · Feb 10, 2012

I wonder why the IP generates a "clk" port for your adder design. I think it could be to make the latency finite, say 1 clock. Fine, but why does the output width seem to be the same as the input, 32 bits? It should be 33 if you don't want to drop the carry bit. Or have you set it in the IP parameters?

TrickyDicky · Feb 10, 2012

xtcx said:
I wonder why the IP generates a "clk" port for your adder design. I think it could be to make the latency finite, say 1 clock. Fine, but why does the output width seem to be the same as the input, 32 bits? It should be 33 if you don't want to drop the carry bit. Or have you set it in the IP parameters?

All IP should have a clock input.
And the OP was talking about floating point, so the input and output widths will always be the same.

xtcx · Feb 10, 2012

Oh, it's an FP IP... then they should be the same.

And yeah, all IPs will have a clk, but I'd like to know why that is so.
As you know, adders don't necessarily need clocks for synchronisation. Is it just to give the output a fixed delay in clocks so that it doesn't mess up the timing?
In this case it is an FP adder...

TrickyDicky · Feb 10, 2012

Everything in an FPGA should be clocked, so why wouldn't an adder have one?
Floating point takes several clock cycles to complete in any decent amount of time. For Altera FP cores, the default latency for a floating point square root is 48 clocks IIRC.

xtcx · Feb 10, 2012

TrickyDicky said:
Everything in an FPGA should be clocked, so why wouldn't an adder have one?
Floating point takes several clock cycles to complete in any decent amount of time. For Altera FP cores, the default latency for a floating point square root is 48 clocks IIRC.

Well, I didn't mean FP in this case, where a clk is mandatory because some shifting operations are performed, hence the latency. But for plain adders/multipliers a clock is not necessary. If timing is the concern, just register the inputs and outputs... Why should we use clocked adders/multipliers?

Am I making myself clear?

TrickyDicky · Feb 10, 2012

Yes, I see what you're saying, but why go to the bother of adding them externally when it's just cleaner to contain them internally? Many IP cores like this have a pipeline parameter which allows you to set the pipeline length, for timing reasons and also to ensure pipe lengths match across parallel pipelines. You can usually set this to 0 if you really want it unregistered (but why bother if you're going to add them externally)?

In addition, some compilers allow register retiming, which means they can pull registers into the IP if they fail to meet timing, but it's much easier just to set a pipe length and let the retiming deal with those, rather than working out which external registers it is allowed to use.

Then finally, some multiply/accumulators have embedded registers. In the past I have seen these registers only used when you included them in the IP. Synthesizers are better now, but why not use the embedded ones? And remember IP cores usually cover different FPGAs, so architecture can vary, and one IP block covers them all.

So many many reasons to have a clock input (and you'd be silly not to use it).
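To illustrate the pipeline-length parameter described above, here is a hedged sketch of an adder with a generic pipeline depth. It is a plain integer adder, not the floating-point core from the thread, and the names `pipelined_adder` and `PIPE_STAGES` are assumptions made for this example. With `PIPE_STAGES = 0` the generate loop is empty, the path is purely combinatorial and `clk` is unused, yet the port list stays the same for every depth.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: integer adder with a configurable number of output registers.
entity pipelined_adder is
    generic (
        PIPE_STAGES : natural := 1   -- 0 = purely combinatorial
    );
    port (
        clk    : in  std_logic;
        a      : in  std_logic_vector(31 downto 0);
        b      : in  std_logic_vector(31 downto 0);
        result : out std_logic_vector(31 downto 0)
    );
end entity pipelined_adder;

architecture rtl of pipelined_adder is
    type pipe_type is array (0 to PIPE_STAGES) of std_logic_vector(31 downto 0);
    signal pipe : pipe_type;
begin
    -- Stage 0 is the unregistered sum (carry out is dropped).
    pipe(0) <= std_logic_vector(unsigned(a) + unsigned(b));

    -- One register per requested pipeline stage; no registers when PIPE_STAGES = 0.
    gen_regs : for s in 1 to PIPE_STAGES generate
        process (clk)
        begin
            if rising_edge(clk) then
                pipe(s) <= pipe(s - 1);
            end if;
        end process;
    end generate gen_regs;

    result <= pipe(PIPE_STAGES);
end architecture rtl;
```

Registers placed at the output like this are also the kind a retiming-capable synthesis tool may move back into the logic, which is why a simple delay line can be enough for a sketch.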

xtcx · Feb 10, 2012

Mmm... sounds good, I agree. But my point is that it doesn't apply to every application.
Let's look at my case...

I had to design with external registering because I needed to perform some 30 arithmetic operations, some of them multiplies and some adds. I simply instantiated the same custom logic as many times as required and registered all 30 outputs globally. This gave me the advantage of lower clock loading, and hence less fan-out as well as less routing. Timing wasn't a problem as the requirement was low, but it was still running above 125 MHz. I couldn't think of a way to use clocked arithmetic logic here. You can see for yourself the reason why, and which would be better in such cases!

mrflibble · Feb 10, 2012

TrickyDicky said:
Yes, I see what you're saying, but why go to the bother of adding them externally when it's just cleaner to contain them internally? Many IP cores like this have a pipeline parameter which allows you to set the pipeline length, for timing reasons and also to ensure pipe lengths match across parallel pipelines. You can usually set this to 0 if you really want it unregistered (but why bother if you're going to add them externally)?

You would bother setting it to 0 if you are trying to shave off yet another cycle in the entire pipeline. ;-) But maybe your point is ... why bother using an adder IP for the specific non-buffered case? If so, good question. The only reason I can think of is that you don't know in advance whether you can get away with 0 delay at that point, because it will depend on a lot of the To Be Designed surrounding modules. So you'd like the freedom of it becoming 0, 1 or 2 delay.

In addition, some compilers allow register retiming, which means they can pull registers into the IP if they fail to meet timing, but it's much easier just to set a pipe length and let the retiming deal with those, rather than working out which external registers it is allowed to use.

Agreed. Although maybe you could enlighten me here ... I have in the past tried to do exactly this, have the tools do their retiming thing. And I gave the tools plenty of room to work with, slapped on registers both at input + output, enabled retiming moving registers backwards and forwards. Hell, depending on my mood I even allowed it to flatten the entire design so it could mess stuff up with my signal names.

But IMO the results were lackluster at best. It could very well be that I am just doing it wrong, but I can't think of what to do to improve it. The best thing was just to say Fsck It! and hand optimize it. That did give me the performance I was after, and didn't take eons to synthesize. Because that's another gripe I have with retiming: it takes too frigging long. Again, I could be doing it totally wrong. So I'd love to have some pointers / reading material on how to do it properly!

Then finally, some multiply/accumulators have embedded registers. In the past I have seen these registers only used when you included them in the IP. Synthesizers are better now, but why not use the embedded ones? And remember IP cores usually cover different FPGAs, so architecture can vary, and one IP block covers them all.

So many many reasons to have a clock input (and you'd be silly not to use it).

I guess you mean that you just provide the clock input regardless. For everything with 1 or more pipeline delays the clock actually gets used inside the IP, and for 0 it doesn't get used and is optimized away. If so, that would make perfect sense, because you want to have the same interface to your IP regardless of the pipeline depth parameter....

masoud.malekzadeh · Feb 10, 2012

I would be thankful if you could suggest a way to add the rows of this ROM together using this component. The clock is not my issue!

mrflibble · Feb 10, 2012

masoud.malekzadeh said:
I would be thankful if you could suggest a way to add the rows of this ROM together using this component. The clock is not my issue!

In which case you can apply this response:

TrickyDicky said:
For a ROM implemented in hardware using internal BRAMs, you can only ever access one element at a time, so you need to create a design that reads the elements from the ROM and feeds them into the adder.

K-J · Feb 11, 2012

masoud.malekzadeh said:
I would be thankful if you could suggest a way to add the rows of this ROM together using this component. The clock is not my issue!

Then I suggest you review the thread that you started on this board called 'Port mapping with process'. Here is the link, since you seem to have lost it: https://www.edaboard.com/threads/239415/

In there you'll see that, after being given the framework for the solution, you also got feedback on a debug question you asked. That ended that discussion, but for whatever reason it spawned this new thread. Except that in this thread you for some reason did not include the relevant code for sequencing and accumulating the ROM data... even though that is exactly what you're asking about here.

Kevin Jennings

masoud.malekzadeh · Feb 11, 2012

process(clk)

begin
if rising_edge(clk) then

if (i <= 9) then
Prev_Sum <= t;
w<=ram(i);
i <= i + 1;
end if;
end if ;
end process;

g1 : adder port map (a => Prev_Sum,b=>w,clk => clk,result =>t);

I used the code above as you said, but the Prev_Sum signal is somehow always zero and t equals w. The only way I found to solve this problem was to change the length of my array to 8 and use 3 successive for-generate stages to add the data:

----------------------- Mean calculation for data 1
GEN_ADDERS1 : for i in 0 to 3 generate
g1:adder port map (a => data1(2*i),b=>data1((2*i)+1),clk => clk,result =>ram1(i));
end generate;

GEN_ADDERS2 : for i in 0 to 1 generate
g1:adder port map (a => ram1(2*i),b=>ram1((2*i)+1),clk => clk,result =>ram2(i));
end generate;

g1:adder port map (a => ram2(0),b=>ram2(1),clk => clk,result =>mean1);

GEN_Subtractor : for i in 0 to 7 generate
g1:subtractor port map (a => data1(i),b=>mean1,clk => clk,result =>res1(i));
end generate;

----------------------- end calculation for data 1

But the real length of my data is 1024, and I have to do this arithmetic many times...

K-J · Feb 11, 2012

masoud.malekzadeh said:
process(clk)

begin
if rising_edge(clk) then

if (i <= 9) then
Prev_Sum <= t;
w<=ram(i);
i <= i + 1;
end if;
end if ;
end process;

g1 : adder port map (a => Prev_Sum,b=>w,clk => clk,result =>t);

I used the code above as you said, but the Prev_Sum signal is somehow always zero and t equals w.

That's not the code I posted; that's what you came up with after modifying it a bit.

The only way I found to solve this problem was to change the length of my array to 8 and use 3 successive for-generate stages to add the data:

I have no clue why you think that what you did is the only way to solve this problem.

----------------------- Mean calculation for data 1
GEN_ADDERS1 : for i in 0 to 3 generate
g1:adder port map (a => data1(2*i),b=>data1((2*i)+1),clk => clk,result =>ram1(i));
end generate;

GEN_ADDERS2 : for i in 0 to 1 generate
g1:adder port map (a => ram1(2*i),b=>ram1((2*i)+1),clk => clk,result =>ram2(i));
end generate;

g1:adder port map (a => ram2(0),b=>ram2(1),clk => clk,result =>mean1);

GEN_Subtractor : for i in 0 to 7 generate
g1:subtractor port map (a => data1(i),b=>mean1,clk => clk,result =>res1(i));
end generate;

----------------------- end calculation for data 1

And now you've changed the problem into something completely different. You have many new undeclared signals and no definition of just what you're trying to do.

But the real length of my data is 1024, and I have to do this arithmetic many times...

Whatever that is supposed to mean...

Anyway, assuming that the actual problem you are trying to solve is to add up the contents of a ROM, the code I have posted below works correctly provided that entity 'adder' is purely combinatorial. I realize that 'adder' probably has at least one clock cycle of latency, and I'll leave it to you to figure out how this might affect things in your application.

As a side note, if you truly are trying to 'add up the contents of a ROM', then the obvious question would be 'Why?'. The contents of the ROM are known, so the sum of those contents can be computed and is therefore also known ahead of time. You can compute this either manually outside of the source code or (as I prefer) right in the VHDL code itself. There would be no 'adder' entity at all; you would add up the ROM table entries and compute the result. But I'll assume for now that you're simply using the ROM table as a method to get the adder functionality working and that in the end you're not really using hardware to add up the contents of a known fixed table.
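The side note about computing the sum "right in the VHDL code itself" might be sketched as follows. This is only an illustration under stated assumptions: the table is a constant with placeholder values rather than being loaded from a file, the entries are treated as unsigned integers (not floating point), and the names `rom_sum`, `sum_rom`, `ROM_DATA` and `ROM_SUM` are invented for the example. The function runs during elaboration, so no adder hardware is needed for the sum.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: the sum of a constant table, computed at elaboration time.
entity rom_sum is
    port (
        total : out std_logic_vector(31 downto 0)
    );
end entity rom_sum;

architecture rtl of rom_sum is
    type rom_type is array (natural range <>) of unsigned(31 downto 0);

    -- Placeholder contents (assumption); real values would come from your data.
    constant ROM_DATA : rom_type(0 to 9) := (
        x"00000001", x"00000002", x"00000003", x"00000004", x"00000005",
        x"00000006", x"00000007", x"00000008", x"00000009", x"0000000A");

    -- Adds up every entry of the table; the result wraps modulo 2**32.
    function sum_rom (rom : rom_type) return unsigned is
        variable acc : unsigned(31 downto 0) := (others => '0');
    begin
        for k in rom'range loop
            acc := acc + rom(k);
        end loop;
        return acc;
    end function sum_rom;

    -- Evaluated once by the tool, not by logic in the FPGA.
    constant ROM_SUM : unsigned(31 downto 0) := sum_rom(ROM_DATA);
begin
    total <= std_logic_vector(ROM_SUM);
end architecture rtl;
```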

Kevin Jennings

Code:

library IEEE;
use ieee.std_logic_textio.all;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.numeric_std.ALL;
entity adder is
port (
    a:      in  std_logic_vector(31 downto 0);
    b:      in  std_logic_vector(31 downto 0);
    clk:    in  std_logic;
    result: out std_logic_vector(31 downto 0));
end adder;
architecture rtl of adder is
begin
    result <= std_logic_vector(unsigned(a) + unsigned(b));-- when rising_edge(clk);
end rtl;

library IEEE;
use ieee.std_logic_textio.all;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_unsigned.ALL;
use std.textio.all;

entity ROM is
port ( 
--    x_out:  inout   std_logic_vector(31 downto 0 );
--    x_in:   inout   std_logic_vector(31 downto 0 );
    clk:    in      std_logic;
    reset:  in      std_logic
);

end ROM;

architecture Behavioral of ROM is
    type ram_type is array (9 downto 0 ) of std_logic_vector(31 downto 0 );
    signal ram :ram_type;
    signal i : integer range ram_type'range;
    signal w:std_logic_vector(31 downto 0);
    signal y:std_logic_vector(31 downto 0);
    signal t:std_logic_vector(31 downto 0);
    signal Prev_Sum:std_logic_vector(31 downto 0);

    component adder
    port (
        a: in std_logic_vector(31 downto 0);
        b: in std_logic_vector(31 downto 0);
        clk: in std_logic;
        result: out std_logic_vector(31 downto 0));
    end component;
begin 
    process
        FILE infile : TEXT is in "in_code.txt";
        FILE outfile : TEXT IS OUT "out_code.txt";
        VARIABLE out_line: LINE;
        variable my_line : line;
        variable int: std_logic_vector(31 downto 0 ) ;
    begin 
        for i in 0 to 9 loop
        readline(infile,my_line);
        read (my_line,int);
        ram(i)<=int;
        write(out_line,int);
        writeline(outfile,out_line );
        end loop ;
        wait; -- Waits forever
    end process;

    process(clk)
    begin
      if rising_edge(clk) then
        if (reset = '1') then -- Something that resets the accumulated sum to 0
          Prev_Sum <= (others => '0');  -- The accumulated sum
          i <= 0;
        elsif (i <= 8) then
          Prev_Sum <= t;  -- Save the updated sum
          i <= i + 1;
        end if;
      end if;
    end process;

    w <= ram(i);
    g1 : adder port map (a => w,b=>Prev_Sum,clk => clk,result =>t);
end Behavioral;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity tb_ROM is
end tb_ROM;
architecture RTL of tb_ROM is
    signal clk:             std_logic := '0';
    signal reset:           std_logic;
    signal sim_complete:    std_logic;
begin
    reset   <= '1', '0' after 10 ns;
    sim_complete    <= '0', '1' after 200 ns;
    clk     <= not(sim_complete) and not(clk) after 5 ns;

    DUT : entity work.ROM
    port map ( 
        clk     => clk,
        reset   => reset);
end RTL;
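One possible way to handle an adder with exactly one clock of latency, offered as a sketch and not as the poster's solution. The entity `rom_accum_lat1`, the `active`, `first` and `done` signals, the placeholder ROM contents and the inline registered adder (a stand-in for the IP core) are all assumptions. The idea is to feed the adder's registered result straight back into its `b` input, so the adder's own output register acts as the accumulator. Zero is forced onto `b` for the first element, and onto `a` once the table is exhausted so that the result holds, which assumes that adding zero leaves the value unchanged.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: accumulate a 10-word ROM through a latency-1 adder.
entity rom_accum_lat1 is
    port (
        clk   : in  std_logic;
        reset : in  std_logic;                       -- synchronous, active high
        sum   : out std_logic_vector(31 downto 0);
        done  : out std_logic                        -- '1' once sum is final
    );
end entity rom_accum_lat1;

architecture rtl of rom_accum_lat1 is
    type rom_type is array (0 to 9) of std_logic_vector(31 downto 0);
    -- Placeholder contents (assumption).
    constant ROM_DATA : rom_type := (
        x"00000001", x"00000002", x"00000003", x"00000004", x"00000005",
        x"00000006", x"00000007", x"00000008", x"00000009", x"0000000A");
    constant ZERO : std_logic_vector(31 downto 0) := (others => '0');

    signal i       : integer range 0 to 9 := 0;
    signal active  : std_logic := '0';   -- '1' while ROM words are being fed in
    signal first   : std_logic := '0';   -- '1' only while the first word is fed in
    signal done_r  : std_logic := '0';
    signal a, b, t : std_logic_vector(31 downto 0) := (others => '0');
begin
    -- Sequencer: walks the address from 0 to 9 after reset, then stops.
    process (clk)
    begin
        if rising_edge(clk) then
            if reset = '1' then
                i      <= 0;
                active <= '1';
                first  <= '1';
                done_r <= '0';
            elsif active = '1' then
                first <= '0';
                if i = 9 then
                    active <= '0';
                    done_r <= '1';   -- t takes its final value on this same edge
                else
                    i <= i + 1;
                end if;
            end if;
        end if;
    end process;

    a <= ROM_DATA(i) when active = '1' else ZERO;  -- add zero once finished
    b <= ZERO when first = '1' else t;             -- start the sum from zero

    -- Stand-in for the adder core, modelled with exactly one clock of latency.
    process (clk)
    begin
        if rising_edge(clk) then
            t <= std_logic_vector(unsigned(a) + unsigned(b));
        end if;
    end process;

    sum  <= t;
    done <= done_r;
end architecture rtl;
```

This direct feedback only works when the latency is exactly one clock. A core with a longer latency would need a different scheme, for example issuing a new addition only every few clocks or interleaving several partial sums.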
