  1. #1
    Member level 4
    Points: 1,110, Level: 7

    Join Date
    Jun 2015
    Posts
    77
    Helped
    0 / 0
    Points
    1,110
    Level
    7

    Vivado Taking A Long Time To Run Synthesis & Implementation

    I am new to Vivado, but it seems like Vivado 17.4 takes longer than it should to run through synthesis and implementation. I'm working on a design of the SHA-512 algorithm (a hash function used in security); utilization is attached.
    Implementation takes around 3 hours to complete.
    Is my computer a significant factor, or is this normal in Vivado? And how can I speed this up?



    [Attached image: Capture22.PNG]

  2. #2
    Advanced Member level 5
    Points: 8,111, Level: 21

    Join Date
    Apr 2016
    Posts
    1,707
    Helped
    300 / 300
    Points
    8,111
    Level
    21

    Re: Vivado Taking A Long Time To Run Synthesis & Implementation

    Runtime is not proportional only to design size. Maybe the target frequency is too high and the tool spends a lot of time optimizing. Maybe the I/O placement or floorplan is bad and the tool tries to overcome that.
    Really, I am not Sam.



  3. #3
    Super Moderator
    Points: 256,061, Level: 100
    Awards:
    1st Helpful Member

    Join Date
    Jan 2008
    Location
    Bochum, Germany
    Posts
    44,650
    Helped
    13586 / 13586
    Points
    256,061
    Level
    100

    Re: Vivado Taking A Long Time To Run Synthesis & Implementation

    Sounds like a poorly designed (e.g. purely combinational) SHA implementation that can hardly meet timing.



  4. #4
    Super Moderator
    Points: 30,595, Level: 42
    ads-ee's Avatar
    Join Date
    Sep 2013
    Location
    USA
    Posts
    7,038
    Helped
    1682 / 1682
    Points
    30,595
    Level
    42

    Re: Vivado Taking A Long Time To Run Synthesis & Implementation

    Evidence for it lacking pipelining...

    LUT utilization of 17%
    FF utilization of 1%
    along with IO utilization of 70% means the design is spread all over the die but has virtually no registers to pipeline across the die.

    I recall other threads on aspects of this design and from what I remember of the code snippets, I didn't think this design would run better than a few MHz.



  5. #5
    Full Member level 5
    Points: 2,171, Level: 10

    Join Date
    May 2014
    Posts
    253
    Helped
    27 / 27
    Points
    2,171
    Level
    10

    Re: Vivado Taking A Long Time To Run Synthesis & Implementation

    You can get timestamps for when the various parts of place and route complete.

    This way you can get a feel for how long design init, opt_design, place_design and route_design take.

    If the design is pipelined, the tool has an easier time during placement.



  6. #6
    Member level 4
    Points: 1,110, Level: 7

    Join Date
    Jun 2015
    Posts
    77
    Helped
    0 / 0
    Points
    1,110
    Level
    7

    Re: Vivado Taking A Long Time To Run Synthesis & Implementation

    The SHA algorithm has 80 rounds (iterations); could this be the reason?
    Would loop pipelining or loop unrolling, as described at https://www.xilinx.com/support/docum...unrolling.html, improve the design's performance?

    - - - Updated - - -

    Quote Originally Posted by ads-ee View Post
    Evidence for it lacking pipelining...

    LUT utilization of 17%
    FF utilization of 1%
    along with IO utilization of 70% means the design is spread all over the die but has virtually no registers to pipeline across the die.

    I recall other threads on aspects of this design and from what I remember of the code snippets, I didn't think this design would run better than a few MHz.
    Yes, I have posted threads about this before, but honestly I have changed several things to improve it and it still takes a long time during implementation.



  7. #7
    Super Moderator
    Points: 30,595, Level: 42
    ads-ee's Avatar
    Join Date
    Sep 2013
    Location
    USA
    Posts
    7,038
    Helped
    1682 / 1682
    Points
    30,595
    Level
    42

    Re: Vivado Taking A Long Time To Run Synthesis & Implementation

    If you are doing this in HLS then you need the pragma for pipelining according to that page.



  8. #8
    Super Moderator
    Points: 256,061, Level: 100
    Awards:
    1st Helpful Member

    Join Date
    Jan 2008
    Location
    Bochum, Germany
    Posts
    44,650
    Helped
    13586 / 13586
    Points
    256,061
    Level
    100

    Re: Vivado Taking A Long Time To Run Synthesis & Implementation

    "is Loop Pipelining or Loop Unrolling like that"
    The link is about pipelining with the HLS compiler; it is not applicable to generic HDL code.

    As far as I understand, you are coding in generic VHDL. You need to implement pipelining explicitly using a clock and pipeline registers.
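
    A minimal sketch of what explicit pipelining means in plain VHDL, assuming a made-up four-operand adder rather than anything from the posted design (PipelinedSum4 and its port names are hypothetical). Instead of one long chain of three additions, the work is split into two clocked stages with registers in between:

    ```vhdl
    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.NUMERIC_STD.ALL;

    -- Hypothetical example entity (not part of the original design):
    -- computes a + b + c + d in two clocked stages instead of one adder chain.
    entity PipelinedSum4 is
        port (
            clk        : in  std_logic;
            a, b, c, d : in  std_logic_vector(63 downto 0);
            sum        : out std_logic_vector(63 downto 0)  -- valid 2 clocks after the inputs
        );
    end PipelinedSum4;

    architecture rtl of PipelinedSum4 is
        -- Pipeline registers between stage 1 and stage 2
        signal ab_reg, cd_reg : unsigned(63 downto 0) := (others => '0');
    begin
        process (clk)
        begin
            if rising_edge(clk) then
                -- Stage 1: two independent additions, each registered
                ab_reg <= unsigned(a) + unsigned(b);
                cd_reg <= unsigned(c) + unsigned(d);
                -- Stage 2: combine the partial sums registered on the previous edge
                sum <= std_logic_vector(ab_reg + cd_reg);
            end if;
        end process;
    end rtl;
    ```

    The longest register-to-register path is now a single 64-bit addition, at the cost of two clock cycles of latency.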



  9. #9
    Member level 4
    Points: 1,110, Level: 7

    Join Date
    Jun 2015
    Posts
    77
    Helped
    0 / 0
    Points
    1,110
    Level
    7

    Re: Vivado Taking A Long Time To Run Synthesis & Implementation

    Suppose I have this part of the code:
    Code:
    w(15)<=Message_block(63 downto 0);
    w(14)<=Message_block(127 downto 64);
    w(13) <=Message_block(191 downto 128);
    w(12)<=Message_block(255 downto 192);
    w(11) <=Message_block(319 downto 256);
    w(10) <=Message_block(383 downto 320);
    w(9) <=Message_block(447 downto 384);
    w(8) <=Message_block(511 downto 448);
    w(7) <=Message_block(575 downto 512);
    w(6) <=Message_block(639 downto 576);
    w(5) <=Message_block(703 downto 640);
    w(4) <=Message_block(767 downto 704);
    w(3)<=Message_block(831 downto 768);
    w(2)<=Message_block(895 downto 832);
    w(1)<=Message_block(959 downto 896);
    w(0)<=Message_block(1023 downto 960);
    wordGen : for t in 16 to (79) generate
    
     WOW : WordT port map(w((t-2)), w((t-7)) , w((t-15)) , w((t-16)),w(t));
    end generate ;
    How can I make it pipelined?
    What about this attempt (adding this part to the previous code)?
    Code:
    REGIS: for i in 0 to 79  generate
    
    reg: Reg64 port map (clk,rst,w(i),wo(i));
    
    end generate;
    where the Reg64 code is:
    Code:
    entity Reg64 is
     Port (clk,rst:in std_logic;
           D: in std_logic_vector(63 downto 0);
           RegW: out std_logic_vector(63 downto 0) );
    end Reg64;
    
    architecture Behavioral of Reg64 is
    
    begin
    
    process(clk,rst)
    begin
    if (rst ='0' ) then RegW <= (others=> '0');
    elsif (clk'event and clk = '1') then
    RegW <= D;
    end if;
    end process;
    end Behavioral;



  10. #10
    Advanced Member level 5
    Points: 8,111, Level: 21

    Join Date
    Apr 2016
    Posts
    1,707
    Helped
    300 / 300
    Points
    8,111
    Level
    21

    Re: Vivado Taking A Long Time To Run Synthesis & Implementation

    I can't understand what you are doing (coding a register bank?!), but it doesn't look like pipelining. When we say to pipeline, we mean splitting a long combinational logic path into two or more stages. The first code snippet has no combinational logic, just some mapping. I doubt that is the problem.
    Really, I am not Sam.



  11. #11
    Member level 4
    Points: 1,110, Level: 7

    Join Date
    Jun 2015
    Posts
    77
    Helped
    0 / 0
    Points
    1,110
    Level
    7

    Re: Vivado Taking A Long Time To Run Synthesis & Implementation

    Quote Originally Posted by ThisIsNotSam View Post
    I can't understand what you are doing (coding a register bank?!), but it doesn't look like pipelining. When we say to pipeline, we mean splitting a long combinational logic path into two or more stages. The first code snippet has no combinational logic, just some mapping. I doubt that is the problem.
    "When we say to pipeline, we mean splitting a long combinational logic path into two or more stages"
    >> I did this and it still takes a long time.

    - - - Updated - - -

    This is part of my original code. Please, I need some notes (just notes) on how to pipeline it:
    Code:
    library IEEE;
    use IEEE.STD_LOGIC_1164.all;
    use IEEE.NUMERIC_STD.All;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;
    
    
    entity Round_sha is
    
    port( clk,rst,init: in STD_LOGIC;   
          IV: in std_logic_vector(511 downto 0);
          Message_block: in std_logic_vector(1023 downto 0);
         Hashed_512: out std_logic_vector(511 downto 0));
    end Round_sha;
    
    
    architecture rtl of Round_sha is 
    
    component K_rom IS
    PORT ( addr: IN INTEGER RANGE 0 TO 79;
    data: OUT STD_LOGIC_VECTOR (63 DOWNTO 0));
    END component;
    
    component WordT is 
    port (w2,w7,w15,w16: in std_logic_vector (63 downto 0);
          wnext: out std_logic_vector (63 downto 0));
    end component;  
    
    Component Func_round is
    port(a,b,c,e,f,g: in std_logic_vector(63 downto 0);
    f0,f1,f2,f3: out std_logic_vector( 63 downto 0));
    end component;
    
    
    
    type word is array (-1 to 79 ) of std_logic_vector(63 downto 0);
    signal a,b,c,d,e,f,g,h:word;
    type word2 is array (0 to 79 ) of std_logic_vector(63 downto 0);
    signal w,f0,f1,f2,f3,k: word2;
    
    begin 
    
    
    --assigning block to words
    
    
    --words from t=0:15
    w(15)<=Message_block(63 downto 0);
    w(14)<=Message_block(127 downto 64);
    w(13) <=Message_block(191 downto 128);
    w(12)<=Message_block(255 downto 192);
    w(11) <=Message_block(319 downto 256);
    w(10) <=Message_block(383 downto 320);
    w(9) <=Message_block(447 downto 384);
    w(8) <=Message_block(511 downto 448);
    w(7) <=Message_block(575 downto 512);
    w(6) <=Message_block(639 downto 576);
    w(5) <=Message_block(703 downto 640);
    w(4) <=Message_block(767 downto 704);
    w(3)<=Message_block(831 downto 768);
    w(2)<=Message_block(895 downto 832);
    w(1)<=Message_block(959 downto 896);
    w(0)<=Message_block(1023 downto 960);
    a(-1) <= IV(511 downto 448);        -- <= X"6a09e667f3bcc908";
    b(-1) <= IV(447 downto 384);        -- <= X"bb67ae8584caa73b";
    c(-1) <= IV(383 downto 320);        -- <= X"3c6ef372fe94f82b";
    d(-1) <= IV(319 downto 256);        -- <= X"a54ff53a5f1d36f1";
    e(-1) <= IV(255 downto 192);        -- <= X"510e527fade682d1";
    f(-1) <= IV(191 downto 128);        -- <= X"9b05688c2b3e6c1f";
    g(-1) <= IV(127 downto 64) ;       -- <= X"1f83d9abfb41bd6b";
    h(-1) <= IV(63 downto 0);        -- <= X"5be0cd19137e2179";
    --words from t=16:79
    wordGen : for t in 16 to (79) generate
    
     WOW: WordT port map(w((t-2)) , w((t-7)) , w((t-15)) , w((t-16)),w(t));
    
    end generate ; -- wordGen
    
    Funcc: for i in 0 to 79  generate
    
    Func: Func_round port map(a(i-1),b(i-1),c(i-1),e(i-1),f(i),g(i),f0(i),f1(i),f2(i),f3(i));
    KROM: K_rom     port map (i,k(i));
            h(i)          <=  g(i-1);
            g(i)          <=  f(i-1);
            f(i)          <=  e(i-1);
    --	e          <=  d +          T1        ;
    --      e          <=  d + h + f3 + f0 + k + w;
            e (i)         <= std_logic_vector(unsigned(d(i-1)) + Unsigned(h(i-1)) + unsigned(f3(i)) + unsigned(f0(i)) + unsigned(k(i)) + unsigned(w(i)));
            d (i)         <=  c(i-1);
            c (i)         <=  b(i-1);
            b (i)         <=  a(i-1);
    --	a          <=             T1           +    T2  ;
    --      a          <=      h + f3 + f0 + k + w + f2 + f1;
            a(i)          <= std_logic_vector(unsigned(h(i-1)) +unsigned(f3(i)) + unsigned(f0(i)) +unsigned(k(i)) + unsigned(w(i))  + unsigned(f2(i)) + unsigned(f1(i)));
    
    
    end generate;
    
     process (clk,rst)
    variable   H0 , H1 , H2 , H3 , H4 , H5 , H6 ,H7     :     Std_logic_vector ( 63 downto 0);
      begin
       
          if (rst = '1') then
            H0         := X"6a09e667f3bcc908";
            H1         := X"bb67ae8584caa73b";
            H2         := X"3c6ef372fe94f82b";
            H3         := X"a54ff53a5f1d36f1";
            H4         := X"510e527fade682d1";
            H5         := X"9b05688c2b3e6c1f";
            H6         := X"1f83d9abfb41bd6b";
            H7         := X"5be0cd19137e2179";
          
          elsif  (init ='1') then
            if ((clk = '1') and clk'event) then 
            H0         := std_logic_vector( unsigned(a(79)) + unsigned(a(-1)));
            H1         := std_logic_vector( unsigned(b(79)) + unsigned(b(-1)));
            H2         := std_logic_vector( unsigned(c(79)) + unsigned(c(-1)));
            H3         := std_logic_vector( unsigned(d(79)) + unsigned(d(-1)));
            H4         := std_logic_vector( unsigned(e(79)) + unsigned(e(-1)));
            H5         := std_logic_vector( unsigned(f(79)) + unsigned(f(-1)));
            H6         := std_logic_vector( unsigned(g(79)) + unsigned(g(-1)));
            H7         := std_logic_vector( unsigned(h(79)) + unsigned(h(-1)));
    --      h0         <=      a + h0;
    --      h1         <=      b + h1;
    --      h2         <=      c + h2;
    --      h3         <=      d + h3;
    --      h4         <=      e + h4;
    --      h5         <=      f + h5;
    --      h6         <=      g + h6;
    --      h7         <=      h + h7;
        end if;
    end if;
    Hashed_512<= H0&H1&H2&H3&H4&H5&H6&H7;
      end process;
    
    end rtl;



  12. #12
    Advanced Member level 5
    Points: 37,484, Level: 47
    Achievements:
    7 years registered

    Join Date
    Jun 2010
    Posts
    6,810
    Helped
    1999 / 1999
    Points
    37,484
    Level
    47

    Re: Vivado Taking A Long Time To Run Synthesis & Implementation

    Why is init outside the clock condition? You're forcing the clock into logic, which is probably causing massive timing failures. Either put the init check inside the clocked branch or remove it altogether.
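
    For illustration, a minimal sketch of the conventional register template with the init check moved inside the clocked branch, where it acts as a clock enable. EnabledReg64 and its ports are hypothetical names, not the poster's entity, and the active-high reset is an assumption taken from the posted process:

    ```vhdl
    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;

    -- Hypothetical stand-alone register: asynchronous reset first, then the
    -- clock edge, with init tested only inside the clocked branch.
    entity EnabledReg64 is
        port (
            clk, rst, init : in  std_logic;
            d              : in  std_logic_vector(63 downto 0);
            q              : out std_logic_vector(63 downto 0)
        );
    end EnabledReg64;

    architecture rtl of EnabledReg64 is
    begin
        process (clk, rst)
        begin
            if rst = '1' then
                q <= (others => '0');      -- asynchronous reset (assumed active high)
            elsif rising_edge(clk) then
                if init = '1' then         -- init is a clock enable, not a clock gate
                    q <= d;
                end if;
            end if;
        end process;
    end rtl;
    ```

    Written this way, the clock only ever reaches the clock pins of the flip-flops and init maps onto their enable input.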

    You also have massive logic chains, as I don't see any pipelining between the components generated in the large generate loops! You need pipelining at ALL stages, not just the output stage. Your design has basically zero pipelining.
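
    A minimal sketch of registering every stage inside a generate loop, assuming a placeholder operation (adding the stage index) in place of the real round logic. PipelinedChain, STAGES and the signal names are made up for illustration:

    ```vhdl
    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.NUMERIC_STD.ALL;

    -- Hypothetical chain of generated stages with a register after each one.
    entity PipelinedChain is
        generic (STAGES : positive := 4);
        port (
            clk  : in  std_logic;
            din  : in  std_logic_vector(63 downto 0);
            dout : out std_logic_vector(63 downto 0)  -- valid STAGES clocks after din
        );
    end PipelinedChain;

    architecture rtl of PipelinedChain is
        type stage_array is array (0 to STAGES) of unsigned(63 downto 0);
        signal s : stage_array := (others => (others => '0'));
    begin
        s(0) <= unsigned(din);

        stage_gen : for i in 1 to STAGES generate
            -- One clocked process per stage: the result of stage i is stored
            -- in s(i), so stage i+1 only sees a registered value.
            process (clk)
            begin
                if rising_edge(clk) then
                    s(i) <= s(i-1) + to_unsigned(i, 64);  -- placeholder for the stage logic
                end if;
            end process;
        end generate stage_gen;

        dout <= std_logic_vector(s(STAGES));
    end rtl;
    ```

    Each stage adds one clock cycle of latency, so any other signals that travel alongside the data have to be delayed by the same number of cycles.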



  13. #13
    Advanced Member level 3
    Points: 6,045, Level: 18

    Join Date
    Feb 2015
    Posts
    991
    Helped
    284 / 284
    Points
    6,045
    Level
    18

    Re: Vivado Taking A Long Time To Run Synthesis & Implementation

    Any pipeline latency appears in the longest feedback path, because the output of the 80 rounds is the init for the next chunk. Maybe pipelining would help with the immediate problem of synthesis times, but the resulting design isn't that good anyway.

    The 6kW power estimate is interesting.



  14. #14
    Member level 4
    Points: 1,110, Level: 7

    Join Date
    Jun 2015
    Posts
    77
    Helped
    0 / 0
    Points
    1,110
    Level
    7

    Re: Vivado Taking A Long Time To Run Synthesis & Implementation

    Quote Originally Posted by TrickyDicky View Post
    Why is init outside the clock condition? You're forcing the clock into logic, which is probably causing massive timing failures. Either put the init check inside the clocked branch or remove it altogether.

    You also have massive logic chains, as I don't see any pipelining between the components generated in the large generate loops! You need pipelining at ALL stages, not just the output stage. Your design has basically zero pipelining.
    Please give an example or some notes on how I can pipeline those loops.

    - - - Updated - - -

    Quote Originally Posted by vGoodtimes View Post
    Any pipeline latency appears in the longest feedback path, because the output of the 80 rounds is the init for the next chunk. Maybe pipelining would help with the immediate problem of synthesis times, but the resulting design isn't that good anyway.

    The 6kW power estimate is interesting.
    How can I reduce this excessive (too high) power estimate?



  15. #15
    Advanced Member level 5
    Points: 8,111, Level: 21

    Join Date
    Apr 2016
    Posts
    1,707
    Helped
    300 / 300
    Points
    8,111
    Level
    21

    Re: Vivado Taking A Long Time To Run Synthesis & Implementation

    Quote Originally Posted by MSAKARIM View Post
    Please give an example or some notes on how I can pipeline those loops.
    I think you need help from a professor/class or a textbook. You keep asking the same question over and over when the answer was already given.
    Really, I am not Sam.



  16. #16
    Advanced Member level 3
    Points: 6,045, Level: 18

    Join Date
    Feb 2015
    Posts
    991
    Helped
    284 / 284
    Points
    6,045
    Level
    18

    Re: Vivado Taking A Long Time To Run Synthesis & Implementation

    Quote Originally Posted by ThisIsNotSam View Post
    You keep asking the same question over and over when the answer was already given.
    I think he's asking a slightly different question than what's been answered. His problem is more similar to "how do you pipeline an IIR filter". The SHA512 computation has rounds where the output of each round becomes the input to the next. This is a tight feedback loop, similar to the IIR filter. The output after these rounds then becomes part of the input for the next chunk, which is another feedback path like the IIR filter.
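
    One common way to live with a round-to-round feedback loop like this is to register the state once and reuse a single round per clock, rather than unrolling all rounds into logic. The sketch below is only a toy under stated assumptions: the round function (rotate left by one, add the round index) is a placeholder and not SHA-512, and IterativeRounds, seed, start and done are hypothetical names:

    ```vhdl
    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.NUMERIC_STD.ALL;

    -- Hypothetical iterative datapath: one round of work per clock cycle.
    entity IterativeRounds is
        generic (ROUNDS : positive := 80);
        port (
            clk, rst, start : in  std_logic;
            seed            : in  std_logic_vector(63 downto 0);
            result          : out std_logic_vector(63 downto 0);
            done            : out std_logic   -- one-clock pulse when result is final
        );
    end IterativeRounds;

    architecture rtl of IterativeRounds is
        signal state : unsigned(63 downto 0) := (others => '0');
        signal count : natural range 0 to ROUNDS := 0;
        signal busy  : std_logic := '0';
    begin
        process (clk)
        begin
            if rising_edge(clk) then
                done <= '0';                        -- default, overridden on the last round
                if rst = '1' then                   -- synchronous reset
                    busy  <= '0';
                    count <= 0;
                elsif start = '1' and busy = '0' then
                    state <= unsigned(seed);        -- load the initial state
                    count <= 0;
                    busy  <= '1';
                elsif busy = '1' then
                    -- Placeholder round: the feedback path is one round deep
                    state <= rotate_left(state, 1) + to_unsigned(count, 64);
                    if count = ROUNDS - 1 then
                        busy <= '0';
                        done <= '1';
                    else
                        count <= count + 1;
                    end if;
                end if;
            end if;
        end process;

        result <= std_logic_vector(state);
    end rtl;
    ```

    The critical path is then a single round and the tool has far less logic to place and route; the trade-off is that a block takes ROUNDS clock cycles to process.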


