Welcome to EDAboard.com

Place & Route takes too long

Status
Not open for further replies.

SharpWeapon

Member level 5
Joined
Mar 18, 2014
Messages
89
Helped
0
Reputation
0
Reaction score
0
Trophy points
6
Activity points
705
Hello,

My design is taking far too long in place & route. It went fine up to a certain phase, but then it seemed to run forever, and after around 40 minutes I stopped it. Here is the place and route progress report:

Code:
Phase  1  : 33641 unrouted;      REAL time: 36 secs 
Phase  2  : 23627 unrouted;      REAL time: 42 secs 
Phase  3  : 4206 unrouted;      REAL time: 1 mins 29 secs 
Phase  4  : 4218 unrouted; (Setup:1612, Hold:149225, Component Switching Limit:0)     REAL time: 1 mins 47 secs 
Updating file: toplevel.ncd with current fully routed design.
Phase  5  : 0 unrouted; (Setup:1612, Hold:147854, Component Switching Limit:0)     REAL time: 1 mins 54 secs 
Phase  6  : 0 unrouted; (Setup:1612, Hold:147854, Component Switching Limit:0)     REAL time: 1 mins 55 secs 
Phase  7  : 0 unrouted; (Setup:0, Hold:177889, Component Switching Limit:0)     REAL time: 2 mins 20 secs 
Phase  8  : 0 unrouted; (Setup:0, Hold:177889, Component Switching Limit:0)     REAL time: 2 mins 20 secs

By the way, I am using Xilinx ISE 14.7, the latest version, with a full license. Any suggestions?

Thanks!
 

ads-ee

Super Moderator
Staff member
Joined
Sep 10, 2013
Messages
7,860
Helped
1,817
Reputation
3,644
Reaction score
1,782
Trophy points
1,393
Location
USA
Activity points
59,412
Phase 8 was the last line output by PAR, or is there more to the report?

If Phase 8 was the last output from PAR, then it's probably due to poor constraints that can't realistically be met, such as synchronous transfers between clocks generated in the logic fabric, which result in large hold time violations.

I'm assuming the problem is with the hold time, since a total hold time violation of 177.889 ns is being reported, which is excessive in any design. Something under a couple of thousand (i.e. 2 ns) is more or less normal, but 177,889 isn't.
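
To make that reading of the numbers concrete, here is a minimal Python sketch that pulls the timing scores out of PAR progress lines like the ones posted above. It assumes the Setup/Hold scores are total violations in picoseconds (which is how 177889 becomes 177.889 ns); the `parse_phase` helper and the regular expression are illustrative only and not part of any Xilinx tool.

```python
import re

# Matches PAR progress lines that carry a timing score, e.g.
# "Phase  7  : 0 unrouted; (Setup:0, Hold:177889, Component Switching Limit:0) ..."
PHASE_RE = re.compile(
    r"Phase\s+(\d+)\s*:\s*(\d+)\s+unrouted;\s*\(Setup:(\d+),\s*Hold:(\d+)"
)


def parse_phase(line):
    """Return (phase, unrouted, setup_ns, hold_ns), or None if the line has no score."""
    match = PHASE_RE.search(line)
    if match is None:
        return None
    phase, unrouted, setup_ps, hold_ps = (int(g) for g in match.groups())
    # Assumption: the scores are in picoseconds, so divide by 1000 to get ns.
    return phase, unrouted, setup_ps / 1000.0, hold_ps / 1000.0


log = [
    "Phase  3  : 4206 unrouted;      REAL time: 1 mins 29 secs",
    "Phase  5  : 0 unrouted; (Setup:1612, Hold:147854, Component Switching Limit:0)     REAL time: 1 mins 54 secs",
    "Phase  7  : 0 unrouted; (Setup:0, Hold:177889, Component Switching Limit:0)     REAL time: 2 mins 20 secs",
]

for line in log:
    parsed = parse_phase(line)
    if parsed is not None:
        phase, unrouted, setup_ns, hold_ns = parsed
        print(f"phase {phase}: {unrouted} unrouted, setup {setup_ns} ns, hold {hold_ns} ns")
```

Lines without a timing score (the early routing phases) are simply skipped, so the output shows only the phases where PAR is working on timing.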

Regards
 

mrinalmani

Advanced Member level 1
Joined
Oct 7, 2011
Messages
459
Helped
60
Reputation
120
Reaction score
54
Trophy points
1,318
Location
Delhi, India
Activity points
5,255
I had the same problem once with Ultiboard. I exploded the over-crowded areas a bit and auto-routing completed within seconds.
 

SharpWeapon

Hey ads-ee, thanks! Yeah, phase 8 is the last one. I have embedded some external code into my design to handle some interfacing, and that is probably causing the problem. How can I trace which signal is causing the hold time violation?

mrinalmani, thanks for the reply but what exactly did you do?
 

ads-ee

mrinalmani said:
I had the same problem once with Ultiboard. I exploded the over-crowded areas a bit and auto-routing completed within seconds.

I highly doubt placement is a problem, as the design completed routing.

Code:
Phase  5  : 0 unrouted; (Setup:1612, Hold:147854, Component Switching Limit:0)     REAL time: 1 mins 54 secs

See, it has 0 unrouted nets after 1 min 54 secs.
Now PAR tries to meet timing...

Code:
Phase  6  : 0 unrouted; (Setup:1612, Hold:147854, Component Switching Limit:0)     REAL time: 1 mins 55 secs
Phase  7  : 0 unrouted; (Setup:0, Hold:177889, Component Switching Limit:0)     REAL time: 2 mins 20 secs
Phase  8  : 0 unrouted; (Setup:0, Hold:177889, Component Switching Limit:0)     REAL time: 2 mins 20 secs

But once it fixes the setup times, it ends up with even more hold time violations... This really looks like a design + constraint problem.

Regards

- - - Updated - - -

SharpWeapon said:
Hey ads-ee, thanks! Yeah, phase 8 is the last one. I have embedded some external code into my design to handle some interfacing, and that is probably causing the problem. How can I trace which signal is causing the hold time violation?

It's not going to be a single signal; more likely a whole bunch of them.

You can't really trace anything until PAR completes.
My suggestion: turn down the effort level and let it finish routing the design. I think there is a way to stop it from retrying until it meets timing, but I don't remember offhand what that was. Make sure the advanced options are showing. Once it finishes, you can bring up trace and look at where the hold violations are occurring. My suspicion is that they are between two different clock domains that the tool is assuming are synchronous.

SharpWeapon said:
mrinalmani, thanks for the reply but what exactly did you do?

Ignore that response (mrinalmani's), as it's not the root of your problem.

Regards
 

SharpWeapon

It has now been running for over two hours on a Core 2 Duo machine with 4 GB of RAM and is still going. On another, slower machine, the same project has already taken over four hours and still hasn't finished. :( Your suspicion might be true: I have two asynchronous clock domains, but that is part of the design.
 

std_match

Advanced Member level 4
Joined
Jul 9, 2010
Messages
1,215
Helped
451
Reputation
902
Reaction score
413
Trophy points
1,363
Location
Sweden
Activity points
9,420
Are both clocks using clock nets (BUFG or similar)?
Have you set constraints to ignore timing for all signals that cross the clock domains?
What are the clock frequencies?
 

mrflibble

Advanced Member level 5
Joined
Apr 19, 2010
Messages
2,724
Helped
679
Reputation
1,360
Reaction score
651
Trophy points
1,393
Activity points
19,551
SharpWeapon said:
Your suspicion might be true: I have two asynchronous clock domains, but that is part of the design.

If it is trying to meet timing for those async domains as well, that might make for interesting coffee breaks. If you haven't done so already, you should put a TIG (timing ignore) constraint on those async signals. Then double-check in the logs that those constraints are actually applied, because that is sometimes a bit sneaky.
 

pbernardi

Full Member level 2
Joined
Nov 21, 2013
Messages
130
Helped
27
Reputation
54
Reaction score
27
Trophy points
1,308
Activity points
2,286
Is the FPGA close to its limit? If yes, you could try the following:

- Oversize your FPGA (choose an FPGA one or two sizes bigger than the one you are using).
- Run place and route.
- Place and route should finish faster, since more logic resources are available. Once it has finished, you can check your critical path in timing analysis.

It may give you a hint of what you need to optimize.
 

SharpWeapon

Thank you both, std_match and mrflibble, for your replies. By the way, one of the machines finished just now. Here are the log files. I didn't quite understand the way forward after reading the report, though. And please tell me how I should apply the timing ignore constraint. Thanks!
 

Attachments

  • implimentation report.txt
    43.9 KB · Views: 13
  • Timing report.txt
    181.5 KB · Views: 5
  • UCF.txt
    6 KB · Views: 9

ads-ee

I need the timing report to understand the path.

The constraints that have a problem look to be on clocks out of an MMCM. You should post the timing report and the UCF file you are using.

...and ignore the replies about using bigger FPGAs. As I've said before, you've got design and/or constraint issues. It's looking more like a constraint problem given the -0.887 ns hold slack. Something must be specified incorrectly.

Regards

- - - Updated - - -

FYI, you could have aborted the run after it wrote out the updated ncd file:

Code:
Phase  4  : 4420 unrouted; (Setup:0, Hold:87443, Component Switching Limit:0)     REAL time: 3 mins 26 secs
Updating file: ML605_fmc150.ncd with current fully routed design.
Phase  5  : 0 unrouted; (Setup:0, Hold:85977, Component Switching Limit:0)     REAL time: 3 mins 41 secs

Then read that ncd and ucf into trce to get the timing information against the constraints.
 

ads-ee

Code:
--------------------------------------------------------------------------------
Slack (hold path):      -1.313ns (requirement - (clock path skew + uncertainty - data path))
  Source:               FFT_TopLevel/TopModule1/Top[6].FinalStage.Final/tempdataOutAImg_16 (FF)
  Destination:          FFT_Output/U0/I_DQ.G_DW[33].U_DQ (FF)
  Requirement:          0.000ns
  Data Path Delay:      0.457ns (Levels of Logic = 0)
  Clock Path Skew:      1.247ns (7.913 - 6.666)
  Source Clock:         FFT_TopLevel/clk_122_88MHz rising at 0.000ns
  Destination Clock:    clk_245_76MHz rising at 0.000ns
  Clock Uncertainty:    0.523ns

  Clock Uncertainty:            0.523ns  ((TSJ^2 + DJ^2)^1/2) / 2 + PE
    Total System Jitter (TSJ):  0.070ns
    Discrete Jitter (DJ):       0.225ns
    Phase Error (PE):           0.404ns

  Minimum Data Path at Slow Process Corner: FFT_TopLevel/TopModule1/Top[6].FinalStage.Final/tempdataOutAImg_16 to FFT_Output/U0/I_DQ.G_DW[33].U_DQ
    Location             Delay type         Delay(ns)  Physical Resource
                                                       Logical Resource(s)
    -------------------------------------------------  -------------------
    SLICE_X70Y127.AQ     Tcko                  0.270   freqDomainDataAImg<17>
                                                       FFT_TopLevel/TopModule1/Top[6].FinalStage.Final/tempdataOutAImg_16
    SLICE_X66Y123.BX     net (fanout=1)        0.326   freqDomainDataAImg<15>
    SLICE_X66Y123.CLK    Tckdi       (-Th)     0.139   FFT_Output/U0/iDATA<35>
                                                       FFT_Output/U0/I_DQ.G_DW[33].U_DQ
    -------------------------------------------------  ---------------------------
    Total                                      0.457ns (0.131ns logic, 0.326ns route)
                                                       (28.7% logic, 71.3% route)
The highlighted part is where your problem resides.
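
The slack figure follows directly from the formula printed on the first line of the report entry. A small Python sketch that reproduces it, using only the numbers from the report above (the function name is hypothetical):

```python
def hold_slack_ns(requirement, clock_path_skew, uncertainty, data_path):
    """Hold slack as printed in the report header:
    requirement - (clock path skew + uncertainty - data path)."""
    return requirement - (clock_path_skew + uncertainty - data_path)


# Values copied from the report entry above, all in ns.
slack = hold_slack_ns(
    requirement=0.000,
    clock_path_skew=1.247,   # 7.913 - 6.666
    uncertainty=0.523,
    data_path=0.457,
)
print(f"hold slack = {slack:.3f} ns")  # prints "hold slack = -1.313 ns"
```

The data path contributes only 0.457 ns, so the 1.247 ns of clock path skew accounts for most of the violation.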

I'd need to see the specific code that generates the clocks and perhaps the code for the interface between the two domains.

As the two clocks are multiples of each other, I'd assume the path may be a valid path between the two domains. The amount of skew between the two clocks suggests they may not both be generated from the same MMCM?

You should rerun trace using the full_path switch to make it show the clock paths.


Your UCF contains this line:
Code:
NET "CLK_AB_P" CLOCK_DEDICATED_ROUTE = FALSE;
which isn't a good idea, as it means the clock isn't using the dedicated routing from the package pin to the clock buffers/MMCMs/PLLs. According to the ML605 documentation, the pin pair is K26 (FMC_LPC_LA00_CC_P) / K27 (FMC_LPC_LA00_CC_N), correct?
Code:
NET "CLK_AB_N" LOC="K27";
NET "CLK_AB_P" LOC="K26";
Not sure why you used CLOCK_DEDICATED_ROUTE = FALSE, since that pair of pins is clock-capable.

Am I right in assuming you didn't use an MMCM/PLL to generate the divide-by-2 clock, clk_122_88MHz? If so, that is the source of your problem, and a classic example of why you never want to generate a clock using the core logic. Show me the code for the clock generation, so I don't have to guess.

Regards
 

SharpWeapon

ads-ee said:
The amount of skew between the two clocks suggests they may not both be generated from the same MMCM?

Correct! I had two MMCMs; I just thought both would be synchronous (in fact they were, both in simulation and post-synthesis). Now I have one MMCM, and the clock is passed to all modules only from this MMCM. PAR now takes a very short time, with all timing constraints met. Yaay! :)

ads-ee said:
a classic example of why you never want to generate a clock using the core logic

What is the best way to do it then?
 

ads-ee

Your problem was a result of one of the MMCMs not being able to use dedicated routing, as it was in a different bank.

If you need to use multiple MMCMs (e.g. you have a very large number of clocks to generate or they don't have a common VCO frequency), you can:
a) feed the FPGA with two versions of the same external clock.
b) feed the single clock through a BUFG, which will then feed both MMCMs.

To avoid generating clocks in the fabric, use a PLL/MMCM/DCM or, instead of making a clock, make a single-clock-wide pulse and use that as an enable for any flip-flop that would have used the generated clock. In this way you'll end up with a single-clock design, with all the related domains on the same clock but only getting enabled every Nth clock.
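
The enable-pulse approach can be sketched behaviourally. The following is a plain Python model, not HDL, and the names `clock_enable` and `run_register` are made up for illustration. It only shows that a register on the fast clock, gated by a one-cycle-wide enable every Nth cycle, updates at the divided rate without a second clock ever existing.

```python
def clock_enable(n):
    """Yield True for exactly one cycle out of every n cycles of the fast clock."""
    count = 0
    while True:
        yield count == n - 1
        count = (count + 1) % n


def run_register(samples, n):
    """Model a register clocked on every fast-clock cycle but enabled every n-th one.

    Each element of `samples` is the D input on one fast-clock edge; the
    returned list is the Q output after each edge (None until first loaded).
    """
    q = None
    trace = []
    for d, enable in zip(samples, clock_enable(n)):
        if enable:  # the register only captures D when the enable pulse is high
            q = d
        trace.append(q)
    return trace


# Divide-by-2 behaviour on a single clock: Q changes on every second edge.
print(run_register(list(range(8)), 2))  # [None, 1, 1, 3, 3, 5, 5, 7]
```

In real hardware the same idea maps onto the clock-enable input of the flip-flops, so every register stays on the one clock from the MMCM/PLL.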

Regards
 

SharpWeapon

Thanks, that really helped. One last question though: should the 'Clock Path Skew' always be 0.00, or is it acceptable if it is very small or negative (why is it negative, btw)? If so, what is the reference for that: the time specified in the brackets of 'Slack (setup paths): [~]ns (requirement..)'?
 

ads-ee

Clock path skew can be positive, negative, or 0. It's normally referenced to the source clock, so if the branch of the clock that drives the source register is shorter than the branch driving the destination register, you end up increasing setup margin but decreasing hold margin (the destination clock is delayed with respect to the source clock). The opposite can also occur, where the source clock has a longer clock path than the destination clock. In this case the source clock is delayed with respect to the destination clock, which decreases the setup margin and increases the hold margin.

With your non-optimal input clock routing, you were ending up with the first situation, where the destination clock had so much positive skew relative to the source clock that the result was a huge decrease in the hold margin.
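
A rough numerical sketch of the two cases, in Python. The hold formula is the one printed in the timing report earlier in the thread. The setup formula is assumed to be the mirror-image form, requirement - (data path - clock path skew + uncertainty). All the delay values below are invented purely for illustration, and skew is taken as destination clock arrival minus source clock arrival.

```python
def setup_slack_ns(requirement, data_path_max, skew, uncertainty):
    """Assumed setup form: requirement - (data path - clock path skew + uncertainty)."""
    return requirement - (data_path_max - skew + uncertainty)


def hold_slack_ns(requirement, skew, uncertainty, data_path_min):
    """Hold form from the report: requirement - (clock path skew + uncertainty - data path)."""
    return requirement - (skew + uncertainty - data_path_min)


# Hypothetical path: 4.0 ns setup requirement, 3.0 ns slowest data path,
# 0.4 ns fastest data path, 0.1 ns clock uncertainty.
for skew in (-0.5, 0.0, 0.5):
    setup = setup_slack_ns(4.0, 3.0, skew, 0.1)
    hold = hold_slack_ns(0.0, skew, 0.1, 0.4)
    print(f"skew {skew:+.1f} ns: setup slack {setup:+.2f} ns, hold slack {hold:+.2f} ns")
```

Moving the skew in the positive direction (destination clock later) raises the setup slack and lowers the hold slack by the same amount, and the reverse holds for negative skew.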

Regards
 