Welcome to EDAboard.com

[SOLVED] P&R Buffer Reduction in a Shift Register

Status
Not open for further replies.

ranaya

Hi All,

I am running a P&R experiment on a shift register that looks like this:

**broken link removed**

As you can see, it is composed of MUXes and flip-flops. The input MUX of each flip-flop chain feeds data into the chain, and the same MUXes can also exchange data between two parallel chains. For now I am running this experiment only at the typical corner at 0.6 V (TT, 25 C). Pre-layout synthesis confirmed that the system can operate at 240 MHz with sufficient setup/hold margins. P&R targets 200 MHz, and the related settings/constraints are below:

Code:
SDC :
set_units -time ns -resistance kOhm -capacitance pF -voltage V -current mA
create_clock [get_ports clk]  -period 5  -waveform {0 2.5}
set_clock_uncertainty 0.05  [get_clocks clk]
set_input_delay -clock clk  -max 0.3  [all_inputs]
set_output_delay -clock clk  -max 0.3  [all_outputs]
set_load -pin_load 0.004 [all_outputs]

Critical Innovus Settings :
setAnalysisMode -analysisType single -checkType setup -skew true -clockPropagation sdccontrol
set_ccopt_property buffer_cells { CKBUFM8R CKBUFM6R CKBUFM4R CKBUFM3R CKBUFM2R CKBUFM1R }
set_ccopt_property inverter_cells { CKINVM8R CKINVM6R CKINVM4R CKINVM3R CKINVM2R CKINVM1R }
set_ccopt_property delay_cells { DEL1M1R DEL1M4R DEL2M1R DEL2M4R DEL3M1R DEL3M4R DEL4M1R DEL4M4R }
# Include this setting to use inverters in preference to buffers
set_ccopt_property use_inverters true
set_ccopt_property target_max_trans 600ps
set_ccopt_property target_skew 600ps
create_ccopt_clock_tree_spec
ccopt_design

In the routed design I see no violations or DRC issues, and there are no setup or hold issues either. The intended CCOpt constraints have also been met. Strangely, though, I find some buffers (CKBUF*) inserted between the Q and D pins of consecutive flops, even where plenty of slack is left between them. See the following timing report:
Capture.PNG

So we have enough slack on this path. The input capacitance of CKBUFM2R and that of the capturing flop's D input are similar, so I don't see any reason for this buffer. As you can see, even with the clock uncertainty, the difference in clock arrival time between the launching and capturing flops is quite small, and given the large CP-Q delay of the launching flop there should be no hold violation without CKBUFM2R (if there were, Innovus would have used the delay elements in the list). Even at a lower frequency, I see the same buffer count between the flops. What could the reason be?
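The hold argument above can be written as a single inequality. This is only a sketch of the standard hold check for the unbuffered Q-to-D connection; the symbols are introduced here for illustration, and apart from the 0.05 ns clock uncertainty in the SDC, no values are given because the timing report image is not reproduced.

```latex
% Hold check for the direct Q -> D connection (no CKBUFM2R in the path).
% t_clk,launch / t_clk,capture : clock arrival at the launching / capturing flop
% t_CP->Q : clock-to-Q delay of the launching flop
% t_net   : wire delay between Q and D
% t_hold  : hold requirement of the capturing flop
% t_unc   : clock uncertainty (0.05 ns in the SDC above)
\[
  t_{\mathrm{clk,launch}} + t_{\mathrm{CP \to Q}} + t_{\mathrm{net}}
  \;\ge\;
  t_{\mathrm{clk,capture}} + t_{\mathrm{hold}} + t_{\mathrm{unc}}
\]
```

When the two clock arrival times are nearly equal, the check reduces to the CP-Q delay plus the wire delay covering the hold requirement plus the uncertainty, which is the reasoning given above.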

Thanks
Anuradha
 

this report looks very odd to me. are you using the clock signal as data and computing on it? (i can't see the attachment, so I don't know what the circuit is)
 

Here is the image of the design:
SHER (1).png

@ThisIsNotSam: Note that the CKINVX cells are the clock tree components (I prefer them to buffers). So the CKINVX cells we see at both the launching and capturing ends are there for the clock tree.
 

just because they are clock tree buffers doesn't mean they can't be used in the middle of the logic. this happens unless you prevent it with don't-use statements
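As a rough sketch of such a don't-use statement, assuming the legacy Innovus `setDontUse` command and the CKBUF cell names from the buffer_cells list in the first post. Where it belongs in the flow, and whether CCOpt still uses cells that are listed explicitly in buffer_cells once they are marked this way, should be checked against the tool documentation.

```tcl
# Assumption: legacy Innovus command syntax (setDontUse <cell> true|false).
# Cell names are copied from the buffer_cells list in the first post.
# This keeps data-path optimization from picking the clock buffers.
foreach cell {CKBUFM8R CKBUFM6R CKBUFM4R CKBUFM3R CKBUFM2R CKBUFM1R} {
    setDontUse $cell true
}
```

The clock tree in this thread is built from inverters (use_inverters true), so the CKINV cells are left untouched here.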

I still can't understand your timing report. this is not a typical reg to reg path. how are you reporting this? innovus usually doesn't show the clock tree in reg to reg paths.
 

Thanks for your replies....

just because they are clock tree buffers doesn't mean they can't be used in the middle of the logic. this happens unless you prevent it with don't-use statements

Yes, I understand this. What I meant in my previous reply was that I avoid using buffers in the clock tree (I prefer inverters). So they do not appear on the clock lines but, as you said, on the data lines. However, I see no reason to have them on the data lines, given the timing reasoning in my first post and the close proximity of these two flops in the design.


I still can't understand your timing report. this is not a typical reg to reg path. how are you reporting this? innovus usually doesn't show the clock tree in reg to reg paths.

These are the timing reports that Innovus generates automatically after each optDesign stage. The one above is from the postRoute stage, so it also shows the clock inverters from the clock source to the launching and capturing flops. Does it look strange?
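For comparison, a single register-to-register path can also be reported on demand. The sketch below assumes hypothetical instance names (u_ff_a, u_ff_b) for the launching and capturing flops. The full_clock path type is what expands the clock network in the report, which would explain the clock inverters appearing in it.

```tcl
# Assumption: u_ff_a and u_ff_b are hypothetical instance names standing in
# for the launching and capturing flops; CP and D are the pin names used
# in this thread.
# -path_type full_clock expands the launch and capture clock paths as well.
report_timing -from u_ff_a/CP -to u_ff_b/D -path_type full_clock
```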
 

ok, let's ignore the timing report and assume it is normal.

now let's try to figure out why the clk buffer is preferred over the normal buffer.
first, they might have the same footprint and the tool doesn't even differentiate between them and just picks whatever. maybe the clkbuffer comes first in the lib file or some silly reason like that. if they have different footprints, routing could have impacted the decision if the clkbuffer is more routing friendly. you should also look at leakage and dynamic power, which could affect the cell selection.

I have spent a lot of time trying to make sense of decisions like this before. I can't say it was time well spent.
 

Hi,

The issue was the max transition constraint (set_max_transition) of the design. Since it wasn't set explicitly during P&R, the tool used the limit from the standard cell library, which was too pessimistic. Even though the flops are placed close together, the parasitic capacitance added by the wires between the Q and D pins was enough to push the transition past the library's maximum. By adjusting the limit appropriately, the buffer count can be reduced. Thanks all for your inputs.
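A minimal sketch of setting the limit explicitly in the SDC. The 0.6 value is only a placeholder that mirrors the 600ps target_max_trans from the CCOpt settings in the first post. It is not a recommendation, and the actual number should come from the library vendor's guidelines.

```tcl
# Assumption: 0.6 is a placeholder (ns, per the set_units line in the SDC).
# Applies a design-wide max transition limit instead of relying on the
# default that comes from the standard cell library.
set_max_transition 0.6 [current_design]
```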

Ranaya
 

cool, problem solved. just be careful when changing the max transition settings, there are usually guidelines for this coming from the std cell provider. I remember some really nice tables from ARM IP that explained how to set it based on the target frequency of the design and some FoM metric.
 