1) Check which buffers you are allowing CTS to use. You may be able to give it faster buffers (even SVT/LVT cells, at the cost of higher leakage power).
2) Check the logic in your clock lines. If you have dividers plus a lot of other logic in the clock path, the 2ns target may be hard to meet because of the design itself.
3) Check the physical placement and routing of the clock tree. If you have hard macros, check where the clock pin enters each one; you may be able to move the clock pin, or even move the macro itself, for better timing.
4) If your design is large, you can break the clock tree up to find where CTS is having trouble meeting maxdelay. Start by defining separate clocks for different parts of your design in the ctstch file and leave them unbalanced. Then you can see which parts of your design are having trouble meeting 2ns. This may also help with debugging #2.
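
Once you have per-sink insertion delays for the split-up trees, a small script can show which part of the design is over the limit. This is only a rough sketch: the (pin path, delay in ns) pairs below are made-up sample data, and the function names are hypothetical. You would have to extract the real numbers from your own tool's clock report, whose format is not assumed here.

```python
from collections import defaultdict

MAX_DELAY_NS = 2.0  # the 2ns maxdelay target discussed above


def worst_delay_per_block(sinks, depth=1):
    """Group sink pins by their first `depth` hierarchy levels, keep the worst delay.

    `sinks` is an iterable of (pin_path, delay_ns), e.g. ("core/alu/reg_0/CK", 1.42).
    The last two path elements are treated as instance name and pin name.
    """
    worst = defaultdict(float)
    for pin, delay_ns in sinks:
        hier = pin.split("/")[:-2]                 # drop instance and pin
        block = "/".join(hier[:depth]) or "<top>"  # sinks with no hierarchy
        worst[block] = max(worst[block], delay_ns)
    return dict(worst)


def failing_blocks(sinks, limit_ns=MAX_DELAY_NS, depth=1):
    """Return the blocks whose worst delay exceeds the limit, worst first."""
    worst = worst_delay_per_block(sinks, depth)
    over = [block for block, delay in worst.items() if delay > limit_ns]
    return sorted(over, key=lambda block: -worst[block])


# Made-up sample data, for illustration only.
sinks = [
    ("core/alu/reg_0/CK", 1.42),
    ("core/alu/reg_1/CK", 1.55),
    ("mem/sram0/CLK", 2.31),
    ("mem/ctrl/reg_3/CK", 1.90),
    ("io/div/reg_0/CK", 2.08),
]

print(failing_blocks(sinks))  # ['mem', 'io']
```

Raising `depth` narrows the grouping to sub-blocks, which helps once you know which top-level block is the problem.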