Re: buffers
Buffers are not always inserted. First determine the critical (timing-violating) path in your design. Rearrange the combinational blocks so they sit close to the launching or capturing flops, and try to reduce the distance between the launching and capturing flops.
Usually go for combinational block optimisation first: logic duplication, increasing the drive strength of cells, rearranging blocks, pin swapping for late-arriving signals, etc.
Buffers are usually preferred only as a last option to meet timing, since they consume extra area.
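
The first step above, finding the timing-violated path, comes down to computing setup slack per path and sorting by the worst value. Here is a rough Python sketch of that idea. The `Path` fields, the `setup_slack` and `violated_paths` helpers, and the delay numbers are all made up for illustration; a real flow would read these values from the STA tool's timing report rather than compute them by hand.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """Hypothetical flop-to-flop path; all delays in ns."""
    name: str
    clk_to_q: float     # launch flop clock-to-Q delay
    logic_delay: float  # total combinational cell delay
    wire_delay: float   # total interconnect delay (grows with flop distance)
    setup: float        # capture flop setup time


def setup_slack(path, clock_period, skew=0.0):
    """Setup slack = required time - arrival time.

    skew is assumed to be capture clock latency minus launch clock latency.
    Negative slack means the path violates setup timing.
    """
    arrival = path.clk_to_q + path.logic_delay + path.wire_delay
    required = clock_period + skew - path.setup
    return required - arrival


def violated_paths(paths, clock_period):
    """Return (slack, name) for failing paths, worst slack first."""
    failing = [(setup_slack(p, clock_period), p.name) for p in paths]
    return sorted(item for item in failing if item[0] < 0)


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any real design.
    paths = [
        Path("regA_to_regB", 0.1, 1.2, 0.9, 0.1),
        Path("regC_to_regD", 0.1, 0.8, 0.3, 0.1),
    ]
    for slack, name in violated_paths(paths, clock_period=2.0):
        print(f"{name}: slack {slack:.2f} ns")
```

In this toy model, moving the flops closer together shows up as a smaller `wire_delay`, and upsizing cells or restructuring logic shows up as a smaller `logic_delay`. Either can bring the slack back to zero or above without adding any buffer.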