A clock buffer has matched output rise and fall transitions.
In a normal buffer the fall transition is usually faster than the rise transition, so if you use normal buffers in a clock tree, the high and low phases of the clock are stretched unequally at every stage and you end up with duty cycle distortion.
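To put rough numbers on this, here is a minimal sketch of how the mismatch adds up along a chain of non-inverting buffers. The function name, the 1 GHz clock and the per-stage delays are all made-up values for illustration, not data from any real cell library, and the model assumes every stage has the same rise and fall delay.

```python
def duty_cycle_after_chain(period_ps, n_buffers, t_rise_ps, t_fall_ps, duty_in=0.5):
    """Estimate the duty cycle after a chain of identical non-inverting buffers.

    Hypothetical first-order model: each stage delays the rising edge by
    t_rise_ps and the falling edge by t_fall_ps, so the high pulse width
    changes by (t_fall_ps - t_rise_ps) per stage.
    """
    high_in = duty_in * period_ps
    high_out = high_in + n_buffers * (t_fall_ps - t_rise_ps)
    # Clamp so a pulse cannot go negative or exceed the clock period.
    high_out = min(max(high_out, 0.0), period_ps)
    return high_out / period_ps


# Assumed numbers: 1000 ps period (1 GHz), 10 buffers in the clock path.
normal = duty_cycle_after_chain(1000.0, 10, t_rise_ps=60.0, t_fall_ps=50.0)
clkbuf = duty_cycle_after_chain(1000.0, 10, t_rise_ps=55.0, t_fall_ps=55.0)

print(f"normal buffers: {normal:.0%} duty cycle")
print(f"clock buffers:  {clkbuf:.0%} duty cycle")
```

With these assumed delays, a 10 ps mismatch per stage shaves 100 ps off the high phase after ten buffers, while the matched buffers leave the input duty cycle unchanged.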
Since clock buffers are optimised for matched transitions, their power consumption would be higher than that of ordinary buffers. Speed is traded off against power consumption.