Hi there,
Well, it's quite a general question, since there are different ways to do it depending on which kind of power you are targeting (i.e., dynamic or static). This is even more true for scaled technologies, where static (leakage) power becomes a significant share of the total. You should also consider both design-time choices and run-time adaptation.
A good starting point for reducing dynamic power, however, is clock gating: you stop (gate) the clock of unused regions of the chip, basically by ANDing the clock with an enable signal (in practice through a latch-based gating cell, to avoid glitches), so that their switching activity stops. In the classical power dissipation model this brings the dynamic power of the gated logic to zero, but it does nothing for static power. To reduce that as well, you might use power gating (i.e., cutting off the supply voltage through sleep transistors).
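To put rough numbers on the difference between the two techniques, here is a minimal sketch based on the classical first-order model, P_dyn = alpha * C * Vdd^2 * f and P_static = Vdd * I_leak. The function names and all parameter values are made up for illustration, and the power-gated case is idealized (it ignores the residual leakage of the sleep transistors and the wake-up cost).

```python
def dynamic_power(alpha, c_farads, vdd, freq_hz):
    """Classical switching-power estimate: P = alpha * C * Vdd^2 * f."""
    return alpha * c_farads * vdd ** 2 * freq_hz


def block_power(alpha, c_farads, vdd, freq_hz, i_leak,
                clock_gated=False, power_gated=False):
    """First-order power of one block (hypothetical helper, in watts).

    clock_gated: the clock is stopped, so switching power goes to zero,
                 but the block is still powered and keeps leaking.
    power_gated: the supply is cut off; idealized here as zero power.
    """
    if power_gated:
        return 0.0  # assumption: perfect sleep transistors
    static = vdd * i_leak
    dynamic = 0.0 if clock_gated else dynamic_power(alpha, c_farads, vdd, freq_hz)
    return dynamic + static


if __name__ == "__main__":
    # Illustrative values only: activity 0.1, 1 nF switched capacitance,
    # 1.0 V supply, 1 GHz clock, 10 mA leakage current.
    params = dict(alpha=0.1, c_farads=1e-9, vdd=1.0, freq_hz=1e9, i_leak=0.01)
    print("active      :", block_power(**params))
    print("clock-gated :", block_power(**params, clock_gated=True))
    print("power-gated :", block_power(**params, power_gated=True))
```

With these numbers the clock-gated block still burns its leakage power, which is exactly the part that only power gating removes.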
Some microarchitectural choices can also be regarded as power-optimization techniques, but their benefit depends heavily on the running applications: for instance, a very accurate branch predictor can help reduce the power dissipation of an out-of-order processor, because fewer wrong-path instructions are fetched and executed only to be discarded.
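As a back-of-the-envelope illustration of the branch prediction point, the sketch below estimates the energy spent on wrong-path instructions. It is a toy model, not a real processor model: it assumes every misprediction discards a fixed number of instructions and that each instruction costs the same energy. The function name and every number are hypothetical.

```python
def wasted_energy_joules(branches, mispredict_rate,
                         flushed_per_mispredict, energy_per_instr):
    """Toy estimate of energy spent on instructions that get squashed.

    Assumes (for illustration only) a fixed number of flushed instructions
    per misprediction and a uniform energy cost per instruction.
    """
    mispredictions = branches * mispredict_rate
    return mispredictions * flushed_per_mispredict * energy_per_instr


if __name__ == "__main__":
    # Made-up workload: 1e9 branches, 20 instructions flushed per
    # misprediction, 100 pJ per instruction.
    common = dict(branches=1e9, flushed_per_mispredict=20,
                  energy_per_instr=100e-12)
    poor = wasted_energy_joules(mispredict_rate=0.05, **common)
    good = wasted_energy_joules(mispredict_rate=0.01, **common)
    print("wasted energy, 5% mispredictions:", poor, "J")
    print("wasted energy, 1% mispredictions:", good, "J")
```

In this model the wasted energy scales linearly with the misprediction rate, which is why the gain is so dependent on how predictable the branches of your application are.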
If you are interested in the (pretty vast) topic, you can start from classical references such as the "Low Power Methodology Manual", written by people from Synopsys and ARM. It covers a large part of the power dissipation problem.
Cheers