You need to understand the difference between actual instantiation of clock gating cell and inferred clock gating cell. Do you have options of running simple synthesis flow? Try running with a design where you have instantiated clock gating cell of rocking_vlsi module as well as using RTL as I have described earlier. After synthesis, if you do clock gating report, it will show two types of clock gating in the design. One instantiated and one inferred. Doing this will give you clearer understanding , rather than me trying to explain it. For instantiated one, you can manually instantiate standard cell clock gating cell also, or rtl level clock gating cell as shown by rocking_vlsi.