Wek did a great job explaining the difference between oscillator cycles and instruction cycles.
If you were thinking of 4 cycles, you may be using one of the Dallas devices, which fall in that performance range.
The good old 51 needed 12 oscillator cycles for one instruction cycle. Then there were many devices with a 6-clock option: timing compatible, everything was just twice as fast. The fastest timing-compatible devices today are 2-clock devices, e.g. the LPC900, which at the same oscillator frequency are EXACTLY 6 times faster than a 12-clock device executing the same instructions.
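To put numbers on that, here is a small sketch of the arithmetic. The function names and the 12 MHz oscillator are just assumptions for illustration, and it only applies to the timing-compatible parts, where every instruction keeps its original cycle count:

```python
def instruction_cycle_ns(fosc_hz, clocks_per_cycle):
    """Length of one instruction cycle in nanoseconds.

    fosc_hz          -- oscillator frequency in Hz
    clocks_per_cycle -- oscillator clocks per instruction cycle
                        (12 for the classic 51, 6 or 2 for the faster
                        timing-compatible parts)
    """
    return clocks_per_cycle * 1e9 / fosc_hz


def speedup_vs_12_clock(clocks_per_cycle):
    """Speed-up over a 12-clock core at the same oscillator frequency."""
    return 12 / clocks_per_cycle


FOSC_HZ = 12_000_000  # assumed example oscillator, not taken from any datasheet

for clocks in (12, 6, 2):
    print(f"{clocks:2d}-clock: "
          f"{instruction_cycle_ns(FOSC_HZ, clocks):7.1f} ns per instruction cycle, "
          f"{speedup_vs_12_clock(clocks):.0f}x the 12-clock core")
```

The ratio is exact only because the instruction cycle counts stay the same; once the cycle counts per instruction change, a single factor like this no longer describes the device.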
If you go to the so-called "single clock" devices from Silabs or Atmel (the 51s), you will find that there are one-cycle, two-cycle, and multiple-cycle instructions. The truth is, they are faster than 2-clock devices: not twice as fast, but faster. If anything in your system depends on timing compatibility, keep your hands off these devices. If it is about more performance, the so-called single-clock devices are faster than 2-clock devices, but not twice as fast as the "1-clock" label would suggest.
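A typical place where timing compatibility bites is a software delay loop. The sketch below is only an illustration: the function name and the 12 MHz oscillator are assumptions, and it models a plain DJNZ loop, which takes 2 instruction cycles per iteration on the classic 51:

```python
def loop_delay_us(iterations, fosc_hz, clocks_per_cycle, cycles_per_iteration=2):
    """Duration of a simple software delay loop in microseconds.

    cycles_per_iteration defaults to 2, the instruction-cycle count of
    DJNZ on the classic 51. This value only carries over to
    timing-compatible parts.
    """
    oscillator_clocks = iterations * cycles_per_iteration * clocks_per_cycle
    return oscillator_clocks * 1e6 / fosc_hz


FOSC_HZ = 12_000_000  # assumed example oscillator

# The same 250-iteration loop on timing-compatible cores:
for clocks in (12, 6, 2):
    print(f"{clocks:2d}-clock core: {loop_delay_us(250, FOSC_HZ, clocks):6.1f} us")
```

On a timing-compatible device you can rescale the loop count by 12/6 or 12/2 and get the old delay back. On the so-called single-clock devices the cycle count per instruction itself differs, so cycles_per_iteration has to come from that device's own instruction table and no single scale factor fixes all your delays.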
For an original 51-architecture specification you can check this document:
**broken link removed**
It includes instruction cycle times based on the 12-clock implementation. The external clock was 12 MHz, so one instruction cycle took 1 µs.
More 51-originals can be found here:
**broken link removed**
Hope this helps you understand the different "cycles" in the 51 world a little bit.
Bob