The "gearbox" aligns the input bits into output words and also removes the extra encoding overhead, for example the overhead of 8b10b or 64b66b. In the output direction, the gearbox instead adds this overhead to the bitstream.
For the 8b10b example, the serdes inputs would look like: (kkdddddd) (ddkkdddd) (ddddkkdd) (ddddddkk) (dddddddd), and then the pattern repeats (here, k = control bit, d = data bit). In this example, 4 bytes of data are transferred in 5 cycles. The gearbox likely also generates a "request data" signal that goes low once every five cycles.
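As a rough illustration of the output side, here is a small Python model of that packing. It is only a sketch under the same simplification used above (each 10-bit symbol is treated as 2 control bits followed by 8 data bits, sent MSB first). The names `tx_gearbox` and `request` are made up for this example and do not come from any particular serdes or vendor IP.

```python
def tx_gearbox(symbols, sym_bits=10, word_bits=8):
    """Pack sym_bits-wide symbols into word_bits-wide serdes words, MSB first.

    Returns (words, request). request[i] is True on cycles where the gearbox
    pulled a new symbol, and False on the cycle where it only drains bits
    it already holds. Leftover bits that do not fill a word are dropped.
    """
    words, request = [], []
    acc = 0      # bit accumulator; the oldest bits are the most significant
    nbits = 0    # number of valid bits currently held in acc
    it = iter(symbols)
    while True:
        if nbits < word_bits:
            sym = next(it, None)
            if sym is None:
                break
            acc = (acc << sym_bits) | (sym & ((1 << sym_bits) - 1))
            nbits += sym_bits
            request.append(True)
        else:
            request.append(False)   # enough bits buffered, no new symbol
        nbits -= word_bits
        words.append((acc >> nbits) & ((1 << word_bits) - 1))
        acc &= (1 << nbits) - 1     # keep only the bits not yet sent


    return words, request


# Four symbols with k bits = 1 and d bits = 0 make the pattern visible.
words, request = tx_gearbox([0b1100000000] * 4)
print([format(w, "08b") for w in words])
# ['11000000', '00110000', '00001100', '00000011', '00000000']
print(request)
# [True, True, True, True, False]
```

The 1 bits land exactly where the k positions are in the pattern above, and the request flag drops on the fifth cycle, when the word is built entirely from leftover bits.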
The input gearbox does the same basic thing in reverse, although there are 10 input cases for where the data/control bits could be, one for each possible bit offset within a 10-bit symbol. The gearbox is responsible for aligning the data on word boundaries. For this example, assuming an 8:1 serdes, it would output 4 bytes every 5 cycles. The gearbox would likely also generate a "data valid" signal that is true four out of every five cycles once alignment has completed.
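The input side can be modeled the same way. The sketch below assumes the bit offset of the first symbol boundary (0 to 9) has already been found by some alignment logic that is not shown, and that bits arrive MSB first. The names `rx_gearbox`, `offset` and `valid` are hypothetical and only for illustration.

```python
def rx_gearbox(words, offset=0, sym_bits=10, word_bits=8):
    """Unpack word_bits-wide serdes words into sym_bits-wide symbols.

    offset is the number of leading bits to discard before the first symbol
    boundary (assumed to be known from a separate alignment step).
    Returns (symbols, valid), with one valid flag per input word.
    """
    symbols, valid = [], []
    acc = 0        # bit accumulator; the oldest bits are the most significant
    nbits = 0      # number of valid bits currently held in acc
    skip = offset  # bits still to throw away before the first boundary
    for w in words:
        acc = (acc << word_bits) | (w & ((1 << word_bits) - 1))
        nbits += word_bits
        if skip:
            drop = min(skip, nbits)
            nbits -= drop
            skip -= drop
            acc &= (1 << nbits) - 1   # discard the oldest 'drop' bits
        if skip == 0 and nbits >= sym_bits:
            nbits -= sym_bits
            symbols.append((acc >> nbits) & ((1 << sym_bits) - 1))
            acc &= (1 << nbits) - 1   # keep the start of the next symbol
            valid.append(True)
        else:
            valid.append(False)       # not enough bits for a full symbol yet
    return symbols, valid


# The five aligned words from the 8b10b pattern (k bits = 1, d bits = 0).
symbols, valid = rx_gearbox([0xC0, 0x30, 0x0C, 0x03, 0x00])
print([format(s, "010b") for s in symbols])
# ['1100000000', '1100000000', '1100000000', '1100000000']
print(valid)
# [False, True, True, True, True]
```

With 8 bits in and 10 bits out, one cycle in every five has no complete symbol ready, which is where the four-out-of-five valid pattern comes from. Stripping the two k bits from each symbol to recover the byte would be a separate step after this.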