If you mean transistor X9, its Vds is actually forced by the input common mode. In detail: in conventional operation the differential inputs are close to each other (real ground should be close to virtual ground, even if the DC voltages differ), so the two input transistors sink almost the same current, and they share a source node. The voltage at that source node is the input common-mode level, (vinp + vinn) / 2, minus whatever gate-source voltage is needed to turn the input transistors on. If you choose a higher common mode, the voltage drop across the current source increases, and if you choose a lower one, it decreases. This is a fundamental problem that limits the input range of OPAMPs, because there is no ideal current source.
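To put numbers on this, here is a rough sketch of the headroom calculation. It assumes an NMOS input pair whose tail transistor has its source at 0 V, and it treats the input pair's Vgs and the tail device's Vdsat as fixed. The function name and all values below are made up for illustration; they are not taken from your circuit.

```python
def tail_headroom(vinp, vinn, vgs_input, vdsat_tail):
    """Estimate the tail current source headroom of a differential pair.

    Assumes an NMOS input pair and a tail transistor with its source at 0 V,
    so the tail Vds equals the shared source node voltage.
    """
    vcm = (vinp + vinn) / 2.0      # input common-mode level
    v_source = vcm - vgs_input     # shared source node of the input pair
    vds_tail = v_source            # tail device sits between this node and 0 V
    saturated = vds_tail >= vdsat_tail
    return v_source, vds_tail, saturated


if __name__ == "__main__":
    # Hypothetical values: Vgs = 0.7 V for the input pair, Vdsat = 0.2 V for the tail.
    for vcm in (0.8, 1.0, 1.2):
        v_source, vds_tail, ok = tail_headroom(vcm, vcm, 0.7, 0.2)
        state = "saturated" if ok else "out of saturation"
        print(f"Vcm={vcm:.2f} V  Vds_tail={vds_tail:.2f} V  tail is {state}")
```

Sweeping the common mode like this shows the lower end of the usable input range directly: it sits roughly at Vgs of the input pair plus Vdsat of the tail device.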
What you can do is reduce the Vdsat of the current mirror so that the unusable range (the region where it falls out of saturation) is smaller. You can do this by increasing its size (a larger W/L), but how well that works in a practical implementation is debatable. Also, if the operating regions reported by your simulator include subthreshold, the transistor may be reported in that region when it is too large for the amount of current that flows through the biasing branch (which, by the way, may have very poor PSRR).
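As a rough illustration of the sizing trade-off, the sketch below uses the long-channel square-law model, Vdsat = sqrt(2 * Id / (k' * W/L)). That model is my assumption here, and so are the function name, the k' value, and the bias current; your simulator's device models will give different numbers.

```python
import math


def vdsat_square_law(i_d, w_over_l, k_prime):
    """Overdrive (Vdsat) from the long-channel square law.

    i_d      : drain current in A
    w_over_l : transistor aspect ratio W/L
    k_prime  : process transconductance mu*Cox in A/V^2 (hypothetical value below)
    """
    return math.sqrt(2.0 * i_d / (k_prime * w_over_l))


if __name__ == "__main__":
    i_tail = 100e-6    # hypothetical tail current, 100 uA
    k_prime = 200e-6   # hypothetical mu*Cox, 200 uA/V^2
    for w_over_l in (10, 40, 160):
        v = vdsat_square_law(i_tail, w_over_l, k_prime)
        print(f"W/L={w_over_l:4d}  Vdsat={v * 1e3:.0f} mV")
```

Under this model, every 4x increase in W/L only halves Vdsat, so the area grows much faster than the headroom improves. The square law also stops being a good description once the overdrive gets small, which is where the device drifts toward subthreshold.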
As for the gain, I would need to look at this in more detail, so I can't really comment on that. If it is significantly lower than what you were expecting, it may also be related to other biasing issues; I'm only talking about the tail current source here.
An OPAMP book that I like is Vadim Ivanov's book; I forgot its name, though. It's extremely straightforward and mostly covers conventional implementations of the many blocks required in OPAMP design. It's a bit old, but I like it. There are other books you can follow as well, but this is the one that came to mind first.