The important thing to realize here is that the load is a PMOS current mirror, which always tries to force a 1:1 current ratio between the two branches of the diff pair.
It is easy to see what happens while the gate of M2 (VG2) sweeps from 0 up to Vdd/2, with the gate of M1 sitting at Vdd/2. Almost all of the tail current goes through M1, since M2 is off or conducts far less than M1. M3 mirrors the current of M1, so M4 wants to source a current equal to Itail, but M2 sinks little or none of it. The only way out is for M4 to go into deep triode with Vds=0, which gives Vout=Vdd.
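To put rough numbers on this current imbalance, here is a minimal sketch of the ideal square-law diff pair split. It assumes both M1 and M2 are in saturation, ignores channel-length modulation and body effect, and uses made-up values for Itail and beta (beta = mu*Cox*W/L); the function name is hypothetical. It only describes the region VG2 <= VG1 discussed above, where M2 still sees a large Vds.

```python
import math

def diff_pair_currents(vid, itail, beta):
    """Return (id1, id2) for an ideal square-law NMOS pair.

    vid   : differential input VG1 - VG2, in volts
    itail : tail current, in amperes
    beta  : mu*Cox*W/L of each input device, in A/V^2 (assumed value)
    """
    # Beyond this differential voltage one device carries all of Itail.
    v_switch = math.sqrt(2.0 * itail / beta)
    if vid >= v_switch:
        return itail, 0.0
    if vid <= -v_switch:
        return 0.0, itail
    # Standard square-law result for the current difference id1 - id2.
    delta = 0.5 * beta * vid * math.sqrt(4.0 * itail / beta - vid * vid)
    return 0.5 * (itail + delta), 0.5 * (itail - delta)

# Illustrative numbers only: Itail = 20 uA, beta = 200 uA/V^2.
for vid in (0.0, 0.2, 0.5):
    id1, id2 = diff_pair_currents(vid, itail=20e-6, beta=200e-6)
    print(f"VG1-VG2 = {vid:.1f} V: Id1 = {id1 * 1e6:.1f} uA, Id2 = {id2 * 1e6:.1f} uA")
```

With these assumed values, a 0.2 V difference already gives about 16 uA in M1 and 4 uA in M2. M4 tries to source what M1 carries, M2 cannot sink it, and that mismatch is what pushes Vout up to Vdd.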
When VG2=Vdd/2, the tail current splits equally between M1 and M2. Vout is then set by the output resistances of M4 and M2 and will sit somewhere in the middle.
The interesting part is what happens when VG2 keeps increasing beyond Vdd/2. The drain of M1 is always one Vgs of M3 below Vdd, so M1 remains in saturation. Because of the PMOS mirror, the currents in the two branches of the diff pair will want to stay more or less equal to Itail/2. However, as the gate of M2 goes up, M2 wants to sink more current and so drives Vout lower. At some point M2 goes into triode, but the currents in the two branches stay mostly equal because, unlike before, M4 is well into saturation and the PMOS mirror works fine. As M2 goes deeper into triode, its Vds tends towards 0, so Vout has no choice but to become equal to the tail node (the common source point of M1 and M2). Even then the currents in the two diff pair branches remain equal. So Vout is limited on the low side by the tail node voltage, which in turn is set by Vdd/2-VGS1 (M1 now works as a source follower carrying Itail/2). VGS1 is the gate-source voltage that results from Itail/2 flowing through M1.
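As a rough worked example of that lower limit, the sketch below estimates VGS1 from the square-law saturation equation and subtracts it from the gate voltage of M1. The threshold voltage, beta (mu*Cox*W/L) and the supply are assumed illustrative values, not taken from the circuit in question, and body effect and channel-length modulation are ignored. The function name is hypothetical.

```python
import math

def vout_lower_limit(vg1, itail, beta, vth):
    """Estimate the minimum Vout of the unloaded diff pair.

    vg1   : gate voltage of M1 (Vdd/2 in the discussion), in volts
    itail : tail current, in amperes
    beta  : mu*Cox*W/L of M1, in A/V^2 (assumed value)
    vth   : threshold voltage of M1, in volts (assumed value)
    """
    id1 = 0.5 * itail                          # M1 carries Itail/2
    vgs1 = vth + math.sqrt(2.0 * id1 / beta)   # square law: Id = (beta/2)*(Vgs - Vth)^2
    return vg1 - vgs1                          # tail node = source follower output

# Illustrative numbers only: Vdd = 3.3 V, Itail = 20 uA, beta = 200 uA/V^2, Vth = 0.7 V.
vdd = 3.3
vmin = vout_lower_limit(vg1=vdd / 2, itail=20e-6, beta=200e-6, vth=0.7)
print(f"Vout lower limit is roughly {vmin:.2f} V")
```

With these assumed values VGS1 comes out at about 1.02 V, so Vout bottoms out at roughly 0.63 V rather than at ground. In a real design the tail current source also needs enough headroom at that node to stay in saturation.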
All of this applies to the case where no extra load is connected to the output of the diff pair, as shown in the picture above.