The threshold for logic state, which probably also
moves as you jack up the tail current.
The more you overtravel past threshold, the more
distance there is to come back the other way, hence
more delay.
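
To put rough numbers on the overtravel idea, here is a minimal sketch that treats the output node as a single-pole RC recovering toward the rail. The supply, threshold and time constant are made-up values for illustration, not taken from your circuit, and a real stage will not be a clean single pole.

```python
import math

def recovery_delay(v_start, v_final, v_threshold, tau):
    """Time for a first-order node that starts at v_start and settles
    toward v_final to cross v_threshold (all in volts, tau in seconds)."""
    return tau * math.log((v_final - v_start) / (v_final - v_threshold))

VDD = 1.8       # assumed supply, volts
V_TH = 1.5      # assumed fixed logic threshold, volts
TAU = 100e-12   # assumed node time constant, seconds

# Push the low level further below threshold and watch the delay grow.
for v_low in (1.4, 1.2, 1.0, 0.8):
    t = recovery_delay(v_low, VDD, V_TH, TAU)
    print(f"low level {v_low:.1f} V -> crosses threshold after {t * 1e12:.1f} ps")
```

The deeper the starting level sits below threshold, the larger the log term, so the delay climbs even though the time constant never changed.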
Why don't you plot yourself families of switching
waveforms and see what's happening?
Also, it appears that the load FETs ought to have
their current scaled similarly to the tail, but you
show no such thing. Maybe this is the problem:
upping the tail current might "bury" the diff pair
drain like I'm talking about.
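
A quick back-of-envelope check of the load scaling point, assuming the load behaves like a plain resistor (real FET loads won't be this linear) and using invented values for the supply, load resistance and target swing:

```python
VDD = 1.8            # assumed supply, volts
R_FIXED = 4e3        # assumed equivalent load resistance, ohms
SWING_TARGET = 0.4   # assumed intended output swing, volts

def low_level(i_tail, r_load, vdd=VDD):
    """Drain voltage when the full tail current flows through one load."""
    return vdd - i_tail * r_load

for i_tail in (100e-6, 200e-6, 400e-6):
    fixed = low_level(i_tail, R_FIXED)                  # load left alone
    scaled = low_level(i_tail, SWING_TARGET / i_tail)   # load scaled with tail
    print(f"Itail {i_tail * 1e6:.0f} uA: fixed load -> {fixed:.2f} V, "
          f"scaled load -> {scaled:.2f} V")
```

With the load left alone, the low level keeps dropping as the tail current rises, which is the "buried" drain. With the load scaled along with the tail, the swing stays put.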