1 dB compression point
To maximize linearity, use a mixer with a high IMD intercept point (IP) and a high 1 dB compression point. Then make sure that your signal is well below these levels. The optimum point is 2/3 of the way up (in dB) from the noise floor to the IP. This also applies to all amplifiers, RF and IF, in the signal path. You will need to draw up a careful signal path diagram that lists all of the parameters at each stage, along with the signal level, to make sure you keep the whole system linear.
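As a rough illustration of where the 2/3 figure comes from, here is a minimal Python sketch. It assumes the intercept point is the third-order one (IP3), that all levels are referred to the same port, and that the usual two-tone model holds (third-order products rise 3 dB for every 1 dB of signal). The function names and the example numbers are hypothetical, not taken from any particular mixer.

```python
def optimum_level_dbm(noise_floor_dbm, ip3_dbm):
    """Level 2/3 of the way up (in dB) from the noise floor to the intercept point."""
    return noise_floor_dbm + 2.0 * (ip3_dbm - noise_floor_dbm) / 3.0


def im3_level_dbm(signal_dbm, ip3_dbm):
    """Third-order product level for two equal tones (assumed model): 3*P - 2*IP3."""
    return 3.0 * signal_dbm - 2.0 * ip3_dbm


if __name__ == "__main__":
    noise_floor = -100.0  # dBm, hypothetical noise floor in the signal bandwidth
    ip3 = 20.0            # dBm, hypothetical intercept point at the same port

    p_opt = optimum_level_dbm(noise_floor, ip3)
    print(f"2/3 point:          {p_opt:.1f} dBm")
    print(f"IM3 at that level:  {im3_level_dbm(p_opt, ip3):.1f} dBm")
    print(f"Noise floor:        {noise_floor:.1f} dBm")
```

Under those assumptions, the 2/3 point is the level at which the third-order products have just risen to the noise floor. Running the same calculation for every stage in the signal path diagram shows which stage limits the system.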
Diode ring mixers are the most linear, but take the most LO power. I suspect that with 128 QAM your system is not the low-cost type, so these more expensive mixers and LO amplifiers are acceptable. You may have to AGC the RF stage as well as the IF stages to keep the signal at the mixer input in this range for all signal levels. As you probably know, the phase jitter of all conversion oscillators is critical as well. Also make sure that the AGC stages do not degrade their IMD parameters, or at least that you account for them in your calculations over the whole range of input signal amplitudes.
How you terminate the mixer affects its performance. Look here **broken link removed** and read the guides for mixers. Also, you should choose the IF so that the mixer products M×RF ± N×LO do not fall in the IF passband. For those that do, M and N should be large numbers and N should be even.
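A quick way to check a candidate IF is to list every M×RF ± N×LO product that lands in the IF passband. The Python sketch below is only an illustration: the function name, the `max_order` limit, and the example frequencies are my own assumptions, and it checks single frequencies rather than sweeping the RF across the whole input band.

```python
def spurs_in_if(rf_hz, lo_hz, if_center_hz, if_bw_hz, max_order=10):
    """Return sorted (m, n, freq_hz) for every |m*RF +/- n*LO| inside the IF passband.

    The desired product (m=1, n=1) is included in the result so it can be
    told apart from the unwanted ones by the caller.
    """
    low = if_center_hz - if_bw_hz / 2
    high = if_center_hz + if_bw_hz / 2
    hits = set()
    for m in range(max_order + 1):
        for n in range(max_order + 1):
            if m == 0 and n == 0:
                continue
            for sign in (1, -1):
                freq = abs(m * rf_hz + sign * n * lo_hz)
                if low <= freq <= high:
                    hits.add((m, n, freq))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical plan: 100 MHz RF, 90 MHz LO, 10 MHz IF, 2 MHz wide.
    for m, n, freq in spurs_in_if(100_000_000, 90_000_000, 10_000_000, 2_000_000):
        tag = "desired" if (m, n) == (1, 1) else "spur"
        print(f"M={m} N={n} -> {freq / 1e6:.3f} MHz ({tag})")
```

Any hit other than M=1, N=1 is a product to weigh against the guideline above. In practice you would repeat the check across the full RF input band and LO tuning range.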