As I just said, with large signal amplitudes on the RF or the LO input the non-linear mixing products will be so high that it doesn't matter where the transistor operates.
But with smaller signal amplitudes it can be better not to let the BJTs leave the normal active region: they shouldn't operate in saturation if high linearity is the target.
And I also said the converse: if the BJTs operate in saturation, the mixer will be slower and won't handle higher frequencies well. A BJT is not as good a switch as a MOSFET; as far as I know, it needs more time to come out of saturation. Its parasitic capacitances are smaller, though, so in linear operation quite high frequencies are reachable.
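To put rough numbers on the amplitude dependence, here is a minimal sketch. It is not a model of a real mixer: it assumes a single ideal BJT in the active region with a memoryless exponential characteristic, I_C proportional to exp(v/VT), a thermal voltage of about 25 mV, and a single sine-wave drive. The function names are my own for illustration. It only shows how the third-harmonic content relative to the fundamental grows as the drive amplitude increases.

```python
import math

VT = 0.025  # thermal voltage in volts; assumed ~25 mV (room temperature)

def harmonic_amplitude(amplitude_v, harmonic, samples=1024):
    """Amplitude of one harmonic of exp(v/VT) for v = A*cos(theta).

    The current is normalized to the saturation current I_S, so only
    ratios between harmonics are meaningful.
    """
    total = 0.0
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        current = math.exp(amplitude_v * math.cos(theta) / VT)
        total += current * math.cos(harmonic * theta)
    # Fourier cosine coefficient over one full period
    return 2 * total / samples

def third_to_fundamental(amplitude_v):
    """Ratio of third-harmonic to fundamental collector current."""
    return harmonic_amplitude(amplitude_v, 3) / harmonic_amplitude(amplitude_v, 1)

for a_mv in (1, 5, 25, 100):
    ratio = third_to_fundamental(a_mv / 1000)
    print(f"{a_mv:4d} mV drive: 3rd harmonic / fundamental = {ratio:.4g}")
```

With a drive well below VT the ratio follows the small-signal estimate (A/VT)²/24, so the distortion is tiny; once the drive is several times VT the harmonics are a large fraction of the fundamental even though the transistor never leaves the active region in this model.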