Look at page 635 of Razavi's "Design of Analog CMOS Integrated Circuits". The idea is to make the thermal noise from the gate resistance of the fingers smaller than the transistor's own noise referred to the gate.
But on p. 635 it says:
"In low-noise applications, the gate resistance must be 1/5 to 1/10 of 1/gm."
That is about gate resistance, not the number of fingers.
So who is wrong?
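These are related. If you use one finger for a transistor W/L, the gate resistance is (W/L)*Rpoly, where Rpoly is the sheet resistance of the gate material. If you now make the same transistor with n fingers, each finger has resistance (W/(n*L))*Rpoly, and because the n fingers are connected in parallel the total gate resistance drops to (W/(n^2*L))*Rpoly. So more fingers means lower gate resistance, which is how the finger count ties in with the gate-resistance rule in Razavi.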
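To put numbers on this, here is a rough Python sketch. It assumes each finger has width W/n and that the n fingers are simply in parallel, so the lumped gate resistance scales as 1/n^2. It ignores the distributed-RC correction and contact resistance. The function names and the example values (10 ohm/sq sheet resistance, gm = 5 mS) are made up for illustration and do not come from Razavi or any particular process.

```python
def gate_resistance(w_um, l_um, r_sheet, n_fingers=1):
    """Lumped end-to-end gate resistance in ohms (assumed simple model).

    Each finger is W/n wide, so it has r_sheet * (W/n) / L ohms,
    and n such fingers are connected in parallel.
    """
    finger_w = w_um / n_fingers
    r_finger = r_sheet * finger_w / l_um
    return r_finger / n_fingers


def min_fingers(w_um, l_um, r_sheet, gm, ratio=5.0):
    """Smallest finger count with R_gate <= (1/gm) / ratio.

    ratio=5 corresponds to the loose end of the quoted 1/5 to 1/10 rule.
    """
    target = 1.0 / (ratio * gm)
    n = 1
    while gate_resistance(w_um, l_um, r_sheet, n) > target:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical device: W = 100 um, L = 0.5 um, 10 ohm/sq poly, gm = 5 mS
    for n in (1, 2, 4, 8):
        print(n, gate_resistance(100, 0.5, 10, n))
    print("fingers needed:", min_fingers(100, 0.5, 10, 5e-3))
```

With these assumed numbers 1/gm is 200 ohm, so the target is 40 ohm, and a single 2000 ohm finger is far too resistive. Contacting the gate at both ends would lower the resistance further, which this sketch does not model.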
Another argument for this (something designers often forget) is that very large transistors are never characterised properly in the models. Fabs will typically characterise out to about 20um width and then extrapolate the performance in Spice (the test chips for characterisation rarely go larger than this). The model is very often incorrect at large L or W, so if you design the input of an op amp with a 40um device or so, the gain may be a lot lower than you think, and offsets can appear that look like mismatch.
Also, any slight Vt shifts from the metallisation processes will hit larger devices harder (they are larger antennas for picking up charge), so matching issues can occur as well.
Yes, when we need an inverse-ratio MOS (W << L) or a large-ratio one (W >> L), we often build it from several transistors in series, each with W and L inside the model's maximum and minimum limits, to guarantee model accuracy.
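As a rough illustration of the splitting described above, the sketch below divides a long channel into equal series units that each stay under a maximum characterised length. The limit of 20um and the function name are assumptions for the example only. Treating k common-gate series devices of length L/k as one device of length L is a first-order approximation.

```python
import math


def split_series(l_total_um, l_max_um):
    """Return (unit_count, unit_length_um) for equal series devices.

    Every unit has the same W as the target device and a length
    no larger than l_max_um (an assumed model limit).
    """
    count = math.ceil(l_total_um / l_max_um)
    return count, l_total_um / count


if __name__ == "__main__":
    # Hypothetical: a 100 um long device, model assumed valid up to L = 20 um
    print(split_series(100, 20))
```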