Normally for an extreme W/L like 400u/0.35u, the layout will be done with multiple smaller transistors.
Typically, this would be done with stripes (for example, 20 stripes, each 20u wide).
It can also be done with a waffle structure, but you have less assurance of the actual width or W/L that you get.
If you are using an extreme width in order to get a ratio, you may be able to vary width in one transistor and length in the other. Due to delta-L, delta-W and threshold differences between long and short, wide and narrow devices, the matching will not be very good, but you can get a large, if inaccurate, ratio fairly easily:
Example: you need 400:1. You could use a device that is 400u/0.35u and another that is 1u/0.35u to get this ratio. You could also use a device that is 20u/0.35u and a second device that is 1u/7u to get the same ratio. The gate area would be about 140u^2 for the first option, but only 14u^2 for the second. The matching will not be as good for the second, due to delta-L mostly, but it would work.
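The arithmetic for the two options can be checked with a short sketch. It assumes drawn dimensions in microns and counts only gate area (W times L), so it ignores delta-L, delta-W, and any spacing or routing overhead. The function and variable names are made up for illustration.

```python
def summarize(devices):
    """Return (W/L ratio of device 1 to device 2, total gate area in u^2).

    Each device is a (width_um, length_um) pair of drawn dimensions.
    First-order only: no delta-W / delta-L correction is applied.
    """
    (w1, l1), (w2, l2) = devices
    ratio = (w1 / l1) / (w2 / l2)
    gate_area = w1 * l1 + w2 * l2
    return ratio, gate_area

# Option A: all the ratio comes from width.
option_a = [(400.0, 0.35), (1.0, 0.35)]
# Option B: part of the ratio from width, part from length.
option_b = [(20.0, 0.35), (1.0, 7.0)]

ratio_a, area_a = summarize(option_a)
ratio_b, area_b = summarize(option_b)

print(f"A: ratio {ratio_a:.1f}, gate area {area_a:.2f} u^2")
print(f"B: ratio {ratio_b:.1f}, gate area {area_b:.2f} u^2")
```

Both options come out at a nominal 400:1, with roughly a 10x difference in gate area.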
In reality, you would probably use either 401 unit 1u/0.35u devices, with 400 in parallel for the 400u wide device and the remaining one as the 1u device, or you would use 40 unit 1u/0.35u devices, with 20 in parallel for the 20u/0.35u device and 20 in series for the 1u/7u device.
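As a rough sketch of the unit-device approach, the code below treats n units in parallel as one device of n times the width, and n units in series as one device of n times the length. That is a first-order assumption only; real series stacks and parallel arrays deviate from it somewhat. The helper names are hypothetical.

```python
UNIT_W = 1.0   # unit device width in microns
UNIT_L = 0.35  # unit device length in microns

def parallel(n):
    """n unit devices in parallel: widths add, length unchanged."""
    return (n * UNIT_W, UNIT_L)

def series(n):
    """n unit devices in series: lengths add, width unchanged (first order)."""
    return (UNIT_W, n * UNIT_L)

def w_over_l(device):
    """W/L of an equivalent (width, length) device."""
    w, l = device
    return w / l

# 401 units: 400 in parallel against a single unit.
ratio_all_parallel = w_over_l(parallel(400)) / w_over_l(parallel(1))
units_all_parallel = 400 + 1

# 40 units: 20 in parallel against 20 in series.
ratio_mixed = w_over_l(parallel(20)) / w_over_l(series(20))
units_mixed = 20 + 20

print(f"all parallel: {units_all_parallel} units, ratio {ratio_all_parallel:.1f}")
print(f"mixed:        {units_mixed} units, ratio {ratio_mixed:.1f}")
```

Because every transistor in both arrangements is the same 1u/0.35u unit, delta-W and delta-L affect all of them alike, which is the reason for building from unit devices.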