If your measured results show an offset error, my first guess is that it comes from the opamp itself. As you stated above, poor device matching in the opamp layout could have caused this. You can re-run your pre-silicon simulation with some input device mismatch deliberately introduced to see the effect. Diode mismatch is another possible factor.
Have you run a Monte Carlo analysis for mismatch? Process corners do not account for mismatch between devices. The input transistors of the amp can be critical.
What are the sizes of the input transistors?
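To show why the input pair size matters, here is a rough sketch using the Pelgrom area model, sigma(dVth) = A_VT / sqrt(W*L), with a small Monte Carlo loop on top. It only models threshold mismatch of the input pair and ignores load devices, current-factor mismatch and the diodes. The A_VT value and the two device sizes are made-up numbers for illustration, not values from your process or design; take the real coefficient from your foundry's matching documentation.

```python
import math
import random


def sigma_vth_mv(a_vt_mv_um, w_um, l_um):
    """Pelgrom area model: 1-sigma Vth difference (mV) of a matched pair."""
    return a_vt_mv_um / math.sqrt(w_um * l_um)


def monte_carlo_offset_mv(a_vt_mv_um, w_um, l_um, runs=20000, seed=1):
    """Draw random pair offsets and return their sample standard deviation (mV)."""
    rng = random.Random(seed)
    sigma = sigma_vth_mv(a_vt_mv_um, w_um, l_um)
    samples = [rng.gauss(0.0, sigma) for _ in range(runs)]
    mean = sum(samples) / runs
    return math.sqrt(sum((s - mean) ** 2 for s in samples) / (runs - 1))


if __name__ == "__main__":
    A_VT = 5.0  # mV*um, assumed value for illustration only
    for w, l in [(10.0, 1.0), (40.0, 4.0)]:  # hypothetical input pair sizes
        print(f"W/L = {w}/{l} um: "
              f"analytic sigma = {sigma_vth_mv(A_VT, w, l):.3f} mV, "
              f"Monte Carlo sigma = {monte_carlo_offset_mv(A_VT, w, l):.3f} mV")
```

Under this model, a 16x larger gate area gives a 4x smaller offset sigma, which is why the input transistor sizes are the first thing to check. A real Monte Carlo run in your simulator with the foundry mismatch models will also capture the contributions this sketch leaves out.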