Difference between RTL and other logic families
As a first point, consider that mapping RTL to the target technology always causes structural differences, for various reasons. If you think about it, you will most likely find those reasons at work in your simple example.
Of course, the fitter result depends on the logic family used, but I guess the registers don't have inverted outputs in your case. That alone would be a plausible reason. For internal registers, the alternative would be a succeeding logic element, which causes additional tco delay; for an output register, that option doesn't exist at all.
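
To illustrate why such a restructuring is harmless, here is a small behavioral sketch in Python. It assumes the difference you see is an inverter that the tool moved from behind the register to in front of it, which is only my guess about your design. The function names and the bit stream are made up for the illustration; this is not a model of any particular fitter.

```python
def reg_then_invert(d_stream, init=0):
    """RTL view: a register followed by an inverter on its output."""
    q = init                 # register content at power-up
    seen = []
    for d in d_stream:
        seen.append(1 - q)   # inverted output visible during this cycle
        q = d                # register captures d at the clock edge
    return seen


def invert_then_reg(d_stream, init=0):
    """Mapped view (assumed): inverter in front of the register.

    The register now stores the inverted value, so its power-up
    value has to be inverted as well.
    """
    q_n = 1 - init
    seen = []
    for d in d_stream:
        seen.append(q_n)     # register output used directly, no extra logic
        q_n = 1 - d          # inversion happens before the clock edge
    return seen


data = [0, 1, 1, 0, 1]       # hypothetical input bits
assert reg_then_invert(data) == invert_then_reg(data)
```

Both versions produce the same output sequence, but in the second one nothing sits between the register and the pin, so tco isn't increased.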