Yes, they can - especially if the taper progression is
nonuniform. Rise / fall edge time asymmetry can stack
up, and it is very sensitive to loading detail. Slow
edges are where voltage noise (visible on your top
trace) gets converted to time noise (jitter).
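
To put a rough number on that noise-to-jitter conversion,
here is a minimal Python sketch. It assumes a linear ramp
through the switching threshold and the first-order relation
jitter = noise voltage / slew rate. The function name and
the 10 mV / 1 V / 50 ps / 500 ps values are made up for
illustration; they are not read off your waveforms.

```python
def jitter_rms(v_noise_rms, v_swing, t_edge_10_90):
    """First-order timing noise from voltage noise on an edge.

    Assumes a linear ramp, so the slew rate at the threshold
    crossing is 80% of the swing divided by the 10-90% edge time.
    """
    slew = 0.8 * v_swing / t_edge_10_90  # volts per second
    return v_noise_rms / slew            # seconds, rms

# Hypothetical numbers: 10 mV rms noise on a 1 V swing.
fast = jitter_rms(10e-3, 1.0, 50e-12)   # 50 ps edge
slow = jitter_rms(10e-3, 1.0, 500e-12)  # 500 ps edge

print(f"fast edge: {fast * 1e12:.3f} ps rms")
print(f"slow edge: {slow * 1e12:.3f} ps rms")
```

Same noise, ten times slower edge, ten times the jitter.
That is all the model says, but it is the reason the slow
nodes deserve the attention.
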
I once had to redo a chip-scale clock tree (on a
40K-gate ASIC done by Spectre simulation and hand
layout of every steenkin' gate and upward) because
I ran into this kind of distortion, having used
noninverting buffer gates built from 1X:3X inverters.
The internal node was fast and the external node was
slow, and the difference really stacked up over the
tree depth. Noninverting stages do that, while matched
inverting stages cancel to first order. You want to
look at the edge rate of each inverter stage and try to
keep them matched, inclusive of the true net loading.
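
The stack-up versus cancellation argument can be shown with
a toy bookkeeping model in Python. This is only a sketch
under stated assumptions: every stage is an inverter with a
fixed output-rising delay (t_plh) and output-falling delay
(t_phl), and all delay values (in ps) are hypothetical, not
numbers from that ASIC. A "buffer" here is a fast internal
inverter followed by a slow, heavily loaded one.

```python
def edge_delay(stages, input_rising):
    """Delay of one input edge through a chain of inverters.

    stages: list of (t_plh, t_phl) tuples, one per inverter,
            output-rising and output-falling delay.
    """
    total = 0
    rising = input_rising
    for t_plh, t_phl in stages:
        # A rising input makes the output fall (t_phl), and vice versa.
        total += t_phl if rising else t_plh
        rising = not rising  # the edge polarity flips at every inverter
    return total

def pulse_width_distortion(stages):
    """Rising-edge delay minus falling-edge delay at the input.

    For an even number of stages, a positive value means the high
    phase of the clock shrinks by that amount.
    """
    return edge_delay(stages, True) - edge_delay(stages, False)

# Hypothetical delays in ps.
fast_internal = (30, 25)    # lightly loaded internal node
slow_external = (120, 100)  # heavily loaded output node
buffer_chain = [fast_internal, slow_external] * 8  # 8 buffers deep

matched = (60, 50)
inverter_chain = [matched] * 16                    # 16 matched inverters

print(pulse_width_distortion(buffer_chain))    # 15 ps per buffer, 8 deep
print(pulse_width_distortion(inverter_chain))  # cancels in this model
```

In the buffer chain every level has the same polarity and the
same fast/slow mismatch, so the per-buffer error adds with
depth. In the matched chain each edge alternates between
t_plh and t_phl, so both edges see the same total delay.
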
Now, you might also suffer from wafer processing
skew (N vs P drive strength and inverter net threshold).
Expect the lvt devices to vary more widely than regular
ones: lighter implants leave VT and drive strength more
sensitive to gate ox / channel surface qualities,
random dopant fluctuation, etc.
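
For a feel of how N vs P skew moves the inverter threshold,
here is a back-of-envelope Python sketch using the textbook
long-channel square-law expression. That model is an
assumption on my part and is crude for short-channel
devices; the VT shifts and drive-strength ratios below are
hypothetical, not from any PDK. Treat it as a hand
calculation, not a substitute for corner or Monte Carlo runs.

```python
import math

def inverter_threshold(vdd, vtn, vtp_abs, kn, kp):
    """Switching threshold of a CMOS inverter, square-law model.

    vtn, vtp_abs: NMOS threshold and magnitude of PMOS threshold (V).
    kn, kp: NMOS / PMOS transconductance factors (only the ratio matters).
    """
    r = math.sqrt(kp / kn)
    return (vtn + r * (vdd - vtp_abs)) / (1.0 + r)

VDD = 1.0  # hypothetical supply

# Balanced device pair.
vm_tt = inverter_threshold(VDD, 0.30, 0.30, 1.0, 1.0)
# Fast N / slow P: lower NMOS VT and more NMOS drive.
vm_fnsp = inverter_threshold(VDD, 0.27, 0.33, 1.2, 0.8)
# Slow N / fast P: the opposite skew.
vm_snfp = inverter_threshold(VDD, 0.33, 0.27, 0.8, 1.2)

for name, vm in [("FN/SP", vm_fnsp), ("TT", vm_tt), ("SN/FP", vm_snfp)]:
    print(f"{name}: Vm = {vm:.3f} V")
```

A threshold that moves with the corner shifts where each
stage trips on a slow edge, which feeds straight back into
the rise / fall asymmetry.
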
That's something you could explore in Spectre. But I'd
explore it using an analog_extracted netlist with high
parasitic fidelity (relative to final layout);
schematic-based simulation is likely to miss the mark.