Hi,
I am assuming that you want to generate ATPG vectors using a custom load-unload sequence.
The standard steps are to sensitize and then measure. The shift-in and the parallel load sensitize the circuit; at that point you can already detect some faults on the regular outputs. The capture clock then captures the results of the sensitization for the rest of the chip. I can see why you might want to measure the outputs after the capture clock, but I don't think you want to skip the initial parallel test (step 3).
By moving the "measure outputs" step to after the capture clock, you may lose the ability to detect certain faults, or make them harder to detect.
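To make that concrete, here is a toy illustration in Python. The circuit is my own made-up example, not your design: one scan flop Q, one input A, an output Y = A AND Q, and next-state logic where the flop simply captures A. The fault is the A-side input pin of the AND gate stuck-at-1, and I am assuming the inputs are held through the capture clock. It is only a sketch of the effect, not how an ATPG tool models it.

```python
from itertools import product

def po(a, q, fault=False):
    # Output Y = A AND Q. The hypothetical fault forces the A-side
    # input pin of the AND gate to 1 (stuck-at-1).
    a_pin = 1 if fault else a
    return a_pin & q

def capture(a, q):
    # Hypothetical next-state logic: the flop captures A on the clock.
    # The fault sits on the AND gate pin, so it does not disturb this path.
    return a

def detecting_patterns(measure_after_capture):
    # Try every (A, Q) load and keep those where good and faulty Y differ.
    hits = []
    for a, q in product((0, 1), repeat=2):
        q_seen = capture(a, q) if measure_after_capture else q
        if po(a, q_seen) != po(a, q_seen, fault=True):
            hits.append((a, q))
    return hits

before = detecting_patterns(measure_after_capture=False)
after = detecting_patterns(measure_after_capture=True)
print("detecting patterns, measure before capture:", before)
print("detecting patterns, measure after capture: ", after)
```

In this toy case the pattern A=0, Q=1 detects the fault when the output is measured before the capture clock, but no pattern detects it afterwards, because the capture overwrites the state that was sensitizing the output.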
Why is it necessary to change the sequence?
You could measure the effect of changing the sequence by running ATPG once with the standard sequence and once with the custom sequence. Whichever sequence gives you the better coverage with fewer vectors would be the way to go.
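If it helps, the comparison itself can be sketched in a few lines of Python. This is only an illustration: the `AtpgResult` record, the `pick_sequence` helper, and the `tolerance_pct` knob are names I made up, and the numbers are placeholders, not real results. You would fill them in by hand from the summary reports of your two ATPG runs.

```python
from dataclasses import dataclass

@dataclass
class AtpgResult:
    name: str            # label for the run, e.g. "standard" or "custom"
    coverage_pct: float  # fault coverage from the ATPG summary report
    vector_count: int    # number of patterns generated

def pick_sequence(results, tolerance_pct=0.0):
    # Keep the runs whose coverage is within tolerance_pct of the best,
    # then prefer the one with the fewest vectors (higher coverage breaks ties).
    best_cov = max(r.coverage_pct for r in results)
    candidates = [r for r in results if best_cov - r.coverage_pct <= tolerance_pct]
    return min(candidates, key=lambda r: (r.vector_count, -r.coverage_pct))

# Placeholder numbers for illustration only.
runs = [
    AtpgResult("standard", coverage_pct=98.2, vector_count=1200),
    AtpgResult("custom", coverage_pct=97.1, vector_count=1100),
]

strict = pick_sequence(runs)                     # coverage must match the best
relaxed = pick_sequence(runs, tolerance_pct=1.5)  # accept a small coverage loss
print("strict choice: ", strict.name)
print("relaxed choice:", relaxed.name)
```

The tolerance is there because the two goals can conflict: with no tolerance the higher-coverage run always wins, while a small tolerance lets a shorter pattern set win if the coverage loss is acceptable to you.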