Theoretically speaking, you can figure out the clock insertion delays to every flip-flop, sort the clock insertion delays from the longest to the shortest, and reordered the scan chains accordingly. This will guarantee that there is no hold time violations, assuming that the hold time requirement on the SI input of your flip-flops is 0ns or less.
Practically, this is not the best way, since it will still lead to long clock routes.
The best thing to do is to look at the clock trees, and do a bottom up approach by stitching all flip-flops from the same clock branch first. The theory is that FFs in the same branch must be physically close to each other, and have minimum skew, meeting both scan reordering criteria.