You are correct: in theory, sampling and quantization are interchangeable. In practice, though, swapping them causes a lot more problems. Say you do quantization first. The quantizer is a completely non-linear block, so its output is a staircase with sharp edges, and the sampler would see very high frequencies at its input relative to the original signal. Handling those with very low distortion would require a lot of power, which puts too tight a constraint on the sampler. Characterizing the sampler on its own would also be a nightmare.
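To make both halves of that concrete, here is a rough sketch in plain Python. Everything in it is an assumption chosen for illustration: the 3-bit uniform quantizer, the 1 Hz test sine, and the dense 1000-point grid that stands in for continuous time. It models an ideal sampler, so it only shows that the two orders give the same numbers on paper, and that the quantized waveform has much steeper edges than the original sine.

```python
import math

def quantize(x, bits=3, full_scale=1.0):
    """Illustrative uniform mid-rise quantizer (an assumption, not a real ADC model)."""
    step = 2 * full_scale / 2 ** bits
    idx = math.floor(x / step)
    # Clamp to the available codes.
    idx = max(-2 ** (bits - 1), min(2 ** (bits - 1) - 1, idx))
    return (idx + 0.5) * step

def signal(t, f0=1.0):
    """Test input: a 1 Hz sine at 90% of full scale."""
    return 0.9 * math.sin(2 * math.pi * f0 * t)

def max_step(seq):
    """Largest jump between neighbouring points, a crude proxy for edge steepness."""
    return max(abs(b - a) for a, b in zip(seq, seq[1:]))

dense_rate = 1000      # points per second on the "continuous" stand-in grid
sample_every = 50      # ideal sampler keeps every 50th point (20 Hz)
dense_t = [k / dense_rate for k in range(dense_rate)]

smooth = [signal(t) for t in dense_t]
quantized_dense = [quantize(v) for v in smooth]

# Order A: sample first, then quantize.
sample_then_quantize = [quantize(v) for v in smooth[::sample_every]]
# Order B: quantize first, then sample.
quantize_then_sample = quantized_dense[::sample_every]

# With an ideal sampler the two orders agree exactly.
same_output = sample_then_quantize == quantize_then_sample

# The quantized waveform jumps a full quantization step between neighbouring points,
# while the sine only moves a tiny amount.
max_step_smooth = max_step(smooth)
max_step_quantized = max_step(quantized_dense)

print(same_output, max_step_smooth, max_step_quantized)
```

On a real continuous-time staircase the edges are far steeper than this grid can show, and a real sampler has finite bandwidth and settling time, which is where the extra power and distortion come from.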