First eliminate all static sources of error using a calibrated DAC. This includes ground shift of Vref caused by digital currents, and thermal drift of Vref.
When I designed a 12-bit SCADA system in the '70s, I used Burr-Brown 12-bit ADCs and a DAC. I was happy with the DAC, but the ADC had issues with monotonicity, missing codes, and hysteresis, which I presumed was an early flaw in the internal Vref grounding interacting with the SAR comparator ground currents. This was, at the time, a state-of-the-art SAR ADC hybrid in ceramic inside a shielded can.
To test the DAC, I used an AC-coupled scope and examined each step while a suitably clocked counter drove the parallel-input DAC. The response error and linearity were within the expected ±1/2 LSB, with very little error noted as the scope's step size drifted toward zero (AC-coupled). Then, by changing speeds and varying the load, I could verify the output impedance of the driver and the load-regulation errors. No problem.
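The staircase data from that counter-driven test reduces to standard DNL/INL figures. Here is a minimal Python sketch of that reduction, assuming one captured output voltage per code; the 12-bit width, 10 V full scale, and the (ideal, synthetic) staircase are my illustrative values, not the original design's.

```python
# Compute DNL/INL in LSB from a measured DAC staircase.
# The voltage list here is an ideal synthetic staircase; in practice it
# would come from scope captures of each counter-driven step.

N_BITS = 12
FS = 10.0  # assumed full-scale range, volts

ideal_lsb = FS / (2 ** N_BITS)

# One output voltage per input code (replace with measured values)
voltages = [code * ideal_lsb for code in range(2 ** N_BITS)]

# DNL: deviation of each actual step from one ideal LSB, in LSB units
steps = [voltages[i + 1] - voltages[i] for i in range(len(voltages) - 1)]
dnl = [(s - ideal_lsb) / ideal_lsb for s in steps]

# INL: cumulative deviation from the ideal straight line, in LSB units
inl = [(v - code * ideal_lsb) / ideal_lsb for code, v in enumerate(voltages)]

print("max |DNL| =", max(abs(d) for d in dnl), "LSB")
print("max |INL| =", max(abs(e) for e in inl), "LSB")
# A DAC meeting a +/- 1/2 LSB spec keeps both maxima below 0.5;
# any DNL below -1 LSB indicates a non-monotonic or missing step.
```

A negative DNL spike at a major-carry code is exactly the kind of boundary error described further down for the ADC.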
To test the ADC, I used my confirmed, calibrated DAC: drive the DAC from the ADC's output and compare In vs. Out using the scope's A + B(inverted) function. When I swept with DC, I could clearly see the errors at code boundaries such as ...0111... to ...1000... and vice versa, with a hysteresis effect, confirming either a Vref issue in the converter's internal DAC or a droop problem in the S&H. Note that the hold capacitor must be C0G/NP0 or film, as anything else has a dielectric-memory effect, and sag or droop into the load impedance must be negated. Budget the RC droop over the SAR conversion time, whether that is nanoseconds or microseconds in your case; my sample rate was 20 MHz. I used a film capacitor and a CMOS S&H, with no crosstalk or droop in the held value up to the bandwidth of interest, and all band-stop filters had zero group delay in the passband. You need to consider how to do this with eye-pattern tests at your data rate, or use group-delay measurements. The S&H is a big source of error, with crosstalk and transients that are polarity- and level-dependent. Shielding, ground guard tracks, active-guarding methods, and common-mode filtering with ferrites are often necessary to ensure signal integrity before the ADC.
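To put a number on the droop budget mentioned above, here is a back-of-envelope Python sketch. The hold capacitance, leakage resistance, conversion time, and converter parameters are all assumptions of mine for illustration, not figures from the original design.

```python
# Estimate hold-capacitor droop during a SAR conversion and compare it
# to 1/2 LSB. All component values below are assumed, not measured.
import math

V_HELD = 5.0      # held sample voltage (V), assumed
C_HOLD = 100e-12  # film/C0G hold capacitor (F), assumed
R_LEAK = 1e9      # effective leakage/load resistance (ohm), assumed
T_CONV = 50e-9    # SAR conversion time (s), assumed

N_BITS = 12
FS = 10.0
lsb = FS / 2 ** N_BITS

# The held voltage decays exponentially through the leakage path
v_end = V_HELD * math.exp(-T_CONV / (R_LEAK * C_HOLD))
droop = V_HELD - v_end

print(f"droop = {droop * 1e6:.3f} uV = {droop / lsb:.5f} LSB")
# The droop must stay well under 1/2 LSB, or the SAR decisions made late
# in the conversion no longer describe the sample taken at the start.
```

Swapping in an electrolytic or high-K ceramic here means adding a dielectric-absorption term on top of this RC decay, which is why the capacitor type matters so much.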
Thus you can determine many sources of error simply using a scope and input waveforms from DC, sine, mixed sines, square pulses, and sin(x)/x, measuring each source of error and comparing it with your error budget. The SNR will be the result of all the DC and AC noise sources, including sensor EMI crosstalk, PSU noise, and the quality of the signal source. Using a PRSG (pseudo-random sequence generator) data pattern, band-limited by your input filter and clocked at a suitable rate, is another way to measure eye-pattern distortion or group-delay distortion in your filters, as well as your ADC quantization errors.
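As a concrete example of such a pattern source, here is a minimal PRBS-7 generator (x^7 + x^6 + 1 LFSR) in Python; the polynomial and seed are my choices, and any maximal-length sequence serves the same purpose.

```python
# Fibonacci LFSR implementing the PRBS-7 polynomial x^7 + x^6 + 1.
# Band-limit the resulting bit stream with the input filter and clock it
# through the chain to examine eye-pattern or group-delay distortion.

def prbs7(seed=0x7F):
    """Yield one full 127-bit maximal-length PRBS-7 period, LSB first."""
    state = seed & 0x7F
    for _ in range(127):
        # Feedback taps at stages 7 and 6 (MSB and the bit below it)
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        yield state & 1
        state = ((state << 1) | new_bit) & 0x7F

bits = list(prbs7())
print("period:", len(bits), "ones:", sum(bits))
# A maximal-length 7-bit LFSR repeats every 2^7 - 1 = 127 bits,
# visiting every nonzero state exactly once per period.
```

Longer polynomials (PRBS-15, PRBS-23) give flatter spectra at low frequencies, at the cost of a longer repeat interval on the scope.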
I know this is old school, from the late '70s, but it worked for me. So many people ignore Vref errors, ground-shift noise, and hysteresis effects at certain bit boundaries, lumping the results into one reading. It is more important to have test methods that isolate each source of error: linearity, asymmetry, harmonic distortion, and so on. Use a square wave with symmetrical rise/fall times, verify the quality of the source by measuring its 2nd harmonic at -60 to -80 dB, then inject it into the ADC and observe the effects of asymmetry at different source frequencies and DC offsets.
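The symmetry check works because an ideal 50%-duty square wave contains no even harmonics, so any 2nd-harmonic energy measures rise/fall or duty-cycle asymmetry directly. A small Python sketch on synthetic data, with a slightly skewed duty cycle as the assumed impairment:

```python
# Measure the 2nd harmonic of a square wave relative to the fundamental.
# Pure-Python DFT bins on synthetic samples; the duty-cycle skew below is
# an assumed impairment, chosen only to make H2 visible.
import math

N = 1024      # samples per record
CYCLES = 8    # integer cycles so harmonics land exactly on DFT bins
DUTY = 0.501  # slightly asymmetric duty cycle (assumed)

x = [1.0 if (i * CYCLES / N) % 1.0 < DUTY else -1.0 for i in range(N)]

def bin_mag(signal, k):
    """Magnitude of DFT bin k of an N-sample record."""
    re = sum(s * math.cos(2 * math.pi * k * i / N) for i, s in enumerate(signal))
    im = sum(-s * math.sin(2 * math.pi * k * i / N) for i, s in enumerate(signal))
    return math.hypot(re, im)

h1 = bin_mag(x, CYCLES)      # fundamental
h2 = bin_mag(x, 2 * CYCLES)  # 2nd harmonic, absent for a symmetric wave
print(f"H2 relative to H1: {20 * math.log10(h2 / h1):.1f} dB")
# A source clean enough for the -60 to -80 dB criterion above reads far
# lower than this deliberately skewed wave does.
```

Repeating the measurement after the ADC, at several frequencies and DC offsets, separates source asymmetry from converter asymmetry.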
I wasn't in the desert, just in a small R&D lab on my own, but I can understand your challenges.
I had no help and was only a couple of years out of UofM. In the end the system worked well: automated eddy-current inspection probing with robotics for CANDU nuclear secondary heat exchangers, with 2000 × 60 m U-tubes per tube-sheet heat exchanger, dozens of exchangers per reactor building, and many reactor buildings. The resolution was 0.1 mm of pipe (decimated to 0.2 mm) at a 10 MHz sampling rate, with controlled probe drive speeds and calibration holes and slots for the quadrature 100/200 kHz eddy-current probe signals, measuring down to 100 ppm change in vector impedance with high SNR. This allowed them to detect metallurgical flaws before a heavy-water leak could occur at 10k atmospheres of pressurized heavy water.