Sorry, I can't help beyond the basic concepts. No experience with Spartan, FPGA, PMOD, or SD cards.
Your row of numbers does not look like digitized audio. Seeing the word 'sequencer', I suspect they are codes that select different settings in a drum machine sequencer.
Digitized audio has thousands of numbers per second. A single percussion burst might occupy several kBytes. (A sequencer contains a variety of these digitized sounds.)
If you wish to devise your own application, the general process is like this:
(a) read one number at a time,
(b) feed the number to a digital-to-analog converter (DAC),
(c) the DAC's output voltage is low if the incoming number was low, high if the number was high (simplified explanation),
(d) apply this voltage level to an audio amplifier,
(e) read the next number, etc.
Doing this thousands of times a second produces waveforms at the speakers.
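Steps (a) to (e) can be sketched in a few lines of Python. This is only an illustration of the idea, not code for any particular board: the names sample_to_voltage, play and write_dac are made up for this example, and the 8-bit width, 3.3 V reference and 8000 samples per second are assumptions. On real hardware the pacing would normally come from a timer rather than a sleep call.

```python
import time


def sample_to_voltage(sample, bits=8, v_ref=3.3):
    """Ideal DAC transfer: map an unsigned code (0 .. 2**bits - 1) to a voltage.

    bits and v_ref are assumed values; check the data sheet of the real DAC.
    """
    max_code = (1 << bits) - 1
    if not 0 <= sample <= max_code:
        raise ValueError("sample out of range for a %d-bit DAC" % bits)
    return v_ref * sample / max_code


def play(samples, write_dac, sample_rate=8000, sleep=time.sleep):
    """Send each number to the DAC, then wait one sample period.

    write_dac is a hypothetical callback standing in for the hardware access.
    """
    period = 1.0 / sample_rate
    for sample in samples:      # (a) read one number at a time
        write_dac(sample)       # (b) feed it to the DAC
        sleep(period)           # (e) wait, then move on to the next number


if __name__ == "__main__":
    # Stand-in for real hardware: just print the voltage the DAC would output.
    play([0, 64, 128, 255],
         lambda s: print("%.2f V" % sample_to_voltage(s)))
```

A low number gives a voltage near zero and the highest number gives a voltage near the reference, which is step (c). The amplifier in step (d) is outside the software.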
I've stored the desired tones in wav format
You can read the data and send the numbers to a DAC. I understand that many microcontrollers contain a DAC, so you probably only need the software portion. If you cannot find an application that handles this, you will need to learn how the WAV format works; it ought to be possible to write your own program. There may be message boards with helpful details for doing this.
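To give an idea of what reading the data involves, here is a rough sketch on a PC using Python's standard wave module, which handles uncompressed PCM files only. The function names read_wav_samples and to_dac_code are invented for this example, and the 8-bit DAC width is an assumption. It pulls the numbers out of the file and scales each one to a DAC code.

```python
import struct
import wave


def read_wav_samples(path):
    """Return (sample_rate, sample_width_bytes, samples) for the first channel."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        width = wav.getsampwidth()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    if width == 1:
        # 8-bit PCM is stored unsigned: 0 .. 255, silence at 128.
        values = list(raw)
    elif width == 2:
        # 16-bit PCM is stored signed, little-endian: -32768 .. 32767.
        count = len(raw) // 2
        values = list(struct.unpack("<%dh" % count, raw[:count * 2]))
    else:
        raise ValueError("only 8-bit and 16-bit PCM are handled in this sketch")

    # Channels are interleaved; keep every n-th value to get the first one.
    return rate, width, values[::channels]


def to_dac_code(sample, width, dac_bits=8):
    """Convert one sample to an unsigned code for a DAC of dac_bits (assumed 8)."""
    if width == 2:
        sample += 32768          # shift signed 16-bit into 0 .. 65535
    shift = 8 * width - dac_bits
    return sample >> shift if shift >= 0 else sample << -shift
```

The sample rate returned by read_wav_samples tells you how many numbers per second to send, and to_dac_code gives the number to write for each one.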