Why the 89S52? Is it because you have one lying around or can buy one easily, or is it to keep the cost low?
If the answer to either of those is yes, then I'm afraid you'd face some roadblocks (to put it mildly).
Firstly, you can forget about playing back an MP3-encoded song. MP3 decoding is not possible on an MCU like the 89S52.
So you'd need an MP3 decoder chip (many are available from companies like TI, Maxim, NXP and also many small Taiwan-based companies), but buying those in retail is going to be more expensive (and harder to find) than your MCU. Even then, they need to be fed a continuous stream of encoded data to generate PCM output, and that needs storage. On top of that you need the audio electronics (well, some of the decoder ICs are highly integrated and can even drive a tiny 2 ohm, 0.25 W speaker).
Alternatively, if you are thinking of raw audio samples, even at the lowest spec level (8 bits/sample, roughly 11 kHz sampling) a full song of, say, 4-5 minutes needs some significant memory. Since you mention "high quality", I think that kills it, as the melody ICs won't do either.
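To put a rough number on "significant memory", here is a quick back-of-the-envelope sketch in Python. The function name `raw_pcm_bytes` is just made up for illustration, 11025 Hz is the common standard rate closest to the figure above, and the 8 KB on-chip flash size is my assumption for the 89S52 (check the datasheet for your exact part).

```python
def raw_pcm_bytes(sample_rate_hz, bits_per_sample, seconds, channels=1):
    """Bytes needed to store uncompressed PCM audio of the given length."""
    return sample_rate_hz * bits_per_sample * channels * seconds // 8


# Assumption: 8 KB of on-chip flash on the 89S52 (verify against the datasheet).
ON_CHIP_FLASH_BYTES = 8 * 1024

# A 4-minute song at the low-end spec: 8 bits/sample, 11025 Hz, mono.
low_spec_song = raw_pcm_bytes(11025, 8, 4 * 60)

# The same song at CD quality: 16 bits/sample, 44100 Hz, stereo.
cd_quality_song = raw_pcm_bytes(44100, 16, 4 * 60, channels=2)

print("low spec   :", low_spec_song, "bytes")
print("CD quality :", cd_quality_song, "bytes")
print("low spec vs on-chip flash: about",
      round(low_spec_song / ON_CHIP_FLASH_BYTES), "times too big")
```

So even the low-spec version of one song is a couple of megabytes, hundreds of times more than the MCU can hold on its own, which is why external storage is unavoidable.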
---------- Post added at 09:38 ---------- Previous post was at 09:33 ----------
@pashok84: It's a fun idea, and quite doable. Just don't say that you want to do it with the 89C52 only
... well, it is doable with the 89C52 if that MCU plays a small "glue" role. There are 89C52s with an RTC (I think), so your timekeeping logic should be simple. Otherwise, use an external RTC IC, and use 12 OTP / reprogrammable melody ICs to store small songs. The quality won't be great. Pretty much like the cheap Chinese toys (what you get at 8 kHz, 8 bits/sample raw audio)... and not high-fidelity audio.