Sound to text
You can study some good books on this topic. One of them, I think, is "Digital Speech Processing, Synthesis, and Recognition" by Furui. Another valuable source is by Deller; I think the title is "Discrete-Time Processing of Speech Signals" (IEEE Press).
I'm wondering whether you are asking for a speech recognition system. What do you mean by sound-to-text conversion? Why didn't you write speech-to-text conversion?
There are also systems that can describe a person's emotional state in words. By analysing how a person is speaking, you can tell whether he/she is sad or happy, whether he/she is crying, and so on. Systems of this kind are suitable for applications such as automatic baby monitoring. Tasks like these can be done nicely using wavelets.
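To give a feel for how wavelets turn a sound into numbers a classifier could use, here is a minimal sketch in plain Python. It assumes the simplest wavelet (Haar) and uses sub-band energies as features; the function names are made up for illustration, and a real emotion or cry detector would need a better wavelet, framing, and a trained classifier on top.

```python
import math

def haar_step(signal):
    """One level of the Haar wavelet transform: returns (approximation, detail)."""
    s = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]  # smooth part
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]  # rapid changes
    return approx, detail

def band_energies(signal, levels=3):
    """Energy of the detail coefficients at each level, then of the final approximation."""
    energies = []
    current = list(signal)
    for _ in range(levels):
        if len(current) < 2:
            break
        current, detail = haar_step(current)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in current))
    return energies

# Hypothetical usage: 'frame' stands in for a short block of audio samples.
frame = [1, 2, 3, 4, 5, 6, 7, 8]
features = band_energies(frame)
```

The idea is that different kinds of sound put their energy in different bands, so the list of energies is a compact description of the frame.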
Consider HMMs: Hidden Markov Models are the basis of almost all speech recognition systems. HTK is a well-known toolkit for working with HMMs. It has its own site, so have a look at it. By studying the HTK documentation you will understand how a good speech recognition system works, and you will see the role of the grammar, the words, and so on.
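To make the HMM idea concrete, here is a toy sketch of Viterbi decoding, the standard way to find the most likely hidden-state sequence for a series of observations. This is not HTK code: the two states, the "low"/"high" energy observations, and all the probabilities are invented for illustration. A real recognizer uses phone-level states and acoustic feature vectors.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden-state sequence."""
    # best[t][s] = (probability of the best path ending in state s at time t,
    #               the state that path was in at time t - 1)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)

    # Start from the best final state and follow the stored predecessors back.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for t in range(len(best) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return prob, path

# Toy model (all numbers are assumptions made up for this example).
states = ("silence", "speech")
start_p = {"silence": 0.8, "speech": 0.2}
trans_p = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.2, "speech": 0.8},
}
emit_p = {
    "silence": {"low": 0.9, "high": 0.1},
    "speech": {"low": 0.2, "high": 0.8},
}

prob, path = viterbi(["low", "high", "high", "low"], states, start_p, trans_p, emit_p)
```

In a speech recognizer the same search runs over a much larger network of states built from the grammar and the pronunciation dictionary, which is where those pieces come in.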
I think that is enough for now, although there is much more to say.
Take care: if you are going to build a speech recognizer for your own language, the hardest problem to solve is obtaining a phonetically balanced dictionary of spoken words.
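As a rough starting point for that word list, here is a small sketch that greedily picks words until every phoneme is covered at least once. Note the simplification: covering every phoneme is weaker than being truly phonetically balanced, which also means matching how often each phoneme occurs in the language. The tiny lexicon and its phoneme symbols are hypothetical; you would substitute a pronunciation lexicon for your own language.

```python
def greedy_cover(lexicon):
    """Pick words until every phoneme in the lexicon is covered at least once."""
    needed = {p for phones in lexicon.values() for p in phones}
    chosen = []
    while needed:
        # Take the word covering the most still-missing phonemes;
        # sorting first makes ties resolve alphabetically.
        word = max(sorted(lexicon), key=lambda w: len(needed & set(lexicon[w])))
        chosen.append(word)
        needed -= set(lexicon[word])
    return chosen

# Hypothetical mini-lexicon: word -> list of phoneme symbols.
lexicon = {
    "cat": ["k", "ae", "t"],
    "at": ["ae", "t"],
    "dog": ["d", "ao", "g"],
    "tag": ["t", "ae", "g"],
}
words = greedy_cover(lexicon)
```

Recording the chosen words from several speakers then gives at least one spoken example of every phoneme to train on.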