Hi
What we did here in Brazil was record a set of some 100 words from about 16k people across the country, each with their own accent.
Then we built a phonetic translator that recognizes any person talking.
The phonetic lib takes up 1MB.
Then you put plain text in your application, like yes, no, up, down, etc.
When someone speaks, it decodes the basic phonemes from the table and builds a string. If the string is onboard, fine; if not, it forwards the problem to the server, e.g. finding a route in a hands-free GPS system.
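To make the onboard-or-server decision concrete, here is a rough sketch in Python. It is not our actual code: the command set, the `send_to_server` stub and the `handle_utterance` name are all made up for illustration, and it assumes the phonetic lib has already turned the speech into a plain string.

```python
# Hypothetical on-device vocabulary: the plain-text words the app knows.
ONBOARD_COMMANDS = {"yes", "no", "up", "down"}


def send_to_server(text):
    # Stand-in for the real network call (e.g. route finding); assumed, not a real API.
    return f"server:{text}"


def handle_utterance(decoded, commands=ONBOARD_COMMANDS, fallback=send_to_server):
    """Handle a decoded string locally if it is onboard, else forward it."""
    text = decoded.strip().lower()
    if text in commands:
        # Known word: no network round trip needed.
        return ("local", text)
    # Unknown string: hand the problem to the server.
    return ("remote", fallback(text))
```

So `handle_utterance("Yes")` stays on the device, while something like `handle_utterance("route to Sao Paulo")` goes through the fallback. Keeping the onboard list as plain text is what lets the app side stay tiny.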
Works fine and really uses a small memory footprint.