A plain frame buffer memory, as you mentioned, is probably the simplest solution, but it has various image quality problems. Many cheap converter boxes work that way. Is that what you meant by "not very good solution"? There are expensive broadcast-quality products that use much more sophisticated image processing, such as those from Snell & Wilcox, although I'm not sure whether their products have VGA format output.
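To give a rough idea of one such quality problem, here is a minimal Python sketch of what a bare frame buffer does when the input and output frame rates differ. The 50 Hz and 60 Hz rates and the function names are my own assumptions for illustration, not something from your design: each output frame simply shows the most recent input frame that has already started, so frames get repeated (or dropped in the other direction), which shows up as motion judder.

```python
def source_frame_for_output(n_out, in_rate=50, out_rate=60):
    """Index of the latest input frame that has started by the time
    output frame n_out begins (assumed rates in Hz, integer math)."""
    return (n_out * in_rate) // out_rate


def repeat_pattern(count, in_rate=50, out_rate=60):
    """Input frame index shown by each of the first `count` output frames."""
    return [source_frame_for_output(n, in_rate, out_rate) for n in range(count)]


if __name__ == "__main__":
    # Slower input than output: some input frames are shown twice.
    print(repeat_pattern(12))
    # Faster input than output: some input frames are never shown.
    print(repeat_pattern(6, in_rate=60, out_rate=50))
```

With these assumed rates, one input frame in every five is shown twice going up to 60 Hz, and one in every six is skipped going down to 50 Hz. As far as I understand it, the broadcast-grade converters interpolate between frames rather than repeating or dropping them.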
Here are some technical papers. Try "The Engineer's Guide to Standards Conversion" for a general overview, though it has no FPGA specifics:
**broken link removed**