speaker and mic
At minimum you need a preamp and an ADC for the microphone, and a DAC and an audio amp for the loudspeaker output.
There are microcontrollers which have built-in ADCs and PWM outputs. Voice processing requires a significant amount of processing power and buffer memory, so you will need a high-end microcontroller for this, particularly if you are going to use a speech codec or do echo cancellation.
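To get a rough feel for the memory and processing load, here is a back-of-envelope sketch. All the numbers in it are assumptions chosen for illustration, not requirements from your design: 8 kHz sampling, 16-bit samples, 20 ms frames, and a 128 ms echo tail handled by an NLMS adaptive filter, which costs roughly two multiply-accumulates per tap per sample.

```python
# Rough sizing for a voice path. Every constant below is an assumed,
# illustrative value; substitute the figures for your own design.

def frame_bytes(sample_rate_hz, frame_ms, bytes_per_sample):
    """Bytes needed to hold one audio frame."""
    samples = sample_rate_hz * frame_ms // 1000
    return samples * bytes_per_sample

def nlms_macs_per_second(sample_rate_hz, tail_ms):
    """Approximate multiply-accumulates per second for an NLMS echo
    canceller: one pass to filter plus one pass to update the taps."""
    taps = sample_rate_hz * tail_ms // 1000
    return 2 * taps * sample_rate_hz

SAMPLE_RATE_HZ = 8000    # assumed telephone-quality sampling rate
BYTES_PER_SAMPLE = 2     # assumed 16-bit samples
FRAME_MS = 20            # assumed frame length
ECHO_TAIL_MS = 128       # assumed echo tail to cancel

one_frame = frame_bytes(SAMPLE_RATE_HZ, FRAME_MS, BYTES_PER_SAMPLE)
# Double-buffer both directions (mic in, speaker out): 4 frames total.
buffers = 4 * one_frame
macs = nlms_macs_per_second(SAMPLE_RATE_HZ, ECHO_TAIL_MS)

print("bytes per frame:", one_frame)
print("bytes for double-buffered I/O:", buffers)
print("NLMS MAC/s:", macs)
```

Under these assumptions the raw audio buffers are small, but the echo canceller alone needs on the order of tens of millions of multiply-accumulates per second, before any codec work is counted. That is why a small 8-bit part is unlikely to be enough.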
You might find that some analog processing before the ADC, such as filtering (for example an anti-aliasing low-pass) and AGC, will make the device perform better.
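If AGC is unfamiliar, the sketch below shows the idea in software: track the signal envelope and scale the gain so that quiet and loud talkers come out at about the same level. This is only an illustration of the behaviour, not the analog circuit itself; the function name, the target level, the attack and release coefficients, and the gain cap are all made-up values for the example.

```python
# Minimal feed-forward AGC sketch. Samples are floats in the range -1..1.
# All parameter values are illustrative assumptions, not tuned settings.

def agc(samples, target=0.25, attack=0.1, release=0.001, max_gain=20.0):
    envelope = 0.0
    out = []
    for x in samples:
        level = abs(x)
        # Follow rising levels quickly (attack), falling levels slowly (release).
        coeff = attack if level > envelope else release
        envelope += coeff * (level - envelope)
        # Gain that would bring the envelope to the target, capped so that
        # silence and background noise are not amplified without limit.
        if envelope > 1e-9:
            gain = min(max_gain, target / envelope)
        else:
            gain = max_gain
        out.append(x * gain)
    return out

quiet = agc([0.05] * 200)   # a steady quiet input
loud = agc([0.8] * 200)     # a steady loud input
print(round(quiet[-1], 3), round(loud[-1], 3))
```

Once the envelope has settled, both inputs end up near the target level. Doing this ahead of the ADC in analog has the advantage that quiet signals use more of the converter's range.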