I think you can let the press of the doorbell button start a timer (5 seconds or so). Once it counts down, a speech IC can ask the visitor to say the message, and the message can be recorded during that time. You could use a microcontroller, I think, but it's not necessary. If you want the three-rings thing, each press, apart from sounding the bell, can decrement a counter (starting at 3). After the third press the counter reaches zero, and a speech IC can be set to ask the visitor to leave a message.
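If you do go the microcontroller route, the counting logic could look roughly like the sketch below. This is only a simulation in Python to show the idea, not real firmware: the class name `DoorbellController`, the parameters `rings_needed` and `delay_s`, and the logged step names are all made up for illustration, and the delay is only written to a log rather than actually waited out.

```python
class DoorbellController:
    """Simulated doorbell: after N presses, wait, prompt, then record."""

    def __init__(self, rings_needed=3, delay_s=5):
        self.rings_needed = rings_needed  # assumed: 3 rings before the prompt
        self.delay_s = delay_s            # assumed: 5 second countdown
        self.remaining = rings_needed     # the down-counter
        self.log = []                     # stands in for real hardware actions

    def press(self):
        # Every press sounds the bell and decrements the counter.
        self.log.append("ring")
        self.remaining -= 1
        if self.remaining == 0:
            # Counter hit zero: reload it and start the message sequence.
            self.remaining = self.rings_needed
            self._start_message_sequence()

    def _start_message_sequence(self):
        # On real hardware these would be a timer, a speech IC and a recorder.
        self.log.append(f"wait {self.delay_s}s")
        self.log.append("play prompt")
        self.log.append("record message")


bell = DoorbellController()
for _ in range(3):
    bell.press()
print(bell.log)
```

With `rings_needed=1` the same sketch covers the simpler case where a single ring starts the timer.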
I don't know how helpful this was, or whether it sounded too basic. My knowledge is quite basic too, actually; maybe someone here can explain the details of the sound recording device.
"but i worry how can it mix with doorbell and others function"
That's not a problem, actually. The doorbell switch just needs to be wired so that ringing the bell also triggers the other functions at the same moment. You can do this by having a single switch contact drive two circuits simultaneously, I guess.
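In software terms (again assuming a microcontroller, which is optional), "one switch, two circuits" just means one button event is passed to two handlers. A tiny Python sketch of that idea follows; the `Button` class and the handler names are hypothetical, not from any real library.

```python
class Button:
    """One button whose press is fanned out to every connected handler."""

    def __init__(self):
        self._handlers = []

    def connect(self, handler):
        # Register another "circuit" to be triggered by the same press.
        self._handlers.append(handler)

    def press(self):
        # A single press triggers all connected handlers in order.
        for handler in self._handlers:
            handler()


events = []
button = Button()
button.connect(lambda: events.append("sound doorbell"))
button.connect(lambda: events.append("start message timer"))
button.press()
print(events)
```

The doorbell does not need to know anything about the recorder; both simply listen to the same press.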
BTW, if I've made any mistakes in my assumptions or explanation, I hope those who know this better will correct me and explain further. All the best for your project.