kostbill
Hello.
I want to implement a pitch tracking algorithm for voice. Before that, I am implementing the endpoint detection from Rabiner (1981).
In this paper, the energy contour of the audio data looks something like the image in the attached file. Rabiner finds the beginning and end of the word from the energy contour.
Since I only want to find the pitch, do I need to know the start and end of the word? I know for sure that one of these energy pulses (the one with the highest energy) is part of the spoken word. Isn't it enough to calculate the pitch only within that energy pulse?
Thanks,
Bill.
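
For illustration, the idea in the question (skip full endpoint detection and estimate the pitch only inside the highest-energy pulse) could be sketched roughly as below. This is a minimal sketch under stated assumptions, not Rabiner's endpoint detector: the function names, the 10% energy threshold, and the 60-400 Hz search range are all made up for this example, and autocorrelation is only one of several common pitch estimators.

```python
import numpy as np

def short_time_energy(x, frame_len, hop):
    """Energy contour: sum of squared samples in each frame."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.array([np.sum(x[i * hop : i * hop + frame_len] ** 2)
                     for i in range(n_frames)])

def highest_energy_pulse(energy, ratio=0.1):
    """Return (start, end) frame indices of the contiguous run around the
    energy peak that stays above ratio * peak. `ratio` is an assumed value."""
    peak = int(np.argmax(energy))
    thresh = ratio * energy[peak]
    start = peak
    while start > 0 and energy[start - 1] >= thresh:
        start -= 1
    end = peak
    while end < len(energy) - 1 and energy[end + 1] >= thresh:
        end += 1
    return start, end

def autocorr_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Pitch estimate in Hz from the strongest autocorrelation peak
    whose lag corresponds to a frequency between fmin and fmax (assumed range)."""
    frame = frame - np.mean(frame)
    # Keep only non-negative lags of the full autocorrelation.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), len(ac) - 1)
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return fs / lag
```

One way to use this is to run `autocorr_pitch` on every frame between `start` and `end` and take the median, since the frames at the edges of the pulse are only partly filled with speech and can give unreliable values. Note that a high-energy pulse is not guaranteed to be voiced everywhere, so some voiced/unvoiced check may still be needed.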