Hey, I have already solved the problem with modulation entropy.
For anyone interested (for further reading):
Entropy Based Voice Activity Detection in Very Noisy Conditions, by Philippe Renevey and Andrzej Drygajlo.
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.123.6098&rep=rep1&type=pdf
My input signals are wave files (*.wav), which I can already read and run an FFT on, so that part is not a problem. What I can't quite understand is how to use the MFCC coefficients once I get them from the algorithm (e.g. computed from the FFT data mentioned above).
A speech signal has a characteristic modulation peak at around 4 Hz, and I have to check for it. I could look for this modulation energy by passing the data through an FFT and a band-pass filter centered around 4 Hz. But this is academic work, so I'm required to somehow connect it with the use of MFCC coefficients.
Steps I think I need to follow:
1. Get the MFCC coefficients as amplitudes of the signal's spectrum and keep only the first two or three coefficients (this should cover the frequencies I need),
2. Apply the FFT (yes, again, after the DCT),
3. Check for signal energy near the desired frequency bins.
But I need to be sure that this is the right approach.
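To make the question concrete, here is a minimal sketch of steps 2 and 3 as I currently picture them. It is not a verified implementation. It assumes the MFCCs are already computed as a matrix with one row per frame, and that the second FFT is taken along the time axis (over the frame-by-frame trajectory of each kept coefficient), not across the coefficients inside one frame. The function name `modulation_energy_ratio`, the 2 to 6 Hz band, and the frame rate of 100 frames per second are my own assumptions, not something taken from the paper.

```python
import numpy as np

def modulation_energy_ratio(mfcc, frame_rate, center_hz=4.0, half_width_hz=2.0):
    """Fraction of modulation energy that falls near center_hz.

    mfcc       : array of shape (n_frames, n_coeffs), e.g. the first 2-3 MFCCs
    frame_rate : number of MFCC frames per second (assumed, e.g. 100 for a 10 ms hop)
    """
    mfcc = np.asarray(mfcc, dtype=float)
    # Remove the mean of each coefficient trajectory so the DC bin does not dominate.
    traj = mfcc - mfcc.mean(axis=0, keepdims=True)
    # Taper along time to reduce spectral leakage.
    window = np.hanning(traj.shape[0])[:, None]
    # Second FFT: along the time axis, one spectrum per coefficient.
    power = np.abs(np.fft.rfft(traj * window, axis=0)) ** 2
    freqs = np.fft.rfftfreq(traj.shape[0], d=1.0 / frame_rate)
    # Modulation-frequency bins around the 4 Hz peak.
    band = (freqs >= center_hz - half_width_hz) & (freqs <= center_hz + half_width_hz)
    total = power[1:].sum()  # everything except the DC bin
    if total == 0.0:
        return 0.0
    return float(power[band].sum() / total)

# Synthetic check: 3 s of frames at 100 frames per second.
frame_rate = 100
t = np.arange(300) / frame_rate
speech_like = np.column_stack([np.sin(2 * np.pi * 4 * t), 0.5 * np.cos(2 * np.pi * 4 * t)])
not_speech_like = np.column_stack([np.sin(2 * np.pi * 20 * t), 0.5 * np.cos(2 * np.pi * 20 * t)])
print(modulation_energy_ratio(speech_like, frame_rate))
print(modulation_energy_ratio(not_speech_like, frame_rate))
```

The ratio should come out high for trajectories that fluctuate at about 4 Hz and low otherwise. Note that the frequency resolution of this second FFT is frame_rate / n_frames, so the analysis window has to span at least a second or two of frames to resolve a 4 Hz peak at all.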