Mel Filter Bank Processing

Status
Not open for further replies.

Phillip+

Hello,

Sorry for my ignorance; I am trying to learn this subject for a final project I am undertaking.

Brief background:

I am developing a Speech Recognition algorithm that identifies whether someone is saying a particular word, in this case "Yes" or "No".


I am computing MFCCs (following this paper: https://arxiv.org/pdf/1003.4083.pdf), and so far I have implemented:

  • Pre-emphasis
  • Framing
  • Hamming Windowing
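For reference, the three steps listed above can be sketched roughly as follows. This is only a minimal illustration in plain Python, not the paper's reference code: the function names are made up for this example, and the values alpha = 0.95, a 256-sample frame and a 100-sample hop are assumed typical defaults that should be checked against the paper.

```python
import math

def pre_emphasis(signal, alpha=0.95):
    """y[n] = x[n] - alpha * x[n-1]; the first sample is passed through unchanged."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def frame_signal(signal, frame_len, hop):
    """Split into overlapping frames; trailing samples that do not fill a frame are dropped."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]

def hamming(n_points):
    """w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1)), for n = 0..N-1."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]

def windowed_frames(signal, frame_len=256, hop=100, alpha=0.95):
    """Pre-emphasize, frame, then multiply each frame by a Hamming window."""
    window = hamming(frame_len)
    emphasized = pre_emphasis(signal, alpha)
    return [[x * w for x, w in zip(frame, window)]
            for frame in frame_signal(emphasized, frame_len, hop)]
```

Each list returned by `windowed_frames` is one time-domain "window" ready for the FFT step.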

The part I am struggling with is "Step 4". If I take the FFT of each time-domain "window" and multiply it by the Mel filters' frequency response, would that be enough?
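To make the question concrete: in the usual MFCC recipe, the FFT of each windowed frame is reduced to a power (or magnitude) spectrum, each triangular mel filter weights those bins, and the weighted bins are summed to give one energy per filter; a log and a DCT normally follow. The sketch below shows only the spectrum and filter-bank part. It is an illustration under assumptions, not necessarily the paper's exact procedure: the function names, the filter count, the sampling rate in the test values, and the use of the common 2595 * log10(1 + f/700) mel mapping are all choices made for this example, and the naive DFT stands in for a real FFT routine.

```python
import cmath
import math

def hz_to_mel(f_hz):
    """Common mel mapping (assumed here): 2595 * log10(1 + f/700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive O(N^2) DFT for clarity; keeps bins 0..N/2 of a real-valued frame."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spec.append(abs(acc) ** 2 / n)
    return spec

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters whose edges are equally spaced on the mel scale, 0..fs/2."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(fs / 2.0)
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for k in range(n_bins):
            f = k * fs / n_fft  # frequency of FFT bin k in Hz
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def log_mel_energies(frame, bank, floor=1e-10):
    """Weight the power spectrum by each filter, sum per filter, take the log."""
    spec = power_spectrum(frame)
    return [math.log(max(sum(w * p for w, p in zip(row, spec)), floor))
            for row in bank]
```

So the multiplication happens in the frequency domain, bin by bin, and each filter contributes a single number per frame rather than a filtered signal.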

I also have a problem with this equation:

[Attached image: mel.gif]

For example, what does F represent? Does it represent the FFT of the "Window" or the "Window" in the time-domain?


I hope someone can help. Sorry for my lack of understanding; I am still learning. :)
 

Hi, please see the following link to MATLAB code:

http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html

In particular, the function "melcepst.m" implements a mel-cepstrum (MFCC) front end, used as a feature extractor for a speech decoder.


F in your equation is frequency. It covers the range from 0 to Fs in steps of Fs/N, where Fs is the sampling frequency and N is the window length.
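As a small illustration of that frequency axis: bin k of an N-point FFT corresponds to k * Fs / N Hz, and each such frequency can then be mapped to the mel scale. The attached equation image is not reproduced in the thread, so the sketch below assumes it is the common mapping mel(f) = 2595 * log10(1 + f/700); the function names and the example values (N = 256, Fs = 8000 Hz) are likewise only assumptions for this example.

```python
import math

def bin_frequencies(n_fft, fs):
    """Frequency in Hz of each FFT bin: 0, fs/N, 2*fs/N, ..., (N-1)*fs/N."""
    return [k * fs / n_fft for k in range(n_fft)]

def hz_to_mel(f_hz):
    """Assumed mel mapping: 2595 * log10(1 + f/700), with f in Hz."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

freqs = bin_frequencies(256, 8000.0)   # bin spacing is 8000 / 256 = 31.25 Hz
mels = [hz_to_mel(f) for f in freqs[:129]]  # bins 0..N/2 only
```

For a real-valued frame the bins above Fs/2 mirror the ones below it, so only bins 0 to N/2 are normally passed on to the mel filters.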
 
