arplon
Hello all,
I have some questions about audio compression (bit reduction) using the short-time Fourier transform (STFT).
I am trying to implement this in MATLAB.
For the STFT, my method has been as follows:
1. Window the signal by multiplying it with a rectangular window of the desired length (256 points, for example), giving consecutive non-overlapping sections
2. Perform an FFT on each of these short-time sections
3. Look at the magnitude spectrum of each short-time section and eliminate (set to 0) any components that fall below a certain threshold
4. Perform an IFFT on each of these short-time sections
5. Reconstruct the signal by stitching the processed sections back together in order
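To make the steps above concrete, here is a minimal sketch of what I am doing. It is written in Python with only the standard library rather than MATLAB, it uses a plain DFT instead of a real FFT to stay short, and the names `dft`, `idft`, and `threshold_frames` as well as the test signal and threshold value are made up for this illustration.

```python
import cmath
import math


def dft(x):
    """Plain O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Inverse of dft(), including the 1/N scaling."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def threshold_frames(signal, frame_len, threshold):
    """Apply steps 1-5 and return (reconstructed signal, number of bins kept)."""
    out = []
    kept = 0
    for start in range(0, len(signal), frame_len):
        # Step 1: rectangular window = take the next block, no overlap.
        frame = signal[start:start + frame_len]
        # Step 2: transform the block.
        spectrum = dft(frame)
        # Step 3: zero every bin whose magnitude is below the threshold.
        spectrum = [c if abs(c) >= threshold else 0 for c in spectrum]
        kept += sum(1 for c in spectrum if c != 0)
        # Steps 4 and 5: inverse transform and append to the output.
        out.extend(v.real for v in idft(spectrum))
    return out, kept


# Hypothetical test signal: a strong 1 kHz tone plus a weak 3 kHz tone at 8 kHz.
fs = 8000
signal = [math.sin(2 * math.pi * 1000 * t / fs)
          + 0.01 * math.sin(2 * math.pi * 3000 * t / fs)
          for t in range(64)]

recon, kept = threshold_frames(signal, frame_len=16, threshold=1.0)
print(len(signal), len(recon), kept)
```

The reconstructed signal has exactly as many samples as the input, while `kept` counts how many frequency bins survived the thresholding.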
When I reconstruct the signal, I can see a clear degradation in clarity that depends on the window length and the threshold I use. By varying these two factors, I can get to the point where the audio is no longer intelligible. However, when I write the result back to a .wav file to compare it with the original sound file, the two are identical in file size. I understand that I am writing the new "compressed" version of the signal with the same number of bits, but then I don't understand how this can be used as a compression scheme.
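The file-size observation can be reproduced without any transform at all. The sketch below, again in standard-library Python rather than MATLAB, writes two signals of equal length as 16-bit mono PCM and compares the resulting sizes; `wav_size` is a helper name invented for this example, and the 8 kHz rate and 16-bit depth are assumptions.

```python
import io
import math
import wave


def wav_size(samples, sample_rate=8000):
    """Return the size in bytes of a 16-bit mono PCM WAV holding `samples`."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(sample_rate)
        # Clip to [-1, 1] and convert each sample to a signed 16-bit integer.
        frames = b"".join(
            int(max(-1.0, min(1.0, s)) * 32767).to_bytes(2, "little", signed=True)
            for s in samples
        )
        w.writeframes(frames)
    return len(buf.getvalue())


n = 64
original = [0.5 * math.sin(2 * math.pi * 1000 * t / 8000) for t in range(n)]
degraded = [0.0] * n  # extreme case: every component thresholded away

print(wav_size(original), wav_size(degraded))
```

Both calls return the same size, because a PCM .wav stores every time-domain sample at a fixed bit depth regardless of its value. As far as I can tell, any saving would have to come from storing the surviving coefficients in some other form, which this sketch does not attempt.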
Does anyone have any intuition as to whether I am going about this in the correct manner? Am I missing the fundamental concept of compression using STFTs, or is there simply some consideration in MATLAB that is preventing me from seeing the expected results?
Any help would be greatly appreciated!