Audio Compression using STFT

arplon · Dec 4, 2011

Hello all,

I had some questions regarding audio compression (bit reduction) using the Short Time Fourier Transform.

I am trying to implement this in MATLAB.
For the STFT, my method has been as follows:
1. Window the desired signal by multiplying it with a rectangular window of the desired length (256 points, for example)
2. Perform an FFT on each of these short-time sections
3. Look at the magnitude spectrum of each of these short-time sections and eliminate (set to 0) any components that are below a certain threshold
4. Perform an IFFT on each of these short time sections
5. Reconstruct the signal by stitching together each of the windowed sections
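
The five steps above can be sketched roughly as follows. This is a minimal illustration in Python rather than MATLAB, using a naive O(N^2) DFT so that it needs no libraries. The function names (`dft`, `idft`, `threshold_frames`) and the choice to drop a trailing partial frame are assumptions made for the example, not part of the procedure described above.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a list of real samples."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT; returns the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def threshold_frames(signal, frame_len, threshold):
    """Steps 1-5: rectangular framing, DFT, zero small bins, inverse DFT, stitch.

    Assumption: samples after the last full frame are dropped.
    Returns the reconstructed signal and the number of nonzero bins kept.
    """
    output = []
    kept = 0
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]            # step 1
        spectrum = dft(frame)                              # step 2
        pruned = [c if abs(c) >= threshold else 0j         # step 3
                  for c in spectrum]
        kept += sum(1 for c in pruned if c != 0)
        output.extend(idft(pruned))                        # steps 4 and 5
    return output, kept

# Two 32-sample frames: a strong tone in bin 4 plus a weak tone in bin 7.
signal = [math.sin(2 * math.pi * 4 * t / 32) + 0.01 * math.sin(2 * math.pi * 7 * t / 32)
          for t in range(64)]
reconstructed, kept = threshold_frames(signal, frame_len=32, threshold=1.0)
```

Note that the reconstructed signal has exactly as many samples as were processed, so writing it out at the same bit depth gives a file of the same size. Any saving would have to come from storing only the `kept` nonzero coefficients (and their bin positions) instead of the reconstructed samples.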

When I reconstruct the signal, depending on the window length I use and the thresholding I perform, I can see a clear degradation in the clarity of the signal. By varying these two factors, I can get to the point where the audio is no longer intelligible. However, when I write the result back into a .wav file to compare it with the original sound file, the two are identical in file size. I understand that I am writing the new "compressed" version of the signal with the same number of bits, but then I don't understand how this can be used as a compression scheme.

I was wondering if someone had any intuition as to whether I am going about this in the correct manner. Am I missing the fundamental concept of compression using STFTs, or is there simply a consideration in MATLAB that is preventing me from realizing the results?

Any help would be greatly appreciated!

cmontoya · Dec 5, 2011

The size of the WAV file has nothing to do with the content of your audio stream. A WAV file stores a fixed number of samples at a fixed number of bits per sample. I think you are limited to a few settings (8, 16, 24, or 32 bits per sample) at a fixed sample rate. See the Wikipedia article on WAV. In your case, all that dictates the size is the length of your audio stream.
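
As a rough check of this point, the sketch below writes two very different signals of the same length to in-memory 16-bit mono WAV files using Python's standard `wave` module and compares the sizes. The helper name `wav_size_bytes` and the 8000 Hz sample rate are assumptions chosen for the example.

```python
import io
import random
import struct
import wave

def wav_size_bytes(samples, sample_rate=8000):
    """Write 16-bit mono PCM to an in-memory WAV file and return its size in bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)          # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return len(buffer.getvalue())

random.seed(0)
silence = [0] * 8000                                        # one second of zeros
noise = [random.randint(-32768, 32767) for _ in range(8000)]

size_silence = wav_size_bytes(silence)
size_noise = wav_size_bytes(noise)
```

Both sizes come out as the header plus two bytes per sample, regardless of what the samples contain. This is why zeroing spectral components and then writing the reconstructed samples back out does not shrink the file.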

arplon · Dec 5, 2011

I was suspecting that the size of the file was not related to the actual content of the data.
In that case, is the process I am using to compress the data the correct method? And in order to find the relative compression should I be comparing the energy of the original signal to my compressed signal?

cmontoya · Dec 5, 2011

As to your first question, I have no idea.
For your second question, I think that would be correct. The energy should be lower in the "compressed" signal, but not by much, since you're throwing out the weakest components.
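
A minimal sketch of such a comparison is given below. The helper names (`energy`, `snr_db`, `kept_fraction`) are hypothetical. One caveat: an energy or signal-to-noise figure describes how much distortion was introduced, not how many bits were saved, so the fraction of coefficients kept is included as a crude stand-in for the size side of the comparison.

```python
import math

def energy(signal):
    """Sum of squared sample values."""
    return sum(x * x for x in signal)

def snr_db(original, processed):
    """Signal-to-noise ratio in dB, treating (original - processed) as the noise."""
    noise = energy([a - b for a, b in zip(original, processed)])
    if noise == 0:
        return math.inf
    return 10 * math.log10(energy(original) / noise)

def kept_fraction(coefficients):
    """Fraction of spectral coefficients that survived thresholding."""
    return sum(1 for c in coefficients if c != 0) / len(coefficients)

original = [1.0, 2.0, 3.0, 4.0]
processed = [1.0, 2.0, 3.0, 0.0]
quality = snr_db(original, processed)
```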

I am sure there is a DSP expert here who can help you better than I can. But if all you're trying to do is lower the file size, just drop some of the least significant bits (LSBs) of the signal. There are also logarithmic systems that can represent larger dynamic ranges, at the cost of greater quantization error.
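
Both ideas can be sketched in a few lines. The code below is an illustration only: `drop_lsbs` zeroes the low bits of an integer sample, and the mu-law pair uses the continuous companding formula with mu = 255 quantized to 8 bits, which is a simplification and not the bit-exact G.711 encoding. All function names are assumptions for the example.

```python
import math

MU = 255.0  # assumed companding constant, as in common 8-bit mu-law systems

def drop_lsbs(sample, bits_dropped):
    """Zero the lowest bits of an integer sample, keeping only the upper bits."""
    return (sample >> bits_dropped) << bits_dropped

def mu_law_encode(x):
    """Map x in [-1, 1] to an 8-bit code (0..255) on a logarithmic scale."""
    compressed = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((compressed + 1.0) / 2.0 * 255.0))

def mu_law_decode(code):
    """Approximate inverse of mu_law_encode; returns a value in [-1, 1]."""
    compressed = code / 255.0 * 2.0 - 1.0
    return math.copysign(math.expm1(abs(compressed) * math.log1p(MU)) / MU,
                         compressed)

coarse = drop_lsbs(0x1234, 8)                    # keeps only the top byte
quiet = mu_law_decode(mu_law_encode(0.01))       # small amplitude round trip
loud = mu_law_decode(mu_law_encode(0.5))         # large amplitude round trip
```

With this kind of companding, the round-trip error is smaller in absolute terms for quiet samples than for loud ones, which is the trade-off a logarithmic system makes in order to cover a wide dynamic range with few bits.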

You may also want to read a description of the MP3 format. "Compression" is used to mean several different things, and reading a description of the MP3 format should give you a better understanding of what you're after. You don't need to be able to read an MP3 file by looking at it; just get an idea of how it works. I think you will find that helpful in understanding this problem.

Welcome to EDAboard.com
