## Hypothetically, how an FFT-based equalizer can be programmed.

One of the things I only recently posted about was that I had activated an equalizer function, which was once integrated into the PulseAudio sound server, but which may be installed from additional packages in more recent versions of Debian Linux. As I wrote, activating this under Debian 8 / Jessie was a bit problematic at first, but could ultimately be accomplished. The following is what the controls of this equalizer look like on the screen:

And, this is what the newly-created ‘sink’ is named, within the (old) KDE-4 desktop manager’s Settings Panel:

What struck me as remarkable about this was its naming, as an “FFT-based Equalizer…”. I had written an earlier posting about how the Fast Fourier Transform differs from the Discrete Fourier Transform. And, because I tend to think first about how convolutions may be computed using a Discrete Cosine Transform, it took me a bit of thought to comprehend how an equalizer function could be implemented, based on the FFT.

BTW, that earlier posting, which I linked to above, has as a major flaw a guess on my part about how MP3 sound compression works, and that guess makes a false assumption. I have made more recent postings on how sound-compression schemes work, which no longer make the same false assumption. But otherwise, that old posting still explains what the difference between the FFT and the other Discrete Transforms is.

So, the question which may go through some readers’ minds, like mine, would be how a graphic equalizer based on the FFT can be made as computationally efficient as possible. Obviously, when the FFT is only being used to analyze a sampling interval, what results is a (small) number of frequency coefficients, spaced approximately uniformly over a series of octaves. Apparently, each coefficient in such a set of outputs needs to be replaced by a stream that isolates one frequency-component. Each stream then needs to be multiplied by an equalizer setting, before being mixed into the combined equalizer output.

I think that one way to compute that would be to replace the ‘folding’ operation normally used in the Fourier Analysis with a procedure that only computes one or more product-sums of the input signal with reference sine-waves. In each case except for the lowest frequency, this would be done over only a fraction of the entire buffer, a fraction which becomes shorter according to powers of 2.

Thus, it should remain constant that, in order for the equalizer to be able to isolate the frequency of ~31Hz, a sine-product with a buffer of 1408 samples needs to be computed, once per input sample. But beyond that, determining the ~63Hz frequency-component, really only requires that the sine-product be computed, with the most recent 704 samples of the same buffer. Frequency-components that belong to even-higher octaves can all be computed, as per-input-sample sine-products, with the most-recent 352 input-samples, etc. (for multiples of ~125Hz). Eventually, as the frequency-components start to become odd products of an octave, an interval of 176 input samples can be used, for the frequency-components belonging to the same octave, thus yielding the ~500Hz and ~750Hz components… After that, in order to filter out the ~1kHz and the ~1.5kHz components, a section of the buffer only 88 samples long can be used…
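The window lengths above can be checked numerically. What follows is only a minimal sketch under stated assumptions: a sampling rate of 44.1kHz, which the figures above imply but which is not stated, and the made-up names `BAND_PLAN`, `band_frequency` and `band_amplitude`. Each band is measured as a product-sum of the most recent samples with a reference sine and cosine.

```python
import numpy as np

SAMPLE_RATE = 44100  # assumption: the text does not state the sampling rate

# (window length in samples, whole cycles per window), as listed in the text
BAND_PLAN = [(1408, 1), (704, 1), (352, 1), (176, 2), (176, 3), (88, 2), (88, 3)]

def band_frequency(window, cycles, sample_rate=SAMPLE_RATE):
    """Frequency in Hz that completes `cycles` periods within `window` samples."""
    return cycles * sample_rate / window

def band_amplitude(buffer, window, cycles):
    """Amplitude of one component, from the most recent `window` samples only."""
    recent = np.asarray(buffer, dtype=float)[-window:]
    phase = 2.0 * np.pi * cycles * np.arange(window) / window
    re = np.dot(recent, np.cos(phase))  # product-sum with the cosine reference
    im = np.dot(recent, np.sin(phase))  # product-sum with the sine reference
    return 2.0 * np.hypot(re, im) / window

# A unit-amplitude test tone that sits exactly on the ~500Hz band
n = np.arange(1408)
tone = np.sin(2.0 * np.pi * band_frequency(176, 2) * n / SAMPLE_RATE)

for window, cycles in BAND_PLAN:
    freq = band_frequency(window, cycles)
    amp = band_amplitude(tone, window, cycles)
    print(f"{freq:8.1f} Hz  window {window:4d}  amplitude {amp:.3f}")
```

This sketch recomputes every product-sum from scratch. A real-time equalizer would more likely update the sums incrementally, once per input sample.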

Mind you, one alternative to doing all that would be, to apply a convolution of fixed length to the input stream constantly, but to recompute that convolution, by first interpolating frequency-coefficients between the GUI’s slider-positions, and then applying one of the Discrete Cosine Transforms to the resulting set of coefficients. The advantage to using a DCT in this way would be, that the coefficients would only need to be recomputed once, every time the user changes the slider-positions. But then, to name the resulting equalizer an ‘FFT-based’ equalizer, would actually be false.
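The convolution-based alternative can be sketched as follows. This is a hedged illustration and not how the PulseAudio equalizer is known to work. It assumes the type-I DCT, hypothetical slider frequencies, interpolation of the gains in decibels over log-frequency, and a 44.1kHz sampling rate. The names `dct1` and `design_kernel` are invented for the example.

```python
import numpy as np

def dct1(x):
    """Unnormalized type-I DCT, with the two end samples weighted by 1/2."""
    x = np.asarray(x, dtype=float)
    k = np.arange(len(x))
    basis = np.cos(np.pi * np.outer(k, k) / (len(x) - 1))
    weights = np.ones(len(x))
    weights[0] = weights[-1] = 0.5
    return basis @ (weights * x)

def design_kernel(slider_hz, slider_db, sample_rate=44100, half_len=64):
    """Turn slider positions into a symmetric kernel of 2*half_len + 1 taps."""
    m = half_len
    bin_hz = np.arange(m + 1) * sample_rate / (2 * m)  # 0 Hz .. Nyquist
    # Interpolate in dB over log-frequency; np.interp holds the end values flat
    log_bins = np.log2(np.maximum(bin_hz, 1.0))
    gains_db = np.interp(log_bins, np.log2(slider_hz), slider_db)
    gains = 10.0 ** (gains_db / 20.0)
    half = dct1(gains) / m   # one side of the impulse response
    half[-1] *= 0.5          # the outermost tap appears on both sides
    kernel = np.concatenate((half[:0:-1], half))
    return kernel, gains

# Hypothetical 10-band layout, with a boost around 1kHz
SLIDER_HZ = [31, 63, 125, 250, 500, 1000, 2000, 4000, 8000, 16000]
SLIDER_DB = [0, 0, 0, 0, 3, 6, 3, 0, 0, 0]
kernel, gains = design_kernel(SLIDER_HZ, SLIDER_DB)

# The kernel only changes when a slider moves; the stream is simply convolved
signal = np.random.default_rng(0).standard_normal(4096)
equalized = np.convolve(signal, kernel, mode="same")
```

With all sliders at 0dB, the kernel reduces to a single unit tap, so that the stream passes through unchanged.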

(Updated 7/25/2020, 11h15… )

## An Observation about the Daubechies Wavelet and PQF

In an earlier posting, I had written about what a wonderful thing a Quadrature Mirror Filter was, and that it is better to apply the Daubechies Wavelet than the older Haar Wavelet. But the question of how the process can be reversed remains less obvious.

The concept was clear, that an input stream in the Time-Domain could first be passed through a low-pass filter, and then sub-sampled at (1/2) its original sampling rate. Simultaneously, the same stream can be passed through the corresponding band-pass filter, and then sub-sampled again, so that only frequencies above half the Nyquist Frequency are sub-sampled, thereby reversing them to below the new Nyquist Frequency.

A first approximation for how to reverse this might be to duplicate each sample of the lower sub-band once, before super-sampling them, and to invert each sample of the upper sub-band once, after expressing it positively. But we would not want playback-quality to drop to that of a Haar wavelet again! And so we would apply the same wavelets to recombine the sub-bands. There is a detail to that which I left out.

We might want to multiply each sample of each sub-band by its entire wavelet, but only once for every second output-sample. And then one concern we might have could be, that the output-amplitude might not be constant. I suspect that one of the constraints which each of these wavelets satisfies would be, that their output-amplitude will actually be constant, if they are applied once per second output-sample.

Now, in the case of ‘Polyphase Quadrature Filter’, Engineers reduced the amount of computational effort, by not applying a band-pass filter, but only the low-pass filter. When encoding, the low sub-band is produced as before, but the high sub-band is simply produced as the difference between every second input-sample, and the result that was obtained when applying the low-pass filter. The question about this which is not obvious, is ‘How does one recombine that?’

And the best answer I can think of would be, to apply the low-pass wavelet to the low sub-band, and then to supply the sample from the high sub-band for two operations:

1. The first sample from the output of the low-pass wavelet, plus the input sample.
2. The second sample from the output of the low-pass wavelet, minus the same input sample, from the high sub-band.
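The two operations above can be tried out in the simplest possible case. The following sketch assumes, purely for illustration, that the low-pass filter is a two-sample average, which is the Haar case and not the longer wavelet a real codec would use. The names `split` and `merge` are invented. It only shows that the plus / minus recombination restores the input under that assumption.

```python
import numpy as np

def split(samples):
    """Encode: a low sub-band, and a high sub-band formed as a difference."""
    x = np.asarray(samples, dtype=float)
    first, second = x[0::2], x[1::2]
    low = 0.5 * (first + second)  # assumed low-pass: two-sample average
    high = first - low            # every second input sample, minus the low-pass
    return low, high

def merge(low, high):
    """Decode: low-pass output plus, then minus, the same high sub-band sample."""
    out = np.empty(2 * len(low))
    out[0::2] = low + high  # operation 1
    out[1::2] = low - high  # operation 2
    return out

stream = np.array([3.0, 1.0, -2.0, 4.0, 0.5, 0.5])
low, high = split(stream)
restored = merge(low, high)
```

With a longer low-pass wavelet, the low sub-band would first need to be passed through that wavelet, as described above, before the high sub-band is added and subtracted.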

## An Observation about the Discrete Fourier Transforms

Discrete Fourier Transforms, including the Cosine Transforms, tend to have as many elements in the frequency-domain, as the sampling interval had in the time-domain.

Thus, if a sampling interval had 1024 samples, there will be as many frequency-coefficients, numbered from 0 to 1023 inclusively. One way in which these transforms differ from the FFT, is in the possibility of having a number of elements either way, that are not a power of 2. It is possible to have a discrete transform with 11 time-domain samples, that translate into as many frequency-coefficients, numbered from 0 to 10 inclusively.

If it was truly the project to compute an FFT that has one coefficient per octave, then we would include the Nyquist Frequency, which is usually not done. And in that case, we would also ask ourselves, whether the component at F=0 is best computed as the summation over the longest interval, where it would usually be computed, or whether it makes more sense then, just to fold the shortest interval, which consists of 2 samples, one more time, to arrive at 1 sample, the value of which corresponds to F=0 .

Now, if our discrete transform had the frequency-coefficients


G(n) = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}




Then the fact could be exploited that these transforms tend to act as their own inverse. Therefore I can know, that the same set of samples in the time-domain, would constitute a DC signal, which would therefore have the frequency-coefficients


F(n) = {1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}




If this was taken to be a convolution again, because the discrete transforms are their own inverse, it would correspond to the function


F(n) · S(m) == S(m)




We would assume that multiplication begins with element (0) and not with element (10). So I have a hint, that maybe I am on the right track. But, because the DCT has an inverse which is not exactly the same, the inverse being the IDCT, the next question I would need to investigate, is whether indeed I should be using the DCT and not the IDCT, to turn an intended set of frequency-coefficients, into a working convolution. And to answer that question, the simple thought does not suffice.
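The 11-element example can be checked directly. This is a small sketch that assumes the transform in question is the type-I DCT, which is its own inverse up to a constant factor. The more common DCT-II does not have that property, because its inverse is the DCT-III. The function name `dct1` and the choice of normalization are assumptions made for the example.

```python
import numpy as np

def dct1(x):
    """Unnormalized type-I DCT, with the two end samples weighted by 1/2."""
    x = np.asarray(x, dtype=float)
    k = np.arange(len(x))
    basis = np.cos(np.pi * np.outer(k, k) / (len(x) - 1))
    weights = np.ones(len(x))
    weights[0] = weights[-1] = 0.5
    return basis @ (weights * x)

N = 11
G = np.ones(N)          # the flat set of frequency-coefficients
F = dct1(G) / (N - 1)   # normalized so that the leading element becomes 1

# Applying the same transform twice returns the input, up to the factor 2/(N-1)
x = np.arange(N, dtype=float)
round_trip = dct1(dct1(x)) * 2.0 / (N - 1)
```

Under this normalization, `F` comes out as a single 1 followed by zeros, and `round_trip` reproduces `x` to within rounding error.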

The main advantage with the DCT would be, that we will never need to deal with complex values.

Dirk

## The approximate Difference between a DFT and an FFT

Both the Discrete Fourier Transform and the Fast Fourier Transform produce complex-numbered coefficients, the non-zero amplitudes of which will represent frequency components in the signal. They both produce a more accurate measure of this property of the signal than the Discrete Cosine Transforms do.

Without getting into rigorous Math: If we have a 1024-sample interval in the time-domain, then the DFT of that simply computes the coefficients from 0 through to 1023, in half-cycles. A frequency component present at one coefficient, let us say an even-numbered coefficient, will also have a non-zero effect on the adjacent, odd-numbered coefficients, which can therefore not be separated fully by a Fourier Transform that defines both sets. A DFT will generally compute them all.

An FFT has as a premise, a specific number of coefficients per octave. That number could be (1), but seldom actually is. In general, an FFT will at first compute (2 * n) coefficients over the full sampling interval, will then fold the interval, and will then compute another (n) coefficients, and will fold the interval again, until the highest-frequency coefficient approaches 1/2 the number of time-domain samples in the last computed interval.

This will cause the higher-octave coefficients to be more spread out and less numerous, but because they are also being computed for successively shorter sampling intervals, they also become less selective, so that all the signal energy is eventually accounted for.

Also, with an FFT, it is usually the coefficients which correspond to the even-numbered ones in the DFT which are computed, again because one frequency component from the signal does not need to be accounted for twice. Thus, whole-numbers of cycles per sampling interval are usually computed.

For example, if we start with a 1024-sample interval in the time-domain, we may decide that we want to achieve (n = 4) coefficients per octave. We therefore compute 8 over the full interval, including (F = 0) but excluding (F = 8). Then we fold the interval down to 512 samples, and compute the coefficients from (F = 4) through (F = 7).

A frequency component that completes the 1024-sample interval 8 times, will complete the 512-sample interval 4 times, so that the second set of coefficients continues where the first left off. And then again, for a twice-folded interval of 256 samples, we compute from (F = 4) through (F = 7)…

After we have folded our original sampling interval 6 times, we are left with a 16-sample interval, which forms the end of our series, because (F = 8) would fit in exactly, into 16 samples. And, true to form, we omit the last coefficient, as we did with the DFT.

$2^{10} = 1024$

$10 - 6 = 4$

$2^{4} = 16$

So we would end up with

(1 * 8) + (6 * 4) =  32  Coefficients .
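The count can be reproduced by enumerating the scheme as described in this posting. The sketch below only mirrors that description and is not a statement about how FFT libraries are organized. The function name `octave_plan` is invented.

```python
def octave_plan(full_length=1024, per_octave=4):
    """List (interval length, F) pairs for the folding scheme described above."""
    length = full_length
    # Full interval: F = 0 up to, but excluding, F = 2 * per_octave
    plan = [(length, f) for f in range(2 * per_octave)]
    # Keep folding until F = 2 * per_octave would fit exactly into the interval
    while length > 4 * per_octave:
        length //= 2
        plan += [(length, f) for f in range(per_octave, 2 * per_octave)]
    return plan

plan = octave_plan()
# Express each coefficient as whole cycles per original 1024-sample interval
equivalent = [f * 1024 // length for length, f in plan]

print(len(plan), "coefficients, the last one at", plan[-1])
print(equivalent)
```

The second line of output shows that each folded set continues where the previous set left off, with the spacing doubling at every fold.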

And this type of symmetry seemed relevant in this earlier posting.

Dirk