## Hypothetically, how an FFT-based equalizer can be programmed.

One of the concepts which I only recently posted about was, that I had activated an equalizer function, that was once integrated into how the PulseAudio sound server works, but which may be installed with additional packages, in more-recent versions of Debian Linux. As I wrote, to activate this under Debian 8 / Jessie was a bit problematic at first, but could ultimately be accomplished. The following is what the controls of this equalizer look like, on the screen:

And, this is what the newly-created ‘sink’ is named, within the (old) KDE-4 desktop manager’s Settings Panel:

What struck me as remarkable about this, was its naming, as an “FFT-based Equalizer…”. I had written an earlier posting, about How the Fast Fourier Transform differs from the Discrete Fourier Transform. And, because I tend to think first, about how convolutions may be computed, using a Discrete Cosine Transform, it took me a bit of thought, to comprehend, how an equalizer function could be implemented, based on the FFT.

BTW, That earlier posting which I linked to above, has as a major flaw, a guess on my part about how MP3 sound compression works, that makes a false assumption. I have made more recent postings on how sound-compression schemes work, which no longer make the same false assumption. But otherwise, that old posting still explains, what the difference between the FFT and other, Discrete Transforms is.

So, the question which may go through some readers’ minds, like mine, would be, how a graphic equalizer based on the FFT can be made computationally efficient, to the maximum. Obviously, when the FFT is only being used to analyze a sampling interval, what results is a (small) number of frequency coefficients, spaced approximately uniformly, over a series of octaves. Apparently, such a set of coefficients-as-output, needs to be replaced by one stream each, that isolates one frequency-component. Each stream then needs to be multiplied by an equalizer setting, before being mixed into the combined equalizer output.

I think that one way to compute that would be, to replace the ‘folding’ operation normally used in the Fourier Analysis, with a procedure, that only computes one or more product-sums, of the input signal with reference sine-waves, but in each case except for the lowest frequency, over only a small fraction of the entire buffer, which becomes shorter according to powers of 2.

Thus, it should remain constant that, in order for the equalizer to be able to isolate the frequency of ~31Hz, a sine-product with a buffer of 1408 samples needs to be computed, once per input sample. But beyond that, determining the ~63Hz frequency-component, really only requires that the sine-product be computed, with the most recent 704 samples of the same buffer. Frequency-components that belong to even-higher octaves can all be computed, as per-input-sample sine-products, with the most-recent 352 input-samples, etc. (for multiples of ~125Hz). Eventually, as the frequency-components start to become odd products of an octave, an interval of 176 input samples can be used, for the frequency-components belonging to the same octave, thus yielding the ~500Hz and ~750Hz components… After that, in order to filter out the ~1kHz and the ~1.5kHz components, a section of the buffer only 88 samples long can be used…

Mind you, one alternative to doing all that would be, to apply a convolution of fixed length to the input stream constantly, but to recompute that convolution, by first interpolating frequency-coefficients between the GUI’s slider-positions, and then applying one of the Discrete Cosine Transforms to the resulting set of coefficients. The advantage to using a DCT in this way would be, that the coefficients would only need to be recomputed once, every time the user changes the slider-positions. But then, to name the resulting equalizer an ‘FFT-based’ equalizer, would actually be false.

(Updated 7/25/2020, 11h15… )

I have run into people, who believe that a signal cannot be phase-advanced in real-time, only phase-delayed. And as far as I can tell, this idea stems from the misconception, that in order for a signal to be given a phase-advance, some form of prediction would be needed. The fact that this is not true can best be visualized, when we take an analog signal, and derive another signal from it, which would be the short-term derivative of the first signal. ( :1 ) Because the derivative would be most-positive at points in its waveform where the input had the most-positive slope, and zero where the input was at its peak, we would already have derived a sine-wave for example, that will be phase-advanced 90⁰ with respect to an input sine-wave.

But the main reason this is not done, is the fact that a short-term derivative also acts as a high-pass filter, which progressively doubles in output amplitude, for every octave of frequencies.

What can be done in the analog domain however, is that a signal can be phase-delayed 90⁰, and the frequency-response kept uniform, and then simply inverted. The phase-diagram of each of the signal’s frequency-components will then show, the entire signal has been phase-advanced 90⁰.

(Updated 11/29/2017 : )

## There has been some confusion about the Sinc-Filter.

I have read descriptions about the Sinc-Filter somewhere, which predicted that it would become unstable, if the frequency of the input stream, happened to correspond to the spacing, between its non-zero coefficients. As far as I can tell, this prediction was based on a casual inspection of the Sinc Function, but overlooks something which is easy to overlook about it. This case also happens to correspond, to the input stream having a frequency equal to the Nyquist Frequency, of certain practical applications, such as over-sampling.

The Sinc Function has zero-crossings at regular intervals, including the center-point, where its coefficient is stated as being equal to (1.0) . This happens because the value at the center-point, is the solution to a limit equation, that corresponds to (0/0) .

This center coefficient is symmetrically flanked by two positive ones, one of which is only positive, because it forms as a division of the sine of x by the corresponding negative value of x. At frequencies below the Nyquist Frequency, the sum of their products starts to reinforce the center element. Above Nyquist, they start to cancel the product with the center coefficient.

This can be complicated to plot using Computer Algebra Systems, because plotting functions are always numerical, and at (x=0), there is no numerical solution (only the Algebraic solution given lHôpitals Rule). So, a CAS typically needs to have the Sinc Function defined as a special case, to be able to plot it, otherwise requiring a complex workaround.

So it is possible that the frequency of the incoming stream aligns to the spacing between the maxima and minima of the Sinc Function. If that happens, there are two behaviors to bear in mind:

1. The peak of the input stream could be aligned with the center-point. In that case, all the other waves will have zero-crossings, where the Sinc Function has maxima. The fact that the single input-sample seems to produce (1.0) as the output amplitude, is due to how the function is frequently normalized for practical use. According to that, maximum output should reach (2.0) at a frequency of zero…
2. The input stream could have a zero-crossing, at the center-point of the Sinc Function, so that its product from there should equal (0.0) . In that case, the input stream will have positive peaks on one side of the center-point, that all correspond to negative peaks on the other side of the center-point. According to that, the instantaneous output should equal (0.0) .

All of this would suggest to me, that the Sinc-Filter will work properly.

One way in which people can misinterpret the plot of the curve, would be to notice it has a positive peak in the center, to notice that after a zero-crossing, it forms two negative peaks, and then to conclude that those negative peaks are also the two closest non-zero coefficients to the center.

## I feel that standards need to be reestablished.

When 16-bit / 44.1kHz Audio was first developed, it implied a very capable system for representing high-fidelity sound. But I think that today, we live in a pseudo-16-bit era. Manufacturers have taken 16-bit components, but designed devices which do bot deliver the full power or quality of what this format once promised.

It might be a bit of an exaggeration, but I would say that out of those indicated 16 bits of precision, the last 4 are not accurate. And one main reason this has happened, is due to compressed sound. Admittedly, signal compression – which is often a euphemism for data reduction – is necessary in some areas of signal processing. But one reason fw data-reduction was applied to sound, had more to do with dialup-modems and their lack of signal-speed, and with the need to be able to download songs onto small amounts of HD space, than it served any other purpose, when the first forms of data-reduction were devised.

Even though compressed streams caused this, I would not say that the solution lies in getting rid of compressed streams. But I think that a necessary part of the solution would be consumer awareness.

If I tell people that I own a sound device, that it uses 2x over-sampling, but that I fear the interpolated samples are simply generated as a linear interpolation of the two adjacent, original samples, and if those people answer “So what? Can anybody hear the difference?” Then this is not an example of consumer awareness. I can hear the difference between very-high-pitch sounds that are approximately correct, and ones which are greatly distorted.

Also, if we were to accept for a moment that out of the indicated 16 bits, only the first 12 are accurate, but there exist sound experts who tell us that by dithering the least-significant bit, we can extend the dynamic range of this sound beyond 96db, then I do not really believe that those experts know any less about digital sound. Those experts have just remained so entirely surrounded by their high-end equipment, that they have not yet noticed the standards slip, in other parts of the world.

Also, I do not believe that the answer to this problem lies in consumers downloading 24-bit, 192kHz sound-files, because my assumption would again be, that only a few of those indicated 24 bits will be accurate. I do not believe Humans hear ultrasound. But I think that with great effort, we may be able to hear 15-18kHz sound from our actual playback devices again – in the not-so-distant future.