## Some realizations about Digital Signal Processing

One realization I’ve come to recently about digital signal processing is that up-sampling a digital stream twofold, purely for playback and by simple linear interpolation (turning a 44.1kHz stream into an 88.2kHz stream, or a 48kHz stream into a 96kHz stream), does less damage to the sound quality than I had previously thought. One reason I think so is that doing this achieves the same thing that applying a (low-pass) Haar Wavelet would achieve, after each original sample had been doubled. After all, I had already said that Humans would have a hard time hearing that this has been done.
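The equivalence can be checked on a few samples. The following is only a sketch in Python: the function names are hypothetical, and the Haar low-pass is assumed to be the two-tap average with unity gain at DC (taps of 1/2 and 1/2).

```python
def double_samples(x):
    """Repeat every sample once, so the stream has twice the rate."""
    out = []
    for s in x:
        out.extend([s, s])
    return out

def haar_lowpass(y):
    """Two-tap moving average (assumed normalization: taps 1/2, 1/2)."""
    return [(y[i] + y[i + 1]) / 2 for i in range(len(y) - 1)]

def linear_upsample(x):
    """Keep each original sample and insert the midpoint to its successor."""
    out = []
    for a, b in zip(x, x[1:]):
        out.extend([a, (a + b) / 2])
    out.append(x[-1])
    return out

x = [0.0, 1.0, 0.5, -1.0]
via_haar = haar_lowpass(double_samples(x))
via_interpolation = linear_upsample(x)
print(via_haar)
print(via_interpolation)
```

Both lists should come out identical: averaging two copies of the same sample returns that sample, and averaging across a boundary between two different samples returns their midpoint.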

But then, given such an assumption, I think I’ve also come to more realizations about where I was having trouble understanding what exactly Digital Signal Processors do. It might be Mathematically true to say that a convolution can be applied to a stream after it has been up-sampled. But whether that is practical depends on how many elements the convolution is supposed to have, on whether a single DSP chip is supposed to decode both stereo channels or only one, and on whether that DSP chip is also supposed to perform other steps associated with playing back the audio, such as decoding whatever compression Bluetooth 4 or Bluetooth 5 have put on the stream. It may turn out that realistic Digital Signal Processing chips just don’t have enough MIPS (Millions of Instructions Per Second) to do all that.

Now, I do know that DSP chips exist that have more MIPS, but then those chips may also measure 2cm x 2cm, and may take up much of the circuit-board they are to be soldered into. Those types of chips are unlikely to be built into a mid-price-range set of (Stereo) Bluetooth Headphones that have an equalization function.

But what I can then speculate further is that some combination of alterations of these ideas should work.

For example, the convolution could be computed on the stream before it has been up-sampled, and the stream could then be up-sampled ‘cheaply’, using the linear interpolation. The way I had it before, the half-used virtual equalizer bands would also accomplish a kind of brick-wall filter, whereas performing the virtual equalizer function on the stream before up-sampling would make use of almost all the bands, and doing it that way would halve the number of MIPS that a DSP chip needs to possess. Doing it that way would also halve the linear frequency spacing between the bands, which would have created issues at the low end of the audible spectrum.
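The arithmetic behind that halving can be sketched as follows. This is only a rough estimate under stated assumptions: the kernel length of 64 taps is hypothetical, one multiply-accumulate is counted per tap per output sample, the band spacing is taken to be proportional to the sample rate divided by the number of taps, and the small cost of the linear interpolation itself is ignored.

```python
def convolution_cost(sample_rate_hz, taps):
    """Return (multiply-accumulates per second, band spacing in Hz), one channel."""
    macs_per_second = sample_rate_hz * taps
    # Assumption: the bands are spread linearly over the sample rate.
    band_spacing_hz = sample_rate_hz / taps
    return macs_per_second, band_spacing_hz

TAPS = 64  # hypothetical kernel length, not taken from any real device

after_upsampling = convolution_cost(88_200, TAPS)
before_upsampling = convolution_cost(44_100, TAPS)

print("after up-sampling: ", after_upsampling)
print("before up-sampling:", before_upsampling)
```

With the same number of taps, running the convolution at 44.1kHz instead of 88.2kHz halves both figures: the work per second and the spacing between the bands.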

Alternatively, implementing a digital 9- or 10-band equalizer, with the bands spaced an octave apart, could be achieved after up-sampling, instead of before up-sampling, but again, much more cheaply in terms of computational power required.

Dirk

## LG Tone Infinim HBS-910 Bluetooth Headphones

In an earlier posting, I had written that my LG Tonepro HBS-750 Bluetooth Headphones had permanently failed. Today, I received the HBS-910 headphones that are meant to replace those. And as I’ve written before, it is important to me to benefit from the high-quality sound that both sets of headphones offer.

I’m breaking in the new ones, as I’m writing this.

There exists a design-philosophy today, according to which music-playback is supposed to boost the bass and attenuate the highest frequencies (the ones higher than 10kHz), so that the listener will get the subjective impression that the sound is ‘louder’, and will therefore reduce the actual signal-level, preserving their hearing better than was done a few decades ago.

1. The lowest-frequency (default) setting on the equalizer of the headphones does both of those things.
2. The next setting stops boosting the bass.
3. The third setting stops attenuating the treble.

Overall, I get the impression that the highest frequencies which the HBS-910 can reproduce extend higher than those which the HBS-750 was able to reproduce.

## An Observation about the Discrete Fourier Transforms

Discrete Fourier Transforms, including the Cosine Transforms, tend to have as many elements in the frequency-domain, as the sampling interval had in the time-domain.

Thus, if a sampling interval had 1024 samples, there will be as many frequency-coefficients, numbered from 0 to 1023 inclusively. One way in which these transforms differ from the FFT is that the number of elements, in either domain, does not need to be a power of 2. It is possible to have a discrete transform with 11 time-domain samples that translate into as many frequency-coefficients, numbered from 0 to 10 inclusively.
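As an illustration of a transform length that is not a power of 2, here is a minimal sketch of a direct (slow) Discrete Fourier Transform in plain Python. The function names and the test signal are made up for the example, and the convention assumed is the common one where only the inverse is scaled by 1/N.

```python
import cmath

def dft(x):
    """Direct DFT: N samples in, N complex coefficients out (no scaling)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(coeffs):
    """Inverse DFT, scaled by 1/N so that idft(dft(x)) returns x."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

samples = [float(t % 4) for t in range(11)]   # arbitrary 11-sample signal
coeffs = dft(samples)
restored = idft(coeffs)

print(len(coeffs))                                            # coefficients 0..10
print(max(abs(a - b) for a, b in zip(samples, restored)))     # round-trip error
```

The round-trip error should be on the order of floating-point rounding, which shows that 11 samples and 11 coefficients carry the same information.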

If the project truly were to compute an FFT that has one coefficient per octave, then we would include the Nyquist Frequency, which is usually not done. In that case, we would also ask ourselves whether the component at F=0 is best computed as the summation over the longest interval, where it would usually be computed, or whether it makes more sense just to fold the shortest interval, which consists of 2 samples, one more time, to arrive at 1 sample, the value of which corresponds to F=0 .

Now, if our discrete transform had the frequency-coefficients


G(n) = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}




then the fact could be exploited that these transforms tend to act as their own inverse. Therefore I know that the same set of values, taken as samples in the time-domain, would constitute a DC signal, which would therefore have the frequency-coefficients


F(n) = {1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
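The step from eleven equal samples to a single non-zero coefficient can be checked numerically. The sketch below uses a plain DFT rather than a Cosine Transform, and assumes a 1/N scaling on the forward transform so that the DC coefficient comes out as 1; both choices are assumptions made for the illustration.

```python
import cmath

def dft_normalized(x):
    """Direct DFT with an assumed 1/N scaling on the forward transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / n
            for k in range(n)]

G = [1.0] * 11            # eleven equal samples: a DC signal
F = dft_normalized(G)

# Print magnitudes, rounded to hide floating-point residue.
print([round(abs(c), 9) for c in F])
```

The printed list should be a 1 followed by ten values that round to 0, in agreement with F(n) above.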




If this was taken to be a convolution again, because the discrete transforms are their own inverse, it would correspond to the function


F(n) · S(m) == S(m)




We would assume that multiplication begins with element (0) and not with element (10). So I have a hint that maybe I am on the right track. But because the DCT has an inverse which is not exactly the same, the inverse being the IDCT, the next question I would need to investigate is whether indeed I should be using the DCT and not the IDCT, to turn an intended set of frequency-coefficients into a working convolution. And to answer that question, the simple thought does not suffice.
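That the two directions are not interchangeable can at least be made visible with a small experiment. The sketch below assumes the unnormalized textbook forms of the DCT-II (commonly called “the DCT”) and the DCT-III (commonly called “the IDCT”, up to scaling); other normalizations exist, and this does not settle which of the two should generate the convolution.

```python
import math

def dct2(x):
    """Unnormalized DCT-II (assumed definition)."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi / n * (t + 0.5) * k) for t in range(n))
            for k in range(n)]

def dct3(x):
    """Unnormalized DCT-III (assumed definition), the inverse of dct2 up to scaling."""
    n = len(x)
    return [0.5 * x[0] + sum(x[t] * math.cos(math.pi / n * t * (k + 0.5))
                             for t in range(1, n))
            for k in range(n)]

N = 11
flat = [1.0] * N
delta = [1.0] + [0.0] * (N - 1)

flat_via_dct2 = dct2(flat)      # N at element 0, zero elsewhere
delta_via_dct3 = dct3(delta)    # the same value, 1/2, in every element
flat_via_dct3 = dct3(flat)      # not a single spike
delta_via_dct2 = dct2(delta)    # not flat

for name, values in [("dct2(flat)", flat_via_dct2), ("dct3(delta)", delta_via_dct3),
                     ("dct3(flat)", flat_via_dct3), ("dct2(delta)", delta_via_dct2)]:
    print(name, [round(v, 3) for v in values])
```

Under these definitions, the DCT-II turns the flat sequence into a single spike and the DCT-III turns the spike back into a flat sequence, but swapping the two transforms gives neither result, unlike the DFT, where both directions behave alike up to scaling.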

The main advantage of the DCT would be that we never need to deal with complex values.

Dirk

## Playing Games With Numbering

There seems to be an art-form, in the design of graphical equalizers, of choosing channel-frequencies that are approximately spaced one octave apart, yet which produce numbers that ‘look clean’ in decimal. For example, a sequence of frequencies is possible that goes 150 Hz, 300 Hz, 600 Hz, 1.2 kHz, 2.4 kHz, 5 kHz, 10 kHz, 20 kHz, resulting in an 8-band equalizer. Notably, in this example, 2.4 kHz will also be treated as if it was 2.5 kHz, so that the next-higher band, at 5 kHz, will seem to be an exact octave higher.
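The spacing of that example sequence can be inspected directly. This short Python sketch just divides each band by the one below it; the variable names are arbitrary.

```python
bands_hz = [150, 300, 600, 1200, 2400, 5000, 10000, 20000]

# Ratio of each band to the one below it; an exact octave is a ratio of 2.
ratios = [upper / lower for lower, upper in zip(bands_hz, bands_hz[1:])]

for lower, upper, ratio in zip(bands_hz, bands_hz[1:], ratios):
    print(f"{lower:>6} Hz -> {upper:>6} Hz : ratio {ratio:.4f}")
```

Every step is an exact octave except the one from 2.4 kHz to 5 kHz, which is where the sequence quietly switches from one ‘clean’ series of numbers to another.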

This will not work as well, with a 20-band equalizer.

Dirk

(Edit 03/26/2017 : )

Also, the difference between 2.4 and 2.5 is less than 1/20. Anything further off will produce a hot-spot. So below 150 Hz, we might be ill-advised to put 80 instead of 75, because those would differ by more than 1/20. I would actually suggest 76 – 38 – 20 . Mind you, that 20 Hz suggestion would be off by 1/19, but who hears those frequencies so accurately anyway?
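These fractions can be worked out exactly. The sketch below measures each labelled frequency against the exact octave value it stands in for, as a fraction of that exact value; the function name and the choice of reference are assumptions made for the illustration.

```python
from fractions import Fraction

def relative_offset(label_hz, exact_hz):
    """Offset of the labelled frequency from the exact one, as a fraction of the exact one."""
    return Fraction(abs(label_hz - exact_hz), exact_hz)

TOLERANCE = Fraction(1, 20)

cases = {
    "2.4 kHz standing in for 2.5 kHz": (2400, 2500),
    "80 Hz standing in for 75 Hz": (80, 75),
    "76 Hz standing in for 75 Hz": (76, 75),
    "20 Hz standing in for 19 Hz": (20, 19),
}

for name, (label, exact) in cases.items():
    offset = relative_offset(label, exact)
    verdict = "within" if offset < TOLERANCE else "outside"
    print(f"{name}: off by {offset} ({verdict} 1/20)")
```

Here 2.4 kHz is off by 1/25 and 76 Hz by 1/75, both within the 1/20 limit, while 80 Hz is off by 1/15 and 20 Hz by 1/19, both outside it.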