## Exploring the Discrete Sine Transform…

I can sometimes describe a way of using certain tools – such as in this case, one of the Discrete Cosine Transforms – which is correct in principle, but which has an underlying flaw, that needs to be corrected, from my first approximation of how it can be applied.

One of the things which I had said was possible was, to take a series of frequency-domain ‘equalizer settings’, which be at one per unit of frequency, not, at so many per octave, compute whichever DCT was relevant, such that the result had the lowest frequency as its first element, and then to apply that result as a convolution, in order finally to apply the computed equalizer to a signal.

One of the facts which I’m only realizing recently is, that if the DCT is computed in a one-sided way, the results are ‘completely non-ideal’, because it gives no control over what the phase-shifts will be, at any frequency! Similarly, such a one-sided convolution can also not be applied as the sinc function, because the amount of sine-wave output, in response to a cosine-wave input, will approach infinity, when the frequency is actually at the cutoff frequency.

What I have found instead is, that if such a cosine transform is mirrored around a centre-point, the amount of sine response, to an input cosine-wave, will cancel out and become zero, thus giving phase-shifts of zero.

But a result which some people might like is, to be able to apply controlled phase-shifts, differently for each frequency, such that those people specify a cosine as well as a sine component, for an assumed input cosine-wave.

The way to accomplish that is, to add-in the corresponding (normalized) sine-transform, of the series of phase-shifted response values, and to observe that the sine-transform will actually be zero at the centre-point. Then, the thing to do is, to apply the results negatively on the other side of the centre-point, which were to be applied positively on one side.

I have carried out a certain experiment with the Computer Algebra System named “wxMaxima”, in order first to observe what happens if a set of equal, discrete frequency-coefficients belonging to a series is summed. And then, I plotted the result of the definite integral, of the sine function, over a short interval. Just as with the sinc function, The integral of the cosine function was (sin(x) – sin(0)) / x, the definite integral of the sine function will be (1 – cos(x)) / x, and, Because the derivative of cos(x) is zero at (x = 0), the limit equation based on the divide by zero, will actually approach zero, and be well-behaved.

(Update 1/31/2021, 13h35: )

There is an underlying truth about Integral Equations in general, which people who studied Calculus 2 generally know, but, I have no right just to assume that any reader of my blog did so. There exist certain standard Integrals, which behave in the reverse way of how the standard Derivatives behave, just because ‘Integrals’ are ‘Antiderivatives’…

When one solves the Derivatives of certain trig functions repeatedly, one obtains the sequence:

sin(x) -> cos(x) -> -sin(x) -> -cos(x) -> sin(x)

Solving the Indefinite Integrals of the same trig functions yields the result:

sin(x) -> -cos(x) -> -sin(x) -> cos(x) -> sin(x)

Hence, the Indefinite Integral of sin(x) is in fact -cos(x), and:

( -(-cos(0)) = +1 )

(End of Update, 1/31/2021, 13h35.)

(Updated 2/04/2021, 17h10…)

## Hypothetically, how an FFT-based equalizer can be programmed.

One of the concepts which I only recently posted about was, that I had activated an equalizer function, that was once integrated into how the PulseAudio sound server works, but which may be installed with additional packages, in more-recent versions of Debian Linux. As I wrote, to activate this under Debian 8 / Jessie was a bit problematic at first, but could ultimately be accomplished. The following is what the controls of this equalizer look like, on the screen:

And, this is what the newly-created ‘sink’ is named, within the (old) KDE-4 desktop manager’s Settings Panel:

What struck me as remarkable about this, was its naming, as an “FFT-based Equalizer…”. I had written an earlier posting, about How the Fast Fourier Transform differs from the Discrete Fourier Transform. And, because I tend to think first, about how convolutions may be computed, using a Discrete Cosine Transform, it took me a bit of thought, to comprehend, how an equalizer function could be implemented, based on the FFT.

BTW, That earlier posting which I linked to above, has as a major flaw, a guess on my part about how MP3 sound compression works, that makes a false assumption. I have made more recent postings on how sound-compression schemes work, which no longer make the same false assumption. But otherwise, that old posting still explains, what the difference between the FFT and other, Discrete Transforms is.

So, the question which may go through some readers’ minds, like mine, would be, how a graphic equalizer based on the FFT can be made computationally efficient, to the maximum. Obviously, when the FFT is only being used to analyze a sampling interval, what results is a (small) number of frequency coefficients, spaced approximately uniformly, over a series of octaves. Apparently, such a set of coefficients-as-output, needs to be replaced by one stream each, that isolates one frequency-component. Each stream then needs to be multiplied by an equalizer setting, before being mixed into the combined equalizer output.

I think that one way to compute that would be, to replace the ‘folding’ operation normally used in the Fourier Analysis, with a procedure, that only computes one or more product-sums, of the input signal with reference sine-waves, but in each case except for the lowest frequency, over only a small fraction of the entire buffer, which becomes shorter according to powers of 2.

Thus, it should remain constant that, in order for the equalizer to be able to isolate the frequency of ~31Hz, a sine-product with a buffer of 1408 samples needs to be computed, once per input sample. But beyond that, determining the ~63Hz frequency-component, really only requires that the sine-product be computed, with the most recent 704 samples of the same buffer. Frequency-components that belong to even-higher octaves can all be computed, as per-input-sample sine-products, with the most-recent 352 input-samples, etc. (for multiples of ~125Hz). Eventually, as the frequency-components start to become odd products of an octave, an interval of 176 input samples can be used, for the frequency-components belonging to the same octave, thus yielding the ~500Hz and ~750Hz components… After that, in order to filter out the ~1kHz and the ~1.5kHz components, a section of the buffer only 88 samples long can be used…

Mind you, one alternative to doing all that would be, to apply a convolution of fixed length to the input stream constantly, but to recompute that convolution, by first interpolating frequency-coefficients between the GUI’s slider-positions, and then applying one of the Discrete Cosine Transforms to the resulting set of coefficients. The advantage to using a DCT in this way would be, that the coefficients would only need to be recomputed once, every time the user changes the slider-positions. But then, to name the resulting equalizer an ‘FFT-based’ equalizer, would actually be false.

(Updated 7/25/2020, 11h15… )

## An observation about how the OGG Opus CODEC may do Stereo.

One of the subjects which I’ve written about before, is the fact that the developers of the original OGG Vorbis form of music compression, have more recently developed the OGG Opus CODEC, which is partially the CELT CODEC. And, in studying the manpage on how to use the ‘opusenc’ command (under Linux), I ran across the following detail:


--no-phase-inv
Disable use of phase inversion for intensity stereo. This trades
some stereo quality for a higher quality mono  downmix,  and  is
useful when encoding stereo audio that is likely to be downmixed
to mono after decoding.



What does this mean? Let me explain.

I should first preface that with an admission, of the fact that an idea which was true for the original version of the Modified Discrete Cosine Transform, as introduced by MP3 compression and then reused frequently by other CODECs, may not always be the case. That idea was that, when defining monaural sound, each frequency coefficient needed to be signed. Because CELT uses a form of the Type 4 Discrete Cosine Transform which is only partially lapped, it may be that all the coefficients are assumed to be positive.

This will work as long as there is no destructive interference between the same coefficient, in the overlapping region, from one frame to the next, in spite of the half-sample shift of each frequency-value. Also, a hypotenuse function should be avoided, as that would present itself as distortion. One explicit way to achieve this could be, to rotate the reference-waves (n)·90° + 45° for coefficient (n):

Where ‘FN‘ refers to the current Frame-Number.

In general, modern compressed schemes will subdivide the audible spectrum into sub-bands, which in the case of CELT are referred to as its Critical Bands. And for each frame, the way stereo is encoded for each critical band, switches back and forth between X/Y intensity stereo, and Mid/Side stereo, which also just referred to as M/S stereo. What will happen with M/S stereo is, that the (L-R) channel has its own spectral shape, independent of the (L+R) channel’s, while with X/Y stereo, there is only one spectral pattern, which is reproduced by a linear factor, as both the (L+R) component, and the (L-R) component.

Even if the (L+R) is only being recorded as having positive DCT coefficients, with M/S stereo, the need persists for the (L-R) channel to be signed. Yet, even if M/S stereo is not taking place, implying that X/Y stereo is taking place, what can happen is that:

|L-R| > (L+R)

This would cause phase-inversion to take place between the two channels, (L) and (R). Apparently, a setting will prevent this from happening.

Further, because CELT has as its main feature, that it first states the amplitude of the critical band, and then a Code-Word which identifies the actual non-zero coefficients, which may only number 4, the setting may also affect critical bands for which M/S stereo is being used during any one frame. I’m not really sure if it does. But if it does, it will also make sure that the amplitude of the (L+R) critical band exceeds or equals that of the (L-R) critical band.

The way in which the CODEC decides, whether to encode the critical band using X/Y or M/S, for any one frame, is to detect the extent to which the non-zero coefficients coincide. If the majority of them do, encoding automatically switches to X/Y… Having said that, my own ideas on stereo perception are such that, if none of the coefficients coincide, it should not make any difference whether the specific coefficients belonging to the (L-R) channel are positive or negative. And finally, a feature which CELT could have enabled constantly, is to compute whether the (L-R) critical band correlates positively or negatively with the (L+R), independently of what the two amplitudes are. And this last observation suggests that even when encoding in M/S mode, the individual coefficients may not be signed.

(Update 10/03/2019, 9h30 … )

## Comparing two Bose headphones, both of which use active technology.

In this posting I’m going to do something I rarely do, which is, something like a product review. I have purchased the following two headphones within the past few months:

The first set of headphones has an analog 3.5mm stereo input cable, which has a dual-purpose Mike / Headphone Jack, and comes either compatible with Samsung, or with Apple phones, while the second uses Bluetooth to connect to either brand of phone. I should add that the phone I use with either set of headphones is a Samsung Galaxy S9, which supports Bluetooth 5.

The first set of headphones requires a single, AAA alkaline battery to work properly. And this not only fuels its active noise cancelling, but also an equalizer chip that has become standard with many similar middle-price-range headphones. The second has a built-in rechargeable Lithium-Ion Battery, which is rumoured to be good for 10-15 hours of play-time, which I have not yet tested. Like the first, the second has an equalizer chip, but no active noise cancellation.

I think that right off the bat I should point out, that I don’t approve of this use of an equalizer chip, effectively, to compensate for the sound oddities of the internal voice-coils. I think that more properly, the voice-coils should be designed to deliver the best frequency response possible, by themselves. But the reality in the year 2019 is, that many headphones come with an internal equalizer chip instead.

What I’ve found is that the first set of headphones, while having excellent noise cancellation, has two main drawbacks:

• The jack into which the analog cable fits, is poorly designed, and can cause bad connections,
• The single, AAA battery can only deliver a voltage of 1.5V, and if the actual voltage is any lower, either because a Ni-MH battery was used in place of an alkaline cell, or, because the battery is just plain low, the low-voltage equalizer chip will no longer work fully, resulting in sound that reveals the deficiencies in the voice-coil.

The second set of headphones overcomes both these limitations, and I fully expect that its equalizer chips will have uniform behaviour, that my ears will be able to adjust to in the long term, even when I use them for hours or days. Also, I’d tend to say that the way the equalizer arrangement worked in the first set of headphones, was not complete in fulfilling its job, even when the battery was fully charged. Therefore, If I only had the money to buy one of the headphones, I’d choose the second set, which I just received today.

But, having said that, I should also add that I have two 12,000BTU air conditioners running in the Summer months, which really require the noise-cancellation of the first set of headphones, that the second set does not provide.

Also, I have an observation of why the EQ chip in the second set of headphones may work better than the similarly purposed chip in the first set…

(Updated 9/28/2019, 19h05 … )