There exists an argument, against Headphone Spatialization.

Headphone Spatializaion is also known as ‘Binaural Sound’ or ‘Binaural Audio’. It is based on the idea, that when people hear direction, people do not only take into account the relative amplitudes of Left versus Right – aka panning – but that somehow, people also take into account time-delay that sound requires, to get to the more-distant ear, with respect to the closer ear. This time-delay is also known as the Inter-Aural Delay.

Quite frankly, if people are indeed able to do this, I would like to know how, because the actual cochlea cannot do this. The cochlea only perceives frequency-energies, and the brain is able in each hemisphere, to do a close comparison of those energies, between sound perceived by both ears.

If such an ability exists, it may be due to what happens in the middle ear. And this could be, because sound from one source, reaches each cochlea, over more than one pathway…

But what this also means is that if listeners simply put on headphones and listen to stereo, they too are relatively unable to make out positioning, unless that stereo is very accurate, which it is not, after it has been MP3-compressed.

So technology exists hypothetically, that will take explicit surround-sound, and encode it into stereo which is not meant to be re-compressed afterward, but that allows for stereo-perception.

There exist valid arguments against the widespread use of such technology. The way each person interprets the sound from his two ears, is an individual skill which he learns, in many cases people can only hear direction by moving their heads slightly, and the head of each person is slightly different anatomically. So there might not be any unified way to accomplish this.

What I find, is that when there are subtle differences in how this works over a large population, there is frequently a possible simplification, that does not correspond 100% to how one case interprets sound, but that works better in general, than what would happen if the effect is not applied at all.

Therefore, I would think of this as a “Surround Effect”, rather than as ‘Surround Sound’, the latter of which is meant to be heard over speakers, and where the ability of an individual to make out direction, falls back on the ability of the individual, also to make out the direction of his own speakers.

Dirk

 

Modern Consumer Sound Appreciation

Over recent months, I have been racking my brain, trying to answer questions I have, about how sound that was compressed in the frequency-domain, may or may not be able to preserve phase-information. This does not mean that I, personally, can hear phase-information, nor that specific MP3 Files I have been listening to, would even be good examples of how well modern MP3s compress sound. I suspect that in order to stay in business, the developers of MP3 have in fact been improving their codec, so that when played back correctly, the quality of MP3s will stay in line with more-recent formats that exist, such as OGG Vorbis…

But I think that people under-appreciate my intellectual point of view.

For many months and years, I had my doubts, that MP3 Files can in fact encode ± 180⁰ phase-shifts, i.e. a stereo-difference channel that has the correct polarity with respect to the stereo-sum channel, over a range of frequencies. What my own musings have only taught me in recent days, is that in fact, MP3 is capable of ± 180⁰ phase-separation.

Further, similar types of compression should be capable of better phase-separation than that, If their bit-rates are set high enough, that not too many of their frequency-coefficients get chopped down – according to what I have reasoned out today.

What I also know, is that the sound-formats AC3 and AAC have as an explicit feature, to store surround-sound. MPEG-2 Video Files more-or-less require the use of the AC3 codec for sound, and MP4 Files absolutely require the use of the AAC codec. And, stored in its compressed format, the surround-effect only requires ± 180⁰ phase-accuracy.

This subject is orthogonal to debate which exists, about whether it is of benefit to human listeners, to have sound reproduced at very high sample-rates, or at great bit-depths. Furthermore, I do not fully know what good a very high sample-rate – such as “192kHz” – is supposed to do any listener, if his sound has been MP3-compressed. As far as I am concerned, ultra-high sample-rates have to do with lossless compression, or no compression, which also happen to produce the same file-sizes at that signal-format.

What I did was just check, in what format iTunes downloads music by default. And it downloads its music in AAC Format. All this does for me, is corroborate a claim a friend of mine made, that he can hear his music with full positioning, since that is also the main feature of AAC, and not of MP3.

Continue reading Modern Consumer Sound Appreciation

A Thought on SRS

Today, when we buy a laptop, we assume that its internal speakers offer inferior sound by themselves, but that through the use of a feature named ‘SRS’, they are enhanced, so that sound which simply comes from two speakers in front of us, seems to fill the space around us, kind of how surround-sound would work.

The immediate problem with Linux computers is, that they do not offer this enhancement. However, technophiles have known for a long time that this problem can be solved.

The underlying assumption here is, that the stereo being sent to the speakers should act as if each channel was sent to one ear in an isolated way, as if we were using headphones.

The sound that leaves the left speaker, reaches our right ear with a slightly longer time-delay, than the time-delay with which it reaches our left ear, and a converse truth exists for the right speaker.

It has always been possible to time-delay and attenuate the sound that came from the left speaker in total, before subtracting the result from the right speaker-output, and vice-verso. That way, the added signal that reaches the left ear from the left speaker, cancels with the sound that reached it from the right speaker…

The main problem with that effect, is that it will mainly seem to work when the listener is positioned in front of the speakers, in exactly one position.

I have just represented a hypothetical setup in the time-domain. There can exist a corresponding representation in the frequency-domain. The only problem is, that this effect cannot truly be achieved just with one graphical equalizer setting, because it affects (L+R) differently from how it affects (L-R). (L+R) would be receiving some recursive, negative reverb, while (L-R) would be receiving some recursive, positive reverb. But reverb can also be expressed by a frequency-response curve, as long as that has sufficiently fine resolution.

This effect will also work well with MP3-compressed stereo, because with Joint Stereo, an MP3 stream is spectrally complex in its reproduction of the (L-R) component.

I expect that when companies package SRS, they do something similar, except that they may tweak the actual frequency-response curves into something simpler, and they may also incorporate a compensation, for the inferior way the speakers reproduce frequencies.

Simplifying the curves would allow the effect to break down less, when the listener is not perfectly positioned.

We do not have it under Linux.

(Edit 02/24/2017 : A related effect is possible, by which 2 or more speakers are converted into an effectively-directional speaker-system. I.e., the intent could be, that sound which reaches our filter as the (L) channel, should predominantly leave the speaker-set at one angle, while sound which reaches our filter as the (R) channel, should leave the speaker-set at an opposing angle.

In fact, if we have an entire array of speakers – i.e. a speaker-bar – then we can apply the same sort of logic to them, as we would apply to a phased-array radar system.

The main difference with such a system, as opposed to one based on the Inter-Aural Delay, is that this one would absolutely require we know the distance between the speakers. And then we would use that distance, as the basis for our time-delays… )

Continue reading A Thought on SRS