A Gap in My Understanding of Surround-Sound Filled: Separate Surround Channel when Compressed

In This earlier posting of mine, I had written about certain concepts in surround-sound, which were based on Pro Logic and the analog days. But I had gone on to write, that in the case of the AC3 or the AAC audio CODEC, the actual surround channel could be encoded separately, from the stereo. The purpose in doing so would have been, that if decoded on the appropriate hardware, the surround channel could be sent directly to the rear speakers – thus giving 6-channel output.

While writing what I just linked to above, I had not yet realized that either channel of the compressed stream could have its phase information conserved. This had caused me some confusion. Now that I realize that the phase information could be correct, and not based on the sampling windows themselves, a conclusion comes to mind:

Such a separate, compressed surround-channel would already be 90° phase-shifted with respect to the panned stereo. What this could mean is that, if the software recognizes that only 2 output channels are to be decoded, the CODEC might just mix the surround channel directly into the stereo. The resulting stereo would then also be prepped for Pro Logic decoding.



Dolby Atmos

My new Samsung Galaxy S9 smart-phone exceeds the audio capabilities of the older S-series phones, and the sound-chip of this one has a feature called “Dolby Atmos”. Its main premise is that a movie may have had audio encoded according to either Dolby Atmos, or according to the older, analog ‘Pro Logic’ system, and that, using headphone spatialization, it can be played back with more or less correct positioning. Further, the playback of mere music can be made richer.

(Updated 11/25/2018, 13h30 … )

Rather than just to write that this feature exists and works, I’m going to use whatever abilities I have to analyze the subject, and to try to form an explanation of how it works.

In This earlier posting, I effectively wrote the (false) supposition, that sound compression which works in the frequency domain, fails to preserve the phase position of the signal correctly. I explained why I thought so.

But in This earlier posting, I wrote what the industry had done in practice, which can result in the preservation of phase-positions, of frequency components.

The latter of the above two postings is the more accurate. What follows from that is this: if the resolution of the compressed stream is high, meaning that the quantization step is small, phase position is likely to be preserved well. If the resolution (of the sound) is poor, meaning that the quantization step is large and the resulting integers small, poor phase information will also result. It may be so poor as only to preserve the ±180° difference that follows from recorded, non-zero coefficients being signed values.
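The effect of the quantization step on phase can be sketched numerically. This is only an illustration under an assumption of mine: one frequency component is modelled as a cosine part and a sine part that are each rounded to a multiple of the step. A real codec's transform and quantizer differ, and the function names are hypothetical.

```python
import math

def quantize(value, step):
    """Round a value to the nearest multiple of the quantization step."""
    return round(value / step) * step

def recovered_phase_deg(amplitude, phase_deg, step):
    """Quantize the cosine and sine parts of one component, then read the phase back.

    Assumption: the component is stored as two signed numbers. This is a toy
    model, not the layout of any particular codec.
    """
    phase = math.radians(phase_deg)
    re = quantize(amplitude * math.cos(phase), step)
    im = quantize(amplitude * math.sin(phase), step)
    if re == 0 and im == 0:
        return None  # the component was quantized away entirely
    return math.degrees(math.atan2(im, re))

# Small step: the 30-degree phase survives almost unchanged.
fine = recovered_phase_deg(1.0, 30.0, step=0.001)

# Large step: only the sign of the larger part survives, so phases
# near 0 and near 180 degrees collapse onto exactly 0 and 180.
coarse_a = recovered_phase_deg(1.0, 30.0, step=1.0)
coarse_b = recovered_phase_deg(1.0, 150.0, step=1.0)

print(fine, coarse_a, coarse_b)
```

With the coarse step, the two components that started 120° apart end up exactly 180° apart, which is the sign-only behaviour described above.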

‘Dolby Atmos’ is a multi-track movie sound system that encodes actual speaker positions, but that is not based on the outdated Pro Logic boxes, which were based on analog wires coming in. In order to understand what was done with Pro Logic, maybe the reader should also read This earlier posting of mine, which explains some of the general principles. In addition, while Pro Logic 1 and 2 had physical speakers as outputs, Dolby Atmos on the S9 aims to use headphone spatialization to achieve a similar effect.

I should also state from the beginning, that the implementation of Dolby Atmos in the Samsung S9 phone, allows the user to select between three modes when active:

  1. Movies,
  2. Music,
  3. Voice.

In addition to the actual surround decoding, the Samsung S9 changes the equalizer settings – yes, it also has a built-in equalizer.

(Updated 11/30/2018, 7h30 … )


My Opinion on the Opinion of Chris “Monty” Montgomery

Chris Montgomery is the Audio Expert who invented the Ogg Vorbis codec. That gives enough reason to credit him with good advice. I recommend that my readers read his advice here.

I did read the whole thing, but have three comments on it:

  1. The Author suggests that 16-bit sample-depth offers a de-facto solution to the limits in dynamic range, simply due to the correct application of dithering. If I cannot trust my hardware to perform correct low-pass filtering, why on Earth would I trust it to perform correct, 16-bit, audio dithering?
  2. The Author explains the famous loudness curves, which define the threshold of perceptibility, as well as the higher threshold of pain. What he fails to point out is that these curves assume that the sound being tested for is the only sound being played over the headphones. If there is another, background sound being played – i.e. the current loudness-level is already higher than zero – then the threshold of perception for a given test-sound is higher: a higher level is required for that test-sound itself to be heard. Yet, this level is still lower than the peak level of the background sound. People who design codecs know this, as I am sure the author does. It is the threshold of perceptibility next to a background sound – not the absolute threshold – which gets used in the design of codecs.
  3. The Author suggests it would be a misuse of his codec, to encode discrete multi-channel sound. And one reason he states, would be the waste in file-size, while the next reason he states, would be the fact that sound jumps to the nearest speaker, when they are all encoded that way.

This last observation strikes a chord with me. I have already noticed that OGG Files do allow numerous channels to be encoded in parallel, but that if we exceed 2, we lose the benefits of Joint Stereo. By itself, this does not really count against this Author, whose codec therefore does not offer explicit surround-sound. But the possibility is very real that the localization of sound will jump to the nearest speaker, if the listener moves and the sound was encoded that way. It is entirely possible that purposeful encoding of surround-sound by the (competing) AC3 or AAC codecs reduces this risk.

But then I would suggest an alternative approach, to people who do not want to use the proprietary codecs, yet who wish to encode their movies with surround.

There exists the Steve Harris LADSPA plug-in library, which includes a matrix encoder for Pro Logic. This matrix encoder accepts 4 input channels, one of which is the surround channel, and outputs 2 stereo channels.
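What such a 4-to-2 matrix does can be sketched with phasors, meaning one complex number per channel at a single frequency. The gains and the ±90° convention below are the commonly described textbook form of the Dolby Surround matrix, which I am assuming here. They are not taken from the Steve Harris plug-in, and the function names are hypothetical.

```python
import math

G = math.sqrt(0.5)  # about -3 dB, applied to center and surround

def matrix_encode(left, right, center, surround):
    """Fold 4 channel phasors into a 2-channel Lt/Rt pair.

    Assumed convention: the surround is shifted +90 degrees into Lt and
    -90 degrees into Rt, so it ends up in opposite polarity in the two
    stereo channels.
    """
    lt = left + G * center + 1j * G * surround
    rt = right + G * center - 1j * G * surround
    return lt, rt

def passive_decode(lt, rt):
    """Simplest possible decode: sum gives center, difference gives surround."""
    return lt + rt, lt - rt

# A surround-only signal cancels in the sum and appears in the difference.
lt, rt = matrix_encode(0, 0, 0, 1)
center_est, surround_est = passive_decode(lt, rt)
print(abs(center_est), abs(surround_est))

# A center-only signal does the opposite.
lt, rt = matrix_encode(0, 0, 1, 0)
center_only, surround_leak = passive_decode(lt, rt)
print(abs(center_only), abs(surround_leak))
```

The sketch ignores the steering logic that real Pro Logic decoders add on top of the passive sum and difference.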

Further, the circuitry must exist someplace as well, to accept 2 stereo, 1 center and 1 surround-channel, and to encode those in real-time, so that the surround-effect can be played back over 6 speakers.

  • In principle, it should be possible to OGG-compress 4 channels and not 6, so that these channels can be used as inputs, to a matrix surround-system, like to the LADSPA plug-in, so that listenable surround will emanate from all speakers. Does Audio Software exist, which applies the LADSPA plug-in in real-time?
  • Alternatively, it might be possible to mix down Pro Logic sound into Stereo using the Steve Harris plug-in, and then to use FLAC on the resulting stereo.

BTW: What the Author mainly writes, is how incorrect it would be for pure listeners, to download their music in 24/192 format. He does not actually write, that Music / Sound Authors should avoid recording in this format. And so one fact which I have observed, is that there exists a lot of Audio Software – such as – that stores its sound in some higher, internal format, but which, when instructed to Export that to a 16-bit format, offers Dithering as an option.

This is possible because the Application is numeric and not physical. Thus, if I had used my USB-sound-device to record in 24-bit, I could next Export the finished sound tracks to 16-bit.
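As a rough sketch of what such an Export step might do numerically, the following reduces 24-bit integer samples to 16-bit with triangular (TPDF) dither. The function name, the choice of TPDF and the fixed seed are my own assumptions for illustration. Actual audio editors may use other dither shapes or noise shaping.

```python
import math
import random

def dither_to_16_bit(samples_24, seed=0):
    """Reduce signed 24-bit integer samples to signed 16-bit with TPDF dither.

    Assumption: plain TPDF dither of +/-1 LSB peak, no noise shaping.
    """
    rng = random.Random(seed)  # seeded only so that the example is repeatable
    out = []
    for s in samples_24:
        scaled = s / 256.0  # 8 bits are dropped, so 256 steps become 1
        noise = rng.random() - rng.random()  # triangular, within (-1, 1) LSB
        q = math.floor(scaled + noise + 0.5)  # round to the nearest integer
        out.append(max(-32768, min(32767, q)))  # clip to the 16-bit range
    return out

# A constant level of a quarter of one 16-bit step: plain rounding would
# give silence, but the dithered output keeps the level on average.
quiet = dither_to_16_bit([64] * 20000)
average = sum(quiet) / len(quiet)
print(average)
```

The individual output samples are noisy, but their average stays close to the original sub-step level, which is the point of dithering before the word length is shortened.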



But it would also seem that Chris Montgomery regards the use of such technology as being suited only to Professionals. I am not a professional, and do not have the extremely expensive tools they do. Yet, I am able to author sound-projects.



Some Music may be Suitable for Surround-Sound.

One question which older members of the population might ask themselves, is whether it makes any sense, for people to be listening to music with surround-sound.

This question – or rather bias – stems from the way much music was mixed-down in the 1970s and 1980s, where the Artists applied a catch-as-catch-can approach, to creating Stereo. In fact back then, the goal was often even to confuse how the listener hears sound, using phase-shifts, and thus to be psychedelic. And so a basis exists to think, that electronic music especially, was never meant to be heard in surround-sound.

But the situation has changed since then. For several decades, some FM radio stations have been offering some music in surround-sound. And further, much of the old music from the 1970s has also been remastered, with more-modern technical considerations.

Before I could know whether a friend of mine is listening to Beethoven's 9th Symphony, I cannot be sure whether what he says is real, and so I generally give people the benefit of the doubt. And, if he says he is listening to Neil Young, does he mean the way he recorded in the 1970s, or is he referring to a recording which Neil Young personally remastered after the year 2000? ;)