## A Gap in My Understanding of Surround-Sound Filled: Separate Surround Channel when Compressed

In This earlier posting of mine, I had written about certain concepts in surround-sound, which were based on Pro Logic and the analog days. But I had gone on to write that, in the case of the AC3 or the AAC audio CODEC, the actual surround channel could be encoded separately from the stereo. The purpose in doing so would have been that, if decoded on the appropriate hardware, the surround channel could be sent directly to the rear speakers – thus giving 6-channel output.

While writing what I just linked to above, I had not yet realized that either channel of the compressed stream could have its phase information conserved. This had caused me some confusion. Now that I realize that the phase information could be correct, and not based on the sampling windows themselves, a conclusion comes to mind:

Such a separate, compressed surround-channel would already be 90° phase-shifted with respect to the panned stereo. And what this means could be, that if the software recognizes that only 2 output channels are to be decoded, the CODEC might just mix the surround channel directly into the stereo. The resulting stereo would then also be prepped for Pro Logic decoding.

Dirk

## Dolby Atmos

My new Samsung Galaxy S9 smart-phone exceeds the audio capabilities of the older S-series phones, and the sound-chip of this one has a feature called “Dolby Atmos”. Its main premise is, that a movie may have had audio encoded according to either Dolby Atmos, or according to the older, analog ‘Pro Logic’ system, and that, using headphone spatialization, it can be played back with more or less correct positioning. Further, the playback of mere music can be made richer.

(Updated 11/25/2018, 13h30 … )

Rather than just write that this feature exists and works, I’m going to use whatever abilities I have to analyze the subject, and to try to form an explanation of how it works.

In This earlier posting, I effectively wrote the (false) supposition, that sound compression which works in the frequency domain, fails to preserve the phase position of the signal correctly. I explained why I thought so.

But in This earlier posting, I wrote what the industry had done in practice, which can result in the preservation of phase-positions, of frequency components.

The latter of the above two postings is the more accurate. What follows from that is, that if the resolution of the compressed stream is high, meaning that the quantization step is small, phase position is likely to be preserved well. If the resolution (of the sound) is poor, meaning that the quantization step is large and the resulting integers small, poor phase information will also result. It may be so poor as only to preserve the ±180° difference that follows from recorded, non-zero coefficients being signed values.
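The link between quantization step and phase accuracy can be sketched numerically. The sketch below assumes that one frequency component is stored as a pair of signed real coefficients (an in-phase and a quadrature part); that representation, and the function names, are assumptions made for illustration, not a description of the AC3 or AAC bitstream.

```python
import math

def quantize(value, step):
    """Round a coefficient to the nearest integer multiple of `step`."""
    return round(value / step)

def recovered_phase_deg(amplitude, phase_deg, step):
    """Quantize the in-phase and quadrature parts of one component,
    then return the phase (degrees) implied by the stored integers."""
    phase = math.radians(phase_deg)
    i_q = quantize(amplitude * math.cos(phase), step)
    q_q = quantize(amplitude * math.sin(phase), step)
    if i_q == 0 and q_q == 0:
        return None  # the component was quantized away entirely
    return math.degrees(math.atan2(q_q, i_q))

# Small step: large integers, phase recovered to within about a degree.
fine = recovered_phase_deg(1.0, 30.0, step=0.01)

# Step as large as the amplitude: only one signed coefficient survives,
# so only its sign (0 versus 180 degrees) is left.
coarse_a = recovered_phase_deg(1.0, 30.0, step=1.0)
coarse_b = recovered_phase_deg(1.0, 150.0, step=1.0)
```

In this toy model, a coarse step makes the recovered phase snap to a handful of angles, and whenever only one coefficient remains non-zero, its sign is all that is left of the phase.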

‘Dolby Atmos’ is a multi-track movie sound system that encodes actual speaker positions, but is not based on the outdated Pro Logic boxes, which were based on analog wires coming in. In order to understand what was done with Pro Logic, maybe the reader should also read This earlier posting of mine, which explains some of the general principles. In addition, while Pro Logic 1 and 2 had physical speakers as outputs, Dolby Atmos on the S9 aims to use headphone spatialization to achieve a similar effect.

I should also state from the beginning, that the implementation of Dolby Atmos in the Samsung S9 phone, allows the user to select between three modes when active:

1. Movies,
2. Music,
3. Voice.

In addition to the actual surround decoding, the Samsung S9 changes the equalizer settings – yes, it also has a built-in equalizer.

(Updated 11/30/2018, 7h30 … )

## An Observation about Pro Logic Versus AC3

One question that people might ask, would be ‘Why is there still any interest in Pro Logic, when in the world today, we have AC3 sound compression?’ Beyond AC3, we also have AAC sound compression, which gets used in MP4 Video Files, or by itself, in M4A Audio Files.

The answer I would give is as follows. As long as our player or player application supports AC3, it will definitely be better able to output 6 channels of sound from such a compressed, digital stream.

But it can happen to us that our speaker amplifier only accepts two analog channels, which would have been called Left and Right. In such a case, if our speaker amp possesses a Pro Logic decoder, the player of our AC3-compressed stream still has the option of Pro Logic encoding its stereo output.

In that case, our speaker amp will still try to decode that into surround sound, with as many speakers as we have connected to this amp.

But if we do that, we are subjecting the sound to a loss in quality, because the sound has been collapsed into analog stereo first.

Yet, to substitute some other, Back-Front component for the surround channel, which is being fed to the Pro Logic decoder, does not really hurt the quality of the surround decoding more, than using Pro Logic already would. And so I would see no hesitation in doing so, if the need arises.

Dirk

## Some Thoughts on Surround Sound

The way I seem to understand modern 5.1 Surround Sound, there exists a complete stereo signal, which for the sake of legacy compatibility, is still played directly to the front-left and the front-right speaker. But what also happens, is that a third signal is picked up, which acts as the surround channel, in a way that neither favors the left nor the right asymmetrically.

I.e., if people were to try to record this surround channel as being a sideways-facing microphone component, by its nature its positive signal would either favor the left or the right channel, and this would not count as a correct surround-sound mike. In fact, such an arrangement can best be used to synthesize stereo, out of geometries which do not really favor two separate mikes, one for left and one for right.

But, a single, downward-facing, HQ mike would do as a provider of surround information.

If the task becomes, to carry out a stereo mix-down of a surround signal, this third channel is first phase-shifted 90 degrees, and then added differentially between the left and right channels, so that it will interfere least with stereo sound.
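For a single frequency component, this mix-down can be sketched with complex phasors, where multiplying by `1j` advances the phase by 90 degrees. The function name and the gain of 0.7071 are illustrative assumptions, not values taken from any standard quoted here.

```python
def downmix_with_surround(left, right, surround, gain=0.7071):
    """Phasor sketch of the stereo mix-down for one frequency component.

    The surround phasor is rotated by 90 degrees, then added to the left
    channel and subtracted from the right (i.e. added differentially).
    """
    shifted = 1j * surround
    return left + gain * shifted, right - gain * shifted

# A centered front source (equal in left and right) plus a surround
# component that happens to be in phase with it.
lt, rt = downmix_with_surround(1 + 0j, 1 + 0j, 1 + 0j)

mono_sum = lt + rt                 # surround cancels in the mono sum
level_left, level_right = abs(lt), abs(rt)
```

Because the surround part sits in quadrature with the front signal, both output channels end up with the same magnitude, so the stereo image is not pulled to one side. Without the 90 degree shift, the same differential addition would give 1.7071 on the left and 0.2929 on the right.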

In the case where such a mixed-down, analog stereo signal needs to be decoded into multi-speaker surround again, the main component of “Pro Logic” does a balanced summation of the left and right channels, producing the center channel, but at the same time a subtraction is carried out, which is sent rearward.
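The sum and difference described here can be written out per sample. This is only a minimal sketch of the passive matrix step, with a hypothetical function name; it leaves out the phase shift and any steering logic.

```python
def passive_decode(lt, rt):
    """Return (center, rear) from two equal-length stereo sample lists.

    center is the balanced sum, rear is the difference sent rearward.
    """
    center = [0.5 * (l + r) for l, r in zip(lt, rt)]
    rear = [0.5 * (l - r) for l, r in zip(lt, rt)]
    return center, rear

# A source panned dead center appears only in the sum.
center_a, rear_a = passive_decode([1.0, -0.5, 0.25], [1.0, -0.5, 0.25])

# A purely differential signal appears only in the difference.
center_b, rear_b = passive_decode([1.0, -0.5, 0.25], [-1.0, 0.5, -0.25])
```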

The advantage which Pro Logic II has over I, is that this summation first adjusts the relative gain of both input channels, so that the front-center channel has zero correlation with the rearward surround information, which has presumably been recovered from the adjusted stereo as well.
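One way to see how a gain adjustment can force zero correlation: if each input is first scaled to the same RMS level, the inner product of the sum and the difference reduces to the difference of the two scaled powers, which is zero. The sketch below assumes that simple power-equalizing rule and hypothetical function names; it is not claimed to be the algorithm Dolby actually uses.

```python
import math

def rms(samples):
    """Root-mean-square level of a sample list."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))

def balanced_sum_and_difference(lt, rt):
    """Scale each input to unit RMS, then form the sum and the difference."""
    g_left, g_right = 1.0 / rms(lt), 1.0 / rms(rt)
    total = [g_left * l + g_right * r for l, r in zip(lt, rt)]
    diff = [g_left * l - g_right * r for l, r in zip(lt, rt)]
    return total, diff

def inner_product(a, b):
    """Zero-lag correlation (unnormalized) between two sample lists."""
    return sum(x * y for x, y in zip(a, b))

lt = [1.0, -2.0, 0.5, 3.0]   # louder left channel
rt = [0.2, 0.1, -0.4, 0.3]   # quieter right channel

front, rear = balanced_sum_and_difference(lt, rt)
balanced = inner_product(front, rear)

# Without the gain adjustment, the louder channel leaks into both outputs.
unbalanced = inner_product([l + r for l, r in zip(lt, rt)],
                           [l - r for l, r in zip(lt, rt)])
```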

Now, an astute reader will recognize, that if the surround-sound thus recovered, was ‘positive facing left’, its addition to the front-left signal will produce the rear-left signal favorably. But then the thought could come up, ‘How does this also derive a rear-right channel?’ The reason for which this question can arise, is the fact that a subtraction has taken place within the Pro Logic decoder, which is either positive when the left channel is more so, or positive when the right channel is more so.

(Edit 02/15/2017 : The less trivial answer to this question is, A convention might exist, by which the left stereo channel was always encoded as delayed 90 degrees, while the right could always be advanced, so that a subsequent 90 degree phase-shift when decoding the surround signal can bring it back to its original polarity, so that it can be mixed with the rear left and right speaker outputs again. The same could be achieved, if the standard stated, that the right stereo channel was always encoded as phase-delayed.

However, the obvious conclusion of that would be, that if the mixed-down signal was simply listened to as legacy stereo, it would seem strangely asymmetrical, which we can observe does not happen.

I believe that when decoding Pro Logic, the recovered Surround component is inverted when it is applied to one of the two Rear speakers. )

But what the reader may already have noticed, is that if he or she simply encodes the mixed-down stereo into an MP3 File, later attempts to use a Pro Logic decoder are for naught, and that some better means must exist to encode surround-sound onto DVDs or otherwise, into compressed streams.

Well, because I have exhausted my search for any way to preserve the phase-accuracy, at least within highly-compressed streams, the only way in which this happens, which makes any sense to me, is if in addition to the ‘joint stereo’, which provides two channels, a 3rd channel was multiplexed into the compressed stream, which as before, has its own set of constraints, for compression and expansion. These constraints can again minimize the added bit-rate needed, let us say because the highest frequencies are not thought to contribute much to human directional hearing…

(Edit 02/15/2017 :

Now, if a computer decodes such a signal, and recognizes that its sound card is only in stereo, the actual player-application may do a stereo mix-down as described above, in hopes that the user has a Pro Logic II-capable speaker amp. But otherwise, if the software recognizes that it has 4.1 or 5.1 channels as output, it can do the reconstruction of the additional speaker-channels in software, better than Pro Logic I did it.

I think that the default behavior of the AC3 codec when decoding, if the output is only specified to consist of 2 channels, is to output legacy stereo only.

The approach that some software might take, is simply to put two stages in sequence: first, AC3 decoding with 6 output channels; secondly, mixing down the resulting channels to stereo in a standard way, such as with a fixed matrix. This might not be as good for movie-sound, but would be best for music.


1.0   0.0
0.0   1.0
0.5   0.5
0.5   0.5
+0.5  -0.5
-0.5  +0.5
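
Applying such a fixed matrix is one multiply-and-add per channel. In the sketch below, each row is taken to be one of the six decoded channels and each column one stereo output; the row order used in the comments (front-left, front-right, center, LFE, rear-left, rear-right) is an assumption, since the matrix above is not labeled.

```python
# Rows: one per decoded channel; columns: (left out, right out).
# The channel order in the comments is assumed, not given by the source.
MATRIX = [
    (1.0, 0.0),    # front-left
    (0.0, 1.0),    # front-right
    (0.5, 0.5),    # center
    (0.5, 0.5),    # LFE
    (0.5, -0.5),   # rear-left, added differentially
    (-0.5, 0.5),   # rear-right, added differentially
]

def downmix(frame):
    """Mix one frame of six samples down to a (left, right) pair."""
    left = sum(s * row[0] for s, row in zip(frame, MATRIX))
    right = sum(s * row[1] for s, row in zip(frame, MATRIX))
    return left, right

center_only = downmix((0, 0, 1, 0, 0, 0))     # lands equally in both outputs
rear_left_only = downmix((0, 0, 0, 0, 1, 0))  # lands with opposite signs
both_rears = downmix((0, 0, 0, 0, 1, 1))      # identical rear signals cancel
```

One consequence of the last two rows is that a signal present identically in both rear channels cancels out of the stereo pair altogether.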



If we expected our software to do the steering, then we might also expect that the software do the 90° phase-shift in the time-domain, rather than in the frequency-domain. And this option is really not feasible in a real-time context.

The AC3 codec itself would need to be capable of 6-channel output. There is really no blind guarantee, that a 6-channel signal is communicated from the codec to the sound system, through an unknown player application... )

(Edit 02/15/2017 : One note which should be made on this subject, is that the type of matrix which I suggested above might work for Pro Logic decoding of the stereo, but that if it does, it will not be heard correctly on headphones.

The separate subject exists, of ‘Headphone Spatialization’, and I think this has become relevant in modern times.

A matrix approach to Headphone Spatialization would assume that the 4 elements of the output vector, are different from the ones above. For example, each of the crossed-over components might be subject to some fixed time-delay, which is based on the Inter-Aural Delay, after it is output from the matrix, instead of awaiting a phase-shift… )
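A fixed time-delay on the crossed-over components could look roughly like the following. The function name, the delay of 30 samples (about 0.6 ms at 48 kHz) and the cross gain of 0.5 are all illustrative assumptions; this is only a crude stand-in for real headphone spatialization.

```python
def crossfeed_with_delay(left, right, delay_samples=30, cross_gain=0.5):
    """Each ear gets its own channel plus a delayed, attenuated copy
    of the opposite channel (the crossed-over component)."""
    out_left, out_right = [], []
    for i in range(len(left)):
        j = i - delay_samples
        # Before the delay has elapsed, nothing has crossed over yet.
        cross_from_right = right[j] if j >= 0 else 0.0
        cross_from_left = left[j] if j >= 0 else 0.0
        out_left.append(left[i] + cross_gain * cross_from_right)
        out_right.append(right[i] + cross_gain * cross_from_left)
    return out_left, out_right

# A single impulse in the left channel only.
impulse = [1.0] + [0.0] * 39
silence = [0.0] * 40
ear_left, ear_right = crossfeed_with_delay(impulse, silence)
```

The left ear hears the impulse immediately, while the right ear hears a quieter copy 30 samples later, which is the kind of cue the Inter-Aural Delay provides.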

(Edit 03/06/2017 : After much thought, I have come to the conclusion that there must exist two forms of the Surround channel, which are mutually-exclusive.

There can exist a differential form of the channel, which can be phase-shifted 90° and added differentially to the stereo.

And there can exist a common-mode, non-differential form of it, which either correlates more with the Left stereo or with the Right stereo.

For analog Surround – aka Pro Logic – the differential form of the Surround channel would be used, as it would for compressed files.

But when an all-in-one surround-mike is implemented on a camcorder, this originally provides a common-mode Surround-channel. And then it would be up to the audio system of the camcorder, to provide steering, according to which this channel either correlates more with the front-left or the front-right. As a result of that, a differential surround channel can be derived. )

(Updated 11/20/2017 : )