I feel that standards need to be reestablished.

When 16-bit / 44.1kHz Audio was first developed, it implied a very capable system for representing high-fidelity sound. But I think that today, we live in a pseudo-16-bit era. Manufacturers have taken 16-bit components, but designed devices which do not deliver the full power or quality of what this format once promised.

It might be a bit of an exaggeration, but I would say that out of those indicated 16 bits of precision, the last 4 are not accurate. And one main reason this has happened is compressed sound. Admittedly, signal compression – which is often a euphemism for data reduction – is necessary in some areas of signal processing. But when the first forms of data-reduction were devised, the reason for which they were applied to sound had more to do with dialup-modems and their lack of signal-speed, and with the need to be able to download songs onto small amounts of HD space, than with any other purpose.

Even though compressed streams caused this, I would not say that the solution lies in getting rid of compressed streams. But I think that a necessary part of the solution would be consumer awareness.

If I tell people that I own a sound device, that it uses 2x over-sampling, but that I fear the interpolated samples are simply generated as a linear interpolation of the two adjacent, original samples, and if those people answer “So what? Can anybody hear the difference?”, then this is not an example of consumer awareness. I can hear the difference between very-high-pitch sounds that are approximately correct, and ones which are greatly distorted.
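To put a rough number on that fear, here is a minimal sketch of what plain linear interpolation would do to a single tone during 2x over-sampling. It assumes the device does nothing more than average the two adjacent original samples, which is my worry and not a documented fact about any particular product. The function name and the 44.1kHz default are my own choices for illustration.

```python
import math

def linear_interp_2x_gains(freq_hz, sample_rate_hz=44100.0):
    """Return (passband_gain, image_gain) for 2x over-sampling in which each
    new sample is the average of the two adjacent original samples.

    Assumption: inserting zeros and filtering with the kernel [0.5, 1.0, 0.5]
    is equivalent to that linear interpolation. The kernel's response at the
    2x rate is 1 + cos(w), and zero insertion halves every component.
    """
    w = math.pi * freq_hz / sample_rate_hz   # radians per sample at the 2x rate
    passband = (1.0 + math.cos(w)) / 2.0     # gain applied to the wanted tone
    image = (1.0 - math.cos(w)) / 2.0        # gain left on the mirror tone at fs - f
    return passband, image

if __name__ == "__main__":
    for f in (1000.0, 10000.0, 18000.0):
        p, i = linear_interp_2x_gains(f)
        print(f"{f:7.0f} Hz: wanted tone x{p:.3f}, mirror tone x{i:.3f}")
```

Under these assumptions an 18kHz tone keeps only about 0.64 of its amplitude, while a mirror tone near 26.1kHz is left at roughly 0.36 of the original amplitude. A 1kHz tone passes almost untouched, which is why the shortcut is easy to overlook.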

Also, if we were to accept for a moment that out of the indicated 16 bits, only the first 12 are accurate, but there exist sound experts who tell us that by dithering the least-significant bit, we can extend the dynamic range of this sound beyond 96dB, then I do not really believe that those experts know any less about digital sound. Those experts have just remained so entirely surrounded by their high-end equipment, that they have not yet noticed the standards slip, in other parts of the world.
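For reference, the 96dB figure follows from the usual rule that each bit of sample-depth contributes about 6dB. The short sketch below only computes the ratio between full scale and one quantization step; it ignores dither and noise shaping, and the function name is my own.

```python
import math

def dynamic_range_db(bits):
    """Ratio between full scale and one quantization step, in decibels."""
    return 20.0 * math.log10(2 ** bits)

if __name__ == "__main__":
    for bits in (12, 16, 24):
        print(bits, "bits:", round(dynamic_range_db(bits), 1), "dB")
```

By this measure, a device that only delivers 12 accurate bits would offer about 72dB, not 96dB.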

Also, I do not believe that the answer to this problem lies in consumers downloading 24-bit, 192kHz sound-files, because my assumption would again be, that only a few of those indicated 24 bits will be accurate. I do not believe Humans hear ultrasound. But I think that with great effort, we may be able to hear 15-18kHz sound from our actual playback devices again – in the not-so-distant future.


My Opinion on the Opinion of Chris “Monty” Montgomery

Chris Montgomery is the Audio Expert, who invented the OGG Vorbis codec. That gives enough reason to accredit him with good advice. I recommend that my readers read his advice here.

I did read the whole thing, but have three comments on it:

  1. The Author suggests that 16-bit sample-depth offers a de-facto solution to the limits in dynamic range, simply due to the correct application of dithering. If I cannot trust my hardware to perform correct low-pass filtering, why on Earth would I trust it to perform correct, 16-bit, audio dithering?
  2. The Author explains the famous loudness curves, that define threshold of perceptibility, as well as the higher threshold of pain. What he fails to point out is that these curves assume, that the sound being tested for, is the only sound being played over the headphones. If there is another, background sound being played – i.e. the current loudness-level already higher than zero – then the threshold of perception for a given test-sound, is higher – requires a higher level, for that test-sound itself to be heard. Yet, this level is still lower, than the peak level of the background sound. People who design codecs know this, as I am sure the author does. It is the threshold of perceptibility next to a background sound – not the absolute threshold – which gets used in the design of codecs.
  3. The Author suggests it would be a misuse of his codec, to encode discrete multi-channel sound. And one reason he states, would be the waste in file-size, while the next reason he states, would be the fact that sound jumps to the nearest speaker, when they are all encoded that way.

This last observation strikes a chord with me. I have already noticed, that OGG Files do allow numerous channels to be encoded in parallel, but that if we exceed 2, we lose the benefits of Joint Stereo. By itself, this does not really count against this Author, whose codec therefore does not offer explicit surround-sound. But the possibility is very real, that the localization of sound will jump to the nearest speaker, if the listener moves and the sound was encoded that way. It is entirely possible, that purposeful encoding of surround-sound by the (competing) AC3 or the AAC codecs, reduces this risk.

But then I would suggest an alternative approach, to people who do not want to use the proprietary codecs, yet who wish to encode their movies with surround.

There exists the Steve Harris LADSPA plug-in library, which includes a matrix encoder for Pro Logic. This matrix encoder accepts 4 input channels, one of which is the surround channel, and outputs 2 stereo channels.
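To illustrate the general idea of folding 4 channels into 2, here is a simplified sketch of a passive sum-and-difference matrix. This is not the Steve Harris plug-in's code, and it is not a faithful Pro Logic encoder: a real one also applies a 90-degree phase shift and band-limits the surround channel, both of which I leave out. The -3dB gain and the function names are my own assumptions.

```python
import math

G = 1.0 / math.sqrt(2.0)   # assumed -3 dB mixing gain for center and surround

def matrix_encode(left, right, center, surround):
    """Fold four channels (equal-length lists of floats) into two.

    Simplification: the surround is mixed in with opposite signs,
    instead of the phase shift a real encoder would use.
    """
    lt = [l + G * c - G * s for l, c, s in zip(left, center, surround)]
    rt = [r + G * c + G * s for r, c, s in zip(right, center, surround)]
    return lt, rt

def passive_decode(lt, rt):
    """Estimate center and surround as the scaled sum and difference."""
    center = [G * (a + b) for a, b in zip(lt, rt)]
    surround = [G * (b - a) for a, b in zip(lt, rt)]
    return center, surround

if __name__ == "__main__":
    silence = [0.0, 0.0]
    lt, rt = matrix_encode(silence, silence, silence, [1.0, -0.5])
    print(passive_decode(lt, rt))   # center stays near zero, surround comes back
```

In this simplified form, a signal that exists only in the surround channel comes back out of the difference, and leaves nothing in the sum. Signals panned hard left or right leak into both estimates, which is the price of a passive matrix.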

Further, the circuitry must exist someplace as well, to accept 2 stereo, 1 center and 1 surround-channel, and to encode those in real-time, so that the surround-effect can be played back over 6 speakers.

  • In principle, it should be possible to OGG-compress 4 channels and not 6, so that these channels can be used as inputs, to a matrix surround-system, like to the LADSPA plug-in, so that listenable surround will emanate from all speakers. Does Audio Software exist, which applies the LADSPA plug-in in real-time?
  • Alternatively, it might be possible to mix down Pro Logic sound into Stereo using the Steve Harris plug-in, and then to use FLAC on the resulting stereo.

BTW: What the Author mainly writes, is how incorrect it would be for pure listeners, to download their music in 24/192 format. He does not actually write, that Music / Sound Authors should avoid recording in this format. And so one fact which I have observed, is that there exists a lot of Audio Software – such as – that stores its sound in some higher, internal format, but which, when instructed to Export that to a 16-bit format, offers Dithering as an option.

This is possible because the Application is numeric and not physical. Thus, if I had used my USB-sound-device to record in 24-bit, I could next Export the finished sound tracks to 16-bit:

[Image: ardour_klystr_6]
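What such an export step does numerically can be sketched as follows. This is only an illustration of triangular (TPDF) dither applied before rounding, under the assumption that the 24-bit samples are plain signed integers. It is not the code of any particular application, and the function name is hypothetical.

```python
import math
import random

def to_16bit_with_tpdf_dither(samples_24bit, rng=None):
    """Requantize signed 24-bit integer samples to signed 16-bit integers,
    adding triangular (TPDF) dither before rounding."""
    rng = rng or random.Random()
    step = 1 << 8                      # one 16-bit step, measured in 24-bit units
    out = []
    for s in samples_24bit:
        # The sum of two uniform values has a triangular distribution
        # spanning +/- one 16-bit step.
        dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        q = math.floor(s / step + dither + 0.5)    # round to the nearest 16-bit code
        out.append(max(-32768, min(32767, q)))     # clip to the 16-bit range
    return out

if __name__ == "__main__":
    # A constant level of half a 16-bit step: plain rounding would always
    # give the same code, while dither preserves the level on average.
    result = to_16bit_with_tpdf_dither([128] * 20000, random.Random(1))
    print(sum(result) / len(result))
```

The point of the example is that a level lying between two 16-bit codes is not lost. It survives as the average of the dithered output, at the cost of a small amount of added noise.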

 

But it would also seem that Chris Montgomery regards the use of such technology as only being suited for Professionals. I am not a professional, and do not have the extremely expensive tools they do. Yet, I am able to author sound-projects.

Dirk