An observation about some purchased FLAC Files.

One of the ideas which I’ve blogged about often – a pet peeve of mine – is how lossy compression is not inaudible, although some people have claimed it is, and how its use degrades the final quality of modern, streamed or downloaded music.

And so if this is taken to be true for the moment, a question can arise as to what the modern methods are, to purchase High-Fidelity, Classical Music after all. One method could be, only to purchase Audio CDs that were mastered in the 1990s. But then, the eventual problem becomes, that even the best producers may not be mastering new recordings in that format anymore, in the year 2019. We may be able to purchase famous recordings made in the 1990s, but none from later, depending on what, exactly, our needs are. But, an alternative method exists to acquire such music today, especially to acquire the highest quality of Classical music recorded recently.

What people can do is to purchase and download the music in 16-bit, FLAC-compressed format. Ideally, this form of compression should not insert any flaws into the sound on its own. The sound could still be lacking in certain ways, but if it is, then this will be because the raw audio was flawed, before it was even compressed. By definition, lossless compression decompresses exactly to what was present, before the sound was compressed.

I have just taken part in such a transaction, and downloaded Gershwin’s Rhapsody In Blue, in 16-bit FLAC Format. But I made an interesting observation. The raw 16-bit stereo audio at a sample-rate of 44.1kHz, would take up just over 1.4Mbps (44,100 × 16 × 2 = 1,411,200 bits per second). When I’ve undertaken to FLAC-compress such recordings myself, I’ve never been able to achieve a ratio much better than 2:1. Hence, I should not be able to achieve bit-rates much lower than 700kbps. But the recording of Gershwin which I just downloaded, achieves 561kbps. This is a piece in which a piano and a clarinet feature most prominently, and, in this version, also some muted horns. And yet, the overall sound quality of the recording seems good. So what magic might be employed by the producers, to result in smaller FLAC Files?
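The arithmetic behind those figures can be checked with a few lines of Python. This is only a sketch: it assumes the recording is stereo (which is what the 1.4Mbps figure implies), and the function name is made up for the illustration.

```python
# Raw PCM bit-rate for CD-quality audio, and the compression ratio implied
# by the bit-rate of the purchased file. Stereo (2 channels) is an assumption.

def raw_bitrate_bps(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM bit-rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

cd_raw = raw_bitrate_bps(44_100, 16, 2)      # 1,411,200 bits per second
at_two_to_one = cd_raw / 2                   # what a 2:1 ratio would yield
purchased_bps = 561_000                      # bit-rate quoted for the download
implied_ratio = cd_raw / purchased_bps       # roughly 2.5:1

print(f"raw: {cd_raw / 1000:.1f} kbps")
print(f"at 2:1: {at_two_to_one / 1000:.1f} kbps")
print(f"implied ratio at 561 kbps: {implied_ratio:.2f}:1")
```

So the purchased file corresponds to a ratio of roughly 2.5:1, rather than the 2:1 which I had come to expect.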

(Updated 8/27/2019, 14h45 … )


Linear Predictive Coding

Linear Predictive Coding is a method of taking a constant number of known sample-values that precede an unknown sample-value, finding a coefficient for each of the preceding sample-values, multiplying each sample-value by its coefficient, and summing the products, to derive the most-probable following sample-value.
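As a minimal sketch of just the prediction step, the weighted sum could look like the following. The function name, the sample values and the coefficients are all hypothetical, and listing the coefficients oldest-first is a convention assumed only for this example.

```python
def lpc_predict(history, coeffs):
    """Predict the next sample from the last len(coeffs) samples.

    coeffs are listed oldest-first (an assumption of this sketch), so
    coeffs[-1] multiplies the most recent known sample.
    """
    order = len(coeffs)
    if len(history) < order:
        raise ValueError("need at least as many samples as coefficients")
    recent = history[-order:]
    return sum(c * x for c, x in zip(coeffs, recent))

samples = [3, 5, 8, 12]
coeffs = [0.25, 0.75]    # hypothetical values, chosen only for illustration
prediction = lpc_predict(samples, coeffs)   # 0.25 * 8 + 0.75 * 12
print(prediction)
```

Only the last two samples are used here, because two coefficients were supplied.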

More specifically, while the exercise of applying these coefficients should not require much explaining, methods for deriving them do.

Finding these coefficients is also called an Auto-Correlation, because the following sample is part of the same sequence, that the preceding samples belong to.

Even though LPC travels along the stream of values, each preceding position relative to the current sample to predict is treated as having a persistent meaning, and for the sake of simplicity I’ll be referring to each preceding sample-position as One Predictor.

If the Linear Predictive Coding is only to be of the 2nd order, thus taking into account 2 Predictors, then it will often be simple to use a fixed system of coefficients.

In this case, the coefficients should be { -1, +2 }, applied to the older and to the more recent of the two preceding samples respectively, which will successfully predict the continuation of a straight line, and nothing else.

One fact about a set of coefficients is, that their sum should be equal to 1, in order to predict a DC value correctly.
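Both statements can be tried out numerically. The sketch below assumes that the coefficients are listed oldest-first, so that { -1, +2 } means 2 × (most recent sample) − (the sample before it). The helper name and the test signals are made up for the illustration.

```python
def predict_next(history, coeffs):
    """Weighted sum of the last len(coeffs) samples, coefficients oldest-first."""
    recent = history[-len(coeffs):]
    return sum(c * x for c, x in zip(coeffs, recent))

second_order = [-1, 2]            # the fixed 2nd-order set from the text

line = [10, 13, 16, 19]           # a straight line with a slope of 3
next_on_line = predict_next(line, second_order)      # -16 + 2 * 19

dc = [7, 7, 7, 7]                 # a constant (DC) signal
dc_prediction = predict_next(dc, second_order)       # -7 + 2 * 7

# For a constant input, the prediction is the level times the sum of the
# coefficients, which is why that sum needs to equal 1.
bad_set = [-1, 3]                 # sums to 2, deliberately wrong
bad_dc_prediction = predict_next(dc, bad_set)

print(next_on_line, dc_prediction, bad_dc_prediction)
```

The deliberately wrong set doubles the DC level, which is the failure that a coefficient sum of 1 avoids.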

( If the intent was to use a set of 3 predictors, to conserve both the 1st and the 2nd derivatives of a curve, then the 3 coefficients should automatically be { +1, -3, +3 } . But, what’s needed for Signal Processing is often not, what’s needed for Analytical Geometry. )
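A quick way to convince oneself of the 3-predictor case is to feed it a parabola. This sketch again assumes the coefficients are listed oldest-first, and the polynomial is an arbitrary one picked for the test.

```python
third_order = [1, -3, 3]          # oldest-first, as assumed in this sketch

# Samples of an arbitrary quadratic, 2n^2 - 3n + 5, for n = 0 .. 6
curve = [2 * n * n - 3 * n + 5 for n in range(7)]

residuals = []
for n in range(3, len(curve)):
    recent = curve[n - 3:n]       # the three samples preceding position n
    predicted = sum(c * x for c, x in zip(third_order, recent))
    residuals.append(curve[n] - predicted)

print(curve)
print(residuals)                  # every residual is zero for a quadratic
```

These 3 coefficients also sum to 1, so that a DC value is still predicted correctly.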

But for orders of LPC greater than 3, the determination of the coefficients is anything but trivial. In order to understand how these coefficients can be computed, one must first understand a basic concept in Statistics called a Correlation. A correlation supposes that a set of X-value and Y-value pairs exists, which could have any values for both X and Y, but that Y is supposed to follow from X, according to a linear equation, such that

Y = α + β X

Quite simply, the degree of correlation is the ideal value for β, which achieves the closest-fitting set of predicted Y-values, given hypothetical X-values.
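For readers who prefer code to formulae, here is a minimal least-squares sketch which finds α and β by way of deviations from the means. The function name and the data are hypothetical, and this is the textbook method, not necessarily the arrangement used in the fact-sheet.

```python
def fit_line(xs, ys):
    """Least-squares fit of Y = alpha + beta * X, using deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    beta = sum_xy / sum_xx
    alpha = mean_y - beta * mean_x
    return alpha, beta

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]         # made-up data lying exactly on Y = 1 + 2X
alpha, beta = fit_line(xs, ys)
print(alpha, beta)
```

With data that lie exactly on a line, the fit recovers the intercept and the slope of that line.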

The process of trying to compute this ideal value for β is also called Linear Regression Analysis, and I keep a fact-sheet about it:

Fact Sheet on Regression Analyses.

This little sheet actually describes Non-Linear Regression Analysis at the top, using a matrix which states the polynomial terms of X, but it goes on to show the simpler examples of Linear Regression afterward.

There is a word of commentary to make, before understanding correlations at all. Essentially, they exist in two forms:

  1. There is a form, in which the sum of the products of the deviations of X and Y is divided by the variance of X, before being divided by the number of samples.
  2. There is a form, in which the sum of the products of the deviations of X and Y is divided by the square root, of the product of the variance of X and the variance of Y, before being divided by the number of samples.

The variance of any data-set is also its standard deviation squared. And essentially, there are two ways to deal with the possibility of non-zero Y-intercepts – non-zero values of α. One way is to compute the mean of X, and to use the deviations of individual values of X from this mean, as well as to find the corresponding mean of Y, and to use deviations of individual values of Y from this mean.

Another way to do the Math, is what my fact-sheet describes.

Essentially, Form (1) above treats Y-values as following from known X-values, and is easily capable of indicating amounts of correlation greater than 1.

Form (2) finds how similar the X- and Y-values are, symmetrically, and should never produce correlations greater than 1.
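The difference between the two forms shows up as soon as Y is a scaled copy of X. The following sketch computes both, using the population variance (dividing by the number of samples); the function names and the data are made up for the illustration.

```python
import math

def deviations(values):
    """Deviations of each value from the mean of the set."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def form_1(xs, ys):
    """Sum of deviation products, over variance of X, over the sample count."""
    n = len(xs)
    dx, dy = deviations(xs), deviations(ys)
    var_x = sum(d * d for d in dx) / n
    return sum(a * b for a, b in zip(dx, dy)) / var_x / n

def form_2(xs, ys):
    """Sum of deviation products, over sqrt(var X * var Y), over the sample count."""
    n = len(xs)
    dx, dy = deviations(xs), deviations(ys)
    var_x = sum(d * d for d in dx) / n
    var_y = sum(d * d for d in dy) / n
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(var_x * var_y) / n

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]        # Y = 3X exactly

slope = form_1(xs, ys)            # 3: how strongly Y follows from X
similarity = form_2(xs, ys)       # 1: perfectly correlated, and no more
print(slope, similarity)
```

Form (1) reports the factor of 3, while Form (2) only reports that the two sets are perfectly similar.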

For LPC, Form (2) is rather useless, and the mean of a set of predictors must be found anyway, so that individual deviations from this mean are also the easiest values to compute with.

The main idea when this is to become an autocorrelation, is that the correlation of the following sample is computed individually, as if it was one of the Y-values, as following each predictor, as if that was just one of the X-values. But it gets just a touch trickier…

(Last Edited 06/07/2017 … )


How certain signal-operations are not convolutions.

One concept that exists in signal processing, is that there could be a definition of a filter, which is based in the time-domain, and that this definition can resemble a convolution. And yet, a derived filter could no longer be expressible perfectly as a convolution.

For example, the filter in question might add reverb to a signal recursively. In the frequency-domain, the closer two frequencies are, which need to be distinguished, the longer the interval is in the time-domain, which needs to be considered before an output sample is computed.

Well, reverb that is recursive would need to be expressed as a convolution with an infinite number of samples. In the frequency-domain, this would result in sharp spikes instead of smooth curves.
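A small sketch can show why the equivalent convolution would be infinitely long. It assumes the simplest possible recursive echo, y[n] = x[n] + gain × y[n − delay], with a delay of 4 samples and a gain of 0.5, both of which are hypothetical values.

```python
def feedback_echo(signal, delay, gain):
    """Recursive echo: y[n] = x[n] + gain * y[n - delay]."""
    out = []
    for n, x in enumerate(signal):
        fed_back = gain * out[n - delay] if n >= delay else 0.0
        out.append(x + fed_back)
    return out

# Feed in a single impulse and look at what comes out.
impulse = [1.0] + [0.0] * 12
response = feedback_echo(impulse, delay=4, gain=0.5)

# Echoes appear at samples 0, 4, 8, 12, ... with amplitudes 1, 0.5, 0.25, ...
echoes = [response[n] for n in range(0, len(response), 4)]
print(echoes)
```

Each echo is half the size of the previous one but never exactly zero, so no finite number of convolution samples reproduces the filter perfectly.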

For example, if the time-constant of the reverb was 1/4 millisecond, a 4kHz sine-wave would complete one full cycle within this interval, while a 2kHz sine-wave would be inverted in phase by 180°. What this can mean is that a representation in the frequency-domain may simply have maxima and minima, that alternate every 2kHz. The task might never be undertaken to make the effect recursive.
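The alternating maxima and minima can be computed directly for the non-recursive case. This sketch assumes a single echo, y(t) = x(t) + g × x(t − 1/4 ms), with a hypothetical echo gain g of 0.5.

```python
import cmath
import math

def single_echo_gain(freq_hz, delay_s, echo_gain):
    """Magnitude response of y(t) = x(t) + echo_gain * x(t - delay_s)."""
    phase = -2.0 * math.pi * freq_hz * delay_s
    return abs(1.0 + echo_gain * cmath.exp(1j * phase))

delay_s = 0.25e-3                 # the 1/4 millisecond from the text
echo_gain = 0.5                   # hypothetical value

for freq_hz in (0, 2000, 4000, 6000, 8000):
    gain = single_echo_gain(freq_hz, delay_s, echo_gain)
    print(f"{freq_hz:>5} Hz: {gain:.3f}")
```

At 0, 4 and 8kHz the echo adds in phase, and at 2 and 6kHz it is subtracted, which gives the smooth alternation every 2kHz.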

(Last Edited on 02/23/2017 … )


A Note on FLAC-Compressing 24-bit

One note which I had commented about before my blog began, was that if authors decide to capture sound at 96k samples/second, the resulting sound should compress well using FLAC.

But now that I have experimented with ‘QTractor’ and an external sound card, I have realized that we will probably also be capturing that sound in 24-bit sample-format, instead of 16-bit. And the sad fact is, that FLAC will not compress the 24-bit format as well, as it did 16-bit.

The reason seems clear. Using ‘Linear Predictive Coding’ means that FLAC will be able to predict the next sample in a set of so-many, to maybe 8 bits of precision, except that the next sample will always deviate from this prediction by a small residual. So 8-bit sound should compress brilliantly.

But then with 16-bit, the accuracy of the encoding stays the same. So again, the ‘LPC’ is really only 8-bits accurate at best, meaning that we get a larger residual. The size of that residual is what makes up most of a FLAC File.

Well at 24-bit, again, the LPC will only predict the next sample, accurately to within 8 bits. And so the residual is likely to take up twice as many bits, as it did with 16-bit, completing 24-bit accuracy this time. We are not left with much compression then.
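To put rough numbers to this reasoning, here is a deliberately crude model in Python. It assumes the prediction always accounts for exactly 8 bits and that the residual costs all of the remaining bits, ignoring headers, stored coefficients and the entropy coding of the residual, so it illustrates the argument and is not a description of what FLAC actually achieves.

```python
import math

def modelled_ratio(sample_bits, predicted_bits=8):
    """Compression ratio if only the unpredicted bits had to be stored.

    predicted_bits=8 is the rough guess made in the text, not a measured value.
    """
    residual_bits = sample_bits - predicted_bits
    if residual_bits <= 0:
        return math.inf           # nothing left over to store in this model
    return sample_bits / residual_bits

for bits in (8, 16, 24):
    print(f"{bits}-bit: about {modelled_ratio(bits)}:1")
```

Under these assumptions the 16-bit format comes out at 2:1, and the 24-bit format at only 1.5:1.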

When I recorded my 14-second sound session the other day, I selected FLAC as my capture file format. I had a noisy air-conditioner running in the background. Additionally, the compression level defaults to Fastest, because the file needs to be written in real-time, and not chewed on.

At 96 kHz, 24-bit stereo, raw audio will take up about 4.6 Mbps. At 44.1 kHz, 16-bit stereo, raw audio takes up about 1.4 Mbps.

Well I was capturing to a stereo FLAC File, but was only using one channel out of the two. So the FLAC File that resulted, had a bit-rate of 2.3 Mbps. This means that FLAC recognized the silent track and used ‘Run-Length Encoding’ on it, but that was about all this CODEC could do for me.

Now, we do have a command-line tool which will re-compress that file:
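The figures line up as follows; the function name is made up for this sketch, and the only inputs are the sample-rates and bit-depths already quoted.

```python
def raw_bitrate_mbps(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM bit-rate, in millions of bits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1_000_000

hi_res_stereo = raw_bitrate_mbps(96_000, 24, 2)   # about 4.6
hi_res_mono = raw_bitrate_mbps(96_000, 24, 1)     # about 2.3
cd_stereo = raw_bitrate_mbps(44_100, 16, 2)       # about 1.4

print(f"96 kHz, 24-bit stereo: {hi_res_stereo:.3f}")
print(f"96 kHz, 24-bit, one channel: {hi_res_mono:.3f}")
print(f"44.1 kHz, 16-bit stereo: {cd_stereo:.4f}")
```

The resulting file is therefore about the size of one raw, uncompressed channel, which is why I conclude that the silent channel was the only real saving.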


$ flac -8 infile.flac -o outfile.flac
$ flac -8 infile.flac --channels=1 -o outfile.flac
$ flac -8 infile.flac --channels=1 --blocksize=8192 -o outfile.flac

The -8 means to use maximum compression.

For me, the bit-rate went down to 2.2 Mbps either way.

It beats using a raw format, because using the latter would have meant, nothing would have detected my silent stereo channel, and the file would have been twice as large.

Dirk
