Linear Predictive Coding

Linear Predictive Coding is a method that takes a constant number of known sample-values, which precede an unknown sample-value, and finds one coefficient for each of them, such that multiplying each preceding sample-value by its coefficient, and summing the products, yields the most-probable following sample-value.

Applying these coefficients should not require much explaining, but the methods for deriving them do.

Finding these coefficients is also called an Auto-Correlation, because the following sample belongs to the same sequence as the preceding samples.

Even though LPC travels along the stream of values, each preceding position relative to the current sample to predict is treated as having a persistent meaning, and for the sake of simplicity I’ll be referring to each preceding sample-position as One Predictor.

If the Linear Predictive Coding is only to be of the 2nd order, thus taking into account 2 Predictors, then it will often be simplest to use a fixed set of coefficients.

In this case, the coefficients should be { -1, +2 }, listed from the older Predictor to the more recent one. In other words, the predicted sample is twice the most recent sample, minus the one before it, which will successfully predict the continuation of a straight line, and nothing else.

One fact about a set of coefficients is, that their sum should be equal to 1, in order to predict a DC value correctly.

( If the intent was to use a set of 3 predictors, to conserve both the 1st and the 2nd derivatives of a curve, then the 3 coefficients should automatically be { +1, -3, +3 } . But, what’s needed for Signal Processing is often not, what’s needed for Analytical Geometry. )
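The fixed coefficient sets above can be checked with a few lines of Python. This is only a minimal sketch: the function name `predict` and the test sequences are my own illustrative choices, and the coefficients are assumed to be listed from the oldest Predictor to the most recent one.

```python
def predict(history, coeffs):
    """Weight the most recent samples by coeffs (oldest first) and sum the products."""
    recent = history[-len(coeffs):]
    return sum(c * x for c, x in zip(coeffs, recent))

ORDER2 = [-1, 2]      # 2 Predictors: continues a straight line
ORDER3 = [1, -3, 3]   # 3 Predictors: also conserves the 2nd derivative

# Illustrative test sequences (assumptions, not taken from any real signal).
line = [3 + 2 * n for n in range(5)]              # 3, 5, 7, 9, 11
parabola = [n * n - 4 * n + 1 for n in range(5)]  # 1, -2, -3, -2, 1

line_next = predict(line, ORDER2)                 # true next value is 13
parabola_next = predict(parabola, ORDER3)         # true next value is 6
parabola_order2 = predict(parabola, ORDER2)       # 2nd order misses the curve

# Both sets sum to 1, so a constant (DC) input is predicted unchanged.
dc_order2 = predict([7, 7, 7], ORDER2)
dc_order3 = predict([7, 7, 7], ORDER3)

print(line_next, parabola_next, parabola_order2, dc_order2, dc_order3)
```

The 2nd-order set gets the straight line right but misses the parabola, while the 3rd-order set follows the parabola exactly. Both reproduce a DC value because their coefficients sum to 1.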

But for orders of LPC greater than 3, the determination of the coefficients is anything but trivial. In order to understand how these coefficients can be computed, one must first understand a basic concept in Statistics called a Correlation. A correlation supposes that an ordered set of X-value and Y-value pairs exists, which could have any values for both X and Y, but that Y is supposed to follow from X, according to a linear equation, such that

Y = α + β X

Quite simply, the degree of correlation is the ideal value for β, which achieves the closest-fitting set of predicted Y-values, given hypothetical X-values.

The process of trying to compute this ideal value for β is also called Linear Regression Analysis, and I keep a fact-sheet about it:

Fact Sheet on Regression Analyses.

This little sheet actually describes Non-Linear Regression Analysis at the top, using a matrix which states the polynomial terms of X, but it goes on to show the simpler examples of Linear Regression afterward.

There is a word of commentary to make, before understanding correlations at all. Essentially, they exist in two forms:

  1. There is a form, in which the sum of the products of the deviations of X and Y is divided by the variance of X, before being divided by the number of samples.
  2. There is a form, in which the sum of the products of the deviations of X and Y is divided by the square root, of the product of the variance of X and the variance of Y, before being divided by the number of samples.

The variance of any data-set is also its standard deviation squared. And essentially, there are two ways to deal with the possibility of non-zero Y-intercepts – non-zero values of α. One way is to compute the mean of X, and to use the deviations of individual values of X from this mean, as well as to find the corresponding mean of Y, and to use deviations of individual values of Y from this mean.

Another way to do the Math, is what my fact-sheet describes.

Essentially, Form (1) above treats Y-values as following from known X-values, and is easily capable of indicating amounts of correlation greater than 1.

Form (2) finds how similar X and Y -values are, symmetrically, and should never produce correlations greater than 1.
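The difference between the two forms can be sketched in Python as follows. This is a minimal illustration that uses deviations from the means, rather than the matrix approach on my fact-sheet, and the function name `correlations` and the sample data are assumptions made up for the example.

```python
import math

def correlations(xs, ys):
    """Return (form1, form2) for paired samples, using deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]

    # Summed products of deviations, divided by the number of samples.
    covariance = sum(a * b for a, b in zip(dev_x, dev_y)) / n
    var_x = sum(a * a for a in dev_x) / n
    var_y = sum(b * b for b in dev_y) / n

    form1 = covariance / var_x                      # the slope, beta
    form2 = covariance / math.sqrt(var_x * var_y)   # symmetric, never above 1
    return form1, form2

# Illustrative data that follows Y = 2 + 3 X exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [5.0, 8.0, 11.0, 14.0]

beta, similarity = correlations(xs, ys)
alpha = sum(ys) / len(ys) - beta * sum(xs) / len(xs)  # the Y-intercept
print(beta, similarity, alpha)
```

With this data, Form (1) yields 3, which is the slope and is plainly greater than 1, while Form (2) yields 1, because the points lie perfectly on a line.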

For LPC, Form (2) is rather useless, and the mean of a set of predictors must be found anyway, so that individual deviations from this mean are also the easiest values to compute with.

The main idea when this is to become an autocorrelation, is that the correlation of the following sample is computed individually, as if it was one of the Y-values, as following each predictor, as if that was just one of the X-values. But it gets just a touch trickier…

(Last Edited 06/07/2017 … )


How certain signal-operations are not convolutions.

One concept that exists in signal processing, is that there could be a definition of a filter, which is based in the time-domain, and that this definition can resemble a convolution. And yet, a derived filter could no longer be expressible perfectly as a convolution.

For example, the filter in question might add reverb to a signal recursively. In the frequency-domain, the closer two frequencies are, which need to be distinguished, the longer the interval is in the time-domain, which needs to be considered before an output sample is computed.

Well, reverb that is recursive would need to be expressed as a convolution with an infinite number of samples. In the frequency-domain, this would result in sharp spikes instead of smooth curves.

I.e., if the time-constant of the reverb was 1/4 millisecond, a 4kHz sine-wave would complete exactly one cycle within this interval, while a 2kHz sine-wave would be inverted in phase 180°. What this can mean is that a representation in the frequency-domain may simply have maxima and minima, that alternate every 2kHz. The task might never be undertaken to make the effect recursive.
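Both cases can be sketched in Python. This is only a minimal illustration, in which I assume that the 'time-constant' is the delay of a single echo, and in which the echo level of 0.5, the delay of 4 samples, and the function names are arbitrary choices of mine.

```python
import cmath
import math

def single_echo_gain(freq_hz, delay_s=0.00025, echo_level=0.5):
    """Gain of a non-recursive filter: the signal plus one delayed, attenuated copy."""
    return abs(1 + echo_level * cmath.exp(-2j * math.pi * freq_hz * delay_s))

def recursive_echo(signal, delay, feedback):
    """Recursive version: each output sample feeds back into later output samples."""
    out = []
    for n, x in enumerate(signal):
        echo = feedback * out[n - delay] if n >= delay else 0.0
        out.append(x + echo)
    return out

# Non-recursive case: maxima and minima alternate every 2 kHz.
gains = {f: single_echo_gain(f) for f in (0, 2000, 4000, 6000, 8000)}

# Recursive case: feed in a single impulse and watch the echoes.
impulse = [1.0] + [0.0] * 19
response = recursive_echo(impulse, delay=4, feedback=0.5)

print(gains)
print(response)
```

The gains come out near 1.5 at 0, 4 and 8 kHz, and near 0.5 at 2 and 6 kHz. The recursive response keeps halving every 4 samples without ever reaching zero, which is why an exact convolution would need an infinite number of samples.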

(Last Edited on 02/23/2017 … )


A Note on FLAC -Compressing 24-bit

One note which I had commented about before my blog began, was that if authors decide to capture sound at 96k samples/second, the resulting sound should compress well using FLAC.

But now that I have experimented with ‘QTractor’ and an external sound card, I have realized that we will probably also be capturing that sound in 24-bit sample-format, instead of 16-bit. And the sad fact is, that FLAC will not compress the 24-bit format as well, as it did 16-bit.

The reason seems clear. Using ‘Linear Predictive Coding’ means that FLAC will be able to predict the next sample in a set of so-many, to maybe 8 bits of precision, except that the next sample will always deviate from this prediction by a small residual. So 8-bit sound should compress brilliantly.

But then with 16-bit, the accuracy of the encoding stays the same. So again, the ‘LPC’ is really only 8-bits accurate at best, meaning that we get a larger residual. The size of that residual is what makes up most of a FLAC File.

Well at 24-bit, again, the LPC will only predict the next sample, accurately to within 8 bits. And so the residual is likely to be twice as large, as it was with 16-bit: about 16 bits instead of about 8, completing 24-bit accuracy this time. We are not left with much compression then.
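That reasoning can be put into numbers with a very rough model. The sketch below simply takes my estimate of 8 predicted bits at face value and ignores everything else which FLAC does, so it is an assumption-laden illustration, and not a description of the actual encoder. The function name is hypothetical.

```python
PREDICTED_BITS = 8  # rough estimate from the text, not a property of FLAC itself

def residual_fraction(sample_bits, predicted_bits=PREDICTED_BITS):
    """Fraction of each sample that is left over as residual in this crude model."""
    residual_bits = max(sample_bits - predicted_bits, 0)
    return residual_bits / sample_bits

fractions = {bits: residual_fraction(bits) for bits in (8, 16, 24)}
print(fractions)
```

Under these assumptions, nothing is left over at 8-bit, half of each sample is left over at 16-bit, and two thirds at 24-bit.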

When I recorded my 14-second sound session the other day, I selected FLAC as my capture file format. I had a noisy air-conditioner running in the background. Additionally, the compression level defaults to Fastest, because the file needs to be written in real-time, and not chewed on.

At 96 kHz, 24-bit stereo, raw audio will take up about 4.6 Mbps. At 44.1 kHz, 16-bit stereo, raw audio takes up about 1.4 Mbps.
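These figures are just the sample rate, times the bit-depth, times the number of channels. A minimal sketch of the arithmetic, with a function name of my own choosing:

```python
def raw_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit-rate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

hi_res_stereo = raw_bit_rate(96_000, 24, 2)   # 4,608,000 bits/s
cd_stereo = raw_bit_rate(44_100, 16, 2)       # 1,411,200 bits/s
hi_res_mono = raw_bit_rate(96_000, 24, 1)     # 2,304,000 bits/s

print(hi_res_stereo / 1e6, cd_stereo / 1e6, hi_res_mono / 1e6)
```

A single channel at 96 kHz, 24-bit comes to about 2.3 Mbps on its own.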

Well I was capturing to a stereo FLAC File, but was only using one channel out of the two. So the FLAC File that resulted, had a bit-rate of 2.3 Mbps. This means that FLAC recognized the silent track and used ‘Run-Length Encoding’ on it, but that was about all this CODEC could do for me.

Now, we do have a command-line tool which will re-compress that file:

$ flac -8 infile.flac -o outfile.flac
$ flac -8 infile.flac --channels=1 -o outfile.flac
$ flac -8 infile.flac --channels=1 --blocksize=8192 -o outfile.flac

The -8 means to use maximum compression.

For me, the bit-rate went down to 2.2 Mbps either way.

It beats using a raw format, because using the latter would have meant, nothing would have detected my silent stereo channel, and the file would have been twice as large.

