Exploring the Discrete Sine Transform…

I can sometimes describe a way of using certain tools, such as, in this case, one of the Discrete Cosine Transforms, which is correct in principle, but which has an underlying flaw that needs to be corrected from my first approximation of how it can be applied.

One of the things which I had said was possible was to take a series of frequency-domain ‘equalizer settings’, which are spaced at one per unit of frequency and not at so many per octave, to compute whichever DCT was relevant, such that the result had the lowest frequency as its first element, and then to apply that result as a convolution, in order finally to apply the computed equalizer to a signal.

One of the facts which I’m only realizing recently is that, if the DCT is computed in a one-sided way, the results are ‘completely non-ideal’, because this gives no control over what the phase-shifts will be, at any frequency! Similarly, such a one-sided convolution also cannot be applied as the sinc function, because the amount of sine-wave output, in response to a cosine-wave input, will approach infinity when the frequency is actually at the cutoff frequency.

What I have found instead is that, if such a cosine transform is mirrored around a centre-point, the amount of sine response to an input cosine-wave will cancel out and become zero, thus giving phase-shifts of zero.
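The cancellation can be checked numerically. What follows is only a minimal sketch in Python with NumPy, and not the exact procedure I used: the function names, the frequency grid of (k + 0.5) * pi / K, the example gains and the kernel length are all assumptions made for illustration. It measures how much cosine and how much sine comes out for a cosine-wave going in, once with the one-sided kernel and once with the same kernel mirrored around its first element.

```python
import numpy as np

def cosine_taps(gains, half_len):
    """One-sided taps h[0..half_len] from equalizer gains, lowest frequency first.

    The frequency grid (k + 0.5) * pi / K is an assumption made for this sketch.
    """
    gains = np.asarray(gains, dtype=float)
    K = len(gains)
    omegas = np.pi * (np.arange(K) + 0.5) / K
    n = np.arange(half_len + 1)
    return np.cos(np.outer(n, omegas)) @ gains / K

def response_to_cosine(taps, offsets, omega):
    """Cosine and sine amounts in the output, for an input cos(omega * t).

    Follows from y[t] = sum_n h[n] * cos(omega * (t - n)).
    """
    cos_amount = np.sum(taps * np.cos(omega * offsets))
    sin_amount = np.sum(taps * np.sin(omega * offsets))
    return cos_amount, sin_amount

# Hypothetical equalizer settings, one per unit of frequency.
gains = [1.0, 1.0, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0]
half_len = 16
omega = 0.3 * np.pi

half = cosine_taps(gains, half_len)
one_sided = response_to_cosine(half, np.arange(half_len + 1), omega)

# Mirror the taps around the centre-point: h[-n] = h[n].
mirrored_taps = np.concatenate([half[:0:-1], half])
mirrored_offsets = np.arange(-half_len, half_len + 1)
mirrored = response_to_cosine(mirrored_taps, mirrored_offsets, omega)

print("one-sided sine amount:", one_sided[1])
print("mirrored sine amount: ", mirrored[1])
```

With the mirrored kernel, each tap at offset +n is paired with an equal tap at offset -n, and their sine contributions have opposite signs, which is why the sine amount only differs from zero by rounding error.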

But a result which some people might like is, to be able to apply controlled phase-shifts, differently for each frequency, such that those people specify a cosine as well as a sine component, for an assumed input cosine-wave.

The way to accomplish that is to add in the corresponding (normalized) sine-transform of the series of phase-shifted response values, and to observe that the sine-transform will actually be zero at the centre-point. Then, the thing to do is to apply negatively, on the other side of the centre-point, the results which were to be applied positively on one side.
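As a hedged illustration of this, the sketch below builds a kernel from an even (cosine) part plus an odd (sine) part. It assumes a kernel of odd length N = 2M + 1, with the M frequencies placed at 2 * pi * k / N, because on that grid the sines and cosines are exactly orthogonal and the result can be verified. The function names, the grid and the example values are my assumptions, chosen only for illustration.

```python
import numpy as np

def phase_shifting_kernel(cos_part, sin_part):
    """Kernel whose response to cos(w_k * t) is C_k*cos(w_k*t) + S_k*sin(w_k*t).

    Assumes frequencies w_k = 2*pi*k/N for k = 1..M, with N = 2*M + 1 taps.
    """
    cos_part = np.asarray(cos_part, dtype=float)
    sin_part = np.asarray(sin_part, dtype=float)
    M = len(cos_part)
    N = 2 * M + 1
    offsets = np.arange(-M, M + 1)
    omegas = 2.0 * np.pi * np.arange(1, M + 1) / N
    phase = np.outer(offsets, omegas)
    even = np.cos(phase) @ cos_part   # mirrored around the centre-point
    odd = np.sin(phase) @ sin_part    # zero at the centre, sign flips across it
    return (2.0 / N) * (even + odd), offsets, omegas

def filter_cosine(taps, offsets, omega, t):
    """Output at time t for the input cos(omega * t)."""
    return np.sum(taps * np.cos(omega * (t - offsets)))

# Hypothetical per-frequency cosine and sine components.
cos_part = [1.0, 0.8, 0.0, 0.5]
sin_part = [0.0, 0.6, 1.0, -0.5]
taps, offsets, omegas = phase_shifting_kernel(cos_part, sin_part)

t = 1.7
for c, s, w in zip(cos_part, sin_part, omegas):
    expected = c * np.cos(w * t) + s * np.sin(w * t)
    print(filter_cosine(taps, offsets, w, t), expected)
```

The specified components are only reproduced exactly at the grid frequencies; in between them, the response is whatever the finite kernel interpolates. Which side of the centre-point receives the negative values depends on the sign convention chosen for the convolution.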

I have carried out a certain experiment with the Computer Algebra System named “wxMaxima”, in order first to observe what happens if a set of equal, discrete frequency-coefficients belonging to a series is summed. And then, I plotted the result of the definite integral of the sine function, over a short interval. Just as, with the sinc function, the integral of the cosine function from 0 to x, divided by x, was (sin(x) - sin(0)) / x, the definite integral of the sine function from 0 to x, divided by x, will be (1 - cos(x)) / x. And, because the derivative of cos(x) is zero at (x = 0), this expression, which would otherwise seem to involve a division by zero, will actually approach zero in the limit, and be well-behaved.
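The two limits can also be checked numerically, without a Computer Algebra System. This is a small sketch in Python with NumPy, in which the function names are mine, and in which (1 - cos(x)) has been rewritten as 2 * sin(x/2)^2, purely to avoid rounding problems close to zero.

```python
import numpy as np

def averaged_cosine_integral(x):
    """(1/x) * integral of cos(u) du from 0 to x, which is sin(x) / x."""
    return np.sin(x) / x

def averaged_sine_integral(x):
    """(1/x) * integral of sin(u) du from 0 to x, which is (1 - cos(x)) / x."""
    # 1 - cos(x) equals 2 * sin(x/2)**2, which avoids cancellation near 0.
    return 2.0 * np.sin(x / 2.0) ** 2 / x

for x in (1.0, 0.1, 0.001):
    print(x, averaged_cosine_integral(x), averaged_sine_integral(x))
```

As x becomes small, the first column of results approaches 1, while the second behaves approximately as x / 2, and therefore approaches zero.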

(Update 1/31/2021, 13h35: )

There is an underlying truth about Integrals in general, which people who studied Calculus 2 generally know, but I have no right just to assume that any reader of my blog did so. There exist certain standard Integrals, which behave in the reverse way of how the standard Derivatives behave, just because ‘Integrals’ are ‘Antiderivatives’…

When one takes the Derivatives of certain trig functions repeatedly, one obtains the sequence:

sin(x) -> cos(x) -> -sin(x) -> -cos(x) -> sin(x)

Taking the Indefinite Integrals of the same trig functions repeatedly yields the sequence:

sin(x) -> -cos(x) -> -sin(x) -> cos(x) -> sin(x)

Hence, the Indefinite Integral of sin(x) is in fact -cos(x), and:

( -(-cos(0)) = +1 )

(End of Update, 1/31/2021, 13h35.)

(Updated 2/04/2021, 17h10…)

Photometrics and Face Recognition

A question which I’ve recently stumbled into is whether computer face recognition should be based on some sort of Fourier Transform of an image, which also refers to spatial frequencies, or on the Photometric placement of key features belonging to the face in 3D.

This photometric analysis of geometries was once referred to as ‘Photo-Modeling’, or ‘Photogrammetry’.

There is a good chance that either method of face recognition can be made to work. But in the case of the photometric approach, there are two caveats which I don’t see news sources mentioning:

1. Photometrics requires a fairly high-resolution image of one face, while the methods based on spatial frequencies can work with low resolutions and poor-quality images.
2. AFAIK, in order for photometrics to proceed in a completely automated way, footage or images of the subject need to be recorded from at least three camera-positions, preferably in such a way that the lighting matches exactly. In this regard, modeling points that belong to the face is similar today to how it was a few decades ago, when a Hollywood Laser needed to paint grid-lines on the face, but is possible today without the grid-lines.

Hence, if a group has in fact used photometrics on a face, because they had 3 camera-positions, they’d also be in a position to display only one of the camera-positions, with the required points being positioned automatically. If the group presents the resulting overlay by itself, they may be confusing some viewers by omission.

In other words, the subject could be asked to look directly at one camera-position, that is obvious to him, but there could have been two additional camera-positions, that he was not aware of.

(Updated 09/15/2018, 17h35 … )

(As of 09/06/2018 : )

Alternatively, I am aware that ‘3D Cameras’ exist, which obtain a depth-map of the scene in front of them, due to an additional laser-emitter that has been positioned at an offset from the main camera axis.

A Word Of Compliment To Audacity

One of the open-source applications which can be used as a Sound-Editor is named ‘Audacity’. And in an earlier posting, I had written that this application may apply certain effects, which first perform a Fourier Transform of some sort on sampling-windows, which then manipulate the frequency-coefficients, and which then invert the Fourier Transform, to result in time-domain sound samples again.

On closer inspection of Audacity, I’ve recently come to realize that its programmers have avoided going that route, as often as possible. They may have designed effects which sound more natural as a result, and which follow how traditional analog methods used to process sound.

In some places, this has actually led to criticism of Audacity, let’s say because the users have discovered that a low-pass or a high-pass filter would not maintain phase-constancy. But in traditional audio work, low-pass or high-pass filters always used to introduce phase-shifts. Audacity simply brings this into the digital realm.

I just seem to be remembering certain other sound editors, that used the Fourier Transforms extensively.

Dirk

An Observation about Modifying Fourier Transforms

A concept which seems to exist is that certain standard Fourier Transforms do not produce desired results, and that therefore, they must be modified for use with compressed sound.

What I have noticed is that often, when we modify a Fourier Transform, it only produces a special case of an existing standard Transform.

For example, we may start with a Type 4 Discrete Cosine Transform, that has a sampling interval of 576 elements, but want it to overlap 50%, therefore wanting to double the length of samples taken in, without doubling the number of Frequency-Domain samples output. One way to accomplish that is to adhere to the standard Math, but just to extend the array of input samples, and to allow the reference-waves to continue into the extension of the sampling interval, at unchanged frequencies.

Because the Type 4 applies a half-sample shift to its output elements as well as to its input elements, this is really equivalent (up to normalization) to what we would obtain if we were to compute a Type 2 Discrete Cosine Transform over a sampling interval of 1152 elements, and if we were only to keep the odd-numbered coefficients. Output element k of the extended Type 4 corresponds to coefficient (2k + 1) of the Type 2, so that all the output elements count as odd-numbered ones, after their index is doubled.
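This equivalence is easy to test numerically. The sketch below, in Python with NumPy, follows the description above literally: the reference-waves of a 576-coefficient Type 4 are simply allowed to run on over 1152 input samples, and the result is compared with the odd-numbered coefficients of an unnormalized Type 2 over the same 1152 samples. The function names are mine, both transforms are written out as plain matrix products without scaling factors, and no windowing or additional time-offset, such as an actual codec might apply, is included.

```python
import numpy as np

def extended_dct4(x, num_coeffs):
    """Type 4 reference-waves for num_coeffs outputs, continued over all of x.

    X[k] = sum_n x[n] * cos(pi / num_coeffs * (n + 0.5) * (k + 0.5))
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x)) + 0.5
    k = np.arange(num_coeffs) + 0.5
    return np.cos(np.pi / num_coeffs * np.outer(k, n)) @ x

def dct2_unnormalized(x):
    """Type 2 over the whole of x, without any scaling factor.

    Y[m] = sum_n x[n] * cos(pi / L * (n + 0.5) * m), with L = len(x)
    """
    x = np.asarray(x, dtype=float)
    L = len(x)
    n = np.arange(L) + 0.5
    m = np.arange(L)
    return np.cos(np.pi / L * np.outer(m, n)) @ x

rng = np.random.default_rng(0)
N = 576
x = rng.standard_normal(2 * N)           # 1152 input samples

from_type4 = extended_dct4(x, N)         # 576 coefficients
from_type2 = dct2_unnormalized(x)[1::2]  # odd-numbered coefficients only

max_diff = np.max(np.abs(from_type4 - from_type2))
print("largest difference:", max_diff)
```

The two results agree to within rounding error, because substituting m = 2k + 1 into the Type 2 reference-wave gives cos(pi / 1152 * (n + 0.5) * (2k + 1)), which is the same as cos(pi / 576 * (n + 0.5) * (k + 0.5)).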

The only new information I really have on Frequency-Based sound-compression is that, notwithstanding the above, there is an advantage gained in storing the sign of each coefficient.

(Edit 08/07/2017 : )