Photometrics and Face Recognition

A question which I’ve recently stumbled into is whether computer face recognition should be based on some sort of Fourier Transform of an image, which would work with spatial frequencies, or on the photometric placement of key features belonging to the face in 3D.

This photometric analysis of geometries was once referred to as ‘Photo-Modeling’, or ‘Photogrammetry’.

There is a good chance that either method of face recognition can be made to work. But in the case of the photometric approach, there are two caveats which I don’t see news sources mentioning:

  1. Photometrics requires a fairly high-resolution image of one face, while the methods based on spatial frequencies can work with low resolutions and poor-quality images.
  2. AFAIK, in order for photometrics to proceed in a completely automated way, footage or images of the subject need to be recorded from at least three camera-positions, preferably in such a way that the lighting matches exactly. In this regard, modeling points that belong to the face is similar today to how it was a few decades ago, when a Hollywood laser needed to paint grid-lines on the face, except that it is possible today without the grid-lines.

Hence, if a group has in fact used photometrics on a face, because they had 3 camera-positions, they’d also be in a position to display only one of the camera-positions, with the required points being positioned automatically. If the group presents the resulting overlay by itself, they may be confusing some viewers by omission.

In other words, the subject could be asked to look directly at one camera-position that is obvious to him, while there could have been two additional camera-positions that he was not aware of.


(Updated 09/15/2018, 17h35 … )

(As of 09/06/2018 : )

Alternatively, I am aware that ‘3D Cameras’ exist, which obtain a depth-map of the scene in front of them by means of an additional laser-emitter that has been positioned at an offset from the main camera axis.


A Word Of Compliment To Audacity

One of the open-source applications which can be used as a sound-editor is named ‘Audacity’. In an earlier posting, I had written that this application may apply certain effects which first perform a Fourier Transform of some sort on sampling-windows, then manipulate the frequency-coefficients, and then invert the Fourier Transform, to result in time-domain sound samples again.

On closer inspection of Audacity, I’ve recently come to realize that its programmers have avoided going that route as often as possible. They may have designed effects which sound more natural as a result, and which follow how traditional analog methods used to process sound.

In some places, this has actually led to criticism of Audacity, for example because users have discovered that a low-pass or a high-pass filter does not preserve phase. But in traditional audio work, low-pass and high-pass filters always introduced phase-shifts. Audacity simply brings this into the digital realm.
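As a small illustration of why such filters shift phase, here is a minimal sketch of the frequency response of a one-pole recursive low-pass filter. This is not Audacity's implementation; the filter form, the function name `one_pole_lowpass_response` and the coefficient `alpha` are assumptions chosen only for the example.

```python
import cmath
import math

def one_pole_lowpass_response(alpha, omega):
    """Frequency response of y[n] = alpha*x[n] + (1 - alpha)*y[n-1]
    at normalised angular frequency omega (radians per sample)."""
    return alpha / (1 - (1 - alpha) * cmath.exp(-1j * omega))

alpha = 0.2  # hypothetical smoothing coefficient, chosen for illustration
responses = {}
for omega in (0.0, 0.1 * math.pi, 0.5 * math.pi):
    h = one_pole_lowpass_response(alpha, omega)
    responses[omega] = h
    # The gain falls and the phase becomes negative (a lag) as frequency rises.
    print(f"omega={omega:.3f}  gain={abs(h):.3f}  "
          f"phase={math.degrees(cmath.phase(h)):.1f} deg")
```

At zero frequency the gain is 1 and the phase is 0, but every higher frequency comes out both attenuated and delayed in phase, which is the behaviour analog filters have as well.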

I just seem to remember certain other sound editors that used Fourier Transforms extensively.



An Observation about Modifying Fourier Transforms

A concept which seems to exist is that certain standard Fourier Transforms do not produce the desired results, and that therefore they must be modified for use with compressed sound.

What I have noticed is that often, when we modify a Fourier Transform, it only produces a special case of an existing standard Transform.

For example, we may start with a Type 4 Discrete Cosine Transform, that has a sampling interval of 576 elements, but want it to overlap 50%, therefore wanting to double the length of samples taken in, without doubling the number of Frequency-Domain samples output. One way to accomplish that is to adhere to the standard Math, but just to extend the array of input samples, and to allow the reference-waves to continue into the extension of the sampling interval, at unchanged frequencies.

Because the Type 4 applies a half-sample shift to its output elements as well as to its input elements, this is really equivalent to what we would obtain, if we were to compute a Type 2 Discrete Cosine Transform over a sampling interval of 1152 elements, but if we were only to keep the odd-numbered coefficients. All the output elements would count as odd-numbered ones then, after their index is doubled.
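The equivalence described above can be checked numerically. The following is a minimal sketch using unnormalised transforms and a small stand-in length of 16 instead of 576; the function names are hypothetical. It only tests the identity as stated here, and does not attempt the windowing or time offset that a codec's overlapped transform typically adds.

```python
import math
import random

def extended_dct4(x, n_out):
    """DCT-IV reference waves for n_out coefficients, continued at
    unchanged frequencies over all len(x) input samples."""
    return [
        sum(v * math.cos(math.pi / n_out * (n + 0.5) * (k + 0.5))
            for n, v in enumerate(x))
        for k in range(n_out)
    ]

def dct2(x):
    """Unnormalised DCT-II over the whole input array."""
    size = len(x)
    return [
        sum(v * math.cos(math.pi / size * (n + 0.5) * m)
            for n, v in enumerate(x))
        for m in range(size)
    ]

random.seed(0)
N = 16  # small stand-in for the 576 of the text
x = [random.uniform(-1.0, 1.0) for _ in range(2 * N)]

from_dct4 = extended_dct4(x, N)   # N outputs from 2N inputs
odd_of_dct2 = dct2(x)[1::2]       # keep coefficients 1, 3, 5, ...

max_diff = max(abs(a - b) for a, b in zip(from_dct4, odd_of_dct2))
print(max_diff)  # differs only by floating-point rounding
```

The two results agree because substituting the odd index 2k+1 into the Type 2 cosine argument over 2N samples gives exactly the Type 4 argument over N samples.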

The only new information I really have on frequency-based sound-compression is that, notwithstanding, there is an advantage gained in storing the sign of each coefficient.

(Edit 08/07/2017 : )


A single time-delay can also be expressed in the frequency-domain.

Another way to state that a stream of time-domain samples has been given a time-delay is simply to state that each frequency-coefficient has been given a phase-shift that depends both on the frequency of the coefficient and on the intended time-delay.
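This can be tried out with a short sketch. It assumes a plain Discrete Fourier Transform over one block of samples, so the delay it produces is circular (samples pushed off the end wrap around to the start), and it uses a whole-sample delay. The function names are hypothetical.

```python
import cmath
import math

def dft(x):
    """Plain O(N^2) Discrete Fourier Transform."""
    n_total = len(x)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * n / n_total)
            for n, v in enumerate(x))
        for k in range(n_total)
    ]

def idft(coeffs):
    """Inverse of dft()."""
    n_total = len(coeffs)
    return [
        sum(c * cmath.exp(2j * math.pi * k * n / n_total)
            for k, c in enumerate(coeffs)) / n_total
        for n in range(n_total)
    ]

def delay_in_frequency_domain(x, delay):
    """Delay x circularly by 'delay' samples, using only phase-shifts."""
    n_total = len(x)
    shifted = [
        # The phase-shift grows with both the frequency index k and the delay.
        c * cmath.exp(-2j * math.pi * k * delay / n_total)
        for k, c in enumerate(dft(x))
    ]
    return [v.real for v in idft(shifted)]

x = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
delayed = delay_in_frequency_domain(x, 3)
print([round(v, 6) for v in delayed])
```

No coefficient changes in amplitude here; each is only rotated, by an angle proportional to its frequency.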

A concern that some readers might have with this is the fact that a number of samples need to be stored in order for a time-delay to be executed in the time-domain. But when the coefficient values of a Fourier Transform vary more rapidly from one frequency to the next, indicating in this case a longer time-delay, computing the transform also requires that a longer interval of time-domain samples be combined.

Now, if the reader would like to visualize what this would look like, as a homology to a graphical equalizer, then he would need to imagine a graphical equalizer the sliders of which can be made negative, i.e. one that can command that one frequency come out inverted. Then, if he were to set his sliders into the accurate shape of a sine-wave that goes both positive and negative in its settings, he should obtain a simple time-delay.

But there is one more reason for which this homology would be flawed. The type of Fourier Transform which is best-suited for this would be the Discrete Fourier Transform, not one of the Discrete Cosine Transforms. The reason is the fact that the DFT accepts complex numbers as its terms. And so the reader would also have to imagine that his equalizer not only has sliders that move up and down, but sliders with little wheels on them, with which he can give a phase-shift to one frequency without changing its amplitude. Obviously, graphical equalizers for music are not made that way.
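To see what is lost without the "little wheels", here is a minimal sketch that applies purely real slider settings to the coefficients of a DFT. It assumes a cosine-shaped curve for the settings, which is a choice made only for this example, and the function names are hypothetical.

```python
import cmath
import math

def dft(x):
    """Plain O(N^2) Discrete Fourier Transform."""
    n_total = len(x)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * n / n_total)
            for n, v in enumerate(x))
        for k in range(n_total)
    ]

def idft(coeffs):
    """Inverse of dft()."""
    n_total = len(coeffs)
    return [
        sum(c * cmath.exp(2j * math.pi * k * n / n_total)
            for k, c in enumerate(coeffs)) / n_total
        for n in range(n_total)
    ]

def apply_real_gains(x, delay):
    """Scale each coefficient by a real, cosine-shaped gain that may go
    negative, but never rotate its phase."""
    n_total = len(x)
    gains = [math.cos(2 * math.pi * k * delay / n_total)
             for k in range(n_total)]
    scaled = [g * c for g, c in zip(gains, dft(x))]
    return [v.real for v in idft(scaled)]

impulse = [1.0] + [0.0] * 7
out = apply_real_gains(impulse, 2)
print([round(v, 6) for v in out])
```

Instead of one impulse delayed by 2 samples, the output holds two half-height impulses, one delayed by 2 and one advanced by 2 (wrapping around to index 6). Real gains alone give an even mix of a delay and an advance, which is why the complex phase-shift of the DFT is needed for a pure delay.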
