A question which I’ve recently stumbled into is whether computer face recognition should be based on some sort of Fourier Transform of an image – which amounts to analyzing spatial frequencies – or on the photometric placement of key features belonging to the face, in 3D.
This photometric analysis of geometries was once referred to as ‘Photo-Modeling’, or ‘Photogrammetry’.
There is a good chance that either method of face recognition can be made to work. But in the case of the photometric approach, there are two caveats which I don’t see news sources mentioning:
- Photometrics requires a fairly high-resolution image of one face, while methods based on spatial frequencies can work with low resolutions and poor-quality images,
- AFAIK, in order for photometrics to proceed in a completely automated way, footage or images of the subject need to be recorded from at least three camera-positions, preferably in such a way that the lighting matches exactly. In this regard, modeling points that belong to the face is similar to how it was done a few decades ago, when a Hollywood laser needed to paint grid-lines on the face, except that today it is possible without the grid-lines.
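To give an idea of why multiple camera-positions allow the 3D points to be positioned automatically, here is a minimal sketch of linear (DLT) triangulation: given the same facial feature observed in several calibrated views, its 3D position falls out of a least-squares problem. All the numbers below – focal length, camera baseline, the feature’s position – are assumptions chosen for illustration, not data from any real system:

```python
import numpy as np

def triangulate(projections, image_points):
    """Recover a 3D point from its 2D observations in several
    calibrated views, via linear (DLT) triangulation.

    projections  -- list of 3x4 camera projection matrices
    image_points -- list of (u, v) pixel coordinates, one per view
    """
    rows = []
    for P, (u, v) in zip(projections, image_points):
        # Each view contributes two linear constraints on X = (x, y, z, 1).
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.array(rows)
    # Solve A X = 0 by SVD; the solution is the last right-singular vector.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]                   # de-homogenize

# Hypothetical setup: three cameras along a baseline, viewing one feature.
f = 800.0                                 # focal length in pixels (assumed)
K = np.array([[f, 0.0, 320.0], [0.0, f, 240.0], [0.0, 0.0, 1.0]])
point = np.array([0.1, -0.05, 2.0])       # true 3D position of the feature

Ps, uvs = [], []
for tx in (-0.2, 0.0, 0.2):               # three camera-positions
    Rt = np.hstack([np.eye(3), np.array([[tx], [0.0], [0.0]])])
    P = K @ Rt
    h = P @ np.append(point, 1.0)         # project the point into this view
    Ps.append(P)
    uvs.append((h[0] / h[2], h[1] / h[2]))

print(triangulate(Ps, uvs))               # recovers the assumed 3D position
```

With two or more views and known camera matrices, no grid-lines on the face are needed; each detected feature yields its own small linear system.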
Hence, if a group has in fact used photometrics on a face because they had 3 camera-positions, they’d also be in a position to display only one of those camera-positions, with the required points having been positioned automatically. If the group presents the resulting overlay by itself, they may be confusing some viewers by omission.
In other words, the subject could be asked to look directly at one camera-position that is obvious to him, while there could have been two additional camera-positions that he was not aware of.
(Updated 09/15/2018, 17h35 … )
(As of 09/06/2018 : )
Alternatively, I am aware that ‘3D Cameras’ exist, which obtain a depth-map of the scene in front of them, thanks to an additional laser-emitter that has been positioned at an offset from the main camera’s axis.
The idea behind those devices is that the laser-diodes in the emitter operate at near-infrared wavelengths, which humans cannot see but which the main camera can ‘see’ perfectly well. When one of the laser-diodes is turned on, a spot visible to the camera lights up, but not always in the same 2D position as seen from that camera, because the emitter is sending a beam into the scene along a path which is skewed relative to the camera’s axis. Software can then apply the concept of either horizontal or vertical parallax, to determine the depth into the scene corresponding to the laser-diode in question.
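The parallax computation itself is the standard stereo relation: depth is inversely proportional to how far the spot appears displaced. The sketch below is a toy illustration of that principle only; the focal length and emitter offset are assumed values, not the specifications of any real 3D camera:

```python
def depth_from_shift(focal_px, baseline_m, shift_px):
    """Depth of the illuminated spot, from how far (in pixels) the spot
    appears displaced relative to where it would sit at infinite distance.
    Standard stereo / parallax relation: z = f * b / d."""
    return focal_px * baseline_m / shift_px

focal_px = 700.0    # camera focal length, in pixels (assumed)
baseline_m = 0.06   # emitter's offset from the camera axis, in meters (assumed)

# The further the spot shifts in the image, the closer the surface it hit.
for shift in (10.0, 21.0, 42.0):
    z = depth_from_shift(focal_px, baseline_m, shift)
    print(f"shift {shift:5.1f} px  ->  depth {z:.3f} m")
```

Whether the shift is horizontal or vertical simply depends on which way the emitter is offset from the camera.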
I suspect that one main drawback with this latter type of device has always been: Because the device is produced on a budget, there are tolerances in how the laser-diodes are mounted, yet their positions are static once the device has been assembled. Therefore, some sort of calibration may need to be done during the production of such a device, with a flat reflective surface at a known distance, by which the position of the illuminated spot that the camera detects can be associated with the calibration distance.
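Such a calibration could work by inverting the same parallax relation: with the target at a known distance, each diode’s observed spot shift reveals that diode’s effective baseline, which is then stored and reused in the field. This is only a sketch of how it might be done, with assumed numbers, not the procedure of any actual manufacturer:

```python
def calibrate_diode(focal_px, cal_depth_m, observed_shift_px):
    """Factory step: infer one diode's effective baseline from a single
    measurement against a flat target at a known distance.
    From z = f * b / d, it follows that b = z * d / f."""
    return cal_depth_m * observed_shift_px / focal_px

def measure_depth(focal_px, baseline_m, shift_px):
    """Field use: the usual parallax relation, per diode."""
    return focal_px * baseline_m / shift_px

focal_px = 700.0                           # assumed focal length, in pixels
# Factory: target at 0.50 m; this diode's spot lands 80 px off-axis.
b = calibrate_diode(focal_px, 0.50, 80.0)  # per-diode baseline, absorbs tolerances
# Field: later, the same diode's spot shifts only 40 px -> twice the distance.
print(measure_depth(focal_px, b, 40.0))
```

Calibrating each diode individually is what absorbs the mounting tolerances: the stored baseline is whatever the diode’s static position actually turned out to be.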
I do not know whether simply obtaining a depth-map of a human face is, by itself, so helpful in recognizing his or her identity.
(Update 09/06/2018, 17h35 : )
Actually, to do so really only requires an extra stage of computation. On the assumption that a 3D mesh of the face is available as input, as outlined in the post I linked to, ‘points’ that correspond to facial features can be ‘moved’ along one axis at a time – let’s say, perpendicularly to a facial ellipsoid – until they cross that 3D mesh. As soon as they do, one metric of the shape of the face has effectively been found.
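The idea of moving a point along one axis until it crosses the mesh can be sketched as follows. Here I represent the face mesh as a small grid of depth values (a height-field) and march a probe point along the depth axis until it passes the interpolated surface; the grid contents and step size are assumptions made for illustration, not the author’s actual data:

```python
import numpy as np

def probe_depth(heightfield, u, v, step=0.01, max_dist=10.0):
    """March a probe point along the depth axis until it crosses the
    surface stored in the depth-map.  heightfield holds surface depths;
    (u, v) are normalized [0,1] coordinates of the facial feature."""
    rows, cols = heightfield.shape
    # Bilinear interpolation of the surface depth at (u, v).
    r, c = v * (rows - 1), u * (cols - 1)
    r0, c0 = int(r), int(c)
    r1, c1 = min(r0 + 1, rows - 1), min(c0 + 1, cols - 1)
    fr, fc = r - r0, c - c0
    surface = (heightfield[r0, c0] * (1 - fr) * (1 - fc)
               + heightfield[r0, c1] * (1 - fr) * fc
               + heightfield[r1, c0] * fr * (1 - fc)
               + heightfield[r1, c1] * fr * fc)
    # Move the probe point in steps until it crosses the mesh.
    z = 0.0
    while z < max_dist:
        if z >= surface:
            return z       # first crossing: one metric of the face's shape
        z += step
    return None            # no crossing within range

# A 20x20-vertex mesh, as suggested below: a flat 'face' with a crude
# protrusion (e.g. a nose) that sits closer to the camera.
grid = np.full((20, 20), 2.0)
grid[8:12, 8:12] = 1.8     # region protruding toward the camera (assumed)

print(probe_depth(grid, 0.5, 0.5))   # crosses early, at the protrusion
print(probe_depth(grid, 0.1, 0.1))   # crosses later, at the flat region
```

Collecting such crossing distances for a fixed set of feature points would yield a small vector of shape metrics per face.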
This does not require the highest polygon count for the mesh; even a 20×20-vertex mesh could work.