Photometrics and Face Recognition

A question which I’ve recently stumbled in to, is whether computer face recognition should be based on some sort of Fourier Transform of an image, which also refers to spatial frequencies, or on the Photometric placement of key features belonging to the face in 3D.

This photometric analysis of geometries was once referred to as ‘Photo-Modeling’, or, ‘Photo-Grammetry’.

There is a good chance that either method of face recognition can be made to work. But in the case of the photometric approach, there are two caveats which I don’t see news sources as mentioning:

  1. Photometrics requires a fairly high-resolution image of one face, while the methods based on spatial frequencies can work with low resolutions and poor-quality images,
  2. AFAIK, In order for photometrics to proceed in a completely automated way, footage or images of the subject need to be recorded, from at least three camera-positions, preferably in such a way that the lighting matches exactly. In this regard, modeling points that belong to the face is similar today to how it was a few decades ago, when a Hollywood Laser needed to paint grid-lines on the face, but is possible today without the grid-lines.

Hence, if a group has in fact used photometrics on a face, because they had 3 camera-positions, they’d also be in a position to display only one of the camera-positions, with the required points being positioned automatically. If the group presents the resulting overlay by itself, they may be confusing some viewers by omission.

In other words, the subject could be asked to look directly at one camera-position, that is obvious to him, but there could have been two additional camera-positions, that he was not aware of.


(Updated 09/15/2018, 17h35 … )

(As of 09/06/2018 : )

Alternatively, I am aware that ‘3D Cameras’ exist, which obtain a depth-map of the scene in front of them, due to an additional laser-emitter, that has been positioned in a way offset from the main camera axis.

Continue reading Photometrics and Face Recognition

A Method for Obtaining Signatures for Photogrammetry

I have posed myself the question in the past, in which a number of photos is each subdivided into a grid of rectangles, how a signature can be derived from each rectangle, which leads to some sort of identifier, so that between photos, these identifiers will either match or not, even though there are inherent mismatches in the photos, to decide whether a rectangle in one photo corresponds to the same subject-feature, as a different rectangle in the other photo.

Eventually, one would want this information in order to compute a 3D scene-description – a 3D Mesh, with a level of detail equal to how finely the photos were subdivided into rectangles.

Since exact pixels will not be equal, I have thought of somewhat improbable schemes in the past, of just how to compute such a signature. These schemes once went so far, as first to compute a 2D Fourier Transform of each rectangle @ 1 coefficient /octave, to quantize those into 1s and 0s, to ignore the F=(0,0) bit, and then to hash the results.

But just recently I have come to the conclusion that a much simpler method should work.

At full resolution, the photos can be analyzed as though they formed a single image, in the ways already established for computing an 8-bit color palette, i.e. a 256-color palette, like the palettes once used in GIF Images, and for other images that only had 8-bit colors.

The index-number of this palette can be used as an identifier.

After the palette has been established, each rectangle of each photo can be assigned an index number, depending on which color of the palette it best matches. It would be important that this assignment not take place, as though we were just averaging the colors of each rectangle. Instead, the strongest basis of this assignment would need to be, how many pixels in the rectangle match one color in the palette. (*)

After that, each rectangle will be associated with this identifier, and for each one the most important result will become, at what distances from its camera-position the greatest number of other cameras confirm its 3D position, according to matching identifiers.


Continue reading A Method for Obtaining Signatures for Photogrammetry

Photogrammetry Acknowledgement

In this earlier posting, I described a form of “photogrammetry” in which an arbitrary, coarse base-geometry is assumed as a starting point, and from which micropolygons are spawned, in order to approximate a more-detailed final geometry.

I must acknowledge that within this field, a domain also exists, which is not like that, and in which the computer tries to guess at a random, arbitrary geometry. Of course, this is a much more difficult form of the subject, and I do not know much about how it is intended to work.

I do know that aside from the fact that swatches of pixels need to be matched from one 2D photo to the next, one challenge which impedes this, is the fact that parts of the (yet-unknown) mesh will occlude each other to some camera-positions but not others, in ways that computers are poor at predicting. To deal with that requires such complex fields as “Constraint Satisfaction Programming” – aka ‘Logic Programming’, etc..

(Edit 01/05/2017 : Also, if we can assume that a 2D grid of pixel-swatches is being tagged for exact matching, and that only horizontal parallax is to be measured, the problem of entire rows of rectangles that all have the same signature can be cumbersome to code for, where only the end-points change position from one photo to the next… And then their signature can end, to be replaced by another, after which, on the same row, the first set of signatures can simply resume.

Further, If we knew that this approach was being used, Then we could safely infer that the number of mesh-units we derive, will also correspond to the number of rectangles, which each photo has been subdivided in to, not the number of pixels. )

If that was to succeed, I suppose it could again form a starting-point, for the micropolygon-based approach I was describing.

I do know of at least one consumer-grade product, which uses micropolygons.


Continue reading Photogrammetry Acknowledgement

The wording ‘Light Values’ can play tricks on people.

What I wrote before, was that between (n) real, 2D photos, 1 light-value can be sampled.

Some people might infer that I meant, always to use the brightness value. But this would actually be wrong. I am assuming that color footage is being used.

And if I wanted to compare pixel-colors, to determine best-fit geometry, I would most want to go by a single hue-value.

If the color being mapped averages to ‘yellow’ – which facial colors do – then hue would be best-defined as ‘the difference between the Red and Green channels’.

But the way this works out negatively, is in the fact that actual photographic film which was used around 1977, differentiated most poorly between between Red and Green, as did any chroma / video signal. And Peter Cushing was being filmed in 1977, so that our reconstruction of him might appear in today’s movies.

So then an alternative might be, ‘Normalize all the pixels to have the same luminance, and then pick whichever primary channel that the source was best-able to resolve into minute details, on a physical level.’

Maybe 1977 photographic projector-emulsions differentiated the Red primary channel best?

Further, given that there are 3 primary colors in most forms of graphics digitization, and that I would remove the overall luminance, it would follow that maybe 2 actual remaining color channels could be used, the variance of each computed separately, and the variances added?

In general, it is Mathematically safer to add Variances, than it would be to add Deviations, where Variance corresponds to Deviation squared, and where Variance therefore also corresponds to Energy, if Deviation corresponded to Potential. It is more generally agreed that Energy and its homologues are conserved quantities.