## Modern Photogrammetry

Modern Photogrammetry makes use of a Geometry Shader, i.e. a Shader which starts with a coarse grid in 3D, and which interpolates a fine grid of micropolygons, again in 3D.

The principle is that a first-order, approximate 3D model provides per-vertex “normal vectors”, i.e. vectors that stand out at exact right angles from the 3D model’s surface, and that a Geometry Shader renders many interpolated points to several virtual camera positions. These virtual camera positions correspond, in 3D, to the assumed positions from which real cameras photographed the subject.
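To make the interpolation step concrete, here is a minimal sketch in Python rather than in an actual shader language. It subdivides one coarse triangle on a barycentric grid and interpolates both the positions and the per-vertex normals. The names `subdivide`, `lerp3`, `POSITIONS` and `NORMALS` are my own, hypothetical choices, and the example triangle is made up.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def lerp3(a, b, c, wa, wb, wc):
    """Weighted (barycentric) blend of three 3D vectors."""
    return tuple(wa * x + wb * y + wc * z for x, y, z in zip(a, b, c))

def subdivide(tri_positions, tri_normals, n):
    """Return (position, unit normal) pairs on an n-step barycentric grid.

    Each pair stands in for one interpolated micropolygon-point of the
    coarse triangle; the normal is re-normalized after interpolation.
    """
    points = []
    for i in range(n + 1):
        for j in range(n + 1 - i):
            wa, wb = i / n, j / n
            wc = 1.0 - wa - wb
            pos = lerp3(*tri_positions, wa, wb, wc)
            nrm = normalize(lerp3(*tri_normals, wa, wb, wc))
            points.append((pos, nrm))
    return points

# Hypothetical coarse triangle with slightly diverging vertex normals.
POSITIONS = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
NORMALS = (normalize((0.0, 0.0, 1.0)),
           normalize((0.3, 0.0, 1.0)),
           normalize((0.0, 0.3, 1.0)))

fine_points = subdivide(POSITIONS, NORMALS, 4)
```

With `n = 4` this yields 15 points, each carrying its own direction along which it may later be displaced.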

The Geometry Shader displaces each of these points, but only along its interpolated normal vector, derived from the coarse grid, until the positions to which that point renders take light-values from the real photos that correlate to the closest extent. I.e., the premise is that at some exact position along the normal vector, a point generated by the Geometry Shader will project to positions in all the real camera-views at which all the real, 2D cameras photographed the same light-value. Finding that point is a 1-dimensional process, because it only takes place along the normal vector, and can thus be achieved with successive approximation.
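The 1-dimensional search can be sketched as follows. This is a toy under loud assumptions, not a real pipeline: the "photos" are simulated by intersecting each camera ray with a known plane (`TRUE_HEIGHT`) carrying a made-up `texture`, the mismatch between cameras is measured as the variance of their light-values, and the successive approximation is a simple scan-and-zoom. All names here (`observe`, `mismatch`, `find_displacement`, `CAMERAS`) are hypothetical.

```python
import math

TRUE_HEIGHT = 0.3  # hypothetical ground truth: the real surface is the plane z = 0.3

def texture(x, y):
    """Made-up light-value painted on the true surface."""
    return math.sin(3.0 * x) + math.cos(2.0 * y)

def observe(camera, point):
    """Light-value `camera` recorded at the pixel onto which `point` projects.

    The ray from the camera through `point` is intersected with the true
    surface; this stands in for looking up a pixel in a real photo.
    """
    cx, cy, cz = camera
    px, py, pz = point
    s = (TRUE_HEIGHT - cz) / (pz - cz)  # ray parameter at the true surface
    return texture(cx + s * (px - cx), cy + s * (py - cy))

def variance(values):
    """Mean of the squares minus the square of the mean."""
    mean = sum(values) / len(values)
    mean_sq = sum(v * v for v in values) / len(values)
    return mean_sq - mean * mean

def mismatch(cameras, base, normal, t):
    """Disagreement between cameras for the point displaced by t along the normal."""
    point = tuple(b + t * n for b, n in zip(base, normal))
    return variance([observe(c, point) for c in cameras])

def find_displacement(cameras, base, normal, lo=0.0, hi=1.0, rounds=6, steps=20):
    """Successive approximation: scan the interval, then zoom in around the best sample."""
    best = lo
    for _ in range(rounds):
        width = (hi - lo) / steps
        candidates = [lo + i * width for i in range(steps + 1)]
        best = min(candidates, key=lambda t: mismatch(cameras, base, normal, t))
        lo, hi = best - width, best + width
    return best

# Six hypothetical camera positions, all well above the subject.
CAMERAS = [(-2.0, 0.0, 5.0), (2.0, 0.0, 5.0), (0.0, 2.0, 5.0),
           (0.0, -2.0, 5.0), (1.5, 1.5, 5.0), (-1.5, -1.5, 5.0)]

t_found = find_displacement(CAMERAS, (0.2, 0.1, 0.0), (0.0, 0.0, 1.0))
```

In this toy setup the search settles very close to `TRUE_HEIGHT`, because that is the only displacement at which all six simulated cameras agree. Real photos would add noise, occlusion and lighting differences, so the minimum would be far less clean.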

(Edit 01/10/2017 : To make this easier to visualize: If the original geometry was just a rectangle, then all the normal vectors would be parallel. Then, if we subdivided this rectangle finely enough, and projected each micropolygon some variable distance along that vector, there would be no reason to say that there exists some point in the volume in front of the rectangle, which would not eventually be crossed. At a point corresponding to a 3D surface, all the cameras viewing the volume should in principle have observed the same light-value.

Now, if the normal-vectors are not parallel, then these paths will be more dense in some parts of the volume, and less dense in others. But then the assumption becomes, that their density should never actually reach zero, so that finer subdivision of the original geometry can also counteract this to some extent.

But there can exist many 3D surfaces, which would occupy more than one point along the projected path of one micropolygon – such as a simple sphere in front of an initial rectangle. Many paths would enter the sphere at one distance, and exit it again at another. There could exist a whole, complex scene in front of the rectangle. In those cases, starting with a coarse mesh which approximates the real geometry in 3D, is more of a help than a hindrance, because then, optimally, again there is only one distance of projection of each micropolygon, that will correspond to the exact geometry. )

Now one observation which some people might make, is that the initial, coarse grid might be inaccurate to begin with. But surprisingly, this type of error cancels out. This is because each micropolygon-point will have been displaced from the coarse grid enough, that the coarse grid will finally no longer be recognizable from the positions of the micropolygons. And the way the micropolygons are displaced is also such, that they never cross paths, since their paths as such are interpolated normal vectors, and so no Mathematical contradictions can result.

That holds, at least, to whatever extent geometric occlusion has been explained by the initial, coarse model.

Granted, if the initial model was partially concave, then projecting all the points along their normal vectors will eventually cause their paths to cross. But then this also defines the extent at which the system no longer works.

But, according to what I just wrote, even the lighting needs to be consistent within one set of 2D photos, so that any match between their light-values actually has the same meaning. And really, it’s preferable to have about 6 such photos…

Yet, there are some people who would argue, that superior Statistical Methods could still find the optimal correlations in 1-dimensional light-values, between a higher number of actual photos…

One main limitation to providing photogrammetry in practice is the fact that the person doing it may have the strongest graphics card available, but that he eventually needs to export his data to users who do not. So one way in which it works for public consumption is that the actual photogrammetry gets done on a remote server, perhaps a GPU farm, and that simplified data then gets downloaded onto our tablets or phones, which the mere GPU of that tablet or phone is powerful enough to render.

But the GPU of the tablet or phone is itself not powerful enough, to do the actual successive approximation of the micropolygon-points.

I suppose that Hollywood might not have that latter limitation. As far as they are concerned, their CGI specialists could all have the most powerful GPUs, all the time…

Dirk

P.S. There exists a numerical approach, which simplifies computing Statistical Variance in such a way, that Variance can effectively be computed between ‘an infinite number of sample-points’, at a computational cost which is ‘only proportional to the number of sample-points’. And the equation is not so complicated.
$$ s^2 = \mathrm{Mean}(X^2) - \left( \mathrm{Mean}(X) \right)^2 $$
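As a minimal sketch of why the cost stays proportional to the number of sample-points: only three running totals need to be kept, and each new sample updates them in constant time. The class name `RunningVariance` is my own, hypothetical choice. Note that this computes the population variance, and that the subtraction can lose precision when the mean is large compared to the spread.

```python
class RunningVariance:
    """Accumulates samples one at a time; variance is available at any moment."""

    def __init__(self):
        self.count = 0       # number of samples seen so far
        self.total = 0.0     # running sum of X
        self.total_sq = 0.0  # running sum of X squared

    def add(self, x):
        """Fold one more sample into the running totals (constant time)."""
        self.count += 1
        self.total += x
        self.total_sq += x * x

    def variance(self):
        """Mean of the squares minus the square of the mean."""
        mean = self.total / self.count
        return self.total_sq / self.count - mean * mean


rv = RunningVariance()
for sample in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    rv.add(sample)
print(rv.variance())  # mean is 5, mean of squares is 29, so this prints 4.0
```

No sample ever needs to be stored or revisited, which is what makes the approach attractive when light-values from many photos have to be compared at many candidate positions.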