In this earlier posting, I described a form of “photogrammetry” in which an arbitrary, coarse base-geometry is assumed as a starting point, from which micropolygons are then spawned, in order to approximate a more-detailed final geometry.
I must acknowledge that within this field, a domain also exists which is not like that, and in which the computer must guess at an unknown, arbitrary geometry. Of course, this is a much more difficult form of the subject, and I do not know much about how it is intended to work.
I do know that, aside from the fact that swatches of pixels need to be matched from one 2D photo to the next, one challenge which impedes this is the fact that parts of the (yet-unknown) mesh will occlude each other from some camera-positions but not from others, in ways that computers are poor at predicting. Dealing with that requires such complex fields as Constraint Programming – aka ‘Constraint Logic Programming’ – etc.
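As a minimal sketch of the swatch-matching part of this problem – not of any actual product's algorithm – the following Python snippet slides a small pixel-swatch from one photo along the same row of a second photo, and picks the horizontal disparity with the lowest sum-of-squared-differences. All function names and the synthetic “images” are hypothetical, and real implementations must also cope with occlusion, which this sketch does not.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-sized 2D swatches."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def swatch(img, y, x, size):
    """Cut a (size x size) swatch out of a 2D list-of-lists image."""
    return [row[x:x + size] for row in img[y:y + size]]

def match_horizontal(left, right, y, x, size=8, max_disp=32):
    """Scan the same row of 'right' for the swatch taken from 'left' at (y, x),
    returning the horizontal disparity with the lowest SSD score."""
    patch = swatch(left, y, x, size)
    best_d, best_score = 0, float('inf')
    for d in range(max_disp):
        if x - d < 0:
            break
        score = ssd(patch, swatch(right, y, x - d, size))
        if score < best_score:
            best_d, best_score = d, score
    return best_d

# Tiny demo: a deterministic synthetic image, and a second view in which
# the whole scene appears shifted 5 pixels to the left.
left = [[(3 * r + 7 * c) % 251 for c in range(64)] for r in range(32)]
right = [row[5:] + row[:5] for row in left]
print(match_horizontal(left, right, y=10, x=20))  # → 5
```

The weakness the posting describes shows up exactly here: if part of the swatch is occluded in the second photo, no candidate position scores well, and the minimum-SSD answer is simply wrong.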
(Edit 01/05/2017 : Also, if we assume that a 2D grid of pixel-swatches is being tagged for exact matching, and that only horizontal parallax is to be measured, then entire rows of rectangles that all share the same signature can be cumbersome to code for, where only the end-points change position from one photo to the next… And then their signature can end, to be replaced by another, after which, on the same row, the first signature can simply resume.
Further, if we knew that this approach was being used, then we could safely infer that the number of mesh-units we derive will correspond to the number of rectangles into which each photo has been subdivided, not to the number of pixels. )
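The row-of-identical-signatures situation just described can be illustrated with a run-length collapse: instead of comparing every rectangle in a row between photos, only the run boundaries need comparing. This is my own hypothetical sketch of the idea, not anything the posting claims a product does.

```python
from itertools import groupby

def runs(row):
    """Collapse a row of rectangle signatures into (signature, length) runs,
    so that only the run end-points need comparing between photos."""
    return [(sig, len(list(g))) for sig, g in groupby(row)]

# The same row in two photos: one signature ends, another replaces it,
# and then the first resumes - only the boundary positions have moved.
row_photo_1 = ['A', 'A', 'A', 'B', 'B', 'A', 'A']
row_photo_2 = ['A', 'A', 'B', 'B', 'B', 'A', 'A']

print(runs(row_photo_1))  # → [('A', 3), ('B', 2), ('A', 2)]
print(runs(row_photo_2))  # → [('A', 2), ('B', 3), ('A', 2)]
```

Comparing the two run lists makes the parallax of the ‘B’ region visible as a one-rectangle shift of its end-points, which is exactly the cumbersome case the edit mentions.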
If that were to succeed, I suppose it could again form a starting-point for the micropolygon-based approach I was describing.
I do know of at least one consumer-grade product which uses micropolygons.
(Edit 01/05/2017 : One of the tasks which “123D Catch” allows its users to perform is to download and install a companion PC-Desktop-Application, from which a regular 3D model corresponding to our capture can be downloaded and exported, after which that model can be analyzed with whatever 3D model editors we already have available on our PC.
When I did this with a past, successful capture, I found that 123D Catch had assumed the maximum amount of tessellation, and had given me a regular 3D model with millions of vertices – which is not what a standard 3D model is supposed to have. And this is where I realized that the Internet-based application stores the model in a different way than a standard OBJ File does, and makes assumptions about what Level Of Detail (LOD) a single download is asking for. At the maximum number of downloaded vertices, there is literally one per capture-photo pixel. But this result will be exported to OBJ Format anyway.
BTW, this app requires that the user take a circle of eye-level photos, after which we may take optional photos looking down at the subject at an angle. The app has an on-screen widget which guides us into positioning the camera correctly, based on the motion-sensors built into the phone. I will assume that the second set of photos is for vertical parallax, while the first set is for horizontal parallax.
If we fail to hold the phone straight, the OSD warns us and suspends taking photos, until the user has corrected this problem… )
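To put the “one vertex per capture-photo pixel” observation above in perspective, here is a back-of-the-envelope calculation. The resolution is a hypothetical 5-megapixel phone camera – I do not know what resolution 123D Catch actually captures at – and the triangle count assumes a regular grid mesh, which is also my assumption.

```python
# Hypothetical capture resolution (a typical 5-megapixel phone camera).
width, height = 2592, 1936

# One vertex per capture-photo pixel:
vertices = width * height

# A regular grid of vertices yields two triangles per grid cell:
triangles = 2 * (width - 1) * (height - 1)

print(vertices, "vertices,", triangles, "triangles")
# → 5018112 vertices, 10027170 triangles
```

So a single photo's worth of maximum-LOD tessellation already lands in the millions of vertices, consistent with what the exported OBJ File contained.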
While this is being marketed as a personal app for mobile devices, it requires that the devices be online, because they send the actual derivations to be performed to a more-powerful server on the Internet. My smart-phone and tablet do not have a GPU strong enough to run the shader which I detailed above – are you kidding me? But my smart-phone does have the less-powerful type of GPU, which allows me to download the simplified versions of my captures and to inspect them from different 3D angles, as basic 3D models – without GS support.
And, I have just checked that this app is still supported by their servers, where some of my own past captures are still being stored for me.
(Edit 01/09/2017 : The captures remained stored on the device that requested them, and will presumably stay there. Yet, the most recent public successes shared on the server are available for me to view, on both my mobile devices.
I just upgraded to a Premium account, years after my old captures took place. After that, I created a new capture using my phone, which, after I shared it, also appeared successfully on my tablet. Here is the link:
Unfortunately, the 3D View button does not work on Firefox, but the other perspective-view buttons do work. This is the explanation.
The app has a feature which allows following other users who have the same app installed, via publicly-shared feeds, the way ‘Instagram’ does.)
By making such a surprising, simplifying assumption (involving micropolygons), it can sometimes be possible to achieve success where a more-analyzed approach was not able to.
(Edit 01/05/2017 : ) IIRC, one of the simplifying assumptions which “123D Catch” makes in its creation of a basic 3D mesh, is that out of the several camera positions, one contains the entire scene, accurately as it appears in 2D. Then, I believe, the patches that belong to that view are projected 3-dimensionally along lines which converge at that one camera position, in order to account for where matches occur in the other photos.
One advantage this brings is the possibility of outputting a 3D mesh which is more varied in its full 3D form than just a simple extrusion from a fixed rectangle.
And the reason for this would be that, however oblique the final geometry of the mesh is to this main camera position, a CPU next computes normal vectors, based on the final 3D orientation of the triangles that connect the vertices. Hence, the normal vectors computed can easily be at any angle to the main camera position, and still form the basis on which micropolygons are displaced, in a second phase.
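The two-phase idea above can be sketched in a few lines of Python: a triangle's normal comes from the cross product of two of its edges, which depends only on the triangle's own 3D orientation, never on where the main camera was; micropolygon vertices are then pushed out along that normal. This is my own illustration of the general technique, not 123D Catch's actual code, and the function names are hypothetical.

```python
def sub(a, b):
    """Component-wise difference of two 3D points (an edge vector)."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(u, v):
    """Cross product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def normalize(v):
    l = (v[0] ** 2 + v[1] ** 2 + v[2] ** 2) ** 0.5
    return (v[0] / l, v[1] / l, v[2] / l)

def face_normal(p0, p1, p2):
    """Unit normal of triangle (p0, p1, p2), from the cross product of
    two of its edges - independent of any camera position."""
    return normalize(cross(sub(p1, p0), sub(p2, p0)))

def displace(point, normal, height):
    """Second phase: push a micropolygon vertex out along the face normal."""
    return tuple(p + height * n for p, n in zip(point, normal))

# A triangle lying in the XY plane has normal (0, 0, 1), no matter where
# the main camera sat:
n = face_normal((0, 0, 0), (1, 0, 0), (0, 1, 0))
print(n)                                   # → (0.0, 0.0, 1.0)
print(displace((0.25, 0.25, 0.0), n, 0.1))  # → (0.25, 0.25, 0.1)
```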
But then one result a user can obtain, is to be presented with a 3D model which looks 100% correct at first glance, the way the online processing hands it back to him, only to find that when he rotates it on his smart-phone, it transforms into a scene representing something completely incorrect, which only happened to line up correctly according to one 2D view. These cases are failed captures.
OTOH, if it were known in advance that the subject is a human face or head, then one advantage we would have with the other methodology is the ability to substitute one of several possible ‘generic heads’ as the base-mesh, from which an extrusion of micropolygons should produce a specific face-geometry. And so this other example would not truly be one in which the basic geometry was ever really unknown or arbitrary.