One concept that has often been used in the design of Fragment Shaders and/or Materials is “DOT3 Bump-Mapping”. The way in which this scheme works is rather straightforward. A Bump-Map, which is provided as one (source) texture image out of several, does not define coloration but relief, as a kind of Height-Map. It must first be converted into a Normal-Map, which is a specially-formatted type of image in which the Red, Green and Blue channels of each texel represent floating-point values in the range (-1.0 … +1.0), even though each color channel is still only assumed to be an 8-bit value belonging to the image. There are several ways to do this, out of which one has been accepted as standard, and with it the Red, Green and Blue channels represent the X, Y and Z components of a Normal-Vector.
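As a minimal sketch of that encoding, written in Python only to show the arithmetic, and assuming the common convention in which the 8-bit value 0 stands for -1.0 and 255 stands for +1.0 (the function names are hypothetical):

```python
def decode_texel(r, g, b):
    """Map three 8-bit channels (0..255) to components in -1.0..+1.0."""
    return tuple(c / 255.0 * 2.0 - 1.0 for c in (r, g, b))


def encode_normal(x, y, z):
    """Map components in -1.0..+1.0 back to 8-bit channel values."""
    return tuple(round((c * 0.5 + 0.5) * 255.0) for c in (x, y, z))
```

Under this convention a texel that is mostly blue decodes to a vector pointing almost straight along (Z), which is why such Normal-Maps tend to look bluish.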
A problem arises in the design of simple shaders: this technique offers two Normal-Vectors, because an original Normal-Vector was already provided, interpolated from the Vertex-Normals. There are basically two ways to blend these Normal-Vectors into one: an easy way and a difficult way.
Using DOT3, the assumption is made that the Normal-Map is valid when its surface is facing the camera directly, but that the actual computation of its Normal-Vectors was never extremely accurate. What DOT3 does is to add the vectors, with one main caveat. We want the combined Normal-Vector to be accurate at the edges of a model, as seen from the camera-position, even though something has been added to the Vertex-Normal.
The way DOT3 solves this problem is by setting the (Z) component of the Normal-Map to zero before performing the addition, and by normalizing the resulting sum after the addition, so that we are left with a unit vector anyway.
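The blend just described can be sketched as follows. This is Python standing in for shader code, the function names are hypothetical, and it assumes that both vectors are already expressed in the same space:

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot3_blend(vertex_normal, map_normal):
    """Add the map's X and Y to the interpolated normal, then renormalize."""
    mx, my, _ = map_normal                 # the map's Z is treated as zero
    nx, ny, nz = normalize(vertex_normal)  # interpolation may have shortened it
    return normalize((nx + mx, ny + my, nz))
```

With a flat Normal-Map texel, whose X and Y are zero, the function simply returns the normalized Vertex-Normal.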
On that assumption, the (X) and (Y) components of the Normal-Map can just as easily be computed as a differentiation of the Bump-Map, in two directions. If we want our Normal-Map to be more accurate than that, then we should also apply a more-accurate method of blending it with the Vertex-Normal, than DOT3.
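A minimal sketch of such a differentiation, assuming the Bump-Map is available as a 2D grid of heights and using central differences. The function name, the `strength` factor and the sign convention are my own assumptions here:

```python
def bump_to_xy(height, u, v, strength=1.0):
    """Return the (X, Y) terms at texel (u, v) of a 2D grid of heights."""
    rows, cols = len(height), len(height[0])
    # Neighbours are clamped at the borders, so slopes there are underestimated.
    left = height[v][max(u - 1, 0)]
    right = height[v][min(u + 1, cols - 1)]
    down = height[max(v - 1, 0)][u]
    up = height[min(v + 1, rows - 1)][u]
    # A surface that rises toward +U tilts its normal toward -U, hence the sign.
    return (-(right - left) * 0.5 * strength,
            -(up - down) * 0.5 * strength)
```

For a grid that ramps upward from left to right, every interior texel yields a negative (X) term and a zero (Y) term.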
And so there exists Tangent-Space Mapping. According to Tangent-Mapping, the Vertex-Normal is also associated with at least one tangent-vector, as defined in model space, and a bitangent-vector must either be computed by the Vertex Shader, or provided as part of the model definition, as part of the Vertex Array.
What the Fragment Shader must do next is to assume that the Vertex-Normal, Tangent and Bitangent vectors correspond to the Z, X and Y components of the Normal-Map respectively, and to normalize them, since anything interpolated from unit vectors cannot be assumed to have remained a unit vector. It then treats them as though they formed the columns of another matrix, if Mapped Normal-Vectors multiplied by this texture matrix are simply to be rotated in 3D, into View Space.
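Multiplying by a matrix whose columns are the Tangent, Bitangent and Normal amounts to a weighted sum of those three vectors. A minimal sketch, assuming all three have already been normalized and brought into View Space (the function name is hypothetical):

```python
def tangent_to_view(tangent, bitangent, normal, mapped):
    """Rotate a Normal-Map vector into View Space; T, B, N are the columns."""
    mx, my, mz = mapped
    # Each output component is one row of the matrix times the mapped vector.
    return tuple(mx * t + my * b + mz * n
                 for t, b, n in zip(tangent, bitangent, normal))
```

A mapped vector of (0, 0, 1), meaning 'no bump', comes out as the Vertex-Normal itself.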
(Above corrected 07/05/2018.)
I suppose I should add, that these 3 vectors were part of the model definition, and needed to find their way into View Space, before building this matrix. If the rendering engine supplies one, this is where the Normal Matrix would come in – once per Vertex Shader invocation.
Ideally, the Fragment Shader would perform a complete Orthonormalization of the resulting matrix, but to do so also requires a lot of GPU work in the FS, and would therefore assume a very powerful graphics card. But an Orthonormalization will also ensure, that a Transposed Matrix does correspond to an Inverse Matrix. And the sense must be preserved, of whether we are converting from View Space to Tangent-Space, or from Tangent-Space into View Space.
One main problem with Tangent-Mapping is the fact, that there is more than one valid way to carry it out. If all we want to derive is surface-brightness according to what DOT3 used to do, then to map View Space to Tangent-Space is slightly less expensive, but then if we want to derive such factors as a camera-space-reflection vector, let us say in order for an environment cube to appear as if reflected correctly by all the bumps in a model, we need to convert from Tangent-Space to View Space.
If we choose to rotate View-Space coordinates into Tangent-Space, then we must do so for each of our light-source direction vectors, as well as for the (unit) Camera-Z Vector, in order for the reflected Camera-Vector to form an accurate dot-product with the light-source vector.
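Going in this direction uses the transposed matrix, which comes down to one dot product per basis vector. The sketch below is only valid under the assumption that the three basis vectors are orthonormal, and its function names are hypothetical:

```python
def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def view_to_tangent(tangent, bitangent, normal, v):
    """Rotate a View-Space vector into Tangent-Space (transposed matrix)."""
    return (dot(tangent, v), dot(bitangent, v), dot(normal, v))
```

The same function would be applied to every light-source direction and to the camera vector, so that all of them end up in the same space as the Normal-Map.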
If we decide only to rotate our Tangent-Space vectors into View Space, then we do not have enough information to perform Parallax-Mapping as well.
Another main problem with Tangent-Mapping the Normal-Map in either direction is the fact that its vectors were plausible while the surface was facing the virtual camera, but that as we rotate the surface away from the virtual camera, more and more of its texels will seem to form vectors that are actually pointing away from the camera.
This anomaly can also be related to the fact, that U,V texture coordinates were simply interpolated from the same Vertex parameters, without taking into account whether one texel would actually remain visible, Tangent-Mapped. And so texels will initially be referenced, which should no longer be so.
One way to solve that problem is to suggest that whenever we Tangent-Map a model, we should also Parallax-Map it. What happens with Parallax-Mapping is that per Fragment, the U,V texture coordinates from which the texels are being sampled are displaced in the texture-U,V space, i.e. in the Tangent-Space, to simulate parallax at the per-pixel level. In reality, Parallax-Mapping works at the texel-level, and does not truly affect which screen-pixel the current Fragment is being rendered to.
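One simple form of this displacement can be sketched as follows. It assumes a unit eye-vector that has already been rotated into Tangent-Space and points from the surface toward the camera. The function name and the `scale` factor are hypothetical, and the sign would flip if a Depth-Map were used instead of a Height-Map:

```python
def parallax_uv(u, v, height, eye_ts, scale=0.04):
    """Shift (u, v) along the Tangent-Space eye vector by the sampled height."""
    ex, ey, ez = eye_ts       # ez must be > 0, i.e. the surface faces the eye
    offset = height * scale   # scale keeps the displacement small
    return (u + ex / ez * offset, v + ey / ez * offset)
```

Looking straight down the Normal, the coordinates are left unchanged, while the displacement grows as the viewing angle becomes more oblique.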
If we wanted our Fragment Shader to work with vectors that are both in Tangent-Space, and in View Space, then the added cost threatens to arise, of also working with two matrices, one of which might be the transpose of the other. Because the GPU operation to transpose a matrix is non-trivial, and because the Fragment Shader does run – in principle – once per screen-pixel, there would be an additional cost, over only computing this once per vertex. I would consider the question carefully, of whether we needed both vector categories to apply accurate, orthonormal transformations. In such a case, only the eye-vector would really need to remain in Tangent-Space, and not orthonormal?
Parallax-Mapping can be extended so as to Kill the current Fragment, i.e. not to render it, as that is an instruction available to a Fragment Shader. Doing so selectively can make the illusion more complete, as though the edges of a model either were or were not being rendered depending on whether they stood out enough, and as though the screen-pixel being rendered to was being affected. It is not.
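I have not specified a criterion for the Kill above. One that I assume here, purely for illustration, is to drop the Fragment whenever its displaced coordinates have left the (0.0 … 1.0) range of a texture that is not tiled. The function name is hypothetical, and `None` stands in for the shader's Kill instruction:

```python
def sample_or_kill(u, v):
    """Return (u, v) if still inside the texture, or None to signal a Kill."""
    if 0.0 <= u <= 1.0 and 0.0 <= v <= 1.0:
        return (u, v)
    return None  # the Fragment Shader would discard this Fragment
```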
Parallax-Mapping requires use of the Height-Map, or an associated Depth-Map, and requires that the lateral displacement has accurately been rotated into Tangent-Space from View Space. And if it has been carried out correctly, Parallax-Mapping will generate texture-coordinates that diverge, away from texels that should no longer be visible to the virtual camera.
This should also remove any texels from view, whose Tangent-Mapped Normal-Vectors are facing away from the camera.
But other approaches have also been devised, which are not as expensive for the GPU to compute.
In any case, the availability of these advanced mapping-methods requires, that the shader be fed both a Normal-Map, as well as either a Height-Map or a Depth-Map, and thus requires that numerous texture images were assigned, as sources of data. Certain rendering engines I know of, actually reserve the Alpha-Channel of the source-texture to be its Height-Map, while continuing to use R, G and B as its Normal-Map.
It remains a constant, that the conversion of a Height-Map into a Normal-Map needs to be done on the CPU, just as the generation of accurate per-vertex Normal-Vectors does. And the reason for this is the fact, that each Fragment Shader invocation uses the per-texel data it loads, and then forgets that data again, by the time of the next invocation.
Further, I might mention that the use of one Material Parameter such as Gloss for an entire model, which can be fed in as a Shader Uniform parameter, has often made simulated 3D models seem fake, because different parts of the same mesh could represent different types of surface. I.e., if we wanted the model to appear ‘woody’, the entire model would need to be woody, or not so.
If it has become accepted to feed our Fragment Shader numerous texture images as input, then we can also assign a material-texture, one color-channel of which can define how much Gloss each Fragment is supposed to receive, so that one part of the same mesh can appear more glossy than other parts.
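As a minimal sketch of that idea, the function below scales a specular highlight by a per-texel Gloss value instead of a single Uniform. The highlight formula, the `shininess` exponent and the function name are assumptions of mine, not something a particular engine prescribes:

```python
def specular_term(n_dot_h, gloss_texel, shininess=32.0):
    """Scale a specular highlight by an 8-bit Gloss value from a texture."""
    gloss = gloss_texel / 255.0  # 0 = completely matte, 255 = fully glossy
    return gloss * max(n_dot_h, 0.0) ** shininess
```

A texel of 0 in the material-texture suppresses the highlight entirely for that Fragment, whatever the lighting.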
I am sure the reader has played games, in which a single model represents a person, but several regions of which represent combat-equipment while others represent skin. The combat-equipment is supposed to be more metallic or glossy, than the skin is…
But at that point we are contemplating the fabled ‘Uber-shader’, that everybody talks about, but which we should actually avoid trying to implement.
If the model was not tangent-mapped, there is a way to fake that. Properly, the Tangent Vector defines which direction in Model Space corresponds to the U texture coordinate. We can just insert the assumption that our 3D application has a special coordinate – like the North Pole of the Earth – and that the Tangent Vector is always assumed to run parallel to the equator, while the Bitangent always points True North, along the surface of the Earth.
The trick would be to form a cross-product, of the special vector by the real, Normal Vector, and to normalize that, to obtain tangent. And then we would re-cross the Normal Vector with this pseudo-tangent, to obtain bitangent. Of course this would fail at points on the model surface which point exactly along the special direction.
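The two cross-products can be sketched as follows. The default special direction of (0, 1, 0), the tolerance `eps` and the function names are assumptions made for this example:

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pseudo_tangent_frame(normal, special=(0.0, 1.0, 0.0), eps=1e-6):
    """Return (tangent, bitangent), or None where the trick breaks down."""
    n_len = math.sqrt(sum(c * c for c in normal))
    n = tuple(c / n_len for c in normal)
    t = cross(special, n)
    t_len = math.sqrt(sum(c * c for c in t))
    if t_len < eps:
        return None  # the normal points along the special direction
    t = tuple(c / t_len for c in t)
    return t, cross(n, t)  # bitangent = normal x pseudo-tangent
```

For a wall whose Normal is (0, 0, 1), this yields a tangent of (1, 0, 0) and a bitangent of (0, 1, 0), so the bitangent does point 'True North'.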
Ideally, in the case of scene-geometry, this special direction would point straight up off the ground, which, depending on which rendering system we are using, is either the (Y) or the (Z) world-coordinate. We would be careful not to render the ground or the ceilings with a Tangent-Mapping shader in that case, only walls.
When performing the Gram-Schmidt orthonormalization in this context, it is important to use as the starting vector v1 the one whose direction we want preserved most reliably. This would be the texture-(Z) component. That way, if the other vectors end up exactly parallel, the process will reduce them to zero. And if texture-(X) and (Y) are in fact zero, the worst outcome will be a flat-shaded surface – maybe as opposed to a black surface?
The other modification to Gram-Schmidt which I would mention, and which reduces the amount of GPU computation required, is that as soon as u1 has been recomputed, it is made a unit vector. Then, when the u2 component of the matrix is being recomputed, the dot product of u1 with v2 directly gives the exact scalar multiple of u1 which must be subtracted from v2 to leave u2, which is again normalized as soon as it is determined. Likewise, the dot-products of u1 and u2 with v3 give the exact scalar multiples of u1 and u2 which must be subtracted from v3 to leave u3 … Gram-Schmidt by itself does not presume that the resulting vectors u1, u2, or u3 need to be unit vectors.
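A minimal sketch of this variant, with the Normal passed in as v1. Returning a zero vector for a degenerate input, rather than dividing by zero, is my own assumption about how the parallel case should be handled, as are the function names and the tolerance `eps`:

```python
import math


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize_or_zero(v, eps=1e-8):
    """Unit vector along v, or the zero vector if v has (almost) no length."""
    length = math.sqrt(dot(v, v))
    if length < eps:
        return (0.0, 0.0, 0.0)
    return tuple(c / length for c in v)


def orthonormalize(v1, v2, v3):
    """Gram-Schmidt in which each u is normalized as soon as it is found."""
    u1 = normalize_or_zero(v1)
    s1 = dot(u1, v2)  # exact multiple of u1 to remove, since |u1| = 1
    u2 = normalize_or_zero(tuple(b - s1 * a for a, b in zip(u1, v2)))
    s1, s2 = dot(u1, v3), dot(u2, v3)
    u3 = normalize_or_zero(tuple(c - s1 * a - s2 * b
                                 for a, b, c in zip(u1, u2, v3)))
    return u1, u2, u3
```

Because u1 and u2 are already unit vectors, no division by their squared lengths is needed when projecting, which is where the saving comes from.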