One of the facts which the WiKiPedia mentions, is, that for 20 years, there was a patent on the “Marching Cubes” algorithm, which basically forced some software developers – especially Linux and other, open-source developers – to use “Marching Tetrahedra” as an alternative. But I think that this article has one flaw:

Its assumptions are too modern.

What this article states is that, like Marching Cubes, individual tetrahedra can be fed to the GPU as “Triangle Strips”. The problem with this is the fact, that triangle *strips* are only recognized by the GPU, if DirectX 10(+), or OpenGL 3(+) is available, which means, that ‘a real Geometry Shader’ needs to be running.

Coders were working with Iso-surfaces, during the DirectX 9.0c / OpenGL 2 days, when there were no real Geometry Shaders. And then, one of the limitations that existed in the hardware was, that even if the Fragment Shader received vertices grouped as triangles, usually, Vertex Shaders would only get to probe one vertex at a time. So, what early coders actually did was, to implement a kind of poor man’s geometry shader, within the Fragment Shader. This was possible because one of the pixel formats which the FS could output, also corresponded to one of the vertex formats, which a VS could read as input.

Hence, a Fragment Shader running in this fashion would render its output – under the pretense that it would form an image – into the Vertex Buffer of another rendering pipeline. This was therefore appropriately named “Render-To-Vertex-Buffer”, or, ‘R2VB‘. And today, graphics cards exist, *which no longer permit R2VB*, but which permit OpenGL 4 and/or real Geometry Shaders, the latter of which, in turn, can group their Output Topologies into Triangle Strips.

This poses the question, ‘Because any one shader invocation can only see its own data, how could this result in a Marching Tetrahedra implementation?’ And I don’t fully know the answer.

Today, I can no longer imagine in a satisfyingly complete way, how the programmers in the old days solved such problems. Like many other people today, I need to imagine that the GPU does offer a Geometry Shader – a GS – explicitly, in order to implement a GS.

In a slightly different way, Marching Tetrahedra will continue to be important in the near future, because coders needed to implement the algorithm on the CPU, not the GPU, because they had Iso-Surfaces to render, but no patent-rights to the Marching Cubes algorithm, and, because programmers are not usually asked to rewrite all their predecessors’ code. Hence, code exists, which does all this purely on the CPU, and for which the man-hours don’t exist, to convert it all to Marching Cubes code.

(Update 5/09/2020, 17h30… )

(As of 5/09/2020, 13h55: )

I can try to guess, that one key aspect to how such an Iso-Surface (*shaped using ‘R2VB’*) was supposed to work was, that the CPU was still the processor, which would change the vertex-indices, in the (output) index-buffer, to triples of vertices, that arose out of the enumeration, from which input vertices were ‘above the threshold’. (:1)

Yet, Vertex Shaders running on the GPU might have had as their only function, to read the vertex attributes of 4 input vertices, instead of 1 input vertex, in one invocation, and just, to output what the positions of the interpolated 6 output vertices were supposed to be, *regardless of whether each output vertex was in fact, still a part of the resulting output mesh*, for any given frame-cycle.

*I’m assuming that the algorithm would also first test, whether the Density values at the endpoints were equal, which would be the only situation in which a simple interpolation fails to result in a valid extrapolation. If they were, then the algorithm would apply a blend-value of (0.0), which in turn would mean that the ~interpolated~ output-position would equal either input-position. But then, this output-position would no longer be part of the rendered Iso-Surface. Otherwise, the interpolation-become-extrapolation would output non-erroneous positions and/or normal-vectors*.

I’m assuming here that the vertices were stored in the (input) vertex buffer separately, as belonging to each tetrahedron, so that the VS would advance by 4 vertices, if it was also programmed to read-in 4 vertices. This wastes some VRAM, but makes the attempt look more feasible. *And of course, as far as the VS was concerned, a set of 4 real vertices, consisted of ‘1 virtual super-vertex’*.

As usual, each input vertex would have a vector of coordinates as attribute, at which a volume texture can be sampled by the VS.

(Update 5/09/2020, 16h00: )

I suppose that a rather murky question which can follow would be, which of the computational tasks in such a ‘rendering pipeline’, would continue to be the responsibility of the VS, and which would be the responsibility of the FS. And one answer I’d venture would be, that varyings passed from the VS to the FS must be chosen such, that hypothetically, an interpolation could be performed on them, while the output of one FS invocation would finally pack the results into one notional output pixel, from which no more interpolation would be performed, before it gets read-in by the second rendering pipeline’s VS, as one vertex structure.

Therefore, the answer will follow deterministically, from how ‘R2VB’ works. The output-pixels that correspond to a subsequent vertex buffer’s vertices, still have coordinates, belonging to the first output-buffer’s pixel resolution. Further, the VS must still finish by outputting a position vector, that corresponds to where, within this virtual output image, a fragment will eventually be rendered to, independently of where, from the second rendering pipeline, the Iso-Surface will be positioned.

What this means is, that because sets of 6 output-vertices are assumed, the resolution of the output image *could be* (6 by *N*), where (*N*) is the number of (input) tetrahedra. And in that case, there will be (*N*) VS-invocations, the output-positions of which only differ along the (Y-axis). (:2)

The VS must organize what it passes on to the FS in such a way, that each FS-invocation can pack a complete output-vertex, on its own:

- Therefore, the VS must also organize its set of varyings, as a set of fully interpolated vectors, each of which may consist of 6 position,
*and*6 normal-vectors, - It would be the responsibility of the FS-invocation, to look at
*its own*(X-coordinate) in this case, which would correspond to a column-number from [0 .. 5], and therefore, to output the selected varying, received from the VS.

Now, if I am to consider the task of doing all this with an OpenGL 3(+) Geometry Shader, then I can start to imagine a lot better. What can happen here is, that the Input Topology could be a Point, but that the GS-invocation could sample a volume texture 8 times, as a function of the 3D coordinates of that point, and that it could then output a Triangle *List*, that corresponds to the Marching *Cubes* algorithm…

(Update 5/09/2020, 14h45: )

**1:)**

There is an underlying fact, which I know with certainty to be true, which is, that legal index-buffer values must exist, which simply turn a given triangle ‘off’, so that when the command is given to render what’s implied by the entire index-buffer, many triples of index-numbers, result in no output. And this has to happen, without resulting in a malfunction. But as is often the case, I can think of 2 mutually exclusive ways this could have been implemented, and not just 1:

- The vertex-number (0) may have such a special meaning, such that real vertex numbers start from (1), and a triple indicating its triangle should be switched off, could be [0, 0, 0],
- Vertex-numbers of real vertices may start from (0), but vertex-number (-1) in two’s compliment may have such a special meaning, so that a triple indicating its triangle should be switched off, would be [-1, -1, -1].

Whichever the case may be, the reader would need to confirm, before writing any code.

(Update 5/09/2020, 17h30: )

**2:)**

I suppose that there would also be an implementation-specific detail, which makes it easier for the VS to compute its output-positions, which, as I just wrote, should really only progress along one coordinate out of (X) or (Y). Each vertex should also have as attribute, the tetrahedron-number it belongs to, an integer which could rage from {j ∈ [0 .. T-1]}, if (T) next becomes the total number of tetrahedra. That way it could easily be coded into the VS:

```
float Y := (((j + 0.5) / T) * 2.0) - 1.0;
```

Dirk