Quickie: How 2D Graphics is just a Special Case of 3D Graphics

I have previously written in-depth about what the rendering pipeline is, by which 3D graphics are rendered to a 2D, perspective view, as part of computer games, or as part of other applications that require 3D in real time. But one problem with my writing in-depth might be that people fail to see the relevance of the words, if the word-count goes beyond 500 words. :-)

So I’m going to try to summarize it more briefly.

Vertex-Positions in 3D can be rotated and translated, using matrices. Matrices can be composited, meaning that if a sequence of multiplications of position-vectors by known matrices accomplishes what we want, then a multiplication by a single, derived matrix can accomplish the same thing.
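
To illustrate that composition with a small, self-contained sketch (not tied to any particular graphics API, and using a column-vector convention of my own choosing): a rotation and a translation are combined into one derived matrix, and each vertex then only needs one multiplication.

```cpp
#include <array>
#include <cmath>
#include <cstdio>

using Mat4 = std::array<std::array<float, 4>, 4>;
using Vec4 = std::array<float, 4>;

// Multiply two 4x4 matrices: result = a * b.
Mat4 mul(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

// Apply a matrix to a position vector (column-vector convention).
Vec4 transform(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int k = 0; k < 4; ++k)
            r[i] += m[i][k] * v[k];
    return r;
}

int main() {
    float angle = 0.5f;  // rotation about the Z axis, in radians
    Mat4 rotate = {{
        { std::cos(angle), -std::sin(angle), 0.0f, 0.0f },
        { std::sin(angle),  std::cos(angle), 0.0f, 0.0f },
        { 0.0f,             0.0f,            1.0f, 0.0f },
        { 0.0f,             0.0f,            0.0f, 1.0f },
    }};
    Mat4 translate = {{
        { 1.0f, 0.0f, 0.0f, 2.0f },  // move +2 along X
        { 0.0f, 1.0f, 0.0f, 0.0f },
        { 0.0f, 0.0f, 1.0f, 0.0f },
        { 0.0f, 0.0f, 0.0f, 1.0f },
    }};

    // Composite the two steps into one derived matrix...
    Mat4 composite = mul(translate, rotate);

    // ...so a single multiplication per vertex accomplishes both.
    Vec4 vertex = { 1.0f, 0.0f, 0.0f, 1.0f };
    Vec4 out = transform(composite, vertex);
    std::printf("(%f, %f, %f)\n", out[0], out[1], out[2]);
}
```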

Under DirectX 9 or OpenGL 2.x, 3D objects consisted of vertices that formed triangles. The positions and normal-vectors of those vertices were transformed and rotated, respectively, and the vertices additionally possessed texture-coordinates; all of this could be processed by “Vertex Pipelines”. The output from the Vertex Pipelines was then rasterized and interpolated, and fed to “Pixel Pipelines”, which performed per-screen-pixel computations on the interpolated values, and on how those values were applied to the Texture Images being sampled.

All this work was done by dedicated graphics hardware, which is now known as a GPU. It was not done by software.

One difference that exists today is that the specialization of GPU cores into Vertex- and Pixel-Pipelines no longer exists. Due to something called the Unified Shader Model, any one GPU core can act either as a Vertex-Shader or as a Pixel-Shader, and powerful GPUs possess hundreds of cores.

So the practical question does arise, of how any of this applies to 2D applications, such as Desktop Compositing. And the answer would be that it has always been possible to render a single rectangle, as though oriented in a 3D coordinate system. This rectangle, which is also referred to as a “Quad”, first gets Tessellated, which means that it receives a diagonal subdivision into two triangles, which still reference the same 4 vertices as before.
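
As a concrete illustration of what such a Quad can amount to in memory, here is a layout of my own devising: four shared vertices, plus six indices that spell out the diagonal subdivision into two triangles.

```cpp
#include <cstdint>

// Four corner vertices of a unit quad, as (x, y) pairs.
// Both triangles below reference these same 4 vertices.
static const float quadVertices[4][2] = {
    { 0.0f, 0.0f },  // 0: bottom-left
    { 1.0f, 0.0f },  // 1: bottom-right
    { 1.0f, 1.0f },  // 2: top-right
    { 0.0f, 1.0f },  // 3: top-left
};

// The diagonal subdivision: two triangles, six indices in total.
static const std::uint16_t quadIndices[6] = {
    0, 1, 2,   // first triangle
    0, 2, 3,   // second triangle
};
```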

When an application receives a drawing surface onto which it draws its GUI – using CPU-time – the corners of this drawing surface have 2D texture coordinates that are combinations of 0 and (+1). The drawing-surfaces themselves can be input to the GPU as though they were Texture Images. And the 4 vertices that define the position of the drawing surface on the desktop can simply result from a matrix that is much simpler than any matrix would have needed to be, if it had performed rotation in 3D etc., before a screen-position could be formed from it. Either way, the Vertex Program only needs to multiply the (notional) positions of the corners of a drawing surface by a single matrix, before a screen-position results. This matrix does not need to be computed from complicated trig functions in the 2D case.
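
As a sketch of just how simple that matrix can be in the 2D case: all it has to encode is a scale and a translation, mapping corners whose coordinates run from 0 to 1 onto a position and size on the desktop. The function below and its column-vector convention are my own illustration, not any compositor’s actual code.

```cpp
// Build the single matrix that places a drawing surface on the desktop.
// No trig functions are needed: only a scale and a translation, mapping
// a corner (u, v), with u and v in [0, 1], to (x + u * width, y + v * height).
struct Mat4 { float m[4][4]; };

Mat4 windowPlacement(float x, float y, float width, float height) {
    Mat4 r = {{
        { width, 0.0f,   0.0f, x    },
        { 0.0f,  height, 0.0f, y    },
        { 0.0f,  0.0f,   1.0f, 0.0f },
        { 0.0f,  0.0f,   0.0f, 1.0f },
    }};
    return r;
}
```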

And the GPU renders the scene to a frame-buffer, just as it would render a 3D game.


About the Black Borders Around some of my Screen-Shots

One practice I have is to take simple screen-shots of my Linux desktop, using the KDE-compatible utility named ‘KSnapshot’. It can usually be activated by just tapping the ‘Print-Screen’ key, and if not, KDE can be customized with a hot-key combination to launch it just as easily.

If I use this utility to take a snapshot of one single application-window, then it may or may not happen that the screen-shot of that window has a wide, black border. And the appearance of this border may confuse my readers.

The reason this border appears has to do with the fact that I have Desktop Compositing activated, which on my Linux systems is based on a version of the Wayland Compositor that has been built specifically to work together with the X-server.

One of the compositing effects I have enabled is to draw a bluish halo around the active application-window. Because this is introduced as much as possible at the expense of GPU power and not CPU power, it has its own way of working, specific to OpenGL 2 or OpenGL 3. Essentially, the application draws its GUI-window into a specifically-assigned memory region, called a ‘drawing surface’, but not directly to the screen-area to be seen. Instead, the drawing surface of any one application window is taken by the compositor to be a Texture Image, just as 3D Models would have Texture Images. And the way Wayland organizes its scene essentially just simplifies the computation of coordinates. Because OpenGL versions are optimized for 3D, they have a specialized way to turn 3D coordinates into 2D screen-coordinates, which the Wayland Compositor bypasses for the most part, by feeding the GPU some simplified matrices, where the GPU would have been able to accept much more complex matrices.

In the end, in order for any one application-window to receive a blue halo, to indicate that it is the one active application in the foreground, its drawing surface must be made larger to begin with than what the one window-size would normally require. And then, the blue halo exists statically within this drawing-surface, but outside the normal set of coordinates of the drawn window.

The halo appears over the desktop layout, and over other application windows, through the simple use of alpha-blending on the GPU, using a special blending-mode (a rough sketch in OpenGL terms follows the list below):

  • The inverse of the per-texel alpha determines by how much the background should remain visible.
  • If the present window is not the active window, the background simply replaces the foreground.
  • If the present window is the active window, the two color-values add, causing the halo to seem to glow.
  • The CPU can decide to switch the alpha-blending mode of an entity, without requiring the entity be reloaded.
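
In OpenGL terms, blending modes along those lines could plausibly be selected with the standard calls below. This is only my sketch of the idea, not the compositor’s actual code.

```cpp
#include <GL/gl.h>

// Hypothetical sketch: choose a blending mode before drawing one
// window's quad (including its halo region) over the scene.
void selectBlendMode(bool isActiveWindow) {
    glEnable(GL_BLEND);
    if (isActiveWindow) {
        // Additive blending: the source color is added to whatever lies
        // behind it, which makes a halo appear to glow.
        glBlendFunc(GL_SRC_ALPHA, GL_ONE);
    } else {
        // Conventional alpha-blending: the inverse of the per-texel alpha
        // decides how much of the background remains visible.
        glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
    }
}
```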

KSnapshot sometimes recognizes that, if instructed to take a screen-shot of one window, it should copy a sub-rectangle of the drawing surface. But in certain cases the KSnapshot utility does not recognize the need to do this, and just captures the entire drawing surface, minus whatever alpha-channel the drawing surface might have, since screen-shots are supposed to be without alpha-channels. So the reader will not be able to make out the effect, because by the time a screen-shot has been saved to my hard-drive, it is without any alpha-channel.

And there are two ways I know of by default, to reduce an image that has an alpha-channel to one that does not (a small sketch of the second option follows the list):

  1. The non-alpha output-image can cause the input image to appear as though in front of a checkerboard-pattern, taking its alpha into account,
  2. The non-alpha output-image can cause the input image to appear as though just in front of a default color, such as ‘black’, but again taking its alpha into account.
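
Here is a minimal sketch of that second option: one RGBA pixel flattened against a plain background color such as black. The structure and function names are only for illustration.

```cpp
#include <cstdint>

struct Rgba { std::uint8_t r, g, b, a; };
struct Rgb  { std::uint8_t r, g, b; };

// Flatten one RGBA pixel against a solid background color, taking the
// pixel's alpha into account. With a black background, any fully
// transparent region comes out as plain black.
Rgb flattenOver(Rgba src, Rgb background) {
    float a = src.a / 255.0f;
    auto mix = [a](std::uint8_t s, std::uint8_t d) {
        return static_cast<std::uint8_t>(s * a + d * (1.0f - a) + 0.5f);
    };
    return { mix(src.r, background.r),
             mix(src.g, background.g),
             mix(src.b, background.b) };
}
```

The checkerboard variant would simply pick the background color per pixel from the pattern, instead of using one constant color.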

This would be decided by a library, resulting in a screen-shot that has a wide black border around it. The border represents the maximum extent by which static, 2D effects can be drawn in – on the assumption that those effects were defined on the CPU, and not on the GPU.

So, just as the actual application could be instructed to draw its window into a sub-rectangle of the whole desktop, it can be instructed to draw its window into a sub-rectangle of its assigned drawing-surface. And with this effect enabled, this is indeed how it’s done.

Dirk


Alpha-Blending

The concept by which a single object or entity can be translucent seems rather intuitive. But another concept, which is less intuitive, is that the degree to which it is so can be stated once per pixel, through an alpha-channel.

Just as every pixel can possess one channel for each of the three additive primary colors (Red, Green and Blue), it can possess a 4th channel named Alpha, which states, on a scale from [ 0.0 … 1.0 ], how opaque it is.

This does not just apply to the texture images, whose pixels are named texels, but also to Fragment Shader output, as well as to the pixels actually associated with the drawing surface, which provide what is known as destination alpha, since the drawing surface is also the destination of the rendering, or its target.

Hence, there exist images whose pixels have a 4-channel format, as opposed to others with a mere 3-channel format.

Now, there is no clear way for a display to display alpha. In certain cases, alpha in an image being viewed is hinted at by software, as a checkerboard pattern. But what we see is nevertheless color-information and not transparency. And so a logical question can be, what the function of this alpha-channel is, which is being rendered to.

There are many ways in which the content from numerous sources can be blended, but most of the high-quality ones require that much communication take place between rendering-stages. A strategy is desired in which the output from rendering-passes is combined without requiring much communication between the passes. And alpha-blending is a de-facto strategy for that.
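
For reference, the conventional per-pixel rule looks like the following; each pass only has to agree on this one formula, rather than exchanging any other information. This is a generic sketch of alpha-blending, not a quotation of any particular API.

```cpp
struct Color { float r, g, b, a; };

// Conventional alpha-blending of one source fragment over the value
// already present in the destination pixel.
Color blendOver(Color src, Color dst) {
    Color out;
    out.r = src.r * src.a + dst.r * (1.0f - src.a);
    out.g = src.g * src.a + dst.g * (1.0f - src.a);
    out.b = src.b * src.a + dst.b * (1.0f - src.a);
    out.a = src.a + dst.a * (1.0f - src.a);
    return out;
}
```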

By default, closer entities, according to the position of their origins in view space, are rendered first. What this does is put closer values into the Z-buffer as soon as possible, so that the Z-buffer can prevent the rendering of the more distant entities as efficiently as possible. 3D rendering starts when the CPU gives the command to ‘draw’ one entity, which has an arbitrary position in 3D. This may be contrary to what 2D graphics might teach us to predict.

Alas, alpha-entities – aka entities that possess alpha textures – do not write the Z-buffer, because if they did, they would prevent more-distant entities from being rendered. And then, there would be no point in the closer ones being translucent.
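
In OpenGL terms, this typically means keeping the depth test enabled while disabling depth writes for the alpha-entities, roughly as follows; this is my own sketch, not any particular engine’s code.

```cpp
#include <GL/gl.h>

// Sketch: render state for translucent (alpha) entities. They are still
// tested against the Z-buffer, so opaque geometry can hide them, but they
// do not write to it, so they cannot hide more-distant entities.
void beginAlphaPass() {
    glEnable(GL_DEPTH_TEST);   // still respect existing depth values
    glDepthMask(GL_FALSE);     // but do not write new ones
    glEnable(GL_BLEND);
}

void endAlphaPass() {
    glDepthMask(GL_TRUE);      // restore depth writes for opaque geometry
}
```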

The default way in which alpha-blending works, is that the alpha-channel of the display records the extent to which entities have been left visible, by previous entities which have been rendered closer to the virtual camera.
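
Read that way, a sketch of the arithmetic might be the following, with the destination alpha cleared to 1.0 (fully visible) before the frame, so that each closer entity shrinks it and each more-distant entity can only contribute to what is left. This is my own illustration of front-to-back compositing, not any particular driver’s rule.

```cpp
struct Color { float r, g, b, a; };

// Front-to-back compositing, with the destination alpha read as
// "how much of this pixel has been left visible so far".
Color compositeFrontToBack(Color dst, Color src) {
    Color out;
    out.r = dst.r + src.r * src.a * dst.a;
    out.g = dst.g + src.g * src.a * dst.a;
    out.b = dst.b + src.b * src.a * dst.a;
    out.a = dst.a * (1.0f - src.a);   // remaining visibility shrinks
    return out;
}
```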


The laptop ‘Klystron’ suspends to RAM half-decently.

One subject which I have written a lot about, was that as soon as I close the lid of the laptop I name ‘Klystron’, it seems to lose its WiFi signal, and that this can get in the way of comfortable use, because closing the lid also helps shield the keyboard from dust, etc.

This Linux laptop boots decently fast, and yet it is still a hassle to reboot very often. And so I needed to come up with a different way of solving my problem, on a practical level. My solution for now is to tell the laptop to Suspend To RAM as soon as I close the lid. That way, the WiFi signal is gone more properly, and when the laptop resumes its session, the scripts that govern this behavior also re-initialize the WiFi chipset and its status on my LAN. This causes less confusion with running Samba servers etc. on my other computers.

There is a bit of terminology which I do not think the whole population understands, but which I think people are simply using differently from how it was used in my past.

It used to be that under Linux, we had ‘Suspend To RAM’ and ‘Suspend To Disk’. In the Windows world, these terms corresponded to ‘Standby’ and ‘Hibernate’ respectively. Well, in the terminology of today, they stand for ‘Sleep’ and ‘Hibernate’, borrowing those terms from mobile devices.

There are two types of Suspend working in any case.

In past days of Linux, we could not cause a laptop just to Hibernate. We needed to install special packages and modify the Grand Unified Bootloader before we could even Suspend To Disk. Suspending To RAM used to be less reliable. Well, one development with modern Linux which I welcome, among many, is the fact that Sleep and Hibernate should, in most cases, work out-of-the-box.

I just tried Sleep mode tonight, and it works 90%.

One oddity: When this laptop Resumes, a CPU Error message is displayed on the screen numerous times. But after a few seconds of CPU errors, the session is apparently restored without corruption. Given that I have 300(+) processes, I cannot be 100% sure that the Restore is perfectly without corruption. But I am reasonably sure, with one exception:

The second oddity is of greater relevance. After Waking Up, the clock of the laptop seems to be displaced 2 days and a certain number of hours into the future. This bug has been observed on some other devices, and I needed to add a script to the configuration files as a workaround, which simply sets the system clock back that many days and that many hours after Waking. Thankfully, I believe that doing so was as much of a workaround as was needed.

One side-effect of not having done so before being aware of the problem, was that the ‘KNotify’ alarms for the next two days all sounded as well, so that it will take another two days before personal organizer – PIM – notifications may sound for me again.

The fact that numerous CPU errors are displayed bothers me not. What that means, is that the way the CPU goes to sleep, and then wakes up, involves power-cycling in ways that do not guarantee the integrity of data throughout. But it would seem that good programming of the kernel does provide data integrity, with the exception of the system clock issue.

But the fact that the hardware is a bit testy when using the Linux version of Sleep also suggests that maybe this is the kind of laptop that powers down its VRAM. It is a good thing, then, that I disabled the advanced compositing effects that involve vertex arrays.



There is a side-note I wanted to make, on the desktop cube animation.

In general, when raster-rendering a complex scene with models, each model is defined by a vertex array, an index array, one or more texture images, etc., and the vertex array stores the model geometry statically, relative to the coordinate-origin of the model. Then, a model-view-projection matrix is applied – or just a rotation matrix, for the normal vectors – to position it with respect to the screen. Moving the models is then a question of the CPU updating the model-view matrix only.
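
A rough sketch of that division of labour, with made-up type names: the geometry is uploaded once and stays put, while per-frame motion only touches one small matrix.

```cpp
#include <cstdint>
#include <vector>

struct Mat4 { float m[4][4]; };

// Static, per-model data: uploaded to the GPU once and then left alone.
struct Model {
    std::vector<float>         vertices;   // positions, normals, texture coords
    std::vector<std::uint16_t> indices;    // e.g. a triangle list
    unsigned                   textureId;  // handle of a texture image
};

// Per-frame data: only this small matrix changes when the model moves,
// so the CPU never has to rewrite the geometry itself.
struct Instance {
    const Model* model;
    Mat4         modelView;   // updated by the CPU each frame
};
```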

Well, when a desktop cube animation is the only model in the scene, as part of compositing, I think that the way in which this is managed differs slightly. I think that what happens here is that instead of the cube having vertex coordinates of +/- 1 all the time, the model-view matrix is kept as an identity matrix.

Instead, the actual vertex data is rewritten to the vertex array, to reposition the vertices with respect to the view.
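
If that guess is right, then in OpenGL terms the compositor would be overwriting the vertex buffer each frame with freshly computed positions, while the matrix stays an identity. A hypothetical sketch, assuming an extension loader such as GLEW and a buffer object named cubeVbo that already exists:

```cpp
#include <GL/glew.h>   // assumed extension loader providing the buffer functions
#include <cstddef>

// Hypothetical: re-upload freshly computed cube vertex positions each
// frame. 'cubeVbo' and 'positions' are illustrative names only.
void updateCubeGeometry(unsigned cubeVbo, const float* positions,
                        std::size_t floatCount) {
    glBindBuffer(GL_ARRAY_BUFFER, cubeVbo);
    glBufferSubData(GL_ARRAY_BUFFER, 0,
                    static_cast<GLsizeiptr>(floatCount * sizeof(float)),
                    positions);
}
```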

Why is this significant? Well, if it were true that Suspending the session To RAM also cut power to the VRAM, it would be useful to know which types of data stored therein will seem corrupted after a resume, and which will not.

Technically, texture images can also get garbled. But if all it takes is one frame cycle for texture images to get refreshed, the net result is that the displayed desktop will look normal again, by the time the user unlocks it.

Similarly, if the vertex array of the only model is being rewritten by the CPU, doing so will also rewrite the header information in the vertex array that tells the GPU how many vertices there are, as well as rewriting the normal vectors, as when they are part of any normal vertex animation, etc. So anything resulting from the vertex array should still not look corrupted.

But one element which generally does not get rewritten is the index array. The index array states, in its header information, whether the array is a point list, a line list, a triangle list, a line strip, a triangle strip… It then states how many primitives exist for the GPU to draw. And then it states sets of elements, each of which is a vertex number.
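
For concreteness, this is roughly what that information amounts to when it reaches an OpenGL draw call: the topology, the element count and the list of vertex numbers. The buffer name is made up; the calls themselves are standard.

```cpp
#include <GL/glew.h>   // assumed extension loader providing the buffer functions
#include <cstdint>

// Index data for one quad, as a triangle list: two primitives,
// six elements, each element being a vertex number.
static const std::uint16_t quadIndices[6] = { 0, 1, 2,  0, 2, 3 };

// Issue the draw call, assuming 'indexBuffer' already holds quadIndices.
// GL_TRIANGLES is the topology and 6 is the element count.
void drawQuad(unsigned indexBuffer) {
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, indexBuffer);
    glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, nullptr);
}
```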

The only theoretical reason for which the CPU would rewrite that, would be if the topology of the model were to change, which is as good as never in practice. And so, if the VRAM gets garbled, what was stored in the index array would get lost – and not refreshed.

And this can lead to the view of numerous nonsensical triangles on the screen, which many of us have learned to associate with a GPU crash.


Dirk