“Hardware Acceleration” is a bit of a Misnomer.

The term gets mentioned quite frequently: certain applications offer the user services with “Hardware Acceleration”. This terminology can in fact be misleading – though in a way that has no real consequences – because computations that are hardware-accelerated are still executed according to software, which has been compiled or assembled into micro-instructions. It is only that those micro-instructions are not executed on the main CPU of the machine.

Instead, those micro-instructions are executed either on the GPU, or on some other coprocessor, which provides the accelerating hardware.

Often, compiling code meant to run on a GPU – even though in theory it is the same as compiling regular software – has its own special considerations. For example, such code often consists of only a few micro-instructions, over which great care must be taken to make sure that they run correctly on as many GPUs as possible. This is often the main paradigm when we are coding a shader. And the possibility crops up often in practice that, even though the code is technically correct, it does not run correctly on a given GPU.

I do not really know how it is with SIMD coprocessors.

But this knowledge would be useful to have, in order to understand this posting of mine.

Of course, there exists a major contradiction to what I have just written, in the form of OpenCL and CUDA.


Google Pixel C does not have NEON.

I have been thoroughly enjoying my Google Pixel C, which I ordered only recently, because the tablet I had been using before was only a first-generation Samsung Galaxy Tab S.

Sometimes we obtain many new features, but at the expense of losing some other feature. Because the ARM CPU is a RISC chip, the manufacturers of Android devices have sometimes made up for this by including a coprocessor called NEON. NEON is a SIMD – Single-Instruction, Multiple-Data – coprocessor, aka a vector processor, which is often useful for decoding high-definition video streams in real time, without placing the burden of doing so on the main CPU.
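Just to illustrate what “Single-Instruction, Multiple-Data” means from the programmer’s point of view, below is a minimal sketch in C++, assuming a compiler that targets NEON and the standard <arm_neon.h> intrinsics. This snippet is purely my own illustration – it is not taken from any actual video-decoding code – but it shows one instruction adding four floating-point values at once:

    #include <arm_neon.h>
    #include <cstdio>

    int main() {
        // Two vectors of four single-precision floats each.
        float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
        float b[4] = {10.0f, 20.0f, 30.0f, 40.0f};
        float out[4];

        // One SIMD instruction adds all four lanes at once,
        // instead of four separate scalar additions.
        float32x4_t va = vld1q_f32(a);
        float32x4_t vb = vld1q_f32(b);
        float32x4_t vsum = vaddq_f32(va, vb);
        vst1q_f32(out, vsum);

        std::printf("%f %f %f %f\n", out[0], out[1], out[2], out[3]);
        return 0;
    }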

(Edit 04/08/2017: I have given my own definition of what “Hardware Acceleration” means, here.)

What has happened with the Pixel C is that Google decided to put a Tegra X1 into it, an SoC that also has a big coprocessor – its mighty GPU. With this tablet, real-time video decoding is meant to be performed by the GPU, which advertises several system-installed codecs. Therefore, watching videos in high definition should not require a NEON coprocessor, and the Tegra X1 does not have one. (And, when I scroll further down the list of codecs, that list includes two of the corresponding encoders from Nvidia, not only the decoders.)


In fact, the Pixel C only has a 4-core main CPU!


Alpha-Blending

The concept by which a single object or entity can be translucent seems rather intuitive. But another concept, which is less intuitive, is that the degree to which it is so can be stated once per pixel, through an alpha channel.

Just as every pixel can possess one channel for each of the three additive primary colors – Red, Green and Blue – it can possess a 4th channel named Alpha, which states, on a scale from [0.0 … 1.0], how opaque it is.

This does not just apply to texture images, whose pixels are named texels, but also to Fragment Shader output, as well as to the pixels actually associated with the drawing surface. Those pixels provide what is known as destination alpha, since the drawing surface is also the destination of the rendering, or its target.

Hence, there exist images whose pixels have a 4-channel format, as opposed to others with a mere 3-channel format.
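As an aside, below is a minimal sketch of what that difference amounts to in memory, assuming the common 8-bit-per-channel storage. The struct names are my own, purely for illustration:

    #include <cstdint>
    #include <cstdio>

    // 3-channel pixel: 3 bytes, no per-pixel opacity.
    struct PixelRGB  { std::uint8_t r, g, b; };

    // 4-channel pixel: 4 bytes, the extra byte stores alpha.
    struct PixelRGBA { std::uint8_t r, g, b, a; };

    int main() {
        PixelRGBA texel = {255, 0, 0, 128};   // red texel, roughly half opaque

        // Shaders usually see each channel normalized to the range [0.0 ... 1.0].
        float alpha = texel.a / 255.0f;

        std::printf("sizeof(PixelRGB)  = %zu bytes\n", sizeof(PixelRGB));
        std::printf("sizeof(PixelRGBA) = %zu bytes\n", sizeof(PixelRGBA));
        std::printf("normalized alpha  = %.3f\n", alpha);
        return 0;
    }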

Now, there is no clear way for a display to display alpha. In certain cases, alpha in an image being viewed is hinted at by software as a checkerboard pattern. But what we see is nevertheless color information and not transparency. And so a logical question can be, what the function of this alpha channel, which is being rendered to, actually is.

There are many ways in which content from numerous sources can be blended, but most of the high-quality ones require that much communication take place between rendering stages. A strategy is desired in which output from rendering passes is combined without requiring much communication between the passes. And alpha-blending is a de facto strategy for that.
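For readers who want to see the arithmetic spelled out, below is a minimal sketch of the conventional “source over destination” blend, assuming floating-point channels in the range [0.0 … 1.0]. The struct and function names are my own, for illustration only:

    #include <cstdio>

    struct Pixel {
        float r, g, b, a;   // all channels normalized to [0.0 ... 1.0]
    };

    // Conventional "source over destination" alpha-blend:
    // the source covers the destination to the extent of its own alpha.
    Pixel blend_over(const Pixel &src, const Pixel &dst) {
        Pixel out;
        out.r = src.r * src.a + dst.r * (1.0f - src.a);
        out.g = src.g * src.a + dst.g * (1.0f - src.a);
        out.b = src.b * src.a + dst.b * (1.0f - src.a);
        // Destination alpha accumulates how opaque the combined result is.
        out.a = src.a + dst.a * (1.0f - src.a);
        return out;
    }

    int main() {
        Pixel red_glass  = {1.0f, 0.0f, 0.0f, 0.5f};  // half-transparent red
        Pixel blue_solid = {0.0f, 0.0f, 1.0f, 1.0f};  // opaque blue background
        Pixel result = blend_over(red_glass, blue_solid);
        std::printf("RGBA = %.2f %.2f %.2f %.2f\n",
                    result.r, result.g, result.b, result.a);
        return 0;
    }

In this example, a half-transparent red pixel drawn over an opaque blue one yields RGBA = 0.50 0.00 0.50 1.00.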

By default, closer entities – according to the position of their origins in view space – are rendered first. What this does is put closer values into the Z-buffer as soon as possible, so that the Z-buffer can prevent the rendering of more-distant entities as efficiently as possible. 3D rendering starts when the CPU gives the command to ‘draw’ one entity, which has an arbitrary position in 3D. This may be contrary to what 2D graphics might teach us to expect.
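A minimal sketch of that default ordering might look as follows, assuming a hypothetical list of entities whose view-space depths have already been computed (none of this names any real engine’s API):

    #include <algorithm>
    #include <cstdio>
    #include <vector>

    struct Entity {
        const char *name;
        float view_depth;   // distance of the entity's origin from the camera
    };

    int main() {
        std::vector<Entity> entities = {
            {"tree",   40.0f},
            {"player",  2.0f},
            {"house",  15.0f},
        };

        // Opaque entities: sort nearest-first, so the Z-buffer is filled with
        // close depth values early, letting it reject hidden fragments sooner.
        std::sort(entities.begin(), entities.end(),
                  [](const Entity &a, const Entity &b) {
                      return a.view_depth < b.view_depth;
                  });

        for (const Entity &e : entities)
            std::printf("draw %s (depth %.1f)\n", e.name, e.view_depth);
        return 0;
    }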

Alas, alpha entities – aka entities that possess alpha textures – do not write to the Z-buffer, because if they did, they would prevent more-distant entities from being rendered. And then there would be no point in the closer ones being translucent.
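In OpenGL terms – as one possible illustration, and not necessarily how any given engine arranges it – this corresponds to leaving the depth test on while disabling depth writes, and enabling blending. The sketch below assumes a current OpenGL context, and that the opaque entities have already been drawn with depth writes enabled:

    #include <GL/gl.h>

    // Illustrative only: typical state while drawing alpha entities.
    void begin_alpha_entities() {
        glEnable(GL_DEPTH_TEST);    // still test against the Z-buffer...
        glDepthMask(GL_FALSE);      // ...but do not write new depth values
        glEnable(GL_BLEND);         // turn on alpha-blending
        glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);  // source-over
    }

    void end_alpha_entities() {
        glDepthMask(GL_TRUE);       // restore depth writes afterwards
        glDisable(GL_BLEND);
    }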

The default way in which alpha-blending works is that the alpha channel of the drawing surface records the extent to which entities have been left visible, by previous entities that have been rendered closer to the virtual camera.
