The PC Graphics Cards have specifically been made Memory-Addressable.

Please note that this posting does not describe

  • Android GPUs, or
  • Graphics Chips on PCs and Laptops, which use shared memory.

I am writing about the big graphics cards which power-users and gamers install into their PCs, which have a special bus-slot, and which cost as much money in themselves, as some computers cost.

The way those are organized physically, they possess one or more GPU, and , which loosely correspond to the CPU and RAM on the motherboard of your PC.

The GPU itself contains registers, which are essentially of two types:

  • Per-core, and
  • Shared

When coding shaders for 3D games, the GPU-registers do not fulfill the same function, as addresses in . The addresses in typically store texture images, vertex arrays in their various formats, and index buffers, as well as frame-buffers for the output. In other words, the typically stores model-geometry and 2D or 3D images. The registers on the GPU are typically used as temporary storage-locations, for the work of shaders, which are again, separately loaded onto the GPUs, after they are compiled by the device-drivers.

A major feature which the designers of graphics cards have given them, is to extend the system memory of the PC onto the graphics card, in such a way that most of its memory actually has hardware-addresses as well.

This might not include the GPU-registers that are specific to one core, but I think does include shared GPU-registers.

Continue reading The PC Graphics Cards have specifically been made Memory-Addressable.

“Hardware Acceleration” is a bit of a Misnomer.

The term gets mentioned quite frequently, that certain applications offer to give the user services, with “Hardware Acceleration”. This terminology can in fact be misleading – in a way that has no consequences – because computations that are hardware-accelerated, are still being executed according to software that has either been compiled or assembled into micro-instructions. Only, those micro-instructions are not to be executed on the main CPU of the machine.

Instead, those micro-instructions are to be executed either on the GPU, or on some other coprocessor, which provide the accelerating hardware.

Often, the compiling of code meant to run on a GPU – even though the same, in theory as regular software – has its own special considerations. For example, this code often consists of only a few micro-instructions, over which great care must be taken to make sure that they run correctly on as many GPUs as possible. I.e., when we are coding a shader, is often a main paradigm. And the possibility crops up often in practice, that even though the code is technically correct, it does not run correctly on a given GPU.

I do not really know how it is with SIMD coprocessors.

But this knowledge would be useful to have, in order to understand this posting of mine.

Of course, there exists a major contradiction to what I just wrote, in OpenCL and CUDA.

Continue reading “Hardware Acceleration” is a bit of a Misnomer.

Why R2VB Should Not Simply be Deprecated

The designers of certain graphics cards / GPUs, have decided that Render-To-Vertex-Buffer is deprecated. In order to appreciate why I believe this to be a mistake, the reader first needs to know what R2VB is – or was.

The rendering pipeline of DirectX 9 versus DirectX 11 is somewhat different, yet also very similar, and DirectX 9 was extremely versatile, with a wide range of applications written that use it, while the fancier Dx 11 pipeline is more powerful, but has less of an established base of algorithms.

Dx 9 is approximated in OpenGL 2, while Dx 10 and Dx 11 are approximated in OpenGL 3(+) .

Continue reading Why R2VB Should Not Simply be Deprecated