## I’m impressed with the Mesa drivers.

Before we install Linux on our computers, we usually try to make sure that we have either an NVIDIA or an AMD / Radeon GPU (the graphics chip-set), so that we can use the proprietary NVIDIA drivers, which that company designs to run under Linux, or the proprietary ‘fglrx’ drivers provided by AMD, or the ‘Mesa’ drivers, which are open-source and designed by Linux specialists. Because each set of proprietary drivers only covers one of the available families of chip-sets, once we have installed Linux, our choice boils down to one between the proprietary drivers for our chip-set and the Mesa drivers.

I think that the main advantage of the proprietary drivers remains that they will offer our computers the highest version of OpenGL possible from the hardware, which could go up to 4.5! But obviously, there are also advantages to using Mesa, one of which is the fact that installing it does not install a ‘blob’: an opaque piece of binary code which nobody can analyze. Another is the fact that the Mesa drivers will provide ‘VDPAU’, which the ‘fglrx’ drivers fail to implement. This last detail has to do with the hardware-accelerated playback of 2D video-streams that have been compressed with one out of a very short list of codecs.

But I would add to the possible reasons for choosing Mesa the fact that its stated OpenGL version-number does not set a real limit on what the graphics chip-set can do. Officially, Mesa offers OpenGL 3.0, and this could make it look on the surface as though its implementation of OpenGL is somewhat lacking, as a trade-off against its other benefits.

One way in which ‘OpenGL’ seems to differ in real life from its competitor, ‘DirectX’, is that DirectX drivers and hardware advertise a numeric feature level, and if that level has been achieved, the game-designer can count on a specific set of features being implemented. What seems to happen with OpenGL instead is that 3.0 must first be satisfied. If it is, the 3D application next checks individually whether the available OpenGL system offers specific OpenGL extensions by name. If the application is very well written, it will test for the existence of every extension it needs before giving the command to load that extension. But in certain cases, a failure to test this can lead to a crash, because the graphics card itself may not have the extension requested.
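The check-before-use pattern described above can be sketched in a few lines of Python. This is only an illustration and not how any particular application does it: the version string and the extension list below are made-up sample data of the kind a tool such as `glxinfo` reports, and the function names are hypothetical.

```python
def parse_version(version_string):
    """Turn a string such as '3.0 Mesa 10.3.2' into the tuple (3, 0)."""
    head = version_string.split()[0]
    major, minor = head.split(".")[:2]
    return int(major), int(minor)


def missing_requirements(version_string, ext_string, min_version, required):
    """Return what is missing; an empty list means it is safe to proceed."""
    problems = []
    # Step 1: the base version must be satisfied first.
    if parse_version(version_string) < min_version:
        problems.append("OpenGL %d.%d or newer" % min_version)
    # Step 2: every needed extension is then looked up individually, by name.
    available = set(ext_string.split())
    problems.extend(sorted(name for name in required if name not in available))
    return problems


# Hypothetical sample data, not taken from a real machine.
version = "3.0 Mesa 10.3.2"
extensions = ("GL_ARB_vertex_buffer_object GL_ARB_framebuffer_object "
              "GL_ARB_texture_float")
needed = ["GL_ARB_framebuffer_object", "GL_ARB_tessellation_shader"]

print(missing_requirements(version, extensions, (3, 0), needed))
# prints ['GL_ARB_tessellation_shader']
```

An application that receives a non-empty list here would fall back to a simpler rendering path, or refuse to start, rather than request an extension the hardware does not have.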

As an example of what I mean, my KDE / Plasma compositor settings allow me to choose ‘OpenGL 3.1’ as an available back-end, and when I select it, it works, in spite of my Mesa drivers ‘only’ achieving 3.0. I think that if the drivers had been stated to be 3.1, then this could actually mean they lose backward-compatibility with 3.0, while in fact they preserve that backward-compatibility as much as possible.

## I question the amount of VRAM on Phoenix.

I am still contemplating, why the server-box I name ‘‘ was crashing, and my attention keeps coming back to the graphics chip. Before this computer was resurrected, it was running in 32-bit mode, as ‘‘. At that time, it only had 2GB of RAM. But now it runs in 64-bit mode, with 4GB of RAM.

When I boot, the BIOS message still tells me that it has 128MB of shared memory, for the graphics chip. But strangely enough, the piece of text I pasted into this posting, reads that the graphics driver has set aside 256MB of VRAM, near the top of the 4GB of physical addresses. I did not know that the kernel can override a BIOS setting in this way, let us say just because processing has been switched to 64-bit mode.

One thing which could naively go wrong is that the legacy driver, unaware of the specifics of this motherboard, could be allocating 256MB of shared memory, while physically, the hardware cannot share past the address ‘‘. That is, the address ‘‘ may have become forbidden territory for the graphics card. It is however uncommon that the programmers of kernel-space modules would make such a simple mistake.

This is a 64-bit system, which only accepts up to 4GB of RAM, thus only possessing 32-bit physical addresses, to go with its 64-bit virtual addresses.

According to this screen-shot:

I only have 3.74GB of RAM available to the system, instead of 4GB. The reason for this, is the fact that 256MB have in fact been reserved for the graphics chip. By itself this would seem to suggest, that the allocation has succeeded.
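As a quick check of the arithmetic, the following Python sketch subtracts a reserved amount from the installed RAM, once for the 256MB mentioned here and once for the 128MB which the BIOS message states. It assumes binary units (1GB = 1024MB), and the function name is hypothetical.

```python
def usable_gib(installed_gib, reserved_mib):
    """RAM left for the system after a reservation, in GiB (binary units)."""
    return installed_gib - reserved_mib / 1024


print(usable_gib(4, 256))  # prints 3.75
print(usable_gib(4, 128))  # prints 3.875
```

The observed 3.74GB sits just under the first figure and well under the second, which is consistent with a 256MB reservation plus some small additional amount held back elsewhere. What accounts for that small remainder is not something this calculation can show.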

Also, the fact that 49.26MB of shared memory was momentarily being indicated, is not too telling, because several types of processes could be using shared memory for some purpose. This feature does not only exist, for user-space processes to make texture images available to the graphics card.

## New Case-Fan Installed

During previous postings, I had written about crashes, which the computer I name ‘Phoenix’ was suffering from. And I had written that one possible reason could have been the failed case-fan, which could have been causing something on the motherboard to overheat.

Just today, this box suffered from another similar crash. This time, I opened up the case, and replaced the 92mm case-fan. Therefore, the reader might expect some optimism on my part, that this server-box will not crash again. But in reality I have two reasons for which my optimism is limited:

1. If an overheated chip has already caused crashes, there is some tendency for it to suffer from a memory-effect, of wanting to fail again whenever it gets even slightly warm. Therefore, due to the first crash possibly having happened for that reason, this machine could now have a penchant for crashing, even though the initial cause has been removed.
2. The cause may not have been an overheated chip, but rather, a pure software-problem with the legacy graphics driver (nVidia). On such a big display, the graphics driver may have been suffering from some sort of resource leak – aka memory leak – and during boot-up, the BIOS reports that the graphics chip only possesses 128MB of shared RAM! Thus, the problem could be cumulative and result from regular copying-and-pasting, with many HW-accelerated drawing surfaces and many compositing effects enabled. Once we have an unstable graphics driver – and the graphics driver has received several updates recently – having a stable one could be a luxury we cannot easily reproduce.

I was down from roughly 19h00 until 20h00, and apologize to my readers for any inconvenience.

Dirk

BTW: I have an additional reason, not really to believe, that these crashes are due to an overheated graphics chip. During the actual reboot, the graphics chip should get especially hot, and especially so, if the case-fan is not turning.

I can see that if this chip did overheat, the TDR would not be able to reboot it.

But the crashes never seem to occur, directly after the reboot. I generally seem to obtain about 6 days of smooth computing, before another crash happens.

Also, it should not be a VRAM leak, because this is a pre-GPU-type graphics chip. With the old graphics chips, that maximally had several pixel and several vertex pipelines, VRAM consumption was more or less static, while with the more-modern GPUs, some amount of VRAM-creep is at least plausible.


```
root@Phoenix:/home/dirk# lspci | grep vga
root@Phoenix:/home/dirk# lspci | grep VGA
00:0d.0 VGA compatible controller: NVIDIA Corporation C61 [GeForce 6150SE nForce 430] (rev a2)
root@Phoenix:/home/dirk# lspci -v -s 00:0d.0
00:0d.0 VGA compatible controller: NVIDIA Corporation C61 [GeForce 6150SE nForce 430] (rev a2) (prog-if 00 [VGA controller])
	Subsystem: Hewlett-Packard Company Device 2a61
	Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 21
	Memory at fb000000 (32-bit, non-prefetchable) [size=16M]
	Memory at e0000000 (64-bit, prefetchable) [size=256M]
	Memory at fc000000 (64-bit, non-prefetchable) [size=16M]
	[virtual] Expansion ROM at f4000000 [disabled] [size=128K]
	Capabilities: [48] Power Management version 2
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
	Kernel driver in use: nvidia

root@Phoenix:/home/dirk#
```
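The three `Memory at` lines in the listing above can be turned into address ranges with a short Python sketch, to see whether each region ends below the 4GB mark of a 32-bit physical address space. The regular expression and the function name are my own illustration, and the lines are copied from the listing. Note that these are PCI address windows: treating the 256M prefetchable window as the region set aside for VRAM is an assumption, not something the listing proves.

```python
import re

# The three memory lines, copied from the lspci listing.
LISTING = """\
Memory at fb000000 (32-bit, non-prefetchable) [size=16M]
Memory at e0000000 (64-bit, prefetchable) [size=256M]
Memory at fc000000 (64-bit, non-prefetchable) [size=16M]
"""

UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
PATTERN = re.compile(
    r"Memory at ([0-9a-f]+) \(\d+-bit, (?:non-)?prefetchable\) "
    r"\[size=(\d+)([KMG])\]"
)


def memory_regions(text):
    """Return (start, last_address, size_in_bytes) for each memory line."""
    regions = []
    for match in PATTERN.finditer(text):
        start = int(match.group(1), 16)
        size = int(match.group(2)) * UNITS[match.group(3)]
        regions.append((start, start + size - 1, size))
    return regions


FOUR_GIB = 2 ** 32  # first address beyond a 32-bit physical address space

for start, last, size in memory_regions(LISTING):
    print("%08x-%08x  %3d MiB  below 4 GiB: %s"
          % (start, last, size // 1024 ** 2, last < FOUR_GIB))
```

For the 256M window this gives the range `e0000000` to `efffffff`, so all three regions lie inside the 32-bit range. That does not rule out a driver problem, but it does make a plain out-of-range allocation look less likely.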