I could make hypothetical guesses as to why crashes like this one happen on the machine I name ‘Phoenix’, which was manufactured in 2008. This time I noticed that the cursor on the screen stopped moving, then that mouse input was not being interpreted, and then that the screen filled with an image which was a diagonally-scrambled version of the normal screen content:
- It could be that the old GPU is no longer reliable at the hardware level, and that it now suffers from random crashes, which also crash the X-server. The “Timeout Detection and Recovery” (‘TDR’) feature, which I have seen the nVidia driver execute properly in past situations, may just not kick in.
- When I reinstalled, replacing the old 32-bit O/S with the current 64-bit O/S, I also replaced the 2GB of RAM with completely new modules, 4GB in total, and the rated “Front-Side Bus Speed” (‘FSB’) of the new RAM is also faster: 800MHz instead of the earlier 600MHz. Both sets of DDR RAM modules ran with dual-channel capability. The motherboard may detect the higher rated speed of the new RAM modules and start using it, since the motherboard itself may have the stated capability of running at 800MHz. Yet at 800MHz, this motherboard may not be stable.
- There could be some sort of kernel issue…
What I do find a bit more specific is the fact that there seem to be no log entries from Xorg, which suggests that although an X-server crash eventually takes place, it may not be the root cause. Also, the fact that the mouse became unresponsive for a few seconds before the screen content collapsed seems to suggest the same thing…
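As a rough sketch of how the Xorg log could be checked for error entries: Xorg marks error lines with `(EE)`, and on many systems the log of the previous session is kept next to the current one. The paths `/var/log/Xorg.0.log` and `/var/log/Xorg.0.log.old` are assumptions about where this distribution keeps them, and the helper name `error_lines` is only illustrative.

```python
from pathlib import Path

# Marker tags that Xorg uses at the start of its log messages.
MARKERS = ("(--)", "(**)", "(==)", "(++)", "(!!)",
           "(II)", "(WW)", "(EE)", "(NI)", "(??)")


def error_lines(lines):
    """Return the lines whose only marker is (EE).

    The legend at the top of the log lists several markers on one line,
    so lines that mention more than one marker are skipped.
    """
    found = []
    for line in lines:
        present = [m for m in MARKERS if m in line]
        if present == ["(EE)"]:
            found.append(line.rstrip("\n"))
    return found


if __name__ == "__main__":
    # Assumed log locations; adjust for the system at hand.
    for name in ("/var/log/Xorg.0.log", "/var/log/Xorg.0.log.old"):
        path = Path(name)
        try:
            text = path.read_text(errors="replace")
        except OSError as exc:
            print(f"{name}: could not be read ({exc})")
            continue
        errors = error_lines(text.splitlines())
        print(f"{name}: {len(errors)} error line(s)")
        for line in errors:
            print("   ", line)
```

An empty result from both files would be consistent with the X-server being a victim of the crash rather than its cause, although it would not prove that.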
But the most important fact for me to observe is that simply being able to suggest plausible reasons for the crash is not the same thing as having diagnosed the crashes. Honestly, I do not know at present why this type of crash happens.
One of the observations about this machine which had impressed me in the past was that I had pushed 3D rendering beyond the limits of the old GPU, thereby crashing this graphics chip, but that the desktop manager I had in place was able to restart the GPU and to resume the session. It did so without requiring any action from me, only displaying a well-behaved message to the effect that the GPU needed to be rebooted. This is called “Timeout Detection and Recovery” (‘TDR’). It does the same thing under Linux that it does under Windows, and depends on stable graphics drivers.
The fact that I do possess ‘TDR’ on this machine suggests that a simple failure of the graphics chip should not take out my session.
According to my latest inquiry, this Motherboard is ‘only’ running at 66MHz. Therefore, the maximum speed of the newer RAM Module should not be an issue after all.
Another detail which I should mention, which computing specialists already know, is that if we have a CPU which can run at 2.6GHz, as mine can, this does not mean that anything on the motherboard actually runs at 2.6GHz. In fact, the speed of light works against this, because a signal can only travel a short distance within one clock cycle.
Instead, the 2 CPU cores communicate with their respective L1 caches, which are located on the CPU chip itself, at up to 2.6GHz, but communication with the larger motherboard, where distances are longer, takes place more slowly.
Thus, the ‘Front-Side Bus Speed’ is one significant indicator of how fast the MB is in fact running.
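To put a number on the speed-of-light remark, here is a back-of-the-envelope sketch of how far a signal can travel during one clock cycle. The function name `distance_per_cycle_cm` and the `velocity_factor` parameter are illustrative. The value of 0.5 used for signals in circuit-board traces is only an assumed round figure, since real traces carry signals at some fraction of the vacuum speed of light.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre


def distance_per_cycle_cm(frequency_hz, velocity_factor=1.0):
    """Distance in cm that a signal covers during one clock period.

    velocity_factor scales the vacuum speed of light; 1.0 is the
    theoretical upper bound, smaller values approximate real traces.
    """
    metres = SPEED_OF_LIGHT_M_PER_S * velocity_factor / frequency_hz
    return metres * 100.0


if __name__ == "__main__":
    clocks = (("2.6GHz CPU clock", 2.6e9), ("800MHz bus clock", 800e6))
    for label, hz in clocks:
        vacuum = distance_per_cycle_cm(hz)
        trace = distance_per_cycle_cm(hz, velocity_factor=0.5)  # assumed factor
        print(f"{label}: {vacuum:.2f} cm in vacuum, "
              f"about {trace:.2f} cm at half the speed of light")
```

Even in a vacuum, light covers only about 11.5 cm in one cycle at 2.6GHz, which is comparable to the distances across a motherboard. This is only one of several factors that limit bus speeds, so the figure should be read as an illustration rather than as a hard limit.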
For a Motherboard that was manufactured in 2008, 800MHz would be unrealistically fast. The new DDR RAM modules were purchased in 2016, and I was concerned about this. But I was then reassured, because initially, after installing them, nothing went wrong.
Another fact which I should mention is that on ‘Phoenix’, at one point in time, I installed Kernel version 3.18.0-14-generic. The problem with this is that it is not the standard kernel image to be running with Debian 8. Instead, Debian presently recommends Kernel version 3.16.0-4-amd64, which I also have installed.
In plain English, this means that every time the Debian team issues a patch for the standard kernel version, it is 3.16.0 that receives it, while I continue to boot into 3.18.0.
As long as my box was stable, this was actually a sensible thing to do, because any time a regression might have been introduced by the kernel updates, I would not have been affected.
But if there are specific weaknesses in the higher kernel, which is not receiving any patches, then it makes more sense to boot this computer into Kernel version 3.16.0, simply because the ongoing patches may have resolved this problem by now.
Judging by what I have been running on the laptop ‘Klystron’, Kernel version 4.4.0-60-generic, I see that kernel updates tend to be more reliable than I had been anticipating, and even the most cutting-edge kernels do not seem to suffer from many bugs.
So during the next reboot of ‘Phoenix’, I may try the kernel version which I have been ignoring for the past year or so.
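After such a reboot, it is worth confirming which kernel the machine actually booted into. A minimal sketch, assuming Python is available on the box: `platform.release()` reports the same release string that `uname -r` prints on Linux, and the helper name `kernel_matches` is only illustrative.

```python
import platform


def kernel_matches(expected_prefix, release=None):
    """Return True if the kernel release string starts with expected_prefix.

    When release is omitted, the release of the running kernel is used.
    """
    if release is None:
        release = platform.release()
    return release.startswith(expected_prefix)


if __name__ == "__main__":
    print("Running kernel:", platform.release())
    if kernel_matches("3.16.0-"):
        print("Booted into the Debian-maintained 3.16.0 series.")
    else:
        print("Not booted into the 3.16.0 series.")
```

If crashes are later compared between the two kernels, recording this string alongside each crash keeps the comparison honest.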