Another possible hypothesis, for why my server-box sometimes crashes.

I have written before that my Linux computer ‘Phoenix’, which acts as both my server and a workstation, sometimes crashes. I have another possible explanation for why.

The graphics chip on this machine is only a , capable of OpenGL 2.1.2 using proprietary (legacy) drivers. It only has 128MB of shared memory with my motherboard.

Under Windows 10, this chipset is no longer supported at all.

I may simply be pushing this old GPU too hard.

My display is a 1600×1200 monitor, and much of the graphics memory is taken up by that fact alone. I also have many forms of desktop compositing switched on. At the time of the last crash, I had numerous applications open at once which use hardware 2D acceleration as part of their canvas, and I was copying and pasting between them.
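To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes 32-bit color, a front and a back buffer for the screen, and one screen-sized off-screen buffer per maximized, composited window. None of those figures were measured on ‘Phoenix’; they are only meant to show the order of magnitude.

```python
# Rough estimate of how much video memory screen-sized pixel buffers consume.
# Assumptions (not measured on the real machine): 32 bits per pixel, two
# buffers for the screen itself, one off-screen buffer per maximized window.

WIDTH, HEIGHT = 1600, 1200
BYTES_PER_PIXEL = 4        # assumed 32-bit color
VIDEO_MEMORY_MB = 128      # shared memory figure quoted in the post


def buffer_mb(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size of one pixel buffer in MiB."""
    return width * height * bytes_per_pixel / (1024 * 1024)


def estimate_usage_mb(fullscreen_windows, screen_buffers=2):
    """Screen buffers plus one screen-sized buffer per maximized window."""
    return (screen_buffers + fullscreen_windows) * buffer_mb(WIDTH, HEIGHT)


if __name__ == "__main__":
    print(f"One {WIDTH}x{HEIGHT} buffer: {buffer_mb(WIDTH, HEIGHT):.1f} MiB")
    for windows in (2, 6, 10):
        used = estimate_usage_mb(windows)
        share = used / VIDEO_MEMORY_MB
        print(f"{windows:2d} maximized windows: {used:5.1f} MiB ({share:.0%} of {VIDEO_MEMORY_MB} MiB)")
```

Each buffer comes to a little over 7 MiB, so a handful of maximized windows already claims a large share of 128MB, before any textures or driver overhead are counted.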

I am hoping that all this hardware acceleration is easing the burden on my equally dated CPU.

But then the triggering factor may simply be an eventual error in the GPU.

The fact that Timeout Detection and Recovery (‘TDR’) does not kick in to save the session may be because TDR only works in specific situations, such as OpenGL 3D rendering windows. If the GPU crash happens as part of the compositing, it may take out the X-server, and therefore my whole system.

The only workaround I may have is to avoid using this box as a workstation. When I avoid doing that, it has been known to run for 60 days straight without crashing…

Dirk

(Edit 01/28/2017 : )

I use a widget on my desktops, which is named ‘‘, and I find that it gives me a good intuitive grasp of what is happening on my Linux computers.


 

(Screenshot: phoenix_temperatures_1)


 

This widget has one disadvantage: when extensions have been installed to display temperatures, sometimes we do not know which temperature-sensor stands for which temperature. This is because Linux developers have to design their software without any knowledge of the specific hardware it is going to run on. Conversely, the makers of proprietary drivers know exactly which machine their drivers are going to run on, and can therefore identify what each sensor stands for.

This also means that we sometimes have temperature readings in ‘‘ which may just be spurious, and which may constantly display one meaningless number. In that case, we reduce our selection of indicated temperatures to the ones we can identify.

In the context of answering my own question, another detail which becomes relevant is that this tower computer has a failed case-fan, which is accurately being indicated as the ‘‘ entry, running at 46 RPM at the moment of the screen-shot. I know that this case-fan is in fact stalled, from past occasions when I opened up the tower.
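For anybody who wants to cross-check such readings without a widget, the kernel exposes its hardware-monitoring sensors under /sys/class/hwmon, with temperatures in `temp*_input` files (millidegrees Celsius) and fan speeds in `fan*_input` files (RPM). The following is only a minimal sketch and not the widget's own method. The function names and the 200 RPM threshold for calling a fan stalled are my own assumptions, and which sensors show up depends on which kernel drivers are loaded.

```python
from pathlib import Path


def read_hwmon(root="/sys/class/hwmon"):
    """Return (chip, sensor, value, unit) tuples for temperatures and fans."""
    readings = []
    root_path = Path(root)
    if not root_path.is_dir():
        return readings  # no hwmon support, or not a Linux system
    for chip_dir in sorted(root_path.iterdir()):
        if not chip_dir.is_dir():
            continue
        name_file = chip_dir / "name"
        chip = name_file.read_text().strip() if name_file.is_file() else chip_dir.name
        for input_file in sorted(chip_dir.glob("*_input")):
            sensor = input_file.name[: -len("_input")]  # e.g. "temp1", "fan2"
            try:
                raw = int(input_file.read_text().strip())
            except (OSError, ValueError):
                continue  # sensor could not be read; skip it
            # A *_label file, when the driver provides one, names the sensor.
            label_file = chip_dir / f"{sensor}_label"
            label = label_file.read_text().strip() if label_file.is_file() else sensor
            if sensor.startswith("temp"):
                readings.append((chip, label, raw / 1000.0, "C"))
            elif sensor.startswith("fan"):
                readings.append((chip, label, float(raw), "RPM"))
    return readings


def stalled_fans(readings, min_rpm=200):
    """Fans reporting fewer than min_rpm (an arbitrary, assumed threshold)."""
    return [r for r in readings if r[3] == "RPM" and r[2] < min_rpm]


if __name__ == "__main__":
    all_readings = read_hwmon()
    for chip, label, value, unit in all_readings:
        print(f"{chip:12s} {label:12s} {value:8.1f} {unit}")
    for chip, label, value, _ in stalled_fans(all_readings):
        print(f"Possibly stalled: {chip}/{label} at {value:.0f} RPM")
```

Sensors without a label file are listed by their raw names, such as temp1, which is exactly the situation where we cannot tell what a given reading stands for.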


Powerful Computer ‘Mithral’ May Require A Reinstall

It is a shame, but it might be true.

I have been using a powerful Tower-PC named ‘Mithral’, on which I have installed Windows 7 Pro, and which has an 8-core CPU, threaded as 4. This computer was first installed in 2011.

This computer had special software installed, named “Diskeeper 2011”, which continuously defragmented my hard drive, in an effort to be able to save the HD even in the worst-case fragmentation / FS-corruption scenarios.

‘Mithral’ has now developed the habit of repeatedly going into an unresponsive state overnight, which is also the time when it has been configured to do its background defragmentation.

The fact that it never freezes during the day, but now often does so overnight, suggests that the problem may lie in File System Corruption, which can lead to fatal errors if defragmentation encounters it under Windows. I have run a ‘Check-Disk’ on it, but doing so has not remedied the problem.

However, one aspect of this problem which puzzles me is that so many defragmentations have run successfully, without immediately tripping over any FS-Corruption issues. This casts doubt on the idea that the problem is due to FS Corruption. What could also be happening is that Diskeeper 2011 may no longer really have been compatible with the most recently updated Windows 7.

And so, because these hangups seemed to be taking place at a time when Diskeeper 2011 was running, what I have done for now is to uninstall it, and to install the most up-to-date version, that being “Diskeeper 2016”. As before, I am able to tell Diskeeper to defragment the whole HD without any immediate errors. But now I will have to wait and see whether continuous background-defragmentation, using version 2016 of the software, is ultimately more stable.

Because a computer which crashes every second night is not really supportable, I might eventually need to wipe ‘Mithral’, and to make an attempt to resurrect it, using some version of Linux. If I succeed, at least I will still have the benefit of the hardware.

There is some chance, however slight, that the problem is not really FS Corruption, but some sort of hardware problem. If that should turn out to be the case, then even with Linux on it, this computer will still crash and/or be unstable. That would be even sadder.

To prepare for eventually reinstalling, I will need to migrate critical data off it, to my other computers, more thoroughly than I have ever had to do before. This will not be one of those cases where I can just wipe the HD and kick the present O/S off it on a whim. I need to think this through, as far as data is concerned.

This is my last remaining Windows computer. There exist certain services under Windows, that I cannot duplicate under Linux, and I might need to get by without those.

Dirk