Routine Kernel Update Today, Downtime

Today, the Debian Package Maintainers pushed through a routine kernel update, to version ‘3.16.0-4-amd64’. Even though this machine is a Linux computer, this update required a reboot. Further, I take the unusual step of hosting my Web-site and blog on my own server at home.

This implies that the Web-site was offline briefly, more specifically from about 13h05 until 13h15. I cannot display a Maintenance Mode Page during such an event, because doing so would still require that my Web-server be online.

However, because everything about this update went smoothly, the interruption to the system processes had a duration closer to 5 minutes than to 10 minutes. I did not notice any malfunctions.

Oh yes, this also caused my local cache (‘memcached’) to get flushed, for which reason access to the readers’ favorite postings will remain a bit sluggish for the next day or so.
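How cold the cache is after such a restart can be estimated from the counters which memcached itself keeps. What follows is only a minimal Python sketch, assuming that the daemon listens on the default address 127.0.0.1:11211. The function names are hypothetical and are not part of the setup described in this posting. The ‘stats’ command and its ‘get_hits’ and ‘get_misses’ counters belong to memcached's text protocol.

```python
import socket


def parse_stats(raw):
    """Turn the text reply to memcached's 'stats' command into a dict."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split(" ", 2)
        # Data lines look like "STAT <name> <value>"; the final "END" is skipped.
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2].strip()
    return stats


def hit_ratio(stats):
    """Fraction of 'get' requests answered from the cache (0.0 if none yet)."""
    hits = int(stats.get("get_hits", 0))
    misses = int(stats.get("get_misses", 0))
    total = hits + misses
    return hits / total if total else 0.0


def fetch_stats(host="127.0.0.1", port=11211, timeout=2.0):
    """Ask a running memcached for its statistics (host and port are assumptions)."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"stats\r\n")
        data = b""
        # The reply ends with a line containing only "END".
        while not data.endswith(b"END\r\n"):
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.decode("ascii", errors="replace")
```

Calling `hit_ratio(parse_stats(fetch_stats()))` shortly after a reboot should give a low value, which climbs as the popular postings get cached again.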

Dirk

 

Routine Kernel-Update Today, Downtime

I take the unusual measure of hosting my Web-site on my own server. Yet, from time to time, updates come to this same host-machine – which I name ‘Phoenix’ – and some of those updates require a reboot, even though it is a Linux computer. When I reboot, my site temporarily becomes unavailable. In fact, even though my installation of ‘WordPress’ has a Maintenance Mode page, I am unable to display it in most cases, because that page still requires a running Web-server in order to display in your browser.

Just today, there was a kernel update pushed through the package manager, to kernel version 3.16.0-4-amd64. And so I felt it was only reasonable to perform the reboot. Also, there were some more-minor updates to ‘Wine’, but those would not usually, by themselves, require a reboot.

Because the entire process ran smoothly and without notable incident, my site and blog were only offline from about 12h35 until 12h45.

I apologize for any inconvenience.

Also, because my ‘memcached’ daemon has been restarted, this blog will seem a bit sluggish for the next day or so.

Dirk

 

Plausible does not mean Assumed

I could make hypothetical guesses as to why crashes like this one happen on the machine I name ‘Phoenix’, which was manufactured in 2008. This time I noticed that the cursor on the screen stopped moving, then that mouse-input was not being interpreted, and then that the screen filled with an image which was a diagonally-scrambled version of the normal screen content:

  • It could be that the old GPU is no longer reliable at the hardware level, and that it may now suffer from random crashes, which also crash the X-server. The “Timeout Detection and Recovery” (‘TDR’) feature, which I have seen the nVidia driver execute properly in past situations, may just not kick in.
  • When I reinstalled, replacing the old 32-bit O/S with the current 64-bit O/S, I also replaced the 2GB of RAM with 4GB of completely new RAM. The “Front-Side Bus Speed” (‘FSB’) of the new RAM is also faster, at 800MHz instead of the earlier 600MHz. Both sets of DDR RAM modules ran with dual-channel capability. The motherboard may detect the higher speed rating of the new RAM modules and start using it, as the motherboard itself may have the stated capability of running at 800MHz. Yet, at 800MHz, this motherboard may not be stable.
  • There could be some sort of kernel issue…

What I do find a bit more specific is the fact that there seem to be no log entries for Xorg, suggesting that although an X-server crash eventually takes place, it may not be the root cause. Also, the fact that the mouse became unresponsive for a few seconds before the screen-content collapsed seems to suggest the same thing…
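One way to confirm that the previous session's X-server log really contains no errors is to scan it for the ‘(EE)’ marker, which Xorg puts on error lines. This is a minimal Python sketch. The path ‘/var/log/Xorg.0.log.old’ assumes display ‘:0’ and a system-wide log location, and the function names are hypothetical.

```python
import re

# Xorg log lines carry an optional "[ seconds ]" timestamp followed by a
# marker such as (II) informational, (WW) warning or (EE) error.
MARKER = re.compile(r"^(?:\[\s*\d+\.\d+\]\s*)?\((EE|WW)\)")


def flagged_lines(lines, markers=("EE",)):
    """Return (line_number, text) for every line carrying one of the markers."""
    found = []
    for number, line in enumerate(lines, start=1):
        match = MARKER.match(line)
        if match and match.group(1) in markers:
            found.append((number, line.rstrip("\n")))
    return found


def scan_log(path="/var/log/Xorg.0.log.old", markers=("EE",)):
    """Scan the log of the previous X session (the path is an assumption)."""
    with open(path, errors="replace") as handle:
        return flagged_lines(handle, markers)
```

An empty result from `scan_log()` would be consistent with the X-server not having logged the failure itself, but it does not say where the fault actually lies.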

But the most important fact for me to observe is that simply being able to suggest plausible reasons for the crash is not the same thing as having diagnosed the crashes. Honestly, I do not know at present why this type of crash happens.

One of the observations about this machine which had impressed me in the past was that I had pushed 3D rendering beyond the limits of the old GPU, thereby crashing this graphics chip, but that the desktop manager I had in place was able to restart the GPU and to resume the session. It required no action from me, and only displayed a well-behaved message to the effect that the GPU needed to be rebooted. This is called “Timeout Detection and Recovery” (‘TDR’); it does the same thing under Linux that it does under Windows, and depends on stable graphics drivers.

The fact that I do possess ‘TDR’ on this machine suggests that a simple failure of the graphics chip should not take out my session.

Addendum:

According to my latest inquiry, this Motherboard is ‘only’ running at 66MHz. Therefore, the maximum speed of the newer RAM Module should not be an issue after all.

[Image: ram_phoenix_1]
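The posting does not say which tool produced the figure above. One common way to repeat such an inquiry on Linux is the ‘dmidecode’ utility, which needs root privileges and reports a ‘Speed:’ line for each memory device. The following Python sketch only parses that output. The function names are hypothetical, and the exact field layout can differ between dmidecode versions.

```python
import re
import subprocess

# Matches lines such as "    Speed: 800 MHz"; "Speed: Unknown" and
# "Configured Clock Speed: ..." lines are deliberately not matched.
SPEED = re.compile(r"^\s*Speed:\s*(\d+)\s*(?:MHz|MT/s)", re.MULTILINE)


def module_speeds(dmidecode_text):
    """Return the rated speed of each populated memory module, as integers."""
    return [int(value) for value in SPEED.findall(dmidecode_text)]


def read_dmidecode():
    """Run 'dmidecode --type memory' (requires root) and return its output."""
    result = subprocess.run(
        ["dmidecode", "--type", "memory"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

The rated speed of a module is not necessarily the speed at which the board drives it, which is the distinction the addendum above turns on.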

Dirk


Kernel Update, Downtime

I take the unusual approach of hosting my blog on my own Web-server at home. Tonight, this host machine, named ‘Phoenix’, received a standard Debian kernel update via the package manager.

This procedure is routine, but requires a reboot, even on a Linux computer.
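Whether a reboot is still pending after such an update can be estimated by comparing the time of the last boot with the time at which the running kernel's image file was last written to disk. This is only a heuristic sketch in Python. The ‘/boot/vmlinuz-<release>’ naming is the usual Debian layout, but it is an assumption here, as are the function names.

```python
import os
import platform
import time


def boot_time(uptime_path="/proc/uptime"):
    """Return the time of the last boot, in seconds since the epoch."""
    with open(uptime_path) as handle:
        # The first field of /proc/uptime is the uptime in seconds.
        uptime_seconds = float(handle.read().split()[0])
    return time.time() - uptime_seconds


def kernel_image_path(release=None):
    """Path of the kernel image for a release (defaults to the running one)."""
    return "/boot/vmlinuz-" + (release or platform.release())


def reboot_pending(booted_at, image_changed_at):
    """True if the kernel image was written after the machine last booted."""
    return image_changed_at > booted_at


def check():
    # st_ctime (inode change time) is used rather than st_mtime, because
    # package managers typically restore the modification time stored in
    # the package, which can predate the installation.
    changed = os.stat(kernel_image_path()).st_ctime
    return reboot_pending(boot_time(), changed)
```

The heuristic only covers updates which replace the image of the running release in place. A new kernel installed under a different release name would leave that file untouched and go unnoticed.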

The visibility of my site and of my blog was affected from approximately 21h40 until 21h55. Also, because the server itself was rebooting, it was not possible to have it display a Maintenance Mode image.

I apologize for any inconvenience.

Further, the server’s response may be a bit sluggish for the next few hours or day, simply because its cache gets cleared when we reboot.

The last session, which I just rebooted, had been running for 31 days continuously.

Dirk