Questioning whether the Linux kernel can take advantage of hyper-threading in a positive sense?

One of the features that CPUs were advertised as having, several years ago, was “hyper-threading”. It is always possible that CPU cores have some feature on which I am not up to date, but I do have my own concept of this feature, which I find adequate:

Given a CPU with 8 cores, it can happen that an L1 cache is shared between every pair of cores, while the L2 and the L3 cache may be shared between all cores. Because of the way each L1 cache instance is optimized, this can lead to a performance penalty if threads from different processes are assigned to the 2 cores belonging to one shared L1 cache. This is sometimes loosely referred to as ‘false sharing’, although strictly speaking that term describes threads on different cores writing to distinct variables that happen to occupy the same cache line; ‘cache contention’ or ‘thrashing’ is the more accurate name here, since the two threads repeatedly reference cache lines that actually map to completely different regions of RAM. Every time that happens, a replacement operation needs to be done in the cache, to update the offending line with the most recently addressed memory region (called the corresponding “frame”). That happens more slowly than the full speed at which the CPU core proper can read instructions and/or data, for which there are actually separate caches.

Meanwhile, if threads belonging to the same process are mapped to pairs of CPU cores that share an L1 cache, and if such a pair of threads needs to communicate data, there can be a boost to efficiency because each thread in such a pair is only communicating with cache, the lines of which do map to the same regions of memory. (:1)

I had my doubts as to whether the Linux kernel can increase the probability of this second scenario taking place successfully.

The main reason for my doubt was the simple observation that, when the kernel re-schedules a single-threaded process, it shows no preference for either even-numbered or odd-numbered cores. I can see this because I have a widget running on my desktops which displays continuous graphs of hardware usage, and from which I can infer information as I use my computers on a day-to-day basis.

When a programmer programs threads to run on CPU cores, he can make sure that his first thread only communicates with his second, that his third thread only communicates with his fourth, etc. But in that case, unless the kernel actually schedules the first thread of the program to run on an even-numbered logical core (starting from core zero), these pairs of threads which the programmer intended to communicate, will be communicating across the boundaries imposed by separate L1 cache instances. This will still succeed, but only at a performance penalty. (:2)

There was a Python programming exercise in which I felt I had overcome this problem, by creating a number of threads exactly equal to the number of cores on each machine’s CPU. In that case, the kernel may schedule the threads to the cores in their natural order, so that the physical pairing would be observed. But aside from this exercise, hyper-threading under Linux mainly seemed to me like a problem to be avoided.

At the same time, a modern CPU is plausible that has 32 cores, but in which each L1 cache is actually shared between 4 cores, not 2. And so each programmer is left to his own means to optimize any threaded code.

Furthermore, I know of one CPU architecture in which the first 4 logical CPUs are mapped to the existing 4 real CPUs sequentially, after which the last 4 logical CPUs are mapped that way again.


Under Linux, the user may type in the following command in order to see how his logical cores are actually mapped, at least numerically:


egrep "(( id|processo).*:|^ *$)" /proc/cpuinfo
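
The same fields can also be grouped programmatically. What follows is only a minimal Python sketch, assuming the x86-style layout of /proc/cpuinfo, in which each logical processor has its own block containing “processor”, “physical id” and “core id” lines (other architectures may omit some of these). The function name group_logical_cpus is my own, hypothetical choice. It shows which logical processors belong to the same physical core:

```python
import re
from collections import defaultdict


def group_logical_cpus(cpuinfo_text):
    """Map (physical id, core id) -> sorted list of logical processor numbers.

    Assumes the x86-style layout of /proc/cpuinfo: one block of
    'key : value' lines per logical processor, blocks separated by
    blank lines.
    """
    groups = defaultdict(list)
    for block in re.split(r"\n\s*\n", cpuinfo_text.strip()):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        # Skip blocks that lack any of the three fields of interest.
        if all(k in fields for k in ("processor", "physical id", "core id")):
            core = (int(fields["physical id"]), int(fields["core id"]))
            groups[core].append(int(fields["processor"]))
    return {core: sorted(cpus) for core, cpus in groups.items()}

# Typical use on a Linux machine (not executed here):
#   with open("/proc/cpuinfo") as f:
#       print(group_logical_cpus(f.read()))
```

On a machine numbered the way I described above, with the last 4 logical CPUs mapped onto the same 4 real cores as the first 4, each entry in the result would pair logical CPU n with logical CPU n+4.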


As it happens, I was able to find direct evidence of a Python function which lets the present process choose which CPU cores it wishes to run on. What this means is that the kernel must also expose such a feature to user-space applications…
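
The function is not named above. A likely candidate, and the one assumed in this sketch, is os.sched_setaffinity from Python’s standard library, which is available on Linux and wraps the kernel’s sched_setaffinity system call. The helper name pin_to_cpus is hypothetical, and the choice of which CPU to pin to is purely for demonstration:

```python
import os


def pin_to_cpus(wanted):
    """Restrict the calling process to the CPUs in `wanted`.

    CPUs outside the current affinity mask are ignored. Returns the
    affinity set that is in effect afterwards.
    """
    allowed = os.sched_getaffinity(0)      # 0 means "the calling process"
    target = allowed & set(wanted)
    if not target:
        raise ValueError("none of the requested CPUs is available")
    os.sched_setaffinity(0, target)
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    original = os.sched_getaffinity(0)
    print("allowed before:", sorted(original))
    # Pin to the lowest-numbered allowed CPU, as a demonstration only.
    print("allowed after: ", sorted(pin_to_cpus({min(original)})))
    os.sched_setaffinity(0, original)      # restore the original mask
```

To keep a pair of communicating threads on one physical core, the set passed in would presumably be the two logical CPUs which the mapping above reports as siblings.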

(Updated 5/22/2019, 7h00 … )


Plausible does not mean Assumed

This time I noticed that the cursor on the screen stopped moving, then that mouse input was not being interpreted, and then that the screen just filled with an image which was a diagonally scrambled version of the normal screen content. I could make hypothetical guesses as to why crashes like this one happen on the machine I name ‘Phoenix’, which was manufactured in 2008:

  • It could be that the old GPU is no longer reliable at the hardware level, and that it may now suffer from random crashes, which also crash the X-server. The “Timeout Detection and Recovery” (‘TDR’) feature, which I have seen the nVidia driver execute properly in past situations, may just not kick in.
  • When I reinstalled, replacing the old 32-bit O/S with the current 64-bit O/S, I also replaced the 2GB of RAM with completely new modules totalling 4GB. The “Front-Side Bus Speed” (‘FSB’) of the new RAM is also faster, at 800MHz instead of the earlier 600MHz. Both sets of DDR RAM modules ran with dual-channel capability. The motherboard may detect the higher speed of the new RAM modules and start using it, as the motherboard itself may have the stated capability of running at 800MHz. Yet the way this motherboard works at 800MHz may not be stable.
  • There could be some sort of kernel issue…

What I do find a bit more specific is the fact that there seem to be no log entries for Xorg, suggesting that although an X-server crash eventually takes place, this may not be the root cause. The fact that the mouse becomes unresponsive for a few seconds before the screen content collapses seems to suggest the same thing…

But the most important fact for me to observe, is that simply being able to suggest plausible reasons for the crash, is not the same thing as having diagnosed the crashes. Honestly, I do not know at present, why this type of crash happens.

One of the observations about this machine which had impressed me in the past was that I had pushed 3D rendering beyond the limits of the old GPU, thereby crashing this graphics chip, but that the desktop manager I had in place was able to restart the GPU and resume the session. It did so without requiring any action from me, only displaying a well-behaved message to the effect that the GPU needed to be rebooted. This is called “Timeout Detection and Recovery” (‘TDR’); it does the same thing under Linux that it does under Windows, and it depends on stable graphics drivers.

The fact that I do possess ‘TDR’ on this machine suggests, that a simple failure of the graphics chip, should not take out my session.


According to my latest inquiry, this Motherboard is ‘only’ running at 66MHz. Therefore, the maximum speed of the newer RAM Module should not be an issue after all.


