Why some Linux devs still use spinlocks.

There is An interesting article, on another BB, which explains what some of the effects are, of spinlocks under Linux. But, given that at least under Linux, spinlocks are regarded as ‘bad code’ and constitute ‘a busy-wait loop’, I think that that posting does not explain the subject very well of why, in certain cases, they are still used.

A similar approach to acquiring an exclusive lock – that can exist as an abstract object, which does not really need to be linked to the resource that it’s designed to protect – could be programmed by a novice programmer, if there were not already a standard implementation somewhere. That other approach could read:

 

from time import sleep
import inspect
import thread

def acquire( obj ):
    assert inspect.isclass(obj)
    myID = thread.get_ident()
    if obj.owner is None:
        # Replace .owner with a unique attribute name.
        # In C, .owner needs to be declared volatile.
        # Compiler: Don't optimize out any reads.
        obj.owner = myID
    while obj.owner != myID:
        if obj.owner == 0:
            obj.owner = myID
        sleep(0.02)

def release( obj ):
    assert inspect.isclass(obj)
    if obj.owner is None:
        # In C, .owner needs to be declared volatile.
        obj.owner = 0
        return
    if obj.owner == thread.get_ident():
        obj.owner = 0

 

(Code updated 2/26/2020, 8h50…

I should add that, in the example above, Lines 8-12 will just happen to work under Python, because multi-threading under Python is in fact single-threaded, and governed by a Global Interpreter Lock. The semantics that take place between Lines 12 and 13 would break in the intended case, where this would just be pseudo-code, and where several clock cycles elapse, so that the ‘Truth’ which Line 13 tests may not last, just because another thread would have added a corresponding attribute on Line 12. OTOH, adding another sleep() statement is unnecessary, as those semantics are not available, outside Python.

In the same vein, if the above code is ported to C, then what matters is the fact that in the current thread, several clock-cycles elapse between Lines 14 and 15. Within those clock cycles, other threads could also read that obj.owner == 0, and assign themselves. Therefore, the only sleep() instruction is dual-purpose. Its duration might exceed the amount of time the cache was slowed down to, to execute multiple assignments to the same memory location. After that, one out of possibly several threads would have been the last, to assign themselves. And then, that thread would also be the one that breaks out of the loop.

However, there is more that could happen between Lines 14 and 15 above, than cache-inefficiency. An O/S scheduler could cause a context-switch, and the current thread could be out of action for some time. If that amount of time exceeds 20 milliseconds, then the current thread would assign itself afterwards, even though another thread has already passed the retest of Line 13, and assumes that it owns the Mutex. Therefore, better suggested pseudo-code is offered at the end of this posting…

)


 

This pseudo-code has another weakness. It assumes that every time the resource is not free, the program can afford to wait for more than 20 milliseconds, before re-attempting to acquire it. The problem can crop up, that the current process or thread must acquire the resource, within microseconds, or even within nanoseconds, after it has become free. And for such a short period of time, there is no way that the O/S can reschedule the current CPU core, to perform any useful amount of work on another process or thread. Therefore, in such cases, a busy-wait loop becomes The Alternative.

I suppose that another reason, for which some people have used spinlocks, is just due to bad code design.


 

Note: The subject has long been debated, of what the timer interrupt frequency should be. According to kernel v2.4 or earlier, it was 100Hz. According to kernel v2.6 and later, it has been 1000Hz. Therefore, in principle, an interval of 2 milliseconds could be inserted above (in case the resource had not become free). However, I don’t really think that doing so would change the nature of the problem.

Explanation: One of the (higher-priority, hardware) interrupt requests consists of nothing but a steady pulse-train, from a programmable oscillator. Even though the kernel can set its frequency over a wide range, this frequency is known not to be terribly accurate. Therefore, assuming that the machine has the software installed, that provides ‘strict kernel marshalling’, every time this interrupt fires, the system time is advanced by a floating-point number, that is also adjusted over a period of hours and days, so that the system time keeps in-sync with an external time reference. Under Debian Linux, the package which installs that is named ‘ntp’.

 

There exist a few other tricks to understand, about how, in practice, to force an Operating System which is not an RTOS, to behave like a Real-Time O/S. I’ve known a person who understood computers on a theoretical basis, and who had studied Real-Time Operating Systems. That’s a complicated subject. But, given the reality that his operating system was not a Real-Time O/S, he was somewhat stumped by why then, it was able to play a video-clip at 24 FPS…

(Updated on 3/12/2020, 13h20 …)

Continue reading Why some Linux devs still use spinlocks.