I have been exploring this subject through a series of experiments written in Python, and through what I learned when I studied System Hardware at Concordia University.
When a person uses a Windows computer, the OS takes care of all the details of scheduling processes and threads, and arguably it does this well. When a person uses Linux, the kernel makes all the required information available, but does not specifically take care of optimizing how threads are scheduled. It becomes the responsibility of the application, or of some other user-space program, to optimize how it occupies threads, either by setting CPU affinity, or by using low-level C functions that instruct the CPU to replace a single line in the L1 cache…
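As a rough illustration of the CPU-affinity approach, here is a minimal sketch of how a Python script on Linux might restrict itself to particular cores. The functions `os.sched_getaffinity()` and `os.sched_setaffinity()` are part of the standard library on Linux; the helper name `pin_to_cpus` is my own, made up for this example, and is not taken from any of my experiments.

```python
import os


def pin_to_cpus(cpus):
    """Restrict the calling process to the given CPU indices.

    Hypothetical helper for illustration only. Returns the affinity set
    that is in effect afterwards. Linux-only.
    """
    # A pid of 0 means "the calling process".
    allowed = os.sched_getaffinity(0)
    # Only request CPUs that this process is currently permitted to use.
    wanted = set(cpus) & allowed
    if not wanted:
        raise ValueError("none of the requested CPUs are available")
    os.sched_setaffinity(0, wanted)
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    original = os.sched_getaffinity(0)
    print("CPUs available at start:", sorted(original))

    # Pin the process to the lowest-numbered CPU it is allowed to use.
    print("Pinned to:", sorted(pin_to_cpus({min(original)})))

    # Put the original affinity back.
    os.sched_setaffinity(0, original)
```

Whether pinning actually helps depends on the workload. It takes a placement decision away from the kernel, so it is worth measuring before and after.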
In the special case where a person is writing scripts in Python, the program that is actually running is the Python interpreter, because Python is an interpreted language. How well the scheduling of threads works in that case depends on how well the interpreter has been coded to do so. In addition, how well certain Python modules have been coded has a strong effect on how efficiently they schedule threads. It just so happens that I've been lucky, in that the Python versions I get from the Debian repositories have been programmed very well by other people.
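To make the point about the interpreter a little more concrete, here is a small sketch, assuming CPython 3.8 or later, in which each Python thread reports the thread ID that the kernel assigned to it. The function name `collect_native_ids` is made up for this example. It only shows that the interpreter asks the OS for real threads; it says nothing about how efficiently they end up being scheduled.

```python
import threading


def collect_native_ids(n_threads=4):
    """Start n_threads Python threads and return the kernel-assigned
    thread ID that each one reports (hypothetical helper)."""
    ids = []
    lock = threading.Lock()
    # The barrier keeps every thread alive until all of them have started,
    # so the OS cannot hand a finished thread's ID to a later one.
    barrier = threading.Barrier(n_threads)

    def worker():
        barrier.wait()
        with lock:
            ids.append(threading.get_native_id())

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids


if __name__ == "__main__":
    print("Main thread:", threading.get_native_id())
    print("Worker threads:", collect_native_ids())
```

On Linux, these IDs can be compared against the entries under `/proc/<pid>/task/` for the interpreter process.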