## Getting Pulseaudio to schedule real-time threads under Debian / Stretch and beyond.

On the computer which I name ‘Phosphene’, I have Debian 9 / Stretch installed, and, going the popular route, the Debian maintainers have decided that ‘Pulseaudio’ should be the main sound server, while in certain cases, it can be made to coexist with the ‘Jack’ / ‘Jack v2′ daemon, the latter for more demanding needs. In my case, starting Jack suspends Pulse.

In addition, Debian systems I’m familiar with run ‘pulseaudio’ as user, and not as root.

But, the way my Pulse daemon is configured, includes a ‘loopback’ module, that sends audio that has been input and sampled from the analog input of my (“Creative X-Fi”) sound card, back to its analog output, so that I can listen to an old-fashioned-but-modern radio receiver, over my speakers, without having to connect anything special to do so.

The problem that I was experiencing was, buffer underruns, and sound drop-outs, due to the way this default sound server was operating the sound card. And, surely enough, it had something to do with a by-now famous issue, of allowing Pulse to schedule certain threads to run with real-time priority. This can be enabled in the config file ‘/etc/pulse/daemon.conf‘, but fails to become actual when just enabled so.

Trying to find an answer to why this attempt was failing, sent me on a trip to many Web-sites and BB-postings, most of which had as disadvantage, the fact that with Pulse, there is an old way and a new way to accomplish this, and the fact that most of the BB-postings assumed the old way, being dated to the year 2016 and earlier.

Rest assured, under Debian 9 / Stretch, authorizing Pulse to schedule some threads with RT priority, is managed according to ‘the new way’. Just to recap how ‘the old way’ worked:

The file ‘/usr/bin/pulseaudio‘ needed to be set with ‘SETUID root’, and when run this way, if the sound server detected that the real user (not the effective user) belonged to the group ‘pulse-rt‘, would schedule the threads it needed to run with RT, and would then drop privileges, so that according to the process-monitor ‘top’, it would never even give a hint that anything had in fact been scheduled with RT.

This is how ‘the new way’ is supposed to work:

If the user needs for Pulse specifically to run with RT, he must actually install the package ‘rtkit‘, which runs as a system user, and which sets certain threads to have RT priority in a curated way, that’s meant to be ‘safer’, than it still is for any one user to belong to a group, that simply allows him or her to run arbitrary processes with RT. In fact, ‘the old way’ was also meant as a similar safeguard.

I had installed the specified package, rebooted my system – with Pulse given its new startup parameters, and found alas, that the file ‘/var/log/syslog‘, which was benefiting from an elevated logging level set by me, revealed numerous messages that I did not understand.

Trying to figure out what was happening resulted in more than one day’s frustration, and then finally, to some peace of mind…

(Updated 4/16/2020, 13h15 … )

During this posting from some time ago, I wrote at length about the subject of interrupt prioritization. I wanted to demonstrate that I really did study System Software, and that therefore, I know something which I can pass along, about the subject of multitasking, which is a different subject.

The perspective from which I appreciate multitasking, is the perspective of the device-driver. According to what I was taught – and it is a bit dated – a device-driver had a top half and a bottom half. The top half was invoked by user-space processes, in a way that required a system-call, while the bottom half was invoked by interrupt requests, from the hardware. This latter detail did not change if instead of using purely interrupt-driven I/O, the hardware and driver used DMA.

A system-call is also known as a software-interrupt.

The assumption which is made is that a user-space process can request a resource, thus invoking the top half of the device driver, but that the resource is recognized by the device driver as being busy, and that processes on the CPU run much faster than the I/O device. What the device-driver will do is change the state of the process which invoked it from ‘running’ to ‘blocked’, and it will make an entry in a table it holds, which identifies which process had requested the resource, as well as device-specific information, regarding what exactly the process asked for. Then, the top half of the device-driver will send whatever requests to the device that are needed, to initiate the I/O operation, and to ensure that at some future point in time, the resource and piece of data which were asked for, will become available. The top half of the device-driver then exits, and the process which invoked it will typically not be in a ‘running’ state anymore, unless for some reason the exact item being requested was immediately available. So as far as the top half is concerned, what usually needs to happen next is that some other process, which is in the ‘ready’ state, needs to be made ‘running’ by the kernel.

The bottom half of the device driver responds to the interrupt request from the device, which has signaled that something has become available, and looks up in the table belonging to the device driver, which process asked for that item, out of several possible processes which may be waiting on the same device. The bottom half then changes the state of the process in question from ‘blocked’ to ‘ready’, so that whenever a ‘ready’ process is about to be made ‘running’, the previously-blocked process will have become a contender.

The O/S kernel is then free to schedule the ‘ready’ process in question, making it ‘running’.

Under Linux specifically, when access to devices is requested by user-space processes, through device-files in the directory ‘/dev/*’, this is also an equivalent that calls the top half of a device driver.

Now, aside from the fact that processes can be ‘running’, ‘ready’, or ‘blocked’, a modern O/S has a differentiation, between ‘active’ and ‘suspended’, that applies to ‘ready’ and ‘blocked’ processes. My System Software course did not go into great detail about this, because in order to understand why this is needed, one also needs to understand that there exists virtual memory, and that processes can be swapped out. The System Software course I took, did not cover virtual memory, but I have read about this subject privately, after taking the course.

(Update 2/28/2020, 12h30 :

Another reason for which my course did not cover the subject about Active versus Suspended states, is probably the fact that the course still assumed that the Running process would use the (deprecated) ‘Yield()’ instruction, which was roughly equivalent to issuing ‘sleep(0.0)’. With this instruction, the state of a Running process could directly be made Ready-Active, with no need for Ready-Suspended. The instructors may well have thought, that much of the detail which follows below, would actually have been too much information to add to their course, which was focussed more on device-drivers.)

The kernel could have several reasons to suspend a process, and the programmer should expect that his process can be suspended at any time. But one main reason why it would be suspended in practice, would be so that it can be swapped out. Only the kernel can make a ‘suspended’ process ‘active’ again. (:1)

(Updated 4/16/2020, 8h30 … )