How configuring VirtualBox to use Large Pages is greatly compromised under Linux.

One of the things which Linux users will often do is to set up a Virtual Machine such as VirtualBox, so that a legitimate, paid-for instance of Windows can run as a Guest System on our Linux Host System. Because of the way VMs work, getting them to use “Large Pages”, which under Linux are simply named “Huge Pages”, could improve overall performance. Without Huge Page support, the VM needs to allocate Memory Maps that are subdivided into 512 standard pages, each of which has a standard size of 4KiB. In practice, this means that 512 individual memory allocations usually take place wherever the caching and remapping requires 2MiB of memory. Such a line of memory can also end up getting saved to the .VDI File, in the case of VirtualBox, from 512 discontiguous pieces of RAM.

The available sizes of Huge Pages depend on the CPU. In the case of the x86 / x86_64 CPUs, they tend to be either 2MiB or 1GiB in size, where 2MiB is already quite ambitious. One way to set this up is summarized in the following little snip of commands, which need to be given as the regular user, not as root:

 


VBoxManage modifyvm "PocketComp_20H2" --nestedpaging on
VBoxManage modifyvm "PocketComp_20H2" --largepages on

 

In my example, I’ve given these commands for the Virtual Machine instance named ‘PocketComp_20H2‘. If the CPU is actually an Intel with ‘VT-x’ (hardware support for virtualization), then large-page or huge-page support should now be turned on. Yet, like several other people, what I obtained next in the log file for the subsequent session was the following line of output:

 


00:00:31.962754 PGMR3PhysAllocateLargePage: allocating large pages takes too long (last attempt 2813 ms; nr of timeouts 1); DISABLE
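To find out whether a given session ran into the same problem, the log can simply be searched for that message. What follows is only a minimal sketch. The function name is my own, and the log path in the usage comment is an assumption based on the default VirtualBox folder layout, so it may need to be adjusted:

```shell
#!/bin/sh
# Succeeds (exit status 0) if the given VirtualBox log shows that
# large pages were given up on during that session.
# $1: path to a VBox.log file (hypothetical; depends on the setup).
large_pages_disabled() {
    # The search string is copied from the log line quoted above.
    grep -q 'PGMR3PhysAllocateLargePage: allocating large pages takes too long' "$1"
}

# Hypothetical usage, assuming the default VM folder:
#   large_pages_disabled "$HOME/VirtualBox VMs/PocketComp_20H2/Logs/VBox.log" \
#       && echo "Large pages were disabled for this session."
```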

 

Some users have searched the Internet in vain for an explanation of why this feature would not work. I want to explain here what goes wrong with most simple attempts. This is not really an inability of the platform to support the feature. Rather, it is an artifact of how the practice of Huge Pages under Linux differs from the theoretical, hypothetical way in which some people might want to use them. If Huge Pages are to be allocated after the computer has started fully, then Linux will be excruciatingly slow in doing so at the request of the VM, because some RAM would need to be defragmented first.

This is partially due to the fact that VirtualBox will want to map all the virtual RAM of the Guest System using them, and not the .VDI File. (:1)  I.e., if a very modest Guest System has 4GiB of (virtual) RAM, then 2048 Huge (2MiB) Pages will be needed, and those will take several minutes to allocate. If that Guest System is supposed to have larger amounts of RAM, the problem just gets worse. If the VM fails to allocate them within about 2 seconds of requesting them, it aborts, and continues with standard pages.
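The arithmetic behind that figure can be written out as a small sketch. The function name is hypothetical, and the 2MiB page size is the x86_64 default assumed throughout this posting:

```shell
#!/bin/sh
# Number of 2MiB Huge Pages needed to back a Guest System,
# given its virtual RAM in MiB.
huge_pages_needed() {
    ram_mib=$1
    page_mib=2
    # Round up, in case the guest RAM is not a multiple of the page size.
    echo $(( (ram_mib + page_mib - 1) / page_mib ))
}

huge_pages_needed 4096    # a 4GiB guest, prints 2048
```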

What Linux will offer as an alternative behaviour is to allocate a fixed number of Huge Pages on boot-up, when the memory is not yet very fragmented, and then to allow any applications which ‘know how’ to help themselves to some of those Huge Pages. Thus, if 128 Huge Pages are to be preallocated, then the following snip shows roughly how to do so, assuming a Debian distro. (:2)  Lines that begin with hash-marks (‘#‘) are commands that would need to be given as root. I estimate this number of Huge Pages to be appropriate for a system with 12GiB of RAM:

 


# groupadd hugetlbfs
# adduser dirk hugetlbfs
# getent group hugetlbfs

hugetlbfs:x:1002:dirk


# cd /etc
# edit text/*:sysctl.conf

vm.nr_hugepages = 128
vm.hugetlb_shm_group = 1002

# edit text/*:fstab

hugetlbfs       /hugepages      hugetlbfs mode=1770,gid=1002        0       0


# ulimit -H -l

(...)


# cd /etc/security
# edit text/*:limits.conf

@hugetlbfs      -       memlock         unlimited



 

The problem here is that, for a Guest System with 4GiB of virtual RAM to launch, 2048 Huge Pages would need to be preallocated, not 128. To make things worse, Huge Pages cannot be swapped out! They remain locked in RAM. This means that they also get subtracted from the maximum number of KiB that a user is allowed to lock in RAM. In effect, 4GiB of RAM would end up being tied up, not doing anything useful, until the user actually decides to start the VM (at which point, little additional RAM should be requested by VirtualBox).

Now, there could even exist Linux computers which are set up, on that set of assumptions. Those Linux boxes do not count as standard personal, desktop computers.

If the user wishes to know how slow Linux tends to be at actually allocating some number of Huge Pages after it has started to run fully, then he or she can just enter the following commands, after configuring the above but before rebooting. Normally, a reboot is required after what is shown has been configured, but instead, the following commands could be given in a hurry. My username ‘dirk‘ will still not belong to the group ‘hugetlbfs‘…

 


# sync ; echo 3 > /proc/sys/vm/drop_caches
# sysctl -p

 

I found that, on a computer which had run for days, with RAM that had gotten very fragmented, the second command took roughly 30 seconds to execute. Imagine how long it might take, if 2048 Huge Pages are indeed to be allocated, instead of 128.
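Whether the kernel actually managed to reserve the requested number can be read back from ‘/proc/meminfo‘. If it cannot find enough contiguous memory, the total it reports may be lower than what was asked for. The following is a small sketch with a helper name of my own choosing. It reads meminfo-style text on standard input, so that it can also be tried on a saved copy:

```shell
#!/bin/sh
# Print the value of one field from /proc/meminfo-style text on stdin.
# $1: field name without the trailing colon, e.g. HugePages_Total
meminfo_field() {
    awk -v key="$1:" '$1 == key { print $2 }'
}

# Usage on a live system:
#   meminfo_field HugePages_Total < /proc/meminfo
#   meminfo_field HugePages_Free  < /proc/meminfo
```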


 

What some people have researched on the Web, again to find that nobody seems to have the patience to provide a full answer, is the following question: If, as indicated above, the mount-point for the HugeTLBFS is ‘/hugepages‘, which few applications today would still try to use, could that mount-point just be used as a generic Ramdisk? Modern Linux applications simply use “Transparent Huge Pages”, not access to this mount-point as a Ramdisk. And the real answer to this hypothetical question is No…
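As an aside, the mode in which Transparent Huge Pages are currently operating can be read from ‘/sys/kernel/mm/transparent_hugepage/enabled‘, where the kernel lists the available modes and puts the active one in square brackets. A minimal sketch, with a hypothetical helper name, that extracts the active mode from such a line:

```shell
#!/bin/sh
# Print the active mode from a line such as "always [madvise] never",
# read on stdin. The active mode is the one in square brackets.
thp_active_mode() {
    sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Usage on a live system:
#   thp_active_mode < /sys/kernel/mm/transparent_hugepage/enabled
```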

(Updated 5/20/2021, 8h20… )

 


Revisiting the Android, UserLAnd app.

One of the facts which I had reported some time ago was that a handy, easy-to-use Android app exists, which is called ‘UserLAnd‘, and that I had installed it on my Google Pixel C Tablet. As the tooltip suggests, this is an Android app that will allow people to install a basic Linux system, without requiring ‘root’. Therefore, it mounts the apparent, local Linux file system with ‘proot’, which is similar in how it works to ‘chroot’, except that ‘proot’ does not require root on the host system to set up. Any attempt to obtain root within this Linux system really fails to change the userid of the app that the files belong to, or of the processes running. Yet, becoming root within this sandboxed version of Linux will convince Linux that the user is root, for the purpose of installing or removing packages via ‘apt-get’.

In the meantime, I have uninstalled the ‘UserLAnd’ Linux guest system from my Pixel C Tablet, in order to free up storage. But, I have set up something like this again, on my Samsung Galaxy Tab S6 Tablet, which has 256GB of internal storage. Therefore, I have a few observations to add, about how this app runs under Android 10.

Through no fault of the developer of this Android app, the user is more restricted in what he can run, because Android 10 places new restrictions on regular processes. Specifically, none of the major LISP Interpreters that were designed to run under Debian 10 / Buster will run. (:1) What the Linux developers did was, to make the garbage collection of their LISP Interpreters more aggressive, through a strategy that changes the memory protection bits of memory-maps, to read-only if they belong to the state of the machine, and then, ~to try deleting as much of the bytecode as can still be deleted~. Pardon me, if my oversimplification gets some of it wrong.

Well, Android 10 no longer allows regular apps to change the protected memory state of any pages of RAM, for which reason none of the affected LISP Interpreters will run. And for that reason, neither “Maxima” nor anything that depends on Maxima can be made to run.

Yet, certain other Linux applications, notably “LibreOffice” and “Inkscape”, run just fine… So does “GIMP”…

Screenshot_20200912-171020_VNC Viewer

Also, the way in which files can be shared between the  Android Host and the Linux Guest System has been changed, as the following two screen-shots show:

Screenshot_20200912-155032_VNC Viewer

Screenshot_20200912-155144_File Manager

Here, the file ‘Text-File.txt’ has been shared between Android and Linux. Larger files can also be shared in the same way, and the folder bookmarked under Linux. (:2)

In many ways, the Linux applications behave as described before, with the unfortunate exceptions I just named, and I intend to keep using this app.

Technically, a Host app that just sandboxes a Guest Application in this way, does not count as a Virtual Machine. A real VM allows processes to obtain root within the Guest System, without endangering the Host System. Also, ‘a real VM’ provides binary emulation, that makes no specific assumptions about the Guest System, other than, usually, what CPU is being used. Emulation that includes non-native CPU emulation is still a somewhat special type of emulation.

Therefore, the ability of Debian 10 / Buster to run under ‘UserLAnd’ depends, in this case, on the Linux package maintainers having cross-compiled the packages, to run on an ‘ARM-64′ CPU…

 

(Updated 9/13/2020, 21h30… )


I am no longer 100% Linux.

Something which I recently did, as of May 17 to be precise, was to install Windows 10 on a Virtual Machine. This is not to be confused with the use of ‘Wine’, because a compatibility layer such as Wine (which is sometimes loosely called an Emulator) is not the same thing as a VM. When using a VM, the ISO File authored by Microsoft, in this case, needs to be provided, so that Genuine Windows can install itself in an isolated environment that behaves exactly as a regular computer would behave by itself.

AFAIK, this is a perfectly legal thing to do. And my perception of that is reinforced by the fact that within Windows 10, I was able to go through the Windows Store to purchase the activation for that instance. If it were illegal, then I should have received a message to that effect.

Also, when Windows software runs on a VM, certain hardware can be ‘fed through’ to this ‘Guest System’, such as specific USB Devices. But, when they are, Windows relies on its own device-drivers, to be able to use them, or, on vendor-supplied device drivers. What this means is that at the raw binary level, the VM itself is forwarding the data, without attempting to reparse or analyze it in any way.

Screenshot_20200522_153241

I can still get error messages, but so far, those have only come as a result of silly user errors, that would have produced the same error messages had Windows been running natively.

And, because the Guest System is genuinely Windows, I can no longer say that I have zero Windows instances running. It’s just that, for now, the Windows instance I have running, resides inside a VM and doesn’t own ‘the Real Computer’ – aka the ‘Host System’.

What I do notice is the fact, that some of the errors which I made, were due to not having used Windows for a long time.

Dirk

 

Android Permissions Control

One fact which I had written about before was that Android differs from Linux, in that under Android, every installed app has its own username. Also, because different users installed a different set of apps in a different order, the UID (an actual number) for any given username will differ from one device to the next. And then I also wrote that whether or not a username belongs to a group can be used to manage access control, just like under Linux.

There is a reason for which things are not so simple with Android. Most Android apps are written in “Java” (or, more recently, in Kotlin), and their source code must be compiled into “Bytecode” for a runtime named “Dalvik” (superseded by “ART” as of Android 5.0). The bytecode in turn runs on this runtime, which is also referred to as a Virtual Machine.

The reason this affects permissions is that, as far as the kernel can see, this VM itself runs under one username. This is similar to how a Java VM would run under one username. And so a much more complex security model is put in place by the VM itself, because presumably this VM’s username has far-reaching capabilities on the device.

The actual use of groups to control access under Android is simpler, and applies at first glance to processes which have been compiled with the ‘NDK’ – with the “Native Development Kit” – and which therefore run directly, say from C++ source code.

Further, the Dalvik VM is capable of reading the permissions of actual files, and is capable of applying its own security model, in a way that takes the permission bits into account, that have been assigned to the files by the Linux kernel. So for most purposes, the security model on the VM is more important than the actual permission bits, as assigned and implemented by the kernel, because most Android source code is effectively written in a Java-like language.

Dirk