How configuring VirtualBox to use Large Pages is greatly compromised under Linux.

One of the things which Linux users will often do is, to set up a Virtual Machine such as VirtualBox, so that a legitimate, paid-for instance of Windows can run as a Guest System, to our Linux Host System. And, because of the way VMs work, there is some possibility that to get them to use “Large Pages”, which under Linux have simply been named “Huge Pages”, could improve overall performance, mainly because without Huge Page support, the VM needs to allocate Memory Maps, which are subdivided into 512 standard pages, each of which has a standard size of 4KiB. What this means is that in practice, 512 individual memory allocations usually take place, where the caching and remapping requires 2MiB of memory. Such a line of memory can also end up, getting saved to the .VDI File – in the case of VirtualBox, from 512 discontiguous pieces of RAM.

The available sizes of Huge Pages depend on the CPU, and, in the case of the x86 / x86_64 CPUs, they tend to be either 2MiB in size or 1GiB, where 2MiB is already quite ambitious. One way to set this up is being summarized in the following little snip of commands, which need to be given as user:

 


VBoxManage modifyvm "PocketComp_20H2" --nestedpaging on
VBoxManage modifyvm "PocketComp_20H2" --largepages on

 

In my example, I’ve given these commands for the Virtual machine instance named ‘PocketComp_20H2‘, and, if the CPU is actually an Intel with ‘VT-x’ (hardware support for virtualization), large page or huge page -support should be turned on. Yet, like several other people, what I obtained next in the log file for the subsequent session, was the following line of output:

 


00:00:31.962754 PGMR3PhysAllocateLargePage: allocating large pages takes too long (last attempt 2813 ms; nr of timeouts 1); DISABLE

 

There exist users who searched the Internet in vain, for an explanation of why this feature would not work. I want to explain here, what goes wrong with most simple attempts. This is not really an inability of the platform to support the feature, as much as it’s an artifact, of how the practice of Huge Pages under Linux, differs from the theoretical, hypothetical way in which some people might want to use them. What will happen, if Huge Pages are to be allocated after the computer has started fully, is that Linux will be excruciatingly slow in doing so, at the request of the VM, because some RAM would need to be defragmented first.

This is partially due to the fact, that VirtualBox will want to map all the virtual RAM of the Guest System using them, and not, the .VDI File. (:1)  I.e., if the very modest Guest System has 4GiB of (virtual) RAM, it implies that 2048 Huge (2MiB) Pages will be needed, and those will take several minutes to allocate. If that Guest System is supposed to have larger amounts of RAM, the problem just gets worse. If the VM fails to allocate them within about 2 seconds of requesting them, it aborts, and continues with standard pages.

What Linux will offer as an alternative behaviour is, to allocate a fixed number of Virtual Pages on boot-up – when the memory is not yet very fragmented – and then, to allow any applications which ‘know how’, to help themselves to some of those Huge Pages. Thus, if 128 Huge Pages are to be preallocated, then the following snip shows, roughly how to do so, assuming a Debian distro. (:2)  Lines that begin with hash-marks (‘#‘) are commands that would need to be given as root. I estimate this number of Huge Pages to be appropriate for a system with 12GiB of RAM:

 


# groupadd hugetlbfs
# adduser dirk hugetlbfs
# getent group hugetlbfs

hugetlbfs:x:1002:dirk


# cd /etc
# edit text/*:sysctl.conf

vm.nr_hugepages = 128
vm.hugetlb_shm_group = 1002

# edit text/*:fstab

hugetlbfs       /hugepages      hugetlbfs mode=1770,gid=1002        0       0


# ulimit -H -l

(...)


# cd /etc/security
# edit text/*:limits.conf

@hugetlbfs      -       memlock         unlimited



 

The problem here is, that for a Guest System with 4GiB of virtual RAM to launch, 2048 Huge Pages would need to be preallocated, not, 128. To make things worse, Huge Pages cannot be swapped out! They remain locked in RAM. This means that they also get subtracted from the maximum number of KiB that a user is allowed to lock in RAM. In effect, 4GiB of RAM would end up, being tied up, not doing anything useful, until the user actually decides to start his VM (at which point, little additional RAM should be requested by VirtualBox).

Now, there could even exist Linux computers which are set up, on that set of assumptions. Those Linux boxes do not count as standard personal, desktop computers.

If the user wishes to know, how slow Linuxes tend to be, actually allocating some number of Huge Pages, after they have started to run fully, then he or she can just enter the following commands, after configuring the above, but, before rebooting. Normally, a reboot is required after what is shown has been configured, but instead, the following commands could be given in a hurry. My username ‘dirk‘ will still not belong to the group ‘hugetlbfs‘…

 


# sync ; echo 3 > /proc/sys/vm/drop_caches
# sysctl -p

 

I found that, on a computer which had run for days, with RAM that had gotten very fragmented, the second command took roughly 30 seconds to execute. Imagine how long it might take, if 2048 Huge Pages are indeed to be allocated, instead of 128.


 

What some people have researched on the Web – again, to find that nobody seems to have the patience to provide a full answer – is if, as indicated above, the mount-point for the HugeTLBFS is ‘/hugepages‘ – which few applications today would still try to use – whether that mount-point could just be used as a generic Ramdisk. Modern Linux applications simply use “Transparent Huge Pages”, not, access to this mount-point as a Ramdisk. And the real answer to this hypothetical question is No…

(Updated 5/20/2021, 8h20… )

 

Continue reading How configuring VirtualBox to use Large Pages is greatly compromised under Linux.

Android Permissions Control

One fact which I had written about before, was that Android differs from Linux, in that under Android, every installed app has its own username. Also, because different users installed a different set of apps in different order, the UID – an actual number – for any given username will be different from one device to the next. And then I also wrote, that a username belonging to a group or not so, can be used to manage access control, just like under Linux.

There is a reason for which things are not so simple with Android. Most Android apps are written in a language named “Dalvik”, the source code of which has syntax identical to “Java”, and which must be compiled into “Bytecode”. The bytecode in turn runs on a bytecode interpreter, which is also referred to as a Virtual Machine.

The reason for which this affects permissions, is because as far as the kernel can see, this VM itself runs under one username. This is similar to how a Java VM would run under one username. And so a much more complex security model is put in place by the VM itself, because presumably this VM’s username has far-reaching capabilities on the device.

The actual use of groups to control access under Android is simpler, and applies at first glance to processes which have been compiled with the ‘NDK’ – with the “Native Development Kit” – and which therefore run directly, say from C++ source code.

Further, the Dalvik VM is capable of reading the permissions of actual files, and is capable of applying its own security model, in a way that takes the permission bits into account, that have been assigned to the files by the Linux kernel. So for most purposes, the security model on the VM is more important than the actual permission bits, as assigned and implemented by the kernel, because most Android source code is effectively written in a Java-like language.

Dirk