## I’ve just tweaked my kernel memory a little.

According to This previous posting of mine, I had set aside 270MiB of RAM, just in case the kernel needed it, on the computer I name ‘Phoenix’, to keep track of user-space programs, watching files and directories. I had also taken A Quick Glance at the way 64-bit kernel, virtual memory is organized under Linux, differently from how 32-bit virtual addresses were. Given the additional fact, that I only want to set aside 128MiB of user-space RAM, To cache my blog’s Web-pages, I decided that the earlier amount of kernel-memory was too large.

And so now I’ve reduced the amount, for ‘INotify’, to a mere 135MiB.

This computer has 4GiB total RAM.

Elaboration:

On 32-bit systems, the entire address space only had 4GB, but under no circumstances was the kernel expected to take up more than 1 GB. The computer ‘Phoenix’ is a 64-bit system, but one which only has 4GB of physical RAM. Therefore, even though it’s a 64-bit system, its kernel should still not take up more than 1GB. Yet, if I preallocate 270MB, just to perform one task, I’m pushing this system’s kernel closer to using up 1GB for real. So just for this one task, I’ve reduced this one commitment to 135MB.

Dirk

## Some Suggested Code

In This Earlier Posting, I had written at first some observations about Bluetooth-pairing, but then branched out on the subject, of whether a Diffie-Hellman Key Exchange could be easier to compute, if it was somehow simplified into using a 32-bit modulus. Obviously, my assumption was that a 64-bit by 32-bit divide instruction would be cheap on the CPU, while arbitrary-precision integer operations are relatively expensive, and actually cause some observable lag on CPUs which I’ve used.

And so, because I don’t only want to present theory in a form that some people may not be able to visualize, what I did next was to write a C++ program, that actually only uses C, that assumes the user only has a 32-bit CPU, and yet that performs a 64-bit by 64-bit division.

This has now been tested and verified.

One problem in writing this code is the fact that, depending on whether the divisor, which is formatted as a 64-bit field, contains an actual 64-bit, 32-bit, 24-bit, or 16-bit value, a different procedure needs to be selected, and even this fixed-precision format cannot assume that the bits are always positioned in the correct place.

I invite people to look at this sample-code:

http://dirkmittler.homeip.net/text/divide64.cpp

(Update 06/10/2018, 23h30 : )

I needed to correct mistakes which I made in the same piece of code. However, I presently know the code to be correct.

Just to test my premises, I’m going to assume that the following division is to be carried out, erroneously as a simple division, but assuming a word-size of 32 bits:

## 64-bit FORTH

Before I describe a 64-bit FORTH version, I need to explain something about the more-established, general 32-bit version of this low-level language. The 32-bit FORTH had an accepted method of storing 64-bit, so-called ‘double-width’ numbers, in two positions on the stack, with the most-significant word ‘on top’, and the less-significant word in second position from the ‘top’. Correspondingly, ‘normal’ 32-bit FORTH possesses special operators that can either perform full, double-width arithmetic, which treats two consecutive stack-positions as defining a single number, or mixed-width arithmetic, in which two single-width numbers can lead to a double-width product, or by which a double-width number can be divided by a single-width, to arrive at a single-width quotient, and optionally, also to arrive at a single-width modulus / remainder.

This is a fashion in which 32-bit CPUs have generally been able to perform 64-bit arithmetic, partially. And if the reader is not familiar with how this is accessible under FORTH, I can suggest This External Article as a source of reference.

But, if the reader has installed the 64-bit GNU FORTH, which is also just called ‘gforth’ under Linux, then I should call to his attention, that now, each stack-position is capable of holding a 64-bit number, and that all the operators on those numbers are possible, which would otherwise be available for 32-bit numbers, with no special naming.

The following is a small text-session-clip, that illustrates how this works:


dirk@Klystron:~$gforth Gforth 0.7.2, Copyright (C) 1995-2008 Free Software Foundation, Inc. Gforth comes with ABSOLUTELY NO WARRANTY; for details type license' Type bye' to exit 1 cells . 8 ok hex ok$0123456789ABCDEF $100000000 /mod .s <2> 89ABCDEF 1234567 ok$20 lshift + .s <1> 123456789ABCDEF  ok
dup  ok
m* .s <2> -235A1DF76F0D5ADF 14B66DC33F6AC  ok
d. 14B66DC33F6ACDCA5E20890F2A521  ok
bye
dirk@Klystron:~\$




So the ‘/mod’, the ‘lshift’ and the ‘+’ operators are spelled exactly as they would have been for 32-bit FORTH, but operate on potential 64-bit numbers. ‘gforth’ still preserves the double-width operators with the special naming, but in this case, double-width actually means 128-bit ! Its implementation of the standard Fetch operator, which is still named ‘@’, now fetches a 64-bit value from RAM. And I have already documented this slight incompatibility in This Earlier Posting.

If we can assume that our source-code is to be compiled on 64-bit FORTH, we can just perform 64-bit operations on single stack-positions, at will.

It should also be noted that in FORTH comments and stack-traces, the topmost stack-position is written on the right-hand side of the textual list. The apparent negative number above, in the second stack position from the logical top, after ‘ m* .s ‘ , is the result of the most-significant bit of that word being a (1) and not a (0). By convention, in signed integers, this will trigger that a negative number is meant, using two’s complement. And this is still the case, in hexadecimal. But, because this word is the less-significant of the two listed, its most-significant bit will no longer be the most-significant, after it has been combined with the other word, thereby again forming a positive number when printed out as a single 128-bit, signed, integer.

(Edit 07/31/2017 : )

One fact which I have blatantly ignored in my own coding, was that the way in which I chose to separate a single numeric value into two bit-fields – through a modulus-division – is not the most efficient in terms of how many CPU-clock-cycles it consumes. A preferable way to do the same thing, is by using ‘rshift’, and then masking.

The reason for this is the fact that when a CPU is instructed to left-shift or right-shift a binary register-content, doing so takes up about 1 clock-cycle per bit-position shifted. What people may not realize, is that although addition and subtraction can easily be performed in one step by logic-circuits, multiplication and division may not be, assuming a generic, general-purpose CPU. To multiply two 64-bit numbers, actually means to perform 64 additions optionally, each depending on the value of one bit. And to divide a 64-bit number by another, actually means to perform 64 subtractions optionally, each depending on the outcome of a comparison. Maybe for 32-bit or 16-bit registers, we don’t care. But by the time we’re using 64-bit numbers, it penalizes our CPU-load twice as much.

## I question the amount of VRAM on Phoenix.

I am still contemplating, why the server-box I name ‘‘ was crashing, and my attention keeps coming back to the graphics chip. Before this computer was resurrected, it was running in 32-bit mode, as ‘‘. At that time, it only had 2GB of RAM. But now it runs in 64-bit mode, with 4GB of RAM.

When I boot, the BIOS message still tells me that it has 128MB of shared memory, for the graphics chip. But strangely enough, the piece of text I pasted into this posting, reads that the graphics driver has set aside 256MB of VRAM, near the top of the 4GB of physical addresses. I did not know that the kernel can override a BIOS setting in this way, let us say just because processing has been switched to 64-bit mode.

One mishap which could naively go wrong, is that the legacy driver, unaware of the specifics of this motherboard, could be allocating 256MB of shared memory, but that physically, the hardware cannot share past the address ‘‘. That is, the address ‘‘ may have become forbidden territory for the graphics card. It is however uncommon, that the programmers of kernel-space modules, would make such a simple mistake.

This is a 64-bit system, which only accepts up to 4GB of RAM, thus only possessing 32-bit physical addresses, to go with its 64-bit virtual addresses.

According to this screen-shot:

I only have 3.74GB of RAM available to the system, instead of 4GB. The reason for this, is the fact that 256MB have in fact been reserved for the graphics chip. By itself this would seem to suggest, that the allocation has succeeded.

Also, the fact that 49.26MB of shared memory was momentarily being indicated, is not too telling, because several types of processes could be using shared memory for some purpose. This feature does not only exist, for user-space processes to make texture images available to the graphics card.