## Understanding the long long data-type with C++

One of the conventions which I was taught when first learning C++, was that, when declaring an integer variable, the designations were to be used, such as:


short int a = 0;  // A 16-bit integer
int b = 0;  // A 32-bit integer
long int c = 0;  // Another, 32-bit integer



But, since those ancient days, many developments have come to computers, that include 64-bit CPUs. And so the way the plain-looking declarations now work, is:


short int a = 0;  // A 16-bit integer
int b = 0;  // A 32-bit integer
long int c = 0;  // Usually, either a 32-bit or a 64-bit, depending on the CPU.
long long d = 0;  // Almost always, a 64-bit integer



The fact is not so surprising, that even on 32-bit CPUs, modern C++ runtimes will support 64-bit values partially, because support for longer fields, than the CPU registers, can be achieved, using subroutines that treat longer numbers similarly to how in Elementary School, we were taught to perform multiplication, long division, etc., except that where people were once taught how to perform these operations in Base-10 digits, the subroutine can break longer fields into 32-bit words.

In order to avoid any such confusion, the code is sometimes written in C or C++, like so:


uint16_t a = 0;
uint32_t b = 0;
uint64_t c = 0;



The ability to put these variables into the source-code does not pose as much of a problem, as the need which sometimes exists, to state literals, which can be assigned to these variables. Hence, a developer on a 64-bit machine might be tempted to put something like this:


uint64_t c = 1L << 32;



Which literally means, ‘Compute a constant expression, in which a variable of type (long) is left-shifted 32 bits.’ The problem here would be, that if compiled on a 32-bit platform, the literal ‘1L’ just stands for a 32-bit long integer, and usually, some type of compiler warning will ensue, about the fact that the constant expression is being left-shifted, beyond the size of the word in question, which means that the value ‘0’ would result.

If we are to write source-code that can be compiled equally-well on 32-bit and 64-bit machines, we really need to put:


uint64_t c = 1ULL << 32;



So that the literal ‘1ULL’ will start out, as having a compatible word-size. Hence, the following source-code will become plausible, in that at least it will always compile:

#include <cstdlib>
#include <iostream>

using std::cout;
using std::endl;

int main(int argc, char* argv[]) {

long int c = 1234567890UL;

if (sizeof(long int) > 7) {
c = (long int) 1234567890123456ULL;
}

cout << c << endl;

return 0;
}



The problem that needed to be mitigated, was not so much whether the CPU has 32-bit or 64-bit registers at run-time, but rather, that the source-code needs to compile either way.

Dirk

## I’ve just tweaked my kernel memory a little.

According to This previous posting of mine, I had set aside 270MiB of RAM, just in case the kernel needed it, on the computer I name ‘Phoenix’, to keep track of user-space programs, watching files and directories. I had also taken A Quick Glance at the way 64-bit kernel, virtual memory is organized under Linux, differently from how 32-bit virtual addresses were. Given the additional fact, that I only want to set aside 128MiB of user-space RAM, To cache my blog’s Web-pages, I decided that the earlier amount of kernel-memory was too large.

And so now I’ve reduced the amount, for ‘INotify’, to a mere 135MiB.

This computer has 4GiB total RAM.

Elaboration:

On 32-bit systems, the entire address space only had 4GB, but under no circumstances was the kernel expected to take up more than 1 GB. The computer ‘Phoenix’ is a 64-bit system, but one which only has 4GB of physical RAM. Therefore, even though it’s a 64-bit system, its kernel should still not take up more than 1GB. Yet, if I preallocate 270MB, just to perform one task, I’m pushing this system’s kernel closer to using up 1GB for real. So just for this one task, I’ve reduced this one commitment to 135MB.

Dirk

## Some Suggested Code

In This Earlier Posting, I had written at first some observations about Bluetooth-pairing, but then branched out on the subject, of whether a Diffie-Hellman Key Exchange could be easier to compute, if it was somehow simplified into using a 32-bit modulus. Obviously, my assumption was that a 64-bit by 32-bit divide instruction would be cheap on the CPU, while arbitrary-precision integer operations are relatively expensive, and actually cause some observable lag on CPUs which I’ve used.

And so, because I don’t only want to present theory in a form that some people may not be able to visualize, what I did next was to write a C++ program, that actually only uses C, that assumes the user only has a 32-bit CPU, and yet that performs a 64-bit by 64-bit division.

This has now been tested and verified.

One problem in writing this code is the fact that, depending on whether the divisor, which is formatted as a 64-bit field, contains an actual 64-bit, 32-bit, 24-bit, or 16-bit value, a different procedure needs to be selected, and even this fixed-precision format cannot assume that the bits are always positioned in the correct place.

I invite people to look at this sample-code:

http://dirkmittler.homeip.net/text/divide64.cpp

(Update 06/10/2018, 23h30 : )

I needed to correct mistakes which I made in the same piece of code. However, I presently know the code to be correct.

Just to test my premises, I’m going to assume that the following division is to be carried out, erroneously as a simple division, but assuming a word-size of 32 bits:

## 64-bit FORTH

Before I describe a 64-bit FORTH version, I need to explain something about the more-established, general 32-bit version of this low-level language. The 32-bit FORTH had an accepted method of storing 64-bit, so-called ‘double-width’ numbers, in two positions on the stack, with the most-significant word ‘on top’, and the less-significant word in second position from the ‘top’. Correspondingly, ‘normal’ 32-bit FORTH possesses special operators that can either perform full, double-width arithmetic, which treats two consecutive stack-positions as defining a single number, or mixed-width arithmetic, in which two single-width numbers can lead to a double-width product, or by which a double-width number can be divided by a single-width, to arrive at a single-width quotient, and optionally, also to arrive at a single-width modulus / remainder.

This is a fashion in which 32-bit CPUs have generally been able to perform 64-bit arithmetic, partially. And if the reader is not familiar with how this is accessible under FORTH, I can suggest This External Article as a source of reference.

But, if the reader has installed the 64-bit GNU FORTH, which is also just called ‘gforth’ under Linux, then I should call to his attention, that now, each stack-position is capable of holding a 64-bit number, and that all the operators on those numbers are possible, which would otherwise be available for 32-bit numbers, with no special naming.

The following is a small text-session-clip, that illustrates how this works:


dirk@Klystron:~$gforth Gforth 0.7.2, Copyright (C) 1995-2008 Free Software Foundation, Inc. Gforth comes with ABSOLUTELY NO WARRANTY; for details type license' Type bye' to exit 1 cells . 8 ok hex ok$0123456789ABCDEF $100000000 /mod .s <2> 89ABCDEF 1234567 ok$20 lshift + .s <1> 123456789ABCDEF  ok
dup  ok
m* .s <2> -235A1DF76F0D5ADF 14B66DC33F6AC  ok
d. 14B66DC33F6ACDCA5E20890F2A521  ok
bye
dirk@Klystron:~\$




So the ‘/mod’, the ‘lshift’ and the ‘+’ operators are spelled exactly as they would have been for 32-bit FORTH, but operate on potential 64-bit numbers. ‘gforth’ still preserves the double-width operators with the special naming, but in this case, double-width actually means 128-bit ! Its implementation of the standard Fetch operator, which is still named ‘@’, now fetches a 64-bit value from RAM. And I have already documented this slight incompatibility in This Earlier Posting.

If we can assume that our source-code is to be compiled on 64-bit FORTH, we can just perform 64-bit operations on single stack-positions, at will.

It should also be noted that in FORTH comments and stack-traces, the topmost stack-position is written on the right-hand side of the textual list. The apparent negative number above, in the second stack position from the logical top, after ‘ m* .s ‘ , is the result of the most-significant bit of that word being a (1) and not a (0). By convention, in signed integers, this will trigger that a negative number is meant, using two’s complement. And this is still the case, in hexadecimal. But, because this word is the less-significant of the two listed, its most-significant bit will no longer be the most-significant, after it has been combined with the other word, thereby again forming a positive number when printed out as a single 128-bit, signed, integer.

(Edit 07/31/2017 : )

One fact which I have blatantly ignored in my own coding, was that the way in which I chose to separate a single numeric value into two bit-fields – through a modulus-division – is not the most efficient in terms of how many CPU-clock-cycles it consumes. A preferable way to do the same thing, is by using ‘rshift’, and then masking.

The reason for this is the fact that when a CPU is instructed to left-shift or right-shift a binary register-content, doing so takes up about 1 clock-cycle per bit-position shifted. What people may not realize, is that although addition and subtraction can easily be performed in one step by logic-circuits, multiplication and division may not be, assuming a generic, general-purpose CPU. To multiply two 64-bit numbers, actually means to perform 64 additions optionally, each depending on the value of one bit. And to divide a 64-bit number by another, actually means to perform 64 subtractions optionally, each depending on the outcome of a comparison. Maybe for 32-bit or 16-bit registers, we don’t care. But by the time we’re using 64-bit numbers, it penalizes our CPU-load twice as much.