Understanding the long long data-type with C++

One of the conventions which I was taught when first learning C++, was that, when declaring an integer variable, the designations were to be used, such as:

 


short int a = 0;  // A 16-bit integer
int b = 0;  // A 32-bit integer
long int c = 0;  // Another, 32-bit integer

 

But, since those ancient days, many developments have come to computers, that include 64-bit CPUs. And so the way the plain-looking declarations now work, is:

 


short int a = 0;  // A 16-bit integer
int b = 0;  // A 32-bit integer
long int c = 0;  // Usually, either a 32-bit or a 64-bit, depending on the CPU.
long long d = 0;  // Almost always, a 64-bit integer

 

The fact is not so surprising, that even on 32-bit CPUs, modern C++ runtimes will support 64-bit values partially, because support for longer fields, than the CPU registers, can be achieved, using subroutines that treat longer numbers similarly to how in Elementary School, we were taught to perform multiplication, long division, etc., except that where people were once taught how to perform these operations in Base-10 digits, the subroutine can break longer fields into 32-bit words.

In order to avoid any such confusion, the code is sometimes written in C or C++, like so:

 


uint16_t a = 0;
uint32_t b = 0;
uint64_t c = 0;

 

The ability to put these variables into the source-code does not pose as much of a problem, as the need which sometimes exists, to state literals, which can be assigned to these variables. Hence, a developer on a 64-bit machine might be tempted to put something like this:

 


uint64_t c = 1L << 32;

 

Which literally means, ‘Compute a constant expression, in which a variable of type (long) is left-shifted 32 bits.’ The problem here would be, that if compiled on a 32-bit platform, the literal ‘1L’ just stands for a 32-bit long integer, and usually, some type of compiler warning will ensue, about the fact that the constant expression is being left-shifted, beyond the size of the word in question, which means that the value ‘0’ would result.

If we are to write source-code that can be compiled equally-well on 32-bit and 64-bit machines, we really need to put:

 


uint64_t c = 1ULL << 32;

 

So that the literal ‘1ULL’ will start out, as having a compatible word-size. Hence, the following source-code will become plausible, in that at least it will always compile:

 

#include <cstdlib>
#include <iostream>

using std::cout;
using std::endl;

int main(int argc, char* argv[]) {

	long int c = 1234567890UL;

	if (sizeof(long int) > 7) {
		c = (long int) 1234567890123456ULL;
	}

	cout << c << endl;

	return 0;
}

 

The problem that needed to be mitigated, was not so much whether the CPU has 32-bit or 64-bit registers at run-time, but rather, that the source-code needs to compile either way.

Dirk

 

Three types of Constants that exist in C++

In a previous posting, I explained that I am able to write C++ programs, that test some aspect of how the CPU performs simplistic computations, even though C++ is a high-level language, which its compiler tries to optimize, before generating a machine-language program, the latter of which is called the run-time.

As I pointed out in that posting, there exists a danger that the computation may not take place the way in which the code reads to plain inspection, more specifically, in the fact that certain computations will be performed at compile-time, instead of at run-time. In reality, both the compile-time and the run-time computations as such still take place on the CPU, but in addition, a compiler can parse and then examine larger pieces of code, while a CPU needs to produce output, usually just based on the contents of a few registers – in that case, based on the content of up to two input-registers.

I feel that in contrast with the example which I wrote will work, I should actually provide an example, which will not work as expected. The following example is based on the fact that in C++, there exist three basic types of constants:

  1. Literals,
  2. Declared constants,
  3. Constant Expressions.

Constant expressions are any expressions, the parameters of which are never variables, always operators, functions, or other constants. While they can increase the efficiency with which the program works, the fact about constant expressions which needs to be remembered, is that by definition, the compiler will compute their values at compile-time, so that the run-time can make use of them. The example below is an example in which this happens, and which exists as an antithesis of the earlier example:

 


// This code exemplifies how certain types of
// computations will be performed by the compiler,
// and not by the compiled program at run-time.

// It uses 3 types of constants that exist in C++ .

#include <cstdio>

int main () {
	
	// A variable, initialized with a literal
	double x = 0;
	
	// A declared constant
	const double Pi = 3.141592653589;
	
	// A constant expression, being assigned to the variable
	x = ( Pi / 6 );
	
	printf("This is approximately Pi / 6 : %1.12f\n", x);
	
	return 0;
}


 


dirk@Plato:~/Programs$ ./simp_const
This is approximately Pi / 6 : 0.523598775598
dirk@Plato:~/Programs$


 

If this was practical code, there would be no problem with it, purely because it outputs the correct result. But if this example was used to test anything about how the run-time behaves, its innocent Math will suffer from one main problem:

When finally assigning a computed value to the double-precision floating-point variable ‘x’ , the value on the right-hand side of the assignment operator, the ‘=’ , is computed by the compiler, before a machine-language program has been generated, because all of its terms are either literals or declared constants.

This program really only tests, how well the ‘printf()’ function is able to print a double-precision value, when instructed exactly in what format to do so. That value is contained in the machine-code of the run-time, and not computed by it.

BTW, There exists a type of object in C, called a Compound Literal, which is written as an Initializer List, preceded by a data-type, as if to type-cast that initializer list to that data-type. But, because all manner of syntax that involves initializer lists is distinctly C and not C++ , I’ve never given much attention to compound literals.

(Edit 11/13/2017 : )

In other words, if the code above contained a term of the form

( 1 / ( 1 / Pi ))

Then chances are high, that because this also involves a constant expression, the compiler may simplify it to

( Pi )

Without giving the programmer any opportunity, to test the ability of the CPU, to compute any reciprocals.

Dirk