A Small Touch-Up to one of my Projects

One of the things which I did in the past was, to write a C++, Qt5 application, which was meant as a test, of how to incorporate certain GUI elements into a program, those elements extending as far as, how to force the program to display with my own desktop theme, even on target computers that do not have this theme set or installed.

I was so focused on that aspect of the project, that I had overlooked a simpler aspect of it: What would happen if the user decided to resize the window. The initial window size is 850×720, and resizing, so far, was ugly. But, according to tonight’s update, the resizing behaviour has been made much nicer. (:1)

The relevant files can be found in the following directory on my site:

https://dirkmittler.homeip.net/binaries/

The files are the ones that begin with ‘Creator_Test6...‘ .

There is also a compiled AppImage that will run on Linuxes with at least ‘CXXABI_1.3.9′ (equivalent to Debian 9 / Stretch), but no binary to run under Windows. Sorry for that last omission. However, readers who have the Qt SDK installed, should be able to compile under Windows as well. What needs to be done, when Qt projects are being recompiled with another computer’s SDK, is, that the file ‘Creator_Test6.pro.user‘, typically, needs to be deleted, as it contains details specific to one version of the SDK. Because of this, the Project Configuration – i.e., the Toolkits that are to become compile targets – will then also need to be redefined by any interested power-user.


 

There’s a small observation to add about this software project. The concept that it should display in its own desktop theme, only works fully, when running the AppImage under Linux. That is because I linked the Theme Engines and Styles in, using the command-line tool ‘linuxdeployqt‘. Those were binary plug-ins, that I do not even possess the source-code for. Hence, if the reader custom-compiles the same project, they will find that those plug-ins have not been compiled, and that for this reason, the app will display with the local Style at best.

In reality, there are two separate settings:

  1. QIcon::setThemeSearchPaths(":/res/icons");
  2. QApplication::setStyle("Breeze");

Out of those, the first should even work if the reader custom-compiles, but the second should not.

 

(Updated 8/02/2021, 7h45… )

Continue reading A Small Touch-Up to one of my Projects

Creating a C++ Hash-Table quickly, that has a QString as its Key-Type.

This posting will assume that the reader has a rough idea, what a hash table is. In case this is not so, this WiKiPedia Article explains that as a generality. Just by reading that article, especially if the reader is of a slightly older generation like me, he or she could get the impression that, putting a hash-table into a practical C++ program is arduous and complex. In reality, the most efficient implementation possible for a hash table, requires some attention to such ideas as, whether the hashes will generally tend to be coprime with the modulus of the hash table, or at least, have the lowest GCDs possible. Often, bucket-arrays with sizes that are powers of (2), and hashes that contain higher prime factors help. But, when writing practical code in C++, one will find that the “Standard Template Library” (‘STL’) already provides one, that has been implemented by experts. It can be put into almost any C++ program, as an ‘unordered_map’. (:1)

But there is a caveat. Any valid data-type meant to act as a key needs to have a hashing function defined, as a template specialization, and not all data-types have one by far. And of course, this hashing function should be as close to unique as possible, for unique key-values, while never changing, if there could be more than one possible representation of the same key-value. Conveniently, C-strings, which are denoted in C or C++ as the old-fashioned ‘char *’ data-type, happen to have this template specialization. But, because C-strings are a very old type of data-structure, and, because the reader may be in the process of writing code that uses the Qt GUI library, the reader may want to use ‘QString’ as his key-data-type, only to find, that a template specialization has not been provided…

For such cases, C++ allows the programmer to define his own hashing function, for QStrings if so chosen. In fact, Qt even allows a quick-and-dirty way to do it, via the ‘qHash()’ function. But there is something I do not like about ‘qHash()’ and its relatives. They tend to produce 32-bit hash-codes! While this does not directly cause an error – only the least-significant 32 bits within a modern 64-bit data-field will contain non-zero values – I think it’s very weak to be using 32-bit hash-codes anyway.

Further, I read somewhere that the latest versions of Qt – as of 5.14? – do, in fact, define such a specialization, so that the programmer does not need to worry about it anymore. But, IF the programmer is still using an older version of Qt, like me, THEN he or she might want to define their own 64-bit hash-function. And so, the following is the result which I arrived at this afternoon. I’ve tested that it will not generate any warnings when compiled under Qt 5.7.1, and that a trivial ‘main()’ function that uses it, will only generate warnings, about the fact that the trivial ‘main()’ function also received the standard ‘argc’ and ‘argv’ parameters, but made no use of them. Otherwise, the resulting executable also produced no messages at run-time, while having put one key-value pair into a sample hash table… (:2)

 

/*  File 'Hash_QStr.h'
 * 
 *  Regular use of a QString in an unordered_map doesn't
 * work, because in earlier versons of Qt, there was no
 * std::hash<QString>() specialization.
 * Thus, one could be included everywhere the unordered_map
 * is to be used...
 */

#ifndef HASH_QSTR_H
#define HASH_QSTR_H


#include <unordered_map>
#include <QString>
//#include <QHash>
#include <QChar>

#define PRIME 5351


/*
namespace cust {
	template<typename T>
	struct hash32 {};

  template<> struct hash32<QString> {
    std::size_t operator()(const QString& s) const noexcept {
      return (size_t) qHash(s);
    }
  };
}

*/


namespace cust
{
	inline size_t QStr_Orther(const QChar & mychar) noexcept {
		return ((mychar.row() << 8) | mychar.cell());
	}
	
	template<typename T>
	struct hash64 {};
	
    template<> struct hash64<QString>
    {
        size_t operator()(const QString& s) const noexcept
        {
            size_t hash = 0;

            for (int i = 0; i < s.size(); i++)
				hash = (hash << 4) + hash + QStr_Orther(s.at(i));

            return hash * PRIME;
        }
    };
}


#endif  //  HASH_QSTR_H

 

(Updated 4/12/2021, 8h00… )

Continue reading Creating a C++ Hash-Table quickly, that has a QString as its Key-Type.

Why C++ compilers use name-mangling.

A concept which exists in C++ is, that the application programmer can simply define more than one function, which will seem to have the same names in his or her source code, but which will differ, either just because they have different parameter-types, or, because they are member functions of a class, i.e., ‘Methods’ of that class. This can be done again, for each declared class. In the first case, it’s a common technique called ‘function overloading’. And, if the methods of a derived class replace those of a base-class, then it’s called ‘function overriding‘.

What people might forget when programming with object-oriented semantics is, that all the function definitions still result in subroutines when compiled, which in turn reside in address-ranges of RAM, dedicated for various types of code, either in ‘the code segment of a process’, or in ‘the addresses which shared libraries will be loaded to’. This differs from the actual member variables of each class-object, also known as its properties, as well as for entries that the object might have, for virtual methods. Those will reside in ‘the data-segment of the process’, if the object was allocated with ‘new’. Each method would be incapable of performing its task if, in addition to the declared parameters, it did not receive an invisible parameter, that will be its ‘this’ pointer, which will allow it to access the properties of one object. And such a hidden ‘this’ pointer is also needed by any constructors.

Alternatively, properties of an object can reside on the stack, and therefore, in ‘the stack segment of the process’, if they were just declared to exist as local variables of a function-call. And, if an array of objects was declared, let’s say mistakenly, and not, of pointers to those objects, then each entry in the array will, again, need to have a size determined at compile-time, for which reason such objects will not be polymorphic. I.e., in these two cases, any ‘virtuality’ of the functions is discarded, and only the declared class of the object will be considered, for resolving function-calls. Such an object ends up ‘statically bound’, in an environment which really supports ‘dynamically bound’ method-invocation.

First of all, when programming in C, it is not allowed to overload functions by the same name like that. According to C, a function by one name can only be defined once, as receiving the types in one parameter-list. And the only real exception to this is in the existence of ‘variadic functions,’ which are beyond the scope of this one posting. (:1)

Further, C++ functions that have the same name, are not (typically) an example of variadic functions.

This limitation ‘makes sense’, because the compiler of either language still needs to generate one subroutine, which is the machine-language version of what the function in the source-code defined. It will have a fixed expectation, of what parameter list it was fed, even in the case of ‘variadic functions’. I think that what happens with variadic functions is, that the machine-language code will search its parameter list on the stack, for whatever it finds, at run-time. They tend to be declared with an ellipsis, in other words with ‘…’, for the additional parameters, after the entries for any fixed parameters.

So, the way in which C++ resolves this problem is, that it “mangles” the names of the functions in the source code, deterministically, but, with a system that takes into account, which parameter types they receive, and which class they may belong to, if any. The following is an example of C++ source code that demonstrates this. I have created 3 versions of the function ‘MyFunc()’, each of which only has as defined behaviour, to return the exact data which they received as input. Obviously, this would be useless in a real program.

But what I did next was to compile this code into a shared library, and then to use the (Linux) utility ‘nm’, to list the symbols which ended up being defined in the shared library…

Source Code:

 

/*  Sample_Source.cpp
 * 
 * This snippet is designed to illustrate a capability which C++ has,
 * but which requires name-mangling...
 * 
 */

#include <cmath>
#include <complex>

 /*  If this were a regular C program, then we'd include...
  *
#include <math.h>
#include <complex.h>
  *
  */

using std::complex;

typedef complex<double> CC;

class HasMethods {
public:
	HasMethods() { }
	~HasMethods() { }
	
	CC MyFunc(CC input);
};

//  According to the given headers, there are at least 3 functions
// that I could define below. First, two free functions, aka
// global functions...

double MyFunc(double input) {
	return input;
}

CC MyFunc(CC input) {
	return input;
}

//  Next, the member function of HasMethods can be defined, aka
// the supposed main 'Method' of a HasMethods object...

CC HasMethods::MyFunc(CC input) {
	return input;
}

 

(Updated 4/12/2021, 21h30… )

Continue reading Why C++ compilers use name-mangling.

Using Factory Functions to Export C++ Objects from Shared Libraries / DLLs, but also applying Factory Class Design Pattern.

(Edited 3/15/2021, 22h45: )

The main reason for which factory functions are sometimes used is the fact that, while object-oriented code can in fact be compiled into shared libraries, situations exist in which the application programmer is not aware of specific libraries that will need to be loaded at run-time. As a result, this developer cannot specify them during the linking stage of his or her application.

(End of Edit, 3/15/2021, 22h45.)

What one does in practical cases is, to define “factory functions” in the shared library, such as the function “maker()” in the example below, and to give the compiler directive ‘extern “C”‘ when doing so:

https://www.linuxjournal.com/article/3687

I think that the example I’ve just linked to suffers from some seriously bad typesetting. But to my mind, it gets the point across. The methods that belong to C++ classes have mangled names, which are difficult to load from shared libraries directly (using the ‘dlsym()’ function). In principle, one could try to predict the name mangling in the client program, but in practice, this is avoided. Instead, a factory function is exported from the shared library to the client program, the name of which is explicitly not mangled, due to this compiler directive, and what that function does when called is, to create an object of the specified type. The client program’s C++ ‘knows’ what methods of the object it can call because of header files…

When I was studying System Software at Concordia University, an exercise the whole class was required to carry out was, to define a function that was declared with ‘extern “C”‘, to store that in a shared library, and, to write a client program which loaded that function. We were never required to load C++ objects from that shared library. But, the article which I just linked to above, explains how to do that, at least well enough so that I can follow it.

(…)

If you’re one of those people like me, who have seen C++ with many ‘Create…()’ function-names, even though we know that in general, in C++, the constructors of class-objects have the same name as the class, what is written above is likely the reason, for which so many Creator functions have been defined.

I think, though, that there is one way in which I must second-guess the author of the article I just linked to. He ended up using the ‘extern “C”‘ directive in more places than he needed to. If it was something his project set out to do from the beginning, for the source, shared libraries to register their factory functions in an array of function-pointers automatically, then there is really no longer any reason, why their names should not be mangled. In fact, in certain cases conflicts could result, as soon as two functions have the same name, such as “proxy()”. And so, in such cases, there is really only one object in the whole process, the name of which must not be mangled, and in the case cited above, that would be the associative array.

Another fact which the cited article does not mention is simply, that it might result in tedious code, if the factory function was defined separately, for every class of objects that a shared library can export. It might simplify things, if a template Creator class could be defined, which has a factory method as one of its methods, which can be applied to a series of classes that have default constructors… The following article describes a slightly different use:

https://refactoring.guru/design-patterns/factory-method/cpp/example

Of course, the cited function ‘SomeOperation()’ could not be used…

 

//  The following lines should probably go into some header file.

#include <map>
#include <string>
#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>

using std::string;

//  If the library existed, either IMPORT or WINDLL would be defined...

typedef std::map<string, void *> function_array;

function_array *reg_factories_p = NULL;

#ifdef WIN32
#define DLL_EXPORT extern "C" __declspec(dllexport)
#define DLL_IMPORT extern "C" __declspec(dllimport)
#else
#define DLL_EXPORT extern "C"
#define DLL_IMPORT extern "C"
#endif

#ifdef WINDLL
DLL_EXPORT function_array reg_factories;
function_array reg_factories;
#else
#ifndef IMPORT
function_array reg_factories;
#endif
#endif

//  What comes below this line, goes into the shared library / DLL.

class Obj_1 {
public:
	Obj_1() {}
};

class Obj_2 {
public:
	Obj_2() {}
};

class Obj_3 {
public:
	Obj_3() {}
};

 
//  Template definition (can't be compiled).
 
template<class T>
class Creator {
public:

	typedef T * (Creator<T>::*method)() const;
	method m_ptr;
	
    Creator(string object_name) {
		m_ptr = &Creator<T>::FactoryMethod;
#ifndef IMPORT
		if (! reg_factories_p) {
			reg_factories_p = &reg_factories;
		}
		reg_factories[object_name] = (void *) this->m_ptr;
#endif
	}
 
    T *FactoryMethod() const {
        return new T;
    }
 
};
 
 
//  Template instantiations (can be compiled).

Creator<Obj_1>  mk1("Object1");
Creator<Obj_2>  mk2("Object2");
Creator<Obj_3>  mk3("Object3");

//  Static objects will be constructed, if the shared library
//  has been opened with 'RTLD_NOW'.
//  This will cause the constructor within 'Creator' to be
//  called, and therefore, the associative array to be populated.


//  Code below this line is meant to go into the client program...

//  The following template class will serve to cast void pointers back
//  to Maker objects, each with a method that makes the object...

#ifndef WINDLL

#define LIB_NAME "./libfunnylib.so"

template<class T>
class Maker {

public:
	typedef T * (*obj_maker) ();
	
	obj_maker m_fptr;

	Maker(void *fptr) {
		m_fptr = reinterpret_cast<obj_maker> (fptr);
	}
	
	T * Make() {
		return (*m_fptr) ();
	}
	
};


int main(int argc, char* argv[]) {

	void *hndl = dlopen(LIB_NAME, RTLD_NOW);
	if(hndl == NULL) {
		printf("%s\n\n", dlerror());
	} else {
		void *symbol = dlsym(hndl, "reg_factories");
		if (symbol) {
			reg_factories_p = (function_array *) symbol;
		} else {
			printf("%s Did not export symbol reg_factories.\n\n", LIB_NAME);
			return -1;
		}
	}
	
    char *buffer;
    size_t bufsize = 32;

	printf("The purpose here is, to find out whether\n");
	printf("I was able to store non-trivial adresses in the\n");
	printf("associative array.\n");
	printf("\n");
	printf("Object1 factory address: %p\n", (*reg_factories_p)["Object1"]);
	printf("Object2 factory address: %p\n", (*reg_factories_p)["Object2"]);
	printf("Object3 factory address: %p\n", (*reg_factories_p)["Object3"]);
	printf("\n");
	printf("Now attempting to execute those factories...\n");
	
	//  Exploiting the compiler's willingness to create temporary
	//  objects, invoked directly by their class-name...
	try {
		Maker<Obj_1>((*reg_factories_p)["Object1"]).Make();
		Maker<Obj_2>((*reg_factories_p)["Object2"]).Make();
		Maker<Obj_3>((*reg_factories_p)["Object3"]).Make();
	} catch (...) {
		printf("Some type of error took place!\n");
		return -1;
	}
	
	printf("All pointer casts executed.\n\n");
	
	printf("Press Enter to quit program.\n");
	
    buffer = (char *) malloc(bufsize * sizeof(char));
    if( buffer == NULL)
    {
        perror("Unable to allocate buffer");
        exit(-1);
    }
	
	getline(&buffer,&bufsize,stdin);
	

	//  We're done. Cleaning up.
	
	free(buffer);
	
	return 0;
}
#endif

 

One fact which must be kept in mind, if this code is ever to be used ‘in the real world’, is, that The associative array should be declared as ‘extern “C”‘ in a header file, but additionally allocated wherever it’s going to be used.  This results in two similar-looking C++ statements, when building the shared library.

Also, compiling this code will generate one warning per factory method, unless, under ‘g++’, the flag ‘-Wno-pmf-conversions‘ is set. Additionally, on Linux computers, linking requires that the flag ‘-ldl‘ be set.

Now, an astute reader will ask, ‘Mainstream code can be compiled without requiring special flags. Why does this guy’s code require that special flags be set?‘ And the answer to that question is, ‘Because this code cheats. The warning which up-to-date compilers will generate – at best – exists, because on some platforms, a pointer-to-method cannot be cast to a pointer-to free function, in the form of a single address, and always work. Therefore, this practice is actually shunned.’

At the end of this posting, I will show compatible code, that does not cheat…

(Updated 3/22/2021, 3h55… )

Continue reading Using Factory Functions to Export C++ Objects from Shared Libraries / DLLs, but also applying Factory Class Design Pattern.