# Hash Tables

There is already a lot of work published, that details numerous established ways to implement hash tables. The WiKiPedia article on the broad class of hash-tables, is already a good starting point to become familiar with them.

But just to make this interesting, I undertook the mental exercise of designing a hash table, which when grown, does not require that the contents of the original hash table be rearranged in any way, but in which the expansion is achieved, by simply appending zeroes to the original hash table, and then changing one of the parameters, with which the new, bigger hash table is accessed.

My exercise led me to the premise, that each size of my hash table be a power of two, and it seems reasonable to start with 2^16 entries, aka 65536 entries, giving the hash table an initial Power Level (L1) of 16. There would be a pointer I name P to a hash-table position, and two basic Operations to arrive at P:

1. The key can be hashed, by multiplying by the highest prime number smaller than 2^L1 – This would be the prime number 65521 – within the modulus of 2^L1, and yielding Position P0.
2. The current Position can have added the number A, which is intentionally some power of two lower than L1, within the modulus of 2^L, successively yielding P1, P2, P3, etc…

Because the original multiplier is a prime number lower than 2^L1, and the latter, the initial modulus of the hash table, a power of two, consecutive key-values will only lead to a repetition in P0 after 2^L key-values. But, because actual keys are essentially random, a repetition of P0 is possible by chance, and the resolution of the resulting hash-table collision is the main subject in the design of any hash table.

Default-Size Operation

By default, each position of the hash table contains a memory address, which is either NULL, meaning that the position is empty, or which points to an openly-addressable data-structure, from which the exact key can be retrieved again. When this retrieved exact key does not match the exact key used to perform the lookup, then Operation (2) above needs to be performed on P, and the lookup-attempt repeated.

But, because A is a power of 2 which also fits inside the modulus 2^L, we know that the values of P will repeat themselves exactly, and how small A is, will determine how many lookup-attempts can be undertaken, before P repeats, and this will also determine how many entries can be found at any one bucket. Effectively, if A = 2^14, then 2^L / A = 2^2, so that a series of 4 positions would form the maximum bucket-size. If none of those entries is NULL, the bucket is full, and if an attempt is made to insert a new entry to a full bucket, the hash-table must be expanded.

Finding the value for a key will predictably require, that all the positions following from one value of P0 be read, even if some of them were NULL, until an iteration of P reveals the original key.

It is a trivial fact that eventually, some of the positions in this series will have been written to by other buckets, because keys will be random again, thus leading to (their own values of P) which coincide with (current nA + P0) . But, because the exact key will be retrieved from non-NULL addresses before those are taken to have arisen from the key being searched for, those positions will merely reflect a performance-loss, not an accuracy-loss.

Because it is only legal, for 1 key to lead to a maximum of 1 value, as is the case with all hash tables, before an existing key can be set to a new value, the old key must be found and be deleted. In order for this type of hash table to enforce that policy, its own operation to insert a key would need to be preceded by an operation to retrieve it, which can either return a Boolean True value if the key exists, plus the position at which its address is located, or a Boolean False value if it does not exist, plus the position at which the first NULL address is located, at which a following operation could insert one, in a single step. I would be asking two operations to take place atomically, because this can pose further issues with multi-threaded access to the hash table.

(Edit 06/22/2017 : The premise here, in the case of a multi-threaded application, is that it should be possible to lock a global object, before the retrieval-function is called, as long as it is not already locked, but that if a thread gives the instruction to lock it when it already is, then that thread is put to sleep, until the other thread, which originally locked the object, unlocks it again.

I believe that Such an object is also known as a ‘Mutex’.

In my thought-exercise, this Mutex-object not being locked, should also be a signal to the retrieval-function, that the position returned, in case the key is not found, does not need to be of significance, specifically for the situation that the bucket could be full, in which case there should be no valid pointer to a NULL-address. )

Any defined operation to delete a key would follow according to the same logic – It must be found first, and if not found, cause the appropriate error-code to be returned. Any hash table with more than one entry for the same key is corrupted, but can conceivably be cleaned, if the program that uses it sends multiple commands to delete the same key, or one command to purge it…

Growing the Hash Table

(Last Updated 06/28/2017 … )

Because A was a power of 2 that fits inside 2^L1, growing the hash table once, to a size of 2^L2, where L2 = L1 + 1 , will also result in A fitting inside 2^L2, and additionally, repeating whatever values of P it yielded, within 2^L1. The number of entries any one bucket can hold, and the maximum number of lookup-attempts that will be needed to find a key, will increase to 2^L / A ,  which will next become 8 where it was 4, given the parameters I just suggested above.

The advantage to making A large, and thus the initial number of entries possible within each bucket small, would lie in the simultaneous weakness of my idea, in that the number of entries belonging to any one bucket will increase exponentially after the table has been grown several times.

But my idea is also that Operation (1) above continues to generate positions P0 that belong to the same bucket, that followed from the same key as before, because the prime-number will not change. It is an open-ended question purely from the Mathematical standpoint, whether the modulus of Operation (1) above actually needs to change. If the modulus does not change and the original hash-table was densely-populated, then it will be more probable that the first entries within the bucket will be full, and this could lead to a penalty when inserting new entries.

If the modulus of Operation (1) above was updated to 2^L instead of 2^L1, then the insertion of new entries would spread out over the new size better, which would eventually become necessary, but finding old entries would be penalized more. The time needed to find entries in general, will double, every time the hash-table has been grown in this way. This would happen because 2^L2 is divisible by 2^L1, so that half the time, P0 would end up in the part of 2^L2 which did not originally belong to 2^L1. But P0 would then still be correctly-positioned within 2^L2, so that some multiple of A can be added to it, and result in its original position within 2^L1, that position now being P4 out of P0 – P7 , given the parameters I suggested above.

(Edit 06/23/2017 :

If we were to make it the responsibility of the key-retrieval function to grow the hash-table, in case a key was not found, but the position of a NULL-address is supposed to be returned, into which a new key can be inserted, then:

• Multiple threads could simply be retrieving keys, while only one thread can be setting them or growing the hash table. In this case the next issue becomes, that naive retrieval operations should not try to grow the table, but instead return an invalid address-position, at which the address was supposed to be, but is not NULL, on the premise that no operation will follow to insert a key.
• Care would need to be taken, not to lock the operation to grow the table, using the same Mutex-object as the one that the retrieval function may already have been executed under, as part of a ‘set key’ sequence. Doing so would cause a deadlock, for which reason the operation to grow the table would need to lock its own Mutex-object.
• Therefore, within the operation to grow the table, an non-imperative attempt should be made to lock the Mutex-object used when setting a key, i.e. a ‘try-lock’, and this may only be done before the operation’s own Mutex-object has been locked. If the attempt to lock the parent’s Mutex-object succeeds, the current retrieval function was not part of a ‘set key’ sequence, the parent’s Mutex-object must be unlocked again, and the operation to grow the table must exit with a minor error-code, without attempting to grow the table.
• Multiple simultaneous commands to grow the table should be executed unfortunately, even when issued, for which reason the attempt to lock the Mutex-object belonging to this operation, should probably be an imperative one, i.e. a blocking ‘lock’.
• Within this scheme, it would be possible to give an externally-originated command to grow the table, at an arbitrary point in time.
• After such a Mutex-lock has been granted, such an operation would need to verify that it still wants the hash table grown, since the table might have been grown, since the Mutex-lock was requested. The easiest way in which this can be done, is if this operation was given a parameter, which states to what power level the table should be grown, and which can be compared to the table’s current power level.

The reader might be questioning, why I am putting so much detail, into the design of my hypothetical key retrieval-function – in the event that they key is not found and would be inserted. The reason is my own realization of the fact, that my idea still has as main weakness, that the hash table could have been grown numerous times, and that the ratio of (2^L / A) could become high. For example, if the original Power Level (L1) was merely 16, and the initial table therefore had 65536 positions, but if it had been grown to L = 20 , allowing it to have 1M positions, this would also mean that (2^L5 / A) = 2^6, meaning that each bucket would have a capacity for 64 entries ! This would also reflect the maximum number of tries needed then, before a key could be retrieved once, even though P0 was known !

Clearly, it would good to avoid having to traverse this virtual bucket more than once, if doing so can be avoided, let’s say once to determine whether the key already has a defined value, and then once again if it does not, to find an available position to insert it.

But it would be even better in such a scenario, if a Human Being operating the hash table recognized that his table could grow to L = 20 , and for that person to set an initial Power level much higher, such as L1 = 18 or even L1 = 20 , so that the size of one virtual bucket could be kept as small as 16 or 4 . )

Now, there could be some people challenged by the thought, of how a key that could be 32 characters of text long, or consist of up to 32 bytes, can be multiplied by a single prime-number and thus hashed – to fit within a 32-bit modulus.

One simpler way is to establish that we have an accumulator-register with the required number of bits, and that is initialized to zero.

For every byte in such a sequence:

2. Multiply the accumulator by the prime-number, resulting in two output-registers of (n) bits, if the accumulator had (n) bits – a low output-register and a high output-register.
3. Keep the low output-register as the new accumulator-value.

A certain number of least-significant bits will also define the final, real modulus.

I should stress how important it is, that the so-called ‘Golden Prime’ used be less than the modulus, of 2^L1. The reason for this is, that if we multiply any key by a larger prime, only the L1 least-significant bits of this prime will actually contribute to the values of the L1 least-significant bits, of the modulus. This is simply due to how the carry-operations in a multiplication work: Right-to-Left and not Left-to-Right.

Hence, if L1 was really 16, and we tried to use as our golden prime, 2^16 + 1 = 65537, then the least-significant bits of that prime would be equal to ‘0x0001‘. And the least-significant bits of the position P0 would also, only have been multiplied by ‘0x0001‘.

(1) Doesn’t work too well as a golden multiplier. And if the LSBs were random, by default, they would just not be prime, if the actual prime had been greater than or equal to the modulus.

By same token, if we were frequently being fed 32-bit key-values whose last 16 bits were zeroes, or anything else that repeats itself, then my first example would fail.

In that case, it would be possible to add a context-specific obfuscation to Operation (1), such as to take the 16 MSBs of the key, and multiply those by the golden prime, resulting in a 32-bit product, to take the 16 LSBs from the same key, and add those to the value just generated from the MSBs, and to multiply the sum by the same golden prime again. As long as the obfuscation does not break the compatibility of the number output, with the scaled modulus-possibilities for L2, L3, L4

A sequence of 16-bit words should only be hashed in this way, if L1 >= 16 . Hence, if the programmer wanted to create a smaller hash table, with a power level of 15, 14, 13 or 12, he also needs to hash smaller individual message-words, such as bytes.

Also, the use of (-15) in 2’s complement may not look optimal when L = 16 – honestly without knowing why – But the fact should not be forgotten, that my purpose was to allow expanding L multiple times, without changing the prime, and that then, 65521 will continue to perform, when L = 17, 18, 19 …

(Edit 06/23/2017 : )

Another Variation, This Type of Hash Table

Very hypothetically, a similar scheme can be devised, in which Operations (1) and (2) are changed into a different pair of ‘hashing functions':

1. Take the key, and do not change it – i.e. multiply it by 1 – in order to arrive at P0.
2. Make A a large prime-number, to add to Pn, to arrive at Pn+1.

This would result in the type of mapping, that could conceivably be used to map virtual addresses to locations in the swap partition. Because, an operating system kernel might not store an explicit mapping in that direction. Instead, the amount of memory the kernel uses to manage the swap partition, is proportional to the size of the swap partition, and states at any moment, which virtual address each real block of the swap partition points back to.

Because now, Operation (1) does not really hash the value, this assures that the sequence of positions in the swap partition will be linear, if the sequence of virtual addresses was linear. This is very important for accessing a real swap-partition, on a traditional hard drive.

But, because Operation (2) would now be applying a prime number, we’d no longer have any guarantee, of how many visits to the hash table are required, before the matching position in the swap partition has been found. Pn could come back to such a hash-table 100 times, or 1000 times, or 10000 times. And so some limit would then need to be set elsewhere, before the kernel faults. That could be the total number of blocks in the swap partition.

We also do not usually see, that the total number of blocks in the swap partition would be a prime number. (…)

(Edit 06/24/2017 : )

I think that in order for this idea to succeed, the kernel needs to recognize, that a contiguous sequence of virtual memory pages is to be swapped out, so that a multiple of A can be found, according to which the entire sequence finds unallocated blocks in the swap partition.

Failing that, the kernel should ‘remember’ which multiple of A led to the greatest number of consecutive virtual pages finding unallocated blocks in the swap partition, starting with the given page, if any, and break the sequence apart.

After that, when a request is processed to swap one page back in, because code is often written to access / read / write addresses sequentially, but because the user-space code is also put to sleep, as soon as it hits one virtual page which is not mapped, when the multiple of A is found that leads to the matching block in the swap file, it could be the policy of the kernel, also to swap back in all the subsequent blocks, whose virtual addresses were contiguous with that one’s.

The main problem I find with my own idea, seems to lie in finding the most-appropriate value for A, given any real swap partition size. One possible solution might be to store a list of prime numbers in the kernel code, and then, when the size of the swap partition is known,

• Select each prime number in ascending order, and square it, until a prime number is found, the square of which is greater than the size of the swap partition in blocks,
• Try to divide the size of the swap partition by A. If it is divisible, choose the next prime number up.
• Dividing the size of the swap partition by A and ignoring the remainder, may also reveal the most suitable limit, of how many times A can be added to the virtual address page number, to find available blocks in the swap partition, for a complete, contiguous sequence. Doing so would then almost circumnavigate the swap partition. For swapping out page(s) even if no free position of Pn was found within this first limit, it might actually be best to set as the second limit, the size of the swap partition, but nevertheless to ‘remember’ how many consecutive free blocks followed from the first satisfyingly-unallocated one, still corresponding to Pn, but assuming a higher value of (n).
• Alternatively, the fact could be observed in advance by the System Software programmer, that after the first limit has been exhausted, as mentioned above, of how many times to add A to the virtual address page number, and therefore, the assumed swap partition has been circumnavigated once, a fixed second limit is only likely to reveal values of Pn, that lead to a partial allocation (of a contiguous sequence of virtual pages, to an assumed swap partition). And so the second limit, of how many times A should be applied, and therefore, of how many additional times the assumed swap partition should be circumnavigated, could be a fixed multiple of what the first limit was (such as once more). And then, the best result could be used.

(Edit 06/26/2017 :

In theory, any prime number would work for A, as long as the size of the swap partition is not divisible by it, if it remained our assumption that only exact positions Pn are required. But because every time, an entire range of virtual addresses is to be swapped out, if the prime chosen as A was even close to 100% or 50% the size of the swap partition, this would mean that after every application of A, or after every second application of A, Pn might seem to be in-use, due to the same previous allocation. And this would lead to trouble…

Some people might argue, that A need not be prime, but need just be the rational square root of the size of the swap partition. The same people might be assuming, that if the kernel was tasked to swap out a contiguous sequence of pages, it should not try to break apart that sequence. But, if the kernel had as an ability to break apart a contiguous sequence to be swapped out, then the worst that could happen would be that (Pn+0) be unavailable, even to swap out a single page.

If A is indeed prime, the risk of that one scenario happening is lower, and each time after at least one page has been swapped out, the attempt can continue with the following page – at first, to swap out what remains of the entire sequence, contiguously.

For the sake of academia it might suffice, to design a kernel on the assumption that the swap partition should only be circumnavigated once, and that if doing so fails to produce the required result, the kernel already signal an Out-Of-Memory condition. In that case, multiple additions of A to the virtual starting-address could still be used, but the best value for A would in fact be, the rational square root, of the size of the swap partition in blocks. )

While it is true that the management of virtual memory involves the existence of a Page-Global Directory (the PGD), which is a tree-like organization of tables that map from the virtual address, to physical memory, to the best of my knowledge the maximum amount of information it will yield about any virtual address-page, is the page-frame it has when mapped to physical memory, plus a lengthy set of flags. It only denotes that a page of virtual memory has been swapped out, with a single flag, that tells the kernel that the bits which normally state the page-frame, are invalid.

On a 32-bit system, if those same bits were reused to state what block of the swap partition the virtual page was swapped to (If the corresponding flag was set), Then the maximum size of the swap partition should also equal the maximum size of physical memory: 4GB. But I have installed many 32-bit Linux systems, that had 2GB of RAM, plus a 10GB swap partition, easily. Mind you, those 32-bit systems were still limited to having 32-bit virtual addresses, so that even according to my own theory, the whole swap partition could never have been addressed.

I think that it might encumber a real O/S, if blocks belonging to the swap partition needed to be allocated, before virtual memory could be swapped out.

In general, the swap partition only starts to lose effectiveness, after it has been made twice as large as the total amount of physical RAM. And if we do this, the only further effect is to waste some hard drive space. Also, in general, Linux systems are capable of swapping out a larger amount of memory, than Windows systems are.

This type of realization also helped me to recognize, that the Translation Lookaside Buffer (the TLB) is more than just an optimization. Without a TLB, a computer would constantly need to refer to the PGD, in order to translate virtual user-space addresses into physical addresses. And while the system can be set up rather easily, by which ‘the top 1GB of virtual addresses’ map to virtual kernel-space addresses, in which the PGD also resides, user-space needs to be mapped in more-complex ways, which would eventually require that every attempt to access a virtual user-space address be intercepted somehow.

The TLB buffers the page-frames of the most-recently used virtual addresses, and only attempts to access ones not in the TLB, are in fact intercepted with a page-fault. This page-fault causes the kernel to look up the virtual addresses in question in the PGD, from where the real, physical page-frames are loaded into the TLB – After those virtual addresses are swapped back in, from the swap partition, if need be.

(Edit 06/25/2017 : )

I suppose that another variation of this idea which might work well, would be that when a sequence of virtual addresses is swapped out, whatever allocation operation would have taken place in the swap partition, be replaced by the hashing function, as I suggested, but that once a position in the swap partition has been found, which the range of addresses can be swapped out to, the block-numbers within the swap partition used can in fact be saved, in the part of the entries of the Page Global Directory – the PGD – which would otherwise have been used, to store the page-frame of the virtual pages in RAM – in real memory.

That way, the allocation within the swap partition has been simplified, but the task of swapping individual pages back in, would also have been accelerated, since then, successive multiples of A do not need to be tried, to find those pages.

Oh yes. While this computer had a 32-bit version of “Kanotix” as its O/S, I had given it the name ‘Thunderbox’. But in more recent years, I resurrected this computer with a 64-bit version of the same O/S, that version being named “Spitfire” by the Kanotix developers, and in the same process I renamed this computer ‘Phoenix’.

It still has its 10GB swap partition, but now also has 4GB RAM. At least this way, because 64-bit virtual addresses can be much higher than 32-bit ones, there is a theoretical possibility of those virtual addresses wrapping around the swap partition size, and the full swap partition being used.

Yet, the practical reality with Linux computers is also, that once they have 4GB RAM or more, they rarely make use of their full swap capacity. Mine is currently using 650MB of its 10GB swap capacity. Further, under Linux, once the computers have 12-16GB RAM, they typically stop using swap at all – unless they are told to hibernate.

The advantage to me, of having allocated a 10GB swap partition to ‘Thunderbox’, was that when I resurrected the computer as ‘Phoenix’, I did not need to create a new swap partition in the process.

(Edit 06/28/2017 : )

I suppose that one observation to make, about the practical design of O/S kernels, reflects a key difference, to my suggested approach here. While it is possible for the kernel to compute all sorts of constants at boot-up, and even to access a (short) list of prime numbers written into its source-code, in order to arrive at the optimal hashing constant, the concept of what hashing constant is actually optimal is an approximate one. Thus, if I was to say that ‘The prime number just greater than the square root of the size of the swap partition – in blocks – is optimal,’ it would be just as well to say that ‘The prime number just smaller than the square root of the size of the swap partition is optimal.’ And in fact, some number that deviates from this square root more significantly, might work just as well.

Only, If the concept is that this added number is to make its way around a modulus more than once, it suddenly becomes important that it be coprime with the modulus. And the easiest way to ensure that is to make it prime, and to ensure that the modulus is not divisible by it.

But I think that in the practical design of kernels, programmers have frequently used one statically-defined constant, for their hashing functions, which the kernel does not need to compute at boot-up. I know that this works well for 32-bit systems, but am not sure how programmers followed up, with their design of 64-bit systems, where (the square root of (2^32 / 2^12)) no longer means anything.

Dirk