What users expect these days, is that a session with a server is secured via ‘High Encryption’, so that both the user’s password and his data are truly secure. However sometimes, the server may not have an SSL / TLS certificate, or for some other reason, high encryption may not be available. And in such cases, the need has arisen for the client to prove that it knows the password, without a potential eavesdropper being made aware of it, even if the eavesdropper can capture the entire negotiation.
In such cases, really, the only alternative is challenge-response authentication. I suppose that clear-text password authentication can also be selected on some platforms, but obviously, clear-text solutions are always insecure.
The way challenge-response authentication works in general is, that both the client and the server, have the password or a password-equivalent stored, and that the server next sends a small piece of data to the client, which constitutes ‘the challenge’. This piece of data can most-conveniently be derived from the date and time, so that it never repeats itself, but must be chosen by the server. It is not secret. ( :1 )
What the client is then expected to do, is merely to append this challenge to the password, and to hash the result. The same process is applied by the server, but only the client communicates the result back to the server, the latter of which verifies that the result from the client matches, with what the server obtained internally, when the server also appended the same challenge, to the same password, and hashed the results.
There are a few differences in specific implementations. For example, it may be that the actual password is not stored on the server, but that only the hash-code of the password is stored. Well in that case, after prompting the user for the password, the client must also hash it once, to obtain the equivalent of what’s stored on the server, the client must then append the challenge to it, and the client must hash the result a second time.
(Updated 04/01/2018, 14h00 … )
(As of 03/27/2018 : )
Implementing this approach has as key practical issue, that the hashing-algorithm used for challenge-response, may not be the same, as what gets used for general log-ins, on the server. And so typically, the server will need a separate file to store its hashes, from the file that stores the log-in credentials. The only way the server can recover the password, is if a request is processed, to create a new one.
But there is a certain way in which password-hashing in this form, fails to add much security, over what we’d get, if the passwords were just stored in clear-text. In the other situation, where high encryption is available, the client is required to submit the password, so that the server can hash it, and can then compare the result to the stored hash-code. What this would do, is ensure that the client did in fact receive a true password. If the client is never required to submit an actual password, then an arbitrarily altered client-program can also just use a snooped hash-code, to authenticate. I.e., an altered client-program would not need to know the password, only the first hash-code, to append the challenge and then to re-hash the result.
Further, the dangers of having the hash-codes stolen can even pose a problem, if a true password must be submitted to the server instead, because matching hash-codes would imply that, matching passwords should also work. This means that if a hash-code database is ever stolen, it could potentially also result in log-in credentials to some other Web-site or server becoming known to an attacker. And to mitigate the dangers from this, hash-code databases are sometimes ‘salted’.
What ‘salting’ implies, is that instead of the hash-codes of only the passwords being stored on the server, a small piece of data is appended to the passwords, which is unique to the server, and the results of that hashed for the first time. This way, even if an attacker was to have stolen the hash-code, at least the corresponding hash-code on a different server, would still be different, if that account had the same password.
The concept of hash-code salting could be extended to challenge-response authentication, but I don’t see the point. Presumably, the client would be given both the salt, as well as the challenge, openly, so that an honest client would first append the salt to a password that was being typed in, would hash the result, would apply the actual challenge, and would then hash the result a second time… The reason why this provides diminishing returns, in the case of challenge-response authentication, is the fact that again, the client-program could have been altered, so that it starts with the (this time salted, but stolen) password-equivalent, instead of starting with an actual password.
Salting and hashing really only provide true improvements to security, if the client is required to submit the actual password, which is only feasible, if the connection between the client and server is strongly encrypted.
(Update as of 03/29/2018 : )
AFAIK, The main Security Vulnerability this entails, lies in how the challenge is managed in practice. For example:
- It could be a design-decision, just to store the challenge on the sever. The problem with this would be, that this information – actually, the resulting hash – must nevertheless be associated with something. Some programmers have been foolish enough, actually to store this with the user-id. Those programmers may have felt that doing so was secure, but it actually implies that a surreptitious authentication will succeed, all the way until the hash-code is deleted again.
- It could be a design-decision, that the challenge be a compact binary representation of a time-stamp, precise to within 1 second, which is stored with the user-id. But then, the client could be required to pass an up-to-date time-stamp back to the server, which is to verify that the returned time-stamp’s age does not exceed some time-limit. As long as not, the server could use the returned time-stamp to verify the hash. Further, the stored time-stamp could be increased by the server, to the maximum between the returned time-stamp plus one second, and the server’s time, just to make sure that the same challenge is never used again.
- In principle, the previous suggestion would already need to be modified, so that the server-generated times are spun backwards by maybe 5 minutes, just in case the client’s time-base is inaccurate. But alternatively, the scheme could be modified, by applying no time-limit per se, but only requiring that the returned time-stamp not precede the stored one. In that case, the client might be biasing the scheme against his own security, by submitting an authentication which is ahead of the real time by more than a certain amount. Because in that case, a hypothetical eavesdropper may yet succeed at replaying an already-used challenge / resulting hash.
- Along the same lines, the client could be programmed to use the maximum between its own time, and the time-stamp sent by the server, as the actual challenge.
- The issue will eventually ‘come back’, of how to manage multiple sessions from the same user-id, which the first, supposedly insecure method in this list had already taken care of – simply. To deal with that possibility – more securely, the server would need to use a table for each user-id, of all his logged-in sessions, each of which could add the session-id to the hash, as well as its own
server-suggestedtime-stamp. But then, because each user needs to log in before his session is established, the credentials would also need to be stored on the server, of each user-id, not logged in.
(Update 03/30/2018, 22h05 : )
Because this subject is vaguely connected to challenge-response authentication, I’m going to mention it here:
Rather than to store a password hash, the server may be programmed to store an encrypted password. The difference would be, that an encrypted password can be decrypted, so that the actual password is routinely fetched by the server again.
The type of password encryption I’m mentioning is cryptographically weak. It merely prevents an attacker, who may have been able to read the password database, from being able to recognize the password on-sight. But typically, the method of decryption is so simple, that a sophisticated attacker may also know it, and be able to recover actual passwords.
Actually, the method is ‘weak’ for a different reason. If a hacker steals encrypted passwords, he may also be in a position to steal the ‘secret key’, with which the password-database’s user-passwords are encrypted. If he does both, then he will effectively have stolen the passwords!
One way in which risk is mitigated, in the case of Web-applications, is by storing these two pieces of data, in different places. For example, Web-applications driven by PHP-Scripts will have an include, which resides on one machine, and which sets certain variables, such as the key with which the Web-application accesses the ‘MySQL Database’ (which is by now a ‘MariaDB Database’). This include will also set the key with which critical fields such as passwords are to be encrypted, in the database. Therefore, if the database is compromised, which may reside on a different physical machine, it does not automatically follow that the server-scripts have also been compromised.
Even though the method is cryptographically weaker in general, in the case of challenge-response authentication it’s actually stronger, because in this case, a stolen password hash could be used directly by an attacker, while a weakly-encrypted password still could not. ( …?… )
We would have the best of both worlds, if the password was both hashed and encrypted, on the server. The nicety of this approach strikes me, in the possibility of the client still only being required to hash the password which its user types in, applying the challenge, and then hashing the results again. But then, to introduce this to an old platform, such as ‘NTLMv2 Samba File-sharing’, would simply produce incompatibility with the existing implementations, the latter of which expect that the actual password be used as the basis for challenge-response authentication, which still needs to generate identical results between the platform-versions.
A recent development in the news has been, that a certain company which manages a large scale of log-ins, has had a data-breach, in which password-hashes were stolen. The method used to hash the passwords, was ‘bcrypt’. The fact should be remembered that even though this is a strong hashing algorithm, which supports server-salt by the way, ‘bcrypt’ is still only a hashing algorithm. The fact that the company uses it, does not by itself explain, what their authentication protocols are.
What is to be applauded about ‘bcrypt’, is that the program that invokes it can set a Cost-factor / i.e. difficulty-factor, to be as high as desired, which will increase the number of iterations the algorithm uses. What this has the potential to thwart, is a dictionary-attack on stolen hashes, which attackers could run at the speed of their own computers, as each potential password would then also need to be put through the number of iterations the hash was sealed with. Therefore, ‘bcrypt’ will effectively prevent a password from being recovered, when the hash is known.
When it comes to the question of merely wanting to obfuscate a recoverable password, the amount of encryption strength in that, is just as good, as the amount of effort which one application has put into it.
For example, I’ve seen applications which run on our PCs, which just Base-64 encode the password, so that visual inspection won’t derive the true password. But OTOH, 128-bit encryption could just as easily be used on a server. The latter case however requires, that an encryption key be stored on the server, which can never be changed – unless the sysadmins also want to invalidate their whole password-database – but which sophisticated hackers would also know, they need to steal.
A middle ground which some applications might take, is to have a prime-number occur in their source-code, which has a decently-random bit-pattern, and for which there must be a multiplicative inverse, in the modulus of 2^32. The application could compute the multiplicative inverse every time it starts up, rather than to state it in an obviously-visible way, in the same source-code. Then, both the password and its encrypted equivalent could be broken into 4-byte chunks, each of which is a number in the modulus of 2^32 … ( :2 )
( Edit 03/31/2018, 9h15 : )
A variation on this scheme which can also work, is that every packet / transaction exchanged between a server and client can be signed, using a hashing algorithm. This would mean, that each type of transaction can have only one out of several possible fixed formats, and that one implicit field in the transactions, which does not get sent, would contain the password / password-hash. The entire transaction, including the password-equivalent field, gets hashed. Since both server and client know the password equivalent, they can each insert that, after receiving a transaction, before verifying the hash.
(Edit 04/01/2018, 14h00 :
If this feature was adapted to challenge-response authentication, then the part of the scheme which I listed above, which must be omitted, would be ‘For the server to keep advancing the time-stamp, associated with one established session.’ The reason for this would be the fact that many messages could be sent to the server as part of a stream, timed much less than 1 second apart. And then it would be necessary for the client or the server, to be able to put the same time-stamp on all of them. The only additional check which could then be performed on a time-stamp, would be to determine ‘If it’s within ±5 minutes of the current time, as reported by the server.’ And so, the revised method for already-established sessions would become:
- The client starts with either its password-equivalent, or its token, which is not to be communicated, but appends the user-id, the session-id, and the current time-stamp, before computing the hash.
- The resulting hash-code is appended to the transaction.
- The server then verifies that the time-stamp is within ±5 minutes of the server-time, as well as that the session-id exists as belonging to the user-id.
- The server verifies the hash, since it also ‘knows’ either the password-equivalent of the user-id, or the token associated specifically with the session-id.
- A ‘window of opportunity’ now exists, consisting of 5-10 minutes, within which a hypothetical eavesdropper could hijack the session.
This can help secure the many packets sent after authentication, i.e. belonging to an established session.
Because communication on the established Internet is broken apart into packets, which could even be sent along different paths between client and server, the signed part of any transaction should never be longer than 1 packet.
But, since ‘bcrypt’ only allows a password-field of up to 72 characters / bytes, and since many transactions will contain more than 72 bytes, the method of hashing used for a whole transaction needs to be something else, such as maybe SHA-1, SHA-256, etc.. ‘bcrypt’ would only be used, to hash the password itself.
Also, one concept which gets used a lot in modern computing / mobile apps, is the concept of ‘tokens’. Here, the token is used to validate repeated connections from the client, rather than a password equivalent. Tokens are assigned to clients by the server, after the clients have authenticated successfully, using a password equivalent. One problem with using tokens however is, that if sent plain-text, they could still get copied by an unauthorized third party. And my posting mainly covers what can be done about that. In that context, a session-id could get included in the transaction literally, for which the server has a token stored. But the token-field could get left out, so that the token is again inserted by the server or client, when verifying the hash.
The problem that ensues with tokens is, that immediately after establishing a session, the token would still need to be communicated by the server, to the client, securely. And my posting mainly covers the question of how to proceed, if the connection was never secure. According to the same premise, there would be no way for the client, to request a password-change from the server.
An additional question which my line of reasoning poses is, whether ‘insecure hashes’, such as MD5, should still be used in this context. And in order to answer that question, I’d need to elaborate on the ways, in which MD5 is actually known to be insecure:
Nobody actually claims, that an MD5 hash can be reversed easily, into revealing the data that was hashed. Instead, the complaint about MD5 is ‘poor to nonexistent collision resistance’. If it should happen that two different messages result in the same hash-code, this is called a ‘hash-collision’, and they are known to occur.
Hash-collisions become a hindrance to data-security, if a 3rd party can craft a message, which produces a known hash-code. Well, If the challenge-response system was designed in such a way, that no challenge is ever used twice, then no hash-code will be expected to recur. Not even a sophisticated attacker would be able to predict, what the next hash-code is supposed to be, that he sets out to create a fake transaction, to reproduce.
Further, when known methods are used to recreate a known hash-code, doing so usually requires that gibberish be inserted into the message, which may still look correct at its beginning and end, but which will break formatting in the middle of the message. This can be exploited, if there is a region in the message allowed to contain a comment. The comment would be ignored by machines, because it would still be delimited correctly, but could in principle become longer than the rest of the message, and could contain whatever gibberish was meant to reproduce a known hash-code overall.
What this suggests is, that If the transaction has a strict format in terms of fields allowed, and if only certain values in those fields result in behavior useful to the attacker, then to try to reproduce a known hash-code becomes much harder.
Actually, the possibility of hash-collisions is one of the reasons, why the hash-code itself cannot be reversed into revealing the original message – multiple possible messages map to the same hash-code.
Because I’ve recently been studying the subject of (Windows-inspired) SMB File-Sharing, and of how Linux emulates it using their Samba servers, which sometimes fall back to SMB1-protocol, I’ve found the revelation to be disappointing, that SMB1 still uses MD5 as its signing algorithm, with no option for the client to specify a better hashing algorithm.
The problem for me with this is, that because SMB transactions contain large data-fields – corresponding to intended content of files – these variable parts of the transactions will also not conform to a strict format, and do represent some vulnerability to 3rd parties, who could plausibly fill packets with gibberish.
But then again, there is a limit as to how Paranoid I’m likely to get about data passed around on my personal LAN at home, and further, much of my Linux-based software is capable of choosing SMB3 as its protocol. As of SMB2, a stronger hashing algorithm was put into effect, such as ?SHA-256? .
(Edit 03/31/2018, 15h00 : )
On the subject of merely obfuscating a locally-stored password, in user-space:
In this earlier posting, I described how, given a prime number (E) and a modulus (M), a maximum number of iterations (i) less than (E) would be required, to compute its multiplicative inverse (D). The only condition was, that (E) and (M) would need to be coprime. Well if a constant (E) was embedded in source-code, which in turn was to compute (D) in a way not obvious to casual inspection, and if (M) was simply 2^32, then it would follow that the programmers would already known which iteration (i) would yield (D). And so this task could be made much simpler, by only applying the iteration (i) that was known to work, when (E) was known, when the code was written.
Hence, if we were to get ridiculous, we could actually test the premise with (E == 3) ! In this case, only (i == 1) and (i == 2) would need to be tested by the programmers, and I already found that (i == 2) does not yield a fraction. So the code would look something like this:
long int mod = 1L << 32; long int e = 3; long int d = ( mod * 2 + 1 ) / e;
What this code would do, is assign the value to (D), which is:
The main reason we wouldn’t use (3) is the fact that doing so, would only cause single ASCII-characters to be encrypted, most of the time, which is cryptanalytically easy to break. But using something like:
(E == 65719)
Should give more random-looking results:
long int mod = 1L << 32; long int e = 65719; long int d = ( mod * 32729 + 1 ) / e;
And This is the C++ Program, which can be used to determine the numerical constants, to embed in an eventual C program, that will produce the decryption key (D), which is also the multiplicative inverse of (E).