Hashing and HMAC

What is hashing?
Hashing takes an input of any size (such as a piece of text) and produces a fixed-length output that looks nothing like the input. The output is called the hash value, checksum, digest, or simply the hash.

Why do we need hashing?
Instead of storing a password in plain text or encrypting it, we store its hash. A hash is one-way: it cannot be reversed to recover the original input. Hashing is also used to check the integrity of files after they have been sent across the internet.

Note : Integrity is the property that the file a recipient receives is identical to the file at the source, meaning it has not been modified or corrupted.

How is it done?
Hashing is done using hash generators. Let's learn more about them.

Hash generators and their purpose
Give a hash generator an input (text, an image, a file, or software) and it returns the hash value for that input. The hash value changes completely even if the input changes very slightly. By comparing hashes, we can tell whether data has been tampered with. This is how hashing provides the integrity property.
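To make the "very small change" point concrete, here is a minimal sketch using Python's standard hashlib module. The helper name sha256_hex and the second input string are illustrative choices, not part of any tool mentioned in this article.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a text string as hexadecimal."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

original = sha256_hex("Hello world!")
changed = sha256_hex("Hello world?")  # only the last character differs

# Count the hex positions at which the two digests differ.
differing = sum(a != b for a, b in zip(original, changed))

print(original)
print(changed)
print(f"{differing} of {len(original)} hex characters differ")
```

Running it should show two digests that have almost nothing in common, even though the inputs differ by a single character.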
Before downloading any software, note the hash value published in its technical section. After the download finishes, compute the hash of the downloaded file and compare the two. If the hash values do not match, the downloaded file has been modified or corrupted. On Cisco routers, the stored hash of a password differs from the plain hash of that password, because additional characters (a salt) are added to the input before the hash is generated. This is done to make the stored hash more secure.
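The download check described above can be sketched in a few lines of Python. This assumes the publisher lists a SHA-256 value; the function names file_sha256 and matches_published_hash are hypothetical helpers made up for this illustration.

```python
import hashlib

def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read until handle.read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare the computed hash with the value listed on the download page."""
    # Published hashes are often shown in upper case or with stray spaces.
    return file_sha256(path) == published_hex.strip().lower()
```

A call such as matches_published_hash("installer.bin", "<hash from the website>") would return False for a corrupted or modified download; both arguments here are placeholders.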

Hash algorithms differ, among other things, in the length of the hash they produce. A few of them are:

  • MD5 : length is 128 bits / 32 characters (hexadecimal values) - Old and popular, but not recommended where security matters
  • SHA1 : length is 160 bits / 40 characters (hexadecimal values) - Not recommended
  • SHA2 : a family of algorithms; SHA256 gives 256 bits / 64 characters and SHA512 gives 512 bits / 128 characters (hexadecimal values) - Safe to use (secure)
  • SHA3 : a newer family with a different internal design, available in output lengths from 224 to 512 bits (for example SHA3-256 and SHA3-512) - Safe to use (secure)

Note : MD5 and SHA1 are both broken for security purposes, with MD5 the weaker of the two. SHA2 and SHA3 are both considered secure; SHA3 is the newer design.

Let's look at the hash value of "Hello world!" generated by MD5, SHA1, and two members of the SHA2 family (SHA256 and SHA512).

Image 1: MD5 hash generator.


Image 2: SHA1 hash generator.


Image 3: SHA256 hash generator.


Image 4: SHA512 hash generator.
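The same four digests can also be produced without an online generator. The sketch below uses Python's standard hashlib module and prints the length of each digest alongside its value; the variable names are illustrative. It assumes the input is exactly the text Hello world! encoded as UTF-8, with no trailing newline.

```python
import hashlib

message = "Hello world!".encode("utf-8")

lengths = {}  # algorithm name -> number of hex characters in its digest
for name in ("md5", "sha1", "sha256", "sha512"):
    digest = hashlib.new(name, message)
    hex_value = digest.hexdigest()
    lengths[name] = len(hex_value)
    # digest_size is in bytes; each byte is 8 bits and 2 hex characters.
    print(f"{name:7} {digest.digest_size * 8:4} bits {len(hex_value):4} hex chars  {hex_value}")
```

Each hexadecimal character encodes 4 bits, which is why a 128-bit MD5 digest is 32 characters long and a 512-bit SHA512 digest is 128 characters long.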

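Problems with MD5
Two different inputs can produce the same hash value; this is called a hash collision. Because a digest has a fixed length while the possible inputs are unlimited, collisions must exist for every hash function, and a shorter digest makes them easier to find. MD5 additionally has design weaknesses that let an attacker construct colliding inputs deliberately and cheaply, which is why it is considered broken. Let us look at an example of a hash collision.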

Hash collision in MD5
In this example we can see two different images that produce the same hash under the MD5 algorithm.

Image 5: Hash collision.

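SHA1 : In 2017, researchers from Google and CWI Amsterdam produced two different PDF files with the same SHA1 hash, a practical SHA1 collision. Hence SHA1 is not recommended.
SHA2 and SHA3 : No practical collisions are known for these so far, so they are considered safe compared to MD5 and SHA1.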
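To get a feel for how digest length relates to collisions, the sketch below builds a deliberately tiny "toy" hash by keeping only the first 4 hexadecimal characters (16 bits) of SHA256, then searches for two inputs that share it. This is not how the real MD5 or SHA1 collisions were found (those exploit weaknesses in the algorithms themselves); it only illustrates that short digests collide quickly. The function names are made up for this example.

```python
import hashlib

def truncated_hash(data: bytes, hex_chars: int = 4) -> str:
    """First hex_chars characters of SHA-256: a deliberately tiny toy hash."""
    return hashlib.sha256(data).hexdigest()[:hex_chars]

def find_collision(hex_chars: int = 4):
    """Try inputs b"0", b"1", ... until two of them share the same toy hash."""
    seen = {}  # toy hash -> first input that produced it
    counter = 0
    while True:
        candidate = str(counter).encode("ascii")
        tag = truncated_hash(candidate, hex_chars)
        if tag in seen:
            return seen[tag], candidate, tag
        seen[tag] = candidate
        counter += 1

first, second, shared = find_collision()
print(f"{first!r} and {second!r} both hash to {shared}")
```

With only 65,536 possible toy hashes, the search is guaranteed to end within 65,537 attempts and typically ends after a few hundred. Every extra bit of digest length makes this kind of brute-force search harder, which is one reason longer digests are preferred.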

Possibility of attacks
If the hash is not verified and the downloaded file turns out to have been tampered with, it may carry ransomware or install a keylogger that records keystrokes. The same applies to pictures: if the hash is not checked, an image may carry a malicious script hidden with steganography, to be executed later.

Drawback
If a file is sent over the internet together with its hash value, an intruder in the middle can tamper with the data, compute a new hash for the altered data, and forward both to the recipient. The recipient cannot tell that the data has been tampered with, because the hash they compute matches the hash they received. We can overcome this drawback by using HMAC.
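The drawback can be reproduced in a short sketch. The messages and the function name receiver_accepts are invented for illustration; the point is that the intruder needs no secret to produce a matching hash.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def receiver_accepts(message: bytes, received_hash: str) -> bool:
    """The receiver's only check: does the hash match the message?"""
    return sha256_hex(message) == received_hash

# The sender transmits the message together with its hash.
sent_message = b"Pay 100 to Alice"
sent_hash = sha256_hex(sent_message)

# The intruder replaces the message AND recomputes the hash.
forged_message = b"Pay 900 to Mallory"
forged_hash = sha256_hex(forged_message)

print(receiver_accepts(sent_message, sent_hash))      # True: genuine pair
print(receiver_accepts(forged_message, forged_hash))  # True: forged pair passes too
print(receiver_accepts(forged_message, sent_hash))    # False: only a mismatched pair fails
```

A plain hash therefore detects accidental corruption, but not a deliberate change by someone who can also replace the hash.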


HMAC - Hash-based Message Authentication Code

Keyed-hash message authentication code, also called hash-based message authentication code (HMAC), is obtained by running a hash algorithm such as MD5, SHA1, SHA2 or SHA3 over the data combined with a shared secret key.
Why HMAC - A plain hash function gives integrity only. HMAC adds the secret key, so it provides authentication along with integrity: only someone who knows the key can produce a valid code.
Where is it used - In the protocols behind secure file transfer and web traffic, such as TLS (which secures FTPS and HTTPS) and SSH (which secures SFTP).

Process
First, the secret key is shared between the two parties. The sender then computes the HMAC of the message using that key and sends the message together with the HMAC. The receiver recomputes the HMAC with the same key and accepts the message only if the two values match. Let us look at the HMAC value of "Hello world!" using the shared key "Network geek" with the algorithms that we learnt.

Image 6: MD5 HMAC generator.


Image 7: SHA1 HMAC generator.


Image 8: SHA256 HMAC generator.


Image 9: SHA512 HMAC generator.
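The process above can be sketched with Python's standard hmac module, using the same message and key as the generators. SHA256 is picked here only as an example, the function names make_tag and verify are illustrative, and the intruder's guessed key is invented for the demonstration.

```python
import hashlib
import hmac

key = "Network geek".encode("utf-8")   # shared secret from the example above
message = "Hello world!".encode("utf-8")

def make_tag(key: bytes, message: bytes) -> str:
    """Sender side: compute the HMAC-SHA256 of the message as hexadecimal."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key: bytes, message: bytes, received_tag: str) -> bool:
    """Receiver side: recompute the HMAC and compare in constant time."""
    expected = make_tag(key, message)
    return hmac.compare_digest(expected, received_tag)

tag = make_tag(key, message)
print(tag)
print(verify(key, message, tag))                 # True: untouched message

# An intruder can change the message, but without the key
# cannot produce a tag that the receiver will accept.
forged_message = b"Hello world?"
forged_tag = make_tag(b"intruder guess", forged_message)
print(verify(key, forged_message, tag))          # False: message was altered
print(verify(key, forged_message, forged_tag))   # False: tag made with the wrong key
```

hmac.compare_digest is used instead of == so that the comparison time does not reveal how many leading characters of a guessed tag were correct.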