A hash function is any function that can be used to map data of arbitrary size to fixed-size values, called digest. In cryptography, we are interested in cryptographic hash function.
Cryptographic hash function has the following three properties:
- Preimage resistance: No one should be able to reverse the hash function in order to recover the input given an output.
- Second preimage resistance: Given an input and the digest it hashes to, no one should be able to find a different input that hashes to the same digest.
- Collision resistance: No one should be able to produce two different inputs that hash to the same output.
In probability theory, the birthday problem asks for the probability that, in a set of
randomly chosen people, at least two will share a birthday. The birthday paradox is that, counterintuitively, the probability of a shared birthday exceeds 50% in a group of only 23 people.
A birthday attack is a type of cryptographic attack that exploits the mathematics behind the birthday problem in probability theory. This attack can be used to abuse communication between two or more parties. The attack depends on the higher likelihood of collisions found between random attack attempts and a fixed degree of permutations (pigeonholes). With a birthday attack, it is possible to find a collision of a hash function in
being the classical preimage resistance security.
If our hash function generates random outputs of 256 bits, the space of all outputs is of size
. This means that collisions can be found with good probability after generating
digests (due to the birthday bound). This is the number we’re aiming for, and this is why hash functions at a minimum must provide 256-bit outputs.
Suppose that Alice knows some secret, call it
, and this secret can be revealed after a certain date
. She wants to show Bob that she actually knows the secret
but she can't reveal the content of
for now. In this case, a commitment scheme does the following things:
- Alice computes, such as SHA256(M), and sendto Bob.
- After date, Alice sends Bob the secret.
- Bob computesand compares the result with.
Commitments in cryptography generally try to achieve two properties:
- Hiding: A commitment must hide the underlying value.
- Binding: A commitment must hide a single value. In other words, if you commit to a value, you shouldn't be able to later successfully reveal a different value.
Hash function provides hiding and binding if used as a commitment scheme, because:
- Preimage resistence is equivalent to hinding.
- Collision resistence is equivalent to binding.
<script src="https://code.jquery.com/jquery-2.1.4.min.js" integrity="sha256-8WqyJLuWKRBVhxXIL1jBDD7SDxU936oZkCnxQbWwJVw="></script>
Most websites require username and password as authentication method. If a website stores user passwords in plaintext in the database, those passwords may be leaked through web attacks such as SQL injection. Therefore, websites should only store the hash of user passwords. But other problems exist with this solution.
- Problem 1: If an attacker retrieves hashed passwords, a brute force attack or an exhaustive search (trying all possible passwords) can be undertaken. This would test each attempt against the whole database. Ideally, we would want an attacker to only be able to attack one hashed password at a time.
- Problem 2: Hash functions are supposed to be as fast. Attackers can leverage this to brute force (many, many passwords per second). Ideally, we would have a mechanism to slow down such attacks.
The solutions are:
- Solution 1: Append salts to user passwords before hashing. Salts are random values that are public and different for each user. Even weak passwords will become stronger because of the existence of satls.
- Solution 2: Use password hashes, which are designed to be slow. The current state-of-the-art choice for this is Argon2.