As an engineer, you’ll likely come across the concepts of data hashing and encryption whenever you’re handling sensitive data. Both hashing and encryption are important to cryptography and are often confused, and choosing the right algorithms for your use case is not necessarily a straightforward decision.
In this article, we'll cover the basic definitions of hashing and encryption, and compare three common hashing algorithms you're likely to come across in your work: Argon2, bcrypt and scrypt. We'll look at their origins, their strengths and weaknesses, and in what circumstances you'd likely use each. Let's dive in.
Encryption is a process that involves scrambling or coding information so that only someone with a key to how you scrambled or coded it can read it. This is key: you only encrypt information when you expect or intend at least certain parties to decrypt it and access it later. It is a two-way, reversible process. In computer science, encrypting is typically done with complex mathematical algorithms.
Similar to encryption, hashing is another mathematical technique used to obfuscate data you want to keep unavailable to other people. But there are two major distinctions that make hashing different:
The conversion from data to hash value is done by an algorithm, and the choices and operations involved are what make each hashing algorithm unique. Every hashing algorithm has its own history and design. Some were created to support additional parameters that you can adjust to meet your security or computational needs. What parameters are available depends on the hashing algorithm you choose to use.
This is why the choice of hashing algorithm is so important: you want to make sure it meets the computational and security needs of your use case.
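As a minimal sketch of what a hash function does, the snippet below uses SHA-256 from Python's standard `hashlib` module. SHA-256 is a fast, general-purpose hash chosen here only for illustration (it is not one of the password hashing algorithms compared in this article), and the helper name `sha256_hex` is hypothetical. It shows that the same input always produces the same hash, that the output length does not depend on the input length, and that a one-character change yields a different hash.

```python
import hashlib


def sha256_hex(data: str) -> str:
    """Return the SHA-256 digest of a string as 64 hexadecimal characters."""
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


short_hash = sha256_hex("hello")
long_hash = sha256_hex("hello" * 1000)   # much longer input
changed_hash = sha256_hex("hellp")       # one character changed

# Both digests have the same length, regardless of input size.
print(len(short_hash), len(long_hash))

# Hashing the same input again gives the same value.
print(short_hash == sha256_hex("hello"))

# A small change to the input gives a different hash.
print(short_hash == changed_hash)
```

Note that nothing in the code offers a way back from `short_hash` to `"hello"`: unlike encryption, there is no key and no decrypt step.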
Password storage is one of the paramount concerns for any application or web service that deals with authentication. Depending on how and where they store passwords, their users' credentials could be exposed to bad actors in the event of a data breach or cyber attack.
Password hashing is a great way apps and online services can protect users' accounts in the case of a cyber attack or data leak. A stored hash has two big advantages over storing a plain text password or even an encrypted password.
Once a hash value is generated, there is no way to derive the original input based on the hash value alone. Hash values usually appear as a random string of characters.
At the same time, companies that only store a password's hash can still verify the identity of a user with their username and password because the same input string will always generate the same hash. So if a user enters their password and it generates the same hash as the one in the application's database, they can safely verify the user without storing their credentials.
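The verification flow described above can be sketched with scrypt, which ships in Python's standard `hashlib` module (when Python is built against a recent OpenSSL). The function names, the 16-byte random salt, and the parameter values below are illustrative assumptions rather than a recommended configuration. The salt is stored alongside the hash so the same computation can be repeated at login.

```python
import hashlib
import hmac
import os

# Illustrative scrypt settings (assumptions for this sketch, not a recommendation).
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Hash a new password with a fresh random salt; return (salt, hash)."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the submitted password with the stored salt and compare."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    # compare_digest runs in constant time to avoid leaking timing information.
    return hmac.compare_digest(candidate, expected)


# At sign-up: store only the salt and the hash, never the password.
stored_salt, stored_hash = hash_password("correct horse battery staple")

# At login: the same password reproduces the stored hash, a wrong one does not.
print(verify_password("correct horse battery staple", stored_salt, stored_hash))
print(verify_password("wrong guess", stored_salt, stored_hash))
```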
Like any cybersecurity measure, password hashing is not completely attack-proof, although it can greatly increase the cost of hacking or account takeovers.
The main way hackers can break a hashing system requires a few steps, each of which can be thwarted by additional security measures:
While we've summed up this process in three neat steps, it is very time- and compute-intensive, which serves as a strong deterrent against most hackers.
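To make the cost of such an attack concrete, here is a toy sketch of one common form of it, a dictionary attack: the attacker hashes a list of likely passwords and looks for matches among leaked hashes. Everything in it is hypothetical (the user names, the word list, and the `fast_hash` helper), and it deliberately uses fast, unsalted SHA-256 to show the weak case. The last lines show why a per-user salt is one of the additional measures that helps: a table computed once no longer matches.

```python
import hashlib
import os


def fast_hash(password: str, salt: bytes = b"") -> str:
    """Fast general-purpose hash (SHA-256), used here to show the weak case."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# Hypothetical leaked table of unsalted password hashes.
leaked = {
    "alice": fast_hash("sunshine"),
    "bob": fast_hash("x9$Lq!v2Tz"),
}

# Hypothetical list of common passwords the attacker tries.
wordlist = ["password", "123456", "sunshine", "qwerty"]

# One precomputed lookup table can be checked against every unsalted hash.
lookup = {fast_hash(word): word for word in wordlist}
cracked = {user: lookup[h] for user, h in leaked.items() if h in lookup}
print(cracked)  # {'alice': 'sunshine'}

# With a random per-user salt, the precomputed table no longer matches,
# so the attacker has to redo the hashing work for each user.
salt = os.urandom(16)
salted = fast_hash("sunshine", salt)
print(salted in lookup)  # False
```

A slow, memory-hard algorithm raises the cost further, because every guess in the word list then takes deliberate time and memory to compute.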
All hashing algorithms involve taking a piece of information called the "input" and a number of other parameters that determine the hash's complexity, computing requirements, and additional security measures.
Note that not all hashing algorithms use all the parameters below. Indeed, the parameters a hashing algorithm uses are a big part of what distinguishes it from the others.
Some examples of parameters include:
With so many options for hashing algorithms, like SHA-1, SHA-256, MD5, Argon2, scrypt, and bcrypt, it's important to understand the differences and choose the right one for your needs. For password protection, Argon2, bcrypt, and scrypt are recommended because their configurable cost parameters (and, for Argon2 and scrypt, memory parameters) let you raise the computational expense of an attack.
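To illustrate what configurable cost and memory parameters look like in practice, the sketch below times scrypt from Python's standard `hashlib` module at a few settings. In scrypt, `n` is the CPU and memory cost, `r` the block size, and `p` the parallelization factor; the working memory is roughly 128 * n * r bytes. The helper names and the specific values are assumptions chosen for illustration, and the timings will vary by machine.

```python
import hashlib
import time


def scrypt_memory_bytes(n: int, r: int) -> int:
    """Approximate working memory scrypt needs: 128 * n * r bytes."""
    return 128 * n * r


def time_scrypt(n: int, r: int = 8, p: int = 1) -> float:
    """Return the seconds taken to compute one scrypt hash at the given cost."""
    start = time.perf_counter()
    hashlib.scrypt(b"example password", salt=b"example salt 16b",
                   n=n, r=r, p=p, dklen=32)
    return time.perf_counter() - start


# Raising n increases both the memory and the time needed per hash,
# for the legitimate server and for an attacker alike.
for n in (2**10, 2**12, 2**14):
    mib = scrypt_memory_bytes(n, 8) / 2**20
    print(f"n={n:>6}  ~{mib:.0f} MiB  {time_scrypt(n):.4f} s")
```

The design trade-off is the one discussed throughout this article: settings high enough to slow down an attacker, but low enough that a legitimate login still completes quickly.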
Argon2 was designed by Alex Biryukov, Daniel Dinu, and Dmitry Khovratovich from the University of Luxembourg. They released their specification paper on Argon2 in 2015 and that same year won the Password Hashing Competition, organized by a global panel of security and cryptographic experts. In their paper, the designers state their motivation for creating Argon2 was "to maximize the cost of password cracking" and that "passwords, despite all their drawbacks, remain the primary form of authentication."
Bcrypt was designed by Niels Provos and David Mazières. They presented their paper "A Future-Adaptable Password Scheme" in 1999 at the USENIX Annual Technical Conference. They based their hashing algorithm on Blowfish, an encryption algorithm created by Bruce Schneier in 1993, to take advantage of its purposefully expensive key setup phase. Provos and Mazières took the concept further by designing bcrypt to have adjustable cost. In their paper, they state that "the computational cost of any secure password scheme must increase as hardware improves."
Scrypt was created by Colin Percival, who presented his paper in 2009 at BSDCan, a BSD (Berkeley Software Distribution) conference. It was originally developed for Tarsnap, an encrypted online backup service for UNIX operating systems, which Percival also created. Scrypt was designed to be a memory-hard algorithm that would be maximally secure against hardware brute-force attacks.
While there are of course deeper nuances to Argon2, bcrypt, and scrypt, the choice between them boils down to weighing computing and time requirements against memory hardness and the number of tunable parameters.
Argon2 is a great memory-hard password hashing algorithm, which makes it good for offline key derivation. But it requires more time, which is less ideal for web applications.
bcrypt can deliver hashing times under one second, but beyond its adjustable cost it does not include parameters for threads or memory hardness.
scrypt (Stytch's personal choice!) is maximally hard against brute-force attacks, but not quite as memory-hard or time-intensive as Argon2.
At Stytch, once we've salted and hashed all passwords using scrypt, we store them in an encrypted database that we manage. This ensures our Passwords solution is secure and built for performance.
If you’re looking for more information on each hashing algorithm, read more about the differences here.