In recent years, advances in authentication technology have enabled developers to enhance the cybersecurity of their app or website without overburdening their users or disrupting the user experience. One of the most powerful innovations in this regard is biometrics, or tech-forward tools that can verify a user’s identity using only their physical or behavioral characteristics.
While fingerprint scans and facial recognition are the most popular forms of biometrics, emerging methods like voice recognition remain on the fringe.
Below, we explore how voice authentication works, what its benefits are, and why it hasn’t seen the same level of adoption as other biometric methods.
Biometrics is considered something-you-are authentication, because it relies on attributes that are inherent to a user’s body or behavior.
This sets biometrics apart from something-you-know auth factors (like passwords), which require users to create and remember complex credentials, and something-you-have auth factors (like one-time passcodes or security keys), which require users to prove possession of a registered phone number, email inbox, or physical device.
Instead, biometric factors use built-in verification tools to capture, analyze, and recognize a user’s distinctive, measurable features. There are many different types of biometric authentication, with flows that vary according to the specific feature involved. Some of the most common include:
Fingerprint recognition tools scan and map the unique ridges and valleys of a user’s fingerprint when they press their finger to their laptop or mobile device.
Facial recognition tools scan and map the proportions and contours of a user’s face and translate them into a unique numerical code known as a “faceprint.”
Iris and retina recognition tools scan and analyze the distinctive color markings (iris) or the distinctive blood vessel patterns (retina) in a user’s eye.
Voice recognition tools capture and analyze the sound qualities and other unique properties of a user’s voice, which are determined by their particular jaw movements, throat shape, and behavioral or linguistic inflections.
Many biometric auth solutions are equipped with liveness detection capabilities, so they can distinguish between a real, live user and a mere reproduction or copy — like a photographic image or voice recording — in order to detect and prevent fraud.
Let’s dive deeper into what voice recognition technology does and how it works.
Essentially, voice recognition tools allow users to access, activate, and/or interact with digital platforms simply by speaking to them. They’ve become well-known in recent years, with the release of smartphone software (like Siri) and virtual assistants (like Amazon’s Alexa and Echo devices or Google’s Home or Nest systems).
However, there’s a big difference between voice recognition technology and the speech recognition technology many of these voice-controlled systems rely on. Speech recognition works out what is being said, regardless of who says it. Voice recognition works out who is speaking.
Biometric authentication flows rely on voice recognition technology to verify that the person speaking is the legitimate, registered user they claim to be.
Voice recognition tools must be programmed or trained to identify a specific user’s voice.
This typically involves taking one or more speech samples and creating a unique digital template, or “voiceprint,” similar to the fingerprints and faceprints used in other biometrics. This voiceprint is stored within the system and compared against any sample obtained during a login attempt.
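To make the enroll-then-compare flow concrete, here is a minimal sketch in Python. It assumes each speech sample has already been turned into a fixed-length list of numbers by some feature extractor (not shown), and it uses cosine similarity with an arbitrary 0.9 threshold as the comparison step. The function names, the averaging approach, and the threshold are all illustrative assumptions, not a description of how any particular product builds or matches voiceprints.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def enroll(samples):
    """Average several feature vectors into one stored template (the voiceprint)."""
    if not samples:
        raise ValueError("at least one sample is required")
    length = len(samples[0])
    if any(len(s) != length for s in samples):
        raise ValueError("all samples must have the same length")
    return [sum(s[i] for s in samples) / len(samples) for i in range(length)]


def verify(voiceprint, sample, threshold=0.9):
    """Accept the login sample only if it is close enough to the stored voiceprint."""
    return cosine_similarity(voiceprint, sample) >= threshold


# Hypothetical feature vectors, made up for illustration only.
stored = enroll([[1.0, 3.0], [3.0, 5.0]])
print(verify(stored, [2.1, 3.9]))   # a sample close to the template
print(verify(stored, [5.0, -1.0]))  # a sample far from the template
```

In practice, the feature vectors would come from a trained speaker model, and the threshold would be tuned against real data rather than picked by hand.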
A voiceprint takes into account physical/physiological as well as behavioral/inflectional attributes, such as a user’s:
Some voice recognition systems require a user to input a specific, pre-determined phrase or sentence — which they must then repeat with every login — whereas others allow a user to say whatever they please.
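The difference between these two modes can be sketched as a small acceptance rule. This is a simplified illustration under stated assumptions: `voice_score` stands in for whatever similarity score the voice recognition system produces, `transcript` stands in for the output of a separate speech-to-text step, and the `VoiceCheck` type, the `accept_login` function, and the 0.9 threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VoiceCheck:
    voice_score: float  # similarity between the sample and the stored voiceprint (0 to 1)
    transcript: str     # what was said, per a hypothetical speech-to-text step


def normalize(phrase: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't fail the match."""
    return " ".join(phrase.lower().split())


def accept_login(
    check: VoiceCheck,
    threshold: float = 0.9,
    enrolled_phrase: Optional[str] = None,
) -> bool:
    """Apply a fixed-phrase rule when a phrase is enrolled, a free-speech rule otherwise."""
    voice_ok = check.voice_score >= threshold
    if enrolled_phrase is None:
        # Free-speech mode: only the voice itself has to match.
        return voice_ok
    # Fixed-phrase mode: the voice and the spoken words both have to match.
    return voice_ok and normalize(check.transcript) == normalize(enrolled_phrase)
```

With a fixed phrase, a matching voice saying the wrong words is rejected. Without one, the same sample would be accepted on the voice score alone.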
The precision of voice recognition technology has improved significantly in recent years, and it continues to make steady progress. As with many authentication methods, voice biometrics is not foolproof, but accuracy rates average above 95% and frequently reach 99%. Compare that to passwords, only 80% of which can be considered even remotely secure.
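An accuracy figure like the ones quoted above depends on where a system sets its match threshold. One common way to look at this is to count two kinds of mistakes: impostors who are wrongly accepted and legitimate users who are wrongly rejected. The sketch below computes both rates for a given threshold. The `error_rates` function is a hypothetical helper, and the scores are made-up numbers for illustration, not measurements from any real system.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_accept_rate, false_reject_rate) at the given threshold.

    genuine_scores: match scores from the legitimate user's own attempts.
    impostor_scores: match scores from other people's attempts.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    return (
        false_accepts / len(impostor_scores),
        false_rejects / len(genuine_scores),
    )


# Made-up scores, for illustration only.
genuine = [0.95, 0.92, 0.85, 0.97]
impostor = [0.30, 0.55, 0.91, 0.40]

far, frr = error_rates(genuine, impostor, threshold=0.9)
print(far, frr)
```

Raising the threshold lets fewer impostors through but rejects more legitimate users, and lowering it does the reverse, which is why a single accuracy percentage never tells the whole story.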
Compared to other biometric authentication factors, voice biometric authentication has a couple of distinct advantages. Voice recognition works even when users are wearing accessories that sometimes get in the way of other methods, like hats, gloves, masks, or sunglasses. Voice biometrics is also contactless.
That said, a few shortcomings make voice a less attractive biometric method than others.
Some of the downsides that have hampered voice biometric authentication adoption to date include:
While some voice recognition solutions are turning to artificial intelligence (AI) and machine learning models to better understand and adapt to changes in the human voice across circumstances, these friction points and spoofability factors make voice a less attractive biometric option than, say, facial recognition or fingerprints.
At the end of the day, we at Stytch believe that the security of any given authentication method is effectively moot if users won’t adopt it. So while the tech of voice biometrics may feel like the stuff of sci-fi movies (and kinda fun!), we’re far more interested in the biometric methods that are gaining traction and popularity, in large part due to the pioneering work of companies like Apple, which have worked hard to make biometrics a seamless, integrated part of their user experience.
Stytch offers many passwordless authentication solutions as part of our comprehensive product suite, including Native Mobile Biometrics.