Auth & identity
January 10, 2023
Author: Stytch Team
Welcome back to B2B Auth School. Our mission is to help B2B companies uplevel their understanding and implementation of user authentication technologies. Our first series of posts is dedicated to single sign on (SSO). This article is lesson six in that series.
Lesson one | Introducing B2B Auth School
Lesson two | Organization tenancy: the foundation of SSO and B2B data models
Lesson three | What is single sign on?
Lesson four | SSO protocols: SAML vs OIDC
Lesson five | What is OpenID Connect (OIDC)?
Lesson six | What is Security Assertion Markup Language (SAML) and how does it work?
Lesson seven | Choosing a B2B auth provider
In the previous lesson, we took a close look at OpenID Connect, or OIDC – one of the two most popular protocols for handling single sign on (SSO). We looked at the origin of OIDC, its close relation to OAuth, and how that authorization protocol was extended to create what is quickly becoming a preferred standard for identity claims in federated enterprise SSO.
Today we’re going to look at the other most popular SSO protocol, Security Assertion Markup Language, or SAML. We’ll look at:
Let’s get started!
Like OIDC, SAML is a protocol for exchanging authentication and authorization data between parties (if you want a refresher on protocols, check out the previous blog post). Its most common usage is for single sign on. Unlike OIDC, which is built largely with JWTs and claims, SAML is built with Extensible Markup Language (XML) and what are called SAML assertions.
The origins of the Security Assertion Markup Language date back to the beginning of this millennium, when it was developed to overcome certain shortcomings in the Lightweight Directory Access Protocol (LDAP), an even older authentication standard that was popular for on-prem systems in the 1990s. Before SAML, LDAP was the de facto authentication protocol, and it excelled when users and servers all shared the same network and building. But as companies expanded and the internet grew in prominence, the need emerged to perform user authentication and user authorization across networks, directories, and applications. SAML filled that whitespace.
SAML 1.0, the first version, was adopted by the Organization for the Advancement of Structured Information Standards (OASIS) in 2002 for the purpose of cross-directory security assertions and authentication. SAML 2.0 was ratified in 2005 and remains the current version of the protocol. Today, SAML’s primary use case is in B2B identity and access management, namely in enabling SSO.
Like OIDC, SAML authentication also works through a series of redirects and information that is conveyed along with those redirects. To start, let’s take a look at the main components.
Unlike OIDC, which inherited most of its terms and operations from OAuth, SAML authentication uses very similar terms to those used to describe single sign on more generally. If you read our overview of SSO, most of these terms and components will seem familiar.
To see the big picture, let’s first walk through the SSO login flow that occurs between the SAML key components.
Note: for this blog post, we’ll focus on a service provider-initiated (SP-initiated) SAML authentication flow, as it’s the most comparable with OIDC. But readers should note that, unlike OIDC, SAML authentication also allows for identity provider-initiated (IdP-initiated) single sign on. This looks a little different, and comes with specific security vulnerabilities that we touch on later. If you want a quick refresher on IdP-initiated SSO, check out our overview of SSO.
Now, back to that (service provider-initiated) SAML flow:
Compared to OIDC, the SAML authentication flow is rather straightforward. The auth provider quarterbacks the redirects and exchanges between the service provider and identity provider in a certain defined order. The only difference here is that the auth provider sends, receives, and validates data in the form of SAML requests and SAML responses.
Unfortunately, that’s where the simplicity ends.
The notoriously difficult part of SAML is how its data is structured, which is in XML. So in order to fully understand SAML, we need to study its XML anatomy as defined by the SAML 2.0 specs. SAML has several specs – Core, Bindings, Profiles, and Metadata – but for this article we’ll primarily be focused on Core.
It’s worth noting that XML, like SAML, has been around for decades. Its maturity as a language and technology means that web services still use it in production today, but mostly in legacy applications that rely on older protocols like SOAP or XML-RPC. Ultimately, XML has aged quite poorly in today’s REST API landscape because its format is verbose and delicate compared to modern data formats like JSON.
For SSO login, SAML messages come in two forms: SAML requests and SAML responses. The main difference between them is the direction in which they are sent. A SAML request is sent from the auth provider to the identity provider to request information about a user, while the identity provider sends a SAML response to the auth provider to provide the requested information.
However, they both have an envelope-like data model with these four core data elements:
SAML messages can be tens or even hundreds of kilobytes in size. But remember, the vast majority of SAML syntax will layer into one of these four data elements. To demonstrate, let’s dissect specific examples of a SAML request and SAML response.
Created by the auth provider and sent to the identity provider, the SAML request is a request for authentication with attached data properties. Among these properties may be information that relates to the service provider, identity provider, auth provider, B2B customer, or how the data needs to be delivered and formatted.
Note that in this article we won’t be going deep into XML semantics or schema (those could be their own article – stay tuned!). But if you want more specific information on SAML data schema or namespaces (mentioned below) check out page 11 in the SAML specs.
From top to bottom, we can deconstruct the XML into its most important parts and summarize their meaning:
To quickly paraphrase: the SAML message sends an authentication request, bound to HTTP POST, to the specified destination and entity ID. Within the request, it states that it needs an email address in the response when that response is returned to the callback URL.
Congrats, you’re now successfully reading SAML.
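To make that paraphrase concrete, here is a minimal Python sketch that builds a request with those same parts using only the standard library. It is not the article's own example: the function name and the example.com URLs in the usage note are placeholders we made up, and a real auth provider would typically also sign the request.

```python
import uuid
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespaces defined by the SAML 2.0 Core spec.
SAMLP = "urn:oasis:names:tc:SAML:2.0:protocol"
SAML = "urn:oasis:names:tc:SAML:2.0:assertion"
ET.register_namespace("samlp", SAMLP)
ET.register_namespace("saml", SAML)


def build_authn_request(destination, issuer, acs_url):
    """Return a minimal, unsigned AuthnRequest as an XML string.

    All three arguments are hypothetical values supplied by the caller:
    the IdP's SSO endpoint, our entity ID, and our callback URL.
    """
    request = ET.Element(f"{{{SAMLP}}}AuthnRequest", {
        "ID": "_" + uuid.uuid4().hex,  # unique per request
        "Version": "2.0",
        "IssueInstant": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "Destination": destination,
        "ProtocolBinding": "urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST",
        "AssertionConsumerServiceURL": acs_url,
    })
    # Issuer carries the entity ID of the party making the request.
    ET.SubElement(request, f"{{{SAML}}}Issuer").text = issuer
    # NameIDPolicy asks for the user to be identified by email address.
    ET.SubElement(request, f"{{{SAMLP}}}NameIDPolicy", {
        "Format": "urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress",
        "AllowCreate": "true",
    })
    return ET.tostring(request, encoding="unicode")
```

Calling build_authn_request("https://idp.example.com/sso", "https://auth.example.com/sp", "https://auth.example.com/callback") returns a single AuthnRequest element with an Issuer and a NameIDPolicy child, which maps one-to-one onto the paraphrase above.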
The SAML response is a much longer and nastier web of XML, because it contains much more information. It has all the requested data, and usually more, about the user and the authentication flow. Created by the identity provider, the SAML response is the most important artifact, since it carries all the assertions necessary for SSO to work.
We’ll have to break the full SAML response into three parts.
Part one should look familiar. Like the SAML request, the SAML response has a top-level element, called Response, with attributes like xmlns and Destination, but with updated values. For example, the Destination attribute now specifies the auth provider’s callback URL, which will parse and validate the SAML response.
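As a rough illustration of what the callback does with part one, the Python sketch below decodes a base64-encoded response (the encoding used when a response is delivered by HTTP POST) and checks its top-level attributes. The function name and return shape are our own assumptions, and the sketch deliberately leaves out XML signature verification and parser hardening, both of which a real implementation needs.

```python
import base64
import xml.etree.ElementTree as ET

NS = {"samlp": "urn:oasis:names:tc:SAML:2.0:protocol"}
RESPONSE_TAG = "{urn:oasis:names:tc:SAML:2.0:protocol}Response"
SUCCESS = "urn:oasis:names:tc:SAML:2.0:status:Success"


def read_response_envelope(saml_response_b64, expected_destination):
    """Decode a POSTed SAMLResponse value and check its top-level element.

    Hypothetical helper: it only inspects the envelope. Signature
    validation and assertion checks are out of scope for this sketch.
    """
    root = ET.fromstring(base64.b64decode(saml_response_b64))
    if root.tag != RESPONSE_TAG:
        raise ValueError("not a SAML Response")
    # The response must be addressed to our own callback URL.
    if root.get("Destination") != expected_destination:
        raise ValueError("Destination does not match our callback URL")
    status = root.find("samlp:Status/samlp:StatusCode", NS)
    if status is None or status.get("Value") != SUCCESS:
        raise ValueError("identity provider did not report success")
    return {"id": root.get("ID"), "in_response_to": root.get("InResponseTo")}
```

Rejecting early on a mismatched Destination or a non-success status keeps the more expensive assertion checks from running on messages that were never meant for this callback.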
Part two of the SAML response is where all the relevant assertions start to come in. These assertions start to give context about the user and how they authenticated with the identity provider. You’ll see the Assertion element that wraps all the data statements inside, which include:
And finally, part three contains all the assertions about the user’s identity and profile. These user assertions are called attributes. Common attributes about a user include but are not limited to name, phone, email, location, etc. In our example, you’ll see an Attribute element for:
These attributes can provide a robust profile of a user’s identity. That’s because attributes are customizable, in the sense that assertions can include about as much information and metadata as an IT admin cares to stuff in there. But be mindful: every identity provider has its own unique, generally configurable, set of attributes for a user – which means the auth provider will need to know how to interpret them.
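One way to handle that variation is to read whatever attributes arrive and then map known aliases onto your own field names. The Python sketch below assumes the assertion has already been parsed and validated; the alias table is purely illustrative, since each identity provider and each customer's configuration decides the actual attribute names.

```python
import xml.etree.ElementTree as ET

NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}

# Illustrative alias table (an assumption, not a standard): different
# identity providers may name the same piece of profile data differently.
ATTRIBUTE_ALIASES = {
    "email": {
        "email",
        "mail",
        "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress",
    },
    "first_name": {
        "firstName",
        "givenName",
        "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/givenname",
    },
}


def extract_attributes(assertion):
    """Collect every Attribute in the assertion as {name: [values]}."""
    raw = {}
    for attr in assertion.findall("saml:AttributeStatement/saml:Attribute", NS):
        values = [v.text or "" for v in attr.findall("saml:AttributeValue", NS)]
        raw[attr.get("Name")] = values
    return raw


def normalize_profile(raw):
    """Map IdP-specific attribute names onto our own field names."""
    profile = {}
    for field, aliases in ATTRIBUTE_ALIASES.items():
        for name, values in raw.items():
            if name in aliases and values:
                profile[field] = values[0]  # keep the first value only
                break
    return profile
```

Keeping the raw dictionary alongside the normalized profile is a reasonable design choice here, because it lets you add a new alias later without asking the customer to reconfigure their identity provider.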
Adding all three parts together, the SAML response gives a full picture of the authentication flow and the user’s identity. But with multiple assertions and custom attributes, the length and format of the XML multiply in complexity very quickly. The auth provider must be able to parse and validate all the different variations of SAML syntax and assertions, which is easier said than done.
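To give a flavor of what "validate" means in practice, here is a small Python sketch of one such check: the validity window and audience carried by the Conditions element that SAML Core defines for assertions. Whether a given response includes Conditions, the 60-second clock-skew allowance, and the function names are all assumptions on our part, and this is only one of several checks a real implementation performs.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}


def parse_instant(value):
    """Parse a UTC timestamp such as 2023-01-10T12:00:00Z.

    Assumes the 'Z' (UTC) form and drops any fractional seconds.
    """
    base = value.rstrip("Z").split(".")[0]
    return datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)


def check_conditions(assertion, expected_audience, now=None,
                     skew=timedelta(seconds=60)):
    """Raise ValueError if the assertion is outside its validity window
    or was not issued for our audience (entity ID)."""
    if now is None:
        now = datetime.now(timezone.utc)
    conditions = assertion.find("saml:Conditions", NS)
    if conditions is None:
        raise ValueError("assertion has no Conditions")
    not_before = conditions.get("NotBefore")
    not_on_or_after = conditions.get("NotOnOrAfter")
    if not_before and now + skew < parse_instant(not_before):
        raise ValueError("assertion is not yet valid")
    if not_on_or_after and now - skew >= parse_instant(not_on_or_after):
        raise ValueError("assertion has expired")
    audiences = [
        a.text.strip()
        for a in conditions.findall("saml:AudienceRestriction/saml:Audience", NS)
        if a.text
    ]
    if expected_audience not in audiences:
        raise ValueError("assertion was issued for a different audience")
```

The small skew allowance is there because the identity provider's clock and yours will rarely agree to the second.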
While XML makes SAML seemingly endlessly extensible, it also comes with its fair share of footguns and challenges. The hive-mind that is the internet likes to give XML and/or SAML a bad rap for this, but we think it’s more helpful to think of these as challenges to be aware of rather than outright drawbacks. SAML is an incredibly powerful tool, but like any widely adopted or long-existing standard, it needs to be approached with due consideration and thoroughness to get the most out of it.
The flexible XML structure that makes SAML so powerful also leaves it open to implementation mistakes and cyberattacks when it is implemented poorly or without proper guardrails. These include buffer overflow attacks that target the XML parser, canonicalization issues, XML signature wrapping attacks, and attacks on XML encryption such as adaptive chosen-ciphertext attacks.
To protect your product against these vulnerabilities, it’s important to:
Many XML libraries have unpatched vulnerabilities, and many are not maintained regularly enough to fix new ones as they are found. This makes the XML parser a critical juncture for the security and efficiency of your SSO process.
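As one example of a parser guardrail, the Python sketch below refuses any message that is oversized or that carries a DOCTYPE declaration, since SAML messages have no need for DTDs and DTDs are what enable entity-expansion tricks. It uses only the standard library; the size limit and names are assumptions, and maintained packages such as defusedxml offer similar protections. It addresses just one class of XML problem and does nothing about signature wrapping.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

MAX_SAML_BYTES = 256 * 1024  # assumed upper bound; tune for your IdPs


class UnsafeXML(ValueError):
    """Raised when a message is rejected before full parsing."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Called by expat as soon as it sees a DOCTYPE declaration.
    raise UnsafeXML("DOCTYPE declarations are not allowed in SAML messages")


def parse_untrusted_xml(data):
    """Parse XML bytes only after a size check and a DOCTYPE scan."""
    if len(data) > MAX_SAML_BYTES:
        raise UnsafeXML("message exceeds the size limit")
    scanner = expat.ParserCreate()
    scanner.StartDoctypeDeclHandler = _reject_doctype
    try:
        scanner.Parse(data, True)  # first pass: scan only, build nothing
    except expat.ExpatError as exc:
        raise UnsafeXML(f"malformed XML: {exc}") from exc
    return ET.fromstring(data)  # second pass: build the element tree
```

Parsing twice costs a little time, but it keeps the safety check independent of whichever tree-building library you use for the second pass.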
Especially if you are building your SAML / SSO solution in house, make sure your developers spend the time reviewing XML-specific bugs and issues. Most developers who are more familiar or experienced with other languages will not intuit all of XML’s quirks and requirements. The devil is in the details with this one.
IdP-initiated SSO flows are more vulnerable than SP-initiated flows, and they are only possible with SAML. Because the auth provider doesn’t initiate the request in this case, it has no outstanding request to match the response against, so attackers can try to pass themselves off as the identity provider. In spite of these vulnerabilities, larger B2B enterprise customers often request IdP-initiated SSO for the convenience of their users. To keep those customers safe, B2B companies that permit IdP-initiated SSO with their app should research and protect against these vulnerabilities thoroughly.
The online community is continually poking holes in SAML and experimenting with its vulnerabilities, in part because it is such an important cornerstone of B2B authentication and security. Because of this, you should never consider SAML a one-and-done, and should constantly be on the lookout for news of new vulnerabilities as they are discovered. For extra reading, check out engineer Ioannis Kakavas’ deep dive on a SAML assertion-related vulnerability he found in Microsoft Office 365. It’s a perfect demonstration of how this legacy protocol continues to get broken and improved as time goes on.
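To see what the SP-initiated flow gives you that the IdP-initiated flow does not, consider this Python sketch of request tracking. In an SP-initiated flow the response carries an InResponseTo attribute that should match the ID of a request the auth provider recently issued, and each ID should be accepted only once. The class, its in-memory storage, and the five-minute lifetime are assumptions for illustration; a real deployment would need shared storage across servers.

```python
import secrets
import time


class PendingRequests:
    """Tracks request IDs we issued so responses can be matched to them."""

    def __init__(self, ttl_seconds=300):
        self._ttl = ttl_seconds
        self._issued = {}  # request ID -> expiry time (monotonic clock)

    def issue(self):
        """Create and remember a new request ID."""
        request_id = "_" + secrets.token_hex(16)
        self._issued[request_id] = time.monotonic() + self._ttl
        return request_id

    def redeem(self, in_response_to):
        """Return True only for a known, unexpired, not-yet-used ID."""
        if in_response_to is None:
            # Unsolicited (IdP-initiated) response: nothing to match.
            return False
        expires_at = self._issued.pop(in_response_to, None)
        return expires_at is not None and time.monotonic() < expires_at
```

A response with no InResponseTo cannot pass this check at all, which is why apps that accept IdP-initiated logins have to lean harder on other protections such as signature validation, audience checks, and short assertion lifetimes.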
So if there is so much literature on the challenges of working with SAML, why would your B2B app use it? Is it really worth the headache?
We’d say yes, but with an important caveat.
SAML is definitely worth offering as a B2B company. There are no signs of it going away any time soon, despite OIDC’s rising popularity, and the flexibility of its assertions and their capacity to communicate loads of detailed information remain desirable, especially for bigger enterprise clients with custom security and identity management needs.
But just because SAML is worth offering doesn’t necessarily mean it’s worth building in house, and it also doesn’t mean every auth provider that offers it is created equal. At Stytch, whenever we’re evaluating a vendor (or even our own product features), we ask two critical questions:
When talking with customers, we too often see amazing and promising B2B companies try to build SAML SSO in-house, or go with a provider that just checks a box but perhaps doesn’t offer the flexibility or features that a B2B team will need after they hit certain milestones. Both choices end up draining in-house engineering time and talent far more than the company would have wanted or anticipated, and then require even more time to rip and replace.
So if you’re thinking about how to uplevel your B2B product with SSO (OIDC or SAML), we’d love to talk to you about future-proofing your product with best-in-class SSO from Stytch.