Intro to OpenID, OAuth and SAML

Following up on our previous introduction to authentication and identity, this post defines Federated Identity (FID) and then discusses three FID standards, SAML (Security Assertion Markup Language), OpenID and OAuth: what each of them is, when and why it was developed, and how it is evolving.


Federated Identity

Federated identity is the means of linking a person's electronic identity and attributes, stored across multiple distinct identity management systems.  This means that a service or application does not need to obtain and store users’ credentials in order to authenticate them; rather, it can rely on another service or application that already stores the users' electronic identities, using it as a trusted identity management system to authenticate the user.
 
For example, Facebook Connect is a federated identity management system.  Weebly, the web-hosting service that I use to publish this site, lets you sign up and sign in using your Facebook credentials.  This means Weebly is no longer responsible for authenticating the user; Facebook is.  Both Weebly and I trust Facebook to be my identity provider by safely storing my credentials (currently, username and password) and correctly authenticating me.

Another example, this time in the public arena, is the UK Government Department for Work and Pensions allowing UK residents to sign up and sign in to its website to manage benefit claims using their login information from one of eight providers, all private companies, including Experian and PayPal.  Just as in the prior case, this means that the UK Government is no longer responsible for authenticating each person; it trusts those eight companies to authenticate users on its behalf (although I imagine the liability for erroneous authentication still falls on the side of the government).

Two of the key benefits of federated identity are:
Users only need to remember a few sets of credentials - from those companies they trust to safeguard their identity - and can use them to sign into many different sites.
The identity providers have identity management as one of their, or possibly their only, core competencies.  The level of security and protection of personal data will, in all likelihood, be much higher than that of a generic service or app.

It is important to highlight that although it is a third party that authenticates the user, it is the service or app itself that authorizes the user, meaning that it is in control of the level of access the user has to different resources and functions.  There is a clear and complete split between authentication and authorization.

SAML, OpenID and OAuth

These are three different FID standards that were originally built to address very different needs:
SAML was developed in 2002 by the OASIS Security Services Technical Committee as an XML-based open standard for exchanging authentication and authorization data between parties.  Its main purpose was to facilitate Single Sign-On (SSO) for enterprise users.
OpenID is an open standard released in 2006 with the same purpose as SAML (SSO) but for consumer apps and services.
OAuth also emerged in 2006, with its 1.0 specification finalized in late 2007, as an open standard to allow apps to share information via APIs with the right level of authorization.
It is important to note that SAML and OpenID were both authentication protocols, but OAuth was an authorization protocol (OAuth stands for Open Authorization).  Nowadays, they are all grouped under the banner of federated identity standards because of how they have evolved, but strictly speaking, OAuth was not a federated identity standard at its inception (one could argue that it still is not).

The Evolution of OpenID and OAuth to OpenID Connect and OAuth 2.0

In a nutshell, OpenID gives you one log-in for multiple sites.  For example, when you need to log into LiveJournal, a site that accepts OpenID, you will be redirected to the provider of your OpenID, for example a WordPress blog (more on this later), for the provider to authenticate you, and then redirected back to LiveJournal.  As explained above, and as is the case with federated identity in general, LiveJournal is no longer responsible for authenticating the user.  Both LiveJournal and the user trust a third-party identity provider to correctly authenticate the user.

Figure 1 is a flowchart graphically representing the interactions we have just described.

Figure 1 - OpenID Flow
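
The round trip in Figure 1 can be sketched in a few lines of Python.  This is a toy simulation of the three steps, not the OpenID wire protocol: the function names, the URLs and the shared secret are all hypothetical, and a plain HMAC stands in for the signature that the provider attaches to its response.

```python
import hashlib
import hmac
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical secret agreed between the site and the provider beforehand.
SHARED_SECRET = b"demo-association-secret"


def sign(identity: str, return_to: str) -> str:
    """HMAC over the fields that the relying site will check."""
    message = f"{identity}|{return_to}".encode()
    return hmac.new(SHARED_SECRET, message, hashlib.sha256).hexdigest()


def relying_party_redirect(claimed_id: str, return_to: str) -> str:
    """Step 1: the site sends the browser to the user's provider."""
    query = urlencode({"claimed_id": claimed_id, "return_to": return_to})
    return f"https://provider.example/auth?{query}"


def provider_authenticate(auth_url: str, password_ok: bool) -> str:
    """Step 2: the provider checks credentials and redirects back."""
    params = parse_qs(urlparse(auth_url).query)
    identity = params["claimed_id"][0]
    return_to = params["return_to"][0]
    if not password_ok:
        return f"{return_to}?{urlencode({'mode': 'cancel'})}"
    response = {
        "mode": "id_res",
        "identity": identity,
        "sig": sign(identity, return_to),
    }
    return f"{return_to}?{urlencode(response)}"


def relying_party_verify(callback_url: str, return_to: str) -> Optional[str]:
    """Step 3: the site verifies the signed response; it never sees the password."""
    params = parse_qs(urlparse(callback_url).query)
    if params.get("mode", [""])[0] != "id_res":
        return None
    identity = params["identity"][0]
    expected = sign(identity, return_to)
    if hmac.compare_digest(expected, params["sig"][0]):
        return identity
    return None


if __name__ == "__main__":
    return_to = "https://blog.example/openid/return"
    url = relying_party_redirect("https://alice.example/", return_to)
    callback = provider_authenticate(url, password_ok=True)
    print(relying_party_verify(callback, return_to))
```

The point of the sketch is where the password check lives: only `provider_authenticate` ever deals with credentials, while the site only checks that the response it receives was really produced by the provider it trusts.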

When Brad Fitzpatrick and team first created OpenID, they were looking to develop a protocol that made it possible for a commenter to claim her comments on someone else’s blog. For the commenter, she could claim her posts and build a reputation; for the blog owner, he had a way to recognize and link readers' comments. 

Given this context, all that was required in the early days of OpenID was to uniquely identify an individual, thus allowing them to establish identity across contexts.  There was no suggestion of two web apps or websites sharing data, except possibly very general information, such as address or phone number, and only with the user's explicit consent.  This allowed users to hold a single account, at say yahoo.com, but sign in to third-party sites using “non-correlatable identifiers”, enabling users to keep their identity private across all their interactions.
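
To make “non-correlatable identifiers” a bit more concrete, here is a minimal sketch of one way a provider could derive them.  It is an assumption for illustration, not a description of how any particular provider does it: the secret, the URL pattern and the function name `pairwise_identifier` are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical provider-side secret; it is never shared with relying sites.
PROVIDER_SECRET = b"provider-only-secret"


def pairwise_identifier(account: str, site: str) -> str:
    """Derive a stable, site-specific pseudonym for one account."""
    message = f"{account}\n{site}".encode()
    digest = hmac.new(PROVIDER_SECRET, message, hashlib.sha256).hexdigest()
    return f"https://provider.example/id/{digest[:32]}"


if __name__ == "__main__":
    print(pairwise_identifier("alice", "https://blog-one.example"))
    print(pairwise_identifier("alice", "https://blog-two.example"))
```

Each site always sees the same identifier for the same user, so the user can build a reputation there, but two sites comparing notes cannot tell from the identifiers alone that they are dealing with the same account.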

It was also important for OpenID to keep authentication capabilities decentralized.  That means that there is not a single Identity Provider, or even a small set of Identity Providers, that users must choose from.  Any site can offer the service - such as any site created with WordPress.  It is then up to users and websites to choose the third parties they want to trust for authentication purposes.


The role of OAuth is different.  It lets you authorize one website – the consumer – to access user data from another website – the provider.  For instance, a user wants to authorize her favorite social network to grab her contacts info from her e-mail provider.  The social network will redirect the user to her e-mail provider so that she can be authenticated.  Once she has authenticated herself (this authentication step is totally orthogonal to the OAuth process), the e-mail provider will ask her to confirm that she wants to share her contact info with the social network.  If she confirms, the e-mail provider will send the requested data to the social network.

Figure 2 below is a flowchart graphically representing the interactions we have just described.

Figure 2 - OAuth Flow
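
The consent step in Figure 2 can be sketched as follows.  This is a toy model under stated assumptions rather than the OAuth protocol itself: the `EmailProvider` class, the `contacts.read` scope name and the sample data are hypothetical.  It also makes explicit one detail that the description above glosses over: in practice the provider hands the consumer a token, which the consumer then presents when it asks for the data.

```python
import secrets


class EmailProvider:
    """Toy provider that holds user data and issues scoped access tokens."""

    def __init__(self):
        self._contacts = {"alice": ["bob@example.com", "carol@example.com"]}
        self._grants = {}  # token -> (user, consumer, scope)

    def authorize(self, user, consumer, scope, user_consents):
        """Called after the provider has authenticated the user itself."""
        if not user_consents:
            return None
        token = secrets.token_urlsafe(16)
        self._grants[token] = (user, consumer, scope)
        return token

    def get_contacts(self, token):
        """The consumer presents the token; it never sees the password."""
        grant = self._grants.get(token)
        if grant is None or grant[2] != "contacts.read":
            raise PermissionError("missing or insufficient grant")
        user = grant[0]
        return list(self._contacts[user])


if __name__ == "__main__":
    provider = EmailProvider()
    token = provider.authorize("alice", "social.example", "contacts.read", True)
    print(provider.get_contacts(token))
```

Note that nothing in this exchange tells the social network who the user is; the token only says what the bearer may fetch.  That is the sense in which OAuth is an authorization protocol rather than an authentication one.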

Both OpenID and OAuth originated around 2006, and they have since evolved over a number of releases, gaining in flexibility, security and overall capabilities.

In May 2008, Facebook launched Facebook Connect, a new authentication protocol built on top of OAuth by adding an authentication layer on top of the authorization standard that set restrictions and specific security and encryption requirements.  Over time, Facebook Connect evolved to use OAuth 2.0 (a version of the OAuth spec published in October 2012), and other similar services also based on OAuth 2.0 emerged, such as Twitter and Google Connect.  The focus of these protocols was authentication with profile portability - sharing app-specific user information at the time of authentication - facilitated by OAuth's data-sharing capabilities.

In spite of interoperability issues, service providers viewed these proprietary protocols as more valuable than traditional OpenID because they not only authenticate users but also share information between apps, which can greatly benefit service providers and websites in general.  OpenID Connect represents years of work to align consumer Identity Providers (e.g. Microsoft, Google, Yahoo…) and other industry participants on a single profile of OAuth 2.0 for authentication.  Now, most of the consumer Identity Providers, such as Google or Microsoft, provide solutions that fully support the required features of OpenID Connect.
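
What OpenID Connect adds on top of OAuth 2.0 is, in essence, an ID token: a signed JSON Web Token in which the provider states who the user is (`sub`), who issued the statement (`iss`) and which app it is meant for (`aud`).  The sketch below only shows the shape of such a token.  The token is assembled locally with made-up values, the helper names are hypothetical, and the signature is deliberately not verified, which a real application must always do.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """Base64url without padding, as used in JSON Web Tokens."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


def read_id_token_claims(id_token: str) -> dict:
    """Return the claims WITHOUT verifying the signature (illustration only)."""
    _header_b64, payload_b64, _signature_b64 = id_token.split(".")
    return json.loads(b64url_decode(payload_b64))


# Hypothetical token built here so that the example is self-contained.
header = b64url_encode(json.dumps({"alg": "RS256", "typ": "JWT"}).encode())
payload = b64url_encode(json.dumps({
    "iss": "https://idp.example",
    "sub": "248289761001",
    "aud": "my-client-id",
    "exp": 1700000000,
}).encode())
sample_token = f"{header}.{payload}.signature-placeholder"

if __name__ == "__main__":
    print(read_id_token_claims(sample_token))
```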

For those readers inclined to audio-visual learning (including me), here is the link to one of my favorite videos on OpenID Connect by Nat Sakimura

What role does SAML play?

Although SAML 1.0 was released in 2002, the version most widely used today, SAML 2.0, was released in 2005. SAML was designed to cover B2B, as well as B2C scenarios, although its implementation proved to be too complicated to gain mass adoption among smaller B2C players, which have mostly elected to implement OpenID, and more recently, OpenID Connect.

SAML defines XML-based assertions and protocols, bindings, and profiles.  A profile describes how all the other elements are combined to support a use case.  The most widely used SAML profile, and also the one on which we are focused here, is the Web Browser Single Sign-On Profile.

An assertion contains a packet of security information, usually transferred from identity providers to service providers.  Assertions contain statements that service providers use to make access control decisions.

A SAML protocol describes how certain SAML elements (including assertions) are packaged within SAML request and response elements, and gives rules about how to process those packages.  For the most part, a SAML protocol is a simple request-response protocol.

A SAML binding is a mapping of a SAML protocol message onto standard messaging formats and/or communications protocols, for example SAML over SOAP or SAML over HTTP.

Depending on the protocol and binding chosen, the communication flows between the parties can vary greatly, which provides the benefit of flexibility, but the simplest cases align with the OpenID flow that we described above.
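
To give a feel for what an assertion looks like, here is a heavily trimmed example read with Python's standard library.  Only the namespace and the element names come from SAML 2.0; the issuer, subject and attribute values are invented, and the signature and validity conditions that a real assertion carries (and that a service provider must check before trusting it) are left out.

```python
import xml.etree.ElementTree as ET

NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}

# Hypothetical, unsigned assertion used purely for illustration.
ASSERTION_XML = """\
<saml:Assertion xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
                ID="_example" Version="2.0" IssueInstant="2013-10-22T12:00:00Z">
  <saml:Issuer>https://idp.example/</saml:Issuer>
  <saml:Subject>
    <saml:NameID>alice@example.com</saml:NameID>
  </saml:Subject>
  <saml:AttributeStatement>
    <saml:Attribute Name="role">
      <saml:AttributeValue>claims-manager</saml:AttributeValue>
    </saml:Attribute>
  </saml:AttributeStatement>
</saml:Assertion>
"""


def read_assertion(xml_text: str) -> dict:
    """Pull out who issued the assertion, whom it is about, and its attributes."""
    root = ET.fromstring(xml_text)
    attributes = {
        attr.get("Name"): [v.text for v in attr.findall("saml:AttributeValue", NS)]
        for attr in root.findall("saml:AttributeStatement/saml:Attribute", NS)
    }
    return {
        "issuer": root.findtext("saml:Issuer", namespaces=NS),
        "subject": root.findtext("saml:Subject/saml:NameID", namespaces=NS),
        "attributes": attributes,
    }


if __name__ == "__main__":
    print(read_assertion(ASSERTION_XML))
```

The attribute statement is what a service provider would feed into its own access control decision, which keeps authorization on the service provider's side even though authentication happened elsewhere.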

Security has always been an absolutely key requirement for SAML, and it has a variety of security mechanisms at both the transport and message levels.

An example of a very public project that has deployed SAML is Healthcare.gov.  In this project, SAML is used to connect the Healthcare.gov server with the servers of a number of federal and state entities, such as Social Security, the IRS, Medicare and Medicaid.  This is meant to ensure that the exchange of personal data between the agencies and Healthcare.gov stays private and secure.

In fact, according to John Bradley, the SAML implementation seems to be at the heart of the integration problems the site has had.  In his own words, as published in his blog on October 22nd:
While HealthCare.gov is using SAML 2.0 they are not using a standard deployment profile [...]

The industry puts time and effort into producing profiles and testing against them, to reduce integration problems when people deploy federations.   The [id]Management.gov for the US Gov is clear on its deployment profile for SAML SSO.

One of the problems the insurers had integrating with HealthCare.gov was a divergence they made from these profiles in what they were sending to the insurers.

How does SAML compare to OpenID Connect?

In the 2000s, SAML offered much greater flexibility, security and reliability than OpenID, OAuth or any combination of those two standards.  The latest versions of OpenID Connect and OAuth 2.0, however, provide most, if not all, of the benefits that SAML brings to the table.  For example, from a security perspective, OpenID Connect can now offer ISO/IEC 29115 Levels of Assurance 1 to 4 by leveraging cryptography and other techniques.

For these reasons, while most people think of OpenID Connect as something adopted by social sites like Google for login, it is also gaining traction in enterprise-targeted services like Windows Azure Active Directory (WAAD), PingFederate and PingAccess.  It can be even more powerful when used in combination with provisioning protocols like System for Cross-domain Identity Management (SCIM).