I don't really like OIDC
When an organization grows, centralized account management becomes an important issue. The modern protocol to do single sign-on (SSO) is called OpenID Connect (OIDC). In this post I will look into this protocol and figure out why it is so darn complicated.
My naive expectation
This is going to be quite a long post. But let's kick things off with a concrete example that illustrates what I expected, based on some basic knowledge about SSO:
When I try to access the application, I get redirected to an SSO login form:
```
$ curl https://myapp.example/
< HTTP/1.1 303 See Other
< Location: https://sso.example/login/?client_id=myapp
```
I authenticate (e.g. by providing a username and password) and get redirected back to the application.
```
$ curl https://sso.example/login/?client_id=myapp \
    --form "username=tobias" --form "password=…"
< HTTP/1.1 303 See Other
< Location: https://myapp.example/?code=ABC123
```
I am back at the application, but now with a `code` parameter. To verify this authorization code, the application (not my browser!) sends it back to the SSO provider:

```
$ curl https://sso.example/verify/ --form "code=ABC123"
```
The SSO provider verifies the authorization code and responds with some information about my account, most notably a unique identifier.
```
< HTTP/1.1 200 OK
< Content-Type: application/json

{
  "username": "tobias",
  "email": …,
  "name": …,
  "groups": […]
}
```
That's it.
The actual protocol
In reality, OpenID Connect has some additional steps:
The application fetches some information needed to interact with the SSO provider:
```
$ curl https://sso.example/.well-known/openid-configuration
< HTTP/1.1 200 OK

{
  "issuer": "https://sso.example",
  "authorization_endpoint": "https://sso.example/login/",
  "token_endpoint": "https://sso.example/token/",
  "userinfo_endpoint": "https://sso.example/userinfo/",
  "jwks_uri": "https://sso.example/jwks/",
  "response_types_supported": ["code"],
  "grant_types_supported": ["authorization_code"],
  "id_token_signing_alg_values_supported": ["RS256"],
  "token_endpoint_auth_methods_supported": ["client_secret_post"],
  "code_challenge_methods_supported": ["S256"],
  …
}
```
When I try to access the application, I get redirected to the authorization endpoint:
```
$ curl https://myapp.example/
< HTTP/1.1 303 See Other
< Location: https://sso.example/login/
    ?client_id=myapp
    &response_type=code
    &scope=openid+email+profile
    &redirect_uri=https%3A%2F%2Fmyapp.example%2F
    &state=XXX
    &nonce=YYY
    &code_challenge=ZZZ
    &code_challenge_method=S256
```
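On the application side, assembling that redirect is mostly URL encoding. Here is a minimal Python sketch using only the standard library; the function name is made up, and generating the `code_challenge` is treated as a separate step:

```python
import secrets
from urllib.parse import urlencode


def build_authorization_url(authorization_endpoint: str, client_id: str,
                            redirect_uri: str, code_challenge: str):
    """Build the step-2 redirect target; return it together with the
    fresh state and nonce values that must be stored in the session."""
    state = secrets.token_urlsafe(16)
    nonce = secrets.token_urlsafe(16)
    params = {
        "client_id": client_id,
        "response_type": "code",
        "scope": "openid email profile",
        "redirect_uri": redirect_uri,
        "state": state,
        "nonce": nonce,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    }
    return authorization_endpoint + "?" + urlencode(params), state, nonce


# Example values from the post; "ZZZ" stands in for a real code challenge.
url, state, nonce = build_authorization_url(
    "https://sso.example/login/", "myapp", "https://myapp.example/", "ZZZ")
```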
I authenticate (e.g. by providing a username and password) and get redirected back to the application.
```
$ curl https://sso.example/login/
    ?client_id=myapp
    &response_type=code
    &scope=openid+email+profile
    &redirect_uri=https%3A%2F%2Fmyapp.example%2F
    &state=XXX
    &nonce=YYY
    &code_challenge=ZZZ
    &code_challenge_method=S256
    --form "username=tobias" --form "password=…"
< HTTP/1.1 303 See Other
< Location: https://myapp.example/?code=ABC123&state=XXX
```
As part of this authentication, I also explicitly consent that the application may access my information on the SSO provider.
I am back at the application, but now with `code` and `state` parameters. First, the application checks that the `state` parameter matches the one it sent in step 2. After that, to verify the authorization code, the application (not my browser!) sends it to the token endpoint:

```
$ curl https://sso.example/token/ \
    --form "client_id=myapp" \
    --form "client_secret=…" \
    --form "code=ABC123" \
    --form "code_verifier=…" \
    --form "grant_type=authorization_code"
```
The SSO provider checks the `client_secret` and `code_verifier` parameters, and verifies that the authorization code is valid and has not been used before. Then it responds with some tokens:

```
< HTTP/1.1 200 OK
< Content-Type: application/json
< Cache-Control: no-store

{
  "id_token": …,
  "access_token": "TTT",
  "token_type": "Bearer"
}
```
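The two secret comparisons in this step are easy to get subtly wrong (timing-safe comparison, the exact S256 transform), so here is a sketch of the provider-side check in Python. All names and values are illustrative, not taken from any real implementation:

```python
import hashlib
import hmac
from base64 import urlsafe_b64encode


def s256(code_verifier: str) -> str:
    """The S256 transform from PKCE: unpadded base64url(sha256(verifier))."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def check_client(client_secret: str, expected_secret: str,
                 code_verifier: str, code_challenge: str) -> bool:
    """Provider-side checks on client_secret and code_verifier.

    compare_digest avoids leaking information through timing differences.
    """
    return (hmac.compare_digest(client_secret, expected_secret)
            and hmac.compare_digest(s256(code_verifier), code_challenge))
```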
The ID token is a JWT (basically a signed JSON blob) that contains some additional information:
```
{
  "iss": "https://sso.example",
  "iat": 1736000000,
  "exp": 1736000020,
  "aud": "myapp",
  "nonce": "YYY"
}
```
The application now does all kinds of verification:
- check the signature of the JWT (using the keys received from `jwks_uri` in step 1)
- check that `iss` matches the `issuer` from step 1
- check that the token has been issued in the past (`iat`) and that it has not yet expired (`exp`)
- check that this token was created for this client (`aud`)
- check that the nonce matches the one that was sent in step 2
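The claim checks from this list (everything except the signature, which needs a JWT library) fit in a few lines. A sketch with a made-up function name, applied to the example ID token from above:

```python
import time


def validate_id_token_claims(claims: dict, *, issuer: str, client_id: str,
                             expected_nonce: str, now=None) -> bool:
    """Check the ID token claims; the signature check is assumed to have
    already happened. Note: per spec, "aud" may also be a list; this
    sketch only handles the single-audience case."""
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        return False  # issued by a different provider
    if claims.get("aud") != client_id:
        return False  # issued for a different client
    if not (claims.get("iat", 0) <= now < claims.get("exp", 0)):
        return False  # issued in the future, or already expired
    if claims.get("nonce") != expected_nonce:
        return False  # does not belong to our login attempt
    return True


claims = {"iss": "https://sso.example", "iat": 1736000000,
          "exp": 1736000020, "aud": "myapp", "nonce": "YYY"}
```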
Finally, the application fetches the user information from the userinfo endpoint, using the access token received in step 5:
```
$ curl https://sso.example/userinfo/ -H "Authorization: Bearer TTT"
< HTTP/1.1 200 OK
< Content-Type: application/json

{
  "sub": "tobias",
  "email": …,
  "name": …,
  "groups": […]
}
```
This protocol is obviously much more complicated than my naive expectation (though the basic structure is the same). In the following sections I want to examine all the little differences and ask: Why is it there and is it really necessary?
OAuth legacy
As a first step it is important to understand that OpenID Connect is based on OAuth 2.
OAuth is not really an authentication protocol by itself. I feel like most explanations are overly complicated, so I will use an example instead:
There is a cool new service called awesome-meetings.example. I want to start using it immediately, but first it needs access to my calendar. So I press a button and get redirected to serious-calendar.example, where I verify that I indeed want to share my calendar with awesome-meetings.example. I get redirected back and can start scheduling meetings.
What happens in the background is basically the same as the protocol I described above. awesome-meetings.example ends up with an access token that it can use to access my calendar. The `scope` parameter restricts what the token can be used for. In this example, the token can only be used to access my calendar, but not my address book.
The OpenID Connect authors squinted at this and decided that being allowed to access a user's data is really the same as authentication. They also figured that big companies like Google, Facebook, or Microsoft would probably want to provide both SSO and resource access. So combining the two seemed like a good fit.
OpenID Connect mostly adds the concept of the ID token, as well as the `nonce` parameter. We will discuss both later in this article. It also adds the `.well-known/openid-configuration` endpoint, which makes sense given all the available options.
Because oh boy are there options. The protocol I described above is just one of many possible ways to do it. There are many different and incompatible authentication schemes built on top of OAuth. OpenID Connect standardizes some of that, and OAuth 2.1 (still a draft) removes some further options.
Even though some options have been removed, there are still plenty left. For example, there are at least two ways to pass user information to applications (neither of which matches my expectation): it can be included in the ID token or fetched from a separate userinfo endpoint. I have seen both in the wild. Realistically, SSO services need to support both to be compatible.
Terminology
Quick note on naming things:
- SAML uses the terms "service provider" (SP) and "identity provider" (IdP)
- OAuth uses the terms "client", "authorization server" (AS), and "resource server" (RS)
- OpenID Connect uses the terms "relying party" (RP) and "OpenID Provider" (OP)
- I talk above about "application" and "SSO provider".
I am sorry for adding yet another set of terms, but I find all the others really confusing.
Threat Analysis
In non-SSO login, there are two main attack vectors: either you manage to trick the login (e.g. by guessing the password) or you manage to steal a session cookie. Both of these vectors are exactly the same with SSO.
The benefits are that you only have a single login implementation, so you can focus on making that really robust. You also only expose the password to a single service, which is an improvement over older SSO mechanisms such as LDAP, where the password was given to each application, which then verified it with the SSO provider in the background.
But there is also new attack surface. Authorization codes are sufficient to log in, and they are easily stolen (e.g. from the browser history). It is therefore crucial that they expire quickly and become invalid once they have been used. They should also not contain any personal information about the user.
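What "expire quickly and single-use" can look like on the provider side, as a sketch (an in-memory store with hypothetical names; a real implementation would persist this):

```python
import time

CODE_LIFETIME = 60  # seconds; authorization codes should be short-lived


class AuthorizationCodes:
    """Minimal single-use, expiring store for authorization codes."""

    def __init__(self):
        self._codes = {}  # code -> issue timestamp

    def issue(self, code: str, now=None) -> None:
        self._codes[code] = time.time() if now is None else now

    def redeem(self, code: str, now=None) -> bool:
        now = time.time() if now is None else now
        # pop() makes the code single-use: redeeming it again always fails.
        issued_at = self._codes.pop(code, None)
        return issued_at is not None and now - issued_at <= CODE_LIFETIME
```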
A second, less obvious attack is that an attacker could get a user to click a link with a crafted authorization code. As a result, the user might do something using the attacker's account while thinking they are using their own.
Of course, misconfigured applications may also allow attackers to bypass SSO, maybe even to register new accounts. Correct configuration is crucial.
Threat Mitigations: State, Nonce, and Code Challenge
These three parameters can be used to further limit the risk of authorization code injection. They all work very similarly: a random value is stored in the application session and sent along in the initial request (directly, or as a cryptographic hash in the case of the code challenge). When it comes time to check, the incoming value is compared against the value stored in the session.
This way, the whole transaction is bound to the application session. Even if an attacker were to steal the authorization code, they could only use it if they also managed to steal the session cookie (e.g. by getting physical access to the device), at which point they don't really need the authorization code anymore.
These mechanisms also significantly raise the bar for supplying crafted authorization codes, because attackers need to include parameters that match the ones in the user's session (e.g. by witnessing the initial authentication request).
The differences between these parameters are small: `state` is checked in step 4, so it can prevent the token request from being made. `code_challenge` is checked in step 5, so the token request is made, but the application does not receive tokens. `nonce` is checked in step 6, at the very end.

One benefit of `code_challenge` is that it is checked by the SSO provider, so by requiring it you can be sure that it is implemented correctly everywhere. Of course, that requires that all applications are compatible.
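For reference, the client side of the `code_challenge` mechanism fits in a few lines of standard-library Python. The verifier stays in the application session; only the challenge goes into the authorization request:

```python
import hashlib
import secrets
from base64 import urlsafe_b64encode


def b64url(data: bytes) -> str:
    """Unpadded base64url encoding, as used by PKCE."""
    return urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_pkce_pair():
    """Return (code_verifier, code_challenge) for the S256 method."""
    code_verifier = b64url(secrets.token_bytes(32))  # stored in the session
    code_challenge = b64url(
        hashlib.sha256(code_verifier.encode("ascii")).digest())
    return code_verifier, code_challenge
```

The SSO provider later applies the same SHA-256 transform to the `code_verifier` it receives at the token endpoint and compares the result with the stored challenge.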
So which one should you implement? This is another case where I wish the spec had fewer options. Right now, for the sake of compatibility, it is probably best to support all of them. On the other hand, this increases the risk of downgrade attacks.
ID token
The main addition of OpenID Connect on top of OAuth is the ID token. From what I understand, it is completely redundant.
- Its cryptographic signature can be used to verify the authorization code, but we have already done that by sending it to the token endpoint over a TLS connection.
- It can contain information about the user, but we can also get that from the userinfo endpoint.
In an alternate world, we would receive the ID token directly instead of taking the detour of using an authorization code (this is called the "implicit flow" in OAuth). We would then validate the ID token and extract the user info, no additional requests necessary.
My main issue, again, is that there are too many options. We should pick one. And we should certainly not have to support both, that is just unnecessary complexity.
In the implicit flow, the tokens are passed in the URL and end up in the browser history, from where they can easily be stolen. This is not so much an issue for the SSO usecase, because the tokens have limited use there. But in the OAuth usecase, this is a real issue. I don't want people to steal the access token to my calendar.
OAuth 2.1 therefore went ahead and removed the implicit flow completely. This is a huge step in the right direction (and it would also make the `response_type=code` parameter obsolete if it weren't for backwards compatibility). If the OpenID Connect spec were rebased onto that, it could be simplified massively.
Dynamic Redirects
The authorization endpoint receives both a `client_id` and a `redirect_uri` parameter. However, it would be insecure to allow arbitrary values for `redirect_uri`. This would, for example, allow redirects to an attacker-controlled URI that steals the authorization code.
Of course, always redirecting to the application start page would be annoying for users. When I open a link and need to log in before accessing the page, I want to get redirected to that page after login.
In the end, only the application can decide which redirect URIs are safe. So the best solution is to always redirect to a pre-defined URI and let the application handle the rest. In the meantime, the application could store the original URI in the session.
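The session-based approach can be sketched like this (hypothetical helper names; `session` stands for whatever dict-like server-side session store the application uses):

```python
def start_login(session: dict, original_path: str) -> str:
    """Remember where the user wanted to go, then send them through
    the SSO provider to the single pre-defined redirect URI."""
    session["after_login"] = original_path
    return "https://sso.example/login/?client_id=myapp"  # plus state etc.


def finish_login(session: dict) -> str:
    """Called at the pre-defined redirect URI after a successful login."""
    target = session.pop("after_login", "/")
    # Only allow paths within our own application. An absolute URL (or a
    # scheme-relative "//host" one) would reintroduce the open redirect
    # that this whole scheme is meant to avoid.
    if not target.startswith("/") or target.startswith("//"):
        target = "/"
    return target
```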
In other words: the `redirect_uri` parameter is completely dispensable.
Client Secret
The token endpoint receives a `client_secret` parameter. This allows the SSO provider to verify that the request comes from the same application for which the authorization code has been created. This is of course important for the OAuth usecase, because you don't want the wrong application to receive the access token for your calendar.
For the SSO usecase, this is less relevant though. What is the worst thing that could happen? A malicious client learns that I can successfully authenticate? That doesn't sound so bad. The token endpoint may give you access to some limited information about the user though.
There may be more attacks that I don't see right now. Protecting the user information alone might be worth it. So I don't really mind it.
But again, there are way too many options: "the authorization server MAY accept any form of client authentication meeting its security requirements (e.g., password, public/private key pair)."
Native Applications
So far I mostly assumed that the application is a server-side web application. If instead the application is a SPA or a native app, things get more complicated:
- The client secret is exposed
- The values for `state`, `code_challenge`, and `nonce` are exposed
- The request to the token endpoint uses the user's network, which makes MITM attacks much simpler
- The authorization endpoint cannot simply redirect to a native app as you would to a web application
I will not go into more detail here. The OAuth spec has a whole section on native applications. Just be aware that they are special.
Logout
One nice feature of SSO is that you may not even notice it: Clicking the login button in an application may seemingly just refresh the page and log you in. This is because the authorization endpoint can just redirect you back immediately if you are already logged in at your SSO provider.
However, there is an issue: Users may not realize that they are logged in at the SSO provider. Imagine someone using a shared computer in a library. They log in to their email account using SSO, then log out of the email account again. But they are still logged in on the SSO provider. The next person using the device could trivially log back in.
I can think of multiple solutions:
- When I log out of any application, I am also logged out of the SSO provider.
- When I log out of any application, I am also logged out of the SSO provider and all other applications.
- The SSO provider does not keep a session. When I want to log in to a second service I have to authenticate again.
- Just don't use shared devices.
I believe the issue here is that we do not have a shared mental model of how SSO logout should work. It may also depend on context. For example, I sometimes use github for SSO, but I also use github for other things, so I know that I have a session there. On the other hand, I would not remember to log out of keycloak because that is literally only used for SSO.
Zombie Sessions
Having centralized account management is nice. When a person leaves your organization, you can simply remove their account and they immediately lose access.
However, as I described so far, SSO is only used for initial authentication. After that, each application has its own session. People might hold on to their sessions long after the SSO account has been removed.
In the OAuth usecase, the access tokens connected to the central account would also expire. But in the SSO usecase, there is no standardized solution that I know of. Each application must be handled individually.
Permission management
When you have centralized account management, you may also want to do centralized permission management. To a degree this is possible.
On a basic level, you can configure which applications an account has access to in the first place. You could also configure groups at the SSO provider that get mapped to application groups. But in my experience, this only gets you so far. You will probably still have some application-specific permission management.
Conclusion
OpenID Connect is a solid SSO protocol. Unfortunately, it suffers from far too many options and some missed opportunities. The job of a standard is not to show the set of possibilities, but to restrict it. This is especially true for security sensitive protocols such as this one.
I do understand that some things should be pluggable. Cryptographic primitives need regular updates. But that's basically it.
OAuth 2.1 is a great step in the right direction. I am really looking forward to it. The draft seems to be actively maintained, even though it has been in that state for a long time.
But it still has way too many options.