
Hi everyone, thank you for attending this session on tokens, and mainly on the account recovery layer that I'm going to be covering today. My name is Viola Likova. I am a senior software engineer with an SRE mindset, and I am the go-to person when it comes to authentication. I own end-to-end authentication flows and I'm very passionate about making sure that your authentication does what it's supposed to do. So let's talk about hardening your login. The bad stuff happens on the worst paths in your system. The worst paths are not the ones that you harden so much that they look perfect. This
door represents the way that your login might look. It's very secure. It's not prone to any attacks. I would hardly believe that any hacker would be able to get through the login page. However, you might not be thinking about the back door. The back door, in this sense, is when the attacker chooses to go for the easier path, and the easier path is typically where you're restoring access to your system: the password reset paths, and the paths that you use to create new trust with the system. That is where you establish who will now be in charge of the system, and typically this has been treated as a UX concern.
It's not just a UX matter when you are hardening something that can regrant access and reissue tokens that grant the same access to the system that the user originally had. I will show you a demo of exactly how it can damage the account, and how an attacker can still be logged in if they already have access. I know that multifactor authentication is now the standard. We all know about single sign-on and passkeys, and we keep polishing those things. We keep perfecting the login path, and we will probably still be perfecting it even after my talk. But obviously the path that goes behind
it has not been hardened. It was probably built years ago and never really revisited, because everyone just treats it as a back path. When we imagine a hacker getting into our system, we typically imagine a person who comes to our website, goes to the login page, and tries to brute force it until it works. Yes, brute forcing can be effective, but that's not the easiest way around your system. So I'm going to talk about the failure modes that I think are most important when it comes to recovery and when it comes to the actual
session handover to the attacker. This is what I call a really poor authentication reset flow. This is where you basically say: this is my login path, it's amazing, but when you reset a token, when you reset access to the system, please close the gate. Unless you want an attacker to go in, but there are other ways to go in. So why would you tell the attacker anything? They will always choose the easier path, the low-hanging fruit, rather than going straight through your login page. The password reset is not just some kind of flow that reissues the token and then everything goes back to normal. Things like you will
still be signed in on different devices can happen. When that happens, it means that whoever was already in a session while the password was being reset is still in that session, and they will stay in it. Multifactor reset is the same thing: we can reissue trust, but the problem is that someone won't really have to think about other ways of logging in through the login page when they can just stay in the system using the multifactor authentication device. And the lifecycle management is quite poor in that sense because, if you think about it, you have other means that allow you to authenticate in your system and
allow you to be in the system and reissue the trust. But what if someone else has access to that device, or to that means of authentication? It will let them into your system by default. You may also be in the system, or you may not be any more. The point is that we need to think about lifecycle management as a chain rather than just a single point of entry. So support recovery can become a lower-assurance bypass. Yes, this is true, because the assurance we give the system that we are managing matters to the point that we need to look at the whole chain and where the weakest
links are. If we don't harden those weakest links, like reset password, reset token, whatever, then we are opening up access through them, and they become the low-hanging fruit that I'm talking about. So this is the recovery chain diagram. This is how you essentially request that access be restored and put back in your hands. And this is a PDF checklist that you can download from our website. It basically lists everything you need to harden your reset flow. I bet so many companies still have the old UX, and the part that restores the token is very weak, while the login
will probably be quite nice. So how does this chain actually play out? It starts with a request for a reset. Let's take the easiest case, the password reset. Then comes the delivery of the message. Someone gets the message. Who gets the message? It's a good question, because it can be an attacker who has bad intentions towards your account. If this is a bank account and you're very rich and they just want to get hold of the money, they will get the notification in their email, and they can change the notifications to be delivered to their email account rather
than yours, which was the original one. Once the trust has been reissued on the account, they are in charge of that account and you can do nothing. Essentially, you'll be locked out. You are probably all familiar with the verification step. Of course, you click the link and it leads somewhere. But what if the token in that link is never reissued? What if it's replayable? What if you can use it again and again, and it's never going to expire? There is no time to live. There is nothing set on this token to make it secure. And I mean, yes, you
can always say, okay, I have a time to live on the token. It's now very secure because it expires after one use. We make sure to expire it every time. But then what if other things leak, for example on your website? What if your session is revealing the times when something is happening, through clock skew or differences in the timing of notification delivery? What if your UX is actually the bad guy and is informing the attacker who's trying to get into your system? They are essentially getting messages from your system saying that the password doesn't exist in the system or that the login doesn't exist
in the system. This kind of stuff can damage the chain at the verification point, because it basically tells them where to go and look. The UX is quite an important part here. Think about it: which messages are delivered at what time, what kind of text do you see when the request is submitted, and what happens after that? When you look at those timings and compare them, the verification becomes quite a vulnerable path, because we can understand from that timing exactly what is happening with the account. We can also try and
use different emails: for example, one for a person we know definitely has an account with the system versus one for a person who definitely does not, or a fake email. These things can happen at that point in the chain, and we should never think that it is just that verification point. We should think of this as a problem of multiple factors in the chain. Then comes the change. The change is a factor change, a credential change. We can change the password. We can change the primary means of accessing the account, like a phone number. It looks like an account change, not necessarily a security
event, but still, this can be changed and the original account holder locked out. Then comes the session shutdown. Not many systems actually shut down sessions when they reset the password or reissue a token. The same goes for when a multifactor authentication device lets you use codes after their expiry, or when a new SMS message arrives and you are still allowed to use the older one. This is the kind of stuff I'm talking about. Then comes notification. Notification, as I've mentioned, can go to both sides. It can go to the original user who issued the request to
reset or regrant the access and reissue the trust, or it can go to the attacker. At the point of change, they can simply change the email or the primary means of receiving that notification, and now they will be notified but the original user won't be. And then, obviously, monitoring. As a site reliability engineer, I feel like observability is one of the things that you always need to think about. So alert yourselves, and look at spikes in attempts to log into a particular account or from a particular IP address. You need to be observing. You need to be
monitoring this stuff. If you're not doing that, I think you're more likely to get hacked. Let's go to the next one. These are the failure modes that I feel are the most important ones. There are many failure modes, and honestly I feel like attackers come up with more every day, but that doesn't mean we should try to cover everything at once. I think there are core failure modes that give you a core understanding of just how easy it is to get into an account. The first class is signaling and user enumeration. I'm sure
everyone is familiar with the typical simple user enumeration, where an attacker sends requests to see which users are in the database and which emails are not. And it's not just going to happen through different text, right? Attackers can observe different things that happen with that account. The response wording can differ for particular operations, so they input something and monitor what happens for that particular input. Then the response times, as I said: if there is a small discrepancy, even of a couple of milliseconds, and you notice that, and
typically people who really badly want to get into the account do notice, then it becomes very obvious that the account might exist, or might not exist so they shouldn't waste time on it. Then the email sending behavior: do you send an email straight away, allowing the fastest recovery and reissuing of trust, or does it wait for a second and still give away some details? And obviously the rate limiting differences. When you're rate limiting particular operations, for example when someone is brute forcing passwords or brute forcing emails, say that you're rate limiting them in
different ways: that is a discrepancy we would be looking for if we wanted to hack into that account. Then status codes, different status codes. Developers honestly like to use different codes. I don't know why. Well, no, I know why: because they want to look smarter. But honestly speaking, using different status codes is probably just going to give away more information about the account, and that is probably not what you want in your system. If you want to be secure, you need to stay as generic as possible whenever possible, because generic is good. Generic means that you
are not giving away any details. You're not supposed to be communicating with the outside world unless the person actually exists and it's trust that we want to issue. Our system should not communicate any changes or any information to the outside world. And obviously the UI state. UI is quite advanced at this point in time, we are looking at masterpieces in terms of UI, and yet we see messages like "this email does not exist". To the attacker it basically says: okay, this one doesn't exist, we can cross that one out, let's try another one. And say rate limiting on
those emails is a bit different from the rate limiting on emails that actually exist. And now we're talking. And obviously the API response shapes. Maybe as developers we use one shape and template for the JSON response that comes back when the account doesn't exist. Maybe we even say that the account doesn't exist in the message of the JSON, or use some kind of coded message, and then we do a different thing when a different API response comes through. So I think that here the oracle is in the details, the smaller details that we want to make sure we don't give away, and we don't
want to let the accounts signal things that we did not intend them to signal. Failure class two concerns the token lifecycle and replay. I've already mentioned it, but it's worth talking about in more depth. This is a typical, super simplified path of the lifecycle of a token. You can see that we can create the token. It will then be active until something happens to it. It can become suspended at some point. It can also be deleted. So this is a very simple example of how you can manage the token. But there are more complex
cases where you'd probably have different states based on different operations and different environments. That is when it becomes very important to make sure that you expire tokens in the lifecycle when they're supposed to be expired. Whenever I start designing the architecture for authentication flows, I actually start with the paths that recover accounts. I start with the backend shape, and specifically the token lifecycles, because I need to understand exactly what will happen at each point in the chain when the person is restoring access: what kind of cases do I want to avoid and
what kind of cases do I want to support. So we should ask questions like: who is this token for? What action is this token permitted to perform? When I work with AI systems, what action should be performed with a specific token is an interesting question. If you think that you can issue just one token for all of the actions, and grant an AI model on an MCP server access that allows all operations, then you should probably think twice, because it can act on your behalf with that token. And especially if you don't expire it, it's going to be a really big mess
because it can grant things to essentially any application that might be on its radar. Next question: how long is the token valid for? The time to live of the token needs to be set explicitly, and it should be very short, 15 minutes or 30 minutes depending on the system, but you have to make sure that the token expires. It needs to expire because the token exists in that system for the purpose of recovering an account. Recovering an account is an action that you undertake in a specific period of time. You don't want to be recovering your account over the course of a week, over the course of a
month or a year. You simply want to perform that action so that you can access your account soon afterwards. When you don't expire the token, this becomes a severe vulnerability, because it means that this token is active somewhere on the internet. I would be able to use a model to pull active tokens from across the internet, in maybe five minutes, and just check whether each one works and what system it belongs to. Then, first use only. First use only basically means the token works only once, and you expire it straight after
that. You do not allow someone who has already used that token or that link to reset the account access again. You need to issue a token again, and it's not going to be the same token. It's going to be a different token this time. And obviously the mechanism that invalidates the token is very important as well, because we need to think about exactly how we will invalidate the token, and whether it will be secure enough that the token is not still alive after we invalidate it. I think this is pretty aggressive, but it's very much needed in our systems, because if we avoid
making tokens too broad and too permissive, then the system will be more secure, and I think the most secure designs start with the tokens as well. Failure class three is leaking through URLs. This is the one where you generate a token and then include that token in the URL. You want to make sure that the exact token is not leaking through that. For example, in some systems we have logs, right? Like Datadog or Grafana. And then we log what
happens on the system, which is great, but then I see that the logged URL of what is happening on the website shows the exact token that is used to authorize the person. And knowing that logs can be stolen and systems can be hacked, this is not a really secure way of storing session URLs. Leaking the token through the URL is something I see a lot in systems, and this is bad. Analytics, yes, they are good, but they can do a lot of damage if you store very sensitive data in a
non-permitted space that is not the intended destination of the token. You shouldn't be storing what is not supposed to be there. With analytics, I think it quite often happens that you just store the wrong data, for whatever purpose, and it accidentally happens to be there. So yes: strong token, weak plumbing, and the same incident happens over and over again. Think of when something breaks in our house, the plumbing for example. We know that there's probably not one point of failure. We know it could be bad across a couple of different stages.
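
To make the logging point concrete, here is a minimal Python sketch of scrubbing a recovery token out of a URL before it is written to logs or analytics. The function name `redact_url` and the parameter names in `SENSITIVE_PARAMS` are assumptions made for illustration; they are not from the talk or from any particular logging product.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of query parameter names that may carry secrets.
SENSITIVE_PARAMS = {"token", "reset_token", "code"}


def redact_url(url: str) -> str:
    """Return a copy of the URL that is safe to log."""
    parts = urlsplit(url)
    # Replace the value of every sensitive query parameter.
    query = [
        (key, "REDACTED" if key.lower() in SENSITIVE_PARAMS else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    # Drop the fragment as well, since tokens are sometimes carried there.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), "")
    )


print(redact_url("https://example.com/reset?token=abc123&lang=en"))
```

Redacting at the point where the log line is built keeps the raw token out of every downstream stage of the plumbing, rather than relying on each stage to filter it.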
The same can happen with that URL. It can travel from one point to another and end up in a completely different place, in the hands of a sysadmin or a person who forgets to lock their laptop with a password, and then the open link is on their screen. Anyone can easily access that and use it, and that kind of plumbing is bad for the system. If you want to have a system that is secure, you need to think about securing the area surrounding the things that you are working on. So if you're working on tokens, make sure that you're
creating a safe area for that token, with a lifecycle that stays within a secure environment, contained and not allowed to leak out to wherever it wants to go. Let me name a couple of the leakage paths that I can think of here. Browser history: all of us know that things persist in the browser history. That is a leak. That's why I always say delete your browser history, because it can pose a risk with weakly configured websites, and you can put your tokens at risk. Then link scanners, which scan across the
internet and can potentially leak. That's why, if you don't expire your token, this is where it probably becomes the biggest liability: in those scanners, which aggregate a lot into one place. Then unsafe link generation, when we generate the link the wrong way in the first place and put stuff in it that is not supposed to be there, so the URL contains a lot of things that were never supposed to be in it. And email previews. Email previews are quite a thing, because when you look in the email you can see the link, and then you don't delete that email. This is a leak at this
point. Failure class four is multifactor reset, email change, and the weaker alternative channels. This is an important one. Multifactor authentication is quite an advanced technology that is used everywhere for authentication right now, which is great. The problem is that every single time you configure an authenticator to take on the responsibility of reissuing trust for your system, you're creating a security event. It is a security event in the sense that, when we link a third-party authentication application with your system, we're granting that third-party system access to your system and reissuing that trust. So we would then
be able to access anything. Then there is the point when we want to change something in the account and take control of it. It is so easy if we have that multifactor authentication set up and access to that authenticator device. It doesn't even have to be a device: there are now plugins and browser extensions. So you can easily generate those codes once you have the account, and especially if it's a cloud-based account, you can easily take over any session. In this sense I would say that if you want to create some kind of link to any third-party authentication, you need to think of fallback paths that will
allow you to be more secure. If you do not do that, then the paths that you are not maintaining become the weakest links in your authentication. So when you go to the login page, it's absolutely secure. It's great. You can be the top leader in login pages, but your support paths can be easily found: the fallbacks that you're creating for authentication, like for multifactor authentication. Those weaker channels will be the giveaway that your system is not really secure. The mistake that I see teams making is resetting credentials without actually shutting down all of the other sessions. I believe that you
shouldn't really give the user a choice about resetting the sessions, and you should always shut the sessions down. Think about the case where you are authenticated on different devices. What if one of the devices gets stolen? By default, you need to let the system reset the password and reset every single session that is open. It should not be a choice. But I keep seeing that it's still a choice in systems, an option for advanced users who understand that the implications can be serious when you don't remember that a phone that was stolen, maybe
two months ago from you, actually has an active session that can still be accessed. This is really bad, especially with bank accounts or other applications holding important, sensitive information. I always say that "password changed" is not "account recovered", because we don't know what happens within that process of changing the password. The password can change, but the account might still be in use by different actors: you as the primary actor, and maybe others who are attackers intercepting your sessions, maybe some connected devices, maybe the APIs that you've granted access to at some point. So those
accesses would still be present, and I'm going to show you a demo that shows exactly that. Let's look at the demo here. I have provided a bit of a description of what is going on. We're going to see that in a sec. Let me just put it on the bigger screen. So this is a dashboard. We can see that there's just been a login, and a session was created for one user. We now go to the attacker session. We're now in the attacker's browser and we've just logged in. These are the open sessions.
Then we go back to our normal browser, the victim's browser. We reset the password. This is a really good message: if the account exists, then we'll send the email. This is a typical link to restore access. Now we copy the link, paste it in, and go back to reset the password. We input the new password, and notice that we are using the token there, which needs to be reset after that. Now we can log back in to that session. So we as the victim think that we have just logged back in, but in the active sessions we actually see that nothing has been revoked. I hope I go to the session
here. Yeah. So the sessions are not revoked. You can see in this browser, the attacker's browser, that it's still active. All three sessions are active: the previous session of the victim's browser, the attacker's browser, plus the new one issued with the new password. So this is the kind of trust we have. I'm reloading the page and it's not going anywhere. We're still logged in on all the devices that we were logged in on. This is scary. Imagine you've logged into your bank account on a computer somewhere in the house, then someone comes, steals that computer
and then accesses the session, which is still going to be present. Or someone accesses your account without your knowledge, and they will pretty much still be there. Let's go to the next demo. This is the fixed flow. This is where our recovery artifact burns after use. It's important to notice the difference. We still log in. We get our session ID. We go to the sessions. This is our active session in the victim's browser, which is great. We want to go through the forgot-password flow.
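
The forgot-password request at this step is where the generic "if the account exists" reply matters. Below is a minimal Python sketch of such a handler, under stated assumptions: `request_password_reset`, `known_accounts`, `mail_queue`, and the reply wording are hypothetical names chosen for illustration, not the demo's actual code. It returns the same status, shape, and wording whether or not the account exists, and hands email delivery to a queue so the response does not wait on sending mail. It does not by itself equalize every timing difference, and storing the token is left out.

```python
import secrets

# One reply for every request: same status code, same shape, same wording.
GENERIC_REPLY = {
    "status": 202,
    "message": "If the account exists, we will send an email.",
}


def request_password_reset(email, known_accounts, mail_queue):
    """Handle a forgot-password request without revealing account existence."""
    normalized = email.strip().lower()
    # Always generate a token so both branches do similar work.
    token = secrets.token_urlsafe(32)
    if normalized in known_accounts:
        # Delivery happens later, from the queue, not inside the request.
        mail_queue.append((normalized, token))
    return dict(GENERIC_REPLY)


queue = []
accounts = {"a@example.com"}  # hypothetical account store
print(request_password_reset("a@example.com", accounts, queue))
print(request_password_reset("nobody@example.com", accounts, queue))
```

Both calls print an identical reply; only the queue, which the caller never sees, differs.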
Now we can see that in the attacker's browser we've just logged in. Both sessions are active. In our victim's browser, we're now going to reset the password. We again provide our email, get our recovery link, and paste that in. The token is there. New password. Great. We've updated the password. Now we're using the new password, and we can see that the sessions have been revoked. There is a timestamp there. So we only have one active session at this point. We don't have any more. The attacker's browser will be logged out on reloading the page, and they will not be able to log in with the old password. So this is the kind of stuff I'm talking
about. We need to be more precise with the things that we are doing in our systems. We need to make sure that whatever is going on when we reissue trust is actually worth it, that it actually makes sense, and that we notify the actors who are in that recovery loop. The recovery path should probably be the first path that you create, rather than redoing your login to make it beautiful and standards-compliant. Create a good reset flow to make your sessions secure and keep your users safe. This is the first thing I would do if I was
developing another authentication path. This is the shutdown sequence that we want to see in production. This is the correct behavior that we actually want to have. It starts with burning the artifacts. As I said, the first thing we did in that demo is burn the artifact that was already used: the token that we used to restore access, and the token that was previously issued to keep the session alive. We also want to revoke the sessions. As I said, this is probably one of the most important ones. In fact, not probably, it's definitely one of the most important ones. We need to revoke the sessions and
make sure that whoever already had access to the account no longer has it, because we've just changed the trust and reissued it. Then we need to rotate the tokens. We can't use the same token all the time. We need to rotate them and make sure we're using different tokens for different operations, with different access grants and different flows. And then also require reauthentication at some point. I know it's very convenient to have the session always on. We could in fact keep the session open all the time, and some systems do. But I really like what the banks do with sessions. You keep kicking people out
of the session so they show once again that they are who they say they are. This is important stuff that we need to do for all systems, whatever they are: a shopping system where you have your address information, or some kind of game account where you have your great trophies or the number of hours you've played. All of it requires that reauthentication once a lifecycle has run its course. In fact, one of the systems that does this quite well is npm. The Node package manager is a manager of the packages for your Node applications, and
it basically allows you to create a token that lets you act as the account owner and authenticate with that token. But that token can expire in a year, in six months, in two months, in one month, or every week. That is a good strategy for managing secure tokens and their lifecycle, because it makes you keep proving that you are who you say you are. And then, notification of the user. When I showed the demos, there was the email that came to the inbox of the user who was restoring the account. That inbox can easily be changed if someone has access to your
account already. If someone wants to change the means of logging into that account, they can absolutely change your email, and if they do, you will not be notified. So again, we need to make sure that we notify the user when something has changed. The original user needs to get the message as well as the new email holder, so that when something goes wrong we know about it. And yes, signal: whatever is happening, every event needs to be signaled and recorded in the logs, but without sensitive information. Monitoring is, as I said, one of the
key things here. If you don't see what is happening in your system, you cannot improve it. If you don't see how your tokens are operating at the moment, especially if you're in charge of a legacy system, you cannot improve them and you cannot reissue them. You cannot manage the lifecycle properly. So the first thing you need to do is inspect exactly what is happening at each point in time when you reset your password, reset the credentials, reset anything within your system that is on the back side of the login. You need to make sure that there is a continuity of
the path that the user can follow to go back to the login, log in again, and get back to the application. We also need to account for the fact that we have secrets, and storing secrets needs to happen securely on the server. We really need to think about how we can securely store the credentials without letting anyone else access them, even the people who are managing the system. That way the investigation, when something happens, can actually be effective, rather than ineffective in the sense of seeing what is going on but not being able to do anything about it. So
there is a builder checklist that I have created for you to download. Feel free to follow that link. It's not malware. It's just a QR code for the PDF. You can get all of the checkpoints that I put together for the talk today, and you can use them on Monday to harden your authentication systems. If you're managing dashboards, if you're managing something that involves that login path, then you can absolutely use that checklist to work on the scope and improve it. And all of these are also quite
important. I think that is all for today. If you have questions, please do ask me anything to do with authentication and resetting of tokens. Thank you.
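
The shutdown sequence described in the talk (burn the artifact, revoke the sessions, notify the user) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code behind the demo: `RecoveryStore`, its in-memory dictionaries, and the 15-minute `RESET_TTL_SECONDS` are all hypothetical, and password hashing and real email delivery are left out.

```python
import hashlib
import secrets
import time

RESET_TTL_SECONDS = 15 * 60  # short, explicit time to live (assumed value)


class RecoveryStore:
    """In-memory stand-in for a real database, for illustration only."""

    def __init__(self):
        self.reset_tokens = {}  # sha256(token) -> (email, expires_at)
        self.sessions = {}      # session id -> email
        self.passwords = {}     # email -> password hash supplied by caller
        self.outbox = []        # (recipient, message) notifications

    def login(self, email):
        session_id = secrets.token_urlsafe(32)
        self.sessions[session_id] = email
        return session_id

    def issue_reset_token(self, email, now=None):
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        digest = hashlib.sha256(token.encode()).hexdigest()
        # Store only the hash, so a leaked table holds no usable tokens.
        self.reset_tokens[digest] = (email, now + RESET_TTL_SECONDS)
        return token

    def complete_reset(self, token, new_password_hash, now=None):
        now = time.time() if now is None else now
        digest = hashlib.sha256(token.encode()).hexdigest()
        # 1. Burn the artifact: pop() makes the token first use only.
        record = self.reset_tokens.pop(digest, None)
        if record is None:
            return False  # unknown or already used
        email, expires_at = record
        if now > expires_at:
            return False  # expired
        self.passwords[email] = new_password_hash
        # 2. Revoke every existing session for this account.
        stale = [sid for sid, owner in self.sessions.items() if owner == email]
        for sid in stale:
            del self.sessions[sid]
        # 3. Notify the account holder that trust was reissued.
        self.outbox.append((email, "Password reset; all sessions signed out."))
        return True


store = RecoveryStore()
store.login("a@example.com")   # victim's session
store.login("a@example.com")   # attacker's session
link_token = store.issue_reset_token("a@example.com")
print(store.complete_reset(link_token, "new-hash"), len(store.sessions))
print(store.complete_reset(link_token, "other-hash"))  # replay attempt
```

After the reset, nobody stays signed in on the old trust, so the legitimate user has to authenticate again with the new password, and the second use of the same link is rejected.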