← All talks

Time Traveling Exploitation: Remote Code Execution in a 10,000 Day Old Game Protocol

BSides Canberra · 202523:071.4K viewsPublished 2025-12Watch on YouTube ↗
Speakers
Tags
CategoryTechnical
StyleTalk
About this talk
BSides Canberra 2025
Show transcript [en]

time traveling exploitation, remote code execution in a 10,000 day old game protocol by Riley Kid. Let's welcome him to the stage.

[applause]

>> Hi, my name's Riley. So, I work at the National Australia Bank in adversary simulation where we proactively do what our tracked threat actors do. Uh, I've created trainings on Udemy, TCM, and Pluralsite. I run the 247 CTF, have a YouTube channel and a blog. Uh, and I like to collect securitys. I'm also an Age of Empires fan. Uh, and this is my first time making it to Bides CRA. Had tickets a few times, but first time I've actually been able to attend, so really looking forward to it. So, what are we doing here? Time traveling, exploitation, remote code execution in a 10,000 day old game protocol. We're going to travel through time using different tools and

techniques to exploit an old game in different period correct ways. And these are the tools we're going to be using and their release date. So, I'm not an internet historian, but what I will say is any screenshots you see in the presentation, we'll be using tools or software from these release dates. And the most important thing here really is what happened in 2002. So back in the day, serial boxes would have physical promotion. So one of the promos included video games, Need for Speed, Age of Empires, uh, and Age of Empires really fostered my interest in computers, hacking, and the internet, which has remained to this day. Uh, and I also want to mention here that there's a lot

of tools and techniques to talk about in 25 minutes, uh, which have evolved over the last quarter century. So I can't cover everything in detail, but feel free to find me later if you want to go into some more detail. So Age of Empires is a historical real-time strategy game and the game has the concept of different ages which you advance through as part of the game. And when you advance to a new age, you get access to different tools and more advanced technologies. So using that analogy, we're going to start in the most basic age of the game, the Stone Age. And in the Stone Age, we had access to the game running on Windows XP. And

of course, we had access to Neutrorain. So, you're a young kid, exposed to the internet for the first time, exposed to hackers. So, what do you do? You jump on Ask Jeeps, ask the original AI chatbot, Bonsy Buddy, how to download Age of Empires hacks, you download some random binaries, and you run them. Uh, and you end up not having a great time or using some assets from the game series, you've been defeated. So some time passes, some new skills and tools emerge and you advance from the stone age into the tool age. And in the tool age we have access to OI, a debugger, Ethereal or Yshark as it's now known for network analysis and we also

have access to an early version of IDA. So a disassembler decompiler. So what do you do in the tool H? You open up the game binary in IDA and you start looking at functions and you can find the classic sprinf pattern and sprinf takes a format string and some arguments and formats that string into a buffer and this function here is vulnerable to buffer overflow. You can see the function sets the counter to the max signed 32-bit value which is used for bounce checking. And if the caller's buffer is smaller than the generated format string, data will overflow past the buffer without any bounce check and an attacker could gain control of execution flow. But is it exploitable? So in the tool

age you can use IDA to find where the function is called statically using cross references or you can also hook the function to find the calls and parameters dynamically. And once you know the instances of where the function is called, you can look for interesting use cases which could be exploited remotely such as during a multiplayer game. And what you can see here is the output of the format string built when you are looking to join an advertised game. And if you translate that output back to the source format string, you can find it in the data section and then find references to that function address which you can then look at in either statically or ollie dynamically to

understand how the function is being called. And if we look at the size of the buffer buffer address, we can see 144 bytes are reserved on the stack for the format string. So can we write more than 144 bytes? So by the game user interface, no. But what if we don't use the UI? So if we look in Ethereal, we can see the unencrypted packets for the game name are being sent over the wire. But what does the game's packet flow actually look like? So performing packet analysis and reading hundreds of pages of docs, we can understand the protocol is direct play, which is an old peer-to-peer protocol used to manage multiplayer games. Basically just a bunch of packets

sent between different parties to negotiate a game session. Uh and it's old. There's no pip install here. But by understanding the protocol, we can recreate the process to spoof a fake game name. And what's interesting is the packet structure based on the documentation allows much more than 144 bytes for that game name variable. So what happens if we advertise a game name larger than the known buffer size? Well, when someone searches for a new game, we can exploit a classic buffer overflow. And just like that, we can control the code flow by controlling the instruction pointer. So we've done it. We are victorious. So what's next? We send a de brown sequence to get the instruction pointer

offset. We can also see our payload on the executable stack. So we know where we can execute code from. And we also need to find out bades which would break any exploitation. So by sending every possible bite. We can see null bytes and a bunch of bytes in that 8 0 to 9F range are converted to to 2D. So we'll need to avoid these. So what's next? We know the offsets. We know have bad bites. We know data is on the stack. So, let's pivot there. Windows XP doesn't use direct execution prevention. There's no address space layout randomization. So, we can find a jump ESP instruction in system DLL which will be loaded at a reliable address for

that operating system and can be used to jump to the stack. This works because the jump ESP instruction will transfer execution to the memory address currently stored in the ESP register which currently points to our shell code. So, what does this tool age attack look like? So on the stack we have our overflow bytes. Then the address of jump ESP from the user 32 DL which is the next instruction which will be executed once we overflow and control the instruction pointer which will then redirect us back to our payload on the stack. But we can't directly execute a payload because of the bades. So first we need a way of encoding a payload which doesn't

use any bades. So in the tool age, there was no out of the box solution for this restricted bad bite character set. So we need to find something a bit more custom. Now the format of the exploit on the stack is the decoder stub followed by the encoded payload followed by the encoder offsets. And using some strange instructions due to the bades, we need to create a sub decoder stub. So what does that stop do? Well, first we need to set up the length of the payload using negation. Then we can use a jump call to push the current location to the stack. Set up some registers. So EDI points to the shell code and ESI points to the the

decoder. Then we loop through each encoded bite, subtract one because we can't use null bytes in our shell code. Then subtract the encoder values in place. And once the loop is finished, we can then jump into that now restored shell code. So for example, if we want encoded shell code containing 41, we would have 46 in EDI and six in ESI. Subtract one from AL gives 5. Subtract 5 from 46 in place gives 41. We can use this same method to encode every possible bad bite in any given shell code. And what is the shell code? So same setup, we need to do this manually. So we set up a stack frame, traverse the process environment block to locate the

loc lo loaded modules. We calculate the hashes to resolve libraries and functions we need. Then we load those libraries and use those functions to set up a TCP socket. And lastly, we create a command process and redirect the coms pipe which ultimately gives a reverse shell. So what does that exploit look like? Now we jump to the stack, execute the sub decoder stub, then execute the unencrypted unencoded reverse shell shell code. And with that we are victorious in the tool age. But exploitation didn't end in the tool age. So let's go up to the bronze age and do this again but differently. So same vulnerability but some new tools and tricks for the exploitation including metas-ploit.

So what did we actually do in the tool age? So we use a static system DL address to jump to the stack. So this works if the other player is using the exact same operating system version with the same offsets, but otherwise it will likely crash because there'll be different instructions at those hard-coded addresses. So we could look at using some other kind of solution. So the partial overwrite doesn't work because of the percentage 20 for the space. We can't control the next handler for the structured exception handler override and the heap. We can't control data flow at this point in the game. And it's also an unpredictable location. So that's what we can't do. But what can

we do? So Frack 49, Smashing the Stack for Fun and Profit from 1996 discussed a lot of these concepts that we've covered so far in the tool age. But Frack 58 from 2001, it discusses a relatively newer type of attack. So return into lib C, which leverages remote uh return oriented programming or rock gadgets. And these gadgets are snippets of existing code which usually end in a return instruction. And this return allows you to chain together multiple instructions while maintaining stack control because after each gadget return executes uh you return back to the stack. So for example, pop ax red is a gadget which pops a value from the stack into the ex register and returns back to the stack.

So you can chain a bunch of these small gadgets together to do something. So in the context of our game exploit do something would be to jump to the stack so we can execute our shell code. So Empire X is the main game binary and has a preferred load address containing a null bite. Contains a good jump ESP trampoline but the address is not directly usable in our exploit due to bades. However, the game also loads a language library which does have usable gadgets in its memory addresses but the gadgets are not good. So they redirect code flow away from the stack or they return too far away to be useful. So once again we have been defeated.

But what can we do? So let's take another look at those bad bites and see if we can increase the scope of our gadget corpus. So over the wire the game name is sent as a wide string. However in the game view we can see the value represented in ASI. So how is this code page translation happening? So on Windows, if you have a wide character string in little Indian UTF-16 and you want to convert to another code page, you would usually use the standard Windows API function with, for example, default parameters. But how does this function work? It uses a code page mapping which specifies how values in one encoding like UTF-16LE map to character values in another

encoding. So if you send UTF-16LE of 410 0, the function recognizes this as the unic code point for the character A, which it then looks up in the target code page to find that single bite representation. And this process also specifies control codes which can be used to give meta information about the text and their mapping to printable characters is less predictable. So that was why we saw those 2D bytes before in our bad bytes array. We were sending non-printable control codes. But we can also see in this mapping is that if we want to have for example 88 in our exploit, we can send C62 instead of 8800 and the code page conversion will

perform that translation into something useful for us. So that gives us a smaller bad bites corpus which means easier shell code and more gadgets. And this first gadget allows us to control the data present in specific registers by preparing the stack. This second gadget allows us to jump to the value at a specific address. And this third gadget allows us to exor data at an address with a value in EBX. And we can chain these gadgets together to jump to the stack. ECX contains 41s from the overflow and we control ESI from the first gadget. So, we can do some 32-bit math to perform the exor at the jump address used in the second gadget. And what do we want to write

there? We can use the jump ESP instruction from the main game binary. And we can calculate the value for EBX based on the current data stored at that address exord with the value that we want. So, this is just a reminder of the tool age exploit using the hardcoded jump ESP instruction. And now with our bronze age version, bronze age version using a gadget chain. So the first gadget updates the registers with the specific values from our calculations and returns to the stack. The second gadget uses exor to write a specific value to a specific address and returns to the stack. The third gadget then jumps the value we just wrote at that known written location which takes

us to the jump and speed instruction in the main game binary which trampolines us to the stack where we can execute our payload decoder and finally the decoded payload. And we also have access to MSF Venom type tooling now which makes it much easier to create the actual shell code and decoder itself. So using bronze age techniques we are again victorious. So now let's enter our final age, the Iron Age. And in the Iron Age, we have access to modern tools including Gedra, the reverse engineering framework, Gedra's MCP server, and large language models. And I'm using a specific logo here as it had the best results during testing, but I experimented with Google Anthropic and OpenAI models and APIs for

this use case. And there were four questions that I wanted to answer when trying to use large language models for this problem. And I also want to point out I didn't provide in context learning, didn't give few shot examples, didn't have access to specialized or non-g guardrail models. Uh and agree that better prompts will definitely give better results. But the general process here was to first ask highle architectural security questions, then ask small specific focus questions and challenge the LM on its responses. And I also made multiple attempts with cleared memory using different models to try to get the best answers for these four questions. So let's start with the first one. Can the LM do the reverse

engineering? So to answer this question, use a GDra MCP server and a model context protocol. MCP server is just a service that provides the LLM with standardized access standardized access to tools. In this case, that tool is Gedra. Uh so now the LM can look at binary functions, decompile functions, rename functions. Basically, anything you can do in GDra, now the LM can do it too. And with access to GERA, the LM was really good at reverse engineering. So adding comments, renaming functions, variables, understanding control flow, and focusing on specific prompted areas of the binary like multiplayer code. And you can see here how easy to read the code becomes. The setup and buffer size for that vulnerable function call here

is much more obvious than in that original IDE example from the tool age. So can the LM help with reverse engineering? Yes, it can help. So, what about finding the actual vulnerability? [clears throat] This is not so clear. So, the LMS were quite obsessed with format string vulnerabilities which didn't appear to exist. Uh, but it did find the vulnerable sprinf function but not the actual vulnerable call despite multiple prompt event uh attempts to find it. It did find some vulnerabilities that I didn't see. So, it found a one bite overflow, found a way to write an arbitrary null bite to memory. Uh so it could definitely find some vulnerabilities but only after a lot of coaching and

pushing to find the known buffer overflow and to stop talking about format string vulnerabilities which it really didn't want to do. It would eventually find the vulnerable call but only after being told specifically where to look. So could it find the vulnerability for going binary yes or no? It has to be no for this specific instance of the vulnerability. Can it create useful fuzzing scripts? So creating and updating fuzzing scripts and helping to test out ideas was where it was the most helpful. So even if it needed to be convinced to do it, sometimes saying it was for homework or using an API seemed to be the most reliable way to get around most of the hacking objections. And one of

the exploit paths I explored was looking for bytes in the language binary which by chance ended up as pointers to code in the game binary. And when I explained this approach to the LLM, it could code up a working script after a few tries and even made some good suggestions to look for offsets rather than direct matches as well. And it was also helpful to beautify the working exploit script which we will see in the demo soon. So could it create useful fuzzing scripts? Yes, absolutely. So what about the big one completing the exploit chain? So the first step here really is can it interface with the game? So the LLM understood the request especially with

access to uh GED through the MCP server. It understood the general process to set up the game and it created code which could run. Uh but the code wasn't able to correctly create valid direct play packets. Even when feeding it back errors, it couldn't figure out how to properly interface with the game. But it did create a bunch of functions and outputs that made it appear like it could. So what about the rock chain? I supplied it with all the gadgets and asked it to identify ways to pivot to the stack and it came up with a simple much more elegant solution uh using an exchange instruction than what I just spoke about that I found manually uh and I was

surprised that I missed it but when you went to the actual address it hadn't considered the change to the control flow. So even though the gadget does technically exchange the values in the registers it also jumps to a random location. It doesn't directly return to the stack. So this gadget isn't really usable to solve the problem. Well, could it create the shell code decoder with that original restricted character set? So the dark background here is the LLM stub and the lighter background is my stub. Uh the LM decoder sort of works and it's a pretty similar sub decoder solution, but it uses a static key and only supports shell code up to 255 bytes. And this is another example where the

LLM was good at writing scripts. When informed, it accepted the limitations and created a script to fuzz for key bites to use with specific shell code. And the fuzzer identified a number of bytes for a simple short shell code example. But with actual working shell code, there was no usable key. So it wasn't really a functional decoder, especially considering the added size limitation. And reprompting in this case also got you stuck in a loop where it wouldn't help refine the decoder as it was viewed as hacking even though it outputed the original content in the first place. So can it complete the exploit chain in this case? No. So enough stalling. Time for the demo.

So what you can see here is starting the exploit script which the LLM has beautified with a bunch of logging. [snorts] On the right you can see the user is searching for a game. So it's looking uh on the broadcast domain for a game while the exploit script is handing any of those direct play packets. It's caught the packet. It's spitting out a bunch of debug information about what's in the packet. It's crafted the exploit script. It's put the encoder and the shell code into the packet and it's send it as a response for the fake game name and it's also handled the reverse shell. So now we have an interactive shell where we're able to execute commands on

the remote host after the user has searched for a game. And a cleaner exploit would be to fix up the game, hide the command window and continue execution of the game without crashing. Uh but I didn't want to go into that. So what's changed and what's stayed the same in the context of exploiting this type of vulnerability? Well, let's start with what's changed. So network infrastructure, we've gone from that direct peer-to-peer connection with required packet foring at your router, by the way, to centralized servers using encrypted uh communication streams. tools have evolved from basic and manual scripting to frameworks like metas-loit advanced disassemblers like IDA pro and gedra which can be leveraged to perform LM assisted analysis. We've gone from

basically no exploit mitigation protections to ASLR, depth, stackaries, control flow, integrity, and much more. Many of which would break this actual exploit chain. And I don't think this vulnerability is really exploitable on a modern system, even if you could get it to run. And with this comes a higher barrier of entry. So trivial bugs are less common. And modern exploitation requires an understanding of modern mitigations which often requires more complicated exploitation chains rather than these just single point in time vulnerabilities and payload and encoder generation has moved from manual per exploit processes to more automated and featurerich frameworks. But despite that what has remained the same. So the base concepts, understanding memory corruption, controlling instruction flow, and those basic

exploit conditions are still foundational to understand and build more complicated attacks. The basic exploit process, so understanding offsets, finding bad characters, understanding and navigating mitigations is still a core process. Even as those mitigations have become more complicated over time, that basic vulnerability identification, so reverse engineering, fuzzing, code tracing are still useful discovery tools and processes to identify these types of vulnerabilities. And with them the low-level primitives, so understanding assembly, op codes, calling conventions and registers to understand exploitable vulnerabilities and build reliable rces. So what is the takeaway here? Well, the game has changed literally and figuratively in this case, but the core foundational principles still remain the same. And that was the journey of time

traveling exploitation from 1997 to 2025.