
All right, thank you for coming to my talk. It's called Unlocking the Secrets of Stripped Go Binaries at Runtime. By way of introduction, my name is Alex and I'm a security researcher at a company called Foundry Zero, which is up in Cheltenham and does low-level research and development. So we go right to the core of how things work, which means that this talk is going to be really technical. So, fair warning: if that sounds really scary, this is your chance to leave. Don't say you haven't been warned. For the agenda, I've split this into two parts. This is all about reverse engineering binaries that
have been written in the Go programming language. Actually, I'll give a slight introduction to that first so you're not left completely in the lurch. Then we're going to introduce some tools that get used by professionals for reverse engineering Go binaries. Those use a static approach, which means the program isn't running when you're actually doing your analysis. But I quite like dynamic analysis, which is doing analysis of a program while it's running, because you can often see data as it's moving around. That's really quite helpful and can be a really quick way to answer a question: often you just see the answer. So we're going to try doing dynamic analysis on a Go binary and we're going to
see why it's really hard, which is going to lead us into part two, which is about trying to have an approach for doing dynamic analysis that is a little bit smarter. It's going to take some of the techniques that are used in these cutting-edge tools for static analysis and start to apply them to dynamic analysis to really turbocharge reverse engineering. And actually we've written a tool for this. So if part one is going over your head and you're thinking, "just show me the tool", we will look at that at the end. So there will be a couple of points where you can jump back on. I'm
going to have to be quite quick because we've only got 40 minutes, so I'm going to get on. So, breaking news: memory-safe programming languages are here. Actually, it's not that new. These things have been around for a while; 2009 was the first release of the Go programming language. And the whole idea was that it was going to usher in this new era where bugs can't exist anymore. It's not true, but you know, it's a good idea. And if you're a reverse engineer, you don't actually get to pick what binaries land on your desk. If my boss says, "How does this thing work?", I
don't really get to choose how the developer made it. And one day it was a Go binary. So I put it into my decompiler, which is Ghidra, and it spat out something that looked almost exactly like that. It wasn't exactly that, but it looked a lot like that. And if this looks comprehensible to you, well, it's not comprehensible. So we're going to have to do some work on being able to understand this and being able to reverse engineer it and see what's going on with it. I just want to give a quick disclaimer about Go, which is that when you're talking about Go, every single version of Go is a different Go.
So this is a recent version of Go that we're going to be talking about, and I'll be diving into some of its internals, how it works, and how we can get some useful stuff out of it. So this is for Go 1.24. And on the off chance you are doing any reverse engineering of Go, I highly recommend you have the open-source repository explorer open in your web browser. You can look at the release branch for each of the different versions of Go and just see the truth for what the structs are at any particular version. But that's kind of by the by. So let's wind this back a
bit and start talking about Go, because maybe not everyone knows about Go. Go is a programming language, and it was designed to be readable, a bit like Python, but they also wanted to keep some of the speed of languages like C. They wanted a sort of middle ground. In order to achieve that objective, it's a compiled language, so it's machine code. But it does have something called a managed runtime, which means that there are some things going on as it's running to take a little bit of complexity away from the developer, particularly to do with memory management. So in C it's DIY; in Go
they try and do as much of the memory life-cycle management for you as they can. So it's doing things like garbage collection, which means that if you've got pointers and they've gone out of scope, it can deallocate the memory for you. And one thing it has got which is particularly important for what we're talking about here is a static type system. If you're familiar with Python, a single variable is quite malleable, whereas in Go every variable has a single type for the duration of its life cycle. If you convert one type to another type, you'll create a new object.
So the objects stay the same type the whole way through. Another thing that's confusing, if anyone's done reverse engineering before and you've done lots of C, is that we're very used to this idea of a null-terminated string. So you've got your ASCII characters and then you've got a zero byte at the end, and that null byte is how you know you've got to the end of the string. In Go, strings are not null-terminated, which means you have a pointer to the start of a string and you get a length as well. So if you start trying to read it like a
C string, you get a string that's thousands of characters long, which doesn't make a lot of sense. That's because you're meant to use the length, and you're only meant to read, say, eight bytes. Go binaries are also statically linked by default; I'm going to gloss over that for a moment. One thing that's important to know is that the ABI is not stable. What I mean by ABI is the application binary interface: internally, when data is moving around the program and you call a function, how is it going to put variables into that function? That's called the ABI, and they change it from version to version. So it's actually not particularly C-like. If you're used to
reverse engineering C binaries, you're used to the idea that the first argument is going to go in this register and the second argument is going to go in that register. In Go it started off with passing everything on the stack, and now it's using registers. One of the things that's quite interesting is that in C we get this idea that a function call returns just a single value, whereas in Go you can have function calls returning many values, and they get returned in a similar way to the way arguments get passed into a function. So I'm not
going to go over this too hard, but you can see here that on different platforms these are the register orders in which you'd have your function arguments, depending on whether you've got floating point or integers. Most things go in integer registers; there aren't many floating-point numbers. You can see that on x86-64 it's relatively sane, and on ARM64 it's relatively sane. On RISC-V they decided to start at X10 and then go to X8, X9, and then X18. Yeah, I don't know why they've done that. So I'm going to sort of gloss
over this. It's not that important, but essentially there's a routine for how each argument is assigned to these registers, and if it fits in an integer register, it's going to go in an integer register. The thing that's quite important to note here is that if you have this thing called a struct, which is basically a collection of fields, then instead of passing a pointer to that struct as an argument, it will quite often split the struct's fields across registers, and we'll see that later. So now that we've gone a little bit into the way that Go works, there are some opportunities in Go that you would not
get if you were looking at a C binary, and that's even if you try and strip it, by the way. The name of every single function inside a binary is just stored in the metadata, and so are the fields. So let's say you've got a structure storing a collection of fields about a person: you've got age, name, associations with groups. Instead of just having that as raw bytes, you can now go and get the semantics of that data, and it's just stored right there, which in C you do not get. And the other thing is that these
things are not position independent at runtime. That basically means that every time you run the program it's going to be at the same location in memory, whereas the industry standard now for C binaries is that they shift every time. When you're doing runtime analysis, having things in the same location every time is really helpful, because it means that whenever you hit a breakpoint, for example, you get the same address every time you run the thing. That's not true for C. So there are really great tools out there. I said that this metadata is around, and Go gets used a lot to write
malware and things like this, so the sorts of people you'd expect to be looking at malware have written their own tools. Mandiant, now part of Google, has written a cool tool called GoReSym. There's also a tool called Redress, and for Ghidra there are scripts that go and try to repopulate things in the decompiler for you. But there's an interesting thing here, if I have my laser pointer: sometimes tools don't work. So this is me trying to get an example screenshot of GoReSym, and it just said it can't extract any strings. But this one here is Redress, and that could pull out exactly what we're looking for. So you
can see this is a struct field, and it's got, for instance, a name, package path, type, tag, offset, and index. This is what I mean about pulling this data out. So going back to our example, we've got a decompiler here and this is kind of like junk. What we can do is apply these static tools and try to make it a little bit more useful to reverse engineer. We can recover the names of functions: this is before and after for the names of these functions. You can see this is instantly more readable than it was before. And we can see here what's going on: it says, okay,
ListenUDP. Okay, that's interesting. And then we go along and we see it's going to do some reading from UDP. And then at the bottom here we've got this json.Unmarshal. So a reasonable theory here is that it's a UDP server, it's pulling in some data, and it's expecting JSON data. It's going to unmarshal that JSON data, so it's going to turn it into a Go object so it can start working on it. So that's what we got out of our static tools after just five minutes of looking. But there are still a few unanswered questions. What port is the UDP service listening on? What
object is being deserialized? We can see it's JSON, but we have no idea what it becomes. One of the ways you could find out is to do it dynamically, and the way you would normally do this is to pick your favorite debugger. In this case, you might pick Delve. That's the Go debugger. It's a bit rubbish for reverse engineering because it's designed for debugging and not for reverse engineering, which means it expects you to have debug information included. If you don't have that, it sort of falls over. The classic, venerable option is GDB. If you've ever done hacking and you were trying to diagnose
a segmentation fault, this is probably what you've used. But it doesn't work that well with Go. I had some problems with it; it crashed a bit. So there's a new kid on the block: LLDB. If you've ever heard of the Clang compiler and LLVM, it's a new generation of compiler and a new generation of debugger. And you need to put in some commands to stop it from catching random signals from the Go runtime. So just a quick note on LLDB: as I said before, it's the debugger for the LLVM project, and it is actually
increasingly the default debugger for a range of platforms now, because I think it's more permissively licensed than GDB. So if you use Android Studio, for example, or Xcode, which is the IDE for macOS, it's now the default. It's just going to get used. It also works on a range of platforms: Linux, macOS, Windows, or BSD if you're crazy enough to do that. But it is sort of designed to kill your GDB muscle memory if you use GDB. All the commands are different. It's really annoying. And just to prepare you, this is going to get really in the weeds. If you are not that interested, don't worry.
We'll come back and recap. But just for interest, this is how we might reverse engineer this with no assistance, just a debugger, to find out this information. What we're going to do is create a breakpoint on the call to ResolveUDPAddr. We're trying to answer the question of what port it's listening on, so we're going to inspect the arguments to this function call when that breakpoint is hit. We can go to the Go documentation and say, this is the function it's calling, and we can see that it takes two arguments. One is network and one is address, and they're
both strings. It's going to return two things as well, but we're not too interested in that right now. When you pass around strings in Go, you're not actually passing around the strings themselves. You're passing around a struct, and that's going to have two things in it: a pointer to the data and a length value. And because, as we mentioned before, in Go structs sometimes get split across registers, register one is going to have the pointer to the data and register two is going to have the length. But it's worth bearing in mind that this is a structure that's being split
up here. That's how it works internally. So we can go and map that to the registers that we're looking at here. This is x86-64, so the string pointer for network is going to be in RAX and its length is going to be in RBX. The pointer for the address is going to be in RCX and its length is going to be in RDI. So this is all relatively straightforward. We can go and interrogate that in our debugger. We're going to read memory starting at RAX and ending at RAX plus the length, so that's where the memory read starts and ends, and that's going to give us "udp". Great.
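As a rough illustration of that register mapping, here is a minimal Python sketch that works on a made-up memory snapshot rather than a live debugger. The base address, the byte contents, the register values, and the helper names (`read_go_string`, `read_c_string`, `string_args`) are all assumptions for the example; only the x86-64 integer register order comes from Go's internal ABI.

```python
# Integer argument registers in Go's internal ABI on x86-64, in assignment order.
AMD64_INT_ARG_REGS = ["rax", "rbx", "rcx", "rdi", "rsi", "r8", "r9", "r10", "r11"]


def read_go_string(memory: bytes, base: int, ptr: int, length: int) -> str:
    """Read exactly `length` bytes starting at address `ptr`; no terminator is needed."""
    start = ptr - base
    return memory[start:start + length].decode("utf-8")


def read_c_string(memory: bytes, base: int, ptr: int) -> str:
    """Read up to the first zero byte, the way a C-oriented tool would."""
    start = ptr - base
    end = memory.find(b"\x00", start)
    if end == -1:
        end = len(memory)
    return memory[start:end].decode("utf-8")


def string_args(regs: dict, count: int) -> list:
    """Pair consecutive argument registers as (pointer, length) for `count` string arguments."""
    values = [regs[name] for name in AMD64_INT_ARG_REGS[: 2 * count]]
    return [(values[i], values[i + 1]) for i in range(0, len(values), 2)]


# Hypothetical snapshot: string data packed back to back with no terminators in between.
BASE = 0x4A0000
MEMORY = b"udp:12852tcpunixgram\x00"
REGS = {"rax": BASE, "rbx": 3, "rcx": BASE + 3, "rdi": 6}

network, address = [read_go_string(MEMORY, BASE, p, n) for p, n in string_args(REGS, 2)]
print(network, address)                          # udp :12852
print(read_c_string(MEMORY, BASE, REGS["rax"]))  # udp:12852tcpunixgram
```

The last line shows why reading a Go string as if it were a C string runs on into whatever happens to be stored next.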
It's looking quite promising. Then we can go and read the address string and we can see 12852. So now we've answered our first question dynamically. Congratulations, pat yourself on the back: the port it's listening on is 12852. Great. Now there's another question, which is actually quite difficult because we're now talking about objects: what object is it? We can see that it's expecting to pull in JSON of some kind, and it's going to populate something. But what object is that actually going to go into in Go? The way you would do this is you would say, okay, I'm going to put a breakpoint
on the json.Unmarshal call. So I go and look at that in the documentation, and we've got this funny thing here. We've got two arguments. We've got this data, which is an array of bytes holding the JSON data. That's quite straightforward. But then we've got the second thing here, which is an any. Okay, how are we going to work out what an any is at runtime? We'll get to that. So argument one is quite straightforward. This is just data. It's an array, but it's a dynamic array, so in Go it's a slice. For a slice we're going to have a data
pointer, a length, and a capacity. And because that's a struct again, we're going to split that across our registers: RAX, RBX, and RCX. We can read that out of memory and we can see, okay, that looks like JSON. If you've never seen JSON before, that's certifiably JSON. So the theory is still working. But we want to find out about that any, and that ties into how the Go runtime works. It needs to understand the type of that object. It can't just guess, which means it needs to have that stored somewhere. It does that in an empty-interface structure, also known as an eface. And so that's going to look like
this. You have a pointer to the type and then you have a pointer to the actual data itself. Again, it's a struct, so it's going to get split across registers. We're now expecting to see a type pointer in RDI and a data pointer in RSI. So let's go ahead and read that. Ah, this is not useful. It turns out it's a pointer to another struct, so just reading it out like this is not going to give us anything particularly useful. And that's why we need to go deeper. We can actually find out the type by reading something called a kind value. So we have this pointer to this type here. And so
we've got loads of metadata that we can work with. We've got a size, we've got alignment, and the thing we're really interested in here is called a kind. It's going to tell us roughly what sort of primitive type this type is: strings, integers, and so on. And just to be difficult, it's an enum value packed into the lower five bits of a byte. So what we're going to do is cross-reference that with the definition. If we go ahead and read that value, it's at offset 0x17, and we can see it's 0x36. As I said, the kind is packed within the lowest five bits. So, if we go and
do a mask, that's going to bring us the value 22 (0x36 & 0x1f = 0x16). So the value of the kind is 22, whatever that is. Then we go into the Go source code and we count down this enum and we can see it's a pointer. All right, great. We've learned something: it's a pointer type. But you can actually get a bit further than learning it's a pointer type. You can read the name of that type. In order to do that, and hold on to your seats because this is going to get deep, we need to read something called the Str field from that type, and that's all the way down here. And you might think we've won because that's
a string. We just read that and get the name. Not so fast. It's a nameOff. What's a nameOff? It's not an insult. It's a name offset, and that's an offset from something else, and we need to go and find that something else. That something else is called moduledata, which is a data structure in a Go binary that is stored in the initialized data section. It basically records information about the layout of a Go executable. For now we're just interested in the type information. In our moduledata struct there's this value here called types, and we're trying to go and find that
because the thing we had before was an offset from that, so we need to find this first. Unfortunately, we can't easily find moduledata, because there's no pointer anywhere in the binary that just points to moduledata. That would be too easy. So what we're going to do is use this thing called the pclntab, which is another structure in the binary, and that's basically going to tell us about the structure of a Go binary. We're going to find this pclntab and cross-reference backwards, because the first field in the
moduledata is a pointer to this pclntab. So we're going to go backwards and find it. Don't worry if I'm having to speed through this; we need to get through it quickly and the detail is kind of irrelevant. We're just going to find where this thing is. So we find the pclntab and from that the moduledata. Then we take the name offset from the type and add the base address of the types section from moduledata. We add one to skip the flags byte; don't worry about it. We read the length as a variable-length integer; again, don't worry about it. Then we're
going to read the name as a UTF-8 string. So how are we going to do that? We read this offset, so it's 19,241. We read the length. That's 0. And then we put that all together and we can say this is a main.LogEntry. Oh my god, that was a lot of work. If I completely lost you, don't worry about it. That was sort of intentional. The point here was to say this is really hard and annoying, and actually it took me an hour just to get the screenshots. But what we've done here is found the type at runtime. So, we
can go back to one of the static tools we used before, Redress or whatever, and line that up with main.LogEntry. And we can see what the JSON is actually being deserialized into: a timestamp, service, level, message, and status code. We did it. Woo. Okay, that's the end of part one. That was deliberately painful. And just to recap, because I lost you: there is lots of useful information you can find in a Go binary, and static tools such as Redress can recover this information and give you lots of things to know about a Go binary if you're trying to reverse malware. I think
I've just illustrated that manual dynamic analysis in a debugger is not practical. Those screenshots took me an hour, and I knew what I was looking for, and I bored you stupid talking about it. So that does raise a question, which is: why can't we write a tool that does this all for us? It's well structured and it's all there, so why can't we just write a tool? How hard can it be? And that leads us on to part two. So if you dissociated in part one, don't worry. This is kind of its own thing, so you can wake up again. So you're writing a tool, and any great
tool starts with some requirements. In this case, it's a wish list. I just want it to just work, which is a great requirement. So what do I mean by that? I want it to work with all versions of Go, not just one. I want it to work on a variety of CPU architectures and operating systems. I want it to automatically annotate function names when I hit breakpoints. I want it to handle those pesky Go strings. And all that work that we did with that type struct, that's really boring; I want it to do it all for me so that people who come after me don't have to look it up. So we're talking about complex data
structures here, and we don't just want to parse them out. We actually want to display data, because if we've gone to the effort of working out what a type is, we might as well use that information and display data with that type, so you get a really nice markup, on the call stack as well. And for some people, having no external dependencies is a real plus if you can't connect to the internet. So, as I said, I work for a company called Foundry Zero, based in Cheltenham, and we had already created this tool called LLEF, which is LLDB plus extended features. So LLDB is the debugger, and our extended features essentially make it so that it looks
quite cool, and it dereferences C strings and things like this and gives you a stack trace. It's open source; I'm not trying to sell anything. This is the baseline. So we have this already, and I thought, well, we've got this cool product here and it can do stuff for C binaries. What if we could do this with Go binaries as well? What if, when you hit a breakpoint, instead of giving you nothing it would take the Go type information and just display it? I thought that'd be really cool. So we need to talk about how this is actually going
to work. Depending on the kind value, there may actually be extra data after that type header. So we're going to have a pointer type, struct type, array type, and all this sort of thing. These are all the different type structures that get passed around. At runtime it looks a little bit like this. If you recall, earlier we read this kind value and we worked out we were dealing with a pointer. Once you've read that, you know there may be some extra information down here. So, for example, this is a struct type. We'd read this kind value and we'd say,
okay, it's a struct. Cool. I now know that it has some fields, because structs have fields. That's great. The fields member is actually a slice, which means we have an array of these struct fields. For every single field in the struct, we're going to have one of these struct field entries, and it's going to give us a name, a type, and an offset. And that's actually all the information we need to parse those structs. For each one, we can say this is the name of it. And if it's another struct within a struct, we can go ahead and have this sort of recursive process where we go back and read the
kind value again, and then there might be some extra fields down there depending on the kind value. And this offset allows us to work out where inside the struct this field is going to be. We don't actually have to wait until runtime to pull out this information. We can do it all statically, which is how these static tools actually work. A full list of types is found via this moduledata, and there's only a single record per type. So this is going to look a little bit like this. Again, don't worry too much about this if you're just interested in the tools and the techniques, but this is sort of
for interest, if you care about how this sort of stuff works. What we have in our moduledata is a pointer to types, and that gives us a pointer to all these different types, all these things we saw before. There's a sequential set of these in memory, one after the other. And they have different sizes, because they may or may not have extra fields depending on what kind they are. So a struct type, an array type, or just a generic type each has a different number of fields. So you might say, well, how am I going to parse
these things? Because I can't just use a fixed stride to iterate through the array, because the entries have different lengths. That's why you get this thing called typelinks, which gives us offsets from the start of the types section. It says, okay, at offset zero you've got this type; at this offset, whatever it is, that's where the array type starts; at this offset, that's where this other type starts. And you can see they're sort of self-referential. If you've got a struct type, it has some fields, and they might be two fields of that type and that type. An array type is going to have an element type, and that's going to be that one. So you can
see how this can be parsed, and you can now get a nicely laid-out definition for every type in your binary. So that's type recovery. In terms of function name recovery, that's also a really interesting, cool thing. If you remember, before we were talking about something called the pclntab. If I expand that, it's the program counter line table. What it holds is an association between your program counter register and the line within the Go source code, and it essentially helps with debugging if you have a panic, which is when an exception is raised. It
allows you to say, "I panicked within this function, 112 lines into it", and that's why it stores that information. One thing it does that's quite interesting for us is that it stores offsets to function name strings and offsets to functions in the text section. So we can now get an association between the name of a function and the start of that function in memory. It's got a scary-looking header here, but we're interested in these offsets. What we're going to do is read the number of functions in the binary from nfunc. Then we're going to read an offset which is
actually an offset from the start of the pclntab header, which is going to give us a function table. Once we've got our function table, we're going to read pairs of offsets for each function. The first is an offset into the text section, which is the start of a function, and the second is an offset from the PC header to a func header, which is metadata about a function. So just to wind back for a second, we've essentially got a pointer to a function in memory and we've got a pointer to metadata about that function, so we can now read information about that function. And that's going to
look a little bit like this. In our PC header we've got this funcnameOffset. That's going to be a pointer to just a list of null-terminated strings (they decided they like null-terminated strings now, for this), and that's going to be a list of all of your function names, one after the other. But that's no good on its own. We need to know how that maps to functions in the binary. So we've got this pclnOffset table, and in it we're going to have repeated pairs of values. In order of the functions, we're going to have an offset into the text
section, which is going to tell us where the actual function is, and then we're going to have an offset to metadata. That's going to bring us to our func header, and the most important thing we care about there is the offset into this name table to get the name. Again, they love offsets. I wouldn't worry about it too much. Essentially, you've got a logical association between where a function starts and the name of the function, and being able to associate those two things is the important thing. So there are also some other opportunities,
but I'm going to skip over this because we're low on time. The end result is that we can rerun the previous questions: what port is the UDP service listening on, and what data type is the JSON being deserialized into? Now that we've written this tool, hopefully these are going to be easier questions to answer. To start off with, we can hit a breakpoint, and this is on ResolveUDPAddr. As you can see, the tool marks things up for us. We can see the disassembly here, and it's been marked up as being that function, which you would not get normally. We can see we've got a
stack trace, so it's saying we're actually in main, which is cool. And we can go along here and see that for our first argument, which is in RAX, it says "udp", and in RCX it says 12852. It's quite a lot easier. We just get the answer. I mean, we still have to know the calling convention, so a possible future version would try to make that a bit easier. But for now we can see those things a lot more easily than we could before. But let's try the hard question, because the other one was quite easy, which is: what object are we actually deserializing into? So we can hit a breakpoint on the call to json.Unmarshal
and now if we look at the top of the screen, we can see it's recognized that in RDI we've got our Go type pointer, and we can actually see all that type data being displayed, which you would not get before. For reverse engineering, that is a really, really awesome piece of information, let me tell you. It doesn't really show how much work went into getting that, but there it is. It's just there. And it's also tried to follow the pointer to the data it thinks might be associated with that type. That's just some empty data, so it's showing nothing at all. That makes sense. We haven't
actually gone into our function call yet. We break just before it. So now what we can do is jump into this function. That proves that our stack trace now works: we've gone from main into encoding/json.Unmarshal. And if we jump out of it, we'll see that the type data is gone. That's because the function doesn't return it. But just by good fortune, it's on the stack, and you can see it's actually populated this object with all this data here. So we can see there's a timestamp, there's a service, there's an error, there's a message. This looks like an API log of
some kind. But you can see, if you're reverse engineering, just being able to hit a breakpoint and answer this question really quickly beats spending an hour faffing around with Go types. And we have actually got a command where, if you give it a type pointer, a pointer to the data and then a type name, it'll just go and unpack it for you, which is really cool. So again, this is not a sales pitch. We're not trying to sell you anything. It's open source and free and all those good things. It's on GitHub, and if anyone could star it, we're really close to 500 stars, and that'd be
really cool if we could get to 500 stars. So I'd really appreciate it if you could go and star it on GitHub. How much time have I got left? 10 minutes? Oh, great. So there are some challenges, just by the by here. The Go developers really like to mess around with their internals. They state very prominently: do not rely on any part of the internals of Go for anything. And they mean it. They've changed it a lot. And so when you're writing a tool, you'll have lofty ambitions that say, I'm going to work with any Go binary that's been compiled in the last 15 years. And that's a
problem, because whilst it's easy enough to do it for one version, doing it for all of them is a pain. So for example, in LLEF, the tool, we had to write three parsers for that pclntab structure. That's the header in the function table. We had to write five moduledata parsers. That's how many times they've changed it. And that's actually going to be six. Don't worry about reading this. This is basically a spiel from the latest version, Go 1.26, where they said, we've changed it all again. Yay. So that's going to be six. I'll go and do that later at some point. It's also worth pointing out
there are some limitations to this tool. For example, take all that type metadata I was talking about before. You might ask, why would you pass that around when surely the compiler knows what those types are? You'd be absolutely right. It only includes that information when the runtime needs it. So in our example, there was an any type, and it needed to pass the type information around because it's an any type. What you can still do is hit breakpoints when you're allocating objects, because it turns out that the object allocators also take the type a lot of the time, because they need to know what type it is in order to allocate the space for
it. So you just have to move where you do your analysis a little bit. To help out with this, LLEF includes a feature to remember a type-to-data-pointer relationship, which can decay if they're not sequential anymore. There are also manual commands to override what it's doing automatically. And then stack unwinding only works on x86, but we'll move past that. So I think we've got time. You might ask: all this data is being thrown around, and malware doesn't like being reverse engineered, as a rule. So why do they write stuff in Go, and what do they do about it to stop analysts from
having such a great time? The most popular open-source Go obfuscator for making reversing hard is called garble. So all those beautiful strings that we worked so hard to pull out are now junk: they're now base64 hashes that you can't decode. It removes Go version information, so you can't know what version of Go you're actually looking at when you start analyzing it. It clobbers magic bytes, so you don't have nice anchors for where you're going to analyze things. It's going to obfuscate string literals, and it's also going to mess around with control flow using the abstract syntax tree. So I originally didn't include obfuscation in scope, because I don't really deal
with obfuscated binaries very often, but it was asked of me, so I had a quick investigation. They've clobbered this magic value here, which should be this value here. But the rest of this structure, which is our PC header in the pclntab, all looks quite sensible. They haven't actually clobbered all of it, so there's a glimmer of hope that we might still be able to do some smart reverse engineering dynamically. So what I did was go into the source code and, because some of the information I was relying upon to work out the Go version was missing, I just said: always go for version 1.18 to 1.24,
which worked for the version I was looking at. Just a proof of concept. This is where I would have a live demo, but I think we're short of time. But essentially, what that looks like is that all of the function names are replaced by basically random data. What you can still have, though, is the structure of the types. So where before I might have been able to see all the metadata giving the names of the fields, now I'll just be able to see that there's still a string, a string, an integer, a string and a pointer, whereas before you would have got
nothing at all. So it's better than nothing. It's actually still better than what you get with C, even though it's obfuscated. So in terms of a future wish list, I'd like to have a bit more of a fallback mode for that, and to bake in an awareness of some Go library function signatures. And that's kind of the end. I just want to have a bit of acknowledgement here: Trail of Bits ran a similar project, but ours is better. And there was also a hidden hero here. We have some placement students, and this was actually a summer placement
project. The person who did this decided not to be publicly recognized for it, but they were the one who did a lot of the coding to write this tool, taking the idea and running with it. So it's their awesome work that means I could be here today. And this is a bonus. When I delivered this originally, I gave it at RE//verse in Orlando, and there are lots of high-flying vulnerability analysts over there, so this is about what we have in the Go runtime. The student picked out their favorite line of source code from Go. You know, it's
a memory-safe programming language, so you can't have memory corruption. Well, there's this function in the Go source code called addChecked. What it does is take a pointer and add an offset to it. That's quite an unsafe thing to do, actually: adding an integer to a pointer like that, you could go anywhere. So clearly, because that's a potentially unsafe operation, they added a string parameter saying why it's safe. Now, if you've been around security for a while, you might have realized that just adding a string explaining why it's safe doesn't actually make it safe. And I was curious what justification they would give in their own source
code. So I looked: do they use this function anywhere themselves? They have. They take a pointer to these bytes, they give an offset, and then they say why it's safe: the runtime doesn't need to give you a reason why it's safe, apparently. So there's a bit of arrogance there, really. Hopefully someone finds a bug in their runtime pretty soon. But I think we've got about three minutes, if anyone is not too liquefied to ask a question. >> Hello. Yeah, just one, out of interest. So the strings aren't null-terminated, and they're not prefixed in the traditional sense. They point to some data somewhere else. And then how?
>> Yeah, so it's a great question. At first glance it looks like it. It's not quite a Pascal string, which is probably what you're thinking of, where you have a length and then immediately the bytes. You have a string header, which holds two things: one is the value of the length and one is a pointer to the data. So when you're actually reverse engineering these things, it looks really confusing, because Ghidra, the decompiler, will helpfully say that's a string. It's got heuristics, but then it will assume it's null-terminated. So you get this huge block, like a thousand-line string, even though it's just eight
characters long, because it doesn't know how to deal with it. >> Is the reason that Go implemented it that way that you can therefore have multiple of those pointing to the strings? >> Yeah, they do. It's statically held, so they just have pointers into this big block. >> Strings are immutable? >> They're always immutable. In Go, strings are always immutable. You cannot add to, append to, move or change strings. What you can do is use bytes. So you can have a dynamic byte array, and there's a bit of code in there to say that if it goes over a certain length, it reallocates it and moves it. And then
you could convert that again to a string. What that's going to do is allocate a new string that is immutable, but it's just generated on the fly rather than being statically stored in the binary like that. If that makes sense.
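As a rough illustration of the string header described in this answer, here is a minimal Python sketch. It assumes a 16-byte header made of a 64-bit data pointer followed by a 64-bit length, little-endian, which is the usual layout on 64-bit targets. The memory image, the base address and the helper names are all made up for the example. It shows two headers pointing into the same block of packed string data, and what happens when the same bytes are read the way a C-oriented tool would read them.

```python
import struct

# Hypothetical memory image: string data packed back to back with no
# terminators between entries. The single NUL sits far past the first string.
BASE = 0x4A0000  # made-up load address of the blob
BLOB = b"udp:12852timestampservice" + b"\x00"

def read_go_string(header: bytes) -> str:
    """Decode a 16-byte Go-style string header: (data pointer, length)."""
    ptr, length = struct.unpack("<QQ", header)
    start = ptr - BASE
    return BLOB[start:start + length].decode()

def read_c_string(ptr: int) -> str:
    """Decode as a C string: keep reading until the first NUL byte."""
    start = ptr - BASE
    end = BLOB.index(b"\x00", start)
    return BLOB[start:end].decode()

# Header for the 3-byte string "udp" at the start of the blob.
udp_header = struct.pack("<QQ", BASE, 3)
# A second header pointing into the same block, 3 bytes in, 6 bytes long.
port_header = struct.pack("<QQ", BASE + 3, 6)

print(read_go_string(udp_header))   # uses the length from the header
print(read_go_string(port_header))  # a different slice of the same block
print(read_c_string(BASE))          # runs on until a NUL happens to appear
```

The last call is the over-read the answer describes: without the length from the header, the reader has no way to know where the first string stops.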
around that kind of boundary. >> So when you say cgo, you mean if you've linked C into your runtime? >> Um. >> So if you're linking in a C library, for example, or it could be C code embedded in the Go source files. >> Yeah, so it has a bridge for that. I assume it converts the calling convention into the C calling convention for that platform. I assume it doesn't actually... yeah. One of the really fun things about this is that you can do that. You can have C code in your Go binary, which means that you get all the memory problems of C in your
Go binary. And they decided not to have ASLR, which relies on position-independent code. There's a discussion on this online which was really fun to watch. I'm nearly out of time, but they had a discussion about whether they should have exploit mitigations, and they said: no, we've got a memory-safe programming language, we don't need mitigations, we don't need position independence, we don't need ASLR. That's like having seat belts in restaurants, is what the developer said. But they sort of forgot that you could have C code in your Go binary, which means that if you find memory corruption in the C code, you've now got free real estate on gadgets for
doing return-oriented programming exploitation, because it's all at predictable locations and you don't have to have a leak. So yeah, in terms of that boundary, on the tool it should work fine, because it's made for C and then we added Go features on top, so it should handle that. I haven't really seen that. It's not back on. >> Perfect. That's all we have time for, I'm afraid. So come find me if you have a question.
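The function name lookup described at the start of this part (function table entry, then func header, then name table) can be sketched roughly as follows. This is a toy in Python, not a parser for real Go binaries: the 32-bit field widths, the table builder, the addresses and the function list are all assumptions chosen to keep the chain of offsets visible, and real layouts differ between Go versions.

```python
import struct

def build_toy_tables(funcs):
    """Build toy name/func-header/functab blobs from (text_offset, name) pairs."""
    names = b""
    headers = b""
    functab = b""
    for text_off, name in funcs:
        name_off = len(names)
        names += name.encode() + b"\x00"  # null-terminated, one after the other
        header_off = len(headers)
        # Toy func header: entry offset, then offset into the name table.
        headers += struct.pack("<II", text_off, name_off)
        # Toy functab entry: entry offset, then offset to the func header.
        functab += struct.pack("<II", text_off, header_off)
    return names, headers, functab

def read_name(names, name_off):
    """Read one null-terminated name starting at name_off."""
    end = names.index(b"\x00", name_off)
    return names[name_off:end].decode()

def symbolize(names, headers, functab, text_start):
    """Walk the functab and return {function address: function name}."""
    result = {}
    for i in range(0, len(functab), 8):
        text_off, header_off = struct.unpack_from("<II", functab, i)
        _entry, name_off = struct.unpack_from("<II", headers, header_off)
        result[text_start + text_off] = read_name(names, name_off)
    return result

TEXT_START = 0x401000  # made-up start of the text section
tables = build_toy_tables([
    (0x000, "main.main"),
    (0x120, "net.ResolveUDPAddr"),
    (0x480, "encoding/json.Unmarshal"),
])
symbols = symbolize(*tables, TEXT_START)
for addr, name in sorted(symbols.items()):
    print(hex(addr), name)
```

A debugger plugin holding a mapping like `symbols` can label a breakpoint address with its function name even when the binary has no symbol table, which is the association the talk says matters.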