
Yep, my name is Chris. I work for the National Cyber Security Centre in the vulnerability research team as a principal researcher; Twitter handle's up there. This is a big piece of research that I've been doing in my spare time, looking at using Ghidra, the SLEIGH language and P-code, and applying that to a processor that NSA didn't release. So, a little bit of an agenda. I'm going to talk a little bit about me. Obviously, orange lanyard, not being recorded, sorry about that, that's my employer. I'm going to talk quickly about Ghidra, because some of you might not have come across Ghidra before: what it is, and a bit of background about it. Then a very quick intro to the processor that I
attempted to reverse engineer. It's a Texas Instruments processor, it's from 1972, which makes it older than me. It is complicated. It was a competitor to the Motorola 68000, and we all know which one won; again, there are some good reasons for that. I'm going to go through all of the SLEIGH files that you need and some important concepts, so this is going to be kind of a tutorial. It's all the problems that I butted up against, and hopefully you can learn from it, so if you want to do this as well it will be much easier and you won't make the same mistakes that I did, and swear a lot. So, as I said, my name's Chris. I've got a
couple of degrees: I've got a degree in computer science from Essex University and a master's from NYU. I've been a principal researcher in the NCSC's vulnerability research team for about four years, but I have worked in the industry for about 10 years. I've got a bunch of CVEs, some of which I'm not personally credited on, but I do have a track record in finding vulnerabilities in products. So, Ghidra: what is that, then? Well, it could be a mythical dragon. If you do your homework, this picture comes up a lot: it's a three-headed mythical dragon. Or it could be a reverse engineering tool. This was released by
NSA on the fifth of March at the RSA Conference, by a guy called Rob Joyce. This was announced in January of this year and created a huge Twitter storm, because some people had seen references to it in things like Vault 7, so everyone got very, very, very excited. It's currently on version 9, so you can infer from that that there were previous versions; it's quite a mature product, and there have currently been three point revisions. It's fully open source now, so you can download the source and futz with it yourself if you really want to. One minor drawback: it is written in Java, and it can be quite slow and very, very memory hungry.
It's supported by a whole bunch of scripts, so again you can write scripts in Java to do things in Ghidra, but it also has a Python interface, and it also has an intermediate language called P-code. So if you're familiar with something like Binary Ninja, you can lift to an intermediate language and do some really cool stuff. It has full 64- and 32-bit support, so if you're familiar with IDA Pro, you don't have to go to IDA Pro Advanced to get all of your processor types. It supports PowerPC, MIPS, ARM, x86, x86-64, pretty much all of the little PIC microcontrollers, and a whole bunch of really ancient ones; the Z80 is another one.
So, a whole bunch of processors; there are roughly 21, and people are starting to use Ghidra to reverse engineer other processors. So we move on to: well, why? Why am I doing this work? This is work against a serial multiplexer called a Case DXE 850. If you had lots of telexes or lots of satellite networks that you wanted to work together, back in the 70s and 80s, you would serially multiplex the connections in a massive rack like that. These days a Raspberry Pi could definitely do all of the work. And again you kind of go: well, why? Well, this is replicating the work of the Royal Observer Corps. So back in the 70s,
60s, 70s, 80s, the Royal Observer Corps were responsible, if there was a nuclear war, for a network of concrete bunkers in the UK. If we lost all normal communications, these would be how government functioned; specifically, they would pass around messages about the size of a nuclear blast and where it was, and try to get civilization working again. These are the sorts of messages that would be passing via these telex machines and these serial multiplexed connections. In this case it's things like the blast in kilotons and how high it was above ground, and with a network of these stations you could work out exactly where the nuclear weapon had been detonated and how bad it was. So on the left there we have a picture
from the 60s of the workstations that these people operated, and they did lots and lots of drills to make sure they knew what they were doing. Again, we're still on why. On the bottom right there is the current recreation: this is a bunker in Dundee, and this is the current recreation of that particular station. What they're really interested in doing is recreating all of the software and hardware, so they can have the bunker network set up as much as they can in the UK, and people can visit and learn about the Cold War and all these sorts of things that were happening. So, the processor itself is the TMS 9900. If you look down there on
the bottom right there you'll see, yeah, it was originally issued in 1974 and realized in 1978; again, that's older than me. 16-bit, big-endian architecture: sounds OK at the moment, sounds relatively easy to do. Then: it has three registers, that's unusual. It doesn't have a stack, also unusual. The instruction set is also a bit weird, and it doesn't have a RET instruction: there is no instruction to say return from a subroutine call. So when I started this piece of work, you kind of go, oh yeah, I can climb up that hill as an academic exercise, it's only over there, it's only a hill. Except you find out the hill's a very long way away and it's called Everest.
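As a rough illustration of what "three registers and no stack" can look like, here is a minimal Python sketch. It is not taken from the talk's slides. It assumes the layout given in the TMS 9900 data manual, where workspace register Rn is a big-endian 16-bit word in ordinary RAM at address WP + 2n; the class name `Cpu`, its methods and the example addresses are hypothetical.

```python
# Illustrative model only: the CPU itself holds just PC, ST and WP, and the
# "general-purpose registers" are 16-bit words in ordinary RAM at WP + 2*n.
# Class name, method names and addresses are hypothetical.

MEM_SIZE = 0x10000  # 64 KiB, the full 16-bit address space


class Cpu:
    def __init__(self):
        self.mem = bytearray(MEM_SIZE)
        self.pc = 0  # program counter
        self.st = 0  # status register
        self.wp = 0  # workspace pointer

    def _reg_addr(self, n):
        if not 0 <= n <= 15:
            raise ValueError("register index must be 0..15")
        # Word accesses are even-aligned; wrap within the 16-bit space.
        return (self.wp + 2 * n) & 0xFFFE

    def read_reg(self, n):
        a = self._reg_addr(n)
        return int.from_bytes(self.mem[a:a + 2], "big")

    def write_reg(self, n, value):
        a = self._reg_addr(n)
        self.mem[a:a + 2] = (value & 0xFFFF).to_bytes(2, "big")


cpu = Cpu()
cpu.wp = 0x0100
cpu.write_reg(5, 0x1234)   # lands in RAM at 0x010A
first = cpu.read_reg(5)

cpu.wp = 0x0200            # "context switch": only the pointer moves
second = cpu.read_reg(5)   # a different R5, in a different workspace

cpu.wp = 0x0100            # switch back; the old R5 is still in RAM
restored = cpu.read_reg(5)
```

In this model every register read or write is a memory access, and switching context is a single pointer update.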
So, three registers and no stack: how does that work? It has a program counter, as you'd expect; it has a status register, as you'd expect; and then it has a workspace register. The workspace register points to normal RAM, where 16 general-purpose registers are. So you do have 16 registers, but any access has to go through the workspace register. That means context switches are really fast, because you only need to change the workspace register to point to a different place: control flow changes, good stuff happens. But otherwise it's epically slow, because any access to any general workspace register has to go out to RAM, and in the 80s and 90s, and the 70s even, RAM was
really slow and really, really expensive. This is the main reason why it lost out to the Motorola and the 8086: it was about 10 times slower than anything else. One of my colleagues and I were chatting about it, and it's like: why did Case use this chip? It's awful. And from reverse engineering it: yes, it is. [Music] So, SLEIGH. SLEIGH is the underlying language in Ghidra that models instruction sets. It's a successor to SLED; there's a whole bunch of work on that in the open source environment, so in academia there was this Specification Language for Encoding and Decoding. SLEIGH is a really, really complex language. I don't know how much NSA invested in it, but it's a lot: it's really, really complicated and it's really, really full-featured. We're
describing the instruction set so that we can lift it into an intermediate language, and then we can decompile the assembly language, and then we can lift that decompilation into source. SLEIGH facilitates lifting to P-code, as I've said, and P-code is the intermediate language. OK, so there's a whole bunch of previous work that's been done, obviously. NSA have done SLEIGH for all of the processors that they've released, including ARM, both 32- and 64-bit, and Thumb code for both of those, x86 and, as I say, some cute older processors. For the 9900 processor, there's a guy called B run who is doing this work for this particular chip. At the time we
started writing these slides he had only implemented three instructions on his GitHub; he's up to about eighteen now, so he's doing quite well. But that's the only prior art in this arena that I could find. So now I'm going to go quickly through what files you need and what they do, and then we'll start talking about the really important things like tokens, actually parsing out the instructions, and how it all fits together. So you go and look in the Processors directory in Ghidra and you see a bunch of files. So what do they do? On the bottom half of that slide there, yes, I have got an extra zero on the 9900. I couldn't be bothered; I
can't change it, because it's a pain in the backside. So you need a bunch of files: a cspec, an ldefs, a pspec, an sla file and a slaspec. In terms of files you don't need a huge amount, but it is relatively complex. The first thing you need is the language definition, and this is relatively simple: it describes your processor, whether it's big-endian or little-endian, and things like display variables. We've got up there the language dialog that you get in Ghidra; when you select "I want to disassemble this particular device", that's what you get. So that one's relatively simple. You then need a compiler specification. Compilers can be different in the way
that they generate function calls, so you have, for example, GCC versus Windows: the way they pass variables, be it on the stack or be it in registers, can be different. You'll notice here that I've specified a stack pointer, with the register being the workspace pointer. This is a problem that is an artifact of Ghidra: it's expecting there to be a stack pointer, and in this case I don't have one, because this architecture doesn't support a stack, which is just horrendous. So if you've got a whole bunch of different ways of calling things, this is where you describe that. You can say: when you do a standard call, then I expect the
arguments to be in certain registers, I expect you to take a certain amount of the stack; and it deals with returning, so it can work out where functions are for you. It will then populate the P-code, which then goes: I know this is a function call and I know that the registers were R0, R1 or whatever. It then populates those arguments into the P-code and shows you, in C, the decompiled function. You then need a processor specification. This is the place to put things if you have a fixed memory model. So in this case, again, this is the TMS 9900, which has a very small memory model because it's a 16-bit machine, and
that will then populate the display within Ghidra, so that if you get to a certain address you can see: ah, this is the reset vector. The reset vector will then go to the interrupt handler, and the interrupt handler will then reset everything and restart the machine. So we're nearly there, all the way down the setup list, until you get to this, which is the SLEIGH specification. This is a tiny, tiny piece of XML from an absolutely ginormous file. I was sitting there thinking, how on earth am I ever going to recreate this? Because it's all XML, there are links in there, it's all linked forwards and linked backwards.
How am I going to do this? Till the penny dropped: nobody could write that, it's machine generated. So let's hope that NSA released the program that makes this, and they did. There's a program called sleigh, actually, which is in the utils directory, which takes another file and generates this, I was going to say nonsense, but it isn't, because it works; it generates this unreadable mess. So you don't need to worry about the .sla file itself: you need to write a SLEIGH specification. OK, right, we're good on that. So you need a bunch of stuff. There are some basic definitions within SLEIGH. You need to define the endianness; this is actually passed through from the very
first file I talked about, which was the selector, so the big-endian or little-endian setting will get passed through; you can override it at any point if you want to. You can specify the alignment. For example, on ARM, if you read on odd word boundaries you're going to get a protection fault, because it has to be half-word aligned or word aligned, so you can do that if you need to in your architecture; obviously x86 doesn't do that. And then you need to define a bunch of registers, so you need to tell Ghidra, well, SLEIGH, what registers you're going to use. They're defined as strings, a big string list, and here you've got an
offset and a size, so that's all relatively self-explanatory. The size is in bytes. These are the general-purpose registers; it's a 16-bit machine, so two bytes, job done. What's really cool about this is that using the offset you can overlap registers. If you think of x86, where it started with A, which can be AX and then RAX: using offsets and sizes you can overlap all of the registers, and that means that in the end you'll be able to set individual parts of those registers if you need to. There's a special set of functions to deal with bit-field registers, so the status register, when you're trying to set individual bits. That, for me, didn't work. So the top
part of the slide there is copied word for word from the Ghidra documentation, which is all there; there's tons and tons of written documentation. However, it's very impenetrable, and there are loads and loads of places where it says "you may find this confusing at first". Yes, I did. When I copied that word for word, it does compile down into the XML, but you load it into Ghidra and it says "bad XML". It's like: thanks. Just no indication of why that doesn't work. Great, thank you. I guess I could raise an issue on GitHub, but I was busy. What you can do, because there are so many other SLEIGH files within Ghidra, is go and look at
the others, the 6502 or x86, and this is how they do it. What we've got here is an offset: at the end of my registers I've said, can I have some more, of size one. I do realize that means these registers are a byte, not one bit, but it seems to work for NSA, so it's working for me. Now on to the really, really important things; we're really getting into the nitty-gritty here. So, variables. Variables are obviously really, really important, so that we can assign values, so that we can show P-code what's going on, and this we can again lift to C. Tokens and fields: this is how you describe how the processor instructions
work in terms of bits. A token is byte-sized, and fields are fields within that token; they don't have to be bytes, they can be as small as a single bit. And constructors: these are where the magic happens. This is where you say: this is my mnemonic, this is what the opcode looks like, and this is what I want you to do afterwards. We also need a little more on variables: we need to attach registers to variables. I talked previously about how you define the registers and their size. At this point we are saying: any of those registers, can you attach that to, in this
case, rs or rd, so register source, register destination. That means the intermediate language doesn't actually care whether it was R0 or R15; it just knows that it's expecting a register. That's a really, really, really powerful thing, because within the instruction you don't have to worry about which register you're using, so they can be used interchangeably. As I said there, SLEIGH doesn't actually care which register it is. So, more on tokens. We have here three of the six addressing modes within the TMS 9900. As you can see, they're all a little bit different, the opcodes are different sizes, and so on. So how are we going to use tokens to model all
of this? To start with, the opcode token I've defined here is 16 bits long, so that's fine; it has to be divisible by 8, because, as I said previously, it has to be byte-sized. So, OK, in this case we're going to say op2 is the first opcode field, that's from bits 0 to 2, and we can follow on from that: we're going to say op5 is from 0 to 5, and so on and so forth. So we can model our instruction by building up these tokens in this way. We can say the register rd, which is the destination register, is from bits 6 to 9. But you'll also notice on this particular
one, in the last instruction, in place of the destination register there's a C. Oh, what was that? A count. So here you could use rd to model that particular count, but you're going to get yourself confused instantly: as soon as you start trying to model the instruction you say to yourself, is this a register destination or is it a count? And then you have to go back to the PDF, which is 27 megs and doesn't search very well. Don't do that. What you can do is specify another field that's in the same place. Ghidra lets you do that, that's fine, you don't need to worry about that. So now we have register destination, register
count, and then we need to model the source register; again, that's from bits 12 to 15. Then we have some of the other stuff, which is to do with Td and Ts, to do with register-to-register addressing, and whether you're using bytes or words. And we need a semicolon, which indicates, in a kind of C-structure-like way, that we're finished. Interestingly, the name "opcode" up there is just for your benefit; it's never, ever used. So you can call them what you like, but it's best to be verbose about it, because by the time you get to the end of it these token lists can be enormous: the one for ARM is about 10 to
12 pages of tiny little bit fields within things. OK, so we've defined our token. Now we've got tokens, we've got registers, we've got variables, so now we need to model the instructions. Here is how you model one of the instructions; this is a branch instruction. We'll go through building this out, and at the end of it you should have a better idea of how this works. So, everything after the colon but before the "is" is the display section; on the top right there, that shows you what you're displaying. So if you want to put a hash in there, often for an immediate value, or you want to display in a
different way, everything that goes in there is your display section. After the "is" is where the modeling happens. So op25, you saw previously, is a certain bit field. What I'm saying here is: in a branch instruction, I want the first 8 bits to be a 0 and a 4, I want the next 4 bits to be a 4, and I also want a register in there. Then what that will do is take the value at the end: because rs can be bound to any register, it will go to the value of rs. "goto" is a reserved word in Ghidra, and it will fix all that up for us.
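The matching described above can be imitated in a few lines of Python. This is not SLEIGH and not the speaker's specification, only a sketch of what the constructor asks for: a high byte of 0x04, a next nibble of 0x4, and a low nibble that selects the register. The helper names `field` and `decode_branch`, the `REGISTERS` list and the bit numbering (bit 0 as the least significant bit) are assumptions made for illustration.

```python
# Sketch of "pattern section plus display section" for the branch example.
# All names here are hypothetical; bit 0 is taken to be the least
# significant bit of the 16-bit instruction word.
from typing import Optional

REGISTERS = [f"R{n}" for n in range(16)]  # the names a field could be attached to


def field(word: int, lo: int, hi: int) -> int:
    """Return bits lo..hi (inclusive) of a 16-bit word."""
    width = hi - lo + 1
    return (word >> lo) & ((1 << width) - 1)


def decode_branch(word: int) -> Optional[str]:
    """Return the display text if the word matches the branch pattern."""
    if field(word, 8, 15) != 0x04:   # first 8 bits: a 0 and a 4
        return None
    if field(word, 4, 7) != 0x4:     # next 4 bits: a 4
        return None
    rs = field(word, 0, 3)           # last 4 bits: which register
    return f"B {REGISTERS[rs]}"      # mnemonic plus the attached register name


example = decode_branch(0x0445)      # low nibble 5 selects R5
no_match = decode_branch(0x0545)     # wrong high byte, so no match
```

The point of attaching register names to a field is visible in the last line of `decode_branch`: the pattern never mentions a specific register, only the field that selects one.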
And rs is at the end, as I said previously. OK, so what does that actually look like once you start to stick this into the decompiler? This is a very, very, very complicated program, which is just incrementing and decrementing registers and a couple of jumps. The branch instruction I just talked about is at the bottom, and you'll notice that's what I was talking about with the display section: it has displayed it, and there's my value. If you look there, you can see that the first byte is a zero and a four, and the part right after that is again a four.
What you'll notice here as well is that the last part is a five, which indicates that it's register five, and Ghidra has fixed that up for us; I didn't do that. If that value had been seven, it would say R7. It's going to the label at hex 600. You'll notice that Ghidra has drawn in a nice arrow to show that there's a branch instruction there. Great, I didn't do that either. And you'll also notice it's done a cross-reference for us. Awesome. So you can see that even just modeling instructions is really, really powerful. I do want to say a couple of things about concatenation. For the most part, Ghidra says, well, the documentation says,
this is a C-like language. Yes, it is, until one point when it all goes totally out of the window: there's a semicolon in there for almost no reason, but a really, really important one. This is where you can concatenate two different tokens. Here we're saying, in the same way: I want to grab the first two bytes of the instruction, and op25 in that has to equal zero two; and I want to grab the next set of bits, at bit positions seven and eight, and I need that to be a zero. But then it says: please get another token, which I've defined to be
immediate16. So there's that, which says immediate16 is from 0 to 15, so that's the whole word from that. What happens here is we go and get the first token... I didn't do that. Cool. I would say that in 20 years I've never done animated PowerPoint apart from today, so I'm still working with it. So there we have the top token, which is our instruction, and then we go and get the next one; in this case it takes hex 300 and puts it into that register. There is a really, really big gotcha, because if you try and concatenate fields that are in that original token I talked about, so in
this case, if op78 had been after the semicolon, it wouldn't go and get that field from the first read; it would go and read another whole token and grab those bits out of that. So if you don't do this properly, you can end up with much bigger instructions than you expect. It absolutely does what it says on the tin; however, it really isn't obvious, and it's one of those places in the documentation where it says "you might find this confusing at first". Yes, yes I did. Some useful tools. So, sleigh: that's really cool, because that makes that giant XML file. If you really care about your XML, they've
released all the XML validators, so for all of those XML files I showed you at the beginning you can validate that they're correct. Ghidra scripts: they're written in Java, and you can make them do some awesome things. ReloadSleighLanguage is really helpful, because it allows you to reload the language without restarting Ghidra. It really speeds up development. It took me a while to find that, so I was killing Ghidra and starting it again, and Java's slow, so that was a real boon. And DebugSleighInstructionParse.java: you put the cursor on an instruction, whether it's disassembled or not, and it will show you what it thinks it's doing. In this case this is load immediate, so that's the previous
instruction I showed you. It shows how it's trying to pattern-match the bits, it tells you how long the instruction is, and it tells you what values it's using. In this case it's picked out that it's register five, which it was. So that was really, really, really useful. Just a quick word about some results. I've got about fifty instructions modeled, and disassembly works in most cases. The stack pointer is a real problem, because Ghidra and SLEIGH are built around a model that uses a stack and a stack pointer. So there is some more work to do, especially with regards to the instruction branch and load workspace pointer, which means that instead of just branching to the
particular place, you load a whole set of workspace registers from an arbitrary location. The interesting thing here is that, because of the non-stack-based architecture and the way this works, if this had won instead of the Motorola we would probably have ended up with a more difficult architecture to exploit, because buffer overflows on the stack would not have been a thing. Overflowing workspaces would be, but that would have depended on how the C compiler was written and whether it put workspaces together. Certainly you would have been able to overflow into the interrupt table. And with that I'm done. Any questions? So, that's a really good question, which I
can't answer. It's terrible from an architectural point of view. There's some really cool stuff on Wikipedia that says how bad it was compared with the 8086 and the 68000: on some operations it was almost 10 times slower, and just a register operation would take upwards of 40 microseconds, which was an absolute age for that sort of architecture. We don't know, is the quick answer, why Case used it. Maybe they got a deal with Texas Instruments. Anything else? No, it's not, sorry. With my employer, we can maybe, but I'm sorry. A bunch of this is rooted in academic research, so the Ghidra, well, the SLEIGH particularly,
documentation does link back to all of the academic research that's been done, and now that it's been open sourced there's much more opportunity to talk about it. I mean, if you'd said three years ago that I'd be standing here talking about Ghidra, I'd have laughed, because we would have said that's never happening; and, you know, the NCSC doesn't do talks, and technical talks. Thank you very much.