A Tourist's Guide to the ARM Cortex M3

Name: A Tourist's Guide to the ARM Cortex M3
Uploaded: 2020-05-03
Duration: 58 min 42 s

BSides Knoxville · 2020 · 58:42 · 889 views · Published 2020-05

Speakers

Travis Goodspeed, Ryan Speers

Tags

Category: Technical

Topic: Hardware Security, Reverse Engineering

Difficulty: Intermediate

Style: Talk

Mentioned in this talk

Tools used

Binary Ninja, Ghidra, IDA Pro, QEMU, Radare2

About this talk

A practical introduction to reverse engineering ARM microcontroller firmware, focusing on Cortex M devices like the STM32 and nRF51 series. The talk covers extracting firmware from physical devices, identifying memory maps without datasheets, and loading binaries into GHIDRA or IDA Pro. Attendees learn tricks for automating analysis, finding magic constants, and converting hardware problems into manageable software challenges.

Original YouTube description

Recorded on May 1st, 2020 at the 6th annual BSides Knoxville (virtual this year) conference. This is a friendly and fast introduction to reverse engineering ARM microcontroller firmware, starting from a physical device and ending with firmware that can be read, understood and patched. We’ll focus on Cortex M devices, such as the STM32 and nRF51 series. They have different architectures, registers, instruction sets, and calling conventions from x86, but they follow their own consistent rules. We’ll teach you how to rip firmware out of these devices, how to identify it and its memory map without a datasheet, and how to load it into GHIDRA or IDA Pro when no ELF headers are included. Beginners will find a handy new type of reverse engineering, and seasoned pros will still learn some nifty tricks that they might not already know.

Transcript [en]

yep so sorry ryan i don't know you but i know travis is one of our most neighborly hackers in the uh community and um i'm always interested to see what he's talking about if it's gonna be radios or watches or watches with radios or studebakers i see we've already at least got studebakers today we've ticked off the studebaker box real quick we got studebakers we got cats more studebakers and more cats yeah we got one cat picture that includes a studebaker excellent every combination let's see how many we can hit all right it's five o'clock uh you might as well uh go ahead and take it away i don't see

any reason to wait all right so um the lecture we'll be sharing with you today is based on a paper that ryan and i wrote a few years back um the concept is that um many of you have done a little bit of reverse engineering and you would like to do more reverse engineering of uh things and the skills that it takes to reverse engineer an embedded system really aren't that different from the skills that are necessary to reverse engineer like a desktop application or a game um so today we're going to discuss the fancy tricks that take what might be seen as a hardware problem and convert that hardware problem into a software problem that is then

easy to solve on a desktop with modern tooling and a nice environment uh sound good everybody um so i keep forgetting that there's no audience feedback in this so ryan is going to sort of play the part of the audience and interrupt me if i skip over something or fail to express it properly this is yeah feel free to send in those all right so um reverse engineering embedded systems is like uh loads of fun on its own but at the end of the day you're just looking at numbers and figuring out what the code does so what you really want is a worthy project at the end of it that proves that your time wasn't wasted

um one of the first of these that was pretty cool is an open source project called rockbox i believe this debuted in 2002 um the idea was that they wrote their own mp3 player application and then reverse engineered commercial mp3 player hardware in order to port this application to run on it so you can run this on an ipod you can run this on pretty much any of the popular mp3 player hardware from 15 years ago back when uh people collected mp3s um another nifty result of reverse engineering is a project that i started back when i lived in new york or the godless north as we call it in knoxville this project is called md380 tools it's

reverse engineered patches against a commercial push-to-talk two-way radio that can be applied against the manufacturer firmware so the patches are written in c they then change the behavior of the commercial firmware the hardware was designed for allowing you to add such things as um like a phone book of amateur radio operators that are registered with the protocol you can enable promiscuous mode you can also dump out the memory and dump raw packets off of the air in order to reverse engineer the settings of a commercial network and more recently hudson has been playing around with this thing which is a cpap machine that he's attempting to firmware patch into being a respirator

in all of these you need to begin by understanding the architecture that you're working with then you need to dump a firmware image load the firmware image into a reverse engineering tool and then read it and annotate it until you can understand what it does so we're going to begin by taking the architecture at a high level and then we're going to jump into some details um it's impossible to learn how to reverse engineer a new uh architecture in just one hour so what you should be trying to remember from this are the nifty tricks that make the reverse engineering easier or the things that might block you and how to get beyond them and if you remember some

high-level details about the architecture and its quirks that would be good too uh ryan you want to take this one and begin um like the high level of what an embedded arm chip is sure yeah so an embedded arm chip there's a few examples that you see here a common one the stm32 is a small microcontroller the nrf51 is from nordic it has an rf core in it and there's many other examples and in arm keep in mind that uh you have typically 32-bit registers but your instructions could be 32-bit or it could be 16-bit if you're in thumb mode and i think that one of the um most notable things here to keep in mind

as you're reverse engineering arm is figuring out which context you're in and figuring out how to follow as we jump between contexts and we'll cover that later we're not talking about chips that have support for legacy classic arm instructions these chips are notably small in size and when we mean small and we look at their memory maps you'll see it's less than a meg of flash and less than 128 kilobytes of ram a very small ram as well so this helps keep the code size that we have to look at smaller but also means that the developers are going to be playing tricks for code optimization and the code can run from sram but it usually runs in place from flash memory you'll only

find it running from sram when it's a temporary patch or when it needs to load a small driver to rewrite flash memory and for that reason can't be executing from flash memory at the same time the registers are different than in x86 or amd64 in that um someone put some thought into what the hell they should be and decided to give them meaningful names instead of conglomerates of all the backward compatibility history of the platform of the 1970s r15 is what you think of as the instruction pointer in an x86 we call it a program counter or a pc in risc chips but it's the same thing r14 is the link register r13 is the stack

pointer and then your local variables sit between r4 and r12 and it's the child's responsibility to restore these before returning to the parent while registers 0 to 3 hold parameters and return values and any time you call another function the child can clobber those and not fix them on the return the program counter is always odd when you're in thumb two mode and if it were ever even that would mean that you were executing in the classic arm mode with 32-bit wide instructions in thumb two mode your instructions can be either 16 or 32 bits wide and you have very little memory so the high bits of your program counter are pretty much always consistent

and the same and point to flash memory they can point to sram but in almost any code that you look at they'll be running consistently from inside of flash now i said that there were two instruction sets you've got arm and you have thumb two arm is optional in the smaller chips but anything that's a little bit older or a little bit larger older as in like uh game boy or game boy color uh game boy advance um i'm actually not entirely sure about the game boy first generation the game boy advance can definitely run both instruction sets and again it's the least significant bit of the program counter that tells you which instruction set you're

executing so on the machines that can do both if you jump to an odd address you get the thumb two interpretation of the function that begins one byte before the pointer that you're going to if it's an even address you don't use the thumb two instruction set you use the arm instruction set and they did this so that you could freely jump back and forth between these different instruction sets within one single program produced by a single compiler without the uh without breaking compatibility in function pointers um r14 is the link register you can think about this as the um the return pointer on the call stack in x86 except that in arm and in thumb and

power pc and many other risc instruction sets instead of the parent pushing the return pointer onto the stack as it calls the child it's the child's responsibility to save the link register before calling your grandchild so if you have a leaf function that you call a leaf function is a function that has no children of its own then um you branch to that function and it can do its work and return without ever writing to the stack as long as it doesn't need to save registers or call another child or anything and because this value gets thrown right back into the program counter on thumb two it will also always be odd r13 is the stack pointer it is

always even it is always 32-bit aligned and just like in x86 your stack grows down um the differences in the stack behavior are generally in that the stack is sort of manually managed when you do a call that doesn't directly change the stack it's the child's responsibility to save things after it arrives now the local variables are also the responsibility of the child to restore so when you have a thumb2 function you'll see it begin by pushing the link register and some of the local variables onto the call stack and then at the end it pops them back off and if you read through the function you'll see that those are the same registers that it changes

by overwriting the requirement is that the state be restored as long as it goes back to what it was the child can do whatever the hell it wants with them registers zero to three are used for the parameters and the return values um r0 is the first one by the calling convention and it also contains the 32-bit word that you might return if you return a 64-bit word then you will also use r1 and you're under no responsibility to preserve the other parameters so the child function can clobber any of these that it wants to and if the parent expects it to be the same it's its responsibility to save it onto the call stack before branching to

the child but you didn't come here for a description of the uh the number of registers their names and stuff um you wanted to know how you could get code and begin working with it so there are a couple of different ways to get the firmware um the easiest of course is a firmware update um ryan when you're trying to mess with something what fraction of the time do you get the firmware for free just by looking at an update or executable or the vendors i'd say yeah i'd say uh you get lucky on that about 80 percent of the time it's uh you know you can just um sometimes i mean not handed off but

you know yeah get it out of like a you know application download that goes through the smartphone and you can sniff on the wire or something right yeah uh for bluetooth you'll have to do chip extraction for bluetooth devices in particular um android has excellent support for logging bluetooth to disk which you can then replay in wireshark in order to capture the firmware update as it goes up there um another way of grabbing it is through jtag which is a debugging protocol that's very often unlocked and if jtag is unlocked then you can connect a debugger and dump an image out there's also a rom boot loader which can be either locked or unlocked if it's unlocked you just read it out

through the serial port um but if jtag or the rom bootloader are locked there are still options so for a specific example the stm32f0 family is rather popular um it has a vulnerability in its jtag implementation which i believe was first documented at usenix 2017 the vulnerability is that when you attach jtag to the stm32f0 flash memory is disconnected from the cpu core and the memory bus the idea here is that if you think that there is a functionality problem with your device you're free to connect a debugger to check for that but you're not allowed to read the code out of it that you would need to copy the device or to reverse engineer it

there's a minor problem here which is that sram is still exposed because it's not disconnected but the major problem that was presented at usenix is that when you attach jtag flash is disconnected one cycle too late so your jtag debugger can uh dump out a single word per connection and by repeatedly reconnecting and resetting the device the jtag debugger can dump out all the firmware the nrf51 is another popular embedded arm chip with radio support this is one of the first bluetooth chips that supported bluetooth low energy and was nice to program um there's a similar situation here in that jtag was supposed to be able to

connect to the chip without being able to dump out all of the code when the device is locked the vulnerability here is that you're still allowed to single step the cpu and you're still allowed to um view its registers you can read them and you can write them um you're just not allowed to read the memory yourself so what kris brosch did in 2015 was he found an instruction that loads the um the word in memory at the address that's in register zero it takes that word and it loads it into register one um and after he finds that instruction he can then loop through this instruction while changing the value of

register zero in order to dump all of the firmware out the jtag port even though the chip is locked and once you have the firmware you can unlock the chip or replace it with a new chip write in the firmware and continue debugging with full privileges there are also bugs in uh mask rom boot loaders um i used to love writing exploits for these because when you find a vulnerability in a mask rom and you exploit it you generally have an exploit for life

and ryan how long do they live they never go away these are the best ones have you ever seen one patched i think somebody patched one once by ripping everything out of the field and pushing out all new chips but it's just not uh not something people like to do so the cost of patching this is that you need to begin manufacturing new chips of every affected item in the family which requires a mask revision so you're looking at about a minimum of a quarter million dollars per unique model number in the family so in the case of stm32 they would need to do this separately for the stm32f407 and 417 and 207 and 217

and so on so the these bugs live a very very long time they're the sort of thing where when you um when you find one as a defender there's not much that you can do about it certainly can't upgrade anything in the field and if you have the decision to burn your warehouse into the ground and remanufacture the chips from scratch over a security bug you generally choose not to do it when i'm hunting for a bootloader vulnerability this is one of my favorite tricks that would never work on a pc um on a pc you have something called a guard page at uh address zero and the purpose of the guard page is that if you ever have a null pointer

meaning that the pointer is zero and you try to read from address zero the guard page will trap and inform the operating system and the operating system will kill your process and the reason for that is that very often you forget to initialize something or you try to allocate a buffer when you're out of memory or any number of things and you're stuck on embedded systems it works differently um this code here the print data function that i've thrown together that just prints uh the bytes at a location in hexadecimal um this will dump the beginning of your firmware image if you call it for some length and for address equals zero on many embedded arm chips because

quite a few of them will locate whatever memory you booted from to address zero meaning that when you dereference a null pointer you get the beginning of your code so if you can ever find a place either in the application or in the bootloader where you can trick it into sending you uh data from the device's own memory at address zero you have a firmware this brings us to the concept of um what is in memory at which location this is the memory map from the poc or gtfo article that ryan and i published uh it was illustrated by ange albertini and you know the only advantage of doing this lecture remotely is that you can read some of

the uh the items on this list the um gist being that memory is sort of arranged into 512 megabyte regions and that you've got a full 32-bit address space so four gigabytes of memory that can possibly be addressed for a microcontroller with one megabyte of flash and 192 kilobytes of ram and travis maybe just cover for people how if they don't have a poc or gtfo article to reference the memory map they might find that in documentation oh yes so this is always in either the data sheet or the programmer's guide which is sometimes called a family guide it will be unique to your chip or at least to the chip's family and it will um it will tell you

um which pointer goes to which location if you're so unlucky that you do not have documentation for this or it's not in the data sheets you'll find it in the linking scripts and if you don't have the linking scripts freely available from the manufacturer you can very often find them included with commercial c compilers and since you're not trying to use the compiler you're just trying to read its header files it's very often not restricted by the compiler's license enforcement when you look into the list you'll see that everything is sort of organized by the most significant bits of the address in this table here we're only looking at

the most significant byte if um if that byte is c0 we know that we're looking at mask rom this is the rom boot loader that is physically baked into the chip that cannot be patched or changed if the first byte is four zero we know that this is in the i o region and that the um the thing being pointed to is something like a counter or a timer or a serial port that interfaces with the outside world from the cpu core if the first byte is twenty in hexadecimal we know that it's sram and if it's 10 in hexadecimal we know that it's tightly coupled ram which on this particular chip happens to

be slightly faster but is not executable and the vast majority of the pointers that you see when taking the firmware apart will begin with zero eight which means that they're in flash memory which can't be freely written simply by writing to an address like the other memories can but it is directly executable and it's where the majority of our code runs and then on this chip the stm32f4 series if the first byte is a zero that's an alias to whichever memory we booted from which is probably either the mask rom or flash but the hardware does allow booting from sram for development purposes
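The byte-by-byte reading of the memory map described here can be sketched as a small lookup. This is only an illustration in Python, assuming the STM32F4 regions named in the talk; the function name `classify_pointer` and the region labels are made up for this example, the mask ROM region is left out, and any other chip will need its own table from the data sheet or linker scripts.

```python
# Hypothetical lookup table: most significant byte -> region, following the
# STM32F4 layout described in the talk. Other chips use different maps.
REGIONS = {
    0x00: "alias of boot memory",
    0x08: "flash",
    0x10: "tightly coupled ram",
    0x20: "sram",
    0x40: "i/o",
}

def classify_pointer(value):
    """Return a region label for a 32-bit pointer based on its top byte."""
    top_byte = (value >> 24) & 0xFF
    return REGIONS.get(top_byte, "unknown")

# 0x40023C00 is the i/o address mentioned in the talk.
for pointer in (0x20001430, 0x08004121, 0x40023C00):
    print(hex(pointer), classify_pointer(pointer))
```

A table keyed on a single byte is deliberately coarse. It is good enough for a first pass over a dump, but a real memory map has finer boundaries inside each region.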

so let's take a look at some pointers um 20 again from our list above 20 is in sram so the first pointer is to sram this is like a global variable perhaps the second and the third pointers those are both odd and they both begin with zero eight zero eight tells me that they are in um that they're in flash memory but their being odd the least significant bit being off by one tells me that these are either very strange pointers to a byte offset which exist but are rather rare or far more likely they're odd because they're code and they're thumb two code because remember this chip cannot run the classic 32-bit wide arm

instruction set it's unable to have code at even addresses so every function pointer that you find will be odd and then this last one uh four zero zero two three c zero zero that's in the i o region which we know because it begins with uh 40 in hexadecimal and there are header files for this chip that define all of the i o addresses this happens to be the flash control register so whatever function contains this pointer that function is interacting with flash memory it might be writing it or it might be um locking the chip we can then search for these numbers in order to find all of the functions that involve flash memory

or all of the functions that involve a particular serial port and by doing that we can then narrow down the code to the individual function that we care about
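Searching a raw dump for these magic numbers can be sketched in a few lines of Python. This is a minimal sketch, assuming a little-endian image and 4-byte-aligned constants; the helper names `find_constant` and `thumb_pointers` are hypothetical, and 0x08 as the flash prefix is the STM32 value from the talk.

```python
import struct

def find_constant(blob, constant):
    """Return offsets of every 4-byte-aligned little-endian word equal to constant."""
    needle = struct.pack("<I", constant)
    hits = []
    start = 0
    while True:
        index = blob.find(needle, start)
        if index < 0:
            break
        if index % 4 == 0:  # constant pools and tables hold aligned words
            hits.append(index)
        start = index + 1
    return hits

def thumb_pointers(blob, flash_top_byte=0x08):
    """Yield (offset, word) for aligned words that look like odd pointers into flash."""
    for offset in range(0, len(blob) - 3, 4):
        (word,) = struct.unpack_from("<I", blob, offset)
        if (word >> 24) == flash_top_byte and (word & 1):
            yield offset, word

# Tiny made-up image: an sram pointer, a thumb pointer, the flash i/o address, filler.
image = struct.pack("<IIII", 0x20001430, 0x08004121, 0x40023C00, 0x12345678)
print(find_constant(image, 0x40023C00))
print([(offset, hex(word)) for offset, word in thumb_pointers(image)])
```

Each hit is only an offset into the file. The function that owns the constant still has to be found in the disassembler by following references to that location.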

um ryan i've been hogging the mic would you care to talk about the interrupt vector table and what it means sure i can go through that um let's see actually travis why don't you start and i'll jump in on the next one that'll work so the interrupt vector table begins at the very beginning of an arm firmware image when you get a firmware image you don't get an exe file or an elf file like you would in windows or in linux instead you very often get just a blob of bytes and you need to figure out where those bytes need to be loaded and you also need to figure out inside of memory where the basic functions are
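As a rough illustration of reading that table out of a raw blob: on Cortex-M parts the first 32-bit word is the initial stack pointer and the following words are handler addresses, starting with the reset handler. The Python sketch below assumes a little-endian image that starts with the vector table; `decode_vector_table` is a made-up name, and the eight sample bytes are the ones read aloud in the talk.

```python
import struct

def decode_vector_table(blob, count=4):
    """Split the first `count` little-endian words into the initial stack
    pointer and a list of (handler_address, is_thumb) pairs."""
    words = struct.unpack_from("<%dI" % count, blob, 0)
    initial_sp = words[0]
    # Bit 0 marks thumb code; clearing it gives the real start of the function.
    handlers = [(word & ~1, bool(word & 1)) for word in words[1:]]
    return initial_sp, handlers

header = bytes.fromhex("30140020" "21410008")
stack_pointer, handlers = decode_vector_table(header, count=2)
print(hex(stack_pointer))   # top byte 0x20, so it points into sram
for address, is_thumb in handlers:
    print(hex(address), "thumb" if is_thumb else "not thumb")
```

If the first word does not land in RAM, or the handler words are even or point outside flash, that is a hint that the blob has a header in front of it or is not a Cortex-M image at all.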

this table begins with the initial stack pointer which is a really cool thing kind of unique to microcontrollers because it means that from the very instant the program begins running it has a functioning call stack yeah so that's r13 yes and that means that your reset vector which is the entry point for your code your reset vector can be written in c and you don't need any custom assembly language to write the beginning of your kernel on an arm cortex m firmware image so these are the beginning bytes of a real firmware image for um a cortex m3 chip you see that at address zero we have um these four bytes in hexadecimal we have 30 14

0 0 20 and because this is a little endian architecture we read them backward so the first 32-bit word in memory is 2 0 0 0 14 30 the 2 0 is in sram and that's our stack pointer and the value after that reading the next four bytes backward if you follow along is zero eight zero zero four one two one well the zero eight the most significant byte tells us that it's in flash memory and two one the least significant byte is odd which tells us that it's thumb code so when we decode these we get first the initial stack pointer which tells us which piece of ram the firmware is using and then we get a bunch of interrupt

handler addresses beginning with the entry point of the program and continuing on to other events like receiving a byte on a serial port or um having the nmi pin or non-maskable interrupt uh attacks and we can work on the component pieces of the firmware in order to um in order to confirm that we have it loaded to the right location so the next step uh the one that you've been waiting for is actually loading this into ida pro or binary ninja or ghidra or radare2 and doing something with it when you go through this and the concept go ahead travis oh sorry sorry when you um when you open the file in ida um it sees

this as a bunch of bytes and it doesn't really know the instruction set or the loading location or any of those details so you need to help it out a little bit the first thing that you need to do is change the processor type manually to arm little endian arm at some point did make big endian machines and even machines that can switch back and forth between big endian and little endian mode but these days it'll be rather rare for you to find big endian firmware so you choose little endian and then in processor options um you have all of these fancy uh variants of the instruction set but arm is pretty good about keeping them

backward compatible so about the only thing that you need to do for these images is mark that the architecture does not support 32-bit wide instructions and then ida is going to ask you um for the memory organization and if you remember that memory map that i showed you earlier this is where you present the memory map to ida so we do need a ram section or code that tries to read and write ram won't properly decompile and we also won't be able to track the global variables and we definitely need a rom section but ida defaults both of these to being at address zero and uh the ram to not even being of any size

so we correct it by two methods the first uh the easy one is the ram we know from the data sheet where ram begins and how large it is here i begin ram at um the address with a high byte of 20 because that's what i found in the memory map of the data sheet and i set the size to 128 kilobytes because that's also the value that i find in the data sheet the rom i load to a strange address it's zero eight zero zero and then um c thousand and the reason why i do that is because in this particular device there are two kernels and the first sits at the very beginning of flash memory

and acts as a sort of recovery bootloader to start the second one and the offset for the second one i had to find uh by reading that first bootloader's code so now that i know that it begins just 48 kilobytes further in memory than the documentation says i can apply that in ida and have everything loaded at the correct location and there's one more thing that you'll need to do or ida will give you total garbage for your results which is that um when ida loads an arm image it's a little bit confused about the instruction set because um on the arm architecture you have two different instruction sets and you can freely jump between them

so you need to use alt g and set the t value to be one when you do this you then told it that you want a code 16 section instead of a code 32 section and from that point on the disassembly will be correct and then you can begin exploring and marking up your functions at the very beginning it knows nothing about it so you see code 32 as your organization everything is at the correct loading address and the bytes are correct but ida doesn't know where any functions begin so the first thing that you do is you type alt g and set t equals one in the dialog to change code 32 to code 16.
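Since the disassembler starts out not knowing where any function begins, one cheap heuristic is to scan the blob for 16-bit Thumb `push {..., lr}` instructions, which encode as 0xB5xx, and treat each hit as a candidate function entry. The sketch below is an assumption-laden illustration rather than anything from the talk's slides: `find_thumb_prologues` is a hypothetical name, the 0x0800C000 base is the load address mentioned earlier, 32-bit wide push encodings are ignored, and data that happens to contain 0xB5 in the right place will produce false positives.

```python
import struct

def find_thumb_prologues(blob, base):
    """Return addresses of halfwords that encode a 16-bit `push {..., lr}`."""
    hits = []
    for offset in range(0, len(blob) - 1, 2):
        (halfword,) = struct.unpack_from("<H", blob, offset)
        # 0xB4xx is push without lr, 0xB5xx is push with lr in the list.
        if (halfword & 0xFF00) == 0xB500:
            hits.append(base + offset)
    return hits

# push {r4, lr}; nop; pop {r4, pc} as little-endian halfwords.
sample = bytes.fromhex("10b5" "00bf" "10bd")
print([hex(address) for address in find_thumb_prologues(sample, 0x0800C000)])
```

A blob that yields many such hits at plausible spacing is probably Thumb code, and a blob that yields almost none is a reason to recheck the processor settings before annotating anything.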

the next thing that you do is you use the d key to mark these data bytes as being data words so once we mark all of them up as 32-bit words we find the interrupt table and it's very large it's more than a page in this architecture and each of these 32-bit addresses after the first one is a valid interrupt handler that the cpu will jump to when the right event happens we can then use this for a sanity check if we made a mistake at this point or up to this point if our alignment is off if our configuration is wrong there are tells that can let us know that we're about to waste our time by

reverse engineering gobbledygook and let us sort of uh step back a minute and take a second look the first thing to check is that some of the instructions are only two bytes wide implying that we're properly in thumb mode the other thing to check is that any function which is a parent meaning that it calls another function should begin by pushing at least the link register and maybe some other registers and also at the end of that function it should pop those back um off of the stack and into the registers that they came from before returning we should also check that every target of a branch and link instruction which is how function calls work

that every function call is to a reasonable address meaning that when you look at the address that it's targeting you find the entry point of another valid function this one isn't as useful as you would expect it to be because the branch and link instruction is relative in thumb two meaning that if i've loaded my image to the wrong address function calls will still work and we'll still jump to the correct address because they inherit the um the same mistake as our loading and then at the end of the function after the program counter is restored and the function goes back to its parent we will either find another function or

thumb has something called a constant pool and the constant pool is used instead of where x86 would use immediate values so in x86 you can have an instruction that contains a 32-bit word inside of it in thumb 2 the longest instruction is only 32 bits so there's no room for that constant to fit so instead you point relatively to the end of your function where you have a list of all the 32-bit values you might want to load and it can be fetched from there this is an example of a thumb 2 function it's a short one i use this one during a sanity check the purpose of this function is just to

make sure that everything is loaded right we see that at the very beginning it pushes register 4 and the link register to the call stack at the very end it pops register 4 and the program counter off of it this is a neat way to cheaply return because in popping the value from the what was the link register at the time the function entered back into the program counter it can skip restoring that we also see a bl instruction in the middle which calls another function within the program jumping there i find a valid function which tells me that this is correctly loaded when you have your image and you begin to understand it you're generally trying to do it for um

for some reason one reason might be to change the firmware on the physical device but another useful thing that you might want to do is extract a library from the physical device so that you can run it on your desktop this is especially true if the library is proprietary or if you can't um you can't purchase the copy so this radio the tytera md380 has a proprietary audio codec called ambe plus two and the um if you want ambe plus two there's an open source decoder but there is nothing open source that will encode the audio instead you're supposed to purchase a dongle that plugs into the serial port it's supposed to send your audio samples out the serial port

and then get back the compressed packets so what you can do instead which i explained in uh poc or gtfo 13:5 is you can take the firmware image that you now understand because you reverse engineered pieces of it and you can re-link it to be a linux arm executable you can then compile your own c code to run against that library and the end result is that you have um a unix command line tool that is able to call the proprietary library functions of the physical radio and because it has the correct code and it has the correct ram everything is in the correct position and everything still works these are the uh gcc 9 commands to actually perform

the linking. The gist of it is that I'm telling it to load the firmware image where the firmware would go, at 0x0800C000. So this loads experiment.img as our firmware where the firmware should go, and then this loads ram.bin into memory where the RAM should go. I also give it section names so that my linker can provide pointers for these back to my main function. When I link all of this together, I can then call into these functions. So the first thing that I do is make a function pointer to the firmware address where I found a function that does nothing

but return. What this piece of it does is make that function pointer and then call it. If it doesn't crash, that means that it has called into the firmware and successfully returned. If it crashes, it might be because the page is not executable, or it might be because of another mistake along the way. After I have a very short function working, I can then take a much more complicated function, ambe_decode_wav. This is the function that takes a compressed AMBE audio packet, as the radio would receive over the air, and decodes it into a wave buffer that will be played out the sound card of the radio, and this

is how you receive digital audio and hear it as a voice that you understand. Now, you can see from my comments that I don't really understand what all of the parameters of this function do. I know that a5 and a4 are always zero, so I just make them zero. The buffer is, I think, a temporary buffer, and I use the one that was already loaded into RAM on the device because there might be tables that are required. Then a7 is always the same address inside of SRAM, at 0x20011224. I just repeat that same address from what I see in the disassembled firmware. I don't actually know what it means or

what's located there, and I don't have to, because as far as that function is concerned, everything that it expects in flash memory is at the correct address and everything that it expects in RAM is at the correct address. So if I call it with the same parameters, it performs the same functions. This audio codec happens to have a copy protection mechanism, which I didn't notice when I first emulated it, because in my emulation the copy protection check had already passed. So by running the firmware image that I've dumped out of the real radio and the RAM that I've dumped out of the real radio, it's almost like I've paused the running program inside of the physical hardware and then

migrated it into a Unix process and continued on from there. Part of the weirdness of this function is that in DMR you have two time slots, so you have to call the decoder twice, and only then will it work, so I just double that up. But the end result is that I have an ARM Linux executable which runs perfectly fine under QEMU, so it doesn't matter that my desktop is an Intel Xeon or whatever, and which is then able to call back into the firmware. If I were doing this for a commercial project instead of a hobby project, or if I were more diligent in the quality control

for my hobby projects, I could use this to write unit tests and integration tests to know that my understanding of the meaning of the firmware and the locations of the functions was correct. There are some other nifty tricks that you should know.
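
Before moving on to those tricks, the smoke test described above (make a function pointer to a firmware routine that does nothing but return, call it, and see whether the process survives) can be sketched in C roughly as follows. This is a minimal sketch, not the code from the talk. FW_NULL_FUNC_ADDR is a made-up placeholder address, and desktop_stub is a hypothetical stand-in so the same logic can be exercised on a machine where no firmware is mapped. One detail that is easy to miss: Cortex-M code is Thumb, and a pointer used to call Thumb code needs bit 0 of the address set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address of a do-nothing routine inside the re-linked
 * firmware. This is a placeholder, not a real MD380 symbol. */
#define FW_NULL_FUNC_ADDR 0x0800C100u

/* Signature of the routine: no arguments, no result. */
typedef void (*fw_void_fn)(void);

/* Turn a raw firmware address into a callable pointer. Thumb code must be
 * entered with bit 0 of the address set, so OR it in here. */
static fw_void_fn thumb_entry(uintptr_t addr)
{
    return (fw_void_fn)(addr | 1u);
}

/* Desktop stand-in for the firmware routine (an assumption for testing). */
static int stub_calls = 0;
static void desktop_stub(void)
{
    stub_calls++;
}

/* Call the routine once. Reaching the return statement means the jump into
 * the code and the return from it both worked; a bad mapping or a
 * non-executable page would crash the process before this returns. */
static int smoke_test(fw_void_fn fn)
{
    assert(fn != NULL);
    fn();
    return 1;
}
```

In the re-linked ARM Linux binary the call would look like smoke_test(thumb_entry(FW_NULL_FUNC_ADDR)); on an ordinary desktop only smoke_test(desktop_stub) is safe to run.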

When you're reverse engineering desktop software, we've worked our way to a point in the world where most new software uses UTF-8. There are some exceptions: if you're looking at very old software you might find a national standard, or if you're looking at Windows software you might find UTF-16 instead of UTF-8. In firmware for ARM, it's surprising that what you'll very often find is the Chinese standard called GB2312. So sometimes when you're looking through a firmware image for a device that has a display in Chinese but not in English, and you run strings on it and look for Unicode strings and find none, you will still find GB2312 strings

if you try that. You can then run them through Google Translate and use those strings to understand the meaning of the code involved. I have a GitHub project that dumps these strings, because the standard tools don't seem to like that format. Magic constants are also useful. In the case of reverse engineering the radio firmware, there was support in the protocol for an emergency shouting mode, where every radio that hears the message will decode it. You use this if you need to address everybody on the network even though some of them might not be configured to receive you. By searching for the address that you send that audio to, I was able

to find the functions that control whether the radio plays incoming audio or not, which allowed me to make a promiscuous mode that decodes all incoming audio. You can also search for masks. In this particular protocol, addresses are three bytes long instead of four or eight or 16 or whatever else you would want, so by searching for 0x00FFFFFF I could find all of the masking operations that were trying to drop the extra byte that was not a part of the address, and that allowed me to locate large parts of the networking stack. You can also hunt for reference code. In my case, I was attempting to reverse engineer a device which locked its own firmware

and I needed to find the function that locked the firmware in order to disable it. To do that, I looked at the manufacturer's reference design, and they provided this example function, which was used exactly, with no modification, by the radio vendor. It was statically linked into the firmware image, so I could look at what assembly code was generated by compiling this and then search for everything similar, and by clicking through I was able to find the correct function in very little time.
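
The constant and mask searches described above can be sketched as a scan over the raw image for a 32-bit little-endian value such as 0x00FFFFFF. This is a minimal sketch under stated assumptions: the names read_u32_le and find_constant are invented for illustration, and the scan only looks at 4-byte-aligned offsets, which suits constant-pool entries but will miss a mask that the compiler folded into an instruction instead of storing as data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian word from a byte buffer. */
static uint32_t read_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Scan the image at every 4-byte-aligned offset for `value`.
 * Stores at most `max_hits` offsets in `hits` and returns the total number
 * of matches found, which may be larger than `max_hits`. */
static size_t find_constant(const uint8_t *img, size_t len, uint32_t value,
                            size_t *hits, size_t max_hits)
{
    size_t count = 0;

    assert(img != NULL);
    for (size_t off = 0; off + 4 <= len; off += 4) {
        if (read_u32_le(img + off) == value) {
            if (count < max_hits)
                hits[count] = off;
            count++;
        }
    }
    return count;
}
```

Each offset returned is relative to the start of the image, so adding the load address gives the location to look up in the disassembler and follow cross-references from.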

You can also automate a lot of the drudge work of sorting through the image and identifying things. One very useful thing is to realize that most functions are targets of a branch and link instruction. The only exceptions to that in a normal application will be functions that are called by pointer, where you will find a copy of the function pointer either in flash or SRAM, and interrupt handlers, which are in the interrupt handler table. So if you run through memory and find every branch and link instruction and then look at its target, and then you instruct your disassembly tool through an IDAPython script or the scripting for whatever other tool

you're using, you can automatically identify all of those functions and then begin working with the entry points all known, or even instruct IDA to decompile the entire image and then search through it as C code without having to look at the assembly. When the vendor releases a firmware update and you would like to stay up to date with that firmware, but you've written your patches against an older firmware image, you need some way to port them automatically, because if you do all of this by hand then you forget why you did certain things or why you patched out certain functions, and it becomes difficult to maintain. Instead it would be nice if all of your

patching were performed in C, in the style of replacing a function of a particular name, and the names you apply to one version of the firmware could automatically move to future versions. If you're trying to do this between, let's say, Adobe PDF reader on SPARC and modern macOS, that becomes a hard problem, because you have different compilers, different instruction sets, and different major versions, and there are very smart people who work on those problems and get results in them. In my case, I'm interested in getting an answer immediately with a couple pages of C code. So the way that I do this is that I realize that embedded developers usually use the same

version of the compiler. They don't update their compiler frequently like a desktop Linux user might. When you're using the same compiler to compile different versions of the same firmware, most of the functions don't change at all, and those that do change are usually for the small feature that gets fixed; they're not necessarily the same functions that you might want to hook. So you're usually looking at the exact same assembly code linked to a new address, and when it's linked to a new address, the targets of the BL instructions (BL means branch and link, the Thumb version of a function call) change, but nothing else changes.
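
Since only the BL targets are expected to move between builds, two copies of a function can be compared one 16-bit word at a time while forgiving any differing word whose top nibble is F in both copies. The following is a minimal sketch of that comparison, not the code from the md380tools project. The names top_nibble_is_f and count_mismatches are invented here, and the top-nibble test is only a heuristic, since other 32-bit Thumb-2 instructions also begin with a word whose top nibble is F.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* True when the four most significant bits of a 16-bit Thumb word are all
 * set. Used as a cheap test for "this word is half of a BL instruction". */
static int top_nibble_is_f(uint16_t w)
{
    return (w & 0xF000u) == 0xF000u;
}

/* Count the 16-bit words that differ between two equally long functions,
 * ignoring positions where both words look like relocated BL halves.
 * A result of 0 means the two functions are treated as equivalent. */
static size_t count_mismatches(const uint16_t *a, const uint16_t *b,
                               size_t nwords)
{
    size_t mismatches = 0;

    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < nwords; i++) {
        if (a[i] == b[i])
            continue;               /* identical word, nothing to do */
        if (top_nibble_is_f(a[i]) && top_nibble_is_f(b[i]))
            continue;               /* both look like BL halves: relocated */
        mismatches++;               /* a genuine difference */
    }
    return mismatches;
}
```

Running this for every pair of candidate functions in an old and a new image, and carrying a symbol name across wherever the count is zero, is one way to port names between firmware versions.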

So we can do this trick where you compare how similar two functions are by stepping through every 16-bit word in the function, with A being the word in the first function and B being the word in the second function. If A equals B, then great, they're identical and you can move on. If they don't equal each other but they both have a most significant nibble of F, meaning both of them have their most significant four bits set, then that 16-bit word is part of a branch and link instruction, and it doesn't matter that it changed, because it's expected to during relocation. This lets you come up with a very, very short C function

that will race through large firmware images and directly compare their functions in order to identify which ones are equivalent, so that you can import your symbol names. If you'd like an example of this, there's code in the symbols directory of the md380tools project. We've gone over a lot in this lecture. I hope that you picked up some nifty tricks that you'll use in your own projects. I do not expect you to remember every detail, and I certainly wouldn't if I watched it myself, but it's the nifty tricks that save you when you're blocked, or that allow you to hack something that you might otherwise get stuck on, that are the valuable things that I look

for in these lectures. If anyone wants reference material for this, PoC||GTFO 11:6 has the full article that some of this content came from, so you can also look at that.

Howdy. Hey, so I sent out a query for questions. One question we got was, what modern devices have the Cortex-M in them? I was just doing some noodling while you all were talking, and I found that iFixit is a pretty good place to find what chips are in what things, since they do teardowns and they write out what chips they find and identify the chips.

Oh yeah, their teardowns are great. They also show you which tiny ribbon cable you're most likely to screw up when you open the device. Yeah, yeah, this thing will tear if you pull that screen off. Yep. So I found out that some of those hoverboards use them, the Apple TV uses them, the iPod touch sixth generation, and the Pebble smart watches use them. So in the Apple TV this won't be the main CPU; this will be off on the side. They use it both in the main unit and inside the remote controller. Does it have a chip name? It's for the RF4CE, Travis, in that one,

I believe. Yeah, so that's a great point, right: you'll often find these families in smart remotes. I don't know specifically if it's the M3 in them, but think of any remote that you push (yeah, I'm pretty sure they're actually M3-based) and do voice to tell your TV what you want to do, for example, whether that's a TV, an Apple product, or a third party. Typically these are integrated into some of those radio chipsets or based on this architecture. And if you have a small consumer device

with an LCD screen but that clearly isn't as powerful as Android, it's quite likely one of the higher-end Cortex-M chips. Cool, yeah, that's what it looks like with a few devices here. Multiple versions of the Oculus Rift also use it, and the second generation Nest. So that was pretty cool. I was trying to think, how would I even find that, because manufacturers usually don't publicize what chips they're using for what things, and I realized iFixit, that's what they specialize in. So the manufacturer won't publicize it, but very often you can find out which chips they use from their FCC filing.

Yeah, yeah, we see a lot of leaks come from that before devices come out sometimes.

Well, very cool. So I'm not seeing any questions, so thank you both very much for the talk. There's definitely some discussion going on on the Discord, which has been fun to have.
