Writing a Mask ROM Tool

Name: Writing a Mask ROM Tool
Uploaded: 2023-05-27
Duration: 44 min 30 s
Description: Travis Goodspeed presents Mask ROM Tool, a CAD application for reverse engineering firmware from photographed microchip die. The talk covers chemical decapsulation and delayering techniques, photographic annotation of ROM bits, and automated conversion to binary images for analysis in tools like Ghi

BSides Knoxville · 2023 · 44:30 · 884 views · Published 2023-05

Speakers

Travis Goodspeed

Tags

Category: Technical

Topic: Hardware Hacking, Reverse Engineering

Difficulty: Advanced

Research: Technical Deep-dives

Style: Talk

Mentioned in this talk

Tools used

Binwalk, Ghidra, IDA Pro

About this talk

Travis Goodspeed presents Mask ROM Tool, a CAD application for reverse engineering firmware from photographed microchip die. The talk covers chemical decapsulation and delayering techniques, photographic annotation of ROM bits, and automated conversion to binary images for analysis in tools like Ghidra. Real-world examples include extracting code from the Clipper chip and Game Boy ROM.

Original YouTube description

Suppose that you have a photograph of the bits in a microchip, and you want to reverse engineer the firmware. This talk describes how I built a CAD tool in C++ and Qt6 for annotating and decoding the bits. After chemically decapsulating, delayering, and staining a microchip, its ROM can be photographed to reveal the physical bits. This talk will describe how I wrote an open tool in Qt6 and C++ for annotating these photographs to painlessly extract tens of thousands of bits for reverse engineering. This talk describes Mask ROM Tool, a CAD tool for converting photographs of bits in a microchip and then converting them to a ROM image suitable for reverse engineering or emulation.

Transcript [en]

So today we're going to be talking about mask ROM reverse engineering. For those of you who saw my lecture last year, I explained the chemistry required to take apart a microchip and photograph the bare die of that microchip, and from the die you can begin figuring out how that chip works and reverse engineering its behavior. One of the pictures that I had in that lecture is the background here. This is the mask ROM for the MYK-82. It is a cryptography chip made by Mykotronx for the Fortezza card, and it's of historic interest because this was the center of the Clipper chip controversy. The Clinton administration decided that they were going to backdoor cryptography

for basically the entire country. We could all do secure telephone calls, like with a landline and a modem that would digitally encrypt everything. We could all do that as long as we had a back door, so that law enforcement could still listen to our phone calls. The idea is that if law enforcement had some wiretaps on you, they could get a warrant, present the warrant to maybe the NSA or a different group, and then get back a key unique to your cryptography chip. That key would allow them to decrypt your session key, which was revealed during the transaction, and then they could decrypt the entire phone call and hear everything that you said.

At the time this was pretty controversial, and people were fighting back and forth about it. But what I wanted to do was read the code that was inside of this chip. Last year I got this photograph, and in this photograph I have all 130,000 bits of the firmware, but at the time I did not have software to reverse engineer those bits and to extract them. What I really wanted was not a picture. I wanted a text file with all of the bits in physical order as ASCII art, and I wanted a script that would convert that to a .bin file that I could load in IDA Pro or Ghidra or Binary Ninja for

reverse engineering. So in this lecture I'm going to tell you how I wrote that tool. It is available for free, it's open source, and it runs on Linux, Mac, and Windows. You can get it at maskromtool.com. There are also tutorials to teach you how to do this yourself: I've published a photograph of the CPU from a Nintendo Game Boy, and with that photograph you can follow step by step, and at the end of the tutorial you will have the .bin file that MAME uses when it emulates that same Game Boy. To quickly recap, there are different types of memory inside of a chip, and there are also different types of read-only memory inside of the chip.
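As a rough illustration of the text-file-to-.bin step mentioned above, the sketch below packs ASCII-art rows of `0` and `1` characters into bytes. The bit ordering (rows top to bottom, bits left to right, most significant bit first) and the function name `ascii_art_to_bytes` are assumptions made for this example. Real chips usually need their bits reordered first, and this is not the tool's own exporter.

```python
def ascii_art_to_bytes(art: str, invert: bool = False) -> bytes:
    """Pack '0'/'1' characters into bytes, most significant bit first.

    Assumed layout (hypothetical): rows are read top to bottom, bits left
    to right, and every eight consecutive bits form one byte. Characters
    other than '0' and '1' (newlines, spaces) are ignored.
    """
    bits = [c for c in art if c in "01"]
    if len(bits) % 8 != 0:
        raise ValueError("bit count is not a multiple of 8")
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = int("".join(bits[i:i + 8]), 2)
        if invert:
            # A dark spot is not guaranteed to mean 1, so allow flipping.
            value ^= 0xFF
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    art = "01001000\n01101001\n"
    with open("rom.bin", "wb") as handle:  # output filename is an assumption
        handle.write(ascii_art_to_bytes(art))
```

The `invert` flag is there because, as the talk notes, whether a dark spot means one or zero can differ from sample to sample.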

In this lecture we're not going to be talking about flash ROM or EPROM or anything that can be programmed in the field. We are only going to be talking about mask ROM, which is programmed by making the right shape in a mask at the semiconductor factory that stamps out all of these chips. Inside of the mask you will have a little shape for every one, and you will have the absence of that shape for a zero. What we are doing is trying to get a photograph that shows us that same shape, or something related to that shape, so that we can mark all of the bits out of the picture and work our way backward to

a digital file. These ROMs are programmed at the factory, and like any other memory they can contain code or data as the designer chooses. They are not always visible from the surface. A lot of the photographs I'm going to show you today required chemical processing, even after decapsulating the chip, in order to remove the upper layers of the chip to reveal the bits. Or even further, some of these encode the bits as the difference between P and N silicon, which are the same color, so you need to chemically stain a difference into the colors in order to extract them. These mask ROMs are used in very, very high volumes.

Changing the mask costs on the order of a hundred thousand dollars, so there's no point in using a mask ROM if you are not making a bajillion of something. Examples include video game cartridges: Nintendo, Atari, and Super Nintendo all use mask ROMs. Bootloaders are also very common, because all of the microcontrollers in the family will share the same bootloader and you don't need to make that unique to the customer. They're also used for copy protection, because very often these mask ROMs are contained inside of the microcontroller that runs the product. When you buy an inkjet printer for thirty dollars and then you have to pay three hundred dollars for ink,

inside of the ink cartridge there will be a little microcontroller with a little mask ROM that is responsible for proving that that cartridge is real. If you want to bypass these restrictions, if you want to read this code, if you want to archive these games, and there's no electrical connection, it's sometimes very handy to fall back on the photographic method. Here is the mask ROM from the MYK-78 Clipper chip. This was the very first chip that the public could buy with key escrow cryptography, so that Bill Clinton could spy on us. I'm going to zoom in a bit by taking one half of this image. In each row you see that there's a

little shape and that there are little dark spots. Each dark spot is a one, and if there is no dark spot that's a zero. Now of course I mean this in the physical sense; I might have that backward for any particular sample. After you get these bits out, it's a separate problem to descramble them and to move them from the physical order of bits to the logical order of the bytes. The tool that I wrote simply loads up the photograph. I give it a .bmp file or a TIFF file. I then mark in that tool where all of the rows are and all of the columns, and wherever a row intersects with a column,

that's where a bit is. The tool then uses the color at that point to identify what the bit is and to mark it all up. So all of these little blue and red squares are the bits that were found inside of this mask ROM inside of this chip. As for the contents here at the time that the chip was manufactured and sold to the general public, this memory contains the F-table structure for a cryptographic algorithm called Skipjack. Skipjack was still classified in 1993 when this chip was available, so you could have gotten classified encryption technology from the US government just by purchasing one of these chips, getting this image, and extracting the bits out

of it. They've since declassified it, which is particularly handy because I can confirm what this table has by comparing the PDF of the declassified algorithm to the photographs and the bit extraction that I've made in my lab. Now, I'm not the first one to build technology like this. There are basically two tools that you need in order to read a mask ROM from the photograph. The first thing that you need is a bit extractor. Rompar by Adam Laurie and Bitract by Chris Gerlinsky are prior art for this, and the Mask ROM Tool that I'm talking about today is my replacement for these tools. The bit extractor's only job is to begin

with a photograph and to end with ASCII art of all the bits in their physical order. It would also be nice if it did things like error correction or design rule checks, so that you know when you've made a mistake. But it's a second tool's job to convert that ASCII art over to the logical bytes. Zorrom is probably the best that's available right now. It's by John McMaster, and it's a command-line Python tool. You give it the text file and you tell it what bytes you think are in there, and then it will give you all of the permutations that have those bytes, and then you spot check them. Like, maybe you look at the

disassembly and see whether or not it makes sense, or maybe you draw the artwork. If you're extracting artwork from a video game, when you start seeing sprites or tile sets that are accurate, then you know that you've correctly interpreted the bytes. I am working on my own tool for converting the physical bits to the logical bytes. That is not yet ready for release, but I hope to have it in the next six months, and I hope to have it better than Zorrom, but I'm not yet to that point. We're going to quickly review the chemistry that's involved. I'm not going to repeat my lecture from last year, but

just to get us on the same page, there are basically three procedures that you need for extracting these bits. The first one, which you always have to perform, is removing the plastic covering from the microchip. You need to get to the glass. I do this with 65% nitric acid and a hot plate in a fume hood, and I pump the air out. The second procedure is delayering the chip. I use dilute hydrofluoric acid for this, in the form of rust stain remover. What this does is burn off the top of the chip and leave only the lower levels, and that's how I get the

top metal layers out of the way so that I can look at the bits, which are often encoded on lower layers. And then finally, a third type of mask ROM will not be visible at any vertical height within the chip, because the bits are encoded as the difference between P silicon and N silicon, which are exactly the same color. For that I use a mixture of acids called a Dash etch, which turns silicon brown, but it turns P silicon brown a little faster than it does N silicon, and that creates a difference that I can then photograph under a microscope. This is the nitric acid that I use. I had this delivered to the mail

room back when I was a grad student, and the computer science mail room was not happy having the skull-and-crossbones package sitting on a shelf while I was out of town and unable to pick it up. For larger chips, particularly old ones, or ones that you want to decapsulate without breaking the bonding wires or damaging the chip, stronger nitric acid is preferable. This acid, the red fuming nitric acid, has less than two percent water in it, and water will dissolve the acid salts that are left behind on the wires. As the metal interacts with the nitric acid it kind of rusts a little. If there is water, the water will brush the rust

off and allow the acid to go further and rust more of it. So if you have almost no water, like this acid does, you can burn through the plastic without damaging the metal, because the metal will be protected by a microscopically thin layer of rust. So we take the chip, we drop it in the acid, and the acid turns green from the metal dissolving. We keep heating it up, and eventually we get the die out. That little guy there at the end of my tweezers, that's what we're looking at. We then put it under a microscope. The microscope can show us the die, but usually they're too big to fit within one frame, so what I'll do is take three

frames and line them up in the right way, so one, two, three, and then I can combine them into a panorama that gives me the whole chip. When I'm doing ROMs I do the same procedure, but under higher magnification and only in the area of the chip that contains the ROM. So first I'll make a big picture of the whole chip, then I figure out which part I care about, and then I photograph that in obscene detail. At the end of this I get a full picture of the whole chip or the whole region. Delayering is necessary because in very few of these chips are the bits visible

from the surface. An example in which they are would be the Nintendo Game Boy. The Nintendo Game Boy has a 256-byte program built into its CPU whose only job is to scroll the Nintendo logo from the cartridge across the screen and to verify that it is the Nintendo logo that they have trademarked, so that they can sue the living out of you if you include a Nintendo logo on an unauthorized game that has not been licensed by them. That is visible from the surface. But for the ones that are not, for the ones where you need to go lower, you use rust stain remover, this little stuff, and you have to use a

plastic beaker, because hydrofluoric acid will attack glass. It will dissolve it, it will etch it, it turns it foggy, and if you had it strong enough you could burn all the way through a glass beaker. As for the effect of the delayering, this is the surface of the chip, and a lot of the details are kind of obscured by this filler pattern here that you can see under higher magnification. If we want to see beneath that, all we have to do is boil this in dilute hydrofluoric acid for a minute or two, and then the obscuring features that you see here kind of go away. These are the same chip: this is about halfway etched, and

this is a bit further down, and now you can see very separate organs within the chip. You can see that there are different memories, and CPU components are kind of grouped together. In the early chips, like from the 70s and the very early 80s, you can even reverse engineer the instruction set by reading the lookup table that is the microcode of the CPU. So now that we have the picture, I'm going to walk you through how these are annotated. Basically, we first need to draw row and column lines. We need to tell the software where every row is and where every column is, and if our pictures don't line up

perfectly we can do this in smaller segments. You can do a couple of banks in one row and then do another row separately. We also have to teach it what the difference in the color is between a one and a zero. Ideally we want a single color channel that has a bimodal distribution, with one being a lot brighter than the other. We only need this in one color channel, so we look for a difference in green, red, or blue, and then we sample on that color channel individually in order to get the best separation. What you'll see is that very often

there is no difference in one or two of these channels, and it's only the third in which we can tell the difference between one and zero. And you're going to see that the tool does a lot of the work for me. It marks the bits themselves at the intersections of the rows and the columns, so I only have to draw roughly as many lines as the square root of the number of bits; I don't have to draw as many lines as there are bits. It also has design rule checks that will point out when I screw up. If I hit the button to mark a line twice instead of once, then I'll have two lines

in the same position, and the software can recognize that and warn me that I messed up, and even show me where in my design that happened. And then it's very important to be able to import the difference between two project files. When I'm working on a very big ROM, like the 130,000 bits in that cryptography chip, I'm going to make mistakes. I'm going to have dust on the chip. I'm going to have a photo out of focus. By importing the difference between two projects I can compare the bits that they have. Every bit that I get correct will be identical in two or three or four recordings, but any bit that I screw up I should uniquely

screw up in one of the outputs, and by comparing them I can figure out which one is wrong and correct it. This is our first target for today. This is a music chip. Back before we had computers in doorbells, we used little digital things that would have a little song to them. When you power up this chip it plays Für Elise by Beethoven on loop. That's all that this chip does. It has three pins. You give it power and ground on two of them, and the third pin will output Für Elise. You run that to a piezo buzzer or to a speaker and you have yourself a doorbell or an electronic greeting card.
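The annotation idea described so far (a bit sits wherever a row line crosses a column line, and a single color channel is compared against a threshold) can be sketched in a few lines. This is only an illustration under stated assumptions: the image is a plain nested list of `(r, g, b)` tuples rather than a real photograph, the names `sample_bits` and `margin` are hypothetical, and "darker means one" is assumed even though the polarity can be reversed on a given chip.

```python
def sample_bits(image, rows, cols, channel, threshold, margin=10):
    """Classify the pixel at every row/column intersection.

    image[y][x] is an (r, g, b) tuple; rows holds y coordinates and cols
    holds x coordinates of the drawn lines. channel is 0, 1, or 2 for
    red, green, or blue. Values darker than threshold are read as 1
    (an assumed polarity). Samples within margin of the threshold are
    also reported as ambiguous so a human can review them.
    """
    bits = []
    ambiguous = []
    for y in rows:
        line = []
        for x in cols:
            value = image[y][x][channel]
            line.append(1 if value < threshold else 0)
            if abs(value - threshold) < margin:
                ambiguous.append((x, y))
        bits.append(line)
    return bits, ambiguous


if __name__ == "__main__":
    # A made-up 2x2 "photograph" with only the red channel populated.
    tiny = [
        [(200, 0, 0), (50, 0, 0)],
        [(120, 0, 0), (210, 0, 0)],
    ]
    decoded, doubtful = sample_bits(tiny, rows=[0, 1], cols=[0, 1],
                                    channel=0, threshold=124)
    print(decoded, doubtful)
```

With N rows and M columns this needs only N + M drawn lines to locate N × M bits, which is why the number of lines grows with the square root of the bit count for a roughly square array.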

And let's say that we want to steal this music, but we want every note exactly as it happens in that doorbell. Well, first we look at the bits. I think as this projector is set up it's easiest for you to see on the left. You see how it's kind of like a checkerboard pattern, but some of them are a little darker than the others? Those dark ones are the ones, and every space where it might be darkened but is not, that's a zero. I draw a line through every row, and then I do the same through every column, so that the computer now knows a point within all of these bits. You'll see that my column line here is

not directly in the middle of the bit. Instead it's on the side, and I did that because the side of the bit has these little dark brown thingies, and that's the spot that I'm interpreting the color difference from, because it's more apparent there. My software then marks these little squares at each of these positions. Coming down from the top in this column, the correct values, which it is not currently displaying, would be zero, zero, one, one, and as you look at it you can see that the ones have these little dark bars on the side and the zeros do not. I can then annotate the rest of the microchip, so now the software knows

accurately where every single bit is. It just does not know the correct color yet. So the next feature of my tool is a bit histogram. This graph here shows me, in red, green, and blue, how popular every single color value is. You'll see that some of them happen a lot, like I have 37 bits that have a red value of about 191. You can also see that my red has two peaks: there's one here and there's one over here. Green as well has two peaks. Blue has less of a separation, so blue will probably not be the color that I sample on. If I drag out the green selector to 91,

and I'm sort of selecting right here in the middle of the green peaks, I don't get a correct interpretation. If you look at that white arrow you'll see a position that is red. The software thinks that it's a one, but it ought to be blue. The software ought to know that this is a zero, and you with your own eyes can see that it's a zero, because it does not have those thick bars on the side of it like the two ones that border it do. And where there's one mistake there's usually more than one, so here are some other bits that are mismarked. If we change from over here, the green marker, if you look

at the targets of those arrows, if we move that to the red marker, so from green 91 to red 124, then suddenly all of those bits are accurately identified. So now we've got almost all of the bits from Für Elise correctly extracted. When I say almost, though, the design rule checks are really important for recognizing edge cases. Maybe there's a little bit of noise just where we do the sampling, maybe it's really close to the threshold, or there's the wrong number of bits on a row. In this case the software tells me that I have four bits that are ambiguous. It puts a yellow box around each one to tell me not to trust it, and when I click on it in the menu it

will jump to that position. Now this is blue. The software thinks that it's a zero, but it's actually a one, which I can tell by looking at it with my own eyes. So as this happens for each of the bits I can then correct them. I can tell the software that this needs to be blue, and then it disappears from my rule violation list. Whenever I export this file from now on, because of this green square around the blue square, the software knows that its own opinion of the value of that bit is irrelevant, because I as the human operator have told it what that bit ought to be. When you have a project with a checksum

or a self-test, you can sometimes even incorporate that into your export to know that you've got it perfectly matched. The next thing that we need: we've got all of the bits marked, and we need to be able to get them out of the software into something else. So I've got these different file formats that I can export as. If this happens to be a MARC4 chip or an ARM6 chip, I could just directly dump a .bin file with all of the contents ready to reverse engineer. Because those chips aren't very common, I usually start by exporting either ASCII art or a Python matrix, and then I can do linear algebra

commands to rotate the matrix around until the bits begin to make sense. If you are a student, or if you just enjoy pain, you can also export it as CSV for use in Microsoft Excel or in MATLAB. And so at the end we get these bits here, and these ones and zeros are all of the bits of Für Elise, and if you interpret them correctly you can play the music. There are a lot of little design decisions under the hood that helped out in this project. It scales rather well to hundreds of thousands of bits, and it does so because I used the Qt6 graphics framework. My code is written in

C++ and so is the library behind it, and then it uses the 2D acceleration that comes from your operating system for all of the graphics. So if you're running Mac or Windows or Linux, it will use your graphics card in order to pre-buffer things to make it render faster. I can also use the collision detection algorithms that come from this framework, so I'm not having to keep track of all of the bits myself. I can just ask, hey, are there any two classes of this type that happen to touch each other? If so, I throw a warning up. This makes the rendering very fast and allows me to

navigate projects that would be too large for the software that came before this program. It's also necessary to accurately align the bits. In order to do this I sort all of the bits by their x coordinate, so I'm getting them left to right, and I don't pay any attention to where they are vertically. As I begin moving right I should see every bit in the first column before I see any of the bits of the second column, and the only tricky part is figuring out where the border between them is. I have configuration entries that allow you to specify where the ending is if it gets it wrong.

As soon as you have the first column, though, the entire rest of the interpretation can be taking all of the bits from left to right, figuring out which bucket each would fit in most reasonably among the bits of the previous column, and then adding it to a linked list there, so that at the end you get a linked list for every single row, and then you can transform it out from there however you like. Another important design consideration was that from the very beginning I wrote this as both a GUI and a command-line tool. All of the important commands that you can do graphically you can also script to have the computer run itself.
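The alignment procedure described above (sort by x, peel off the first column, then attach each later bit to the row whose most recent bit is vertically closest) might look roughly like the following. It is a simplified sketch, not the tool's C++ implementation: Python lists stand in for the linked lists, and `align_bits` and `column_gap` are hypothetical names, with `column_gap` playing the role of the configuration entry that says where the first column ends.

```python
def align_bits(points, column_gap):
    """Group (x, y) bit positions into rows ordered left to right.

    points is an iterable of (x, y) tuples. column_gap is the largest
    horizontal step still considered to be inside the first column; the
    first larger jump in x marks the border to the second column.
    """
    ordered = sorted(points)  # by x first, then y
    if not ordered:
        return []

    # Collect the first column: everything before the first big jump in x.
    first = [ordered[0]]
    index = 1
    while (index < len(ordered)
           and ordered[index][0] - ordered[index - 1][0] <= column_gap):
        first.append(ordered[index])
        index += 1

    # One row per bit of the first column, ordered top to bottom.
    first.sort(key=lambda point: point[1])
    rows = [[point] for point in first]

    # Every remaining bit joins the row whose latest bit is closest in y.
    for point in ordered[index:]:
        nearest = min(rows, key=lambda row: abs(row[-1][1] - point[1]))
        nearest.append(point)
    return rows


if __name__ == "__main__":
    jittered = [(5, 10), (6, 30), (25, 11), (24, 29), (45, 10), (46, 31)]
    for row in align_bits(jittered, column_gap=5):
        print(row)
```

Comparing against the previous bit in each row, rather than a fixed y value, lets the rows drift slightly across a large panorama without bits landing in the wrong row.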

So as I begin marking up mask ROMs I give them a makefile, and the makefile will do the complete decoding from beginning to end. This also doubles as an integration test for my software. As I'm developing this, if I fat-finger a command and introduce a C++ bug that causes it to crash, but it only crashes in some obscure way that I only used in one chip, I will still recognize that, because when I type make in my project directory it recompiles every mask ROM that I've ever annotated in this tool, and make will error out if any of them fails to correctly decode. [Music]

So we're going to go back a step, back to the Mykotronx 78. This chip was supposed to be the solution to law enforcement access to all of our private communications, and it failed because of political backlash, but it also failed because it had a problem with it. The problem was not that the back door could be used by any random third party. The problem was that if you understood how this machine worked and you understood the commands that it used, you could do safe cryptography even with the backdoored chip. There are a couple of papers by Matt Blaze that explain this in detail. These two little segments here in the

bottom left contain the lookup tables for that cryptography, for the Skipjack algorithm, and we can photograph them and we can read them out. Presumably anyone else who wanted to could do the same thing. It takes patience, but if you really have to, you can sit there with an old-school microscope, without a fancy computer, without fancy digital photography, and you can just look at them in a row and read them out to a bored co-worker. You sit there going one, one, zero, zero, one, one, zero, zero, one, one. These are all of the bits. This is the entire table. So you need the patience to write down this many numbers, but you

don't need fancy space-age technology to do it. If you decode that into bytes you get this table, which is the F-table, which is kind of like an S-box in more modern algorithms. This was classified at the time that the chip came out, so there are secrets embedded in these chips that you can recover just through photography, in order to understand how they behave better or to reverse engineer what they do, even though it's a physical object rather than software. In later generations they added software. So this is the Fortezza crypto card. This came out in, I think, 1997. You would plug this into your fancy

government phone and you would make a phone call and you would have all of the back doors that the Clipper chip promised. You had a few more algorithms that you could use, but one of the big things that this chip introduced that the prior chip did not have was that it added an ARM CPU. So where this says LUTs, or lookup tables, that's the same F-table that we saw in the prior chip, but there's a CPU in the upper right, and if you zoom in on it, it has an ARM6 logo, so you know exactly which architecture it is. The logo is backward for reasons I don't understand. I think it's like a

photographic processing mistake. Now over here on the left you have the main boot ROM for that ARM CPU, and we can see the bits on it: these little rectangles on the sides of the squares. Every rectangle is a one; if there's no rectangle, that's a zero. We can also see the entirety of the chip, and over here on the right side you'll notice that the rightmost column is entirely dark, and the column just to the left of it is kind of split. It is dark on the right side and it is light on the left side. Basically each of these big columns is one bit of significance. Well, sorry, one half of a big column

is a bit of significance. So as you read from the right side to the left you see a column in which almost all of the bits are one, another in which they're almost always one, a third in which they're almost always one, and a fourth in which they are almost always zero. One, one, one, zero: in hexadecimal that would be an E. Who here does ARM reverse engineering of 32-bit code? We're not talking Thumb. Okay, why is an E the most common letter that you see at the beginning of every 32-bit word in 32-bit ARM code?

Exactly. So every 32-bit ARM instruction begins with four bits that describe the condition code. On ARM you can make any single instruction only execute if the flags are correct. One, one, one, zero, the letter E, means that you will always execute that instruction, and because most instructions are always executed, most of them begin with an E. You can see almost from orbit that therefore the most significant bit is the one on the right and that you count leftward as you go down to zero, and sure enough, when you properly decode this ROM that's how it's encoded. In order to get the contents of this out I had to move to Bogotá for a week.
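The observation about the condition field suggests a simple sanity check for a candidate decoding: count how many 32-bit words have 0xE in their top four bits. The sketch below is one way to do that and is not taken from the talk's own scripts. The function name `always_fraction` is hypothetical, and what fraction counts as "looks like ARM code" is left to the reader.

```python
import struct


def always_fraction(data: bytes, little_endian: bool = True) -> float:
    """Return the fraction of 32-bit words whose top nibble is 0xE.

    In the classic 32-bit ARM encoding, bits 31..28 hold the condition
    code and 0b1110 means "always". A correct decoding of ARM code should
    therefore score much higher than a scrambled one. Trailing bytes that
    do not fill a whole word are ignored.
    """
    count = len(data) // 4
    if count == 0:
        return 0.0
    fmt = ("<" if little_endian else ">") + f"{count}I"
    words = struct.unpack(fmt, data[:count * 4])
    hits = sum(1 for word in words if word >> 28 == 0xE)
    return hits / count


if __name__ == "__main__":
    # Three copies of MOV r0, r0 (0xE1A00000, little-endian) plus padding.
    sample = bytes.fromhex("0000a0e1" * 3 + "00000000")
    print(always_fraction(sample))
    print(always_fraction(sample, little_endian=False))
```

Running the same check over each candidate bit ordering and keeping the highest score is a cheap way to narrow the search before opening a disassembler.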

I checked into a hotel there and I dumped these bits out, and I just sat and tried every combination until eventually it came out, and it did come out. This is the very first row and the beginning of the second row of the machine code of the bootloader of the Fortezza card. Here is the Python code that does the decoding. It's really simple: you just go right to left in 32-bit words, start at the top row, and work your way down to the bottom. You know that you're working your way down to the bottom because, while we see bits as little dots up here, there are no bits in the bottom

fifth or so of the image, because that is unused, and as you decode it all of these are zeros. So having the code dumped out, I can then run binwalk on it, and binwalk starts recognizing all of these ARM function prologues. I then throw it into Ghidra, and sure enough these begin to line up and these begin to decompile correctly. And that is how, with a chemistry lab and some patience and some photography, you can extract the secret bootloader from an NSA-designed cryptography chip from the 90s. Thank you. [Applause] Do I have time for questions? Yes? All right. If we could get some questions, and if someone could grab me a water, I would be

very appreciative. Yes? So how long did this process take you for that chip, for the Fortezza? So in the case of the Fortezza, it took me maybe one day for the chemistry. It took me maybe three days for the photography, because back then I did not have a microscope with a motorized stage, so I had to manually move it without moving it too far. Another day to get the images together into one big panorama that evenly lined up, because if the panorama images do not accurately match, your decoding gets scrambled. And then maybe four weeks to write the decoding software, the Mask ROM Tool. And then I had to move to Bogotá, where

my Spanish sucks and I didn't know anybody, and spend a week holed up in a hotel room decoding the order until it made sense. That final stage, the week of figuring out the order, is the next problem that I'm trying to solve, by making a tool that will accurately figure out that order. For RISC chips like the ARM, I also have some ideas about statistics and disassembly for recognizing when a near miss creates code that doesn't quite line up. Yes? Is the image processing portion of that a good candidate for, say, image recognition AI, where instead of getting every column and row you could mark just a few, say this is what a one looks like,

this is what a zero looks like, and let it take it from there? Right, so the question was whether AI might be able to help with this. So far I don't believe so, because it's difficult to build a large training corpus. I have had luck with support for plugins to recognize different ways of interpreting the ROM bits. The idea is that some of these you can tell by a difference in color, but for others you have to look for a line, and so I did a plugin that finds a long strip through the bit and then looks to see if the color dramatically changes halfway, and with that I'm able to recover more

of the images that gave me trouble in the beginning. Yes, I've got one more question. And you in the orange in the back, if you could quiet down a little bit; I'm hearing you louder than the people asking questions. I'm just curious, why did you choose to go to Bogotá? I'm sorry, I couldn't hear, could you... Sure, why did you choose to go to Bogotá? Why was that something that you felt you needed to do? Flights were cheap, and I've always felt illiterate for not speaking any Spanish, and after one week I can now confidently use the present tense in the first and second person, and not much more. [Music] mayamo Peggy Hill is gringo

All right, well, thank you all for your time and attention. Oh, sorry, sorry, we've got one more, we've got one more.

Yeah, right, I did not want to be distracted. Flights were cheap, and it's a beautiful town. It's like New York City if it weren't so dysfunctional and overcrowded. Yes?

So mask ROMs are still used today. As chips become more sophisticated you have the option to use other forms of non-volatile memory, and it is a risk to the reliability of your product to use mask ROM if you're unable to patch it. For example, I reverse engineered a blood glucose sensor, the FreeStyle Libre. You inject it onto your arm, and then you can use your phone and just sort of tap your arm and find out what your blood glucose levels are. For diabetics this means that they can do more frequent sampling; they don't have to prick a finger each time. It's a great technology, and God bless them for inventing it. But as I was reverse engineering it,

the majority of the software is made as a mask ROM, and then it uses FRAM, which is a non-volatile form of random access memory. FRAM stores patches against the mask ROM, so that if they make a mistake in the mask ROM they're able to patch it in more volatile memory, but they don't have to include enough of that volatile memory to contain the entire program. In that sensor, you know, the thing is disposable: you inject it, it lasts two weeks, and you throw it away. They have eight kilobytes of code to implement that program, but they only have four kilobytes of FRAM, and in order to make all of this fit they have to split it across the two

memories. Another technique that you can use in a very modern chip, like if you're at a company that designs chips: you can implement your ROM as a sea of gates instead of implementing it as a regular array like this. This has a performance penalty and it has a density penalty, but it is a lot harder to reverse engineer, because the engineer needs to figure out all of this gobbledygook that was made by a Verilog compiler during the place-and-route stage, rather than something that is regularly arrayed and designed by a human being. If you would like to see the design side of this stuff, the OpenRAM project now includes a ROM for the Sky130

manufacturing process, which you can take apart and read. And I think that's about it, so thank you all for your time and attention, and I'll be around afterwards.
