
Intro to Reverse Engineering with Ghidra: Taming the Dragon

BSides SATX · 2021 · 47:02 · 289 views · Published 2021-06
Category: Technical
Difficulty: Intro
About this talk
Title: Intro to Reverse Engineering with Ghidra: Taming the Dragon
Presenters: Christopher Doege
Track: In The Beginning
Time: 1100
Virtual BSides San Antonio 2021, June 12th, San Antonio, Texas
Abstract: An introduction to reverse engineering binaries in ELF and PE formats with Ghidra, going from first reverse engineering concepts to using Ghidra's many features to tackle tasks such as malware analysis, vulnerability analysis, and general reversing. Ghidra is a disassembly tool that takes an executable, object, APK, etc., and parses the machine code into various forms, ending with an assembly language and a decompiler that converts that assembly to C. This allows for a thorough inspection of code within executables for purposes that could include malware analysis, reverse engineering, vulnerability research, debugging, etc. This is a revamped presentation with updates for the latest Ghidra changes, as well as more in-depth tips and tricks for reversing and collaborating with Ghidra that I've learned over the last couple of years of using Ghidra professionally. Some familiarity with x86/x64 ASM and C will be helpful for this session.
Speaker Bios: Christopher Doege. Christopher Doege is a professional Reverse Engineer and Vulnerability Researcher who has been intrigued by software security for a long time. In his free time he enjoys CTFs, video game hacking, and checking out new technologies to work with. If you're ever looking for a mentor to help you along your way to doing reverse engineering, vulnerability research, or other cyber security related topics, feel free to reach out to me.
Transcript [en]

Hello, I'm Christopher Doege. I'll be doing an intro to reverse engineering with Ghidra; I like to call this talk Taming the Dragon. First off, a little bit about me: I'm an avid CTF player. I've played in DEF CON finals and qualified for other very large CTF events with NASA Rejects. I went to UTSA and enjoy doing crypto, reverse engineering, and pwn challenges for fun. So let's dig a little bit into what reverse engineering is. In this context I'm talking purely about software reverse engineering. There's another component where you do hardware reverse engineering; that's outside the scope of what I'm going to talk about today, but we'll talk about binary reverse

engineering: generally, you take some compiled code, lift it from machine code up to some form of assembly language, and from there you can mess around with it, and you have tools that allow you to go up to higher-level languages. It can be useful for figuring out how things work, from malware to CTFs. I do it a lot for vulnerability research and reverse engineering. Those are generally some of the things you can use it for, along with just figuring out how something works or trying to debug a component of something. In reverse engineering you generally have two common mindsets: static analysis and dynamic analysis. For the purpose of this talk we're

primarily going to focus on static analysis with Ghidra. There you're usually disassembling: you take that bytecode, that machine code, and lift it to assembly. Then you have another step with Ghidra and IDA called the decompilation step, which takes the assembly and ports it to some type of IL. With Ghidra you go from SLEIGH to P-code, and from there to a C-like syntax.
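To make the first step of that pipeline concrete, here is a minimal sketch of a linear-sweep disassembler in Python. It is a toy under stated assumptions: the decode table covers only six real x86-64 opcodes chosen for illustration, and it has nothing to do with how Ghidra's SLEIGH-based disassembler is actually implemented.

```python
import struct

# Toy decode table: opcode byte -> (text template, operand size in bytes).
# Only a handful of real x86-64 encodings; a real disassembler covers thousands.
OPCODES = {
    0x55: ("push rbp", 0),
    0x5D: ("pop rbp", 0),
    0x90: ("nop", 0),
    0xC3: ("ret", 0),
    0xB8: ("mov eax, {imm:#x}", 4),  # mov eax, imm32
    0xE8: ("call {target:#x}", 4),   # call rel32 (relative to the next instruction)
}


def disassemble(code, base=0):
    """Linear sweep: decode one instruction after another, starting at `base`."""
    listing = []
    offset = 0
    while offset < len(code):
        opcode = code[offset]
        template, operand_size = OPCODES.get(opcode, (None, 0))
        length = 1 + operand_size
        if template is None or offset + length > len(code):
            # Unknown or truncated: emit the raw byte, like a "db" in a listing.
            listing.append((base + offset, "db {:#04x}".format(opcode)))
            offset += 1
            continue
        fields = {}
        if operand_size == 4:
            raw = code[offset + 1:offset + 5]
            fields["imm"] = struct.unpack("<I", raw)[0]
            # Relative targets are measured from the end of the instruction.
            fields["target"] = base + offset + length + struct.unpack("<i", raw)[0]
        listing.append((base + offset, template.format(**fields)))
        offset += length
    return listing


if __name__ == "__main__":
    for address, text in disassemble(bytes.fromhex("55b82a0000005dc3"), base=0x1000):
        print("{:#x}: {}".format(address, text))
```

The point of the sketch is the shape of the work: bytes in, addresses and mnemonics out. Recognizing function boundaries, following jumps, and lifting to an IL are all layered on top of this step.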

So the process of static analysis is examining code without executing the program. This provides an understanding of code structure and program flow, but it is limited. You're going to lose a lot of contextual understanding, from relative jumps to registers that you're not able to probe without a dynamic system, and you won't know which branching paths you go down unless you actually run it. So while static analysis can be limited, it is still a very useful and very necessary part of the reverse engineering toolkit. With dynamic analysis, you usually execute the binary through some form of instrumentation; usually, in the reverse engineering spectrum, you have some type of debugger

attached to it. I believe these go hand in hand with static analysis, because you can use what you're learning from both ends to see which paths you want to go down on the dynamic side, or what values you want to set, while getting your understanding from the static analysis side. Oh, this note is actually outdated: they just recently added the debugger in the beta for Ghidra 10, and I do have a slide talking about that. The debugger is actually pretty interesting. It reminds me of ret-sync, which was a tool for IDA that would allow you to integrate with WinDbg and some other things. The next thing that I'd like to talk

about, though, is a plan of attack. Generally, when you are reversing some type of binary, you have to start somewhere. I know some people do a top-down approach: they'll start from the program entry and work all the way down through it, documenting or fixing up anything they see while scrolling down through the assembly. This can be very difficult for people new to reverse engineering, because you get lost in the weeds, and it may not always be necessary depending on what type of reversing you're trying to do. You can do it at a high level, looking at the different function

entries, and then go from there and skim the functions to identify common ones. You can go from common system calls and function calls and then trace the program to find interesting things. This is a very common tactic in vulnerability research, in that you want to spend as little time as possible reversing the entirety of the program. You want to find where you have input coming from the user, or over a socket, or whatever controlled input you may have, in order to see what types of things you can mess with. So that's very common for that. And then guided is usually when you have dynamic analysis,

so you can trace the execution flow from the dynamic side and then use that to see the paths that you take. That one, I think, is probably the easiest for beginners to get an understanding of, because you have something that has already run through and executed the binary, so you can step through that trace and figure out what's going on. Some common pitfalls in reverse engineering: one is going too deep into the rabbit hole. You can have a ton of different functions that do very complex things, but those complex things may not be what you're interested in. This happens a lot in debugging, where people will initially get lost in some of the

libc calls or syscalls and start debugging code that's a little bit lower than what they actually want to be executing in. You can also fall into the pitfall of reversing functions that aren't reachable. This happens a lot in static analysis, where you don't know whether or not that code is going to be reachable for what you're doing. It may be a largely unused section that only gets hit with different startup parameters and so on. So it can be hard to know what to reverse, and you can also get stuck reversing functions that aren't very useful. Then there's running into decompiler or disassembly issues, which is something that can be very

frustrating, and I run into that with Ghidra: either the function is too large, or the function has very strange things that cause the decompiler to quit out. Same thing where the disassembly just doesn't track the function and you have to fix it up in order for anything to actually be analyzed. And then not saving your database is a common pitfall, because I've had power outages and other things where not having saved your database can throw a lot of work and research time out the window. So a simple overview of what a disassembler like Ghidra does: it'll convert machine code up to assembly, recognize the start and end of functions, recognize

jumps and calls to functions, and help you find strings and data sections. It'll remember any user labels and comments, so you can type in your own comments to make the binary that you're reversing more readable and more understandable, and it generally has a graph of program flow. All right, so I've been talking about Ghidra a lot, but what is Ghidra? It's a software reverse engineering tool with version management and decompilers, and it now also includes a debugger. It has a public repo on GitHub with active contributions and maintainers, and it is multi-platform. I'm using it today on OS X, but it's going to run on Linux, OS X, Windows, or any type of environment

that has Java 11 supported. As to the supported architectures, this is a list from Rob Joyce. Rob was one of the initial people who spearheaded publicizing Ghidra and was also the main face in promoting it. They have all sorts of different supported architectures, and these supported architectures generally have an associated decompiler. This is very nice, because for a lot of these architectures, tools like IDA did not have a way to lift them without either doing your own custom lifting or finagling it in some other way. So this has definitely improved reverse engineering overall and has made life a lot easier for

people working in some of these various architectures. So, to download Ghidra, there are two websites: there's the NSA's repo on GitHub, and then there's ghidra-sre.org. Right now you have the stable version, which I believe is 9.2.4, and then the beta version, which is 10.0, and the beta version is what I'm going to be using for some of my demos at the end of the presentation. For a general understanding of the layout, you have things broken down into different areas. You have docs, which have cheat sheets and presentations to get you acclimated to some of the minutiae

of Ghidra. For running it you have ghidraRun and ghidraRun.bat. ghidraRun is the Linux or OS X shell script that launches Java, and the .bat script does the same thing on Windows. The trainings are really interesting. They have instructor notes to help you become more familiar with the content in the presentations, so they can be a really good learning environment. I looked through the beta today and didn't see anything in the advanced or intermediate material that had presentations or documentation on the debugger, so for the debugger it's still a matter of reading some of the other documentation

and figuring it out on your own right now. Some useful plugins that I think are pretty cool for Ghidra right now: there's Dragon Dance, which is like Lighthouse for Ghidra. Lighthouse is a tool that takes dynamic traces of the binary and uses them to visualize and explore the code coverage you get when executing. It'll highlight the different basic blocks that you hit, so you get an understanding of where you've progressed in the binary on the dynamic side and replicate that over on the static side. This next one is a Ghidra community page, and it has a ton of plugins and

other CPU extensions that may not be directly supported by Ghidra but that the community has created. I think there are some for the Game Boy and so on up there. And then there's Daenerys, which allows for execution of IDA scripts in Ghidra and also allows you to port your Ghidra scripts over to IDA, which has been really nice. One thing not listed here is that Ghidra also has tools to convert your binary database from IDA to Ghidra and from Ghidra to IDA, noting that you will lose some of the comments and markup that you have. But some of the structure, like if you have to manipulate the program entry

or if you have different header sections that you have to mess with, like if you're doing firmware, those things will be saved and propagated into the other analysis tool. So, some useful features: themes and configurations. You can see here that I have a little dark mode look; it's using the Metal theme and then inverting the colors to make it look like dark mode. I was going to use this for some of my demos; however, I believe it makes things a little bit unreadable presentation-wise, so I'm probably just going to use the light mode theme. There's version tracking, so version tracking in the Ghidra server, and that's probably

one of the most powerful things with Ghidra: apart from having lifters and decompilers for architectures that other tools don't support, there is the server collaboration. I believe for IDA, RPISEC, I should say the CTF team out of the school RPI up in New York, has their own developed system to do collaboration in IDA, but they never released it and never publicized it. Barring that, there was no collaborative way to do reversing other than everybody just sharing their databases with each other, so it was really difficult to manage who did what, and transferring those changes was a bit tedious. Ghidra comes with support for it through a Ghidra server, which you can set up, and then you can

commit your analysis changes up to the server to allow other people to view them and also commit their own changes. So it's very good as a teaching tool, I think it is stellar to use in the university sphere, and it is also very good for large collaborative reversing efforts, on either large binaries or things that are very complex and have multiple engineers working on them. It has binary diffing; however, I still prefer BinDiff, which I believe has some support for Ghidra now, and I know it has support for IDA. BinDiff is a utility that will diff two separate binaries. If you have a newer version, let's say

Windows came out with a patch for something and you want to figure out what that patch changes, you can use binary diffing to view the difference between the two. I haven't been the biggest fan of the Ghidra-provided binary diffing tool, but it is still a nice-to-have, and [unclear] is something that I'll show you while doing the demo. So I kind of ranted a bit about the server collaboration, but it's actually really easy to set up. You basically just need a centralized server, and it runs out of the same utilities that you use for ghidraRun, so you just have a server component. It

runs that stuff and just needs Java set up, so you can easily get the server running and then have other people access it. I thought about having a Ghidra server that people could use and play with during the talk. However, Ghidra has authentication, and there is an unauthenticated mode, but I didn't necessarily want to do that because then people could mess things up, so I decided to nix that idea. But the server collaboration is very useful for lots of different things. You can track and commit changes, similar to using GitHub, for changes to reverse engineering a binary. You can review changes by others, and you

also see the version history and revert back to earlier changes. It's very useful for CTFs, where you're doing things very quickly and need to share those binaries very quickly, so you could have people reversing different functions and then collaborating to see where they are on different things. It's also very useful for mentoring, group malware reversing efforts, group vulnerability research efforts, and so on. So when you first start off with Ghidra, you start it up and you'll get this type of view. You have a tool chest, and you don't have an active project, so you need to create a new project.

With that new project, you can create a shared or non-shared project. Shared projects are going to be pushed up to a server, if you have a server. From there, once you create a project (my project is called test), you can start to import your binary. For this example I'm importing my OS X ls binary into Ghidra to try to reverse engineer it. Once you have a binary imported, you double-click the binary. It'll pop up a few different window panes, and then it'll ask you, saying: hey, this binary hasn't been analyzed, would you like to analyze it? And you'll have a bunch of different

settings. If you're just starting off with Ghidra, I would say probably don't mess with the different analyzers. I would look through them, and some of them might be useful for the niche things that you're working with. For instance, I know there are some analysis plugins for Game Boy and so on, so you might want to use those scripts, but maybe run them as one-offs rather than in the batch mode of running all these scripts at one time. Tool options: with Ghidra you can modify the tool to do a bunch of different things. You can change the

color scheme, and you can also change how different things lay out in the tool. For instance, one thing that I like to modify when I'm using Ghidra: usually "Eliminate unreachable code" is checked. I do a lot of vulnerability research, and once you have a bug it might be nice to know what different chains or code you can use to execute in other areas. So while it may look unreachable, you may be able to make it reachable, so I like to uncheck that field. It adds a little bit more

to what you're going to see with Ghidra, and this can sometimes lead you back into that rabbit hole of reversing things that aren't reachable, but from my perspective I would generally rather see it than not. You can also go through and modify your key bindings. I like to fix up some of the key bindings to make them similar to IDA, just because that's what I'm more used to when doing reversing, and also add other key bindings to make life easier for the different tools, components, and views that you have in Ghidra. Another thing that I like to disable,

and this is really hard to read, which is why I decided not to do the dark mode from here on, since you can see the check boxes really aren't usable: in the disassembly options I like to turn off "Markup inferred variable references". What this does is, if you have a variable that's named on the decompilation side, so on the C side, it tries to assign those same names to the registers over in the disassembly. For my purposes that is generally more confusing than helpful. I like to be able to see what registers are being used, especially with the different parameters and function calling conventions, callee and caller, so I prefer to actually be able to

see the registers there, and I'll show you an example of that in a second. There are also some common key bindings that you can look at, and after I finish this talk I'm going to upload my slides and some test binaries for people to mess with if they're interested. There are several different common key bindings. There are actually several gists out there that apply all of IDA's tooling interface, so the key bindings for that, over to Ghidra to make your transition a little bit easier. Now we're going to talk about some of the views. There's the graph view. It is my

perspective that the graph view for Ghidra is not my favorite. If I were to work on something only from a graph view perspective, I believe IDA is far superior in that, and Binary Ninja is far superior too, in the view that you get for graph view. I just feel like Ghidra's is a little bit funky. But Ghidra makes up for it, and its general workflow is intended to be driven more from the decompilation than from the graph view. With the graph you can apply markup, so you can apply coloring. This is very useful when doing traces: you can color things in the graph to say

hey, we took this branch, we went down this code path, and so on. It can make your life a little bit easier for tracking which functions you've gone down dynamically, which functions you've reversed, and which portions of a function you've reversed and which you haven't. Now the decompiler. I'd say the decompiler is the bread and butter and what most people end up using when using Ghidra. I like to think of it as a cheat mode in a sense. I'm very used to using IDA to go through the disassembly and having to read a lot of the assembly to understand things, so having a decompiler is definitely nice

and is definitely an aid, in that you can comment up all these different variables and add your own mnemonic names to make things more understandable and more readable for your reverse engineering needs. Here's an example of doing the markup. This is just a memcpy: you have your dest, your source, and then the size. So, yeah.

You can also patch a binary. Say that you have, I don't know, an example that applies to most people: say you want to cheat in a video game, or you want to apply a patch to software because it's being annoying or something. You can apply patches to, say, not take a jump, and so on. So it does have the ability to patch a binary: you type in the various instructions that you'd want to add there, and it'll fix it up and write that machine code for you, rather than you having to manually hex-edit the binary. Now I'll talk a little bit about P-code.
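The "not take a jump" patch described above comes down to overwriting a few bytes. The following is a minimal sketch, assuming a hypothetical six-instruction x86 snippet and a hypothetical helper named `patch_bytes`; Ghidra's own patching feature assembles the instruction you type, whereas this only shows what ends up changing in the file.

```python
NOP = 0x90  # one-byte x86 no-op


def patch_bytes(image, offset, expected, replacement):
    """Return a patched copy of `image`, refusing to touch unexpected bytes."""
    if len(expected) != len(replacement):
        raise ValueError("patch must keep the same length so nothing shifts")
    if bytes(image[offset:offset + len(expected)]) != bytes(expected):
        raise ValueError("bytes at offset do not match what we expected to patch")
    patched = bytearray(image)
    patched[offset:offset + len(replacement)] = replacement
    return bytes(patched)


# Hypothetical snippet:
#   83 f8 01          cmp eax, 1
#   75 05             jne +5        <- the check we want to skip
#   b8 01 00 00 00    mov eax, 1
#   c3                ret
original = bytes.fromhex("83f801" "7505" "b801000000" "c3")

# Replace the two-byte conditional jump with two NOPs so it is never taken.
patched = patch_bytes(original, 3, bytes.fromhex("7505"), bytes([NOP, NOP]))

if __name__ == "__main__":
    print(original.hex())
    print(patched.hex())
```

Checking the expected bytes first is a deliberate choice: patching the wrong offset silently corrupts the binary, and the same-length rule keeps every later address where it was.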

SLEIGH and P-code are some of the intermediate views and intermediate languages that Ghidra uses, taking the machine code, popping it up to assembly, and then from there popping it up to C. So it'll go through these IL, or intermediate language, or intermediate representation, views before being pushed up to that C-like syntax. I'll show it in a second, but you can enable P-code so that you can see what the P-code is for the given function you're looking at. An understanding of P-code is very useful, because you'll be using it a lot if you're doing any scripting or automation with Ghidra.
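As a rough illustration of what lifting to an intermediate representation means, here is a toy Python lifter that turns a few assembly instructions into P-code-style operations. The operation names (COPY, INT_ADD, INT_SUB, LOAD, STORE, RETURN) are real P-code opcodes, but the textual format, the `lift` function, and the simplifications (no flag updates, no temporaries with sizes) are assumptions made for this sketch; Ghidra's real output is more detailed.

```python
def lift(instruction):
    """Lift one x86-64-style assembly line into simplified P-code-like ops."""
    mnemonic, _, rest = instruction.partition(" ")
    operands = [op.strip() for op in rest.split(",")] if rest else []

    if mnemonic == "mov":
        dst, src = operands
        return ["{} = COPY {}".format(dst, src)]
    if mnemonic == "add":
        dst, src = operands
        # A real lifter would also emit the carry/overflow/zero/sign flag updates.
        return ["{0} = INT_ADD {0}, {1}".format(dst, src)]
    if mnemonic == "push":
        (src,) = operands
        # One assembly instruction becomes two simpler operations.
        return ["RSP = INT_SUB RSP, 8", "STORE [RSP], {}".format(src)]
    if mnemonic == "ret":
        return ["tmp = LOAD [RSP]", "RSP = INT_ADD RSP, 8", "RETURN tmp"]
    raise NotImplementedError("toy lifter does not know: " + mnemonic)


if __name__ == "__main__":
    for line in ["push RBP", "mov RAX, RDI", "add RAX, RSI", "ret"]:
        print(line)
        for op in lift(line):
            print("    " + op)
```

The useful takeaway is that every architecture's instructions get broken down into the same small vocabulary of operations, which is why one decompiler and one scripting API can serve many processors.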

Then you have the script manager. This is just all the different scripts that are already available, but you can add in any of your own scripts. What I generally like to do is take one of these scripts that has some of what I need but not all of it: I'll create a new one, copy it off that, and then edit it to do what I'm looking to do. It'd be nice to go through here and see if there are any scripts that might be useful for the reversing case that you're working on. It's definitely more of a power-user than a basic-user type of functionality.
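For a sense of what a small script of your own might look like, here is a sketch of a Ghidra Python script that lists the largest functions in the current program, which is one way to decide where to start skimming. It is not one of the bundled scripts and was not shown in the talk. The helper name `largest_functions` is made up for this example; `currentProgram`, `getFunctionManager()`, `getFunctions(True)`, `getName()`, `getEntryPoint()`, and `getBody().getNumAddresses()` are part of Ghidra's scripting API. The `try`/`except` guard is only there so the file also loads outside Ghidra.

```python
# Sketch of a Ghidra Python script: list the largest functions in the program.


def largest_functions(functions, limit=10):
    """Return (name, entry, size) tuples sorted by body size, largest first."""
    rows = []
    for func in functions:
        size = func.getBody().getNumAddresses()  # addresses covered by the body
        rows.append((func.getName(), str(func.getEntryPoint()), size))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]


try:
    program = currentProgram  # injected by Ghidra when run from the Script Manager
except NameError:
    program = None  # running outside Ghidra, for example in a plain interpreter

if program is not None:
    functions = program.getFunctionManager().getFunctions(True)  # True = forward order
    for name, entry, size in largest_functions(functions):
        print("%-40s %s  %d addresses" % (name, entry, size))
```

Keeping the logic in a plain function that only relies on method names makes it easy to copy into a new script and adapt, in the spirit of the copy-and-edit workflow described above.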

Next is the data type and structure editing. I think this is probably one of the stronger use cases for Ghidra, applying it to the decompilation. You can have these very large C-like structures that have tons of data in them, and you can create and edit those structures in Ghidra and then apply them to what you're seeing in the binary, so that Ghidra recognizes that and is able to say: ah yes, this is this structure, this is how they're referencing it, and so on. That's partly just to pretty up the decompilation and make it look nicer, but you can also apply this across all functions, so that

it tracks and traces to know what the type of a thing might be. That's another thing in the data type editing: if you know there are custom types for the structure, say it's some well-known IEEE standard that you're reversing and you see structures that relate to it, you can start to apply those structures and data types across the binary in order to have a greater understanding of what's going on. I think this is a very powerful side of what makes Ghidra very useful. And then we'll talk about the debugging side. The debugger is still in beta.
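The structure editing described above has a close analogue outside Ghidra: once you know a layout, you overlay it on raw bytes and the anonymous offsets become named fields. A minimal sketch using Python's `ctypes`, where `PacketHeader` and its fields are a hypothetical layout invented for this example:

```python
import ctypes


class PacketHeader(ctypes.LittleEndianStructure):
    """Hypothetical 12-byte header, as you might define it in a data type editor."""
    _fields_ = [
        ("magic", ctypes.c_uint32),
        ("version", ctypes.c_uint16),
        ("flags", ctypes.c_uint16),
        ("length", ctypes.c_uint32),
    ]


# Raw bytes as they might appear in a data section or a captured buffer.
raw = bytes.fromhex("efbeadde" "0200" "0100" "10000000")

# "Applying" the structure: the same bytes, now read through named fields.
header = PacketHeader.from_buffer_copy(raw)

if __name__ == "__main__":
    print(hex(header.magic), header.version, header.flags, header.length)
```

Before the overlay you would be reading something like "a four-byte value at offset 8"; afterwards it is `header.length`, which is the same readability gain the decompiler gets when a structure is applied.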

I believe they're planning to release Ghidra 10 sometime this month, either mid or late this month. Then I think you'll get more documentation and maybe some presentation slides with an introduction to the debugger. The debugger is, in my opinion, very similar to ret-sync: it allows you to integrate with WinDbg or with GDB in order to have native debugging capabilities with the binary that you're working on. One of the interesting things that they've added, though, is time-travel-esque debugging through emulation, to say: I want to step back through what I was

just executing. It allows you to do that by keeping track of the state of the registers, noting that time-travel debugging is not flawless. That is to say, some things may be lost. Let's say you are modifying files, or you have mutexes or locks: those may change state when you try to go back through things, or they may not be properly restored. So the timeless-style debugging isn't perfect, but it is still very interesting and useful functionality, and definitely something really neat that they added. Also, just having the stack and the ability to trace and

set watchpoints within Ghidra, and then also having the disassembly and the decompilation available, is really neat and useful. I won't be showing a demo of the debugging side of things today, but it is something that I plan to use and work with more [unclear] release. But yeah, cool, demo time. I have Ghidra already open. I'm going to create a new project just to show you the workflow. You can create a non-shared project or a shared project: non-shared being local, and a shared project meaning you're going to need to have a Ghidra server running that you can connect to;

otherwise you're going to create a non-shared project, so that's what we'll do here,

and finish

All right, so now we have a project, but we don't have any binaries added in here yet. There are a couple of ways that you can add binaries: you can add them by hitting I, which is import; you can also do File, Import File; and you can also drag and drop. Say I have a program over here: I can drag it in and see what type of file it is. This particular file is a MIPS, [unclear] endian, 32-bit binary in the form of an ELF, so we can go ahead and import that. What it's doing here is just loading that binary into the

system. It's not doing any of the analysis just yet [unclear],

and certain things are going to be a little bit slower than others. For instance, this is MIPS, so it's going to take a little bit longer than, say, your x86 or x64, and so on. I've found that analyzing Dalvik, or any APKs, is very time-intensive, so I'll show you how to import APKs, but I won't go through actually analyzing them because it takes a very long time. Here we can see it'll prompt you: do you want to analyze the binary? We'll say yes, and then we'll just go with the general options. Sorry, I can't really tell if this is

readable, so if anyone wants to pop up and say it's not readable, let me know. I'm used to doing this on a large projector.
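The import step earlier in the demo reports the file format, architecture, and word size before any analysis runs. Much of that comes straight from the file header, which the following sketch reads by hand. The function name `describe_binary` and the short machine table are assumptions for illustration; the offsets follow the ELF header layout (class at byte 4, data encoding at byte 5, `e_machine` at offset 18, `e_entry` at offset 24), and this is not how Ghidra's loaders are implemented.

```python
import struct

# A few e_machine values from the ELF specification.
ELF_MACHINES = {3: "x86", 8: "MIPS", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def describe_binary(data):
    """Identify a binary from its first bytes; ELF details only, PE by magic."""
    if data[:2] == b"MZ":
        return {"format": "PE/DOS (MZ header)"}
    if data[:4] != b"\x7fELF" or len(data) < 32:
        return {"format": "unknown"}
    bits = {1: 32, 2: 64}.get(data[4])      # EI_CLASS
    order = {1: "<", 2: ">"}.get(data[5])   # EI_DATA: little or big endian
    if bits is None or order is None:
        return {"format": "ELF", "error": "bad e_ident"}
    machine = struct.unpack_from(order + "H", data, 18)[0]  # e_machine
    entry_format = "I" if bits == 32 else "Q"               # e_entry width
    entry = struct.unpack_from(order + entry_format, data, 24)[0]
    return {
        "format": "ELF",
        "bits": bits,
        "endian": "little" if order == "<" else "big",
        "machine": ELF_MACHINES.get(machine, "unknown (%d)" % machine),
        "entry": entry,
    }


if __name__ == "__main__":
    # Synthetic header for illustration: 32-bit, big-endian, MIPS, entry 0x400000.
    ident = b"\x7fELF" + bytes([1, 2, 1]) + bytes(9)
    header = ident + struct.pack(">HHII", 2, 8, 1, 0x400000) + bytes(24)
    print(describe_binary(header))
```

The `entry` value is the address where execution begins, which is usually startup code rather than `main`.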

Yeah, so while it goes through the analysis I'll talk through the different components. You have your program tree here, which shows the different sections of the program: your global offset table here, your .bss, your .sdata, your .data, your .text section, and so on. Over here you have your symbol tree. This shows all the different imports that you have, whether they're Windows or Linux. This is a Linux binary, it's an ELF, so it's going to have libc.so, and then it's doing some crypto here, and it has the standard C

++ library as well. We can see what functions are exported. This would be useful if you're looking at a DLL or some type of shared object, because then you have a lot of program exports. But generally speaking, for the types of binaries that you might be reverse engineering, especially as a beginner, you may not be looking at shared objects or DLLs, so you're going to be looking more at the functions tab. One thing that's interesting

is that Ghidra will take your functions and group them into folders for things that are similar, which I think is interesting but not always nice for some of the things that I want to look at. You can see these are all your different syscalls or function calls here, these are all your unnamed functions, and then you can see some of the other things. Generally you're going to look for your main. If it's a stripped binary, it might be __libc_start_main, or entry, and so on. So depending on the type

of binary you're going to have different program entries, but this one has a defined main that we can look at. Over here, on this side, we have the disassembly, so you can see the different components. If you open this tab here, it shows all of the different sections, and right now you can see I have highlighted the operands. Over here you can see the bytes, here you can see the mnemonic, and so on. If you want to enable P-code, you can enable that and see what the P-code looks like for all of the different instructions over here, but it does add a lot of bloat,

so we'll go ahead and disable that field. Then we also want post comments and EOL comments. Adding comments in here is very nice when you're reversing, to say: hey, I figured out what this does or what this is doing, and so on. Over here you can do the same thing. Going through the decompilation, you can see there's a function call that has two parameters. I can rename one of the parameters and it'll replicate across here. Another thing that I wanted to show is the tool options. When you go into the tool options you have a ton of different

options. I don't know how to say "go here if you want to do these things", because it's so varied. But if we go back to this little view right here, you can highlight the different sections and see which one gets highlighted. So, ah, okay, it's in the operands field that I want to look at something. Go to Listing Fields, and that'll be all of the different fields for here, and then you can go to Operands Field. This is the thing that I mentioned turning off before: "Markup inferred variable references". I prefer to see the different registers rather than the

pseudo-names that Ghidra automatically applies, or that I may apply, when looking at the disassembly. Other things that are interesting: the function call graph, so the function call tree, just to see what incoming and outgoing calls there are. These can be nice for tracking your way up or down for different things. This button right here is how you open the script manager, so we can see all the different scripts. And then Window, if you want to see the function graph. Here you can scroll around and so on. I'm using a

laptop right now using a touchpad but for ghidra i still think that you need a mouse in order to use it correctly just by virtue of if you want to say highlight all uses of a variable in the assembly or highlight all uses of a register or something like that in the assembly you use the middle click you can redefine that in the keybindings but i believe it's generally assumed that you have a mouse so for this instance we're not going to save for your instances you may want to save it and then i'll show you also opening an apk so import we

have this little hello world apk it's literally the simplest apk that you want so you have different ways that you can open it you can open it as a single file that's not going to work batch that's not going to work either or as a file system so i like to open it as a file system and then you can browse it to see ah here's the dex that i want to analyze so then you can open this and then import that dex then you can import it as a single file you see that it's a dalvik executable so this is the java byte code and then go through and import that i
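Opening the apk as a file system works because an apk is a zip archive with the dex files stored inside it. As a minimal sketch outside of ghidra, the Python below walks the entries of an apk and reports the ones that begin with the dex magic bytes `dex\n`. The function name `find_dex_entries` and the file name in the usage note are hypothetical.

```python
import io
import zipfile

# Every DEX file starts with "dex\n" followed by a version string such as "035\0".
DEX_MAGIC_PREFIX = b"dex\n"


def find_dex_entries(apk):
    """Return the names of archive entries that start with the DEX magic.

    `apk` may be a path or a binary file-like object.
    """
    found = []
    with zipfile.ZipFile(apk) as zf:
        for name in zf.namelist():
            with zf.open(name) as fh:
                if fh.read(len(DEX_MAGIC_PREFIX)) == DEX_MAGIC_PREFIX:
                    found.append(name)
    return found
```

Calling `find_dex_entries("hello_world.apk")` on a simple app would usually turn up an entry such as `classes.dex`, which is the file to pick when browsing the apk in ghidra's file system view.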

would go through and do it but it takes a very long time to do those things so i'm running kind of light on time so i'm gonna quit the demo now and kind of go into my meme section for those of you who may have seen one of my presentations before or know me i always have memes in my slides so these are some of my favorite ones related to ghidra basically when ghidra was released there was a lot of back and forth between ida and ghidra like ida costs somewhere in the range of like three grand to like 15 grand if you're looking for all versions and all different types of things

and so on so i just like to leave the memes up for a bit this is probably my favorite one from the baba group of kind of making fun of ferdari users uh unfortunately

hello i'm phillip wylie the founder of the pwn school project the pwn school project was founded in june 2018 as a way to offer free education based on penetration testing and ethical hacking to the public more specifically the dallas fort worth area this was created out of my passion to educate others before i started teaching i did a lot of mentoring which kind of inspired me to go into teaching and you know i was teaching ethical hacking at dallas college and some of my students towards the end of the first semester were asking where could they take more classes because they're interested in taking it

but most people had you know a small budget for training so my idea was to get together on the weekends and do some little workshops some little hands-on training to help them further their education so i decided to go a step further and started the pwn school project or pwn school for short the pwn school project hosts two meetings per month they started out as physical meetings back in 2018 in 2019 i started offering the dallas meeting streamed so that way it opened up to people around the globe to be able to consume this content and help them and when the pandemic hit we ended up going virtual with both of the meetings

offering two meetings per month and we expanded past offensive security into defensive security we even had talks on becoming a cso as well as talks on becoming a soc analyst another thing unique to pwn school at least as far as the area that i live in where pwn school was founded in the dallas fort worth area is this meetup was more friendly to new people trying to get into the industry and we try to take more of an educational approach so not only does pwn school stream monthly meetings i also teach pen testing and web app pen testing workshops at different conferences for different colleges and for different cyber security

groups so if you're interested in checking us out go to pwnschool.com and there's a link to our slack channel as well as meetup for our scheduled meetings and i hope to see you at a pwn school meeting sometime soon [Music]

[Music] thanks
