
Those of you who know me know that I'm passionate about transparency in the software supply chain. Our next speaker is Dmitriy Beryoza, who's going to be talking about how you don't have to rely on open source software to learn what the heck is in a piece of software: with just a little bit of special skill in reverse engineering, you can learn that all software is, in fact, open source. Now, of course, we don't necessarily endorse this particular position, but he's going to give you some of the tools to think about that. Dmitriy is a senior security researcher at Vectra AI, and he's going to give you some excellent insights into how to open up the black box of
commercial software to figure out what's in there, and into discovering vulnerabilities and, ideally, fixing them. Take it away, Dmitriy.

All right, well, thank you very much for inviting me. It's great to be here. The reason I prepared this presentation is that I have some friends in cybersecurity who consider reverse engineering to be kind of hard to get into, kind of cryptic. My hope is that after seeing this presentation it will become a little more accessible, a little easier to understand and actually enjoy.

My name is Dmitriy Beryoza. I'm a security researcher with Vectra AI. We do on-prem and cloud threat hunting using AI and machine learning. Before that, I spent some time as a pen tester and
secure software development advocate with IBM X-Force's ethical hacking team, and before that I spent a lot of years doing software design and development. As you may have guessed, reverse engineering is one of my interests. I live in Ottawa, Canada. Originally I'm from Russia, so if you have any Russian hacker jokes, please send them my way.

All right, let's jump right into it. I'm going to talk about what reverse engineering is and what its modern applications are, what you can actually reverse, and why do it. There are certain legal issues associated with it, and I'll touch on those. There are different categories of reverse engineering that I'll describe, and some code
examples to get you started and give you a taste of what you're getting into. There are a lot of tools available, and I'll lightly touch on those. There are a lot of obstacles as you're doing reverse engineering, and I'll give a brief description of them. I'll present a self-learning plan for how you actually start practicing these new skills, and we'll wrap up with general strategies and resources.

So what is reverse engineering? It's basically a process of understanding how some device, system, or piece of software works just by examining it. Many times it's necessary because
the original design is either not available or actually intentionally withheld. When you're dealing with commercial software, that's very often the case. If you think about it, humans have been doing reverse engineering forever. Let's say some primitive tribe discovered fire or the wheel. They use that knowledge, but others also observe them and probably want to copy those inventions. Sometimes it's an adversarial exercise, because people don't always want to share the new invention that they have, and you have to guess and try to understand what that invention is and how it works in order to replicate it. So throughout human history we see
lots of examples of people doing reverse engineering just to copy new inventions. If you think about it, all natural sciences are essentially reverse engineering. Nature doesn't come with a manual. You have to very slowly, painstakingly discover its laws, and that's what scientists are doing: they're observing, they're making experiments, they're documenting their findings, and then they're repeating the cycle again and again. That's how we slowly discover the laws of nature, and it's very similar to what reverse engineers do when they are looking at, say, a piece of software. Even if you think you haven't done reverse engineering yourself before, you probably have; you just didn't think of it as doing
reverse engineering. When you were a child, you most likely pulled apart some toy because you wanted to know how it works. Or maybe you wanted to fix something around the house, some appliance that stopped working, but you didn't have the manual or the blueprints. You just opened it up and tried to fix it. It probably hasn't always been successful, but it's always an interesting exercise.

There are a lot of modern uses for reverse engineering. The first one that comes to mind is security and vulnerability research. We're in the security field, so that seems obvious: a lot of researchers are looking at all kinds of pieces of
software and hardware and trying to discover vulnerabilities. Penetration testing: in my penetration testing job there were often situations where the customer we were testing for wasn't willing to share the design of the product, so we had to resort to opening it up, seeing how it works, and discovering bugs that way. So it's a useful skill in that area. Malware analysis goes without saying: it is all about reverse engineering malicious code. But even beyond security, there are all kinds of applications in the world around us. Military and intelligence work is one: countries constantly look at what the militaries and
industrial complexes of other countries are producing, at new arms and new weapons, and they're trying to figure out how they work and maybe replicate some of those successes. So it's a very active field in military and intelligence. Scientific research deals with some of that too. In commercial and competitive research, this is often done to discover what your competitors are doing. One car company may buy another car company's new model, pull it apart, and discover what's inside and how they solved a particular technical issue, in order to either replicate it or just get new ideas for building new models. Or sometimes companies do reverse engineering just to
build compatible products. You can often see, let's say, printer toner cartridges available for sale that are not made by the original manufacturers, and this is probably a result of reverse engineering: somebody took an HP or Canon cartridge, opened it up, figured out how it works, and built a compatible version. There are companies that do independent quality control, and for that they have to open commercial products, examine them, maybe find weaknesses, and report on those. Another interesting field: there are companies that actually specialize in patent infringement detection. They examine, let's say, circuit boards or software to look for algorithms or certain designs that were actually
patented by their clients, in order to help with litigation, so that one company can sue another for stealing their patents and inventions. These are just a few examples; I'm sure there are many more out there. In this presentation we'll talk about software reverse engineering, but that includes hardware too, because much of modern hardware is actually driven by software.

So why should you do reverse engineering? A lot of the software and hardware around us are essentially black boxes. Documentation, architecture, and blueprints are either not available because the company didn't care to share them, or they're commercial secrets that are withheld intentionally. But we still want to be able to look inside those
black boxes. Sometimes we want to analyze products for safety or quality. If you're a security researcher, you want to look inside for dangerous bugs that you want to highlight for that company, or maybe for hidden back doors, which still happens: companies do build back doors for their own technical support that are later abused by attackers. Sometimes a company goes out of business, so there's no way to obtain the architecture of a particular product, and you're forced to open it, examine it, and learn how the product works in order to, let's say, fix it. Sometimes you want to build compatible products, and the company that you want to have
compatibility with is not forthcoming with their designs. They don't want to breed competition, they don't want to share their designs, so you have to reverse engineer that yourself. Yet another reason: maybe you want to modify how the product works, or fix a certain annoying issue that the company is unwilling to fix. Again, you need reverse engineering for that, and it gives you the power to open a black box and actually understand what's inside. There's a quote that I really like: when you know assembly, all software is open source. I think that's very true. Once you gain some skills in this area, all of a sudden
it becomes possible to analyze all these things that surround you, understand how they work, and maybe modify them, which is a great feeling.

If you're in information security, I would argue that these days it's a required skill to know some reverse engineering techniques. Why? First of all, there's vulnerability analysis. Of all these bugs that we see reported practically every day, most required some reverse engineering to understand what the vulnerability is and how to trigger it, so it just makes sense that you need that skill. Pen testing I already mentioned: it's a valuable skill if you're testing a company's products for holes. Incident response is a very active area
and those professionals constantly get samples of ransomware, malware, and phishing emails with embedded executables and scripts. They may analyze command-and-control instances and remote shells. There's all this malicious code floating around, and you want to understand how it works, maybe in hopes of discovering a kill switch, for example, and that requires reverse engineering. Last but not least, it can be a lot of fun. It gives you a sense of discovery and accomplishment. You overcome challenges, you look for secrets. So if you're into putting together puzzles, solving crosswords, geocaching, those kinds of activities, you'll probably enjoy it. And that's the goal, really: I firmly believe that you have to enjoy what you do for a
living.

So what can you reverse? Really, pretty much anything. Just a few examples: all executables that you see on Windows, macOS, and Linux that are compiled into native code; executables that are compiled into portable code, such as Java, .NET, and Python; iOS and Android apps. JavaScript drives many of the websites today, and not only websites, and it's often minified and obfuscated, so that skill is important for understanding how JavaScript and WebAssembly applications work. In malware you often see obfuscated PowerShell scripts, so reverse engineering skills are very useful in analyzing those. There's obfuscated Office scripting that you may want to open up and figure out. There are compiled automation scripts,
BIOS, boot loaders, shellcode, hardware firmware. The list goes on and on; pretty much anything is reversible.

Before we proceed: there's sometimes concern about the legality of reverse engineering. The big caveat: I'm not a lawyer, I don't even play one on TV, and this is not legal advice, but this is what I could gather by reading what's available out there. Whether reverse engineering is legal really depends on the situation, and I'll elaborate a little later. When you look at the license agreements of most commercial products these days, many of them contain anti-reverse-engineering clauses. Here are examples for Apple iOS
and for Windows 10. They specifically ask you not to reverse engineer their products, and it kind of makes sense. They don't want to reveal their intellectual property, and sometimes maybe they don't want flaws to be discovered and to be embarrassed, so I can understand that. And there are existing laws that actually restrict reverse engineering: there's the DMCA, which you may be familiar with, the Computer Fraud and Abuse Act, copyright law, certain European Union directives, and others. One thing to keep in mind when you're embarking on your reverse engineering journey is that you have more protection if you actually own the device or software
that you're reversing. If you got a copy of, let's say, a piece of commercial software from someone, pulled it apart, and later published your results, you would be in more jeopardy, I would say, than if you had legally purchased that software and then opened it up and analyzed it. So keep that in mind. There is something called the fair use defense in copyright law, and using copyrighted works for good-faith security research is likely fair use, but again, you would need a lawyer's assessment to be certain; that's just the general feeling. Under the DMCA, the legal owner of a program may reverse engineer it
and maybe circumvent protections to achieve interoperability with other products, so that's another case where it's sort of legal to do reverse engineering. And like many things in life, intent and what you actually do with the information that you discover is key. As I said, I'm not a lawyer, but there's lots of information on the internet. There is a very good review published by Harvard Law School and the EFF, and here's the link. By the way, I believe the slides will be shared later, so you don't have to take screenshots.

For my own work, I use this rough split of reverse engineering activities into three categories, the way I see
them, and again, this is not legal advice. Things that I believe are safe to do: first of all, anything that you produce yourself you're free to reverse engineer. Say you write a program and then disassemble or decompile it; you're completely free to do that. Non-commercial software, I believe, is also fair game. If you are looking at commercial products but do it in private, just for learning, just to understand how things work, and you do not publish your results or take advantage of them in any way, I think that's a pretty safe activity. And of course malware analysis: malware
authors will not go after you if you reverse their malicious code. The second category is where you have to go more carefully. If you're reverse engineering commercial products for security research, there are gray areas. Bounty programs do offer some protection; I think there are explicit clauses there that say, you know, we want you to test our products for security and we do allow a certain degree of reverse engineering. You have to be careful when you publish portions of reversed code as part of responsible disclosure. When publishing tools that are built on knowledge you discovered through reverse engineering, you have to tread carefully, because there were legal cases in the
past. There was a case of circumventing DVD copy protection and one of circumventing Adobe protections. Charges were dropped and people were acquitted, but that doesn't mean that's going to happen in every case, so you really have to be careful. Another thing to think of is that even if what you're doing is completely legal, it's in the eye of the beholder: the company may take offense at you reversing their product, and they may legally harass you, send cease and desist letters, and cause all kinds of legal trouble for you, even though you may be within your rights. And they probably have deeper
pockets than you, so it's something that I think most of us wouldn't want to happen to us. The final category is things that are really dangerous: if you're reversing some product and just dump all the proprietary information that you found, or, what's even worse, you start profiting from it, or maybe you start building compatible or competing products. You have to seek legal advice, because when there's money involved and you use somebody else's intellectual property for your financial gain, that's dangerous. You have to ask for help in those cases.

There are several types of reverse
engineering. You can start with data and communications analysis. That's useful when what you're looking at, let's say a piece of software or a device, is truly a black box and you don't have access to the underlying executable, for example. You can still do some reverse engineering, because you can monitor the data formats that are being produced: you can interact with the product, dump the different files it produces, and tease out what those formats are and how exactly the data is stored. The same goes for analysis of network protocols. It requires a lot of patience, careful examination, and the use of network capture tools
and binary editors. Disassembly is the next step, where you actually have executable code you can look at. You take the compiled binary and convert it into human-readable machine instructions. That can be done for native code and for portable code, such as JVM bytecode. And finally there is decompilation. There are a bunch of tools that are smart enough to convert machine instructions into something close to the original source code. The results are spotty; most of the time C-like pseudocode is recovered. For portable code, decompilation works at a higher level, and you get much better
results with those; I'll show you an example in a second. There is also static analysis versus dynamic analysis. Static analysis involves just looking at the code, and it's useful when, let's say, you don't have hardware or emulators available to run a particular piece of code, or where debugging is not possible for whatever reason, or anti-debugging measures are just too hard to overcome. That's balanced by dynamic analysis, where you do reverse engineering through execution and debugging. It helps in cases where code is heavily obfuscated, compressed, or encrypted, and you want to open it up without sitting with pen and paper and trying to analyze things.
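As a small illustration of the purely static side, here is a minimal Python sketch that pulls printable strings out of a binary blob without ever running it, much like the classic `strings` utility does. The `extract_strings` helper and the sample bytes are made up for this example; they are not from the talk.

```python
import re
from typing import List


def extract_strings(blob: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII at least min_len bytes long."""
    # 0x20..0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(blob)]


# Hypothetical bytes standing in for a compiled program: a header,
# a prompt, a few opcode-like bytes, and an embedded password.
sample = (
    b"\x7fELF\x02\x01\x00\x00"
    + b"Enter password: \x00"
    + b"\x90\x90\xc3"
    + b"s3cret!\x00"
    + b"\x01\x02"
)

found = extract_strings(sample)
print(found)
```

Nothing is executed here, so this kind of check still works when you have no hardware or emulator for the target. It stops being useful as soon as the strings are compressed or encrypted, which is where dynamic analysis comes in.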
Static and dynamic analysis really go hand in hand. Disassemblers and decompilers are often integrated with debuggers, and the combined approach often works best.

So let's look at a small example. This is a tiny C application that has an embedded password, which is a horrible way to program, but anyway. It just checks whether the password that the user enters on the command line matches, and exits. Pretty simple so far. If you compile it and open it in a binary editor, you'll see this. It looks kind of intimidating when you see something like that for the first time, and that's totally understandable. I think that's what
stops a lot of people from actually proceeding, because who can decipher this junk? Well, luckily there are tools that can open that up for you and present it in a more human-readable form. Here is that same application after it was compiled, now opened in a debugger, or rather in the disassembler that's embedded in the debugger. Now you can see that it's not junk; it's actually a set of instructions. Maybe the syntax is not familiar to you, but you can start to see method to this madness: different parts are doing different things, and the names start to look significant and easier to understand. So
this is a much better picture. You can see that it's not random: this is a set of machine instructions that correspond to that original piece of C code. And as you arm yourself with, let's say, an instruction manual for that particular processor, you can look closely and see the correspondence between things that were in the original source code and what's being generated. We have a constant there that's referenced, you have some comparison operators, you see the setup for a function call, and there is checking of the return result from a function. So things become a little clearer; it's not as scary anymore.
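The C program and its disassembly were shown on slides and are not reproduced in this transcript. As a rough analogue only, the sketch below uses Python's standard `dis` module to disassemble a hypothetical `check_password` function into bytecode, so you can try spotting the same landmarks: a referenced constant and a comparison. The function name and the password are assumptions for illustration.

```python
import dis


def check_password(candidate: str) -> bool:
    # Hard-coded secret: poor practice, kept only to mirror the talk's example.
    return candidate == "s3cret"


# Print a simple listing: offset, instruction name, and its argument.
for ins in dis.get_instructions(check_password):
    print(f"{ins.offset:4d}  {ins.opname:<20} {ins.argrepr}")

# The embedded password sits in the function's constant pool,
# so it can be read without calling the function at all.
embedded_constants = [
    const for const in check_password.__code__.co_consts if isinstance(const, str)
]
print(embedded_constants)
```

Exact opcode names differ between Python versions, but the listing should show the constant being loaded and a comparison instruction, which is the same kind of correspondence the speaker points out in the native disassembly.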
Once you employ the decompiler, things get even better, because the decompiler, as I said before, attempts to recover something close to the original source code. This is the result of running a decompiler over that application. As you can see, yes, some things were lost: comments and some symbolic names. But the algorithm that's recovered is very close to what we originally had, and that's a big help. If you look further up, at languages that compile into portable code, such as Java, you will see that the picture is even better. You get almost a one-to-one correspondence between what the decompiler produces and what the original Java code was. You
lose little things: some descriptive comments, some descriptive variable names, maybe sometimes function names, but things look much better now. Decompilers also exist for things like .NET code generated from C#. Here's the C# example, and on the right is the decompiler's output; again almost a one-to-one correspondence, which is pretty encouraging.

So what tools are available? IDA Pro is synonymous with reverse engineering. It has been around for a long time and is really an ultimate reverse engineering tool. It supports many processors and environments, and has a disassembler, a debugger, and a decompiler add-on, so it has everything. It has plugins and scripting, and it's a really mature tool at this point. The downside is that
it's kind of expensive: two to four thousand dollars for a single architecture. So if you want to do reversing on, let's say, a 32-bit platform and then on a 64-bit platform, you have to buy two copies. Not everyone can afford that, but luckily there are free versions that you can start with, and there's a home version that's much less expensive, for professionals who are working on their own. Ghidra is another famous tool, open-sourced by the NSA, of all organizations. It supports many architectures and is also fairly mature. I would say it's a little rough around the edges sometimes, especially the user interface, but there's tons of work going on on
improving it, so that's great, and you can contribute to it; it's on GitHub. It has a disassembler and a decompiler, and with the latest version it now has a debugger too, which is pretty cool. It also has plugins, scripting, and so on. The list of tools is really endless. There are complete frameworks like radare2 or Binary Ninja. There are decompilers, there are debuggers, and there are tools specifically for .NET, for Java, for Python, and for WebAssembly, so there's a full spectrum. There are even special virtual machines that you can download and run. One is FLARE VM, which I think is published by FireEye and runs on Windows, and REMnux is a Linux VM that comes with a lot of
reverse engineering tools already built in, and it simplifies things by having everything in one place, sort of like what Kali or Parrot is for pen testing. The list is endless, so don't be offended if I didn't mention your favorite tool.

Of course, not everything is as rosy or as easy as we saw in the previous slides. As you actually do reverse engineering, things get pretty difficult. First of all, as part of compilation and compiler optimization, code is made more difficult to reverse. You lose code comments and meaningful names, you often lose the structure of data, and you lose objects for object-oriented languages, so that makes it kind of difficult.
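To make the "what survives compilation" point concrete, here is a minimal Python sketch, offered as an illustration rather than anything shown in the talk. It compiles a tiny, made-up snippet to bytecode and checks what is left in the serialized result. Python bytecode is portable code, so identifiers survive while comments do not; a native compiler would typically strip the names as well.

```python
import marshal

# Hypothetical source with comments that describe the design intent.
SOURCE = '''
# design note: apply the loyalty discount before tax
def total_price(unit_price, quantity):
    subtotal = unit_price * quantity  # internal pricing rule
    return subtotal
'''

# Compile to a code object and serialize it, roughly what ends up in a .pyc file.
code = compile(SOURCE, "<demo>", "exec")
blob = marshal.dumps(code)

# Comments never make it into the compiled form...
comment_survives = b"loyalty discount" in blob or b"internal pricing rule" in blob
# ...but function and variable names do, which helps decompilers a lot.
names_survive = b"total_price" in blob and b"subtotal" in blob

print("comments survive:", comment_survives)
print("names survive:", names_survive)
```

This is one reason decompiled Java, .NET, or Python tends to read so much closer to the original than decompiled native code.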
As part of reversing, you have to recover all of that through careful analysis in order to understand what's going on. Variables move between the stack and registers, and that gets confusing. Execution flow gets obscured because of optimization, so your loops, conditional statements, and exception handling code can be moved around a lot, and decompilation may not recover that cleanly. Plus there is function inlining and embedding of libraries, which just explodes the body of code, so it makes things hard.

Then, of course, software publishers will not sit idly by and let you discover their secrets. They're not happy about it, so they employ all kinds of countermeasures to complicate analysis. I'll quickly go through
a laundry list of things. First of all, there is obfuscation: they sometimes deliberately obfuscate names, let's say for scripting or bytecode languages, so it's hard to understand the meaning of the code. Code gets minified for scripting languages. Packers are often employed; they compress the code, which gets decompressed on the fly, and of course that makes it harder to analyze. Encryption is popular: code sometimes gets encrypted and is decrypted at runtime, so again you have to reverse things dynamically to understand what's going on, and that can go down multiple levels. There are anti-disassembly methods, such as jumps into the middle of an instruction and
false branches, which confuse disassemblers. There are anti-decompilation methods: obscuring of program flow, useless dead code, extreme optimization. Some tools actually implement a new VM: they come up with a new instruction set, compile the original application into that instruction set, and that mini VM executes the application, so things can get really obscure there. As you start debugging, there are all kinds of anti-debugging tricks, from debugger detection to VM detection and many others, and this is not an exhaustive list. So this makes things very difficult, but it's an arms race. Debuggers and VMs sometimes have measures for hiding their presence, and disassemblers and decompilers get
smarter over time, so they can overcome some of these tricks. The bottom line is: expect these challenges in your reverse engineering work. They can all be overcome with adequate techniques and tools. My firm belief is that there's nothing out there that cannot be analyzed and reversed; it's just a question of how much time and effort you're willing to invest.

So here's a rough plan I would propose if you want to start out. First of all, level zero: as with many things in life, the only way to get good at it is to study and practice, and I would say there are certain required skills before you begin.
It's not a really high bar for entry, but you have to have some knowledge. You need at least a beginner's ability to write in a programming language; C and C++ are an asset. Basic knowledge of the target computer architecture is a plus. You need a general idea of how compilers operate, how they take the original source code and produce machine code. You don't need to be an expert, but some understanding is necessary; otherwise you'll just be looking at manuals all the time, and it will be very slow going. And of course attention to detail and patience are assumed. Then,
once you feel that you have that, you can start with do-it-yourself code samples. The reason I propose that is that you can write really simple programs that try different features of the programming language, then compile them and try them in different decompilers and disassemblers. That way you're starting slow, with very simple stuff, and you're learning the tools at the same time. You don't go whole hog on a really complex application; you start with really primitive code. As part of that, you can also look at how the compiler produces the code, because it's not just the code that you put into the source code:
there's some setup and teardown code that gets added to the executable, there are function prologues and epilogues, and there's library code. You can try debugging. So you start little by little, and the big advantage is that you take one of the variables out of the equation, in the sense that you know exactly what the end result of your efforts will be, because you wrote those little applications. Once you've practiced with that sufficiently, the next level is to do crackmes, as they're called. These are challenges designed by someone else, typically fairly small applications that hide a secret called a flag, and you can go at your own pace. So
there are lots of crackme sites out there. The one that I like is called WeChall, because it's like a global directory of challenge sites, and from it you can go to many others. There is a global scoreboard to gamify the experience, which is always interesting. Once you've conquered that, the next level, I would say, is doing CTFs, online or in person once the pandemic subsides. There are competitions of all levels, from beginner to pro. They occur all over the world fairly regularly, and in almost every CTF you'll see reverse engineering challenges. Typically you have about 48 hours for a particular competition, but there
are many different styles; some are shorter, some are longer, but that's the most common format. The good thing about this is that you're no longer going at your own pace. There's a certain time pressure added to the experience, and in my experience that puts learning into overdrive. I recommend CTFtime, a very well-known resource. It's a directory of all kinds of CTFs going on all over the globe. Pretty much every week there's something going on that you can just join and play, and you don't have to be a pro to start. There is a global scoreboard there too, so they also try to gamify the experience a little bit
to encourage you to do more and get better. The ultimate level, I would say, is a competition called Flare-On, created by FireEye every year. As far as I know, it's the toughest RE competition in the world, but I may be wrong; please let me know if you know something tougher. They usually have about 12 challenges of increasing difficulty, and you get about 40 days to complete them. That may seem like a lot of time, but it isn't, because the challenges get really tough towards the end. You can spend days and weeks on a particular challenge, and there's
a huge variety of platforms for which you get these challenges, so you get to try a little bit of everything. What's very interesting is that many challenges are based on real malware samples and real malware techniques, so that will help you in your day job as well. And the finalists get a prize: whoever reaches the end actually gets the prize, which is pretty awesome.

There are certain general reverse engineering strategies that I think are useful. First of all, as I mentioned, I firmly believe that you have to enjoy what you do. Making reverse engineering an enjoyable experience is key. It requires a lot of time and patience, and you're digging into all these details;
you may as well enjoy it otherwise you will dread it why do it otherwise so make sure it's something that you're really motivated to do maybe solving a puzzle maybe you're playing in a competition maybe you're trying to discover a secret back door or trying to find a bug get some motivation for doing it and it will go much smoother reverse engineering is very complex and modern applications are insanely complex even something as small as windows notepad has roughly 13 000 machine instructions it's just mind-boggling so it's just unrealistic to expect that you will take a random application and will quickly fully understand how it works it just doesn't work that
way even if you had the source code to any reasonably sized modern application that's fully commented it will take days and weeks to understand how it works and now we're faced with something that doesn't have that source code and no documentation so reversing something completely is just an unrealistic goal so what i would suggest that you focus on is a certain target that you're looking for maybe you're trying to discover a particular algorithm or you're looking for a specific piece of data so focus on that do not try to understand everything do not try to boil the ocean so just focus on the prize and then slowly expand your understanding around
that as necessary i mentioned this before combining static and dynamic analysis is important because they each have their merits using a debugger for analysis helps you a lot because analyzing by hand is often too labor intensive and you can make the application do the heavy lifting so let's say all kinds of encryption compression and obfuscation algorithms are employed you can debug it in such a way that these algorithms actually get executed so the application opens itself up for you and lets you see the decrypted decompressed representation so rather than you doing it yourself with careful debugging and maybe jumping around with a debugger you can actually achieve that
and that helps a lot and you can make static analysis simpler by scripting some of the reversing work so let's say you discovered the compression algorithm you can quickly script it let's say in python and apply it to the executable rather than doing everything by hand and scripting in general is great it's built into maybe not most but many reversing tools so take advantage of that it really helps you automate tasks and makes it an easier experience that you will enjoy looking for specific pieces of information that you want to discover is also important in the sense that there are things like magic numbers for example there are
parts of specific algorithms let's say specific encryption algorithms that sometimes use magic numbers so if you look for those pieces of information in the code it helps you discover where that particular algorithm is located and those can be found in hashing or encryption there are also certain system calls or library functions that will tell you a lot about what a particular application is doing so starting by looking for those can help you build a mental picture of what's actually going on and there are tools and plugins to help look for those documentation is very important because you're dealing with a very complex activity there's a lot of info that's lost in compilation
and as you slowly gain understanding of what different fields are what different functions are give them meaningful names and a lot of tools allow you to do that add documentation as you go add comments restore data structures and you will see that as you do that kind of in a rinse and repeat fashion iteratively you will reach an overall understanding much more quickly compared to if you just don't document anything and just kind of try to stare at the code and understand what it is doing so there are lots of resources online to help you with reverse engineering there are lots of blogs
articles google it there's a reverse engineering specific conference that occurs in montreal and brussels at least that happened before the pandemic we'll see how things go in the future and there are discussion boards online there is a specific reverse engineering discord there is a reddit section there is a stack exchange area where you can actually ask questions and experts can help you so that's pretty useful and there are also well not a lot but there's a certain number of books that are available read those and if you download the slides you can click on the links there so to wrap up as we've seen a lot of software and
hardware around us are black boxes and reverse engineering gives you that power to look inside and actually understand how they work and then do things with that knowledge maybe you can fix a problem that occurred or maybe you can discover some hidden bugs hidden security issues or back doors so this is really a powerful skill maybe you can build compatible solutions really the sky is the limit it's really like a superpower in a way there are plenty of tools available to help you reverse pretty much any product out there so seek them out and experiment and as you learn more about it i'm sure you will enjoy reverse engineering as a fun and really
intellectually rewarding activity that's it and well thank you very much for listening you can connect with me on discord or twitter or linkedin and there's a link for slides thank you very much for your attention all right thank you dimitri that was a really interesting talk and i think a great summary of how we can sort of dive into it you covered a lot of ground and touched on a lot of fun things but i was wondering you know one of the fun things about infosec is the war stories and i'm guessing that if we were in person you'd have a lot of fun stories to tell over the bar
we're live in public now so obviously there could be some things that you can't say but can you tell us about some interesting things you found just to help us understand what are some of the values of developing some reverse engineering skills sure so i actually primarily practice in things like competitions and in solving offline challenges because as you can understand with all kinds of legal restrictions you have to be really careful and not overstep the bounds and practice safely where you can and that's what i would recommend to everyone and that's rewarding in itself because you're not just you know doing this random exercise you are actually
striving towards a goal in a competition you'll get a flag your team moves ahead so it's really rewarding in and of itself but it has real world value as well as i mentioned in my pen testing job we frequently had to analyze code for which source code was not available so essentially we had to open up binaries and see what's inside and there were many instances where that paid off in spades because in software engineering even today people tend to think that you know if i come up with this clever algorithm of maybe obfuscating this little bit of data or maybe i'll hide a saved
password here but i'll encode it really well nobody will guess how to do that but you know we read in the news every day how hackers actually open up those defenses and find hidden secrets so i had many instances where i would take an executable open it up and immediately stumble upon a password that happens to be hard-coded in either just base64 encoded or maybe there was just a little more thought given to hiding it but still for a really diligent attacker that really knows his stuff it's really no obstacle sometimes people will try to do the right thing and say oh yeah we need to use this internal
certificate or a password but we will encrypt it and we'll store the encryption key hard coded in the code right near where you decrypt it right exactly exactly so reverse engineering really helps with that because as a developer and i used to be a developer as i mentioned for many years you think that you know this is my source code i compile it and then this becomes this you know opaque black box that nobody can look inside and nobody knows what's inside once the application is compiled and that's just not true and for someone who deals with reverse
engineering regularly your application is essentially an open book to them anything you hard code inside anything you encode or maybe encrypt but you keep the encryption keys around that's trivial to open up and i mentioned competitions challenges like these where something is encoded inside are just reverse engineering 101 those are warm-up challenges in competitions so i guess it's a lesson to be learned by those who build software for those who are on the blue team side when you spot these kinds of behaviors in your software engineering team stop them because it's not secure and qualified people will discover things like that
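As a rough illustration of why a base64-encoded secret inside an executable is no obstacle, here is a minimal Python sketch. It is not a tool from the talk: the function name `find_base64_secrets`, the sample bytes in `fake_binary`, and the credential are all hypothetical. It scans raw bytes for runs of base64 characters and keeps the ones that decode to printable ASCII.

```python
import base64
import binascii
import re

# Runs of 16+ characters from the base64 alphabet, with optional padding.
B64_RUN = re.compile(rb"[A-Za-z0-9+/]{16,}={0,2}")


def find_base64_secrets(blob: bytes) -> list[tuple[int, str]]:
    """Return (offset, decoded_text) for base64 runs that decode to printable ASCII."""
    hits = []
    for match in B64_RUN.finditer(blob):
        candidate = match.group()
        # Valid base64 text has a length that is a multiple of 4.
        if len(candidate) % 4 != 0:
            continue
        try:
            decoded = base64.b64decode(candidate, validate=True)
        except (binascii.Error, ValueError):
            continue
        # Keep only results that look like human-readable text.
        if decoded and all(32 <= byte < 127 for byte in decoded):
            hits.append((match.start(), decoded.decode("ascii")))
    return hits


# Hypothetical stand-in for an executable: some header bytes, padding,
# and a credential that a developer "hid" by base64-encoding it.
fake_binary = (
    b"\x7fELF\x00\x01"
    + b"\x00" * 8
    + base64.b64encode(b"admin:hunter2!!")
    + b"\x00\x90\x90"
)

for offset, text in find_base64_secrets(fake_binary):
    print(f"offset {offset}: {text}")
```

The sketch is deliberately naive: a run that happens to sit next to other base64-alphabet bytes is skipped, and real binaries produce false positives, so the output is a starting point for manual inspection rather than a verdict.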
there was one example where we were given a compiled executable so that we could analyze it as part of a wider bigger software package we had to analyze for security and in there as i was decompiling and looking through the code for interesting stuff i found some interesting pattern where they would you know take different pieces of data and encode them and put them together and in the end what i stumbled upon was essentially a back door that the manufacturer put in so that they could connect remotely to that piece of software and in order to discover the key for this backdoor you just had to
combine together several pieces of publicly known information or maybe something that you could easily get from social engineering so that's a very dangerous thing and we highlighted it for the customer but that's the kind of stuff you discover through reverse engineering and it's really rewarding when you stumble upon that because it gives you a sense of accomplishment you really found this hidden secret and at the same time you made the product more secure i like that and building on that we've got a few minutes left but i want to make sure we get to barton seal's question in the discord saying you know going back to this
idea of keeping your eye on the prize to help sort of organize this do you have some recommended strategies for identifying what are the things we should be focused on is this something that you have a checklist for or does it depend too much on the software yeah well i mean mostly i'm speaking from experience again through competitions and those are very specific challenges where somebody builds a challenge that people have to solve as part of a ctf but when i approach a new executable that has let's say symbols removed has you know things obfuscated it's really hard to even understand where to start so the things that i look for are
first of all there are system calls in the end you invariably would probably have some kind of a system call that would either let's say output something to the screen or write to a file those you could use as your anchors to kind of start pulling on things printf statements maybe magic numbers let's say in a lot of encryption algorithms and hash calculation algorithms there are certain constants that are used and if let's say you're looking at the sea of machine instructions and then you spot a magic number all of a sudden things come into focus and you see that
oh actually this is not just you know random code this is actually part of this particular encryption algorithm let's look and see where it's called and as you discover these little things system calls maybe known pieces of let's say logic that's part of a programming language that's part of the library that's compiled in magic numbers once you anchor on those then you kind of expand outwardly it's very important to document things to add comments to name things with good names so that building then on that knowledge you can expand more and more and sorry one more thing in some tools like ida
and i think ghidra has it and some other tools there's actually i forget the name there's this functionality that discovers standard pieces of code standard library functionality in an executable so you could run that functionality over your executable and it can actually go and discover that by the way this function that's named i don't know abc is actually a string compare function and this function is actually an encryption algorithm so they have that power to recognize some common code and give it good symbolic names okay well thank you so much i think we are just about out of time for q&a but it was mentioned in the comments
that this is probably one of the best reverse engineering 101 talks that we've seen a lot of people are going to be asking you for some of the links you shared so hopefully the slides will be available but perhaps you can chime in and sort of send people some pointers as well on the discord absolutely all right thank you again and have a great rest of the conference thank you looking forward to some great talks coming up but perhaps maybe it's time for a drink