
Hello, how are you? Very good, thank you. So who's who? Tell me who's who. I'm Rithwik and he's Sourag. Hi. The silent one is Sourag, yeah. Gentlemen, thank you very much for coming on to BSides 2020 virtual edition. You're going to be talking about FPAnalyze: aid in exploitation with automated analysis. In the interest of time, I'm going to ask you to immediately start your presentation, and I will be keeping an eye out for any questions. As I mentioned before, folks, please ask your questions in the YouTube chat box and I will keep an eye out for them. So please, Rithwik and Sourag, take it away. Yes, sure, thank you.
So I hope it's...
So just let us know when we can start.
Yeah, you can start now, go. Thank you. Hello guys, today we are going to introduce our tool FPAnalyze, which will aid in exploitation with automated analysis. I am Rithwik and I am currently pursuing my third year in CSE at Amrita University, Kerala. I regularly play CTFs with team bi0s. I do Linux binary exploitation and I have a keen interest in memory corruption and vulnerability research. I am Sourag and I am also pursuing my third year in CSE at Amrita University. I usually do binary exploitation, and right now I am mainly focusing on kernel exploitation. So, about team bi0s: it is the number one CTF team of India. It was founded in 2008 by our mentor
Sri Vipin Pavithran. The members of the club engage actively in security research, and we conduct international CTFs on a yearly basis. So let us now move to the agenda of today's talk. Before jumping to the tool: to be able to use it efficiently, we need to have arbitrary memory read and write primitives. Wait, what are these? So first we'll be looking into unveiling the attack surfaces and have a quick overview of primitives. Then we'll also see a few common vulnerabilities on the stack and heap which can be chained to get arbitrary read and write, thus making it perfect for FPAnalyze to come into the picture. Note that there are many other ways we can get arbitrary memory
read or write, but for the sake of brevity we will discuss only stack overflow, format string, heap overflow and use after free. So let's get started. Following this, once we have a proper understanding of primitives and a few common vulnerabilities, we are all set to introduce FPAnalyze, where we'll see why we need FPAnalyze and have a quick overview of its working. Then we'll also see what the core idea behind FPAnalyze is, then we'll go a little into the internals of the implementation, and towards the end we also intend to have a small live demonstration with CTF-style binaries. So, it's not about the bugs you ultimately discover, it's about the primitives you find along
the way. So, as you can guess, we are moving towards unveiling the attack surfaces and getting a quick overview of primitives. But wait, what is a primitive? Modern exploitation tactics can be broken down into exploitation primitives; they can be thought of as the building blocks of an exploit, simple as that. In a broad sense, primitives can be divided into two types: the first is the arbitrary memory read primitive and the second is the arbitrary memory write primitive. We'll have a look at them one by one. So first, what is arbitrary memory read? Arbitrary memory read lets you read or leak addresses of memory segments. But why do you need to leak memory? The
reason is that you can't directly copy addresses from memory and use them in your exploit. The system, you see, enforces randomization techniques like address space layout randomization (ASLR), which randomizes almost all memory segments. Hence, to successfully exploit the program, the very first step is to leak the memory addresses of the segments that interest us, like the libc segment, which has functions like system. So CTF players, this is for you. Now we will realize arbitrary memory read with a very small example. In this sample you can see that I have two long integer pointers, read ptr and write ptr. For now, just forget about write ptr. So the program asks for
an address, takes that address, and prints whatever is there at that address. So this is a classic example of what arbitrary memory read is: you can give any address and get the data at that address. Cool, isn't it? Now we compile this program with suitable flags, and you can see that since PIE randomization is disabled at compile time, we can still use hard-coded addresses of the program's global data segment. Wait, what is that? This segment is called bss, and it contains addresses of libc functions which are used during the execution of the program. A table which maintains all such libc addresses in the bss segment of the program is called the global offset table, GOT
in short. Now, if we give the global offset table address of the function puts (you saw that puts was being called in the program, hence we can use the global offset table address of puts), we get the data inside it as the output. You can see that the data is nothing but the libc address of puts. Once we have leaked an address from any memory segment protected by ASLR or PIE, we can offset to any memory location within that segment and get the addresses of useful functions like system. So now, moving on to the arbitrary memory write primitive. The arbitrary memory write primitive basically allows attacker-controlled data
to be written to an attacker-controlled memory location, so you can write something somewhere. This is the most powerful primitive, as it can give us arbitrary code execution. Let us realize the arbitrary memory write primitive with a small example. Here you can see that we have the same example as before, but in place of read ptr we are now going to use write ptr. So we take an address, we take a value, and then we write the value to that address: an attacker-controlled value to an attacker-controlled address. Now let us run this example. Same as before, we are going to give the GOT address of puts, as the program is compiled without the
PIE randomization technique, and we give the decimal value of 0x414141, which is the standard AAA. You can see that in memory, before the overwrite happened, the GOT entry of puts holds the libc address of puts, but after the overwrite you can see that the puts entry has been successfully overwritten with our input, 0x414141. This is a classic example of how we can write anything anywhere. So together, if arbitrary memory read and write exist in a program, we can be fairly sure that the program can be pwned. Now that we know how powerful arbitrary memory read and write primitives can be, let us try to understand a few common vulnerabilities
which are still prevalent in the world of security. When we begin to exploit a program, we always try to convert our bugs into the exploitation primitives that we have discussed so far. So we'll be looking into stack-based vulnerabilities, starting with stack overflow. Stack overflow is the most common type of vulnerability we see every now and then. To understand this, let us take a quick look at a sample program. In this program you see that I have a buffer of size 16 and I'm reading 32 bytes into it, which means an overflow certainly exists here. But to realize and understand an overflow, we also need to see how the
stack looks. You can see that before the attack, the stack of any function will look like this: it will have the parameters of the function passed on the stack, then the return address into the caller function, then the saved base pointer, followed by the buffer and all the local variables down the stack. But in an attack scenario, as you can see, we have a buffer of size 16 and we can give 32 bytes of input, which means we can fill this buffer of size 16, then corrupt the base pointer, and finally corrupt the return address. But wait, how does corrupting the return address do anything? The return address
basically controls where the program will go after the function has finished. So if you can overwrite the return address with an attacker-controlled value, you can return to any function or any address. That's pretty cool, yeah? So what we'll do now is this: we know that the size of our buffer is 16, and (here we are assuming a 64-bit architecture, so the stack will be 8-byte aligned) as you saw in the previous picture of the stack, the first 16 bytes fill the buffer, the next eight bytes overwrite the base pointer, and finally the next eight bytes corrupt the return address. So
we pipe the input to the binary that we are running, and you can see that the program has clearly segfaulted. To understand this, we have to see what happens before and after the overflow. Before the overflow, you can see the stack is quite innocent: the return address is still intact on the stack, and you can see that it is pointing into __libc_start_main, right? So it's fine. But after the overflow has happened, the return address is corrupted with our input. Hence, if the input is a valid address, you can jump to any attacker-controlled place and even pop a shell. So now we'll go on
and see what a format string vulnerability is. Format string is one of the most powerful vulnerabilities, and it can easily be molded to achieve memory read and write simultaneously. It's cool, right? So we will realize format string with a small example. Here we have a buffer of size 16, we are taking 16 bytes of input into that buffer, and we are calling printf on the buffer directly, without a format string of our own. Since the program supplies no format specifiers, our input can contain untrusted format specifiers, which means we can now read and write using these format specifiers. Let us see how we can do that. To understand how format string works, we also have to see the state of the stack at the instant
when the vulnerable printf is called in the program. You can see that at the instant when the vulnerable printf is called, our input, which is AAA as usual, is on top of the stack, and among a few other things there is also the return address stored on the stack. Now, to realize format string, we also have to understand how arguments are passed to a function. On a 64-bit architecture, the first six arguments are passed through registers and subsequent arguments are passed on the stack. So the first six arguments are in registers, and after the sixth argument everything is on the stack: the seventh argument, the eighth, the ninth and
so on will all be on the stack, which means we can now offset to any argument and read or write anywhere on the stack. Now you can see that at the ninth offset, or the ninth argument, there is the return address. Let us see how we can leak memory using format string. To leak memory we can use the %p format specifier. Since the memory we want to leak is at the ninth offset on the stack, we give %9$p, where the dollar sign is used to reference the offset, so we are basically leaking memory from the ninth offset with %p. You can see that it is the exact same
address which was there at the ninth offset. Similarly, we can use the %n format specifier. The %n specifier takes the number of bytes that have been printed before %n is encountered and writes that count to a memory location you specify using the same offset syntax. So if I give %10$n, it will write through the pointer stored at the tenth offset on the stack. This is how we can use format string to achieve memory read and write simultaneously.
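The two stack bugs above both come down to byte strings fed to the program. Here is a minimal Python sketch of those inputs, assuming a 64-bit little-endian target with the layout described in the talk (16-byte buffer, 8-byte saved base pointer, then the return address). The helper names and the sample address 0x401136 are hypothetical and are not taken from the talk's binaries.

```python
import struct

BUF_SIZE = 16    # size of the vulnerable buffer in the talk's example
SAVED_RBP = 8    # saved base pointer sits between the buffer and the return address


def overflow_payload(target: int) -> bytes:
    """Fill the buffer, clobber the saved base pointer, then overwrite the return address."""
    return b"A" * BUF_SIZE + b"B" * SAVED_RBP + struct.pack("<Q", target)


def leak_spec(offset: int) -> bytes:
    """Positional %p: print the value found at the given argument offset."""
    return b"%" + str(offset).encode() + b"$p"


def write_spec(offset: int) -> bytes:
    """Positional %n: store the count of bytes printed so far through the pointer at that offset."""
    return b"%" + str(offset).encode() + b"$n"


payload = overflow_payload(0x401136)   # hypothetical address of a function to return to
print(len(payload), payload)
print(leak_spec(9), write_spec(10))
```

The overflow payload comes out at exactly 32 bytes, which matches the 32-byte read in the example, and both format specifiers fit comfortably in the 16-byte format string buffer.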
Now we are going to move on to the heap part of the vulnerabilities. First we'll look into heap overflow. But what is the heap? The heap is basically dynamically allocated memory, and keep in mind that we'll be using the term chunk for the blocks of memory allocated on the heap. Now let's take an example and see what the heap looks like when we allocate memory. Here we have a small program with two mallocs of size 0x10, and we are going to read input into both of those. Let's see what the heap looks like after we run this program. As you can see, there are two chunks, and the size of each chunk is 0x20.
But we asked for 0x10, so how come it is 0x20? What happens is that every malloc chunk has a header of size 0x10, so basically the allocator adds up both sizes and then allocates that much, and our data starts after this header. As you can see, our data starts just after the header, and the header contains the prev size and the current size. The prev size is the size of the previous chunk in memory, and it will only be set if the previous chunk is free. It is used by free and malloc for offsetting purposes. Next we will see a heap overflow program. Here we have a similar sample
program, but instead of taking 0x10 as the input size, we are going to read 0x20. Our chunk is only of size 0x10 and we are reading 0x20, so we definitely have an overflow. But what can we do with an overflow on the heap? Let's see. We'll be calling read with the chunk's address as the argument and 0x20 as the size. Before calling read, the data parts of both chunks are null, because we haven't read anything into the chunks yet. After reading the input, you can see that the header and the data part of the first chunk are intact and nothing happened there, but the
size and prev size fields of the next chunk have been corrupted. That is because, with the overflow, we can basically write into another chunk, and this can give us a lot of control. Now we will see use after free. As the name suggests, use after free means referencing memory after it has been freed. If we write into freed memory, we can have consequences like the program crashing, and it can even be molded to get arbitrary memory write. One more thing to keep in mind: when we free a chunk, it goes into a linked list, and the type of linked list depends on the size. For later libc versions, if our
size of the chunk is greater than 0x400, it goes into a doubly linked list; otherwise it goes into a singly linked list. This is a visual representation of how the chunks look in memory when they are free. We have two chunks which are free, so the first chunk points to the second chunk, which in turn points to null.
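The size arithmetic and the choice of free list described above can be modelled in a few lines of Python. This is a rough sketch that assumes 64-bit glibc with the default tcache settings (glibc 2.26 or later) and a tcache bin that is not yet full; the constant and function names are illustrative, and the exact cutoff is expressed here as a chunk size, which is slightly different from the rounded "0x400" figure used in the talk.

```python
SIZE_SZ = 8                # width of one size field on 64-bit
ALIGN_MASK = 0xF           # chunks are 16-byte aligned
MIN_CHUNK = 0x20           # smallest chunk the allocator hands out
TCACHE_MAX_CHUNK = 0x410   # assumption: largest chunk size served by the default tcache


def chunk_size(request: int) -> int:
    """Round a malloc request up to the real chunk size, header included."""
    size = (request + SIZE_SZ + ALIGN_MASK) & ~ALIGN_MASK
    return max(size, MIN_CHUNK)


def free_list_kind(request: int) -> str:
    """Which kind of list a freed chunk of this request size lands in (simplified)."""
    if chunk_size(request) <= TCACHE_MAX_CHUNK:
        return "singly linked (tcache)"
    return "doubly linked (unsorted bin)"


for request in (0x10, 0x500):
    print(hex(request), hex(chunk_size(request)), free_list_kind(request))
```

A request of 0x10 becomes a 0x20 chunk, as in the first heap example, and a request of 0x500 is large enough to skip the singly linked list, which is what the use-after-free leak relies on.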
Now let's see a small program to see how we can leak using use after free. Here we have a program similar to the previous one, but here the size of the chunk is 0x500. We are taking input into that chunk, then freeing it, and then printing the contents of the chunk. Let's see what happens. Before freeing chunk one, the data part of chunk one has our input, and the data part of chunk two has nothing, because we have not read anything into it. But after freeing, the data part of chunk one has fd and bk, basically because the size of
the chunk is greater than 0x400, so it goes into a doubly linked list, and if there is only one element in the linked list, then both fd and bk will point to main arena, which is in libc. So we have libc addresses in the chunk. We can also see that the prev size of chunk two has been set; as I said earlier, it will only be set if the previous chunk is free. And since we are going to print the contents of chunk one, we'll be able to get leaks using this. So now, guys, that we have had an overview of primitives and how common vulnerabilities can be chained to form exploitation primitives,
we can very well move forward and introduce FPAnalyze. But first, why FPAnalyze? We play a lot of CTFs, write exploits and have fun during the CTF, but most of the time, when we muster an arbitrary memory write in a challenge, we go for overwriting libc global variables like __malloc_hook and __free_hook. Wait, what are these? You can think of these hooks as containing functions which are called to initialize some stuff when the first call to malloc or free happens. After the first call to malloc or free, the memory of these hooks is nulled out so that reinitialization does not happen in subsequent calls. But if we can write a function which can
give us a shell to those hooks, the next call to malloc or free will give us code execution. In most cases we go for overwriting these hooks with functions which give us a shell immediately. These functions are called one gadgets. Bear with me for the new terms being encountered; it will become clear. Now, the thing with these so-called shell functions, or one gadgets, is that they pose some constraints which have to be satisfied in order to pop a shell. In most cases the constraints get satisfied, but you know, guys, life is not as easy as it seems, and there are cases where the constraints are not satisfied for any of the one gadgets at
all. This is a very frustrating moment for us as CTF players. Now we have to hunt for function pointers which are called internally in libc functions, which is a very boring and tedious task. So why go through all the hassle of searching for function pointers when FPAnalyze comes to the rescue? With FPAnalyze you can find all function pointers which are called during the execution of a program. Perfect, just what we need. Not only this, there are also times when you have an arbitrary memory write but no calls to malloc or free happen, which means there is no option of overwriting the hooks that we have just discussed. But in this case also we can fire up FPAnalyze,
get all the relevant function pointers, overwrite them with the shell functions and get code execution effortlessly. Apart from all this, FPAnalyze can be quite useful when combined with exploit automation tools. It can also be useful in dynamic
analysis. Now we are going to see how FPAnalyze works. The main idea of FPAnalyze is hooking init, which is basically the function that is called at the beginning of a program. Because we are hooking init, we will not be losing any pointers. The next thing that the tool does is parse the shared library's writable memory and find pointers. When it finds a pointer, it stores the pointer and its address in a global array, and then continues with the normal execution of the program. Now we can see the core idea behind FPAnalyze. The idea is to handle the signal SIGSEGV, which is basically a segmentation fault. So what happens
is, when we parse the writable memory, we get the pointers and we overwrite each pointer with its index into the array. So if it is the first pointer, we overwrite it with one; if it is the second pointer, we overwrite it with two; and similarly we overwrite every pointer in writable memory. Since the pointer is overwritten, when the program tries to call it, it is going to call the index, which will basically segfault. FPAnalyze will catch this SIGSEGV and register that pointer as used. It then also restores the memory to the original address, because we have already found this pointer
once. If the program tries to use the same pointer again, we don't want it to segfault again, so we replace it with the original value. Now we will see the internals of the implementation of FPAnalyze. The first thing that the tool does, while hooking init, is get the addresses. We get the addresses from a file called /proc/self/maps, which contains the memory map of the current program. We can open this file and parse the data in it to get the relevant addresses, like the libc base, binary base, libc executable segment, stack and similar addresses. We use simple flags like first libc and first binary to make sure that we are getting the
correct address, and we store these addresses in global variables for later use. After this, we parse the writable memory of both the binary and libc. Since we know the addresses of both, we can go through the entire segment, find each and every pointer, and check whether it lies in an executable region. If it does, there is a possibility that the program might call this pointer, so we store it in the array while simultaneously painting the memory with the index. We can see an example here of what happens while parsing. Before painting, the bss has three addresses, three
pointers, which are the win one, win two and win three function pointers, and after painting, these values have been overwritten with one, two and three, which is basically the index into the array. Then, after parsing the bss, we are going to handle the segfaults. We need a good handler so that we can restore the memory to the original value and the program can continue execution. What happens is that when the program tries to call an index, the RIP value will be corrupted: RIP will hold the index, because if it were a real address, the call would have set RIP to that address, and since it is not, and since it is
our index that we are going to call, RIP will hold the index. We get the RIP value from the context structure passed to the signal handler. We also use the __builtin_frame_address function to get the stack address of the caller function, so that we can get the return address of the function. We need the return address to calculate the offset of the instruction which caused the segfault, which makes things easier for debugging purposes. We can see an example of handling the segfault. On the left side we can see the same memory that we got after the parsing happened: there are one, two and three,
and at this point the RIP value is two, which means the second pointer is the one that was called. After the handler executes, we can see that only that memory has been replaced with the original value, and now if the program tries to call this again there will not be any problem. Inside the segfault handler, we also need to figure out whether the pointer that we have is a libc pointer or a binary pointer. For this, since we have the addresses, basically the limits, so we have the base of the binary and the end of the binary, we can check whether the pointer lies
in libc or the binary, and then print the offset accordingly: if it is a libc pointer, we print the offset from the libc base, and if it is a binary pointer, we print the offset from the binary base. So guys, now that we have had an overview of how FPAnalyze works, we can go for a short live demonstration. But before that, we'll just see how we can use FPAnalyze. Here in this screenshot you can see that we can just do ./run.sh followed by the name of the binary from which you want to extract function pointers. What we do is LD_PRELOAD the FPAnalyze shared object and run the
binary with it. So let us see FPAnalyze in action. You can see that when I run a sample binary with FPAnalyze, if the function pointer is detected inside the binary, which means its bss segment, it prints the offset from the binary base, and if the function pointer is detected inside the libc bss segment, it prints the offset from the libc base. Some instructions are also reported, which you can study: if the calling instruction is found, it gives the instruction offset. If it's from the executable segment of the binary, it gives the offset from the binary base, and if it's in the executable segment of libc, it gives the offset from
the libc base.
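The paint, fault and restore cycle described above, together with the offset reporting, can be modelled without any real signals. The following Python sketch is only a simulation under stated assumptions: memory is a plain dictionary, the "fault" is a function call that receives the painted index in place of RIP, and every address, region and name here is hypothetical. The actual tool does this in C with a SIGSEGV handler inside a preloaded shared object.

```python
from typing import Dict, List, Tuple

# Hypothetical layout, loosely shaped like a non-PIE binary plus a libc mapping.
REGIONS = {
    "binary": (0x400000, 0x405000),
    "libc": (0x7F0000000000, 0x7F0000200000),
}
EXECUTABLE = [(0x401000, 0x402000), (0x7F0000020000, 0x7F0000190000)]

# (slot address, original pointer); the entry at position i is painted as i + 1.
saved: List[Tuple[int, int]] = []


def is_code_pointer(value: int) -> bool:
    """True if the value points into one of the executable ranges."""
    return any(lo <= value < hi for lo, hi in EXECUTABLE)


def paint(memory: Dict[int, int]) -> None:
    """Replace every code pointer in writable memory with its 1-based index."""
    for slot in sorted(memory):
        if is_code_pointer(memory[slot]):
            saved.append((slot, memory[slot]))
            memory[slot] = len(saved)


def on_fault(memory: Dict[int, int], rip: int) -> Tuple[str, int]:
    """Model of the handler: rip holds the painted index of the pointer that was called."""
    slot, original = saved[rip - 1]
    memory[slot] = original            # restore so the next call goes through
    for name, (base, end) in REGIONS.items():
        if base <= slot < end:
            return name, slot - base   # where the used pointer lives, relative to its base
    raise ValueError("slot outside known regions")


# Three function pointers and one plain integer in a pretend bss.
bss = {0x404040: 0x401136, 0x404048: 0x401150, 0x404050: 0x40116A, 0x404058: 0x1234}
paint(bss)
print({hex(k): hex(v) for k, v in bss.items()})
print(on_fault(bss, 2))
```

Only the slot whose index was "called" gets its original value back; the other painted slots stay painted until the program reaches them, which is how the tool can tell used pointers from unused ones.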
Yeah, so now we can go for a short demonstration of how FPAnalyze works. I hope you can see the screen. Here we have a challenge we encountered during HSCTF, which was conducted this year. This challenge was a scenario where we could not use any one gadgets. We tried running a brute force over a range. Don't worry about the exploit itself: we had an arbitrary write primitive, so we could write anything anywhere. So we thought of writing the one gadgets that we discussed before to __malloc_hook. As you can remember, __malloc_hook is the hook that is called whenever a call to malloc happens. So
when we tried running the brute-force script with all the one gadgets, we saw that all the one gadgets had failed. So we were in kind of a dilemma, but then, to save us, we had FPAnalyze. One thing is that since the libc provided was different, we had to use a Docker environment to run FPAnalyze: since we are already preloading a shared object, it is not quite possible to preload libc as well. So now what I'll do is, I'll... yeah, so,
Yeah, so after collecting all the function pointers with FPAnalyze, you can see that FPAnalyze has given us these function pointers. Now what we do is run a brute-force script, trying out all the one gadgets on all the function pointers. If a function pointer works, it will print the function pointer as well.
Yeah, so we run this script, and you can see that in the very first run itself FPAnalyze has given us a shell. We did not expect the function pointers tried in subsequent runs to work, but we were quite surprised to see another hit here. So out of the function pointers we detected, two function pointers gave us a shell. Now we will see how FPAnalyze works even with Linux binaries like the /bin/sh binary and the echo binary. This is just to show how versatile FPAnalyze is. You can see what run.sh contains, and you can see that the sh binary has been copied from /bin/sh. So FPAnalyze works fine on
all of these. Note that the segmentation fault that you are seeing is simply because we have preloaded a shared object. It's not a bug or anything; it's just that, since we have preloaded a shared object, we cannot execute commands like ls or cd. FPAnalyze works well even with the cat binary
and even the echo binary.
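All of the runs above follow the same pattern: start the target with a shared object preloaded through the LD_PRELOAD environment variable. A minimal Python sketch of that pattern follows; the path ./fpanalyze.so is a hypothetical placeholder for wherever the built shared object lives, and the helper names are not part of the tool.

```python
import os
import subprocess
from typing import Dict, List, Tuple


def preload_command(binary: str, shared_object: str) -> Tuple[List[str], Dict[str, str]]:
    """Build argv and environment for running `binary` with `shared_object` preloaded."""
    env = dict(os.environ)
    env["LD_PRELOAD"] = os.path.abspath(shared_object)
    return [binary], env


def run_preloaded(binary: str, shared_object: str) -> int:
    """Run the target under the preloaded object and return its exit status."""
    argv, env = preload_command(binary, shared_object)
    return subprocess.run(argv, env=env).returncode


# Only build the command here; the shared object path is a placeholder.
argv, env = preload_command("/bin/echo", "./fpanalyze.so")
print(argv, env["LD_PRELOAD"])
```

Because LD_PRELOAD lives in the environment, any child process started by the target inherits it as well, which is worth keeping in mind when the target is a shell.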
So now that you have seen how FPAnalyze works, there are some limitations to it, though. For now we support only Linux binaries. Another thing is that we cannot keep track of libraries which are loaded after the call to the init function; this can happen in some cases. For now, FPAnalyze has been tested only on C and C++ binaries. And there are some scenarios where a shared object file cannot be preloaded; in those cases, too, we cannot use FPAnalyze. So let us have a quick rewind of what we have done so far in our talk. Initially we saw how primitives and common vulnerabilities can be chained together to get arbitrary memory read and write.
Then all this paved the way for our introduction to FPAnalyze. We saw how FPAnalyze is useful and had a quick overview of its working. Towards the end we had a demonstration of FPAnalyze and saw a few of its limitations. You can find the FPAnalyze source code; it's already open sourced and I've posted it on Slack. The GitHub link is also here, and you can reach us through our Twitter IDs. So that's it from our side, guys. Thank you so much. If you have any questions, please feel free to ask. Oh yes, we're back, there we go. So Rithwik, Sourag, thank you very much for that. You know, we don't have any
questions yet. Folks, please post your questions in the comments. How long do we have for questions? Well, we've got about 10 minutes for questions, if we can fit them in and if any come up. But in the meantime, you opened with the fact that you're both students at university. Are you in your third year now, or second year? Yeah, both third year. I'm always impressed, frankly, because when I was at university my interests lay in sleeping, drinking and partying, and yet your interests seem to be actually creating tools, going out and doing CTFs, and then doing presentations on it all. I mean, talk about
well, fantastic achievements. It makes me feel somewhat ashamed of what I did 30 years ago. But yeah, I just want to congratulate you. You're part of, as you say, one of the best CTF teams around, and not only that: since we started today talking about community, the contributions you make to the community as a result are just fantastic. So thank you very much indeed; you are helping make the information security community a better place. Thank you so much, sir. And thank you for your presentation as well. I see we still don't have any questions, so, yeah, wow.
Absolutely, ask questions on Slack. I think you were so thorough that you've probably addressed them all anyway. And did you say the tool is available for people to download now? Sorry, I didn't get you. Did you say that the tool is available to download? Yes, yes, it is open source and I've posted the GitHub link in the Slack channel itself. You're taking all my questions away from me. Okay, so, brilliant. Well, there aren't any questions, so we are going to move on. Rithwik, Sourag, thank you very much indeed, have a lovely rest of your day, and here's your virtual applause. Thank you indeed. Thank you, thank you so
much.