Needles Without The Thread: Threadless Process Injection

BSides Cymru Wales · 2023 · 27:50 · 3.0K views · Published 2023-04

Speakers

Ceri Coburn

Tags

Category: Technical

Team: Red

Style: Talk

Mentioned in this talk

Tools used

Beacon

Frameworks

Ghostpack

Transcript [en]

[inaudible] BSides. So, a little bit about myself first before we get started. I was a software developer for 18 years within the DRM and security solution space, and I joined the infosec game back in August 2019. I've been dedicated to red teaming and offensive security tooling for the last two, two and a half years, and have released some tools to the community, namely tools like BOF.NET, SweetPotato and SharpBlock, and more recently I have become a member of the Rubeus maintainer team. So what will we cover today in our talk? What is process injection, and why do we leverage it as offensive security operators? What methods are typically used

today? How is process injection detected? I'll give a quick primer on function hooking. Now, you might be wondering why function hooking is in a process injection talk, but hopefully once we get onto the technique it'll make a little more sense. Then we'll go on to the process injection technique itself and, demo gods allowing, I'll do a live demo, and hopefully it'll all go well. Then detection results: I had a pool of maybe six or seven EDRs that I was able to test against. And then perhaps some improvements that could be made to the PoC that I'll release after the talk, where perhaps it'll be improved even further. And if we've got

time, a quick Q&A. So what is process injection? Process injection is the action of injecting code (it doesn't necessarily mean malicious, but generally malicious code) into a foreign process. And as offensive security operators, why do we want to do this? What I call the three Bs. The first one is breaking execution chains. What I mean by that is an execution chain is typically the child-parent relationship between processes and how they execute. DFIR teams or blue teams will often use that execution chain within EDR portals to track what a particular process has been doing. So if you can inject into a foreign process in a way that is

not detected, it generally means you're breaking that execution chain, making it more difficult to see what malicious activity that particular process was doing. Another one of the Bs is blending malicious activity. Let's say, for example, your C2 is communicating over HTTP; then it's a good idea to inject your shellcode or your C2 agent into browser processes, Edge or Chrome, so that the EDR is tracking domains and URLs that are from a browser process. It tends to blend in better than, let's say, some random executable you've just downloaded from the web that is all of a sudden making calls out to some random domains. And then the final reason is

bypassing security controls. A good example here would be port 88 and Kerberos traffic. On a Windows platform that is typically done from the LSASS process and can now and again originate from browsers as well. So EDRs and SIEMs will have rules that look for suspicious Kerberos traffic based on the originating process. If you were to use Rubeus, for example, from a process that is neither a browser nor LSASS, that is typically raised as an alert. So it's a good idea to inject yourself into maybe a browser process and then run Rubeus from memory to bypass that type of control. So how is it done?

Typically it's broken into three steps. You have the executable memory allocation, or the allocation primitive: this is the piece where, from a malicious process, you allocate memory in the foreign process so that you can write your malicious code. Then we move on to the second primitive, and that's the writing of the actual code. And then the all-important third primitive is the execution of the code itself. So what techniques exist for the various primitives that we need to perform? On the allocation side there are API calls like VirtualAllocEx; this is what we would say is your vanilla way of allocating memory within the

foreign process, and it's quite often used by a lot of injection tools. NtMapViewOfSection is another API call; this is typically used when mapping DLLs or EXEs in memory. You can also have them be anonymous, but it's just another way of allocating memory within that foreign process. Code caves, then: it's not an allocation primitive as such, because a code cave by its very nature already exists within that foreign process. A good example of a code cave is the bytes in between various functions. A function might align on a 16-byte boundary, for example, so you may have space for injecting code in

there. The disadvantage of code caves, of course, is the space is very small, so injecting your 300K Beacon into it is just not going to work. Then we move on to write primitives. The vanilla API used in this process is WriteProcessMemory. With API calls like NtMapViewOfSection there's no cross-process write as such: you can use the NtMapViewOfSection API to get a block of memory within your injector process, and then any action you do within that gets automatically mirrored within the destination process, or the target process. Atom bombing is a technique using Windows global atoms, and then ghostwriting is a technique

where you're manipulating threads to read from your injector process as opposed to writing into it. But all these techniques do involve various interactions from your injector process to your target process. And then we have the all-important execution primitive, which is the piece that actually executes the shellcode that you've written into that foreign process. The vanilla API in this case is CreateRemoteThread. In almost every example out there on the web you will see those three in order, and you tend to see CreateRemoteThread as that last execution primitive. NtQueueApcThread is a method of hijacking, if you will, an

already existing thread, but it has to be in a particular state for you to do that: it has to be in an alertable state, so it has to be waiting on something. So you can't always use it if you can't find a thread that's in that particular state. Thread manipulation: that's where you target an already existing thread in a remote process, you suspend it, set the thread context, which usually involves setting the instruction pointer to that of the memory that you've injected, and then resume that thread once again, which will then trigger the execution. ROP chains via stack bombing: using the same trio of APIs as thread manipulation above, you're using those

APIs, but instead of setting the instruction pointer you are manipulating the stack and writing to the stack, which will then potentially trigger ROP chains. But all of these actions involve an interaction from the injector process to the target process. I haven't gone into massive detail; I've just given you a brief overview of the various primitives, and that's because SafeBreach's talk from 2019, "Process Injection Techniques - Gotta Catch Them All", is well worth having a look at. They go into great detail on all the various execution, allocation and writing primitives for process injection. So, detection: how do EDRs commonly detect these things? Quite often they will look at the

three primitives in succession: you have your allocation, your write and your execute primitive. When it sees them, especially when they are grouped together in a short space of time, it's a good indication that you're injecting into a foreign process. The challenge that EDR vendors have is you also have legitimate activity that does this, especially allocation and writing primitives, because a lot of processes will use those even for inter-process communication. So it doesn't necessarily mean the action is malicious. But the all-important execution primitive, on the other hand, does trigger the actual execution. So what

happens if we remove that last cross-process execution primitive? What results do we get in terms of detection? Before we get onto that, I'll give you a quick primer on function hooking and what exactly that is. Function hooking is the art, if you will, of redirecting the behavior of a function. On the left there you'll see an example of an unhooked function: you have, say, notepad.exe that makes an API call to CreateEventW, which lives inside kernelbase.dll, or sometimes kernel32 depending on the operating system, and then the execution flow will follow through CreateEventW and then return back to Notepad

once done. Now, when it comes to hooking, the first instruction within that CreateEventW function is modified, so it's patched. A jump instruction is typically used, and that will jump to another piece of code. That piece of code can then perhaps modify the arguments to CreateEventW in a way that is to the attacker's advantage, or perhaps it won't even make a call to the original function at all. A good example of this is an AMSI bypass technique where, instead of calling the original function, it just returns true to say that the payload is clean. So, onto the actual technique itself. On the right-hand side

there, that's the typical memory layout of a process within Windows. At the top you have the memory that is used for the heap, the stack, that kind of thing, and then your various DLLs are spread along towards the bottom of the memory space. Now, in between each DLL, when Windows allocates a DLL, it will leave a gap. It's not a memory hole in the sense of a code cave, so you can't use these memory holes for injecting instantly, because it's unallocated memory, but they do leave gaps between one DLL and the next, and that becomes quite important with the actual technique. And the reason

being is we want to put a hook plus the shellcode that we want to execute just above or just below the address space of the DLL that we're looking to inject into. The reason is we end up putting a call instruction inside a particular export. I gave an example of CreateEventW just now, but it can be any export. A relative function call uses up five bytes of memory, and because it's relative you can only jump or call within plus or minus 2 GB of where you're actually making the call from. So that's why it's

important to find unallocated memory within 2 GB above or below where your patch is going to be. And then this is the all-important factor: once you've put your hook in place, followed by the shellcode, and you've patched the exported function, you're now waiting for legitimate process activity to call your API. So there's no third execution primitive, if you will; you're just relying on the behavior of the program to execute your shellcode. When that actually happens, the injector is not doing any form of action cross-process; you're just waiting for that particular API to be

called. So what does the hook code look like? Stage one, which you can see right at the very top there on the top right, is hijacking the actual execution flow itself. Typical hooks will use a jump instruction for that first patch, but for this particular technique we want to use a call instruction instead. The reason is that on a function call the return address is pushed onto the stack. We pop that off immediately as the first instruction, and within our RAX register we'll get the return address of where the call was just made from. That all-important five-byte patch length is

then subtracted from that, so by the time you subtract the five bytes we've calculated the address that we originally hooked. Then you're moving on to saving the original volatile registers. On the x64 architecture, RCX, RDX, R8 and R9 are used for passing arguments into functions, so these would be whatever Notepad was using to call CreateEventW, for example. If we don't save these for later, you're likely to crash the process that you're targeting. Next stage: restoration. Where you see that hex value there, which is not particularly

clear because of a bad choice of red, there's 1122334455667788. That is generated dynamically, so that's the only part of the shellcode that's dynamic, depending on which export you're actually targeting, and that's just the 8 bytes that existed where that patch was. Even though the patch is only 5 bytes, because it's a 64-bit value and we can do that copy in one operation, we actually take whatever 8 bytes were there prior to patching CreateEventW, for example. What the restoration is doing is essentially putting that back. So by the time you get to this red area here,

CreateEventW is no longer hooked, so any future calls to CreateEventW at that point will continue to execute as normal. And then it's the all-important setup for actually calling the shellcode that you're injecting. The shellcode itself immediately follows this assembly blob, if you will, so it's a relative function call, but you're talking maybe 10 bytes after this particular hook code to call your calc payload or your C2 Beacon, whatever it may be. And then once that call returns, your shellcode has executed and done its merry thing. So we're now starting the reverse of what

we did at the start. We're restoring all the volatile registers, so this puts it back in a state where you've got the original arguments; these were the arguments that were originally used to call the CreateEventW function where Notepad made that call. And then finally you're restoring the calculated hook address so that you can jump back to that exported function. It means the original intended behavior of that program will still continue as expected. You don't want your program to crash once your shellcode has executed, so these last few stages are there to restore that behavior and make sure that the CreateEventW API call

succeeds as it would have even if you hadn't patched it. Okay, so which DLL exports do you target to patch? The main ones should be ntdll, kernelbase and kernel32. The main reason is that the address they exist at in every process on Windows is the same. Now, they do change from one OS boot to another, but on a single boot of Windows, kernel32, ntdll and kernelbase will all be at the same base address within every process. It means you can calculate the address of, say, CreateEventW, which is the example I've been giving: whatever address that is at in your injector process, it'll be at the

same address in your target process. But that won't be true of all DLLs. Then you can use tools such as API Monitor to target a particular process that you want to target within your actual environment. You typically do this away from your environment, within your VM: you start up Chrome and API Monitor and just have a look at what APIs it's calling and which ones might work out as good targets. But definitely avoid really busy API calls: heap allocations, anything to do with closing handles, anything to do with synchronization like RtlEnterCriticalSection.

They're all really busy API calls. If I actually go back here: up until the point where you get to the restore phase, you've essentially got yourself a race condition. If you've got, say, four or five threads making a call to CreateEventW all at the same time, then there's a good chance that your shellcode will execute multiple times. Now, do that with one of the really busy API calls and you're going to end up with a thousand beacons checking in, which you don't really want. So target APIs that are not called that often, but called enough that you are

going to get execution. Now, the other advantage of this particular technique is you can target specific APIs, so you're essentially time-bombing your shellcode execution. I've got TerminateProcess there, but really ExitProcess would be a better target. Or you could inject into Explorer, for example, and target the ExitProcess API, so then you could trigger your shellcode to execute when the operating system is shutting down, for example. Or the Winsock APIs: the connect function that's used when making a TCP connection, so potentially you might want your shellcode to execute when that connect API call is made. So it's a good way of actually

time-bombing the execution as well. Okay, the all-important demo time.
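The arithmetic behind the five-byte call patch described before the demo can be sketched as follows. This is a minimal Python illustration of the address maths only: the rel32 displacement, the plus or minus 2 GB range check, and recovering the hooked address from the return address. It is not the PoC's code, it touches no process memory, and the addresses and helper names (`encode_call_rel32`, `hooked_address_from_return`) are made up for illustration.

```python
import struct

CALL_REL32_LEN = 5                      # 0xE8 opcode + 4-byte signed displacement
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def encode_call_rel32(patch_addr: int, hook_addr: int) -> bytes:
    """Encode an x86-64 `call rel32` placed at patch_addr that targets hook_addr."""
    # The displacement is measured from the instruction that follows the call.
    disp = hook_addr - (patch_addr + CALL_REL32_LEN)
    if not INT32_MIN <= disp <= INT32_MAX:
        raise ValueError("hook is outside rel32 range (roughly +/- 2 GB)")
    return b"\xE8" + struct.pack("<i", disp)


def hooked_address_from_return(return_addr: int) -> int:
    """The call pushes the address of the next instruction; subtracting the
    patch length gives back the address that was patched."""
    return return_addr - CALL_REL32_LEN


# Hypothetical addresses, chosen only to make the numbers easy to follow.
export_addr = 0x7FFB_1234_5000          # where the export begins
hook_addr = export_addr - 0x10000       # a gap just below the DLL

patch = encode_call_rel32(export_addr, hook_addr)
print(patch.hex())                      # e8fbfffeff

return_addr = export_addr + CALL_REL32_LEN
print(hex(hooked_address_from_return(return_addr)))
```

The range check is the reason the hook has to live in a gap near the target DLL: a destination further than a signed 32-bit displacement away cannot be reached with a five-byte relative call.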

Right, so the tool itself I'll release after this talk. It's only a PoC tool. It takes the shellcode, if you want to inject specific shellcode; if you don't provide one, it will pop a calc shellcode. You specify the PID that you want to inject into, the DLL, and then the export you want to target. I've set up Notepad and Chrome as examples here. I've already stored the PIDs of these just so that I can make the demo quicker. So here you can see I've chosen kernelbase.dll and then CreateEventW. Oh, good spot. I'm actually doing this on my own [unclear]; I need to do that. Yeah, I have

to put it down because I can't do both.

Thank you. Right, here we go. Let me see it; look over here. Right, so the PoC itself: you've basically got a few parameters to it. One is the shellcode; if you don't provide the shellcode parameter it pops calc. Then the process ID that you want to target, the DLL to target, and the export. Where's my Chrome gone now? God,

no, there it is. Right, okay. So, like the example I gave in the actual slides: kernelbase.dll, injecting into CreateEventW. Right, you press enter, that does its thing. I know when Chrome has focus it calls that

a no now

Yeah. So for Notepad I've chosen "map f x" [unclear]. I hope that's the one; I can type in there, and I know that API call doesn't really get called during that, but it should get called when I try to, say, bring up the save dialog. Go. [Applause] Okay, onto results. Let me get the presentation

back. So these are the ones that I've managed to source through various friends around the world to test against. As you can see, mostly across the board: not blocked, not detected. The ones where there is a hyphen in the block state just means it's not relevant for that particular EDR. The only one that did detect activity was CylanceOPTICS. That was set in an aggressive mode, and what it actually detected was the first two primitives: it detected the allocation and the write, but didn't necessarily detect the execution. If any of you are running CylanceOPTICS in aggressive mode I'd love to hear from you to see how

effective that actually is in an enterprise environment, because I'd imagine you'd also get a lot of false positives with it just triggering on those first two primitives. So, possible improvements: how can the PoC be improved a bit further? The PoC itself uses the most vanilla form of allocation and write primitives, so perhaps using more covert allocation and write primitives might also prevent the EDRs that were alerting from doing so. Patchless hooking techniques: it's a similar technique to the one used in SharpBlock, where you attach yourself as a debugger and then, instead of patching the export, maybe use

hardware breakpoints instead. That will also eliminate one of the cross-process writes, so there's another improvement that could potentially be made. At the moment the PoC is using RWX, so it changes the memory protection on the export address itself, and that's because the hook code has to restore the original bytes back to what they were. The hook code could be made more complex so that it also changes the protection of the export on the fly: you would end up making a call to change it to RWX, patch it, and then change it back. So a possible improvement there. Support more DLLs:

the PoC itself, like I said, will only target DLLs that are traditionally already loaded in all processes, so kernel32, kernelbase, ntdll, and it calculates the address based on that. But using that technique for all DLLs is not going to work, because they're potentially at different base addresses. So there's an improvement that could potentially be made there, where it remotely enumerates the loaded modules in your target process. And then, last but not least, the technique in the PoC executes on the thread that actually triggers the execution. So if you're injecting, let's say, a C2 Beacon that internally doesn't create a thread, it means your shellcode will never

return until your Beacon has exited, and that's likely to lock up, say, Notepad, which is the example we used. So, modifying that hook code so that it creates a thread internally within the process, so not cross-process from the injector process, but using the actual hook code to create the thread for your shellcode to execute on. By implementing that, it's easier to get your C2 executing alongside the original program behavior. And that's it. Any

questions? [Audience question] Well, it's all to do with permissions, right? If it's an initial access payload and you're a low-privileged user, you're only going to be able to inject into processes of that same user. But if you've already privilege-escalated to SYSTEM, unless they're protected processes, you should be able to inject into anything. Anything else? Nope? Great, thanks for listening. [Applause]
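The detection logic described in the talk (allocation, write and execute primitives seen in succession within a short space of time) can be illustrated with a small defensive sketch. This is only an assumption about how such a correlation rule might look, not any vendor's implementation; the `Event` record, the `flag_pairs` helper, the PIDs and the five-second window are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    time: float        # seconds since some reference point
    source_pid: int    # process performing the action
    target_pid: int    # process being acted upon
    kind: str          # "alloc", "write" or "execute"


def flag_pairs(events, window=5.0, required=("alloc", "write", "execute")):
    """Return (source, target) pairs where the required cross-process
    primitives occur in order within `window` seconds of the first one."""
    by_pair = {}
    for ev in sorted(events, key=lambda e: e.time):
        if ev.source_pid != ev.target_pid:          # only cross-process activity
            by_pair.setdefault((ev.source_pid, ev.target_pid), []).append(ev)

    flagged = set()
    for pair, evs in by_pair.items():
        for i, first in enumerate(evs):
            if first.kind != required[0]:
                continue
            idx = 1                                  # next primitive we expect
            for ev in evs[i + 1:]:
                if idx == len(required) or ev.time - first.time > window:
                    break
                if ev.kind == required[idx]:
                    idx += 1
            if idx == len(required):
                flagged.add(pair)
                break
    return flagged


# Classic injection: all three primitives come from the injector.
classic = [Event(0.0, 100, 200, "alloc"),
           Event(0.1, 100, 200, "write"),
           Event(0.2, 100, 200, "execute")]
# Threadless injection: no cross-process execute event is ever generated.
threadless = [Event(0.0, 300, 400, "alloc"),
              Event(0.1, 300, 400, "write")]

print(flag_pairs(classic + threadless))
print(flag_pairs(classic + threadless, required=("alloc", "write")))
```

With the default rule only the classic pair is flagged. Requiring just the first two primitives also catches the threadless pair, which mirrors the aggressive-mode result in the talk, at the cost of flagging legitimate allocate-and-write activity such as inter-process communication.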
