Writing Nim-less Nim - Tyler Randolph

BSides KC · 27:15 · 900 views · Published 2024-05
Category: Technical
Style: Talk
About this talk
Nim is a statically typed, compiled systems programming language. Its creators describe it as efficient, expressive, and elegant. In offensive security, deploying Nim binaries is challenging because the Nim runtime is heavily flagged by AV. Removing the Nim runtime is one way to avoid detection.
Transcript [en]

So if anyone is brave enough to scan the code at a lecture that's about malware, congrats. This is just a link to the repo and the slides. There's a lot of small text, so it might be good to look at them later. There are also some extra examples in the slides. Today we'll be talking about writing Nim-less Nim, which is using Nim as a language to write offensive tools by stripping away the Nim runtime. So first, who am I? I do VR and hacker-man work at [unclear], out of the DC metro area. I'm a member of the US Cyber Games, a Season One athlete, and a couple of years in as a tech mentor; average

CTF, maldev, and cat enjoyer, as you will see. So what is the Nim runtime? Nim is defined by its creators as an efficient, expressive, and elegant language. It compiles down to C, C++, JavaScript, and a few other languages, but essentially it has very readable syntax like Python and is pretty straightforward to understand. It's also got a really cool type system, since it's statically typed, and it has what are called macros and templates; those are really useful and fun to mess with. The Nim runtime comes with safety checks, memory allocation, and garbage collection, so we don't get index out of bounds on buffers, and we don't get underflows or overflows. That's

strictly on the runtime. So if we have memory safety and checks, why would we want to remove it? A lot of blue teams have flagged the Nim runtime; it's very easy to notice when a Nim binary is a Nim binary. If we remove the runtime, we can also get a Nim program that's kilobytes in size, anywhere around 3 to 6, sometimes more depending on how you compile it. This is good for stage-zero payloads and loaders. You can also leverage this for position-independent shellcode, and as an added benefit it's useful for maldev. A basic program would look like this. This is just a really simple printf; it

prints out "hello BSides 420" on the command line. We're compiling this with -d:release, which adds the ORC memory manager, optimizes for speed, and doesn't include as many debug symbols. When we disassemble this, the C runtime is going to call NimMain and NimMainModule. You also see PreMain in there, called in between those, depending on how it's compiled and what libraries you're using. Inside PreMain, this is setting up the runtime. The most interesting call is this last one: if we follow it and look inside, we see it call a lot of these nimLoadLibrary and nimGetProcAddr functions. These are just wrappers for LoadLibraryA and GetProcAddress.

What this results in is the IAT, the import address table, always having GetProcAddress and GetModuleHandleA. A lot of malware, if it's dynamically linked, will show this, and it's kind of a flag that something suspicious is going on. Going back and looking at this, we see the call into NimMainModule, which is where it calls our main function. On the right it looks very ugly. This is with all the runtime: we're having overflow checks, we're having kind of a pseudo pointer table, these

allocations, deallocations, assertions, just a whole bunch of stuff that's a little nasty to reverse. So let's look at this as if we were to write the program without the runtime. In this sort of programming style, it becomes a little bit more C-like: we're managing memory by ourselves, and we have to manage threads by ourselves since we don't have a runtime to do that for us. We also have a limited set of types. Nim has various types that get allocated on the heap, such as reference objects, so all of our objects have to be on the stack. We also lose a lot of functionality with the string class, or string module, where,

since that gets dynamically allocated on the heap, we can't add or remove, so we have to manage all that ourselves. It might be a limitation, but it's a drastically different way of writing Nim. Some information on what I used for this: I'm using the Nim compiler 2.0.2 and GCC 11.1.0. This GCC is the stock one that comes with Nim when you install it, but we can swap it out for different GCCs. We're going to use a Nim config that's slightly modified from the Bitmancer repo. This is used to facilitate stripping away the Nim runtime as well as the C runtime. So if we

look at it, the first thing we do is import winim. Winim is a Nim module that is used to interface with the Windows API, so we're going to use it for the heavy lifting of typing and also for dynamic linking of the imports. Next we have a pseudo-printf, which is a template. It takes varargs as an argument, so that way you can do a wsprintfA into a buffer, then call WriteConsole with the standard output handle to write that buffer to the output, and then LocalFree the allocation that we made for that buffer.

And then in main we have the pseudo-printf that we have just created, and it prints out "BSides 420". In the config, what we've done is pass flags to the Nim compiler, which passes them on to the linker, to let it know that we are dynamically linking kernel32 and user32. In the disassembly, this starts looking really basic. Our start just calls main and doesn't do anything else; when I say it calls main, it just jumps to main. Then our main function looks as if the template has been implanted straight into the code. It's not called; that's how templates work. I like to think of it as the #define

macro in C. We can look at this call to LocalAlloc and see that it's referencing a jump to where LocalAlloc is in the IAT, and we can see the same with all the others. Another thing I'd like to point out is this reference to a Nim string. I mentioned that we have limited functionality with Nim strings. We do get Nim strings with this, but they get added into the read-only data section, so that can be an indicator for this type of binary. At the end it just returns, and it's going to return right into a start function that has only

jumped to main, so we're not going to exit cleanly; the way to exit would be to just call ExitProcess. So what do we get from this? We get an IAT that looks much more stripped. We see only the functions that we called, plus a couple that the C runtime has included, such as EnterCriticalSection, and then we see our one call from user32, which is the wsprintf call. This has compiled into a binary of 3 kilobytes in size if we strip it. The sample program we can look at next is the self-deleting program. This uses code that was published by Jonas Lyk and LloydLabs. It's pretty much the

same self-deleting program as in C. We're going to write it in such a way that we don't use the IAT at all. In order to do this, we're going to have to write a custom GetModuleHandle and GetProcAddress. Over here we have some generic templates that we're going to use to facilitate this. We have a utils folder, which contains all of the utilities that we'll use in this runtime-less Nim. We have gmh, which is the GetModuleHandle replacement. We have goto, which is goto functionality, since we do not have that in Nim. GPA is the GetProcAddress replacement. Then we have

hash, which is just going to be hashing functions. Our custom replacements are going to find the functions by hashes, and it's done in such a way that you can put in any hashing format or algorithm that you want. Then we have stack, which does stack string allocations so we can have strings that avoid being written to read-only data. We have stdio, which is strictly used for debugging, so we can see our output and see what we're doing. And then there is the core of our program, delete self. On to the module handle: this is the implementation in Nim. I'm not really going to go too deep into

it, but we read from the PEB, from there get the loader data, and parse over the list entry table. Then we compare whether the hash we're passing in is the same as the hash of the string from the list entry table. If it is, we return that address; if it's not, we return null. So the malware is going to have to handle all the validation of this to make sure it's getting back a valid address. It's a little ugly. How do we use this? The first way of using it is by declaring the variable hKernel32 and calling your custom GetModuleHandle.

We pass in a string that we cast as a cstring, and then we statically hash that. This happens at compile time, so that kernel32 string is not going to be introduced in the binary at all. If we allocate it in the global scope, then hKernel32 is going to be a variable in the .bss. If we do it in a function scope, it's going to be allocated on the stack, which is another way of avoiding a data section. The echo part was done while testing it with the Nim runtime; we do not have access to echo when writing in

Nim this way. Here we have a template, gmh, that is just a wrapper for that. You pass in a C or Nim string, it casts it to a cstring, calls the static hasher, and wraps it all up for you. Now we have the custom GetProcAddress. This is a pretty generic C-style implementation: it parses the DOS and then the NT headers, and the functions are in an ordinal array, so we can get that. Then, the same way as with the module handle, we're calling a hasher on the function name that's from the table and finding the

value. Again, this way of using it is a little bit more ugly. In order to cast it as the correct function that we can call later on, we have to declare what the function type is. This declares the type, the arguments passed into the function with their types, and what it returns, and we need to specify stdcall so it uses the Windows calling convention. From there we can declare a variable hKernel32, because GetProcAddress needs a handle passed in as the first argument. Then we call our GetProcAddress with the handle and the string we're looking for. We're doing the same thing with

static, so we keep that allocation from showing up and avoid data section usage. Then we can call directly on our function. A wrapper for this looks very simple and straightforward: with GPA we can pass in a handle, a Nim string, and a T. This T is a generic, so we can pass in anything for it, and what it does is cast to that type. If we declare the type ourselves we can pass that in, or if the type is already declared in the library, we can rely on winim to do all the typing for us. That way we don't have to declare it, which cleans up the code a little bit more. We can see we

have quite a few lines in the top one, and in the bottom one we just have one. Now we can resolve it and call the LocalAlloc function. We're also going to clean up the start of the program. Instead of just jumping to main, we are going to declare our own start function, and we're going to use inline assembly. In the assembly we adjust the stack to make sure it's 16-byte aligned and then call main. We can see that if we inline it, what we write goes directly into the binary. One of the pragmas that I'm passing in to start is codegenDecl, with an attribute declaring its section. What this is going to do, like this,

if nothing else is declared in a section in .text, is put it at the top of the text section, and you'll see why that's done in a second. Our main function is going to call deleteSelf, and if it's true return one, else zero. deleteSelf is that implementation of the binary deleting itself, using the GetModuleHandle and GetProcAddress replacements. When you run it, it pops up a console. Well, we can just pass in subsystem windows to the linker and prevent the console from popping up. So now we have this self-deleting program, and we can pass the arg to strip it so there are no symbols. We're left with a binary 3 kilobytes in

size, like our first example. Looking into the disassembly and decompilation, we see that there's a data section that's not used, so we can just pass in a custom linker script to say we don't need the .idata or the data, and it compiles to a program that's 2 kilobytes in size. There is no data section; it's just the Windows header and then the .text section. Since we declared that start function to be in .text, it's now at the beginning of the binary. So what we can do is extract just the .text section of the binary, and this leaves us with shellcode. It's position independent; we're not

linking any libraries; we're resolving all handles and functions within the code, so we've essentially just written shellcode. A quick addition to GPA: while playing around with this stuff, HeapAlloc didn't work. The reason is that HeapAlloc is a forwarded function in kernel32, so we can add handling for forwarded functions into our GetProcAddress. Since we've stripped away the C runtime, we don't have access to a lot of C utilities, such as the stuff in the string header, so we have to create our own strlenA. It's pretty easy to do, and we can throw it in our function and call it.
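The forwarder handling and the hand-written strlen mentioned above can be modelled in a few lines. This is a minimal Python sketch, not the talk's Nim code: the `MODULES` dictionary is a hypothetical stand-in for real export tables, the addresses are made up, and `djb2` is only an assumed choice of hash (the talk says any algorithm can be plugged in). The forwarder string follows the `MODULE.Function` format that Windows uses for forwarded exports such as HeapAlloc.

```python
from typing import Optional

def djb2(name: bytes) -> int:
    """Classic djb2 string hash, truncated to 32 bits (an assumed choice of hash)."""
    h = 5381
    for b in name:
        h = ((h << 5) + h + b) & 0xFFFFFFFF
    return h

def c_strlen(buf: bytes, start: int = 0) -> int:
    """Count bytes up to the first NUL, like a hand-rolled strlen.
    Assumes the buffer really is NUL-terminated."""
    n = 0
    while buf[start + n] != 0:
        n += 1
    return n

# Hypothetical export tables: a value is either a fake address (int)
# or a forwarder string of the form "MODULE.Function".
MODULES = {
    "kernel32": {
        b"LocalAlloc": 0x1000,
        b"HeapAlloc": "NTDLL.RtlAllocateHeap",  # forwarded export
    },
    "ntdll": {
        b"RtlAllocateHeap": 0x2000,
    },
}

def find_export(module: str, wanted_hash: int) -> Optional[int]:
    """Look up an export by the hash of its name, following forwarders."""
    for name, target in MODULES.get(module, {}).items():
        if djb2(name) != wanted_hash:
            continue
        if isinstance(target, str):
            # Forwarder: split "NTDLL.RtlAllocateHeap" and resolve it in the other module.
            dll, _, func = target.partition(".")
            return find_export(dll.lower(), djb2(func.encode()))
        return target
    return None  # the caller must check for a failed lookup

print(hex(find_export("kernel32", djb2(b"HeapAlloc"))))
print(c_strlen(b"HeapAlloc\x00ignored"))
```

Without the forwarder branch, the lookup for HeapAlloc would hand back a string where the caller expects a function address, which mirrors the failure the speaker describes.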

So we have shellcode; what can we do now? We can write a self-injecting loader using direct syscalls. This is the start function. Sometimes we have a stack string that's going to be declared; in a self-injecting loader, for example, we need to download shellcode from a URL. One simple implementation of hiding that a little bit (not really) is just using a rolling XOR key to encode it. Then we declare an XOR stack string function, which is going to take generics I and J. These are just the lengths of the arrays being fed in, so that way we can pass in any arbitrary

known-length string, a string for the URL and a key for the key, and iterate over it. We declare it with the inline pragma, so the Nim compiler is going to inline this if possible; if not, it's going to call it as a function. It's kind of left to what the Nim compiler wants to do. We're trying to bend it as much as possible, but at the end of the day we're stuck with what we have unless we modify the compiler itself. With that, we can just use some simple Python to do the encoding for us and then call the function. This is almost easy-level encryption, and FLOSS will find this, but we can use our imagination to do other things.
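The "simple Python" encoding step could look something like the sketch below. It is an illustration under assumptions, not the speaker's script: the function names, the example URL, and the key are hypothetical, and the Nim array rendering is just one plausible way to paste the result into source.

```python
def rolling_xor(data: bytes, key: bytes) -> bytes:
    """XOR each byte with the key byte at the same index modulo the key length.
    Applying it twice with the same key returns the original data."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def as_nim_array(name: str, data: bytes) -> str:
    """Render bytes as a Nim byte-array declaration (hypothetical output format)."""
    body = ", ".join(f"0x{b:02x}'u8" for b in data)
    return f"var {name}: array[{len(data)}, byte] = [{body}]"

url = b"http://example.com/file.bin"  # placeholder URL
key = b"k3y"                           # placeholder key

encoded = rolling_xor(url, key)
assert rolling_xor(encoded, key) == url  # decoding is the same operation

print(as_nim_array("encUrl", encoded))
print(as_nim_array("xorKey", key))
```

Because the array lengths are part of the generated declarations, a decoder that is generic over the two lengths, as described in the talk, can accept any URL and key produced this way.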

So what we're going to use is Hell's Gate to make direct syscalls. I'm not going to go too much into detail about how it works, besides the implementation specific to Nim. getPayloadFromUrlA is a wrapper to just download the payload and store it. The local shellcode injection is going to use Hell's Gate to do the direct syscalls. The very first thing is we need to allocate a VX table. This VX table is going to live on the stack, as I described earlier. It contains VX table entries for the syscalls that we need to call. Those entries have an

address, a hash, and a syscall word. That syscall word is the value, the SSN, that is used when we execute the syscall instruction. From there we can call those syscalls by calling the function HellsGate with a syscall passed in. This sets up a global in the data section to tell which syscall we're at and which one is being called next, and then with HellDescent we can pass in arguments and call that syscall. As described, this is a call for NtAllocateVirtualMemory. This is what it looks like within the stubs. These stubs are HellsGate and HellDescent. HellsGate essentially just passes in the syscall word and sets

it to that global syscall entry. If we look at it in the function, all it's doing is moving a word into a pointer in the data section. We don't really need a function call for that, so we can just add inline to inline it; that's an example of how we can optimize what we're writing. HellDescent is a cool way of doing this. We give it the varargs pragma, so it takes any number of arguments that we pass in. We also declare the first arg to be auto, so that way we pass in one arg of any type and then the varargs, so all of our arguments get

passed in. We don't have to redeclare a new function for every syscall that we're creating. It's a simple wrapper around how syscalls work: it moves rcx into r10 and executes syscall. So that looked easy; what was the catch? Well, if you compile it as is, it just fails to execute. This is partially because of how the compilation is happening: we're getting something like an overflow from trying to compile the program into something in the 64-bit address space that exceeds it for whatever reason. But for weird reasons, and I don't know why, if we pass in zero

for the image base and try to base the image at zero, we get it to compile perfectly fine, and this leaves the program 6 kilobytes in size. We can do some obfuscation, since this binary looks pretty simple and straightforward. We can add anti-debugging into our Hell's Gate by just adding a new syscall entry to the VX table. Then in Hell's Gate, where we define what we're calling, we just do the same thing as with the other ones: we grab the hash and store it into that VX table entry, statically hashing the string at compile time, and then we call getVxTableEntry, which will populate that VX table.

Then we can add a call to [unclear], which will check for a debugger once at run time. So what else can we do with obfuscation? If we run capa on the binary, we can see what capabilities it finds: we see XOR encryption being used, the PEB loader data being accessed, and PE headers being parsed. The first one we can tackle is the XOR obfuscation. [unclear] has a good article on string obfuscation, and using a simple trick you can add a jump into the for loop of our XOR stack string, and for whatever reason this breaks capa's ability to find XOR

encryption. We can adjust how we access the PEB to trick static analysis of the PEB access. Instead of reading directly from qword ptr gs:[0x60], we can move hex 10 into a register and then multiply it by six, which gives us hex 60 to reference, and that removes capa's ability to find it. The last one, going back for a second, is parsing the PE header. This one is a little bit more tricky. What we do is create a template that will check if a value passed in is zero, and if it is, it's going to jump to a failure, and then

we just spam this all over a function. It kind of messes up the control flow graph. We define failure, so we inline the label where it would jump to and then give our failure return result. The issue with this, because of the way it's being compiled, is that we can only define this template once per Nim file, but you can do it multiple times in other Nim files. This is the example from GPA. I did the same with Hell's Gate, since they're both parsing the PE. With Hell's Gate I had something like a "test or fail one" and a "test or fail two", so that way I could do more with two different labels. The same methodology was

applied to Hell's Gate; it was just a little bit more gross, since it's doing a little bit more than a simple GPA. This brings us to a program that is about 5.5 kilobytes in size, with no capabilities found by capa. Trying various optimization levels can give different results. For me, I was using size; no optimization, optimization one, and optimization two each affect the binary differently. A quick demo on this: we also have a reverse shell in the code. Quickly, this is going to create a PowerShell reverse shell with CreateProcess and then call deleteSelf, and that's going to be the shell

code. Of course, it's not very OPSEC safe. Then we're going to use that new obfuscated loader as our shellcode loader. This is going to be on a fresh install of Windows 10 with Defender enabled. We update our stack strings, since now we have a new IP, and then we carve out the shellcode with [unclear] and compile the program. So this is the demo. We see Defender is open. We'll grab the shellcode from this terminal on the top left, which is just hosting [unclear] and the shellcode. We grab it and verify that we keep it anyway. Down here on the bottom left we have a listener open. We have the .exe, and a prompt pops

up asking if we want to actually run this program. Since it's unsigned, we assume the user will say Run anyway, and we get a shell connected back. The loader deletes itself, and we can see that it's completely

deleted. And, real quick, what else can we do? Instead of compiling to C, we can compile to C++. We have to add an option for exceptions to use goto, since C++ has different exception handling than a language like C. If we're going to pass the optimization levels that we want as a user, we have to declare opt as none, because if we declare opt speed or opt size with Nim, it automatically gets populated with -Os or -O3, -Os being size and -O3 being speed. A lot of the time I like to compile with the Nim cache written out so I can look at main.json and see exactly what flags are being passed into the compiler and linker, and what files are

being compiled. We can also change our GCC to test how different implementations behave. For example, you can use 13.2, which contains the -Oz optimization, and it creates slightly smaller shellcode. To end it off, the best advice for this is to play with your compiler. Here on the left we have -Os, and it's a pretty straightforward control flow graph. In the middle we're using AVX, the advanced vector instruction set, and on the far right we have AVX-512F, and as you can see they get more and more complex. If I was doing a CTF and saw the far right, I would go outside and touch grass. So that brings us to the end of the

talk. Do we have any questions? [Audience] I've got kind of a question. Do you have any Nim frameworks that do dynamic compilation, kind of like C# has Roslyn, Go has garble, or anything? [Speaker] As far as dynamic compilation that's been written in Nim, I don't think so. [Audience] Okay. [Speaker] Pretty much the way Nim works is, if you have any compiler that can compile C code, you can use it with Nim, because the Nim will compile into C, and then that C is fed to a C compiler. [Audience] Oh, okay. I might need some help

with that. [Speaker] That's all I have.