
We have a special guest: Mr. Ross Bryant is going to be speaking about nation-state threats and the open source software supply chain. But before we get him started, I would like to thank St Mary for hosting us and Arctic Wolf for funding and helping all this come together. Thank you again. My name is Dwayne, and have a good show.

All right, thank you, Dwayne. So again, I want to talk to you about the proliferation of nation-state threats in open source software ecosystems. I'm the chief of research for Phylum, and if you're not familiar with us, I'll give you a brief introduction. We're a Series A startup, we started during COVID, we're
100% remote, from Italy (Europe, not Texas) to California, and we monitor open source software ecosystems for risk, because the bottom line is we all trust and use software from strangers on the internet. I was just at the open source software Foundation North America, and they had numbers like 98 or 99% of all software is open source, and they were like, we won, great. But this comes with some risks: strangers who we will never meet, strangers in such vast quantities across every aspect of the software development life cycle. It is hilariously unlikely that you'll ever truly know the full extent of all the identities and the intentions of every open source software supplier. And so even though
open source has a lot of advantages, there are still risks, and so my company classifies these broadly into five domains. First is just malware: is this package malicious, is it intended to execute the will of the producer of the software over the consumer of the software? Next, is this package poorly engineered, does it have any engineering risk? Would a senior engineer send a junior engineer back during code review to go fix stuff? Are there known vulnerabilities with a package? This is basically the domain of all software composition analysis tools; this is traditional SCA. And then finally, are there legal risks for commercial enterprises in using certain open source software, that is, license risk? And when you
aggregate all these risks, each of these informs an overall picture of the fifth domain of risk, namely author risk, that is, how trustworthy is the purported author of this package? And so those are our risk domains: four whats and a who. The scope of this problem is immense. Our solution is our automated platform that monitors seven different open source ecosystems: npm for JavaScript, PyPI for Python, Maven Central for Java, crates.io for Rust, RubyGems for Ruby, NuGet for C#, and Golang for Go. And anytime a new package is published to any one of these seven, Phylum downloads it and analyzes it, and within minutes of publication our customers have a
detailed report of all our findings across these five domains of risk. We also integrate with various CI/CD platforms so that developers can immediately know if any package, or a dependency of a package, has problems they should be aware of. And so last year was my first BSides, and I had the privilege of speaking here, and I gave a broad survey of the attacks that the Phylum research team had uncovered in 2023 with our automated platform. I gave a lot of real-world examples from npm and PyPI about spammers and scammers and credential stealers and obfuscated code and typosquatters and combosquatters, if you've ever even heard of that, and automated respawning malware on GitHub,
and it was a great talk, I really had a great time, it being my first BSides. The week after I left BSides, an unusual package hit our platform in npm. It's called chart table JS. It's a simple package with, as you can see, just two small files, not even 1K of code. By the way, if you don't know what a combosquat is, this is an excellent example. It's not a typosquat, where somebody fat-fingers something. There is actually a package, albeit an unpopular one, called chart table, and the attacker just slapped a JS onto the end, because, well, it's JavaScript, who would be any the wiser? That's what a combosquat
is. And so we started looking into this package, in the package's metadata file, package.json, and this one looks fairly innocuous. There's really not a lot going on here except for the unusual preinstall hook that executes as soon as a developer types npm install chart table JS. This hook immediately installs the sync-request dependency and then executes main.js, which is the only other file in the package. What we find there is also not a lot, just 20 lines of code. First they turn off TLS certificate validation, which just bypasses any certificate checks, and then it creates a hidden directory called C price in the user's home directory, and after that, after writing that
directory name to the console, it calls this bespoke function at the top, check SVN, which sends a GET request to that URL and then writes the response to a new file called price token in the directory it just created. And that's it; that's the entire functionality of the package. There is nothing else in the package whatsoever. And so my research team and I had a lot of unanswered questions at this point. As we kept digging through our data for answers, we found another package in npm that was published just 5 minutes after chart table JS, and it's called vew wjs, and it's also a similarly small package, albeit a little bigger. And so we started
in the package.json, and now we find a postinstall hook which, right after you type npm install vew wjs, will execute main.js. When we look at main, it's too big to fit on the screen, so I'm going to chunk it up for you here. Again we find a bespoke function, called get price. It's going to send a POST request on port 443 to whatever is passed to it, and after it collects all the response chunks into a single buffer, it's going to write that to the path and file name provided and then immediately execute that file, as you can see on line 25 there. So in
this case, after turning off TLS checks again, the vuejs package looks for the token that the chart table JS package just dropped, and if it's found, it calls get price, and the payload is retrieved and executed. At this point I'd just like to make a couple of observations. First of all, these individual packages don't really look like malware per se, taken individually, one at a time. They don't really do a whole lot; in fact, one package without the other is totally useless. But taken together, this doesn't pass the smell test at all. This is bad: you get some random token from some server, which you then authenticate with to get a payload, and then you download and
execute it immediately. That is super sketchy. But I think most important here is that no SCA tool that just scans for vulnerabilities will ever find anything like this at all. This code is not vulnerable; there's no CVE associated with any of this. If your organization only scans for vulnerabilities, this kind of activity should give you second thoughts about your organization's security posture. Anyway, my team still had some big unanswered questions, and the first one was: who would ever download any of this? This is dumb. It's unlikely that anybody would be fooled by these packages, especially if all they did was just take a look inside. But part of Phylum's success
is that nobody ever does take a look inside the things that they install; they just install it and assume everything's going to be great. How long had this been going on? Was this an isolated event, or are there other packages like this? And finally, who's behind all this, and what's the point? This doesn't seem to really do much. At this point in our investigation we just didn't have any answers, but the second question was the easiest to answer, so we went back and looked in our data, and we found about a dozen of these GET/POST package pairs going back about six weeks prior. The first two seem to be test
packages, but the flurry of activity you can see there in May really seemed to be a concerted campaign. With each new pair of packages, the URLs and the directories and the file names would change, and maybe some of the identifiers would change a little bit, but the basic structure was the same each time. And though the package names varied across a wide range of topics, as you can see, the identifiers in the code all revolved around a crypto theme, as we saw in the previous example. And as we studied the packages' evolution, it looked like they were incrementally improving their software each time they published a new
package. Going forward from our initial find, we were able to track new pairs of packages fairly quickly, because now we knew what we were looking for. But getting the payload off the server was elusive: we never could be there in time to send that POST and see what the next stage was going to do. It just wasn't up long enough. Moreover, each new pair of packages was published under a new npm username and email, so these were all throwaway accounts; you couldn't just track one account and find these packages. And then on July 11th the packages just stopped. We still didn't know what was going on, and on the 17th
all the packages that were still active on npm were taken down and replaced with security holding packages, so they were all gone. Oftentimes the attackers would take down and unpublish the packages themselves, but many of these stayed up for over a month. The next day, on the 18th, the vice president of security operations for GitHub, which is npm's parent company, posted a security alert about a social engineering campaign targeting the personal accounts of employees of technology firms. Many of the targeted accounts were connected to the blockchain, cryptocurrency, and gambling sectors, and some to people in cybersecurity. And because GitHub has all the registration data for all of their users, they were able to attribute this
activity to North Korea. This was Jade Sleet. And now this explains how people would get duped by this in the first place: they were getting socially engineered. Someone would say, hey, I'd like to join with you on this project, and they would entice a developer to install the packages, and then the game is up, because once you install the packages there's nothing left to do after that. You're hosed. They also gave us a shout-out as well, which was really nice, because all we had was the code in the packages, and we were chasing them as best we could, but North Korea doesn't sign their stuff, you know, love, Kim Jong. We just don't know what
they're doing. We had always suspected that nation-state actors would be found in open source, but this was the first time we ever confirmed it, and as far as we know, last June was the first time that any nation state had been found in open source. They've been found in other places plenty, but not actively moving in open source ecosystems. After that blog post we began coordinating with GitHub's security team, sharing findings, but it seemed like that particular campaign really fizzled as soon as they outed the attackers in July. And though we continued looking for Jade Sleet for the rest of the summer and the beginning of the fall,
they just remained elusive. They just went quiet, as far as we could tell, until about Halloween, when our automated platform alerted us to the pumac com package. This package was nowhere near as sparse as the ones from the spring campaign; you can see there are a lot of files there. But again, starting in the package.json file, we see another unusual preinstall hook. This time index.js is what's going to be run, and then it's immediately deleted as soon as it completes. Looking inside index.js, after you get past some innocuous import statements, it declares these two variables, data and PS data. That's hard to read, so I'm going to beautify that in a moment, but looking at
the rest of index.js, it first checks to see if you're running a Windows operating system, and if you are, it's going to write the contents of the data variable to a file called preinstall.bat and PS data to a file called preinstall.ps1. If those writes are successful, then it's going to execute preinstall.bat and delete it; that is the unlink statement down there. And so now we need to really see what's in those two files that we just wrote. Here's a pretty-printed version of preinstall.bat. The first thing to notice is that it's doing its best to suppress output, turning echo off at the very beginning, and with all of those redirects, every single command
that does any kind of output just sends it to nul. Next we're going to curl a file from some HTTP server at a hard-coded IP address and write that to sqlite.a. If you try to inspect this file that you just wrote, you're not going to find anything but a blob of high-entropy garbage, which is usually indicative of some kind of encryption; stay tuned for that. Next we're going to launch the PowerShell script preinstall.ps1, and note that the execution policy bypass is going to run PowerShell without blocking anything and with all warnings suppressed, so we're trying to be really, really quiet here. When we pretty-print the PowerShell script, it
looks like this. This simple script is going to take the sqlite.a file that we just wrote, decrypt it using a single-byte key, 0xf, and write that to sqlt. If you inspect this file now, you would find a Windows PE header. Once we're done, we're going to delete sqlite.a and then return back to preinstall.bat, and now we're just cleaning up after ourselves. Mostly the rest of preinstall.bat is just covering its tracks by deleting the PowerShell script that we just used, and then it's going to take our newly decrypted file, rename it to preinstall.db, and call rundll32 to execute an exported function somewhere in there
called calculate sum, passing it some mysterious parameter, 4906. Once that's done, we're going to delete that file. Finally, we're going to replace our original package.json with some file called pk.json. Now where did that come from? Well, if you look back at the original ls of the package, we see that pk.json shipped with our package, and it's just slightly smaller than our original package.json. If we compare the original package.json with the one that the script overwrites it with, you see that the call to the preinstall script is gone. Moreover, the index.js that orchestrated all this is gone. Everything is gone. The only thing that would ever be left on your system after you npm installed
pumac com is whatever the calculate sum function dropped on your machine. We didn't fully understand this; it turned out to be a nested, nested, nested set of PEs that's really gnarly, but I'll talk about that in a moment. Still, even though we didn't have any of the calling cards from the previous campaign, we still felt like we were chasing a fairly sophisticated actor, maybe even the same ones; maybe we were back on the trail of Jade Sleet. And so we published our findings on our blog, along with four similar packages that looked and smelled very similar. While we were trying to reverse engineer this decrypted DLL, we did end up finding, in the same
way, eight other packages going back to about September 14th. They all had crypto-themed names, nearly every single one of them had the same structure, with the same subtle changes to identifiers, and they were all published under one-time burner accounts for npm. We noticed that as we looked across consecutive versions, the number that was passed to calculate sum seemed to increment, like it was being used to track individual targets. The hard-coded server IP changed occasionally, sometimes the decryption key changed, and we were able to track all those through November. In December, and this one came out of nowhere, the Chinese threat intelligence center QiAnXin reported on our stuff, and they published a full-on
malware analysis of the decrypted PE, and they attributed it to the Lazarus Group, which, as you may or may not know, was the group involved in the Sony Pictures hack back in 2014. When you do a pretty good job of Google-translating this to English, they show a lot of our work, but they also contribute a lot of their own reverse engineering; there's a lot of either IDA or Ghidra output there. Later in the write-up they divulge the motive behind these attacks, and that is simply to steal cryptocurrency. And so now we have not only our actor but also our motive: theft. Two weeks later we get a letter
from the UN Security Council Panel of Experts asking us for more information for their semiannual report. This one kind of came out of nowhere, and we were just like, no way, this can't be real. And so we had to really extra make sure: no, really, is that un.org? No, really. And it was, exactly. Our contribution to that report led them to conclude that North Korea has stolen about $3 billion worth of cryptocurrency over these past years, which they launder and use to pay for their weapons program, in circumvention of about a dozen UN resolutions. And so over the course of this particular campaign our research team found about
three dozen packages through early February of this year, and the actors were getting really good at covering their tracks. They were beginning to change things; they were beginning to base64-encode a lot of the hard-coded strings and these sorts of things. One package in particular that we caught was only alive for 90 seconds. They would publish it, and as soon as they knew that their victim was hooked, it was yanked and it was gone. We've got it because our platform picks up everything as soon as it's published. But later that month, later in February, our research team began tracking a new set of npm packages associated with another social
engineering campaign, this time targeting software developers with job offers: hey, we think you're going to make an excellent new senior engineer, if you would just meet us online and download this coding interview exercise, we'd love to interview you for a position. Palo Alto Networks Unit 42 came back and said, yeah, that's the North Koreans again. Because, like I said, we don't have the aperture, we don't have any of these user logins; all we have is the code that any of y'all can get on the internet, and we analyze it. So for all these other places that have more ancillary information that can help piece these things together, we have to
rely on them. There were a few developers that contacted us who got taken in by this scheme. This guy down here, for whom English didn't appear to be a first language, was pretty fortunate. If you can read it, he says: "they told me that it is for live coding interview software which I have to install, but before I do it I found your warning and also read article, then I resend email but there is no response from their side. Thank you for saving me, lots of job seekers, thank you again sir." That feels like we're winning here, I mean, even if it's just one. There were some other
developers that were not so lucky; they went ahead and did the coding exercise, and they got owned. Well, what are the takeaways here? No coding interviews, don't do coding interviews, right? First of all, if you get nothing else out of this, just understand the threats that are out there. Thank you. These kinds of attacks are really low risk and high reward for the attackers. Oh, by the way, they're nation states, so compared to you they effectively have infinite resources. They're sophisticated, they're well resourced, they're patient, and their job is to just keep cranking this stuff out and making money for the Dear Leader. But I have to make this
important point, because I get this a lot at security conferences: the target here is not the software. The target here is the software developer. It's you, the human. And there are technical solutions that can address that; as you know, Phylum exists for that reason. But this isn't just go find me a buffer overflow and patch that code for me, or go find this CVE that somebody filed four months ago, look it up, and tell me to upgrade to the next newest version. That's not the problem here. The problem is that the developer is the target. You're the target. Second, traditional SCA is just not going to cut it for these types
of things. It does well what it does; yes, by all means scan for the vulnerabilities, do that, don't stop. But of the malware that we reported in the last quarter, 82% has no GitHub malware advisory associated with it at all. If we filed a GitHub malware advisory for every single one, we could easily push thousands a week. There's just so much of this stuff out there, because it's cheap to script and you just spam all these packages out. None of the malware that we reported had a CVE associated with it, but 100% of it did something bad on the host machine. And finally, you need a defense
in depth approach. You need to be on the lookout for this type of stuff, and you need to know what kinds of things are happening out there, because these tactics are always evolving with these attackers. They're never going to stop, and until we get better at looking where they don't think we're looking, they're going to be able to continue to hide. And so that's what my team, the research team, is good at doing. We have a lot of expertise on the offensive and the defensive side of the house, based on our previous careers, and the threat hunting that we're able to do with our automated platform really does,
I think, keep us ahead of a lot of the competition, who really just throw bodies at this problem. We're a really small company, and there are many companies that are orders of magnitude larger than us, and they just have people clicking through packages one by one. We just teach the platform: go look for that stuff, tell me what you find, and report it up to me. So with that I'll take questions. Yes?

So you talked about npm, but it's not unique in running pre- and post-install scripts, arbitrary scripts. Even if you're downloading raw source code, configure files do the same thing, right? I see an awful lot of
large software projects on the internet that, even if they're walking developers through setting up a development environment that's going to be in a local development container, are still having them locally run yarn or npm or pip install or something outside the container. What are your thoughts on getting the community to embrace things that reduce the surface that those arbitrary scripts can touch?

Yeah, so this is a real friction point, right: security versus developer. And we're software developers; we don't want our builds to break for arbitrary reasons that are out of our control. I'll quit my job, like, I'm tired of this, I've got to go push features for my customers
and I don't want your stupid security product. So what we've done is we've integrated with the CI/CD pipelines, and most of the time the things that my team raises are just informational warnings. I feel like every single alert we have is like: dear developer, I know you didn't look at this code, but just so you know, the author field is empty, they didn't tell you who they were. Is that bad? No, it's not bad. But they also didn't give you a description, and they also don't have an entry point, and when you begin to accumulate all these little facts, you begin to ask, do I really want this package in my ecosystem? There are,
however, times when we know, for example, that a package is malware and that your package calls it as a dependency: break the build, stop right now, do not pass Go, do not collect $200. So it's going to be a tough problem to solve, and we're trying to find a way that tells developers, we don't want to slow you down, but we also don't want you to just blindly pip install the next thing and hope for the best, because that's dangerous. Does that answer your question a bit? Okay, someone up here, go.

Have you noticed, aside from the normal trend of malware attacks and everything like that, with AI,
ChatGPT and everything like that, has that been a catalyst to increase the prevalence of it?

So we've looked into this. I haven't yet been convinced that ChatGPT is going to produce good code in enough cases to matter. I mean, we've handed it simple obfuscated code and said, dear ChatGPT, tell me what this is doing, and it goes great for a while, until it gets to the end and it just doesn't know what it's doing; the long Markov chains just sort of run out. That's just where we are at this point. And that matters from a software development standpoint, right? The compiler, the interpreter, is punishing: your code must run. And these guys
don't really need that. But what they can do, I think, is use AI to really massage these social engineering aspects, which is the more dangerous aspect: not the code per se, but the thing that supports getting the developer to run the code. I think that's going to be a pretty straightforward path for them, because, I mean, ChatGPT, what's the regex for this thing? Oh yeah, that looks good, fine, move on. I don't know any North Koreans that speak English, and ChatGPT... Fair enough, fair enough. So, a question here somewhere? Yes, sir.

Yeah, so in software development teams I've been involved with licensing. Do you see any correlation
between, like, no licenses and these types of attacks, anything like that?

So I find a lot of laziness. These attackers have a specific purpose in mind, and getting the license just right is usually more effort than it's worth. A lot of times in npm it'll just say ISC and that's it, and there won't even be a license file. And that's kind of the approach we have for how we find this stuff in the first place, right? It's very hard for an attacker to get every single thing right to look like a legitimate package. There's always going to be something that they leave behind, where it's just like, well, why did you
do that? That seems weird. I'm not saying it's bad, I'm just saying that seems kind of odd. One of my favorite papers is called The Parrot Is Dead, and it goes into this in depth. It was about 10 years ago that these dissidents in closed countries were trying to get out to the internet using Skype-like homebuilt things, and what they found out is they never could mimic the Skype protocol well enough that they wouldn't get caught every time. And so as a threat hunter, as a defender of sorts, I'm just looking for that one way that the attacker's been lazy. Did you do it with a license? Fine. Did you do it
with an author? Fine. Did you do it with comments in your code, or lack thereof? Anything that doesn't look like what a solid, well-run, well-engineered software project looks like, I'm going to try to find you, and if I can aggregate enough of these weak indicators, I'll probably be able to pin it down pretty quick. Anybody else? I know I'm keeping you from lunch. I have some swag down here if you're interested in stickers and whatnot, but I appreciate your time. Thank you so much.
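
The idea from the Q&A of aggregating weak indicators out of a package's metadata can be sketched roughly as below. This is only an illustration under stated assumptions, not Phylum's actual method: the signal names, the weights, and the helper functions `weak_indicators` and `risk_score` are all hypothetical. The signals themselves (empty author, no description, no entry point, no license, an install hook) are the ones mentioned in the talk.

```python
from typing import List

# Hypothetical weights: the talk names these signals but not how they are scored.
WEAK_INDICATOR_WEIGHTS = {
    "empty_author": 1,
    "no_description": 1,
    "no_entry_point": 1,
    "no_license": 1,
    "install_hook": 3,
}

# npm lifecycle scripts that run automatically during `npm install`.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def weak_indicators(manifest: dict) -> List[str]:
    """Return the weak indicators present in a parsed package.json dict."""
    found = []
    if not manifest.get("author"):
        found.append("empty_author")
    if not manifest.get("description"):
        found.append("no_description")
    # Heuristic only: npm falls back to index.js when "main" is absent.
    if not (manifest.get("main") or manifest.get("exports") or manifest.get("bin")):
        found.append("no_entry_point")
    if not manifest.get("license"):
        found.append("no_license")
    scripts = manifest.get("scripts")
    if isinstance(scripts, dict) and any(hook in scripts for hook in INSTALL_HOOKS):
        found.append("install_hook")
    return found


def risk_score(manifest: dict) -> int:
    """Sum the weights of every weak indicator found in the manifest."""
    return sum(WEAK_INDICATOR_WEIGHTS[name] for name in weak_indicators(manifest))


if __name__ == "__main__":
    # A made-up manifest shaped like the sparse packages described in the talk.
    sparse = {"name": "example-pkg", "version": "1.0.0",
              "scripts": {"preinstall": "node index.js"}}
    print(weak_indicators(sparse), risk_score(sparse))
```

No single indicator here means a package is bad; the point, as in the talk, is that a score built from many small oddities gives a reviewer a reason to look inside before installing.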