Detecting Log4J on a Global Scale Using Collaborative Security

Name: Detecting Log4J on a Global Scale Using Collaborative Security
Uploaded: 2022-09-04
Duration: 40 min 29 s
Description: Klaus Agnoletti explores how community-driven threat intelligence can detect and track widespread vulnerability exploitation at scale. Using CrowdSec's collaborative security platform, the talk presents real-world data on Log4J detection timelines, attack patterns, and how shared reputation signa

BSides Las Vegas · 2022 · 40:29 · 147 views · Published 2022-09

Speakers

Klaus Agnoletti

Tags

Category: Technical

Topic: Threat Intel, Vulnerability Research

Research: Case Studies and Incidents Analysis Empirical Research

Style: Talk

Mentioned in this talk

Tools used

YARA

Concepts

Sigma rules

Vendors

CrowdSec

About this talk

Klaus Agnoletti explores how community-driven threat intelligence can detect and track widespread vulnerability exploitation at scale. Using CrowdSec's collaborative security platform, the talk presents real-world data on Log4J detection timelines, attack patterns, and how shared reputation signals blocked 92% of malicious traffic across a global sensor network.

Original YouTube description

BG - Detecting Log4J on a global scale using collaborative security - Klaus Agnoletti Breaking Ground @ 17:00 - 17:55 BSidesLV 2022 - Lucky 13 - 08/10/2022

Transcript [en]

Welcome back to BSides Las Vegas. This is the afternoon session, this is Breaking Ground, and we have Klaus here speaking to us with the talk titled "Detecting Log4j on a Global Scale Using Collaborative Security." A couple of announcements before we begin. As always, we want to thank our sponsors, especially our gold and diamond sponsors; without their generosity, along with our other sponsors, volunteers, and staff, this event truly would not be possible. And I think I share the sentiment with a bunch of other people that I'm glad this event truly did happen in person, so hopefully it just goes uphill from here.

One important announcement: the next talk, at 6 p.m., "The Exclave Experience: Relocating to Almost Canada" by TProphet, has been moved. It's still at 6 p.m., but it has moved to a different ballroom: it's now at PasswordsCon in the Tuscany ballroom. So if you're going to TProphet's talk, make sure you go to PasswordsCon and not Common Ground. For those of you at home, same thing: the schedule has been updated on the website, so you can refer to the website, which has the most up-to-date information on all the speakers as things change.

A few other announcements. Again, this is being live streamed, just as every other talk has been over the past few days, and it will also be posted later on YouTube. So please make sure that your cell phones are silenced, out of respect for the speakers as well as those listening on the live stream. Later on, during Q&A, I know it's annoying, it's a slightly bigger room and there's not many of us here, so I promise I'll walk a little bit fast, but please do speak into the mic, because otherwise the people online will not be able to hear your question, and we really don't want that to happen. So without further ado, Klaus, the floor is yours. Take it away.

Thanks. I'm Klaus. Welcome to my talk on detecting Log4j on a global scale using collaborative security. But before we start: this is my first talk in Vegas, so I'm really happy about that. Of course I'm most happy about the fact that my first talk is at BSides, because we all know BSides is always the nicest crowd. That's how it is. At Black Hat they would have killed me. Another thing I want to say before we start is that the best thing that happened in my professional career was when I got fired, a year and a half ago, from one of the biggest Danish online retailers.

And that's relevant, because if I hadn't, then I probably wouldn't have been here today. I've been an infosec professional for almost 20 years, and I don't know about you guys, but I just never got around to finding out what I wanted to do for the rest of my professional career. But I did find out at some point that what I liked best about my job was all the things that I did in my spare time. I've always been involved in OWASP: I've been an active member of OWASP Copenhagen for 13 years, 14 years, sorry, and I co-founded BSides Copenhagen.

What I have always loved about that is arranging events and gathering people, seeing in their eyes that they've had a wonderful time, and helping them out. That's really what I wanted to do. At some point I was lucky enough to find out that there are people out there who do this for a living: they interact with the community, they give back to the community, which I love to do of course, and they help people out. Eventually I found CrowdSec, or CrowdSec found me, depending on how you look at it, and now I have the greatest job in the world. So that's it.

First, a little bit of background about how CrowdSec works, because if you don't understand just a little bit about how it works, then you won't be able to understand the data and appreciate the Log4j part that I will be talking about later. I won't be going into details; this is as high level as I could make it.

Let's start at the beginning. As a cybersecurity professional, or at least someone interested in cybersecurity, you know there's something wrong out there, right? There are people out there spending a gazillion dollars on cybersecurity and they're still getting breached, so something is clearly up. It seems like the whole world has agreed that if you just throw enough money into a big bottomless pit, then everything is going to be all right. Except it's not all right; that's evident to everyone. So our point, and the reason why CrowdSec is around, is: why don't we try something else? We tried outpowering the bad guys. We tried outsmarting the bad guys. Why don't we try to outnumber them? Because if you think of it, there are more regular people like us than criminals. So why don't we work together and get those bastards? That is basically what CrowdSec aims to do.

But what does this mean in practice? You can think of CrowdSec as the Waze of cybersecurity. If you don't know Waze, Waze is a GPS app that basically collects information about how fast your car is going and whether there are any accidents or road repairs or anything around you, and shares it with everyone else so that everyone has as efficient a trip as possible. CrowdSec does kind of the same, in that CrowdSec reads logs from the services you're running, detects threats in them, and then mitigates them afterwards. There are a number of different ways to mitigate. Then it sends signals about the attacks it sees back to our data lake, where all the IPs are assessed and turned into a block list that is shared with the rest of the community. That happens automatically.

CrowdSec can detect a number of different attacks. CrowdSec was originally, I guess, meant to replace Fail2ban. Fail2ban only does brute-force attacks, and that's fine, but CrowdSec can detect a lot of other things on layer 7, for instance the Log4j that we'll be talking about. In that case CrowdSec reads the web server log and detects those special strings. It's not AI or ML; it's basically rule-based. The log parsing and the scenario files consist in part of Grok patterns, which parse the log file, and based on that it detects the attacks and sends signals back like I described before.

These are relatively current stats. We collect around 1.6 million signals a day. We have around 4 million IPs in what we call the smoke database, which is three days of bad guys, and there are around 20 to 30 thousand IPs in the curated block list that I talked about before. On a later slide I'll talk a bit more about the strength of the CrowdSec network, meaning how big a number we are talking about.

But before that, it's important to understand that CrowdSec is free and open source. The license we chose for open sourcing it is the MIT license, and the whole point of that is that once you have open sourced something with the MIT license, it's not possible to close the source again. CrowdSec being a startup, there is a risk, or chance, or possibility, that some big company wants to acquire us, and maybe they want to close the source.
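A minimal sketch of the rule-based log parsing described above, in Python. It is not CrowdSec's parser or scenario format: the regular expression is only a stand-in for a Grok pattern over the common/combined access log format, and the `Signal` class, the `detect` function, and the scenario label are hypothetical names chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Regex stand-in for a Grok pattern over the common/combined access log format.
ACCESS_LOG = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


@dataclass(frozen=True)
class Signal:
    """Hypothetical signal shape: who, when, and which rule fired."""
    source_ip: str
    timestamp: str
    scenario: str


def detect(line: str) -> Optional[Signal]:
    """Parse one access log line and apply a single static rule to it."""
    m = ACCESS_LOG.match(line)
    if m is None:
        return None  # line is not in the expected format
    # Look in every field an attacker controls; absent fields are None.
    fields = " ".join(filter(None, (m["request"], m["referer"], m["agent"])))
    if "${jndi:" in fields.lower():
        return Signal(m["ip"], m["ts"], "log4j-jndi-probe")
    return None
```

The rule here only catches the plain, unobfuscated string; only the offending IP, the timestamp, and the rule name leave the function, not the log line itself.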

We will do what we can to prevent that, because CrowdSec is nothing without the community and we don't want to screw it over. So instead we set up a fair deal, at least we think so: the software is free to use, and if you want the block list, you share your own signals. If you don't, well, then you don't get the signals, but CrowdSec still works. CrowdSec is still able to parse logs, detect attacks, and stuff like that; you just don't get any signals from everyone else.

A question that I often get asked is: what about privacy? Especially since we, as security people, are used to hating everything that's free on the internet, because that means that you're the product. In this case you're not the product, because honestly we don't care about you; we care about the bad guys. What CrowdSec is collecting is literally just an offending IP, a source IP, a timestamp, and a behavior. The behavior is a scenario that describes what's going on, for instance SSH brute forcing or credit card stuffing or whatever. That's the only thing that's being recorded. CrowdSec doesn't send your logs anywhere. Also, if some of you are from Europe: we have something called GDPR, which basically means that we have to have a DPO, a data protection officer, we have to have made privacy impact assessments, and we also need to have processes to remove IPs from this list in case somebody who identifies themselves asks us to do so. So far no criminals have done that, so that's good.

Another question that I am often asked is how we deal with poisoning and false positives, because the block list is distributed without questions asked, so to speak. It's sent to the agents, and the agents just accept it without questioning it. This means that we simply can't accept that something is wrong with it, that there are faults in it.

First let me explain a little bit about how it works. When signals are sent to CrowdSec, they go into the smoke database, because, you know, there's no smoke without fire. Then the consensus engine, as we call it, assesses the IPs, finding out whether each one is bad or not, and if it is bad, it ends up in the fire database and gets distributed back.

In terms of poisoning, we have a mechanism called the trust rank. All agents submitting signals have a trust rank based on how long we've known them and how long we've known that they are reliable. A new agent will start with a trust rank of zero, and if they send consistent signals for six months without any errors, they will have achieved a trust rank of 99. The reason for this is of course to make it time-consuming if you want to poison us. If you're a bad guy and you want to poison the CrowdSec database, first of all it takes time. It also takes a lot of ASNs, because in this voting process an IP needs a certain number of votes in order to be deemed malevolent, and those votes are weighted by the trust rank, meaning that an agent needs to have a certain trust rank in order to be able to affect this process. On top of that, each ASN only has one vote. This means that if you are a bad guy and want to poison the CrowdSec database, you can't just spin up a thousand VPSes on GCP or somewhere else, because there simply won't be enough ASNs to make any difference. Instead you, as a criminal, would have to have a fleet of bad agents spread across ASNs all around the world in order to ever have a chance of affecting the consensus process.

But obviously that's not all that we do, because we also don't want false positives. One way to do that is that we have our own fleet of honeypots, which we use to compare with the signals we get from the crowd. And then I need some water.
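A minimal sketch of the voting idea described above, assuming a much simpler model than the real consensus engine: a report only counts if the reporting agent's trust rank is high enough, and all agents in the same ASN share a single vote. The `Report` class, the function name, and both thresholds are hypothetical; the talk gives no actual values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    agent_id: str
    agent_asn: int    # ASN the reporting agent sits in
    trust_rank: int   # 0 for a new agent, up to 99 after months of consistent signals
    offender_ip: str


MIN_TRUST = 50  # hypothetical: minimum trust rank for a vote to count
MIN_VOTES = 3   # hypothetical: distinct ASNs needed to flag an IP


def malevolent_ips(reports, min_trust=MIN_TRUST, min_votes=MIN_VOTES):
    """Return the offender IPs that enough trusted, distinct ASNs agree on."""
    votes = {}  # offender_ip -> set of ASNs that cast a vote
    for r in reports:
        if r.trust_rank >= min_trust:
            # A set gives each ASN one vote, however many agents it hosts.
            votes.setdefault(r.offender_ip, set()).add(r.agent_asn)
    return {ip for ip, asns in votes.items() if len(asns) >= min_votes}
```

Under this model a thousand fresh agents in one cloud ASN contribute at most one vote, and none at all until their trust rank has grown, which is the property the talk relies on.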

That was nice. The second mechanism is basically a whitelist. All Google DNS servers, all Google SEO bots, all Cloudflare CDNs, whatever services you simply don't want to block: they cannot be blocked. And thirdly, and this maybe sounds a little bit like pre-crime, CrowdSec also looks at IPs from the same netblock, and at some point, if CrowdSec has seen and deemed enough IPs from the same netblock malevolent, it will just block the rest of the netblock regardless of whether they have done anything wrong or not. We just assume that, yeah, you will also do something wrong, so we'll just block you ahead of time. If an IP ends up in the fire database, it is distributed back to the CrowdSec agents.

If you want to know more about CrowdSec, then this is not the talk for it. I won't be talking about which OSes CrowdSec supports or anything more technical than that. But if you want to know more, I'll be around at the pool party and I'll also be around at DEF CON. Just follow the trail of CrowdSec stickers, like breadcrumbs. I'll be around.

Before we move on to the Log4j part, it's important to understand that CrowdSec is not a WAF. The way CrowdSec works, as I explained before, is that it needs a log entry in order to block something. That causes an inherent problem if you know a little bit about Log4j, because you know that if you're vulnerable, it only takes one connection with just one log entry and then you're screwed. First of all, as a security professional, I cannot recommend depending on only one control. But if you use CrowdSec as one of your controls, then there is a risk that CrowdSec may not detect it and block it if it's an IP that the crowd doesn't know already.

But luckily we did a little experiment with two identical servers, just to explore how much of what CrowdSec blocks is blocked by the community, by reputation, by the block list, and how much by local signals and local processing. For that we set up two servers at OVH, a cloud provider in Europe. They were completely identical. Both had the agent installed, and one of them also had what we call the bouncer, which is the IPS part, installed. Basically, the bouncer plugs into the host firewall, so it blocks the connection.

After three months we compared all the attacks that the two servers had seen, and it turned out that 92 percent of the bad traffic aimed at the server was blocked just based on IP reputation. So basically the server that was protected only saw the orange part. That is good news if you're vulnerable to Log4j and plan to use CrowdSec for that, or of course for similar vulnerabilities.
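The netblock rule mentioned earlier (block the rest of a range once enough of its addresses are deemed malevolent) can be sketched with Python's standard `ipaddress` module. The block size and the threshold below are assumptions; the talk does not say which values CrowdSec uses, and the function name is hypothetical.

```python
import ipaddress
from collections import Counter

PREFIX_LEN = 24      # hypothetical block size
BLOCK_THRESHOLD = 3  # hypothetical number of bad IPs that condemns a block


def escalate_to_netblocks(bad_ips, prefix_len=PREFIX_LEN, threshold=BLOCK_THRESHOLD):
    """Return the networks that contain at least `threshold` distinct bad IPs."""
    counts = Counter(
        # strict=False masks the host bits, mapping an address to its enclosing block.
        ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        for ip in set(bad_ips)
    )
    return {net for net, n in counts.items() if n >= threshold}
```

For example, three distinct bad addresses in `198.51.100.0/24` would put the whole /24 on the list, while a lone bad address elsewhere stays an individual entry.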

Yeah, and the conclusion, of course: community matters, as we all know. Then let's move on to the part you came for, I hope: Log4j. If there should be a single person or a few people in the room not knowing what Log4j is, I'll give a super short summary of it. On December 9, 2021, the Apache Foundation released information on a critical bug in the Log4j library which allowed remote code execution. I didn't know that before then, but as it turns out, Log4j is used everywhere, and by everywhere I mean exactly that. This screenshot is from a tweet from a guy called Cas van Cooten, and that's where it dawned on me how bad this is, because basically he set his name to this JNDI string, which contains the exploit and the payload, and he got back connections from Apple servers, meaning Apple servers were also vulnerable. And I was like, this is going to be a show, right? That quickly escalated into a worldwide panic, and everybody was either patching or releasing free tools and resources to help out, and so did we.

As I said before, on December 9th Apache released information about the vulnerability. On December 10th CrowdSec released our first scenario detecting this Log4j thing, and then on December 13th and 16th the scenario was updated. That's a quite natural development, because in the beginning it was just a plain, unobfuscated JNDI string containing the payload, and then over time it would be obfuscated in more and more obscene ways, and of course the scenario needs to be able to take care of that. Since it's Grok, it's very static, so in the end it ended up containing 34 Grok patterns for matching. The last update to the scenario was on December 20th, when somebody from the community added support for Unicode encoding.

To me this is a really good example of why this crowd-based approach works. Basically, we create a scenario, and then it takes, I'll show a little bit later how long it takes, and then we start getting data back, signals about active malevolent actors. I find that pretty amazing, because that is really what communities can do when they work together.

As you can see, this is a timeline starting from December 9th and ending on December, I don't know, 16th, 17th, something like that. The big spike is on December 12th, where we start getting signals. We end up at that point with around a thousand IPs that we knew were actively exploiting this. But as we looked at the data we saw something strange, in the sense that one IP had a different scanning pattern. We looked it up and it turned out to be a German security research institute. They were basically trying to find out how big this problem is by scanning. We decided not to block them, because we would then mess with their research, and also given that it's not a risk. So they were filtered out.

This is an overview of the signals we have received for Log4j until now. The big dive is around May 18th, approximately three months ago. We got around 100 signals every day, but today we get around 40 to 50. So Log4j is not dead; the rumors are gravely exaggerated. If you look carefully, there are a few spikes after May 18th, and the two biggest ones are quite interesting, so I'll talk a little bit about them.

We're not really sure what happened, because, as you may know, we don't really collect much about what's going on. We don't know exactly which payload or whatever the bad actors are using; we just know that the scenario was triggered. Based on that, it's relatively impossible to find out what they were doing, but we know who did it. On June 21st our data scientist tweeted that we saw a big spike, and that was actually the smaller of the two big ones. The IPs (13.89.48.118) are still around, doing their business with Log4j. You can see the two spikes in the graph there: the first spike was on June 21st and the second was on July 7th, and that one was twice as big. And now I need some more water.
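A minimal sketch of why the obfuscated variants are awkward for static patterns, and of one alternative: normalize the string first, then look for the plain `${jndi:` prefix. This is not how the CrowdSec scenario works (per the talk, it uses a list of Grok patterns). The sketch only handles URL encoding and Log4j's `${lower:...}`, `${upper:...}`, and `${name:-default}` lookup forms; the Unicode-escape variant mentioned above is not covered, and all function names are hypothetical.

```python
import re
from urllib.parse import unquote

# Innermost lookup: a "${...}" with no further "$", "{" or "}" inside it.
_INNER = re.compile(r"\$\{([^${}]*)\}")


def _resolve(match):
    """Replace one innermost lookup with the text it would evaluate to."""
    body = match.group(1)
    lowered = body.lower()
    if lowered.startswith("jndi:"):
        return match.group(0)            # the thing we are hunting for: keep it
    if ":-" in body:
        return body.split(":-", 1)[1]    # ${anything:-x} falls back to "x"
    if lowered.startswith(("lower:", "upper:")):
        return body.split(":", 1)[1]     # case is ignored later anyway
    return match.group(0)                # unknown lookup: leave untouched


def normalize(text, max_rounds=10):
    """URL-decode and unwrap nested lookups until nothing changes."""
    for _ in range(max_rounds):
        resolved = _INNER.sub(_resolve, unquote(text))
        if resolved == text:
            break
        text = resolved
    return text


def looks_like_jndi_probe(text):
    return "${jndi:" in normalize(text).lower()
```

With this approach `${${lower:j}ndi:...}` and `${j${::-n}di:...}` both collapse to `${jndi:...}` before the single check runs, at the cost of having to keep the resolver in step with new obfuscation tricks.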

We don't know why, as I said, but we do have a theory, at least for June 21st. As it turned out, on June 23rd, two days after the first spike, CISA, the Cybersecurity and Infrastructure Security Agency in the US, released a bulletin saying that a Log4j-related vulnerability in VMware Horizon was actively being exploited. So at least to us it makes sense that this may have been what they were scanning for; they just knew it a little bit in advance. And as I said before about Log4j being everywhere, it's also highly unlikely that we've seen the last of this. I'm pretty sure that there are still Log4j things out there, so Log4j is definitely not dead.

I guess when you think about it, it makes sense that we can also see other interesting things in the data that we receive. But to put things a little bit into perspective, this is an overview of the signals that we've been receiving over the last three months. On average we've had 29 million a month, broken down here into weeks and hours. We have around 57,000 agents in 168 countries right now, and it's crazy to think that the number of countries just keeps rising. A month ago, two months ago, it was 150, and I was thinking, well, now that's it. But it's not.

We get around twenty thousand IP addresses shared every two hours. There's a two percent renewal rate every 12 hours, which brings about 400 new IPs every 12 hours, and 12 percent of these IPs are then seen for the first time. So there's a big percentage of the IPs that are being shifted in the beginning, whereas over time it's a bit more static. There's an overview of that here: at less than one week it's 12.63 percent, but over time 2.79 percent of those IPs are new. So the share of IPs that are new decreases over time.

One thing that we realized at the beginning of the CrowdSec project is that the scarce resource on the internet is IP numbers. The bad guys need IPs for anonymity, so basically they just hang around and try to find IPs that they can use for whatever they're doing. The more difficult we can make it for them, the faster we can burn their IPs, the better. That is basically, I would say, the core vision of CrowdSec: really being so many in numbers, so many agents in the world, and on top of that being able to block them right away.

We are working with an organization that has a BGP block list. We want to push our feed into theirs, so that whatever CrowdSec sees can be blocked at the BGP router level, which is like the core, the foundation of the internet, so that there will be no route; they won't be able to do anything as soon as they're detected. That is really the vision of CrowdSec. And basically, as this says, they are always looking for new IPs. If we can disrupt this process, then things are better for everyone, except them of course.

This is an overview of the block lists in total. On the left side you can see how many times an IP has ended up on a block list: there are 50 times, 25 times, 10, and 5, I think. And it turns out that 79.5 percent of the IPs in the community block list are there more or less permanently, for some reason.

I want to talk a little bit about a layer 7 DDoS attack that we saw on a random user. Yeah, they're big, but they're still random. I'll show you the graph of the signals that we were receiving versus the IPs that were blocked. This is a two-in-one graph, so it's a little bit confusing, but the purple line is the signals that this agent sent and the yellow is the unique IPs that were blocked. On July 27th, that is the big blue spike to the left, oh sorry, to the right, more than 4,000 threats were reported from 1.7 to 1.8 thousand IPs, whereas on August 1st, where the big spike is a yellow one, more than eight thousand, 8.8 thousand, unique IPs were blocked. So this was really a DDoS attack, but luckily for the user they had CrowdSec, so nothing happened.

In terms of Spring4Shell, we had a big spike on July 16th, kind of like the Log4j spike, only this time around we have no idea what's going on, basically. We saw that 178 agents were reporting signals on this scenario, but we haven't really been able to find out why. So if you know anything about Spring4Shell, please let me know.

Another thing that we found out is that SIP is being constantly hammered, and by hammered I mean brute forced. Back in November we were approached by a user, a French voice-over-IP provider, who wanted us to create a scenario to detect stuff like this.

This SIP scenario was installed on our honeypots. It's a relatively dumb scenario, in the sense that it doesn't really look into what you do when you connect; it just counts the connections. But given that it's a honeypot, there is no reason why anybody would want to connect to it with a legitimate purpose. Another thing with the honeypot is that it doesn't have a bouncer; it doesn't block anything. That being said, on one day, June 23rd, we got around 83,000 signals just on that specific scenario, just from our fleet of honeypots, which isn't that big.

So if you look at the number of agents compared to the amount of signals we get out of it, then SIP is definitely the most hammered scenario. But it's not the only one; SSH is also hammered. Our data is a little bit biased, in that CrowdSec is mostly for Linux, and when you install CrowdSec it automatically detects whether SSH is installed, and it is installed on Linux, so it's not that big a surprise that this is the most common signal. Daily we get around one million signals from more than five thousand agents, and around sixty thousand bad actors reported. As for the top threats, you'll recognize the orange and the cyan ones, because those are SSH brute force and SSH slow brute force, and together they're more than 60 percent, whereas something like Windows brute forcing is around two percent. But again, our data is biased. Windows was not officially supported until a couple of weeks ago, so obviously there aren't that many of those around. And honestly, who in their right mind would put a Windows server on the public internet, right?

Another quite interesting thing that we can see in our data is how good the different cloud providers are at cleaning up after these attacks. The way it works: when an IP is hacked or taken over by a bad actor, it will start showing up in our smoke database, and when it doesn't anymore, we assume that it has been cleaned. So we take the period of time from when it was first seen to when it's no longer seen. An average of that, based on all the IPs, shows that AWS on average takes three days to clean up, whereas OVH takes a little bit longer: 17 days. We haven't done it for all cloud providers yet, but we will be doing dedicated reporting based on ASN in Q3.

It's not completely fair to say that AWS is good and OVH is bad. It's not that simple, obviously, because there is a huge difference in what you choose to host on that virtual server. If you choose to host a PHP CMS that you're maybe not so good at updating, then you're definitely way more vulnerable than if you just have Linux with SSH, for instance.
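A minimal sketch of the clean-up measurement described above: for each IP, take the time between its first and last sighting, then average per provider. The input shape and the function name are assumptions made for illustration; how the real sightings are stored or attributed to a provider is not covered in the talk.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def mean_cleanup_days(sightings):
    """sightings: iterable of (provider, ip, day_seen) with day_seen a datetime.date.

    Returns {provider: average days between first and last sighting of its IPs}.
    """
    first_last = {}
    for provider, ip, day in sightings:
        lo, hi = first_last.get((provider, ip), (day, day))
        first_last[(provider, ip)] = (min(lo, day), max(hi, day))

    per_provider = defaultdict(list)
    for (provider, _ip), (lo, hi) in first_last.items():
        per_provider[provider].append((hi - lo).days)
    return {provider: mean(days) for provider, days in per_provider.items()}
```

As the talk notes, a number like this reflects what customers choose to run as much as how quickly the provider reacts, so it is a rough indicator rather than a ranking.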

This is also ASN reputation, but more broadly. It's a little bit small, so I don't know how much you can see, but on the y-axis is the number of IPs assigned to the ASN, and the size of the bubble is the number of IPs that are being reported, because that is not necessarily the same, showing how widespread whatever compromise is going on is. Interestingly enough, the top four: that is Chinanet with 50,000 reported IPs; DigitalOcean with 24,000, that's half; number three is China Unicom China169 Backbone with 9,000; and Amazon with 7,000. So, just to say that of course there's a difference between the various ASNs and the various providers in how good they are, and how good they want to be, at cleaning up after the bad guys. And if you look at the legend, which I know is totally impossible, you can see which kinds of attacks are primarily being detected. Orange is brute force; it's our friend, brute-force attacks. Watch the recording, what can I say, for details, of course.

This is malicious IP origin distributed between cloud providers. Interestingly enough, AWS is still number two here. They have a lot of stuff, and apparently, even though they do a lot to clean it out, basically it's not enough. DigitalOcean, yeah, they are winning this one.

In terms of network strength, these are the countries that have the largest number of agents: France, US, Germany, Netherlands, United Kingdom, which includes Dublin, and Dublin is a huge AWS hosting site. Cloud computing kind of muddles the borders a little bit here. A couple of months ago I was speaking at BSides Prishtina in Kosovo, and in Albania, and in countries like that hosting a server is way too expensive, so they use cloud computing all the time. So whoever is using an IP and only using cloud, we don't know where they're from. But as I said before, there's around 56,000 installs in total.

And that was it for my talk. You can find us on Twitter: @Crowd_Security. We have a Discord community; if you scan the QR code you will get in there right away. We are running workshops, for the time being for beginners, where we walk you through how to install CrowdSec, how to install a bouncer on nginx and stuff like that, and how to set up CAPTCHA as a mitigation. You can also send me a mail, klaus@crowdsec.net, or you can hit me up here at BSides Las Vegas, which is soon over. I'll be around tonight for the pool party, and I'll also be at DEF CON. As I said before, just follow the trail of stickers and you'll find me. All right.

All right, so we're going to have time for questions, but unfortunately we got word that there are some issues with the wireless mic, so we're going to have to stick with the wired mics. If any of you have questions, you might have to come forward and ask, or just shout it out. But Klaus, if you could just repeat the question for everyone... Oh yeah, that would probably work. Okay, thanks. Any questions?

Oh, sorry. The question was: why is DigitalOcean topping that list? And in terms of rotating IPs, I'm not sure I completely understand.

Well, I don't really know why DigitalOcean is topping the list. As I said before, it has a lot to do with what people are using their VPSes for, and of course there is also how good the company itself is at following up whenever a customer has their VPS compromised. There may be something procedural there; I don't know, really. I don't know what more I can say about the rotation of IPs, other than that, given that the bad guys want to be anonymous, they change IPs a lot. So the faster the providers are at cleaning up, the more they need to find new pools of IPs to stay anonymous. As I said, I don't really know what else to say about it. I guess that's just how it works.

But if there are no more questions: follow me, find me. I may or may not have a lot of stickers to give you, and if not, you've probably seen them outside, because I've been throwing them around like crazy. And also, this is not an official BSides Las Vegas t-shirt; this is a complete rip-off. And I also have stickers in that design, of course. All right, thanks for your time. [Applause]
