
Yeah, sorry about that. Anyway, we're going to be talking about my first research project at my new job, but first, a little bit about myself. My name is David Bianco. I'm a security strategist with the SURGe research team at Splunk. The nice thing about SURGe is that we do basic cyber security research, not necessarily product oriented, just for the general good of the cyber security community and the rest of the industry. We get paid to research and publish what we find, and that's pretty much what we're going to be talking about today. When I first joined SURGe in May of 2022, I knew that I was going to have to choose my first research project at some point. There were two or three months of settling in, taking all the required corporate trainings, meeting all the people, but eventually it was going to come down to: I need to figure out what my first research project is. And I kind of have this issue where a lot of times I don't know what I'm going to want to work on. These projects run between three and six months, so they're not terribly long, but long enough that you might get sick of one by the end. So I really wanted something
that was kind of cool. I knew that as a newly minted Splunk employee I had an unlimited Splunk license (don't hate me) and I wanted to use it, but I also wanted something no one had really ever looked at, ideally something that people rarely think of. I had actually worked with a technology called certificate transparency. Anybody familiar with certificate transparency? Yes, a few people. I worked with it at my previous employer on some side projects. We'll talk about certificate transparency in a bit, but the idea is that it's a real-time log of all of the TLS certificates that are issued, more or less, for all websites. Before, I had access to that as a data stream, so I could look at incoming new certificates as they were registered, before they even went online, to try to detect brand impersonation or phishing sites and things like that. But in this case that wasn't exactly what I had in mind. I said, well, I could use this same certificate transparency technology to legitimately look at every single secure web server on the internet. So that's pretty much what I did. What would I do if I had all that data? I kind of quickly focused on the
certificates themselves. There are some interesting pieces about the certificates. Most of us are roughly familiar with how certificate trust works: a certificate authority issues a certificate to you, they sign it, and there's a root certificate authority somewhere involved, that whole certificate chain of trust. Well, it turns out all of our web browsers and operating systems have built-in sets of certificate authorities that they trust. They come from the factory like that, and you don't really have a lot of choice. You can go in and delete things if you decide there's a specific certificate authority you don't like and don't want to trust (I'm going off camera here), but for the most part nobody really does that very often. Chrome comes with, at the time when I tested it on my own Chrome browser, 138 trusted roots. Safari and macOS use a shared pool, so they're at like 154. Even Firefox has like 54 of these. These numbers vary a little over time, so they may be a little different than they are now, but the point is that each of these browsers has a set of requirements, and if you fulfill the requirements you can become a trusted root certificate authority and get shipped by default. The requirements are meant to address the legitimacy of the root CA: are they a serious organization that tries to have best practices and comply with them, have they published their policies and procedures for peer review, and so on. But what they don't really address is: have these certificate authorities been issuing certificates that ended up being used for malicious purposes? So I decided that might be a really good goal. They're trying to attack it from the front end: are the root CA's policies and procedures such that we think they might be doing a good job? I'm trying to detect it from the back end: after the certificates are actually out in the wild, which ones have been used for malicious purposes? Then we could potentially rank the certificate authorities on whether they have issued more or less than their share of certificates that ended up being used for
evil. Although I was the lead on this project, I was not the only person working on it. Another nice thing about working for SURGe is that we are very collaborative, both within our own team and with other people inside Splunk, and even, as you'll see a little bit later, folks outside of our own organization. I just want to give a shout out to the three people who helped me the most on this; they were part of the project team: Michael Berkin, Kelsey Bourne, and Philipp Drieger. Philipp has excellent fashion-forward jacket style, but he is also one of our lead data scientists. Kelsey was also a data scientist directly attached to SURGe, although she's no longer with Splunk. And Michael was a systems engineer with a lot of hands-on expertise deploying Splunk architectures really quickly. So it was nice to have all three of them advise me on this project and help me out. In theory, what I was trying to do would be really simple. Uh-oh, what happened to my slides? [He's going to have to reconnect to Zoom because the recording's not happening right now. You can keep presenting, he's just going to get the recording going.] Okay, can I have my slides back? Okay, sorry, some recording issue. Anyway, as I was saying, in theory it should be really simple to do what I was trying to do: you download all the certificates on the internet, you rank them. [Recording in progress.] Each one of those, if you listen to it, is a project in itself, right? So when I say it should be pretty simple, that is the overall approach, but it's never that
simple. Some of you may not be familiar with the innards of an X.509 certificate, like the ones we use for websites. Don't worry, this is going to be super brief, but there are just a few pieces of key information in a typical certificate that I want to bring to your mind here. The first one is the subject: the name of the website the certificate belongs to. This is the certificate for www.suncommunityfcu.org.
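To make that concrete, here is a rough sketch in Python of the handful of certificate fields this project cares about, boiled down to a plain dict. The values are illustrative, modeled on the slide's example certificate; the issuing CA name in particular is a hypothetical placeholder, not read from the real certificate.

```python
# The pieces of an X.509 certificate this project cares about, as a plain dict.
# Field values are illustrative; the issuer name is a hypothetical example.
cert = {
    "subject_cn": "www.suncommunityfcu.org",   # the website the cert names
    "issuer_cn": "Example Issuing CA",         # CA that signed this cert (hypothetical)
    "root_cn": "DigiCert Global Root CA",      # top of the chain of trust
    "san": ["www.suncommunityfcu.org", "suncommunityfcu.org"],  # subject alternative names
}

def names_on_cert(cert):
    """Every DNS name the certificate is valid for: the subject plus all SANs."""
    return sorted({cert["subject_cn"], *cert["san"]})
```

When matching certificates against threat intelligence later on, it's this full set of names, not just the subject, that has to be compared.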
The next one is the issuing CA, that is, the certificate authority that actually created the www.suncommunityfcu.org certificate. Then there are the alternate names. SAN stands for subject alternative name, and it basically means you can take the same certificate and use it for other website names. You'll often see www.something.com as well as something.com, and maybe *.something.com; Google does this all the time. There will be multiple subject alternative names in a lot of these certificates, which means that when you're matching up, you not only have to match the subject but also all the alternate names. You may have heard this idea of the chain of trust. Take a look here at the bottom: this is the actual certificate authority for www.suncommunityfcu.org. There are something like 78,000 of these, and you don't actually want to ship a 78,000-entry database in all your browsers, so there's a hierarchy of certificate authorities. In this case there's a DigiCert Global Root CA, and the root CAs are what actually ship with your browsers. The idea is you have a list of root CAs that you trust, and I'm sure most of you know about certificate verification. The point is that we need to look at not only the issuing certificate authority but also the root certificate authority, because we want to look at both levels to see what's going on. Technically, a lot of certificates will have multiple levels of certificate authorities, not just a root and an issuer, but everyone has at least a root and an issuer, so we chose those two levels. And there's that root CA, sorry. Now, certificate validation: like I said, most of us probably know a little bit about this. Your web browser gets a certificate from the server, it looks at the server's name, and it checks that they match: this is supposed to be www.suncommunityfcu.org, and this is the certificate for
www.suncommunityfcu.org. That CA is somebody you trust in your database, and all of the other signatures matched up, so you trust it. That's not the only thing, but that's what most of us know about certificate validation. There are some things they do to prevent certain scenarios. One scenario is if the site owner experiences a break-in, or they're tricked into compromising their key, so that the secret key for their certificate might be compromised. They can notify their issuing CA, which will probably add it to a certificate revocation list or some similar thing; there's an online protocol for that as well. Basically it says: even though all the signatures matched, no longer trust this certificate, because it has been compromised. So that's also a key part of certificate validation: checking, most commonly, the certificate revocation list. But what happens if the certificate authority loses control of their keys? This has actually happened in the past, with some fairly high-profile breaches and security incidents. One of the big ones that comes to mind is in 2011, the certificate authority Comodo: a bunch of valid, signed, unrevoked certificates were found being used by threat actors in the wild. What had happened was the threat actors were able to manipulate Comodo's backend. I don't know if they actually got access to the keys themselves, but they were able to manipulate the backend to cause the signatures to be generated, so they could forge their own certificates that looked valid but really were not. There was nothing built into the revocation or validation protocols to guard against that scenario. So in about 2012, I think it was, some Google researchers came out and said: what if we just publish a list of all of the certificates that a certificate authority has ever signed? Not only that, the list will be very large, but we'll give you a very fast API that you can easily look up. When you get a certificate, and all the signatures match, and it's not on a certificate revocation list, you can additionally check really quickly with what they call the certificate transparency log. You can look at a CTL and just ask it: was this certificate actually created by a legitimate CA process? If yes, you're good; if no, something's wrong. There are a whole bunch of different certificate transparency logs operated by different organizations that have some kind of interest in making the internet run right. Google runs a number of them, a lot of the big root CA organizations do, and there are even some other private
companies, like internet service providers, that have their own. There are a ton of them. So, here, I have to give away some prizes while we're on the subject of certificate transparency. I have two prizes, and if you give me the correct answer you get your choice. First, I have a lockpicking set, some simple lockpicks from SouthOrd, and a nice little lockpicking manual. On the other hand, I have the Bluenomicon, a book that my team at Splunk published earlier this year. It's a compendium of chapters about network defense by lots of different industry-recognized professionals, including one I wrote about the origins of the Pyramid of Pain. So if you can answer my trivia question, I will let you choose between these two fabulous prizes. I said that certificate transparency came out in 2012, and it says on the slide that Chrome has required certificate transparency for all websites since 2018. But in what year did Chrome start supporting CTL validation as an option? Sometime between 2012 and 2018, so there should be at least six of you raising your hands. Yes, it was 2015, congratulations. Which one would you like? Excellent choice, sir. [Laughter] Okay. So in 2015 Chrome started supporting that, I think just for extended validation certificates, the super secure ones, but in 2018 they made it a requirement for every certificate. Meaning, and this is important to us and especially to my project, that all of a sudden, starting in 2018, no Chrome user could access an SSL website unless its certificate was in one of the certificate transparency logs. Meaning the certificate transparency logs, starting in 2018, basically had a complete copy of all certificates for all websites, because no one wanted to be the website that Chrome users couldn't go to. Most browsers today support it. Firefox, I think, is still the biggest one that does not; I'm not
sure why, but most of the other big ones do: Safari of course, and Edge, which is based on Chrome. Another key thing is that these logs are all API-driven and public. You don't have to sign up; anybody can query them, and there are a number of API endpoints, one of which is "give me this certificate" or "give me this range of certificates by ID number," which will become important later. So what I did was use certificate transparency logs to download all of the certificates from 2021 and 2022. There were a few other years mixed in there, just the way the logs are, but it's primarily 2021 and 2022. I'll talk a little bit later about the challenges I faced on that, but basically, once I got all those, I found that there were a little bit over five billion certificates. So, five billion secure websites. Only websites: if you had a secure email server or anything like that, they don't really do certificate transparency; it's only web browsers and web servers. So, five billion unique website certificates in the data set, just a shade under 500 (497 to be exact) unique root certificate authorities, and 78,000 issuing CAs. I queried 15 different certificate transparency logs, because I basically just chose the built-in set that Google Chrome came with,
because everything was going to be in there if they wanted Chrome users to be able to look at it. Just as a fun thing, Philipp helped me with some of the large-scale visualizations here. I visualized the relationships between the issuing CAs and the root CAs, totally skipping over any intermediate CAs, because we are only concerned with the issuers and the roots. As you'd probably expect, most of these, all these little dots here, are kind of self-contained little universes, each with probably one root CA and somewhere between one and a few different issuing CAs attached to that root CA. But there were some bigger things. You can see this big thing here: this is DigiCert's ecosystem, because they bought, I think it was Symantec and someone else I'm forgetting at the moment, around 2017. All of those certificate authorities then became cross-signed, and some of the issuers eventually changed from their original Symantec root CA to a DigiCert CA. So there's a little bit of structure there, but most of them look kind of like this; maybe the numbers will be a little different. Right in the middle is the root CA. This particular one happens to be Deutsche Telekom, and Deutsche Telekom as the root CA is used by a number of different issuers. It turns out most of these are German universities, not all of them, but most, and they're issuing certificates for their own web servers and probably also their own students and some other things. In this case, and I want to say this carefully so I don't give you the wrong impression, the colors of these dots have to do with the relative risk of each of those CAs, that is, how many certificates they have issued that turned out to be used for bad things, but only relative to their peers. At this level it's a lot of noise, so the colors don't really mean much. These white ones don't mean they're objectively bad; they just give you a little bit of confidence that they're not all the same. There's a little bit of variation. Zooming in on the other big one: this one is actually, oh gosh, I'm blanking on the name... it's Sectigo, sorry. This is Sectigo. They bought Comodo and a couple of other CAs around 2018, I want to say. So again you can see they cross-signed a bunch of their root certificate authorities to make sure they would all still function, and all of the existing issued certificates, some of which were valid for multiple years, would still continue to work. Some of the issuers during that time switched from their old, let's say Comodo, root CA to the new Sectigo root CA, or something like that. So there are a couple of big structures in that data, and almost all of them have to do with mergers and acquisitions. It's kind of neat to see how the business landscape had affected the technical issuing of certificates and their certificate authorities. Now, for the second piece: we had all the data from the websites.
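Pulling certificates out of a CT log, the step all of this data came from, looks roughly like the following. This is a minimal sketch of the RFC 6962 v1 read API in Python; the log URL is a placeholder, and real logs cap how many entries a single get-entries call may return, so the batch size here is an assumption.

```python
import json
import urllib.request

def get_tree_size(log_url):
    """Ask the log for its signed tree head; tree_size is the number of entries it holds."""
    with urllib.request.urlopen(f"{log_url}/ct/v1/get-sth") as resp:
        return json.load(resp)["tree_size"]

def entry_ranges(tree_size, batch=256):
    """Split [0, tree_size) into inclusive (start, end) windows for get-entries calls."""
    for start in range(0, tree_size, batch):
        yield start, min(start + batch, tree_size) - 1

def get_entries_url(log_url, start, end):
    """URL for one batch of raw certificate entries, fetched by ID range."""
    return f"{log_url}/ct/v1/get-entries?start={start}&end={end}"
```

Farming those (start, end) windows out to many parallel workers is essentially what the bulk download of all fifteen logs came down to.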
the other thing is I needed some threat intelligence I basically needed to know a third-party view of what was bad in my data one of the things I mentioned earlier was that I really enjoy how surge is very collaborative even outside of Splunk with other security researchers or other security teams um we actually partnered with seven different other organizations that specialized in some kind of thread intelligence and we were able to take in uh actually the total amount of thread intelligence we took in was about 300 and by memory I say about 365 million uh indicators mostly domains there was like literally five uh certificate individual marked certificates uh but most of it was by
domain name uh from seven different providers all over the world because we wanted to counter any Geographic bias the largest was uh domain tools so we actually were we had a really close partnership with them so big shout outs to all of these organizations domain tools group IB the Yahoo paranoids you know internal security just for Yahoo but they have a very good threat intelligence database uh first if you're not familiar uh first is the Forum of incident response and security teams it's an very international organization and one of the things that they do is they maintain their own thread intelligence platform for their use of their members and they were nice enough to let us use it because I'm a I am a
member of first and then uh also J PT CC uh I I can't really say this was like a partnership because they actually publish a lot of their data so they actually have a GitHub where they published uh where they publish all of the data that they've seen primarily about fishing and spam but uh we contacted them and said hey is it cool and they were like that would be great that would be awesome you have permission please use it uh and so big shout outs to domain tools group IBS Yahoo first and JP SE CC because we have no thread intelligence data we would not have been able to do this project without our in Intel providers we even
had a couple that declined to be named so if you notice there's only five on the slide and I said seven you didn't catch me in a lie they just they did not want to be um publicly revealed so now we have five billion certificates and like 180 million pieces of thread Intel data mostly domains this was like a big data set we did have some challenges it I can't say it was super easy the first challenge of course was just getting all of that data actually getting the thread intelligence data was not too bad I just called up some people and if they agreed they we ended up like sending they sent me csvs or gave me API
access or whatever. But it was the certificate transparency logs that took me a few months to download. They're publicly available through a well-documented API, and most of the logs are very performant, so you can query them and get quick responses. It just turns out that downloading literally every certificate is not really well supported in tooling. In 2017, remember, only a few years after CT logs became popular but before they became required, there was an attempt at creating a tool that would download everything from a CTL, but the CTL sizes just exploded. So I ended up having to take that and create my own version of it with basically super-parallelism. I had a small 10-node cluster, and each of those nodes ran one of the downloaders based on this existing code. It was called Axeman, I think, I want to say from CaliDog; it's available on GitHub, but it was single-threaded, so I had to make it multi-threaded. We ended up having about 100 download threads going at one time, and it still took a couple of months to download all that data. And downloading and parsing it, matching the subjects and the alternate names to the threat intelligence, turned out not to be quite so easy either. First of all, the alternate names have to be parsed out. We ingested all of this into Splunk, of course, so I wrote some fairly nice-looking SPL that could do all this for us and match against any of the names that appeared in each certificate. But some of them were still a little bit difficult, especially when it came to wildcard names. You could have *.splunk.com, which would be valid for any website in the splunk.com domain, no matter how many subdomains it had, and it wasn't clear what a fair way to match wildcards was. I still won't say that I've totally solved that; what I really did was not match wildcards. The biggest problem, though, was less of a technical issue and more of a, I don't know, statistics issue. Some issuers were really tiny, with only one or two or ten certificates in my data set. Some issuers were huge, like a billion. Who wants to guess, for the lockpick set, who had the billion? It was Let's Encrypt. Well, actually, the root CA is not called Let's Encrypt, it's the Internet Security Research Group, but yes, it was Let's Encrypt. Let's Encrypt is by far the big dog in certificate authorities. It was actually interesting: one of the hypotheses from our research partners was that Let's Encrypt might
turn out to be one of the worst. Let's find out if that's actually true. My first attempt at this kind of analysis was, let's call it, the naive attempt, and I mean that in both senses: it's the simplest way, and also "oh, it's so cute, you thought that would work." What I did was basically put them all into one big ranking, for each root and for each issuer; I did these in two separate batches. This is the roots: for each root I counted the total number of certificates I had that rolled up to that root CA, then the total number of those that were present in my threat intelligence feeds, and basically just made a percentage that said: this is the percentage of certificates for that root CA that were involved in malicious activity. Because comparing the percentages is not necessarily the easiest thing, I also computed something called a z-score. Anybody familiar with z-scores? Maybe not. A z-score is basically the number of standard deviations away from the mean that you are. So we took all of the percentages and said: here are all the actual percentage values, what's the mean percentage, and what's one standard deviation away from that? We usually use z-scores for outlier detection, and a typical threshold is anything with a z-score of more than three or less than minus three, that is, plus or minus three standard deviations. So in our case anything more than three standard deviations above would have been a risky outlier, because its percentage was higher, and anything below minus three would have been a trusty outlier. So I just ranked them, and you can kind of see, I know this is a little bit of an eye chart, so I apologize, but you can see the z-scores over here. Remember our threshold was three, so there are actually three outliers. The first one, the Internet Security Research Group, I just mentioned: they
are the organization that runs Let's Encrypt. But there are also other ones, like GoDaddy, and this one called eMudhra, which I wasn't really familiar with before I did this project. But look at that: the ISRG, the top one, has what, a 6.3 z-score? Super bad, right? But it only has two total certs, and one of them was risky. GoDaddy, on the other hand, had a whole bunch of certs, and a whole bunch of them were risky, but it scored lower. Still an outlier, but it seemed like this was not a fair comparison. Same thing with the issuing CAs, except here there were a lot of outliers. This is only the top ten, I think, and there were even more outliers below this, so there were a lot of them. But it still has the same problem: most of these only had fewer than ten certificates, with the exception of TrustOcean. TrustOcean had 10,000, most of which were bad. Still interesting data, but not a fair comparison, because the populations for each CA varied so widely. Here's what it looks like when you plot it. On the bottom, these are all the individual root CAs; I didn't bother naming them, because who could read all that? And here on the top, this is the number of certificates that rolled up into each root CA. If you read this closely, which I understand you may not be able to do, this is a log scale, not a linear scale. I had to do that because on a linear scale it just looked like one straight bar and then nothing. The logarithmic scale on the Y axis exaggerates the small Y values a little so you can actually see them, because otherwise you really wouldn't be able to see much. So this looks like a gentle slope, but it is actually not a gentle slope, it is a sharp drop, meaning that there are a few root certificate authorities that have a lot of certificates and a lot of root certificate authorities that have very few. I feel like that scene in The Lord of the Rings where Bilbo's at his party saying "I like half of you twice as much as..." however it goes, when I say this, but there's a sharp drop. I did the same thing for the issuing CAs, again on a logarithmic scale: an even sharper drop. Using this information about the distributions, I decided the fair way would be to group them into buckets, which I called tiers: tier one, tier two, tier three, tier
four. Tier one for the root CAs is everything that had 10 million or more certificates; tier two, a little smaller, one million to 10 million; tier three, 100,000 to one million; and tier four was everything else, really the long tail. When you do that, you are basically comparing each certificate authority to its peers in terms of the number of certificates, and it becomes a lot more fair. In tier one, it turns out there were only 13 tier-one root CAs, so all of them are here on the screen, and you can see that there are no real outliers. There are some differences, some negative numbers all the way up to like a 2.37, but our outlier threshold, chosen somewhat arbitrarily as the conventional three, means all of these roots are roughly equal. Which makes sense, because these are the biggest players in the market: they presumably have the resources to do the right thing, but they're also afraid of competition from their peers, so nobody wants to risk being significantly worse than the others. So it made sense that there was really good quality. None of them are particularly risky or even particularly trustworthy; they're all roughly the same. When you get to tier two, it's kind of the same story. The top one here is, what is that, 1.9? I can't even read it. So no outliers in the risky or trusty directions, and again the same for tier three. But tier four is where it gets interesting. Check out some of these. The top one here, again, is the Internet Security Research Group; it only had two certificates, so even though it had a giant z-score I considered it noise and kind of excluded it. The actual interesting ones are the ones with some significant number of certificates, and now we get four of these that are more risky; they're risky outliers. I can't say they're objectively risky, but they are more risky than their peers. You'll see other ones here, like, maybe if
oops. So you might see things like this one here, the top one, eMudhra Technologies. We saw them earlier; they claim to be the largest SSL certificate provider in India. I take their word for it; I'm not familiar with them. If you go down, there's a similar one, E-Tugra, which is from Turkey. The interesting ones, I think, start with SSL.com. Anybody familiar with SSL.com? It's actually a legit business, a real business, but it has significantly more risky certificates than most of its peers, and I don't know for sure why, because it's not like I can call up the threat actors and go, hey Vlad, can you tell me why you chose SSL.com? But when I tried to explain this, or come up with some theories, one of the things I did was look at their website. At SSL.com, for I think it was $10 a month or something, you can request an unlimited number of 90-day certificates, trusted by all the browsers, and they even have an API. Now, to me, high automation, low cost, and universal trust sounds like a recipe for threat actors to target them as their SSL provider. I don't know if that's really the reason, but it seems logical. The biggest surprise on here, though, was this one: look at that, Google Trust Services. It actually is Google. What the heck is Google doing on the list of riskiest certificate authorities? Well, it turns out that if you run your own websites on Google Cloud, or on their user-provided content pages, and you want an SSL certificate, this is the default choice of provider. You can get one wherever you want, but if you just click the button, this is what will sign your certificate. So in that sense it's mostly user-generated and user-provided content, and it also kind of makes sense: a threat actor throwing up a page on Google, probably with stolen credit card information, wanting it all to work with the minimum level of effort, clicking the button and letting the defaults go through, seems likely. In tier four we didn't really have any particularly trusted outliers; we had a few not-trusted ones, but nothing particularly trusted. We did the same thing, breaking them up into tiers, for the issuing CAs, with different levels: tier one, the largest issuers, was 100,000 certificates or more, all the way down to tier four, which was less than a
thousand in tier one we did have a couple of outliers you can see we had GoDaddy I wasn't sure why GoDaddy showed up here but the closest I could come was maybe it's because I mean they're not particularly cheap or particularly like automatable or anything but maybe because they're are also one of the largest hosting providers in the world it made sense that again people thread actors that might be using GoDaddy would use godaddy's certificate Authority uh I could kind of see that and uh a similar thing here with um uh zero SSL kind of Highly automatable a little bit less expensive than typical so maybe again that's one of the threat actors there maybe that's why they're using it
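The tier-and-outlier comparison being described can be sketched roughly like this. Everything here is illustrative: the issuer names and counts are made up, the talk only specifies the tier-one (100,000+) and tier-four (under 1,000) cutoffs, and the z-score threshold is an arbitrary stand-in for however the actual analysis drew the line.

```python
from statistics import mean, stdev

# Hypothetical (issuer, total certs, certs matching threat intel) rows;
# the real data came from certificate transparency plus threat-intel matching.
issuers = [
    ("BigCA-1", 500_000, 50),
    ("BigCA-2", 400_000, 45),
    ("BigCA-3", 300_000, 2_400),  # unusually risky for its volume
    ("BigCA-4", 250_000, 30),
    ("BigCA-5", 150_000, 20),
]

def tier(total):
    # Tier 1 is 100,000+ certs and tier 4 is under 1,000, per the talk;
    # the tier 2/3 cutoffs here are assumptions.
    if total >= 100_000:
        return 1
    if total >= 10_000:
        return 2
    if total >= 1_000:
        return 3
    return 4

# Group issuers by tier and compare each issuer's risky-cert *rate* only
# against its own tier, so small CAs aren't judged against the giants.
by_tier = {}
for name, total, risky in issuers:
    by_tier.setdefault(tier(total), []).append((name, risky / total))

outliers = []
for t, members in sorted(by_tier.items()):
    rates = [rate for _, rate in members]
    if len(rates) < 2:
        continue  # can't compute a spread from a single sample
    mu, sigma = mean(rates), stdev(rates)
    for name, rate in members:
        z = (rate - mu) / sigma if sigma else 0.0
        if z > 1.5:  # arbitrary cutoff for "notably worse than peers"
            outliers.append((t, name, round(z, 1)))

print(outliers)  # [(1, 'BigCA-3', 1.8)]
```

The key design point the talk keeps coming back to is that "risky" only means anything relative to an issuer's own volume tier, which is exactly why the huge, noisy tier four gets excluded.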
Again, no really trustworthy outliers in tier one. In tier two, the risky issuers were really just the one, TrustOcean Encryption365. TrustOcean is a Chinese certificate authority, and I don't really know what to make of it. When I created this presentation, their website was not responding at all, and apparently, according to DomainTools, it hadn't responded for some time. They have some GitHub repos out there, but those haven't been updated since 2019, so my assumption was that they went out of business before I could check up on their procedures and see why they might be a risky issuer. However, I just checked this afternoon while I was prepping for this talk, and I see that their website now does respond, but it's not the same domain as before. It seems like maybe they got acquired, or somebody took over, I don't know exactly what happened, but their GitHub repos and their website now point to some new domain, and that domain is working. So I'm not sure what to say about TrustOcean. I can't even say that I think they went out of business anymore; now I can just shrug and say I don't know. I wasn't able to come up with a theory because I just didn't have enough information. No particularly trustworthy outliers on tier two. For tier three, we had a bunch of outliers, some of
which you can see here: TrustSafe, there's that eMudhra Technologies again, another DigiCert, RapidSSL again. I think RapidSSL is high automation, perhaps. There's another eMudhra down here, and both of their issuers being on here probably goes a little way toward explaining why we saw them earlier at the root CA level. So there's all kinds of stuff, but again, on tier three, no trustworthy outliers. We're starting to see a pattern: nobody's particularly better than their peers, but there are a few that are notably worse than their peers. When we come to tier four, though, there's just literally so much, like around 72,000 of the 78,000 issuers actually came in at tier four, and there's so much noise in there that even though these z-score numbers look huge, the total numbers are really small. So we just ended up considering all of these to be noise, and we excluded the tier four issuers from our calculations. So, conclusion. Who's still awake? All right, mission accomplished. To sum it all up: I looked at all these things at both the issuing CA and the root CA level, and we identified four outlier root CAs in the risky direction and ten outlier issuing CAs in the risky direction. I'm not saying that all of those are inherently evil and bad and you should never trust their certificates. On blocking certificate authorities, people are like, okay,
great, I got a list of risky CAs, I'm just going to block them, and then I will have no risk anymore. Blocking them is probably not really warranted, even though that might be our first thought. That's because we're not likely to see most of those certificates; they're typically not in the highest tiers, they're in the lower echelons for the roots. And there is no way to block issuing CAs: you can't just remove an issuing CA from your trust store, because they're not in the trust store anyway. What you can do, though, if you're not already doing it, is use a tool like Zeek, or something similar, that will observe all the TLS transactions and log the details of the certificates. Because, remember, the certificates can't be encrypted, at least prior to TLS 1.3, since the exchange of certificates is part of setting up the encryption, so they're all in plain text (binary, but plain). So you can actually log them, and Zeek will do this pretty much out of the box and will give you a lot of details, including the certificate authorities. And you can use that certificate authority data to do risk-based alerting, or data enrichment for incident response or threat hunting, or something like that, if only you had the risk rankings for all of the certificate authorities. Ah, some things that we didn't get around to that I would have liked to have tried,
and maybe if someone wants to take this to the next level, they can also try: a little bit longer certificate history, longer threat-intel history. We could have tracked variations over time, rather than taking a snapshot, which would have given us a better view of who was actually doing worse or better, so that might have been nice. I did mention that we could have done better on the name matching. The biggest one, though, is that I really tried hard to get our threat-intel partners to do this work again on a regular basis, because I did it as a snapshot in time, and it gets stale. So if you're going to use it for data enrichment, or risk-based alerting, or something, you need fresh data. Weirdly, none of the threat-intel providers seemed that interested. A couple of them were like, "That's really cool, I have wished for this," but we couldn't make a business case out of it. So if this sounds interesting to you, maybe talk to your threat-intel providers and ask them for it. Some next steps for you. First of all, remember I said that if only you had that ranking data, you could do something with it? Well, we actually made it available. It is in a GitHub repo, at this highlighted URL here, and it actually is
the top 10,000 issuers out of the 72,000. You might be like, why didn't you give me all the rest of them? But that's really all you need, because it falls off into the really long tail after that, so 10,000 has pretty much everything you're going to need, and all of the root CAs are ranked in there as well. So you could download those. They're a few months old by now, they were fresh in May, but still useful. So you could actually go home, start logging the TLS data, pull down the risk rankings, and then maybe try out some threat hunting with data enrichment, like, hey, are any of my users going to sites signed by any of these issuers or root CAs? Maybe that alone would not be enough information to make an alert or something like that, but if you have a whole pile of data that you're trying to winnow through, maybe that will be one useful data enrichment. And again, longer term, it would be great if we could all beg our CTI vendors to provide those risk rankings on a regular basis, so we could make sure they stay useful. And with that, I guess I have a couple of minutes for questions, if there are any, if anybody's still awake. I had some, I think I saw your hand first,
sir. So, the question is: did the threat intelligence differentiate between somebody who created a website specifically for evil versus somebody who compromised an existing website and used it for evil? Right, no, none of that was in the data. I think that's just because the CTI providers probably don't know the answer. Other questions?
yes
Yeah, so this question was, if you're a website owner, is there a way for you to prevent your legitimate users from falling for, like, impersonation websites with signed certificates that look like you but are not actually you? Again, no, not that I'm aware of. It would be nice, though, wouldn't it? That would solve so many problems. Okay, well, thank you very much. I hope you've enjoyed it, and go download my risk rankings.
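To make the "log TLS with Zeek, then enrich with the rankings" suggestion from the talk concrete, here is a rough sketch. It assumes Zeek is writing ssl.log as JSON lines with an `issuer` distinguished-name field (which Zeek's ssl.log does carry for observed server certificates), and it assumes a rankings CSV with hypothetical `issuer_cn` and `zscore` columns; the published repo's actual format may differ.

```python
import csv
import json

def load_rankings(path):
    """Map issuer common name -> risk z-score from a rankings CSV
    with assumed columns issuer_cn,zscore."""
    with open(path, newline="") as f:
        return {row["issuer_cn"]: float(row["zscore"]) for row in csv.DictReader(f)}

def issuer_cn(dn):
    """Pull the CN component out of a DN string like 'CN=R3,O=Example,C=US'.
    Naive comma split; good enough for a sketch, not for DNs with escaped commas."""
    for part in dn.split(","):
        key, _, value = part.strip().partition("=")
        if key == "CN":
            return value
    return dn

def enrich(ssl_log_path, rankings):
    """Yield each TLS connection record with its issuing CA's risk score attached.
    Unranked issuers get 0.0 (i.e., not a known outlier)."""
    with open(ssl_log_path) as f:
        for line in f:
            rec = json.loads(line)
            dn = rec.get("issuer")
            if dn:
                rec["issuer_risk"] = rankings.get(issuer_cn(dn), 0.0)
                yield rec
```

As the talk says, a high `issuer_risk` value probably isn't an alert by itself, but it works as one enrichment field for risk-based alerting or for winnowing down a pile of connections while threat hunting.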