Data, Agents And OSINT: You Don't Know What You Don't Know - Ryan Reeves & Zara Perumal

BSides Bristol · 2025 · 37:38 · 45 views · Published 2025-01

Speakers

Ryan Reeves, Zara Perumal

Tags

Style: Talk

Mentioned in this talk

Platforms

GitHub

Service

ChatGPT

Transcript [en]

Hi everyone, can you hear me okay in the back? Are we good? Okay, awesome. Well, I'm Zara, I'm here with Ryan, and we're here to talk to you about OSINT, threat intelligence, AI, and digital agents. Before we get into it, I wanted to ask all of you: how many of you have used OSINT in the last year? Okay, I don't believe that, because we've all used Google search, so probably all of you. How many of you have integrated AI into your workflows in some way in the last year? Cool, a little bit. How about how many of you are a little bit skeptical

about how many people are trying to force you to use AI in your workflows in the last year? Cool, awesome, perfect. So that's what we want to talk to you about today: our experience using AI agents, or agentic workflows, for OSINT. Today I'm going to talk to you on the OSINT side: what it is, why we find it useful for threat intelligence, and why we find it very painful or not useful in many ways. I'll do some case studies of where we've seen it in the wild, in our work but also citing a lot of other awesome security researchers and their work, talking

about why we thought AI agents and OSINT went together. Then Ryan will take over and show you some cool demos of real agents and talk about how they work and how we're building them. All right, to get started: OSINT and threat intelligence. So why threat intelligence? I feel like this is all we need to know. Usually in blue team work we spend a lot of time thinking about our systems: what do we build, defending in depth, secure by design. But the point of threat intelligence is that you want to know very deeply how the adversaries are attacking you, and so

it's just understanding them in the wild. OSINT is a fun way to do this. So what is OSINT? It's a jargony word for publicly available data, and it means kind of everything that surrounds us: things on the web like Google search, things on social media, things on the dark web or closed communities like some of these criminal Telegram channels, and also some weirder sources, like satellite data or trade data. All of it is considered OSINT, basically everything that's outside your corporate network. Where is it really cool? Some of the reasons I think it's cool: one, it gives you early

indicators. If you think about everything that ends up in the news, sort of definitionally there's someone out there talking about it before it ends up in the news. Sometimes this is a security researcher talking on Infosec Exchange or security Twitter about a vulnerability; in other cases it's local news picking up some natural disaster that impacts trade. So there are some really early indicators out there. Another way I think it's really interesting is that it gives you this full picture, this common operating picture, of a threat actor, a vulnerability, or malware, because there are so many perspectives. You think about

social media perspectives versus blogs versus social media in other countries; it gives you kind of a 360-degree view of what this threat actor is and what they're up to. What this means for a security organization is you get lead time on mitigation: sometimes a couple of days, or maybe weeks if you're lucky, of heads-up notice on something that's evolving, so you can do something about it before it makes it to your organization. Okay, so those sound great, but what's the problem with OSINT? First of all, there's a credibility question. If you look at this BreachForums example, just because someone is

talking about you on BreachForums doesn't mean it's real; it probably isn't. That's the big thing: if you see something on Twitter, or you see someone posting a breach about your company, your first question is, is this even real at all? Who is the person posting it? Do they have a track record of successfully posting leaked data in the past, or posting about real incidents, or are they some new user who's just scamming other people on scam forums? The next question is noise. We just said that OSINT is every piece of data in the world, so great, but what are the parts that you care

about? I just threw in some screenshots, but there's the CVE feed, so there are all the CVEs that you have to care about, and then there's every security blog about some threat actor doing something. In threat intelligence, people love to talk about the big ones like Mustang Panda and Cozy Bear, but does that even matter for your company? In many cases, no. The other feed of data is things like security Twitter or social media. So there's a huge noise problem: if there are a few pieces of

relevant data out there but you're digging through hundreds of thousands or millions of things to find them, the return on investment isn't always great. The third question is actionability. Here we're going into fraud Telegram, and if you're on Telegram, almost anywhere you go there's bad stuff that you'll find. But then the question is, can you do anything about it? If you can't take it down, or enforce, or inform your security posture, then aside from personal interest there's not really a reason to look at it. So that's one of the other challenges:

can you turn it into a hard signature or something to take down, or do something with the data? All right, so now transitioning into OSINT in the wild. I will go lightly here: I'm not a Scattered Spider expert, and I'm sure some of you are, so I'll tread carefully, but I want to talk about Scattered Spider and the Tyler Buchanan doxxing. For those who aren't familiar, Scattered Spider is part of a group called The Com. It's generally younger cyber fraud criminals, in the ages of 18 to 22. They often focus on things like SIM swapping, and notably for us, they messed up our hacker

conference, DEF CON. If you look at the timeline of events, the screenshots are about Caesars Palace pulling the contract with DEF CON in Vegas; I'm not sure how many of you were also at DEF CON this year. The timeline of this incident was basically: Scattered Spider, this group that does social engineering, SIM swapping, stuff like that, launched a social engineering attack on the casinos in August of last year. They successfully got into the systems. They did not plug USB devices into TVs, as the hotels in Vegas seem to think, but they did get their way into the systems, and then they were able to

partner with ALPHV, and they used ransomware to successfully shut down operations completely for Caesars Palace and MGM Grand. It was a tragedy for these companies: they were completely out of operations, they had people running up to hotel rooms to physically see if they were occupied, and they were losing millions of dollars a day. I think Caesars Palace actually paid the ransom. So it was a really big issue; the hotels were completely shut down. Then time goes on, we're in January now, and a small part of Scattered Spider, Noah

Urban, was arrested in Florida. He was part of a SIM swapping gang, and part of the reason he was arrested is that he was involved in violence-as-a-service: he was a victim of violence-as-a-service operations from another SIM swapping gang, and so his data was sort of in a hospital system, but that's a whole story. Then Caesars Palace drops DEF CON, and then Tyler Buchanan is arrested. There's a lot that went on behind the arrest, and I'm told law enforcement was also tracking this before, but one of the notable things about this arrest was that he was doxxed: someone on Pastebin, like a rival person,

posted his information, meaning that not only could you tie his identity to his fraud channels as a threat researcher or as a law enforcement agency, but also any other person who was mad at him, including these violence-as-a-service rival SIM swapping gangs, could then go and compromise his identity. So this is one example of a trend: sometimes these criminals will dox each other, and sometimes there are other, more exotic ways of finding information on them, through leaked data like leaked passwords and stuff like that. If I extend this example a little bit more, you see that when you see

talk about Scattered Spider in the news, you see talk about his involvement with the SIM swapping gang and the violence-as-a-service operation, and you see that there's this tie between the SIM swapping gangs, the more sophisticated ransomware attacks, and some of these more physical, real-world harms. That's always interesting for me, because you think about digital space as very sanitized, like we're just involved in cyber warfare, but it does have these physical and real-world implications. All right, so extending this example a little bit more, you see that this is kind of the ecosystem of this incident and some of

the overlaps between things like Scattered Spider, their partnership with ALPHV, the different victims that they've had, and on the other side the violence-as-a-service operations and the overlaps between the two sets of victims. So this is just a fun graph, but mostly the point here is that the overlaps between different entities in the cyber ecosystem are very fuzzy. Sometimes you see more sophisticated actors with less sophisticated ones across different platforms, and when tracing these relationships, learning about one thing can often help you find something more exotic. So now we get into fraud, and I want to talk about fraud because it's something that we look at a

lot, because there's just so much chatter about it. You see something like this everywhere on Telegram: you can find stolen accounts. This is someone advertising a stolen account, and the problem these people have is that because they're anonymous and they scam each other, no one trusts them, so they have to provide some credibility that they actually know what they're talking about if they're selling a stolen account. So here they're saying, here's an example of our stolen account, and this is in a traditional fraud method, stolen account type channel. If we show something a little bit more exotic, you know,

we go into an example of a SIM swap. Here you see people talking about how they can do the walk and get you access to someone's account via SIM swap. The point I wanted to make here: this was a very recent post, from August 19th, from Star Fraud chat, which ties to the example we just talked about, since Star Fraud is another name for Scattered Spider. And this is very easy to find; you can find it on Google. So there's this kind of overlap between this and the more ransomware stuff. Getting into some other examples of OSINT in the wild, another thing is that there's some

interesting stuff with vulnerabilities. There are a lot of really cool papers published about this, and Follow the Blue Bird is one of our favorites. When you look at vulnerabilities, NVD is the classic place to start: it's enriched, it's processed, they have analysts looking at it, and they can match it to affected software. Twitter is another OSINT place to look. As you can see in this paper, and a couple of other people have measured it, you see a two-to-three-day lead time from Twitter to fully processed NVD entries. So it can give you a heads-up, but then the classic question is, if you're actioning on

something, do you really want to tell your boss or your team, "I'm doing this because I saw it on Twitter"? So there's this actionability and reliability question with some of the Twitter data, but it can provide this really cool leading indicator. Another one is breaches. Like we talked about, the hacker forums talk a lot about breach data, and so you can learn about things like your stolen credentials; there's a lot of interesting stuff there, and then things like Naas API, the npm breach. And then one of the fun ones we wanted to get into, and this leads more to Ryan's cool agents that he's going to

show you, is IOCs and TTPs. When we look at understanding threat actors, so much of it is in these cool security blogs where people go into the details and talk about the entire attack chain. Sometimes you also see people sharing more of these tactical indicators in the security blogs. I think this is a little bit small, but you can see IOCs here. You can also see phishing URLs from the CrowdStrike blue screen of death incident, with people sharing them to tell other people these are scam URLs. You also see things like crypto wallets, so for things like rug pull scams you see people share

a lot of these in the wild. I think this is a really interesting space, where this crowdsourced knowledge can sometimes provide a lot of interesting intelligence in addition to more traditional indicator feeds. So with that, I'm jumping into: what does OSINT have to do with agents, and where can agents help OSINT? We've tried a bunch of different ways, and we think they help in a lot of these spaces. One is data collection: agentic workflows, things that can make these small decisions, can expand your collections and allow you to collect from more spaces, and more illicit spaces. So you can think

about traditional spidering or scraping; they can do a smarter version of that. The next is finding relevant data: helping with that needle-in-the-haystack problem, can they go search on your behalf to find what you care about? The third, which is one of the demos, so this is more exciting, is that they can do research and write: look into this event and dig deeper to see if there's anything actionable that you can do something with. The fourth is summarizing and reporting in real time; that's another demo, so keep an eye out. And the fifth is integrating your workstream with other tools that you might have. So if

you know that after you find an IP you want to put it into VirusTotal, agents can help with that type of problem. And with that, I'll hand it over to Ryan.

So my first question is, what exactly are agentic systems? It's really a new technology, probably been around about two years now, so if you ask 10 different people you'll probably get 10 different answers. This is what I consider to be agentic systems: a software program that to some degree mimics reasoning. I don't want to go as far as to say that it does do reasoning, but it does something akin to mimicking reasoning. It's able to perform tasks

autonomously in its environment, utilizing a predefined set of tools, and allows you to carry out varying goals depending on your needs. They're used for a number of different things; there have been a number of papers out there detailing what agents can be used for. Some of the most common are writing detailed reports, scientific research, even things such as creating new alloys for materials science purposes, and even hacking. There was just recently a paper talking about how these agents were designed to do web application pen testing, so you have a cross-site scripting agent, a SQLi agent, all to do this web pen testing, and even

hacking or assessments of source code. So a little bit about design patterns. There are four key components for developing these kinds of systems. The first is the prompting strategies. The three most common that you'll see in the wild are ReAct, reflection, and Language Agent Tree Search. All these strategies boil down to basically three things: you observe, act, and evaluate, and these prompting strategies are lesser or greater sophistications of those actions. The second component is tool use. You want to equip these agents with as many tools as possible. These tools allow you to interface with your data sets, so you have a tool that's

basically designed for SQL queries, to interact with your database, or web queries. There's a fantastic free tool out there called Tavily that allows you to supply it with a search query, and it'll return a few snippets from the web in reference to that query, along with the sources and even follow-up questions, so you can delve deeper into the research topic. We have tools to execute code, even tools that'll send email for you, or Slack messages. The third component is planning: you want these agents to be able to break down a fairly generic task into subtasks. And the fourth is mixture of agents: rather than have one general

agent, oftentimes we'll develop more specialized agents that can handle very obscure tasks and use tools that are developed for very specific use cases. There are definitely a lot of pros and cons to using these systems. The pros are increased speed to insight: if you're acquiring vast volumes of data, these agents can help out a lot with distilling that data down to key insights. They can automate time-consuming workloads: if you have a general workflow that you do every day that takes one or two hours, you can offload that workflow to an agent and have it help you achieve your task a little quicker. Also creative problem solving: there's new research into a

kind of human-in-the-loop technique where you utilize an agent to give feedback on certain tasks. And there are definitely a number of cons. High cost is one of them: if you don't have your own model and your own infrastructure, you're making calls out to another LLM, and running complex tasks can take hundreds or thousands of calls to these LLMs to complete. Hallucinations: most of you are probably familiar with this one, but it is a problem when the outputs aren't grounded in ground truth. Task confusion is another one: we'll give it a task, say, give me all the IOCs or TTPs for the Black Suit

ransomware, and it'll return to you the history of men's black suits; or for Cactus ransomware, it'll return something about cactuses in Arizona. So it does get confused a little bit. There's also new research that came out recently talking about how assigning a model a role isn't really effective for these newer models, so grounding it in some role as a computer security expert doesn't actually have any effect on the outcomes of the model. And these models tend to overgeneralize: a lot of times these agents will spit out very tepid results that don't really have many details and probably aren't

valuable for your use case. So I'd like to show you all two demos. The first demo is a research reporting agent. This is based off an open source project you can find on GitHub called GPT Researcher. It has a fantastic UI, and it's really easy to modify the back end for your own agent. You can see here on the right in the diagram: basically you give this agent a task, it'll then create a plan based off that task, it'll break down the initial task into different subtasks, and it'll hand them off to the researcher. The researcher will then generate different search queries

based off of those unique tasks. It'll then utilize different tools to complete those queries and those different tasks, aggregate them all together, and send it off to a publisher agent, which will then generate a final report for you. There are a number of different types of reporting agents out there. Another popular one is called STORM, which is able to generate Wikipedia-style articles for you from a very generic

request. This is GPT Researcher in action. Basically I supplied a task to find me information on the Cactus ransomware group. The UI is all GPT Researcher; the back end is an agent more tailored to compiling reports on these different threat actors. You see here it's breaking down the initial request into different tasks and topics for the report, and utilizing very specific tools for each topic. These kinds of specialized agents are a little bit better for crafting reports when you want very detailed, very specific information in those reports, rather than more generic information about a certain threat actor. You see here it's compiling the report;

it's streaming to you. You see there's quite a bit of information here; you see some really detailed IOCs and TTPs being spit out by the agent itself, very specific IOCs like registry keys being used, or mutexes, things of that nature. It's a little bit better for those use cases. You see here it's pretty

detailed. I want to show you a quick diagram of what a specialized agent looks like. It's a little different from the GPT Researcher style, where they have a more general agent to do more general tasks. There are two key differences here. Once you supply the task, you want an agent that will pull down a very specific kind of outline to help guide it in its research process. This is really important for requiring the agent to come back with certain information that you find useful. So I would recommend utilizing the people within your organization who have specialized skill sets to help you craft these outlines to

help guide these agents in their research process. The next thing you probably want to do is grab some task context to help guide it in its research, so it doesn't go off topic, go off the rails, and give you random information that isn't relevant to the initial request. It's going to break down those topics in the outline and create search queries, however many you specified, and then for each search query it's going to utilize a very specific tool to retrieve information for that query itself. Then it's going to come back and pass it off to a

publishing agent, which will then draft a final report for you. This is my favorite agent. I'll talk about the WebVoyager agent. It's based off a paper called WebVoyager, in which they create an agent that interacts with the browser in real time; I think they use Playwright, and that's what I use as well. It allows them to give an agent a very simple task, like "book me at this restaurant", and it'll go out and do it for you. LangChain has a fantastic repo on GitHub; there are a lot of different notebooks out there that you can utilize for these different types of agents, and

WebVoyager is one of them. It's pretty easy to modify and create your own. It uses a technique called Set-of-Mark prompting: it'll take a screenshot of the browser itself and inject JavaScript into the browser; the JavaScript will then create bounding boxes around certain elements within the browser that are actionable, such as buttons, hyperlinks, lists, things of that nature. It'll create bounding boxes accompanied by a number, one through n, and then it'll pass that image to the model, which will then send back which action to take based off the specific task. The demo I'm going to show you is a WebVoyager agent that's been modified to

go and extract WalletConnect URIs from these different crypto drainer websites out there. Crypto drainers are a pretty prominent scam in the crypto world. Basically they try to get unwitting users to go to the site with promises of massive rewards in the crypto environment, and try to get them to connect their wallets; in one or two clicks their wallets will be completely drained. So I wanted to create an agent that will automatically go to these drainer sites and download the WalletConnect URI, which is then used to extract the malicious wallets from those, which can then be fed into CTI feeds such as Chainabuse, which

users can then go to and check whether or not the wallets they're transacting with are malicious. Here's the agent in action. You give it a very simple task, like "go to this website and copy the WalletConnect URI to the clipboard", and it'll pull up the website for you. I just supply the domain, and it's going to navigate that website strictly through computer vision. It's not using any HTML or CSS selectors; it's strictly using that Set-of-Mark prompting to navigate through the website and try to complete that task. You see here it pulls up the WalletConnect URI, hits the copy-to-clipboard button,

downloads that URI, and we then pass that off to another program to then extract the wallet from that URI.
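The Set-of-Mark step described in the demo can be sketched roughly as follows. This is a minimal illustration, not the WebVoyager or LangChain code: the `Element` type, the `ACTIONABLE` set, the reply format ("Click [2]"), and all function names are assumptions made up for this sketch, and the browser and screenshot parts are left out so that only the numbering and the mapping back from the model's answer are shown.

```python
import re
from dataclasses import dataclass

# Element kinds treated as actionable; this set is an assumption for the sketch.
ACTIONABLE = {"button", "link", "input", "select"}


@dataclass
class Element:
    kind: str    # e.g. "button", "link", "text"
    label: str   # visible text of the element
    box: tuple   # (x, y, width, height) in page pixels


def assign_marks(elements):
    """Number the actionable elements 1..n in reading order (top to bottom, left to right)."""
    actionable = [e for e in elements if e.kind in ACTIONABLE]
    actionable.sort(key=lambda e: (e.box[1], e.box[0]))
    return {i: e for i, e in enumerate(actionable, start=1)}


def describe_marks(marks):
    """Text listing that could accompany the annotated screenshot."""
    return "\n".join(f"[{i}] {e.kind}: {e.label}" for i, e in marks.items())


def parse_action(reply, marks):
    """Map a model reply such as 'Click [2]' back to the marked element, or None."""
    m = re.search(r"(click|type|scroll)\s*\[(\d+)\]", reply, re.IGNORECASE)
    if not m:
        return None
    idx = int(m.group(2))
    if idx not in marks:
        return None
    return m.group(1).lower(), marks[idx]


def click_point(element):
    """Centre of the bounding box, where a click would be sent."""
    x, y, w, h = element.box
    return (x + w / 2, y + h / 2)


# Hypothetical page contents, for illustration only.
page = [
    Element("text", "Connect your wallet", (40, 20, 300, 30)),
    Element("button", "WalletConnect", (40, 80, 160, 40)),
    Element("button", "Copy to clipboard", (220, 80, 160, 40)),
    Element("link", "Terms", (40, 400, 60, 20)),
]
marks = assign_marks(page)
print(describe_marks(marks))
print(parse_action("Click [2]", marks))
```

In a real agent the bounding boxes would come from JavaScript injected into the page, and the numbered boxes would be drawn onto the screenshot before it is sent to the model; only the number in the reply is trusted, which is why no HTML or CSS selectors are needed.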

So a few takeaways. Agents are highly versatile; you're pretty much going to be limited by your imagination, and there are a lot of things you can do with them. The more data and tools you have, the better: they're only as good as the tools that you supply, and the more data they have to complete their task, the more likely they are to do that. They help optimize the models' innate reasoning abilities; they can compound on those reasoning abilities depending on which prompting strategies you

use. That's all I have. Thanks so

much. Thank you so much, Ryan. I hope you're willing to answer questions; you can ask them as well. Questions? Yes, wait for the mic. Okay.

Have you tried any of the different sorts of agent frameworks against any evaluations, like giving them a CTF challenge to see how well they're doing? No, sir, I have not personally done that myself. Yeah, I haven't either, but I will now. I believe there's research that's been done on that, though, trying to evaluate how well agents perform against certain security tasks. I know there's also the hot take on all the PoC work, using the agents to make PoCs, and then it's debated how much they're memorizing the published PoCs, but I know a lot of people are doing that too, with mixed results.

Hi there, a couple of questions.

First one: which models are you using for the agents? So it's kind of a mixture of models. I think for most of the demos here we used GPT-4o. We've also used some of the open source models on Groq, which we quite like: Llama, Mixtral, Gemma, things of that nature as well. Anything local? Do you run any local? We have, yeah, in some cases. Mixtral and Llama were the ones we tried. Usually we found RAG does well enough to not do fine-tuning, but there are a couple of cases, like understanding some of the fraud jargon, that's like a translation problem

where we fine-tune. But oddly enough, doing RAG, or putting context in the prompt, seems to do pretty well with the public models. And do any of the processes have verification loops in? Yeah, that's always a challenge: verification, trying to ascertain where the information came from. But there are certain strategies to try to mitigate the hallucinations. Two that I've used in the past: one is ChainPoll, there's a paper out there, which uses a mixture of LLMs to determine whether or not the output was grounded in the actual context. Another one I've used in the past has been actually telling LLMs to spit out the information from which the

generated text came, then doing a fuzzy match on that text to determine whether or not it was actually in the context. One thought to add there: it's not real explainability. There was a talk earlier that was an awesome talk on explainability. But we do a lot of using the source of truth, like what information are you pulling it from: using semantic search to grab something, produce an answer, then verify it against that data. So those are some of the verification tricks, but it's not true explainable stuff. Any more questions?
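The fuzzy-match check mentioned in this answer could look something like the sketch below. It is an assumption-laden illustration rather than the speakers' implementation: the function names, the word-window approach, the 0.85 threshold, and the sample strings (including the registry key and mutex names) are all made up here. It uses only `difflib.SequenceMatcher` from the Python standard library.

```python
from difflib import SequenceMatcher


def best_match_ratio(quote, context):
    """Best similarity between the quoted passage and any same-length word window of the context."""
    q = quote.lower().split()
    c = context.lower().split()
    if not q or not c:
        return 0.0
    n = len(q)
    target = " ".join(q)
    best = 0.0
    # Slide a window of n words across the context and keep the best score.
    for start in range(max(1, len(c) - n + 1)):
        window = " ".join(c[start:start + n])
        ratio = SequenceMatcher(None, target, window, autojunk=False).ratio()
        best = max(best, ratio)
    return best


def is_grounded(quote, context, threshold=0.85):
    """Treat the quote as grounded if some window of the context is close enough to it."""
    return best_match_ratio(quote, context) >= threshold


# Made-up retrieved context and two passages a model might claim to be quoting.
context = (
    r"The sample creates the registry key HKCU\Software\ExampleCorp and a mutex "
    r"named Global\demo_mutex before encrypting files."
)
quote_ok = r"creates the registry key HKCU\Software\ExampleCorp and a mutex"
quote_bad = "the actor exfiltrates data over DNS to a remote server"

print(is_grounded(quote_ok, context))
print(is_grounded(quote_bad, context))
```

The threshold is a tuning knob: a lower value tolerates more paraphrasing by the model but lets more ungrounded text through, so it is worth calibrating against reports that have been checked by hand.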

So you mentioned you use GPT-4o, for example. How do you deal with the ethics of certain models and what data they've been trained on, that side of things? Yeah, that's a good question. We both feel strongly on that, also because, to your point, a lot of the models are trained on Reddit and Twitter, and sometimes there's worse; Stable Diffusion was trained on CSAM data. So there is kind of an inherent bias: we trained them on the road rage version of the internet and then are confused when they do crazy things. So I think it's something that we try

as much as we can to mitigate. I think we haven't found perfect solutions to avoid some of those questions, but as much as we can, we try to test different corpora of data, see how it performs on different languages, and try to measure. So right now we're trying to mitigate that, but we don't have a perfect solution. Do we have more

questions? What's the reliability and accuracy of the generated report if I use it for IR engagements, especially for actors that are very new and have generic naming? Yeah, there is always a challenge with disambiguation. If you have APT41, for example, and somebody else goes by name XY, trying to group all those together into a single actual threat actor is challenging. You can do some evaluation to a certain degree to try to make sure that the data you're pulling from is actually relevant to the initial query, but it is a challenge to always make sure that the outputs are relevant to the initial

query. I'm trying to think of a specific technique. I think a ballpark is maybe, I don't know, 99%; very rarely do you see an issue. This is not a real mitigation, but a business mitigation or practical mitigation, is that we try to use it as a fancy Google search: we'll cite the source where it came from, so we're trying to use it as a faster way for someone to get the result, and then tell the person to now go click the link, look at the source, and do more verification. That helps with some of the

hallucinations and stuff like that. But yeah, it's honestly something we're always trying to push

better.

Thank you for your talk. I just wanted to ask, where can a beginner search for resources and courses and stuff on OSINT that they can learn from? Yeah, let me think about it; I'll follow up, honestly I'll just chat to you after. I think there are some OSINT exchanges, like Bellingcat. Depending on the type of OSINT you like, Bellingcat is really good. If you like fraud Telegram like I do, there's Eric Huber, or David Maimon, who does a lot of check fraud, on LinkedIn; they post a lot there. Also Reddit: I think r/OSINT is quite good on just places to start;

they even kind of link to some of the dark web stuff, depending on how dark you want to go. But I also think it's a fun one. I feel like with OSINT you can just give yourself any challenge, like "who is this Instagram person who just messaged me?", and just kind of hunting is honestly the fun way for me to get started. I have a few for you all as well. There are two Substacks I really like: Open Source Intelligence Newsletter is a good one, and Memetic Warfare is another one I really enjoy. And then there's also a podcast called NeedleStack, which has a

different open source Intelligence resour on every other week it's Prett good resource so youve called out hallucinations um do you build an additional verification model afterwards and do you catch most of

the hallucinations?

Do you... I can, yeah. I think one thing we found with that is, so, for example, we used to do a lot of supply chain intelligence. You can ask a very generic question to an LLM, like, what is the effect of a [inaudible] that fell out of the sky on the coal supply chain? And it'll be like, "the effect is very large, because a lot of coal is transported from India to the US via planes," and you're like, wait, that doesn't make any sense. So a lot of times, when you give it very open-ended questions, it is horrible.

But if you ask very categorical questions, like "extract the IOC-looking things from this post", or "categorize this as a fraud method or not a fraud method", or as an offer of a stolen account, when you give it these more binary questions, it does a little bit better, and it kind of limits the space of the craziness. It also gives you a place to look: we're surfacing this because we think it's relevant, because it looks like a fraud method, or we've extracted this based on the original text. Some of those techniques are kind of better for some of the verification stages, like, are you sure this IOC is in the text, that kind of thing.

Appreciate the presentation. My question is, have you looked at trying to turn the tool into a specialized tool for, like, incident response platforms, using dark web data specifically, turning it into a specialization tool?

I don't think so. We would love to talk to you about that, though. Yeah, we do fraud intelligence for our day job, so we do some of that, like fancy alerting, but yeah, I'd love to know more about what you're thinking.

Thank

you. I just wanted to ask, what's the legality for someone who's, I guess, looking to get intelligence data from places like the dark web, obviously on forums, that sort of thing? Could you be in trouble for doing something like that during your investigations?

There are a lot of legal... we spend a lot on lawyers. I think some of the hurdles are, first of all, if you're ever purchasing something, one gotcha is you have to make sure you're not purchasing from a sanctioned entity. If it's a small amount, like if you're spending $5 with some fraud person, it's okay, but if you're doing anything that's proportional to, like, you could be funding their crime, that's a gotcha.

Another big thing is images. I guess downloading stuff in general is scary from the security perspective, but also, if you find anything abusive, there are a lot of requirements on reporting. And then there's also the scraping stuff, although I think right now it's a little bit more, like, if you're collecting from listed spaces with a reason, it's a bit more clear. But it is something there are a lot of lawyer questions for

sure.

Thank you for the talk. Before, you talked about, like, "oh, who is this person from Instagram, let me search for it" and everything. If I try to do some hunting by myself, I may not be able to access some information, because, let's say, on Facebook your privacy settings are up and I cannot see anything about you. Is your system, with its different tools, still able to gather some information about you that I wouldn't be able to? And in that case, how do you consider protecting privacy?

Yeah, that's a really good question. I'd love your thoughts here, but I think there are two things we think about. One, there's a general question of how sure are you that this person or thing is involved in crime. That is one thing, but it's a very fuzzy thing, because sometimes it's really clear, like someone's saying "I'm hiring someone to go brick or go attack someone else", and you're like, okay, great, yeah, you seem very sketchy. But other times it's a lot more of a fuzzy boundary, and I completely agree with you: tracking individuals for normal things is super creepy, and we don't want to do it.

But in between, one of the things we think about is: is it data that people have posted, like, did you intentionally put it out into the world, or is it something that's hidden? The difference is, if you post on Twitter, it's available, versus if it's hidden location data from your dating app, you obviously are probably not trying to put that out in the world. So that's one of the ways we've started to draw the lines. But honestly, for us it's still kind of early, and we try to go case by case, making sure we're not going to the dark side.

[Music] We had time for just one last question, and that is done. Zara, Ryan, thank you so much. That was great. Round of applause.
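
Editor's sketch, not from the talk: the hallucination answer above describes asking a model narrow questions ("extract the IOC-looking things from this post"), then checking that each extracted IOC really is in the original text and citing the source so a person can click through. A minimal Python sketch of that check might look like the following. The regex patterns, the `Finding` type, and the function names are all hypothetical illustrations, and the patterns are deliberately crude; nothing here is the speakers' actual pipeline.

```python
import re
from dataclasses import dataclass

# Hypothetical, deliberately simple patterns for "IOC-looking things".
IOC_PATTERNS = {
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "domain": re.compile(r"\b(?:[a-z0-9-]+\.)+(?:com|net|org|io|ru)\b", re.IGNORECASE),
}


@dataclass(frozen=True)
class Finding:
    value: str        # the IOC string the model claimed to have extracted
    kind: str         # pattern name, or "unverified" if not found in the source
    source_url: str   # where the analyst should click to double-check
    verified: bool    # True only if the value occurs in the source text


def extract_candidates(text: str) -> dict[str, str]:
    """Map every IOC-looking string found in the text to its pattern name."""
    found: dict[str, str] = {}
    for kind, pattern in IOC_PATTERNS.items():
        for match in pattern.findall(text):
            found.setdefault(match, kind)
    return found


def verify_claims(claimed: list[str], source_text: str, source_url: str) -> list[Finding]:
    """Keep the source link on every finding and flag claims absent from the text."""
    candidates = extract_candidates(source_text)
    findings = []
    for raw in claimed:
        value = raw.strip()
        kind = candidates.get(value)
        findings.append(Finding(value, kind or "unverified", source_url, kind is not None))
    return findings


# Example with made-up data: the last "IOC" is not in the post, so it is flagged.
post = "Selling fresh logs, panel at bad-login.example.com, backup 203.0.113.7"
model_output = ["203.0.113.7", "bad-login.example.com", "198.51.100.99"]
for finding in verify_claims(model_output, post, "https://forum.example/post/1"):
    print(finding)
```

Matching against strings extracted from the source, rather than a plain substring test, avoids accepting a claimed `1.2.3.4` just because the post contains `11.2.3.45`. Unverified findings are kept rather than dropped so a reviewer can still see what the model claimed.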
