GT - Devising and detecting spear phishing

Name: GT - Devising and detecting spear phishing
Uploaded: 2024-09-04
Duration: 55 min 54 s
Description: Devising and detecting spear phishing using data scraping, large language models, and personalized spam filters (full title due to title character limitations) Ground Truth, Tue, Aug 6, 13:30 - Tue, Aug 6, 14:15 CDT We previously demonstrated how large language models (LLMs) excel at creating phis

BSides Las Vegas55:54299 viewsPublished 2024-09Watch on YouTube ↗

About this talk

Devising and detecting spear phishing using data scraping, large language models, and personalized spam filters (full title due to title character limitations) Ground Truth, Tue, Aug 6, 13:30 - Tue, Aug 6, 14:15 CDT We previously demonstrated how large language models (LLMs) excel at creating phishing emails (https://www.youtube.com/watch?v=yppjP4_4n40). Now, we continue our research by demonstrating how LLMs can be used to create a self-improving phishing bot that automates all five phases of phishing emails (collecting targets, collecting information about the targets, creating emails, sending emails, and validating the results). We evaluate the tool using a factorial approach, targeting 200 randomly selected participants recruited for the study. First, we compare the success rates (measured by pressing a link in an email) of our AI-phishing tool and phishing emails created by human experts. Then, we show how to use our tool to counter AI-enabled phishing bots by creating personalized spam filters and a digital footprint cleaner that helps users optimize the information they share online. We hypothesize that the emails created by our fully automated AI-phishing tool will yield a similar click-through rate as those created using human experts, while reducing the cost by up to 99%. We further hypothesize that the digital footprint cleaner and personalized spam filters will result in tangible security improvements at a minimal cost. People Arun Vishwanath Fred Heiding Simon Lermen

Show transcript [en]

perfect thank you everyone I'm just going to start my timer here so I hold you in a good time and yeah let's get started so again this is about a enhance fishing thank you so much for for the introduction my name is Fred heing and this is Simon I'm a research fellow in computer science over at Harvard focusing on a wide range of red teaming uh both iot and embedded devices and social engineering such as fishing last year I was presenting another study on AI fishing at blackhead talking about how large language models can automate fishing emails and this is a continuation of that work as we will see uh a little bit more about Simon just to

mention that as well he's an AI researchers focusing on a wide range of AI risk topics and uh lately a lot a lot of his work is on spear fishing we also receive U received counseling and mentoring from Aon wisha who's in the back of the room here and Bru schneer over at Harwood so I want to give a big thanks for them as well for being part of the project and yeah if you have any questions you you may stop me just during the presentation or there will be all the time we want afterwards as well so just priming you with a few questions that we're really interested in that overall guiding our and my

research that I do that I think is is really good to think about and one other thing is how can we use AI to yield more benefit to Defenders rather than attackers and this is actually really interesting couple of months ago I wrote an article in the Harvard Business review about how AI yields way more benefit to fishing attackers in the context of fishing the reason for that is is that you know there's a lot of different technologies that empowers things and AI really empowers things it makes it very easy to create good scalable high quality fishing emails but from the defender sides we can use AI to increase you know spam filters and so

forth but in the context of large language models and AI Technique we already have a lot of spam filters so the increase is is only small whereas for the attackers the increase is very very big so this is a fundamental question in my research how can we change this game to make sure that AI yields more benefit to Defenders than to attackers again in other technical domains you can really use it for booth sides attacking and defending because in the in technical defense you can just add an update and Patch all the systems but we can't patch humans that's why this is always a nuance and it's always a problem I'm also very interested in

how to quantify the risk of fishing I work a little bit with the business and poly schools as well as well over at Harvard to try to quantify how much should organizations pay for fishing defense is it generally underfunded or overfunded because these are quite important questions and a lot of people have a lot of opinions about fishing but it's quite difficult to get actual dollar values cuz what is the worth and what is the risks of fishing attacks and one thing we're also very interested in and happy to talk more about is just what is the purpose of academic fishing research um majority of the fishing research are being done by anti- fishing provider I think that's a little bit

biased CU these people people often want to sort of scale up the numbers and say that you should pay a lot for fishing defense and that does does make sense often but academic fishing research is quite important it's Al difficult because you need to pass a lot of Ethics reviews which takes a lot of time and makes it quite impossible to scale it that big so so this is uh this is something I'm also very happy to talk about anyway uh more down to the actual meat of the study uh again last year what we did with is sort of the fundamentales for this and for my future work is we compared human fishing experts with large language molds just

to see that which are best and we learned a bunch of things the the big takeaway is that language molds aren't yet this is 2023 language mods aren't yet as good as human experts if you combine them if you combine the language mod queries with some human expert input they get really really good so basically what we did was we automated email creation what we're doing this year is that we're taking it further and automating the entire fishing campaign attack chain so so that is finding participants finding information about the participants creating emails sending the emails then self-learning and improving the tool based on which people pressed the link and which people didn't press the link so it really is one step

further in this chain and we're implementing uh implementing it in a bunch of places we're talking with Harvard's it Department to test this on all all Harvard's faculty this is pretty important for us so if anyone has a company or someplace where where we can Implement our research we're very happy to collaborate because we want to get as much data as possible possible to scale this up pretty big um just a quick shout out as well to these artificial intelligence things that everyone is talking about it's a good fasis for this talk so large language models is an AI technique and they're often instantiated in chatbots and these are a bunch of different brands of chatbots so just I

think everyone knows this but it's good that you have this knowledge the icons down there are the ones that we use and it's for open the eye and Fric misol Google and Facebook it's not super important but it's good to know this difference chatbots langu models and AI so that was a quick introduction uh now I'm going to dive into the real study I'm going to talk a lot about how we automate the entire chain of fishing emails I'm going to show you some mitigation plans for how we can improve and solve this to try to make AI yield Defenders benefit I'm going to show a quick cost benefit analysis and then a demonstration of our

tool let's see if you can see this image here it's relatively good I think most people can see this this is just an overview and then I'm going to dive into to every part but so what does this tool do it's a python based tool for whatever that's work this used iterative query iterative searching through uh for Google we will probably uh expand it to other use other uh search engines too in the future I talked with another person about that yesterday but anyways we search for example Frederick hiding at Harvard so my name and one keyword doesn't need much at all and then we feed it to any language model of our choice we often use GPT different models

have different safeguards but we can we can try anyone we just added a local model as well that we can use and then it searches we use Google uh it usually uses two to five searches but it can search for 10 pages 20 Pages really as much information it is and the tool is quite smart IT learns itself when to saturate so it usually goes in and finds things such as personal websit social media LinkedIn is a great choice right company websites all these different profiles it understands how much information does it want before it lets go and again these API queries cost it's a very small cost but you have to pay a little bit for this so if you want to

scale this up to millions of people you can choose how granular information you want and how much information you collect but it usually creates a really good picture of the person after this we create a synthetic attacker profile so we scrape the target for example me then based on the target's information maybe I'm a computer science student then we then the tool itself creates an attacker profile that can be a synthetic professor who offers me a research internship gig or anything like that so based on the based on the attacking Target profile we match it with a sender that it could be real but it often times fake and then we create the email and

the email uses all this information a bunch of best practices and it usually really really good as we'll see just want to give you a high level overview of how that process looks uh so to dive into this a little bit deeper uh the pH one is about collecting participants information the tool can do this automatically which quite cool for example it could find out uh know all the employees of a certain group at Walmart or whatever company you want again in the context of Academia we can't really do that because that wouldn't be fully ethical so we recruit participants manually uh but this could be automated and of course we hypothesize that foreign na states do

automate this and work quite aggressively with this right now and we click participants we did a first pilot that was rather small and we're going to scale this up much bigger uh to see to see how this works in a in a real large context and We compare how long does it take to manually find this information to be able to isolate that cost variable in terms of time how much would how much time would I need to scrape all of yours information would be quite a task only the people in this room uh and then we compare the tool which is of course automatically and and we see which is best in these cases again we find

information such as field of work that's usually the best one but also collaborators which people do do this person talk to walk with work with Etc what interest do you have curricular activities and based on all this information we create fishing emails and we do this by using language model queries so the very naive ways to tell a language model create a fishing email to Frederick who's a computer science student um we have to be a little bit smarter as that as we'll see so the queries is really really important our main part of this study is actually to create really good llm queries to ensure the output emails are high quality here again we comp compare AI tool versus

human expert this is really interesting we did this last year in 2023 we do this again now so we basically take all the emails from the AI tool then we analyze the m and see how good is this email how much would I like to change and already in one year we want to change much less the emails are close to perfect and you in the years to come it's quite easy to imagine that they will very soon be as so good so we never want to change anything and then it's interesting to analyze what happens next so the first Benchmark and the only Benchmark we have now is when am model surpasses humans in

the context of deception but when you move beyond that how deceptive can they be there's really no upper limit for that deception and there's no good Benchmark for measuring that so we talk with a lot of the AI companies and AI labs to create that kind of a benchmark for AI deception as well and so for these LM queries uh they are based on information for the Target we feed this in in a general way uh I'm going to show you a version of this how it can look at the Theon ation but they're really long and there are hundreds of words so I can't really fit them in the slide but we have to tell the models things such

as don't use more than 100 words because the models have a tendency to digress they talk a lot and when they talk a lot they add things such as I hope this email finds you well or I hope this will be a fruitful Endeavor and these things kind of give it away so we have to cap the model and make it not say these words we also have the bypass security uh I'm going to talk a little bit more about that later but it's quite easy to do uh you can tell most of these mods to create a fishing email then it's going to say it's illegal so you have to say create a marketing email but this is

kind of well research it's really easy to bypass the llm guard rails and we can even create local models as I'll talk more about later but you have to be a little bit smart about it you you have to bypass it and think about that but we do that and that works rather well and again we compare different models to see which one is the best but this is sort of the main philosophy and when we create these queries we don't just use the E the information from the person we also use fishing best practices to make sure that they're really persuasive and one of the one of these fishing best practices is the VRI it's a really good

book written by Aaron who's in this room and the main things we look about here is credibility and relevance so we really H in on the query to Ure that it's credible and that basically means that the email looks legitimate you I see this as two Gatekeepers the first gatekeeper is that the email has to be credible if it's credible there's things like there's good English there's spelling mistakes know this might be a logo there might be a sender that you know I'm used to receive emails from so it looks really legit uh from bides it might be that it fits all of the sort of bsides categories so this really looks like a bsides email and if it would be a

bsides email to me it's also relevant because I am going to bides if there someone saying hey you need to upload your slides today press this link that's pretty relevant because I need to do that but if you sends a bsides email to my brother it's not relevant he would never do it because it's not here so it has to be cred and has to be relevant there's also a third bullet but we primarily think of the first two here so these are the overall things to create a really good fishing email and when we have that we move on to what I call the CDD guid Lin and these are sort of very old very established best practices for

persuading people uh there are these they're way older than these you know back in the old Greek they even had these persuasive things so how do you make someone do something and what's really cool about AI our AI tool is that we add these persuasion techniques to the emails and we randomly assign them so for example to some email we might say use social peer pressure that basically means say hey all your friends are going to this talk you should also go and for another email we might tell the language model to use Authority which can be saying hey your professor is telling you to do this or the police or the tax government is telling you to

do this so we have some authority figure and these are not necessarily Better or Worse they're just different so everyone is susceptible to different influence categories and we don't always know which we are susceptible to it can also change over time so maybe today I'm quite susceptible to scarcity and tomorrow I'll be susceptible to social peer pressure what's cool when we do this over a long long time and thousands of partipants we're going to see patterns that's where it gets really interesting and we're going to talk about that when we talk about the personalized spam filters which is a way to mitigate AI fishing we can see that all the salese on Mondays are really

susceptible to social proof but all the computer programmers on Tuesdays are really susceptible to Authority maybe that doesn't make sense but we're going to see patterns like that and that's super exciting because what we can do then is we can just add Flags in the email inboxes of people because people are super stressed we don't want to have overly aggressive spam filters because you need to get your emails but then we can see that hey fredi you're always falling for social peer pressure emails then we just flag the email saying hey this is a social peer pressure email you might want to press it it might not be bad right because for example I get a

lot of these marketing emails from the Red Cross for blood donations I want to press them because I want to do it but it's good to be aware of how are these people Fishers and marketing peoples and colleagues how are they using these techniques to persuade me most people just use them subconsciously but we always use them so making people aware of this is really interesting and we're going to find it out with our tool this is just a very quick codify version of how much Human Assistance emails need these numbers shouldn't be taken as gospel but are pretty cool so we just made a quick numbering screen where five is pretty bad and zero is

very good and we basically says how much do we want to change this email so last year we did it manually so we changed a bunch of the emails and then we saw how much better they were and as we saw in the results they were much better we look for some simple things right is this email credible if it's not credible and relevant then we change it and if it's not credible and relevant the email is really bad then it goes down right maybe it's credible but it's not relevant that was often the case with the language models last year or maybe the email it's pretty credible it's pretty relevant but we want to

tweak some some minor words or phrases and what we see last year the average change of 3.5 out of five so that means somewhere in between minor contextual change and change credibility relevancy so it was quite a lot like they weren't super independent the emails from the language models this year the average change that we want to do is 1.5 out of five which means somewhere between minor language changes and minor structure changes and we really needed to think right so when we looked at the emails that our tool created some of them was a zero like we couldn't possibly change it I don't know how to make this better some of them we did some small tweaks

but it's very easy to hypothesize that when we do this last next year it's going to be a z five and again what happens in 2030 like the only Benchmark we have right now is to bypass human deceptiveness like that's not the end goal right that's not as good as it can get so that's pretty cool we don't know how deceptive these models can be and I find that very fascinating it's just an example uh these are somewhat old so they're not the newest but we're going to show more examples later but it can be fun if you read the slides later you can read them as well this a pretty good email you know the

the model creates this random cyber ethics lab which is pretty close to ethics lab that we have at Harvard but this is just synthetic so hey Fredick we at Cyber ethics lab a hardware based group are working on a project related to ethical hacking and its potential impacts that's exactly what I do we thought you might find it intriguing we are looking to collaborate with individuals who share our interest in this to main check out our project details here could we discuss potential opportunities to collaborate and this is good it's a really good email right and if you do this on like a mass scale to 1 million people which you can do it's almost free like the APA ques are really

big so we could press play right now and send this out to 1 million people I I'm pretty sure we would get a lot of them right and that could cause like again this is an election year in the US there's a lot of stuff we don't want people to do and say this could C quite some nuisance and we have to realize that all the cyber crime groups on the world of course are going to work with this and of course they're going to use it so fishing emails are getting pretty pretty funky here's another example I'm not going to dwell on it but it's going to be in the slid so if you want to check

it out it can be cool to look at it's pretty similar a little bit shorter so then when you send emails out uh this is some classic best practices that we use uh we usually send them out in in the context of 10 if you scale this up to millions of people you have to be a little bit smart right you have to see are you sending this do you have to send them all out in one day there's a lot of there's a lot of good span filters right you can't just send 1 million emails out sometimes you can there a lot of different tools for this but as you scale this up you have to

work a little bit more with this and we have to avoid span filters we worked quite a lot about that different ways of finding legitimacy for the emails one way is just having domains that you had for a long time that adds legitimacy but there's a lot of best practices we usually follow all of them and one fun thing is that you also need to add some smart sort of bonus features one of them is that a lot of emails have a tendency to create previews these popup previews can really give a fishing email away so here's little example of how the link says you know how well would you rate this fishing attempt and I'm going to

talk more about that later but if someone presses a link in our study then they're taking the survey where we just ask them a couple of questions about the fish now obviously we don't want that preview to be shown because then it's very obvious that this a fishing email uh so we needed to add some tweaks to the email to the email client and the tool that we use to send these that's pretty easy to do and one sort of a meta investigation that we do when we create this tool to use do this automation is that we see like now we write it manually right we code it ourselves but could AI tool create the AI tool right

could it create itself and I've done some other research product on this other people do it as well AI coding is getting really good it couldn't really do this now but in a couple of years it didn't even need us to create the tool right the AI tool could create the AI tool and that's pretty cool then it would find but it's a little bit tricky because you need to find out these things like avoid the email pop ups yada y but pretty soon it's going to be even more autonomous last year we did mail Shimp I just want to say that because it was way easier we need send email with mail Shimp you just like get a lot of

credibility MailChimp have some good feature for this when we build our own tool you need to be a little bit trickier but obviously when they're a AI autonomous fishing Bots they're going to use their own tool so we have to investigate this anyhow this is just a throwback to the last year results I'm not going to dwell on them so long but the big takeaway is that the LM LMS were quite a bit worse wor than the human expert models if you combine the human expert models and the llm then the results were really really good but the average of the the human experts were best and that's to be expected again last year the language models that were big such

as the gpt3 and 4 they weren't even one year old they were rather new uh this year it's getting way and way better so this is a throwback what we're seeing this year in our pilot is a really high result 70% is quite insane from the AI automated tool we're going to skip this up really big so we'll see how this plays out we know thousands and 10 thousands of participants but I'm really excited I think it's going to hold up but we'll see so we track when they press a link and we also collect free text answers and this is a method of direct data collection uh it's also based from the week is link they have

some really good structures for how to measure fishing success in that book and by doing this we can see not just whether the person pressing or not but what they F and this is really interesting because we see things such as someone saying I F this emails was legitimate because it was a gift card from Starbucks and I usually go to Starbucks and other people say hey I didn't press this link because it was a gift Buck gift card from Starbucks and I hate free giveaways I always think that's false and that's pretty cool right because we see that there is no one siiz fit all answer we have to be personalized and it's really interesting

to learn that you know what one person think is super legitimate another person think is very suspicious I think that's quite interesting and I we use these things we have the target profile the center profile and the email persuasion type we analyze how all of these play together and how all the these add up to the final picture which is quite cool uh one word on Open Access models because this is rather important a lot of the protected models or the or the industrial models chat upt Claude uh Google have there mistel have there and so forth they have safety guard rails and we can bypass them as we see now but it's possible to imagine

that if you scale this up to 1 million people the apis might do some checks right they might see that hey like this person is doing 1 million email there's not a marketing institute there's some weird you know basement in Boston or whatever that that they could track that and we don't really want that you also have to upload your credit card it's not really Anonymous it's kind of beneficial from the hacker point of view if you can do this locally and anonymously and it's quite easy to do uh there's a lot of ways Simon did some other studies of how to jailbreak these models and there other research on as well it's quite easy to jailbreak you get it to your

local computer you don't need to use any of the you don't need to use any of the existing models like subscribing for the API or having your information with them so that's bad right and then you can also say create a fishing email you can create whatever you want because there no safety checks whatsoever so you can get this local to your computer uh there's some cool research for how to do this and that's pretty bad like we don't want that to be the case but I think it's pretty hard to remove that but it's not my field but there some really good other research on how to make sure it's impossible to jailbreak model and create

Open Access models but it's possible and that's something we have to be aware of because I assume that all the Cyber criminals will use this so a few words on how to uh mitigate Ai and what we can do to prot protect ourselves from this so I'm very excited about these personal personalized spam filters as I mentioned a few times already um and we analyze what users are susceptible to what types of emails and there are a lot of existing ways they've been trying to do this uh I think they didn't really get it for a lot of reasons and because it's tricky right there's so many variables here like if I press a fishing

email maybe the email was good maybe I was just stressed maybe it was a fluke who knows and in the sense I think we're also moving away from fishing a little bit we already see that there's a lot of deep fakes there's voice fishing there's all these types of different versions right and when the language models uses new types of deception in the future I think we can't really think about fishing emails as the only persuasive way to make people do stuff that's obviously not the case now even so it's really interesting to analyze how do people fall for this and to try to see these Trends because that's something we all I think need to be aware of like if

you ask people at the big AI Labs all of them know that there's no good way to protect against super intelligent deceptive AI models like how do we protect us as an AI model that thousand times smarter than us from deceiving us like we can there's no good answer to that and at least one it's not an answer but one guard rail is at least know what types of persuasion and my Mo susceptible to if you run this from a couple of years and I just learned okay I'm always falling for peer pressure or for reciprocity so if someone does something nice to me I always want to give that back that it's good to know

these things and the good thing is that it doesn't cost us anything right so often we see this fishing protection techniques that requires a lot of training and that training is boring and doesn't teach us anything or the spam filters that overly aggressive so we don't get the emails we really need to get but I think this is a pretty cool technique where we can just tell people what they most likely to fall for and that at least tells you something we can also use fishing detection with large language models I say AI but I mostly talk about large language models in this talk and that's a big U that's a good thing to say it's

a big delimitation but language models are pretty cool for fishing detection what should be said here is that language mods are just one way to use AI to detect fishing emails there's a lot of really good AI models that are not language models that get pretty high results like state of thee art are usually way above 95% sometimes about 99% so like we already have fishing detection it is an interesting field though because the fishing Emil gets better uh there's a lot of technical tools you can use the language model to analyze metadata and so forth that's pretty interesting but I'm going to show you some fishing detection examples on the next slide as well the last thing we

propose is a digital footprint cleaner that's pretty cool I'm not too excited about it but that what that means is to check the public information you have available about yourself then see do you really need to have this available can you remove it most people today have a way too big digital footprint like there's some stuff online that probably shouldn't be online and what we would like to do is to find a sweet spot here about information that you don't really need CU some information you need to have online I want to Market myself a little bit I want people to see what I work with I want people to see what I do that's quite valuable that can give me

collaborations it can make me work with people I don't want to be 100% Anonymous but there might be a sweet spot of information that is not Soo useful for me I don't really benefit from having it online but the attackers benefit a lot from it that's cool because we can see this when our study we see what type of information do the tool uses so what type of information is really valuable to the attackers and maybe we can flag that you know say this information you might want to remove and especially if it's an overlap with information that is really useful to the attacker and not useful to me well maybe at least we can

remove that so I think that's pretty good the reason why I'm not overly positive about this is that I think that even if you remove 90 or 95% of your online information it's the last 5% is probably enough to create a pretty good fishing email so I think that no matter how good you become at information cleaning it's always going to be enough information to create pretty targeted deceptive attacks but it's good to clean information anyway and we should work more with it a little more work about fishing detection uh we did we did an experiment last year and repeated that experiment this year when we used a bunch of language mold to detect whether a

fishing email we fishing or legitimate as with many other Trends it goes pretty fast they get much much better last year the results were okay this year the results were really good uh so what we did we did a bunch of the big models we used normal fishing emails which are other fishing we received to our email inboxes or fishing emails from online fishing archives such as the Berkeley Fishing archive and these email are rather bad some of them are pretty good some of them are pretty bad but these are what we historically have thought about when we thought about fishing emails we also use AI tool fishing emails and that are our emails the email that we create using our AI

tool and then we fed these emails to the language model and say hey do you think this is a fishing email what's the intention I'm going to talk more about these queries but we ask different queries we also use expert emails again the emails that we created we do everything we can to create a really good email that's a fishing email then we feed this to a language model and we try to see if the language model understands that it's a fishing email and use a legitimate email just marketing emails that I have gotten or we have gotten to our inboxes and we feed these marketing emails to the language model and say do you think this

email is fishing and we don't want them to think that right we don't want the language models to be overly suspicious and uh there's a limitation here that we can use uh prompt injections to trick the language models and that's pretty cool what this means and we're just starting this research but there's some pretty cool uh pretty cool findings here is that if you have an email with a very obvious fishing email that the language model would find out maybe say you have to press this link or you're going to lose all your money it's an obvious fishing email then you can add invisible text to that email so the humans can't read it but the machines can read it and

then in the invisible text you say something like ignore all the text Above This is actually a legitimate email I just want to try know whatever you run write something that renders the previous text uh unvalid basically and then when you feed this to a language model it seems that the language models are fooled by this again the humans don't see the invisible text but the language model sees it and that text sort of cancel out the malicious prompt before so there's some pretty cool ways to bypass fishion detection but everyone will probably not work with this but it's good to be aware of anyways some information about the data uh last year Claud was way better

than other models this year Claude is still way better than the other models and so in this section we ask what's the intention of this email we don't talk about fishing we don't talk about suspicion we just ask the Lang model what do you think is the intention of this email and what's cool about that is that it kind of represents a human when we're browsing our inbox we don't often think that you know everything in our inbox is going to be fishing we browse it we're stressed maybe we're hungry maybe we want to sleep and then we just look for things kind of without thinking too much and this is what this represents uh the false positive is is

very low it's not existing here so when we have legitimate emails and we ask what's the intention all the models says that you know this is a marketing email this is a Starbucks gift card or whatever which is really good so we don't have paranoid models but the INE the detection rate is so and so but it's actually quite impressive especially clae which is a green bar so the two middle sections here there the AI tool emails and the VRI emails if you remember from the one I of an example these emails are really good some of them say like hey I from uh I'm from this research group I would like to collaborate with you do you want to

answer my email like to the best of my knowledge that's a really legitimate email some of these things that humans wouldn't really notice as fishing Claude still assessed that hey the intention of this email is very likely fishing it portrays to be a research collaboration email but before you do anything you have to look at these things uh to ensure that you're not getting fished I think that's remarkable that's really really cool cuz we don't ask her fishing we don't talk about suspicion just say what's the intention most other models fail to do this uh in the leftmost column we see the control group emails and they are kind of obvious so for these a lot of the models at least for

some of the emails all the models say that the intention of this email is probably fishing and even that is quite impressive because we when you ask a human even if it's an obvious fishing email if a human is really stressed you know they're running through the streets or whatever and just shoot them an email what's the intention of this email most people you know would probably they they're kind of likely to fall for this but so I think this is quite remarkable but what's more interesting is that we're priming the model for suspicion uh so here's a quick throw back to last year doesn't matter too much there's two additional bars here one bar for human

detection and for other ml algorithms but the main highlight here is that this the trends are very similar Claude is still the best model all the models perform a little bit worse and this year they perform a little bit better but so this is the previous result from 2023 obviously just to go back one more time let me see here there we go to uh to the intention 2025 when we do this I kind of assume that these bars will be even higher right it Mak sense it goes up and it goes up so this was last year this is suspicion so here it's get pretty interesting here we ask the model a bunch of different queries we either ask

can you find anything suspicious with this email how suspicious would you rate this email on a scale of 1 to 10 and as we see the results are way higher that makes sense right again if you ask a human can you find anything suspicious about this email the human's going to look really hard right we we going to look at every possible thing and we're always going to find suspicion but when we ask that to a human it's really expensive in terms of production cost your time goes up a lot if you ask a person to find suspicious thing in 10 emails you're going to look at all everything for this email but if you

look at Claude which is a green bar here it maximizes all the fields but it also maximizes legitimate emails to negative 100% what that means is it CLA doesn't have one false positive so when we send it a legitimate email and we say can you find anything suspicious it all says no it's very unlikely this is a fishing email and that's fantastic right this is kind of thing you want so we're very impressed by this if you look at mistol which is the orange bar in the middle we see it says really good detection rates but it's also super paranoid because it almost said that every legitimate email was fishing and that's pretty bad like

we don't want the false positive that's a lot of problems some when you have an office inbox or whatever most people have once in their life had an important email they went to spam that's super annoying we don't want that that's pretty expensive in some cases but it seems that for some of the models when priming for suspicion the false positive doesn't increase but the accuracy increases tremendously I think so I think this is really cool I haven't seen too much other research on this which I think is quite strange but I think that priming fishing filters for suspition it seems great so I'm quite excited last year was pretty the same clae was a

little bit worse at detection and a little bit more paranoid most other models were pretty bad and I think that this this is again going to increase right 2023 2024 with the interesting with Claud is that I'm not sure how to Benchmark this further because we already fed it the most the best fishing emails we could imagine and it identified all of them so that's pretty good right that's that's a promising sign but we'll see how to test it even further a quick word about the economics of this that's actually quite important um I come from a very technical background so my I had some interesting lessons over the past years when I talk with businesses and with policy people

about this and everyone said it yeah your research is cool but you have to translate into growth and profitability the only thing we care about is how does your fishing research translate to the organization's growth and profitability like that's kind of interesting right for me that's very unintuitive like why don't you just care about this you can make your company more safer well what does it mean to me more safer how does that benefit the Bor members how does that benefit the shareholders and it's been quite interesting for me to think about and I collaborated with some economics researchers and we did like a lot of studies and there's a lot of information about this a white paper and

about this in a new white white paper as well uh these numbers are examples they shouldn't taken as gospel but they're they're pretty legitimately calculated and as we can see the main theme Here Right is if you look at the last slide which is or the last row which is the fully automated AI enhanced spear fishing emails they're insanely cheap to scale up and here we look at things such as query cost what is it time cost and what is production cost with a lot of different variables but manual spear fishing is traditionally expensive and again there's a lot of ways to calculate this like how much information you want to do you want to find you want to spend

10 minutes or one hour to find information it depends but the general trend is that it's incredibly cheap to create really good emails and the emails I showed you before and I'm about to show you soon I think they're really really good if you send this out to 1 million people I think you could change a couple of opinions I think you can make quite a lot of people do things they shouldn't do and when the price for that is becoming insanely low um things get interesting so that that's sort of the main theme of this I think it's really interesting we're working a lot more with various type of models of the fishing market and I think more research

in this is really needed to be able to translate the security research a lot of people here are doing to really say well what is it worth because then you can go to policymakers and say that well we actually see that you know we do some studies with North Korea and one of my collaborator did you know an estimation that North Korea gets 10% of their GDP through cyber crimes that's bigger than North Korea's cold trade which is insane right and if you combine that with the fact is get much cheaper launch fishing attacks which is a big part on North Korea's cber cyber crime groups well then they're probably get an even bigger chunk and this is interconnected right

it's rather complex so like these things are but it's important the focus on economic aspects I think that's super interesting a couple of next steps before I'm going to invite you for questions and show the demonstration you have a white paper coming up uh if anyone is interested in reading it just reach out to me and whenever it's published I'm going to let you know I'm also doing a very similar study as this but for for embedded devices so we use language models to hack iot devices um a couple of years ago I did a study hacking 22 iot devices uh I repeat all of these hacks using language models then we have a usefulness framework to

see how much do we save in terms of cost how much more powerful the attacks get it's pretty interesting if everyone is interested in that I'm super happy to chat more about it the big takeaway from the the hardware and software hacking is that language mods doesn't really make the attacks more powerful it doesn't really help me hack devices I couldn't hack on my own but it sa it saves some time some sometimes it saves a little bit of time which is good also tomorrow I'll be presenting at blackhead so if anyone is interested in hearing about National cyber security strategies and how this plays into a more policy oriented role you're very welcome to

comy Jasmine AE at 1120 and just reach out if you have any other questions a few takeaways and then we're going to show a demonstration of the tool it's becoming really really cheap to launch large scale fishing attacks this is the exact same thing as I said last year at at but now we have some numbers to actually show and I mean obviously it's going to get cheaper right it's going to get cheaper and more powerful so I think this is the trend that will continue priming models for suspicion is awesome for fishing detection I really encourage people just to try this by yourself you can like this can generate to other areas in fishing as well but

the language models are different from humans in this context it's not expensive to Prime them for suspicions like intuitively think we think it should be expensive but it's not that that's a really cool Discovery and personal fan filters are super exciting uh I'm really really excited about where that can help us too and I would also like to see more people working with that as repping said you're going to have to ask questions soon but I want to show you a quick demonstration just about how this tool Works um so it's quite fun to see uh yeah this slide is rather important of course if anyone want to find information uh I see some people taking photos I'm going to just

like stall I'm going to take a sip of water actually just for the sake of it uh oh okay let's see that's a good point I just heard inside information that the Wi-Fi is bad here so let's see where that brings us uh I'm going to try and if it doesn't work I'm going to use the hot spot so this is a manual inst instantiation of the tool right and this is what happens automatically but it's quite fun to see is relatively readable maybe we can zoom in a little bit but that's fine what we're going to do is going to write a name and I'm going to write for the sake with my name I'm

going to invite audience members to join later if they want to uh let's zoom in here for a little bit can I assum I can so my name is Frederick heing that's my full name uh you write some keyword this can be cyber security this can be affiliation or whatever and we can search for colleagues or profile or whatever we have but this is just how we search information what's more interesting is how this happening so I'm just going to start this then I'm going to go to my window here uh and this is quite fun so it goes rather quick let me boost up my lightning here we see a lot of stuff happening right it usually goes

to LinkedIn quickly because there's a lot of good information in LinkedIn and you know it's finds anything it finds it's important to say that Linkin LinkedIn banss scraping so you can't do this but obviously we do it anyway because there's a lot of ways to bypass the scraping and there's I met a lot of people who bypass social media scraping blocks uh Simon found some really smart ways to do this by analyzing Meta Meta data and other means but the important thing to know is that like a lot of people do the scraping and it really works even though it shouldn't work um it usually finds some personal we website later let's see here it congest

some information it goes to uh my my my horard profile here which is also a pretty good place and it saturates rather quickly because this is the saturated model and it finds some image it finds a lot of information this is fun right it's it's actually really good I sometimes use this um maybe maybe not when I need to find emails to people because it it finds quite good information what we do then uh I'm going to zoom in a little bit here again we have a bunch of different models uh now we use open AI model uh open AI has a bit worse safety guard rails than claw for example but we can use any model we

want here these are sender Alias names like sythetic send senders I'm not going to read because the time is running a little bit short but if you go to Grace's profile we actually have a profile for J Al's we see it like this is just a a imaginary person right who works at an AI lab but she has a pretty cool profile she's an MIT student from U PhD student from MIT and she does this and we use a template here and these templates are super interesting we have more information about them in the white paper if anyone is interested in LM quering quering I really recommend you to look at this but it's long that's a

big thing here it's really long and we see things such just new first we write a short story about this yada yada and this is actually important because this query has to be General right has to capture every different scenario we can do uh it's a lot of good information here they worked out it says things such as U there's a cap here limit the email to 100 words please don't begin email with phrases such as hope this email find you well a lot of good meet information again again I I can't have you read all this in 2 minutes I'm just going to create the fishing email for the SEC with yeah we see the Wi-Fi is

really bad I heard but I feel pretty positive I think we're going to make this work and let's see this is and then we have it and the 100w cap is really important so what we see is high Fric we're at the ex Tropic lab initiating a project that delivers in the Del into AI based cyber fits coincidentally your research is some domain CAU our attention we believe your expertise would be a good good fit press this link yeah like this is good right it's a really good email so when we do this again we're super exciting to scaling this up big you know tens of thousands millions of people I I think this could like this could do some

pretty large scale damage again like this is fishing units making pris link we could obviously tailor this to say hey based on this person political opin political experience you make them do something or make them take some action in that way yada yada so the cheapness and efficienc of doing it is quite astonishing after the after the talk I'm going to invite anyone who want to try this themselves to do it but I'm out of time so I'm going to stop here and thank you so much for for having me a big shout out to Simon and Aaron again and yeah thank you everyone and we invite for questions yeah again time you want time you want

time we can ask would anyone I would think it was pretty fun to do this at someone in the audience if someone would like me to scrape them I'm going to I'm going to purge database afterwards if you don't want that I guess they kind of make sense too but it would be fun to try this on someone that's okay will you do it I'll do it okay so I can type it in or you can type it in Gabriel g r i e l yeah that's B SS and if you choose a keyword for yourself maybe your company or whatever you want prob be something good like your dog okay um insurance it's a little bit tricky I

haven't tried it actually so we'll see what happens um that's exciting

Insurance he'll need to be soon uh yeah we can see here we can take a sneak peek on the information and again this is only publicly available information I can just find this anyway so this this is um this is nothing novel about about that

so yeah but yeah all so there's a lot of information I guess most of this is is is probably true and yeah ex this kind of LinkedIn stuff but the tricky thing now is that I haven't I have to figure out which profile to use so that's a little bit tricky I'm I'm probably going to go with with Wayne Smith here just to see what's happening he's a more General guy uh great s is an she's an AI researcher Wayne Smith is just a general person so we'll see I never done this with um 's my general fishing template I never done this with insurance one more time tget name is you oh dang I I messed up

well we got add some Wayne Smith and Gabriel and generald fishing to but good call good call let's see what happens it's just me

again let's see uh hi there we're working on something that might Pike your interest a new angle on Cyber risk management our research aim to enhance current strategies in the insurance industry we think it could be beneficial uh for your work at Liberty Mutual here's a link to the project detail any thoughts it's pretty good right it it's not a terrible email it's like I'm I'm not exactly sure how it should be otherwise but I mean that if you blast this out to a thousand people a few people might be

interested are you I'm going to send it to him later yeah no thank you so much for for joining and uh yeah thanks thanks anyone if anyone have a question I'm happy to take him yeah

than uh I'm curious when you're testing detections uh fishing detections against multiple models what does your architecture look like for doing that at scale say it one more time what does your architecture look like for when you're testing uh uh the detection side yeah uh like what what's your actual architecture look like for implementing that we so this test is quite simple we just feed the we feed the actual email to the different models and then we refresh the models we use the chat Bots we Ed the the real chat Bots and then we refresh the model awesome we we refresh refresh the chat Bots between each question but we just ask it to them so

it's it's a very simple architecture in this sense now we could we could scale it up and implement it in the tool but this is as simple as possible cuz we also wanted to make it very easy to generalize and do for other people cool cool thanks as so you showed the fishing um the suspicion models for detecting fishing being relatively good um are you able to point have the model point out the portions of the email that it feels are suspicious so that those can cue the human reader so that the same way I get an external mark on fish external emails I can get he parts of this email we think maybe um suspicious to cue the

human reader to look at the specific SP spots and slow down on those parts so that they're more the model and the human are working together to catch the fishing email does that so your question if I got it right right is can we select certain parts of the email and say that this part is suspicious could the detection model yeah um select certain parts that it feels are suspicious and then have the human then review only those parts so the human isn't reviewing the entire email that's a super good idea and it's definitely possible and we haven't done it because I haven't thought about that but that's really nice I like that so I think that's

something we I'm going to write it down because that's a good

idea hi um my question is that um I may have missed this but the how exactly are you priming the model for suspicion is that a a modification of the prompt and is the information about that priming process or your the prompts involved in the research available in that white paper you mentioned yes that's a very good question the difference is in the prompt so there there a bunch of different prompts to use but for example instead of saying what's the intention of this model we say how suspicion is this email on a scale of 1 to 10 or and then above a seven or whatever we say that's suspicious or just is there anything suspicious about this email so

it's yeah it's in the prompt right and then is that uh is that the content of your white paper that's coming up or is that already is that Avail is that research available right now it is available from our oldest white paper this year's white paper will be better I think but if there's a link in the second slide I have a link to our white paper from from past year or from this February and then we have the last year's uh fishing detection with the prts involved gotcha thank you thank you hello first of all great talk I really enjoyed it um my question is you talked about that the M that the module

gets saturated after some time does it get saturated when it when it finds enough information or when it finds enough information is confident about actually belonging to that person so enough information that is confident about but this we can you can use this in different way right and we can tweak how much information it should seek but to a large degree here it's actually not necessarily good to find more information because if you imagine searching 100 Google hits you're probably going to get a lot of pretty old and outdated information so the the top five 10 Google hits it's almost always enough that to do that what happens with people like someone says for for example only has a LinkedIn

profile yeah but there like other results the top 10 results are just other people with similar names will it also like will it get false positives when there's not enough that's a super good question we worked a lot with this to try to see that like are the result Congress basically so that that's something that's really important and for now if the model realized that you know of these first 10 Google hits five of them are different like they don't really match up then we just flat this is probably a person with a super common name or a super common keyword such a new John Smith USA then it's probably going to get quite a lot of people and

then we just don't use that person for now okay so we just skip it okay yeah it's quite diff you could find workarounds but that's you if you take like John Smith USA how do I know that I had the one you mean was almost impossible that thank you cool yeah great questions I have one yeah oh go right ahead know you're ready H yeah thank you for the uh great talk my question is uh what is uh personalized vulner vulnerability uh profile is like uh can I can can you see some samples with the of of the personalized vulnerability profiles yes yeah so now it's uh it's rather simple uh because you know we need to test on more data

but what it basically is that in the database we have columns for like which fishing emails did you press and what category did this fishing email have tagged so that's a very simple base of it and seeing that you know you pressed out of the lab past year for example you pressed 25 fishing emails and 15 of them are tagged as social peer pressure so then you know we wait up and then we're still working on how we should wait these different emails because there's a lot of ways to do it right but if you press a majority of the email from the social peer pressure category then your vulnerability profile will be overly matched with the social peer pressure

category but what's really cool here and this is work in progress but what's really cool here that I want to do is that you'll see different combinations of these vulnerability categories probably play out in different ways right so if you're one person who always press Authority and pressure email that might be a combo that you know makes your vulnerability to something but then if you have authority and scarcity that's different things but how it looks is basically columns in a database honestly and then we're still working on how to visualize it and how to do it in the best ways but for now we tag the emails and then we see which emails you press and that's how we We Gather that

data uh thank you so uh in the future maybe there's a personalized the database would be sold in the in somewhere yeah yeah we have I can show it's kind of like early work but I we have the reports feature uh so mean in in this section we can we can list here all the users right uh oh yeah I actually purged my database yesterday cuz I didn't want to to see all the other people that we fished but so in the report section we can list this right now but it's not like super super beautiful but what you said if you're interested feel free to reach out and whenever like the tool is getting

better and better every day honestly so whenever we have that feature uh you're very welcome to just try it because we also invite regular users just try this out and that would be awesome to have you check it out thank you cool yeah I really appreciate your uh presentation it was really uh really great uh one question about uh the detection um have you considered using uh instead of using general uh purposed llms uh to use something that is more specific in the fishing world or or there is any uh um llm model that is uh considered to be more specific for that uh use that usage yeah that's a super good question like how do we work with

fine-tuning fishing model Bas and we actually did we started working with something called the Cambridge cyber crime data set which is a really big also fishing data set and I personally discussed a few San Francisco based startups who work you know cyber security June language models and right now we just don't work with them because I think these are good enough but I think that that's super interesting what you say and I have some ideas I would like to make a study on it and I knew some other folks like people are working on this right to create a perfect tuned fishing or cyber security or whatever heav your language model the thing is that you want if you do that you want to

have a marketing model right because you don't want the model to be be fishing because traditional fishing emils are bad then you want to find tune like just a general persuasion model that's really persuasive and I think that's really cool I'm highly positive to that research but for now we just use these because they're pretty good they're also like the most wildly accessible but I I'm super interest in just fine-tuning deceptive models and seeing how that could be done better because it probably can be done better like like I think the upper Mark for deceptive is we're not even near it uh so that that's very interesting but I haven't worked much myself with it thank you no it's a really good

question cool thank you so much thank you

GT - Devising and detecting spear phishing

Related talks