
EHLO World: Spear-Phishing at Scale using Generative AI

BSides Las Vegas · 2024 · 23:35 · 189 views · Published 2024-09
About this talk
Generative AI is dramatically lowering the barrier to entry for large-scale phishing campaigns by automating the creation of convincing, grammatically correct, and highly targeted emails. This talk examines real-world AI-generated attacks—including fabricated email threads with fake identities and conversations that never occurred—and demonstrates how adversaries are using LLMs to increase campaign reach and success rates. The speaker shares detection strategies and defense-in-depth approaches to counter this evolving threat landscape.
Original YouTube description
Ground Floor, Tue, Aug 6, 14:00 - Tue, Aug 6, 14:20 CDT
Email-based attacks remain at the forefront of the cybersecurity threat landscape, ever-evolving to circumvent defenses and trick unsuspecting users. In this presentation, we discuss the risks of Generative AI in the context of the email threat landscape. Specifically, we examine how Generative AI facilitates the automation of targeted email attack creation, resulting in increased campaign reach, diversity, and the likelihood of success. We'll show real, in-the-wild attacks with completely fabricated contents, including conversations between multiple individuals that never happened, to demonstrate the sophistication LLMs can afford attackers in conducting convincing phishing campaigns at scale. Attendees will leave this talk with an understanding of the impact of Generative AI on the email threat landscape and what to expect in the coming years.
People: Josh Kamdjou
Transcript [en]

Hey folks, thanks for coming. How's my audio back there? Good? Cool, great. All right, welcome to "EHLO World". Anyone get that joke? Anyone get that reference? Thank you very much, I worked very hard on that. "Spear-Phishing at Scale using Generative AI". All right, so quick background on why we're here. Is there echo, or just a little bit of echo, maybe? All right.

Testing. All right, there we go. Thank you, sir, appreciate it. Okay, so I'm on the internet; you can find me pretty much everywhere at jkamdjou. I'm the founder and CEO of a company called Sublime Security. We detect and prevent lots of email attacks, and that's why we're here today: to share a little bit about what we've been seeing in the wild, and also some of what we recreated on the offensive side. My background: prior to Sublime, I spent most of my career in the offensive cyber space, and that's a little bit of what we're going to be bringing today. All right, so a quick overview of what we're

going to be covering. We've got a lot on the agenda. We're going to spend most of our time talking about gen AI use by adversaries and what we've been seeing in the wild, as well as what we were able to recreate relatively quickly, just to demonstrate the barrier to entry for doing this stuff and how low it is. Then we'll talk about detection and defense-in-depth strategies as well. Before we do: I don't think it's a surprise to anyone that the threat landscape is rapidly shifting. We're seeing this in email in particular, with lots of new techniques being employed. Anyone seen QR code phishing recently, or heard

of it? Callback phishing? It feels like every day we're seeing new types of attack variants. The question is: why? What's the motivation behind it? You have different types of adversaries with different objectives. You've got nation states, who may be financially motivated but have other motives as well, like espionage and intel collection. When speaking about financially motivated adversaries, they are seeking high-ROI opportunities. There are two inputs into ROI, return on investment: return is the financial reward at the end, and investment is the time, money, and resources allocated to achieve that return. So keep this in mind. This is why we'll

see the adoption of gen AI by attackers: it makes them more efficient. So, a quick terminology overview. Has anyone not heard of gen AI or LLMs? Yeah, all right, cool, so we won't belabor that point. But just to draw a distinction here: gen AI is really the umbrella term that includes image generation, video creation, audio synthesis, and code generation. It also includes LLMs, so LLMs are a subset of gen AI, and LLMs focus more on text generation, summarization, that kind of thing. For the purpose of the talk we're just going to use the umbrella term, but just so we're aware of the terminology there and why we're using what. And just a couple of words on the

landscape here. We've really got two different, even philosophical, approaches to the landscape: we've got the closed models and we've got the open-source models. Everyone, I'm sure, is very familiar with OpenAI, and maybe not as familiar with the others: Anthropic, Cohere. These are all accessible via API; that's how these closed-model providers make their models available. And we've got the open-source models, which you generally run locally, or you can deploy them elsewhere, using tools like Ollama. So when talking about what we've recreated, or even attacker usage, you can generally see either one of these, depending on privacy

preferences, or the lack thereof. This is probably not news to anyone: this is happening in the wild today, and it's happening through many different attack vectors, so not just email. We've got the FBI warning around voice and video cloning down at the bottom left there, and we've got Microsoft talking about Forest Blizzard, a Russian threat actor, employing LLMs for various purposes: recon and enhanced script generation. So we are seeing adoption by adversaries. And bringing it back to the former point around why: it makes you more efficient, and it lowers your

investment and increases your return. We'll talk very specifically about what it enables you to achieve in the email domain. Beyond just efficiency, we're really talking about the efficacy of attacks as well. So let's get right into the attacks we're seeing in the wild. A quick note on this: it is practically impossible to assess with confidence that something is gen AI-originated. If anyone tells you with certainty that it is, well, unless they were on the adversary's keyboard and observed it happening, it's impossible to say with certainty. So we are saying "likely": we say "likely AI-generated", with very, very high confidence, and we're making that

assessment, just for transparency's sake, using some of these factors. We're talking to our customers and validating that these are in fact fake threads, fabricated identities, and real events that are happening. We're seeing similar variants across multiple customers, but tailored uniquely to certain ones. We look at millions of messages, and very manually, eyes on glass, we analyze thousands of them, so we have a lot of experience and we've got an instinct for what looks and feels AI-generated and what doesn't. And then throughout the course of this presentation we'll also show the output of some of these AI detector

tools. There's a bunch of these out there. This is a relatively new field, the detection of AI-generated text, so it's very nascent. It's not a reliable thing that you can use to detect things: there are a lot of FPs and a lot of FNs, but it's a thing, so we show the output of some of these throughout. So here's the first one. This one is writing to verify and request invoices and basically start a conversation, and there's a bunch of things happening here. There's an impersonation of a real contact at this organization. What's really notable about this: invoice fraud is not a

new thing, right? That's been happening for a long time, but it's generally riddled with poor grammar, from threat actors who are non-English-speaking and clearly just threw some text into Google Translate, so it's not really well written. What's interesting about this one is that it's proper English, there's a structure to the paragraphs, and it reads relatively well. So really we're seeing better-formatted generic messages. This is not highly targeted; we'll get to the more targeted stuff. But it's interesting because even the low-level mass phishing campaigns are stepping up. They are not your Nigerian prince scams anymore; they're well

structured. Over on the left here we have also identified some signals. We will come back to this towards the end when talking about detection, but I wanted to highlight some of the signals that you can actually use for each one of these. I mentioned these AI detector tools: here is one, I think called ZeroGPT, that assesses with high confidence that every single word here was generated by AI. You can look up how it's making these interpretations; it's really a variety of factors, like randomness, the probability of certain words, the variation of sentence structure, the

length of sentences. There are a lot of factors that go into these assessments, but yeah, an interesting assessment there. All right, example two. Who has received or seen a benefits enrollment phishing scam before? Yeah, okay, a few people. So this is not necessarily a new technique; we are seeing old tried-and-true techniques that are better than before. This is proper English. There's, you know, one, two, three, four: it makes sense, it's well structured, and there are no grammatical errors here. So we are seeing these tried-and-true techniques around pretext just stepping up in

complexity, or in how convincing they are. The other reason I wanted to highlight this is that there is a PDF attachment on this message that has an embedded QR code, so this is actually a QR code attack; you can see the blurred-out part there. What is happening here, almost certainly, is auto-generation of PDFs and embedding of QR codes in attachments as well, so quite interesting. And here we can see a different AI detector tool that predicts 100% of this was AI-generated. Okay, on to the most interesting one that we have come across as of late. There's a lot that is

redacted here because there are real identities and real events, so for customer privacy reasons we've had to redact this. But this attack has an entirely fabricated thread, with fabricated responses from real identities at the target organization, about a real event. We'll go through each one of these; it's quite interesting. The first email in the thread is a fake message purporting to be from an entity at the target organization, reaching out to the gala and saying that they want to pledge $25,000 to this event. We've got a reply coming back saying that they wanted

to express their heartfelt appreciation and send a package for sponsorship. Then we've got a follow-up in threaded reply two. These are actually real organization names in the target's industry, so they've done some enrichment here, and are saying, "Hey, X company and Y company are already in, and they filled out the form," and whatnot. Then we've got another fake reply, purporting to be from the target organization, saying where to send invoices. And then we finally have the last message in the thread, which is the actual attack: sending the invoice. So we've got someone in finance

who receives it, sees that there have been exchanges, and everything looks legit, and it is in fact not legit. So a very, very interesting development here. You can see that the AI detector actually does a worse job on this one, and ultimately these things just have too many FPs and too many false negatives. So if you are relying on this day to day, just know that it's a relatively new field that's still developing on the detection side. We did verify with the customer that this was in fact completely fake; the thread did not exist. The sender domain, if we pop

back over, was actually registered a few days prior to the attack, so it was a newly registered domain, designed to impersonate. So quite interesting here. What's the impact of this, ultimately, on the email threat landscape? Messages are more tailored, they're more convincing, they're more grammatically correct, they're more diverse when looking at a campaign across multiple organizations, and they have more reach because they are landing in inboxes more, as opposed to spam. So, just for fun, I wanted to see what it would take to recreate something that was quite convincing. There are lots and lots of tools. This isn't a talk

about OSINT, but there are plenty of talks on gathering information on entities in an automated way. Here are some of those tools. You can go pull in information on identities and organizations, crawl websites, crawl LinkedIn, all these types of things. There are services that already have all this information, like FullContact, or even Clearbit, which provides logos: give it a domain and it'll give you a logo. All of these enable you to automate this. And for recency, or for a higher chance of success, and this is a personal opinion from having spent most of my career on

the offense: if you want to convince someone of something, or have a better chance, one technique is to use a recent event. Use something they said, something that is highly relevant and timely, not just some generic thing. "Hey, I saw you at this event," something that is much more believable. So we took all this information. This is a bunch of info gathered about me: name, title, past roles, and recent activity from socials. That was one of the key inputs here, so pulling a bunch of my recent LinkedIn posts and a bunch of my recent Twitter posts, and then giving a prompt that

we've iterated on. I'll read some of this: "You are a computer scientist who writes very dull, with little excitement, and is extremely terse. Craft an email message. Do not sound salesy at all or make any generic statements. Keep it extremely short and concise, a few sentences at most." It goes on to give some additional direction: "Mention that you're sharing a document and say how it's specifically relevant to the observations above and that you think they'd find it useful or relevant. Be casual. Double-check your work to ensure you are not making up anything that didn't happen." So this is what we get. This

was after not much iteration: "Hi Josh, I saw your LinkedIn post about the increasing sophistication of phishing attacks (link to post). Your points on adaptive threat detection were insightful." That is relatively what I was talking about. "I'm sharing a document on recent phishing methods that relates to your observations. I think you'll find it relevant and useful." And then we can embed our lure in a Word document or a PDF or something like that. Honestly, I'd click on this. I want to know what you're talking about; "recent phishing methods" makes sense. So really this was an exercise to understand what it takes, and

the barrier to entry has been lowered significantly, so that's just something to really grok. Okay, really quickly we'll go through the last few bits here: detection and prevention. On the detection side, it's not all that different from tailored, targeted attack detection. Even for this guy, we're using signals like: you've never spoken to this person before; there's a suspicious Word document, or it's a malicious Word document or a malicious PDF; there is a suspicious link in the PDF; there's a link that we analyze that auto-downloads an ISO. It's not all that different on the

detection side, but attackers are constantly evolving these techniques. There are thousands of these signals, but the point is that on the detection side it's not very different, even as things get more advanced. We talked a lot about the offensive applications of generative AI; there are of course a lot of defensive applications, from language analysis to intent classification. This is one thing that we do very heavily: identifying text and understanding its intent, and extracting entities. This is called NER, named entity recognition. Is there a request being made? Is there a sense of urgency? Things like that. And of course there are

so many more, around alert prioritization and the SOC. This is really just the beginning of a lot of interesting defensive applications. So the last word, to wrap up: defense in depth. Obviously, if you can block it at the email layer, left of boom, as far left as you can go, great, but you should always have a defense-in-depth mentality. Educate your users: it's important to know that phishing attacks can be extremely convincing. It's not just the fake Microsoft Teams alert or things like that; it can be quite convincing. For credential phishing attacks, employ

MFA, ideally hardware-based like YubiKeys, and to prevent BEC, have a multimodal, multi-layered approval process for large transactions. All right, I think we've touched on all this. It's still very nascent, and it's getting worse. So, thank you. [Applause] Thanks. Any questions for Josh? I'll bring the mic over. So with regard to detection: how much are the much more common grammar-helping apps, like Grammarly, Microsoft Copilot, even just Gmail, how much is that kind of fuzzing with those detection algorithms that you guys are using? Because we're probably, I would assume, seeing more and more

corporate users using that to help themselves, but then also, obviously, with gen AI being used to create these phishing messages, how much is that kind of wreaking havoc with those detection models? As in, for Gmail's detection model, or, like, how are they...? So earlier you kind of had, like, 6%, you know. Yeah. So then obviously you've got more real users using these gen AI-based apps to help with their grammar in their email. Yeah, yeah. Oh, yeah. Like, how does that impact detection, basically? Well, that's why I don't think you can use these detectors as

inputs into detection, right? There are too many false positives. Yeah. One more. So do you think that the overall mitigations that you would take against phishing are drastically changed by this, or is it just that the phishing is more effective? The latter. Okay, yeah. Thanks very much, Josh. Lunch break; other talks are going on in other tracks. Otherwise we'll see you back here at 2 p.m. for hacking arcades. Thanks.
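
The detection signals mentioned in the talk (a first-time sender, a newly registered sender domain, a QR code in an attachment, a risky file type, urgency or a financial request in the text) can be sketched as a simple rule check. This is a minimal illustration in Python, not Sublime Security's implementation: the `Message` fields, the 30-day threshold, the extension list, and the keyword patterns are all hypothetical, and the regular expressions are only a crude stand-in for the intent classification and named entity recognition the speaker describes. In line with the talk, it does not use an AI-text detector as an input.

```python
import re
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Set

# Hypothetical values chosen for illustration only.
NEW_DOMAIN_DAYS = 30
RISKY_EXTENSIONS = (".iso", ".img", ".docm")
URGENCY = re.compile(r"\b(urgent|immediately|asap|today|overdue)\b", re.I)
FINANCIAL_REQUEST = re.compile(r"\b(invoice|payment|wire|remit|bank details)\b", re.I)


@dataclass
class Message:
    sender_domain: str
    body: str
    received: date
    # Assumed to be filled in elsewhere, e.g. from a WHOIS lookup.
    domain_registered: Optional[date] = None
    attachment_names: List[str] = field(default_factory=list)
    # Assumed to come from a separate PDF/image scanner.
    attachment_has_qr_code: bool = False


def signals(msg: Message, known_sender_domains: Set[str]) -> List[str]:
    """Return the names of the signals that fire for this message."""
    found = []
    if msg.sender_domain.lower() not in known_sender_domains:
        found.append("first_time_sender")
    if msg.domain_registered is not None:
        age_days = (msg.received - msg.domain_registered).days
        if 0 <= age_days <= NEW_DOMAIN_DAYS:
            found.append("newly_registered_domain")
    if any(n.lower().endswith(RISKY_EXTENSIONS) for n in msg.attachment_names):
        found.append("risky_attachment_type")
    if msg.attachment_has_qr_code:
        found.append("qr_code_in_attachment")
    if URGENCY.search(msg.body):
        found.append("urgency_language")
    if FINANCIAL_REQUEST.search(msg.body):
        found.append("financial_request")
    return found


if __name__ == "__main__":
    example = Message(
        sender_domain="gala-sponsors.example",
        body="Please find the invoice attached and remit payment today.",
        received=date(2024, 8, 6),
        domain_registered=date(2024, 8, 2),
        attachment_names=["invoice.pdf"],
    )
    print(signals(example, known_sender_domains={"partner.example"}))
```

None of these signals depends on whether the text was machine-written, which matches the speaker's point that detection does not change much even as the messages become more convincing.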