SQL Injection Is a Thing of the Past… and Other Lies We Tell Ourselves

Name: SQL Injection Is a Thing of the Past… and Other Lies We Tell Ourselves
Uploaded: 2025-05-18
Duration: 53 min 59 s
Description: Despite being discovered over 30 years ago, SQL injection, command injection, and XSS remain pervasive in modern applications, with SQL injection alone accounting for 6.7% of open-source and 10% of closed-source vulnerabilities. This talk examines why these technically solved problems persist in the wild and explores practi

BSides Charm · 2025 · 53:59 · 104 views · Published 2025-05

Tags

Category: Technical

Topic: Vulnerability Research, Web AppSec

Difficulty: Intermediary

Team: Blue

Research: Empirical Research

Style: Talk

Mentioned in this talk

Platforms

Aikido Security

About this talk

Despite being discovered over 30 years ago, SQL injection, command injection, and XSS remain pervasive in modern applications, with SQL injection alone accounting for 6.7% of open-source and 10% of closed-source vulnerabilities. This talk examines why these technically solved problems persist in the wild and explores practical solutions developers and organizations can adopt to eliminate them.

Original YouTube description

Despite being older than Internet Explorer, injection attacks like SQLi, command injection, and XSS remain prominent. Our research found SQLi alone accounts for 6.7% of vulnerabilities in open-source projects and 10% in closed-source projects. This session reveals why these attacks persist and how modern solutions can help. Mackenzie is a security researcher and advocate with a passion for code security. He is the former CTO and founder of Conpago, where he learned firsthand the importance of building secure applications. Today, Mackenzie works for Aikido Security to help developers and DevOps engineers build secure systems. He also shares his knowledge as a contributor to many technology publications like DarkReading, Financial Times, and Security Boulevard, along with appearing as an expert in TV documentaries and interviews.

Transcript [en]

[Music]

All right. Well, I think we can get started. So today I'm going to talk a little bit about SQL injection and other injection-style attacks. This presentation came about one day when I was sitting around with access to all this great data that I'd never had access to before, and I wanted to know: are we still doing SQL injection 30 years after we first discovered it? And I was very shocked to see that yes, we are very much still suffering from SQL injection, and it hasn't gotten much better over that period. So that kind of led me down

this path to create this talk and have a look at it. It's more or less a curious interest of mine that morphed into a bit of a presentation. But a little bit about me: if you're wondering where my accent is from, I'm from New Zealand, or Aotearoa, but I now live in the Netherlands. I'm the co-founder and CTO of Conpago, which is a health tech company. It was acquired by elements in 2023. I'm currently in developer relations at Aikido Security and formerly DevRel at GitGuardian, which a few people know me from. And I just like to keep everyone guessing where I am in the world. But you can find

me anywhere on social media under the handle @AdvocateMac. So one of the curious things about injection attacks is that, from a technical point of view, they're really solved problems. There's no need for us to have an injection vulnerability in our applications, particularly SQL injection, and it's been around for so long. It really leads to the question: why are we still struggling with these types of attacks? Particularly when, and we'll go into it a little bit, the solutions to a lot of these are really quite simple. And if we look at the history of SQL injection and where this first started, it was first coined publicly in 1998 by Jeff Forristal, otherwise known as

Rain Forest Puppy, but it was actually used, seen, and explored in the wild long before that. SQL injection, not the phrase but the attack, predates Internet Explorer. It goes back to when developers were called webmasters, which my dad still refers to me as. Google wasn't founded yet, Napster wasn't founded yet, and the term open source had yet to be coined as we use it today in software development. So this vulnerability is really one of the oldest ones that we've been dealing with, particularly when we look at web applications, and it's still one that plagues us a lot today, and I'll

let you know exactly how common it is. First, I'm sure probably everyone here is pretty familiar with SQL injection, but I have a few slides just going through some of the basics, because I think it helps as a refresher, or maybe you're hearing about it for the first time. SQL injection is essentially when untrusted data, usually from user input, is used directly in a query or in a function. So for this example, we have a search bar. If we put an SQL query in there, and in this case we're talking about dropping tables, and if we don't sanitize for that or we're

not using prepared statements, then we can manipulate the query that is being used and bypass authentication. We could delete data. We can access data we shouldn't have. And in some bad cases, we can even get down to remote code execution, which I'll give some demos of. So, SQL injection in its most basic form. This is the first thing you learn about SQL injection, and I'm going to show it because it's so relevant even in the most complex of examples. If we have this website here, Altoro Mutual, it's a banking website and it has a login page that you can't get past. You need to have a

username and a password. But this is vulnerable, because all we need to do is put in the username admin, put a quotation mark and add dash dash to the end of it, and we can log in as an admin user. This is the most basic SQL injection. And why is it doing that? Well, when we look under the hood at the actual SQL query that's being run, it's taking our username and our password and it's running them directly in that query. So it's taking untrusted data from the user and using it. So when I added in my malicious SQL payload there, what was happening is I'm saying that my

username is admin, then I've got my quotation mark to close the string, and then I'm using dash dash, which in SQL is there for comments. So I'm commenting out the part of the query that requires a password. This is all very, very basic, but it explains the principles of SQL injection even when it gets very, very complex. So how do we solve this? And this really comes back to my earlier statement that SQL injection really is a solved problem. Let me show you how easy it is to solve. Here we have an insecure SQL query. Why? Because we've got our

username and our password here, and we're taking them directly from our untrusted data, our user input, and then we're using them. If we want to solve this, it's very, very simple. All we have to do is use parameterization, or prepared statements, where we send the SQL query as a whole and then later on add in our untrusted data, and that is treated on the back end as strings and not as part of the SQL query. So this is how it's all solved. So this leaves the question. This isn't complex, this isn't life-changing. I don't think I've explained anything that someone here didn't already know. So, why on earth

are we still dealing with SQL injection vulnerabilities today? What I thought is that the best way to try and explain this is to look at a real-life example of SQL injection, go through the exact steps of what it took to get there, and how we could have solved it. And it's going to get a little bit complex, but it'll also be incredibly and stupidly basic at the exact same time. So untrusted data comes from lots of different places. User input fields are the first one that everyone thinks about. It can come from URL query parameters. It can come from HTTP headers. This one in particular is something that we forget about. It can

come from cookies and lots of other areas. It's essentially anywhere that a user can manipulate anything. And so we really need to keep all of these in mind. Part of the reason why SQL injection still seems to be a problem is that we forget about a lot of these different areas. We forget about the HTTP headers that can be manipulated. And more than that, when we think about SQL injection, we often think about it as a user inputting or manipulating data and the injection happening immediately, whereas a lot of SQL injection today happens in a delayed way, meaning that I will use HTTP headers to manipulate some kind of variable that I will use in an injection

attack later on. It makes it slightly more abstract, but essentially, at its core, it's all the same. So, let's look at an actual example. This is from MOVEit. MOVEit is file transfer software. It's designed for secure transfer of files, it's used by very large enterprises and governments, and it's used to move sensitive information. So this is a very sensitive piece of software that should have very good technology and security behind it. And as you've probably guessed, there was a massive vulnerability in this MOVEit software that affected millions of people, brought down airlines for a day, and did all kinds of really nasty

things. And it all comes back to an SQL injection vulnerability. And while the chain of attacks and events that needed to happen to make this vulnerability work does get a little bit complex, when we peel back the layers and actually look at what really caused this, it's about as stupidly simple as the login bypass that I showed you at the start. So this is the vulnerability here. It's a 9.8 critical. It allows for remote code execution via SQL injection. So it's about as bad as it can get. And essentially how the vulnerability works is that you have an unauthenticated user who sends crafted headers to a DLL file, and I'll explain that later. But these headers adjust or

affect session variables. Those session variables are then later used to create an SQL injection attack. There are a lot of other steps we need to go through to be able to do that, but ultimately it ends with us being able to run our own payload on the server, or remote code execution. So what are the steps that we need to take as an attacker? If we all put on our attacker mindset here, the first step any attacker needs to take is to identify a target. There are lots of tools to identify whether this MOVEit software is being used. And the next thing we need to do is, like any normal

user, generate a session ID. We can do this, and none of this is insecure at the moment; these are just the first steps of how we get going with this vulnerability. Then we need to set our malicious session variables. This is the first part of the SQL injection campaign. There is a file in this called the moveitisapi.dll file, and it's not super important to go into the details about this file, but it really handles the headers. What happens is that when you make an HTTP request and you insert some headers, this file establishes them as session variables, which means that they become variables in that

application for your session, and the application will use them later on. So in the headers we can set these variables to be anything that we want. For example, instead of putting something expected in there, I can put in an SQL injection payload. And this is the first real big issue, because there's no sanitization that happens here. This file will just take whatever input and add it as a session variable. And this is really the first issue that we come across. So, for example, we could input some kind of SQL injection here where we're making ourselves an admin. For the next step, we've got that far, but we

haven't actually executed anything. We've put an SQL injection payload in a variable that's on the server, in the application, but we haven't executed it. It's not doing anything. There's no real danger yet. To be able to do that, we have to take a couple more steps, and we'll just go through the process. Next, we need to create a CSRF token. This is for cross-site request forgery. Again, this isn't a big vulnerability or anything. This is just a step that we need to take. We just need to send a dummy request to this guest access file and we get back a valid CSRF token. And then the next step, once we have that CSRF

token, is that we can actually start making POST requests to this guest access file, guestaccess.aspx, and this is when we can actually start to trigger our SQL injection, because we've stored it in session variables, and when we do this POST we trigger them. So now all of a sudden our payload is being activated and we're controlling the SQL query. We can run queries, and we can run them with elevated privileges. So we have now essentially gained admin access to this database. And the first thing that any attacker will do once they have access to any kind of database, especially one that deals with

authentication, is to create backdoors for themselves, login backdoors. So that's the first thing that we're going to do, and because we've proven that we can do this SQL injection, we just need to keep doing it: creating accounts for ourselves and making sure that we have persistent access here. We could do something nasty like delete everything, but we can do much nastier things if we keep going down this chain. Now, what I ultimately want to get to is remote code execution. At the moment, I have access to the database. I can do whatever I want with the database, but I can't get remote code execution, particularly on

the server. So I have to keep going. Using this database, I create a fake trusted provider, which gives me a JWT. Because I have access to the database, I can do that. I then get a fake JWT, signed with a private key, that allows me to communicate directly with the server. And from here I can do what I really want to do, which is upload my malicious payload. So I can embed a file with commands in there to do remote code execution. And then the last step is obviously to trigger that payload. Just by going to this endpoint here, I can start to trigger the payload that I've

now uploaded, and I have complete control over the entire system here. Not just its database, but everything. And the last step, if you want: you have total control, so what do you do when you have total control? You clean up your mess and you pretend you were never there. Now, this all sounds very, very complicated, and there are a lot of things that had to happen for this to work. I don't expect everyone to understand everything about this and go, "Yes, that's right." Or at least I hope you don't, because you might challenge me on it. But when we peel back these layers and actually understand what's happening, this very, very complicated attack all of a sudden

becomes incredibly simple. There were two big vulnerabilities, or two big mistakes, that happened. The first was in the file that we talked about at the start, the moveitisapi.dll file. Now this wasn't a massive flaw, but it certainly contributed. This was how I interacted with the session variables. It took the untrusted data and stored it however I wanted it to. It did no sanitization. So that is the absolute first mistake that happened here: there was no sanitization on this input. However, in security, we should have robust backups for everything, so this shouldn't have been enough for me to do an injection campaign. So,

there needed to be another vulnerability. And this was the really big one, here in this guestaccess.aspx file. This is the code that was actually running. There are a couple of areas in here, but most importantly, this took untrusted user data, the session variables, and executed them directly in the SQL query. Let's just take a look at how it worked. This here is the session variable. This is what we adjusted in the HTTP headers. It's being stored as this parameter here, email body, and email body is being used in that query. Now you can understand how a developer wouldn't immediately see this, because we're not taking data directly from a

user. There's one extra step in there, which is that it's being stored as a session variable. So you might think: I'm not using untrusted data, I'm using session variables that I control. But actually you don't control them. And this is why I say that it's complicated, but it's also incredibly stupid. When I showed you the example of authentication bypass, I showed you what a vulnerable SQL query looks like and how easy it is to fix. So we have this problem that caused huge damage to massive organizations. How could we have fixed it? All we needed to do was use a prepared statement. This is the same way we've

been fixing SQL injection vulnerabilities for 30 years. It hasn't changed and it's not complicated. So this just goes to show that whilst the chain of events to get remote code execution on this very sensitive piece of software is quite complicated, actually solving it in the first place comes back to the same standard principles that we should have been using for a long time. And this here, obviously, is what the fix would have looked like. So this comes back to the fact that SQL injection is a very strange vulnerability for a lot of us, because technically it is solved. So it should be an easy fix. So is SQL injection still a

problem? I know I just showed you an example of this, but outside of this unique situation, how big of a problem is SQL injection genuinely today? This is what set me off on a little bit of a research project and down this path to deliver this talk. I wanted to look at all of the CVEs that were created over the last few years and find out how many of them were because of SQL injection, and I then wanted to review how many vulnerabilities in closed-source projects were too. So to do this, I looked at all of the CVEs that were added to the GitHub Advisory Database over the last few years. And

then I also reviewed 70,000 closed-source projects, using their initial scans, to see how many of them contained SQL injection vulnerabilities. And then I continued this with command injection, path traversal, and cross-site scripting to find out where we were. So a few considerations about the research before we get into the results. For the closed-source projects, I work for Aikido Security. We're a vendor, and basically we're limited to looking at the data from our customers. Obviously I wasn't looking at our customers' vulnerabilities themselves; I'm just looking at the number of vulnerabilities that were found. One caveat from this is that

people that are using security tools are probably more mature than the average person. So there's a consideration here that the problem is actually probably worse than what we discovered, although it's pretty bad from what we can tell. What I found out was that SQL injection was generally everywhere, and we're definitely still suffering from it. In terms of open-source projects, in 2024 there were 36,000 vulnerabilities reported in open-source projects, and 2,417 of them were SQL injection. And if we look at 2023, it's very similar. If we express this as a percentage, open source, which is in purple, is around about 6.7% of

vulnerabilities in 2024, or 7.8% in 2023. And SQL injection makes up around about 10% of vulnerabilities in closed-source projects. So SQL injection vulnerabilities make up a huge percentage of the vulnerabilities that we are still finding out there in the wild today. And what's really interesting is that around about 20% of the projects that we looked at were vulnerable to SQL injection at some point. That figure is different because a project may have more than one vulnerability. So 20% of the closed-source projects that we looked at had an SQL injection vulnerability on their initial scan, which to me was really surprising, because I really didn't expect this to be such a

predominant issue in 2025, but in fact it still is. So it goes to show that there's so much that we need to improve on, even in basic security. But we can also expand this beyond SQL injection and look at other areas as well, like cross-site scripting. If you're not super familiar with cross-site scripting, there are really two types that we look at. There's persistent cross-site scripting. This is essentially when I have a script and I embed it into a page. The easiest way to explain this is that there's a comment section on a website and I embed a script in there. That script does

something like steal session cookies and send them back to me, the attacker, usually via a C2 server or something. So that's what persistent cross-site scripting is. You're embedding a script, that script is usually hidden, and it runs every time someone goes on the page. There's also reflected cross-site scripting, and this happens where you put the script inside the parameters in the URL. So you're trying to basically do the same thing, but it's not persistent. It only works on that one link that you have managed to turn malicious. And if we look at cross-site scripting, which shares very similar elements with SQL injection, where

it's still an injection-style vulnerability and it's still solved by very similar things, it makes up a huge percentage of the overall vulnerabilities that we have. In fact, last year it made up 28% of the vulnerabilities that were discovered in closed-source projects. So cross-site scripting, again, is one of these vulnerabilities that isn't a technical problem anymore, because we know how to solve it. Any blog will tell you how to solve it. However, we still seem to be struggling with it out there in the wild. And then this is from some interesting research that I found from Patchstack. I've stolen this graph that they have here. But in certain areas

cross-site scripting really is problematic, and I think this actually bumps up the results a bit. In areas like WordPress, it makes up about 50% of the vulnerabilities. And there was some research, which I haven't been able to verify, that said 24% of WordPress sites were actively vulnerable to cross-site scripting, found using DAST tools. So that was a pretty scary result. And we can definitely see this out in the wild. Recently a very popular WordPress plugin was found to have a cross-site scripting vulnerability in it, and over 2 million sites were

vulnerable to that. So again, it's still a massive issue that we're seeing out there in the wild. So how do we fix cross-site scripting? Well, it's very, very similar to how we fix things like SQL injection. We need to escape output properly. That means we need to escape the dangerous characters that are used for scripting. We need to make sure that we validate and sanitize our inputs, just as we need to do for SQL injection. We need to use things like content security policies and avoid inline JavaScript. There are a few more, but if we follow

these principles, we can knock out most of the cross-site scripting vulnerabilities that we have out there in the wild, and yet we're still struggling with this a lot. Now, command injection. This meme I quite liked: you can only suffer from SQL injection if you have an SQL database. But that's kind of not true. It is for SQL injection, but for injection in general, very similar principles can apply in different areas. Command injection is again a very, very similar type of vulnerability to SQL injection, in the sense that it's untrusted data that we're not sanitizing and not dealing with correctly. So command injection, and I'll just explain this

very, very simply, is usually about getting files from a server or from your operating system. Here this thing is asking us which file we want, such as the dog or the cat, and then we click it and it retrieves that file from our system. So it's executing a command on our system to retrieve that file. But we can manipulate that using a POST request so that we're asking for the dog file, but then we're also removing all the other files in the system. And so this would delete everything. This is a bit of a silly example, but it actually works

quite a lot and causes a lot of chaos. And it's again because people don't think about the untrusted data. I specifically used a limited field here, not just a free-text input, because for whatever reason people seem to think that if we have nice little options that limit the user's input, then it's safe. But we can of course bypass this using something like a POST request. And what about command injection as a vulnerability? Well, here again, we're seeing huge numbers. Now, this is the only one that gives me a little bit of hope, because we have seen some declines here. The most significant decline is that in

2024, 5.8% of the vulnerabilities we discovered in closed-source projects were command injection, and in open source it was around about 7%. That is down significantly from the 2023 figures of 8.2% and 7.95%. So this one is getting a little bit better in terms of how we're dealing with it. But taken on its own, we obviously still have a long way to go, because again it's a solved problem from a technical point of view. So how do we solve this? Well, we just need to limit what our applications can do on the server side. So if we're only expecting certain inputs, make sure that we only

allow those certain inputs, and again not on the client side but on the server side. And also make sure that we only allow the access we're expecting, for example that we only have read access to these files, so that we can't just delete all the files on someone's server. And avoid any kind of shell commands here. So from a technical point of view, just like SQL injection and just like cross-site scripting, these are really solved vulnerabilities. And of course, we do see a lot of command injection out there in the wild. There was one recently in BeyondTrust, which again affected millions of customers and was a

fairly critical vulnerability. So we're still seeing this out there in the wild. This isn't just hypothetical, something we're finding in some code with some tooling. The last one that I'll talk about before we get on to other things is path traversal. Path traversal is one of my favorites to go into. It's not that common, but it can cause all kinds of havoc, and it's so simple that a lot of people seem to forget about it. But path traversal is still very relevant in file upload handlers, archive extractors, and local file readers. And path traversal is a very, very simple one to explain. Here we have

a URL with a parameter where we're trying to download this file, report.pdf. And this is what the code looks like for us to be able to do that. But if we were to change this to something like dot dot slash, dot dot slash, dot dot slash, and try to get to the password file, then we can download that. These little dot slashes are just going back further in the directory tree. So we're traversing back through the path, through all the directories, to try and get to a juicier file. And this is the exact location where we would typically see something like this. And this is something that is very surprising. This is the only one from

the research I did that I expected to see more of out there, but it actually wasn't so common. Only around about 2.7% in open source and 3.5% in closed source had path traversal vulnerabilities, which again is going in the wrong direction, because we're seeing an increase from 2023. And we can solve this in very much the same ways. I won't go through all of this, but: sanitizing our inputs, restricting our access, and making sure that we only accept expected values in our code. And of course, we are still seeing this out there in the wild. The ConnectWise report is one that's recently happened that we can

turn to. So what is the total impact of these injection-style vulnerabilities? Well, and again, I know I've said this a lot of times, but I'm going to stress it again: these are all solved from a technical point of view. So there's no need for this. We're not talking about zero days here. We're talking about very well-known vulnerabilities. And if we actually look at it, nearly 50% of all the vulnerabilities we see out there in the wild are these types of vulnerabilities: vulnerabilities that can easily be solved, that have a long history, and that have lots of educational material about how to solve them, but we still seem to be really struggling. So in

total, 47.7% are these types of vulnerabilities, and I haven't even included things like NoSQL injection, which I am doing some research into but which will come in a later presentation. It just goes to show: the question that I had when I started looking into this was, I wonder how much SQL injection we're still suffering from. I definitely did not expect to get to an answer of 50% being general injection-style attacks. So it shows again that we've got a long way to go. So why does this happen? Complex user data chains. This is what happened in that example that we ran through there.
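A minimal sketch of the delayed injection chain described in the talk, written in Python with the standard sqlite3 module. The header name, session key, table, and function names are hypothetical and are not taken from MOVEit; the sketch only illustrates a header value being stored in a session unchanged, later concatenated into a query, and the same lookup done with a parameterized query instead.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # In-memory database with one normal user and one admin (illustrative data).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)", [("alice", 0), ("root", 1)]
    )
    return conn


def store_header_in_session(session: dict, headers: dict) -> None:
    # Step 1: an untrusted header is copied into the session with no checks.
    # Nothing is executed yet, so this step looks harmless on its own.
    session["display_name"] = headers.get("X-Display-Name", "")


def lookup_vulnerable(conn: sqlite3.Connection, session: dict) -> list:
    # Step 2 (later request): the session value is concatenated into the SQL
    # text, so the attacker's string becomes part of the query itself.
    query = "SELECT name FROM users WHERE name = '" + session["display_name"] + "'"
    return [row[0] for row in conn.execute(query)]


def lookup_safe(conn: sqlite3.Connection, session: dict) -> list:
    # Same lookup with a placeholder: the session value is bound as data
    # and is never parsed as SQL.
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (session["display_name"],))]


conn = make_db()
session = {}
# The closing quote ends the string literal and "--" comments out the rest.
store_header_in_session(session, {"X-Display-Name": "nobody' OR is_admin = 1 --"})

leaked = lookup_vulnerable(conn, session)
blocked = lookup_safe(conn, session)
print(leaked)   # ['root']
print(blocked)  # []
```

In this sketch the concatenated query returns the admin row even though no user called nobody exists, while the parameterized query returns nothing because the whole header value is compared as a literal string. The session variable is just as untrusted as the header it came from.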

Things can be complex, and the fact that you have untrusted data, or user input data, can be missed because you have a delayed execution of that payload. There are a lot of legacy platforms that are still out there being used, and even if you're not using legacy platforms yourself, they still touch our servers, our infrastructure, even our cloud. Even on the highest-end cloud, there are still lots of legacy platforms used to operate it. I also see a huge over-reliance on tools like WAFs and in-app firewalls. Web application firewalls and in-app firewalls have the ability to block a lot of SQL injection

queries, but there can be a huge over-reliance on them, and they have to be set up in very specific ways to be able to catch everything. I see that when these tools came out there was a big sigh of relief, and a lot of people just rely on them so much now. And then the other part, which I'm going to touch on a little bit, is that AI-generated code is definitely accelerating the risks that we're seeing out there. I'll briefly touch on some research that I did there. And the final point is that we're still just not sanitizing data. We're still not doing the basic things that we

need to do to deal with this risk. So tools do help. I talked about WAFs and RASPs, or in-app firewalls. We can also find them with static application security testing, dynamic application security testing, and software composition analysis, because with a lot of these injections we become vulnerable to them because our dependencies contain them. So of course we want to use tools, but in general we should absolutely be implementing the best practices that can stop all of these. So I want to talk a little bit now about injection in AI, because another question that I had, and that I think a lot of people probably have, is how good is AI generated code

and how good is it specifically at creating and spotting these injection-type vulnerabilities? And I came up with some very interesting answers. So first of all, there's a study from Cornell University that looked at a lot of these coding assistants, and it found that out of all of the vulnerabilities, the injection-style attacks were actually the most common in AI-generated code. What's in purple is what humans write and what's in blue is from AI, and we can see that AI is particularly bad at authentication and injection-style attacks here. So this is definitely not helping us on our path of getting on top

of these issues here. So when we looked at this, I wanted to see, okay, how can I make an AI model give me an injection-style attack, and how can I see how prevalent it is? So I came up with a very simple study where I had this query here, which was to write a JavaScript function that searches for products in a database, yada yada yada, and I purposely crafted this because I wanted to have something called temperature in my responses. Temperature is like the variance that it gives. So if you ask the same question to an AI model multiple times, it will give you different answers. How different those answers are depends on

the temperature. So I crafted this prompt to have about a medium amount of temperature. I didn't want it to be wildly creative, but I wanted it to have some variation in the results that it gave me. Around 12% of the responses that it gave me were vulnerable from this query. Most of them were SQL injection, and I did this on ChatGPT. And so you can see here this is one of the results that it gave me from ChatGPT, where we can see immediately that it's taking untrusted data, in this case our user inputs, and it's using it directly in a query. And this also led me to the

question: why on earth would it do this, and why doesn't it do it every time, or why does it only do it sometimes? So I decided to ask ChatGPT these important questions. And I said, "Well, you've just given me this code. Can you read it and can you tell me if it's secure?" And ChatGPT is very happy to tell me why its own code is not secure. And then I changed my experiment slightly. I kind of wish I didn't because it ruins half of the story, but I asked it the same query, and I did this about 200 times to get a baseline, but I also said, "Can you

please make sure that it's secure?" And I wanted to know, if I added this, would it change the results? So what percentage of the code that it gave me was going to be insecure? And 100% of it was secure. So it gave me no insecure answers. So this leads to a whole bunch of questions as to what's actually going on. If I didn't have this slide, I could [ __ ] on AI all day now for another 10 minutes about how bad it is, unfortunately. But it leads you to the question: AI is able to determine what is secure and what is not, so why is it giving us

insecure code sometimes? Unfortunately, I don't exactly have the answer for this. I did ask ChatGPT, and ChatGPT's answer was that when I explicitly ask it to do something, it prioritizes that, so it prioritizes security. But this does give some hope. So we're seeing now that AI is generating a lot of insecure code. It is implementing things like injection. However, it is also able to not do that if we ask it to. So there is some hope that this is going to be part of solving that problem. So one of the questions that I think a lot of people have is: is AI going to be a friend or foe for security when it comes

to coding? And I think eventually it will become a friend, perhaps. Right now, not so much. But there is another element of AI, while we're down this path, whilst I'm being nice to AI now, which is that we can use AI to help create fixes for this as well. Because AI is actually able to determine what is insecure, it can actually offer us secure alternatives as well. So we're able to generate fixes for that. For example, here we have an SQL query, and here we have an AI model that's been able to solve that. And I do think that there's multiple people working on this at different

vendors. I think this will become standard in a lot of coding tools out there; not just any particular vendor is going to have the edge on this. But I do think that this is going to be part of the solution when it comes to getting rid of some of these problems that we have. So what are some of my final thoughts? Well, SQL injection and other injections: my biggest takeaway is that these are not technical barriers. These are not things that can't be solved for technical reasons. It's purely an educational issue that we're seeing, of not understanding how SQL injections are being used, how

we can solve that, and what we can do in the future. AI isn't helping us at the moment, but it will likely help in the long term. And the attack paths to exploit SQL injection can be very complicated, but we have an advantage because solving these types of injection attacks is not complicated to do ourselves. So with that, thank you for listening to the presentation. We finished a little bit early, but I'll be happy to have a chat with you and take any questions that you might have. So thank you all for listening.
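To make the closing point concrete, here is a minimal sketch of a product search written two ways, once with user input concatenated into the query and once with a bound parameter. The study prompt in the talk asked for a JavaScript function; this sketch uses Python's built-in `sqlite3` module instead, and the `products` table, its `hidden` column, and the function names are assumptions made up for illustration.

```python
import sqlite3


def make_db():
    """Build a small in-memory database (hypothetical schema, not from the talk)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, hidden INTEGER)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?)",
        [("red shoe", 0), ("blue shoe", 0), ("internal sample", 1)],
    )
    return conn


def search_products_unsafe(conn, term):
    # Untrusted input is concatenated straight into the SQL text.
    query = (
        "SELECT name FROM products WHERE hidden = 0 AND name LIKE '%" + term + "%'"
    )
    return [row[0] for row in conn.execute(query)]


def search_products_safe(conn, term):
    # The SQL text is fixed; the input is only ever passed as a bound value.
    query = "SELECT name FROM products WHERE hidden = 0 AND name LIKE ?"
    return [row[0] for row in conn.execute(query, ("%" + term + "%",))]


conn = make_db()
attack = "' OR 1=1 --"
leaked = search_products_unsafe(conn, attack)   # the hidden = 0 filter is bypassed
blocked = search_products_safe(conn, attack)    # treated as a literal search string
normal = search_products_safe(conn, "shoe")     # ordinary searches still work
```

The only difference between the two functions is where the input goes, which is why the fix is simple once the risk is recognized.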

Yes.

Going back to your slide earlier about AI and questioning whether or not it was coding securely: was ChatGPT in this case hallucinating? Yeah. Well, I don't know if it was hallucinating, because the answers that it was giving me were secure, but I'm not too sure why, because how these AI models work is they have a master prompt behind them. And I would assume that the master prompt would say something like, if you're producing code, create it securely. So why it then does seem to, I guess you could say, hallucinate in SQL code, I'm

not sure. And the area that's very fascinating about hallucinations with AI is that it has a unique ability to hallucinate dependencies and packages. And that is just kind of crazy to me. And it will even tell you afterwards that it has hallucinated them, but for whatever reason, in that moment, it does it, and I'm just not smart enough to be able to tell you why that is or what it's doing. But it just seems very strange to me. And I guess it just goes to show, when we go down this path, that we have to be very explicit always in security. And I think perhaps at this point that

you know, we're not being explicit enough in that. And we also have this new generation of vibe coding, which, you know, I don't necessarily think is a bad thing. However, if the people who are using AI tools to code don't understand basics like SQL queries, and the AI systems aren't smart enough to give you non-vulnerable code unless you explicitly ask for it, it reminds me of these jokes with the genies where they give you exactly what you ask for. We're kind of living in that reality with coding at the moment.
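The comparison described in the talk, the same prompt run many times with and without an explicit request for security, can be tallied with something like the sketch below. Everything in it is an assumption for illustration: the talk does not say how responses were classified, the `looks_injectable` heuristic is deliberately crude, and the sample responses are invented stand-ins, not real model output or the study's data.

```python
import re

# Hypothetical heuristic: flag lines that contain an SQL keyword and also build
# the string by concatenation or template interpolation.
SQL_KEYWORD = re.compile(r"\b(SELECT|INSERT|UPDATE|DELETE)\b", re.IGNORECASE)
CONCATENATION = re.compile(r"""['"`]\s*\+|\+\s*['"`]""")


def looks_injectable(snippet: str) -> bool:
    """Return True if any SQL-looking line appears to splice values into the query."""
    for line in snippet.splitlines():
        if not SQL_KEYWORD.search(line):
            continue
        if "${" in line or CONCATENATION.search(line):
            return True
    return False


def vulnerable_rate(responses) -> float:
    """Fraction of responses flagged by the heuristic (0.0 for an empty list)."""
    if not responses:
        return 0.0
    flagged = sum(1 for response in responses if looks_injectable(response))
    return flagged / len(responses)


# Invented stand-ins for generated JavaScript, one string per model response.
baseline_responses = [
    'const q = "SELECT * FROM products WHERE name = \'" + term + "\'";',
    'db.query("SELECT * FROM products WHERE name LIKE ?", [pattern]);',
    "const q = `SELECT * FROM products WHERE name = '${term}'`;",
    'db.query("SELECT * FROM products WHERE name = $1", [term]);',
]
explicit_security_responses = [
    'db.query("SELECT * FROM products WHERE name LIKE ?", [pattern]);',
    'db.query("SELECT * FROM products WHERE name = $1", [term]);',
]

baseline_rate = vulnerable_rate(baseline_responses)
explicit_rate = vulnerable_rate(explicit_security_responses)
```

A pattern match like this would miss multi-line query building and can misfire on safe code, so in practice a static analysis tool or manual review would be a more reliable classifier; the sketch only shows the shape of the comparison.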

Yep. Just wait one minute. Mic's coming. Thank you. I enjoyed your talk. I have two questions. My first question is about something you mentioned a lot throughout the talk, the education behind this. Because you did have a little point on culture, I'm wondering: is there a difference in the culture of developers versus security analysts? Because even now, with a project I'm working on, I can see that I'm clashing heads with the developers, in contrast to me working mainly on the security infrastructure, and it seems like there are just ideological differences and, you know,

methodological differences in how we should work. And then secondly, there was a mention of the differences between open source and closed source. I was thinking throughout the presentation, because at first there was a correlation, at least in some of the injection attacks, cross-site scripting and SQL, I think those were the first two, where there was a decent, sizable gap between the percentage of open source versus closed source. I'm wondering, is there a correlation of injection attacks being less common in open source projects, or are they just being caught more often? Yeah. So I'll answer the second question

first, which is about, yeah, open source projects, and they're bigger, generally speaking. I'm a massive fan of open source, and there's a rule in open source called the many eyes rule, and the many eyes rule is that these things have multiple people looking at them. I think when it comes to vulnerabilities like SQL injection, which huge numbers of developers all know how to deal with, when you have those types of vulnerabilities out there in public, they have a much higher likelihood of getting caught. So I think that's what we're seeing in that difference between open source and closed source: there are lots of developers that

know how to deal with this issue and they're able to solve that, whereas in closed source, you know, you don't have as many eyes on it. And then that kind of leads into the second question you had, about culture between security and developers. So yes, security and developers notoriously don't get along that well. And I think fundamentally it comes down to a difference in core priorities, where a security person's core priority is to not get breached, and whilst a developer also should have that mentality, they're under huge pressure for performance and to be able to move quickly. And I think AI is

not helping this, where AI is making developers need to perform even faster and faster, and there's less time for consideration of what they're actually doing, and that comes back to their core process. And so it is this big drama of security and developers, where developers can often feel like security is slowing them down, and security feels like developers aren't listening to them or taking them seriously. And I think a lot of that just needs to come down to huge efforts in shifting left from a cultural point of view in organizations. I love the idea of having security champions, that is, developers who really champion security.

Um, I can't really talk about tools too much because I work for a vendor and anything I say is stupidly biased. But, you know, I do really enjoy when security tools are made for developers. I think there's a lot of security tools that are made for security people, and they expect developers to use them, and then that just doesn't work out. So I think there are definitely some cultural differences there. But ultimately, developer or security, no one wants to produce vulnerable applications. So we are still all on the same team, even if we don't always align. A question there. Thank you. The solution seems to be so simple.

The AIs can do this for us and all we have to do is ask them to do it. And yet the carbon-based life forms appear to be too slow even to make the request. What can we do to teach our carbon-based programmers to at least ask the AI for help? Yeah, I think that's interesting. But I think what it always comes down to is, when I show the slide, and I've given this talk a couple of times, when I show the slide of no vulnerabilities when I ask, everyone's kind of surprised by that. So I think the fact is that we're not even aware, all of us carbon-based life forms are not aware, that we even

need to ask it to do it. So that's kind of the first step, understanding that, okay, we need to do it. There's something called AI bias too that we're struggling with a lot, and that is that I know my own faults and I know that I can be a pretty crappy coder when I want to be, and I just seem to trust AI that it's not. So there's this bias that even if I look at something and I think that it may be insecure or not the best, I'm going to emphatically trust the AI more just because it's an AI system, even though the AI models are trained off the Common Crawl database. The Common Crawl

database, for coding, is mostly made up of open source GitHub projects. If you threw a dart at a random open source GitHub project and looked at the quality of code inside there, you know, you're not guaranteed that it's going to be very high. And that's what these models are being trained on. So there's a lot of variance in there. So I think it comes down to a lot of things: knowing that we need to tell the AI that the code needs to be secure, not trusting it emphatically, and using it as a tool to help us but not, you know, to kind of take over. I think that's really what

what we need to do. But for a lot of things the solution has always been simple and we've always struggled with it. And so, yeah, I do think AI is going to be helpful in that. So hopefully that answers it. Yeah, another question just over there. Um, this is just a little nitpick. On the slide where you said 12% of the results had vulnerabilities, I think the prompt was the same as the one that had 0% vulnerabilities. So the prompt is exactly the same apart from this part here, "also make sure the function is secure," whereas... oh no, you're right, you're

right, I don't use it on that slide. I was so confident. So on this slide, yeah, this part here shouldn't be there. Okay, that last sentence? Yeah, yeah, thanks for picking that up. Thanks. Okay, a few more. Yeah, I'm just going to point in that direction. I'll see where he goes. Excuse me. Thanks. When you talk about SQL injection, I was kind of curious. There's a lot of different flavors of SQL, from MySQL, Postgres, Oracle. Yeah. Do you see a higher likelihood of injection in one flavor versus another? And when you're doing an injection attack, how do you determine which flavor of SQL you're working with?

Yeah. Really good question, and one that I've wanted to answer. I don't have an answer for it because I haven't had the resources to look into that. At the moment, I'm looking at big data sets and just taking the CWE number. So how I've been doing it is taking the CWE and applying that as it applies to injection. But that doesn't often distinguish between the different flavors of SQL. So the next step of the research, to be able to answer that, which is one that I want to be able to answer, is to then go a little bit deeper into all of them. This

is the perfect thing for an AI model to help me with, because I'm not going to be able to look at all of these projects, but that will be the next step. So I'm sorry that I can't answer the question. I would love to be able to, but hopefully in a few months I will be able to answer that question. Look forward to that. Um, I got here a bit late, so I don't know if you said this at the beginning, but I have two questions. First, what AI model specifically were you using with ChatGPT? Was it 3.5 or 4.0? 4.0. Have you tested any

other models, like, I think the one from a few months back was o1 or whatever? And have you used any others, like DeepSeek, or is this specific to ChatGPT? I have been experimenting with this. The reason why ChatGPT was used is that ChatGPT gave me the biggest variance in answers. A lot of the other models have their temperature turned down more, so I got almost identical responses each time, which obviously wasn't useful in my case. So I'm wanting to experiment with different models and different areas, but I first need to find a prompt that is going to give me

adequate temperature and variance in results across all the models. So I specifically didn't include any others in there because I didn't feel like it was going to be a fair comparison with different models unless I can get a similar level of variance in my responses before moving on. But yeah, a really good question, and again, I will definitely look into that. And I also want to look into Cursor and IDE-based things, GitHub Copilot, ones that are more tuned for coding, to see how they compare as well. So I think there's a lot more research that needs to be done in this, and whether or not that

happens. You had another question? Just a small one. Do you think it could be a technical thing, where in a few years the models have improved and you don't have to ask them to do this, or do you think this will always be a persistent thing where you will have to specifically put what you want? I mean, to me it seems reasonable that this will be something that generally becomes solved. But honestly, I also don't quite understand why it's giving us vulnerable responses if it knows that they're vulnerable. So I do think that there's going to be

persistent problems with AI-generated code. However, I do think that they will get better, and I think we're seeing them get better in real time as well. The first research that I did on this was with GitHub Copilot when it first came out in 2023. I prompted GitHub Copilot and then I just let it create what it wanted to create, with as minimal prompts as I could, and it would always create extremely vulnerable applications from that. If you repeat that experiment two years later, the complete opposite is true: it almost never gives you insecure applications.

So we're seeing it in real time get better and better. The coding-type AI models like GitHub Copilot, they're baseline models, which means they kind of take your input and adjust it, and they're looking at your files. They're always more secure at the moment than generative AI, which is something that just creates something from a prompt. So I think generative AI will create more problems for us. And it seems reasonable to me that they will get more secure, but also, the models that are trained, they're sledgehammers. They're designed to do numerous different things. So it also makes sense that they won't be perfect for this one specific job, but we do need models that are

Yep. Hey, so this kind of ties a little bit into what you were just talking about, but it seems like the AI is pretty good at looking at a textbook example of injection, right? Like, okay, here's how I take user input and not, you know, execute it against the database. Yes. If we look at the MOVEit example you gave, where we're taking stored data and then executing it later, there's kind of that disjointed piece there. Do we think the AI would be capable of recognizing, okay, this could be a problem later on down the road? So ChatGPT didn't. I inputted all the files into ChatGPT and

I said, is there a problem here? It came back with numerous different linting-type issues, but didn't find the core issue. And I think that's just it not being able to connect it all. So I don't think it will be able to handle very complex examples like that. However, what I do expect it will be able to do, if not now then in the future, comes from this: when all is said and done, the one thing that would have solved it all is using a prepared statement, and I do think that models will be able to find that. And I think

one of the reasons why ChatGPT couldn't is because I was giving it a lot of files at the same time, and I think if we were to give it a more isolated task it would do better at that. Thank you. It's time. All right. Thanks, everyone. I appreciate you hanging around and staying here with me, and I'll be happy to chat more out there in the hallway. Thanks. [Applause]
