
Is ChatGPT a friend or foe to CTF competitions?

BSides Joburg · 2024 · 22:12 · 123 views · Published 2025-02
About this talk
Heloise Meyer examines whether ChatGPT can effectively solve Capture the Flag (CTF) challenges, testing the AI against real CTF problems from the BSides Joburg 2024 competition. The talk explores ChatGPT's capability to assist with CTF tasks through guided questioning, analyzing which challenge types are solvable with AI assistance and which still require domain expertise. Meyer raises critical questions for CTF organizers: whether structural changes to challenges are needed, and whether ChatGPT should be permitted or even encouraged in competitions.
Original YouTube description
Capture the Flag (CTF) events have become a popular format for ethical hacking competitions, offering participants invaluable opportunities to practice and hone their cybersecurity skills. With the release of ChatGPT, an artificial intelligence (AI)-based chatbot, the question now is: Can ChatGPT solve CTF challenges? CTF competitions offer an interactive environment to promote cybersecurity education, allowing students to gain hands-on experience solving cybersecurity challenges in a fun but controlled environment. One such initiative is the Cyber Security Challenge (CSC), first introduced in 2017 and organised by the South African National Research Network (SANReN). However, the emergence of ChatGPT has raised concerns regarding the possible influence of technology on the learning ability offered by CTF events. ChatGPT presents the capability to instantly respond to various text-based questions following a conversational approach. The typical style of CTF challenges usually follows a question-answer format, which offers students the ideal opportunity to enlist the assistance of ChatGPT. This talk will briefly discuss the ability of ChatGPT to solve CTF challenges.
Questions to consider:
- Are structural changes required to CTF challenges?
- Should ChatGPT be permitted or even encouraged?
Transcript [en]

Good morning, everybody. Thank you for this opportunity. Apologies for rushing the previous speaker. That was not my intent, but I'm going to be quite honest. The longer I sat there, the more nervous I got. This is quite a big audience. I'm actually more from an academic background, so the audiences tend to be a bit smaller and perhaps sometimes even less intimidating. But yes, I'm Heloise Meyer. I'm all the way from the CSIR. I worked there in defence and security for 10 years. And then I changed sectors. I went to SANReN, so the South African National Research Network, where I'm currently housed. I'm responsible for our CSIRT there, but also to host an annual

Cyber Security Challenge, a CTF competition for university students. So two years ago when I started this, ChatGPT wasn't even in the conversation, so it wasn't a concern for me at all. You know, we do these CTF competitions and it's up to the students to showcase their skills to see if they can crack these challenges. And then at the end of that year, end of 2022, boom, ChatGPT. And now I need to know, is ChatGPT a friend or foe to CTF competitions? I need to know this because if we're going to do this competition and we're going to award prizes to the best student for solving

these challenges, is it because the student had the skills to solve the challenges, or because the student was just good at using ChatGPT? So that is the question that I am trying to answer here today.

Sweet. So for those of you, I think most of you might be familiar with CTF competitions, but for those that might not be aware, it's actually been around for quite a few years. I mean, the most well-known one currently is at DEF CON, started in 1996. I had the privilege to attend two DEF CON events and just peeked into the CTF competition. My skills aren't nearly good enough to compete there, but it's quite interesting to see, you know, the competitive nature and, you know, the eagerness to solve those challenges there. And, you know, since its uptake, it's become quite an important component, you know, to promote cybersecurity education, but also to act as sort of training exercises,

you know, to upskill individuals in cybersecurity. You know, it allows for, you know, training in various subdomains of cybersecurity. I mean, web security, binary exploitation, I mean, the list is there. There's a lot of things that can be learned. And I think those of you that did participate in the CTF competition for BSides this week would probably have seen a lot of these domains being covered. And let's be honest, for those that have done it in the past, it's fun. I actually still enjoy doing it, even though I'm a bit of a noob still with some of these categories. But it's really fun. It's a very fun way to learn cybersecurity

and learn specific concepts in cybersecurity. And I must admit, this is actually how I got into cybersecurity. My background was more computer science, but it was more software development. My idea was never to go into cybersecurity, but exposure through these competitions actually got me interested. So I mentioned our competition. I'm responsible for the annual SANReN Cyber Security Challenge. It started in 2017 already. So it's a competition that's open to all university students within South Africa. We've also expanded a bit beyond our borders. Last year we had quite a good team from Namibia that participated

as well. And with this particular competition, it's hosted alongside the annual Centre for High Performance Computing National Conference. It's three days of active participation. We take the top 10 teams at the finals. We have an attack and defend. There's some custom challenges, usually from industry. Obviously also a CTF event, some social engineering. But the idea really of this and the sort of motivator for this is really to give students exposure to cybersecurity. You know, our keynote speaker, Dominic, talked about that small droplet, you know, that we currently have. And with this competition, even though it might be a small step, we're trying to expand. We're trying to get students interested in

cybersecurity, upskill them to hopefully take up employment later on in a cybersecurity domain. Now, the thing about CTF competitions: there are different modes to them, but the most popular one is the Jeopardy-style CTF challenge, and this is also what was seen with the CTF event here at BSides. So if I want to explain CTF challenges to people, I always refer to Jeopardy style: it's usually puzzles. It's actually puzzles that you need to solve. You know, it's often about real or current vulnerabilities. There's different categories to them. And the real idea is to actually solve this challenge, solve the puzzle to get a flag, which will be the answer

and will get you points. Usually the team or the individual with the most points will eventually actually win the particular CTF event. Now, the thing with Jeopardy-style CTF challenges is this kind of question-answer concept, you know, where you're asking a question and you want to get the answer, you get the flag. And this is an ideal opportunity to enlist the help of ChatGPT.

So I actually went to ChatGPT and asked them, listen, can you provide answers

responses for CTF challenges. So I asked ChatGPT 3.5 about six, seven months ago when I actually started this investigation. And then recently, actually yesterday, I also asked ChatGPT 4o mini, so Omni mini. And it's interesting, the difference in the answer. And I know the more you ask the same question to ChatGPT, it will always have a kind of different response. But it was interesting for me to see the difference where with 3.5, it was like, I'm sorry, but I can't assist you, but still, I mean, if you need to have a bit of a push in the right direction, I can provide it. With 4o, it was basically, I can assist you in understanding. So it's just, for

me, it was interesting to see the change there. I don't know if this is intentional. Perhaps I need to ask this question a few more times, but it was interesting to see this change.

So what I decided to do is perhaps to prove a bit of a point, is I took some of the CTF challenges of this week's CTF event. Sorry, Ivan, this wasn't intentional, but I needed to know to see, can I actually use ChatGPT to solve some of these challenges?

I actually tried a few more, but in terms of the time available, I'm only going to discuss the six here. And it was quite interesting. Basically, I can put it down to three categories, where some of them I was able to solve using ChatGPT. I tried to put my own sort of knowledge and insight aside and tried just to rely on ChatGPT, but it was a bit difficult at some instances, but some of them were quite easily solvable, and I will explain why. And then some of them you could solve, but it required a bit of intervention. And then some of them, I would say, ChatGPT only offers support. So for these that were solvable, there were two of them.

Commitment, which is an open source intelligence challenge, and SpideySense, which was more of a forensics-related challenge. So for Commitment, it really boiled down to asking the right question. So for Commitment, you were looking for information about a specific username. So I asked, can you find that username? It revealed some information, and then I asked, but I want more information, not just what you've given me. And essentially that led to the identification of a previous CTF event

with the same challenge, which led me to the solution, to the flag. So I didn't solve the challenge, I just got a previous write-up through ChatGPT. For SpideySense, which was really to look for something unique or something interesting in a specific image, I basically went through several questions and asked ChatGPT, okay, can you provide me options to identify strings in the image source file? And source file is quite a key word there. So it gave me a few options. Eventually led me to a hex editor, which I then made use of, and I saw, okay, this is a JPEG image. I want to know where's the start, where's the end. I asked ChatGPT. It gave me the start markers, the end markers.
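
The marker check described here can be sketched in a few lines of Python. This is only an illustration under assumptions, not the tooling used in the talk: the function names are hypothetical, and the scan takes the first end-of-image marker it finds, which can be wrong for files that embed a thumbnail.

```python
import re

SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def trailing_bytes(data: bytes) -> bytes:
    """Return whatever follows the first end-of-image marker (b"" if nothing)."""
    if not data.startswith(SOI):
        raise ValueError("data does not start with a JPEG start-of-image marker")
    end = data.find(EOI, len(SOI))
    if end == -1:
        raise ValueError("no JPEG end-of-image marker found")
    return data[end + len(EOI):]


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Rough stand-in for the `strings` tool: runs of printable ASCII characters."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]
```

Reading a file with `open(path, "rb").read()` and passing the bytes to `trailing_bytes` shows whether anything was appended after the image data, which is the kind of anomaly a hex editor makes visible.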

And then I saw, okay, but at the end of this JPEG file there's something called a PK. Remember, I'm trying to play now, essentially I know what that means, but I'm trying to play perhaps a bit dumb in this scenario. And I asked ChatGPT, okay, what does that PK mean in the JPEG file? And it referred me to a zip archive. And essentially I extracted the zip archive, and when I tried to open the zip archive, let me take a step back, I saw it was password protected, and I asked

ChatGPT, okay, how do I crack a password? And it gave me the exact steps required using John the Ripper. And so essentially I got the answer. So for that particular challenge, you don't need any insight, you just need to be able to ask the right questions to ChatGPT. So the following two, again, solvable, but perhaps with a bit of intervention. The APK reverse engineering. This is actually a field I'm quite familiar with. I worked quite a long time in mobile security. So I again tried to put my own knowledge aside here. But again, asking the right questions provided me with the steps to actually reverse the APK

file. And also, you know,

once I've actually completed that and got the source code of that APK file, putting that into ChatGPT, just the block of code, dump it in there, it gave me quite a nice explanation of what's happening in this code, and it became clear that there's a certain hash that I will need to be able to crack to actually get to the flag. When I asked ChatGPT what can I use, it recommended CrackStation, and Bob's your uncle, got the flag. In terms of Holiday Pictures, this was an interesting one because this was again an open source intelligence CTF challenge, and when I asked ChatGPT can you find information about, I hope I'm pronouncing this right, David

Turin, or Talrin, it said, no, it's too common a name. I don't know. And I tried again a bit more and a bit more, and it actually recommended, do a Google search. Especially if you're looking for social media, just go take the name, take the social media platform and do a Google search, which I essentially ended up doing. I got to the profile. Part of the profile was a BSSID, and I asked ChatGPT in this scenario, because I'm looking for a specific location, listen, can I get location data from a BSSID number? And it gave me a few options, Google Cloud API or WiGLE. I'm quite familiar with WiGLE, went there, got the

location, got the answer, got the flag. So there was a bit of intervention, a bit of additional thinking required, perhaps a bit of manual knowledge also required, but essentially not bad guidance given by ChatGPT. Then the last two, and perhaps here, it might also be my limited knowledge as well. The first one there is basically a Base64 encoded string that you needed to decrypt. If you ask ChatGPT, it will decode Base64 for you, but then I got gibberish, and unfortunately that's where I ended. If you ask ChatGPT to perhaps explain anything gibberish to you, it's not going to get you far. Similarly with the Python decode program, it provided me nice

information on how to decompile a Python application. It also gave me a bit of insight into the program, but for this particular challenge, you actually need to have knowledge and understanding of the problem that you need to solve. Which is perhaps something I could pursue a bit more, perhaps provide ChatGPT with a few more questions, and it will get me to the answer. But to me, this shows there needs to be an understanding of the problem to actually get to the solution. And if you don't have that understanding, you can't provide it to ChatGPT, and it can't really help you in this regard.

So, I mean, in terms of findings, it's quite clear ChatGPT is unable to provide answers to CTF challenges at the moment, as far as my investigation has gone.

It will provide you with the right answers when you ask it the right questions, especially if you can provide it with certain context. I've seen the more questions you ask about a certain topic or program, the better and more relatable answers you will get. Certain keyword searches can lead to potential answers for CTF challenges if there are write-ups available. And yes, ChatGPT can offer guidance about certain topics. Is that a bad thing? That's something that we need to determine, especially when it comes to CTF challenges. The one thing that I can mention with ChatGPT specifically,

I see it as a way to have perhaps a bit of information overload. So if you ask it, perhaps, how can you crack a password? Okay, it doesn't like those kinds of questions, but should you ask that particular question, it will give you multiple answers to it. With CTF events, which are time-based, going through all those options might not work to your advantage.

The one thing with ChatGPT that still needs to be further explored is, especially with the newer version, version 4, the ability to upload files. What impact will that have specifically on CTF challenges? That's something we need to keep a close eye on. So, perhaps to conclude, what is the future? When it comes to ChatGPT and CTF challenges, the answer is I don't really know at this stage. I'm actually in two minds. Should I ban this from our competition specifically to see how the students actually perform without the assistance of ChatGPT or similar tools? Or should we still allow it? I mean, it is freely available. A lot of people

are using it in the work environment. There are benefits to it. So perhaps what we need is to just do a bit of rethinking

on how we develop these challenges so that we can still evaluate the skills, the knowledge of the individual performing the challenge and not just how well they can actually use ChatGPT to solve this. But I'm open to suggestions. I'm still around for the rest of the day. If there's any suggestions on how we should approach this, I will really appreciate it.

Thank you so much.

I'm not sure if there's now questions. I'm not sure if we can do questions.

Oh, there is one. I can sort of see. Yes. Yeah.

Specifically the Base64 one. So with ChatGPT-4, you could actually ask it to write a Python script and then run that Python script. So I think a lot of those questions were like, yeah, where it's something like Base64 decoding. Like it's not going to be able to do that just as part of the model, but you can get it to write a script and run it and that should work. Yes, yeah, yeah. No, definitely, definitely. So yeah, it's going to be interesting to see where this is going to go. I mean, a lot has happened in two years. And that's why perhaps we need to

perhaps adapt and change as this technology also improves as well. But we'll see what the future holds. I'm not sure, is there a question there at the back as well?
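
The script-based approach suggested from the audience could look something like the sketch below. It is an assumption about what such a generated script might contain, not something shown in the talk, and the function names `decode_base64` and `describe` are hypothetical. It decodes the string locally and falls back to a hex dump when the result is not readable text, which is the "gibberish" case where a further decoding step is probably still needed.

```python
import base64
import binascii


def decode_base64(text: str) -> bytes:
    """Decode a Base64 string, raising ValueError on malformed input."""
    try:
        return base64.b64decode(text.strip(), validate=True)
    except binascii.Error as exc:
        raise ValueError(f"not valid Base64: {exc}") from exc


def describe(decoded: bytes) -> str:
    """Show readable text when possible, otherwise a hex dump for further analysis."""
    try:
        text = decoded.decode("utf-8")
    except UnicodeDecodeError:
        return "binary: " + decoded.hex()
    if text.isprintable():
        return "text: " + text
    return "binary: " + decoded.hex()
```

Calling `describe(decode_base64(challenge_string))` on the challenge input keeps the decoding deterministic instead of relying on the model to decode the string itself.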

I think the other mic is coming. Awesome presentation. I was just thinking, like, and I was listening to your last statement with regards to the future, right? I mean, you do have a challenge because you don't want to stop kids and the new generation from using ChatGPT, right? But maybe there's a hybrid kind of version that we need to look at because you also don't want them to use it and not be able to think for themselves. So I loved your presentation. I thought it was brilliant. And I think one of the questions I walked away with, and maybe it's not a question, but a comment to

you, is that we need to think of a combination of both. And having us teach them that you can think for yourselves and how to actually do it. And then, OK, but how can we use it to better solve it, maybe quicker, you know, more productively, but still using your own thought process to get there. Because they're going to use it either way, right? You ban it and people are going to use it either way. So, yeah. Thank you. No, definitely. Yeah, I think we can take a few. Please, if anybody can also chase me off the stage if needed. But I think there's a few more questions.

Hi. I was thinking maybe if the students are going to be using ChatGPT with the CTFs, what about limiting the type of questions they can ask? I'm not saying what, how, when, but just limiting the type, but at the same time not giving away the answer by saying you can only ask these types of questions. Yeah. It's something we can most definitely explore. So I'm liking these ideas. I just need to jot them down. Very good talk. Thank you very much. Here's a question. Do the bad guys have access to ChatGPT? If they do, then you should not be banning it. I actually fully agree with that. I fully agree. So that's a difficult thing that I'm

sitting with because, like I've said, I want to make sure that if we do this competition, the team that wins, wins it because they have the skills and the knowledge. So that's why I'm thinking we need to look at the challenges we bring in and how they are formulated. So even though they can still make use of ChatGPT to get to the solution, there shouldn't be just that reliance on it. And yes, you need to have a balance. This is not going away. There's a lot of platforms. It's not going to change. So we need to find a middle ground, essentially. I think I have

a question over here. Oh, okay. Sorry, you're in the light. No worries. So, I think one of the ways you can address the challenge of AI like this is by making them part of the challenge. Have you guys tried doing LLM pen testing type challenges as well? Not really. So, I'm not really responsible for developing the challenges, but I know my colleague, I say colleague, but he's my ex-colleague, but I still consider him my colleague. He's here, so he will take note of that. For sure, yeah. So I think a lot of more modern CTFs now have these challenges

where you attack an LLM and you try to get it to do something wrong. So that might be a nice dimension, you know, to add to the whole ChatGPT aspect of pen testing. That's all I want to say. That's brilliant. That's brilliant. Nice talk. Thanks. Thanks. Yeah. Cool. I don't see any other questions, so I think I'm good. Thank you so much. Thank you for all the feedback. I'm still here. I'll appreciate more. Thank you so much.