
Welcome to the presentation. As you can see, the title is Hacking Arena, an innovative learning platform for ethical hackers and hacker robots. First of all, let me briefly introduce myself. As you can see, my name is László Erdődi. I'm a researcher at the University of Oslo, where I teach ethical hacking. The presentation is divided into two parts. In the first part, I'm going to introduce our innovative learning platform for ethical hackers, the UiO Hacking Arena. In the second part, you will see our machine learning research related to offensive security, where the aim is to create a hacker robot. I'm going to elaborate on the topic later on,
but let me start with the very first part. First of all, we have to state that learning ethical hacking is useful but also challenging. Why is it so challenging? You can see my arguments here. There are many different types of problems that we have to solve. A huge theoretical background is needed to carry out the tasks. We need to work at a very practical level, which is quite demanding, but the practical approach is necessary. And without experience, it's not really possible. So how can we make it easier to start ethical hacking? Our idea is to create this platform. Our students are the main users of the platform, but we have more
and more users around the world. You can see the URL, hackingarena.com. What is the Hacking Arena? We have different types of hacking challenges. You can see the different topics here. Maybe it's better if I show you the real site. We have different types of challenges, for example network reconnaissance, web hacking, binary exploitation, reverse engineering and forensics. Right now we have more than 100 challenges, but I would like to emphasize that we have only just started to create the Hacking Arena. The number of challenges keeps increasing, and we also provide solutions for the different challenges. We are also about to start our YouTube
video solution channel for the Hacking Arena. So that's only the beginning, but I still think it's worth presenting because there are many interesting things. I hope you will enjoy it. Let's go back to the presentation. According to our idea, all the challenges will be available online. So what's the difference between a CTF, a capture the flag competition, and the Hacking Arena? We provide the challenges continuously and we provide solutions. I think one of the biggest challenges for beginners learning hacking is the difficulty level. There are easier tasks in CTF competitions, but the level is still quite advanced in some cases for beginners. To make it easier for the students, we also provide very easy challenges, and a continuous learning
process is provided, at least we hope so. We also provide solutions, as I mentioned. I would like to show you some examples, and then I'll move on to the machine learning part, because you will see why these two topics are really connected to each other. If you visit the URL, as I said, we have different types of network challenges as well. Let's take a look at just one. We have information gathering challenges and port scanning. What about, for example, task number two? According to the challenge, there is a service somewhere on our servers. We have the Palpatine server, palpatine.hackingarena.com, and inside the specified port range we are looking for a service. After the port scan, we can find the service
which is at port 19101. So now I have Kali Linux running here, and this is the moment when you realize that it is Wednesday according to my computer. Yes. And what was the name? That was Palpatine. So I would like to connect to the service at palpatine.hackingarena.com, port 19101, as far as I remember. Something is wrong here. Yeah, I have a typo.
Very good. According to the banner information, this is a special eSafe appliance. So the task is about information gathering and getting in touch with services. It's not a big secret that by looking up default credentials on different websites we can easily find a default user and password, and after logging in to the service the flag will be there. That was just an introduction to one easy challenge.
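The scan-then-connect step from this challenge can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper names `scan_ports` and `grab_banner` are made up for this example, and the port range to scan is whatever the challenge text specifies (it is not given in the talk).

```python
import socket

def scan_ports(host, ports, timeout=0.5):
    """Return the ports from `ports` that accept a TCP connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports

def grab_banner(host, port, timeout=2.0, max_bytes=1024):
    """Connect and return whatever the service sends first, as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            data = sock.recv(max_bytes)
        except socket.timeout:
            data = b""  # the service waits for the client to speak first
    return data.decode(errors="replace").strip()
```

Against the challenge host this would be called as `scan_ports("palpatine.hackingarena.com", range(low, high))` with the range from the task description, followed by `grab_banner` on each open port. Only scan hosts you are allowed to test.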
In other cases we have different types of web hacking challenges, for example information disclosure. You can see the topics here: default settings, client-side validation, brute forcing, parameter tampering and so on. Now I'm focusing on one specific category, the well-known SQL injection. The very first challenge is really quite easy. I'm going to log in with my username, Laszlo. Believe me, I have no account here, but using the classical SQL injection approach it's quite easy to log in to the service and obtain the flag. In more complex examples, for example SQL injection number 2, the password field is hashed. Of course, these hints are not provided. Since the password field is hashed, we cannot really play
any trick here, so we need to do it in the name field. So how do we do it in the best way? If I would like to log in as admin, I simply comment out the rest of the query, and that works well for this challenge. So that was number two. We have other challenges as well, following the different difficulty levels. Let me quickly show you challenge number three from SQL injection, where we have an airport information board. Of course there are not too many parameters, and the type of the challenge is also given, so it's not that difficult, but we can still play with it to find the flag. So right now I think if I just
quickly check and verify whether that parameter is really vulnerable, then, yeah, quick success: we have a so-called blind SQL injection. For the exploitation I'm going to use some tools here on my Kali Linux, leaving the Palpatine server: sqlmap -u and the URL.
Maybe I force the technique.
Of course, the vulnerability is identified, and now I can quickly enumerate the results of the queries. There is the airport database. I need the tables of the airport database, and there I have a flag table.
And what about the columns? Good. And finally I dump the database.
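What a tool like sqlmap automates in the blind case can be illustrated with a small boolean-based extraction. The sketch below runs against an in-memory SQLite database, not the real challenge: the `flights` and `flag` tables, the `flight_exists` lookup and the flag value are hypothetical stand-ins chosen for the example.

```python
import sqlite3

def make_db():
    """Build a throwaway in-memory database (hypothetical schema and data)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE flights (code TEXT, destination TEXT);
        INSERT INTO flights VALUES ('OS101', 'Oslo');
        CREATE TABLE flag (value TEXT);
        INSERT INTO flag VALUES ('FLAG{demo}');
    """)
    return db

def flight_exists(db, code):
    """Vulnerable lookup: the input is pasted into the SQL text, and the
    caller only learns whether a row matched (a yes/no oracle)."""
    query = "SELECT 1 FROM flights WHERE code = '" + code + "'"
    return db.execute(query).fetchone() is not None

def extract_flag(db, max_len=64):
    """Recover flag.value one character at a time from yes/no answers."""
    recovered = ""
    for pos in range(1, max_len + 1):
        for codepoint in range(32, 127):  # printable ASCII
            # The trailing "-- " comments out the closing quote of the query.
            payload = (f"OS101' AND unicode(substr((SELECT value FROM flag), "
                       f"{pos}, 1)) = {codepoint} -- ")
            if flight_exists(db, payload):
                recovered += chr(codepoint)
                break
        else:
            break  # no printable character matched: end of the value
    return recovered
```

The quick vulnerability check from the demo corresponds to comparing `flight_exists(db, "OS101' AND '1'='1")` with `flight_exists(db, "OS101' AND '1'='2")`: if the answers differ, the parameter is injectable. Building the query with bound parameters (`db.execute("... WHERE code = ?", (code,))`) closes the hole.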
Yes, and the flag is here. So good luck if you would like to try it. We have many other types of challenges. Just quickly, the Sidious server, as far as I remember. This is another type of database-related vulnerability. I'll just use my cheat sheet here; I'm already prepared. I'm copying the data. So this is the calendar of the Death Star. Testing, and with that specific XPath injection trick I can enumerate the XML file. Sooner or later I am going to find that there is an emperor, and if I enumerate the emperor's... yeah, the emperor was user number 4 and the field number is 5, the sibling number. With that trick, exploiting the XPath injection, I can find the
flag. So we have different types of challenges, and I would like to move on to machine learning, but just quickly: we have binary exploitation challenges as well, from very simple stack-based overflows to heap overflows. For example, I log in to the Kenobi server.
I don't remember the port number. That was easy: 801. And it's quite easy to trigger an overflow.
I don't remember what the correct padding length was. It's not enough. But still, I have the ready exploit here. Using pwntools, it's quite easy to write the exploit. Now that was solution.py, kenobi.hackingarena.com, and the port number was 801. Good. I have the shell and the flag.txt. Yes, so we have different types of challenges, and the number of challenges keeps increasing, with solutions and other details. Let's move on to how to use it for research, which is a more interesting question, and I have 10 minutes left. Of course the idea is obvious. We have the data, the attack data. Why not use it for machine learning? So what is the aim? I have some theoretical questions.
Is Skynet coming soon? I really hope not, or at least not in the way that you can see here in the Terminator. What do we have now? Well, that's another interesting question. Who knows? So it's better if I ask what is published now. Of course, machine learning
is one of the most popular topics nowadays, not only for security but in other fields as well. Defensive security has some very relevant results from machine learning, such as pattern recognition for malware analysis. For offensive security we already have some attempts. The most famous one is the DARPA Cyber Grand Challenge, but there is other ongoing research as far as I know. One thing is sure: we need the data. So we would like to use the data that we have, the attack data, for machine learning. So the UiO Hacking Arena has a double aim. The educational value is to have a learning platform. We believe it's a unique learning platform, and it's improving. And the other value is the research
value, to use the attack data. But I would like to emphasize that we don't really want to create a Skynet. The robot hackers will appear anyhow. I really think that's inevitable. But if we can't prevent it, let's try to do it first and be well prepared. So what do we mean by robot hackers? We would like the agent to carry out exploitation. We would also like it to learn from previous cases and maybe build complex exploitation strategies. It's not like security scanners that run predefined scripts and analyze the answers in a very simple way; it should learn the best strategy from the previous cases. So it's a real hacking agent. So what are the different approaches that we have? Of course it's machine
learning. The first question is what to use. Is it unsupervised learning that we should use? I don't really think so. Or maybe supervised learning, where we have an input data set and a labelled data set? What about reinforcement learning? It's very interesting, because if we take a look at how hacking goes, it's typically a very interactive process. The attacker first does some information gathering, trying to find background information using the experience that he or she has, then interacts with the service, checking it manually or maybe using tools as well, and, using previous experience and knowledge, tries to exploit the vulnerability. And there is also an exploitation part,
exploration part, I'm sorry, trying to find unique ways and, in that way, trying to find the vulnerability. So we really believe that reinforcement learning fits this approach best. So what is reinforcement learning? I use the Wikipedia figure here. The agent interacts with the environment, the environment answers, and the agent gets a reward and moves to another state. Here the agent is the hacker robot, and the environment is the service or the hacking problem. The agent can carry out actions and the environment responds, which provides some reward, and it can be a negative reward as well. And the agent will be in a different state. So that's how it goes. The problem is that this is not a chess game, where
the number of states and actions can be high, but it's very well defined.
Here the attacker observes different types of information and can send several requests, so the number of states explodes extremely quickly. So what we need to do is to simplify the hacking problem. First, how do we model it? We model the main exploitation. Hacking can be quite complex: information gathering, technical information, attacking the service, exploitation, then extra tasks, opening channels, etc. But here we focus on vulnerability finding and exploitation. And what about the environment? Right now the environment is simulated. We created a simplified service. It's very similar to what DARPA did with the special operating system where there were only 7 or 8 system calls. So the number of actions is limited, and we created
simplified problems. Together with my colleague Fabio Zennaro, we presented a paper which you can access here, with four different types of simplified hacking problems. The first is a really easy one, a port scanning problem. In that problem we have a computer that provides some ports, but only one port has the vulnerable service. In the very first approach we simplified it so that we have n possible port candidates, but only one port has a service. The agent has to do the port scanning first, and it also has an action that can send an exploit to different ports. It's quite easy, because as a human I would do a port scan and
obtain the open port, then send the exploit to that port. And that's all. But the agent has to learn the process, and in our research we managed to show that the agent gradually learns that using the port scanning result is useful. Step by step, the agent learns that the port that is open is the one that should be attacked.
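The port scanning problem can be sketched with tabular Q-learning in Python. This is a minimal illustration and not the formulation from the paper: the state encoding ("unknown" before a scan, the open port number after it), the cost of 1 per step, and the hyperparameters are all assumptions made for this example.

```python
import random
from collections import defaultdict

N_PORTS = 8                    # number of candidate ports (assumed)
SCAN = 0                       # action 0 scans; action k (1..N_PORTS) exploits port k
ACTIONS = range(N_PORTS + 1)

def greedy(q, state):
    """Best known action in `state` (ties go to the lowest action index)."""
    return max(ACTIONS, key=lambda a: q[(state, a)])

def run_episode(q, rng, alpha=0.1, gamma=1.0, epsilon=0.1, max_steps=500):
    """Play one episode, update the Q-table in place, return the step count."""
    open_port = rng.randint(1, N_PORTS)   # hidden from the agent
    state = "unknown"                      # nothing observed yet
    for step in range(1, max_steps + 1):
        if rng.random() < epsilon:         # explore with a random action
            action = rng.randrange(N_PORTS + 1)
        else:                              # otherwise exploit current knowledge
            action = greedy(q, state)
        if action == open_port:            # exploit hit the vulnerable service
            target, next_state, done = -1.0, None, True
        else:                              # scan reveals the port; a miss changes nothing
            next_state = open_port if action == SCAN else state
            target = -1.0 + gamma * max(q[(next_state, a)] for a in ACTIONS)
            done = False
        # Q-learning update: move the estimate toward the one-step target.
        q[(state, action)] += alpha * (target - q[(state, action)])
        if done:
            return step
        state = next_state
    return max_steps

def train(episodes=3000, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)
    return q, [run_episode(q, rng) for _ in range(episodes)]
```

After `q, steps = train()`, the learned greedy policy can be read off with `greedy(q, "unknown")` and `greedy(q, port)`. In this toy setting the step count per episode should settle close to the two-step optimum (scan, then exploit the open port), with the remainder caused by the exploratory random actions.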
It seems trivial, but still, after a few episodes the solution speed increases very fast. In other problems we created simplified web servers and simple servers as well. What you can see here is a website for which we have a simplified model. That means that the attacker can carry out a global scan or read a file. The first time, the attacker will read the index file, and the result will be a set of files, since the index file can contain links inside the HTML to other files. We have another action, explore, which checks the content of the file: not the links, but maybe indirect references to other files. I mean, if I read the source of a file,
I can realize that this is a WordPress site. And if it's WordPress, I can anticipate that maybe wp-admin/index.php exists. So the result is another set of files, but the action is different. Then we have inspect, where we can analyze a file and maybe find a flag or obtain the parameters. If it's a server-side script, it can have parameters. The last action is send, where we can send one parameter to one file. So it's quite simplified: as you can see, only the parameter name is here, not the value. But still, it's a very simplified web hacking problem. And I forgot to mention the result: after a few episodes the number
of steps decreases. As a continuation, and since it's an open platform, I created three different simplified services on the Chewbacca server. We have a simplified website with flags, which is randomized. We can read a file, and the result will be a set of files. We can deep read a file, which means analyzing the content for indirect references to other files. And we can try to access the files, where we may or may not obtain the flag. So I think it's time to do some simulation again.
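The three actions of this simplified website (read, deep read, access) can be modelled as a tiny simulated environment. The Python sketch below is an assumption about the shape of such a model, not the actual Chewbacca service: the class `ToyWebsite`, its file layout and the scripted `find_flag` walk are hypothetical, and the walk is a fixed strategy rather than a learning agent.

```python
import random
from collections import deque

class ToyWebsite:
    """Simulated site: `links` are visible hyperlinks, `hidden_refs` are
    indirect references that only a deep read reveals."""

    def __init__(self, links, hidden_refs, seed=None):
        self.links = links
        self.hidden_refs = hidden_refs
        files = set(links) | set(hidden_refs)
        for targets in list(links.values()) + list(hidden_refs.values()):
            files.update(targets)
        # The flag is placed in one randomly chosen file.
        self.flag_file = random.Random(seed).choice(sorted(files))

    def read(self, name):
        """Files linked directly from `name`."""
        return set(self.links.get(name, ()))

    def deep_read(self, name):
        """Files that the content of `name` hints at indirectly."""
        return set(self.hidden_refs.get(name, ()))

    def access(self, name):
        """Return the flag if `name` holds it, otherwise None."""
        return "FLAG{demo}" if name == self.flag_file else None

def find_flag(site, start="index.php"):
    """Map the site breadth-first, then access every known file.
    Returns (flag, file name, number of actions used)."""
    known, queue, steps = {start}, deque([start]), 0
    while queue:
        name = queue.popleft()
        steps += 2                      # one read plus one deep read
        for found in sorted(site.read(name) | site.deep_read(name)):
            if found not in known:
                known.add(found)
                queue.append(found)
    for name in sorted(known):
        steps += 1                      # one access attempt
        flag = site.access(name)
        if flag is not None:
            return flag, name, steps
    return None, None, steps
```

For example, `ToyWebsite({"index.php": ["accounts.php"]}, {"index.php": ["wp-admin/index.php"]})` gives a three-file site in which `find_flag` always succeeds but spends more actions than necessary. A learning agent would aim to reduce that action count over episodes.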
Yes, chewbacca.hackingarena.com, port 801. I can read index.php, and there is a link inside index.php to accounts.php. What about a deep read? If I analyze index.php, I can't find any other files on the web server. So what about reading accounts.php? It's empty. But maybe from the content, with a deep read of accounts.php? It's empty. So it's a very basic website with only two files. So we try to access index.php, which gives nothing, but what about accessing accounts.php? And the flag is there. In the other versions, we complicated the problem a little bit. We have an inspect action and a send action. With the inspect action, we can find parameters inside the files, and we can send a
value to a parameter. Maybe the flag will be there if I send a special value to a parameter. Here again the parameter value is always the same; only the parameter name changes. On port 803 the hacking problem is even more complex. Now we have a list of parameter values that we can send. We can send script, we can send a quote, we can send or 1 equal to 1, we can send /etc/passwd, and so on. So we have five different types of parameter values, and the agent has to learn to map the structure of the website with different actions. The agent has to inspect the files to identify parameters, GET or POST parameters. And
the agent can also send a limited number of different parameter values to the server-side script using different parameter names. The flag is somewhere there. This is the current level; the agent will try to learn it, and that's what we are working on right now. So I emphasize our paper again: if you're interested in more about our approach, please read the paper and find the conclusions there. And time is already up. With easy examples, we managed to show that reinforcement learning can be useful for modeling penetration testing. We found that tabular Q-learning is good, but it's not enough because of the state explosion. So what we need is function approximation, most probably with neural networks, to find the
appropriate action for that complex problem. Yes, time has already been up for 2 minutes, so I have to finish my presentation. Thank you for your attention, and as I said in the beginning, I'm looking forward to your questions.