
EchoTrailblazer bsidesco 2024

BSides Colombia · 32:58 · 48 views · Published 2025-04
About this talk
Imagine an AI for orchestrating cybersecurity processes, with a customizable voice and the ability to connect to IoT devices. We propose an evolution in how we interact with AI, combining it with pentesting and combining personalization with functionality to create smarter, more intuitive environments. #ciberseguridad #ai #bsidesco

Good afternoon, everyone. First of all, thank you for being here. I want to introduce my team. Here is Johnny Jají, full stack, who also works on AI and cybersecurity. This is Alexander Samakam, who works entirely in cybersecurity and artificial intelligence. And here we have Christian Urcuqui, who was already at BSides last year and has spoken at several conferences, such as DragonJAR, one of the most important cybersecurity conferences. And I am Luisa Fernanda; I come from the front end, and on the side I also work with AI and cybersecurity.

Hello, hello, how are you? Good morning, everyone. First of all, thank you very much for your time. I'm going to steal a moment from the presentation of the

guys. This presentation is theirs, to be clear. I'm just going to borrow a bit of the stage for, as we say, a commercial message, and to frame the topics we are going to cover. My idea is to outline why this project came about and to pick up some ideas that other speakers have already raised about the use of artificial intelligence. As you have already noticed, today we live in a world where artificial intelligence is used in many processes and operations. In practice it has come to support us in many operations, forgive the redundancy, and in decisions where it helps reduce the time of processes that are sometimes

human and difficult. If we talk about computer security, we are talking about a problem that is hard to cover even with the solutions that have been developed, which exist and have served us well, and which artificial intelligence has supported. And precisely because we have, as I mentioned, new technologies, we have events that we cannot manage or trace, such as the disgruntled employee, unrecorded computer attacks, and unreported vulnerabilities, among others. There is one word I like to mention whenever we discuss why artificial intelligence comes to support these cybersecurity processes, and that word is complexity: the complexity of carrying out tasks, as we can talk

about security analysts, researchers, among others. In this case, what artificial intelligence seeks is to support those complex processes, where the earlier deterministic systems also help but do not get us further. I also like to talk a little about external things, because there are events that we in technology tend to see as foreign to us, such as wars, cyber wars, and pandemics, among others. I'm going to put it this way, but it's my own view: we can consider ourselves lucky, in a way, because we are living through things that, for example, our ancestors did not. So there is also a technological acceleration. For example, the pandemic allowed us to dive into technology

more. But in turn, I'll leave you with a note from my last year of research: what is happening today in our technological day-to-day, in which you are looking at your cell phones right now, is leading us to a technological dependence that is sometimes unhealthy. Sometimes we think we have to step out a little from the digital world into the real world, to take a break from the things that the chaos of the internet can cause us. Well, others already talked about ChatGPT and generative models, among others. I just wanted to point out that artificial intelligence seeks to support processes that used to be conventional. Artificial intelligence is not something

new; it has been around for many years. It is just that Microsoft, OpenAI, and other stars knocked it out of the park, literally. There are people who think very well at the business level and know where to aim. As Andrew Ng, one of today's machine learning gurus, says, it is very difficult to think of an industry that is not impacted by the use of artificial intelligence. I have worked, for example, in entertainment, banking, health, and cybersecurity, among many other sectors. So, what role does artificial intelligence play? It seeks to support certain processes. Along what line? I always say it: in business, artificial intelligence; and as the means, if we talk about machine learning, data. If we talk

about a summary of some of the things that were also said in other talks, these are some of the uses of artificial intelligence we can find: authentication, biometrics, phishing detection, threat detection, and vulnerability management, among others. And we can see monitoring and prediction here as well, as a brief summary. I would sum it up this way: artificial intelligence seeks to give a proactive approach to operations that were more reactive at the cybersecurity level. There is a lot you can find, and I'm going to present what we have done, at least from Universidad Icesi in Cali, so you can see that in Colombia we also do research on issues

of artificial intelligence and cybersecurity; it can be done. Among those products, more precisely among those 14 projects, you can find a first book, which is open source and in Spanish, aimed at readers who were getting started in machine learning and cybersecurity. That was a first project, done during my master's degree. It was also presented at BSides, and I thank the organizers for that: in 2019, when I was starting out in this, I presented one of the early versions of this book. I invite you to look at it if you are interested; it's free. These QR codes are not going to take you to any malicious site; they take you to the book's site.

The second book, which I have here in my hand and which you can see here, can also be downloaded there as a PDF. It covers more advanced analytics, from both an offensive and a defensive perspective. So, for example, how to attack an artificial intelligence, how to generate audio deepfakes, and how to build defense models for detecting cryptojacking and botnets, among others. This is a brief summary: SQL injection attacks, cross-site scripting, among others. It is worth emphasizing that this is not work I have done alone, and that is why the guys are here. This work has been done with several students, master's graduates and also undergraduates, and it has been presented at several conferences, which you can see here on the screen; you can also follow those links for more technical content, such as botnet detection.

I just heard a question about the use of network traffic: we explored using NetFlow records to build anomaly detection models. And at Ekoparty a project on detecting audio deepfakes was presented at the time. Finally, the project you are about to see, before I hand the screen over to the guys, relates to a book that was built thanks to research funding granted by the company Dragonheart and Andro at the time. We are of course grateful for this funding, which helped us build this third book that is here on the

screen, which we are going to give away to you. I would like to mention, and this is a more personal message, that I am celebrating my 10 years in this area, and I printed these books so that you can have them. At the end of the presentation we are going to give one of these away in a raffle that the guys will run. You will find more information about the book at this link, via an interview that was also done in the ECOBOOKS. Without further ado, I hand over to the guys, who are going to present the project itself. Thank you again for your time, and for any questions, I'll be here at the party or having coffee, whatever. Ready? Thank you very much. Ready.

The first project we are going to talk about is somewhat related to the project we just saw. It combines front end with cybersecurity and AI, which is not a very common combination. As we usually see in the memes, and as we saw in the previous talk, security in the front end is very poor. It leaves a lot to be desired and has many gaps that people can attack. Mainly, I would say, this is because cybersecurity is usually the last thing front-end people are thinking about. That is true in any area as well, because cybersecurity has a very steep learning curve. So front-end people are not going to

focus on checking whether the code is completely safe or has any gaps. So the problem begins with front-end people not having the skills to know much about the cybersecurity side. The idea is that with the tool we made, we can help them improve the code they are writing. Besides the steep learning curve, it doesn't help that every day there are many new cybersecurity vulnerabilities and threats. It's very difficult to keep up to date with the problems that exist in the front end alone; now imagine a whole project. So, the idea of the project is to be able to help in

that part. Besides, most of us here like the console and so on, but there are people who don't, and the tools we use in cybersecurity are very unintuitive. So what I used to build my project was natural language processing. Broadly, here is what my project does: you have your code in an IDE. I made an extension, in this case for Visual Studio Code, though the idea is that it could work for any IDE. You are writing your code, and every so often the tool checks it in real time. So as you write, if it detects any vulnerability, it will

generate a warning saying what the vulnerability is and what type it is. Basically, I'm going to show you the CVE. Also, as was mentioned in the previous talk, we consider DevSecOps important, which basically means reviewing the day-to-day software lifecycle in real time with cybersecurity in mind; dynamic analysis; and also TF-IDF, which stands for Term Frequency-Inverse Document Frequency. That basically means taking the code we have, converting it into numbers, and based on those numbers assigning a level of importance to each word, which helps the artificial intelligence model identify vulnerabilities. So here we can see an

image of what I was just describing. This is an example in JavaScript code. We have the JavaScript here, any JavaScript. In this code it found a vulnerability. That symbol, like a little lamp, marks that there is a vulnerability; it tells you "vulnerability found in your code" and it tells you which one it is. In this case it's cross-site scripting. We take the OWASP Top 10 vulnerabilities, the most up-to-date list. I think we took the 2023 one, and we took the three most popular ones, which are cross-site scripting and the SQL vulnerabilities.

Good afternoon. First, by way of introduction, this is another project. I want to ask a question to everyone here. Have

you, or any of your family or friends, ever been scared when a terminal opens? Have you ever had that scare when you open a terminal and an update or something like that starts to run? Or has a family member asked you about that? Yes or no? Yes? Well, this project was born as... um. So, this project is about trying to help people who are starting out in the world of cybersecurity to get into this field and start handling this type of tool, where you can get lost at the beginning, and to recommend tools that people might not even attempt to try, perhaps because of that scare. So, the first thing I want to explain here is that this project

... And at that time, the foundations of natural language processing were not so clear to us. As we learned them, we realized that we were going to have some problems with BERT. First we had to teach it a little cybersecurity context. Then we had to teach it some things, some words from the field. SecureBERT is a masked language model. More or less what SecureBERT did as a masked language model, as we can see at the bottom, was this: given a context, as in the first example here, which says "Information from this scan may reveal opportunities for other forms of <mask>", it looks at that position and, from the context it has, from the words that come

before and after, and from its pre-training on cybersecurity, it found the best word to fill it in. So, what we were trying to do was precisely to let someone have a conversation with an assistant. In this case, with SecureBERT, I felt that we were missing something, but taking advantage of that pre-training on cybersecurity terms, we decided to change how the problem was ... to understand a little of the process of what was happening. So what the model did was a tokenization process. You give it a text, and what it does, more or less, is tokenization. First, it separates the words by spaces. Then, each word is given

a tag, which is like a label. That label refers to a position in a matrix or an array of numbers. In this way, you can feed them to the artificial intelligence models to train them. In the end, as I was saying, it worked more or less like this. The person first said what they wanted to do. In this case, the person wanted to obtain information about a domain. The tools we had trained it on and that it had in mind for that task were theHarvester and Shodan. Those were for domains, but others I had trained it on were ExifTool and, I don't

remember which the other one was. And what it did, more or less, was recommend the most appropriate tool based on what you told it, and then, for the recommended tool, it laid out a protocol for how to use it, in natural language. Here is an example of how it worked with theHarvester.

... and that you can bring a voice input, and this model can recognize the text, pass it to the model we had from Alex's project, and generate some queries or some steps to follow from this voice input, with this assistant. Basically, what this model does is take the voice input and perform recognition; this

is done by means of feature extraction, which turns it into a tokenized form, passes it to text, checks whether the sentence is correctly formed, and produces the transcription. Our project is called Echo Trailblazer, and the user enters the application. Here is roughly the demo of how it currently works. "I want to explore a network." "I want to search by IP address." "What can I do with the vulnerabilities found in the network?"
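The flow described so far (a transcribed request goes in, a tool is recommended, and a usage protocol comes out) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the team's implementation: the talk's model builds on SecureBERT, whereas this example swaps in plain TF-IDF weighting with cosine similarity so that it stays self-contained. The tool names come from the talk; the `TOOLS` catalogue, its descriptions, the protocol steps, the pairing of "search by IP address" with Shodan, and the function names `tokenize`, `tfidf`, `cosine`, and `recommend` are all hypothetical.

```python
import math
import string
from collections import Counter

# Hypothetical catalogue. Tool names are from the talk; the descriptions and
# protocol steps are illustrative assumptions, not the project's real data.
TOOLS = {
    "theHarvester": {
        "description": "explore a network or domain and gather emails hosts and subdomains",
        "steps": ["Provide the target domain.", "Choose a data source.", "Review the results."],
    },
    "Shodan": {
        "description": "search by ip address and find exposed services or devices",
        "steps": ["Provide the IP address.", "Run the lookup.", "Review the exposed services."],
    },
    "ExifTool": {
        "description": "read metadata from a file image or document",
        "steps": ["Provide the file path.", "Extract the metadata.", "Review the fields."],
    },
}


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(string.punctuation) for w in text.lower().split())
    return [w for w in words if w]


# One tokenized "document" per tool, plus a word -> index vocabulary
# (the label-to-number mapping described in the talk).
DOCS = {name: tokenize(info["description"]) for name, info in TOOLS.items()}
VOCAB = {w: i for i, w in enumerate(sorted({w for d in DOCS.values() for w in d}))}


def idf(word):
    """Smoothed inverse document frequency: rarer words weigh more."""
    df = sum(1 for doc in DOCS.values() if word in doc)
    return math.log((1 + len(DOCS)) / (1 + df)) + 1.0


IDF = {w: idf(w) for w in VOCAB}


def tfidf(tokens):
    """Turn tokens into a TF-IDF vector over VOCAB; unknown words are ignored."""
    counts = Counter(t for t in tokens if t in VOCAB)
    vec = [0.0] * len(VOCAB)
    for word, n in counts.items():
        vec[VOCAB[word]] = (n / len(tokens)) * IDF[word]
    return vec


def cosine(a, b):
    """Cosine similarity; 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def recommend(utterance):
    """Return (tool, protocol steps) for the best match, or None if nothing matches."""
    query = tfidf(tokenize(utterance))
    scores = {name: cosine(query, tfidf(doc)) for name, doc in DOCS.items()}
    best = max(scores, key=scores.get)
    if scores[best] == 0.0:
        return None
    return best, TOOLS[best]["steps"]


if __name__ == "__main__":
    for phrase in ["I want to explore a network", "I want to search by IP address"]:
        print(phrase, "->", recommend(phrase))
```

In this sketch the transcribed text is the only input, so the same function could sit behind a keyboard prompt or a speech-to-text step. A request that shares no words with any tool description returns `None` rather than a guessed tool.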

So, to explain a little of what was happening: as my colleague mentioned earlier, and building on the project I already talked about, the proposal was that, apart from helping a person who is starting out, it would also be more efficient for a person who is in the middle of a job (in this case the project I was explaining focuses on the reconnaissance part) to interact not only through the keyboard but also verbally. This is one of the interaction approaches we were discussing, where the person talks to the machine, and in this case the first thing they ask is: "What can I do with

this domain?", which is similar to the one I had before. I'm going to mute it and show it. "I want to explore a network." From there, it tells them that the tool it has identified to help with that task is theHarvester. Then it starts to lay out the protocol, and the responses to the protocol are also verbal. The only thing that was not verbal in this case was typing the IP address, because it was taking us too long to say it number by number. This is the interaction we are proposing. And so we come to version 1; Fernanda will comment on a few things.

One of the points I want to make is this: maybe you wondered

why there were three voices, the three of us. We wanted to show that the application is able to identify different voices and different tones of voice. As we can see, it understood each word. Additionally, something we didn't record in the demo but that is there: instead of saying "I want to explore a network" in English, we can also say it in Spanish. The cool thing here is that usually everything in cybersecurity, all the software, is in English. We can say it in Spanish, and the application will automatically convert what you said in Spanish to English. So, for people who don't feel very

comfortable speaking English, it can be very useful. We also have pending, and in progress, the part that goes from Spanish to Spanish, but that takes a little more time. For now we have the option of speaking Spanish and having it translated to English.

Now, as future work, we need to fine-tune the model, since it often generalizes certain concepts and sentences and does not understand some words well, so it needs to be trained properly with a dataset. In this case we are going to work in two languages, Spanish and English. ... flow or do nothing. Another important part is the graphical interface, since for now, as you saw, we did it through the console, using recordings made from Python; that is, it captured a recording of a few seconds for each instruction, ran voice recognition to produce the transcript, and then the other model processed it, all very sequentially. So the plan is to build it as an assistant where,

let's say, you can speak to it directly, as if you were talking to Siri. That is one of the plans. The last one is to have a voice give the answers. For now, everything is shown as text, and it would be good for our model, like Siri, to speak its answers for better interaction with the user.

I would like to conclude with a reflection on why the project was born, and I thank the guys for presenting it. As you noticed in the previous presentation, they also discussed how artificial intelligence could support processes. But what I want you to see here is an evolution of something that, and this is a hypothesis

that I have, could happen: how we can extrapolate sensations through the use of technology. In this case we are looking at sound: how we can turn what we are saying directly into operations. The text model is the one behind it, the one doing the work. So, what is the hypothesis, or let's say the One Piece, if you watch that, of what I think could happen? I think that in the future artificial intelligence could support the generation of multiple sensations, and when, for example, edge computing arrives, we could get into that. And there, let's say,

it aligns precisely with the title of this book, "Data, the Seeds of Chaos". That is where we get into the generation of content that may not be, let's say, real. I don't know if you've seen it on the internet, but we're already looking at the subject of privacy. So, it talks about that. And we will see which are the situations where we really value and appreciate human contact. Sometimes I would say that technology is moving very well for certain things, but not always in morally and ethically correct ways. I leave you with that reflection to close this item. I thank the guys very much. I know it's not easy to be

here giving a presentation, and I would like you to please join me in a round of applause for them. Well, shall we go to the books, then? Yes. The first one to raise their hand? After the question. We'll ask the question, see who raises their hand first, and go from there. The question is about the part of the project that uses SecureBERT. The question is: what changes were made to SecureBERT? What changes were made to the model type? Or what is the name of the model that is working now, the model that is based on the EP Recon, which was the one I was explaining. Another one? If I ever find myself at a DragonJAR congress with

BSides, DEF CON, etc. Thank you very much.