
Enhancing Incident Response with AI: Leveraging ML for OT/IoT/IIoT Attack Detection and Prevention

BSides Philly · 2025 · 24:25 · 74 views · Published 2026-02
About this talk
As cyber threats increasingly target operational technology, IoT, and industrial IoT systems, traditional defenses struggle to keep pace. This talk demonstrates how machine learning and AI can enhance incident response by enabling predictive detection and rapid response. Using applied research and real-world datasets from the Global Cyber Alliance, the speaker presents the IIoT Guardian prototype, a device-level solution that integrates ML for real-time anomaly detection across industrial networks.
Original YouTube description
Speaker: Prof. Atdhe Buja

As cyber threats progressively target Operational Technology (OT), Internet of Things (IoT), and Industrial IoT (IIoT) systems, traditional defenses wrestle to keep pace. This talk introduces how artificial intelligence (AI) and machine learning (ML) can redefine incident response in these domains by enabling predictive detection and rapid response. Through portrayals of applied research and real-world datasets from the Global Cyber Alliance (GCA), I will demonstrate how the Data Science Lifecycle can be applied to build predictive ML models that identify anomalies, patterns, and attack trends over IIoT networks. The session introduces the IIoT Guardian prototype, a device-level cybersecurity solution that integrates ML/AI for real-time anomaly detection.
Transcript [en]

Good afternoon. Thank you so much. I'm happy to share my knowledge and expertise at this kind of event. So that would be my topic. A little bit on what I'm going to cover: first, some context on where we are with industrial IoT, IoT, and OT, especially when it comes to security. Then I'm going to share some of the research work I did in this field, where I am still constantly working and trying to come up with cybersecurity by design for those infrastructures. So that would be the whole talk, and then at the end I'll highlight some main points, and then if

you have any thoughts or questions, save those for the end; I have left some time for that part. So that's me. My whole background and expertise is built on technical, hands-on work. Only recently I moved into academia, so I'm passing that knowledge and expertise on to my students at Bloomsburg University, now called Commonwealth University of Pennsylvania. If you want to reach me, reach me on LinkedIn or at my website. I'm happy to talk, discuss, and work together in these fields, why not. So, operational technology and industrial IoT

challenges. We are all aware that most of those infrastructures deal with a lot of complexity. Why? Because we have a very diverse environment of manufacturers that don't care much about the security of those devices, especially sensors, or of the rest of the infrastructure that supports each small device that just generates data constantly. And the device itself does not care whether we are able to collect that data, store it somewhere, and treat it safely whenever it is transmitted over the network. So we are aware of the massive device scale. We also have a lot of protocols that have been around for quite a long time, and for none of these can I say

that they have been updated or enhanced when it comes to security. You have Modbus, you have MQTT, protocols that lack a lot when it comes to security. And then there is the vendor side, where a lot of vendors just produce massively and do not tend to align their manufacturing process with security. A lot is happening within the EU, for example, on industrial control systems in that respect, but I haven't seen anything like that coming around here in the US, and worldwide it is at a very low level. It's just

initiatives that are talked about a lot, but no one is trying to put the focus and the light on alignment with all the legal standards, frameworks, and best practices that we already have. There's no need to invent something; it's already there. Just stick to those from the very beginning of the manufacturing process and you would be okay; you would be safe at some level. Then later, on top of those, use the emerging technologies that we already have, machine learning and AI especially, which bring that creativity. Why not? I'm not saying replace the whole thing with AI, but use something that is there, and I'm going to show later on how

we would be able to use those. So the focus will be on incident response for industrial IoT infrastructure and systems. Incident response is struggling a lot when it comes to responding to, managing, and handling incidents in such infrastructures. Why is that? Because the teams, the people working in that field, in that job and role, have a lot of so-called data blind spots: the logs are often incomplete. You also have a lot of fragmentation, where you have one set of devices

coming from a manufacturer that falls short when it comes to communication and protection, with built-in functionality that does not handle that communication well, or whatever we would be able to apply as security mechanisms. And you have that whole set of different manufacturers within one infrastructure, within one facility. Then at the end, where the incident response teams are sitting, monitoring and trying to handle those incidents, they don't know what to do when they cannot find enough evidence to investigate further. The whole idea of incident response is to capture that log and then drill down

deeply and investigate further, either for security mitigation or digital forensics. But in such an environment you would lack a lot of that evidence, which could potentially come from a device, from all the gateway points that we put there to collect, and from edge points. So incident response teams have a lot of difficulties that start at the beginning, at the source point, where you don't have enough evidence, logs, and metrics to identify what the trends are, where the gap is, where is

that deviation from a baseline that I already have. >> So threats are evolving in techniques and methods. Hackers, malicious actors, and APT groups all around the world are doing that constantly. You see some of those that I've shown here, but the list is long; I couldn't fit all of them in one place. CISA is a very good entity that works a lot to expose those kinds of gaps, threats, and risks when it comes to industrial infrastructure. They're doing a good job, but more needs to be done, from the standpoint that we should have that kind

of accountability when it comes to the manufacturer, and not just once those devices are sitting in the organization's infrastructure. The organization might do something, but when the device lacks security protection in its design and firmware, you can do nothing. Add on top of that the fact that those devices have limited computational power, so you can't add much encryption on the device end and expect the device to behave in normal operation. So exposure is high. You just need to go to the Shodan.io platform.

Just search with any filter you want, for any port number that is used in ICS, industrial control systems: Modbus, MQTT, Siemens, DNP3, whatever. You can see how much exposure there is when it comes to targeting those IoT devices. And yes, there are exposed cases that have happened with all those protocols, which I put here. Incident response cannot keep pace with what is going on. So this is the whole idea, the whole topic: the objective of my research work is to bring some guidance, some help, some support, some knowledge to those people who are sitting all day long, and all

those cyber crime units, incident response, SOC teams, CERT teams. That is a really intense job and role, when you are sitting in that chair expecting any incident that might hit you, and at your end you can do nothing because you lack all the metrics, evidence, and logs that I just mentioned. So let's see how this could get better. So why AI or machine learning, all that? I'm not saying that AI would just replace humans, but let's empower those incident response teams, give them some superpowers, whenever we can. Why not? How would we do that? So here would be the part where

I'm going to share the research work that I did last year. I was part of the team at ICT Academy, which had a partnership with the GCA, the Global Cyber Alliance. They have an initiative, a project called AIDE, where they put honeypots out there and collect logs from IoT companies, those companies that offer some kind of IoT solution. We got that research project and partnered with them for nine months. So we did this research from the beginning. First, what was done: they handed over to us a data set with a huge number of rows that would

be attacks. We took that data set and tried to figure out what we could do with it, so that we would be able to build something and bring it to applicability, and not just do the research and data analytics and that's all. So it wasn't just data analytics; it followed a data science life cycle. We follow this kind of data science life cycle all the time, and my whole research work is based on this kind of road map. There are two phases. The first phase is when you start and understand the data. It's all about data analysis

and understanding the data and the context you are in. Once you do that, the next phase is doing things: utilizing the technologies we have had around for a long time, machine learning algorithms, and then on top you can bring that creativity by having AI. So that was the road map we took as a structured, scientific approach, for reproducibility. We did that understanding, and we took those steps where we dealt with a lot of analysis. We call it EDA, exploratory data analysis, a phase where you understand what you're dealing with, what kind

of data. This was the first step, and we came up with results where we looked at which usernames and passwords were used the most and successfully achieved access in real IoT environments. Half of the data set was successful attempts, where commonly used usernames and passwords were used to breach and exploit. That tells you a lot about where security stands. I cannot reveal all the data, the sources, and the geographic locations, because of that research project's objective

and maintaining the privacy of all those organizations. But those were real data, and we saw how this analysis revealed the behavior and patterns of the attackers, who reuse those credentials. Then we dove deeper in the following phases, where we headed towards building the machine learning model. We used some features, and we used the actual data set based on the outcome of that analysis. We took some of it, and the rest we put out of scope; it wasn't our objective. Our objective was, building on the results and understanding that we got, to build a model that would be able to predict at some percentage, because you

know, when you build machine learning models you can't have them behave 100%, but there will be some behavior that I'm going to show you. So, a model that would be able to predict based on this data set, which was taken from real data, specifically on IoT infrastructure, coming from real organizations, and let's see how that would bring some value to the community and the industry. So, as a conclusion: there are common vulnerabilities, especially in the login field. We're talking about the login field within the data set, whose values are exploited multiple times. We see that 23.5 4.4 million total data set was 56

million. So certain login credentials are common targets for attackers; commonly used usernames and passwords were used there, for sure. If you want to dive deeper into this work, I'm going to share a link later on where you can look at the publications, where we published a lot of details. And many attack attempts do not successfully use or capture credentials, of course. Those were more detailed analyses, where we came up with graphics showing a login field with a significantly lower count, indicating that not all attacks were successful, of course, and a high number of unique values, emphasizing the dynamic and diverse

nature of the IoT attack landscape, and a logged-in field whose frequency for the most common successful login combination suggests a commonly exploited vulnerability. What did we do next? We went to the feature engineering process. As you may know, that's the process where you decide which fields are the most important for further developing and designing the machine learning model. So we selected some of them. Duration tells us how long that access lasted, and the login indicator is all about the usernames and passwords that were really used against that infrastructure. The honeypot did a good job; it was able to capture

even the real geographical location that the traffic was coming from. That could be either the ISP's IP address or someone's real IP address, but we are able to track that. Then geo distance; common frequency, how often that attempt was happening; and credentials tried, what kind of credentials the honeypot captured. From that field we saw that commonly used passwords and usernames were actually in use within all those organizations that use IoT. That speaks for itself about where we are. And protocol, and more. Then we used certain types of algorithms. We tested back and forth and decided on supervised machine learning, where we trained that

whole model on top of that data set, and we came to results where the behavior of the model was good. It showed very high levels of accuracy and precision when it comes to effectively identifying and distinguishing what was a successfully exploited vulnerability and what was not. The other metrics also showed good results, indicating that the model is well balanced. All of that matters in data science, as you may know. When you deal with data analysis, you have balanced and imbalanced data sets, and you need to take care of that so that you have a well-behaved

model at least. When a model shows us 100% on all the metrics, something is not going correctly. That probably means you have an imbalanced model that just behaves at 100%. We can't have that. We can be around that number, but not exactly. And then the further implication for IT security: this could be used as a model to enhance the cybersecurity of the organizations that the GCA project was capturing with its honeypots. This QR code leads to those publications, if you would be

interested in looking more at that work. So what does this mean for incident response, coming back to the main topic? All those boxes show a lot of positives for the people who work on incident response teams. Our focus and objective was to build and design a good, well-balanced model that could help them, and even a slight improvement would help. If you are not in that kind of role and responsibility, maybe you would not understand, or you have just heard about it, but I was engaged in a previous project

where I set up two such teams, one in academia and one in the private sector. So I really know what they are struggling with and what kinds of issues and challenges they have. When they lack evidence and logs, they can do nothing. They end up in that moment of asking: do we leave this going, or do we do something, do we filter? They are left hanging, trying to figure out what they are able to do. In the end, most decisions that those incident response teams make in that moment act at the network level: quarantine, filter, stop, drop, or just

switch to the next infrastructure and turn on disaster recovery. Integrating machine learning into industrial IoT: that's possible. We took this research work not just for that specific project but for many more that I'm heavily engaged in. I'm trying to bring some kind of prototype for cybersecurity by design for those infrastructures. I'm still working on that, and if anyone in the audience is interested, reach out to me; I would be happy to work together. Why not do something good? So, anomaly detection examples might be login brute force, geo-distance anomalies, common spikes,

protocol misuse, and sensor manipulation. That last one happens a lot when it comes to tampering with those devices, where you just need to have a signal. You don't need to be physically close and touch the device. You just need a signal, and with a drone you can do that very easily, flying over the facility. Or not even a drone: you can do it by walking along the fence, driving by, whatever you want. That signaling is very risky. So this is the kind of prototype that I'm still working on, a cybersecurity-by-design device that brings a lot of architecture when it comes to

constant anomaly detection based on machine learning. This would be brought not just on premises but also to edge infrastructures, where parts of our facilities are distributed, even geographically, across different places, with continuous learning. We were thinking that this should be a lightweight thing that fits industrial constraints, so we're trying to figure out which lightweight functionalities we would be able to give this product. Other research we're working on: lately we got engaged in one research project, working on our own, trying to bring some governance of cyber

security for health organizations and health infrastructure. We saw that too. How could AI help? Here is an example. Before machine learning, there was that $10,000 per day that you would need to go over. If you have a SIEM solution, that is good, but even SIEM solutions use a lot of machine learning algorithms on the back end. After machine learning, once you have some level of applicability, the focus goes straight to the level of impact: critical, high, and medium, and not the low and informational

ones, and not false positives as we used to know them. We have a lot of those. The whole blueprint of this AI-enhanced incident response starts with what the SOC needs and goes right to the tools: which of the main tools used by SOC teams and cyber crime teams you would be able to implement, and then why not enhance those with machine learning and AI and bring that creativity into the room. The future: it will be here. Digital twins are one of the technologies and approaches that we can use a lot more. We would be able to

mimic exactly the real physical environment, the physical world, when it comes to the technology, and that could bring a lot of good things: autonomous agents, continuous self-learning, and of course predictive maintenance. We want to be proactive all the time. And with this kind of prediction of the future we are not considering replacing the whole human factor. No, we will be there, we will be around, but let's use those as tools. Some of the key takeaways: industrial IoT incident response is overwhelmed, as we saw, by scale, diversity of devices, and limited visibility when it comes to the evidence and logs that we are able to collect. Machine learning would

provide some enhancements, why not a lot of them. And you saw that our research verified that feasibility very easily by using a random forest algorithm. That's a very basic one. Imagine when you have more complex models and a lot more data than what we used in this research project. That could bring a lot of good cybersecurity. And of course collaboration is important: all those sectors working together, industry, academia, the private sector, and of course the regulatory entities, which need to play a much bigger role when it comes to this topic. So here is my contact if anyone would be

interested in reaching me or if you have any questions. Now we have some time. >> Question.

>> Thank you.
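
The credential analysis described in the talk (which username and password pairs succeed most often, and what share of attempts succeed) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the speaker's code: the records are synthetic, and the field names `username`, `password`, and `success` are hypothetical rather than the actual schema of the GCA data set.

```python
from collections import Counter

# Synthetic, made-up login attempts. The field names are hypothetical and
# do not reflect the real honeypot data set discussed in the talk.
ATTEMPTS = [
    {"username": "admin", "password": "admin", "success": True},
    {"username": "root", "password": "12345", "success": False},
    {"username": "admin", "password": "admin", "success": True},
    {"username": "root", "password": "root", "success": True},
    {"username": "user", "password": "x9!kQ", "success": False},
    {"username": "admin", "password": "password", "success": False},
]


def summarize_logins(attempts, top_n=3):
    """Return (success_rate, most common successful credential pairs)."""
    # Keep only the (username, password) pairs of attempts that got in.
    successful = [(a["username"], a["password"]) for a in attempts if a["success"]]
    # Share of all attempts that succeeded; 0.0 for an empty input.
    success_rate = len(successful) / len(attempts) if attempts else 0.0
    # Count how often each successful pair occurs and keep the top_n.
    return success_rate, Counter(successful).most_common(top_n)


if __name__ == "__main__":
    rate, top_pairs = summarize_logins(ATTEMPTS)
    print(f"successful attempts: {rate:.0%}")
    for (user, pwd), count in top_pairs:
        print(f"{user}/{pwd}: {count}")
```

On a data set with tens of millions of rows, the same counting would more likely be done by streaming over the log file or with a dataframe library than with an in-memory list; the list is used here only to keep the sketch self-contained.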