
Okay, good afternoon everyone. Let's welcome Ivan Novikov, presenting on "Bye-Bye False Positives: Using AI to Improve Detection." Welcome.

Thank you. Hey, hello everybody. I'm here today to discuss and actually share our experience with false positives: how we applied machine learning and neural networks to avoid false positives, and how we actually solved this problem. I believe it's a pretty relevant problem for any attack detection, not only in application security but also in endpoint security and similar areas. So let me introduce myself a little bit. My name is Ivan; I've been thirteen years in InfoSec, five years CEO of Wallarm, I do some bug bounties, and I'm the author of something like fifty-plus researches. My friend and really good colleague Michael, with three years in ML, is unfortunately not here today, but he's also been involved in a lot of research, at least ten-plus different researches and articles. I designed the system; Michael actually put it into code.

So what's the goal of this research? First of all, we decided to replace the false-positive incident response process, I mean the production process: from the initial user action, when a user who was blocked by some false-positive mistake made by a detection system sends a message or an email to the support team with the request ID, we want to automate everything from that initial action to the moment when our automation system tunes the rule in the IPS, IDS, web application firewall, or whatever. The second point: we want to be indifferent to the attack detection logic that made the initial decision, whatever it is, a regular expression, some sort of tokenizer, a sandbox detection system, or another neural network. We want to design a system that is able to tune those rules, make decisions, and actually correct the detection logic while staying indifferent to whatever that detection logic is. Point number three: we want to apply this to the real world, I mean to the wild, and achieve results. It shouldn't
be just research about research: we need to adjust our own mechanisms, tune our own detection logic, and improve our own product, and it should be available as open source as well. As a result, I want to highlight that we don't want to invent yet another detection logic itself, like signatures or whatever; we want to train a neural network to improve existing algorithms.

Scope and limitations. For this particular task, for this particular research, we limited our data set to SQL injection payloads, but it's not limited in the code somehow. In particular, that means you can replace our data source for SQL injection with a data source of, I don't know, other malicious payloads and improve your ordinary antiviruses or NG antiviruses, or you can replace our data source with XSS payloads and tune your WAFs for XSS attacks. It's only limited by our data source. Point number two: it's not for legitimate false positives. For example, a vulnerable app could be designed specifically to interpret some data as commands; it could be phpMyAdmin, or a web panel like cPanel, or some other kind of control panel. We didn't design the system to avoid this, but I will address this problem a little bit later as well. Another example: if you want to protect, I don't know, your personal blog, and someone puts a particular attack payload in a message field and this payload gets blocked, we will never address this issue, because technically we can count this payload as a legitimate false positive, right? We didn't address that problem in this research.

I also want to mention that I first started to work with neural networks almost ten years ago, in pre-TensorFlow times, when we made everything from scratch. And I want to mention: never trust people who say, "OK, we will apply a neural network to solve this problem." It's almost impossible to solve any single
problem with machine learning if you never tried to solve it with ordinary algorithms first. So I want to describe how we tried to solve this same problem before we even started to think about machine learning. There are three points here.

Point A: we need to define what an attack payload itself is, in terms of formal logic. I mean, it should be a definition suitable for science, a definition in Turing-machine terms; only in that case will we be dealing with a provable problem, otherwise we can just fail anyway. Point B will be how to detect attacks in theory with 100% accuracy; otherwise, how is it possible to make sure that what we designed based on neural networks is good or not? It should definitely be solvable in theory somehow. And point C: is it possible to implement this theory, and if not, why? Because if it's possible to just implement the theory and get 100% accuracy, why would we need to design machine learning stuff at all? If we can solve it directly, no problem. Now I want to go through each of these points.

So, point A: what is an attack payload? We spent almost one year, Michael and I, just discussing and negotiating with each other to finally make this formal definition. What is an attack payload in natural language? It's a piece of data, likely user input, that can affect some parser or compiler, like SQL, HTML, XSS, or whatever, by injecting some commands. In a more formal way: it's input data interpreted by some Turing machine that affects this Turing machine with some set of instructions. Or, another variant: if some Turing machine can interpret some input data not only as data itself but also as a set of instructions, this machine can be called vulnerable, and this data can be called an attack payload. So this is the formal definition.
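This definition can be sketched in code. The following is a minimal, hypothetical illustration, not the talk's actual implementation: the toy lexer, its keyword set, the query template, and the helper names are all mine. Input counts as an attack payload when the "Turing machine" (here, a tiny SQL-ish tokenizer) sees instructions beyond those of the baseline query with inert data.

```python
import re

# A toy stand-in for a Turing machine: a lexer that splits SQL-ish text
# into instruction tokens (keywords/operators) and plain data tokens.
TOKEN = re.compile(
    r"\s*(?:(?P<kw>SELECT|UNION|OR|AND|INSERT|UPDATE|DELETE|--)"
    r"|(?P<op>[=;()'])"
    r"|(?P<data>[^\s=;()']+))",
    re.IGNORECASE,
)

def instructions(sql: str):
    """Return the sequence of instruction tokens the parser would see."""
    out = []
    for m in TOKEN.finditer(sql):
        if m.group("kw"):
            out.append(m.group("kw").upper())
        elif m.group("op"):
            out.append(m.group("op"))
    return out

def is_payload(user_input: str,
               template: str = "SELECT name FROM t WHERE id = '{}'") -> bool:
    """Input is a payload if the interpreter sees extra instructions
    beyond those of the same query template filled with inert data."""
    baseline = instructions(template.format("0"))
    observed = instructions(template.format(user_input))
    return observed != baseline

print(is_payload("42"))           # plain data, same instructions
print(is_payload("' OR '1'='1"))  # extra OR/quotes: instructions injected
```

This is just the definition made executable: data that leaves the instruction stream unchanged is data; data that changes it is a payload.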
A funny thing about cybersecurity as a science is that it's not a science. I mean, it's impossible to buy a book and read something like: OK, this is a Turing machine, this is the formal model of this machine, and this is what a vulnerability is. We have nothing like that. You can study at any university, and they cover only cryptography as the part that is a real science, which is good, but all the rest, especially applied cybersecurity, looks like hundreds of random facts. So that's how we defined our payloads, and how we put one, or a couple, of those hundreds of random facts into a formal shape.

Point B: how is it possible to detect attacks with 100% accuracy in theory? Since we defined that we're dealing with Turing machines, parser machines, and every single parser, whatever it is, SQL or HTML or whatever, can definitely be defined by some Turing machine, we know that technically we just want to run these Turing machines and understand what the parser processes: are there any instructions there or not, related to this particular user input? That solves it in theory. But there are three different problems we need to address. First of all, even for the narrow case of SQL injection, we are faced with MySQL, SQLite, Postgres, Oracle, etc., a lot of different parsers, different kinds of parsers. Second, we need to know all the initial parser states to run all the parsers, but the parsers are recurrent; I'll address this problem a little bit later. And point number three is just performance, which is definitely a solvable problem unless we run into the halting problem.

So, B1: so many parsers. It's addressable as long as there is no intersection in parser mechanics, I mean in particular the lexical, syntax, and semantic analyzers inside the parsers; otherwise we need to split them into different machines and run them separately, which is also an addressable problem. Another point here: can we build one unified parser, one Turing machine, to cover all the syntaxes? Yes, we can, but we will pay for it with accuracy, because when we combine every single SQL parser into one parser, we will miss something.
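The "run each dialect's parser separately" idea can be sketched as follows. This is a deliberately tiny illustration with hypothetical per-dialect keyword sets; real lexers differ in far more than keywords, which is exactly why merging them loses accuracy.

```python
import re

# Hypothetical instruction keywords per SQL dialect, for illustration only.
DIALECTS = {
    "mysql":    {"SELECT", "UNION", "OR", "BENCHMARK"},
    "postgres": {"SELECT", "UNION", "OR", "PG_SLEEP"},
    "sqlite":   {"SELECT", "UNION", "OR", "RANDOMBLOB"},
}

def flag_dialects(payload: str) -> dict:
    """Run each dialect 'parser' separately over the same input and
    report which ones would interpret part of it as instructions."""
    words = {w.upper() for w in re.findall(r"[A-Za-z_]+", payload)}
    return {name: bool(words & kws) for name, kws in DIALECTS.items()}

print(flag_dialects("pg_sleep(5)"))  # only the Postgres parser reacts
```

A single merged keyword set would flag `pg_sleep` for every dialect, including those where it is harmless data, which is the accuracy we'd pay for a unified parser.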
So, point B2: we need to know all the initial parser states. Look at the first line: it describes a simple SQL injection vulnerability, with the initial parser state highlighted in red. We could just describe every single initial state related to SELECT syntax, UPDATE, INSERT, whatever we've seen before, and actually solve this problem. But we have an unlimited number of states, because they come from recursively enumerable grammars, like SQL. Take these brackets, for example: who knows how many brackets were there on the left, in the initial state? Technically it means we cannot enumerate the states. But we can convert this recursively enumerable grammar into another grammar, context-sensitive, or even context-free, or even a regular syntax, to make it happen. It's like Lisp-style brackets, or as we call them, Lisp smiles, but it's also solvable.

And point number three is performance. It's not only about the halting problem itself, because if there is a halting problem and we cannot actually pass this particular data through the Turing machine without getting caught in an infinite loop, then, since that machine corresponds to some real vulnerability inside the app, it's not our job to fix it. If the app is designed badly, it's not our job to fix that; our job is to detect issues, I mean to detect payloads.
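The bracket problem can be made concrete with a toy normalization step. This is my own simplified illustration of the idea, not the talk's actual grammar conversion: by collapsing runs of identical brackets, payloads that differ only in how many "Lisp smiles" they carry all land in the same parser state, so the set of initial states becomes finite.

```python
import re

def collapse_brackets(payload: str) -> str:
    """Collapse runs of identical brackets ('Lisp smiles') so that
    the set of reachable initial parser states becomes enumerable."""
    payload = re.sub(r"\(+", "(", payload)
    payload = re.sub(r"\)+", ")", payload)
    return payload

# '((((((SELECT 1))))))' and '(SELECT 1)' now reach the same state
print(collapse_brackets("((((((SELECT 1))))))"))
```

After this step a context-free or even regular recognizer can cover what previously required unbounded look-back into the bracket prefix.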
To address this, we can pay with memory: we can run as many instances, at least as many instances as the different states we covered, and then put the data through all of them; it's like a cache warm-up mechanism, also an addressable problem. We can also pay with accuracy, as I mentioned before, if we are able to design some kind of context-free or context-sensitive or even regular grammar that covers everything and stays relevant to our initial, original syntax, but also gains some new false positives. And the funny thing is, if we were able to replace our initial, original SQL parser with, for example, a regular grammar, we would actually reinvent regular expressions. In fact, that's exactly what we would do in this case: we would do signatures, regular expressions.

So, point C: is it possible to implement this theory, and if not, why? It's definitely possible to implement everything I mentioned before, with all the limitations I mentioned before. We started with just the SQLite parser, tuned to process Oracle, Postgres, and MySQL as specific syntaxes, with some hacks, like how to remove these brackets and not allow the Lisp smiles. But it's a lot of manual work for a single grammar, so to make it easier we released the libdetection framework, to define grammars in a formal way and to allow using this not only for attack detection itself but also for payload generation. It's available on GitHub as well, and you can use it with no machine learning at all; it's pretty useful stuff. You can define your own grammar, whatever it is, in BNF, put it there, define your own lexer, and generate your own payloads to train your machines or neural networks, or just use it to detect attacks.
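Grammar-driven payload generation, the second use case mentioned above, can be sketched like this. The grammar below is a tiny, hypothetical BNF-style example of my own; libdetection's real grammar format and rules are much richer.

```python
import random

# A tiny, hypothetical BNF-style grammar for SQL-injection payloads.
# Each non-terminal maps to a list of alternative productions.
GRAMMAR = {
    "payload": [["quote", "bool", "comment"], ["quote", "union", "comment"]],
    "quote":   [["'"]],
    "bool":    [[" OR ", "num", "=", "num"]],
    "union":   [[" UNION SELECT ", "num"]],
    "num":     [["1"], ["2"], ["7"]],
    "comment": [[" --"]],
}

def generate(symbol="payload", rng=None):
    """Derive one string by randomly picking a production for every
    non-terminal, depth-first; terminals are emitted literally."""
    rng = rng or random.Random()
    if symbol not in GRAMMAR:
        return symbol
    return "".join(generate(s, rng) for s in rng.choice(GRAMMAR[symbol]))

for _ in range(3):
    print(generate())  # e.g. "' OR 1=7 --" or "' UNION SELECT 2 --"
```

Sampling from such a grammar gives an endless supply of structurally valid payloads for training a detector, which is exactly why a formal grammar beats a hand-curated payload list.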
that's why recurrent we decided to apply an attention layer and attentional mechanism because of our assumption that it's possible to make a decision about false positive by looking at some particular part of the payload how we perceive these payloads by by our minds by our brains right we decided to do be directional recurrent neural network to cover semantic things when we can make a decision about what does it means only after looking back after looking forward or controverse for example when you read in a line you need to be able to come back to understand what was there or you need to jump forward in a couple of words to understand what does it means
that's why be directional and we applied some sort of data pre-processing by additional heuristic algorithm we call the tandems to remove just equal sequences and [Music] our payloads to solve this Lisp smiles problem so this is an architecture of the network we've built it's an embedding layer plus a couple of layers or more of be directional Aaron and I and in particular lsdm but you can replace this to other way as a remnant cells not a problem then max pooling things an average pulling things an attention layer and everything turned out but it's actually it's not usual architecture but we tried different things and we decided to use this and a little bit early I described why so it's
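The pooling stage of that architecture can be sketched numerically. The shapes, the random stand-in for the bi-LSTM outputs, and the function name below are my assumptions, a NumPy illustration rather than the released TensorFlow code: three views of the hidden-state sequence (max pool, average pool, attention-weighted context) get concatenated before the classifier.

```python
import numpy as np

def attention_pool(h, w):
    """Attention over time: score each hidden state against a learned
    vector, softmax the scores, return the weighted sum of states."""
    scores = h @ w                          # one score per time step, (T,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                # softmax over time steps
    return weights @ h                      # context vector, (d,)

rng = np.random.default_rng(0)
T, d = 6, 4                                 # toy sequence length / hidden size
h = rng.normal(size=(T, d))                 # stand-in for bi-LSTM outputs
w = rng.normal(size=d)                      # stand-in for the attention vector

# the three pooled views are concatenated before the final classifier
pooled = np.concatenate([h.max(axis=0), h.mean(axis=0), attention_pool(h, w)])
print(pooled.shape)                         # 3 * d features
```

The attention weights are what make the decision inspectable: they show which part of the payload the network looked at, matching the assumption above.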
It's available online under the MIT license on our GitHub. We also released some real data to test it during our ML hackathon. Feel free to join this project, feel free to apply your own ideas there, feel free to correct us if we're wrong, or whatever. And again, it's limited only by this data set: if you have your own data set, you can apply different things there, achieve different results, and tune your own detection systems.

So, how to use it. First of all, you need to prepare your data set, or use our data set for SQL injection, with the prepare command. This command will apply the tandem thing to remove the Lisp smiles, and will create a vocabulary based on all the tokens, all the lexical constructions, in your data set. Then you can train your network with train.py; it requires TensorFlow 1.1+. And don't forget to give us a star on GitHub. So what can you tune there, if you want to make the system, I don't know, a little bit better, and specifically if you want to apply different syntaxes and train it to avoid false positives for your, I don't know, XSS detection system or whatever?
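Based on the description above, the tandem pre-processing removes immediately repeated equal sequences. The following is my guess at a minimal version of that idea, with hypothetical names; the project's actual implementation may differ.

```python
def remove_tandems(tokens):
    """Repeatedly delete any run of tokens that is immediately followed
    by an identical run (a tandem repeat), shortest runs first."""
    changed = True
    while changed:
        changed = False
        size = 1
        while size <= len(tokens) // 2:
            i = 0
            while i + 2 * size <= len(tokens):
                if tokens[i:i + size] == tokens[i + size:i + 2 * size]:
                    del tokens[i + size:i + 2 * size]
                    changed = True
                else:
                    i += 1
            size += 1
    return tokens

print(remove_tandems(list("(((((("))) 
print(remove_tandems("1 OR 1 = 1 OR 1 = 1".split()))
```

A run of six opening brackets collapses to one (the Lisp-smiles case), and a duplicated `1 OR 1 = 1` clause collapses to a single copy, so the vocabulary and the network never have to model arbitrary repetition counts.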
You can choose the hidden size of the layers, the number of neurons there; we started with 256 neurons by default and two layers. You can add more if you want, but be careful, because every time you add new neurons you will spend more and more time actually training the system; it's pretty resource-hungry. You can change the embedding layer size to make it smaller or bigger if you want. You can play with the attention hidden layer size as well; usually it's 50 percent of the RNN layer neurons. You can also change the dropout if you want.

An interesting question here, and actually the last question I want to cover today, is how to figure out how many neurons you need to add there to actually achieve valuable results. We decided to apply some semi-magic logic: we took a look at the constructions in the initial syntax, tried to convert the parser machine into some sort of pushdown automaton, and figured out how many graph edges we would get there. Then we apply at least that many neurons, to allow our neural network to learn at least the same number of edges in its graph. That's what we decided to do, and then we add a couple more neurons.
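That semi-magic sizing rule can be written down in a few lines. The automaton below and the slack value are hypothetical, purely for illustration of the edge-counting idea.

```python
# A toy pushdown-automaton-like parser graph: state -> reachable states.
# Hypothetical example; a real SQL automaton has far more states.
PARSER_GRAPH = {
    "start":  ["select", "quote"],
    "select": ["expr"],
    "quote":  ["string", "expr"],
    "string": ["quote"],
    "expr":   ["select", "quote", "end"],
    "end":    [],
}

def min_hidden_units(graph, slack=8):
    """At least one neuron per graph edge, plus a couple of spares."""
    edges = sum(len(targets) for targets in graph.values())
    return edges + slack

print(min_hidden_units(PARSER_GRAPH))  # 9 edges + 8 spare neurons
```

The point is only the lower bound: fewer neurons than edges means the network cannot even represent every transition the parser graph distinguishes.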
So I believe that's reasonable logic. And if you want to apply AutoML things, tuning yet another neural network to create the topology or find the hyperparameters of this one, that's also possible; we published an article on our blog about how to apply AutoML to find the hyperparameters of another neural network.

So what's next? The last point here is to cover legitimate false positives as well. As I mentioned before, if I want to protect my personal blog, or if I want to protect stackoverflow.com, it's a completely normal situation for these particular websites to receive malicious payloads, and every time we apply this detection logic, it will produce false positives for us. What we want to add there is some metadata related to the particular app we want to protect, to reduce even the legitimate false positives; in particular, it could be request parameters, like a URL, or parameter names, or whatever. We're still working on it, because it requires redesigning the whole architecture of this neural network: adding an additional attention layer related to the specific parameters, defining a DSL for how to name these parameters in the requests and how to actually build the vocabulary and all the other related things, and it also requires us to pack everything together into one neural
network topology and put it there, and we're still doing it. So feel free to join the project, feel free to share your thoughts, feel free to, I don't know, ask us to do something even crazier than we did; you're welcome here. Thank you. I believe we have some time for questions and answers.

There's a question that came in online, so let me read it to you: "Cool. Why not use a RASP, which by definition will not have false positives on SQL injection, instead of looking only at the user payload? It would compare the user payload with what is actually injected into the SQL query."

Yes, I expected questions about RASP. Again, it doesn't matter where we apply the detection logic, or which types of false positives we are able to mitigate. The RASP thing is like the difference between a network-based IDS and a RASP: it's about where exactly we put the detection logic, and under the hood of any RASP we are faced with the same signatures applied to the data in query parameters. You can avoid false positives between user data and this particular query, but it still requires applying some parsers there, to understand what happens, to understand whether it's related to syntax analysis or not, and we would be able to tune those false positives as well. So that's why. Any other questions?
No more questions. Thank you, Ivan, for your presentation, and on behalf of BSides you get a gift.