
PATRIOT: Perceiving Advanced Threats by Rogue Internet Of Things

BSides DC · 2017 · 37:51 · 93 views · Published 2017-10
Category: Technical
Style: Talk
About this talk
The Internet of Things is expected to turn the security world upside down with cheap, low-security devices that have no ability to be managed or monitored natively. The only way to wrap any kind of security around these devices is by way of network monitoring and using advanced behavior analytics and machine learning tools to perceive threats. Enter PATRIOT, a solution to the problem of determining whether your IoT device has gone rogue. It consists of three components: IoT Identification, IoT Behavior Analysis, and IoT Threat Detection. PATRIOT analyzes network session metadata collected by Bro sensors and uses machine learning extensively to automate the analysis and provide a quick detection and response system for attacks by compromised IoT devices.

Ajit Thyagarajan (CTO at Atomic Mole LLC)

Ajit Thyagarajan is an innovative and passionate technologist who explores challenging technology opportunities. He is currently CTO at Atomic Mole, a cybersecurity company developing a simple and effective security solution for the enterprise. Until recently, he held multiple Director positions at Fidelis Cybersecurity. Prior to Fidelis, he was heavily involved with Internet protocols and building fast routers. He was instrumental in the early development and adoption of IP multicast and contributed several changes to the BSD kernel. He also worked on several enhancements to NTP (Network Time Protocol) during his graduate student days. Ajit has presented at various security conferences, the most recent being BSides DC 2016, BroCon 2016, and BSides San Francisco 2016. Ajit currently mentors several cybersecurity start-ups as part of Mach37, a Virginia-based cybersecurity incubator.
Transcript [en]

The BSides DC 2017 videos are brought to you by ThreatQuotient, introducing the industry's first threat intelligence platform designed to enable threat operations and management, and DataTribe, a new kind of startup studio co-building the next generation of commercial cybersecurity, analytics, and big data product companies.

Well, hello folks. My name is Ajit Thyagarajan. I am at Atomic Mole, and I'm going to talk about PATRIOT, which is Perceiving Advanced Threats by Rogue IoT. Here's the obligatory about-me slide. I'm currently the CTO at Atomic Mole. We are a small cybersecurity company doing IoT threat detection. I'm a big fan of machine learning. I'm not a machine learning data scientist myself; my background is more in network protocols.

Prior to Atomic Mole I was at a company called Fidelis Cybersecurity. You might have seen a presentation from Fidelis earlier today; there were a couple of data scientists talking about machine learning again. I ran their network sensor group, everything to do with protocol analyzers and so on. And a very long time ago I worked a lot with Internet protocols. I did a lot of work with NTP, the Network Time Protocol. My advisor in graduate school was the guy who invented NTP, so I got a chance to work with him. I also worked a lot with multicasting on the Internet. Multicast was very big in the late 90s, everything from IGMP to DVMRP to PIM

sparse mode and dense mode. I'm telling my age here, but yes, I was working a lot on those protocols in the late nineties as well. OK, enough background on me. Let me give a quick overview of what I'm going to be talking about. I'll give you a quick State of the Union on IoT: what is happening in the world of IoT today. Next I want to talk a little bit about the issue with detecting IoT: is there IoT in my network in the first place? We'll then go into how we detect IoT threats. And just a brief side note: initially, when we were talking

about PATRIOT, since we were talking about detecting IoT in the network, the algorithm that we came up with was "idiot". But "idiot" was probably not such a good name, so we migrated over to PATRIOT. I will try to give some examples using Bro. How many of you here use Bro or are familiar with Bro? And how many are not familiar with Bro? OK, a few. And machine learning seems to be the hottest thing on the planet today, so I'll show how we can use basic ML techniques with Bro data to identify IoT and detect threats with

IoT. So it'll be interesting from that perspective. IoT is a huge, huge phenomenon that's happening right now. Depending upon whose estimates you believe, there are supposed to be about nine billion IoT devices out there today and 36 billion by 2025 or so, a huge revenue opportunity of 70 billion, and around five and a half million new devices per day. Again, this is the entire world of IoT. IoT is also enabling a lot of smart automation. Apparently 47,000 jobs were directly or indirectly created by IoT in Spain by the move to smart cities, for example. And that last number there is the most recent update I got from Nest

saying that they had saved 13 gigawatt hours thanks to thermostats. So yes, IoT is becoming a fairly large phenomenon today. But the problem with IoT is this whole security thing. We had 150 mm devices participate in the last Mirai botnet attack a few months ago. 70% of IoT devices are supposedly vulnerable. If you go to the IoT Village over here, the guys will say it's pretty much every device. I had a conversation with them and I said, OK, you've got D-Link cameras over here; do you have any D-Link device that is not vulnerable? And they said no. So yes, a highly, highly

vulnerable set of devices that are being sold out there. There were nearly eight and a half billion devices potentially affected by the most recent BlueBorne Bluetooth vulnerability. This was literally about three weeks ago. Fortunately most of the devices got patched, including your phones if you subscribe to automatic updates, but there were a lot of unpatched devices out there as well. So if you have Bluetooth capability on your IoT device and you know that you haven't patched it, I recommend that you do. The other big issue with IoT is just visibility in your network. A lot of administrators are just not sure how many devices there are in their network in

the first place, and so they maintain very tight control of what they can and cannot implement in the enterprise. But some of them are more lax, and for them the ability to gain that visibility is key. And of course there's a huge problem with personnel: we just don't have enough people to sit and help with analysis and with security issues. Some of the other vulnerabilities that were recently discovered: if you were at BSides Charm, there was a guy who showed how he could hack into a pressure cooker. And of course you heard about the voting machines that got hacked at DEF CON; that was on the news as well. And

just recently there was the Axis camera; you could put something like the Devil's Ivy malware on that as well, so 249 out of the 251 IP cameras are affected. So anyway, there is a big problem with IoT from a security perspective. We talked about these high-profile IoT attacks in which a large number of devices participated. We have the Mirai botnet. There are still a lot of DDoS attacks going on, depending on which magazine you read. And in my opinion there is another major attack that is waiting to happen. I think this is definitely going to happen in the near future. So

what are the key IoT challenges? The reason why IoT is such a security problem is that, A, you can't put any antivirus on them. You have all these endpoint companies that claim they have AI and next-generation malware detection capability, but you can't put any of this on the IoT devices; these are closed devices. The second thing is that they produce little or no logs, so you can't take your favorite SIEM tool, aggregate all that data, and look for these threats: is my IoT device compromised or not? So those are the two major issues that companies face when deploying IoT.

And of course if you're the lax company and you allow devices to be brought in and randomly connected onto your Wi-Fi network, then you have a BYOD problem as well. So those are some of the key IoT challenges. From an IoT security perspective, what do you need to do? You have to consider yourself hacked, in which case you need the tools to detect whether any of your devices or endpoints are hacked, and you need to have multiple layers of defense as well. There are a lot of solutions out there. There are perimeter solutions, which monitor your traffic going in

and out of the organization; it is highly recommended that you get one of those perimeter solutions. And now there is a flood of edge detection solutions as well, solutions that will look at your wireless LAN. There's WLC, the wireless LAN control protocol; they will monitor all the logs from those protocols to tell you whether you have authorized devices, unauthorized devices, or unauthorized devices trying to do IV attacks on your network and trying to gain access, et cetera. So you need a combination of both perimeter and edge detection in order to make sure that your life is sane going forward. So yes, you need a lot of monitoring

and visibility tools in order to manage IoT devices if you are deploying IoT. From a network administrator's perspective, what are their wish-list items? The first thing is they would like to have some sort of device management capability, which means they want visibility into what is connected to their network. There are some tools out there, like your Qualys and your Nexpose and Nessus scanners and so on. They want access control capabilities; they want to make sure that no rogue devices are accessing the network. And once you have that figured out, they need the ability to figure out: are these devices that are connected to my network

posing any risk for me? So I need to understand what kind of vulnerabilities they have. Usually you run down the OWASP Top Ten list of vulnerabilities: are these devices using insecure communications, are they sending passwords in clear text? Those are simple things that you would like to get some idea of before you deploy your devices, and that also gives you a risk profile of the devices in your network. Once you have that, you want to go to the threat detection component, where you want to see if these devices are potentially compromised or not. And if they are, what can I do about it?

Can I get notified, and how soon can I get notified? And of course there's the whole anomalous behavior aspect, which is: even if I don't know whether it's compromised or not, can you at least tell me if it is behaving abnormally? If so, that may be some sort of a compromise. And then finally, everybody wants instant remediation, which is you either quarantine that device or you prevent it from communicating altogether. That's a hard one. It usually requires either access to the device itself, or you need to somehow write some rules on your firewalls to block

that communication. So let's go on to the IoT identification phase, which was a part of that whole device management space. We'll talk about why IoT identification is important in a little bit, or rather it'll become clear. First let's figure out what is and what isn't IoT in my definition. In my definition, IoT is not IoE, which means that a lot of people think anything that's connected to the Internet is IoT. Well, not quite. We want to distinguish between devices that exhibit autonomous behavior and devices that are controlled by human behavior. Laptops, desktops, mobiles, gaming platforms: they have direct user input into them, and therefore their behavior

is governed by user input for the most part, whereas IoT devices like your Nest thermostat or your wall switch will react to a user input but for the most part are operating autonomously. That's a key differentiator, and this also becomes very clear if you look at industrial automation, or sensors in the county collecting rain data and weather data and so on. And by the way, that definition of exhibiting autonomous behavior came from never water; they didn't go with the whole IoT equals IoE. So what are the characteristics of these IoT devices? First, IoT devices generally generate a few short

packets per device. Again, I'm just giving you some sample characteristics; not every one of these applies to every single IoT device. These are approximate characteristics on a generic scale. They tend to have low-duty-cycle traffic patterns, which means maybe a long duration between transmission bursts; I may send out sensor data every hour or so. I have a Nest thermostat at home that sends data out literally every hour or so. The traffic patterns have small statistical variations, which means that there's a lot of periodicity involved in these traffic patterns. And the other thing, which is actually a key

characteristic, is that they tend to be uplink dominated, which means that the uplink volume tends to be higher than the downlink volume. Another key characteristic is that they tend to have a low destination count: if you look at the number of unique destinations that they communicate with, that tends to be a fairly small number and also a fairly constant number. So it's a constant, small number from an IoT perspective. These are the characteristics that we will use to identify IoT devices in your network. One thing that came to my mind early on was: can we

just do an IoT lookup in some sort of a database? I mean, you have feeds for threats, for malware, for bad domains and so on. We tried to look for a few sites that had a list of IoT devices, and we found some of these. The third one was interesting: they claimed to be the dominant site for all IoT devices, except that they had fewer than 100 consumer-grade IoT devices. If you go to iotlineup.com you can see what those hundred are. Obviously that's not going to work: with five and a half million devices being added per day, that's just way too many devices. So you really

need automated detection. That's the key requirement here. So enter Bro. The previous speaker also mentioned Bro. Bro is network monitoring software. You can run it on a VM; there are appliances available as well. It essentially collects session metadata from all of your traffic, and depending upon where you place your Bro sensor, you can collect session metadata of devices communicating with the Internet or even inside your network. So it really depends on the position of your Bro sensor. The idea is that if you're trying to do IoT detection using Bro, you try to get as much good data

from your Bro sensors in the network, and then maybe you can identify which of those are IoT devices. You may already know some of the IP addresses assigned to those IoT devices, or maybe they have a DNS name inside your network, and that's one way to identify them. Then you can write some rules to detect each type of IoT in your network. You may say: if it is this IP address and it is communicating with these endpoints, then boom, that's an IoT device. You can do a whole bunch of if-then-else statements there, but that's hard to scale, and

it's hard even if you generalize to some of the characteristics that we gave earlier. So what you can try to do then is introduce some machine learning into this whole process. If you look at IoT detection using Bro and machine learning, your procedure now looks somewhat like this: you still get as much good data from your Bro sensors in the network, and you identify your known IoT devices in the network using the techniques that I mentioned before. But what you do now with machine learning is start labeling your traffic. You start labeling which of

my devices are IoT and which are not. Then you run a preprocessor on that data to see what the relevant features are, and then you apply some domain expertise to get what are called engineered features. We will talk through these in the next few slides; this is just a summary. You apply some suitable supervised machine learning algorithm to the data and then see how good your results are. And of course at the end of it you have to continuously tweak these parameters to see if you can get better results. So what does Bro data look like? This is just a quick dump of some Bro connection log data. It has

multiple fields, usually source port and destination port; these are all TCP connections. As I said, your first job is to figure out which of these... oops, sorry, it got messed up here, it's not displaying properly. But anyway, you will see that some of these IP addresses are IoT and some are not. Once you figure out which IP addresses are IoT, you can just run a little script that will add that label to all of your data. The next step is to run a preprocessor, which extracts the features that are relevant. So if you

just run a preprocessor on the raw data, the preprocessor will go through all the data and then tell you: these are all the relevant features that I think you should consider. This is what we did. We chose univariate as one of the preprocessors and ran it, and the stars over there tell you what the relevant features are. But there's not just one preprocessor; there are actually tons of them out there. So what we did was run it with all of them. There's univariate, there's PCA, there's lasso, there's a whole bunch of them. We ran it through every one

of them and then took a statistical average. We averaged out all their responses and came up with that line at the bottom that says: here is an average across all these features. A couple of them stand out: a relevant feature is the IP bytes that were being sent out by the device, and a not-so-relevant feature is missed bytes, which was present in the Bro data. Then we applied our domain knowledge. We went back to those characteristics that we described earlier, which is a small

number of destinations, and up/down traffic ratios that tend to be high for IoT devices versus non-IoT devices. We added that into the data and came up with these values, and then we again ran the preprocessor on that, and you can see a very high correlation among those values. It's just a sample; three of them that we added in here are up/down ratio, connection counts, and the URI field. Again, this is the engineering aspect that you apply here so that you can move from just raw data to engineered data. Now you know what your machine learning algorithm needs to take as input in order to do your detection.
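The engineered features described here (up/down ratio and connection count, plus the destination count from the earlier list of IoT characteristics) could be computed per device roughly as in the sketch below. This is only an illustration, not the speaker's code: the field names `id.orig_h`, `id.resp_h`, `orig_ip_bytes`, and `resp_ip_bytes` are standard Bro connection log fields, while the function name `engineer_features`, the output keys, and the sample sessions are hypothetical.

```python
from collections import defaultdict


def engineer_features(sessions):
    """Roll per-session conn.log records up into per-device features."""
    up_bytes = defaultdict(int)
    down_bytes = defaultdict(int)
    conn_count = defaultdict(int)
    destinations = defaultdict(set)

    for session in sessions:
        device = session["id.orig_h"]  # originator (device) IP address
        up_bytes[device] += int(session["orig_ip_bytes"])
        down_bytes[device] += int(session["resp_ip_bytes"])
        conn_count[device] += 1
        destinations[device].add(session["id.resp_h"])

    features = {}
    for device in conn_count:
        features[device] = {
            # max(..., 1) avoids dividing by zero when nothing came back.
            "up_down_ratio": up_bytes[device] / max(down_bytes[device], 1),
            "conn_count": conn_count[device],
            "dest_count": len(destinations[device]),
        }
    return features


# Hypothetical sample sessions (made-up addresses and byte counts).
sessions = [
    {"id.orig_h": "10.0.0.5", "id.resp_h": "203.0.113.10",
     "orig_ip_bytes": "800", "resp_ip_bytes": "200"},
    {"id.orig_h": "10.0.0.5", "id.resp_h": "203.0.113.10",
     "orig_ip_bytes": "900", "resp_ip_bytes": "200"},
    {"id.orig_h": "10.0.0.9", "id.resp_h": "198.51.100.1",
     "orig_ip_bytes": "300", "resp_ip_bytes": "6000"},
    {"id.orig_h": "10.0.0.9", "id.resp_h": "198.51.100.2",
     "orig_ip_bytes": "500", "resp_ip_bytes": "4000"},
    {"id.orig_h": "10.0.0.9", "id.resp_h": "198.51.100.3",
     "orig_ip_bytes": "200", "resp_ip_bytes": "10000"},
]

features = engineer_features(sessions)
```

On this made-up sample, the first device shows the IoT-like profile from the talk (uplink-dominated traffic to a single destination), while the second looks more like a user-driven machine.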

OK, so then we ran random forests on this data. We actually did pretty well even without feature engineering: it gave me a 0.76 accuracy of detecting IoT versus non-IoT given that number of sessions. But after doing the feature engineering, that increased significantly to almost 99% accuracy. We tried other techniques as well: we did boosting with logistic regression, we applied k-nearest neighbors, and we found that these were also fairly good. It really depends on your data set and on your familiarity with these algorithms, so your choice of algorithm may vary. This confusion matrix is one of

the key determiners of your accuracy, or your sensitivity to false positives, false negatives, and so on. I haven't put all the math in here; there was a talk earlier about machine learning that had a lot of math in it. One other curve that is useful is the ROC curve, which shows the trade-off between the true positive rate and the false positive rate. Without feature engineering you will see that you get this 0.76 accuracy, and that's reflected in these curves, which means you have to tolerate a certain false positive rate before you get your accuracy up. But once we did the feature

engineering, that rate decreased significantly, so that you can get a very high accuracy with a very low false positive rate, and that's what we wanted. These two curves are significant when you're doing any sort of machine learning analysis. From a machine learning perspective, data is everything. Multiple people have talked about this. There was a great talk at Black Hat from the engineers at Invincea and Sophos about how if you feed your learning algorithms bad data, you're going to get bad data out. A lot of people don't realize how important this is. The way you collect your data is

extremely critical, and your results are going to be extremely dependent on the data that you feed it. You need to normalize your data, especially if you have data coming in from multiple sources; normalization is a very critical aspect. Another aspect that a lot of people overlook is undersampling versus oversampling. Almost always you will have an unbalanced data set, so you'll have a large quantity of, let's say, IoT or a large quantity of non-IoT traffic. How do you then balance? Depending upon your circumstances you may have to use undersampling or oversampling. We were lucky that in the data set we were

considering, we had a roughly equal number of IoT and non-IoT devices. I can't stress enough how important the engineered features are. We chose our engineered features in such a way that we were able to maximize our accuracy. A different choice of engineered features may or may not give you that, and you may not have the luxury of choosing features that will give you that level of accuracy. And finally, there was a talk yesterday about reinforcement learning, which is: even though your algorithm may predict incorrectly, you should be able to take that feedback. Let's say your

user says, oh, you know what, that was not really an IoT device and you classified it as IoT. You should take that information and feed it back into your algorithms so that you can relearn that aspect. That's what is called reinforcement learning, and it has been shown to significantly improve detection. One little side note on detection versus identification. So far we've talked about IoT detection: we can detect IoT, but what about actually identifying it? I can say I have a device that is an IoT device, but can you tell me that this is a Nest thermostat

or that this was a TP-Link light bulb that got connected to my network? It's a little harder. Identification requires a lot more contextual information about the data, but it's actually possible and we're getting there; that's something we're working towards. There are Bro fields that you can use to extract that information. The user agent is a great field; it generally gives you a lot of information about what kinds of applications are actually running and connecting to the net, if that data is available. MAC address mapping is a great way to figure out what the device is. Almost all manufacturers are given blocks of MAC

addresses, so you do a quick lookup and see if your device falls into one of those blocks. Again, it's not conclusive, but these are all little tips that will get you towards that holy grail of identifying what the device is. There's DNS classification of the destination: if I'm connecting to nest.com, it's probably a Nest device. And similarly with SSL cert analysis: a lot of data is now encrypted and I may not have some of these other fields, but you can figure out from the cert itself that the cert was issued to Belkin, so it must be a Belkin device. You can then combine all of this and use some sort of

natural language processing, for example, to extract some really meaningful data about what that device actually is. This is something that we are working on right now. From an algorithmic perspective there are some great algorithms out there; k-means and hierarchical clustering are really good fits for clustering and classification. Just a little sidebar. OK, so let's go to IoT threat detection. There are a couple of ways to do threat detection. Generally you can write a load of rules and develop threat models. Threat models look for specific patterns. You may say: if I see a large amount of ICMP traffic going out of my network within a certain

period of time, then I probably have some sort of a device that is DoSing somebody; that might be a threat model. Threat models are written very specifically, so they look for very specific patterns, and the number of false positives is obviously minimized. However, they won't catch everything unless you have threat models that are really extensive, and even then, the fact that you have zero-days implies that threat models don't catch everything. The other option is anomaly detection. There are a lot of solutions out there today that say: we will tell you every single anomaly that

is in your network. But the problem with anomaly detection is that you get a lot of false positives, so you need some sort of a person or a tool to go through those false positives and figure out what is really a threat and what is not. So your goal in the case of anomaly detection is to minimize the false positives, and in the case of threat models it is to write enough models, or find ways to encompass a large number of threats, so that you get fewer false negatives. What we're going to do is try to use our detection

capabilities to do anomaly detection. And what is an anomaly? An anomaly typically implies that you have some sort of a baseline or normal behavior, and an anomaly is something that is not normal anymore. That's the definition of anomaly in a sense. In order to do anomaly detection you need some way to identify what that normal or baseline behavior is. That's the key. So what we're going to do is use the results from our detection stage to build a baseline. And the key is those engineered features that we have; they actually tell a very good story. If we can just enumerate

all of those features, both from the raw data and from the engineered features list that we computed, that gives you a very good sense of what the baseline features of the IoT devices are. A quick example of what that might look like: the Nest thermostat communicates with just two destinations per day, the up/down ratio on average is 4.32, the number of sessions per day is about 265, and it only uses port 11095 to communicate. So that's your baseline behavior of your Nest thermostat. Again, I'm just giving you a small sample of attributes here; you

can expand that out significantly to include everything from periodicity and so on, so you can really expand on that list. Similarly, I've got another example using the Belkin WeMo device, which is also on a 24-hour basis. OK, so going back to IoT threat detection: you can write rules to detect threats, which is hard, as I mentioned before. Or you can write anomalous pattern rules and then hunt. As I mentioned, with anomaly detection, for all these anomalies that come out you then have to go back and figure out from a contextual perspective whether it's a threat or just an anomaly. What

if we were able to use machine learning for this? What we do is summarize the baseline features, and then we build models that take these baseline features, compute various threat models, sum them together, and train with that. Once you train the machine learning algorithm, and we use random forest here, we can test to see whether these results actually work out well or not. We used a large number of pcaps from a site called Stratosphere IPS, stratosphereips.org. They maintain a very large number of network pcaps on different types of threats: DDoS, Trojans, malware, and so on. And they

have ready-made Bro capture logs as well, so it makes it really easy; you don't have to run it through a Bro sensor and extract that data yourself. This combination is what we used in our experiment. What did we get from a results perspective? On a per-session basis we actually got very good accuracy, but we did miss a few. This data seems very accurate, but even with per-session detection we did miss a few: about 46 and 35 sessions were misclassified among 33,000. But when we went to a 5-minute

detection interval and summarized the data, we got complete classification. That means if you wait a little bit and aggregate the data, and you can see the number of sessions diminishing over here, the detection rates improve significantly. So what do our results say? They say that you get very high accuracy in determining these common threats. Again, we didn't run it through every single threat; we looked at what might be some of the common threats associated with IoT and ran against that, and those are our results. An increase in the detection time obviously improves the accuracy further. So from a closing perspective, I'm really excited

about machine learning and what the applications are. As I said, I don't claim to be a data scientist, but seeing the applicability and the impact of using machine learning has been a tremendous learning experience for me. I'm still learning as well. We ultimately want to go towards threat identification using machine learning. I don't know of anyone who does that today, but that would be the holy grail, and hopefully we can get there someday. I wanted to thank the two interns who worked with us over the summer, Satya and Yi. And that's not Anonymous the Anonymous; that's just a person who didn't want to

be named here. Thank you all. That's about it. Any questions?

OK, so the source of the data was a data set from one of our customers. Yep. Yes, there are some Python libraries that we have for this; we'll put that up on our GitHub. If you give me your card or information I can send that to you. If you just go search for Atomic Mole... it's not up there right now, but we'll get it up there. Yes? No particular reason. Random forest seems to work well for us with this data set. If you noticed, even in the detection phase we did try out the other algorithms as well: random forest, boosting, k-nearest neighbors, and so on.
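Trying an alternative algorithm on a small labeled device set might look like the sketch below, which scores a plain k-nearest-neighbors classifier with leave-one-out evaluation. It is a standard-library illustration only, not the Python libraries mentioned in the answer: the function names, the two features, and every value in `devices` are hypothetical, and the features are left unnormalized for brevity even though the talk stresses normalization.

```python
import math


def knn_predict(train, query, k=3):
    """Majority vote among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def leave_one_out_accuracy(data, k=3):
    """Hold out each device in turn and classify it from the rest."""
    correct = 0
    for i, (point, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, point, k) == label:
            correct += 1
    return correct / len(data)


# Hypothetical (up_down_ratio, dest_count) pairs; 1 = IoT, 0 = non-IoT.
devices = [
    ((4.3, 2), 1), ((3.8, 1), 1), ((5.1, 3), 1), ((4.0, 2), 1),
    ((0.1, 40), 0), ((0.2, 55), 0), ((0.05, 31), 0), ((0.3, 47), 0),
]

accuracy = leave_one_out_accuracy(devices, k=3)
```

Leave-one-out suits a data set of only a few dozen devices, because every device is used for testing exactly once; the same harness could wrap any other classifier for a side-by-side comparison.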

For some reason random forest seems to work pretty well, and it's not just me; I think there were a couple of other talks today as well, and they all seem to be zooming in on random forest. I don't know, but you can try others as well. As I said, it is very dependent on your data set and what you're comfortable with. Any other questions? Yes?

OK, so there were a total of 18 IoT products and 16 non-IoT devices in the network, 34 in total. Yes, possible. There is actually an NSF-funded project at CMU where they are trying to do a behavior classification of all commercial IoT devices. I think they're starting with popularity: rather than claiming to have every single IoT device, they're just taking the top 100 popular IoT devices and starting with that. I've been working with them a little bit on being able to take some of their data and give them some of our data as well, so you might end up seeing our data over there. Any others? Thank you very much, guys.