
uses. It's something that I think we all believe we understand. It comes in different plies, it has different shapes, but ultimately this roll of toilet paper, I think we can all agree, is not very helpful. First of all, it wouldn't fit through most of our front doors, so shopping for it would be really difficult. What really matters here is that toilet paper in this size is very unhelpful. This size, however, is what we're used to and is something that we can make use of. This is how I feel about the term Big Data: Big Data equals something that
in its current form factor is not very useful. So we need to think about how big Big Data is and how helpful Big Data is, come up with the answer to that, and once we know the answer, then we can talk about how to make more intelligent decisions on top of data. I don't know if you know, but there's a toilet paper encyclopedia on the internet, so I went through that and found some interesting statistics. This is all crowdsourced information, and there's everything from whether you're a roller or a buncher to something that I would say
is getting towards the more useful side of making use of data. Useful is in the eye of the beholder, of course: although we may care as consumers about the other information, we potentially aren't going to look at the data the way a company like Charmin or Procter & Gamble, somebody that sells toilet paper, would. With that, I want to make an analogy to the security world and what we think of as making use of data, in that we're not talking about Big Data just for the sake of understanding cyber security at an astronomically high level that no one can really use; we're
talking about making better decisions in our organizations. Now, if your organization is a big security company, then data is at a different scale, but my presentation is going to be about making decisions at the organization level: how can I understand what's happening in incident response, how well I'm doing it, who's attacking me, and why the threat keeps going after the HR group versus my executive team? Those are decisions that maybe don't require big data but do require data analytics. Big companies run their business on Big Data, but I would argue that the size of the data isn't as important as the decisions that are being made. So how do you decide what the
value of the data is? Well, there's another term being used synonymously with big data, called threat intelligence. Most in this room hopefully agree, because we're smart, that threat intelligence is the result of running a data analytic process to come up with something that means something. Rick Holland, who talks a lot about threat intelligence (he's a Forrester analyst), breaks down the attributes of value of threat intelligence into these terms: accurate, aligned with your requirements, integrated, predictive, relevant, tailored, and timely. So big data is the unrefined coal that's coming into the manufacturing process, and at the other end you want to measure the
diamonds you're creating against those attributes. Big data is great, but this is my no-kidding slide. How many of you in this room store your incident data in spreadsheets? Come on. All right, at least one person does. I talk to the biggest companies in the world and they store their data in spreadsheets. If you guys are really, really advanced, you're probably storing it, let me guess, in ticketing systems. I'm not setting you up for failure here; I know the state of the art is screwed up right now. The biggest way that people communicate about security matters in the business today: email. The way that they store the indicators and find
them again in subsequent incidents: email search. Maybe, if they're really advanced, they have an Excel spreadsheet that they have created a pivot table on. I don't know about you, but I use Outlook and Mac Mail and neither search works very well, and the bigger the data set becomes, the harder it works, and it doesn't produce any answers. So this being what we do today is having a major impact on our ability to make decisions in a predictive way, and even on the simple stuff. Incident response and security operations are things that move quickly, and we can't afford to communicate using 30-year-old technology. So, enter Big Data. Most of
you are looking at Big Data as the merger of all this information from all these sources, and the most important part of that is that you don't know what you're going to use it for. That's my frequent conversation with customers: they say, "I want to put all my data in this database, and I don't know what I'm going to do with it yet, but someday I'm going to figure that out." Well, putting my developer hat on, it doesn't work that way. You have to figure out what the data is going to be, you have to figure out the ontology, you have to figure out your use cases first,
because the Big Data space is super complicated. You need different technologies for different things, and you need different developers to build on top of them for different use cases. So think twice before saying, "Hey, my incident responder has a little bit of time in between fighting incidents, so they're going to build a big data platform." This is really hard stuff. It's not to say that I don't like building things with the best of them, but focus on the small data first, and on making sense of that, before you jump into a big data platform. So, small data or medium-sized data. Again, I don't know how to define big data. Is it millions of records? Is it
tens of millions, or is it hundreds of millions? Who knows. But what is important is that this problem has happened in every other part of the business. Every other part of the business makes decisions based on data: product management makes decisions on what their customers do, salespeople make decisions on who's buying what, and finance is very data driven. They use Excel spreadsheets, and they're supposed to. As security people, we're not supposed to use spreadsheets unless we're doing some kind of financial analysis. So where is the equivalent for the security industry? Well, the first problem we need to think about is: what is a fundamental requirement of building a platform that understands data? Again, you can go back and you can just
buy Hadoop and throw it down and put a lot of data in it, but Hadoop doesn't know how to analyze data; only people know how to analyze data. So fundamentally the thing that's missing from security, unlike sales and finance and other parts of your business, is an understanding of how to look at data: the methodology that an analyst uses to make sense of data. There aren't a dozen choices here. There are people that talk about data structure, so there are things like STIX, and there are various other standards around the world that look at data and the structure of data, but that's still
missing an important ingredient, which is how data is managed and how you go about analyzing data. The one that some DFIR analysts use (hundreds, I would say; this is definitely not known to everybody) is the Diamond Model of Intrusion Analysis, created right here in the DC area, within the Beltway, and used by a lot of government agencies. What this does is allow you to make sense of the analytic process. Just like Salesforce had the plethora of people that knew how sales worked, this stands to be the platform, or the foundation from a methodology perspective, that we
can all use to understand how to look at data. To give you a quick overview (my colleague Andy Pendergast is going to give a presentation tomorrow at 12:30 that will be very detailed about this, right down to the algorithms and how the graph theory works behind it): at a high level it allows you to take all the data that you capture during incident response, security operations, and threat intelligence processes and make sense of it with relationships. It's got a technical axis between the capabilities, or the malware, and the infrastructure that the adversary uses, and then it's got a geopolitical axis, which is between the adversary
and the victim. A lot of people think of these as tactical and strategic intel. I would say it's easier just to think about it as information that matters to you in decision making. The term "intel" may be overused in this situation; it's data about what's happening to your business that you need to use to make better decisions. The industry as a whole, for the last year, has been talking about a new type of platform that takes advantage of the fundamental principles that I just discussed, and they're calling it a threat intelligence platform. Again, the term threat intelligence, I think, is used differently by different people, but
let me just bring it back to the requirements that I just talked about, and in particular the ability to capture data from various inputs: the ability to bring data in from email lists, from your security operations center, from your threat team, and maybe from vendors you pay for feeds. The ability to collect all of that information and aggregate it is the first step of a threat intelligence platform. Second, you have to analyze it. Many people think you just buy threat intelligence and your problems all go away. It needs to be relevant to your organization in order to be useful, and the only way to make it useful and make it relevant is
to actually look at it and understand what it is you're trying to accomplish. There are different people that use that data in different ways. The CIO looks at the data within that platform differently than the threat analyst: the threat analyst is focused on the threat, while the CIO may be focused on the threat analyst and how productivity is gained when they bring in new data sources or new types of automated capabilities. And finally, once all that data is analyzed, similar to the coal analogy, at the other end of the conveyor belt you can't send it all out via email. That's not the right way to distribute new threat intelligence or
new actions. Instead, the action side of this needs to be integrated directly into all the products you have. We don't need another security product. I would argue we need a way to take all the data that we know as an organization, from all the different sources that we get it from, and pipe it into our existing security products, and once we do that we're making them all smarter. The next really powerful thing is the feedback loop: now we can start collecting information about the alerts and the action side of things, and we can start using that information to make better decisions about how we produce data.
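As a rough illustration of that feedback loop, here is a minimal Python sketch. It assumes alert outcomes can be exported as (source, confirmed) pairs; the function name `score_sources`, the source labels, and the sample data are all hypothetical and are not taken from any particular product.

```python
from collections import Counter


def score_sources(alerts):
    """Return, per intel source, the share of its alerts confirmed as real.

    `alerts` is an iterable of (source, confirmed) pairs, where `confirmed`
    is True when an analyst judged the alert a true positive.
    """
    fired = Counter()
    confirmed = Counter()
    for source, was_confirmed in alerts:
        fired[source] += 1
        if was_confirmed:
            confirmed[source] += 1
    return {source: confirmed[source] / fired[source] for source in fired}


# Hypothetical alert outcomes fed back from detection and ticketing tools.
alerts = [
    ("community", True),
    ("community", True),
    ("vendor-feed", False),
    ("vendor-feed", True),
    ("vendor-feed", False),
    ("vendor-feed", False),
]

scores = score_sources(alerts)
# List sources from most to least useful, by confirmed-alert rate.
for source, rate in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{source}: {rate:.0%} of alerts confirmed")
```

A simple rate like this is only one possible measure; the point is that outcomes flow back and inform which sources deserve time and money.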
I'm a product company: I make decisions on what my customers need, I measure their interest, and then I put investment behind that interest. It's no different in security. We need to have the feedback loop so that we can make better decisions about where to place our time and money. So, a little bit more about aggregation. Aggregation is the first step in a threat intelligence platform. It's the ability to bring all your data in and start the manufacturing process to create something meaningful. It starts with your data: you cannot go buy threat intelligence and have it solve all your problems. You need to look at your own problems first, make decisions about what it is
that you really need, and then look at where you can get that information. One of the big things, for those that are aware of ThreatConnect, is that we're really big on community. We view community as being the highest-fidelity source of threat intelligence that exists, because it's real time. You literally can get information from your business partner in real time that says, "I was just attacked by this particular adversary, in this way, with this malware and this infrastructure," and you can use that to defend yourself. Communities and sharing are the very best way to do that. Is it hard? Yes, it's very hard. People often talk about the
legality of it, but it's happening. It's been happening for many years, but it was happening at an analyst level, at a "we drank too many rum and Cokes last night" level. So it's happening today; it's just that the platforms, the capabilities, and the technology are getting better at allowing people to do it more quickly and do it at scale, using data and structure under the hood. So, data services. This is the security vendors' dirty little secret. Data services are things that you can use to make your data smarter: things like passive DNS and active DNS and WHOIS data and IP geolocation. These are
all data services that, if you run your data through them, it comes back with new data, and that data can be used to make correlations. So you can have different incidents that are correlated by WHOIS on the IPs or the host names that were used in the infrastructure of the attack. Maybe you don't know that this incident and this incident are the same until you look at the adversary infrastructure that was used between the two attacks. Data services are how the security companies create their own coal: they're mining their own coal and creating diamonds, and they're selling them to you. So create your own diamonds. And then
finally, if you don't have the time to create your own diamonds and you can't build the automation to do it, you can buy that information or you can get it from open sources. There are lots of sources, and, the new term now, every company's building a marketplace of intel sources. I would urge you to think about sources, but only after you understand your own data. Finally, you need to think about what kind of data you need, first and foremost. Don't jump into "I'm just going to collect data from everywhere," because you're going to make it meaningless. It will mean nothing, because you'll have too much of it; you won't be able to
process it. It will be like dumping too much coal on the end of the conveyor belt: you don't get any diamonds. Some of the things that you can't see, because it's cut off, are the types of threats you're interested in and the types of indicators themselves. Are you interested in strategic intel or tactical intel? If you're looking for tactical, do you have the ability to look for host names on the network, or URLs, or does that not matter because you don't have a detection capability to even make use of it? So think about what your use cases are first, and then, once you understand what you're looking for,
then look at the various vendors or open source feeds that have that kind of information that you can bring in. But only bring it in if you're going to use it, and only bring it in if it can be overlaid with the rest of your business. So, there's a lot of talk about machine-readable threat intelligence. Who here is familiar with STIX? All right, good, more than one hand went up. STIX is kind of the new cool thing in this idea of threat intelligence. The reality is I need you all to be ambassadors of STIX, so I'm going to teach you a little something, and I'm
hoping you're going to go out in the world and you're all going to tell two or three people, because we have a major problem as a security industry right now around this term, in that people think that STIX is a finished product: something you can just start using, and everybody will be on the same page. That's not the case. STIX is a super language. For those that don't know, that's a language used to create other languages. Is it a standard? Yes. Is it a standard that is helpful? Yes. But it's similar to, for those that remember, electronic data interchange, EDI. It was a super language that every company
could use to build a bridge between their companies, and the challenge was we ended up with millions of bridges. None of them were interoperable with each other, but they did work in providing a foundation to build the bridge. Where we're at as a community right now with STIX is that STIX is a great starting point. From the perspective of "if I'm going to start taking XML and making it into something," it's the right thing for me to use. But where we're at is that everyone's doing that and saying "I support STIX," but none of them are interoperable. Believe it or not, STIX is not interoperable out of the box. So
when you guys head out of here on Sunday and you go back to your workplaces, just remind people that STIX needs work. We as a community need to go from vendor-specific to STIX for SIEM or STIX for IR, and it needs to be specific enough that we can actually all build the same things. So, analyze. I don't have a whole lot of slides here; I just want to talk through a couple of these use cases. Analyze is the most important part of data. You pull in data from all these sources, and it doesn't just analyze itself; you need to make use of it. When you think about the different use
cases that security teams use data for, there are many, and I'm going to talk you through a couple. The first one is the threat intelligence production and fusion use case. You are a Fortune 1000 company, you're in an ISAC, maybe you're in critical infrastructure so you're good friends with the Department of Homeland Security, and you've got five analysts on your team, each of them part of a few fight clubs that they get data from. So in general you have data coming from a variety of places, and you're looking for that data to mean something, to be usable. Second, you need to
make it relevant to the stakeholders, which may require you to automate some of the processes and incorporate some workflow. For example, every time DHS sends an email that says there is a critical vulnerability in SCADA system X, and you happen to have that system, you probably want to escalate the priority on that and get an analyst looking at it right away. Analysts today are not well organized. They don't know what to work on, they're discombobulated, and email is not the best way to prioritize what they're working on. So workflow is important, but maybe even more important than workflow is, once they determine it's a risk to your organization or your network, the ability to bring action against that
objective immediately, and that's where integrations come in. A threat intelligence platform has to have an API. It has to be integrated into the security products you already have, so that you can suck in those indicators and start looking for them in real time within the network, and I'm going to talk through that a little bit more. Event and alert triage: when you see an event from a SCADA system, just to keep my analogy, and that event has to do with the exact same indicator that DHS warned you about, and there's a high risk from the adversary that is behind that activity, you want to know about
that right away. In essence, SIEM has a lot of alarms going off. Threat intelligence and data-based decisioning allow you to make decisions about priorities based on risk, and that's the thing that I think we can all agree SIEM never completely met the need for. Given the fact that we're conveying or analyzing risk within a threat intelligence platform, we also have the ability to correlate risk with real-time information, and a feedback loop, again, can start that intelligence cycle. Incident response support I would see as being very similar to that. Incident responders should not, when they're creating a Remedy ticket, be storing all the indicators in Remedy and
then having to go look in a wiki to see whether they have ever seen that indicator before. They should create an incident, and the system should tell them that these indicators have been seen in these incidents, with this degree of risk, because there's a correlation with an adversary that's known by the organization. And then hunting. Hunting is a new term. I think a lot of larger companies that have threat teams are starting to proactively hunt for indicators or adversaries in their network, and the only reason I bring it up here is that it's a lot of back and forth. It's a lot of communication and collaboration across the security
team, because you're looking for an indicator, you find it, and, all right, now I have to look for five more, and if I find them I need to analyze them. So that triage process needs to be orchestrated in a better way than email. Finally, it's all about taking action. You can have the best data and the best understanding of risk, but unless you do something with it, it's going to be worthless. The first use case here that you can look at for a threat intelligence platform and this kind of data is making decisions: it's reporting, it's planning. Anytime you get involved in an organization that wants to be better:
Amazon didn't just snap their fingers and say, "We're going to be the world leader in shipping products all over the world." They had to get better at that. They had to measure their effectiveness and improve. The other thing to think about is that, as cyber security practitioners, not only do we need to improve at understanding the threat, but we need to improve our processes in doing that. Incident response and threat intelligence production: the faster we do those processes and the more automation we bring to them, the better the return on investment and the faster we can move against our adversaries. So it's all about planning, processing, making
assumptions, modeling those assumptions, then measuring, and improving as quickly as we can. So, more exciting: people always want to focus on "what am I going to find on my network?" With a threat intelligence platform, one of the core use cases is what's called the watch list use case. This is the conveying of indicators from that analytic process, or multiple processes, in the action phase out to your network, so that things are being searched for. This is just a screenshot of Splunk using indicators that have been put in one or more watch lists. This should be automated: as an incident responder
starts processing an incident, you should be hunting for those indicators, because you think they're bad. Maybe you don't block at that point, but you should definitely be hunting. There are various other watch list use cases beyond a SIEM: there are firewalls that look for hashes, and there are host-based products. You should be able to build your own watch list based on the types of indicators that you're collecting, aggregating, and analyzing, and then farm it out to the devices you have that can find that stuff on the network. Second is signatures. I know that signatures are not sexy anymore; it's all about behavioral analysis and a whole bunch of things that, you know,
are effective if they're really good, but we're not sure how good they are. Signatures are used around the world today to make use of indicators, to make use of this data that we've analyzed, and that's not going to change anytime soon. What needs to change is the way that we manage signatures. Today you get that DHS report, you create a signature, and you put it into your device. What you don't know is that you actually had those indicators in another signature that you had already deployed three years ago. You just don't know, because there's no correlation between the intel coming in or being created and the actual signatures that you've already
deployed. Also, signature management is something that most organizations do with email. Maybe they have a wiki of some sort where they put a signature up, and then the QA team pulls it down, looks at it, and, once they're ready, deploys it, but it's a haphazard process. A threat intelligence platform should be able to take in intel and quickly determine the correlation between an existing signature and the new intel or the existing intel. It should be able to quickly determine where you're covered and where you're not, and then help the process of delivering that signature, including an API-based integration so that the Snort device,
or the Bro or the YARA device, just sucks that signature up. Many of you likely have built a solution like this, but that's not for the less mature organizations, which need more of an out-of-the-box experience. Finally, there's always going to be more analysis needed that can't be automated, where you really need humans to do it. A threat intelligence platform need not think about being the answer to world hunger. It doesn't need to be the SIEM, and it doesn't need to be the visual analytics product; it needs to be the brain that everything else can tap into. These are just some screenshots from Maltego of an
integration with a threat intelligence platform. You can see there are data groupings and there are specific pieces of data that are stored and can be analyzed using some kind of visual analytics product, and then you can drill into the specifics of all of that information and pivot around. This is for the relationships that only a person can understand, and this is why a threat intelligence platform needs to merge automation and people, because there is no way to, again, snap our fingers and start creating diamonds for our organization. So, in summary, it's all about aggregating, analyzing, and acting on the data you
have. Big data is definitely a really helpful thing, but I think most of us feel like a form factor like this is probably the starting point, and something that will be more useful in the short term.
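The aggregate, analyze, act cycle from the talk can be sketched at small-data scale in a few lines of Python. This is only an illustration under stated assumptions: the `Incident` class, the function names, the source labels, and the sample indicators are all hypothetical, and "analysis" here is reduced to finding indicators shared between incidents.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Incident:
    """One incident plus the indicators observed in it (all values hypothetical)."""
    incident_id: str
    source: str                                   # e.g. "soc", "community"
    indicators: set = field(default_factory=set)  # (type, value) pairs


def aggregate(*batches):
    """Aggregate: merge incidents from every source into one list."""
    return [incident for batch in batches for incident in batch]


def correlate(incidents):
    """Analyze: map each indicator seen in 2+ incidents to those incident ids."""
    seen = defaultdict(set)
    for incident in incidents:
        for indicator in incident.indicators:
            seen[indicator].add(incident.incident_id)
    return {ind: ids for ind, ids in seen.items() if len(ids) > 1}


def watchlist(incidents, detectable_types):
    """Act: export only the indicator types the network can actually look for."""
    values = {
        value
        for incident in incidents
        for kind, value in incident.indicators
        if kind in detectable_types
    }
    return sorted(values)


# Hypothetical data from two sources that share one piece of infrastructure.
soc = [Incident("INC-1", "soc",
                {("host", "bad.example.com"), ("ip", "203.0.113.7")})]
partner = [Incident("INC-2", "community",
                    {("ip", "203.0.113.7"),
                     ("md5", "d41d8cd98f00b204e9800998ecf8427e")})]

incidents = aggregate(soc, partner)
print(correlate(incidents))                  # incidents linked by shared infrastructure
print(watchlist(incidents, {"host", "ip"}))  # only what detection tools can search for
```

Filtering the watch list by detectable types mirrors the advice to collect only what you can use: the hash is kept for analysis but is not exported when no device can look for it.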
Questions? I can't answer that. Any questions? Was this helpful? Was this presentation shitty? All right. [Applause]