
Bringing IACD (Integrated Adaptive Cyber Defence) to the Financial Sector

BSides DC · 2019 · 42:12 · 44 views · Published 2019-10 · Watch on YouTube ↗
Speaker: Amar Paul
Category: Technical
Style: Talk
About this talk
Current security operations in certain companies involve consuming a feed of cyber intelligence data — information about current threats that may compromise corporate networks. Due to the large number of possible threats and related information generated each day, analyzing each entry in the feed — with respect to its severity and impact — can take hours after the time of first sighting. Using our Integrated Adaptive Cyber Defense (IACD) framework, we implement flexible automation in the networks of three banks — MasterCard, Huntington, and Regions Financial — to decrease average response time from roughly 6 hours to under 3 minutes. Working in conjunction with the Financial Services Information Sharing and Analysis Center (FS-ISAC) and the banks, we use Security Orchestration, Automation and Response (SOAR) to parse raw threat intelligence into standardized Indicators of Compromise (IoCs), which common security orchestration tools can ingest and respond to. As part of our threat intelligence parsing, we use external enrichment tools to provide a relevant "score" of how severe a particular IoC may be. Providing a central feed of parsed, enriched IoCs allows security orchestration tools to make good, consistent judgements on how best to respond to an incoming threat, taking into account the possible impact on the specific network under management. We touch on the importance of automation making non-damaging changes to the configuration of the network (e.g. we should not accidentally block a commonly used domain).

Amar Paul (Software Engineer at JHU Applied Physics Laboratory)

I work at the JHU Applied Physics Laboratory as a software engineer with a focus on open development and enterprise use of security tools. In the context of cyber security, my work focuses on furthering adoption of SOAR platforms where they can benefit companies, and on the formalization of standards for threat intelligence data and sharing.
On the flip side, I also work on development and usage of static analysis tools to scan current and legacy code, with the goal of making software safer to use for all parties involved.
Transcript [en]

BSides DC would like to thank all of our sponsors, and a special thank you to all of our speakers, volunteers, and organizers.

So I'll be talking about the IACD effort at APL — that's Integrated Adaptive Cyber Defense — and the work we did with some financial entities over the course of about five to six months last summer. This effort was an integrated pilot, so I'll talk about the motivation, the process, and the opportunity we had to do it. I'll walk you through a description of the pilot effort we put together — we have a few demonstration slides of the process itself — then some initial insights, followed by a retrospective on the results now that we're actually done with the pilot, and finally a look at the future ecosystem we'd like to see for certain cyber defense aspects in the community.

To start off, here's the executive summary of our results during the financial sector pilot. It was a joint IACD effort with the FS-ISAC, where we worked to provide a better feed of cyber threat intelligence data to certain financial entities. The focus of the pilot was on using automation specifically to enhance the use of threat indicators of compromise: we looked at the generation of IoCs, at scoring them, and at enriching them, and then at how financial entities could actually receive, ingest, and deal with these IoCs in a nice, automated, robust way.

At a glance, this graph shows the manual process on the right — the time financial entities were previously taking to remediate these indicators of compromise, which turned out to be, on average, something like 10 hours. We got it down to 3 hours at the absolute worst. That three-hour case is when a financial entity wanted a human in the loop to verify any automated actions being taken; without a human in the loop, that time could be reduced to under five minutes. That's a quick glance at how much automation can help with time savings in a cyber analyst's daily job.

With that executive summary out of the way, I'll take you through the initial motivation and the process for the pilot. The initial motivation was this idea of there being one ecosystem — one overarching place where, for example, financial entities can pull a feed of cyber threat data and get good, somewhat trustworthy information.

By trustworthy, I mean information with a real perspective on how threatening, how malicious, how bad something might be for their own internal enterprise network. An integrated pilot, in this specific case, helped us work with different entities all together: we could share insights and tips on what we did for each individual entity and use that to bolster efforts in other places while conducting the pilot. The whole idea was to help drive automated defense across the whole ecosystem.

When we look at information sharing — this sharing of cyber threat data — we want it to be valuable: information that really tells you what you should do in your corporate network when you subscribe to the feed. We want relevant data that has some meaning. We don't want to accidentally block some random domain that isn't actually hosting anything malicious; we want there to be real, actual use from the cyber threat data people are subscribing to. Specifically, we want consumers to be able to make timely decisions based on their own local considerations. Each entity had a different internal cyber enterprise setup and a different network; they were using different tools to ingest this information, different firewalls for example, and we wanted them to be able to make their own decisions based on whatever data we provided. That's a big point IACD makes: bring your own enterprise. We'll work with whatever tools already exist in your own cyber enterprise, so hopefully you never have to switch tools just because we don't have experience with something — we'll work with whatever you've got.

Now, as far as the opportunity goes, we were looking at enrichment to help with this automation. What that means specifically: we were bringing in some form of cyber threat data.

Without some external knowledge, an analyst might have no good idea just by looking at it — is that bad, is that not bad? They might see an IP that somebody else had flagged as malicious in some way, but without enrichment there was no real way for an analyst at a specific company to corroborate that the IP, or the file, actually is malicious. So we looked at using third-party external enrichment to enrich the initial feed of data coming in, and then, alongside that enrichment, cyber analysts at a specific financial entity would be able to understand how it would affect their enterprise specifically and what steps to take — update their firewall rules, update their block lists, whatever. The integrated pilot helped us understand how enrichment could work at a bigger scope and pull together information that lives in very disparate places out on the net.

Specifically, we're trying to answer the question cyber analysts ask: what action should I take? They want the information you give back to be relevant to them — they want to know what to do in their specific network. They want it to be actionable: if we send them an indicator like a malicious domain, they want to know, should I update my firewall and block that domain? And they want it to be targeted: we should be able to tell them, yes, update specifically your firewall, or maybe your Active Directory rules. Targeted, actionable, and relevant.

This is an overview of how the integrated pilot itself looked. We were working with the Financial Services ISAC — that's Information Sharing and Analysis Center. The FS-ISAC was the originator of the feed of cyber intelligence data, and we were taking that, parsing it into indicators, and then enriching it.

The enriched indicators, along with the original indicators, were sent as STIX bundles to our TAXII server, and the different financial entities subscribed to our TAXII server and ingested that information. During the pilot we referred to the entities as Ostrich, Emu, and Cassowary — the flightless birds — and at this point I'm at liberty to say that Ostrich, Emu, and Cassowary are Huntington Bank, Regions Bank, and MasterCard. Those were the three banks we worked with for the duration of the pilot, and two of those organizations are still using the automated setups we put in place inside their networks a year later. The pilot specifically focused on this FS-ISAC-to-TAXII-server piece we were working on: parsing the information, bundling it up, and publishing it to a server that other people could subscribe to.

That was the high-level description, but I'll also take you through a few specifics of the actual efforts that made this whole thing work together. One important question we had initially was: can centralizing this enrichment really benefit other people? If one producer gets the information and enriches it with different enrichment tools — for example VirusTotal or DomainTools — in one place, could that help everybody else, instead of everybody needing their own API keys for their own enrichment services and trying to do it themselves, possibly inconsistently? We also looked at how this enrichment could feed a maliciousness score assigned to an indicator. We wanted to figure out whether there was some nice, easy way of communicating to a cyber analyst: this thing is definitely bad, and we're sure because it shows up in this many different enrichment sources; this thing is probably not that bad, and we're sure because it doesn't show up anywhere. How can we come up with that kind of score?
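The parse-and-publish flow described here — indicators bundled as STIX and pushed to a TAXII server — can be sketched in a few lines. This is a minimal, stdlib-only illustration, not the pilot's actual code: the field subset and the example pattern are assumptions modeled on the STIX 2.1 object layout, and a real deployment would POST the serialized bundle to a TAXII collection endpoint.

```python
import json
import uuid
from datetime import datetime, timezone

def make_indicator(pattern, description):
    """Build a minimal STIX 2.x Indicator (illustrative subset of fields)."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")
    return {
        "type": "indicator",
        "spec_version": "2.1",
        "id": f"indicator--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "description": description,
        "pattern": pattern,
        "pattern_type": "stix",
        "valid_from": now,
    }

def make_bundle(objects):
    """Wrap STIX objects in a bundle, ready to send to a TAXII collection."""
    return {
        "type": "bundle",
        "id": f"bundle--{uuid.uuid4()}",
        "objects": objects,
    }

ioc = make_indicator("[ipv4-addr:value = '203.0.113.7']", "Flagged in FS-ISAC feed")
bundle = make_bundle([ioc])
payload = json.dumps(bundle)  # body for an HTTP POST to the TAXII server
```

On the subscriber side, each bank's module would poll the same collection and walk `bundle["objects"]` in reverse — the same shape, read instead of written.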

These are the specific technologies we used during the pilot effort. The top row is the technologies we used to set up mirrors of the different banks' enterprise networks on our own internal testbed — specifically Active Directory, email services, and database services as possible hosts of different cyber threat indicators. The bottom row is what we used to build the automation for cyber threat data ingestion. Specifically, we used Demisto, which is an orchestration platform, for those of you not familiar; we used Anomali ThreatStream, which hosted our TAXII server — that's where we pushed indicators and enrichment bundles up to for other people to subscribe from; and we used VirusTotal, DomainTools, and ThreatMiner as the third-party external enrichment sources. So we used three enrichment sources to enrich the incoming data we had parsed out of the FS-ISAC feed, to hopefully give cyber analysts at the three different banks a nice way to look at the data we were sending them.

This is a description of how exactly our architecture was set up. In the top left, we had emails being sent that were comprised of the indicators of compromise. We had Demisto up at the FS-ISAC, running a custom Python module we set up just to parse the information out of the emails and send it up to Anomali ThreatStream. In the bottom left is the setup at the banks, which ended up being fairly consistent among the three of them: they were also mostly running Demisto, running our Python module, subscribing from Anomali ThreatStream, ingesting the data we had sent and enriched, and doing something with it.
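The email-parsing step in that architecture can be sketched roughly like this. Our actual Python module isn't shown in the talk, so this is a hedged illustration of the kind of extraction it would have performed; the patterns are simplified (a production parser would also handle defanged forms like `hxxp://` and `203[.]0[.]113[.]7`).

```python
import re

# Illustrative indicator patterns, not the pilot's real ones.
PATTERNS = {
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "domain": re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b", re.IGNORECASE),
}

def extract_indicators(email_body):
    """Pull candidate indicators of compromise out of a raw alert email."""
    found = {}
    for kind, pattern in PATTERNS.items():
        found[kind] = sorted(set(pattern.findall(email_body)))
    return found

alert = ("Seen beaconing to 203.0.113.7 and evil-domain.example, "
         "payload sha256 " + "a" * 64)
print(extract_indicators(alert)["ipv4"])  # ['203.0.113.7']
```

Each extracted value would then be wrapped as a STIX indicator and pushed upstream, as described above.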

I mentioned local considerations earlier — these banks were each doing something a little different with, for example, an email indicator or an IP indicator. They each had their own internal enterprise rules for dealing with possibly malicious indicators of different types.

Now, as far as scoring indicators goes, we developed a process for the pilot with very specific rules we wanted to follow. First, the default score must do no harm. That means if we get an indicator of compromise and we're not sure how bad it is, we don't want an enterprise to accidentally take action against it.

Say an indicator represents a newly registered website, that website happens to be legitimate, and someone just happened to flag it. We might take it, hit a bunch of different enrichment sources, and find no results for it anywhere — so it's presumably legitimate. We'd like to assign that kind of thing a default score and send it up to our server, and when other people ingest it they won't act against it, because it's just the default score: we couldn't find anything really bad about it. Second, different scores should result in different actions. In other words, we don't want the score to be too granular; we want a very small set of actions that might be taken based on the score, so a cyber analyst can look at a score and immediately think: what would I do if I saw something like that? And of course we wanted it to be one hundred percent automated. There's also this concept of autoimmunity: if we were wrong about something, we want to be consistently wrong. That helps the cyber analyst — the actual human looking at it later — see what we might have done wrong, without having to worry about that kind of action changing in the future.

This is the layout of the maliciousness score metric we came up with. It's very simple; it's supposed to be an extremely transparent way of scoring indicators of compromise. Zero is if something's whitelisted. Specifically, we whitelisted google.com and some other domains and IPs. google.com actually shows up in a lot of indicators of compromise, because it's somewhat related to sites that host malware — since it's a search engine, you can find malware through google.com — so it appears in a lot of indicators, but obviously we don't want to block it, so it comes up as whitelisted. The default score, undetermined, is if we get something new and really can't find any enrichment results for it. Those are the two benevolent scores. The bad ones, on the right: we assign a two if we get a hit on a single enrichment source, and a three if we get hits on multiple enrichment sources. That's what we came up with at the time, when we were newly exploring whether enrichment could really help a cyber analyst determine at a glance whether something was good or bad.
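The whole metric fits in a few lines, which was the point. A sketch, assuming the "undetermined" default maps to a numeric 1 (the talk implies a 0–3 scale but doesn't state the default's number outright), with a toy whitelist:

```python
WHITELIST = {"google.com"}  # illustrative; the real whitelist was larger

def malicious_score(indicator, enrichment_hits):
    """Score an IoC on the pilot's 0-3 scale, from how many enrichment
    sources (e.g. VirusTotal, DomainTools, ThreatMiner) returned a hit."""
    if indicator in WHITELIST:
        return 0          # whitelisted: never act on it
    if enrichment_hits == 0:
        return 1          # undetermined: the do-no-harm default
    if enrichment_hits == 1:
        return 2          # flagged by a single source
    return 3              # flagged by multiple sources

assert malicious_score("google.com", 5) == 0  # whitelist wins even with hits
```

The whitelist check coming first is what makes the google.com case above work: high-traffic domains never get scored maliciously, no matter how many sources mention them.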

Now for a retrospective, about a year on — some technical findings. We ended up seeing quite a few overlapping indicators in the feed we were ingesting: about 6,000 overlapping indicators. I'm not sure if you can read the graph in the bottom left, but essentially most of those overlapping indicators were IP indicators — those were the ones seen across many, many different indicators of compromise. We also had some duplicate email indicators and a few duplicate file-hash indicators; that was the breakdown of the duplicates. Those duplicates were sometimes interesting, because their score on average came out to a 2 — which, if you remember our metric, means we found them in only one enrichment source, not multiple. That indicated to us that, at least at the time, even for indicators many different people have seen, enrichment sources don't necessarily update fast enough to keep up with how bad something is. We were querying VirusTotal, DomainTools, and ThreatMiner, and only so often found a hit on even one of them for something that five different people had flagged as bad in the last five or ten minutes.

And just to reinforce this with a stat: the time to automatically score and publish these indicators — parsing the FS-ISAC feed, turning entries into STIX objects, querying the different enrichment tools, and pushing it all upstream to the TAXII server — was one minute. In the past, when this feed was handled manually at the FS-ISAC, the time to manually score and publish ended up being almost six hours. That's a crazy time saving just at the FS-ISAC, not even at the ingestion level, when it gets to a cyber analyst on the other side.
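The duplicate-sighting breakdown described above is a straightforward tally. A small sketch (the sample feed is made up) of counting extra sightings per indicator type:

```python
from collections import Counter

def overlap_by_type(indicators):
    """Count duplicate sightings per indicator type.
    `indicators` is a list of (type, value) pairs parsed from the feed."""
    seen = Counter(indicators)
    dup_counts = Counter()
    for (kind, _value), sightings in seen.items():
        if sightings > 1:
            dup_counts[kind] += sightings - 1  # sightings beyond the first
    return dup_counts

feed = ([("ipv4", "203.0.113.7")] * 3
        + [("email", "a@b.example")] * 2
        + [("sha256", "ff")])
print(dict(overlap_by_type(feed)))  # {'ipv4': 2, 'email': 1}
```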

And here's something driving home the point that different organizations want to do different things when they get indicators of compromise. We had one organization that wanted to block any indicator of compromise they got with a two or a three score, but a different organization that, for any indicator with a two or a three, wanted a human in the loop to actually look at it and verify they should update their firewall or their blocklist or whatever. There were a lot of differences in how the organizations decided to deal with different kinds of indicators and different scores, and each pilot participant had a fairly varied outlook on dealing with the indicators we were pushing upstream. We also ran mirrors of all of their actions and corporate rules on our own internal testbed.

As far as the final performance for the time a cyber analyst on the organization side spends on these indicators: generation, like I said, took a minute versus multiple hours, and the response times reflected the same order-of-magnitude difference — we got things down to minutes where before they had been on the order of hours. It was just extremely helpful, for the different organizations and the FS-ISAC itself, for getting through the extremely large amount of data coming out of all these automated tools. At this point there's been an explosion of cyber data being generated by your SIEM and ingested by your SOAR, trying to deal with the huge amount of network traffic any one corporation might see in any one day. Automation is good — that's the point of this slide.
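The two postures just described — auto-block versus human-in-the-loop — amount to a small per-organization policy table over the score. A hedged sketch (the policy names and action labels are illustrative, not the banks' actual configuration):

```python
def action_for(score, policy):
    """Map a 0-3 score to an action under an org's local policy."""
    if score <= 1:
        return "log-only"           # whitelisted or undetermined: do no harm
    if policy == "auto-block":
        return "block"              # org 1: block anything scored 2 or 3
    if policy == "human-in-loop":
        return "queue-for-analyst"  # org 2: an analyst verifies first
    raise ValueError(f"unknown policy: {policy}")

assert action_for(3, "auto-block") == "block"
assert action_for(3, "human-in-loop") == "queue-for-analyst"
```

Keeping this mapping on the subscriber's side is what "bring your own enterprise" means in practice: the feed carries one score, and each organization decides locally what that score triggers.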

This is an overview of the before and after for each of the organizations. In the left column, as far as a security orchestration, automation and response platform like Demisto goes: before we started the pilot, none of the three organizations was actually using a SOAR platform at all to help with automated ingestion of cyber threat data, but after our pilot finished, all three of them were — and that helps their cyber analysts quite a bit with all the data they have to deal with. As far as actual automated use of indicators of compromise: one of the organizations was addressing automated use beforehand — they were handling certain indicators in a somewhat automated way — but after the pilot, all three were, because they were using SOAR platforms, which helped with the automated ingestion. And as far as the use of the scores we came up with: two of the organizations were using them and are still using them today; one organization came up with its own enrichment algorithm for incoming indicators, which it uses in-house.

So now I'll take you through some of the insights we had — initially while working on the pilot, just after the pilot was done, and now a year later.

One big idea during this pilot was risk tolerance and impact. We wanted to be sure a bad suggestion didn't have too much impact on an enterprise network, and that there was never too much risk in the feed we were sending out — we wanted to be absolutely sure we didn't incorrectly call an indicator a three when it should have been a one. We also wanted each entity to have enough understanding of their own network to understand what kind of risk ingesting a feed like this represents when they set up automated actions in Demisto, for example.

We also looked at information sources. It can get tricky to set up a trusted information source for this kind of cyber threat data, and each different source really shouldn't require its own client and its own huge barrier to sending out cyber threat data. We wanted to make it easier for people to share data if they need to and hopefully get it ingested somehow.

Something we ran into with one of the organizations especially was the idea of "tell me what I need to know, and let me ask about the rest." This was important because some of the organizations did not care at all about some of the enrichment we were doing. When we sent out data with a bunch of auxiliary information, they only cared about the initial data — the first time a malicious file hash was seen, something like that. They wanted to ingest just the initial data, and then ask for the auxiliary information if their cyber analysts needed extra context.

The last thing we ran into was how other organizations would see our opinion — in this case, our score for an indicator. Like I said, we wanted to be really transparent and really simple about the scoring algorithm. We didn't want a really complex algorithm doing a lot of crazy things to figure out a score metric for these organizations to deal with. From the beginning, we figured transparency and simplicity would really inspire trust in something like a feed that tells other organizations what to do if they get something with a score of two, or a score of one, or whatever.

As far as lessons learned for the actual scoring goes: again, hammering on the point, different organizations wanted different things based on their own internal networks — but all of them agreed they wanted confidence that an indicator is associated with malicious activity. That was a big consideration for all three of them. The scoring really helped with prioritization for whoever ends up ingesting the data. If an organization wanted a human in the loop for the workflow they were conducting, that person could look at the data stream and say: I'm going to address the threes first — look at everything with a score of three and figure out whether it actually needs to be addressed, whether I actually need to update my corporate rules, my enterprise rules, to deal with it. The score helped people deal with these indicators in a way that was maybe not always fully automated, but at least assisted by automation. The pilot also showed that, at the time, there was some work that needed to be done on enrichment sources.
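That "threes first" triage is just a filter and a sort over the scored stream. A minimal sketch (the queue contents are made up):

```python
def triage(indicators):
    """Order indicators so a human-in-the-loop analyst sees 3s first,
    then 2s; 0s and 1s need no action. Each item is (value, score)."""
    actionable = [item for item in indicators if item[1] >= 2]
    return sorted(actionable, key=lambda item: item[1], reverse=True)

queue = [("good.example", 1), ("bad.example", 3), ("maybe.example", 2)]
print([value for value, _ in triage(queue)])  # ['bad.example', 'maybe.example']
```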

If you're familiar with the STIX format — Structured Threat Information Expression — it's a format for describing different kinds of cyber data: indicators, identities of the organizations that submitted the indicators, relationships between different indicators, and so on. But when we were looking at enrichment, we didn't find any kind of enrichment object — any nice object we could put our enrichment data into. So we created a custom object for STIX 2.x. It had the source name (whether it came from VirusTotal or wherever), it had the score, it had the raw data, and it had links to any other supporting enrichment found for the same indicator. The problem we ran into was partly on the STIX side — there was no standard enrichment object at the time — but also, like I said earlier, on the enrichment-tool side: sometimes enrichment tools just don't update fast enough. If you query for an IP and get no hits back, you're not left in a better place for deciding whether that IP is actually good or actually bad. Including this enrichment object helped a cyber analyst at the far end really show their work.
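The custom enrichment object just described might look something like the following. The exact property names from the pilot aren't given in the talk, so these are illustrative, modeled on STIX 2.x custom-object conventions (custom types take an `x-` prefix):

```python
def enrichment_object(source_name, score, raw, related_ids):
    """Sketch of the pilot's custom STIX 2.x enrichment object; STIX had
    no built-in object for this at the time. Property names are assumed."""
    return {
        "type": "x-enrichment",       # custom STIX types use an x- prefix
        "spec_version": "2.1",
        "source_name": source_name,   # e.g. "virustotal" or "domaintools"
        "score": score,               # the source's own verdict, e.g. "50/67"
        "raw_data": raw,              # full response, so analysts can show their work
        "related_enrichments": related_ids,  # links to other hits for the same IoC
    }

obj = enrichment_object("virustotal", "50/67", {"positives": 50, "total": 67}, [])
```

Carrying the raw response alongside the summarized score is what lets an analyst downstream audit exactly what was queried and what came back.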

An analyst could show their work and say: I queried this, this, and this, and got these results back — a score of 50 out of 67 on VirusTotal, whatever out of 100 on DomainTools, and so on. As for other lessons learned, beyond scoring: different organizations may want access to different information to support their decisions. That's an offshoot of "tell me what I need to know and I'll ask about the rest" — they sometimes used different information to make informed decisions about what they actually wanted to do to update their enterprise rules. Something else we ran into: there are multiple ways to deal with the STIX format and to put information into it.

At certain points that was a problem, when we had to decide on a nice way to provide certain information. We ended up just deciding: it's going to be formatted like this, and everybody pulling the information will have to expect that format. At the time that was a bit of a problem, because it required a lot of communication, and there was a bit of a barrier for the people subscribing to the feed to really understand how exactly our information was set up. Something else we ran into: there was more commonly used information than we initially expected there to be, so we figured it would be good if that commonly used information could go into a published feed that everybody else can subscribe to, in a nice automated manner. And, at least initially, organizations are not necessarily going to be super comfortable subscribing to an automated feed like this. They're going to want to check your work — to understand how you're scoring indicators, how you're parsing indicators from a feed. They're going to want some amount of trust in what you're doing to provide them the cyber threat indicators: they don't want to block anything they shouldn't block, and they want to be sure they're getting as much as they can of the cyber threat indicators being pushed out there.

One feature we saw was important to the organizations was traceability, from the original email through to the published content through to the final end-user cyber-analyst support. They wanted to be able to see how the parsing worked; they wanted to be able to audit it. Throughout the pilot we took suggestions from them on improving the parsing and improving certain formats of the enrichment we ran into, and we also wanted to talk with all the pilot organizations about an actual analysis of the value of the feed.

How helpful was it for them? How helpful were the scores? How helpful were the different attributes we pushed to the feed? At the time we also had some future considerations that were pretty important to us. We wanted more advanced autoimmunity: better construction of things like whitelists, better protection against accidentally blocking something you don't want to block — and there have actually been a few efforts since the pilot that addressed that. We wanted to look at adjusting the default score based on who submitted an indicator — some mechanism for saying this submitter is somewhat trusted, so we trust that an indicator they submit actually is malicious. There's a trade-off there, though: making the score metric more complicated makes the algorithm more opaque, and we wanted it to be transparent — easily understandable why we scored something a two or a three or a one. Even so, we've recently done some work on matching submitter IDs with a confidence metric that gets worked into the confidence score of indicators. And the final bullet: a more complex algorithm might help with scoring — with informing the end-user cyber analyst on what they should do.
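The submitter-based adjustment described here — considered during the pilot, not deployed in it — could be as small as this. The trust values and the 0.8 threshold are assumptions for illustration:

```python
# Illustrative submitter-reputation tweak: a trusted submitter nudges
# the no-enrichment-hit default score upward. Values are assumed.
SUBMITTER_TRUST = {"fs-isac": 0.9, "unknown-org": 0.2}

def default_score_for(submitter):
    """Default score for an indicator with zero enrichment hits,
    adjusted by how much we trust whoever submitted it."""
    trust = SUBMITTER_TRUST.get(submitter, 0.0)
    return 2 if trust >= 0.8 else 1  # trusted submitters start at "single hit"

assert default_score_for("fs-isac") == 2
assert default_score_for("unknown-org") == 1
```

Even this tiny version shows the trade-off from the talk: two lines of reputation logic already make "why did this get a 2?" harder to answer than the plain hit-counting metric.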

Better enrichment sources — sources that update faster, that carry more information about why an indicator is bad and what's associated with it — would really help with making sure you have the right score and that an analyst knows the right thing to do at the end. One thing we'd really like to do is identify the threes — the really, really bad ones — without over-scoring other, not-necessarily-bad indicators. All right. After the pilot effort, we put together some ideas of what we'd like this future ecosystem to look like — what we'd like a whole integrated effort, with organizations working together, to look like.

effort with organizations working together to look like we would like some vendor agnostic integration of products and services so we work toward this a good amount during the pilot when we had a Domestos and a few other soar platforms working together we work to get towards this a little bit with sticks and taxi which are nice standards for the format of cyberthreat data and the sending of it and it would be really nice if lots of different products are extendable and agnostic of the format you keep your data and that would help an organization like this be possible where people could just share back and forth oh you know I got this indicator I got that indicator and suggestions like

this would be able to flow between different entities much ease much more easily we ran into multiple models for sharing information and we thought that it might be nice if these different kind of ways of sharing information we're a little bit more standardized so for example you know there there's some threat intelligence platforms that we're using there were like commercial threat providers but then there also might be cross organizational collaboration too what the size of your company is depending on what you're trying to provide depending on what you think your threat profile is against some kind of advanced persistent threat you might want different things out of a feed of cyber data that you're subscribing to so
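To make the scoring trade-off discussed earlier concrete, here is a minimal sketch of a transparent, submitter-weighted score. Everything in it, the function names, the 0.5 neutral confidence, and the thresholds, is a hypothetical illustration of the idea, not the pilot's actual algorithm.

```python
# Hypothetical sketch: an easily explainable 1-3 severity score that folds a
# per-submitter confidence (their track record) into the enrichment result.
# All names, weights, and thresholds here are illustrative assumptions.

def submitter_confidence(history):
    """Fraction of a submitter's past indicators that were confirmed malicious."""
    if not history:
        return 0.5  # no track record: neutral confidence
    return sum(history) / len(history)

def score_indicator(enrichment_hits, total_sources, history=None):
    """Return a 1-3 severity score from enrichment hits and submitter history.

    enrichment_hits: how many enrichment sources flagged the indicator as bad
    total_sources:   how many enrichment sources were consulted
    history:         list of 0/1 outcomes for the submitter's past indicators
    """
    ratio = enrichment_hits / total_sources
    confidence = submitter_confidence(history or [])
    weighted = ratio * (0.5 + confidence)  # trusted submitters nudge the score up
    if weighted >= 0.75:
        return 3  # likely malicious: act quickly
    if weighted >= 0.4:
        return 2  # suspicious: route to an analyst
    return 1      # low severity

# 5 of 6 sources flagged it, and the submitter has a good track record
print(score_indicator(5, 6, history=[1, 1, 1, 0]))  # → 3
```

Keeping the formula to one multiplication and two thresholds is what keeps it explainable; an analyst can always answer "why is this a three?" by pointing at the hit ratio and the submitter's record.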
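For reference, the STIX format mentioned above is JSON-based, so a standardized indicator can be sketched with nothing but the standard library (TAXII is the companion transport protocol for moving these objects between feeds). The domain and description below are made-up examples, and this hand-builds the JSON rather than using a dedicated STIX library.

```python
# Sketch of a minimal STIX 2.1 Indicator object built as a plain dict.
# The pattern and description are made-up examples, not real threat data.
import json
import uuid
from datetime import datetime, timezone

def make_stix_indicator(pattern, description):
    """Build a minimal STIX 2.1 Indicator as a JSON-serializable dict."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return {
        "type": "indicator",
        "spec_version": "2.1",
        "id": f"indicator--{uuid.uuid4()}",  # STIX ids are "type--UUID"
        "created": now,
        "modified": now,
        "description": description,
        "pattern": pattern,       # written in the STIX patterning language
        "pattern_type": "stix",
        "valid_from": now,
    }

# Example: a fictitious phishing domain an entity might publish to a shared feed
ioc = make_stix_indicator(
    "[domain-name:value = 'phish.example.com']",
    "Domain observed in a phishing campaign (illustrative only)",
)
print(json.dumps(ioc, indent=2))
```

Because every producer and consumer agrees on this shape, a SOAR platform can ingest an indicator from any feed without caring which vendor's tool emitted it, which is exactly the vendor-agnostic sharing the ecosystem slide is after.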

There are just many different things you might be looking for in a feed, and things you might be worried about receiving. Then there's the idea of circles of trust that we came up with, which is the kind of thing I mentioned earlier: entities working together, maybe subscribing to a feed, where they all, hopefully and ideally, trust each other to submit more of the indicators of compromise they're seeing. In the long run, that might help: if some very large entity sees a ton of cyber indicators, it can easily publish them to a feed, and other, smaller companies can intake them immediately.

Something like that would really help with cyber resilience, which is a bit of a buzzword, but ideally it means someone big sees a threat, notifies other people of it, and those people are immediately able to update their rules to stop the threat before it gets into their enterprise. So information sharing and automation, we think, are really twinned concepts. Information sharing is great because you have tons of organizations that might all be seeing different malicious files, different phishing email attacks, different whatever, and sharing that information helps everybody get the updated information they need to deal with possible threats. But that's obviously a huge amount of data to pass back and forth between organizations, and automation, specifically in the cyber defense sector, is going to be really important both for intaking indicators generated by information sharing and for outputting them to other people.

So, as a wrap-up conclusion, the first bullet is that automation is good. Automation can save a lot of time; used incorrectly it can be impactful, possibly negatively impactful, to your network or whatever you're applying it to, but nicely set up automation can really help your cyber analysts deal with the data they need to deal with to keep your company safe.

This financial sector pilot specifically demonstrated how SOAR platforms combined with information sharing can make data much more actionable and enable really consistent actions from cyber analysts. If an analyst sees something that's a score of three, and a week later sees something else that's a score of three, they'll probably address it the same way they did the week before; whereas in the past, doing enrichment by themselves, they might end up taking inconsistent actions against things that are actually the same level of threat. This cooperation between our IACD team at APL, FS-ISAC, and the different financial institutions was really critical to figuring out how this kind of thing would work in practice. We found out a lot of good things about companies working together, and also a lot of sticking points that might come up. But yeah, I'm super hopeful. And with that, I'm done. Do you have any questions? I'd love to take them. Yes?

Sure, yeah. So, the third one: during our pilot, actually before our pilot, they had kind of an odd method of automation. Mostly they were using a clutch of Perl scripts, which may be familiar to some people in this room. So the act of automating some of this indicator ingestion in their network wasn't immediately super helpful to some of their analysts, because they still preferred the old method of doing things. And it so happened that in the three months we worked with them, it never quite reached the turning point where the analysts on the job got around to preferring a tool like Demisto over the more manual way they were already used to. So they just never stuck with it after the end of the pilot.

All right, well, if nobody has any more questions, that is the end of my slides. Thank you.

[Applause]