
So, I am not from Splunk. I'm not a vendor; I just use Splunk at my work a lot. This is also an impromptu talk, so hopefully these slides make sense in some kind of way. Basically, Splunk is more than just a security product; it's also usage analysis. What I've been submitting to my organization is that not only can the cyber operations center use Splunk to look for malicious behavior, but internal operations can use it to analyze user behavior. So where infosec is looking for the one percent case, the rest of the organization can look for the 90 percent, and we can use the same data set and reuse it for different use cases.
So that's where we're seeing the value of Splunk. So what is it? Basically, it's a search and analysis engine. It's a tool you download. I'll tell you about the different versions and pricing models, but basically with the free version you're able to index about 500 megs of data a day. That can be any type of machine data: it could be Twitter data, could be astronomy data, really anything that has a timestamp and a text format. There are also ways to get binary data in there, but it's just a little bit fatter than your usual text document. So we're trying to use this to correlate things, like if we see our
organization's name come up on Twitter a lot, is there a related malware incident that happens later? So we could try and line up the spikes between what's happening in the human world and what's happening in the machine world. So this is its GUI; this is what it looks like. The magic sauce, the reason why I'm so excited about this product, is that it is not a relational database on the back end. It uses MapReduce and Hadoop, and these events that you stream into Splunk are all time based. So what happens is MapReduce runs and it splits out all the key-value pairs of all the text that's in your data, in your log files or
Twitter, and breaks it up by key-value pairs, by words. So when you look at an event, you want to look across, let's say, the last 24 hours and look for something that happened. That key is time, so the primary key on this data is the time: last 24 hours, last four hours, last 30 days. So these bars are the combination of events from many different sources at that time. You're able to see, okay, what was my website traffic doing, what was my firewall traffic doing, all in the same screen, instead of going to different systems and logging in and doing a grep or tail of the logs. So that's really where our operational benefit comes in. Not only is it helping us with
debugging applications which are load balanced, it helps the developers not have to go in and look at each of those consoles and try and figure out what just happened. So this is a one-screen kind of interface. Oops. So the old way: the old way was using relational databases. They are slow. When you start to try and do schemas for different types of routers and hardware, you start to get into this big data space where the data has a lot of variety. You can't really normalize it in any sense, and when you start to do joins across your Cisco data and your Check Point data, you're going to get a lot of slowdown, and eventually
those queries can't really run. You'll get to a point where they just will stop running in traditional relational databases. So again, the key here is that MapReduce and Hadoop are running in the background to give you that distributed parallel searching capability. So when they say "Why Google your logs?", they're using the Google algorithm, basically. So this is from their site. Basically, you want to take your hardware data, your database logs, your event logs, your application logs, your website logs, even things that are happening on the hypervisor, across Windows, if you're logging both. I work in corporate IT, so we're logging our users' activity on their machines and then
cross-correlating it with their activity on the server. One of the things we're starting to see from a security perspective is that a user may be accessing the system very legitimately through their user account, but in actuality that traffic is malicious because the enemy has already hijacked that user account. So we want to look at good user behavior, known accounts, and see what kinds of spikes are happening. I just put this up to say that the old way of doing things, at least for us, was getting notified of problems happening in the application space by tickets, you know, user-reported problems. We don't do that anymore. We can kind of see the
500 errors, the JavaScript stack traces, anything like that in real time. Oops, I did it again. So there are a couple of different versions. Like I said, the free version is 500 megs a day that you can pipe through. That's a perpetual license, so that means every day you can log 500 megs' worth of text. There's a compression rate, so you're actually not storing that much. The best case I've seen is about one-tenth the size. Again, that's the MapReduce feature just kind of pulling away all the blank space. So some people are getting almost 10 times compression; I would just conservatively say maybe one third. The other thing, what you don't get in the free
version is single sign-on, the ability to use LDAP so that you can expose these dashboards to other people in the organization. That's really where we're finding a lot of value: saying, okay, let's give this to our front-end web developers, our front-end marketing and design people, and say, okay, here are the buttons they're clicking on, here are the related database calls, and how that structure kind of works. The other thing that Enterprise gives you is clustering and distributed setups, so that you can split these devices across your enterprise. The setup is really simple. A one-server setup is what you can do at home today, right now, on your laptop. You can point it at any text-based data.
So if you just do a search on your computer for something that's *.log, you can point it at this and kind of take a look. With the free version, actually, our developers had been using it and didn't realize they were on the free version, because that's all they ever needed; there wasn't that much in the logs. But once you scale up and really go into production, you'll probably need the Enterprise license. There are two components. There's the Splunk server, and that's where all your searches are running, and then there's the Splunk forwarder. The forwarder is an agent that you install on the devices that you want to monitor, and then your Splunk server, you'll point it at different
folders that have log files in them, or you can specify some regexes. There's a lot of customization there. The other thing you can do is send your data to a syslog server. If you have a lot of machines, one example we have is smart power units: there's no way we could install an agent on them, so instead we forward all the events to a syslog and then have Splunk point at the syslog. So there are kind of two options you can go through. So here's the forwarder. This is all in the GUI; it's very simple to use. And I've had a really good experience with the IRC channel for Splunk. They're
online pretty much all the time and answer quick questions if you get stuck. This is a bigger setup, so you would have lots of different servers with the forwarders installed, all pointing at one larger installation. So what can you Splunk? Anything with a timestamp, because again that's the primary key of Splunk. It's time; everything is event based, by time. So when you're trying to debug, or when you try to look over, you know, something that happened at midnight: well, what were the preceding events? That seems to be the model that people are using in operations. At minimum you need two keys: a timestamp and something else. What we've
found success with is actually making use of GUIDs and linking transactions across systems. So a user will start, and the website will pass along the GUID to the database, to the service layer, and to any errors that happen. So we can actually see the whole trace of the user as they work through different layers of the system, which is really helpful, because sometimes, you know, the application developer will say, okay, I've gone as far as I can go on my end, it's probably the database guy, and blame gets shifted. So we can see that all at one time. Like I said, there are lots of use cases, and the primary one in our organization was infosec.
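Going back to the GUID linking just described, here is a minimal sketch of the idea in Python. This is not how Splunk does it internally; the event fields (`time`, `source`, `guid`, `msg`) and the sample values are made up for illustration. It only shows why a timestamp plus one shared key is enough to rebuild a user's path across layers.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical events from three layers of one system. The field names and
# values are assumptions for illustration, not a real Splunk event format.
events = [
    {"time": "2013-01-01T00:00:02", "source": "database", "guid": "a1", "msg": "query timeout"},
    {"time": "2013-01-01T00:00:00", "source": "web", "guid": "a1", "msg": "GET /checkout"},
    {"time": "2013-01-01T00:00:01", "source": "service", "guid": "a1", "msg": "calling orders db"},
    {"time": "2013-01-01T00:00:01", "source": "web", "guid": "b2", "msg": "GET /home"},
]


def trace_by_guid(events):
    """Group events by GUID, then order each group by timestamp."""
    traces = defaultdict(list)
    for event in events:
        traces[event["guid"]].append(event)
    for guid in traces:
        traces[guid].sort(key=lambda e: datetime.fromisoformat(e["time"]))
    return dict(traces)


traces = trace_by_guid(events)

# Print the full path of one transaction across web, service, and database.
for event in traces["a1"]:
    print(event["time"], event["source"], event["msg"])
```

With the events lined up like this, the developer and the database person are looking at the same trace instead of each checking their own console.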
We're trying to leverage that same data store and use it in marketing, business analytics, etc., and mostly for operations. I have to tell you, basically the other thing this is replacing in our organization is emails when an error happens. I have seen people with 2,000 emails in a folder that all say "error", and there's really no checking back. So one of the things we've been able to do is only create alerts after time has passed, after you get 20 events. Instead of getting an email every time there's a timeout, you set it up so that you get emailed when the timeout count exceeds the standard deviation of the number of times that it normally times out.
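The standard deviation alert can be sketched roughly like this in Python. It is only an illustration under assumptions: the function name `should_alert`, the `sigmas` parameter, and the sample counts are hypothetical, and "exceeds the standard deviation" is read here as "more than one standard deviation above the usual average".

```python
from statistics import mean, stdev


def should_alert(history, current, sigmas=1.0):
    """Return True when `current` is more than `sigmas` standard deviations
    above the mean of `history` (the counts seen in earlier time windows)."""
    if len(history) < 2:
        # Not enough history to estimate a spread, so stay quiet.
        return False
    threshold = mean(history) + sigmas * stdev(history)
    return current > threshold


# Made-up timeout counts per hour for a service that normally times out a little.
hourly_timeouts = [4, 6, 5, 7, 5, 6, 4, 5]

print(should_alert(hourly_timeouts, 6))   # a normal hour, no email
print(should_alert(hourly_timeouts, 20))  # a real spike, send the email
```

The point of the design is that a handful of routine timeouts never reaches anyone's inbox; only a count that is out of line with the usual pattern does.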
We're able to kind of quiet down the noise and only alert when events have escalated. Logs are everywhere. You can find data within the organization piped into so many different places, and we've been able to Splunk almost all of that. So at this point a lot of us are saying, just Splunk everything so we can see it. If you have to go into a relational database, it's actually harder now for us. Splunk has worked it out so that the license goes by the amount of data, the throughput of data per day, so you'll want to watch that. Some developers like to really use those print statements; others don't use them at
all. There's a balance in between, and there's some guidance that we wrote up about that. We have monitors and alerts to ping us when the threshold of the license starts to get close. Again, this is what it looks like. This is the regular user dashboard, and you would just search up here. So here we're saying, this is all the VPN connections that have a user authenticated. We actually have some of our officers set up with a special alert, so that if our CEO gets three failed login attempts, that alerts up to the CSOC, in case it's a spear phish or something. I'm going to go through that, or I
told you the uses. This is just, again, an example of VPN logins. One of the nice things we're working on now is putting all the VPN logins on a Google map so we can see where everybody is in the country. It's pretty cool, and it's real time. That's one of the benefits: again, you're not waiting an hour or two hours to get this data; it's 30 seconds. So you can see in real time your web traffic hits, your VPN logins, etc. Again, transactions across devices: that's really the key here. It's about trying to get people out of their stovepipes of "I'm a DBA" or "I'm an app developer". You want to get them out of that and
thinking, what is the system doing holistically across transactions? Many uses. These are just really basic reports; it took me a second to group these. These are nice for managers to see. These are errors by application process. These are all available over the API, so you can actually push these out as websites and dashboards and give your website some interactivity. Here's a look at the daily volume: those peaks and troughs are at night, you know, and the rest of the time is during the day. And that was all way too many words, so we're not going to read that. If you want to know more, there's my number and my email.
Basically, I can show you how to do any of these searches really quick. Ask me any questions. Yeah? Okay, this isn't a question, but I know about Splunk and I just wanted to raise a security concern. Yeah, so once you've taken all your logs and put them in a centralized place, how do you lock it down, right? No, no, no, it's not just locking it down. You can give different people access to different forwarders, different apps, but if you start searching for things like passwords, even though you shouldn't have passwords in the logs, you're going to expose it to a lot of people. Yeah. So you should have your risk people
have access to all of your Splunk sites. Yeah, actually. So it's very easy to find things. I'm not saying they should be in there, but you've now made it very accessible; that's the thing. And actually we're grappling with that from an information policy perspective. We're trying to get an agreement with our users, because with this we can see when they badge in, when they badge out, when they make phone calls. It doesn't matter what type of logs; it doesn't have to be just computers, you know, it's any time they interact with any device. So what if people are coming in at eight when they said they're coming in at seven, and the guy down the hall figures
that out? So we're dealing with those information policy issues now. You can correlate anything very quickly, and people are really good pattern recognition machines, and they see it. Yeah, yeah. So there's Splunk Storm, which is the cloud service. I think you get a gigabyte per day, and it's really helpful for Amazon EC2 kinds of installations. I don't need to use it, I mean, I have it at work, but it's just as fully functional. Okay. See me after if you have any questions. Thank you so much.