Keith Perry - Roll Your Own Internet of Things Search Engine

BSides Orlando · 41:39 · 592 views · Published 2014-04
About this talk
How to build your own Internet of Things search engine, similar to Shodan: how to build it, how to overcome the hurdles, what you can use it for, and examples of information useful to security professionals. Also lots of examples of fun stuff you might discover.

Morning, everybody. Can you hear me in the back? Okay, all right. Yeah, I've got teenage kids, so I can yell. Good morning, and welcome to Roll Your Own Internet of Things Search Engine. I'm Keith Perry. I can't come up with really cool names like Samurai WTF or, you know, BackTrack, so you're stuck with me coming up with things that don't make good acronyms. This slide I'm going to read verbatim, because my lawyer won't support me on any of this, or on cross-country car races, come to find out. Disclaimer: the information, opinions, views, and advice are strictly mine and not those of employers present, past, or future.

Hopefully my current employer will not think that I need future employment. Use of any code provided or techniques described is at your own risk, and when you see my code you'll understand why I'm not a lawyer. Your mileage may vary, past performance is not an indicator of future performance, and your cloud provider may not interpret the terms of service like I did. Come to find out, they don't. So with that said, my background: I've been doing IT for about 20-some-odd years or more; we'll go with the more. Currently I'm the pointy-haired boss, for a large insurance company, of pointy-haired bosses, so I'm in somewhat of security, I do integration, middleware,

infrastructure. I'm the guy they call in the middle of the night to ask why things are broken, and then I stand over people's shoulders and ask when it's going to be fixed. There is one person in this room who has been subjected to that, but he didn't work for me, thank God: Kevin. Passionate about new technology and security; got some education, which didn't help any; early geek; licensed in amateur radio; loved electronics as a kid. I would have been an electrical engineer, but the University of Florida didn't really take to my kind of people. Why a search engine? So, I've spent most of my life on a conference call. If

you ask my kids what I do, they'll tell you I do conference calls for a living. Back on a conference call in probably 2000 or so, I started typing in IP addresses, because I was not overly concerned about whatever it was that was going on in the conference call, and started finding some really cool devices. I went, hey, if I wrote a script to do this, I could stay completely entertained during conference calls. You'll get to see that 15 years later. So I thought, well, hey, everybody's doing this thing where they're making money; how about I turn this into a site that'll get a lot of hits? So I went

out and got a domain name called randomlook.net, which I think I still own. Come to find out, there's a guy who has a patent on, you know, displaying random images on the web. I haven't talked to him, but I thought, you know, the $3,700 I'll make over 10 years will not cover the $3.7 million investment I'll have in law fees, so it's still blank, I think. It's a platform to explore different technologies. What I do I enjoy, and I enjoy doing it on my own time. It's kind of a Lewis and Clark syndrome thing: I enjoy seeing what's around the next corner, which is, you know,

really cool for certain things; if you go ride mountain bikes with me or whatever, you may get lost. This was pre-Shodan HQ, which the next slide will cover. How many people know what Shodan is? All right, very good, you're my kind of people. It was also a project to learn about cloud computing: I had some stuff going on in my work side of life where I needed to learn more about cloud computing, and I thought, hey, this is really cool. But the Shodan search engine: you're all thinking, well, that thing's great. It is. Shodan is great, and this won't replace it. This was me having fun over the years before Shodan got started. There were some things Shodan didn't have that

I thought would be really cool; for some of them, I now know why Shodan doesn't have them: thumbnail screen capture, search on HTML browse data, unlimited search queries. Now, there's one that I do have that I like, and that's private queries: no matter what goes on, nobody gets to see my queries if it's running on my box. The bad news is that to get that, you pretty much have to let the entire internet know what you're doing. I took two different approaches, so we'll talk about the first one first and the second one towards the end. Hopefully I won't run out of slides; I've either got way too many slides or not enough slides, one of

the two. Two approaches. Build from scratch, which is what this really started out as: a LAMP stack (Linux, MySQL, Perl, PHP, whatever you want to call it), ZMap, nmap, and Perl. I used a Bitnami Ubuntu prepackaged image, and I used the Amazon cloud. There's some debate whether or not I should put up what cloud provider I was actually using, but at this point I'm sure most of the internet knows. Then the Sonar project scans; if you're not familiar with them, we'll get to those in a minute. scans.io, which is the Sonar project, has some full internet-wide scans of ports that

they've made available, which comes into play here, and wkhtmltoimage. The second approach was a cloud search service, using Amazon CloudSearch, the Sonar project, scans.io, and Perl. If you have questions, let me know; I probably don't have answers, but I'll be more than glad to. Technology: collecting data. I went about collecting data two different ways: nmap and ZMap scans to collect valid IP addresses, and then I wrote some code to actually go hit those to pull in the data that I was looking for. Everybody's familiar with nmap, I'm sure; nmap is slow and targeted. ZMap, I don't know, anybody use ZMap in here? Okay.
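
Editor's sketch: the "go hit those addresses and pull in the data" step could look roughly like the following. The talk's own scripts were written in Perl and are not reproduced here; this is a minimal Python illustration under assumptions. The names `parse_http_response` and `grab`, the 64 KB read cap, and the choice of fields (status code, `Server` header, `<title>`) are hypothetical, picked to match the columns the speaker later shows on screen. Only point something like this at hosts you are authorized to probe.

```python
import re
import socket

# Hypothetical helper names; the speaker's real grabber was a Perl script.
TITLE_RE = re.compile(rb"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def parse_http_response(raw: bytes) -> dict:
    """Extract status code, Server header, and <title> from a raw HTTP response."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")

    # Status line looks like: HTTP/1.1 200 OK
    status = None
    parts = lines[0].split()
    if len(parts) >= 2 and parts[1].isdigit():
        status = int(parts[1])

    # Header names are case-insensitive, so compare in lower case.
    server = None
    for line in lines[1:]:
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"server":
            server = value.strip().decode("latin-1")
            break

    match = TITLE_RE.search(body)
    title = None
    if match:
        title = match.group(1).strip().decode("utf-8", errors="replace")
    return {"status": status, "server": server, "title": title}


def grab(ip: str, port: int = 80, timeout: float = 3.0) -> dict:
    """Send a bare GET / to one host and parse whatever comes back."""
    request = f"GET / HTTP/1.0\r\nHost: {ip}\r\n\r\n".encode("ascii")
    chunks = []
    received = 0
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        sock.sendall(request)
        while received < 65536:  # cap the read so one slow host cannot stall us
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
            received += len(data)
    record = parse_http_response(b"".join(chunks))
    record["ip"] = ip
    record["port"] = port
    return record
```

`grab` raises `OSError` (including timeouts) for hosts that refuse or stall, so a caller looping over a scanner's output would want to catch that and move on to the next address.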

ZMap is incredibly, lightning fast. It's a shotgun approach, and if you run it, people will tell you that you have run it, come to find out. The Sonar project files I talked a little bit about a minute ago are full-size, full-internet scans of ports that they have put together, with the data dumped down. Very, very interesting. You can parse those directly into a database, which is part of what I did. They're very large: the port 80 file is 250 gig uncompressed and 65 gig compressed. There's a lessons-learned slide in here and I'll talk about some of those, but those are some of the hurdles you have: start trying to

pull down a 65 gig file and, you know, people in your house will start complaining that they can't get Netflix, et cetera. And by the way, Netflix not working at my house is a sev one. Various ports are available, and, you know, it's point in time; they do a new scan about every 30 days. I tried to set this up so that the database schema would handle multiple ports and protocols. I really focused around HTTP; it's got some FTP and SMB in there. With the ethical thing, and trying to stay on the right side of people who could, you know, complain, I played mainly with HTTP. The concepts, some of the code,

et cetera are expandable, and you can use them for whatever port you would like. For the parsing of the Sonar files, I tried to write code that would pick up and run most of their stuff. I can't say I tested on a lot of them, but on a couple of different files. The setup files that I'm going to talk about, so you could go do this and play with this at your own risk, are at this SourceForge.net project. It's Ro y [name unclear], and it is indexed; right after I built it I couldn't find it, so however it was indexed on their search engine. So, approach one; I talked a

little bit about this early on. Approach one was to build a LAMP server, then configure MySQL, copy some files to htdocs for Apache, and create a... So I'll walk through this a little bit. The first thing you need: I started with a LAMP server, an Ubuntu version, which is kind of what I'm familiar with. I used Bitnami, which is a provider of images; they build them whether you want to run them on your local desktop or as a cloud image, they have both. It works well. When you create that, if you're going to do a full-scale port 80,

you need to add about 500 gigs of disk space; that's not too much. Configuring the MySQL database is the next step in that. And let me just back up: all of this was built without security as a concern. For my security, I built this stuff, brought it up and down at my will, and made backup copies, and if someone hacked it and it got blown away, it was no big deal. So if you go and do this, realize I didn't go to a whole lot of trouble; I was using the default MySQL password and user ID for a while. Anyway, configure the MySQL database; you need to add a little bit of

security there so the scripts that I have written will run. You need to copy the PHP files, which are the front end of the search engine, over. As you can tell by how creative my slide design is, my PHP files that run the front end are just about that creative. I'm not a graphics guy, I'm a back-end database kind of guy, so bear that in mind. Create a directory called scanner to put scripts in, and copy all the other files in there. There'll be a readme up on that SourceForge.net site that will provide a little bit more information about exactly how to go into

this. I'm not really sure when we started or when we're off, so I'm going to try to run through this pretty quickly. You need to install ZMap, install nmap, and install wkhtmltoimage, which is a screen capture program used to create images; very cool tool. I had a lot of difficulty making it work for what I wanted to do, at the quantity that I wanted to do it, but we'll get to that later. Install Perl modules, and more Perl modules, until things work. I used a ton of different Perl modules. I can't tell you all of them, and every time that I tried to come up with a list of all of

them that I needed, I kept, you know, finding more. So isn't that how every ad... huh? Isn't that how every admin's, pretty much? So Kevin and I have a small history; we have worked together in the past, and "worked with" is a relative term. Set file permissions, firewalls, and all the other stuff I've forgotten; hopefully we'll add that in there. All right, so once you've got an Ubuntu LAMP stack with MySQL up and you've got the scripts installed, you need to get some data into the database. There are three basic stages to this: you need to get data, you need to put it in a database to store it, and you need to expose

it to a front end of some kind. So I wrote a short script to run an nmap scan, which runs some other Perl modules behind it to grab files. And you're going to ask why I use Perl: it was something I was a little bit familiar with, and there were some cool things written in Perl that would, you know, parse nmap files, et cetera. So run this script and it'll pull data in. You'll have to go change settings, et cetera, to do what you want to do; I've got it set up just to pull like a thousand random, you know, port 80s. ZMap, same kind of thing: you need to

adjust the variables, et cetera. All right, so those are two different ways of getting data into the database. The third way is to gather data from the Sonar project, and I have to say thank you to those guys; I think it's Rapid7 and HD Moore, I believe. Those guys have done an incredible thing. If you haven't looked at the files, you should; it's very cool. They're out there; that's the HTTP address. Very cool stuff. There are some tricks to parsing it, but I've written some basic files to help you along, and then you'll have to kind of modify them from there. Grab the files. The port 80 file is a

65 gig file, and it unzips to, give or take, 250 gig, so you can spend some time letting this run. I know that the first couple of times I ran it, I gave it, you know, 100 gig, 200 gig; I'd get up the next morning and it would have run out of space. So pull it down, start unzipping it, let it go (adult beverages, if you're of age; I think about part of the crowd is), and then create a smaller file from that 250 gig to test with and import. To import into the database, I wrote a short Perl script called sonar.pl; it includes some things in

that. It parses that file and puts it into the database. There are also some options in there to go out and do things like DNS lookups for each one of the IP addresses, and also to look up geolocation. Those two things call outside services and are very slow, so you may want to turn them off. There's an option to turn those off and let it just run; you don't get quite the quantity of data that you want, but it runs a lot faster. Any questions so far? We good? Okay. I gave this presentation to my Labrador and he had the same response. All right, so this is what it looks like once it's run. If you did all

of that correctly, and you got some data in it, and it actually works, you get a very simple, straightforward screen. In the back, I'm sure you can't read it: screenshot, IP, port, response code, and server, and off to the right is the title, so it kind of runs off the screen. That gives you a base place to start doing some very simple searches. Okay, one of the things that I wanted that Shodan didn't have: I was like, you know, that's cool, it's a Rabbit server; what is that? I have no earthly idea. Well, come to find out, Rabbit servers can be a lot of things. Same with, take for instance, it saying Apache: you

can have a lot of things running on Apache. You can have websites, you can have a lot of other stuff running on Apache. So I wanted to add screenshots. Consider this experimental, because I never got it to work the way I wanted it to; there are a couple of reasons. There's an image script that will run and pull in images. It is incredibly, painstakingly slow and it uses an incredible amount of resources, and we'll get into that some here in a minute. You can change the SQL query and run it. It's multi-threaded, it's slow, and it's error-prone pulling back screenshots; there are all kinds of things. So one of the things that I found out doing

playing with this is that you're searching the entire world. If you're sitting in the state of Florida, you don't really worry about how many different languages there are and all the different ways that somebody can code HTML, but when you start trying to parse it you start finding some really interesting things. You start finding that there are character sets out there that don't parse well, and that people have put weird things into HTML tags where you would not ever expect them. I'm not real good at regular expressions anyway, but it was challenging to try to make them work. But when you do screenshots, you start

running up against things that have a Java display or an ActiveX display, or that in general just don't respond or are slow. All those things come into play when you're pulling the data and it's textual, but when you're trying to grab a screenshot it really gets very difficult. It does work, but it's very error-prone; the success rate is not real good. As you can see, I ran it and added in some screenshots. Just as a general rule, you know, back to Kevin's ethics talk, I really try to stay away from anything that comes back and says it's a

response code that starts with a four; I kind of stay away from those. If you give out a 401 or a 403, I kind of feel like, hey, you know what, I'm not going to go poke at that; at least, I'm not going to make a presentation of me poking at that. So I try to stick to things that have 200s; those are the things that you typically see some more interesting stuff with. So if you look, I did a search for "rabbit" in server, okay, and it came back with various things, but you'll see the screenshots are different between the ones it came back with. You'll also see they have different titles. So you'll run

into all kinds of weird, strange things out there. So this is the part where you might be entertained, I hope; if not, I've failed completely. All right, so if you go back, there's this black one here, okay? You could scale these to be whatever size you want; I grabbed them as 5K images because when you start adding up, you know, hundreds and thousands and, well, millions of rows, 5K adds up pretty quickly, and bigger than that adds up a lot faster. So anyway, that black one, come to find out, is a highway traffic monitor somewhere. I don't know exactly where, and I'd rather not find out. So that's

one use of a Rabbit. Well, okay, so what is Rabbit? Rabbit's a server, an embedded server. You can go buy the little module and put it in stuff, and you'll find that lots of people have. It also runs in weather stations somewhere, and that one might be easier to find than the traffic one. These are just samples. The database this was pulling from has 11 million rows in it; give or take, there are somewhere between 300 and 400 million rows in a port 80 scan from the Sonar project, so there's plenty out there. This is just one of many. So I really enjoy exploring the

stuff and trying to figure out what's out there. Shodan is incredible, but when you sit down at it, you kind of need to know what you're going to search for. They have people who have put really cool searches out there, and that helps, but if you're sitting at a screen trying to entertain yourself, coming up with something to search for is not always the easiest thing to do. So I said, you know, I want to write a short little script, server.php, to go out and pull all the distinct types of servers that I have found. This was a much smaller database; this one only had like 300,000

rows in it. What you find very quickly is, yeah, Apache is very popular, Microsoft is very popular. You start looking down the list and you get down to the ones that are like ones and twos, where someone has said, you know, I read that thing about security and I should change my banner to be something else. Those are amusing. Go up to the ones where you start seeing 10 or 20 out of, you know, 30,000 or 300,000, or a couple dozen out of a million: those are the ones that are, you know, devices that you can often find out

there that are interesting. Rabbit's one of those. There are some other ones: Boa is one that's used heavily for webcams. There's one called W3MFC, where I believe the W3 is, you know, WWW, and MFC is the Microsoft framework, MFC, which was popular back in, I don't know, the late '90s and early 2000s; somebody in here has probably written in that. Those are out there, and they run a bunch of webcams, but there's a very specific set of people using those in webcam design. So you can start identifying webcams that are out there, and you can identify the kind of webcam it is just

from the server, much less the title, which is also interesting. So grab some of those and say, okay, I'm going to go look at some of these different server types. In that case I said, okay, let me look for "embed"; I'm just going to look for the server type "embed". The query I have set up is a LIKE. Most people, SQL fans? Anybody SQL? Okay, all right. So if you use LIKE, you get back a bunch of different things that are very similar to that. In this case I looked for "embed" and found iGuard Embedded; that looks like an interesting

server type to go pull. I pulled it up, and there were several hundred of those. I started poking around to see which of these are 200s, where I won't offend anybody if I go look at the website. This one's out of China. It says iGuard Security System, and it's interesting: it's time card punch in, punch out. If you want to know, Yen Lee was the last guy in or last guy out, and you can go find out who's late, at whatever company this is in China somewhere. Interesting things you can find by poking around. So, controller: just go type in the name "controller"

and search for the server. This is strictly around servers; there are other things you can look for, titles, et cetera. "Controller" led me to find a bunch of different controllers: gateway controllers, EA controller, Aqua controller. I was curious: how many Aqua controllers are out there, and what is it? How many goldfish need to Twitter? Well, come to find out, a lot. The Aqua controller is actually some kind of control module that runs fish tanks, and there's a bunch of them out there. I'll let you explore at your own risk, because those all came back with 401s; they appear to be secured out of the box, and I thought, well, hey, cool, you know, the

fish are safe. All right, this is an interesting one, and I'm going to gloss over this slide a little bit because I'd rather not talk to anyone's [unclear]. So, an IPG 7000. If you type in "ipg", and this is where the server came in, I found IPG 8000 was a server type, and I thought, hey, cool. I went out to those, and they showed an interesting side panel like this, but with a login screen, and they were all protected, and I said, that's cool, that's good. Then I started looking and realized there was an older version, the IPG 7000, which isn't quite as secure as the

IPG 8000, it appears. Now, these devices are network communication devices, and they typically show up in, well, a lot of convenience stores, on things that you might get money out of. So anyway, if you notice, with those, some of the log files may or may not be available to the general public. I would want to secure them if they were mine. Anyway, the obligatory printer slide. Everybody has seen Shodan, or had somebody go, hey, have you seen all the printers that are out there on Shodan? Yeah, there's a lot. There are probably millions sitting out there that you can get to the web interface on. Very

cool. Use them as they are; you can poke around at them. Back to the whole I-can't-come-up-with-good-names thing: SuperGoose II, which, as someone earlier pointed out, uses, I think, either a dog or a cow for their logo; they'll know. Not sure how you go with SuperGoose II and get a cow, whatever. This is one of those that I found while also searching for an embedded server; this is a completely different type of embedded server. This one provides data back from heating and cooling and also provides a video feed, and this one happens to be in a laundromat somewhere, and you can tell exactly what

the temperature behind the 50 lb dryers is. If you scroll down, you can also see who's in there at the same time, using the 50 lb dryers. What you will find is that there are a ton of mechanical-room type things attached to the internet that you may or may not know are there. With a lot of them, people appear not to care if they're exposed or not; with some of them, I think they might care. I think every laundromat in the world has a set of webcams that isn't secured; it gets pretty boring after a while. We watch Jeopardy every night, so I couldn't help but use a Jeopardy reference.

So, Alex, we'll take "Sunny Sun" or "Sunny Solar" for 500. Doing a search for "solar" in just server brings up a whole set of solar controllers, and this one was interesting because, well, it's also out of the country, which makes me feel better about showing it on the slide, but it showed their consumption of power, their collection of power, and how they were using it, both current and historical. But that's not what I thought was really cool. I thought it was cool that you could type in "sunny" and come back with the Sunny web server; I don't think I have a slide of that. No. For Sunny web server, I was expecting a handful out of 11 million records. I

figured there were probably, you know, a handful of solar panel control boxes out of 11 million records. I don't know exactly how many there were, not of these, but of the Sunny WebBoxes that were protected, but more than a thousand. So there are more solar panels out there than I ever imagined, and there are more of them connected to the internet than I ever imagined. So those were the interesting slides; we'll get back to what it takes to make this work, et cetera. Lessons learned. The first thing I learned was that images are very slow and difficult to process, which could be why Shodan doesn't have them and why you don't see them in a whole lot of other

places. Outside calls: pulling DNS information, which one of the scripts, grab.pl, does; it also pulls geolocation, which is slow. DNS is another thing that's slow, and we'll talk about feeds and speeds here in a minute, about what you can expect when you're doing this. Storage and time for processing full internet scans: back to that 250 gig. 250 gig is an incredible amount of data to play with. Even though an enterprise has, you know, terabytes and terabytes of data, 250 gig starts getting to be a pretty big file size to start messing with. You've got to have the disk space, and it takes a lot of CPU to run through it.
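
Editor's sketch: one way to ease the disk-space hurdle while testing is to carve a small sample straight out of the compressed download, without ever unpacking the full file. This is not the speaker's method (he describes unzipping the whole thing first); it is a swapped-in illustration that assumes the scan file is gzip-compressed with one record per line. The function name `sample_lines` and the file names in the usage note are hypothetical.

```python
import gzip
import itertools


def sample_lines(src_path: str, dst_path: str, count: int = 1000) -> int:
    """Copy the first `count` lines of a gzip file to a plain-text file.

    The compressed stream is read incrementally, so disk usage stays at the
    size of the sample rather than the size of the full uncompressed scan.
    """
    written = 0
    with gzip.open(src_path, "rt", encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        # islice stops after `count` lines, or earlier if the file is shorter.
        for line in itertools.islice(src, count):
            dst.write(line)
            written += 1
    return written
```

A call such as `sample_lines("port80.json.gz", "sample.json", 10)` (both file names made up) would produce the kind of ten-record file the speaker later recommends for trying out a parser or index before committing to the full data set.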

No matter what, even if you parse out just small amounts, it makes a pretty big database file fairly quickly. I was amazed at how many different things are connected; it's kept me entertained for many years. If I had been smart enough years and years ago to go, hey, maybe people would pay for this data, as someone pointed out, you know, maybe I would have beaten Shodan to the punch. But he has done a phenomenal job, and I use it and I am entertained by it regularly. Parsing issues: back to, you know, you're pulling 300 million or so different IPs, in different languages, with different formatting, and, you know, when

you start parsing HTML... My code's horrible and it has horrible formatting; I've seen HTML that is right on par with my formatting. There are different ways to do titles, and characters I never imagined anyone would ever put in a web document. There's a lot of cleanup that it takes to get data to go from parsing into a database. There are some regexes I built that just clean things up, and they clean them up to the point where I pulled out everything that wasn't an alphanumeric character, because you just couldn't parse out everything. Some of the multi-byte character sets, you know, whether it's Chinese or Japanese or Taiwanese, are very difficult to try to do.
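
Editor's sketch: the "pull out everything that wasn't an alphanumeric character" cleanup could be approximated as below. The speaker's actual regular expressions were in Perl and are not shown in the talk, so the function name `clean_title`, the decision to keep spaces, and the 255-character cap are all assumptions made for illustration.

```python
import re

# Anything that is not an ASCII letter, digit, or space gets replaced.
_NON_ALNUM = re.compile(r"[^A-Za-z0-9 ]+")
_SPACES = re.compile(r"\s+")


def clean_title(raw: bytes, max_len: int = 255) -> str:
    """Reduce a scraped title to plain ASCII letters, digits, and single spaces."""
    # Undecodable bytes are dropped rather than raising, since scan data is messy.
    text = raw.decode("utf-8", errors="ignore")
    text = _NON_ALNUM.sub(" ", text)
    text = _SPACES.sub(" ", text).strip()
    return text[:max_len]
```

This is deliberately lossy: a title written entirely in a non-Latin script comes out empty, which matches the trade-off described above of giving up on multi-byte character sets in exchange for rows that always load into the database.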

Size matters: 1 million versus 100 million is a big difference in the number of records when you start sticking them into a MySQL box, and I think on the next slide I'll talk a little bit about that. ZMap scans are not ninja-like. You start running a ZMap scan and people will let you know, so bear that in mind, anybody who wants to run home and run one. Do it off your neighbor's Wi-Fi. But you will get notices from your cloud provider saying, hey, we noticed you were busy. Okay, so, speeds of things. I don't know if everybody's familiar with this or not, but Amazon provides you with

a free account if you want one; type in a credit card at aws.amazon.com/free. They'll give you a free micro-sized Linux server for a year, enough hours to run it, et cetera. They will not give you enough disk space, by the way. It's a good start. Using that micro-sized Linux server with a Bitnami Ubuntu image and running some of these things, it'll parse about 3.5 million Sonar records in an hour; that's taking a Sonar file and parsing it into the database at about 3.5 million per hour. So if you run the numbers, it takes about 100 hours of processing time to run that 250 gig file into a micro instance, and that's assuming

that it doesn't fail somewhere in the middle. It's very difficult to parse a single file multi-threaded. You know, yeah, I've got a computer science degree, but I barely got one, so that's way outside of my league. Some of the other stuff you can do multi-threaded: pulling data from IPs, et cetera, yeah, you can do multi-threaded. The grab.pl that I wrote is multi-threaded. As a matter of fact, the bottom one here, grab.pl, I would run at about 50 threads on a micro instance, which is a very, very small instance on the Amazon cloud infrastructure. At 50 threads, approximately 10,000 IPs per hour is how much I could

process, give or take. You run out of CPU very quickly; you don't run out of bandwidth, you run out of CPU, and that's even with setting it so that the threads disconnect pretty quickly on the connections. With ZMap you can run about 10 million IPs in about 45 minutes off of an Amazon micro instance. That will not go unnoticed; let me tell you right now, they will notice. You will get emails from people going, hey. Screen captures: brutally slow. There just aren't enough resources there with a micro instance to do that. So when you start moving up the chain and start running on an 8-CPU box

on a cloud infrastructure, you can do a lot more pretty quickly, but yeah, there is a charge. I've got two 1U servers sitting underneath my desk at home that I used to use for this, but I found that bandwidth was the concern then, not necessarily CPU. I couldn't consume all the CPU I had because I didn't have enough bandwidth on a DSL-type connection. So, shifting gears a little bit, I said, you know, this was cool, I want to play with some other things that I haven't tried lately. Amazon has a beta feature out called CloudSearch, which is basically a search engine in a box. They give you a

search engine in a box. It's actually pretty cool; it's not just cool for this, it's cool for other things, and I think they actually have a free program to try it too, if you want to play with it. So with CloudSearch, basically you're going to take data that you have and dump it in there somehow. They really like JSON and, I think, XML for a data feed. Luckily, the data files from Sonar are JSON. The bad news is there's a whole bunch of Base64 encoding in those Sonar files that you have to decode. So I wrote a script

to do some of that, called sonar2cs.pl, which is really kind of where I was just dumping into the database before, but instead of dumping into the database it writes out to a CSV, a comma-separated file. You need to run that on a basic file and get a handful of records to start with before you start building your CloudSearch instance. Save that file and use it to basically build your index, what the indexes will look like when you use their facilities. You're going to upload that file during the creation, which we'll get to in a second. How are we doing on time? What's left, 10, 15? Okay, all right, so I'm

going to run through these pretty quickly, but this was really simple to do; it only took me about 5 hours to figure out how simple it was. The documentation is a little sketchy; they call it beta for a reason, I'm pretty sure. Throw in a domain name, leave the desired instance type and replication count as the default, and it will scale up and down as you do your thing against it. You're going to browse and find that Sonar file that you just built and upload it. That file has a handful of things in it; maybe 10 IPs in it is what you want. It

then will take that and figure out, when you pull it in, what the actual fields are in that file that it's going to index. You need to modify it a little bit; this is where it took me a few hours to figure out why my searches never worked. You have to change the type to text. It comes up with literal, and that doesn't work. It's one of those things that I don't know is really described in their documentation. Once you've done that, you're basically going to kick it off and it will start building the indexes. It is a little bit slow; even

with no data in it. Building the indexes took quite a while; it probably took 15 or 20 minutes for it to build indexes on five records. I don't know what it's doing in the background, but that seemed like it took a little too long for what I was expecting. As a matter of fact, I figured I had done something wrong several times until I finally just waited it out. All right, so once it has indexed and it tells you it's ready to upload documents, you can grab a file and run as many records as you want, with a couple of limitations, and then upload them. So that'll put your data into their search engine. All right, so once you've done

that, you can run some test queries from their actual control panel. What it'll do is give you back what you were searching for. Their search engine is much more robust when it comes to ANDs, ORs, different types of things you want to return, et cetera. It returns this screen, and it returns some raw data so you can see what you're getting back if you do it from a web browser. You have to set up the access to get to CloudSearch. There's almost a firewall-type panel to set who and what IPs you will allow to get

to the search engine to upload documents and also to do queries, so you can control that. This is another one of those things: once you have set that, it takes quite a while for it to actually take effect, where you can actually hit it. So you get like a 401 or 403 for quite a while and you try to figure out why your query isn't working. Come to find out, you just have to wait for them to finally process the rules. Once you've done that, you can send in a query, and they give you the format. As you can see, it was basically the IP address or the domain name with

search?q=Google, and that came back with these fields. I think that this could be used for certain things. For what I was looking to play with, it really didn't give me what I wanted; I kind of enjoyed having the entire database at my disposal. But if you're building a product, whether it's a security product that's going to run locally or a website that you've built that's going to need to pull data from a search engine that has pretty good capability, this has got promise. It's easy to build. Availability, scaling, access are provided, so all those things

are taken care of. You don't have to go figure out how to make an Ubuntu instance work. There are some limits. It's limited to 500 gig, so you can only have that much in one database, but you can build multiple databases. So if you were going to index all of port 80 and all of port 8080, you probably would exceed what this has for capability. The upload is one of those things that is a little limiting, because they want you to upload documents that are less than a meg, in batches that are five megs or less. So to upload, give or take, 300 gig takes a bunch of

different files and a bunch of different batches, so it's kind of hard to do what I wanted to do with it. You could write a script to parse them out and do it, but it just seemed like a lot of effort. Like I said, it's a possible fit for tools later on if you were building something. Some potential technology and future features that I bumped across that I thought would be cool, that maybe I'll get to play with later; I'm just out of time to play with this right now. It's one of those things in your life: you have lots of time, you sign yourself up for BSides and say I'm

going to go present, and then you get a new job position. I got promoted, which was great, except I had no time to work on this, so it is what it is. I would love to run MongoDB versus MySQL. With the Sonar files being in JSON format they should import pretty easily. There's some Base64 decoding that has to happen; you know, I didn't say you had to decrypt it. So anyway, it would be cool to see what the difference is there. I think the performance gain from MongoDB would probably be substantial over what you get out of MySQL. When you're running MySQL searches on millions and

millions of records it takes a lot of CPU. The micro instance, once you get over 20 million records, isn't going to cut it unless you're willing to sit a long time and wait for your search to come back. Another question: how would the Solr search engine, which is an open source search engine, work compared to something like CloudSearch? It wasn't as easy to set up. At some point maybe I'll get around to doing that. CutyCapt versus wkhtmltoimage: would that be a better image capture program to do what I was doing? I don't know. ZMap also provides a capability where you can

build a banner grab, and I believe that's what the Sonar project has used, and they made it work. I never could get it to work, but I'm not a Unix guru or Linux guru, so getting that to work would be a step in the right direction. One of the things that I wanted to do and never got around to: I wanted to be able to grab images from a whole bunch of different websites, run an MD5 on them, put them into a database, and then say, oh, all of these are equal even though the server names are different, et cetera. I haven't gotten around to that; maybe one day. The

other one is to classify records. Yes, there's a bunch of different web servers, and if you can pull enough data back you should be able to classify what they are and what they're being used for, which kind of takes some of the exploring fun out of it, but it would be a time saver, and that might actually have some real practical use. And an improved search engine front end. All right, so all the files are there. There'll be a better readme up in the near future as to actually how to set this up, et cetera. And then questions and heckling, because I expected heckling from certain people, but it didn't

happen, so I'm kind of disappointed. Any questions? All right, I never like to be between people and lunch, so thank you guys. [Applause]