When Bandit(s) Strike - Defend your Python Code

Name: When Bandit(s) Strike - Defend your Python Code
Uploaded: 2017-03-13
Duration: 25 min 33 s
Description: When Bandit(s) Strike - Defend your Python Code Bandit is an open-source tool designed to discover common security flaws in Python code. Although Bandit was originally developed to find issues in OpenStack (a large open-source cloud platform) it has since been adopted by many Python developers out

BSidesSF 2017 · 25:33 · 867 views · Published 2017-03

Speakers

Travis McPeak

Tags

Category: Technical

Style: Talk

Mentioned in this talk

Tools used

Bandit

About this talk

Bandit is an open-source tool designed to discover common security flaws in Python code. Although Bandit was originally developed to find issues in OpenStack (a large open-source cloud platform), it has since been adopted by many Python developers outside of OpenStack. It has found dozens of critical security issues, including command injection, SQLi, insecure temporary file usage, and usage of insecure libraries. Join Travis McPeak, one of the core developers on the Bandit project, to find out how Bandit works, how to customize it for different workflows, how to create a security CI pipeline with Bandit, and even how to extend it.


Welcome back from lunch to track two in the afternoon for Security BSides San Francisco, brought to you by Fitbit. Big thanks to our sponsors; you have to clap for the sponsors. I'm Andrew, and this is Travis McPeak presenting "When Bandit(s) Strike - Defend your Python Code". Without further ado, Travis. Thank you. Thanks, everybody, for coming after lunch. I hope everybody enjoyed the pizza and the beer, and if you haven't had the awesome beer downstairs yet, do it; very good stuff. Thank you for coming to my talk. Next door, Jim O'Leary is giving an awesome talk on metrics, which I was fortunate enough to see at AppSec Cali a couple of weeks ago, so for those of you that are missing

his talk to be here, definitely check it out online; I had a lot of fun watching it. I'm not gonna do a lot of show-of-hands stuff, but can you raise your hand, please, if you are responsible for the security of Python code in some way? Cool, my people. I always like to start off with just giving an intro, so you know how the flow of the talk's gonna work. We'll do the intro, then we're gonna talk about how we can use Bandit to find some issues, then finally building a program around Bandit and what I'd like to see happen in the future, and then we'll do Q&A. Sound good?

Sweet. Who am I? I'm Travis. I love AppSec. I've worked for IBM Cloud, and I'm heavily involved in the OpenStack and Cloud Foundry security teams; in fact the OpenStack security team, which we'll get to in a little bit, is why Bandit was created in the first place. We had some problems with Python code and we wanted a tool to help find some of those problems. I'm also heavily involved in the OWASP Bay Area chapter. Any OWASP members? Last show of hands, I promise. OK, good. OWASP is a very cool organization; if you love security, please check them out. We're actually having a happy hour tonight at Dirty Water, 5:30 to 7:30, so if you want to have some beer and

hang out with cool people, please come by. What is it? This is Bandit. I totally stole this logo; this is not Bandit's logo, and I don't own it at all. That's the disclaimer at the bottom. Bandit is open source and completely free to use; in fact we encourage people to use it. It's also purpose-built. I mentioned that we started doing this work with OpenStack a while ago, and in 2014, with the type of issues that we were seeing, we suspected that you could just run grep across code and find things like command injection, use of weak crypto algorithms, and stuff like that. So we actually started off with a grep-based approach. We were just like, OK, let's write some

grep rules and find problems in Python code. Then we were at a hackathon or something like that, and my boss at the time, a smart guy named Jamie, went away, thought about the problem, and came back with the initial version of Bandit. Since then it has grown into the tool that I'll be talking about today. We have a few design goals for Bandit. We wanted it to be easily customizable, because there are different workflows that you might use. For example, you might use it as a penetration tester and just say, show me all of the things that might possibly be issues. When you do that it's very noisy, so it might not be appropriate to run Bandit

in the same way if you're going to do a gate check job, for example. So we wanted it to be very customizable, and we also wanted it to be extensible. We came up with a certain number of tests that we thought would be useful to us, but we also wanted to make it easy to extend for people that want to use the AST in Python to find their own security issues that maybe don't apply to our environment. Finally, we wanted it to be lean. I personally, and I assume a lot of you, have had bad experiences with some static analysis tools in the past. Some of them are very bloated, they take

a lot of space, and you can't run them locally on a laptop very easily. So we wanted Bandit to be something that quickly runs across your entire code base without waiting half an hour and gives you results that you can do something with. So what kind of issues are we talking about in Python code? Let's go through a few of them. The first one is command injection. In Python you'll see a lot of times people will use subprocess to open something: you give it a command along with some inputs to the command, and finally user input, and then you give it the magical shell=True, which does what? It drops everything that you just put

in onto a shell. So you typically see things like this: if the user puts in "; cat /etc/passwd", it's horrible, and you get wasted like this dog right here. He's bad news, poor guy. Here's another example. People, for whatever reason, seem to like to hard-code /tmp paths, and the problem of course is that if an attacker can guess which temp file you're gonna use, they can do all kinds of stuff. They can make it unusable to you, in which case your code might error out, or maybe they'll convince you to write something into it that you think is perfectly legit when they have symlinked it somewhere else, and now all of a sudden you're causing

data that you never intended to be written to a file. There are secure ways of doing this; this is not one of them, so when we see this we want to flag it. Next, disabled TLS certificate validation. TLS exists for a reason: there are a lot of attack vectors that you can prevent by using TLS, and whenever you say verify=False like this, you're saying forget about all those TLS protections, we don't want those. You'll oftentimes see developers doing it because they don't have certificates in their development environment, but they forget to change it back, and then guess what, all of the TLS protections that you thought you had, you don't have. Next, use of weak cryptography.

This is not the 90s anymore. MD5 is totally broken; if you're doing anything security-related, do not use MD5 for it, please. A lot of times you will see people doing stuff with passwords, storing them as MD5. They're like, OK, cool, hashing it, good job. But unfortunately we have very fast computers nowadays, and with cloud computing it is trivial to compute a hash collision. The last example issue we'll talk about is promiscuous file permissions. I've seen a lot of developers just say, you know what, 777 should work, all the permissions for all the things, it's fine. The idea is that they'll come back and set something more secure later, but again, people being

people, sometimes you forget. So we look specifically for that last bit. Oftentimes you'll want your app to be able to access files that it's dealing with, but there's very little reason that you would want the world to be able to write to a file that you're dealing with. So anytime we see world-writable, we're like, OK, that probably doesn't look right, let's flag it. OK, so that's it for my very long-winded intro. Let's actually get into using Bandit to find some issues. I mentioned that in 2014 OpenStack was not as secure as it is today. Part of the reason is that we used Bandit to find some of the issues.

For example, in this case we have basically a hard-coded admin/admin credential for this web UI, and what that means of course is that you can't change it; it's just admin/admin. So anybody that looks at this code knows exactly what your username and password are for the UI. Hard-coded credentials are one of the things that Bandit can detect. In fact, there's been some very cool work: some of the people from Lyft use string entropy to detect credentials more accurately than Bandit does itself. Bandit's check is not very sophisticated; it just looks for things that look like they might be passwords. Here's another example. In this case they're using the temp file thing that I talked about.

They're writing a SQL file over there to a fixed path, so it's always going to be the same create-gdb SQL file under /tmp, and then later on in the code they're just executing that against their SQL database as the root user. So again, when you do stuff like this: wasted. It's not good. We were able to fix both of these issues, and a lot more issues that we won't get into today, by using Bandit. What I'd like to do at this point, rather than you listening to me talk the whole time: let's find one together.
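The issue classes covered so far (shell command injection, fixed temp paths, world-writable permissions, MD5 for security purposes) each have a safer counterpart in the standard library. The following is a minimal sketch of those counterparts, not code from the talk or from OpenStack; the function names are hypothetical and chosen only for illustration.

```python
import hashlib
import os
import subprocess
import sys
import tempfile


def run_without_shell(user_input):
    """Pass arguments as a list so no shell ever parses the user input."""
    # Flagged pattern: subprocess.Popen("echo " + user_input, shell=True)
    cmd = [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def write_private_temp(data):
    """Let tempfile pick an unpredictable name instead of a fixed /tmp path."""
    fd, path = tempfile.mkstemp(suffix=".sql")
    with os.fdopen(fd, "w") as handle:
        handle.write(data)
    # mkstemp already creates the file owner-only on POSIX; the explicit
    # chmod just makes the intent visible (owner read/write, never 0o777).
    os.chmod(path, 0o600)
    return path


def fingerprint(data):
    """Use SHA-256 rather than MD5 when the hash serves a security purpose."""
    return hashlib.sha256(data).hexdigest()
```

With the list form, an input such as "; cat /etc/passwd" is handed to the child process as one literal argument instead of being interpreted by a shell.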

OK, so the first thing I'm going to do is use my virtual environment.

Whenever you're working with Python code you want to do it in a virtual environment; otherwise all of the dependencies and stuff that you're dealing with get installed on your user account and it gets very messy. So this is a virtual environment, and it's got Bandit installed. Now what I'd like to do is run it against Ansible. Sorry, I said no more audience questions, but you guys have heard of Ansible, right? It's a pretty popular orchestration framework. OK, cool.

And now let's run Bandit against it and see what we come up with.
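The exact command typed in the demo is not in the transcript. As a rough sketch, a scan that recurses over the current directory, skips a unit-test path, and reports only medium-or-higher severity and confidence could be assembled like this. The `-r`, `-x`, `-ll`, and `-ii` options are Bandit command-line flags; the `test/units` path and the helper names are assumptions made for illustration.

```python
import shutil
import subprocess


def build_bandit_command(target=".", exclude="test/units"):
    """Assemble the scan: recursive, skip unit tests, medium and above only."""
    return [
        "bandit",
        "-r", target,   # recurse into the target directory
        "-x", exclude,  # path to skip (hypothetical unit-test location)
        "-ll",          # severity: medium or higher
        "-ii",          # confidence: medium or higher
    ]


def run_scan(command):
    """Run the scan if Bandit is installed; return its exit code or None."""
    if shutil.which(command[0]) is None:
        return None
    # No check=True: a non-zero exit code here usually means findings.
    return subprocess.run(command, capture_output=True, text=True).returncode


print(" ".join(build_bandit_command()))
```

Building the command as a list keeps the scan itself free of the shell-injection pattern Bandit looks for.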

Just to explain what I've done here: I'm telling Bandit to recursively scan the "." directory, because I'm in the Ansible directory, and I'm saying I want to exclude unit tests. We do that because anytime you're looking at bugs you want to be very careful that you're not creating noise where you don't need to, and you don't want to tell developers about things they're doing insecurely in code that doesn't make it into production. So let's just exclude the unit tests. Finally, to make sure we're getting good results, I'm using the -ll and -ii filters: -ll is severity filtering and -ii is confidence filtering. So what we're gonna do when we

get this is, we're going to get all of the issues in Ansible that aren't in unit tests and that have medium severity and confidence or higher. You'll see it runs fairly quickly; you're not gonna have to wait long for this. This is one of the things that we like about Bandit: developers can run it before they check in code, or they can run it as part of CI, no problem. OK, so here's one issue. It's saying the input method in Python reads from standard input, evaluates, and runs the resulting string as Python source code. Well, that sounds kind of crazy. Does that seem like a reasonable thing that anybody would want to do when

they're calling input? Let's dig into that a little bit.

OK, so this is actually the code in Ansible. I dug it up and put it on a slide so that you guys can see it more easily. They have a prompt function, and what it does is get user input while showing the prompt that you have passed. But they might not think that the input here is actually going to be executed as Python code. I don't know about you guys, but that seems pretty sketchy, and when I saw this I was like, oh, that's not what I would expect when I call input. So I looked it up in the documentation, and in fact that's exactly

what it does: it's equivalent to eval(raw_input(prompt)). Whoa, that sounds pretty bad to me. So what I did is I came up with a little example that we can run through to see what that looks like.

So first of all, this is just a super basic script that uses the input function. It just says input("would you like a prize"), and that's it, that's the entire script. Let's see what happens when we run it. "Would you like a prize?" "yes". OK, so it looks like it is actually trying to execute my input. That looks pretty sketchy to me. I wonder if we can exploit that. What if, instead of saying yes, I were to do something like this?
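The demo script and the payload typed on stage are not in the transcript. The sketch below emulates the Python 2 behaviour in Python 3 so it can be run safely today: Python 2 documented `input(prompt)` as equivalent to `eval(raw_input(prompt))`, whereas Python 3's `input()` returns the text unchanged. The helper names and the harmless payload are stand-ins chosen for illustration.

```python
def python2_style_input(typed_text):
    """Emulate Python 2's input(): evaluate whatever the user typed."""
    return eval(typed_text)  # deliberately unsafe, for illustration only


def python3_style_input(typed_text):
    """Python 3's input() behaves like raw_input(): it returns the text."""
    return typed_text


# Answering with a bare word fails, because it is evaluated as a name.
try:
    python2_style_input("yes")
except NameError as error:
    print("answering 'yes' fails:", error)

# A harmless stand-in for attacker input: an expression, not a plain answer.
payload = "__import__('os').getcwd()"

print(python3_style_input(payload))  # the literal text comes back
print(python2_style_input(payload))  # the expression is executed
```

Any expression the interpreter can evaluate would run the same way, which is why Bandit flags the call.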

You guys think it's gonna work? It'd be kind of a bad demo if it didn't work, wouldn't it? Holy, it just went and read my password file and dumped it out. So you can actually get command injection with this. Why am I showing you guys this? Is this me being super irresponsible and dropping a 0-day on the Ansible project? It turns out it's not. I dug into it, I was like, wow, this is kind of a crazy bug, and they have code earlier in the project that does this. It says: are we running in Python 2? If we are, then we need to use raw_input, which is safer, and if we're

running in Python 3, then go ahead, input is fine. So here's what I would like to see if I'm running Bandit. I spent half an hour, 45 minutes looking into this bug. It seems like it would make sense to put a comment here and say, hey guys, don't worry, we're not exposing this to untrusted user input that we're gonna execute. So what I would do if I was running Bandit on my project here, if Ansible was my baby, is put a comment here saying don't worry, we've thought about this, so the next person that comes along doesn't waste their time on this. OK, let's switch gears a little bit

and talk about how you might build Bandit into your program. If you're responsible for a code base, this is the essential workflow: you run Bandit, you find bugs, you fix the bugs, you end up in a state where there are no more bugs, and then you profit. Now of course the hard part in this whole thing is the "remove bugs" part, so let's talk a little bit more about how that works. Basically, when we talk about removing a bug we're gonna do one of the following things. We can use nosec, which tells Bandit, OK, a human looked at this and it's not a security bug. For example, we're using MD5 here just as an

efficient hash algorithm; we're not using it for any security function. If you're talking about a non-security function, MD5 is great, it's quick and all that, and there's no reason not to use it. So if you're a security person and you dig into it and you say it's fine, we're not doing anything security-related with it, you can put a nosec tag, and hopefully a helpful comment like I wish there had been with the Ansible thing. From then on Bandit will ignore it, and people that come after you will see that somebody looked at it and it's cool. Another option is you can actually fix the bug, which in the case of the two

that I showed for OpenStack is the right thing to do. You want to fix those, but unfortunately it takes some time. Or you can decide that the whole class of bug is not important to you. For example, there's a Jinja2 plugin. Jinja2 has an auto-escaping feature. A lot of times Jinja2 is used for templating HTML, so you take some template and use it to serve HTML to a user. Of course you're worried about cross-site scripting when that happens, and Jinja2 does have an auto-escaping feature that'll work, but a lot of times people disable it. Sometimes, maybe, you're not using it to render HTML, and you could just say

it's cool, I don't care about cross-site scripting, and so you would say, for my project, don't worry about Jinja2. That's what I mean here by deciding that the whole class of bug isn't important: say it's cool. The final option here is something we call Bandit baseline. The idea behind Bandit baseline is: my project is full of, you know, 500 bugs right now, but I want to make sure that I don't get to 501. What it'll do is run Bandit against your parent commit, then run Bandit against your current commit, and compare the difference. If there are new bugs, then it says, OK, you're doing something right now that's

gonna create a problem for me. Otherwise it'll just say, you know what, these aren't new bugs, so fix them when you can, but for now it's OK. So those are the options that you have when you talk about fixing the bugs that Bandit finds in your code. Then really what you'd like to do is build a gate. You have any number of CI tools. I'm not necessarily shilling for Travis CI here; I just found it was an easy thing to set up. You tell Travis CI to look at your repo, and then these three lines will do it: you say install Bandit, run Bandit, and then if you have a pull request on your

repo, Travis CI will run Bandit and then put comments on the pull request, so you know if there are new issues. Super simple. OK, we are getting to the end here. I just want to take a little time and go into some next steps. I feel like Bandit has pushed the marker forward for Python code. At IBM I'm responsible for a lot of code that's not Python, and what I'd really like to see is resources like Bandit, and the secure development guidelines that I'll show you in a minute, for other languages. For example, Node: there's tons of JavaScript on the server side happening, and there aren't a lot of tools like Bandit that I've seen that

are any good at analyzing Node, and there are not a lot of development guidelines that show you what you should be doing. And then metrics: I think there's a lot of room here to extend Bandit. For example, let's say that the same development team keeps introducing bugs in Python code. Well, we could probably help them out with some training, and we don't necessarily want to give training to teams that are more sophisticated, so we can use Bandit to determine where we might need to spend a little more effort doing focused training. I mentioned the secure development guidance. This is something that we wrote for the OpenStack project, but I think it's going to be useful to anybody that

does Python. Basically it's a five-minute explanation of a certain type of issue that you might see, and it shows the incorrect and correct way of doing it. I think all of us have probably seen examples where somebody just copied some code from some guy or some girl on Stack Overflow, and it's just not good code, and then it ends up everywhere in your codebase. The idea here was to create a place where you could copy the example code and it would be done securely. So that's exactly what we did here for about ten classes of Python issues. The Bandit plugin documentation itself is, I mean, I'm not gonna say it's great, but I think it's

pretty good at explaining the type of issues that Bandit is detecting. You can go online and see, for example, requests with no cert validation, that issue that I talked about at the beginning: this is what an example of it looks like, here's the problem, and then some resources if you want to read further on it. And that is the end of my material, so I'd like to open it up for questions if anybody has any. Yeah? No, it's not. Right, I should have mentioned that. So the question is: is Bandit driven by a bunch of regular expressions? It's not. It takes an AST

approach. Basically what happens is we load a file, a Python source file, and we break the Python source into an abstract syntax tree, which is actually the tree that Python uses internally to understand the flow of your program. You'll see certain nodes, like a function node or a string node; every element in Python has a node type associated with it. So if we want to look, for example, for bad function calls, then we look at the function node and say, OK, this is a subprocess.Popen, and then we can look at the arguments and say, OK, it's got shell=True. When we have the

two of those together, we know that we have a possible subprocess command injection issue. It's better than regular expressions because with a regular expression you might have a newline in there, and now, unless you wrote the regular expression craftily, you don't recognize it anymore. Regular expressions force everybody that's writing plugins to be guessing all the ways that the expressions might break. And there are a few other benefits to the AST approach that we implemented. Other questions?
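The AST approach described in this answer can be illustrated with Python's standard `ast` module. This is a simplified sketch, not Bandit's own plugin code: it only reports calls that pass the keyword `shell=True`, and the sample source and function name are made up for the example. The flagged call is deliberately split across several lines, the case the answer says trips up a naive regular expression.

```python
import ast

SOURCE = '''
import subprocess

subprocess.Popen(
    "ls " + user_input,
    shell=True,
)
subprocess.Popen(["ls", "-l"])
'''


def find_shell_true_calls(source):
    """Return the line numbers of calls that pass the keyword shell=True."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for keyword in node.keywords:
            if (
                keyword.arg == "shell"
                and isinstance(keyword.value, ast.Constant)
                and keyword.value.value is True
            ):
                hits.append(node.lineno)
    return hits


print(find_shell_true_calls(SOURCE))
```

Because the check runs on parsed nodes rather than raw text, whitespace and line breaks inside the call make no difference; only the first call in the sample is reported. A fuller check would also look at which function is being called.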

Thank you. Yeah, the question is what reporting options exist for Bandit. In addition to the command-line output that I demonstrated, Bandit will output JSON, HTML, and CSV. I think there are a few more that I'm forgetting about too, but yeah, there are a lot of options. In particular, for automation we use JSON a lot, because everything can parse JSON. Yep? Remediation: the question is do we have any plans to add remediation suggestions. I think that a great place to do that would be the Bandit plugin documentation; I think the format that we have lends itself to doing that well. I'm not sure that we've done it comprehensively for every plugin, but some plugins

definitely do it. For example, off the top of my head, we have an XML class of plugins, and the message there is that using XML may be susceptible to fork bombs and all kinds of other XML attacks; use defusedxml is our suggestion. So I think that we have it in some places, but I'd love to see it more prevalent in all plugins. Yep? Yeah, it's a great question: how do we track the remediation of these issues over time? Part of it is the nosec stuff, so for any issues that we determine weren't really issues we'll put nosec. And then generally we have, like,

offline ticketing and some automation we've built internally around it, but there's not anything in Bandit itself, and I would love to see more work in that space. Yep?

That's a great question. It'll go against whatever Python... oh sorry, let me repeat it: does it scan just a module, or can you scan other modules too, like your dependencies? By itself Bandit will scan any pile of code you give it. It doesn't really understand that a certain set of source files makes up a module, so you can point it at a directory and it'll scan everything in the directory. What I would recommend in that case is doing the dependencies separately, so that rather than having everything stuffed into one report you have a separate report structure. The other reason to do that is that one of the

limitations in Bandit right now is that you can only exclude one unit test path, and unit tests for your dependency might be called something different than unit tests for your project itself. So what you'd ideally like to do is run each project separately, exclude unit tests, make sure you're getting good results, and then pile all the results into one central store, which you could easily do if you export JSON or something like that.
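Piling separate per-project JSON reports into one store could look roughly like the sketch below. The report layout is an assumption: it supposes each report has a top-level "results" list whose entries carry an "issue_severity" field, and the sample reports, project names, and helper functions are invented for illustration.

```python
import json
from collections import Counter

# Two tiny stand-ins for per-project reports. The "results" and
# "issue_severity" keys are assumptions about the JSON layout.
PROJECT_REPORT = '{"results": [{"filename": "app.py", "issue_severity": "HIGH"}]}'
DEPENDENCY_REPORT = (
    '{"results": [{"filename": "lib.py", "issue_severity": "MEDIUM"},'
    ' {"filename": "util.py", "issue_severity": "HIGH"}]}'
)


def merge_reports(reports):
    """Flatten per-project JSON reports into one list tagged by project."""
    merged = []
    for project, raw_json in reports.items():
        for issue in json.loads(raw_json).get("results", []):
            merged.append({"project": project, **issue})
    return merged


def count_by_severity(issues):
    """Tally issues by severity across all merged projects."""
    return Counter(issue.get("issue_severity", "UNKNOWN") for issue in issues)


all_issues = merge_reports(
    {"my_project": PROJECT_REPORT, "some_dependency": DEPENDENCY_REPORT}
)
print(count_by_severity(all_issues))
```

Tagging each issue with its project keeps the separate report structure recommended in the answer while still allowing totals across everything.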

Now the question is: are there hooks for integration into IDEs such as Spyder? There aren't; I wish there were. It's something people have brought up very frequently; for example, PyCharm is one of the ones that we get requests for a lot. I don't think it would be very difficult to do, but nobody's done it yet. I would love to see it. All right, any other questions? Let's wrap it. Thank you. Thank you very much, Travis. Special thanks from our sponsor Fitbit, and here's your Fitbit: you got a Fitbit from our sponsor.
