
Building a programming environment for privacy

BSides PDX · 2023 · 27:12 · 76 views · Published 2023-11
Category: Technical
About this talk
Building a programming environment for privacy and iterative learning
Lateef Jackson (@lateefjackson on Twitter, https://bsd.network/web/@lhj, https://lateefjackson.com/)

Why do cell phones ask for permission when installing an application for access to the internet, yet any software dependency can just access the internet willy-nilly? It doesn’t have to be this way. You will see a working prototype that prioritizes privacy first.

4 takeaways in 20 minutes:
1. Motivations for privacy first.
2. Why existing options (Capsicum, Pledge) don’t meet our needs.
3. POC: Sndl. Adding capabilities (controls) to Lua packages.
4. Demo: Python socket as an example of applying package capabilities outside of Sndl.

My name is Lateef Jackson. Since the turn of the century, I have been building software systems and engineering organizations. People build software, and people are impacted by software. My focus is on a data-driven approach to impacting people through technology.

BSides Portland is a tax-exempt charitable 501(c)(3) organization founded with the mission to cultivate the Pacific Northwest information security and hacking community by creating local inclusive opportunities for learning, networking, collaboration, and teaching. bsidespdx.org
Transcript

[Music] Welcome. I'm going to do a talk about building a programming environment for privacy. Quickly, I just want to make sure you're in the right room and set the right expectations. We're going to talk a little bit about the motivations for privacy, at least for me. I've been losing sleep over Log4j: about 20 years ago I wrote some code with Log4j, and I still have no idea how I would have prevented the compromise of privacy that came with everything that happened with Log4j. We still don't have the tools, and so I was really, really motivated to

think about how we start to have tools that we can use to prevent those types of attacks in the future. The second thing is really about sandboxing packages: as we build software, our packages are sandboxed and we default deny. We default deny as a pattern anyway, that's kind of standard in the industry, and so I want to make sure we all have that capability when we're building software. I built an environment just for this. I think the reason we should really care about privacy is that privacy is a human right, and so it's worth putting a little bit extra

effort into how we build software. And finally, I'm going to do a little example of how default deny can be applied to a Python application, just a quick code example. It's going to be six lines, don't panic, very simple. Then we'll look at how that applies and what that might look like as we move into this new, exciting world. Okay, so what I'm not going to talk about: this is not a production-ready "install this and magically you will have privacy, it'll block all your dependencies." This is a thing you configure. This is a path forward. It's

going to have some trade-offs, but I think those trade-offs are worth it for what we get. First, I just want to be super honest and say I'm a fraud: I am a privacy programmer at a security conference. I'm really passionate about privacy, but hopefully you are too, and hopefully solving privacy problems also solves security problems. I've been doing this for 24 years, so I've been putting code out there, and creating problems for security people, for 24 years. I hope I'm not alone and there are other people out here who write code and are also causing problems. Future

employment. I used to be a CTO at a healthcare startup, and that's where I really got passionate about the concern that I don't really know how to build software that has good controls for privacy, because the internet happened. Okay, so it's the end of the... so if you're still here, great, thank you for staying, all you nice people. This is my first talk, so I'm a little nervous. Just a quick question: who here trusts the internet? Anybody here trust the internet? I mean, you're at a security conference; if you raise your hand, I'm a little concerned. This is sort of a baited question. I'm keeping my hand up

because I know that this is true. It's okay. Next question: have you written any code or script or anything where you had a dependency? You can raise your hand here. Put your hand up, you know where this is going. It's okay, it's almost over; we have 20 more minutes and then you can stretch out. Go ahead, make yourself known. It's okay, I'm not recording any of this. Hi, people at home. Okay, a couple more small questions. Do you trust your dependency now and forever, in all future updates, till breach do you part? Anyone? Yes? No? Okay, your hands are starting to go down. What

about the dependencies' dependencies? We call those transitive dependencies: the software you depend on that downloads software that downloads software, and so on all the way down. Do you trust all that code out there? Do you trust these dependencies with your hard drive? Should you? The right answer is no, but I have to build software. What about S3 buckets? Anybody have access to S3 buckets, either through the web or any other way? Yeah. Production databases? Anybody have access to production databases? So your dependencies' dependencies have access to your databases. Okay, so just a quick level set. I'm a privacy person. I want to

acknowledge the elephant in the room, or the security black hole that it is. I was listening to a podcast, I heard this on Risky Business, and it just kind of blew my mind that we have kind of less the solar system in our vulnerabilities. I don't want to ignore security, but I do want to focus on privacy, and so I'm going to make a little case for privacy here. These are interesting problems and we should definitely solve these. Can you still hear me? Yes? Okay. These are interesting problems; we should solve these. I don't think things are going to get better anytime soon, based on what I

hear, so good job security for everyone here, not so good for everyone at home. Among those massive, huge things (and I know you can't read this in the back), I did a quick Google search about data breaches. Two things came up; you can do this too. I'll just read the top one and give a little quantification, because maybe security is in the trillions or billions or whatever numbers that we can't really comprehend. I have no affiliation with this; it's just a random Google search, and I don't even know how this was breached. But here in Oregon we had

1.7 million clients' data breached in a healthcare data breach. That's 40% of the population, so statistically there are people in here whose data was breached, if you live in Oregon. This is what failure looks like, in case you're wondering. This is privacy failure. Next. Okay, so I just want to put this in context: this is a human-scale problem. Every privacy breach we have is a human somewhere who has potentially been harmed. It's not just data in some huge warehouse, or some company that has no face or has an X or whatever. It's actually people, and so I think that there's a moral obligation to

think about this. They've trusted us with their data and we have not been able to protect it, and so we should have a little bit of ethical motivation. This is a great one I just found; around August 3rd I read this blog, and it's perfect because it was malicious. Log4j was sort of accidental: a feature was added, and the thing that's keeping me up is, oh, how much code out there has been breached because I added Log4j 20 years ago, and then they added a feature at some point where you could get shell access

and so that's keeping me up at night. But then I realized there's also malicious intent now. As a threat actor (some previous talks were a great example of showing what a good threat actor can do), I can just deploy a module somewhere that's got a great name, and that somehow gets included in one of those dependencies' dependencies' dependencies, and now I can just read your hard drive. In fact, this particular attack was aimed at developers or data people, Python data analysts, because, you know, they don't have access to sensitive data or anything. Sorry, sarcasm. Anyway, only 75,000 downloads, so not really a

big deal compared with the hundreds of millions, but to me it still is. So: privacy breach just through dependencies. Let's not trust the internet; let's sandbox our packages. I think there's a pattern that we see everywhere. This is an iptables command, and it just says block everything that comes in; that's the default behavior. I've done some research on the prior work in this area: there's Capsicum, which is FreeBSD, and OpenBSD's pledge, and there's lots of Linux stuff, and they all default to deny: do not give access to data. My assertion is that we need to

sandbox our packages, our dependencies, in the same exact way: we deny everything. This is kind of us developers. If anybody here is a developer or deals with developers, you know what this stuff is: this import stuff where we package our code, where the magic happens. This is where the Log4j stuff lives. These are specific examples from Java and Ruby and Python or whatever. We share our code via these libraries, and sometimes we publish this stuff up on the internet and other people get to use it. That has changed now, and when I

started programming this was not true: dependency management wasn't considered a language feature. But Rust and Zig and all the new hotness have dependency management built in, which is awesome, because, you know, 95% of the code that you ship is a dependency. Most of the code we run, we didn't write, which means it's someone else's code that we got on the internet, and we know that we trust the internet because it's such a good place, where everything is secure, right? So anyway, what I'm trying to say here is that this is the place for us to manage

our dependencies; this is the place to put controls. We have enormous power with this one line of import. This particular case is a piece of Python code I snatched out, and I'll walk through it a little bit later, but the point is we can import socket and connect anywhere on the internet. We can copy our entire hard drive; we can do whatever we want, just with that one import statement, and our transitive dependencies' dependencies can do that too. So, a little risk for the risk people. This seems a little sad and wrong, so I think we should be responsible developers, responsible about security. I think this is kind of a blue

team thing. I'm getting used to security words, and I guess I'm a blue team person, not a red team person, so this is blue team candy. So, any blue team people: let's constrain ourselves so that this package, this dependency, is only allowed to go to this host on this port, and it can read and write on the socket, so it can't just connect to anywhere on the internet. The best time to figure this out is when you write the code, not 10 years later when you're asking, "Hey, does that module talk to hacker one.com?" Right when you add the dependency is a good

time to add your controls. And again, I think this is an example of what I'm sort of pitching: let's just get some controls, whether it's Terraform or whatever management. Okay, how are we doing on time? Oh, I'm a minute... Before I go into the demo and talk a little about Python, I want us to think about divorcing ourselves from the idea that we sometimes have in our heads that any code can access any data. We build software and we bring in packages, but not every piece of code should access every piece of data. But in

a process right now, when you import something it can access all your data, it can access all your requests, it can access all your databases. We've got to stop thinking that way, that once I am part of the software I get access to everything. So I'm sort of trying to convince you to change your mind, though I feel like this audience is cool with this. This is a cool idea to you guys? Okay. So, Python. I built a whole programming environment, and I wanted to take some of these lessons learned and try them with Python and just show you what that looks like. This will be interesting because the terminal's

over there. But why Python? I think the first thing is that it's very popular; it's a very familiar environment. People probably have Python scripts, security uses it a lot, and it's easy to say. The other thing is that I think this concept could very easily be applied to existing codebases. If you have a legacy app from five years ago, you could easily say, "I don't necessarily want the database module to also be talking to, say, an external website." You could add controls into existing systems, or at least monitor and slowly add controls over time. So the idea here is not just that we're going to build new

software; it's that we're going to add controls, and we're going to add controls to everything. Any questions while I get this ready? Everybody okay with this? Anyone upset about this? Software engineers are going to be hating me soon if they find out I do this, because they're like, "You want us to do what?" Okay, so what I'm going to do is show you a little bit of that Python code. We'll walk through it real quick, line by line. The first part: I wanted to make this very explicit. You can do more magic here, but I wanted it to be super explicit for the purpose of

education. In this case, first I'm just going to import Sndl, which is my code name for the project that adds controls to all of the I/O. I want to control all I/O; since I'm a privacy person, I care about data access. A security person probably cares about other things, but I'm not really a security person, or I kind of fake it a little bit with my friends. So you import socket, and you want to create a connection to localhost port 8080. I want to send a little data, I want to receive a little data, I want to print

a little data. That rhymed; I didn't plan that. That was actually pretty good. So that's all that's going on. Everybody kind of followed this? Yes? Simple? Cool. Yes, I use Vim. Okay, yeah, I'm old. So let's just run this program. Simple, should just work. And the first thing we see is it failed, because we didn't have any controls. We denied all; the right thing happened here. There are no controls configured. Did something change? No. Okay, there are no controls configured, so it should fail. We have no configuration, deny all; we try to load our config and there's nothing. Yes, this is exactly what we should see. Future self,

right? If, back 30 years ago, you said, "Hey, here's my server, it doesn't have a firewall, I've got my database up in clear text, just telnet in," people said yeah. Yet today we think, no, that's not going to happen. No, no, no, bad idea. I hope in 30 years we're like, "Do you remember when we didn't have controls over our package management? We just trusted all the code on the internet and downloaded whatever, and it was fun." I just hope that is the future. So first of all, let's do some controls, and I'm just going to give a simple example, that config that we had

up there. There we go. Here it is, just saying you can connect to localhost. So, assuming my bash history is... I'm going to have to, for you people in the back... oh, I can't really figure out how to give you... all right, well, I'll just do it this way. For you people in the back, all I did here was run it, but I passed it an environment variable with the configuration. Just like if you had a Dockerfile: here's your configuration, here's your payload, here's the config. This time it sort of doesn't like me again, because apparently I didn't configure port 8080 in my controls. So let's take a look and

see what's going on with that. Let's take a look at our code. Oh, whoops, sorry about that. So port 8080, that isn't right. So what happened? Your... was 21? Thank you, thank you. All right, I'm going to buy you a beer later. Perfect. All right, so 80 2100, let's

see. Yay, it worked! Hopefully it was very clear that that worked. I know, it's a surprise. Thank you for the port number. Okay, so we've seen what it's like to add controls; we've seen that we can add controls to packages. We don't have to live in this horrible world. Just in summary (oh good, I'm sort of on time), a quick summary. Privacy failure is a lived experience. We are now in the privacy lived experience; breaches are happening all the time now. They're not all because we didn't control our packages; there's social engineering, there are lots of reasons for that, but I

think it's important to remember that humans all over now experience privacy breaches, and I think we have to think about how we can make changes to make this better. This can't be the normal. There is no free lunch, okay? No magical thinking: we're going to have to manage configuration, we're going to have to manage these controls somewhere. There's something we have to do here. We can't just sit by and say, oh well, we'll have another virus scanner do vulnerability something, something will fix this somewhere else in the pipeline. We have to start thinking, when we're

building software, about privacy. ChatGPT will save us? Yes, it totally will, because AI has never been racist or bigoted or wrong or statistically inaccurate; we've never had those problems. It's right 99% of the time, and so one time in a hundred it will give you a privacy-breaching piece of code. Isn't that great? So thank you for that. Oh, but there's hope. Yeah, I think there's hope. And quick, that's it for me. I just want to thank the people who gave me feedback on my practice talk. I want to thank the BSides people. This is my first conference since COVID; it's so nice to be with humans again. Also, thanks to everyone for coming and

listening to me. [Applause] [Music] Yeah, I'm going to repeat the question. I just want to point out before I repeat it that he's not a plant, because he asked a Node dependency question, and that is the scary one. So the question is: I just started a new Node project, it downloads all these dependencies, and you have a dependency file, so why don't you add the controls right in the dependency file? You don't need a new file; you don't need a new thing. So I think inherently, if you're not doing anything dangerous... I think part of this is

a little bit about how we're going to change our code. If we're going to make a connection to do I/O, there may be certain code that does that and other code that we just pass that information to, so we may rethink how we do dependencies a little bit; there's probably a little influence there. But assuming what you're saying is true, then yes, you are going to have to go through and justify every single package that needs access to those humans who have entrusted you with their data. Yes, you need to do that work; that is part of the job. So

I think the answer is: if you have tons of dependencies that are all over the place accessing all kinds of things, you're going to be a little sad for a little while until you sort that out. But that is the fiduciary responsibility of being entrusted with data about humans. At the end of that, you have a good understanding of what you're doing. What's crazy is when you add monitoring to this, so you know which packages are reading and writing what data, and now if that behavior changes, okay, maybe something went wrong; maybe some purple team person or red team person messed with you. In the back, in

the blue. And there's another blue in the back. Totally a plant; I'm really glad you asked that question. Not a plant, I'm just kidding. So the question was, for C or compiled languages... oh, C modules in Python: how do we deal with that? I think it's a great question. There are two answers to that question. One is that the control type stuff I'm talking about could easily be pushed even lower, into Capsicum and SELinux, Linux eBPF, and whatever the Linux thing is doing today, because it changes. But at the OS

level, OSes already have this capability. It's really hard to use, it's really low level, and it's really complicated, but we can use these high-level controls and push those down to the low level and provide these guards there also. So the answer is that the key is to get to a point where I can make those lower-level API calls to block things like C modules in Python. Is that sort of...? Yeah. At the end of the day, I'm monkey patching; one of the tricks you can do is monkey patching. You can also do a number of other things; there's an earlier talk

that sort of talked about tricks with this, and those are the tricks I've used with the C modules: I just swap out the C API calls. I just proxy them, so instead of using the C socket call, you... anyway, there's a little person-in-the-middle trick with this. So yes, we're going to have to go through and patch a ton of code to make this all happen. I'm sorry, in the very back... no? No question. I'm moving away; one second. A lot of blue shirts today, I don't know what it is. Does everybody wear a blue shirt? It's like BSides... Okay, so here in

the middle, almost the middle, yes. How much of this could be solved by static analysis? I agree, I think a lot of it could be exactly that, and how we might actually build the initial version of our control file might just be: let's do static analysis and generate a file, then let's review that and make sure it's correct, because staging might be different from production, which might be different from dev, so we're going to have to have some per-environment changes. But I also think static analysis is a way for us to do our compiled languages, so we can

wrap our compiled languages with these kinds of hooks, so that you get the same behavior for compiled languages as for dynamic languages. I think dynamic languages are a place to start because it's easier, but eventually static languages too. So let me just rephrase the question, maybe in a shorter version if I can: if I'm using a compiled language, I already know all the APIs, the I/O that I'm using, so why would I do this? Is that sort of the question? How do I balance the amount of time; why would I invest all this time in it, since I already know what it is? I think we need to stop doing

that. I don't know how we balance the privacy concerns, but I think we have to stop pretending that using an external module is safe. It's just not safe anymore. I can give you some links if you want to go download some cool software, and I'll totally own your computer, right? But you wouldn't do that. And yet you go and say, "Oh cool, I'm using Java or Go and I'll just get whatever transitive dependency; it doesn't matter, it's statically compiled." Right? That is untrusted code that can access anything you can access in that process. It's trickier, it's harder with static languages, but if

there's an update, you don't really know what's in that update. Do you read all the updates? Actually, let me ask you a question: do you read every single update of every transitive dependency you have? He's shaking his head and said no. So then how do you trust your code? Do you even know what packages you have? In general, yes? Okay, good. I think that's a good example. Do you know what packages were in that software you wrote five years ago that's still running? I think you're with me. I don't, but I think it's a good question. I think there are really big challenges with certain languages, and

it's a huge assumption change. It's a cultural change; I think earlier, in the keynote, someone talked a little bit about culture. It's an architectural change. Oh, it is time for this all to come to an end. You guys have a great time. Thank you very much. [Music]