
Accelerating Software Development with ChatGPT

BSides SLC · 2023 · 28:51 · Published 2023-06

All right, morning folks, thanks for being here; we're going to get started. My name is Scott Pack. A little background context on me: I started off in systems administration with more of a networking background, and over the last seven years or so I've been working as a cloud security engineer at a couple of respectably sized companies, Adobe and now Netflix, both of which had the same problem: security can't scale at the same speed as the rest of the business, and headcount doesn't scale. Basically, you've got to learn how to do more with less. Like probably many of you, I found that a useful tool for solving that problem is automation, which led me out of systems administration and into software engineering.

I wouldn't necessarily call myself a software engineer, even though that isn't my title today; I kind of found my way into the role, first by writing bash scripts and Python snippets to automate things, then by integrating off-the-shelf tooling to work better and be more hands-off, and from there I found myself writing larger platform-type things: ETLs, security control setup, misconfiguration detection, vulnerability scanning, all kinds of stuff. Over the last few months I, like many people, have been caught up in the rapidly advancing capabilities of generative AI, and I've been using it both in a work context and on personal projects.

Over the next twenty minutes or so we're going to do a full project build using ChatGPT, start to finish. Not live; I don't hate myself that much to try to present with GPT start to finish, and we'll abbreviate some parts that get a little repetitive. What this is not: I'm not a YouTuber, so I'm not going to try to distill this into ten cool tricks you should use with GPT. I'm not an AI specialist, so I'm not going to try to explain how or why these things work the way they do. I'm just sharing patterns and approaches that I've found effective. My hope is that this presentation gives you some ideas on how to start using it to make your day-to-day work faster, maybe skip over the boring pieces and get to the parts you find more interesting and challenging.

I'm also not going to talk about open source tools or projects that facilitate AI-driven development. There are getting to be more and more of them; I swear every day a couple more are announced. My experience with those is very nascent, and there are too many coming out to stay on top of. So when putting together the structure of this talk, rather than just making a list of individual bits and pieces, I thought it would be more useful to see how GPT can be used throughout the entire bootstrapping of a project.

Just like any project, when starting off it's important to know what you're trying to build and what you want to end up with: specifics about what the application will do, performance goals, some technology selections in mind. One might wonder, if we're working with an assistant that is (thanks, Ryan) proficient and capable of cranking out just about any programming language or technology you could point a stick at, why does it matter that we know the language and technologies we're working with? It's because we're going to need to do a lot of validation, a lot of manual tweaks, and a lot of stitching pieces together. The chunks we coax GPT into producing definitely arrive like an IKEA package: some assembly required.

For the purposes of this presentation I chose to build a highly parallelized, Lambda-driven port scanner. It's something I've built a couple of different times for different purposes in the past, and I hope it will be contextually familiar to folks here, and maybe even a useful end product, since the results are openly available, published in a Git repo.

For a lot of this presentation I'll be following a layout that's probably familiar to many people by now: prompts on top, responses on the bottom, and on the right-hand side, where there are code changes or templates that GPT is spitting out, I'll show what I'm merging into my local version of the project.

So, starting off with context. I found that rather than going straight to "hey GPT, give me an application that does this thing, go," it helps a lot more to provide an overview of the whole process we're trying to build and how we're going to approach it, and then turn it around and prompt for clarifications. Specifically requesting that we do this in steps helps lay down milestones we can use to tackle it one piece at a time, rather than trying to get the whole thing back right at the beginning. It also gives you an opportunity to fact-check some of your own assumptions: having GPT request more information does a good job of making me think about the issues and implementation obstacles ahead of time, and it sets the context of the thread in a way that lets it generate more accurate results with less direction down the road. Left to its own devices, GPT will typically make executive decisions on things like the questions shown here. I said: hey, we're going to build a port scanner; we want tens of thousands of scans per hour; we're using Lambda; we're using DynamoDB; we're going to do this in steps; make it deployable; and then we'll iterate. Then I asked it to ask me questions, and it put out a pretty decent set.

There are some things you'll probably want to exert opinion and influence over. Prompting for clarifying questions and addressing them up front doesn't guarantee your answers will be taken into consideration when you hit the point of implementation, but it increases the likelihood that GPT will give you more desirable responses on your first tries. After providing the answers back, we got something pretty close to the roadmap of how I would have tackled building this: starting with a schema, building out the infrastructure, building the Lambda functions themselves, then testing and optimizing.

I've also found it can be useful to ask for a diagram. I've used ASCII outputs just like this one; I've also seen folks use UML diagrams that can be imported into other programs. It works okay, but try not to rabbit-hole too much early on. What we're trying to do is just decompose the project into a set of steps and pieces that we can then review and validate. At this point we could just say, okay, give me step one, give me step two, and try to assemble the whole thing ourselves.

We'll still end up needing to do a lot of manual tweaks if we do it step by step, but doing those iteratively, one at a time, getting something small, testable, and functioning, sets you up for a much better time than trying to do the whole thing at once. Has anyone ever done a peer review of a coworker's change that had 80 changes across 50 different files? That's what trying to assemble the whole thing at one point would feel like. It's much easier and more straightforward to do it iteratively: start small, review and validate often.

The first step we had was to define a message and result schema. This is useful because it impacts both the way the Lambda functions operate and the template, the queue, and the DynamoDB tables we're going to set up. Having information about these record schemas up front means we don't have to revisit those decisions down the road.
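The exact schemas aren't shown on screen here, but as a hedged illustration, records for a scanner like this might look something like the following (field names are my guesses, not taken from the talk's repo):

```python
# Illustrative only: plausible message/result shapes for a Lambda port
# scanner. Field names are assumptions, not the repo's actual schema.

# Message published to the SQS queue by the scan-request handler:
scan_message = {
    "scan_id": "3f2a9c1e",          # correlates results from one scan run
    "target_ip": "203.0.113.10",    # single host to scan
    "sender": "ScanRequestHandler", # which function published the message
}

# Result item written to the DynamoDB table by the scanner function:
scan_result = {
    "scan_id": "3f2a9c1e",          # partition key
    "target_ip": "203.0.113.10",    # sort key
    "open_ports": [22, 80, 443],
    "scanned_at": "2023-06-01T16:20:00Z",
}
```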

So let's get some real work done. After setting up those steps, I prompted: okay, let's get a template. I have spent a lot of time writing CloudFormation, so I was able to review this with some amount of accuracy. It picked up the right transform for SAM (the Serverless Application Model that AWS uses), used the correct serverless function types, gave the functions pretty good names, set up the appropriate event triggers, and gave us something very close to a valid template: good environment variable names for the functions to introduce our queue and table identifiers, and correctly structured invocation triggers on the functions.

There were a couple of things that were undesirable for me, that I needed to change or didn't want done that way. It went a little off-script on some of the DynamoDB table definition and added some global secondary indexes. The first function (actually both of the functions, circled there in red) was referencing an incorrect attribute for the queue: from previous experience I know the SQS SendMessage operation uses a queue URL rather than an ARN, and it repeated the same mistake on the second function. The other thing I recognized was that there weren't any permissions tied to these functions, which would have prohibited them from actually interacting with that queue and that DynamoDB table. I have seen worse; I've probably written much worse than this. I know CloudFormation well enough to have written this myself, but it was a fair bit faster for me to review and fix these kinds of changes than it would have been to write it from scratch, even working from example code or documentation. If you consider accuracy to be the number of lines that didn't require modification over the total number of lines, we're probably pushing the high 90s percent here.

So at this point I needed to get it fixed, closer to something that would work. The options: either I fix it myself, or I ask GPT to. In this case I gave GPT a shot at fixing it. With a little bit of direction we can get the thread going in the right direction; here I reminded it to grab the correct attribute and set the policy, and I got something back that passed my sniff test, although I hadn't actually run it through a linter or attempted to deploy it. That was the next step.
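For context on that queue-attribute mistake: IAM policies and event source mappings reference the queue's ARN, but the SQS data-plane API addresses queues by URL. A minimal boto3 sketch (the queue name here is a stand-in):

```python
import boto3

sqs = boto3.client("sqs")

# SendMessage addresses the queue by URL, not ARN. (In CloudFormation,
# Ref on an SQS queue resource returns the URL; the ARN comes from GetAtt.)
queue_url = sqs.get_queue_url(QueueName="scan-requests")["QueueUrl"]

sqs.send_message(
    QueueUrl=queue_url,  # passing the ARN here is rejected
    MessageBody='{"scan_id": "3f2a9c1e", "target_ip": "203.0.113.10"}',
)
```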

To make deployment easy and repeatable, I wanted to put a Makefile together. With a relatively small prompt, "give me a Makefile," I got a functional one back: very simple, but it made deployment very easy. At this point, though, I just had a template in one prompt response and a Makefile in another; I needed a working directory that contained those files. Kind of a pain, but again, we can cheat: I just asked for a small bash script to create the appropriate folder structure for the project, almost a one-liner. I ran that and got a directory structure where I could take the outputs of the GPT responses, merge them into my local directory, and be pretty much ready to go.
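The talk used a near-one-liner bash script for that scaffolding; a rough Python equivalent looks like this (the layout below is hypothetical, not necessarily the repo's actual structure):

```python
from pathlib import Path

# Hypothetical SAM project layout; adjust to match your template's CodeUri values.
dirs = ["src/scan_request_handler", "src/port_scanner", "tests"]
files = ["src/scan_request_handler/app.py", "src/port_scanner/app.py",
         "template.yaml", "Makefile"]

for d in dirs:
    Path(d).mkdir(parents=True, exist_ok=True)  # one dir per Lambda package
for f in files:
    Path(f).touch()  # empty placeholders to merge GPT output into
```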

Now that we had a directory structure and knew where the Lambda deployment packages would live, we could get the CodeUri values set correctly; previously in the template they were just set to a dummy dot, and they needed to point at the appropriate deployment packages. In retrospect, if I had generated the directory structure before creating the CloudFormation template, this step might have been avoidable: the directory context would already have been in the thread, and those CodeUris might have been correct from the get-go. There are things like this I realize once in a while: hey, next time I do this, I could do it better.

Before deploying the template, I decided to be a little more particular: rather than using inline policies, I wanted to use the SAM policy templates, which make the template a fair bit cleaner and also do a decent job of maintaining least privilege. And it really struggled with this. I don't know if it's because there are relatively few samples of these policy templates in the training data compared to normal AWS managed policies, but even after a few prompts I couldn't get it to work correctly. At this point I decided to apply something I learned from Hitch, which is 90/10: you let ChatGPT bring things 90 percent of the way there, and then you're going to have to carry them the last 10 percent yourself. I've heard this from a lot of people who have taken similar approaches: even with good prompts and good context-setting, the raw responses can get you pretty far, but you should pretty much expect to finish things off yourself. In this case I made the policy changes to the template in my local version, removed the global secondary index that I didn't really want in the first place, and put the whole template back into the thread. It summarized my changes back to me and then made some comments about the relationship between the Lambda function timeout and the queue's visibility timeout that I knew were incorrect.

It started getting a bit hysterical. While the thing it flagged is a common pitfall for this kind of setup, it had the numbers backwards: the queue's visibility timeout needs to be at least as long as the function's timeout, not the other way around. So I just called its bluff and moved on. I have two four-year-olds; I know arguing doesn't get you very far, and I wasn't going to try it here. Don't try to convince it.
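For what it's worth, the correct relationship is easy to verify directly. A hedged boto3 check (function and queue names assumed):

```python
import boto3

lambda_client = boto3.client("lambda")
sqs = boto3.client("sqs")

# Hypothetical names; substitute your stack's actual function and queue.
fn_timeout = lambda_client.get_function_configuration(
    FunctionName="PortScannerFunction")["Timeout"]

queue_url = sqs.get_queue_url(QueueName="scan-requests")["QueueUrl"]
visibility = int(sqs.get_queue_attributes(
    QueueUrl=queue_url,
    AttributeNames=["VisibilityTimeout"])["Attributes"]["VisibilityTimeout"])

# Visibility timeout must be >= the function timeout; otherwise SQS can
# redeliver a message while a scan for it is still running.
assert visibility >= fn_timeout, "queue may redeliver messages mid-scan"
```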

So at this point I had what I believed to be valid CloudFormation, a directory structure with some empty Lambda deployment packages, and a Makefile. Time to test it out. And it worked. What if it hadn't? The options are: I make the fixes myself and iterate until it deploys correctly, then provide those updates back to the thread, which is just my normal way of doing things with the additional step of feeding the result back; or, if I know it requires broader changes, I bring it back into GPT and make the tweaks via the prompt. That's a decision tree you figure out how to navigate on your own at each step: if something isn't working right, do I want to fix this myself, or do I want to bring it back to GPT and have it do a larger refactor?

Moving on to the Lambda functions. I had previously set the context on what I needed the functions to do, with some iterative prompts to get a first pass and confirm the pattern was correct. I then made some requests for specific optimization tweaks around SQS message batching, and asked for some formatting and structure improvements. What's shown here is just one function definition and one of the classes it generated, and it looked pretty dang good to me. This is the kind of thing where contributing some context from a function you have written into the thread pays off: it'll pick up on the way you like your docstrings formatted, that you like type hinting, that you like comments done this way and your code structured that way, and I found that really does help produce code that's stylistically similar and cohesive with the stuff you've written yourself. One pleasant surprise was that it included the name of the Lambda function in the message it was publishing to the queue. That can be very helpful when you have multiple senders publishing to the same queue, because if one publisher is modified down the road and starts sending malformed events, having an attribute that tells you where an event came from is super useful. It's kind of fun to see best practices sneak in unbidden like this.
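Putting those two details together, SQS batching and stamping the sender on each message, a condensed sketch of what such a scan-request handler might look like (not the repo's exact code; the event fields and QUEUE_URL variable name are assumptions):

```python
import json
import os

import boto3

sqs = boto3.client("sqs")
QUEUE_URL = os.environ["QUEUE_URL"]  # assumed to be injected by the template
FUNCTION_NAME = os.environ.get("AWS_LAMBDA_FUNCTION_NAME", "local")

def handler(event, context):
    """Fan a scan request out to the queue, one message per target IP."""
    entries = [
        {
            "Id": str(i),
            "MessageBody": json.dumps({
                "scan_id": event["scan_id"],
                "target_ip": ip,
                "sender": FUNCTION_NAME,  # identifies the publisher
            }),
        }
        for i, ip in enumerate(event["targets"])
    ]
    # SendMessageBatch accepts at most 10 entries per call, so chunk.
    for start in range(0, len(entries), 10):
        sqs.send_message_batch(QueueUrl=QUEUE_URL,
                               Entries=entries[start:start + 10])
```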

yeah so after this was produced I put this into my local version and then I read did a redeploy of the Sam application um then I needed to invoke it I needed a test test early test often um so I requested a CLI command to invoke the function it somewhat naively chose a few private IP ranges which when scanning from Lambda might give you unexpected results but since we're not actually doing scanning at this point in time it didn't really matter I didn't really care and this one took me by surprise and that it worked almost right out of the gate it didn't take really many iterations on this one after I had like done the sniff

test manual review in the from the responses which is a pretty rare occurrence I checked the logs of the Lambda function check to see if there were messages in the queue sampled one message out of there to make sure that the format looked great and everything was looking pretty good stopping frequently to do this kind of validation really helpful and it's like a triumphant yes I got it it's working it's not just giving me garbage kind of moment um so at this point I felt like I had established enough context in the thread that I didn't have to be explicit about instructing step step after step so I basically just said okay you're the boss
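The invoke step itself was an AWS CLI one-liner in the talk; the boto3 equivalent is roughly this (the payload fields are my guesses at the handler's event shape):

```python
import json

import boto3

lambda_client = boto3.client("lambda")

# Synchronous test invoke; function name and payload shape are assumptions.
resp = lambda_client.invoke(
    FunctionName="ScanRequestHandler",
    Payload=json.dumps({"scan_id": "test-1",
                        "targets": ["192.0.2.10", "192.0.2.11"]}),
)
print(resp["StatusCode"], resp["Payload"].read())
```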

At this point I felt I had established enough context in the thread that I didn't have to be explicit about instructing step after step, so I basically just said: okay, you're the boss, what do you think we should do next? And it said, well, let's move on to the next function, which is what I would have proposed. It suggested using Scapy, which has a lot more flexibility but also more dependency management than I wanted to deal with. I was just looking to get port state; I didn't need the flexibility of Scapy here. If I wanted a more robust, feature-full version of this, then yes, having some Scapy capabilities would be great, but for now I just told it to use the socket library. We got a function that attempts to create a socket connection to a destination IP and port; it deems the port open if that call succeeds, or closed if the connection times out or raises an exception. The DynamoDB interaction looked correct, but I noticed: hey, you're only scanning port 80, and we want more visibility than that. Calling that out, I told it to grab the top 100 or so most common ports we're likely to encounter on the internet, and it generated something very similar to the nmap top 100 and updated the function to iterate over that list.
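The socket-based check it produced follows a well-worn pattern. A compact sketch of the same idea (the ports list is a sample slice, not the full top-100):

```python
import socket

def port_is_open(ip: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connect to ip:port succeeds within the timeout."""
    try:
        # create_connection bundles socket creation, connect, and timeout.
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refusals, unreachable hosts
        return False

# A slice of a top-ports list; the generated one resembled nmap's top 100.
TOP_PORTS = [21, 22, 23, 25, 53, 80, 110, 143, 443, 445, 3306, 3389, 8080]
open_ports = [p for p in TOP_PORTS if port_is_open("203.0.113.10", p)]
```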

At that point I ran another make deploy, modified the invocation myself this time (not the function, but the AWS CLI call) to target a few IPs that I knew were hosting services on the internet, and ran the scan request handler again. I was very pleased to see results appear in the DynamoDB table, although they took longer to get there than I would have wanted: almost three minutes. Looking back at the function, I realized it was because the scan was running serially with a two-second timeout. One hundred ports times a two-second timeout gets you right around those three minutes, which is what CloudWatch was showing for the function duration. And here on the bottom, another super exciting moment: I have my DynamoDB table, I have scan IDs, I have IP addresses, I have ports that line up with what I would expect to be running on those IPs. Validation is great.

So let's speed it up. I started thinking about ways I could tell it to go faster, and then thought: what if I just tell it "go faster" and see what it gives back? It proposed using the asyncio library and splitting up the Lambda function a little more so that it could asynchronously invoke scan_port. Beautiful. With this it brought our scan times down to about two to three seconds per host, rather than the 200 seconds we were seeing before.
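The asyncio rewrite is what buys the speedup: instead of 100 sequential connects, all ports are attempted concurrently, so the slowest timeout bounds the whole host. A hedged sketch of the pattern (not the repo's exact code):

```python
import asyncio

async def scan_port(ip: str, port: int, timeout: float = 2.0) -> bool:
    try:
        # open_connection resolves and connects; wait_for enforces the timeout.
        _, writer = await asyncio.wait_for(
            asyncio.open_connection(ip, port), timeout)
        writer.close()
        await writer.wait_closed()
        return True
    except (asyncio.TimeoutError, OSError):
        return False

async def scan_host(ip: str, ports: list[int]) -> list[int]:
    # All ports probed concurrently: total time is about one timeout,
    # not ports x timeout.
    results = await asyncio.gather(*(scan_port(ip, p) for p in ports))
    return [p for p, is_open in zip(ports, results) if is_open]

# Inside a synchronous Lambda handler you'd bridge with asyncio.run:
# open_ports = asyncio.run(scan_host(msg["target_ip"], TOP_PORTS))
```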

So: make deploy, retest. That made me curious, because we had some performance goals up front; I'd asked that this be highly scalable and pretty performant, so I asked how many hosts we'd be able to get through in an hour. That did not go well. There are a few things that come into play when determining event throughput in Lambda off of SQS: hot and cold starts, Lambda scaling behavior off of SQS, maximum concurrency. I guess that just proved to be too much for a language model to get through. It first gave me the number of scans a single Lambda instance would get through, about 1,800 in an hour. When I said that wasn't right, that we'd have a thousand concurrent instances, it gave me: oh, right, right, five Lambdas will get you through seventy-two thousand... or 7,200... still not right... then a hundred. It's funny; I've noticed several times that while GPT seems to speak most programming languages very well, it is very bad at math. Bringing this into Excel, I calculated that it should be able to get through about 1.5 million host scans per hour. When I threatened to trade GPT in for a TI-89, it didn't like that very much and responded very snidely. I've seen the "as an AI language model" response many times, and I don't think I've ever before been left thinking: okay, you got me there, GPT, good job. Touché.
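For the record, the arithmetic it kept fumbling is simple enough to write down (numbers from the talk):

```python
# Back-of-the-envelope throughput, using the talk's rough figures.
serial_per_host = 100 * 2.0   # 100 ports x 2 s timeout = 200 s/host (serial)
async_per_host = 2.4          # ~2-3 s/host after the asyncio rewrite

per_instance_hourly = 3600 / async_per_host   # ~1,500 hosts/hour/instance
concurrency = 1000                            # concurrent Lambdas, per the talk
print(f"{per_instance_hourly * concurrency:,.0f} host scans/hour")  # ~1,500,000
```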

So we have a working system, reasonably well documented, scanning pretty quickly against targets. Let's get some docs put together. I asked for a README for the project in good Markdown. You do end up with chopped-up Markdown through the GPT UI; this is where a direct plugin into your IDE would make things a lot cleaner. But pulling that out, I got some very legible, pretty accurate README docs to add to the repo.

Test generation was also something that proved pretty difficult, particularly in Python. It put together really good ideas for test cases, but when attempting to actually generate the tests, the provided output was pretty far out there and missing some important parts. I encountered broken test fixtures, incorrect environment variable setup for the Lambdas, and conflicting parameters between the pytest.ini and the pytest flags. After about 15 minutes of trying to wrestle anything useful out of it, I took a different approach: create a new thread narrowly scoped to the purpose of getting a single unit test for a single function. Sometimes starting a fresh thread erases a lot of confusion and muddled context, and you get better responses right out of the gate. Switching to a different model like GPT-4, which does a better job of handling some kinds of ambiguity, can also give you better results, although there is a real trade-off in the slower response times. I didn't need to establish the full context of the project, just the directory structure and the single function under test. I got the response back, reviewed it, and was much closer to my 90-95 percent accuracy; there were only one or two changes I needed to make to get a functioning test. A quick update to the Makefile to run pytest, and at this point we can just run make tests and we've got successfully passing tests.
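Scoping a fresh thread to one test for one function is a pattern worth copying by hand too. A minimal pytest sketch for the socket-based check (assuming the port_is_open helper sketched earlier; the import path is hypothetical):

```python
import socket
from unittest import mock

from port_scanner.app import port_is_open  # hypothetical module path

def test_open_port_returns_true():
    # Patch out the network; a non-raising create_connection means "open".
    with mock.patch("socket.create_connection"):
        assert port_is_open("192.0.2.10", 80) is True

def test_timed_out_port_returns_false():
    with mock.patch("socket.create_connection", side_effect=socket.timeout):
        assert port_is_open("192.0.2.10", 81) is False
```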

I asked for a couple more tests, and again, GPT being very bad at math, it thought an appropriate test case would be a /8. I'm not sure how long that test would have taken to run, but I think even with a locally mocked SQS it would have taken a while to get through 2^24 IPs. In the now-familiar fashion, I just tweaked it myself and provided the result back to the thread to maintain context.

To make use of this tool a little more straightforward, I went back to the original thread and asked for a CLI utility to invoke the Lambda. I won't go into too much detail here, but in about ten minutes I had a working CLI that would discover the Lambda function name by describing the CloudFormation stack, format appropriate invocation bodies to run new scans, and also retrieve, format, and print back the scan results from the DynamoDB table, with an additional warning if there were scans still in process.
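That stack-discovery trick is worth lifting: rather than hard-coding function names, the CLI asks CloudFormation for them. A hedged sketch of the idea (stack name and logical ID are assumptions):

```python
import boto3

cfn = boto3.client("cloudformation")

def find_function_name(stack_name: str, logical_id: str) -> str:
    """Resolve a Lambda's physical name from its CloudFormation logical ID."""
    for res in cfn.describe_stack_resources(
            StackName=stack_name)["StackResources"]:
        if (res["ResourceType"] == "AWS::Lambda::Function"
                and res["LogicalResourceId"] == logical_id):
            return res["PhysicalResourceId"]
    raise LookupError(f"{logical_id} not found in {stack_name}")

# Hypothetical identifiers; the repo's actual names may differ.
fn = find_function_name("gpt-port-scanner", "ScanRequestHandler")
```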

Now, this is not my typical approach; this talk was a demonstration of using it across many different parts of a project, and that's not how I usually do my day-to-day work with it, especially since I'm often working off an existing code base. I relied on GPT to generate probably 90 to 95 percent of the contents of what's in this repo; in my day-to-day, that number is probably closer to 25 to 40 percent. I think as I get more familiar with ways to integrate this into the IDE and my normal workflow, instead of copy-pasting back and forth with GPT, that number could go up. I found myself repeating the same patterns: first setting context with milestones, then looping through cycles of prompt, review, and validate with manual fixes, feeding context back into the thread where appropriate. I probably have to correct 75 percent of the first version of anything GPT gives me.

So getting to a point where you can sanity-check the project very quickly is very important, as is being able to validate again after each change you make. I've used GPT to assist with Go development, which I think actually works better out of the box than Python, because you can validate a lot of syntax-level things much faster; that's harder to do in Python without really strict type checking. There have been times I've gotten into things a bit beyond my capability to review, and that's a danger zone. There's a reason this project does not have an actual UI: GPT probably could have gotten me 90 percent of the way there, but then I would have been banging my head on whatever front-end language it had suckered me into using for the last 10 percent.

So it speeds me up in places where I'm familiar enough to confidently call its bluffs. If I don't have that confidence, I'm usually better off with Stack Overflow or finding samples from docs or other projects. And one last thing: don't waste time trying to get GPT to give you something 100 percent accurate. If you find yourself thinking "I can do this faster myself," then just do it, and feed those fixes back.

Thanks for being here today. I've published the full finished project; the link is there (GitHub, scottjpack, GPT port scanner), along with the relevant GPT thread history, so if you want to see the full prompts and full responses, it's all available. I look forward to meeting folks through the rest of the conference; I'll be hanging out over here for a little while if anybody wants to chat. Thank you.