
Web Browser Automation: How To Be More Robot, Easily!

BSides Leeds · 27:23 · 240 views · Published 2023-07

Good luck to everybody. Welcome to Web Browser Automation, or how to use the internet like Johnny 5. This talk is aimed at people of all skill levels. We're going to be ramping people up from beginner level right the way through to being a black belt in laziness. Those who are already more fluent in automation, just bear with us, and we'll get everybody up to speed as quickly as possible.

Getting the boring stuff out of the way: my name's Dan, or salty goat on Twitter. I've been a pen tester for about three and a half years. Before that I worked in gigs, live events and touring, and I identify as an amateur scripter. None of this stuff is particularly complicated. It's all stuff that I've figured out how to do on my own steam, messing around with stuff, and my ethos with this stuff is: if I can do it, so can you. And yeah, I identify as a very, very lazy person, and I'm always aspiring to get lazier.

A few basic concepts we're going to have to cover, like I said, to get everybody up to speed, the first one being the internet and stuff. From a very high level, this is how the internet works. We have this lonely pirate that's looking for some lost booty. They go to lostbooty.com, and the lostbooty.com server sends the website back to them. It's important to remember that when the server sends that website back, it doesn't actually look like the websites that we're used to seeing. It looks something like this. Now, if this is ugly and unreadable to you, that's because it is ugly and mostly unreadable, and human beings typically aren't supposed to deal with information in this way. This is called HTML, for those that have probably heard of it. What happens is this HTML is then taken and interpreted by something called a web browser, and then it looks something like this. Web browsers being things like Firefox, Chrome, Brave, Edge, whatever flavor you're using. And these are sort
of designed to be used by human beings, instead of HTML, which is mostly designed to be used by machines. But it is important to remember, and it's something that we are going to be making use of quite extensively a little bit later on in this talk, that at any point you can right-click on any sort of element within the web browser, go to Inspect, and then you can actually see the building blocks of that website in your browser. You'll get a little panel that pops up at the bottom and you can sort of dig into the HTML if you want to, and we're going to need to in a little while.

Another quick precursor I want to cover is something called OWASP Juice Shop. You can Google it for more information; we don't have to go too deep into it. It's basically just a vulnerable web application that is designed to be hosted locally on your own machine, and you can use it to practice hacking and things. We're going to be using that as the target for some of the scripts we're going to be making in a little while.

Automation is a basic concept and the other half of the title of this talk. I'm sure you're all familiar with the concept of automation, but pulling a quote from the hive mind of all human knowledge on the internet, automation is basically the concept of getting somebody or something else to do something that you don't want to do. Browser automation specifically comes in a bunch of different flavors. Most browsers now will have some form of native automation built in, whether it's Chrome autofill or Firefox learning your login details when you log in, that kind of thing. Then you can also get browser extensions, things like Axiom, which will sit in your browser: you can click record, do things on a website, click stop, and then click play and it'll replicate what you've just done. And you have desktop applications, things like Microsoft Power Automate, which is super
powerful. You can automate much more than web browsers with that, but it has a really powerful web browser corner to it that allows you to, again, automate anything you could possibly want to do in a web browser. Burp Suite macros: for the web app pen testers in the room, if you have a good understanding of them, you have a better understanding of them than I do. But again, they are super powerful and you can do all kinds of browser automation with those. And then lastly on the list is custom scripting, so using programming languages like Python or Ruby, and libraries in them like Selenium or Watir, to interact with web browsers from custom scripts. That's actually what we're going to be focusing on for the remainder of this talk: developing some custom scripts to automate some browser activity.

Basic concept: scripting. Again, quote pulled from the hive mind of all human knowledge on the internet: scripting is basically a way of interacting with a system and getting it to do some things that you don't want to do yourself.

So why might we want to script things and automate things? We might have a mundane, monotonous task that we have to do in web browsers every day, and we can write a script that does that for us, speeds up our day, and means that we can focus on other things, or, as I said at the start, we can aspire to be lazier and do nothing. We can use scripts to achieve the humanly difficult. The example here is for web app testers in the room that are exploiting things like access controls, and they want to scroll through invoice numbers to see if they can see invoices they're not supposed to see, that kind of thing. Doing that manually would be really boring and would take a lot of time, but we can write a script that cycles through those invoice numbers for us really quickly, and we can check whether we can see stuff that we're not supposed to, and
you can do that with pre-existing tools. But you will likely find, if you haven't already, that at some point these pre-existing tools might not be able to do exactly what you need them to do, or you might need extended functionality on what already exists, which is why it's a very, very powerful skill to be able to make your own custom tools. It was a real level-up moment for me as a pen tester when I gained the ability, mid-engagement, if I could see something that would potentially be a vulnerability but I couldn't get a proof of concept for it using Burp or something like that, to just write a script and prove that the vulnerability exists with something that I've made myself. So yeah, it's a really powerful skill to have.

So let's build a script. Today we're going to be working in a programming language called Ruby, because I like working in Ruby. It allows me to work quickly and get my ideas down quickly, and it really irritates Python fanboys when you say, "Oh yeah, I like to work in Ruby." "Python's better." It's like, yeah, maybe it is, but whatever. You can do this in Python, Ruby, Golang, pretty much any flavor of programming language will have some kind of library or module that allows you to easily interact with web browsers, but today we're using Ruby.

For our first script we're going to start off nice and easy. We're going to build a script that allows us to log into a web application, in this case our OWASP Juice Shop. Before I start writing any scripts, I like to take a minute to conceptualize what it is that I'm trying to achieve and break it down into smaller building blocks. In this case, when we're logging into a web application, we need three components. We need to be able to browse to the web application, so that's the URL at the top. We need to input our credentials, which
is our username and password. And then we need to log into the application. So the three goals for our script are to conduct those three steps.

Step one, nice and easy. Now, we are going to see some code here, so apologies if you get square eyes, but I promise you that this is easier than it looks; it looks more intimidating than it actually is. We'll break it down and you'll find that most of it is actually quite human-readable. The first two lines we don't have to dig into too much now, but they are pulling into our script some code that somebody else has written, because again we like being lazy, that allows us to easily write things into our script to interact with the web browser directly. So that's the first two lines. Line three, where it says firefox, Watir, blah blah blah, is basically initializing one of those scripts that other people have written, to say: right, whenever you see the word firefox, what we are actually saying is interact with this Firefox browser. And then line four, you can almost read it: firefox, go to the URL of that web application. So it is actually quite humanly readable and easy to understand. And that's step one; that takes us to the Juice Shop application.

The next step: we need to submit our credentials, which is a little bit more complicated, but again we can break it down and it's actually quite straightforward in concept. You remember I was saying earlier you can right-click anywhere on a web application and inspect, and it will show you the building blocks of that web application underneath it. What we've done here is right-clicked on the email field, and you can see at the bottom there it says input id equals email. So we know where our credentials need to go, or where our username needs to go. We're going to skip line five for a second, we'll come back to it, but on line six, where it says firefox text field where the ID
is email, set it to test@test.com. Then we can just do the exact same thing with the password field: id equals password, set password. And then we can do the exact same thing with the login button: ID login button, and then we just click it.

Just to quickly touch on line five: we're doing the same sort of concept, where it's a button that we're clicking, but instead of finding it by ID we're finding it by another attribute called aria-label. This stuff is really powerful. You can find any element on a page with any sort of identifier that exists within that HTML. We'll dig more into that in a little while, but that's just a welcome banner that pops up, and we're getting rid of it because it gets in the way of logging in.

So that's our first script done. Let's see it in action. It takes a few seconds for Firefox to wake up and do its thing. I think that video is running, yep. And when it does actually kick in, it all goes very quickly: it pops up, it puts in our username and password, and logs into the application in a second, without us actually doing anything or interacting with it.

So that's all great and life-changing, but what's next? What else can we do with that? We're going to set ourselves a few little challenges and progress forward into weaponizing that script. We're going to see how that script can make us a little bit lazier in our day-to-day, and we're also going to see how that script can speed things up and let us do things faster than any human could ever interact with a web application.

First, let's have a look at weaponizing the script. Again, it looks quite intimidating, but when we break it down, lines six to 11 there are literally just our login script again. The only difference is on lines nine and ten, where we're setting the email address and password: instead of setting those to fixed values, our email address and the password, we've got little placeholders at the end that say
username and pass. Now, those are being populated from line three, where it says username equals test@test.com, and the passwords are instead coming from line four, where it says password list, File open passwords.txt. So what we've got here is just a text file containing a bunch of passwords, which you can see on the right, where it says hello, BSides, Leeds, etc. Then for each of those passwords in that list, we're going to try and log into the web application with the username test@test.com. And then at line 12 we just have a little if statement, a little bit of logic that says: if we don't see on the page the text invalid email or password, then we can infer that those credentials are correct, and then it just puts those correct credentials to the screen.

So again, let's see the script in action. It takes a few seconds for Firefox to wake up and do its thing, but when it does, it will start cycling through those credentials and trying to log in. Now, if we have any scripters in the audience, they are probably screaming internally now, saying this is horrendously inefficient and it's going to take absolutely ages to do things this way, and they are absolutely correct. We're going to discuss different ways of doing this type of thing a little bit later on. It takes a while, and I want you to remember how long it's taking, because again we're going to address that in a little while. Eventually, maybe this login... there we go, and it puts the credentials that worked to the screen. So we've built a web application login brute forcer using browser automation. We've weaponized the script.

Next, we're going to look at lazifying the script. The challenge we've set is: say that you order a lot of juice from OWASP Juice Shop and you want to know when your latest order is due for delivery. This one's going to be a little bit more complicated, but again we can break it down conceptually and build the script around
that. So we start with, again, lines one to eight, which is literally just our login script that we've already built. Line 9 takes us to our order history page. If we go to the order history page in our Juice Shop, we can see all our orders here, and our latest order is at the top. To get access to the delivery information for that latest order, we click on that little button in the red square. When we go to the latest order information, what we can see in the middle is the information that we're trying to get at, but at the top the URL has a randomly generated ID number in it. So we can't say to our script, just go straight to this last order page, because it's going to change every time there's a new order.

So what can we do instead? Well, we've just clicked the button to get to this page, so we can just say to our script: go to that page, click the button, and get to this area. But it's a little bit complicated, because when we go to that page there are three buttons that all look the same, and if we inspect those buttons, in terms of HTML they look practically identical. There's no ID, there's no id equals button one, button two, button three, or anything like that. So we need to find a good way to track down that information on the page and locate where those buttons are.

To do that, you can build your own roadmap within the HTML on the page and say, okay, well, it's included in the body, and then it's included in this div, et cetera, et cetera. But we're trying to be lazy here, and that sounds like a lot of work. What we can do instead is right-click the element in the console there, go to Copy, and copy something called the XPath, which is just a pre-built roadmap to that element on the page. Then we take that, go back to our script, and paste it in on line 10, where it says firefox element xpath and then this big roadmap that looks like a bunch of jargon. Now, we don't need
to pay too much attention to that value, unless we have issues that we have to troubleshoot, but that's our roadmap to that button, pre-built for us. Nice and easy; we can continue to be lazy. Now we go back to our order history page, find the little bit of information there for the days until the latest delivery, copy the XPath for it, and put it back in the script. Nice and easy. The only difference is that on line 11 we've put that bit of information into a little placeholder called day count, and then on line 12 we just print to the screen when our latest order is due for delivery. But again, let's see it in action. There we go. So again, it's going to take a few seconds for Firefox to wake up, but when it does, it'll all happen really quickly: it's going to take us to the web application, it's going to log in, it's going to go to the order page, get the information, and we get to be lazy.

Next: anybody that has ever tried to buy a gig ticket online, or has ever tried buying any sort of hype clothing or anything like that, will have heard of the term bots. People are competing with bots online because they buy things faster than we can. With the skills that we've got now, we can build our own bots, and it's really straightforward. For this challenge, we're saying: apple juice is a hot commodity in the OWASP Juice Shop, it's just restocked, and we're going to buy it faster than anybody else can.

Again, the script looks big and intimidating, but when you actually go through it line by line, the first eight lines are again just our login script, and then line nine onwards is just us finding elements on the page by their XPath or some form of ID parameter, and putting that into the script in order. These scripts are conceptually designed to mimic human behavior. So when you're sitting down, you're thinking: right, okay, if I'm
gonna buy apple juice from this shop, I go to the apple juice on the page, I add it to the cart, then I go to checkout, and then I fill in my address details and my payment details. That's all each of these steps is doing: it's interacting with the browser in a way that a human would, but we're automating it so it can be done really quickly. However, a quick tip: when you are dealing with a lot of these XPaths, it's very difficult for humans to look at one and instinctively know what it is on the page. So it's really useful to comment your code. Do yourself a favor, and if you're passing the script to anybody else, do them a favor by putting comments in there as well.

But anyway, let's have a look at our script speedified. Once again, it takes a few seconds for Firefox to wake up. When it does, it really quickly logs us into Juice Shop, adds the apple juice to the cart, fills in the address details, picks payment options, and completes the order for us, as quick as that. And these things work exactly the same on any sort of live production website that you're out there using, trying to buy gig tickets or hype clothing or anything like that. And that's our first script speedified.

You can do a lot more with browser automation than just what we've seen here. As a quick demonstration of some of those things, I've written a script here that again starts off quite similar to our first login script, apart from line three, where you'll see that we've changed to Chrome from Firefox, just to show that these things work exactly the same in whatever flavor of browser you're using. What is quite interesting, though, is at the end of line three you can see where it says headless true. For all the scripts that we've seen so far, you actually see the browser opening up and things flying past on the screen. With
headless mode enabled everyt