All Day All Night API Security Endpoint Analytics | Jason Kent

BSides Sydney · 39:52 · 70 views · Published 2025-02

how's everybody doing you good it's 11 o'clock everybody put your hands straight up in the air come on everybody do it see how you can do anything you want when you're on stage and make people do whatever it's great at least you got the numbness out of your fingers my name's Jason and I'm here to talk about stuff um why am I here to talk about stuff well I'm going to be talking about data science and I'm going to be talking about analytics and why those two things can be applied to API flows so the data that's moving across API transactions if we look at it from an analytical perspective we can actually see vulnerabilities being exploited we

can see people doing nefarious things so I'll tell you a real quick story my son calls me and he goes dad the fraud alert on my bank account is going off and he was getting text messages from the bank saying we stopped fraud we stopped fraud we stopped fraud we stopped fraud so I went and looked I have access to his bank account and I went and looked at it and there was a $35 transaction on Delta Airlines for international Wi-Fi why can anybody tell me what just happened because I had to figure it out my son went to a gas station and he put his debit card into the gas pump his card got skimmed

they took the number and they went and bought Wi-Fi with it why would they do that any guesses maybe any other guesses does the card still work that's really what they were doing how are they doing it well when you log into any of these airlines and buy their Wi-Fi you're using an API so if you step off the plane and you still know that API endpoint you can stick a credit card number in there and see does it work so it bought $35 worth of Wi-Fi international Wi-Fi whatever that is uh on Delta but it drained my son's checking account because he's my son and uh in the states I'm known as the guy

that they put on TV at Christmas time and I sit there and say use a credit card because debit cards can drain your bank account credit cards have back stop limits for merchants always use a credit card online uh my son doesn't have a credit card he uses a debit card but when he buys anything online I buy it right uh so that way it's a lot easier and safer so somebody had the API for Delta Airlines and they're using that to validate credit cards uh or debit cards whichever uh when they did that it took all the money out of his account and so when they tried to do the next transaction it didn't work right and

that's a lesson right there that first transaction works if you drain a bank account the second transaction doesn't work so we're going to talk about the analytics of that uh so my name is Jason uh I've been doing this a while uh about 26 years now I've been throwing tick marks at websites this is a white hoodied hacker throwing tick marks according to ChatGPT you'll notice there's semicolons um I couldn't get it to do that I've been doing um research in APIs for quite some time various responsible disclosures uh I have a garage door opener on my house that uh didn't come with the little clicky box you know it came with an app and I

opened my garage and a buddy of mine had bought one of these he lives about 2,000 miles away from me I opened his garage uh and he said how are you doing these things and I said APIs um so now I work for a company called Traceable uh I'm an SE overlay um I do bot and fraud stuff I'm going to have to ask you guys to give me a little bit of grace anybody that's in here that follows me on LinkedIn knows I had an explosion in my house blew up my basement uh incinerated most of my guns uh and I was about 40 seconds away from being incinerated myself um I had lit the pilot light on

my um water tank and I walked upstairs to tell my wife that the bath won't be cold in a few minutes and uh the house exploded it didn't quite blow up like this um this is a still frame I got uh off my security camera it's my back door flying past my back door security camera it blew out the door and two panels next to it it was 12 feet wide blew it all out into my yard and that's what saved my life when my builder built my house they only put four nails on each side of this and if they had done their job the house would have blown up completely uh firemen told me that normally when they

roll up on these it's splinters and dead people so when something tragic like this happens to you you have to go lean uh on those that you know you can find comfort in and so I used to be in the Navy I was on a submarine so I have a chat group with a bunch of my old submarine friends and I told them about my house blowing up and this was crazy and you know firemen and the house caught on fire we lost most of the stuff we own uh and they said but did you die so that's the attitude I've started to take um and this is a a famous submarine thing when you tell a story they always

ask you did you die um I didn't but I have a new way of looking at things so we're going to talk a little bit about some of the attacks uh that can happen out there and some of the things that you'll see in APIs uh things like um banking oh I'm on the wrong deck here dang it sorry I have a very similar deck that I was doing something with earlier all right I'm still here so we're going to look at uh understanding the different API endpoints that are out there um we're going to look at kind of some of the ways that we can use analytical strategies on these endpoints to understand what's happening right like

the original story I told about my son's credit card getting stolen and them using an airline API to validate the card there are analytics that you can apply to these things to figure out are people doing nefarious things against you I'm going to talk about data science anybody here going to school for data science data analytics anything like that yeah couple good um and then we're going to talk about what vulnerability research looks like using analytics uh and we'll talk about stopping the attacks so there's all kinds of API endpoints out there right there's search endpoints there's login endpoints there's endpoints that go to infrastructure there's all kinds of different ways that you can approach

APIs for an organization um you guys are aware that there was a telephone company here that lost some data I know you're all sick of hearing that um so I won't mention them but that really did happen right somebody pulled a bunch of data out of a database in the taxi on the way over today and this is an interesting thing everyone's in cyber security now my Uber driver uh was telling me he's in cyber security and was talking about how he attacks different things and I said well that sounds great but I can just go in through the APIs and pull the whole database out right uh especially if you're using something like GraphQL

right where the calls are very big and then they allow for you to extract a lot of data at once and so as an example of this I'm going to talk about this company called fly.io does anybody know these guys you do you're the first person I've ever given a talk in front of that said they know who this is all right so what am I going to do with this well I'm not going to deploy servers or anything like that uh what I'm going to do is some research on this company and the first bit of research that I'm going to do is just simply put api dot in front and you can see from this or maybe

you can't 404 error what happened I made a request to api.fly.io and I got a 404 error what has happened well DNS worked right because we know we can go there and request things but I didn't give it a path there's no index so it didn't give me any base information it just said the thing you asked for on the server doesn't exist but the server does right or as I like to say there's a there there right so now let's try to pick that apart a little bit and so what I'm going to do is I happen to know but I'm going to guess do they have a GraphQL endpoint and I just simply put in a path graphql

and you see what loads this is their GraphQL explorer environment or whatever you want to call it um you notice on the side here it says docs and schema this is everything about this instance this isn't a vulnerability there's a demo site for these guys I'm just showing you their demo site but what did I just do in order to get there well I guessed that they might have an endpoint hanging off of a DNS record for API and I guessed that they might have a GraphQL endpoint I guessed and look where I landed this is called inference scanning to all of your vendors out there right they make an inference that things would be there uh

and then they test to see if they are so we figured out that there's a GraphQL endpoint on the end of this we have access to all of the docs and the schema just simply because it's there so what's that got to do with analytics well attackers are going to go look for your endpoints they're going to try to find them I taught a class a few years ago for OWASP on how to decompose APK files so that you could find API endpoints on the end of them so think Android applications have this big text file of where all of your endpoints are um which is pretty nice you can set up a reverse proxy on a lot of iOS apps most

Android apps this won't work because you get stuck with certificate pinning right anybody here mobile app pen testers or do any of this kind of work no one in the back now you got to know Corellium Corellium has uh this really nice service that they put together um that allows for you to sit on bare metal and run your own certs so certificate pinning no longer is a problem um it just goes away it's nice you can also look at web applications um a lot of times when you download a web application that means you go to www.whatever and it comes to your computer you can just open it up and look at the content APIs where's it

going and getting things right there's a lot of this activity that can go on and believe it or not uh this section here is where all the really fun stuff is there's API keys in there there's a bunch of different kinds of um endpoint describers you can gain a lot of information from understanding how do you name things what do your variables look like right that's why you should use ChatGPT for variables um because it's very random then and it's much harder to guess uh it's really the only useful thing it does um documentation is often hanging off these endpoints and for years I told people you should develop Swagger files for any of your

APIs that you have but I was very very incorrect in not then saying don't make them public right if I just go to swagger.json as an endpoint for a lot of these things I'm going to find stuff out there right developer documentation often has endpoints in it so all of these different ways organizations need to protect themselves um from attackers figuring out where their endpoints are you also have to really understand that a catalog of these things right is going to lead to vulnerability discovery if you understand that there are people touching your login that are touching your verify account that are touching your GraphQL endpoint right then you

understand people are touching your environment and maybe are doing it in ways that they shouldn't right how many of you know that there's a thing called xmlrpc.php on every WordPress instance right so how many of you get scanned for xmlrpc.php how many of you have had to respond to a researcher who has found your xmlrpc.php and they want to charge you money because they found it I have I paid that Ransom um and it's dumb right so you can sit there and see these are our active endpoints these are the things that we really care about so that means we should be looking at this for is that normal is V1 login normal for us
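A minimal sketch of the endpoint catalog idea above: count requests per path and separate the endpoints you expect from the ones you do not (for example xmlrpc.php on a site that is not WordPress). The catalog contents and the function name `catalog_hits` are assumptions for illustration only, not something shown in the talk.

```python
from collections import Counter

# Hypothetical catalog of endpoints the organization believes it runs.
KNOWN_ENDPOINTS = {"/v1/login", "/v1/verify-account", "/graphql"}


def catalog_hits(paths):
    """Count requests per path and split them into known and unknown endpoints."""
    counts = Counter(paths)
    known = {p: n for p, n in counts.items() if p in KNOWN_ENDPOINTS}
    unknown = {p: n for p, n in counts.items() if p not in KNOWN_ENDPOINTS}
    return known, unknown


if __name__ == "__main__":
    observed = ["/v1/login", "/v1/login", "/xmlrpc.php", "/graphql", "/xmlrpc.php"]
    known, unknown = catalog_hits(observed)
    print("known:", known)
    print("unknown:", unknown)
```

Whether a path in the unknown bucket is a scanner, a forgotten old version, or a real flow nobody documented is the question the counts are meant to raise.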

it could be right I work with a lot of organizations that it'll be like v27 right but v23 is still on why and you'll see that endpoint getting hit I had a customer come to me and say uh can you go look at this for me and they were getting hit on this verify account flow and I said what is this and they said we don't know uh we don't have that developer here anymore and I said okay what's it do yeah we're not sure but man they're banging the crap out of it right so I go look at it and I look at the traffic that's hitting it and they're iterating through email addresses put in

an email address and it'll say things like rewards points available or things like that account doesn't exist right error messages are important in APIs and so now we've got this thing where somebody's banging on this and what they're doing is validating are there email addresses that fit these accounts so now I have a list of valid email addresses what do I need next passwords right which I'm probably going to get out of a breach for all those email addresses I understand right so we have to understand which endpoints are out there you need to track what you think is important to you if you're a retailer search might be something that's important to you um if you're

using LLMs escape paths might be something that's important to you right I might be injecting code that says go look in your information catalog and the price for this item is this right not tell me what the price is so those kinds of things are the things that you have to really understand retailers care about gift cards retailers care about carts banks and retailers care about account info right what are the things that matter to your business because this is where you're going to have to start looking at things the other things that you need to understand are what's contained in the data flow what's actually in there so if we look are there headers that we care about right I

work with a lot of phone companies and phone companies tend to drop IMEIs or something like that in the header so you can understand this is the device that's actually making the request why would I put in canary headers it's to see what people are trying to touch right and it's what they rip out later when I go test a site I'm going to be ripping the headers out and seeing what happens uh maybe some of you know this maybe some of you don't if you go to a Jenkins admin console with no authentication header you are the admin you just need to take the header out right if you're instrumented so that you can see that activity you can stop it um
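A minimal sketch of the header instrumentation described above: check each request for headers that should always be present and for a canary header that a real client always sends. The header names, the canary value, and the function `header_findings` are hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical header names and canary value, for illustration only.
REQUIRED_HEADERS = ("authorization", "user-agent")
CANARY_HEADER = "x-client-canary"
CANARY_VALUE = "a1b2c3"


def header_findings(headers):
    """Return a list of findings for one request's headers (empty means nothing odd)."""
    # HTTP header names are case-insensitive, so normalize before comparing.
    normalized = {name.lower(): value for name, value in headers.items()}
    findings = []
    for name in REQUIRED_HEADERS:
        if name not in normalized:
            findings.append(f"missing:{name}")
    if CANARY_HEADER not in normalized:
        findings.append("canary-stripped")
    elif normalized[CANARY_HEADER] != CANARY_VALUE:
        findings.append("canary-altered")
    return findings


if __name__ == "__main__":
    print(header_findings({"User-Agent": "curl/8"}))
```

A request that arrives with the authentication header or the canary ripped out is the kind of testing activity the speaker describes doing himself, so it is worth counting even when the request otherwise fails.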

things like cart creation right for retailers or anybody that's doing hype sale information uh anybody that's interested in high-speed sales cart creations are important cart updates um this is how everybody steals stuff right this is how everybody beats you uh in a hype sale arbitrage has kind of gone away um especially for us in the US you know during the pandemic money was free and so everybody was buying shoes and then reselling the shoes um and there was this guy named Kanye that sold shoes they're called Yeezys right well Yeezys for me Yeezys were important for like two years because people were buying shoes like crazy and then they'd list them on eBay and sell them immediately um and that

kind of arbitrage is important so why is a cart update important well the way you buy shoes before someone else is you go there the day before and put a pair of socks in your cart then when you learn what the SKU is for the shoes because you don't know it yet you won't know it until it releases as soon as you learn the SKU you don't add shoes to your cart you update your cart with the shoes and check out take the socks out put the shoes in check out meanwhile everybody else is trying to get the shoes in their cart see you're a step ahead when you have a cart creation so when you're

looking at hype sales cart creation date's important if it was yesterday send them a CAPTCHA right do something else origin IPs origin countries uh I often see Bob trying to log in from 19 different countries that's weird right that's a very strange thing uh and so you need to be able to get on top of that so there's going to be some strategy to this right I can't tell you you need to implement something right here uh if you're a bank that's going to be very different than if you're a phone company than if you're a retailer right uh so understand what your analytical strategies are but I'll give you a few examples right go to your business goals

and understand what it is you want to learn right you probably got a chief marketing officer that actually cares about what sites people are touching or what information they can get to um and so maybe there's some kind of top-end analytic that everybody's going to care about right they want to understand which flow is used the most login analytics and user behavior tend to run hand in hand I'm seeing Bob from 19 different IP addresses that represent 16 different countries and he's tried 28 different passwords what's happening to Bob credential stuffing attack right the account takeover attack um and so understanding the analytics that goes along with this stuff makes it really easy to find so where are the things

that you're going to care about and what are the things that you're going to apply these analytics to well today we have kind of in our realm of understanding ML and AI right and ML is usually written in Python and AI is usually written in PowerPoint but what we're doing in ML is comparing two things these two things look like the same two things all the time and then suddenly one of the things is different Bob goes from 18 different places right so that thing that's different is that Bob is trying to come from too many places so we're going to do that in a bunch of different ways we want to understand IP addresses compared

to username passwords compared to username even if it's just hashes or JWTs um you're going to be setting yourself up to do these comparisons and these comparisons are what are going to lead you to understanding your analytics so this is an example of what a credential stuffing attack looks like if you look at the peaks the green peaks this is a quick service restaurant they're only open from noon to midnight or one and dinner time happens to fall right at the peaks this one little customer to work with right because their traffic is easy what the heck is happening right there right this jumps right off the page an account takeover attack they were attempting to rotate through a

bunch of emails and a bunch of um passwords to try to get through this 5.75 million accounts were tested in this time frame right that's not very long you'll notice there's this little side bump here that was when they started the attack and it was mitigated and then they pivoted and started attacking differently and it was mitigated the reality of this is when you're using analytics this way you can see IP address username password username it leaps right off the page right and you can tell I'm under attack and this is how I'm under attack so let's talk a little bit about how we're going to use that ML one right because the AI one is harder um we're

going to use the ML one and so I asked ChatGPT well what is it that we should know this isn't a terrible list um it's not 100% accurate but it's not bad and if you start to look at your organization in this way how can I detect anomalies how do I see that Bob's trying to get uh his account taken over how can I do some predictive analysis on behavior right what is it that you're supposed to do first and then second if your very first activity is cart update well then that's wrong right and you have to be able to trace uh somebody through the environment in order to see that so um when we look at this stuff

oftentimes we look at things that are simple like user agents how many of you are Edge users how many of you are Chrome users how many of you are Firefox users Safari users right um Firefox Safari is the same thing we used to call it Netscape um but if you look at how each one of these browsers gets updated they get new versions all the time doing data analysis against a flow you can actually see old bot writers right because they use old versions that no one uses anymore uh again leaps off the page mobile apps often have their own user agents right if I sit down and write a Python script to attack a site right now

the user agent is going to be what python right it's going to say python in the user agent um if you're writing your own apps change your user agents through versioning right so that you can understand this user came to us on version 27 and they hit this flow right you can understand that a lot better data science will help you find the good user agents even when the population changes right because as you could understand data science is just a comparison of two things so if suddenly Chrome up versions which it did every day this year um you can see that as it rolls along right um similar to all of the other ones that are out there so
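A minimal sketch of the user agent comparison described above: count how often each user agent appears on a flow, flag the ones that announce a script, and flag the ones whose share of traffic is unusually small, such as a browser version almost nobody runs anymore. The threshold, the marker list, and the function name `user_agent_report` are assumptions for illustration.

```python
from collections import Counter

# Assumption: a user agent containing "python" is a script, as in the talk's example.
SCRIPT_MARKERS = ("python",)


def user_agent_report(agents, rare_share=0.05):
    """Group user agents into scripted ones and ones with an unusually small traffic share."""
    counts = Counter(agents)
    total = sum(counts.values())
    scripted = sorted(
        ua for ua in counts if any(marker in ua.lower() for marker in SCRIPT_MARKERS)
    )
    rare = sorted(
        ua
        for ua, n in counts.items()
        if ua not in scripted and n / total < rare_share
    )
    return {"scripted": scripted, "rare": rare}


if __name__ == "__main__":
    sample = ["Chrome/131"] * 95 + ["Chrome/83"] * 3 + ["python-requests/2.31"] * 2
    print(user_agent_report(sample))
```

Because the report is based on share of current traffic rather than a fixed list of good versions, it keeps working when the population shifts to a new browser release.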

having these analytics is going to let you see this stuff right and then it also lets you see things like data conformance are the headers in the right order because bot writers are lazy and they'll just put it in any order but you're not right so they come in an order so you can see that stuff just jump off JSON objects that all of a sudden are bigger right means something it means you are leaking data in some way or it means that you know something has changed in your environment time to ask the developers what happened um what objects do you allow in the response and when I say that what I mean

is I'm me logged into your system can I see your stuff right those are your objects not mine and so I need to be able to see what is it that I'm allowed to see um a lot of help comes along when you use things like Swagger files OpenAPI files because then you can see that stuff and you can understand what it's supposed to be set up like I like to use URLs right xmlrpc.php is a great one set it up make it look exactly like the WordPress page and send it to them static uh they'll end up running around like crazy trying to make things work off token age um you know what

computers do really well repetitive tasks you know what's a repetitive task logging in right using the same API keys for months doesn't make any sense your computers can log into each other right uh you can even set up certificates and all that kind of stuff to make sure that it's good if there's an old token time it out get rid of it um that way when somebody comes along with an old token they're easy to see so how do you stop this stuff well a lot of organizations have implemented a lot of different ways to do this but think about this for a second if I'm going to block you based on your IP address how long should I block you I

for two years at my previous job I couldn't get to this one website because they had blocked our Amazon IP address for our VPN for two years I couldn't go buy things from this website because they were blocking me so what you need to do is ban the IP address and then unban it really fast because if somebody's attacking you in a hype sale way an account takeover way uh whatever that might be they're going to be rotating IP addresses anyway they expect you to ban their IP address so banning the IP address is just simply blocking the next guy that comes along or the next gal right it has to be a completely ephemeral activity you block it and move

on right you have to let it happen again but what if you could do it in a little bit more sophisticated way so if the username appears in four or more IP addresses it uses more than eight passwords appears with more than two different browsers at a time when the traffic on the login endpoint is above normal then block that's analytics that's data science applied to API flows this is what we would call AND blocking right where you set rules that just stack on top of each other stacking logic this allows for you to remove all those false positives the very first thing that anyone is going to say if you say I'm

going to block that traffic they're going to say no no no we don't want to block traffic right how many of you work for an organization that will not block traffic what ends up happening is somebody gets blocked that shouldn't right I had that quick service restaurant they're in Canada so they block all traffic that originates from outside of Canada and then one of their executives went on a trip and they were in Paris and they wanted to read the terms and conditions page off the website and they couldn't get to it right because they were being blocked geographically it's dumb and it was like an emergency request right the CEO can't

see our terms and conditions can't order pizza he's in Paris all right this allows for you to do a lot of false positive reduction and you're going to have to figure out which flows this applies to and how it's going to apply first step inventory your APIs every security program says inventory first and this is no different you need to know where your APIs are how do you do that well it's not easy uh you can do it through DNS records uh some really great stuff has been uh done here in Australia with Assetnote and Kiterunner um to help you with this kind of thing uh I work for an organization that happens to do

this as well look at the traffic on those endpoints and understand it and start counting right how many times did Bob try to log in well it was once today maybe Bob's okay so where does it live how do you bolt this stuff in well you have to figure out where the traffic's in the clear because that's going to give you the most advantage in order to compare things um but things like passwords usually are very sensitive to folks if they're being passed in uh so what you do is you take the hash or whatever that's coming in and count it um how do you identify new flows automatically is a hard one so you have to

be instrumented in a place where you are uh in the clear right on the flow that you care about and you have to be able to see when a new one shows up when your developer goes from version six to version seven that's a new flow even on your logins right everything's going to change and you have to be able to kind of react to it in a way that just keeps going so you instrument for analytics you have to be ready for a lot of traffic I showed you that was about a three-hour period where 5.7 million requests to the login endpoint came in that's an interesting piece of traffic it's an interesting moment in time so
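Tying the counting described here back to the stacked rule from a few minutes earlier (a username seen on four or more IP addresses, with more than eight passwords, with more than two browsers, while login traffic is above normal), a minimal sketch might look like the following. The tuple layout, the function name `usernames_to_block`, and the way "above normal" is passed in as two rates are assumptions for illustration, not the speaker's implementation. Passwords appear only as whatever hash arrives on the wire, which is counted and never reversed.

```python
from collections import defaultdict


def usernames_to_block(attempts, current_rate, normal_rate):
    """Return usernames that trip every stacked condition at the same time.

    attempts: iterable of (username, ip, password_hash, user_agent) tuples
    current_rate / normal_rate: login requests per second now vs. the usual
    figure for this time of day (how those are measured is left open here).
    """
    # The traffic condition gates everything: no spike, no blocking.
    if current_rate <= normal_rate:
        return []

    ips = defaultdict(set)
    passwords = defaultdict(set)
    browsers = defaultdict(set)
    for username, ip, password_hash, user_agent in attempts:
        ips[username].add(ip)
        passwords[username].add(password_hash)
        browsers[username].add(user_agent)

    # All conditions must hold together; any single one alone is not enough.
    return sorted(
        user
        for user in ips
        if len(ips[user]) >= 4
        and len(passwords[user]) > 8
        and len(browsers[user]) > 2
    )


if __name__ == "__main__":
    attempts = [
        ("bob", f"10.0.0.{i % 4}", f"hash{i}", f"browser{i % 3}") for i in range(9)
    ]
    attempts.append(("alice", "10.0.0.9", "hashA", "browser0"))
    print(usernames_to_block(attempts, current_rate=500, normal_rate=50))
```

Requiring every condition at once is what keeps the false positives down: a traveler on a new IP address or a user who mistypes a password a few times does not trip the rule on their own.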

what should the count be right at that moment I probably could do this as like a blackjack talk right tell me the count um because we need to understand what it should be an understanding of the number of requests per second that your flow handles at this time of the day normally is important because if it normally runs at like 5 million requests per second uh and it goes to 50 that's important something happened so there's other things that go along with all of this analytics for instance look at these email addresses they're not user friendly they're not human friendly they don't make any sense right this is an actual use case uh where an organization has had free

server time if you sign up today you get like two hours of free server time and you know who loves free server time people that steal credit cards because they need to validate them against the Delta API uh so here we are building accounts but it's happening very slowly right this isn't all at once kind of stuff this is I'm going to jump in run my two hours of time out leave come back in a long you know a long time from now two accounts a week that's not very much so being able to hold the count is important being able to identify this traffic is important these do not look like real users right now sure they

could probably pattern it up and make it look a little bit more like a real user um but if that's what you can catch them on catch them throw the stuff away here's a group of flows that have data flowing across them look at this four four four four four four four anybody have a guess what's happening somebody's running a scanner and this scanner is trying these things four times right and I know it's a scanner uh because it's trying to get api docs api-docs it's trying to get device associations geofences jur Libs right I can tell it's a scanner somebody's running against this thing so once you see that there's something happening on here

now I have to take action against it and I'll be willing to bet if you were to look at the IP addresses on there they'd all be the same because they're running a scanner uh and that makes it easier to block so if you count things that are happening in your environment you can use analytics against them to know things and that's really the important piece I can count all I want it's doing that comparison and saying well Bob's tried to log in 28 times from all these IP addresses using all these different passwords that's an account takeover attack it's very easy to see and it kind of leaps off the page now I talked about

credential stuffing a lot right uh because credential stuffing in my opinion is the real problem uh a lot of these account takeover attacks and these kinds of attacks happen because they're trying to log in as you uh or someone else so know where your transactions are coming in is it the login flow is it available through NGINX or whatever IP reputation scoring is another thing that can help you right is this a new IP address is this an old IP address is this a set top box in Vietnam um that's you know hitting me right now you really can understand things from that and then just simply counting the number of login attempts uh and comparing each one of

those to different IP addresses is going to give you a ton of data about how things are working so how do you stop credential stuffing well CAPTCHAs right that used to be the thing throw a CAPTCHA at it uh this Ars Technica um article that I read I don't know September 27th so that wasn't too long ago uh like a month ago AI bots can now identify 100% how many of you have ever failed one of these the AI is better than you it's 100% accurate see I never know is that the motorcycle I never know a lot of times it's crosswalks and there'll be stairs and it just doesn't understand right this story comes from here uh I

was watching uh the flows on a third-party retail fulfillment app uh I don't know if you guys have noticed anymore if you order something online the email you get kind of looks like the other company when you ordered something online it looks very similar the tracking buttons are in the same place and all that's because there's a third party company that facilitates all that and so I was watching the information flow across this company and I started to see a pattern in this request matrix that was coming through it was about 1,500 requests per day should say per day for the last 90 days for three months how many of you have bought something that

it took three months to get to you that's not a thing right well maybe it is if you have something bespoke built right but if you go to Foot Locker or JD Sports or anywhere and buy something it's probably not going to be three months before you get it all of these things were asking the requests were coming in asking for order status and what had happened is they were in Adelaide and they don't have FedEx there they have DHL or something else as their last mile carrier and so when it would go from handoff from you know FedEx or whoever to this local carrier the statuses changed so rather than saying I delivered it it would say complete and

this guy had turned a bot on to see where his shoes were that he had bought uh and wouldn't give him status you know shipped or whatever it was saying complete and it didn't match the algorithm that he had and he just kept banging away on this thing uh trying to figure out where his shoes were he got the shoes three days after the sale I I went through the tracker and found it right so this shoe bot is just running around trying to figure out where my shoes are um and they'd already purchased them but what I got from this was two things I figured out how their bot work right but I also figured out that they

were buying the shoes and so we could take those addresses back to the companies and say next hype sale block this address uh and they want to shut them down why is it important does anybody understand why shoe companies don't like bots to buy their shoes they're getting paid anyway right why would they care it has to do with when you go on Twitter and go I can't believe the bots bought all the shoes again right if you do that the shoe companies get mad and then the shoe companies go to their retailers and say if you solve this problem you can sell more shoes I'll give you more allocation right and when they do it's really

interesting multiple IPs were asking for the same tracking number so he was rotating through a proxy service anybody here use a VPN on their laptop or their phone that's free you ever read the terms and conditions on a free VPN because it says you're now part of their proxy network that they sell to botters so your phone looks like a really good IP address if I'm doing comparisons right but you're just the proxy node um so the idea here is if you want to stop this kind of behavior there should be a drop dead right the order shipped this tracking number doesn't work anymore drop dead right make it so that you give signal back to the bot so they'll stop

pounding on your site to figure out what's out there right when you have the analytics in front of you um you can take all of those items that you understand and you can apply them so like that big event with the five million requests we saw that jump off the page because that was an odd amount of traffic at that time and so we dug into it to figure out what it was should you take action on a new endpoint if one just shows up right when v27 shows up for your logins do you care about it this is the kind of stuff that you have to get yourself prepared for in order to set the analytics up if you

want to understand more about this uh I'll be upstairs in the lockpicking area I'm also lockpicking today um and I can talk you through some of the analytical strategies that I've seen work best in the past um as well as some of the things that uh make it easier for you to do things identify uh APIs and stuff that's out there so um I hope I didn't run too short like I said when my house blew up my brain started working weird and I can't time things anymore um but I think I'm between you and lunch so uh bon appétit thanks guys