
Automated REST API vulnerability detection with WuppieFuzz | Thomas & Erieke | BSides Groningen 2025

BSides Groningen · 2025 · 27:19 · 72 views · Published 2025-05 · Watch on YouTube ↗
Category: Technical
About this talk
Today’s world depends on many digital services and the communication between them. To facilitate this communication between applications, standardised and well-specified application programming interfaces (APIs) are often used. In particular, the use of well-defined representational state transfer (REST) architectural constraints for APIs is popular. As an entry point to many applications, these APIs provide an interesting attack surface for malicious actors. Furthermore, since APIs often control access to business logic, a security lapse can have high-impact undesirable consequences. Thorough testing of these APIs is therefore essential to ensure business continuity. Manual testing cannot keep up, so automated solutions are needed. In this talk, we introduce and demonstrate WuppieFuzz, an open-source, automated testing tool that makes use of fuzzing techniques and code coverage measurements to find bugs, errors and/or vulnerabilities in REST APIs.

By: Thomas Rooijakkers & Erieke Weitenberg
LinkedIn:
Thomas Rooijakkers: https://www.linkedin.com/in/thomas-rooijakkers/
Erieke Weitenberg: https://www.linkedin.com/in/erieke-weitenberg-562b7810/
Event: BSides Groningen 2025
Official website: https://bsidesgrunn.org/
LinkedIn: https://www.linkedin.com/company/bsidesgrunn
Transcript

So welcome to the second session of today. I already see a couple of names on the poster, so make sure during the next break to also write your name on the poster as well, so you can prove that you were at BSides. For the next talk I'm going to introduce Thomas and Erieke from TNO, and they're going to show us how amazing WuppieFuzz is. [Applause] Thanks, thanks for the introduction. Yeah. So today we will take you along with how to do automated REST API vulnerability detection, with WuppieFuzz in particular. We dive a bit deeper into the marvelous world of fuzzing and will also zoom in to the

specific application of API fuzzing. Maybe to start with: who are we? Well, we're already nicely introduced. We work for TNO. I work as a lead scientist on topics like cybersecurity by design, software security, and vulnerability research, so if you have any questions on those, feel free to ask. I also have a background in privacy technologies, so if you want to ask some crypto stuff, you can also ask me during the many breaks. I'm also a scientist at TNO; I work on mostly the same subjects as Thomas. I also sometimes do some monitoring and detection, which is quite interesting, but mostly vulnerability research and how to help

others not be that vulnerable, which is also what today's talk is about. Yeah. Thanks. So let's start. Let's see if the clicker is working. Yeah. So I just want to take you along with fuzzing. I'm assuming that not everyone in the audience has an idea what the term fuzzing, or fuzz testing, means. It's a software testing technique, and the key principle is actually more like this image. The credits for the image are at the bottom, but it's more like: if you leave your device somewhere in an uncontrolled environment, you might have a kid that walks up to your device, trying all kinds of random things, button

smashing, and all of a sudden they either break your device or they get in, and you have no clue how that happened. I think that's a nice way to describe fuzzing. If you formalize it a bit more: you try all kinds of unexpected and seemingly random input data, shot at your application, and you try to trigger unexpected and breaking behavior. I have to click a bit earlier, I guess. All right. So what are the key aspects of fuzzing? It's driven by a high volume: you want to try a lot of different things, because if you do random stuff, many of the things don't really have an impact or an effect. The nice thing of

fuzzing is that you really decrease your testing bias. If you're a software engineer or software developer, you are very good at testing the happy flow. If a program is supposed to do something, you know how to test it. You might be able to get to some unexpected behavior as well and try to test that too. But it's very hard to test the things that you don't really know are there. So you're testing the unexpected, you decrease your testing bias, and you can also automate this fully and implement it in your CI/CD pipelines. Conceptually this is quite simple, but in practice it's actually a bit more complex. We'll dive into that in

the rest of the presentation. But to make it a bit more visual, we also brought an example. So if you're fuzzing, and this is, for instance, the starting input that you want to mutate and change — we actually did this with this Super Mario image and fed it to an image-parsing library, just to demonstrate what the fuzzer is doing. So when it starts fuzzing, it will try all kinds of random mutations, and this animation is showing the various mutations it's making. Of course, these are only the images that you can actually view; many of the inputs are so invalid that you don't actually see

it. Before diving even deeper into what we can do with fuzzing, I believe there are quite some misconceptions, and I want to discuss those with you. First off, some people think that fuzzing is a form of stress testing. I don't agree with this. Of course you are stress testing your application, but that's not the goal of fuzzing. Your goal is not to see how it behaves with a lot of data and a lot of, for instance, requests when you're fuzzing a REST API; the goal is to find unexpected behavior. It's also not a technique that you use to attack a system. In

particular, you're not attacking live systems. If you would do that, there are many mitigations in place to prevent you from continuing. So if you want to attack a system, you'd better just run it locally, find a vulnerability, and then run an exploit in a single run. Some people might argue that it's only or mostly valuable when you're doing pentests, but we'll also demonstrate it's actually very valuable in other stages of your development. And it's not only for security testing — although you might argue that, depending on your definition of security, it is only for security testing. But it's also a great technique to enhance the reliability and

robustness of your application. That's the key thing I want to mention here. So when can you apply fuzzing? Well, you can do it during development: you can use it to improve your unit testing or integration testing and implement it in your CI/CD environment. You can of course do it during penetration testing, or when testing open-source software solutions that you're actually using, and you can of course do it on legacy software as well, or for regression testing. So there are many places where you can apply fuzzing, and the different instances all have a slightly different way to set it up. So why would you want to go fuzzing?

Well, as I already mentioned, if you detect flaws, your software quality increases, and also the reliability and the security of your software, so that's a very important aspect. The nice thing of fuzzing, as opposed to more static testing techniques, is that the things you find are true positives, and you can actually reproduce them: you have payloads that you can just fire at your application and debug from there. Another interesting aspect: once you manage to set up your fuzzing campaigns, it's actually relatively cheap to keep doing it. It requires minimal effort to change it when your software changes a bit.
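The core loop the speakers describe — mutate a seed input, run the target, keep any payload that triggers unexpected behavior — can be sketched in a few lines. Everything below is hypothetical: `parse_header` is a toy stand-in target with a deliberately planted bug, not any real parser or WuppieFuzz internals.

```python
import random

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Flip a few random bytes of a seed input."""
    buf = bytearray(data)
    for _ in range(rng.randint(1, 4)):
        pos = rng.randrange(len(buf))
        buf[pos] = rng.randrange(256)
    return bytes(buf)

def parse_header(data: bytes) -> str:
    """Hypothetical target: a toy parser with a planted bug."""
    if len(data) < 4 or data[:2] != b"MZ":
        raise ValueError("not a valid file")    # expected rejection of bad input
    if data[2] == 0xFF:
        raise RuntimeError("unhandled state")   # the 'crash' we hope to find
    return "ok"

def fuzz(seed: bytes, iterations: int = 20_000, rng_seed: int = 0):
    """Minimal fuzz loop: mutate, execute, record unexpected behavior."""
    rng = random.Random(rng_seed)
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            parse_header(candidate)
        except ValueError:
            pass                       # expected: invalid input rejected
        except RuntimeError:
            crashes.append(candidate)  # unexpected: a bug, keep the payload
    return crashes

crashes = fuzz(b"MZ\x00\x00payload")
```

The payloads collected in `crashes` are exactly the reproducible true positives mentioned above: each one can be fired at the target again to retrigger the bug.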

It's fully automated, and fuzzers don't really have a 9-to-5 mentality; they just run 24/7. So that's also great — although I can imagine many of you in the crowd also work 24/7 for fun. So what does a fuzzing procedure look like? The first thing, very important and often forgotten, is actually realizing what the attack surface of your application is, and how you can hook your fuzzer to your application: how can you feed input data to the application in order to find interesting, unexpected behavior? That's a very important aspect. Once you manage to get that going, the first step is to start generating these random inputs. Often

you start with an interesting, well-working input and you try to mutate that. This input is fed to the application and executed, and once you execute it, you observe the behavior. You might also observe other interesting aspects, and all these metrics that you can observe can in turn be used as feedback to improve your input generation, change your strategies, and do all kinds of interesting stuff. Well, once you find interesting bugs or issues with your application, you can do some crash triaging, and then there are many things you can do after that, which include of course building an exploit or patching the issue at

hand depending on your

role. Then there are various settings. With a setting I mean what you can access of your application. There are many definitions of black box, gray box and white box, and they don't really align, so we decided to present today one of our interpretations. Black box is really: you cannot see anything; the input and output are all you can see. White box would be: you have the entire code base available; you can instrument the code, you can add code, you can modify it. And gray box is somewhere in between: if you look at compiled languages, you might be able to modify the binary, but you

don't have the source. And the nice thing is, if you go from right to left: if your fuzzer only works black box, you need way less knowledge of your application. But if you go the other way around, there's more information available, which can make your fuzzer way more intelligent. This also comes with a lot of trade-offs. So there are many things to take into consideration when you're setting up fuzzing, specifically for the application that you're fuzzing. I listed a couple here. For instance, performance. Performance is very important: the more you can try, the more likely you will find something. But if you do it

intelligently, you might not need to try as much. Or you can use a lot of structure or prior knowledge about the input that you're feeding. For instance, if you're fuzzing a PDF parser and you know that it's parsing PDFs, well, maybe you can start with a PDF, or use the structure of a general PDF to create interesting new starting points. And the amount of feedback that you use to do intelligent things can also be tuned. So there are many combinations of these: for instance, having a more or less intelligent setup which is a bit less performant but also

uses a lot of feedback — I will not name specific fuzzers today for that, but I think people deeper in the domain can imagine what it is — or having a very performant one doing a lot of random stuff, maybe using some structure of the input data, but other than that being more like a monkey smashing buttons. And similarly, you can use a lot of intelligence, use all the feedback you can find, which eventually results in very bad performance — but then again, this can still be very beneficial, and there are some interesting fuzzers that use this approach to find deeper bugs. And with that I want to give the

stage to Erieke. Thank you. Thanks. So I will try to introduce to you the fuzzer that we actually wrote at TNO. This is a REST API coverage-guided fuzzer, and I want to briefly introduce the many terms I just threw at your heads, even though you probably understand most of them, because there are many fuzzers. Fuzzing is not a new technique, even though we are presenting it here just in case you don't know it; it has existed for decades. So what is fuzzing about? Well, actually there are many fuzzing projects, including one currently operated by Google, that just fuzz open-source software all the time. So fuzzing

is not like some alien concept; you can do it at home or at work. But fuzzing has mostly been focusing on binary applications, such as the programs you run in your terminal that just accept one input and give one output — then it's very easy to just attach the fuzzer to the standard input and start feeding nonsense. But many applications that you use every day consist of multiple parts, not all of them like the applications you would use in a terminal. Instead, they have components that interact with each other: for instance, your laptop and desktop applications might connect to a back end operated by a service provider, which connects again to a database.

So yeah, you have multiple components, and all of these components can talk to each other and therefore have input and output interfaces, and we set out to build a fuzzer that can test those interfaces as well. Because, as you know, when your mobile phone connects to a back end, you, as perhaps a security researcher, can also do this in other ways and feed inputs to this back end that it might not expect. So testing APIs — REST APIs in our case — is, according to us, an important avenue of research, and should also be an important avenue of practical testing. So yeah, how do these things communicate? Well, they use, for instance, APIs — for instance, REST APIs.

For instance, ones such as this one. I used an example that you can find if you look up the documentation on the OpenAPI specification, which is often used for REST APIs. You get this example, which for our purposes is very nice, because you can see all the ways you can communicate with this application: you can send it HTTP requests that conform to certain rules, and then you get an answer, or a side effect happens — in this case, you can update a database of cats and dogs. Then somewhere, if you actually built this app, you will have some code in the back end that does all these

things. So if you send a POST request to the pet endpoint with some data, it makes a new pet for you. And then in the end it also connects to, hopefully, a database of some kind. Of course, you can keep your pets in memory, but usually you want to reset your server at some point, or run multiple instances of it, and then, well, you need a database of some kind. So when we start fuzzing, especially when we start fuzzing a REST API, the glitchy Mario goes into each of these endpoints. You try to fuzz, for instance, the parameters — because of course you can

also just open the HTTP port and start sending garbage there, which might be interesting, but then you might also just be fuzzing the HTTP parsing component, which is often not the part that you wrote yourself and that you want to actually test. So we try to make valid HTTP requests with invalid inputs for the parameters, for instance, or the cookies or the headers or whatever. And then hopefully these end up in your logic and start doing weird things to your database, maybe. So yeah, another thing that you want to do when you're testing is you want to know how well it's going, right? Maybe at some point you crash

your application — then you know you found something. But if you don't, does it work? So what we also try to do is give you some insight: either "that's where the bug is", or maybe even better, "we tested all this code and we didn't find a bug". So code coverage is kind of nice to have, and since our fuzzer is built on LibAFL, which is a coverage-guided fuzzing library, we can use code coverage if we can instrument your application. This ties into the white box/black box thing that Thomas talked about. And finally, there is another challenge: if you use a database,

well, you might have to deal with the state of your application, since it affects what bugs you find. If you feed glitchy Mario into your database, perhaps you don't see the effect immediately, but then if you try to do other very legal and cool stuff with your application, it might suddenly crash. And then you try to reproduce it, and you send your normal legal input, and it doesn't crash. This is typically a frustrating experience. So it's nice if you can have some kind of help from the fuzzer that says, "Okay, this is the history of what I did to your application"; maybe that can reproduce the bug. On the other hand, you

might only be able to find cool complex bugs by doing complex things and messing up your database in complex ways. So one of the ways we try to do this is to make chains of requests that hopefully do something legal, and then mutate them so they do illegal stuff. In this case I am using the example of building Spotify. You have a music database; you want to have some data on songs and artists. So first you create the artist, in this case Rihanna. Then, when you have created the artist in your database, you can attach an album to it, and then, when you have created the album, you can

attach a song to it. And then if you want to delete everything, you should first delete the song, then delete the album, and then delete the artist, because otherwise you have songs that don't belong to any artist. So logically, what the fuzzer will do is try to switch around these operations and maybe delete the artist first at some point, and then see what happens to your database. Maybe your application is written well and will say "I can't delete the artist", or will say "I will also delete all of these albums and songs by the artist". But maybe it will just leave dangling references that later lead to problems. Then we spoke about

coverage. That's sometimes difficult, because it means you have to instrument your application, which makes the fuzzer harder to use — and one of our goals was to make it easy to use. So we have coverage-guided fuzzing in two modes. Easy mode is where we do endpoint coverage, which is basically: did we reach all of your endpoints? That's easy enough. And did we get all of the status codes out of them that you documented? Here it helps if you document your API carefully, since then we can see: we tested a lot, but we didn't get any 404 errors from this endpoint. Let's ask

for weirder things and hopefully get them. On the other hand, this also allows us to see if your application, for instance, gives a 500 status code and you didn't document that as being normal — probably you didn't — and then we can report it as a bug. So it works both ways. But of course it's nicer to have serious coverage of your application. So if you have the source code — I presume you do — you might be able to install coverage instrumentation, which we here call the coverage agent, in your application, and then the coverage client in the fuzzer, which speaks the same language, can connect to it and better understand how much of your

application it has been able to test. So yeah, that is better, but it can be a bit difficult. We have multiple solutions depending on your back-end language, since that kind of differs. If you write Java, there is a very nice coverage library used for unit testing — or, in our case, fuzzing — which is called JaCoCo. It speaks a certain protocol that we wrote a client for, so the fuzzer can understand the coverage; the client connects to the coverage agent in your application and tells the fuzzer how it's doing. Same for JavaScript and Python: there are coverage agents and clients that speak LCOV, a coverage format, which you can use to test if your

application is being covered. We try to make the fuzzer extensible in this regard, so that if you have another language, let's say Go, and you are faster than us, you could write your own coverage agent and client and submit it, and then we would of course happily include it in the fuzzer. So yeah, this also depends a little bit on what people are using; our clients were already quite happy with Python and Java, but maybe you're not. So finally, what happens if you find a bug, right? If it breaks, then there are a couple of things that could have gone wrong, and we try to inform you about those. For instance, maybe you

just returned a status code that is not in your specification. This can be a documentation issue or a code issue, so we tell you about it, but we allow you to disable this kind of warning as well. Or you could return a structure that doesn't look like it should look. At some point we had a nice success where an application returned the entire database instead of just one entry — so then we say: this doesn't look like what it's supposed to do. That was a nice bug to find. Authentication problems are harder to find, right? If you ask for Thomas's data and you get mine, then that looks good according to

the specification. It's not good, so that's harder to detect. As is data loss, if you have some malformed input that deletes the entire table. Yeah, that's kind of sad. But hopefully later in the campaign — say, you want to access the resource that you just created with this weird name — the fuzzer detects problems and you will see the bug. Logic errors usually cause exceptions; sometimes the application crashes, but usually not, so hopefully you can see this in your logs as well. One of our future features is to connect to the logging of your application and record what is happening there at the

time of the bug, since hopefully this helps you. So yeah, we currently detect these three kinds of things happening with your application, because usually it's a separate process, so we cannot really inspect it more closely than you already can. And I guess I'm getting to the end now. We released this already last year, so you can actually just download it and try to use it. We tried to make it easy to use. And we even have a movie that maybe we still have time to show — but I will see how many questions there are, and if there are very few questions, I will click through to the movie, which is a

demo on how to use it. So, are there any questions? — What's the most interesting bug you found with this? — I think I like the database dump the best. It was basically by malformed parameters: you could just ask it to give you a wildcard query on the database and get data that wasn't meant for you. Do you have a favorite? — Yeah, I would say the same indeed. Yeah, that was the most interesting. Also important to note: we are not pentesters ourselves; we work on the innovation and we don't do active testing. So I think it would be interesting to ask the community around the fuzzer what they actually find; that might be nice to have in the

repo as well, in the readme. Often you catch boring things, like: this is an exception that wasn't caught or wasn't handled, or this input is not being properly guarded against injections, and we posted invalid Unicode so it crashes — and then you look at the code and you notice, oh, maybe we should check this in any case. So yeah, those are not the spectacular things, I guess.
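The "boring but real" class of finding — an unhandled exception on invalid Unicode, surfacing as an undocumented status code — is easy to simulate. The sketch below is entirely made up (the `get_pet` handler and its documented codes are invented stand-ins, not WuppieFuzz internals): the endpoint documents 200/400/404, but random non-ASCII parameter values drive it into an undocumented 500.

```python
import random

# Documented responses for the hypothetical endpoint, as an OpenAPI spec would list them.
DOCUMENTED_STATUS = {200, 400, 404}

def get_pet(name: str) -> int:
    """Hypothetical back-end handler with a lurking encoding bug."""
    if not name:
        return 400
    try:
        name.encode("ascii")   # planted bug: chokes on non-ASCII names
    except UnicodeEncodeError:
        return 500             # unhandled exception -> server error
    return 200 if name == "rex" else 404

def fuzz_parameter(iterations: int = 2000, seed: int = 1):
    """Throw odd strings at the parameter; report undocumented status codes."""
    rng = random.Random(seed)
    findings = []
    for _ in range(iterations):
        # Random short strings drawn from a range that includes non-ASCII characters.
        name = "".join(chr(rng.randrange(0x300)) for _ in range(rng.randint(0, 8)))
        status = get_pet(name)
        if status not in DOCUMENTED_STATUS:
            findings.append((name, status))
    return findings

findings = fuzz_parameter()
```

Each finding pairs the offending input with the undocumented status, which is exactly the kind of reproducible report described above.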

It's a lot harder to use if you don't have concurrent logs. While fuzzing — since HTTP is relatively slow compared to fuzzing an application that you have in-process — we do keep a database of all the inputs sent. So if you look at a crash and you're like, "I don't know what happened here", you can basically see a record of all previous requests and replay them while looking at the logs, hopefully, and see what is going on. But in the end, logging is usually written by yourself and therefore easy for you to understand.
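A sketch of what such record-and-replay can look like, with an in-process stub standing in for the real HTTP back end. The `PetStore` class, its latent bug, and the recorded request log are all invented for illustration — the point is that a malformed request can be harmless on its own, while a later, perfectly legal request triggers the failure, so replaying the whole recorded history is what reproduces it.

```python
# A stateful hypothetical target: it only fails after a bad record was stored earlier.
class PetStore:
    def __init__(self):
        self.pets = {}

    def handle(self, method: str, path: str, body=None) -> int:
        if method == "POST" and path == "/pet":
            self.pets[body["name"]] = body           # no validation: bad data slips in
            return 200
        if method == "GET" and path.startswith("/pet/"):
            pet = self.pets.get(path[len("/pet/"):])
            if pet is None:
                return 404
            return 200 if "species" in pet else 500  # latent bug surfaces later
        return 404

def replay(request_log, target):
    """Re-send a recorded request sequence and return the status history."""
    return [target.handle(*request) for request in request_log]

# The sequence the fuzzer recorded before the crash: the malformed POST is
# harmless on its own; the later GET is what finally returns the 500.
recorded = [
    ("POST", "/pet", {"name": "rex", "species": "dog"}),
    ("POST", "/pet", {"name": "glitch"}),   # mutated request: a field was dropped
    ("GET", "/pet/rex", None),
    ("GET", "/pet/glitch", None),
]
history = replay(recorded, PetStore())      # the last GET surfaces the 500
```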

Yeah, also mainly because if you do it on a live environment, there are mitigations in place against sending a lot of requests, so you will probably be banned within a minute. So yeah.

Yeah, or if it's running somewhere without all the mitigations on the number of requests you can send — that might also still be valuable. And the main thing you can also catch is deviations from the spec. We do assume that you have the spec, and any deviations can still be of interest. So not only 500s; you might also have responses that deviate from the spec simply because the team didn't write the spec first and then generate the code, but wrote the code first — and then hopefully the spec should still be similar to — yeah, hopefully indeed. Yeah, so the ideal world is that you have a spec and you build your code, and then they are the

same, and they should be the same. I think you should also strive for that. So it's also a tool that a developer can actually use to test whether they align with the spec and investigate any deviations — so it would also be helpful for developers, and maybe a bit less for pentesters in that case. Yeah.
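That spec-deviation check is, at heart, a per-operation set comparison: which status codes does the spec document, and which did we actually observe? A toy illustration — the spec fragment and the observed responses below are made up for the example, not taken from any real petstore spec:

```python
# Hypothetical fragment of an OpenAPI spec: documented status codes per operation.
SPEC = {
    ("GET", "/pet/{petId}"): {200, 404},
    ("POST", "/pet"): {200, 400},
}

def check_against_spec(method: str, template: str, status: int) -> bool:
    """True if the spec documents this status for this operation."""
    documented = SPEC.get((method, template), set())
    return status in documented

# Responses observed during a fuzzing run (method, path template, status).
observed = [
    ("GET", "/pet/{petId}", 200),
    ("GET", "/pet/{petId}", 500),   # undocumented: worth a bug report
    ("POST", "/pet", 400),
]
deviations = [o for o in observed if not check_against_spec(*o)]
```

Whether a deviation means a code bug or a documentation gap is then up to the developer — which is exactly why this view is useful to them as well as to testers.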

If there are no other questions — we also brought stickers, so I can put some on the bar so you can promote the fuzzer, or just put them on your laptop because you like stickers; that's also fine. And I'm not sure whether we still have time for the video, but it's also available on YouTube, or you can find it in our repo. If you go to that URL, it's a demonstration of how to work with WuppieFuzz in the basic sense, easy mode, with a nice voice-over, so I hope you can follow along — and there are also tutorials on GitHub itself. So thank [Applause] you. Thank both of you for the nice

demonstration and talk about WuppieFuzz. Also for both of you a nice thank-you. And if there are any other questions, or you want to see a live demo, I suppose you both will be hanging around during the conference, so make sure to get in touch with them. Thank you. Yeah. Thanks. [Applause]