Unseen in the Stack: Mapping Hidden Java Dependencies for Real-World Applications

BSides NYC · 2025 · 26:22 · Published 2025-11
About this talk
This talk analyzes how shaded and repackaged Java dependencies propagate through Maven Central and real-world applications, revealing millions of hidden vulnerabilities that standard dependency scanners miss. Using static reachability analysis, the speaker demonstrates why dependency visibility breaks when libraries embed and rename their transitive dependencies, and presents findings from scanning over 16 million artifacts.
Transcript

presented by Oron Gutman. Right. So this talk presents research on how shaded and repackaged libraries propagate in the wild, based on analysis of real Maven repositories and build artifacts. Let's give him a round of applause. By the way, it's also his first time talking at BSides New York City. So, now you know. >> Thank you very much for having me. My name is Oron. I'm the co-founder and CTO of Hopper Security. We work on static, function-level reachability analysis for both SCA and container scanning. Before Hopper, I spent 15 years in vulnerability research, managing

both vulnerability researchers and software engineers. Yes? >> What? I am sorry. Now is it okay? Okay, great. >> This one. Great. Thank you. So how do we manage OSS risk today? The first thing we want to do is get visibility into what we have in our software. We can do it with something called an SBOM, which is essentially a list of all the software components that we have in our code or in our binary, depending on what we are scanning. The second thing is to identify what risk we have in each component. So we

can cross-reference each of the components, the dependencies, with NVD, OSV, or another vulnerability database, or we can do it with a proprietary one. Essentially, this is what we need to do in order to identify the risk. It's the same for license scanning: we start with the inventory and then cross-reference it with a license database. The third thing we want to do is prioritize the risk. What most tools and most teams do is prioritize based on the CVSS score, or EPSS for the more advanced ones, or the CISA KEV catalog, or by looking for known exploits,

and then the fourth part is remediation. Most of the time it's a version bump: we need to change from version A to version B, the non-vulnerable one. When that is very difficult, we can consider backporting. So why are we talking about Java today? Java is the number one backend language in Fortune 500 companies. Nearly 50% of the applications in those companies are built in Java. It's a very complex ecosystem. There are a lot of frameworks, like Spring and Jakarta, that bring a lot of dependencies.

It's very transitive. A lot of dependencies bring other dependencies that in turn bring more dependencies, and this requires some creative methodology to solve, which we will talk about later. And the impact of a vulnerability in Java is very big, because one vulnerability can affect hundreds of thousands of other libraries, for example the Log4j incident. So how do we build a Java application? Today we have three major build systems. We have the largest one, which is Maven. We also have Gradle, a script-based build system; it's like a makefile in C and C++. And we have Bazel by

Google, which is more oriented to build performance and reproducible builds. So how do we build an application with Maven? We have a simple thing that we call the POM. It's an XML file. It defines metadata, for example the group ID, the artifact ID, and the version, which essentially identify the binary that we are going to build. We specify the direct dependencies with their versions, and the whole build life cycle. We take the source, we take the dependencies. We need to resolve the dependencies, because at this point we only have the direct ones. We get them most of

the time from Maven Central. We can also get them from an internal artifact repository, JFrog Artifactory or Sonatype Nexus for example. Then we need to resolve conflicts. Maven does this by choosing what is nearest to the root, so direct dependencies get priority over transitive dependencies. Then we compile everything, we run the tests that we declared in the life cycle, and then we package it. There are four main package types: JAR, WAR, ZIP, and EAR. The most common is the JAR. So what is a JAR? JAR is basically Java

archive. Inside there are the class files that we compiled from the source. We have some metadata. Because it's a ZIP, it can contain resource files and configuration files; Log4j has a lot of configuration files in it. And it can be an executable. Executable means that there is a main class, a class with the main signature; that is the only way we can run the Java application. Or it can be a library without a main class, which will be a dependency in other projects. So what is the challenge with an executable JAR? Basically, the dependencies, because to run a Java application we

need to specify all the dependencies in the classpath. Here you can see java -cp: I can specify JARs, I can specify a directory which contains other JARs, and I specify the main entry point. Across different environments, this can lead to classpath errors: classpath differences that lead to runtime errors. One way to solve it today is to put everything together in a container. But before there were containers, there was something called an uber JAR, which basically packs all the classes from the first-party code together with all the dependencies, and now we have a single JAR

that contains everything, and we can run this JAR alone. There are two main build tools to automate it. There is a plugin in Maven called the Maven Shade Plugin, and in Gradle it's the Shadow JAR plugin. Now let's talk about the challenges in libraries. We talked about executables, but libraries have their own challenges. Basically, what we are talking about is dependency conflicts, because when we publish libraries, for example to Maven Central or to an internal Artifactory, they come with their own POM that specifies the dependencies required by

this library. So when we resolve dependencies, we take into account the library's POM, and we resolve the whole tree until there are no more dependencies. So what happens when two libraries require the same dependency but in different versions? Maven will resolve it; it will choose the one nearest to the root. But if one wants version 1.0.0 and the other wants 3.0, a major version difference, one of them is going to win and the other is going to crash at runtime, because I assume there are a lot of differences between those versions. This is what we call dependency

hell. So how do we practically solve this problem? We do something called shading. What is shading? It's basically embedding the dependencies inside the JAR and renaming the classes inside. For example, if the classes start with org.apache.commons, I can relocate them to com.foo.shaded.org.apache.commons. It's basically a managed way to copy and paste the whole content of the dependency inside the JAR. And then we take the required dependency out of the POM, because I already copied it into my JAR and relocated it. This is something we

call a reduced POM: the POM for the library no longer specifies the dependencies that I already copied inside. So let's take a look at a simple example. I took something old enough that the vulnerabilities inside, I assume, will not affect the world much today, because it's a library that is more than 10 years old. What I did here is create a simple application and specify one dependency: com.auth0 java-jwt in version 1.0.0. And now I want to test what vulnerabilities there are inside. I will start with something simple.
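As an editorial aside, the relocation step of shading described above amounts to a prefix rewrite on fully qualified class names. The following is a minimal sketch in Java under that assumption. The class `RelocationSketch`, the method `relocate`, and the `com.foo.shaded` prefix are illustrative names, not the Maven Shade Plugin's API; the real plugin also rewrites references inside the bytecode, which this sketch does not attempt.

```java
class RelocationSketch {
    /**
     * Rewrites a fully qualified class name if it lives under `pattern`.
     * Hypothetical helper for illustration only.
     */
    static String relocate(String className, String pattern, String shadedPattern) {
        // Match the package itself or anything below it, but not a sibling
        // package that merely shares the same leading characters.
        if (className.equals(pattern) || className.startsWith(pattern + ".")) {
            return shadedPattern + className.substring(pattern.length());
        }
        return className;
    }

    public static void main(String[] args) {
        String pattern = "org.apache.commons";
        String shaded = "com.foo.shaded.org.apache.commons";

        // A class inside the relocated package gets the new prefix.
        System.out.println(relocate("org.apache.commons.codec.binary.Base64", pattern, shaded));

        // A class outside the package is left untouched.
        System.out.println(relocate("org.junit.Test", pattern, shaded));
    }
}
```

After relocation the class still contains the same code, but a scanner that matches on the original package name or on the declared dependency list no longer has anything to match against.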

I will run a Maven dependency tree, because I want to resolve the whole tree, all the dependencies that I have in my project. Here you can see my root project, and there is only one dependency. What I can learn from this dependency tree is that com.auth0 brings no other dependencies. So let's inspect it on Maven Central. We will see the POM that Maven Central has for this dependency. And indeed there is only one dependency, JUnit, and it's in test scope, which means it's okay

that I don't see it in my dependency tree. It will not get into the final artifact, because it's a test dependency. So now we can start to explore what vulnerabilities there are inside. I'm going to Maven Central, and as you can see there is only one vulnerability. This vulnerability comes from a dependency, and that dependency is JUnit, which is not relevant for us. I can skip this one. So it seems okay. Anyone work with deps.dev by Google? Okay. Basically, deps.dev is an inventory of open source for some ecosystems, for example Java, Python, and the other big ones.

The security advisories on deps.dev are based on OSV.dev, also by Google, which relies on NVD and GHSA. What I can learn from this is that there are no vulnerabilities in com.auth0. I think it's fine. Let's try Snyk. Snyk develops a proprietary vulnerability database; they research vulnerabilities with their own researchers. So let's see if they see anything that the others missed. According to Snyk, there are no vulnerabilities in com.auth0. Let's try to scan my pom.xml with Snyk. They identified the correct com.auth0 java-jwt dependency, there is only one, and as you

can see, there are no vulnerabilities, even from Snyk. Another methodology to see if there are vulnerabilities is to compile this artifact. I packed it into a container and ran Trivy, an open-source tool. And here we can see something interesting. There are 49 vulnerabilities: 19 critical, 27 high. This is interesting because it specifies that we have jackson-core and jackson-databind dependencies with their installed versions, so now I can resolve all the vulnerabilities by upgrading those dependencies. So let's try to do it. We will run

a dependency tree and try to see where those libraries came from, and again, those libraries are not present in my project. So what we see here is that I can't upgrade them. I can't upgrade jackson-core or jackson-databind. I can't see them. I didn't bring them in, not even transitively. So how did this happen? Let's dive into the JAR itself. I take the com.auth0 java-jwt JAR and unzip it. Then I go to the META-INF folder, which contains some interesting files called pom.properties. And those pom

properties specify the dependencies that are packed inside the JAR. For example, here we can see jackson-core in version 2.0.0 and jackson-databind. So Trivy wasn't lying about it. So how was it made? Basically, what we see here is shading. Let's explore the source of this dependency. I go to GitHub, check out version 1.0.0, and I can see that there are actually three dependencies: jackson-databind, commons-codec, and the famous JUnit from before. And now if I continue through the POM I can

see that there is a process called shading. They're using the Maven Shade Plugin. The goal here is shade, and it generates a reduced POM, as I explained before: the shaded dependencies are removed from the POM that is delivered to Maven Central, and the packages are relocated. So it's no longer com.fasterxml.jackson; it's something else. For now I can't tell what from this. But let's continue. We want to see whether it's a theoretical problem or a real one. So we took our platform and

scanned the project on GitHub. We found that there are 49 vulnerabilities. Five of them we consider reachable, and we can explore the call graph and see that they actually look like a real risk. If we look at the call graph, we see that we call verify, because we want to verify a JWT token from the user, and then we call decode and parse, and then we dive down into the com.fasterxml world until the vulnerable function. We can see here the relocation pattern: it's no

longer com.fasterxml.jackson; it's com.auth0.jwt.internal.com.fasterxml.jackson.databind. And here we can see an interesting part that tells us to upgrade jackson-databind from version 2.0.0 to 22.7.1, but we didn't find it in our dependencies. So the actual upgrade that we need to do here is from com.auth0 java-jwt 1.0.0 to 3.0.1. This is the actual artifact that contains the vulnerability. So our motivation was to see the impact of all of this shading, because we saw it across our customer base. We saw that there is a lot of

shading, both in open source and in internally developed libraries, and we wanted to see the impact. So what we did was go to Maven Central. There are more than 16 million artifacts there, and we analyzed each one of them and tried to map the shaded dependencies inside those artifacts. Then, for each one of them, we wanted to map the vulnerabilities it contains. The amazing thing we discovered is that there are a lot of vulnerabilities hidden inside, more than 2.5 million vulnerabilities. You can see here that there are nearly 50, uh, 5,000 vulnerable packages that

are still vulnerable in the latest version, which means there is no fix available for them. There are more than 8,000 CISA KEV vulnerabilities, nearly half a million critical ones, and more than 3K Log4j instances, some of them in the latest version of some packages. Basically, there are a lot of terrifying vulnerabilities. Most of the CISA KEV ones are exploited in the wild, and there are a lot of PoCs for them.
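As an editorial aside, the JAR inspection from the demo (unzipping the artifact and reading the pom.properties files in its metadata folder) can be sketched in a few lines of Java. This is a rough illustration of the idea, not the scanner used for the Maven Central study. The class name `EmbeddedArtifacts` is hypothetical, and the sketch assumes the embedded files sit under `META-INF/maven/` and carry the usual `groupId`, `artifactId`, and `version` keys; a build that strips those files while shading would show nothing here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Properties;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

class EmbeddedArtifacts {
    /** Returns "groupId:artifactId:version" for each pom.properties found in the JAR. */
    static List<String> list(String jarPath) throws IOException {
        List<String> found = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                // Assumed layout: META-INF/maven/<groupId>/<artifactId>/pom.properties
                if (name.startsWith("META-INF/maven/") && name.endsWith("/pom.properties")) {
                    Properties props = new Properties();
                    try (InputStream in = jar.getInputStream(entry)) {
                        props.load(in);
                    }
                    found.add(props.getProperty("groupId") + ":"
                            + props.getProperty("artifactId") + ":"
                            + props.getProperty("version"));
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java EmbeddedArtifacts.java path/to/library.jar
        for (String coordinates : list(args[0])) {
            System.out.println(coordinates);
        }
    }
}
```

Any coordinates printed here that do not appear in the output of a dependency tree are candidates for shaded dependencies, which can then be cross-referenced with a vulnerability database in the same way as declared ones.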

There are a lot of other vulnerabilities related to Guava, jackson-databind, or Log4j, really a lot of shitty vulnerabilities that are hidden inside. That's it. Does anyone have a question?

Okay, thank you very much.