
Hey everyone, hope you're all having a good morning. It's always good to be back on this stage, and twice in a row, so that feels extra special. This time I even brought a friend along, and today we're going to be talking about building versus buying, and how our team of around three engineers decided it would be a very good idea to create an in-house SCA tool. So before we get to business, this is who I am. My name is Zil Lamsh. I'm a security engineer. I've been working in security for over six years; I joined Checkmarx, then moved on to Flutter Entertainment, where I developed most of this project alongside this friend here,
and now I'm at the X Group. >> Hey everyone, I'm Fabio. I'm a security engineer at Flutter Entertainment, known in Portugal mainly as Blip; we are in Porto. I joined the company back in 2016 and have worked there since, and today I'll be presenting this nice challenge with you. We'll go through the agenda now. We'll start by introducing what SCA is, then we'll talk about tackling the issue itself, then the solution and the tools we used, moving on to the solution walkthrough and the demo, and then some key takeaways and the future vision. So, introduction to SCA: it's about understanding the importance of dependency management
in modern software development. SCA stands for software composition analysis, and it's a process that involves identifying and analyzing the open source and third-party components in a software code base. So why is it important in modern software development? Because nowadays applications often rely on many open source libraries, and these libraries are not just convenience tools but essential parts of the code base itself, and each of them creates potential risk. Now imagine vibe coding, where developers just install dependencies freely, unaware of what each one brings beneath the surface. And why has this become a critical problem? Because what once seemed like a very simple task, installing a few dependencies, has evolved into a complex web of indirect risks that can compromise software security and stability. We have transitive dependencies, which means that when you install a certain library, that library depends on other libraries, and developers may not know the full chain of dependencies that comes with installing a single one. That leads to vulnerability detection issues, because if a vulnerability is found in one of those deep transitive dependencies, developers may struggle to understand where the issue is, or whether it's even a real issue. And when scaling this to large organizations with multiple applications, manually tracking and auditing dependencies is ineffective; you need an automated process that goes through everything at once and looks for security vulnerabilities that would otherwise go unnoticed. So we come to the real issue, which is dependency management in modern development. Developers often use package managers like npm or pip to install dependencies, and they are not aware of what comes with all those libraries; they just go and run npm install. This includes the direct dependencies, where you npm install a package and everything is fine, but it also includes the transitive dependencies that depend on other dependencies, so installing a single dependency will bring other dependencies along with it. Having all these layers creates complexity, and developers may not even know that they are installing dependencies that depend on other dependencies, and that even those can be vulnerable: not the dependency they installed, but the ones underneath. Traditional tools like npm and pip are fine; you use them and they install the dependencies used in the project, but they are quite inefficient when you apply this to a large organization where you have different environments and different clusters, and to handle all of this you need a proper solution that will manage
everything in an automated way. So what do we really need to build an efficient solution that answers all our points? First, we need to build an inventory of dependencies, and we also need to track versions, because we need to know which versions we have of which dependencies. This brings us to automated vulnerability detection, because we need to be proactively checking those dependencies and whether they come with vulnerabilities. And we need to provide fixes and remediation, because if we have a list of all the dependencies and a list of all the vulnerabilities but we don't take action, the list just sits there, and some of the dependencies might not even have a fix yet, so we need to find some kind of solution, an alternative package or something like that. This is why continuous monitoring is important: you need to keep checking whether those vulnerabilities have been solved, and if they have, keep checking for new ones that may come up. Most important, I would say, we need a place to manage and display those dependencies and vulnerabilities, because you can have an inventory of all applications and all their vulnerabilities, but if you don't have a single place where developers can go and understand what is vulnerable and what is not, how they can apply a fix, which versions have a fix, and which version they should patch to, it will all be for nothing. In the end that also ensures license compliance and reduces overall operational risk. So, tackling the issue: we'll talk a little about the steps we needed to address and the need for a real, scalable solution. To manage vulnerabilities, I would say we need to start by creating an inventory of components, a list of all the dependencies we have. And since we are a large organization, we also need a solution that supports multiple code bases. The funny thing is, every time in the past that we had an incident, say we have this vulnerable dependency and a big incident around it, we would use a good tool called Sourcegraph to search across all the repos where that dependency is used. But having that static list isn't enough. We need the data to be synchronized in real time, so we can quickly keep track of the dependencies and their versions and ensure all developers have visibility and can take action when a problem occurs. So we'll be able to answer two questions: where is this dependency and which versions do we have, and does it have a fix yet? So after creating that initial inventory of all the dependencies in our organization, we needed some kind of robust solution covering everything I talked about, and we went looking at all the solutions on the market. But all of them had real drawbacks we weren't happy with: some were expensive, others only supported one code base, and others had a poor user interface. We needed a tool that would answer all those points but also offer some customizability. That led us to the conclusion that we needed a tailored and cost-effective solution that we could integrate across our organization and that would meet all our specific needs. We also took this as an opportunity to build new stuff, because I think it's always good to learn what an SCA actually is and whether we could implement one ourselves. During our investigation of the solutions out there, we identified three key components. The first one is creating an SBOM, which stands for software bill of materials: a nested inventory that lists all the components, libraries and dependencies in a software application.
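As a concrete illustration (a sketch by the editor, not taken from the talk's repos): a CycloneDX-style SBOM, trimmed down to a few illustrative fields, is just a structured inventory. Shown here as a Python dict:

```python
# A minimal CycloneDX-style SBOM, reduced to illustrative fields.
# Real documents also carry metadata, hashes, licenses, and a full
# dependency graph; this sketch only shows the overall shape.
minimal_sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "version": 1,
    "components": [
        {
            "type": "library",
            "name": "django",
            "version": "5.2.4",
            "purl": "pkg:pypi/django@5.2.4",
        },
    ],
    "dependencies": [
        # each entry lists what a component depends on (transitive edges)
        {"ref": "pkg:pypi/django@5.2.4", "dependsOn": []},
    ],
}

def component_names(sbom):
    """Flatten the inventory into a simple name -> version mapping."""
    return {c["name"]: c["version"] for c in sbom.get("components", [])}

print(component_names(minimal_sbom))  # {'django': '5.2.4'}
```

The `components` list is the inventory the speakers keep referring to; the `dependencies` section is what lets a tool rebuild the transitive tree later.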
Now that we have that list, we need something to compare it against to check for vulnerabilities, so we need a dependency database, which includes both vulnerabilities and end-of-life versions. And now that we have a list of all the dependencies and a dependency database with end-of-life versions, we need a place to show the data, something able to join those two things: a platform to display and manage both dependencies and vulnerabilities. And that is the solution I'm going to present now. >> So I'll give you a walkthrough of the solution we built, starting with the tools we chose. To generate the software bill of materials we chose CycloneDX, from OWASP. It's a really nice tool and one of the standards we have right now, and it generates the SBOMs for the applications. To query and get all the vulnerability information, we are relying entirely on OSV.dev from Google right now. They have a really nice open source database that you can query; you can even send software bills of materials directly to them and they will return your SBOM annotated with the vulnerabilities themselves, which helps a lot with processing and debugging. To check end-of-life dependencies we are using endoflife.date, another tool you can just go and use; we sync that data into our main tool. And for the processing and displaying we are using a tool that we developed, called Surface Security. This tool started in 2017 in the former AppSec team at Paddy Power Betfair, now the Flutter Entertainment group. It's a Django framework that has a lot of integrations, and it's really scalable the way we built it, because you can integrate more apps into it. The tool is open source and you can have a look at it on GitHub; we'll share the repo. Now, on to the challenges themselves. Identifying and processing dependencies is a really tricky one because, okay, to identify the dependencies all we need is to generate the software bill of materials, but then you work in an organization with 40,000 repos, so how do you generate the dependencies for all of those repos, and how do you make people adopt this process in their daily work? This was one of the biggest challenges we had while developing the tool. We needed to create templates for the teams to use, and to generate the dependencies in an automated way so the process just works for them. Then there's synchronizing SBOMs: we needed a process to collect all the bills of materials that get generated, and there are a lot of them once you start onboarding tools into the SCA. Then we need to manage the vulnerabilities and, of course, as my colleague mentioned, create custom dashboards for the teams to go and use, so they can be a bit proactive, because it's impossible, from a security point of view, to manage all of this in
the security team alone. You need to pass responsibility to the development teams, to get them using the tool and make sure we end up in a better shape from a security point of view. So, our tool consists of two main modules. We have Surface SCA, the main module, which is the SCA tool itself, and then the SBOM repo. The SBOM repo is a simple Django app consisting of two DB models, where we store only the SBOM data and all the vulnerabilities from OSV.dev. We consume the API to get the vulnerabilities; it gives you around half a million vulnerabilities you can use in your tool, and we process the SBOMs to store all of this. On top of that we have Surface Security. I'm not sure you can read the models because they look a little small, but I can explain them. In our SCA we have multiple models, the main one being the SCA dependency. The dependency model is the parent model for everything; it has a many-to-many field to itself, so we can build a dependency tree based on it. We also have the end-of-life dependencies model, which we sync separately. And then we have multiple relations in the database that let us go from a dependency all the way to an owner. That's what we wanted to achieve in building this tool, because nowadays, in big companies, having a tool is great, having the results is awesome, but "who is the owner of this vulnerability?" is a really hard question to answer; ownership is one of the hardest things to get right in a company these days. Our process consists of creating the bill of materials, scanning it with OSV.dev, checking whether dependencies are end of life by crossing the data with endoflife.date, pushing the results to Surface, and then we have our views so people can use them. As for the way we implemented this, so it would be scalable and usable in our reality back then, and it's still working as-is today: we use Jenkins and GitHub Actions to generate all our beloved bills of materials, and then we send the results to Surface. We only show Surface here because right now Surface and the SBOM repo work together; the SBOM repo is just a Django app, so we can install it inside Surface itself and have everything in the same place. Of course we query OSV to get the vulnerabilities, we push them to Surface, and then we have the UI right there. The reports and score are worth a mention: the score is just a way we created to gamify the security process a little inside the organization, so that's a side note on what the score is good for. The key features of our tool: I already mentioned SBOM synchronization and dependency processing. We group the dependencies by project, of course, otherwise it would be a mess, and later I can show you our first version to prove it would be a mess to manage without this grouping. And vulnerability management: a way for people to manage their vulnerabilities. For extra tools, besides the views, we have a check to see whether dependencies exist in public registries. We have a lot of internal dependencies, and if those names are also available in a public registry and the version is not pinned, that can lead to dependency confusion. So we check whether our internal dependencies, and all dependencies, are available in the public registries, to help mitigate dependency confusion. We are integrated with CI/CD, we have our own suppression mechanism, and of course we have automatic updates. We also identify whether a vulnerability is fixable or not, because there are vulnerabilities out there with no patch yet. Not everything is fixable from a developer's point of view; you cannot just go and upgrade a dependency, and in those situations it is
good to know that you need to take other kinds of action besides updating. So after this, I'll give you a small walkthrough of the tool, and show you what we have right here. Let me just organize the screens. Okay, is this okay to see? Yeah. Okay. To start, I have two software bills of materials that I generated previously for the presentation, because sometimes they take a little time to build. Both were generated with CycloneDX: the Surface SBOM JSON, and a Node one just as an example. So, sorry.
This is the kind of information you get in an SBOM. As you can see, the Surface one is really small, and we have a lot of information in a single place, so that someone can just take an SBOM and get information out of it. Of course you can get the dependency list, but on its own that's not very usable. So right here I have my Django server running our tool, Surface. I'm opening Slack by mistake, sorry. Our Django server, and right here, this is Surface. I'm starting with a clean database. The only things we have right here are the vulnerabilities from OSV, which take around 20 or 30 minutes to sync, so it was not a great idea to sync them live, and the information from endoflife.date, which I can zoom in on a little; maybe that helps. That's all we have for now: as you can see, there are no dependencies and no projects, the database is completely clean. The repo we'll be scanning is Surface Security, basically a fork of the open source repo, and that's what we'll run the SCA on. The way we start: we generate the SBOM and then we ingest the data into Surface. So we'll take our curls, which I hope work, and we just send the first bill of materials to the SBOM repo, and then the second bill of materials to the SBOM repo. Two POST requests, two bills of materials created. If I go and show you the SBOMs, we now have two entries in our SBOM database. This is just API-based; there's no UI for this, it's just the database where we store the SBOMs. If I take this one and open it... it should open. Take this one. It's not this one, but I can just copy this and paste it. Sorry. No, it's just a space.
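The enrichment step that comes next, sending the bill of materials to OSV.dev, boils down to turning the SBOM's component list into OSV queries. Here's a minimal sketch, assuming PyPI components and the public `api.osv.dev` querybatch endpoint; this is not the actual Surface code, just an illustration of the API shape:

```python
import json
from urllib.request import Request, urlopen

OSV_BATCH_URL = "https://api.osv.dev/v1/querybatch"

def sbom_to_osv_queries(components, ecosystem="PyPI"):
    """Map CycloneDX components to OSV querybatch entries."""
    return {
        "queries": [
            {"package": {"name": c["name"], "ecosystem": ecosystem},
             "version": c["version"]}
            for c in components
        ]
    }

def query_osv(components):
    """POST the batch to OSV.dev; each result lists matching vuln IDs."""
    payload = json.dumps(sbom_to_osv_queries(components)).encode()
    req = Request(OSV_BATCH_URL, data=payload,
                  headers={"Content-Type": "application/json"})
    with urlopen(req) as resp:
        return json.load(resp)

# Example (performs a real network call):
# query_osv([{"name": "django", "version": "3.2"}])
```

OSV responds with one result per query, each holding the vulnerability IDs that match that exact package version, which is what gets joined against the stored SBOM.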
I'll give it the space right there. So this is the bill of materials itself that we just pushed to our tool. We have the list of dependencies, and you can go through it: this is a Django package, this is Surface itself, and for now we only have the dependency tree and the bill of materials. If I pass it the bom-data parameter, it now goes to OSV, sending the bill of materials and getting back the results with all the vulnerabilities identified for it. It looks the same, but here in the SBOM repo we now have a vulnerability list for this package. So for Surface, in this case Django 5.2.4, which was the version in the repo when I scanned it, we have all the information right here. And right now, if I zoom out, we still don't have anything in our projects, because that's the second step: this is just the repo where we store the bills of materials and the vulnerabilities. If I run the SBOM sync, it will go through the entries and pull all the information we have in there, all the bills of materials and everything else, and start loading it into Surface Security, the main tool. It gives a few warnings on versions, because we use semantic versioning to check whether versions are valid, and some of the ones that come from the API are not really valid versions. And if this works... right now we have two SCA projects right here. But let me go back a little, to our version one. Our first version was this: we just started with a dependencies view, with all the dependencies right here. Of course, we can search for Django; let's go for 5.2.4, not like this, but 2.4, and it should appear right here. And if I search for it, we have the dependency. We have some recursive methods to build its dependency tree, and we also have a parent tree to see which dependencies are using this version of Django. This was just the first version; I wouldn't say useless, because it's better than nothing, but it's not really ready for someone to use. So then we built version two, with the SCA project view. It's basically the same model, the SCA dependency, organized in a different way, and based on this we can get the vulnerability information per project, see which repo it is, and download the SBOM if someone wants to have a look at it. On the dependencies tab we have a list of the dependencies of a project, with some filters right there. The vulnerable one is interesting, because you can see all the vulnerable dependencies of your project; if you look in there, there's Django 5.2.4. That's the one that made me really happy when I saw it: the critical finding in there is only a few days old, actually, and when it appeared I was really happy, like, oh, the tool works. This is really recent and the tool caught it. Nice, this is working; it was a really nice moment. Then you have the vulnerabilities view, where you can see all the vulnerability information we get from OSV.dev. We include a small description so people know what it means and what the vulnerability is. And of course we have a suppression mechanism. Let me go back to the projects to show you that we have one critical, seven high, and five medium right here. On the suppression mechanism, let's try to suppress a normal high one; it can be the first one. Of course, we make the suppression reason mandatory, otherwise developers would come here, press the button, and move on, because they want to deliver, and delivering is the most important thing for a development team. So let me put something in the description for you. And now it should have updated the counters: as you can see, we have one critical, six high, and five medium, because the other one was suppressed for a reason, and someone put their name right there to say this was suppressed because of this reason. If there are ever problems related to it, we have ownership of that action, and that's really important. Beyond this, we wanted to make the tool user-friendly for developers. So, okay, developers now know we have these vulnerabilities in our dependencies, and the best way forward is to update. But update to which version? The latest, which might break things? The next major? The minor? So we have an integration right here that calls Renovate, and if I have Docker running, which I think I do, pressing this button calls Renovate for Django, and in the repo it will create a merge request updating the dependency, for the dev teams to go there, review, and merge in the end. Let me check if it worked; it says it did. Let me refresh. There it is: we have a pull request for the teams.
And now this pull request upgrades Django to the latest version that fixes that critical CVE, which is 5.2.8. All you need to do is check that your code still runs, and merge it. So for the SCA part and the tool itself, that's it, and now I'll hand back over for the key takeaways and overview. >> Yeah, let's recap all of this, give you some key insights into what's important, and go over our plans to expand the tool and enhance the solution itself. Some key takeaways. The SBOM repo is a pip package: you can just pip install it to add the Django app to your application, and once you install the SCA bit you'll be able to sync all the vulnerabilities and end-of-life versions. It might be biased of me, but I think the tool is easy enough to go to GitHub, understand, implement, and set up for your own needs. It's also an opportunity to get into the Surface Security world, because we have a lot of apps in there beyond the main app and the SCA repo; there are others that are quite useful, so just go there and take a look for yourself. It's an open source project, a solution that's out there for everyone, and we're calling for contributors, because we've put the solution out there and it's always good to get feedback on it. I want to highlight the importance of solutions like this: most of us use open source tools in our own development, and as you've seen, we used a lot of them to implement this solution, so it's our opportunity to give something back to the community that's open source and out there for everyone. What we plan to do in the future is improve the documentation a little. I think the tool itself, and the whole Django Surface ecosystem, needs a bit more documentation, and some kind of governance to go with it: a public roadmap, say, for future releases, because otherwise people just go there when they want to learn about the tool but won't come back for future updates or anything like that. We need to reduce the tech debt a little, because there are standards and conventions we need to follow across the whole application. And we need to do external communication, because we can have a great product, and I really do think this is a product, it is useful, it's out there, it's open source, so just go and use it, but we need to do security meetups like this one, talk with people, and say, okay, we have this tool, because even if we have a great tool, if nobody knows about it, the tool gets lost. And well, I think that's all, folks. If you have any questions, we're here to answer, but feel free to take a deep look at the GitHub to understand what we have in the SCA app and in the whole Surface Security app. That's our LinkedIn and those are our emails, so you can reach out.
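As an aside, the self-referencing dependency model and the recursive tree methods mentioned during the demo can be sketched outside Django like this. The edge data and function names are made up for illustration; in the real tool the relationships live in a Django many-to-many field on the dependency model itself:

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the self-referencing dependency
# model: each package maps to the packages it directly depends on.
edges = {
    "surface": ["django", "requests"],
    "django": ["sqlparse", "asgiref"],
    "requests": ["urllib3"],
}

def transitive_deps(name, seen=None):
    """Recursively collect everything `name` pulls in (the dependency tree)."""
    seen = set() if seen is None else seen
    for dep in edges.get(name, []):
        if dep not in seen:
            seen.add(dep)
            transitive_deps(dep, seen)
    return seen

def parents_of(name):
    """Inverse walk (the 'parent tree'): who directly depends on `name`?"""
    reverse = defaultdict(list)
    for pkg, deps in edges.items():
        for d in deps:
            reverse[d].append(pkg)
    return reverse[name]

print(sorted(transitive_deps("surface")))
# ['asgiref', 'django', 'requests', 'sqlparse', 'urllib3']
```

The parent walk is what answers the talk's key question: when a deep transitive dependency turns out vulnerable, which top-level projects are actually affected.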
>> Hi guys, thank you for the presentation. I have a quick question on the process where you recommend an upgrade path for those vulnerable libraries. Are you just taking into account the latest available version and confirming that it doesn't have those specific vulnerabilities, or do you also take into account things like whether there are breaking changes when upgrading to that library? Because that can have an impact on the upgrade itself, and developers may need to refactor code. >> Of course. So right now we rely 100% on Renovate. Renovate is a tool out there that does this by itself. We just tell it to search the repo, in
this case when I press the button, based on the Renovate configuration for Django. It goes to the repo, finds the Django dependency, and upgrades it to the latest. Usually it upgrades the dependency to the latest minor and creates another pull request for the major version, so the teams can look at and test both, and see whether they can go to the major or not. But right now we rely completely on Renovate for this process. >> Okay, thanks.
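A minimal Renovate configuration along the lines described in this answer could look like the following; this is an assumed illustration, not the configuration actually used in the talk:

```json
{
  "$schema": "https://docs.renovatebot.com/renovate-schema.json",
  "extends": ["config:recommended"],
  "packageRules": [
    {
      "matchPackageNames": ["django"],
      "separateMajorMinor": true
    }
  ],
  "vulnerabilityAlerts": {
    "enabled": true
  }
}
```

With `separateMajorMinor` enabled (it is Renovate's default), minor updates and major updates arrive as separate pull requests, which matches the behavior described: teams can merge the safe minor bump and evaluate the major one independently.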
>> Thanks for the talk. I wanted to ask what the greatest challenges were that you faced in this project, because generating SBOMs is not trivial, given the differences every project has. How did you tackle this? Thank you. >> That's a big one, because this was the biggest challenge we had. Working with multiple code bases is really challenging when you want to build a unified solution for SBOM generation, because you don't know the other teams' code. You know the repos are out there, you know the languages they use, but you don't know the teams' code. And we have found that SBOM generation works much better against a built environment: if you run cdxgen after you build the tool locally, the results will be much better than just pointing it at the repo. Cloning the repo and running cdxgen on it straight away will not give the same results. And having this work for multiple languages and multiple environments is really challenging. What we did in our case, as we had done in the past for SonarQube, was to try to mimic the teams' CI builds. We looked at the way the teams build their projects and tried to mimic that, creating multiple templates for multiple languages based on the standards we identified over time. Based on that, we tried to make the templates as good as we could, so this would work, be scalable, and cover as many languages as possible across our codebase. But yeah, it's really challenging. Is it perfect? Of course not. Is it running for everything? No, it isn't, but we're here to make it better and to keep identifying patterns. Ideally, the best solution would be to have the teams send you the SBOMs themselves. If you think about implementing this in your own companies, try to set up the process so that the teams deliver the SBOMs, because they know their tools, they know how to build them, and they can give you really valuable information in the bill of materials. I hope this answers the question.
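The "build first, then generate" advice can be sketched as a small wrapper. This is a hedged sketch: the build command and paths are placeholders, and it assumes the cdxgen CLI is installed and on PATH (its `-o` flag selects the output file):

```python
import subprocess

def cdxgen_command(repo_path, output="bom.json"):
    """Command line for cdxgen; kept separate so it can be inspected/tested."""
    return ["cdxgen", "-o", output, repo_path]

def generate_sbom(repo_path, build_cmd, output="bom.json"):
    """Build the project first, then run cdxgen, so the SBOM reflects the
    resolved (post-install) dependency set rather than only what can be
    read statically from the repo."""
    # Placeholder build step, e.g. ["npm", "ci"] or
    # ["pip", "install", "-r", "requirements.txt"]
    subprocess.run(build_cmd, cwd=repo_path, check=True)
    subprocess.run(cdxgen_command(repo_path, output), check=True)

# Example (requires a real project and cdxgen installed):
# generate_sbom("./my-service", ["npm", "ci"])
```

In a CI template this pair of steps simply runs after the team's normal build stage, which is how the speakers describe mimicking each team's pipeline.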
That's it. Thank you everyone. I hope you enjoyed.