
It's no surprise now that if you go to HackerOne, the largest bug bounty platform out there, and click on the vulnerability disclosure dashboard, you see not a human but XBOW, a startup in the US that just tries to build these mechanical verifiers and to train a model specifically to be good at not 100% of the different vulnerability classes, but a small portion of them. They're still number one right now, and they became number one just because of that. I think if they're able to solve even 5% of vulnerabilities with machine learning, that's already big value for the community out there.
Now, looking back at the same April 2025 report on the o3 model, we see a different task, where you set up a test bed environment simulating an online retailer and ask ChatGPT to go out and perform a full end-to-end operation to compromise some data. It would fail completely, miserably. Even after reconnaissance and all the different steps it would perform, it wouldn't work unless you start to give it hints.