
BSidesSF 2026 - What a False Alarm Taught Us About Security as a... (Alex Chantavy, Kunaal Sikka)

BSidesSF · 30:19 · 44 views · Published 2026-05
About this talk
What a False Alarm Taught Us About Security as a 2-Person Startup
Alex Chantavy, Kunaal Sikka

Our new startup was doing great. We had just onboarded our first customers and closed our fundraise when disaster struck: the password on our production database had been changed! This is a story of how a false alarm humbled us through a full IR playbook as a 2-person startup, and what we learned.

https://bsidessf2026.sched.com/event/cc054cbd2ea6baf708f62a8a7e765871
Transcript [en]

Without further ado, we have the powerful team of Alex and Kunaal, and they're going to talk to us about their experiences at their startup. I'm handing over the mic because I know they have a conversation for us. Big round of applause for Alex and Kunaal. >> Thank you so much. >> Hey everybody, I'm Alex. >> And I'm Kunaal, and the two of us are the co-founders of a company called SubImage. First, I'd like to open with a show of hands. Who here has either founded a startup or had hopes and dreams of doing that? A little trickling. I would have thought, this is San Francisco, man. Really? Yeah. So, this is the

story. We created SubImage; we're a security company. This is one of the very first hard challenges that we faced after we had gotten our funding. To be very honest with you, this is not an easy talk to give. It's frankly a little bit embarrassing. We're opening up a little bit, but we're hoping that telling this story helps somebody here save time, especially with IR and security and the multitude of other things that are uniquely challenging about being a startup in the security space. Just to tell you a little bit about who we are, so you know who's talking at you.

So, I've worked in security for about 15 years. I started off my career at the NSA, eventually moving into private industry at Microsoft, where I worked on the Azure Red Team. Thank you. Then I went over to Lyft, where I created a tool called Cartography and where I met Kunaal. >> Yeah. Hi. I started my career at Lyft and met Alex. I started using the open source Cartography and found it incredibly powerful in every way, shape, and form. I spent four years at Lyft working on everything from vuln management to data security to some basic IR. And then I went to Anthropic, where we were doing

all the things all at once, very early on. So yeah. >> We actually talked about our whole story from the very beginning: getting funding, all of our hopes and doubts, and how Kunaal moved his entire life across the country from New York City all the way to San Francisco. We talked about this at BSides New York City, so go check that talk out. Essentially, this talk begins after we had landed our first enterprise customers, which we did in about two or three months. It was an absolute marathon. It was a sprint and a

marathon at the same time. It felt bad for all of those reasons at once. After that, YC puts you through this whole accelerator, and once we got that done we raised a $4.2 million seed round. In normal-people terms, that means you have enough money to start your company. You can hire a team. You can do more than just have two people building product and doing sales all the time. You can start to delegate responsibility. So this was us being able to really take this thing to the races. This was the first time that we felt even a little bit comfortable. Getting the first few customers, that's

really big. Getting money is kind of permission from the Valley: okay, you can try to do this thing. This was the first time that we really felt the wind at our backs. Yeah, it was going well. I'm going to go into this a little bit (we talked about it in the BSides talk): there are some very unique challenges for a security startup. Security startups need a little more maturity than startups in other industries. It's different. We're not making yet another AI wrapper and selling it to somebody who doesn't take a very critical view of everything. All of us in this audience of security professionals,

we're all skeptical people. We don't believe anything, and we are very critical of every single thing. We look at the details of every single vendor we work with. We have vendor intake processes that are outright hostile sometimes. So the bar is very, very high. That's a challenge: if you're coming at this as a very small company, just the two of you, you've got to get past that somehow. We had some ways to try to establish that credibility. This is what our security stack looked like as a two-person company. We thought we were so cool. We had

an IdP. We set up Okta from day zero. We had set up MDM even though there were two of us. It was required for SOC 2, and we were like, oh, let's just do the whole thing, let's set it up properly: Rippling MDM, SentinelOne, all the EDR, all the things. We had password managers that we used religiously everywhere. And I mean every single time we could use OAuth, we did. We were mostly passwordless, with just a couple of little things where we needed to use the password manager. We had set up all the alerts. We had all the sidecars that you can have in AWS, like GuardDuty. We're

collecting metrics on everything. We had it all, even though we were a two-person company. And a lot of people actually thought this was bad. In Silicon Valley, all of our YC partners would say you should not do anything until you have PMF. PMF means you have several customers that keep using you, that keep coming back to you. Not only have they paid you, but they use you. >> Product-market fit. >> Product-market fit. Sorry. And we were like, oh no, we're going to set up all these security tools. We thought we had something crazy robust. I guess the equivalent here would be if you're a dev tools company selling Kubernetes

solutions and you yourself use Kubernetes as a two-person startup. So if you think about that midwit meme, that whole bell curve, we're probably, honestly, the midwit. As for the simple thing to do, I actually don't know what that is. I'm kind of curious what you all think: what is the minimum set of tooling you would need to start your own company working in security? But we were certainly in the middle of that midwit meme, overthinking everything and thinking that this would give us all that we needed to get that

credibility and move on with our sales. I'll tell you a little bit about what our company does, just so you have the context for what happened during our IR. We are a hosted version of Cartography, the tool we built at Lyft back in 2019. We've been ingesting data into a knowledge graph, and there are all kinds of amazing security use cases you can build on that. So we built out a little control plane where, whatever the data is, whether it's AWS or Okta or anything else, we sync it in, and then you can do things like find attack paths, you can do

vulnerability management. This is just to orient you on the problem we're solving, and also on why it scared us so much to have a potential incident. It all began with an infra migration. We started off like every startup: you're going very scrappy. You keep everything as simple as possible. You've got a very cheap EC2 instance writing to a knowledge graph database, and you build an application around that, just to be quick, to prove something out, and to deliver a minimum of value. We were growing up, and now it was time to

move on to something a little bit more sophisticated. We thought this would be a really simple migration. I mean, we have some code, we package it up, we put it on an EC2 instance. Now we're packaging it up and putting it on a different compute platform. We're using the same underlying databases, just migrating some infrastructure, with very few code changes, just some minor container packaging changes. That was supposed to be it. And just to orient you a little bit, that knowledge graph database is Neo4j. That's the heart of everything we're putting together. We need all of these massive joins, so we put the data in a graph database. Neo4j has a hosted cloud version called Aura. And

then it has this control plane where you can view all your logs and metrics. All of this is helpful context for what comes in just a little bit. For the infra migration itself, again, we were moving from EC2 over to ECS Fargate, running on containers, so that things would be a lot more reproducible when we push a commit and deploy it, helping us move faster. As part of this migration, we had moved a subset of customer data over to the new database. Then everything else was just wiring it up together. And the initial tests were working.

Everything seemed okay, except, well, all right. Two days before our incident, the laptop was still able to write to the new database. We were pretty happy; things were humming along. Then, the day before the incident, late in the day at the office, getting close to 6:30, we noticed something a little bit weird: we were unable to authenticate to our database from our laptop. What's going on here? This isn't the actual screenshot, but it looked something like this. We thought it was a typo, and, you know, as a startup you've got a million

different things to do, so we had to go do something else. We kind of ignored it. >> Also, we had the password manager. It had worked earlier in the day. We were pretty sure that same password was fine. That's why we had the password manager: to save the username and password and autofill them. And now it just shows up red. >> And, you know, we thought everything was fine. We got distracted and came back. The very next day, a little bit after lunch, we saw a successful logon in the logs that we did not recognize. Holy crap. At that point, the credentials on the laptop were still not working. And there are a million different

thoughts going through our heads at that moment. It did not feel good. This was our holy moly moment. Moving into the next few steps: as we go through this, think for yourself about what you would do. Put yourself in our shoes. Just laying out the context: two-person startup, just got funded, just landed our first enterprise contracts. These companies had put their trust in us, so we've got to do everything right now. So, our initial assessment. And just to be very human about it, I mean,

you can't help it: there definitely is a little bit of panic going on. We had been working in security at large companies for a very long time, so this was not our first rodeo. We had done IR before and been looped into incidents. I'll just say here that it hits completely different. You can have all the ownership in the world and take pride in the things you're building when you're working for somebody else, but when it's yours, it just hits completely differently, and you feel that.

It's completely different. So we looked at how this could have happened. We had no evidence of brute forcing. This is a sample of what those logs looked like. There was no evidence of brute forcing or of other IPs. We just saw: okay, here's a failure, here's a success, nothing more. We had no evidence that somebody was actually there, but we couldn't rule it out either. We didn't know what was going on. Is this malicious or not? We were just trying to validate. The next part was trying to figure out how it could have

happened. Maybe they got in by popping one of our laptops. Maybe they got in through one of our phones, if it happened at all. We didn't have any evidence from our EDR. But up until that point, we'd had no reason to really test our EDR. We'd had no reason to put the EICAR test file on a machine and see whether the alert was routed correctly or anything, so we weren't really confident that it was working at that point. >> Also, we logged into our EDR, and I'm not going to cast any shade on particular vendors, but it sucked. It just sucked. We had not used that particular one

before, and we logged in and we were like, this UI is from '98, and we have no idea how to even use this. This is horrendous. >> Not going to say who they are, though, but ask me later. >> The design and UI, I mean, sure, the bones are what matter in a product, but if it doesn't look credible, you're not going to trust it. And security is all about trust, in whatever way, shape, or form that takes. For us it was just, I don't even know if this is working. This UI is just really bad. We thought about other angles: our internet-exposed assets. So that EC2

instance, was it open to the whole internet? No, it wasn't. We used our own product to validate that. We looked at the cloud console, the Neo4j console that I showed you earlier: that was behind 2FA. The database itself was open to the internet, but only because we were setting things up as part of the infrastructure migration I mentioned earlier. All in all, still no real evidence of anything going on. The other part that I forgot to go into earlier: I have in the corner up there at the top, day one. What was weighing

on our minds through all this is that we have a data protection agreement, a DPA, with our customers. Who does not know what a DPA is? Cool, everyone knows about it already. >> Wait, wait, wait, we should go over it. >> Yeah, yeah, we're absolutely going to go over it. A DPA basically outlines the agreement, the expectations, for how we handle a given client's data. Our DPA says that if we know about an incident, if a material data breach by the GDPR definition has happened to us, we must notify customers within 72 hours. And so the clock was ticking and,

mind you, at this point we didn't technically know whether anything had happened, but we couldn't rule it out either. So time was of the essence, and we were very aware that we were up against the clock, just trying to go as fast as possible. Also, a small addendum here. I don't know if you all have seen the news recently about certain startups and compliance, and the memes that followed, but this has become particularly relevant even in the last two days: even as a startup, we have to comply with these regulations and with the compliance attestations

that we've made, because our customers are often 100 times larger than us. >> Very relevant. >> Yeah. >> Yeah. Okay. Next step in IR. We attempted to validate as far as we could, and we couldn't get any additional evidence. The next step is trying to determine and contain the potential impact. Laptops and phones: that was the only real vector we could think of through which any of this could have happened. Nothing from EDR, but then, out of paranoia... why don't you go into this one? >> We just wiped everything. It was a mistake. Looking back, we

should have quarantined the devices. Also, those were our only two devices, so we had to set our devices up again. It probably set us back a couple of hours. >> That sucked. Yeah. >> Yeah. >> Don't do that, right? >> Yeah. >> Immediate regret. It's like, oh yeah, cool. Oh, the other part, too, is that we were working these threads in parallel, right? You were looking at the endpoint devices. Yeah, Kunaal was doing endpoint devices and I was doing the database itself. We were heads down in our own little lanes, and we went silent for a little bit. In thinking about how to contain things, I deleted the

database. I thought, okay, the goal is to contain things. If there is an attacker here, I'm going to make sure that they're not going to be able to grab anything out of it. In my fear, I was going to make sure that they couldn't get any of this potential customer data. But immediately after doing that, I was like, "Oh, no. We lost the logs." We didn't have any of that. We couldn't get that evidence at that point. We did look at the security logs before that. But then, if we wanted to refer to them again, oh,

that's not good. This is some feedback we gave to Neo4j, because when you delete a given resource in any of the major cloud providers, you still have the logs. You still have the CloudTrail logs. We had expected that. We didn't expect to delete the database and then lose access to all of its past security logs. Again, a nuance that we didn't realize in our panic in the moment. And clearly, at this point, nothing is working. It is not feeling good. We're at day two by this point. We did not sleep well at all. I think we got maybe a couple of hours of sleep, if that. It was not

restful. Let me just get into this. Things were spiraling. We were thinking: are they still around? What could they be going after? Were they watching our Slack? Were they there somewhere? What could they want? What was the next target? Was it someone trying to embarrass us or something like that? Because if they were trying to embarrass us, I mean, message received: we were certainly feeling very embarrassed. And the biggest question here: were we dead as a company? Security is all about trust. Especially as a startup, like I mentioned before, our early customers took a big bet on us.

So we were mentally preparing ourselves. Okay, if this was real, let's get all our ducks in a row, craft the right message, present it to them, and just own up to it. That was heavy. By this point, day two, this was the lowest of the low points in our journey so far. We were out of ideas. We called Neo4j for support. We said, we need urgent help on this thing. Can you restore the logs from that database that we deleted? That I deleted. Sorry. And the next part was, yeah, again, we prepared to notify them

and, again, we were thinking about the 72-hour SLA through this entire thing. Neo4j got back to us, and their support was awesome, by the way. They got back to us very quickly and said, all right, the earliest we can get on a call is tomorrow morning; we don't think it's a security breach; let us show you what we mean. So we got on the call, and thankfully Neo4j was able to restore the logs and share everything. I'm going to leave this up on the screen for a little bit. This is showing every single activity,

authentication and whatnot, against that database instance. Does anything pop out at you immediately? Can you even see that? I think it's a little hard to see. Maybe we should zoom in a little bit. Let's do that. Do you see that the letter J is very skinny? So, yeah, we had the username wrong, but only in local dev. We had the username wrong in our password manager; we had saved it incorrectly. The password was correct. The username was, well, I don't know what "neo4" is. neo4j is the name of the database. It's also the name of the default privileged user that is created when you

first provision your database. We were using that user for our initial migration, just to get our ducks in a row before we buttoned it up and did things the right way. Why don't you talk through the clicking and all that, getting into your laptop and whatnot? >> Basically, when we were clicking around, we realized that when we had first signed in to the database, we probably just used the normal creds and it worked. Then, once we saved them to the password manager, it stopped working. And then we also noticed that prod was working just fine. Earlier in the process, as we mentioned, we

checked the security logs. But then we went back and put up a new database. By this point, we were running it again. We realized that if we had checked the query logs, we would have seen that they looked completely normal. They looked like our prod. So the successful logins were from our prod. Everything was working completely normally. The failed logins were just from when we tried to click into the database from local dev. >> Exactly. So the next question, and I guess Kunaal went into that: what the heck was that successful login? Just to zoom out, here is that entire block diagram to make sense

of that. You've got our laptop: that was the edge in this call graph that was not working. Not the password; the username was wrong. But the successful login came from ECS Fargate. That was everything working. In our panic, we had completely overlooked that. In our panic, we didn't bother to go look at the query logs. And in our defense, if you went to the query logs at that time, there was a button that made you say, "All right, I want to request them for the past half hour," and it made you do that for this half hour, that half hour,

that half hour. In the middle of an incident, we are not doing that; that is just wasted time. Or, I don't know, maybe we should have. They have since improved it, to their credit, but at that time it really took the wind out of our sails. So yeah, for the sync job, we looked at the security logs. We did not look at the query logs. If they had been in the same place, being able to correlate those things, being able to say, okay, here's a failed login, and here

are the operations that are going on in that database: oh, that would have helped immensely. All in all: no breach, nothing to report to customers, zero unauthorized authentications to the database. And, you know, this was not a time to celebrate. We were certainly relieved, but I was more upset than anything else, to be very honest. Learnings. Obviously the big key thing here is that panic absolutely made us blind. We were freaking out over this. Use playbooks in your incident response. Shooting from the hip and relying on our own previous experiences may have pre-biased us

or made us not follow a disciplined process. It would have been helpful, even for us as a two-person startup, to think about the common ways to do things in order. Look at all the relevant logs. Take this one with a grain of salt, because in a large enough environment this is just not feasible at all. In our specific case, we only had security logs and query logs, and if they had been put together in a good way, it would have helped us see all that and get sane at that point. Also, you all saw that we use all these different security tools,

but we had not set up centralized log forwarding for that third-party application. Large companies often have all sensitive resources forwarding logs into one place. We had drawn a line and said, "Oh, well, Neo4j gives us good logs in that viewer. We'll just use that." But had we been forwarding logs, we would still have had them after that database was deleted. Perhaps we would have had easier ways to look at those logs instead of 30 minutes, 30 minutes, 30 minutes. Don't wipe your machines. When you're doing this, find a way to quarantine them, but certainly do not wipe them. You will want to do forensics on

them. And test your EDR before an incident. Make sure that it's actually working the way you expect it to. It very well might have been that the EDR vendor we were using just had a UI that was rough around the edges. It very well might have worked perfectly fine, honestly. Some things to take away: we had all the data the whole time, but because all of these different pieces were separate, we couldn't make sense of anything. The data ended up being effectively useless and only helped us spiral. Another thing that is kind of embarrassing to think about: as two people who knew their infra inside and out, we still got lost.
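
The point about reading separate logs together can be illustrated with a small Python sketch. This is a minimal illustration under stated assumptions, not the speakers' tooling and not Neo4j's log format: the `AuthEvent` fields, the `KNOWN_USERS` and `KNOWN_WORKLOADS` values, the similarity threshold, and the label strings are all hypothetical. It puts failed and successful logins on one timeline, treats a success from a recognized workload as expected, and flags a failed login whose username is a near miss of a real account as a probable typo.

```python
from dataclasses import dataclass
from datetime import datetime
from difflib import SequenceMatcher
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class AuthEvent:
    """One authentication record (hypothetical shape, not a real log schema)."""
    when: datetime
    username: str
    source: str   # hypothetical label for where the connection came from
    success: bool


# Assumed values for illustration only.
KNOWN_USERS: Set[str] = {"neo4j"}
KNOWN_WORKLOADS: Set[str] = {"ecs-fargate-sync"}


def near_miss(username: str, known: Set[str], threshold: float = 0.75) -> Optional[str]:
    """Return the known account this username closely resembles, if any.

    An exact match is not a near miss, so it returns None.
    """
    best, best_ratio = None, 0.0
    for candidate in known:
        if username == candidate:
            continue
        ratio = SequenceMatcher(None, username, candidate).ratio()
        if ratio >= threshold and ratio > best_ratio:
            best, best_ratio = candidate, ratio
    return best


def triage(events: List[AuthEvent]) -> List[Tuple[AuthEvent, str]]:
    """Label events in time order so failures and successes are read together."""
    labelled = []
    for event in sorted(events, key=lambda e: e.when):
        resembles = near_miss(event.username, KNOWN_USERS)
        if event.success and event.source in KNOWN_WORKLOADS:
            label = "expected: known workload"
        elif event.success:
            label = "investigate: success from unrecognized source"
        elif resembles is not None:
            label = f"likely typo: username resembles {resembles}"
        else:
            label = "investigate: failed login"
        labelled.append((event, label))
    return labelled
```

Given a failed login for a hypothetical username `neo4` from a laptop and a later successful `neo4j` login from the assumed workload source, `triage` labels the first a likely typo and the second expected. A label like this is a prompt for where to look first, not proof that nothing happened; the query logs would still need to confirm it.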

We still didn't know everything. And it felt like all the tools in the world did not help. I think you want to double-click on this one a little bit. >> Yeah, we thought we were invincible. We did all the things, to our YC partners' chagrin, and still we couldn't find the bug until we called Neo4j for help. I think a lot of companies would take away the lesson that, oh, you don't need these tools, because they're not going to help anyway. I think we took the exact opposite approach. We were really, really

glad that we went through this, actually, and that it ended up like this, because we gained an insane amount of customer empathy. We had never been in a position where we owned a company, were doing IR, and had that amount of pressure on us, and we feel like this is probably what a lot of CISOs and other business owners face. They have this plethora of tools, every tool under the sun, and they still can't use them to figure out their issues. So we are really, really glad that we used all those tools and that we actually went through this, because, again, that customer empathy was worth more than anything else. At the beginning of the talk I

mentioned how most startups in the Valley are told not to do a lot of security until they have PMF, product-market fit, until they have something to secure. But I think setting up these tools and being a security-conscious company is important for a security startup on the road to PMF, because, again, you learn what problems customers are facing. Parker Conrad tweeted about this a couple of days ago with the Rippling AI launch. He said talking to your users is great, but actually being the user yourself is one of the most valuable places you can be. For us, in an incident, we found that we needed to answer the questions

really fast. That was the main thing for customer empathy. When our own customers are going through their own incidents and need to get their answers, it's got to be fast. This is the pain-relief aspect of security, where there's always this tension: how are we enabling the business? What is the value that we are creating? This is why we provide value, and it was really good to feel that firsthand and directly. As I was mentioning earlier, and I'm going to say this over and over again, it's

completely different when it's your own company. One last thing I'll add is some really concrete learnings. After this, we went and made our product 10 times faster. Anything that took more than a couple of seconds to load, we made take milliseconds. A few months after this, we hired our first designer (he's up there, Zeie) to make the product incredibly usable, and every day he yells at us because this thing is not usable, that's not usable. We try to take almost a product-led growth approach with our security tools, so that in the moment, somebody who has never even used a tool before can figure out what they need in seconds. And the

last thing that we've done is add support for every integration under the sun. We realize now how cumbersome it would have been if we had some other tool that just didn't support Neo4j or didn't support something else. We realize that while there are economic moats a company can build by not supporting certain integrations, every security tool has to support everything. No security team should have blind spots with one vendor just because of some weird corp-dev relationship that went bad. To wrap this all up: we originally had a deep discussion about whether we should even tell this story in the first

place, because, quite frankly, it's super embarrassing. We did it all because of a mistyped username. We had every tool under the sun. We were supposed to know our infra. We didn't know our infra. But we're hoping that sharing this helps other prospective founders know what to expect and what to think about when you're in the middle of a fire. Zoom out, draw your block diagrams, get collected, follow those playbooks. And we hope it helps to know that you're not alone: other people have faced this as well. To close, we want to say thank you to BSides San

Francisco. You guys are awesome. Seven years ago, at this very conference, Cartography was presented and launched. I'm very grateful for all the support, and it's been absolutely amazing. So, thank you so much. >> Wow, that's pretty incredible. Alex, Kunaal, thank you for opening up. It goes back to last year's keynote, where Clinton was talking about vulnerability. We are all practitioners. Remember that you're participants here. In that vein, we didn't see any questions on the slide, but does anyone have a quick question for our presenters? We've got time for two. >> Appreciate it. All right, then we'll say find them upstairs. Ask more about

what's going on with their startup. And hopefully you all take some good lessons from this. Another round of applause for our presenters. Thank you.
