
Waking Up to AI: An Adventure in Governance

BSides Seattle 2026 · 50:32 · Published 2026-04
About this talk
Brian Myers walks through a scenario-based narrative of how a small software company confronts AI governance risks: from initial RFP questions about AI policy, through vendor management and contract updates, to shipping an AI-powered feature and managing post-release audits. The talk uses concrete episodes—including a vendor AI incident, licensing complications with open-weight models, and stakeholder decision-making under uncertainty—to illustrate how well-intentioned teams with typical resources navigate emerging AI security and compliance challenges.
Original YouTube description: BSides Seattle, February 27, 2026, lecture. Presenter(s): Brian Myers
Transcript [en]

Hi, my name is Brian, and this is Waking Up to AI: An Adventure in Governance. I'll go over this slide quickly. It basically says: I spent a long time in software development, from which I was ejected into management, from which I escaped about ten years ago into information security. I now work as an independent consultant helping companies set up and run information security programs. The core of that work is identifying risk and finding measures, security controls, to put in place to mitigate that risk. And that's what this talk is about: doing that work in our present time of AI ferment.

So, this data illustrates the starting point for my talk. It's from early this year, and the chart basically says that 94% of companies surveyed expect to use generative AI to help them write software. Great, no surprise there. It also says 82% of the same companies don't know whether their employees are using AI sites, or what they're doing on them if they are. This is a weird moment, a historical situation that will eventually resolve itself. I may not be able to give this talk for much longer, but I bet I can give it for another year or two. We'll see. So, some things about this talk, just what to expect as we go forward.

This is going to be a scenario-based walkthrough of what happens at a small company as it slowly becomes aware of AI risk. The situations aren't limited to small companies, but I picked a small-company example. This is my prediction of how it's going to play out. It's an attempt to consider AI risk in specific, concrete situations, where you're not just challenged to identify the risk but also to think about what you can do about it, which is a little different from some other very useful talks I've been to that informed this one. And this tale is both exemplary and cautionary.

I'm not trying to be coy. I'm going to show people doing what I think they will do: well-intentioned people with typical resources making, legitimately, probably the best they can of it. It won't be perfect, and I won't stop at every point, but you are always invited to think about the ways what they're doing is not adequate, or whether they could have done something better. This talk is also packed. I've given it a couple of times before, always longer. It was once a three-hour workshop. That was a mistake; it won't happen again. It has been pared down to focus more on just the skeleton of the main story and the struggle of the company.

I think the talk will benefit from this condensation, and I'd be very interested to hear from you afterwards whether it worked for you. Right, so let's get started. Once upon a time there was a small company called Mind Path. Their product was a learning management system, an LMS, and through that system they delivered to their clients professional training and compliance-related material like policies. So imagine you're a client of Mind Path: you upload all your policies and all your trainings, and maybe you buy some training modules from Mind Path and include those in your corpus.

That's the kind of product it is. Mind Path is a small company; they only have about 25 people, and you'll meet a few of them. About nine months ago, they went through their first SOC 2 Type 2 audit. So they have a security program in place; they know a few things. They do not, however, have on staff anyone who has ever been a full-time security professional, and they have no one with any particular AI expertise. I think this is typical. Maybe not in Seattle, maybe not in the Bay Area, but this is a normal situation; much of America, and much of the world, is in a similar situation.

And the story proceeds through a series of episodes: there's an event, and they respond. So here's the first one. They get a question, and it comes to them through a request for proposals from a company called Big Bucks that is thinking of buying their product. An RFP, a request for proposal, is something potential clients send Mind Path and businesses like it all the time. Mind Path is used to these questionnaires; they've answered the questions over and over. They know how to do this. But this time the RFP has a question they've never seen before. This time, the RFP asks them: "Do you have an internal AI governance policy?" Let me introduce you to a couple of our founders.

Maxine Powers, her friends call her Max, is the CEO, and the CTO is Archie Tech. Archie is also the de facto security officer, because, as is common in companies like this, when they went through their SOC 2 audit, the technology leader is the one who led the effort, so he is in effect the security officer. So when Maxine sees a question about a policy, she goes to Archie about it and asks, "Do we have an AI policy?" And he says, "We don't have any AI. There's no AI in our product. We didn't need it for SOC 2. We don't have a policy for that."

And she says, "Well, I'd rather not give Big Bucks any excuse to ding us. Can we have a governance policy so that we can say yes?" Archie rolls his eyes a little, but he agrees; it should be easy. There's no real risk to manage, so what does it matter what they write? He goes looking for some examples, because he figures there must be some out there. Unfortunately, he hits a stumbling block. He first goes to Drata, which has a set of very useful policies that he relied on for some of his SOC 2 policies, but they haven't gotten around to AI yet, or hadn't at least at the last time I looked.

He does find an AI policy in the SANS security policy project. These are both sources of freely available policies. Unfortunately, the SANS one is written for big international enterprises; it's 14 pages long and full of jargon that he is sure doesn't apply to them, and he doesn't want to publish that to his staff and be responsible for it. It doesn't suit them at all. In short, after a good-faith effort on Google, he fails to find a good example, but he does learn a couple of basic things from trying to read what is out there. And this is what he writes.

Now, he does a better job than this; what he writes looks like a real policy, with more formal language, but these are the basic security controls, the meat of what he writes. The core of it is really the third part. He says: we won't have AI in our services or our product unless it's approved by me first. There, he's taken his stand. The rest is all fairly straightforward and normal. So now they have an AI governance policy. He publishes it, and this solves his problem, because the problem he was trying to address was being able to say yes to Big Bucks. He wasn't really trying to create any other solution. The simplest thing that might work, agile principles. This is it. And that's enough for a little while.

Now we get to the next episode, and you see how the story works. In this episode, there's a bug. And in order to explain what the bug is, I have to introduce you to one of the sub-vendors that Mind Path uses. They call an API provided by a company called Transcribio, and Transcribio creates written transcripts of audio content. So Mind Path pushes through the API to Transcribio, say, a video of a trainer talking, and Transcribio creates a written transcript that it sends back, which then gets incorporated into the final training module.
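
To make the data flow concrete, a minimal sketch of an integration like this might look as follows. Transcribio is fictional, so the endpoint, field names, response shape, and token handling here are all invented for illustration:

```python
import requests

# Hypothetical endpoint for the fictional Transcribio service.
TRANSCRIBIO_URL = "https://api.transcribio.example/v1/transcripts"

def transcribe(video_path: str, token: str) -> str:
    """Upload a training video and return the transcript Transcribio sends back."""
    with open(video_path, "rb") as video:
        resp = requests.post(
            TRANSCRIBIO_URL,
            headers={"Authorization": f"Bearer {token}"},
            files={"media": video},
            timeout=300,  # uploads are slow; the dropped-connection bug lives around here
        )
    resp.raise_for_status()
    return resp.json()["transcript"]
```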

This has been working great for some months, but there's a bug where the connection occasionally drops, and no one knows why. So they enlist Cody to see if he can debug the problem. He comes back quite quickly: he's found the problem, and he explains here what it was and how he found the answer. If you're at all interested in application security, this line probably jumps out at you, because it is the authentication token for the production API at Mind Path. Archie sees this and immediately thinks: that's a security incident, for a couple of reasons. One, it's our policy here that we don't hardcode secrets into the code like that.

Two, you should never put secrets like that in Slack; that's need-to-know information, and there are people in Slack who don't need to know it. So he creates a security incident, writes it up, and handles it. It's a small one, easily handled. And then, as he's writing up the final summary, he rereads it and says: now, wait a minute, this code was shared with ChatGPT. That means this secret, and they've rotated the token by now, so they're not worried about that anymore, went off-site to another server we don't control. And he says, "Ooh, maybe this is a thing."
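
The anti-pattern Archie is reacting to, and the usual remediation, look roughly like this. The variable names, endpoint, and token shape are invented for the sketch:

```python
import os
import requests

# The anti-pattern in Cody's snippet: a production credential as a literal in
# source code, which then travels anywhere the code is pasted (Slack, ChatGPT,
# a public repo...):
#
#   API_TOKEN = "mp_live_4f9c..."   # hypothetical token value; never do this
#
# The usual remediation: resolve the secret at runtime from the environment
# (or a secrets manager), so the code can be shared without the credential.
API_TOKEN = os.environ["MINDPATH_API_TOKEN"]

def submit_job(payload: dict) -> dict:
    resp = requests.post(
        "https://api.mindpath.example/v1/jobs",  # hypothetical endpoint
        json=payload,
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```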

And he does some research, and sure enough, it is a thing. It's routine; it's measurable. Sensitive data is shared through tools like ChatGPT all the time, at a substantial, measurable rate. It happens regularly. Furthermore, Archie, googling around, finds out that most knowledge workers are already using tools like this. No surprise to you, but Archie didn't know how pervasive it was. And further, most of them are using non-company-issued tools, tools the company isn't managing, and they say they would keep using those tools even if they were banned. This really scares Archie; he's learned something here. So, first of all, he admits he's facing a new risk.

So he goes to his risk register. If you don't know, and you probably do, a risk register is really simple: it's just a list of all the information security risks your company faces. It's a standard tool for keeping an overview of everything you might be working on and for having discussions about what the priorities are and where the resources should be applied. He set it up as part of the SOC 2 audit; it's a standard thing most companies do.
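
A register doesn't need tooling; a structured list is enough. A minimal sketch, with invented fields and an invented rating scale:

```python
from dataclasses import dataclass, field

@dataclass
class Risk:
    description: str
    likelihood: str          # "low" / "medium" / "high" (invented scale)
    impact: str
    level: str               # overall rating used for prioritization
    treatments: list[str] = field(default_factory=list)

register: list[Risk] = [
    Risk(
        description="Employees share sensitive data (code, secrets, customer "
                    "content) with unmanaged AI tools such as ChatGPT.",
        likelihood="high",   # it has already happened here
        impact="high",
        level="high",
        treatments=["TBD: inventory actual AI use, then decide on controls"],
    ),
]
```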

So he adds an item to this risk register, and he does his best to describe what the risk is and how serious it might be. He rates it pretty high, because it has actually already happened at their company. The next question is what he's going to do about it, and he realizes there's a lot he doesn't know about what might be going on at his company. With a kind of sinking feeling in the pit of his stomach, he decides to go spelunking. The first thing he has to figure out is: what are people actually doing with AI? And he's savvy enough to realize that if he just sends out a note saying, "Hi, as your security officer, I need to know what you're doing with AI," he may not get a complete, absolutely forthcoming answer.

So he goes and talks to Max, and they come up with a plan that works pretty well. Max sends a note out to all employees saying, "Hey, I'm already using AI. Let's compare notes on how we can do better at our company with AI." That leads to a series of lunch-and-learns where people present things they're proud of having figured out. And before long, Archie has a real view of what's going on at Mind Path, where every line is a use case. Each use case involves a certain person, or someone in a certain role, doing a certain task with a certain tool. Way more than Archie suspected. And you can see that some of these are for the same tool but are different people doing different things.

So, he doesn't know what to do about this, but he does have a bad sense that there are many things running around out of his control and that it's time for an intervention. And the next episode is: what is the intervention? He does what he always does; you're beginning to know Archie by now. He goes back and does some research. This is stuff he doesn't understand, so he googles a bit. His SOC 2 security controls are aligned with NIST, so he starts by going to NIST and discovering that, hey, they already have an AI risk management framework.

"Surely," he says, "I will find what I need in there." He also likes the UK National Cyber Security Centre, because they write in language suitable for business; NIST writes for security specialists, and it helps him to go back and forth between the two. Unfortunately, reading all this helps him very little. It fills his head with all kinds of abstractions that are not relevant and that he doesn't understand, because those documents currently target people who are building AI. They're full of risks for people who are making models, with very little guidance for companies who are not making models and are just trying to use ChatGPT. There's nothing in there directed at them, or very little.

He does learn a few things; some of it starts to make sense, but it doesn't help him much. Eventually he goes to the OWASP AI Exchange and finds this little thing that says "quick start" at the top, and anything that's a quick start attracts him. It describes, at a high level, a few steps that might constitute the basic form of an AI management program, steps he understands and thinks he can act on. The basic points I've highlighted in green: involve the board. Get stakeholders from across the company, multiple teams. Send out a survey and find out what your teams are doing. "Hey," says Archie, "I already did that. I'm on the right track."

Then: do a risk analysis, update your policy, and institute an AI literacy program at your company. And he says, "Okay, I don't know what all those things are yet, but it sounds like I'd better learn and figure out how to do something like this." So he starts by updating the risk register with all the things he's learned. I won't make you read all of that; again, I'm going pretty quickly. The slides are available on my website; the URL is on the last slide. There is also, with the slides, a readme with a long bibliography linking to all the studies and everything Archie referred to along the way.

So he comes up with risks acknowledging what he's learned. And he constitutes a committee: he recruits Max and Mark to join him as an oversight committee, because he realizes AI is benefiting all departments and it's beyond his immediate scope. He shouldn't be the only one making decisions for the business about what they need to do and how they need to control it. Max and Mark agree. He shows them the risk register, they talk about what those risks mean, and they decide: we're going to add these basic security controls to our governance program.

This takes the policy you saw him write initially and expands it. They haven't done any of these things yet; they're just declaring these are what we will do. The core of it is: we'll take that set of use cases Archie found and we'll vet them. We'll look at all the vendors and decide which ones are safe and which ones are risky, and we'll approve some of those and not allow others. So we'll take an informed stand on what risk we can tolerate. And we'll do some training; Archie doesn't know yet what that is, but he'll come up with something. And that's basically it.

So they expand their governance policy, and they create a project plan: specific tasks people need to do to make the things in the policy actually be true. They assign them to people; most of them fall to Archie, and he starts executing. He's going to start with the use case inventory, and then we'll look at what literacy training is. There were, in those use cases, 18 vendors, and some of you are already smiling. Some of you have had to do vendor reviews before. It's a process that typically involves a big standard questionnaire with lots of questions. You send it out. You wait three weeks for the people at the other company to decide it's worth their while to fill it out.

It comes back. You don't understand the answers. You either decide it's okay or you call them back. You get more questions. You have to write it up and leave artifacts. It's a pain, and you have to do it every year for all those vendors. And now Archie has 18 of them land on him, and he's not happy about that. But he doesn't see a way around it. He does, of course, enlist help: his lead engineer Ruby was his main assistant when they went through the SOC 2 audit, so he enlists Ruby. And their first thought is: do we really have to do full reviews on all 18 vendors?

Is there any way we could lighten this load and make it something we can possibly manage? They put their heads together and think: if there's a way we could do this based on risk, paying less attention to the less risky use cases, maybe we could make it easier. So they come up with this rubric. The columns are from their data classification policy, which they made as part of their SOC 2 audit; they already have that in place. Less sensitive data is on the left, and more sensitive data is on the right.

And they say: well, actually, a little bit of our data is public. If you want to share the text of a press release with ChatGPT, go ahead; share it with any tool. You don't need approval for that; the data is already public. So some set of use cases they can pre-approve. It's a small number of them, but that's a little work saved. Maybe the most important innovation comes with the internal category: if you're going to share minimally sensitive data that's internal to the company but probably wouldn't really damage anything if it got out, then we aren't going to do a full vendor review. We'll do a lightweight review, still to be defined, but we'll invent something that doesn't take days and weeks and big questionnaires.

And then we'll do a full review for anything else. Their next step is to invent what a light review is. Again, they don't find this online; they're inventing it as they go. What they decide is: 30 minutes, max. It's a time-boxed exercise based entirely on information they hope to find on the vendor's website, or maybe a little googling and press releases, and they will take up to 30 minutes to see which parts of this information they can find. And if they can't find it in 30 minutes, maybe that's a sign about the vendor itself: that they aren't mature enough to care enough about security, or don't understand it well enough, to put out even basic useful information.
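
Their rubric reduces to a small lookup from data classification to review tier. A sketch; the exact classification labels are my assumption about their SOC 2-era policy:

```python
# Map each data classification to the depth of vendor review required.
# "public" data is pre-approved for any tool; "internal" gets the 30-minute
# time-boxed light review; anything more sensitive gets the full treatment.
REVIEW_TIER = {
    "public": "pre-approved",
    "internal": "light review (30 minutes, time-boxed, public info only)",
    "confidential": "full review",
    "restricted": "full review",
}

def review_tier(data_classification: str) -> str:
    # Fail closed: an unknown classification gets the strictest tier.
    return REVIEW_TIER.get(data_classification, "full review")

assert review_tier("public") == "pre-approved"
assert review_tier("unlabeled") == "full review"
```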

So here Ruby will spend 30 minutes to make a best guess about whether this is a plausible, reputable vendor or not, and make a judgment call based, too, on what information will be shared with them. They go through the exercise, doing full reviews on some and light reviews on others, and they end up with a list like this. This isn't quite how Archie publishes it to the employees who need it, because it isn't that easy to read; it's what he keeps for himself in a spreadsheet. The final decision is there in the decision column for each use case.

You can see, for example, there are a couple of requests to use ChatGPT: one was allowed and one was denied, based on the sensitivity of the data in each request. Whether this is scalable, how long he'll be able to keep it up, I don't know, but that's how they start. Archie also makes two recommendations to the company. He says, "I've gone through this exercise, and I've learned that in some cases you get better protections for your data if you get a business license." And he recommends that the company consider getting two business licenses. One, so that everybody in the company has a ChatGPT license. I'm not shilling for ChatGPT.

I'm not recommending them; it could have been a different tool. In Archie's case, that was the most requested tool; it might be different at your company. And he says: if we gave everybody that, then not only would the most common tool be set up, but a lot of the other use cases that weren't for ChatGPT could be handled in ChatGPT. So a great deal of the volume of work being handed from our company to AI could be handled through an approved, licensed mechanism, if we put that in place, and we'd be better protected.

He also suggests they get a Copilot license for the whole dev team, because most of the requests to share restricted, very confidential data were from the dev team. This gives employees two licensed, managed ways to deal with AI in a company-approved manner, with business licenses, for work he knows they're going to do anyway. Let's at least make it as protected as we can. That's how Archie thinks about it. Let's say Mind Path agrees and they get those licenses; not all companies would. Then there's the AI literacy program. Archie says, "Well, we already have annual security awareness training, and this feels like part of that."

So he needs to figure out what to add to the annual security awareness training to make sure employees aren't entirely ignorant about the risks AI brings to the company. He adds these points to the existing training. It's not a lot, but it's the basic stuff. It makes sure any new employee coming in at least understands that there are risks with AI, that there are rules about the use of AI in the company, and that they have some responsibilities. That's covered; now he has an AI literacy program.

To recap this episode, which began with Cody sharing a coding secret with ChatGPT and, basically, the discovery of shadow AI in the company: in response, they have updated the risk register, improved the AI policy, created an executive oversight committee, instituted approval for AI use cases, published a list of approved AI use cases, and delivered AI awareness training to the staff. And Archie is feeling pretty happy with himself. So, we've come a long way from where Archie started, when he thought there was no AI at the company and there was no risk. He's learned a lot, and as you just saw on that last slide, he's put a lot of controls in place. He has the backing of the company behind him, and he's feeling like they're in a much better place.

He says, "I recognize now that I was a little behind the eight ball and missing a few things, but now we've got it fixed." And this works, until it doesn't. New wrinkle. This new episode begins with an email from a customer. Somebody named John writes in from a company named Safe Harbor and says, "Hey, I was looking at one of our trainings, and the transcript mentioned a remote work policy. I went looking for that remote work policy and couldn't find it. Could you just point me to where that policy is?"

Well, this question gets forwarded to the content team at Mind Path, and Paige on the content team investigates, and she notices an odd thing. Yes, the transcript mentions a remote work policy. However, the video from which the transcript is derived does not mention a remote work policy. This is strange, and the transcriptions, as you already know, come from Transcribio. So they write to Transcribio and ask, "Can you explain this discrepancy?" And Will writes back from Transcribio and says, "Oh, actually, we started using AI to generate those transcripts, and it looks like our AI hallucinated. We're really sorry. We'll do all the right things. We'll double-check everything. We'll put new measures in place. We value your business." All the usual things.

And Archie has another sinking feeling in the pit of his stomach, because, to start with, this is a security incident. The integrity of their data was affected by the work of a vendor; bad data was put into their product. So that is a security incident. And Archie realizes that if Transcribio is starting to use AI, maybe some of their other vendors are also starting to use AI, introducing risks he hadn't thought about into their business ecosystem. And sure enough, there's data about that. That is exactly what's happening.

Oh boy. Archie goes back and writes a bunch of new items into the risk register and says: we've got to worry about this, and we've already had an incident, so this is pretty high priority. He takes it to the oversight committee, and they think about it and decide they need to do the following things. First, update the contracts with vendors to deal with AI risk and AI incidents, because there's no language in there about who's responsible or what shall happen. Second, review all their vendors, that thing Archie loves doing, and those vendor reviews need to include more information and need to ask vendors about their use of AI.

And third, think about AI in their incident response plan; surely there's something there they should be saying about AI incidents, since they're going to add AI incidents to their contracts. So now let's look at that work, starting with contracts. Archie doesn't like to go talk to the general counsel unless he has a good idea of what he needs to say; he likes to sound smart when he talks to Drew. So he does a little research first, Archie's signature move, and he stumbles on a great article from Stanford Law School. This is just one graph from it, backing up the main point.

The article contrasts data about typical contracts from SaaS companies with contracts from AI companies and asks what the differences are. The study shows that, with varying degrees of difference, AI companies are much more likely to cap their own liability and less likely to cap yours. They are more likely to grant themselves broad rights to use the data you choose to share with them. They are less likely to commit to any regulatory compliance or to provide any warranty for service. So Archie feels he has information he can take to their legal counsel about what to consider in the contracts, and he goes and talks to Drew. As a side note on this slide, I should say I am not a lawyer.

Nothing I say should be construed as legal advice. Drew is a lawyer, but he's fictional, so that doesn't change the story at all. These may or may not be the right considerations for you, but they are a representative list of the kinds of issues that probably should be thought about in writing contracts; there may be others. So Drew thinks about it, does his end of the research, and they revise their standard contract language, the boilerplate template they use with any of their vendors. That's a new security measure in place. Next, Archie has to think about how he's going to review vendors. Well, he already has a vendor questionnaire.

He used the one from CISA, from their supply chain risk management template, because it's specifically designed for small and medium businesses like Mind Path; he likes that one. So he goes back to CISA and discovers it has not yet been updated to include AI risk. That's no help. He does, however, find some other sample vendor questionnaires: OneTrust and Vanta helpfully make some standard ones available, and they do have some AI questions. He doesn't fully like either set of questions, though, because they too have questions he doesn't care about, like data versioning and alignment and consent, that he doesn't think are relevant to him.

But he is able to pull from those a small set of questions that make sense to him, and he constructs his own little addendum to their existing vendor review questionnaire. You can get this from GitHub. It has no special authority behind it; it's just Archie's thinking about what might work, and it is obviously not trying to penetrate the depths of the full inner workings of any vendor's AI. Any data scientist would ask a lot more than this. Archie isn't one, and he's not building AI. He only wants to ask questions designed to discover the degree of exposure and turn up any possible big red flags, limited to questions whose answers he can understand and find meaningful.

And this is what he comes up with. Let's say he sends this new addendum out to all their existing vendors and adds it to the standard questionnaire for any new vendors, and they go through all that exercise. Finally, we're left with what to do about incident response, because he's seen in several places he's looked, like the OWASP writing and the NIST write-up, that you've got to deal with AI in your incident management program. But when he goes to these sources to see what the advice actually is, it's useless. It's like this paragraph at the bottom, which says: be sure to take AI into account in your incident response program.

It doesn't tell him how to do that. So he's thrown back on his own devices. The best he can do: following NIST, their incident response plan is divided into seven phases. And Archie says, "Given what I know now, which is not a lot but not nothing, I can think of a couple of things we could do for preparation and for detection. What do we do for analysis? I don't know; it depends entirely on the situation. I can't anticipate what that is, or at least I don't know enough."

He says, "I'll do what I can on those first two, and we'll call it good for now; if I learn more, we'll do more later." So this is what he adds for the preparation stage and the detection stage. For preparation: we screen vendors, which is a good preparation step, and our vendor contracts now say how we'll deal with AI-related incidents if they occur. Those are decent preparation steps. And to detect incidents, we'll train staff to recognize them and to report them to our AI oversight committee. We have an AI oversight committee; see, we know what we're doing. He knows this isn't great, but he's doing the best he can to make something that is not nothing, working with what he's got.

So this is what he adds to the training: he's promised they will ensure employees know how to recognize AI-related incidents, and, drawing on the experience he just had with the Transcribio hallucination, he says, "If things like this happen, tell me," and adds that to the annual security awareness training. And a little recap, though it's probably fresh in your mind: those are the governance kinds of controls he's put in place in this episode. All right. Now we're going to get somewhere. We're going to go to retrieval-augmented generation: from RAG to riches.

I need to introduce you to someone I don't think you've met yet: Win, our sales VP. He's been sitting on the sidelines for all of this, and he's hearing the company talk about AI, and he's really excited. He says, "Oh, come on. There's so much we could do with AI. You're all fearful about it. Let's dig in. Let's make some money. Let's solve problems for our customers. Let's be excited here." And he has lots of big ideas about things he thinks AI should be able to do in the Mind Path product, which are met with a modest amount of skepticism from the engineers who would have to build them.

But Archie isn't the only one who's been doing research. Ruby, when they started dealing with AI, did a couple of hobby projects on her own, and she says to Archie, "You know, I made this RAG bot that I can talk to. I sort of trained it on my own personal GitHub repo, and it can now tell me things about my code I didn't know. It's kind of useful, and it turned out to be well documented and easy to build. Let's do one as a learning experiment, an exercise for the Mind Path dev team. It'll be low risk because we won't put it in the product."

"We'll make an internal RAG chatbot for searching our wiki, which people are doing all the time anyway and having trouble finding information. Then maybe we'll toss in our policies and a couple of other corpora and index those too, and make an internal tool. The dev team will learn and get some experience dealing with AI. We'll all get to think about the risk. The company will get a chance to use it and see how it works. That'll be good." Archie agrees, and he takes it to the execs, and they agree. So they decide they're going to build this RAG chatbot. Whenever Mind Path undertakes a new significant coding effort, they always start with a risk assessment that they call an architecture review.

So this is the architecture review for the RAG chatbot. It starts with a proposal about what they want to build, usually a diagram, generally more sophisticated than this one. This is a very standard RAG architecture diagram; I'm not even going to explain it, because this stuff is easy to find online. But the basic goal is: you take some Mind Path documents, like their wiki, and digest them into an embedding index that AI can understand. Then, when someone asks a question of a big chatbot like ChatGPT or something like it, you're able to expand the context by finding semantically related material in your own documents and passing that along to the big model.
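
A minimal sketch of that pipeline. Real systems use a neural embedding model and a vector store; TF-IDF similarity stands in here so the sketch runs with nothing but scikit-learn, and the LLM call is a stub. All the names and wiki snippets are invented:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# The corpus to be digested into an index: invented wiki snippets.
wiki_chunks = [
    "To request a new laptop, file a ticket with IT and cc your manager.",
    "Production deploys happen on Tuesdays; the checklist lives in /ops.",
    "The data classification policy defines public, internal, confidential.",
]

vectorizer = TfidfVectorizer()
index = vectorizer.fit_transform(wiki_chunks)  # stands in for the embedding index

def retrieve(question: str, k: int = 2) -> list[str]:
    """Find the chunks most semantically related to the question."""
    scores = cosine_similarity(vectorizer.transform([question]), index)[0]
    return [wiki_chunks[i] for i in scores.argsort()[::-1][:k]]

def call_llm(prompt: str) -> str:
    # Stub standing in for the big chat model at the end of the diagram.
    return f"[model would answer from]\n{prompt}"

def answer(question: str) -> str:
    context = "\n".join(retrieve(question))
    return call_llm(f"Answer using only this context:\n{context}\n\nQ: {question}")

print(answer("When do we deploy to production?"))
```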

Now, for the first time in this story, when Archie does his research, because of course he goes and googles and does the research, he finds exactly what he wants. He's building AI now, and this stuff is well documented. He finds articles on risks in RAG specifically, and so they're able to come up with a much better list, more easily, of what they should worry about in this project. Here are some of the things they come up with, and some of their proposed treatments, which are pretty light.

For example, one of the risks with this stuff is that the RAG chatbot will behave nondeterministically. You can ask it the same question twice and get different answers. How do you automate testing with that? And they say, "We won't. We're just learning. We'll do some manual smoke testing on builds. It's only exposed internally, and the data doesn't matter very much." So they take easy steps for some of the hard problems, but they acknowledge the problems. One of the problems, the first one up there, is hallucinations, and all they're going to do for that is remind users that this stuff is dangerous and not always reliable, and they're thinking about how to educate users on that.
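
Even manual smoke tests can be lightly automated by asserting properties of the answer rather than exact strings, since exact matches break under nondeterminism. A sketch, where ask_halley is a stand-in for the chatbot and the cases are invented:

```python
# Each case pairs a question with substrings any acceptable answer should
# contain; we check properties, not exact wording, because the wording
# varies from run to run.
SMOKE_CASES = [
    ("Where is the deploy checklist?", ["/ops"]),
    ("Who approves laptop requests?", ["IT", "manager"]),
]

def run_smoke_tests(ask_halley) -> list[str]:
    failures = []
    for question, must_contain in SMOKE_CASES:
        reply = ask_halley(question)
        missing = [s for s in must_contain if s.lower() not in reply.lower()]
        if missing:
            failures.append(f"{question!r}: answer never mentioned {missing}")
    return failures

# e.g. run_smoke_tests(answer) against the earlier sketch; an empty list passes.
```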

They're kind of proud of themselves when they roll it out: they give it a name that they think will be pedagogically useful. They call the tool Halley; it's supposed to sound like "hallucination," and also like HAL, the hostile AI in 2001. So every time you use it, you're reminded this isn't necessarily the most trustworthy thing. And it works pretty well. You can ask it questions, it comes up with useful answers, and it does indeed save people some time. There are a few incidents, like this one, and this should jump out at you already: they never expected Halley would have access to any secrets.

When this shows up, they know they need to do a little forensic research to see how it happened. They call Ruby in to do it, and she discovers that, of course, the secret is in the wiki; that's how it got into Halley, there's no other way. Somebody a year ago, setting up some admin system, took notes in a business meeting and wrote it down. That's how it got there, and Halley doesn't know any better.
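
One mitigation this incident suggests is scanning pages for credential-shaped strings before they are ever embedded into the index. A minimal sketch; the patterns are illustrative only, and real secret scanners carry far more of them:

```python
import re

# A few credential shapes, for illustration; real scanners (and real
# secret formats) are much more extensive.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                        # AWS access key id shape
    re.compile(r"(?i)bearer\s+[a-z0-9_\-.]{20,}"),          # bearer tokens
    re.compile(r"(?i)(?:password|api[_-]?key)\s*[:=]\s*\S+"),
]

def flag_secrets(page_text: str) -> list[str]:
    """Return any credential-shaped strings found in a wiki page."""
    hits: list[str] = []
    for pattern in SECRET_PATTERNS:
        hits.extend(pattern.findall(page_text))
    return hits

# Pages with hits get quarantined for human review instead of being indexed.
assert flag_secrets("the admin password = hunter2 from that meeting") != []
```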

Another example is this one, where the answer is correct, but Halley looks a little illiterate, not knowing how to spell Transcribio and forgetting that the P in Mind Path is capitalized. That doesn't matter for internal use, but if they were ever going to do something in production, it would be a problem. So they get Ruby, and she does some forensic work and discovers these typos are, of course, all over the wiki. This isn't a hallucination; this is bad data. And in general, there are enough incidents like this that Ruby jokes she has a new job title for all the time she spends researching them: she's now the Chief Hallucination Officer. And this is how the company discovers the importance of data governance. That's the end of that little episode.

In the next one, and we're getting near the end here, they decide to put on the seven-league boots and walk like a giant: they're going to put AI in the product. They're going to promote Halley to live. What it'll do there, it's not the wiki: it'll let customers do AI-powered searches over their own corpus of content. So you can query your own trainings and your own policies, ask questions, and be given answers. And they figure there will be problems; they know what it's like by now. But there's data governance built into their product to some degree, because trainings and policies typically have an approval process: you know who approved something, and it isn't published until it's approved. So the data is a little bit cleaner, and they think they'll sidestep that problem a bit.

They do a new risk assessment, a new architecture review. It's almost the same diagram, only now there are multiple indexes, one for each client.
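
That one-index-per-client detail is doing real security work: retrieval must never cross tenants. A minimal sketch of the routing discipline; the names and structure are invented:

```python
class TenantRouter:
    """Route every query to the calling client's own index and nothing else."""

    def __init__(self) -> None:
        self._indexes: dict[str, list[str]] = {}  # client_id -> that client's corpus

    def provision(self, client_id: str, corpus: list[str]) -> None:
        self._indexes[client_id] = corpus

    def index_for(self, client_id: str) -> list[str]:
        # client_id must come from the authenticated session, never from the
        # query text, so a prompt can't ask its way into another tenant's data.
        if client_id not in self._indexes:
            # Fail closed: unknown tenants get an error, not a shared default.
            raise PermissionError(f"no index provisioned for client {client_id!r}")
        return self._indexes[client_id]

router = TenantRouter()
router.provision("safe-harbor", ["Safe Harbor remote work policy ..."])
router.provision("ironclad", ["Ironclad incident response training ..."])
assert "remote work" in router.index_for("safe-harbor")[0]
```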

Archie reminds them that there were a lot of things they didn't think about with Halley that they're going to have to take seriously this time, so there's more to think about as part of the architecture review. They all think about it, and they come up with a much longer list of risks this time, which is a little scary to them. You don't have to read each one; I've grouped them, so you get a general idea. They haven't had an architecture review come up with quite so much to worry about at once before, and they're not certain they should even proceed. So Archie takes the list home and thinks about it overnight: what can we do? How should we think about this? And he comes back the next day and says: of the many risks in that list, 15 of them, far and away the majority, although they're related to an AI feature, are not specific to AI. These are kinds of risks we already have in our product. We have contractual risks. We have stale content. We have regulations and quality assurance failures.

Many of these are things we know how to mitigate. It may be a lot of work, but we understand them and we know we have moves; we have some confidence that we could address them if we just allow enough time. There is, however, a handful of risks that are very specific to AI, for which we have no background, and we'll have to think extra hard about those. So they do a lot of research. They go to talks. They go to conferences. They talk to people. And they come up with a long list of measures to address all the risks that came out of the architecture review. This is not the full list; I give it to you as an indication of the kinds of things they're thinking about.

Then the question is, while they're still in the planning stage: how much risk will they have alleviated when they do all this? Will they feel confident shipping? So Archie takes the full list and tries to estimate how effective each measure will be in addressing those risks, and he adds a new column: residual risk. The risk level is the initial assessment, and the residual risk is what Archie estimates the remaining risk will be if all the proposed mitigations are implemented. And you can see most of it goes to low, which is great.
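
The residual-risk column can live right in the register. A sketch with invented rows and a three-level scale:

```python
RANK = {"low": 0, "medium": 1, "high": 2}

rows = [
    {"risk": "hallucinated answers shown to customers",
     "mitigations": ["ground answers in the client corpus", "user warnings"],
     "initial": "high", "residual": "medium"},
    {"risk": "stale content returned in search results",
     "mitigations": ["existing content approval workflow"],
     "initial": "medium", "residual": "low"},
]

# The ship/no-ship conversation is about whatever fails to reach "low".
above_low = [r for r in rows if RANK[r["residual"]] > RANK["low"]]
for r in above_low:
    print(f"still {r['residual']}: {r['risk']}")
```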

There is, however, a set at the top, a handful of things that didn't go to low, and this is more than they're used to seeing, and they think hard at this moment. While the list will be very different from company to company and project to project, I think companies getting to the stage where they are first releasing AI will have a moment like this, where the risk is more than they're used to and they have to decide what to do. Is this okay? Can they manage it with a small beta release? Should they cut back the features? Should they take another month and build some more mitigations? Who knows?

What Mind Path decides at this point is: all we're doing is building a search function. It's tiny; nobody's depending on it too much. The product we're putting it in is not mission-critical to any of our clients. None of our data is regulated; it's not PHI, it's not financial data. Some things might go wrong, and we might have some embarrassing moments, but it's probably worth the cost. So they decide they will accept the risk and go ahead and ship. And they ship, and there are a few minor glitches. The final episode grows out of some of those. There were a few little not-quite-right answers that got the attention of their biggest client, Ironclad.

And Ironclad is very careful about what they release to their clients. Actually, Ironclad resells Mind Path: Ironclad has their own larger GRC platform, and they build Mind Path into the back end to do training within that platform. So they have a lot of clients; they're a big customer, and they are very careful. When they see a few glitches, they decide they want to exercise their contractual right to do an audit, and they write in and say, "We're going to audit you. Please send us this information in preparation." And they ask for the list of AI components in use, what's in the technology stack, anything you can tell us about your licensing, and your current AI controls and oversight process.

You've seen most of that material. What you haven't seen is the list of items in the tech stack, and this is what they use to build their RAG feature. The bottom one, Runestone, is the main LLM that answers questions. They're not using ChatGPT; they're not using a commercial model. They found an open-weight model that works well enough for them, and they host it locally. So they pass this on to Ironclad, and Ironclad's auditor writes back and says, "You may think you are using an LLM with an Apache license, but it's actually derived from another model that was issued with a different license, and that license is not friendly to our law enforcement clients."

And Archie says, "What? Look, I've done this before. You get a component, you see its license, you check the license. We know Apache; it's fine. Where does this other license come into it?" Allow me to help. This is now out of the fiction: in the real case, this is from the model card on Hugging Face of the model at the bottom, the one with the long name ending in Gemma. That model was made by a company called Nem-something; I don't know anything about them, but this particular instance has been written about, which is how it came to my attention.

The model at the bottom has a parent and a grandparent. Google made the first two: Google made one and then fine-tuned it, and Nem quantized that to make the final one. And for some unexplained reason, when Nem made theirs, they chose a different license. The problem is that the Gemma license explicitly says you cannot override anything in it with a later license, and it specifically calls out Apache as incompatible. And the Gemma license says you can't use the model for anything that helps surveil people without their knowledge, which is why Ironclad, with their law enforcement clients, is worried.
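
One way to spot this kind of license drift is to walk a model's ancestry on the Hugging Face Hub. A sketch; it assumes each model card's YAML metadata carries "license" and "base_model" fields, a common convention but not one every card follows, so it prompts a human read rather than replacing one:

```python
from huggingface_hub import model_info

def license_chain(repo_id: str) -> list[tuple[str, str]]:
    """Walk base_model links up a model's ancestry, collecting licenses."""
    chain, seen = [], set()
    while repo_id and repo_id not in seen:
        seen.add(repo_id)
        card = model_info(repo_id).card_data
        meta = card.to_dict() if card else {}
        chain.append((repo_id, meta.get("license", "<unspecified>")))
        base = meta.get("base_model")
        repo_id = base[0] if isinstance(base, list) else base
    return chain

# Any license change along the chain is exactly the Apache-on-top-of-Gemma
# situation in the talk, and means a human must read the upstream terms.
for repo, lic in license_chain("some-org/some-model"):  # hypothetical repo id
    print(repo, "->", lic)
```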

So, back to Mind Path. Archie, of course, learns new risks here, so he goes back to his risk register and adds items to it; this is how he crystallizes his thoughts about what they're facing and tries to gauge how important it is. And he talks to Drew, and Drew does his best to make an argument on behalf of Mind Path. He says to Saurin, "Look, I get your point. I see you're right about the licenses; we missed that. But really, we're just building search. It's searching through trainings. We're not helping surveil anyone. We don't think the people behind the license would worry about it."

And Saurin says, "This is a gray area for us. What if, when you search, you turn up a training that tells you how to surveil people? We don't like it, and we don't want that risk. You need to fix it for your next major release." And they don't have much choice but to do it. I need to say that the risk isn't just that there are incompatible prohibitions in an earlier license. It isn't just the license drift problem from one license to another. Those conditions, about who can't use the model and for what purposes, are important, and they capture ethical and moral concerns, but they're hellish to track, particularly if they could be changing in upstream models you derive from, on releases you're not tracking.

And even if Archie had seen the Gemma license restrictions, he would not necessarily have realized that there were any law enforcement clients using Mind Path, because those are actually Ironclad's clients. So it's hard. Anyway, that ends the series of episodes on Mind Path's growing awareness of the risks they face. Let me close with a few summary and observation slides. First, a summary: these are the episodes you've lived through. Thank you for your patience. And just for fun, I put in the last column the number of items Archie added to the risk register at each stage.

Of course, that's a meaningless statistic; nobody cares how many risks there are. I only put it there to capture the risk register as an ongoing activity, because, after all, that is how Archie got through it. And to the extent I have advice, it doesn't really make things easier. It's just to remind us that this continuing discipline of identifying risks, understanding the risks, which is the hard part in this case and why Archie was constantly back on Google, understanding them well enough to prioritize them and build plans to mitigate them, is the core muscle of any information security program anyway. We just have to start applying it to these new risks.

This slide, just a little side note, lists many of the resources Archie consulted for help along the way. Again, links to all of these are in the readme that goes with the slides; you can find it on GitHub, linked from my website. The interesting thing about this is, first of all, that it's in chronological order for the story. At the top are the things he consulted when he was first trying to write that first governance policy, and at the bottom are things related to licensing and RAG. And the color coding shows how useful each document was to Archie. It becomes clear that once they started building AI, it was much easier to find answers.

At the earlier stages, when there was in some sense less risk, or at least a different risk, there was a lot of trouble finding best practices to follow; there isn't much written for that yet. Which leaves us, in that case, having to invent emergent practices for ourselves. We have to take things that work in one context, things we know from somewhere else, and apply them in a new context and hope they'll work. That has always been a skill valued on information security teams, but it becomes extra important in our new world.

And so, a word of encouragement: if you or anyone you know is living through pain like what I've described, don't blame yourself too much. That is the work now. We are all struggling, and we will continue to. We learned from Eva this morning, in a different context, that the technology is greatly outpacing our ability to manage it, and this is where we are. So, from the entire staff of Mind Path, we wish you safe journeys on your AI travels. And if there's time left for questions, I'm not sure there is... Okay. So, the first question: how do we define risk? Archie doesn't have a clear definition for himself. (And I'll take your question next.)

In his mind, it's anything that could cost the company money or affect its reputation. So it's nothing more sophisticated than "things that could go wrong that I have to worry about." There are better definitions out there, and if he needed one, he would go to NIST, because that's where he's used to looking. I don't know how helpful that is. The second question is about pieces of information that might be independently rated as low risk or low sensitivity but could become more sensitive if released together. Have I understood your point correctly? Because we are dealing with AI, and we know AI can cross boundaries.
>> Yes.
>> And that means that something could be released as, let's say, public, but if you get several such things together, you can get into the realm of confidential or really restricted.
>> Yeah.

>> That's not something that has even occurred to Archie yet. He's not at the point where AI is evaluating releases from his company, at least not that he's aware of. It sounds like, if that were a worry at his company, he would need to start thinking about data governance around what gets rated public in his data classification system. And that would be a new concern for the company: as they authorize information to be disseminated, they might need a new lens on what else has already been disseminated and how those pieces could be put together.

Interesting point; I hadn't thought of that. Yes?
>> I'm wondering: in how many of these cases does AI get treated as special, and is that going to be true for the long term, or is it going to normalize? I noted on one of your earlier slides that passing confidential information to a vendor, or a vendor delivering inadequate quality of work, is a contractual thing. So why is AI being treated as special here? In the first case it almost gives employees a free pass for what would otherwise be misconduct, and in the second case we say, "okay, because it's AI."

But the slide where you talked about the asymmetry of the contracts currently being issued made me wonder whether this is merely a right-now, temporal question, or whether AI will always have a free pass.
>> I think it is a right-now, temporal question. As I said at the beginning, I think this is a talk with a limited lifetime. There will come a time, maybe still a few years out, when the best practices for AI are better known. But we have to get to the point where AI isn't advancing so quickly, where it's not capable of doing things we haven't thought about yet. And that's going to be a while.

So for now, AI is special because it introduces risks that, because we haven't thought about them, are not already in our contracts. In this case, for example, there was nothing about AI in the Transcribio contract, and yet Transcribio's AI created a problem for Mind Path, and there was no advance agreement about who's liable if that occurs or whether problems like that have to be reported. So for now, while the existing best practices don't take it into account, it is a new and special thing. Is that an answer?
>> Yeah, but you wouldn't consider that just a quality-of-work-product issue? Why does the fact that the quality problem was due to AI even matter?

>> Yes, that's true. So if there is a quality-of-work clause in the Transcribio contract, that would presumably cover AI as well. Yes, I'd agree; I'd accept that.
>> Yeah.
>> We're going to cut it there, because lunch is being served, so we'll make sure you all get lunch. Will you be available outside?
>> I will be available outside. I'm here today and I'm here tomorrow, and I would be delighted to talk to anyone about these things.