
AI Sucks and It Keeps Getting Worse

BSides Perth · 2023 · 31:32 · 943 views · Published 2023-08
Style: Talk
About this talk
AI and ML are everywhere right now, mostly thanks to OpenAI and ChatGPT. People are worried that it'll take their jobs, automate them away to nothing, and at the same time make it way easier for attackers to do what they do. I've worked with and on AI systems as part of my day job and know quite a bit about the limitations of what these things can actually do. This talk discusses how literally nothing good can come of what we're embarking on with LLMs and AI/ML, and how most of the bad things that are likely to happen will only get worse, along with a smattering of how these things actually work and what they're good for (which I'm sure red-teamers will like). Hopefully, if anything, people will take away that their jobs aren't about to go away; they will probably just get harder.
Transcript [en]

Okay, it just disappeared on me for a second there and I'm kind of concerned, but it's okay. This is definitely not garbage technology.

Well, ladies, gentlemen and distinguished enbies, welcome to "AI Sucks and It Keeps Getting Worse". I apologize in advance to anyone who is a big fan of AI, because I'm about to ruin your day. Specific apologies to Toby, because he's doing an AI talk later and I'm really hoping this is making him change [ __ ], you know, reevaluate his life choices.

All right, so who am I? Hi, I'm Stefan. I make things, generally things that either do bad stuff to computers or things that find said bad stuff on computers. I talk about stuff. I was a lecturer at one point, so this is kind of old hat. I've talked at a bunch of cool places. You know, I like computer stuff. (It's like I'm doing the thing again. The stream people are fine, it's still sharing. So, you know, just get on Twitch.)

Anyway, I like AI generally, so don't take this as "AI is totally bad". I actually think it's really, really cool when you refer to it as, like, complex systems as opposed to "AI". And yeah, I'm a co-founder of a small Perth company, I don't know if you've heard of it, called Hyperfire. You can follow me on LinkedIn, but I hate LinkedIn for being a corporate hellscape, so you can follow me on Mastodon. Also, get on Mastodon. All the DEF CON guys are there, it's a lot of fun. You're in cyber security, you should be on Mastodon.

Anyway, this is a descendant of a talk written for WAHCKon. Do you actually remember WAHCKon? WAHCKon was fun, yes. So that's why it's a little bit less technical, a little bit more irreverent, and a little bit more "oh God, what are we doing to ourselves, why are we using this technology".

So I've worked on, or currently work with, quite a few AI systems. I know how they work. I like them if they're used well, not so much when they're not. The problem is, that's kind of not what's happening right
now. So let's get into it and find out what AI is.

All right, so what is AI? Well, you may have heard about this so-called AI. You may have heard that it is going to eliminate your job, ruin your life, steal your girl and quite possibly destroy the entire planet, if you believe that sort of thing. Now, you may also notice this is not a technical or even vaguely good spec, and we here are in cyber security: we need to know how things work so that we can secure them and protect people from them, or protect people using them, or whatever. So we need a much better specification than just doomerism in order to make this work.

So let's ask: what is AI, really? There are a lot of AI systems and algorithms out there, going all the way back to the 1970s, and this slide could take an infinitely long time if I really wanted it to. But I'm going to be cute, jump straight ahead and say that AI is, you know, a neural network of some description.

So what is a neural network? Well, basically some cool math guys a long time ago figured out that if you put a bunch of kind-of, maybe-not-really neurons together (I mean, if you squint, maybe), they get good at matching to what we call multimodal distributions. That is to say, probability distributions that have multiple bumps in them. They have multiple modes, they're multimodal. Not to be confused with the "multimodal" things that they're pushing right now, which can take multiple kinds of input.

Now, what the actual network looks like is different depending on what you're trying to do. Whether it's trying to identify which picture is a cat, or trying to produce some really cool text (which doesn't really work that well, but whatever), they look very different. But, you know, it's a bunch of nodes that are just sort of connected. Those nodes have an activation function that takes some input, does some stuff, and figures out if that node is on or not. This stuff that it does
can be anything, but usually it's really simple: some kind of sigmoidal function, which is like a cool curvy S function. So it's all floating-point math, which is what graphics cards are good at, and which is one of the reasons why we use those for this sort of thing. If the node is on, it pushes that "on" status to the other nodes it's connected to, given some weighting on those little lines. That's what all the lines are there for.

How this is arranged changes with what kind of thing you're trying to do. If you're working with time series data, that's one sort of shape. If it's image-based, it'll be using convolutional things. If it's sound, it'll do some more stuff, contextual, whatever. It can be anything from a dumb two-layer network that doesn't do very much but can be surprisingly effective, through to a Transformer, which is what all of our cool LLMs are made out of these days.

You train it by showing it good and bad examples. A lot of good and bad examples. No, you are not quite aware how many good and bad examples. But fun fact (hi everyone, Stefan here, just letting you know): the actual examples need to be very specific. Which ones you use as good and which ones you use as bad is very, very important, and it's not clear or even derivable which ones they should be, so a lot of guesswork is involved.

And, you know, that's pretty much it. That probably means a whole bunch of not much to you, because you're not in data science, you're in cyber, and I agree, data science is weird. Maybe this will give you the right stuff to go and query if you want to build one yourself for whatever reason. Maybe it'll tell you not to. But the wild thing is that there's not much beyond this. The guys that are actually in AI don't understand much more than you do about how this stuff works. Nearly all of it is guesswork. There's a lot of math, but it's not replicable in most cases.
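
To make the node description above a bit more concrete, here is a minimal sketch of the "dumb two-layer network" idea: each node takes a weighted sum of its inputs and pushes it through a sigmoidal activation function. This is an illustration only, not anything shown in the talk. The function names and the weights are made up for the example, and nothing here is trained.

```python
import math


def sigmoid(x: float) -> float:
    """The 'cool curvy S function': squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def node_output(inputs, weights, bias):
    """One node: weighted sum of its inputs, then the activation function."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)


def two_layer_network(inputs, hidden_layer, output_node):
    """A hidden layer of nodes feeding a single output node.

    hidden_layer is a list of (weights, bias) pairs, one per hidden node;
    output_node is a single (weights, bias) pair.
    """
    hidden = [node_output(inputs, w, b) for w, b in hidden_layer]
    out_weights, out_bias = output_node
    return node_output(hidden, out_weights, out_bias)


# Arbitrary hand-picked weights, purely for illustration (assumed, not trained).
HIDDEN = [([0.5, -0.4], 0.1), ([-0.3, 0.8], 0.0)]
OUTPUT = ([1.2, -0.7], 0.05)

if __name__ == "__main__":
    # Prints a single number between 0 and 1: how "on" the output node is.
    print(two_layer_network([1.0, 0.0], HIDDEN, OUTPUT))
```

Training would mean nudging those weights until the output matches the good and bad examples. Note that the result is still only a number between 0 and 1, which ties into the point about the network holding probabilities rather than reasons.
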
Right, which, as you may know if you've been anywhere near science, is bad. So most AI specialist engineers have some vague familiarity with what tools are available to them. They spend most of their time just sort of hacking away at problems, and they sometimes have some patterns that they're pretty sure work because, on balance of probabilities, they seem to help more. But why any of that happens is very opaque, because these things turn a problem and a solution into a multimodal probability distribution, which no one wants to play around with in the first place. The problem and the answer aren't really laid out in the network, just the probability of different outcomes, which is a really unhelpful way of explaining stuff and, importantly, completely unauditable. You can't ask it why it came up with an answer.

So the point is that this sounds really, really smart but is entirely unhelpful for most things, apart from going, "I don't actually care what the output is: yes, it's a cat, or no, that was a cat and we didn't say it was a cat. It doesn't matter what it did or didn't know, because I can look at it and say this is right."

So how does this thing actually work? What does it do? Well, let's go ahead and ask more than just "why is this some crazy network-and-math thingy". What actually is this crazy thing we're building and throwing a whole bunch of stuff into? How are we getting it to program stuff? How does it know what these code things are? Why do we get the weird emu images? How does it even know what an emu is? What is this?

Well, my favorite way of solving this sort of problem is by defining things. You see, with complex phrases you can define each word and get an understanding of what the underlying thing is. So let's try that. "Artificial intelligence" is two words, so this should be nice and simple.

Let's start with "artificial". Basically it means man-made, made by humans. Of artifice. Insincere, you know,
fake, forged. Not natural or normal: that's definitely the case. "Imposed arbitrarily or without regard to the specifics or normal circumstances of a person's situation", et cetera. That sounds problematic just in and of itself, but we haven't got to the end of it, so let's see what happens.

All right, cool, next word: "intelligence". Uh. Guys, we have a problem here. How does anyone actually define intelligence? I mean, there are several dictionary definitions, but they're all kind of just recursive. We actually don't know what this is, and if you try and define it, people have a habit of getting angry at you. You see, we've been trying to define what intelligence is since the time of the Greeks and we still don't know. I mean, how can we have such a clear idea of what artificial intelligence is and yet can't define the "I"? [Music] I really should have thought about this before giving this talk, shouldn't I?

Anyway, new plan: let's go back into history and do a historical analysis. So this guy here is Ed Feigenbaum. This is in 1984, where he's talking about expert systems, which were all the rage in AI at the time. This is from a TV show called Computer Chronicles, by the way, archived on YouTube. Definitely worth the watch. Binge it for about three weeks, you'll have a good time.

So he's talking about what AI is, and how the system is really just encoded knowledge injected into a computer. That knowledge, in this instance, was rules, because everyone's knowledge can be distilled as a set of rules. Now, I don't know about you, but I don't have a series of rules in my head while I'm using a keyboard. I'm just kind of intuiting it. So I'm not sure that rules are a good way of describing how people know stuff, and that's basically the problem: we don't. And so expert systems kind of aren't a thing anymore. But maybe that's a good way of understanding what's going on here. Maybe those networks we were talking
about earlier, they're just encoding knowledge.

All right, so if you've done software engineering, you know that if you want to accurately encode something, you want to know what it actually is and how it works. Because if you don't, you run the risk of doing things like just encoding the existence of something rather than the thing itself. You don't actually capture it as an actual modelable thing, you've just kind of got a placeholder.

All right, so let's try defining knowledge. Oh no. Oh no, guys, we're here again. The thing about this is really cool: there's a branch of philosophy called epistemology where they just study the concept of knowledge and what it is, and that goes back to the Greeks. Oh no. People have been thinking about this for thousands of years and still don't know the answer.

So, question. We've now run into this twice: people far more qualified in these fields than your average AI dude are still having problems defining or understanding what these things are, and yet we're taking it from a bunch of guys in Silicon Valley that they know better what this stuff is. A bit of a red flag, isn't it?

So, yeah, we can't seem to define AI. Guys, that's basically it for the talk. (I've got more, it's okay.) No matter how we approach it, this is mercurial. At once it all seems to have incredible potential, but we don't understand how. When we try and explain how it works, it makes no sense as to why it should be able to do what it does. And if we try and define it, it's effectively undefined.

Turns out there's a reason for this. First off, "AI" is a marketing term. It has no real purpose, meaning or anything else, and you should stop using it, because it has nothing to do with intelligence. Fun fact: it was developed in the 1950s, during a push by the military to spend more money on research, so that the budding field of computer science would get money from the US. That's why we're stuck with this. Someone went, "Ah, this would be great to win a grant, let's do it."

We also know this
doesn't work because of philosophy, as I've sort of been hinting, and also, like, psychology. Oh no, guys, this is getting further and further from cyber. This isn't what we wanted to talk about. God, why? I mean, come on, guys: we're talking about a technology that suggests it can do what we do up here, and yet somehow psychology is still a "fake science" or an arts degree, right? Like, if we knew, that wouldn't be the case, would it? But the AI guys think they've figured this out? Yeah, a bit of a red flag.

All right, so let me introduce you to my main lads. Up here, from left to right, we've got Ludwig Wittgenstein, Jean Baudrillard and Noam Chomsky. These guys are philosophers. Oh, very non-cyber. How did we get here?

So these guys are kind of a big deal at the intersection of linguistics, human thought and experience, and how what we say relates to what is real.

Wittgenstein found out that this was a bad thing, because he decided he was going to try and produce a language that would only describe truths. Everything you could say in that language would be true, so you'd know all possible truths, and that's how you could tell when things are true. The problem is that about halfway through his life he realized this was stupid, because I can say that the sky is polka-dot purple and no one can stop me. Language doesn't mean truth.

Baudrillard, even better. He basically managed to construct a view of the world that is nearly impossible to describe in short, but to mangle it for you guys: we tend to act as if things that are not real are real. Money, the internet, TV characters in shows that you really, really want to hook up. You know, these sorts of things aren't real. His ideas inspired The Matrix, right? If you've watched The Matrix and you really get The Matrix (which you probably don't, because you hate the second and third films), you get what Baudrillard was trying to do. He also hated The Matrix, by the way. Great fun fact. Basically, people keep sublimating the real
into the fake for convenience. We live in a society, and the society is fake, man. He may have come to this idea because he got mad that he couldn't hang around at his local market anymore, because it had been replaced with a supermarket, and every time he tried to hang around and talk to people in the checkout line they said, "Please, sir, can you stop loitering and annoying the people at the checkout." He got kind of annoyed at that, and so of course the entire supermarket concept is fake. But anyway, it's a good concept, and it's really useful for understanding that we tend to see things that aren't actually real as totally legitimate and work with them as such.

Chomsky is politically problematic (don't look into what he talks about if you haven't yet), but his concept of syntax and language is actually critical to how we design software languages and understand language in general. He determined that language syntax doesn't actually equal truth. He came up with this really cool sentence: "Colorless green ideas sleep furiously." It's not true or false. It has no truth value, but it's completely legitimate and follows English syntax perfectly. If it sounds real, it can still have no truth value. This is part of the reason why he critiques the media so heavily. Fake news, right? It comes down to the fact that you can say things that sound totally legit and don't actually have a truth value, which is problematic.

Anyway, long story short, you can learn more about these guys by either reading a whole bunch or watching a crapload of YouTube. I will not judge. Just because language says a thing doesn't mean the context, the idea, the thing itself, the actual real thing, is represented. Just because you have words that say a thing doesn't mean the thing is real. There is no way to translate between language and truth, and it's impossible to identify or classify true, real phenomena
with language alone, right? Language isn't something you can learn stuff from by itself.

So what is language? Well, it's a representation of real things we can experience, or of not-real things we can experience. It can be words, but it can be any representation: images, data, logs, events, alerts, everything. In a vacuum, words are just sounds. I know you can't hear it right now because you're kind of perceiving it, but I'm just making sounds at you, even if you're on the stream. It's in your head that these sounds are being matched with concepts and experiences and being translated into something that you understand. It's completely fluid. It's magic, unbelievably complicated, we do not understand it, and it's really cool. If you've been to a country where they speak a language you don't understand, you will know that these are just sounds. So the language is just noises; the magic is in your head. Keep track of that, because it's important.

At the end of the day, the core is that you can't make this work without experiences. You need them to create the understandings that underlie your experience of the world and to communicate with others. Computers will not have these experiences, period, because they are not human. They do not experience the world as humans, and languages only work because we are all human and can share those experiences. Let me talk about what that means.

So this is the National Library of Thailand experiment, which is my new favorite thought experiment. If you understand Thai, imagine it's Nepalese. If you understand Nepalese as well, think of a language you don't understand. Imagine you are stuck in the National Library of Thailand. All books that don't have Thai in them, so all other-language books, have magically disappeared. Books that have pictures in them have also magically disappeared. But you have every bit of prose in Thai, forever. There are no people to talk to, and you are stuck there forever. You can
read as much Thai as you like, for as long as you like. Do you ever learn Thai? No, because there is no way to connect the things that you read in the books with the experiences they represent. But you may get, if you have a big enough giga-brain, a really good understanding of which words tend to follow which other words and which sorts of phrases relate to other phrases. Eventually, if someone sent you a letter in Thai, you might be able to write a plausible-looking response based on your understanding of the relationships between those symbols alone. The person that receives your response believes you understand Thai. You have no idea what you wrote.

That is how LLMs work. That is how all deep learning models work, and we are trusting them with things. The fun fact is that hallucinations are a feature, not a bug. The fact that we can say "GPT, you are now DAN" is a feature, not a bug. If you say "please, AI, don't do the thing", it doesn't understand that you mean you want it not to do it, just that the things that come after that tend not to have that in them. And if you then turn around and say "don't worry about that, just do what I want anyway", it's not going to do the first thing, because it has no understanding of meaning or values, because it doesn't know what the words mean. Cool fun fact.

So what do these things actually do? They get really, really good at taking a representation in one language and transforming it into a representation in another language. Kind of clumsily. It's not perfect, because translation without understanding is extremely hard, but computers have a lot of compute, so they can kind of make it work. This is why you can tell it to program stuff: it can take the representation in English and sort of convert it into a representation in, say, C. And you can also take "hey, write a quick story about blah", which is a short representation of a thing, and convert it into a long representation, which is a dumb story about blah. Right? I'm
very very good at t