
Good afternoon, everyone. My name is Helen. Don't worry, I know it's almost lunchtime, and I promise I will try as much as possible not to make this boring and to make it relatable. We won't go deep into technical jargon; I'll try to balance the technical and the non-technical. I'm sure that all of us here today use the internet, right? So this is an interesting topic for us. Today I'll be talking about whether reCAPTCHA will be enough, and how we secure websites in an AI-driven world.

My agenda for today is as follows. First, the changing face of the internet: when was the internet born, and when is the internet's birthday? Then we'll see how this has evolved with the adoption of AI. Then I'll quickly explain what reCAPTCHA is, in as simple terms as possible, and drill into how AI is beating CAPTCHA, what the security gaps in reCAPTCHA are, and what modern security principles require. Lastly, we'll talk about evolving solutions beyond reCAPTCHA: what other solutions do we have out there that could protect our websites? We'll also talk about the future of web security, and you will help me answer the
question of whether reCAPTCHA is enough at the end of this session. So if there is one thing I would love you to take home after my presentation this afternoon, it is this: from what Helen said, is reCAPTCHA actually enough? Then you give me a yes or a no based on my presentation, or I can hear your opinion or view. This is not me telling you what is enough or not; it's about contributing together. This is AI, and I'm sure everyone understands AI, or knows it as a buzzword. AI is a new thing. But I'm happy that I'm talking to real humans here today and not AI, which makes it more interesting.

So, the changing face of the internet. Like I said, I was researching this. The internet has been a thing since the 1960s, but from a Google search, and from [unclear] as well, because we leverage some AI models to answer some questions, the internet's birthday is now celebrated every 1st of January. The internet was born on a specific date: the 1st of January 1983. I don't know who here was around in 1983 or before. So as we celebrate the new year on the 1st of January, we also celebrate having the internet.

AI is, of course, reshaping the web. AI powers everything from chatbots and recommendation engines to automated customer support and personalized content. Sincerely, 10 years ago I feel the way websites worked was a bit crappy, but right now everything is much faster and smarter. I would also say it's now more complex and exploitable in this AI world. Second, there is a surge in web dependency. There has been
an increasing reliance on digital services: e-commerce, and education that can be done online. You don't have to be [unclear] in person to get your first degree, for instance. I can do my master's or PhD online at a university in the US while residing in the UK. The same goes for your banking application: you don't have to go to your bank to make a transaction. It's now part of us. It's what we do, from finance to health to education and the like. And that has dramatically increased the attack surface.

I remember that back then security wasn't even a thing. When we first had TCP/IP, security was not considered at first. But because we now need more IP addresses, and there are now a lot of security concerns, we have something we call IP version six, IPv6, and the like. As AI becomes a thing, I think security will evolve as AI evolves. So every login page, every checkout form, every contact submission is now a potential entry point for bots. And the truth, from what I have here, is that as AI enhances user experiences, it also empowers more sophisticated and humanlike attacks.

In my everyday work I sometimes use an AI model, GitHub Copilot for instance, to help me write some functions. What would take me hours to complete takes me maybe minutes. So AI has really helped me, or helped us. At the same time, AI is also helping the bad guys out there create more sophisticated attacks that look humanlike. So the question here is: how effective are traditional bot defenses like reCAPTCHA in this new AI
world? That's the question. Are they really effective? Are they really securing our websites? I know that we are no longer just dealing with a rule-based bot or a basic script bot. Things have really evolved over the years, and we now need to start thinking about how we can ensure that our website, business or personal, is secured.

Now let me go to the next slide, which talks about CAPTCHA. I won't ask us a question; I will just tell us what CAPTCHA is. CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. Initially, what CAPTCHA did was use distorted text that only humans could read and bots could not. But right now it's not the same anymore. Even bots can read the distorted text and solve the other reCAPTCHA methods. The core purpose of reCAPTCHA is to distinguish legitimate human users from bots without creating much friction. I also came across the idea of a gatekeeper function, which protects your login page, your APIs, or your forms, and requires a user to pass a challenge or meet a trust threshold before continuing with the website. I know that all of us, in one way or another, have experienced this on a website when you want to log in:
either you tick a checkbox, or it says "identify the bicycle in this image," or it reads something out and asks you to type the letters. We'll dwell on that, but that's the main function of reCAPTCHA: telling apart who is a human and who is a bot.

In my next slide, I just want to quickly walk us through the different types of reCAPTCHA and the evolution of reCAPTCHA over the last two decades. In 2007 we have reCAPTCHA version one, which uses word transcription from digitized books, like distorted text. Then, seven years later, in 2014, we have
version two, which mostly uses the "I am not a robot" checkbox and also image classification challenges. And then in 2018, which was about 5 years ago, we have reCAPTCHA version three. Version three does not need user interaction at all. It uses what we call risk scoring based on behavioral data: data around your mouse movements, your clicking pattern, your IP reputation. That is the latest version that we have. All of these, version one, version two, and version three, are owned by Google.

And there are different types of CAPTCHA. We have the text CAPTCHA that uses distorted text. We have the image one that relies on contextual recognition. We also have the video CAPTCHA, which converts text or images into moving objects, so it's like a video. Then we have the audio one, which spells something out and you just type it. And then we have the puzzle one, which asks human users to interpret and calculate something. I think there are some puzzle ones where you have to drag a particular piece to fit into a particular image, and it gives you a score, 97% or 99%, and if you don't meet a particular score then you won't be
redirected to the next page.

So how is AI now beating CAPTCHA, or reCAPTCHA? The first way is solving visual puzzles. There are different models: we have the CLIP model, we have the DINO model, and we have OCR models that can solve distorted text, image puzzles, and icon identification with near-human or even superhuman accuracy. AI is never stressed, right? Sometimes you are stressed and can't solve a problem, or you have a lot on your mind, but AI is never stressed. AI does what it is written to do, so it can even exceed human accuracy when it comes to solving these CAPTCHA challenges.

The second way is AI emulating human behavior. Machine learning models are built in such a way that they can replicate humanlike interaction patterns, like your mouse movements, your delays, your clicking patterns. The machine learning model learns from real user interactions, from what we do. There is no good AI without the data the model has been trained on. So they use all our data to train these machine learning models, and the models become good at it, because that is what they are meant to do. That is what the model is built for. I had an example of a machine learning model for this that I memorized, but I can't remember it. Sorry about that.

The third way is using CAPTCHA farms, which we call CAPTCHA solving as a service. This is actually monetized now. There are APIs that offer real-time CAPTCHA solving, often at a cent per request, which shows how AI is beating CAPTCHA with this CAPTCHA-solving-as-a-service feature. And from what I have here, I
said: AI doesn't just guess, it learns. Models are now trained specifically to defeat CAPTCHA formats. It is not about bypassing one test anymore. It is about defeating the system architecture behind it. So these models are not just saying, "I want to pass this challenge." It's about, "I want to understand the system architecture behind this web infrastructure," which is a sign for concern.

Let's explore the security gaps in reCAPTCHA. Number one: it is not infallible. An example is version three. Remember, earlier I said version three does not require user interaction. It mostly uses a risk-based scoring system, so there is a high chance of a false positive, where real users like me and you are denied. The CAPTCHA system, which is version three, would see us as a bot, and you'll be rejected. You won't be able to continue with whatever you want to do. Then the false negative is where a bot is seen as a user. Remember what I said earlier about AI models even outperforming a user. So there is a high chance of a false positive or a false negative in reCAPTCHA version three.

Then there is what we call user friction, which is about the usability versus security tradeoff. Let me give us a bit of background on the general principles of security. We have the CIA triad: confidentiality, integrity, and availability. The CIA triad is not always going to be 33.3% each. Depending on your application, your project, or how sensitive your application is, you might want to vary it. Do you want more confidentiality over availability or integrity in that scenario? So in this case we need to
trade off user experience and security, and that's one of the security gaps. For example, look at version two, which uses the "I am not a robot" checkbox or image classification. Especially on mobile, I don't know if you've experienced this before, it's sometimes easier to solve the CAPTCHA challenge on a laptop than on a mobile device. So there is some friction around this. This poor user experience could raise the bounce rate: users may just abandon the whole thing, or they don't want to strain themselves and go through a customer support request instead, which means more support requests for the application support team. So AI should help with our job, but at the same time it's also creating some challenges if we do not strike the balance.

Another security gap I can see here is accessibility. An example would be the visual one, the audio one, or the image one: how would someone with a disability who cannot see be able to solve a CAPTCHA challenge in this situation? Some of these CAPTCHA versions do not comply with the WCAG standard, which is the Web Content Accessibility Guidelines.

The last one I have here is the privacy concern. Look at version three, which I said works on a risk score basis. That means it is taking some data from you, like capturing your IP address, which could raise a privacy red flag. One thing I also thought of while preparing this slide is that we might be depending on third-party services. With reCAPTCHA being owned by Google, if Google's server is down, your website is not protected. You are relying on a third-party service in this scenario.

So now we've talked about the internet, the different types of CAPTCHA, and the security gaps in CAPTCHA. Now let's
explore the main reason why we are here today. What does modern website security require? Defense in depth. I think it's everywhere. Even as humans, when we interact with other humans, we don't want to trust by default. I'm sure a lot of us don't just trust a stranger by default. You still have a layer of defense: let me test the water before dipping my leg into the ocean itself. This is also applicable to the tech world, to websites or any technical development. So, layered, intelligent security.

Number one, we have behavioral analysis. Behavioral analysis says: let's track how the user interacts, the click pattern, the amount of time you spend on a particular page, to detect anomalies. Another one is device and browser fingerprinting. What browser and device fingerprinting does is collect unique identifiers, like the plugins in use, to flag any suspicious configuration or spoofed environment. Then there is user journey tracking: how do we monitor the user's sequence, say from the login page to browsing and then to the checkout page, and see if there's any inconsistency or high-risk behavior? Another one is AI-powered anomaly detection. Let's leverage machine learning models that can adapt to new attack patterns, rather than just rule-based or script-based checks. What this machine learning model does is detect anomalous behavior based on patterns. It's not like signature detection; it learns your pattern over time and is able to predict if something is malicious or not. The last one is threat intelligence feeds. There are different threat intelligence feeds out there; leverage external
threat intelligence feeds and capture IPs that are already flagged as malicious or suspicious, and use that to improve your web security.

So, evolving solutions beyond reCAPTCHA. I need to run fast because I have just a few more minutes, and then I can give us room to contribute. Beyond reCAPTCHA we have what we call hCaptcha, which is more privacy focused and can be monetized by some websites. One thing I like about hCaptcha is that it's GDPR friendly, because it focuses on minimizing personal data collection. It doesn't collect much data from you. Then we have Turnstile, which is owned by Cloudflare. Turnstile is also invisible and non-interactive, and it considers user experience first. It also uses device and browser signals behind the scenes to train its model. The last one here is biometric and behavioral authentication, like continuous authentication. It's not just passing a CAPTCHA page at login and that's all. It goes beyond that. What it does is perform continuous authentication throughout the user journey: when you log in, when you're trying to make your transaction, when you log out. That way it's difficult, at the moment, for a machine learning model to learn the entire behavioral journey of a user. I'm not saying we're not going to get there, but at the moment that makes things more difficult for them.

So, the role of AI in security. I think I've mentioned this: AI can be seen as a solution and, at the same time, as a risk. AI as a solution is about helping with threat detection, fraud prevention, and adaptive access control. Adaptive access control can automatically adjust verification based on context, like your location, the time of day, or the device that you are using. And then fraud
protection could help with identifying fraud and account takeovers.

AI as a risk: there could be bias and discrimination. There are some poorly trained AI models that could discriminate based on your location or based on the type of user you are. I'm not going to mention countries or anything in this case, but I'm sure you understand what I mean. Then there is lack of transparency. An example is version three, which uses a non-interactive way to verify you. We are not sure what exactly this model is using to decide if you are human or bot, and that makes security audits quite difficult. Then there is over-reliance on automation. A lot of people nowadays rely so much on AI that we overlook the lapses of AI. And this is what the bad guys do: they look for these lapses and use them to exploit us. This also applies to the way we work. As a developer, for instance, you don't want to over-rely on the function that AI produces for you. You want to be able to tweak the code to suit your own understanding or your business policy or process. You don't want to pour out code generated by AI and dump it into your software development life cycle.

The future of website security is mostly about adaptivity, trust, and continuous monitoring. As threats evolve, we need to evolve with them. If AI is being used to produce some of these threats, then we need to use AI to combat these threats as well, adapting to evolving AI capability as AI gets smarter, faster, and also more deceptive. That means our security systems need to act as predictive
intelligence and not just a reactive control. We need to think ahead of AI and be able to strike a balance between the two, and also build trust without burdening users. Users expect a seamless, secure experience, and we need to be able to build a frictionless and privacy-conscious model for AI. So, is reCAPTCHA enough? What do you think? Questions, contributions?

[Music]

So, looking at most of the additional features you're adding, if you disable JavaScript in your browser, you can't do any of that. So, as a website owner, you've got a choice between discriminating against people who disable JavaScript, or not allowing them to log in or create an account at all, or running your reCAPTCHA. How do you recommend dealing with that, from an accessibility point of view in particular?

Yes. I think it all depends. With disabling JavaScript, for instance, is it going to be adaptable to all users? Some browsers allow JavaScript; some browsers do not. So I think you need to strike a balance based on your user base: who are the kind of people that will be accessing this website? It's going to be a risk-based decision. I wouldn't say yes, disable it, or no, don't disable it. But you have to strike a balance with a risk-based view: if a user's browser does not run JavaScript, how would that affect my business, for instance? You just have to be as flexible as possible to accommodate lots of users.

Does anybody want to answer Helen's question about whether reCAPTCHA is going to be enough, or is that a conversation over lunch? Alternatively, do you have any other questions? Yes, sir, at the back. Thank you for asking a question.

You mentioned biometric and behavioral analysis. Do you see that playing a growing role in, I guess,
determining authenticity, or is the focus really in other areas?

So you're asking a question about the biometric authentication method. Yeah, I think for now I see that as a way to combat AI threats, because, as I explained about continuous authentication, it's not just going to authenticate at the login page alone. It's going to continue to challenge you, or look at your gait or gesture, throughout the user journey, not just at the login page. So I think that makes it inconvenient for a bad guy or an AI model to quickly bypass. I'm not saying that tomorrow they won't be smart enough to bypass it, but what we are trying to do now is make things difficult for them. For instance, with MFA, we are not saying MFA cannot be bypassed. It just costs them extra time; it gives them an extra headache to bypass MFA, but eventually they can bypass it. That is why we have multiple layers of defense in our technology or website.

Let's give Helen a round of applause. Thank you very much.
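
The layered approach described in the talk (a version-three-style risk score combined with behavioral signals, threat intelligence, and adaptive access control) could be sketched roughly as follows. This is a minimal illustration and was not shown in the talk: the `RequestSignals` fields, the weights, and the thresholds are all hypothetical, chosen only to show how several weak signals might feed one decision instead of relying on a single CAPTCHA check.

```python
from dataclasses import dataclass


@dataclass
class RequestSignals:
    """Hypothetical per-request signals; field names are illustrative only."""
    bot_score: float        # 0.0 (likely bot) to 1.0 (likely human), as in a v3-style score
    seconds_on_page: float  # behavioral analysis: time spent before submitting
    ip_flagged: bool        # threat intelligence: IP already flagged as malicious
    known_device: bool      # fingerprinting: device seen before for this account


def risk_score(signals: RequestSignals) -> float:
    """Combine the signals into one risk value in [0.0, 1.0] (weights are assumptions)."""
    risk = 1.0 - signals.bot_score       # start from the CAPTCHA-style score
    if signals.seconds_on_page < 2.0:    # submitting a form almost instantly is unusual
        risk += 0.2
    if signals.ip_flagged:               # a flagged IP weighs heavily
        risk += 0.4
    if not signals.known_device:         # an unfamiliar device adds a little risk
        risk += 0.1
    return min(risk, 1.0)


def decide(signals: RequestSignals) -> str:
    """Adaptive access control: allow, ask for extra verification, or block."""
    risk = risk_score(signals)
    if risk >= 0.7:
        return "block"
    if risk >= 0.4:
        return "step_up"  # e.g. require MFA or an interactive challenge
    return "allow"


if __name__ == "__main__":
    regular_user = RequestSignals(0.9, 30.0, ip_flagged=False, known_device=True)
    odd_session = RequestSignals(0.7, 1.0, ip_flagged=False, known_device=False)
    likely_bot = RequestSignals(0.2, 0.5, ip_flagged=True, known_device=False)
    for name, s in [("regular", regular_user), ("odd", odd_session), ("bot", likely_bot)]:
        print(name, decide(s))
```

The middle "step_up" outcome is the point of the design: a borderline session is not rejected outright (the false-positive problem raised for version three) but is asked for an extra proof such as MFA, which matches the idea of multiple layers of defense.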