
Good morning everyone, welcome to BSides Las Vegas and to Breaking Smart Bank Statements. These talks are being live streamed on YouTube, so we would ask the audience to please mute their cell phones, and there will be about 10 or 15 minutes at the end for questions. Thank you.
Hello everyone, welcome. Today I'll be presenting Breaking Smart Bank Statements, how to read a bank statement without a password. This will be the agenda for today: a little bit of context, the analysis of the file, the analysis of the JavaScript, a brief talk about RC4, a demo, the analysis after the fix, conclusions and Q&A. So who am I? I work as a security researcher at Trustwave SpiderLabs. That's the shirt I have on. I previously worked on the offensive side of security as an ethical hacker, and before that I worked on the defensive side in network security monitoring. I really like web attacks, everything web related, both defense and offense. Denial of service is also a very, very interesting topic. How can a few people or a few servers bring huge things down? That's really interesting. If anyone is interested, we can discuss both of them for hours. Other things that I really like and that we can discuss for a long time are dogs and tacos. I can definitely talk about them for a long, long time. There's my Twitter handle if someone wants to follow me.
And context. So this research, I did it last year, it's... So this is just the attack function, and then a little bit of hiding and rewriting, so there it is. You can see the bank statement. So in order to be a full demo, I have to test it with another file. This is the file of someone, like a friend, but I have no idea what their password is, so I try something random. It doesn't work. And I use the very same exploit. So I define the XOR function and I'm using the same plaintext because, let's remember, this is HTML, and the same part of the HTML will be the same for every user.
This is what we will use as the base of our attack. And here I'll just put another name because this is not my bank statement. And use it, and that's it. We use the same attack and we open two different bank statements. So, yeah, there you go. If you look into the details, like the amount of credit and the expenses, etc., they are different because we're two different people, but I think it's not required to go into those details. Also, I'll show you quickly the content of this HTML file. So this is what I showed in the first slides. This is the file. It's all encrypted text. We've got no way of knowing what's in here, and there's
very little HTML and very little CSS. So most of it is this encrypted text. Let's continue. So once the bank notified us that this was fixed, I started analyzing the new file I receive every month, and it looks like they did fix it. Right now they're using CryptoJS, version 3.1.2. They're using the specific modules for AES and SHA-256. The decrypt function now uses AES, so that's good. That's way better than using RC4. Before decrypting, they're putting all the encrypted messages together and joining the text. For the hash, they're using SHA-256. It's better than SHA-1, it's a little bit harder in terms of computing, so against brute force attacks that'll be better. And they also, it's
a good idea, they changed the hash, because if they had left SHA-1 that would be an issue, because they would give you the key to the previous messages. All of this disclosure was transparent to the clients. As a client, I didn't get any message like "we're changing the security" or "we're changing anything". So it's good they did it this way, because they helped their customers be safer and they didn't break the security in the meantime. So a little bit of conclusions. I think I ran a little bit fast, but you'll have more time. Every time you roll out a new security product, it's always important to have a security review. If you're in any business and management or
leadership decides to roll out a new product, try to be strong on the idea of having a proper security review, especially if your product has an obligation to comply with certain laws or certain precepts that you are required by law to follow. If you're doing complicated things like cryptography, hire specialists if your team doesn't have the proper skill set. Cryptography in general, I think, is harder, especially for developers. I mean, if you are not a cryptography expert and you Google stuff, you may read some stuff online that's not 100% true, right? Stack Overflow and similar websites may have an answer that kind of works but is not the safest. I mean, here they did cryptography and it kind of worked but
it wasn't very safe. So that's why you should hire specialists. Hire people that really know their cryptography. There are a lot of smart people in the world. In this city right now there's a lot, a lot of very smart people. You should hire them. Always use cryptography algorithms that are considered safe. Never use old algorithms. Never use algorithms that are not the best at the moment because that will be an issue. If not an issue today, definitely an issue in the near future. Never ever roll your own cryptography algorithms nor implement a cryptography algorithm differently. Unless you are designing a capture the flag challenge, don't do this, especially for production environments. Never use the same key more than once in a stream cipher. This
is cryptography 101. But as security people, we must have this very clear, especially because sometimes we check the work of developers, and they're not security people, and that's fine. They know how to do their thing and they do it very well on their side. They write amazing code, but they don't know our side. So we have to help them with this side. We have to give them these very few tips, like never use the same key more than once in a stream cipher, use good encryption algorithms, etc. If possible, especially if you work in a bank, try to have a simple, safe and clear way of communicating security issues. This is something that's unfortunately not as easy or as common. I've done
a lot of disclosure. Thankfully, the SpiderLabs research intelligence team has helped me. They're an amazing team. But I know that if they weren't on my side, this would be really hard. This disclosure in particular was hard for them, because the bank doesn't have a very simple website like "Hey, give us a message here if you find anything." Most of them don't have it. So if you work in an organization, please check if you have a way to receive bugs. And if you don't, start the process and start talking with people, especially with leadership or management, so you get this process going and you have a clear way of receiving reports. There's a blog post on the SpiderLabs blog if you want to read
it. It's basically the same but a little bit shorter. And as a security researcher, I also have a few tips for the people who do research. Every time you get a new product, every time you see something new, try to hack it. Even if you say, "Oh no, this was made by this big company. They should definitely have some good security in there." Try to hack it, you know? Read the code, understand what's going on. You may be surprised. I was very surprised. This is a big bank. They have millions of customers. I was pretty sure they did their proper security review. But in the end, I also understand there's always a business need to release faster, to release cheaper, to release...
Code generally goes out very fast. So that gives us an opportunity as security researchers to find fun stuff. And don't try to do stuff that's illegal. Also be careful with that one. And questions. Does anyone have any questions?
Do you have any idea how much back and forth SpiderLabs had to go through with this bank?
Yeah. I remember I sent this to the intelligence team and the next day they told me, "Okay, we've already sent an email." I went back to them like a week later and I said, "Hey, have you got any news?" And they told me, "We've emailed a couple of people and a couple of public addresses but got no answer." And then
I said like, "Okay, let's wait a couple more days." They sent more emails and they got no answer. So at some point they told me they're not answering: "We don't know if we're hitting the proper people. We don't know if we should go to Citi. We don't know what to do." So I asked a former colleague who worked in Citi for a while, and I said, "Hey, I've got an issue. Do you know who I can talk to?" And she referred me to this person and advised me, "This is not the guy who will solve your issue, but at least he will read it and he will know
who to forward it to." So it was definitely a few emails going back and forth, and there were also some NDAs. So the intelligence team did an awesome job, but I think this process was too hard. If I had to do it personally, I wouldn't finish, because it's very exhausting and it's hard to communicate properly with certain institutions.
Okay. And compared to other customers, how hard was it? How difficult was it in terms of the timeline, of talking to people? I'm assuming that other companies are more open to these kinds of disclosures or to fixing these problems, and probably this is just bureaucracy, normal bureaucracy in a Mexican bank. So I'm just
curious to know how difficult it can be for, let's say, a manufacturing company or another kind of industry.
That's a good question. This is the first issue I've found in a bank on my own, so they're not expecting the report. So that's always a different treatment. But I have found a couple of issues in software, like VPN software, and that got a very, very fast response in comparison, because when they do software I think it's easier to patch. This one, I can imagine, had to go through a lot of teams, and a lot of people are in this decision because it's a bank and
it's a big, big bank and there's a lot of decision making. So it was very slow in comparison to others, but in general I think two or three months is a very good response time for a bank in Mexico.
Last question, sorry. So in your opinion, how is Mexico with bug bounty programs? Have you seen many companies actually adopting this practice, or is it just something that no one cares about?
My opinion is that it's an area for improvement. I mean, I haven't really seen big companies do it. Hopefully in the next few years there will be more programs and more awareness of that topic, but today, no. For this research, I got nothing from the
bank, not even a pen or something. I got authorization to talk about it, so that's good. Thank you.
Thank you. So for the known plaintext, did you just open up your own bank statement with your own password and then just extract the first chunk? How much of the plaintext did you need?
That's a very good question. So what I did was open my bank statement with my normal password. I grabbed a little bit more than the length of my first encrypted messages. I mean, it depends on how big your bank statement is and how many messages they split it into. Some months I got like 28, some months I got like 33 messages. So I use
the largest one. I count the bits, I use the one that's largest, and a little bit more. And the XOR function I'm using will stop whenever one of the two messages ends; if one message is longer than the other, it just stops the function there. So it's better to have a longer keystream and not use all of it than to miss some bytes and have a wrong result, like a bad result. And I did use my bank statement, my first one, and it breaks all of the bank statements of that month. There is also a little situation, and that's the reason... well, I didn't publish this tool because of two reasons. One is I think it doesn't really
help the good people, because that's the theory of breaking it, but you don't need to break something to see that you're able to break it. And number two, I saw like two or three different versions. I think they fixed some bugs, like production bugs. So at some point they changed the JavaScript libraries. The attack still works, but as a tool you would need to handle about two or three versions, and you'd need to do some fancy things to detect if they're using version one or version two, et cetera. But yeah.
Okay, thanks.
A quick question. You mentioned that there was a line of code missing from part of that algorithm. And I was wondering, have you determined whether that is a
bug or a feature? Something that they didn't have before.
So this is the RC4 function they're using. And it's just in this line: the algorithm you find on Google or Wikipedia or wherever, the original algorithm, has some parentheses here and it says "+1". This is also a proprietary algorithm, so there is no RFC or any official source for the algorithm. This is a funny story. I guess lots of people in this conference lived the story. I didn't live it. But this algorithm was leaked, so the leaked algorithm is the one we know as RC4, but for the official one you have to pay. And this is just the line that's changing. I did a lot of thinking and
I asked some people, and there is no direct effect, negative or positive, but I think there must be something somehow. So if anyone knows the answer, what this plus one does here, that would be interesting. It's also more of a thinking exercise, because this had a bad implementation. There was a bigger issue than missing a line of the algorithm: it's a bigger issue that they used it improperly. And the way they fixed it was using a different type of algorithm; instead of a stream cipher, they're using a block cipher. So, yeah, there is no solution on that one. It won't really have an impact to put in all these hours, but it's a really fun challenge. If anyone knows, I
can get you some stickers. I've got pictures of my dogs on stickers. Does anyone have another question?
Thanks. It was really interesting, especially since I'm from Mexico too, so knowing this brings some concerns. Do you know if any other banks are following the same practices, like Scotiabank or any other ones? Have you ever looked at those?
That's an interesting question. I don't have a bank account in all of them. I have accounts in some of them, but I have friends who have accounts in most of them. And they all tell me they get their bank statement in either an encrypted PDF or an encrypted zip file. Also, my friends don't like to share their bank statements with me for some
reason. Like, you know, just give them to me for free and I won't do anything, but they have their doubts. As for using a smart statement, I've only seen Citibanamex. But the idea of using encrypted PDFs and encrypted zip files all over the country is a very interesting topic, because I've talked with some people and it's not the safest approach. Every year it's easier to brute force, so at some point all of that security will be outdated. So that's also a very interesting place to start thinking about it. There are a few banks, to be honest, that do it correctly. I think American Express is kind of a bank, and they only send you an email, which I think
is the same they do here. Like, "Hey, your new bank statement is ready. Log into the page and download it." That's a little bit of a pain for the user, but it's way, way safer because you cannot brute force that one. I mean, the window of opportunity is very, very small. Yeah?
They have a very easy way to report spoofed emails, so... Yeah, speaking of Amex, I also have them as a bank, and they're the only ones I've found that have a very... Like, I get phishing emails from fake Amex, and Amex has a very clear way to report those emails. You just forward it to a certain address, so I was pretty impressed with
that as well. Yeah, yeah, yeah. Some of them do a little bit better jobs. Well, thank you, everyone. Hope you enjoyed.
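The known-plaintext attack from the demo can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool from the talk: it uses textbook RC4, a made-up key, and two made-up HTML fragments standing in for statements. The names `rc4_keystream`, `xor` and `rc4` are hypothetical helpers. The `xor` helper stops at the shorter input, which mirrors the behaviour described in the Q&A.

```python
def rc4_keystream(key: bytes, n: int) -> bytes:
    """Textbook RC4: key scheduling, then n bytes of keystream."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings; zip() stops at the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: RC4 is just XOR with the keystream."""
    return xor(data, rc4_keystream(key, len(data)))


# Hypothetical data: the key and both messages are invented for this sketch.
key = b"hypothetical-key"
my_plaintext = b"<html><head><title>Statement for Alice</title>"
other_plaintext = b"<html><body>Balance: 1234.56</body>"

# Assumption: the same key (and so the same keystream) encrypts both messages.
my_ciphertext = rc4(key, my_plaintext)
other_ciphertext = rc4(key, other_plaintext)

# The attacker knows their own plaintext, so they can recover the keystream...
keystream = xor(my_ciphertext, my_plaintext)
# ...and decrypt the other message without ever learning the key.
recovered = xor(other_ciphertext, keystream)
print(recovered)
```

The recovered text is only as long as the keystream the attacker holds, which is why a longer known plaintext is the safer choice.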
Good morning everyone, welcome to BSides Las Vegas. This talk is entitled "An investigation of the security of passwords derived from African languages". Our speaker today is Sishi. Very sorry. I'd like to thank our sponsors today, Critical Stack and Valimail, and of course our stellar sponsors Amazon, BlackBerry and Cylance. I'd appreciate it if you could all turn your phones to silent. There'll be some time for questions at the end. I'll hand out a microphone, just raise your hand and I'll come on over.
Great. Hi everybody. So my name is Sibusiso Sishi. Sishi is just my surname; it's just easier for people to pronounce my surname. So a little bit about me. I'm co-owner of a company called Iron Sky, a South African based penetration testing company. My main
function is basically to be a penetration tester. Before I did anything security related, I used to be a professional athlete and went as far as the Olympic Games and World Championships. So yes, I basically brought my third side into security as well. So my talk is about the effectiveness of passwords, basically talking about everything to do with passwords from South Africa. In South Africa the business language is English, and what we found is that most of the password policies written by organizations are in English, you know, with English examples, and the systems that they use are also in English. And also user awareness around passwords is
very, very low in South Africa. And user education about just general cyber risks is also very, very low in my country. For most people, just looking at the default Windows password policy is like, okay, we've ticked the box, we set this up, everything should be done, all passwords should be of a high standard. And most of the times, once the organization has set this, they really don't look at passwords anywhere else after that. So what I did is I asked seven organizations in South Africa if I could just collect the users' passwords. And the companies that I picked, they are in different sectors. So one is government, pharmaceutical, and different other industries. And I collected
2,614 LM hashes and 170,433 NTLM hashes. All the hashes that I collected are user based, so there are no computer accounts in this; these are passwords that somebody has created. Then basically what I found was that I was able to crack 2,242 of the LM hashes, because the problem with LM hashes is that it takes the password, breaks it up into seven-character chunks and uppercases it. And what I found was that most people in the organizations created passwords like ABC123, password and fish eagle as the main passwords. Just looking at these passwords, you can see that they look like generic passwords that an administrator may have created and then just forgot about. On the NTLM hash side of it, I
was able to crack 76,588 of these passwords. And most of the passwords that I found were password01, and then PASSWORD in all capital letters, and the company name. So a lot of people in South Africa use the company name as one of their passwords to cope. And this is just a breakdown of the passwords that I found. So I just grouped them together. So you can see that Fish Eagle 123 was counted 179 times, then Fish Eagle and ABC. Where it says Redacted 123, that's basically the company name. We found that a lot of organizations love using their company name and then adding 123 or some special character at the end to create their passwords. And as
you can see as well, password is very, very popular as one of their coping mechanisms for making passwords. Other interesting information about the passwords is that users created passwords according to what the company makes as a product. For example, there was one company that makes fertilizers, so people made passwords around fertilizer as well. People also used the cities and towns that they were in for passwords. So I'm from Johannesburg, so some of the passwords that I was able to crack were from people who used Johannesburg 01 or Johannesburg full stop as one of their passwords. Some of the service account passwords that I found were ones where IT administrators will create a password, for example, called SQL, SQL 30, depending on whatever service that
they're using. Also, for example, for SharePoint, the SharePoint installer account will have SH installer as a password or something generic like that. South Africa is very religious and people mostly follow the Christian ideology. So people will use Bible verses as well to make passwords. So you'll find like Genesis something or Romans. And there were quite a few people who used the literal phrase of a Bible verse as one of their passwords as well. A lot of people also use months, so they'll take whatever month, so June or August 19, as a password as well. And keyboard walking, so people will go like QWERTY, AQSZ, WSX as sort of like
a password, which is very easy to guess if you have a nice word list. And these are the passwords that I was able to crack, most of them. Most of the passwords I cracked were 8 characters in length. And the reason for 8 characters is that, for most organizations that I tested, and I looked at their password policies as well, 8 characters was the minimum password length that they wanted the users to have. Then obviously it started to go up, 9, 10, 11, 12, 13, and then back down to 6. There was one organization which said the users could create a 5 letter password as well.
So this is just the breakdown of what the passwords were like, using Hashcat. So you see they were mixed alphanumeric. There were 33,496 of them, which accounts for nearly 50% of all the passwords that I was able to crack. There were some passwords that actually had a special character and a number. And then there were lowercase passwords. There were some organizations where they really didn't care about how users entered their passwords, just as long as the user had a password. And then there were some passwords which were basically just numbers as well. And most of those numbers were literally 123456. So this got me thinking about... So
I was basically in Zimbabwe. I was doing a review for a company in Zimbabwe and I was struggling to crack passwords in the organization. So I basically captured the hashes and I'm sitting there trying to crack them using my normal word list, and none of it was working until I decided, okay, let me just dump the whole domain controller, take all the usernames and all the surnames as well, and then start doing plays on those. And it's only then that I started getting hits. For example, the one account that I was able to get was Chow Runora. So there was a user whose account was Charunoma and
his surname, and when I used his name I actually got a hit. And then this got me really thinking: if we converted known weak English passwords into another language, how effective would it be? Would people still be able to crack them and guess them as well? So this is basically South Africa. In South Africa we have 11 official languages and the languages are based on regions. So I am from KwaZulu-Natal, so I speak isiZulu. KwaZulu-Natal, I'll show over here. So this is KwaZulu-Natal, so I speak Zulu. And most of the regions are very
like, there's a very dominant language for that region. So the region just below it is the Eastern Cape, and most of the people that reside there speak isiXhosa. And then in the Northern Cape and Western Cape, most of the people that reside there speak Afrikaans. So yeah, South Africa is very, very much regional. And then the last one, sorry, my apologies, the last one is right there at the top, that is Venda. And we will speak about that a bit more as well. So South Africa has got 11 official languages, but for the purpose of this talk and my research, I only decided to pick eight of the most spoken languages in South Africa. So I picked
English, isiZulu, siSwati. So siSwati is a very unique language where it has South African speakers, but at the same time people in Swaziland also speak the language. It's actually mostly their language. Sepedi, Tshivenda, isiNdebele, Setswana, Xitsonga and isiXhosa. Another very important thing about African languages, especially in Southern Africa, is that they're very much grouped. For example, I speak isiZulu, but I could speak to people who speak Swati and Xhosa, for example, because we share some common bond. They're part of the Nguni languages. So if you speak Sotho, for example, you can speak Northern Sotho as far up as Botswana. And if you speak Zulu, you
can speak as far up as the Congo. Because I've been to the Congo before and I've listened to them talk and I actually could pick up words from their languages just because they share some sort of words with us.
So for the experiment I basically collected weak English passwords, converted them into the eight African languages that I selected, and then sent them to various password cracking sites and people. So the whole process was: I collected the top 19 weak passwords from the Symantec Top 500 list. But one thing I made sure of was that the words were translatable. For example, the numbers, I had to disregard the numbers, because numbers are just numbers in any language. And then another very important thing is that the Tshivenda language has got its own unique special characters that do not appear in some of
the dictionaries and some of the keyboards as well. So I had to remove the special characters, otherwise people are really going to struggle to crack them. And another very important thing I did is that I made all the words lowercase. The reason that I made them all lowercase is that in isiZulu, for example, we sometimes capitalise the third letter of the word. In English, generally you'll capitalise the first letter, but in my language we sometimes capitalise the third letter or sometimes even the second letter. For example, isiZulu: if you look at this word here, we capitalise the third letter in isiZulu. For example, if
you're saying soccer, we say iBhola, which means that we only capitalise the second letter of the word. To basically keep it standard and the same for everything, this is why I lowercased everything and then capitalised the first letter for every language, irrespective of what language it was. Then another very important thing is that I used the special characters from all the password cracking that I did for the seven organizations. I used the special characters that the users used for making passwords. I didn't want to introduce any new special characters; I wanted to see if people could crack the same kind of passwords.
And then what I did is that I made the words and converted them into NTLM and MD5 hashes. So these are the weak passwords that I selected for each language. English is highlighted in yellow. And then for each language, I converted it into that language. So for example, this is Zulu. In English it's password, but in Zulu it's ipaswedi. And then the same for each of the other languages. You'll see that in some languages we do share some words. For example, in Zulu we say 'ipaswedi', in Xhosa they also say 'ipaswedi', and in Tsonga they also say 'paswedi'. So, you know,
there's a lot of sharing of words. If the word was shared between two languages, I only used it once in the research. And then what I also did is that I also included the months. So I took all the months and converted them into the various languages as well. Because as you saw in the users' coping mechanisms, they also use months to cope as well.
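The word preparation described above (drop the special characters, lowercase everything, capitalise the first letter, keep a shared word only once) could look roughly like this in Python. It is a sketch under assumptions: "removing the special characters" is interpreted here as stripping diacritics so that only the base letter remains, which may not be exactly what was done in the research. The function names and the tiny word lists are hypothetical, and the spellings follow this transcript.

```python
import unicodedata


def normalise(word: str) -> str:
    """Strip combining marks, then lowercase with a capital first letter."""
    decomposed = unicodedata.normalize("NFKD", word)
    base_letters = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # str.capitalize() lowercases the rest of the word for us.
    return base_letters.capitalize()


def build_wordlist(translations: dict[str, list[str]]) -> list[str]:
    """Merge per-language word lists, keeping each shared word only once."""
    seen: set[str] = set()
    wordlist: list[str] = []
    for words in translations.values():
        for word in words:
            candidate = normalise(word)
            if candidate not in seen:
                seen.add(candidate)
                wordlist.append(candidate)
    return wordlist


# Hypothetical sample input, spelled as in the transcript.
sample = {
    "zulu": ["ipaswedi", "iBhola"],
    "xhosa": ["ipaswedi"],
    "tsonga": ["paswedi"],
}
print(build_wordlist(sample))
```

Normalising before de-duplicating means two languages that differ only in capitalisation or diacritics still collapse to one entry.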
And then, so basically these are just examples of how the passwords got created. You'll see there in English, for example, let me in. So let me just explain. In Zulu, for example, we can take a whole sentence and make it into one word. For example, let me in, in my language is in Fagapagat. So we can take 'let me in', which is three words in English, and just make it one word. So basically some of the phrases that people used, like 'let me in', I converted into one word as well. And then inside is called "pagati", "welcome" in Zulu. Okay, that's the... Okay, another very important thing is that there are two ways that we say this:
traditional Zulu, and then there's the more modern way of saying it. So basically I used the very traditional way of talking. So it's like Zulu, we say Zulu one, which is very, very hard even for me to understand, because the words are insanely deep and insanely long as well. And then to basically keep them all the same, the Hashcat mask that I used was uppercase for the first letter, lowercase for everything else, then random digits and a special character at the end. And then for the months, I did sort of the same, but the two digits that I added at the end were the year that we're
in. So we're in 2019, so I'll add in one nine at the end.
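The password shape just described (capital first letter, lowercase rest, digits, one special character at the end) can be sketched in Python as follows. The helper names are hypothetical and the digits and special character are placeholders, not the ones used in the research. Only MD5 is shown; an NTLM hash is MD4 over the UTF-16LE encoded password, and MD4 is left out here because `hashlib` does not guarantee it is available on every system.

```python
import hashlib
import re

# Shape of the generated passwords: Upper + lowers + digits + one special.
SHAPE = re.compile(r"^[A-Z][a-z]+[0-9]+[^A-Za-z0-9]$")


def make_password(word: str, digits: str, special: str) -> str:
    """Capitalise the word, then append digits and a special character."""
    return word.capitalize() + digits + special


def make_month_password(month: str, year: int, special: str) -> str:
    """Same shape, but the two digits are the last two digits of the year."""
    return month.capitalize() + f"{year % 100:02d}" + special


def md5_hex(password: str) -> str:
    """MD5 of the UTF-8 encoded password, as a hex string."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# Placeholder digits and special character, chosen only for illustration.
word_password = make_password("sawubona", "01", "!")
month_password = make_month_password("june", 2019, "!")

for pw in (word_password, month_password):
    print(pw, bool(SHAPE.match(pw)), md5_hex(pw))
```

In Hashcat terms the same shape is built from the built-in charsets `?u`, `?l`, `?d` and `?s`, with one `?l` per lowercase position.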
So this is the breakdown of the words that got generated. At the bottom there are the character lengths, and then obviously it's the count of the words. So most of the words that got generated from English into the African languages were 11 characters in length, so 58 of them, followed by 10 characters at 50, 8 characters at 47, and then 9 at 47 as well, 12 at 37, then 13, and then it sort of goes down. And the reason that you'll see that I only generated very, very few... I tried to stick as much as I could to how the users made the passwords that I was able to crack.
For example, most of them were 8-character passwords. But just because of how my languages are in general, some of the words would just be long. For example, hello is H-E-L-L-O, it's five letters, but in my language we say S-A-W-U-B-O-N-A. So it's the same word, hello, same meaning, but in my language it's just longer. Hence the words that are generated here will sometimes also be longer. So I basically uploaded the password hashes to Hashkiller, hashes.org and OnlineHashCrack, and then I also asked some people that I know around me, like, hey, I've created these words, can you try to guess them? Basically the reason I uploaded
them to these services is that I wanted to get as many people as possible to try, to see which people could actually crack some of these passwords. And then what I did is that I regularly checked the websites to see if they appeared there. I knew what the hash was, so I'd just paste the hash into the service, and if the password was cracked, cool, I'd see it. But if it wasn't, then I'd wait a few more weeks. So this was last year: I uploaded them on the 30th of July and then I stopped checking
in August last year. And then I found that 48% of the 359 passwords that I created were successfully cracked across the online services. So 100 NTLM hashes and only 55 MD5 hashes. Hashkiller accounted for nearly 90 of the 100 NTLM hashes and most of the MD5 hashes as well.
So this was basically the breakdown of the passwords that people were able to guess. So 8-character passwords accounted for the most, followed by 9-character passwords at 28, 10 at 22, 7 at 16, and then 11-character passwords at 15. When you put them together with what we generated, you'll see that in the 8-character passwords, for example, nearly all of them got cracked, and then the 9-character ones just slightly less. So 15 out of the 58 11-character passwords were successfully cracked. All 7-character passwords were cracked. And for 8-character passwords, 37 out of the 47 were successfully cracked. None of the 21- and 22-character and other high-length passwords were actually successfully cracked.
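As a small worked example, the crack rates for the two lengths where the talk gives both a generated count and a cracked count can be computed like this. The dictionary layout and the helper name `crack_rate` are illustrative assumptions; the figures themselves are the ones quoted above.

```python
# (generated, cracked) pairs per password length, as quoted in the talk.
# Only the two lengths with both figures stated are included.
COUNTS = {
    8: (47, 37),
    11: (58, 15),
}


def crack_rate(generated, cracked):
    """Return the share of generated passwords that were cracked, in percent."""
    return 100.0 * cracked / generated


rates = {length: crack_rate(*pair) for length, pair in COUNTS.items()}

if __name__ == "__main__":
    for length, rate in sorted(rates.items()):
        print(f"{length}-character passwords: {rate:.1f}% cracked")
```

On these numbers, roughly 79% of the 8-character passwords were cracked against roughly 26% of the 11-character ones, which matches the point that the longer translated words held up better.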
So these are basically the passwords that got found, in green. So these are the English passwords, and all the passwords got sent out, but the ones in green were the ones that people were successfully able to crack.
And then for the months, you'll notice it's basically the same as well. People actually struggled to crack the Xhosa months, while the Setswana months people were able to crack successfully. Just going back one, you'll find that, for example, in isiZulu there were basically only four words that people struggled to find, but in Xhosa, for example, there were... Setswana, they were also able to crack a lot of them; only three did not get found. So basically, the word password was successfully cracked in two languages, isiZulu and isiXhosa. The
word hello was the only word to be fully cracked in all the languages. It's a greeting and it's said so many times, so people must have it in their dictionaries. IsiZulu was the Southern African language that was cracked most successfully out of all of the languages, while Xhosa was the least cracked language when it came to the months. The most cracked language when it comes to months was Setswana. So, in conclusion: corporate South Africans use English-based passwords even though there are eleven official languages. South Africans use regularly available objects to create their passwords. There's a lot of bias towards English-based passwords, because your computer is in English. The system is asking
you, hey, please enter a password, in English. So obviously the user, even though you speak Zulu at home, for example, like me, you'll still create an English-based password, because there's a lot of bias towards that. Password is very popular in South Africa, and it's got a very, very high hit number within corporate South Africa. Converting English-based passwords into Zulu did not offer any additional protection. The reason for this is that Zulu is so widely spoken in South Africa. Zulu is sort of the language where, if I meet another black person in Southern Africa, I would sort of expect them to understand my language, even if they don't. So it's got very much of this, like, we're
just proud of ourselves and we want to tell everybody else around us. The Tshivenda language might actually be the better one to create passwords in if you use its special characters. Because to get its special characters onto the keyboard, there's only one person that I've found who has created a file that you have to put onto your Windows system to be able to get those special characters. So I had to remove them just to make it a bit easier, but I'm sure if you added in those special characters, they would be really, really hard to crack. Using months in Xhosa was obviously the least
cracked, and I've got no reason as to why. Xhosa and Zulu are, you know, part of the same Nguni group, so it should have been a little bit easier to guess and to get right. Thank you very much.
On your first slide, fish eagle was the third most common. What is the significance of fish eagle?
Fish eagle, actually, there's a company in South Africa, they're a banking company, and they use that word as one of their passwords. So it's very easy to put two and two together, but that's the reason why they use fish eagle a lot. Well, thank you very much, guys. If you have any more questions,
please come up to me and talk to me. Oh, do you have a question? So, you said that...
No, isiZulu. IsiZulu, yes. Zulu. Zulu was the most cracked. But you said Setswana was... Was also highly cracked. Let me see if I can find it. So isiZulu is the most cracked language. So if you look at... Sorry, for converting the passwords from English into that, not the months, Zulu was the most cracked one for those ones. And then Setswana followed as a close second as well. But when it came to the months, Setswana was the most cracked. So taking the months and putting in 18 or 19, people were actually able to crack Setswana words much more readily. I don't know why, but with Zulu, people were able to
find Zulu words so easily, just using, like, hello, welcome, freedom. It really was different to see that they actually didn't crack as many when it came to the months, but they were able to crack them in Setswana.
So is Setswana also a ubiquitous language like Zulu, like more people would actually speak it in those regions?
Yes, so Setswana is spoken mostly on the northern side of South Africa. Zulu is spoken more in KwaZulu-Natal, so that's more coastal. Setswana is spoken far more up north. So I think also into Botswana, if I'm not mistaken. So the two languages, they may share a few
words, but they don't really share all that much between each other. Thank you. Okay.
I'm just curious, how did you manage to get all the hashes from all the seven companies? Did you face any resistance from them? And what was the context of how you got them?
It is very hard. So basically we also do password checking, password auditing for the organization. So I have to get the passwords somewhere, somehow, to be able to do the password audits. And also, most of the time I got the passwords by just talking to the people, saying, hey, I'm doing this, you know. But then what I did is that after the research, everything gets scrapped. Everything goes away. I don't even
share those passwords. I will share the passwords that I created for the research, the converted ones, but I will never share the ones that I got from the organizations.
One more question. Will you be looking into breached credentials? Because now that you have seven companies, maybe you can look them up in breached credentials. You can pull out all the breached passwords and maybe do an analysis on them.
So, yes, those passwords that I've got actually did come from using, like, Have I Been Pwned databases and a whole bunch of other databases around it. It's just that, yeah, there's always going to be some new breach. And also
what I did is that for a lot of the passwords for the corporations, I used a lot of hashcat rules as well. So the whole process of cracking them took about two to three months of just pure cracking. All right. Thank you. No worries. Thank you very much, guys.
Does this sound okay? Yeah? Yeah. Test, test, test, test, test, test. Good afternoon. Welcome to BSides Las Vegas, Ground1234! track. This talk is entitled Improper Database Authentication by Mitch from Canada. We'd like to thank our Inner Circle sponsors, Critical Stack and Valimail, and our Stellar sponsors, Amazon, BlackBerry, and Cylance. We'd ask you to put your
phones on silent, and there will be some time for some questions after the talk. Just raise your hand, I'll come by with a microphone, and then the streamers can pick you up. Okay? Take it away.
Okay, thank you. So, yes, like you said, my name is Mitch. I'm a software engineer, so not a security analyst, and I work on Cisco's Advanced Malware Protection for Endpoints product. So that's about all you're going to hear about Cisco, though. So we'll jump right into sort of the environment I work in. Just one slide on this. And I'll just describe a little bit about how endpoint protection is set up in the cloud in our case. So we have
an agent running on laptops, servers, phones, sends data back to our command and control servers. It sends out definitions. So I think like behavioral signatures, hashes, some machine learning, proprietary analytics. I don't know. I don't write endpoint software. So there's probably a ton going on there I don't know about. But the part I want to focus on is this data. It gets sent back. We take it into a streaming service, and it gets forked off to either have detections run on it. Those generate alerts. Alerts get viewed by customers in their web portal. The other side is the data comes through, gets written to Cassandra, another database, and that raw data is kind of like
an abridged version of what happened on the endpoint. So think like event tracing for Windows. You can get a ton of information out of there. There's no way we can bring all that back to the cloud for seven million endpoints. So what customers get to view here is a bit of like an abridged version of what happened on the endpoints, but it still is like a deep enough picture for investigation. And then the last data flow here is some generic customer information. It's put in MySQL. So you can think like admin names, whatever you'd configure in your Cisco AMP for Endpoints web portal. So we care about data security, one, because we're a security division. It'd be a little embarrassing if we didn't. And we're
also storing two classes of sensitive data here. One is data about customer systems. So like files, like a PDF could have some valuable business information that hackers want access to. We might have some information on that in here, so we want to protect that. Same thing with their alerts in terms of what's going on in terms of cyber events in their system. And then lastly, kind of like your generic customer data you think about that gets leaked on the regular. Right now you see these leaks in the news, so we want to protect that sort of stuff too. So now an important point to bring up here is the scale of the data in our system and it's going to come into play why I care about certain
things in database authentication. So this data path right here is about 200,000 records per second in our largest AWS environment. And distributed systems technology has evolved to where it's not that hard. I mean, not that I'm saying our jobs aren't hard, but it's possible. It just makes it a huge pain in the butt to, for example, do a rolling restart of this Cassandra cluster. Not that it's impossible. It's just that, for example, if you want to change a password for a database user or drop a user, and that happens regularly enough, we don't want to be restarting the cluster. If we want to upgrade, that's fine, but a routine thing
like cred rotation should not be a cause for restarting the cluster. Okay, so the interesting edge cases for me here are credential rotation, user deletion, and session invalidation. And a lot of these are coming down from an audit we're about to go through. And to pass an audit, we need to rigorously understand what's going to happen with the different data stores we use. So that's what we'll be looking at. We'll be evaluating MongoDB, Cassandra, MySQL, those are the three databases you saw in there, as well as Postgres for good measure. So four popular open source databases. I think Oracle still has the crown. I didn't want to navigate getting an Oracle license for the purpose of this, so it's not in here. Right. And then the no
downtime emphasis there. A lot of the things I'm going to bring up that are like possible issues you might care about, an easy way out is just downtime, restart your databases. So keep that in mind. None of this is like the house is on fire type of stuff, even if I think later on I will show you, I guess it's a one day now vulnerability in MongoDB. Again, it's nothing to seriously panic about. Okay, so this is a bit of an obligatory slide. Favorite meme there. But it's authentication versus authorization, just basically to convince you I know the difference. I think in the abstract, I sort of aliased them behind the auth short form. Primarily going to be focusing on authentication, though it's hard to talk about one
without the other, and we'll see some of these databases sort of lean on their authorization features for authentication functionality, in a sense. Alright, so we'll go into some behavior examples here for these four databases. And the first thing we'll look at is credential rotation. So, I don't know, just like change the password. Those are two passwords that came off like a top 20 list I found online. I don't know, it's like, does anyone here use those passwords? Yes? I don't know who puts I love you for their password. That one just struck me as a little weird. All right, so this will be the format for the demo slides. So here we have two terminals, one on top, one on bottom. They're both user terminals for MongoDB
in this case. So I got the little logo here. So the top one, we log in to a Mongo shell. And Mitch is a read/write user on the database. That's the pattern I'm using in most of these. And you can see we successfully insert. And then behind the scenes in an admin shell, the admin changes the password for the account Mitch. And you can see we can still use this session to insert records. You can read them too, but I just thought writing-- it produced shorter output on the console, if you put it on one slide. And it sufficiently demos it. And then the next one here is kind of like, I'd call it table
stakes for being a database company. If you change the password, then you can't establish new sessions with the old password. Anyway, no surprise there. So you can't use the old password, password one, after it's been changed, but you can still use your old session. Alright, so we got this table here, MongoDB gets a green no for use old creds in cred rotation and a yellow yes in use old session, not necessarily bad, you just know about it. Alright, so going to Cassandra, similar behavior, again you can insert, admin changes the password, still uses the session to insert, can't use the old password. Pretty straightforward. So, no surprise there, Postgres. Again, insert, change the password, and still insert with that same session.
And can't use the old password. No surprises again. There's a bit of a pattern here. So MySQL, exact same thing. You can insert, keep using your session on credential rotation. It'd be kind of handy behavior to have, especially you want the ability to not kill all sessions. You can think like we have all these apps writing to Cassandra, for instance, and to kill all those sessions, have them reestablished, that'd probably cause a bit of a blip in our backlog. It might take a little bit to catch up, provision more servers, something like that. So I actually kind of like this behavior. Again, you just want to know about it. All right, so move on to user deletion now. So sorry, Spidey. Going to get
rid of you there. Does anyone like Avengers? I haven't even seen this movie. Did I use the meme right? I don't know. All right, so first we'll look at user deletion with MongoDB. So admin drops the user. And we could insert previously. We can't insert anymore. So here's sort of like our first, what I'd call, interesting behavior. You still have a session to the database, even though your user's gone. And you can still sort of do stuff. Non-privileged operations, so you can't insert, but you can, for example, send a command to the database. We'll see in the case of MySQL, Postgres, you can send SQL. It'll parse it, check if you can do it. So it's not the same as having no connection at all. You have
slightly more access. And then again, sort of the table stakes scenario, you can't use the old user after it's been dropped. So no surprise there. So in this table here, I'm filling that in as not really. Can't do what you could before. Still can do a non-zero amount of stuff, even if it's not very much at all. Again, not something to go nuts over, just good to know about. Alright, so Cassandra, again, similar situation. You can insert, admin drops the user, can't insert. And then Cassandra you can get a little bit more out. So in the Mongo one I just asked for connection status. I couldn't do something like this which was describe what tables there are or describe what databases there are. So Cassandra lets you get
away with a little more, but still very similar behavior. And then Cassandra is also a unique case in that it's the only one of these four databases where authorization is optional. So you can have a Cassandra database with user accounts and no authorization. So you just log in and you have full access to everything. So if you have authorization off, when you drop the user, you can just keep on going. So you have a session established, your account gets deleted, You're fine. Just keep on going. Anyway, so that gets it a red maybe. If you've decided to be lazy, didn't want to do authorization, you get this bad behavior. Again, something to know about. I'll summarize the things to know about at the end, but anyway, it's red. I
don't like it. So now we go into Postgres. Again, similar pattern where you can insert, do operations until your user gets dropped, then you can't except for unprivileged stuff. So here I'm saying like, show me all the tables that don't start with PG or SQL. There's like 50 tables in there that start with PG and SQL by default. So this is just showing the one data table. All right, so we get a not really for the use old session there. Then MySQL. You can insert before the user's dropped, after it you can't. And you can still do some things like show me what user I am to the database. Alright, so not looking that crazy. Though there was something in here that hinted that things
might be a little weird. This Cassandra case where you turn authorization off and you still get your session, that sort of hints that this database is leaning on authorization capabilities to enforce authentication. So you don't have an account anymore, you shouldn't have an identity with the database, but it's using what accounts can do to enforce whether or not you have an identity. So that sort of begs the question, it's like, okay, so I don't have an identity. What if someone, or sorry, what if you recreate an account with that identity? So that's what we'll look at next, recreating the user. And there are plenty of cases where you'd want to do this. Take the credential rotation. Remember, we can still keep the session when you rotate credentials. Let's
say you did want to kill those sessions. An easy way around that would be dropping the user and then re-adding it with a new password. So it's definitely a case you want to consider. And with Cassandra, you try to insert, admin drops the user, then you can't insert anymore, and then admin recreates the user, and you get a successful insert with that same session. So basically it's just checking like is there a user in there named Mitch? Yes there is. All right Mitch has these privileges now because they were re-added and you're good to go with the old session because it's associated with Mitch. So not the best behavior in my opinion. Not something I like. I'm gonna have to find some sort of mitigation or workaround for
my environment at least. So now we go into Postgres and Postgres, this is what I wanted to see. So you have a successful insert, admin drops the user and we don't even do anything on this session here, no touching it. Admin recreates it with a new password and we try something and it doesn't work. We're back to the just regular drop user scenario. So this is what we want out of our database, I'll assert. So we get a nice yellow, not really, 'cause it's the same as the drop user. MySQL, we're back to the behavior we don't like. Same as Cassandra where we insert, admin drops the user, can't do it anymore. Admin recreates the user with
a new password and then we're back in business. Now MongoDB, this is the CVE that we talked about. I thought it would be nice to give a live demo of this one. It was sort of like why this is considered a bug where, spoiler, the other ones I showed, so like MySQL, Cassandra, were not considered security bugs. And they're sort of more like things you should know about. So this right here is an admin shell for MongoDB. So we'll say like... Oh, yes. Thank you. Did not account for that. Actually, why don't we do this? Can mirror the display. OK. So we've got an admin shell here. Is that good? Is it large enough font in the back? You're good? You can read it? Awesome. OK. So in
this admin shell, we've created a user Mitch, password password 1. And it's a read write role on this database. All right. So now we'll go over to another shell. It's in this container. And we'll go into the database with the user Mitch, password 1. All right. So do things like show DBs. Show collections. Alright, so we'll try inserting something. That works. Alright, so we're good. And we'll go over here, drop the user Mitch. So now we'd like to not be able to use the session. Insert. Alright, that's great. Requires authentication. Not my favorite that they mixed authentication and unauthorized in the same error code, but it's the behavior we want. So now we'll add this user back. We'll use a different password. Password two. Again, bad
examples, but again, choose good passwords, please. Okay, so here we're gonna try inserting again, and we shouldn't be able to. That's what we want. Okay, so that's good. Now, you'll notice here that we put activity on the session in between the user being deleted and the user being added back. So it was admin deletes user, we try and insert something, doesn't work, admin adds the user back, and then it still doesn't work. So we'll try not doing anything on the session. So we'll log in, password 2 now, we can insert, drop the user. Again, go back here, this is the user shell, we're not going to touch it, no activity on the session, and then we'll add this back.
So we'll go password three, and then we try and insert again, and it works. MongoDB clearly put some thought into the scenario where if you put some activity on the session, it sort of wrecks the session. Even if that user gets added back, you can't continue where you left off. But if there's no activity on the session in the meantime, then you can use it. That's why when I reported this to them, they considered it a bug. Whereas Cassandra and MySQL don't consider it a bug. Okay, so MongoDB 4.0.8 gets a yes there. You notice I put the versions here. I think it's the latest version for all these databases except MongoDB. MongoDB's had a
fix out for a bit now for that one and I'll get into that fix in a second here. Okay, so I wondered... Sorry. Okay, so I was wondering why is this behavior this way between these different databases? We have Cassandra and MySQL on one side, as well as MongoDB, and then the new MongoDB with its fix and Postgres on the other side. So if we look at the user table in Postgres, we can see that there's this column called usesysid in pg_user. So here we have root and we have Mitch. And this is an admin shell, so we can go ahead and drop Mitch and then create Mitch again, and then look at the user table, and you can see this usesysid value for
the Mitch user change. So if this is too low, up here it says 16385 for Mitch, down here it says 16386. So you have like a unique user identifier for each user account when it's created, even if the name is the same. All right, so that solves our Spidey problem. All right, so in MySQL, there's no such column in the user table. And this has been around for a while on their website. You're not meant to be able to read this warning, but basically it just says, like, hey, the drop user command does not take effect immediately. The way they word it here, it's a bit, I think, pessimistic. The situation is better than they describe on this page. But basically it's saying, like, if you drop the user,
don't expect sessions to be, like, gone right away. Cassandra, they're taking that route as well. Not modifying the database, but instead documenting it. They did not have something on their drop role documentation, which is the drop user equivalent. I told them about this, they added it, so that's nice. There's a warning out there for everyone. And then, like I said, MongoDB, to their credit, this is great that they recognized it as a security bug. It shows that they really took it seriously, even though it's, in my opinion, not that crazy of a bug. And the fix they added in their 4.0.9 release, as well as their 3.6 and 3.4 releases, which are also supported, is
a unique user identifier. And in this case, they used a universally unique identifier to basically distinguish between two instances of users with the same name across time. Okay, so there's just a few possible actions here. Remember, if your database has authorization capabilities, use them. It might be leaning on them for some other security things like authentication, which is slightly different but related. The workaround for there being no UUID in your database user table is just create your own user UUIDs, kind of, so never make a user with the same name twice. I think that's the workaround we're going with. So we're going to do something like account name underscore the date it was created. If you do that, you'll never have this problem
of deleting and recreating a user and bad sessions maintaining activity. And then if you want that user ID functionality built in and you use MongoDB, just upgrade. You'll get it for free. I think that's about it. So open up to any questions. If you want to get in contact with me, please do. Wassonlabs.com is just a personal website. It's got email, LinkedIn, whatever. And then if you like this kind of stuff, Cisco is hiring pretty much on everything, but AMP for Endpoints specifically. We work out of Calgary, Alberta. If Canada isn't your thing, there's lots of security products as well in the United States and globally. I thought I saw a hand up over there.
Is there a way in Cassandra or MySQL or
any of them to clear out sessions periodically? Can you disestablish sessions that have been established so that you don't run into this sort of thing over a long term? Good question. I haven't found a convenient way. I've seen answers on the internet that I haven't tried out which are kind of like custom scripting. You might be able to do something in your firewall, for instance, to send TCP packets that just stop sessions and force them to be reestablished. Something like that might be possible. Yeah, I don't have a super direct answer for you. Yep? I assume you'd have to try this before. We'll just wait for the mic, sorry. Yeah. Yeah.
We've tried this before, but have you tried to kind of change the role of the user while they have a live session?
Yeah, yeah, yeah. So that's a great question. So, for example, with the MongoDB one, you delete the user, add it back, and the old sessions inherit the new permissions. So there's no keeping track of what permissions were what. It's kind of like: you have a session that's established with a user named Mitch, and if Mitch has these authorizations, then you can do it. So the CVE, if you take a look at it, it's got a pretty high score, I think like 7.4. My initial assessment was more around like a 4, and
it's that exact scenario. Cisco Talos wanted it upgraded for that case: you take a user, delete it, and add it back with higher permissions. That is what they were worried about. Other questions? There's one at the back there.
Did you look at the cases where, while there was an active session, you changed the authorization that the user had? How does it handle those cases?
Yeah, you get the new authorizations. If I remember correctly, you do get the new authorizations. More questions? I thought I saw a hand up over here. No? Okay. Anyway, I'll be sticking around. I know it's lunchtime, but please come up, talk to me. I'll be as candid as I can.
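The session behaviour walked through in this talk can be illustrated with a toy model. This is a sketch under stated assumptions, not how any of the four databases is implemented: `ToyDatabase`, `Session` and `session_survives_recreate` are hypothetical names, the password is checked only at login, and the `bind_sessions_to_uid` flag stands in for the unique user identifier (Postgres's `usesysid`, or the UUID MongoDB added in its fix). MongoDB's extra rule about activity on the session between the drop and the recreate is not modelled.

```python
import itertools
from dataclasses import dataclass


@dataclass
class User:
    name: str
    password: str
    uid: int  # fresh value on every create, even when the name is reused


class ToyDatabase:
    def __init__(self, bind_sessions_to_uid):
        self.bind_sessions_to_uid = bind_sessions_to_uid
        self.users = {}
        self._next_uid = itertools.count(1)

    def create_user(self, name, password):
        self.users[name] = User(name, password, next(self._next_uid))

    def drop_user(self, name):
        del self.users[name]

    def change_password(self, name, new_password):
        self.users[name].password = new_password

    def login(self, name, password):
        # The password is only ever checked here, when the session is opened.
        user = self.users.get(name)
        if user is None or user.password != password:
            raise PermissionError("authentication failed")
        return Session(self, name, user.uid)


@dataclass
class Session:
    db: ToyDatabase
    name: str
    uid: int

    def insert(self, record):
        user = self.db.users.get(self.name)
        if user is None:
            raise PermissionError("user no longer exists")
        if self.db.bind_sessions_to_uid and user.uid != self.uid:
            raise PermissionError("session belongs to a dropped account")
        return True


def session_survives_recreate(bind_sessions_to_uid):
    db = ToyDatabase(bind_sessions_to_uid)
    db.create_user("mitch", "password1")
    session = db.login("mitch", "password1")
    db.change_password("mitch", "password2")
    session.insert({"n": 1})  # rotation alone does not end the session
    db.drop_user("mitch")
    db.create_user("mitch", "password3")
    try:
        session.insert({"n": 2})
        return True
    except PermissionError:
        return False
```

With `bind_sessions_to_uid=False` the old session comes back to life once a user with the same name is recreated, which is the name-only lookup described for Cassandra and MySQL. With `True`, the stale session is rejected because its stored identifier no longer matches. The naming workaround from the talk (account name plus creation date) gets the same effect by making sure the name itself never repeats.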
Thank you.
Hello. Hello, testing? We all set? I'd like to welcome everyone. Good afternoon. Welcome to BSides Las Vegas, to the Ground1234! track. This is a talk... oh, shoot. I did this thing. From Divya and Lakshmi. Sorry about that. And I already lost the name, so I do apologize. No, not you. It's me. Sorry about that. Who Dis? The Right Way to Authenticate. Can you tell this is the first time I've ever done this? Real smooth, right? A few announcements before we begin. We'd like to thank our sponsors, the Inner Circle sponsors, Critical Stack and Valimail, and our Stellar sponsors, Amazon, BlackBerry, and Microsoft. It's their support, along with our other sponsors, donors, and volunteers, that makes this event possible. Cell phones: these talks are being streamed live,
and as a courtesy to our speakers and audience we ask that you make sure your cell phones are set to silent. Use the mic later. If you have a question, use the audience microphone later and they'll answer any questions for you, okay? Right now, please pay attention. Yeah.
Hey everyone, I know it's right after lunch, and welcome to our talk, Who Dis? The Right Way to Authenticate. To start off, who are we? I'm Lakshmi Sudhir and she is Divya. So we do all things application security. We work to be involved in the secure software development life cycle at all stages. We work with a lot of developers, we work for them so that we can go ahead and ship secure products. We had one
more colleague we want to give a shout out to, it's Narayan Gauraj. He worked with us on this project, but unfortunately he couldn't be here today. I assure you he's alive. We have not done anything to him or this project doesn't kill anybody. So before we head on to the presentation itself, just a quick note, all of this research is based on publicly available data. There's nothing that has been taken from a company or anything like that. That's something that's accessible to everyone. Also, one more call out is we are not advocating that you use anything that's inherently insecure. But as most of you know in the real world there are cases where even if
you have a really strong protocol as such, there might be a part of it that's an inherently insecure component that you would have to go ahead and implement. In those cases, we just found a couple of secure ways of doing it, and that is what we are talking about today. We are not advocating using anything inherently insecure. Now that we're done with all our disclaimers, let's start with what is this talk about? What's the agenda for today? We're going to talk about the background itself as to why are we here talking about this. And talking more about the problem statement, like it's 2019 and we still have to talk about authentication. So we're going to get into more details about why is
this a problem statement itself and more details about the issues themselves. Then we're going to try to touch upon some of the remediations and try to understand the patterns across authentication itself and various protocols. And then we can have some time for questions and answers. So what is authentication? When we think of authentication, the first thing that comes to mind is identity. Now how do these two piece together? Identity is who you are. Authentication is what verifies your identity. If I say I'm Lakshmi, authentication is how I verify I'm Lakshmi. There are a bunch of protocols that have been built, a bunch of technologies, a bunch of platforms that we use to establish authentication itself. They inherently rely on three basic principles.
One is knowledge: for me to establish my identity, it could be something that I know. That could be a secret, a token, or, when I was trying to get cash to gamble out here in Vegas, I had to use my PIN. So that is something I knew. And then possession. Possession is more around YubiKeys. You could have a mobile device. You could have something else that is a physical entity, or something that only you possess, such that you could establish your identity. The third part is something that you inherently are. I open my phone with my Face ID, I use my Touch ID; that's something that I am, that's
something that's unique and that's how I establish my identity itself. So authentication in short, be it for a user, a service, or anything to do with it, is based on three broad things: knowledge, possession and inherence. So, where does authentication come into play? It could be as a user when you're trying to access an application, or an application talking to the backend. For user authentication, some of the common things we spoke about were Face ID, YubiKey, passwords, secrets, and all of that. But when we had to actually put together this presentation and look for all the data, we logged into an application; that was the user authentication aspect of it. And then this application spoke to multiple
other applications out there. Those applications, to pull out some of these authentication issues, for example, had to access their databases, pull out all the reports they have, consolidate them and return them to the service. This service consolidated it and fed it into a BI tool to make it more presentable, to turn it into all these fancy graphs so that we can understand and ingest the data. And with each of these communications, all the services had to identify themselves to one another to make sure that it is the right service they're talking to. So there's so much out there in the ecosystem, making authentication a very important aspect. It's more of the backbone when you
begin communication in this whole complex ecosystem. With AWS and other cloud instances and cloud services out there, almost everything is reduced to a web service per se. So authentication has become more and more important as we have progressed in this ecosystem. Why are we here? Is there a problem? Of course, yes, no surprises there. And that's why we are here today to talk about this, to discuss and see how we can make this better. If you're still not convinced and you're a skeptic like me, we have some data for you. So as most of you know, in the OWASP Top 10, even in the latest revision, the second one from the top, right after XSS, cross-site scripting. So right after cross-site
scripting was broken authentication. With SAML, OAuth, Kerberos, we have a million authentication systems out there. It's pretty surprising that this is still near the top. So we dug a little deeper into this, and this was the point of inception: this is a HackerOne report on bounties that were paid out by companies. Companies have spent so much money on it, and this came in a close second. Broken authentication or improper authentication was something that companies had spent a lot of their bounty money on. That was again the second highest after cross-site scripting. This was a point where we were like, this shouldn't have been a problem really after so long. Why
is it still a problem? I mean, we didn't want to remember a lot of passwords. It was hard to have complex passwords and remember all these passwords across various applications. That's why we came to single sign-on, we had OAuth, we created multiple systems and multiple protocols, and it still stands there near the top. So this was where we felt that this requires some research. We really want to work on this because this is an actual problem. We want to address it one way or another, and this was the inception point for our talk itself. There are further details we pulled up, like some of the reports which
were high risk and that's something that further motivated us. So Divya is going to talk a little bit about that. Hopefully you can hear me now. No. All right. So once we had the inception point of that HackerOne report, where the second most problematic area was broken authentication and obviously the reigning king is cross-site scripting, we kind of went into the past year: what are all the OAuth issues, or at least the general authentication issues, that have been reported? And one of those, like this one, even seeing the report kind of pulled me in, because this is a social login scenario which is supposed to be secure, but still inherently during implementation something
went wrong. Similarly, you have stealing OAuth tokens via redirect URI. And again, a similar one, insufficient OAuth callback validation, again pointing back to redirect URI. And finally, authorization token not expiring after logout. Now, this is not inherently like an OAuth issue or even like a specific authentication scheme or a protocol issue. It's just general that ties back to the token. So what you are seeing in the screen right now is just cherry-picked like issues that we saw and we dove deeper to do our own case studies. But we saw multiple issues not just limited to this. And that led to like, Is it a common issue across protocols? Is it a common issue across schemes?
Is there like one protocol that's going to be like, I'm going to solve all your problems regardless of implementation. It's good to dream about it, but it's not the reality. So yeah, this is what we researched on. So as Divya said, after this, we thought we should come up with an approach to address this and understand if there's some common patterns and themes. So we started off going over the top 100 websites and we're like, hey, what are the most commonly used authentication protocols? And how do we classify them? Amongst these protocols, we identified a couple of protocols and then we were like, hey, what are the high risk issues? What's the worst thing that
could go wrong? That way we could focus on these high-ROI issues. And then we dug into what could be the remediation. What is the best-case scenario for remediation? Because best practices are sometimes, actually most times, not really applicable in a business scenario. So how do we go about maybe having defense in depth, or just outline some of the remediations that have worked for us personally when we were talking to developers? Maybe that's something that we could share. And also, I think the primary idea is that as security engineers we owe a responsibility to guide our developers towards the
right way and the most secure way. That's what our job is. So we felt that we need to make this usable and understandable for the developers themselves. So this is the approach. And for today's talk, we are going to focus on four of these. We're going to talk initially about JSON Web Tokens, the JWTs, or "jots". And then we're going to focus a little bit on OAuth, passwordless and magic links, and then we're going to talk a little bit about SAML as well. And in our conclusion, we want to draw out what the common patterns are and why authentication is such a big problem even today. So let's
start with a JSON Web Token, or JWT. Now, what is a token in general? A token is just an identifier which is used to authenticate. That is what is validated by an application. You present your token, and it's like, hey, this is a valid token. I mean, this is a valid user. This is Lakshmi and it's her token. So let me provide the service. A JSON Web Token is nothing but a JSON-encoded representation of a couple of claims, some key-value pairs, which is digitally signed. This provides authentication because when you submit this token, the application that's receiving it can actually go validate the signature, and you have this JSON
body which has something called claims. You could have scopes, you could have the user ID. Instead of a single session identifier, which is a unique number or a random token, you use this, which is a JSON body, and it is digitally signed. Both asymmetric and symmetric algorithms are used to perform this digital signing. So let's just look at a scenario. Now, as a user, I sign into some application using Facebook or any of the social authentication systems. Then the authentication server validates my username and password, or whatever the primary authentication is. And then it provides a JSON Web Token back to the user. Now, this JSON Web Token, we're going to dive deeper into the structure of it and more details
on that. Now, when this user presents this token to an application server, the server checks whether it has been tampered with by validating the signature itself. It then serves the request based on whether this user should have access to the application server or not, and processes the HTTP request that was sent there. It could happen between service to service as well. So there are two uses: this could be used as a session object, as well as for service-to-service authentication. Now, why would you want to use JWTs instead of a session identifier? That's because, one, it provides authentication through integrity. Again, I want to call out that this is
just base64 encoded, so there's no confidentiality, but there is integrity because there is a signature over the body and the header. So you get authentication. You can also add some information around authorization, saying, hey, this user has this scope, because it's a key-value pair at the end of the day with JSON. So that way, it's easier to share that information, which makes it stateless. Now, with a session identifier, you would have to store the unique identifier on your backend and also store the user ID or any associated primary key out there. With this, it is stateless. All the application server has to do is, as soon as it gets a service request, it looks
into the signature, it validates the signature, it's great, there is the scope information. So it makes it amazingly stateless. That makes it scalable as well compared to session tokens. It's compact. You could say it's compact and lightweight because mostly JSON key-value pairs are pretty compact. Of course, it depends on how much you put into the payload, but it's pretty compact for the most part. And again, coming back to the point of the key-value pairs in the JSON body, you could have information exchange. So this could be useful, let's say, in cases like OAuth where you have a token issuer or an authorization server that's giving tokens to your multiple applications. That
way it's more scalable with different applications when you're using JWTs. You could even exchange information in this payload part of it, saying, hey, this person has scope to this application and this application, or even send out some user information such as a user ID or a GUID, or more on those lines. So, as I said, with all of these properties, where are JWTs commonly used? It's used in OAuth, it's used in OpenID Connect. In OAuth, it could be used as an access token or refresh token. And in OpenID Connect, it's used as the ID token. And some applications also use it just as a session object itself. The structure of a JWT. So,
there are three parts to a JWT. One is the header. The one you see on the left here, the header, payload, and signature, is what is being passed. So it is a base64url-encoded string that gets submitted to your application. Now, when you decode this, this is what you see. For the header component, you see the algorithm that's been used to actually create the signature. Again, you could use symmetric as well as asymmetric algorithms to actually create the digital signature. With symmetric, I think the challenge is more around sharing secrets across the other parties that have to validate this. So asymmetric is something that's pretty common because you
could, of course, rely upon the public private key architecture, I mean infrastructure. So that way, even if you're a token issuer, all you have to have as an application or a client application is just have the public key so you could validate this. So the header contains the algorithm and the type is JWT because we're talking about JWT tokens. There are other kinds of tokens, which is out of scope for this talk. And the payload is where the meat of this is. This is where you can have like, hey, this is the user. Is he an admin? Is he not an admin? You could also have more information around like what time was this issued
at? Who is the token issuer? When does this expire? And all the information out there. So this is the main thing that has made it more attractive for the developer community to use JWTs instead of a session identifier. So this is where you have all the details, and the signature is where the header and the payload are hashed. I mean, they're digitally signed. And this is the part that a server validates when a JWT is submitted during a service request. Now that we know that this is the structure, let's move on and see if this is a valid JWT. Show of hands if any of you think this is a valid one, valid
JWT? OK. Awesome. You're all right. You're biasing them. So awesome. You guys are right, actually. So even though here we saw that there is a signature component and there is a value for the algorithm itself, we see that the algorithm value here is none, and there is no signature. Unfortunately, or not unfortunately, there is a business case for the none algorithm. But when you're using this as a session object or to establish session or identity as such, this is something that most implementations have missed: verifying that the algorithm is not none. Because when this JWT is submitted to a service, it looks into the algorithm to see, hey, was this using a symmetric algorithm or an asymmetric one? What was the algorithm used? When it looks at none, the inherent behavior
is to not look at the signature and then just process the JWT. As an attacker, I can create any number of tokens out there and just provide them to a service with the none algorithm. The whole point of authentication is lost here. There's authentication bypass, and I have complete control over anybody's account. So, this is one of the most common issues, and it has been there since 2015. I think 2014-ish was the inception of JWTs and this was discovered in 2015, and still there are libraries which support this. So, one, how do we make sure that we take notice of this, and two, how do we go about
remediating this? Again, like I said, there's a best-case scenario, and there's an OK one, I mean, something we could make do with. Programmatic controls, of course, are the best way to go with any validation as such. So one of the things you could do, let's say you're using HMAC-SHA256 here, is make sure it's hard-coded that for any token that comes in, you use that same algorithm to try to validate it. But that may not be possible in a very real-world scenario, because there'll be multiple algorithms that the authorization server, the application server, or the token issuer may support. So in that case, one of the, the
next best approach could be to have a whitelist of all the algorithms that the server supports, and make sure that you are validating against only those and you do not accept none or something that is inherently cryptographically weak. So these are the two approaches that could be taken. And the other thing, of course, is to choose the right library. Since we have libraries, we have these protocols, we have these technologies that do most of the heavy lifting, I feel like we should leverage some of it. JWT.io is a good website to go to. They have libraries listed for each tech stack, so based on your
tech stack and based on the library you're planning to adopt, you could go through that to see what it supports and what it doesn't handle. That way, even when you're implementing it, you're aware that, hey, this doesn't check the algorithm, so maybe I should implement this check myself or leverage something that already exists here. So I think that holds true for any application, I mean any technology per se: use the right library and identify what it handles. So with respect to JWTs, it's good to look for the none algorithm and go about verifying it. Now we've checked the none algorithm, we are doing great with the validation, we are making sure that we check for
the right algorithms. What else could go wrong? So let's say here we are talking about Cool Company. Okay, this is a cool company, by the way, and the attacker is Joe. Joe has access only to the mail server. We have a token issuer out there, so Joe submits his credentials and he's like, "Hey, this is great. I am Joe," which is right. He's not an attacker yet. And he gets this token from the token issuer and provides it to the mail service itself. The mail service validates: "We have checked for the none algorithm. We have checked for the right algorithms. It's all cryptographically strong," making all those assumptions here. It serves Joe
with whatever request he makes out there. Now, Joe wants to make more money. He wants to attack the finance service. All he does is submit this token. It is a valid token because even the finance service utilizes the same common authorization server. So he could just replay this token itself and provide it to finance. Now, if finance, the client application in this case, doesn't have the check that validates whether this user actually has access to this service, then again, it fails as an authentication mechanism. The user can bypass the authentication. It is a valid token, but the user does not have authorization. So it's not enough if you have controls
just to validate it; validating with the context of what we are using this for is what's important. For mitigations, the JWT RFC provides you with a predefined claim called aud, which is short for audience. You could use that to say explicitly which service this token is for. We need to make use of the JSON payload; I mean, that's why we moved to JWTs. So you could have this audience parameter say that this user has access only to the database. You could also use the issuer: anyone who accesses the finance service might have a different
issuer, so make sure that you are looking for the Cool Company issuer and this database itself. You could make use of both of these components, but audience, I think, carries more weight in this case from a mitigation standpoint. Obviously, you could have some extra checks around scope as well, because just checking for audience doesn't tell you whether this person has write access or just read access, and you may need more of that. And you could again move this into the JWT itself to handle all of this, because you know that if it gets tampered with, you have a way to identify that. So this could be one of the mitigations for
this. And now that we've handled all of these validations, scopes and all of that, how do you handle a logout scenario? So tokens are short-lived. That's great. But we don't have a cache of all of these tokens like we have with session identifiers. So if you're using this to establish a session, and a token has, for the sake of this conversation, a four-hour window, then it is still valid even if as a user I log in and log out. How do you handle that? Now, the ideal way of handling this right now is that you could use the JTI
claim, which is nothing but, in simple words, a unique identifier on every JSON Web Token. You maintain a backend copy of this saying, hey, this is the identifier. And every time a user logs out, you can go put it into a revocation list. All your client applications would have to use it too, right? Because the token is still valid, because it's not expired. So in this case, you would have to maintain state, and you do lose the stateless property because you have to maintain state to handle these revocations. So you would have to have a list of invalid identifiers and make sure even your client applications have this. It's
not enough if just the authorization server has it, especially in the case of an asymmetric algorithm scenario where you have the public key. So that is something that definitely has to be taken care of if you want to handle sessions. This is one of the most common issues we've seen in a lot of places. So this is something I feel is a key call out anytime anyone comes to you and says, "Hey, we want to use JWTs to manage sessions." Also, of course, having a short TTL always helps, but again, it's a trade-off between usability and security. So if we have to make those compromises, then it's better to have this in place so that way we
have things controlled for pretty much most of it. Also, it is quite challenging to revoke for a particular device or a particular user as such, because again, that would add more to be controlled on your backend. So that's to do with revocation. We've handled revocation, but it would come to key management at some point. There are two scenarios. One could be a breach scenario where a key on your authorization server or the token issuer got leaked. How do you go about rotating it? If you rotate all the keys, it's a global logout and all the JWTs would become invalid. But one way of approaching it could be again
using one of the things that the JWT specs themselves provide, which is a key ID, the kid header parameter, which could be used to mark which key signed a token. It's usually a bunch of keys that every authorization server uses. So you could identify which key was the one that was compromised and rotate just that particular key in case of a server compromise. And I think it's an accepted thing that if you force a change in all signing keys, then every JWT would be invalid. So these are some of the hazards that come along with using JWTs. While there are a great set of advantages, these are some things to keep in
mind and to be handled when you are implementing or using JWTs. So what are the key takeaways here? One is to definitely validate the algorithm value out there in the header. Make sure to look for what you're using and make sure not to allow any cryptographically insecure algorithms. And the other part is to validate all claims. Make use of what is there in the RFC instead of creating your own, because that needs to be vetted again. So this is something that could do the heavy lifting for revocation, key management, identifying the audience, and making sure that you don't have an authorization bypass. Also, I can't stress this enough, please handle revocation carefully. We are
also trying to identify everywhere there's a JWT. We're like, let's talk about revocation and then go back to all the nitty-gritty details, because this is something where most applications have not been doing a great job. So I think this is something that definitely requires a huge call out and something to look for maybe next time you're looking at an application. So next we're going to talk about OAuth. Divya, do you want to take the stage? Hello. Everyone doing good so far? This is the halfway point? Awesome. Okay. So next, we are moving on to open authentication or open authorization. I'm just going to call it OAuth. So, okay. A brief primer on what exactly is OAuth.
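Before the primer, the JWT takeaways from the first half of the talk (check the algorithm against an allowlist so that none is rejected, validate the aud, iss and exp claims, and consult a revocation list keyed on jti) can be put into code. What follows is only a minimal sketch using the Python standard library and HS256. The names ALLOWED_ALGS, REVOKED_JTIS, sign and verify are hypothetical and do not come from the talk or from any library, and the in-memory revocation set stands in for whatever shared store a real deployment would need. As the speakers advise, a vetted JWT library is the better choice in practice.

```python
import base64
import hashlib
import hmac
import json
import time

ALLOWED_ALGS = {"HS256"}   # allowlist of accepted algorithms; "none" is never in it
REVOKED_JTIS = set()       # hypothetical revocation list, filled when a user logs out


def b64url_encode(raw: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(header: dict, payload: dict, key: bytes) -> str:
    """Build an HS256-signed token (here only to produce input for verify)."""
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    mac = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(mac)


def verify(token: str, key: bytes, audience: str, issuer: str) -> dict:
    """Return the claims if every check passes, otherwise raise ValueError."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        payload = json.loads(b64url_decode(payload_b64))
        signature = b64url_decode(sig_b64)
        if not (isinstance(header, dict) and isinstance(payload, dict)):
            raise TypeError("header and payload must be JSON objects")
    except Exception as exc:
        raise ValueError("malformed token") from exc

    # 1. Algorithm allowlist: the token does not get to choose "none".
    if header.get("alg") not in ALLOWED_ALGS:
        raise ValueError("algorithm not allowed")

    # 2. Signature over header and payload, compared in constant time.
    signing_input = (header_b64 + "." + payload_b64).encode("ascii")
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")

    # 3. Context: is this token meant for this service, from this issuer?
    aud = payload.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        raise ValueError("wrong audience")
    if payload.get("iss") != issuer:
        raise ValueError("wrong issuer")

    # 4. Expiry and revocation (the revocation check is what costs statelessness).
    exp = payload.get("exp")
    if not isinstance(exp, (int, float)) or exp <= time.time():
        raise ValueError("expired or missing exp")
    jti = payload.get("jti")
    if not isinstance(jti, str) or jti in REVOKED_JTIS:
        raise ValueError("revoked or missing jti")
    return payload
```

In terms of the Joe example, a token issued with aud set to the mail service passes verify when the mail service checks it and raises ValueError when the finance service checks it with its own audience value, and adding the token's jti to REVOKED_JTIS at logout makes it fail everywhere that consults the list.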
It has three entities. It has the client, which could be a device, it could be a user, it could be something acting on behalf of the user, which is sending a request to access any resource from a resource owner. The third entity, which is the authorization server, verifies the client and forwards the verification information to the resource server. So the way the handshake works is the client says, "Hey man, I want to access something from the resource server." And then the client, meanwhile, in parallel can initiate, "Hey man, can you please provide me something to authenticate myself with the authorization server so that I can forward the same to the resource owner so that they can
give me back the resource?" Same way, in the background, the resource owner has a list of, "Is this client okay to access this particular resource? Is this set of clients okay to access this particular resource?" and the way that is handled is between the authorization server and the resource server. It changes according to implementation, but usually they make sure that they have a whitelist of clients, either at runtime or statically, that can access the resource. That's the OAuth primer. Moving on. Oh yeah, sorry. So there are three kinds of tokens here. There is the access token, there is the refresh token, and then there is the authorization code, which I'm still going to group under tokens. So the authorization code is nothing but, it's like a binding,
token or a code that is given to the client during the registration time. So this is like a one-time setup that happens between the client and the authorization server. And then in return, they are given a client ID and a client secret and an authorization code. Now, depending on the implementation, a subset of these secrets might be held on with the client and some or all of them will be held on with the authorization server, depending on what level of confidential information both of them want to hold on to. But the other two which are frequently seen in traffic are the access tokens, which are the short-lived tokens which are being sent with every resource
request or with every other special kind of request that goes to the authorization server. The other one which you have is the refresh token, which is the long-lived token. If you run out of access tokens, or if you run out of, say, your initial login expiry or whatever session you have maintained between the client and the authorization server, that's when the client pulls up the refresh token and is like, "Hey, I got this refresh token when I registered with you. I want to get a new set of access tokens that I can supply." So these are the three kinds of tokens, to reiterate: access token, refresh token, and auth code. Now, irrespective of whatever token
that you got, as long as you have the token with you, as long as the client has this token, you are in. You get to access the resource, you get to communicate with the authorization server, and if there are multiple services on the back end that are going to use the same token to communicate, you get access to all of them. However, how do you know that that token belongs to that particular client? It works in a trusted end-to-end system where you can guarantee best practices were followed in every entity that was involved in that particular workflow. But I'm pretty sure we all know how much of a chance that is, of course, in Utopia. But in reality, it may be lost, it may be stolen, it may be
intercepted, proxied, what is it, relayed or altered. I was pretty stoked to see there were so many verbs used to what can happen with the token. And all of these have been seen in publicly disclosed reports. So, yeah, what do you do to prevent this? And I'm using the word prevent here very loosely because this is a mitigation effort. It doesn't remove the risk completely. So what can you do? You serve the tokens over TLS and you have short TTLs. And again, what is short? Five minutes will be short. A minute will be short. Some people might think six months is short. Some people have thought one year is short. So what exactly does it
mean by short TTLs? So you understand that having transport security, like using TLS, or having shorter TTLs is definitely not enough. So what do you do? Token binding. So this has been getting a lot of traction. So what this does in very simple terms is, you already have the token. You already have the infrastructure that is set up for the client and the authorization server and the resource owner to communicate with tokens. This is not disturbing that. This is just adding on something that is inherently generated and known by the client and also mutually shared with the authorization server. So what happens is the client generates a key pair, and during one of the negotiation calls the authorization server is also made aware of it. So with every
subsequent call that goes out there is something, there is a certificate or there is a key that is sent along with the token that binds the client to the secret that they are sending over to the authorization server. So the client server negotiation that happens in token binding is both of them agree on a couple of metadata that is going to be shared between them. One of them is the token binding ID. This is basically another UUID that is generated by the client. Well, again, depending on the implementation, it could be generated by the client, it could be generated by the server. But the end game is both of them know about it. And the
second one is the signature scheme. Like, what signature scheme are both of them agreeing on? What is the payload going to be signed with? And there is other metadata that is available for extensions and for binding type, but for the purposes of this presentation, we are focusing on the token binding ID and the signature scheme. So what is token binding in action? So, the initial TLS call happens, client hello, server hello, hey man, hey man. And then finally, both of them have identified each other, and then there is another specific call just for initiating and securing the access token and refresh token traffic that's going to go on, which is called the Sec-Token-Binding. So,
Sec-Token-Binding is the header that is passed along with every POST call that is made. And this payload essentially contains the signed EKM, which is the exported keying material. And this exported keying material is nothing but the key that the client has already made the authorization server aware of. And then something like a signing message, like this is the client who has been authorized to communicate with this authorization server. The assumption here is that it's not easily predictable, because it's still going to be part of the payload. It's something unique between the client and the server, or a family of clients and a server, but that becomes part of the payload. Right
now, there is API support from web servers because the web servers or the browsers are going to be handling the client calls for the client, but more on browsers later. So for now, The servers that support this are NGINX, Apache, and from a language point of view, there is Java 8 and 10 support that is available. All the developers need to do is just there are ready-made libraries available for OAuth itself, and it comes with token binding support. On the other hand, if they want to be adventurous, they can extend whatever TLS library that they are using. And if that is the case, and if you are the one who is reviewing it, make sure
that the metadata is unique enough and whether they are using all the metadata properly. Because again, there are things that could go wrong with that. Did I cover it? Yeah. So basically the payload looks like the SEC token binding header that is being sent from client to authorization server. On the authorization server end, it checks whether the payload contains whatever was already agreed upon. So even if there is another client who has hijacked the token itself, they do not have this binding information with the client who actually negotiated it. So with every subsequent call that goes out, say either providing a refresh token, providing an access token, or using the access token, then you basically
concatenate the token binding ID. Or if you want to go all out, you can just send the payload again and again. But in the authorization server end, what you're looking at is the binding ID. Like, is it coming from the client who registered the binding ID with me? So I briefly commented on browser support. So right now, I understand that Chrome doesn't support it. There's only Microsoft Edge that supports it. So obviously, this is not going to happen without the user knowing, or it's not going to happen automatically without giving pain to the developer. So here is where there is something called self-signed certificate, mutual TLS, and all of this that comes into picture. Self-signed certificates are not something alien. These have been existing for a long time.
So it's just repurposed for the OAuth workflow. What happens here is the client now sends a self-signed certificate to the authorization server. And again, there is always a negotiation call that happens. So the client sends whatever information, whatever certificate belongs to them and binds them to what they are, to the authorization server. And the authorization server on its end registers that. Obviously, this is going to add one more attack vector, where there's going to be a database entry of the binding ID and of all the major secrets or concatenated secrets that they have with the client. But you need the context for subsequent workflows. And in subsequent workflows, what happens
is the access token is then concatenated with the client certificate, and this is sent off to the authorization server. Now they can see that it's not just the access token that is coming in. It's also the certificate that identifies the client. So why is PKCE great? People aware of PKCE, raise your hands. So, PKCE is Proof Key for Code Exchange. What it basically does is whatever I explained so far; just substitute your certificates and your key pair with a code verifier. So during the negotiation call, what happens is the client is going to say, "Hey, I have a unique code and this is how I generated it. So I'm going to give you the generated code or
some form of code challenge, and I'm going to give you the TM, which is the transformation method." So the authorization endpoint now knows what to expect and how to derive the secret from that, because it knows the transformation method. Now, for every subsequent call, the access token is appended with the code verifier. The most plausible remaining case would be a malicious client. Otherwise, you need to take over every workflow that happened with the authorization endpoint. Just getting the access token doesn't work. Just getting the code verifier doesn't work. You need to basically mirror every workflow that happens. Yeah, that's PKCE. Right now, it's mostly used by mobile workflows because of the limitations of storing client secrets on the device. But again, storing client secrets anywhere is kind of a bad
idea. So if you want to use PKCE, you might as well go for it. So token binding and PKCE both had to do with inherent weaknesses or implementation weaknesses with token binding or with tokens. Now, moving on to open redirect, which showed up in three of the HackerOne reports that I showed. This case interested me because of how simple it was to execute. This one was a manipulated redirection URI via the redirect URI parameter. So basically what happened was the authorization server didn't have a static redirection URI registered with it when the initial registration was happening. And the client was basically passing a redirect URI through the referrer header. And what had happened was in the referrer header,
the URI was passed with an extra percent sign. And that kind of screwed up the validation mechanism at the authorization server end. So yeah, it was one percent sign that was sent along with the URI that was passed on. To mitigate this, what you can do is, for people not familiar with this meme, the first one is the minimum security measure that you can take. If you are in a position where, again, it's usability versus security, we need to support more clients and we need to make it secure, then you put the burden on the client. You extend your trust boundary beyond your authorization server or your resource server. You include the client in the trust boundary by default. If you're wondering whether it's a bad idea, that's why
I started off with that. So basically you leave the client to authenticate where the token is going to be passed to. So if you have a list of redirect URIs, then say A.com to B.com, then A.com needs to verify that B.com is legit and then forwards it to authorization server in their referrer header. Like, hey, it's okay to pass the token to B.com. The second best way to do this is during the registration phase, the client says that, "Hey, I have a list of 10, 20, 100 redirect URIs, and you can whitelist all of them, and I'm going to maintain the whitelist with you. And once a call comes in where you have to pass
on the token, please check against the whitelist and then forward the token." This has some issues because all developers, they need to maintain a robust whitelist and they need to make sure that the websites are maintained, where the URLs are maintained, and then people might complain like, "Hey man, this is becoming an availability issue. We need to onboard someone within two days and we can't get to the authorization server then." There are pain points to this. The third part, which is actually advocated in the RFCs and the best practices, like any documentation that you will find, is to have one static redirect URI and every token goes there. Obviously, this endpoint is going to be
part of your trust boundary and you leave it up to the trust boundary to forward the token wherever they want to. But from an authorization server perspective, they can do a string-by-string match, like an absolute match. So there is no way that, within the OAuth scope, the token is getting leaked anywhere. So yeah, at least in our opinion, in all the issues that we have seen so far, tokens and redirect URIs seem to be the easiest to get wrong, and those are the best practices that could be followed to prevent it. Moving on next to magic links. I wanted to bring up magic links because they've been creating such a buzz lately. So basically, the authorization server or the authorization system doesn't have the burden
of storing secrets. And to an extent, the client also now doesn't need to store secrets. So in a nutshell, the workflow that happens is the client wants to log into a website. The authorization server behind the website says that, hey, this person wants to log in. So rather than looking for credentials that the client could provide, they generate a credential or a secret that is passed on to something known about the user. Now, this might be a phone number or an e-mail where a generated secret is then passed on to either an e-mail or a phone number. Then the user is expected, ideally within a time frame to click on this code or to verify
this code and then get back, and they will be given a login session. Now, these are technically supposed to be short-lived secrets, but there have been cases where they don't expire at all. And this is supposed to be a secure transport, but depending on the email service provider, depending on the phone network provider, it's basically shifting blame. You assume there is security on the third party, on the other integrations that store the secrets for you. Yeah, like I said, email and phone network security dictates the risk. So these are all the kinds of things to make sure that the developers take care of in their code. And this is before the magic happens. So this is even before the secret is sent out to the email or
the phone number. So the first one is, please make sure that either the phone number is verified or the email is verified. A lot of the time the implementations do not check for a verified email or phone number. So make sure you do that. The next part is when you're generating the token, make sure that it has high entropy. And make sure that you're using some established protocol. I've given the example of JWT here because we have talked about it. So make sure that you have an issuer, that there is an intended recipient, and also make sure that you're using the correct algorithm. So please make sure they are
not using MD5. And yeah, again, token binding again. Make sure that whatever you're sending, whatever secret you're sending to either a phone number or an email, make sure that you track who requested it, where it is being sent, and where the response is coming from. And for this, you need to make sure-- either you can store the secret on its own, but you are getting the burden again anyway. So you can store a part of the secret to verify that what you sent out in the request or what came back in the response. Make sure that you have that contextual binding information. Last but not the least, make sure that you have rate limiting enabled.
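The magic-link checklist above (high entropy, short lifetime, contextual binding) can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product does it: the names `issue_token` and `redeem_token`, the ten-minute lifetime, the in-memory store, and the use of the requester's IP address as the binding context are all assumptions made for the example. Storing only a SHA-256 hash of the token is one way to avoid keeping the secret itself on the server.

```python
import hashlib
import secrets
import time

TOKEN_TTL_SECONDS = 600  # assumed lifetime; pick what suits the product

# Hypothetical in-memory store: token hash -> binding context.
_pending = {}


def issue_token(email, requester_ip, now=None):
    """Create a high-entropy, single-use token bound to who asked for it."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)  # 32 random bytes from the OS CSPRNG
    digest = hashlib.sha256(token.encode()).hexdigest()
    _pending[digest] = {
        "email": email,                # where the link is being sent
        "requester_ip": requester_ip,  # who requested it
        "expires_at": now + TOKEN_TTL_SECONDS,
    }
    return token


def redeem_token(token, responder_ip, now=None):
    """Return the bound email if the token is valid, else None."""
    now = time.time() if now is None else now
    digest = hashlib.sha256(token.encode()).hexdigest()
    record = _pending.pop(digest, None)  # pop makes the token single-use
    if record is None:
        return None
    if now > record["expires_at"]:
        return None
    if record["requester_ip"] != responder_ip:  # context must match
        return None
    return record["email"]
```

An IP match is a crude binding and will reject users who open the link on another network; a real deployment would choose its own context (device cookie, session ID) and add the rate limiting mentioned above.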
It's very easy to overwhelm something that is generating secrets, and it could easily be used for enumeration. So make sure that you have a rate limit for where the request is coming from, and make sure that you allow a legitimate number of retries for when a user forgets that they clicked the link already. So yeah. A quick note on SAML here. We brought in SAML as well because we wanted to see whether the issues that we saw in JWT and OAuth show up here as well. So just a little bit of metadata about SAML. It's basically XML data consisting of signatures and assertions. And again, this is something
negotiated between the client and the authorization server, and they know what's going on. But in the case of SAML, there is an increased amount of metadata that goes along. So you have the user ID, which is the basis of authorization. And then you also give user attributes, like first name, last name, organization, and other things that you want, especially in a single sign-on scenario. Yeah, one complaint is that it's too verbose. It's not a simple payload that you can pass in the header. Or you could, but then it's probably going to trip up some load balancers. But yeah. So one security issue, again, because we spoke about signatures already, is digital
signature attacks. One, missing signatures. The client and the server negotiated that, "Hey, you are going to send us signatures," but then, time of check, time of use, the authorization server doesn't check whether a signature was passed along. Signature wrapping attacks: it could be a legit signature that is being passed on, but with a MITM, somebody else decides to wrap the payload containing the signature with their own signature, and then whatever token is returned is now returned to them. And then you have cloning signatures. Self-signed assertions are super interesting because you basically send the payload which contains the key as well. So the server looks to what was sent in the payload to verify the certificate. So it doesn't check
with whatever it has in its own database. It checks it against what was sent in the payload. This has happened. So yeah. And the other one is the XML parsing issue, where the data that is being sent is, again, like I said, negotiated. The authorization server knows what the client is going to send, or at least the format of what it's going to send. This is a really cool way of gaming the format, where the legit text expected here is admin@example.com. But then the way that it is split is there is a text "fake", if you see the right-hand side, and then at the bottom you have admin@example.com. Now, what the server sees, it doesn't have enough context to even parse that XML data that it's going
to see text "fake" and then reject it, but then it's going to come down and read text admin@example.com and it's going to allow it. But overall, what it gives the token to is fake admin@example.com. So again, there is context that is getting missed between the services that are communicating with each other. Again, like authorization bypass. So yeah, that was SAML. And in conclusion, what we saw was that across JWT, OAuth, SAML, and passwordless, we have seen different developers implement different protocols, but then it's a new protocol and it's again the same problems that we are seeing. When we are in a position to advise developers or work with developers to have a secure implementation of authentication, we see that
we continually advise them on: have you managed token metadata properly? Have you chosen the right algorithm? Are you sure that the server is doing validation at each and every point? Is it doing validation across the right data? Finally, there is missing context in all of them. So there are always security entities that are involved in the protocol, but then it's never bound to what entity we are verifying. So in conclusion, there are many ways to authenticate, and equal or more ways to implement authentication incorrectly. And we did this presentation to understand if we could come up with secure practices across authentication protocols, not just context-based, that developers can take care of if they are going to be implementing something. So empowering developers
to securely implement authentication is definitely important. So, questions? Please raise your hand so the people watching on YouTube will be able to hear both the question and the answer. I'll bring the mic to you. Anyone? Any comments? All right. Well, thank you very much, Lakshmi and Divya.
Good afternoon and welcome to B-Sides Las Vegas Ground 1234. This talk is a secret and you don't get to know what it is. This talk is Exploiting Windows Group Policy for Reconnaissance and Attack by Darren Mar-Elia. We have a few announcements before we begin. We'd like to thank our sponsors Critical Stack and Ball Mail and our stellar sponsors Robin Hood, Secure Code Warrior and Paranoid. It's their support, along with our other sponsors and donors and volunteers, that make
this event possible. We do ask that you please, these talks are being streamed live, so as a courtesy to our speakers and the audience, we ask that you check to make sure your cell phones are set to silent. If you have any questions, we will use the audience mic after the talk so that YouTubers can hear the questions and the answers. So after, just please raise your hand and I'll bring the mic to you. Thank you. How's everyone doing? Well, I really appreciate the turnout for such a topic as group policy. I have been speaking for about 20 years, mostly at IT conferences, but this is my first time at B-Sides, so I'm super happy to be here. How many
of you are reasonably familiar with group policy? Awesome. So I won't spend a ton of time talking about kind of the guts of group policy, although I think it's worthwhile to kind of review it just to make sure we're all on the same page. I guess questions are at the end, so if you do have questions as I'm talking, try to remember them, write them down or something, and I'll make sure I leave time for questions afterwards. A little bit about me. My day job is head of product at a company called Semperis. We do Active Directory protection and disaster recovery. I founded a website that many of you may have visited in your travels called GPOGuy.com back in 2004, and a company called SDM
Software that created commercial group policy, still does create commercial group policy software. I was a 14-time Microsoft Group Policy MVP. Until Microsoft summarily executed all Group Policy MVPs last year. So the reign of Group Policy MVPs has come to an end. There wasn't enough cloudiness in Group Policy for Microsoft's tastes. So they got rid of a whole swath of MVPs, including myself. I spent a bunch of years of my career, probably half of my career, in kind of enterprise IT for financial services and software companies. I have a group policy training course out on Pluralsight. I've got some projects out on GitHub. As the pictures sort of indicate, I have two incongruous hobbies of bike racing and making wine that I try not
to mix. But that's what I do for fun for the most part. And let's see if I can make PowerPoint do what I want. So, just a little bit on what we're going to talk about: a little bit of a review on what Group Policy is, what it does, how it works. Some of the reconnaissance benefits of Group Policy, "benefits" in quotes. Attacking Group Policy: what are some of the things that attackers can do with Group Policy. And then a little bit on defending against that. So what is Group Policy? Built-in configuration management technology for Windows and Active Directory, a roadmap to an organization's Windows security posture, a malware delivery vehicle, or all of the above? Can
anyone guess which answer is correct? Right. Actually, just kind of interestingly, the last time I was at B-Sides here was in 2016. At that time I had been working with Active Directory and Group Policy for 16, 17 years, blissfully ignorant as to the threats against both Active Directory and Group Policy. And I sat in a room just like this and listened to a talk about BloodHound. How many of you are familiar with BloodHound? Yeah, it's an amazing tool. And my eyes were opened. And I started to think about how Group Policy would benefit or not benefit from the same kinds of approaches that the guys were doing with BloodHound. And actually, after the original release of BloodHound, they released kind of a GPO release
that had a bunch of discovery in it, which I'll talk about in a little bit. But it sort of got me thinking about this whole notion of group policy from a security angle. And that was kind of my path to where I got to today where I'm talking about group policy and security. I have a mailing list that's got something like 17,000 people on it and every couple years or so I do a survey to try to understand what people are doing with group policy, how it's changing. Is everyone mass migrating to Intune because it's so awesome? Stuff like that. One of the things I ask is how are you using group policy? And this
is useful to sort of understand what the attack surface looks like. In most organizations, group policy is used for 80 plus percent for general administrative templates lockdown. Security, hardening, and general kind of registry tweaks just under 80%. And then, you know, other parts of group policy come in here. Drive and printer mapping is pretty high. Folder redirection, browser configuration. All of this stuff is, you know, plus 50% in most organizations. So it's worthwhile to know that if you're in a position of using group policy to secure your environment, this talk is very relevant to you and it's also very relevant to an attacker that's looking to exploit group policy to worm their way into your environment. So just about the structure of GPOs because
I think it's important when we're thinking about abusing group policy. There's a lot of words on this page, but basically there's two sides to a group policy object. This is the way Microsoft architected it from the beginning, for better or for worse. There's a piece in AD that's called the GPC. It's in the CN=Policies, CN=System container. You'll see a bunch of GUID-named containers; you can kind of see it on that screenshot. Each one of those GUID-named containers represents one GPO. And on that GUID-named container is a set of attributes that talk about the friendly name of the GPO, the path to the other side of the GPO, the version number of the GPO, information that we'll talk about a little bit more that's relevant to
how Group Policy functions. As I implied, the other side of the Group Policy object is called the Group Policy Template or GPT. That's in SysVol. Now this was a huge problem in the early days of Windows 2000 and Windows 2003 because Microsoft had this really wonky mechanism called FRS that would replicate sysvol to every domain controller sometimes. And so you'd have this situation where you'd make a change to a GPO and the AD side would be changed but the sysvol side wouldn't replicate. So the clients would think that something has changed in the GPO but didn't really get the new settings. And it was a cause of a lot of consternation. They now have FRS that is more robust and allows SysVol to be more consistent
across all the DCs. Now you still have some latency and some difference in replication between the AD part of the GPO. Did I say FRS? I meant DFSR. The AD part of the GPO and the sysvol part of the GPO, they're not going to replicate identically at the same time, but such is life. It's a best effort sort of scenario in that case. But the point is, you have these two pieces of the GPO that are supposed to be synchronized from the perspective of permissions and content. So if you edit a GPO, by default when you edit a GPO, it's targeting the PDC emulator domain controller in your domain. When that happens, the change is made to the AD side, the change is made to the
sysvol side, and then it replicates out from there. That's the way it's supposed to work. The other thing to know about the GPT, this is where most of the settings storage occurs. So when you're defining an admin template setting or a GP preferences drive mapping or a security hardening setting, it's getting written to files in SysVol. Now there's a few exceptions to that. One is software installation. Software installation is a part of group policy not terribly used anymore where you can deploy MSI files. That particular area writes a piece to AD and a piece to SysVol. So it's kind of split between the two. Okay, so why am I talking about this? So the point about it is, and we'll get into it in a
little bit, you have to pay attention to both sides of the GPO from an attack perspective, because attackers can take advantage of inconsistencies on either side, mostly in delegation, to put stuff into Group Policy objects that shouldn't be there. So when we talk about processing of Group Policy, it is strictly a client-side operation, meaning that whether it's Windows Server, Windows Desktop, or Windows Workstation, the client is doing the work of pulling down the Group Policy. When it does that, it does it in two phases. The first phase is basically "tell me all the GPOs that apply to me." That's called core. The core phase queries AD, uses LDAP, and determines which GPOs apply to it. It's got its list and it says,
okay, I know which GPOs apply. I know which policy areas are in each of those GPOs. So then I'm going to call the client-side extensions, or CSEs, to basically, by policy area, bring down the settings storage for each GPO and process it, do whatever it says. Map a drive, make a registry setting, change a security configuration in SAM, whatever it happens to be. The CSE is responsible for actually processing the policy. That's a DLL that sits in System32. Microsoft ships a bunch of CSEs out of the box. A little-known fact: this is an extensible framework. It was meant to be an extensible framework from the very beginning. There are third-party vendors that have extended group policy. Essentially what
that means is writing a new CSE and writing a new MMC snap-in for the GP editor to be able to set those settings. But you can do it. It takes a little C++ knowledge and the ability to register the CSE in the registry. You need to be an admin to be able to register a new CSE. But if you are an admin, you can register that CSE and you can provide new policy functionality. So I went through that. Settings are per computer or per user. But in either case, as of MS16-072, I think, when per-user policy processing runs, it runs in the context of the machine account, local system, and then it will impersonate the user to make the user-specific changes. The reason they
made that change was for, I don't recall the exact scenario now, but there were some man-in-the-middle attacks on group policy processing that were taking advantage of the fact that GP processing for the user was running in the user context. So they moved GP processing for the user into the machine context. They broke a whole bunch of IT shops in one fell swoop, because when you rolled out MS16-072, all of the GPOs that you had filtered by user security groups didn't work anymore, because you had to grant the computer access to that GPO in addition to the user group. More detail than is useful at this point, but
just wanted to set that context. GPOs get refreshed on clients and member servers every 90 minutes by default plus a 30 minute randomizer. So it could be 90, it could be 120 or anywhere in between. Now why this is important, and it's five minutes on domain controllers, why this is important is that if somebody futzes with the GPO thinking that it's going to have an immediate effect. It doesn't. In a large enough organization with sufficient randomization, you probably have machines that are refreshing group policy all the time. But the point here is that it's not going to be instantaneous across the organization when you make a change to a GPO. So targeting. So you have a GPO. You create a GPO in
AD. It does nothing until you link it to something. You link it to a site, an AD site, which is just a collection of IP subnet definitions in AD. You link it to the domain at the domain NC head, which is the top level of AD. Or you link it to an OU, an organizational unit. Once that's done, all other things being equal, you will start applying that GPO to computers and users in AD. And there's an order of precedence to that. On every Windows device, whatever the SKU, server or workstation, there's a local GPO. There's actually something called multiple local GPOs that they introduced in Vista, which is probably an area for exploit, but I haven't really dug into it too much. But the point
is that there is this concept of a local GPO that you can edit just on that machine to set policy. If there's a conflict, it will be overridden by any site-linked GPOs, then by any domain-linked GPOs, and then by any OU-linked GPOs. So you have this order of inheritance with group policy, local, site, domain, OU, where you could have 10 different policies applying to a given computer account, and if there's conflicts along the way, the last writer wins, meaning the OU-linked ones win. Now there are two things you can do to disrupt that. You can set a link higher up, let's say at the site level, to enforced. That will always win over a conflicting setting set at the OU
level. You can, at the OU level, you can block inheritance. Block inheritance says everything above me in the order of precedence, ignore it. The enforced link wins out over block inheritance. So enforced wins in every situation. That's useful to know both as an attacker and as a defender because if you're an attacker and you're able to write GPO links, what are you gonna do with that link? You're gonna make it enforced because it will beat out everything that somebody tries to do to circumvent you down below. And then there's also on a given OU you can have 10, 15, 20 GPOs linked. There's an order of precedence in those as you're looking into tooling and I'll show you this in a
little bit. So you've got linking, but maybe you want to get more granular about who in the OU you want that policy to apply to. You have a number of different filtering criteria. You have security groups that you can use: you can say on the GPO, only apply this to members of the marketing users group in the marketing OU. You have WMI filters that you can attach to a GPO. You have one WMI filter per GPO. A WMI filter is a WMI query that gets executed by the client. If it's true, the GPO applies; if it's false, it doesn't. The WMI query is used for things like only apply this to Windows 7 machines or Windows 10 machines. And then you have GP preferences, which
is a section of group policy, and I'll show this so it becomes real. In each of the GP preferences areas you have something called item-level targeting, where you can have filtering on a per-setting basis. If you ever wonder why people hate group policy, this slide captures it beautifully. It is super complex. And that complexity is both a blessing and a curse, because it gives you a lot of flexibility, but it gives attackers lots of surface area to go mess around. This last point is important. Group policy is normally only updated on the client if something in AD has changed. In other words, if the GPO is updated, it gets a new version number. The client wakes up
for its group policy processing cycle and says, the last version I processed is 2. What is the version on the GPO? It looks in AD for the version. It ignores sysvol, which there's a version there, but it doesn't get updated anymore. It's only looking at AD. If it says 2 on AD, it doesn't process the GPO that time around. It just ignores it. So if somebody has tinkered with the GPO but hasn't messed with the version number, it's not going to pick up that tinkering. That's important. Now, there are ways to circumvent that on a per CSE basis, but for all other things being equal, if the version numbers are the same on client and
AD, the GPO is ignored during that cycle. Before I move on, I want to show a little bit of this stuff so that it's not so abstract. I mean, I'm sure most of you have messed around with this, so I'm not telling you anything that you don't know, but here's my domain, test me dot net. I have one GPO linked at the domain level. I have all these nifty OUs. You can see the GPOs that are linked at the clients OU, for example. I have three of them. The one at the top of the list gets processed last, so it wins if it's conflicting with any of the other ones. If there's conflicts between the default domain policy and any
of these, then these guys win because they're lower in the pecking order. Last writer wins. If I come down to the AD site, I have actually no GPOs linked at the AD site, but if I had a GPO linked here, it would be processed first, or rather after the local GPO on that particular client. And so if there were conflicts between it and the domain or the OU, the site-linked one would lose. Now if I dig into a particular GPO, I've got the computer side and the user side. Pretty straightforward. We have policies and preferences. When I was talking about item-level targeting on preferences, if I create a preference, let's just do something kind of simple. I'm
going to say C:, oops, I'll do just enough to get me into this. You can see what item-level targeting looks like. On that particular setting I now have 27 different possible types of targeting I can do, in ANDed and ORed combinations if I choose to do so, to further filter that GPO. So lots of opportunity here for confusion.
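The precedence rules from the slides (local, site, domain, OU, last writer wins, with enforced links and block inheritance as the two overrides) can be modeled as a short Python sketch. This is a deliberately simplified model for reasoning about which setting wins, not the actual client algorithm: `GpoLink`, `effective_settings`, and the settings dictionaries are hypothetical names for the example, and it assumes a single OU level, ignores link order within one container, and skips security, WMI, and item-level filtering.

```python
from dataclasses import dataclass

# Processing order: later levels overwrite earlier ones on conflict.
LEVELS = ["local", "site", "domain", "ou"]


@dataclass
class GpoLink:
    name: str
    level: str       # one of LEVELS
    settings: dict   # setting name -> value
    enforced: bool = False


def effective_settings(links, ou_blocks_inheritance=False):
    """Resolve conflicting settings under the simplified precedence model."""
    rank = {level: i for i, level in enumerate(LEVELS)}
    normal = [link for link in links if not link.enforced]
    enforced = [link for link in links if link.enforced]

    if ou_blocks_inheritance:
        # Block inheritance drops non-enforced site and domain links.
        # The local GPO is assumed to be unaffected.
        normal = [link for link in normal if link.level in ("local", "ou")]

    result = {}
    # Non-enforced links: local first, OU last, so the OU-linked GPO wins.
    for link in sorted(normal, key=lambda l: rank[l.level]):
        result.update(link.settings)
    # Enforced links are applied afterwards, lowest level first, so the
    # enforced link highest in the hierarchy has the final word.
    for link in sorted(enforced, key=lambda l: rank[l.level], reverse=True):
        result.update(link.settings)
    return result
```

In this model an attacker who can write a link at the site or domain level and mark it enforced overrides whatever an OU admin sets below, which is the scenario described above.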
So that's kind of a review of what I just talked about. Let me just quickly, so you can kind of see what I'm talking about. Here's the GPC, right? This is the AD part, the Group Policy Container part. Each one of these GWID folders represents a GPO. If I come into the properties of that and look at the attribute editor, you will see the display name of the GPO. You'll see the GWID of the GPO, the distinguished name. You will see, if I come all the way down to the end, The version number, that is the version number of the GPO that the client looks at when it's trying to determine if anything has changed. And I'll refer to
this in a little bit. There are also these two attributes called gPCMachineExtensionNames and gPCUserExtensionNames. They hold the GUIDs of the policy areas implemented in this GPO. This is super important when it comes time to talk about tinkering with your policy setting storage. These GUIDs have to exist for the corresponding policy areas that are in the GPO or the client will simply ignore it. And not only do they have to exist in the GPC, they have to be sorted alphanumerically. So, I've had some people talk to me about injecting settings into group policy objects, and they're able to do it, but they can't figure out why it's not working on the client.
This tends to be the reason why. The GUID, or actually it's a pair of GUIDs, is not there for that particular policy area, or they're not sorted. Maybe they're there, and I discovered this early on, but if they're not sorted they're also ignored by the client. So there's a very precise way of messing with group policy that you have to be aware of as an attacker, or even as a defender, frankly. Okay, let me get back into this deck. So why is GP useful for reconnaissance? Well, as I indicated in that little survey that I showed, many IT shops are using it for security hardening. A lot of the baselines that you get from the standards
bodies or from Microsoft, they come in the form of Group Policy Object backups. They're basically telling you, "Use Group Policy for this." Of course, Microsoft has some other technology like SCCM that they offer for this, but for most shops, it's free, it's in the box, it's generally well understood. They use Group Policy to do security hardening. And they're doing it for, not necessarily in order of importance, but certainly close: local group membership, like setting local administrators. Configuring user rights, so who can debug programs, who can log on locally, or access this computer from the network. Who has remote desktop access to the computer? User rights control all that stuff. Security options like whether UNC is, sorry,
UAC, User Account Control, is enabled or not. All that's set in Group Policy. How many of you are doing admin tiering? Microsoft talks about tier zero, tier one, tier two. Of those of you who are doing admin tiering, how many of you are implementing admin tiering enforcement through Group Policy? Yep, so roughly the same amount. The ability to control who can log on to domain controllers, who can log on to servers and workstations, that's all implemented in Group Policy. Password policy. How many characters should the password be? How long should it last? Should it lock out? And then, configuring local admin passwords used to be a feature in Group Policy Preferences. How many of you used Group Policy Preferences
to configure local admin passwords? Surprisingly few, that's good. I've heard that this is still a problem when pen testers go into a Windows environment. They're finding these things littered all over the place. Why is it a problem? Well, because as part of Microsoft's protocol docs, they publish the encryption key for the policy storage in the doc. And it uses the same encryption key for every single implementation. So you can decrypt all of the passwords that are in group policy storage. And as I mentioned at the last point here, which is the most important point for Group Policy as a reconnaissance tool, GPOs are world readable by default. Every authenticated user in the domain gets read access when a GPO gets created. Your security hardening,
your drive mapping, you know, the most trivial stuff and the most important stuff is world readable. If I'm authenticated to AD, and in this day and age it's not that tough to get a foothold in an environment, even as a non-privileged user, then as a non-privileged user I can run BloodHound, for example, and get a map of who's in admin groups on which machines. Or other tools will let me read the password out of GP Preferences so that I can log on to a machine as admin. So this world-readable thing is a blessing and a curse as well. So, I wanted to highlight some tools that are useful for reconnaissance of group
policy. PowerView is part of PowerSploit. It's got a bunch of PowerShell cmdlets in there for enumerating admin access on machines by user. New-GPOImmediateTask is actually a kind of working sample of being able to inject settings, in this case for a scheduled task, into a GP preference on an existing GPO. It assumes you have access, of course, but the point is with this cmdlet you can do that. SharpHound, which is the ingestor, the data collector for BloodHound, collects a lot of the same information. If you set it to run in that mode, using Group Policy, I can figure out who's local admin on a machine without hitting the DC, or rather without hitting the machine itself. I think it's part of the
so-called stealth mode to use Group Policy to determine who's an admin on which machines in the environment. And then Grouper2 is my new favorite plaything. This is written by l0ss. He's out in Perth in Australia. And it is super cool. If you haven't downloaded and tried Grouper2, it is a great tool. I'll show it to you in a minute. But basically what he's done is he's gone through and come up with a list of things that represent a potential problem in group policy from a security perspective, everything from permissions on GPOs to local admins being added to GPOs to GP Preferences passwords being found in GPOs. And if
you run Grouper2 against your AD environment, it'll enumerate all of these, give each a risk score, or I think he calls it an interest level, and let you assess in one fell swoop all of the potential issues that you have to be worried about from a group policy perspective. So I'm going to just drop in and show some of this stuff. All right, so I've got one just up on the screen here. This is a PowerView command called Find-GPOComputerAdmin. I pass it a computer name and the domain name, and it tells me what policies are granting which users admin access or remote desktop access on this machine, this Win 10 client. So it's called out
this group called Tier 2 Admins, which is a member of local administrators. You can see here it's a group in this GPO, and it's using the GP Preferences local users and groups feature to grant that access. This one is Domain Users being granted Remote Desktop Users access in this GPO using restricted groups policy. So it's really great at calling out, for a given machine, or I can flip it around and ask for a particular user, where does this user have admin access? Now, this is a fairly simple environment, but you can imagine if you're an attacker trying to get access to this information, running across an entire environment, you're going to get a lot of good information. Now, the common thread
for all of these tools is that the person running the tool doesn't require GPMC. That's been, I don't know if I'd call it a stumbling block, but it was a hurdle for any tools that are trying to assess group policy, because if you're not using GPMC you have to write it all from scratch. All of these tools have taken that on. You don't need GPMC running, and they don't use any of the GPMC libraries. That's super handy because you can drop these tools on any machine and execute them effectively against your group policy environment. Let me shift gears and just show you how
Grouper2 works. It's pretty straightforward. I'm just going to run it as is, and I'm going to run it in pretty mode, quote unquote, and it's going out and munching through your group policies. And then it starts showing, as I mentioned before, all of the different things that it's looking for, and then the interest level. So, NT services, there's some immediate tasks in here. I won't go through the whole thing because it's a little bit hard to read on the screen like this, but essentially what it's doing is looking for anything that's interesting from a security perspective and dumping it out to this report. And that includes, I'll talk about this in a
little bit, but it includes permissions on GPOs, linking, settings that might be interesting like security settings, local group membership, all of that. Okay.
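One of the checks mentioned above, GP Preferences passwords left behind in GPOs, is simple enough to sketch in Python. This is not how Grouper2 or PowerView implement it. It is a minimal sketch that assumes you point it at a readable copy of the policy folders, and that a stored password shows up as a non-empty `cpassword` attribute in a preferences XML file. The function name is made up for illustration.

```python
import os
import xml.etree.ElementTree as ET


def find_cpassword_entries(root_dir):
    """Walk root_dir and list XML elements that carry a non-empty cpassword.

    Returns a list of (file_path, element_tag, user_name) tuples. user_name is
    read from a 'userName' attribute if present, otherwise an empty string.
    """
    findings = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            if not filename.lower().endswith(".xml"):
                continue
            path = os.path.join(dirpath, filename)
            try:
                tree = ET.parse(path)
            except (ET.ParseError, OSError):
                continue  # skip unreadable or malformed files
            for element in tree.iter():
                if element.attrib.get("cpassword"):
                    findings.append(
                        (path, element.tag, element.attrib.get("userName", ""))
                    )
    return findings


if __name__ == "__main__":
    import sys

    # Usage: python scan.py <path to a copy of the Policies folder>
    for path, tag, user in find_cpassword_entries(sys.argv[1]):
        print(f"{path}: <{tag}> stores a password for {user or 'unknown user'}")
```

The sketch only reports where a stored password exists. It does not decrypt anything, which is enough for a defender to know which GPOs need cleaning up.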
So let's talk about attacking group policy now. Actually, before I do that, I meant to show you one other thing that I'm getting ready to release in the next couple of months. So I built this group policy SDK many years ago for reading and writing GPO settings. One of the challenges that tools like Grouper2 have, or any of these tools, is that if you're trying to determine what settings exist in a Group Policy object, there is no API for that, especially without GPMC. You can't run a settings report like you can in GPMC. It's just reading the raw storage files and then trying to parse them to make sense of them, and that's super time consuming. I went through that ordeal many years ago to write this SDK
and I've been working on decoupling the read part of the SDK. There's a getter part and a setter part where you can actually write settings to group policy objects. It also doesn't use GPMC. So the reader part I'm decoupling and making available on my GitHub account as soon as I can get the code cleaned up and all the swear words against Microsoft taken out. But I wanted to show you a little bit about how it works. The first line here, if you can see this, gets a reference to a particular GPO. The second line gets a reference to a setting path, and you can see here that you can use this by just referring to the
English-language setting path as it appears in GP Editor. That's kind of the powerful part about it. You don't have to parse XML or parse some weird INF file. You can just refer to the setting path and find out if it's defined or not for a given setting. This particular line is just looking for who has been granted the debug programs user right in this GPO. And then if I find one that's set to one, in other words it's defined, then I want to see what the value of that is. So if I come down here, I'm just going to paste
in the PowerShell because it's easier. And if I just run it, you'll see here that it's returned the group that's been granted the debug programs right in that GPO in that setting path. So it's a super easy way of querying GPOs without GPMC, programmatically, from the command line. You can use it as the underlying code. You can use it in C# or PowerShell. It doesn't really matter. So, if you're interested in that, keep an eye out on my GitHub page that I mentioned earlier. Hopefully that'll be dropping soon. All right. Attack paths. So, weak write permissions on the GPC or GPT for one or more GPOs. So, what's the opportunity here? Writing new settings into a group policy object to
execute arbitrary code. So, think about all the things that group policy can do. It's like a smorgasbord of execution, right? There's scheduled tasks, there's logon and startup scripts, there's software installation, there's shortcuts. All of these things are stuff that gets sent to the client, where the client clicks on something or does something and it executes code somewhere. And that instruction of where to execute that code is stored in the GPO. If it's a shortcut, it's a path to a file. That path can be manipulated, changed, added, whatever. If you have write access to the GPO, you can make arbitrary changes to it. So scheduled tasks is a good example because of New-GPOImmediateTask from PowerView. I've actually seen this
used in the wild in malware. Once the attacker got into the environment, they used this or something like it to create an immediate task in a GPO that executed on all the machines. And what did the immediate task do? It installed the malware on every machine that processed the GPO. And you can multiply that. The only limiter to that today is the complexity of writing to setting storage. Because if you'll recall, I said that in the GPT, where settings are stored, not every but almost every policy area has a different storage format for expressing settings. So if I'm doing user rights assignments or security options, it's in a file called
GptTmpl.inf. If I'm doing GP Preferences shortcuts, it's in an XML file. If I'm doing GP Preferences scheduled tasks, it's in a different XML file with a different schema and a different set of attributes. Every single one. I mean, Microsoft did us a favor here. It's really hard to write a unified SDK to write settings into a GPO. Programmatically it's hard. And I can't tell you the pain and suffering that I went through to do it over a number of years, and I'm still working on it. It's not an easy task. But if you have a very defined target like scheduled tasks, like logon or startup scripts, like security hardening settings, then the task is a lot easier if you're just going after a few
of those setting areas. So what are the challenges to doing this? You'll remember I talked about those GPC extension GUIDs that have to be in a GPO for each policy area that's implemented in that GPO. So here's an example. I'm an attacker, I'm in the environment, I'm running New-GPOImmediateTask, and I poke an XML file with the task into a GPO that doesn't have scheduled tasks defined in it. Well, unless I'm also poking the GUIDs for scheduled tasks into the AD part of that GPO, the clients are just going to ignore it. They don't care. So as an attacker, I have to find a GPO that already has scheduled tasks added to it, and add my arbitrary execution code into that GPO's storage. Does
that make sense? So the process of doing this is: I need write permissions on the GPT. I don't even need to touch the GPC. If I'm an attacker, all I care about is that the GPT gives me write permissions. I've got write permissions, I'm trying to write a scheduled task, I need to find GPOs that are linked to targets that I care about. In other words, you want to find a GPO that's linked to as many clients or users as possible. Right? So you probably want to find one linked at the domain level. GPOs linked at the domain level are processed by every computer and every user in the domain by default. So I
want to find a good target GPO that has scheduled tasks implemented in it already, and then I can push my XML into the GPT. Great. But that's a challenge, right? It reduces the opportunity. So for new policy areas, if I want to be able to write a new policy area, I have to have permissions to write to the GPC, the AD part of it. I need to be able to add those extension GUIDs to the GPC. And again, as I mentioned, if I'm only writing that scheduled task to the GPT part of the GPO, and I'm not touching the version number in AD, then it may be
a while before the clients know about that. They may not pick it up. So if I'm an impatient attacker, it might not get me where I need to be. Now, you can touch the version number on the GPC if you have write permissions on it and increment it, say, by one. And then the client will say, oh, something's changed. Pick it up. Let's go. So, weak write permissions on containers. This is the ability to link and unlink arbitrary GPOs to target users and computers. So let's say I have a GPO that I created because I was able to get privileged access on a user account that had the ability to create GPOs. So I created a malicious GPO. It doesn't do anything just sitting there. I have
to link it to something. The next thing I want is write permissions on container objects, OUs, domains, sites, so that I can link my malicious GPO to as many targets as possible. So that's the piece: if I have both creation and editing rights on a GPO and I can link to one or more containers, all bets are off. I can do whatever I want. Including, if I can link that GPO to the domain controllers OU or the domain, there's a neat little, not very well documented feature in restricted groups policy. Remember I was using restricted groups policy to grant admin access on my machines? You can target domain groups with restricted groups policy. Microsoft doesn't recommend it. I
don't recommend it, but let's say I've gotten into the environment and I've created a group or a user account. I can create a restricted groups policy that says put that user account or group into local administrators, and I link that to the domain controllers OU or even the domain. And what happens when the domain controller processes it? It says, oh, I need to add this user or group to Administrators. Well, the only Administrators I know about is the one in AD. So it puts that user or group into the Administrators group in AD, and you've now gone from just a lowly regular person to a domain administrator.
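For defenders who want to see what is linked where, the links live on the container itself, in its `gPLink` attribute. The Python sketch below parses a `gPLink` string that has already been read from AD by some other means, so no directory access happens here. It assumes the usual `[LDAP://<GPO DN>;<options>]` layout, where options bit 1 means the link is disabled and bit 2 means it is enforced. The function name and sample values are made up for illustration.

```python
import re

# One link looks like: [LDAP://cn={GUID},cn=policies,cn=system,DC=...;0]
LINK_PATTERN = re.compile(r"\[LDAP://([^;\]]+);(\d+)\]", re.IGNORECASE)


def parse_gplink(value):
    """Split a raw gPLink attribute value into one dict per linked GPO."""
    links = []
    for gpo_dn, options in LINK_PATTERN.findall(value):
        flags = int(options)
        links.append(
            {
                "gpo_dn": gpo_dn,
                "disabled": bool(flags & 1),  # link present but not applied
                "enforced": bool(flags & 2),  # link cannot be overridden below
            }
        )
    return links


if __name__ == "__main__":
    # Hypothetical value with two links, the second one enforced.
    sample = (
        "[LDAP://cn={11111111-2222-3333-4444-555555555555},"
        "cn=policies,cn=system,DC=example,DC=test;0]"
        "[LDAP://cn={AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE},"
        "cn=policies,cn=system,DC=example,DC=test;2]"
    )
    for link in parse_gplink(sample):
        print(link)
```

Comparing the parsed list against a known-good baseline is one way to notice a GPO that was linked, or a hardening GPO that was unlinked, without anyone telling you.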
Okay, I've pretty much talked about what you can do with linking and unlinking. It works the same way for unlinking. If I can unlink hardening, then I can weaken the security posture of the organization. So it's just as bad to be able to change links to unlink as it is to create links. I'm just going to drop out of here for a second to show what I'm talking about with links. I apologize, I got a little tickle in my throat. So if I'm on an OU, let's say I'm on this clients OU, if I look at the properties of the OU itself, you'll see that Marketing Admins has been added to the delegation of this OU, and I think this is the one that I mucked with. If I
come down here to the permissions on Marketing Admins, and go down to the permissions on the property, what you'll notice is I have Write gPOptions, and on the other ACE right next to it I have Write gPLink. That is the permission I need to be able to link and unlink a GPO on that OU. So if any user or computer or group or whatever has the ability to write to gPLink, they can link a GPO to that OU. Okay. All right, so what are the challenges? Well, you need to find existing GPOs that suit your purposes. Let's say you have link permissions, but you don't have permissions to edit GPOs. That's okay. You can use tools like PowerView to find interesting GPOs that grant admin access or
do something, from a settings perspective, that is advantageous to your position. And then you can link those existing GPOs to containers where maybe you have a user account that you've already compromised. External paths. This is one I just wrote up in the past four or five months. So you have GPOs, and the permissions on the GPOs may not grant write access. But in the GPO, you're referencing external storage locations, like logon scripts that point to a server share somewhere, or shortcuts that point to a server share somewhere, printers, files where you're copying files from one location to another. All of these can contain external references, paths to servers that are not in SYSVOL. So even if your GPOs are
completely locked down and an attacker can't write to them easily, if they can get to the server where you have the logon script stored and they have write access to that, it doesn't matter. They just put whatever script they want in place of the script that the GPO is referencing, and they can do whatever they want in that script. So you have to think about hardening not only the GPO, but any external paths that it calls. And Grouper2 looks for that in the GPOs that you have. I mean, a part of this is about discovery. In a reasonably sized organization, you might have hundreds of GPOs. I had one customer that had 10,000 GPOs. Not fun. So imagine having to go through that and find out all
these nooks and crannies. Now, the challenge of course in this is you still have to have access to those external servers that are being referenced by the GPOs. Okay, I'm going to try to zoom through because I know we're running out of time. GPT redirection. Let's say I have write permissions on the GPC, the AD side of the GPO. When I showed the AD side of the GPO, I brought up the attribute editor, and there's an attribute in the attribute editor called gPCFileSysPath. This is where it tells the client where to find the GPT, the SYSVOL folder. Guess what? I can change this to any UNC path. And that's what this particular attack is all about. If I can make
that change, I can redirect it to some server, some SMB share somewhere, with a copy of the GPO settings with some alterations that I made for my benefit. And so when the client processes group policy, it will pull this down from this external share happily without thinking about it. Now, when I first published this, a guy in Germany, tag NullDE, did some work on it. Using this technique, he stood up an Impacket SMB server to be the external share and watched the hashes rain in as clients and users were accessing group policy. NTLM hashes. It wasn't, like, universal, because there were certain circumstances where you were only getting the computer hashes, but that can be valuable as well. So the point
is, there are other things that you can do with this besides just faking settings storage. So it's really important to be able to control who can write to that GPC in this scenario. Now, the downside to this, if you're an attacker doing this, is that if the admin tries to go in and edit the GPO that you've redirected, or report against it, it will just barf all over the place. Microsoft doesn't know what to do with it. So that's a pretty telltale sign that something's wrong. This next one, I just wrote about this. It's kind of out there, but I thought it was worth documenting. So admin template settings use something called ADMX files. ADMX files define the
text that you see in the GP Editor under admin templates and, behind the scenes, what registry keys and values they write to. Most shops have something called a central store. The central store is just a folder in SYSVOL where you keep all of your ADMX files, and everyone using GPMC and GP Editor in the domain will use that copy of the ADMX files. It's a centralized version control system, if you will, for ADMX. Because ADMX files change with every new release of Windows that Microsoft comes out with. So that's kind of the authoritative source. If I have write access to the central store, I can go into an ADMX file and change the underlying registry key and value for a particular policy.
Now, that in and of itself is not terribly interesting, because in order for that to take effect an admin would have to come in and set that value after I made that change. So the chances of that are small, of course. I had recommended, if you were going to try to do this as an attacker, that you pick a setting that many shops implement. And I did that here in an example where I took this setting, "Always wait for the network at computer startup and logon," which has got to be one of the most common settings I've seen in group policy land. And I took the snippet of the ADMX file that implements that, so
it tells it what registry key and value to write to, and I modified it. I said, well, actually, if they enable that setting, what I really want you to do is disable UAC. And the text description in GP Editor doesn't change, because that's coming from a different file, the language-dependent file called the ADML file. But if they hit this and say Enabled, I said, okay, set EnableLUA, which is the registry value for enabling UAC, set it to zero, turn it off. So suddenly UAC gets turned off on any clients that process that GPO. So again, there are some caveats in terms of leveraging this, because you have to wait for the admin,
you really have to be patient. You have to wait for the admin to use this setting in a GPO. But nonetheless, it's completely hidden. As an admin, you probably won't find it until it's too late. And there are other things you can do with this as well. I picked an obvious one. Weak write permissions on starter GPOs. This one was kind of a low-hanging fruit thing. I don't know very many shops that use starter GPOs. They were this kind of aborted idea that Microsoft had around creating GPO templates. They never really took it very far. Today it exists so you can use it to create a template of admin template settings, and then you can create
a new GPO from it. Well, just like there are issues with weak write permissions on regular GPOs, starter GPOs have the same problem. If I can write to a starter GPO and write some arbitrary setting in there that maybe weakens security in the environment, like I can write Windows Firewall settings into Registry.pol, then if somebody creates a new GPO from that starter GPO, all bets are off.
All right, I want to talk a little bit about defense before I run out of time. So, as may be obvious to you, there's a couple of different things that you can do from a reconnaissance perspective. This first one's a little tongue-in-cheek, but it's actually not a bad way to go. If you're using Group Policy today for security hardening, stop. Use something else. There are other technologies out there for pushing security configuration to Windows. The second one is more measured, I would say, which is to reduce the visibility of GPOs responsible for security. So don't give the attacker a roadmap to your security posture. If you have GPOs that implement security hardening standards, take away Authenticated Users read from the security filtering on that GPO, and at the
very least target it to Domain Computers. What does that do? Remember, Authenticated Users includes both computer accounts and user accounts. If you're not using Authenticated Users and you're just using Domain Computers, then only computer accounts can read the policy, and admins of course, but a regular user cannot. I have to stop. Let me just talk quickly about hardening. Checklist. Check: Who can write to GPOs? Who can write GPO links? Who can write to the central store? Are the GPC and GPT permissions consistent? Do parent folders and containers of those two grant child rights that are disparate or separate from your GPO delegation? I've seen that happen a lot of times. And external path permissions should match the referencing GPOs.
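The "are the GPC and GPT permissions consistent" item on that checklist boils down to a set comparison once the write permissions have been collected. The Python sketch below assumes you already have, for each GPO, the list of principals that can write to the AD object and the list that can write to the SYSVOL folder. How you collect those lists is out of scope here, and the group names are hypothetical.

```python
def permission_mismatches(gpc_writers, gpt_writers):
    """Compare who can write to the GPC (AD) and GPT (SYSVOL) sides of a GPO.

    Both arguments are iterables of principal names. Names are compared
    case-insensitively. Returns principals that appear on only one side.
    """
    gpc = {name.lower() for name in gpc_writers}
    gpt = {name.lower() for name in gpt_writers}
    return {
        "gpc_only": sorted(gpc - gpt),
        "gpt_only": sorted(gpt - gpc),
    }


if __name__ == "__main__":
    # Hypothetical delegation for one GPO.
    result = permission_mismatches(
        gpc_writers=["Domain Admins", "GPO Editors"],
        gpt_writers=["domain admins", "GPO Editors", "Marketing Admins"],
    )
    print(result)
```

A principal that shows up under `gpt_only` is the interesting case from the attack paths above: it can change settings storage in SYSVOL without having been delegated on the GPO in AD.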
Sorry for that last-minute flurry, but hopefully that was useful. Thank you. I don't know, I guess we don't have any time for questions. Thank you very much, Darren. If you have any questions, please remember to raise your hand so I can bring the microphone to you. That way any YouTube watchers are able to hear the question and the answer. Well, thank you again, Darren, very much. Thank you. Thanks, everyone.
Who is...