Dead Pixel: A Practical Guide To Attacking Server-Side Image Processors - Emil Lerner

BSides Prague · 2026 · 23:23 · 2.4K views · Published 2025-04

Speakers

Emil Lerner

Tags

Category: Technical

Style: Talk

Mentioned in this talk

Tools used

Ghostscript, ImageMagick, Inkscape, Pillow

Platforms

Docker

Service

Amazon S3


Thank you. Today I will talk about image processing that happens on the server side. My name is Emil Lerner. I play CTFs with the Bushwhackers team. Occasionally I do security research, and I validate the importance of this research via bug hunting. So assume we have a website that allows users to upload some multimedia, for example an avatar. It could also be a cloud drive that stores multimedia files, and it processes them on the server side, generating image previews. For an attacker, this presents a great, complicated attack surface. We have some complicated binary input, which is the image. We have some open source library running on the server side, because no one would code the image processing

themselves. And we have a direct output channel from this library, the preview, because no matter how many levels of abstraction are present, the output, the preview, will be generated by the library itself. The topic is far from new. Starting from ImageTragick in 2016, it gained a lot of attention, and a lot of different libraries were hacked. But the attack vector still persists, and the attack surface is still not exhausted. Today I will mostly focus on vector graphics. The most popular vector format is obviously SVG. It's an XML-based vector image format. It has a lot of implementations, a lot of different libraries, and, what's important for

today's talk, it supports including raster images. Most security research on SVG has focused on including external images. However, today I will talk about inline images, the images that are embedded directly in an SVG. So what do we do when we attack an image converter? First, we need to understand which exact library is running on the server side. If we do this with generic libraries like ImageMagick, Pillow, or maybe Java libraries, we can upload a picture that is processed differently by different libraries. For example, this is what happens if we upload a GIF image that does not contain any palette: we will see the default palette for this library, which

is black for ImageMagick, this gray thing for Pillow, and this colorful stuff for Java. We can use the same approach to detect which library converts SVG on the server side, and it's even easier because there is a variety of different tags and a variety of different quirks. For example, we can take an SVG that includes three raster images. One is just a basic PNG image. Another is a PCX image, which is uncommon nowadays. And another is just a PNG image, but included via a path attribute instead of the standard href attribute. We will see that different libraries convert this image differently. ImageMagick's parser is the most permissive. It would allow

everything. librsvg would ignore the unknown format, and Inkscape, which is another popular server-side conversion tool, would draw this red X, which is pretty unique to Inkscape. So if you see this, you will immediately know that it's Inkscape that is used on the server side. Today I will describe one of the bugs I found in librsvg. Generally, librsvg is a good library. It's written in Rust, and the code is pretty solid. It doesn't have that many security vulnerabilities. I know there was another one, but it's pretty solid. However, it used to depend on another library, named gdk-pixbuf, for raster images. It is not the

case in the latest version. The latest version is pure Rust. However, it used to use bindings to load a C library, and gdk-pixbuf is a different story. It's written in C, the code is pretty messy, and it has a long history of memory safety issues. So it's a pretty good target for an attacker. One of the types of memory safety issues I like to exploit the most is an information leak via uninitialized memory. It is easy to exploit at bug bounties because it does not require any system knowledge or any binary knowledge. How does it work? Assume we have some server process. Naturally, it works with some sensitive

data. It allocates memory for this data, stores the data there, and then frees this memory. No sane memory allocator would zero out memory after a free. So it may happen that the buggy library, the library that uses memory for image processing, would allocate the same memory block, and the bug is that the library won't overwrite it before using it. So it would use the same memory block for image processing and generate a preview containing the data that was in the newly allocated memory block, and that might be sensitive data. Then the attacker would see the preview, download it, and recover the data. However, an important prerequisite here is that the vulnerable library must be directly linked into the

same system-level process that the sensitive information is processed in. If it is another process, for example an external command executed by a launcher, there won't be any impact. But as we will see, sometimes the vulnerable library is loaded directly into the same system-level, probably Linux-level, process that the sensitive data is processed in. I want to share a trick for finding these bugs. The correct way to find this kind of bug is using memory

sanitizers, and that approach is pretty solid. I suggest using it. However, it has some pitfalls. It requires recompiling, and not only the library itself but all of its dependencies, with MSan, the memory sanitizer. So instead we can use this little trick. We can write a simple malloc wrapper that fills newly allocated memory with a particular byte value, and then we check whether the output depends on this byte value. Normally, the contents of newly allocated memory should not be reflected in the preview, right? However, if we see that the preview depends on this value, we know that somehow we would be able to recover it, because it is contained in the

preview. So it's a pretty simple trick. It requires a corpus of images to test on, but it does not require recompiling the library, its dependencies, and so on. Okay. The issue I found in librsvg is itself pretty simple, pretty trivial. You include a BMP image, which is not that popular a format nowadays, but it's an image format. You truncate the image, leaving only the header, you fake the content type, replacing it with PNG, and that's it. The buffer will be allocated and then used for generating the preview. However, it will never be filled, never overwritten. So if some data was stored there, the attacker would be able to see

it via the preview. The first thing I did after detecting this bug was to immediately upload it to all the bug bounties, especially to cloud drives, and Microsoft OneDrive turned out to be vulnerable. You see a lot of different previews here. However, the uploaded image is the same. If you see different previews for the same payload, you know that something is happening on the server side and you can recover this data. Especially painful is this case, where you can see some other user's document in your preview. Some other user was using OneDrive at the same time as I was attacking it, and

their document was stored in memory, then this memory block was used for my image processing, and then I just saw the document. Because the width of the preview is the same for all users, I can naturally just read it. The red censorship was added by me, because by the nature of the vulnerability I do not control which exact data I leak, and I don't want to compromise this client. However, you can see that this is a real document. So you can extract other users' real data. Another interesting case was the Basecamp bug bounty. Basecamp is the company that invented Ruby on Rails. So

it's not a surprise that they're using it. After I uploaded my vector as an avatar there, I immediately noticed that there was a lot of interesting data that could be recovered from the previews, including other users' cookies. So I figured that it was some web server process. Another interesting thing at Basecamp was fighting the lossy JPEG conversion. The previews were only in JPEG format, and JPEG is a lossy format. You can't just recover the original data from it. However, I used SVG filters to overcome this limitation. SVG supports so-called table filters, where you provide a table of 256 values that are just substituted

for each value of each channel. The channels are red, green, and blue, and you just have a table that substitutes another value for each value. The thing is, you can have multiple such filters and insert the same data with multiple filters, and then you have enough data to recover the original values, which could be AWS keys. After about 1 million iterations of uploading images, recovering the data, uploading again, and recovering again, I got this part. This is obviously part of some configuration file, and it contained AWS keys that were usable from outside the Basecamp network. So I guess

it's another vulnerability, another misconfiguration, that I could use them from my IP address. It's not on the slides, but these keys allowed me to access their S3 buckets, which contained all the files ever uploaded to Basecamp, which is a pretty huge compromise. Okay. The next popular image format is PostScript. PostScript has a different approach from SVG. SVG is a descriptive format, where you describe what you want to see, and PostScript is an actual programming language, where you describe how to draw what you want to see. It's an actual programming language for printers. The thing is that printers are usually used to

interpret it, but on the server side the most common interpreter is Ghostscript. Ghostscript is software. It's not a printer, obviously. It is executed on the server side to process PostScript files. And the thing with Ghostscript is that it is an extremely vulnerable program. It has a lot of vulnerabilities. Naturally, Ghostscript has a sandbox for untrusted programs: untrusted programs for printers should not execute system commands or access system files, and there is a Ghostscript-level sandbox for that. However, there are so many bypasses of this sandbox that it doesn't even make sense to talk about any one of them. They're all different: trivial issues, memory safety issues, logical issues.
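To make the "actual programming language" point concrete, here is a minimal sketch in Python that generates a small, harmless PostScript program: it loops, does arithmetic on the operand stack, and draws a marker string several times. The helper names `escape_ps_string` and `marker_program`, the marker text, and the page coordinates are illustrative assumptions, not something shown in the talk. Only standard PostScript operators (`findfont`, `scalefont`, `setfont`, `for`, `moveto`, `show`, `showpage`) are used.

```python
def escape_ps_string(text: str) -> str:
    """Escape characters that are special inside a PostScript (...) string."""
    return (
        text.replace("\\", "\\\\")
        .replace("(", "\\(")
        .replace(")", "\\)")
    )


def marker_program(marker: str, repeats: int = 3) -> str:
    """Return PostScript source that draws `marker` on `repeats` lines.

    Hypothetical helper for illustration only: it draws text and nothing else.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    lines = [
        "%!PS",
        "/Helvetica findfont 18 scalefont setfont",
        "% Loop counter i runs from 0 to repeats-1; y = 700 - 24*i.",
        f"0 1 {repeats - 1} {{",
        "  72 exch 24 mul 700 exch sub moveto",
        f"  ({escape_ps_string(marker)}) show",
        "} for",
        "showpage",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(marker_program("probe-1234"), end="")
```

If a converter renders the marker text in its preview, the file was executed as a program rather than decoded as pixel data, which is why an interpreter on the server side is a much larger attack surface than a decoder.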

There are so many of them. I personally found about three bypasses. So it's an easy target to hack. Ghostscript is basically RCE as a service. If you see that Ghostscript is executed on the server side, you will probably get an RCE. I guess every time I saw Ghostscript executed on the server side, I was able to get an RCE. Maybe in a sandboxed environment, but I could actually execute commands every time, no matter what the version was. Initially I planned to demonstrate this Ghostscript thing with another story, from Dropbox, but then I thought, maybe take something more recent, maybe take an example from

2025, for example the website of BSides 2025. So let's chain all the stages of the attack and see how the attack could happen. We have a website, we have this nice talk, and we are redirected to a third-party platform. Whatever, we'll hack this platform. We can upload images. So the first thing we do is upload the GIF image I talked about. Don't look at the PNG extension. It's actually a GIF image without any palette present in the image. The GIF format should have a palette, but there is no palette, and you see this black square. Now, from this black square, you know that it

is probably ImageMagick running on the server side. The thing is, ImageMagick may support PostScript by executing Ghostscript. So we upload a PostScript file to check whether this is the case. The PostScript file just prints some random letters, and we see that they are printed. So at least we know that PostScript files are processed on the server. Then we can try something Ghostscript-specific to check whether it's Ghostscript. We upload a file that just prints the Ghostscript version. Again, it's not an exploit. It just prints something via the preview. But now we know that

it's Ghostscript running on the server side, and it's a pretty old version. So we can check the CVEs, find a suitable one, and develop an exploit. I will skip this part in the video, but we obviously can just find a proper CVE, develop an exploit, and then upload it. Then we wait a little bit and we get a back

connect. Thank you. Thank you. So we can naturally execute commands there. We see that it's not much of a sandboxed environment. Probably we could leak some data or modify it. The other thing you should note is that, on the left, you can see that the output of the commands can be printed directly in the preview. As I said, PostScript is a printer language. It can naturally print words, and if you have an output inside a variable in PostScript, you can just print it on the screen. That is really helpful in cases where the target does not have an internet connection. You can explore the sandbox, and maybe you can bypass the

sandbox. That was actually the case in the Dropbox story that I skipped. I could upload a file, I could execute commands, I could evaluate the sandbox from inside it, and then I could bypass it, executing commands on the system itself. Okay. Now let's move on to some takeaways for the blue team: what to do to protect yourself from this kind of vulnerability. The thing is that the regular approach, just updating everything to the latest version, would not work. As you have seen, it's pretty easy to develop a zero-day if you have the proper skills. So what you need to do is put your image

conversion into an isolated environment, a sandboxed environment, and isolated here means really isolated. It should not have any infrastructure access. It should not have any network access, either to your internal resources or to the internet. So it should be something like this. Even if you have a worker that is used solely for image conversion, it probably has access to some parts of your infrastructure: it could pull images from some kind of queue, process them, and then upload the result somewhere. So inside this worker you should have something else, probably a Docker container, and this should be a one-time Docker container, so it just converts one image and then it's destroyed. Probably it

should have a read-only file system. So even if you have a separate part of your infrastructure solely for image conversion, inside this infrastructure you should have a really restricted sandbox environment. Yeah, basically that's it. We still have time for questions.
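The one-time container described in these takeaways could look roughly like the following sketch, which only builds the `docker run` command line in Python. The image name `image-converter:latest`, the mount points `/in/input` and `/out`, the entrypoint arguments, and the resource limits are assumptions for illustration; the Docker flags themselves (`--rm`, `--network none`, `--read-only`, `--cap-drop`, `--security-opt`, `--memory`, `--pids-limit`, `--tmpfs`, `--volume`) are standard.

```python
import subprocess
from pathlib import Path


def one_shot_convert_command(
    input_file: str,
    output_dir: str,
    image: str = "image-converter:latest",  # hypothetical image name
) -> list[str]:
    """Build a `docker run` command for a single, disposable conversion."""
    src = Path(input_file).resolve()
    out = Path(output_dir).resolve()
    return [
        "docker", "run",
        "--rm",                      # container is removed after one image
        "--network", "none",         # no internal or internet access
        "--read-only",               # read-only root file system
        "--cap-drop", "ALL",         # drop all Linux capabilities
        "--security-opt", "no-new-privileges",
        "--memory", "512m",          # assumed limit, tune for your workload
        "--pids-limit", "64",        # assumed limit, tune for your workload
        "--tmpfs", "/tmp:rw,size=64m",
        "--volume", f"{src}:/in/input:ro",
        "--volume", f"{out}:/out:rw",
        image,
        "/in/input", "/out/preview.png",  # assumed entrypoint arguments
    ]


def convert_once(input_file: str, output_dir: str) -> None:
    """Run exactly one conversion in a fresh container."""
    subprocess.run(one_shot_convert_command(input_file, output_dir), check=True)
```

The worker that pulls from the queue and uploads results stays outside the container, so the only things the converter can touch are one input file and one output directory.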

Any questions? Yeah, go ahead.

Yeah, I understand this. But in real life, you don't have that many image conversions, right? You don't have that many avatar uploads. Usually you need to cache the result of the conversion, so you don't need that many resources. And if you see that the pipeline is being overloaded by someone, then you fight it as you usually fight those attacks. You should have the ability to fight attacks that target heavy operations on your infrastructure, and this is just one of them. So yeah,

Is there any mitigation for the PostScript upload attack aside from just not processing PostScript? Yeah, a good mitigation would be to just disable PostScript processing. That's correct. No one would upload PostScript as an avatar in real life. That is what about 90% of all the bug bounty programs I reported this issue to did as a mitigation. However, as we saw, issues can be discovered in different formats. It's not only PostScript; it could be in a regular SVG processing pipeline. And, for example, you may actually need SVG, not for avatars, but for some other reason. In that case, do what Google does. If you upload a PostScript file to

Google Drive, it will be processed, and you could get an RCE in this sandbox. But there is nothing there. It's impossible to get out of this sandbox, and there is no sensitive data there. So it's not a security issue, unless, you know, an attacker mines bitcoins in this sandbox. So it's not a real security issue.
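For the simpler mitigation mentioned in this answer, refusing formats that nobody uploads as an avatar, one possible sketch is an allowlist on the file's leading bytes rather than on its extension (the demo upload earlier was a GIF with a PNG extension). The names `ALLOWED_SIGNATURES` and `sniff_allowed_format` are hypothetical, and the choice of PNG, JPEG, and GIF as the allowed set is an assumption. A check like this only looks at the outer file, so it does not cover images embedded inside an accepted container such as SVG, and it is not a substitute for the sandbox.

```python
from typing import Optional

# Leading-byte signatures of the formats this sketch accepts.
ALLOWED_SIGNATURES = {
    "png": (b"\x89PNG\r\n\x1a\n",),
    "jpeg": (b"\xff\xd8\xff",),
    "gif": (b"GIF87a", b"GIF89a"),
}


def sniff_allowed_format(data: bytes) -> Optional[str]:
    """Return the allowlisted format name for `data`, or None to reject it."""
    for name, signatures in ALLOWED_SIGNATURES.items():
        if any(data.startswith(sig) for sig in signatures):
            return name
    return None


if __name__ == "__main__":
    # A PostScript upload is rejected regardless of its file extension.
    print(sniff_allowed_format(b"%!PS-Adobe-3.0\n"))
```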

Have you been able to recover that data and identify its nature? Yes. You mean for OneDrive or for Basecamp? For OneDrive? Yes, I was able to. I learned that there is a Java process, oh no, sorry, not Java, obviously. It's Microsoft, so they use .NET. So there is a separate .NET process that probably does solely image conversion, and I did not go beyond that. I was able to catch some config files suggesting that it was a .NET process.

Yeah. Okay. Thank you very much. If you have any more questions, I'll be here. So, yeah, basically that's it. Thank you.
