
Beyond Base-64: When Your Data Gets Emojional

BSides CT · 2025 · 31:46 · 113 views · Published 2025-12
Category: Technical
About this talk
"EmojiChef demonstrates how Unicode emoji characters can be utilized to create a novel non-standard encoding technique mapping to MITRE T1132 (Data Encoding). This talk explores the unexpected intersection of emoji encoding, offensive security, and AI language models. Starting from a late-night hacker conference conversation, EmojiChef evolved into a sophisticated tool exposing how alternative character sets bypass traditional security controls and create unique communication channels with modern LLMs. EP will demonstrate how emojis' visual distinctiveness can enable covert data exfiltration, payload obfuscation, and detection evasion. The presentation showcases real-world applications including bypassing content filters, encoding shellcode, and most surprisingly, using emojis as a natural language for AI interactions where LLMs interpret emoji sequences as both data and semantic commands. Attendees will learn practical encoding techniques, understand the security implications of Unicode abuse, and discover how AI models process these alternative encodings. This talk bridges creative encoding schemes and practical security applications, showing that sometimes the best place to hide is in plain sight—with a smiley face. 🦊"
Transcript [en]

Next we have Erin Packet speaking on "Beyond Base-64: When Your Data Gets Emojional." >> Hello. Thank you, everyone, for coming after lunch. Hopefully everyone had a nice break and enjoyed the morning sessions. Please forgive me if I seem a little bit distracted. Just prior to this talk I was stuck in the elevator for about 20 minutes, and fire and rescue had to come and do the whole typical movie thing where they crack open the door and you have to hop down the little difference between the floors. So that was fun and exciting, and I ran straight

here. But it was nice to talk with some of the other fellow con-goers and meet them, and hopefully you're getting to meet some other amazing people while you're here. So, quick introduction. On the schedule they put me as TBA. I think I'll accept that as a new handle; I have no problem with it. I think it's amusing. Others may refer to me as EP or air interacted. I'm an information security specialist focusing on application security in the financial sector. I've been in my role for a little over eight years now, and prior to that my focus was enterprise architecture, IT as a whole,

and a bit of side dabbling in development. It was never my formal training, but I always enjoyed it. The first language I started playing with was BASIC, at the elementary level. It grabbed me with that mentality and thought process: working through how something is built. It's almost like that first exercise you may have done, where it's explained very clearly, step by step, how to tie a shoe, and it ends up building and becoming very important. If you notice, I don't have the traditional slide deck. I just want to explain what you're seeing on the

screen as I walk through it so everyone can be on the same page. It's a standard IDE. This is Visual Studio Code, which you're probably familiar with. In the bottom left-hand corner I'm running presenterm. It's open-source presentation software, so if you don't necessarily love PowerPoint or want to be drowning in that type of deck, and you want a technical presentation, I highly recommend it. You define your entire presentation in Markdown, you separate each page with just a simple dash dash, and you keep moving on through your presentation. Very effective. And at the end of it, if you want to output to a website or PDF, it's very flexible

because you start with Markdown. And then we obviously have just our terminal here on the right-hand side. Okay, so I also don't love slides. I did come a little bit this time, which I was excited about, I guess, to some degree. I've given this talk one prior time, at HackSpaceCon this spring. It's a great conference in Florida. But this entire concept has been built on over time through conferences: talking with everyone, meeting, expanding ideas. It first came to mind when a friend of mine, several years ago, was interviewing for Amazon and they asked him an interview question. It was, very simply: how would you convert

or encode a string to emojis? He told me about this and I said, you know, that's kind of interesting. Again, I don't really have a formal background in development, but I love to read code and work in different languages. And I started thinking, again with that shoe concept: how do you put those pieces together? How would you tie them up? How would I convert the string into emojis? There are so many different ways you could do it, like alphabet replacement, which is kind of common. And I started thinking about more practical uses. This was several years ago, and my day job continues and life continues. And this

was just a thought experiment. So I set it to the background. Getting back to the expansion of ideas through conferences: this springtime I was discussing this again. Well, I guess it was maybe winter, but earlier this year I was discussing it with some other colleagues and conference-goers, and we said, you know, this is actually kind of interesting. We were looking at other implementations. We started just googling how others are converting strings and different things to emojis, and you start getting into some cool concepts like bit mapping. We ran into some projects, and I will update the slides here. I apologize, I don't have the resources here. It was a bit of

that delay and kind of life, as I mentioned, even elevators. But the repositories that we found most interesting were getting into some cool schemas, basically high-base value mapping: eight bits per emoji, sorry, 8 bits per character, 10, base 1024. This was really interesting because you can become a little more efficient. The problem with Unicode, and we'll touch upon it in a bit, is that when you're trying to move and shift these things about, it can become not really optimal. You end up having a much larger stream than you initially started working with, and we'll see that in the demonstration of the tool. But that's basically where this all

began in our minds. How do we play with emojis from strings? How does the Unicode map look? And then what are some further things you can do beyond just the cute smiley faces?
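The size overhead described above can be made concrete with a little arithmetic. The sketch below is an illustration, not part of the tool. It assumes every symbol is a single code point at or above U+10000 (where most emoji live), which costs 4 bytes in UTF-8, and the name `expansion_factor` is hypothetical.

```python
def expansion_factor(bits_per_symbol: int, bytes_per_symbol: int) -> float:
    """Output bytes produced per input byte, ignoring padding."""
    return (bytes_per_symbol * 8) / bits_per_symbol


# A code point at or above U+10000 takes 4 bytes in UTF-8.
EMOJI_BYTES = len("\U0001F600".encode("utf-8"))

# Classic Base64: 6 bits carried by each 1-byte ASCII character.
print(f"Base64 (ASCII):   {expansion_factor(6, 1):.2f}x")
# Emoji alphabet of 64 symbols: 6 bits carried by each 4-byte emoji.
print(f"Emoji, base 64:   {expansion_factor(6, EMOJI_BYTES):.2f}x")
# Emoji alphabet of 1024 symbols: 10 bits carried by each 4-byte emoji.
print(f"Emoji, base 1024: {expansion_factor(10, EMOJI_BYTES):.2f}x")
```

Under these assumptions a larger alphabet packs more bits into each emoji, which is why the higher bases shrink the stream, although it stays larger than plain Base64 when measured in UTF-8 bytes.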

So, as we may know, the traditional encoding for Base64 would look something like this.
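For reference, a standard Base64 encoding can be reproduced with Python's standard library. This is a generic illustration rather than the content of the slide, and the sample string is an assumption.

```python
import base64

message = b"Hello"
encoded = base64.b64encode(message)
print(encoded.decode("ascii"))  # SGVsbG8=

# Decoding restores the original bytes.
assert base64.b64decode(encoded) == message
```

The output draws only on letters, digits, `+`, `/`, and `=` padding, which is what makes Base64 easy for a filter to recognize.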

I'm sorry. I think when I have it a little bit bigger here, some of the text is getting cut off, and I apologize. Let me see if we can make the whole thing visible.

Yeah, I think we'll go here. This is what we were discussing with existing repositories. Some of these are like emoji coding 2.0, and there are various implementations trying to do similar things. The one thing we didn't find was the ability to create custom schemas starting at your own Unicode code point. There was no ability to have flexible recipes, meaning different base values, and then the biggest piece is going beyond strings, which we'll see in a second: binary files. That's where it got really fun. This chart shows the recipes. I apologize. I think the bigger font is cutting some off and I can't see

it switched. There we go. We'll catch up here. But we'll dive in. And please forgive me if anything fails, or is a little bit slow, or has some issues. Hopefully the demo runs smoothly.

Just the Go.

So, I'll switch back just for explanation. Again, we had the recipes defined there. This one here is classic, base 128. So we'll see that the string is now encoded, and we can do the same thing with different recipes and encoding strings. We have defined light as the base 64. And in typical fashion you do have a help. This is going to be kind of hard to see, but it does explain the recipes, same as you see in the presentation here. You can match them up and start to piece together the idea.

So in the same way you would encode strings, you could simply do a shell command. Just take your netcat, encode that quick string, and now you have it converted. You may be questioning why we do this. I think hopefully it's in the next couple of slides, but it comes down to being able to get past common filters. Your DLP is typically matching on Base64, and that's why I mentioned at the very beginning "beyond Base64." This entire technique, there we go, maps to the MITRE framework as Data Encoding, obviously T1132. The intention is that your adversaries are attempting to make it more difficult to read the content. Simple enough. Base64 encoding, XOR operations,

different ways of making it more unrecognizable but the intent is still the same.

The same intent can be applied against any common files. So you could go against that password, encode it, even compress it, and then send it. zlib is compatible; it's supported within the tool. And similar to what I was mentioning before with DLP detection: in this case it's using a block rule that matches the common credit card pattern of four digits, four digits, four digits, four digits, which obviously would not match once the data is encoded in emojis.
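The DLP point can be sketched in a few lines. Everything here is an assumption made for illustration: the regular expression stands in for a generic block rule, and `emoji_encode` and `emoji_decode` are a toy one-code-point-per-byte mapping, not the tool's actual codec or recipes.

```python
import re

# Hypothetical DLP-style rule: four groups of four digits, optionally separated.
CARD_RULE = re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b")


def emoji_encode(data: bytes, start: int = 0x1F600) -> str:
    """Toy mapping: one code point per byte, offset from `start`."""
    return "".join(chr(start + b) for b in data)


def emoji_decode(text: str, start: int = 0x1F600) -> bytes:
    """Inverse of the toy mapping above."""
    return bytes(ord(ch) - start for ch in text)


record = "card=4111 1111 1111 1111"  # widely used test number, not a real card
encoded = emoji_encode(record.encode("utf-8"))

print(bool(CARD_RULE.search(record)))   # True: the rule fires on plaintext
print(bool(CARD_RULE.search(encoded)))  # False: no digits left to match
assert emoji_decode(encoded).decode("utf-8") == record
```

The rule never fires on the encoded form because none of the resulting code points are digits, while the receiver can still recover the original record.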

Now to get to some interesting things. It's nice to have a tool you can just play with and go, okay, let's encode some strings, let's encode some files. As part of the package there is a defined codec, and that codec Python class you can then implement into any integration that you want to get creative with. I have a Metasploit package. I have hack.chat, which is an open-source, IRC-like chat application that is web hosted, originally by Andrew Belt. Really interesting tool. And I have a demonstration here that will show client-transparent encoding, where the public on hack.chat will see it as just the strings of emojis, but

on the client side, you see it as the clear string text with the burger. I apologize, I didn't have this queued up; again, I was distracted just prior to the session. But if you'll bear with me for a moment, I will open it.

What you're seeing right here: several of us run a kind of side conference that runs between the margins. Anytime we attend any other conference, we're also running our conference. It is not to overlap or interfere with where we are. It is to add to it and share interesting capabilities. It's where we develop ideas like this. And if I go through something.

in the page here you'll see, and it was used within the challenges for this con's CTF that you saw, though I was just putting the initial flag in, so you didn't necessarily get to see the cheat on it. It allows access to this snack chat portal. Once you get past that, the snack chat portal explains what's occurring, and it goes over what we just discussed: that it's stack-based obfuscation using emojis. Users can put the message in here.

See it clearly here. And if anybody wants to check it out publicly, you can go to hack.chat/?snack_con.

You'll join the room on hack.chat. hack.chat does not have any persistent message storage on the servers, so you will not see the message I just sent. I'll send one from the hack.chat room, from public, and we can go back to the... Did I go to the right room? We can go back to the private session. Oh, sorry. It's no underscore, the snack on

them. And there's our emoji message in the public room. So this can be used in many different ways beyond this. Any typical primitive where you're going to use Base64 encoding, you can use emoji encoding. You'll have to be careful with the recipes or the base that you're using, because some services may not support some of the Unicode character set, so they're going to either drop characters or not be able to render them correctly, and your message will fail. So, as with all other types of encoding, you want to verify that your message is correctly encoded prior to transmission. Let's see, slides here. So this is the hack.chat transparent client, and if anybody wants

to play with it, the competition that we just saw up is running as basically, if you will, a conquer-the-hill type challenge. It is open from BSides Connecticut now until BSides Charleston, November 8th. This is our fourth round running it, and we have all kinds of varied challenges, from common to rather obscure. If anybody wants to join us, we'd be more than happy to play.
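The earlier advice about verifying a message before transmission can be sketched as a simple pre-flight check: push the encoded text through the same path and compare what comes out. The lossy service below is simulated and purely hypothetical; `bmp_only_channel` and `survives` are illustrative names, not part of any real service or of the tool.

```python
from typing import Callable


def bmp_only_channel(text: str) -> str:
    """Hypothetical lossy service: silently drops code points above U+FFFF."""
    return "".join(ch for ch in text if ord(ch) <= 0xFFFF)


def survives(message: str, channel: Callable[[str], str]) -> bool:
    """True if the channel delivers the message unchanged."""
    return channel(message) == message


emoji_message = "\U0001F600\U0001F98A\U0001F354"  # supplementary-plane emoji
bmp_message = "\u2600\u2601\u2602"                # symbols inside the BMP

print(survives(emoji_message, bmp_only_channel))  # False
print(survives(bmp_message, bmp_only_channel))    # True
```

If the check fails, one option is to pick a schema whose code points the service is known to pass through intact.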

Slides. So, it would probably be a mistake if we didn't talk about AI in 2025, and emojis. As I mentioned, I shared this first at HackSpaceCon in the spring, and as we were playing there, I had this dream of making my request in emojis and having an AI respond to me in emojis, like my own language, basically. I thought that'd be amusing. I was hammering on it as they were releasing all the different models, and there were so many hallucinations. The models would think they were decoding the strings and give me a reply. They would think they were encoding something and give me a reply. And I would get excited when I saw some of those

responses, and I'd put it through and try to decode the message, and it's not real. I kept playing with it and playing with it. And then Claude released, I think it was around 3.5 (I have it in the documents if you go through the repo, but I think it was around 3.5), something new. Instead of just putting in this code for the emoji encoding by attaching it as a file, as an instruction, which always failed, they added the ability to directly connect a repo or a GitHub to your Claude session under Projects. Once I did that, it was interesting because I

found my first success. And what's beautiful about Claude is you can go back and look at those conversations. So I'm going to open up that history and we can take a look at the evolution of Claude learning the base 64 emoji encoding first, and then trying to improve the package, all within requests and responses in emojis.

You get to see the behind the scenes on a few projects that I've been working on if you were quick enough. This is under the one account. It's kind of fun.

So the conversation started within the project. Again, we can see here the authentication is no longer connected with that repository, but all I did was connect it in the project to my repo on GitHub, and I asked simply: the emoji encoder tool has been updated; please provide some knowledge, and let's communicate using the quick recipe, and encode the following string: "Hello HackSpaceCon 2025." I like to do the live demonstrations; we were doing it that way. And this is when I again was getting excited. It said, "Here's your message: Hello, HackSpaceCon 2025." I was hoping it would use the library, and it was incorrect

and, hold on. Was that the correct one? I apologize. Let me just double check the conversations.

>> Ah yes, it encoded it correctly. I got excited because this was the first time it had not hallucinated and instead gave me a proper encoding using the class. Going forward, then, I said, thank you, in this next conversation we use the codec for communication. So I said this, and it said this back, and I got excited. This is what I was saying. And here, because I want to be stubborn, I can put it into the decoder.

I told it: your previous message failed to decode; try again, in response in emoji, because I didn't want to reply back in just ASCII. I wanted to be stubborn. Just keep demanding: hey, we want to talk with this Python library. I know you can encode; let's do it. And if we look at the analysis, it's interesting because it took the initial library and kind of optimized it while still being valid. And that's where some of the initial failures came from. We can see where it takes that encoded message that I provided, and this is the text it wants to encode: I understand the assignment. Let me

create an improvement. So it was already trying to improve it for me. It was really funny. I've noticed, and probably most of you have if you're playing with any AI, that they always see room for improvement and are very optimistic about it, which is great. But it basically says so here, and this is where I was continuing to see success. Your previous message failed to decode; try again, in response in emoji. What I showed is that it decoded, and at that point it really did understand, and it continued to have a conversation. Look at the analysis: it's basically trying to improve the project itself.

I find this exciting because it opens up, in my mind, all kinds of possibilities. I think emojis are really interesting because they provide a visual representation of different encoding schemas that may not be standard or common. And if you then think outside of that, within really the same application, you could go further along the Unicode chart to the unrenderable blocks, so your entire defined schema will just look like square, square, or a block with a question mark. And when you're then looking back at something that's encoded, it'd be very challenging to understand what that actually is. I apologize that I was not as prepared or organized as I'd

hoped to be, and I hope it was interesting. I'd be more than happy to discuss more about the tool or the concepts with anybody after. Just to show you a couple of things real quick that we didn't necessarily have time for: this is what a file looks like when it is encoded. So again, a long string of emojis. And to decode that

Oh, space

in here. There's a space too many. That's how it goes. Oh my gosh. E code

is Oh my gosh.

Oh no file. Sorry. On strings you indicate with the quotes since it's a file. have to do fine decoded.

There we go. And you have to use the correct recipe or base type, because otherwise it would not decode correctly. And if we then look at our output image, "output decoded test," I should probably rename that, right. You specify dash O for your outputs, and we'll look at our decoded image. There we go. So I hope this was interesting. Does anyone have any questions? I know I took us probably right to the edge. >> Yes. >> Uh if

Yes. Yes, thank you. Just for the sake of time.

That's fine.

This is the repo. I had hoped to have a development branch up. It's a BSides strong development branch that has a bit of the extra integrations, like hack.chat and a Signal integration, if you want to play with sending emojis to your friends with a custom Signal app. Those will be uploaded most likely this afternoon, as I have a moment to take a breath and get some lunch that I missed after the little bit of a fun event. But it's been an amazing time and I really appreciate it. I know there were a couple of other hands here.

standard base

>> Exactly. If you look through the code and the documentation, I tried to do a decent job of explaining how the class works, and there is an option to utilize your own custom schema. You can get weird with it. You could do a base 68 and a weird starting point on that. You can have a weird bits per chunk as well. That is what I would hope to cover in a longer, workshop-style session in the future as this talk develops.
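The idea of a custom schema, with its own starting code point and bits per chunk, can be sketched as follows. This is not the tool's codec class: `encode`, `decode`, and their parameters are hypothetical names, the sketch covers only power-of-two bases (so not the base 68 example), and the end-marker padding is a design choice made here so that any chunk width round-trips.

```python
def encode(data: bytes, start: int = 0x1F600, bits: int = 6) -> str:
    """Split `data` into `bits`-wide chunks and map each to chr(start + chunk)."""
    # Append a single 1 bit as an end marker, then zero-pad to a whole chunk,
    # so the decoder can recover the exact byte length for any chunk width.
    value = (int.from_bytes(data, "big") << 1) | 1
    total = len(data) * 8 + 1
    pad = (-total) % bits
    value <<= pad
    count = (total + pad) // bits
    mask = (1 << bits) - 1
    return "".join(
        chr(start + ((value >> (bits * i)) & mask))
        for i in reversed(range(count))
    )


def decode(text: str, start: int = 0x1F600, bits: int = 6) -> bytes:
    """Inverse of `encode` for the same `start` and `bits`."""
    value = 0
    for ch in text:
        chunk = ord(ch) - start
        if not 0 <= chunk < (1 << bits):
            raise ValueError(f"symbol {ch!r} is outside this schema")
        value = (value << bits) | chunk
    if value == 0:
        raise ValueError("missing end marker")
    total = len(text) * bits
    while value & 1 == 0:  # strip the zero padding
        value >>= 1
        total -= 1
    value >>= 1  # strip the end marker
    total -= 1
    if total % 8:
        raise ValueError("not a whole number of bytes")
    return value.to_bytes(total // 8, "big")


secret = b"hello"
custom = encode(secret, start=0x1F300, bits=7)  # 128-symbol schema
assert decode(custom, start=0x1F300, bits=7) == secret
```

Both sides have to agree on `start` and `bits`. Not every code point in an arbitrary range is an assigned emoji, so a real schema would need to choose its range with care.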

Anyone else? Thank you so much. And I really appreciate it.