
their country, their citizens to reach the outside world and see content and new ideas that are contrary to that regime. Italy, they rank a lot higher on the list, but still not great. Southeast Asia in general, this whole region is quite red, which puts your score down towards the bottom. And Pakistan, right? All the way down at 158. And this is kind of where I get to one of the maybe more palatable arguments. This is from a YouTuber, No Rugrats. He says, "You should download a car." Parker was saying at the beginning, "Oh, I wouldn't download a car." But we all have seen those ad campaigns, right? You
shouldn't download a car. And the title of his video is "Piracy is morally correct." One of the key arguments he makes in this video is that piracy allows for the preservation of digital culture and for the evasion of censorship. Again, this is not legal advice; take what you will from this conversation. I'm still not sure where I fall on the spectrum of whether or not it is right to pirate, but I want to bring up a few other points. VPNs. How many of you use VPNs to watch content from the UK or other countries? Yeah. My family, I used to live in
China, authoritarian country, right? We had to use a VPN to access Google and our Gmail, to access Netflix, doing what the Egyptians are doing to get content that they can't get in their country. And we had no problem with that morally, but that was breaking Chinese law. So you have to ask yourself, if you've already crossed that boundary of breaking the law, is that okay? Maybe not. How many of you have ad blockers? There you go. Just this past month, Linus Tech Tips and a bunch of other YouTubers have been noticing a problem with their view counts and their revenues tanking. And this was due to a funny piece of
code in a GitHub project that a lot of the ad blockers use, in the way it connected to the YouTube API, so that those views were no longer counted, period. So view counts dropped, their availability on the platform dropped, and they lost a lot of revenue. Linus from Linus Tech Tips regularly argues that ad blockers are piracy: you're stealing from the online content creators who get their revenue from advertising. And again, you may or may not agree with that, but it brings an interesting counterpoint to this argument. Is piracy okay or not okay? Who gets the money, and who should be rewarded for their work? There's an example that he
provides in this video that kind of covers the situation that we've talked about. This is a movie review. An individual said he's been wanting to watch this movie. It is only playing in theaters in his country, far away from him. And he found it on a streaming platform, Disney Plus, but only available in India. So he signs up for a VPN. He then signs up for an Indian Disney Plus account, but he needs a phone number. So he signs up for a digital phone number. But Disney blocks fake or digital phone numbers, which a lot of institutions are considering. I know at Utah State we're
considering this or have talked about it. So he found a vendor and bought a local SIM card and then bought the Disney Plus subscription. He tried all of these steps to do it the right way when he could have, you know, streamed it or downloaded it from a torrent. So if piracy is not palatable to you in terms of preserving culture, I guess, because that's another one of the arguments that YouTube creator made: with digital rot, things die, and the way to preserve things is by having copies. We all know the 3-2-1 rule: three copies, two
mediums, and one remote location. It's kind of the same thing in preserving digital media, and in libraries too, like preserving physical books. It's best to have two, three, four, five, six copies because eventually they will rot. It's the same way with the fundamental principle of the blockchain: everybody has a copy of the blockchain so that they can verify truth and reality. But there are other ways. Piracy is not the only way. There are really cool projects focused on anti-censorship and preserving digital culture. One of these is in Minecraft. Have any of you heard of the Uncensored Library? >> You have? Okay, cool. It's pretty
neat, right? It's a project, a server that you can log on to in Minecraft, and they've built a massive Roman-esque library with beautiful architecture and then little zones representing different countries and journalists that were either killed or political prisoners, and you can read their works in Minecraft. The idea being that this is maybe not the first place that the Egyptian government or the Chinese government or the Russian Federation would look to block, and that's kind of the idea behind it. One of the journalists whose work was featured in this library was Jamal Khashoggi, who was assassinated by the Saudi government in, what was
it, 2018, and he was a writer for the Washington Post. We need people like this, we need them protected, and we need their contributions to society as well. This kind of culture shouldn't be erased or censored, and it's great that it is copied and available in places like a Minecraft library, of all places, right? There are other fantastic people too. So I mentioned at the beginning of this presentation the federal administration currently pulling CDC data sets and health research. At the beginning of the year, a lot of that was re-uploaded to the internet because of activists like us who are interested in digital
preservation and in protecting our public data. Two librarians at Harvard hosted a community data preservation-thon, and I had the opportunity to reach out and talk with them and see how it went, what it looked like, and the tools that they used. This is Ashley Thomas. She is the health science and data and digital services librarian at Harvard, and she was incredibly helpful and gave a really thoughtful response in talking about the event. She said they got 20 to 30 people to participate, both online and in person. And although none of them were necessarily super technical or super
nerdy, they were able to scrape together, download, and then re-upload to the internet dozens of data sets. They uploaded to a program called the Dataverse, and I'll get there. But this quote on the screen that she gave me I thought was really kind of poignant. She says, "Whether out of morals, ethics or resistance, data preservationists recognize the importance of shepherding as much of our data and knowledge through this period so that in the future we have a foundation to build on." We don't want to be a less healthy society. We want to be a less divided society.
We want to cooperate with each other and engage with each other thoughtfully, and that requires being able to share our ideas. It requires collaborative research. And it requires trust in each other. So I really appreciated this, and I appreciated her help. This is a great thing that communities can do, and this happened all over the country. There were two data scientists, Rajan Desai and Jeremy Herzog. They formed Fulton Ring, and this is national weather risk data that the US government published and then pulled down at the beginning of this year. They've re-uploaded it to make it available. And they have their
project listed on GitHub if you guys are interested. But this is the kind of thing that we need, and that maybe we just accepted was free to us: that the government would give us this data and that we'd have this open collaboration. But that won't be a constant, as has been proven by the events of this last year. And we need people like these two who are on their game, anticipating the censorship and the removal of the data and then building tools to protect it. So those are some of the people. Here are some of the organizations.
You've all heard of the Internet Archive and the Wayback Machine. But there's Flashpoint Archive. They're saving Flash games from the early 2000s. So all those games that you had so much fun playing, they're available to you, to your kids, and to your grandkids, right? The Dataverse Project is focused on academic research and protecting that. Perma.cc: remember at the beginning I was talking about link rot and how that's an issue in court cases and legal documents and kind of across the whole internet? With Perma.cc, you send them a link that you want to have cited in, say, your document, your research paper, your court case; they
download that web page as it is in that moment and then provide a permanent link. Really neat. CDC Restore, they uploaded a huge trove of data from the beginning of the year, right after that big removal. And there are other organizations that are comparing this to what's out there now: what's been changed, what's been modified, what's been altered. And you'd be surprised, but there is quite a bit. These are just some of the cool ones. The Digital Preservation Coalition, they compile a list of media formats that are going extinct, like file types. When you consider digital preservation, you also have to consider: are the tools that
I'm using to read my files and my data going to be around in the future? And so they keep a list of data types that are maybe more open access, open source, and will have the best chance at saving that data and preserving it for the future. So what can you guys do? We talked, right, you all have home labs. So there are ways that you can participate in this data preservation effort, and that's where I get to Jason Scott. He founded Archive Team. They work closely with the Internet Archive, and they've built a tool, a virtual machine that you can host and that does the data archiving for you.
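As a rough illustration of the claim-fetch-upload cycle the talk goes on to describe, here is a minimal sketch in Python. It is not Archive Team's code or protocol: `Tracker`, `fetch`, `run_worker`, and the in-memory `web` and `archive` dictionaries are all hypothetical stand-ins, and no network access is involved.

```python
from collections import deque


class Tracker:
    """Hypothetical stand-in for a project tracker that hands out work items."""

    def __init__(self, urls):
        self.queue = deque(urls)

    def claim(self):
        # Hand out the next item, or None when the project has no work left.
        return self.queue.popleft() if self.queue else None


def fetch(url, web):
    """Pretend download: look the URL up in an in-memory 'web' (assumption)."""
    return web.get(url)


def run_worker(tracker, web, archive):
    """Claim an item, fetch it, upload it, and repeat until no work is left."""
    saved = 0
    while (url := tracker.claim()) is not None:
        body = fetch(url, web)
        if body is not None:      # skip pages that have already vanished
            archive[url] = body   # 'upload' the capture to the archive
            saved += 1
    return saved


# Usage: two pages still exist, one has already disappeared.
web = {"example.org/a": "page A", "example.org/b": "page B"}
tracker = Tracker(["example.org/a", "example.org/b", "example.org/gone"])
archive = {}
saved = run_worker(tracker, web, archive)
```

The point of the pattern is that the coordinator only hands out small items, so many volunteers can each run the same loop in the background without coordinating with one another.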
You never have to look at it. They maintain a list of projects that they prioritize to fulfill their mission of protecting web culture and digital media. This is kind of a description of what the organization is. Jason Scott describes it as a loose team of rogue archivists and programmers. Loudmouths, you know. Sounds familiar, right, to the cyber security crew. And they built this tool, the ArchiveTeam Warrior. He jokingly called it, when he first released it, the "distributed preservation of service," kind of a spin-off of DDoS. And the idea is that you can run a virtual machine on your
local computer or in a Docker container. And what it'll do, it'll ping their list of projects that they've prioritized. They'll send work to your instance, and then it'll go out, it'll collect the data, and then it will send it off to the Internet Archive, and it just rinses and repeats and runs in the background. And so that's kind of... man, I kind of flew through this. So I've got a few more things to talk about, but that's what you guys can do, right? Maybe instead of, or maybe alongside, your Plex server, you also have this little Docker container running,
right? It's a way to participate in this movement of digital preservation, which is vital. When it's distributed and preserved, we have the greatest chance of keeping it around and evading government censorship but also digital rot. I'm going to skip past this about-me slide. We already talked about me. So then I also compiled a list of tools that you can use to do this kind of work for various formats. Web archiving: I talked about the ArchiveTeam Warrior, but there are also programs like SiteSucker, which download whole websites including their digital content, photos and videos and images. Downie 4 is a YouTube downloader, but you can also find Python packages on
GitHub that do the same thing. Music: and maybe this gets into the realm of piracy depending on how you decide to use it, but there are tools on GitHub to remove the DRM from your Spotify downloads. There are CD rippers. That's a pirating site. HandBrake is a great tool. You've all probably used that before as well. Ebooks: there's a great tool called DeDRM, and that takes you out of vendor lock-in when you buy digital ebooks from resellers like Kindle or Kobo or any others that you can think of. And video games. This is, I think, really interesting too: we have seen a whole
groundswell culture show up on the internet from people that are actively trying to save and archive old video games. And what it's also done is it has birthed a new culture of people that are designing games for those old consoles. This is a GBA game jam from 2025. People are still writing Game Boy Advance games in 2025. How do we have that? We have it because people built tools to dump ROMs and make them available. And yeah, so to tie everything together: digital preservation is vital not just for preserving culture but for evading censorship and preserving a community identity. And there are things that you guys can do now to help that
project. Uh any questions? >> Yeah.
I guess I have two real questions. One is
two.
software,
>> Right? 100%. I think that's a totally valid question to ask. Yeah, the Internet Archive has gotten into some hot water because of the work that they do. In 2020, during the COVID pandemic, they had a program, what was it called? They made their library and a lot of books available. They would lend out multiple copies at once. Companies sued quickly because they said that went beyond fair use and violated the laws that govern how libraries work in this country. It's a very valid point. And I think there was another part of your question, right?
>> So
there's reason why
But I think as it relates to the US
>> Yeah. I mean, so that last part of the question I'll... was that okay? Right. So maybe the question about piracy, whether it's ethical or not, is the wrong question. Maybe the question is whether we need to change our approach to what copyright law looks like and what fair use looks like, to align with more ethical considerations. Any other thoughts? Yeah. >> Yes. >> Yeah. I was thinking about this today because there was a lot of AI conversation today, and all of a sudden ownership of your data becomes incredibly important, because if it's accessible to large language models for
scraping, and all this work that the Internet Archive has done is now feeding these large language models, yikes. They can reproduce your data, your copyrighted works, word for word. So that sucks for publishers and for authors. But then what happens after that? Do we compartmentalize, shut everything off? Everything is a license agreement. Nobody gets to own any of the data that they purchase. So the book that I buy is no longer mine. The movies that I watch are no longer mine. It's a complex issue, and I think it needs to be solved for sure.
Yeah. So some of the causes are: people take their servers offline, companies are breached, companies go out of business. There was an old website called MOCpages. Anybody a fan of Legos in here? Yeah. So MOCpages was a community photo sharing platform for people to upload photos of the creations that they built, their Lego models. It was a big community, and kind of out of nowhere they shut down, and a lot of those pictures, a lot of those models, gone kind of forever. So that's the way things are as well, and
how it ties to large language models. I think that's an interesting question. I've not thought about click-through rate and the impact that has on on data preservation, but I think that's also a really unique angle to take. Do you have like anything to follow on add on?
Thank you guys. Thank you so much.
>> Yeah, just a little bit.
>> Oh, thank you.
Yeah.
>> Okay. >> I um I did I feel like I was like >> I did end pretty I didn't go long enough but >> it's fine. >> That's true. I >> end when it feels good >> like
I'm about to dedicate like 10% of my bandwidth. >> I I'm just feeding I'm just feeding the community's data hoarding instincts. >> Dude, it is so Sorry that didn't work. >> It worked fine. >> I know. It's whatever. >> Great job. >> Thank you. >> There's my card if you ever want to follow up. >> We'll do cybercore intel. So, we're familiar with a couple. >> Well, thank you so much. >> Very good. Love to hear about your master's research is focus or >> I'm a undergraduate. >> Okay. >> Yeah. >> All right. So, uh, no, this is this is just something that I'm I'm interested and passionate about. Um, BYU, but then you're continuing finish your off.
>> Yeah, I was actually I was originally a Arabic and Middle Eastern studies student. >> And then I switched over to do information systems and cyber security and I actually just recently switched to data analytics. Um, so, uh, I have I have a broad range of passion. uh data analytics I think I'm really excited about because it'll enable me to kind of like supercharge uh my like research around like social topics and social issues which I'm also clearly incredibly passionate about. Yeah, we tend to focus a little bit of class on laboratory social issues, but we have an entire division dedicate which is like five department various AI and data sciences group but more focused on sort of
side reactor. >> Yeah. >> Yeah. optimize energy systems on the social issues if you will other than maybe social acceptance of clean energy like that. >> Anyway, very interesting. >> Your family >> uh my dad uh works for a um >> he works for Siemens Healthcare. Uh they build um like MRI machines and imaging medical imaging equipment. So they he was setting up the like procurement in one of their factories in China. So we lived there for two years >> doing that. >> Do you have a favorite? >> No. >> Uh well wait was was Fire Boy and Lava Girl Splash. Yeah. Yeah. Okay. or or the or the um the like the the block that
was like too big like too tall >> by like one by two that you would like rotate around a platform with a hole in it. >> Mhm. Yeah, >> there was another one that I like played that I can't remember what the website was all was called, but there it's like when we when we go to the computer lab in school, that's like what I do classes like this. >> Oh yeah. like [Music]
they
Yeah, there was a lot more I could talk about, but it would have gone a little tangenty >> like talking about whether or not it's worth like whether or not it's actually worth preserving every piece of data.
I don't think you need a little clip on his
video.
First of all,
and it's just >> this a good place for it. >> Okay. All right. And I suppose I can >> Okay.
Yeah, that's that's I'll try to keep it in that ballpark. Maybe I'll make some marks on the table to remind me where it is.
I
Okay.
>> Okay. I'll have to reading. >> It seems like this is a relatively rapidly developing field. >> So, I found this really good paper. This may not be the best reading paper ever, but it is a journal article, so it's 15 pages long or so and it's got lots of references. So if you need to dig into more detail, like the Valorant stuff I'll dig into, that is actually referenced by this paper but it's not in this paper. This one's a great one to start with. >> Vanguard. Vanguard is the one that is used by Valorant and League of Legends.
>> I don't understand this was going to be recorded. And and actually like unlike other talks, you're not going to be able to see me >> because the only video feed they is that have the camera and so they're just going to video that. So So that might actually be a good thing.
Here's a website.
>> Yeah. Well, I I would switch this switch this to dark mode, but >> it's a bit of a hassle at this point.
>> Why would I need more? Right. you know, we're going to talk conceptual anyway, so we're not going to be like putting code up or anything like that. And frankly, if you saw my design for slides, I'm aesthetically challenged. >> So, get you out of the logo, the template. >> Yeah, I can't draw ISU's logo. >> Actually, do we have an example of the ISU logo? >> Somebody pointed this out to me and now I can't unsee it. >> Um,
So I see is I think oh ISU you know great there's our spirit mark somebody pointed out yeah it's just a dollar sign >> now I can't unsee it >> but then again I've also seen what our tuition rates are we are not the problem >> yeah well >> right we are so cheap with the um We have some contracts with Iowa State. >> Yeah. >> I just have to entirely not use ISU. >> Right. Right. Yeah. And we got there first. >> That's why we're ISU. And they're not >> on ISU.edu. That's >> on the internet. We got We got there first.
He wants to
Joseph. Sorry to throw a wrench into things. It's just what I It's what I do.
>> Okay. >> Yeah. >> Who doesn't want to see the shirt, right?
>> Oh, like the reflection. >> Okay. >> All right. Oh, that's right. And it like looks for a hand motion to start tracking or something. >> So, just keep my hands away.
I've never even
There's another
I should have a picture.
What's that?
So, I checked next door. They're still going, right? They're still using that extra 10 minutes late start. So, we're going to at least start five minutes late. session. They're taking they're taking up the break time. >> All right.
>> So, we could do show tunes,
>> right? Yeah. try to do >> like shadow signs or something. I don't know.
Well,
This... it'll still be interesting. I promise. >> Rootkits are always interesting.
It's kind of
This I
think weird.
All
All right.
>> You'll see.
Did they finally let let him go? >> They finally let him go.
>> Two blue lights. >> But that's But that's >> right. Yeah. Okay. So, we do
>> know. >> Oh, yeah. >> Right. Yeah. Fair enough. Fair enough. >> Just one.
>> Yeah. the you if you have the choice between knowing the material and great slides, know the material, right? >> So, >> yeah. >> Yeah. >> Although we are in my house here. So,
I guess that is U of I room, but I'm just go and pull the audio up just to check the mics. Use the mouse. Let me grab this bar and bring that up. Right there's good. Test test. This one isn't working.
Okay.
Okay. Well, welcome to the afternoon session. It's my great pleasure to introduce our next speaker. You really know you're dealing with a talent or a legend, or whatever phrase you want to use, when they turn in a bio like "I teach, I hack, I play CTF, I program." So, let me add a little bit. When he says that he does CTF, you're looking at the person that effectively wrote the kernel for the DARPA Cyber Grand Challenge, which was the ultimate of CTF. He's had a variety of different roles at the laboratory and in other endeavors. He's presently, one of his associations... >> Take my class. >> ...programs, and teaches programming and teaches computer
science. So we are entertained by his nonto slide. So without further ado, Jason. >> Hello. All right. So we're going to go back in time. It is January of this year, and two of my students walked into my operating systems class, which is Computer Science 4461. I look forward to seeing you all in it. Some of you have been in it. Some of you will be in it. So anyway, they walked in, and they were both on the esports team. That's Liam Sirill and Isabelle Dherty, and they said, "So how does kernel-level anti-cheat work?" and I said, "I have no idea what you're talking about." So, our esports team, this was their second
semester existing, and I had gotten interested in what they do, watching them play their games and that kind of stuff. So I was at least familiar with the idea that these things exist, but I didn't know how they worked either. So I said, "Well, I'll tell you what. I've got some spare time at the end of the class. Why don't we spend the remainder of the semester looking at that?" So that's what we did, and you, well, you're going to hear it for the second time. The rest of you are going to experience what we talked about. Unfortunately, we did it in about a week and I'm going to try to
compress it down to a day. We'll see how it goes. Okay. Ground rules. If something's unclear, stop me right then. Speak up. If I'm not speaking loud enough, particularly folks in the back, I know that I mumble. I'm not always aware when I'm doing it. So just give a thumbs up to tell me to speak up. It's totally okay. I'm aware that I have the problem, not aware when I'm actually doing it. Okay. Yeah. And this is about as high-tech as I get for slides for a class like this. I do have one other slide, just because I don't want to make you wait while I draw it, but we're going to come to that. We're
just not there yet. So if you were going to take a snapshot of something, this is the thing to snap. This is not my original work. I want to be absolutely clear about that. I found this paper really helpful. It's not just that it is a good paper by itself, but it also links to an awful lot of other resources. So: "If It Looks Like a Rootkit and Deceives Like a Rootkit: A Critical Examination of Kernel-Level Anti-Cheat Systems" by Dorner and Klausner. Great paper. It's like 15 to 20 pages long. It goes into extreme detail, which I'm going to summarize very quickly. But first we need to kind of level set. So we're in an operating system class.
And it's a 400-level computer science class. Some of you are not there. Okay. Hopefully this will still be valuable for you. All right. Let's talk about the protection model for processes. Oops. Okay. So this is the Unix/Windows model. I have two processes that get created. What's a process? >> Yeah, it's a chunk of instructions in memory. That's a good starting place. So a program: you write it in whatever your favorite language is. It gets compiled. If we're talking about languages that are sane, it gets compiled to an executable. That executable is a program. It can't run by itself. The operating system
has to add some stuff to it. So in a process, there'll be some state information: the value of all of its registers, its page tables, its memory, etc. All that gets added by the operating system. So a program gets loaded off a disk and creates one or more processes. Okay? And all these will exist in an operating system, which controls switching between them. We're living in a world where we only have one processor, for the sake of argument here. That's not the real world, at least not on any device that you have, except for maybe microcontrollers these days. Multi-core CPUs would mean we'd have more; those show up as multiple processors, right? Multiple threads
of execution. We're not living in that world, just for the sake of argument, for the next 45 minutes. You can have your cores back later. Okay. All right. So my process gets loaded off of disk, and once it's running, it's the operating system's job to make sure that one process doesn't interfere with another. Each process has the illusion that it's the only process running on the machine. It's balderdash, because odds are you want to stream music and write code at the same time. Okay. So obviously we need to be able to switch between these things. But as far as each process is concerned, they're
completely independent from each other. And the operating system goes a long way toward ensuring that illusion. Okay. Can one process write into another process? >> What do you think? >> Interprocess communication, but that's mediated by something. What were you going to say up here? >> Not directly, but... >> But through the operating system, right? Yes. That just fills in the blank between the two. And the reason I'm repeating is I forget to repeat things. If I don't repeat them pretty quickly, there's not much left up here. Okay. So, in order to get data from one process to another, the operating system's got to be involved. And let's do an example. Eventually, your process might want to read
data. Good chance you might want to read a configuration file. Good chance you might want to read something off the network. The operating system will, on your behalf, go perform that operation and then write it into your process's memory space. So the operating system has full access to write to your process all day long. What about more directly? Are there more direct ways for a process to write to another? >> What's that? >> You could do DMA; that's mediated by the operating system, though. Well, somebody's got to initiate it. And we don't allow user processes to access DMA directly because we all know that's bad for all definitions of bad. >> Could maybe hack the operating system.
Let's assume the operating system is not Windows and may withstand some form of attack. Okay. I'm sorry. What? Oh, I never mind interruptions. Okay. I really don't. As long as we're like roughly following the topic, I'm totally good with it. Okay. So, in my class, what did y'all do, Ralph?
Yeah. Okay. Well, okay. >> Okay. So, I gave them an assignment. Has anybody played the old text game Rogue? I know you have. Okay. You have. So, imagine it's a dungeon crawler adventure, but it's all in text. >> Okay. And you use your arrow... well, if you're running the original version of it, you're using vi keys to get around. More modern versions allow you to use arrow keys. That's pretty new technology. We're in the 80s now. Okay. The key thing is it's a dungeon crawler, so it has hit points. So when you interact with a monster, it will start to fight you and you'll lose hit points. I had my students make it so their hit points never went
to zero. In fact, one student set them to like 10,000 and made sure they stayed there. That was actually Isabella, as a matter of fact. How was she able to do that magic?
>> Ah, where are they stored? So where is the data for a process stored? Not a program, but a process. >> In its own... >> So each process has its own memory space. What I'm depicting here is actually each process's memory space. I, as a single process, can write to anywhere in my own process space. That's not completely true: I can't write over the text of the program, the actual code; those pages are read-only. But I can write anywhere the data can be written. But my students were able to cheat. Now, I know you all write perfect code the first time. You never have to use a tool called a debugger. How does a debugger work?
>> Okay, it asks the operating system, and in the Linux world this is called ptrace. It's process trace. In the Windows world, it's something far more complicated, and I just don't recall what it is off the top of my head. But let's say this is process 42 and this one's 54. I can ptrace process 42. And here's what it allows me to do. It allows me to stop it, start it, and, without operating system intermediation, read and write its memory. Okay. So, how do we cheat? Use ptrace. But it's more elegant than that. That's where we start. Okay. So, dungeon crawler. Everybody knows what I'm
talking about when I say dungeon crawler, right? Okay. I'm not that old, I guess. All right. What if you search through all of the memory of your target process while you're debugging it live? Now the process is running and you want to find where the hit points are stored. Here's how you find them. This is kind of a hack, but it's kind of cool. I know that my hit points start out at 12, so I go looking through the entire address space of the process and I find every occurrence of the number 12 and note them. Put that aside. I then go interact with the monster. I get hit. Now I have nine
hit points. I go through finding every location that has the number nine in it, and set intersection tells me where hit points are stored. Right? There's probably only one place that would have gone from 12 to 9. Sometimes you have to do it multiple times, but this works for the most part. Okay, so now we know where the hit points are. Since we can write to another process's memory, we can just say, "Oh, our hit points dropped. That didn't happen. Let's set them back up again." So, we had 12 hit points. We went to nine. Now we have 10,000 hit points, for instance. Okay. That's the environment that games have to run in. I'm going to switch gears.
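A minimal sketch of the search just described, run against a simulated block of memory rather than a live process. The layout (aligned 4-byte little-endian integers, hit points at offset 16) and the name `find_value` are assumptions made up for illustration; a real tool would read the target's memory through a debugger interface such as ptrace.

```python
import struct


def find_value(memory: bytes, value: int) -> set[int]:
    """Return every 4-byte-aligned offset whose little-endian int32 equals value."""
    needle = struct.pack("<i", value)
    return {
        off
        for off in range(0, len(memory) - 3, 4)
        if memory[off:off + 4] == needle
    }


# Simulated target memory (hypothetical layout).
memory = bytearray(64)
struct.pack_into("<i", memory, 4, 12)    # unrelated variable that happens to be 12
struct.pack_into("<i", memory, 16, 12)   # the real hit points
struct.pack_into("<i", memory, 40, 9)    # unrelated variable that happens to be 9

# Pass 1: note every location holding 12.
first_pass = find_value(bytes(memory), 12)

# The monster hits us: hit points drop from 12 to 9.
struct.pack_into("<i", memory, 16, 9)

# Pass 2: keep only locations that held 12 before and hold 9 now.
candidates = first_pass & find_value(bytes(memory), 9)

# With one candidate left, write the new value back.
(hp_offset,) = candidates
struct.pack_into("<i", memory, hp_offset, 10_000)
```

If more than one candidate survives the intersection, the same narrowing step can be repeated after the next hit, which matches the "sometimes you have to do it multiple times" caveat.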
So, do we have any Valorant players? >> Oh, go ahead, please. Probably not a Valorant player, by the way. >> Uh, every system I run has an MMU. >> Yeah, that's true. It depends: if we're doing multiprocessing without an MMU, these things have to be located in different chunks of memory. >> So like you're thinking things like Is it trace the trace tells the writ? >> No. Process 54 says ptrace process 42. And now process 54 can read and write to 42's memory. >> They are in different virtual memory spaces. >> Yeah. The debugger can interact with it and access the page
tables of that other process. There's a caveat there, but we'll come to it in a minute. So, any Valorant players? We've got a couple. Overwatch, Marvel Rivals. Oh, wow. Okay, you're running the gamut. League of Legends? So, yeah, we get at least one League of Legends. Okay, so Valorant and League of Legends are the ones we're going to spend the most time on, because Riot's Vanguard engine is such a total rootkit. It's amazing. NetEase, which is used by Marvel Rivals, is also bad, but not as bad as Riot Vanguard. Okay. So what allows this to work, and the caveat that I mentioned, is that in the world of operating systems we have this concept of a user. Operating
system people will usually call this the principal. We're just going to dispense with that: they're users. And the ptrace thing only works in two cases: either the ptracer is root, and root can do anything, or processes 42 and 54 have to be owned by the same user. So in other words, I can ptrace my own processes, but I can't ptrace your processes running on the same machine. This is the Unix model of things. I can do anything I want in my sandbox, but your sandbox is different, unless I'm root, and then all bets are off. See, did I have to? No, I didn't have to give them root access. I don't like giving students root access
to my machines. Yeah, not unless they ask really nicely. Although I do say, "Go poke around on my machine." And I've had one person say, "Hey, should I have access to this?" And luckily in that particular case, it was, "Yeah, you can have access to that. It's okay." All right. Okay. So that's the environment we're in. Now, let's look at it from the point of view of a game developer. A game developer, particularly for online multiplayer games, wants to accomplish some goals. They want their game to be fun. They want you to spend money on it. Okay. So, given those two requirements, what else do they care about?
Yeah. Go ahead. >> Fair. Game's got to be fair. If the game's not fair, you won't spend money on it. You'll stop playing it. Okay, let's talk about what they care about. All right, fairness. What else? >> The game developer or the producer or whatever. >> Well, that's really a fairness thing, right? >> Yeah. I'm just going to lump it in that category. So, gender and then we'll come back to Patience.
Oh yeah. Okay. I'll call that DRM. It's kind of close. Patience and then over here. >> Get a rid of a fact. >> Tell me what backb
Okay. That's probably going to be server-side stuff. We can't deal with that here. >> Yeah, you just create a new account and keep playing, >> right? >> Okay. >> Oh, yeah. So, if it crashes a lot, you won't spend money. You will stop playing it. >> So, >> but there are two kinds of reliability that we care about. It's not just that the game doesn't crash. Okay. Another something. Okay. Go ahead. Performance. Yeah. If we drop the frame rate to two by adding our anti-cheat measures, nobody's going to play the game. Nobody spends any money. Not any fun. Okay. It's got to be fast. And that's probably, frankly, the most important thing. Got to be fast. If it's
good, that's even better. >> I love reading kernel code. It's not everybody's cup of tea. But, you know, show me a well-written API and I'm totally there. All right. So, I'm going to add the word reliability again, but what I'm getting at is the idea that, um,
I'm not using very technical terminology here. If I accuse you of cheating, I need to be right most of the time. I can be wrong some of the time. My daughter's boyfriend got banned from Overwatch. He's never cheated in his life, and he got a one-month ban. At least he says he's never cheated in his life. I don't know. Let's take him at his word, though. So in other words, he got accused incorrectly, and we can accept some amount of that. You know, as long as we're not banning half of our users. That would be bad, right? We lose revenue. And another thing that we'd like to be able to do
is identify the principal, identify who cheated, so that we can make sure they don't just set up another account like Patience is going to do. >> Okay. Go ahead. >> Uh, digital rights management. Making sure that you get what you pay for and nothing else. That's the way I think of it. You probably have a better definition than that, but let's go with mine for now. Okay. I'm going to spend some time on this identification thing. How do you identify that somebody is who they say they are? They could just tell you. >> Yeah. Serial numbers on components. What components have serial numbers? Hard
drives. Every hard drive has a unique serial number, and the operating system, Windows, very helpfully puts it in the registry. Thank you, Windows. Uh
>> Yes. So every network interface has a burned-in ROM that has a unique number assigned to it, a globally unique number, and that's called a MAC address. Media access control, if anybody cares about the acronym.
So if I see somebody with a particular disk or I see somebody with a particular MAC address, I can say with some confidence that's the same computer. Does that make sense? So when I ban somebody, I may take these two things and say: if I see somebody coming to play my game with that disk serial number or that MAC, or some subset of them even, then they're not allowed to play. Let's see. Uh, let's see.
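A small sketch of that ban rule, assuming a ban list keyed on the two identifiers mentioned so far. The `Fingerprint` class, the `min_matches` threshold, and the sample serial numbers and MAC addresses are all hypothetical; nothing here reflects how any particular anti-cheat product stores or compares identifiers.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Fingerprint:
    """Hardware identifiers collected from one machine (illustrative only)."""
    disk_serial: str
    mac_address: str


def shared_identifiers(a: Fingerprint, b: Fingerprint) -> int:
    """Count how many identifier fields are equal between two fingerprints."""
    return sum(getattr(a, f.name) == getattr(b, f.name) for f in fields(Fingerprint))


def is_banned(candidate: Fingerprint, ban_list: list[Fingerprint], min_matches: int = 1) -> bool:
    """Reject the candidate if it shares at least min_matches identifiers with a banned machine."""
    return any(shared_identifiers(candidate, banned) >= min_matches for banned in ban_list)


ban_list = [Fingerprint("WD-EXAMPLE-0001", "02:00:00:aa:bb:cc")]

# Same disk, different network card: still matches on a subset of identifiers.
same_machine_new_nic = Fingerprint("WD-EXAMPLE-0001", "02:00:00:11:22:33")
other_machine = Fingerprint("ST-EXAMPLE-9999", "02:00:00:dd:ee:ff")
```

Matching on a subset (`min_matches=1`) catches someone who swaps one component, at the cost of more false accusations, for example against the next owner of a second-hand disk.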
BIOSes have serial numbers, motherboards. Intel had the wacky idea in the late 1990s to add a unique number to every processor. Fortunately, they got rid of that, because the first thing every operating system of the day did was disable it. So why would you ship a feature that everybody disables? It was a part of the CPUID instruction. There are some missing leaves, if you go through the Intel documentation for CPUID, that you can discover. You can kind of see where they got deleted, because with it you could uniquely identify a particular processor. >> It doesn't really matter. You can still identify the machine, >> right? We have all this other stuff, but
most of this other stuff was added after that though. >> Hard drives in the 1980s and 90s didn't have serial numbers that you could query, at least not through the normal operating system methods. There were usually vendor-specific tools you could use to get it out. So okay, let's talk about how anti-cheat does its thing, and keep in mind that what we're trying to accomplish is fairness. Making sure that the users get what they pay for and nothing else. It's got to be reliable. It's got to be fast. And we've got to have good accusations. If I say you're cheating, you're cheating. If I say you're not cheating, I don't really care.
And I've got to be able to identify you. All right. So, let's go back to our example in the simplified world of Unix processes. I am process 42. How do I protect myself from process 54?
>> Yeah, we could switch users. However, you ultimately own your machine, right? At least I hope you do. If you've stolen it from somebody, don't tell me. But you own your machine. So, if you think of this as kind of a right-to-repair thing, you should have access to all the pages. I love that your talk was right before mine. So I should have access to all the pages of every process on my computer. I paid for that DRAM, right? And as a user process I can ultimately promote myself to root, because I own the machine, and now I've got access to the entire address space of the entire computer.
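The ownership rule from earlier in the talk, and why switching users does not help against the machine's owner, can be written down in a few lines. This is a deliberately simplified model with a hypothetical function name; real Linux systems layer further checks on top of it (capabilities and the Yama `ptrace_scope` setting, for example).

```python
ROOT_UID = 0  # conventional Unix user id for root


def may_trace(tracer_uid: int, target_uid: int) -> bool:
    """Simplified Unix rule: root may trace anything, others only their own processes."""
    return tracer_uid == ROOT_UID or tracer_uid == target_uid


# Hypothetical uids: the game runs under its own account, the cheat under the player's.
GAME_UID = 1001
PLAYER_UID = 1000

blocked_as_player = may_trace(PLAYER_UID, GAME_UID)   # different owner: refused
allowed_as_root = may_trace(ROOT_UID, GAME_UID)       # owner of the machine becomes root
```

Running the game under a separate account blocks the first call, but whoever owns the machine can always take the second path.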
Uh-oh. How do you hide from the operating system itself if the operating system is not trustworthy? Okay, that's impossible. I'll just go ahead and give that away. But there are some ways we can detect when things are going wrong. That's where kernel-level anti-cheat comes in. It has to be kernel level because any protections we put on our own process can be easily disabled by somebody else. Easily. If you have a bunch of checks, I can attach to that process and say, those checks, how about I replace those with a whole bunch of no-op instructions? We'll just paint right over that. I can do that because I can always access every page in my machine.
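A toy illustration of "painting over" a check with no-ops, and of one way such tampering could be noticed from outside the process by comparing the in-memory bytes against the copy on disk. The ten bytes below are made up to stand in for a tiny function containing a five-byte call to a check; the helper names are hypothetical, and a SHA-256 comparison is just one possible choice of integrity test.

```python
import hashlib

NOP = 0x90  # the x86 single-byte no-op instruction


def patch_with_nops(code: bytearray, start: int, length: int) -> None:
    """Overwrite `length` bytes at `start` with no-op instructions, in place."""
    code[start:start + length] = bytes([NOP]) * length


def digest(code: bytes) -> str:
    """Fingerprint a blob of code so two copies can be compared cheaply."""
    return hashlib.sha256(code).hexdigest()


# Stand-in for the code as it sits in the executable file on disk:
# push ebp; mov ebp, esp; call <check>; pop ebp; ret
on_disk = bytes.fromhex("5589e5e8000000005dc3")

# The loaded copy starts out identical.
in_memory = bytearray(on_disk)
clean_before_patch = digest(in_memory) == digest(on_disk)

# The attacker attaches and replaces the 5-byte call with no-ops.
patch_with_nops(in_memory, start=3, length=5)
tampered = digest(in_memory) != digest(on_disk)
```

The point of doing the comparison from the kernel is that a check living inside the game process could itself be patched out the same way.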
All right. So, that's the world we live in. So, how do the anti-cheat guys do what they do? Let's see. There are kind of four major products that were considered in the paper that I listed at the beginning. There's BattlEye, which is used by Rainbow Six Siege and PUBG. I've never played either of those. Easy Anti-Cheat, which is Epic. That's Fortnite, Apex Legends. FACEIT, Counter-Strike: Global Offensive. Never played Counter-Strike: Global Offensive. It's not my thing. I also can't play. I have played Valorant, but I'm too slow. I'm old, right? I walk into a room, I die. That's what happens in Valorant. Um
but it is still the most interesting. And Vanguard: their paper says that Vanguard is only used for Valorant, but that's not true anymore. It's used for League of Legends now. So what most of these packages do is they show up as another kernel driver. In other words, they run as a part of the operating system itself. They get loaded here. They're just another part of the operating system. So we can do things like scan every process on the machine. We can look to see whether the code loaded off the disk for valorant.exe is the same thing that's in memory. We can do those kinds of checks. Those checks
are kind of slow but we can do them. And only the operating system can do that. Please, just speak up. It's totally okay. >> I don't know if this is but
Exactly. What makes it different from malware? And I'm not making fun of you. You should be asking that question. And by the end of this talk, you'll be asking, you know, do I really need that game? Should I have a separate computer for that? Yes. Yes, you should, is the answer. There is very little difference between Riot's Vanguard and your average rootkit. Very little difference. Okay. So the difference is Riot is going to look at valorant.exe and try to make sure that what came off the disk is still what's in memory, that it hasn't been patched by another process. It will scan other processes looking for known
cheat software. Rootkits would look for things like antivirus software, antivirus processes, your Norton Utilities or whatever. They would load kernel drivers as well if they could, to make sure that they're not going to be discovered. The best way to hide is to be in the operating system. Even better is if you could be in the hypervisor. Okay, so a technique. Windows has this idea of CreateRemoteThread. Is anybody familiar with CreateRemoteThread? It is the dumbest thing I've ever heard of. I'll describe it. It allows me to create a thread of execution inside another process, which means I get to run inside another process's memory and I get to supply the code for that.
It's an API call. Look it up: CreateRemoteThread. So in other words, I can create, really, kind of a process that's running inside the memory of another process. So I can mess with it on the fly. >> Yeah, they do have to be on, well, it is a thread created inside another process. So it becomes just another part of that process. Yeah. >> Yes. The one that's spawning the thread and the remote process both have to be owned by the same person. But remember, I can be root, or administrator if you want to talk in Windows terms. Okay.
Okay. So given that I can do this, and that's bad, how does Riot detect that this has happened? And this is actually kind of elegant.
>> Sort of. I can also ask the operating system where the data for something came from. And here's where that becomes important. I can ask the operating system: so this page of code that you're executing right now, where did it come from? Did it come from a file on disk? If it came from a file on disk, I can go compare the two. And usually with CreateRemoteThread, for things that are nefarious, there will be no associated file. It doesn't exist on the disk. Basically, I can inject arbitrary code into another process and run it without having a file on disk. So, what Valorant, or excuse me, Vanguard does is it scans a process to
see, and I've had little ones. It's not going to bother me a bit. >> Totally fine. So, Riot can scan its memory and look for pages that have been brought into the process that don't have a file associated with them. So we can kind of detect that this kind of thing has happened. Let's see. How does antivirus software work? At least antivirus software from the 1990s or so. It's the last time I spent any time looking at them. >> Yeah. Definition-based pattern recognition. Absolutely. So we can just go scan for byte patterns of things that we've seen before. And, going back to Katchcha's talk earlier, if we've seen it before,
we can make inferences about it. If we've never seen it before, magical thinking. And being a kernel driver, we can actually intercept new processes being created. We can go, hey, you're firing up a new process. Let me just put that on pause for a minute, look at it, and see if it's bad. So we can make sure that nothing bad gets loaded after our program starts.
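Two of the scans described above, sketched together: flagging executable memory that has no file behind it, and searching memory for known byte patterns. Since the talk concerns Windows internals that are not shown, the first function uses a Linux analogue as an assumption, parsing text in the format of `/proc/<pid>/maps`. The sample mapping text, the signature name, and its bytes are invented for illustration.

```python
def unbacked_executable_regions(maps_text: str) -> list[tuple[int, int]]:
    """Return (start, end) of executable mappings that have no backing pathname."""
    regions = []
    for line in maps_text.splitlines():
        # Format: address-range perms offset dev inode [pathname]
        parts = line.split(None, 5)
        if len(parts) < 5:
            continue
        addr_range, perms = parts[0], parts[1]
        pathname = parts[5].strip() if len(parts) > 5 else ""
        if "x" in perms and not pathname:
            start, end = (int(part, 16) for part in addr_range.split("-"))
            regions.append((start, end))
    return regions


def scan_for_signatures(memory: bytes, signatures: dict[str, bytes]) -> dict[str, list[int]]:
    """Return, per signature name, every offset where its byte pattern occurs."""
    hits = {}
    for name, pattern in signatures.items():
        offsets = []
        pos = memory.find(pattern)
        while pos != -1:
            offsets.append(pos)
            pos = memory.find(pattern, pos + 1)
        if offsets:
            hits[name] = offsets
    return hits


# Invented sample in /proc/<pid>/maps format: the third mapping is executable
# and anonymous, which is what injected code tends to look like.
MAPS_SAMPLE = """\
00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/game
00651000-00652000 rw-p 00051000 08:02 173521 /usr/bin/game
7f3c10000000-7f3c10021000 rwxp 00000000 00:00 0
7ffd1c9f2000-7ffd1c9f4000 r-xp 00000000 00:00 0 [vdso]
"""

suspicious_regions = unbacked_executable_regions(MAPS_SAMPLE)

# Invented signature database and memory dump.
SIGNATURES = {"demo-cheat": bytes.fromhex("deadbeef")}
dump = b"\x00" * 8 + bytes.fromhex("deadbeef") + b"\x00" * 4
signature_hits = scan_for_signatures(dump, SIGNATURES)
```

As the speaker notes, the signature scan only finds what has been seen before, while the unbacked-region check can flag code nobody has catalogued yet. Legitimate programs such as JIT compilers also create anonymous executable memory, so a real checker would need an allow-list.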
Let's see other stuff we can do. So we can do process enumeration. Basically look at the name of every program that's running and look for bad ones. Let's see. A common thing for good cheating software is it will want to make a call out to the network. What are they calling out to the network for? Whether you paid your subscription to the cheat service or not. Got to get paid, right? If I'm writing cheat code, a good one, an auto-aimer or whatever, I want to get paid for that. Subscription models are totally a thing here, evidently. So I'm told. Actually, I haven't looked.
Uh >> indeed, indeed. Um let's see. Another clever one is um
uh stack walking. When you execute a function, it creates a new stack frame, and maybe that function calls another function, which calls another, etc. We can walk that chain backwards and make sure that the chain makes sense, that somebody hasn't injected another step in the chain, that nobody has tried to hijack a particular function call. That's what stack walking is all about. All right. Now we're going to talk about the coolest feature of Vanguard. Okay. The Unix world makes this complete illusion that every process has a linear chunk of memory that starts at approximately zero and goes to approximately, I don't know, let's call it three gigabytes. It can vary. On 64-bit machines it can be
a lot larger. Let's think in 32-bit terms for the moment, adult brain. Okay. That's not how it's laid out in real memory. Wouldn't it be great if we could lay every process in its own chunk of memory? What's the problem with laying everything out linearly like this?
Yes, processes will want to allocate more memory. And if I stack them really close together, now I've got a problem. Wouldn't it be nice if I could scatter things all over memory? That's where this comes in. This is how it really works. This is what it looks like. Every time an address is presented to the CPU, it does this on 64-bit machines. But I bet you all thought your machines were 64-bit. You've been lied to, by the way. They're only 48 bits as far as virtual addresses go. If you want more bits than that, you can have it, but you won't need that till you get to about a petabyte of RAM.
If you have that problem, come talk to me. Okay? All right. So, when an address is presented, so I want to go access a particular byte in memory, it gets broken up into a whole bunch of pieces. The topmost bits are a lookup into a table called PML4. Why is it called PML4? One, two, three, four. What do you think the one for real 64-bit machines is? PML5. Intel has totally gotten unoriginal in naming things. Okay? Don't get caught up on the names, but each part of the address is a lookup into a different table. So the whole virtual address is 48 bits long, 0 to 47. And the topmost bits are used to look up an entry in the
PML4, which points to a PDP. Don't try to say it 100 times fast. This index is an index into that table, which points to a page directory table, PDT. Don't get caught up on this. What's important is what Riot does. Riot modifies the page table handler in Windows. Here's what that means from a practical point of view. Every time you load a program off of disk, Riot's code's running on your machine. Whether you're playing Valorant or not, it's running in the operating system kernel, because the first thing it does when it fires up is load itself as a kernel driver, and then it patches the
operating system itself. Page table handler. Yeah, we're going to modify that. It's actually a function called, what is it called? Swap. Swap context hook. It is a hook that gets called every time we change from one process to another. So, every time you go back and forth between writing code and listening to music, Riot's code's running. That sure does need to be reliable, right? And if your machine's still running and you have Valorant installed, they must have done something right. Right? They must have some smart folks or really lucky folks. Folks don't get that lucky writing kernel code, though. All right. Here's why it overwrites that handler.
To get this whole ball rolling, there's a register colloquially called CR3. Intel's naming conventions. Can't do anything with them. It is the entry into this process. It's the one that points at this PML4 and starts the whole process. When valorant.exe runs, since they're running in the swap context hook, they say, oh, you were going to switch to valorant.exe. That's not the table I'm going to use. It creates a shadow version of this PML4, and it has the real entries in it.
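The address breakdown walked through above can be made concrete. With standard x86-64 four-level paging and 4 KiB pages, a 48-bit virtual address splits into four 9-bit table indices and a 12-bit byte offset; the walk starts at the table CR3 points to. The function name and dictionary keys below are just labels chosen for this sketch.

```python
def split_virtual_address(va: int) -> dict[str, int]:
    """Split a 48-bit x86-64 virtual address into its 4-level paging indices."""
    return {
        "pml4": (va >> 39) & 0x1FF,    # bits 47..39: index into the PML4 (found via CR3)
        "pdpt": (va >> 30) & 0x1FF,    # bits 38..30: index into the page directory pointer table
        "pd": (va >> 21) & 0x1FF,      # bits 29..21: index into the page directory
        "pt": (va >> 12) & 0x1FF,      # bits 20..12: index into the page table
        "offset": va & 0xFFF,          # bits 11..0: byte offset within the 4 KiB page
    }


# Build an address from known indices, then take it apart again.
example_va = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
parts = split_virtual_address(example_va)
```

Each index selects one of 512 entries in its table, which is why whoever controls the first table (the value in CR3) controls everything a process can reach.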
The data structures in Valorant, some of the most important ones are called WORLD, in all caps, and PLAYER, in all caps. That's where the good stuff is. That's the position of every player and the layout of the world and how things interact. I don't know much more depth than that. I haven't reverse engineered Valorant. Not going to. I ain't got that kind of time. So if you are another process, if you ask the operating system, so what's the layout of valorant.exe, you'll get this version of the PML4. And the same xxx points to
nothing. So if I want to go access the WORLD and I'm not valorant.exe, when I ask the operating system, so tell me about the page table layout of valorant.exe, the operating system gives its version of what it thinks Valorant looks like. It says, well, its PML4 table is here, and then I, as the evil program, can go access that and then walk the rest of the thing, see how the memory is laid out for this process. Since it points to nothing, as soon as I try to go access those pages, bad things happen. My cheat software crashes. Crashing cheat software, Valorant doesn't care about that, but you're not going to pay for it if it
doesn't work. And how long have you been holding that up while I've been ignoring you? Okay. Sorry about that. You may need to, like, wave it or something. All right. So when the operating system switches to Valorant, the kernel driver, since it's hooked swap context, can say: here's the real value for CR3, that real entry point, which has the real entries in it.
>> Yeah. So Valorant, when it creates the memory region for this real WORLD and PLAYER, it marks it as being a part of its special sauce for the driver, and the driver knows to allocate it into this area pointed to by the shadow PML4. Am I getting closer to answering your question? Okay. So, again, the important thing is that when it's valorant.exe, everything behaves as normal when we're switching to this process.
We write the real value of PML4 there. The kernel driver for Valorant does that. I'm picking on Valorant; this also happens for League of Legends. If it is somebody else and they ask the operating system, what's the layout of this process, they get the wrong value for the entry point to how memory is laid out for this process. They don't know where to start. They don't know how to find this version of the table in memory. It's in memory somewhere. It has to be. But where is it? I don't know. Does that help? And if I haven't made it clear, let me know. All right.
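A toy model of the shadow-table trick as it is described in this talk, with page tables flattened to plain dictionaries. Everything here is hypothetical (`ToyKernel`, `shadow_hook`, the region names as dictionary keys); it only illustrates the control flow of "the hook installs the real table on a switch to the game, everyone else is shown a decoy", not how Vanguard or Windows is actually implemented.

```python
class PageFault(Exception):
    """Raised when a lookup hits an entry that points to nothing."""


# The real table has every region; the decoy the OS reports lacks the protected ones.
REAL_TABLE = {"code": "game code", "WORLD": "map layout", "PLAYER": "player positions"}
DECOY_TABLE = {"code": "game code"}


class ToyKernel:
    def __init__(self):
        # What the operating system believes, and reports, about each process.
        self.reported_tables = {"valorant.exe": DECOY_TABLE}
        self.switch_hooks = []
        self.cr3 = None  # stands in for the CPU register holding the active table

    def query_layout(self, process: str) -> dict:
        """What another process gets when it asks the OS for a memory layout."""
        return self.reported_tables[process]

    def context_switch(self, process: str) -> None:
        self.cr3 = self.reported_tables[process]
        for hook in self.switch_hooks:
            hook(self, process)  # a driver hook may override cr3 here


def shadow_hook(kernel: ToyKernel, process: str) -> None:
    """Driver hook: on a switch to the game, install the real table instead."""
    if process == "valorant.exe":
        kernel.cr3 = REAL_TABLE


def read_region(table: dict, region: str) -> str:
    if region not in table:
        raise PageFault(region)
    return table[region]


kernel = ToyKernel()
kernel.switch_hooks.append(shadow_hook)

# The game itself runs with the real table and sees its data.
kernel.context_switch("valorant.exe")
game_view = read_region(kernel.cr3, "WORLD")

# A cheat asks the OS for the layout, gets the decoy, and faults on access.
try:
    read_region(kernel.query_layout("valorant.exe"), "WORLD")
    cheat_crashed = False
except PageFault:
    cheat_crashed = True
```

The real table still exists somewhere in memory, as the speaker says; the defense rests on outsiders not being told where the walk starts.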
Actually, I should, let's see. There's one other kind of clever thing, and this is a lot simpler than all this page table stuff. How do you know what time it is? >> You can check your phone. Okay. Right. How does an operating system know what time it is? Where does Android get it from? >> Yeah. So, the time server is not a bad place. When it first boots up, the operating system stores >> it does. So, the CMOS battery keeps a little clock running, and when the operating system boots, it reads that clock. After that point, it maintains its own tick. It keeps a timer running which fires every n seconds for some value of n and
keeps the time. Okay, what if I set a timer for one second? Okay, my timer will go off after one second. Right, this is not rocket science, right? Let's say I were a game developer and had servers for playing my multiplayer game. Those I control. I have physical control over those, probably. I can set a timer on those to run once a second too. What if I get different values back? What happened? In other words, here's what's going on. I ran a one-second timer on the user's machine, my player's machine, and I ran a one-second timer on my servers, which I control,
and they are completely bulletproof. They'll never be hacked, and I get different numbers for how much time elapsed. What happened? >> Could be solar radiation. Okay. Might have dropped packets. No, that will give me a little bit of variance, but what will really give me variance is if I happen to find myself on the user machine running in a hypervisor. Hypervisors have a hard time keeping precise time. Correct. If I find myself in a hypervisor, then I know I can't trust anything, even at the operating system level. Ouch. But I can just say, hey, you're running on a hypervisor. You're not supposed to run in a hypervisor. Violating terms of service. Goodbye. And exit. And Riot
does. Okay. So, hopefully this has been a good CS 4461. We'll have an exam next week. >> But before that, are there any questions? Go ahead.
But they've got an easy problem. The Xbox guys have it easy. >> They control the hardware. >> Yeah. Okay. That's true. >> You're right.
>> How
Yeah. And I don't have good information for which checks are run at what point. I do know that creating and deleting the hook for swap context happens at driver load time. So when Windows is loading its device drivers, that one gets put in there. So every context switch you're running Riot's code on your machine. Now, the memory scanning and stuff I think only takes place while you're running the game. So we'll do one expensive search first and then do cheap ones, because it can stop new programs from starting.
>> Tell that to Riot and Microsoft. I'm not playing in that game. But they totally do that, right? They totally change and hook functions in the Microsoft kernel. But folks have been doing that for about 40 years at this point. Am I right on the time frames there? >> Well, they want people paying their license fees for Windows for sure. And if playing Valorant gets you to do that, yeah. So, I don't know. I really don't know how Riot gets away with another song. I don't know. Go ahead, Ralph. And then I'll come up there. Can you containerize it? No, because it
will detect that it's running in a hypervisor using that tick check.
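The tick check mentioned here compares how long the same interval took according to the player's machine and according to a server the developer controls. A minimal sketch of that comparison, working on already-collected measurements: the 5 percent tolerance, the requirement of three bad samples, and the sample numbers are all assumptions for illustration, chosen so that ordinary jitter such as a delayed packet does not trigger an accusation.

```python
def drift_ratio(client_elapsed: float, server_elapsed: float) -> float:
    """Relative disagreement between the two clocks over the same interval."""
    return abs(client_elapsed - server_elapsed) / server_elapsed


def looks_virtualized(
    samples: list[tuple[float, float]],
    tolerance: float = 0.05,
    min_bad: int = 3,
) -> bool:
    """Flag the client if enough (client, server) samples disagree beyond the tolerance."""
    bad = sum(1 for client, server in samples if drift_ratio(client, server) > tolerance)
    return bad >= min_bad


# Invented measurements, in seconds, of a nominal one-second timer.
bare_metal = [(1.001, 1.0), (0.998, 1.0), (1.02, 1.0), (1.0, 1.0)]
under_hypervisor = [(1.30, 1.0), (0.85, 1.0), (1.12, 1.0), (1.01, 1.0)]

bare_metal_flagged = looks_virtualized(bare_metal)
hypervisor_flagged = looks_virtualized(under_hypervisor)
```

Requiring several bad samples is one way to keep accusations reliable, in line with the earlier point that a wrong ban costs the developer a paying user.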
>> Okay. So, okay, you're right. Containers are different, but my guess is they would detect that they're running in the container somehow. And I don't know how you do that off the top of my head. I do know how for hypervisors, by the way. I've written two papers on how you decide that you're in a hypervisor mode there
>> Uh, what? Yeah, so each of the software packages for doing anti-cheat will do VM detection. It's cheap and easy, relatively speaking, to detect if you're in a VM, and they know that if they're running in a VM, number one, the game's not going to perform well, and it opens them up to a whole lot of cheating, right? Because in the hypervisor, you can't even detect that the operating system is messing with you. You can then start messing with the Valorant or the Riot Vanguard stuff, like: oh, you said you were going to hook swap context. Let me get the real value and print it somewhere so I can use it. Or you can do
that in the hypervisor and not be detected. And hypervisor detection is cheap and easy. So they might as well. The tick check is one that's very reliable, but there are about 50 different ways of detecting your own hypervisor. Wayne mentioned that I did work for the, I didn't write an operating system kernel for Cyber Grand Challenge, but I did write a hypervisor.
Valorant doesn't run on macOS. >> So if you want to get your Riot games fix, get a Windows machine. >> And that's probably one of the major reasons why: they can't do good anti-cheat. Remember, their point of view is the game's got to be fair, gotta, you know, right? All that stuff. And if they can't ensure that, it's a hard platform to support.
What games do I run under Linux? >> Uh, yeah. Well, you met other. Why don't they port, like, Valorant to... Is that the question, why don't they port the games to Linux or
>> Yeah.
My guess is that they're trying to keep the level playing field by making sure that um they've got Edge in place. It's my guess. That's why they don't support other operating systems. I don't know.
>> That's it. It's hard to write one driver that works in multiple operating system kernels. Done it. It's painful. If you thought portability in C was bad in user space, try it in the kernel. Like in Linux, there's no printf. It's of course called printk, in case you were curious. That's a really minor example though. >> Uh, printf by far. BSD Unix has had printf in the operating system kernel since the 70s, >> and Linux did not exist in the 70s.
>> The trusted platform module. >> Yes, the TPM. >> That is a long discussion. I imagine you have some expertise in the TPM space. Oh, okay. I'm guessing by Wayne standing that I am out of time. >> So, they've been okay. >> I can answer in five minutes. So, trusted platform module, and the reason why you all have to upgrade to Windows 11. It's not because Microsoft is a money-grubbing corporation. They are, but that's not the reason you need to upgrade to Windows 11. They totally are. But the reason you need to upgrade is they have added support for cryptographically signed loading of the kernel off the disk, making sure that it is valid when
it comes off the disk, kind of like all this anti-cheat stuff does. And the versions of the standard that are supported by Windows 10 and before aren't secure enough; we found holes in them. And I say we, I'm putting myself in the category of folks that worked in the space, and I haven't. But vulnerabilities have been found in all the trusted platform module versions before 2.0. So we can find ways around. We can load Windows kernel modules that are unsigned into kernels. We can get our code running in the operating system kernel. Supposedly that's fixed in TPM version two. Windows 11 doesn't
support the early versions of TPM, and your hardware is getting obsolete because it doesn't support the hardware version, TPM 2.0 and above. I've got a machine. It's a great machine. It cost about $8,000 when it was bought. But it was my workstation until last year, because I couldn't run Valorant on it, because I couldn't enable secure boot for Windows 11. Old TPM. The machine was bought in 2018, 2016. >> I could get Windows 11 on, but not with the TPM enabled, not with secure boot enabled. >> Yes, the TPM is a trusted platform module. It's a piece of hardware which stores cryptographic keys. The idea with the trusted platform module in the very
short version