
So yeah, I'm a software engineer in the blockchain space now, so I've got a pretty good understanding of peer-to-peer heuristics and general mechanics, and the talk is a reference to that. The title, at least, is a reference to "Smashing the Stack for Fun and Profit"; I like that article. So let's just get into the talk. The thing's gone off, by the way, the thing here.
yeah
that's all right
Yeah, let me just bring this down here so I'll be able to see. Great. OK, cool. So there are two main classes of exploit. The first class is logic exploits: you have some buffer overflow because you didn't keep track of your integers correctly, or you have some if statement that's wrong somewhere, or whatever. Those are generally quite easy to fix. It just takes a patch, and those exploits affect single clients. The other type of exploit is flaws in the design, architectural flaws: protocol architecture flaws, or flaws in how your database gets data from your disk, or whatever. Those are far harder to fix because they require dramatic changes. In peer-to-peer protocols especially, or generally any kind of networking protocol where two clients must agree on a way to exchange information, they are even harder to fix, because you can't just fix your own client: then you won't be able to talk to the rest of the network. So you have a subset of logic exploits there, and then a subset of design exploits there.
As I've said, you can't fix design exploits unless the entire network upgrades, especially for peer-to-peer protocols. [Music] Peer-to-peer networks are very prone to design-class exploits. The peer exchange and discovery mechanisms contain several subtle emergent properties that are easily abused, and you can't change the peer exchange or discovery mechanism without breaking compatibility. In the BitTorrent protocol, which is what we're going to be looking at today, the peer exchange mechanism is the DHT, the distributed hash table, which is based on Kademlia. So there's a synopsis of that there.
So it's the world's largest P2P network in terms of usage and traffic. I think BitTorrent traffic accounts for something like thirty to fifty percent of the world's bandwidth-intensive traffic at certain times of the day; I think Netflix is the second-largest bandwidth user. The spec of the BitTorrent protocol kind of evolved over time without any real oversight or guidelines. There are the BitTorrent Enhancement Proposals, which are sort of community voted on, but if you've tried your hand at implementing a BitTorrent client any time in the last ten years, you know it's quite hard. So mostly what people have done is take the reference client, or whatever open-source clients, and conform to that without really understanding what certain protocol decisions mean. So the worst mistakes in the reference client get ported into everything else. So, in terms of the DHT:
Right, so BitTorrent Enhancement Proposal number five came out in 2008, and that's what introduced the DHT. You probably use magnet links nowadays, and that's what allows magnet links to work. The first part of the magnet link is called the btih, the info hash, and then in the rest you have sort of the announce list. So if you go to download your Debian ISO, because that's what everybody does with torrents, you will generally see a link which has a bunch of garbage at the beginning. That's a SHA-1 hash. It's a SHA-1 hash of the metadata and sort of the torrent Merkle tree, which is a tree of hashes of the file. And then you have the other part, which is the announce list. The announce list is what keeps track of the trackers, so, I don't know, [unclear] is a tracker. That's what you would only be able to rely on in the old days. That's what keeps track of the torrents: you register the torrent with the tracker, it's added to its list, and you can query anything from the tracker. So it's a central server that acts as an intermediary.
But with magnet links you can now take away that central server completely, because it uses the distributed hash table as sort of a key-value store to do NAT traversal and peer exchange. On top of this they've added another protocol extension, BEP 9, which allows you to use the torrent network in band to get metadata about the torrent given just the SHA-1 hash of the torrent file. So that's why nowadays, if The Pirate Bay or something goes down, it's very easy to get it back up or clone the site, because all you have to do is replay a list of the info hashes, and you can always find other peers through the DHT.
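To make the two parts of a magnet link concrete, here is a minimal Python sketch that pulls the info hash and the tracker list out of a magnet URI. The function name `parse_magnet` and the example tracker URL are made up for illustration; the `xt=urn:btih:` and `tr=` parameter names are the ones magnet links conventionally use.

```python
import base64
from urllib.parse import urlparse, parse_qs


def parse_magnet(uri):
    """Return (info_hash_hex, tracker_urls) for a BitTorrent magnet URI."""
    parsed = urlparse(uri)
    if parsed.scheme != "magnet":
        raise ValueError("not a magnet link")
    params = parse_qs(parsed.query)

    info_hash = None
    for xt in params.get("xt", []):
        if not xt.startswith("urn:btih:"):
            continue
        value = xt[len("urn:btih:"):]
        if len(value) == 40:
            # 40 hex characters: the 20-byte SHA-1 written as hex.
            info_hash = value.lower()
        elif len(value) == 32:
            # 32 characters: the same 20 bytes written as base32.
            info_hash = base64.b32decode(value.upper()).hex()
    if info_hash is None:
        raise ValueError("no btih info hash found")

    # "tr" entries are the announce list; it may be empty, in which case
    # a client has only the DHT to find peers with.
    return info_hash, params.get("tr", [])


if __name__ == "__main__":
    link = ("magnet:?xt=urn:btih:" + "ab" * 20 +
            "&dn=example.iso&tr=udp%3A%2F%2Ftracker.example.org%3A6969")
    print(parse_magnet(link))
```

A link with no `tr` parameters still parses; that is the trackerless case the talk is concerned with.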
So yeah, in a specific torrent, the info hashes are just hashes of the metadata, like I said, and that's how peers are found for a particular torrent. So there's a sort of dump of a general torrent file so you can see for yourself, and you can see that there is some... that's so annoying.
You can see that there are comments, some random fields that are in the metadata. There's the announce field, which is what has the trackers in a regular torrent. This is the info hash, which is what I described, the file length, the number of pieces and the total length of the Merkle tree. The Merkle tree is a tree of hashes. It's one of the ways that BitTorrent keeps track of pieces that you've downloaded, and it's also an easy way to verify that someone hasn't given you the wrong data for a certain piece. So if I'm downloading an ISO, it's important that it doesn't get corrupted, but more so that it doesn't get maliciously altered; maybe one of the chunks in the ISO is the SSH binary and it's been backdoored or something. The way that this is prevented is you take a hash of the entire file and then you split the... actually no, never mind, I'm getting mixed up with another type of tree structure. So you just split the entire file into chunks, which are, I don't know, one-kilobyte chunks, and you hash each of these chunks. That's layer zero, say, of the tree.
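A minimal Python sketch of the hash tree being described at this point: hash each chunk to get the bottom layer, then hash concatenated pairs upward until a single root is left. The one-kilobyte chunk size follows the talk's example; duplicating the last hash on odd-sized levels is an assumption made here to keep the sketch short, not necessarily what any BitTorrent extension specifies.

```python
import hashlib


def merkle_root(data: bytes, chunk_size: int = 1024) -> bytes:
    """Return the SHA-1 Merkle root of data split into fixed-size chunks."""
    # Layer zero: one hash per chunk (a single empty chunk for empty input).
    chunks = [data[i:i + chunk_size]
              for i in range(0, len(data), chunk_size)] or [b""]
    level = [hashlib.sha1(chunk).digest() for chunk in chunks]

    # Each higher layer hashes the concatenation of two hashes below it.
    while len(level) > 1:
        if len(level) % 2:
            # Assumption: pad an odd level by repeating its last hash.
            level.append(level[-1])
        level = [hashlib.sha1(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


if __name__ == "__main__":
    original = b"a" * 1024 + b"b" * 1024
    tampered = b"a" * 1024 + b"c" * 1024
    print(merkle_root(original).hex())
    print(merkle_root(original) == merkle_root(tampered))
```

Changing a single chunk changes the root, which is what lets a downloader notice an altered piece.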
And then you take each pair of hashes and you hash both of the hashes; in fact you concatenate them and hash that. That forms level one, so each level-one hash is a node that points to two lower hashes. You do this until you're left with only one node, the root node, which is the Merkle root. So if I request a block from you as a peer that I want to download from, I don't have to wait until I'm done with the entire torrent to know that I've downloaded the right block, and I don't have to hash the entire ten-gigabyte torrent just to make sure that you didn't give me a false block. So it's good for error correction, but it's also good for data integrity, to make sure that people don't get exploited.
So, the DHT node, the distributed hash table node: what is it? In essence, each peer that participates in the DHT is a DHT node, and each peer maintains a list of peers, which is the DHT routing table. Each DHT node has an ID, which is also a SHA-1 value, so a 160-bit value. There are a few different protocol messages, but the most important one in relation to torrents is the get_peers message. These messages are bencoded, which is some weird binary encoding that the BitTorrent protocol uses; it's kind of like ASN.1 or RLP.
The way that the DHT works, the way that Kademlia works, and the way that almost all DHTs work, is that they use something called sloppy distributed hashing. The concept of sloppy distributed hashing is that the content is hashed, and that hash is how you refer to it and how you ask for content. You also have an ID, which is your personal identity in the network, and you don't really look at IP addresses or anything like that in terms of fetching and identity management. When you're being introduced to the network and building your routing table, you try to keep peers whose IDs are closest to your own ID. And because it's just a SHA-1 hash, it should technically be evenly distributed. If you have a high-entropy input to the SHA-1 hash which creates your BitTorrent ID, then in theory this gives an even distribution in the links between the nodes. In the sense that if I generate a random value and hash it, my hash result is going to be all over the place, so if I always find peers that are close to me in the hash space, those peers are going to be globally distributed. Hence we have good network connectivity in terms of variety and distribution.
The closeness of your node to another node is kind of like the Hamming distance, meaning how many bits are different in your ID compared to the ID of the node that you're being introduced to. Those that are very far from you, you don't care about too much, although you always keep a few, but for nodes that are closer to you, you have a very dense routing table. This has implications for finding files, because files are categorized by SHA-1 IDs as well. So I take a file, or rather I take an info hash, which is a SHA-1 hash, and I want the Ubuntu ISO. The way that the heuristics of the BitTorrent protocol find this file is that I look at my routing table. Say that this hash is very far away from my own node ID. I look at my routing table, and I have a lot of peers close to me in terms of the ID hash and a few other peers sparsely spread out. I take the peer with the node ID that's closest to the info hash of the torrent that I want, and I ask them: do you have any peers that are announcing that they have this info hash, that they can contribute to me? The node that I queried will look into its routing table and say: OK, no, I don't, because you probably won't get it in the first hop. I don't have peers that have announced this hash, but I do have peers closer to this hash than you, because I'm closer to that hash. So it'll return a list of those peers that are closer, and then I do that same step again with the peers that it returned to me, and so on. After so many hops and so many iterations of this protocol, you will get connected to a peer with a peer ID that is very close to the info hash of your file. That bucket of peers with IDs closest to the info hash kind of acts as the backbone router for that info hash; most of the connections are routed through them. So that's, in a quick summary, how you find files through the BitTorrent network when you're using the DHT. When you're using a tracker, the tracker just sends you peers because it keeps a table, so it's a little different in that respect.
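The hop-by-hop search described above can be sketched as a small in-memory simulation in Python. Two things here are not from the talk: the helper names (`sha1_id`, `closest`, `iterative_lookup`) are invented for illustration, and the distance is computed as the XOR of the two IDs read as an integer, which is how Kademlia and BEP 5 define closeness (the talk's "Hamming distance" is a loose analogy for that). Real clients send network queries; here a dictionary stands in for every node's routing table.

```python
import hashlib


def sha1_id(seed: str) -> int:
    """160-bit ID derived from a seed string, as an integer."""
    return int.from_bytes(hashlib.sha1(seed.encode()).digest(), "big")


def distance(a: int, b: int) -> int:
    """Kademlia-style distance: XOR of the IDs, compared as an integer."""
    return a ^ b


def closest(ids, target, k=8):
    """The k known IDs closest to target, nearest first."""
    return sorted(set(ids), key=lambda n: distance(n, target))[:k]


def iterative_lookup(start, target, routing_tables, k=8):
    """Repeatedly ask the closest unqueried node for nodes closer to target.

    routing_tables maps node ID -> iterable of node IDs that node knows;
    it stands in for sending a query to that node over the network.
    """
    shortlist = closest(routing_tables[start], target, k)
    queried = {start}
    while True:
        pending = [n for n in shortlist if n not in queried]
        if not pending:
            return shortlist  # nobody left who could know closer nodes
        node = pending[0]
        queried.add(node)
        learned = routing_tables.get(node, ())
        shortlist = closest(list(shortlist) + list(learned), target, k)


if __name__ == "__main__":
    nodes = [sha1_id(f"node-{i}") for i in range(200)]
    # Each simulated node only knows the 8 IDs closest to its own ID.
    tables = {n: closest([m for m in nodes if m != n], n) for n in nodes}
    target = sha1_id("some torrent metadata")
    found = iterative_lookup(nodes[0], target, tables)
    print(hex(found[0]))
```

Because the shortlist always keeps the best candidates seen so far, each round can only move the search closer to the target or leave it where it is.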
So BEP 5 supports the following routing and peer messages. You have ping, which just makes sure that the nodes are alive. You want to prune nodes in your routing table that are obviously dead and don't reply to you. A lot of nodes can't be reached, or can only reply to certain messages, like if they're behind NAT or something. find_node finds a node given a node ID, which is sort of the equivalent of get_peers but for nodes: I just want to find this specific node. For example, if I want to get connected to the network, on the first iteration I'll just generate ten random hashes and ask my bootstrap node to find me ten peers that are evenly distributed, and I'll start filling my routing table.
One important concept here: if you have just a magnet link, which has the info hash, you literally just have a hash of the torrent that you want to find. You have no tracker, you haven't connected to anybody, and you have no peers yet. So how do you connect in the first place? Obviously you need somebody to be aware of you, at least if you're doing NAT traversal, before you can do the exchange. So there are a few things called super nodes or bootstrap nodes in the DHT. You can run your own bootstrap node if you want, just to mess around. The biggest bootstrap node is run by bittorrent.com, so if you hit port 6881 on router.bittorrent.com, that's a bootstrap node which will bootstrap your routing table. It doesn't give you files; it doesn't do anything other than keep a very sparse and evenly distributed routing table, reply, and act as an intermediary for people who want to bootstrap.
Again, get_peers is what I explained: you query the routing table for the nodes closest to the info hash ID. That's the basic idea behind sloppy distributed hashing. And announce_peer signals to the nodes you queried, or to all of your routing table, depending on who you're responding to, that you control this node and that you are providing this file or looking for this file. Even if you're a leecher you can provide data, so as long as you have data you're kind of a provider anyway.
Then there are some sanity checks. You can't send an announce_peer message to random people in the network, because otherwise I could subscribe you to a torrent that you didn't sign up for, and then I could DDoS you. So you concatenate the IP address with some random number, and that's a token. When a node asks you for peers and then replies with an announce_peer, it must reply with that token, so you can't subscribe people to random messages. But this only prevents content injection into other people; it doesn't prevent people getting the wrong routing tables injected, which is what we'll cover in a bit. [Music]
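As a rough Python illustration of the messages and the token check described above, the sketch below bencodes a get_peers query and derives an announce token from the requester's IP address. The message layout (keys `t`, `y`, `q`, `a`, `id`, `info_hash`) follows BEP 5; the helper names, the fixed secret, and hashing with SHA-1 are choices made for this sketch, and a real node would rotate the secret periodically.

```python
import hashlib
import hmac
import os


def bencode(value) -> bytes:
    """Encode ints, bytes/str, lists and dicts in BitTorrent's bencoding."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode()
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = sorted(((k.encode() if isinstance(k, str) else k, v)
                        for k, v in value.items()), key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def get_peers_query(transaction_id: bytes, node_id: bytes, info_hash: bytes) -> bytes:
    """Build a get_peers query: 'who is announcing this info hash?'"""
    return bencode({
        "t": transaction_id,
        "y": "q",
        "q": "get_peers",
        "a": {"id": node_id, "info_hash": info_hash},
    })


SECRET = os.urandom(16)  # hypothetical per-node secret


def make_token(ip: str, secret: bytes = SECRET) -> bytes:
    """Token handed out in a get_peers reply, bound to the asker's IP."""
    return hashlib.sha1(ip.encode() + secret).digest()


def token_is_valid(ip: str, token: bytes, secret: bytes = SECRET) -> bool:
    """Accept an announce_peer only if it carries the token we gave that IP."""
    return hmac.compare_digest(make_token(ip, secret), token)


if __name__ == "__main__":
    print(get_peers_query(b"aa", b"abcdefghij0123456789", b"mnopqrstuvwxyz123456"))
```

Since the token depends on the sender's address, an announce sent on behalf of a different IP fails the check, which is the subscription abuse the talk says the token prevents.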
Right, so the properties of the DHT are such that you can build a serverless search engine. There are a few of these around, but they're not really known or popular, because there's The Pirate Bay or whatever people use now; if you want your random torrent about random things, you go on one of these centralized sites. But you can easily build your own local torrent search engine by scraping the DHT. These are called DHT crawlers or DHT spiders. All you're doing is running a daemon which is a DHT bootstrap node. In the standard BitTorrent protocol you're only allowed to keep up to eight nodes per bucket, you should not keep more than K buckets, and you prune connections out. This guarantees that the network is healthy in terms of topology: the topology between nodes and the routing tables stays randomized, so you don't have aggregations of nodes that connect to each other but not to the rest, so you don't have splits in the network, basically. But if you're just running a crawler, you don't have those restrictions, so you can take your huge routing table, store it forever, and listen to all the peers.
When you're listening for get_peers, somebody is asking you: do you have any peers that have this info hash? When you get that get_peers, it's basically a message that says this IP address, or this set of nodes, is looking for this file. If you fake reply that you have all the info hashes, you will eventually get connected to pretty much every single node in the network, in the sense that you'll be in the routing table of pretty much all nodes, especially if you run ten thousand different nodes. You can use one IP address but run 10,000 different IDs on it, at, I don't know, two-to-the-tenth steps in the SHA-1 hash space, and in this way you can kind of be guaranteed to be in almost the entire network's routing tables. This is also what those copyright companies do. They run these kinds of nodes, and they don't have to see you download something and they don't have to give you the file. All they do is listen for get_peers requests, and they say: this IP wants this info hash, so clearly they tried to download this torrent, or they're seeding this torrent. That's why it's really easy to do mass automation of the letters you get if you download the wrong thing, or if you run a server and somebody's running a torrent on your server: it's not very resource intensive to keep track of the entire BitTorrent network, at least the info hashes part. [Music]
Again, there's another thing called BEP 9. Say I'm running this sort of crawler node and someone sends a get_peers to me and requests this information. Even if I don't know what torrent that info hash specifies, I can issue, from my other node, a get-metadata command for this info hash, and the network will provide me the file name, the Merkle tree, or whatever, of this torrent. So I get the name of the torrent, and that's how you build a search engine: you can resolve info hashes to names. It's kind of like a domain name resolution system, but for torrents; instead of domain names it's torrent names. So that's how it's so easy to send those thousands of letters. They just have automated bots that spam these out: they look up the AS, the autonomous system, the owner, like Amazon, and say this IP has allegedly downloaded these things, even though they don't really have proof; you could also just be running a DHT crawler. And a lot of people get scared and just comply. But one of the cool things about the DHT and the current BitTorrent network that people don't realize is that you can run your own Pirate Bay with like a hundred lines of code, and that's what this site called BTDigg does. There are plenty of sites run by people who just have fun with it.
It's also cool as sort of network investigation, because you can sweep through the entire BitTorrent network and see the topology of the network evolve over time, which gives you a clearer view of a side of the network that you usually wouldn't see unless you're running a router backbone or something. So yeah, I just talked about this: crawlers, spiders, the decentralized network, blah blah blah. You can do this without ever uploading a single torrent.
One of the disadvantages of the DHT as opposed to setting up a site like The Pirate Bay is that if nobody is serving that info hash at a certain point in time, you cannot get information about that torrent. Whereas on The Pirate Bay, for example, all I have to do is a keyword search, Mr. Robot, OK, there we go, or I can look up an info hash and get the resolution. The DHT is obviously a real-time availability network, so if nobody is providing an info hash right now, you can't figure anything out about it. But chances are that if you stay on long enough, you'll find it eventually. There are some propagation delay things that can make it difficult, but generally you can build up a pretty good database. I think if you run optimized code it takes maybe a day, like twenty to thirty hours, to basically duplicate the entire TPB database locally, which is not bad. And the bottleneck isn't that it's bandwidth intensive; it's probably the propagation delays.
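Two of the crawler ideas above lend themselves to a small Python sketch: spreading many node IDs evenly over the 160-bit ID space in two-to-the-tenth steps, and recording which address asked for which info hash. Everything named here (`spaced_node_ids`, `SightingLog`, the documentation-range IP addresses) is hypothetical, and the networking itself is left out; the log only processes an already-decoded get_peers query.

```python
from collections import defaultdict

ID_BITS = 160  # node IDs and info hashes are both 160-bit SHA-1 values


def spaced_node_ids(prefix_bits: int = 10):
    """Return 2**prefix_bits IDs evenly spaced across the ID space."""
    step = 1 << (ID_BITS - prefix_bits)
    return [i * step for i in range(1 << prefix_bits)]


class SightingLog:
    """Remember which addresses sent get_peers for which info hash."""

    def __init__(self):
        self.by_info_hash = defaultdict(set)

    def on_get_peers(self, sender_ip: str, query_args: dict) -> None:
        # query_args is the decoded "a" dictionary of a get_peers query.
        info_hash = query_args[b"info_hash"]
        self.by_info_hash[info_hash.hex()].add(sender_ip)

    def interested_addresses(self, info_hash_hex: str):
        return sorted(self.by_info_hash.get(info_hash_hex, ()))


if __name__ == "__main__":
    ids = spaced_node_ids()
    print(len(ids), hex(ids[1]))

    log = SightingLog()
    log.on_get_peers("198.51.100.4", {b"info_hash": bytes(20)})
    print(log.interested_addresses(bytes(20).hex()))
```

Note that the log records who asked, not who downloaded, which is the gap the talk points out in the automated letters.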
OK, so the scope of this talk, now that we've got the prerequisites out of the way: you can piggyback onto the BitTorrent DHT and use it to aid NAT traversal, or to act as a command and control server. NAT traversal, if you don't know: the way that routers work right now is, if you have a consumer router and you have no ports open and I try to hit port 5050 on your router, it's just going to get dropped by the router. That's not because the router is really that smart about this; it's because the port is not open, but it also hasn't seen a request go out from port 5050 or whatever. Take the UDP example for NAT traversal, and this is how peers connect to each other behind NAT. Say I'm on my laptop and I have a router here, and I craft a UDP packet. The local port is the port that you're bound to, as in the port that you receive things on; the destination port is the thing that you're sending to. Say I send a packet to a host that has open ports, a DNS server for example, so that's port 53. My packet is just: local IP, my IP; local port, usually a random high port, so say port 60,000; then the destination address and destination port, so the destination address could be 8.8.8.8, Google's DNS, destination port 53. When that passes through my router, my router logs that. It sort of puts a check mark saying: if a packet comes back with source port 53 and destination port 60,000, let that packet through. Because my router is not open to the network, if I hadn't initiated this, if I hadn't put a hole in the router's firewall, the reply wouldn't have got through.
You can do this even when neither of you has open ports, just by organizing yourselves in another way. Say I make a general client that has hard-coded ports, so it's always going to come from port 60,000 and always going to go to port 60,000, and the same for you: when you want to connect to me or to random addresses, it's always going to go from port 60,000 to port 60,000 on UDP. This works as long as you don't have symmetric NATs, meaning as long as you don't have mobile NATs, the network address translators that are used in the gateways for mobile networks. Those are different: the port that you bind to locally isn't the port that gets advertised once it passes the gateway, because there are many different clients on the same address. So forget that one; you have to use some sort of STUN server for that. But if I'm using a home internet line and you're using a home internet line, and we're both behind routers that have zero ports open, then as long as we both know each other's IP addresses and we have this client that hard-codes source and destination ports, I'll send a packet to you first. That packet will get dropped at your router, because you haven't sent anything to me. Now you send me something back, we just agree on a time, and because I've sent you a packet my router is already open, so I will get that packet even though no ports are open. I reply to you, and because you've sent a packet, your router is now also open. So now we have an open route, which will last, depending on what protocol or what router you're using, between thirty and sixty seconds as long as you have some activity. If you stop all activity, that route will time out within sixty seconds and things will be closed again.
So that's the background to NAT traversal, and that's a big thing in terms of botnets. If you have malware [Music] you need to control your bots somehow and you need some way to reach your bots, but all your bots have closed ports. What is usually done is somebody will hack, I don't know, ten random Linux servers somewhere and turn them into bot masters. But that's pretty fragile, because the FBI, or whoever is in charge of the jurisdiction, can just reach out to the hosting company or just black-hole those routes, and then you've lost control over your bots, unless you have some sort of way for the bots to find the new bot master, but that's overcomplicated. If you don't need that central server to dispatch instructions to your bots or open up routes to your bots, and the bots can just connect to each other, that's a far more resilient botnet, because you can't really take it down: you can't just black-hole every single client that is infected. [unclear] So if you have a million-node botnet, that's probably going to keep pretty healthy connections between the bots, and they can keep routing tables and whatever, so you're pretty much done. It's very hard to take down a botnet like that, especially if it piggybacks onto another huge network like the BitTorrent network. The BitTorrent network has insane levels of traffic; again, it's some 40% of internet traffic at some points in the day. On top of that, there are some interesting properties that emerge when you have such a high-bandwidth network: intelligence agencies like the NSA do not capture BitTorrent traffic, because it's actually too much for them to log. So if you make a botnet that uses something like erasure coding within torrent traffic, you're pretty much invisible to the NSA as well, which is interesting in itself.
And again, because it's such a big network, this holds even if you have a very large botnet that would normally show up on radars, like the Conficker botnet or the Zeus botnets, which can range from one million to tens of millions of nodes. If you have a couple of central servers that are getting hit with 10 million connections, that's pretty obvious. But there are probably hundreds of millions of BitTorrent connections going through at any point of the day, so if you have a distributed network that piggybacks on top of BitTorrent, and you're smart about it and keep localized connections, like router-localized connections, and you masquerade your bots' commands as BitTorrent traffic, you can have a botnet in the millions and never get detected. I mean, it's very hard for you to get detected, because it'll just look like torrent traffic, especially because a lot of BitTorrent traffic is encrypted in some weird way, which is not very secure, but it makes stateful packet analysis very hard, especially at those scales.
And then as well, the DHT is so large that it's generally pretty well connected. The larger your subset of the network is, your botnet within the DHT, the less the propagation time is. But even if your network consists of just two nodes, as in me and you just want to chat with each other using the DHT, the propagation time is only a cost at the very beginning. It'll take you five to ten minutes to bootstrap into the DHT network. Once you start broadcasting get_peers requests and your nodes are in routing tables spread across the network, at that point when somebody new joins the network and broadcasts a get_peers, that request will get to you pretty quickly. So it's like a TCP connection that takes ten minutes to go through, but once it goes through you have pretty good latency.
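The UDP exchange described in the NAT traversal part of this section (both sides bind a known port and keep sending to each other until a packet gets through) can be sketched in Python with the standard `socket` module. This is a simplified illustration under stated assumptions: the function names are invented, the payload is arbitrary, and it only demonstrates the send-then-listen loop. It does not deal with symmetric NATs, which the talk notes need something like a STUN server.

```python
import socket


def bound_udp_socket(port: int, host: str = "0.0.0.0") -> socket.socket:
    """UDP socket bound to a fixed local port, so the peer knows where to aim."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def punch(sock: socket.socket, peer_addr, attempts: int = 10,
          interval: float = 0.2) -> bool:
    """Send to the peer until one of the peer's packets reaches us.

    Each outbound packet makes our own router expect replies from peer_addr;
    once the peer has done the same, traffic flows in both directions.
    """
    sock.settimeout(interval)
    for _ in range(attempts):
        sock.sendto(b"punch", peer_addr)
        try:
            _, addr = sock.recvfrom(1024)
        except socket.timeout:
            continue  # nothing arrived yet; the peer may not have sent
        if addr[0] == peer_addr[0]:
            # Send once more so the peer also sees a packet from us.
            sock.sendto(b"punch", peer_addr)
            return True
    return False


if __name__ == "__main__":
    # Loopback demonstration: two sockets on one machine stand in for two
    # hosts. Behind real routers each side would run punch() against the
    # other's public address at roughly the same time.
    a = bound_udp_socket(0, "127.0.0.1")
    b = bound_udp_socket(0, "127.0.0.1")
    a.sendto(b"punch", b.getsockname())
    print(punch(b, a.getsockname()), punch(a, b.getsockname()))
    a.close()
    b.close()
```

To keep the mapping from timing out, a real client would go on sending small keepalive packets at an interval shorter than the thirty to sixty seconds mentioned in the talk.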
So, the simplest example of exploiting the DHT for your own purposes. And this doesn't have to be for malware; this can be for pretty much anything. You can build a domain name resolution system on top of the DHT, you can build a content addressing system, whatever: anything where you can use a distributed key-value store that is too lenient and allows things that aren't actually torrents into the key-value store. So the steps: generate a random info hash that isn't a real torrent. Just take a message like "b-sides", SHA-1 hash that message, and then throw out a get_peers request saying: I want to get peers, give me peers that are looking for this hash.
Eventually that info hash will get pretty well spread out in the network. If you and your bot, client, other party, server, whatever, are both requesting that info hash, you'll eventually be relayed each other's IPs and also the ports that you're listening on. So you'll get something like: IP 127.0.0.1 is listening on port 5050, and I'm listening on port 6060, and you'll get my address and port. With that information, because we can do NAT traversal (UDP NAT traversal, TCP NAT traversal, uTP, whatever), there you go, you're done: you've connected to each other without needing a central server for this mediation.
All these DHT protocols are based off of this initial design called Kademlia, so all these Kademlia-based protocols have these traits, and they've been getting a lot more popular recently. I don't know if you know about IPFS, which doesn't stand for IP file system; it stands for InterPlanetary File System. It uses this Kademlia-based routing to address the content on your disk, so you can publish a hash on some website and, as long as your node is online, people can directly take data from your laptop's disk, and those connections are routed through the DHT.
Bitcoin uses a Kademlia-like routing system; it's not the same, but it's very similar. Pretty much all cryptocurrencies do. At one point Skype used to use a Kademlia-based system, but not anymore, because Microsoft killed it, like everything that Microsoft touches. A lot of WebRTC sometimes uses Kademlia-based routing systems, as long as it doesn't use STUN servers.
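The rendezvous trick described above (hash an agreed message, then ask the DHT for peers of that hash) can be sketched in a few lines. This is a minimal illustration, not the speaker's C example: the helper names `bencode`, `rendezvous_info_hash`, and `get_peers_query` are made up for this sketch, and the message layout follows the `get_peers` query format from the BitTorrent DHT specification (BEP 5). Nothing is sent over the network here.

```python
import hashlib
import os


def bencode(value):
    """Minimal bencoder covering the types a DHT query needs."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Bencoded dictionaries must have their keys in sorted order.
        items = sorted(
            (k.encode("utf-8") if isinstance(k, str) else k, v)
            for k, v in value.items()
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def rendezvous_info_hash(message):
    """Both parties hash the same agreed message to get the same 20-byte key."""
    return hashlib.sha1(message.encode("utf-8")).digest()


def get_peers_query(node_id, info_hash, transaction_id=b"aa"):
    """Build a get_peers query asking for peers of info_hash."""
    return bencode({
        "t": transaction_id,
        "y": "q",
        "q": "get_peers",
        "a": {"id": node_id, "info_hash": info_hash},
    })


if __name__ == "__main__":
    node_id = os.urandom(20)                      # node IDs are just random 160-bit values
    info_hash = rendezvous_info_hash("b-sides")   # the shared message from the talk
    print(get_peers_query(node_id, info_hash))
```

Actually sending the query means a UDP `sendto` to a DHT node you already know about; which bootstrap node and port to use is left out here on purpose.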
And so there's a trend that these things are getting more popular, but at the same time people aren't really thinking about the emergent properties of the designs and the network, because they're vulnerable to hijacking just as BitTorrent is; hijacking in the sense that you're using the network but not contributing to the network's purpose, right? If I'm using the network just to resolve domain names or resolve my bots, I'm degrading the network, in the sense that I'm not contributing to the network's health. I'm not torrenting, I'm not seeding; I'm actually damaging the network, because the way that I use my routing table to find my bots is very different from the way that you should be using your routing table to keep a healthy DHT. So, anyway.
Great, yep. So then again, the caveats. Bootstrapping into the network takes a long time sometimes. Sometimes it can take a very long time, depending on the network state. It's based on propagation speed, and you can do some tricks to optimize this, but the worst-case scenario is always going to be five to ten minutes. So if you have some critical thing where you need to know that when you make this connection it's going to happen within a second: no, it's not a good thing for that. It could take up to five minutes.
There are many nodes that just respond to every single info hash, like the crawler that I mentioned before, and just pollute the routing tables. That's annoying, because if you're trying to connect to your bots, or you're trying to find somebody's file on a system, or DNS, whatever, you'll get a bunch of crap replies, like "yeah, look, I have every file you want", but then when you connect to them you waste time, because they never connect back to you; they're just trying to crawl the network. Half of these are crawler nodes from people who just run crawlers for whatever the hell reason: companies who build up statistics, maybe some of the nodes are the NSA, I don't know.
If you spend too long using the same info hash that nobody else uses, that nobody can get a metadata torrent out of, that doesn't really exist, and nobody's ever seen that torrent, it's pretty clear after a while that you're just abusing the network. But there are no heuristics in the clients that detect this; they'll still just be happy with it. No torrent client that I know of, and no DHT bootstrap node that I know of, keeps track of this. The only exception might be private trackers, but I don't know, I don't use that stuff.
Because, yeah, obviously the behavior of "give me all the files, but at the same time I won't give you anything and I'm never downloading these files" is pretty obviously a crawler node, and the crawler nodes are mostly DMCA people, I mean, and the NSA people. So I guess private trackers might track that more often or more seriously. But it's easy to track only if you're very naive in your implementation. If you are relatively intelligent in how you do it, and you pretend to be a peer even a little bit, it's pretty hard to track you down, especially if you're rotating your IP address and your node ID.
On top of this, the DHT, the entire network's routing tables, can be slowly manipulated over time by somebody who has enough resources, or the right know-how about differential changes in routing table topology depending on activity. Most clients rely heavily on the DHT IDs, which are the SHA-1 hashes that you just randomly generate, instead of IP addresses. Obviously, partly this is an effect of the internet, because there could legitimately be many different clients behind one IP address, like for example a university (which blocks UDP torrents, by the way; I had to use TCP to test my client before).
But this is also how certain people can dupe trackers. If you ever check out the recent torrents or the top torrents on the Pirate Bay, just to see what's popular, what people are watching, oftentimes there are like ten torrents by the same random-garbage-named person with more peers than are available in the entire DHT at any one time, probably. There's like, you know, Game of Thrones season five hundred, into the future, with ten million peers connecting to it. Obviously it's fake, but they're inflating the numbers by just taking a few IP addresses, or their botnet or whatever, and spinning up like ten thousand, a hundred thousand nodes on each of these IP addresses, putting out get_peers and announce_peer requests. And so the tracker, the Pirate Bay tracker for example, is very naive, and it'll be like: whoa, look at these ten million requests, clearly there are ten million people seeding and wanting this file. So that shoots up to the top. I don't know why they don't fix this; it's really annoying.
And I guess people get malware often, because if you don't know about these things and you're not in the mindset of being careful, you could easily just, like in Mr. Robot, click on the first link, because it's always sorted by seeders, download the malware, and then you're done. They remove these pretty quickly, as I'm seeing lately, but it's not quickly enough, because it's not like it takes a lot of time to make a new account on the Pirate Bay. So as they remove them, like 300 more pop up, and, you know, whatever.
And if you make sure that the routing tables of each node don't blow up too much, you can actually go a long time without this being detected on a fraud level or an inflated level. There's an important property called equivocation in distributed networks. Equivocation means: if I'm connected to you, node A, and I'm connected to you, node B, and you both send me the same request, "I want the peers for info hash A", and I send the two of you two different replies, that's equivocation. I'm lying to one of you about the state of the network.
Bitcoin clients, DHT nodes, whatever, do not track equivocation. It's also a very hard problem to solve anyway; it requires some sort of graph theory algorithms, and it requires keeping the differential states. But because they don't track equivocation, it's very easy to set up a bunch of nodes that reply to the network. If you have enough nodes and you have them distributed throughout the network, you are replying to requests in the network in such a way that you can manipulate the routing tables of people and section off parts of the network, or make certain parts of the network totally disintegrated, and just cause chaos in the network. You can manipulate the routing tables of pretty much everybody, because none of them track equivocation.
So one of the problems is they don't track equivocation. Another problem is that the only health metric in the protocol specification itself is: if you don't respond within 15 minutes, you're a bad node, I'll drop you. Maybe some torrent clients, from what I've seen, are starting to put in checks for: if this node constantly asks for my info hashes but never connects to me, never wants anything from me, and never gives me anything, I'll blacklist that node. But for all intents and purposes, once you're connected to that node, it's kind of already too late: it's manipulated your routing table, and that node has your IP address. So if it's a corporate company, you're already done. They have your IP address, they have the intent right there, it's over.
So that's a weakness in the protocol health, due to too much leniency in the implementation of why nodes are good and why nodes are bad (again, no equivocation checks, whatever), and also emergent properties in the routing table that are bad for network health.
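To make the definition of equivocation above concrete, here is a deliberately simplified sketch of what "tracking" it could look like if two or more observers compared notes. The class `EquivocationLog` and its methods are hypothetical names for this illustration; no BitTorrent client is known to do this, and a real DHT makes it much harder because honest replies also change over time as peers come and go, which is part of why the talk calls it a hard problem.

```python
import hashlib
from collections import defaultdict


class EquivocationLog:
    """Collects the replies different observers got from the same responder."""

    def __init__(self):
        # (responder, query) -> {observer: digest of the reply that observer saw}
        self._seen = defaultdict(dict)

    def record(self, observer, responder, query, reply):
        """Note that `observer` asked `responder` for `query` and got `reply` (bytes)."""
        digest = hashlib.sha1(reply).hexdigest()
        self._seen[(responder, query)][observer] = digest

    def equivocators(self):
        """Responders that answered the same query differently to different observers."""
        flagged = set()
        for (responder, _query), by_observer in self._seen.items():
            if len(set(by_observer.values())) > 1:
                flagged.add(responder)
        return flagged


if __name__ == "__main__":
    log = EquivocationLog()
    # Node X tells A and B different things about the same info hash.
    log.record("A", "X", "info-hash-a", b"peers: 10.0.0.1:5050")
    log.record("B", "X", "info-hash-a", b"peers: 10.0.0.9:6060")
    # Node Y gives both observers the same answer.
    log.record("A", "Y", "info-hash-a", b"peers: 10.0.0.1:5050")
    log.record("B", "Y", "info-hash-a", b"peers: 10.0.0.1:5050")
    print(log.equivocators())
```

The sketch assumes the observers trust each other and compare replies taken at roughly the same moment; both assumptions are exactly what an open network does not give you for free.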
There are some ways that certain peer-to-peer networks, like for example Tor, or I2P, which is like Tor but darknet only, get around this: by being very selective with their changes to the routing table. They will only change their routing table if they see many different validated peers also change the routing table in that way. And, for instance, I2P has bi-directional channels that are multi-hop, so while you don't know where the channel request came from, it's hard to abuse it in a DDoS manner, because the way that those channels are constructed has to follow a certain propagation format. And so if you try to abuse the network, people will just drop your requests. So, the conclusion... okay, good. What? That really just ended the slideshow. Okay.
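The "only change the routing table if many validated peers agree" idea can be sketched as a quorum gate. This is an illustration of the rule as the talk states it, not Tor's or I2P's actual algorithm; `QuorumRoutingTable`, `propose`, and the `quorum` parameter are hypothetical names, and the step of validating peers (for example by checking signatures) is assumed to have happened already.

```python
class QuorumRoutingTable:
    """Applies a routing change only after enough distinct validated peers report it."""

    def __init__(self, validated_peers, quorum):
        self.validated_peers = set(validated_peers)  # assumed to be vetted elsewhere
        self.quorum = quorum
        self.routes = {}     # node_id -> address currently trusted
        self._pending = {}   # (node_id, address) -> set of peers vouching for it

    def propose(self, reporter, node_id, address):
        """Record one peer's report; return True once the change has been applied."""
        if reporter not in self.validated_peers:
            return False  # reports from unknown peers are dropped
        vouchers = self._pending.setdefault((node_id, address), set())
        vouchers.add(reporter)  # a set, so repeated reports by one peer count once
        if len(vouchers) >= self.quorum:
            self.routes[node_id] = address
            return True
        return False


if __name__ == "__main__":
    table = QuorumRoutingTable(validated_peers={"p1", "p2", "p3", "p4"}, quorum=3)
    for peer in ("p1", "p2", "p3"):
        applied = table.propose(peer, "node-42", ("10.0.0.7", 6881))
    print(applied, table.routes)
```

A single attacker node repeating the same claim gains nothing under this rule; it needs to control a quorum of validated peers, which raises the cost of the slow routing-table manipulation described earlier.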
no we cannot see anything right now oh
my god
Wow, whatever, there you go. The conclusion is: P2P networks are pretty prone to this; you can build services that make use of the DHT. The issues described also apply to pretty much all distributed peer-to-peer networks. Again, Bitcoin; Tor has some of these issues; I2P; I don't know, you name it. Just name a peer-to-peer network and it has this problem. It's hard to design protocols. There are a lot of emergent properties that you don't realize, unless you do something called safe-by-construction protocols, which is like a game theory way of designing protocols. It's very difficult though; it requires a lot of understanding of games, mathematics, game theory, and sort of logical complexity theory.
So the way that these protocols are designed is just kind of ad hoc, and ad hoc, evolving protocols will always have these issues. And because these are architectural flaws in the protocol, it's pretty much impossible to fix once your network gets big enough. A good example is Bitcoin right now. There are a lot of problems with the protocol; one of them, I don't know if you follow it, is the block size, and some of the equivocation issues in terms of spoofing transactions and which block has been winning out. But it's almost impossible to change; nobody wants to change it. A lot of hardware and software assumes that the Bitcoin protocol acts this way, and to upgrade all that hardware would be insanely expensive, so there's a lot of pressure around that.
If you want, and it probably won't work here unless you're using a VPN or something, because it seems like this Wi-Fi blocks UDP, but you can try it at home or whatever: if you take that code and compile it (it's just some C example code) and you run it that way, and the hash is "b-sides" or whatever, you'll be able to do NAT traversal, basically peer-to-peer discovery, with anybody else who has that message. So if you had another client (you can write a Python client or something, or use netcat) and connect to those ports, the local port actually, local port 5450, destination port whatever, and you connect to each other's IP addresses, you can directly connect with each other through NAT. So that's example code; you can have fun with that proof of concept. It's simple enough to play around with. And then router.bittorrent.com, that's the super node that BitTorrent runs. So I think that's it. We have five minutes. Any questions?
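For readers who want the "write a Python client" option mentioned above, here is a minimal sketch of the UDP side: bind a fixed local port, then keep sending to the peer's address and port while listening for its packet. This is not the speaker's C code; `open_punch_socket` and `punch` are hypothetical helper names, and the assumption is that both sides already learned each other's public IP and port (for example from DHT replies) and that the NATs in between allow this kind of hole punching.

```python
import socket


def open_punch_socket(local_port, bind_host="0.0.0.0", timeout=2.0):
    """Bind a UDP socket on a fixed local port so the peer knows where to send."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((bind_host, local_port))
    sock.settimeout(timeout)
    return sock


def punch(sock, peer_addr, payload=b"hello", attempts=5):
    """Send to the peer repeatedly and return the first datagram it sends back.

    The outgoing packets are what open the mapping in our own NAT; the peer
    runs the same loop in the other direction. Returns None if nothing arrives.
    """
    for _ in range(attempts):
        sock.sendto(payload, peer_addr)
        try:
            data, addr = sock.recvfrom(1024)
        except socket.timeout:
            continue  # nothing yet, send again
        if addr == peer_addr:
            return data
    return None


if __name__ == "__main__":
    # Loopback demo only: there is no NAT here, so this just shows the shape.
    a = open_punch_socket(0, "127.0.0.1")   # port 0 lets the OS pick a free port
    b = open_punch_socket(0, "127.0.0.1")
    b.sendto(b"hello from b", a.getsockname())
    print(punch(a, b.getsockname(), b"hello from a"))
    a.close()
    b.close()
```

In a real run each side would call `open_punch_socket` with its agreed local port and `punch` with the other side's public address, at roughly the same time.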
[Music]
Yeah, if the person who's done this malware is naive, yes. But basically you're assuming that the info hashes that these peers are using to find each other are hard-coded in the malware, right? That doesn't have to be the case. You can make an info hash that is based on, I don't know, some sort of avenue that you control or that you can predict, but that's hard to figure out from a reverse engineering standpoint. So, like, the hash of some web page, or the hash of some Wikipedia page at this moment. I can manipulate that, and it would give you a changing hash.
Obviously, yeah, you can reverse the malware and connect to the network as a peer and still build routing tables over time that will find that peer and, I don't know, traverse the entire network. But you can have segregated networks, like: my botnet is made up of ten partitions, and each partition has its specific way of finding info hashes. If somebody roots one of the partitions and reverse engineers it, and I figure out that they reverse engineered that, or I keep check on who logged the traffic of the server or whatever, then I can just section off that partition, half by half, until I find the person who is manipulating routing tables, or the exploited server, whatever, there you go.
Also, you can send out protocol-level equivocation messages on a large scale, so that for any change in the routing table, for any change of peer behavior, the peer must connect to at least like 20 other validated peers with some sort of cryptographic signature. So if you try to change the behavior of that peer, you'll immediately be found out, right, because you're the only peer who is equivocating or acting differently. And even if you start exploiting like 100 bots, that's a level of exploitation that's harder to get to. So yeah, there are ways that you can uncover these, but there are also ways that this can be made hard.
Anyone else? Did the talk make sense? You guys understand? Yeah. So it's pretty interesting; there's a lot of research now into safe-by-construction protocols, especially for things like Bitcoin. And the thing that I work on for a living is Ethereum. I don't know if anybody's heard of Ethereum? It's more than I expected.
So Ethereum is basically like Bitcoin, but instead of having just tokens, you have a Turing-complete virtual machine language associated to a cryptographic transaction. An address can either be an account address with value or an address that has code, and I can send an input to the address and have my code run on the entire network. It's distributed in a decentralized manner, and they all validate it. So it's like serverless infrastructure, right? You can make your decentralized Google or whatever that nobody controls.
But these issues are very important in networks like that, because of the whole "is this transaction, is this block valid? What's the latest block in the blockchain? How can I know that I'm not being lied to about the state of the blockchain?" Right now the only medium is proof of work, you know, hash brute-forcing. That's very inefficient though, especially if you want fast propagation time or fast finalization. Like I said, with Bitcoin I have to wait 10 minutes, more like 30 minutes, a lot of time, before I can be sure that that transaction couldn't get reversed, right? That's shitty; that definitely won't work for, you know, payment systems.
So you use something called proof of stake, which is something where, instead of brute-forcing a hash and then making a block out of that hash, so that you have that computationally heavy blockchain, you put in sort of a deposit. It's like: I bet, or I back, this state of the blockchain with, I don't know, 1,000 bitcoin. And so if I am wrong, right, if I manipulate the blockchain after I put in my deposit, I lose the deposit. So I'm not incentivized to, I mean, I'm incentivized against lying, against equivocating, basically.
So this, along with safe-by-construction mitigation protocols and equivocation protection (there are some graph theory algorithms), could mean that given the state of a network, once you see a snapshot of the network and you contribute to it, like you transact something, you can estimate how many nodes need to be Byzantine, Byzantine meaning equivocating nodes, malicious nodes, before you're at risk of this transaction being reversed.
And because it's proof of stake, there's no computation time; I mean, there are no computationally expensive algorithms that take like 10 minutes each. It's more about a network agreeing, and agreeing in a way that makes it hard for people to be malicious, because you have a state tree of all the activities that everybody else has done in your snapshot. And so you can have immediate finality, which is a lot cooler, and you can have guarantees about: given my view of the network, how safe is this network? Which you don't have in most peer-to-peer protocols right now. So yeah, it's cool.
Anyone else have any questions? Otherwise, thanks for coming. [Applause]
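The deposit idea from the proof-of-stake answer above (back a state of the chain with a deposit, lose the deposit if you equivocate) can be sketched as a toy ledger. This is only an illustration of the incentive, not Ethereum's or any real protocol's rules; `StakeLedger`, `vote`, and `stake_behind` are hypothetical names, and signatures, networking, and fork choice are all left out.

```python
class StakeLedger:
    """Toy model: validators back blocks with deposits and are slashed for equivocating."""

    def __init__(self, deposits):
        self.deposits = dict(deposits)  # validator -> deposit amount
        self.votes = {}                 # (validator, height) -> block hash voted for

    def vote(self, validator, height, block_hash):
        """Record a vote. A conflicting second vote at the same height burns the deposit."""
        if self.deposits.get(validator, 0) <= 0:
            return False  # no stake, no say
        previous = self.votes.get((validator, height))
        if previous is not None and previous != block_hash:
            self.deposits[validator] = 0  # equivocation: the deposit is lost
            return False
        self.votes[(validator, height)] = block_hash
        return True

    def stake_behind(self, height, block_hash):
        """Total deposit currently backing one block at one height."""
        return sum(
            self.deposits.get(validator, 0)
            for (validator, h), voted in self.votes.items()
            if h == height and voted == block_hash
        )


if __name__ == "__main__":
    ledger = StakeLedger({"alice": 1000, "bob": 500})
    ledger.vote("alice", 1, "block-A")
    ledger.vote("bob", 1, "block-A")
    print(ledger.stake_behind(1, "block-A"))   # both deposits back block-A
    ledger.vote("alice", 1, "block-B")         # alice equivocates and is slashed
    print(ledger.stake_behind(1, "block-A"))   # only bob's deposit still counts
```

In this model, a receiver can look at how much deposit stands behind a block to judge how costly reversing it would be, which is the kind of "how safe is this given my view of the network" estimate the answer describes.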