
Andrew Morris - Flaying out the Blockchain Ledger for Fun Profit and Hip Hop

BSides Augusta · 2016 · 47:14 · 29 views · Published 2016-09
About this talk
Video from BSidesAugusta 2016.

All right, hey folks. All right, let's get this party started. Oh wait, I need to turn my thing on, hang on. I think we'll be fine. Okay, there we go. So, my name is Andrew. This is "Flaying Out the Blockchain Ledger for Fun, Profit, and Hip Hop," and that title will make a lot more sense here shortly. Before I get started, I was just notified that I have some prizes to give out for questions, but the answer to my first question is on my first slide, so I'm just going to go ahead and ask it now: who can name, first and last name, the author of the Bitcoin paper?

Just shout it, man. All right, you all got it. I don't know what I was thinking. I'll figure out some other thing to do. Damn it, I didn't really think that through very well. And then another warning, just real quick: I don't see any young kids in here, but there's like three or four slides that are going to have a little bit of profanity on them. If anybody has a problem with that, speak now or forever hold your peace. Cool. You were raising your hand? Oh, like young people who don't want to hear profanity.

Okay, let's get started. Before I even get started, and I'm going to blast through this pretty quickly, I want to acknowledge a bunch of people that helped me make this talk happen. My buddy Colin: me and him did all this work together. We did all the research, and he was pretty much there the whole time; we were bouncing everything off of one another. So thanks to him. Thanks to my coworker Chris honer, who helped me with all of the SQL queries that I had to do for this, because he's a really good database guy. My coworker Richard Seymour, who helped me a lot with a lot

of moral support and a lot of the lower-level Bitcoin script kind of stuff. My coworker Bobby, who definitely doesn't want his name on this. My former roommate Andrew, for, well, I'll explain at some point what he did. The guy Edward ice kandov, who is the author of the blockchain parsing library pyblockchain. This guy named synex on YouTube, who has this amazing YouTube series explaining the actual Bitcoin paper: he takes every single sentence and makes like a 10-minute video out of it. It's like 5 hours, but it's totally worth it. Tom gard's, the dude that wrote this really nice blockchain parsing

guide. Kurt barard, another one of my roommates, who actually gave me a really brilliant idea of how to do some big-data heavy computing stuff that I needed to do. Satoshi Nakamoto, for writing the Bitcoin paper, whoever you or they are. The authors of the Bitcoin developer documentation, which is exceptional. The two guys that got in a fight at mukan two years ago, which led to me making the worst joke of all time. And my company, Endgame, for just being a generally amazing company to work at and enabling me to do this research. So this is going to be the outline of the talk I'm going to give, and I'll just kind of go through it one by one. There's

a Pastebin link up there if you want to write that down. That's going to have all of the links and hashes that I'm going to be referring to in the talk. If you know anything about Bitcoin, you know there's a ton of transaction IDs and wallets and stuff like that, which I'm going to have on the slides but you guys aren't going to be able to write down in time. So go ahead and take down that Pastebin link. I just tweeted it from my Twitter also, so if you want to follow me, it's Andrew Morris, and it's on

there as well if you don't get a chance to write that down. So my name... oh yes, yep, absolutely. Okay, all right, let's do this. My name is Andrew Morris. This is my Twitter. I work at a company called Endgame on the research and development team. My background is actually in offensive cybersecurity stuff: I was a pentester for a while and a red teamer for a while, doing majority private sector stuff and a little bit of government work. I've been doing computer-related things for the majority of my life, and somehow I'm still kind of bad at it. I dropped out of high school when I was younger, and I didn't do any kind of

schooling or anything, so I kind of have a different background for looking at things. And I really like just computers in general, playing music, and tweeting stupid jokes. Today I'm speaking on behalf of myself, not on behalf of my employer. I do work for a company called Endgame; it's an amazing company, but I'm not representing them today. This is not their point of view, this is just my point of view. This is my recreational research; all they did was let me do it. So again, this is just me talking about things that I observed. I'm not a Bitcoin or blockchain expert, which is a silly thing to say when you

are giving a talk to people on Bitcoin and blockchain. But the reason I'm saying that is because Bitcoin and blockchain are massively, massively complex. I know a decent amount about them, but I am by no means an expert, and if you hear anything in this talk that is incorrect, or that you think is incorrect, do not hesitate to call me out in the middle of the talk. You're not going to embarrass me or anything like that; I don't want to spread any kind of misinformation. This is actually really hard stuff, so I'm going to do my very, very best. So let's get into it. On August 26 there was this guy named Martin

Shkreli, and he bought the exclusive Wu-Tang Clan album for $2 million. Nobody knew it at the time, because it was like an anonymous guy, but he did it. The Wu-Tang Clan had been recording an album in secret for like 5 years or something like that, and he bought it exclusively for $2 million. Then on February 11th, 2016, that same guy tweets at Kanye West, and he's trying to buy Kanye West's new album that has yet to come out, The Life of Pablo. He's trying to give him 10 million dollars and have Kanye West give only him exclusive rights to that album,

or so he says. Three days later, Martin Shkreli claims to have had a bunch of money stolen. He tweets, quote, "Who the [ __ ] has my $15 million? I need my money back. This isn't a [ __ ] joke. WTF. Someone named Daquan said he was Kanye's boy and I signed the deal to buy Pablo and sent the Bitcoin. Call the police. This is [ __ ]." So I saw this, and I knew just enough about Bitcoin to know I might actually be able to get to the bottom of this. So I had a question. I was basically like: $15 million is a ton of money. I know just enough

about Bitcoin to know that Bitcoin records all transactions on a distributed ledger. The ledger is publicly available, it's not encrypted, it's available to anyone; it's actually distributed to everybody in the Bitcoin network. So I want to replicate the ledger, I want to search it, and I'm going to see if I can find this $15 million transaction that Mr. Shkreli was referring to. So basically: given a date range, find all transactions that fall in a certain USD value range. And actually, before I go on, if I want to switch this to like multi-monitor view,

can I do that? Because I can't see presenter mode. Sorry, folks. If not, I can just keep going. It's actually not a big deal; yeah, let me just keep going. I thought maybe it would just pop over. Oh my god, it was all my fault. Thank you. All right, back on track. So, is it possible to find this transaction? My approach is basically: I'm going to get the ledger somehow, I'm going to parse out all the transactions, I need to get it into a consumable format of some kind, I'm going to shove it into a database, and I'm going to

write some queries. I'm going to ask the database a couple of questions, then I'm going to review the results, and that's it. So let's see if we can get to the bottom of this. This is basically the pseudo-query that I'm thinking of in my head: given all Bitcoin transactions ever, select star from all Bitcoin transactions ever where the date is between, I don't know, February 10th and February 14th, the four days leading up to it, and the US dollar equivalent of the Bitcoin transaction at the time is somewhere between $14 and $16 million. So now I'm going to talk a lot about Bitcoin. Raise your hand if you're familiar with Bitcoin. All right, raise

your hand if you know how Bitcoin works. Okay, yeah, everyone's like... okay. Bitcoin is actually really complicated. I cannot possibly word it any better than some random guy I saw on a forum. Some guy wrote, hey, can you explain something something about Bitcoin, and this other guy responds: "Bitcoin is not something you can sit down and understand in a couple of hours." That guy was absolutely correct. Bitcoin is actually really complicated. So Bitcoin is a peer-to-peer cryptocurrency. As far as I know, it is the first cryptocurrency that I have ever known of; there may have been other cryptocurrencies, but it was

certainly the first big one. There is no central authority, which means there is no trusted third party. What does that mean? What is a trusted third party? Well, if you send somebody $20 on PayPal to buy a pair of shoes, and you receive the shoes in the mail, and you open up the box and there's a dead rat in the box, you're going to be like, what, where are my shoes? You're going to open a ticket with PayPal, and they're going to say, ah, I got you, and they're going to reverse the transaction. So you're going to get your money back, because that's a trusted third party. But if you have $20 and you go up to somebody and

you give it to them to buy a pair of shoes in person, and then they punch you in the face and run away, you can't contact the US Mint and say, can you guys just invalidate the serial number on that $20 bill that I just gave that guy? No, you can't do that. There isn't a trusted third party in cash transactions; there's nobody that oversees it. Likewise with Bitcoin, there is no trusted third party. When you send somebody Bitcoin, you can't unsend the Bitcoin. No one can reverse the transaction; it's cryptographically infeasible. Think of it as the cash of the internet. It uses known

cryptographically secure protocols to ensure the integrity of the transactions. It uses a distributed blockchain ledger to record all transactions, as I was talking about earlier. It uses something called a proof of work to prevent double spending. Double spending is basically just having a Bitcoin and spending it, and then turning right back around and spending it again with somebody else. Well, you can't do that, because in order to double spend a Bitcoin you would have to crunch all of the CPU cycles that everybody else in the world crunched in

order to get to that coin in the first place. So you'd have to redo all these very, very heavy computations. It would take years and years to be able to fake a Bitcoin transaction; it's cryptographically infeasible. I'll get a little bit more into that later. Oh, and by the way, a proof of work is basically just something that is difficult to achieve the first time but easy to validate. What that means is, if you say, hey Andrew, is this word in the dictionary, and you give me a word, it's going to take me a really, really

long time to check, for example, if the dictionary isn't in any kind of order and it's just a big list of words. It's going to take me a really long time to find it. But then if I said, yes it is, after 10 minutes I found it, it's on page 24, then it's really easy for you to just open up to page 24 and find the word. So it's hard to do the first time, but it's easy to validate that the thing is correct. Everything is auditable and accountable back to the Genesis block, or whichever block it was

mined from. You can trace every single Bitcoin all the way back to the Genesis block or to a coinbase. Coins are mined basically by crunching CPU and calculating hard cryptographic equations, kind of like mining gold out of a mountain. Bitcoin uses a non-Turing-complete, Bitcoin-specific scripting language to validate transactions; as if it wasn't complicated enough, they have their own programming language. It takes approximately 10 to 15 minutes to validate a Bitcoin transaction. If you want to know more, I recommend you read the Bitcoin paper. When you send somebody Bitcoin, you're not actually sending them a Bitcoin. What you're doing is signing the coin value to the recipient in a

way that allows them to be the next person that spends the Bitcoin. What that means is, when you send somebody Bitcoin, I'm not losing an amount and they're not getting an amount; there's no state. All that's happening is I'm basically saying: somebody at some point told me that I was the next person that was going to be able to sign a Bitcoin, and then I'm signing it over. I'm saying, all right, now you are the next person that gets to sign the Bitcoin, you're the next person who gets to spend it. And it forms a chain. The transaction is broadcast to the entire world, to the

rest of the Bitcoin network, and only the recipient can spend the Bitcoin. All transactions are known by everybody, inherently. The network actually self-regulates with something called difficulty. Difficulty is the notion that the more people are part of the Bitcoin network at any given time, the harder the proof of work is to achieve to generate the next block. What that means is, if there's 10 people mining Bitcoin, then the proof of work, the math or the crypto that has to be done, is going to be kind of hard, and it's going to take those 10 people about

10 minutes to do it. And then if a thousand people all of a sudden join the Bitcoin network, it's going to make the math a lot harder; it's going to add a lot more difficulty, because there's so many people mining Bitcoin. It wants to regulate so that new blocks are harvested once every 10 to 15 minutes. And the dollar value of Bitcoin fluctuates, as it does with any currency. That's not a Bitcoin thing, that's just an economics thing; that's how economics works. And to get into it a little bit: I'm not an economist, but the way the price of Bitcoin is determined is basically by currency traders who just

say, this is how much cash I'm willing to spend on your 100 Bitcoin, or something like that. Is Bitcoin anonymous? Kind of. Everybody knows about every single transaction, so in that way it's less anonymous than cash. But you don't know who the wallet belongs to in real life, so in that way it's more anonymous than cash. One person can use lots and lots of wallets; it's actually best practice to use one wallet per transaction, that's what the super hardcore Bitcoin people say. Bitcoin can be mixed and tumbled to make it harder to trace, which basically just means shifting it around to a bunch of different wallets and then sending it to where you ultimately want it to

go. I'm sure there are legitimate reasons why Bitcoin should be mixed; I don't know what any of them are, but I'm sure they exist. And wallets can be created offline, without any internet access, to receive Bitcoin. Not to create Bitcoin, but to receive Bitcoin. Which means I can have a computer that is not connected to the internet, I can generate a Bitcoin wallet, I can take that public key, I can write it down, and I can go tell somebody, hey, send this wallet a bunch of Bitcoin. Then they can send that wallet Bitcoin, and I can claim that

Bitcoin whenever I want. You can't spend Bitcoin unless you are attached to the internet, but people can send as much money as they want to a wallet that you specify, because remember, you're the only one who has the ability to sign it off to somebody else. Is Bitcoin secure? I mean, I want to say yes. The cryptographic protocols are very, very secure. The implementation is genius; if you read the Bitcoin paper, it's brilliant. The guy, or the people, thought of everything. There are attacks against Bitcoin; they're unlikely, but they're not impossible. The biggest one is something called a 51% attack, which is just when 51% or more of the Bitcoin

network, of the blockchain network, collude to make something bad happen, like changing something that happened in the past, or this, that, or the other. But it's very infeasible. I mean, obviously if you do have 51% of people that are willing to do that, then fine, but one of the defenses against the quote "51% attack" is that if you have 51% of the members of the Bitcoin network, you're most likely going to make more money just working for the network than you are trying to change something that happened in the past. Because of the way proof of work works,

most of the attacks just affect people having shitty OPSEC, and that's it. So I'm going to talk a little bit about the blockchain. Raise your hand if you know what blockchain is. All right. Raise your hand if you know how blockchain works. Okay, yep, that's about me too. Blockchain 101: the blockchain is a list of every single transaction ever. It is a giant series of linked lists in the form of serialized data. It is an eternally growing list of transactions. And depending on what you mean when you ask what the ledger is: blockchain is the technology, and it uses a ledger, a distributed ledger.
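A minimal Python sketch of the "series of linked lists" idea, for readers following along. It is a toy, not Bitcoin's real block format: the dictionary layout, the JSON hashing, and the names `block_hash`, `append_block`, and `is_valid` are all assumptions made for illustration. The only point it tries to show is that each block stores the hash of the block before it, so editing old data breaks every later link.

```python
import hashlib
import json


def block_hash(block):
    """Hash a block's contents, including the hash of the block before it."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain, transactions):
    """Append a block that points back at the previous block by hash."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev, "transactions": transactions})


def is_valid(chain):
    """Check that every block still points at the real hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


# A toy ledger with three blocks (made-up transactions).
ledger = []
append_block(ledger, ["coinbase -> alice 50"])
append_block(ledger, ["alice -> bob 10"])
append_block(ledger, ["bob -> carol 5"])
```

Calling `is_valid(ledger)` on the untouched list succeeds; changing anything in an early block makes the stored `prev_hash` of the next block stop matching, which is the property the real ledger relies on.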

The ledger is always the same; it's the same everywhere, and it always grows in the same way. Also, as a side note: if I'm hosting a ledger and mining Bitcoin, and two people mine the same Bitcoin at the same time, then what happens is we both add that onto our ledgers and everyone's replicating from us. Then whoever it is that mines the next Bitcoin, whoever they trusted is the one that moves forward, and the other person just loses their Bitcoin. So that's how

it works: it's the longest chain of trusted stuff, and we'll talk about that in a little bit. Computers are incentivized to host the full blockchain ledger and keep everything going by receiving Bitcoin for mining Bitcoin. Basically, if you are part of the Bitcoin network and you are mining Bitcoin, then there's a chance that you will achieve the coinbase: you will achieve the proof of work, create the newest block, and receive a number of Bitcoin. There was just something called the Bitcoin halving, so I think until recently you would

receive 25 Bitcoin; now you receive 12, I think. Is that right? 12 and a half, that makes sense. Math. And there was a time when Bitcoin was worth $1,000 a pop; right now it's floating at around like 500, 600, I think. Yeah, it's floating around 600. So, man, 12 Bitcoin at 600-ish bucks a pop, that's a lot. So nothing that's ever happened in the past in the blockchain can ever be changed. It is hashed on hashed on hashed on hashed, so if one thing changes, then everything subsequent is going to be invalidated. There's no state to the blockchain, necessarily; it's a

chain, it's a log of activity. The blockchain is made up of blocks, blocks are made up of transactions, and transactions are made up of inputs and outputs. An output is a wallet sending coins to another wallet, kind of. It's like I was telling you before: an output is me saying, I take this amount of Bitcoin and I'm signing it to your wallet; you are the person who is allowed to spend it next. An input is somebody else saying, I acknowledge that you just sent me those Bitcoins, I'm claiming those coins. And then they would turn

around and output them to someone else. So every time someone claims Bitcoin, they're going to reference the output that they got the Bitcoin from. Anytime I say, all right, I'm ready to spend my Bitcoin, part of the data is a reference to the output that I got it from, and likewise the output that I got it from did the same thing, and it forms this giant linked list. An output without a corresponding input just means that the coins have not been spent. So if you're looking at the blockchain ledger and you find an output and

there's no input with a previous hash value pointing to it, that means those coins haven't been spent; they're just hanging out. So if you want to access the blockchain like a total noob, what you can do is go to blockchain.info. I'm just kidding, this is how everybody accesses the blockchain. You go to a blockchain explorer. There's a lot of free ones: blockchain.info, webbtc.com, I've got a big list of them right here. There's like 50 of these websites that you can use to click through the blockchain, click through Bitcoin transactions, stuff like that. There are tons of them.
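Going back to the rule from a moment ago, that an output with no input pointing at it is unspent: here is a minimal Python sketch of that lookup. The `Output` and `Input` records and the `unspent_outputs` function are hypothetical simplifications (real outputs carry scripts, and real transaction IDs are hashes), so treat this as an illustration of the idea rather than how any explorer implements it.

```python
from collections import namedtuple

# Hypothetical, simplified records; real transactions carry scripts and hashes.
Output = namedtuple("Output", "txid index value")
Input = namedtuple("Input", "prev_txid prev_index")


def unspent_outputs(outputs, inputs):
    """Return the outputs that no input points back at."""
    spent = {(i.prev_txid, i.prev_index) for i in inputs}
    return [o for o in outputs if (o.txid, o.index) not in spent]


outputs = [
    Output("tx_a", 0, 50.0),  # later claimed by an input in tx_b
    Output("tx_b", 0, 10.0),  # never referenced: still unspent
    Output("tx_b", 1, 40.0),  # never referenced: still unspent
]
inputs = [Input("tx_a", 0)]

utxos = unspent_outputs(outputs, inputs)
```

Here `utxos` holds the two `tx_b` outputs: the coins that are "just hanging out."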

One really cool one is called ABE, which I think stands for "a blockchain explorer," and that's an actual offline one. You feed it the raw serialized blockchain data and it will build your own offline blockchain explorer. It's a pain in the ass to get set up, but I have a solution to that, which I will talk about shortly. This is what blockchain.info looks like, if you've never been there before. It's a great website. Basically, this is the height, which just means which block it is; the block height is what block number it is. The first ever block, the Genesis block, was block number one, had a height of one.

The next block had a height of two. This is how old the block is, when the block was mined. This is how many transactions took place inside of that block. This is the total USD value of all of the transactions that took place in that block. This is the "relayed by" field, which is basically whoever advertises that they were the ones that mined that block, so if there's a mining pool or something like that, announcing it to the world once they've mined it. And then this is the actual data size of the block, the size of the serialized data that's sitting on the

ledger for the block. If you want to actually look at an address, the page is going to look like this. You're going to get a QR code. Blockchain.info does some really fun stuff for you: it converts however much money the wallet has received, how much it's sent, all the transactions that it's had, etc. This is what a transaction looks like. A Bitcoin transaction has something called a TXID, or transaction ID, which is just like a UUID, a unique identifier for any transaction that's ever happened. But that's like super 2015, and there's no way to do the thing that I want to do, which is: show me all the

transactions above a certain dollar value in this time range. Well, you can't ask blockchain.info that, because those APIs aren't exposed. It has the data, but it doesn't give it to you in that way, so you just have to go around and click stuff, which isn't how I want to do things at all. Also, maybe you don't want somebody to have logs of the questions that you're asking, and I'm aware of this. So to reiterate, this is the actual query that I want to make. I want to say: show me all of the transactions that happened in a given date range that fell under a certain USD

equivalent range at that time. There are some cheat codes to do this; I mean, there are easier ways to do it than how I did it, but I didn't really realize it at the time. You can download a dump of webbtc.com's Postgres database, which is about 80 gigs compressed, and when you decompress it, it's about 160 gigs. It's a giant Postgres database, so you can get it from there. The other thing is you can generate the database yourself with ABE, which, as I was saying before, is an offline blockchain explorer. I found a guy that dockerized it and made it way,

way easier. If you're familiar with Docker, you can literally kick off the whole thing with one command. It's amazing. You just run that command that I have written here against the serialized blockchain ledger data, and it'll rip everything apart for you. It will take a very long time and it will take up a lot of space, but if you're not worried about time and you're not worried about space, then that's the way to go. And that's the interface right there; it's very similar, like a stripped-down blockchain.info. So how are we going to pull the ledger

apart? Well, the ledger is made up of .dat files. There are currently about 600 125-megabyte .dat files. They're named like blk00000.dat, followed by blk00001.dat, and each .dat file is a serialized binary blob containing about, well, containing exactly 128 megabytes of blockchain ledger data. Each file contains blocks, each block contains a header and transactions, and each transaction contains inputs and outputs. The data structure is complex, but only if you don't know anything about data structures, which I did not before I started doing this. This is my military-grade PowerPoint skills showing what a

block looks like. You've got the block header wrapping around transactions, with an input and an output section, and there are itemized inputs and outputs in each of those. And then, like I said, there's blocks on blocks on blocks on blocks. So how do you actually get the ledger? How do you get the data? Well, there's a number of different ways. The right way to do it is to install Bitcoin Core on a server, like spin up a DigitalOcean node for three days, ensure that you have about 100-ish gigabytes of free disk space, and then you just replicate the ledger over a couple of days. You're just pulling it in from everybody else. You're building the

ledger. I want to say it depends on your internet connection, but it doesn't really, because it takes time to build the ledger together as you pull it down. It's not going to be throttling it, but it's not like it's just going to be flying in, so it takes longer. I think the first time I did it, it took me four days to build a ledger, and it was on a really fast internet connection. When I did this in July 2016 it was about 80 gigs. It's been growing like a mofo recently, though, so it's going

to be pretty big. If you want to get it the fast and lazy way, then you can download a torrent of the ledger files. I haven't actually uploaded this, but if anybody wants it, just get in contact with me and I'll upload a torrent file, you can seed it, and we'll do it like that. I've seen a few of these hovering around the internet. And when you do this, the good thing is you don't have to worry about anybody putting any [ __ ] on the ledger, because if you're trying to

download it and then load it into Bitcoin Core to validate it, it won't load if there's anything screwed up in the ledger data. It'll error out, nothing will validate, it won't work. So if you find a torrent, there's very low risk of it being tampered with, unless you don't validate it. This is what it looks like. This is the ledger: just a bunch of hex. I mean, it's binary, it's serialized data, but this is a hex dump of the ledger.
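For readers who want to see what is inside that hex dump, here is a minimal Python sketch of walking one ledger file and decoding the fixed 80-byte block header. It assumes the usual mainnet on-disk framing (a 4-byte magic value `f9beb4d9`, a 4-byte little-endian length, then the block). The function names `iter_blocks` and `parse_header` are made up for this sketch, and the demo record at the bottom is synthetic: it borrows the Genesis block's timestamp, difficulty bits, and nonce but uses a zeroed merkle root, so the hash it produces is not a real block hash. This is not the parser used in the talk.

```python
import hashlib
import struct
from datetime import datetime, timezone

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")  # marks the start of each block record


def iter_blocks(data):
    """Yield raw block bytes from the contents of one ledger .dat file."""
    pos = 0
    while pos + 8 <= len(data):
        magic = data[pos:pos + 4]
        size = struct.unpack_from("<I", data, pos + 4)[0]
        if magic != MAINNET_MAGIC:
            break  # zero padding or unknown data at the end of the file
        yield data[pos + 8:pos + 8 + size]
        pos += 8 + size


def parse_header(block):
    """Decode the fixed 80-byte header at the front of a block."""
    version, prev, merkle, ts, bits, nonce = struct.unpack_from("<I32s32sIII", block)
    digest = hashlib.sha256(hashlib.sha256(block[:80]).digest()).digest()
    return {
        "version": version,
        "prev_hash": prev[::-1].hex(),  # hashes are displayed byte-reversed
        "merkle_root": merkle[::-1].hex(),
        "time": datetime.fromtimestamp(ts, tz=timezone.utc),
        "bits": bits,
        "nonce": nonce,
        "hash": digest[::-1].hex(),
    }


# Synthetic one-block "file": header only, with a zeroed (fake) merkle root.
demo_header = struct.pack(
    "<I32s32sIII", 1, bytes(32), bytes(32), 1231006505, 0x1D00FFFF, 2083236893
)
demo_file = MAINNET_MAGIC + struct.pack("<I", len(demo_header)) + demo_header

for raw_block in iter_blocks(demo_file):
    print(parse_header(raw_block)["time"])
```

With a real node, the same two functions could be pointed at the bytes of one of the ledger files read in binary mode; the exact path depends on where Bitcoin Core keeps its data directory.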

And I looked at that and I was like, okay, I don't know what to do with this, but the answers are in here somewhere. So then I'm like, I've got to rip the whole thing apart, which means I have to learn how to use structs, which presumably you learn in computer science and stuff, but I wasn't fortunate enough to do that. So I started out by reading the Bitcoin developer documentation and mapping out the data, like the fields. That is a map that I wrote in Vim, and I was just going through and mapping it out. I had like a

key and all this kind of stuff. Not the right way to do it. Parsing blockchain sucks. So I did start writing a parser via this guy's guide. This is a great guide; Tom gar wrote it. I don't know who that is; I've chatted with him a couple of times but I've never met him. But if he hadn't written that guide, I would have been so lost. I started writing my thing following his guide, and it wasn't exactly what I needed it to be, but I did reference it a number of times, so he really, really helped me out with this.
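To give a feel for the field-by-field walk a hand-written parser has to do, here is a minimal Python sketch that reads one transaction's inputs and outputs. It assumes the legacy (pre-SegWit) transaction layout and Bitcoin's variable-length "CompactSize" integers. The names `read_varint` and `parse_transaction` are made up for this sketch; it is not the code from the guide or from the library used in the talk.

```python
import struct


def read_varint(buf, pos):
    """Read a Bitcoin CompactSize integer; return (value, new_position)."""
    first = buf[pos]
    if first < 0xFD:
        return first, pos + 1
    fmt, width = {0xFD: ("<H", 2), 0xFE: ("<I", 4), 0xFF: ("<Q", 8)}[first]
    return struct.unpack_from(fmt, buf, pos + 1)[0], pos + 1 + width


def parse_transaction(buf, pos=0):
    """Parse one legacy transaction at pos; return (transaction, end_position)."""
    version = struct.unpack_from("<I", buf, pos)[0]
    pos += 4
    inputs, outputs = [], []

    count, pos = read_varint(buf, pos)
    for _ in range(count):
        prev_txid = buf[pos:pos + 32][::-1].hex()  # displayed byte-reversed
        prev_index = struct.unpack_from("<I", buf, pos + 32)[0]
        script_len, pos = read_varint(buf, pos + 36)
        script = buf[pos:pos + script_len]
        pos += script_len + 4  # skip the script, then the 4-byte sequence number
        inputs.append(
            {"prev_txid": prev_txid, "prev_index": prev_index, "script": script}
        )

    count, pos = read_varint(buf, pos)
    for _ in range(count):
        satoshis = struct.unpack_from("<q", buf, pos)[0]
        script_len, pos = read_varint(buf, pos + 8)
        outputs.append({"btc": satoshis / 1e8, "script": buf[pos:pos + script_len]})
        pos += script_len

    lock_time = struct.unpack_from("<I", buf, pos)[0]
    tx = {"version": version, "inputs": inputs, "outputs": outputs,
          "lock_time": lock_time}
    return tx, pos + 4
```

Each input carries the `prev_txid` and `prev_index` of the output it claims, which is the back-reference that ties transactions together into a chain. Returning the end position lets a caller parse the transactions in a block one after another.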

I found a couple of libraries that did close to what I needed, but not exactly what I needed. And then I found pyblockchain, which was committed to GitHub literally several weeks before I started doing this project, so I was like, thank God. So I used that; it did exactly what I needed, or almost exactly what I needed. Also, shout out to this guy znort987, who wrote a tool called blockparser, which is written in C++. He wrote it in like 2011. It worked great when the ledger was like 1 GB, but the ledger is like 80 now, and it was loading everything into RAM and

doing something, and you can't really do that with 80 GB. So these are the things that I need for the sake of my use case. I want to do this big SQL query, so I've got to rip everything apart. I need a transaction ID, just so I can reference everything with something else. I need a payer wallet, a receiver wallet, the time of the transaction, the Bitcoin value, and the USD value. There's a lot of problems with all these things. For example, transactions aren't just a plus and a minus number; it's a giant linked list of inputs and outputs, so you can't...

it's more complicated than that. I think I touch on that a little bit more later. There's a notion of change in Bitcoin, which I think I touch on in a different slide. The payer isn't something that is specified in the Bitcoin ledger, only the receiver and a reference to the output of whoever had the Bitcoin first. So basically everything needs to be linked together to get the payer; I dig into that a little bit more later. The exchange rate: there is no USD value in the blockchain ledger. There's no field in the block header that says this is how much US

Dollars Bitcoin is worth today like no so I have to derive that myself somewhere else um uh Big Data like not really you know it's like 80ish gigabytes when I ripped everything apart into like text it got a little bit bigger but it's it's not like big big but it was big for like my home computer um and then transaction patterns this was a huge pain in the ass this is basically as Bitcoin has evolved um when the when a private key is being validated by the Bitcoin script programming language inside the actual Bitcoin uh Bitcoin client um anytime a transaction is being validated like or anytime a transaction is being pushed to the Ledger uh there are different

patterns of pushing a public key and validating it with like different opcodes um and it's it's not one standard everyone there are slightly different implementations huge pain in the ass transaction ID so remember this is the first thing that I need to get uh transaction ID is actually super easy um it's just a SHA-256 of the entire transaction bam that's it that's not that hard right the next thing is time time is actually also really easy there is an epoch timestamp located in the block header um all I had to do is just convert that to datetime uh well I didn't even have to convert that to datetime but I put that in the database

uh as a Unix epoch timestamp um it's right there super easy the receiver wallet is kind of hard like I said wallets and addresses are shorthand for a public key so um what it actually is is wallets are wallets are a public key uh SHA-256'd and then SHA-256'd again and then chopped apart and then rearranged in certain ways and salted with themselves and ended with like a version ID I think um so the data that's actually on the Ledger is not the same as the wallet that you'd see on like a website um and you have to do some work to actually get it Bitcoin script pushes the private key data to

like a stack and then it does like it sends a validation opcode and like that's how it does stuff um Bitcoin script is really similar to the programming language Forth if you're familiar with Forth um this is where script will be script is part of the output um it's the data section of the output think of it as the data section of the output there like I said there are lots of different script opcode patterns there's about like really 99.99% of transactions in the blockchain are going to fall under about six different patterns um the library that I'm using to parse the blockchain Pi blockchain the guy bless his heart um but he only implemented one pattern so

there are so I was like okay and it was because I mean like I said he had literally committed this thing to to GitHub like a week before so I know he's like implementing more and I chatted with him a little bit um uh but he only implemented one pattern so he would have really only gotten like you know 10% coverage on all the transactions so I ended up having to actually crack open the source code and write and like basically fork the library and write a couple I had I ended up writing two more patterns into it like the ability to recognize two more patterns and so with three of the six patterns you end up getting

99.5% coverage um I am missing data so when I did crunch through everything I got 99.5% coverage of the entire blockchain of all transactions um I am missing data but unfortunately everyone's just going to have to deal with it um and I didn't do anything there's also multi-signature transactions which I didn't even try to screw with because that's just too hard for me or it was at the time the change problem that I was talking about before so I have a Bitcoin wallet with 20 Bitcoin in it and I want to send five Bitcoin to Kanye West um so what I'm going to actually do is I'm going to cryptographically sign five

Bitcoin to Kanye's public key which is well wallet address which is just a different form of a public key and then I'm going to cryptographically sign 15 of those Bitcoin to my own public key or to a different public key if I want to be even sneakier and have the private keys for both of those right so like I said there is no state there's no there's no like transaction when you make a Bitcoin transaction it's it's not like minus 5 me plus 5 you it's that I'm basically I'm going to chop it apart and then I'm going to say I'm giving myself the next one because everything is a linked list of linked lists right um if you're only

looking at the outputs if you don't have the inputs then change transactions and non-change transactions are indistinguishable there's no way to know the difference so you need the outputs to figure or you need the inputs to figure that out uh you need the entire chain to solve this right because you need all the inputs if you're going to be like have a decent look at the outputs um and like I said if I have um because of this problem that I'm talking about right now if I have $100 million worth of bitcoin and I buy a coffee for five dollars if you're just looking at the outputs on the ledger it's going to look like I'm sending somebody $5 and

somebody else is receiving 999,999 whatever like a ton of money right but they're not um the sender problem so this is the sender wallet this is not in the blockchain this is not in The Ledger there is no sender field um you have to derive it um it sucks you have to actually derive it by taking the sender wallet or the receiving wallet of the output that the input is referencing in its previous hash field um and it sucks because everything is a chain so that's how you have to derive this whole thing um my coworker Chris oner wrote this giant big bastard of a SQL query that actually made all those connections for me um but

I realized in retrospect I probably should have used a graph database for this I I definitely should have used a graph database for this but I didn't really realize it at the time um when I did that giant SQL query to like map like link everything together when my when my coworker did rather um I I straight up didn't have enough RAM in my machine to build like the view table for that so I overnighted 32 gigabytes of RAM to my house um and uh put it in my computer for like and it it still didn't work it was the dumbest $200 I've ever spent in my entire life um but so basically I did everything um

and I just didn't my South Carolina internet sucked too much so I ended up having to like like my roommate ended up taking me to his to his office that had gigabit fiber and I like uploaded my whole data set I compressed it and I uploaded it to S3 and I rented this giant AWS server that had like 300 gigabytes of RAM that cost like $8 every hour and I was like I was like screaming I was like like typing everything in cuz it was costing me like $8 an hour I'm like this has to go as fast as humanly possible um so anyway I ended up I did it I I made the query I like linked

everything together and then I just ripped out the database and pulled it back down um and then the amount field I was talking about before the amount of Bitcoin the amount of Bitcoin in a transaction is actually in the blockchain it's in the outputs it's in satoshi it's not in like Block it's not in Bitcoin it's in satoshi there's a 100 million satoshis to a Bitcoin that's all you need to know um and it's here in the outputs then the next thing um which was a little bit of the of a pain in the ass um was the historical USD exchange rate like I was saying like Bitcoin the price of Bitcoin is not is not anywhere in The blockchain Ledger
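
The amount arithmetic above can be sketched in a few lines. This is an illustration only, not the speaker's function: the names `satoshis_to_btc` and `usd_value`, the shape of the `daily_usd_price` lookup table, and the choice to treat a day with no price as zero are all assumptions, and the price in the usage lines is a made-up placeholder rather than real market data.

```python
from datetime import datetime, timezone
from decimal import Decimal

SATOSHIS_PER_BTC = 10 ** 8  # 100 million satoshis to a Bitcoin


def satoshis_to_btc(satoshis):
    """Convert the integer satoshi amount stored in an output to BTC."""
    return Decimal(satoshis) / SATOSHIS_PER_BTC


def usd_value(satoshis, epoch_ts, daily_usd_price):
    """Value an output in USD using the price on the day of its block.

    daily_usd_price is a hypothetical dict mapping 'YYYY-MM-DD' (UTC)
    to a Decimal USD price per BTC, loaded from some external source.
    Days with no known price are treated as zero (an assumption).
    """
    day = datetime.fromtimestamp(epoch_ts, tz=timezone.utc).date().isoformat()
    price = daily_usd_price.get(day, Decimal("0"))
    return satoshis_to_btc(satoshis) * price


# Placeholder price table, NOT real market data.
prices = {"2009-01-03": Decimal("2")}
print(satoshis_to_btc(500000000))
print(usd_value(500000000, 1231006505, prices))
```

`Decimal` is used instead of floats so that summing many small amounts does not accumulate rounding error, which is a design choice of this sketch rather than something stated in the talk.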

it's actually going to be derived it's however much money uh people are willing to pay for it at the time that's what determines the market value um given I needed to have a function that basically I wrote I wrote a function that given an epoch timestamp give me the Bitcoin USD value at that time convert it to a date rip that out of like some data source and then I yanked down um the market basically the historical USD pricing data for Bitcoin off of blockchain.info at this link so take the given satoshis divide that by 100 million multiply that by however much Bitcoin was worth at that time and there you've got it and then that was it that

was everything so I ended up with this big uh list it was like a CSV and you'll see here like all the you can't really see it but all the USD values are zero because that was in 2009 when no one would have spent anything for Bitcoin um uh right when it came out um this is actually just the outputs but I'll go in here more uh my parser source code's on my GitHub uh my GitHub is the repo is still private I need to unprivate it I'm going to do that any day now I promise if you want it in between now and when I unprivate it um just just email me um or tweet at me or whatever and I'll send it

to you uh it's written it's in Python 3.5 it uses Python 3.5 doesn't use it won't work with Python 3.4 it certainly won't work with Python 2.7 um it's slow I tried doing the multi-threading it kind of worked kind of didn't because I don't actually I don't know multithreading because I'm not a good programmer so then I needed to load everything into a DBMS um uh I decided to use something called ClickHouse by Yandex Yandex is the Russian search engine um why well because one of my co-workers recommended that I do it like that um and he's smarter than me and he was right it's awesome it uses something called views a view like a database view is basically

just uh the output of another query being treated as a table um so basically I linked everything together with a view output exported it and it was basically like a table that was just the output of another more complex query uh the only downside of ClickHouse is that it can't make queries that don't fit into RAM uh so then I had it I had about 100 gig plaintext CSV it took 13 hours to parse out and it took 15 minutes to load into the database this is what loading the inputs and the outputs looks like um then I had to link it together this was the thing that I was talking about before um this is the thing that I told

you about uh I actually ended up going to a my friend's room uh my roommate's office spinning up a giant AWS server etc etc this this is the SQL query that's one query uh so it was huge to like link everything together my coworker Chris Doner wrote that because he's the man um this is the output of it in ClickHouse so this is when it actually rips everything apart I've got txid payer payee satoshi amount epoch timestamp USD amount and then like a date format so I had it had it in a database oh thank you had it in a database uh I'm going to blow through the next ones because I'm a little short

on time um this is what it looked like this is what it looked like 1.4 billion records uh there were so how many transactions do you think there were oh yes okay how many transactions do you think there were that are over a million dollars of Bitcoin because I have the USD value over a million dollar transactions just take a guess 50,000 is insanely close 60,000 60,000 transactions over $1 million 404 transactions over $10 million and I'm sorry 414 thank you and seven transactions that were over $100 million over time with Bitcoin biggest US dollar transfers of all time I mapped this out and graphed it a little bit um the biggest one ever was

$147 million on November 22nd 2013 somebody wrote an article about it it was so big and the article was literally titled a shitload of money moves in one Bitcoin transaction that is what the title of the article was um there were 140 million trans there have been 140 million Bitcoin transactions largest unredeemed Bitcoin transaction this is a $34 million transaction that has been unclaimed for two years it's just sitting there I've got other stuff like the days the days that the most US oh yes the dollar value that is current value yep so it is now worth $34 million and it was traded two years ago I'm sorry oh then oh because Bitcoin was about twice as much then yeah uh days

where the most US dollars were traded on 2000 on January 24th 2016 $27 trillion of Bitcoin were traded the day before $1 trillion of Bitcoin were traded uh the day before or no I'm sorry March 5th 2016 $7 trillion moved uh I didn't actually finish the slide um and I forgot to I'm sorry but you can look at things like how much money has been donated to like WikiLeaks Pirate Bay WikiLeaks got $125,000 on one of their Bitcoin wallets The Pirate Bay got $3,000 um I thought that was funny at first I was like The Pirate Bay probably would have gotten more but then I remembered the people that go to The Pirate Bay they're kind of cheap so that

kind of makes sense um a lot of other questions a lot of other questions were things like you know what is the average amount of transactions per month these are just different things that you can ask that I haven't gotten up to writing queries for yet and finally who the [ __ ] has my $15 million of Bitcoin so let's go back I basically wrote this big giant query that said show me all of the transactions show me the biggest transactions that happened in the month of February that was what I did uh there were not there were no transactions there were no one single transactions that were worth $15 million in February I found an $8 million

transaction that happened four days before the tweet and it actually looks like it went into a tumbler the day that Martin Shkreli sent the tweet to Kanye West um but I I I mean I can't I don't know that it has anything to do with him I just that's just the closest thing that I could find it was worth $8 million at the time right um this is it like being ripped apart being split in half like a bunch a bunch a bunch of times money going into a wallet and then it continues to do that if you actually browse this transaction um if you guys want to check this transaction out highly recommend it and

just click through and make your own decisions um it's in the it's in the ppin thing what if the transaction was split into multiple smaller transactions right um what if the guy was tricked into sending his Bitcoin in uh 10 different transactions um so show me the the wallets that received the most money total in the month of February right oh I've got one minute uh I didn't find any all of the ones that I found were basically um they were all high cash flow wallets that were that had money coming in money going out for a year or so so I don't think any of them were this I did find a $12 million transaction two days after

the tweet which was here um and then it was claimed about five months later um future work I want to open a public website blah blah blah build a front end make my parser code suck less uh watch everyone like watch what people do with donation money build an API build an learning engine blah blah blah law enforcement could use this intelligence community could use this fintech could use this uh there there's anti-money laundering use cases um investment use cases for hedge funds there's some evil uses in here you can use it as a targeting platform and find rich people and steal their money you violate people's privacy I don't know um blockchain is cool but

it's not formatted to make queries like we want to uh so we ripped it apart put it in a database asked it a question and it was awesome did anybody move $15 million around the time of the tweet not in one transaction I saw a couple things that were kind of close but maybe my time range is off I don't know am I de-anonymizing the blockchain by doing this because somebody asked me that at some point my answer is no because if you use that logic Google de-anonymized the blockchain by creating a search engine um or de-anonymized the Internet by creating a search engine so that's [ __ ] um database can do whatever you

want it to you can make SQL queries against it answer a bunch of questions correlate price of Bitcoin with other stuff etc that's it that's my talk does anybody have any questions
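
As a rough illustration of the "rip it apart, put it in a database, ask it a question" idea, the sketch below derives the payer for each output by joining every input back to the earlier output it spends. It is not the query from the talk: SQLite stands in for ClickHouse only because it ships with Python, the `inputs` and `outputs` tables and their column names are hypothetical, and the rows are a toy version of the 20 Bitcoin wallet paying 5 Bitcoin to Kanye with 15 Bitcoin of change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE outputs (txid TEXT, vout INTEGER, address TEXT, satoshis INTEGER);
    CREATE TABLE inputs  (txid TEXT, prev_txid TEXT, prev_vout INTEGER);
    """
)

# Toy data: tx 'aa' funds a 20 BTC wallet, tx 'bb' spends that output,
# sending 5 BTC to 'kanye' and 15 BTC back as change.
conn.executemany(
    "INSERT INTO outputs VALUES (?, ?, ?, ?)",
    [
        ("aa", 0, "andrew", 2000000000),
        ("bb", 0, "kanye", 500000000),
        ("bb", 1, "andrew_change", 1500000000),
    ],
)
conn.execute("INSERT INTO inputs VALUES (?, ?, ?)", ("bb", "aa", 0))

# The payer is not stored anywhere: it is the address on the previous
# output that this transaction's input points at. A transaction with
# several inputs would yield one row per (input, output) pair here.
LINK_QUERY = """
    SELECT o.txid, prev.address AS payer, o.address AS payee, o.satoshis
    FROM outputs AS o
    JOIN inputs  AS i    ON i.txid = o.txid
    JOIN outputs AS prev ON prev.txid = i.prev_txid AND prev.vout = i.prev_vout
    ORDER BY o.txid, o.vout
"""

rows = conn.execute(LINK_QUERY).fetchall()
for row in rows:
    print(row)
```

Note that the change output still shows up as a payment to a second address, so telling change apart from a real payee needs extra heuristics that this sketch does not attempt.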

yes the question is do I have any um plans to do it with other cryptocurrencies like Litecoin etc um I don't have any plans to do it but it would be insanely easy to do because really the only thing that differentiates most of those cryptocurrencies is they use a different network identifier like they have their own thing and they have a different network identifier in the in the in the protocol it would be the exact it would be probably almost the exact same thing with minimal tweaks but no I haven't done that yet um but you can yep yes uh the question is any interest in doing this with side chains or private chains um I don't actually know anything

about either of those um I would think a private private chains have trusted third parties though don't they they're like hosted by like a multiple group of people yeah private um I don't but that's just because I really don't know anything about those I'm open to it though any other questions nope um uh that's it that's my talk um this is my contact info uh thank you guys for coming to my talk