
Hi, my name is Schami. I am a Java developer by day and a budding pentester at night. The context and the general storyline we're going to run through here came about while I was working on my degree thesis, which involved everyone's favorite subject: parking tickets. Knowing that you probably wouldn't put up your hand if I simply asked you, can everyone raise their hand, please? How many of you do not own vehicles? Put down your hand. How many of you have received a parking ticket in the last year? I'd say not too bad. How many of you have appealed successfully? Right, okay, that just gives me some context here.
So I don't need to talk in much detail about the general perception people have of what parking tickets are about, what their actual aim is, how much effect they have on road congestion, and the like. But you do tend to get an indication when even the government has put up a position and is subtly saying, hold on a second, this might be going a little too far. Parking tickets are the bane of our existence as travelers. From memory, last year I broke down and had to park my motorcycle on the side of the street. I paid for a
weekend for this stay, not knowing when I would pick it up. I returned on Sunday and I had got a parking ticket for each day, because there was a little sign on a lamppost somewhere which said that they were doing work on that street. So I thought I would put up some sort of fight and see what I could do about this, so I built a parking ticket appeal system. This is a snippet of it, where I was looking at, when people receive parking tickets, how many of them are successfully appealed, what the ratio is, what arguments they are using, and, if you have multiple arguments, which one would
give you a better chance based on your localization, so where you are and what sympathy the council has, perhaps. I was exploring that data in detail. Now, generally there is a lot of play people put into appealing their ticket. Some of them can be quite harmless; some of you may have seen this before. But this isn't always the case. Traffic wardens do tend to face violence in the street. You may have come across it yourself, but they tend to face harassment, physical violence, and the like, so it isn't always a pretty picture when it comes to interaction with these individuals. Now, just to give you a background of what I was looking at, and
for the purpose of this presentation: I had 100,000 parking tickets for a local council, in this case Sheffield Council. It's probably the first time I've mentioned it in public. I cleansed 40,000 of these for the purpose of this presentation, and I spent many, many hours in Excel trying to make sense of them, because that seemed the most appropriate way. The way this data came about was a Freedom of Information request, so this wasn't a leak of any sort; the council legitimately released this data. I cleaned it up and contextualized it so I understand where this data is coming from. So for street X, I want to make sure it's located
in Sheffield rather than picked up from anywhere across the globe. I worked on geocoding it using Bing, which seemed to be the free way of doing it, bringing it back into Google, and it resulted in the following. What you have here, if I skip the slide: the user comes to this system and gives me a context. They will say, I received a parking ticket on day X at this time. I will return to them the officer ID, and then I take the officer ID and start looking at the record and the historical data on this officer, and I can work out and tell you the best place you should go to meet this
officer again. Now, traditionally people tend to keep track of their tickets. I've seen quite a few people have a sort of glory wall where they put up all their tickets, or it's a matter of just going through your transactions. So it's not really that challenging to say, two years ago you received a parking ticket, can you give me that data, and I can work out what's happened with it. You can see here a snapshot; this is the actual data, cleaned up, quite openly there. You've got an ID associated with each ticket. I believe I was going through about 40 officers or so, 40 or 45. If I take just a snapshot, I don't
know, I imagine it's quite clear here, but I've broken them into sets of weeks and I've looked at a period of over 6 months here. You can see there is a deviation here and there, but all in all we're looking at about an hour's range for this particular street for the officer in question. So there is a consistent pattern: we know where they will be, at what time, for a consistent period of time. Next step: let's place it on a map. Again, I've taken snippets here of how I was visualizing it, but I started looking at what route this officer takes. Perhaps if I can't tell you necessarily precisely where they
will be at one time, I can tell you the region, or I can tell you where I'd have more confidence they'd be in a later period. I'm just tracking their movement. And here we have a problem. We know people can exhibit some sort of violence. We're aware of vigilante groups when it comes to this sort of thing. And then there's simply the thrill of stalking: you know there is nothing malicious, there's nothing you can be charged with, but you can have a sort of truckload of people just following this officer around after one of them gets pretty rowdy at a pub and just decides to, you know, have a nip at it
before you know it, blogs are created on these officers, memes are born, and it's mayhem. Taking this a step further, what I've created here is an API for this data set, so it's absolutely trivial to turn this into a working system where I open it up and say, give me an ID and I'll give you everything, or not even that, just give me a time and a date and I'll give you what you need. Secondly, some of you may be familiar with Freedom of Information requests. A lot of the time, people put through a request and say, council X have given us this data, so therefore it
does fall under the requirements or the standards, and therefore give it to us. So there is a precedent here, where you can say Sheffield's given it to us, and I can go around all the councils and start bringing in this data. There also tends to be a shared architecture, so once I've built the technical side of consuming this data, I already have a process in place, and it can be quite trivial and easy to do. Where does that lead us? Even if personally identifiable information has been held back, contextualizing it and cross-referencing it can give us a lot of interesting information. In fact, I almost
saw it as the better way. It's kind of more fun when you don't actually know who the officer is: you're given an ID and you go on a sort of treasure hunt to figure out who it was who gave you this ticket. The interesting one here, which I found doing my research, was AOL. A couple of years ago AOL released a data set. Again, this wasn't a leak of any sort; it was deliberately released for researchers to collaborate and work on analyzing it. What happened there was, you take a set of data points and you say, someone searched for a unique film; how many
places are actually releasing that film at the moment? And you look through their search history, and then they talk about garden tools, and then they talk about hiding transactions from their wife's sight, and you can narrow this down and start figuring out who these people are and what they're doing, even though there isn't actually explicit information there. Another good example of this was a Harvard professor who picked up 40% of some statistics on DNA and was able to figure out an identity for them. This leads us to an interesting area in cryptography called differential privacy. Now, I was asked to actually put up a visual for this, but I'll hold back
on the maths; I couldn't find anything remotely easy on the eyes. But the concept here is actually a simple one. Essentially what it says is: you could be affected by the outcome of a study, but whether you choose to participate in the study or not shouldn't impact the outcome of that study. That sounded a bit of a mouthful, but an example of this would be if I were a smoker, which I'm not, and I was asked to engage in an interview with an insurance company who's doing a policy review, and they're trying to understand relationships and how smoking impacts your life, etc. Now, it could be, as a result of this
study, quite likely that insurance premiums would rise, and as a result of me being a smoker I would be affected by that rise. But whether I chose to participate in that study or not, the effect would still be there; the rise would still happen. So you're trying to create a context whereby, whether or not I choose to engage in this study, the outcome is still the same, for good or for bad. Now, you may have noticed earlier I used user 927, and there was a reason for that. Going back to the AOL case, it became so fascinating to people, the search history, and one user in particular, user 927, had a staggeringly
interesting background, so much so that it actually developed into a theater production called User 927, which was based on the AOL data output. So if a theatrical production of your user data is not the kind of thing you're after, you probably want to start thinking, when you do release data, which is a positive thing and an interesting way to benefit from it, about being wary and focused on making sure that you're anonymizing it. Luckily for us, there are various tools to help us do so. Cornell have a couple of tools, and there are various other tools on the market, and just using some sort of
sensibility check, appreciating whether people can cross-reference it and what else is out there, perhaps not now but in the future, and making sure it's future-proof in that sense, which is something people disregard as well. That leads me to the end, and before I end off I just want to thank Derm, who actually came in wobbling all the way from Ireland for this talk and was my mentor on this. So thank you very much.