
BSIDESLV 2018 - Common Ground - Day One

BSides Las Vegas · 1:30:32 · 272 views · Published 2018-08

With the PRTTs and so forth, we say the application was made "pursuant to." Here, instead of saying pursuant to 32 or 33, it says pursuant to the FISA Act, then something classified. The information is either still classified or, if you look at the little markings next to it, those are for redactions. Under the Freedom of Information Act, when information is released you can hold back information for certain reasons. So b6 is a common law enforcement one. I can't remember what b1, b3, and b7 are, but you can pull up FOIA and see. So they are basically asserting that they can't release this information due to these. Yes. Well, we can't see what the specific statute number is, because it

might tell us how many different sets of authorities are underneath, because they're numerical. Yeah, so it's 50 USC instead of 18, and it may be sections, you know, 1, 2, 5, and I'm like, oh, there are at least 5 sections, because they go numerically. No, I don't remember what 50 USC is; it would be one of the national security ones. Yeah, I actually thought this would be under 18, but it's not. So we have to assert that Russia is a foreign power. I love how, with some of these things, it's just asserting things where you would be like, well, duh, very basic facts, because the statute

requires it. So I'm saying you probably have to assert that because it says it can only be used against a foreign power. So we can essentially reverse engineer what the statute is requiring by looking at this warrant, at what they didn't redact out. And we say Carter Page is an agent of Russia, so that must be required. "The minimization procedures proposed in the application have been adopted." So with FISA and 702, the minimization procedures are this whole thing. I don't even want to touch it, because it's a hairball. You should research it, it's cool stuff, but it's a hairball. And then it has to say, similar to what we say in a lot of the warrant ones,

that all the certifications required by statute have been met here. Notice again they blacked out the statute number, so we can't tell which statute is requiring these statements. We can see it says "not clearly erroneous," so this is probably another statutory requirement that is in there; that's not text that we usually see in a warrant one. "Pursuant to the authority conferred on this court," and it's going to say the United States (I'm at the top of the second page) is authorized to conduct [blank] of the target as follows. And this is saying, like, you can get the email subject lines, you can get the URLs, whatever they're actually getting here, because we don't

know it, because it's been blacked out. I didn't go through the whole thing, but it was kind of interesting going through all this and seeing the pieces. It's pretty long. I think it might have been released by DOJ. You might be right; I don't remember.

It's so blacked out. Yeah, I mean, I was mostly interested in it because I never particularly thought this through, but it makes perfect sense that this is facially similar to the non-classified warrants that we handle for regular wiretaps. So this is just a few notes about 702. It allows targeted collection of content, including "about" collection: to, from, and about the target. We can do a couple of different hops around. Backdoor searches are basically where we already have this data available, the FBI has a target, and they want to go see if there's anything about it in the already collected data. From the last ODNI transparency report, there are one hundred and twenty-nine

thousand eighty individuals, groups, or entities who are targets. So after we just went through all these statutes and I said, you know, case law doesn't really have much to do with it, we're going to touch on the All Writs Act, which is kind of case law and is a very fuzzy, weird, wiggly little area. It is kind of a catch-all, especially because all these other things have very specific requirements. Way back in 1789 they decided, we'd better make sure we can always get the information we need, and we're going to make this All Writs Act. It was actually enacted to enable courts to always be able to go get the information they needed to pursue essentially their

role. So 99.999% of the time that this law is used, it's by a court basically saying, I need you to go do something, for some really arcane, small land dispute or something. It's not a big deal. It gets used all the time. Most states also have a version of this for their state courts. But it got used for the Apple iPhone unlocking a couple of years ago. Essentially the FBI had the phone, they wanted the phone unlocked, and Apple was like, sorry, we can't unlock that for you without writing custom software. It didn't fit in a box like a wiretap or stored content or something. They wanted Apple to go affirmatively write some

new software to do this, and there's not really any provision to do that in our current wiretap statute and so forth. So essentially the All Writs Act says anything "necessary or appropriate in aid of their respective jurisdictions" and, I like this wording, "agreeable to the usages and principles of law." I'm not sure how that says I'm going to require a software engineer to go write some code, but DOJ said it did. So United States v. New York Telephone is a case that predates CALEA. Essentially they wanted to install a wiretap, and because CALEA didn't yet exist, the phone company was like, sorry, we don't have the tech set up

to tap this; we can't really help you. And they used the All Writs Act to force essentially the technical assistance that is now codified in CALEA. They said in this case, basically, we agree that the federal courts should not just require third parties to do anything. They were kind of leaning on it then, because they're like, well, clearly we have this wiretap law we should be using; you, phone company, are just being annoying, so we're going to force you, because it seems kind of clear to us they should be allowed to do it. And so essentially what happened with the All Writs Act case with Apple was, you know, they found some

other way to get in, because software is always buggy and there was such a backdoor to it. But they have not said they won't try this kind of thing again. So if there's another fight over this sort of thing, and we do not have essentially a new version of CALEA that would require writing software or doing other work to affirmatively unlock phones or other devices, the All Writs Act could come back into play again. So these are resources. There are a lot of very boring government manuals up here. I will definitely tweet out my slide deck if you don't want to try to take pictures of super long URLs. The Lawfare blog in

there is a very conservative blog, but they have people who really know what they're talking about, and so I find it a somewhat interesting read. The thing at the bottom is a book. When you go to law school, you actually really learn law through the supplements, and this is a very good hornbook that basically taught me crim pro. And the top book is an insanely expensive book by David Kris that, if you have access to a law library that has it, will basically teach you FISA and so forth. It's very good, but it's like five hundred dollars for two volumes or something insane, so try to find it in a library. So I think I

have time for a question or two. Okay, oh wow. Is this basically the same for wireless surveillance? Like if I was going to bug somebody; this is mainly wire. How would you be bugging? Like, you individually doing it? Yeah, well, I mean law enforcement. So my question is, this is mainly wired; would the laws be that much different for wireless? This applies to cell phones. Just to cell phones and stuff? Yeah. Okay, no, not wireless; like electronic surveillance, like I actually have a spike mic pointed at you. So it technically falls under this, yeah, any of that done by law enforcement. I mean, if you are one person

recording, you need one-person consent or two-person consent, depending on your state. Anything outside of that is criminal and you should not do it. Please don't wiretap your neighbors. So you noted that the wiretap sort of extends: if I swap the SIM in an existing device, then they don't have to go and get a whole new wiretap order, right? They can continue to use the existing one even though the phone number changed, because the device identifier is the same? Yeah, whatever they're keying the identification off of, if it shifts, you can often follow it. So does that still apply if you get a new device, so the IMEI number changes? I'm going to say probably. I

think it may depend on the company. I didn't handle ones that were like that, so I don't want to give you a definitive yes or no. That may come down to company policy; I don't think there's case law on that. So it seems like a lot of the reasons that you were kicking things back were sort of bureaucracy, in that the wording has to match exactly. Can that be gamed from the perspective of someone who may be a target of this? Is there some way to increase your chances of being a person that makes it through? Yeah, make the law clerk drafting these get tired and forget to check the statute. But I'm pretty sure

that most courts use... A SQL injection in your name? Ah, oh man, we saw some funny ones, because you don't usually have names; you have "known as" and whatever, their handles or whatnot. Yep, most courts are basically going to have, on the C drive in the folder "orders," a PRTT folder (I think I have a PRTT with a D order), and they're going to fire that sucker up and basically do a find/replace. So it's not like they're drafting it from scratch every time. That was an awesome talk. One question: you were talking about mobile phones so far, but does this also apply to mobile phone technologies, for

example as used in connected cars? It would apply to anything that has an internet connection. Essentially, if you can get to a provider and serve an order on them, then in response to the order they can package that data up and send it to law enforcement. So my question is with regard to modification of data that is passed over the wire. Does this only cover, well, you are allowed to record everything that goes over this wire, or are you also allowed to modify the data, for example enforcing an SSL man-in-the-middle or something like that, to be able to access the data? Because, for example, Australia just changed the law. If you're

using Tor or Signal, the government is no longer able to read that if they want to wiretap you, so they now have the right to install a backdoor on your phone. They may even break into your house to install it there, in order to get access to the data. So are they allowed to modify your device and all your data stream as well with the wiretap? I'm going to not answer this one. Okay, maybe I'll move you to an easier question. So how does law enforcement decide to whom to issue the subpoena? I mean, there's no one service provider that's just operating in a vacuum. Everybody's using, you know, AWS and

Google and all these other pieces of infrastructure, and other vendors that may have access to the same information. So how do they choose who to go to when it's a long chain of providers? They are, I'm going to guess, going to go to the ones they're familiar working with. Some of these companies have very large trust and safety programs, and they hire outside counsel and so forth. This is where I did this kind of work. Those are probably easier to work with, and they may get a reputation for being easier to work with, and people go there. I don't know. I think I've seen some recent laws allowing ISPs to possibly monitor and/or alter data going through the ISP. Can you

talk about the ISPs being allowed to modify data, from six months ago? Sorry, so I'm trying to figure out: law enforcement gets a wiretap order asking the ISP to monitor or to modify the data? Or they have no orders and they're getting stuff outside an order? Right. I'm just going to say that no client I have worked for would do that. I can't say who my clients were, but that's not going to be a common practice among people who have the big trust and safety groups or whatever, because essentially you're hiring attorneys to do this checklist, because if you don't, it's essentially illegal wiretapping, which carries very serious criminal penalties. So if that's

happening, my guess is maybe there's some kind of 702 classified whatever thing going on, but I don't know. Also, just to add to that: I'm supporting a client of mine in Australia right now who's looking at a civil action relating to copyright infringement and DNS blocking, which has now actually become commonplace in Australia. Hit me up sometime. Yeah, and copyright enforcement stuff is going to be entirely different from wiretap. Anyway, my question was: going through it, it sounds like law enforcement has a whole heap of Word documents that they just

Ctrl-C, Ctrl-V. Has there been any view to automate and accelerate that process at all, and even clear up some of the data validation piece? Well, the data that I would see would come from the courts, and courts are notoriously tricky to do nice whiz-bang technology with; see PACER. I do think these various agencies have various ways to do this kind of stuff. If you go read these manuals, like DOJ CCIPS, they're going to tell you exactly how to create and draft these applications. These are very interesting to read. They'll tell you when you're allowed to use this, what type of investigation, and so forth. As far as automating the legal

technology, the attorneys who do it are definitely interested. They may be doing some automation. I had some, because I'm a programmer and this is really boring work. But it's a really high bar to make sure you're always correct. There's not a lot of room for screw-ups, and you know how many bugs happen; you don't want this fully automated. Thank you. Okay, so this will be the last question. I'm going to be outside at lockpick if you want to talk to me more. Some people alluded to this, but can you do this under any spectrum, even without the internet? There's 900 megahertz, 2.4 gigahertz, visible light, radar. You would need a service provider to serve

on. So if there is no service provider, they don't have the authority to create a system that intercepts? Uh, that's a good question. They probably do. I'm not that familiar with those; I'm familiar with the things the service providers get. So that's a cool research question you should go look at. I suspect they probably can; I just don't know off the top of my head, standing here with no legal research tools. So,

yeah, you want to be very careful about recording anything off of this, because there are a lot of laws in this area that are very finicky and carry very serious penalties. So, well, thank you everybody.



All right, good evening everyone, welcome to Common Ground. This is "You're Just Complaining Because You're Guilty: A Guide for Citizens and Hackers to Adversarial Testing of Software Used in the Criminal Justice System," by Jeanna, Nathan, and Jerome. A few words before we start: we'd like to thank our sponsors, especially our stellar sponsor, Rapid7, and our other sponsors Amazon, Oath, and Assembly. It's with their support and the volunteers that we make this event possible. As a courtesy to our speakers and the audience, we ask that you silence your cell phones, since this is being recorded. Without further ado, let's get this started.

How about that? Oh yeah, okay. Someone in the front row said that the mic wasn't working, and he was totally right. Let's start this again. I'm super thankful that you guys are all here to listen to what I think is a super important topic for citizens of the modern world and for those of us who want to understand how technology works, in other words, hackers. It's really important to understand the role of software in the criminal justice system, and to have people looking into how it works who are actually going to look for problems, not just the people who have a vested interest in showing that it's working great. I'm Jeanna Matthews. I'm a computer science

professor at Clarkson University. I'm also a fellow at Data & Society, and I'm joined by Nathan Adams, systems engineer at Forensic Bioinformatic Services, and Jerome Greco, an attorney at the Legal Aid Society. So software is increasingly used to make really important decisions about people's lives, from housing to hiring to how we find friends and partners, how we navigate the city streets every day, how we get our news. The criminal justice system is just one example of this. The weightier the decision, the more crucial it is that we can understand and question the decisions that are being made. What inputs are being fed into that decision? How is the decision made? Is it correct? Do we have other information that should

be considered? Are we using protected attributes like race and gender? Or what about proxies for these characteristics that are just as effective as the actual characteristics themselves? In the criminal justice system in particular, software, complex systems, and algorithmic decision-making are used throughout the system, and they are very often black boxes for which trade secret protection of the intellectual property is considered more important than defendants' rights to understand those decisions. And not only that, there's a lot of evidence of problems. How are we going to find bugs and fix problems if the answer is always, "You don't get to question that, you're just complaining because you're guilty"? That does not sound like the recipe for a society

we want to live in. In particular, can you imagine being sent to prison rather than given probation because proprietary software says you're likely to commit another crime, and you're not allowed to ask how the software made its decision? That happened; that's Loomis v. Wisconsin. Or having the primary evidence against you be the results of DNA software, but one program says you did it and another says you didn't? That's the Hillary case. Or being accused of murder solely because of DNA transferred by paramedics to the scene, not by you at all, but they don't even figure out that that was what happened for months? That was the Anderson case. And those are real-life situations. Those of us who

work in technology know that software and complex systems of all kinds need an iterative process of debugging and improvement. Am I right? Amen, right? Anyone who uses technology knows that there are glitches and bugs and unintended consequences, and there are huge advantages to having independent third-party testing that's aimed at finding bugs, in other words, adversarial testing. If only those with an interest in the success of the software see the details, we have a huge problem and a recipe for injustice.
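As a rough illustration of what testing "aimed at finding bugs" can look like, here is a minimal sketch of a differential test in Python. Everything in it is a made-up assumption for illustration: `tool_a_score` and `tool_b_score` are hypothetical stand-ins for two black-box programs, the two inputs `signal` and `noise` are invented, and `THRESHOLD` is an arbitrary cutoff. It does not model any real forensic software; it only shows the idea of feeding two tools the same inputs and hunting for cases where they reach opposite conclusions.

```python
import random

THRESHOLD = 0.5  # assumed decision cutoff, chosen only for this sketch


def tool_a_score(signal: float, noise: float) -> float:
    """Hypothetical tool A: ignores noise entirely."""
    return signal


def tool_b_score(signal: float, noise: float) -> float:
    """Hypothetical tool B: discounts the signal when noise is high."""
    return signal * (1.0 - noise)


def find_disagreements(trials: int, seed: int = 0):
    """Differential test: give both tools the same random inputs and
    collect every input on which they reach opposite conclusions."""
    rng = random.Random(seed)
    disagreements = []
    for _ in range(trials):
        signal, noise = rng.random(), rng.random()
        a_says_match = tool_a_score(signal, noise) >= THRESHOLD
        b_says_match = tool_b_score(signal, noise) >= THRESHOLD
        if a_says_match != b_says_match:
            disagreements.append((signal, noise))
    return disagreements


cases = find_disagreements(trials=1000)
print(f"{len(cases)} of 1000 random inputs produced opposite conclusions")
```

Each disagreement is a concrete input an independent tester could take back to both vendors and ask about, which is only possible if someone outside the vendor is allowed to run the tools at all.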

So black boxes, proprietary software, and trade secrets are increasingly becoming a problem in the criminal justice system, as Jeanna was just explaining. I'm going to go through some examples of how that is happening. Obviously I can't go through every possible example or every possible technology, as there are a lot and we have a very short period of time. Hopefully I don't talk too quickly and you'll understand everything I'm trying to say, as I will try to get across a lot of information. Just to start with, this is a chart from OSAC which gives an idea of all the different forensic disciplines that come into play in the criminal justice system, some of them

more scientifically reliable and accurate than others. That just gives you the playing field that's really at work here. For our purposes, we broke down a lot of technologies into four different categories, but some of the technologies can fit in multiple categories. In particular, the line is really blurred when you talk about evidence assessment and evidence gathering; a lot of different things do both, and sometimes they kind of work together. We'll talk a little bit about that shortly and give you an example of at least one technology and cases in each area. But first I wanted to break things down by secrecy level. There are three different levels that

we've kind of identified. The most secret is "secret," and in that situation law enforcement is saying, we don't even want you to know this thing exists, or if you know it exists, we don't want you to know that we have it. "Secret as applied" is: we have this, but we're not going to tell you how we use it, we're not going to tell you what cases it's been used on, and we're not going to say whether or not it's been used in a particular case that you're working on. And then there's "trust us," which is: okay, yeah, we have it, we used it in this case, but just trust us, it works the way

we say it works. Don't look at the man behind the curtain, don't ask us any further questions, just assume that we're right, because why would we be wrong on something like this? As if they could never be mistaken, which, we'll see, has often not been true. Depending on your jurisdiction, some of these things switch categories, and also as time goes on; as we push more and more for transparency, hopefully we'll see fewer and fewer of these things up there. We're going to start with predictive policing. Predictive policing in particular is the idea that an algorithm or data can help make a decision that used to be left up to human law

enforcement officers' own decision-making capabilities. In theory that sounds great, right? You can remove the bias from this by removing the human element. In reality it doesn't work that way, because if the data you're using to teach this algorithm, or to get this algorithm to work the way you want it to work, is based upon past bad policing, like racist policing or policing based upon class or gender, then the algorithm is going to learn it that way. So, for example, if you constantly police the same area, you'll make more arrests in that

area. So the algorithm starts thinking, oh, because you made more arrests in that area, that area must have higher crime, therefore more law enforcement officers are needed in that area, which then will also increase the arrests, which will then just be circular logic and self-fulfilling at that point. That's not necessarily the best use. Part of the problem with this is that a lot of this software and these algorithms are protected with non-disclosure agreements. The companies will claim that they're proprietary trade secrets and law enforcement shouldn't be allowed to tell anybody about what's going on, and they'll say that the underlying data that they use is sensitive. All these things obviously prevent public scrutiny,

which is increasingly becoming more and more important as this becomes more and more relied upon and people's lives are being greatly affected in this realm. For what reason am I being stopped? Because some program said that I needed to be stopped. And this idea that the computer is telling me, or the algorithm is telling me, so you're removing the responsibility of humans, is very problematic. In terms of evidence gathering, today we're really just going to talk about cell-site simulators and mobile device forensics. A cell-site simulator, for those of you who don't know, is essentially a device that mimics a cell phone tower, and by doing so it

forces all the phones in its range to connect to it, and then it can very accurately and precisely pinpoint where a phone is actually located, including a specific apartment in a multi-story building, which is what happened in the US v. Lambis case. The reason why I say cell-site simulator instead of Stingray, which is what a lot of you have probably heard it called in the media and the news, is that a Stingray is a very specific model, and all the models have different capabilities and different limitations. So I don't like to say Stingray, because that's a very specific model that has very specific capabilities, where another device like a Hailstorm or a KingFish or

TriggerFish may have different capabilities. Besides being used to track, some of the devices have the ability to intercept content, including voice calls and text messages. We know that the NYPD used one of these devices over a thousand times between 2008 and 2015 without ever getting a warrant. Part of the problem with that is the NYPD still refuses to this day to tell us what model they have, and so we don't even know what capabilities are possible. And again, it's here where they're saying to us, trust us, it doesn't intercept content. We kept this thing hidden from you for years, but, you know, trust us now. That's not how it's working,

which I think is ridiculous. Non-disclosure agreements between the companies that make it and the Department of Justice have been the main way that they've been keeping these things secret, or the main justification for doing so. We're going to talk shortly about a particular case out of Brooklyn which I worked on, where this became a big issue. In People v. Gordon, our client was found in a location that really had no connection to him, and so our response was, well, how the heck did they find him there? How did they know he was there at that date and time, at that particular point? So, besides their using some other cell phone tracking stuff, we

came to the conclusion that they really could have only figured out exactly where he was with a cell-site simulator. In our motion to suppress, we said to them, we know you used this, and if you're saying you didn't, then tell us how you were able to know where he was in real time, and that precisely. Tell us how you did that. The prosecutor in his response conceded, yes, we used it, which was a big win for us, because this was the first time that we're aware of, on an open New York State case, in which we were able to identify that they used

this. Again, to this day, none of us has actually ever been able to hold one or see one or examine the software used in one. Based upon the prosecutor's concession, a few months later the judge granted our motion to suppress an identification in the lineup as what we call "fruit of the poisonous tree," meaning it was directly connected to unlawful conduct, because they had never gotten a warrant to use this. So we were really happy; we thought this was great. But even after this admission, after the judge's decision, and after months of time for the NYPD to correct this, they then came out and said, no, you're wrong, we didn't use one. Well, your prosecutor

openly filed something in court under oath saying that you did use one, and you had months to say that you didn't, and it was only when it became this public thing, where a judge issued a decision that got published, that you're saying you didn't. So even when we've come to that point where we've been able to prove this and show this, they are still denying that they were using it. They still won't admit that it was used in a particular case, even though they haven't been able to come up with any other explanation for how this was able to be done, and the only

basis I could think of is that they're still trying to abide by their non-disclosure agreement. So in terms of mobile device forensics, Riley v. California was the US Supreme Court case that required a warrant to be obtained prior to going into somebody's phone and extracting data. A Cellebrite UFED Touch is a very common device that's used to actually extract data or make copies or images of phones, and there are some other companies, Magnet and Paraben, who are competitors and make similar products. This is not that big of a problem for us in terms of this being used, as this is something that is available to us. My office has one. There is some financial

barrier for a lot of people, obviously, to be able to get one, but at least it's something we can test. We can replicate what they're doing, we can run our own version, and we can see exactly what they're able to do. It's available outside of law enforcement, so it's not as big of a problem to us. There are significant issues with the amount of data that it is collecting, but that's a talk for another day, about overbroad search warrants and judges signing whatever you put in front of them. Where it starts becoming an issue is these secretive processes. Many of you probably remember the San Bernardino shooting in 2015, where the US government

said to Apple, you must put backdoors in your encryption, you must help us be able to get into your phones and extract data. Apple says, no, we're not going to do that. And shortly after, the Department of Justice withdraws this request, withdraws the litigation, and says, all right, we don't need you, we got into the phone. And everybody says, well, you just told us a minute ago that was impossible, so what happened? Shortly after that, Cellebrite created Advanced Services, at the time Advanced Investigative Services, which is a secretive process where the law enforcement agent sends the phone to them at their private lab, they conduct some sort of process where we

don't know exactly what they're doing, and then they send the phone back to law enforcement unlocked, with the encryption and passcode removed. Recently a competitor called Grayshift has created a product called GrayKey, which does something similar, but instead of the phone being sent off to a lab, they actually sell a product directly to law enforcement, and law enforcement does it in-house. But the problem with both of these things is nobody will sell it to me, nobody will let me use it, nobody will let me have access to it. Not just me; I mean me as a representative of criminal defense attorneys, and attorneys and the public in general. And so I have no ability to

actually challenge it, because I don't know exactly what they're doing. I have no ability to know what process they used and whether that process worked. So I don't know if the process may have deleted some really important information that could have been on there and could exonerate my client. I have no idea if it may have changed some metadata. And to be clear, I don't think these companies would do anything malicious or intentional, but if code worked perfectly the way we wanted it to all the time and there were no issues or bugs, this conference probably wouldn't exist. So we all know that that's what happens, right? And there's no way for me to be able to do

that if I have no access to it and everybody says to me, no, it's a secret proprietary thing, but we're going to try to admit it into evidence. When somebody's liberty is at stake, that is obviously a significant problem. So we'll talk a little bit about evidence assessment, and in particular about facial recognition, which is becoming a hotter and hotter topic, especially now that Amazon has gotten itself into the game. As you see there, the redactions are mine, for the most part, to protect our clients' identity. One of the problems we have with facial recognition, besides the fact that there's obviously a really significant problem with the accuracy of

these programs and the algorithms as I think was recently shown it was identifying Congress members and stuff like that for the Amazon one but the other part of it is that we don't know exactly how it's determining what is considered a match and then we also don't know the procedure by which they say this is a confidence level that's enough to determine that this should be admissible in a court so if it's saying we think there's a 60% match is that enough for it to be used to arrest somebody or to go into court is it 70% what is the actual level that that would be and the problem is we're having

law enforcement and these private companies do everything they can to keep that information secret and to keep that information from us and the limited ability that people have had to test this stuff has shown a disparate effect on young people of color being more likely to be falsely identified in this which as a public defender makes up the majority of my clients so that's a significant problem for us because they're already being over-policed and already being falsely identified and now we're having a computer program in which they won't let me see how it works or how it makes its decisions also being used in that space to say that this blurry still

image from a surveillance video matches a mug shot or even a driver's license depending on what jurisdiction you live in if they allow the picture from a driver's license to be used I put up there The Perpetual Line-Up that's a really great report from the Georgetown Law Center on Privacy and Technology which I highly recommend to everyone on this issue so individualized assessments in particular I'm going to talk about sentencing algorithms State v Loomis which Jena had alluded to earlier was a case coming out of Wisconsin the US Supreme Court chose not to take it up so it's not law across the country but it's indicative of the problems that are going on

everywhere essentially in the state of Wisconsin the judge used a program called COMPAS which is a risk assessment tool to determine what the defendant Mr Loomis's sentence should be and one of the things that was taken into consideration for that sentence was gender the idea that men are more likely to be recidivists meaning commit another crime after they've already been arrested and convicted of a previous crime and so with that if you had a man and a woman both situated exactly the same same crime same history same everything else the only difference being their gender the woman would be more likely to get probation where a man would be less

likely to get probation that seems incredibly problematic for a country that's supposed to have equal rights and where your gender and sex are supposed to be irrelevant for that the other problem and this doesn't just apply to COMPAS it applies to the risk assessment tools that are being used for determinations of how much bail should be set if bail should be set at all whether somebody should be getting parole again talking about sentencing here and used in a variety of ways is that a lot of times we don't even know what all the factors are that are being considered or how those factors are being weighed in the consideration in particular for COMPAS how much did

gender weigh into the determination of what the sentence should be was it a very minor thing or was that the majority of it but again we have no ability to access that because they claim proprietary trade secrets and unfortunately the courts have not caught up to this yet and it's becoming a problem with us fighting back against judges with that this idea that something could be a black box that something could not allow defense attorneys to challenge it when somebody's liberty literally liberty is at stake it seems contradictory to everything that our Constitution and our criminal justice system are supposed to stand for and I think the people here will say you

really can't know how well something works until you try to break it and take it apart and here they're saying that's exactly what we don't want you to do from there Nathan will take over

okay so I'm Nathan my background is in computing but I work for a forensic DNA consulting company so I'm surrounded in a world of biologists the forensic statistical tool is what we consider a probabilistic genotyping system which is used to evaluate complex mixtures of DNA evidence it was developed at the behest of the office of the chief medical examiner of New York City by a software company and we didn't know a whole lot about it until less than two years ago I'm gonna take you through an introduction to that program a review that we did and a timeline of events as we have unfortunately learned after the fact so FST is going to be one of the acronyms

that I'm going to keep repeating OCME Office of the Chief Medical Examiner this is the forensic DNA testing laboratory for the city of New York I think it's the largest forensic DNA testing lab in the country the FST program will evaluate DNA mixtures that contain DNA from two or three individuals and it allows for missing data as well as the introduction of artifacts so it'll attempt to account for both artifacts and missing data that's the probabilistic part of probabilistic genotyping it will report out a likelihood ratio as a numerical output which is supposed to be the weight of the evidence that the jury is supposed to infer from this process so the higher the number the more

incriminating typically the lower the number the more exculpatory it is the more suggestive that the defendant's DNA is not present in that mixture it is a pretty straightforward C# and SQL web interface that has a processing back-end on a server in 2010 it was approved through the regulatory Commission on Forensic Science for the state of New York there are very few of these commissions in the country on a state level but in 2010 they approved it for casework use it looks at data on the human genome at 15 different locations which we call loci which I'll refer to again and for mixtures of up to three people they originally intended it to be used on

mixtures of four people but by their own measure that was too much data for them to effectively interpret so while they repeatedly publicly stated their intent to validate it for four-person mixtures they stuck to two- or three-person mixtures it was brought online for casework use in April of 2010 and immediately brought offline some functions were updated live as the system was being used on casework an inadvertent change was made they broke FST we learned this last fall it happened in 2011 we have also learned more recently that it was tested repeatedly in the summer of 2011 indicating the likelihood ratio values these are the numerical outputs reported

to the court were modified as expected following a modification to the program in 2017 the laboratory claimed that this modification did not affect the methodology of the program despite reporting different values FST is brought back online for casework in July of 2011 and a bunch of stuff happens here that we don't have time to go into but in 2016 I'm part of the team that is the first outside glance into the inner workings of FST it's a criminal case in New York City it's a federal case and the source code was ordered over for the defense to inspect this is a part of an output screen from FST so if any of you have seen CSI this

looks a lot more like TurboTax than it does CSI if you look at the columns these are identifiers for the 15 genetic locations that we look at for this type of forensic DNA examination you have a reference profile which is a particular person's DNA profile at these 15 locations you have the evidence numbered one to three so this is actually three separate tests of that DNA evidence that's being evaluated together and compared to this reference profile as a standard the very bottom is a statistical weight that's calculated and I'm going to describe what that number is supposed to mean there are four numbers generated for every evaluation by OCME they are broken down by sub

population there is a greater similarity between people of similar ancestry so the breakdown used by OCME is Asian Black Caucasian and Hispanic the verbal translation of this number into something that's supposed to mean something is the evidence that is lines 1 2 and 3 the DNA profiles developed from whatever sample is being evaluated the observed evidence is approximately 70.6 times more probable if the sample originated from the reference profile so a particular individual and two unknown unrelated individuals this is the prosecution's hypothesis that the defendant and two unknown unrelated individuals contributed their DNA to the sample the defense hypothesis is that it originated from three unknown unrelated people not including the defendant so

this is a characterization of the relative strength of these two hypotheses the evidence is 70 times more probable if the prosecution hypothesis is true than if the defense hypothesis is true this is a scan of a validation summary it was provided not in digital form so it had to be scanned in and manually clipped through there is a library cart of this validation data that had to be manually scanned in the reported value for this evaluation is 157 from the 2010 validation that was approved for casework use by the New York Commission on Forensic Science in 2016 I ran this data through the version of FST provided to me in US v Johnson it produced the

number 70.6 that I showed on the previous slide these are different values
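
For reference, the likelihood ratio described in the talk can be written out as below. This is only a sketch of the general form: the symbols E, H_p and H_d are notation introduced here for illustration and are not taken from the slides or from FST itself.

```latex
% E   : the observed DNA evidence (the three replicate profiles)
% H_p : prosecution hypothesis (the reference individual plus two unknown, unrelated people)
% H_d : defense hypothesis (three unknown, unrelated people)
\[
  \mathrm{LR} \;=\; \frac{P(E \mid H_p)}{P(E \mid H_d)}
\]
% LR > 1 favors H_p (more incriminating); LR < 1 favors H_d (more exculpatory).
% In the example from the talk, the 2016 run gave LR of about 70.6 where the
% 2010 validation had reported 157 for the same data.
```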

sure

the 157 is more incriminating than 70.6 yeah so through a series of steps that we can't go into at this time we ultimately figured out that FST was not considering data at three of the 15 locations where results were generated for this evidence although those locations had been considered during the 2010 validation FST had been changed so the 15 out of 15 locations in 2016 calculates the same value as the 12 out of 15 locations so this is a false positive value this particular analysis is of a non-contributor to the mixture the person whose DNA we're comparing to the mixture because this is a laboratory-made sample during the validation study

we know his DNA is not present but he has an incriminating likelihood ratio it's above 1 so it seems like it might be a good thing that the modification made the value go down made the likelihood ratio decrease it becomes less incriminating it's still incriminating but it's less so but the likelihood ratio for this one single locus this one genetic location is less than 1 FST was ignoring exculpatory information these other two locations have likelihood ratios above one so it's ignoring exculpatory and inculpatory information
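
The effect described here can be illustrated with a small sketch. It assumes, as is common in forensic DNA statistics, that the overall likelihood ratio is the product of per-locus likelihood ratios; the locus names and values below are hypothetical and are not FST data or FST code.

```python
from math import prod

def combined_lr(locus_lrs, skipped=()):
    """Multiply per-locus likelihood ratios, ignoring any locus named in `skipped`."""
    return prod(lr for locus, lr in locus_lrs.items() if locus not in skipped)

# Hypothetical per-locus values (illustration only, not from the validation study).
# locus_a is exculpatory (LR < 1); locus_b and locus_c are inculpatory (LR > 1).
locus_lrs = {
    "locus_a": 0.5,
    "locus_b": 2.0,
    "locus_c": 2.0,
    "locus_d": 4.0,
    "locus_e": 5.0,
}

full = combined_lr(locus_lrs)
reduced = combined_lr(locus_lrs, skipped={"locus_a", "locus_b", "locus_c"})

print(full, reduced)
```

In this toy example the combined value drops when the three loci are skipped, even though one of the skipped loci was exculpatory, because the product of the three ignored values is greater than one. The direction of the change depends entirely on which data is dropped.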

several months after we identified this behavior an assistant US attorney acknowledged that this is occurring this is not a software developer this is not a biologist this is not an employee of the Office of the Chief Medical Examiner this is a prosecutor acknowledging that modifications were made to a software program all of this was done under seal there was a protective order and had this not been vacated through the efforts of ProPublica and the Yale Media Freedom and Information Access Clinic that's a mouthful they filed a request with the judge that the protective order be vacated and it was because OCME did not oppose it I don't know why but here we are today

talking about it ProPublica then posted the code on GitHub so if you have a copy of Visual Studio and Microsoft SQL Server you can get FST running on your home PC one of the issues here is that as this was made public in October of 2017 twelve samples were originally evaluated with FST in August 2010 and then re-evaluated after the modifications were made in the spring and summer of 2011 two of these samples had one locus where data was dropped we just examined a sample that had data dropped at three locations there were 439 mixtures in this validation study they only went back and then did their version of a regression test on 12 out of 439 and only two of

those had the data dropping function that is changing likelihood ratio values more recently records from 16 additional quality control tests their vernacular for a regression test or retroactively a revalidation study were produced
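
A regression test over the whole validation set, rather than 12 of 439 samples, could look something like the sketch below. The function name, sample identifiers and tolerance are assumptions made for illustration; the 157 and 70.6 values are the ones quoted in the talk and the 12.5 value is invented. This is not the laboratory's procedure.

```python
import math

def find_regressions(baseline, rerun, rel_tol=1e-6):
    """Return {sample_id: (old, new)} for every sample whose value changed or is missing."""
    changed = {}
    for sample_id, old in baseline.items():
        new = rerun.get(sample_id)
        if new is None or not math.isclose(old, new, rel_tol=rel_tol):
            changed[sample_id] = (old, new)
    return changed

# Likelihood ratios reported in the original validation (hypothetical sample ids).
baseline = {"sample_1": 157.0, "sample_2": 12.5}
# Likelihood ratios from re-running the same inputs on the modified program.
rerun = {"sample_1": 70.6, "sample_2": 12.5}

changed = find_regressions(baseline, rerun)
for sample_id, (old, new) in changed.items():
    print(f"{sample_id}: reported {old}, now {new}")
```

The point of the design is that every previously reported value is compared, so a change that only affects samples with dropped data still surfaces without anyone having to guess which samples to recheck.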

the main culprit for this change in functionality is only 70 lines including comments and whitespace so all of this is a very small change in terms of the lines of code but as everyone here knows that can be pretty substantial to put this in context against the greater field of forensic DNA analysis and the growing field of probabilistic genotyping I am not surprised that this occurred that the story plays out the way that it does that software was modified five years later it takes an outsider to identify that it was it's then publicly acknowledged and then it's backpedaled as to its relevance or importance there was a case that I was involved in in

Washington state where the TrueAllele software was used this is a commercial probabilistic genotyping system the defendant Emanuel Fair and his team requested that the TrueAllele source code be provided similar to how the FST source code was provided in the Johnson case out of New York City responses as to why this did not need to happen included the declaration of the software developer that there's no way to actually use source code in a validation study yeah but it doesn't stop there a professor of mass medicine said there's no reason to have source code unless someone intends to modify it

a supervisor in a forensic DNA laboratory said source code is not necessary for determining reliability of software because source code is typically not used to determine reliability of software for forensic use excuse me the lab director of another forensic DNA laboratory clarified that DNA analysts only take one class in stats and many of them don't have any foundational computer science knowledge so the folks using the software don't necessarily know how it's constructed or how it should operate he follows up with it strikes him as irregular that we would single out one particular step of the whole forensic DNA analysis process if one is to discuss error in DNA testing then would one not want to

capture an error rate for the entire workflow

so Gina's going to take it over again and describe some of our ongoing work in the field so we hope we've convinced you that us technologists us hackers have a lot to say here and need to be engaging in this part of our society and need to be commenting and watching right one thing that we're doing this year is we got a grant from the Brown Institute we're trying to tell this story to a general audience through exposé and journalistic pieces we're trying to tell the story to a technology audience and that's you guys we're trying to tell this story to a legal audience we have a bunch of articles in a journal called The

Champion from the National Association of Criminal Defense Lawyers that is all about the various aspects of this and we are trying to do independent third-party testing of probabilistic genotyping software as he mentioned it's not just FST and TrueAllele there's a whole cloud of software and shouldn't we be comparing one to another I mean if courts would have trusted any of them and they don't agree what does that say about justice and where we have access to source code we are trying to look in the source code and find problems like the routine that Nathan heroically found in his look but I'm sure there's a lot of other things that could be found and you know we'd love your help with that

so what makes independent testing hard in this space well one it's really difficult to get access to the executables of the software this software might not even be available to be purchased as Jerome was saying about many devices you can't even get it even if you have a lot of money or if you can get it you know like for example what replaced FST in New York STRmix I think it's $30,000 to buy it plus like five thousand dollars of training and then even you know that's just an executable that's not talking about getting the source code or a database of bugs or testing plans or design documents or other things that

as technologists we know would be really helpful to try to understand what's going on inside these black boxes and on top of that if you look at the Terms of Service even if you can buy these things oftentimes they say you know there's limits on publishing the results or talking about it in public and in general there's this problem of trade secret protection the intellectual property rights of manufacturers being considered to be more important even than the rights of defendants in criminal court like did you think that that was the way this country worked so even under protective order you can't go in and look many many times and I think often times that is done to shield from

legitimate questions of quality and fairness more than it's done to protect from competitors because there's already a huge first-mover advantage into a market just by admissibility into court and kind of a snowball of that so all of this is thwarting the essential iterative improvement and debugging of these systems that would be absolutely necessary to establish when they can be used reliably and when they cannot be and it's thwarting accountability to stakeholders beyond just the buyers and in many instances the interests of the people who are buying this software for use in jurisdictions are of course very different than those of the people being decided about and there's really a need for natural repositories

to share results and connect audiences you know Nathan mentioned you could download the FST source code and run it on your laptop but we've been doing that in our team as part of this Magic Grant and that's really hard it's not just you know there's a lot of secret sauce so you know how would you connect with us to know how to do that how would you be able to report back to us what you saw how would a defense team that's representing a client connect with experts who might be able to comment on the ways in which the things in the internals of the system might be relevant to that particular case so we

would really like your help if you have learned something in this talk and you think there's other citizens and hackers who might be good for them to know this information we'd love if you you know get them to watch the video when software is being proposed for use we have a procurement phase wish list so when public money is being used for criminal justice software it would be great to require or at least give credit in the procurement phase for things like making your source code available making software artifacts like bug data reports internal testing plans and results software requirement specifications risk assessments design documents etc available to the public or at least independent testing but I say the public

because it's too easy to corrupt an independent testing arm if it's not also public no clauses preventing third party review or publishing of defects should be in terms of service are you kidding me access to executables for third party testing without you know a huge price tag oh here's a big wish list item scriptable interfaces to facilitate automated testing a huge amount of what we're doing is you know trying to put automated testing harnesses around these things so that we can run a bunch of cases all the way through bug bounties would be great we would love in the procurement phase for jurisdictions to contribute to a fund for third party entities to do independent testing these

are all things we'd also love for you to be a third party reviewer please go check out the criminal justice software that is available things like FST on GitHub as we mentioned Lab Retriever LRmix likeLTD EuroForMix those are all examples of software where the source code is available go look for bugs please tell us about them that would help us there's also predictive policing software CivicScape that's been posted there are people who are making software who get that software in this space probably ought to be open source because that's the nature of things in the criminal justice system so take a look find bugs find bad smells in

code let us know also construct software for alternatives and comparisons many programs have algorithms published replicate and say you know we're getting different results than you say you're getting all of these things would be super helpful things for the technology community to do and in the bigger picture black box decision-making is all around us it's not just the criminal justice system the criminal justice system is just especially offensive there are big decisions being made about all of our lives in credit and hiring and housing in all of that list and we would really encourage all of us as citizens and technologists and hackers to be pushing for algorithmic accountability and transparency in all

of the things around us the Association for Computing Machinery has a Technology Policy group that put out a set of principles much like a code of ethics so if you're building these things you can point your team and your boss to a list that says this is why we should invest in these features things like awareness access and redress accountability explanation data provenance auditability validation and testing I'd be happy to point people to those principles and some explanation of them but the basic idea is provide evidence that's needed to improve systems for all stakeholders so that we're not running our society on buggy or possibly even malicious algorithms that are hidden from view and I think in

the light of all the things we've talked about that's enough evidence isn't there that there's bugs and problems and maybe even malicious stuff going on the risk is just too high we'd like to thank the people without whom our work would not be possible and that includes four of our students who are working on the project marzia Steven Abby and Maruyama maybe you guys can wave a minute and many of our colleagues on the Magic Grant and the Brown Institute too many names here to list off but now we'd like to open it up for questions yeah EFF's up there um so in that Champion issue that I mentioned I co-wrote an article with two folks from EFF we

could have put I would say that they're definitely in the space of people that we're talking to so it was Stephanie Lacambra and Kit Walsh and the article I wrote was called Opening the Black Box so yes definitely involved in the space thanks for asking big shout out to the EFF a question whether you could extend this to like gas chromatography or maybe mass spectrometry some of the other measurements like that that are used in other cases absolutely I think as I mentioned I just think we all need to be looking at the decision-making around us that is being done in a black box way and we need to be pushing back against

that or we're entering a world in which what humans are going to be saying is oh it's just because the computer said so I don't know why you lose your job I don't know why you go to jail I don't know why this or that we really need as technologists to be pushing back on this in all of those areas push in as many areas as you can

so you talked a lot about how we don't necessarily know how computers and software are coming to the conclusions they are and how these conclusions are affecting human beings however if we didn't have these instead of say the computer telling the judge whether or not someone should be given bail or what bail should be set at it would be the judge doing it with their own hidden biases that they themselves might not even know exist and I guess the point I'm trying to get at is I agree that the state of software in criminal justice is kind of a farce but what makes it more of a farce than the criminal justice

system currently is yeah I'll take that one so I agree as somebody who works in the criminal justice system as part of my job that there's no question that judges prosecutors and law enforcement have their own biases I think the difference is one I'm not saying that no program should be able to be used we're saying we should be able to know how it works and how it's making the decisions since we have the ability where it's a little harder for me to force information out of somebody's head whereas with the computer we know we can review it and so there's no reason for us not to be able to two in

regards to the comparison between excuse me comparison between a program and an individual I think a lot of times people outside of tech people like us when they hear a computer said this or the data has found this they take that as like the Word of God where it's not very hard to explain to somebody that an individual may have a bias because we all deal with individuals and we've had people lie to us or make mistakes whereas people who are not very tech savvy often hear a computer said this is right and they say oh okay and don't really think about challenging it in the same way they would challenge an individual who made an opinionated or subjective

decision so I think it becomes more problematic as soon as somebody hears DNA said that this person is a likely contributor to this it's as if their brain shut off whereas when they hear a witness saying yeah I think that's the guy who did it they're still more receptive to other information I think there's a particular problem in these kind of quantitative assessment programs too that output a numerical result that FST that 70 versus 157 value nobody questioned it nobody said this 157 should really be closer to 70 and when it was starting to during the validation and when it was going to 70 after the modification after it was already in use nobody said you know this

should really be higher like 150 or so there's so little oversight it's such a complex issue that our intuition about what's right or wrong about complex interpretation processes is so far removed that we just look at it and when somebody has their evidence evaluated by a particular program that outputs a number calling that number into question without offering your own number I'm no lawyer but that seems like kind of a weak strategy I think there's a lot of danger when you put algorithms that are trained on the way people did things or on historical data which is flawed into a black box and then call them unbiased cold logical decisions made by computer

almost infallible and especially if you're going to put them in that box and leave them there forever without iterating or debugging and the only question you ask is are we putting people in jail okay good you don't ask the question is it the right people or you know in this particular case has there been a mistake made if you don't get to ask that that's that's a pretty big problem do it I'm sorry first I told you to snapshot both my lawyers like handclaps and everything when they saw just the title for it I did have two questions and anyway he said one is I was looking for you because the title says you know innocent

I was wondering the cell phone taste work to where you were talking about whether you strike was that they use that to make a person seem incriminated or the fact they found a criminal but you're objecting to the fact the way they found him and your point there we're constantly every day hearing that people are flawed that we use our own judgments or bias so we move to a system and then we say we move to a system where you know it's just taking data and sampling and that hey we can find more criminals in this area you know so ramp up and but then we're saying back again that well now we need a human to put

their judgment that's out there it seems kind of cyclic that we're saying that huge others as flaws we need a kind of a non-biased Cole just analytical thing but they were saying well analytics can't can't put personal and human emotion sided so we need another person backside there I'll respond to at least the first part of that well obviously I he maintains his innocence as as his subway wear says in the case I also do as well but that was more of a point to show the the secrecy that that oftentimes surrounds a lot of this technology that we sometimes have even harder hard time even though it's being used and that even when it is being used

a lot of the there's a lot of the issues that arise keeps us from knowing exactly how it's being used for example going back to that you know we were told that was just used well we were on the president was just used to to located an individual that was a suspect to be arrested but they specifically didn't get a warrant which seems problematic because that's kind of how our system ensures that things are done appropriately and at the same time I'm being told to that I'm supposed to just trust then they didn't wasn't used in any other way which I have no reason to trust them based on especially based on their past history

and then even after all this they still did everything they could to pretend that it didn't happen didn't occur which I think is kind of endemic to what's going on here with these non-disclosure agreements and these claims of trade secrecy that law enforcement is essentially acting on behalf of private companies to keep this information quiet and making it difficult for defense attorneys to actually do their job I think we have the potential to build systems that are more explainable more auditable more fair and less biased but just because there's the promise of that doesn't mean that what we really get won't be black boxes where mischief is hidden that is an equally possible future if we don't insist on evidence

and accountability and transparency and watching right we could just as easily get black boxes where mischief is held and it serves the interests of powerful entities and it doesn't serve the interests of individuals now we have a criminal justice system that we say cares about individuals and and whether that individual is guilty or not are we going to continue to have that world or not I think we really that's what we're up against and very similarly I think it is very much kind of the question of our age the mix of human decision-making and computerized decision-making and I think there are ways that we can mix those together and get some of the best of

both but I also worry that we will get the worst of both if we're not careful and I think investing in accountability and transparency and the ability to watch and question at this juncture is incredibly important and I would that's why we're we're really asking citizens and technologists technologists especially to say you know this is what can happen this is what this is how often system complex systems are buggy this is how much you need people who are looking at this kind of auditing information and debugging information and looking for the problems and fixing them and you know automated processing by itself it you you invest in things that make kind of high-level demographic decisions rather than look at individual

exceptions so having if the goal is just to make this a lot less expensive so we don't need as many people doing hard work that's probably not going to be the recipe that gets us this is the the system we want to live with right so I think there's still roles for people but maybe the roles that people are taking are different they are understanding and questioning the output they're looking at how it applies to a very particular case they are you know doing more validation yeah okay um I think it was wrong I'm talking about the stingrays and the other devices like that it seems like you're allowing the tail wagging the dog a bit where I think if it's a

person you have the right to face your accuser if somebody in the room accuses me of something we'll go to court and they're gonna be in that audience or whatever so that my defense attorney can cross-examine but now we have this class of person which is the corporation who says proprietary secrets and it seems like you should be able to say proprietary secret you have a right to your proprietary secret we have a right to throw it out as invalid because you will not give us this information just like if the person accuses me and says well I know he's guilty I just can't tell you why that's gonna get thrown out too I would think so it seems like the

system is allowing these things to exist without proper auditing and calls corporations people when it's convenient and not when it's not yeah I think you pretty much hit the nail on the head there that is a large part of our argument you know the right to confrontation is a constitutional right and we're being prohibited essentially from being able to actually use that right and protect somebody's right to do that and that's a problem and part of that problem comes back to something I said earlier which is most judges are not tech savvy and they hear the computer said this is this and they go oh it's a computer and that's like the end of the questioning and they

basically what you're just saying is something I'm trying to explain to them with varying levels of success and so it's a real ongoing fight but in the short term it seems like challenging these companies and getting access to source code and testing seems like the right way to go but maybe in the long run would it be better to instead have maybe the federal government fund development of open source tools that all local police departments and law enforcement can use so they're not constantly buying the same products from all these other companies and so that we have the source this has been heavily criticized from within at least the forensic science field by the members

both inside and outside of the forensic DNA community that I've spoken with because there's the idea that a grant will come down and open-source software will be developed and then abandoned when the grant dries up so it sits abandoned where it's not supported nobody gets training on it and so there's this kind of knee-jerk reaction against an open-source business model the idea that you can either have protected software protected source code and generate revenue from selling it or you won't have supported software that is kind of a parallel conversation that I think we need to have about how there can be opportunities both public and private for ensuring the ongoing maintenance

and support of the software but like you understand what I'm saying like everyone in this room understands how open source business models can work at least right like we've heard of some success stories but we're not the group who's questioning that please explain it to your legislators yeah so this is certainly very eye-opening I think there's complete agreement that we need to challenge them I mean perhaps a situation that most of us face on a daily basis or semi regular basis for example a credit score I mean that's a great example of where you go to one credit agency and they'll give you a certain score and another one will give

you another one and assuming that they have the same information why should they be different and so these are scenarios where how do we know whether it be in the justice system or in any of these organizations that have great influence on our lives how do we know that they're doing it for the better of everyone as opposed to maybe the party that might be most interested those people willing to lend us money or whatever other situation well many times they are exactly doing it in the interest of those parties and not the people being decided about that's kind of the point and so it is a big problem you know I don't

know how to get around it other than to vote with our wallets vote really you know set an agenda that this matters we're up against a huge problem of black box decision making all around us with huge impacts on our lives and no real accountability to individuals only accountability to you know governments and corporations and that doesn't sound like the world I'd like to live in so all right we'll have to wrap it up yep okay so right now you could send stuff to us we also set up a board software justice board stat net is that right but we just set it up before this thinking it would be nice to have a place we can all

come together I think we also set up software justice league at gmail.com I'm not saying that we're incredibly organized but we're working on it and we would really like to see you know a larger entity come out of this to do general independent third-party testing I have to give a big shout out to ProPublica and the kind of data investigative journalism that they've been doing in this space you could start by writing them a check and I think EFF you could write them a check and you could help us stand up a you know credible independent third-party testing stuff so you could start by sending me mail if you'd like I'm easy first name

dot last name @ gmail.com so Jeana Matthews at gmail.com and we're giving this talk at DEF CON as well and maybe between now and DEF CON we can get our call to arms a little more clear sober exact so we were talking about we had that exact conversation but we're a small team and we've been really busy doing this independent third-party testing so if you would like to volunteer to help us get that right come up and you could use star that you could write checks to them and you could help us get our board set up exactly right okay thank you everybody for your excellent questions attention and interest in this important area

[Applause]