Imagine a laundry basket full of socks: big socks, small socks, bright socks, dress socks, sport socks, all types of socks. How fast can you match the pairs? That depends on how you proceed. And your methodology may have human rights implications, especially in the criminal justice field.
Metaphors offer powerful means to understand complex issues. This is why the “Algorithms, Big Data and Criminal Justice System” workshop that I led at the 8th Global Diplomacy Lab in Berlin in June 2018 illustrated algorithms with a large laundry basket of socks. I used this metaphor to introduce major concepts in data science: data (the socks), databases (the laundry baskets), keys (our socks’ characteristics), buckets (smaller laundry heaps) and memory (the space on the table to put out a sock for a short time).
In the workshop, we played with algorithmic efficiency in terms of time and space by comparing naïve searches with merge sort, divide-and-conquer approaches, bubble sort and hashing. Understanding these technical basics allowed us to kickstart the discussion on the uses and misuses of algorithms in the criminal justice system.
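To make the sock metaphor concrete, here is a minimal sketch of two of the strategies compared above: a naïve search and a hashed approach. It is an illustration only, not the material used in the workshop. The function names `pair_naive` and `pair_hashed` are hypothetical, and each sock is reduced to a single key (a number standing in for its characteristics).

```python
def pair_naive(socks):
    """Naive search: pick up a sock, then rummage through the rest of the
    basket until its mate turns up. Returns (pairs, comparisons made)."""
    basket = list(socks)
    pairs = []
    comparisons = 0
    while basket:
        sock = basket.pop(0)
        for i, other in enumerate(basket):
            comparisons += 1
            if other == sock:
                pairs.append((sock, basket.pop(i)))
                break  # mate found; a sock with no mate is simply set aside
    return pairs, comparisons


def pair_hashed(socks):
    """Hashing: lay each unmatched sock on the table under its key and check
    the table once per sock. Returns (pairs, table lookups made)."""
    table = {}  # key -> sock waiting for its mate (the "memory" on the table)
    pairs = []
    lookups = 0
    for sock in socks:
        lookups += 1
        if sock in table:
            pairs.append((table.pop(sock), sock))
        else:
            table[sock] = sock
    return pairs, lookups


# A hypothetical basket: 100 distinct pairs, each mate buried far from its partner.
basket = list(range(100)) + list(range(100))
naive_pairs, naive_cost = pair_naive(basket)
hashed_pairs, hashed_cost = pair_hashed(basket)
print(len(naive_pairs), naive_cost)
print(len(hashed_pairs), hashed_cost)
```

For this particular basket the naïve search needs 5,050 comparisons while the hashed version needs 200 lookups. The price of that speed is space: the table may have to hold up to one sock per pair at the same time.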
Criminal records are today an easily accessible commodity. An international business ecosystem of data harvesting has emerged. The “data brokers” in the criminal justice field are entities that, for instance, pressure law enforcement, courts and correctional facilities to hand over criminal records under the Freedom of Information Act. This raw data is cleaned up, repackaged and sold to a very wide range of criminal record consumers: businesses that build crime-fighting software, foundations that test software, universities that research the subject matter, media outlets that report crime statistics, and frankly anyone else who wants to buy this data – start-ups, political parties, conspiracy theorists, just anyone.
However, the human consequences are neglected. This unregulated, widespread digital release of criminal records – data such as arrests, booking information, court records, criminal history – is creating a new form of eternal punishment and social stigma. Having your criminal record or presumed criminality exposed can function as a mark of Cain. Let us reconsider what it means to have a “criminal label” in the digital age. How can we expect anyone to reform and lead a non-criminal life if no one can walk away from their past? Do we not want to provide a path to redemption?
Big data – the new oil, as many call it – is poured into algorithms, efficient little engines which are refined into various types of predictive software. In today's digital age, “predictive policing” has become a buzzword for a mere development of established policing techniques that depend on shortening the response time between the criminal act and police reaction. Policing has always evolved with the available technology. The first guards were on foot, then on horseback, then in patrol cars; first telephones, then wireless radio communication made their work more efficient. In the 1960s, the police force became more systematically organised, especially by optimising resources such as police force deployment across the city. In the 1980s, in-car computers were introduced in police vehicles. Shortly afterwards, this was featured in pop culture, as in the famous show Knight Rider, which showcased KITT, a thinking, talking and self-driving car.
The software used for predictive policing offers many applications, from visualising crime “hot spots” to facial recognition on CCTV cameras. Yet unlike in dystopian fantasies, predictive technologies in this field do not predict future crime but serve to shorten the time span between the criminal event and police action, and even here they face big challenges.
Within the criminal justice system field, most algorithmic technologies are employed at two moments: either in the pre-sentencing phase or in the sentencing phase. In the first case, a typical query the software must solve is: “Can we release this person on bail with a minimal risk of him or her committing another crime?” For the sentencing phase, the key question is whether this person will be a threat to society if given the most lenient sentence. Most software used for this risk assessment is proprietary and closed-source software, meaning the exact nuts and bolts that power the algorithmic engines are trade secrets.
For example, in the U.S., the risk assessment software COMPAS is used in the pre-sentencing phase to assess risks such as recidivism. To study COMPAS’s success, ProPublica, a U.S. non-profit investigative newsroom, obtained and analysed over 10,000 “risk scores” predicted by COMPAS for defendants in a single Florida jurisdiction, and compared these predicted risk scores with what happened in practice. Among other findings, ProPublica found that COMPAS flagged 60% of all defendants as “at risk of recidivism”, but that its predictions that a flagged defendant would commit a violent crime proved correct only 20% of the time. The risk predictions also seemed to have more false positives in – i.e. seemed to be skewed against – the African-American and Latino communities. ProPublica's findings sparked a heated debate between the software developers, activists and criminal justice scholars, the latter pointing out that such bias will always exist if the base rates of recidivism differ in the various communities.
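The scholars' point about base rates can be checked with a few lines of arithmetic. The sketch below is an illustration under stated assumptions: the function name `false_positive_rate` is hypothetical and the numbers are invented for the example; they are not ProPublica's or COMPAS's figures. It relies only on the definitions of the quantities involved: if a tool is equally precise for two groups (the same share of flagged people re-offend) and catches the same share of actual re-offenders in both, then its false positive rate is fixed by each group's base rate.

```python
def false_positive_rate(base_rate, precision, recall):
    """False positive rate implied by a group's base rate of re-offending,
    the tool's precision (share of flagged people who re-offend) and its
    recall (share of actual re-offenders who were flagged).

    Derivation, per person in the group:
      true positives  = recall * base_rate
      false positives = true positives * (1 - precision) / precision
      FPR             = false positives / (1 - base_rate)
    """
    true_positives = recall * base_rate
    false_positives = true_positives * (1 - precision) / precision
    return false_positives / (1 - base_rate)


# Hypothetical groups: identical precision and recall, different base rates.
fpr_group_a = false_positive_rate(base_rate=0.5, precision=0.6, recall=0.6)
fpr_group_b = false_positive_rate(base_rate=0.3, precision=0.6, recall=0.6)
print(round(fpr_group_a, 3), round(fpr_group_b, 3))
```

With these invented numbers, the group with the higher base rate ends up with a false positive rate of about 40%, against about 17% for the other group, even though the tool treats a flag identically in both. Equalising the false positive rates would require giving up equal precision or equal recall.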
One answer to this issue comes from another NGO, the Laura and John Arnold Foundation, which has released its own public safety assessment tool that pointedly does not use any race variables or any variables that could proxy for race, such as zip codes. The Arnold Foundation algorithm was tested on 1.5 million pretrial detention cases in the USA and is currently being used in 21 American jurisdictions.
In 2017, a Stanford University study compared several such criminal risk assessment software packages against 1.36 million pretrial cases and concluded that a computer is better than a human judge at predicting whether a suspect will flee or re-offend. Trusting in the risk assessment tool, and not exclusively in the intuition of the human judge, has yielded improvements on a larger scale: Lucas County in Ohio, which uses the Arnold Foundation algorithm, has cut in half the amount of crime committed by those awaiting trial and decreased by half the number of people languishing in jail awaiting trial. Trust in the risk assessment tools has led the state of New Jersey to reform its state-level bail system to legally allow more people to get out of jail without bail.
Even if the results can be trustworthy, basic human rights enshrined in the U.S. Constitution and the European Convention on Human Rights (ECHR) make taking judicial decisions based on software output contentious. In the ECHR, the concept of “equality of arms” means that both the prosecution and the defence must be able to access, understand and challenge the same evidence. However, challenging the accuracy of evidence generated by a proprietary algorithm is often impossible, which leads to extreme knowledge asymmetries. Although the results are generally trustworthy, they may not be correct in individual cases.
The use of predictive software in the criminal justice system still falls into a legal grey area. This is why the European Commission for the Efficiency of Justice (CEPEJ) will publish a report with its recommendations for criminal justice policy-makers in December 2018. This report will consist of three parts. First, the CEPEJ seeks to correct misconceptions regarding the “predictive” abilities of these types of software. It correctly identifies “artificial intelligence” (A.I.) systems as sophisticated statistical machines that operate via correlation with past patterns, and whose “understanding” of their own end results is just as limited as the understanding that online automatic translation tools have of the meaning of the text they translate. Secondly, predictive software needs to be better analysed in relation to human rights, such as the right to non-discrimination and the right to a free and fair trial. Finally, the CEPEJ will propose an idea of algorithmic governance in the criminal justice field. The proposal will suggest a toolkit to increase algorithmic accountability, supporting independent, regular and expert assessment of the bias in the algorithms used in A.I. software, as well as certified training in cyber-ethics for IT developers in this field.
The IT developers in this field are themselves becoming more aware of the ethical considerations of their work. Best practice in the field of data science for social good can be found in the work of Dr Rayid Ghani, the former Chief Data Scientist of the 2012 Obama Campaign and today the Director of the Center for Data Science and Public Policy at the University of Chicago. The Center creates projects and opportunities for data scientists who want to work for the social good. Most recently, it has released guidelines on how to hold algorithms accountable, and its own predictive policing software, the “Early Warning & Intervention Systems for Preventing Adverse Police-Public Interactions”. This turns predictive policing around by analysing the police force itself in order to flag officers who are the most at risk of committing a crime during arrest. This method holds the promise of decreasing the number of negative interactions between the public and the police, ultimately helping to regain trust and legitimacy, and working towards a better solution for both the Chicago communities and the Chicago police force.
Isaac Asimov once wrote that “science gathers knowledge faster than society gathers wisdom”. We are entering a brave new world enhanced by mathematics and algorithmically-augmented reality, but too many in the sustainable development field think these issues are far beyond their grasp or field of interest. The workshop I held in Berlin was an attempt to disprove this notion. Perhaps diplomats, global thinkers and sustainable development practitioners should engage their minds and direct their energy to these new urgent issues facing disadvantaged populations today before – to borrow an expression Marty Castro used during the 8th GDL – “it’s too late to change the Monopoly Board”, again.
Published on November 23, 2018.