Striving for Judicial Consistency: The Challenge of Algorithmic Sentencing
If it is to be truly fair, a legal system must be consistent: a decision handed down by one judge should not differ greatly from one pronounced by another in a case with largely similar facts. In reality, however, this is rarely so. To test whether judicial consistency actually exists, 47 district court judges from the state of Virginia, US, were given five hypothetical cases and asked to adjudicate on them. Far from being consistent, their decisions diverged widely. In one case, of the judges who returned a guilty verdict, 44% recommended probation, 22% imposed a fine, 17% imposed both probation and a fine, and the rest suggested jail time. If a group of sitting judges, ruling on the same set of facts, can come up with such disparate results, how can we hope for any measure of consistency when they decide real cases?
With this in mind, a number of countries have put in place prescriptive sentencing systems designed to take human subjectivity out of sentencing, ensuring that individuals convicted of the same crime always receive the same sentence. By removing judicial discretion, however, these systems sometimes fail to give due weight to mitigating circumstances that help establish whether the person convicted of the offence has any real chance of rehabilitation. It therefore becomes important to find a way to empirically establish the likelihood that a convicted criminal will offend again.
In 1928, Ernest Burgess came up with the concept of unit-weighted regression and applied it to evaluating recidivism risk in prison populations. He identified 21 measures and, for each convict in his sample, assigned a score of either zero or one on every factor. Summing these scores, he predicted that convicts scoring between 14 and 21 had a high chance of parole success, while those scoring four or lower were likely to reoffend. When he tested his prediction against actual outcomes, 98% of his low-risk group made it through parole without incident, while 76% of his high-risk group did not.
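The mechanics of unit-weighted scoring are simple enough to sketch in a few lines. The following is a minimal illustration of the idea, not a reconstruction of Burgess's work: the factor names are hypothetical stand-ins for his 21 measures, and only the shape of the method (unit weights, a summed score, cutoff-based risk bands) follows the description above.

```python
# Toy sketch of Burgess-style unit-weighted scoring.
# Factor names are hypothetical illustrations, not Burgess's actual 21 measures.
FACTORS = ["steady_employment", "first_offence", "stable_home"]  # in practice, 21 such factors

def burgess_score(record: dict, factors=FACTORS) -> int:
    """Unit-weighted sum: one point for each favourable factor present."""
    return sum(1 for f in factors if record.get(f, False))

def risk_band(score: int, low_cutoff: int = 4, high_cutoff: int = 14) -> str:
    """Map a summed score to Burgess's bands (high scores predicted parole success)."""
    if score >= high_cutoff:
        return "low risk"       # 14-21: high chance of parole success
    if score <= low_cutoff:
        return "high risk"      # 4 or lower: likely recidivism
    return "intermediate"
```

Note that every factor carries the same weight of one; the method's appeal, then as now, lay in that transparency. With only three illustrative factors the cutoffs above are meaningful only against the full 21-factor set.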
By 1935, the Burgess method was being used in Illinois prisons, and variants of this mathematical approach began to be adopted around the world. As computers grew more powerful, the algorithms designed to assess recidivism risk could take a significantly larger number of factors into account, and with advances in machine learning they could spot patterns no human could hope to see. Not only did this approach produce consistent results every time the same set of facts was presented; given the vast volumes of data these systems could process, they could establish recidivism risk far more accurately than any human. That said, algorithmic sentencing is not perfect. Algorithms build their models on historical data sets, precedents that are themselves the outcome of decades of choices made by humans who were far from objective. We created objective algorithms because we knew that humans were inherently irrational in their decisions. The solution we created, however, seems to be infected with the very biases we were aiming to eradicate.