My StatsChat posts, and especially the ‘Briefly’ links, tend to be pretty negative about big data and algorithmic decision-making. I’m a statistician, and I work with large-scale personal genomic data, so you’d expect me to be more positive. This post is about why I’m not.
The phrase “devil’s advocate” has come to mean a guy on the internet arguing insincerely, or pretending to argue insincerely, just for the sake of being a dick. That’s not what it once meant. In the early eighteenth century, Pope Clement XI created the position of “Promoter of the Faith” to provide a skeptical examination of cases for sainthood. By the time a case for sainthood got to the Vatican, there would be a lot of support behind it, and one wouldn’t have to be too cynical to suspect there had been a bit of polishing of the evidence. The idea was to have someone whose actual job it was to ask the awkward questions — “devil’s advocate” was the nickname. Most non-Catholics and many Catholics would argue that the position obviously didn’t achieve what it aimed to do, but the idea was important.
In the research world, statisticians are often regarded this way. We’re seen as killjoys: people who look at your study and find ways to undermine your conclusions. And we do. In principle you could imagine statisticians looking at a study and explaining why the results were much stronger than the investigators thought, but since people are really good at finding favourable interpretations without help, that doesn’t happen so much.
Machine learning has produced some spectacular achievements, and has huge potential for improving our lives. It also has a lot of built-in support, both because it scales in a way that makes a few people very rich and because it fits in with the human desire to know things about the world and about other people.
It’s important to consider the risks and harms of algorithmic decision-making as well as the very real benefits. And it’s important that this isn’t left to people who can be dismissed as not understanding the technical issues. That’s why Cathy O’Neil’s book Weapons of Math Destruction is important, and on a much smaller scale it’s why you’ll keep seeing stories about privacy or algorithmic prejudice here on StatsChat. As Section 162 (4) (a) (v) of the Education Act indicates, it’s my actual job.