November 19, 2015

False positives

I searched for “Joe Hill” on Google a few months ago, and the “aren’t we clever” box popped up with:

[Image: the Google results box for “Joe Hill”, showing two photographs]

The statistics and computation behind these searches are impressive: in addition to all the usual Google stuff, the system realises that the fairly common words “joe” and “hill” occur together sufficiently often that they are probably a thing. Then it takes advantage of Wikipedia to realise that “joe hill” is the name of a person, not a geographical feature or a coffee shop (or, I suppose, profanity), and finds pictures and information. And it almost works, even with people who aren’t especially well known.
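As a rough illustration of what “occur together sufficiently often” could mean, here is a minimal sketch using pointwise mutual information (PMI), a standard textbook measure of how much more often two words appear side by side than their individual frequencies would predict. This is not a description of what Google actually does; the function name `bigram_pmi` and the toy corpus are invented for the example.

```python
import math
from collections import Counter


def bigram_pmi(tokens, w1, w2):
    """PMI (in bits) of the adjacent pair (w1, w2) within a token list.

    Positive values mean the pair shows up more often than the two
    words' separate frequencies would suggest.
    """
    if len(tokens) < 2:
        return float("-inf")
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))

    p_pair = bigrams[(w1, w2)] / (len(tokens) - 1)
    if p_pair == 0:
        return float("-inf")  # pair never observed together
    p1 = unigrams[w1] / len(tokens)
    p2 = unigrams[w2] / len(tokens)
    return math.log2(p_pair / (p1 * p2))


# Toy corpus, made up purely for illustration.
tokens = (
    "joe hill wrote a novel . joe hill sang on the hill . "
    "a cup of joe on the hill . the novel by joe hill"
).split()

print(bigram_pmi(tokens, "joe", "hill"))  # positive: seen together often
print(bigram_pmi(tokens, "hill", "joe"))  # -inf: never seen in this order
```

PMI on its own tends to favour rare words, so a real system would presumably combine a score like this with frequency cut-offs and far more data; the sketch only shows the basic comparison of joint and separate frequencies.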

The gentleman on the left really is Joe Hill (author), aka Joseph Hillstrom King. One of his books has been made into a movie starring Daniel Radcliffe, so he’s definitely successful but not in any sense a mainstream celebrity. The gentleman on the right is someone else. People with an interest in labour history or folk music will recognise Joe Hill (activist), aka Joseph Hillstrom, aka Joel Emmanuel Hägglund: “I dreamed I saw Joe Hill last night, alive as you and me”. It’s an understandable mistake for the Google: the modern Joe Hill was named after the historical one, and there will be a lot of cross-referencing of the two. And it doesn’t really matter.

Joe Hill (activist) was involved in a rather more important false positive. The song says “they framed you on a murder charge”, and it’s only exaggerating a bit. There was strong circumstantial evidence and Hill refused to give any explanations, but it also appears the eyewitness testimony was manufactured. He was executed 100 years ago today.


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

Comments

  • Howard Edwards (id:hedwards)

    Interesting – I had never heard of the author but that’s probably a function of my age (and being an old folkie from way back).

    Regarding people and place names, I was recently googling my former colleague Dick Brook (who sadly is not at all well at the moment) and discovered that he is also a tributary of the river Severn.

    8 years ago