June 5, 2017


  • Possibly a record false positive rate: “a substantial number of takedown requests submitted to Google are for URLs that have never been in our search index, and therefore could never have appeared in our search results… Nor is this problem limited to one submitter: in total, 99.95% of all URLs processed from our Trusted Copyright Removal Program in January 2017 were not in our index” (Google submission to Register of Copyrights (PDF), via Techdirt)
  • Problem with rental costs in Canada’s historical CPI: “the clerks who recorded the data were under an instruction that, since the CPI was to represent prices paid by better off working class families, to edit out any rental figures what were above a designated threshold. By the end of the 1950s they were throwing out more than half of the reported rents.” (Worthwhile Canadian Initiative). Data doesn’t just happen: it’s choices by people.
  • I’ve mentioned the University of Washington course “Calling Bullshit on Big Data” before. Now the New Yorker has a story about it.
  • What different sorts of things can go wrong with a statistical prediction rule? A taxonomy, from Ed Felten.
  • Explore NZ mortality rates divided up by ethnicity, income, and age
  • “What we learned from three years of interviews with data journalists, web developers and interactive editors at leading digital newsrooms” (Storybench, via Alberto Cairo)
  • A couple of examples from the fine UK election tradition of disinformation graphics: Scotland, London

Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.


  • David Welch

    The machine learning xkcd is a pretty good one too: https://xkcd.com/1838/

    4 months ago

  • steve curtis

    Better not look at Google street maps of China: when you compare them with aerial photos, there is not any sort of matching overlay like you find in most western countries.

    4 months ago