April 17, 2013

Open data on the West Island

If you want to get Australian census summary data, you can download it from the Australian Bureau of Statistics, or buy a DVD for A$250.

An article in iTNews explains why someone might pay rather than download:

“You have to click to download each pack individually, and they’ve set the site up deliberately to make it difficult to use a browser plugin to download everything that is contained on the released DVD image,” Bowland told iTNews.

That’s not hyperbole: Grahame Bowland quotes JavaScript code comments that actually say they are trying to make automatic downloading difficult.

Alternatively, the data release is now available via BitTorrent, thanks to Bowland, who bought the DVD (this is perfectly legit: the data are Creative Commons licensed).

(via @keith_ng)

Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

Comments

  • mpledger

    There is some cost to them in serving the data, and a big file download can tie up the server quite quickly when lots of people want to download it. If all the hassles are due to making sure a person is trying to download the data, rather than some malicious automated script where the intention is to cost or tie up the server, then I think it’s reasonable.

    11 years ago

    • Thomas Lumley

      I don’t think that entirely holds up as an explanation: according to the code comments, they were also trying to disable caching and ensure that every download attempt went right through to their servers, which would unnecessarily increase the load.

      And other private and public sector organisations with large data sets (or PDFs or advertising videos) don’t seem to have this problem.

      11 years ago

      • Thomas Lumley

        Further thoughts: if they wanted to ensure that people got the data directly from ABS rather than from a third party, the setup would make sense, but in that case the CC-Attribution licence makes no sense.

        11 years ago