May 20, 2016

Briefly

  • The Princeton Web Census: “Today I’m pleased to release initial analysis results from our monthly, 1-million-site measurement. This is the largest and most detailed measurement of online tracking to date, including measurements for stateful (cookie-based) and stateless (fingerprinting-based) tracking, the effect of browser privacy tools, and ‘cookie syncing’. These results represent a snapshot of web tracking, but the analysis is part of an effort to collect data on a monthly basis and analyze the evolution of web tracking and privacy over time.”
  • Nate Silver on Twitter: “An irony is that our early Trump forecasts weren’t based on a statistical model. Just a guesstimate that I got stubborn anchoring myself to. So one lesson is ‘when in doubt, build a model’. Doesn’t have to be your final answer. But it’s a great starting point. Provides discipline.”
  • From Flowing Data, a visualisation of the changing US diet
  • A visualisation of 24 hours of data flow in a health insurance company: pretty, but not necessarily useful
  • “Mukherjee gives us a Whig history of the gene, told with verve and color, if not scrupulous accuracy.” A book review/essay at the Atlantic, by Nathaniel Comfort
  • There’s a new White House report on Big Data and Civil Rights: “Using case studies on credit lending, employment, higher education, and criminal justice, the report we are releasing today illustrates how big data techniques can be used to detect bias and prevent discrimination. It also demonstrates the risks involved, particularly how technologies can deliberately or inadvertently perpetuate, exacerbate, or mask discrimination.” (via mathbabe.org)

Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

Comments

  • avatar

    There is aggregated birthday data for New Zealand, covering 1980 to 2014, in an Excel file at

    http://www.stats.govt.nz/browse_for_stats/population/pop-birthdays-table.aspx

    As it is aggregated across all years, all the 13ths of each month are combined, Friday or not.

    8 years ago

    • Thomas Lumley

      Yes, the aggregation is the problem for Friday 13th, but also for Easter and for Mondayised holidays, and even for day of the week.

      One could look at how well the US model for Friday 13th fits NZ, though.

      8 years ago

      • avatar

        I came up with a bit of exploratory analysis that works with the aggregation. There does not seem to be a lot of evidence that fear of Friday the 13th has affected aggregate demand for scheduled obstetric services in the New Zealand (largely public) health system.

        https://thoughtfulbloke.wordpress.com/2016/05/21/new-zealand-births-and-friday-the-13th/

        Since it didn’t look like much was happening, I didn’t pursue it any more than that.

        8 years ago

        • Thomas Lumley

          Nice. It would be interesting to know how big an effect is ruled out, but probably not interesting enough for me to get around to doing it before the twelfth of never.

          8 years ago

        • Megan Pledger

          Wouldn’t it be better to compare 13ths with 6ths and 20ths? Then you’re always comparing them on the same day of the week. In a “Friday the 13th” month, four of your comparison days are weekend days, when no scheduled caesareans take place (I assume). In a “Wednesday the 13th” month, none of the comparison days are weekend days.

          Also, I don’t think there are enough scheduled caesareans to fill all the weekdays, even in big cities in NZ. I’ve got a vague feeling that scheduled caesareans used to be on Tuesdays and Wednesdays at Auckland Women’s Hospital back in 2002, when I was passing through.

          8 years ago

        • avatar

          @Megan, absolutely it would. But the public data for NZ only has the day of the month, aggregated over all years (for example, 5416 born on March 13th from 1980 to 2014). Because we don’t know the specific year of each birth, we don’t know which were Fridays. The same problem applies to identifying the specific Friday the 20ths, etc.

          Not actually doing scheduled c-sections on Fridays would indeed mean fear of Friday the 13th is not having an influence on scheduled c-sections.

          8 years ago

        • Megan Pledger

          But by using the 6th and 20th as comparisons, you are balancing out the day of the week effect of whatever day the 13th is.

          8 years ago

        • avatar

          Ah, with you now: comparing it to equivalent “unthirteenthed” dates with similar day patterns, rather than to a collection of general dates distilled to a representative value.

          8 years ago
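The matching argument in the exchange above rests on the 6th, 13th and 20th of a month always sharing a weekday, because they are exactly 7 days apart. A minimal sketch that checks this for every month in the 1980 to 2014 range mentioned in the thread (the function name is made up for illustration, and no birth data is involved):

```python
from datetime import date


def same_weekday_neighbours(first_year=1980, last_year=2014):
    """Return True if the 6th, 13th and 20th share a weekday in every month."""
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            # date.weekday() gives 0 for Monday through 6 for Sunday
            weekdays = {date(year, month, d).weekday() for d in (6, 13, 20)}
            if len(weekdays) != 1:
                return False
    return True


print(same_weekday_neighbours())
```

This only confirms the calendar fact behind the comparison; whether the aggregated counts for the 6ths and 20ths make a good baseline is a separate question.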

        • avatar

          Looking into Megan’s comment about c-sections being scheduled on particular days, by counting how many times each date falls on each day of the week within the aggregated data:
          Dates w. 4 Mons vs 6 Mons: average -5 births
          Dates w. 4 Tues vs 6 Tues: average -1 births
          Dates w. 4 Weds vs 6 Weds: average +74 births
          Dates w. 4 Thurs vs 6 Thurs: average +43 births
          Dates w. 4 Fris vs 6 Fris: average +15 births
          Dates w. 4 Sats vs 6 Sats: average -53 births
          Dates w. 4 Suns vs 6 Suns: average -73 births

          Which points to scheduled c-sections not being on weekends, but historically tending to be done on Wednesday & Thursday.
          The caveat is that the standard deviations are on the order of 150 to 200 (for example, dates with 4 Fridays have a mean of 5480 and a standard deviation of 159 in number of births).

          8 years ago
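The “4 vs 6” grouping above depends on how many times a given calendar date lands on each weekday over the 35 years. A rough sketch of that count, using only the calendar (the helper name and the choice of the 13ths as the example are assumptions for illustration, and 29 February would need special handling because it does not exist in every year):

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def weekday_counts(month, day, first_year=1980, last_year=2014):
    """Count how often a calendar date falls on each weekday in the year range."""
    counts = Counter()
    for year in range(first_year, last_year + 1):
        counts[WEEKDAYS[date(year, month, day).weekday()]] += 1
    return counts


# One row per month: how the 13th is spread across Mon..Sun
for month in range(1, 13):
    c = weekday_counts(month, 13)
    print(month, [c[w] for w in WEEKDAYS])
```

Each row sums to 35 (one occurrence per year), so every weekday count sits close to 5, which is why the comparison is between dates with 4 and dates with 6 of a given weekday.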

        • avatar

          Going unnecessarily deep on this: taking Megan’s suggestion of matching nearby Fridays, together with keeping in mind the seasonal patterns and that each 13th has its own combination of day frequencies, we find that the 13ths on average have 42 fewer deliveries than the average of the surrounding 6ths and 20ths. A difference that is much less impressive when the 13ths average 5428 deliveries each.

          There is also no pattern to it: the 13th with the most Fridays does not have fewer deliveries than the others, and the 13th with the fewest Fridays does not have more deliveries than the others.

          And there is no explanatory surge. Given we have evidence that scheduled c-sections are not being done on the weekend, any deliveries moved off the 13th should show up on the preceding Thursday or Wednesday, but there is no increase there to match the average of 42 fewer (there is _really_ no pattern between the two). So deliveries are not being shifted from the 13ths to other nearby dates, and the -42 is just natural fluctuation within the range (from +216 to -224). Friday the 13th is not influencing scheduled c-sections in the NZ health system.

          8 years ago
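To put the -42 in context, here is a back-of-envelope sketch using only figures quoted in the thread. The standard deviations of 150 to 200 were reported for birth counts across dates, not for the 13th-minus-neighbours differences, so using them as a yardstick here is an assumption, and the function name is made up for illustration:

```python
def effect_summary(diff=-42, mean_13th=5428, sd_low=150, sd_high=200):
    """Express the quoted difference relative to the mean and to the quoted SD range."""
    return {
        # difference as a percentage of the average count on a 13th
        "percent_of_mean": 100 * diff / mean_13th,
        # difference in units of the larger and smaller quoted standard deviation
        "sd_units_at_high_sd": diff / sd_high,
        "sd_units_at_low_sd": diff / sd_low,
    }


summary = effect_summary()
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

On these numbers the difference is under 1% of the mean and roughly a fifth to a quarter of one standard deviation, which is consistent with reading it as ordinary fluctuation.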