July 20, 2015

July 19, 2015


  • In the interests of balance, a post at Public Address by Rob Salmond, who did the analysis in the ‘Chinese names’ real-estate leak.  And a robust Twitter discussion with him, Keith Ng, and Tze Ming Mok.
  • Stats New Zealand has a new standard question about gender identity (as distinguished from sex), acknowledging that it isn’t as simple as some people would like it to be.
  • The most important aspects of health seem to vary by age: “older raters gave significantly more weight to functional limitations and social functioning and less to morbidities and pain experience, compared to younger raters.” (via @hildabast)
  • Priceonomics has a post on the most common and most distinctive ingredients in recipes from around the world. The list illustrates the problem with the ‘distinctiveness’ metric (as Kieran Healy pointed out: whiskey is really not the distinctive signature of Irish food).  It also shows up other problems: for example, “African” and “Asian” are both listed as cuisines. Fundamentally, the limitation is in the recipe lists and the approximations made: galangal shows up as a reasonable candidate for most-distinctive Thai ingredient partly because there aren’t any substitutes; cayenne is the most widely used ingredient in the Mexican recipes because it’s being substituted for other chillis.
July 16, 2015

Don’t just sit there, do something

The Herald’s story on sitting and cancer is actually not as good as the Daily Mail story it’s edited from. Neither one gives the journal or researchers (the paper is here). Both mention a previous study, but the Mail goes into more useful detail.

The basic finding is

Longer leisure-time spent sitting, after adjustment for physical activity, BMI and other factors, was associated with risk of total cancer in women (RR=1.10, 95% CI 1.04-1.17 for >6 hours vs. <3 hours per day), but not men (RR=1.00, 95% CI 0.96-1.05)

The lack of association in men was a surprise, and strongly suggests that the result for women shouldn’t be believed. It’s also notable that while the estimated associations with a few types of cancer look strong, the lower limits on the confidence intervals don’t look strong:

risk of multiple myeloma (RR=1.65, 95% CI 1.07-2.54), invasive breast cancer (RR=1.10, 95% CI 1.00-1.21), and ovarian cancer (RR=1.43, 95% CI 1.10-1.87).

Since the researchers looked at eighteen subsets of cancer in addition to all types combined, and these are the top three, the real lower limits are even lower.
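One rough way to see how much lower is to widen each interval with a Bonferroni correction. This is a sketch only: Bonferroni is a conservative adjustment chosen here for illustration, not something the paper or the news stories used, and the count of 19 comparisons (eighteen subsets plus all cancers combined) is taken from the description above. The function name `adjusted_ci` is made up for this example.

```python
from math import exp, log
from statistics import NormalDist


def adjusted_ci(rr, lower, upper, n_comparisons, level=0.95):
    """Bonferroni-widen a reported confidence interval for a relative risk.

    Assumes the reported interval is symmetric on the log scale.
    """
    norm = NormalDist()
    # Recover the standard error of log(RR) from the reported interval.
    z_reported = norm.inv_cdf(1 - (1 - level) / 2)
    se = (log(upper) - log(lower)) / (2 * z_reported)
    # Critical value after splitting the error rate across all comparisons.
    z_adj = norm.inv_cdf(1 - (1 - level) / (2 * n_comparisons))
    return exp(log(rr) - z_adj * se), exp(log(rr) + z_adj * se)


# Relative risks and 95% intervals as quoted from the paper.
reported = {
    "multiple myeloma": (1.65, 1.07, 2.54),
    "invasive breast cancer": (1.10, 1.00, 1.21),
    "ovarian cancer": (1.43, 1.10, 1.87),
}

# Assumption: 18 cancer subsets plus all cancers combined = 19 comparisons.
adjusted = {
    name: adjusted_ci(*values, n_comparisons=19)
    for name, values in reported.items()
}

for name, (low, high) in adjusted.items():
    print(f"{name}: {low:.2f} to {high:.2f}")
```

With this adjustment, the lower limit for each of the three cancers falls below 1.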

The stories referred to previous research, published last year, which summarised many previous studies of sitting and cancer risk.  That’s good, but the summary wasn’t entirely accurate. From the Herald:

Previous research by the University of Regensburg in Germany found that spending too much time sitting raised the risk of bowel and lung cancer in both men and women.

In fact, the previous research didn’t look separately at men and women (or, at least, didn’t report doing so). While you would expect similar results in men and women, that study doesn’t address the question.

The Mail does have one apparently good contextual point

However, this previous study – which reviewed 43 other studies – did not find a link between sitting and a higher risk of breast and ovarian cancer. 

But when you look at the actual figures, there’s no real inconsistency between the two studies: they both report weak evidence of higher risk; it’s just a question of whether the lower end of the confidence interval happens to cross the ‘no difference’ line for a particular subset of cancers.

Overall, this is a pretty small risk difference to detect from observational data. If you didn’t already think that long periods of sitting could be bad for you, this wouldn’t be a reason to start.

July 15, 2015

NRL Predictions for Round 19

Team Ratings for Round 19

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

| Team | Current Rating | Rating at Season Start | Difference |
| --- | --- | --- | --- |
| Roosters | 8.23 | 9.09 | -0.90 |
| Broncos | 7.81 | 4.03 | 3.80 |
| Cowboys | 5.13 | 9.52 | -4.40 |
| Rabbitohs | 2.97 | 13.06 | -10.10 |
| Warriors | 2.54 | 3.07 | -0.50 |
| Storm | 2.00 | 4.36 | -2.40 |
| Panthers | 0.60 | 3.69 | -3.10 |
| Bulldogs | 0.10 | 0.21 | -0.10 |
| Dragons | -0.01 | -1.74 | 1.70 |
| Sea Eagles | -0.26 | 2.68 | -2.90 |
| Raiders | -1.62 | -7.09 | 5.50 |
| Eels | -2.14 | -7.19 | 5.00 |
| Sharks | -3.00 | -10.76 | 7.80 |
| Knights | -5.58 | -0.28 | -5.30 |
| Wests Tigers | -7.54 | -13.13 | 5.60 |
| Titans | -7.91 | -8.20 | 0.30 |


Performance So Far

So far there have been 128 matches played, 73 of which were correctly predicted, a success rate of 57%.

Here are the predictions for last week’s games.

| # | Game | Date | Score | Prediction | Correct |
| --- | --- | --- | --- | --- | --- |
| 1 | Raiders vs. Knights | Jul 10 | 36 – 22 | 5.80 | TRUE |
| 2 | Bulldogs vs. Broncos | Jul 11 | 8 – 16 | -4.10 | TRUE |
| 3 | Warriors vs. Storm | Jul 12 | 28 – 14 | 3.00 | TRUE |
| 4 | Sharks vs. Dragons | Jul 12 | 28 – 8 | -3.20 | FALSE |
| 5 | Titans vs. Sea Eagles | Jul 13 | 6 – 38 | -0.40 | TRUE |
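The ‘Correct’ column can be reproduced from the other columns, using the convention these posts give for predictions: a positive margin means a win to the home team (listed first), a negative margin a win to the away team. The sketch below is not the code behind the ratings. The name `is_correct` is made up, and treating a drawn game or a predicted margin of exactly zero as incorrect is an assumption.

```python
# (home, away, home_score, away_score, predicted_margin) from last week's games.
games = [
    ("Raiders", "Knights", 36, 22, 5.80),
    ("Bulldogs", "Broncos", 8, 16, -4.10),
    ("Warriors", "Storm", 28, 14, 3.00),
    ("Sharks", "Dragons", 28, 8, -3.20),
    ("Titans", "Sea Eagles", 6, 38, -0.40),
]


def is_correct(home_score, away_score, predicted_margin):
    """A prediction is correct when it has the same sign as the actual margin.

    Assumption: draws and zero predictions count as incorrect.
    """
    actual_margin = home_score - away_score
    return actual_margin * predicted_margin > 0


correct_flags = [is_correct(h, a, p) for _, _, h, a, p in games]

# Season-to-date success rate quoted above: 73 correct out of 128 matches.
season_rate = 73 / 128

print(correct_flags)
print(f"{season_rate:.0%}")
```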


Predictions for Round 19

Here are the predictions for Round 19. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

| # | Game | Date | Winner | Prediction |
| --- | --- | --- | --- | --- |
| 1 | Storm vs. Panthers | Jul 17 | Storm | 4.40 |
| 2 | Eels vs. Bulldogs | Jul 17 | Eels | 0.80 |
| 3 | Dragons vs. Rabbitohs | Jul 18 | Dragons | 0.00 |
| 4 | Knights vs. Titans | Jul 18 | Knights | 5.30 |
| 5 | Raiders vs. Sharks | Jul 18 | Raiders | 4.40 |
| 6 | Roosters vs. Warriors | Jul 19 | Roosters | 9.70 |
| 7 | Broncos vs. Wests Tigers | Jul 19 | Broncos | 18.40 |
| 8 | Sea Eagles vs. Cowboys | Jul 20 | Cowboys | -2.40 |


A modest proposal

Positive-looking results are more likely to be published in scientific journals, much more likely to get press releases, and hugely more likely to end up in the news. This trend is exaggerated if the size of the association is large.  The most likely way to get a large association is to do a very small study and be lucky enough (by chance or sloppiness) to overestimate the strength of association, so the news selects for small, early-stage, and poorly-done research.

One way to reduce this bias would be for media to quote the lower (less impressive) end of the uncertainty interval (confidence interval, credibility interval) rather than quoting the midpoint of the interval as scientists usually do. In small studies, the lower end of the interval will be close to no association, even if the midpoint of the interval is a strong association. In large, well-designed studies the change in practice would have little impact.

Isn’t that biased?

If you assume that in most cases the association being tested is smaller than the uncertainty in the experiment (ie, close to zero), and that positive results are more likely to make the news, then it’s less biased than using the middle of the interval.

Scientists wouldn’t be able to use tests that don’t produce confidence intervals.

How sad. Anyway, they would; they just wouldn’t be able to get their press releases into the papers.

Press releases often don’t report uncertainty estimates.

So those ones wouldn’t get in the papers. The silver linings are just piling up.
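The bias argument can be checked with a small simulation. This is a sketch under invented assumptions: a true effect of 0.1, standard errors of 0.5 for ‘small’ studies and 0.05 for ‘large’ ones, and a news filter that keeps only studies whose 95% interval lies entirely above zero. None of these numbers come from real studies.

```python
import random
from statistics import mean


def selected_studies(true_effect, se, n_studies, rng):
    """Simulate study estimates and keep only the 'newsworthy' ones.

    A study is kept when the lower end of its 95% interval is above zero.
    Returns (estimate, lower_end) pairs for the kept studies.
    """
    kept = []
    for _ in range(n_studies):
        estimate = rng.gauss(true_effect, se)
        lower_end = estimate - 1.96 * se
        if lower_end > 0:
            kept.append((estimate, lower_end))
    return kept


rng = random.Random(2015)  # fixed seed so the run is repeatable
true_effect = 0.1          # hypothetical true association

small = selected_studies(true_effect, se=0.5, n_studies=20000, rng=rng)
large = selected_studies(true_effect, se=0.05, n_studies=20000, rng=rng)

small_mid = mean(est for est, _ in small)
small_low = mean(low for _, low in small)
large_mid = mean(est for est, _ in large)
large_low = mean(low for _, low in large)

print(f"small studies: midpoint {small_mid:.2f}, lower end {small_low:.2f}")
print(f"large studies: midpoint {large_mid:.2f}, lower end {large_low:.2f}")
```

In this setup a small study can only pass the filter if its estimate exceeds 0.98 (1.96 × 0.5), so the reported midpoints are all many times the true effect of 0.1, while the lower ends sit much closer to it. For the large studies the two ends of the interval differ by less than 0.1, so it matters little which one is quoted.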



Bogus poll story, again

From the Herald

[Juwai.com] has surveyed its users and found 36 per cent of people spoken to bought property in New Zealand for investment.

34 per cent bought for immigration, 18 per cent for education and 7 per cent lifestyle – a total of 59 per cent.

There’s no methodology listed, and this is really unlikely to be anything other than a convenience sample, not representative even of users of this one particular website.

As a summary of foreign real-estate investment in Auckland, these numbers are more bogus than the original leak, though at least without the toxic rhetoric.

July 14, 2015

Another test for Alzheimer’s?

The Herald (from the Telegraph) has a story today about a Google Science Fair contestant, under the headline “Has a 15-year-old found a way to test for Alzheimer’s?”. This is the sort of science story it’s good to see in the papers, but it would be better if it were more accurate.

Krtin Nithiyanandam’s research is impressive even if you ignore the fact that he was only 14. But claiming he

 has developed a “Trojan horse” antibody which can penetrate the brain and attach itself to the toxic proteins present in the disease’s early stages.

is a bit of an exaggeration.

The project write-up describes how he attached antibodies to fluorescent quantum dots. These, cleverly, fluoresce at a near-infrared wavelength which passes through tissue, skin, and bone.  If the project works, it would be possible to screen for Alzheimer’s without even a lumbar puncture.

That’s still ‘if’. Despite what the story says, Krtin hasn’t tested the antibody on any actual brains. Theoretically, it binds to a transporter protein in the right way to penetrate the brain, but it needs testing. It also needs testing for toxicity — if it’s going to be used for screening, it will be injected into large numbers of healthy people, so has to be safe. After all that, it would have to be tested for predictive accuracy: to be useful, the test would have to have a very low false-positive rate. And, on top of that, for testing to really be helpful there would need to be some treatment that showed some sign of actually working. We’re not there yet.
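To see why the false-positive rate matters so much for screening, here is a sketch of the standard positive-predictive-value calculation. All the inputs are hypothetical (a 1% prevalence among people screened, 90% sensitivity, and two different specificities); none come from the project or from any real Alzheimer’s test.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a person who tests positive actually has the condition."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical screening population: 1% have early-stage disease.
ppv_5pct_false_pos = positive_predictive_value(0.01, 0.90, 0.95)
ppv_01pct_false_pos = positive_predictive_value(0.01, 0.90, 0.999)

print(f"5% false-positive rate:   PPV = {ppv_5pct_false_pos:.2f}")
print(f"0.1% false-positive rate: PPV = {ppv_01pct_false_pos:.2f}")
```

With these made-up numbers, a 5% false-positive rate means most positive results are wrong, because healthy people vastly outnumber affected ones. The false-positive rate has to fall to around 0.1% before a positive result is right about nine times in ten.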

You might also wonder how this relates to the four other early Alzheimer’s tests the Herald has reported on in the past year or so, or the other two proposed by Google Science Fair finalists.  Testing for Alzheimer’s has been an area with a lot of recent research, which is going to be useful if we ever have promising drugs to test.


July 13, 2015

July 11, 2015

What’s in a name?

The Herald was, unsurprisingly, unable to resist the temptation of leaked data on house purchases in Auckland.  The basic points are:

  • Data on the names of buyers for one agency, representing 45% of the market, for three months
  • Based on the names, an estimate that nearly 40% of the buyers were of Chinese ethnicity
  • This is more than the proportion of people of Chinese ethnicity in Auckland
  • Oh Noes! Foreign speculators! (or Oh Noes! Foreign investors!)

So, how much of this is supported by the various data?

First, the surnames.  This should be accurate for overall proportions of Chinese vs non-Chinese ethnicity if it was done carefully. The vast majority of people called, say, “Smith” will not be Chinese; the vast majority of people called, say, “Xu” will be Chinese; people called “Lee” will split in some fairly predictable proportion.  The same is probably true for, say, South Asian names, but Māori vs non-Māori would be less reliable.

So, we have fairly good evidence that people of Chinese ancestry are over-represented as buyers from this particular agency, compared to the Auckland population.
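The surname argument amounts to a weighted average: each surname contributes its buyers multiplied by the probability that someone with that name is of Chinese ethnicity. The sketch below uses invented probabilities and invented buyer counts purely for illustration. It is not the method used in the leaked analysis, and `estimated_share` is a made-up name.

```python
# Hypothetical P(Chinese ethnicity | surname); illustrative values only.
p_chinese = {"Smith": 0.01, "Xu": 0.99, "Lee": 0.40}

# Hypothetical numbers of buyers with each surname.
buyers = {"Smith": 50, "Xu": 30, "Lee": 20}


def estimated_share(counts, probs):
    """Expected proportion of buyers of Chinese ethnicity, given surname counts."""
    total = sum(counts.values())
    expected = sum(n * probs[name] for name, n in counts.items())
    return expected / total


share = estimated_share(buyers, p_chinese)
print(f"{share:.1%}")
```

Individual buyers can easily be misclassified, but the errors largely average out in the overall proportion, provided the per-surname probabilities are about right for the population the buyers come from.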

Second: the representativeness of the agency. It would not be at all surprising if migrants, especially those whose first language isn’t English, used real estate agents more than people born in NZ. It also wouldn’t be surprising if they were more likely to use some agencies than others. However, the claim is that these data represent 45% of home sales. If that’s true, people with Chinese names are over-represented compared to the Auckland population no matter how unrepresentative this agency is. Even if every Chinese buyer used this agency, the proportion among all buyers would still be about 18% (nearly 40% of 45%).

So, there is fairly good evidence that people of Chinese ethnicity are buying houses in Auckland at a higher rate than their proportion of the population.

The Labour claim extends this by saying that many of the buyers must be foreign. The data say nothing one way or the other about this, and it’s not obvious that it’s true. More precisely, since the existence of foreign investors is not really in doubt, it’s not obvious how far it’s true. The simple numbers don’t imply much, because relatively few people are housing buyers: for example, house buyers named “Wang” in the data set are less than 4% of Auckland residents named “Wang.” There are at least three other competing explanations, and probably more.

First, recent migrants are more likely to buy houses. I bought a house three years ago. I hadn’t previously bought one in Auckland. I bought it because I had moved to Auckland and I wanted somewhere to live. Consistent with this explanation, people with Korean and Indian names, while not over-represented to the same extent, are also more likely to be buying than selling houses, by about the same ratio as Chinese.

Second, it could be that (some subset of) Chinese New Zealanders prefer real estate as an investment to, say, stocks (to an even greater extent than Aucklanders in general).  Third, it could easily be that (some subset of) Chinese New Zealanders have a higher savings rate than other New Zealanders, and so have more money to invest in houses.

Personally, I’d guess that all these explanations are true: that Chinese New Zealanders (on average) buy both homes and investment properties more than other New Zealanders, and that there are foreign property investors of Chinese ethnicity. But that’s a guess: these data don’t tell us — as the Herald explicitly points out.

One of the repeated points I make on StatsChat is that you need to distinguish between what you measured and what you wanted to measure.  Using ‘Chinese’ as a surrogate for ‘foreign’ will capture many New Zealanders and miss out on many foreigners.

The misclassifications aren’t just unavoidable bad luck, either. If you have a measure of ‘foreign real estate ownership’ that includes my next-door neighbours and excludes James Cameron, you’re doing it wrong, and in a way that has a long and reprehensible political history.

But on top of that, if there is substantial foreign investment and if it is driving up prices, that’s only because of the artificial restrictions on the supply of Auckland houses. If Auckland could get its consent and zoning right, so that more money meant more homes, foreign investment wouldn’t be a problem for people trying to find somewhere to live. That’s a real problem, and it’s one that lies within the power of governments to solve.

July 9, 2015

Followup: vitamin D and diabetes

Quite some time ago, I wrote about a story on vitamin D and diabetes:

A: Someone needs to do a randomized trial, where half the participants get vitamin D and half get a dummy pill. If the effect is real, fewer people getting vitamin D will end up with diabetes.

Q: That sounds like a good idea. Is someone doing a trial?

A: Yes, Professor Peter Ebeling, of the University of Melbourne.

Q: Is there some useful website where I can find more information about the trial?

A: Indeed.

Q: Will it work?

A: No.

Q: Are you sure?

A: No, that’s why we need the trial.[…]

While the clinical trial registry hasn’t been updated, there are now published results from this trial.  The researchers didn’t get to their planned 160 participants; they gave up at 95 because of slow recruitment.  Even so, if the results had been as dramatic as in the observational studies, they would have been able to see the benefit.

They didn’t:

In this 6-month RCT of vitamin D and calcium supplementation in which over 90% of the participants reached the target serum 25(OH)D concentration of 75 nmol/L, there was no effect of supplementation on any measure of insulin sensitivity, insulin secretion or β-cell function in multi-ethnic vitamin D-deficient individuals at risk of type 2 diabetes (with prediabetes or an AUSDRISK score ≥15).

These results are no fun, so they have not received the same media attention as the observational correlations that prompted the trial, even though they are more reliable and more relevant to individual health choices.