Posts filed under Risk (149)

January 29, 2015

Absolute risk/benefit calculators

An interesting interactive calculator for heart disease/stroke risk, from the University of Nottingham. It lets you put in basic, unchangeable factors (age, race, sex), modifiable factors (smoking, diabetes, blood pressure, cholesterol), and then one of a set of interventions.

Here’s the risk for an imaginary unhealthy 50-year-old taking blood pressure medications:


The faces at the right indicate 10-year risk. Without the unhealthy risk factors, if you had 100 people like this, one would have a heart attack, stroke, or heart disease death over ten years; with the risk factors and treatment, four would have an event (the pink and red faces). The treatment would prevent five events in 100 people, represented by the five green faces.

There’s a long list of possible treatments in the middle of the page, with the distinctive feature that most of them don’t appear to reduce risk, from the best evidence available. For example, you might ask what this guy’s risk would be if he took vitamin and fish oil supplements. Based on the best available evidence, it would look like this:



The main limitation of the app is that it can’t handle more than one treatment at a time: you can’t look at blood pressure meds and vitamins, just at one or the other.

(via @vincristine)

January 27, 2015

Benadryl and Alzheimer’s

I expected the Herald story “Hay fever pills linked to Alzheimer’s risk – study” to be the usual thing, where a fishing expedition found a marginal correlation in low-quality data.  It isn’t.

The first thing I noticed when I found the original article is that I know several of the researchers. On the one hand, that’s a potential for bias; on the other hand, I know they are both sensible and statistically knowledgeable. The study has good quality data: the participants are all in one of the Washington HMOs, and there is complete information on what gets prescribed for them and whether they fill the prescriptions.

One of the problems with drug:disease associations is confounding by indication. As Samuel Goldwyn observed, “Any man who goes to a psychiatrist needs to have his head examined”, and more generally the fact that medicine is given to sick people tends to make it look bad.  In this case, however, the common factor between the medications being studied is an undesirable side-effect for most of them, unrelated to the reason they are prescribed.  In addition to reducing depression or preventing allergic reactions, these drugs also block part of the effect of the neurotransmitter acetylcholine. The association remained just as strong when recent drug use was excluded, or when antidepressant drugs were excluded, so it probably isn’t that early symptoms of Alzheimer’s lead to treatment.

The association replicates results found previously, and is quite strong, about four times the standard error (“4σ”) or twice the ‘margin of error’. It’s not ridiculously large, but is enough to be potentially important: a relative rate of about 1.5.
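As a rough sketch of what "four standard errors" means for a relative rate of about 1.5: if we assume the estimate is approximately Normal on the log scale (a standard working assumption, not something taken from the paper), we can back out the standard error and a 95% interval. The function name `interval_from_z` is made up for this illustration.

```python
import math

def interval_from_z(relative_rate, z, crit=1.96):
    """Back out a standard error and interval from an estimate and its z-score.

    Assumes the log relative rate is approximately Normal (an assumption
    made for this sketch, not a detail from the paper).
    """
    log_rr = math.log(relative_rate)
    se = log_rr / z                       # z = estimate / standard error
    lower = math.exp(log_rr - crit * se)  # interval is symmetric on the log scale
    upper = math.exp(log_rr + crit * se)
    return se, lower, upper

se, lower, upper = interval_from_z(1.5, 4.0)
print(f"SE of log rate: {se:.3f}; 95% interval: {lower:.2f} to {upper:.2f}")
```

Under these assumptions the interval runs from roughly 1.2 to 1.8, so it stays comfortably away from 1.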

It’s still entirely possible that the association is due to some other factor, but the possibility of a real effect isn’t completely negligible. Fortunately, many of the medications involved are largely obsolete: modern hayfever drugs (such as fexofenadine, ‘Telfast’) don’t have anticholinergic activities, and nor do the SSRI antidepressants. The exceptions are tricyclic antidepressants used for chronic pain (where it’s probably worth the risk) and the antihistamines used as non-prescription sleep aids.

January 9, 2015

The Internet of things and its discontents

The current Consumer Electronics Show is full of even more gadgets that talk to each other about you. This isn’t necessarily an unmixed blessing.

From the New Yorker

To find out, the scientists recruited more than five hundred British adults and asked them to imagine living in a house with three roommates. This hypothetical house came equipped with an energy monitor, and all four residents had agreed to pay equally for power. One half of the participants was told that energy use in the house had remained constant from one month to the next, and that each roommate had consumed about the same amount. The other half was told that the bill had spiked because of one free-riding, electricity-guzzling roommate.

From Buzzfeed

It’s not difficult to imagine a future in which similar data sets are wielded by employers, the government, or law enforcement. Instead of liberating the self through data, these devices could only further restrain and contain it. As Walter De Brouwer, co-founder of the health tracker Scanadu, explained to me, “The great thing about being made of data is that data can change.” But for whom — or what — are such changes valuable?

and the slightly chilling quote “it’s not surveillance, after all, if you’re volunteering for it”.

Both these links come from Alex Harrowell at the Yorkshire Ranter, whose comment on smart electricity meters is:

The lesson here is both that insulation and keeping up to the planning code really will help your energy problem, rather than just provide a better class of blame, and rockwool doesn’t talk.


January 6, 2015

Foreign drivers, again

The Herald has a poll saying 61% of New Zealanders want to make large subsets of foreign drivers sit written and practical tests before they can drive here (33.9%: people from right-hand drive countries; 27.4%: everyone but Australians). It’s hard to tell how much of this is just the push effect of being asked the questions and how much is real opinion.

The rationale is that foreign drivers are dangerous:

Overseas drivers were found at fault in 75 per cent of 538 injury crashes in which they were involved. But although failure to adjust to local conditions was blamed for seven fatal crashes, that was the suspected cause of just 26 per cent of the injury crashes.

This could do with some comparisons.  75% of 538 is 403, which is about 4.5% of all injury crashes that year.  We get about 2.7 million visitors per year, with a mean stay of 20 days (PDF), so on average the population is about 3.3% short-term visitors.
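The arithmetic behind that comparison can be sketched as follows. The resident population figure is my assumption (roughly 4.5 million in 2014); everything else is quoted above.

```python
# Figures quoted in the post
crashes_with_overseas_driver = 538
share_at_fault = 0.75
visitors_per_year = 2_700_000
mean_stay_days = 20

# Assumption (not from the post): roughly 4.5 million residents in 2014
resident_population = 4_500_000

at_fault_crashes = share_at_fault * crashes_with_overseas_driver

# Average number of short-term visitors in the country on any given day
visitors_present = visitors_per_year * mean_stay_days / 365
visitor_share = visitors_present / resident_population

print(f"At-fault crashes: {at_fault_crashes:.1f}")
print(f"Visitors present on an average day: {visitors_present:,.0f}")
print(f"Visitors relative to residents: {visitor_share:.1%}")
```

So visitors make up a bit over 3% of the people in the country at any one time and are at fault in about 4.5% of injury crashes: over-represented, but not dramatically.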

Or, we can look at the ‘factors involved’ for all the injury crashes. I get 15367  drivers of motorised vehicles involved in injury crashes, and 9192 of them have a contributing factor that is driver fault (causes 1xx to 4xx in the Crash Analysis System). This doesn’t include things like brake failures.  So, drivers on average are at fault in about 60% of the injury crashes they are involved in.
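As a quick check of that figure, and a rough comparison with the 75% quoted for overseas drivers (rough because the overseas figure is per crash and the overall one is per driver):

```python
# Counts from the Crash Analysis System, as quoted in the post
drivers_in_injury_crashes = 15367
drivers_at_fault = 9192

# Herald figure for overseas drivers (per crash, so only roughly comparable)
overseas_fault_rate = 0.75

overall_fault_rate = drivers_at_fault / drivers_in_injury_crashes
ratio = overseas_fault_rate / overall_fault_rate

print(f"Overall at-fault rate: {overall_fault_rate:.1%}")
print(f"Overseas rate relative to overall: {ratio:.2f}x")
```

The ratio comes out at about one and a quarter.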

Based on this, it looks as though foreign drivers are somewhat more dangerous, but that restricting them is very unlikely to prevent more than, say, 1-2% of crashes. If you consider all the ways we might reduce injury crashes by 1-2%, and think about the side-effects of each one, I don’t think this is going to be near the top of the list.

January 3, 2015

Cancer isn’t just bad luck

From Stuff

Bad luck is responsible for two-thirds of adult cancer while the remaining cases are due to environmental risk factors and inherited genes, researchers from the Johns Hopkins Kimmel Cancer Center found.

The idea is that some, perhaps many, cancers come from simple copying errors in DNA replication. Although DNA copying and editing is impressively accurate, there’s about one error for every three cell divisions, even when nothing is wrong. Since the DNA error rate is basically constant, but other risk factors will be different for different cancers, it should be possible to separate them out.

For a change, this actually is important research, but it has still been oversold, for two reasons. Here’s the graph from the paper showing the ‘2/3’ figure: the correlation in this graph is about 0.8, so the proportion of variation explained is the square of that, about two-thirds. (click to embiggen)


There are two things to notice about this graph. First, there are labels such as “Lung (smokers)” and “Lung (non-smokers)”, so it’s not as simple as ‘bad luck’.  Some risk factors have been taken into account. It’s not obvious whether this makes the correlation higher or lower.

Second, the y-axis is on a log scale, so the straight line fit isn’t to cancer incidence and the proportion of variation explained isn’t a proportion of cancer risk.  Using a log scale for incidence is absolutely right when showing the biological relationship, but you can’t read proportions of incidence explained off that graph.  This is what the graph looks like when the y-axis is incidence, either with the x-axis still on a logarithmic scale


or with neither axis on a logarithmic scale


The proportion of variation explained is 18% and 28% respectively.

It’s ok to transform the x-axis as much as we like, so I looked at a square root transformation on the x-axis (based on the slope of the log-log graph). This gets the proportion of incidence explained up to about one third. Not two-thirds.

Using the log scale gives a lot more weight to the very rare cancers in the lower left corner, which turn out not to have important modifiable risk factors. Using an untransformed y-axis gives equal weight to all cancers, which is what you want from a medical or public health point of view.
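To see how much the choice of scale matters, here is a small sketch with made-up numbers (these are NOT the data from the paper; the variable names are just for illustration). The same five points give quite different proportions of variation explained depending on which axes are logged.

```python
import math

def r_squared(xs, ys):
    """Squared Pearson correlation: the proportion of variation in ys
    explained by a straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)

# Made-up numbers for illustration only, not the paper's data
divisions = [1, 10, 100, 1000, 10000]
incidence = [1, 10, 1000, 100, 1000]

log_divisions = [math.log10(x) for x in divisions]
log_incidence = [math.log10(y) for y in incidence]

r2_log_log = r_squared(log_divisions, log_incidence)  # both axes logged
r2_log_x = r_squared(log_divisions, incidence)        # only x logged
r2_raw = r_squared(divisions, incidence)              # neither logged

print(f"log-log: {r2_log_log:.2f}, log x only: {r2_log_x:.2f}, raw: {r2_raw:.2f}")
```

With these toy numbers the log-log figure is noticeably higher than the other two, which is the same direction as in the real graphs.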

Except, even that isn’t quite right. If you look at my two graphs it’s clear that the correlation will be driven by the top three points. Two of those are familial colorectal cancers, and the incidence quoted is the incidence in people with the relevant mutations; the third is basal cell carcinoma, which barely counts as cancer from a medical or public health viewpoint.

If we leave out the familial cancers and basal cell carcinoma, the proportion explained drops to about 10%.

If we put back basal cell carcinoma, something statistically interesting happens. The correlation shoots back up again, but only because it’s being driven by a single point. A more honest correlation estimate, predicting each point based on the other points and not based on itself, is much lower.
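One common way to do the "predict each point from the others" calculation is leave-one-out cross-validation. The sketch below uses made-up data (four points with no trend plus one extreme point), not the cancer data, and the leave-one-out R² defined as one minus the ratio of squared prediction errors to total variation; that definition is my choice for the illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def total_variation(ys):
    my = sum(ys) / len(ys)
    return sum((y - my) ** 2 for y in ys)

def in_sample_r2(xs, ys):
    """Usual R^2: each point is predicted from a fit that includes it."""
    slope, intercept = fit_line(xs, ys)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - sse / total_variation(ys)

def leave_one_out_r2(xs, ys):
    """R^2 where each point is predicted from a fit to the OTHER points."""
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (intercept + slope * xs[i])) ** 2
    return 1 - press / total_variation(ys)

# Made-up data: four points with no trend, plus one extreme point
xs = [0, 1, 0, 1, 10]
ys = [0, 0, 1, 1, 10]

usual = in_sample_r2(xs, ys)
honest = leave_one_out_r2(xs, ys)
print(f"in-sample R^2: {usual:.2f}, leave-one-out R^2: {honest:.2f}")
```

In this toy example the usual R² is above 0.95, but the leave-one-out version is negative: without the extreme point, the other four give no hint of where it will be.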

So, in summary: the “two-thirds of cancers explained” is Just Wrong. Doing a mathematically correct calculation gives about one third. Doing a calculation that’s actually relevant to cancer in the population gives even smaller values. (update) That’s not to say that DNA replication errors are unimportant — the paper makes it clear that they are important.

December 30, 2014

How dangerous was flying this year?

The Washington Post says

With yet another airliner gone missing over Southeast Asian airspace, there’s no question that 2014 has been a year beset by mysterious air tragedies. But there’s a surprising fact hiding behind this year’s high-profile air tragedies: 2014 has been the safest year for flying since, well, ever.

When you look at their data, the claim is true if you are an aeroplane. If you are a passenger, the claim is false.

That is, 2014 (so far) has had 20 crashes by commercial flights carrying 14 or more passengers. That’s the lowest on record.

On the other hand, there have been 1007 fatalities in crashes of commercial flights carrying 14 or more passengers, which is about four times the number in 2013. You have to go back to the 1990s before 1000 deaths in a year becomes normal.

What has been unusual this year is that big planes have crashed. The missing AirAsia flight is an A320, the two Malaysia Airlines planes were 777s, the Air Algérie plane was an MD-83.

So is flying more dangerous? It’s hard to say. The trend over the past decade is still downwards, and the two Malaysia Airlines flights probably don’t indicate a pattern that applies to other airlines (even if it might make one nervous about that airline). It’s too soon to say for the AirAsia flight.

The absolute risk was still extremely low: in 2013 there were 3 billion air passenger departures, so 1000 deaths would be one in three million.
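The calculation is just a division, sketched here with the round numbers above:

```python
passenger_departures = 3_000_000_000  # 2013 figure quoted in the post
deaths = 1000                         # round number used in the post

risk_per_departure = deaths / passenger_departures
one_in = passenger_departures / deaths

print(f"Risk per departure: {risk_per_departure:.2e} (one in {one_in:,.0f})")
```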



December 29, 2014

How headlines sometimes matter

From the New Yorker, an unusual source for StatsChat, an article about research on the impact of headlines.  I often complain that the headline and lead are much more extreme than the rest of the story, and this research looks into whether this is just naff or actually misleading.

In the case of the factual articles, a misleading headline hurt a reader’s ability to recall the article’s details. That is, the parts that were in line with the headline, such as a declining burglary rate, were easier to remember than the opposing, non-headlined trend. Inferences, however, remained sound: the misdirection was blatant enough that readers were aware of it and proceeded to correct their impressions accordingly. […]

In the case of opinion articles, however, a misleading headline, like the one suggesting that genetically modified foods are dangerous, impaired a reader’s ability to make accurate inferences. For instance, when asked to predict the future public-health costs of genetically modified foods, people who had read the misleading headline predicted a far greater cost than the evidence had warranted.

Set to a possibly recognisable tune

The Risk Song: One hundred and eight hazards in 80 seconds

(via David Spiegelhalter)

December 11, 2014

Very like a whale

We see patterns everywhere, whether they are there or not. This gives us conspiracy theories, superstition, and homeopathy. It’s really hard to avoid drawing conclusions about patterns, even when you know they aren’t really there.

Some of the most dramatic examples are visual:

Do you see yonder cloud that’s almost in shape of a camel?

By the mass, and ’tis like a camel, indeed.

Methinks it is like a weasel.

It is backed like a weasel.

Or like a whale?

Very like a whale.

Hamlet was probably trolling, but he got away with it because seeing shapes in the clouds is a common experience.

Just as we’re primed to see causal relationships whether they are there or not, we are also primed to recognise shapes whether they are there or not. The compulsion is perhaps strongest for faces, as in this bitter melon (karela) from Reddit


and this badass mop


It turns out that computers can be taught similar illusions, according to new research from the University of Wyoming.  The researchers took software that had been trained to recognise certain images. They then started off with random video snow or other junk patterns and made repeated random changes, evolving images that the computer would recognise.


These are, in a sense, computer optical illusions. We can’t see them, but they are very convincing to a particular set of artificial neural networks.
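The general idea of starting from noise and keeping random changes that the classifier likes can be sketched in a few lines. This is a toy, not the researchers' method: the "classifier" here is a made-up random linear score standing in for a trained neural network, and the search is simple hill-climbing instead of their evolutionary algorithm. All names are hypothetical.

```python
import math
import random

def make_toy_classifier(n_pixels, rng):
    """Stand-in for a trained network: a fixed random linear score,
    squashed to a 'confidence' between 0 and 1."""
    weights = [rng.uniform(-1, 1) for _ in range(n_pixels)]

    def confidence(image):
        total = sum(w * p for w, p in zip(weights, image))
        return 1 / (1 + math.exp(-total))

    return confidence

def evolve_image(confidence, n_pixels, steps, rng):
    """Start from random noise; repeatedly change one random pixel and
    keep the change whenever the classifier's confidence does not drop."""
    image = [rng.random() for _ in range(n_pixels)]
    start = confidence(image)
    best = start
    for _ in range(steps):
        candidate = list(image)
        candidate[rng.randrange(n_pixels)] = rng.random()
        score = confidence(candidate)
        if score >= best:
            image, best = candidate, score
    return image, start, best

rng = random.Random(2014)
confidence = make_toy_classifier(64, rng)
image, start_score, final_score = evolve_image(confidence, 64, 2000, rng)
print(f"confidence on initial noise: {start_score:.3f}, after evolving: {final_score:.3f}")
```

The evolved "image" is still just a list of 64 numbers that would look like noise to us, but the toy classifier ends up very confident about it.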

There are two points to this. The first is that when you see a really obvious pattern it isn’t necessarily there. The second is that even if computers are trained to classify a particular set of examples accurately, they needn’t do very well on completely different sets of examples.

In this case the computer was looking for robins and pandas, but it might also have been trained to look for credit card fraud or terrorists.


December 7, 2014

Bot or Not?

Turing had the Imitation Game, Philip K. Dick had the Voight-Kampff Test, and spammers gave us the CAPTCHA. The Truthy project at Indiana University has BotOrNot, which is supposed to distinguish real people on Twitter from automated accounts, ‘bots’, using analysis of their language, their social networks, and their retweeting behaviour. BotOrNot seems to sort of work, but not as well as you might expect.

@NZquake, a very obvious bot that tweets earthquake information from GeoNet, is rated at an 18% chance of being a bot.  Siouxsie Wiles, for whom there is pretty strong evidence of existence as a real person, has a 29% chance of being a bot.  I’ve got a 37% chance, the same as @fly_papers, which is a bot that tweets the titles of research papers about fruit flies, and slightly higher than @statschat, the bot that tweets StatsChat post links,  or @redscarebot, which replies to tweets that include ‘communist’ or ‘socialist’. Other people at a similar probability include Winston Peters, Metiria Turei, and Nicola Gaston (President of the NZ Association of Scientists).

PicPedant, the twitter account of the tireless Paulo Ordoveza, who debunks fake photos and provides origins for uncredited ones, rates at 44% bot probability, but obviously isn’t.  Ben Atkinson, a Canadian economist and StatsChat reader, has a 51% probability, and our only Prime Minister (or his twitterwallah), @johnkeypm, has a 60% probability.