April 12, 2017

Criteria for criteria for mānuka honey

There’s a new proposed definition of NZ Mānuka Honey, as you may have seen. The MPI page on the topic is here; no-one is linking it, which is sad because it’s interesting if you’re enough of a nerd.

I’m not going to comment on the biochemistry or botany, but there are two statistically-interesting parts of the proposal.  First, how the statistical method for classifying honey was constructed. The document says:

A classification modelling approach (CART – classification and regression tree) was the most suitable method of analysis for determining the identification criteria for mānuka honey because:

  • test results for several different attributes were available and needed to be assessed in combination;
  • the identification criteria needed to be related to the attributes tested;
  • the identification criteria needed to be straightforward, transparent and easily interpreted
  • the outputs would enable an unknown honey sample to be authenticated as monofloral or multifloral mānuka honey.

CART is a relatively old classification method, developed in the early 1980s by adding statistical ‘pruning’ to automated methods for building decision trees. It hasn’t been the most accurate method in head-to-head prediction competitions for a long time now, but it remains very useful for basically the reasons the MPI scientists gave.  CART tends to end up with simple rules based on whether a small selection of variables all or mostly exceed some thresholds, and while building a good CART prediction rule takes experience and statistical knowledge, using it doesn’t.

Using a collection of honey samples from known origins, and other information about chemical composition of the plants, a rule was developed for distinguishing mānuka honey from other NZ honeys such as kānuka or pōhutukawa, and from Leptospermum species other than mānuka. The resulting rule for monofloral (‘pure’) mānuka honey is a threshold that four chemicals have to exceed, plus the presence of mānuka DNA.  For multifloral mānuka honey, the threshold for one of the four chemicals is lowered.
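To show why a rule of this shape is easy to use once someone else has built it, here is a minimal Python sketch. The chemical names and cut-offs are made-up placeholders, not the values in the MPI proposal, and the sketch assumes the multifloral rule still requires mānuka DNA.

```python
# Hypothetical placeholders only: these are NOT the chemicals or
# thresholds in the MPI proposal.
MONOFLORAL_THRESHOLDS = {
    "chem_a": 10.0,
    "chem_b": 10.0,
    "chem_c": 10.0,
    "chem_d": 10.0,
}
# Multifloral: the same rule with the threshold for one chemical lowered.
MULTIFLORAL_THRESHOLDS = {**MONOFLORAL_THRESHOLDS, "chem_a": 2.0}


def classify(levels, manuka_dna_present):
    """Apply a threshold rule of the kind a classification tree produces."""

    def passes(thresholds):
        # Every chemical must exceed its threshold, and the DNA must be present
        # (assumed here to apply to both monofloral and multifloral honey).
        return manuka_dna_present and all(
            levels[name] > cutoff for name, cutoff in thresholds.items()
        )

    if passes(MONOFLORAL_THRESHOLDS):
        return "monofloral mānuka"
    if passes(MULTIFLORAL_THRESHOLDS):
        return "multifloral mānuka"
    return "not mānuka"


sample = {"chem_a": 5.0, "chem_b": 12.0, "chem_c": 12.0, "chem_d": 12.0}
print(classify(sample, manuka_dna_present=True))
```

Applying the rule is a handful of comparisons; the statistical work is all in choosing which variables and thresholds go into it.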

The second interesting aspect of the criteria is that none of the four chemicals have anything to do with real or imagined medical benefits of mānuka honey.  Methylglyoxal, the leading candidate for a somewhat mānuka-specific antimicrobial, isn’t in there.  The rule attempts to identify honey produced by bees foraging on mānuka flowers, and scientists know what a mānuka flower is. It doesn’t try to identify honey that prevents miscellaneous diseases when you eat it, because no-one knows what characteristics that honey would have, or even if it exists.

As I’ve noted before, the largest controlled trial of eating mānuka honey to prevent minor illness was conducted by a London primary school. On the other hand, people are willing to pay a lot of money for honey from NZ mānuka, and as long as MPI isn’t officially supporting the health arguments I’m definitely in favour of that money going to NZ apiarists rather than counterfeiters.

Are you related to your ancestors?

Two people have emailed me this story (one via Stuff, one via the Herald) about the DNA ancestry of Oriini Kaipara, a TV presenter:

An analysis of the DNA of Oriini Kaipara, 33, has shown that – despite her having both Maori and Pakeha ancestry – her genes only contain Maori DNA. That makes her, in her own words, a “full-blooded Maori”.

Culturally, people identify as Maori through their whakapapa, while legally a person is defined as Maori if they are of Maori descent, even through one long-distant ancestor.

However, the intermingling of different ethnicities in New Zealand over the past 200 years means all Maori people are thought to have some non-Maori ancestry, so would not be expected to have 100 per cent Maori DNA.

It seems strange that someone could have an ancestor from whom they got no DNA, but while most ‘ancestry and genetics’ news stories are completely bogus, this one probably isn’t.

Ignoring the X and Y chromosomes to start with, you have 22 chromosomes from your mother and 22 from your father (except for some rare cases such as people with Down syndrome, who have an extra copy of one of them, usually from their mothers).  Each of your maternal chromosomes is a combination of DNA from your mother’s father and mother’s mother, in chunks averaging about 1/4 chromosome long. Each of your paternal chromosomes is a combination of DNA from your father’s father and father’s mother, in chunks averaging about 1/4  chromosome long.  So, on average, you have 1/4 of your DNA from each grandparent, but it’s random.  You might have only tiny chunks from one grandparent and almost 50% from another.

As we go back further, after N generations you have 2^N direct ancestors, but the chunks of DNA being inherited are about 1/(2N) chromosomes long.  So, going back 10 generations you have 1024 ancestors and you’re inheriting DNA chunks about 1/20th of a chromosome long.   But with 22 pairs of chromosomes, that only allows you to fit in chunks from 20×2×22=880 of your great⁸-grandparents.   So, you almost certainly have DNA from all your grandparents, and very likely from all your great-grandparents, but it’s unlikely you have DNA from all your ancestors ten generations back, and the proportion you have DNA from goes down and down the further back you go.  Europeans in NZ don’t go all that far back, so the probability is pretty high for any given European ancestor of a modern Māori, but it’s not 100%.
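The arithmetic in that paragraph can be written out as a short Python sketch. It uses only the rough approximations from the text (chunks about 1/(2N) of a chromosome long, 22 pairs of chromosomes); the function names are just for illustration, and this is a back-of-envelope bound rather than a genetic model.

```python
AUTOSOME_PAIRS = 22  # ignoring the X and Y chromosomes, as in the text


def ancestors(generations):
    """Number of direct ancestors N generations back (ignoring overlap)."""
    return 2 ** generations


def max_contributing_ancestors(generations):
    """Rough upper limit on how many of them can have left you any DNA.

    Chunks are about 1/(2N) of a chromosome long, so each chromosome holds
    about 2N chunks, and there are 2 x 22 chromosomes to fill.
    """
    chunks_per_chromosome = 2 * generations
    return chunks_per_chromosome * 2 * AUTOSOME_PAIRS


for n in (2, 3, 5, 10, 15):
    print(n, ancestors(n), max_contributing_ancestors(n))
```

At two or three generations there is far more room than there are ancestors; at ten generations the 1024 ancestors already outnumber the 880 available slots, and the gap widens quickly after that because one number doubles each generation while the other grows only linearly.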

In modern New Zealand, most Māori will have more non-Māori ancestors than Ms Kaipara does, and most people with only two non-Māori ancestors will have inherited DNA from at least one of them, so it would be unusual for someone to have no non-Māori DNA, but certainly not impossible.

The next question is how the genetic testing people can know which DNA came from Māori ancestors.  The DNA bases that end up in a saliva sample are synthesised in your body from the food you eat: they don’t come with little labels saying which ancestor’s DNA they are copies of.  One adenine base looks just like any other.  The approach to this problem is statistical: there are many, many positions in the DNA sequence where particular variations are more common in one part of the world than in others. Some of these are well known because of what they do, but those are a tiny minority; nearly all of them are unimportant copying errors. In any case, two people who share the variant probably got it from the same distant ancestor, so if you collect enough DNA variants from enough people around the world, you can tell with surprising reliability where people’s ancestors came from.

Here’s a picture from research in the USA, showing three genetic summaries for people identifying with various Hispanic/Latinx groups:

[Figure: three genetic summaries for US Hispanic/Latinx groups, with legend]

There’s pretty clear separation: in this sample you can tell quite a lot about a typical person’s ancestry from their genes.  No single genetic variant will tell you much, but thousands or millions of them together tell you a lot.  In this example, the three summaries correspond roughly to amounts of ancestry from the Americas before Columbus, from Europe, and from western Africa via the slave trade.
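A toy simulation shows how many individually uninformative variants add up. This is only a sketch under invented assumptions: two hypothetical populations, every variant carried by 50% of people in one and 55% in the other, and variants inherited independently. Real ancestry methods are considerably more elaborate.

```python
import math
import random


def simulate_accuracy(n_variants, n_people=200, freq_a=0.50, freq_b=0.55, seed=1):
    """Fraction of simulated people assigned to the correct population.

    Hypothetical setup: each variant is carried with probability freq_a in
    population A and freq_b in population B, independently across variants.
    """
    rng = random.Random(seed)
    # Log-likelihood ratio (A versus B) for carrying / not carrying a variant.
    llr_carry = math.log(freq_a / freq_b)
    llr_absent = math.log((1 - freq_a) / (1 - freq_b))
    correct = 0
    for person in range(n_people):
        from_a = person % 2 == 0  # alternate the true population
        freq = freq_a if from_a else freq_b
        llr = 0.0
        for _ in range(n_variants):
            llr += llr_carry if rng.random() < freq else llr_absent
        guess_a = llr > 0  # assign to whichever population is more likely
        correct += guess_a == from_a
    return correct / n_people


for n in (1, 100, 2000):
    print(n, simulate_accuracy(n))
```

With one variant the guess is barely better than a coin toss; with a couple of thousand it is right nearly every time, even though no individual variant has become any more informative.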

The test used by ancestry.com measures 700,000 DNA variants, which is a respectable number.  It’s probably a bit short on markers for Polynesian ancestry, because there hasn’t been much genetic study of Polynesians. It will be very short on markers that distinguish Māori from other people with Polynesian ancestry, but in this example, family history was enough to make that unnecessary.  So, it’s plausible that some Māori have little or no non-Māori DNA, and it’s plausible that ancestry.com could determine that with reasonable reliability: the story is making a claim that has some content and could very well be true.  As the story says, the result doesn’t actually matter much, but it is interesting.

Without Ms Kaipara’s family history, just using genetic data, the video clip says her Polynesian ancestry was estimated as between 93% and 100%: there’s quite a bit of uncertainty.   For someone with a less clearly known family history, or from somewhere that mixing of populations happened longer ago than two centuries, the test will be less informative, but will still give some general information about what parts of the world your ancestors may have come from.  You might still want to know.

What this story should make you concerned about, though, is other news stories talking about someone’s descent from, say, Genghis Khan.  If Ms Kaipara can have recent ancestors whose DNA she doesn’t appear to carry, how can claims from 1000 years in the past be credible? And indeed they aren’t.  As you go back further and further in time,  you have more and more ancestors. By the time of Genghis Khan, there would be tens of billions of them.  Obviously there must be huge overlap, but that still allows you to be descended from a lot of people. Pretty much everyone in Europe and Asia has Genghis Khan as an ancestor; a fraction of them carry DNA descended from his; and a tiny fraction of these have copies of his Y chromosome.  The test results that more often make headlines are the last sort, which are pretty meaningless.

 

April 11, 2017

Super 18 Predictions for Round 8

Team Ratings for Round 8

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team         Current Rating  Rating at Season Start  Difference
Hurricanes            16.86                   13.22        3.60
Chiefs                10.16                    9.75        0.40
Crusaders              9.42                    8.75        0.70
Highlanders            8.29                    9.17       -0.90
Lions                  7.10                    7.64       -0.50
Stormers               4.73                    1.51        3.20
Brumbies               4.68                    3.83        0.90
Blues                  1.96                   -1.07        3.00
Waratahs               1.24                    5.81       -4.60
Sharks                 1.05                    0.42        0.60
Jaguares              -1.48                   -4.36        2.90
Bulls                 -3.22                    0.29       -3.50
Force                 -8.75                   -9.45        0.70
Cheetahs             -10.03                   -7.36       -2.70
Reds                 -10.56                  -10.28       -0.30
Rebels               -12.72                   -8.17       -4.60
Kings                -17.24                  -19.02        1.80
Sunwolves            -18.61                  -17.76       -0.80

 

Performance So Far

So far there have been 56 matches played, 44 of which were correctly predicted, a success rate of 78.6%.
Here are the predictions for last week’s games.

Game                       Date    Score    Prediction  Correct
1 Hurricanes vs. Waratahs  Apr 07  38 – 28       20.90  TRUE
2 Sunwolves vs. Bulls      Apr 08  21 – 20      -13.10  FALSE
3 Highlanders vs. Blues    Apr 08  26 – 20       10.40  TRUE
4 Brumbies vs. Reds        Apr 08  43 – 10       16.80  TRUE
5 Sharks vs. Jaguares      Apr 08  18 – 13        6.70  TRUE
6 Stormers vs. Chiefs      Apr 08  34 – 26       -2.70  FALSE
7 Force vs. Kings          Apr 09  46 – 41       13.50  TRUE
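
For readers wondering how the Correct column relates to the other columns, here is a small Python sketch using last week’s games. It assumes a prediction counts as correct when the sign of the predicted margin matches the sign of the actual home-minus-away margin; draws are not handled, and the function name is just for illustration.

```python
# (home, away, home_score, away_score, predicted_margin), from the table above
LAST_WEEK = [
    ("Hurricanes", "Waratahs", 38, 28, 20.90),
    ("Sunwolves", "Bulls", 21, 20, -13.10),
    ("Highlanders", "Blues", 26, 20, 10.40),
    ("Brumbies", "Reds", 43, 10, 16.80),
    ("Sharks", "Jaguares", 18, 13, 6.70),
    ("Stormers", "Chiefs", 34, 26, -2.70),
    ("Force", "Kings", 46, 41, 13.50),
]


def prediction_correct(home_score, away_score, predicted_margin):
    """True if the predicted winner (sign of the margin) actually won."""
    actual_margin = home_score - away_score
    return (actual_margin > 0) == (predicted_margin > 0)


results = [
    prediction_correct(home_score, away_score, predicted)
    for _, _, home_score, away_score, predicted in LAST_WEEK
]
print(results)
print(f"{sum(results)} of {len(results)} correct last week")

# Season-to-date success rate quoted above: 44 correct out of 56 matches.
print(f"{44 / 56:.1%}")
```

This reproduces the TRUE/FALSE column for the seven games and the 78.6% season figure.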

 

Predictions for Round 8

Here are the predictions for Round 8. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                       Date    Winner      Prediction
1 Crusaders vs. Sunwolves  Apr 14  Crusaders        32.00
2 Reds vs. Kings           Apr 15  Reds             10.70
3 Blues vs. Hurricanes     Apr 15  Hurricanes      -11.40
4 Rebels vs. Brumbies      Apr 15  Brumbies        -13.90
5 Cheetahs vs. Chiefs      Apr 15  Chiefs          -16.20
6 Stormers vs. Lions       Apr 15  Stormers          1.10
7 Bulls vs. Jaguares       Apr 15  Bulls             2.30

 

NRL Predictions for Round 7

Team Ratings for Round 7

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team          Current Rating  Rating at Season Start  Difference
Raiders                10.11                    9.94        0.20
Storm                   8.06                    8.49       -0.40
Broncos                 6.26                    4.36        1.90
Sharks                  6.05                    5.84        0.20
Panthers                3.62                    6.08       -2.50
Cowboys                 2.98                    6.90       -3.90
Dragons                 0.52                   -7.74        8.30
Roosters               -0.12                   -1.17        1.00
Sea Eagles             -1.49                   -2.98        1.50
Eels                   -1.82                   -0.81       -1.00
Rabbitohs              -2.39                   -1.82       -0.60
Bulldogs               -2.67                   -1.34       -1.30
Titans                 -4.83                   -0.98       -3.90
Wests Tigers           -5.99                   -3.89       -2.10
Warriors               -6.28                   -6.02       -0.30
Knights               -14.05                  -16.94        2.90

 

Performance So Far

So far there have been 48 matches played, 25 of which were correctly predicted, a success rate of 52.1%.
Here are the predictions for last week’s games.

Game                        Date    Score    Prediction  Correct
1 Broncos vs. Roosters      Apr 06  32 – 8         7.30  TRUE
2 Knights vs. Bulldogs      Apr 07  12 – 22       -7.40  TRUE
3 Panthers vs. Rabbitohs    Apr 07  20 – 21       11.50  FALSE
4 Sea Eagles vs. Dragons    Apr 08  10 – 35        6.20  FALSE
5 Titans vs. Raiders        Apr 08  16 – 42       -8.70  TRUE
6 Cowboys vs. Wests Tigers  Apr 08  16 – 26       16.50  FALSE
7 Warriors vs. Eels         Apr 09  22 – 10       -2.80  FALSE
8 Storm vs. Sharks          Apr 09  2 – 11         8.20  FALSE

 

Predictions for Round 7

Here are the predictions for Round 7. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                      Date    Winner    Prediction
1 Bulldogs vs. Rabbitohs  Apr 14  Bulldogs        3.20
2 Knights vs. Roosters    Apr 14  Roosters      -10.40
3 Broncos vs. Titans      Apr 14  Broncos        14.60
4 Sea Eagles vs. Storm    Apr 15  Storm          -6.10
5 Raiders vs. Warriors    Apr 15  Raiders        20.40
6 Dragons vs. Cowboys     Apr 15  Dragons         1.00
7 Panthers vs. Sharks     Apr 16  Panthers        1.10
8 Eels vs. Wests Tigers   Apr 17  Eels            7.70

 

April 10, 2017

Attack of the killer sofa

From the Herald (from the Daily Mail)

Materials used to fireproof sofas are linked to a 74% rise in thyroid tumours

From the American Cancer Society

The chance of being diagnosed with thyroid cancer has risen in recent years and is the most rapidly increasing cancer in the US tripling in the past three decades. Much of this rise appears to be the result of the increased use of thyroid ultrasound, which can detect small thyroid nodules that might not otherwise have been found in the past.

That is, thyroid cancer looks as if it’s more common at least partly because diagnosis has improved. It could potentially still be true that fire retardants are a problem as well, but the “killer sofa” people either don’t know about the changes in diagnosis or do know but don’t think we need to be told.  Either way, I don’t think it increases their credibility.

Briefly

  • Good piece at Stuff about what a 500-year flood is. The concept isn’t quite as shaky as it sounds — there’s some independent information from comparing different river systems — but it’s inevitably uncertain.
  • 23andme is back providing genetic risk information, but in a much more restricted way after FDA review.  A lot of the risk information you can get this way isn’t useful for treatment, but it’s the sort of thing some people like to know.  So, sometimes, do their insurance companies.
  • The idea of ‘net tax’, meaning tax paid minus cash benefits and transfers (but not non-cash ones such as Pharmac subsidies), can be useful.  However, I don’t think it’s as useful when ‘tax’ leaves out GST, as in this story at Stuff.  Admittedly, it’s not trivial to calculate how much GST people pay, but I’m sure the Treasury has looked at it.
  • Scientists and journalists need to get better at communicating uncertainty, and people need to accept it’s there. (Ed Yong, in the Atlantic)

Stat of the Week Competition: April 8 – 14 2017

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

  • Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday April 14 2017.
  • Statistics can be bad, exemplary or fascinating.
  • The statistic must be in the NZ media during the period of April 8 – 14 2017 inclusive.
  • Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.

April 5, 2017

Extrapolation, much?

Headline: Research has found that Marmite could help prevent dementia

Research article:  A group of 28 adult volunteers (10 males, mean age 22 years) completed the study after providing written informed consent.

We could just stop there, but it gets better (not better)

The study found that the people getting Marmite had, as hypothesised, less response by their brains to flickering visual stimuli.  The research paper does not mention dementia (or memory, or Alzheimer’s). At all. It concludes:

“This demonstrates that the balance of excitation and inhibition in the brain can be influenced by dietary interventions, suggesting possible clinical benefits in conditions (e.g. epilepsy) where inhibition is abnormal.”

Even the story doesn’t come close to the headline claims, saying just

It could also prompt further research to see if Marmite, and its effect on the brain’s GABA chemical, might provide a treatment for dementia.

And, right at the end of the story, the quote from an independent expert

“there’s no way to say from this study whether eating Marmite does affect your dementia risk.”

If it does, and if that’s because of the vitamin B12, it might also have been worth mentioning that there are other foods with as much or more vitamin B12 per serving, such as beef, and lamb, and many types of fish.

 

Briefly

  • If someone told me a longstanding problem in mathematical statistics had been solved, but then admitted the proof was short, used fairly elementary techniques, was written with Microsoft Word, and was published in the Far East Journal of Theoretical Statistics, I might not be in a hurry to look it up.  These are all genuinely reasonable filters for mathematical papers that are worth putting effort into. But, in this case, they were all false positives. Quanta Magazine has the story.
  • From The Conversation, “The seven deadly sins of statistical misinterpretation, and how to avoid them”.
  • From Newsroom (who seem to be quite good so far): the interaction of recreational genotyping and health insurance in NZ.
  • From The Conversation, how website terms of use (and their potential criminal enforcement in the US) affect research into fairness and transparency of algorithms.
  • Good Herald interview on air pollution with NIWA scientist Elizabeth Somervell.

April 4, 2017

Attack of the killer margarine: the reboot

In 2015, the Herald had a story from the Daily Telegraph on the alleged risks of margarine:

Saturated fat found in butter, meat or cream is unlikely to kill you, but margarine just might, new research suggests.

Traditionally people have been advised to reduce animal fats, but the biggest ever study has shown they do not increase the risk of stroke, heart disease or diabetes. However, trans fats, found in processed foods such as margarine, raise the risk of death by 34 per cent in less than a decade.

“For years everyone has been advised to cut out fats,” said study lead author Doctor Russell de Souza, an assistant professor in the Department of Clinical Epidemiology and Biostatistics, at McMaster University in Canada.

It’s a bit unclear exactly what “raise the risk of death by 34 per cent in less than a decade” is supposed to mean, but we’ll get to that. The research paper was in the BMJ, and came out on the same day the story did.

Today, in 2017, the Herald had a story from the Daily Telegraph on the alleged risks of margarine:

Saturated fat found in butter, meat or cream is unlikely to kill you, but margarine just might, new research suggests.

Although traditionally dieticians have advised people to cut down on animal fats, the biggest ever study has shown that it does not increase the risk of stroke, heart disease or diabetes.

However trans-fats, found in processed foods like margarine raises the risk of death by 34 per cent.

“For years everyone has been advised to cut out fats,” said study lead author Doctor Russell de Souza, an assistant professor in the Department of Clinical Epidemiology and Biostatistics, at McMaster University in Canada.

It’s a bit unclear exactly what “raise the risk of death by 34 per cent in less than a decade” is supposed to mean, but we’ll get to that. The research paper was in the BMJ, and came out nearly two years before the story did.

Yes, it really seems to be the same ‘new research’: Dr de Souza hasn’t just published another meta-analysis. It even seems to be the same Telegraph story; I couldn’t find a new one.

So, how scared should we be of trans fats in our diets?  Food Standards Australia New Zealand say

Monitoring of TFAs in the Australian and New Zealand food supply has found that Australians obtain on average 0.5 per cent of their daily energy intake from TFAs and New Zealanders on average 0.6 per cent. This is well below the WHO recommendation of no more than 1 per cent.

They also say that the majority of that 0.6% is made by bacteria in the rumens of cows and sheep, not by industrial hydrogenation; the evidence of harm is weaker for these natural trans fats.

Now, back to the 34% statistic. This is based on two studies. One compared the 20% of people with the highest trans fat intakes to the 20% with the lowest and found a rate ratio of 1.24. The other, smaller, one estimated the ratio as 1.71 between the highest and lowest 25%.  These are rate ratios estimated from people in their 60s. Since the actual probability of death in any given year would have been about 1%, the absolute risk increase is smaller than “34% in less than a decade” sounds, but not at all trivial.  For comparison, the all-cause mortality rate ratio for current smoking is about 3.0, or 200% higher than non-smokers.
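
To put those ratios on an absolute scale, here is a rough Python sketch. It assumes the roughly 1% annual probability of death mentioned above and treats a rate ratio as if it multiplied that probability directly, which is only an approximation (reasonable when the annual risk is small). The function names are just for illustration.

```python
BASELINE_ANNUAL_RISK = 0.01  # about 1% per year, as in the text


def excess_deaths_per_1000(baseline_annual_risk, rate_ratio):
    """Approximate extra deaths per 1000 people per year at this rate ratio."""
    return 1000 * baseline_annual_risk * (rate_ratio - 1)


def percent_higher(rate_ratio):
    """Relative increase implied by a rate ratio, as a percentage."""
    return 100 * (rate_ratio - 1)


for label, rate_ratio in [
    ("larger trans fat study", 1.24),
    ("headline figure", 1.34),
    ("smaller trans fat study", 1.71),
    ("current smoking", 3.0),
]:
    extra = excess_deaths_per_1000(BASELINE_ANNUAL_RISK, rate_ratio)
    print(f"{label}: {percent_higher(rate_ratio):.0f}% higher, "
          f"about {extra:.1f} extra deaths per 1000 people per year")
```

On these assumptions a 34% increase works out to a little over three extra deaths per thousand people per year, compared with about twenty for current smoking.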

More importantly, though, we’re talking about a lot of trans fat in these studies. In the larger study with the less-scary rate ratio, people in the lowest 20% of trans fat intake got an average of 1.6% of their calories from it. That is, the lowest-risk group were eating nearly three times as much trans fat as an average Kiwi today.  In the smaller study, they don’t give actual trans fat information for the groups they are comparing, but averaged over the whole study about 9% of the fat in the blood was trans fat: if that even roughly translates to proportions of dietary fat, they were also getting more than the typical Kiwi today.

There just isn’t that much trans fat in most margarine any more: less than 1% on average (according to Food Standards Oz/NZ, table 2). There used to be a lot, but then we found out it’s bad for you.  Those scary numbers are actually good news if they’re true: they’d measure how much better off margarine consumers are today than twenty years ago.

(via Mark Hanna)