Posts from November 2019 (9)

November 21, 2019

What do statisticians do all day?

From the student dissertation talks this week

  • Toward Modeling of Disease Transmission Networks (E. coli in Germany)
  • Real Time Bus Headway Estimation in Auckland, New Zealand (the last bus was on time but your bus is late)
  • Survival Analysis of Guinea pigs with dental diseases (worse than you think)
  • Sample Path Behaviour of Accumulating Priority Queues
  • Comparison of the Bayesian and Simple Model for Estimations
  • Finding Periodic Climate Cycles in a Mud Core (we can’t see the sunspot cycle; we have a sad)
  • Optimal Sample Allocation for Estimating Regression Parameters (if it’s optimal for one purpose it won’t be for others)
  • Queue Mining — Online Delay Prediction
  • Enabling Text Analytics (simpler software)
  • Systematic Error Removal using Random Forests (in metabolomics)
  • Tracking of Dietary Patterns as Children Grow Up
  • Exploration of the Effects of a Text-Message-Based Diabetes Self-Management Support Programme (it works!)
  • A Study of Ticketing Prediction in the Events Industry (they didn’t give us the right data)
  • Anomaly Detection in Business Transactions Using Supervised and Unsupervised Methods (Fraud, we haz it)
  • Designing for a conceptual understanding of the Mean and Standard Deviation
  • Detecting Ecological Change along Environmental Gradients (for critters that live near the shore)
  • Identifying the Best Predictors for Power Demand Across Auckland
  • Automatic Identification of Patient Smoking Status based on Unstructured Clinical Notes (you’d think doctors would just say ‘smoker’. Sadly, no)
  • Visualization of Network Data (ooh, pretty)
  • Brownian Motions and Excursions
  • New methods for estimating population size based on close-kin genetics and extensions (Whales and inbreeding and population size)
  • An Examination of the Relationship between Student Engagement and Academic Achievement (it looks like lectures and tutorials are useful, but confounding)

November 19, 2019

Test for breast cancer?

Newshub (and a lot of the British press) reported a couple of weeks ago “New blood test could detect breast cancer five years before symptoms”.

There’s a problem. Well, more than one problem.

First, the accuracy of the test is terrible. It missed the majority of cancers and falsely diagnosed about twenty percent of the normal samples as having cancer. There’s no way anyone would use a test like that.
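To see why error rates like these rule a test out for screening, here is a rough sketch of the arithmetic. The 40% sensitivity and the 0.5% prevalence are made-up illustrative values (the figures above only say "missed the majority" and "about twenty percent" false positives), so this is an order-of-magnitude guide, not an estimate from the study.

```python
def positive_predictive_value(sensitivity, false_positive_rate, prevalence):
    """Share of positive test results that are true cancers (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = false_positive_rate * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# ASSUMPTION: hypothetical value consistent with "missed the majority of cancers".
SENSITIVITY = 0.40
# From the post: about twenty percent of normal samples flagged as cancer.
FALSE_POSITIVE_RATE = 0.20
# ASSUMPTION: hypothetical share of screened people who actually have cancer.
PREVALENCE = 0.005

ppv = positive_predictive_value(SENSITIVITY, FALSE_POSITIVE_RATE, PREVALENCE)
print(f"Share of positive results that are real cancers: {ppv:.1%}")
```

Under these assumed inputs, only about 1 positive result in 100 would be a real cancer, and most real cancers would still be missed.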

Second, the story says “They estimate that, with a fully-funded development programme, the test might become available in the clinic in about four-to-five years.” If they had a working test, that might be true. But they don’t. So it isn’t.

And finally, all the breast cancer samples were taken from people who had already been diagnosed, so the idea that you’ll get early diagnosis this way is, at best, hopeful.

Briefly

  • ‘For example, the tweet “I saw him yesterday” is scored as 6 per cent toxic, but it suddenly skyrockets to 95 per cent for the comment “I saw his ass yesterday”.’ The Register, talking about a paper from the University of Washington.
  • “Long-awaited cystic fibrosis drug could turn deadly disease into a manageable condition”. From the Washington Post. However, this drug will be priced at about NZ$485,000 for one year.  At that price, treating 500 people would cost about as much as Pharmac’s top six drugs, or as much as Pharmac currently spends on all cancer drugs. So let’s hope Pharmac can get a good deal.
  • Janelle Shane, optical physicist and AI humorist, has written a book about AI. The title, “You Look Like a Thing and I Love You”, is one of a set of machine-generated pickup lines.
  • “The goals of the advertising business model do not always correspond to providing quality search to users….we believe the issue of advertising causes enough mixed incentives that it is crucial to have a competitive search engine that is transparent and in the academic realm.” The interesting part of this quote is the source: Page L, Brin S (1998) “The Anatomy of a Large-Scale Hypertextual Web Search Engine” (via Slate Money)
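The cost figure in the cystic fibrosis item above is simple multiplication, and can be checked directly. Both inputs come from that item; nothing here says what Pharmac actually spends, only what 500 patients at the quoted price would cost.

```python
PRICE_PER_PATIENT_NZD = 485_000  # quoted annual price per patient, from the item above
PATIENTS = 500                   # number of patients used in the item above

total_nzd = PRICE_PER_PATIENT_NZD * PATIENTS
print(f"NZ${total_nzd / 1e6:.1f} million per year")
```

That is roughly NZ$240 million a year at list price.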

How many renters?

The 1 in 3 figure, as Tava Olsen reminded us on Twitter, is the proportion of homes that are rented.  The proportion of people is probably higher, but it’s surprisingly hard to tell. I’m going to outsource this to me from 2017, when it was a Stat-of-the-Week nomination.

November 12, 2019

The Bird of the Year race as a data visualisation!

The Bird of the Year contest, run by wildlife advocate Forest & Bird, asks the public to vote for their favourite New Zealand native bird. People get very excited by this, with campaigns coalescing around particular birds and much trash-talk between rival camps.

This year, the race was between the kākāpō and the hoiho or yellow-eyed penguin. Find out which bird won and what voting looked like on the way there in a neat little data visualisation by Yvan Richard of Dragonfly Data Science here. Read a story about this year’s race here. #BirdoftheYear

Rugby Premiership Predictions for Round 5

Team Ratings for Round 5

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team                Current Rating  Rating at Season Start  Difference
Saracens                      8.50                    9.34       -0.80
Exeter Chiefs                 6.09                    7.99       -1.90
Northampton Saints            1.92                    0.25        1.70
Sale Sharks                   1.42                    0.17        1.20
Gloucester                    0.92                    0.58        0.30
Bath                          0.30                    1.10       -0.80
Bristol                      -0.32                   -2.77        2.50
Wasps                        -0.98                    0.31       -1.30
Harlequins                   -1.56                   -0.81       -0.80
Worcester Warriors           -2.26                   -2.69        0.40
Leicester Tigers             -3.48                   -1.76       -1.70
London Irish                 -4.33                   -5.51        1.20

Performance So Far

So far there have been 24 matches played, 18 of which were correctly predicted, a success rate of 75%.
Here are the predictions for last week’s games.

Game                                      Date      Score  Prediction  Correct
1 Sale Sharks vs. Wasps                   Nov 09  28 – 18        6.30  TRUE
2 Bath vs. Northampton Saints             Nov 10  22 – 13        2.00  TRUE
3 Gloucester vs. Saracens                 Nov 10  12 – 21       -2.30  TRUE
4 Harlequins vs. Worcester Warriors       Nov 10  14 – 19        6.50  FALSE
5 London Irish vs. Leicester Tigers       Nov 11  36 – 11        1.30  TRUE
6 Exeter Chiefs vs. Bristol               Nov 11  17 – 20       12.60  FALSE
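The Correct column and the running success rate can be reproduced with a few lines. This is a sketch that assumes a prediction counts as correct when the predicted margin and the actual home-minus-away margin have the same sign; the handling of draws is my own assumption, since none occur in this table. The earlier totals (14 of 18) are taken from the Round 4 post.

```python
# (home team, away team, home score, away score, predicted home margin)
last_week = [
    ("Sale Sharks", "Wasps", 28, 18, 6.30),
    ("Bath", "Northampton Saints", 22, 13, 2.00),
    ("Gloucester", "Saracens", 12, 21, -2.30),
    ("Harlequins", "Worcester Warriors", 14, 19, 6.50),
    ("London Irish", "Leicester Tigers", 36, 11, 1.30),
    ("Exeter Chiefs", "Bristol", 17, 20, 12.60),
]


def prediction_correct(home_score, away_score, predicted_margin):
    """True when the predicted winner (sign of the margin) matches the actual winner."""
    actual_margin = home_score - away_score
    # ASSUMPTION: a draw (actual margin of zero) is treated as an incorrect prediction.
    return actual_margin * predicted_margin > 0


results = [prediction_correct(hs, as_, pred) for (_, _, hs, as_, pred) in last_week]

# Totals before this round, as reported in the Round 4 post.
PREVIOUS_CORRECT, PREVIOUS_PLAYED = 14, 18

total_correct = PREVIOUS_CORRECT + sum(results)
total_played = PREVIOUS_PLAYED + len(results)
success_rate = total_correct / total_played

print(results)
print(f"{total_correct} of {total_played} correct ({success_rate:.1%})")
```

Four of the six games come out as correct, which gives the 18 of 24 (75%) quoted above.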

Predictions for Round 5

Here are the predictions for Round 5. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                                      Date    Winner              Prediction
1 Bath vs. Saracens                       Nov 30  Saracens                 -3.70
2 Exeter Chiefs vs. Wasps                 Dec 01  Exeter Chiefs            11.60
3 Northampton Saints vs. Leicester Tigers Dec 01  Northampton Saints        9.90
4 Worcester Warriors vs. Sale Sharks      Dec 01  Worcester Warriors        0.80
5 Bristol vs. London Irish                Dec 02  Bristol                   8.50
6 Harlequins vs. Gloucester               Dec 02  Harlequins                2.00
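The method itself is described on the linked page and is not reproduced here. As a purely observational check on the numbers in this post, each Round 5 prediction is close to the home team's current rating minus the away team's, plus a constant of about 4.5 points. That constant is inferred from the published tables, not stated anywhere in the post, and the ratings shown are rounded, so treat this as a sketch of how the two tables relate, not as the author's formula.

```python
# Current ratings prior to Round 5, copied from the ratings table above.
current_rating = {
    "Saracens": 8.50, "Exeter Chiefs": 6.09, "Northampton Saints": 1.92,
    "Sale Sharks": 1.42, "Gloucester": 0.92, "Bath": 0.30,
    "Bristol": -0.32, "Wasps": -0.98, "Harlequins": -1.56,
    "Worcester Warriors": -2.26, "Leicester Tigers": -3.48, "London Irish": -4.33,
}

# (home team, away team, published prediction), copied from the Round 5 table.
round5 = [
    ("Bath", "Saracens", -3.70),
    ("Exeter Chiefs", "Wasps", 11.60),
    ("Northampton Saints", "Leicester Tigers", 9.90),
    ("Worcester Warriors", "Sale Sharks", 0.80),
    ("Bristol", "London Irish", 8.50),
    ("Harlequins", "Gloucester", 2.00),
]

# Whatever is left after subtracting the rating gap is the implied home-ground bonus.
implied_home_bonus = [
    round(pred - (current_rating[home] - current_rating[away]), 2)
    for home, away, pred in round5
]

for (home, away, _), bonus in zip(round5, implied_home_bonus):
    print(f"{home} vs. {away}: implied home bonus {bonus:+.2f}")
```

Every game gives a value within a few hundredths of 4.5, which is what you would expect if small differences are just rounding in the published ratings.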

Pro14 Predictions for Round 7

Team Ratings for Round 7

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team                Current Rating  Rating at Season Start  Difference
Leinster                     14.50                   12.20        2.30
Munster                       9.42                   10.73       -1.30
Glasgow Warriors              7.88                    9.66       -1.80
Connacht                      3.09                    2.68        0.40
Scarlets                      3.07                    3.91       -0.80
Ulster                        2.34                    1.89        0.40
Edinburgh                     2.07                    1.24        0.80
Cheetahs                      0.44                   -3.38        3.80
Cardiff Blues                 0.12                    0.54       -0.40
Ospreys                      -0.84                    2.80       -3.60
Treviso                      -1.63                   -1.33       -0.30
Dragons                      -8.82                   -9.31        0.50
Southern Kings              -13.67                  -14.70        1.00
Zebre                       -17.98                  -16.93       -1.00

Performance So Far

So far there have been 42 matches played, 35 of which were correctly predicted, a success rate of 83.3%.
Here are the predictions for last week’s games.

Game                                      Date      Score  Prediction  Correct
1 Connacht vs. Leinster                   Nov 09  11 – 42       -4.60  TRUE
2 Edinburgh vs. Dragons                   Nov 09   20 – 7       18.40  TRUE
3 Ospreys vs. Southern Kings              Nov 10  14 – 16       20.90  FALSE
4 Zebre vs. Glasgow Warriors              Nov 10   7 – 31      -18.40  TRUE
5 Cardiff Blues vs. Cheetahs              Nov 10  30 – 17        4.70  TRUE
6 Munster vs. Ulster                      Nov 10  22 – 16       13.40  TRUE
7 Scarlets vs. Treviso                    Nov 10  20 – 17       12.00  TRUE

Predictions for Round 7

Here are the predictions for Round 7. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                                      Date    Winner              Prediction
1 Munster vs. Edinburgh                   Nov 30  Munster                  13.80
2 Ulster vs. Scarlets                     Nov 30  Ulster                    5.80
3 Treviso vs. Cardiff Blues               Dec 01  Treviso                   4.70
4 Connacht vs. Southern Kings             Dec 01  Connacht                 23.30
5 Dragons vs. Zebre                       Dec 01  Dragons                  15.70
6 Ospreys vs. Cheetahs                    Dec 01  Ospreys                   5.20
7 Glasgow Warriors vs. Leinster           Dec 01  Leinster                 -0.10

November 5, 2019

Rugby Premiership Predictions for Round 4

Team Ratings for Round 4

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team                Current Rating  Rating at Season Start  Difference
Saracens                      8.09                    9.34       -1.20
Exeter Chiefs                 6.92                    7.99       -1.10
Northampton Saints            2.33                    0.25        2.10
Gloucester                    1.33                    0.58        0.70
Sale Sharks                   1.12                    0.17        0.90
Bath                         -0.12                    1.10       -1.20
Wasps                        -0.69                    0.31       -1.00
Harlequins                   -0.93                   -0.81       -0.10
Bristol                      -1.14                   -2.77        1.60
Leicester Tigers             -2.30                   -1.76       -0.50
Worcester Warriors           -2.89                   -2.69       -0.20
London Irish                 -5.50                   -5.51        0.00

Performance So Far

So far there have been 18 matches played, 14 of which were correctly predicted, a success rate of 77.8%.
Here are the predictions for last week’s games.

Game                                      Date      Score  Prediction  Correct
1 Bristol vs. Sale Sharks                 Nov 02  16 – 10        1.70  TRUE
2 Northampton Saints vs. Harlequins       Nov 02  40 – 22        6.50  TRUE
3 Leicester Tigers vs. Gloucester         Nov 03  16 – 13        0.50  TRUE
4 Saracens vs. London Irish               Nov 03  16 – 13       19.90  TRUE
5 Wasps vs. Bath                          Nov 03  30 – 22        3.30  TRUE
6 Worcester Warriors vs. Exeter Chiefs    Nov 04  20 – 24       -5.60  TRUE

Predictions for Round 4

Here are the predictions for Round 4. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                                      Date    Winner              Prediction
1 Sale Sharks vs. Wasps                   Nov 09  Sale Sharks               6.30
2 Bath vs. Northampton Saints             Nov 10  Bath                      2.00
3 Gloucester vs. Saracens                 Nov 10  Saracens                 -2.30
4 Harlequins vs. Worcester Warriors       Nov 10  Harlequins                6.50
5 London Irish vs. Leicester Tigers       Nov 11  London Irish              1.30
6 Exeter Chiefs vs. Bristol               Nov 11  Exeter Chiefs            12.60

Pro14 Predictions for Round 6

Team Ratings for Round 6

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team                Current Rating  Rating at Season Start  Difference
Leinster                     13.62                   12.20        1.40
Munster                      10.09                   10.73       -0.60
Glasgow Warriors              7.38                    9.66       -2.30
Connacht                      3.98                    2.68        1.30
Scarlets                      3.49                    3.91       -0.40
Edinburgh                     2.56                    1.24        1.30
Ulster                        1.67                    1.89       -0.20
Cheetahs                      1.19                   -3.38        4.60
Ospreys                      -0.03                    2.80       -2.80
Cardiff Blues                -0.62                    0.54       -1.20
Treviso                      -2.05                   -1.33       -0.70
Dragons                      -9.30                   -9.31        0.00
Southern Kings              -14.47                  -14.70        0.20
Zebre                       -17.48                  -16.93       -0.50

Performance So Far

So far there have been 35 matches played, 29 of which were correctly predicted, a success rate of 82.9%.
Here are the predictions for last week’s games.

Game                                      Date      Score  Prediction  Correct
1 Glasgow Warriors vs. Southern Kings     Nov 02   50 – 0       26.70  TRUE
2 Leinster vs. Dragons                    Nov 02  50 – 15       28.20  TRUE
3 Ulster vs. Zebre                        Nov 02   22 – 7       26.60  TRUE
4 Scarlets vs. Cheetahs                   Nov 03  17 – 13        9.90  TRUE
5 Ospreys vs. Connacht                    Nov 03  10 – 20        3.60  FALSE
6 Treviso vs. Edinburgh                   Nov 03  18 – 16        1.90  TRUE
7 Cardiff Blues vs. Munster               Nov 03  23 – 33       -2.90  TRUE

Predictions for Round 6

Here are the predictions for Round 6. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                                      Date    Winner              Prediction
1 Connacht vs. Leinster                   Nov 09  Leinster                 -4.60
2 Edinburgh vs. Dragons                   Nov 09  Edinburgh                18.40
3 Ospreys vs. Southern Kings              Nov 10  Ospreys                  20.90
4 Zebre vs. Glasgow Warriors              Nov 10  Glasgow Warriors        -18.40
5 Cardiff Blues vs. Cheetahs              Nov 10  Cardiff Blues             4.70
6 Munster vs. Ulster                      Nov 10  Munster                  13.40
7 Scarlets vs. Treviso                    Nov 10  Scarlets                 12.00