July 6, 2018

Showing uncertainty with colour

From Claus Wilke on Twitter, using color to indicate uncertainty, based on data from before the 2016 US election.

The red:blue scale indicates who is ahead, and the grey:coloured scale indicates confidence.  There was lots of discussion about whether this is graying out the differences too much or not enough, and so on, but it’s an interesting idea.

July 5, 2018

Salary distributions

Chris Knox at the Herald has a very nice visualisation of salary distributions and gender differences by age, industry, region, and sector. 

These are tidier than you’d expect from a relatively small survey, because they are predictions from a model, rather than raw survey data.

The good thing about using a model like this is that you can get somewhat realistic pictures from a much smaller survey than you’d otherwise need. The model is expanding the real data for each individual into a smooth distribution on the graph.

The bad thing is there’s a bit of distortion: for example, a graph of a large enough set of raw data would show spikes where multiple people have the same round-number income, and probably a sharper cutoff at the bottom end rather than a smooth tail down to zero.

The graph shows very little difference between private-sector and public-sector workers.  That surprised me, because public sector employees on average have substantially higher wage/salary income — as Keith Ng separately writes in the Herald. The difference doesn’t, of course, represent higher pay for comparable jobs; it’s because the public sector is increasingly biased towards educated professionals.  Also, government bodies (like other large organisations) will often contract out their lowest-paying jobs rather than using their own employees. Keith showed, using StatsNZ data, that public-sector employees tend to earn slightly less than private-sector employees within the same occupation type.

But if it takes comparisons within an occupation to correct the misleading public-private comparison in StatsNZ data, why doesn’t it take comparisons within an occupation in the visualisation?  After some Twitter conversation we worked out that it’s because the visualisation is of salaries, and the other comparison is of salaries and wages. Restricting to salaried employees, while cruder than doing comparisons within occupation types, is enough to remove the bulk of the bias.

‘Foreign’ buyers

From the Listener this week, and now on noted.co.nz

This week, new data emerged from the ASB Bank showing that foreign buyers are a much more significant part of the overheated housing market than had previously been established; that is, between 11% and 20% rather than the piffling 3.3% nationally – and 7.3% in the Auckland market, ground zero for our property frenzy – previously reported by Statistics New Zealand.

As I wrote last week

  1. It’s not new data – it comes from exactly the same StatsNZ report (if either the ASB report or the StatsNZ report had been linked, this would have been easier for the reader to find out)
  2. Not foreign buyers. The 11% includes 8% of New Zealand residents who aren’t citizens at the time they buy the house.  The 11-21% range includes 0-10% of buyers who are foreign commercial entities. Because ASB didn’t have any new data, they don’t know what proportion of the commercial entities are local, but they were guessing it was at the low end. So, ASB’s figure for foreign buyers is 3-13%, with a guess that it’s towards the low end.
  3. In fact, even the 3% counts people on work visas buying a house or apartment to live in as foreign buyers. You could maybe argue that people on work visas should be driving up rental costs instead, but it’s not that obvious a case.

 

 

July 3, 2018

Super 15 Predictions for Round 18

Team Ratings for Round 18

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team | Current Rating | Rating at Season Start | Difference
--- | --- | --- | ---
Crusaders | 15.52 | 15.23 | 0.30
Hurricanes | 11.14 | 16.18 | -5.00
Chiefs | 9.25 | 9.29 | -0.00
Lions | 7.81 | 13.81 | -6.00
Highlanders | 6.30 | 10.29 | -4.00
Sharks | 1.45 | 1.02 | 0.40
Waratahs | 1.02 | -3.92 | 4.90
Jaguares | 0.97 | -4.64 | 5.60
Stormers | -0.93 | 1.48 | -2.40
Brumbies | -1.64 | 1.75 | -3.40
Blues | -2.23 | -0.24 | -2.00
Bulls | -3.70 | -4.79 | 1.10
Rebels | -7.97 | -14.96 | 7.00
Reds | -9.97 | -9.47 | -0.50
Sunwolves | -14.44 | -18.42 | 4.00

 

Performance So Far

So far there have been 106 matches played, 72 of which were correctly predicted, a success rate of 67.9%.
Here are the predictions for last week’s games.

Game | Date | Score | Prediction | Correct
--- | --- | --- | --- | ---
1 Blues vs. Reds | Jun 29 | 39 – 16 | 10.20 | TRUE
2 Rebels vs. Waratahs | Jun 29 | 26 – 31 | -5.60 | TRUE
3 Highlanders vs. Chiefs | Jun 30 | 22 – 45 | 3.80 | FALSE
4 Brumbies vs. Hurricanes | Jun 30 | 24 – 12 | -11.60 | FALSE
5 Sunwolves vs. Bulls | Jun 30 | 42 – 37 | -8.30 | FALSE
6 Sharks vs. Lions | Jun 30 | 31 – 24 | -4.20 | FALSE
7 Jaguares vs. Stormers | Jun 30 | 25 – 14 | 5.20 | TRUE
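
As a quick check on how the "Correct" column can be read, here is a minimal sketch. It assumes (the post does not spell this out) that a prediction counts as correct when the predicted margin and the actual home-minus-away margin have the same sign; the function name and the handling of draws are my own choices for illustration.

```python
# Last week's Super Rugby games as listed above:
# (home, away, home score, away score, predicted home margin, listed "Correct")
games = [
    ("Blues", "Reds", 39, 16, 10.20, True),
    ("Rebels", "Waratahs", 26, 31, -5.60, True),
    ("Highlanders", "Chiefs", 22, 45, 3.80, False),
    ("Brumbies", "Hurricanes", 24, 12, -11.60, False),
    ("Sunwolves", "Bulls", 42, 37, -8.30, False),
    ("Sharks", "Lions", 31, 24, -4.20, False),
    ("Jaguares", "Stormers", 25, 14, 5.20, True),
]


def prediction_correct(home_score, away_score, predicted_margin):
    """True when predicted and actual home margins have the same sign.

    Assumption: a draw, or a predicted margin of exactly zero, counts as
    incorrect. The post does not say how such cases are scored.
    """
    actual_margin = home_score - away_score
    return actual_margin * predicted_margin > 0


derived = [prediction_correct(hs, aws, pred) for (_, _, hs, aws, pred, _) in games]
listed = [game[5] for game in games]
n_correct = sum(derived)

# Season-to-date success rate quoted in the text: 72 correct out of 106.
season_rate = round(100 * 72 / 106, 1)
```

Under that reading, the derived values match the listed column for all seven games.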

 

Predictions for Round 18

Here are the predictions for Round 18. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game | Date | Winner | Prediction
--- | --- | --- | ---
1 Crusaders vs. Highlanders | Jul 06 | Crusaders | 12.70
2 Reds vs. Rebels | Jul 06 | Reds | 1.50
3 Chiefs vs. Brumbies | Jul 07 | Chiefs | 14.90
4 Hurricanes vs. Blues | Jul 07 | Hurricanes | 16.90
5 Waratahs vs. Sunwolves | Jul 07 | Waratahs | 19.50
6 Bulls vs. Jaguares | Jul 07 | Jaguares | -0.70
7 Stormers vs. Sharks | Jul 07 | Stormers | 1.10
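
The post does not give the formula behind these predictions, so the following is only a sketch under an assumption: that each predicted margin is roughly the home team's rating minus the away team's rating, plus some home-ground allowance. Under that assumption, the allowance implied by each game can be backed out from the two tables. The function name is hypothetical, and the published ratings are rounded, so the results are approximate.

```python
# Current ratings from the table above (only the teams playing in Round 18).
ratings = {
    "Crusaders": 15.52, "Hurricanes": 11.14, "Chiefs": 9.25,
    "Highlanders": 6.30, "Sharks": 1.45, "Waratahs": 1.02,
    "Jaguares": 0.97, "Stormers": -0.93, "Brumbies": -1.64,
    "Blues": -2.23, "Bulls": -3.70, "Rebels": -7.97,
    "Reds": -9.97, "Sunwolves": -14.44,
}

# Round 18 fixtures: (home, away, predicted home margin).
round_18 = [
    ("Crusaders", "Highlanders", 12.70),
    ("Reds", "Rebels", 1.50),
    ("Chiefs", "Brumbies", 14.90),
    ("Hurricanes", "Blues", 16.90),
    ("Waratahs", "Sunwolves", 19.50),
    ("Bulls", "Jaguares", -0.70),
    ("Stormers", "Sharks", 1.10),
]


def implied_home_edge(home, away, predicted_margin):
    """Predicted margin minus the raw rating difference (assumed model)."""
    return predicted_margin - (ratings[home] - ratings[away])


implied = {
    f"{home} vs. {away}": round(implied_home_edge(home, away, pred), 2)
    for home, away, pred in round_18
}

for fixture, edge in implied.items():
    print(f"{fixture}: implied home edge of about {edge} points")
```

For every fixture the gap comes out between roughly 3.5 and 4 points, which is at least consistent with a home-ground allowance of that size; it does not confirm that this is how the predictions are actually produced.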

 

NRL Predictions for Round 17

Team Ratings for Round 17

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team | Current Rating | Rating at Season Start | Difference
--- | --- | --- | ---
Storm | 9.37 | 16.73 | -7.40
Dragons | 3.94 | -0.45 | 4.40
Rabbitohs | 3.87 | -3.90 | 7.80
Roosters | 3.60 | 0.13 | 3.50
Sharks | 2.72 | 2.20 | 0.50
Raiders | 2.44 | 3.50 | -1.10
Panthers | 2.35 | 2.64 | -0.30
Broncos | 0.79 | 4.78 | -4.00
Cowboys | -0.57 | 2.97 | -3.50
Warriors | -1.57 | -6.97 | 5.40
Bulldogs | -2.82 | -3.43 | 0.60
Sea Eagles | -3.50 | -1.07 | -2.40
Titans | -3.62 | -8.91 | 5.30
Wests Tigers | -5.06 | -3.63 | -1.40
Eels | -5.43 | 1.51 | -6.90
Knights | -8.83 | -8.43 | -0.40

 

Performance So Far

So far there have been 124 matches played, 73 of which were correctly predicted, a success rate of 58.9%.
Here are the predictions for last week’s games.

Game | Date | Score | Prediction | Correct
--- | --- | --- | --- | ---
1 Dragons vs. Eels | Jun 28 | 20 – 18 | 14.10 | TRUE
2 Warriors vs. Sharks | Jun 29 | 15 – 18 | 0.70 | FALSE
3 Roosters vs. Storm | Jun 29 | 8 – 9 | -3.10 | TRUE
4 Panthers vs. Sea Eagles | Jun 30 | 10 – 18 | 11.60 | FALSE
5 Knights vs. Bulldogs | Jun 30 | 16 – 36 | -0.20 | TRUE
6 Broncos vs. Raiders | Jun 30 | 26 – 22 | 0.90 | TRUE
7 Wests Tigers vs. Titans | Jul 01 | 12 – 30 | 4.70 | FALSE
8 Rabbitohs vs. Cowboys | Jul 01 | 21 – 20 | 8.50 | TRUE

 

Predictions for Round 17

Here are the predictions for Round 17. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game | Date | Winner | Prediction
--- | --- | --- | ---
1 Storm vs. Dragons | Jul 05 | Storm | 8.40
2 Panthers vs. Warriors | Jul 06 | Panthers | 8.40
3 Bulldogs vs. Raiders | Jul 07 | Raiders | -2.30
4 Titans vs. Broncos | Jul 08 | Broncos | -1.40

 

Briefly

  • False positives: people who think they are allergic to penicillin (but aren’t, or aren’t anymore) are at higher risk of getting nasty antibiotic-resistant infections.
  • RadioNZ interview with the new Chief Science Advisor, Juliet Gerrard. Also, see the official chief sciencely instagram
  • As I point out each year, the Q&A list for the NZ Garden Bird survey has some well-written, simple principles of research design
  • There’s a story, “Americans vote Taco Bell as the ‘Best Mexican Restaurant of 2018’”.  Of course it’s not true, and it’s a good example of where voting isn’t going to work.  Most of the good Mexican restaurants in the US will be unknown outside their local area, and any with nationwide recognition will have to be large chains.  The true story is that Taco Bell came top in ‘Brand Equity’ in the Harris poll, which basically means lots of people have heard of it and would be willing to eat there.
  • Why People Make Bad Charts (and What to Do When it Happens) from Flowing Data

June 28, 2018

Progressive or regressive fuel taxes

From the Herald

And in a startling revelation, the ministers claim that the wealthier a household is, the more it is likely to pay for petrol. They say the wealthiest 10 per cent of households will pay $7.71 per week more for petrol. Those with the lowest incomes will pay $3.64 a week more.

That’s good to see. And it does contradict the impressions given by some of the opponents of the fuel tax. But it doesn’t address (or even allude to) more detailed criticisms of the tax.

Wealthy households, on average, spend more money than poor households.  They spend more on food. They spend more on entertainment. They spend more on cars.  They use public transit more. And they drive more.  So, on average, they pay more GST, and they pay more fuel tax. That’s not a startling revelation. Even in the US, higher-income households (on average) spend more money on petrol.

In the other direction, any user charge is going to be a lower proportion of income for wealthy than poor households. The regional fuel tax is no exception: according to the statistics in the Herald story, the average charge is only about twice as high for the top income decile as for the bottom. The ratio of incomes is much larger than two. Again, that’s not a startling revelation.
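
To make the "about twice" arithmetic explicit, here is a small sketch using the two dollar figures quoted from the Herald. The income ratio is a parameter, not a number from the story; any value you pass in is your own assumption.

```python
# Extra weekly petrol cost quoted in the Herald story (dollars per week).
top_decile_extra = 7.71
bottom_decile_extra = 3.64

# How many times more the top decile pays, in dollars.
charge_ratio = top_decile_extra / bottom_decile_extra  # a little over 2


def relative_burden(income_ratio):
    """Top-decile charge as a share of income, relative to the bottom decile.

    income_ratio is (top-decile income) / (bottom-decile income). It is an
    input to be supplied, not a figure given in the story.
    A result below 1 means the charge takes a smaller share of income
    from the top decile than from the bottom decile.
    """
    return charge_ratio / income_ratio


print(f"Top decile pays {charge_ratio:.2f} times as much in dollars")
```

The tax takes a smaller share of income from the top decile whenever the income ratio is larger than the charge ratio, which is the point of the paragraph above.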

For questions where the answer isn’t obvious, we need more data.

First, we’d like to know the distribution of costs, not just the average.  For example, the lowest income bands will contain more people who don’t have cars (who are paying quite a bit less than the average) and, by arithmetic,  will also contain some people paying quite a bit more than the average.
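
The "by arithmetic" step can be sketched as follows. The simplifying assumption here is mine: households without a car pay nothing extra (the text only says they pay quite a bit less), and the share of such households is a made-up input, not a statistic from the story.

```python
def average_among_car_households(overall_average, share_without_car):
    """Average extra cost among households that do have a car.

    Assumes car-free households pay exactly zero extra, so the overall
    average is spread over only the remaining households.
    """
    if not 0 <= share_without_car < 1:
        raise ValueError("share_without_car must be in [0, 1)")
    return overall_average / (1 - share_without_car)


# $3.64 per week is the bottom-decile average quoted in the Herald story;
# the 25% car-free share is purely hypothetical.
example = average_among_car_households(3.64, 0.25)
print(f"Average for car-owning households: ${example:.2f} per week")
```

The larger the car-free share, the further the typical car-owning household in that income band sits above the published average.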

Second, if you think of the fuel tax as being a sort of road user charge or a surrogate for a congestion charge, we’d want the amount paid per kilometre driven to be roughly constant. It would be undesirable for low-income households to pay more per kilometre than high-income households. (On the other hand, if you think of it as a carbon charge, it makes sense for it to be based on fuel amount but doesn’t make sense for it to be higher in Auckland.)

Answering these questions takes a bit more analysis. So, I’m going to refer you to Sam Warburton, an economist formerly with the Department of Transport and now with the NZ Initiative.  Here’s his Twitter thread reacting to the story, and here’s his (PDF) submission to Parliament on the taxes.

Politics is about compromise, and it’s possible these fuel taxes are the best of the politically-feasible options, but they aren’t all unicorns and rainbows.

June 27, 2018

Who should have a home?

Yesterday, the Herald published this story

The headline wasn’t true.

Today, the headline is different: “It’s not 3% – ASB analysis suggests up to a fifth of properties sold to non-citizens”.

There’s a big difference.

It’s hard to get statistics on how many citizens there are in NZ vs other long-term residents.  The Census, for example, doesn’t ask — as Stats NZ explains here, that’s partly because it’s more complicated than you think, and partly because there’s no good reason to care. Citizen vs resident is rarely an important distinction. A non-citizen with a residence-class visa can’t run for Parliament, but they can vote, serve in the defence forces, play for the All Blacks, and, yes, buy a home.

Up to a fifth of home purchases does seem a lot, but in this case “up to a fifth” actually means:

the assumption was that the true figure was at the lower end of the 11 per cent to 21 per cent range “but there’s no way to know. …”

 

It’s not just the headline: the story is a bit misleading.

First, they’re leaving out an important mechanism whereby real estate is transferred from non-citizens to citizens. My house is currently owned by a non-citizen. Some time early next year (if I get around to requesting my US police report soon),  I hope it will be owned by a New Zealand citizen. And my citizenship change wouldn’t show up in the ASB analysis.

Second, the ASB range of 11-21% is for homes, not properties as the headline claimed. Both ASB and StatsNZ make this distinction carefully.

Third, the extent to which the ASB analysis and StatsNZ numbers differ has been exaggerated a bit.  Here’s the StatsNZ report, which ASB links to.  The StatsNZ numbers for home transfers:

  • 79 percent involved at least one NZ citizen
  • 9.9 percent involved only corporate entities
  • 8.0 percent involved at least one NZ-resident-visa holder (but no citizens)
  • 3.3 percent involved no NZ citizens or resident-visa holders (up from 2.9 percent in the December 2017 quarter).

If you add 8 and 3 you get 11. If you add 8 and 3 and 9.9 you get 21.

If you don’t separate residents from citizens the range is 3-13%.

And if you go along with the ASB report’s assumption that the true figure is at the lower end of the range, well, you’d get a much more boring headline.
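
The sums above, written out with the unrounded StatsNZ percentages from the list (the variable names are mine):

```python
# StatsNZ shares of home transfers, in percent, as listed above.
resident_visa_no_citizen = 8.0
no_citizen_or_resident = 3.3
corporate_only = 9.9

# ASB-style range: counts resident-visa holders as "non-citizens".
asb_low = resident_visa_no_citizen + no_citizen_or_resident
asb_high = asb_low + corporate_only

# Range if residents are grouped with citizens instead.
foreign_low = no_citizen_or_resident
foreign_high = no_citizen_or_resident + corporate_only

print(f"Non-citizen range: {asb_low:.1f}% to {asb_high:.1f}%")
print(f"Non-resident range: {foreign_low:.1f}% to {foreign_high:.1f}%")
```

In both ranges the upper end is reached only if every corporate-only purchase is counted as foreign.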

 

June 26, 2018

Briefly

  • “Bias detectives: the researchers striving to make algorithms fair” from Nature News.  Featuring Auckland (AUT) researcher Rhema Vaithianathan.
  • In the UK, the Ada Lovelace Institute is being established, looking at these and related issues.
  • There were a bunch of headlines in the UK saying that life expectancy was falling (and often attributing the fall to ‘austerity’ policies).  Our World In Data looks at the issue: what is actually happening is that expected increases in life expectancy had been scaled back slightly, and this was due mostly to changes in projections for the increase in life expectancy of one 15-year group of people (born 1923-1938).  Official statistics is complicated.
  • US researchers looked at the first digit after the decimal point in various numbers reported by companies listed on the stock exchange. For earnings per share, but not for other figures, there was a noticeable shortage of ‘4’s — about 8% rather than the expected 10% — suggesting that the numbers may have been manipulated a little so that this figure, which is published rounded to the nearest whole cent,  rounds up rather than rounding down.

    Interestingly, companies that didn’t post any .4s were more likely to “restate their financial statements, be named as defendants in SEC Accounting and Auditing Enforcement Releases, and be involved in class action securities fraud litigation” (via Matt Levine)
  • You may have seen in headlines that new research connects Alzheimer’s Disease to some very common viruses related to herpes. Derek Lowe writes about how this is better substantiated than a lot of previous alternative theories. But the take-home message is still that we don’t know what to do to treat or prevent AD.
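
For the earnings-per-share item above, here is a rough sketch of the kind of digit count involved. It is not the researchers' code: the function names are hypothetical, the figures are assumed to be unrounded amounts in cents supplied as strings, and the sample values are invented purely to show the mechanics.

```python
from collections import Counter
from decimal import Decimal


def first_digit_after_point(amount_in_cents):
    """First digit after the decimal point of an amount given as a string."""
    value = Decimal(amount_in_cents).copy_abs()
    return int((value * 10) % 10)


def share_of_digit(amounts, digit=4):
    """Fraction of amounts whose first digit after the point equals `digit`."""
    counts = Counter(first_digit_after_point(a) for a in amounts)
    return counts[digit] / len(amounts)


# Invented values, only to show the calculation.
made_up_eps_in_cents = ["12.47", "3.52", "8.40", "15.91", "7.08"]
print(share_of_digit(made_up_eps_in_cents))
```

With real data, a share of 4s well below the 10% you would expect from evenly spread digits is the pattern the researchers reported.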

June 21, 2018

Unnecessary work for the reader

From Newshub:

Overall opening day bookings for the Abel Tasman Track are already up from 551 last year to 811 this year.

Kiwi bookings have doubled, while international bookings are down by 20 percent. However, Ms Sage admits international visitors’ bookings could be masquerading as Kiwis. 

“There may be some people around the fringes, but it relies on honesty.”

For more idea of how much of an issue there might be with honesty of foreigners, it would be nice to compare the increase in local bookings and decrease in international bookings as actual numbers, not as percentages.  You can work this out (approximately) from the information given, but it involves high-school maths (solving a pair of linear equations), so perhaps DoC could just have been asked.

To add up to the numbers in the story, local bookings must have increased from about 308 to 616 and international bookings decreased from about 242 to 194. So, even if the entire decrease in international bookings was people pretending to be Kiwi, it would only explain a small fraction of the increase in local bookings.
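
For anyone who wants to see the pair of linear equations, here is a minimal sketch. It takes "doubled" and "down by 20 percent" as exact multipliers of 2 and 0.8, which is an assumption, so the answers are approximate. The helper function is a generic Cramer's-rule solver, not anything from the story.

```python
def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("equations are not independent")
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y


# Last year:  local + international         = 551
# This year:  2 * local + 0.8 * international = 811
local_last, intl_last = solve_2x2(1, 1, 551, 2.0, 0.8, 811)

local_now = 2.0 * local_last
intl_now = 0.8 * intl_last

# Share of the local increase that the international decrease could explain.
share_explained = (intl_last - intl_now) / (local_now - local_last)

print(f"Local: {local_last:.1f} -> {local_now:.1f}")
print(f"International: {intl_last:.1f} -> {intl_now:.1f}")
print(f"Share of local increase explained: {share_explained:.0%}")
```

The solution is not a whole number of bookings because the percentages in the story are rounded, which is why the figures in the text are given as "about".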