Posts filed under Politics (185)

September 20, 2017

Democracy is coming

Unless someone says something really annoyingly wrong about polling in the next few days, I’m going to stop commenting until Saturday night.

Some final thoughts:

  • The election looks closer than NZ opinion polling is able to discriminate. Anyone who thinks they know what the result will be is wrong.
  • The most reliable prediction based on polling data is that the next government will at least need confidence and supply from NZ First. Even that isn’t certain.
  • It’s only because of opinion polling that we know the election is close. It would be really surprising if Labour didn’t do a lot better than the 25% they managed in the 2014 election — but we wouldn’t know that without the opinion polls.
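The “closer than polling can discriminate” point can be made concrete with the idealised margin of error. Here’s a minimal sketch, assuming polls of about 1000 respondents — a typical NZ sample size, but my assumption, not a figure from the post:

```python
import math

def margin_of_error(p, n, z=1.96):
    """Idealised 95% margin of error for a proportion p estimated from a
    simple random sample of size n. Real polls have extra design effects,
    so this is a lower bound on the real uncertainty."""
    return z * math.sqrt(p * (1 - p) / n)

# For a party polling around 40% with n = 1000 (assumed):
moe = margin_of_error(0.40, 1000)
print(round(100 * moe, 1))  # 3.0 -- about +/- 3 percentage points
```

With gaps between the blocs smaller than that, individual polls genuinely can’t tell you who will be able to form a government.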



September 10, 2017

Why you can’t predict Epsom from polls

The Herald’s poll aggregator had a bit of a breakdown over the Epsom electorate yesterday, suggesting that Labour had a chance of winning.

Polling data (and this isn’t something a statistician likes saying) is essentially useless when it comes to Epsom, because neither side benefits from getting their own supporters’ votes. National supporters are a clear majority in the electorate. If they do their tactical voting thing properly and vote for ACT’s David Seymour, he will win.  If they do the tactical voting thing badly enough, and the Labour and Green voters do it much better, National’s Paul Goldsmith will win.

Opinion polls over the whole country don’t tell you about tactical voting strategies in Epsom. Even opinion polls in Epsom would have to be carefully worded, and you’d have to be less confident in the results.

There isn’t anywhere else quite like Epsom. There are other electorates that matter and are hard to predict — such as Te Tai Tokerau, where polling information on Hone Harawira’s popularity is sparse — but in those electorates the polls are at least asking the right question.

Peter Ellis’s poll aggregator just punts on this question: the probability of ACT winning Epsom is set at an arbitrary 80%, and he gives you an app that lets you play with the settings. I think that’s the right approach.

September 4, 2017

Before and after

We’re in the interesting situation this election where it looks like political preferences are actually changing quite rapidly (though some of this could be changes in non-response that don’t show up in actual voting).

On Thursday, One News released a poll by Colmar Brunton that found Labour ahead of National by 43% to 41% for the first time in years.  Yesterday, NewsHub released a Reid Research poll with Labour back behind National 39% to 43%.

“Released” is important here. The Colmar Brunton poll was taken over August 26-30. The Reid Research poll was taken over August 22-30. That is, despite being released later, the Reid Research poll was (on average) taken earlier. Commentary on (and even analysis of) polls often ignores the interview period and focuses on the release date, but here we can see why the pollsters’ code of conduct requires the interview period to be described.

A difference of 4 percentage points in Labour’s support is quite large for two polls of this size (though not out of the question just from sampling error). If the polls were really discrete events four days apart, it would be plausible to argue they showed Labour’s support had stopped increasing — that the Ardern effect had reached its limit. If the two polls were taken over exactly the same period, the most plausible conclusion would be that the true support was in between and that we knew nothing more about Labour’s trajectory. With the Sunday poll actually taken slightly earlier, the difference is still likely to mostly be noise, but to the (very limited) extent that it says anything about trajectory, the story is positive for Labour.
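How plausible is a 4-point gap from sampling error alone? A rough check, assuming two independent polls of about 1000 people each (the sample sizes are my assumption, not from the post):

```python
import math

def se_difference(p, n1, n2):
    """Standard error of the difference between two independent poll
    estimates of the same underlying proportion p."""
    return math.sqrt(p * (1 - p) / n1 + p * (1 - p) / n2)

# Labour's support averaged roughly 41% across the two polls:
se = se_difference(0.41, 1000, 1000)
print(round(1.96 * 100 * se, 1))  # 4.3 -- 95% limit for the gap, in points
```

A 4-point difference sits right at the edge of what pure sampling noise produces, which is why it’s “quite large... though not out of the question”.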

August 26, 2017

Successive approximations to understanding MMP

The MMP voting system and its implications are relatively complicated. I’m going to try to give simple approximations and then corrections to them. If you want more definitive details, here’s the Electoral Commission and the Electoral Act.

Two votes: You have an electorate vote, which only affects who your local MP is, and doesn’t affect the composition of Parliament. You also have a party vote that affects the composition of Parliament, but not who your local MP is. The number of seats a party gets in Parliament is proportional to the number of party votes it gets.

This isn’t true, but it’s actually a pretty good working approximation for most of us.

There are two obvious flaws. First, if your local MP belongs to a party that doesn’t get enough votes to have any seats in Parliament, they still get to be an MP. Peter Dunne in Ōhariu was an example of this in the 2014 election. Second, when working out the number of seats a party is entitled to in Parliament, parties with less than 5% of the vote are excluded unless they won some electorate.  In the 2014 election, the Conservative Party got 3.97% of the vote, but no seats.

The Māori Party was an example of both exceptions: they did get enough votes in proportional terms for two seats, though not enough to make the 5% cutoff; they didn’t have to, because Te Ururoa Flavell won the Waiāriki electorate seat for them.

Proportionality: There are 120 seats, so a party needs 1/120th, or about 0.83%, of the vote for each one.

That’s not quite true because of the 5% threshold, both because some parties miss out and because the relevant percentages are of the votes remaining after parties have been excluded by the threshold.

It’s also not true because of rounding.  We elect whole MPs, not fractional ones, so we need a rounding rule. Roughly speaking, half-seats round up. More accurately, suppose there is some number N of votes available per seat (which will be worked out later). If you have at least 0.5×N votes you get one seat, 1.5×N gets you two seats, 13.5×N gets you fourteen seats.  So what’s N? It’s roughly 1/120th (0.83%) of the votes; it’s exactly whatever number you need to allocate exactly as many seats as you have available. (The Electoral Commission actually uses a procedure that’s identical in effect to this one and easier to compute, but (I think) harder to explain.)
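The “find N and round half-seats up” description is equivalent to the Sainte-Laguë highest-averages method, which is the easier-to-compute procedure. A sketch with made-up vote counts (real tie-breaking rules are not implemented here):

```python
def allocate_seats(votes, total_seats=120):
    """Sainte-Lague highest-averages allocation. `votes` maps party to
    party votes, after the threshold has been applied. Awarding each seat
    to the largest quotient votes/(2*seats+1) gives exactly round(votes/N)
    seats for the N described in the post."""
    seats = {party: 0 for party in votes}
    for _ in range(total_seats):
        # the next seat goes to the party with the largest quotient
        best = max(votes, key=lambda p: votes[p] / (2 * seats[p] + 1))
        seats[best] += 1
    return seats

# Illustrative vote counts (not 2014 results): one million votes,
# so N is about 8333 votes per seat.
print(allocate_seats({"A": 470000, "B": 400000, "C": 90000, "D": 40000}))
# {'A': 56, 'B': 48, 'C': 11, 'D': 5}
```

Note that party D, with 4.8 × N votes, rounds up to five seats, while A’s 56.4 × N rounds down; the quotients sort all that out automatically.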

In 2014, the Māori Party got 1.32% of the vote, which is a bit more than 1.5×0.83%, and were entitled to two seats. ACT got less than 0.83% but more than 0.5×0.83% and were entitled to one seat.

Finally, if a party gets more seats from electorate candidates than it is due by proportionality those seats are extra, above the 120-seat ideal size of Parliament — except that seats won by a party or individual not contesting the party vote do come out of the 120-seat total.  So, in 2014, ACT got enough party votes to be due one of the 120 seats, but United Future didn’t. United Future did contest the party vote so Peter Dunne’s seat did not come out of the 120-seat total — he was an ‘overhang’ 121st MP. I’m guessing the reason overhangs by parties contesting the party vote are extra is that you don’t know how many there will be until you’ve done the calculation, so you’d have to go back to the start and recalculate if you counted them in the 120 (which might change the number of over-allocated seats and force another recalculation and so on).

Māori Roll: People of Māori descent can choose, every five years, to be on a Māori electoral roll rather than the general roll. If enough of them do, Māori electorates are created with the same number of people as the general electorates. There are currently seven Māori electorates, representing just over half of the people of Māori descent.  As with any electorate, you don’t have to be enrolled there to stand there; anyone eligible to be an MP can stand. 

The main oversimplification involves the people of Māori descent who aren’t on either roll, because they’re too young or just not enrolled yet. You can’t tell whether they would be on the general roll or the Māori roll, so there are procedures for StatsNZ to split the non-enrolled Māori-descent population up when calculating electorate populations.

August 22, 2017

Deciding how to vote

There’s a bunch of web pages/apps out there that supposedly help you to decide who to vote for.

On the Fence: This one asks you to move a slider to ‘balance’ competing principles, then works out which party you agree with.

There are some obvious problems. First, the scale isn’t clearly calibrated.  If you’re at 50:50 on government vs private-sector roles in providing affordable housing, does that mean you think 50% of it should be state houses, or that it should all be state-owned but built by private sector construction companies, or something vague and woolly?

Second, as lots of people have pointed out, there are some false dichotomies there, like the privacy:security tradeoff.

Perhaps more important, when there is a genuine tradeoff, it’s a genuine tradeoff. You typically can’t decide it by abstract principle without reference to the facts.

Vote Compass:  This one takes advantage of the empirical observation that people’s voting preferences compress fairly well into two dimensions.  The questions are much more clearly calibrated: eg, the affordable-housing one is “The government should build affordable housing for Kiwis to buy” with a “Strongly agree” to “Strongly disagree” scale.

Most usefully, there’s a tool for you to explore how your position differs from that of the parties on each of the questions, and to reweight the results depending on which issues you care about.  Annoyingly, there’s a category “Moral Issues” that includes marijuana legalisation but not the questions about refugees or climate change or affordable housing.

Policy: The Spinoff has a tool that seems philosophically different from the others. It has much more emphasis on comparing actual party policies and less on trying to find out what your ideal party would be. As a result, it’s less useful if you want to be told what you think, but might be more useful if you want to look at specific policies. Whether you do, I suppose, depends on how much you believe the policies — especially from the minor parties, where you’d need to know how the policies rank in their actual negotiating position for coalition or confidence & supply.

July 30, 2017

What are election polls trying to estimate? And is Stuff different?

Stuff has a new election ‘poll of polls’.

The Stuff poll of polls is an average of the most recent of each of the public political polls in New Zealand. Currently, there are only three: Roy Morgan, Colmar Brunton and Reid Research. 

When these companies release a new poll it replaces their previous one in the average.

The Stuff poll of polls differs from others by giving weight to each poll based on how recent it is.

All polls less than 36 days old get equal weight. Any poll 36-70 days old carries a weight of 0.67, 70-105 days old a weight 0.33 and polls greater than 105 days old carry no weight in the average.
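The weighting scheme described above can be sketched in a few lines. The boundary behaviour at exactly 36, 70, and 105 days is my guess; the description doesn’t pin it down:

```python
def stuff_weight(age_in_days):
    """Recency weight as described for the Stuff poll of polls: full
    weight under 36 days, 0.67 to 70 days, 0.33 to 105 days, zero after
    that. Boundary handling is assumed, not specified by Stuff."""
    if age_in_days < 36:
        return 1.0
    if age_in_days < 70:
        return 0.67
    if age_in_days < 105:
        return 0.33
    return 0.0

def weighted_average(polls):
    """`polls` is a list of (support, age_in_days) pairs -- one entry per
    company, each company's most recent poll only."""
    weights = [stuff_weight(age) for _, age in polls]
    return sum(s * w for (s, _), w in zip(polls, weights)) / sum(weights)

# Hypothetical Labour numbers from three companies, of varying age:
print(round(weighted_average([(43.0, 10), (39.0, 40), (33.0, 90)]), 2))
# 40.01 -- the stale 33% poll barely moves the average
```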

In thinking about whether this is a good idea, we’d need to first think about what the poll is trying to estimate and about the reasons it doesn’t get that target quantity exactly right.

Officially, polls are trying to estimate what would happen “if an election were held tomorrow”, and there’s no interest in prediction for dates further forward in time than that. If that were strictly true, no-one would care about polls, since the results would refer only to the past two weeks when the surveys were done.

A poll taken over a two-week period is potentially relevant because there’s an underlying truth that, most of the time, changes more slowly than this.  It will occasionally change faster — eg, Donald Trump’s support in the US polls seems to have increased after James Comey’s claims about Clinton’s emails, and Labour’s support in the UK polls increased after the election was called — but it will mostly change slower. In my view, that’s the thing people are trying to estimate, and they’re trying to estimate it because it has some medium-term predictive value.

In addition to changes in the underlying truth, there is the idealised sampling variability that pollsters quote as the ‘margin of error’. There’s also larger sampling variability that comes because polling isn’t mathematically perfect. And there are ‘house effects’, where polls from different companies have consistent differences in the medium to long term, and none of them perfectly match voting intentions as expressed at actual elections.

Most of the time, in New Zealand — when we’re not about to have an election — the only recent poll is a Roy Morgan poll, because Roy Morgan polls much more often than anyone else.  That means the Stuff poll of polls will be dominated by the most recent Roy Morgan poll.  This would be a good idea if you thought that changes in underlying voting intention were large compared to sampling variability and house effects. If you thought sampling variability was larger, you’d want multiple polls from a single company (perhaps downweighted by time).  If you thought house effects were non-negligible, you wouldn’t want to downweight other companies’ older polls as aggressively.

Near an election, there are lots more polls, so the most recent poll from each company is likely to be recent enough to get reasonably high weight. The Stuff poll is then distinctive in that it completely drops all but the most recent poll from each company.

Recency weighting, however, isn’t at all unique to the Stuff poll of polls. For example, the poll of polls downweights older polls, but doesn’t drop the weight to zero once another poll comes out. Peter Ellis’s two summaries both downweight older polls in a more complicated and less arbitrary way; the same was true of Peter Green’s poll aggregation when he was doing it.  Curia’s average downweights even more aggressively than Stuff’s, but does not otherwise discard older polls by the same company. RadioNZ averages only the four most recent available results (regardless of company) — they don’t do any other weighting for recency, but that’s plenty.

However, another thing recent elections have shown us is that uncertainty estimates are important: that’s what Nate Silver and almost no-one else got right in the US. The big limitation of simple, transparent poll of poll aggregators is that they say nothing useful about uncertainty.

May 14, 2017

There’s nothing like a good joke

You’ve probably seen the 2016 US election results plotted by county, as in this via Brilliant Maps.

It’s not ideal, because large, relatively empty counties take up a lot of space but represent relatively few people.  It’s still informative: you can see, for example, that urban voters tended to support Clinton even in Texas.  There are also interesting blue patches in rural areas that you might need an atlas to understand.

For most purposes, it’s better to try to show the votes, such as this from the New York Times, where the circle area is proportional to the lead in votes.

You might want something that shows the Electoral College votes, since those are what actually determines the results, like this by Tom Pearson for the Financial Times.

Or, you might like pie charts, such as this one from Lisa Charlotte Rost.


These all try to improve on the simple county map by showing votes — people — rather than land. The NYT one is more complex than the straightforward map; the other two are simpler but still informative.


Or, you could simplify the county map in another way. You could remove all the spatial information from within states — collecting the ‘blue’ land into one wedge and the ‘red’ land into another — and not add anything. You might do this as a joke, to comment on the President’s use of the simple county map.

The problem with the Internet, though, is that people might take it seriously.  It’s not completely clear whether Chris Cillizza was just trolling, but a lot of people sure seem to take his reposting of it seriously.

May 4, 2017

Summarising a trend

Keith Ng drew my attention on Twitter to an ad from Labour saying “Under National, the number of young people not earning or learning has increased by 41%”.

When you see this sort of claim, you should usually expect two things: first, that the claim will be true in the sense that there will be two numbers that differ by 41%; second, that it will not be the most informative summary of the data in question.

If you look on Infoshare, in the Household Labour Force Survey, you can find data on NEET (not in education, employment, or training).  The number was 64,100 in the fourth quarter of 2008, when Labour lost the election.  It’s now (Q1, 2017) 90,800, which is, indeed, 41% higher.  Let’s represent the ad by a graph:



We can fill in the data points in between:
Now, the straight line doesn’t look as convincing.

Also, why are we looking at the number, when the population has changed over this time period? We really should care about the rate (percentage).
Measuring in terms of rates, the increase is smaller — 27%.  More importantly, though, the rate was even higher at the end of the first quarter of National’s administration than it is now.
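The arithmetic linking the ad’s 41% to the 27% rate increase is worth checking directly. The population-growth figure below is derived from the two quoted percentages, not taken from StatsNZ:

```python
def percent_change(old, new):
    """Percentage change from old to new."""
    return 100 * (new / old - 1)

# The ad's comparison: NEET counts, 2008 Q4 vs 2017 Q1
print(round(percent_change(64100, 90800), 1))  # 41.7 -- the ad's "41%"

# If the rate rose only 27%, the gap between the two figures is
# population growth over the period:
count_ratio = 90800 / 64100
rate_ratio = 1.27  # the 27% rate increase quoted above
print(round(100 * (count_ratio / rate_ratio - 1), 1))  # about 11.5
```

That is, roughly a quarter of the ad’s headline increase is just there being more young people, which is exactly why rates beat raw counts for this sort of comparison.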

The next thing to notice is the spikes every four quarters or so: NEET is higher in the summer and lower in the winter because of the school year.  You might wonder if StatsNZ had produced a seasonally adjusted version, and whether it was also conveniently on Infoshare…
The increase is now 17%.

But for long-term comparisons of policy, you’d probably want a smoothed version that incorporates more than one quarter of data. It turns out that StatsNZ have done this, too, and it’s on Infoshare.
The increase is, again, 17%. Taking out the seasonal variation, short-term variation, and sampling noise makes the underlying pattern clearer.  NEET increased dramatically in 2009, decreased, and has recently spiked. The early spike may well have been the recession, which can’t reasonably be blamed on any NZ party.  The recent increase is worrying, but thinking of it as a trend over 9 years isn’t all that helpful.

May 3, 2017

A century of immigration

Given the discussions of immigration in the past weeks, I decided to look for some historical data.  Stats NZ has a report “A Century of Censuses”, with a page on “proportion of population born overseas”. Here’s the graph:


The proportion of immigrants has never been very low, but it fell from about 1 in 2 in the late 19th century to about 1 in 6 in the middle of the 20th century, and has risen to about 1 in 4 now. The increase has been going on for the entire lifetime of any NZ member of Parliament; the oldest was born roughly at Peak Kiwi in the mid-1940s.

Seeing that immigrants have been a large minority of New Zealand for over a century doesn’t necessarily imply anything about modern immigration policy — Hume’s Guillotine, “no ought deducible from is,” cuts that off.  But I still think some people would find it surprising.


April 26, 2017

Simplifying to make a picture

1. has maps of the ancestry structure of North America, based on people who sent DNA samples in for their genotype service (click to embiggen).

To make these maps, they looked for pairs of people whose DNA showed they were distant relatives, then simplified the resulting network into relatively stable clusters. They then drew the clusters on a map and coloured them according to what part of the world those people’s distant ancestors probably came from.  In theory, this should give something like a map of immigration into the US (and to a lesser extent, of remaining Native populations).  The map is a massive oversimplification, but that’s more or less the point: it simplifies the data to highlight particular patterns (and, necessarily, to hide others).  There’s a research paper, too.


2. In a satire on predictive policing, The New Inquiry has an app showing high-risk neighbourhoods for financial crime. There’s also a story at Buzzfeed.


The app uses data from the US Financial Industry Regulatory Authority (FINRA), and models the risk of financial crime using the usual sort of neighbourhood characteristics (eg number of liquor licenses, number of investment advisers).


3. The Sydney Morning Herald had a social/political quiz “What Kind of Aussie Are You?”.


They also have a discussion of how they designed the 7 groups.  Again, the groups aren’t entirely real, but are a set of stories told about complicated, multi-dimensional data.


The challenge in any display of this type is to remove enough information that the stories are visible, but not so much that they aren’t true, and not everyone will agree on whether you’ve succeeded.