Posts filed under Graphics (394)

February 13, 2012

More bubble charts – your feedback please!

Last week’s Sunday Star Times featured inaccurate bubble charts and it has happened again this week:

Sunday Star Times, 12 February 2012, “Rugby joy short-lived” A2

You can see that diameters of the circles have been used to represent the percentages, rather than the area. This gives a distorted view of the situation as our eyes notice the area, rather than the diameter/height.

For example, look at the Crime/Violence and Race relations data for 2008 where you are comparing the 1% and 15%.   Many more than 15 of those little 1% circles can fit inside the 15% circle.
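The arithmetic behind that observation can be sketched as follows. This is only an illustration: the function names are made up for the example, and it assumes the chart drew each circle's diameter in direct proportion to the percentage.

```python
import math

def circle_area(diameter):
    """Area of a circle with the given diameter."""
    return math.pi * (diameter / 2) ** 2

def diameter_for_value(value, unit_diameter=1.0):
    """Diameter that makes the circle's AREA proportional to value."""
    return unit_diameter * math.sqrt(value)

# Diameter proportional to the percentage (what the chart appears to do):
# the 15% circle is 15 times as wide as the 1% circle.
ratio_if_diameter_scaled = circle_area(15.0) / circle_area(1.0)

# Area proportional to the percentage (what the eye expects).
ratio_if_area_scaled = (
    circle_area(diameter_for_value(15.0)) / circle_area(diameter_for_value(1.0))
)

print(f"area ratio with diameter scaling: {ratio_if_diameter_scaled:.0f}")
print(f"area ratio with area scaling:     {ratio_if_area_scaled:.0f}")
```

With diameter scaling the larger circle has 15 squared, or 225, times the area of the smaller one, which is why so many of the little circles fit inside it.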

For comparison, here’s the Crime/Violence and Race relations data as line graphs:

One could also be drawn for the Unemployment/Jobs and Economy data (or the two could be overlaid, depending on what you’d like people to compare easily).

What I’m really interested in are your thoughts on the following:

  1. Which graphic do you believe is more easily understood by the general public?  Why?
  2. How important is it for the media to use accurate graphics versus getting across the general idea? I.e. does it really matter if the scale is correct or not? Why or why not?

Update: Here’s the bubble chart where the areas represent the figures, rather than the diameters (click to enlarge).  I have not included all the details in the graph; I just wanted to show a size comparison here.

Hat tip: Murray Jorgensen

Adjusting for smoking?

Today the Herald is reporting that soft drinks give you asthma and COPD.  To be fair, the problems with this story are mostly not the Herald’s fault (except for the headline).

The research paper found that asthma and COPD are more common in people who drink a lot of soft drinks.  The main concern with findings like these is that smoking has a huge effect on COPD, and obesity has a fairly large effect, so you would worry that the correlation is just due to smoking and weight. [Or, if you believe some of the other recent news stories, due to bottle-feeding as a baby].

The researchers attempted to remove the effect of smoking and overweight, but their ability to do this is fairly limited.  The idea of regression adjustment is that you can estimate what someone’s risk would have been with a different level of smoking or weight, and so you can extrapolate to make the soft-drink and non-soft-drink groups comparable.  In this case the data came from a telephone survey, and the information they used for adjustment is a three-level smoking variable (never, former, current) and a two-level overweight variable based on self-reported height and weight (BMI < 25 or >25).    If duration of smoking or amount of smoking is important, or if weight distinctions within “overweight” are important, their confounding effects will still be present in the final estimates.
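A toy calculation shows how a coarse adjustment variable can leave confounding behind. Every number below is hypothetical and chosen only for illustration; none comes from the paper. The sketch assumes that, among current smokers, heavy smokers are both at higher risk of COPD and more likely to drink soft drinks, while soft drinks themselves have no effect at all.

```python
# Hypothetical counts among *current smokers* only (not data from the paper).
# Each entry: (soft-drink drinkers, non-drinkers, COPD risk shared by both groups).
# Within each smoking-intensity stratum the risk is identical for drinkers and
# non-drinkers, so soft drinks have no effect by construction.
strata = {
    "light smoker": (100, 300, 0.10),
    "heavy smoker": (300, 100, 0.30),
}

def pooled_risk(group_index):
    """COPD risk in one soft-drink group after pooling the intensity strata,
    which is all a single 'current smoker' category allows."""
    people = sum(s[group_index] for s in strata.values())
    cases = sum(s[group_index] * s[2] for s in strata.values())
    return cases / people

risk_drinkers = pooled_risk(0)
risk_non_drinkers = pooled_risk(1)
apparent_relative_risk = risk_drinkers / risk_non_drinkers

print(f"risk among soft-drink drinkers: {risk_drinkers:.2f}")
print(f"risk among non-drinkers:        {risk_non_drinkers:.2f}")
print(f"apparent relative risk:         {apparent_relative_risk:.2f}")
```

Even after "adjusting" for being a current smoker, the soft-drink group appears to have a higher risk, purely because it contains more heavy smokers.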

I can’t resist showing you the graph of COPD risks from the paper, which is an excellent example of why not to use fake 3d in graphs. The 3d layout makes it harder to compare the bars — a fairly reliable indication of a bad graph is that it is so unreadable that the data values need to be printed there too.

A 2d barchart will almost always be better than a 3d barchart, and this is no exception.  The comparisons are clearer, and in particular it is clear how big the effect of smoking really is.  It’s only in never-smokers that we have a precise description of smoking, and they are the only group that doesn’t show a trend.

But even the 2d barchart is misleading here.  The key rules for a barchart are that zero must be a relevant value, and that uncertainty must be relatively unimportant. Zero relative risk is an impossible value (the “null” value for relative risk is 1.0), and there is a lot of uncertainty in these numbers (although unfortunately the researchers don’t tell us how much).  A dot chart is better, with a logarithmic scale for relative risk so that the “null” value is 1 rather than 0.
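To see why a logarithmic scale suits relative risks, here is a small sketch. The relative risks in it are hypothetical examples, not values from the paper.

```python
import math

# Hypothetical relative risks: a halving, no effect, and a doubling.
relative_risks = [0.5, 1.0, 2.0]

# Position of each value along a logarithmic axis.
log_positions = [math.log(rr) for rr in relative_risks]

for rr, position in zip(relative_risks, log_positions):
    print(f"relative risk {rr:>3}: log position {position:+.3f}")
```

On the log scale the null value of 1 sits at position zero, and halving or doubling the risk lands the same distance on either side of it. On a linear bar chart anchored at 0, the halving would look like a much smaller change than the doubling.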

Needs standard errors, which in our case we have not got.

 

February 7, 2012

Inequality graph

I think this graph is an improvement over the density plot from StatsNZ I showed earlier.  It’s a box plot of median income for all census meshblocks in the Auckland region, in 1996, 2001, and 2006 (except for the ones that were too small to have data released publicly). The data are from Stats New Zealand, rescaled to 1996 dollars.

It’s clear from this graph that most areas had an increase in median income, but that the increase was larger in wealthier areas.   A few areas went up sharply, then down again, presumably in the dotcom crash.  Some of the larger decreases are probably due to changes in housing mix: two meshblocks in Auckland Central have declined a lot, and I expect that’s due to more small apartments.

It’s also worth noting that the percentage increase in median income is much closer to being constant across meshblocks.  In that sense the increase in inequality is not as bad as in the US, where increases in GDP have almost entirely ended up with the rich.
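The difference between a constant percentage increase and a constant dollar increase can be made concrete with a small sketch. The incomes and the growth rate below are hypothetical, not taken from the census data.

```python
# Hypothetical 1996 median incomes (in 1996 dollars) for two meshblocks.
incomes_1996 = {"lower-income meshblock": 20_000, "higher-income meshblock": 60_000}
growth = 0.10  # the same hypothetical percentage increase everywhere

incomes_2006 = {name: income * (1 + growth) for name, income in incomes_1996.items()}
dollar_increase = {name: incomes_2006[name] - incomes_1996[name] for name in incomes_1996}

low, high = "lower-income meshblock", "higher-income meshblock"
gap_before = incomes_1996[high] - incomes_1996[low]
gap_after = incomes_2006[high] - incomes_2006[low]
ratio_before = incomes_1996[high] / incomes_1996[low]
ratio_after = incomes_2006[high] / incomes_2006[low]

print(f"dollar increases: {dollar_increase}")
print(f"gap in dollars:   {gap_before:.0f} -> {gap_after:.0f}")
print(f"income ratio:     {ratio_before:.2f} -> {ratio_after:.2f}")
```

With the same percentage increase, the wealthier area gains more dollars and the dollar gap widens, but the ratio between the two incomes does not change.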

 

 

 

[Update: here’s a version where the areas that decreased from 1996 to 2006 are in a different color.  I don’t know if it helps for seeing the overall pattern.  Given more time and if WordPress took SVG, it would be possible to have mouseover labels for the meshblocks so you could see which is which.]

February 5, 2012

Who is really buying New Zealand? And it’s not what they plotted.

Today’s front page of the Sunday Star Times has a bubble chart showing the number of hectares purchased by foreigners in the past 5 years:

Sunday Star Times, 5 February 2012, “Who is really buying New Zealand?” A1

While bubble charts are a trendy way to present data, it is well known that people find it hard to judge areas, and even more so when the circles are not concentric (their centres don’t coincide) or when the shapes overlap, as in the Sunday Star Times’ chart.

However, there are more problems with this graph. The bubble sizes just don’t match the data.

Compare USA with Canada: the area bought by Canadians was about 81% of the amount bought by Americans, but the area of Canada’s bubble in their chart is only about 58% of the area of the USA’s.

It should have looked like this:

Compare China with Italy and you definitely know something has gone wrong in the calculations.
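For the USA and Canada comparison, the correct bubble sizes follow from the fact that area grows with the square of the diameter. The sketch below uses the two percentages quoted above; the variable names are just for illustration.

```python
import math

data_ratio = 0.81        # Canada's hectares as a fraction of the USA's
drawn_area_ratio = 0.58  # approximate ratio of the bubble areas in the chart

# For areas to match the data, the diameters must be in the ratio sqrt(0.81).
correct_diameter_ratio = math.sqrt(data_ratio)

# Diameter ratio implied by the areas actually drawn.
drawn_diameter_ratio = math.sqrt(drawn_area_ratio)

print(f"correct diameter ratio: {correct_diameter_ratio:.2f}")
print(f"drawn diameter ratio:   {drawn_diameter_ratio:.2f}")
```

Canada’s bubble should be about 90% as wide as the USA’s, whereas the areas in the chart correspond to a bubble roughly 76% as wide.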

A better way to compare is via a bar chart, which is not quite as sexy-looking but makes comparisons much easier:

January 10, 2012

War on Infographics

Megan McArdle at The Atlantic critiques the recent online infographics trend:

…it’s time to get down to a war that really matters: the war on terrible, lying infographics, which have become endemic in the blogosphere, and constantly threaten to break out into epidemic or even pandemic status.

The reservoir of this disease of erroneous infographics is internet marketers who don’t care whether the information in their graphics is right … just so long as you link it.

Her critique of a series of infographics is well worth the read.

December 31, 2011

Student multitasking

Another seasonal phenomenon at this time of year is the end of US college football. For those who haven’t encountered the game, American football is not entirely unlike rugby, only with less actual kicking and more ad breaks.

Some economists in Oregon have looked at the relationship between the average male:female GPA difference  at the University of Oregon and the performance of the Ducks, the University’s football team.

So what did the economists find? While the average GPA for male students was always lower than for female students, there was a definite pattern with a larger gap in years when the Ducks did well and a smaller gap when the team did poorly.

December 28, 2011

Seasonal(?) infographics

The New York Times has an amazing infographic of the World’s Greatest Atrocities, which I’ve been looking for an excuse to link to.

And the seasonal link? Today  in the Catholic/Anglican/Lutheran churches (or tomorrow, in the Eastern Orthodox churches) is the Feast of the Holy Innocents, the children killed by Herod, and remembered in the famous Coventry Carol.

According to Wikipedia:

Byzantine liturgy estimated 14,000 Holy Innocents while an early Syrian list of saints stated the number at 64,000. Coptic sources raise the number to 144,000 and place the event on 29 December. Taking the narrative literally and judging from the estimated population of Bethlehem, the Catholic Encyclopedia (1910) more soberly suggested that these numbers were inflated, and that probably only between six and twenty children were killed in the town, with a dozen or so more in the surrounding areas.

It’s easy to magnify the atrocities committed by our enemies, minimize those of our side, and simply forget about those committed by and against people on the other side of the world.

December 14, 2011

We are the 0.01%?

 

New Scientist has an interesting article on peer-to-peer lending, a crowd-sourced alternative to borrowing from banks.  Unfortunately, it’s illustrated by an extremely misleading graph.

The graph title says “As the recession caused US loans to plummet, peer-to-peer lenders began to fill the gap”, and the graph certainly makes the rise in peer-to-peer lending (green) look dramatic compared to total US consumer debt (red). However, the axis scale for the green line is 10,000 times smaller than for the red line. The point where the red and green lines cross is where peer-to-peer first reached 0.01% of all consumer lending.

If the two lines used the same y-axis scale, the green line would be horizontal and indistinguishable from the zero line.  Perhaps peer-to-peer lenders “began” to fill the gap, but they will have to expand a thousandfold before they are even visible on the same scale as total US consumer debt.
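The scale arithmetic is easy to check. The sketch below uses only the figures mentioned above (the 0.01% share and a thousandfold expansion); the variable names are illustrative.

```python
# Peer-to-peer lending as a fraction of all consumer lending at the crossing point.
share_at_crossing = 0.01 / 100  # 0.01% expressed as a fraction

# How many times smaller the green axis must be for the lines to cross there.
axis_scale_ratio = 1 / share_at_crossing

# Share of consumer lending after a thousandfold expansion of peer-to-peer lending,
# assuming total consumer lending stays the same.
share_after_expansion = share_at_crossing * 1000

print(f"axis scale ratio:      {axis_scale_ratio:,.0f}")
print(f"share after expansion: {share_after_expansion:.0%}")
```

A share of 0.01% corresponds to a factor of 10,000 between the two axes, and a thousandfold expansion would bring peer-to-peer lending to about 10% of the total, enough to show up on a shared scale.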

 

November 26, 2011

Election Night Graphics

I’ve posted this over on Throng as well, but thought I’d add it here too:

I just saw this graph on TVNZ’s election coverage:

Putting aside the issue that using a perspective graph makes the bars harder to compare, there’s something wrong with this graph: the informal votes (labelled INF) bar is not correct.

The correct graph should look like this:

Update: there are faint notches in their graphic indicating the “0” mark; however, it’s not a clear enough distinction.

November 24, 2011

Interactive map of US road deaths

Zoom in anywhere in the US and see the locations, with icons indicating year and who died.

They also have a UK map, and are interested in expanding to other countries where the data are available.