So, since we’re having a Transgender Week of Awareness at the moment, it seems like a good time to look at how statisticians ask people about gender, and why it’s harder than it looks.
By ‘harder than it looks’ I don’t just mean that it isn’t a binary question; we’re past that stage, I hope. Also, this isn’t about biological sex — in genetics I do sometimes care how many X chromosomes someone has, but most questionnaires don’t need to know. It’s harder than it looks because there isn’t just one question.
The basic Male/Female binary question can be extended in (at least) two directions. The first is to add categories to represent other ways people identify their gender beyond just male/female, which can be fluid over time, or can have more than two categories. Here a write-in option is useful since you almost certainly don’t know all the distinctions people care about across different cultures. In a specialised questionnaire you might even want to separate out questions about fluid/constant identity from non-binary/diversity, but for routine use that might be more than you need.
A second direction is to ask about transgender status, which is relevant for discrimination and (or thus) for some physical and mental health risks. (Here you might want also want to find out about people who, say, identify as female but present as male.) We have very little idea how many people are transgender — it makes data on sexual orientation look really precise — and that’s a problem for service provision and in many other areas.
Life would get simpler for survey collectors if you combined these into a single question, or if you had a Male/Female/It’s Complicated question with follow-up questions for the third group. On the other hand, it’s pretty clear why trans people don’t like that approach. These really are different questions. For people whose answer to the first question is something like “it depends” or a culturally specific third option, the combination may not be too bad. The problem comes when answer to the second type of question might be “Trans (and yes I sometimes get comments behind my back at work but most people are fine)”, but the answer to the first “Female (and just as female as people with ovaries and a birth certificate, ok)”.
Earlier this year Stats New Zealand ran a discussion and had a go at a better gender question, and it is definitely better than the old one, especially when it allows for multiple answers and for a write-in answer. They also have a ‘synonym list’ to help people work with free-text answers, although that’s going to be limited if all it does is map back to binary or three-way groups. What they didn’t do was to ask for different types of information separately. [edit: ie, they won’t let you unambiguously say ‘female’ in an identity question then ‘trans’ in a different question]
It’s true that for a lot of purposes you don’t need all this information. But then, for a lot of purposes you don’t actually need to know anything about gender.