Scientists like to measure things. And they like to do it accurately. Striking a balance between real world variation and manageable data sets can be a challenge.
Epidemiologists measure things that fall into three broad categories. Firstly, the presence of a disease or health state (the ‘outcome’). Secondly, the presence of factors thought to contribute to the outcome (the ‘exposures’). Thirdly, other factors which might not cause the outcome but might interact with the exposures or muddy the waters in some way.
Weighing up the pros and cons of detail vs simplicity is important. If you want to know about the health effects of smoking, asking people ‘Do you smoke?’ with only ‘Yes/No’ as options means you cannot identify those who smoked for years but quit somewhere along the line. Even among those who answer ‘yes’, smoking two cigarettes a day may affect health differently from smoking twenty. Measures of disease can also be limited by simple classifications. Labelling people as ‘normal’ or ‘hypertensive’ according to their blood pressure means missing out on the degree of disease risk associated with a range of different levels. Many exposure-outcome trends follow a ‘dose-response’ relationship, and allocating people to binary exposure or outcome categories fails to illustrate this.
Of course, sometimes the fine detail is impractical to obtain or not really necessary. When designing epidemiological studies, decisions need to be made at the outset about how finely to measure the detail, and how much of this precision should be retained in the analysis. Classifying people as ‘Under 65’ and ‘65+’ may lose detail, but it may be entirely legitimate to group people into 5- or 10-year age categories. Too many categories can introduce statistical problems if there aren’t many people in each group. Judgements have to be made, and different studies may measure things in different ways, which can be problematic for subsequent comparisons between studies.
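As a minimal sketch of the trade-off (the function name and data are illustrative, not from any real study), grouping exact ages into fixed-width bands keeps some of the dose-response detail that a binary ‘Under 65’/‘65+’ split throws away, while still producing manageable group sizes:

```python
def age_band(age, width=10):
    """Return a label like '60-69' for the band containing `age`."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"

# Hypothetical participant ages
ages = [23, 37, 41, 64, 65, 78]
print([age_band(a) for a in ages])
# ['20-29', '30-39', '40-49', '60-69', '60-69', '70-79']
```

Note that with 10-year bands a 64-year-old and a 65-year-old land in the same group, whereas a binary cut at 65 would separate them; which grouping is ‘right’ depends on the analysis, which is exactly the judgement call described above.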
The UK referendum on the voting system is another case where a binary measure may fail to capture the true status of things. Voters were asked whether they wanted to abandon the current ‘First Past the Post (FPTP)’ system and replace it with the ‘Alternative Vote (AV)’ system. The options were simply ‘yes’ or ‘no’.
What we cannot tell from the result of such a question is how many of the ‘no’ voters weren’t necessarily happy with First Past the Post, but didn’t think the Alternative Vote was a good enough replacement. Nor do we know how many of the ‘yes’ voters weren’t totally convinced by AV, but felt strongly enough that some kind of change was desirable that they would put up with it.
Of course, the question could have been framed differently. For example:
Do you think the current voting system should be changed?
– No – keep First Past the Post
– Yes – replace with AV
– Yes – but replace with something other than AV
The problem with this is that had the vote been split 40:30:30, the result would have been ‘no’, even though 60% of people had voted for some kind of replacement for FPTP. That is precisely the kind of problem the referendum was trying to address. Deciding the ‘best’ way to ask a question isn’t always simple.
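The arithmetic can be sketched in a few lines, using the hypothetical 40:30:30 tallies from the three-option question above (the option labels are illustrative):

```python
# Hypothetical tallies: 'keep FPTP' wins on a plurality even though
# 60% of voters chose some kind of change.
votes = {
    "keep FPTP": 40,
    "replace with AV": 30,
    "replace with something other than AV": 30,
}

winner = max(votes, key=votes.get)
want_change = sum(v for opt, v in votes.items() if opt != "keep FPTP")

print(winner)       # keep FPTP
print(want_change)  # 60
```

The plurality winner and the majority preference point in opposite directions, which is the framing problem the paragraph above describes.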