Selection bias and the perils of data science

According to the Guardian Data Blog, a Twitter-based analysis suggests Obama is heading for electoral success. It’s all very nice to see mapped out, and the use of geocoding is cool (though possibly flawed), but underlying the approach is a massive potential for selection bias. The problem is quite simply this: if […]

Brave new worlds and dull old paperwork

Nice to hear a radio programme about lifecourse epidemiology and longitudinal studies this week. It’s not often that population health gets such a measured presentation. I was particularly pleased to hear reference (24 mins onward) to the potential of using existing data from patient records to give new insights into health, disease and clinical […]

On the limitations of binary measures

Scientists like to measure things. And they like to do it accurately. Striking a balance between real world variation and manageable data sets can be a challenge. Epidemiologists measure things that fall into three broad categories. Firstly, the presence of a disease or health state (the ‘outcome’). Secondly, the presence of factors thought to contribute […]