BIAS: Description of Research Programme

The research of BIAS consists of the following methodological components:

Multiple bias modelling for observational studies

Nicky Best, Sylvia Richardson, Sara Geneletti, Jassy Molitor, Lawrence McCandless

Observational studies are subject to many potential sources of bias, such as unmeasured confounding, missing/mis-measured data, and various selection biases. We are developing a methodological framework using graphical models to represent different types of bias that may be present in different types of observational studies. This graphical framework will provide social scientists with a conceptual tool to help identify possible biases that may affect their study, and also with an analytic tool to carry out sensitivity analysis. Typically, the bias parameters in the model are not identified by the study data, and so information must be supplied from other sources, including external datasets or expert opinion.

Combining individual and aggregate level data

Nicky Best, Sylvia Richardson, Chris Jackson

Many sources of data, such as the Census, ONS neighbourhood statistics, and the national births, deaths and other health data sets, provide information on the average health and social circumstances of the whole population, and of different sub-groups of the population. On the other hand, there are various UK survey and cohort data sets that provide detailed individual-level information about health, lifestyle, socioeconomic and other personal characteristics, but only on a small subset of individuals. We have developed hierarchical related regression methods for combining random samples of individual-level data with aggregate data on the same variables. These can be used to estimate the interplay between individual-level and group-level influences on health and other outcomes of interest, and to reduce problems of ecological bias. These methods have been successfully applied to study socio-demographic variations in the risk of self-reported limiting long-term illness and hospitalisation for cardiovascular disease. Software for implementing these methods in R and WinBUGS is available on the Software page. Further methodological developments in this area will include the extension to case-control samples. Ongoing applied work includes studies of the effects of air pollution on childhood leukaemia and low birth weight, and further development of the software.
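The ecological bias mentioned above can be illustrated with a small sketch. It is not the BIAS software: it assumes a single binary exposure, a logistic individual-level model, and hypothetical parameter names (alpha, beta, phi). It contrasts the area-level risk obtained by averaging individual risks over the within-area exposure distribution with the shortcut of plugging the area's mean exposure into the individual-level model.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def individual_risk(x, alpha, beta):
    """Individual-level logistic risk for a binary exposure x (0 or 1)."""
    return expit(alpha + beta * x)


def area_risk(phi, alpha, beta):
    """Expected outcome proportion in an area where a fraction phi is exposed.

    Averages the individual-level risks over the within-area exposure
    distribution, so individual and aggregate data share alpha and beta.
    """
    unexposed = individual_risk(0, alpha, beta)
    exposed = individual_risk(1, alpha, beta)
    return (1.0 - phi) * unexposed + phi * exposed


def naive_ecological_risk(phi, alpha, beta):
    """Shortcut that plugs the area mean exposure into the individual model.

    Because the logistic curve is non-linear, this generally differs from
    area_risk, which is one source of ecological bias.
    """
    return expit(alpha + beta * phi)


if __name__ == "__main__":
    # Illustrative values only (assumed, not estimates from any study).
    alpha, beta, phi = -2.0, 1.5, 0.5
    print("averaged area risk:", area_risk(phi, alpha, beta))
    print("naive plug-in risk:", naive_ecological_risk(phi, alpha, beta))
```

The two quantities agree when every individual in an area has the same exposure (phi equal to 0 or 1) and diverge otherwise, which is why aggregate data alone can mislead about individual-level associations.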

Small area estimation

Nicky Best, Sylvia Richardson, Virgilio Gómez Rubio

This work is being carried out in collaboration with ONS. The basic methodological problem is to estimate the value of a given indicator (e.g. income, crime rate, unemployment) for every small area, using data on the indicator from individual-level surveys in a partial sample of areas, plus relevant area-level covariates available for all areas from e.g. census and administrative sources. We are extending the existing estimation methods used by ONS to incorporate spatial and spatio-temporal dependence, and are comparing likelihood-based and Bayesian methods for small area estimation.
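The general idea of borrowing strength across areas can be sketched with a standard area-level composite (shrinkage) estimator. This is a generic textbook form, not the ONS method or our extensions: it assumes the sampling variances and the between-area variance are known, and that a covariate-based "synthetic" prediction is already available for every area. All function and argument names are hypothetical.

```python
def composite_estimates(direct, sampling_var, synthetic, between_area_var):
    """Combine direct survey estimates with covariate-based predictions.

    direct:           dict area -> direct survey estimate (sampled areas only)
    sampling_var:     dict area -> sampling variance of the direct estimate
    synthetic:        dict area -> model prediction from area-level covariates
                      (available for every area)
    between_area_var: assumed variance of the area-level random effects
    """
    estimates = {}
    for area, synth in synthetic.items():
        if area in direct:
            # Weight on the direct estimate grows as its sampling variance shrinks.
            gamma = between_area_var / (between_area_var + sampling_var[area])
            estimates[area] = gamma * direct[area] + (1.0 - gamma) * synth
        else:
            # Unsampled areas rely entirely on the covariate-based prediction.
            estimates[area] = synth
    return estimates


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    result = composite_estimates(
        direct={"A": 100.0},
        sampling_var={"A": 4.0},
        synthetic={"A": 80.0, "B": 70.0},
        between_area_var=4.0,
    )
    print(result)
```

Spatial and spatio-temporal extensions replace the independent area effects assumed here with structured random effects, so that neighbouring areas and adjacent time points also inform each estimate.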

Modelling biases due to survey non-response

Nicky Best, Sylvia Richardson, Sara Geneletti

Survey non-response is typically due to non-availability (e.g. invalid address) or refusal (total or partial) and can result in the sampled population being non-exchangeable with the target population. One strategy for dealing with non-response is to (multiply) impute the missing data and analyse the completed dataset in a standard way. Alternatively, we can adjust estimates based on the observed data to attempt to reduce the bias that this non-exchangeability introduces. We are developing two different approaches to carry out such bias adjustments. One is based on developing graphical models (formally, directed acyclic graphs or DAGs) that explicitly represent the non-response process. In addition to representing the relationships between variables of interest, we encode non-response itself as a variable in the DAG and link it with potential causes. Conditional independences coded by DAGs can then be used to separate the process of non-response from the mechanism of inferential interest, and corrected estimates of model parameters can be obtained using, e.g., post-stratification methods. Our second approach aims to model the effect of the bias through a latent variable representing factors associated with non-response, and will build on our recent work to model unmeasured confounding using propensity scores. Both our approaches will make use of external data or expert judgment to inform about the causes of non-response.
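As a minimal illustration of the post-stratification step mentioned above, the sketch below reweights respondent means by known population shares. It assumes that, within each stratum, respondents and non-respondents are exchangeable (the kind of conditional independence a DAG would be used to justify) and that the population shares come from an external source such as the census. The function and variable names are hypothetical.

```python
def poststratified_mean(respondents, population_shares):
    """Post-stratified estimate of a population mean.

    respondents:       list of (stratum, value) pairs for observed respondents
    population_shares: dict stratum -> known population proportion (sums to 1)
    """
    totals, counts = {}, {}
    for stratum, value in respondents:
        totals[stratum] = totals.get(stratum, 0.0) + value
        counts[stratum] = counts.get(stratum, 0) + 1

    missing = set(population_shares) - set(counts)
    if missing:
        # A stratum with no respondents cannot be corrected by reweighting alone.
        raise ValueError(f"no respondents in strata: {sorted(missing)}")

    # Weight each stratum's respondent mean by its population share.
    return sum(
        share * totals[stratum] / counts[stratum]
        for stratum, share in population_shares.items()
    )


if __name__ == "__main__":
    # Made-up data: the "old" stratum is over-represented among respondents.
    data = [("young", 10.0), ("young", 20.0)] + [("old", 30.0)] * 4
    print(poststratified_mean(data, {"young": 0.5, "old": 0.5}))
```

In this made-up example the unadjusted respondent mean is 25, whereas the post-stratified estimate gives each stratum its population weight.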

Spatio-temporal modelling of small area data to estimate social changes in space and time

Nicky Best, Sylvia Richardson, Philip Li

During BIAS I we implemented Bayesian methods for cross-sectional estimates of small area indicators in which several types of spatial random effect are introduced in order to improve estimation. We are now extending these models to include a temporal dimension, developing a rich class of space-time models that will accommodate different patterns of change over time. For some applications, e.g. small area estimates of income, smooth changes over time are expected, which can be modelled either parametrically (e.g. linear trend), semi-parametrically (e.g. cubic splines) or via latent autoregressive processes. In other cases, the detection of abrupt changes in some areas may be of interest, e.g. in the analysis of crime rates. We will pay special attention to the detection of areas where the residual space-time variation not predicted by the addition of separate space and time trends is unusually high, because these will capture unexpected changes over time, due for example to policy or social changes.

Applications of these methods include measuring changes over time in small area estimates of income and unemployment, and estimation of space-time patterns of criminal offences.
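The notion of residual space-time variation "not predicted by the addition of separate space and time trends" can be illustrated with a simple descriptive analogue. The sketch below is not the Bayesian space-time model: it assumes a complete area-by-period table of rates and removes additive area and time main effects using row and column means, leaving the interaction residuals. Names are hypothetical.

```python
def interaction_residuals(rates):
    """Residuals after removing additive area and time effects.

    rates: list of rows, one per area, each a list of values per time period
           (a complete rectangular table is assumed).
    Returns a table of the same shape holding
    rate - area_mean - time_mean + grand_mean.
    """
    n_areas = len(rates)
    n_times = len(rates[0])
    grand = sum(sum(row) for row in rates) / (n_areas * n_times)
    area_means = [sum(row) / n_times for row in rates]
    time_means = [
        sum(rates[i][t] for i in range(n_areas)) / n_areas
        for t in range(n_times)
    ]
    return [
        [rates[i][t] - area_means[i] - time_means[t] + grand
         for t in range(n_times)]
        for i in range(n_areas)
    ]


if __name__ == "__main__":
    # Made-up table: area 3 shows a jump in the last period.
    table = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 14.0]]
    for row in interaction_residuals(table):
        print(row)
```

Large residuals flag area-period combinations that depart from the common spatial pattern and the common time trend. In the full models this role is played by a space-time interaction term with an associated measure of uncertainty.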

Generalised evidence synthesis for longitudinal data

Nicky Best, Sylvia Richardson, Jassy Molitor

Meta-analysis and evidence synthesis refer to approaches for combining identical variables or estimates of the same quantity from multiple datasets. Recently, attempts have been made to extend such methods to the synthesis of more disparate items of information, under the name generalised evidence synthesis. In collaboration with the IALSA international network on longitudinal studies of ageing (http://projects.pop.psu.edu/ialsa), we are developing and applying a Bayesian graphical modelling approach for the generalised synthesis of multiple cross-national longitudinal datasets. We will initially develop a basic hierarchical synthesis model with subject- and study-specific random coefficients and then elaborate this structure to allow for systematic differences between studies not attributable to measured covariates, country effects or random variation – for example, by introducing additional study-specific ‘quality’ parameters to the random effects distribution. We will further extend the modelling framework to include studies in which the outcome and/or covariates are measured in different ways, by introducing latent variables to represent the common underlying variables.

We are applying these methods to the analysis of ageing-related changes in cognition and health.

Measuring changes over time in small area estimates of income and unemployment

Nicky Best, Sylvia Richardson, Virgilio Gómez Rubio, Philip Li

We are collaborating with Mr Philip Clarke (Office for National Statistics) to apply our spatio-temporal small area estimation methods to the analysis of ONS data such as the General Household Survey, the Integrated Household Survey and the Family Resources Survey. Important aspects of this analysis include the identification of time trends and of areas with low levels of income and employment, poor health and poor housing conditions.

Space-time patterns of criminal offences

Nicky Best, Sylvia Richardson, Philip Li

We are collaborating with Prof Bob Haining (Department of Geography, Cambridge University) to apply our spatio-temporal small area estimation methods to the analysis of space-time patterns of criminal offences in Cambridgeshire. Available data include information on offences (crime type, date, time, postcode) and offenders (when known), plus relevant socio-economic area-level covariates. We will investigate how stable these patterns are and whether there is evidence of repeat victimisation and “spatial” repeat victimisation (where an offence such as burglary does not recur in the same household but within some radius of it). In addition, we will try to detect sudden increases in criminal activity by modelling space-time interactions.

Longitudinal studies of ageing-related changes in cognition and health

Nicky Best, Sylvia Richardson, Jassy Molitor

We are collaborating with Prof Scott Hofer (Oregon State University), who is director of the IALSA international collaborative research network on longitudinal studies of ageing, to apply our methods for generalised evidence synthesis to study ageing-related changes in cognition and health. The IALSA network has access to 25 existing longitudinal studies on ageing, with the direct involvement of many of the original principal investigators and their research teams. These studies span twelve countries and represent a total sample size of approximately 70,000 individuals aged between 18 and 100. A key objective of the IALSA programme of research is to identify processes of change other than chronological age that account for population and individual-level cognitive change.

Electoral Behaviour

Nicky Best, Sylvia Richardson, Jane Key

We are collaborating with Dr Steve Fisher (Department of Sociology, Oxford University) to apply the hierarchical related regression (HRR) methods developed as part of the BIAS I research programme to two connected questions relating to voting behaviour.

Ethnicity and Vote Choice

The vote choice of ethnic minorities in Britain is hard to estimate with opinion polls, or even with the British Election Study (BES) surveys, because the sample sizes are too small to yield sufficient numbers of ethnic minority respondents. The 1997 BES ethnic minority booster sample showed that ethnic minorities were all much more likely to vote Labour than the white majority in Britain. Since then the issue of ethnicity has become more significant in British politics, especially following the 2003 invasion of Iraq. The 2005 election may well have seen a dramatic change in the tendency of ethnic minorities, and particularly Muslims, to vote for Labour. However, it is difficult to say by how much minority vote choice changed between 2001 and 2005, or indeed whether there was a change, due to the small sample sizes in the BES surveys for those years. We are using HRR models to combine the individual-level BES data with aggregate census data, which include ethnicity and religion, plus election results for parliamentary constituencies, to improve estimates of the strength of association between ethnicity and vote for the 2001 and 2005 general elections and thereby assess the extent to which there was a realignment of British ethnic minorities away from Labour.

Class Voting

There is a substantial literature on the extent to which electoral politics is structured by competition between social classes. In Britain it is clear from the British Election Studies (BES) conducted since 1964 that there has been a decline in class voting, but it is not clear whether this is due to long-term changes in the nature of the electorate, or to contingent political factors that happen to have occurred in recent elections, most especially the similarity in policies offered by the main political parties. The measurement of the association between class and vote is important for this debate. Estimates have thus far been based entirely on individual-level survey data, especially from the BES, but there are reasonable concerns about the reliability of these estimates due to small sample sizes (~2000) in the survey data. We are again using HRR models to combine BES data with aggregate census data and election results for parliamentary constituencies to improve estimates of the strength of association between class and vote for each election and thereby assess the pattern of change more accurately.