The tutorial will take place on 11 August 2008, 14:00 - 17:30.
This tutorial is an introduction to Small Area Estimation (SAE) and how to compute a range of estimators with R. SAE is gaining popularity because of an increasing interest in providing estimators at different administrative scales, which usually involves many small areas. Examples that will be described in this tutorial include estimation of the average income per household, rate of unemployment and the relative risk of a certain disease.
Data sources frequently used in SAE include survey data with direct information on the population under study and other aggregate administrative data sources. These two data components can be combined efficiently to provide reliable estimates in small areas.
First of all, direct estimators will be described. This family of
estimators relies almost entirely on the survey data. Hence, it may be difficult
or impossible to provide estimates in those areas that have not been included
in the survey sample. The Generalised Regression Estimator will also be
described. The survey package will be used in this part of the tutorial.
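As a rough illustration of what a direct estimator computes, the sketch below uses base R only and a made-up sample; the data frame smp, its columns (area, income, w) and the list of areas are all assumptions for the example, not tutorial material. It computes a weighted area mean, sum(w * y) / sum(w), and shows that a non-sampled area gets no estimate. With real survey data one would normally declare the design with the survey package (for example via svydesign() and svyby()) rather than compute this by hand.

```r
# Hypothetical survey sample: one row per sampled household.
# 'area', 'income' and 'w' (sampling weight) are made-up names and values.
smp <- data.frame(
  area   = c("A", "A", "A", "B", "B", "C"),
  income = c(20, 30, 25, 40, 50, 35),
  w      = c(10, 10, 20, 5, 5, 15)
)

# Weighted direct estimator of the area mean:
# sum(w * y) / sum(w), computed separately within each area.
direct_mean <- function(d) sum(d$w * d$income) / sum(d$w)
direct <- sapply(split(smp, smp$area), direct_mean)

# Area "D" exists in the population but was never sampled,
# so it has no direct estimate (NA).
all_areas <- c("A", "B", "C", "D")
direct_all <- direct[all_areas]
names(direct_all) <- all_areas
print(direct_all)
```

The NA for area D is the limitation described above: a direct estimator has nothing to work with where no units were sampled.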
Regression models provide a suitable framework to borrow information from data in different areas and produce good estimates for all small areas. The tutorial will focus on linear models. Synthetic and composite estimators will be used in this section.
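The following sketch shows one simple way synthetic and composite estimators might be computed with lm(). All values are invented, and the composite weight phi = n / (n + k) with k = 2 is an arbitrary choice made for illustration, not a recommended rule.

```r
# Hypothetical area-level data: direct estimates of mean income, one
# covariate from administrative records, and area sample sizes.
# Area "D" was not sampled, so its direct estimate is missing.
areas <- data.frame(
  area   = c("A", "B", "C", "D"),
  direct = c(24, 46, 38, NA),
  x      = c(1, 3, 2, 2.5),
  n      = c(3, 2, 1, 0)
)

# Linear model fitted on the sampled areas (lm drops the NA row by default).
fit <- lm(direct ~ x, data = areas)

# Synthetic estimator: the model prediction, available for every area.
areas$synthetic <- predict(fit, newdata = areas)

# Composite estimator: weighted average of direct and synthetic estimates.
# phi grows with the area sample size; k is an arbitrary constant (assumption).
k   <- 2
phi <- areas$n / (areas$n + k)
areas$composite <- ifelse(
  is.na(areas$direct),
  areas$synthetic,
  phi * areas$direct + (1 - phi) * areas$synthetic
)
print(areas)
```

The non-sampled area falls back entirely on the synthetic estimate, while well-sampled areas stay close to their direct estimate.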
Mixed-effects models also play an important role in SAE. Estimates can be
improved by including random effects that accommodate between-area differences
better. We will illustrate how to use the
nlme package to
fit mixed-effects models with R in some SAE applications.
SAE will also be used to illustrate the computation of some
Spatial EBLUP estimators.
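A minimal sketch of a random-intercept model fitted with lme() from nlme is given below. The data are simulated, and the variable names, the number of areas and the covariate means in xbar are assumptions for the example. Predicting at the area mean of the covariate with level = 1 gives an EBLUP-type estimate of the area mean under this linear model; finite population corrections are ignored here.

```r
library(nlme)

set.seed(1)
# Simulated unit-level data (hypothetical): 10 areas, 8 sampled units each.
m <- 10
n <- 8
d <- data.frame(
  area = factor(rep(seq_len(m), each = n)),
  x    = runif(m * n)
)
u   <- rnorm(m, sd = 1)   # true area effects used only to simulate y
d$y <- 2 + 3 * d$x + u[as.integer(d$area)] + rnorm(m * n, sd = 0.5)

# Linear mixed model with a random intercept for each area.
fit <- lme(y ~ x, random = ~ 1 | area, data = d)

# Assumed known population mean of x in every area (made-up value 0.5).
xbar <- data.frame(
  area = factor(seq_len(m), levels = levels(d$area)),
  x    = 0.5
)

# level = 0: fixed effects only (a synthetic-type prediction);
# level = 1: fixed effects plus the predicted area effect.
synthetic <- predict(fit, newdata = xbar, level = 0)
eblup     <- predict(fit, newdata = xbar, level = 1)

print(round(cbind(synthetic = as.numeric(synthetic),
                  eblup     = as.numeric(eblup)), 2))
```

The difference between the two columns is the predicted random effect of each area, which is what lets the estimates depart from a single regression line.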
Bayesian hierarchical models have proven very useful in SAE. Important
applications include disease mapping, environmental modelling of pollutants and
many more. We will illustrate how to fit some of these models using
WinBUGS. Spatial random
effects are a particularly interesting example because they can be used to
model spatial patterns and are especially useful to improve estimation in
non-sampled areas. Different spatial packages, including
maptools, will be used to
export geographical information to be used with
Finally, some other examples of non-linear models for SAE will be illustrated. In particular, we will show how logistic regression can be used to estimate the rate of unemployment using a combination of individual and administrative data.
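The idea can be sketched with glm() on simulated data. Everything below is hypothetical: the individual covariate age, the area-level administrative covariate claim, the coefficients used to simulate the data, and the non-sampled area D. Averaging fitted probabilities over the sampled individuals is a simplification; a fuller treatment would predict over the whole population of each area.

```r
set.seed(42)
# Simulated individual-level survey data (hypothetical variables).
n <- 500
svy <- data.frame(
  area = sample(c("A", "B", "C"), n, replace = TRUE),
  age  = sample(18:64, n, replace = TRUE),
  stringsAsFactors = FALSE
)

# Made-up area-level administrative covariate (e.g. a claimant rate).
claim     <- c(A = 0.04, B = 0.08, C = 0.12)
svy$claim <- unname(claim[as.character(svy$area)])

# Simulate unemployment status from an assumed logistic model.
p <- plogis(-3 + 15 * svy$claim - 0.01 * (svy$age - 40))
svy$unemployed <- rbinom(n, 1, p)

# Logistic regression combining individual and area-level information.
fit <- glm(unemployed ~ age + claim, family = binomial, data = svy)

# Model-based area rates: mean fitted probability within each area.
svy$phat <- predict(fit, type = "response")
rate <- tapply(svy$phat, svy$area, mean)
print(round(rate, 3))

# A non-sampled area can still be given a prediction from its covariates
# (here a hypothetical area D, evaluated at age 40).
rate_D <- predict(fit, newdata = data.frame(age = 40, claim = 0.10),
                  type = "response")
print(round(rate_D, 3))
```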
Maps are a useful way of reporting small area estimates. When the examples
involve geographical information,
maptools packages will be used to illustrate
how to handle and display maps in R.
All participants must have a working knowledge of R. Prior knowledge of statistics and small area estimation methods is desirable.
Dr. Virgilio Gómez-Rubio, Imperial College London
IMPORTANT: A port of the EURAREA macros to R can be found at the DACSEIS Project website. Thanks to Prof. Ralf Münnich for pointing this out.
Furthermore, you can also check some materials from a previous course on spatial data analysis here. Look for unit 9.
R software and packages
This is the main site to download R and the packages that we will use in the course.