Estimation of level and change for unemployment using structural time series models
Section 1. Introduction

Statistics Netherlands uses data from the Dutch Labour Force Survey (LFS) to estimate labour status at various aggregation levels. National estimates are produced monthly, provincial estimates quarterly, and municipal estimates annually. Traditionally, monthly publications about the labour force were based on rolling quarterly figures compiled by means of direct generalized regression estimation (GREG); see, e.g., Särndal, Swensson and Wretman (1992). The continuous nature of the LFS makes it possible to borrow strength not only from other areas, but also over time. A structural time series model (STM) to estimate national monthly labour status for six gender-by-age classes has been in use since 2010 (van den Brakel and Krieg, 2009, 2015).

Until now, provincial estimates have been produced quarterly using the GREG estimator. In order to produce figures on a monthly basis, a model-based estimation strategy is necessary because the monthly provincial sample sizes are too small for direct estimation. In this paper a model is proposed that combines a time series modeling approach, to borrow strength over time, with cross-sectional small area models, to borrow strength over space, with the purpose of producing reliable monthly estimates of provincial unemployment. As a consequence of the LFS panel design, the monthly GREG estimates are autocorrelated and estimates based on follow-up waves are biased relative to the first wave estimates. The latter phenomenon is often referred to as rotation group bias (Bailar, 1975). Both features need to be accounted for in the model (Pfeffermann, 1991). Previous accounts of regional small area estimation of unemployment, where strength is borrowed over both time and space, include Rao and Yu (1994); Datta, Lahiri, Maiti and Lu (1999); You, Rao and Gambino (2003); You (2008); Pfeffermann and Burck (1990); Pfeffermann and Tiller (2006); van den Brakel and Krieg (2016); see also Rao and Molina (2015), Section 4.4, for an overview.

In this paper, multivariate STMs for provincial monthly labour force data are developed as a form of small area estimation that borrows strength over time and space and accounts for the rotation group bias and serial correlation induced by the rotating panel design. In an STM, an observed series is decomposed into several unobserved components, such as a trend, a seasonal component, regression components, other cyclic components and a white noise term for the remaining unexplained variation. These components are based on stochastic models, which allows them to vary over time. The classical way to fit STMs is to express them as a state space model and apply a Kalman filter and smoother to obtain optimal estimates for state variables and signals. The unknown hyperparameters of the models for the state variables are estimated by means of maximum likelihood (ML) (Harvey, Chapter 3). Alternatively, state space models can be fitted in a Bayesian framework using a particle filter (Andrieu, Doucet and Holenstein (2010); Durbin and Koopman (2012), Chapter 9). STMs can also be expressed as time series multilevel models and can be seen as an extension of the classical Fay-Herriot model (Fay and Herriot, 1979). Connections between structural time series models and multilevel models have been explored before from several points of view in Knorr-Held and Rue (2002); Chan and Jeliazkov (2009); McCausland, Miller and Pelletier (2011); Ruiz-Cárdenas, Krainski and Rue (2012); Piepho and Ogutu (2014); Bollineni-Balabay, van den Brakel, Palm and Boonstra (2016). In these papers the equivalence between state space model components and multilevel components is made more explicit. Multilevel models can be fitted in both a frequentist and a hierarchical Bayesian framework; see Rao and Molina (2015), Sections 8.3 and 10.9, respectively.
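To make the state space formulation concrete, the following is a minimal sketch of a Kalman filter for the simplest STM, a local level model in which the observed series is a random-walk level plus white noise. It is an illustration only and not the model developed in this paper: the function name `local_level_filter`, the simulated series, and the variance values are all assumptions, and the hyperparameters are taken as known rather than estimated by ML.

```python
import numpy as np


def local_level_filter(y, var_obs, var_level, a0=0.0, p0=1e7):
    """Kalman filter for y_t = mu_t + eps_t, mu_{t+1} = mu_t + eta_t.

    var_obs   : variance of the observation noise eps_t (assumed known)
    var_level : variance of the level disturbance eta_t (assumed known)
    a0, p0    : mean and variance of the initial level; a large p0 is a
                simple approximation to a diffuse prior
    """
    n = len(y)
    filtered = np.empty(n)
    filtered_var = np.empty(n)
    a, p = a0, p0  # one-step-ahead prediction of the level and its variance
    for t in range(n):
        if np.isnan(y[t]):
            # Missing observation: no update, the prediction is carried forward.
            a_f, p_f = a, p
        else:
            f = p + var_obs            # variance of the prediction error
            k = p / f                  # Kalman gain
            a_f = a + k * (y[t] - a)   # filtered level
            p_f = p * (1.0 - k)        # filtered variance
        filtered[t] = a_f
        filtered_var[t] = p_f
        # Prediction step: the random walk keeps the mean and adds var_level.
        a, p = a_f, p_f + var_level
    return filtered, filtered_var


# Simulated example (hypothetical values, not LFS data).
rng = np.random.default_rng(1)
n, var_obs, var_level = 200, 1.0, 0.01
level = 5.0 + np.cumsum(rng.normal(0.0, np.sqrt(var_level), n))
y = level + rng.normal(0.0, np.sqrt(var_obs), n)

filtered, filtered_var = local_level_filter(y, var_obs, var_level)
print("MSE of raw observations:", np.mean((y - level) ** 2))
print("MSE of filtered level:  ", np.mean((filtered - level) ** 2))
```

Because the level changes slowly relative to the observation noise, the filtered estimates pool information from earlier periods, which is the sense in which a time series model borrows strength over time. A full STM would add seasonal and other components to the state vector and follow the filter with a smoother.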

This paper contributes to the small area estimation literature by comparing STMs for rotating panel designs that are expressed as state space models with STMs that are expressed as time series multilevel models. State space models are fitted using a Kalman filter and smoother in a frequentist framework, where hyperparameters are estimated with ML. In this case models are compared using AIC and BIC. Time series multilevel models are fitted in a hierarchical Bayesian framework, using a Gibbs sampler. Models with different combinations of fixed and random effects are compared based on the Deviance Information Criterion (DIC). The estimates based on multilevel and state space models and their standard errors are compared graphically and contrasted with the initial survey regression estimates. Modeling cross-sectional correlation in multivariate time series models rapidly increases the number of hyperparameters to be estimated. One way to obtain more parsimonious models is to use common factor models. In this paper an alternative approach is proposed that models correlations between time series components indirectly, based on a global common trend and local trends for the domain-specific deviations.
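As a rough illustration of how a global common trend can induce cross-sectional correlation indirectly, consider the following stylized decomposition. The notation is an assumption made for this sketch and need not match the specification in Section 4: $y_{it}$ is the initial estimate for domain $i$ in month $t$, $L_t$ a trend shared by all $m$ domains, $\ell_{it}$ a domain-specific trend deviation, $S_{it}$ a seasonal component and $e_{it}$ the remaining noise.

```latex
\begin{align*}
  y_{it} &= L_t + \ell_{it} + S_{it} + e_{it},
    \qquad i = 1, \dots, m, \\
  \operatorname{Cov}\left(L_t + \ell_{it},\; L_t + \ell_{jt}\right)
    &= \operatorname{Var}(L_t), \qquad i \neq j,
\end{align*}
```

The second line holds if the deviations $\ell_{it}$ are mutually independent and independent of $L_t$. Under that assumption the domain trends are positively correlated through $L_t$ alone, so no separate covariance hyperparameters are needed, whereas an unrestricted covariance matrix for the trend disturbances of $m$ domains has $m(m+1)/2$ free parameters.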

The paper is structured as follows. In Section 2 the LFS data used in this study are described. Section 3 describes how the survey regression estimator (Battese, Harter and Fuller, 1988) is used to compute initial estimates. These initial estimates are the input for the STMs, which are discussed in Section 4. In Section 5 the results based on several state space and multilevel models are compared, including estimates of period-to-period change for monthly data. Section 6 contains a discussion of the results as well as some ideas for further work. Throughout the paper we refer to the technical report by Boonstra and van den Brakel (2016) for additional details and results.

