Small area estimation for unemployment using latent Markov models
Section 4. The proposed model

In this section, the proposed SAE model based on LMMs is illustrated. It can be considered as a compromise between the YRG model based on (3.1), which leads to possible oversmoothing, and the computationally demanding alternative proposed in Datta et al. (1999), based on (3.2). We first give a general description of LMMs and then move to the specification of the area level model and to its estimation.

4.1  Preliminaries

In LMMs, the existence of two types of process is assumed: an unobservable finite-state first-order Markov chain $U_{it}$ with state space $\{1, \ldots, k\}$ and an observed process, which in our case corresponds to $\theta_{it}$, with $i = 1, \ldots, m$ and $t = 1, \ldots, T$. It is assumed that the distribution of $\theta_{it}$ depends only on $U_{it}$; specifically, the $\theta_{it}$ are conditionally independent given the $U_{it}$. In addition, the latent state to which a small area belongs at a certain time point only depends on the latent state at the previous occasion.

The state-dependent distribution, namely the distribution of $\theta_{it}$ given $U_{it}$, can be continuous or discrete. Such a distribution is typically taken from the exponential family. Thus, the overall vector of parameters of the LMM, denoted by $\boldsymbol{\phi}$, includes the parameters of the Markov chain, denoted by $\boldsymbol{\phi}_{\text{lat}}$, and the vector of parameters $\boldsymbol{\phi}_{\text{obs}}$ of the state-dependent distribution.
In fact, the model consists of two components, the measurement model and the latent model, which concern the conditional distribution of the response variables given the latent variables and the distribution of the latent variables, respectively. By jointly considering these components, the so-called manifest distribution is obtained: it is the marginal distribution of the response variables, once the latent variables have been integrated out.
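
As an illustration of how the latent variables can be summed out, the following Python sketch evaluates the log manifest density of a single series with the standard forward recursion. It is only a sketch under assumptions that are not part of the text: a Normal state-dependent density is used, states are coded 0, ..., k-1, and the names `manifest_log_density`, `init`, `trans`, `means` and `variances` are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def manifest_log_density(y, init, trans, means, variances):
    """Log manifest density of one series y, with the latent states summed out.

    init[u]      : probability that the chain starts in state u
    trans[v][u]  : probability of moving from state v to state u (rows sum to one)
    means[u], variances[u] : illustrative Normal state-dependent density (assumption)
    """
    k = len(init)
    # alpha[u] is proportional to the joint density of y[0..t] and state u at time t.
    alpha = [init[u] * normal_pdf(y[0], means[u], variances[u]) for u in range(k)]
    log_density = 0.0
    for t in range(len(y)):
        if t > 0:
            # Propagate one step through the chain, then weight by the new observation.
            alpha = [
                sum(alpha[v] * trans[v][u] for v in range(k))
                * normal_pdf(y[t], means[u], variances[u])
                for u in range(k)
            ]
        # Rescale at every step to avoid numerical underflow on long series.
        scale = sum(alpha)
        log_density += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_density
```

With a single latent state the recursion reduces to a sum of Normal log densities, which gives a quick sanity check of the implementation.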

The measurement model, based on parameters $\boldsymbol{\phi}_{\text{obs}}$, can be written as

$$\theta_{it} \mid U_{it} = u \sim p\left(\theta_{it} \mid u, \boldsymbol{\phi}_{\text{obs}}\right).$$

Moreover, the parameters $\boldsymbol{\phi}_{\text{lat}}$ of the Markov chain are:

$$\pi_u = P\left(U_{i1} = u\right), \quad u = 1, \ldots, k;$$

$$\boldsymbol{\Pi} = \begin{pmatrix} \pi_{1 \mid 1} & \cdots & \pi_{1 \mid k} \\ \vdots & \ddots & \vdots \\ \pi_{k \mid 1} & \cdots & \pi_{k \mid k} \end{pmatrix},$$

$$\pi_{u \mid \bar{u}} = P\left(U_{it} = u \mid U_{i,t-1} = \bar{u}\right), \quad \bar{u}, u = 1, \ldots, k.$$
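
To make the role of the initial and transition probabilities concrete, the following Python sketch draws one latent path from a homogeneous first-order chain. The function name, the three-state example and all numerical values are hypothetical. States are coded 0, ..., k-1, and the code stores transition probabilities as `trans[prev][cur]`, so each row sums to one; this storage layout is a choice of the sketch.

```python
import random


def simulate_latent_chain(init, trans, n_times, rng):
    """Draw one path of latent states of length n_times.

    init[u]         : probability that the chain starts in state u
    trans[prev][cur]: probability of moving from state prev to state cur
    """
    states = list(range(len(init)))
    # The first state is drawn from the initial probabilities.
    path = rng.choices(states, weights=init)[0:1]
    # Each later state depends only on the state at the previous occasion.
    for _ in range(1, n_times):
        path.append(rng.choices(states, weights=trans[path[-1]])[0])
    return path


# Illustrative (made-up) parameters for k = 3 states and T = 10 occasions.
rng = random.Random(0)
init = [0.5, 0.3, 0.2]
trans = [
    [0.90, 0.10, 0.00],
    [0.05, 0.90, 0.05],
    [0.00, 0.10, 0.90],
]
path = simulate_latent_chain(init, trans, n_times=10, rng=rng)
print(path)
```

Because the same `trans` is used at every occasion, the simulated chain is homogeneous in time.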

In this work we consider homogeneous LMMs, namely LMMs in which, in agreement with the previous definition, the transition probability matrix is constant over time. Generalizations to non-homogeneous hidden Markov chains and time-varying transition probabilities could also be considered (Bartolucci and Farcomeni, 2009). Individual covariates can be included in the measurement model or in the latent model. When the covariates are included in the measurement model (Bartolucci and Farcomeni, 2009), they affect the response variables directly, and the latent process is conceived as a way to account for the unobserved heterogeneity between areas. By contrast, when the covariates are in the latent model (Vermunt and Magidson, 2002; Bartolucci, Pennoni and Francis, 2007), they influence the initial and transition probabilities of the latent process. In an SAE context, we consider the former approach, so that auxiliary information can be used to improve predictions. Bayesian inference approaches to LMMs are already available in the literature (e.g., Marin, Mengersen and Robert, 2005; Spezia, 2010). In the following section we illustrate how to incorporate an LMM into an area level SAE model.

4.2  Proposed approach to area level SAE

The proposed model is based on two levels in an HB framework: at the first level, a sampling error model is assumed; at the second level, an LMM is used as the linking model. The latter is based on two equations, corresponding to the measurement model and to the latent component. In particular, we adopt the following structure:

$$\hat{\boldsymbol{\theta}}_i \mid \boldsymbol{\theta}_i \sim N_T\left(\boldsymbol{\theta}_i, \boldsymbol{\Psi}_i\right), \quad i = 1, \ldots, m;$$

$$\theta_{it} \mid U_{it} = u, \mathbf{x}_{it} \sim N\left(\mathbf{x}_{it}' \boldsymbol{\beta}_u, \sigma_u^2\right), \quad i = 1, \ldots, m; \; t = 1, \ldots, T.$$

Here $\boldsymbol{\beta}_u$ is the $p \times 1$ vector of the regression coefficients for the latent state to which area $i$ at time $t$ belongs, $\sigma_u^2$ is the corresponding error variance, and $\boldsymbol{\Psi}_i$ is the matrix of sampling variances, which is assumed to be known.
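
The two levels can be read as a generative recipe, which the following Python sketch simulates for one area. It is a simplified illustration, not the authors' code: the sampling covariance matrix is assumed to be diagonal so that occasions can be drawn independently, the latent path is taken as given, and the function name and all numerical values are hypothetical.

```python
import random


def simulate_area(x, path, betas, sigma2, psi, rng):
    """Simulate true values and direct estimates for one area over T occasions.

    x[t]      : covariate vector for occasion t (list of p floats)
    path[t]   : latent state at occasion t, coded 0, ..., k-1
    betas[u]  : regression coefficients of state u (list of p floats)
    sigma2[u] : error variance of state u
    psi[t]    : sampling variance at occasion t (diagonal case, an assumption)
    """
    theta, theta_hat = [], []
    for t, u in enumerate(path):
        mean = sum(xj * bj for xj, bj in zip(x[t], betas[u]))
        # Linking model: the true value given the latent state and covariates.
        value = rng.gauss(mean, sigma2[u] ** 0.5)
        theta.append(value)
        # Sampling model: the direct estimate given the true value.
        theta_hat.append(rng.gauss(value, psi[t] ** 0.5))
    return theta, theta_hat


# Illustrative (made-up) inputs: p = 2 covariates, k = 2 states, T = 4 occasions.
rng = random.Random(42)
x = [[1.0, 0.2], [1.0, 0.3], [1.0, 0.1], [1.0, 0.4]]
path = [0, 0, 1, 1]
betas = [[0.03, 0.05], [0.10, 0.08]]
sigma2 = [0.0001, 0.0004]
psi = [0.0002, 0.0002, 0.0003, 0.0003]
theta, theta_hat = simulate_area(x, path, betas, sigma2, psi, rng)
```

Note that Normal draws are not constrained to the unit interval; the sketch only illustrates the hierarchical structure.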

It should be noted that, while in classical area level SAE models heterogeneity is modeled using continuous (usually Normally distributed) random variables, here it is modeled with a discrete dynamic variable. As Figure 4.1 shows, our data have a skewed distribution; however, the empirical distribution is not far from a Normal distribution. D’Alò, Di Consiglio, Falorsi, Ranalli and Solari (2012) show that the differences in estimates between adopting a Normal or a Binomial model are not as relevant as expected, and Normal models are often used for the estimation of unemployment rates (You et al., 2003; Boonstra, 2014). Finally, adopting the Normal distribution has computational advantages, which are clarified later in this section.

Figure 4.1. Kernel density plot of the direct estimates of unemployment incidence (N = 20,122; bandwidth = 0.002937). Density, from 0 to 20, is on the y-axis and the unemployment estimates, from 0 to 0.35, are on the x-axis. The distribution is right-skewed, with a peak at an unemployment estimate of about 0.025.

The model parameters of interest can be divided into three groups:

$$\boldsymbol{\Theta} = \begin{pmatrix} \theta_{11} & \cdots & \theta_{1T} \\ \vdots & \ddots & \vdots \\ \theta_{m1} & \cdots & \theta_{mT} \end{pmatrix}; \tag{4.1}$$

$$\boldsymbol{\phi}_{\text{obs}} = \left(\boldsymbol{\beta}_1', \ldots, \boldsymbol{\beta}_k', \sigma_1^2, \ldots, \sigma_k^2\right)';$$

$$\boldsymbol{\phi}_{\text{lat}} = \left\{\boldsymbol{\pi}, \boldsymbol{\Pi}\right\}.$$

To complete the Bayesian formulation of the proposed model, it is necessary to choose priors for the model parameters. Small area parameters do not need a specific prior because direct estimates based on observed data are available; therefore, a set of priors is chosen for the measurement and the latent parameters. Regarding $\boldsymbol{\phi}_{\text{obs}}$, diffuse normal priors are assumed for the regression coefficients. These priors are conjugate and computationally more convenient than the usual flat priors over the real line (see Rao, 2003, Chapter 10). In particular, we assume

$$\boldsymbol{\beta}_u \sim N_p\left(\boldsymbol{\eta}_0, \boldsymbol{\Sigma}_0\right), \quad u = 1, \ldots, k,$$

with $\boldsymbol{\Sigma}_0 = \sigma_u^2 \boldsymbol{\Lambda}_0^{-1}$, where $\boldsymbol{\Lambda}_0$ is a known diagonal matrix.

Variances $\sigma_u^2$, $u = 1, \ldots, k$, are unknown and, therefore, a prior must also be set on these parameters. The choice of the prior distribution for the variance components is critical because, in Bayesian mixed models, the posterior distributions of these parameters are known to be sensitive to this specification. The inverse Gamma distribution is a popular choice; see, e.g., You et al. (2003) and Datta, Lahiri, Maiti and Lu (1999), among others. Gelman (2006), Gelman, Jakulin, Pittau and Su (2008), and Polson and Scott (2012) propose to assume a half-Cauchy distribution for the variance of the random effect. Alternatively, a Uniform distribution can also be considered. Fabrizi et al. (2016) conduct an exhaustive sensitivity analysis when using a latent class model in a multivariate setting and find no significant difference among these alternatives. For this reason, we choose the same prior distribution considered in You et al. (2003) and use an inverse Gamma distribution with shape parameter $a_0$ and scale parameter $b_0$; that is, $\sigma_u^2 \sim \text{IG}(a_0, b_0)$, $u = 1, \ldots, k$, where $a_0, b_0 > 0$ are set to very small values. This choice also makes it easier to derive the full conditional distributions for the Gibbs sampler.
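
To illustrate why these conjugate choices are convenient, the following Python sketch computes the inverse Gamma full conditional of a state variance given the current regression coefficients and residuals of that state. It is a standard conjugate calculation under the priors stated above (Normal prior on the coefficients with covariance proportional to the state variance, inverse Gamma prior on the variance) and is given only as a sketch; the full conditionals actually used by the authors are those of Appendix A.1, which may be organised differently. All names and numerical values are hypothetical.

```python
import random


def sigma2_full_conditional(resid, beta, eta0, lambda0_diag, a0, b0):
    """Return (shape, scale) of the inverse Gamma full conditional of one state variance.

    resid        : residuals (true value minus linear predictor) currently in the state
    beta, eta0   : current coefficients of the state and their prior mean (length p)
    lambda0_diag : diagonal of the known prior precision factor
    a0, b0       : prior shape and scale of the inverse Gamma distribution
    """
    ssr = sum(r * r for r in resid)
    # Quadratic form of the coefficient prior, for a diagonal precision factor.
    quad = sum(l * (b - e) ** 2 for l, b, e in zip(lambda0_diag, beta, eta0))
    shape = a0 + 0.5 * (len(resid) + len(beta))
    scale = b0 + 0.5 * (ssr + quad)
    return shape, scale


def draw_inverse_gamma(shape, scale, rng):
    """If X is Gamma with the given shape and rate equal to scale, 1 / X is IG(shape, scale)."""
    # random.gammavariate takes (shape, scale of the Gamma), hence 1 / scale here.
    return 1.0 / rng.gammavariate(shape, 1.0 / scale)


# Illustrative (made-up) values.
rng = random.Random(1)
shape, scale = sigma2_full_conditional(
    resid=[0.010, -0.020, 0.015],
    beta=[0.05, 0.10],
    eta0=[0.0, 0.0],
    lambda0_diag=[0.001, 0.001],
    a0=0.001,
    b0=0.001,
)
sigma2_draw = draw_inverse_gamma(shape, scale, rng)
```

The update only adds counts and sums of squares to the prior parameters, which is the practical benefit of conjugacy inside a Gibbs sampler.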

For $\boldsymbol{\phi}_{\text{lat}}$, a system of Dirichlet priors is set on the initial probabilities and on the transition probabilities. The Dirichlet distribution is a conjugate prior for the multinomial distribution: if the prior distribution of the multinomial parameters is Dirichlet, then the posterior distribution belongs to the same family. The benefit of this choice is that the posterior distribution is easy to compute and, in some sense, it is possible to quantify how much our beliefs have changed after collecting the data. We therefore assume

$$\begin{aligned} \boldsymbol{\pi} &\sim \text{Dirichlet}\left(\mathbf{1}_k\right), \\ \boldsymbol{\pi}_{\bar{u}} = \left(\pi_{1 \mid \bar{u}}, \ldots, \pi_{k \mid \bar{u}}\right)' &\sim \text{Dirichlet}\left(\mathbf{1}_k\right), \quad \bar{u} = 1, \ldots, k. \end{aligned}$$

4.3  Estimation and model selection

In this work we make use of a data augmentation Markov chain Monte Carlo (MCMC) method (Tanner and Wong, 1987; Liu, Wong and Kong, 1994; Van Dyk and Meng, 2001) based on the Gibbs sampler, in which the latent variables are treated as missing data (Marin et al., 2005; Germain, 2010). There are two main reasons for this choice. First, there is evidence that data augmentation performs better than other methods, such as the marginal updating scheme (Boys and Henderson, 2003). Second, it simplifies the process of sampling from the posterior distribution. Details on this method and the full conditionals employed in the Gibbs sampler are given in Appendix A.1.
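
In a data augmentation scheme, one step of each iteration imputes the latent states given the current values of all other quantities. The Python sketch below shows one common way to do this for a single area, forward filtering followed by backward sampling. This technique is named here as an assumption: the update actually used by the authors is described in Appendix A.1 and may differ. The function names are hypothetical, states are coded 0, ..., k-1, `means[t][u]` stands for the linear predictor of state u at occasion t, and `trans[prev][cur]` stores the transition probabilities.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def sample_latent_path(theta, means, variances, init, trans, rng):
    """Draw the latent path of one area given its current true values theta."""
    states = list(range(len(init)))
    # Forward pass: filtered state probabilities given theta up to each occasion.
    filt = []
    for t, value in enumerate(theta):
        if t == 0:
            pred = init
        else:
            pred = [sum(filt[-1][v] * trans[v][u] for v in states) for u in states]
        w = [pred[u] * normal_pdf(value, means[t][u], variances[u]) for u in states]
        total = sum(w)
        filt.append([wi / total for wi in w])
    # Backward pass: draw the last state, then each earlier state given the next one.
    path = [0] * len(theta)
    path[-1] = rng.choices(states, weights=filt[-1])[0]
    for t in range(len(theta) - 2, -1, -1):
        w = [filt[t][v] * trans[v][path[t + 1]] for v in states]
        path[t] = rng.choices(states, weights=w)[0]
    return path


# Illustrative (made-up) values with two well separated states.
rng = random.Random(3)
theta = [0.0, 0.0, 10.0, 10.0]
means = [[0.0, 10.0]] * 4
drawn = sample_latent_path(theta, means, [0.01, 0.01], [0.5, 0.5],
                           [[0.9, 0.1], [0.1, 0.9]], rng)
```

Once the paths are imputed, the remaining parameters can be updated as if the states were observed, which is what makes the treatment of latent variables as missing data convenient.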

The choice of the number of latent states is a crucial step in applications. In the framework of LMMs, this requires a model selection procedure. From a Bayesian perspective, a fundamental goal is the computation of the marginal likelihood of the data for a given model. In this paper we use a model selection method based on the marginal likelihood; to estimate this quantity, we use the method proposed by Carlin and Chib (1995), applied to each candidate model on the basis of the output of the MCMC algorithm. Technical details are provided in Appendix A.2.
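
Once a log marginal likelihood is available for each candidate number of states, the comparison itself is simple. The Python sketch below converts log marginal likelihoods into posterior model probabilities. It does not implement the estimator of the marginal likelihood described in Appendix A.2; equal prior model probabilities are assumed by default, and the numerical values are made-up placeholders, not results from the paper.

```python
import math


def posterior_model_probs(log_marglik, prior=None):
    """Posterior model probabilities from log marginal likelihoods.

    log_marglik : one log marginal likelihood per candidate model
    prior       : prior model probabilities (equal by default, an assumption)
    """
    n = len(log_marglik)
    prior = prior or [1.0 / n] * n
    logs = [l + math.log(p) for l, p in zip(log_marglik, prior)]
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    top = max(logs)
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical log marginal likelihoods for models with k = 2, 3, 4 latent states.
placeholder_values = [-1520.4, -1498.7, -1501.2]
probs = posterior_model_probs(placeholder_values)
best_k = [2, 3, 4][probs.index(max(probs))]
```

Working on the log scale matters in practice, because marginal likelihoods of long panels are far too small to be represented directly.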

A well-known problem occurring in Bayesian latent class models and LMMs is label switching: the component parameters are not identifiable because they are exchangeable. In a Bayesian context, if the prior distribution does not distinguish the component parameters from one another, then the resulting posterior distribution is invariant with respect to permutations of the labels. Several solutions have been proposed; for a general review see Jasra, Holmes and Stephens (2005). The easiest approach is to use relabeling techniques retrospectively, by post-processing the MCMC output (Marin et al., 2005). However, in our case, we are interested in the prediction of the small area parameters, whose distribution depends on the number of areas in each latent state. Therefore, we do not use the post-processing approach; instead, the MCMC output is permuted at every iteration according to the ordering of the mean of the response variables in each class.
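
The within-iteration permutation can be sketched as follows in Python: states are reordered by increasing mean of the response in each class, and the same permutation is applied to every state-indexed quantity. This is an illustrative sketch, not the authors' implementation; the function name, the dictionary layout and the toy values are hypothetical, states are coded 0, ..., k-1, and transition probabilities are stored as `trans[prev][cur]`.

```python
def relabel_by_mean(state_means, init, trans, betas, sigma2, paths):
    """Permute all state-indexed quantities so that states are ordered by mean.

    state_means[u] : mean of the response among units currently in state u
    """
    # order[new] = old label that becomes state `new` after sorting by mean.
    order = sorted(range(len(state_means)), key=lambda u: state_means[u])
    new_label = {old: new for new, old in enumerate(order)}
    return {
        "init": [init[u] for u in order],
        # Rows and columns of the transition matrix are permuted together.
        "trans": [[trans[v][u] for u in order] for v in order],
        "betas": [betas[u] for u in order],
        "sigma2": [sigma2[u] for u in order],
        "paths": [[new_label[u] for u in path] for path in paths],
    }


# Illustrative (made-up) draw in which the two labels are swapped.
relabelled = relabel_by_mean(
    state_means=[0.30, 0.10],
    init=[0.7, 0.3],
    trans=[[0.9, 0.1], [0.2, 0.8]],
    betas=[[0.25, 0.02], [0.08, 0.01]],
    sigma2=[0.0004, 0.0001],
    paths=[[0, 1, 1], [1, 1, 0]],
)
```

Applying the permutation to the latent paths as well keeps the state memberships of the areas consistent with the relabelled parameters within the same iteration.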

