A mixed latent class Markov approach for estimating labour market mobility with multiple indicators and retrospective interrogation
Section 2. The latent class Markov model

Latent class analysis has been applied in a number of studies on panel data to separate true changes from observed ones affected by unreliable measurements. Relatively recent contributions include Bassi, Torelli and Trivellato (1998), Biemer and Bushery (2000), Bassi, Croon, Hagenaars and Vermunt (2000), and Bassi and Trivellato (2009).

The true labour force state is treated as a latent variable and the observed one as its indicator. The model consists of two parts:

structural, describing true dynamics among latent variables;
measurement, linking each latent variable to its indicator(s).

Let us consider the simplest formulation of latent class Markov (LCM) models (Wiggins 1973), which assumes that true unobservable transitions follow a first-order Markov chain. As in all standard LCM specifications, local independence among indicators is assumed, i.e., indicators are independent conditionally on latent variables. In the LCM model with one indicator per latent variable, the assumption of local independence coincides with the Independent Classification Errors condition.

Let $X_{i t}$ denote the true labour force condition at time $t$ for a generic sample individual $i$, $i = 1, \dots, n$, and let $Y_{i t}$ be the corresponding observed condition. $P (X_{i 1} = l_{1})$ is the probability of the initial state of the latent Markov chain, and $P (X_{i t + 1} = l_{t + 1} | X_{i t} = l_{t})$ is the transition probability between state $l_{t}$ and state $l_{t + 1}$ from time $t$ to $t + 1$, with $t = 1, \dots, T - 1$, where $T$ is the total number of consecutive, equally spaced time points over which an individual is observed. In addition, $P (Y_{i t} = j_{t} | X_{i t} = l_{t})$ is the probability of observing state $j_{t}$ at time $t$, given that individual $i$ is in the true state $l_{t}$ at time $t$; this is also called the measurement component of the model.

It follows that $P (Y_{i 1} = j_{1}, \dots, Y_{i T} = j_{T})$ is the proportion of units observed in a generic cell of the $T$-way contingency table. For a generic sample individual $i$, an LCM model is defined as:

$\begin{array}{rl} P (Y_{i} = y) = & \sum_{l_{1} = 1}^{K} \dots \sum_{l_{T} = 1}^{K} P (X_{i 1} = l_{1}) \\ & \times \prod_{t = 2}^{T} P (X_{i t} = l_{t} | X_{i t - 1} = l_{t - 1}) \\ & \times \prod_{t = 1}^{T} P (Y_{i t} = j_{t} | X_{i t} = l_{t}) \qquad (2.1) \end{array}$

where $y$ is the vector containing the observed values for individual $i$, and $l_{t}$ and $j_{t}$ vary over $K$ classes (in our application, three labour force conditions). Equation (2.1) specifies the proportion of units in the generic cell of a $T$-way contingency table as a sum, over all possible latent paths, of products of marginal and conditional probabilities.
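As a minimal sketch of how equation (2.1) can be evaluated, the following Python code computes $P (Y_{i} = y)$ in two ways: by summing over all $K^{T}$ latent paths exactly as the formula is written, and by the standard forward recursion, which gives the same value with far fewer operations. The function names and all numerical values are illustrative assumptions, not estimates from the paper; states are coded 0, 1, 2.

```python
from itertools import product

def lcm_prob_bruteforce(y, init, trans, emis):
    """P(Y = y) by summing over all K**T latent paths, as in equation (2.1)."""
    K, T = len(init), len(y)
    total = 0.0
    for path in product(range(K), repeat=T):
        p = init[path[0]]                      # initial state probability
        for t in range(1, T):
            p *= trans[path[t - 1]][path[t]]   # latent transitions
        for t in range(T):
            p *= emis[path[t]][y[t]]           # measurement component
        total += p
    return total

def lcm_prob_forward(y, init, trans, emis):
    """Same probability via the forward recursion (order T * K**2 operations)."""
    K = len(init)
    alpha = [init[l] * emis[l][y[0]] for l in range(K)]
    for t in range(1, len(y)):
        alpha = [sum(alpha[k] * trans[k][l] for k in range(K)) * emis[l][y[t]]
                 for l in range(K)]
    return sum(alpha)

# Made-up parameters for K = 3 labour force states (illustration only).
init = [0.6, 0.1, 0.3]
trans = [[0.90, 0.04, 0.06],
         [0.30, 0.50, 0.20],
         [0.05, 0.05, 0.90]]
emis = [[0.97, 0.01, 0.02],    # emis[l][j] = P(Y_t = j | X_t = l)
        [0.05, 0.85, 0.10],
        [0.02, 0.03, 0.95]]
y = [0, 0, 1, 2]               # one observed sequence over T = 4 occasions

p_bruteforce = lcm_prob_bruteforce(y, init, trans, emis)
p_forward = lcm_prob_forward(y, init, trans, emis)
```

Summing `lcm_prob_forward` over all $3^{4}$ possible observed sequences returns 1, which is a quick check that the cell probabilities of the $T$-way table are properly normalised.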

In an LCM model with concomitant variables, latent class membership and latent transitions are expressed as functions of covariates with known distributions (Dayton and McReady 1988). $P (X_{i 1} = l_{1} | Z_{i 1} = z_{1}),$ where $z_{1}$ is a vector containing the values of covariates for respondent $i$ at time 1, captures covariate effects on the initial state, and $P (X_{i t} = l_{t} | X_{i t - 1} = l_{t - 1}, Z_{i t} = z_{t}),$ where $z_{t}$ is a vector containing the values of covariates for respondent $i$ at time $t,$ captures covariate effects on latent transitions.

On the basis of the above components, the complete model for individual $i$ is given by:

$\begin{array}{rl} P (Y_{i} = y | Z_{i} = z) = & \sum_{l_{1} = 1}^{K} \dots \sum_{l_{T} = 1}^{K} P (X_{i 1} = l_{1} | Z_{i 1} = z_{1}) \\ & \times \prod_{t = 2}^{T} P (X_{i t} = l_{t} | X_{i t - 1} = l_{t - 1}, Z_{i t} = z_{t}) \\ & \times \prod_{t = 1}^{T} P (Y_{i t} = j_{t} | X_{i t} = l_{t}) \qquad (2.2) \end{array}$
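One common way to let covariates enter the transition probabilities of model (2.2) is a multinomial logit for each origin state. The sketch below assumes that form, with "remaining in the same state" as the reference category; the function names, the two hypothetical covariates and all coefficient values are assumptions made for illustration and are not taken from the paper.

```python
import math

def softmax(scores):
    """Turn a list of linear predictors into probabilities that sum to one."""
    m = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def transition_row(prev_state, z, intercepts, slopes):
    """P(X_t = l | X_{t-1} = prev_state, Z_t = z) for every destination l."""
    K = len(intercepts[prev_state])
    scores = [intercepts[prev_state][l]
              + sum(b * v for b, v in zip(slopes[prev_state][l], z))
              for l in range(K)]
    return softmax(scores)

# Hypothetical coefficients for K = 3 states and two covariates.
# The diagonal (no change of state) is the reference category, fixed at zero.
intercepts = [[0.0, -3.0, -2.5],
              [-0.5, 0.0, -1.0],
              [-2.5, -3.0, 0.0]]
slopes = [[[0.0, 0.0], [0.4, -0.2], [0.1, 0.6]],
          [[0.3, 0.1], [0.0, 0.0], [-0.2, 0.5]],
          [[0.2, -0.4], [0.1, 0.1], [0.0, 0.0]]]

z_t = [0.5, 1.0]                          # covariate values for one respondent at time t
row_from_state_0 = transition_row(0, z_t, intercepts, slopes)
baseline_row = transition_row(0, [0.0, 0.0], intercepts, slopes)
```

Each call returns one row of the transition matrix for a given covariate profile, so respondents with different values of $z_{t}$ have different transition matrices while sharing the same coefficients.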

When more than one indicator per latent variable is observed ($M$ indicators, say), the model formulation becomes the following (Vermunt 2010):

$\begin{array}{rl} P (Y_{i} = y | Z_{i} = z) = & \sum_{l_{1} = 1}^{K} \dots \sum_{l_{T} = 1}^{K} P (X_{i 1} = l_{1} | Z_{i 1} = z_{1}) \\ & \times \prod_{t = 2}^{T} P (X_{i t} = l_{t} | X_{i t - 1} = l_{t - 1}, Z_{i t} = z_{t}) \\ & \times \prod_{m = 1}^{M} \prod_{t = 1}^{T} P (Y_{m i t} = j_{t} | X_{i t} = l_{t}) \qquad (2.3) \end{array}$

In our application, the $M$ indicators are given by the three pieces of information collected for all respondents on their labour market condition.
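Under local independence, the only change that several indicators bring to the computation is that the measurement term at each occasion becomes a product over the $M$ indicators. The following sketch illustrates this with $M = 3$ indicators and, for brevity, without covariates; the function names and all probabilities are made-up assumptions rather than values from the application.

```python
from itertools import product

def joint_emission(state, y_t, emis):
    """prod over m of P(Y_mt = y_t[m] | X_t = state), by local independence."""
    p = 1.0
    for m, j in enumerate(y_t):
        p *= emis[m][state][j]
    return p

def lcm_prob_multi(y, init, trans, emis):
    """Forward recursion where y[t] is the tuple of M indicator values at time t."""
    K = len(init)
    alpha = [init[l] * joint_emission(l, y[0], emis) for l in range(K)]
    for y_t in y[1:]:
        alpha = [sum(alpha[k] * trans[k][l] for k in range(K))
                 * joint_emission(l, y_t, emis) for l in range(K)]
    return sum(alpha)

# Made-up parameters: K = 3 states, M = 3 indicators with different error rates.
init = [0.6, 0.1, 0.3]
trans = [[0.90, 0.04, 0.06],
         [0.30, 0.50, 0.20],
         [0.05, 0.05, 0.90]]
emis = [  # emis[m][l][j] = P(Y_mt = j | X_t = l)
    [[0.97, 0.01, 0.02], [0.05, 0.85, 0.10], [0.02, 0.03, 0.95]],
    [[0.95, 0.02, 0.03], [0.10, 0.80, 0.10], [0.03, 0.02, 0.95]],
    [[0.90, 0.04, 0.06], [0.08, 0.82, 0.10], [0.05, 0.05, 0.90]],
]

# The three indicators agree at time 1 and disagree at time 2.
p_example = lcm_prob_multi([(0, 0, 0), (0, 1, 0)], init, trans, emis)

# Normalisation check over all response patterns for T = 2 occasions.
cells = list(product(range(3), repeat=3))
total = sum(lcm_prob_multi([a, b], init, trans, emis) for a in cells for b in cells)
```

Disagreement among indicators at the same occasion is what allows the model to separate classification error from true change.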

Typically, conditional probabilities are parameterised and restricted by logistic regression models. The parameters are estimated via maximum likelihood (Vermunt and Magidson 2013). Identification is a well-known problem in models with latent variables: the number of independent parameters must not exceed the number of observed frequencies, but this condition is necessary, not sufficient. According to Goodman (1974), a sufficient condition for local identifiability is that the information matrix is positive definite. The Latent Gold software (Vermunt and Magidson 2008) provides information on parameter identification. Another problem linked to estimation is that of local maxima; to deal with it, we estimated our models several times with different sets of starting values.
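The counting condition mentioned above can be illustrated for the plain model (2.1), without covariates and with otherwise unrestricted probabilities. The sketch below compares the number of independent cells of the $T$-way table with the number of free parameters; the function names and the choice between time-homogeneous and time-varying parameters are assumptions made for the illustration, and passing this check does not by itself establish identification.

```python
def independent_cells(K, T):
    """Independent observed frequencies in the T-way table with K categories."""
    return K ** T - 1

def lcm_free_parameters(K, T, homogeneous=True):
    """Free parameters of model (2.1) with unrestricted probabilities.

    homogeneous=True: one transition matrix and one measurement matrix
    shared by all occasions; otherwise one per transition / per occasion.
    """
    initial = K - 1
    n_transition_matrices = 1 if homogeneous else T - 1
    n_measurement_matrices = 1 if homogeneous else T
    per_matrix = K * (K - 1)   # K rows, each with K - 1 free probabilities
    return initial + (n_transition_matrices + n_measurement_matrices) * per_matrix

cells_T4 = independent_cells(3, 4)                           # 80
params_T4_hom = lcm_free_parameters(3, 4)                    # 14
params_T4_het = lcm_free_parameters(3, 4, homogeneous=False) # 44
cells_T2 = independent_cells(3, 2)                           # 8
params_T2_hom = lcm_free_parameters(3, 2)                    # 14: fails the condition
```

With three states and only two occasions, the model already has more parameters than independent frequencies, so further restrictions or additional indicators would be needed.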

A mixed LCM model assumes that the population contains groups that are not directly observable and that move across time following latent chains with different initial state probabilities and different transition probabilities; the groups may also be assumed to have different response probabilities (van de Pol and Langeheine 1990). Such a model can be extended to include time-varying and time-constant covariates (Vermunt, Tran and Magidson 2008). A special case of a two-class mixed LCM model is the mover-stayer model: the group of movers has positive probabilities of transferring from one state to another over time, whereas the group of stayers does not change state. For the latter, transition probabilities between different states are constrained to zero. A two-class mixed LCM model with concomitant variables has the following form:

$\begin{array}{rl} P (Y_{i} = y | Z_{i} = z) = & \sum_{w = 1}^{2} \sum_{l_{1} = 1}^{K} \dots \sum_{l_{T} = 1}^{K} P (W = w) P (X_{i 1} = l_{1} | Z_{i 1} = z_{1}, W = w) \\ & \times \prod_{t = 2}^{T} P (X_{i t} = l_{t} | X_{i t - 1} = l_{t - 1}, Z_{i t} = z_{t}, W = w) \\ & \times \prod_{t = 1}^{T} P (Y_{i t} = j_{t} | X_{i t} = l_{t}, W = w) \qquad (2.4) \end{array}$

where $W$ is a binary latent variable indexing the two groups. The mover-stayer model is obtained by assuming that, for the stayers ($W = 2$), $P (X_{i t} = l_{t} | X_{i t - 1} = l_{t - 1} , W = 2) = 0$ for $l_{t} \neq l_{t - 1}$ and, consequently, $P (X_{i t} = l_{t} | X_{i t - 1} = l_{t - 1} , W = 2) = 1$ for $l_{t} = l_{t - 1}.$
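The mover-stayer restriction amounts to giving the stayer class an identity transition matrix and mixing the two class-specific likelihoods. The sketch below shows this without covariates; the function names and parameter values are illustrative assumptions. Note that stayers can still display observed changes when their responses are affected by measurement error.

```python
from itertools import product

def forward(y, init, trans, emis):
    """P(Y = y) for one latent chain via the forward recursion."""
    K = len(init)
    alpha = [init[l] * emis[l][y[0]] for l in range(K)]
    for t in range(1, len(y)):
        alpha = [sum(alpha[k] * trans[k][l] for k in range(K)) * emis[l][y[t]]
                 for l in range(K)]
    return sum(alpha)

def identity(K):
    """Stayer transition matrix: probability 1 of remaining in the same state."""
    return [[1.0 if k == l else 0.0 for l in range(K)] for k in range(K)]

def mover_stayer_prob(y, class_probs, inits, mover_trans, emis):
    """Mixture over W: index 0 = movers (W = 1), index 1 = stayers (W = 2)."""
    chains = [mover_trans, identity(len(mover_trans))]
    return sum(class_probs[w] * forward(y, inits[w], chains[w], emis[w])
               for w in range(2))

# Made-up parameters; initial and response probabilities may differ by class.
class_probs = [0.4, 0.6]
inits = [[0.5, 0.2, 0.3], [0.7, 0.02, 0.28]]
mover_trans = [[0.80, 0.08, 0.12],
               [0.30, 0.50, 0.20],
               [0.10, 0.10, 0.80]]
noisy = [[0.97, 0.01, 0.02], [0.05, 0.85, 0.10], [0.02, 0.03, 0.95]]
emis = [noisy, noisy]

p_change = mover_stayer_prob([0, 1, 0], class_probs, inits, mover_trans, emis)

# With error-free measurement, a pure stayer population never shows a change.
perfect = identity(3)
p_stayer_change = mover_stayer_prob([0, 1, 0], [0.0, 1.0], inits, mover_trans,
                                    [perfect, perfect])

# Normalisation check over all observed sequences for T = 3.
total = sum(mover_stayer_prob(list(c), class_probs, inits, mover_trans, emis)
            for c in product(range(3), repeat=3))
```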

An LC model can also be estimated by maximum likelihood when information is missing in the response variables. We exploit this opportunity to take into account the response patterns generated by the survey rotation design. Sampled households are interviewed for two consecutive quarters, do not participate in the survey for the subsequent two quarters, and are then re-interviewed on two further occasions (see Table 3.1). We assume that the information missing because of the survey design is missing at random. In this case, each unit contributes to the likelihood function only with the information available (Vermunt 1997).
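A minimal sketch of how a unit can contribute "only with the information available": at occasions with no interview, the measurement term is simply omitted (set to 1) while the latent chain keeps evolving. The code below applies this to the two-in, two-out, two-in rotation pattern, coding a missing occasion as `None`; the function names and parameter values are assumptions for illustration, not the authors' implementation.

```python
from itertools import product

def emission_factor(state, obs, emis):
    """Measurement term; equal to 1 when the occasion is unobserved (None)."""
    return 1.0 if obs is None else emis[state][obs]

def lcm_prob_with_missing(y, init, trans, emis):
    """Forward recursion that skips the measurement term at missing occasions."""
    K = len(init)
    alpha = [init[l] * emission_factor(l, y[0], emis) for l in range(K)]
    for obs in y[1:]:
        alpha = [sum(alpha[k] * trans[k][l] for k in range(K))
                 * emission_factor(l, obs, emis) for l in range(K)]
    return sum(alpha)

# Made-up parameters for K = 3 states.
init = [0.6, 0.1, 0.3]
trans = [[0.90, 0.04, 0.06],
         [0.30, 0.50, 0.20],
         [0.05, 0.05, 0.90]]
emis = [[0.97, 0.01, 0.02],
        [0.05, 0.85, 0.10],
        [0.02, 0.03, 0.95]]

# Six quarters: interviewed in quarters 1-2, out in 3-4, interviewed in 5-6.
y_rotation = [0, 1, None, None, 1, 2]
p_available = lcm_prob_with_missing(y_rotation, init, trans, emis)

# Equivalent computation: marginalise the complete-data probability over
# every possible response in the two unobserved quarters.
p_marginal = sum(
    lcm_prob_with_missing([0, 1, a, b, 1, 2], init, trans, emis)
    for a, b in product(range(3), repeat=2)
)
```

The two quantities coincide, which is why omitting the measurement term is the same as summing the complete-data probability over the unobserved responses.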

ISSN : 1492-0921

Editorial policy

Survey Methodology publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves. All papers will be refereed. However, the authors retain full responsibility for the contents of their papers and opinions expressed are not necessarily those of the Editorial Board or of Statistics Canada.

Submission of Manuscripts

Survey Methodology is published twice a year in electronic format. Authors are invited to submit their articles in English or French in electronic form, preferably in Word, to the Editor (statcan.smj-rte.statcan@canada.ca, Statistics Canada, 150 Tunney’s Pasture Driveway, Ottawa, Ontario, Canada, K1A 0T6). For formatting instructions, please see the guidelines provided in the journal and on the web site (www.statcan.gc.ca/SurveyMethodology).

Copyright

Published by authority of the Minister responsible for Statistics Canada.

Use of this publication is governed by the Statistics Canada Open Licence Agreement.

Catalogue No. 12-001-X

Frequency: semi-annual

Ottawa

Date modified: 2017-06-22
