A mixed latent class Markov approach for estimating labour market mobility with multiple indicators and retrospective interrogation
Section 2. The latent class Markov model
Latent class analysis has been applied in a number of studies on panel data to separate true changes from observed ones affected by unreliable measurements. Relatively recent contributions include Bassi, Torelli and Trivellato (1998), Biemer and Bushery (2000), Bassi, Croon, Hagenaars and Vermunt (2000), and Bassi and Trivellato (2009).
The true labour force state is treated as a latent variable and the observed one as its indicator. The model consists of two parts:
- structural, describing true dynamics among latent variables;
- measurement, linking each latent variable to its indicator(s).
Let us consider the simplest formulation of latent class Markov (LCM) models (Wiggins 1973), which assumes that true unobservable transitions follow a first-order Markov chain. As in all standard LCM specifications, local independence among indicators is assumed, i.e., indicators are independent conditionally on latent variables. In the LCM model with one indicator per latent variable, the assumption of local independence coincides with the Independent Classification Errors condition.
Let $X_{it}$ denote the true labour force condition at time $t$ for a generic sample individual $i$; $Y_{it}$ is the corresponding observed condition; $P(X_{i1} = x_1)$ is the probability of the initial state of the latent Markov chain, and $P(X_{it} = x_t \mid X_{i,t-1} = x_{t-1})$ is the transition probability between state $x_{t-1}$ and state $x_t$ from time $t-1$ to $t$, with $t = 2, \ldots, T$, where $T$ represents the total number of consecutive, equally spaced time-points over which an individual is observed. In addition, $P(Y_{it} = y_{it} \mid X_{it} = x_t)$ is the probability of observing state $y_{it}$ at time $t$ given that individual $i$ at time $t$ is in the true state $x_t$; this is also called the model measurement component.
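It follows that $P(\mathbf{Y}_i = \mathbf{y}_i)$ is the proportion of units observed in a generic cell of the $T$-way contingency table. For a generic sample individual $i$, an LCM model is defined as:

$$P(\mathbf{Y}_i = \mathbf{y}_i) = \sum_{x_1=1}^{K} \cdots \sum_{x_T=1}^{K} P(X_{i1} = x_1) \prod_{t=2}^{T} P(X_{it} = x_t \mid X_{i,t-1} = x_{t-1}) \prod_{t=1}^{T} P(Y_{it} = y_{it} \mid X_{it} = x_t) \qquad (2.1)$$

where $\mathbf{y}_i$ is the vector containing observed values for individual $i$, and $x_1, \ldots, x_T$ vary over $K$ classes (in our application, three labour force conditions). Equation (2.1) specifies the proportion of units in the generic cell of a $T$-way contingency table as a product of marginal and conditional probabilities.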
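As an illustration of how the sum in equation (2.1) can be evaluated, the sketch below computes the cell probability both by brute-force enumeration of all latent paths and by the standard forward recursion, which gives the same value at much lower cost. The function names and all numerical values are hypothetical, and transition probabilities are assumed time-homogeneous purely to keep the example short.

```python
from itertools import product


def lcm_probability(y, init, trans, meas):
    """P(Y_i = y) for a one-indicator LCM model, via the forward recursion.

    init[x]       : P(X_1 = x)
    trans[x0][x1] : P(X_t = x1 | X_{t-1} = x0), assumed constant over t
    meas[x][j]    : P(Y_t = j | X_t = x)
    """
    K = len(init)
    alpha = [init[x] * meas[x][y[0]] for x in range(K)]
    for obs in y[1:]:
        alpha = [
            sum(alpha[x0] * trans[x0][x1] for x0 in range(K)) * meas[x1][obs]
            for x1 in range(K)
        ]
    return sum(alpha)


def lcm_probability_bruteforce(y, init, trans, meas):
    """Same quantity, summing the product in (2.1) over every latent path."""
    K, T = len(init), len(y)
    total = 0.0
    for path in product(range(K), repeat=T):
        p = init[path[0]]
        for t in range(1, T):
            p *= trans[path[t - 1]][path[t]]
        for t in range(T):
            p *= meas[path[t]][y[t]]
        total += p
    return total


# Illustrative values only (three states; not estimates from the paper).
INIT = [0.6, 0.1, 0.3]
TRANS = [[0.90, 0.04, 0.06], [0.30, 0.50, 0.20], [0.05, 0.05, 0.90]]
MEAS = [[0.97, 0.01, 0.02], [0.05, 0.85, 0.10], [0.02, 0.03, 0.95]]

y_example = (0, 0, 1, 0)
print(lcm_probability(y_example, INIT, TRANS, MEAS))
print(lcm_probability_bruteforce(y_example, INIT, TRANS, MEAS))
```

The two functions agree up to floating-point error; the recursion is the one to use in practice, since the number of latent paths grows as $K^T$.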
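In an LCM model with concomitant variables, latent class membership and latent transitions are expressed as functions of covariates with known distributions (Dayton and McReady 1988). The initial-state probability is written as $P(X_{i1} = x_1 \mid \mathbf{z}_{i1})$, where $\mathbf{z}_{i1}$ is a vector containing the values of covariates for respondent $i$ at time 1 and the associated parameters estimate covariate effects on the initial state, and the transition probability is written as $P(X_{it} = x_t \mid X_{i,t-1} = x_{t-1}, \mathbf{z}_{it})$, where $\mathbf{z}_{it}$ is a vector containing the values of covariates for respondent $i$ at time $t$ and the associated parameters estimate covariate effects on latent transitions.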
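One common way to make initial-state and transition probabilities depend on covariates is a multinomial logit. The sketch below assumes that parameterisation; the source does not spell out its exact functional form here, and the function names, coefficient layout and numbers are all hypothetical.

```python
import math


def multinomial_logit(z, intercepts, slopes):
    """Category probabilities from a multinomial logit (assumed form).

    intercepts[k] and slopes[k] belong to category k; fixing them at zero
    for one category makes it the reference category.
    """
    scores = [
        a + sum(b * v for b, v in zip(bs, z))
        for a, bs in zip(intercepts, slopes)
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def transition_matrix(z, intercepts_by_origin, slopes_by_origin):
    """One logit per origin state gives one row of the transition matrix."""
    return [
        multinomial_logit(z, a, b)
        for a, b in zip(intercepts_by_origin, slopes_by_origin)
    ]


# Hypothetical covariate vector and coefficients for three states.
z_i1 = [1.0, 0.0]
intercepts = [0.0, -1.0, 0.5]
slopes = [[0.0, 0.0], [0.4, -0.2], [-0.3, 0.1]]
print(multinomial_logit(z_i1, intercepts, slopes))
```

The same function serves both components: applied once it gives the initial-state distribution, and applied once per origin state it gives a covariate-dependent transition matrix.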
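On the basis of the above components, the complete model for individual $i$ is given by:

$$P(\mathbf{Y}_i = \mathbf{y}_i \mid \mathbf{z}_i) = \sum_{x_1=1}^{K} \cdots \sum_{x_T=1}^{K} P(X_{i1} = x_1 \mid \mathbf{z}_{i1}) \prod_{t=2}^{T} P(X_{it} = x_t \mid X_{i,t-1} = x_{t-1}, \mathbf{z}_{it}) \prod_{t=1}^{T} P(Y_{it} = y_{it} \mid X_{it} = x_t)$$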
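When more than one indicator per latent variable is observed, say $H$ indicators $Y_{i1t}, \ldots, Y_{iHt}$, the model formulation becomes the following (Vermunt 2010):

$$P(\mathbf{Y}_i = \mathbf{y}_i \mid \mathbf{z}_i) = \sum_{x_1=1}^{K} \cdots \sum_{x_T=1}^{K} P(X_{i1} = x_1 \mid \mathbf{z}_{i1}) \prod_{t=2}^{T} P(X_{it} = x_t \mid X_{i,t-1} = x_{t-1}, \mathbf{z}_{it}) \prod_{t=1}^{T} \prod_{h=1}^{H} P(Y_{iht} = y_{iht} \mid X_{it} = x_t)$$

In our application, the $H = 3$ indicators are given by the three pieces of information collected for all respondents on their labour market condition.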
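Under local independence, the measurement part for several indicators is simply a product over indicators within each occasion. A minimal sketch follows, with hypothetical names and made-up values, using two latent states and three indicators so that the check at the end stays small.

```python
from itertools import product


def response_probability(responses, state, meas_by_indicator):
    """P(all indicators at one occasion | latent state), by local independence.

    meas_by_indicator[h][x][j] : P(indicator h shows j | X_t = x)
    """
    p = 1.0
    for h, j in enumerate(responses):
        p *= meas_by_indicator[h][state][j]
    return p


def multi_indicator_lcm_probability(Y, init, trans, meas_by_indicator):
    """Forward recursion where each occasion carries a tuple of indicators."""
    K = len(init)
    alpha = [
        init[x] * response_probability(Y[0], x, meas_by_indicator)
        for x in range(K)
    ]
    for responses in Y[1:]:
        alpha = [
            sum(alpha[x0] * trans[x0][x1] for x0 in range(K))
            * response_probability(responses, x1, meas_by_indicator)
            for x1 in range(K)
        ]
    return sum(alpha)


# Illustrative values only: 2 latent states, 3 indicators, 2 occasions.
INIT2 = [0.7, 0.3]
TRANS2 = [[0.9, 0.1], [0.2, 0.8]]
MEAS3 = [
    [[0.95, 0.05], [0.10, 0.90]],
    [[0.90, 0.10], [0.15, 0.85]],
    [[0.85, 0.15], [0.05, 0.95]],
]

patterns = list(product(range(2), repeat=3))
total = sum(
    multi_indicator_lcm_probability((a, b), INIT2, TRANS2, MEAS3)
    for a in patterns
    for b in patterns
)
print(total)  # cell probabilities over all response patterns sum to one
```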
Typically, conditional probabilities are parameterised and restricted by logistic regression models. The parameters are estimated via maximum likelihood (Vermunt and Magidson 2013). Identification is a well-known problem in models with latent variables: the number of independent parameters must not exceed the number of observed frequencies, but this necessary condition is not sufficient. According to Goodman (1974), a sufficient condition for local identifiability is that the information matrix is positive definite. The Latent Gold software (Vermunt and Magidson 2008) provides information on parameter identification. Another estimation problem is that of local maxima; to deal with it, we estimated our models several times with different sets of starting values.
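The multiple-starts strategy can be sketched generically: run a local optimiser from several starting values and keep the fit with the highest objective. The toy objective below merely stands in for a log-likelihood with two local maxima; it is not the LCM likelihood, and the optimiser is a plain gradient ascent rather than the algorithm used by Latent Gold. All names are hypothetical.

```python
def toy_loglik(x):
    """Toy stand-in for a log-likelihood with two local maxima (near -1 and +1)."""
    return -(x * x - 1.0) ** 2 + 0.3 * x


def local_ascent(objective, start, step=0.01, n_iter=5000, eps=1e-6):
    """Plain gradient ascent with a central-difference gradient."""
    x = start
    for _ in range(n_iter):
        grad = (objective(x + eps) - objective(x - eps)) / (2 * eps)
        x += step * grad
    return x, objective(x)


def multistart(objective, starts):
    """Run the local optimiser from each start and keep the best solution."""
    fits = [local_ascent(objective, s) for s in starts]
    return max(fits, key=lambda fit: fit[1])


single_x, single_value = local_ascent(toy_loglik, -1.5)  # ends in the lower maximum
best_x, best_value = multistart(toy_loglik, [-1.5, -0.5, 0.5, 1.5])
print(single_x, single_value)
print(best_x, best_value)
```

A single run started on the left converges to the lower of the two maxima, while the multi-start run recovers the higher one, which is the behaviour that repeated estimation with different starting values is meant to guard against.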
A mixed LCM model assumes the existence in the
population of not directly observable groups moving across time, following
latent chains with different initial state probabilities and different
transition probabilities; the groups may also be assumed to have different
response probabilities (van de Pol and Langeheine 1990). Such a model can be
extended to include time-varying and time-constant covariates (Vermunt, Tran
and Magidson 2008). A special case of a two-class mixed LCM model is the
mover-stayer model: the group of movers has positive probabilities of
transferring from one state to another over time, whereas the group of stayers does
not change state. For the latter, transition probabilities between different states
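are imposed as zero. A two-class mixed LCM model with concomitant variables has the following form:

$$P(\mathbf{Y}_i = \mathbf{y}_i \mid \mathbf{z}_i) = \sum_{w=1}^{2} P(W_i = w) \sum_{x_1=1}^{K} \cdots \sum_{x_T=1}^{K} P(X_{i1} = x_1 \mid W_i = w, \mathbf{z}_{i1}) \prod_{t=2}^{T} P(X_{it} = x_t \mid X_{i,t-1} = x_{t-1}, W_i = w, \mathbf{z}_{it}) \prod_{t=1}^{T} \prod_{h=1}^{H} P(Y_{iht} = y_{iht} \mid X_{it} = x_t)$$

where $W_i$ is a binary latent variable and $h = 1, \ldots, H$ indexes the indicators. The mover-stayer model is obtained assuming, for the stayer class (say $w = 2$), $P(X_{it} = x_t \mid X_{i,t-1} = x_{t-1}, W_i = 2, \mathbf{z}_{it}) = 0$ for $x_t \neq x_{t-1}$ and, consequently, $P(X_{it} = x_t \mid X_{i,t-1} = x_{t-1}, W_i = 2, \mathbf{z}_{it}) = 1$ for $x_t = x_{t-1}$.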
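The mover-stayer restriction amounts to replacing the stayers' transition matrix with the identity matrix and mixing the two chains. The sketch below does this for one indicator and no covariates, which is a simplification of the model described above; names and values are hypothetical.

```python
from itertools import product


def forward_probability(y, init, trans, meas):
    """P(Y = y) for a single latent chain, via the forward recursion."""
    K = len(init)
    alpha = [init[x] * meas[x][y[0]] for x in range(K)]
    for obs in y[1:]:
        alpha = [
            sum(alpha[x0] * trans[x0][x1] for x0 in range(K)) * meas[x1][obs]
            for x1 in range(K)
        ]
    return sum(alpha)


def mover_stayer_probability(y, p_mover, init_m, trans_m, init_s, meas):
    """Two-class mixture: movers follow trans_m, stayers never change state."""
    K = len(init_s)
    # Stayers: off-diagonal transition probabilities fixed at 0, diagonal at 1.
    identity = [[1.0 if a == b else 0.0 for b in range(K)] for a in range(K)]
    movers = forward_probability(y, init_m, trans_m, meas)
    stayers = forward_probability(y, init_s, identity, meas)
    return p_mover * movers + (1.0 - p_mover) * stayers


# Illustrative values only.
INIT_M = [0.5, 0.2, 0.3]
TRANS_M = [[0.80, 0.10, 0.10], [0.30, 0.50, 0.20], [0.10, 0.10, 0.80]]
INIT_S = [0.7, 0.0, 0.3]
MEAS_MS = [[0.97, 0.01, 0.02], [0.05, 0.85, 0.10], [0.02, 0.03, 0.95]]
PERFECT = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(mover_stayer_probability((0, 0, 1), 0.4, INIT_M, TRANS_M, INIT_S, MEAS_MS))
```

With error-free measurement, a pure stayer population assigns zero probability to any sequence that changes state; with measurement error, observed changes among stayers are attributed entirely to classification error.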
An LC model can also be estimated by maximum likelihood when information is missing in the response variables. We exploit this property to take into account the response patterns generated by the survey rotation design. Sampled households are interviewed for two consecutive quarters, do not participate in the survey for the subsequent two quarters, and are then re-interviewed on two further occasions (see Table 3.1). We assumed that information missing because of the survey design is missing at random. In this case, each unit contributes to the likelihood function only with the information available (Vermunt 1997).
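Under the missing-at-random assumption, an unobserved occasion can be handled by letting the latent chain move forward without multiplying by a measurement probability at that occasion. A minimal sketch for one indicator follows, using `None` to mark the two quarters in which a household is out of the sample; the names and numerical values are hypothetical.

```python
def lcm_probability_with_gaps(y, init, trans, meas):
    """Likelihood contribution of one unit; None marks an occasion not observed."""
    K = len(init)

    def emit(x, obs):
        # A missing occasion contributes a factor of 1 (it is summed out).
        return 1.0 if obs is None else meas[x][obs]

    alpha = [init[x] * emit(x, y[0]) for x in range(K)]
    for obs in y[1:]:
        alpha = [
            sum(alpha[x0] * trans[x0][x1] for x0 in range(K)) * emit(x1, obs)
            for x1 in range(K)
        ]
    return sum(alpha)


# Illustrative values only.
INIT_G = [0.6, 0.1, 0.3]
TRANS_G = [[0.90, 0.04, 0.06], [0.30, 0.50, 0.20], [0.05, 0.05, 0.90]]
MEAS_G = [[0.97, 0.01, 0.02], [0.05, 0.85, 0.10], [0.02, 0.03, 0.95]]

# Rotation pattern: in for two quarters, out for two, in for two.
rotated = (0, 0, None, None, 1, 1)
print(lcm_probability_with_gaps(rotated, INIT_G, TRANS_G, MEAS_G))
```

The result equals the sum of the complete-data cell probabilities over every possible value of the unobserved occasions, which is what "contributing only with the information available" means in practice.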
ISSN : 1492-0921
Copyright
Published by authority of the Minister responsible for Statistics Canada.
© Minister of Industry, 2017
Use of this publication is governed by the Statistics Canada Open Licence Agreement.
Catalogue No. 12-001-X
Frequency: semi-annual
Ottawa