Model-assisted calibration of non-probability sample survey data using adaptive LASSO
Section 2. Calibration
2.1 Traditional calibration
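For an analytical sample $S_A$ (the sample which requires weight calibration) of size $n_A$, drawn from sample design $p_A$ with design weights $d_i^A$ and diagonal matrix of design weights $\mathbf{D}^A = \operatorname{diag}(d_1^A, \ldots, d_{n_A}^A)$, the calibrated weights $w_i$ minimize a distance measure

$$E_{p_A}\left[\sum_{i \in S_A} \frac{G(w_i, d_i^A)}{q_i}\right] \qquad (2.1)$$

under the constraint:

$$\sum_{i \in S_A} w_i \mathbf{x}_i = \mathbf{T}_x, \qquad (2.2)$$

where $E_{p_A}$ is expectation with respect to the analytic (probability) design; $G(w, d)$ is a differentiable function with respect to $w$, strictly convex on an interval containing $d$, and $G(d, d) = 0$; and where $\mathbf{T}_x$ is a row vector of known population totals of the sample calibration variables $\mathbf{x}_i$ (Deville and Särndal, 1992). The constant $q_i$ is independent of the design weight $d_i^A$.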
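The commonly used generalized regression (GREG) estimator uses the chi-square distance:

$$G(w_i, d_i^A) = \frac{(w_i - d_i^A)^2}{2 d_i^A}$$

with $q_i = 1$. Under this distance measure:

$$w_i^{GREG} = d_i^A\left[1 + (\mathbf{T}_x - \hat{\mathbf{T}}_x^{HT})(\mathbf{X}^{\top}\mathbf{D}^A\mathbf{X})^{-1}\mathbf{x}_i^{\top}\right], \qquad (2.3)$$

where $\mathbf{X}$ is the matrix whose rows are the sample vectors $\mathbf{x}_i$ and $\hat{\mathbf{T}}_x^{HT} = \sum_{i \in S_A} d_i^A \mathbf{x}_i$ is the design-weighted estimate of $\mathbf{T}_x$.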
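The step from the chi-square distance to the weights in equation (2.3) is a standard Lagrange-multiplier argument. The sketch below works with a realized sample and $q_i = 1$, writing $d_i$ for the design weight, $\mathbf{x}_i$ for the row vector of calibration variables, $\mathbf{X}$ for the matrix with rows $\mathbf{x}_i$, and $\hat{\mathbf{T}}_x^{HT} = \sum_{i \in S_A} d_i \mathbf{x}_i$. The multiplier $\boldsymbol{\lambda}$ (a column vector) is notation introduced here for illustration, not a symbol from the source.

```latex
\begin{align*}
\mathcal{L}(\mathbf{w}, \boldsymbol{\lambda})
  &= \sum_{i \in S_A} \frac{(w_i - d_i)^2}{2 d_i}
     - \Bigl(\sum_{i \in S_A} w_i \mathbf{x}_i - \mathbf{T}_x\Bigr)\boldsymbol{\lambda}, \\
\frac{\partial \mathcal{L}}{\partial w_i} = 0
  &\;\Longrightarrow\; w_i = d_i\,(1 + \mathbf{x}_i \boldsymbol{\lambda}), \\
\sum_{i \in S_A} w_i \mathbf{x}_i = \mathbf{T}_x
  &\;\Longrightarrow\;
  \boldsymbol{\lambda}^{\top}
  = (\mathbf{T}_x - \hat{\mathbf{T}}_x^{HT})
    \Bigl(\sum_{i \in S_A} d_i\, \mathbf{x}_i^{\top}\mathbf{x}_i\Bigr)^{-1}
  = (\mathbf{T}_x - \hat{\mathbf{T}}_x^{HT})
    (\mathbf{X}^{\top}\mathbf{D}\mathbf{X})^{-1}, \\
w_i &= d_i\Bigl[1 + (\mathbf{T}_x - \hat{\mathbf{T}}_x^{HT})
       (\mathbf{X}^{\top}\mathbf{D}\mathbf{X})^{-1}\mathbf{x}_i^{\top}\Bigr].
\end{align*}
```

The last line uses the fact that the scalar $\mathbf{x}_i\boldsymbol{\lambda}$ equals $\boldsymbol{\lambda}^{\top}\mathbf{x}_i^{\top}$.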
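The estimate of the population total of an outcome $y$ is based on the calibrated weights:

$$\hat{T}_y^{GREG} = \sum_{i \in S_A} w_i^{GREG} y_i = \hat{T}_y^{HT} + (\mathbf{T}_x - \hat{\mathbf{T}}_x^{HT})\hat{\mathbf{B}}^{GREG},$$

where $\hat{T}_y^{HT} = \sum_{i \in S_A} d_i^A y_i$ is the standard (weighted) design-based estimator and $\hat{\mathbf{B}}^{GREG}$ is the weighted least squares estimate of the linear regression of $y$ on $\mathbf{x}$ given weights $d_i^A$. (This corresponds to the poststratified estimator when $\mathbf{T}_x$ consists entirely of cell totals for categorical variables.) The calibrated weights defined in equation (2.3) do not rely on any outcome variable. Thus the same set of weights can be applied to all variables in the survey. Note that GREG assumes a working model that is linear.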
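The GREG weights of equation (2.3) can be computed directly. The following is a minimal, dependency-free sketch, assuming exactly two calibration variables per unit (an intercept and one covariate) so that the linear system is 2x2. The function names `solve_2x2` and `greg_weights` and the toy sample are hypothetical and are not taken from the source.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a z = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if abs(det) < 1e-12:
        raise ValueError("X'DX is singular; calibration variables are collinear")
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det]


def greg_weights(x, d, t_x):
    """Chi-square-distance (q_i = 1) calibrated weights, as in equation (2.3).

    x   : list of (x_i0, x_i1) rows of calibration variables
    d   : list of design weights d_i
    t_x : known population totals (T_x0, T_x1)
    """
    # X'DX = sum_i d_i x_i' x_i, a symmetric 2x2 matrix
    xdx = [[sum(di * xi[r] * xi[c] for xi, di in zip(x, d)) for c in range(2)]
           for r in range(2)]
    # Design-weighted estimates of the calibration totals
    t_hat = [sum(di * xi[k] for xi, di in zip(x, d)) for k in range(2)]
    gap = [t_x[k] - t_hat[k] for k in range(2)]
    # lam = (X'DX)^{-1} (T_x - T_hat)'; valid because X'DX is symmetric
    lam = solve_2x2(xdx, gap)
    return [di * (1.0 + xi[0] * lam[0] + xi[1] * lam[1]) for xi, di in zip(x, d)]


# Hypothetical toy sample (not from the source): intercept plus one covariate.
x = [(1.0, 2.0), (1.0, 3.0), (1.0, 5.0), (1.0, 7.0), (1.0, 8.0)]
d = [10.0, 10.0, 20.0, 20.0, 40.0]
t_x = (110.0, 650.0)  # assumed known totals: population size and covariate total

w = greg_weights(x, d, t_x)
calibrated_totals = [sum(wi * xi[k] for wi, xi in zip(w, x)) for k in range(2)]
print("weights:", [round(wi, 3) for wi in w])
print("calibrated totals:", [round(t, 6) for t in calibrated_totals])
```

No outcome variable enters `greg_weights`, which is why one set of weights can serve every survey variable. The chi-square distance does not constrain the weights to be positive, so a production implementation would normally check for negative or extreme weights.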
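Although $\hat{T}_y^{GREG}$ is asymptotically design-unbiased for the population total $T_y$ when the relationship between $y$ and $\mathbf{x}$ is non-linear, such as in the case when $y$ is binary, the design variance of $\hat{T}_y^{GREG}$ can be larger than the design variance of the standard design-based estimator $\hat{T}_y^{HT} = \sum_{i \in S_A} d_i^A y_i$.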
2.2 Model-assisted calibration
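Model-assisted calibration estimators can have a significant advantage over $\hat{T}_y^{GREG}$ because model-assisted calibration allows non-linear models to assist in the construction of calibrated weights. In model-assisted calibration, we assume a relationship between an outcome $y$ and $\mathbf{x}$ through the first two moments (Wu and Sitter, 2001):

$$E_{\xi}(y_i \mid \mathbf{x}_i) = \mu(\mathbf{x}_i, \boldsymbol{\beta}), \qquad V_{\xi}(y_i \mid \mathbf{x}_i) = v_i^2 \sigma^2,$$

where $\boldsymbol{\beta}$ and $\sigma^2$ are unknown superpopulation parameters, $\mu(\mathbf{x}, \boldsymbol{\beta})$ is a known function of $\mathbf{x}$ and $\boldsymbol{\beta}$, and $v_i$ is a known function of $\mathbf{x}_i$ or $\mu(\mathbf{x}_i, \boldsymbol{\beta})$; $E_{\xi}$ and $V_{\xi}$ are expectation and variance with respect to the model $\xi$.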
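Let $\mathbf{B}$ be the finite population (or census) estimate of $\boldsymbol{\beta}$ (i.e., the quasilikelihood estimator of $\boldsymbol{\beta}$ based on the entire finite population), and $\hat{\mu}_i = \mu(\mathbf{x}_i, \hat{\mathbf{B}})$, where $\hat{\mathbf{B}}$ is the sample estimate of $\mathbf{B}$. The model-assisted calibrated weights $w_i^{MC}$ then minimize a distance measure $E_{p_A}\left[\sum_{i \in S_A} G(w_i, d_i^A)/q_i\right]$ under the constraints

$$\sum_{i \in S_A} w_i^{MC} = N \quad \text{and} \quad \sum_{i \in S_A} w_i^{MC} \hat{\mu}_i = \sum_{i=1}^{N} \hat{\mu}_i,$$

where $N$ is the population size.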
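The main conceptual difference between traditional calibration and model-assisted calibration is that in model-assisted calibration, the constraints are based on two quantities: (1) the population size $N$, and (2) the population total of the predicted values $\hat{\mu}_i$. In traditional calibration, the constraint is a vector of population totals of $\mathbf{x}$ (see equation (2.2)). Under the chi-square distance measure with $q_i = 1$, the model-assisted calibrated weights are:

$$\mathbf{w}^{MC} = \mathbf{d}^A + \mathbf{D}^A \mathbf{M}\left(\mathbf{M}^{\top}\mathbf{D}^A\mathbf{M}\right)^{-1}\left(\mathbf{T}_M - (\mathbf{d}^A)^{\top}\mathbf{M}\right)^{\top},$$

where $\mathbf{T}_M = \left(N, \sum_{i=1}^{N}\hat{\mu}_i\right)$ and $\mathbf{M}$ is the $n_A \times 2$ matrix with rows $(1, \hat{\mu}_i)$, $i \in S_A$.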
(In the non-probability setting, the vector of design weights $\mathbf{d}^A$ can be replaced with [...].)
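The estimate for the population total based on the model-assisted calibrated weights is then:

$$\hat{T}_y^{MC} = \sum_{i \in S_A} w_i^{MC} y_i = \hat{T}_y^{HT} + \left(N - \sum_{i \in S_A} d_i^A,\; \sum_{i=1}^{N}\hat{\mu}_i - \sum_{i \in S_A} d_i^A \hat{\mu}_i\right)\hat{\mathbf{B}}^{MC},$$

where $\hat{\mathbf{B}}^{MC}$, the $d_i^A$-weighted least squares coefficient vector from regressing $y_i$ on $(1, \hat{\mu}_i)$, is the calibration slope to satisfy the calibration constraints (different from the model parameter estimates $\hat{\mathbf{B}}$). Unbiasedness and small variance of $\hat{T}_y^{MC}$ both rely on how well $\hat{\mu}_i$ approximates the true expected value of $y_i$.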
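The model-assisted calibration steps can be sketched end to end: fit a working model on the sample, predict $\hat{\mu}_i$ for every population unit, calibrate on the population size and the total of the predictions, then weight the outcome. The example below is a minimal, dependency-free illustration under several assumptions that are not from the source: a logistic working model with one covariate, a covariate known for every population unit, constant starting weights $N/n_A$, and hypothetical toy data and function names.

```python
import math


def solve_2x2(a, b):
    """Solve the 2x2 linear system a z = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if abs(det) < 1e-12:
        raise ValueError("singular 2x2 system")
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det]


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, d, iters=25):
    """Weighted logistic regression of y on (1, x) by Newton-Raphson."""
    b0 = b1 = 0.0
    for _ in range(iters):
        mu = [expit(b0 + b1 * xi) for xi in x]
        # Weighted score vector
        score = [sum(di * (yi - mi) for di, yi, mi in zip(d, y, mu)),
                 sum(di * (yi - mi) * xi for di, yi, mi, xi in zip(d, y, mu, x))]
        # Weighted information matrix (symmetric)
        v = [di * mi * (1.0 - mi) for di, mi in zip(d, mu)]
        off = sum(vi * xi for vi, xi in zip(v, x))
        info = [[sum(v), off],
                [off, sum(vi * xi * xi for vi, xi in zip(v, x))]]
        step = solve_2x2(info, score)
        b0, b1 = b0 + step[0], b1 + step[1]
    return b0, b1


def mc_weights(mu_s, d, t_m):
    """Chi-square-distance weights calibrated to T_M = (N, sum of mu_hat over U)."""
    m = [(1.0, mi) for mi in mu_s]  # rows of M = [1, mu_hat]
    mdm = [[sum(di * row[r] * row[c] for row, di in zip(m, d)) for c in range(2)]
           for r in range(2)]
    gap = [t_m[k] - sum(di * row[k] for row, di in zip(m, d)) for k in range(2)]
    lam = solve_2x2(mdm, gap)  # (M'DM)^{-1} (T_M - d'M)'
    return [di * (1.0 + row[0] * lam[0] + row[1] * lam[1])
            for row, di in zip(m, d)]


# Hypothetical toy data (not from the source). The sample over-represents
# large x relative to the population, as a non-probability sample might.
pop_x = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 6]
x_s = [1, 2, 3, 4, 5, 6]
y_s = [0, 0, 1, 0, 1, 1]
N, n = len(pop_x), len(x_s)
d = [N / n] * n  # assumed constant starting weights

b0, b1 = fit_logistic(x_s, y_s, d)
mu_s = [expit(b0 + b1 * xi) for xi in x_s]      # predictions on the sample
mu_pop = [expit(b0 + b1 * xi) for xi in pop_x]  # predictions on the population
t_m = (N, sum(mu_pop))

w = mc_weights(mu_s, d, t_m)
t_y_mc = sum(wi * yi for wi, yi in zip(w, y_s))
print("calibrated weights:", [round(wi, 3) for wi in w])
print("model-calibrated total of y:", round(t_y_mc, 3))
```

In this sketch the calibration matrix contains a column of ones, so with constant starting weights the chi-square solution does not depend on the constant chosen: starting from 1 or from $N/n_A$ gives the same calibrated weights. That is a property of this particular setup and should not be read as a statement about what the source prescribes.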
ISSN : 1492-0921
Copyright
Published by authority of the Minister responsible for Statistics Canada.
© Her Majesty the Queen in Right of Canada as represented by the Minister of Industry, 2018
Use of this publication is governed by the Statistics Canada Open Licence Agreement.
Catalogue No. 12-001-X
Frequency: Semi-annual
Ottawa