Results

All (65) (30 to 40 of 65 results)

  • Articles and reports: 12-001-X200700210488
    Description:

    Calibration is the principal theme in many recent articles on estimation in survey sampling. Terms such as "calibration approach" and "calibration estimators" are frequently used. As the authors of these articles like to point out, calibration provides a systematic way to incorporate auxiliary information into the procedure.

    Calibration has established itself as an important methodological instrument in large-scale production of statistics. Several national statistical agencies have developed software designed to compute weights, usually calibrated to auxiliary information available in administrative registers and other accurate sources.

    This paper presents a review of the calibration approach, with an emphasis on progress achieved in the past decade or so. The literature on calibration is growing rapidly; selected issues are discussed in this paper. The paper starts with a definition of the calibration approach, and its important features are reviewed. The calibration approach is contrasted with (generalized) regression estimation, which is an alternative but conceptually different way to take auxiliary information into account. The computational aspects of calibration are discussed, including methods for avoiding extreme weights.

    In the early sections of the paper, simple applications of calibration are examined: the estimation of a population total in direct, single-phase sampling. Generalizations to more complex parameters and more complex sampling designs are then considered. A common feature of more complex designs (sampling in two or more phases or stages) is that the available auxiliary information may consist of several components or layers. The uses of calibration in such cases of composite information are reviewed. Later in the paper, examples are given to illustrate how the results of calibration thinking may contrast with the answers given by earlier established approaches. Finally, applications of calibration in the presence of nonsampling error are discussed, in particular methods for nonresponse bias adjustment.

    Release date: 2008-01-03
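
    The quadratic-distance case this literature builds on has a closed form: minimizing the chi-square distance between the design weights d_i and new weights w_i, subject to the calibration equation, gives w_i = d_i(1 + x_i'λ). A minimal sketch, with an invented function name and toy data rather than anything from the paper:

        import numpy as np

        def calibrate_weights(d, X, t_x):
            """Chi-square distance calibration (illustrative sketch).
            d: design weights (n,); X: auxiliary values (n, p);
            t_x: known population totals (p,). The returned weights
            satisfy the calibration equation X.T @ w == t_x."""
            T = X.T @ (d[:, None] * X)               # sum_i d_i x_i x_i'
            lam = np.linalg.solve(T, t_x - X.T @ d)
            return d * (1.0 + X @ lam)               # w_i = d_i (1 + x_i' lam)

        # Toy example: 5 sampled units, intercept plus one auxiliary variable.
        rng = np.random.default_rng(0)
        d = np.full(5, 20.0)                         # equal design weights, N = 100
        X = np.column_stack([np.ones(5), rng.uniform(1, 10, size=5)])
        t_x = np.array([100.0, 550.0])               # known N and x-total
        w = calibrate_weights(d, X, t_x)
        y = 2.0 * X[:, 1] + rng.normal(size=5)
        print(w @ y)                                 # calibrated estimate of the y-total

    Under this distance the calibration weights reproduce the GREG estimator, which is exactly the contrast the paper develops; other distance functions change the weights but not the asymptotic behaviour.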

  • Articles and reports: 12-001-X200700210489
    Description:

    Missingness may occur in various forms. In this paper, we consider unit non-response and attempt to adjust for it by appropriate weighting. Our empirical case concerns two-phase sampling: first, a large-sample survey was conducted using a fairly general questionnaire. At the end of this contact, the interviewer asked whether the respondent was willing to participate in a second-phase survey with a more detailed questionnaire concentrating on some themes of the first survey. This procedure leads to three missingness mechanisms. Our problem is how to weight the second-survey respondents as correctly as possible so that the results of this survey are consistent with those obtained in the first-phase survey. The paper first analyses the missingness differences across these three steps using a human survey dataset, and then compares different weighting approaches. Our recommendation is that all available auxiliary data should be used in the best possible way. This works well with a mixture of the two classic methods: first exploiting response-propensity weighting and then calibrating these weights to the known population distributions.

    Release date: 2008-01-03
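
    A schematic rendering of the recommended mixture, response-propensity weighting followed by calibration to known population counts; the data, the single post-stratification margin and every name below are invented for illustration:

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        # Phase-1 respondents: design weights d, covariates Z, and an
        # indicator r of willingness to answer the phase-2 questionnaire.
        rng = np.random.default_rng(1)
        n = 500
        d = np.full(n, 50.0)
        Z = rng.normal(size=(n, 2))
        r = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.3 + 0.8 * Z[:, 0]))))

        # Step 1: inverse response-propensity adjustment of the weights.
        # (penalty=None disables regularization; older scikit-learn
        # versions spell it penalty="none".)
        phat = LogisticRegression(penalty=None).fit(Z, r).predict_proba(Z)[:, 1]
        w = d[r == 1] / phat[r == 1]

        # Step 2: calibrate the adjusted weights to known population cell
        # counts. One categorical margin needs a single post-stratification
        # pass; several margins would be raked iteratively.
        g = (Z[r == 1, 1] > 0).astype(int)
        pop_counts = np.array([12500.0, 12500.0])    # assumed known totals
        for cell in (0, 1):
            w[g == cell] *= pop_counts[cell] / w[g == cell].sum()
        print(w.sum())                               # equals the population size, 25000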

  • Articles and reports: 12-001-X200700210498
    Description:

    In this paper we describe a methodology for combining a convenience sample with a probability sample in order to produce an estimator with a smaller mean squared error (MSE) than estimators based on only the probability sample. We then explore the properties of the resulting composite estimator, a linear combination of the convenience and probability sample estimators with weights that are a function of bias. We discuss the estimator's properties in the context of web-based convenience sampling. Our analysis demonstrates that the use of a convenience sample to supplement a probability sample for improvements in the MSE of estimation may be practical only under limited circumstances. First, the remaining bias of the estimator based on the convenience sample must be quite small, equivalent to no more than 0.1 of the outcome's population standard deviation. For a dichotomous outcome, this implies a bias of no more than five percentage points at 50 percent prevalence and no more than three percentage points at 10 percent prevalence. Second, the probability sample should contain at least 1,000-10,000 observations for adequate estimation of the bias of the convenience sample estimator. Third, it must be inexpensive and feasible to collect at least thousands (and probably tens of thousands) of web-based convenience observations. The conclusions about the limited usefulness of convenience samples with estimator bias of more than 0.1 standard deviations also apply to direct use of estimators based on that sample.

    Release date: 2008-01-03
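
    The heart of the method fits in a few lines: with independent estimators, an unbiased probability-sample estimator and a convenience-sample estimator with bias b, the MSE-minimizing mixing weight is λ = V_p / (V_p + V_c + b²). The numbers below are invented to mirror the dichotomous-outcome setting of the abstract:

        def composite(theta_p, var_p, theta_c, var_c, bias_c):
            """MSE-minimizing combination of an unbiased probability-sample
            estimator (variance var_p) and an independent convenience-sample
            estimator with variance var_c and bias bias_c."""
            lam = var_p / (var_p + var_c + bias_c ** 2)
            est = lam * theta_c + (1.0 - lam) * theta_p
            mse = lam ** 2 * (var_c + bias_c ** 2) + (1.0 - lam) ** 2 * var_p
            return est, mse

        # Dichotomous outcome near 50% prevalence: 1,000 probability cases,
        # 20,000 web cases, and a 3-point bias estimated, as it must be,
        # from the probability sample itself.
        theta_p, var_p = 0.50, 0.25 / 1000
        theta_c, var_c = 0.53, 0.25 / 20000
        est, mse = composite(theta_p, var_p, theta_c, var_c, theta_c - theta_p)
        print(est, mse, var_p)                       # mse < var_p: small bias helps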

  • Articles and reports: 75F0002M2007007
    Description:

    The Survey of Labour and Income Dynamics (SLID), introduced in the 1993 reference year, is a longitudinal panel survey of individuals. The purpose of the survey is to measure changes in the economic well-being of individuals and the factors that influence these changes. SLID's sample is divided into two overlapping panels, each six years in length. Longitudinal surveys like SLID are complex because of the dynamic nature of the sample, which in turn reflects the ever-changing composition of households and families over the years. For each reference year, SLID produces two sets of weights: one is representative of the initial population (the longitudinal weights), while the other is representative of the current population (the cross-sectional weights). Since 2002, SLID has been producing a third set of weights, which combines two overlapping panels to form a new longitudinal sample. These new weights are referred to as combined longitudinal weights.

    For the production of the cross-sectional weights, SLID combines two independent samples and assigns a probability of selection to individuals who joined the sample after the panel was selected. Like the cross-sectional weights, the longitudinal weights are adjusted for non-response and influential values. In addition, the sample is adjusted to make it representative of the target population. The purpose of this document is to describe SLID's methodology for the longitudinal and cross-sectional weights, and to present the problems encountered and the solutions proposed. For the purpose of illustration, results for the 2003 reference year are used. The methodology used to produce the combined longitudinal weights is not presented in this document, as a complete description is available in Naud (2004).

    Release date: 2007-10-18

  • Articles and reports: 11-522-X20050019450
    Description:

    Taylor linearization is generally applicable to any sampling design, but it can lead to multiple variance estimators that are all asymptotically design-unbiased under repeated sampling. Demnati and Rao (2004) proposed a new approach to deriving Taylor linearization variance estimators that leads directly to a unique variance estimator satisfying these properties for general designs.

    Release date: 2007-03-02
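
    As a concrete instance (not from the paper), take the ratio R = Y/X under simple random sampling without replacement: differentiating the estimator with respect to the weight of unit i, in the Demnati-Rao manner, gives z_i = (y_i - R̂x_i)/X̂, and the variance estimator is the usual total-variance formula applied to the z_i:

        import numpy as np

        def ratio_var_linearized(y, x, N):
            """Linearization variance estimator for R_hat = Y_hat / X_hat
            under SRSWOR with weights N/n; z_i is the derivative of R_hat
            with respect to the weight of unit i at the design weights."""
            n = len(y)
            X_hat, Y_hat = (N / n) * x.sum(), (N / n) * y.sum()
            R_hat = Y_hat / X_hat
            z = (y - R_hat * x) / X_hat
            v = N ** 2 * (1.0 - n / N) * z.var(ddof=1) / n
            return R_hat, v

        rng = np.random.default_rng(2)
        x = rng.uniform(1, 10, size=200)
        y = 3.0 * x + rng.normal(size=200)
        print(ratio_var_linearized(y, x, N=10000))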

  • Articles and reports: 12-001-X20060029547
    Description:

    Calibration weighting can be used to adjust for unit nonresponse and/or coverage errors under appropriate quasi-randomization models. Alternative calibration adjustments that are asymptotically identical in a purely sampling context can diverge when used in this manner. Introducing instrumental variables into calibration weighting makes it possible for nonresponse (say) to be a function of a set of characteristics other than those in the calibration vector. When the calibration adjustment has a nonlinear form, a variant of the jackknife can remove the need for iteration in variance estimation.

    Release date: 2006-12-21

  • Articles and reports: 12-001-X20060019255
    Description:

    In this paper, we consider the estimation of quantiles using the calibration paradigm. The proposed methodology relies on an approach similar to the one leading to the original calibration estimators of Deville and Särndal (1992). An appealing property of the new methodology is that it is not necessary to know the values of the auxiliary variables for all units in the population. It suffices instead to know the corresponding quantiles for the auxiliary variables. When the quadratic metric is adopted, an analytic representation of the calibration weights is obtained. In this situation, the weights are similar to those leading to the generalized regression (GREG) estimator. Variance estimation and construction of confidence intervals are discussed. In a small simulation study, a calibration estimator is compared to other popular estimators for quantiles that also make use of auxiliary information.

    Release date: 2006-07-20

  • Articles and reports: 12-001-X20060019261
    Description:

    Sample allocation can be optimized with respect to various goals. When there is more than one goal, a compromise allocation must be chosen. In the past, the Reverse Record Check achieved that compromise by having a certain fraction of the sample optimally allocated for each goal (for example, two thirds of the sample is allocated to produce good-quality provincial estimates, and one third to produce a good-quality national estimate). This paper suggests a method that involves selecting the maximum of two or more optimal allocations. By analyzing the impact that the precision of population estimates has on the federal government's equalization payments to the provinces, we can set four goals for the Reverse Record Check's provincial sample allocation. The Reverse Record Check's subprovincial sample allocation requires the smoothing of stratum-level parameters. This paper shows how calibration can be used to achieve this smoothing. The calibration problem and its solution do not assume that the calibration constraints have a solution. This avoids convergence problems inherent in related methods such as the raking ratio.

    Release date: 2006-07-20
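
    Read literally, "selecting the maximum of two or more optimal allocations" can be sketched as an element-wise maximum over per-goal allocations; the strata, variances and goals below are invented, and the cost increase implied by taking the maximum is left aside:

        import numpy as np

        def neyman(n_total, N_h, S_h):
            """Neyman allocation of n_total sample units across strata."""
            a = N_h * S_h
            return n_total * a / a.sum()

        # Four toy provinces. Goal A: equal-quality provincial estimates
        # (equal allocation). Goal B: the best national estimate (Neyman).
        N_h = np.array([1e6, 3e6, 8e6, 2e6])
        S_h = np.array([1.2, 0.8, 1.0, 1.5])
        alloc_A = np.full(4, 5000.0 / 4)
        alloc_B = neyman(5000, N_h, S_h)
        compromise = np.maximum(alloc_A, alloc_B)    # meets both goals, costs more
        print(np.round(compromise))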

  • Surveys and statistical programs – Documentation: 75F0002M2005009
    Description:

    The release of the 2003 data from the Survey of Labour and Income Dynamics (SLID) was accompanied by a historical revision which accomplished three things. First, the survey weights were updated to take into account new population projections based on the 2001 Census of Population, instead of the 1996 Census. Second, a new procedure in the weight adjustments was introduced to take into account an external source of information on the overall distribution of income in the population, namely the T4 file of employer remittances to the Canada Revenue Agency. Third, the low income estimates were revised due to new low income cut-offs (LICOs). This paper describes the second of these improvements: the new weighting procedure to reflect the distribution of income in the population with greater accuracy. Part 1 explains in non-technical terms how this new procedure came about and how it works. Part 2 provides some examples of the impacts on the results for previous years.

    Release date: 2005-07-22

  • Articles and reports: 12-001-X20050018092
    Description:

    When there is auxiliary information in survey sampling, the design-based "optimal (regression) estimator" of a finite population total/mean is known to be (at least asymptotically) more efficient than the corresponding GREG estimator. We illustrate this by some simulations with stratified sampling from skewed populations. The GREG estimator was originally constructed using an assisting linear superpopulation model. It may also be seen as a calibration estimator; that is, as a weighted linear estimator where the weights obey the calibration equation and, under that restriction, are as close as possible to the original "Horvitz-Thompson weights" (according to a suitable distance measure). We show that the optimal estimator can also be seen as a calibration estimator in this respect, with a quadratic distance measure closely related to the one generating the GREG estimator. Simple examples are also given, revealing that this new measure is not always easily obtained.

    Release date: 2005-07-21
Data (0) (0 results)

No content available at this time.

Analysis (63) (0 to 10 of 63 results)

  • Articles and reports: 12-001-X202300200005
    Description: Population undercoverage is one of the main hurdles faced by statistical analysis with non-probability survey samples. We discuss two typical scenarios of undercoverage, namely, stochastic undercoverage and deterministic undercoverage. We argue that existing estimation methods under the positivity assumption on the propensity scores (i.e., the participation probabilities) can be directly applied to handle the scenario of stochastic undercoverage. We explore strategies for mitigating biases in estimating the mean of the target population under deterministic undercoverage. In particular, we examine a split-population approach based on a convex hull formulation, and construct estimators with reduced biases. A doubly robust estimator can be constructed if a follow-up subsample of the reference probability survey with measurements on the study variable becomes feasible. The performances of six competing estimators are investigated through a simulation study, and issues that require further investigation are briefly discussed.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200016
    Description: In this discussion, I will present some additional aspects of three major areas of survey theory developed or studied by Jean-Claude Deville: calibration, balanced sampling and the generalized weight-share method.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200017
    Description: Jean-Claude Deville, who passed away in October 2021, was one of the most influential researchers in the field of survey statistics over the past 40 years. This article traces some of his contributions that have had a profound impact on both survey theory and practice. It covers balanced sampling using the cube method, calibration, the weight-sharing method, the development of variance expressions for complex estimators using influence functions, and quota sampling.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202200200002
    Description:

    We provide a critical review and some extended discussions on theoretical and practical issues with analysis of non-probability survey samples. We attempt to present rigorous inferential frameworks and valid statistical procedures under commonly used assumptions, and address issues on the justification and verification of assumptions in practical applications. Some current methodological developments are showcased, and problems which require further investigation are mentioned. While the focus of the paper is on non-probability samples, the essential role of probability survey samples with rich and relevant information on auxiliary variables is highlighted.

    Release date: 2022-12-15

  • Articles and reports: 12-001-X202200200011
    Description:

    Two-phase sampling is a cost-effective sampling design employed extensively in surveys. In this paper, a method of most efficient linear estimation of totals in two-phase sampling is proposed, which optimally exploits auxiliary survey information. First, a best linear unbiased estimator (BLUE) of any total is formally derived in analytic form and shown to be also a calibration estimator. Then, a proper reformulation of this BLUE and estimation of its unknown coefficients leads to the construction of an "optimal" regression estimator, which can also be obtained through a suitable calibration procedure. A distinctive feature of this calibration is the alignment of estimates from the two phases in a one-step procedure involving the combined first- and second-phase samples. Optimal estimation is feasible for certain two-phase designs that are used often in large-scale surveys. For general two-phase designs, an alternative calibration procedure gives a generalized regression estimator as an approximate optimal estimator. The proposed general approach to optimal estimation leads to the most effective use of the available auxiliary information in any two-phase survey. The advantages of this approach over existing methods of estimation in two-phase sampling are shown both theoretically and through a simulation study.

    Release date: 2022-12-15

  • Articles and reports: 89-648-X2022001
    Description:

    This report explores the size and nature of the attrition challenges faced by the Longitudinal and International Study of Adults (LISA) survey, as well as the use of a non-response weight adjustment and calibration strategy to mitigate the effects of attrition on the LISA estimates. The study focuses on data from waves 1 (2012) to 4 (2018) and uses practical examples based on selected demographic variables to illustrate how attrition can be assessed and treated.

    Release date: 2022-11-14

  • Articles and reports: 12-001-X202100200005
    Description:

    Variance estimation is a challenging problem in surveys because several non-trivial factors contribute to the total survey error, including sampling and unit non-response. Initially devised to capture the variance of non-trivial statistics based on independent and identically distributed data, the bootstrap method has since been adapted in various ways to address survey-specific factors. In this paper we look into one of those variants, the with-replacement bootstrap. We consider household surveys, with or without sub-sampling of individuals. We make explicit the benchmark variance estimators that the with-replacement bootstrap aims at reproducing. We explain how the bootstrap can be used to account for the impact that sampling, treatment of non-response and calibration have on total survey error. For clarity, the proposed methods are illustrated on a running example. They are evaluated through a simulation study and applied to a French Panel for Urban Policy. Two SAS macros to perform the bootstrap methods are also developed.

    Release date: 2022-01-06
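
    A stripped-down with-replacement bootstrap for a weighted mean, assuming invented data and ignoring the stratum structure, the n_h - 1 resampling convention and the re-application of non-response and calibration adjustments that a production implementation would carry through:

        import numpy as np

        def wr_bootstrap_var(y, w, B=2000, seed=0):
            """With-replacement bootstrap variance of the weighted mean."""
            rng = np.random.default_rng(seed)
            n = len(y)
            stats = np.empty(B)
            for b in range(B):
                idx = rng.integers(0, n, size=n)     # resample with replacement
                stats[b] = np.average(y[idx], weights=w[idx])
            return stats.var(ddof=1)

        rng = np.random.default_rng(3)
        y = rng.normal(50, 10, size=300)
        w = rng.uniform(20, 80, size=300)            # toy survey weights
        print(wr_bootstrap_var(y, w))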

  • Articles and reports: 11-522-X202100100014
    Description: Recent developments in questionnaire administration modes and data extraction have favored the use of nonprobability samples, which are often affected by selection bias arising from the lack of a sample design or from self-selection of the participants. This bias can be addressed by several adjustments, whose applicability depends on the type of auxiliary information available. Calibration weighting can be used when only population totals of auxiliary variables are available. If a reference survey that followed a probability sampling design is available, several methods can be applied, such as Propensity Score Adjustment, Statistical Matching or Mass Imputation, and doubly robust estimators. In the case where a complete census of the target population is available for some auxiliary covariates, estimators based on superpopulation models (often used in probability sampling) can be adapted to the nonprobability sampling case. We studied the combination of some of these methods in order to produce less biased and more efficient estimates, as well as the use of modern prediction techniques (such as Machine Learning classification and regression algorithms) in the modelling steps of the adjustments described. We also studied the use of variable selection techniques prior to the modelling step in Propensity Score Adjustment. Results show that adjustments based on the combination of several methods may improve the efficiency of the estimates, and that Machine Learning and variable selection techniques can further reduce the bias and variance of the estimators in several situations.

    Key Words: nonprobability sampling; calibration; Propensity Score Adjustment; Matching.

    Release date: 2021-10-15

  • Articles and reports: 12-001-X201800254960
    Description:

    Based on auxiliary information, calibration is often used to improve the precision of estimates. However, calibration weighting may not be appropriate for all variables of interest of the survey, particularly those not related to the auxiliary variables used in calibration. In this paper, we propose a criterion to assess, for any variable of interest, the impact of calibration weighting on the precision of the estimated total. This criterion can be used to decide on the weights associated with each survey variable of interest and determine the variables for which calibration weighting is appropriate.

    Release date: 2018-12-20

  • Articles and reports: 12-001-X201800154963
    Description:

    The probability-sampling-based framework has dominated survey research because it provides precise mathematical tools to assess sampling variability. However, increasing costs and declining response rates are expanding the use of non-probability samples, particularly in general population settings, where samples of individuals pulled from web surveys are becoming increasingly cheap and easy to access. But non-probability samples are at risk of selection bias due to differential access, degrees of interest, and other factors. Calibration to known statistical totals in the population provides a means of potentially diminishing the effect of selection bias in non-probability samples. Here we show that model calibration using adaptive LASSO can yield a consistent estimator of a population total as long as a subset of the true predictors is included in the prediction model, thus allowing large numbers of possible covariates to be included without risk of overfitting. We show that model calibration using adaptive LASSO provides improved estimation with respect to mean squared error relative to standard competitors such as generalized regression (GREG) estimators when a large number of covariates are required to determine the true model, with effectively no loss in efficiency over GREG when smaller models suffice. We also derive closed-form variance estimators of population totals, and compare their behavior with bootstrap estimators. We conclude with a real-world example using data from the National Health Interview Survey.

    Release date: 2018-06-21
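
    The model-calibration estimator at the centre of the paper can be sketched as a difference estimator: sum the model predictions over the population frame, then add a design-weighted correction from the sample. Plain cross-validated LASSO stands in below for adaptive LASSO (which reweights the penalty by initial coefficient estimates); all data and names are invented:

        import numpy as np
        from sklearn.linear_model import LassoCV

        def model_calibrated_total(X_pop, X_s, y_s, d_s):
            """Model-calibration estimator of the y-total: population sum
            of predictions plus the design-weighted sum of sample residuals."""
            fit = LassoCV(cv=5).fit(X_s, y_s)
            resid = y_s - fit.predict(X_s)
            return fit.predict(X_pop).sum() + (d_s * resid).sum()

        # Toy frame of N = 5,000 units with 20 covariates, 3 truly predictive.
        rng = np.random.default_rng(4)
        X_pop = rng.normal(size=(5000, 20))
        beta = np.zeros(20)
        beta[:3] = [2.0, -1.0, 0.5]
        y_pop = X_pop @ beta + rng.normal(size=5000)
        s = rng.choice(5000, size=250, replace=False)   # SRS of n = 250
        print(model_calibrated_total(X_pop, X_pop[s], y_pop[s], np.full(250, 20.0)))
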
Reference (2) (2 results)

  • Surveys and statistical programs – Documentation: 75F0002M2005009
    Description:

    The release of the 2003 data from the Survey of Labour and Income Dynamics (SLID) was accompanied by a historical revision which accomplished three things. First, the survey weights were updated to take into account new population projections based on the 2001 Census of Population, instead of the 1996 Census. Second, a new procedure in the weight adjustments was introduced to take into account an external source of information on the overall distribution of income in the population, namely the T4 file of employer remittances to the Canada Revenue Agency. Third, the low income estimates were revised due to new low income cut-offs (LICOs). This paper describes the second of these improvements: the new weighting procedure to reflect the distribution of income in the population with greater accuracy. Part 1 explains in non-technical terms how this new procedure came about and how it works. Part 2 provides some examples of the impacts on the results for previous years.

    Release date: 2005-07-22

  • Surveys and statistical programs – Documentation: 11-522-X19990015684
    Description:

    Often, the same information is gathered almost simultaneously for several different surveys. In France, this practice is institutionalized for household surveys that have a common set of demographic variables, i.e., employment, residence and income. These variables are important co-factors for the variables of interest in each survey and, if used carefully, can reinforce the estimates derived from each survey. Techniques for calibrating uncertain data apply naturally in this context. This involves finding the best unbiased estimator of the common variables and calibrating each survey on that estimator. The estimator thus obtained in each survey is always a linear estimator, whose weights are easily explained and whose variance, as well as its variance estimate, can be obtained without any new problems. To supplement the list of regression estimators, this technique can also be seen as a ridge-regression estimator or as a Bayesian-regression estimator.

    Release date: 2000-03-02