Results
All (32) (0 to 10 of 32 results)
- Articles and reports: 12-001-X202300200014
Description:
Many things have been written about Jean-Claude Deville in tributes from the statistical community (see Tillé, 2022a; Tillé, 2022b; Christine, 2022; Ardilly, 2022; and Matei, 2022) and from the École nationale de la statistique et de l’administration économique (ENSAE) and the Société française de statistique. Pascal Ardilly, David Haziza, Pierre Lavallée and Yves Tillé provide an in-depth look at Jean-Claude Deville’s contributions to survey theory. To pay tribute to him, I would like to discuss Jean-Claude Deville’s contribution to the more day-to-day application of methodology for all the statisticians at the Institut national de la statistique et des études économiques (INSEE) and at the public statistics service. To do this, I will use my work experience, and particularly the four years (1992 to 1996) I spent working with him in the Statistical Methods Unit and the discussions we had thereafter, especially in the 2000s on the rolling census.
Release date: 2024-01-03
- 2. Revisions to 2006 to 2011 income data (Archived)
Surveys and statistical programs – Documentation: 75F0002M2015003
Description:
This note discusses revised income estimates from the Survey of Labour and Income Dynamics (SLID). These revisions to the SLID estimates make it possible to compare results from the Canadian Income Survey (CIS) to earlier years. The revisions address the issue of methodology differences between SLID and CIS.
Release date: 2015-12-17
- Surveys and statistical programs – Documentation: 13-605-X201500414166
Description:
Estimates of the underground economy by province and territory for the period 2007 to 2012 are now available for the first time. The objective of this technical note is to explain how the methodology employed to derive upper-bound estimates of the underground economy for the provinces and territories differs from that used to derive national estimates.
Release date: 2015-04-29
- Articles and reports: 12-001-X201300111823
Description:
Although weights are widely used in survey sampling, their ultimate justification from the design perspective is often problematical. Here we will argue for a stepwise Bayes justification for weights that does not depend explicitly on the sampling design. This approach will make use of the standard kind of information present in auxiliary variables; however, it will not assume a model relating the auxiliary variables to the characteristic of interest. The resulting weight for a unit in the sample can be given the usual interpretation as the number of units in the population that it represents.
Release date: 2013-06-28
- 5. Estimation of the variance of cross-sectional indicators for the SILC survey in Switzerland (Archived)
Articles and reports: 12-001-X201300111827
Description:
SILC (Statistics on Income and Living Conditions) is an annual European survey that measures the population's income distribution, poverty and living conditions. It has been conducted in Switzerland since 2007, based on a four-panel rotation scheme that yields both cross-sectional and longitudinal estimates. This article examines the problem of estimating the variance of the cross-sectional poverty and social exclusion indicators selected by Eurostat. Our calculations take into account the non-linearity of the estimators, total non-response at different survey stages, indirect sampling and calibration. We adapt the method proposed by Lavallée (2002) for estimating variance in cases of non-response after weight sharing, and we obtain a variance estimator that is asymptotically unbiased and very easy to program.
Release date: 2013-06-28
- 6. Combining cohorts in longitudinal surveys (Archived)
Articles and reports: 12-001-X201300111828
Description:
A question that commonly arises in longitudinal surveys is the issue of how to combine differing cohorts of the survey. In this paper we present a novel method for combining different cohorts, and using all available data, in a longitudinal survey to estimate parameters of a semiparametric model, which relates the response variable to a set of covariates. The procedure builds upon the Weighted Generalized Estimation Equation method for handling missing waves in longitudinal studies. Our method is set up under a joint-randomization framework for estimation of model parameters, which takes into account the superpopulation model as well as the survey design randomization. We also propose a design-based, and a joint-randomization, variance estimation method. To illustrate the methodology we apply it to the Survey of Doctorate Recipients, conducted by the U.S. National Science Foundation.
Release date: 2013-06-28
- 7. On the performance of self benchmarked small area estimators under the Fay-Herriot area level model (Archived)
Articles and reports: 12-001-X201300111830
Description:
We consider two different self-benchmarking methods for the estimation of small area means based on the Fay-Herriot (FH) area level model: the method of You and Rao (2002) applied to the FH model and the method of Wang, Fuller and Qu (2008) based on augmented models. We derive an estimator of the mean squared prediction error (MSPE) of the You-Rao (YR) estimator of a small area mean that, under the true model, is correct to second-order terms. We report the results of a simulation study on the relative bias of the MSPE estimator of the YR estimator and the MSPE estimator of the Wang, Fuller and Qu (WFQ) estimator obtained under an augmented model. We also study the MSPE and the estimators of MSPE for the YR and WFQ estimators obtained under a misspecified model.
Release date: 2013-06-28
- 8. Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities (Archived)
Articles and reports: 12-001-X201300111831
Description:
We consider conservative variance estimation for the Horvitz-Thompson estimator of a population total in sampling designs with zero pairwise inclusion probabilities, known as "non-measurable" designs. We decompose the standard Horvitz-Thompson variance estimator under such designs and characterize the bias precisely. We develop a bias correction that is guaranteed to be weakly conservative (nonnegatively biased) regardless of the nature of the non-measurability. The analysis sheds light on conditions under which the standard Horvitz-Thompson variance estimator performs well despite non-measurability and where the conservative bias correction may outperform commonly-used approximations.
Release date: 2013-06-28
- Articles and reports: 12-001-X201200111686
Description:
We present a generalized estimating equations approach for estimating the concordance correlation coefficient and the kappa coefficient from sample survey data. The estimates and their accompanying standard error need to correctly account for the sampling design. Weighted measures of the concordance correlation coefficient and the kappa coefficient, along with the variance of these measures accounting for the sampling design, are presented. We use the Taylor series linearization method and the jackknife procedure for estimating the standard errors of the resulting parameter estimates. Body measurement and oral health data from the Third National Health and Nutrition Examination Survey are used to illustrate this methodology.
Release date: 2012-06-27
- Articles and reports: 12-001-X201100211609
Description:
This paper presents a review and assessment of the use of balanced sampling by means of the cube method. After defining the notion of balanced sample and balanced sampling, a short history of the concept of balancing is presented. The theory of the cube method is briefly presented. Emphasis is placed on the practical problems posed by balanced sampling: the interest of the method with respect to other sampling methods and calibration, the field of application, the accuracy of balancing, the choice of auxiliary variables and ways to implement the method.
Release date: 2011-12-21
Data (0) (0 results)
No content available at this time.
Analysis (28) (0 to 10 of 28 results)
- Articles and reports: 12-001-X202300200014
Description:
Many things have been written about Jean-Claude Deville in tributes from the statistical community (see Tillé, 2022a; Tillé, 2022b; Christine, 2022; Ardilly, 2022; and Matei, 2022) and from the École nationale de la statistique et de l’administration économique (ENSAE) and the Société française de statistique. Pascal Ardilly, David Haziza, Pierre Lavallée and Yves Tillé provide an in-depth look at Jean-Claude Deville’s contributions to survey theory. To pay tribute to him, I would like to discuss Jean-Claude Deville’s contribution to the more day-to-day application of methodology for all the statisticians at the Institut national de la statistique et des études économiques (INSEE) and at the public statistics service. To do this, I will use my work experience, and particularly the four years (1992 to 1996) I spent working with him in the Statistical Methods Unit and the discussions we had thereafter, especially in the 2000s on the rolling census.
Release date: 2024-01-03
- Articles and reports: 12-001-X201300111823
Description:
Although weights are widely used in survey sampling, their ultimate justification from the design perspective is often problematical. Here we will argue for a stepwise Bayes justification for weights that does not depend explicitly on the sampling design. This approach will make use of the standard kind of information present in auxiliary variables; however, it will not assume a model relating the auxiliary variables to the characteristic of interest. The resulting weight for a unit in the sample can be given the usual interpretation as the number of units in the population that it represents.
Release date: 2013-06-28
- 3. Estimation of the variance of cross-sectional indicators for the SILC survey in Switzerland (Archived)
Articles and reports: 12-001-X201300111827
Description:
SILC (Statistics on Income and Living Conditions) is an annual European survey that measures the population's income distribution, poverty and living conditions. It has been conducted in Switzerland since 2007, based on a four-panel rotation scheme that yields both cross-sectional and longitudinal estimates. This article examines the problem of estimating the variance of the cross-sectional poverty and social exclusion indicators selected by Eurostat. Our calculations take into account the non-linearity of the estimators, total non-response at different survey stages, indirect sampling and calibration. We adapt the method proposed by Lavallée (2002) for estimating variance in cases of non-response after weight sharing, and we obtain a variance estimator that is asymptotically unbiased and very easy to program.
Release date: 2013-06-28
- 4. Combining cohorts in longitudinal surveys (Archived)
Articles and reports: 12-001-X201300111828
Description:
A question that commonly arises in longitudinal surveys is the issue of how to combine differing cohorts of the survey. In this paper we present a novel method for combining different cohorts, and using all available data, in a longitudinal survey to estimate parameters of a semiparametric model, which relates the response variable to a set of covariates. The procedure builds upon the Weighted Generalized Estimation Equation method for handling missing waves in longitudinal studies. Our method is set up under a joint-randomization framework for estimation of model parameters, which takes into account the superpopulation model as well as the survey design randomization. We also propose a design-based, and a joint-randomization, variance estimation method. To illustrate the methodology we apply it to the Survey of Doctorate Recipients, conducted by the U.S. National Science Foundation.
Release date: 2013-06-28
- 5. On the performance of self benchmarked small area estimators under the Fay-Herriot area level model (Archived)
Articles and reports: 12-001-X201300111830
Description:
We consider two different self-benchmarking methods for the estimation of small area means based on the Fay-Herriot (FH) area level model: the method of You and Rao (2002) applied to the FH model and the method of Wang, Fuller and Qu (2008) based on augmented models. We derive an estimator of the mean squared prediction error (MSPE) of the You-Rao (YR) estimator of a small area mean that, under the true model, is correct to second-order terms. We report the results of a simulation study on the relative bias of the MSPE estimator of the YR estimator and the MSPE estimator of the Wang, Fuller and Qu (WFQ) estimator obtained under an augmented model. We also study the MSPE and the estimators of MSPE for the YR and WFQ estimators obtained under a misspecified model.
Release date: 2013-06-28
- 6. Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities (Archived)
Articles and reports: 12-001-X201300111831
Description:
We consider conservative variance estimation for the Horvitz-Thompson estimator of a population total in sampling designs with zero pairwise inclusion probabilities, known as "non-measurable" designs. We decompose the standard Horvitz-Thompson variance estimator under such designs and characterize the bias precisely. We develop a bias correction that is guaranteed to be weakly conservative (nonnegatively biased) regardless of the nature of the non-measurability. The analysis sheds light on conditions under which the standard Horvitz-Thompson variance estimator performs well despite non-measurability and where the conservative bias correction may outperform commonly-used approximations.
Release date: 2013-06-28
- Articles and reports: 12-001-X201200111686
Description:
We present a generalized estimating equations approach for estimating the concordance correlation coefficient and the kappa coefficient from sample survey data. The estimates and their accompanying standard error need to correctly account for the sampling design. Weighted measures of the concordance correlation coefficient and the kappa coefficient, along with the variance of these measures accounting for the sampling design, are presented. We use the Taylor series linearization method and the jackknife procedure for estimating the standard errors of the resulting parameter estimates. Body measurement and oral health data from the Third National Health and Nutrition Examination Survey are used to illustrate this methodology.
Release date: 2012-06-27
- Articles and reports: 12-001-X201100211609
Description:
This paper presents a review and assessment of the use of balanced sampling by means of the cube method. After defining the notion of balanced sample and balanced sampling, a short history of the concept of balancing is presented. The theory of the cube method is briefly presented. Emphasis is placed on the practical problems posed by balanced sampling: the interest of the method with respect to other sampling methods and calibration, the field of application, the accuracy of balancing, the choice of auxiliary variables and ways to implement the method.
Release date: 2011-12-21
- 9. Maximum likelihood estimation for contingency tables and logistic regression with incorrectly linked data (Archived)
Articles and reports: 12-001-X201100111444
Description:
Data linkage is the act of bringing together records that are believed to belong to the same unit (e.g., person or business) from two or more files. It is a very common way to enhance dimensions such as time and breadth or depth of detail. Data linkage is often not an error-free process and can lead to linking a pair of records that do not belong to the same unit. There is an explosion of record linkage applications, yet there has been little work on assuring the quality of analyses using such linked files. Naively treating such a linked file as if it were linked without errors will, in general, lead to biased estimates. This paper develops a maximum likelihood estimator for contingency tables and logistic regression with incorrectly linked records. The estimation technique is simple and is implemented using the well-known EM algorithm. A well-known method of linking records in the present context is probabilistic data linking. The paper demonstrates the effectiveness of the proposed estimators in an empirical study that uses probabilistic data linkage.
Release date: 2011-06-29
- 10. Hierarchical Bayes small area estimation under a spatial model with application to health survey data (Archived)
Articles and reports: 12-001-X201100111445
Description:
In this paper we study small area estimation using area level models. We first consider the Fay-Herriot model (Fay and Herriot 1979) for the case of smoothed known sampling variances and the You-Chapman model (You and Chapman 2006) for the case of sampling variance modeling. Then we consider hierarchical Bayes (HB) spatial models that extend the Fay-Herriot and You-Chapman models by capturing both the geographically unstructured heterogeneity and spatial correlation effects among areas for local smoothing. The proposed models are implemented using the Gibbs sampling method for fully Bayesian inference. We apply the proposed models to the analysis of health survey data and make comparisons among the HB model-based estimates and direct design-based estimates. Our results have shown that the HB model-based estimates perform much better than the direct estimates. In addition, the proposed area level spatial models achieve smaller CVs than the Fay-Herriot and You-Chapman models, particularly for the areas with three or more neighbouring areas. Bayesian model comparison and model fit analysis are also presented.
Release date: 2011-06-29
Reference (4) (4 results)
- 1. Revisions to 2006 to 2011 income data (Archived)
Surveys and statistical programs – Documentation: 75F0002M2015003
Description:
This note discusses revised income estimates from the Survey of Labour and Income Dynamics (SLID). These revisions to the SLID estimates make it possible to compare results from the Canadian Income Survey (CIS) to earlier years. The revisions address the issue of methodology differences between SLID and CIS.
Release date: 2015-12-17
- Surveys and statistical programs – Documentation: 13-605-X201500414166
Description:
Estimates of the underground economy by province and territory for the period 2007 to 2012 are now available for the first time. The objective of this technical note is to explain how the methodology employed to derive upper-bound estimates of the underground economy for the provinces and territories differs from that used to derive national estimates.
Release date: 2015-04-29
- 3. A donor imputation system to create a census database fully adjusted for underenumeration (Archived)
Surveys and statistical programs – Documentation: 11-522-X19990015668
Description:
Following the problems with estimating underenumeration in the 1991 Census of England and Wales, the aim for the 2001 Census is to create a database that is fully adjusted for net underenumeration. To achieve this, the paper investigates weighted donor imputation methodology that utilises information from both the census and the census coverage survey (CCS). The US Census Bureau has considered a similar approach for its 2000 Census (see Isaki et al. 1998). The proposed procedure distinguishes between individuals who are not counted by the census because their household is missed and those who are missed in counted households. Census data is linked to data from the CCS. Multinomial logistic regression is used to estimate the probabilities that households are missed by the census and the probabilities that individuals are missed in counted households. Household and individual coverage weights are constructed from the estimated probabilities, and these feed into the donor imputation procedure.
Release date: 2000-03-02
- 4. Sampling and Weighting (Reference Products: Technical Reports: 1996 Census of Population) (Archived)
Surveys and statistical programs – Documentation: 92-371-X
Description:
This report deals with sampling and weighting, a process whereby certain characteristics are collected and processed for a random sample of dwellings and persons identified in the complete census enumeration. Data for the whole population are then obtained by scaling up the results for the sample to the full population level. The use of sampling may lead to substantial reductions in costs and respondent burden, or alternatively, can allow the scope of a census to be broadened at the same cost.
Release date: 1999-12-07
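The "scaling up" idea in the Sampling and Weighting description can be illustrated with a minimal sketch. It assumes the simplest possible case, an equal-probability sample in which every sampled unit carries the same weight N/n; the report's actual weighting procedure is not described in this listing and is likely more elaborate. The function name `scale_up` and the example figures are hypothetical.

```python
import random


def scale_up(sample_values, population_size):
    """Estimate a population total from an equal-probability sample.

    Assumption (not from the report): every sampled unit gets the same
    weight N/n, i.e. it "represents" N/n units of the population.
    """
    n = len(sample_values)
    if n == 0:
        raise ValueError("sample must not be empty")
    weight = population_size / n
    return weight * sum(sample_values)


# Hypothetical illustration: 1,000 dwellings, 200 of them sampled
# for an extra question (e.g. number of vehicles per dwelling).
random.seed(0)
population = [random.randint(0, 3) for _ in range(1000)]
sample = random.sample(population, 200)

estimate = scale_up(sample, len(population))
print(f"estimated total: {estimate:.0f}, true total: {sum(population)}")
```

When the sample is the whole population, the weight is 1 and the estimate equals the true total; with a smaller sample, the estimate varies around the true total from one draw to the next.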