Keyword search

Skip to main content
Skip to footer

Language selection

Français

Search and menus

Search and menus

Search

Results

All (50)

All (50) (0 to 10 of 50 results)

1. Construction and Assessment of a Social Inclusion Index for the Canada Mortgage and Housing Corporation: Technical Report
Articles and reports: 11-633-X2021001
Description:
Using data from the Canadian Housing Survey, this project aimed to construct a measure of social inclusion, using indicators identified by the Canada Mortgage and Housing Corporation (CMHC), to report a social inclusion score for each geographic stratum separately for dwellings that are and are not in social and affordable housing. This project also sought to examine associations between social inclusion and a set of economic, social and health variables.
Release date: 2021-01-05
2. The Impact of Annual Wages on Interprovincial Mobility, Interprovincial Employment, and Job Vacancies Archived
Articles and reports: 11F0019M2016376
Description: The degree to which workers move across geographic areas in response to emerging employment opportunities or negative labour demand shocks is a key element in the adjustment process of an economy, and its ability to reach a desired allocation of resources.
This study estimates the causal impact of real after-tax annual wages and salaries on the propensity of young men to migrate to Alberta or to accept jobs in that province while maintaining residence in their home province. To do so, it exploits the cross-provincial variation in earnings growth plausibly induced by increases in world oil prices that occurred during the 2000s.
Release date: 2016-04-11
3. A design effect measure for calibration weighting in single-stage samples Archived
Articles and reports: 12-001-X201500214236
Description:
We propose a model-assisted extension of weighting design-effect measures. We develop a summary-level statistic for different variables of interest, in single-stage sampling and under calibration weight adjustments. Our proposed design effect measure captures the joint effects of a non-epsem sampling design, unequal weights produced using calibration adjustments, and the strength of the association between an analysis variable and the auxiliaries used in calibration. We compare our proposed measure to existing design effect measures in simulations using variables like those collected in establishment surveys and telephone surveys of households.
Release date: 2015-12-17
4. A method of determining the winsorization threshold, with an application to domain estimation Archived
Articles and reports: 12-001-X201500114199
Description:
In business surveys, it is not unusual to collect economic variables for which the distribution is highly skewed. In this context, winsorization is often used to treat the problem of influential values. This technique requires the determination of a constant that corresponds to the threshold above which large values are reduced. In this paper, we consider a method of determining the constant which involves minimizing the largest estimated conditional bias in the sample. In the context of domain estimation, we also propose a method of ensuring consistency between the domain-level winsorized estimates and the population-level winsorized estimate. The results of two simulation studies suggest that the proposed methods lead to winsorized estimators that have good bias and relative efficiency properties.
Release date: 2015-06-29
5. Bayesian multiple imputation for large-scale categorical data with structural zeros Archived
Articles and reports: 12-001-X201400114002
Description:
We propose an approach for multiple imputation of items missing at random in large-scale surveys with exclusively categorical variables that have structural zeros. Our approach is to use mixtures of multinomial distributions as imputation engines, accounting for structural zeros by conceiving of the observed data as a truncated sample from a hypothetical population without structural zeros. This approach has several appealing features: imputations are generated from coherent, Bayesian joint models that automatically capture complex dependencies and readily scale to large numbers of variables. We outline a Gibbs sampling algorithm for implementing the approach, and we illustrate its potential with a repeated sampling study using public use census microdata from the state of New York, U.S.A.
Release date: 2014-06-27
6. Pseudo-likelihood-based Bayesian information criterion for variable selection in survey data Archived
Articles and reports: 12-001-X201300211871
Description:
Regression models are routinely used in the analysis of survey data, where one common issue of interest is to identify influential factors that are associated with certain behavioral, social, or economic indices within a target population. When data are collected through complex surveys, the properties of classical variable selection approaches developed in i.i.d. non-survey settings need to be re-examined. In this paper, we derive a pseudo-likelihood-based BIC criterion for variable selection in the analysis of survey data and suggest a sample-based penalized likelihood approach for its implementation. The sampling weights are appropriately assigned to correct the biased selection result caused by the distortion between the sample and the target population. Under a joint randomization framework, we establish the consistency of the proposed selection procedure. The finite-sample performance of the approach is assessed through analysis and computer simulations based on data from the hypertension component of the 2009 Survey on Living with Chronic Diseases in Canada.
Release date: 2014-01-15
7. Joint determination of optimal stratification and sample allocation using genetic algorithm Archived
Articles and reports: 12-001-X201300211884
Description:
This paper offers a solution to the problem of finding the optimal stratification of the available population frame, so as to ensure the minimization of the cost of the sample required to satisfy precision constraints on a set of different target estimates. The solution is searched by exploring the universe of all possible stratifications obtainable by cross-classifying the categorical auxiliary variables available in the frame (continuous auxiliary variables can be transformed into categorical ones by means of suitable methods). Therefore, the followed approach is multivariate with respect to both target and auxiliary variables. The proposed algorithm is based on a non deterministic evolutionary approach, making use of the genetic algorithm paradigm. The key feature of the algorithm is in considering each possible stratification as an individual subject to evolution, whose fitness is given by the cost of the associated sample required to satisfy a set of precision constraints, the cost being calculated by applying the Bethel algorithm for multivariate allocation. This optimal stratification algorithm, implemented in an R package (SamplingStrata), has been so far applied to a number of current surveys in the Italian National Institute of Statistics: the obtained results always show significant improvements in the efficiency of the samples obtained, with respect to previously adopted stratifications.
Release date: 2014-01-15
8. Comparison of different sample designs and construction of confidence bands to estimate the mean of functional data: An illustration on electricity consumption Archived
Articles and reports: 12-001-X201300211888
Description:
When the study variables are functional and storage capacities are limited or transmission costs are high, using survey techniques to select a portion of the observations of the population is an interesting alternative to using signal compression techniques. In this context of functional data, our focus in this study is on estimating the mean electricity consumption curve over a one-week period. We compare different estimation strategies that take account of a piece of auxiliary information such as the mean consumption for the previous period. The first strategy consists in using a simple random sampling design without replacement, then incorporating the auxiliary information into the estimator by introducing a functional linear model. The second approach consists in incorporating the auxiliary information into the sampling designs by considering unequal probability designs, such as stratified and pi designs. We then address the issue of constructing confidence bands for these estimators of the mean. When effective estimators of the covariance function are available and the mean estimator satisfies a functional central limit theorem, it is possible to use a fast technique for constructing confidence bands, based on the simulation of Gaussian processes. This approach is compared with bootstrap techniques that have been adapted to take account of the functional nature of the data.
Release date: 2014-01-15
9. Objective stepwise Bayes weights in survey sampling Archived
Articles and reports: 12-001-X201300111823
Description:
Although weights are widely used in survey sampling their ultimate justification from the design perspective is often problematical. Here we will argue for a stepwise Bayes justification for weights that does not depend explicitly on the sampling design. This approach will make use of the standard kind of information present in auxiliary variables however it will not assume a model relating the auxiliary variables to the characteristic of interest. The resulting weight for a unit in the sample can be given the usual interpretation as the number of units in the population which it represents.
Release date: 2013-06-28
10. Combining cohorts in longitudinal surveys Archived
Articles and reports: 12-001-X201300111828
Description:
A question that commonly arises in longitudinal surveys is the issue of how to combine differing cohorts of the survey. In this paper we present a novel method for combining different cohorts, and using all available data, in a longitudinal survey to estimate parameters of a semiparametric model, which relates the response variable to a set of covariates. The procedure builds upon the Weighted Generalized Estimation Equation method for handling missing waves in longitudinal studies. Our method is set up under a joint-randomization framework for estimation of model parameters, which takes into account the superpopulation model as well as the survey design randomization. We also propose a design-based, and a joint-randomization, variance estimation method. To illustrate the methodology we apply it to the Survey of Doctorate Recipients, conducted by the U.S. National Science Foundation.
Release date: 2013-06-28

Data (0)

Data (0) (0 results)

No content available at this time.

Analysis (50)

Analysis (50) (0 to 10 of 50 results)

1. Construction and Assessment of a Social Inclusion Index for the Canada Mortgage and Housing Corporation: Technical Report
Articles and reports: 11-633-X2021001
Description:
Using data from the Canadian Housing Survey, this project aimed to construct a measure of social inclusion, using indicators identified by the Canada Mortgage and Housing Corporation (CMHC), to report a social inclusion score for each geographic stratum separately for dwellings that are and are not in social and affordable housing. This project also sought to examine associations between social inclusion and a set of economic, social and health variables.
Release date: 2021-01-05
2. The Impact of Annual Wages on Interprovincial Mobility, Interprovincial Employment, and Job Vacancies Archived
Articles and reports: 11F0019M2016376
Description: The degree to which workers move across geographic areas in response to emerging employment opportunities or negative labour demand shocks is a key element in the adjustment process of an economy, and its ability to reach a desired allocation of resources.
This study estimates the causal impact of real after-tax annual wages and salaries on the propensity of young men to migrate to Alberta or to accept jobs in that province while maintaining residence in their home province. To do so, it exploits the cross-provincial variation in earnings growth plausibly induced by increases in world oil prices that occurred during the 2000s.
Release date: 2016-04-11
3. A design effect measure for calibration weighting in single-stage samples Archived
Articles and reports: 12-001-X201500214236
Description:
We propose a model-assisted extension of weighting design-effect measures. We develop a summary-level statistic for different variables of interest, in single-stage sampling and under calibration weight adjustments. Our proposed design effect measure captures the joint effects of a non-epsem sampling design, unequal weights produced using calibration adjustments, and the strength of the association between an analysis variable and the auxiliaries used in calibration. We compare our proposed measure to existing design effect measures in simulations using variables like those collected in establishment surveys and telephone surveys of households.
Release date: 2015-12-17
4. A method of determining the winsorization threshold, with an application to domain estimation Archived
Articles and reports: 12-001-X201500114199
Description:
In business surveys, it is not unusual to collect economic variables for which the distribution is highly skewed. In this context, winsorization is often used to treat the problem of influential values. This technique requires the determination of a constant that corresponds to the threshold above which large values are reduced. In this paper, we consider a method of determining the constant which involves minimizing the largest estimated conditional bias in the sample. In the context of domain estimation, we also propose a method of ensuring consistency between the domain-level winsorized estimates and the population-level winsorized estimate. The results of two simulation studies suggest that the proposed methods lead to winsorized estimators that have good bias and relative efficiency properties.
Release date: 2015-06-29
5. Bayesian multiple imputation for large-scale categorical data with structural zeros Archived
Articles and reports: 12-001-X201400114002
Description:
We propose an approach for multiple imputation of items missing at random in large-scale surveys with exclusively categorical variables that have structural zeros. Our approach is to use mixtures of multinomial distributions as imputation engines, accounting for structural zeros by conceiving of the observed data as a truncated sample from a hypothetical population without structural zeros. This approach has several appealing features: imputations are generated from coherent, Bayesian joint models that automatically capture complex dependencies and readily scale to large numbers of variables. We outline a Gibbs sampling algorithm for implementing the approach, and we illustrate its potential with a repeated sampling study using public use census microdata from the state of New York, U.S.A.
Release date: 2014-06-27
6. Pseudo-likelihood-based Bayesian information criterion for variable selection in survey data Archived
Articles and reports: 12-001-X201300211871
Description:
Regression models are routinely used in the analysis of survey data, where one common issue of interest is to identify influential factors that are associated with certain behavioral, social, or economic indices within a target population. When data are collected through complex surveys, the properties of classical variable selection approaches developed in i.i.d. non-survey settings need to be re-examined. In this paper, we derive a pseudo-likelihood-based BIC criterion for variable selection in the analysis of survey data and suggest a sample-based penalized likelihood approach for its implementation. The sampling weights are appropriately assigned to correct the biased selection result caused by the distortion between the sample and the target population. Under a joint randomization framework, we establish the consistency of the proposed selection procedure. The finite-sample performance of the approach is assessed through analysis and computer simulations based on data from the hypertension component of the 2009 Survey on Living with Chronic Diseases in Canada.
Release date: 2014-01-15
7. Joint determination of optimal stratification and sample allocation using genetic algorithm Archived
Articles and reports: 12-001-X201300211884
Description:
This paper offers a solution to the problem of finding the optimal stratification of the available population frame, so as to ensure the minimization of the cost of the sample required to satisfy precision constraints on a set of different target estimates. The solution is searched by exploring the universe of all possible stratifications obtainable by cross-classifying the categorical auxiliary variables available in the frame (continuous auxiliary variables can be transformed into categorical ones by means of suitable methods). Therefore, the followed approach is multivariate with respect to both target and auxiliary variables. The proposed algorithm is based on a non deterministic evolutionary approach, making use of the genetic algorithm paradigm. The key feature of the algorithm is in considering each possible stratification as an individual subject to evolution, whose fitness is given by the cost of the associated sample required to satisfy a set of precision constraints, the cost being calculated by applying the Bethel algorithm for multivariate allocation. This optimal stratification algorithm, implemented in an R package (SamplingStrata), has been so far applied to a number of current surveys in the Italian National Institute of Statistics: the obtained results always show significant improvements in the efficiency of the samples obtained, with respect to previously adopted stratifications.
Release date: 2014-01-15
8. Comparison of different sample designs and construction of confidence bands to estimate the mean of functional data: An illustration on electricity consumption Archived
Articles and reports: 12-001-X201300211888
Description:
When the study variables are functional and storage capacities are limited or transmission costs are high, using survey techniques to select a portion of the observations of the population is an interesting alternative to using signal compression techniques. In this context of functional data, our focus in this study is on estimating the mean electricity consumption curve over a one-week period. We compare different estimation strategies that take account of a piece of auxiliary information such as the mean consumption for the previous period. The first strategy consists in using a simple random sampling design without replacement, then incorporating the auxiliary information into the estimator by introducing a functional linear model. The second approach consists in incorporating the auxiliary information into the sampling designs by considering unequal probability designs, such as stratified and pi designs. We then address the issue of constructing confidence bands for these estimators of the mean. When effective estimators of the covariance function are available and the mean estimator satisfies a functional central limit theorem, it is possible to use a fast technique for constructing confidence bands, based on the simulation of Gaussian processes. This approach is compared with bootstrap techniques that have been adapted to take account of the functional nature of the data.
Release date: 2014-01-15
9. Objective stepwise Bayes weights in survey sampling Archived
Articles and reports: 12-001-X201300111823
Description:
Although weights are widely used in survey sampling their ultimate justification from the design perspective is often problematical. Here we will argue for a stepwise Bayes justification for weights that does not depend explicitly on the sampling design. This approach will make use of the standard kind of information present in auxiliary variables however it will not assume a model relating the auxiliary variables to the characteristic of interest. The resulting weight for a unit in the sample can be given the usual interpretation as the number of units in the population which it represents.
Release date: 2013-06-28
10. Combining cohorts in longitudinal surveys Archived
Articles and reports: 12-001-X201300111828
Description:
A question that commonly arises in longitudinal surveys is the issue of how to combine differing cohorts of the survey. In this paper we present a novel method for combining different cohorts, and using all available data, in a longitudinal survey to estimate parameters of a semiparametric model, which relates the response variable to a set of covariates. The procedure builds upon the Weighted Generalized Estimation Equation method for handling missing waves in longitudinal studies. Our method is set up under a joint-randomization framework for estimation of model parameters, which takes into account the superpopulation model as well as the survey design randomization. We also propose a design-based, and a joint-randomization, variance estimation method. To illustrate the methodology we apply it to the Survey of Doctorate Recipients, conducted by the U.S. National Science Foundation.
Release date: 2013-06-28

Reference (0)

Reference (0) (0 results)

No content available at this time.

Report a problem or mistake on this page

Date modified:: 2024-04-18