Analysis

Results

All (4 results)

1. Improved small area inference from data integration using global-local priors
Articles and reports: 12-001-X202500200009
Description: We present and apply methodology to improve inference for small area parameters by using data from several sources. This work extends Cahoy and Sedransk (2023), who showed how to integrate summary statistics from several sources. Our methodology uses hierarchical global-local prior distributions to make inferences for the proportion of individuals in Florida’s counties who do not have health insurance. Results from an extensive simulation study show that this methodology will provide improved inference by using several data sources. Among the five model variants evaluated, the ones using horseshoe priors for all variances have better performance than the ones using lasso priors for the local variances.
Release date: 2025-12-23
2. Combining data from surveys and related sources
Articles and reports: 12-001-X202300100003
Description: To improve the precision of inferences and reduce costs, there is considerable interest in combining data from several sources such as sample surveys and administrative data. Appropriate methodology is required to ensure satisfactory inferences since the target populations and methods for acquiring data may be quite different. To provide improved inferences, we use methodology that has a more general structure than the ones in current practice. We start with the case where the analyst has only summary statistics from each of the sources. In our primary method, uncertain pooling, it is assumed that the analyst can regard one source, survey r, as the single best choice for inference. This method starts with the data from survey r and adds data from those other sources that are shown to form clusters that include survey r. We also consider Dirichlet process mixtures, one of the most popular nonparametric Bayesian methods. We use analytical expressions and the results from numerical studies to show properties of the methodology.
Release date: 2023-06-30
3. Bayesian inference for a variance component model using pairwise composite likelihood with survey data
Articles and reports: 12-001-X202200100002
Description: We consider an intercept-only linear random effects model for analysis of data from a two-stage cluster sampling design. At the first stage a simple random sample of clusters is drawn, and at the second stage a simple random sample of elementary units is taken within each selected cluster. The response variable is assumed to consist of a cluster-level random effect plus an independent error term with known variance. The objects of inference are the mean of the outcome variable and the random effect variance. With a more complex two-stage sampling design, the use of an approach based on an estimated pairwise composite likelihood function has appealing properties. Our purpose is to use our simpler context to compare the results of likelihood inference with inference based on a pairwise composite likelihood function that is treated as an approximate likelihood, in particular treated as the likelihood component in Bayesian inference. In order to provide credible intervals having frequentist coverage close to nominal values, the pairwise composite likelihood function and corresponding posterior density need modification, such as a curvature adjustment. Through simulation studies, we investigate the performance of an adjustment proposed in the literature, and find that it works well for the mean but provides credible intervals for the random effect variance that suffer from under-coverage. We propose possible future directions including extensions to the case of a complex design.
Release date: 2022-06-21
4. Double sampling for stratification (Archived)
Articles and reports: 12-001-X199300114473
Description: Double sampling is a common alternative to simple random sampling when there are expected to be gains from using stratified sampling, but the units cannot be assigned to strata prior to sampling. It is assumed throughout that the survey objective is estimation of the finite population mean. We compare simple random sampling and three allocation methods for double sampling: (a) proportional, (b) Rao’s (Rao 1973a, b) and (c) optimal. There is also an investigation of the effect on sample size selection of misspecification of an important design parameter.
Release date: 1993-06-15

Date modified: 2026-06-04
