Keyword search

Results

All (53) (0 to 10 of 53 results)

1. An Approximate Bayesian Approach to Improving Probability Sample Estimators Using a Supplementary Non-Probability Sample Archived
Articles and reports: 11-522-X202100100008
Description:
Non-probability samples are being increasingly explored by National Statistical Offices as a complement to probability samples. We consider the scenario where the variable of interest and auxiliary variables are observed in both a probability and non-probability sample. Our objective is to use data from the non-probability sample to improve the efficiency of survey-weighted estimates obtained from the probability sample. Recently, Sakshaug, Wisniowski, Ruiz and Blom (2019) and Wisniowski, Sakshaug, Ruiz and Blom (2020) proposed a Bayesian approach to integrating data from both samples for the estimation of model parameters. In their approach, non-probability sample data are used to determine the prior distribution of model parameters, and the posterior distribution is obtained under the assumption that the probability sampling design is ignorable (or not informative). We extend this Bayesian approach to the prediction of finite population parameters under non-ignorable (or informative) sampling by conditioning on appropriate survey-weighted statistics. We illustrate the properties of our predictor through a simulation study.
Key Words: Bayesian prediction; Gibbs sampling; Non-ignorable sampling; Statistical data integration.

Release date: 2021-10-29
2. Demystifying Confidence Intervals
19-22-0005
Description:
In this session, we will attempt to demystify the concept of confidence intervals as they relate to sample data. A practical approach is used, placing emphasis on the meaning and interpretation of results rather than the mathematics. The goal is to make sense of some common challenges faced by data users when interpreting confidence intervals. The session is intended for a beginner audience. Some familiarity with basic statistical concepts would be beneficial but is not required.
https://www.statcan.gc.ca/eng/wtc/information/19220005
Release date: 2021-05-28
3. Methodology of the Canadian Labour Force Survey
Surveys and statistical programs – Documentation: 71-526-X
Description:
The Canadian Labour Force Survey (LFS) is the official source of monthly estimates of total employment and unemployment. Following the 2011 census, the LFS underwent a sample redesign to account for the evolution of the population and labour market characteristics, to adjust to changes in the information needs and to update the geographical information used to carry out the survey. The redesign program following the 2011 census culminated with the introduction of a new sample at the beginning of 2015. This report is a reference on the methodological aspects of the LFS, covering stratification, sampling, collection, processing, weighting, estimation, variance estimation and data quality.
Release date: 2017-12-21
4. Model-assisted optimal allocation for planned domains using composite estimation Archived
Articles and reports: 12-001-X201500214230
Description:
This paper develops allocation methods for stratified sample surveys where composite small area estimators are a priority, and areas are used as strata. Longford (2006) proposed an objective criterion for this situation, based on a weighted combination of the mean squared errors of small area means and a grand mean. Here, we redefine this approach within a model-assisted framework, allowing regressor variables and a more natural interpretation of results using an intra-class correlation parameter. We also consider several uses of power allocation, and allow the placing of other constraints such as maximum relative root mean squared errors for stratum estimators. We find that a simple power allocation can perform very nearly as well as the optimal design even when the objective is to minimize Longford’s (2006) criterion.
Release date: 2015-12-17
5. Model-based small area estimation under informative sampling Archived
Articles and reports: 12-001-X201500214248
Description:
Unit level population models are often used in model-based small area estimation of totals and means, but the models may not hold for the sample if the sampling design is informative for the model. As a result, standard methods, assuming that the model holds for the sample, can lead to biased estimators. We study alternative methods that use a suitable function of the unit selection probability as an additional auxiliary variable in the sample model. We report the results of a simulation study on the bias and mean squared error (MSE) of the proposed estimators of small area means and on the relative bias of the associated MSE estimators, using informative sampling schemes to generate the samples. Alternative methods, based on modeling the conditional expectation of the design weight as a function of the model covariates and the response, are also included in the simulation study.
Release date: 2015-12-17
6. Objective stepwise Bayes weights in survey sampling Archived
Articles and reports: 12-001-X201300111823
Description:
Although weights are widely used in survey sampling, their ultimate justification from the design perspective is often problematical. Here we will argue for a stepwise Bayes justification for weights that does not depend explicitly on the sampling design. This approach will make use of the standard kind of information present in auxiliary variables; however, it will not assume a model relating the auxiliary variables to the characteristic of interest. The resulting weight for a unit in the sample can be given the usual interpretation as the number of units in the population which it represents.
Release date: 2013-06-28
7. Optimizing quality of response through adaptive survey designs Archived
Articles and reports: 12-001-X201300111824
Description:
In most surveys all sample units receive the same treatment and the same design features apply to all selected people and households. In this paper, it is explained how survey designs may be tailored to optimize quality given constraints on costs. Such designs are called adaptive survey designs. The basic ingredients of such designs are introduced, discussed and illustrated with various examples.
Release date: 2013-06-28
8. On sample allocation for efficient domain estimation Archived
Articles and reports: 12-001-X201200111682
Description:
Sample allocation issues are studied in the context of estimating sub-population (stratum or domain) means as well as the aggregate population mean under stratified simple random sampling. A non-linear programming method is used to obtain "optimal" sample allocation to strata that minimizes the total sample size subject to specified tolerances on the coefficient of variation of the estimators of strata means and the population mean. The resulting total sample size is then used to determine sample allocations for the methods of Costa, Satorra and Ventura (2004) based on compromise allocation and Longford (2006) based on specified "inferential priorities". In addition, we study sample allocation to strata when reliability requirements for domains, cutting across strata, are also specified. Performance of the three methods is studied using data from Statistics Canada's Monthly Retail Trade Survey (MRTS) of single establishments.
Release date: 2012-06-27
9. Calibration alternatives to poststratification for doubly classified data Archived
Articles and reports: 12-001-X201200111683
Description:
We consider alternatives to poststratification for doubly classified data in which at least one of the two-way cells is too small to allow the poststratification based upon this double classification. In our study data set, the expected count in the smallest cell is 0.36. One approach is simply to collapse cells. This is likely, however, to destroy the double classification structure. Our alternative approaches allow one to maintain the original double classification of the data. The approaches are based upon the calibration study by Chang and Kott (2008). We choose weight adjustments dependent upon the marginal classifications (but not full cross classification) to minimize an objective function of the differences between the population counts of the two-way cells and their sample estimates. In the terminology of Chang and Kott (2008), if the row and column classifications have I and J cells respectively, this results in IJ benchmark variables and I + J - 1 model variables. We study the performance of these estimators by constructing simulation simple random samples from the 2005 Quarterly Census of Employment and Wages which is maintained by the Bureau of Labor Statistics. We use the double classification of state and industry group. In our study, the calibration approaches introduced an asymptotically trivial bias, but reduced the MSE, compared to the unbiased estimator, by as much as 20% for a small sample.
Release date: 2012-06-27
10. Nonsampling errors in dual frame telephone surveys Archived
Articles and reports: 12-001-X201100111443
Description:
Dual frame telephone surveys are becoming common in the U.S. because of the incompleteness of the landline frame as people transition to cell phones. This article examines nonsampling errors in dual frame telephone surveys. Even though nonsampling errors are ignored in much of the dual frame literature, we find that under some conditions substantial biases may arise in dual frame telephone surveys due to these errors. We specifically explore biases due to nonresponse and measurement error in these telephone surveys. To reduce the bias resulting from these errors, we propose dual frame sampling and weighting methods. The compositing factor for combining the estimates from the two frames is shown to play an important role in reducing nonresponse bias.
Release date: 2011-06-29

Data (1) (1 result)

1. Ontario Adult Literacy Survey Archived
Public use microdata: 89M0018X
Description:
This is a CD-ROM product from the Ontario Adult Literacy Survey (OALS), conducted in the spring of 1998 with the goal of providing information on: the ability of Ontario immigrants to use either English or French in their daily activities; and on their self-perceived literacy skills, training needs and barriers to training.
In order to cover the majority of Ontario immigrants, the Census Metropolitan Areas (CMAs) of Toronto, Hamilton, Ottawa, Kitchener, London and St. Catharines were included in the sample. With these 6 CMAs, about 83% of Ontario immigrants were included in the sample frame. This sample of 7,107 dwellings covered the population of Ontario immigrants in general as well as specifically targeting immigrants with a mother tongue of Italian, Chinese, Portuguese, Polish, and Spanish and immigrants born in the Caribbean Islands with a mother tongue of English.
Each interview was approximately 1.5 hours in duration and consisted of a half-hour questionnaire, asking demographic and literacy-related questions as well as a one-hour literacy test. This literacy test was derived from that used in the 1994 International Adult Literacy Survey (IALS) and covered the domains of document and quantitative literacy. An overall response rate to the survey of 76% was achieved, resulting in 4,648 respondents.
Release date: 1999-10-29

Analysis (48) (0 to 10 of 48 results)

1. An Approximate Bayesian Approach to Improving Probability Sample Estimators Using a Supplementary Non-Probability Sample Archived
Articles and reports: 11-522-X202100100008
Description:
Non-probability samples are being increasingly explored by National Statistical Offices as a complement to probability samples. We consider the scenario where the variable of interest and auxiliary variables are observed in both a probability and non-probability sample. Our objective is to use data from the non-probability sample to improve the efficiency of survey-weighted estimates obtained from the probability sample. Recently, Sakshaug, Wisniowski, Ruiz and Blom (2019) and Wisniowski, Sakshaug, Ruiz and Blom (2020) proposed a Bayesian approach to integrating data from both samples for the estimation of model parameters. In their approach, non-probability sample data are used to determine the prior distribution of model parameters, and the posterior distribution is obtained under the assumption that the probability sampling design is ignorable (or not informative). We extend this Bayesian approach to the prediction of finite population parameters under non-ignorable (or informative) sampling by conditioning on appropriate survey-weighted statistics. We illustrate the properties of our predictor through a simulation study.
Key Words: Bayesian prediction; Gibbs sampling; Non-ignorable sampling; Statistical data integration.

Release date: 2021-10-29
2. Model-assisted optimal allocation for planned domains using composite estimation Archived
Articles and reports: 12-001-X201500214230
Description:
This paper develops allocation methods for stratified sample surveys where composite small area estimators are a priority, and areas are used as strata. Longford (2006) proposed an objective criterion for this situation, based on a weighted combination of the mean squared errors of small area means and a grand mean. Here, we redefine this approach within a model-assisted framework, allowing regressor variables and a more natural interpretation of results using an intra-class correlation parameter. We also consider several uses of power allocation, and allow the placing of other constraints such as maximum relative root mean squared errors for stratum estimators. We find that a simple power allocation can perform very nearly as well as the optimal design even when the objective is to minimize Longford’s (2006) criterion.
Release date: 2015-12-17
3. Model-based small area estimation under informative sampling Archived
Articles and reports: 12-001-X201500214248
Description:
Unit level population models are often used in model-based small area estimation of totals and means, but the models may not hold for the sample if the sampling design is informative for the model. As a result, standard methods, assuming that the model holds for the sample, can lead to biased estimators. We study alternative methods that use a suitable function of the unit selection probability as an additional auxiliary variable in the sample model. We report the results of a simulation study on the bias and mean squared error (MSE) of the proposed estimators of small area means and on the relative bias of the associated MSE estimators, using informative sampling schemes to generate the samples. Alternative methods, based on modeling the conditional expectation of the design weight as a function of the model covariates and the response, are also included in the simulation study.
Release date: 2015-12-17
4. Objective stepwise Bayes weights in survey sampling Archived
Articles and reports: 12-001-X201300111823
Description:
Although weights are widely used in survey sampling, their ultimate justification from the design perspective is often problematical. Here we will argue for a stepwise Bayes justification for weights that does not depend explicitly on the sampling design. This approach will make use of the standard kind of information present in auxiliary variables; however, it will not assume a model relating the auxiliary variables to the characteristic of interest. The resulting weight for a unit in the sample can be given the usual interpretation as the number of units in the population which it represents.
Release date: 2013-06-28
5. Optimizing quality of response through adaptive survey designs Archived
Articles and reports: 12-001-X201300111824
Description:
In most surveys all sample units receive the same treatment and the same design features apply to all selected people and households. In this paper, it is explained how survey designs may be tailored to optimize quality given constraints on costs. Such designs are called adaptive survey designs. The basic ingredients of such designs are introduced, discussed and illustrated with various examples.
Release date: 2013-06-28
6. On sample allocation for efficient domain estimation Archived
Articles and reports: 12-001-X201200111682
Description:
Sample allocation issues are studied in the context of estimating sub-population (stratum or domain) means as well as the aggregate population mean under stratified simple random sampling. A non-linear programming method is used to obtain "optimal" sample allocation to strata that minimizes the total sample size subject to specified tolerances on the coefficient of variation of the estimators of strata means and the population mean. The resulting total sample size is then used to determine sample allocations for the methods of Costa, Satorra and Ventura (2004) based on compromise allocation and Longford (2006) based on specified "inferential priorities". In addition, we study sample allocation to strata when reliability requirements for domains, cutting across strata, are also specified. Performance of the three methods is studied using data from Statistics Canada's Monthly Retail Trade Survey (MRTS) of single establishments.
Release date: 2012-06-27
7. Calibration alternatives to poststratification for doubly classified data Archived
Articles and reports: 12-001-X201200111683
Description:
We consider alternatives to poststratification for doubly classified data in which at least one of the two-way cells is too small to allow the poststratification based upon this double classification. In our study data set, the expected count in the smallest cell is 0.36. One approach is simply to collapse cells. This is likely, however, to destroy the double classification structure. Our alternative approaches allow one to maintain the original double classification of the data. The approaches are based upon the calibration study by Chang and Kott (2008). We choose weight adjustments dependent upon the marginal classifications (but not full cross classification) to minimize an objective function of the differences between the population counts of the two-way cells and their sample estimates. In the terminology of Chang and Kott (2008), if the row and column classifications have I and J cells respectively, this results in IJ benchmark variables and I + J - 1 model variables. We study the performance of these estimators by constructing simulation simple random samples from the 2005 Quarterly Census of Employment and Wages which is maintained by the Bureau of Labor Statistics. We use the double classification of state and industry group. In our study, the calibration approaches introduced an asymptotically trivial bias, but reduced the MSE, compared to the unbiased estimator, by as much as 20% for a small sample.
Release date: 2012-06-27
8. Nonsampling errors in dual frame telephone surveys Archived
Articles and reports: 12-001-X201100111443
Description:
Dual frame telephone surveys are becoming common in the U.S. because of the incompleteness of the landline frame as people transition to cell phones. This article examines nonsampling errors in dual frame telephone surveys. Even though nonsampling errors are ignored in much of the dual frame literature, we find that under some conditions substantial biases may arise in dual frame telephone surveys due to these errors. We specifically explore biases due to nonresponse and measurement error in these telephone surveys. To reduce the bias resulting from these errors, we propose dual frame sampling and weighting methods. The compositing factor for combining the estimates from the two frames is shown to play an important role in reducing nonresponse bias.
Release date: 2011-06-29
9. The construction of stratified designs in R with the package stratification Archived
Articles and reports: 12-001-X201100111447
Description:
This paper introduces an R package for the stratification of a survey population using a univariate stratification variable X and for the calculation of stratum sample sizes. Non-iterative methods such as the cumulative root frequency method and the geometric stratum boundaries are implemented. Optimal designs, with stratum boundaries that minimize either the CV of the simple expansion estimator for a fixed sample size n or the n value for a fixed CV, can be constructed. Two iterative algorithms are available to find the optimal stratum boundaries. The design can feature a user-defined certainty stratum where all the units are sampled. Take-all and take-none strata can be included in the stratified design as they might lead to smaller sample sizes. The sample size calculations are based on the anticipated moments of the survey variable Y, given the stratification variable X. The package handles conditional distributions of Y given X that are either a heteroscedastic linear model, or a log-linear model. Stratum-specific non-response can be accounted for in the design construction and in the sample size calculations.
Release date: 2011-06-29
10. Cost efficiency of repeated cluster surveys Archived
Articles and reports: 12-001-X201100111449
Description:
We analyze the statistical and economic efficiency of different designs of cluster surveys collected in two consecutive time periods, or waves. In an independent design, two cluster samples in two waves are taken independently from one another. In a cluster-panel design, the same clusters are used in both waves, but samples within clusters are taken independently in two time periods. In an observation-panel design, both clusters and observations are retained from one wave of data collection to another. By assuming a simple population structure, we derive design variances and costs of the surveys conducted according to these designs. We first consider a situation in which the interest lies in estimation of the change in the population mean between two time periods, and derive the optimal sample allocations for the three designs of interest. We then propose the utility maximization framework borrowed from microeconomics to illustrate a possible approach to the choice of the design that strives to optimize several variances simultaneously. Incorporating the contemporaneous means and their variances tends to shift the preferences from observation-panel towards simpler cluster-panel and independent designs if the panel mode of data collection is too expensive. We present numeric illustrations demonstrating how a survey designer may want to choose the efficient design given the population parameters and data collection cost.
Release date: 2011-06-29

Reference (3) (3 results)

1. Methodology of the Canadian Labour Force Survey
Surveys and statistical programs – Documentation: 71-526-X
Description:
The Canadian Labour Force Survey (LFS) is the official source of monthly estimates of total employment and unemployment. Following the 2011 census, the LFS underwent a sample redesign to account for the evolution of the population and labour market characteristics, to adjust to changes in the information needs and to update the geographical information used to carry out the survey. The redesign program following the 2011 census culminated with the introduction of a new sample at the beginning of 2015. This report is a reference on the methodological aspects of the LFS, covering stratification, sampling, collection, processing, weighting, estimation, variance estimation and data quality.
Release date: 2017-12-21
2. Coverage (Reference Products: Technical Reports: 1996 Census of Population) Archived
Surveys and statistical programs – Documentation: 92-370-X
Description:
Series description
This series includes five general reference products - the Preview of Products and Services; the Catalogue; the Dictionary; the Handbook and the Technical Reports - as well as geography reference products - GeoSuite and Reference Maps.
Product description
Technical Reports examine the quality of data from the 1996 Census, a large and complex undertaking. While considerable effort was taken to ensure high quality standards throughout each step, the results are subject to a certain degree of error. Each report looks at the collection and processing operations and presents results from data evaluation, as well as notes on historical comparability.
Technical Reports are aimed at moderate and sophisticated users but are written in a manner which could make them useful to all census data users. Most of the technical reports have been cancelled, with the exception of Age, Sex, Marital Status and Common-law Status, Coverage and Sampling and Weighting. These reports will be available as bilingual publications as well as being available in both official languages on the Internet as free products.
This report deals with coverage errors, which occurred when persons, households, dwellings or families were missed by the 1996 Census or enumerated in error. Coverage errors are one of the most important types of error since they affect not only the accuracy of the counts of the various census universes but also the accuracy of all of the census data describing the characteristics of these universes. With this information, users can determine the risks involved in basing conclusions or decisions on census data.
Release date: 1999-12-14
3. Labour Force Classification in the Survey of Labour and Income Dynamics (SLID): Evaluation of Test 3A Results Archived
Surveys and statistical programs – Documentation: 75F0002M1993014
Description:
This paper presents the results from test 3A of the Survey of Labour and Income Dynamics (SLID), conducted in January 1993, with a view to identify any necessary changes to the questions or to the algorithm used to derive labour force status.
Release date: 1995-12-30

Date modified: 2024-05-16
