Survey design

Results

All (266) (30 to 40 of 266 results)

  • Articles and reports: 12-001-X201700114817
    Description:

    We present research results on sample allocations for efficient model-based small area estimation in cases where the areas of interest coincide with the strata. Although model-assisted and model-based estimation methods are common in the production of small area statistics, the underlying model and estimation method are rarely taken into account in the area sample allocation scheme. Therefore, we have developed a new model-based allocation named g1-allocation. For comparison, one recently developed model-assisted allocation is presented. These two allocations are based on an adjusted measure of homogeneity which is computed using an auxiliary variable and is an approximation of the intra-class correlation within areas. Five model-free area allocation solutions presented in the past are selected from the literature as reference allocations. Equal and proportional allocations need the number of areas and area-specific numbers of basic statistical units. The Neyman, Bankier and NLP (Non-Linear Programming) allocations need values for the study variable concerning area-level parameters such as standard deviation, coefficient of variation or totals. In general, allocation methods can be classified according to the optimization criteria and use of auxiliary data. Statistical properties of the various methods are assessed through sample simulation experiments using real population register data. It can be concluded from simulation results that inclusion of the model and estimation method into the allocation method improves estimation results.

    Release date: 2017-06-22
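
The three simplest reference allocations named in the description above can be sketched as follows. This is a minimal illustration of the textbook equal, proportional and Neyman rules only, not the g1-allocation or any code from the paper; the function names and the example stratum sizes and standard deviations are assumptions made for illustration.

```python
def equal_allocation(sizes, n):
    """Give every stratum (area) the same share of the total sample n."""
    return [n / len(sizes)] * len(sizes)


def proportional_allocation(sizes, n):
    """Allocate n in proportion to the stratum population sizes N_h."""
    total = sum(sizes)
    return [n * N / total for N in sizes]


def neyman_allocation(sizes, std_devs, n):
    """Allocate n in proportion to N_h * S_h (size times standard deviation)."""
    weights = [N * s for N, s in zip(sizes, std_devs)]
    total = sum(weights)
    return [n * w / total for w in weights]


# Hypothetical strata: the small stratum is the most variable one.
sizes = [1000, 3000, 6000]
std_devs = [30.0, 10.0, 5.0]

print(equal_allocation(sizes, 300))
print(proportional_allocation(sizes, 300))
print(neyman_allocation(sizes, std_devs, 300))
```

The results are real-valued; in practice they would be rounded and bounded by the stratum sizes. The sketch also shows why the Neyman rule needs study-variable information (the standard deviations) that the equal and proportional rules do not.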

  • Articles and reports: 12-001-X201600214660
    Description:

    In an economic survey of a sample of enterprises, occupations are randomly selected from a list until a number r of occupations in a local unit has been identified. This is an inverse sampling problem for which we are proposing a few solutions. Simple designs with and without replacement are processed using negative binomial distributions and negative hypergeometric distributions. We also propose estimators for when the units are selected with unequal probabilities, with or without replacement.

    Release date: 2016-12-20
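
To make the inverse sampling setting above concrete, here is a minimal sketch of the simplest case: equal-probability draws with replacement that stop once r units of interest have been found, so that the number of draws follows a negative binomial distribution. The estimator (r - 1) / (n - 1) is the standard textbook unbiased estimator of the proportion for this case; it is not claimed to be one of the estimators proposed in the paper, and the function names are assumptions.

```python
import random


def inverse_sample(p, r, rng):
    """Draw with replacement until r successes occur; return the number of draws.

    p is the (normally unknown) proportion of units of interest; it is only
    used here to simulate the draws.
    """
    draws = 0
    successes = 0
    while successes < r:
        draws += 1
        if rng.random() < p:
            successes += 1
    return draws


def estimate_proportion(r, n_draws):
    """Unbiased estimator (r - 1) / (n - 1) of p; requires r >= 2."""
    if r < 2 or n_draws < r:
        raise ValueError("need r >= 2 and at least r draws")
    return (r - 1) / (n_draws - 1)


rng = random.Random(2016)
n_draws = inverse_sample(p=0.2, r=5, rng=rng)
print(n_draws, estimate_proportion(5, n_draws))
```

The naive ratio r / n is biased upward under this stopping rule, which is why the corrected form is used.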

  • Articles and reports: 12-001-X201600214662
    Description:

    Two-phase sampling designs are often used in surveys when the sampling frame contains little or no auxiliary information. In this note, we shed some light on the concept of invariance, which is often mentioned in the context of two-phase sampling designs. We define two types of invariant two-phase designs: strongly invariant and weakly invariant two-phase designs. Some examples are given. Finally, we describe the implications of strong and weak invariance from an inference point of view.

    Release date: 2016-12-20

  • Articles and reports: 12-001-X201600214684
    Description:

    This paper introduces an incomplete adaptive cluster sampling design that is easy to implement, controls the sample size well, and does not need to follow the neighbourhood. In this design, an initial sample is first selected, using one of the conventional designs. If a cell satisfies a prespecified condition, a specified radius around the cell is sampled completely. The population mean is estimated using the \pi-estimator. If all the inclusion probabilities are known, then an unbiased \pi estimator is available; if, depending on the situation, the inclusion probabilities are not known for some of the final sample units, then they are estimated. To estimate the inclusion probabilities, a biased estimator is constructed. However, the simulations show that if the sample size is large enough, the error of the inclusion probabilities is negligible, and the relative \pi-estimator is almost unbiased. This design rivals adaptive cluster sampling because it controls the final sample size and is easy to manage. It rivals adaptive two-stage sequential sampling because it considers the cluster form of the population and reduces the cost of moving across the area. Using real data on a bird population and simulations, the paper compares the design with adaptive two-stage sequential sampling. The simulations show that the design has significant efficiency in comparison with its rival.

    Release date: 2016-12-20

  • Articles and reports: 18-001-X2016001
    Description:

    Although the record linkage of business data is not a completely new topic, the fact remains that the public and many data users are unaware of the programs and practices commonly used by statistical agencies across the world.

    This report is a brief overview of the main record linkage practices, programs and challenges of statistical agencies across the world that answered a short survey on this subject, supplemented by publicly available documentation produced by these agencies. The document shows that linkage practices are similar across these statistical agencies; however, the main differences are in the procedures in place to access data and in the regulatory policies that govern record linkage permissions and the dissemination of data.

    Release date: 2016-10-27

  • Articles and reports: 89-648-X2016001
    Description:

    Linkages between survey and administrative data are an increasingly common practice, due in part to the reduced burden to respondents, and to the data that can be obtained at a relatively low cost. Historical linkage, or the linkage of administrative data from previous years to the year of the survey, compounds these benefits by providing additional years of data. This paper examines the Longitudinal and International Study of Adults (LISA), which was linked to historical tax data on personal income tax returns (T1) and those collected from employers’ files (T4), among others not mentioned in this paper. It presents trends in historical linkage rates, compares the coherence of administrative data between the T1 and T4, presents the ability to use the data to create balanced panels, and uses the T1 data to produce age-earnings profiles by sex. The results show that the historical linkage rate is high (over 90% in most cases) and stable over time for respondents who are likely to file a tax return, and that the T1 and T4 administrative sources show similar earnings. Moreover, long balanced panels of up to 30 years in length (at the time of writing) can be created using LISA administrative linkage data.

    Release date: 2016-08-18

  • Articles and reports: 11-522-X201700014745
    Description:

    In the design of surveys a number of parameters like contact propensities, participation propensities and costs per sample unit play a decisive role. In on-going surveys, these survey design parameters are usually estimated from previous experience and updated gradually with new experience. In new surveys, these parameters are estimated from expert opinion and experience with similar surveys. Although survey institutes have a fair expertise and experience, the postulation, estimation and updating of survey design parameters is rarely done in a systematic way. This paper presents a Bayesian framework to include and update prior knowledge and expert opinion about the parameters. This framework is set in the context of adaptive survey designs in which different population units may receive different treatment given quality and cost objectives. For this type of survey, the accuracy of design parameters becomes even more crucial to effective design decisions. The framework allows for a Bayesian analysis of the performance of a survey during data collection and in between waves of a survey. We demonstrate the Bayesian analysis using a realistic simulation study.

    Release date: 2016-03-24

  • Articles and reports: 12-001-X201500214229
    Description:

    Self-weighting estimation through equal probability selection methods (epsem) is desirable for variance efficiency. Traditionally, the epsem property for (one phase) two stage designs for estimating population-level parameters is realized by using each primary sampling unit (PSU) population count as the measure of size for PSU selection along with equal sample size allocation per PSU under simple random sampling (SRS) of elementary units. However, when self-weighting estimates are desired for parameters corresponding to multiple domains under a pre-specified sample allocation to domains, Folsom, Potter and Williams (1987) showed that a composite measure of size can be used to select PSUs to obtain epsem designs when besides domain-level PSU counts (i.e., distribution of domain population over PSUs), frame-level domain identifiers for elementary units are also assumed to be available. The term depsem-A will be used to denote such (one phase) two stage designs to obtain domain-level epsem estimation. Folsom et al. also considered two phase two stage designs when domain-level PSU counts are unknown, but whole PSU counts are known. For these designs (to be termed depsem-B) with PSUs selected proportional to the usual size measure (i.e., the total PSU count) at the first stage, all elementary units within each selected PSU are first screened for classification into domains in the first phase of data collection before SRS selection at the second stage. Domain-stratified samples are then selected within PSUs with suitably chosen domain sampling rates such that the desired domain sample sizes are achieved and the resulting design is self-weighting. In this paper, we first present a simple justification of composite measures of size for the depsem-A design and of the domain sampling rates for the depsem-B design. Then, for depsem-A and -B designs, we propose generalizations, first to cases where frame-level domain identifiers for elementary units are not available and domain-level PSU counts are only approximately known from alternative sources, and second to cases where PSU size measures are pre-specified based on other practical and desirable considerations of over- and under-sampling of certain domains. We also present a further generalization in the presence of subsampling of elementary units and nonresponse within selected PSUs at the first phase before selecting phase two elementary units from domains within each selected PSU. This final generalization of depsem-B is illustrated for an area sample of housing units.

    Release date: 2015-12-17

  • Articles and reports: 12-001-X201500214230
    Description:

    This paper develops allocation methods for stratified sample surveys where composite small area estimators are a priority, and areas are used as strata. Longford (2006) proposed an objective criterion for this situation, based on a weighted combination of the mean squared errors of small area means and a grand mean. Here, we redefine this approach within a model-assisted framework, allowing regressor variables and a more natural interpretation of results using an intra-class correlation parameter. We also consider several uses of power allocation, and allow the placing of other constraints such as maximum relative root mean squared errors for stratum estimators. We find that a simple power allocation can perform very nearly as well as the optimal design even when the objective is to minimize Longford’s (2006) criterion.

    Release date: 2015-12-17
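
The idea of a simple power allocation mentioned above can be sketched as follows, assuming its most basic form in which the stratum sample size is proportional to the stratum population size raised to a power q between 0 and 1. The paper's own formulation (which works within a model-assisted framework and allows additional constraints) may differ; the function name and the example sizes are hypothetical.

```python
def power_allocation(sizes, n, q):
    """Allocate total sample n with n_h proportional to N_h ** q.

    q = 1 reproduces proportional allocation, q = 0 reproduces equal
    allocation, and intermediate values compromise between the two.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q is expected to lie between 0 and 1")
    weights = [N ** q for N in sizes]
    total = sum(weights)
    return [n * w / total for w in weights]


# Hypothetical small areas of very different sizes.
sizes = [100, 400, 2500]
for q in (0.0, 0.5, 1.0):
    print(q, power_allocation(sizes, 80, q))
```

Lower values of q shift sample towards the smaller areas, which is why a power allocation is a natural candidate when small area estimates and the grand mean both matter.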

  • Articles and reports: 12-001-X201500214237
    Description:

    Careful design of a dual-frame random digit dial (RDD) telephone survey requires selecting from among many options that have varying impacts on cost, precision, and coverage in order to obtain the best possible implementation of the study goals. One such consideration is whether to screen cell-phone households in order to interview cell-phone only (CPO) households and exclude dual-user households, or to take all interviews obtained via the cell-phone sample. We present a framework in which to consider the tradeoffs between these two options and a method to select the optimal design. We derive and discuss the optimum allocation of sample size between the two sampling frames and explore the choice of optimum p, the mixing parameter for the dual-user domain. We illustrate our methods using the National Immunization Survey, sponsored by the Centers for Disease Control and Prevention.

    Release date: 2015-12-17

Data (0) (0 results)

No content available at this time.

Analysis (266) (240 to 250 of 266 results)

  • Articles and reports: 12-001-X198700114511
    Description:

    A new unequal probability sampling scheme for selecting n(> 2) units without replacement from a finite population is proposed. This scheme ensures that the inclusion probabilities are proportional to sizes. It has the advantage of simplicity in selection and estimation and also provides a non-negative variance estimator. The variance of the Horvitz-Thompson (H-T) estimator under the proposed scheme is shown to be smaller than that of the customary estimator in probability proportional to size sampling with replacement. The proposed scheme also compares favourably with the without replacement scheme suggested by Sampford (1967) in an empirical study on a few natural populations.

    Release date: 1987-06-15
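
The two ingredients named in the description above, inclusion probabilities proportional to size and the Horvitz-Thompson estimator, can be illustrated with a short sketch. It does not implement the selection scheme proposed in the paper (the abstract does not describe it in enough detail); it only computes the target inclusion probabilities and the estimator for a sample that is assumed to have been drawn already. All names and numbers are hypothetical.

```python
def inclusion_probabilities(sizes, n):
    """Target inclusion probabilities pi_i = n * x_i / sum(x) for a sample of n units."""
    total = sum(sizes)
    probs = [n * x / total for x in sizes]
    if any(p > 1.0 for p in probs):
        raise ValueError("a unit is too large: its inclusion probability exceeds 1")
    return probs


def horvitz_thompson_total(values, probs):
    """Horvitz-Thompson estimate of a population total: sum of y_i / pi_i over the sample."""
    return sum(y / p for y, p in zip(values, probs))


# Hypothetical population of four units with known size measures.
sizes = [10, 20, 30, 40]
probs = inclusion_probabilities(sizes, n=2)

# Suppose units 0 and 3 were selected and their study values observed.
sample_values = [12.0, 44.0]
sample_probs = [probs[0], probs[3]]
print(horvitz_thompson_total(sample_values, sample_probs))
```

The inclusion probabilities sum to the sample size n, and the estimator is most efficient when the study variable is roughly proportional to the size measure.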

  • Articles and reports: 12-001-X198700114512
    Description:

    The Health and Activity Limitation Survey is part of the program to establish a data base on the disabled population in Canada. The sample design used for the part of the survey covering the population not living in institutions is described. In addition, the methods used to determine the sizes of the samples and to select the samples are presented.

    Release date: 1987-06-15

  • Articles and reports: 12-001-X198600214446
    Description:

    This paper discusses the influence of the sampling design on the estimation of a linear regression model. Particularly, sampling designs will be discussed which are dependent on the values of the endogenous variable in the population: endogenous (or “informative”) designs. A consistent estimator of the regression coefficients is given. Its variance is the sum of a sampling design component and a disturbance term component. Also, model-free regression is briefly discussed. The model-free regression estimator is the same as the model estimator in the case of an endogenous design.

    Release date: 1986-12-15

  • Articles and reports: 12-001-X198600214450
    Description:

    From an annual sample of U.S. corporate tax returns, the U.S. Internal Revenue Service provides estimates of population and subpopulation totals for several hundred financial items. The basic sample design is highly stratified and fairly complex. Starting with the 1981 and 1982 samples, the design was altered to include a double sampling procedure. This was motivated by the need for better allocation of resources, in an environment of shrinking budgets. Items not observed in the subsample are predicted, using a modified hot deck imputation procedure. The present paper describes the design, estimation, and evaluation of the effects of the new procedure.

    Release date: 1986-12-15

  • Articles and reports: 12-001-X198500214372
    Description:

    The use of a multivariate clustering algorithm to perform stratification for the Labour Force Survey is described. The algorithm developed by Friedman and Rubin (1967) is modified to allow the formation of geographically contiguous strata and to delineate heterogeneous but compact primary sampling units (PSUs) within these strata. Studies dealing with stratification variables, stratification robustness over time, and type of stratification are described.

    Release date: 1985-12-16

  • Articles and reports: 12-001-X198500114365
    Description:

    The cost-variance optimization of the design of the Canadian Labour Force Survey was carried out in two steps. First, the sample designs were optimized for each of the two major area types, the Self-Representing (SR) and the Non-Self-Representing (NSR) areas. Cost models were developed and parameters estimated from a detailed field study and by simulation, while variances were estimated using data from the Census of Population. The scope of the optimization included the allocation of sample to the two stages in the SR design, and the consideration of two alternatives to the old design in NSR areas. The second stage of optimization was the allocation of sample to SR and NSR areas.

    Release date: 1985-06-14

  • Articles and reports: 12-001-X198400214353
    Description:

    Following each decennial population census, the Canadian Labour Force Survey (CLFS) has undergone a sample redesign to reflect changes in population characteristics and to respond to changes in information needs. The current redesign program which culminated with introduction of a new sample at the beginning of 1985 included extensive research into improved sample design, data collection and estimation methodologies, highlights of which are described.

    Release date: 1984-12-14

  • Articles and reports: 12-001-X198400214357
    Description:

    A finite population of size N is supposed to contain M (unknown) units of a specified category A (say) constituting a domain with mean \mu. A procedure which involves drawing units using simple random sampling without replacement till a preassigned number of members of the domain is reached is proposed. An unbiased estimator of \mu is also derived. This is seen to be superior to the corresponding possibly biased estimator based on a comparable SRSWOR scheme with a fixed number of draws. The proposed scheme is also shown to admit unbiased estimators of M and the domain total T.

    Release date: 1984-12-14

  • Articles and reports: 12-001-X198300214341
    Description:

    Cost models to determine an optimum allocation of the sample among stages in cluster samples are considered. Results from a proposed cost model, which directly considers the implications of follow-up visits to sample clusters as well as other travel to and from the field by data collectors, are compared with results from existing cost models. The proposed model generally calls for fewer clusters with more elements selected per cluster than the existing models.

    Release date: 1983-12-15
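
As background to the comparison described above, the classical two-stage cost model can be sketched as follows. It assumes total cost C = c1 * n + c2 * n * m for n clusters with m elements each, and a design effect of 1 + (m - 1) * rho, which gives the well-known optimum cluster take m = sqrt((c1 / c2) * (1 - rho) / rho). This is the simple textbook model, not the cost model proposed in the paper (which also accounts for follow-up visits and travel); the parameter names and values are assumptions.

```python
import math


def optimum_cluster_take(c1, c2, rho):
    """Optimum number of elements per cluster under the simple linear cost model.

    c1: cost of adding one cluster, c2: cost of adding one element,
    rho: intra-class correlation (0 < rho < 1).
    """
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    return math.sqrt((c1 / c2) * (1.0 - rho) / rho)


def clusters_for_budget(budget, c1, c2, m):
    """Number of clusters affordable when each cluster contributes m elements."""
    return budget / (c1 + c2 * m)


m_opt = optimum_cluster_take(c1=90.0, c2=10.0, rho=0.1)
print(m_opt, clusters_for_budget(1800.0, 90.0, 10.0, m_opt))
```

Under this model, a higher per-cluster cost relative to the per-element cost pushes the optimum towards fewer clusters with more elements each, the same direction the abstract reports for the proposed model.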

  • Articles and reports: 12-001-X198300214343
    Description:

    The oil crisis of the mid-1970s triggered a new awareness among Canadians of the importance of energy conservation. The resulting government programs in the transportation sector demanded basic data about on-the-road fuel consumption by motor vehicles operating in Canadian conditions. This paper describes the Passenger Car Fuel Consumption Survey which was developed jointly by Statistics Canada and Transport Canada to meet this need. The methodology of the survey is described and some examples of the results are presented. The paper concludes with some speculation about future directions for the survey and for vehicle-usage statistics in general.

    Release date: 1983-12-15

Reference (1) (1 result)

  • Surveys and statistical programs – Documentation: 75F0002M1992001
    Description:

    Starting in 1994, the Survey of Labour and Income Dynamics (SLID) will follow individuals and families for at least six years, tracking their labour market experiences, changes in income and family circumstances. An initial proposal for the content of SLID, entitled "Content of the Survey of Labour and Income Dynamics : Discussion Paper", was distributed in February 1992.

    That paper served as a background document for consultation with and a review by interested users. The content underwent significant change during this process. Based upon the revised content, a large-scale test of SLID will be conducted in February and May 1993.

    The present document outlines the income and wealth content to be tested in May 1993. It is a continuation of SLID Research Paper Series 92-01A, which outlines the demographic and labour content used in the January/February 1993 test.

    Release date: 2008-02-29