Survey design

Results

All (266) (30 to 40 of 266 results)

  • Articles and reports: 12-001-X201700114817
    Description:

    We present research results on sample allocations for efficient model-based small area estimation in cases where the areas of interest coincide with the strata. Although model-assisted and model-based estimation methods are common in the production of small area statistics, utilization of the underlying model and estimation method is rarely included in the sample area allocation scheme. Therefore, we have developed a new model-based allocation named g1-allocation. For comparison, one recently developed model-assisted allocation is presented. These two allocations are based on an adjusted measure of homogeneity which is computed using an auxiliary variable and is an approximation of the intra-class correlation within areas. Five model-free area allocation solutions presented in the past are selected from the literature as reference allocations. Equal and proportional allocations need the number of areas and area-specific numbers of basic statistical units. The Neyman, Bankier and NLP (Non-Linear Programming) allocations need values for the study variable concerning area level parameters such as standard deviation, coefficient of variation or totals. In general, allocation methods can be classified according to the optimization criteria and use of auxiliary data. Statistical properties of the various methods are assessed through sample simulation experiments using real population register data. It can be concluded from simulation results that inclusion of the model and estimation method into the allocation method improves estimation results.

    Release date: 2017-06-22

  • Articles and reports: 12-001-X201600214660
    Description:

    In an economic survey of a sample of enterprises, occupations are randomly selected from a list until a number r of occupations in a local unit has been identified. This is an inverse sampling problem for which we are proposing a few solutions. Simple designs with and without replacement are processed using negative binomial distributions and negative hypergeometric distributions. We also propose estimators for when the units are selected with unequal probabilities, with or without replacement.

    Release date: 2016-12-20

  • Articles and reports: 12-001-X201600214662
    Description:

    Two-phase sampling designs are often used in surveys when the sampling frame contains little or no auxiliary information. In this note, we shed some light on the concept of invariance, which is often mentioned in the context of two-phase sampling designs. We define two types of invariant two-phase designs: strongly invariant and weakly invariant two-phase designs. Some examples are given. Finally, we describe the implications of strong and weak invariance from an inference point of view.

    Release date: 2016-12-20

  • Articles and reports: 12-001-X201600214684
    Description:

    This paper introduces an incomplete adaptive cluster sampling design that is easy to implement, controls the sample size well, and does not need to follow the neighbourhood. In this design, an initial sample is first selected, using one of the conventional designs. If a cell satisfies a prespecified condition, a specified radius around the cell is sampled completely. The population mean is estimated using the π-estimator. If all the inclusion probabilities are known, then an unbiased π-estimator is available; if, depending on the situation, the inclusion probabilities are not known for some of the final sample units, then they are estimated. To estimate the inclusion probabilities, a biased estimator is constructed. However, the simulations show that if the sample size is large enough, the error of the inclusion probabilities is negligible, and the relative π-estimator is almost unbiased. This design rivals adaptive cluster sampling because it controls the final sample size and is easy to manage. It rivals adaptive two-stage sequential sampling because it considers the cluster form of the population and reduces the cost of moving across the area. Using real data on a bird population and simulations, the paper compares the design with adaptive two-stage sequential sampling. The simulations show that the design has significant efficiency in comparison with its rival.

    Release date: 2016-12-20

  • Articles and reports: 18-001-X2016001
    Description:

    Although the record linkage of business data is not a completely new topic, the fact remains that the public and many data users are unaware of the programs and practices commonly used by statistical agencies across the world.

    This report is a brief overview of the main record linkage practices, programs and challenges at statistical agencies across the world that answered a short survey on this subject, supplemented by publicly available documentation produced by these agencies. The document shows that linkage practices are similar across these statistical agencies; the main differences lie in the procedures in place to access data and in the regulatory policies that govern record linkage permissions and the dissemination of data.

    Release date: 2016-10-27

  • Articles and reports: 89-648-X2016001
    Description:

    Linkages between survey and administrative data are an increasingly common practice, due in part to the reduced burden to respondents, and to the data that can be obtained at a relatively low cost. Historical linkage, or the linkage of administrative data from previous years to the year of the survey, compounds these benefits by providing additional years of data. This paper examines the Longitudinal and International Study of Adults (LISA), which was linked to historical tax data on personal income tax returns (T1) and those collected from employers’ files (T4), among others not mentioned in this paper. It presents trends in historical linkage rates, compares the coherence of administrative data between the T1 and T4, presents the ability to use the data to create balanced panels, and uses the T1 data to produce age-earnings profiles by sex. The results show that the historical linkage rate is high (over 90% in most cases) and stable over time for respondents who are likely to file a tax return, and that the T1 and T4 administrative sources show similar earnings. Moreover, long balanced panels of up to 30 years in length (at the time of writing) can be created using LISA administrative linkage data.

    Release date: 2016-08-18

  • Articles and reports: 11-522-X201700014745
    Description:

    In the design of surveys a number of parameters like contact propensities, participation propensities and costs per sample unit play a decisive role. In on-going surveys, these survey design parameters are usually estimated from previous experience and updated gradually with new experience. In new surveys, these parameters are estimated from expert opinion and experience with similar surveys. Although survey institutes have a fair expertise and experience, the postulation, estimation and updating of survey design parameters is rarely done in a systematic way. This paper presents a Bayesian framework to include and update prior knowledge and expert opinion about the parameters. This framework is set in the context of adaptive survey designs in which different population units may receive different treatment given quality and cost objectives. For this type of survey, the accuracy of design parameters becomes even more crucial to effective design decisions. The framework allows for a Bayesian analysis of the performance of a survey during data collection and in between waves of a survey. We demonstrate the Bayesian analysis using a realistic simulation study.

    Release date: 2016-03-24

  • Articles and reports: 12-001-X201500214229
    Description:

    Self-weighting estimation through equal probability selection methods (epsem) is desirable for variance efficiency. Traditionally, the epsem property for (one phase) two stage designs for estimating population-level parameters is realized by using each primary sampling unit (PSU) population count as the measure of size for PSU selection along with equal sample size allocation per PSU under simple random sampling (SRS) of elementary units. However, when self-weighting estimates are desired for parameters corresponding to multiple domains under a pre-specified sample allocation to domains, Folsom, Potter and Williams (1987) showed that a composite measure of size can be used to select PSUs to obtain epsem designs when besides domain-level PSU counts (i.e., distribution of domain population over PSUs), frame-level domain identifiers for elementary units are also assumed to be available. The term depsem-A will be used to denote such (one phase) two stage designs to obtain domain-level epsem estimation. Folsom et al. also considered two phase two stage designs when domain-level PSU counts are unknown, but whole PSU counts are known. For these designs (to be termed depsem-B) with PSUs selected proportional to the usual size measure (i.e., the total PSU count) at the first stage, all elementary units within each selected PSU are first screened for classification into domains in the first phase of data collection before SRS selection at the second stage. Domain-stratified samples are then selected within PSUs with suitably chosen domain sampling rates such that the desired domain sample sizes are achieved and the resulting design is self-weighting. In this paper, we first present a simple justification of composite measures of size for the depsem-A design and of the domain sampling rates for the depsem-B design. Then, for depsem-A and -B designs, we propose generalizations, first to cases where frame-level domain identifiers for elementary units are not available and domain-level PSU counts are only approximately known from alternative sources, and second to cases where PSU size measures are pre-specified based on other practical and desirable considerations of over- and under-sampling of certain domains. We also present a further generalization in the presence of subsampling of elementary units and nonresponse within selected PSUs at the first phase before selecting phase two elementary units from domains within each selected PSU. This final generalization of depsem-B is illustrated for an area sample of housing units.

    Release date: 2015-12-17

  • Articles and reports: 12-001-X201500214230
    Description:

    This paper develops allocation methods for stratified sample surveys where composite small area estimators are a priority, and areas are used as strata. Longford (2006) proposed an objective criterion for this situation, based on a weighted combination of the mean squared errors of small area means and a grand mean. Here, we redefine this approach within a model-assisted framework, allowing regressor variables and a more natural interpretation of results using an intra-class correlation parameter. We also consider several uses of power allocation, and allow the placing of other constraints such as maximum relative root mean squared errors for stratum estimators. We find that a simple power allocation can perform very nearly as well as the optimal design even when the objective is to minimize Longford’s (2006) criterion.

    Release date: 2015-12-17

  • Articles and reports: 12-001-X201500214237
    Description:

    Careful design of a dual-frame random digit dial (RDD) telephone survey requires selecting from among many options that have varying impacts on cost, precision, and coverage in order to obtain the best possible implementation of the study goals. One such consideration is whether to screen cell-phone households in order to interview cell-phone only (CPO) households and exclude dual-user households, or to take all interviews obtained via the cell-phone sample. We present a framework in which to consider the tradeoffs between these two options and a method to select the optimal design. We derive and discuss the optimum allocation of sample size between the two sampling frames and explore the choice of optimum p, the mixing parameter for the dual-user domain. We illustrate our methods using the National Immunization Survey, sponsored by the Centers for Disease Control and Prevention.

    Release date: 2015-12-17
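
Several abstracts in this listing concern how a fixed total sample is divided among strata or areas. The sketch below is not taken from any of the listed papers; it only illustrates the classical equal, proportional and Neyman rules named in 12-001-X201700114817, plus one simple form of power rule (allocation proportional to the stratum size raised to a power q), which is an assumption here and may differ from the definitions used in the papers. All function names and the example stratum sizes and standard deviations are hypothetical, and the allocations are left unrounded.

```python
from math import isclose


def equal_allocation(n, sizes):
    """Give every stratum the same share of the total sample n."""
    return [n / len(sizes) for _ in sizes]


def proportional_allocation(n, sizes):
    """Allocate n in proportion to the stratum population sizes N_h."""
    total = sum(sizes)
    return [n * N_h / total for N_h in sizes]


def neyman_allocation(n, sizes, sds):
    """Allocate n in proportion to N_h * S_h (size times standard deviation)."""
    weights = [N_h * S_h for N_h, S_h in zip(sizes, sds)]
    total = sum(weights)
    return [n * w / total for w in weights]


def power_allocation(n, sizes, q):
    """Assumed simple power rule: allocate n in proportion to N_h ** q.

    q = 0 reproduces equal allocation and q = 1 proportional allocation.
    """
    weights = [N_h ** q for N_h in sizes]
    total = sum(weights)
    return [n * w / total for w in weights]


# Hypothetical strata: population sizes and study-variable standard deviations.
sizes = [1000, 3000, 6000]
sds = [10.0, 20.0, 5.0]

for name, alloc in [
    ("equal", equal_allocation(100, sizes)),
    ("proportional", proportional_allocation(100, sizes)),
    ("Neyman", neyman_allocation(100, sizes, sds)),
    ("power, q=0.5", power_allocation(100, sizes, 0.5)),
]:
    print(name, [round(a, 1) for a in alloc])
```

In this made-up example the Neyman rule moves sample toward the middle stratum, which is both fairly large and the most variable, while the power rule with q between 0 and 1 sits between equal and proportional allocation.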

Analysis (266) (50 to 60 of 266 results)

  • Articles and reports: 12-001-X201300211884
    Description:

    This paper offers a solution to the problem of finding the optimal stratification of the available population frame, so as to ensure the minimization of the cost of the sample required to satisfy precision constraints on a set of different target estimates. The solution is sought by exploring the universe of all possible stratifications obtainable by cross-classifying the categorical auxiliary variables available in the frame (continuous auxiliary variables can be transformed into categorical ones by means of suitable methods). Therefore, the approach followed is multivariate with respect to both target and auxiliary variables. The proposed algorithm is based on a non-deterministic evolutionary approach, making use of the genetic algorithm paradigm. The key feature of the algorithm is in considering each possible stratification as an individual subject to evolution, whose fitness is given by the cost of the associated sample required to satisfy a set of precision constraints, the cost being calculated by applying the Bethel algorithm for multivariate allocation. This optimal stratification algorithm, implemented in an R package (SamplingStrata), has so far been applied to a number of current surveys at the Italian National Institute of Statistics: the obtained results always show significant improvements in the efficiency of the samples obtained, with respect to previously adopted stratifications.

    Release date: 2014-01-15

  • Articles and reports: 12-001-X201300111824
    Description:

    In most surveys all sample units receive the same treatment and the same design features apply to all selected people and households. In this paper, it is explained how survey designs may be tailored to optimize quality given constraints on costs. Such designs are called adaptive survey designs. The basic ingredients of such designs are introduced, discussed and illustrated with various examples.

    Release date: 2013-06-28

  • Articles and reports: 12-001-X201300111829
    Description:

    Indirect Sampling is used when the sampling frame is not the same as the target population, but related to the latter. The estimation process for Indirect Sampling is carried out using the Generalised Weight Share Method (GWSM), which is an unbiased procedure (see Lavallée 2002, 2007). For business surveys, Indirect Sampling is applied as follows: the sampling frame is one of establishments, while the target population is one of enterprises. Enterprises are selected through their establishments. This allows stratifying according to the establishment characteristics, rather than those associated with enterprises. Because the variables of interest of establishments are generally highly skewed (a small portion of the establishments covers the major portion of the economy), the GWSM results in unbiased estimates, but their variance can be large. The purpose of this paper is to suggest some adjustments to the weights to reduce the variance of the estimates in the context of skewed populations, while keeping the method unbiased. After a brief overview of Indirect Sampling and the GWSM, we describe the required adjustments to the GWSM. The estimates produced with these adjustments are compared to those from the original GWSM, via a small numerical example, and using real data originating from Statistics Canada's Business Register.

    Release date: 2013-06-28

  • Articles and reports: 12-001-X201200111682
    Description:

    Sample allocation issues are studied in the context of estimating sub-population (stratum or domain) means as well as the aggregate population mean under stratified simple random sampling. A non-linear programming method is used to obtain "optimal" sample allocation to strata that minimizes the total sample size subject to specified tolerances on the coefficient of variation of the estimators of strata means and the population mean. The resulting total sample size is then used to determine sample allocations for the methods of Costa, Satorra and Ventura (2004) based on compromise allocation and Longford (2006) based on specified "inferential priorities". In addition, we study sample allocation to strata when reliability requirements for domains, cutting across strata, are also specified. Performance of the three methods is studied using data from Statistics Canada's Monthly Retail Trade Survey (MRTS) of single establishments.

    Release date: 2012-06-27

  • Articles and reports: 12-001-X201100211606
    Description:

    This paper introduces a U.S. Census Bureau special compilation by presenting four other papers of the current issue: three papers from authors Tillé, Lohr and Thompson as well as a discussion paper from Opsomer.

    Release date: 2011-12-21

  • Articles and reports: 12-001-X201100211607
    Description:

    This paper describes recent developments in adaptive sampling strategies and introduces new variations on those strategies. Recent developments described include targeted random walk designs and adaptive web sampling. These designs are particularly suited for sampling in networks; for example, for finding a sample of people from a hidden human population by following social links from sample individuals to find additional members of the hidden population to add to the sample. Each of these designs can also be translated into spatial settings to produce flexible new spatial adaptive strategies for sampling unevenly distributed populations. Variations on these sampling strategies include versions in which the network or spatial links have unequal weights and are followed with unequal probabilities.

    Release date: 2011-12-21

  • Articles and reports: 12-001-X201100211608
    Description:

    Designs and estimators for the single frame surveys currently used by U.S. government agencies were developed in response to practical problems. Federal household surveys now face challenges of decreasing response rates and frame coverage, higher data collection costs, and increasing demand for small area statistics. Multiple frame surveys, in which independent samples are drawn from separate frames, can be used to help meet some of these challenges. Examples include combining a list frame with an area frame or using two frames to sample landline telephone households and cellular telephone households. We review point estimators and weight adjustments that can be used to analyze multiple frame surveys with standard survey software, and summarize construction of replicate weights for variance estimation. Because of their increased complexity, multiple frame surveys face some challenges not found in single frame surveys. We investigate misclassification bias in multiple frame surveys, and propose a method for correcting for this bias when misclassification probabilities are known. Finally, we discuss research that is needed on nonsampling errors with multiple frame surveys.

    Release date: 2011-12-21

  • Articles and reports: 12-001-X201100211609
    Description:

    This paper presents a review and assessment of the use of balanced sampling by means of the cube method. After defining the notion of balanced sample and balanced sampling, a short history of the concept of balancing is presented. The theory of the cube method is briefly presented. Emphasis is placed on the practical problems posed by balanced sampling: the interest of the method with respect to other sampling methods and calibration, the field of application, the accuracy of balancing, the choice of auxiliary variables and ways to implement the method.

    Release date: 2011-12-21

  • Articles and reports: 12-001-X201100211610
    Description:

    This paper discusses the three papers from the U.S. Census Bureau special compilation.

    Release date: 2011-12-21

  • Articles and reports: 89-648-X2011001
    Geography: Canada
    Description:

    In January 2006, a conference on longitudinal surveys hosted by Statistics Canada, the Social Sciences and Humanities Research Council of Canada (SSHRC) and the Canadian Institutes of Health Research (CIHR) concluded that Canada lacks a longitudinal survey which collects information on multiple subjects such as family, human capital, labour and health, and follows respondents for a long period of time. Following this conference, funds were received from the Policy Research Data Gaps fund (PRDG) to support a pilot survey for a new Canadian Household Panel Survey (CHPS-Pilot). Consultations on the design and content were held with academic and policy experts in 2007 and 2008, and a pilot survey was conducted in the fall of 2008. The objectives of the pilot survey were to (1) test a questionnaire, evaluate interview length and measure the quality of data collected; (2) evaluate several design features; and (3) test reactions to the survey from respondents and field workers. The pilot survey achieved a response rate of 76%, with a median household interview time of 64 minutes. Several innovative design features were tested, and found to be viable. Response to the survey, whether from respondents or interviewers, was generally positive. This paper highlights these and other results from the CHPS-Pilot.

    Release date: 2011-09-14
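
The multiple frame and dual-frame entries in this listing (12-001-X201100211608 and 12-001-X201500214237) refer to point estimators that combine samples from two overlapping frames using a mixing parameter p. As a rough illustration only, and not the estimator of either paper, the sketch below computes a Hartley-type dual-frame estimate of a total: units found only in one frame keep their full design weight, while units in the overlap domain are weighted by p in the first sample and by 1 - p in the second. The function name, the tuple layout and the example values are hypothetical.

```python
def dual_frame_total(sample_a, sample_b, p):
    """Hartley-type dual-frame estimate of a population total.

    Each sample is a list of (y, weight, in_overlap) tuples, where
    weight is the design weight within that frame and in_overlap
    marks units that belong to both frames.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("mixing parameter p must lie in [0, 1]")

    total = 0.0
    for y, weight, in_overlap in sample_a:
        # Overlap units from frame A are down-weighted by p.
        total += weight * y * (p if in_overlap else 1.0)
    for y, weight, in_overlap in sample_b:
        # Overlap units from frame B receive the complementary factor.
        total += weight * y * ((1.0 - p) if in_overlap else 1.0)
    return total


# Hypothetical data: frame A (e.g. landline) and frame B (e.g. cell phone).
sample_a = [(2.0, 10.0, False), (4.0, 10.0, True)]
sample_b = [(6.0, 5.0, True), (1.0, 5.0, False)]

print(dual_frame_total(sample_a, sample_b, p=0.5))
```

Setting p to 1 or 0 uses only one frame's sample for the overlap domain, which loosely corresponds to a screening design; intermediate values blend the two overlap estimates.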

Reference (1) (1 result)

  • Surveys and statistical programs – Documentation: 75F0002M1992001
    Description:

    Starting in 1994, the Survey of Labour and Income Dynamics (SLID) will follow individuals and families for at least six years, tracking their labour market experiences, changes in income and family circumstances. An initial proposal for the content of SLID, entitled "Content of the Survey of Labour and Income Dynamics: Discussion Paper", was distributed in February 1992.

    That paper served as a background document for consultation with and a review by interested users. The content underwent significant change during this process. Based upon the revised content, a large-scale test of SLID will be conducted in February and May 1993.

    The present document outlines the income and wealth content to be tested in May 1993. This document is really a continuation of SLID Research Paper Series 92-01A, which outlines the demographic and labour content used in the January/February 1993 test.

    Release date: 2008-02-29