Survey design

Results

All (266): 100 to 110 of 266 results

  • Articles and reports: 11-522-X20050019443
    Description:

    A large part of sample survey theory has been directly motivated by practical problems encountered in the design and analysis of sample surveys. On the other hand, sample survey theory has influenced practice, often leading to significant improvements. This paper will examine this interplay over the past 60 years or so.

    Release date: 2007-03-02

  • Articles and reports: 11-522-X20050019466
    Description:

    A class of estimators based on the dependency structure of a multivariate variable of interest and the survey design is defined. A Monte Carlo simulation shows that adopting the estimator corresponding to the population structure is more efficient than the alternatives.

    Release date: 2007-03-02

  • Articles and reports: 11-522-X20050019468
    Description:

    At the time of recruitment, the participants in a longitudinal survey are chosen to be representative of a population. As time goes on, typically some of the participants will drop out, and dropout may be informative in the sense of depending on the response variables of interest. However, even if dropout is minimal, the participants who continue to the second and third waves of a longitudinal survey may differ from those they supposedly represent in subtle ways. It is clearly important to take such possibilities into account when designing and analyzing longitudinal survey data before and after an intervention.

    Release date: 2007-03-02

  • Articles and reports: 12-001-X20060029546
    Description:

    We discuss methods for the analysis of case-control studies in which the controls are drawn using a complex sample survey. The most straightforward method is the standard survey approach based on weighted versions of population estimating equations. We also look at more efficient methods and compare their robustness to model mis-specification in simple cases. Case-control family studies, where the within-cluster structure is of interest in its own right, are also discussed briefly.

    Release date: 2006-12-21

  • Articles and reports: 12-001-X20060029550
    Description:

    In this paper, the geometric, optimization-based, and Lavallée and Hidiroglou (LH) approaches to stratification are compared. The geometric stratification method is an approximation, whereas the other two approaches, which perform stratification numerically, may be seen as optimal stratification methods. The geometric stratification algorithm is very simple compared with the other two, but it does not allow for the construction of a take-all stratum, which is usually formed when a positively skewed population is stratified. Optimization-based stratification, by contrast, admits any form of objective function and constraints. In a comparative numerical study based on five positively skewed artificial populations, the optimization approach was more efficient than geometric stratification in every case studied. The geometric and optimization approaches were also compared with the LH algorithm: geometric stratification proved less efficient than the LH algorithm, whereas the optimization approach was similar in efficiency to it. Nevertheless, stratum boundaries obtained by geometric stratification can serve as efficient starting points for the optimization approach.

    Release date: 2006-12-21

  • Articles and reports: 12-001-X20060029552
    Description:

    A survey of tourist visits to Brittany, originating both within and outside the region, was needed. For practical reasons, "border surveys" could no longer be used. The major problem is the lack of a sampling frame allowing direct contact with tourists. This problem was addressed with indirect sampling, the weights being obtained by the generalized weight share method developed by Lavallée (1995, 2002) and Deville (1999) and also presented in Lavallée and Caron (2001). This article shows how to adapt the method to the survey. A number of extensions are required; one of them, designed to estimate the total of a population from which a Bernoulli sample has been taken, is developed here.

    Release date: 2006-12-21

  • Articles and reports: 12-001-X20060029553
    Description:

    Félix-Medina and Thompson (2004) proposed a variant of link-tracing sampling in which it is assumed that a portion of the population, not necessarily the major portion, is covered by a frame of disjoint sites where members of the population can be found with high probability. A sample of sites is selected, and the people at each selected site are asked to nominate other members of the population. The authors proposed maximum likelihood estimators of the population sizes that perform acceptably provided that, for each site, the probability that a member is nominated by that site (the nomination probability) is not small. In this research we consider Félix-Medina and Thompson's variant and propose three sets of estimators of the population sizes derived under the Bayesian approach. Two of the sets were obtained using improper prior distributions for the population sizes, and the third using Poisson priors. However, we use the Bayesian approach only to assist in constructing estimators; inferences about the population sizes are made under the frequentist approach. We propose two types of partly design-based variance estimators and confidence intervals: one obtained by bootstrap and the other by the delta method together with an assumption of asymptotic normality. The results of a simulation study indicate that (i) when the nomination probabilities are not small, each proposed set of estimators performs well and very similarly to the maximum likelihood estimators; (ii) when the nomination probabilities are small, the set of estimators derived using Poisson priors still performs acceptably and does not suffer from the bias problems of the maximum likelihood estimators; and (iii) these results do not depend on the fraction of the population covered by the frame.

    Release date: 2006-12-21

  • Articles and reports: 12-001-X20060029554
    Description:

    Survey sampling to estimate a Consumer Price Index (CPI) is quite complicated, generally requiring a combination of data from at least two surveys: one giving prices, one giving expenditure weights. Fundamentally different approaches to the sampling process - probability sampling and purposive sampling - have each been strongly advocated and are used by different countries in the collection of price data. By constructing a small "world" of purchases and prices from scanner data on cereal and then simulating various sampling and estimation techniques, we compare the results of two design and estimation approaches: the probability approach of the United States and the purposive approach of the United Kingdom. For the same amount of information collected, but given the use of different estimators, the United Kingdom's methods appear to offer better overall accuracy in targeting a population superlative consumer price index.

    Release date: 2006-12-21

  • Articles and reports: 12-001-X20060029555
    Description:

    Researchers and policy makers often use data from nationally representative probability sample surveys. The number of topics covered by such surveys, and hence the amount of interviewing time involved, have typically increased over the years, resulting in increased costs and respondent burden. A potential solution to this problem is to carefully form subsets of the items in a survey and administer one such subset to each respondent. Designs of this type are called "split-questionnaire" designs or "matrix sampling" designs. The administration of only a subset of the survey items to each respondent in a matrix sampling design creates what can be considered missing data. Multiple imputation (Rubin 1987), a general-purpose approach developed for handling data with missing values, is appealing for the analysis of data from a matrix sample, because once the multiple imputations are created, data analysts can apply standard methods for analyzing complete data from a sample survey. This paper develops and evaluates a method for creating matrix sampling forms, each form containing a subset of items to be administered to randomly selected respondents. The method can be applied in complex settings, including situations in which skip patterns are present. Forms are created in such a way that each form includes items that are predictive of the excluded items, so that subsequent analyses based on multiple imputation can recover some of the information about the excluded items that would have been collected had there been no matrix sampling. The matrix sampling and multiple-imputation methods are evaluated using data from the National Health and Nutrition Examination Survey, one of many nationally representative probability sample surveys conducted by the National Center for Health Statistics, Centers for Disease Control and Prevention. The study demonstrates the feasibility of the approach applied to a major national health survey with complex structure, and it provides practical advice about appropriate items to include in matrix sampling designs in future surveys.

    Release date: 2006-12-21
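The core of a split-questionnaire design as described above is the assignment of item subsets to forms. A minimal sketch (item and block names are hypothetical, and the paper's actual form-construction method additionally accounts for skip patterns and for how well retained items predict excluded ones):

```python
def split_questionnaire(items, core, n_forms):
    """Assign non-core items to rotating blocks; every form keeps the
    core items so that imputation models share common predictors."""
    rest = [it for it in items if it not in core]
    blocks = [rest[i::n_forms] for i in range(n_forms)]
    return [list(core) + b for b in blocks]

forms = split_questionnaire(
    items=["age", "income", "q1", "q2", "q3", "q4", "q5", "q6"],
    core=["age", "income"],
    n_forms=3,
)
```

Each respondent would then be administered one randomly selected form, with the off-form items treated as planned missing data for multiple imputation.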

  • Articles and reports: 12-001-X20060019256
    Description:

    In some situations the sample design of a survey is rather complex, consisting of fundamentally different designs in different domains. The design effect for estimates based upon the total sample is a weighted sum of the domain-specific design effects. We derive these weights under an appropriate model and illustrate their use with data from the European Social Survey (ESS).

    Release date: 2006-07-20
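The weighted-sum relationship in the ESS entry above can be written down directly; the domain weights themselves are what that paper derives under its model, so here they are simply assumed given:

```python
def overall_design_effect(domain_deffs, weights):
    """Design effect for the total sample as a weighted sum of
    domain-specific design effects (weights assumed to sum to 1)."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * d for w, d in zip(weights, domain_deffs))

# Two domains: a clustered design (deff 2.0) and an unclustered one (deff 1.1).
deff = overall_design_effect([2.0, 1.1], [0.4, 0.6])
```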
Data (0): no results

Analysis (266): 0 to 10 of 266 results

  • Articles and reports: 75F0002M2024005
    Description: The Canadian Income Survey (CIS) has introduced improvements to the methods and data sources used to produce income and poverty estimates with the release of its 2022 reference year estimates. Foremost among these improvements is a significant increase in the sample size for a large subset of the CIS content. The weighting methodology was also improved and the target population of the CIS was changed from persons aged 16 years and over to persons aged 15 years and over. This paper describes the changes made and presents the approximate net result of these changes on the income estimates and data quality of the CIS using 2021 data. The changes described in this paper highlight the ways in which data quality has been improved while having little impact on key CIS estimates and trends.
    Release date: 2024-04-26

  • Articles and reports: 11-522-X202200100010
    Description: Growing Up in Québec is a longitudinal population survey that began in the spring of 2021 at the Institut de la statistique du Québec. Among the children targeted by this longitudinal follow-up, some will experience developmental difficulties at some point in their lives. Those same children often have characteristics associated with higher sample attrition (low-income family, parents with a low level of education). This article describes the two main challenges we encountered when trying to ensure sufficient representativeness of these children, in both the overall results and the subpopulation analyses.
    Release date: 2024-03-25

  • Articles and reports: 12-001-X202300200001
    Description: When a Medicare healthcare provider is suspected of billing abuse, a population of payments X made to that provider over a fixed timeframe is isolated. A certified medical reviewer, in a time-consuming process, can determine the overpayment Y = X - (amount justified by the evidence) associated with each payment. Typically, there are too many payments in the population to examine each with care, so a probability sample is selected. The sample overpayments are then used to calculate a 90% lower confidence bound for the total population overpayment. This bound is the amount demanded for recovery from the provider. Unfortunately, classical methods for calculating this bound sometimes fail to provide the 90% confidence level, especially when using a stratified sample.

    In this paper, 166 redacted samples from Medicare integrity investigations are displayed and described, along with 156 associated payment populations. The 7,588 examined (Y, X) sample pairs show that (1) Medicare audits have high error rates: more than 76% of these payments were considered to have been paid in error; and (2) the patterns in these samples support an “All-or-Nothing” mixture model for (Y, X) previously defined in the literature. Model-based Monte Carlo testing procedures for Medicare sampling plans are discussed, as well as stratification methods based on anticipated model moments. In terms of viability (achieving the 90% confidence level), a new stratification method defined here is competitive with the best of the many existing methods tested and seems less sensitive to the choice of operating parameters. In terms of overpayment recovery (equivalent to precision), the new method is also comparable to the best of the existing methods tested. Unfortunately, no stratification algorithm tested was viable for more than about half of the 104 test populations.
    Release date: 2024-01-03
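The classical bound this entry refers to is a one-sided normal-approximation lower limit on the stratified expansion estimate of the total overpayment. A sketch of that classical calculation (this is the method the paper shows can undercover, not the paper's proposed stratification improvement):

```python
import math
from statistics import mean, stdev

def stratified_lower_bound(strata, z=1.282):
    """Classical one-sided 90% lower confidence bound for a population
    total from a stratified sample.

    strata: list of (N_h, sample_values_h) pairs, where N_h is the
    stratum population size and sample_values_h the sampled overpayments.
    z = 1.282 is roughly the 90th percentile of the standard normal.
    """
    total, var = 0.0, 0.0
    for N_h, y_h in strata:
        n_h = len(y_h)
        total += N_h * mean(y_h)
        if n_h > 1:
            # Variance of the stratum total with finite population correction.
            var += N_h ** 2 * (1 - n_h / N_h) * stdev(y_h) ** 2 / n_h
    return total - z * math.sqrt(var)

# One stratum of 100 payments, three sampled overpayments of $10 each:
# with zero sample variance the bound equals the $1,000 point estimate.
lb = stratified_lower_bound([(100, [10.0, 10.0, 10.0])])
```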

  • Articles and reports: 12-001-X202300200006
    Description: Survey researchers are increasingly turning to multimode data collection to deal with declining survey response rates and increasing costs. An efficient approach offers the less costly modes (e.g., web) first, followed by a more expensive mode for a subsample of the units (e.g., households) within each primary sampling unit (PSU). We present two alternatives to this traditional design. The first subsamples PSUs rather than units to constrain costs. The second is a hybrid design that combines a clustered (two-stage) sample with an independent, unclustered sample. Using a simulation, we demonstrate that the hybrid design has considerable advantages.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200008
    Description: In this article, we use a slightly simplified version of the method by Fickus, Mixon and Poteet (2013) to define a flexible parameterization of the kernels of determinantal sampling designs with fixed first-order inclusion probabilities. For specific values of the multidimensional parameter, we recover a matrix from the family PII of Loonis and Mary (2019). We conjecture that, among determinantal designs with fixed inclusion probabilities, the minimum variance of the Horvitz and Thompson (1952) estimator of a variable of interest is attained relative to PII. We provide experimental R programs that facilitate understanding of the various concepts presented in the article, some of which are described as non-trivial by Fickus et al. (2013). A longer version of this article, including proofs and a more detailed presentation of determinantal designs, is also available.
    Release date: 2024-01-03
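The Horvitz and Thompson (1952) estimator mentioned in the entry above weights each sampled value by the inverse of its first-order inclusion probability:

```python
def horvitz_thompson_total(y_sample, pi_sample):
    """Horvitz-Thompson estimator of a population total: each observed
    value is weighted by the inverse of its inclusion probability."""
    return sum(y / pi for y, pi in zip(y_sample, pi_sample))

# Two units observed, each included with probability 0.5:
# the estimated total is (10 + 20) / 0.5 = 60.
t_hat = horvitz_thompson_total([10.0, 20.0], [0.5, 0.5])
```

The variance of this estimator depends on the design's second-order inclusion probabilities, which is what makes the choice among determinantal designs with the same first-order probabilities consequential.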

  • Articles and reports: 12-001-X202300200010
    Description: Sample coordination methods aim to increase (in positive coordination) or decrease (in negative coordination) the overlap between samples. The samples considered can be from different occasions of a repeated survey and/or from different surveys covering a common population. Negative coordination is used to control the response burden in a given period, because some units will not respond to survey questionnaires if they are selected in many samples. Usually, sample coordination methods do not take into account any measure of the response burden that a unit has already expended in responding to previous surveys. We introduce such a measure into a new method by adapting a spatially balanced sampling scheme, based on a generalization of Poisson sampling, together with a negative coordination method. The goal is to create a double control of the burden for these units: once by using a measure of burden during the sampling process and once by using a negative coordination method. We evaluate the approach using Monte Carlo simulation and investigate its use for controlling selection “hot-spots” in business surveys at Statistics Netherlands.
    Release date: 2024-01-03
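A Poisson scheme of the kind generalized in the entry above draws each unit independently with its own inclusion probability. The sketch below adds a purely illustrative burden down-weighting rule: the scaling `1 / (1 + alpha * burden)` is an assumption for illustration, not the paper's scheme, and the spatial-balance and negative-coordination components are omitted entirely:

```python
import random

def burden_adjusted_poisson(base_probs, burden, alpha=0.5, seed=1):
    """Poisson sampling: each unit enters the sample independently,
    with its base inclusion probability scaled down by the response
    burden it has already accumulated (illustrative rule)."""
    rng = random.Random(seed)
    sample = []
    for unit, p in base_probs.items():
        p_adj = p / (1.0 + alpha * burden.get(unit, 0.0))
        if rng.random() < p_adj:
            sample.append(unit)
    return sample
```

Units with a long recent response history get smaller adjusted probabilities, implementing the "double control" idea at the sampling step.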

  • Articles and reports: 12-001-X202300200016
    Description: In this discussion, I will present some additional aspects of three major areas of survey theory developed or studied by Jean-Claude Deville: calibration, balanced sampling and the generalized weight-share method.
    Release date: 2024-01-03

  • Articles and reports: 75F0002M2023005
    Description: The Canadian Income Survey (CIS) has introduced improvements to the methods and systems used to produce income estimates with the release of its 2021 reference year estimates. This paper describes the changes and presents the approximate net result of these changes on income estimates using data for 2019 and 2020. The changes described in this paper highlight the ways in which data quality has been improved while producing minimal impact on key CIS estimates and trends.
    Release date: 2023-08-29

  • Articles and reports: 12-001-X202300100009
    Description: In this paper, with- and without-replacement versions of adaptive proportional-to-size sampling are presented. Unbiased estimators are developed for these methods and their properties are studied. In both versions, the drawing probabilities are adapted during the sampling process based on the observations already selected. In the with-replacement version, after each draw and observation of the variable of interest, the vector of the auxiliary variable is updated using the observed values of the variable of interest to approximate the exact selection probability proportional to size. In the without-replacement version, the relationship between the variable of interest and the auxiliary variable is first modelled using an initial sample; this relationship is then used to estimate the unobserved population units, and a new without-replacement sample proportional to size is selected from these estimated units. These approaches can significantly improve the efficiency of the designs not only when there is a positive linear relationship between the variables, but also when the relationship is non-linear or negative linear. We investigate the efficiency of the designs through simulations and through real case studies on medicinal flowers and on social and economic data.
    Release date: 2023-06-30
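The baseline that both adaptive versions described above build on is ordinary probability-proportional-to-size selection. A with-replacement sketch using cumulative totals of the auxiliary variable (function and variable names are illustrative; the adaptive updating of the auxiliary vector between draws is not shown):

```python
import random

def pps_with_replacement(aux_sizes, n_draws, seed=0):
    """Draw n_draws units with replacement, each draw selecting a unit
    with probability proportional to its auxiliary size measure."""
    rng = random.Random(seed)
    units = list(aux_sizes)
    total = sum(aux_sizes.values())
    # Cumulative selection thresholds on the unit interval.
    cum, s = [], 0.0
    for u in units:
        s += aux_sizes[u] / total
        cum.append(s)
    sample = []
    for _ in range(n_draws):
        r = rng.random()
        sample.append(next((u for u, c in zip(units, cum) if r <= c), units[-1]))
    return sample
```

In the adaptive with-replacement version, `aux_sizes` would be revised after each draw using the observed values of the variable of interest before the next draw is made.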

  • Articles and reports: 11-633-X2022006
    Description:

    This article compares how survey mode, survey thematic context and sample design contribute to variation in responses to similar questions on self-perceived racial discrimination across the 2013, 2014, 2019 and 2020 cycles of the General Social Survey (GSS).

    Release date: 2022-08-09
Reference (1): 1 result

  • Surveys and statistical programs – Documentation: 75F0002M1992001
    Description:

    Starting in 1994, the Survey of Labour and Income Dynamics (SLID) will follow individuals and families for at least six years, tracking their labour market experiences and changes in income and family circumstances. An initial proposal for the content of SLID, entitled "Content of the Survey of Labour and Income Dynamics: Discussion Paper", was distributed in February 1992.

    That paper served as a background document for consultation with and a review by interested users. The content underwent significant change during this process. Based upon the revised content, a large-scale test of SLID will be conducted in February and May 1993.

    The present document outlines the income and wealth content to be tested in May 1993. It is a continuation of SLID Research Paper Series 92-01A, which outlines the demographic and labour content used in the January/February 1993 test.

    Release date: 2008-02-29