Survey design

Sort Help
entries

Results

All (297)

All (297) (0 to 10 of 297 results)

  • Articles and reports: 11-522-X202200100010
    Description: Growing Up in Québec is a longitudinal population survey that began in the spring of 2021 at the Institut de la statistique du Québec. Among the children targeted by this longitudinal follow-up, some will experience developmental difficulties at some point in their lives. Those same children often have characteristics associated with higher sample attrition (low-income family, parents with a low level of education). This article describes the two main challenges we encountered when trying to ensure sufficient representativeness of these children, in both the overall results and the subpopulation analyses.
    Release date: 2024-03-25

  • Journals and periodicals: 75F0002M
    Description: This series provides detailed documentation on income developments, including survey design issues, data quality evaluation and exploratory research.
    Release date: 2024-02-22

  • Articles and reports: 12-001-X202300200001
    Description: When a Medicare healthcare provider is suspected of billing abuse, a population of payments X made to that provider over a fixed timeframe is isolated. A certified medical reviewer, in a time-consuming process, can determine the overpayment Y = X - (amount justified by the evidence) associated with each payment. Typically, there are too many payments in the population to examine each with care, so a probability sample is selected. The sample overpayments are then used to calculate a 90% lower confidence bound for the total population overpayment. This bound is the amount demanded for recovery from the provider. Unfortunately, classical methods for calculating this bound sometimes fail to provide the 90% confidence level, especially when using a stratified sample.

    In this paper, 166 redacted samples from Medicare integrity investigations are displayed and described, along with 156 associated payment populations. The 7,588 examined (Y, X) sample pairs show (1) Medicare audits have high error rates: more than 76% of these payments were considered to have been paid in error; and (2) the patterns in these samples support an “All-or-Nothing” mixture model for (Y, X) previously defined in the literature. Model-based Monte Carlo testing procedures for Medicare sampling plans are discussed, as well as stratification methods based on anticipated model moments. In terms of viability (achieving the 90% confidence level) a new stratification method defined here is competitive with the best of the many existing methods tested and seems less sensitive to choice of operating parameters. In terms of overpayment recovery (equivalent to precision) the new method is also comparable to the best of the many existing methods tested. Unfortunately, no stratification algorithm tested was ever viable for more than about half of the 104 test populations.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200006
    Description: Survey researchers are increasingly turning to multimode data collection to deal with declines in survey response rates and increasing costs. An efficient approach offers the less costly modes (e.g., web) followed with a more expensive mode for a subsample of the units (e.g., households) within each primary sampling unit (PSU). We present two alternatives to this traditional design. One alternative subsamples PSUs rather than units to constrain costs. The second is a hybrid design that includes a clustered (two-stage) sample and an independent, unclustered sample. Using a simulation, we demonstrate the hybrid design has considerable advantages.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200008
    Description: In this article, we use a slightly simplified version of the method by Fickus, Mixon and Poteet (2013) to define a flexible parameterization of the kernels of determinantal sampling designs with fixed first-order inclusion probabilities. For specific values of the multidimensional parameter, we get back to a matrix from the family PII from Loonis and Mary (2019). We speculate that, among the determinantal designs with fixed inclusion probabilities, the minimum variance of the Horvitz and Thompson estimator (1952) of a variable of interest is expressed relative to PII. We provide experimental R programs that facilitate the appropriation of various concepts presented in the article, some of which are described as non-trivial by Fickus et al. (2013). A longer version of this article, including proofs and a more detailed presentation of the determinantal designs, is also available.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200010
    Description: Sample coordination methods aim to increase (in positive coordination) or decrease (in negative coordination) the size of the overlap between samples. The samples considered can be from different occasions of a repeated survey and/or from different surveys covering a common population. Negative coordination is used to control the response burden in a given period, because some units do not respond to survey questionnaires if they are selected in many samples. Usually, methods for sample coordination do not take into account any measure of the response burden that a unit has already expended in responding to previous surveys. We introduce such a measure into a new method by adapting a spatially balanced sampling scheme, based on a generalization of Poisson sampling, together with a negative coordination method. The goal is to create a double control of the burden for these units: once by using a measure of burden during the sampling process and once by using a negative coordination method. We evaluate the approach using Monte-Carlo simulation and investigate its use for controlling for selection “hot-spots” in business surveys in Statistics Netherlands.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200016
    Description: In this discussion, I will present some additional aspects of three major areas of survey theory developed or studied by Jean-Claude Deville: calibration, balanced sampling and the generalized weight-share method.
    Release date: 2024-01-03

  • Articles and reports: 75F0002M2023005
    Description: The Canadian Income Survey (CIS) has introduced improvements to the methods and systems used to produce income estimates with the release of its 2021 reference year estimates. This paper describes the changes and presents the approximate net result of these changes on income estimates using data for 2019 and 2020. The changes described in this paper highlight the ways in which data quality has been improved while producing minimal impact on key CIS estimates and trends.
    Release date: 2023-08-29

  • Articles and reports: 12-001-X202300100009
    Description: In this paper, with and without-replacement versions of adaptive proportional to size sampling are presented. Unbiased estimators are developed for these methods and their properties are studied. In the two versions, the drawing probabilities are adapted during the sampling process based on the observations already selected. To this end, in the version with-replacement, after each draw and observation of the variable of interest, the vector of the auxiliary variable will be updated using the observed values of the variable of interest to approximate the exact selection probability proportional to size. For the without-replacement version, first, using an initial sample, we model the relationship between the variable of interest and the auxiliary variable. Then, utilizing this relationship, we estimate the unknown (unobserved) population units. Finally, on these estimated population units, we select a new sample proportional to size without-replacement. These approaches can significantly improve the efficiency of designs not only in the case of a positive linear relationship, but also in the case of a non-linear or negative linear relationship between the variables. We investigate the efficiencies of the designs through simulations and real case studies on medicinal flowers, social and economic data.
    Release date: 2023-06-30

  • Articles and reports: 11-633-X2022006
    Description:

    This article compares how survey mode, survey thematic context and sample design contribute to variation in responses to similar questions on self-perceived racial discrimination across the 2013, 2014, 2019 and 2020 cycles of the General Social Survey (GSS).

    Release date: 2022-08-09
Data (0)

Data (0) (0 results)

No content available at this time.

Analysis (268)

Analysis (268) (10 to 20 of 268 results)

  • Articles and reports: 12-001-X202200100010
    Description:

    This study combines simulated annealing with delta evaluation to solve the joint stratification and sample allocation problem. In this problem, atomic strata are partitioned into mutually exclusive and collectively exhaustive strata. Each partition of atomic strata is a possible solution to the stratification problem, the quality of which is measured by its cost. The Bell number of possible solutions is enormous, for even a moderate number of atomic strata, and an additional layer of complexity is added with the evaluation time of each solution. Many larger scale combinatorial optimisation problems cannot be solved to optimality, because the search for an optimum solution requires a prohibitive amount of computation time. A number of local search heuristic algorithms have been designed for this problem but these can become trapped in local minima preventing any further improvements. We add, to the existing suite of local search algorithms, a simulated annealing algorithm that allows for an escape from local minima and uses delta evaluation to exploit the similarity between consecutive solutions, and thereby reduces the evaluation time. We compared the simulated annealing algorithm with two recent algorithms. In both cases, the simulated annealing algorithm attained a solution of comparable quality in considerably less computation time.

    Release date: 2022-06-21

  • Articles and reports: 12-001-X202100200008
    Description:

    Multiple-frame surveys, in which independent probability samples are selected from each of Q sampling frames, have long been used to improve coverage, to reduce costs, or to increase sample sizes for subpopulations of interest. Much of the theory has been developed assuming that (1) the union of the frames covers the population of interest, (2) a full-response probability sample is selected from each frame, (3) the variables of interest are measured in each sample with no measurement error, and (4) sufficient information exists to account for frame overlap when computing estimates. After reviewing design, estimation, and calibration for traditional multiple-frame surveys, I consider modifications of the assumptions that allow a multiple-frame structure to serve as an organizing principle for other data combination methods such as mass imputation, sample matching, small area estimation, and capture-recapture estimation. Finally, I discuss how results from multiple-frame survey research can be used when designing and evaluating data collection systems that integrate multiple sources of data.

    Release date: 2022-01-06

  • Articles and reports: 11-522-X202100100024
    Description: The Economic Directorate of the U.S. Census Bureau is developing coordinated design and sample selection procedures for the Annual Integrated Economic Survey. The unified sample will replace the directorate’s existing practice of independently developing sampling frames and sampling procedures for a suite of separate annual surveys, which optimizes sample design features at the cost of increased response burden. Size attributes of business populations, e.g., revenues and employment, are highly skewed. A high percentage of companies operate in more than one industry. Therefore, many companies are sampled into multiple surveys compounding the response burden, especially for “medium sized” companies.

    This component of response burden is reduced by selecting a single coordinated sample but will not be completely alleviated. Response burden is a function of several factors, including (1) questionnaire length and complexity, (2) accessibility of data, (3) expected number of repeated measures, and (4) frequency of collection. The sample design can have profound effects on the third and fourth factors. To help inform decisions about the integrated sample design, we use regression trees to identify covariates from the sampling frame that are related to response burden. Using historic frame and response data from four independently sampled surveys, we test a variety of algorithms, then grow regression trees that explain relationships between expected levels of response burden (as measured by response rate) and frame covariates common to more than one survey. We validate initial findings by cross-validation, examining results over time. Finally, we make recommendations on how to incorporate our robust findings into the coordinated sample design.
    Release date: 2021-10-29

  • Articles and reports: 11-522-X202100100007
    Description: The National Center for Health Statistics (NCHS) annually administers the National Ambulatory Medical Care Survey (NAMCS) to assess practice characteristics and ambulatory care provided by office-based physicians in the United States, including interviews with sampled physicians. After the onset of the COVID-19 pandemic, NCHS adapted NAMCS methodology to assess the impacts of COVID-19 on office-based physicians, including: shortages of personal protective equipment; COVID-19 testing in physician offices; providers testing positive for COVID-19; and telemedicine use during the pandemic. This paper describes challenges and opportunities in administering the 2020 NAMCS and presents key findings regarding physician experiences during the COVID-19 pandemic.

    Key Words: National Ambulatory Medical Care Survey (NAMCS); Office-based physicians; Telemedicine; Personal protective equipment.

    Release date: 2021-10-22

  • Articles and reports: 11-522-X202100100016
    Description: To build data capacity and address the U.S. opioid public health emergency, the National Center for Health Statistics received funding for two projects. The projects involve development of algorithms that use all available structured and unstructured data submitted for the 2016 National Hospital Care Survey (NHCS) to enhance identification of opioid-involvement and the presence of co-occurring disorders (coexistence of a substance use disorder and a mental health issue). A description of the algorithm development process is provided, and lessons learned from integrating data science methods like natural language processing to produce official statistics are presented. Efforts to make the algorithms and analytic datafiles accessible to researchers are also discussed.

    Key Words: Opioids; Co-Occurring Disorders; Data Science; Natural Language Processing; Hospital Care

    Release date: 2021-10-22

  • Articles and reports: 12-001-X202100100002
    Description:

    We consider the problem of deciding on sampling strategy, in particular sampling design. We propose a risk measure, whose minimizing value guides the choice. The method makes use of a superpopulation model and takes into account uncertainty about its parameters through a prior distribution. The method is illustrated with a real dataset, yielding satisfactory results. As a baseline, we use the strategy that couples probability proportional-to-size sampling with the difference estimator, as it is known to be optimal when the superpopulation model is fully known. We show that, even under moderate misspecifications of the model, this strategy is not robust and can be outperformed by some alternatives.

    Release date: 2021-06-24

  • Articles and reports: 12-001-X202000200001
    Description:

    This paper constructs a probability-proportional-to-size (PPS) ranked-set sample from a stratified population. A PPS-ranked-set sample partitions the units in a PPS sample into groups of similar observations. The construction of similar groups relies on relative positions (ranks) of units in small comparison sets. Hence, the ranks induce more structure (stratification) in the sample in addition to the data structure created by unequal selection probabilities in a PPS sample. This added data structure makes the PPS-ranked-set sample more informative then a PPS-sample. The stratified PPS-ranked-set sample is constructed by selecting a PPS-ranked-set sample from each stratum population. The paper constructs unbiased estimators for the population mean, total and their variances. The new sampling design is applied to apple production data to estimate the total apple production in Turkey.

    Release date: 2020-12-15

  • Articles and reports: 12-001-X202000100002
    Description:

    Model-based methods are required to estimate small area parameters of interest, such as totals and means, when traditional direct estimation methods cannot provide adequate precision. Unit level and area level models are the most commonly used ones in practice. In the case of the unit level model, efficient model-based estimators can be obtained if the sample design is such that the sample and population models coincide: that is, the sampling design is non-informative for the model. If on the other hand, the sampling design is informative for the model, the selection probabilities will be related to the variable of interest, even after conditioning on the available auxiliary data. This will imply that the population model no longer holds for the sample. Pfeffermann and Sverchkov (2007) used the relationships between the population and sample distribution of the study variable to obtain approximately unbiased semi-parametric predictors of the area means under informative sampling schemes. Their procedure is valid for both sampled and non-sampled areas.

    Release date: 2020-06-30

  • Articles and reports: 12-001-X202000100005
    Description:

    Selecting the right sample size is central to ensure the quality of a survey. The state of the art is to account for complex sampling designs by calculating effective sample sizes. These effective sample sizes are determined using the design effect of central variables of interest. However, in face-to-face surveys empirical estimates of design effects are often suspected to be conflated with the impact of the interviewers. This typically leads to an over-estimation of design effects and consequently risks misallocating resources towards a higher sample size instead of using more interviewers or improving measurement accuracy. Therefore, we propose a corrected design effect that separates the interviewer effect from the effects of the sampling design on the sampling variance. The ability to estimate the corrected design effect is tested using a simulation study. In this respect, we address disentangling cluster and interviewer variance. Corrected design effects are estimated for data from the European Social Survey (ESS) round 6 and compared with conventional design effect estimates. Furthermore, we show that for some countries in the ESS round 6 the estimates of conventional design effect are indeed strongly inflated by interviewer effects.

    Release date: 2020-06-30

  • Articles and reports: 12-001-X201900300001
    Description:

    Standard linearization estimators of the variance of the general regression estimator are often too small, leading to confidence intervals that do not cover at the desired rate. Hat matrix adjustments can be used in two-stage sampling that help remedy this problem. We present theory for several new variance estimators and compare them to standard estimators in a series of simulations. The proposed estimators correct negative biases and improve confidence interval coverage rates in a variety of situations that mirror ones that are met in practice.

    Release date: 2019-12-17
Reference (29)

Reference (29) (0 to 10 of 29 results)

  • Surveys and statistical programs – Documentation: 98-20-00012020020
    Description:

    This fact sheet provides detailed insight into the design and methodology of the content test component of the 2019 Census Test. This test evaluated changes to the wording and flow of some questions, as well as the potential addition of new questions, to help determine the content of the 2021 Census of Population.

    Release date: 2020-07-20

  • Surveys and statistical programs – Documentation: 11-522-X201700014749
    Description:

    As part of the Tourism Statistics Program redesign, Statistics Canada is developing the National Travel Survey (NTS) to collect travel information from Canadian travellers. This new survey will replace the Travel Survey of Residents of Canada and the Canadian resident component of the International Travel Survey. The NTS will take advantage of Statistics Canada’s common sampling frames and common processing tools while maximizing the use of administrative data. This paper discusses the potential uses of administrative data such as Passport Canada files, Canada Border Service Agency files and Canada Revenue Agency files, to increase the efficiency of the NTS sample design.

    Release date: 2016-03-24

  • Surveys and statistical programs – Documentation: 89-631-X
    Description:

    This report highlights the latest developments and rationale behind recent cycles of the General Social Survey (GSS). Starting with an overview of the GSS mandate and historic cycle topics, we then focus on two recent cycles related to families in Canada: Family Transitions (2006) and Family, Social Support and Retirement (2007). Finally, we give a summary of what is to come in the 2008 GSS on Social Networks, and describe a special project to mark 'Twenty Years of GSS'.

    The survey collects data over a twelve month period from the population living in private households in the 10 provinces. For all cycles except Cycles 16 and 21, the population aged 15 and older has been sampled. Cycles 16 and 21 sampled persons aged 45 and older.

    Cycle 20 (GSS 2006) is the fourth cycle of the GSS to collect data on families (the first three cycles on the family were in 1990, 1995 and 2001). Cycle 20 covers much the same content as previous cycles on families with some sections revised and expanded. The data enable analysts to measure conjugal and fertility history (chronology of marriages, common-law unions, and children), family origins, children's home leaving, fertility intentions, child custody as well as work history and other socioeconomic characteristics. Questions on financial support agreements or arrangements (for children and the ex-spouse or ex-partner) for separated and divorced families have been modified. Also, sections on social networks, well-being and housing characteristics have been added.

    Release date: 2008-05-27

  • Surveys and statistical programs – Documentation: 75F0002M1992001
    Description:

    Starting in 1994, the Survey of Labour and Income Dynamics (SLID) will follow individuals and families for at least six years, tracking their labour market experiences, changes in income and family circumstances. An initial proposal for the content of SLID, entitled "Content of the Survey of Labour and Income Dynamics : Discussion Paper", was distributed in February 1992.

    That paper served as a background document for consultation with and a review by interested users. The content underwent significant change during this process. Based upon the revised content, a large-scale test of SLID will be conducted in February and May 1993.

    The present document outlines the income and wealth content to be tested in May 1993. This document is really a continuation of SLID Research Paper Series 92-01A, which outlines the demographic and labour content used in the January /February 1993 test.

    Release date: 2008-02-29

  • Surveys and statistical programs – Documentation: 75F0002M1992007
    Description:

    A Preliminary Interview will be conducted on the first panel of SLID, in January 1993, as a supplement to the Labour Force Survey. The first panel is made up of about 20,000 households that are rotating out of the Labour Force Survey in January and February, 1993.

    The purpose of this document is to provide a description of the purpose of the SLID Preliminary Interview and the question wordings to be used.

    Release date: 2008-02-29

  • Surveys and statistical programs – Documentation: 16-001-M2007004
    Description:

    Statistics Canada administers a number of environmental surveys that fill important data gaps but also pose numerous challenges to administer. This paper focuses on two on-going environment surveys - one newly initiated and one in the process of a redesign.

    Release date: 2007-11-23

  • Surveys and statistical programs – Documentation: 75F0002M2005002
    Description:

    This paper describes the changes made to the structure of geography information on SLID from reference year 1999 onwards. It goes into reasons for changing to the 2001 Census-based geography, shows how the overlap between the 1991 and 2001 Census-based concepts are handled, provides detail on how the geographic concepts are implemented, discusses a new imputation procedure and finishes with an illustration of the impact of these changes on selected tables.

    Release date: 2005-03-31

  • Surveys and statistical programs – Documentation: 71F0031X2005002
    Description:

    This paper introduces and explains modifications made to the Labour Force Survey estimates in January 2005. Some of these modifications include the adjustment of all LFS estimates to reflect population counts based on the 2001 Census, updates to industry and occupation classification systems and sample redesign changes.

    Release date: 2005-01-26

  • Surveys and statistical programs – Documentation: 75F0002M2004006
    Description:

    This document presents information about the entry-exit portion of the annual labour and the income interviews of the Survey of Labour and Income Dynamics (SLID).

    Release date: 2004-06-21

  • Surveys and statistical programs – Documentation: 81-595-M2003009
    Geography: Canada
    Description:

    This paper examines how the Canadian Adult Education and Training Survey (AETS) can be used to study participation in and impacts of education and training activities for adults.

    Release date: 2003-10-15
Date modified: