Survey design

Results

All (329) (0 to 10 of 329 results)

  • Articles and reports: 75-005-M2025001
    Description: Since 2010, engaging Canadians to participate in the Labour Force Survey (LFS) has become more challenging due to a variety of social and technological changes. The decline in the LFS response rate accelerated in 2020, exacerbated by public health measures during the COVID-19 pandemic. This technical paper presents preliminary results of two collection initiatives implemented using an online-first strategy to improve LFS response rates by confirming respondent contact information and expanding the availability of online response. Through these and other planned initiatives, Statistics Canada is working to ensure that the LFS estimates continue to provide an accurate and representative portrait of the Canadian labour market.
    Release date: 2025-10-21

  • Articles and reports: 11-522-X202500100004
    Description: The Survey of Household Spending (SHS) conducted by Statistics Canada collects paper diaries and shopping receipts as a source of household expenditure data. An auto-capture algorithm was created for SHS 2023 to reduce statistical clerks' manual work of extracting important information from scanned receipts of common store brands. The algorithm used Tesseract optical character recognition (OCR) to extract text characters from images of receipts, and it identified store and product entities using regular expressions, also known as regex. The goal of this study was to enhance the current auto-capture algorithm by experimenting with more advanced OCR and machine learning methods. As a result, PaddleOCR, an open-source OCR toolkit, was selected as the new default OCR engine due to its overall performance in recognizing text, especially digits, accurately across receipts of various qualities. Additionally, entity classifiers based on support vector machines were trained on historical SHS records and existing regex patterns. By using classifiers to categorize different elements present on receipts instead of relying solely on regex patterns, product and store recognition improved. It is expected that this new algorithm will be used for SHS 2025 to improve the auto-capture quality and reduce the manual burden associated with capturing receipt variables.
    Release date: 2025-09-08

  • Articles and reports: 11-522-X202500100011
    Description: The use of modern data-driven imputation methods to treat non-response in surveys processed in the Integrated Business Statistics Program at Statistics Canada has previously been explored. It was observed that these methods can lead to high-quality imputation and have the potential to yield broad efficiencies when setting up a particular survey's edit and imputation strategy. However, estimation of the associated total variance, more specifically the component due to imputation, remains a challenge. In this article, we propose two methods for estimating the total variance and show preliminary results that have motivated us to pursue further research in this area.
    Release date: 2025-09-08

  • Articles and reports: 11-522-X202500100029
    Description: J.N.K. Rao has contributed to almost every subdiscipline of survey research, including unequal-probability and two-phase sampling, variance estimation, regression and categorical data analysis, small area estimation, and data integration. For each of these topics, Rao's work anticipated and led future research directions. His contributions will be discussed in the context of broader research trends as seen in the articles of Survey Methodology over the journal's 50-year history.
    Release date: 2025-09-08

  • Articles and reports: 11-522-X202500100030
    Description: In the setting of multilevel models to be estimated using data from surveys with complex sampling designs, this paper outlines some contributions of the landmark paper by Rao, Verret and Hidiroglou (Survey Methodology, 2013) and subsequent related work.
    Release date: 2025-09-08

  • Articles and reports: 11-522-X202500100032
    Description: Although non-probability data sources are not new to official statistics, a revived interest in the topic has emerged from pressures due to falling survey response rates, increasing data collection costs and a desire to take advantage of new data source opportunities from the ongoing societal digitalisation. Due to the exclusion of certain segments of the target population, inference derived solely from a non-probability data source is likely to result in bias. This work approaches the challenge of addressing the bias by integrating non-probability data with reference probability samples. The focus will be on methods to model the propensity of inclusion in the non-probability dataset with the help of the accompanying reference sample, with the modelled propensities then applied in an inverse probability weighting approach to produce population estimates. The reference sample is sometimes assumed as given. In this presentation, however, the objective is to find an optimal strategy, that is, the combination of a data integration-based estimator and sample design for the reference probability sample. Recent work is discussed that takes advantage of the good unit identification possibilities in business surveys to study an estimator based on propensities and derive optimal (unequal) selection probabilities for the reference sample.
    Release date: 2025-09-08

  • Articles and reports: 11-522-X202500100033
    Description: Aligning with recent needs for more disaggregated data, in 2021 Canada became the first country to collect and disseminate data on gender diversity in a national census, giving Canadians the option to select male, female, or non-binary. Because the non-binary population counts are small, they were not used in the 2021 Census long-form sample calibration procedure, as doing so risked increasing the variance of estimates. This paper presents an alternative long-form calibration strategy which allows for small populations, such as the non-binary group, to be incorporated while mitigating methodological concerns. The strategy put forward can incorporate multiple small populations simultaneously while also being flexible enough to fit the calibration systems of other National Statistical Offices (NSOs). The results of a Monte Carlo (MC) simulation are presented showing improved data quality for the non-binary population under the alternative calibration strategy.
    Release date: 2025-09-08

  • Articles and reports: 12-001-X202500100010
    Description: The discussants highlight promising research topics for improving the quality and granularity of estimates from surveys. We agree that continued research is needed to evaluate models used for inference, and suggest development of measures of model dependence.
    Release date: 2025-06-30

  • Articles and reports: 12-001-X202500100011
    Description: This discussion examines some advancements in survey design and estimation, inspired by the comprehensive appraisal of Professors Jon Rao and Sharon Lohr on current trends in the field. It delves into three specific areas: balanced sampling, calibration, and small area estimation. Probabilistic balanced sampling methods, such as the cube method and penalized balanced sampling, are explored, with an emphasis on addressing emerging challenges, including extensions to linear mixed models, nonparametric regression models, and spatially balanced designs. Calibration is discussed using a modular framework that incorporates modern regression techniques, highlighting innovative uses of model calibration for data editing and causal inference. Small area estimation is considered in the context of latent variable modeling and data integration, emphasizing its role when the variable(s) of interest cannot be measured either directly or without error. Applications in integrating probability and non-probability data and conducting causal analysis at the local level are also discussed.
    Release date: 2025-06-30

  • Articles and reports: 12-001-X202500100012
    Description: In this discussion, we complement the excellent overview by Profs. Lohr and Rao with some additional topics. The first topic is a call for more recognition of the central role of modeling in survey estimation. The second is a brief discussion of the use of partial frame information in survey design. Finally, we draw attention to the recent increase in the use of synthetic methods, in particular multilevel regression and poststratification (MRP), in small area estimation applications.
    Release date: 2025-06-30

Analysis (300)

Reference (29)

Reference (29) (0 to 10 of 29 results)

  • Surveys and statistical programs – Documentation: 98-20-00012020020
    Description:

    This fact sheet provides detailed insight into the design and methodology of the content test component of the 2019 Census Test. This test evaluated changes to the wording and flow of some questions, as well as the potential addition of new questions, to help determine the content of the 2021 Census of Population.

    Release date: 2020-07-20

  • Surveys and statistical programs – Documentation: 11-522-X201700014749
    Description:

    As part of the Tourism Statistics Program redesign, Statistics Canada is developing the National Travel Survey (NTS) to collect travel information from Canadian travellers. This new survey will replace the Travel Survey of Residents of Canada and the Canadian resident component of the International Travel Survey. The NTS will take advantage of Statistics Canada’s common sampling frames and common processing tools while maximizing the use of administrative data. This paper discusses the potential uses of administrative data, such as Passport Canada files, Canada Border Services Agency files and Canada Revenue Agency files, to increase the efficiency of the NTS sample design.

    Release date: 2016-03-24

  • Surveys and statistical programs – Documentation: 89-631-X
    Description:

    This report highlights the latest developments and rationale behind recent cycles of the General Social Survey (GSS). Starting with an overview of the GSS mandate and historic cycle topics, we then focus on two recent cycles related to families in Canada: Family Transitions (2006) and Family, Social Support and Retirement (2007). Finally, we give a summary of what is to come in the 2008 GSS on Social Networks, and describe a special project to mark 'Twenty Years of GSS'.

    The survey collects data over a twelve-month period from the population living in private households in the 10 provinces. For all cycles except Cycles 16 and 21, the population aged 15 and older has been sampled. Cycles 16 and 21 sampled persons aged 45 and older.

    Cycle 20 (GSS 2006) is the fourth cycle of the GSS to collect data on families (the first three cycles on the family were in 1990, 1995 and 2001). Cycle 20 covers much the same content as previous cycles on families with some sections revised and expanded. The data enable analysts to measure conjugal and fertility history (chronology of marriages, common-law unions, and children), family origins, children's home leaving, fertility intentions, child custody as well as work history and other socioeconomic characteristics. Questions on financial support agreements or arrangements (for children and the ex-spouse or ex-partner) for separated and divorced families have been modified. Also, sections on social networks, well-being and housing characteristics have been added.

    Release date: 2008-05-27

  • Surveys and statistical programs – Documentation: 75F0002M1992001
    Description:

    Starting in 1994, the Survey of Labour and Income Dynamics (SLID) will follow individuals and families for at least six years, tracking their labour market experiences, changes in income and family circumstances. An initial proposal for the content of SLID, entitled "Content of the Survey of Labour and Income Dynamics: Discussion Paper", was distributed in February 1992.

    That paper served as a background document for consultation with and a review by interested users. The content underwent significant change during this process. Based upon the revised content, a large-scale test of SLID will be conducted in February and May 1993.

    The present document outlines the income and wealth content to be tested in May 1993. It is a continuation of SLID Research Paper Series 92-01A, which outlines the demographic and labour content used in the January/February 1993 test.

    Release date: 2008-02-29

  • Surveys and statistical programs – Documentation: 75F0002M1992007
    Description:

    A Preliminary Interview will be conducted on the first panel of SLID, in January 1993, as a supplement to the Labour Force Survey. The first panel is made up of about 20,000 households that are rotating out of the Labour Force Survey in January and February, 1993.

    This document describes the purpose of the SLID Preliminary Interview and the question wordings to be used.

    Release date: 2008-02-29

  • Surveys and statistical programs – Documentation: 16-001-M2007004
    Description:

    Statistics Canada administers a number of environmental surveys that fill important data gaps but also pose numerous administrative challenges. This paper focuses on two ongoing environmental surveys: one newly initiated and one in the process of a redesign.

    Release date: 2007-11-23

  • Surveys and statistical programs – Documentation: 75F0002M2005002
    Description:

    This paper describes the changes made to the structure of geography information on SLID from reference year 1999 onwards. It explains the reasons for changing to the 2001 Census-based geography, shows how the overlap between the 1991 and 2001 Census-based concepts is handled, provides detail on how the geographic concepts are implemented, discusses a new imputation procedure and finishes with an illustration of the impact of these changes on selected tables.

    Release date: 2005-03-31

  • Surveys and statistical programs – Documentation: 71F0031X2005002
    Description:

    This paper introduces and explains modifications made to the Labour Force Survey estimates in January 2005. Some of these modifications include the adjustment of all LFS estimates to reflect population counts based on the 2001 Census, updates to industry and occupation classification systems and sample redesign changes.

    Release date: 2005-01-26

  • Surveys and statistical programs – Documentation: 75F0002M2004006
    Description:

    This document presents information about the entry-exit portion of the annual labour and income interviews of the Survey of Labour and Income Dynamics (SLID).

    Release date: 2004-06-21

  • Surveys and statistical programs – Documentation: 81-595-M2003009
    Geography: Canada
    Description:

    This paper examines how the Canadian Adult Education and Training Survey (AETS) can be used to study participation in and impacts of education and training activities for adults.

    Release date: 2003-10-15