Collection and questionnaires

Results

All (347) (0 to 10 of 347 results)

1. Comparing two common approaches to within-household sampling: A field experiment in Costa Rica
Articles and reports: 12-001-X202600100004
Description: We test the notion that a quasi-probabilistic method of selecting individuals within households (last birthday, LB) draws in a different sample compared to a non-probabilistic approach that selects respondents according to known parameters on age and gender (frequency matching, FM). With data from an original field experiment, we evaluate fieldwork efficiency (time and completed cases), economy (cost), success in recruiting a representative sample, and differences across a set of attitudinal and behavioral measures. We find that the FM approach performs better on efficiency and cost and achieves a comparable sample; importantly, this comparability extends across measures of personality traits and public opinion. With appropriate caveats, we conclude that researchers’ choice of selection methods should be guided by both theoretical benefits and practical tradeoffs.
Release date: 2026-06-29
2. Who’s asking? Interviewer effects on unit nonresponse in the Household Finance and Consumption Survey
Articles and reports: 12-001-X202500200002
Description: This study examines interviewer effects on household nonresponse in three waves of the Household Finance and Consumption Survey (HFCS) in Austria using a multilevel model. Addressing nonresponse at its source is crucial for maintaining survey data quality and representativeness. Our findings indicate that the variation in response behavior explained by interviewer effects decreased from about one-third in the first wave to 7% in the third wave. Effective interviewers tend to have a university degree, be married, be homeowners, and have a larger workload. Additionally, higher mean wages in the household’s municipality negatively affect survey participation. These insights suggest targeted interviewer selection and training strategies to improve response rates.
Release date: 2025-12-23
3. In-Person Survey Data Collectors: Looking to the Future (Archived)
Articles and reports: 11-522-X202500100003
Description: In-person data collection is critical for the success of many large government-sponsored surveys. Despite response rate declines and increasing costs, the mode remains the gold standard for meeting the most rigorous survey requirements for federal survey programs, particularly as part of a multimode data collection strategy (Schober, 2018). However, over the last ten years critical labor market and workforce changes, exacerbated by the pandemic, have made in-person data collection efforts prohibitive for all but the largest survey organizations. Shifting ideas about job flexibility and job satisfaction alongside the increasingly technical role and demanding nature of the job have impacted recruitment and retention for survey organizations across the U.S. and Europe (Charman et al., 2024). Trends in U.S. field data collector employment are summarized, and promising practices for recruiting and retaining high-quality field data collectors are outlined. Additionally, broader ways to structure the field data collector labor force for continued success are considered, including supplementing field data collection with multimode alternatives such as video interviewing and updating value propositions for respondents.
Release date: 2025-09-08
4. Improving the Automated Capture of Survey of Household Spending Receipts using advanced Machine Learning Techniques (Archived)
Articles and reports: 11-522-X202500100004
Description: The Survey of Household Spending (SHS) conducted by Statistics Canada collects paper diaries and shopping receipts as a source of household expenditure data. An auto-capturing algorithm was created for SHS 2023 to reduce statistical clerks' manual work of extracting important information from scanned receipts of common store brands. The algorithm used Tesseract optical character recognition (OCR) to extract text characters from images of receipts, and it identified store and product entities using regular expressions, also known as regex. The goal of this study was to enhance the current auto-capture algorithm by experimenting with more advanced OCR and machine learning methods. As a result, PaddleOCR, an open-source OCR toolkit, was selected as the new default OCR engine due to its overall performance in recognizing texts, especially digits, accurately across receipts of various qualities. Additionally, entity classifiers based on support vector machines were trained on historical SHS records and existing regex patterns. By using classifiers to categorize different elements present on receipts instead of relying solely on regex patterns, product and store recognition improved. It is expected that this new algorithm will be used for SHS 2025 to improve the auto-capture quality and reduce the manual burden associated with capturing receipt variables.
Release date: 2025-09-08
5. Recruitment and Collection of Web Panels at Statistics Canada (Archived)
Articles and reports: 11-522-X202500100008
Description: In 2020, Statistics Canada started to use probabilistic web panels as an alternate method of collecting official statistics. In a web panel, respondents to another survey are asked for contact information to participate in future short surveys. This paper will highlight Statistics Canada's experience with panels after 4 years, including what has been learned about the recruitment of panel participants and how to subsequently collect data using panel surveys. The ways in which recruitment questions are presented can result in very different rates of participation. Moreover, the wealth of auxiliary information available on the recruitment survey can be used to actively manage panel collection operations, by predicting the probability of response and using this information to target follow-up efforts.
Release date: 2025-09-08
6. Advancing Equitable Data Collection: Insights from Statistics Canada's Statistical Integration Methods Division Disaggregated Data Action Plan Research Project (Archived)
Articles and reports: 11-522-X202500100013
Description: As part of answering the call to action for the United Nations' (UN) 17 Sustainable Development Goals, as well as addressing social, economic, and equity challenges within Canada, Statistics Canada's five-year development phase for the Disaggregated Data Action Plan (DDAP) was funded in 2021 to support data-driven decision making around these challenges. In turn, the document "Guiding Principles: Leveraging the 2021 Census of Populations Data for DDAP Groups of Interest" was created. The guiding principles document explains the organizational framework of the DDAP in the Agency, describes existing data sources, addresses ethical and privacy concerns, and centralizes sampling methods tailored for DDAP initiatives while accounting for characteristics which can complicate sampling and data collection procedures.
Release date: 2025-09-08
7. 2024 Census Test: Design and methodology of the content test
Surveys and statistical programs – Documentation: 98-20-00052026004
Description: This report provides detailed insight into the design and methodology of the content test component of the 2024 Census Test. This test evaluated changes to the wording and flow of some questions, as well as the potential addition of new questions, to help determine the content of the 2026 Census of Population.
Release date: 2025-07-04
8. Census of Agriculture: Changes to the questionnaire
Surveys and statistical programs – Documentation: 32-26-0008
Description: This report describes the main changes, additions or deletions to the Census of Agriculture questionnaire by topic and in the order they appear on the questionnaire.
Release date: 2025-07-04
9. Daily rhythm of data quality: Evidence from the Survey of Unemployed Workers in New Jersey
Articles and reports: 12-001-X202400200002
Description: This paper investigates whether survey data quality fluctuates over the day. After laying out the argument theoretically, panel data from the Survey of Unemployed Workers in New Jersey are analyzed. Several indirect indicators of response error are investigated, including item nonresponse, interview completion time, rounding, and measures of the quality of time diary data. The evidence that we assemble for a time of day of interview effect is weak or nonexistent. Item nonresponse and the probability that interview completion time is among the 5% shortest appear to increase in the evening, but a more thorough assessment requires instrumental variables.
Release date: 2024-12-20
10. Investigating mode effects in interviewer variances using two representative multi-mode surveys
Articles and reports: 12-001-X202400200006
Description: As mixed-mode designs become increasingly popular, their effects on data quality have attracted much scholarly attention. Most studies focused on the bias properties of mixed-mode designs; few of them have investigated whether mixed-mode designs have heterogeneous variance structures across modes. While many characteristics of mixed-mode designs, such as varied interviewer usage, systematic differences in respondents, varying levels of social desirability bias, among others, may lead to heterogeneous variances in mode-specific point estimates of population means, this study specifically investigates whether interviewer variances remain consistent across different modes in mixed-mode studies. To address this research question, we utilize data collected from two distinct study designs. In the first design, when interviewers are responsible for either face-to-face or telephone mode, we examine whether there are mode differences in interviewer variances for 1) sensitive political questions, 2) international items, and 3) item missing indicators on international items, using the Arab Barometer wave 6 Jordan data. In the second design, we draw on Health and Retirement Study (HRS) 2016 core survey data to examine the question on three topics when interviewers are responsible for both modes. The topics cover 1) the CESD depression scale, 2) interviewer observations, and 3) the physical activity scale. To account for the lack of interpenetrated designs in both data sources, we include respondent-level covariates in our models. We find significant differences in interviewer variances on one item (twelve items in total) in the Arab Barometer study; whereas for HRS, the results are three out of eighteen. Overall, we find the magnitude of the interviewer variances larger in FTF than TEL on sensitive items. We conduct simulations to understand the power to detect mode effects in the typically modest interviewer sample sizes.
Release date: 2024-12-20

Data (0) (0 results)

No content available at this time.

Analysis (247) (0 to 10 of 247 results)

1. Comparing two common approaches to within-household sampling: A field experiment in Costa Rica
Articles and reports: 12-001-X202600100004
Description: We test the notion that a quasi-probabilistic method of selecting individuals within households (last birthday, LB) draws in a different sample compared to a non-probabilistic approach that selects respondents according to known parameters on age and gender (frequency matching, FM). With data from an original field experiment, we evaluate fieldwork efficiency (time and completed cases), economy (cost), success in recruiting a representative sample, and differences across a set of attitudinal and behavioral measures. We find that the FM approach performs better on efficiency and cost and achieves a comparable sample; importantly, this comparability extends across measures of personality traits and public opinion. With appropriate caveats, we conclude that researchers’ choice of selection methods should be guided by both theoretical benefits and practical tradeoffs.
Release date: 2026-06-29
2. Who’s asking? Interviewer effects on unit nonresponse in the Household Finance and Consumption Survey
Articles and reports: 12-001-X202500200002
Description: This study examines interviewer effects on household nonresponse in three waves of the Household Finance and Consumption Survey (HFCS) in Austria using a multilevel model. Addressing nonresponse at its source is crucial for maintaining survey data quality and representativeness. Our findings indicate that the variation in response behavior explained by interviewer effects decreased from about one-third in the first wave to 7% in the third wave. Effective interviewers tend to have a university degree, be married, be homeowners, and have a larger workload. Additionally, higher mean wages in the household’s municipality negatively affect survey participation. These insights suggest targeted interviewer selection and training strategies to improve response rates.
Release date: 2025-12-23
3. In-Person Survey Data Collectors: Looking to the Future (Archived)
Articles and reports: 11-522-X202500100003
Description: In-person data collection is critical for the success of many large government-sponsored surveys. Despite response rate declines and increasing costs, the mode remains the gold standard for meeting the most rigorous survey requirements for federal survey programs, particularly as part of a multimode data collection strategy (Schober, 2018). However, over the last ten years critical labor market and workforce changes, exacerbated by the pandemic, have made in-person data collection efforts prohibitive for all but the largest survey organizations. Shifting ideas about job flexibility and job satisfaction alongside the increasingly technical role and demanding nature of the job have impacted recruitment and retention for survey organizations across the U.S. and Europe (Charman et al., 2024). Trends in U.S. field data collector employment are summarized, and promising practices for recruiting and retaining high-quality field data collectors are outlined. Additionally, broader ways to structure the field data collector labor force for continued success are considered, including supplementing field data collection with multimode alternatives such as video interviewing and updating value propositions for respondents.
Release date: 2025-09-08
4. Improving the Automated Capture of Survey of Household Spending Receipts using advanced Machine Learning Techniques (Archived)
Articles and reports: 11-522-X202500100004
Description: The Survey of Household Spending (SHS) conducted by Statistics Canada collects paper diaries and shopping receipts as a source of household expenditure data. An auto-capturing algorithm was created for SHS 2023 to reduce statistical clerks' manual work of extracting important information from scanned receipts of common store brands. The algorithm used Tesseract optical character recognition (OCR) to extract text characters from images of receipts, and it identified store and product entities using regular expressions, also known as regex. The goal of this study was to enhance the current auto-capture algorithm by experimenting with more advanced OCR and machine learning methods. As a result, PaddleOCR, an open-source OCR toolkit, was selected as the new default OCR engine due to its overall performance in recognizing texts, especially digits, accurately across receipts of various qualities. Additionally, entity classifiers based on support vector machines were trained on historical SHS records and existing regex patterns. By using classifiers to categorize different elements present on receipts instead of relying solely on regex patterns, product and store recognition improved. It is expected that this new algorithm will be used for SHS 2025 to improve the auto-capture quality and reduce the manual burden associated with capturing receipt variables.
Release date: 2025-09-08
5. Recruitment and Collection of Web Panels at Statistics Canada (Archived)
Articles and reports: 11-522-X202500100008
Description: In 2020, Statistics Canada started to use probabilistic web panels as an alternate method of collecting official statistics. In a web panel, respondents to another survey are asked for contact information to participate in future short surveys. This paper will highlight Statistics Canada's experience with panels after 4 years, including what has been learned about the recruitment of panel participants and how to subsequently collect data using panel surveys. The ways in which recruitment questions are presented can result in very different rates of participation. Moreover, the wealth of auxiliary information available on the recruitment survey can be used to actively manage panel collection operations, by predicting the probability of response and using this information to target follow-up efforts.
Release date: 2025-09-08
6. Advancing Equitable Data Collection: Insights from Statistics Canada's Statistical Integration Methods Division Disaggregated Data Action Plan Research Project (Archived)
Articles and reports: 11-522-X202500100013
Description: As part of answering the call to action for the United Nations' (UN) 17 Sustainable Development Goals, as well as addressing social, economic, and equity challenges within Canada, Statistics Canada's five-year development phase for the Disaggregated Data Action Plan (DDAP) was funded in 2021 to support data-driven decision making around these challenges. In turn, the document "Guiding Principles: Leveraging the 2021 Census of Populations Data for DDAP Groups of Interest" was created. The guiding principles document explains the organizational framework of the DDAP in the Agency, describes existing data sources, addresses ethical and privacy concerns, and centralizes sampling methods tailored for DDAP initiatives while accounting for characteristics which can complicate sampling and data collection procedures.
Release date: 2025-09-08
7. Daily rhythm of data quality: Evidence from the Survey of Unemployed Workers in New Jersey
Articles and reports: 12-001-X202400200002
Description: This paper investigates whether survey data quality fluctuates over the day. After laying out the argument theoretically, panel data from the Survey of Unemployed Workers in New Jersey are analyzed. Several indirect indicators of response error are investigated, including item nonresponse, interview completion time, rounding, and measures of the quality of time diary data. The evidence that we assemble for a time of day of interview effect is weak or nonexistent. Item nonresponse and the probability that interview completion time is among the 5% shortest appear to increase in the evening, but a more thorough assessment requires instrumental variables.
Release date: 2024-12-20
8. Investigating mode effects in interviewer variances using two representative multi-mode surveys
Articles and reports: 12-001-X202400200006
Description: As mixed-mode designs become increasingly popular, their effects on data quality have attracted much scholarly attention. Most studies focused on the bias properties of mixed-mode designs; few of them have investigated whether mixed-mode designs have heterogeneous variance structures across modes. While many characteristics of mixed-mode designs, such as varied interviewer usage, systematic differences in respondents, varying levels of social desirability bias, among others, may lead to heterogeneous variances in mode-specific point estimates of population means, this study specifically investigates whether interviewer variances remain consistent across different modes in mixed-mode studies. To address this research question, we utilize data collected from two distinct study designs. In the first design, when interviewers are responsible for either face-to-face or telephone mode, we examine whether there are mode differences in interviewer variances for 1) sensitive political questions, 2) international items, and 3) item missing indicators on international items, using the Arab Barometer wave 6 Jordan data. In the second design, we draw on Health and Retirement Study (HRS) 2016 core survey data to examine the question on three topics when interviewers are responsible for both modes. The topics cover 1) the CESD depression scale, 2) interviewer observations, and 3) the physical activity scale. To account for the lack of interpenetrated designs in both data sources, we include respondent-level covariates in our models. We find significant differences in interviewer variances on one item (twelve items in total) in the Arab Barometer study; whereas for HRS, the results are three out of eighteen. Overall, we find the magnitude of the interviewer variances larger in FTF than TEL on sensitive items. We conduct simulations to understand the power to detect mode effects in the typically modest interviewer sample sizes.
Release date: 2024-12-20
9. Survey Series on People and their Communities (Archived)
Articles and reports: 11-522-X202200100011
Description: In 2021, Statistics Canada initiated the Disaggregated Data Action Plan, a multi-year initiative to support more representative data collection methods, enhance statistics on diverse populations to allow for intersectional analyses, and support government and societal efforts to address known inequalities and bring considerations of fairness and inclusion into decision making. As part of this initiative, we are building the Survey Series on People and their Communities, a new probabilistic panel specifically designed to collect data that can be disaggregated according to racialized group. This new tool will allow us to address data gaps and emerging questions related to diversity. This paper will give an overview of the design of the Survey Series on People and their Communities.
Release date: 2024-03-25
10. From theory to practice: Lessons learned from implementing the Network Sampling with Memory method (Archived)
Articles and reports: 11-522-X202200100016
Description: To overcome the traditional drawbacks of chain sampling methods, the sampling method called “network sampling with memory” was developed. Its unique feature is to recreate, gradually in the field, a frame for the target population composed of individuals identified by respondents and to randomly draw future respondents from this frame, thereby minimizing selection bias. The method was tested for the first time in France between September 2020 and June 2021, in a survey among Chinese immigrants in Île-de-France (ChIPRe). This presentation describes the difficulties encountered during collection, some of them contextual, due to the pandemic, but most of them inherent to the method.
Release date: 2024-03-25

Reference (100) (0 to 10 of 100 results)

1. 2024 Census Test: Design and methodology of the content test
Surveys and statistical programs – Documentation: 98-20-00052026004
Description: This report provides detailed insight into the design and methodology of the content test component of the 2024 Census Test. This test evaluated changes to the wording and flow of some questions, as well as the potential addition of new questions, to help determine the content of the 2026 Census of Population.
Release date: 2025-07-04
2. Census of Agriculture: Changes to the questionnaire
Surveys and statistical programs – Documentation: 32-26-0008
Description: This report describes the main changes, additions or deletions to the Census of Agriculture questionnaire by topic and in the order they appear on the questionnaire.
Release date: 2025-07-04
3. Standard Geographical Classification (SGC) Volume II. Reference Maps
Geographic files and documentation: 12-572-X
Description:
The Standard Geographical Classification (SGC) provides a systematic classification structure that categorizes all of the geographic area of Canada. The SGC is the official classification used in the Census of Population and other Statistics Canada surveys.
The classification is organized in two volumes: Volume I, The Classification and Volume II, Reference Maps.
Volume II contains reference maps showing boundaries, names, codes and locations of the geographic areas in the classification. The reference maps show census subdivisions, census divisions, census metropolitan areas, census agglomerations, census metropolitan influenced zones and economic regions. Definitions for these terms are found in Volume I, The Classification. Volume I describes the classification and related standard geographic areas and place names.
The maps in Volume II can be downloaded in PDF format from our website.
Release date: 2022-02-09
4. Painting a Portrait of Canada: The 2021 Census of Population
Notices and consultations: 98-26-0001
Description:
This white paper presents Statistics Canada’s planned approach to the 2021 Census of Population and provides a clear explanation of the processes behind the census program, touching on historical, legal, operational and content aspects. Statistics Canada recognizes that it is important to not only successfully conduct the census, but also to be transparent and informative about the way in which those efforts are accomplished. Painting a Portrait of Canada: The 2021 Census of Population gives readers an exclusive, detailed look at how census data is collected, analyzed and given back to Canadians, in the form of high-quality statistical information, used to make evidence-based decisions in Canadian society.
Release date: 2020-07-20
5. Sources and Methods: Capital Investment in Infrastructure
Surveys and statistical programs – Documentation: 34-26-0002
Description:
As of reference year 2018, the Annual Capital and Repair Expenditures Survey (CAPEX) includes additional content that makes it possible to produce estimates of capital and repair expenditures on infrastructure assets. In addition to the existing content, the new questionnaire asks for a breakdown of expenditures by function (or purpose) as well as the source of funding of capital expenditures from government grants and subsidies.
This product will describe the sources and methods used to produce capital and repair expenditure estimates specific to infrastructure assets by function.
Release date: 2020-04-01
6. 2016 Census Program Content Test: Design and Results
Notices and consultations: 92-140-X2016001
Description:
The 2016 Census Program Content Test was conducted from May 2 to June 30, 2014. The Test was designed to assess the impact of any proposed content changes to the 2016 Census Program and to measure the impact of including a social insurance number (SIN) question on the data quality.
This quantitative test used a split-panel design involving 55,000 dwellings, divided into 11 panels of 5,000 dwellings each: five panels were dedicated to the Content Test while the remaining six panels were for the SIN Test. Two models of test questionnaires were developed to meet the objectives, namely a model with all the proposed changes EXCEPT the SIN question and a model with all the proposed changes INCLUDING the SIN question. A third model of 'control' questionnaire with the 2011 content was also developed. The population living in a private dwelling in mail-out areas in one of the ten provinces was targeted for the test. Paper and electronic response channels were part of the Test as well.
This report presents the Test objectives, the design and a summary of the analysis in order to determine potential content for the 2016 Census Program. Results from the data analysis of the Test were not the only elements used to determine the content for 2016. Other elements were also considered, such as response burden, comparison over time and users’ needs.
Release date: 2016-04-01
7. The Alternative Data Solution – Experience of the Producer Prices Division (Archived)
Surveys and statistical programs – Documentation: 11-522-X201700014706
Description:
Over the last decade, Statistics Canada’s Producer Prices Division has expanded its service producer price indexes program and continued to improve its goods and construction producer price indexes program. While the majority of price indexes are based on traditional survey methods, efforts were made to increase the use of administrative data and alternative data sources in order to reduce burden on our respondents. This paper focuses mainly on producer price programs, but also provides information on the growing importance of alternative data sources at Statistics Canada. In addition, it presents the operational challenges and risks that statistical offices could face when relying more and more on third-party outputs. Finally, it presents the tools being developed to integrate alternative data while collecting metadata.
Release date: 2016-03-24
8. Challenges and results in using Audit trail data to monitor Labour Force Survey data quality (Archived)
Surveys and statistical programs – Documentation: 11-522-X201700014707
Description:
The Labour Force Survey (LFS) is a monthly household survey of about 56,000 households that provides information on the Canadian labour market. Audit Trail is a Blaise programming option, for surveys like LFS with Computer Assisted Interviewing (CAI), which creates files containing every keystroke, edit and timestamp for every data collection attempt on all households. Combining such a large survey with such a complete source of paradata opens the door to in-depth data quality analysis but also quickly leads to Big Data challenges. How can meaningful information be extracted from this large set of keystrokes and timestamps? How can it help assess the quality of LFS data collection? The presentation will describe some of the challenges that were encountered, solutions that were used to address them, and results of the analysis on data quality.
Release date: 2016-03-24
9. Challenges Associated with Using Scanner Data for the Consumer Price Index (Archived)
Surveys and statistical programs – Documentation: 11-522-X201700014751
Description:
Practically all major retailers use scanners to record the information on their transactions with clients (consumers). These data normally include the product code, a brief description, the price and the quantity sold. This is an extremely relevant data source for statistical programs such as Statistics Canada’s Consumer Price Index (CPI), one of Canada’s most important economic indicators. Using scanner data could improve the quality of the CPI by increasing the number of prices used in calculations, expanding geographic coverage and including the quantities sold, among other things, while lowering data collection costs. However, using these data presents many challenges. An examination of scanner data from a first retailer revealed a high rate of change in product identification codes over a one-year period. The effects of these changes pose challenges from a product classification and estimate quality perspective. This article focuses on the issues associated with acquiring, classifying and examining these data to assess their quality for use in the CPI.
Release date: 2016-03-24
10. A New Survey Measure of Disability: the Disability Screening Questions (DSQ) (Archived)
Surveys and statistical programs – Documentation: 89-654-X2016003
Description:
This paper describes the process that led to the creation of the new Disability Screening Questions (DSQ), jointly developed by Statistics Canada and Employment and Social Development Canada. The DSQ form a new module which can be put on general population surveys to allow comparisons of persons with and without a disability. The paper explains why there are two versions of the DSQ (a long one and a short one), the difference between the two, and how each version can be used.
Release date: 2016-02-29

Date modified: 2026-07-18