Results

All (192) (30 to 40 of 192 results)

  • Articles and reports: 12-001-X202300200004
    Description: We present a novel methodology to benchmark county-level estimates of crop area totals to a preset state total subject to inequality constraints and random variances in the Fay-Herriot model. For planted area of the National Agricultural Statistics Service (NASS), an agency of the United States Department of Agriculture (USDA), it is necessary to incorporate the constraint that the estimated totals, derived from survey and other auxiliary data, are no smaller than administrative planted area totals prerecorded by USDA agencies other than NASS. These administrative totals are treated as fixed and known, and this additional coherence requirement adds to the complexity of benchmarking the county-level estimates. A fully Bayesian analysis of the Fay-Herriot model offers an appealing way to incorporate the inequality and benchmarking constraints, and to quantify the resulting uncertainties, but sampling from the posterior densities involves difficult integration, and reasonable approximations must be made. First, we describe a single-shrinkage model, shrinking the means while the variances are assumed known. Second, we extend this model to accommodate double shrinkage, borrowing strength across means and variances. This extended model has two sources of extra variation, but because we are shrinking both means and variances, it is expected that this second model should perform better in terms of goodness of fit (reliability) and possibly precision. The computations are challenging for both models, which are applied to simulated data sets with properties resembling the Illinois corn crop.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200005
    Description: Population undercoverage is one of the main hurdles faced by statistical analysis with non-probability survey samples. We discuss two typical scenarios of undercoverage, namely, stochastic undercoverage and deterministic undercoverage. We argue that existing estimation methods under the positivity assumption on the propensity scores (i.e., the participation probabilities) can be directly applied to handle the scenario of stochastic undercoverage. We explore strategies for mitigating biases in estimating the mean of the target population under deterministic undercoverage. In particular, we examine a split population approach based on a convex hull formulation, and construct estimators with reduced biases. A doubly robust estimator can be constructed if a followup subsample of the reference probability survey with measurements on the study variable becomes feasible. Performances of six competing estimators are investigated through a simulation study and issues which require further investigation are briefly discussed.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200006
    Description: Survey researchers are increasingly turning to multimode data collection to deal with declines in survey response rates and increasing costs. An efficient approach offers the less costly modes (e.g., web) followed with a more expensive mode for a subsample of the units (e.g., households) within each primary sampling unit (PSU). We present two alternatives to this traditional design. One alternative subsamples PSUs rather than units to constrain costs. The second is a hybrid design that includes a clustered (two-stage) sample and an independent, unclustered sample. Using a simulation, we demonstrate the hybrid design has considerable advantages.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200007
    Description: Conformal prediction is an assumption-lean approach to generating distribution-free prediction intervals or sets, for nearly arbitrary predictive models, with guaranteed finite-sample coverage. Conformal methods are an active research topic in statistics and machine learning, but only recently have they been extended to non-exchangeable data. In this paper, we invite survey methodologists to begin using and contributing to conformal methods. We introduce how conformal prediction can be applied to data from several common complex sample survey designs, under a framework of design-based inference for a finite population, and we point out gaps where survey methodologists could fruitfully apply their expertise. Our simulations empirically bear out the theoretical guarantees of finite-sample coverage, and our real-data example demonstrates how conformal prediction can be applied to complex sample survey data in practice.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200008
    Description: In this article, we use a slightly simplified version of the method by Fickus, Mixon and Poteet (2013) to define a flexible parameterization of the kernels of determinantal sampling designs with fixed first-order inclusion probabilities. For specific values of the multidimensional parameter, we get back to a matrix from the family PII from Loonis and Mary (2019). We speculate that, among the determinantal designs with fixed inclusion probabilities, the minimum variance of the Horvitz and Thompson estimator (1952) of a variable of interest is expressed relative to PII. We provide experimental R programs that facilitate the appropriation of various concepts presented in the article, some of which are described as non-trivial by Fickus et al. (2013). A longer version of this article, including proofs and a more detailed presentation of the determinantal designs, is also available.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200009
    Description: In this paper, we investigate how a big non-probability database can be used to improve estimates of finite population totals from a small probability sample through data integration techniques. In the situation where the study variable is observed in both data sources, Kim and Tam (2021) proposed two design-consistent estimators that can be justified through dual frame survey theory. First, we provide conditions ensuring that these estimators are more efficient than the Horvitz-Thompson estimator when the probability sample is selected using either Poisson sampling or simple random sampling without replacement. Then, we study the class of QR predictors, introduced by Särndal and Wright (1984), to handle the less common case where the non-probability database contains no study variable but auxiliary variables. We also require that the non-probability database is large and can be linked to the probability sample. We provide conditions ensuring that the QR predictor is asymptotically design-unbiased. We derive its asymptotic design variance and provide a consistent design-based variance estimator. We compare the design properties of different predictors, in the class of QR predictors, through a simulation study. This class includes a model-based predictor, a model-assisted estimator and a cosmetic estimator. In our simulation setups, the cosmetic estimator performed slightly better than the model-assisted estimator. These findings are confirmed by an application to La Poste data, which also illustrates that the properties of the cosmetic estimator are preserved irrespective of the observed non-probability sample.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200010
    Description: Sample coordination methods aim to increase (in positive coordination) or decrease (in negative coordination) the size of the overlap between samples. The samples considered can be from different occasions of a repeated survey and/or from different surveys covering a common population. Negative coordination is used to control the response burden in a given period, because some units do not respond to survey questionnaires if they are selected in many samples. Usually, methods for sample coordination do not take into account any measure of the response burden that a unit has already expended in responding to previous surveys. We introduce such a measure into a new method by adapting a spatially balanced sampling scheme, based on a generalization of Poisson sampling, together with a negative coordination method. The goal is to create a double control of the burden for these units: once by using a measure of burden during the sampling process and once by using a negative coordination method. We evaluate the approach using Monte-Carlo simulation and investigate its use for controlling for selection “hot-spots” in business surveys in Statistics Netherlands.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200011
    Description: The article considers sampling designs for populations that can be represented as an N × M matrix. For instance, when investigating tourist activities, the rows could be locations visited by tourists and the columns days in the tourist season. The goal is to sample cells (i, j) of the matrix when the number of selections within each row and each column is fixed a priori. The ith row sample size represents the number of selected cells within row i; the jth column sample size is the number of selected cells within column j. A matrix sampling design gives an N × M matrix of sample indicators, with entry 1 at position (i, j) if cell (i, j) is sampled and 0 otherwise. The first matrix sampling design investigated has one level of sampling, with row and column sample sizes set in advance: the row sample sizes can vary while the column sample sizes are all equal. The fixed margins can be seen as balancing constraints, and algorithms available for selecting such samples are reviewed. A new estimator for the variance of the Horvitz-Thompson estimator for the mean of survey variable y is then presented. Several levels of sampling might be necessary to account for all the constraints; this involves multi-level matrix sampling designs that are also investigated.
    Release date: 2024-01-03
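
    As an illustration only (not taken from the article), the sample-indicator matrix described above can be sketched in a few lines of Python. The function names `margins` and `satisfies_design` and the 3 × 4 example are hypothetical; the sketch only checks the fixed-margin constraints and does not reproduce any of the selection algorithms the article reviews.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[int]]


def margins(indicators: Matrix) -> Tuple[List[int], List[int]]:
    """Return (row sample sizes, column sample sizes) of a 0/1 indicator matrix."""
    row_sizes = [sum(row) for row in indicators]
    col_sizes = [sum(col) for col in zip(*indicators)]
    return row_sizes, col_sizes


def satisfies_design(indicators: Matrix, row_sizes: Sequence[int], col_size: int) -> bool:
    """Check the one-level design: 0/1 entries, the given (possibly unequal)
    row sample sizes, and one common sample size shared by every column."""
    if any(value not in (0, 1) for row in indicators for value in row):
        return False
    rows, cols = margins(indicators)
    return rows == list(row_sizes) and all(c == col_size for c in cols)


# Hypothetical example: 3 locations (rows) by 4 days (columns).
sample = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 1],
]

print(margins(sample))
print(satisfies_design(sample, row_sizes=[2, 2, 4], col_size=2))
```

    Here the row sample sizes differ (2, 2 and 4) while every column has the same sample size (2), which matches the first design mentioned in the description.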

  • Articles and reports: 12-001-X202300200012
    Description: In recent decades, many different uses of auxiliary information have enriched survey sampling theory and practice. Jean-Claude Deville contributed significantly to this progress. My comments trace some of the steps on the way to one important theory for the use of auxiliary information: Estimation by calibration.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200013
    Description: Jean-Claude Deville is one of the most prominent researchers in survey sampling theory and practice. His research on balanced sampling, indirect sampling and calibration in particular is internationally recognized and widely used in official statistics. He was also a pioneer in the field of functional data analysis. This discussion gives us the opportunity to recognize the immense work he has accomplished, and to pay tribute to him. In the first part of this article, we briefly recall his contribution to functional principal analysis. We also detail some recent extensions of his work at the intersection of the fields of functional data analysis and survey sampling. In the second part of this paper, we present some extensions of Jean-Claude’s work in indirect sampling. These extensions are motivated by concrete applications and illustrate Jean-Claude’s influence on our work as researchers.
    Release date: 2024-01-03

Stats in brief (10) (10 results)

  • Stats in brief: 11-001-X202411338008
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2024-04-22

  • Stats in brief: 11-637-X
    Description: This product presents data on the Sustainable Development Goals. It gives an overview of the 17 Goals through infographics, leveraging data currently available to report on Canada’s progress towards the 2030 Agenda for Sustainable Development.
    Release date: 2024-01-25

  • Stats in brief: 11-001-X202402237898
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2024-01-22

  • Stats in brief: 89-20-00062023001
    Description: This course is intended for Government of Canada employees who would like to learn about evaluating the quality of data for a particular use. Whether you are a new employee interested in learning the basics, or an experienced subject matter expert looking to refresh your skills, this course is here to help.
    Release date: 2023-07-17

  • Stats in brief: 98-20-00032021011
    Description: This video explains the key concepts of different levels of aggregation of income data such as household and family income; income concepts derived from key income variables such as adjusted income and equivalence scale; and statistics used for income data such as median and average income, quartiles, quintiles, deciles and percentiles.
    Release date: 2023-03-29

  • Stats in brief: 98-20-00032021012
    Description: This video builds on concepts introduced in the other videos on income. It explains key low-income concepts, namely the Market Basket Measure (MBM), the low-income measure (LIM) and the low-income cut-offs (LICO), and the indicators associated with these concepts, such as the low-income gap and the low-income ratio. These concepts are used in analysis of the economic well-being of the population.
    Release date: 2023-03-29

  • Stats in brief: 11-001-X202231822683
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2022-11-14

  • Stats in brief: 89-20-00062022004
    Description:

    Gathering, exploring, analyzing and interpreting data are essential steps in producing information that benefits society, the economy and the environment. In this video, we will discuss the importance of considering data ethics throughout the process of producing statistical information.

    As a pre-requisite to this video, make sure to watch the video titled “Data Ethics: An introduction” also available in Statistics Canada’s data literacy training catalogue.

    Release date: 2022-10-17

  • Stats in brief: 89-20-00062022005
    Description:

    In this video, you will learn the answers to the following questions: What are the different types of error? What are the types of error that lead to statistical bias? Where during the data journey can statistical bias occur?

    Release date: 2022-10-17

  • Stats in brief: 45-20-00032022002
    Description:

    Canada’s diversity and rich cultural heritage have been shaped by the people who have come from all over the world to call it home. But even in our multicultural society, eliminating all forms of discrimination remains a challenge. In this episode, we turn a critical eye to the ways that cognitive bias risks perpetuating systemic racism. Statistics are supposed to accurately reflect the world around us, but are all data created equal? Join our guests, Sarah Messou-Ghelazzi, Communications Officer, Filsan Hujaleh, Analyst with the Centre for Social Data Insights and Innovation, and Jeff Latimer, Director General - Accountable for Health, Justice, Diversity and Populations at Statistics Canada as we explore the role data can play to make Canada a more equal society for all.

    Release date: 2022-03-16

Articles and reports (169) (30 to 40 of 169 results)

  • Articles and reports: 12-001-X202300200011
    Description: The article considers sampling designs for populations that can be represented as an N × M matrix. For instance, when investigating tourist activities, the rows could be locations visited by tourists and the columns days in the tourist season. The goal is to sample cells (i, j) of the matrix when the number of selections within each row and each column is fixed a priori. The ith row sample size represents the number of selected cells within row i; the jth column sample size is the number of selected cells within column j. A matrix sampling design gives an N × M matrix of sample indicators, with entry 1 at position (i, j) if cell (i, j) is sampled and 0 otherwise. The first matrix sampling design investigated has one level of sampling, with row and column sample sizes set in advance: the row sample sizes can vary while the column sample sizes are all equal. The fixed margins can be seen as balancing constraints, and algorithms available for selecting such samples are reviewed. A new estimator for the variance of the Horvitz-Thompson estimator for the mean of survey variable y is then presented. Several levels of sampling might be necessary to account for all the constraints; this involves multi-level matrix sampling designs that are also investigated.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200012
    Description: In recent decades, many different uses of auxiliary information have enriched survey sampling theory and practice. Jean-Claude Deville contributed significantly to this progress. My comments trace some of the steps on the way to one important theory for the use of auxiliary information: Estimation by calibration.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200013
    Description: Jean-Claude Deville is one of the most prominent researchers in survey sampling theory and practice. His research on balanced sampling, indirect sampling and calibration in particular is internationally recognized and widely used in official statistics. He was also a pioneer in the field of functional data analysis. This discussion gives us the opportunity to recognize the immense work he has accomplished, and to pay tribute to him. In the first part of this article, we briefly recall his contribution to functional principal analysis. We also detail some recent extensions of his work at the intersection of the fields of functional data analysis and survey sampling. In the second part of this paper, we present some extensions of Jean-Claude’s work in indirect sampling. These extensions are motivated by concrete applications and illustrate Jean-Claude’s influence on our work as researchers.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200014
    Description: Many things have been written about Jean-Claude Deville in tributes from the statistical community (see Tillé, 2022a; Tillé, 2022b; Christine, 2022; Ardilly, 2022; and Matei, 2022) and from the École nationale de la statistique et de l’administration économique (ENSAE) and the Société française de statistique. Pascal Ardilly, David Haziza, Pierre Lavallée and Yves Tillé provide an in-depth look at Jean-Claude Deville’s contributions to survey theory. To pay tribute to him, I would like to discuss Jean-Claude Deville’s contribution to the more day-to-day application of methodology for all the statisticians at the Institut national de la statistique et des études économiques (INSEE) and at the public statistics service. To do this, I will use my work experience, and particularly the four years (1992 to 1996) I spent working with him in the Statistical Methods Unit and the discussions we had thereafter, especially in the 2000s on the rolling census.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200015
    Description: This article discusses and provides comments on Ardilly, Haziza, Lavallée and Tillé’s summary presentation of Jean-Claude Deville’s work on survey theory. It sheds light on the context, applications and uses of his findings, and shows how these have become engrained in the role of statisticians, in which Jean-Claude was a trailblazer. It also discusses other aspects of his career and his creative inventions.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200016
    Description: In this discussion, I will present some additional aspects of three major areas of survey theory developed or studied by Jean-Claude Deville: calibration, balanced sampling and the generalized weight-share method.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200017
    Description: Jean-Claude Deville, who passed away in October 2021, was one of the most influential researchers in the field of survey statistics over the past 40 years. This article traces some of his contributions that have had a profound impact on both survey theory and practice. This article will cover the topics of balanced sampling using the cube method, calibration, the weight-sharing method, the development of variance expressions of complex estimators using influence function and quota sampling.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200018
    Description: Sample surveys, as a tool for policy development and evaluation and for scientific, social and economic research, have been employed for over a century. In that time, they have primarily served as tools for collecting data for enumerative purposes. Estimation of these characteristics has been typically based on weighting and repeated sampling, or design-based, inference. However, sample data have also been used for modelling the unobservable processes that gave rise to the finite population data. This type of use has been termed analytic, and often involves integrating the sample data with data from secondary sources.

    Alternative approaches to inference in these situations, drawing inspiration from mainstream statistical modelling, have been strongly promoted. The principal focus of these alternatives has been on allowing for informative sampling. Modern survey sampling, though, is more focussed on situations where the sample data are in fact part of a more complex set of data sources all carrying relevant information about the process of interest. When an efficient modelling method such as maximum likelihood is preferred, the issue becomes one of how it should be modified to account for both complex sampling designs and multiple data sources. Here application of the Missing Information Principle provides a clear way forward.

    In this paper I review how this principle has been applied to resolve so-called “messy” data analysis issues in sampling. I also discuss a scenario that is a consequence of the rapid growth in auxiliary data sources for survey data analysis. This is where sampled records from one accessible source or register are linked to records from another less accessible source, with values of the response variable of interest drawn from this second source, and where a key output is small area estimates for the response variable for domains defined on the first source.
    Release date: 2024-01-03

  • Articles and reports: 82-003-X202301200002
    Description: The validity of survival estimates from cancer registry data depends, in part, on the identification of the deaths of deceased cancer patients. People whose deaths are missed seemingly live on forever and are informally referred to as “immortals”, and their presence in registry data can result in inflated survival estimates. This study assesses the issue of immortals in the Canadian Cancer Registry (CCR) using a recently proposed method that compares the survival of long-term survivors of cancers for which “statistical” cure has been reported with that of similar people from the general population.
    Release date: 2023-12-20

  • Articles and reports: 11-633-X2023003
    Description: This paper spans the academic work and estimation strategies used in national statistics offices. It addresses the issue of producing fine, grid-level geography estimates for Canada by exploring the measurement of subprovincial and subterritorial gross domestic product using Yukon as a test case.
    Release date: 2023-12-15

Journals and periodicals (13) (0 to 10 of 13 results)

  • Journals and periodicals: 11-522-X
    Description: Since 1984, an annual international symposium on methodological issues has been sponsored by Statistics Canada. Proceedings have been available since 1987.
    Release date: 2024-06-28

  • Journals and periodicals: 12-001-X
    Geography: Canada
    Description: The journal publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves.
    Release date: 2024-06-25

  • Journals and periodicals: 75F0002M
    Description: This series provides detailed documentation on income developments, including survey design issues, data quality evaluation and exploratory research.
    Release date: 2024-04-26

  • Journals and periodicals: 11-633-X
    Description: Papers in this series provide background discussions of the methods used to develop data for economic, health, and social analytical studies at Statistics Canada. They are intended to provide readers with information on the statistical methods, standards and definitions used to develop databases for research purposes. All papers in this series have undergone peer and institutional review to ensure that they conform to Statistics Canada's mandate and adhere to generally accepted standards of good professional practice.
    Release date: 2024-01-22

  • Journals and periodicals: 12-206-X
    Description: This report summarizes the annual achievements of the Methodology Research and Development Program (MRDP) sponsored by the Modern Statistical Methods and Data Science Branch at Statistics Canada. This program covers research and development activities in statistical methods with potentially broad application in the agency’s statistical programs; these activities would otherwise be less likely to be carried out during the provision of regular methodology services to those programs. The MRDP also includes activities that provide support in the application of past successful developments in order to promote the use of the results of research and development work. Selected prospective research activities are also presented.
    Release date: 2023-10-11

  • Journals and periodicals: 92F0138M
    Description:

    The Geography working paper series is intended to stimulate discussion on a variety of topics covering conceptual, methodological or technical work to support the development and dissemination of the division's data, products and services. Readers of the series are encouraged to contact the Geography Division with comments and suggestions.

    Release date: 2019-11-13

  • Journals and periodicals: 89-20-0001
    Description:

    Historical works allow readers to peer into the past, not only to satisfy our curiosity about “the way things were,” but also to see how far we’ve come, and to learn from the past. For Statistics Canada, such works are also opportunities to commemorate the agency’s contributions to Canada and its people, and serve as a reminder that an institution such as this continues to evolve each and every day.

    On the occasion of Statistics Canada’s 100th anniversary in 2018, Standing on the shoulders of giants: History of Statistics Canada: 1970 to 2008, builds on the work of two significant publications on the history of the agency, picking up the story in 1970 and carrying it through the next 38 years, until 2008. To that end, when enough time has passed to allow for sufficient objectivity, it will again be time to document the agency’s next chapter as it continues to tell Canada’s story in numbers.

    Release date: 2018-12-03

  • Journals and periodicals: 12-605-X
    Description:

    The Record Linkage Project Process Model (RLPPM) was developed by Statistics Canada to identify the processes and activities involved in record linkage. The RLPPM applies to linkage projects conducted at the individual and enterprise level using diverse data sources to create new data sources to meet analytical and operational needs.

    Release date: 2017-06-05

  • Journals and periodicals: 11-634-X
    Description:

    This publication is a catalogue of strategies and mechanisms that a statistical organization should consider adopting, according to its particular context. This compendium is based on lessons learned and best practices of leadership and management of statistical agencies within the scope of Statistics Canada’s International Statistical Fellowship Program (ISFP). It contains four broad sections: characteristics of an effective national statistical system; core management practices; improving, modernizing and finding efficiencies; and strategies to better inform and engage key stakeholders.

    Release date: 2016-07-06

  • Journals and periodicals: 88F0006X
    Geography: Canada
    Description:

    Statistics Canada is engaged in the "Information System for Science and Technology Project" to develop useful indicators of activity and a framework to tie them together into a coherent picture of science and technology (S&T) in Canada. The working papers series is used to publish results of the different initiatives conducted within this project. The data are related to the activities, linkages and outcomes of S&T. Several key areas are covered such as: innovation, technology diffusion, human resources in S&T and interrelations between different actors involved in S&T. This series also presents data tabulations taken from regular surveys on research and development (R&D) and S&T and made possible by the project.

    Release date: 2011-12-23