Quality assurance

Results

All (250)

All (250) (0 to 10 of 250 results)

  • Journals and periodicals: 75F0002M
    Description: This series provides detailed documentation on income developments, including survey design issues, data quality evaluation and exploratory research.
    Release date: 2024-04-26

  • Surveys and statistical programs – Documentation: 32-26-0007
    Description: Census of Agriculture data provide statistical information on farms and farm operators at fine geographic levels and for small subpopulations. Quality evaluation activities are essential to ensure that census data are reliable and that they meet user needs.

    This report provides data quality information pertaining to the Census of Agriculture, such as sources of error, error detection, disclosure control methods, data quality indicators, response rates and collection rates.
    Release date: 2024-02-06

  • Articles and reports: 13-604-M2024001
    Description: This documentation outlines the methodology used to develop the Distributions of household economic accounts published in January 2024 for the reference years 2010 to 2023. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.
    Release date: 2024-01-22

  • Articles and reports: 13-604-M2023001
    Description: This documentation outlines the methodology used to develop the Distributions of household economic accounts published in March 2023 for the reference years 2010 to 2022. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.
    Release date: 2023-03-31

  • Articles and reports: 13-604-M2022002
    Description:

    This documentation outlines the methodology used to develop the Distributions of household economic accounts published in August 2022 for the reference years 2010 to 2021. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.

    Release date: 2022-08-03

  • 19-22-0009
    Description:

    Join us as Statistics Canada’s Quality Secretariat gives a presentation on the importance of data quality. We are living in an exciting time for data: sources are more abundant, they are being generated in innovative ways, and they are available more quickly than ever. However, a data source that does not meet basic quality standards is not only worthless – it can be misleading, and worse than having no data at all! Statistics Canada’s Quality Secretariat has a mandate to promote good quality practices within the agency, across the Government of Canada, and internationally. For quality to truly be present, it must be incorporated into each process (from design to analysis) and into the product itself – whether that product is a microdata file or estimates derived from it. We will address why data quality is important and how one can evaluate it in practice. We will cover some basic concepts in data quality (quality assurance vs. control, metadata, etc.), and present data quality as a multidimensional concept. Finally, we will show data quality in action by evaluating a data source together. All data quality literacy levels are welcome. After all, everybody plays a part in quality!

    https://www.statcan.gc.ca/en/services/webinars/19220009

    Release date: 2022-01-26

  • Articles and reports: 11-522-X202100100015
    Description: National statistical agencies such as Statistics Canada have a responsibility to convey the quality of statistical information to users. The methods traditionally used to do this are based on measures of sampling error. As a result, they are not adapted to the estimates produced using administrative data, for which the main sources of error are not due to sampling. A more suitable approach to reporting the quality of estimates presented in a multidimensional table is described in this paper. Quality indicators were derived for various post-acquisition processing steps, such as linkage, geocoding and imputation, by estimation domain. A clustering algorithm was then used to combine domains with similar quality levels for a given estimate. Ratings to inform users of the relative quality of estimates across domains were assigned to the groups created. This indicator, called the composite quality indicator (CQI), was developed and experimented with in the Canadian Housing Statistics Program (CHSP), which aims to produce official statistics on the residential housing sector in Canada using multiple administrative data sources.

    Keywords: Unsupervised machine learning, quality assurance, administrative data, data integration, clustering.

    Release date: 2021-10-22

  • Articles and reports: 11-522-X202100100023
    Description:

    Our increasingly digital society provides multiple opportunities to maximise our use of data for the public good – using a range of sources, data types and technologies to enable us to better inform the public about social and economic matters and contribute to the effective development and evaluation of public policy. Ensuring use of data in ethically appropriate ways is an important enabler for realising the potential to use data for public good research and statistics. Earlier this year the UK Statistics Authority launched the Centre for Applied Data Ethics to provide applied data ethics services, advice, training and guidance to the analytical community across the United Kingdom. The Centre has developed a framework and portfolio of services to empower analysts to consider the ethics of their research quickly and easily, at the research design phase thus promoting a culture of ethics by design. This paper will provide an overview of this framework, the accompanying user support services and the impact of this work.

    Key words: Data ethics, data, research and statistics

    Release date: 2021-10-22

  • Articles and reports: 13-604-M2021001
    Description:

    This documentation outlines the methodology used to develop the Distributions of household economic accounts published in September 2021 for the reference years 2010 to 2020. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.

    Release date: 2021-09-07

  • Stats in brief: 89-20-00062020001
    Description:

    In this video, you will be introduced to the fundamentals of data quality, which can be summed up in six dimensions—or six different ways to think about quality. You will also learn how each dimension can be used to evaluate the quality of data.

    Release date: 2020-09-23
Data (0)

Data (0) (0 results)

No content available at this time.

Analysis (171)

Analysis (171) (0 to 10 of 171 results)

  • Journals and periodicals: 75F0002M
    Description: This series provides detailed documentation on income developments, including survey design issues, data quality evaluation and exploratory research.
    Release date: 2024-04-26

  • Articles and reports: 13-604-M2024001
    Description: This documentation outlines the methodology used to develop the Distributions of household economic accounts published in January 2024 for the reference years 2010 to 2023. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.
    Release date: 2024-01-22

  • Articles and reports: 13-604-M2023001
    Description: This documentation outlines the methodology used to develop the Distributions of household economic accounts published in March 2023 for the reference years 2010 to 2022. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.
    Release date: 2023-03-31

  • Articles and reports: 13-604-M2022002
    Description:

    This documentation outlines the methodology used to develop the Distributions of household economic accounts published in August 2022 for the reference years 2010 to 2021. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.

    Release date: 2022-08-03

  • Articles and reports: 11-522-X202100100015
    Description: National statistical agencies such as Statistics Canada have a responsibility to convey the quality of statistical information to users. The methods traditionally used to do this are based on measures of sampling error. As a result, they are not adapted to the estimates produced using administrative data, for which the main sources of error are not due to sampling. A more suitable approach to reporting the quality of estimates presented in a multidimensional table is described in this paper. Quality indicators were derived for various post-acquisition processing steps, such as linkage, geocoding and imputation, by estimation domain. A clustering algorithm was then used to combine domains with similar quality levels for a given estimate. Ratings to inform users of the relative quality of estimates across domains were assigned to the groups created. This indicator, called the composite quality indicator (CQI), was developed and experimented with in the Canadian Housing Statistics Program (CHSP), which aims to produce official statistics on the residential housing sector in Canada using multiple administrative data sources.

    Keywords: Unsupervised machine learning, quality assurance, administrative data, data integration, clustering.

    Release date: 2021-10-22

  • Articles and reports: 11-522-X202100100023
    Description:

    Our increasingly digital society provides multiple opportunities to maximise our use of data for the public good – using a range of sources, data types and technologies to enable us to better inform the public about social and economic matters and contribute to the effective development and evaluation of public policy. Ensuring use of data in ethically appropriate ways is an important enabler for realising the potential to use data for public good research and statistics. Earlier this year the UK Statistics Authority launched the Centre for Applied Data Ethics to provide applied data ethics services, advice, training and guidance to the analytical community across the United Kingdom. The Centre has developed a framework and portfolio of services to empower analysts to consider the ethics of their research quickly and easily, at the research design phase thus promoting a culture of ethics by design. This paper will provide an overview of this framework, the accompanying user support services and the impact of this work.

    Key words: Data ethics, data, research and statistics

    Release date: 2021-10-22

  • Articles and reports: 13-604-M2021001
    Description:

    This documentation outlines the methodology used to develop the Distributions of household economic accounts published in September 2021 for the reference years 2010 to 2020. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.

    Release date: 2021-09-07

  • Stats in brief: 89-20-00062020001
    Description:

    In this video, you will be introduced to the fundamentals of data quality, which can be summed up in six dimensions—or six different ways to think about quality. You will also learn how each dimension can be used to evaluate the quality of data.

    Release date: 2020-09-23

  • Stats in brief: 89-20-00062020008
    Description:

    Accuracy is one of the six dimensions of Data Quality used at Statistics Canada. Accuracy refers to how well the data reflects the truth or what actually happened. In this video we will present methods to describe accuracy in terms of validity and correctness. We will also discuss methods to validate and check the accuracy of data values.

    Release date: 2020-09-23

  • Articles and reports: 13-604-M2020002
    Description:

    This documentation outlines the methodology used to develop the Distributions of household economic accounts published in June 2020 for the reference years 2010 to 2019. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.

    Release date: 2020-06-26
Reference (78)

Reference (78) (60 to 70 of 78 results)

  • Surveys and statistical programs – Documentation: 11-522-X19980015018
    Description:

    This paper presents a method for handling longitudinal data in which individuals belong to more than one unit at a higher level, and also where there is missing information on the identification of the units to which they belong. In education, for example, a student might be classified as belonging sequentially to a particular combination of primary and secondary school, but for some students, the identity of either the primary or secondary school may be unknown. Likewise, in a longitudinal study, students may change school or class from one period to the next, so 'belonging' to more than one higher level unit. The procedures used to model these structures are extensions of a random effects cross-classified multilevel model.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015020
    Description:

    At the end of 1993, Eurostat launched a 'community' panel of households. The first wave, carried out in 1994 in the 12 countries of the European Union, included some 7,300 households in France, and at least 14,000 adults 17 years or over. Each individual was then followed up and interviewed each year, even if they had moved. The individuals leaving the sample present a particular profile. In the first part, we present a sketch of how our sample evolves and an analysis of the main characteristics of the non-respondents. We then propose two models to correct for non-response per homogeneous category. We then describe the longitudinal weight distribution obtained from the two models, and the cross-sectional weights using the weight share method. Finally, we compare some indicators calculated using both weighting methods.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015021
    Description:

    The U.S. Bureau of the Census implemented major changes to the design of the Survey of Income and Program Participation (SIPP) with the panel begun in 1996. The revised survey design emphasized longitudinal applications and the Census Bureau attempted to understand and resolve the seam bias common to longitudinal surveys. In addition to the substantive and administrative redesign of the survey, the Census Bureau is improving the data processing procedures which yield microdata files for the public to analyse. The wave-by-wave data products are being edited and imputed with a longitudinal element rather than cross-sectionally, carrying forward information from a prior wave that is missing in the current wave. The longitudinal data products will be enhanced, both by the redesigned survey and new processing procedures. Simple methods of imputing data over time are being replaced with more sophisticated methods that do not attenuate seam bias. The longitudinal sample is expanding to include more observations which were nonrespondents in one or more waves. Longitudinal weights will be applied to the file to support person-based longitudinal analysis for calendar years or longer periods of time (up to four years).

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015022
    Description:

    This article extends and further develops the method proposed by Pfeffermann, Skinner and Humphreys (1998) for the estimation of gross flows in the presence of classification errors. The main feature of that method is the use of auxiliary information at the individual level which circumvents the need for validation data for estimating the misclassification rates. The new developments in this article are the establishment of conditions for model identification, a study of the properties of a model goodness of fit statistic and modifications to the sample likelihood to account for missing data and informative sampling. The new developments are illustrated by a small Monte-Carlo simulation study.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015023
    Description:

    The study of social mobility, between labour market statuses or between income levels, for example, is often based on the analysis of mobility matrices. When comparing these transition matrices, with a view to evaluating behavioural changes, one often forgets that the data derive from a sample survey and are therefore affected by sampling variances. Similarly, it is assumed that the responses collected correspond to the 'true value.'

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015024
    Description:

    A longitudinal study on a cohort of pupils in the secondary school has been conducted in an Italian region since 1986 in order to study the transition from school to working life. The information has been collected at every sweep by a mail questionnaire and, at the final sweep, by a face-to-face interview, where retrospective questions referring back to the whole observation period have been asked. The gross flows between different discrete states - still in the school system, in the labour force without a job, in the labour force with a job - may then be estimated both from prospective and retrospective data, and the recall effect may be evaluated. Moreover, the conditions observed by the two different techniques may be regarded as two indicators of the 'true' unobservable condition, thus leading to the specification and estimation of a latent class model. In this framework, a Markov chain hypothesis may be introduced and evaluated in order to estimate the transition probabilities between the states, once they are corrected for the classification errors. Since the information collected by mail shows a given amount of missing data in terms of unit nonresponse, the 'missing' category is also introduced in the model specification.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015026
    Description:

    The purpose of the present study is to utilize panel data from the Current Population Survey (CPS) to examine the effects of unit nonresponse. Because most nonrespondents to the CPS are respondents during at least one month-in-sample, data from other months can be used to compare the characteristics of complete respondents and panel nonrespondents and to evaluate nonresponse adjustment procedures. In the current paper we present analyses utilizing CPS panel data to illustrate the effects of unit nonresponse. After adjusting for nonresponse, additional comparisons are also made to evaluate the effects of nonresponse adjustment. The implications of the findings and suggestions for further research are discussed.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015027
    Description:

    The disseminated results of annual business surveys inevitably contain statistics that are changing. Since the economic sphere is increasingly dynamic, a simple difference of aggregates between n-1 and n is no longer sufficient to provide an overall description of what has happened. The change calculation module in the new generation of annual business surveys divides overall change into various components (births, deaths, inter-industry migration) and calculates change on the basis of a constant field, assigning special importance to restructurings. The main difficulties lie in establishing subsamples, reweighting, calibrating according to calculable changes, and taking account of restructuring.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015028
    Description:

    We address the problem of estimation for the income dynamics statistics calculated from complex longitudinal surveys. In addition, we compare two design-based estimators of longitudinal proportions and transition rates in terms of variability under large attrition rates. One estimator is based on the cross-sectional samples for the estimation of the income class boundaries at each time period and on the longitudinal sample for the estimation of the longitudinal counts; the other estimator is entirely based on the longitudinal sample, both for the estimation of the class boundaries and the longitudinal counts. We develop Taylor linearization-type variance estimators for both the longitudinal and the mixed estimator under the assumption of no change in the population, and for the mixed estimator when there is change.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015029
    Description:

    In longitudinal surveys, sample subjects are observed over several time points. This feature typically leads to dependent observations on the same subject, in addition to the customary correlations across subjects induced by the sample design. Much research in the literature has focussed on modeling the marginal mean of a response as a function of covariates. Liang and Zeger (1986) used generalized estimating equations (GEE), requiring only correct specification of the marginal mean, and obtained standard errors of regression parameter estimates and associated Wald tests, assuming a "working" correlation structure for the repeated measurements on a sample subject. Rotnitzky and Jewell (1990) developed quasi-score tests and Rao-Scott adjustments to "working" quasi-score tests under marginal models. These methods are asymptotically robust to misspecification of the within-subject correlation structure, but assume independence of sample subjects, which is not satisfied for complex longitudinal survey data based on stratified multi-stage sampling. We propose asymptotically valid Wald and quasi-score tests for longitudinal survey data, using the Taylor linearization and jackknife methods. Alternative tests, based on Rao-Scott adjustments to naive tests that ignore survey design features and on Bonferroni-t, are also developed. These tests are particularly useful when the effective degrees of freedom, usually taken as the total number of sample primary units (clusters) minus the number of strata, is small.

    Release date: 1999-10-22