Quality assurance


Results

All (250) (0 to 10 of 250 results)

  • Journals and periodicals: 75F0002M
    Description: This series provides detailed documentation on income developments, including survey design issues, data quality evaluation and exploratory research.
    Release date: 2024-04-26

  • Surveys and statistical programs – Documentation: 32-26-0007
    Description: Census of Agriculture data provide statistical information on farms and farm operators at fine geographic levels and for small subpopulations. Quality evaluation activities are essential to ensure that census data are reliable and that they meet user needs.

    This report provides data quality information pertaining to the Census of Agriculture, such as sources of error, error detection, disclosure control methods, data quality indicators, response rates and collection rates.
    Release date: 2024-02-06

  • Articles and reports: 13-604-M2024001
    Description: This documentation outlines the methodology used to develop the Distributions of household economic accounts published in January 2024 for the reference years 2010 to 2023. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.
    Release date: 2024-01-22

  • Articles and reports: 13-604-M2023001
    Description: This documentation outlines the methodology used to develop the Distributions of household economic accounts published in March 2023 for the reference years 2010 to 2022. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.
    Release date: 2023-03-31

  • Articles and reports: 13-604-M2022002
    Description:

    This documentation outlines the methodology used to develop the Distributions of household economic accounts published in August 2022 for the reference years 2010 to 2021. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.

    Release date: 2022-08-03

  • 19-22-0009
    Description:

Join us as Statistics Canada’s Quality Secretariat gives a presentation on the importance of data quality. We are living in an exciting time for data: sources are more abundant, they are being generated in innovative ways, and they are available more quickly than ever. However, a data source that does not meet basic quality standards is not merely worthless – it can be misleading, and worse than having no data at all! Statistics Canada’s Quality Secretariat has a mandate to promote good quality practices within the agency, across the Government of Canada, and internationally. For quality to truly be present, it must be incorporated into each process (from design to analysis) and into the product itself – whether that product is a microdata file or estimates derived from it. We will address why data quality is important and how one can evaluate it in practice. We will cover some basic concepts in data quality (quality assurance vs. quality control, metadata, etc.) and present data quality as a multidimensional concept. Finally, we will show data quality in action by evaluating a data source together. All data quality literacy levels are welcome. After all, everybody plays a part in quality!

    https://www.statcan.gc.ca/en/services/webinars/19220009

    Release date: 2022-01-26

  • Articles and reports: 11-522-X202100100015
    Description: National statistical agencies such as Statistics Canada have a responsibility to convey the quality of statistical information to users. The methods traditionally used to do this are based on measures of sampling error. As a result, they are not adapted to the estimates produced using administrative data, for which the main sources of error are not due to sampling. A more suitable approach to reporting the quality of estimates presented in a multidimensional table is described in this paper. Quality indicators were derived for various post-acquisition processing steps, such as linkage, geocoding and imputation, by estimation domain. A clustering algorithm was then used to combine domains with similar quality levels for a given estimate. Ratings to inform users of the relative quality of estimates across domains were assigned to the groups created. This indicator, called the composite quality indicator (CQI), was developed and experimented with in the Canadian Housing Statistics Program (CHSP), which aims to produce official statistics on the residential housing sector in Canada using multiple administrative data sources.

    Keywords: Unsupervised machine learning, quality assurance, administrative data, data integration, clustering.

    Release date: 2021-10-22
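
The clustering step described in this abstract can be illustrated with a small sketch. Everything below is an illustrative assumption, not the CHSP's actual method: the domain names, per-step indicators (linkage, geocoding, imputation), weights and rating letters are invented. The sketch combines the indicators into a composite score per estimation domain, groups domains with a naive one-dimensional k-means, and labels the groups from best to worst.

```python
def composite_score(indicators, weights):
    """Weighted average of per-step quality indicators, each in [0, 1]."""
    total = sum(weights.values())
    return sum(indicators[step] * w for step, w in weights.items()) / total

def assign_ratings(scores, k=3, iters=20):
    """Group domains with similar composite scores (naive 1-D k-means)
    and label the clusters A (best) through C (worst)."""
    xs = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    # initialise centroids at evenly spaced ranks of the sorted scores
    centroids = [xs[round(i * (len(xs) - 1) / (k - 1))][1] for i in range(k)]
    groups = []
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for name, s in xs:
            j = min(range(k), key=lambda c: abs(s - centroids[c]))
            groups[j].append((name, s))
        centroids = [sum(s for _, s in g) / len(g) if g else centroids[i]
                     for i, g in enumerate(groups)]
    order = sorted(range(k), key=lambda c: -centroids[c])
    return {name: "ABCDEFG"[rank]
            for rank, c in enumerate(order) for name, _ in groups[c]}

# Hypothetical domains with per-step quality indicators.
weights = {"linkage": 0.4, "geocoding": 0.3, "imputation": 0.3}
domains = {
    "region_A": {"linkage": 0.98, "geocoding": 0.97, "imputation": 0.95},
    "region_B": {"linkage": 0.90, "geocoding": 0.85, "imputation": 0.80},
    "region_C": {"linkage": 0.60, "geocoding": 0.55, "imputation": 0.50},
}
scores = {d: composite_score(ind, weights) for d, ind in domains.items()}
ratings = assign_ratings(scores)
```

The point of the grouping step is that users see a small number of relative quality ratings per table cell rather than a raw indicator per processing step.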

  • Articles and reports: 11-522-X202100100023
    Description:

Our increasingly digital society provides multiple opportunities to maximise our use of data for the public good – using a range of sources, data types and technologies to enable us to better inform the public about social and economic matters and contribute to the effective development and evaluation of public policy. Ensuring that data are used in ethically appropriate ways is an important enabler for realising the potential of data for public-good research and statistics. Earlier this year, the UK Statistics Authority launched the Centre for Applied Data Ethics to provide applied data ethics services, advice, training and guidance to the analytical community across the United Kingdom. The Centre has developed a framework and portfolio of services to empower analysts to consider the ethics of their research quickly and easily at the research design phase, thus promoting a culture of ethics by design. This paper provides an overview of this framework, the accompanying user support services and the impact of this work.

Keywords: Data ethics, data, research and statistics

    Release date: 2021-10-22

  • Articles and reports: 13-604-M2021001
    Description:

    This documentation outlines the methodology used to develop the Distributions of household economic accounts published in September 2021 for the reference years 2010 to 2020. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.

    Release date: 2021-09-07

  • Stats in brief: 89-20-00062020001
    Description:

    In this video, you will be introduced to the fundamentals of data quality, which can be summed up in six dimensions—or six different ways to think about quality. You will also learn how each dimension can be used to evaluate the quality of data.

    Release date: 2020-09-23
Data (0) (0 results)

No content available at this time.

Analysis (171) (50 to 60 of 171 results)

  • Articles and reports: 11-522-X200800010985
    Description:

In Canada, although complex businesses represent less than 1% of the total number of businesses, they contribute more than 45% of the total revenue. Statistics Canada recognized that the quality of the data collected from them is of great importance and has adopted several initiatives to improve it. One of these initiatives is the evaluation of the coherence of the data collected from large, complex enterprises. The findings of these recent coherence analyses have been instrumental in identifying areas for improvement which, once addressed, would increase the quality of the data collected from large, complex enterprises while reducing the response burden imposed on them.

    Release date: 2009-12-03

  • Articles and reports: 11-522-X200800010991
    Description:

    In the evaluation of prospective survey designs, statistical agencies generally must consider a large number of design factors that may have a substantial impact on both survey costs and data quality. Assessments of trade-offs between cost and quality are often complicated by limitations on the amount of information available regarding fixed and marginal costs related to: instrument redesign and field testing; the number of primary sample units and sample elements included in the sample; assignment of instrument sections and collection modes to specific sample elements; and (for longitudinal surveys) the number and periodicity of interviews. Similarly, designers often have limited information on the impact of these design factors on data quality.

    This paper extends standard design-optimization approaches to account for uncertainty in the abovementioned components of cost and quality. Special attention is directed toward the level of precision required for cost and quality information to provide useful input into the design process; sensitivity of cost-quality trade-offs to changes in assumptions regarding functional forms; and implications for preliminary work focused on collection of cost and quality information. In addition, the paper considers distinctions between cost and quality components encountered in field testing and production work, respectively; incorporation of production-level cost and quality information into adaptive design work; as well as costs and operational risks arising from the collection of detailed cost and quality data during production work. The proposed methods are motivated by, and applied to, work with partitioned redesign of the interview and diary components of the U.S. Consumer Expenditure Survey.

    Release date: 2009-12-03

  • Articles and reports: 11-522-X200800011002
    Description:

    Based on a representative sample of the Canadian population, this article quantifies the bias resulting from the use of self-reported rather than directly measured height, weight and body mass index (BMI). Associations between BMI categories and selected health conditions are compared to see if the misclassification resulting from the use of self-reported data alters associations between obesity and obesity-related health conditions. The analysis is based on 4,567 respondents to the 2005 Canadian Community Health Survey (CCHS) who, during a face-to-face interview, provided self-reported values for height and weight and were then measured by trained interviewers. Based on self-reported data, a substantial proportion of individuals with excess body weight were erroneously placed in lower BMI categories. This misclassification resulted in elevated associations between overweight/obesity and morbidity.

    Release date: 2009-12-03
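
The direction of the misclassification described in this abstract is easy to see with the standard WHO BMI cut-points. The respondent values below are invented for illustration; the pattern they show – under-reported weight and over-reported height pushing a respondent into a lower BMI category – is the one the article quantifies.

```python
def bmi_category(weight_kg, height_m):
    """Classify body mass index using the standard WHO cut-points."""
    bmi = weight_kg / height_m ** 2
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal"
    if bmi < 30.0:
        return "overweight"
    return "obese"

# Invented respondent: self-reports 78 kg and 1.80 m in the interview,
# but measures 82 kg and 1.77 m when assessed by a trained interviewer.
self_reported = bmi_category(78.0, 1.80)   # BMI ~ 24.1
measured = bmi_category(82.0, 1.77)        # BMI ~ 26.2
```

Here the self-reported data place the respondent in the "normal" category while the measured data place them in "overweight" – exactly the kind of downward misclassification that inflates associations between overweight/obesity and morbidity.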

  • Articles and reports: 11-522-X200800011007
    Description:

    The Questionnaire Design Resource Centre (QDRC) is the focal point of expertise at Statistics Canada for questionnaire design and evaluation. As it stands now, cognitive interviewing to test questionnaires is most often done near the end of the questionnaire development process. By participating earlier in the questionnaire development process, the QDRC could test new survey topics using more adaptive cognitive methods for each step of the questionnaire development process. This would necessitate fewer participants for each phase of testing, thus reducing the cost and the recruitment challenge.

    Based on a review of the literature and Statistics Canada's existing questionnaire evaluation projects, this paper will describe how the QDRC could help clients in making appropriate improvements to their questionnaire in a timely manner.

    Release date: 2009-12-03

  • Articles and reports: 11-522-X200800011014
    Description:

    In many countries, improved quality of economic statistics is one of the most important goals of the 21st century. First and foremost, the quality of National Accounts is in focus, regarding both annual and quarterly accounts. To achieve this goal, data quality regarding the largest enterprises is of vital importance. To assure that the quality of data for the largest enterprises is good, coherence analysis is an important tool. Coherence means that data from different sources fit together and give a consistent view of the development within these enterprises. Working with coherence analysis in an efficient way is normally a work-intensive task consisting mainly of collecting data from different sources and comparing them in a structured manner. Over the last two years, Statistics Sweden has made great progress in improving the routines for coherence analysis. An IT tool that collects data for the largest enterprises from a large number of sources and presents it in a structured and logical matter has been built, and a systematic approach to analyse data for National Accounts on a quarterly basis has been developed. The paper describes the work in both these areas and gives an overview of the IT tool and the agreed routines.

    Release date: 2009-12-03

  • Articles and reports: 12-001-X200900110887
    Description:

    Many survey organisations focus on the response rate as being the quality indicator for the impact of non-response bias. As a consequence, they implement a variety of measures to reduce non-response or to maintain response at some acceptable level. However, response rates alone are not good indicators of non-response bias. In general, higher response rates do not imply smaller non-response bias. The literature gives many examples of this (e.g., Groves and Peytcheva 2006, Keeter, Miller, Kohut, Groves and Presser 2000, Schouten 2004).

    We introduce a number of concepts and an indicator to assess the similarity between the response and the sample of a survey. Such quality indicators, which we call R-indicators, may serve as counterparts to survey response rates and are primarily directed at evaluating the non-response bias. These indicators may facilitate analysis of survey response over time, between various fieldwork strategies or data collection modes. We apply the R-indicators to two practical examples.

    Release date: 2009-06-22
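
A commonly used form of the R-indicator (not spelled out in this abstract, so treat the formula as a hedged sketch) is R = 1 - 2·S(ρ), where S is the standard deviation of the estimated response propensities: equal propensities give R = 1, a fully representative response, and greater spread drives R toward 0 regardless of the response rate.

```python
from statistics import fmean

def r_indicator(propensities):
    """R = 1 - 2 * S(rho), with S the (population) standard deviation of
    the estimated response propensities.  R = 1 when every sample unit is
    equally likely to respond; smaller R signals a higher risk of
    non-response bias, independently of the overall response rate."""
    m = fmean(propensities)
    s = fmean([(p - m) ** 2 for p in propensities]) ** 0.5
    return 1.0 - 2.0 * s

# Two hypothetical surveys with the same 60% response rate but
# different balance across sample units:
uniform = r_indicator([0.6, 0.6, 0.6, 0.6])   # representative response
skewed = r_indicator([0.9, 0.9, 0.3, 0.3])    # unbalanced response
```

The two calls illustrate the abstract's point that response rates alone are poor indicators of bias: both surveys respond at 60%, yet the skewed one scores far lower.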

  • Articles and reports: 82-003-X200800410703
    Geography: Canada
    Description:

    Data from 16,190 respondents to the 2004 Canadian Community Health Survey - Nutrition were used to estimate under-reporting of food intake for the population aged 12 or older in the 10 provinces.

    Release date: 2008-10-15

  • Articles and reports: 82-003-X200800310680
    Geography: Canada
    Description:

    This study examines the feasibility of developing correction factors to adjust self-reported measures of body mass index to more closely approximate measured values. Data are from the 2005 Canadian Community Health Survey, in which respondents were asked to report their height and weight, and were subsequently measured.

    Release date: 2008-09-17

  • Articles and reports: 82-622-X2008001
    Geography: Canada
    Description:

In this study, I examine the factorial validity of selected modules from the Canadian Survey of Experiences with Primary Health Care (CSE-PHC), in order to determine the potential for combining the items within each module into summary indices representing global primary health care concepts. The modules examined were: Patient Assessment of Chronic Illness Care (PACIC), Patient Activation (PA), Managing Own Health Care (MOHC), and Confidence in the Health Care System (CHCS). Confirmatory factor analyses were conducted on each module to assess the degree to which multiple observed items reflected the presence of common latent factors. While a four-factor model was initially specified for the PACIC instrument on the basis of prior theory and research, it did not fit the data well; rather, a revised two-factor model was found to be most appropriate. These two factors were labelled "Whole Person Care" and "Coordination of Care". The remaining modules studied here (i.e., PA, MOHC, and CHCS) were all well represented by single-factor models. The results suggest that the original factor structure of the PACIC developed within studies using clinical samples does not hold in general populations, although the precise reasons for this are not clear. Further empirical investigation will be required to shed more light on this discrepancy. The two factors identified here for the PACIC, as well as the single factors produced for the PA, MOHC, and CHCS, could be used as the basis of summary indices for use in further analyses with the CSE-PHC.

    Release date: 2008-07-08

  • Articles and reports: 11-522-X200600110397
    Description:

    In practice it often happens that some collected data are subject to measurement error. Sometimes covariates (or risk factors) of interest may be difficult to observe precisely due to physical location or cost. Sometimes it is impossible to measure covariates accurately due to the nature of the covariates. In other situations, a covariate may represent an average of a certain quantity over time, and any practical way of measuring such a quantity necessarily features measurement error. When carrying out statistical inference in such settings, it is important to account for the effects of mismeasured covariates; otherwise, erroneous or even misleading results may be produced. In this paper, we discuss several measurement error examples arising in distinct contexts. Specific attention is focused on survival data with covariates subject to measurement error. We discuss a simulation-extrapolation method for adjusting for measurement error effects. A simulation study is reported.

    Release date: 2008-03-17
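
The simulation-extrapolation idea mentioned in this abstract can be sketched for the simplest case: a regression slope attenuated by additive measurement error in the covariate. Everything below – the linear extrapolant, the lambda grid, and the toy data – is an illustrative assumption, not the authors' implementation. The sketch adds extra noise at increasing levels, watches the naive estimate degrade, and extrapolates the trend back to the no-error case.

```python
import random

def slope(xs, ys):
    """Ordinary least-squares slope of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def simex_slope(w, y, var_u, lambdas=(0.5, 1.0, 1.5, 2.0), b=50, seed=7):
    """SIMEX for a slope attenuated by measurement error in the covariate:
    add extra noise of variance lambda*var_u to the mismeasured w, average
    the naive slope over b replicates per lambda, then fit a line in lambda
    and extrapolate to lambda = -1 (no measurement error)."""
    rng = random.Random(seed)
    pts = [(0.0, slope(w, y))]            # lambda = 0 is the naive fit
    for lam in lambdas:
        reps = []
        for _ in range(b):
            w_lam = [wi + rng.gauss(0.0, (lam * var_u) ** 0.5) for wi in w]
            reps.append(slope(w_lam, y))
        pts.append((lam, sum(reps) / len(reps)))
    ls = [p[0] for p in pts]
    bs = [p[1] for p in pts]
    c1 = slope(ls, bs)                    # trend of the estimate in lambda
    c0 = sum(bs) / len(bs) - c1 * sum(ls) / len(ls)
    return c0 - c1                        # fitted line evaluated at -1

# Toy data: true slope 2; covariate observed with error variance 0.25.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(500)]
y = [2 * xi + rng.gauss(0, 0.1) for xi in x]
w = [xi + rng.gauss(0, 0.5) for xi in x]

naive = slope(w, y)                  # attenuated toward 0
corrected = simex_slope(w, y, var_u=0.25)
```

On this toy data the naive slope sits well below the true value of 2, and the SIMEX extrapolation recovers most of the attenuation. Production implementations typically use a quadratic or rational extrapolant rather than the straight line used here.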
Reference (78) (60 to 70 of 78 results)

  • Surveys and statistical programs – Documentation: 11-522-X19980015018
    Description:

This paper presents a method for handling longitudinal data in which individuals belong to more than one unit at a higher level, and where information identifying the units to which they belong may be missing. In education, for example, a student might be classified as belonging sequentially to a particular combination of primary and secondary school, but for some students the identity of either the primary or secondary school may be unknown. Likewise, in a longitudinal study, students may change school or class from one period to the next, thus 'belonging' to more than one higher-level unit. The procedures used to model these structures are extensions of a random-effects cross-classified multilevel model.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015020
    Description:

At the end of 1993, Eurostat launched a 'community' panel of households. The first wave, carried out in 1994 in the 12 countries of the European Union, included some 7,300 households in France, and at least 14,000 adults aged 17 or over. Each individual was then followed up and interviewed each year, even if they had moved. The individuals leaving the sample present a particular profile. In the first part, we sketch how our sample evolves and analyse the main characteristics of the non-respondents. We then propose two models to correct for non-response per homogeneous category, and describe the longitudinal weight distribution obtained from the two models, as well as the cross-sectional weights obtained using the weight-share method. Finally, we compare some indicators calculated using both weighting methods.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015021
    Description:

    The U.S. Bureau of the Census implemented major changes to the design of the Survey of Income and Program Participation (SIPP) with the panel begun in 1996. The revised survey design emphasized longitudinal applications and the Census Bureau attempted to understand and resolve the seam bias common to longitudinal surveys. In addition to the substantive and administrative redesign of the survey, the Census Bureau is improving the data processing procedures which yield microdata files for the public to analyse. The wave-by-wave data products are being edited and imputed with a longitudinal element rather than cross-sectionally, carrying forward information from a prior wave that is missing in the current wave. The longitudinal data products will be enhanced, both by the redesigned survey and new processing procedures. Simple methods of imputing data over time are being replaced with more sophisticated methods that do not attenuate seam bias. The longitudinal sample is expanding to include more observations which were nonrespondents in one or more waves. Longitudinal weights will be applied to the file to support person-based longitudinal analysis for calendar years or longer periods of time (up to four years).

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015022
    Description:

    This article extends and further develops the method proposed by Pfeffermann, Skinner and Humphreys (1998) for the estimation of gross flows in the presence of classification errors. The main feature of that method is the use of auxiliary information at the individual level which circumvents the need for validation data for estimating the misclassification rates. The new developments in this article are the establishment of conditions for model identification, a study of the properties of a model goodness of fit statistic and modifications to the sample likelihood to account for missing data and informative sampling. The new developments are illustrated by a small Monte-Carlo simulation study.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015023
    Description:

The study of social mobility, between labour market statuses or between income levels, for example, is often based on the analysis of mobility matrices. When comparing these transition matrices with a view to evaluating behavioural changes, one often forgets that the data derive from a sample survey and are therefore affected by sampling variance. Similarly, it is assumed that the responses collected correspond to the 'true value.'

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015024
    Description:

A longitudinal study of a cohort of pupils in secondary school has been conducted in an Italian region since 1986 in order to study the transition from school to working life. The information has been collected at each sweep by a mail questionnaire and, at the final sweep, by a face-to-face interview in which retrospective questions referring back to the whole observation period were asked. The gross flows between different discrete states - still in the school system, in the labour force without a job, in the labour force with a job - may then be estimated from both prospective and retrospective data, and the recall effect may be evaluated. Moreover, the conditions observed by the two different techniques may be regarded as two indicators of the 'true' unobservable condition, leading to the specification and estimation of a latent class model. In this framework, a Markov chain hypothesis may be introduced and evaluated in order to estimate the transition probabilities between the states, once they are corrected for classification errors. Since the information collected by mail shows a certain amount of missing data in terms of unit non-response, a 'missing' category is also introduced in the model specification.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015026
    Description:

    The purpose of the present study is to utilize panel data from the Current Population Survey (CPS) to examine the effects of unit nonresponse. Because most nonrespondents to the CPS are respondents during at least one month-in-sample, data from other months can be used to compare the characteristics of complete respondents and panel nonrespondents and to evaluate nonresponse adjustment procedures. In the current paper we present analyses utilizing CPS panel data to illustrate the effects of unit nonresponse. After adjusting for nonresponse, additional comparisons are also made to evaluate the effects of nonresponse adjustment. The implications of the findings and suggestions for further research are discussed.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015027
    Description:

The disseminated results of annual business surveys inevitably contain statistics that are changing. Since the economic sphere is increasingly dynamic, a simple difference of aggregates between n-1 and n is no longer sufficient to provide an overall description of what has happened. The change calculation module in the new generation of annual business surveys divides overall change into various components (births, deaths, inter-industry migration) and calculates change on the basis of a constant field, assigning special importance to restructurings. The main difficulties lie in establishing subsamples, reweighting, calibrating according to calculable changes, and taking account of restructuring.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015028
    Description:

    We address the problem of estimation for the income dynamics statistics calculated from complex longitudinal surveys. In addition, we compare two design-based estimators of longitudinal proportions and transition rates in terms of variability under large attrition rates. One estimator is based on the cross-sectional samples for the estimation of the income class boundaries at each time period and on the longitudinal sample for the estimation of the longitudinal counts; the other estimator is entirely based on the longitudinal sample, both for the estimation of the class boundaries and the longitudinal counts. We develop Taylor linearization-type variance estimators for both the longitudinal and the mixed estimator under the assumption of no change in the population, and for the mixed estimator when there is change.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015029
    Description:

In longitudinal surveys, sample subjects are observed over several time points. This feature typically leads to dependent observations on the same subject, in addition to the customary correlations across subjects induced by the sample design. Much research in the literature has focussed on modeling the marginal mean of a response as a function of covariates. Liang and Zeger (1986) used generalized estimating equations (GEE), requiring only correct specification of the marginal mean, and obtained standard errors of regression parameter estimates and associated Wald tests, assuming a "working" correlation structure for the repeated measurements on a sample subject. Rotnitzky and Jewell (1990) developed quasi-score tests and Rao-Scott adjustments to "working" quasi-score tests under marginal models. These methods are asymptotically robust to misspecification of the within-subject correlation structure, but assume independence of sample subjects, which is not satisfied for complex longitudinal survey data based on stratified multi-stage sampling. We propose asymptotically valid Wald and quasi-score tests for longitudinal survey data, using the Taylor linearization and jackknife methods. Alternative tests, based on Rao-Scott adjustments to naive tests that ignore survey design features and on Bonferroni-t, are also developed. These tests are particularly useful when the effective degrees of freedom, usually taken as the total number of sample primary units (clusters) minus the number of strata, is small.

    Release date: 1999-10-22