Results

All (94) (0 to 10 of 94 results)

  • Articles and reports: 11-522-X202200100001
    Description: Record linkage aims to identify record pairs that relate to the same unit and are observed in two different data sets, say A and B. Fellegi and Sunter (1969) suggest testing whether each record pair is generated from the set of matched or unmatched pairs. The decision function consists of the ratio between m(y) and u(y), the probabilities of observing a comparison y of a set of k>3 key identifying variables in a record pair under the assumption that the pair is a match or a non-match, respectively. These parameters are usually estimated by means of the EM algorithm, using as data the comparisons on all the pairs of the Cartesian product Ω=A×B. These observations (on the comparisons and on the pairs' status as match or non-match) are assumed to be generated independently of other pairs, an assumption that characterizes most of the literature on record linkage and is implemented in software tools (e.g. RELAIS, Cibella et al. 2012). On the contrary, comparisons y and matching status in Ω are deterministically dependent. As a result, estimates of m(y) and u(y) based on the EM algorithm are usually poor. This fact jeopardizes the effective application of the Fellegi-Sunter method, as well as the automatic computation of quality measures and the possibility of applying efficient methods for model estimation on linked data (e.g. regression functions), as in Chambers et al. (2015). We propose to explore Ω by a set of samples, each one drawn so as to preserve independence of comparisons among the selected record pairs. Simulations are encouraging.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100002
    Description: The authors used the Splink probabilistic linkage package, developed by the UK Ministry of Justice, to link census data from England and Wales to itself to find duplicate census responses. A large gold standard of confirmed census duplicates was available, meaning that the results of the Splink implementation could be quality assured. This paper describes the implementation and features of Splink, gives details of the settings and parameters that we used to tune Splink for our particular project, and gives the results that we obtained.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100012
    Description: At Statistics Netherlands (SN) for some economic sectors two partly-independent intra-annual turnover index series are available: a monthly series based on survey data and a quarterly series based on value added tax data for the smaller units and re-used survey data for the other units. SN aims to benchmark the monthly turnover index series to the quarterly census data on a quarterly basis. This cannot currently be done because the tax data has a different quarterly pattern: the turnover is relatively large in the fourth quarter of the year and smaller in the first quarter. With the current study we aim to describe this deviating quarterly pattern at micro level. In the past we developed a mixture model using absolute turnover levels that could explain part of the quarterly patterns. Because the absolute turnover levels differ between the two series, in the current study we use a model based on relative quarterly turnover levels within a year.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100019
    Description: The purpose of this article is to compare the linkage results for individuals from French tax sources with those of the 2019 Enquête Annuelle de Recensement (EAR), obtained through different methods. Such a comparison will decide whether the Répertoires Statistiques d'Individus et de Logements (Résil) program should be equipped with a probabilistic matching tool for its administrative source identification and matching engine.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100020
    Description: The reconciliation of 2021 census dwellings with the new Statistical Building Register (SBgR) presented linkage challenges. The Census of Population collected information from various dwelling types. For a large proportion of the population, mailing addresses were at the centre: they were used for reaching out to people and collected as contact information. In parallel, the register environment has been evolving. The agency is transitioning from the Address Register (AR) to the SBgR, which holds both mailing and location addresses while also covering non-residential buildings. The reconciliation was conducted using a combination of systems, notably the new Register Matching Engine (RME) for difficult cases. The RME holds an interesting range of sophisticated string comparators. A deterministic linkage approach was used, while incorporating some data knowledge such as entropy. Through metadata, the matching expert could also reduce the number of false positives and false negatives.
    Release date: 2024-03-25

  • Articles and reports: 91F0015M2024002
    Description: This paper examines the emigration of immigrants using the Longitudinal Immigration Database (IMDB). An indirect definition of emigration is proposed that leverages the information available in the IMDB. This study found that emigration of immigrants is a significant phenomenon. Certain characteristics of immigrants, such as having children, admission category and country of birth, have a strong correlation with emigration.
    Release date: 2024-02-02

  • Articles and reports: 91F0015M2023001
    Description: Using record linkage, this article compares marital status as identified in the 2015 T1 tax data to what was provided in the 2016 Census.
    Release date: 2023-07-11

  • Articles and reports: 12-001-X202200100007
    Description:

    By record linkage one joins records residing in separate files which are believed to be related to the same entity. In this paper we approach record linkage as a classification problem, and adapt the maximum entropy classification method in machine learning to record linkage, both in the supervised and unsupervised settings of machine learning. The set of links will be chosen according to the associated uncertainty. On the one hand, our framework overcomes some persistent theoretical flaws of the classical approach pioneered by Fellegi and Sunter (1969); on the other hand, the proposed algorithm is fully automatic, unlike the classical approach that generally requires clerical review to resolve the undecided cases.

    Release date: 2022-06-21

  • Articles and reports: 11-522-X202100100006
    Description:

    In the context of its "admin-first" paradigm, Statistics Canada is prioritizing the use of non-survey sources to produce official statistics. This paradigm critically relies on non-survey sources that may have a nearly perfect coverage of some target populations, including administrative files or big data sources. Yet, this coverage must be measured, e.g., by applying the capture-recapture method, where they are compared to other sources with good coverage of the same populations, including a census. However, this is a challenging exercise in the presence of linkage errors, which arise inevitably when the linkage is based on quasi-identifiers, as is typically the case. To address the issue, a new methodology is described where the capture-recapture method is enhanced with a new error model that is based on the number of links adjacent to a given record. It is applied in an experiment with public census data.

    Key Words: dual system estimation, data matching, record linkage, quality, data integration, big data.

    Release date: 2021-10-22

  • Surveys and statistical programs – Documentation: 12-539-X
    Description:

    This document brings together guidelines and checklists on many issues that need to be considered in the pursuit of quality objectives in the execution of statistical activities. Its focus is on how to assure quality through effective and appropriate design or redesign of a statistical project or program from inception through to data evaluation, dissemination and documentation. These guidelines draw on the collective knowledge and experience of many Statistics Canada employees. It is expected that Quality Guidelines will be useful to staff engaged in the planning and design of surveys and other statistical projects, as well as to those who evaluate and analyze the outputs of these projects.

    Release date: 2019-12-04
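
Note on the Fellegi-Sunter decision function cited in entry 11-522-X202200100001 above: the ratio of m(y) to u(y) can be illustrated with a minimal sketch. It assumes the comparison fields are conditionally independent given match status, a common simplification that the abstract itself does not state. The m and u values, the thresholds and the function names below are hypothetical, chosen only for illustration.

```python
import math


def match_weight(agreements, m, u):
    """Log2 of m(y)/u(y) for one record pair.

    agreements: one bool per comparison field (True = the field agrees).
    m[k], u[k]: assumed probabilities that field k agrees for a true match
    and for a non-match; each must lie strictly between 0 and 1.
    Fields are treated as conditionally independent (an assumption).
    """
    total = 0.0
    for agree, m_k, u_k in zip(agreements, m, u):
        if agree:
            total += math.log2(m_k / u_k)
        else:
            total += math.log2((1.0 - m_k) / (1.0 - u_k))
    return total


def classify(weight, lower, upper):
    """Three-way decision using two illustrative thresholds."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "clerical review"


# Hypothetical parameters for k = 4 identifying variables.
m = [0.9, 0.9, 0.9, 0.9]
u = [0.1, 0.1, 0.1, 0.1]
pair = [True, True, True, False]
print(classify(match_weight(pair, m, u), lower=-5.0, upper=5.0))
```

Pairs that score between the two thresholds are the undecided cases that the classical approach sends to clerical review.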

Data (2) (2 results)

  • Table: 95F0303X
    Description:

    This product presents selected 2001 and historical data from the Census of Agriculture - Census of Population Linkage database. The data are available at the Canada and province levels for free. The data variables include: age; sex; marital status; mother tongue; highest level of schooling; net farm income; as well as farm population counts and income profiles for census farm families and households.

    (No linkage databases were created for the 1966 and 1976 Census years, so historical comparisons are not possible for those years.)

    Release date: 2003-12-02

  • Table: 16-200-X
    Description:

    Part of Statistics Canada's Econnections: linking the environment and the economy statistical series, this product consists of a printed publication combined with a CD-ROM. The product offers summary indicators plus detailed statistics that quantify the relationship between economic activity and the environment. Information is presented for issues ranging from greenhouse gas emissions, water and energy use, to natural resource wealth, environmental expenditures and beyond. The printed publication provides convenient reference to the summary indicators, including analysis of important trends, while the CD-ROM offers straightforward access to dozens of detailed statistical tables that underlie the indicators. An electronic version of the printed publication is included on the CD-ROM and each indicator in the publication is hypertext linked to a group of related statistical tables, allowing the user to easily select detailed statistics for viewing in association with any given indicator. Simple analysis of the statistics can be done directly within the CD-ROM's software. For those who carry out more complex analysis, downloading of data from the CD-ROM in standard spreadsheet format is easily accomplished.

    Release date: 2001-02-23

Analysis (73) (50 to 60 of 73 results)

  • Articles and reports: 11-522-X20010016263
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    This paper describes the Annual Business Inquiry (ABI) project to integrate the Office for National Statistics' (ONS) main, annual business surveys, regardless of economic sectors. The ABI project also brings together employment and financial data surveys and is capable of generating a wide range of subnational analyses, another objective of the development. Methodological aspects covered by the paper include sample design; estimation and outlier treatment; apportionment of data from reporting units to local units (individual sites) and the methodology for subnational and small area estimation. The subnational methodology involves the use of logistic and loglinear models.

    Release date: 2002-09-12

  • Articles and reports: 11-522-X20010016264
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    Conducting a census by traditional methods is becoming more difficult. The possibility of cross-linking administrative files provides an attractive alternative to conducting periodic censuses (Laihonen, 2000; Borchsenius, 2000). This method was proposed in a recent article by Nathan (2001). The Institut National de la Statistique et des Études Économiques (INSEE) census redesign is based on the idea of a "continuous census," originally suggested by Kish (1981, 1990) and Horvitz (1986). The first approach, which could be feasible in France, can be found in Deville and Jacod's paper (1996). This particular article reviews the methodological developments and approaches used since INSEE started its population census redesign program.

    Release date: 2002-09-12

  • Articles and reports: 11-522-X20010016277
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    The advent of computerized record-linkage methodology has facilitated the conduct of cohort mortality studies in which exposure data in one database are electronically linked with mortality data from another database. In this article, the impact of linkage errors on estimates of epidemiological indicators of risk, such as standardized mortality ratios and relative risk regression model parameters, is explored. It is shown that these indicators can be subject to bias and additional variability in the presence of linkage errors, with false links and non-links leading to positive and negative bias, respectively, in estimates of the standardized mortality ratio. Although linkage errors always increase the uncertainty in the estimates, bias can be effectively eliminated in the special case in which the false positive rate equals the false negative rate within homogeneous states defined by cross-classification of the covariates of interest.

    Release date: 2002-09-12

  • Articles and reports: 11-522-X20010016301
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    The Integrated Metadatabase is a corporate repository of information for each of Statistics Canada's surveys. The information stored in the Integrated Metadatabase includes a description of data sources and methodology, definitions of concepts and variables measured, and indicators of data quality. It provides an effective vehicle for communicating data quality to data users. Its coverage of Statistics Canada's data holdings is exhaustive, the provided information on data quality complies with the Policy on Informing Users of Data Quality and Methodology, and it is presented in a consistent and systematic fashion.

    Release date: 2002-09-12

  • Articles and reports: 12-001-X20010026092
    Description:

    To augment the amount of available information, data from different sources are increasingly being combined. These databases are often combined using record linkage methods. When there is no unique identifier, a probabilistic linkage is used. In that case, a record on a first file is associated with a probability that it is linked to a record on a second file, and then a decision is taken on whether a possible link is a true link or not. This usually requires a non-negligible amount of manual resolution. It might then be legitimate to evaluate whether manual resolution can be reduced or even eliminated. This issue is addressed in this paper, where one tries to produce an estimate of a total (or a mean) of one population when using a sample selected from another population linked somehow to the first population. In other words, having two populations linked through probabilistic record linkage, we try to avoid any decision concerning the validity of links and still be able to produce an unbiased estimate for a total of one of the two populations. To achieve this goal, we suggest the use of the Generalised Weight Share Method (GWSM) described by Lavallée (1995).

    Release date: 2002-02-28

  • Articles and reports: 11-522-X19990015646
    Geography: Canada
    Description:

    The current economic context obliges all partners of health-care systems, whether public or private, to identify those factors that determine the use of health-care services. To increase our understanding of the phenomena that underlie these relationships, Statistics Canada and the Manitoba Centre for Health Policy and Evaluation have established a new database. For a representative sample of the province of Manitoba, cross-sectional micro-data on the level of health of individuals and on their socioeconomic characteristics, and detailed longitudinal data on the use of health-care services have been linked. In this presentation, we will discuss the general context of the linkage of records from various organizations, the protection of privacy and confidentiality. We will also present results of studies which could not have been performed in the absence of the linked database.

    Release date: 2000-03-02

  • Articles and reports: 12-001-X19990024878
    Description:

    In his paper Fritz Scheuren considers the possible uses of administrative records to enhance and improve population censuses. After reviewing previous uses of administrative records in an international context, he puts forward several proposals for research and development towards increased use of administrative records in the American statistical system.

    Release date: 2000-03-01

  • Articles and reports: 12-001-X19990014713
    Description:

    Robust small area estimation is studied under a simple random effects model consisting of a basic (or fixed effects) model and a linking model that treats the fixed effects as realizations of a random variable. Under this model a model-assisted estimator of a small area mean is obtained. This estimator depends on the survey weights and remains design-consistent. A model-based estimator of its mean squared error (MSE) is also obtained. Simulation results suggest that the proposed estimator and Kott's (1989) model-assisted estimator are equally efficient, and that the proposed MSE estimator is often much more stable than Kott's MSE estimator, even under moderate deviations of the linking model. The method is also extended to nested error regression models.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X19990014717
    Description:

    The British Labour Force Survey (LFS) uses a rotating sample design, with each sample household retained for five consecutive quarters. Linking together the information on the same persons across quarters produces a potentially very rich source of longitudinal data. There are, however, serious risks of distortion in the results from such longitudinal linking, mainly arising from sample attrition and from response errors, which can produce spurious flows between economic activity states. This paper describes the initial results of investigations by the Office for National Statistics (ONS) into the nature and extent of the problems.

    Release date: 1999-10-08

  • Journals and periodicals: 88-522-X
    Description:

    The framework described here is intended as a basic operational instrument for systematic development of statistical information respecting the evolution of science and technology and its interactions with the society, the economy and the political system of which it is a part.

    Release date: 1999-02-24

Reference (19) (0 to 10 of 19 results)

  • Surveys and statistical programs – Documentation: 12-539-X
    Description:

    This document brings together guidelines and checklists on many issues that need to be considered in the pursuit of quality objectives in the execution of statistical activities. Its focus is on how to assure quality through effective and appropriate design or redesign of a statistical project or program from inception through to data evaluation, dissemination and documentation. These guidelines draw on the collective knowledge and experience of many Statistics Canada employees. It is expected that Quality Guidelines will be useful to staff engaged in the planning and design of surveys and other statistical projects, as well as to those who evaluate and analyze the outputs of these projects.

    Release date: 2019-12-04

  • Surveys and statistical programs – Documentation: 82-225-X200701010508
    Description:

    The Record Linkage Overview describes the process used in annual internal record linkage of the Canadian Cancer Registry. The steps include: preparation; pre-processing; record linkage; post-processing; analysis and resolution; resolution entry; and resolution processing.

    Release date: 2008-01-18

  • Surveys and statistical programs – Documentation: 82-225-X
    Description:

    The compendium of Canadian Cancer Registry procedures manuals sets out the rules for reporting cancer data to the CCR for all provincial and territorial cancer registries.

    Release date: 2008-01-18

  • Surveys and statistical programs – Documentation: 82-225-X20070109648
    Description:

    The Record Linkage Overview describes the process used in annual internal record linkage of the Canadian Cancer Registry. The steps include: preparation; pre-processing; record linkage; post-processing; analysis and resolution; resolution entry; and resolution processing.

    Release date: 2007-06-21

  • Surveys and statistical programs – Documentation: 82-225-X20070109650
    Description:

    The User Guide to Record Linkage Feedback Reports C1 and C2 is intended for the users of the reports. The reports were developed to facilitate the exchange of information and decisions between the Canadian Cancer Registry and the Provincial and Territorial Cancer Registries.

    Release date: 2007-06-21

  • Surveys and statistical programs – Documentation: 82-225-X20060099202
    Description:

    The User Guide to Record Linkage Feedback Reports C1 and C2 is intended for the users of the reports. The reports were developed to facilitate the exchange of information and decisions between the Canadian Cancer Registry and the Provincial and Territorial Cancer Registries.

    Release date: 2006-07-07

  • Surveys and statistical programs – Documentation: 82-225-X20060099203
    Description:

    The user guide to Death Clearance Feedback Reports is intended for users of the feedback reports. The feedback reports were developed to facilitate the exchange of information and decisions between the Canadian Cancer Registry and the Provincial and Territorial Cancer Registries.

    Release date: 2006-07-07

  • Surveys and statistical programs – Documentation: 82-225-X20060099204
    Description:

    The Record Linkage Overview describes the process used in annual internal record linkage of the Canadian Cancer Registry. The steps include: preparation; pre-processing; record linkage; post-processing; analysis and resolution; resolution entry; and resolution processing.

    Release date: 2006-07-07

  • Surveys and statistical programs – Documentation: 82-225-X20060099206
    Description:

    The Guidelines for Abstracting and Determining Death Certificate Only Cases are intended for use by all provincial and territorial cancer registries during their Death Clearance Process. The guidelines should be used when performing a comparison between the Death Certificate Notification and the cancer registry database.

    Release date: 2006-07-07

  • Surveys and statistical programs – Documentation: 16-505-G
    Description:

    Part of Statistics Canada's Econnections: linking the environment and the economy statistical series, this publication describes in detail the conceptual frameworks, data sources and empirical methods used to compile the Canadian System of Environmental and Resource Accounts (CSERA). Designed to be compatible with the accounting frameworks of the System of National Accounts, the CSERA allows users to easily analyze the linkages between economic activity and the environment in terms of material and energy flows, environmental expenditures and natural resource stocks. This publication will be of interest to researchers in both the economic and environmental fields who want to familiarize themselves with the accounting concepts of the CSERA. It is a companion volume to Environment-economy indicators and detailed statistics (catalogue no. 16-200-XKE), another product in the Econnections series.

    Statistics Canada has updated its 1997 documentation on environmental accounts, Econnections: Concepts, Sources and Methods of the Canadian System of Environmental and Resource Accounts, with publication of the Methodological Guide: Canadian System of Environmental-Economic Accounting.

    Release date: 2006-04-12