Keyword search

Results

All (87) (0 to 10 of 87 results)

  • Articles and reports: 12-001-X202200100007
    Description:

    By record linkage one joins records residing in separate files which are believed to be related to the same entity. In this paper we approach record linkage as a classification problem, and adapt the maximum entropy classification method in machine learning to record linkage, both in the supervised and unsupervised settings of machine learning. The set of links will be chosen according to the associated uncertainty. On the one hand, our framework overcomes some persistent theoretical flaws of the classical approach pioneered by Fellegi and Sunter (1969); on the other hand, the proposed algorithm is fully automatic, unlike the classical approach that generally requires clerical review to resolve the undecided cases.

    Release date: 2022-06-21

  • Articles and reports: 11-522-X202100100006
    Description:

    In the context of its "admin-first" paradigm, Statistics Canada is prioritizing the use of non-survey sources to produce official statistics. This paradigm critically relies on non-survey sources that may have a nearly perfect coverage of some target populations, including administrative files or big data sources. Yet, this coverage must be measured, e.g., by applying the capture-recapture method, where these sources are compared to other sources with good coverage of the same populations, including a census. However, this is a challenging exercise in the presence of linkage errors, which arise inevitably when the linkage is based on quasi-identifiers, as is typically the case. To address the issue, a new methodology is described where the capture-recapture method is enhanced with a new error model that is based on the number of links adjacent to a given record. It is applied in an experiment with public census data.

    Key Words: dual system estimation, data matching, record linkage, quality, data integration, big data.

    Release date: 2021-10-22

  • Surveys and statistical programs – Documentation: 12-539-X
    Description:

    This document brings together guidelines and checklists on many issues that need to be considered in the pursuit of quality objectives in the execution of statistical activities. Its focus is on how to assure quality through effective and appropriate design or redesign of a statistical project or program from inception through to data evaluation, dissemination and documentation. These guidelines draw on the collective knowledge and experience of many Statistics Canada employees. It is expected that Quality Guidelines will be useful to staff engaged in the planning and design of surveys and other statistical projects, as well as to those who evaluate and analyze the outputs of these projects.

    Release date: 2019-12-04

  • Articles and reports: 11F0019M2018411
    Geography: Census metropolitan area
    Description:

    Immigrants tend to reside disproportionately in larger Canadian cities, which may challenge their absorptive capacity. This study uses the linked Longitudinal Immigration Database and T1 Family File to examine the initial location and onward migration decisions of immigrants who are economic principal applicants (EPAs) and who have landed since the Immigration and Refugee Protection Act was passed. The main objective of the study is to identify the factors associated with initially residing and remaining in Canada’s three largest gateway cities: Montréal, Toronto and Vancouver (referred to as MTV).

    Release date: 2018-12-07

  • Articles and reports: 82-003-X201800800001
    Description:

    The objective of this study is to report the population rate of surgical treatment of incident primary female breast tumours diagnosed from 2010 to 2012 overall, and by disease stage in Canada (excluding Quebec). This study uses newly linked Canadian Cancer Registry and hospital discharge data, created in the Canadian Cancer Treatment Linkage Project by Statistics Canada in 2016.

    Release date: 2018-08-15

  • Articles and reports: 11-633-X2018014
    Description:

    The Canadian Mortality Database (CMDB) is an administrative database that collects information on cause of death from all provincial and territorial vital statistics registries in Canada. The CMDB lacks subpopulation identifiers to examine mortality rates and disparities among groups such as First Nations, Métis, Inuit and members of visible minority groups. Linkage between the CMDB and the Census of Population is an approach to circumvent this limitation. This report describes a linkage between the CMDB (2006 to 2011) and the 2006 Census of Population, which was carried out using hierarchical deterministic exact matching, with a focus on methodology and validation.

    Release date: 2018-02-14

  • Articles and reports: 11-633-X2017006
    Description:

    This paper describes a method of imputing missing postal codes in a longitudinal database. The 1991 Canadian Census Health and Environment Cohort (CanCHEC), which contains information on individuals from the 1991 Census long-form questionnaire linked with T1 tax return files for the 1984-to-2011 period, is used to illustrate and validate the method. The cohort contains up to 28 consecutive fields for postal code of residence, but because of frequent gaps in postal code history, missing postal codes must be imputed. To validate the imputation method, two experiments were devised where 5% and 10% of all postal codes from a subset with full history were randomly removed and imputed.

    Release date: 2017-03-13

  • Articles and reports: 18-001-X2016001
    Description:

    Although the record linkage of business data is not a completely new topic, the fact remains that the public and many data users are unaware of the programs and practices commonly used by statistical agencies across the world.

    This report is a brief overview of the main record linkage practices, programs and challenges of statistical agencies across the world that answered a short survey on this subject, supplemented by publicly available documentation produced by these agencies. The document shows that linkage practices are similar across these statistical agencies; the main differences are in the procedures in place to access data and in the regulatory policies that govern record linkage permissions and the dissemination of data.

    Release date: 2016-10-27

  • Articles and reports: 11-633-X2016003
    Description:

    Large national mortality cohorts are used to estimate mortality rates for different socioeconomic and population groups, and to conduct research on environmental health. In 2008, Statistics Canada created a cohort linking the 1991 Census to mortality. The present study describes a linkage of the 2001 Census long-form questionnaire respondents aged 19 years and older to the T1 Personal Master File and the Amalgamated Mortality Database. The linkage tracks all deaths over a 10.6-year period (until the end of 2011, to date).

    Release date: 2016-10-26

  • Articles and reports: 89-648-X2016001
    Description:

    Linkages between survey and administrative data are an increasingly common practice, due in part to the reduced burden to respondents, and to the data that can be obtained at a relatively low cost. Historical linkage, or the linkage of administrative data from previous years to the year of the survey, compounds these benefits by providing additional years of data. This paper examines the Longitudinal and International Study of Adults (LISA), which was linked to historical tax data on personal income tax returns (T1) and those collected from employers’ files (T4), among others not mentioned in this paper. It presents trends in historical linkage rates, compares the coherence of administrative data between the T1 and T4, presents the ability to use the data to create balanced panels, and uses the T1 data to produce age-earnings profiles by sex. The results show that the historical linkage rate is high (over 90% in most cases) and stable over time for respondents who are likely to file a tax return, and that the T1 and T4 administrative sources show similar earnings. Moreover, long balanced panels of up to 30 years in length (at the time of writing) can be created using LISA administrative linkage data.

    Release date: 2016-08-18

Data (2) (2 results)

  • Table: 95F0303X
    Description:

    This product presents selected 2001 and historical data from the Census of Agriculture - Census of Population Linkage database. The data are available at the Canada and province levels for free. The data variables include: age; sex; marital status; mother tongue; highest level of schooling; net farm income; as well as farm population counts and income profiles for census farm families and households.

    (No linkage databases were created for the 1966 and 1976 Census years, so historical comparisons are not possible for those years.)

    Release date: 2003-12-02

  • Table: 16-200-X
    Description:

    Part of Statistics Canada's Econnections: linking the environment and the economy statistical series, this product consists of a printed publication combined with a CD-ROM. The product offers summary indicators plus detailed statistics that quantify the relationship between economic activity and the environment. Information is presented for issues ranging from greenhouse gas emissions, water and energy use, to natural resource wealth, environmental expenditures and beyond. The printed publication provides convenient reference to the summary indicators, including analysis of important trends, while the CD-ROM offers straightforward access to dozens of detailed statistical tables that underlie the indicators. An electronic version of the printed publication is included on the CD-ROM and each indicator in the publication is hypertext linked to a group of related statistical tables, allowing the user to easily select detailed statistics for viewing in association with any given indicator. Simple analysis of the statistics can be done directly within the CD-ROM's software. For those who carry out more complex analysis, downloading of data from the CD-ROM in standard spreadsheet format is easily accomplished.

    Release date: 2001-02-23

Analysis (66) (0 to 10 of 66 results)

  • Articles and reports: 12-001-X202200100007
    Description:

    By record linkage one joins records residing in separate files which are believed to be related to the same entity. In this paper we approach record linkage as a classification problem, and adapt the maximum entropy classification method in machine learning to record linkage, both in the supervised and unsupervised settings of machine learning. The set of links will be chosen according to the associated uncertainty. On the one hand, our framework overcomes some persistent theoretical flaws of the classical approach pioneered by Fellegi and Sunter (1969); on the other hand, the proposed algorithm is fully automatic, unlike the classical approach that generally requires clerical review to resolve the undecided cases.

    Release date: 2022-06-21

  • Articles and reports: 11-522-X202100100006
    Description:

    In the context of its "admin-first" paradigm, Statistics Canada is prioritizing the use of non-survey sources to produce official statistics. This paradigm critically relies on non-survey sources that may have a nearly perfect coverage of some target populations, including administrative files or big data sources. Yet, this coverage must be measured, e.g., by applying the capture-recapture method, where these sources are compared to other sources with good coverage of the same populations, including a census. However, this is a challenging exercise in the presence of linkage errors, which arise inevitably when the linkage is based on quasi-identifiers, as is typically the case. To address the issue, a new methodology is described where the capture-recapture method is enhanced with a new error model that is based on the number of links adjacent to a given record. It is applied in an experiment with public census data.

    Key Words: dual system estimation, data matching, record linkage, quality, data integration, big data.

    Release date: 2021-10-22

  • Articles and reports: 11F0019M2018411
    Geography: Census metropolitan area
    Description:

    Immigrants tend to reside disproportionately in larger Canadian cities, which may challenge their absorptive capacity. This study uses the linked Longitudinal Immigration Database and T1 Family File to examine the initial location and onward migration decisions of immigrants who are economic principal applicants (EPAs) and who have landed since the Immigration and Refugee Protection Act was passed. The main objective of the study is to identify the factors associated with initially residing and remaining in Canada’s three largest gateway cities: Montréal, Toronto and Vancouver (referred to as MTV).

    Release date: 2018-12-07

  • Articles and reports: 82-003-X201800800001
    Description:

    The objective of this study is to report the population rate of surgical treatment of incident primary female breast tumours diagnosed from 2010 to 2012 overall, and by disease stage in Canada (excluding Quebec). This study uses newly linked Canadian Cancer Registry and hospital discharge data, created in the Canadian Cancer Treatment Linkage Project by Statistics Canada in 2016.

    Release date: 2018-08-15

  • Articles and reports: 11-633-X2018014
    Description:

    The Canadian Mortality Database (CMDB) is an administrative database that collects information on cause of death from all provincial and territorial vital statistics registries in Canada. The CMDB lacks subpopulation identifiers to examine mortality rates and disparities among groups such as First Nations, Métis, Inuit and members of visible minority groups. Linkage between the CMDB and the Census of Population is an approach to circumvent this limitation. This report describes a linkage between the CMDB (2006 to 2011) and the 2006 Census of Population, which was carried out using hierarchical deterministic exact matching, with a focus on methodology and validation.

    Release date: 2018-02-14

  • Articles and reports: 11-633-X2017006
    Description:

    This paper describes a method of imputing missing postal codes in a longitudinal database. The 1991 Canadian Census Health and Environment Cohort (CanCHEC), which contains information on individuals from the 1991 Census long-form questionnaire linked with T1 tax return files for the 1984-to-2011 period, is used to illustrate and validate the method. The cohort contains up to 28 consecutive fields for postal code of residence, but because of frequent gaps in postal code history, missing postal codes must be imputed. To validate the imputation method, two experiments were devised where 5% and 10% of all postal codes from a subset with full history were randomly removed and imputed.

    Release date: 2017-03-13

  • Articles and reports: 18-001-X2016001
    Description:

    Although the record linkage of business data is not a completely new topic, the fact remains that the public and many data users are unaware of the programs and practices commonly used by statistical agencies across the world.

    This report is a brief overview of the main record linkage practices, programs and challenges of statistical agencies across the world that answered a short survey on this subject, supplemented by publicly available documentation produced by these agencies. The document shows that linkage practices are similar across these statistical agencies; the main differences are in the procedures in place to access data and in the regulatory policies that govern record linkage permissions and the dissemination of data.

    Release date: 2016-10-27

  • Articles and reports: 11-633-X2016003
    Description:

    Large national mortality cohorts are used to estimate mortality rates for different socioeconomic and population groups, and to conduct research on environmental health. In 2008, Statistics Canada created a cohort linking the 1991 Census to mortality. The present study describes a linkage of the 2001 Census long-form questionnaire respondents aged 19 years and older to the T1 Personal Master File and the Amalgamated Mortality Database. The linkage tracks all deaths over a 10.6-year period (until the end of 2011, to date).

    Release date: 2016-10-26

  • Articles and reports: 89-648-X2016001
    Description:

    Linkages between survey and administrative data are an increasingly common practice, due in part to the reduced burden to respondents, and to the data that can be obtained at a relatively low cost. Historical linkage, or the linkage of administrative data from previous years to the year of the survey, compounds these benefits by providing additional years of data. This paper examines the Longitudinal and International Study of Adults (LISA), which was linked to historical tax data on personal income tax returns (T1) and those collected from employers’ files (T4), among others not mentioned in this paper. It presents trends in historical linkage rates, compares the coherence of administrative data between the T1 and T4, presents the ability to use the data to create balanced panels, and uses the T1 data to produce age-earnings profiles by sex. The results show that the historical linkage rate is high (over 90% in most cases) and stable over time for respondents who are likely to file a tax return, and that the T1 and T4 administrative sources show similar earnings. Moreover, long balanced panels of up to 30 years in length (at the time of writing) can be created using LISA administrative linkage data.

    Release date: 2016-08-18

  • Articles and reports: 82-003-X201600814647
    Description:

    This study is based on 2006 Census (long-form) socio-demographic information (including Aboriginal identity) that was linked to the Discharge Abstract Database to create a sample for analysis from all provinces and territories except Quebec. The purpose is to provide national figures on acute care hospitalizations of Aboriginal (First Nations living on and off reserve, Métis, Inuit in Inuit Nunangat) and non-Aboriginal people.

    Release date: 2016-08-17

Reference (19) (0 to 10 of 19 results)

  • Surveys and statistical programs – Documentation: 12-539-X
    Description:

    This document brings together guidelines and checklists on many issues that need to be considered in the pursuit of quality objectives in the execution of statistical activities. Its focus is on how to assure quality through effective and appropriate design or redesign of a statistical project or program from inception through to data evaluation, dissemination and documentation. These guidelines draw on the collective knowledge and experience of many Statistics Canada employees. It is expected that Quality Guidelines will be useful to staff engaged in the planning and design of surveys and other statistical projects, as well as to those who evaluate and analyze the outputs of these projects.

    Release date: 2019-12-04

  • Surveys and statistical programs – Documentation: 82-225-X200701010508
    Description:

    The Record Linkage Overview describes the process used in annual internal record linkage of the Canadian Cancer Registry. The steps include: preparation; pre-processing; record linkage; post-processing; analysis and resolution; resolution entry; and, resolution processing.

    Release date: 2008-01-18

  • Surveys and statistical programs – Documentation: 82-225-X
    Description:

    The compendium of Canadian Cancer Registry procedures manuals sets out the rules for reporting cancer data to the CCR for all provincial and territorial cancer registries.

    Release date: 2008-01-18

  • Surveys and statistical programs – Documentation: 82-225-X20070109648
    Description:

    The Record Linkage Overview describes the process used in annual internal record linkage of the Canadian Cancer Registry. The steps include: preparation; pre-processing; record linkage; post-processing; analysis and resolution; resolution entry; and, resolution processing.

    Release date: 2007-06-21

  • Surveys and statistical programs – Documentation: 82-225-X20070109650
    Description:

    The User Guide to Record Linkage Feedback Reports C1 and C2 is intended for the users of the reports. The reports were developed to facilitate the exchange of information and decisions between the Canadian Cancer Registry and the Provincial and Territorial Cancer Registries.

    Release date: 2007-06-21

  • Surveys and statistical programs – Documentation: 82-225-X20060099202
    Description:

    The User Guide to Record Linkage Feedback Reports C1 and C2 is intended for the users of the reports. The reports were developed to facilitate the exchange of information and decisions between the Canadian Cancer Registry and the Provincial and Territorial Cancer Registries.

    Release date: 2006-07-07

  • Surveys and statistical programs – Documentation: 82-225-X20060099203
    Description:

    The user guide to Death Clearance Feedback Reports is intended for users of the feedback reports. The feedback reports were developed to facilitate the exchange of information and decisions between the Canadian Cancer Registry and the Provincial and Territorial Cancer Registries.

    Release date: 2006-07-07

  • Surveys and statistical programs – Documentation: 82-225-X20060099204
    Description:

    The Record Linkage Overview describes the process used in annual internal record linkage of the Canadian Cancer Registry. The steps include: preparation; pre-processing; record linkage; post-processing; analysis and resolution; resolution entry; and, resolution processing.

    Release date: 2006-07-07

  • Surveys and statistical programs – Documentation: 82-225-X20060099206
    Description:

    The Guidelines for Abstracting and Determining Death Certificate Only Cases are intended for use by all provincial and territorial cancer registries during their Death Clearance Process. The guidelines should be used when performing a comparison between the Death Certificate Notification and the cancer registry database.

    Release date: 2006-07-07

  • Surveys and statistical programs – Documentation: 16-505-G
    Description:

    Part of Statistics Canada's Econnections: linking the environment and the economy statistical series, this publication describes in detail the conceptual frameworks, data sources and empirical methods used to compile the Canadian System of Environmental and Resource Accounts (CSERA). Designed to be compatible with the accounting frameworks of the System of National Accounts, the CSERA allows users to easily analyze the linkages between economic activity and the environment in terms of material and energy flows, environmental expenditures and natural resource stocks. This publication will be of interest to researchers in both the economic and environmental fields who want to familiarize themselves with the accounting concepts of the CSERA. It is a companion volume to Environment-economy indicators and detailed statistics (catalogue no. 16-200-XKE), another product in the Econnections series.

    Statistics Canada has updated its 1997 documentation on environmental accounts, Econnections: Concepts, Sources and Methods of the Canadian System of Environmental and Resource Accounts, with publication of the Methodological Guide: Canadian System of Environmental-Economic Accounting.

    Release date: 2006-04-12