Longitudinal Immigration Database (IMDB) Technical Report, 2019
2 Data sources

Several files are included in the IMDB, among them the PNRF and the NRF. These files, which are described in this section, consist of immigration data, immigrant tax files, and auxiliary files. They cover information available for immigrants admitted since 1952 and for non-permanent residents since 1980.

2.1 Immigration data

Every year, Statistics Canada (StatCan) receives admission data on new recipients of permanent residency permits and non-permanent residency permits from IRCC.

2.1.1 Integrated Permanent and Non-permanent Resident File (PNRF 1980-2019)

Every year, admission data is added to create the Immigrant Landing File (ILF). This file contains information such as date of admission, date of birth, and immigration category. The ILF could be seen as a census of the people who have immigrated to Canada as permanent residents since 1980; it holds information on their characteristics at admission. This file, however, is not directly available to IMDB users. Instead, admission data for immigrant taxfilers is available in the Integrated Permanent and Non-permanent Resident File (PNRF). The PNRF also contains information on non-taxfilers for identification purposes; for these records, however, the first tax year and last tax year information is missing.

Because it is an administrative record of permanent residency, the ILF overestimates the number of immigrants currently living in Canada. This overestimation occurs for two reasons. First, the ILF does not identify the individuals who have left the country. Immigrants who landed in Canada may have left Canada since admission. Second, the death of immigrants who landed in 1980 and thereafter is only partially reported. Further information on mortality data can be found in Section 7.2.

Researchers can access the Integrated Permanent and Non-permanent Resident File (PNRF), which combines information from the Immigrant Landing File (ILF) and the NRF at the person level. The PNRF provides users with the ability to follow the migration history of immigrants, including their pre-admission experience in Canada. The PNRF covers all the admission data (except emigration and mortality) as well as detailed information on the sociodemographic characteristics of immigrants who landed in Canada in 1980 or thereafter. This makes it possible, for example, to determine whether a person was a non-permanent resident prior to admission. The file contains the number of permits for each non-permanent resident who became a permanent resident, and includes admission dates. Note, however, that it does not include the records of non-permanent residents (temporary residents) who have not become permanent residents. The PNRF also includes a date of death when a link to a death record has been made (see Section 7.2.2). For more details on the content of this file, please refer to the immigration component of the IMDB dictionary, in sections 3.3 and 3.4 of this report.

In addition, a file named PNRF_EXTRA_1980_2013 is available to data users; it includes variables that have been retired, have little analytical value, or for which no metadata are available. The complete list of variables can be found in the IMDB immigration data dictionary.

In the past, the PNRF separated taxfilers from non-taxfilers (e.g. PNRF_2016 and PNRF_NONFILERS_2016). As of the 2018 IMDB release, the taxfilers have been merged with the non-taxfilers, and the merged file is called PNRF_1980_2019.

2.1.2 Admissions Prior to 1980: Integrated Permanent and Non-permanent Resident File (PNRF) 1952-1979

Prior to the 2018 IMDB release, the data on immigrants in the IMDB were limited to admissions from 1980 onwards. As of the 2018 IMDB release, the PNRF includes data from 1952 onwards, expanding the IMDB universe. The new file (PNRF_1952_1979) contains immigrant admissions from 1952 to 1979. However, because the data are older, PNRF_1952_1979 has fewer variables than the file for people admitted from 1980 onwards (see section 2.1.1). Its main variables include gender, country, birth year, and landing year and month.

2.1.3 Non-permanent Resident File (NRF)

The Non-permanent Resident File (NRF_Permit) is created from data on individuals who have been granted non-permanent resident permits since 1980. This file includes, for example, the type of permit (such as work or study) and the last valid date of a permit. The file is updated each year with new non-permanent resident permit data. For the 2018 IMDB release, the Permit NRF is called NRF_PERMIT_1980_2019.

A given person can have multiple permits over time. These permits include Work Permits, Study Permits, Refugee Claims, and Other Permits issued, as well as the date when they were issued and the date that they expire. The NRF Person, called NRF_PERSON_1980_2019, stores information at the person level such as the number of permits and the first year of temporary residence permit.
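
The relationship between the permit-level file and the person-level file can be illustrated with a minimal sketch. It is not the actual derivation used for NRF_PERSON_1980_2019: the variable names PERMIT_TYPE, ISSUE_YEAR, N_PERMITS and FIRST_PERMIT_YEAR and the sample records are hypothetical, and only IMDB_ID is taken from this report.

```python
from collections import defaultdict

# Hypothetical permit-level records (one row per permit), loosely modelled on
# a permit file. PERMIT_TYPE and ISSUE_YEAR are illustrative names only.
permits = [
    {"IMDB_ID": 101, "PERMIT_TYPE": "study", "ISSUE_YEAR": 2009},
    {"IMDB_ID": 101, "PERMIT_TYPE": "work", "ISSUE_YEAR": 2012},
    {"IMDB_ID": 102, "PERMIT_TYPE": "work", "ISSUE_YEAR": 2015},
]


def summarize_by_person(permit_rows):
    """Collapse permit-level rows to one summary record per person."""
    grouped = defaultdict(list)
    for row in permit_rows:
        grouped[row["IMDB_ID"]].append(row)
    return {
        imdb_id: {
            "N_PERMITS": len(rows),
            "FIRST_PERMIT_YEAR": min(r["ISSUE_YEAR"] for r in rows),
        }
        for imdb_id, rows in grouped.items()
    }


person_level = summarize_by_person(permits)
```

In this sketch, the person with two permits is reduced to a single record holding the permit count and the earliest issue year, which mirrors the kind of information described for the person-level file.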

The data can be linked to the PNRF by means of the IMDB unique person identifier (IMDB_ID). For variables common between the PNRF and NRF, in cases of discrepancies refer to the PNRF values. For more details on the variables included on these files, please refer to the immigration component of the IMDB dictionary.
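
The linkage rule described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used to build the IMDB: the records and the variable names BIRTH_YEAR, LANDING_YEAR and N_PERMITS are hypothetical, and only IMDB_ID and the "PNRF values prevail" rule come from this report.

```python
# Hypothetical person-level records keyed by IMDB_ID.
# BIRTH_YEAR, LANDING_YEAR and N_PERMITS are illustrative names only.
pnrf = {
    101: {"BIRTH_YEAR": 1985, "LANDING_YEAR": 2014},
    103: {"BIRTH_YEAR": 1990, "LANDING_YEAR": 2017},
}
nrf_person = {
    101: {"BIRTH_YEAR": 1986, "N_PERMITS": 2},  # BIRTH_YEAR disagrees with the PNRF
    102: {"BIRTH_YEAR": 1979, "N_PERMITS": 1},  # no PNRF record for this person
}


def link_on_imdb_id(pnrf_rows, nrf_rows):
    """Attach NRF variables to PNRF records; PNRF values win on conflicts."""
    linked = {}
    for imdb_id, pnrf_row in pnrf_rows.items():
        nrf_row = nrf_rows.get(imdb_id, {})
        # Later keys overwrite earlier ones, so unpacking the PNRF row last
        # makes its values take precedence when the two files disagree.
        linked[imdb_id] = {**nrf_row, **pnrf_row}
    return linked


linked = link_on_imdb_id(pnrf, nrf_person)
```

Person 101 keeps the PNRF birth year (1985) while gaining the NRF permit count. Person 102 is absent from the result because the sketch starts from PNRF records only; an analysis of non-permanent residents who never became permanent residents would need to start from the NRF instead.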

2.1.4 Express Entry (EE)

The Longitudinal Immigration Database (IMDB) includes data on immigrants admitted through the Express Entry (EE) application management system. Express Entry (an extension of the PNRF) is an application process for economic immigrants who want to settle in Canada permanently and take part in the Canadian economy. This selection process was launched on January 1, 2015, and the first draw (to select qualified permanent residents) was on January 31, 2015.

The IMDB contains data on 200,300 individuals (principal applicants and their family members) admitted through EE. These individuals can be identified using the variable EXPRESS_ENTRY_IND from the IMDB’s Integrated Permanent and Non-permanent Resident Files called PNRF_1980_2019. It is to be noted that data for the 2019 cohort will be added at a later time.
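
Selecting these records can be sketched as a simple filter. EXPRESS_ENTRY_IND is the variable named above, but its coding is not given in this section: the sketch assumes that a value of 1 flags an EE admission, and the records and the LANDING_YEAR name are hypothetical. The IMDB dictionary should be consulted for the actual coding.

```python
# Hypothetical PNRF-style records. The coding 1 = "admitted through EE" is an
# assumption made for this sketch; LANDING_YEAR is an illustrative name.
records = [
    {"IMDB_ID": 201, "LANDING_YEAR": 2016, "EXPRESS_ENTRY_IND": 1},
    {"IMDB_ID": 202, "LANDING_YEAR": 2016, "EXPRESS_ENTRY_IND": 0},
    {"IMDB_ID": 203, "LANDING_YEAR": 2018, "EXPRESS_ENTRY_IND": 1},
]


def express_entry_ids(rows, flag_value=1):
    """Return the IMDB_IDs of records flagged as Express Entry admissions."""
    return [r["IMDB_ID"] for r in rows if r["EXPRESS_ENTRY_IND"] == flag_value]


ee_ids = express_entry_ids(records)
```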

Detailed data on principal applicants admitted through EE are available. For example, transferability of skills and highest level of education are available.

To obtain more information, there is a detailed technical report on Express Entry.


2.1.5 Quebec Admissions File

Provided by the province of Québec, the file contains detailed information regarding the province’s own admission selection program for admission years 2012 to 2019. It includes variables such as the Québec selection category, Québec family code, Québec field of study, and others detailed in the IMDB Landing Dictionary. The file contains 403,220 records. The province of Québec, under a special agreement with the Government of Canada, has full responsibility for its immigration levels, programs, and policies.

To obtain more information, there is a detailed report on the Québec admissions file.


2.2 Additional IMDB Modules

The IMDB includes additional modules on Children, Wages, and Settlement. These modules may be released in stages, and may not have been updated at the time of the release. Previously disseminated modules can still be utilized, as IMDB_IDs are stable across iterations. The sections detailed below will be updated as the modules are released.

2.2.1 Children Data Module

This is a brief introduction to the Longitudinal Immigration Database (IMDB) Children Data module. The module includes a file named PNRF_CHILD_1980_2019, which holds children's immigration records, and T1 Family Files (T1FF) since 1982, named IMDB_CHILD_T1FF, which cover immigrant children during their childhood. In these tax files, the parents of children are identified with IMDB_ID_PARENT, which is equal to the parent’s IMDB_ID if the parent is present on the immigration files (for example, if the parent held one or more permanent or non-permanent resident permits).

Since 1980, over 2 million immigrants who were admitted to Canada were aged less than 18 years old at their time of admission. This represents 24.8% of immigrants admitted during that timeframe. These children will most likely receive all or part of their education in Canada and will have different challenges than adult immigrants. Little information is available about immigrant children during their childhood in the Longitudinal Immigration Database (IMDB), as they are likely not tax-filers.

In order to increase the analytical capability of the IMDB, a children module was produced. The ability to study the impact of the childhood socioeconomic condition on adulthood economic outcome is an added value to the IMDB.

Two methods were used to add tax information for immigrant children. The first used the immigration application number to identify a parent. The second used the Statistics Canada Dependant Register, which is the result of record linkages, to identify children’s guardians.

To determine the parent-child connection, information from a DIN-SIN (Dependent Identifier Number - Social Insurance Number) connection was prioritized. When this information was not available, the immigration application number was used to identify children’s parents. Once a parent-child connection is made, tax files covering the years of childhood are created for immigrant children admitted since 1980.
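
The priority order described above can be sketched as a two-step lookup. The data structures, identifiers and the function name find_parent are hypothetical; they do not reflect the actual layout of the Dependant Register or of the immigration files, and the real linkage is more involved than this illustration.

```python
from typing import Optional

# Hypothetical lookups used only for illustration.
din_sin_parent = {"child_A": "parent_1"}  # child -> parent via a DIN-SIN link
application_of = {"child_A": "APP9", "child_B": "APP7", "parent_2": "APP7"}
adults = {"parent_1", "parent_2"}


def find_parent(child_id: str) -> Optional[str]:
    """Return a parent identifier, preferring the DIN-SIN connection."""
    # Step 1: a DIN-SIN connection is used whenever one exists.
    if child_id in din_sin_parent:
        return din_sin_parent[child_id]
    # Step 2: otherwise fall back to an adult on the same immigration application.
    application = application_of.get(child_id)
    if application is None:
        return None
    for person, person_application in application_of.items():
        if person_application == application and person in adults:
            return person
    return None
```

Here find_parent("child_A") resolves through the DIN-SIN link, find_parent("child_B") falls back to the shared application number, and a child found in neither source returns None.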

Tax files related to children’s parents include a subset of the variables included in the IMDB_T1FFs. Only the main income variables (such as employment income) and the tax benefits and deductions provided to families and parents (such as child tax benefits, and education amounts and tuition fees transferred from a child) were kept. The PNRF_CHILD_1980_2019 file includes a subset of the variables available in the PNRF, along with information about the children’s parent(s), such as IMDB_ID and the first and last year of filing during the children's childhood.

To obtain more information, there is a detailed technical report on IMDB Children.

2.2.2 Wages and Salaries Data Module

This is a summary of the linkage between the Longitudinal Immigration Database (IMDB) and the Statement of Remuneration (T4) Supplemental file. The Preliminary Wages and Salaries tax files are derived from the T4 Supplemental tax files, which contain tax employment information as provided by the individual’s employer. T4 Supplemental files are used to report salary, wages, and taxable benefits paid to employees for services rendered during the year, as well as pension adjustment amounts for employees who accrued a benefit for the year under a registered pension plan or a deferred profit sharing plan. Variables extracted from these files include province of employment, province of employee, T4 earnings per tax year, and number of T4 slips per tax year. The preliminary wages and salaries tax data are available from 1997 to 2019.

There are three main reasons for integrating the T4 tax files to IMDB:

  1. To better understand the actual coverage of the IMDB with regards to temporary residents working in Canada by using temporary SINs as a basis for analysis.
  2. To have a more comprehensive coverage and understanding of temporary residents working within Canada by using T4 slips rather than relying on T1 Filers, in particular those temporary residents who do not transition toward permanent residency.
  3. To understand the feasibility of disseminating IMDB findings earlier using the T4 tax file, as the T4 files are available approximately six months before the T1FF files.

Integrating the T4 does provide some benefit to the IMDB, particularly additional coverage of temporary foreign workers. The values provided by the linkage to the T4 were validated against T1FF values, matching 93.0% of the time, while seeing an average overall difference in T4 earnings of 1.8%.
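
A validation of this kind can be sketched as follows. This section does not state how a "match" or the "average overall difference" was defined, so both definitions below are assumptions made for illustration: a pair matches when the relative gap between the two values is within a tolerance, and the average difference is the mean relative gap. The sample values are hypothetical.

```python
# Hypothetical (T4 earnings, T1FF employment income) pairs for linked person-years.
pairs = [
    (50000.0, 50000.0),
    (42000.0, 42000.0),
    (31000.0, 30000.0),
    (20000.0, 25000.0),
]


def validate(value_pairs, tolerance=0.0):
    """Return (share of pairs that match, mean relative difference).

    A pair "matches" when its relative gap is within `tolerance`; this
    definition is an assumption, not the one used in the report.
    """
    rel_diffs = [abs(t4 - t1) / t1 for t4, t1 in value_pairs if t1 > 0]
    match_rate = sum(d <= tolerance for d in rel_diffs) / len(rel_diffs)
    mean_diff = sum(rel_diffs) / len(rel_diffs)
    return match_rate, mean_diff


match_rate, mean_diff = validate(pairs)
```

With the sample pairs, two of the four values agree exactly, so the match rate is 0.5. Raising the tolerance shows how sensitive a reported match rate is to the definition chosen.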

To obtain more information, there is a detailed technical report on Wages and Salaries.

2.2.3 Settlement Services Data Module

The IMDB’s Integrated Permanent and Non-permanent Resident File (PNRF) and other IMDB files can be integrated with the settlement services module. The non-confidential person identification number (IMDB_ID) is included in all the files; it should be used to integrate immigrants and non-permanent residents to their records in other IMDB files. The files DOM_CLIENT_SETTLEMENT (recipients of domestic services) and FRN_CLIENT_SETTLEMENT (recipients of foreign services) have information on the type and number of services received at the person level. A series of files, organized by type of service, gives users more details on the services received. In total, this module includes 15 files. A dictionary is available to users for more details (in English only, information in French available on request).

Several files related to settlement services provided to permanent residents and non-permanent residents selected to become permanent residents are available at Immigration, Refugees and Citizenship Canada (IRCC). The Immigration Contribution Agreement Reporting Environment (ICARE) is where all settlement data are collected and stored. ICARE is a reporting system used by organizations providing resettlement services to immigrants to report their activities. Each year, Statistics Canada receives files generated by ICARE in order to produce the IMDB settlement services module. The data received cover services provided from 2013 onward. According to these data, 1,213,850 people have been provided services in Canada since 2013 and 74,320 people have been provided pre-arrival services since 2015.

Settlement services are received in Canada or pre-arrival. A variety of services are offered to new immigrants and non-permanent residents: some are related to employment or assessments of needs, and others to information and orientation. Support services, such as transportation and childminding, are also provided.

Several components of the ICARE data add analytical power to the IMDB, which combines immigrant admissions and non-permanent resident permits with tax files. The data currently available do not cover settlement services received prior to 2013 (2015 for foreign services).

The coverage for immigrants admitted prior to 2013 is partial. In cases of multiple admissions, which are rare, the settlement services relate to the most recent admission, whereas the admission characteristics kept in the IMDB relate to the first admission. Settlement services are not limited to recent immigrants. For example, 280 immigrants first admitted in 1980 received settlement services between 2013 and August 2019, as did 950 immigrants admitted in 1990 and 4,000 immigrants admitted in 2000. Settlement data not connected to a recent admission were not removed from the IMDB.

Note that the module includes data on 64,490 non-permanent residents who received settlement services. These could be people who were admitted in 2019 (the IMDB includes admissions up to 2019) or who are in the process of becoming permanent residents. Data from organizations located in Quebec are not collected, so only services provided to immigrants outside Quebec are available.

The settlement data module comprises several files, as different types of services are stored in separate files. A distinction is also made between services received pre-arrival (foreign) and post-arrival (domestic), and data about these services are available in different files. Only 23.9% of foreign services recipients could be integrated to an IMDB record, mainly because these people had not arrived in Canada prior to 2019 or are still not in Canada. Coverage of foreign services starts in 2015, whereas coverage of domestic services begins in 2013. In order to be included in the IMDB, some of these data were synthesized at the person level. Numerous variables were derived, such as the number of services received by topic. Similar to the IMDB_ID, a non-confidential service number (SERVIC_NUM) was created.
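
The person-level synthesis and the integration rate mentioned above can be illustrated with a minimal sketch. The records and the field names CLIENT and TOPIC are hypothetical, and an IMDB_ID of None is used here only as a stand-in for a recipient who could not be integrated to an IMDB record; the actual derivation rules are documented in the settlement services dictionary.

```python
from collections import Counter

# Hypothetical service-level records (one row per service received).
services = [
    {"CLIENT": "c1", "IMDB_ID": 101, "TOPIC": "employment"},
    {"CLIENT": "c1", "IMDB_ID": 101, "TOPIC": "employment"},
    {"CLIENT": "c1", "IMDB_ID": 101, "TOPIC": "orientation"},
    {"CLIENT": "c2", "IMDB_ID": None, "TOPIC": "orientation"},
]


def services_by_topic(rows):
    """Count services per recipient and topic (person-level derived variables)."""
    counts = {}
    for row in rows:
        counts.setdefault(row["CLIENT"], Counter())[row["TOPIC"]] += 1
    return counts


def integration_rate(rows):
    """Share of distinct recipients that carry an IMDB_ID."""
    is_linked = {r["CLIENT"]: r["IMDB_ID"] is not None for r in rows}
    return sum(is_linked.values()) / len(is_linked)


topic_counts = services_by_topic(services)
rate = integration_rate(services)
```

Note that the rate is computed over distinct recipients rather than over service rows, so a recipient with many services counts once.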

To obtain more information, there is a detailed technical report on Settlement Services.

2.3 T1 Family File (T1FF)

Every year, Statistics Canada uses the annual individual T1 file, the T4 tax file, and the Canada Child Tax Benefit file from the CRA to create an analytical T1 Family File (T1FF). T1FF data is available from 1982 to 2018.

The tax files used to create the IMDB_T1FF files are those contained in the T1 Family File (T1FF). Statistics Canada takes the annual individual T1 file, T4 tax file and Canada Child Tax Benefit (CCTB) file from the CRA and creates the T1 Family File for that year. Processing consists of many steps, ranging from geographical coding to the formation of families (for example, when the taxfiler mentions a spouse and this spouse is also a taxfiler, the spouse is integrated via a common identifier to the original taxfiler). T1FF data go back to the 1982 tax year. With the experience gained from many years of T1FF processing, editing rules have been created to reduce the number of inconsistencies in the database and ensure that data quality continues to improve.

The availability of the tax variables depends on the information collected in a given year. The T1FF produced annually for the IMDB includes individual and family incomes as well as family composition variables, such as the number of children and the spouse identification number. The IMDB contains IMDB_T1FFs for 1982 and subsequent years for immigrant taxfilers. The creation process of these files is described in Section 5.1. For more details on variables available on the IMDB_T1FFs, refer to the tax component of the IMDB dictionary.

2.4 Auxiliary Files

To create the IMDB, it is necessary to use auxiliary files that facilitate record linkage and add variables to the database. These auxiliary files are not available to IMDB users.

The Social Data Linkage Environment (SDLE) was used to facilitate the record linkages.

The SDLE at Statistics Canada promotes the innovative use of existing administrative and survey data to address important research questions and inform socio-economic policy through record linkage. The SDLE expands the potential of data integration across multiple domains, such as health, justice, education and income, through the creation of integrated analytical data files without the need to collect additional data from Canadians.

