Analytical Studies: Methods and References
Estimating Immigrants’ Presence in Canada within the Context of Increasingly Fluid International Migration Patterns

by Hanqing Qiu, Feng Hou and Eden Crossman
Social Analysis and Modelling Division No. 032
Release date: March 16, 2021

Acknowledgements

This study was conducted in collaboration with Immigration, Refugees and Citizenship Canada. The authors would like to thank Julien Bérard-Chagnon, Stacy Hallman, Rebeka Lee, and René Morissette for their advice and comments on an earlier version of this paper.

Abstract

International migration has become increasingly fluid and is viewed decreasingly as a one-time, permanent movement from a source country to a destination country. Immigrant-receiving countries often point to long-term economic- and population-related goals as motivations for permanent immigration programs, making immigrants’ presence and absence patterns of increasing policy interest. This article explores two methodological issues related to measuring immigrants’ potential presence in Canada. The first is the use of auxiliary administrative data sources as a means to supplement the T1 Income Tax Return file, which—to date—has been one of the key data sources used to estimate emigration among Canadian immigrants. The second is the evaluation of the sensitivity of emigration estimates to the definition of immigrant disappearance and reappearance in administrative data. The results show that, for a given year, the use of additional data sources (specifically, the 13 auxiliary data sources available in the Fiscal Activity Indicator File [FAIF] collected by the Canada Revenue Agency [CRA]) captures 4 to 5 percentage points more immigrants than using the T1 Income Tax Return file alone. The estimated emigration rates of immigrants vary considerably with the definition of disappearance and the data sources used. For example, the estimated emigration rate by the tenth year after landing ranges from 15% to 20% among immigrants who arrived between 2000 and 2004 and were aged 25 to 64 at landing. Overall, the results of this study show that using auxiliary tax data available in the CRA’s FAIF in addition to the T1 Income Tax Return file increases the number of immigrants identified as potentially living in Canada. As a result, the inclusion of these data sources reduces the estimated emigration rate of immigrants in Canada.

1 Introduction

The net contribution of immigration to a country’s population growth and labour supply is determined by the number of immigrants staying in the receiving country. Immigrant-receiving countries often identify long-term economic- and population-related goals as motivations for permanent immigration programs, making immigrants’ presence and absence patterns of increasing policy interest. However, an expanding body of literature suggests that international migration has become increasingly fluid and that the line between temporary and permanent migration has become blurry (Budnik 2011; Fauser et al. 2015; Vadean and Piracha 2010). Yet, no commonly accepted quantitative indicators have been developed to measure the transitory nature of international migration.

In the absence of a direct data source with information collected on immigrants exiting Canada, previous studies have had to rely on indirect estimation methods, including the residual method, reverse record checks, tax data and Statistics Canada’s Demographic Estimates Program (see Bérard-Chagnon 2018 for an excellent overview). Using varying criteria, these studies tended to treat emigration among immigrants as a one-time, permanent move. The differences in methodology and estimates were particularly significant among studies using administrative data from income tax files. For example, Dryburgh and Hamel (2004a) estimated that, among immigrants who landed in 1990 and filed income taxes in Canada at least once, 7% left Canada by 2000. They defined emigrants as those who had stopped filing taxes for at least two years by 2000 and whose entire landing group (i.e., family or extended family who landed together) had also stopped filing taxes at the same time. In their estimate, immigrants who never filed taxes were excluded from the base population and not counted as emigrants. In contrast, Aydemir and Robinson (2008) included all landed immigrants, regardless of whether they had ever filed a tax return, as the base population, and defined emigrants as immigrants who never filed a tax return or who stopped filing for four consecutive years. Based on this definition, about 27% of immigrant men who were aged 25 to 35 at landing and arrived in 1991 left Canada within 10 years of immigration. They also noted that, if the analysis was restricted to immigrants who filed taxes at least once, the estimated emigration rate would be reduced by half. Similar to Dryburgh and Hamel (2004a), Bérard-Chagnon, Tang and St-Jean (2019) considered only immigrants who filed a tax return at least once in their estimation of immigrant emigration. However, these authors applied a more stringent definition of emigration. They identified emigrants as those who stopped filing a tax return for at least three consecutive years and did not file again before the last year observed in the data, with the exception of (1) women of childbearing ages (19 to 45 years) who were the only member of their landing group who stopped filing, and (2) immigrants aged 65 and older who landed within the last 10 years and were not eligible to receive Canada’s Old Age Security payment. The results showed that, among immigrant tax filers who were aged 18 and older at landing and arrived between 1990 and 1991, about 18% had emigrated 10 years after immigration and 36% had emigrated 20 years after immigration. Clearly, the treatment of immigrants who never filed an income tax return and the definition of immigrants’ absences are two main sources of differences in the methods and results of previous studies that have used administrative data to estimate immigrant emigration.

The present study seeks to refine the estimation of emigration among immigrants to Canada by assessing methodological choices concerning the scope of the data and definitions of absence using tax-based administrative data. The analysis first examines the prevalence and characteristics of immigrants who do not file income tax returns but engage in other fiscal activities that are captured in auxiliary administrative data sources collected by the Canada Revenue Agency (CRA). Patterns of disappearance and reappearance among immigrants that were observed using these data sources are subsequently explored. Lastly, the sensitivity of the emigration estimates to different definitions derived from the frequency and duration of immigrants’ disappearance from Canada is investigated.

Methodologically, this study deals with two difficulties typically encountered in previous Canadian studies on the emigration of immigrants. The first challenge is deciding how to interpret data on immigrants who are counted as having landed in Canada but who never appear in the T1 Income Tax Return file. Some studies have made the assumption that immigrants who never filed taxes left Canada (e.g., Aydemir and Robinson 2008). However, this assumption suffers from two sources of potential bias. First, the linkage rate between immigrant landing records and the tax return file is not perfect. Individuals whose landing records are not matched to the tax file may be absent from income tax files because of mismatching and not because they did not file taxes. This problem has been mitigated by recent data improvements. The linkage rate of the Longitudinal Immigration Database (IMDB), the main data source used to study the emigration of Canadian immigrants, increased from 55% in the mid-1990s (when the database was first developed) to about 81% in the late 2000s. By the mid-2010s, the linkage rate was 97% (see Statistics Canada 2019, Section 8 for details). The second source of bias arises from the possibility that some immigrants might live in Canada without filing an income tax return. Some studies have ignored immigrants who never appeared in the income tax return file, and estimated disappearance rates only among immigrants who filed a tax return at least once (e.g., Bérard‑Chagnon, Tang and St-Jean 2019; Dryburgh and Hamel 2004b). This report addresses this bias by assessing the extent to which immigrants who never appeared in or disappeared from the tax return file are present in auxiliary data sources collected by the CRA.

The second methodological challenge is deciding how to define emigration. Past studies have generally defined emigration as the disappearance of an individual from the tax return file for a number of consecutive years, such as an absence of two, three or four years (e.g., Aydemir and Robinson 2008; Bérard-Chagnon, Tang and St-Jean 2019). However, this does not account for the possibility of an individual reappearing in the tax file after this time period has elapsed. Using the IMDB, Dryburgh and Hamel (2004b) showed that over 20% of immigrants reappeared in the tax file several years after an initial disappearance of three years. As with never having appeared in the tax return file, the disappearance of an immigrant from the tax file records does not necessarily indicate emigration. Furthermore, estimating reappearance is less feasible for more recent immigrant cohorts because of the shorter period of observation. This report evaluates the sensitivity of the emigration rate to the number of years used to define disappearance and to account for reappearance.

This study is organized as follows: Section 2 describes the data and sample selection for the estimation of immigrant emigration in Canada; Section 3 examines the use of auxiliary data sources to assess fiscal behaviour and enhance the emigration estimates of immigrants; Section 4 describes the impact of different measures of duration on the disappearance and reappearance of immigrants in the CRA data; and the summary and discussion are found in Section 5.

2 Data and sample selection

Canadian residents are required to file a tax return. This information is captured in the T1 personal master file (T1PMF), which is a cross-sectional dataset consisting of the T1 Individual Income Tax Return records of Canadian tax filers who submitted their returns before an assessment date. It contains a wide variety of information about these individuals, including demographic characteristics (e.g., year of birth, sex, marital status, and province or territory of residence); income (e.g., employment, self-employment, investments, capital gains); and a number of federal and provincial amounts for taxes, transfers, credits and allowances. However, some individuals who receive taxable benefits or employment income do not file a tax return or do so after the filing deadline. As a result, these individuals are not captured by the T1PMF. However, they may be captured by other tax data sources collected by the CRA. For example, personal income tax returns submitted more than two years after the assessment date are not included in the T1PMF, but are counted in the T1 historical personal master file (T1HPMF). Furthermore, individuals who receive employment income or employment insurance (EI) benefits are recorded in the T4 Statement of Remuneration Paid or T4E Statement of Employment Insurance and Other Benefits files.

Consequently, use of the T1PMF alone may underestimate tax-reporting behaviour and associated estimations of the presence of immigrants in Canada. To address this issue of underestimation, this study combined the T1PMF with 13 other tax file data sources held by the CRA in the Fiscal Activity Indicator File (FAIF). The FAIF is an administrative file used to identify the longitudinal tax-reporting behaviour of every social insurance number (SIN) reported at least once in any of the 14 selected tax files provided by the CRA to Statistics Canada. The data sources in the FAIF are listed in Table 1.


Table 1
Data components in the Fiscal Activity Indicator File
Source file Brief description Tax file versions used
1 T1 personal master file (T1PMF) Includes individuals who filed a tax return before the cut-off date (i.e., December 22 in the year following the tax year). 2000 to 2017: final
2 T1 historical personal master file (T1HPMF) Includes the same individuals as the T1PMF, as well as individuals who submitted a late return (i.e., those who did not submit tax returns to the Canada Revenue Agency [CRA] in time to be included in the conventional database). 2000 to 2015
3 Canada Child Tax Benefit (CCTB) Standard A tax-free monthly payment made to eligible families to help them with the cost of raising children younger than 18. 2000 to 2002: historical; July 2002 to June 2003; July 2003 to June 2004; ...; July 2018 to March 2019
4 T4 Statement of Remuneration Paid (including late filers) An information slip prepared and issued by employers to employees and the CRA for how much employment income employees were paid during a tax year and the amount of income tax that was deducted. Non-late filers: 2000 to 2017: final; late filers: 2000 to 2016
5 T4A Statement of Pension, Retirement, Annuity and other incomes Issued to individuals who received income from pensions, retirement allowance, annuities or other types of income (e.g., benefits for medical premiums, registered disability savings plan payments, grants for the apprenticeship incentive, death benefits, and Registered Education Savings Plan payments). 2000 to 2017: final
6 T4E Statement of Employment Insurance and other benefits Issued to individuals who received employment insurance benefits or repaid an overpayment in the previous year. 2000 to 2017: final
7 T4A(OAS) Statement of Old Age Security Issued to individuals who received the Old Age Security pension. 2000 to 2017: final
8 T4RIF Statement of Income from a Registered Retirement Income Fund Issued to individuals who received income from a Registered Retirement Income Fund. 2000 to 2017: final
9 T4RSP Statement of RRSP Income Issued to individuals who withdrew from their Registered Retirement Savings Plan (RRSP) account or received RRSP income. 2001 to 2017: final
10 T5007 Workers’ compensation benefits and social assistance payments Issued to individuals who received workers’ compensation benefits or social assistance. 2000 to 2017: final
11 Registered Retirement Savings Plan (RRSP) Issued to individuals regarding the amount of their contribution to the RRSP in a tax year. 2008 to 2017: final
12 Canada Pension Plan (CPP) Issued to individuals regarding the amount of their contribution to the CPP in a tax year. 2000 to 2017: final
13 Quebec Pension Plan (QPP) Issued to individuals regarding the amount of their contribution to the QPP in a tax year. 2000 to 2017: final
14 Shelter Allowance for Elderly Renters Issued to individuals who received cash payments to subsidize rents for eligible senior renters in some provinces. 2000 to 2005 and 2007 to 2017: final

Using the T1PMF and the 13 auxiliary data sources in the FAIF, this study examines the extent to which using only the T1PMF may underestimate the potential presence of immigrants in Canada. The sample of immigrants was selected from the Immigrant Landing File (ILF) from Immigration, Refugees and Citizenship Canada (IRCC). This file contains the characteristics of all immigrants who landed in Canada from 1952 onward. This study focuses on immigrants who landed in Canada between 2000 and 2015 because the current version of the FAIF covers the period from 2000 to 2016. The study sample excludes individuals who died between 2000 and 2015 by using the death indicator from the mortality database. The analysis was restricted to immigrants who were aged 25 to 64 at landing because adult immigrants have a high tax-filing rate. The sample was further restricted to only those with a reported sex. With these restrictions, there were 2,454,935 observations for the analysis.

The immigrant sample was linked to the FAIF using the SIN as the linkage key. The original IRCC ILF does not contain SIN data. To create the basis for matching to other data sources, Statistics Canada used probabilistic record linkage to add SINs to the IRCC ILF.

3 Using auxiliary data sources to measure tax-reporting behaviours

This section examines the value of using the 13 auxiliary data sources in the FAIF in addition to the T1PMF to capture the presence of immigrants in Canada. To provide an overall picture, this section first looks at how many landed immigrants who arrived between 2000 and 2015 had a SIN and, among those with a SIN, how many appeared at least once in the T1PMF and the 13 auxiliary files in the FAIF. Since some immigrants may file taxes in some but not all years, this section further examines the pattern of immigrants appearing in the T1PMF and the 13 auxiliary files in the FAIF by years after immigration.

Table 2 shows the distribution of landed immigrants in four mutually exclusive categories: (1) appeared in the T1PMF for at least one year between 2000 and 2016, (2) was absent from the T1PMF but appeared in at least one of the 13 auxiliary FAIF data sources, (3) had a SIN but was not found in any of the 14 FAIF data sources, and (4) was in the ILF but did not have a SIN and, as a result, could not be linked to the FAIF.

In Table 2, the second column shows the number of immigrants in the final sample, as well as the frequencies by demographic characteristics. Among the 2,454,935 immigrants who landed between 2000 and 2015 and were aged between 25 and 64 at landing, 92.9% filed a personal tax return at least once from 2000 to 2016 and therefore appeared in the T1PMF for at least one year (Column 3). Because the study sample was restricted to adult immigrants aged 25 to 64 at landing, this high filing rate is expected. The fourth column shows that 0.8% of immigrants were absent from the T1PMF during the entire 2000 to 2016 period but were captured in the other 13 auxiliary data sources in the FAIF. The last two columns show that 4.1% of immigrants had a SIN but were not found in any of the 14 FAIF data sources between 2000 and 2016, and 2.2% of immigrants from the original IRCC landing files (i.e., the ILF) did not have a SIN and, as a result, could not be linked to the FAIF.
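The four-way split described above can be illustrated with a minimal Python sketch. It is not the authors' code: the `classify` function, the category labels and the toy `faif` lookup (SIN to tax year to set of source files) are hypothetical names introduced only for illustration, and the sketch assumes that SINs have already been attached to the landing records.

```python
from collections import Counter

T1PMF = "T1PMF"  # label for the main tax file; all other labels are auxiliary sources


def classify(sin, faif):
    """Assign one landing record to one of four mutually exclusive categories.

    sin  -- the linked SIN, or None when no SIN could be attached (hypothetical input)
    faif -- dict: SIN -> {tax_year -> set of source files in which the SIN appears}
    """
    if sin is None:
        return "no_sin"                      # category 4: cannot be linked to the FAIF
    years = faif.get(sin, {})
    sources = set().union(*years.values())   # every file seen in any year, 2000 to 2016
    if T1PMF in sources:
        return "t1pmf"                       # category 1: filed a T1 at least once
    if sources:
        return "auxiliary_only"              # category 2: never in T1PMF, seen elsewhere
    return "sin_not_in_faif"                 # category 3: SIN exists, no fiscal activity


# Toy data, invented for the example only.
faif = {
    "A": {2005: {"T1PMF", "T4"}},
    "B": {2006: {"T4"}, 2007: {"CCTB"}},
}
landing_file = ["A", "B", "C", None]

counts = Counter(classify(sin, faif) for sin in landing_file)
shares = {k: 100 * v / len(landing_file) for k, v in counts.items()}
print(shares)
```

Because each record falls into exactly one branch, the four shares always sum to 100%, which is the property the Table 2 percentages rely on.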


Table 2
Tax-reporting rates of immigrants who were aged 25 to 64 at landing and arrived between 2000 and 2015, by demographic characteristics, pooled 2000 to 2016 tax data, Canada (Table 2 Note 1)
Demographic characteristics Immigrants Appear in the T1PMF for at least one year Absent from the T1PMF but appear in the 13 other FAIF data sources Have a SIN but are absent from all FAIF data sources Do not have a SIN but appear in the ILF
total count percent
All 2,454,935 92.9 0.8 4.1 2.2
Sex
Male 1,192,185 92.6 0.9 4.5 1.9
Female 1,262,750 93.3 0.6 3.7 2.4
Age group
25 to 34 1,223,505 93.2 0.7 3.7 2.4
35 to 49 967,700 93.0 0.8 4.3 1.9
50 to 64 263,730 91.9 0.6 5.2 2.3
Immigrant class
Federal Skilled Worker Program 794,960 88.0 1.1 7.9 3.0
Provincial Nominee Program 218,900 96.8 0.4 1.7 1.1
Canadian Experience Class 57,115 99.0 0.1 0.3 0.6
Other economic class 483,470 93.2 1.1 4.3 1.4
Family class 649,420 94.7 0.5 2.0 2.8
Refugee 220,130 98.8 0.2 0.2 0.8
Others 30,940 99.0 0.2 0.3 0.5
Education
Less than high school 539,460 95.2 0.5 2.3 2.0
High school or trade 301,110 95.2 0.7 2.3 1.8
Some postsecondary 286,510 95.2 0.7 2.5 1.7
Bachelor’s degree 903,165 92.8 0.8 4.4 2.0
Graduate degree 424,030 87.2 1.2 8.2 3.4
Not stated 665 96.5 0.6 1.7 1.2
Source country (Table 2 Note 2)
China 337,760 91.3 0.4 5.2 3.0
India 320,960 89.1 1.0 6.5 3.3
Philippines 247,425 98.6 0.2 0.7 0.6
Pakistan 97,970 92.6 0.8 5.0 1.6
Iran 85,745 92.1 1.0 5.9 1.1
United Kingdom 82,465 91.9 1.1 4.2 2.8
United States 63,915 91.8 1.0 3.6 3.6
South Korea 57,070 92.8 0.5 3.7 3.0
France 52,800 93.1 2.0 3.0 1.9
Morocco 42,365 90.5 1.7 6.4 1.4
North and South America (excluding the United States) 236,080 95.4 0.8 2.4 1.5
Europe (excluding the United Kingdom and France) 230,785 94.1 1.0 3.2 1.7
Africa (excluding Morocco) 249,265 93.8 0.9 3.7 1.6
Asia (excluding China, India, Philippines, Pakistan, Iran and South Korea) 322,150 91.9 0.7 4.8 2.6
Oceania and other 28,185 93.2 1.1 3.5 2.2
Official language
English only 1,494,530 92.7 0.7 4.3 2.3
French only 130,185 95.3 1.0 2.5 1.2
English and French 288,905 91.7 1.4 5.0 1.8
Neither English nor French 541,055 93.7 0.5 3.4 2.4
Not stated 255 94.9 0.8 3.1 1.2
Canadian work experience prior to landing
No 1,807,845 90.8 0.9 5.5 2.8
Yes 647,090 98.9 0.3 0.3 0.5
Canadian study experience prior to landing
No 2,248,660 92.5 0.8 4.4 2.3
Yes 206,275 98.0 0.5 0.8 0.8
Intended occupation
ICT 121,875 86.2 1.1 8.9 3.8
Engineering 84,935 90.0 1.3 6.5 2.2
Management 95,390 89.7 1.0 6.8 2.5
Professional or technical 476,045 91.5 1.2 5.2 2.1
Other 192,775 96.9 0.6 1.4 1.1
Unknown 1,483,915 93.8 0.6 3.4 2.2

Table 2 also shows that T1PMF tax-filing rates varied by immigrant characteristics (Column 3). Immigrants admitted as federal skilled workers had a lower tax-filing rate than other immigrant classes, primarily because they were less likely to have a SIN or more likely to not have their SINs found in any of the tax files. Only 88% of immigrants who landed as federal skilled workers filed a personal tax return at least once between 2000 and 2016, compared with 99% of immigrants in the Canadian Experience Class. Differences were also observed by education level. Immigrants with a graduate degree at landing had a lower tax-filing rate (87.2%) than those without a university education (95.2%). Immigrants who landed in Canada with the intention to work in information and communications technology had a filing rate of 86.2%, the lowest among all intended occupations. Of the top 10 source countries of immigrants, immigrants from India had the lowest filing rate (89.1%), while those from the Philippines had the highest filing rate (98.6%). Lastly, immigrants with work or study experience in Canada prior to landing had higher tax-filing rates than those with no such experience.

After excluding landed immigrants who did not have a SIN and those whose SIN was not found in any tax file, only a small proportion (0.85%) of the remaining immigrants appeared in the 13 auxiliary FAIF data sources without ever filing a T1 tax return. Therefore, over the full observation period, the 13 auxiliary data sources did not capture many more immigrants than the T1PMF. However, some immigrants who filed an income tax return at least once may not have filed every year, and they may appear in the auxiliary data sources in those intermittent years. To measure the presence of immigrants in Canada, it is important to determine whether immigrants appeared in the auxiliary files in the years when they did not file a T1 return.
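The 0.85% figure can be approximately reproduced from the rounded percentages in the "All" row of Table 2. The short Python check below is only an arithmetic illustration; the published value was presumably computed from unrounded counts, and the variable names are chosen here for readability.

```python
# Rounded shares from the "All" row of Table 2 (percent of all landed immigrants).
t1pmf_ever = 92.9       # appeared in the T1PMF for at least one year
auxiliary_only = 0.8    # never in the T1PMF but in one of the 13 other FAIF sources

# Base: immigrants with a SIN found in at least one FAIF file.
found_in_faif = t1pmf_ever + auxiliary_only

share_aux_only = 100 * auxiliary_only / found_in_faif
print(round(share_aux_only, 2))  # about 0.85
```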

Table 3 shows the percentage of immigrants who appeared in the T1PMF and the 13 auxiliary FAIF data sources by years after immigration. Data were pooled for immigrants who arrived between 2000 and 2015. In the table, the year of landing is denoted as T, while T+i (i = 1 to 10) refers to the ith year after the landing year. For the ith year, the calculation was based on all immigrants who had landed at least i years before 2016 (the last observed data point). For example, in year T+1, the calculation was based on all arrival cohorts, while in T+10, the calculation was based on immigrant cohorts who arrived at least 10 years before 2016.
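The pooling rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the production method: `eligible_cohorts`, `pooled_rate` and the toy `records` structure (landing year paired with the set of tax years in which a person appears) are hypothetical.

```python
LAST_TAX_YEAR = 2016              # last observed data point
LANDING_YEARS = range(2000, 2016)  # arrival cohorts 2000 to 2015


def eligible_cohorts(i):
    """Landing years that can be observed i years after landing."""
    return [y for y in LANDING_YEARS if y + i <= LAST_TAX_YEAR]


def pooled_rate(records, i):
    """Percent of eligible immigrants present in year T+i.

    records -- list of (landing_year, set_of_tax_years_present) pairs
    """
    eligible = [(y, yrs) for y, yrs in records if y + i <= LAST_TAX_YEAR]
    if not eligible:
        return None
    hits = sum(1 for y, yrs in eligible if (y + i) in yrs)
    return 100 * hits / len(eligible)


print(len(eligible_cohorts(1)))   # all 16 cohorts contribute to T+1
print(len(eligible_cohorts(10)))  # only the 2000 to 2006 cohorts contribute to T+10

# Toy records, invented for the example.
records = [(2000, {2001, 2002}), (2010, {2011}), (2015, set())]
print(pooled_rate(records, 1), pooled_rate(records, 10))
```

The shrinking set of eligible cohorts is what drives the falling counts in the "Immigrants" column of Table 3.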


Table 3
Tax-reporting rates among immigrants who were aged 25 to 64 at landing and arrived between 2000 and 2015, by years since landing, Canada
Years since landing (T) Immigrants Appear in the T1PMF Do not appear in the T1PMF but appear in the 13 other FAIF data sources Otherwise (Table 3 Note 1)
frequency percent
T+1 2,454,935 86.0 3.8 10.2
T+2 2,279,240 86.4 3.6 10.0
T+3 2,108,355 86.2 3.7 10.1
T+4 1,946,010 85.3 3.7 11.0
T+5 1,784,665 83.4 4.3 12.3
T+6 1,630,355 81.7 4.7 13.6
T+7 1,455,795 80.2 4.9 14.9
T+8 1,300,085 78.9 5.1 16.0
T+9 1,148,830 78.0 5.1 16.9
T+10 1,006,015 77.1 5.1 17.8

Table 3, Column 2 (Immigrants) shows the population counts for the calculation of percentages in columns 3 to 5. The total counts decreased from year 1 to year 10 because the number of observable cohorts decreased in each sequential year. As shown in Column 3 (Appear in the T1PMF), the tax-reporting rate measured by the T1PMF alone dropped from 86% to 77% from year 1 to year 10. Column 4 includes immigrants who did not appear in the T1PMF, but who appeared in any of the 13 other data sources in the FAIF. Using the 13 other data sources in the FAIF raised tax-reporting rates by 4 to 5 percentage points. These shares equate to a considerable number of additional immigrants being accounted for (from 93,700 individuals in T+1 to 50,840 in T+10). Therefore, auxiliary data sources in the FAIF can provide meaningful information to supplement the measurement of tax-reporting behaviour of immigrants on a yearly basis.
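As a rough consistency check, the Table 3 percentages can be combined and compared with the counts reported in Table 4. The Python snippet below uses only the published (rounded) T+1 and T+10 values; the dictionary layout and variable names are illustrative.

```python
# (immigrants, % in T1PMF, % auxiliary only, % otherwise) from Table 3.
table3 = {
    1: (2_454_935, 86.0, 3.8, 10.2),
    10: (1_006_015, 77.1, 5.1, 17.8),
}
# Counts of auxiliary-only immigrants from Table 4.
table4_counts = {1: 93_700, 10: 50_840}

check = {}
for i, (n, t1, aux, other) in table3.items():
    combined = round(t1 + aux, 1)                     # presence using all 14 files
    implied_aux = round(100 * table4_counts[i] / n, 1)  # share implied by Table 4
    check[i] = (combined, round(100 - other, 1), implied_aux)

print(check)
```

Under this reading, combined coverage is 89.8% in T+1 and 82.2% in T+10, and the Table 4 counts imply the same 3.8% and 5.1% auxiliary-only shares shown in Table 3.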

In tables 2 and 3, the 13 auxiliary data sources in the FAIF are pooled together. It is possible that some data sources may contribute more than others to increasing immigrant tax-reporting rates. This possibility was confirmed when the data sources in which immigrants appear most frequently were examined. Table 4 outlines the fiscal activities engaged in by immigrants who did not appear in the T1PMF, but who appeared in at least one of the 13 other FAIF data sources. The top three categories were those who filed a late tax return (T1HPMF), those who received the Canada Child Tax Benefit (CCTB) and those who received a T4 slip from an employer. Between the first and tenth years after landing, the shares of immigrants who filed a T1HPMF and those who received a T4 slip (but did not file taxes) declined. Over the same period, the share of those who received the CCTB (but did not file taxes) increased. Notably, the share of those who contributed to the Registered Retirement Savings Plan (RRSP) (but did not file taxes) increased between the first and tenth years after landing.


Table 4
Fiscal behaviour of immigrants who did not appear in the T1PMF but who appeared in the 13 other Fiscal Activity Indicator File data sources, by years since landing, immigrants who landed between 2000 and 2015, Canada
Years since landing (T) Immigrants T1HPMF CCTB T4 T4A T4E T4RSP T5007 RRSP Other (Table 4 Note 1)
frequency percent
T+1 93,700 54.8 29.0 41.6 9.8 3.8 1.3 7.3 3.0 0.1
T+2 81,935 51.4 36.7 39.8 9.8 6.3 2.0 6.4 4.0 0.2
T+3 77,345 48.9 42.0 39.6 9.8 7.1 2.6 6.6 4.9 0.3
T+4 72,505 42.1 48.5 39.3 9.4 7.8 3.2 6.9 5.9 0.4
T+5 77,015 38.4 53.4 36.6 8.4 7.4 3.5 6.6 6.4 0.6
T+6 76,545 35.8 58.4 33.4 7.4 6.9 3.5 6.0 7.0 0.8
T+7 72,025 34.0 60.2 30.9 7.0 6.4 3.5 5.9 8.1 1.0
T+8 66,145 32.1 62.5 29.9 6.5 5.8 3.6 5.7 9.4 1.2
T+9 58,240 28.9 64.0 29.8 6.4 5.7 3.6 5.8 10.1 1.5
T+10 50,840 27.5 64.3 29.5 6.1 5.4 3.6 5.7 10.7 1.9

The inclusion of auxiliary FAIF data sources raised tax-reporting rates by 4 to 5 percentage points among immigrants on a yearly basis, from a baseline rate of between 77% and 86%. Among immigrants who were captured in the auxiliary data sources but who did not appear in the T1PMF, the majority are those who filed their taxes late (T1HPMF), those who received the CCTB and those who received a T4 (but did not file taxes). Overall, the use of the auxiliary data sources in the FAIF can enhance the measurement of the presence of immigrants in Canada.

4 The frequency and duration of the disappearance and reappearance of immigrants in tax file data

This section uses tax-reporting behaviour to measure immigrants’ presence in Canada, demonstrating that the inclusion of auxiliary data impacts estimated rates of disappearance and reappearance in tax file data and—ultimately—emigration rates. The analysis was restricted to immigrants who landed between 2000 and 2004 to ensure a long enough period of observation over which the frequency and duration of disappearance and reappearance could be examined. The sample excluded individuals who died between 2000 and 2015 by using the death indicator from the mortality database. The sample was further restricted to those aged 25 to 64 at landing and with a reported sex. The final sample contains 696,643 observations.

As stated previously, past studies on immigrants’ emigration have used different measures of duration to define disappearance. This study compares two different durations to define the disappearance of an immigrant: not filing a tax return for two or for four consecutive years after landing, which correspond to the shortest and longest durations used in previous studies (Dryburgh and Hamel 2004a; Aydemir and Robinson 2008). This study also examines whether an immigrant returned to the tax data within five years after having disappeared, which is referred to as a reappearance.

Combining different durations and data sources, this study used four alternative methods to measure disappearance:

  1. Not in the T1PMF for two consecutive years after landing.
  2. Neither in the T1PMF nor in the 13 other FAIF tax data sources for two consecutive years after landing.
  3. Not in the T1PMF for four consecutive years after landing.
  4. Neither in the T1PMF nor in the 13 other FAIF tax data sources for four consecutive years after landing.
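The four definitions differ only in the data source used to establish presence and in the length of the absence run. A minimal Python sketch of that logic is given below; the function name `first_disappearance`, the toy presence sets and the `methods` dictionary are hypothetical and are not taken from the study's code.

```python
def first_disappearance(present_years, landing_year, k, last_year):
    """Return the tax year that completes the first run of k consecutive
    absent years after landing, or None if no such run is observed."""
    run = 0
    for year in range(landing_year + 1, last_year + 1):
        if year in present_years:
            run = 0            # any appearance resets the absence run
        else:
            run += 1
            if run == k:
                return year
    return None


# Toy immigrant who landed in 2004 (values invented for illustration).
t1_years = {2005, 2006, 2009}   # years with a T1PMF record
aux_years = {2007}              # years seen only in an auxiliary FAIF source
any_faif = t1_years | aux_years

methods = {
    1: (t1_years, 2),   # T1PMF only, two consecutive years
    2: (any_faif, 2),   # T1PMF or auxiliary sources, two consecutive years
    3: (t1_years, 4),   # T1PMF only, four consecutive years
    4: (any_faif, 4),   # T1PMF or auxiliary sources, four consecutive years
}
results = {m: first_disappearance(yrs, 2004, k, 2011) for m, (yrs, k) in methods.items()}
print(results)
```

In this toy case the auxiliary record in 2007 delays the two-year disappearance from 2008 to 2011, and neither four-year definition flags a disappearance, which illustrates why the estimated rates depend on both choices.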

Below is an example (Table 5) illustrating the method used to calculate the rate of disappearance and reappearance of immigrants who landed in 2004.


Table 5
An example of using two consecutive years to define disappearance
T T+1 T+2 T+3 T+4 T+5 T+6 T+7 T+8 T+9 T+10 T+11 T+12
2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016
... ... disappear t+1 t+2 t+3 t+4 t+5 t+6 t+7 t+8 t+9 t+10
... ... ... disappear t+1 t+2 t+3 t+4 t+5 t+6 t+7 t+8 t+9
... ... ... ... disappear t+1 t+2 t+3 t+4 t+5 t+6 t+7 t+8
... ... ... ... ... disappear t+1 t+2 t+3 t+4 t+5 t+6 t+7
... ... ... ... ... ... disappear t+1 t+2 t+3 t+4 t+5 t+6
... ... ... ... ... ... ... disappear t+1 t+2 t+3 t+4 t+5
Note: ... not applicable

In this example, T refers to the year of landing. An immigrant is considered to have disappeared if they did not file taxes for two consecutive years. The first incidence of disappearance can happen only by the end of T+2 for immigrants who did not file a tax return in T+1 and T+2. Immigrants defined as having disappeared by the end of T+2 were not used in the calculation of the disappearance rates in T+3, T+4 and so forth. Immigrants who did not disappear by the end of T+2 were carried forward to T+3 and again assessed for disappearance. In other words, each immigrant disappearance is conditional on no previous disappearance.
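The rule described above can be sketched in code. The following Python fragment is an illustration only and is not the authors' implementation: the function names, the `years_present` set (the years in which a person appears in whichever data sources a given method uses) and the default cut-off year of 2011 are assumptions made for this example.

```python
from typing import Optional, Set


def first_disappearance(landing_year: int,
                        years_present: Set[int],
                        gap: int = 2,
                        last_year: int = 2011) -> Optional[int]:
    """Return the year by whose end the first disappearance is recorded.

    A disappearance is recorded after `gap` consecutive years of absence,
    counting from the year after landing (T+1). Returns None if no
    disappearance is observed up to `last_year`.
    """
    absent_run = 0
    for year in range(landing_year + 1, last_year + 1):
        if year in years_present:
            absent_run = 0          # any appearance resets the count
        else:
            absent_run += 1
            if absent_run == gap:   # first disappearance only
                return year
    return None


def reappeared(disappearance_year: int,
               years_present: Set[int],
               window: int = 5) -> bool:
    """True if the person appears again within `window` years of disappearing."""
    follow_up = range(disappearance_year + 1, disappearance_year + window + 1)
    return any(year in years_present for year in follow_up)


# Hypothetical person who landed in 2004 and appears in 2005, 2006 and 2009.
present = {2005, 2006, 2009}
year_gone = first_disappearance(2004, present, gap=2)
print(year_gone, reappeared(year_gone, present))
```

For this hypothetical person, absence in 2007 and 2008 produces a first disappearance in 2008 (T+4) under the two-year definition, followed by a reappearance in 2009. Under the four-year definition (`gap=4`), the same person is never counted as having disappeared.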

This analysis focused on five arrival cohorts: immigrants who landed in 2000, 2001, 2002, 2003 and 2004. The observation period was 2000 to 2016. Because each arrival cohort had different possible maximum years of stay in Canada, the number of observable disappearances was different for each cohort. To ensure there were five years of observations after disappearance to determine reappearance, the last disappearance was calculated in 2011. For the 2004 cohort (see Table 5), only six possible disappearances were observed and the last disappearance was calculated in T+7 (in 2011, or five years before 2016). Similarly, the last disappearance was calculated in T+8 for the 2003 cohort, T+9 for the 2002 cohort, T+10 for the 2001 cohort and T+11 for the 2000 cohort. Therefore, from T+2 to T+7, the calculation of disappearance was based on all cohorts. Starting from T+8, in each additional year after immigration, one cohort was lost because it reached the end of the observation period. In T+11, only the 2000 cohort was left for the calculation of the disappearance rate.
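The cohort arithmetic in this paragraph can be checked with a few lines of Python. This is a sketch under the assumptions stated in the text (five landing cohorts, last disappearance calculated in 2011); the constant and function names are hypothetical.

```python
COHORTS = [2000, 2001, 2002, 2003, 2004]
LAST_DISAPPEARANCE_YEAR = 2011  # 2016 minus five years of follow-up


def cohorts_contributing(offset: int, gap: int = 2) -> list:
    """Landing cohorts for which a disappearance can be calculated in T+offset."""
    if offset < gap:
        return []  # too early for `gap` consecutive years of absence
    return [c for c in COHORTS if c + offset <= LAST_DISAPPEARANCE_YEAR]


for k in range(2, 12):
    print(f"T+{k}: {cohorts_contributing(k)}")
```

All five cohorts contribute from T+2 to T+7, one cohort drops out in each year from T+8 onward, and only the 2000 cohort remains in T+11.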

The four-year definition of disappearance (four consecutive years of absence from tax file data) works in the same way as the two-year definition, except that the first disappearance can be observed only at the end of T+4.

The rate of disappearance from tax data among immigrants who landed between 2000 and 2004 is presented in Table 6. Column 2 shows the denominator used to calculate the disappearance rate. From T+2 to T+7, all five cohorts are available. The sample size began to decrease as of T+8 because one cohort was lost in each subsequent year. In T+11, only immigrants who landed in 2000 were used as the denominator of the disappearance rate.


Table 6
Rate of disappearance from tax data among immigrants who landed in 2000 to 2004, Canada

| Year of landing (T) | Number of immigrants (frequency) | Did not appear in the T1PMF for two consecutive years (method 1) (percent) | Did not appear in either the T1PMF or the 13 other FAIF tax data sources for two consecutive years (method 2) (percent) | Did not appear in the T1PMF for four consecutive years (method 3) (percent) | Did not appear in either the T1PMF or the 13 other FAIF tax data sources for four consecutive years (method 4) (percent) |
| --- | --- | --- | --- | --- | --- |
| T+2 | 696,645 | 11.1 | 8.7 | ... | ... |
| T+3 | 696,645 | 1.8 | 1.2 | ... | ... |
| T+4 | 696,645 | 1.8 | 1.1 | 9.3 | 7.7 |
| T+5 | 696,645 | 2.0 | 1.1 | 1.3 | 1.1 |
| T+6 | 696,645 | 2.3 | 1.3 | 1.4 | 1.0 |
| T+7 | 696,645 | 2.1 | 1.4 | 1.6 | 1.1 |
| T+8 | 554,930 | 1.9 | 1.3 | 2.0 | 1.3 |
| T+9 | 422,130 | 1.6 | 1.3 | 1.9 | 1.3 |
| T+10 | 284,985 | 1.4 | 1.1 | 1.8 | 1.3 |
| T+11 | 135,240 | 1.2 | 1.0 | 1.5 | 1.3 |
| Cumulative disappearance rate | ... | 27.1 | 19.6 | 20.8 | 16.0 |

Note: ... not applicable

In the last row of Table 6, the cumulative rate of disappearance was calculated as the sum of the yearly rates of disappearance between T+2 and T+11. The results show that the measurement of immigrants’ disappearance from tax data is sensitive to the duration used to define disappearance and to the data sources applied. When disappearance was defined as being absent from the T1PMF for two consecutive years (method 1), the cumulative disappearance rate was 27.1% 11 years after landing. Alternatively, if disappearance was defined as being absent from both the T1PMF and the 13 other FAIF tax data sources for four consecutive years (method 4), the cumulative disappearance rate 11 years after landing was much lower, at 16.0%.
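As a rough check, the cumulative rates can be approximated by summing the rounded yearly rates published in Table 6. The short Python sketch below does this; the variable names are hypothetical. The sums of rounded values differ from the published totals by up to 0.1 percentage point, presumably because the published totals were computed from unrounded rates (an assumption, not something the table states).

```python
# Yearly disappearance rates (percent) copied from Table 6.
annual_rates = {
    "method 1": [11.1, 1.8, 1.8, 2.0, 2.3, 2.1, 1.9, 1.6, 1.4, 1.2],
    "method 2": [8.7, 1.2, 1.1, 1.1, 1.3, 1.4, 1.3, 1.3, 1.1, 1.0],
    "method 3": [9.3, 1.3, 1.4, 1.6, 2.0, 1.9, 1.8, 1.5],  # starts at T+4
    "method 4": [7.7, 1.1, 1.0, 1.1, 1.3, 1.3, 1.3, 1.3],  # starts at T+4
}

# Cumulative rates as published in the last row of Table 6.
published = {"method 1": 27.1, "method 2": 19.6,
             "method 3": 20.8, "method 4": 16.0}

cumulative = {m: round(sum(rates), 1) for m, rates in annual_rates.items()}

for method, total in cumulative.items():
    print(f"{method}: summed {total:.1f}, published {published[method]:.1f}")
```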

As mentioned previously, immigrants may return to the tax data after having disappeared—a phenomenon referred to in this study as “reappearance.” Table 7 presents the rate of reappearance of immigrants in the tax data five years after having disappeared in year Y. The rate of reappearance was calculated based on the cumulative number of immigrants who disappeared from T+2 to T+11 (Table 7). Therefore, the denominators differ across methods because the rates of disappearance vary from methods 1 to 4.


Table 7
Rate of reappearance in tax data of immigrants who landed in 2000 to 2004 after having disappeared, Canada

| Year of disappearance (Y) | Method 1 (frequency) | Method 1 (percent) | Method 2 (frequency) | Method 2 (percent) | Method 3 (frequency) | Method 3 (percent) | Method 4 (frequency) | Method 4 (percent) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Y+1 | 169,270 | 13.8 | 120,855 | 7.8 | 121,115 | 4.0 | 93,705 | 2.0 |
| Y+2 | 169,270 | 6.1 | 120,855 | 3.2 | 121,115 | 2.4 | 93,705 | 1.3 |
| Y+3 | 169,270 | 3.4 | 120,855 | 1.9 | 121,115 | 1.9 | 93,705 | 1.2 |
| Y+4 | 169,270 | 2.1 | 120,855 | 1.3 | 121,115 | 1.4 | 93,705 | 0.8 |
| Y+5 | 169,270 | 1.6 | 120,855 | 1.1 | 121,115 | 1.1 | 93,705 | 0.8 |
| Cumulative reappearance rate | ... | 26.9 | ... | 15.3 | ... | 10.8 | ... | 6.1 |

Notes: ... not applicable. Method 1: did not appear in the T1PMF for two consecutive years. Method 2: did not appear in either the T1PMF or the 13 other FAIF tax data sources for two consecutive years. Method 3: did not appear in the T1PMF for four consecutive years. Method 4: did not appear in either the T1PMF or the 13 other FAIF tax data sources for four consecutive years.

The results in Table 7 indicate that immigrants who reappear in the tax data tend to do so within the three years immediately following their disappearance. In all four methods, the reappearance rates were much higher in the first three years than in later years.

The last row in Table 7 shows the cumulative reappearance rate, which was calculated as the sum of the yearly reappearance rates over the five years after disappearance. The cumulative reappearance rate five years after disappearance varied between 6% and 27% across the four methods. The cumulative reappearance rate tended to be higher when fewer years were used to define disappearance. When the duration of disappearance was fixed, reappearance rates were lower when auxiliary administrative data were used to measure disappearance and reappearance. This is because the additional data sources capture immigrants who would otherwise be counted as disappeared when only the T1PMF is used. Compared with immigrants who do not appear in the T1PMF or any of the 13 additional data sources, those who do not appear in the T1PMF but who do appear in the 13 additional data sources are more likely to be present in Canada or only temporarily absent. Therefore, they are more likely to reappear in the T1PMF. If immigrants are not captured in any of the 13 additional data sources, it is highly likely that they have left Canada and, as a result, will not reappear.

By combining both the cumulative disappearance rates and cumulative reappearance rates, it is possible to calculate the emigration rate of immigrants. In this study, emigration is defined as disappearance from the CRA’s tax file(s) after landing with no reappearance in the five subsequent years. The following equation was used to calculate the emigration rate:

emigration rate = cumulative disappearance rate * [(100 - cumulative reappearance rate) / 100] 

The estimated emigration rate was calculated in the last year of the period of study, i.e., 2016. The results should be interpreted as the extent to which immigrants who landed in 2000 to 2004 had emigrated by 2016. Table 8 shows that, by 2016, the estimated emigration rate of immigrants who landed in 2000 to 2004 varied considerably by method used—between 15.1% and 19.8%. Therefore, estimated emigration rates are sensitive to the duration of time used to define disappearance and the data sources used to measure the disappearance and reappearance of immigrants.


Table 8
Estimated emigration rates for immigrants who landed between 2000 and 2004 using four different methods

| Rate (percent) | Did not appear in the T1PMF for two consecutive years (method 1) | Did not appear in either the T1PMF or the 13 other FAIF tax data sources for two consecutive years (method 2) | Did not appear in the T1PMF for four consecutive years (method 3) | Did not appear in either the T1PMF or the 13 other FAIF tax data sources for four consecutive years (method 4) |
| --- | --- | --- | --- | --- |
| Cumulative disappearance rate | 27.1 | 19.6 | 20.8 | 16.0 |
| Cumulative reappearance rate | 26.9 | 15.3 | 10.8 | 6.1 |
| Emigration rate | 19.8 | 16.6 | 18.6 | 15.1 |
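The emigration rate equation can be applied directly to the cumulative rates in Table 8. The following Python sketch is illustrative only: the function name is hypothetical, and the inputs are the rounded published rates, so a result may differ from the published emigration rate by 0.1 percentage point.

```python
def emigration_rate(cum_disappearance: float, cum_reappearance: float) -> float:
    """Emigration rate (percent) from cumulative disappearance and
    reappearance rates (both in percent), following the equation in the text."""
    return cum_disappearance * (100 - cum_reappearance) / 100


# (cumulative disappearance, cumulative reappearance) from Table 8, in percent.
table_8 = {
    "method 1": (27.1, 26.9),
    "method 2": (19.6, 15.3),
    "method 3": (20.8, 10.8),
    "method 4": (16.0, 6.1),
}

estimates = {m: round(emigration_rate(d, r), 1) for m, (d, r) in table_8.items()}

for method, rate in estimates.items():
    print(f"{method}: {rate:.1f}%")
```

With the rounded inputs, the sketch gives approximately 19.8, 16.6, 18.6 and 15.0 for methods 1 to 4. The first three match Table 8, and the last differs from the published 15.1 by 0.1, which is consistent with the published figure having been computed from unrounded rates.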

In summary, the exercise in this section demonstrates how emigration estimates using tax‑reporting behaviour are sensitive to the inclusion of auxiliary data and the duration of time used to measure disappearance. The use of a shorter duration to define disappearance (two versus four consecutive years) is associated with a higher reappearance rate. Furthermore, when a shorter duration is used to define disappearance, the inclusion of the auxiliary data sources reduces the estimated disappearance rate by a larger amount. When reappearance is taken into account, the inclusion of the auxiliary sources has a larger effect on the estimated emigration rate than the choice of duration for defining disappearance.

5 Summary and discussion

International migration is not always a one-time, permanent movement from a source country to a destination country. Some immigrants may return to their country of origin, some may move to a third country and some may stay only intermittently in the destination country. These migration complexities pose challenges for national statistical accounts and related research in countries with incomplete exit control statistics. To assess the extent of such challenges, this study explores two methodological issues related to the measurement of immigrants’ potential presence in Canada. The first is the use of auxiliary administrative data sources as a means to supplement the T1 Income Tax Return file, which—to date—has been a key data source used to estimate emigration among immigrants to Canada. The second is evaluating the sensitivity of emigration estimates to the definition of immigrant disappearance and reappearance in administrative data. The main findings of this study can be summarized as follows:

  1. Using the 13 additional tax data sources from the FAIF captures 4 to 5 percentage points more immigrants in a given year than using the T1PMF alone.
  2. Among immigrants captured in the auxiliary data sources but not in the T1PMF, the majority are those who filed taxes late (T1HPMF), those who received the Canada Child Tax Benefit (CCTB) or those who received a T4 Statement of Remuneration Paid (but did not file taxes).
  3. The estimated emigration rate by the tenth year after landing ranged from 15% to 20% among immigrants who arrived between 2000 and 2004 and were aged 25 to 64 at landing, depending on the duration used to define disappearance and the data sources used to measure it. While both the data sources and the duration matter, when reappearance was taken into account, the inclusion of the auxiliary sources played a larger role in estimating the emigration rate than the choice of duration.

In summary, using auxiliary tax data available in CRA’s FAIF in addition to the T1PMF increases the number of immigrants identified as potentially living in Canada. As a result, the inclusion of these data sources reduces the estimated emigration rate of immigrants in Canada.

More importantly, this study highlights the fact that estimating emigration is a difficult task. Even with the inclusion of 13 auxiliary data sources, it is still possible that immigrants who appear in tax files in a given year do not actually reside in Canada, or that those who disappear from tax files still reside in Canada. Objective and complete entry and exit information is needed to accurately measure individuals’ Canadian residence status. The federal government has established an entry–exit program to collect exit and entry data at the land border with the United States and exit data from airlines on all travellers leaving Canada by air. Until complete exit data are collected and made available for research purposes (and for examining earlier periods that the newly collected data will not cover), other administrative data sources can be used to refine the measurement of immigrant emigration. One example is the Longitudinal Social Data Hub (LSDH), which is currently being developed by Statistics Canada’s Social Data Linkage Environment. The LSDH is a statistical register of person-level activity information in the domains of work (EI status vector), education and human capital (Postsecondary Student Information System, Registered Apprenticeship Information System), health and well-being (births, deaths, continuity of care record, Discharge Abstract Database, National Ambulatory Care Reporting System metadata), family (T1 Family File), and crime and victimization (Integrated Criminal Court Survey). In addition, provincial register data, such as driver’s license and health card information databases, could also be useful in future studies for measuring the presence of immigrants in Canada. These data sources would help capture immigrants who have no connection with the CRA but are engaged in other activities in Canada (e.g., education, health care, crime and victimization, activities requiring a license).

Furthermore, examining emigration by country of destination could help identify the reasons why immigrants leave Canada. International data sources may make it possible to measure subsequent migration by immigrants to Canada, including step migration, i.e., individuals who immigrate to Canada first and subsequently immigrate to other countries. For example, it may be possible to use the American Community Survey and Department of Homeland Security data to provide a measure of subsequent immigration by Canadian immigrants to the United States. This could provide a picture of the extent to which Canada may or may not be the ultimate destination country for some international migrants, and further the understanding of emigration patterns from Canada. The use of international data sources could also provide information on the specifics of return migration, i.e., when individuals who become permanent residents of Canada subsequently return to their country of origin to live (and become emigrants from Canada). Using data from receiving countries would help estimate immigrants’ presence in Canada within the context of increasingly fluid international migration patterns.

It is also important to understand whether emigration has increased, decreased or remained stable, and under which conditions each of these trends occurs. Further research could examine whether emigration increases during a recession, such as that induced by the COVID-19 pandemic, the nature of the emigration that takes place during recessions and economic booms, whether this emigration consists of highly educated and skilled immigrants or lower-skilled immigrants, and whether this emigration varies by immigration category (i.e., economic, family class, refugee). Future research will be enhanced by the methodologies developed now to examine the types of immigrants likely to stay or leave after the recession.

When complete exit information becomes available, measures that better capture the fluidity of immigrants’ Canadian residence status can be developed. The conventional emigration rate does not reflect this fluidity because it conceptualizes emigration as a one-time, permanent move. Its estimation is also sensitive to the duration used to observe disappearance and reappearance. More useful indicators should be able to measure the stock, flow and longitudinal dynamics of immigrants in Canada. The stock indicators should include the number and share of immigrants in an arrival cohort who reside in Canada in a given year and—among those who resided in Canada—the number and share who are actively engaged in the labour market. The flow indicators should include the number and characteristics of immigrants who leave or return to Canada in a given year. The indicators of longitudinal dynamics should include statistics on the duration of absence, frequency and time interval of leaving and returning, and cumulative years residing in Canada. Together, these measures can provide a comprehensive picture of the demographic and socioeconomic impact of immigrants.

References

Aydemir, A., and C. Robinson. 2008. “Global labour markets, return and onward migration.” Canadian Journal of Economics 41 (4): 1285–1311.

Bérard-Chagnon, J. 2018. Measuring Emigration in Canada: Review of Available Data Sources and Methods. Demographic Documents, no. 14. Statistics Canada Catalogue no. 91F0015M. Ottawa: Statistics Canada.

Bérard-Chagnon, J., J. Tang, and B. St-Jean. 2019. Immigrant Emigration: Results from the Longitudinal Immigration Database (IMDB). Statistics Canada, Demography Division.

Budnik, K. B. 2011. “Temporary migration in theories of international mobility of labour.” Bank i Kredyt 42 (6): 7–48.

Dryburgh, H., and J. Hamel. 2004a. “Immigrants in demand: Staying or leaving?” Canadian Social Trends 74: 12. Statistics Canada Catalogue no. 11-008-X.

Dryburgh, H., and J. Hamel. 2004b. The Retention of Immigrants in Canada – Tax-filing Attrition in the Longitudinal Immigration Database. Statistics Canada Catalogue no. 89-596-XIE. Ottawa: Statistics Canada.

Fauser, M., E. Liebau, S. Voigtländer, H. Tuncer, T. Faist, and O. Razum. 2015. “Measuring transnationality of immigrants in Germany: Prevalence and relationship with social inequalities.” Ethnic and Racial Studies 38 (9): 1497–1519.

Statistics Canada. 2019. Longitudinal Immigration Database (IMDB) Technical Report, 2018. Diversity and Sociocultural Statistics 2019005, no. 024. Statistics Canada Catalogue no. 11-633-X. Ottawa: Statistics Canada.

Vadean, F., and M. Piracha. 2010. “Circular migration or permanent return: What determines different forms of migration?” In Migration and Culture, Frontiers of Economics and Globalization, ed. G. Epstein and N. Gang, p. 467–495. Bingley: Emerald Publishing.
