Longitudinal Immigration Database (IMDB) Technical Report, 2019
Appendix

A. Links to key IMDB documents and web pages

Dictionaries (tax and immigration component):

Available to data users or upon request by contacting Statistics Canada by email at STATCAN.infostats-infostats.STATCAN@canada.ca

Portal on Immigrants and Non-permanent Residents Statistics: The immigrants and non-permanent residents portal brings together the most requested data, tools and reports on a single page.

Link when prepared

Historical IMDB

IMDB releases in The Daily

Analysis using the IMDB

Evra, R. and Kazemipur, A. 2019. The Role of Social Capital and Ethnocultural Characteristics In The Employment Income Of Immigrants Over Time. Statistics Canada: Insights On Canadian Society.

Huystee, M. 2016. Interprovincial mobility: Retention rates and net inflow rates 2008-2013 admissions.

Ng, E et al. 2019. Tuberculosis-Related Hospital Use Among Recent Immigrants To Canada. Statistics Canada: Health Reports.

Picot, G, and Lu, Y. 2017. Chronic Low Income Among Immigrants in Canada and its Communities. Statistics Canada.

The Consumer Price Index (62-001-X):

Description of the annual Income Estimates for Census Families and Individuals (T1 Family File):

B. Coverage

The 2019 IMDB was used to produce these counts. Filers are linked immigrants who have filed a tax return at least once since 1982. Statistics below exclude the 2019 admissions.

Table 15
Distribution of taxfilers and non-taxfilers by admission year
Admission year	Taxfilers^{Table 15 Note 1} (number)	Non-taxfilers (number)	Total immigrants (number)	Taxfilers (percent)
1980	120,410	22,710	143,130	84.1
1981	107,670	20,910	128,580	83.7
1982	103,400	17,680	121,090	85.4
1983	77,080	11,940	89,030	86.6
1984	77,510	10,520	88,030	88.0
1985	75,010	8,930	83,940	89.4
1986	88,980	9,780	98,770	90.1
1987	137,480	13,680	151,170	90.9
1988	146,210	14,550	160,750	91.0
1989	173,880	16,790	190,670	91.2
1990	192,170	23,260	215,440	89.2
1991	208,550	23,270	231,820	90.0
1992	228,700	25,240	253,930	90.1
1993	230,800	24,880	255,670	90.3
1994	198,820	24,790	223,600	88.9
1995	188,800	23,350	212,150	89.0
1996	198,630	26,740	225,360	88.1
1997	189,420	26,040	215,460	87.9
1998	155,460	18,220	173,680	89.5
1999	168,820	20,550	189,360	89.2
2000	203,410	23,340	226,750	89.7
2001	224,190	25,580	249,770	89.8
2002	202,760	25,450	228,210	88.8
2003	194,110	26,410	220,520	88.0
2004	205,190	30,160	235,340	87.2
2005	224,680	37,110	261,770	85.8
2006	214,560	36,550	251,110	85.4
2007	199,780	36,400	236,180	84.6
2008	205,380	41,240	246,610	83.3
2009	208,900	42,680	251,590	83.0
2010	225,880	54,200	280,070	80.7
2011	197,780	50,340	248,120	79.7
2012	205,620	51,620	257,240	79.9
2013	205,970	52,530	258,510	79.7
2014	207,230	52,280	259,500	79.9
2015	211,030	59,970	271,000	77.9
2016	216,730	78,670	295,390	73.4
2017	213,860	71,470	285,330	75.0
2018	217,650	102,140	319,790	68.1
Total	7,052,510	1,281,940	8,334,440	84.6
Note 1: Taxfilers are linked immigrants who have filed taxes at least once since 1982.
Note: All counts are rounded.
Source: Statistics Canada, 2019 Longitudinal Immigration Database.

Table 16
Proportion of linked taxfilers by age group at landing, sex and admission decade
Sex and cohorts	Age at landing (percent)
	0 to 14	15 to 24	25 to 34	35 to 49	50 to 64	65 and older	Total
1980 to 1989 cohorts
Male	82.6	93.9	94.9	93.5	84.8	62.6	89.4
Female	81.5	91.5	93.3	92.5	81.6	60.5	87.1
Total	82.1	92.6	94.1	93.0	82.9	61.4	88.2
1990 to 1999 cohorts
Male	81.7	92.9	93.0	92.7	90.0	76.8	89.7
Female	80.0	91.4	92.7	92.8	88.1	75.5	88.8
Total	80.9	92.1	92.9	92.8	89.0	76.1	89.2
2000 to 2009 cohorts
Male	64.7	93.0	91.8	92.8	93.0	89.2	86.1
Female	63.6	92.5	92.8	93.6	92.8	87.5	86.9
Total	64.1	92.7	92.3	93.2	92.9	88.3	86.5
2010 to 2018 cohorts
Male	14.0	89.2	94.1	93.2	90.3	83.2	75.8
Female	14.0	90.4	94.3	94.1	89.4	82.3	77.9
Total	14.0	89.8	94.2	93.6	89.8	82.7	76.8
Source: Statistics Canada, 2019 Longitudinal Immigration Database.

C. Previous analysis

Since its creation, the IMDB has been used to produce several analyses. The following is a summary of some Statistics Canada studies that have made use of the IMDB.

In recent years, several releases in The Daily have featured the IMDB. The subjects discussed include changes in the regional distribution of new immigrants to Canada, income and mobility of immigrants, immigrants in the hinterlands, and immigrants who leave Canada. These articles are accessible via the Statistics Canada website. Papers using the IMDB have been published in the Perspectives on Labour and Income publication series (75-001-X) and the Analytical Studies Branch Paper Series. Among the topics covered were the income of immigrants who pursue postsecondary education in Canada, and the earnings advantage of landed immigrants who were previously temporary residents in Canada.

D. Best practices and tips for analysts

D.1 Programming tips

This section provides programming information for individuals who want to have a better understanding of the programming structure used to access data from IMDB files. Please note that individuals may conduct their own programming. There are two types of IMDB files—the yearly IMDB data files and the immigration data (for more details on IMDB files, refer to Section 3). IMDB tax variables are identified with a variable name that consists of three parts: (1) the acronym name as described in the IMDB tax data dictionary, (2) the aggregate level (I or F), and (3) the year (the four-digit year extension exists in most, but not all, cases).

Example: The interest and investment income at the individual level for 2014 would be named INVI_I2014.
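
The three-part naming pattern can be sketched in Python as follows. This is only an illustration of the pattern shown in the example above: the function name is hypothetical, the underscore separator is inferred from INVI_I2014, and the names listed in the IMDB tax data dictionary take precedence wherever they differ.

```python
from typing import Optional


def imdb_tax_variable(acronym: str, level: str, year: Optional[int] = None) -> str:
    """Assemble a variable name such as INVI_I2014 from its three parts.

    Hypothetical helper: the separator is assumed from the example in the text.
    """
    if level not in ("I", "F"):
        raise ValueError("level must be 'I' (individual) or 'F' (family)")
    # Some variables carry no year extension, so the year is optional.
    suffix = "" if year is None else f"{year:04d}"
    return f"{acronym}_{level}{suffix}"


print(imdb_tax_variable("INVI", "I", 2014))
```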

Observations in the IMDB files are sorted by the variable IMDB_ID (note that there is no year extension for this variable), which enables users to maintain a link across years. Data are accessed with the SAS programming language. A sample SAS program designed to access IMDB data is provided below. The sample performs the following task:

“retrieving the number of Social Assistance (SA) recipients for immigrants who landed between 2007 and 2012, living in Ontario between 2015 and 2017, and did not have any earnings appearing on their T4 slips by sex and year (2015 to 2017)”

Researchers who are new to the IMDB are encouraged to go through this sample SAS program. There are generally three components in the sample.

Library set-up: The library assignments on the first two lines give the locations of the input files (first line) and the output files (second line).
Steps to generate a working dataset:
1. The input files are stored in SAS format and can therefore be accessed with a SET or MERGE statement.
2. The program retrieves the number of Social Assistance (SA) recipients among immigrants who:
  1. landed at any time from 2007 to 2012
  2. lived in Ontario from 2015 to 2017
  3. did not have any earnings on their T4 slips
    It then generates the number of SA recipients by sex and year (in this case, 2015 to 2017).
Production of the number of SA recipients: The part that starts with “proc freq” produces the numbers of interest as specified in the task above. At the end of the program, four tables are created from the output data file.

It is generally recommended that programs use the variables available in the PNRF rather than the yearly tax files for consistency. For example, the sample program uses the variable GENDER, a variable found in the PNRF, rather than SXCO_I&YEAR, the variable found in the yearly IMDB_T1FF. In this program, only individuals who have filed every year from 2015 to 2017 are selected.

When programming in SAS, one should keep in mind the distinction between missing values and zeros in numeric fields. With SAS, most mathematical operations performed with missing values will return missing values. In IMDB, in years that an individual is present, numeric variables not relevant to that individual have a value of “0” (zero). For example, if a person without a spouse filed in 2015, the value for RRSPSI2015 (contributions to a spouse’s RRSP) should be “0” (zero). If that individual did not file in 2015, the value will be missing.
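
The distinction between a missing value and a zero can be illustrated with a short Python sketch. Here a floating-point NaN stands in for a SAS missing value; this is an analogy, not the SAS implementation, and the function name and status labels are hypothetical.

```python
import math

# NaN is used here as a stand-in for a SAS missing value (an assumption for illustration).
MISSING = float("nan")


def filing_status(value: float) -> str:
    """Interpret one yearly numeric field for one person."""
    if math.isnan(value):
        return "did not file"            # no tax file for that year
    if value == 0:
        return "filed, zero or not relevant"
    return "filed, amount reported"


# Arithmetic that involves a missing value stays missing.
total = 1500.0 + MISSING

print(filing_status(0.0), "|", filing_status(MISSING), "|", math.isnan(total))
```

Treating a missing value as zero would count non-filers as filers with no income, so the two cases are best kept separate until the analysis population is fixed.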

Sample IMDB program



* Sample SAS program using the IMDB;

libname source1 'FILEFOLDER1'; * location of IMDB files;
libname out 'FILEFOLDER2';     * user directory;

* The objective of this sample program is to use the IMDB to retrieve the number of
  Social Assistance (SA) recipients in Ontario who did not have any earnings
  appearing on their T4 slips, by sex and year (in this case, 2015 to 2017).
  Data for provinces and earnings are from the yearly IMDB files, whereas the sex
  variable is from the PNRF_1980_2019;

* The first step is to create a data file containing all the information needed to
  produce the tables. This data file is called SAOnt and is saved in the out
  directory. The Longitudinal Identifier Number (IMDB_ID) is used to merge the
  annual IMDB datasets;

data out.SAOnt;
  merge
    source1.imdb_t1ff_2015(where=(prco_i2015 = 5 and outlier_ind2015 = 0) in=a
      keep=imdb_id prco_i2015 saspyf2015 t4e__i2015 outlier_ind2015)
    source1.imdb_t1ff_2016(where=(prco_i2016 = 5 and outlier_ind2016 = 0) in=b
      keep=imdb_id prco_i2016 saspyf2016 t4e__i2016 outlier_ind2016)
    source1.imdb_t1ff_2017(where=(prco_i2017 = 5 and outlier_ind2017 = 0) in=c
      keep=imdb_id prco_i2017 saspyf2017 t4e__i2017 outlier_ind2017)
    source1.pnrf_1980_2019(keep=imdb_id gender landing_year immigration_category);
  by imdb_id;

  * A person must be a taxfiler in all three years, not be flagged as an outlier,
    and have landed between 2007 and 2012 (population of interest);
  if a and b and c and (landing_year >= 2007 and landing_year <= 2012);

  * A flag variable identifies the SA recipients for each year. The result is
    three variables, flag_sa2015, flag_sa2016 and flag_sa2017, each taking a
    value of either 1 or 0;
  if (t4e__i2015 = 0 and saspyf2015 > 0) then flag_sa2015 = 1;
  else flag_sa2015 = 0;
  if (t4e__i2016 = 0 and saspyf2016 > 0) then flag_sa2016 = 1;
  else flag_sa2016 = 0;
  if (t4e__i2017 = 0 and saspyf2017 > 0) then flag_sa2017 = 1;
  else flag_sa2017 = 0;
run;

* The SAS FREQ procedure is used to produce the tables. Confidentiality
  guidelines must also be respected;
proc freq data = out.SAOnt;
  tables immigration_category*flag_sa2015*flag_sa2016*flag_sa2017
         gender*flag_sa2015*flag_sa2016*flag_sa2017 / missing;
run;

* End of the sample program;
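
For readers who find it easier to follow the selection logic outside SAS, the Python sketch below applies the same population rule and flag rule to three fictitious records. The record layout and all values are invented for illustration, and the Ontario and outlier filters applied in the SAS program are left out to keep the sketch short.

```python
YEARS = (2015, 2016, 2017)

# Fictitious merged records; None means no tax file for that year.
people = [
    {"imdb_id": "A1", "landing_year": 2008,
     "t4e": {2015: 0, 2016: 0, 2017: 12000},
     "sa": {2015: 8000, 2016: 7500, 2017: 0}},
    {"imdb_id": "A2", "landing_year": 2010,
     "t4e": {2015: 0, 2016: None, 2017: 0},
     "sa": {2015: 5000, 2016: None, 2017: 4000}},
    {"imdb_id": "A3", "landing_year": 2005,
     "t4e": {2015: 0, 2016: 0, 2017: 0},
     "sa": {2015: 6000, 2016: 6000, 2017: 6000}},
]


def in_population(person):
    """Filed in all three years and landed from 2007 to 2012."""
    filed_all = all(person["t4e"][y] is not None for y in YEARS)
    return filed_all and 2007 <= person["landing_year"] <= 2012


def sa_flags(person):
    """1 when T4 earnings are zero and social assistance is positive, else 0."""
    return {y: int(person["t4e"][y] == 0 and person["sa"][y] > 0) for y in YEARS}


selected = {p["imdb_id"]: sa_flags(p) for p in people if in_population(p)}
print(selected)
```

Only A1 remains: A2 has no tax file for 2016 and A3 landed before 2007.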

D.2 Creating a cohort

Prior to starting an analysis, the cohort of interest needs to be defined. The cohort can be restricted by landing year, geography, or any other variable of interest (e.g., admission category or gender) according to the researcher’s need. A clearly defined single cohort should be followed to allow comparability. For example, a researcher might be interested in women who landed in 2000 and who lived in a family that received social assistance in 2001 (Table 17). A study question regarding this cohort could be “What proportion of this cohort received social assistance in the following two years (2002 and 2003)?” It is worth noting that the Canada Revenue Agency (CRA) requires the spouse with the higher net income to report the social assistance payment. As a result, measurement on social assistance (SASPY_F), even for individuals, is best reported with the family-level information.

Table 17
Example - Women who landed in 2000 and received social assistance (SASPY_F) in 2001
IMDB_ID	Landing year	Gender	SASPY_F2001 (dollars)	SASPY_F2002 (dollars)	SASPY_F2003 (dollars)
IM583	2000	Female	20,500	19,000	14,000
IM145	2000	Female	3,000	0	0
IM548	2000	Female	11,500	13,800	0
IM798	2000	Female	16,000	18,000	8,000
IM961	2000	Female	10,000	0	0
IM967	2000	Female	9,500	0	0
IM110	2000	Female	5,000	2,000	1,000
IM125	2000	Female	1,000	0	200
Source: Statistics Canada, example from Longitudinal Immigration Database (IMDB).
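
The study question posed for this cohort can be answered from the example rows of Table 17. The Python sketch below is one possible way to do it; the data structure and function name are illustrative only, and the amounts are the fictitious values from the table.

```python
# SASPY_F amounts in dollars for 2001, 2002 and 2003, copied from Table 17.
cohort = {
    "IM583": (20500, 19000, 14000),
    "IM145": (3000, 0, 0),
    "IM548": (11500, 13800, 0),
    "IM798": (16000, 18000, 8000),
    "IM961": (10000, 0, 0),
    "IM967": (9500, 0, 0),
    "IM110": (5000, 2000, 1000),
    "IM125": (1000, 0, 200),
}


def share_receiving(year_index):
    """Share of the fixed 2001 cohort with a positive amount in the given year."""
    recipients = sum(1 for amounts in cohort.values() if amounts[year_index] > 0)
    return recipients / len(cohort)


share_2002 = share_receiving(1)
share_2003 = share_receiving(2)
share_both = sum(1 for a in cohort.values() if a[1] > 0 and a[2] > 0) / len(cohort)

print(share_2002, share_2003, share_both)
```

The denominator stays at the eight members of the cohort in every year, which is what keeps the yearly shares comparable.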

D.3 Calculating retention rates

A key strength of the IMDB is the presence of geographic variables that allow for the study of mobility and retention. No other dataset contains a comparable level of detail on taxfilers annually, especially when it comes to smaller geographies. Having annual provincial, census division (CD), census metropolitan area (CMA), census agglomeration (CA), census subdivision level (CSD), and census tract level updates allows for a broad range of analyses.

Individual mobility trajectories can be studied simply by flagging changes in postal codes, and mobility trends can be calculated by studying relocations at specific levels of geography. For example, CSD-level mobility (year-to-year changes in CSD) and provincial mobility (year-to-year changes in province) significantly vary by a number of immigrant characteristics, such as age and admission category. These geographies are derived from the postal code (IMDB variable PSCO at the individual and family levels). The postal code is a six-character alphanumeric code that locates the point of delivery of mail addressed to post office customers in Canada. See Section 3.4.1 for a description of the geography variables.

In the example below (Table 18), the researcher is interested in mobility until 2002. IM798, IM961, IM967 and IM110 could be excluded from the mobility study because data (or files) are missing.

Table 18
Example - Mobility until 2002 of immigrants who landed in 2000
IMDB_ID	Landing year	Destination province	PRCO 2000	PRCO 2001	PRCO 2002
IM583	2000	B.C.	B.C.	B.C.	B.C.
IM145	2000	Alta.	Alta.	Sask.	Sask.
IM548	2000	Alta.	Ont.	Ont.	Ont.
IM798	2000	Ont.	..	Ont.	Ont.
IM961	2000	N.B.	N.B.	N.B.	..
IM967	2000	Ont.	..	Alta.	Ont.
IM110	2000	..	Que.	..	Que.
.. not available for a specific reference period
Note: PRCO is province of residence.
Source: Statistics Canada, example from Longitudinal Immigration Database (IMDB).
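
A minimal Python sketch of the exclusion and of a year-to-year mobility flag, using the PRCO values from Table 18, might look like this. The variable names are illustrative, None stands for a value that is not available, and the comparison is between years of residence only (the destination province is not used).

```python
# PRCO for 2000, 2001 and 2002 from Table 18; None marks a value that is not available.
prco = {
    "IM583": ("B.C.", "B.C.", "B.C."),
    "IM145": ("Alta.", "Sask.", "Sask."),
    "IM548": ("Ont.", "Ont.", "Ont."),
    "IM798": (None, "Ont.", "Ont."),
    "IM961": ("N.B.", "N.B.", None),
    "IM967": (None, "Alta.", "Ont."),
    "IM110": ("Que.", None, "Que."),
}

# Keep only records with a province of residence in every year.
complete = {k: v for k, v in prco.items() if None not in v}

# A person moved if the province changed between any two consecutive years.
moved = {k: any(a != b for a, b in zip(v, v[1:])) for k, v in complete.items()}

print(sorted(complete), moved)
```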

While mobility, at the individual level, is fairly straightforward, retention of immigrants in a jurisdiction can be calculated in several ways. How retention is calculated is an analytical decision based on the individual researcher’s particular needs. The number of individuals retained is fairly straightforward to define: it is the number of individuals filing taxes in the jurisdiction of interest at a given time. A decision has to be made about what constitutes the initial admission cohort against which retention is calculated (the denominator in the retention rate).

The retention rate can be measured as the proportion of immigrant taxfilers who reside in the province where they landed (defined as the province of intended destination) at a given time. For a given cohort (e.g., landing year) and a given tax year (or years since admission), the denominator is the number of taxfilers with the selected province of admission. The numerator is the number of taxfilers with the selected province of admission who are also residing in that province.

To compute retention rates three years after admission for the 2011 cohort, a researcher would prepare a table with all provinces of admission (i.e., the province of intended destination), all provinces of residence, landing year = 2011, and reference year = 2014. The table would look as follows:

Table 19
Province of residence in 2014 and province of landing, 2011 cohort
Province of landing	Province of residence (number of immigrants)
	Total province of residence	Newfoundland and Labrador	Prince Edward Island	Nova Scotia	New Brunswick	Quebec	Ontario	Manitoba	Saskatchewan	Alberta	British Columbia	Other residence
Total province of landing	174,740	405	330	1,365	880	31,505	70,590	9,698	6,120	26,965	26,390	500
Newfoundland and Labrador	515	325	0	5	0	5	75	5	0	60	30	0
Prince Edward Island	1,245	0	265	25	10	30	560	0	0	50	295	0
Nova Scotia	1,460	10	5	1,080	10	25	185	0	5	90	30	10
New Brunswick	1,340	0	10	35	750	55	275	0	10	80	120	0
Quebec	36,275	10	10	35	15	30,200	3,255	40	75	1,190	1,400	45
Ontario	69,135	35	25	115	70	875	63,145	275	335	2,815	1,325	115
Manitoba	11,190	0	0	15	0	55	645	9,170	80	825	380	10
Saskatchewan	6,360	0	0	0	0	20	295	45	5,370	445	165	10
Alberta	21,940	10	0	20	0	95	810	65	140	20,170	590	35
British Columbia	25,000	5	0	30	5	140	1,330	85	100	1,200	22,030	70
Other	280	0	0	0	0	0	15	0	0	35	20	200
Source: Statistics Canada, 2014 Longitudinal Immigration Database (table 43-10-0035-01)

Results for Nova Scotia illustrate the calculation. A total of 1,460 individuals landed in Nova Scotia in 2011 and filed taxes in 2014. Of those, 1,080 had Nova Scotia as their province of residence in 2014. Nova Scotia’s three-year retention rate would be 1,080/1,460, or about 74%. Table 19 also provides information on secondary migrants: 1,365 individuals who landed in 2011 resided in Nova Scotia in 2014, of whom 1,080 intended to land in Nova Scotia and 285 had a destination province other than Nova Scotia.

The above definition of retention assumes that the number of taxfilers with the specific province of intended destination is the total population that can be retained in a year (i.e., if all 1,460 individuals who had intended to land in Nova Scotia had filed taxes there in 2014, the province would have 100% retention). This method does not take into account late or sporadic tax filing behaviour, or emigrants who left Canada and for whom no tax file was available in 2014.
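
The Nova Scotia arithmetic can be reproduced with a few lines of Python. The variable names are illustrative; the three counts are the Nova Scotia figures from Table 19.

```python
landed_ns_filers = 1460  # landed in Nova Scotia in 2011 and filed taxes in 2014
stayed_ns = 1080         # of those, residing in Nova Scotia in 2014
residing_ns = 1365       # all 2011 landings residing in Nova Scotia in 2014

retention_rate = stayed_ns / landed_ns_filers
secondary_migrants = residing_ns - stayed_ns

print(f"retention: {retention_rate:.0%}, secondary migrants: {secondary_migrants}")
```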

One alternative is a purely longitudinal approach, where a single admission cohort is selected (according to the province of intended destination, the province of initial tax filing, or both), and the retention rate is calculated as the proportion of this cohort that is still filing taxes in the province. When the province of initial tax filing is used to define the admission cohort, it is recommended that the first tax file occur in the year the immigrants were admitted (landing year = tax year), to exclude individuals who may have first arrived elsewhere and subsequently migrated to the region before filing taxes for the first time. A further restriction can be made if a researcher is interested in the population whose destination geography matches the geography of the first tax file.

Given that a portion of each annual cohort does not file taxes for the year of admission, it may be necessary to increase the population size for a region by defining the admission cohort as anyone who first filed taxes in the region within two years of admission (i.e., first_tax_year = landing_year or landing_year+1). Allowing individuals whose first tax filing occurred several years after admission to be part of an “admission cohort” is not recommended, as it is possible that they first landed elsewhere but did not file taxes. It is also a good idea to exclude intermittent filers from these analyses, as their place of residence is unknown in the years for which there is no tax data. Retention calculated this way will show a gradual decline in numbers; this decline is due to immigrants who stop filing, out-migration, and death.
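
The two rules in this paragraph (first filing in year 0 or year 1, and no gaps in filing afterwards) could be expressed as in the Python sketch below. Both function names and the representation of filing history as a set of years are assumptions made for illustration.

```python
def in_admission_cohort(landing_year, first_tax_year):
    """True when the first tax file is for the landing year or the year after."""
    return first_tax_year in (landing_year, landing_year + 1)


def is_intermittent(filed_years, first_tax_year, last_year):
    """True when at least one year between first filing and last_year has no tax file."""
    expected = set(range(first_tax_year, last_year + 1))
    return not expected.issubset(filed_years)


print(in_admission_cohort(2011, 2012), is_intermittent({2011, 2013}, 2011, 2013))
```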

If researchers are interested in secondary migrants to a region, this can be found by removing individuals in the defined admission cohort from the total number of immigrants filing taxes in the region at the time of interest. Again, however, these analyses should be restricted to individuals who first filed taxes within the same time period (year 0 or year 1) to avoid mistaking late-filers for in-migrants. If the admission cohort is restricted to immigrants whose destination geography matches the geography of first tax filing, a subsequent distinction should be made between secondary migrants who first filed elsewhere (and subsequently filed in the region of interest) and immigrants who first filed in the region of interest but were subsequently recruited by other jurisdictions (or information on their intended destination is missing altogether).

The following table presents an example of a longitudinal approach to provincial retention using fictitious data, with various definitions of the initial admission cohort.

Table 20
Number of immigrant tax filers within the specified population residing in British Columbia and associated retention rate, by years since landing
Years since landing	Taxfilers who first filed taxes in B.C. in year 0 (number)	Retention rate (percent)	Taxfilers who first filed taxes in B.C. in year 0 or 1 (number)	Retention rate (percent)	Taxfilers who first filed taxes in B.C. in year 0 or 1 and province of intended destination was B.C. (number)	Retention rate (percent)
0	20,000	100	20,000	...	17,500	...
1	18,000	90	25,000	100	19,000	100
2	17,000	85	23,000	92	18,000	95
3	16,500	83	22,000	88	17,500	92
... not applicable
Source: Statistics Canada, 2014 Longitudinal Immigration Database.

In the above example, retention in British Columbia can be calculated according to three definitions of the population, and the three-year retention rate varies per the definition adhered to. Importantly, all individuals in the sample filed taxes at each point in time.
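
The retention rates in Table 20 can be recomputed from the counts, as in the Python sketch below. It assumes that each rate is the count divided by the count in the first year for which the cohort is fully defined (year 0 for the first definition, year 1 for the other two) and that percentages are rounded half up; the dictionary keys are shorthand labels, not official names.

```python
# Counts by years since landing, from Table 20 (fictitious data).
# For the last two definitions the cohort is complete only from year 1 onward.
counts = {
    "year 0": {0: 20000, 1: 18000, 2: 17000, 3: 16500},
    "year 0 or 1": {1: 25000, 2: 23000, 3: 22000},
    "year 0 or 1, destination B.C.": {1: 19000, 2: 18000, 3: 17500},
}


def retention(series):
    """Percentage of the base-year count still filing in B.C., rounded half up."""
    base = series[min(series)]
    return {year: int(100 * n / base + 0.5) for year, n in series.items()}


for label, series in counts.items():
    print(label, retention(series))
```

The three-year rate is 83%, 88% or 92% depending on the definition, which is the variation the paragraph above describes.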

With the 2019 IMDB release, a mobility summary table is available on the Statistics Canada website. The measures for mobility compare the intended destination from immigration files to the province of residence obtained from tax files. For example, Table 21 provides the mobility measures based on the differences between the intended province of destination for immigrants admitted in 2010 and their province of residence in 2015 according to their tax files.

Table 21
Mobility measures for 2010 cohort by province, 2015 tax year
	Total destination (a)	Total residence (b)	Out migration (c)	In migration (d)	Stayed in province (e=a-c)	Population growth rate (f=b/a-1)	Retention rate (g=e/a)	Out migration rate (h=1-e/a)	In migration rate (i=d/a)
Canada	200,600	200,600	27,260	27,260	173,340	0.0	86.4	13.6	13.6
Newfoundland and Labrador	525	410	245	130	280	-21.9	53.3	46.7	24.8
Prince Edward Island	1,930	370	1,630	70	305	-80.8	15.8	84.5	3.6
Nova Scotia	1,630	1,405	570	340	1,065	-13.8	65.3	35.0	20.9
New Brunswick	1,535	920	795	180	740	-40.1	48.2	51.8	11.7
Quebec	38,050	33,900	5,955	1,805	32,095	-10.9	84.4	15.7	4.7
Ontario	83,355	84,965	7,725	9,335	75,630	1.9	90.7	9.3	11.2
Manitoba	11,475	9,785	2,420	730	9,055	-14.7	78.9	21.1	6.4
Saskatchewan	5,620	5,410	1,220	1,015	4,400	-3.7	78.3	21.7	18.1
Alberta	24,255	29,850	2,360	7,955	21,895	23.1	90.3	9.7	32.8
British Columbia	31,820	32,790	4,250	5,215	27,575	3.1	86.7	13.4	16.4
Other	405	420	95	115	305	3.7	75.3	23.5	28.4
Not stated	..	365	..	365	..	0.0	0.0	0.0	0.0
.. not available
Source: Statistics Canada, 2015 Longitudinal Immigration Database (IMDB).

The new table provides the following measures of mobility:

The total destination (column a) represents the number of immigrants admitted in 2010 and filing taxes in year 2015, in Canada;
The total residence (column b) represents the number of immigrant taxfilers in 2015 in the province specified;
The out migration (column c) represents the number of immigrant taxfilers originating from the specified province and filing tax in another province, in year 2015;
The in migration (column d) represents the number of immigrants originating from a different province of destination and filing tax in the specified province in year 2015;
Stayed in province (column e) represents the number of immigrant taxfilers continuing their residence in the province of destination, in year 2015;
The population growth rate (column f) represents the percentage of immigrant taxfilers gained or lost by the specified province. This takes into account immigrants migrating out of and migrating into the specified province;
The retention rate (column g) represents the percentage of immigrant taxfilers continuing their residence in the province of destination, in year 2015. This does not take into account immigrants migrating in from another province of destination;
The out migration rate (column h) represents the percentage of immigrant taxfilers originating from the specified province and filing tax in another province in year 2015;
The in migration rate (column i) represents the percentage of immigrants originating from a different province of destination and filing tax in the specified province in year 2015.

Table 21 shows that 200,600 immigrants were admitted to Canada in 2010 and filed taxes in 2015.

Of the 83,355 immigrant taxfilers who intended to reside in Ontario, 75,630 remained there in 2015, representing a retention rate of 90.7%.

While 7,725 immigrant taxfilers migrated out of Ontario, 9,335 immigrant taxfilers had moved into Ontario from other destination provinces. So, for this 2010 cohort, the total number of Ontario residents in 2015 was 84,965, or 1.9% more than the number of immigrant taxfilers who intended to reside in Ontario.
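
The column formulas of Table 21 can be checked against the Ontario row with the Python sketch below. The function and key names are illustrative, and the rates are expressed in percent. Because the published counts are rounded, values recomputed this way can differ slightly from the published ones for small provinces.

```python
def mobility_measures(a, b, c, d):
    """Apply the Table 21 column formulas; rates are returned in percent."""
    e = a - c  # stayed in province
    return {
        "stayed": e,
        "growth_rate": round(100 * (b / a - 1), 1),
        "retention_rate": round(100 * e / a, 1),
        "out_migration_rate": round(100 * (1 - e / a), 1),
        "in_migration_rate": round(100 * d / a, 1),
    }


# Ontario, 2010 cohort, 2015 tax year (columns a to d of Table 21).
ontario = mobility_measures(a=83355, b=84965, c=7725, d=9335)
print(ontario)
```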

Finally, analysts should use caution when studying low-level census geographies over a long period of time, as CA and CMA boundaries change and CSDs are dropped and added. If possible, analysts should run the Postal Code Conversion File (PCCF+) program to standardize postal codes to a constant census geography.

D.4 Calculating income trajectories over time

As is the case with retention, calculating year-to-year changes in wages, salaries and commissions earnings (or, for that matter, any economic variable) requires consecutive information. For example, if a researcher wants to compare the median wages, salaries and commissions earnings of the 2000 cohort of women aged 24 to 54, 1 year after admission and 5 years since admission (Table 22), records with missing T1FF files could be removed from the analysis. The decision to remove these records would be based on the desire to evaluate the cohort’s median income versus the cohort filer’s median income.

Table 22
Median employment earnings of the 2000 cohort of women aged 24 to 54, 1 year after landing and 5 years since landing
IMDB_ID	Landing year	Age at landing	Gender	Wages income 2001 (dollars)	Wages income 2005 (dollars)
IM583	2000	34	Female	20,500	49,000
IM145	2000	53	Female	..	56,000
IM548	2000	29	Female	11,500	33,800
IM798	2000	31	Female	36,000	0
IM961	2000	42	Female	10,000	..
IM967	2000	40	Female	..	..
IM110	2000	35	Female	0	59,000
.. not available for a specific reference period
Source: Statistics Canada, example from Longitudinal Immigration Database (IMDB).
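
The effect of removing records with missing tax files can be seen by computing the medians both ways from the example rows of Table 22. The Python sketch below is illustrative only; None stands for a value that is not available, and zeros are kept because they are valid reported amounts.

```python
from statistics import median

# Wages for 2001 and 2005 from Table 22; None marks a missing tax file.
wages = {
    "IM583": (20500, 49000),
    "IM145": (None, 56000),
    "IM548": (11500, 33800),
    "IM798": (36000, 0),
    "IM961": (10000, None),
    "IM967": (None, None),
    "IM110": (0, 59000),
}


def median_available(index):
    """Median over everyone with a tax file in that year."""
    return median(v[index] for v in wages.values() if v[index] is not None)


# Same people in both years: only records with both tax files.
both_years = [v for v in wages.values() if None not in v]
median_2001_both = median(v[0] for v in both_years)
median_2005_both = median(v[1] for v in both_years)

print(median_available(0), median_available(1), median_2001_both, median_2005_both)
```

With all available filers the medians are 11,500 and 49,000 dollars; restricted to the four people who filed in both years they are 16,000 and 41,400 dollars, so the choice of population changes the measured trajectory.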

Use caution when calculating the “first year in Canada” income, as it might not represent a full year of taxation. For example, someone who landed in November of 2013 and filed taxes for 2013 would have only two months of income in 2013. A best practice is to use the first full year of income (landing year +1, see Table 20). One exception is pre-filers: those who filed taxes in Canada before admission and also filed in the landing year are most likely reporting income for the entire year.

Income over time should also be studied in constant dollars. Consequently, Consumer Price Index (CPI) adjustments should be made (Appendix D.7). This adjustment is made in the IMDB tables.

D.5 Rounding data

Respecting the privacy of Canadians is important to Statistics Canada. Consequently, any tables produced from IMDB_T1FF files are subject to rounding. The purpose of rounding is to ensure that no small cells are released that may reveal information on specific individuals or small groups of individuals. In general, the rounding macros take an unrounded input dataset of various statistics (counts, means, medians, etc.) and output a rounded dataset.

The rounding rules are available to all researchers accessing the microdata in the Research Data Centres (RDC).

D.6 Identifying outliers

The variable OUTLIER_IND was created to identify outliers within the T1FF (see Section 5.5). It should be used to remove outlier data from any calculation (e.g., mean, median, or regression) employing tax data. Outliers differ from one year to another, meaning that a person’s data may be identified as an outlier for a given year but not for a subsequent year.
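
As a simple illustration, the Python sketch below drops flagged records before computing a mean. The records are fictitious, the field names are hypothetical, and a flag value of 0 is taken to mean "not an outlier", consistent with the condition outlier_ind2015=0 used in the sample SAS program.

```python
# Fictitious records for one tax year; outlier_ind = 0 is assumed to mean "not an outlier".
records = [
    {"imdb_id": "X1", "income": 30000, "outlier_ind": 0},
    {"imdb_id": "X2", "income": 50000, "outlier_ind": 0},
    {"imdb_id": "X3", "income": 9000000, "outlier_ind": 1},
]

# Because outliers differ by year, the filter is applied to each tax year separately.
kept = [r["income"] for r in records if r["outlier_ind"] == 0]
mean_income = sum(kept) / len(kept)

print(mean_income)
```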

The following table (Table 23) gives the distribution of the outliers in the tax files for 1982 and subsequent years by type of resident for the 2019 IMDB. In each tax year, about 0.05% of records or fewer were identified as outliers. The proportion of outliers increased from 1996 to 1997 as a result of updates to the outlier detection method applied to tax files for 1997 and subsequent taxation years.

Table 23
Distribution of outliers by tax year
Tax year	Total (number)	Total (percent)
1982	490	0.03%
1983	390	0.02%
1984	560	0.03%
1985	500	0.02%
1986	460	0.02%
1987	590	0.03%
1988	950	0.04%
1989	900	0.03%
1990	710	0.03%
1991	790	0.03%
1992	990	0.03%
1993	850	0.03%
1994	490	0.01%
1995	730	0.02%
1996	860	0.02%
1997	1,220	0.03%
1998	1,770	0.05%
1999	1,130	0.03%
2000	1,330	0.03%
2001	1,270	0.03%
2002	1,350	0.03%
2003	1,550	0.03%
2004	1,260	0.03%
2005	1,570	0.03%
2006	1,830	0.04%
2007	1,740	0.03%
2008	1,820	0.03%
2009	2,120	0.04%
2010	1,810	0.03%
2011	2,070	0.03%
2012	1,660	0.03%
2013	1,730	0.03%
2014	1,730	0.03%
2015	1,840	0.03%
2016	1,740	0.02%
2017	1,780	0.02%
2018	2,800	0.04%
Source: Statistics Canada, 2019 Longitudinal Immigration Database.

D.7 Adjusting income for the Consumer Price Index (CPI)

To take the cost of living into account, all incomes should be adjusted using the Consumer Price Index (CPI) for Canada. “The Consumer Price Index (CPI) is an indicator of changes in consumer prices experienced by Canadians. It is obtained by comparing, over time, the cost of a fixed basket of goods and services purchased by consumers. Since the basket contains goods and services of unchanging or equivalent quantity and quality, the index reflects only pure price change.”^Note The adjustment factors for 2018 are available in Table 24. To transform data to constant dollars of a specific reference year, data users need to multiply the dollar values in all years other than the reference year by a year-specific adjustment factor. To obtain the adjustment factor for a given year, data users need to divide the CPI of the reference year by the CPI of that year. In Table 24, the reference year is 2018 (CPI of 133.4).

Table 24
2018 Consumer price index adjustment factors
Year	CPI (the 2018 adjustment factor equals 133.4 divided by this number)
1982	54.9
1983	58.1
1984	60.6
1985	63.0
1986	65.6
1987	68.5
1988	71.2
1989	74.8
1990	78.4
1991	82.8
1992	84.0
1993	85.6
1994	85.7
1995	87.6
1996	88.9
1997	90.4
1998	91.3
1999	92.9
2000	95.4
2001	97.8
2002	100.0
2003	102.8
2004	104.7
2005	107.0
2006	109.1
2007	111.5
2008	114.1
2009	114.4
2010	116.5
2011	119.9
2012	121.7
2013	122.8
2014	125.2
2015	126.6
2016	128.4
2017	130.5
2018	133.4
2019^{Table 24 Note 1}	136.0
Note 1 In 2019, the CPI was 136.0. In order to transform a price from another year into 2019 dollars, one must multiply the specified year’s dollar amount by the inflation ratio, which is the 2019 CPI (136.0) divided by the specified year’s CPI.
Source: Statistics Canada, Table 18-10-0005-01.
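The adjustment described in this section can be illustrated with a short Python sketch. The CPI values are copied from Table 24 (a subset of years); the function names and the example amount are hypothetical and are not part of any IMDB macro.

```python
# CPI values for Canada, copied from Table 24 (subset of years).
CPI = {2000: 95.4, 2002: 100.0, 2010: 116.5, 2017: 130.5, 2018: 133.4, 2019: 136.0}


def adjustment_factor(year, reference_year, cpi=CPI):
    """CPI of the reference year divided by the CPI of the given year."""
    return cpi[reference_year] / cpi[year]


def to_constant_dollars(amount, year, reference_year, cpi=CPI):
    """Express an amount reported in `year` in dollars of `reference_year`."""
    return amount * adjustment_factor(year, reference_year, cpi)


# Hypothetical example: 59,000 dollars reported for tax year 2000,
# expressed in constant 2018 dollars (factor = 133.4 / 95.4).
value_2018 = to_constant_dollars(59_000, 2000, 2018)
print(round(value_2018, 2))
```

The factor for the reference year itself is 1, so amounts already reported in the reference year are left unchanged.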

D.8 Calculating key income measures

The IMDB tables contain several income measures. Table 25 describes which variables of the T1FF are included in their calculation.

Table 25
Description of the Longitudinal Immigration Database income main measures
Measure	Components	Formula
Wages, salaries and commissions income	Earnings from T4 slips	T4E__i + OEI__i
Self-employment income, since 1988	Self-employment income from business, profession, commission, farm, and fishing; limited partnership income	SEI__i + LTPI_i
Self-employment income, before 1988	Self-employment income from business, profession, commission, farm, and fishing	SEI__i
Investment income	Interest and investment income; dividends; capital gains/losses, net taxable	INVi_i + XDIV_i + CLKGX
Employment Insurance benefits	Employment Insurance benefits	EINS_i
Social welfare benefits	Social welfare benefits (use family-level)	SASPYf
Total income	Sum of all measures described above
Source: Statistics Canada, 2019 Longitudinal Immigration Database

Note that all outliers (OUTLIER_IND = 1) are removed from these calculations, that the variable Province of Residence at the End of the Year (PRCO_) is used to identify the province, and that all incomes are adjusted according to the Consumer Price Index (CPI) of the year of the most recent T1FF available. “Mean with income” is the mean income of immigrant tax-filers with income of the given type. “Median with income” is the median income of immigrant tax-filers with income of the given type.
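The “with income” measures can be sketched as follows in Python. This is an illustration under stated assumptions, not the code used to produce the IMDB tables: the records and the `ei_benefits` field name are hypothetical, amounts are taken to be already CPI-adjusted, and “with income” is assumed here to mean a non-zero amount of the given type.

```python
from statistics import mean, median

# Hypothetical tax-filer records, already in constant dollars.
rows = [
    {"ei_benefits": 0, "OUTLIER_IND": 0},
    {"ei_benefits": 4_000, "OUTLIER_IND": 0},
    {"ei_benefits": 6_000, "OUTLIER_IND": 0},
    {"ei_benefits": 11_000, "OUTLIER_IND": 0},
    {"ei_benefits": 900_000, "OUTLIER_IND": 1},  # flagged outlier
]


def with_income_stats(records, field):
    """Mean and median of `field` among non-outliers with that income.

    Assumption: "with income" means a non-zero amount of the given type.
    """
    values = [
        r[field]
        for r in records
        if r["OUTLIER_IND"] != 1 and r[field] != 0
    ]
    if not values:
        return None  # no tax-filer has income of this type
    return {"count": len(values), "mean": mean(values), "median": median(values)}


stats = with_income_stats(rows, "ei_benefits")
print(stats)
```

These are unrounded statistics; any table produced for release would still be subject to the rounding described in Appendix D.5.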

Date modified: 2021-02-01
