# What can we learn about low-income dynamics in Canada from the Longitudinal Administrative Databank?

## Abstract

Statistics that depict the movements in the bottom end of the income distribution, such as the proportion of low-income persons exiting low income from one year to the next, provide important information for developing policy on poverty and income inequality. Since the mid-1990s, these statistics have been generated using data from the Survey of Labour and Income Dynamics (SLID). The longitudinal component of the SLID was discontinued in 2010. This paper examines new and alternative time series on low-income dynamics that can be created using the Longitudinal Administrative Databank (LAD).

The results suggest that it is feasible to construct several dynamic low-income statistics to track movements in low income using the LAD data. Using a fixed low-income measure (LIM) methodology, the LAD-based estimates of low-income persistence, as well as rates of entry, exit, hazard and survival yielded similar information on low-income dynamics as did estimates from the SLID.

## 1. Introduction

Since the mid-1990s, Statistics Canada has been producing several time series to gauge the dynamics of low income using data from the Survey of Labour and Income Dynamics (SLID). These time series contain information on the entry, exit, and persistence of low income for several groups of individuals at different levels of geography, providing useful information for debates on poverty-alleviation policies.

With the discontinuation of the longitudinal portion of the SLID, statistics on low-income dynamics will no longer be available after reference year 2010. This paper’s main objective is to see if it is feasible to fill this data gap by constructing statistics on low-income dynamics using data from the Longitudinal Administrative Databank (LAD). In particular, we examine whether the LAD and the SLID yield the same dynamic low-income statistics when the same low-income measure methodology is applied to both.

Using tax data to examine low-income dynamics in Canada is not new. The Economic Council of Canada’s 1992 study (Economic Council of Canada, 1992) was one of the earliest attempts to explore low-income transition among families using data based on tax and social assistance records. The first study on low-income dynamics based on the LAD is Laroche (1998), which examined the persistence of low-income spells for the period from 1982 to 1993. This is followed by several studies by Finnie and his co-authors (for example, see Finnie and Sweetman, 2003) assessing poverty dynamics in Canada for the 1992-to-1996 period using data from the LAD. Other authors employed the LAD to examine low-income dynamics in Toronto, Montréal and Vancouver (Frenette, Picot and Sceviour, 2004) or for immigrants (Picot, Hou and Coulombe, 2007).

We build on and extend the above work, but our study differs from it in several ways. First, we follow current international practice to construct the low-income thresholds (Murphy, Zhang and Dionne, 2010). Among other things, these thresholds enable us to identify those in low income relative to a larger population. Second, our work links cross-sectional low-income incidence with entry and exit rates. This enables us to provide a more complete portrait of low-income dynamics. Third, we pay particular attention to how panel attrition and the censoring of low-income spells affect statistics on low-income dynamics.

In the next section, we describe the data, the sample and the methods for the study. Section 3 provides evidence on the annual low-income entry and exit rates and explores how annual low-income incidence is linked with the entry and exit of low income as well as to various persistence and duration measures. In Section 4, we provide evidence on the distribution of non-low-income spells to help understand the trend in re-entry to low income. In Section 5, we investigate the effect of panel attrition. Finally, Section 6 contains a summary and the conclusions.

## 2. Data and methodology

This study employs data from the LAD. The LAD is a subset of the T1 Family File (T1FF) and is constructed by Statistics Canada using information from individual income tax records and other administrative sources. The family concept under the T1FF is a ‘census family’ consisting of married or common-law couples with or without children, or lone parents with at least one child living in the same dwelling.Note 1 In the T1FF, family members are attached to their spouse either by the spouse’s Social Insurance Number (SIN) listed on the tax form or by matching on name, address, age, sex and marital status. The remaining tax filers who have not been matched with other persons are identified as persons not in census families. The T1FF covers all persons from census families and persons not in census families who completed a T1 income tax return, or who received Canada Child Tax Benefits (CCTB). In addition, the T1FF file is augmented by adding records for non-filing spouses and children identified from the CCTB file, the birth file and the historical file. We refer to these individuals in total as the T1FF population.

Each year, a 20% sample of the T1FF is drawn to form the LAD. But the LAD is not a simple random sample of the T1FF population—not all persons from the T1FF have the same probability of being selected; only those with an SIN are eligible to be sampled into the LAD.

This sampling rule ensures individuals can be tracked over time with a reliable identifier. Consequently, the LAD is a representative sample of tax filers and non-filers with an SIN, not of the T1FF population or of the general Canadian population. Nevertheless, since the early 1990s the LAD sample has been able to account for about 75% of the Canadian population. It is mainly the youth segment of the population for which the LAD sample is not representative, because the vast majority of Canadian youth do not file a tax return. For example, in 2010, Canadians 15 years of age and younger accounted for only 0.2% of the LAD sample, while in the official population estimate they accounted for 16.5%.Note 2 However, for Canadians aged 18 or over, the LAD sample accounted for approximately 95% of the official population. As a result of the undercoverage of the youth segment of the population, it is difficult to draw inferences about the whole population using estimates from the LAD. Recognizing this data limitation, we only examine the dynamic low-income statistics for a sub-sample of the LAD: those who were 18 or older in the first year of a panel and who were observed more than once.

We also observe that Canadians’ tax filing behaviour has changed over time. In the early 1980s, only about 60% of Canadians filed income tax returns. The introduction of the Federal Sales Tax Credit in 1986 and of the Goods and Services Tax Credit in 1989 substantially increased the proportion of Canadians filing tax returns. As a result, the T1FF/LAD’s coverage has increased steadily over time to just below 75% in recent years. The filing rate became relatively stable in 1992 when it approached 70%: thus, our study focuses on the years since.Note 3

As a starting point, we use the LAD to construct the LIM thresholds for identifying low income families and individuals. The LIM sets half of the median adjusted income as the low-income threshold for a one-person family.Note 4 Since we only need a cross-sectional sample to calculate the LIM threshold and since the LAD contains information on family size and number of family members with an SIN, an unbiased LIM threshold can be estimated directly from the annual LAD sample with respect to the T1FF population.Note 5,Note 6

Indeed, the aforementioned studies on low-income dynamics all employed cross-sectional data from the LAD to create their own low-income thresholds by following the general LIM methodology.Note 7 For example, focusing on their study sample, Finnie and Sweetman (2003) calculated the half medians of adjusted family income, adjusted by the square root of census family size, for the five years of their study period, 1992 to 1996. They then used the average of these medians as the low-income threshold. We broadly follow this strategy to establish our low-income threshold, but make an important modification for our study.
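The general LIM calculation described above can be sketched in a few lines. This is a minimal illustration rather than the procedure actually run on the LAD: the function names and the sample records are hypothetical, and the sketch assumes the median is taken over persons, each carrying the adjusted income of his or her census family.

```python
import math
from statistics import median


def adjusted_income(family_income, family_size):
    """Family income divided by the square root of family size."""
    return family_income / math.sqrt(family_size)


def lim_threshold(persons, fraction=0.5):
    """Low-income threshold for a one-person family.

    `persons` holds one (family_income, family_size) pair per person, so the
    median is taken over persons (an assumption of this sketch).
    """
    adjusted = [adjusted_income(income, size) for income, size in persons]
    return fraction * median(adjusted)


def is_low_income(family_income, family_size, threshold):
    """True when adjusted family income falls below the threshold."""
    return adjusted_income(family_income, family_size) < threshold


# Hypothetical records: three unattached persons and one four-person family.
sample = [(20000, 1), (45000, 1), (9000, 1)] + [(60000, 4)] * 4
threshold = lim_threshold(sample)
flags = [is_low_income(income, size, threshold) for income, size in sample]
```

Passing `fraction=0.4` or `fraction=0.6` gives alternative lines at 40% or 60% of the median adjusted income.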

One purpose of our study is to compare dynamic low-income statistics derived from the LAD with those derived from the SLID to determine if the two data sources can generate the same trend of low income under the same LIM methodology. To do this, we need to calculate an annual threshold instead of an average threshold over a period of time (Finnie and Sweetman, 2003). We shall also establish fixed LIM thresholds. But instead of fixing the average over a period of time, we fix the threshold at particular years, and adjust the fixed threshold with the all-items Consumer Price Index (CPI). The fixed threshold provides a proxy to the current LICO threshold.Note 8
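To make the fixed-threshold idea concrete, the sketch below carries a base-year threshold forward with a price index. The threshold and index values are invented for illustration and are not actual LIM or CPI figures; the function name is likewise hypothetical.

```python
def fixed_threshold(base_threshold, cpi_base, cpi_year):
    """Carry a base-year threshold to another year using the ratio of CPI values."""
    return base_threshold * cpi_year / cpi_base


# Hypothetical values for illustration only (not actual LIM or CPI figures).
base_1992 = 10000.0
cpi = {1992: 100.0, 1997: 108.0, 2002: 119.0}

thresholds = {
    year: fixed_threshold(base_1992, cpi[1992], value) for year, value in cpi.items()
}
```

Under this construction the threshold keeps the purchasing power of the base year, whereas a variable threshold would be recomputed from each year's income distribution.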

The LAD sample inherits a relatively rich set of demographic and family characteristics from the T1FF, such as age, sex, family type and presence of young children, making it possible as a data source to determine low-income status for each sampled person as well as their family members. It is important to understand that low-income status is normally identified at the family level: thus, we will be able to infer low-income development for certain under-covered population groups when we know the circumstances of their families. For example, the current LAD sample does not enable us to generate statistics to provide direct evidence on low-income entry and exit for children under age 18. But with family characteristics, we would be able to describe the dynamics of low income among parents who have young children. This would provide indirect evidence to help understand child low income dynamics in Canada, since the low-income status of a child is determined by the income of the whole family, not that of the child.

This conceptual difference will make the magnitude of low-income statistics based on the LAD different from that of statistics based on other data sources. Canada’s low-income statistics, including the dynamic statistics, have been based on data from the Survey of Consumer Finances (SCF) and the SLID under three low-income lines: the Low-income Cutoff (LICO), the Market Basket Measure (MBM) and the LIM. Under the LICO and MBM, the economic family is the unit of resource sharing; under the LIM, the household is the unit of resource sharing. Economic family and household are broader family concepts than census family, implying that fewer economies of scale in consumption are accounted for within a census family than within an economic family or a household. Consequently, other things being equal, we expect more persons to be counted as in low income when the census family rather than the economic family or household is assumed as the unit of resource sharing.

However, the difference in magnitudes of the low-income statistics based on the LAD and those based on SLID is less important when the objective of the study is to examine how low-income statistics changed over time. In the dynamic context, what matters most is how low income evolves over time for the same people. As long as the qualitative conclusions remain the same across the data sources, we will be confident that low-income statistics based on the LAD can provide useful information for policy discussions in lieu of SLID.

## 3. Evidence of low-income dynamics from the LAD

### 3.1 Low-income incidence, entry and exit rates

Low-income incidence is often examined in the cross-sectional context, but it can also be linked with the dynamic aspect of low income. In fact, we can decompose the cross-sectional low-income incidence into its dynamic elements: the entry and exit rates of low income between any two periods. By definition, low-income incidence in any period t can be expressed as

$$\text{Incidence}_{t}=\frac{\text{Number of low income persons}_{t-1}+\text{Change in number of low income persons}}{\text{Population}_{t}}$$

where the "Change in number of low income persons" is equal to the difference between the number of persons entering low income in period t and the number of persons exiting low income in period t. If we assume the population stays constant between t-1 and t, the above equation can be rewritten as

$${P}_{t}={P}_{t-1}+\text{Entry rate}_{t}\times \left(1-{P}_{t-1}\right)-\text{Exit rate}_{t}\times {P}_{t-1}$$

where ${P}_{t}$ and ${P}_{t-1}$ are the low-income incidences in years t and t-1. Hence, when the population total changes little between two consecutive years, an increase in the entry rate or a decrease in the exit rate would lead to an increase in low-income incidence; an increase in the exit rate or a decrease in the entry rate would lower it.
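As a worked illustration of this relationship under the constant-population assumption, the sketch below computes next year's incidence from last year's incidence and the one-year entry and exit rates. The rates are hypothetical and the function name is ours.

```python
def next_incidence(prev_incidence, entry_rate, exit_rate):
    """Incidence in year t implied by incidence in t-1 and the one-year rates.

    Assumes a constant population. The entry rate applies to those not in low
    income at t-1, and the exit rate to those in low income at t-1.
    """
    entrants = entry_rate * (1.0 - prev_incidence)
    leavers = exit_rate * prev_incidence
    return prev_incidence + entrants - leavers


# Hypothetical rates: 10% incidence, 5% entry rate, 40% exit rate.
p_t = next_incidence(0.10, entry_rate=0.05, exit_rate=0.40)

# Holding the entry rate fixed, a lower exit rate implies a higher incidence.
p_t_lower_exit = next_incidence(0.10, entry_rate=0.05, exit_rate=0.30)
```

With these numbers, entrants add 4.5 percentage points and leavers remove 4.0, so incidence edges up from 10.0% to 10.5%.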

We employed several low-income thresholds to produce these statistics. One was the variable LIM threshold, which varies according to the income distribution of each year. The others were the fixed LIM thresholds using the 1992, 1997, 2002 and 2007 income distributions, adjusted by the CPI. All the thresholds were estimated assuming the census family as the unit of income sharing. In addition to the standard LIM threshold (half of the median adjusted income), we also estimated low-income incidence at 40% and 60% of the median adjusted income. These different thresholds naturally produce different levels of incidence, but we found that the underlying trends under the different fixed LIM thresholds were the same. We focus on reporting results based on the variable LIM and the 1992 fixed LIM.

Figure 1 shows our estimates for cross-sectional low-income incidence for tax filers and non-filers tracked by the LAD, 18 years old and older, using data from the LAD and the SLID for the years 1992/1993 to 2010.Note 9 We also estimated the incidences for the years before 1993 using data from the SCF and the LAD (these results are not presented in the figure to save space). One thing is clear immediately: for the years before 1993, low-income incidences based on the LAD and the SCF are quite different, both in level and in trend, no matter which LIM threshold is used. In terms of magnitude, the incidence based on the LAD was often higher in the period from 1982 to 1992 than that based on the SCF. Differences in low-income trends were also evident. For example, data from the SCF suggested that the incidence increased during the period from 1987 to 1992 under the 1992 fixed LIM, whereas data from the LAD indicated the opposite direction. In the period from 1982 to 1992, there were no significant data problems known in the SCF; in the same period, there was significant known undercoverage in the LAD. This undercoverage was the most likely cause of the discrepancy in the incidence, suggesting that low-income statistics based on the LAD for the period prior to 1992 might be inappropriate.

Since 1992, however, the estimated low-income incidences from the LAD and those based on data from the SCF/SLID have been remarkably similar in both trends and magnitudes, particularly under the fixed LIM thresholds. With the variable LIM, there exists a long-term upward trend in low-income incidence according to both data sources, suggesting that, relative to the middle, Canadians from the bottom of the income distribution had not advanced much since the early 1990s.Note 10 When compared to the 1992 fixed standard, low-income Canadians experienced a difficult decade until the late 1990s, but their circumstances have improved since then.

Of course, occasional deviations occurred between low-income incidence based on the LAD and incidence based on the SLID. One divergence occurred in the period between 2007 and 2010. Under the variable LIM threshold, the SLID data suggest that low-income incidence increased while the LAD data showed that low-income incidence changed little during this period. Even under the fixed LIM threshold, some short-term deviations also occurred. For example, between 2000 and 2001, the LAD data suggest that low-income incidence increased under the fixed LIM. The cause of this deviation is not clear, but we did see a sudden increase of one percentage point in the share of tax filers among those aged 15 and over, from 86% to 87%, from 2000 to 2001. The share then fell back to 86% in 2002. The increase in the incidence would occur if the majority of the extra tax filers belonged to the low-income population. However, other than occasional deviations such as these, which were probably due to changes in tax filing behaviour, the estimated low-income incidence based on data from the LAD is essentially the same as that based on the SCF and the SLID under the fixed LIM threshold. While there are deviations between the data sources under the variable LIM methodology, the trend stays essentially the same.

Turning now to the estimates of dynamic low-income statistics such as the entry and exit rates, we focus on one-year entry and exit rates here. The entry rate is defined as the percentage of people who fell into low income between t-1 and t, with the condition that these people were not in low income at time t-1. The exit rate shows the percentage of low-income people in the base year (t-1) who were able to escape from low income by year t. Figure 2 illustrates our estimates for the entry and exit rates based on the LAD for the period from 1992 to 2010 and those based on the SLID for the period from 1993 to 2010.Note 11
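The two definitions translate directly into code. The sketch below is illustrative only: it uses hypothetical low-income indicators for ten persons observed in both years, and the function name is ours.

```python
def entry_exit_rates(status_prev, status_curr):
    """One-year entry and exit rates.

    Both arguments are aligned lists of booleans (True = in low income), with
    one element per person observed in both t-1 and t.
    """
    pairs = list(zip(status_prev, status_curr))
    not_low_at_start = [curr for prev, curr in pairs if not prev]
    low_at_start = [curr for prev, curr in pairs if prev]
    nan = float("nan")
    entry = sum(not_low_at_start) / len(not_low_at_start) if not_low_at_start else nan
    exit_ = sum(not c for c in low_at_start) / len(low_at_start) if low_at_start else nan
    return entry, exit_


# Hypothetical panel of ten persons: eight start out of low income, two start in it.
prev = [False] * 8 + [True] * 2
curr = [True] + [False] * 7 + [False, True]
entry_rate, exit_rate = entry_exit_rates(prev, curr)
```

The denominators differ: the entry rate is conditional on not being in low income at t-1, while the exit rate is conditional on being in low income at t-1.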

The estimated entry and exit rates using data from the LAD appeared to have long-term trends similar to those from the SLID. The entry rates from the different data sets followed a long-term downward trend. The conclusions are the same under both the variable LIM and the fixed LIM thresholds, although the decline under the fixed LIM might be somewhat stronger than under the variable LIM, and the level of the statistics with data from the LAD was higher than that from the SLID. Under the variable LIM threshold, the entry rates from 1993 to 1994 were 6.1% based on the LAD, and 5.7% based on data from the SLID. By the period 2009 to 2010, they dropped to 5.2% (LAD) and 4.6% (SLID), respectively. Under the fixed LIM, they started from 6.2% (LAD) and 5.8% (SLID) in the 1993-to-1994 period. By the 2009-to-2010 period, they dropped to 4.1% (LAD) and 3.0% (SLID).

But the exit rates varied more than the entry rates. The exit rates followed a downward trend under the variable LIM no matter which data set was employed (Panel A, Figure 2). However, the magnitude from the LAD was lower than that from the SLID, and the short-term trend sometimes appeared to be different between data sources. The exit rates estimated under the fixed LIM thresholds also had different magnitudes between the LAD and the SLID, but they largely followed the same cyclical trend.

When we combined the cross-sectional low-income incidence with the entry and exit statistics, we saw that the increasing incidence under the variable LIM in the last 20 years was accompanied by a declining exit rate, while the decreasing incidence under the fixed LIM was likely associated with a declining low-income entry rate.

### 3.2 A simple measure of low-income persistence

A very simple measure of low-income persistence is the distribution of years in low income within a given period of time. For example, within a given length of panel—say, four years—what proportion of the population has never been in low income, what proportion was in low income for one, two, three, four or more years? Answers to these questions provide an indication of the extent of low income persistence.
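A minimal sketch of this tabulation follows, using hypothetical four-year histories in which 1 marks a year in low income. The function name and data are illustrative only.

```python
from collections import Counter


def years_in_low_income_distribution(histories):
    """Share of persons by number of years spent in low income.

    `histories` is a non-empty list of equal-length sequences, one per person,
    with 1 (or True) marking a year in low income.
    """
    window = len(histories[0])
    counts = Counter(sum(history) for history in histories)
    n = len(histories)
    return {years: counts.get(years, 0) / n for years in range(window + 1)}


# Hypothetical four-year histories for five persons.
histories = [
    (0, 0, 0, 0),
    (0, 0, 0, 0),
    (1, 0, 0, 0),
    (1, 0, 1, 0),
    (1, 1, 1, 1),
]
distribution = years_in_low_income_distribution(histories)
```

The fourth person spends two non-consecutive years in low income and is counted in the same cell as someone with two consecutive years, since the tabulation only counts years.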

Many authors have used this approach. Statistics Canada has published similar statistics as a measure of persistence of low income. Researchers also examined low income persistence along the same line (see, for example, Morissette and Zhang, 2001). But one issue has rarely been discussed: What is the effect of data censoring due to limited length of observation on low-income status in a typical panel dataset?

All panel data have a starting point and an ending point. If a person is observed in low income in the first year of the panel, he or she could have been in low income in the year immediately before, but their true state is not observable. Similarly, if the person is observed in low income in the last year of the panel, he or she could also be in low income in the year thereafter—but again, we may never observe the true state. The first case is often referred to as left censoring and the second case, right censoring. In this section, we investigate the effect of censoring while examining the persistence of low income.

For this purpose, we calculated the percentages of persons in low income for different numbers of years, and compared these statistics across three types of low-income spells: all spells irrespective of their censoring status, new spells with left-censored spells excluded, and completed spells with both left- and right-censored spells excluded. To facilitate the comparison, we looked at spell distributions within the four-year period in the middle of the six-year panel, using observations from the first and the last years to help classify the three types of low-income spells.Note 12
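One way to picture the three spell types is sketched below. It rests on assumptions of ours rather than on the paper's exact rules: a spell is treated as a run of consecutive low-income years in a six-year history, as left-censored if it is already under way in the first panel year, and as right-censored if it is still under way in the last panel year. The function names are hypothetical.

```python
def classify_spells(history):
    """Return (start, end, left_censored, right_censored) for each run of
    consecutive low-income years in a panel history (True = low income)."""
    spells = []
    start = None
    last = len(history) - 1
    for year, low in enumerate(history):
        if low and start is None:
            start = year  # a run of low-income years begins
        if start is not None and (not low or year == last):
            end = year if low else year - 1
            spells.append((start, end, start == 0, end == last))
            start = None
    return spells


def filter_spells(spells, kind):
    """Select 'all', 'new' (not left-censored) or 'completed' (uncensored) spells."""
    if kind == "all":
        return list(spells)
    if kind == "new":
        return [s for s in spells if not s[2]]
    if kind == "completed":
        return [s for s in spells if not s[2] and not s[3]]
    raise ValueError(f"unknown spell type: {kind}")


# Hypothetical six-year history: low income in years 0, 1 and 3.
history = [True, True, False, True, False, False]
spells = classify_spells(history)
completed = filter_spells(spells, "completed")
```

In this example the first spell is left-censored because its true start is not observed, so only the one-year spell in year 3 counts as both new and completed.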

The results indicate that, under fixed LIM, the distributions of the low-income spells from the LAD are similar to the distributions of the spells from the SLID, and they suggest that low-income persistence has declined in Canada since the late 1990s. Figure 3 illustrates the persistence of low income implied by completed low-income spells under the 1992 fixed LIM.Note 13 Part A of the figure shows the proportion of people never in low income in the middle four-year period for each of the six-year panels of the SLID and the LAD (from the 1993-to-1998 period to the 2005-to-2010 period). It shows that the long-term trend in the resistance to low income was the same according to both data sources: other than a pause between the 1999-to-2004 period and the 2002-to-2007 period, the proportion of people who never fell into low income increased slowly yet steadily over time. According to the SLID, 89.7% of the targeted population never fell into low income between 1994 and 1997; this increased to 93.4% by the 2006-to-2009 period. In the LAD data, the proportion increased from 91.0% to 94.2% over the same time periods.

The increase in the resistance to low income was accompanied by decreases in the proportion falling into low income for one year or more. According to the SLID, 10.3% fell into low income at least once between 1994 and 1997; this dropped to 6.6% in the 2006-to-2009 period. In the LAD data, the proportion decreased from 9.0% to 5.8% over the same period. These decreases are decomposed according to the number of years in low income in parts B and C of Figure 3. Part B shows the proportions falling into low income for one year and two years; Part C shows the proportions falling into low income for three and four years, respectively, during the same periods. The finding that the LAD and the SLID point in the same direction for low-income persistence in Canada at various lengths of duration reinforces the consistency between the two data sources in gauging low-income dynamics when the fixed LIM methodology is employed.

Under the fixed LIM, the distribution of the new low-income spells from the LAD and those from the SLID are also consistent in terms of long-term trend in low-income persistence; however, short-term deviations occurred sometimes. Figure 4 illustrates how low-income persistence evolved when right-censored spells were included.Note 14 Essentially the same observation can be made as for the completed spells in terms of long-term low income persistence, except between the 1999-to-2004 and 2002-to-2007 periods, when data from the LAD and the SLID pointed to certain inconsistencies.

For the all-spell case (i.e., when censoring is not taken into consideration), our results were less consistent between the data sources, especially for the longer durations. For example, the proportion of people falling into low income for four years or longer in the SLID was much higher than that in the LAD for three of the five comparable panels of data. The results were less clear-cut under variable LIM, regardless of how we treat the censoring of the low-income spells. Indeed, data from the LAD and the SLID often pointed to different directions for the evolution of low-income persistence under the variable LIM threshold.

However, the approach for examining low-income dynamics in this section has one limitation: it ignores the consecutiveness of the low-income spells. For example, for the proportion of individuals in low income for three years, the statistics do not tell us how many of them were in low income for three consecutive years and how many in three non-consecutive years. The next section will attempt to overcome the problem by focusing on consecutive low-income spells.

### 3.3 The duration of low-income spells

How long does it take a low-income person to get out of the low-income state? This is a key question in studying the dynamics of low income. The answer depends on the distribution of the low-income spell T. For example, if it has been determined that the low-income spell follows an exponential distribution with the probability density function $f(t)=\lambda {e}^{-\lambda t}$, where λ is the unknown population parameter and t is a particular value of the random variable T, then we expect a low-income spell to last on average 1/λ periods of time with a variance $1/{\lambda }^{2}$. Furthermore, we can also examine how various factors, such as age and gender, may affect the duration by modelling λ as a function of these individual characteristics.

The above parameter λ is referred to as the hazard rate (or function) of the exponential distribution. The hazard rate of a spell is the conditional probability that the spell terminates in the current period, given that it had not terminated by the previous period. In the context of low-income dynamics, it tells us the probability that a low-income person exits the low-income state during any year, given that he or she has not been able to do so in the previous year. Alternatively, the low-income spells can also be characterized by the survival function or survival rate. This function tells us the probability that the low-income spell T will last longer than a particular value $t$. In the exponential example, the cumulative distribution function is $F(t)=1-{e}^{-\lambda t}$. The survival function of the spell is simply $S(t)=1-F(t)={e}^{-\lambda t}$. In general, the hazard rate is the ratio of the probability density function of the random variable T to the survival function. So the duration of the low-income spells can be characterized by either the hazard function or the survival function.
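As a short worked step for the exponential example, writing $h(t)$ for the hazard rate (notation introduced here for illustration), the ratio of the density to the survival function reduces to the constant λ:

$$h(t)=\frac{f(t)}{S(t)}=\frac{\lambda {e}^{-\lambda t}}{{e}^{-\lambda t}}=\lambda$$

Under this particular distribution, then, the chance of leaving low income does not depend on how long the spell has already lasted. A hazard that varies with elapsed duration calls for a different distributional assumption or for a non-parametric estimate.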

Figure 5 presents our estimates of the hazard rates of the new low-income spells using data from the LAD and the SLID for various six-year panel periods.Note 15 These are non-parametric (or life table) estimates for which no specific assumption is needed with respect to the distribution of the low-income spells. We focus only on new spells in order to avoid technical difficulties related to left censoring.Note 16 The figure suggests that, under the fixed LIM threshold, the hazard rates of new low-income spells largely followed the same trend regardless of the data sources employed, particularly during the first few years after the low-income spell started. But in the years further away from the start, different data sources sometimes pointed to different directions.
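A simple discrete-time version of such a non-parametric estimate is sketched below. It is not necessarily the exact life-table procedure used for the figures: in this sketch a right-censored spell counts as at risk through its last observed year and is never counted as an exit, and the spell data are hypothetical.

```python
def life_table(spells, max_duration):
    """Discrete-time hazard and survival estimates for new spells.

    `spells` is a list of (duration_in_years, right_censored) pairs. Returns
    one row (duration, at_risk, exits, hazard, survival) per year of duration.
    """
    rows = []
    survival = 1.0
    nan = float("nan")
    for d in range(1, max_duration + 1):
        at_risk = sum(1 for duration, _ in spells if duration >= d)
        exits = sum(1 for duration, censored in spells if duration == d and not censored)
        hazard = exits / at_risk if at_risk else nan
        # Survival is the running product of (1 - hazard) over durations so far.
        survival = survival * (1.0 - hazard) if at_risk else nan
        rows.append((d, at_risk, exits, hazard, survival))
    return rows


# Hypothetical new low-income spells: (years in low income, still in low income at panel end).
spells = [(1, False)] * 5 + [(2, False)] * 2 + [(2, True), (3, False), (3, True)]
table = life_table(spells, max_duration=3)
```

The at-risk count falls from 10 to 5 to 2 as spells end or are censored, which shows why estimates for later years of a spell rest on fewer observations.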

However, the inconsistency of the estimates based on the SLID was much stronger than that of the estimates based on the LAD, and the SLID-based estimates even contradicted each other at different points in time. For example, in Part B of Figure 5, the SLID-based estimates of hazard for years 3 and 4, and those for years 4 and 5 followed different trends; the LAD-based estimates for these periods evolved following the same pattern. This is likely due to the declining sample size of the continuing low-income spells in the SLID. As time passes, some low-income persons would exit low income and, consequently, the number of surviving low-income persons would drop. Because the sample size of the SLID is much smaller than that of the LAD, the declining number of spells under the SLID might have a much stronger effect on the estimated hazards for years further away from the start of the new spells.Note 17

But the survival rate estimates appear to be less sensitive to data sources and period of time. This is natural: the survival rate is a measure of the cumulative probability, while the hazard rate is an instantaneous probability between two adjacent periods of time. Figure 6 presents our estimates of the survival rate from the LAD and the SLID under the fixed LIM threshold.Note 18 The survival rate estimates from the LAD and the SLID indicated the same weakly declining trend in the probability of continuing to be in low income in Canada.

Similar observations can be made under the variable LIM threshold. In the years near the starting point of the low-income spells, the estimated hazard and survival rates from the two data sources were somewhat different in magnitude, but the underlying trends were the same. The only difference was that the estimated hazard for the 1993-to-1998 period under the LAD was lower than under the SLID; the estimated survival rate for the same period was higher in the LAD than in the SLID. But again, for years further away from the starting points of the low-income spells, the estimates based on the LAD appeared different from those based on the SLID under the variable LIM threshold, likely due to the small sample size of the SLID.

## 4. The re-entry to low income – the duration of non-low-income spells

The rate at which a person falls back into low income, after exiting it previously, is the re-entry rate. It gauges the conditional resistance to low income for those who have previously escaped from it. The re-entry rate differs from the unconditional resistance measure in the transition analysis. In Section 3.1, we examined the low-income entry rate. Denoting the entry rate as δ, $1-\delta$ is the unconditional low-income resistance measure. It is referred to as unconditional because we only know that the person in question was not in low income in periods t and t-1; we do not know his or her status before t-1. In contrast, when we measure the conditional resistance, we focus on a group of persons who were in low income for at least one year in the past and escaped low income subsequently and stayed out of low income for at least one year. The re-entry rate is the probability at which these persons would fall into low income again in the next few years.

Low-income re-entry can be examined by analyzing the duration of non-low-income spells of those who escaped low income. In the illustration below, we first observe a person was in low income sometime in the past (year s). The person then exits low income in year t. He or she is now at risk of falling back into low income at time t + k, where k=1, 2, 3....

In low income in year $s$ → ... → Out of low income in year $t$ → ... → Falls back into low income in year $t+k$

The key question is how long it takes the person to fall back into low income. The distribution of the random variable K, the length of the non-low-income spell, provides the answer. Hence, the conceptual framework we discussed in Section 3.3, including the hazard and the survival rates, also applies to the analysis of the duration of non-low-income spells.
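To illustrate how such non-low-income spells might be pulled from a panel history, the sketch below returns the length K of each run of non-low-income years that starts immediately after a low-income year. It assumes, as our own convention, that a run reaching the end of the panel is right-censored; the function name and the history are hypothetical.

```python
def non_low_income_spells_after_exit(history):
    """Return (length, right_censored) for each run of non-low-income years
    that begins immediately after a year in low income (True = low income)."""
    spells = []
    n = len(history)
    year = 1
    while year < n:
        if history[year - 1] and not history[year]:
            end = year
            while end < n and not history[end]:
                end += 1
            # The run covers years year..end-1; it is censored if it reaches the panel end.
            spells.append((end - year, end == n))
            year = end
        else:
            year += 1
    return spells


# Hypothetical six-year history: exits low income twice.
history = [True, False, False, True, False, False]
re_entry_spells = non_low_income_spells_after_exit(history)
```

Here the first spell ends with re-entry after two years, while the second is still in progress when the panel ends and only gives a lower bound on K.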

Our results suggested that, in the years further away from the starting point of the non-low-income spells, the two data sources provided rather different levels and trends in the hazard rates, no matter which LIM threshold was employed, like the results presented in Section 3.3. Only during the first few years after the start of the spells did the estimates based on the fixed LIM threshold point to the same trends in the estimates of the hazard and the survival rates. But the levels of these estimates were different: the estimated hazard rate based on the SLID was always lower than that based on the LAD, and the estimated survival rate based on the SLID was always higher than that based on the LAD. However, under the variable LIM threshold, the result was more mixed: sometimes, estimates from different data sources suggested different trends.

Figure 7 presents the low-income re-entry hazard rates during the first few years of the spells, using data from the LAD and the SLID under the 1992 fixed LIM threshold (Note 19). The magnitudes of the estimates from the two data sources were quite different: estimates based on the SLID were consistently lower than those based on the LAD. Nevertheless, the underlying trends implied by the estimates were similar, and it appeared that from the 1993-to-1998 period to the 2005-to-2010 period, the probability of falling back into low income dropped. But from the 1996-to-2001 period to the 2005-to-2010 period, the trend underlying the LAD estimates might be quite different from that underlying the SLID estimates. Based on the SLID, the re-entry rate dropped, sometimes significantly, while data from the LAD showed little change between these two periods.

Figure 8 presents the survival rate estimates, also under the 1992 fixed LIM threshold (Note 20). The estimates are consistent with the hazard rate estimates. From the 1993-to-1998 period to the 2005-to-2010 period, the estimated survival rate increased, suggesting that the probability of falling back into low income decreased. But again, from the 1996-to-2001 period to the 2005-to-2010 period, the estimates from the two data sources diverged. Under the SLID, the survival rate increased; under the LAD, it changed little.

Results from this sub-section suggest inconsistency between the LAD and the SLID in the re-entry hazard and survival rate estimates for non-low-income spells. Results based on the LAD are probably more reliable than those based on the SLID because of the relatively small size of the SLID samples. More research is needed to explore the dynamics of non-low-income spells.

## 5. Panel attrition and the dynamic statistics of low income

With any panel data, attrition can significantly affect the consistency of the estimates. The LAD is also subject to panel attrition. Those who have no tax to pay or no benefits to claim (or do not wish to claim) may choose not to file a tax return. For example, an individual may file a return in year t but not in year t+1. In examining the dynamics of low income for a panel of individuals starting from year t, we will then have an attrited panel: some of the individuals cannot be observed at t+1. Facing panel attrition, researchers often choose to ignore the individuals lost to attrition, as we did in the previous sections. It is therefore of interest to know what happens to the estimated dynamic low-income statistics when panel attrition is taken into account.

To deal with panel attrition in the LAD, we follow the approach discussed in Fitzgerald, Gottschalk and Moffitt (1998). The authors propose creating a set of weights that overweight the individuals who satisfy two conditions: they are observed at both the beginning and the end of the panel, and they have characteristics at the beginning of the panel similar to those of the attritors. The weights are created as inverse probabilities from the predictions of two Probit models of attrition. The approach relies on the availability of one or more auxiliary variables that affect both attrition and the outcome variable of interest (low-income status in our case), and a group of variables that affect the outcome only. The authors denote the predicted probability of attrition based on all variables as the unconditional prediction (UPr) and that based only on the auxiliary variable(s) as the conditional prediction (CPr). The ratio CPr/UPr is the adjustment weight that accounts for panel attrition.

Following these authors, we estimate the Probit models with family size, the presence of children (under the age of 18), earnings quintiles, and a dummy variable indicating whether a record is imputed as the auxiliary variables. Other variables included in the models are gender, age, immigration status, family type and province dummies.
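The weighting step can be sketched as follows, taking the two sets of predicted probabilities as given rather than fitting the Probit models. The helper names, the probability values and the exit indicators are all hypothetical; the sketch only shows how the ratio CPr/UPr described above would reweight an exit rate among individuals who remain in the panel.

```python
def attrition_weights(cpr, upr):
    """Adjustment weight for each retained individual: the ratio CPr/UPr."""
    if len(cpr) != len(upr):
        raise ValueError("cpr and upr must have the same length")
    return [c / u for c, u in zip(cpr, upr)]


def weighted_rate(outcomes, weights):
    """Weighted share of individuals for whom the outcome is True."""
    total = sum(weights)
    return sum(w for o, w in zip(outcomes, weights) if o) / total


# Hypothetical predictions for four retained individuals (not LAD output).
cpr = [0.9, 0.8, 0.9, 0.6]   # prediction from the auxiliary variable(s) only
upr = [0.9, 0.4, 0.9, 0.6]   # prediction from all variables
exited_low_income = [True, False, True, True]

weights = attrition_weights(cpr, upr)
unweighted = sum(exited_low_income) / len(exited_low_income)
weighted = weighted_rate(exited_low_income, weights)
print(f"unweighted exit rate = {unweighted:.2f}, weighted = {weighted:.2f}")
```

In this toy case only the second individual receives a weight different from one, so the weighted and unweighted rates differ only through that person's outcome. When attrition is rare and the weights are close to one, the two rates will be nearly identical.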

The results suggest that panel attrition has little effect on our estimates of the one-year transition statistics. Both the exit rate and the entry rate estimates for the period from 1982 to 2010 are essentially the same as the estimates obtained when panel attrition is ignored. This is probably not surprising, because the underlying panels all have a length of two years and the attrition rates in the LAD are small (generally under 5%). The same observation holds for the six-year panels, where the attrition rate was around 15%, although the effect was slightly stronger than in the two-year panels. We also examined the effects of panel attrition on low-income survival and hazard estimates for panels of 8 and 10 years in length. The 10-year panel result is illustrated in Figure 9 (Note 21). The effect became stronger as the panel length increased, but even with the 10-year panels the effect of attrition was still relatively small under both the fixed and the variable LIM thresholds.

## 6. Conclusion

The results of this study suggest that it is feasible to construct from the LAD several dynamic low-income statistics to track the persistence and duration of low income. We found that, under the fixed LIM threshold, the entry rate, the exit rate, and the hazard and survival estimates of low-income spells based on the LAD show essentially the same trends in low-income dynamics as those based on the SLID. The same can be said for simple measures of low-income persistence, such as the proportions of individuals in low income for one year, two years or longer within a given panel period. But these types of estimates are consistent between the data sources only when censored spells, particularly left-censored spells, are excluded. Under the variable LIM threshold, the results were less consistent between the data sources. Thus, producing statistics on low-income dynamics using data from the LAD with fixed LIM thresholds seems feasible, while the same statistics based on variable thresholds can serve as valuable complements.

We also investigated whether panel attrition would be an issue. The results turn out to be quite robust to panel attrition, even with panels as long as 10 years.

However, more work remains to be done on low-income dynamics. First, we have not discussed the accuracy of our estimates based on the LAD. Assessing accuracy requires the variances of the various estimates. We have estimated these variances for the SLID, but variance estimation with data from the LAD is more complex. We leave this work to a future project. Second, we focused on the national picture of low-income dynamics. No comparative analysis has been conducted between groups of individuals or across provinces and regions. We will make these comparisons in future work. Finally, we have not discussed the depth of low income in the dynamic context. This could be done by following the methodology proposed by Rodgers and Rodgers (1993). Indeed, we applied a simplified version of this approach in a previous report (Murphy, Zhang, and Dionne, 2012) using data from the SLID.

## References

Economic Council of Canada (1992), “The New Face of Poverty”, Minister of Supply and Services, Ottawa, Ontario, Canada.

Finnie, R. and A. Sweetman (2003), “Poverty Dynamics: Empirical Evidence for Canada.” Canadian Journal of Economics, Vol. 36, No. 2 (May, 2003), pp. 291-325.

Fitzgerald, J., Gottschalk, P., and R. Moffitt (1998). “An analysis of sample attrition in panel data”. The Journal of Human Resources, 33(2): pp. 251-299.

Frenette, M., Picot, G., and R. Sceviour (2004). “How long do people live in low income neighbourhoods? Evidence for Toronto, Montreal and Vancouver”, Analytical Research Paper series No. 216, Statistics Canada, Ottawa, Ontario.

Laroche, M. (1998), “The Persistence of Low Income Spells in Canada, 1982-1993”, Ministry of Finance Research Report 98-02, Ottawa, Ontario, Canada.

Morissette, R., and X. Zhang (2001). “Experiencing low income for several years”, Perspectives on Labour and Income, Vol. 13, No. 2. Catalogue No. 75-001-XPE. Statistics Canada, Ottawa, Ontario.

Murphy, B., Zhang, X. and C. Dionne (2010), “Revising Statistics Canada’s Low Income Measure (LIM)”, Statistics Canada, Catalogue No. 75F0002M — No. 004, Ottawa, Ontario.

Murphy, B., Zhang, X. and C. Dionne (2012), “Low Income in Canada: a Multi-line and Multi-index Perspective”, Statistics Canada, Catalogue No. 75F0002M — No. 001, Ottawa, Ontario.

Picot, G., Hou, F., and S. Coulombe (2007). “Chronic Low Income and Low income Dynamics Among Recent Immigrants”, Analytical Research Paper series No. 294, Statistics Canada, Ottawa, Ontario.

Rodgers, J. and J. Rodgers (1993). “Chronic Poverty in the United States”. The Journal of Human Resources, Vol. 28, No. 1, pp. 25-54.

Zhang, X. (2010): “Low income measurement in Canada: What do different lines and indexes tell us?” Income Research Paper Series, Catalogue No. 75F0002M – No. 3. Ottawa, Ontario.

## Appendix

Table 1 Low income incidence based on census family income, 1992 - 2010

Table 2 Low income transition rates based on adjusted census family income, 1992 - 2010

Table 3 Some simple measures of low income persistence, census family income, 1993 - 2010

Table 4 Non-parametric low income hazard rate estimates by data and threshold, 1993 - 2010

Table 5 Non-parametric low income survival rate estimates by data and threshold, 1993 - 2010

Table 6 Non-low income hazard rate estimates by data and threshold, 1993 - 2010

Table 7 Non-low income survival rate estimates by data and threshold, 1993 - 2010

Table 8 Effect of attrition on low income hazard rate estimates, 2001 to 2010

Table 9 Implicit duration dependence and hazard rate by logistic model*

Table 10 Logistic estimate of low income duration by data and thresholds*
