# The Effects of Cancer on Employment and Earnings of Cancer Survivors


by Sung-Hee Jeon


## Abstract

The study examines the effects of cancer on the work status and annual earnings of cancer survivors who had a strong attachment to the labour market prior to their diagnosis. The comparison group consists of similar workers never diagnosed with cancer. The study is based on a Statistics Canada linkage file that combines microdata from the 1991 Census, the Canadian Cancer Registry, mortality records and personal income tax files. The study estimates changes in the magnitude of cancer effects during the first three years following the year of the diagnosis using a large sample of cancer survivors diagnosed at ages 25 to 61. The empirical strategy combines matching and regression models to deal with observed and unobserved differences between the cancer and comparison samples, and to improve causal inference. The results show moderate negative cancer effects on the work status and annual earnings. Overall findings suggest that, in the long run, cancer is more likely to affect survivors’ work status than their earnings.

Keywords: cancer, employment, annual earnings

JEL classification: I1, J2, J3

## Executive summary

The study examines the effects of cancer on specific labour market outcomes of cancer survivors using the newly available linkage data combining Canadian 1991 Census microdata with administrative records from the Canadian Cancer Registry, Vital Statistics Registry and longitudinal personal income tax records. These unique data are used to estimate cancer effects on labour market outcomes of Canadian cancer survivors affected by all cancer types. Labour market outcomes of cancer survivors are compared to those of the comparison sample, drawn from the same linkage data files, consisting of individuals with similar characteristics but never diagnosed with cancer. The study specifically focuses on two labour market outcomes (employment status and annual earnings) of cancer survivors diagnosed at ages 25 to 61 who had a strong attachment to the labour market prior to the diagnosis. The study measures changes in the magnitude of cancer effects during three calendar years (T+1, T+2, and T+3) following the year of the diagnosis (T). The study also examines differences between cancer effects for subgroups with certain observable characteristics and the average cancer effect for the cancer sample as a whole.

The study results show that cancer survivors are less likely to work compared with individuals never diagnosed with cancer in all three post-diagnosis periods. The probability of working in T+1 is, on average, 3.0 percentage points lower in the group of cancer survivors relative to the comparison group. The difference in the probabilities increases to 3.7 percentage points in T+2 and 4.8 percentage points in T+3. The magnitude of the negative effect is greater for cancer types with low survival rates than for those with high survival rates. The probability of working in T+1 among cancer survivors with a low-survival-rate type of cancer is 11.0 percentage points lower than the probability of working for the comparison group. For those diagnosed with a high-survival-rate type of cancer, this probability is 2.0 percentage points lower than that of the comparison group.

The average annual earnings of cancer survivors are also lower than those of the comparison sample. The study found that cancer survivors earn, on average, about 12% less in T+1 than their counterparts never diagnosed with cancer when we accounted for transitions from employment to non-employment by including zero earners in the sample. When only those with non-zero annual earnings are considered, the estimated average earnings loss for cancer survivors in T+1 is 10% to 11%. In the second and third year after the year of the diagnosis, the earnings gap between cancer survivors and those in the comparison group narrows, particularly when cancer survivors continue to work after the cancer diagnosis. The effects of cancer on annual earnings are larger and longer lasting for survivors with low-survival-rate cancer, which is consistent with the results pertaining to employment status.

The study findings also suggest that, in the long run, the negative effect of cancer on the employment status for cancer survivors diagnosed at ages younger than the average age in the cancer sample, 48 years, is smaller than the average effect for the cancer sample as a whole. Conversely, the magnitude of the negative effect on the employment status is larger for cancer survivors with no high school diploma compared with the average effect in the sample. More generally, however, the patterns of post-cancer earnings losses among cancer survivors with different characteristics are mixed.

Observed labour market outcomes in the data are determined by the supply and demand sides of the labour market. However, the lack of information in the data makes it impossible to investigate the roles played by each side of the labour market in reducing the likelihood of employment of cancer survivors relative to those never diagnosed with cancer.

## 1 Introduction

About 1,638,910 new cancer cases were expected in the United States in 2012, and about 577,190 Americans were expected to die of cancer in that year (American Cancer Society 2012). In Canada, the Canadian Cancer Society (CCS) estimates there were about 186,400 new cancer cases and 75,700 cancer-related deaths in 2012 (CCS 2012). CCS data show that cancer incidence among the Canadian population steadily rose from 1983 to 2007—largely because of population growth and aging—even though age-standardized rates remained relatively stable during that period. In contrast, cancer mortality rates among men in Canada have slowly declined since the late 1980s and, among women, since the mid-1990sNote 1 (CCS 2012). Most member countries of the Organisation for Economic Co-operation and Development have seen declining cancer mortality rates since 1980 (OECD 2007). Earlier diagnoses, improvements in cancer treatment and better follow-up care have substantially increased the number of cancer survivors. The five-year relative survival rateNote 2 for all types of cancer in Canada increased from 56% in 1992–1994 to 62% in 2004–2006, based on 2004–2006 CCS estimates. Similarly, in the United States the five-year relative survival rate for all types of cancer diagnosed from 2001 to 2007 was 67%, up from 49% in 1975–1977, and the number of cancer survivors reached nearly 12 million in 2007 (American Cancer Society 2012; Centers for Disease Control and Prevention 2011).

This study examines the effects of cancer on labour market outcomes of cancer survivors. It uses linkage data combining Canadian 1991 Census microdata with administrative records from the Canadian Cancer Registry, the Vital Statistics Registry and longitudinal personal income tax records. These unique data are used to estimate cancer effects on labour market outcomes of Canadian cancer survivors affected by all types of cancer. Labour market outcomes of cancer survivors are compared with those of the comparison sample consisting of individuals never diagnosed with cancer and drawn from the same linkage data files. The study specifically focuses on the labour market outcomes of working-age cancer survivors with a strong attachment to the labour market prior to the diagnosis and the effects of cancer on the survivors’ labour market outcomes approximately one to four years from the time of the cancer diagnosis. The study measures changes in the magnitude of negative labour market effects associated with cancer during the three years after the calendar year of the diagnosis. Differences between cancer effects for subgroups with certain observable characteristics and the average cancer effect for the cancer sample as a whole are also examined.

As cancer survival rates rise, researchers are increasingly trying to understand the effect of cancer on economic outcomes of cancer survivors. Breast cancer has so far received the most attention in this literature, partly because it is the most common type of cancer among individuals of working age and partly because individuals diagnosed with breast cancer have higher survival rates compared to those with other types of cancer. Among recent studies, Bradley et al. (2005) find that breast cancer survivors aged 30 to 64 are less likely to work shortly after the diagnosis (within the next six months) compared with other women. However, when Bradley et al. consider a longer post-diagnosis period, 12 to 18 months, they find little evidence of any negative effect (Bradley et al. 2007). Moran et al. (2011), who study the long-term (two to six years after the diagnosis) effects of all cancer types on employment of workers aged 28 to 54, find that cancer survivors have lower employment rates than other similarly aged adults.

Much of the literature on economic outcomes of cancer survivors is fairly recent, and the evidence presented so far is mixed, particularly regarding long-term effects. Scarcity of population-based data that contain both cancer diagnosis and labour market information has been a major hindrance for researchers. Common population-based household surveys containing labour market information generally do not contain cancer information: even if cancer survivors (generally older workers) can be identified in such surveys, the sample size is very small (Bradley et al. 2002). Other data sources may contain labour market information by surveying a sample of cancer survivors identified based on cancer patient records, in which case labour market information for a comparison sample has to be obtained from other data sources (Bradley et al. 2005; Moran et al. 2011). However, in these cases, data are collected from a targeted local area and may not be representative at the national level. Another common problem is the lack of longitudinal labour market outcome data, particularly earnings data, for any extended period of time before and after a cancer diagnosis. Previous studies that examined long-term effects of cancer on employment outcomes used retrospective questions to construct pre-diagnosis labour market histories (e.g., Moran et al. 2011), in which case the results are likely to suffer from substantial measurement error.

This study considers two labour market outcomes: work status (working or not working) and total annual earnings. A particular strength of the study is that these labour market outcome variables are based on highly accurate individual-level data from annual personal income tax files over pre- and post-cancer periods. Yet, unlike most survey-based studies, this study cannot explicitly examine the effect of cancer on the labour supply of cancer survivors (their labour force participation or hours of work) because these variables, routinely available in survey data, are not available in administrative data. Conceptually, however, there are at least two important ways in which a cancer diagnosis can affect individuals’ labour supply decisions and, consequently, their labour market outcomes. First, cancer alters the allocation of individuals’ time: they must spend some of it on cancer treatment and recovery. Accordingly, cancer survivors may have less time that can be allocated for work, so they may reduce their working hours or stop working altogether. However, this time constraint is likely to result in only a temporary decline in individuals’ labour supply and, consequently, a decrease in their employment or earnings would not last long. Second, not only can cancer and cancer treatment physically constrain the amount of time cancer survivors are able to work during a day; like other health shocks, cancer also can shift individuals’ preferences away from work toward leisure. In this case, the reduction in labour supply would last longer and may become permanent. This would result in a steady decline in employment or earnings observed in longitudinal data.Note 3

Following recommendations in Ho et al. (2007), this study combines matching with parametric models to improve causal inferences for the estimates based on observational data. The data are preprocessed using the Coarsened Exact Matching (CEM) method, and the comparison sample is reweighted using CEM matching weights to make it as similar as possible to the cancer sample (Iacus et al. 2012). The weighted regression accounts for potential confounding effects of observable pre-cancer characteristics such as age and education on cancer incidence and labour market outcomes of cancer survivors. Also, the study attempts to control for the potential correlation among unobservable characteristics, cancer incidence and labour market outcomes by combining matching with the regression-adjusted ‘difference-in-differences’ (DID) method (Heckman et al. 1997, 1998).

The results show that the employment rate among all cancer survivors was lower than the employment rate among individuals never diagnosed with cancer in all three post-diagnosis periods. However, the estimated magnitude of the difference in the employment rates is relatively moderate (3 percentage points in the first year after cancer diagnosis). The magnitude of the negative effect is greater for cancer types with low survival rates than for those with high survival rates. The average annual earnings of cancer survivors are also moderately lower than the average earnings of the comparison sample (12.1% in the first year after cancer diagnosis). However, the earnings gap between cancer survivors and their counterparts in the comparison sample narrows, particularly when survivors continue to work after the cancer diagnosis. Overall, the findings suggest that in the long run, cancer is more likely to affect survivors’ work status than their earnings.

The plan of the paper is as follows. Section 2 describes the empirical strategy of identifying the causal impact of cancer on labour market outcomes. Section 3 describes all data sources, and explains the selection rule for the cancer and comparison samples. Descriptive statistics for the cancer and comparison samples before and after matching are also provided in Section 3. Section 4 presents results. Section 5 concludes.

## 2 Empirical strategy and model

### 2.1 Observable confounding factors affecting cancer and labour market outcomes

A first look at the data reveals that cancer does not affect individuals in the sample randomly. As well, the incidence of cancer is correlated with some of the observable characteristics: one indication is that the average characteristics of the cancer sample are different from those of the comparison sample. Not accounting for these factors would significantly bias the results.  (Detailed summary statistics of both samples will be presented in the next section.) The first concern is that factors increasing the chances of having cancer also have a direct impact on labour market outcomes. Aging, for example, increases cancer incidence and at the same time decreases the probability of working. To account for the confounding influence of pre-cancer conditions on cancer incidence and outcome variables, it is important to make characteristics of the comparison sample more similar to those of the cancer sample.

The study applies the CEM method to match the cancer sample with the comparison sample based on the pre-cancer period observable characteristics (Iacus et al. 2012). The CEM is a multidimensional exact matching algorithm applied to cells generated by dividing continuous variables into discrete intervals or by regrouping categorical variables into fewer coarsened categories.Note 4 The CEM algorithm creates a set of strata with the same coarsened values of matching variables; it also restricts the matched data to areas of common empirical support by pruning unmatched observations from both treated and control samples. For each stratum, the CEM returns weights $\left(n_{t}/n_{c}\times N_{c}/N_{t}\right)$Note 5 that can be used to reweight observations in the matched comparison sample and balance the empirical distributions of the matching variables between the cancer and comparison samples.
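The stratum weights described above can be illustrated with a minimal Python sketch. It assumes the standard reading of the formula in Iacus et al. (2012): $n_{t}$ and $n_{c}$ are the numbers of cancer and comparison units in a stratum, and $N_{t}$ and $N_{c}$ are the totals of matched units. The function name `cem_weights` and the toy strata labels are hypothetical and are not taken from the study.

```python
from collections import Counter

def cem_weights(strata, treated):
    """Return one CEM weight per unit (hypothetical helper).

    Treated units in matched strata get weight 1, comparison units get
    (n_t / n_c) * (N_c / N_t), and units in strata that lack either
    group are pruned (weight 0).
    """
    n_t = Counter(s for s, t in zip(strata, treated) if t)
    n_c = Counter(s for s, t in zip(strata, treated) if not t)
    matched = {s for s in n_t if s in n_c}   # strata with both groups
    N_t = sum(n_t[s] for s in matched)       # matched treated total
    N_c = sum(n_c[s] for s in matched)       # matched comparison total
    weights = []
    for s, t in zip(strata, treated):
        if s not in matched:
            weights.append(0.0)              # pruned: no common support
        elif t:
            weights.append(1.0)
        else:
            weights.append((n_t[s] / n_c[s]) * (N_c / N_t))
    return weights

# Toy data: stratum labels are made up for illustration only.
strata = ["45-49|F", "45-49|F", "45-49|F", "50-54|M", "50-54|M", "55-59|M"]
treated = [True, False, False, True, False, False]
weights = cem_weights(strata, treated)
```

In this toy example the comparison weights sum to the number of matched comparison units, and the weighted share of comparison units in each stratum equals the share of treated units in that stratum, which is the balancing property the text refers to.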

Ho et al. (2007) demonstrate that preprocessing raw data using matching procedures turns parametric models into a much more reliable tool of the empirical analysis of causal effects. In particular, estimates of causal effects are less sensitive to the choice of a model specification.Note 6 One of the proven properties of the CEM is that it reduces the degree of model dependence (Iacus et al. 2012).Note 7 To make causal inference, matching weights are used in the regression analysis of labour market outcomes. As well, the regression method controls for the remaining imbalance since, depending on coarsening, some imbalance can remain in the matched data. The estimates of the cancer effect in the model measure the average effects of cancer on the outcome variables for cancer survivors.Note 8

### 2.2 Unobservable confounding factors affecting cancer and labour market outcomes

Another concern is the correlation between cancer incidence and individuals’ unobservable characteristics—this is also known as ‘selection-on-unobservables’ in the evaluation literature. Like observed differences, unobserved differences between the cancer and comparison samples may also result in different labour market outcomes for these two groups. For instance, developing cancer may be correlated with unhealthy lifestyle unobserved in the data (e.g., smoking or poor diet). This, in turn, may depend on particular unobserved individual characteristics related to personal motivation. Individuals with such characteristics would have less chance of developing cancer (e.g., by refraining from smoking) and, at the same time, have greater likelihood of earning higher income. Even if the confounding effect of the unobserved characteristics in this example is not particularly strong, not controlling for the average difference in unobservable characteristics of the cancer and comparison groups would lead to overestimating the negative effect of cancer on labour market outcomes.Note 9

Heckman et al. (1997, 1998) proposed combining DID and matching approaches to eliminate time-invariant unobservable differences between two groups. This approach, however, requires pre- and post-diagnosis data, which are now included in the newly available linkage data used in this study. Adopting the Heckman et al. (1997, 1998) strategy, the difference in individuals’ earnings before and after the cancer diagnosis is used as a dependent variable regressed on the cancer variable using CEM matching weights.Note 10
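The core of the matched DID idea can be sketched in a few lines of Python. This is a simplified illustration, not the study's estimator: it omits the regression adjustment for covariates, in which case the weighted regression of the earnings change on a cancer indicator reduces to a difference in weighted mean changes. The function names and the earnings figures are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def did_estimate(pre, post, cancer, weights):
    """Difference in weighted mean earnings changes (cancer minus comparison)."""
    change = [after - before for before, after in zip(pre, post)]
    treated = [(c, w) for c, k, w in zip(change, cancer, weights) if k]
    control = [(c, w) for c, k, w in zip(change, cancer, weights) if not k]
    mean_t = weighted_mean(*zip(*treated))
    mean_c = weighted_mean(*zip(*control))
    return mean_t - mean_c

# Made-up earnings before and after year T; weights mimic CEM weights.
pre = [50000, 40000, 52000, 41000, 39000]
post = [45000, 37000, 54000, 41000, 40000]
cancer = [True, True, False, False, False]
weights = [1.0, 1.0, 1.5, 0.75, 0.75]
effect = did_estimate(pre, post, cancer, weights)
```

Because each person's own pre-diagnosis earnings are subtracted, any unobserved characteristic that shifts earnings by a constant amount over time drops out of the comparison.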

### 2.3 Model specification

In the statistical model, T is the year of cancer diagnosis in the cancer sample and the matched year in the comparison sample. Calendar years before and after the year of cancer diagnosis (or years matched to them in the comparison sample) are $T-j$ and $T+j$, where $j=1,\dots,J$. Labour market outcomes at T+1, T+2 and T+3 are modelled as a function of a cancer indicator and individual as well as work characteristics:

$\text{Labour market outcome}=f\left(\text{cancer},\ \text{individual characteristics},\ \text{work characteristics}\right)+\epsilon .$

Individual characteristics include age, age squared, marital status (couple/single), an indicator of having a long-term disability in 1991 and province of residence. The variables used for matching—sex, education, visible minority status—are also included in the model specification.Note 11 Work characteristics are indicators for having non-zero self-employment income and union (professional association) membership. Quintile dummies for total earnings at $T-1$ and year dummies are also included in the model.
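One way to make the general function $f$ concrete is a linear specification estimated separately for each post-diagnosis year, with the CEM weights applied to the comparison observations. The two equations below are only an illustrative sketch: the linear functional form, the coefficient symbols and the use of $T-1$ as the pre-diagnosis reference year in the DID version are assumptions, not details stated in the text.

$Y_{i,T+j}=\alpha_{j}+\beta_{j}\,\text{cancer}_{i}+X_{i}^{\prime}\gamma_{j}+W_{i}^{\prime}\delta_{j}+\epsilon_{i,T+j},\quad j=1,2,3$

$Y_{i,T+j}-Y_{i,T-1}=\tilde{\alpha}_{j}+\tilde{\beta}_{j}\,\text{cancer}_{i}+X_{i}^{\prime}\tilde{\gamma}_{j}+W_{i}^{\prime}\tilde{\delta}_{j}+\tilde{\epsilon}_{i,T+j},\quad j=1,2,3$

Here $Y_{i,T+j}$ is the outcome of individual $i$ in year $T+j$, $X_{i}$ collects the individual characteristics, $W_{i}$ collects the work characteristics, and $\beta_{j}$ (or $\tilde{\beta}_{j}$ in the DID version) is the estimated cancer effect at $T+j$.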

## 3 Data

The 1991 Census–Longitudinal Worker File (LWF) is a unique dataset that combines data from four sources: Canada’s 1991 Census of Population, the Canadian Mortality Database (CMDB), the Canadian Cancer Database (CCDB) and the LWF.

The CMDB contains individual death records from 1950 onward. Provincial and territorial vital statistics offices provide these records annually to Statistics Canada for national-level analysis.

The CCDB is a databank combining two cancer data sources: the Canadian Cancer Registry (CCR) and the National Cancer Incidence Reporting System (NCIRS). The former is a person-oriented tumor database, which includes clinical and demographic information about Canadian residents with cancer since 1992 (Statistics Canada 2008). The latter is a historical tumor-oriented database containing cancer cases diagnosed as far back as 1969 (Carpenter et al. 2008). Individual cancer records from the CCR are used in the analysis; historical information from NCIRS is used to verify that individuals in the CCR had no prior cancer history.

The LWF is a 10% random sample of Canadians who either filed a personal income tax form (Form T1 General, Income Tax and Benefit Return) or received a statement of remuneration (Form T4, Statement of Remuneration Paid (slip)) from their employers in each year from 1983 onward. Once individuals are selected into the LWF, they are followed regardless of their employment status for as long as they file tax returns (Form T1 General) or their incomes are reported to the Canada Revenue Agency (CRA) by their employers. The current version of the LWF contains information on wages, salaries and net self-employment income as well as firm-level information.Note 12 Wages and salaries are obtained from T4 slips issued by employers. Net self-employment income and basic personal information (marital status, province of residence, etc.) are obtained from the personal income tax files (T1).Note 13

Statistics Canada’s Health Analysis Division initially linked selected personal information from CMDB and CCDB to the individual records of individuals aged 25 and older in the 1991 Census file.Note 14 This initial data linkage is called ‘1991 Canadian Census Cohort: Mortality and Cancer Follow-up.’ Individuals’ death records up to 2006 and individuals’ cancer records up to 2003 were obtained from both data banks, CMDB and CCDB.Note 15 Subsequently, the LWF records were linked to the 1991 Canadian Census Cohort to provide the crucial income component.

The 1991 Census–LWF data sample contains 263,674 individual records corresponding to about 1.4% of the Canadian population aged 25 and over in 1991. Approximately 58.8% of the 1991 Census–LWF cohort was observed in all 28 years of the LWF, from 1983 to 2010; the average number of years an individual is present in the sample is 24.8. Tax-filing rates were slightly lower in the 1980s compared with the more recent decades; in the years 1990 to 2010, 66.9% of 263,638 individuals are observed in all 21 years, and the average number of years an individual is observed in the sample is 18.5.

### 3.1 Cancer sample

Initially, individuals diagnosed with cancer for the first time from 1992 to 2000 are selected from the 1991 Census–LWF cohort. As this study focuses on the labour market outcomes of the cancer survivors, the sample is restricted to individuals who were under 62 when they were diagnosed with cancer. In the 1991 Census–LWF data, only cancer sites are available, and no information about the severity and stage of cancer is available in the data. First, all tumors are grouped according to 26 of the most common cancer sites based on the Surveillance, Epidemiology, and End Results (SEER)Note 16 grouping for ICD-9 (International Statistical Classification of Diseases, 9th Edition) and ICD-O-2/3 (International Classification of Diseases for Oncology, Second and Third Edition) codes available in the data. For those who had multiple tumor records in any year, the record with a malignant and primary site tumor that has the lowest relative survival rate according to previous studies on cancer survival rates (Ellison et al. 2011) is selected. Then, 26 sites are grouped into three categories—high, middle and low—based on the five-year relative survival ratio (hereafter referred to as the ‘cancer survival category’) for each cancer site (Ellison et al. 2011). These cancer categories are used as a proxy for the average severity of cancer to determine the impact of cancer severity on labour market outcomes of cancer survivors.Note 17

The 1991 Census–LWF data contain 5,185 individuals (2,120 males and 3,065 females) diagnosed with cancer for the first time from 1992 to 2000 who were aged 61 or younger in the year of diagnosis.Note 18 The numbers of cases for each of the 26 cancer sites are shown in Appendix Table A.1. Among all 5,185 cancer cases, about 8% (412) are tumors other than the 26 common sites. The most common cancer sites are breast (21.5%), lung and bronchus (10%), cervix uteri (7.9%), prostate (5.9%) and colon (5.8%). Broken down by cancer-survival categories, 2,023 (39%) cases are grouped as high, 1,659 (32%) as middle, and 1,091 (21%) as low. Also, 2,020 records (39% of the total sample) are matched to death records from CMDB mortality data (the last available year is 2006). By cancer-survival categories, 19.1% of the high, 34.3% of the middle, and 81.5% of the low-survival groups are matched to death records.

For the analysis, 3,716 individuals who survived for at least three years after their cancer diagnosis are kept in the sample, and 77 individuals who had multiple new cancer diagnoses within those three years are excluded from the sample. Finally, based on their labour market attachment prior to the cancer diagnosis, the cancer sample is restricted to individuals who worked in the year of cancer diagnosis (T) and the two previous years ($T-1$ and $T-2$), since the focus of the study is on the effects of cancer on labour market outcomes for those with a strong labour market attachment prior to the cancer diagnosis.Note 19 More details on the labour market outcome measures will be presented in Subsection 3.3. By imposing this restriction, 1,042 individuals are further excluded from the cancer sample. To summarize, the cancer sample consists of 2,597 cancer survivors diagnosed with cancer for the first time from 1992 to 2000 and aged 61 or younger at the time of the diagnosis. These individuals worked in the year of the diagnosis and the two previous years, survived for at least three years after the diagnosis and had no subsequent diagnoses during those three years.

Table 1 shows the types of cancer in the cancer sample. Since the cancer sample is restricted to people who were diagnosed with cancer and survived for more than three years, it contains proportionally fewer people in the low-survival category than shown in Appendix A. One-half of the cancer sample is categorized as a high-survival group. General characteristics of the cancer sample are presented in the second column of Table 2 in the next subsection.

### 3.2 Comparison sample

This subsection describes the selection of the comparison sample whose labour market outcomes are compared with those of the cancer sample. The general selection criteria used to select the cancer sample are first applied: individuals are pre-selected based on their cancer records, year, age, death records, and work history from the same 1991 Census–LWF data. The 192,537 individuals who have no cancer records in any available year in either the CCR or NCIRS (historical tumor records) are initially chosen from the data. Similarly to the cancer sample, the comparison sample consists of individuals aged 61 or younger in each year from 1992 to 2000. Then, in each year, individuals who lived for at least three years following T (which excludes 1,053 individuals) and worked in year T and the two previous years ($T-1$ and $T-2$) (further excluding 22,576 individuals) are kept in the sample. At this point, 168,908 individuals are pre-selected from the 1991 Census–LWF data and the pooled number of observations over nine years from 1992 to 2000 is 1,228,551.

In the next step, CEM is applied to match this pre-selected sample with the cancer sample to obtain the matched comparison sample. Matching variables include age in year T, gender, level of education, visible minority status, year T and province of residence in year $T-1$. Age is coarsened into five-year intervals. The visible minority status is coarsened into three categories: non-minority, Asian and other minority. The Northwest Territories, Nunavut and Yukon are merged into a single category (Northern Territories). CEM automatically restricts the cancer and comparison samples to areas of common empirical support by pruning unmatched observations from both samples. Four of the 2,597 individuals in the cancer group cannot be matched to anyone in the comparison group. The matched comparison group consists of 624,835 observations (pooled year data for 142,196 individuals).Note 20 Table 2 shows the summary statistics for matching variables for the cancer and comparison groups before and after matching. The summary statistics for other variables, not used in the matching procedure but used in the regression analysis, are provided in Appendix Table A.2. The last column of Table 2 presents the differences in the variables’ means (proportions) between the cancer and comparison samples before matching and t-tests for the differences between these two samples.Note 21
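The coarsening and pruning steps described above can be sketched in Python as follows. The sketch is illustrative only: the age-interval boundaries (multiples of five), the detailed visible-minority labels, the record layout and all function names are assumptions, since the text gives only the coarsened categories.

```python
from collections import defaultdict

NORTHERN = {"Northwest Territories", "Nunavut", "Yukon"}
# Hypothetical detailed labels that would be regrouped as "Asian".
ASIAN_GROUPS = {"Chinese", "South Asian", "Southeast Asian"}

def coarsen_age(age, width=5):
    """Map an age to a five-year interval label, e.g. 48 -> '45-49'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def coarsen_minority(status):
    """Regroup visible minority status into three categories."""
    if status == "Not a visible minority":
        return "non-minority"
    return "Asian" if status in ASIAN_GROUPS else "other minority"

def coarsen_province(province):
    """Merge the three territories into one category."""
    return "Northern Territories" if province in NORTHERN else province

def stratum(person):
    """Stratum key built from the coarsened matching variables."""
    return (coarsen_age(person["age"]), person["sex"], person["education"],
            coarsen_minority(person["minority"]), person["year"],
            coarsen_province(person["province"]))

def prune(people):
    """Keep only people whose stratum holds both cancer and comparison units."""
    groups = defaultdict(set)
    for p in people:
        groups[stratum(p)].add(p["cancer"])
    return [p for p in people if groups[stratum(p)] == {True, False}]

a = {"age": 48, "sex": "F", "education": "High school", "year": 1995,
     "minority": "Not a visible minority", "province": "Ontario", "cancer": True}
b = dict(a, age=46, cancer=False)   # same stratum as a after coarsening
c = {"age": 52, "sex": "M", "education": "University", "year": 1995,
     "minority": "Chinese", "province": "Yukon", "cancer": False}
kept = prune([a, b, c])             # c has no cancer-sample match
```

Coarser intervals keep more observations on common support but leave more within-stratum imbalance, which is why the regression step still includes the matching variables.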

The average age in the cancer sample is higher than that of the pre-matched comparison sample. The age distribution shows that cancer incidence is positively correlated with age. The cancer sample also has proportionally more individuals with less than a high school level of education than the comparison sample. Recent literature finds evidence of causal influence of higher education attainment on health and healthy lifestyle (Jones et al. 2011). The cancer sample has proportionally more women than the pre-matched comparison sample. Breast cancer, the most common type of cancer among women, is also the most prevalent cancer type in the study sample. Among men, prostate cancer is the most common type of cancer, but it usually develops later in life at ages excluded from the study sample. The data are also consistent with the fact that prevalence of cancer is generally lower among Asians than among non-minorities.

As shown in the fourth column, the reweighted (post-matching) distributions are almost identical in the cancer and the matched comparison samples. Matching and reweighting procedures enable us to regard cancer as randomly assigned “treatment” conditional on the matched observable (pre-cancer) sample characteristics.

Summary statistics for earnings in the two years ($T-1$ and $T-2$) prior to cancer diagnoses in the cancer sample are presented in Appendix Table A.2 along with the corresponding earnings in the matched comparison sample. As a result of matching, both the average earnings and the earnings distributions in the cancer sample in the years before the cancer diagnosis are virtually identical to those in the matched comparison sample. This observation has two implications. First, total annual earnings can be accurately predicted in the study samples based on variables used for matching. Second, in the absence of cancer diagnosis, the average earnings of the cancer sample would follow the same common trend as those of the matched comparison sample. This observation is particularly important, as it is consistent with the main assumption of the DID approach used in the empirical analysis.Note 22

### 3.3 Measures of labour market outcome

The study considers two labour market outcome variables: the first is a binary annual working status variable; the second is a continuous variable for total annual earnings (in 2010 dollars). Individuals’ total annual earnings are defined as the sum of all wages and salaries received in a given year plus net self-employment income in that year. Individuals with non-zero earnings (either positive or negative) are considered to have worked in that year. Both outcome variables are measured annually: there is no information on how many months or weeks each cancer survivor worked before or after the cancer diagnosis in the year he or she was diagnosed. To ensure that the person had not stopped working before being diagnosed with cancer, the cancer sample is restricted to individuals with non-zero earnings in the year of the cancer diagnosis (T) as well as in the two previous years ($T-1$ and $T-2$). The study examines the effect of cancer on labour market outcomes in the years after the diagnosis by comparing labour market outcomes of the cancer and comparison samples for three consecutive periods: T+1, T+2 and T+3.
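The outcome definitions and the sample restriction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the function names and the dollar figures are hypothetical and do not come from the linkage file.

```python
from dataclasses import dataclass


@dataclass
class AnnualRecord:
    """One person-year of income data (hypothetical layout, 2010 dollars)."""
    wages_and_salaries: float
    net_self_employment: float  # may be negative (a business loss)


def total_annual_earnings(rec: AnnualRecord) -> float:
    # Wages and salaries plus net self-employment income in the same year.
    return rec.wages_and_salaries + rec.net_self_employment


def worked(rec: AnnualRecord) -> bool:
    # Non-zero earnings, positive or negative, count as having worked.
    return total_annual_earnings(rec) != 0.0


def strongly_attached(history: dict[int, AnnualRecord]) -> bool:
    # history maps the offset from the diagnosis year T (-2, -1, 0) to a record.
    return all(worked(history[offset]) for offset in (-2, -1, 0))


# Illustrative person: a self-employment loss at T-1 still counts as working.
history = {
    -2: AnnualRecord(41000.0, 0.0),
    -1: AnnualRecord(0.0, -2500.0),
    0: AnnualRecord(38000.0, 1200.0),
}
in_sample = strongly_attached(history)
```

Because the working indicator is derived from annual earnings alone, it cannot distinguish a person who worked one week from one who worked the full year.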

Note that T corresponds to the calendar year, not the actual time of the diagnosis. Accordingly, T+j corresponds to the calendar years following the year of diagnosis, not the time intervals that elapsed from the moment of the actual diagnosis. For instance, suppose that a cancer survivor had zero earnings in year T+1 (the year after the year of diagnosis). It is known that this person did not work for at least 12 months, but that period may be as long as 24 months from the time of the diagnosis if the person was diagnosed at the beginning of year T. However, if a cancer survivor had positive earnings in year T+1, these earnings reflect total earnings for a period of 12 to 24 months from the cancer diagnosis. Therefore, the actual working time from the day of the cancer diagnosis to the end of T+1 depends on the time of the cancer diagnosis in year T. Consequently, the data are not suitable for analyzing the short-term (e.g., less than a year) effects of cancer diagnosis on labour market outcomes; thus, only longer-term labour market outcomes are considered.

Appendix Table A.3 shows unadjusted means of employment and earnings in T+1, T+2 and T+3 by cancer status and cancer-survival category.Note 23 Unadjusted mean differences in employment and earnings between the cancer sample and the matched comparison sample are -4.36% and -\$9,022 at T+1.

## 4 Results

### 4.1 The effect of cancer on employment

The left panel of Table 3 shows estimates of cancer effects on the probability of working at T+1, T+2 and T+3. Because all individuals in the sample worked at T, $T-1$, and $T-2$, estimating the probability of working in each period is equivalent to estimating the probability of transition from employment to non-employment between T and T+1, T and T+2, and T and T+3, respectively.Note 24 Here, the effect of cancer on employment at the end of each time interval combines various aspects of all labour market transitions that took place between T and the final period.
Some individuals, for instance, may temporarily stop working after the cancer diagnosis because of cancer treatment and post-treatment recovery but return to work in the subsequent periods. Others may not leave the labour market immediately after the cancer diagnosis but will leave in the subsequent periods. Some individuals may leave the labour market permanently following the cancer diagnosis. The differences between the survivor and comparison groups in the probability of working at the end of each period reflect the cumulative effect of cancer on employment from T to that period.

The upper-left panel shows that the probability of working in T+1 is, on average, 3.0 percentage points lower in the cancer group relative to the comparison group. The difference in the probabilities increases to 3.7 percentage points in T+2 and 4.8 percentage points in T+3.

The right panel of Table 3 presents the effects of cancer on the probability of working conditional on continuously working after the cancer diagnosis until T+1, T+2 or T+3. This is equivalent to estimating the probability of exit from employment between two consecutive periods (a more conventional exit rate). The results in the upper-right panel suggest that even if cancer does not interrupt individuals’ attachment to the labour market soon after the diagnosis, it may affect their exit from the labour market in the longer term. The exit rate decreases with more years since the cancer diagnosis, but it is still 1.3 percentage points higher at T+3 than the average exit rate for people never diagnosed with cancer, and the difference is statistically significant.

In the bottom panel, instead of a binary cancer variable, three cancer-survival categories (high, middle and low) are included in the estimation model. Among these three categories, the negative effect of cancer on employment is the largest for the cancer survivors in the low-survival category in all three periods compared with people never diagnosed with cancer.
In the bottom panel, the probability of working at T+1 for survivors in the low-survival group is 11 percentage points lower than the probability of working in the comparison group. In addition, the bottom right panel shows that, for those in the low-survival category, the exit rates conditional on working in all previous years also remain high and statistically significant for all three post-diagnosis periods. However, in the high-survival category, for those who continue to work after the cancer diagnosis, the estimated exit rate at T+3 is no longer significantly different from the average in the comparison group.

### 4.2 The effect of cancer on earnings

Next, the effect of cancer on total annual earnings is estimated using a weighted linear regression. The left panel of Table 4 shows estimates of cancer effect in each period not conditional on working. Zero annual earnings are included in this analysis, so the results presented in the left panel include earnings losses resulting from transitions from employment before cancer diagnosis to non-employment after the diagnosis. Estimates in the right panel of Table 4 are conditional on working in each period (zero earnings excluded). These results are close to the cancer effect along the intensive margin of labour market outcomes and can be compared to the estimated effects along the extensive margin shown in Table 3.

Cancer survivors earn, on average, \$5,079, or 12.1%, less at T+1 than their counterparts never diagnosed with cancer; however, the negative effect of cancer is smaller at T+2, 9.7%, and smaller still at T+3, 9.3% (upper-left panel in Table 4). When the effect of cancer is estimated conditional on working, workers diagnosed with cancer at T earn \$4,675 (10.6%) less at T+1 than do their counterparts from the comparison sample. The negative effect of cancer is smaller at T+2, 6.6%, and T+3, 5.4%, than at T+1.

Comparing these results with the effects of cancer on the probability of working shown in Table 3, the impact of cancer on work status appears to be more persistent than its impact on annual earnings in post-diagnosis periods.
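The difference between the two panels of Table 4 (zero earners included versus excluded) can be illustrated with a minimal Python sketch. It is not the study's estimation code: it omits all control variables, in which case a weighted regression on a cancer indicator reduces to a difference in weighted means. The function names, the toy earnings and the unit weights are assumptions made for illustration.

```python
def weighted_mean(values, weights):
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def earnings_gap(earnings, cancer, weights, conditional_on_working=False):
    """Weighted mean earnings of the cancer group minus the comparison group.

    With conditional_on_working=True, person-years with zero earnings are
    dropped, mirroring the 'zero earnings excluded' panel.
    """
    rows = [
        (e, c, w)
        for e, c, w in zip(earnings, cancer, weights)
        if not conditional_on_working or e != 0
    ]
    treated_e = [e for e, c, _ in rows if c == 1]
    treated_w = [w for _, c, w in rows if c == 1]
    control_e = [e for e, c, _ in rows if c == 0]
    control_w = [w for _, c, w in rows if c == 0]
    return weighted_mean(treated_e, treated_w) - weighted_mean(control_e, control_w)


# Toy data: one survivor has left employment (zero earnings) at T+1.
earnings = [0.0, 40000.0, 50000.0, 45000.0]
cancer = [1, 1, 0, 0]
weights = [1.0, 1.0, 1.0, 1.0]

gap_all = earnings_gap(earnings, cancer, weights)
gap_workers = earnings_gap(earnings, cancer, weights, conditional_on_working=True)
```

In this toy example the unconditional gap is larger in magnitude than the conditional one, because the former also captures the earnings lost through the move to non-employment.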

The bottom panel of Table 4 shows results for the model in which the cancer variable is grouped into three survival categories. In both right and left panels, the earnings decline, relative to the comparison sample, is greater for survivors in the low-survival category than for those in the high-survival category. The effects of cancer are larger and longer lasting for the low-survival category than for other categories, which is consistent with the results for the employment status shown in Table 3. As time goes by, the average annual earnings of cancer survivors remaining in the labour market seem to converge to the average annual earnings of workers with similar characteristics never diagnosed with cancer, particularly for those in the high-survival category.

### 4.3 Combining difference-in-differences with matching

Next, the study further controls for differences between the cancer and comparison samples by differencing out time-invariant unobserved characteristics that may be associated with both cancer incidence and earnings. Earnings at $T-1$ are subtracted from earnings at T+1, T+2, and T+3. These three pre- and post-diagnosis differences are then used as the dependent variables in the weighted DID regression. The top panel in Table 5 presents the results from the weighted DID regressions for annual earnings. For comparison, the middle panel in Table 5 shows again the results from the weighted linear regression reported in Table 4; the bottom panel shows the results from the linear regression based on the unweighted (pre-matching) sample.
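The differencing step can be sketched as follows. This is a simplified illustration, not the regression used in the study: it leaves out the control variables, so the weighted DID estimate collapses to the weighted mean earnings change in the cancer sample minus that in the comparison sample. All names and numbers below are hypothetical, and the weights merely stand in for matching weights.

```python
def weighted_mean(values, weights):
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def did_estimate(pre, post, cancer, weights):
    """Weighted difference-in-differences without covariates.

    pre  : earnings at T-1
    post : earnings at T+j (j = 1, 2 or 3)
    """
    # First difference: removes anything fixed over time for each person.
    change = [after - before for before, after in zip(pre, post)]
    treated_d = [d for d, c in zip(change, cancer) if c == 1]
    treated_w = [w for w, c in zip(weights, cancer) if c == 1]
    control_d = [d for d, c in zip(change, cancer) if c == 0]
    control_w = [w for w, c in zip(weights, cancer) if c == 0]
    # Second difference: cancer sample versus comparison sample.
    return weighted_mean(treated_d, treated_w) - weighted_mean(control_d, control_w)


# Toy data: two survivors followed by two comparison individuals.
pre = [40000.0, 50000.0, 42000.0, 48000.0]
post = [36000.0, 47000.0, 43000.0, 50000.0]
cancer = [1, 1, 0, 0]
weights = [1.0, 1.0, 0.5, 1.5]

effect = did_estimate(pre, post, cancer, weights)
```

Subtracting pre-diagnosis earnings means that any person-specific earnings level that does not change over time drops out of the comparison, which is what allows the approach to address time-invariant unobserved differences.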

Table 5 shows several notable patterns among the results from these three models.Note 25 First, in the top panel, after additionally accounting for unobservable differences between the cancer and comparison samples, the negative cancer effects become smaller compared with those in the middle panel, where only observable differences between the two samples are accounted for. These differences in the estimates provide evidence of modest time-invariant unobservable differences between the (matched) cancer and comparison samples: this leads to the overestimation of the negative cancer effect in the model reported in Table 4 (the middle panel of Table 5). Second, comparing the results in the middle panel to those in the bottom panel shows that, consistent with Ho et al. (2007), controlling for selection on observable characteristics by applying matching weights improves causal inference, particularly at T+1—a relatively short term in this study. In the longer term (T+3), however, accounting for selection on observable characteristics in the model appears less important. Third, accounting for the simultaneous associations between time-invariant unobservable individual characteristics and cancer, on the one hand, and these characteristics and labour market outcomes, on the other hand, appears more important for estimating longer-term effects of cancer on earnings than for estimating short-term effects.

### 4.4 Heterogeneity of cancer effect

So far, the analysis has focused on the overall effect of cancer on specific labour market outcomes. The question now is whether cancer effects for subgroups with certain observable characteristics are different from the average cancer effect estimated for the whole sample.Note 26 To answer this, models with interactions between the cancer indicator and categorical variables for age, education and pre-diagnosis earnings are estimated. Interactions between each set of categories and the cancer indicator are included separately in each regression; all control variables from previous regressions are also included. First, group-specific average cancer effects are computed; then, differences between these effects and the overall average effect (shown in Tables 3 and 4) are tested for statistical significance. A model with the interaction terms is used instead of estimating separate models for each category in order to hold cancer incidence constant over all subgroups by using CEM weights obtained for the whole sample. Because the weighted cancer incidence is the same in the whole sample and in each subgroup (e.g., people aged 35 to 44, people with a university degree, etc.), these differences between estimated group-specific and full-sample effects can be interpreted as those reflecting differences in (pre-cancer) characteristics between the subgroups and the whole sample and not as differences in cancer incidence.

The differences in cancer effects $\left(\Delta CE\right)$ between each subgroup $\left(\left[CE \mid x=1\right]\right)$ and the overall effect estimated on the whole sample $\left(\left[CE\right]\right)$ are presented in Tables 6-1 and 6-2. The following are the test hypotheses:

$\begin{array}{ll}\begin{array}{l}\left[\widehat{pr}\left(work \mid cancer=1\text{ and }x=1\right)-\widehat{pr}\left(work \mid cancer=0\text{ and }x=1\right)\right]\\ \quad-\left[\widehat{pr}\left(work \mid cancer=1\right)-\widehat{pr}\left(work \mid cancer=0\right)\right]=\left[CE_{work} \mid x=1\right]-\left[CE_{work}\right]=0\end{array} & \left(1\right)\end{array}$

$\begin{array}{ll}\begin{array}{l}\left[E\left(earnings \mid cancer=1\text{ and }x=1\right)-E\left(earnings \mid cancer=0\text{ and }x=1\right)\right]\\ \quad-\left[E\left(earnings \mid cancer=1\right)-E\left(earnings \mid cancer=0\right)\right]=\left[CE_{earnings} \mid x=1\right]-\left[CE_{earnings}\right]=0\end{array} & \left(2\right)\end{array}$

Table 6-1 presents $\Delta CEs$ for the probability of working at periods T+1 and T+3 as well as test results for Equation (1) for each set of sample characteristics. The cancer effects on the probability of working for younger age groups (25-to-34 and 35-to-44) are 1 to 2 percentage points higher at T+1 than the cancer effect estimated for the full sample (-0.030 in Table 3), where the average age is 48.Note 27 This difference is statistically significant at the 10% level. The $\Delta CEs$ for the two youngest age groups are larger and more statistically significant at T+3 than at T+1. However, the $\Delta CEs$ for older age groups (45-to-54 and 55-to-61) are not statistically significant in either period. This result suggests that, for those diagnosed with cancer at younger ages, the magnitude of the negative cancer effect on the probability of working is smaller than the estimated average effect for the whole sample, particularly in the long run.

The $\Delta CEs$ are also statistically significant in some educational categories. At T+1, the only significant $\Delta CE$ is observed for cancer survivors with a university degree, for whom the cancer effect on the probability of working is 2 percentage points higher than the overall effect. At T+3, the only significant $\Delta CE$ is observed for those with no high school diploma; it is 1.9 percentage points lower than the overall cancer effect on the probability of working at T+3. In other words, at T+3 the negative effect of cancer on the employment status among individuals with no high school diploma is larger in magnitude (-0.048 - 0.019 = -0.067) than the overall average effect (-0.048).Note 28

For cancer survivors in the bottom 20% of the earnings distribution at $T-1$, the $\Delta CE$ at T+1 is -0.042, which means that the negative effect is 4.2 percentage points larger than the overall effect (-0.030). In contrast, for those in the upper 60% of the earnings distribution at $T-1$, the $\Delta CE$ at T+1 (0.023) is positive and statistically significant, so the negative effects of cancer on the work status for these groups are close to zero at T+1 (-0.030 + 0.023 = -0.007). At T+3, the corresponding $\Delta CEs$ are much smaller and neither is statistically significant.
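The arithmetic used in the preceding paragraphs, where a group-specific effect is recovered by adding the reported $\Delta CE$ to the overall effect, can be written out as a short Python sketch. The function name is an illustrative assumption; the input values are the ones quoted in the text.

```python
def group_specific_effect(overall_effect, delta_ce):
    # CE | x=1 equals the full-sample effect plus the reported difference.
    # Rounded to three decimals, the precision used in the text.
    return round(overall_effect + delta_ce, 3)


OVERALL_T1 = -0.030  # overall effect on the probability of working at T+1
OVERALL_T3 = -0.048  # overall effect on the probability of working at T+3

# No high school diploma, T+3: Delta CE = -0.019
no_diploma_t3 = group_specific_effect(OVERALL_T3, -0.019)

# Bottom 20% of the earnings distribution at T-1, T+1: Delta CE = -0.042
bottom_quintile_t1 = group_specific_effect(OVERALL_T1, -0.042)

# Upper 60% of the earnings distribution at T-1, T+1: Delta CE = 0.023
upper_groups_t1 = group_specific_effect(OVERALL_T1, 0.023)
```

A negative $\Delta CE$ therefore indicates a larger employment loss than average for that subgroup, and a positive one indicates a smaller loss.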

Table 6-2 presents $\Delta CEs$ for annual earnings and test results from Equation (2) at T+1 and T+3. Post-cancer earnings losses are generally small for cancer survivors in the 25-to-34 age group (-5,079.37+4,337.53=-741.87) for which the negative effect on the probability of working is smaller relative to the overall average effect. However, the patterns of group-specific cancer effects on earnings are less clear than those observed for work status in Table 6-1.

## 5 Conclusion

Earlier diagnoses, improvements in cancer treatment and better follow-up care have substantially increased the number of cancer survivors during the last two decades. As cancer survival rates rise, life after cancer treatment has attracted considerable interest from researchers in different fields. Economists are particularly interested in the effects of cancer on the labour market choices and labour market outcomes of cancer survivors.

Using unique linkage data from the Canadian 1991 Census and administrative cancer, mortality and longitudinal income tax records, the study estimates the effects of cancer on certain labour market outcomes of cancer survivors by comparing their outcomes to those of a comparison group consisting of people never diagnosed with cancer. On average, cancer lowers the probability of working in the first year following the year of diagnosis by 3 percentage points relative to the comparison sample. The average earnings loss of cancer survivors in the first post-diagnosis year is 12% when transitions from employment to non-employment are taken into account by including zero earners in the sample. When only those with non-zero annual earnings are considered, the estimated average earnings loss for cancer survivors is 10% to 11%. In the second and third years after the year of diagnosis, cancer continues to lower the probability of working among cancer survivors; however, the earnings gap between survivors and those in the comparison group narrows, particularly when cancer survivors continue to work after the cancer diagnosis. In addition to controlling for differences in observable characteristics between people diagnosed with cancer and those never diagnosed with cancer, accounting for the simultaneous associations (1) between time-invariant unobservable individual characteristics and cancer, and (2) between these characteristics and labour market outcomes appears important for estimating long-term effects of cancer on earnings.

The study findings also suggest that cancer effects on labour market outcomes differ for different survival categories, which in this study serve as proxies for cancer severity. The probability of working and annual earnings are substantially lower for survivors in the low-survival category than for those in the high-survival category in all post-cancer periods. In the long run, the negative effect of cancer on the work status of people diagnosed at ages younger than the average age in the sample is smaller than the average effect for the full cancer sample. Conversely, the negative effect of cancer on the work status of cancer survivors with no high school diploma is greater than the average effect for the full sample.

Overall, the findings suggest that, in the long run, cancer is more likely to affect survivors’ work status (extensive margin of labour market outcomes) than their earnings (intensive margin of labour market outcomes). Improving general understanding of the role that supply and demand sides of the labour market play in reducing the likelihood of employment of cancer survivors relative to those never diagnosed with cancer is an important direction for future research.

## Notes
