Estimating relative survival for cancer: An analysis of bias introduced by outdated life tables
Larry F. Ellison
The relative survival ratio (RSR) is the preferred measure for evaluating and comparing survival in population-based cancer studies. It is defined as the ratio of the observed survival in a group of people diagnosed with cancer to the expected survival of a comparable group of people—free from the cancer under study—in the general population.Note1 In practice, expected survival is typically estimated from general population life tables.
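In symbols, the definition above can be written as follows. The notation is introduced here for illustration and is not taken from the source: $S_O(t)$ denotes the observed survival proportion of the patient group at $t$ years after diagnosis, and $S_E(t)$ denotes the expected survival proportion of the comparable general-population group over the same duration.

```latex
\[
  \mathrm{RSR}(t) = \frac{S_O(t)}{S_E(t)}
\]
```

Because $S_E(t)$ sits in the denominator, any underestimate of expected survival (for example, from an older life table with higher mortality) inflates the ratio.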
However, population life tables corresponding to the most recent calendar years of cancer patient follow-up may not be readily available. In such circumstances, the practice has been to extend the latest available life tables to cover the remaining years for which an estimate of expected survival is required.Note2 Although the assumption is that any bias introduced into the estimation of expected survival, and hence, into the RSR, will be negligible, empirical studies of this bias have yet to be published. Furthermore, the increasing use of period survival as an analytic tool to complement or replace more traditional cohort approaches would seem to increase vulnerability to potential bias.
Based on data from the Canadian Cancer Registry (CCR), this study examines the impact of using historical rather than current life tables to estimate expected survival in calculations of RSRs. Results are presented by sex, age group, and survival duration.
Cancer incidence data are from the October 2011 version of the CCR, a dynamic, person-oriented, population-based database maintained by Statistics Canada. The CCR contains information on cases diagnosed from 1992 onward, compiled from reports from each provincial/territorial cancer registry.
A file containing records of invasive cancer cases and in situ bladder cancer cases (the latter are reported for each province/territory except Ontario) was created using the multiple primary coding rules of the International Agency for Research on Cancer.Note3 Cancer cases were defined based on the International Classification of Diseases for Oncology, Third EditionNote4 and classified using Surveillance, Epidemiology, and End Results (SEER) Program grouping definitions, with mesothelioma and Kaposi sarcoma as separate groups.Note5
Mortality follow-up was carried out by record linkage to the Canadian Vital Statistics Death Database (excluding deaths registered in the province of Quebec), and from information reported by the provincial/territorial cancer registries. For deaths reported by a provincial registry but not confirmed by the national record linkage, the date of death was assumed to be that submitted by the reporting registry.
Predicted RSRs for 2005-2007 were derived using life tables centred on the 2006 Census of PopulationNote6 to estimate expected survival. These RSRs were the gold standard. The calculations were repeated using life tables centred on the 2001 CensusNote7 and on the 1996 Census.Note8 Differences in percentage units between the gold standard RSRs and the corresponding RSRs ascertained through the use of earlier life tables were determined.
Differences in expected survival that arise from the use of different life tables cannot be completely ascribed to natural changes in life expectancy over time, because the methodology used to derive the life tables also changed. Two major changes were introduced beginning with the 2005-2007 life tables—the first was related to the method of estimating old-age mortality, and the second was related to the method of smoothing age-specific death probabilities.Note9 Because the purpose of this analysis was to investigate the impact of reliance on historical data, no attempt was made to tease out the effects of these methodological improvements, which were considered to be minimal.Note9
Period method survival analysesNote10 were based on a publicly available algorithmNote11 incorporating the Ederer II methodNote12 with minor adaptations to increase precision (for example, determining attained age to three decimal places). Expected survival was derived from sex-specific complete provincial life tables. Complete life tables were not available for Prince Edward Island and the three territories because of their small populations; expected survival proportions for these areas were derived from abridged life tables for Canada and the affected jurisdictions and complete Canadian life tables using a method suggested by Dickman et al.Note13
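The general shape of such a calculation can be sketched as follows. This is a simplified illustration, not the published algorithm: the function names are hypothetical, withdrawals within an interval are ignored, and the inputs are assumed to be interval-specific (conditional) survival proportions already restricted to the period of interest.

```python
from math import prod


def interval_expected_survival(expected_probs_at_risk):
    """Ederer II-style expected survival for one interval (simplified).

    Averages the life-table probabilities of surviving the interval over
    the patients still at risk at the start of that interval.
    """
    return sum(expected_probs_at_risk) / len(expected_probs_at_risk)


def cumulative_rsr(observed_by_interval, expected_by_interval):
    """Cumulative relative survival ratio from interval-specific values.

    Cumulative observed and expected survival are each the product of the
    interval-specific proportions; the RSR is their ratio.
    """
    cumulative_observed = prod(observed_by_interval)
    cumulative_expected = prod(expected_by_interval)
    return cumulative_observed / cumulative_expected


# Hypothetical inputs: two annual intervals, two patients at risk in each.
expected = [
    interval_expected_survival([0.99, 0.97]),
    interval_expected_survival([0.98, 0.96]),
]
observed = [0.90, 0.80]
rsr_2_year = cumulative_rsr(observed, expected)
```

In this sketch, swapping in life-table probabilities from an older table changes only the `expected` list, which is how the comparisons in this study isolate the effect of the life table.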
Analyses were based on all primary cancers.Note14-16 Data from the province of Quebec were excluded because the method of determining the date of diagnosis in this province differed from that of the other provinces, and because of issues in correctly ascertaining the vital status of cases. Records were also excluded if: age at diagnosis was younger than 15 or older than 99; diagnosis was established through autopsy only or death certificate only; or year of birth or death was unknown.
The sex and age group distributions of cases diagnosed from 2000 to 2007 that were eligible for survival analysis are provided for each cancer studied and for all cancers combined. This period was chosen to describe the full cohort potentially used in the 5-year (focal duration) analyses, because the period method of survival, by its nature, does not pertain to any specific population.Note17
Among non-sex-specific cancers, those most skewed toward men were laryngeal (83% male), liver (76%), bladder (74%) and esophageal (73%) (Table 1). Conversely, 78% of thyroid cancer cases were diagnosed in women. The mean age at diagnosis was highest for cancers of the bladder (71 years), colon (70) and pancreas (70), and lowest for Hodgkin lymphoma (42), thyroid (48) and cervical cancer (49). Relative survival for 2005-2007 was highest for thyroid and prostate cancer (5-year RSR ≥ 95%) and lowest for pancreatic, esophageal, lung and bronchus, and liver cancer (< 20%), followed by brain and stomach cancer (< 25%) (Table 2).
Replacing expected survival estimates calculated from 2005-2007 life tables with those calculated from 2000-2002 life tables resulted in increases in RSRs for all cancers and all survival durations studied. For all cancers combined, increases in 1-, 5- and 10-year RSRs were 0.2, 0.8 and 1.7 percentage units, respectively. Increases were highest for prostate (0.4, 2.0 and 4.7 percentage units) and bladder cancer (0.4, 1.6 and 3.0), and lowest for brain (0.1, 0.1 and 0.1) and pancreatic cancer (0.1, 0.1 and 0.2).
Using expected survival estimates calculated from 1995-1997 life tables yielded even greater increases in RSRs. Patterns that emerged based on the 2000-2002 life tables were also evident, but differences in RSRs were approximately double when 1995-1997 life tables were used to estimate expected survival. For all cancers combined, increases in 1-, 5- and 10-year RSRs based on the 1995-1997 life tables were 0.4, 1.8 and 3.7 percentage units, respectively. Increases were highest for prostate (0.8, 4.3 and 10.2 percentage units) and bladder cancer (0.8, 3.4 and 6.6), and lowest for brain (0.1, 0.2 and 0.3) and pancreatic cancer (0.1, 0.2 and 0.4).
In general, the magnitude of differences in RSRs for 2005-2007 based on the use of historical rather than current life tables depended on the relative survival of the individual cancer. Cancers with a better prognosis (for instance, prostate) tended to be associated with large differences, while cancers with a poorer prognosis (for instance, pancreas) tended to be associated with small differences. Twelve of the 15 cancers with a 5-year RSR greater than 60% were associated with a bias of at least 0.6 percentage units in 5-year survival. Conversely, of the 8 cancers with relatively poor prognoses (RSR < 45%), the bias exceeded 0.4 percentage units only for multiple myeloma (0.6).
For all cancers combined, replacing expected survival estimates calculated from 2005-2007 life tables with those calculated from 2000-2002 life tables resulted in larger increases in 5-year RSRs among males (1.2 percentage units) than among females (0.5) (Table 3). Exclusion of sex-specific cancers (including breast cancer) reduced the increase among males to 0.8 percentage units, but the increase in females was not affected (data not shown). Similarly, for each cancer studied, increases among males were the same as or greater than those among females. The largest difference between the sexes in the increase was for bladder cancer (1.9 percentage-unit increase among males versus 0.7 among females).
Increases in 5-year RSRs resulting from the replacement of expected survival estimates calculated from 2005-2007 life tables with those from 2000-2002 life tables rose with advancing age. For all cancers combined, there was a 1.9-percentage-unit increase among people aged 75 to 99 at diagnosis; increases for those aged 65 to 74 or 55 to 64 were 0.8 and 0.4 percentage units, respectively. Among people aged 75 to 99 at diagnosis, increases were greatest for prostate cancer (4.6 percentage units), followed by skin melanoma (3.4), larynx (2.9) and bladder (2.8). The largest increases at ages 65 to 74 and 55 to 64 were also among those diagnosed with prostate cancer (1.7 and 0.7, respectively). Among people aged 15 to 54 at diagnosis, increases did not exceed 0.1 percentage units for any cancer except prostate cancer (0.2).
The previously noted doubling of increases in RSRs for 2005-2007 that resulted from the use of 1995-1997 life tables rather than 2000-2002 life tables occurred in both sexes and every age group for all cancers combined and virtually each cancer studied (data not shown).
This study provides empirical data on the nature and degree of bias introduced when historical rather than current expected survival data are used to calculate RSRs for cancer. Increases in 5-year RSRs resulting from the use of 2000-2002 rather than 2005-2007 life tables were highest for prostate and bladder cancer, for males and for people aged 75 to 99 at diagnosis, and rose with survival duration. Differences in survival were approximately double when life tables that were 10 rather than 5 years out-of-date were used.
The bias was related to differences in life expectancy in the life tables used to calculate expected survival information. Between 2000-2002 and 2005-2007, life expectancy in Canada rose by 1.3 years for males, and by 0.8 years for females.Note6,Note7 The corresponding increases between 1995-1997 and 2005-2007 were about twice as large, at 2.8 and 1.7 years, respectively.Note6,Note8 In addition, gains in individual age-specific probabilities of surviving from one age to the next over these periods rose with advancing age among people aged 55 or older.Note6-8
Analysis of the role of life tables in relative survival calculations for different cancers is complex.Note18 For example, the strongest bias was observed for prostate cancer—a male-specific cancer with an older-than-average mean age at diagnosis. Yet very little bias was associated with esophageal cancer, which has a male-dominated sex distribution and a slightly higher mean age at diagnosis.
Another factor to consider is the prognosis of a particular cancer. For two cancers that share the same expected survival under any given life table, a difference in expected survival between two life tables will produce a larger difference in relative survival for the cancer with the better prognosis.Note18 For example, consider two cancers with observed survival proportions of 80% (cancer 1) and 10% (cancer 2). If the expected survival derived for both cancers is 80% using life table 1 and 82% using life table 2, then the effect of using different life tables on cancer 1 will be 2.4 percentage units (80/80 - 80/82), and on cancer 2, 0.3 percentage units (10/80 - 10/82).
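The arithmetic in this example can be reproduced with a few lines of Python. The function names are illustrative only; the inputs are the proportions quoted in the text.

```python
def relative_survival(observed, expected):
    """Relative survival ratio: observed survival divided by expected survival."""
    return observed / expected


def life_table_effect(observed, expected_table_1, expected_table_2):
    """Difference in RSR, in percentage units, between two life tables."""
    rsr_1 = relative_survival(observed, expected_table_1)
    rsr_2 = relative_survival(observed, expected_table_2)
    return 100 * (rsr_1 - rsr_2)


# Cancer 1: observed survival 80%; cancer 2: observed survival 10%.
# Expected survival is 80% under life table 1 and 82% under life table 2.
effect_cancer_1 = life_table_effect(0.80, 0.80, 0.82)
effect_cancer_2 = life_table_effect(0.10, 0.80, 0.82)

print(round(effect_cancer_1, 1), round(effect_cancer_2, 1))  # 2.4 0.3
```

The same two-point change in expected survival shifts the RSR in proportion to observed survival, which is why cancers with a better prognosis show the larger absolute bias.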
The findings of this study support the importance of the prognosis for a given cancer in assessing the bias that results from the use of historical rather than current life tables. Cancers with a better prognosis tended to be associated with relatively large differences, while cancers with a poorer prognosis tended to be associated with relatively small differences. The most conspicuous exceptions were thyroid, Hodgkin lymphoma and cervix uteri, all of which had a good-to-excellent prognosis and small differences. Not coincidentally, the mean ages at diagnosis for these three cancers—particularly Hodgkin lymphoma—were the lowest among the cancers studied by a wide margin. Furthermore, the sex distributions for thyroid and cervical cancer were heavily or entirely skewed toward females. In short, less bias emerges when expected survival derived from the various life tables is more similar, and this can outweigh prognosis as a determinant of the magnitude of the bias.
The analyses were conducted using the period method, which provides more timely estimates of cancer survivalNote19-21 and has been used with increasing frequency in recent years.Note17 The results of period analyses reflect only the survival experience in the most recent period for which data are available. In turn, RSRs derived using period analyses rely exclusively on expected survival data from this most recent period. Long-term estimates of relative survival based on the cohort method require expected survival data spanning many more years—the most recent of which would, at best, apply solely to the final intervals of the analysis (for example, the 8th, 9th and 10th years in a 10-year survival analysis). Thus, period survival estimates would be affected to a greater extent than cohort-based estimates by the compensatory use of historical life tables.
The results of this analysis reflect the data in the national cancer registry, but not necessarily the results that would be obtained from data for individual provinces. The bias introduced by using 2000-2002 rather than 2005-2007 life tables to calculate predicted provincial RSRs is shown for selected provinces and cancers in Appendix Table A.
In Canada, data on expected survival have been derived from population-based quinquennial life tables produced by the Demography Division of Statistics Canada. In practice, expected survival for the calendar years 1994 to 1998 and 1999 to 2003 has been based on life tables (1995-1997 and 2000-2002) centred on the 1996 and 2001 Censuses, respectively.Note7,Note8 Before production of the 2005-2007 life tablesNote6 (which were centred on the 2006 Census), expected survival for the calendar years 2004 to 2007 was also based on the 2000-2002 life tables. During this time, several relative survival analyses were published,Note22-28 the interpretation of which would benefit from consideration of the findings of this study.
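The assignment of calendar years to life tables described here can be sketched as a simple lookup. The function and parameter names are hypothetical, and mapping 2004 to 2007 onto the 2005-2007 life tables once they became available is an assumption based on the text rather than a documented rule.

```python
def life_table_for_year(year, tables_2005_2007_available=True):
    """Return the label of the life table used for a calendar year of follow-up.

    Mirrors the practice described in the text; years outside 1994-2007
    are not covered by this sketch.
    """
    if 1994 <= year <= 1998:
        return "1995-1997"  # centred on the 1996 Census
    if 1999 <= year <= 2003:
        return "2000-2002"  # centred on the 2001 Census
    if 2004 <= year <= 2007:
        # Assumption: current tables are used when available; otherwise the
        # latest available (2000-2002) tables are extended forward.
        return "2005-2007" if tables_2005_2007_available else "2000-2002"
    raise ValueError(f"No life table assignment defined for {year}")


print(life_table_for_year(2006, tables_2005_2007_available=False))  # 2000-2002
```

Setting `tables_2005_2007_available=False` reproduces the situation in which the analyses cited above were carried out.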
Appendix Table A. Change in life expectancy at birth from 2000-2002 to 2005-2007 and extent of bias in predicted five-year relative survival ratios (RSR) for 2005-2007 introduced by using 2000-2002 life tables to derive expected survival, by sex, selected age groups, provinces and cancers, Canada excluding Quebec
As demonstrated by the simultaneous release of mortality and life table data by Statistics Canada,Note29 there is currently no delay between the production of life tables and the availability of the data on which they are based. In addition, improvements have been made to the derivation of expected survival that is used in the estimation of relative survival. First, starting with the 2006-2008 version,Note30 life tables centred on non-census years are being published. Consequently, in the future it likely will not be necessary to rely on historical expected survival data for current years of analysis. Second, the methodology used to produce life tables from 2005-2007 forwardNote9 has been retroactively applied to each calendar year between 1991-1993 and 2004-2006 (unpublished data). This provides not only for methodological consistency in the derivation of expected survival over time, but also for derivation of expected survival from life tables by single calendar year.
The extent to which international estimates of relative survival rely on non-current expected survival data is difficult to assess because studies often do not provide details on how such data were derived. In the United States, the Surveillance, Epidemiology, and End Results (SEER) Program has follow-up of cancer cases through the year 2010, although the most recent expected survival data are limited to 2007.Note31 In previous years, when the SEER Program relied strictly on decennial life tables, there were greater discrepancies between the latest year of follow-up and the latest year for which expected survival data were available. For example, the 2000 life table was extended to 2006 for the November 2008 submission of data, and three years earlier, the 1990 table was extended to 2003 (SEER*Stat Technical Support, Information Management Services, Inc.). In fact, that extension prompted the present analysis of the effect of using life tables that are 10 years out of date.
The findings of this paper are intended to increase awareness of the importance of maintaining timely expected survival data. Reliance on historical rather than current expected survival data in calculating RSRs for cancer may lead to consequential overestimation of survival. The increasing adoption of period survival methodology underscores the need for up-to-date information on expected survival.
It is also recommended that researchers be explicit in describing the source and coverage of their expected survival data when presenting estimates of relative survival.
The Canadian Cancer Registry is maintained by Statistics Canada. It comprises data supplied by the provincial and territorial cancer registries, whose cooperation is gratefully acknowledged.