Social determinants of lung cancer incidence in Canada: A 13-year prospective study

by Debjani Mitra, Amanda Shaw, Michael Tjepkema and Paul Peters

Lung cancer is the most commonly diagnosed cancer and the leading cause of cancer death in Canada, with an estimated 26,100 new cases and 20,500 deaths in 2014.Note 1 Research consistently shows that lung cancer risk is inversely associated with socioeconomic status (SES).Note 2-19

Examining cancer outcomes by SES in Canada is challenging because cancer registration data do not usually contain information on individual-level socioeconomic characteristics.Note 20 Previous Canadian studies have explored associations between lung cancer and SES using small samples, ecological approaches in which neighbourhood-level markers of SES were attached to cancer data, or surveillance systems that employed case-control designs.Note 8,Note 12-16 These studies were limited by small sample size, lack of representativeness, and biases associated with exposure and outcome ascertainment.

The recent creation of a large, population-based cohort linking a sample of census respondents to the Canadian Cancer RegistryNote 21 overcomes many of these limitations and offers an opportunity to investigate the role of socioeconomic determinants of lung cancer risk in a large, representative sample of the Canadian population. It also allows for a better understanding of the etiology of lung cancer through the examination of histologic subtypes across socioeconomic gradients. Using data from the 1991 Canadian Census Cohort, this study quantifies the risk of lung cancer by individual measures of SES (educational attainment, income, and occupation) and examines associations by sex, age, and histologic subtype.

Data and methods

Data sources

The data are from the 1991 Canadian Census Cohort, the largest population-based cohort in Canada.Note 21,Note 22 Individuals were eligible for the cohort if they were aged 25 or older on June 4, 1991, enumerated by the 1991 Census long-form questionnaire, and successfully linked to a name file (consisting of 1990 and 1991 tax-filers) using standard probabilistic techniques. Exclusions were people living in institutions at baseline (1991), census undercoverage (3.4% of the population, including residents of 78 Indian reserves), and individuals who did not file taxes in either 1990 or 1991. Approximately 2.7 million individuals who completed the long-form census were successfully linked to the Canadian Cancer Database (from 1969 to 2003), the Canadian Mortality Database (from 1991 to 2006), and a residential postal code file (1984 to 2007) derived from tax-filer data.Note 21,Note 22

The cohort contains a rich breadth of demographic and socioeconomic information derived from the 1991 Census long-form questionnaire. Cancer information includes date of diagnosis, age, site, topography, morphology, laterality, and date of death, if applicable.Note 21 Each cohort member was followed from the day of the 1991 Census (June 4, 1991) to the date of censoring (first diagnosis of lung cancer, the date of emigration, the date of death or the last day of follow-up, whichever came first). Person-days of follow-up were divided by 365.25 to obtain person-years at risk (PYAR).
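The follow-up rule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `person_years` and its arguments are hypothetical, and the end-of-follow-up date is assumed to be December 31, 2003, the last day of cancer follow-up reported in the results.

```python
from datetime import date

BASELINE = date(1991, 6, 4)             # census day, from the text
END_OF_FOLLOW_UP = date(2003, 12, 31)   # assumed last day of follow-up

def person_years(diagnosis=None, emigration=None, death=None,
                 baseline=BASELINE, end=END_OF_FOLLOW_UP):
    """Person-years at risk for one (hypothetical) cohort member.

    Follow-up stops at the earliest of: first lung cancer diagnosis,
    emigration, death, or the last day of follow-up.
    """
    events = [d for d in (diagnosis, emigration, death) if d is not None]
    censor_date = min(events + [end])
    # Person-days divided by 365.25, as described in the text.
    return (censor_date - baseline).days / 365.25

# A member with no event contributes the full follow-up period.
full_follow_up = person_years()
# A member diagnosed one year after baseline contributes about one year.
one_year = person_years(diagnosis=date(1992, 6, 4))
```

Summing these values over all members of a group gives the PYAR denominator for that group's incidence rate.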

Incident cases of lung cancer were identified among the cohort. The outcome variable was the first primary lung cancer. Morphology was coded according to the International Classification of Diseases for Oncology, 2nd Edition (ICD-O-2). To better understand lung cancer etiology, individuals who developed a different primary cancer before lung cancer were excluded. Benign and in-situ tumours were also excluded. The analyses were conducted for all lung cancers combined (C34.0, C34.1, C34.2, C34.3, C34.8, C34.9), and separately for the main histological types: adenocarcinoma (ICD-O-2 8140, 8211, 8230-8231, 8250-8260, 8323, 8480-8490, 8550-8560, 8570-8572), squamous-cell carcinoma (ICD-O-2 8050-8076), small-cell carcinoma (ICD-O-2 8040-8045), large-cell including giant-cell, clear-cell and undifferentiated carcinoma (ICD-O-2 8012-8031, 8310), and unspecified carcinoma (ICD-O-2 8010-8011, 8032-8034).
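The histological grouping can be expressed as a simple lookup over the ICD-O-2 morphology ranges listed above. The sketch below is illustrative only: the names `HISTOLOGY_RANGES` and `classify_histology` are hypothetical, and the "other" label for codes outside the listed ranges is an assumption, since the text does not say how such codes were handled.

```python
# ICD-O-2 morphology code ranges (inclusive) for each group, as listed in the text.
HISTOLOGY_RANGES = {
    "adenocarcinoma": [(8140, 8140), (8211, 8211), (8230, 8231), (8250, 8260),
                       (8323, 8323), (8480, 8490), (8550, 8560), (8570, 8572)],
    "squamous-cell carcinoma": [(8050, 8076)],
    "small-cell carcinoma": [(8040, 8045)],
    "large-cell carcinoma": [(8012, 8031), (8310, 8310)],
    "unspecified carcinoma": [(8010, 8011), (8032, 8034)],
}

def classify_histology(morphology_code):
    """Map a four-digit ICD-O-2 morphology code to a histological group."""
    for group, ranges in HISTOLOGY_RANGES.items():
        if any(low <= morphology_code <= high for low, high in ranges):
            return group
    return "other"  # assumption: label for codes outside the listed ranges
```

For example, `classify_histology(8070)` falls in the 8050-8076 range and is grouped with squamous-cell carcinoma.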

For each cohort member, data were extracted on lung cancer diagnosis, date of diagnosis, date of birth, sex, and three SES variables: 1) highest level of education (less than secondary graduation, secondary graduation including trades certificate, at least some postsecondary short of a bachelor’s degree, and university degree); 2) income based on pre-tax low-income cut-offs; and 3) occupation based on the 1990 Standard Occupational Classification. Age-at-baseline, sex, and SES-specific incidence rates by five-year age groups were used to calculate age-standardized incidence rates (ASIRs) using the direct method. The 1991 mid-year population was used as the standard population. Based on methods described in detail elsewhere,Note 21,Note 22 ASIRs were used to calculate rate ratios (RRs), rate differences (RDs), and corresponding 95% confidence intervals (CIs). Absolute excess incidence was calculated by subtracting the ASIR of those in the highest SES categories (university degree, highest income quintile, managerial occupation) from the ASIR of the total cohort. This difference represents the number of new lung cancer cases per 100,000 PYAR that could have been avoided if all cohort members had experienced the incidence rate of those in the highest SES categories.
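As a rough illustration of these calculations, the sketch below computes a directly standardized rate, a rate ratio and a rate difference. It does not reproduce the authors' method, which is described in the cited references: the Poisson variance for each age-specific rate, the log-scale interval for the RR and the normal-approximation interval for the RD are common textbook choices assumed here, and all function names are hypothetical.

```python
import math

def direct_asir(cases, pyar, std_pop):
    """Directly age-standardized rate per 100,000 PYAR and its standard error.

    cases, pyar and std_pop are equal-length lists, one entry per age group.
    Assumes a Poisson variance for each age-specific rate.
    """
    total_std = sum(std_pop)
    rate, variance = 0.0, 0.0
    for d, n, w in zip(cases, pyar, std_pop):
        weight = w / total_std              # share of the standard population
        rate += weight * d / n              # weighted age-specific rate
        variance += weight ** 2 * d / n ** 2
    return rate * 1e5, math.sqrt(variance) * 1e5

def rate_ratio(asir1, se1, asir0, se0, z=1.96):
    """RR with a 95% CI computed on the log scale (assumed method)."""
    rr = asir1 / asir0
    se_log = math.sqrt((se1 / asir1) ** 2 + (se0 / asir0) ** 2)
    return rr, rr * math.exp(-z * se_log), rr * math.exp(z * se_log)

def rate_difference(asir1, se1, asir0, se0, z=1.96):
    """RD with a 95% CI from a normal approximation (assumed method)."""
    rd = asir1 - asir0
    se = math.sqrt(se1 ** 2 + se0 ** 2)
    return rd, rd - z * se, rd + z * se

# Hypothetical two-age-group example, not data from the cohort.
asir, se = direct_asir(cases=[10, 20], pyar=[100000, 100000], std_pop=[1, 1])
```

In practice, the reference group (index 0) would be the highest SES category, so that RRs above 1 and RDs above 0 indicate higher incidence in the comparison group.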


From June 4, 1991 to December 31, 2003, 215,700 of the 2,734,835 cohort members were diagnosed with at least one type of cancer. Lung cancer was the most common diagnosis, accounting for 14% of all incident cases and representing 30,075 cohort members (19,220 men and 10,855 women). The average ASIRs of lung cancer among men and women in the cohort were 123 and 69 cases per 100,000 PYAR, respectively. Appendix A shows the characteristics of members of the cohort by sex and SES.


Lung cancer ASIRs showed a stepped gradient by educational attainment, with the highest incidence among men and women with the lowest level of education. Compared with men who had a university degree, the ASIR was 1.5 times higher for those with a postsecondary diploma, 2.1 times higher for those with a secondary school diploma, and 2.8 times higher for those with less than a secondary school diploma (Table 1). The gradient was similar for women, among whom the corresponding rate ratios (RRs) were 1.6, 2.1, and 2.7.

When examined by age at baseline, a gradient in lung cancer risk by educational attainment was evident for both sexes at ages 25 to 44, 45 to 64 and 65 to 79, although relative inequalities decreased with age (Table 2). The highest RR was among women aged 25 to 44 with less than secondary education (RR = 4.04), followed by men in the same age/education group (RR = 3.93).

The absolute differences in incidence rates (RD) between those with the lowest versus highest level of education were 95.4 cases per 100,000 PYAR for men and 51.5 cases per 100,000 PYAR for women (Table 1). The absolute education-related excess incidence showed that if all cohort members had experienced the incidence rate of those with a university degree, lung cancer incidence would have been 56% lower among men and 55% lower among women, representing 68.9 and 36.5 fewer new cases per 100,000 PYAR, respectively.
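The arithmetic behind the excess-incidence figures can be checked with a short sketch. The function names are hypothetical, and the inputs are the rounded values reported in the text, so recomputed percentages may differ from the published ones by a point or two.

```python
def absolute_excess(asir_total, asir_reference):
    """Cases per 100,000 PYAR avoided if everyone had the reference-group rate."""
    return asir_total - asir_reference

def percent_excess(excess, asir_total):
    """Excess incidence as a percentage of the total-cohort ASIR."""
    return 100.0 * excess / asir_total

# Men, education: 68.9 fewer cases per 100,000 PYAR against a cohort ASIR of 123.
men_education_pct = percent_excess(68.9, 123.0)  # about 56%
```

The same calculation applies to the income and occupation comparisons, with the highest income quintile or managerial occupations as the reference group.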


Results also showed a stepped gradient in lung cancer incidence rates among men and women by income quintile. Compared with men in the highest quintile (Q5), the ASIR was 1.25 times higher for those in the second-highest quintile (Q4), 1.47 times higher for those in the middle quintile (Q3), 1.75 times higher for those in the second-lowest quintile (Q2), and 2.11 times higher for those in the lowest quintile (Q1). The gradient was similar for women, among whom the RRs for the respective comparisons were 1.16, 1.28, 1.40, and 1.81 (Table 1).

A gradient in lung cancer risk was observed for all age groups, although relative inequalities were less clear among people aged 80 or older (Table 2). Lung cancer risk was highest for men and women aged 45 to 64 in the lowest income quintile.

Data on absolute inequalities showed that absolute differences in incidence rates (RD) were 91.1 cases for men and 40.7 cases for women per 100,000 PYAR in the lowest (Q1) versus the highest income quintile (Q5). Income-related absolute incidence excess showed that if all cohort members had experienced the incidence rate of those in the highest quintile, lung cancer incidence would have been 33% lower among men and 25% lower among women, representing 40.9 and 16.7 fewer new cases per 100,000 PYAR, respectively.


Lung cancer incidence rates were highest among men and women employed in unskilled jobs or with no occupation, and lowest for those in managerial occupations. Compared with men in managerial occupations, the ASIR was 1.39 times higher for those in professional occupations, 1.65 times higher for those in skilled, technical or supervisory occupations, 2.06 times higher for those in semi-skilled occupations, 2.28 times higher for those in unskilled occupations, and 3.06 times higher for those without an occupation. A similar, but less steep, gradient was observed for women in skilled, technical or supervisory occupations (RR = 1.43), semi-skilled occupations (RR = 1.59), unskilled occupations (RR = 1.82) or no occupation (RR = 2.07). However, unlike the pattern among men, incidence was higher among women in professional occupations than among those in the more technical trades.

Relative inequalities in incidence by occupation were most pronounced among men and women aged 25 to 44 and diminished with age (Table 2).

Compared with men in managerial occupations, the absolute differences in lung cancer incidence rates were 117.7 cases per 100,000 PYAR for those without an occupation, and 73.2 cases per 100,000 PYAR for those in unskilled occupations. The corresponding differences for women were 33.6 cases (no occupation) and 21.7 cases (unskilled occupations) per 100,000 PYAR. Occupation-related absolute incidence excess showed that if all cohort members had experienced the incidence rate of those in managerial occupations, lung cancer incidence would have been 54% lower among men and 44% lower among women, representing 65.6 and 29.2 fewer new cases per 100,000 PYAR, respectively.


A negative gradient in lung cancer risk was apparent for squamous cell carcinoma and small-cell carcinoma for all three SES indicators, but the pattern was less consistent for adenocarcinoma, where a negative gradient was evident only for income and education (Table 3).

The highest RR (3.34) was among individuals diagnosed with squamous cell carcinoma who had less than secondary school graduation, compared with university graduates. A similar but more attenuated risk was observed for income and occupation (Table 3).

Associations were similar for small-cell carcinoma. Individuals with less than secondary school graduation had 2.54 times the risk of small-cell carcinoma compared with university graduates. Individuals in the lowest income quintile had twice the risk compared with those in the highest income quintile, and those without an occupation had 1.53 times the risk of those in managerial occupations.

The risk of adenocarcinoma followed a negative gradient by education and income, but no clear pattern emerged for occupation. Numbers for large-cell and undifferentiated carcinomas and for unspecified carcinomas were too small to report.


The socioeconomic gradient in lung cancer risk reflects differences in the prevalence of risk factors such as smoking, diet, and environmental and occupational exposure among SES groups.Note 3,Note 6-10,Note 13,Note 14,Note 18,Note 19 Previous studies have reported an inverse association between SES and lung cancer risk in men,Note 2-18 but the evidence has been less conclusive for women, with studies finding inverse, positive, or no associations after adjustment for smoking and other confounding variables.Note 2,Note 4,Note 8,Note 9,Note 19 According to the present analysis, lung cancer risk was inversely associated with education, income, and occupation in both sexes, although associations were weaker for women.

The association between lung cancer risk and SES was strongest for education. In other studies, education has been more frequently associated with disease than other social indicators.Note 18,Note 23 While the reason for this relationship cannot be determined here, previous research suggests that education is a predictor of income and occupation and a stronger determinant of health behaviours than other socioeconomic indicators, as it is often acquired early in life.Note 23,Note 24 In addition, education is a more inclusive indicator than income or occupation because it applies to individuals outside the workforce, and does not depend on regional definitions of households and differences in cost of living.Note 23 The unadjusted RRs for educational inequalities in this study were higher than those previously reported in the United States and Canada (pooled unadjusted RR: 1.84, 95% CI: 1.56-2.19).Note 19

While education was the indicator most strongly associated with lung cancer risk in women, among men, risk was highest in the semi-skilled/unskilled and no occupation categories. Research indicates that a larger percentage of female than male cohort members were either not in the labour force or unemployed: 42% versus 28%.Note 25 Sex-specific differences in lung cancer risk by occupation may reflect the possibility that a larger percentage of women in this cohort were unemployed or not in the labour force, but still lived in relative affluence. Moreover, this finding may imply that exposures to workplace carcinogens are more common among men in semi-skilled and unskilled occupations than among women, a hypothesis that may be further informed by yet-to-be-published research examining the risk of lung cancer among those in high-risk occupations (welders and miners) using data from the cohort.Note 26

Relative inequalities in lung cancer incidence were generally greater in younger age groups, and decreased with age, a finding supported by other longitudinal studies.Note 19,Note 24 While SES inequalities appeared to persist into old age, they were less pronounced. Because the risk of lung cancer rises with age, the impact of SES on risk may be diminished in older age groups. Steeper risk gradients in SES among younger cohort members compared with older members are also compatible with research showing that SES inequalities in smoking prevalence have increased in Canada over the last 60 years despite a steady decline in overall adult smoking prevalence (from more than 40% in 1965 to 16% in 2012).Note 27,Note 28 This is characteristic of countries that have reached stage IV of the smoking epidemic, where SES differences in smoking appear to be increasing and are likely to lead to widening inequalities in lung cancer incidence in the future.Note 28,Note 29

Results by histology showed a distinct negative SES gradient in the risk of squamous cell and small-cell carcinoma for all three SES indicators. A negative gradient in the risk of adenocarcinoma was apparent for education and income, but no consistent association with occupation emerged. This is in line with research showing that the association with cigarette smoking (to which 85% to 90% of lung cancer cases are attributedNote 7,Note 8,Note 30,Note 31) prevails for all histological subtypes, but is strongest for squamous cell cancers and small-cell cancers, followed by adenocarcinoma.Note 32-34

Strengths and limitations

A major strength of this study is the large, representative, population-based cohort that allows for the examination of several SES variables simultaneously.Note 21,Note 22 The granularity of socioeconomic data available for the cohort makes it possible to explore individual-level data, in contrast to area-based measures that can mask inequalities evident at the individual level.Note 35 Because the sample size and the length of the follow-up period increase statistical power, cancers with long latency periods can be examined and analyses can be conducted by subsite.

A limitation of the cohort is that it pertains only to people aged 25 or older, and excludes residents of institutions, people not enumerated by the 1991 Census long-form questionnaire and those who did not file taxes in 1990 or 1991.Note 21,Note 22,Note 25 As well, although the cohort is broadly representative of most groups in the Canadian population, some characteristics differ.Note 22,Note 35 For example, owing to the nature of the linkage, the cohort under-represents rural residents (less precise postal codes for matching), people with less than secondary graduation (who are less likely to be employed), people not in the labour force, and those in the lowest income quintile (both groups are less likely to be tax-filers).Note 22,Note 25 Moreover, the cohort was not disease-free at baseline, so some members may already have had underlying conditions that could have contributed to the risk of lung cancer.Note 21,Note 22,Note 25

Another limitation is that SES characteristics were captured at a single point in time, which does not allow for examination of changes in exposures over time. As well, the dataset does not include important behavioural and environmental risk factors. Having data on key risk factors such as smoking and an extension of the follow-up period would make it possible to understand historical and birth cohort effects of smoking and their impact on lung cancer inequalities in the context of an evolving smoking epidemic in Canada.

An area for further research is the extent to which smoking and other risk factors may explain the SES gradient in lung cancer risk.Note 9,Note 24,Note 36 Previous studies indicate that adjusting for smoking can decrease SES differences in lung cancer risk by 50% to 65%, and that more complete adjustments for smoking can almost eliminate the association.Note 9,Note 37 This suggests a potential for misclassification of lifetime exposure to tobacco due to the use of proxy subjects or self-reported behaviour (often captured at a single point in time) and broad risk categories (ex-/never-/current smokers), and to the challenges of measuring smoking intensity over time.Note 9,Note 37 Occupational exposures can account for up to 14% of inequalities after adjustment for smoking and fruit and vegetable consumption.Note 36 Findings of the few studies on diet (fruit and vegetable consumption) are not consistent,Note 9 nor are studies of exposures such as radon in the home and air pollution, although it is possible that these risks may contribute to residual differences in lung cancer incidence.Note 12,Note 13,Note 31


This large-scale, nationally representative cohort study showed a negative socioeconomic gradient in lung cancer incidence rates in both men and women. Inequalities in lung cancer risk were particularly pronounced for histologies more strongly associated with smoking. In the future, it will be important to explore the role of smoking, occupational exposures, and diet in lung cancer risk to understand the extent to which inequalities remain after adjustment for these known risk factors. Linkages of the 1991 Census Cohort to surveys such as the Canadian Community Health Survey or the Canadian Tobacco Use Monitoring Survey may be helpful in this respect. Extension of the follow-up period may allow for the study of cohort effects.


The authors thank Robert Semenciw and Les Mery for their support. Funding for this analysis was provided by the Public Health Agency of Canada. Funding for the creation of the Canadian Census Cohort was provided by the Canadian Population Health Initiative of the Canadian Institute for Health Information, the Healthy Environment and Consumer Safety Branch of Health Canada, and the Health Analysis Division of Statistics Canada. The authors also acknowledge Canadian provincial and territorial vital statistics registrars and the Canadian Cancer Registry.
