Geozones: An area-based method for analysis of health outcomes

by Paul A. Peters, Lisa N. Oliver and Gisèle M. Carrière

Administrative datasets that contain information about health service use and events such as births and deaths are powerful tools in population health research. However, such datasets often lack information about health determinants (for example, income and education) and individual characteristics (for example, Aboriginal identity or country of birth), which can be important to understanding health disparities among and between certain groups. This article describes the Geozones methodology for calculating area-based thresholds of population characteristics derived from census results that can be applied to administrative data for use in the analysis of inequalities in health outcomes, health service use, or social characteristics.

Compared with individual-level measures, the advantages of area-based indicators are that they: consider the total population in a geographic area; yield statistically reliable and consistent estimates; detect differences between groups; and can be tracked over time and geographic location.¹ Area-based studies examining the relationship between neighbourhood income differentials and health outcomes in Canada have shown differences in injury, mortality, life expectancy, and potential years of life lost.^2-7 Geozones has been applied in previous analyses of geographic areas with high concentrations of immigrants,⁸ First Nations people,^9-12 and Inuit.^13,14

Geozones stems from residential segregation analysis and the calculation of threshold profiles of spatial concentration.^15,16 The proportion of a population subgroup in a geographic area is compared with the rest of the population or with other population subgroups in the same area. The resulting threshold definitions can be used for comparative analyses of areas with different levels of concentration of a particular characteristic.¹⁷

This article presents a guide to calculating Geozones, using the examples of concentration of the Aboriginal identity population and of income quintiles.

Methods

Geozones is based on population proportions and the comparison of different populations within specified areas, the results of which are used to create a typology of population concentration for a given level of geographic aggregation.

The first step is calculation of threshold tables for a specific subgroup and a comparison group at a given level of geographic aggregation. In Canada, Dissemination Area (DA), Census Tract or Census Subdivision levels are most commonly used.

In the second step, concentration curves are plotted to display the distribution of the subgroup across specified thresholds and to determine potential cut-points for low or high concentrations.¹⁸ These curves provide a visual representation of population concentrations, which aids in selecting an appropriate threshold quantile.

Third, based on examination of the threshold tables and concentration curves, the population is divided into quantiles (terciles, quintiles, deciles, etc.). This quantile definition is the basis of Geozones. The concentration ranges within the chosen quantile constitute a typology for comparing areas with different concentrations of the subgroup of interest.

Fourth, because the purpose of some analyses is to compare geographic areas with low or high percentages of a specific subgroup, quantile classification tables are created to determine appropriate cut-points.

Geographic unit of analysis

Selection of the geographic unit of analysis depends on the distribution of the subgroup of interest and the overall area under consideration. The level of geographic aggregation that is chosen influences the interpretation of results. For example, smaller areas have the advantage of increased variation and potentially improved discernment of local concentration, but they are more likely to produce spurious associations.¹⁹ As well, difficulties achieving adequate population counts may make larger geographic units preferable.

This study uses DAs, which consist of one or more urban city blocks or rural areas with a population of 400 to 700.²⁰ The DA was selected because it has 100% coverage and is the smallest geographic unit for which census population and dwelling characteristics are disseminated.

In this article, census data by self-identification as North American Indian, Métis or Inuit and by income quintile are examined at the DA level. The term "First Nations people" is used to refer to census respondents who reported their identity as North American Indian. Income quintiles are based on average household income at the national level. Although there were 54,626 DAs in 2006, this analysis is based on a somewhat smaller number: the 52,973 DAs for which the proportion of residents reporting Aboriginal identity could be determined or for which the population was large enough to calculate income quintiles. Aboriginal identity and income quintile could not be determined for DAs with fewer than 40 residents, for those with high global non-response, or for incompletely enumerated Indian Reserves.

Threshold tables

The threshold table method has been shown to be a robust means of comparing concentrations of subgroups at a regional level.^17,21,22 It allows for the production of tables and maps showing where subgroups form a majority, are dominant (modal), or exceed defined concentration levels.¹⁷ It is also the first step in creating a typology, according to which areas are classified based on the proportion of the subgroup of interest.¹⁶ The Geozones methodology described here uses the threshold profiles to compare health outcomes in areas with different concentrations of the subgroup.

For each subgroup of interest, the proportion it constitutes of the total population of each geographic unit (DA) is calculated. To measure concentration of that subgroup, the proportion living in geographic areas with a given percentage of the same group is calculated. By changing the denominator in this calculation, it is possible to measure the proportion of a subgroup that lives in geographic areas with a given percentage of a different subgroup, that is, the exposure of one subgroup to another.
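The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `threshold_table`, the dictionary layout and the three example areas are assumptions made for the sketch, and the bin boundaries (0 to 10%, more than 10% to 20%, and so on) are inferred from the column headings described in the text.

```python
def threshold_table(areas, group, reference=None, n_bins=10):
    """Share of `group` living in areas, binned by the percentage that
    `reference` constitutes of each area's total population.

    areas: list of dicts of integer counts, each with a "total" key.
    reference: defaults to `group` (concentration); naming another
    group gives the exposure of `group` to that group.
    """
    reference = reference or group
    group_total = sum(a[group] for a in areas)
    if group_total == 0:
        raise ValueError("the subgroup has no members in these areas")
    shares = [0.0] * n_bins
    for a in areas:
        if a["total"] == 0:
            continue  # no denominator, so the area cannot be classified
        # Integer ceiling of (reference / total) * n_bins, so that bins are
        # 0-10%, >10-20%, ..., >90-100% when n_bins is 10.
        upper = -(-a[reference] * n_bins // a["total"])
        shares[max(upper, 1) - 1] += a[group] / group_total
    return shares


# Hypothetical areas, for illustration only (not census data).
example_areas = [
    {"total": 500, "aboriginal": 25, "non_aboriginal": 475},
    {"total": 400, "aboriginal": 380, "non_aboriginal": 20},
    {"total": 600, "aboriginal": 95, "non_aboriginal": 505},
]

concentration = threshold_table(example_areas, "aboriginal")
exposure = threshold_table(example_areas, "aboriginal", reference="non_aboriginal")
print(concentration)
print(exposure)
```

Passing a different `reference` group is the "change of denominator" described in the text: the same subgroup is distributed across bins defined by another group's percentage.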

Concentration curves

Concentration curves illustrate the proportion of subgroups in geographic areas by selected thresholds.²¹ Concentration curves are created by plotting each row of the threshold tables. These curves are a means of determining if the selected thresholds are valid, and if the geographic areas represent the subgroup of interest. Although this stage is not essential, simultaneously displaying coverage and concentration is helpful in understanding the subgroup under consideration.

Quantile definition

The quantile range influences the interpretation of results and depends on the descriptive or analytic model. The quantile definition categorizes geographic areas as having low versus high percentages of the subgroup. By definition, each quantile contains an equal percentage of the subgroup, but an unequal number of geographic areas (in this case, DAs).

Quantiles are calculated by ranking the geographic areas from those with the lowest to the highest percentage of the subgroup. The first category of geographic areas that contains the desired percentage (one-third, one-fifth, etc.) of the subgroup is coded 1, the second is coded 2, and so on until all geographic areas are coded based on the chosen number of quantiles. Quintiles are used most often, although terciles, quartiles or other divisions could be employed. Selection of the quantile may be constrained by the size of the subgroup and the frequency of the outcome under consideration (for instance, hospitalization or cancer incidence). Quantile selection may also be influenced by the characteristic or determinant under study. For individual characteristics such as Aboriginal identity, the purpose may be to compare areas with a low or high percentage, but for health determinants such as income or education, the purpose may be to examine the gradients of concentration.
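The ranking and coding procedure can be illustrated with a short Python sketch. The function name `assign_quantiles`, the tuple layout and the example counts are hypothetical. The rule used here for an area that straddles a quantile boundary (it is placed in the quantile where its cumulative share ends) is an assumption, since the text does not specify how such cases are handled.

```python
def assign_quantiles(areas, n_quantiles=5):
    """Code each area 1..n_quantiles so that each code holds roughly an
    equal share of the subgroup.

    areas: list of (area_id, subgroup_count, total_count) with integer counts.
    """
    # Rank areas from the lowest to the highest percentage of the subgroup.
    ranked = sorted((a for a in areas if a[2] > 0), key=lambda a: a[1] / a[2])
    subgroup_total = sum(a[1] for a in ranked)
    if subgroup_total == 0:
        raise ValueError("the subgroup has no members in these areas")
    codes = {}
    cumulative = 0
    for area_id, subgroup, _total in ranked:
        cumulative += subgroup
        # Integer ceiling of the cumulative subgroup share times n_quantiles.
        code = -(-cumulative * n_quantiles // subgroup_total)
        codes[area_id] = min(max(code, 1), n_quantiles)
    return codes


# Hypothetical areas, for illustration only: (id, subgroup, total population).
example = [
    ("A", 10, 1000), ("B", 10, 500), ("C", 20, 400),
    ("D", 40, 200), ("E", 15, 50), ("F", 25, 40),
]
tercile_codes = assign_quantiles(example, n_quantiles=3)
print(tercile_codes)
```

In this example each tercile holds the same number of subgroup members, but the terciles contain three, one and two areas respectively, which mirrors the point that quantiles hold equal shares of the subgroup and unequal numbers of geographic areas.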

Data preparation

Constructing Geozones requires careful preparation of the data. From an epidemiological perspective, it is essential that the entire population-at-risk be included in the analysis. Thus, ensuring an appropriate numerator and denominator is important. Health administrative data (numerators) with complete population coverage, such as death certificates, acute-care hospitalizations and cancer registry statistics, should be coupled with denominators that also have complete population coverage—for example, area population counts by age and sex that include institutional residents.

Results

Aboriginal identity

Table 1 shows the threshold concentrations for the same-group population for all Aboriginal identity groups combined, First Nations people, Métis, Inuit, and non-Aboriginal people. Column headings indicate the percentage that the same-group population constitutes of the total DA population, by decile thresholds. Rows show the proportion of the group residing in DAs with the indicated percentage of the same-group population. For instance, in 2006, 38% of Aboriginal people lived in DAs where less than 10% of the population reported Aboriginal identity. However, another 26% of Aboriginal people lived in DAs where more than 90% of the population reported Aboriginal identity. By comparison, 95% of the non-Aboriginal population lived in DAs where more than 90% of the population identified as non-Aboriginal. The differences in the concentration profiles of First Nations people, Métis and Inuit in Table 1 demonstrate the importance of studying each group separately.

Table 1 Concentration of Aboriginal identity groups and non-Aboriginal population, by Dissemination Area (DA) decile threshold, metropolitan and non-metropolitan-influenced areas, Canada, 2006

When the degree of metropolitan influence is considered, a different picture emerges. In metropolitan-influenced zones, just 8% of the Aboriginal population lived in DAs where more than 90% of the population reported Aboriginal identity. By contrast, in non-metropolitan-influenced zones, 60% of the Aboriginal population lived in DAs where more than 90% of the population reported Aboriginal identity. The results differ among the three Aboriginal identity groups and for the non-Aboriginal population. For example, First Nations people were significantly more concentrated in non-metropolitan-influenced zones (70% lived in DAs where more than 90% of the population identified as First Nations people) than they were in metropolitan-influenced zones (13% lived in DAs where more than 90% of the population identified as First Nations people).

A change in the group column percentages shifts the focus from concentration to exposure—the extent to which subgroups live in areas with a specified percentage of another population group. Table 2 shows the proportions of the Aboriginal identity groups living in areas with varying percentages of non-Aboriginal people. For instance, 56% of Inuit, but only 3% of Métis, lived in DAs where fewer than 10% of the total population reported non-Aboriginal identity.

Table 2 Exposure of Aboriginal identity groups to non-Aboriginal population, by Dissemination Area (DA) decile threshold, metropolitan and non-metropolitan-influenced areas, Canada, 2006

Income quintiles

Threshold tables were also constructed for concentration and exposure of the population by household income quintile.

Table 3 shows that people in the highest (Q5) and lowest (Q1) income quintiles were the most concentrated. For instance, 10% of people in the lowest household income quintile and 9% of those in the highest lived in DAs where more than 60% of the population were in the same quintile (the sum of the four columns covering >60% to 100%). Fewer than 1% of the population in the other three household income quintiles lived in DAs where more than 60% of the population were in the same quintile.

Table 3 Concentration of population, by household income quintile and Dissemination Area (DA) decile threshold, Canada, 2006

Table 4 shows the exposure of people in the first four household income quintiles (Q1 to Q4) to people in the highest (Q5). Just 1% of people in the lowest household income quintile (Q1) lived in DAs where at least 50% of the population were in the highest (Q5). In fact, more than half (55%) of those in the lowest income quintile (Q1) lived in DAs where a small percentage (less than 10%) of the population were in the highest income quintile (Q5). By comparison, 43% of people in Q2, 34% in Q3 and 25% in Q4 lived in DAs where less than 10% of the population were in the highest income quintile.

Table 4 Exposure of lower household income quintiles to highest household income quintile, by Dissemination Area (DA) decile threshold, Canada, 2006

Plotting concentration and exposure

Based on the four threshold tables (Tables 1 to 4), concentration curves can be plotted for the various Aboriginal identity and household income groups.

The top panel in Figure 1 shows the concentration curve for the Aboriginal identity groups by the same-group thresholds at the DA level (first five rows of Table 1). A distinct U-shape is apparent in the distribution of First Nations people and Inuit, with large proportions either not concentrated (living in DAs with low percentages of the same group) or very concentrated (living in DAs with high percentages of the same group). This degree of concentration did not prevail for Métis, the majority of whom lived in DAs with a low percentage of Métis residents. The non-Aboriginal population, on the other hand, was very concentrated: 95% lived in DAs where more than 90% of the population was non-Aboriginal. The bottom panel in Figure 1 shows the exposure of Aboriginal identity groups to the non-Aboriginal population (first four rows of Table 2).

Figure 1 Concentration and exposure of Aboriginal identity groups and non-Aboriginal population, by Dissemination Area (DA) threshold, Canada, 2006

In Figure 2, the concentration and exposure of the population by household income quintile are shown by DA income thresholds. The top panel displays the concentration profiles of each household income quintile group (rows in Table 3), with people in the highest and lowest income quintiles more concentrated than those in the remaining quintiles. The bottom panel (rows in Table 4) shows that the people in the lowest household income quintile were less exposed to the population in the highest income quintile than were people in Q2, Q3 or Q4.

Figure 2 Concentration and exposure of household income quintile groups, by Dissemination Area (DA) threshold, Canada, 2006

Figures 1 and 2 illustrate how the distributions of individual characteristics such as Aboriginal identity (concentration and exposure have a U-shaped distribution) differ from the distributions of health determinants such as income quintiles (concentration and exposure appear as a gradient).

Selecting cut-points

Table 5 shows a further step of quantile classification statistics—the threshold, coverage, and concentration of the population who reported Aboriginal identity. Each successive row increases the population threshold, decreases the coverage, and increases the concentration of the Aboriginal population. For instance, a threshold of 0.10 (1st decile) includes 20,114 DAs, where, collectively, 2% of the population reported Aboriginal identity (98% reported non-Aboriginal identity). By comparison, a threshold of 0.90 (10th decile) includes only 363 DAs, where, collectively, 98% of the population reported Aboriginal identity.

Table 5 Quantile classification statistics (vigintiles), Aboriginal identity population, Canada, 2006

The data in this table can be used to select an appropriate cut-point for quantiles, where upper categories contain greater proportions of the subgroup. For instance, at the 0.80 threshold, 94% of the population reported Aboriginal identity. By comparison, at the 0.75 threshold, 82% of the population reported Aboriginal identity, and at the 0.70 threshold, 50%.
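Summary statistics of this kind can be approximated per quantile class. The Python sketch below is one possible reading, not a reconstruction of Table 5: the text does not give the exact definitions of threshold and coverage used in the table, so the three quantities reported here (number of areas, share of the subgroup held by the class, and percentage of the class population belonging to the subgroup) are assumptions, as are the function name `class_statistics` and the example counts.

```python
def class_statistics(areas, n_classes=5):
    """Per-class summary after ranking areas by subgroup percentage.

    areas: list of (area_id, subgroup_count, total_count) with integer counts.
    Returns {class_code: {"areas", "subgroup", "population",
                          "coverage", "concentration"}}.
    """
    ranked = sorted((a for a in areas if a[2] > 0), key=lambda a: a[1] / a[2])
    subgroup_total = sum(a[1] for a in ranked)
    if subgroup_total == 0:
        raise ValueError("the subgroup has no members in these areas")
    stats = {c: {"areas": 0, "subgroup": 0, "population": 0}
             for c in range(1, n_classes + 1)}
    cumulative = 0
    for _area_id, subgroup, total in ranked:
        cumulative += subgroup
        # Class in which the area's cumulative subgroup share ends (assumed rule).
        code = min(max(-(-cumulative * n_classes // subgroup_total), 1), n_classes)
        row = stats[code]
        row["areas"] += 1
        row["subgroup"] += subgroup
        row["population"] += total
    for row in stats.values():
        # Assumed definitions: share of the subgroup held by the class, and
        # share of the class population that belongs to the subgroup.
        row["coverage"] = row["subgroup"] / subgroup_total
        row["concentration"] = (
            row["subgroup"] / row["population"] if row["population"] else None
        )
    return stats


# Hypothetical areas, for illustration only: (id, subgroup, total population).
example = [
    ("A", 10, 1000), ("B", 10, 500), ("C", 20, 400),
    ("D", 40, 200), ("E", 15, 50), ("F", 25, 40),
]
stats = class_statistics(example, n_classes=3)
for code, row in stats.items():
    print(code, row["areas"], row["coverage"], row["concentration"])
```

Comparing the concentration of candidate upper classes in such output is one way to judge whether a cut-point isolates areas in which the subgroup forms a clear majority.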

The definition for Aboriginal Geozones in this paper uses quintiles, where the 0.80 threshold corresponds to the highest quintile (94% of the population in these DAs reported Aboriginal identity). Based on these results, Aboriginal identity quintiles can be mapped (Figure 3). At the national level, DAs in the 5th quintile (more than 80% of the population reported Aboriginal identity) were primarily located in rural areas, north of large urban centres, and largely in central and western parts of the country. DAs in the 4th quintile, which also had a large percentage of residents who reported Aboriginal identity (>60% to 80% of the population), were more common in urban areas. For instance, the distribution of Aboriginal DA quintiles in the Winnipeg urban area (Figure 4) shows strong concentration, with a cluster of DAs in the north of the city classified in the 4th quintile.

Figure 3 Dissemination Areas in upper (5th) Aboriginal quintile, Canada, 2006

Limitations

Geozones treats each geographic unit as a discrete entity, and ignores the population composition of adjacent units. However, administrative boundaries may not reflect differences in population composition: a DA with a high percentage of a subgroup may lie beside another DA with an equally high percentage of the same group. For example, the map of Winnipeg (Figure 4) shows considerable clustering of DAs with a high percentage of Aboriginal people. Therefore, some groups may be more or less concentrated than is suggested by the Geozones method, because aggregations of neighbouring DAs using different spatial configurations could change the level of concentration.

Figure 4 Aboriginal quintiles, by Dissemination Area, Winnipeg urban area, 2006

Inclusion of thresholds in linear models must be approached cautiously, as any unspecified spatial error may bias the results.²³ This can be accounted for by testing for the degree of spatial autocorrelation at local and global levels and including a spatial adjustment in the calculation.²⁴
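As an illustration of a global test, the Python sketch below computes global Moran's I with binary contiguity weights. It is a generic textbook formulation and not necessarily the procedure used in the cited work; the function name `morans_i`, the adjacency structure and the four example areas are assumptions, and no significance test or spatial adjustment is included.

```python
def morans_i(values, neighbours):
    """Global Moran's I with binary weights (1 if adjacent, else 0).

    values: dict area_id -> value (for example, subgroup proportion).
    neighbours: dict area_id -> iterable of adjacent area_ids; list each
    adjacency in both directions to keep the weights symmetric.
    """
    n = len(values)
    mean = sum(values.values()) / n
    deviation = {k: v - mean for k, v in values.items()}
    variance_sum = sum(d * d for d in deviation.values())
    cross_products = 0.0
    weight_sum = 0
    for i, adjacent in neighbours.items():
        for j in adjacent:
            cross_products += deviation[i] * deviation[j]
            weight_sum += 1
    if variance_sum == 0 or weight_sum == 0:
        raise ValueError("Moran's I is undefined for constant values or no neighbours")
    return (n / weight_sum) * (cross_products / variance_sum)


# Hypothetical proportions for four areas arranged in a row: a - b - c - d.
proportions = {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.2}
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(morans_i(proportions, adjacency))
```

A positive value indicates that areas with similar proportions tend to be adjacent, which is the situation in which unspecified spatial error is most likely to affect a linear model.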

Spatial methods of detecting local clusters, such as the Getis and Ord "hot-spots" or local Moran's I, could also be used to identify concentrations of population groups. However, these techniques focus on the distribution of a population in a local area rather than on identifying specific geographic areas of concentration. Thus, the results would be complementary and could be used in combination with the Geozones methodology to gain further insight into the spatial distribution of a population.

Most Geozones calculations can use national population distributions to create concentration curves and thresholds.⁸ However, nationally derived thresholds may favour some parts of the country over others. For instance, thresholds for immigrants based on national distributions would exclude much of Atlantic Canada, despite concentrations of immigrants in some areas. Changing the method to use locale-specific population distributions would produce different thresholds for each area. While this may benefit research focused on a specific sub-national geography (for example, Manitoba or Winnipeg), the results would not be nationally comparable.

Findings from analyses that use thresholds of First Nations, Métis or Inuit identity populations or ethnic minority groups cannot necessarily be generalized to the entire population of interest. Notably, the characteristics of the population in DAs where a high percentage of the population identifies as First Nations people, Métis or Inuit may differ from the characteristics of the population in DAs where the percentage of the local population with Aboriginal identity is low. Moreover, First Nations people, Métis and Inuit have different patterns of geographic concentration, and thus, aggregation into a single Aboriginal category must be interpreted accordingly. In particular, the geographic concentration of Métis tends to be low, so this approach would likely yield an insufficient concentration of high-percentage Métis areas. By contrast, 78% of Inuit live in one of 53 communities in the Inuit Nunangat land claims settlement area.²⁵

Integration of health administrative data with thresholds calculated at the DA level requires accurate coding to census geographic codes. With tools such as the Postal Code Conversion File (PCCF+), administrative records containing postal codes can be automatically geo-coded with census and other administrative identifiers.²⁶
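Once records carry a DA code, linking them to Geozones is a lookup followed by a rate calculation. The Python sketch below shows the idea with plain dictionaries. It does not use or reproduce PCCF+: the lookup tables, postal codes, DA identifiers, population counts and the function name `rates_by_geozone` are all hypothetical.

```python
from collections import Counter


def rates_by_geozone(records, postal_to_da, da_to_quantile,
                     population_by_quantile, per=100_000):
    """Count events by Geozone quantile and convert them to crude rates.

    records: iterable of dicts with a "postal_code" key (one per event).
    postal_to_da: dict postal code -> DA identifier (hypothetical lookup).
    da_to_quantile: dict DA identifier -> quantile code.
    population_by_quantile: dict quantile code -> denominator population.
    Returns (rates, number of records that could not be linked).
    """
    events = Counter()
    unlinked = 0
    for record in records:
        postal = record["postal_code"].replace(" ", "").upper()
        quantile = da_to_quantile.get(postal_to_da.get(postal))
        if quantile is None:
            unlinked += 1  # keep track of records lost in linkage
            continue
        events[quantile] += 1
    rates = {
        q: per * events[q] / population
        for q, population in population_by_quantile.items()
        if population > 0
    }
    return rates, unlinked


# Hypothetical inputs, for illustration only.
postal_to_da = {"A1A1A1": "DA001", "B2B2B2": "DA002"}
da_to_quantile = {"DA001": 1, "DA002": 5}
population_by_quantile = {1: 50_000, 5: 20_000}
records = [
    {"postal_code": "a1a 1a1"},
    {"postal_code": "B2B 2B2"},
    {"postal_code": "B2B2B2"},
    {"postal_code": "Z9Z 9Z9"},
]
rates, unlinked = rates_by_geozone(
    records, postal_to_da, da_to_quantile, population_by_quantile
)
print(rates, unlinked)
```

Reporting the number of unlinked records alongside the rates makes it easier to judge whether the numerator still covers the whole population at risk.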

Although not presented here, it is possible to calculate thresholds for multiple census years and track changes over time in the concentration of subgroups. Because geographic concentration may change significantly, analyses must use the appropriate threshold year and take potential changes in the underlying population into consideration.

Conclusion

Most health administrative databases in Canada do not contain socio-economic or ethnic identity information. Consequently, it is not possible to report on the health service use, morbidity, or mortality of population subgroups. However, geographic-based methods can be used to obtain such information and analyze relationships between health outcomes, health services use and socio-economic characteristics for areas with high concentrations of these subgroups. The Geozones technique is a method of identifying areas with low or high concentrations of specific population characteristics and gradients of socio-economic determinants.

Acknowledgements

We are grateful to Russell Wilkins for his insight and review of this manuscript. An earlier use of the Geozones method that measured life expectancy and health indicators in the Inuit-inhabited areas of Canada was conducted by Statistics Canada in partnership with Aboriginal Affairs and Northern Development Canada and Health Canada.