The changing landscape of Canadian metropolitan areas
Appendix B. Data sources and methods

Data sources

This study combines data from the 1971, 1991, 2001 and 2011 Census of PopulationNote 1 and Interpolated Census of AgricultureNote 2 with spatial data sets in order to analyze the evolution of built-up areas in and around CMAs.

Historical land cover and land use data are taken from the Canada Land Inventory (CLI), undertaken under the Agricultural Rehabilitation and Development Act (ARDA), 1961 and supplemented with data from the 1971 Canada Land Use Monitoring Program (CLUMP): Land Use. The CLI was a comprehensive survey of land capability and use for various purposes including agriculture, forestry, recreation and wildlife and included information on the existing land use. Land was classified based on air photo interpretation, field surveys and census information. Soil characteristics were determined by soil surveys. The CLI mapping of southern Canada took close to two decades to complete, producing over 1,000 map sheets at the 1:250,000 scale.

Data from Canada Land Inventory: Land Use (CLI: LU)Note 3 formed the basis for the 1971 built-up area. Land use categories in this product are: built-up area; mines, quarries, sand and gravel pits; outdoor recreation areas; horticulture, poultry and fur operations; orchards and vineyards; cropland; improved pasture and forage crops; rough grazing and rangeland; productive woodland; non-productive woodland; swamp, marsh or bog; sand, sandbars, sand flats, dunes and beaches; and rock and other unvegetated surfaces.

The CLUMPNote 4 used CLI-compatible land use classes, but focused on land use change in the urban-rural fringe of Canada's 23 largest urbanized regions,Note 5 the boundaries of which do not represent administrative, census or other legal limits, but simply a spatial extent set by the program. Mapping was based on aerial photo interpretation, with supplementary information from field surveys, street maps, municipal planning maps and satellite images.Note 6

Canada Land Inventory: Soil Capability for Agriculture, 1969Note 7 provided information on the potential of a specific area for agricultural production. Despite the vintage of this product and the availability of more recent soil data for some areas, the soil capability interpretations are considered to be largely valid and they continue to be used for land planning purposes.Note 8

The CLI and CLUMP maps were digitized in Environment Canada's Canadian Geographic Information System in the 1960s. Significant research and development by Agriculture and Agri-Food Canada (AAFC), National Archives of Canada, Natural Resources Canada (NRCan) and Statistics Canada (StatCan) was involved in recovering these data and converting them to ArcInfo in the mid-1990s.Note 9

Remote sensing imagery data are taken from AAFC's Land Use 1990, 2000 and 2010.Note 10 These land use maps cover all of Canada south of 60°N at a spatial resolution of 30 m and were developed to meet international reporting requirements including those for the National Inventory Report to the United Nations Framework Convention on Climate Change, the Agri-Environmental program of the Organisation for Economic Co-operation and Development (OECD) and the FAOSTAT component of the Food and Agriculture Organization of the United Nations (FAO).

This land use product includes the following classes: settlement, roads, water, forest, forest wetland, trees, treed wetland, cropland, grassland managed, grassland unmanaged, wetland, wetland shrub, wetland herb and other land.

Data product specifications indicate that source data include land cover and crop maps, as well as various topographical layers from CanVecNote 11—a digital cartographical reference product produced by NRCan. Imagery for 1990 was based on satellite data taken between 1988 and 1994; the 2000 image, from 1988 to 2002 and the 2010 image, from 2009 to 2012. The estimates of the overall accuracy of this product are 84.0%, 87.1% and 92.7% for 1990, 2000 and 2010 respectively. See Table B.1 for the accuracy assessment of the Settlements class.

The data product specifications report that most misclassification occurs between the following categories: other land and forest, grassland and forest, cropland and forest, and wetland and forest, and that most errors occur with boundary pixels. Boundary pixels are located on the fringe of a given land use area.

As well, AAFC Crop Inventory, 2011 provided detail for forest dataNote 12 and NRCan's CanVec+ provided the water data.Note 13

Each data set is subject to limitations. In particular, the accuracy of land cover classification using spatial data sets depends on the resolution of the data and imagery dates and is also limited by the similarity of certain land covers when viewed from above and by cloud and tree canopy cover, which can obscure underlying land features. Metadata for these data sources are available through the provided references.


The report uses a consistent methodology to compare urban development trends across the country, allowing inter-city comparison. However, it is recognized that this broad scale analysis does not capture the finer details that are required to assess all the environmental impacts of development in and around cities.

Spatial units

Data for this report are tabulated according to two main geographies—the census metropolitan area (CMA) and census metropolitan area-ecosystem (CMA-E).

The CMA is a Census of Population geography.Note 14 The 2011 CMA boundaries were used for all years to produce data that are comparable over time.Note 15 For this reason, population data by CMA for 1971, 1991 and 2001 may not match previously published census data.

The CMA-E was created for this analysis and combines all Soil Landscapes of Canada (SLC)Note 16 polygons that are contained within or that intersect the boundary of the CMA, as well as SLC polygons that are fully contained within this newly formed boundary of the CMA-E. See Map B.1 for a visual depiction.

CMAs and CMA-Es vary in size, shape and topography. Caution should be used when comparing data for CMA-Es. CMA-Es are not spatially mutually exclusive: they overlap where an SLC polygon crosses more than one CMA boundary, as is the case in Vancouver and Abbotsford–Mission or Toronto and surrounding CMAs including Hamilton, Oshawa, Kitchener–Cambridge–Waterloo, Guelph, Brantford and Barrie. For this reason, CMA-E data should not be summed to generate a total.
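The double counting that results from summing overlapping CMA-Es can be shown with a toy example. This is a sketch only: the polygon identifiers, CMA-E names and areas are hypothetical and are not taken from the report.

```python
# Hypothetical SLC polygon areas (km2) and CMA-E membership.
slc_area_km2 = {"slc_1": 100.0, "slc_2": 150.0, "slc_3": 50.0}
cma_e_polygons = {
    "cma_e_a": {"slc_1", "slc_2"},
    "cma_e_b": {"slc_2", "slc_3"},  # slc_2 crosses both CMA boundaries
}


def cma_e_area(name):
    """Total area of the SLC polygons that make up one CMA-E."""
    return sum(slc_area_km2[p] for p in cma_e_polygons[name])


# Summing CMA-E totals counts the shared polygon twice.
naive_total = sum(cma_e_area(name) for name in cma_e_polygons)

# Counting each polygon once gives the actual combined footprint.
distinct_polygons = set().union(*cma_e_polygons.values())
combined_area = sum(slc_area_km2[p] for p in distinct_polygons)

double_counted = naive_total - combined_area  # area of the shared polygon
```

Here the naive sum exceeds the combined footprint by exactly the area of the shared polygon.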

Data from the Census of Agriculture are not available by CMA. As natural and semi-natural land is calculated as a residual, it is also not available for CMAs. For this reason, this report provides data for arable land and natural and semi-natural land only by CMA-E. These areas, which include the Soil Landscapes of Canada polygons surrounding the CMA, are a useful geography for presenting information on metropolitan areas' arable and natural and semi-natural land assets.

Built-up, settled and road areas

Built-up area is land that is predominantly built-up or developed, including the vegetation associated with these land covers, such as gardens and parks. It is characterized by a high percentage of impervious surfaces including roadways, parking lots and roof tops. Low-density dwellings and small structures or buildings in rural areas outside core built-up areas may not be captured due to the resolution of the data and overlying tree canopy.

In this report, settled area is defined as built-up area not including roads.

Built-up area for 1971 was estimated using the CLI: LU circa 1966 land use code B – Urban built-up area for all 33 CMAs covered in this report, supplemented by data from the CLUMP 1971 for 24 of the CMAs. CLUMP coverage excludes the CMAs of Moncton, Trois-Rivières, Sherbrooke, Peterborough, Kingston, Barrie, Brantford, Kelowna and Abbotsford–Mission. As a result, 1971 built-up areas may be underestimated for these CMAs. On average, CLUMP contributed 27% of the built-up area for CMAs where it was available (Table B.2).

Where CLI: LU polygons carried more complex land use information (secondary or tertiary land uses), this analysis used only the primary, or dominant, land use code.

By overlaying the data from AAFC's Land Use, 1990, 2000 and 2010 over the CLI and CLUMP layer, it was determined that in some cases, the built-up area from these earlier data sets overbounded the built-up area from later years. Built-up area normally remains built over time, a logic rule used in AAFC Land Use.Note 17 Since it is assumed that the data quality of the remote sensing product is better than that of the CLI: LU and CLUMP, CLI: LU and CLUMP built-up areas that were not built-up in the 1990 data set were removed from the 1971 built-up data and reclassified according to their 1990 land use. In total, 1,752 square kilometres were trimmed from the CMA built-up area from the CLI and CLUMP layer.
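The trimming rule can be illustrated on a toy grid. The following is a minimal sketch, not the GIS workflow used for the report: it assumes both layers have been rasterized to the same cells, and the class labels and the function name trim_1971_built_up are hypothetical.

```python
# Assumed labels for the 1990 classes that count as built-up.
BUILT_UP_1990 = {"settlement", "roads"}


def trim_1971_built_up(cells_1971, cells_1990):
    """Return the trimmed 1971 layer and the number of cells trimmed.

    Both inputs are equal-length lists of class labels, one per grid cell.
    A cell that is built-up in 1971 but not built-up in 1990 is
    reclassified to its 1990 land use.
    """
    if len(cells_1971) != len(cells_1990):
        raise ValueError("layers must cover the same cells")
    trimmed = []
    n_trimmed = 0
    for c71, c90 in zip(cells_1971, cells_1990):
        if c71 == "built-up" and c90 not in BUILT_UP_1990:
            trimmed.append(c90)
            n_trimmed += 1
        else:
            trimmed.append(c71)
    return trimmed, n_trimmed


layer_1971 = ["built-up", "built-up", "cropland", "built-up"]
layer_1990 = ["settlement", "forest", "cropland", "roads"]
trimmed_layer, trimmed_count = trim_1971_built_up(layer_1971, layer_1990)
```

Only the second cell is trimmed in this example, because it is the only 1971 built-up cell that the 1990 layer does not confirm as built-up.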

The CLI: LU and CLUMP built-up data include roads located within the core built-up areas; however, data for roads outside these areas are not available. The area of these roads therefore needed to be modeled for 1971 in order to enable comparisons over time. As a first step, roads in the core built-up areas were identified using Land Use, 1990 and were removed from the 1971 base layer to isolate the settled area. Next, the road area was estimated by multiplying the 1971 settled area by the ratio of road area to settled area taken from AAFC's Land Use, 1990. This assumes that the ratio of roads to settled area remained constant between 1971 and 1990. The area of these modeled roads is not spatially explicit—it is provided as a total area for the CMA and CMA-E and cannot be attributed to a specific location.
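The road model described above reduces to a single ratio calculation. The sketch below uses made-up areas purely for illustration; the function name and the input values are assumptions, not figures from the report.

```python
def model_road_area_1971(settled_1971_km2, settled_1990_km2, roads_1990_km2):
    """Estimate 1971 road area from the 1990 ratio of road to settled area.

    Assumes, as the text does, that the ratio of road area to settled
    area was constant between 1971 and 1990.
    """
    if settled_1990_km2 <= 0:
        raise ValueError("1990 settled area must be positive")
    road_to_settled_ratio = roads_1990_km2 / settled_1990_km2
    return settled_1971_km2 * road_to_settled_ratio


# Illustrative values only (km2).
settled_1971 = 200.0
roads_1971 = model_road_area_1971(settled_1971, 400.0, 100.0)

# Built-up area is settled area plus roads.
built_up_1971 = settled_1971 + roads_1971
```

The result is a single total per CMA or CMA-E. It carries no location information, which is why later overlays use settled area rather than built-up area.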

Built-up areas, settled areas and roads were estimated for the 1991, 2001 and 2011 reference years using classes 21 – Settlements and 25 – Roads from AAFC's Land Use, 1990, 2000 and 2010.

Arable land and agricultural land by soil capability

Arable land is represented in this analysis using cropland, tame or seeded pasture and summerfallow data from the Census of Agriculture, which is consistent with the variables used by the Food and Agriculture Organization of the United Nations. Data for arable land do not indicate the amount of land that is potentially cultivable.

This analysis focused on arable land rather than total farm area. Census of Agriculture variables 'natural land for pasture' and 'all other land,' which includes land such as wetland and woodland on farms, were not included in recognition of the higher habitat values of these types of farmland.Note 18

Since Census of Agriculture data are not available by CMA, arable land for 1971, 1991, 2001 and 2011 is calculated for CMA-Es using the Interpolated Census of Agriculture, which aggregates dissemination area (DA) and enumeration area (EA) data from the Census of Agriculture by soil landscape and drainage area units.

In the data provided by StatCan to AAFC, farm area data are spatially referenced to the location of the farm headquarters, rather than distributing each field to the actual location.Note 19 AAFC and StatCan interpolated the data using an area-weighting process to reallocate DA and EA data to the SLC polygons.Note 20 Arable land data from the Interpolated Census of Agriculture do not have the spatial accuracy to be usefully mapped in Section 3.
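A generic form of area weighting is sketched below. The text states only that an area-weighting process was used, so the details here are assumptions: each source unit's value is split across target polygons in proportion to the share of the source unit's area that overlaps each polygon. All identifiers and numbers are hypothetical.

```python
def area_weighted_reallocation(source_values, source_areas, overlap_areas):
    """Reallocate source-unit values (e.g. DA or EA) to target polygons (e.g. SLC).

    source_values: {source_id: value}
    source_areas:  {source_id: total area of the source unit}
    overlap_areas: {(source_id, target_id): overlapping area}
    """
    totals = {}
    for (source_id, target_id), overlap in overlap_areas.items():
        share = overlap / source_areas[source_id]
        totals[target_id] = totals.get(target_id, 0.0) + source_values[source_id] * share
    return totals


# Hypothetical arable land (ha) reported for two source units.
arable_ha = {"da_1": 100.0, "da_2": 50.0}
unit_area = {"da_1": 8.0, "da_2": 4.0}
overlaps = {
    ("da_1", "slc_a"): 6.0,  # three-quarters of da_1 lies in slc_a
    ("da_1", "slc_b"): 2.0,
    ("da_2", "slc_b"): 4.0,  # da_2 lies entirely in slc_b
}
arable_by_slc = area_weighted_reallocation(arable_ha, unit_area, overlaps)
```

Because the method spreads values by area rather than by actual field location, the output is suitable for polygon totals but not for detailed mapping.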

Confidentiality procedures are applied to the data provided to AAFC to produce the Interpolated Census of Agriculture in order to avoid the possibility of identifying any specific agricultural operation. This involves the suppression of selected data. It was assumed that the suppression would have relatively little impact on the amount of arable land in each CMA-E.

To assess the impact of this assumption, the confidential data were obtained from Agriculture Division, Census of Agriculture Section, for the most recent year, 2011, which was also the year with the highest count of suppressed farms. The confidential data were summed for CMA-Es. Table B.3 shows that the majority of the arable land estimates are only slightly affected by the suppression.

Arable land lost to settled area from 1971 to 2011 was calculated by overlaying the growth of settled area between 1971 and 2011 on arable land from the CLI: LU base layer and, for areas where the 1971 settled area was trimmed, on the area reclassified using AAFC's Land Use, 1990. The following CLI: LU classes were included: cropland, improved pasture and forage crops, orchards and vineyards and horticulture, as well as cropland from Land Use, 1990.

This analysis used settled area rather than built-up area. Overlaying 2011 built-up area on 1971 built-up area would have overestimated the loss of arable land, since the 1971 roads were modeled and are therefore not spatially explicit. Because land lost to road growth is excluded, the loss of arable land is underestimated; roads are estimated to account for more than one-quarter of the growth in built-up area.

Agricultural land lost to settled area by soil capability class was calculated by overlaying the growth in settled area from 1971 to 2011 on the Canada Land Inventory: Soil Capability for Agriculture 1969 base layer. Dependable agricultural land represents agricultural land classes 1 through 3: land areas that are not hampered by severe constraints for crop production. Classes 4 to 6 represent marginal agricultural land that requires conservation or management practices for crop and/or forage crop production. For complex soil capability polygons, only the primary or dominant soil capability class was used in this analysis. Agricultural land lost to road growth was not included in this analysis.
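The overlay can be sketched as a per-cell tally. This is an illustration under stated assumptions, not the report's procedure: all layers are assumed to share one 30 m grid, and the function names and sample values are hypothetical.

```python
from collections import Counter

CELL_AREA_KM2 = 0.03 * 0.03  # one 30 m x 30 m cell, assuming a common grid


def settled_growth_by_class(settled_1971, settled_2011, soil_class):
    """Count cells that became settled after 1971, by primary soil class."""
    counts = Counter()
    for was_settled, is_settled, cls in zip(settled_1971, settled_2011, soil_class):
        if is_settled and not was_settled:
            counts[cls] += 1
    return counts


def group_capability(counts):
    """Group cell counts into dependable (1-3), marginal (4-6) and other."""
    grouped = Counter()
    for cls, n in counts.items():
        if cls in (1, 2, 3):
            grouped["dependable"] += n
        elif cls in (4, 5, 6):
            grouped["marginal"] += n
        else:
            grouped["other"] += n
    return grouped


settled_1971 = [True, False, False, False, False]
settled_2011 = [True, True, True, True, False]
soil_class = [1, 2, 2, 5, 3]

growth = settled_growth_by_class(settled_1971, settled_2011, soil_class)
lost_cells = group_capability(growth)
lost_km2 = {name: n * CELL_AREA_KM2 for name, n in lost_cells.items()}
```

The same tally applies to arable land or natural and semi-natural land if the soil class list is swapped for the corresponding base-layer land use classes.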

Soil capability data for Kelowna CMA-E were not included in CLI: Soil Capability for Agriculture. The agricultural capability data for Kelowna were taken from British Columbia's Provincial Agricultural Land Commission,Note 21 which uses similar soil capability classifications.

Natural and semi-natural land

Natural and semi-natural land is the residual area remaining after subtracting built-up area and arable land from the total area of the CMA-E. In addition, for 2011, areal information on forest and water was specified. Water area was derived from the CanVec+ geospatial data set. Forest area was calculated by summing the land cover classes 210–Coniferous, 220–Deciduous and 230–Mixedwood from AAFC Crop Inventory, 2011 (30 m) for all land that was not otherwise categorized as built-up, arable or water. Other natural and semi-natural land was derived as a residual of the total area.
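The two residual calculations can be written out as follows. The function name and the areas are hypothetical and serve only to show the arithmetic.

```python
def natural_land_breakdown(total_km2, built_up_km2, arable_km2, forest_km2, water_km2):
    """Derive natural and semi-natural land and its 2011 components.

    Natural and semi-natural land is the residual of the CMA-E total after
    built-up area and arable land are removed. Other natural and
    semi-natural land is what remains after forest and water.
    """
    natural_km2 = total_km2 - built_up_km2 - arable_km2
    other_km2 = natural_km2 - forest_km2 - water_km2
    if natural_km2 < 0 or other_km2 < 0:
        raise ValueError("components exceed the total area")
    return {
        "natural_and_semi_natural": natural_km2,
        "forest": forest_km2,
        "water": water_km2,
        "other": other_km2,
    }


# Illustrative areas only (km2).
breakdown = natural_land_breakdown(
    total_km2=1000.0, built_up_km2=200.0, arable_km2=300.0,
    forest_km2=350.0, water_km2=50.0,
)
```

Because both quantities are residuals, any error in the built-up or arable estimates carries directly into the natural and semi-natural figures.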

Natural and semi-natural land lost to settled area from 1971 to 2011 was calculated by overlaying the 1971 to 2011 settled area growth on natural and semi-natural land from the CLI: LU base layer and, for areas where the 1971 settled area was trimmed, on the area reclassified using AAFC's Land Use, 1990.

Included CLI classes were woodlands, rough grazing and rangeland, outdoor recreation areas, rock and unvegetated surfaces, open wetland and unmapped areas. As well, a small amount of land categorized as mines, quarries, sand and gravel pits was included in the natural and semi-natural category. For the trimmed areas, land reclassified from Land Use, 1990, included classes forest, forest wetland, grassland managed, grassland unmanaged, trees, treed wetland, wetland, wetland herb, wetland shrub, water, settlement, roads and other land. Natural and semi-natural land lost to roads was not included in this analysis.

Population and dwellings

Detailed population and dwelling data are available from the Census of Population. Data are available for EAs for 1971 and 1991, and, following an improvement of the official geography of the Census, the finer scale dissemination blocks (DBs) for 2001 and 2011.Note 22 Both EA and DB population and dwelling data are attached to a representative point.

With regard to CMAs, a boundary adjustment was required to make individual CMAs comparable through time since some had expanded between 1971 and 2011. The 2011 boundary was selected in all cases. Population and dwelling counts for EAs and DBs were tabulated to comply with the 2011 boundary.

For the process of transferring census data to settled areas (as defined by the land use products), EA and DB points were selected if they fell in or within 400 metres of the settled area. This specific distance was selected because of the average spatial relationship between EAs, DBs and the scale of settled areas.
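The 400 metre selection rule can be sketched with simplified geometry. This example assumes settled areas are approximated by axis-aligned rectangles in a projected coordinate system measured in metres; real settled areas are irregular, so a GIS buffer or distance query would be used in practice. All names and coordinates are hypothetical.

```python
import math

BUFFER_M = 400.0  # selection distance stated in the text


def distance_to_rectangle(x, y, rect):
    """Distance in metres from a point to a rectangle (0 if inside)."""
    xmin, ymin, xmax, ymax = rect
    dx = max(xmin - x, 0.0, x - xmax)
    dy = max(ymin - y, 0.0, y - ymax)
    return math.hypot(dx, dy)


def select_points(points, settled_rects, buffer_m=BUFFER_M):
    """Return ids of representative points in or within buffer_m of settled area."""
    return [
        point_id
        for point_id, (x, y) in points.items()
        if any(distance_to_rectangle(x, y, r) <= buffer_m for r in settled_rects)
    ]


settled = [(0.0, 0.0, 1000.0, 1000.0)]
representative_points = {
    "inside": (500.0, 500.0),
    "near_edge": (1300.0, 500.0),    # 300 m from the edge
    "far": (1500.0, 500.0),          # 500 m from the edge
    "off_corner": (1300.0, 1300.0),  # about 424 m from the corner
}
selected = select_points(representative_points, settled)
```

The population and dwelling counts attached to the selected points would then be summed to give the settled area totals.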

This study presents population and dwelling density measures by CMA as the ratio of settled area population or dwellings to the settled area (km²), which is defined as built-up area excluding roads.
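As a brief illustration of the density measure, with made-up counts and a hypothetical function name:

```python
def density_per_km2(count, settled_area_km2):
    """Population or dwellings per square kilometre of settled area."""
    if settled_area_km2 <= 0:
        raise ValueError("settled area must be positive")
    return count / settled_area_km2


# Illustrative values only: settled area excludes roads.
population_density = density_per_km2(500_000, 250.0)
dwelling_density = density_per_km2(200_000, 250.0)
```

Using settled area rather than built-up area as the denominator gives higher densities than a calculation that includes road area.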

It is worth noting that while most people live in settled areas, some live in low-density areas that may not be captured as settled in the satellite imagery product (and therefore not counted), a scenario that occurs more frequently in rural areas. Also, because DBs are of finer scale than EAs, the comparability of the data over time may be limited.

Table B.4 provides a summary of the above information on sources and methods.
