Economic and Social Reports
Exploring property crime and business locations: Using spatial analysis and firm count data to reveal correlations in Toronto, Ontario

Release date: November 27, 2024

DOI: https://doi.org/10.25318/36280001202401100001-eng

Abstract

This article presents an exploratory analysis of the relationship between the population, firm counts and average property crime from 2017 to 2020 across the Toronto census metropolitan area (CMA). It combines datasets from different domains (crime, business counts and population data) using 500 m by 500 m spatial grids to explore their relationships. At this scale, residential and business land use can be at least partially separated, allowing the independent association between residential populations, business counts and crime to be measured and mapped across the Toronto CMA. This analysis provides a picture of the spatial pattern of crimes across the CMA, explores and validates the data by establishing expected baseline relationships, and points towards areas for more in-depth analysis to determine the relationship between crime and business outcomes. After accounting for the population of grid squares, a positive association between business counts and crime was found, consistent with previous work. Furthermore, after considering population and firm counts, statistically significant spatial clusters of high (and low) crime rates were found. This work therefore sets the foundation for future analysis that would examine how variations in crime rates across space and time affect business outcomes (e.g., firm profitability and exit).

Keywords: property crime, firms, businesses, spatial crime patterns, geospatial analysis, crime hotspots

Authors

Matthew Brown, Mark Brown and Ryan Macdonald are with the Economic Analysis Division, Analytical Studies and Modelling Branch, Statistics Canada.

Acknowledgement

The authors thank their collaborators at Statistics Canada’s Canadian Centre for Justice and Community Safety Statistics for their essential contributions, as well as the reviewers for their insightful feedback, which greatly enhanced this work. In particular, they would like to thank Simon Baldwin, Mathieu Charron, Samuel Perreault and Stephen Tapp for their helpful comments.

Introduction

Increasingly, data and information across various domains (e.g., social, economic and environmental) are being combined to better understand the relationships between different aspects of society and the economy. Geography provides a natural framework for combining often disparate data that may otherwise have no other linkable characteristics and can reveal patterns that point to underlying socioeconomic processes. To this end, this paper examines the spatial correlations between the location of property crime, firms and the population for the Toronto census metropolitan area (CMA).

The focus on the association between the location of firms and crime is motivated, in part, by the growing body of work pointing to a negative relationship between firm outcomes and crime. Evidence suggests that consumers consider crime when deciding whether to visit a business (Fe & Sanfelice, 2022), that business investment is negatively affected by increasing crime (Acolin et al., 2022; Barbieri & Rizzo, 2023), and that higher levels of violent and property crime in neighbourhoods are associated with higher rates of business failure and mobility (moving away) (Hipp et al., 2019). Conversely, declining property crime is associated with higher neighbourhood-level economic activity (Stacy, Ho & Pendall, 2017). While these findings are not universal (see, for example, Bates & Robb, 2008), the weight of the evidence points towards the negative influence of crime on firm outcomes.

The objective of this paper is not to associate crime with firm outcomes per se. Rather, it takes a step back and gathers evidence on the correlation between the presence of firms and crime at the neighbourhood scale. Specifically, it explores how the presence of firms overlaps with property crime at the neighbourhood level. For large geographic units, such as cities or CMAs, population size may be a sufficient denominator for measuring property crime rates. However, within Canadian CMAs, crime rates are not uniformly distributed (Savoie, 2008). At smaller geographic levels, such as local neighbourhoods, this metric becomes especially limited, as crime does not strictly follow population size. Crime also occurs where people work and shop (i.e., in locations where firms operate), adding a further level of complexity to the measurement of neighbourhood-level crime.

This analysis, therefore, combines reported property crime counts with firm counts from Statistics Canada’s business microdata and population counts from the Census of Population. It explores the underlying spatial characteristics of these data geocoded to 500 m by 500 m grid squares—a standard areal unit that can be used to unify different types of data. In doing so, the analysis reveals correlations between the variables using traditional non-spatial correlative analysis techniques, such as linear regression, and spatial bivariate mapping and cluster analysis techniques.

The results demonstrate that reported property crime is positively associated with population levels and firm counts across the grid squares. Additionally, bivariate maps and regression analysis illustrate that including firm counts explains the variation in crime locations across grid squares in ways that the population alone does not. After accounting for the population, the exploratory regression model shows a statistically significant positive association between property crime and consumer-facing firms (e.g., retail stores). The analysis also identifies the presence of statistically significant spatial clusters of neighbourhoods with high crime (e.g., downtown Toronto), where property crime levels are higher than what would be expected given population size and the number of firms. This confirms that there are non-random processes driving spatial patterns in property crime rates and provides additional motivation for future work to better understand the causes of high crime clusters, particularly in relation to the number of firms within a neighbourhood.

The remainder of the study is structured as follows. Section 2 discusses the data sources and the pre-treatment of the data to produce grid-square-based values that are suitable for analysis. Section 3 describes the basic geospatial patterns found in the property crime, firm-level and population data. Section 4 examines the correlation between property crime, firm and population counts using bivariate maps and measures of spatial clustering (i.e., Local Moran’s I) derived from regression residuals. Section 5 concludes the paper.

Data

The analysis takes advantage of three types of data: crime, firm and population counts. This section describes the characteristics and sources of these data and how they are combined geographically through a uniform grid.

Key to the analysis, of course, are measures of crime, specifically property crime. Property crime is the focus because it is more likely to be associated with businesses than other types of crime, such as homicide or drug trafficking. The property crime dataset used here includes all types of property crime violations under the Canadian Criminal Code, including breaking and entering, various forms of theft, possession and trafficking of stolen property, and criminal mischief. This dataset was obtained from Statistics Canada’s Canadian Centre for Justice and Community Safety Statistics and includes the geographic point locations of reported crimes at various levels of geography, with the majority being captured at the block-face and dwelling levels.

The crime data were filtered to contain only property crimes and to include only point locations geocoded at finer-scale geographies (i.e., dissemination area level and below). Additionally, geocoded locations of fraud and other virtual crimes often differ from their actual location. The victim’s residence is often used as the location, even though it may not always be appropriate, such as in the case of online fraud. Therefore, the following crimes were removed from the analysis: fraud; identity fraud; identity theft; and altering, removing or destroying vehicle identification numbers.

Firm-level data were derived from the Longitudinal Business Database (LBD), which was built using the Business Register—a dataset covering the universe of firms in Canada (Statistics Canada, 2024). The LBD is used to construct the firm count variable, and the location of the enterprise is used to construct the firm count by location variable. Because most firms have only one operating location, this variable reasonably estimates the number of firms in a grid square. A key aspect of the LBD is that it allows researchers to consistently track firms over time, facilitating future work on the relationship between crime and firm outcomes.

Firm counts and crime counts were originally formatted as a spatial points layer and then aggregated into a tessellation of 500 m by 500 m grid squares covering the Toronto CMA. The grid squares present longitudinal units whose values were averaged over the period from 2017 to 2020 to create a surface of grid squares containing the average annual number of firms and the average number of property crimes for each square. To facilitate the regression analysis, average firm counts were additionally split into, and calculated for, consumer-facing and non-consumer-facing firms. Consumer-facing firms are those with customers as clientele (e.g., retail), as opposed to other businesses. Firms were separated into these categories using their North American Industry Classification System (NAICS) codes, following a classification scheme identified by Kane, Hipp and Kim (2017). The expectation is that consumer-facing firms would be more likely to affect, and be affected by, property crime (e.g., shoplifting).
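The aggregation step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes point coordinates are already in a projected coordinate system measured in metres, and the function names (`grid_cell`, `average_annual_counts`) and the sample incidents are hypothetical.

```python
from collections import defaultdict
import math

CELL_SIZE_M = 500  # grid square edge length used in the article


def grid_cell(x_m, y_m, cell_size=CELL_SIZE_M):
    """Return the (column, row) index of the grid square containing a point.

    Assumption: coordinates are in a projected CRS measured in metres.
    """
    return (math.floor(x_m / cell_size), math.floor(y_m / cell_size))


def average_annual_counts(points, years):
    """Average the number of points per grid square over the given years.

    points: iterable of (year, x_m, y_m) tuples.
    Returns {cell: average count per year} for squares with at least one point.
    """
    totals = defaultdict(int)
    for year, x_m, y_m in points:
        if year in years:
            totals[grid_cell(x_m, y_m)] += 1
    return {cell: total / len(years) for cell, total in totals.items()}


# Hypothetical incident points: (year, easting, northing)
incidents = [
    (2017, 120.0, 80.0),
    (2018, 450.0, 499.0),
    (2019, 10.0, 10.0),
    (2020, 600.0, 20.0),
    (2020, 980.0, 400.0),
]
study_years = (2017, 2018, 2019, 2020)
avg_crimes = average_annual_counts(incidents, study_years)
```

The same function could be applied to firm locations, once for consumer-facing and once for non-consumer-facing firms, to produce the split counts used in the regression.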

Population data for the study were sourced from the 2021 Census of Population, made available via GeoSuite (Statistics Canada, 2021c), as well as 2021 geographic boundary files (Statistics Canada, 2021b). Population data at the dissemination block (DB) level were converted into a 500 m by 500 m grid square surface via geometric intersection.
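One common way to move block-level counts onto a grid by geometric intersection is area-weighted apportionment, in which each block's population is split across grid squares in proportion to the overlapping area. The sketch below assumes that approach, which the text does not confirm, and simplifies blocks to axis-aligned rectangles; real dissemination blocks are irregular polygons and would need a geometry library. All names and values are hypothetical.

```python
def overlap_area(a, b):
    """Overlap area of two axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def apportion_population(blocks, cells):
    """Split each block's population across cells by share of overlapping area.

    blocks: list of (bbox, population) pairs.
    cells: {cell_id: bbox}.
    Assumption: population is spread evenly within each block.
    """
    result = {cell_id: 0.0 for cell_id in cells}
    for bbox, population in blocks:
        block_area = (bbox[2] - bbox[0]) * (bbox[3] - bbox[1])
        for cell_id, cell_bbox in cells.items():
            share = overlap_area(bbox, cell_bbox) / block_area
            result[cell_id] += population * share
    return result


# Hypothetical example: one block straddling two 500 m grid squares
cells = {"A": (0, 0, 500, 500), "B": (500, 0, 1000, 500)}
blocks = [((250, 0, 750, 500), 100)]
grid_population = apportion_population(blocks, cells)
```

Here half of the block lies in each square, so each square receives half of the block's population.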

Because of reporting and collection methods, certain grid square locations may have missing data in specific years. To maximize data inclusion, null values were treated as 0 if the average of any variable (i.e., property crime, firms or population) over the period was above 0. Locations with null values for all three variables were excluded from the analysis. For example, a grid square located on an airport runway would be removed from the analysis, but a grid square located in a residential area with at least one resident and no recorded firms or crimes would remain. For brevity, in the remainder of this paper, the term “crimes” refers to the average counts of reported property crime, while the term “firms” refers to the average number of firms, with averages calculated across the four-year study period. Descriptive statistics for each variable are presented in Appendix A. All variables tend to have right-skewed distributions, in part because of the presence of null values in the data.
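The inclusion rule above can be expressed as a small function. This is a sketch under one stated assumption: a square is kept only when at least one of its three averages is above 0, which also excludes a square whose only non-null values are zeros (the text does not address that case). The function name is hypothetical.

```python
def clean_grid_square(crime, firms, population):
    """Apply the inclusion rule to one grid square.

    Each argument is the period average for that variable, or None if null.
    Returns a (crime, firms, population) tuple with nulls set to 0.0,
    or None when the square should be excluded.
    Assumption: a square is kept only if at least one value is above 0.
    """
    values = (crime, firms, population)
    if not any(v is not None and v > 0 for v in values):
        return None  # e.g., a grid square on an airport runway
    return tuple(0.0 if v is None else float(v) for v in values)


# A residential square with residents but no recorded firms or crimes is kept
residential = clean_grid_square(None, None, 12)
# A square with nulls for all three variables is dropped
runway = clean_grid_square(None, None, None)
```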

Analysis based on spatial characteristics

The analysis focuses on the population of individuals, the population of firms and the reporting of crimes within the boundary of the Toronto CMA. The Toronto CMA is the most populous CMA in Canada, with a population of 6,022,225 and a land area of 5,903 km², according to the 2021 Census of Population (Statistics Canada, 2021a).

Within the Toronto CMA, theft under $5,000 (not including motor vehicles) was the most prevalent type of property crime for each year from 2017 to 2021. The next largest categories were fraud, mischief, breaking and entering, and theft of a motor vehicle (Table 1). These five categories constitute most property crimes in the Toronto CMA and represent violations that can directly affect businesses and people.


Table 1
Incident-based crime statistics, by detailed violation, Toronto, Ontario (unit: number)

| Year | Breaking and entering | Possession of stolen property | Trafficking in stolen property | Theft of a motor vehicle | Theft over $5,000 (non-motor vehicle) | Theft under $5,000 (non-motor vehicle) | Fraud | Identity theft | Identity fraud | Mischief | Arson | Altering, removing or destroying a vehicle identification number |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2017 | 13,493 | 1,313 | 49 | 8,014 | 2,300 | 67,009 | 15,892 | 117 | 2,227 | 16,459 | 418 | 8 |
| 2018 | 14,300 | 1,560 | 58 | 9,971 | 2,500 | 77,075 | 18,395 | 100 | 2,123 | 16,405 | 372 | 3 |
| 2019 | 14,981 | 1,451 | 42 | 10,641 | 2,501 | 76,928 | 21,614 | 159 | 2,042 | 15,977 | 319 | 0 |
| 2020 | 11,614 | 1,578 | 69 | 11,509 | 2,231 | 57,794 | 19,435 | 145 | 2,189 | 16,219 | 408 | 2 |
| 2021 | 9,748 | 1,347 | 95 | 14,021 | 2,277 | 59,868 | 18,229 | 87 | 2,390 | 16,150 | 383 | 4 |

To examine reported crimes at more disaggregated, regionally specific geographies, data are often reported as rates, such as crime occurrences per 100,000 people, and over areas, such as counties, provinces or sections of a city. Map 1 provides an example of this type of crime rate based on the average annual level of crime from 2017 to 2020 across census subdivisions (CSDs) within the Toronto CMA. While this presents a coarse look at the dispersion of reported property crimes, insights can be drawn from this exercise. The highest property crime rates are in the Toronto CSD, with lower but highly variable rates in the surrounding areas. This scale, however, likely masks considerable variation in crime rates within CSDs. Residents often experience crime at the neighbourhood level. This is also true for firms. Firms tend to concentrate in districts (e.g., because of zoning or various locational advantages common across firms). Therefore, to examine the relationship between firms, the population and crime rates, a scale of analysis that captures this spatial variation is required. To address this, the spatial distribution of property crime is reported at a finer spatial scale, using standardized grid squares as a linking geography. Maps showing the distribution of the population, firms and crimes over the study area in grid squares are shown in Map 2.
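The per-100,000 rate mentioned above is a simple ratio, but writing it out makes one limitation at fine scales concrete: the rate is undefined where nobody lives, even if crimes are recorded there. The function name and the figures below are hypothetical.

```python
def rate_per_100k(incidents, population):
    """Crime occurrences per 100,000 people.

    Returns None when the population is zero, since a population-based
    rate is undefined for unpopulated areas (e.g., commercial grid squares).
    """
    if population <= 0:
        return None
    return incidents * 100_000 / population


# Hypothetical area: 150 incidents among 60,000 residents
example_rate = rate_per_100k(150, 60_000)
# Hypothetical commercial grid square with crimes but no residents
undefined_rate = rate_per_100k(25, 0)
```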

Map 1 Average property crime rate by census subdivision, Toronto, Ontario, 2017 to 2020

Description for Map 1

Map 1 depicts the average property crime rate per 100,000 people from 2017 to 2020 in each census subdivision of the study area, which covers the extent of the 2021 Toronto census metropolitan area. The map is coloured in a red colour scale, with light red hues depicting the lowest rates of crime and dark red hues depicting the highest rates of crime. The highest rate of crime is found in the Toronto census subdivision, which covers the central downtown area. Similarly, high crime rates are found in Mississauga, Brampton and Vaughan, which immediately surround the Toronto census subdivision. Georgina, which is much farther north along the shore of Lake Simcoe, also has a high property crime rate. The lowest crime rates are observed in the Whitchurch-Stouffville and Uxbridge census subdivisions. Overall, the map illustrates that the rate of property crime varies significantly across the study area and that standard census geographies, like census subdivisions, often cover large areas (especially in rural locations) because they are created relative to population thresholds. These size disparities can mask local variations in the data.

Map 2 Overview maps showing a) the population, b) average firm counts and c) average property crime counts inside grid squares across the Toronto census metropolitan area

Description for Map 2

Map 2 depicts four panels arranged in a two-by-two grid. Each panel contains mapped data at the 500 m by 500 m grid square level across the entire Toronto census metropolitan area, except for the bottom-right panel, which depicts inset maps zooming into downtown Toronto for each of the other maps to show local trends more clearly. Each map also includes line features representing the major roads in Toronto and the boundaries of census subdivisions for added context.

The top-left panel is a map labelled “Population per grid square.” This map shows the population across the study area in a five-stage blue colour scheme, where low values are light blue and high values are dark blue. The map highlights that population is highest in the downtown core of Toronto and surrounding neighbourhoods. Notably, large sections of the study area are unpopulated in rural areas.

The top-right panel is a map labelled “Firms per grid square.” This map depicts the average count of firms from 2017 to 2020 in a five-stage purple colour scheme, where a low count is the lightest purple and a high count is the darkest purple. The map highlights that average firm counts are most highly concentrated in downtown Toronto.

The bottom-left panel is a map labelled “Crimes per grid square.” The map depicts the average count of property crime from 2017 to 2020 in a five-stage red colour scheme, where a low count is the lightest red and a high count is the darkest red. Like the other maps, crime is concentrated in high quantities in downtown Toronto. Locations where more than 20 crimes occurred on average are very localized across the study area and generally appear to follow the road network.

The bottom-right panel contains inset maps for each of the variables mapped in the other panels. Each inset zooms into downtown Toronto, where the highest amount of each variable is observed.

All three maps in Map 2 illustrate a common concentration of population, firms and crime in downtown Toronto (see also the inset maps), but there are different geographic patterns in their detailed geographies. As illustrated in Map a), the population is concentrated in the Toronto CSD, peaking in the downtown core. In Map b), the average number of firms is also highest in the Toronto CSD, with its highest levels in the downtown core at Union Station and extending northward along Yonge Street. High concentrations of firms can also be observed in downtown Mississauga; Brampton; and near major highway intersections in Vaughan, Richmond Hill and Markham. In Map c), a high concentration of crime occurs in the central downtown core. Upon closer inspection, squares with higher average property crime tend to follow the road network and are generally in CSDs immediately surrounding Toronto to the north and west. Areas of high crime appear to be much more locally concentrated, compared with what is observed in the population or firm maps. For example, high-crime areas in Brampton and Mississauga are in their downtown cores and do not appear to spread out as extensively into surrounding areas. This is consistent with previous research showing that, for instance, shoplifting was not found to be affected by the characteristics of adjacent neighbourhoods in Toronto (Charron, 2009). Looking at the univariate choropleth maps in Map 2, there is an apparent high degree of association between the locations of reported property crimes and population counts and firm locations. While this observation is intuitive, the individual maps do not allow for a statistical analysis of how crime locations, people and firms interact. Doing so requires multivariate analysis techniques.

Correlation analysis

As the informal comparison of the maps in Map 2 suggests, when the counts of crime, firms and population across grid squares are compared, there is a relatively strong positive correlation among them. The Pearson correlation coefficient (r) between crime and firms is 0.49. Likewise, r between property crime and the population is 0.48, while between firms and the population it is 0.56 (see Map 3). While these coefficients indicate that the data are correlated and reinforce the observations in the univariate maps, they conceal local variations in correlation that become apparent in bivariate maps. Of particular interest are locations where there is disagreement, such as areas where crime is relatively high, but the population is relatively low, and whether these areas visually correspond to locations where firms are more prevalent.
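For reference, the Pearson coefficient reported above is the covariance of two variables divided by the product of their standard deviations. A minimal sketch follows; the per-square values are made up for illustration and are not the study data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical per-square averages
firms = [0, 2, 5, 1, 12, 30]
crimes = [1, 0, 4, 2, 9, 20]
r_firms_crimes = pearson_r(firms, crimes)
```

Because r is a single global summary, two grids with the same coefficient can have very different local patterns, which is the motivation for the bivariate maps.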

Bivariate maps provide a visual representation of the correlation between two variables. In these maps, each variable is divided into three bins, potentially creating nine unique colour classes. The low to high values of each variable can be read from bottom to top (Variable 1) and left to right (Variable 2). The upward diagonal from bottom left to top right represents a positive correlation in the data, whereas the off-diagonal colour ramps indicate increasing disagreement between the variables. Because a positive correlation among the variables was established, the left-to-right diagonal is identified by a light-grey colour to highlight the off-diagonal cells, which are areas where this relationship does not hold. For example, the colour ramp from the bottom left to the top left (i.e., from light grey to light yellow to dark yellow, or from light grey to light red to dark red) represents areas that show increasing values in Variable 1, while remaining low in Variable 2. The opposite disagreement pattern (i.e., where the values in Variable 2 become increasingly high, while remaining low in Variable 1) can likewise be observed along the bottom row, from left to right. Map 3 displays the bivariate maps of every possible variable interaction, along with the corresponding correlation coefficients between variables.
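The three-by-three classification behind a bivariate map can be sketched as follows. The text does not say how the three bins were defined, so this example assumes tercile (equal-count) breaks; the function names are hypothetical.

```python
from bisect import bisect_right


def tercile_breaks(values):
    """Return two break points that split the sorted values into thirds.

    Assumption: equal-count bins; the article does not state its binning method.
    """
    ordered = sorted(values)
    n = len(ordered)
    return (ordered[n // 3], ordered[(2 * n) // 3])


def bin_index(value, breaks):
    """Return 0 (low), 1 (medium) or 2 (high) for a value."""
    return bisect_right(breaks, value)


def bivariate_class(value_1, value_2, breaks_1, breaks_2):
    """Return (row, column) in the 3x3 legend: row is Variable 1, column is Variable 2."""
    return (bin_index(value_1, breaks_1), bin_index(value_2, breaks_2))


def is_disagreement(cell_class):
    """True when the two variables fall in different bins (off the diagonal)."""
    return cell_class[0] != cell_class[1]


# Hypothetical values for nine grid squares
crime_values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
population_values = [9, 8, 7, 6, 5, 4, 3, 2, 1]
crime_breaks = tercile_breaks(crime_values)
population_breaks = tercile_breaks(population_values)
# High crime, low population: the top-left corner of the legend
corner = bivariate_class(9, 1, crime_breaks, population_breaks)
```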

Map 3 Bivariate maps comparing a) firms and crimes, b) the population and crimes, and c) the population and firms

Description for Map 3

Map 3 depicts four panels arranged in a two-by-two pattern. The first three panels, labelled a), b) and c), contain bivariate choropleth maps (i.e., maps that simultaneously display two variables using colour combinations that represent the correlation relationship between the variables) at the 500 m by 500 m grid square level across the entire Toronto census metropolitan area. The bottom-right panel contains a correlation matrix of each variable. The legend used in the bivariate maps is complex, with nine distinct categories represented as a three-by-three grid. Each axis of the grid corresponds to one of the variables, with values increasing from low to high. The cells in the grid are coloured based on the blend of two base colours, indicating different combinations of the variable values. In each map, the first variable is represented with a colour gradient from high to low using different colour schemes. For average firms, a light grey to dark yellow colour scheme is used. For population, a light grey to dark blue colour scheme is used. For average crimes, a light grey to dark red colour scheme is used. The most notable colours to observe in each map are those found on the off diagonals of each colour ramp (i.e., dark yellow, red and blue), as these depict situations where one variable is high and the other is low. Each map also includes line features representing the major roads in Toronto.

The top-left panel depicts a bivariate map of average firms and average crimes. This map illustrates moderate to high value correlations between these variables across much of the study area, with a seemingly random mixture. There are some areas of high crime and low firms, particularly in southern Scarborough and northwestern North York.

The top-right panel depicts a bivariate map of the population and average crimes, whereas the bottom-left panel depicts the population and average firms. The key highlight of these maps is that there are areas where firms and crimes are high, but the population is low—i.e., red squares in panel b) and yellow squares in panel c). This pattern emerges in areas surrounding Toronto/Lester B. Pearson International Airport, MacMillan Yard and various commercial locations. This variable disagreement highlights the presence of unexpected patterns in the data.

In the bottom-right panel, there is a correlation matrix showing the Pearson correlation coefficient between each variable. The Pearson correlation coefficient is near 0.5 between each variable.

In terms of patterns that emerge from the bivariate maps, areas where crime, firms and the population are positively correlated are seen in generally the same locations across each map (e.g., downtown Toronto). Of particular interest are the patterns of variable disagreement represented by the colours on the off diagonals. Specifically, areas where the average number of firms and the average number of crimes appear to be high, while the population is low—i.e., red squares in Panel b) and yellow squares in Panel c)—appear in the areas surrounding Toronto/Lester B. Pearson International Airport, MacMillan Yard and various commercial locations (e.g., Etobicoke City Centre). This variable disagreement reveals that the presence of many firms does not necessarily correspond to a high population and that low-population areas can still be locations where crime occurs. For example, for the grid squares that surround Toronto/Lester B. Pearson International Airport, there is little relationship between the population and crimes, as shown by the red squares in Panel b), but there is a positive relationship between firm counts and crimes.

Regression modelling

At root, the bivariate choropleth maps demonstrate that crime and firm or population counts are positively related, but that firm counts and the population potentially have independent associations with crime. To formally test this, the following regression model is estimated using ordinary least squares (OLS):

$$Crime_i = \beta_0 + \beta_1 CFfirms_i + \beta_2 NCFfirms_i + \beta_3 population_i + \varepsilon_i$$

Table 2
Ordinary least squares regression model estimates of the count of property crimes as a function of firm and population counts across grid squares

| Dependent variable: Average number of property crimes | Coefficient | t-statistic |
|---|---|---|
| (Intercept) | -2.628 | -3.239** |
| Average count of consumer-facing firms | 1.010 | 8.796*** |
| Average count of non-consumer-facing firms | -0.040 | -1.337 |
| Population count | 0.011 | 5.434*** |
| Observations | ... | 12,130 |
| Adjusted R2 | ... | 0.35 |
| Residual standard error | ... | 29.61 (df = 12,160) |
| F-statistic | ... | 3,171 (df = 12,160) |

Note: ... not applicable

In line with the existing literature, the model reveals a statistically significant positive relationship between consumer-facing firms and property crime, after taking the population into account. This finding supports the notion that these firms are potentially more exposed to crime because they serve the broader public, putting them at greater risk (e.g., from shoplifting). This is further reinforced by the statistically insignificant association between counts of non-consumer-facing businesses and crime. The population has the expected positive independent association with crime. Together, the independent variables account for 35% of the variation in the average number of property crimes. This is reasonably high given how few variables are included in the model.
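A regression of this form can be estimated with a few lines of NumPy. The sketch below is illustrative only: the data are synthetic, generated without noise from arbitrary coefficients so that the fit can be checked exactly, and the function name `fit_ols` is hypothetical. It does not reproduce the estimates in Table 2.

```python
import numpy as np


def fit_ols(y, predictors):
    """Fit y = b0 + b1*x1 + ... by ordinary least squares.

    predictors: 2-D array with one column per explanatory variable
    (the intercept column is added here).
    Returns (coefficients, residuals, adjusted R-squared).
    """
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), np.asarray(predictors, dtype=float)])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    n_obs, n_params = design.shape
    r_squared = 1.0 - ss_res / ss_tot
    adj_r_squared = 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_params)
    return beta, residuals, adj_r_squared


# Synthetic grid squares (not the study data); coefficients chosen arbitrarily
rng = np.random.default_rng(0)
cf_firms = rng.poisson(3, size=200)
ncf_firms = rng.poisson(5, size=200)
population = rng.poisson(400, size=200)
crime = 1.0 + 2.0 * cf_firms + 0.0 * ncf_firms + 0.01 * population

X = np.column_stack([cf_firms, ncf_firms, population])
beta, residuals, adj_r2 = fit_ols(crime, X)
```

The `residuals` array is the quantity of interest for the spatial analysis that follows: one value per grid square, positive where crime is higher than the model predicts.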

Beyond the independent associations between population and firm counts and crime, the residuals of the model are also of interest. They measure the degree to which crime is higher or lower than would be expected after accounting for the count of people and firms in a grid square. If there are spatial clusters of grid squares where crime is higher (or lower) than expected, then this suggests there are underlying spatially non-random factors that result in higher (or lower) crime that may warrant further investigation. For instance, clusters of positive residuals—where crime is higher than expected—may result if there are clusters of grid squares with firms that are particularly vulnerable to property crime (e.g., retail stores). However, if there are no statistically significant spatial patterns in the residuals, the null hypothesis—that the factors leading to higher or lower levels of crime than expected are randomly distributed across space—cannot be rejected.

As might be reasonably expected for such a simple model, the residuals are not spatially random. The Global Moran’s I, which ranges from -1 to +1, yielded a value of 0.20 (z-score = 34.44), suggesting strong positive spatial autocorrelation and rejection of the null hypothesis of spatial randomness. Moreover, the Local Moran’s I based on the model residuals indicates statistical clusters in the study area. The resulting map of clusters is presented in Map 4.
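The Global Moran's I statistic can be computed directly from a vector of values and a spatial weights matrix. The sketch below assumes binary queen contiguity weights (squares sharing an edge or a corner are neighbours); the weighting scheme used in the study is not stated in the text, and the z-score, which requires an expected value and variance or a permutation test, is omitted. Function names are hypothetical.

```python
import numpy as np


def queen_weights(n_rows, n_cols):
    """Binary queen-contiguity weights for a regular grid (row-major order).

    Assumption: the article's weighting scheme is not specified here.
    """
    n = n_rows * n_cols
    weights = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < n_rows and 0 <= cc < n_cols:
                        weights[i, rr * n_cols + cc] = 1.0
    return weights


def global_morans_i(values, weights):
    """Global Moran's I: (n / S0) * (z' W z) / (z' z), with z mean-centred."""
    z = np.asarray(values, dtype=float).ravel()
    z = z - z.mean()
    return (len(z) / weights.sum()) * (z @ weights @ z) / (z @ z)


W = queen_weights(4, 4)
# Clustered surface: high values on the left half, low on the right
clustered = np.array([[1, 1, 0, 0]] * 4)
# Alternating surface: each square differs from its edge neighbours
checkerboard = np.indices((4, 4)).sum(axis=0) % 2

i_clustered = global_morans_i(clustered, W)
i_checkerboard = global_morans_i(checkerboard, W)
```

The clustered surface yields a positive statistic and the alternating surface a negative one, which is the distinction the test on the model residuals relies on.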

Map 4 Local Moran’s I cluster map of ordinary least squares model residuals

Description for Map 4

This map shows a Local Moran’s I cluster map. This type of map illustrates the spatial distribution of data, highlighting regions with statistically significant clusters. It uses a colour-coded system where high values surrounded by high values (high-high clusters) are in red, indicating areas of intense clustering. Low values surrounded by low values (low-low clusters) are in blue, showing regions with consistently low measurements. High values adjacent to low values (high-low) are in light blue, and low values next to high values (low-high) are in pink, representing areas of spatial outliers. Areas with no significant clustering are marked in grey. This map helps identify where similar values are grouped together and where anomalies occur in the spatial data. In this map, the variable being tested for clustering is the residual value from the ordinary least squares model estimated in the paper. Therefore, the clusters indicate areas where there are model underpredictions and overpredictions of crime, as well as outliers. Notably, model underpredictions (i.e., high-high clusters) appear in downtown Toronto and sparsely around the census metropolitan area along major roads and highways, whereas model overpredictions (i.e., low-low clusters) appear in residential areas of North York, Mississauga, Markham, Vaughan and Richmond Hill.

From this map, model underpredictions of crime (i.e., groupings of relatively high-value positive residuals, shown in red) cluster in downtown Toronto in areas with the highest density of firms or people. They also appear sparsely around the CMA and particularly within the Toronto, Etobicoke and Brampton CSDs, primarily along major roads and highways. In general, clusters of underpredictions tend to appear in areas used heavily for commercial purposes, such as shopping centres or downtown commercial districts. Conversely, several large clusters of model overpredictions (i.e., groupings of low-value residuals, shown in dark blue) can be found in many neighbourhoods across the CMA, but particularly in North York, Mississauga, Markham, Vaughan and Richmond Hill. These overpredictions typically occur in regions of heavy residential land use. This finding suggests the presence of other underlying spatially non-random factors that result in higher (or lower) crime counts.

Lastly, the map also identifies statistically significant spatial outliers. Underprediction spatial outliers (shown in pink) are grid squares with above-average residual values whose neighbours, on average, have lower values than would be expected under spatial randomness (e.g., a high-value residual grid square surrounded by low-value residual grid squares). The opposite holds true for overprediction spatial outliers (shown in light blue). Underprediction spatial outliers tend to appear in locations with locally high levels of commercial activity (such as grid squares containing big-box retail locations) that are adjacent to areas of low activity, such as parks, highways or residential areas. Conversely, overprediction spatial outliers are often in residential areas situated on the periphery of commercial areas.
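The four categories on the cluster map follow from comparing each grid square’s residual with the average residual of its neighbours. The sketch below shows that classification on a toy grid. It assumes rook contiguity and row-standardized weights, and it leaves out the significance test (typically a conditional permutation test) that determines which squares are actually shaded on a map such as Map 4. Function names and data are hypothetical.

```python
import numpy as np

def grid_neighbours(n_rows, n_cols):
    """List the rook neighbours (shared edges) of every cell in a regular grid."""
    neighbours = []
    for r in range(n_rows):
        for c in range(n_cols):
            cell = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    cell.append(rr * n_cols + cc)
            neighbours.append(cell)
    return neighbours

def local_moran(values, neighbours):
    """Return deviations, their spatial lags and Local Moran's I_i = z_i * lag_i / m2."""
    z = values - values.mean()
    m2 = (z ** 2).mean()
    lag = np.array([z[n].mean() for n in neighbours])  # row-standardized lag
    return z, lag, z * lag / m2

def classify(z, lag):
    """Assign each cell to a quadrant; cells exactly at the mean stay unclassified."""
    labels = np.full(len(z), "unclassified", dtype=object)
    labels[(z > 0) & (lag > 0)] = "high-high"   # cluster of underpredictions
    labels[(z < 0) & (lag < 0)] = "low-low"     # cluster of overpredictions
    labels[(z > 0) & (lag < 0)] = "high-low"    # underprediction outlier
    labels[(z < 0) & (lag > 0)] = "low-high"    # overprediction outlier
    return labels

# Toy 5 x 5 grid of residuals: one large positive value among small negative ones,
# loosely analogous to a big-box retail square surrounded by residential squares.
residuals = np.full(25, -1.0)
residuals[12] = 10.0                             # centre cell
z, lag, local_i = local_moran(residuals, grid_neighbours(5, 5))
labels = classify(z, lag)

print(labels[12])  # high-low: high residual, low-residual neighbours
print(labels[7])   # low-high: low residual next to the high centre cell
print(labels[0])   # low-low: low residual, low-residual neighbours
```

In this toy example the centre cell is a high-low outlier and the cells touching it become low-high outliers, which mirrors the pattern described above for commercial squares bordering residential areas.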

Limitations and assumptions

While this article was intended as an exploratory analysis, the limitations of the data need to be acknowledged. Because the analysis uses averaged data from a four-year panel dataset of crime and firm counts, these counts may be underestimated in grid squares that saw significant growth over the period, such as those on the outskirts of Toronto. Another limitation of the dataset pertains to how data are captured in rural locations. Rural postal codes cover much larger areas than their urban counterparts. Because firms are assigned to grid squares using the centroid of their postal code, a grid square that contains a rural centroid can be identified as having firms even when no firms are actually located within it. Lastly, as the COVID-19 pandemic may have affected crime rates, the analysis was repeated excluding 2020 data. The results were qualitatively unchanged, so the paper includes data from the entire period (2017 to 2020).
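The postal code limitation can be made concrete with a small sketch of centroid-based aggregation. It assumes projected coordinates in metres and assigns each firm to the 500 m grid square containing its postal code centroid. The postal code labels, coordinates and function names are invented for illustration and are not taken from the article’s data or processing code.

```python
import math
from collections import Counter

CELL_SIZE_M = 500  # grid resolution used in the article

def grid_cell(x_m, y_m, cell_size=CELL_SIZE_M):
    """Return the (column, row) index of the grid square containing a point."""
    return (math.floor(x_m / cell_size), math.floor(y_m / cell_size))

def count_firms_by_cell(firms, centroids):
    """Count firms per grid square using postal code centroids, not true locations."""
    counts = Counter()
    for firm in firms:
        x, y = centroids[firm["postal_code"]]
        counts[grid_cell(x, y)] += 1
    return counts

# Hypothetical centroids: a compact urban postal code and a large rural one.
centroids = {"URBAN-A": (1250.0, 1250.0), "RURAL-B": (4100.0, 3900.0)}

# Hypothetical firms with their (normally unobserved) true coordinates.
firms = [
    {"name": "firm 1", "postal_code": "URBAN-A", "true_xy": (1240.0, 1260.0)},
    {"name": "firm 2", "postal_code": "RURAL-B", "true_xy": (6900.0, 5200.0)},
    {"name": "firm 3", "postal_code": "RURAL-B", "true_xy": (2300.0, 4700.0)},
]

counts = count_firms_by_cell(firms, centroids)
true_cells = {firm["name"]: grid_cell(*firm["true_xy"]) for firm in firms}

print(counts[(8, 7)])        # 2: both rural firms land in the centroid's square
print(true_cells["firm 2"])  # (13, 10): where firm 2 is actually located
print(true_cells["firm 3"])  # (4, 9): where firm 3 is actually located
```

In the urban case the centroid and the true location fall in the same square, while in the rural case the centroid’s square is credited with two firms although neither is located there.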

As illustrated by the spatial analysis of the OLS regression residuals, several other variables beyond population size and the number of firms likely influence the observed crime counts. Socioeconomic factors or local accessibility probably also play a role. Another consideration is the reliance solely on firm counts in the analysis, without accounting for variations in firm size. Consequently, the model treats small businesses (i.e., firms with 1 to 99 employees) with the same weight as large enterprises (i.e., firms with 500 or more employees), potentially resulting in bias. As most businesses in Canada are small (e.g., approximately 87% of Ontario firms in 2020 had fewer than 20 employees [Statistics Canada, 2021d]), the results tend to reflect the relationship between small firms and property crime.

Conclusion

This paper presents an exploratory analysis of local property crime patterns in the Toronto CMA. Through the aggregation of crime, population and firm count data into a uniform spatial grid dataset, spatial analysis techniques were used to explore patterns in the data. Although property crime, population and firms are shown to be positively correlated with each other, the correlation varies over space. Bivariate maps highlight the spatial correlations between the different variables and, specifically, the utility of including a firm count variable. Moreover, the presence of statistically significant spatial clusters in the data illustrates that there are grid squares exhibiting levels of crime that are either higher or lower than would be expected given the population and number of firms there. Further exploration is warranted to understand what additional variables should be included, as well as to test different types of models (e.g., count-based models).
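As one example of the count-based models mentioned above, a Poisson regression treats crime as a non-negative count whose expected value is the exponential of a linear predictor. The sketch below fits such a model by Newton-Raphson on simulated data. The covariates, coefficients and function name are hypothetical, and nothing here was estimated in this article.

```python
import numpy as np

def fit_poisson(X, y, n_iter=50, tol=1e-10):
    """Maximum likelihood Poisson regression with a log link (Newton-Raphson).

    Assumes the first column of X is an intercept.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())                 # start at the overall mean count
    for _ in range(n_iter):
        mu = np.exp(X @ beta)                  # expected counts
        gradient = X.T @ (y - mu)              # score vector
        information = X.T @ (X * mu[:, None])  # Fisher information matrix
        step = np.linalg.solve(information, gradient)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Simulated grid squares with two made-up covariates (e.g., scaled population
# and scaled firm counts) and known coefficients, to check the fit.
rng = np.random.default_rng(42)
n = 5000
X = np.column_stack([np.ones(n), rng.uniform(0, 2, n), rng.uniform(0, 2, n)])
true_beta = np.array([0.5, 0.8, 0.3])
y = rng.poisson(np.exp(X @ true_beta))

beta_hat = fit_poisson(X, y)
print(np.round(beta_hat, 2))  # should lie close to true_beta
```

Unlike ordinary least squares, this specification cannot predict negative crime counts and accommodates the many zero-count grid squares, although overdispersed data may call for alternatives such as a negative binomial model.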

This paper illustrates that using a grid square geography in conjunction with firm count information reveals an independent association between the presence of firms and crime. To fully understand neighbourhood-level crime, the firm dimension needs to be considered. As the presence of firms is important for understanding neighbourhood crime, neighbourhood-level crime may also be important for understanding firm outcomes (Hipp et al., 2019; Stacy, Ho & Pendall, 2017). Therefore, an avenue for future work would be to explore how crime influences firm outcomes, like profitability and exit rates. This would fully leverage the underlying data that track firms over time.

Appendix A

Summary statistics


Appendix Table A.1
Summary statistics for variables of interest across grid squares
| Statistic | Average firms | Average crimes | Population (2021) |
| --- | --- | --- | --- |
| Minimum | 0 | 0 | 0 |
| Maximum | 680 | 500 | 10,969 |
| Mean | 25 | 9 | 502 |
| Median | 10 | 3 | 165 |
| Number of observations | ... | ... | 12,130 |

Note: ... not applicable

The mean and median values of each variable are much closer to the minimum value than to the maximum value. This can be explained by the presence of a large number of null values in the dataset, particularly for property crime. All the variables in this analysis have a right-skewed distribution.

References

Acolin, A., Walter, R. J., Skubak Tillyer, M., Lacoe, J., & Bostic, R. (2022). Spatial spillover effects of crime on private investment at nearby micro-places. Urban Studies, 59(4), 834-850.

Barbieri, N., & Rizzo, U. (2023). The impact of crime on firm entry. Journal of Regional Science, 63(2), 446-469.

Bates, T., & Robb, A. (2008). Crime’s impact on the survival prospects of young urban small businesses. Economic Development Quarterly, 22(3), 228-238.

Charron, M. (2009). Neighbourhood characteristics and the distribution of police-reported crime in the city of Toronto. Statistics Canada.

Charron, M. (2011). Neighbourhood characteristics and the distribution of crime in Toronto: Additional analysis on youth crime. Statistics Canada.

Fe, H., & Sanfelice, V. (2022). How bad is crime for business? Evidence from consumer behavior. Journal of Urban Economics, 129, 103448.

Hipp, J., Williams, S. A., Kim, Y. A., & Kim, J. (2019). Fight or flight? Crime as a driving force in business failure and business mobility. Social Science Research, 82, 164-180.

Kane, K., Hipp, J., & Kim, J. (2017). Analyzing accessibility using parcel data: Is there still an access-space tradeoff in Long Beach, California? The Professional Geographer, 69(3), 486-503.

Rosenthal, S. S., & Urrego, J. A. (2023). Eyes on the street, spatial concentration of retail activity and crime. Working paper, Syracuse University.

Savoie, J. (2008). Analysis of the spatial distribution of crime in Canada: Summary of major trends. Statistics Canada.

Stacy, C. P., Ho, H., & Pendall, R. (2017). Neighborhood‐level economic activity and crime. Journal of Urban Affairs, 39(2), 225-240.

Statistics Canada. (2021a). Focus on geography series, 2021 Census of Population.

Statistics Canada. (2021b). Boundary Files.

Statistics Canada. (2021c). GeoSuite.

Statistics Canada. (2021d). Table 33-10-0304-01 Canadian Business Counts, with employees, December 2020.

Statistics Canada. (2022). Table 35-10-0177-01 Incident-based crime statistics, by detailed violations, Canada, provinces, territories, census metropolitan areas and Canadian Forces Military Police.

Statistics Canada. (2024). Business Register.

Willits, D., Broidy, L., & Denman, K. (2013). Schools, neighborhood risk factors, and crime. Crime & Delinquency, 59(2), 292-315.
