Analytical Studies — Methods and References
Measuring Social Capital at the Neighbourhood Level: Experimental Estimates of Sense of Belonging to the Local Community Measured at the Census Tract Level

Release date: November 16, 2021

Abstract

Statistics Canada continues to use a variety of data sources to provide neighbourhood-level variables across an expanding set of domains, such as sociodemographic characteristics, income, services and amenities, crime, and the environment. Yet, despite these advances, information on the social aspects of neighbourhoods is still unavailable. In this paper, answers to the Canadian Community Health Survey on respondents’ sense of belonging to their local community were pooled over the four survey years from 2016 to 2019. Individual responses were aggregated up to the census tract (CT) level. The small area estimation component of the Generalized Estimation System developed at Statistics Canada was then used to produce more efficient estimates of average community belonging within CTs than the estimates obtained from standard survey-weighted methods alone. For most areas, the small area estimate has a smaller coefficient of variation than the direct estimate; the difference is particularly notable for the areas with the smallest sample sizes. The bivariate and multivariate correlations between CT-level community belonging and seven other CT-level characteristics are also presented. CT-level estimates of community belonging are found to be correlated as expected with other CT-level variables, such as population density, population turnover, housing tenure and crime.

Introduction

The development and use of small area information continue to be a priority at Statistics Canada. Data from a variety of sources, such as the census, administrative files, satellite imaging and open sources, are being used to construct neighbourhood-level variables across an expanding set of domains, such as sociodemographic characteristics, income, services and amenities, crime, and the environment. Yet, despite these advances, information on the social aspects of neighbourhoods—such as contact and familiarity with neighbours and trust in others—is still unavailable. This type of information is collected in some household surveys. However, these surveys do not generally have large enough samples to allow individual-level responses to be aggregated into reliable estimates at the neighbourhood level. Estimates can be provided across larger geographic areas, such as census metropolitan areas and economic regions, but not at more detailed neighbourhood levels, which are the focus of much research interest.

The relationships between neighbourhood-level social capital and various outcomes have been important themes in the research literature. Neighbourhood social capital is viewed as a social determinant of health and well-being, with studies examining its correlations with outcomes such as obesity (Carrillo-Alvarez, Kawachi and Riera-Romani 2019), self-rated health (e.g., Mohnen et al. 2013; Chola and Alaba 2013) and satisfaction with life overall (Helliwell, Shiplett and Barrington-Leigh 2018). The role that neighbourhood social capital plays in preparedness for, and responses to, disasters such as hurricanes is another line of inquiry (e.g., Marsh and Buckle 2001; Kim and Kang 2009; Zahnow et al. 2019). In the context of the COVID-19 pandemic, evidence indicates that geographic areas with stronger social capital exhibited greater compliance with stay-at-home directives and lower excess mortality rates than areas with weaker social capital (Bartscher et al. 2020). Still, some researchers note that the relationships between neighbourhood social capital and outcomes are often mixed and that results can be sensitive to research design (Carrillo-Alvarez, Kawachi and Riera-Romani 2019). Including measures of the social aspects of neighbourhoods would increase the scope for empirical analysis of such issues, particularly given the lack of Canadian data of this type. (Note 1)

This initiative was undertaken to address this gap. A core question on the annual Canadian Community Health Survey (CCHS) asks respondents to describe their sense of belonging to their local community. Responses to this question were pooled over the four years of the CCHS from 2016 to 2019, and individual responses were aggregated up to the census tract (CT) level. The small area estimation component of the Generalized Estimation System developed at Statistics Canada was then used to produce more efficient estimates of average community belonging within CTs than the estimates obtained from standard survey-weighted methods alone.

This work and the resulting variable are described below. The data sources used for the initiative are presented first, followed by a discussion of the area-level model used to produce the small area estimates of CT-level community belonging. Information on the quality of these estimates is also presented. Next, the correlations between CT-level community belonging and a small set of CT-level variables are presented using Pearson correlations, cross-tabulations and a multivariate regression model. Lastly, conclusions and next steps are discussed.

Data

To create a neighbourhood-level variable that captures social capital, it is optimal for answers to the underlying survey questions to be available for as large a sample of respondents as possible. In this respect, questions on surveys that are fielded annually are attractive as they can be pooled over successive years to obtain the samples needed for geography-based estimates. The inclusion of questions on surveys that contain similar content and use the same survey processes from year to year is also attractive, as these features increase the comparability of pooled samples.

The sense of community belonging variable included on the CCHS meets these criteria. The CCHS is among Statistics Canada’s larger household surveys and carries a set of core content each year, including the question on sense of community belonging. Each year respondents are asked:

  • How would you describe your sense of belonging to your local community? Would you say it is...?
  • Very strong, Somewhat strong, Somewhat weak, Very weak

This question is positioned near the beginning of the survey, following basic questions about the age, household composition, main activity and general health of respondents. Most survey respondents are able and willing to answer the question, with non-response around 4%.

Sense of community belonging is an umbrella concept that reflects a number of underlying factors. It is strongly correlated with familiarity, reciprocal exchanges and trust in neighbours, underscoring the underlying importance of social capital. However, neighbourhood characteristics (e.g., perceptions of area crime, the built environment) and “rootedness” (e.g., duration of residence in an area) are also significantly correlated with a sense of community belonging, independent of individuals’ familiarity, trust and reciprocity with neighbours (Schellenberg et al. 2018).

Variables designed to gauge the social aspects of neighbourhoods are available on other Statistics Canada surveys. For example, the 2013, 2015 and 2020 cycles of the General Social Survey (GSS) include questions about respondents’ trust in, and interactions with, neighbours. However, the smaller GSS samples and the inclusion of these questions on a less-than-annual basis make them less well suited for producing geographic-level estimates. Similarly, the 2018 Canadian Housing Survey (CHS) asked respondents about their satisfaction with their neighbourhood and the degree to which they feel part of it. While the annual sample of the CHS is quite large, only a single year of CHS data was available at the time this initiative was undertaken, precluding the option of pooling data across survey years. With at least two additional years of CHS data being collected by Statistics Canada (with the support of the Canada Mortgage and Housing Corporation), the survey will soon offer greater scope to produce geography-based estimates. At the time of writing, the sense of community belonging variable on the CCHS was the best available option.

Regarding the techniques used to estimate neighbourhood-level characteristics, standard weighted estimates (or direct estimates) are typically obtained for a given area by using sample data from that area. Direct estimates are typically reliable if the sample size in the area is large. Small area estimation methods attempt to produce reliable estimates when the sample size in the area is small. This is achieved by complementing the sample data (CCHS data) with auxiliary information and using a model linking the direct estimate to explanatory (auxiliary) variables over all areas.

The direct estimate in this study is the weighted mean value of the sense of community belonging at the CT level, as defined using 2016 Census boundaries. Again, four years of CCHS data were pooled to obtain enough responses to calculate direct estimates for numerous CTs. The corresponding direct variance estimate was calculated using 1,000 bootstrap weights. The combined sample size for each CT was also retained.
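The direct estimate and its bootstrap variance can be illustrated with a minimal sketch. The respondent scores, weights and the three replicate weight sets below are made up, and the variance formula (the mean squared deviation of the replicate estimates from the full-sample estimate) is a common convention assumed here rather than a description of the exact CCHS procedure.

```python
from typing import Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Survey-weighted mean of the 1-to-4 belonging scores."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def bootstrap_variance(
    values: Sequence[float],
    main_weights: Sequence[float],
    replicate_weights: Sequence[Sequence[float]],
) -> float:
    """Mean squared deviation of replicate estimates from the full-sample estimate."""
    full = weighted_mean(values, main_weights)
    replicates = [weighted_mean(values, w) for w in replicate_weights]
    return sum((r - full) ** 2 for r in replicates) / len(replicates)


# Hypothetical CT with four respondents (4 = very strong, 1 = very weak).
scores = [4, 3, 3, 2]
weights = [120.0, 80.0, 100.0, 100.0]
# Three made-up replicate weight sets; the study used 1,000.
replicates = [
    [240.0, 0.0, 100.0, 100.0],
    [120.0, 160.0, 0.0, 100.0],
    [0.0, 80.0, 200.0, 100.0],
]

direct_estimate = weighted_mean(scores, weights)
direct_variance = bootstrap_variance(scores, weights, replicates)
```

Under this formula, a CT with a single respondent yields the same value in every replicate that retains that respondent, so its estimated variance is zero regardless of the true sampling variability.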

Auxiliary information was drawn from the 2016 Census. Three variables at the CT level were identified as possible explanatory variables for the mean sense of community belonging. Two involved mobility status and one involved age. Specifically, the auxiliary variables were (1) the percentage of the population in the CT present for less than one year, (2) the percentage of the population in the CT present for less than five years, and (3) the percentage of the population in the CT aged 20 to 34. Of the 5,678 CTs in the 2016 Census, 5,656 had values for all three of these variables. Thus, small area estimates were produced for 5,656 areas.

All three auxiliary variables were initially considered in the CT-level model. Only one, the percentage of individuals aged 20 to 34 in the CT, had a significant regression coefficient estimate and was retained. To improve the model fit, a method of regression splines of order 2 using this auxiliary variable was implemented. Briefly, this involved creating more auxiliary variables by segmenting the original auxiliary variable and using them in the CT-level model.
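The idea of segmenting one auxiliary variable into several can be sketched as follows. This sketch reads "order 2" as a piecewise-linear spline and uses a truncated-line basis; both that reading and the knot locations are assumptions made for illustration, not details given in the paper.

```python
from typing import List, Sequence


def linear_spline_basis(x: float, knots: Sequence[float]) -> List[float]:
    """Return [x, (x - k1)+, (x - k2)+, ...] for one census tract."""
    return [x] + [max(0.0, x - k) for k in knots]


# Hypothetical knots, in percentage points of the CT population aged 20 to 34.
knots = [15.0, 25.0]
pct_aged_20_34 = [10.0, 18.0, 30.0]

# Each row holds the derived auxiliary variables for one CT.
design_rows = [linear_spline_basis(x, knots) for x in pct_aged_20_34]
```

Each extra column is zero below its knot and increases linearly above it, which lets the fitted slope change from one segment of the age variable to the next.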

The combined CCHS sample produced direct estimates for 4,863 CTs, though 237 of them were excluded from the estimation of model parameters in the CT-level model. One of these 237 CTs was excluded from the modelling because it was deemed an outlier, and the other 236 CTs were excluded because the direct variance was estimated to be zero. This happened in CTs with extremely small sample sizes (fewer than five), casting doubt on the accuracy of the direct variance estimates. Of these 236 CTs, 201 had a combined sample size of one.

Area-level model

The linear Fay-Herriot model is the most commonly used model in practice to obtain small area estimates. In this model, the assumptions about the relationship between the population parameter of interest ($\theta_i$, for area $i = 1, 2, \ldots, m$) and external information, whose source is independent of the survey, are at the area level. In this application, the parameter of interest is the population mean sense of community belonging and the areas are CTs. The Fay-Herriot model is often represented by its two components, the sampling and linking models shown below:

Sampling model: $\hat{\theta}_i = \theta_i + e_i$

Linking model: $\theta_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_p x_{pi} + v_i$,

where $\hat{\theta}_i$ denotes the direct estimate (CCHS weighted) for area $i$, and $\beta_0$ and $\beta_j$, $j = 1, 2, \ldots, p$, are unknown regression coefficients. The term $e_i$ is called the sampling error, whereas the term $v_i$ is called the linking model error. The model allows for many auxiliary variables $x_j$. By combining these two components, the linear Fay-Herriot model is obtained:

Fay-Herriot model: $\hat{\theta}_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_p x_{pi} + (v_i + e_i)$.

The Fay-Herriot model resembles a standard linear regression model but its error structure is different, and this is why special estimation methods are necessary to estimate its unknown parameters (Rao and Molina 2015). The estimates of the model parameters $\beta_0, \beta_1, \ldots, \beta_p$ are denoted by $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_p$, respectively.
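The difference in error structure can be made concrete under the standard Fay-Herriot assumptions described in Rao and Molina (2015). The symbols $\sigma_v^2$ (variance of the linking model error) and $\psi_i$ (sampling variance of the direct estimate in area $i$) are notation introduced here for illustration; they do not appear in the text above.

$$
v_i \sim N(0, \sigma_v^2), \qquad e_i \sim N(0, \psi_i), \qquad v_i \text{ and } e_i \text{ independent},
$$

$$
\operatorname{Var}(v_i + e_i) = \sigma_v^2 + \psi_i .
$$

Because $\psi_i$ differs from area to area, the combined error is heteroscedastic, unlike the constant-variance error of an ordinary linear regression, and $\sigma_v^2$ has to be estimated along with the regression coefficients.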

For small area estimation to work well, the assumptions underlying the Fay-Herriot model must hold. The model assumes a linear relationship and normality of model errors $v_i + e_i$. These aspects must be validated. A valid model is particularly useful for the smallest areas and serves to increase the precision of estimates. In our context, model validation was conducted by carefully examining model diagnostics and graphs, and making modifications until a satisfactory model was found. In particular, one outlier was identified through this validation process. This outlier was then excluded from the estimation of the model parameters.

The application of the Fay-Herriot model requires knowing the variance of the direct estimates (that is, the squared standard error). While these variances were estimated using the bootstrap weights, these direct variance estimates may be quite unstable when the sample size in the area is small. In this project, this instability was reduced by modelling the direct variance estimates using a log-linear smoothing model. The resulting variance estimates, called smoothed variance estimates, were used in the application of the Fay-Herriot model. Greater detail about the log-linear smoothing model and its validation can be found in the article by Hidiroglou, Beaumont and Yung (2019).
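A log-linear smoothing step of this general kind can be sketched as below. The choice of log sample size as the only predictor, the made-up input values and the omission of any back-transformation adjustment are all assumptions made for illustration; the model actually used is documented in Hidiroglou, Beaumont and Yung (2019).

```python
import math
from typing import Sequence, Tuple


def fit_log_linear(
    variances: Sequence[float], sizes: Sequence[int]
) -> Tuple[float, float]:
    """OLS fit of log(variance) = a + b * log(n); all inputs must be positive."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(v) for v in variances]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def smoothed_variance(n: int, intercept: float, slope: float) -> float:
    """Predicted variance on the original scale for a CT with sample size n."""
    return math.exp(intercept + slope * math.log(n))


# Made-up direct variance estimates for five CTs of increasing sample size.
sizes = [5, 10, 20, 40, 80]
direct_vars = [0.30, 0.08, 0.06, 0.02, 0.012]

a, b = fit_log_linear(direct_vars, sizes)
smooth = [smoothed_variance(n, a, b) for n in sizes]
```

The smoothed values follow a single fitted curve in the sample size, so an erratic direct variance from a small CT is replaced by a value informed by all CTs.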

For each area for which a direct estimate is available, the small area estimate is a weighted combination of the direct estimate, $\hat{\theta}_i$, and the model prediction, $\hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \hat{\beta}_2 x_{2i} + \cdots + \hat{\beta}_p x_{pi}$, also called the synthetic estimate. This weighted combination is called the composite estimate of $\theta_i$. This small area composite estimate leans towards the direct estimate when the latter is precise, typically when the sample size in area $i$ is large. However, when the quality of the direct estimate is poor in a given area, the composite estimate will lean towards the synthetic estimate. One advantage of small area estimation techniques is that it is possible to produce an estimate for an area for which no sample data exist. In this case, the small area estimate consists solely of the prediction from the model, i.e., the synthetic estimate. The small area estimation process also produces a standard error of the small area estimates, providing an indication of the quality of the small area estimates.
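The weighting described above can be sketched with the standard Fay-Herriot shrinkage factor, in which the weight on the direct estimate is the linking model variance divided by the sum of the linking model variance and the sampling variance (Rao and Molina 2015). The function name and all numeric values are hypothetical, and the linking model variance is treated as known here, whereas in practice it is estimated.

```python
from typing import Optional


def composite_estimate(
    direct: Optional[float],
    sampling_var: Optional[float],
    synthetic: float,
    model_var: float,
) -> float:
    """Shrink the direct estimate towards the synthetic estimate."""
    if direct is None or sampling_var is None:
        return synthetic  # no usable sample data: purely synthetic estimate
    gamma = model_var / (model_var + sampling_var)  # weight on the direct estimate
    return gamma * direct + (1.0 - gamma) * synthetic


model_var = 0.01   # hypothetical variance of the linking model error
synthetic = 2.80   # hypothetical model prediction for the CT

precise = composite_estimate(3.20, 0.001, synthetic, model_var)   # small sampling variance
noisy = composite_estimate(3.20, 0.19, synthetic, model_var)      # large sampling variance
unsampled = composite_estimate(None, None, synthetic, model_var)  # no sample in the CT
```

With these values the precise direct estimate keeps about 91% of the weight, while the noisy one keeps only 5% and lands close to the synthetic estimate.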

Small area estimates were produced for all 5,656 CTs. Of these, 4,626 were composite estimates and 1,030 were synthetic estimates. The synthetic estimates covered the 793 CTs for which no sample data were available, plus the 237 CTs whose direct estimates were not retained.
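The counts reported above can be reconciled with a few lines of arithmetic. The sketch assumes that all 4,863 CTs with a direct estimate are among the 5,656 CTs with complete auxiliary data, which is what the reported totals imply.

```python
cts_with_auxiliary_data = 5_656
cts_with_direct_estimate = 4_863
excluded_outlier = 1
excluded_zero_variance = 236

excluded = excluded_outlier + excluded_zero_variance          # direct estimates not retained
composite = cts_with_direct_estimate - excluded               # direct estimate and model combined
no_sample = cts_with_auxiliary_data - cts_with_direct_estimate
synthetic = no_sample + excluded                              # model prediction only
```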

The discussion now turns to diagnostics to validate the linearity and normality assumptions of the area-level model and to a description of the quality of the CT-level community belonging estimates.

Many diagnostic plots (not shown here) were scrutinized and various measures were calculated to assess the adequacy of the models. In summary, a strong linear relationship between the mean sense of community belonging and the predicted value from the area-level model is apparent, with a coefficient of determination equal to 0.81. See Hidiroglou, Beaumont and Yung (2019) for a definition of the coefficient of determination for the Fay-Herriot model. A graph of standardized residuals against predicted values shows no evidence of non-linearity. Other evidence, such as a Q–Q plot of standardized residuals versus standard normal quantiles, suggests that the normality assumption of the model errors holds. Equally important is the adequacy of the variance smoothing model for the logarithm of the direct variance estimates. Inspection of the standardized residuals from the variance smoothing model suggests an adequate log-linear model. In addition, the smoothed variance estimates were compared with the direct variance estimates to ensure that no systematic bias was introduced by the smoothing.

With the model diagnostics judged adequate, the small area estimates were compared with the direct estimates. The direct estimates are more variable than the small area estimates. For most areas, the small area estimate has a smaller coefficient of variation than the direct estimate. The difference is particularly notable for the areas with the smallest sample sizes. This provides evidence that small area estimation methods usually improve the precision of estimates, sometimes substantially.
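The comparison rests on the coefficient of variation, that is, the standard error divided by the estimate. A minimal sketch follows; the estimates and standard errors are invented for a single hypothetical CT and are not taken from the study.

```python
def coefficient_of_variation(estimate: float, standard_error: float) -> float:
    """Standard error expressed as a fraction of the estimate."""
    if estimate == 0:
        raise ValueError("the coefficient of variation is undefined for a zero estimate")
    return standard_error / abs(estimate)


# Hypothetical CT with a small sample: a noisy direct estimate and a
# small area estimate that borrows strength from the model.
direct_cv = coefficient_of_variation(3.00, 0.30)
sae_cv = coefficient_of_variation(2.85, 0.057)
```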

Results

Across CTs, small area estimates of community belonging averaged 2.81 on the 1-to-4 response scale, with a standard deviation of 0.41. Community belonging ranged from a low of 2.67 to a high of 3.28 across CTs, yielding a range of 0.61. Over half of CTs have an estimated community belonging score from 2.81 to 2.83, underscoring the narrow range of the distribution.

To situate CT-level community belonging within a broader empirical context, the variable was integrated with a set of seven other CT-level variables drawn from the 2016 Census, the 2016 to 2018 T1 Family file and crime statistics based on administrative data. These seven variables are

  • population density in the CT
  • percentage of the CT population residing in multi-unit dwellings
  • percentage of the CT population residing in rented dwellings
  • percentage of the CT population not residing at same address one year earlier
  • median adult-equivalent adjusted (AEA) family income in the CT
  • mean age of the CT population
  • percentage of the population residing in the 10% of CTs with the highest rates of violent crime.

The Pearson correlations between these variables are presented in Table 1. Before the correlates of community belonging are considered, it is instructive to consider the correlations between some of the other variables. As shown in the third column of Table 1, CT-level population density is positively correlated with the percentage of the CT population residing in multi-unit dwellings (Pearson correlation = 0.595), the percentage of the CT population residing in rented dwellings (0.531) and the percentage of the CT population who did not live at the same address a year earlier (0.346). Population density is also negatively correlated with median AEA family income (-0.278) and mean age (-0.120) in the CT. Overall, higher-density neighbourhoods tend to be characterized by younger populations, families with lower incomes, a greater prevalence of multi-unit and rented dwellings, and a shorter duration of residence.

Some of the characteristics above are expected to be correlated with CT-level community belonging given their relationships observed at the individual level, such as duration of residence, population turnover, and dwelling type (Schellenberg et al. 2018). The first column in Table 1 shows that CT-level community belonging is negatively correlated with population density (-0.440) and the percentages of CT populations residing in multi-unit dwellings (-0.526), residing in rented dwellings (-0.594) and with a duration of residence of less than one year (-0.678). Median AEA family income (0.380) and mean age (0.395) are positively correlated with CT-level community belonging.
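For reference, the Pearson correlation between two CT-level variables can be computed as in the sketch below. The five pairs of values are invented to show a negative association and are not drawn from the study data.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, with at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical CTs: community belonging falls as population density rises.
belonging = [2.84, 2.82, 2.80, 2.78, 2.76]
density = [1500.0, 2500.0, 3000.0, 5000.0, 6500.0]
r = pearson(belonging, density)
```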



Table 1
Pearson correlations between selected census tract–level characteristics, census metropolitan areas in Canada, 2016 to 2019

| Characteristic | SAE community belonging | Median AEA family income | Population density | Percentage not living at same address one year prior | Percentage in rented dwelling | Percentage in multi-unit dwelling | Percentage in top 10% of CTs with most crime | Mean age |
|---|---|---|---|---|---|---|---|---|
| SAE community belonging | 1.000 | ... | ... | ... | ... | ... | ... | ... |
| Median AEA family income | 0.380*** | 1.000 | ... | ... | ... | ... | ... | ... |
| Population density | -0.440*** | -0.278*** | 1.000 | ... | ... | ... | ... | ... |
| Percentage not living at same address one year prior | -0.678*** | -0.345*** | 0.346*** | 1.000 | ... | ... | ... | ... |
| Percentage in rented dwelling | -0.594*** | -0.621*** | 0.531*** | 0.646*** | 1.000 | ... | ... | ... |
| Percentage in multi-unit dwelling | -0.526*** | -0.526*** | 0.595*** | 0.552*** | 0.814*** | 1.000 | ... | ... |
| Percentage in top 10% of CTs with most crime | -0.258*** | -0.226*** | -0.024 (Table 1 Note) | 0.356*** | 0.276*** | 0.181*** | 1.000 | ... |
| Mean age | 0.395*** | 0.032* | -0.120*** | -0.205*** | -0.022§ | -0.012§ | 0.020§ | 1.000 |

... not applicable

To gain further perspective on community belonging, CTs were sorted from highest to lowest on community belonging and then divided into three groups of equal size. This yielded three CT-level community belonging terciles across which compositional characteristics can be compared. To begin, average community belonging within terciles ranged from 2.76 among CTs in the bottom tercile to 2.84 among those in the top tercile. The narrow range of the estimates is evident again.
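The tercile assignment can be sketched as follows. The CT identifiers and scores are hypothetical, and the handling of ties and of CT counts not divisible by three is an assumption, since the paper does not describe it.

```python
from typing import Dict


def assign_terciles(scores: Dict[str, float]) -> Dict[str, str]:
    """Label each CT 'bottom', 'middle' or 'top' according to its rank."""
    ranked = sorted(scores, key=scores.get)  # lowest community belonging first
    n = len(ranked)
    names = ("bottom", "middle", "top")
    # Integer division maps ranks 0..n-1 onto the three groups 0, 1 and 2.
    return {ct: names[3 * rank // n] for rank, ct in enumerate(ranked)}


# Hypothetical small area estimates for six CTs.
estimates = {"A": 2.70, "B": 2.79, "C": 2.81, "D": 2.82, "E": 2.85, "F": 3.10}
terciles = assign_terciles(estimates)
```

Sorting from lowest to highest, as here, gives the same groups as sorting from highest to lowest; only the order in which the groups are filled differs.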

Nonetheless, the terciles capture a large degree of variation across other CT-level variables. Average population density among CTs in the bottom community-belonging tercile was 5,915 individuals per km², almost three times the average density among CTs in the top tercile, at 2,016 per km². Similarly, the average share of CT populations residing in multi-unit dwellings was 68% in the bottom tercile and 29% in the top tercile, while the shares in rented dwellings ranged from 48% to 17%. Almost one in five residents (19%) in CTs with low community belonging had lived at their current address for less than a year, about twice the share in high community belonging CTs, at 10%. The average age in low community belonging CTs was 3.6 years younger than that in high community belonging CTs, and median AEA income was about $12,500 lower. Finally, 20% of the total population in the bottom community-belonging tercile resided in the 10% of CTs with the highest rates of violent crime. The shares of the population in the middle and top community-belonging terciles residing in ‘high-crime’ CTs were 6% and 3%, respectively.


Table 2
Summary characteristics of census tracts, by community-belonging tercile, census metropolitan areas in Canada, 2016 to 2019
Table summary
This table displays the results of Summary characteristics of census tracts Bottom CT community-belonging tercile, Middle CT community-belonging tercile and Top CT
community-belonging tercile, calculated using number, people/km, years, percent and dollars units of measure (appearing as column headers).
| Characteristic (unit) | Bottom CT community-belonging tercile | Middle CT community-belonging tercile | Top CT community-belonging tercile |
| --- | --- | --- | --- |
| Mean CT community belonging, range of 1 to 4 (number) | 2.76 | 2.82 | 2.84 |
| Mean population density (people/km²) | 5,915 | 3,162 | 2,016 |
| Mean age (years) | 38.1 | 39.8 | 41.8 |
| Mean percentage in multi-unit dwellings (percent) | 68.3 | 45.1 | 28.5 |
| Mean percentage in rented dwellings (percent) | 48.4 | 27.6 | 16.7 |
| Mean percentage not living at same address one year prior (percent) | 19.3 | 11.9 | 9.8 |
| Percentage of tercile population residing in top 10% of violent crime CTs (percent) | 19.9 | 5.7 | 2.8 |
| Median AEA family income (dollars) | 35,095 | 39,097 | 47,572 |
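The comparisons quoted in the text can be checked directly against the Table 2 figures. The short sketch below only redoes that arithmetic; the dictionary keys are hypothetical names chosen for illustration, and the inputs are the rounded values published in the table.

```python
# Published (rounded) Table 2 values for the bottom and top terciles.
bottom = {"density": 5915, "moved_last_year_pct": 19.3, "median_income": 35095}
top = {"density": 2016, "moved_last_year_pct": 9.8, "median_income": 47572}

# Ratio of mean population density, bottom tercile relative to top.
density_ratio = bottom["density"] / top["density"]

# Ratio of the share of residents who moved within the past year.
mover_ratio = bottom["moved_last_year_pct"] / top["moved_last_year_pct"]

# Gap in median AEA family income, in dollars.
income_gap = top["median_income"] - bottom["median_income"]

print(f"density ratio: {density_ratio:.2f}")
print(f"recent-mover ratio: {mover_ratio:.2f}")
print(f"income gap: {income_gap:,}")
```

The density ratio is a little under 3, the recent-mover ratio is a little under 2, and the income gap is 12,477 dollars, which matches the "almost three times", "about twice" and "about $12,500" descriptions in the text.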

As a final step in this initial analysis, a multinomial logistic regression model was run in which community-belonging terciles were regressed against the seven other CT-level variables. The second tercile was used as the reference group. Regression coefficients are shown in Table 3. After other characteristics in the model were taken into account, population density was positively correlated with a CT being in the bottom community-belonging tercile and negatively correlated with it being in the top tercile. The same pattern is observed in terms of the shares of CT populations with short durations of residence and those residing in rented dwellings. The correlations between CT populations residing in multi-unit dwellings and community-belonging terciles are not significant when other variables are taken into account.

Consistent with the bivariate results above, average age is negatively correlated with a CT being in the bottom community-belonging tercile and positively correlated with it being in the top tercile, net of other variables in the model. A strong relationship with crime is observed again: CTs in the top 10% of violent crime rates were significantly more likely than other CTs to be in the bottom community-belonging tercile and significantly less likely to be in the top tercile. The correlation between median CT AEA family income and community belonging is less straightforward, with negative correlations observed with the likelihood of being in either the bottom or the top community-belonging tercile.
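For reference, a multinomial logistic regression with the middle tercile as the reference group can be written in the following standard textbook form. The notation is an assumption introduced here for illustration and does not appear in the paper: $T_i$ is the tercile of CT $i$, $\mathbf{x}_i$ is the vector of the seven CT-level covariates, and $\alpha_k$ and $\boldsymbol{\beta}_k$ are the intercept and coefficients for outcome $k$.

```latex
\ln\!\left(\frac{\Pr(T_i = k)}{\Pr(T_i = \text{middle})}\right)
  = \alpha_k + \mathbf{x}_i^{\top}\boldsymbol{\beta}_k,
  \qquad k \in \{\text{bottom},\ \text{top}\}
```

Under this form, each coefficient describes how a one-unit change in a covariate shifts the log-odds of a CT being in the bottom (or top) tercile rather than the middle tercile, holding the other covariates fixed. A positive sign in one column and a negative sign in the other therefore indicates a monotonic relationship across terciles, while two negative signs, as for income, do not.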


Table 3
Census tract–level community belonging regressed against selected census tract–level covariates
| Covariate | Bottom CT community-belonging tercile (coefficient) | Middle CT community-belonging tercile (coefficient) | Top CT community-belonging tercile (coefficient) |
| --- | --- | --- | --- |
| Log population density | 0.337*** | ... | -0.233*** |
| Percentage of CT population in multi-unit dwellings | 0.001 | ... | -0.003 |
| Percentage of CT population not at same address one year earlier | 0.269*** | ... | -0.076*** |
| Percentage of CT population in rented dwellings | 0.011*** | ... | -0.032*** |
| Log AEA family income | -0.157** | ... | -0.137*** |
| Mean age | -0.158*** | ... | 0.147*** |
| CT in top 10% of CTs in terms of violent crime rate | 0.602*** | ... | -1.211*** |

... not applicable
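One common way to read multinomial logit coefficients such as those in Table 3 is to exponentiate them, which gives the multiplicative change in the odds of being in a given tercile rather than the reference (middle) tercile for a one-unit increase in the covariate. The sketch below applies that standard transformation to two rows of the table. It is an illustrative reading aid, not a calculation reported in the paper, and the names `TABLE3` and `relative_risk_ratio` are hypothetical.

```python
import math

# Coefficients copied from Table 3 as (bottom tercile, top tercile);
# the middle tercile is the reference group.
TABLE3 = {
    "Log population density": (0.337, -0.233),
    "CT in top 10% of violent crime rate": (0.602, -1.211),
}


def relative_risk_ratio(coef):
    """Exponentiate a multinomial logit coefficient to obtain the odds
    multiplier relative to the reference category."""
    return math.exp(coef)


for name, (b_bottom, b_top) in TABLE3.items():
    print(
        f"{name}: bottom vs. middle x{relative_risk_ratio(b_bottom):.2f}, "
        f"top vs. middle x{relative_risk_ratio(b_top):.2f}"
    )
```

Read this way, a CT among the top 10% for violent crime has odds of being in the bottom rather than the middle tercile that are about 1.83 times those of other CTs, and odds of being in the top rather than the middle tercile that are about 0.30 times those of other CTs, other covariates held constant.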

Discussion

Overall, applying small area estimation techniques to data pooled across successive years of the CCHS yields improved estimates of community belonging at the CT level. This adds a much-needed measure of social ties and place attachment to the growing stock of neighbourhood-level information available from Statistics Canada. Even with four years of pooled CCHS data, sample sizes were still small in many CTs, leaving room to further strengthen CT-level estimates. Adding the 2020 CCHS to the pooled sample and re-estimating CT-level community belonging using small area estimation techniques is one way forward.

The bivariate and multivariate correlations above highlight just a few of the factors that contribute to, or detract from, community belonging within neighbourhoods. CT-level estimates of population density, duration of residence, housing tenure, income, age and crime are all significantly correlated with community belonging and, with the exception of income, the correlations are in the expected direction.

Other neighbourhood characteristics may further detract from, or contribute to, community belonging within neighbourhoods. Parks and green space, for example, might be expected to enhance community belonging by providing opportunities for social interactions or proximity to nature. The availability of CT-level estimates of community belonging provides new opportunities to assess such factors. It also presents the chance to enhance analysis of the relationships between neighbourhood characteristics and the outcomes experienced by individuals, such as those pertaining to health and well-being.

References

Bartscher, A.K., S. Seitz, S. Siegloch, M. Slotwinski, and N. Wehrhöfer. 2020. Social Capital and the Spread of Covid-19: Insights from European Countries. ECONtribute Discussion Paper no. 007. Bonn and Cologne: University of Bonn and University of Cologne, Reinhard Selten Institute.

Carrillo-Alvarez, E. and I.Kawachi and J. Riera-Romani. 2019. “Neighbourhood social capital and obesity: A systematic review of the literature.” Obesity Reviews 20 (1): 119–141.

Chola, L., and O. Alaba. 2013. “The social determinants of multimorbidity in South Africa.” International Journal for Equity in Health 12 (63).

Helliwell, J., H. Shiplett, and C.P. Barrington-Leigh. 2018. How Happy Are Your Neighbours? Variation in Life Satisfaction Among 1200 Canadian Neighbourhoods and Communities. National Bureau of Economic Research working paper no. 24592. DOI:10.3386/w24592.

Hidiroglou, M., J.F. Beaumont, and W. Yung. 2019. “Development of a small area estimation system at Statistics Canada.” Survey Methodology 45 (1): 101–126.

Kim, Y.C., and J. Kang. 2009. “Communication, neighbourhood belonging and household hurricane preparedness.” Disasters 34 (2): 470–488.

Marsh, G., and P. Buckle. 2001. “Community: The concept of community in the risk and emergency management context.” Australian Journal of Emergency Management 16 (1).

Mohnen, S., B. Volker, H. Flap, S.V. Subramanian, and P.P. Groenewegen. 2013. “You have to be there to enjoy it? Neighbourhood social capital and health.” European Journal of Public Health 23 (1): 33–39.

Rao, J.N.K., and I. Molina. 2015. Small Area Estimation. Hoboken: John Wiley & Sons, Inc.

Schellenberg, G., C. Lu, C. Schimmele, and F. Hou. 2018. “The correlates of self-assessed community belonging in Canada: Social capital, neighbourhood characteristics, and rootedness.” Social Indicators Research 140: 597–618.

Zahnow, R., R. Wickes, M. Taylor, and J. Corcoran. 2019. “Community social capital and individual functioning in the post‐disaster context.” Disasters 43 (2): 261–288.
