Data source for income distributions: the Social Policy Simulation Database and Model

Annual income distributions of households in the Macroeconomic Accounts (MEA) draw on data from Statistics Canada’s Social Policy Simulation Database and Model (SPSD/M). The SPSD/M is designed to analyze the financial interactions of governments and individuals in Canada and to assess the cost implications or income redistributive effects of changes to the personal taxation and cash transfer system. Underlying the SPSD/M is a statistically representative database of individuals in their family context, which is updated regularly and available as a publicly released microdata file. The database carries enough information on each individual to compute taxes paid to, and cash transfers received from, government.

The SPSD/M provides a comprehensive picture of the sources and patterns of household income, as it integrates microdata from various regular annual household surveys with government administrative data on the Canadian tax/transfer system. In particular, SPSD/M data on household income are derived from the Survey of Labour and Income Dynamics (SLID) and the Canadian Income Survey (CIS), while aggregated data on consumption are derived from the Survey of Household Spending (SHS). The survey data are then statistically matched to federal government administrative data on Employment Insurance (EI) claims and personal income tax returns. For 2009 onward, the tax data come from Statistics Canada’s Annual Income Estimates for Census Families and Individuals (T1 Family File, or T1FF); for earlier years, they come from a sample of personal income tax returns created by the Canada Revenue Agency. The techniques used to create the database and to avoid disclosing confidential data include various forms of categorical matching and stochastic imputation.
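Categorical matching with stochastic imputation can be illustrated with a minimal hot-deck sketch. This is not the SPSD/M procedure itself, only a simplified illustration of the general idea: survey records and administrative records are grouped into cells by shared categorical variables, and each survey record receives a value drawn at random from a donor in the same cell. The function name `hot_deck_match`, the field names and all data values are hypothetical.

```python
import random
from collections import defaultdict

def hot_deck_match(recipients, donors, keys, field, seed=0):
    """Impute `field` on each recipient from a random donor in the same cell.

    A cell is the tuple of values of the categorical `keys`.
    Recipients whose cell has no donor receive None.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    cells = defaultdict(list)
    for d in donors:
        cells[tuple(d[k] for k in keys)].append(d[field])

    matched = []
    for r in recipients:
        pool = cells.get(tuple(r[k] for k in keys))
        value = rng.choice(pool) if pool else None  # stochastic imputation
        matched.append({**r, field: value})
    return matched

# Hypothetical survey records (recipients) and administrative records (donors).
survey = [
    {"id": 1, "province": "ON", "age_group": "25-44"},
    {"id": 2, "province": "QC", "age_group": "65+"},
    {"id": 3, "province": "BC", "age_group": "25-44"},
]
admin = [
    {"province": "ON", "age_group": "25-44", "ei_benefits": 0},
    {"province": "ON", "age_group": "25-44", "ei_benefits": 4200},
    {"province": "QC", "age_group": "65+", "ei_benefits": 0},
]

matched = hot_deck_match(survey, admin, ["province", "age_group"], "ei_benefits")
```

Because no administrative record is linked to an identifiable survey respondent, a matching scheme of this kind combines the two sources without exact record linkage. Record 3 has no donor in its cell and is left unimputed in this sketch; a production system would need a fallback rule for such cases.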

Certain adjustments are made within the SPSD/M using auxiliary information. For example, because persons in collective dwellings are not included in the sampling frames of the SLID, CIS or SHS, records are added to the SPSD/M to reflect the institutionalized elderly population using information from the Census. To better reproduce estimates from personal income tax returns, high-income individuals are cloned and their incomes are replaced with tax filer information. Further adjustments compensate for item non-response in the underlying survey data (e.g., the number of families receiving social assistance can be increased to match control totals). While these adjustments address some of the differences between survey and MEA estimates, work is under way to further improve the SPSD/M estimates, with the ultimate goal of improving the concordance between the SPSD/M microdata and MEA concepts. For example, since SPSD/M coverage does not extend to the territories (Yukon, the Northwest Territories and Nunavut), the DHEA estimates incorporate auxiliary information from other Statistics Canada surveys, such as the SHS, that include economic and demographic information for households in those territories.
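The cloning of high-income individuals described above can be sketched as follows. The source does not say how survey weights are treated when a record is cloned, so this sketch assumes the original weight is split equally among the clones, which keeps the weighted population unchanged. The function `clone_high_income`, the threshold and all income values are hypothetical.

```python
def clone_high_income(records, threshold, tax_incomes):
    """Replace each high-income record with clones carrying tax filer incomes.

    Assumption (not from the source): the record's survey weight is split
    equally among its clones, so the weighted population is preserved.
    """
    out = []
    for rec in records:
        if rec["income"] < threshold:
            out.append(dict(rec))  # below the threshold: keep unchanged
            continue
        share = rec["weight"] / len(tax_incomes)
        for inc in tax_incomes:
            out.append({**rec, "income": inc, "weight": share})
    return out

# Hypothetical survey records and hypothetical tax filer incomes.
records = [
    {"id": 1, "income": 40_000, "weight": 500.0},
    {"id": 2, "income": 300_000, "weight": 200.0},
]
tax_incomes = [250_000, 400_000, 1_200_000, 2_150_000]

result = clone_high_income(records, threshold=250_000, tax_incomes=tax_incomes)
total_weight = sum(r["weight"] for r in result)
```

Under these assumptions, one survey record representing 200 people becomes four records representing 50 people each, so the upper tail of the income distribution follows the tax data more closely than a single survey response could.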

The SPSD/M contains more than 300,000 composite individuals residing in over 100,000 households in the ten Canadian provinces, for a time series extending back to 1995. There are approximately 600 variables covering detailed socio-economic and demographic data as well as information on weekly employment histories, consumption patterns and itemized tax deductions. Because the database records the full family structure for each individual, the familial relationships between all household members can be identified.
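Because individuals are stored with their household context, person-level records can be rolled up to households before a distribution is computed. The sketch below illustrates one way to do this and to derive weighted income shares by quintile. It is not the method used for the MEA estimates. The field names (`household_id`, `income`, `weight`), the assumption that all members of a household share one weight, and the data are all hypothetical.

```python
from collections import defaultdict

def household_incomes(persons):
    """Sum person-level income to one (income, weight) pair per household."""
    totals = defaultdict(float)
    weights = {}
    for p in persons:
        totals[p["household_id"]] += p["income"]
        weights[p["household_id"]] = p["weight"]  # assumed equal within a household
    return [(totals[h], weights[h]) for h in totals]

def income_shares(households, n_groups=5):
    """Share of total weighted income held by each weighted quantile group.

    Simplification: each household is assigned whole to the group that
    contains the midpoint of its cumulative weight.
    """
    ordered = sorted(households)
    total_w = sum(w for _, w in ordered)
    total_inc = sum(inc * w for inc, w in ordered)
    shares = [0.0] * n_groups
    cum = 0.0
    for inc, w in ordered:
        mid = (cum + w / 2) / total_w
        group = min(int(mid * n_groups), n_groups - 1)
        shares[group] += inc * w / total_inc
        cum += w
    return shares

# Hypothetical person records: five households, eight persons.
persons = [
    {"household_id": "h1", "income": 6_000, "weight": 100.0},
    {"household_id": "h1", "income": 4_000, "weight": 100.0},
    {"household_id": "h2", "income": 20_000, "weight": 100.0},
    {"household_id": "h3", "income": 18_000, "weight": 100.0},
    {"household_id": "h3", "income": 12_000, "weight": 100.0},
    {"household_id": "h4", "income": 40_000, "weight": 100.0},
    {"household_id": "h5", "income": 70_000, "weight": 100.0},
    {"household_id": "h5", "income": 30_000, "weight": 100.0},
]

shares = income_shares(household_incomes(persons))
```

With these made-up figures, the lowest quintile holds 5% of total household income and the highest holds 50%.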

More information on the SPSD/M can be found on Statistics Canada’s SPSD/M website.
