User Guide for the Survey of Household Spending, 2014
2. Survey methodology
The target population of the 2014 SHS is the population of Canada’s 10 provinces, excluding residents of institutions, members of the Canadian Forces living in military camps and people living on Indian reserves. These exclusions account for about 2% of the population of the 10 provinces.
For operational reasons, people living in certain remote areas where the rate of vacant dwellings is very high and where the collection cost would be exorbitant are excluded from collection. Also excluded are people living in other types of collective dwellings such as:
- people living in residences for dependent seniors;
- people living permanently in school residences, work camps, etc.; and
- members of religious and other communal colonies.
Collection exclusions represent less than 0.5% of the target population. However, these people are included in the population estimates to which the SHS estimates are adjusted (see Section 2.6).
The SHS primarily collects detailed information on household expenditures. It also collects information on household demographic characteristics and certain dwelling characteristics (e.g., type, age and tenure), as well as some information on household equipment (e.g., electronics and communications equipment). In addition, income information from personal income tax data is combined with the survey data.
For expenditure information collected through the questionnaire, the length of the reference period depends on the recall period specified in the question (e.g., the past month, the past three months, or the past 12 months). The reference period also varies with the collection month (e.g., for households in the January 2014 sample, “the past 12 months” signifies the period from January 2013 to December 2013, while for households in the December 2014 sample, it refers to the months between December 2013 and November 2014). Expenditures collected in the expenditure diary are reported for a period of two weeks.
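The recall-period arithmetic described above can be sketched as a small date calculation. This is only an illustration: the function name `reference_period` and its parameters are hypothetical and do not come from the survey documentation. It assumes the recall period always ends on the last day of the month preceding the collection month.

```python
from datetime import date, timedelta


def reference_period(collection_year, collection_month, recall_months):
    """Return (start, end) dates of a recall period that ends the month before collection."""
    # Count months from year 0 so that subtracting across a year boundary is simple.
    collection_index = collection_year * 12 + (collection_month - 1)
    start_index = collection_index - recall_months
    start = date(start_index // 12, start_index % 12 + 1, 1)
    # The period ends the day before the first day of the collection month.
    end = date(collection_year, collection_month, 1) - timedelta(days=1)
    return start, end


# "The past 12 months" for the January 2014 and December 2014 samples.
print(reference_period(2014, 1, 12))
print(reference_period(2014, 12, 12))
```

For the January 2014 sample this yields January 1 to December 31, 2013, and for the December 2014 sample it yields December 1, 2013 to November 30, 2014, matching the examples in the text.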
In general, longer recall periods are used to collect expenditures for goods and services that are more expensive or purchased infrequently or irregularly. In contrast, shorter recall periods are used for goods and services that are of less value or that are purchased frequently or at regular intervals.
For demographic characteristics, dwelling characteristics and household equipment, the reference period is the interview date. The reference period for income is the calendar year preceding the survey year (i.e., 2013 for the 2014 SHS).
The sample of the 2014 Survey of Household Spending consists of 17,109 households throughout the 10 provinces. A stratified, multi-stage sampling design was used to select the sample. It is essentially a two-stage design. In the first stage, a sample of geographic areas (referred to as clusters) is selected. In the second stage, a list of all the dwellings in the selected clusters is prepared and a sample of dwellings is selected. The selected dwellings that are inhabited by members of the target population constitute the survey’s sample of households. The SHS uses a number of components of the Labour Force Survey’s (LFS) sample design to minimize operating costs, though the dwellings selected for the SHS are different from those selected for the LFS.
Fifty percent of sampled households are selected to also complete an expenditure diary. Thus, in each selected cluster, a subsample of the previously chosen dwellings is selected in order to identify the dwellings for which the households will be asked to fill out a diary.
The national sample is first divided among the provinces on the basis of the variability of total household expenditures and, to a lesser extent, the number of households in each province. The goal is to obtain provincial estimates of similar quality to the national estimates. Provincial sample sizes are shown in Table 1 of Section 3. The sample is then divided into strata by grouping clusters with similar characteristics based on a number of socio-demographic variables. Some strata were defined to target specific subpopulations such as high-income households. To improve the quality of the estimates, the high-income household strata are allocated a larger share of the sample than they would receive under the allocation proportional to stratum size that is used in other strata.
Since data are collected monthly, the sample is divided into 12 subsamples of similar size.
Users should note that the geographic concepts used for the 2014 SHS sample are those of the 2011 Census.
The SHS is a voluntary survey. The data are mostly obtained directly from the respondent through two collection modes: a personal interview conducted by an interviewer using a questionnaire on a laptop, and a diary in which the household is required to report its daily expenditures over a two-week period. The data for the 2014 SHS were collected on a continuous basis from January to December 2014 from a sample of households spread over 12 monthly collection cycles.
Firstly, households in the sample are asked to respond to a questionnaire (administered using a computer-assisted personal interview) that mainly collects regular expenditures (such as rent and electricity) and less frequent expenditures (such as furniture and dwelling repairs) for a recall period that varies in length depending on the type of expenditure. For regular expenditures, the amount of the last payment and the period it covered are typically collected. For the other types of expenditures collected in the interview, recall periods of one month, three months or twelve months are used. The recall periods are defined in terms of months preceding the month of the interview. For example, for a household in the June 2014 sample, “the past three months” corresponds to the period from March 1 to May 31, 2014. Demographic characteristics, dwelling characteristics and household equipment information, which are also collected in the interview, refer to the household’s situation at the time of the interview. Starting in 2013, respondents are informed that the survey data will be combined with tax data to obtain selected variables related to personal income for household members aged 16 and over on December 31 of the calendar year preceding the survey year. The reference period for personal income tax data is the calendar year prior to the survey year.
Fifty percent of sampled households are selected to also complete an expenditure diary. Following the interview, respondents of this subsample are asked to record the expenditures of all household members in an expenditure diary for a period of two weeks starting the day after the interview. Households are required to include all of their spending, except for a few types of expenditures, such as rent, regular utilities payments, and real estate and vehicle purchases. Households have the option of providing receipts of their purchases made during the two-week period in order to reduce the amount of information manually recorded in the diary. However, they are asked to write out additional information on the receipt if the description of the item appearing on the receipt is incomplete.
A telephone follow-up is carried out a few days after the interview to address any questions the respondent may have about the diary and to reiterate important information about how to complete it. At the end of the two-week period, the interviewer returns to the respondent’s residence to pick up the diary and ask a few additional questions to help the respondent report expenditures that he or she might have forgotten.
The diaries and all receipts supplied by respondents are scanned and captured at Statistics Canada’s head office. An expenditure classification code is assigned to each item from a list of over 650 different codes.
The electronic questionnaire contains many features designed to maximize the quality of the data collected. Many controls are built into the questionnaire to identify unusual values and detect logical inconsistencies in the reported data. When a response is rejected by the control, the interviewer is prompted to correct the information (with the respondent’s help, if necessary). Once the data are transmitted to the head office, a detailed verification of each questionnaire is undertaken through a comprehensive series of processing steps. Invalid responses are corrected or flagged for imputation.
A number of verification steps are also carried out on the diary data when the diaries are received at the head office as well as throughout the capture and coding steps. For example, checks are carried out to ensure that the start and end dates of the reference period of the diary are indicated, that the reported expenditures were made during the specified reference period, and that there are no items that appear in both the data written in the diary and on the receipts provided by the respondent. After validation, capture and coding, quality control procedures are applied. A sample of diaries is selected and completely verified once more to ensure that the diaries were captured and coded as specified in the procedures.
Next, a series of detailed verifications is performed on all diaries, and invalid responses are corrected or flagged for imputation. The final step is to assess whether the information reported in the diaries is of sufficient quality using parameters which are based on households’ characteristics. The reported expenditures and number of items are compared with minimum thresholds estimated for each geographic area (Atlantic Provinces, Quebec, Ontario, Prairie Provinces and British Columbia), household income class and household size. Diaries that satisfy the conditions are deemed usable. The remaining diaries are examined, and deemed usable if they include notes providing justification for their low expenditures or their small number of reported items (for example, a person living alone who had few expenses to report because he or she was on a business trip during the diary recording period). Diaries that do not meet the usability criteria are treated as non-response diaries; they are excluded from the estimates. It should be noted that some of the usable diaries are incomplete and may have non-responded days.
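The usability decision described above can be sketched as a threshold lookup followed by a fallback on justification notes. Everything specific in this sketch is an assumption made for illustration: the threshold values, the income class labels, the field names and the rule that both minimums must be met are hypothetical, and the actual parameters are not given in this guide.

```python
# Hypothetical minimum thresholds keyed by (region, income class, household size).
# The values are placeholders for illustration, not Statistics Canada parameters.
MIN_THRESHOLDS = {
    ("Ontario", "low", 1): {"expenditure": 100.0, "items": 10},
    ("Ontario", "high", 3): {"expenditure": 400.0, "items": 30},
}


def diary_is_usable(diary, thresholds=MIN_THRESHOLDS):
    """Return True if the diary meets its minimums or carries a justification note."""
    key = (diary["region"], diary["income_class"], diary["household_size"])
    minimum = thresholds[key]
    # Assumption: both the expenditure and the item-count minimums must be met.
    meets_minimums = (
        diary["total_expenditure"] >= minimum["expenditure"]
        and diary["item_count"] >= minimum["items"]
    )
    # Diaries below the thresholds remain usable when a note explains the low reporting.
    return meets_minimums or bool(diary.get("justification_note"))


diary = {
    "region": "Ontario",
    "income_class": "low",
    "household_size": 1,
    "total_expenditure": 40.0,
    "item_count": 3,
    "justification_note": "On a business trip during the recording period",
}
print(diary_is_usable(diary))
```

A diary for which this check fails would be treated as a non-response diary and excluded from the estimates.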
To solve problems of missing or invalid information in interview questions, donor imputation using the nearest neighbour method is generally applied. That is, data from another respondent with similar characteristics (the donor) are used to replace the missing or invalid values. The imputation is done on one group of variables at a time, with the groups formed on the basis of the relationships among the variables. The characteristics used to identify the donor are selected such that they are correlated with the variables to be imputed. Household income, dwelling type, and the number of adults and children are commonly used characteristics.
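A minimal sketch of nearest-neighbour donor imputation follows. It is not the procedure used in production: the scaled squared distance, the `scales` parameter, the field names and the sample values are all assumptions chosen for illustration, and only numeric matching characteristics are handled (a categorical one such as dwelling type would need its own distance rule).

```python
def nearest_donor(recipient, donors, match_keys, scales):
    """Pick the donor closest to the recipient on the matching characteristics."""
    def distance(donor):
        # Scale each characteristic so that income in dollars does not dominate counts.
        return sum(((recipient[k] - donor[k]) / scales[k]) ** 2 for k in match_keys)
    return min(donors, key=distance)


def impute_group(recipient, donors, match_keys, scales, group_keys):
    """Fill the missing values of one group of variables from a single donor."""
    donor = nearest_donor(recipient, donors, match_keys, scales)
    completed = dict(recipient)
    for key in group_keys:
        if completed.get(key) is None:
            completed[key] = donor[key]
    return completed


donors = [
    {"income": 40000, "adults": 2, "children": 0, "furniture": 500.0},
    {"income": 90000, "adults": 2, "children": 2, "furniture": 1500.0},
]
recipient = {"income": 85000, "adults": 2, "children": 1, "furniture": None}
scales = {"income": 10000, "adults": 1, "children": 1}
result = impute_group(recipient, donors, ["income", "adults", "children"], scales, ["furniture"])
print(result["furniture"])
```

Taking every missing value in a group from the same donor is one way to preserve the relationships among the variables in that group.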
Donor imputation is also used when information is missing from the expenditure diary. For instance, a respondent may have reported a particular expenditure item without its cost or given the total amount spent (for example, on groceries) without listing the individual items. Imputation is also used to enhance the level of detail in the coding of the items reported. For example, the information provided by the respondent may simply indicate that a bakery product was purchased, but a more detailed code is required to meet the survey’s needs. In this case, donor imputation is used to impute the type of bakery product (bread, crackers, cookies, cakes and other pastries, etc.). Diary imputation is carried out at the reported item level, and the characteristics most often used to identify the donor are cost, available partial code, household income and household size. Imputation is done by province and quarter to control for provincial differences and seasonality of expenditures.
Starting in 2012, the imputation method was refined to use supplementary information on the type of store where the purchases were made in order to produce detailed expenditures when a respondent has only provided a total amount in their diary. This method takes into account the increasing amount of grocery products sold in large chain stores that do not specialize in groceries.
For personal income tax data, missing or invalid data are generally donor-imputed.
Income and expenditure imputation is performed primarily with Statistics Canada’s Canadian Census Edit and Imputation System (CANCEIS).
After imputation, taxes are added to the diary items that are reported with taxes excluded. In order to reduce the burden on respondents, instructions are provided to respondents indicating when to include or exclude taxes from reported expenses. Thus, the Goods and Services Tax (GST), the Provincial Sales Tax (PST), and the Harmonized Sales Tax (HST) are added to the diary items according to the appropriate federal and provincial taxation rates.
The estimation of population characteristics from a sample survey is based on the premise that each sampled household represents a certain number of other households in the target population in addition to itself. This number is referred to as the survey weight. There are a number of steps involved in the process of computing the weight assigned to each household.
First, each household in the sample is given an initial weight equal to the inverse of its probability of being selected from the target population. Since only 50% of the households in the sample are selected to complete a diary, different weights are computed for the interview questionnaire and for the diary. A few adjustments are later applied to the interview weights and to the diary weights.
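The initial weighting step can be sketched as follows. The function names are hypothetical. Treating the diary selection as a simple second phase with a fixed rate of 0.5 is an assumption based on the 50% figure above; the actual subsampling within clusters may differ.

```python
def initial_weight(selection_probability):
    """Initial weight: the inverse of the household's probability of selection."""
    if not 0 < selection_probability <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_probability


def initial_diary_weight(selection_probability, diary_subsampling_rate=0.5):
    """Initial diary weight, assuming the diary is a second-phase subsample."""
    # The overall chance of receiving a diary is the product of the two probabilities.
    return initial_weight(selection_probability * diary_subsampling_rate)


print(initial_weight(0.25))        # interview weight for a 1-in-4 selection
print(initial_diary_weight(0.25))  # the diary weight is twice as large
```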
The interview weights are first adjusted to take into account the households that did not respond to the questionnaire. They are then adjusted so that selected survey estimates are coherent with aggregates or estimates from auxiliary sources; this process is called weight calibration. Three data sources are used for weight calibration.
Firstly, the weights are adjusted according to the number of persons by age group and the number of households by household size from population estimates produced by Statistics Canada’s Demography Division. These estimates are derived from 2011 Census data. Annual estimates of the number of persons in nine age groups (0–6, 7–17, 18–24, 25–34, 35–44, 45–54, 55–64, 65–74, and 75+) are used at the provincial level, and estimates for two age groups (0 to 17 years and 18 years and over) are used at the census metropolitan area level. For the number of households, the weights are adjusted to total the annual provincial estimates for three household size categories (one, two, and three or more persons). An adjustment is also made to ensure that each quarter is adequately represented in terms of the total number of households.
The second source used for weight calibration is the Statement of Remuneration Paid (T4) from the Canada Revenue Agency (CRA). The T4 data are used to ensure that the survey’s weighted distribution of income (on the basis of wages and salaries) is consistent with the income distribution of the Canadian population. Interview weights are calibrated to total the T4 counts of the number of persons per province in six categories of wages and salaries on the basis of provincial percentiles (0th–25th, 25th–50th, 50th–65th, 65th–75th, 75th–95th and 95th–100th).
Starting with SHS 2012, a third source for adjusting the weights is provided by the personal income tax data (T1) from the CRA. The interview weights are adjusted to reflect the number of persons in each of the three highest personal income classes (based on the 95.5th, 97th, and 98.5th percentiles) for each province, except Prince Edward Island where one income class is used. This adjustment aims to compensate for the under-representation of these groups among the survey respondents.
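The general idea of calibrating weights to known totals can be illustrated with raking (iterative proportional fitting). This guide does not name the calibration algorithm, so raking is used here purely as a stand-in. The sketch is also simplified to household-level categories, whereas the actual calibration includes person-level counts by age group and income class. The function name, the margins and the control totals are hypothetical.

```python
def rake(weights, categories, targets, iterations=50):
    """Adjust weights so that weighted totals match the targets on each margin.

    weights:    list of starting weights, one per household
    categories: dict mapping a margin name to the list of category labels per household
    targets:    dict mapping a margin name to {category label: control total}
    """
    w = list(weights)
    for _ in range(iterations):
        for margin, labels in categories.items():
            # Current weighted total in each category of this margin.
            totals = {}
            for wi, label in zip(w, labels):
                totals[label] = totals.get(label, 0.0) + wi
            # Scale every weight by target / current total for its category.
            w = [wi * targets[margin][label] / totals[label]
                 for wi, label in zip(w, labels)]
    return w


categories = {
    "size": ["1", "2", "3+", "1"],
    "quarter": ["Q1", "Q1", "Q2", "Q2"],
}
targets = {
    "size": {"1": 60.0, "2": 30.0, "3+": 10.0},
    "quarter": {"Q1": 50.0, "Q2": 50.0},
}
calibrated = rake([10.0, 10.0, 10.0, 10.0], categories, targets)
print([round(x, 3) for x in calibrated])
```

After calibration, the weighted household counts agree with the control totals on both margins, which is the property the text calls coherence with auxiliary sources.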
The diary weights are also subject to a series of adjustments. A factor adjusts for the non-response to the questionnaire, while another factor compensates for households that respond to the questionnaire but refuse to complete the diary. The weights are also adjusted to total demographic estimates in a manner similar to that used for the interview weights. The demographic estimates of the number of persons at the provincial level are the same for the diary as for the interview, with the exception of Prince Edward Island. Seven age groups are used for Prince Edward Island due to its smaller sample size (0–17, 18–24, 25–34, 35–44, 45–54, 55–64, and 65+). At the census metropolitan area level, the distinction between the two age groups (0 to 17 years and 18 years and over) is retained only for Montreal, Toronto and Vancouver. Like the interview weights, the diary weights are adjusted to total the annual provincial estimates for the three household size categories; however, no quarterly adjustments are made.
The diary weights are also adjusted according to income. Instead of adjusting on wages and salaries (T4), the weights are adjusted to the estimated number of households by income group and by province calculated from the interview data. Specifically, the estimated number of households for each provincial quintile of total household income is used. The adjustment to the interview estimates ensures that the weighted income distribution of diary-respondent households is consistent with the weighted income distribution of interview-respondent households. The diary weights are also adjusted for the number of high-income individuals according to personal income tax data, using a single income class based on the 95.5th percentile. This personal income diary adjustment is not applied to Prince Edward Island.
All expenditure amounts collected from the interview and diary are converted to annual amounts (annualized) by multiplying them by a factor based on the recall period. Some expenditure data are also corrected by an adjustment factor when influential (extreme) values are identified. For the diary, another adjustment factor is produced to compensate for non-responded days.
The estimates for a given expenditure category collected from the interview therefore correspond to the weighted sums (using interview weights) of the annualized and adjusted amounts. The estimates of an expenditure category derived from diary data are calculated in a similar manner using diary weights and the appropriate annualization and adjustment factors. Lastly, summary expenditure category estimates that include components from both collection methods are produced by taking the sum of the estimates from both the diary and the interview components.
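The interview estimate described above can be sketched as a weighted sum of annualized amounts. The annualization factor of 12 divided by the recall period in months is an assumption consistent with the text; the exact factors, the field names and the sample values are hypothetical.

```python
def annualization_factor(recall_months):
    """Factor that converts an amount reported for a recall period to 12 months."""
    return 12.0 / recall_months


def interview_estimate(records):
    """Weighted sum of annualized, adjusted amounts for one expenditure category."""
    total = 0.0
    for record in records:
        annual_amount = record["amount"] * annualization_factor(record["recall_months"])
        # The adjustment factor defaults to 1 when no influential value was identified.
        total += record["weight"] * annual_amount * record.get("adjustment", 1.0)
    return total


records = [
    {"weight": 100.0, "amount": 50.0, "recall_months": 1},
    {"weight": 200.0, "amount": 300.0, "recall_months": 3},
]
print(interview_estimate(records))
```

A diary estimate would follow the same pattern with diary weights and the diary's own annualization and non-response adjustment factors, and a summary category would add the interview and diary components together.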
The 2014 SHS estimates were computed with weights adjusted to 2014 population estimates. These population estimates were based on 2011 Census data as well as more recent information from administrative sources such as birth, death and migration registers.
In order to make SHS estimates comparable over time, the 2013 SHS estimates have been revised using the population projections based on the 2011 Census. Estimates for 2013 were previously based on 2006 Census population projections. Estimates for 2010, 2011 and 2012 will be reweighted in the near future.
The historical revisions based on 2011 Census data also take into account improvements to the calibration methods used for the interview and diary weights which were introduced with the 2014 SHS. The calibration strategy used for the 2014 SHS estimates is described in Section 2.6. The changes introduced in 2014 are as follows:
- For the calibration of the interview weights at the provincial level, as well as for the calibration of the diary weights in all provinces except Prince Edward Island, there are nine age groups instead of the eight age groups used previously. The age group for persons aged 65 years and over was split into two groups (65-74 and 75+).
- For the calibration of the diary weights for Prince Edward Island, seven age groups are used instead of the eight age groups used previously. One group is used for persons aged 0 to 17 instead of both the 0-6 and 7-17 groups.
The same demographic controls used to calibrate the weights for the 2014 SHS estimates were used for the historical revisions of the 2012 and 2013 SHS estimates. For the historical revisions of the 2010 and 2011 estimates, the nine age groups were used to calibrate both the interview and diary weights for all provinces. In addition to the weight calibration, the other steps of the weighting process used to produce the revised estimates based on the 2011 Census data were also modified to account for the methods used for the 2014 SHS. The entire weighting process is thus standardized for the years 2010 to 2014.
SHS estimates prior to 2010 (2001-2009) are based on weights calibrated to population estimates produced using data from the 2001 Census. There is no plan to revise these estimates (using 2006 or 2011 Census data) due to the break in the data series starting with the 2010 SHS (see Section 2.9).
With continuous monthly collection, the reference period of the collected data differs from one month to the other, as illustrated in Figure 1. For example, for an expenditure item with a three-month reference period, the data from the July sample include expenditures made between April 1st and June 30th, whereas the data from the December sample include expenditures made between September 1st and November 30th.
Description for Figure 1
This figure shows the sample reference periods of three different lengths for each of the twelve monthly collection periods from January to December.
For each monthly collection period, expenditures with a one-month reference period cover the month preceding the month of the collection period, expenditures with a three-month reference period cover the three months preceding the month of the collection period, and expenditures with a twelve-month reference period cover the twelve months preceding the month of the collection period. The following examples are based on the collection periods of January and December.
For the collection period of January of the survey year, expenditures with a one-month reference period cover the month of December of the year prior to the survey year, expenditures with a three-month reference period cover the period from October to December of the year prior to the survey year, and expenditures with a twelve-month reference period cover the period from January to December of the year prior to the survey year.
For the collection period of December of the survey year, expenditures with a one-month reference period cover the month of November of the survey year, expenditures with a three-month reference period cover the period from September to November of the survey year, and expenditures with a twelve-month reference period cover the period from December of the year prior to the survey year to November of the survey year.
Collected expenditures with a reference period of less than 12 months are annualized so that all expenditure amounts cover a period of 12 months. SHS estimates are produced by combining the data from the 12 monthly samples.
When combining the annualized data from the 12 monthly samples to generate annual expenditure estimates, for expenditures with a recall period of three months or less, most of the expenditures were made during the survey year. This is also true for all expenditure data collected with the diary.
For expenditure items with a 12-month recall period, the collected expenses occurred between January of the year before the survey year and November of the survey year, depending on the collection month. For example, expenses collected in January cover the period from January to December of the year before the survey year, while expenses collected in December occurred between December of the year before the survey year and November of the survey year. For the estimates produced to represent a single 12-month period when the data from 12 monthly samples are combined, it must be assumed that expenditures made during the survey year are similar to those made during the previous year. This must be considered when making comparisons between estimates based on a 12-month recall period and those based on shorter periods.
The limits of the collection model in producing expenditure estimates covering the same period (or the same year) are known, since the majority of countries use this methodology. Despite these limitations, continuous collection with reference periods adapted to the respondent’s ability to provide information is considered preferable in order to obtain data that reflect households’ true expenditures.
The SHS has been conducted annually since 1997. This survey includes most of the content of its predecessors, the periodic Family Expenditure Survey and the Household Facilities and Equipment Survey. Prior to 2010, the SHS was primarily based on an interview during the first quarter of the year in which households reported expenditures incurred in the preceding calendar year, although some changes to the methodology and definitions were made between 1997 and 2009.
A new methodology, which combines a questionnaire and a diary to collect household expenditures, was introduced for the 2010 survey. The recall periods have been shortened for many expenditure items and collection is continuous throughout the year. Although the expenditure data collected since 2010 are similar to those of previous years, the changes to data collection, processing and estimation methods have created a break in the data series. As a result, users are advised not to compare SHS data from 2010 onward with data prior to 2010, unless otherwise noted.
Since 2010, the SHS incorporates a significant amount of content that was previously collected through the Food Expenditure Survey (FES), last conducted in 2001. Although there are some differences between the SHS and FES methodologies, food expenditure data in both surveys have been collected using an expenditure diary that households are asked to fill in for a period of two weeks. The content of the SHS diary is slightly less detailed than that of the FES diary (e.g., the weight and quantity of food items are not collected) in order to limit the SHS respondent’s burden.
The content of the SHS was also reviewed in 2010 to reduce the time required for the interview. A number of components regarding household equipment and dwelling characteristics as well as most of the questions regarding changes in household assets and liabilities have been dropped. Some definitions have also changed. As well, starting with the 2010 survey, the data related to household income and income tax come mainly from personal income tax data.
Finally, the estimates for 2013 and 2014 are based on weights calibrated to population estimates produced using data from the 2011 Census. Estimates for 2010 to 2012 will be reweighted in the near future. Estimates for previous years (2001-2009) are based on weights calibrated to population estimates produced using data from the 2001 Census.