Methodology and data quality

Introduction
Reference period
Target population
Variables measured
Instrument design
Sampling
Data collection
Error detection
Estimation
Quality evaluation
Disclosure control
Data accuracy
Response rates and sampling error
Comparability of data and related sources

Introduction

This section provides an overview of the underlying methodology of the survey and of key aspects of the data quality. It will also provide an understanding of the strengths and limitations of the data. The information may be of particular relevance when making comparisons with data from other surveys or sources of information and when drawing conclusions from time series.

Reference period

Respondents to the Households and the Environment Survey (HES) were asked to report on behaviours and activities undertaken by the household or, in the case of the transportation module, by a selected individual within the household, for the following reference periods:

Text table A: Examples of questions/modules using reference period
Each reference period is listed with examples of the questions/modules that use it.
At the time of the interview
  • Water source
  • Water treatment
  • Type of heating equipment
  • Access to and use of recycling programs
During the previous summer
  • Lawn and garden watering
Warmer months and colder months
  • Mode of transport to work
  • Time and distance to work
Heating season and cooling season
  • Amount of wood burned
  • Indoor temperature
2005
  • Fuel consumption by motor boat or snowmobile
  • Fertilizer or pesticide application
  • Leftover paint

Target population

The target population consisted of households in Canada, excluding households in which no member is 18 years old or more. Also excluded were households located in the Yukon, Northwest Territories and Nunavut, households located on Indian reserves and on military bases, and households consisting entirely of full-time members of the Canadian Armed Forces. For a subset of questions, the survey targeted adults 18 years of age or older living in households that were included in the survey's main target population. The survey, therefore, aimed to provide two different units of analysis: the household for most questions, and the person for a limited number of questions relating to the modes of transportation that were used to travel to work.

Variables measured

Broadly, the 2006 HES measured variables that explored the following themes:

  • Water quality concerns of households
  • Consumption and conservation of water
  • Energy use and home heating and cooling
  • Use of gasoline-powered equipment
  • Pesticide and fertilizer use on lawns and gardens
  • Recycling, composting and waste disposal practices
  • Impacts of air and water quality on households
  • Transportation decisions

Instrument design

The questionnaire was designed by Statistics Canada in consultation with stakeholders involved in the Canadian Environment Sustainability Indicators project and in consideration of the data needs of both the project and the larger research and policy communities.

Testing of the questionnaire was done by Statistics Canada's Questionnaire Design Research Centre (QDRC). Focus group sessions were conducted along with a number of one-on-one interviews. These were conducted in both English and French by the QDRC in five cities across the country in July and August 2005.

The questionnaire was designed to follow standard practices and wording, when applicable, in a computer-assisted interviewing environment. This included the automatic control of question wording and flows that depended upon answers to earlier questions and the use of online edits to check for logical inconsistencies and gross capture errors. The computer application for data collection was subjected to extensive testing before its use in the survey.

Sampling

This is a sample survey with a cross-sectional design.

The HES sample began with households that were included in the Labour Force Survey (LFS) conducted in February 2006. The sample was selected in order to allow for reliable estimates; i.e., with a coefficient of variation (CV) of 16.5% or better for proportions as small as 10% in 28 census metropolitan areas (CMAs) and in the non-CMA portion of each province. The initial sample size consisted of 36,431 households and assumed a response rate of 75%.
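The precision target above can be turned into a rough respondent count per domain. The sketch below is only an approximation: it assumes simple random sampling within each domain, which ignores the design effect of the LFS multi-stage design, and the function names are illustrative rather than taken from the survey documentation.

```python
import math

def srs_sample_size_for_cv(p: float, target_cv: float) -> int:
    """Smallest number of respondents n such that CV(p_hat) <= target_cv.

    Assumes simple random sampling (no design effect, no finite
    population correction):
        CV(p_hat) = sqrt(p * (1 - p) / n) / p
    which rearranges to n = (1 - p) / (p * target_cv ** 2).
    """
    return math.ceil((1.0 - p) / (p * target_cv ** 2))

def inflate_for_nonresponse(n_respondents: int, response_rate: float) -> int:
    """Number of households to select so that n_respondents are expected."""
    return math.ceil(n_respondents / response_rate)

# Proportion of 10%, CV of 16.5%, assumed response rate of 75%.
n_domain = srs_sample_size_for_cv(p=0.10, target_cv=0.165)
n_selected = inflate_for_nonresponse(n_domain, response_rate=0.75)
print(n_domain, n_selected)  # 331 442
```

Under these simplifying assumptions, roughly 331 responding households per domain would be needed. The actual allocation would be larger wherever the design effect exceeds one.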

Data collection

Data collection took place in conjunction with, and as a supplement to, the LFS from February 15, 2006 to April 15, 2006. Participation in the survey was voluntary.

Data were collected directly from survey respondents by telephone interview as part of the LFS collection process. Once the LFS was completed for all eligible members in a household, the interviewer asked to speak to the person who was most knowledgeable about household practices relating to the environment in order to complete the HES. Depending on this person's availability and operational constraints, the HES interview was completed immediately or arrangements were made to call back in order to complete the interview. An automated call scheduler managed follow-up calls in order to try to make contact with the respondent at different times of day throughout the collection period.

Interviews for the HES were conducted from Statistics Canada's regional offices using a computer-assisted telephone interviewing (CATI) application. Partway through the interview, the computer survey application randomly selected one eligible household member 18 years of age or older. This person was the subject, through proxy response if this person was not the HES respondent, of a subset of questions relating to modes of transportation used to travel to work. The initial sample size consisted of 36,431 households. A 77.8% response rate yielded a final sample of 28,334 responding households to the HES.

Error detection

The HES questionnaire incorporated many features to maximize the quality of the data collected. There were multiple edits in the computer-assisted interview questionnaire to check the entered data for unusual values. Other edits checked for logical inconsistencies in these sections of the questionnaire as well as in other sections with multiple choice responses. When an edit failed, the interviewer was prompted to correct the information with the help of the respondent. For most of the income and expenditure edit failures, the interviewer had the ability to override the edit failure if it could not be resolved. As well, the interviewer had the ability to enter a response of "Don't Know" or "Refused" if the respondent did not answer the question.

Once the data were received at Statistics Canada's head office, an extensive series of processing steps was undertaken to examine each record received. A top-down flow edit was used to clean up any question paths that may have been mistakenly followed during the interview. The editing and imputation phases of processing identified logically inconsistent or missing information items, and corrected such errors.
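A top-down flow edit can be pictured as walking the questionnaire in order and blanking any answer to a question that, given the earlier answers, should not have been asked. The following is a minimal sketch of that idea only. The question names, the skip conditions and the "valid skip" code are hypothetical and are not taken from the HES questionnaire or processing system.

```python
VALID_SKIP = "valid skip"  # hypothetical code for "question not applicable"

# Hypothetical questionnaire path, in interview order:
# (question name, condition under which the question is asked)
FLOW = [
    ("has_lawn", lambda r: True),
    ("watered_lawn", lambda r: r.get("has_lawn") == "yes"),
    ("used_sprinkler", lambda r: r.get("watered_lawn") == "yes"),
]

def top_down_flow_edit(record: dict) -> dict:
    """Return a copy of the record with off-path answers set to VALID_SKIP.

    Conditions are evaluated against the record as cleaned so far, so a
    correction made to an early question cascades to later questions.
    """
    cleaned = dict(record)
    for question, is_on_path in FLOW:
        if not is_on_path(cleaned):
            cleaned[question] = VALID_SKIP
    return cleaned

# A record where follow-up questions were answered despite "no lawn".
raw = {"has_lawn": "no", "watered_lawn": "yes", "used_sprinkler": "yes"}
print(top_down_flow_edit(raw))
```

Because the walk is top-down, the sprinkler question is also cleared in this example: once the watering answer has been reset, its own entry condition no longer holds.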

Estimation

Estimates representing all in-scope households were produced by assigning a weight to each sampled household. The weight of a sampled household indicated the number of households in the population that the unit represented. The initial weight was provided by the LFS and incorporated the probability of selecting the household in the LFS sample, as well as other adjustments such as the treatment of non-response to the LFS.

In addition, person-level estimates were produced using a second weight, which was attached to each individual 18 years of age or older who had been randomly selected from a sampled household as the subject of a subset of questions relating to the modes of transportation used to travel to work. The weight of a sampled individual indicated the number of people in the population that this person represented.

To produce both weights, a first adjustment was made to the initial weight to reflect the fact that only a subsample of the LFS was used: depending on the size of the LFS sample in a given domain of interest, between two and six LFS panels were surveyed for the HES. A second adjustment accounted for the LFS computer-assisted personal interview cases that were not interviewed for the HES. A third adjustment inflated the resulting interim household weight to represent households that participated in the LFS but did not respond to the HES. The response propensity of every unit selected for the HES was modeled using logistic regression, and the predicted probabilities were used to group records into clusters. The inverse of the observed response rate in each cluster served as the third adjustment factor.
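The non-response adjustment described above can be sketched as follows. The sketch takes the response propensities as given (the logistic regression itself is not shown), forms equal-sized clusters from the sorted propensities, and uses a weighted response rate within each cluster. The clustering rule, the use of a weighted rather than unweighted rate, and all data values are assumptions made for illustration.

```python
from collections import defaultdict

def nonresponse_adjust(units, n_classes):
    """Inflate respondent weights by the inverse response rate of their class.

    units: list of dicts with keys 'weight', 'propensity', 'responded'.
    Returns {unit index: adjusted weight} for responding units only.
    """
    # Rank units by propensity and cut the ranking into n_classes groups.
    order = sorted(range(len(units)), key=lambda i: units[i]["propensity"])
    class_of = {i: rank * n_classes // len(units) for rank, i in enumerate(order)}

    total_w = defaultdict(float)  # weight of all selected units, per class
    resp_w = defaultdict(float)   # weight of responding units, per class
    for i, u in enumerate(units):
        total_w[class_of[i]] += u["weight"]
        if u["responded"]:
            resp_w[class_of[i]] += u["weight"]

    adjusted = {}
    for i, u in enumerate(units):
        if u["responded"]:
            c = class_of[i]
            # total_w / resp_w is the inverse of the weighted response rate.
            adjusted[i] = u["weight"] * total_w[c] / resp_w[c]
    return adjusted

# Made-up units; propensities stand in for logistic regression predictions.
units = [
    {"weight": 100.0, "propensity": 0.55, "responded": True},
    {"weight": 100.0, "propensity": 0.60, "responded": False},
    {"weight": 120.0, "propensity": 0.65, "responded": True},
    {"weight": 80.0, "propensity": 0.85, "responded": True},
    {"weight": 90.0, "propensity": 0.90, "responded": True},
    {"weight": 110.0, "propensity": 0.95, "responded": False},
]
adjusted = nonresponse_adjust(units, n_classes=2)
print(sum(adjusted.values()))  # close to 600, the total of the input weights
```

With a weighted response rate, the adjusted respondent weights in each class add up to the weight of all selected units in that class, provided the class contains at least one respondent.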

To produce the final person-weight, a fourth adjustment was made to account for the selection of a single household member for the transportation module. Then, the fifth adjustment used generalized regression estimation to calibrate the interim HES person-weights, matching the age–sex distributions for each province and the population counts for several CMAs. These population projections were taken from the same totals used in the LFS. The final HES person-weight is the outcome of these five adjustments to the initial LFS subweight.
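Calibration by generalized regression can be illustrated with the linear calibration formula, in which each interim weight d_i is multiplied by a factor g_i = 1 + x_i'λ chosen so that the weighted totals of the auxiliary variables match known population totals. The sketch below uses NumPy and made-up numbers. The two-group auxiliary matrix and the totals are assumptions for illustration, not the control totals used for the HES.

```python
import numpy as np

def linear_calibration(d, X, totals):
    """Linear (GREG-type) calibration of weights.

    d      : (n,) interim weights
    X      : (n, k) auxiliary variables, e.g. age-sex group indicators
    totals : (k,) known population totals for the columns of X
    Returns weights w = d * (1 + X @ lam) satisfying X.T @ w == totals.
    """
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    totals = np.asarray(totals, dtype=float)
    # Solve (X' D X) lam = totals - X' d for the calibration coefficients.
    A = X.T @ (d[:, None] * X)
    lam = np.linalg.solve(A, totals - X.T @ d)
    return d * (1.0 + X @ lam)

# Four sampled persons in two hypothetical age-sex groups.
d_demo = [100.0, 150.0, 120.0, 130.0]
X_demo = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
totals_demo = np.array([300.0, 200.0])

w = linear_calibration(d_demo, X_demo, totals_demo)
print(w)             # calibrated weights
print(X_demo.T @ w)  # reproduces totals_demo up to rounding
```

When the auxiliary variables are indicators of disjoint groups, as here, linear calibration reduces to post-stratification. In general it can produce very small or negative weights, which production systems typically guard against with bounds. No such bounds are applied in this sketch.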

To produce the final household-weight, the final person-weight was modified by undoing the fourth adjustment above (to return to a household level for estimation) before a fifth and final adjustment was performed by calibrating to independent estimates of the distribution of households in each region according to size (i.e., one, two, or three or more occupants).
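The relationship between the person-level and household-level weights can be sketched as below. It assumes that the selected person was drawn with equal probability from the eligible adults in the household, so that the fourth adjustment is a multiplication by the number of eligible adults. The text does not state the selection probabilities explicitly, and the function names are illustrative.

```python
def person_weight(household_weight: float, n_eligible_adults: int) -> float:
    """Fourth adjustment (assumed form): one adult selected with equal
    probability, so the selection probability is 1 / n_eligible_adults."""
    return household_weight * n_eligible_adults

def undo_person_selection(person_w: float, n_eligible_adults: int) -> float:
    """Return a person-level weight to the household level."""
    return person_w / n_eligible_adults

def household_size_class(n_occupants: int) -> str:
    """Household-size categories used for the final calibration."""
    if n_occupants == 1:
        return "one"
    if n_occupants == 2:
        return "two"
    return "three or more"

pw = person_weight(250.0, n_eligible_adults=3)
print(pw)                            # 750.0
print(undo_person_selection(pw, 3))  # 250.0
print(household_size_class(5))       # three or more
```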

The quality of the estimates was assessed using estimates of their CV. Given the complexity of the HES design, CVs cannot be calculated using a simple formula. Bootstrap replicate weights were used to establish the CVs of the estimates.
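As a rough illustration of how replicate weights yield a CV, the sketch below recomputes a weighted proportion with each set of bootstrap weights and takes the average squared deviation from the full-sample estimate as the variance. The data and replicate weights are made up, and measuring deviations from the full-sample estimate (rather than from the mean of the replicates) is an assumption.

```python
import math

def weighted_proportion(y, w):
    """Weighted share of units with y == 1."""
    return sum(wi * yi for yi, wi in zip(y, w)) / sum(w)

def bootstrap_cv(y, full_weights, replicate_weights):
    """CV of a weighted proportion from bootstrap replicate weights."""
    theta = weighted_proportion(y, full_weights)
    reps = [weighted_proportion(y, rw) for rw in replicate_weights]
    variance = sum((r - theta) ** 2 for r in reps) / len(reps)
    return math.sqrt(variance) / theta

# Toy example: four households, four replicates (real files carry many more).
y = [1, 0, 1, 0]
full_weights = [10.0, 10.0, 10.0, 10.0]
replicate_weights = [
    [20.0, 0.0, 10.0, 10.0],
    [0.0, 20.0, 10.0, 10.0],
    [10.0, 10.0, 20.0, 0.0],
    [10.0, 10.0, 0.0, 20.0],
]
cv = bootstrap_cv(y, full_weights, replicate_weights)
print(f"{cv:.1%}")  # 50.0%
```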

Quality evaluation

A comparison of social and demographic domains from HES was made with previous surveys to ensure consistency. Subject-matter experts made selective data confrontations with other data sources.

Disclosure control

Statistics Canada is prohibited by law from releasing any data that would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Data accuracy

The coverage error of the LFS, of which the HES is a subsample, is estimated at less than 2%. The effect of excluding households in which no member is 18 years old or over is considered negligible.

Response rates and sampling error

The response rate for this survey was 77.8%. Follow-ups in some locations were terminated once the targeted response rate of 75% was reached. Provincial response rates ranged from 73.1% to 83.3%.
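The overall response rate can be cross-checked against the sample counts reported under Data collection. This is simple arithmetic on the published figures. It assumes that the rate is the number of responding households divided by the initial sample size, with no removal of out-of-scope units from the denominator.

```python
initial_sample = 36_431  # households initially selected
responding = 28_334      # responding households

rate = responding / initial_sample
print(f"{rate:.1%}")  # 77.8%

# Respondents that the assumed 75% response rate would have produced.
expected_at_target = 0.75 * initial_sample
print(round(expected_at_target))  # 27323
```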

The results estimated from the HES are based on a sample of households in Canada. The results obtained from asking the same questions of all Canadian households would differ to some known extent. The extent of this sampling error is quantified by the coefficient of variation (CV), with the following guidelines:

  • 16.5% and below: acceptable estimate;
  • 16.6% to 33.3%: marginal estimate requiring cautionary note to users; and
  • above 33.3%: unacceptable estimate.
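The guidelines above amount to a simple classification rule, sketched below. The function name is illustrative. The treatment of values that fall exactly on a boundary, or between 16.5% and 16.6%, is an assumption, since the published ranges do not settle those cases.

```python
def classify_cv(cv_percent: float) -> str:
    """Map a CV (in percent) to a quality category.

    Assumed boundary handling: values up to and including 16.5 are
    acceptable, values up to and including 33.3 are marginal, and
    anything larger is unacceptable.
    """
    if cv_percent <= 16.5:
        return "acceptable"
    if cv_percent <= 33.3:
        return "marginal"
    return "unacceptable"

# Hypothetical CV values, in percent.
for cv in (1.4, 20.0, 40.0):
    print(cv, classify_cv(cv))
```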

Estimates that do not meet an acceptable level of quality are either flagged for caution or suppressed. CV tables are prepared by Statistics Canada and made available to help users understand the quality of individual estimates. For example, CVs for the estimated proportion of households who used pesticides on their lawn or garden in 2005 for Canada and the provinces are as follows:

Canada 1.4%
Newfoundland and Labrador 6.5%
Prince Edward Island 10.2%
Nova Scotia 6.1%
New Brunswick 10.8%
Quebec 5.8%
Ontario 2.1%
Manitoba 3.8%
Saskatchewan 3.5%
Alberta 3.1%
British Columbia 3.9%

Comparability of data and related sources

Data obtained from the 2006 survey are comparable with data from the 1994 survey for the following variables:

  • Access to and use of recycling programs
  • Household composting
  • Pesticide use
  • Presence of a thermostat and a programmable thermostat
  • Presence of energy-saving light bulbs
  • Presence of low-flow shower heads
  • Presence of a low-flow toilet or a toilet tank with the water volume modified
  • Presence of water purifiers or filters
  • Presence of a yard