Response and nonresponse
Despite the best efforts of survey managers and operations staff to maximize response, most, if not all, surveys must deal with the problem of nonresponse.
Response refers here to all data obtained either directly from respondents or from administrative data. This broad definition of response is necessary to reflect the increased use of different collection strategies in the same survey, a practice that has become more and more commonplace. Moreover, as with survey data, administrative data is not exempt from either partial or total nonresponse. This nonresponse is sometimes the result of lateness in obtaining all the administrative data.
For a unit to be classified as responding, the degree of item response or partial response (where an accurate response is obtained for only some of the data items required from a respondent) must meet a minimum threshold level below which it is considered that there is unit nonresponse. In that case, the sampled person, household, business, institution, farm or other unit is classified as not having responded at all.
Classic nonresponse mechanisms are as follows: uniform nonresponse (response missing completely at random), where the response probability is completely independent of the units and the measurement process and is constant over the entire population; nonresponse depending on an auxiliary variable (response missing at random), where the response mechanism depends on certain auxiliary data or variables available for all units measured; and nonresponse depending on the variable of interest (response not missing at random), where the response probability depends on the variable of interest.
Nonresponse can have two effects on data: first, it introduces a bias in estimates when nonrespondents differ from respondents in the characteristics measured; second, it contributes to an increase in the total variance of estimates since the sample size observed is reduced from that originally sought.
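The link between the response mechanism and nonresponse bias can be illustrated with a small simulation. This is only a sketch under invented assumptions: the population model, the response probabilities and the function name `simulate` are hypothetical and do not come from any particular survey.

```python
import random


def simulate(mechanism, n=20000, seed=1):
    """Return (full-sample mean, respondent mean) of y for one response mechanism."""
    rng = random.Random(seed)
    total_all = 0.0
    total_resp = 0.0
    n_resp = 0
    for _ in range(n):
        x = rng.random()                   # auxiliary variable, known for every unit
        y = 50 + 20 * x + rng.gauss(0, 5)  # variable of interest (hypothetical model)
        if mechanism == "uniform":
            p = 0.6                        # same response probability for all units
        elif mechanism == "auxiliary":
            p = 0.3 + 0.6 * x              # response depends on the auxiliary variable
        elif mechanism == "interest":
            p = min(0.95, max(0.05, (y - 30) / 60))  # response depends on y itself
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        total_all += y
        if rng.random() < p:               # the unit responds with probability p
            total_resp += y
            n_resp += 1
    return total_all / n, total_resp / n_resp


for name in ("uniform", "auxiliary", "interest"):
    full_mean, resp_mean = simulate(name)
    print(f"{name:10s} respondent mean minus full-sample mean: {resp_mean - full_mean:+.2f}")
```

In this sketch, the unadjusted respondent mean stays close to the full-sample mean only under uniform nonresponse. Under the other two mechanisms, respondents differ systematically from nonrespondents, so the unadjusted mean is biased.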
The degree to which efforts are made to get a response from a nonrespondent depends on budget, time and staff constraints, impact on overall quality and the risk of nonresponse bias. If nonresponse persists, there are several approaches to reduce the effect of the nonresponse. Decisions on the degree of research to be undertaken to develop nonresponse adjustment techniques are subject to the constraints mentioned above.
In telephone or personal interviews, or during follow-up, try to collect as much basic information on the respondent as possible so that later adjustments do not have to rest on assumptions.
In dealing with nonresponse, take advantage of available auxiliary information as much as possible.
An effective respondent relations program, a well-designed questionnaire, the use of active management to ensure regular follow-up on collection operations and adaptive data collection (Laflamme, 2008) are essential elements in optimizing response.
Setting an anticipated response rate
One point to consider in determining sample size and managing collection is setting an anticipated response rate. One way of doing this is to use the results of previous survey cycles, a test run or similar surveys.
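As a rough illustration of how an anticipated response rate feeds into sample size, the sketch below inflates the number of completed responses required by the expected rate. The function name and the optional `in_scope_rate` parameter (the share of sampled units expected to be in scope) are assumptions made for this example.

```python
import math


def initial_sample_size(target_respondents, anticipated_response_rate, in_scope_rate=1.0):
    """Number of units to select so that, on average, the target number respond.

    in_scope_rate is a hypothetical extra allowance for sampled units that turn
    out to be out of scope; leave it at 1.0 to ignore this.
    """
    if not 0 < anticipated_response_rate <= 1:
        raise ValueError("anticipated_response_rate must be in (0, 1]")
    if not 0 < in_scope_rate <= 1:
        raise ValueError("in_scope_rate must be in (0, 1]")
    # Round up so the expected number of respondents is never below the target.
    return math.ceil(target_respondents / (anticipated_response_rate * in_scope_rate))


# 1,000 completed responses wanted, 65% anticipated response rate
print(initial_sample_size(1000, 0.65))
```

An anticipated rate taken from previous cycles or a test run is only an estimate, so the resulting sample size may need to be revisited as collection progresses.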
Ensure an acceptable level of quality in all survey planning and implementation steps to obtain a good response rate. To do this, keep the following factors in mind:
When designing the survey: previous experience with similar surveys, and the total budget and allocation of the budget to various operations;
The survey frame's quality (in terms of population coverage and the facility of establishing contact with respondents), the population observed and the sampling method;
The data collection method (for example, by mail, personal interview or computer-assisted telephone interview, by electronic data reporting (EDR), the Internet or a combination of methods), the time of year and the length of the collection period;
The communication strategy to be used to inform respondents of the importance of the survey and to maintain a relationship with respondents;
The use and effectiveness of respondent incentives;
The response burden imposed (length of interview, difficulty of subject matter, timing and interview periodicity); the subject's nature and sensitivity, questionnaire length and complexity; questionnaire language and respondents' cultural backgrounds;
Collection staff's prior experience and skills in interpersonal relationships; their workload; factors related to the interviewers themselves, such as training; and potential staff turnover;
The effectiveness and scope of follow-up methodology and expected difficulties in tracing respondents who have moved.
Institute adaptive collection so that the collection strategy can evolve over time. This requires active management, with regular follow-up on collection operations, and adaptive data collection in the four collection phases: before initial contact, after some contact attempts, in the middle of the collection period and towards the end of collection.
Putting nonresponse follow-up procedures in place during collection
Follow up on nonrespondents (all or a subsample of them). Following up on nonrespondents increases response rates and can help ascertain whether respondents and nonrespondents are similar in the characteristics measured. The survey strategy should take nonresponse into account immediately by adopting a two-phase selection perspective.
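One way to picture the two-phase perspective is the classical estimator for a followed-up subsample of nonrespondents, often attributed to Hansen and Hurwitz. The sketch below assumes simple random sampling at both phases and assumes that every unit in the follow-up subsample responds; the function name and data are hypothetical.

```python
def two_phase_mean(respondent_values, n_nonrespondents, followup_values):
    """Estimate the sample mean when only a subsample of nonrespondents is followed up.

    respondent_values: values from first-phase respondents
    n_nonrespondents:  number of first-phase nonrespondents
    followup_values:   values from the followed-up subsample of nonrespondents
                       (assumed here to respond in full)
    """
    n1 = len(respondent_values)
    n2 = n_nonrespondents
    if n1 + n2 == 0:
        raise ValueError("empty sample")
    if n2 > 0 and not followup_values:
        raise ValueError("a follow-up subsample is needed to represent nonrespondents")
    # The follow-up mean stands in for all n2 first-phase nonrespondents.
    mean2 = sum(followup_values) / len(followup_values) if n2 > 0 else 0.0
    return (sum(respondent_values) + n2 * mean2) / (n1 + n2)


respondents = [10.0] * 6  # six first-phase respondents
followup = [20.0, 20.0]   # two of the four nonrespondents, reached at follow-up
print(two_phase_mean(respondents, 4, followup))
```

In this toy example, the respondents alone average 10, whereas the two-phase estimate also reflects the nonrespondent stratum represented by the follow-up subsample.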
Prioritize follow-up activities. For example, in business surveys, follow up on large or influential units first, possibly at the risk of missing the smallest units. Likewise, give high priority to nonresponding units in known domains with a high potential for nonresponse bias. A score function can be used to prioritize the follow-up.
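A score function for prioritizing follow-up could take many forms. The sketch below uses one hypothetical choice, in which the score is the product of the design weight, a size measure from the frame and a risk factor for the unit's domain. The field names and the scoring rule are assumptions made for this example, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    unit_id: str
    design_weight: float
    size_measure: float       # hypothetical, e.g. revenue recorded on the frame
    domain_risk: float = 1.0  # hypothetical multiplier for high-risk domains


def followup_score(unit):
    """Approximate weighted contribution of the unit, scaled by domain risk."""
    return unit.design_weight * unit.size_measure * unit.domain_risk


def prioritize(nonrespondents):
    """Return nonresponding units ordered from highest to lowest score."""
    return sorted(nonrespondents, key=followup_score, reverse=True)


units = [
    Unit("small", design_weight=50, size_measure=2),
    Unit("large", design_weight=1, size_measure=1000),
    Unit("risky", design_weight=10, size_measure=20, domain_risk=2.0),
]
for u in prioritize(units):
    print(u.unit_id, followup_score(u))
```

With such a rule, large or influential units and units in high-risk domains come first, and the smallest units are followed up last, if resources allow.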
Follow-up is particularly important in longitudinal surveys, in which the sample is subject to increasing attrition (and possibly bias) due to nonresponse on each survey occasion. In this case, facilitate high-quality tracing: obtain additional contact information for the sampled units in each survey cycle, provide a "Change of address" card and ask the sampled unit to advise the Bureau if it moves between survey cycles. This will help keep contact information up to date. In addition, administrative data, city and telephone directories and many other sources, including local knowledge, are valuable to the tracing staff.
Assessing potential nonresponse bias
There are various approaches for determining whether there are differences between respondents and nonrespondents and for evaluating potential nonresponse bias: specific follow-up of units, follow-up of nonrespondents and analysis of known characteristics of respondents and nonrespondents. Information on nonrespondents might come from previous waves (in the case of longitudinal surveys or surveys with rotation groups) or from external data sources (e.g. administrative data or paradata files).
Determining the response mechanism
Analyzing respondents' and nonrespondents' characteristics also helps establish a nonresponse model to reduce nonresponse bias as much as possible and determine the best way to compensate for nonresponse.
For longitudinal surveys, the structure of nonresponse over time must be considered (Hedeker and Gibbons, 2006).
Deciding how to handle nonresponse
The main approaches to dealing with missing data are imputation and reweighting.
The approach should be chosen based on the kind of nonresponse (total or partial), the availability of auxiliary variables and the quality of the response model. In general, reweighting is used to deal with total nonresponse. Imputation is mainly used for partial nonresponse, although it may be used to deal with total nonresponse if auxiliary data are available (repeated surveys, administrative data, etc.).
Reweighting is used to eliminate, or at least reduce, total nonresponse bias. Reweighting can be viewed from two angles: nonresponse model or calibration (Särndal, 2007).
For the nonresponse model approach, a model is developed to estimate unknown response probabilities. The survey weights are then adjusted inversely to the estimated response probabilities (Oh and Scheuren, 1983; Lynn, 1996). To provide some protection against model misspecification, it is suggested that homogeneous response groups be formed, i.e. that units with the same characteristics and the same propensity to respond be grouped together (Haziza and Beaumont, 2007). Several methods may be used to do this: decision tree algorithms such as CHAID in Knowledge SEEKER (Kass, 1980; Angoss Software, 1995), logistic regression models, the score method and the use of auxiliary data such as paradata (Beaumont, 2005; Eltinge and Yansaneh, 1997).
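The weight adjustment within homogeneous response groups can be sketched as follows. This is a minimal illustration that assumes the groups have already been formed; the dictionary keys (`weight`, `group`, `responded`) and the function name are hypothetical. Within each group, the estimated response probability is the weighted response rate, and respondent weights are multiplied by its inverse.

```python
from collections import defaultdict


def adjust_weights(units):
    """Return nonresponse-adjusted weights, in the same order as `units`.

    Each unit is a dict with keys 'weight', 'group' and 'responded'.
    Nonrespondents receive a weight of zero.
    """
    total = defaultdict(float)      # sum of weights of all sampled units, by group
    responding = defaultdict(float)  # sum of weights of respondents, by group
    for u in units:
        total[u["group"]] += u["weight"]
        if u["responded"]:
            responding[u["group"]] += u["weight"]

    for g in total:
        if responding[g] == 0:
            # No respondent can carry this group's weight; groups must be collapsed.
            raise ValueError(f"group {g!r} has no respondents")

    adjusted = []
    for u in units:
        if u["responded"]:
            g = u["group"]
            adjusted.append(u["weight"] * total[g] / responding[g])
        else:
            adjusted.append(0.0)
    return adjusted


sample = (
    [{"weight": 10.0, "group": "A", "responded": r} for r in (True, True, False, False)]
    + [{"weight": 5.0, "group": "B", "responded": r} for r in (True, False, False, False)]
)
print(adjust_weights(sample))
```

The adjusted weights of respondents add up to the original total weight in each group, so the respondents in a group represent that group's nonrespondents as well.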
Systems developed at Statistics Canada are used to evaluate and measure the impacts of nonresponse and imputation. GENESIS (GENEralized SImulation System) quantifies the relative performance of imputation methods using simulation studies, and SEVANI (System for the Estimation of Variance due to Nonresponse and Imputation) calculates variance due to nonresponse (Beaumont and Bissonnette, 2007). If nonresponse variance is high compared to sampling variance for a given region, it might be useful to reduce the desired sample size and devote more resources to nonresponse prevention in order to stay within the budget.
Evaluating and disseminating nonresponse rates
Follow the Standards and Guidelines for Reporting Nonresponse Rates (Statistics Canada, 2001d) to make it easier to compare surveys. These standards describe nonresponse rate reporting requirements as set out in the Policy on Informing Users of Data Quality and Methodology (Statistics Canada, 2000d) for censuses or sample surveys based strictly on collecting data directly from respondents. Subjects discussed include weighted and non-weighted nonresponse rates, response rates in data collection and estimation, nonresponse rates for secondary or longitudinal surveys, nonresponse bias, survey operations monitoring, data collection method evaluation, survey frame coverage measures, longitudinal database creation, nonresponse case reporting and integrated metadata base reporting requirements.
If need be, refer to particular implementations of the standards based on the specific characteristics of surveys. For example, recent articles have discussed surveys using administrative data for some units and survey data for others (Trépanier et al., 2005), surveys where a mixed collection method is used for the same unit (Leon, 2007) and random digit dialling surveys (Marchand et al., 2008).
Identifying and analyzing reasons for nonresponse
Note reasons for nonresponse at collection time (e.g. refusal, non-contact, temporary absence, technical problem) since nonresponse bias levels may differ depending on the reason.
Main quality element: accuracy
Evaluating response and nonresponse rates
Write a note on the response rate. The response rate may be calculated in different ways, with interpretations for different purposes. Refer to Standards and Guidelines for Reporting Nonresponse Rates (Statistics Canada, 2001d). Report weighted response rates to show their contribution to estimates and use non-weighted response rates to reflect participation rates in the survey population.
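The difference between weighted and non-weighted response rates can be shown with a short sketch. It assumes that every sampled unit is in scope and that one weight per unit is supplied. The Standards and Guidelines define the rates in more detail, so this is an illustration rather than the official calculation, and the function name is hypothetical.

```python
def response_rates(weights, responded):
    """Return (non-weighted rate, weighted rate) for in-scope sampled units."""
    if len(weights) != len(responded) or not weights:
        raise ValueError("weights and responded must be non-empty and of equal length")
    # Non-weighted rate: share of sampled units that responded.
    unweighted = sum(1 for r in responded if r) / len(responded)
    # Weighted rate: share of the total weight carried by respondents.
    weighted = sum(w for w, r in zip(weights, responded) if r) / sum(weights)
    return unweighted, weighted


# Three small responding units and one large nonresponding unit
weights = [1.0, 1.0, 1.0, 7.0]
responded = [True, True, True, False]
print(response_rates(weights, responded))
```

In this example, three of the four units respond, but the nonresponding unit carries most of the weight, so the weighted rate is much lower than the non-weighted rate.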
Report nonresponse rates broken down by type of nonresponse. This information can be used later when designing other surveys and is useful to data users who must interpret the data. The percentages of sampled units that refused to respond, were identified as out of scope, could not be contacted during the collection period or responded only partially might also be of interest.
Specify whether the survey estimates are adjusted to compensate for nonresponse. If estimates are adjusted, a description of the adjustment procedure should be appended.
Evaluating nonresponse variance
Report nonresponse variance. To do so, use SEVANI when estimating the total or mean of a domain, or use resampling methods.
Study the nonresponse bias based on the collection method and type of nonresponse.
In the case of periodic surveys, conduct periodic nonresponse bias studies. The results of these studies should be included in the information reported to users in accordance with the policy.
If applicable, try to determine how successful the procedures are in correcting potential bias.
Angoss Software. 1995. Knowledge SEEKER – User's Guide. ANGOSS Software International Limited.
Beaumont, J.-F. and J. Bissonnette. 2007. "Variance Estimation Under Composite Imputation Using an Imputation Model." Article presented at the Workshop on Calibration and Estimation in Surveys, Ottawa, 2007.
Beaumont, J.-F. 2005. "On the Use of Data Collection Process Information for the Treatment of Unit Nonresponse Through Weight Adjustment." Survey Methodology. Vol. 31, no. 2. p. 227-231.
Eltinge, J.L. and I.S. Yansaneh. 1997. "Diagnostics for formation of nonresponse adjustment cells, with an application to income nonresponse in the U.S. Consumer Expenditure Survey." Survey Methodology. Vol. 23, no. 1. p. 33-40.
Fuller, W.A. 1993. Measurement Error Models. New York: Wiley-Interscience. 440 p.
Groves, R.M., D.A. Dillman, J.L. Eltinge and R.J.A. Little. 2001. Survey Nonresponse. New York. Wiley-Interscience. 520 p.
Haziza, D. and J.-F. Beaumont. 2007. "On the construction of imputation classes in surveys." International Statistical Review. Vol. 75, no. 1. p. 25-43.
Hedeker, D. and R.D. Gibbons. 2006. Longitudinal Data Analysis. New York. Wiley-Interscience. 360 p.
Kass, G.V. 1980. "An exploratory technique for investigating large quantities of categorical data." Applied Statistics. Vol. 29, no. 2. p. 119-127.
Laflamme, F. 2008. Using Paradata to Actively Manage Data Collection Survey Process. Proceedings from the American Statistical Association 2008 Joint Statistical Meetings. Denver, Colorado.
Leon, C. A. 2007. Reporting Response Rates in Characteristic Surveys. Proceedings from the Statistical Society of Canada 2007 Conference. St. John's, Newfoundland.
Lynn, P. 1996. Weighting for Nonresponse. Proceedings from the Association for Survey Computing 1996 Survey and Statistical Computing Conference. London, UK.
Marchand, I., R. Chepita, P. St-Cyr, D. Williams. 2008. "Coverage and Non-Response in a Random Digit Dialling Survey: The Experience of the General Social Survey's Cycle 21 (2007)." Article presented at Symposium, Statistics Canada, 2008.
Oh, H.L. and F.J. Scheuren. 1983. "Weighting adjustment for unit nonresponse." In W.G. Madow, I. Olkin and D.B. Rubin (eds.). Incomplete Data in Sample Surveys, Vol. 2: Theory and Bibliographies. New York. Academic Press. p. 143-184.
Särndal, C.-E. and S. Lundström. 2005. Estimation in Surveys with Nonresponse. New York. Wiley. 212 p.
Statistics Canada. 2000d. "Policy on Informing Users of Data Quality and Methodology." Statistics Canada Policy Manual. Section 2.3. Last updated March 4, 2009.
Statistics Canada. 2001d. Standards and Guidelines for Reporting Nonresponse Rates. Statistics Canada Technical Report.
Trépanier, J., C. Julien and J. Kovar. 2005. "Reporting Response Rates when Survey and Administrative Data are Combined." Federal Committee on Statistical Methodology Research Conference.