Data quality, concepts and methodology: Methodology and data quality

Survey frame and sample selection

Every five years, the Census of Agriculture collects information on agricultural operations across Canada, including institutional farms, community pastures, farms on First Nations reserves, etc. The Census of Agriculture provides a list of farms and their crop areas from which a probability sample for the November Farm Survey is selected.

The target population for the November Farm Survey includes all farms in Canada enumerated in the Census of Agriculture, except institutional farms on First Nations reserves and farms from the Northwest Territories, Yukon and Nunavut.

Probability surveys can use two types of sampling frames: list and area. In the November Farm Survey, only the list frame is used in sample selection. This list frame is stratified into homogeneous groups on the basis of Census characteristics (such as farm size and crop area) and sub-provincial geographic boundaries. A sample of approximately 28,600 farms was drawn from the list frame for the November 2011 Farm Survey.
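The stratified selection described above can be illustrated with a minimal Python sketch. It is not the survey's actual procedure: the frame records, the stratum definition (region by size class), the per-stratum sample sizes and the function name `stratified_sample` are all hypothetical, and simple random sampling without replacement within each stratum is an assumption.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, n_per_stratum, seed=0):
    """Select units by simple random sampling within each stratum.

    Returns (unit, design_weight) pairs, where the design weight is the
    stratum population count divided by the stratum sample size.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for key in sorted(strata):
        units = strata[key]
        n = min(n_per_stratum.get(key, 0), len(units))
        for unit in rng.sample(units, n):
            sample.append((unit, len(units) / n))
    return sample

# Hypothetical list frame of 100 farms (not real Census data).
frame = [
    {"id": i,
     "region": "A" if i < 60 else "B",
     "size": "large" if i % 10 == 0 else "small"}
    for i in range(100)
]
# Hypothetical allocation: large farms are all selected, small farms 1 in 6.
allocation = {("A", "large"): 6, ("A", "small"): 9,
              ("B", "large"): 4, ("B", "small"): 6}

sample = stratified_sample(frame, lambda f: (f["region"], f["size"]), allocation)
print(len(sample), sum(weight for _, weight in sample))
```

When every stratum is sampled, the design weights add up to the number of farms on the frame, which is a quick consistency check on the selection.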

Data collection

The November 2011 Farm Survey was carried out from October 24 to November 10. Data were collected using the computer-assisted telephone interview (CATI) system.

Edit and imputation

With the CATI system, edit procedures can be applied at the time of the interview. Computer-programmed edit checks in the CATI system alert interviewers to possible data errors during the interview, so that the interviewer and respondent can correct them immediately. CATI significantly reduces the need for subsequent telephone follow-up, thereby reducing respondent burden and survey processing time.

Response rate

Usually, by the end of the collection period, 80% of the questionnaires have been fully completed. The refusal rate for the survey is approximately 8% to 9%. The remainder of the sample can be explained by non-contact or non-response. Initial sample weights are adjusted by a process called "raising factor adjustment" in cases of total and partial non-response.
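The source does not define the raising factor adjustment in detail. As a rough illustration only, the sketch below assumes a common weighting-class approach to total non-response: within each adjustment class, the weights of responding farms are multiplied by the ratio of the total weight of all sampled farms to the total weight of respondents. The field names (`cls`, `weight`, `responded`) and the data are hypothetical, and partial non-response is not handled.

```python
def adjust_for_nonresponse(units):
    """Reweight respondents so they also represent non-respondents.

    units: list of dicts with keys 'cls' (adjustment class),
    'weight' (initial sample weight) and 'responded' (bool).
    Assumes every class contains at least one respondent.
    """
    totals = {}
    for u in units:
        all_w, resp_w = totals.get(u["cls"], (0.0, 0.0))
        all_w += u["weight"]
        if u["responded"]:
            resp_w += u["weight"]
        totals[u["cls"]] = (all_w, resp_w)

    adjusted = []
    for u in units:
        if not u["responded"]:
            continue  # non-respondents drop out of the estimation file
        all_w, resp_w = totals[u["cls"]]
        adjusted.append({**u, "weight": u["weight"] * all_w / resp_w})
    return adjusted

# Hypothetical class of 10 sampled farms, 8 of which responded.
units = [{"cls": "s1", "weight": 6.0, "responded": i < 8} for i in range(10)]
adjusted = adjust_for_nonresponse(units)
print(adjusted[0]["weight"], sum(u["weight"] for u in adjusted))
```

Under this assumption the adjusted weights of the respondents add up to the total initial weight of the class, so the class still represents the same number of farms.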

Sampling and non-sampling errors

The statistics contained in this publication are based on a random sample of agricultural operations and, as such, are subject to sampling and non-sampling errors. The overall quality of the estimates depends on the combined effect of these two types of errors.

Sampling errors arise because estimates are derived from sample data and not from the entire population. These errors depend on factors such as sample size, sampling design and the method of estimation. An important feature of probability sampling is that sampling errors can be measured from the sample itself.

Non-sampling errors are not related to sampling and may occur at any stage of the survey operation, for many reasons. For example, non-response is an important source of non-sampling error. Other examples include coverage errors, differences in the interpretation of questions, incorrect information from respondents, and mistakes in recording, coding and processing the data.

Estimation

The survey data collected are weighted in order to produce unbiased level indicators which are representative of the population. These level indicators then undergo a validation process, based on subject matter analysis, before final estimates are published.
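The weighting step can be illustrated with a simple expansion estimator, in which each reported value is multiplied by its final weight and the products are summed. This is a sketch under that assumption, not the survey's documented estimator, and the weights and reported values are hypothetical.

```python
def weighted_total(responses):
    """Estimate a population total from (weight, reported_value) pairs."""
    return sum(weight * value for weight, value in responses)

# Hypothetical final weights and reported production (tonnes).
responses = [(6.0, 100.0), (6.0, 150.0), (1.0, 2000.0)]
print(weighted_total(responses))  # 6*100 + 6*150 + 1*2000 = 3500.0
```

A farm with a weight of 6 stands in for six farms on the frame, while a farm selected with certainty (weight 1) represents only itself.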

Revisions

The November crop production estimates contained in this publication are final for the current crop year. Revisions to these estimates may still be made for up to two years after the end of the crop year.

The following table contains statistics that indicate the magnitude and direction of the updates between the November production estimates and the final revised production estimates. The magnitude is measured by the average percent change between the November and final revised estimates. The direction of the update is indicated by counting the number of years in which the November estimate is above or below the final revised estimate. The data indicate, for example, that the November production estimates for barley are revised by 2.9% on average, usually downwards.

Text table 1

Magnitude and direction of changes between November and revised production estimates, Canada, 2001 to 2010
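The magnitude and direction measures described above could be computed along the following lines. The sketch assumes that the magnitude is the average absolute percent change relative to the final revised estimate; the source says only "average percent change", so the absolute value and the choice of denominator are assumptions. The yearly figures are hypothetical, not taken from the table.

```python
def revision_summary(november, final):
    """Summarize revisions between November and final revised estimates.

    Returns (average absolute percent change, number of years the November
    estimate was above the final estimate, number of years it was below).
    """
    pct = [(n - f) / f * 100 for n, f in zip(november, final)]
    magnitude = sum(abs(p) for p in pct) / len(pct)
    years_above = sum(1 for p in pct if p > 0)
    years_below = sum(1 for p in pct if p < 0)
    return magnitude, years_above, years_below

# Hypothetical production estimates for three years (thousand tonnes).
november = [104.0, 98.0, 110.0]
final = [100.0, 100.0, 100.0]
magnitude, above, below = revision_summary(november, final)
print(round(magnitude, 1), above, below)
```

A November estimate that sits above the final revised estimate in most years corresponds to revisions that are usually downwards.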

Data quality

The November crop production estimates are based on level indicators obtained from a probability survey of farming operations. The potential error introduced by sampling can be estimated from the sample itself by using a statistical measure called the "coefficient of variation" (c.v.). Over repeated surveys, 95 times out of 100, the relative difference between a sample estimate and what would have been obtained from an enumeration of all farming operations is less than twice the c.v. This range of values is referred to as the "confidence interval". While published estimates may not exactly equal the level indicators because of the validation process, they do remain within the confidence interval of the survey level indicators. For the November Farm Survey, c.v.s range from 1% to 5% for the major crops. C.v.s for specialty crops and small areas are usually between 6% and 25%.
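As a worked illustration of the "twice the c.v." rule, the sketch below derives the c.v. from an estimate and its standard error, then builds the approximate 95% confidence interval as the estimate plus or minus twice the c.v. times the estimate. The function names and the numbers are hypothetical.

```python
def coefficient_of_variation(estimate, standard_error):
    """Return the c.v. as a percentage of the estimate."""
    return standard_error / estimate * 100

def confidence_interval(estimate, cv_percent):
    """Approximate 95% interval: estimate +/- 2 * c.v. * estimate."""
    half_width = 2 * (cv_percent / 100) * estimate
    return estimate - half_width, estimate + half_width

# Hypothetical estimate of 1,000 thousand tonnes with a standard error of 30.
cv = coefficient_of_variation(1000.0, 30.0)
low, high = confidence_interval(1000.0, cv)
print(cv, low, high)
```

With a c.v. of 3%, the interval runs from roughly 940 to 1,060 thousand tonnes, that is, within 6% of the estimate on either side.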

For the different types of special crops, the estimates contained in this publication have been assigned a letter to indicate their coefficient of variation (c.v.) expressed as a percentage. The letter grades represent the following c.v. ranges:

Text table 2

Coefficient of variation rating system for special crops

Data confidentiality

Data confidentiality is ensured under the Statistics Act, which prohibits the divulging of individual or aggregated data where individuals or businesses might be identified.