Data quality, concepts and methodology: Methodology and data quality

Survey frame and sample selection

Every five years, the Census of Agriculture collects information on agricultural operations across Canada, including institutional farms, community pastures, Indian reserves, etc. The Census of Agriculture provides a list of farms and their crop areas from which a probability sample for the June Farm Survey is selected.

The target population for the June Farm Survey includes all farms in Canada enumerated in the Census of Agriculture except institutional farms, farms on Indian reserves and farms from the Northwest Territories, Yukon and Nunavut.

Probability surveys can use two types of sampling frames: list and area. In the June Farm Survey, only the list frame is used in sample selection. This list frame is stratified into homogeneous groups on the basis of Census characteristics (such as farm size and crop area) and sub-provincial geographic boundaries. A sample of approximately 25,000 farms was drawn from the list frame for the June 2011 Farm Survey.
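The stratified selection described above can be sketched in a few lines. This is an illustration only: the source does not say how units are selected within a stratum or how the sample is allocated, so the sketch assumes simple random sampling with the same sampling fraction in every stratum. The function name `stratified_sample`, the field names and the 100-farm frame are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(frame, stratum_key, fraction, seed=0):
    """Draw a simple random sample of roughly `fraction` of each stratum."""
    rng = random.Random(seed)

    # Group the list frame into strata.
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_key(unit)].append(unit)

    sample = []
    for key in sorted(strata):
        units = strata[key]
        n = max(1, round(len(units) * fraction))  # at least one unit per stratum
        weight = len(units) / n  # design weight: stratum size / sample size
        for unit in rng.sample(units, n):
            sample.append({**unit, "weight": weight})
    return sample


# Hypothetical frame of 100 farms with made-up region and size classes.
frame = [
    {"id": i, "region": "A" if i < 60 else "B", "size": "large" if i % 3 == 0 else "small"}
    for i in range(100)
]
sample = stratified_sample(frame, lambda u: (u["region"], u["size"]), fraction=0.25)
```

Each sampled farm carries a design weight equal to the number of frame units it represents, so the weights in the sample add up to the size of the frame.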

Data collection

The June 2011 Farm Survey was carried out from May 25 to June 3. Data were collected using a computer-assisted telephone interview (CATI) system.

Edit and imputation

With the CATI system, edit procedures can be applied at the time of the interview. Computer-programmed edit checks in the CATI system alert interviewers to possible data errors during the interview, so that the interviewer and respondent can correct them immediately. CATI significantly reduces the need for subsequent telephone follow-up, thereby reducing respondent burden and survey processing time.

Response rate

Usually by the end of the collection period, 80% of the questionnaires have been fully completed. The refusal rate of the survey is approximately 8 to 9%. The remainder of the sample unaccounted for can be explained by non-contact and non-response. Initial sample weights are adjusted by a process called "raising factor adjustment" in cases of total and partial non-response. No imputation is performed for missing values.
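The source names the "raising factor adjustment" but does not give its formula. One common form of non-response weight adjustment, assumed here purely for illustration, inflates respondents' weights within each stratum so that they also represent the non-respondents. The function `adjust_weights_for_nonresponse` and the record fields are hypothetical, and the sketch covers total non-response only.

```python
def adjust_weights_for_nonresponse(units):
    """Return respondents with weights raised to cover non-respondents.

    Assumed rule (not confirmed by the source): within each stratum,
    factor = sum of weights of all sampled units / sum of weights of respondents.
    """
    sampled_total = {}
    respondent_total = {}
    for u in units:
        s = u["stratum"]
        sampled_total[s] = sampled_total.get(s, 0.0) + u["weight"]
        if u["responded"]:
            respondent_total[s] = respondent_total.get(s, 0.0) + u["weight"]

    adjusted = []
    for u in units:
        if not u["responded"]:
            continue  # non-respondents drop out; respondents carry their weight
        factor = sampled_total[u["stratum"]] / respondent_total[u["stratum"]]
        adjusted.append({**u, "weight": u["weight"] * factor})
    return adjusted


# Hypothetical stratum: 10 sampled farms with weight 4 each, 8 of which responded.
units = [{"stratum": "s1", "weight": 4.0, "responded": i < 8} for i in range(10)]
adjusted = adjust_weights_for_nonresponse(units)
```

In this made-up stratum the factor is 40 / 32 = 1.25, so each of the 8 respondents ends up with a weight of 5 and the stratum total of 40 is preserved.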

Sampling and non-sampling errors

The statistics contained in this publication are based on a random sample of agricultural operations and, as such, are subject to sampling and non-sampling errors. The overall quality of the estimates depends on the combined effect of these two types of errors.

Sampling errors arise because estimates are derived from sample data and not from the entire population. These errors depend on factors such as sample size, sampling design and the method of estimation. An important feature of probability sampling is that sampling errors can be measured from the sample itself.

Non-sampling errors are not related to sampling and may occur throughout the survey operation for many reasons. For example, non-response is an important source of non-sampling error. Coverage errors, differences in the interpretation of questions, incorrect information from respondents, and mistakes in recording, coding and processing data are other examples of non-sampling errors.


The survey data collected are weighted in order to produce unbiased level indicators which are representative of the population. These level indicators then undergo a validation process, based on subject matter analysis, before final estimates are published.
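As a rough illustration of how weighting produces a level indicator, a weighted total multiplies each respondent's reported value by its final weight and sums the results. The function `weighted_total`, the field names and the figures below are hypothetical, and the actual estimator used by the survey may differ.

```python
def weighted_total(records, value_key, weight_key="weight"):
    """Sum of weight * reported value over all responding units."""
    return sum(r[weight_key] * r[value_key] for r in records)


# Made-up responses: the weight is the number of farms each respondent represents.
responses = [
    {"weight": 10, "barley_acres": 50},
    {"weight": 5, "barley_acres": 20},
]
barley_indicator = weighted_total(responses, "barley_acres")  # 10*50 + 5*20
```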


The June seeded area estimates contained in this publication are preliminary estimates and consequently are subject to revision. Seeded areas will be finalized for the crop year in the November Farm Survey report.

The following table contains statistics which indicate the magnitude and direction of past revisions to the June seeded area estimates. The magnitude is measured by the average percent change between the preliminary and final estimates. The direction of revisions is indicated by counting the number of years in which the preliminary estimate was above or below the final revised estimate. The data indicate, for example, that the preliminary estimates of June seeded area for barley are revised by 4.7% on average, usually downwards.
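The two revision measures can be computed as in the sketch below. Two details are assumptions, since the source does not specify them: the percent change is taken relative to the preliminary estimate, and the average is taken over absolute values. The function `revision_summary` is hypothetical and the figures are invented; they are not taken from the survey's revision table.

```python
def revision_summary(preliminary, final):
    """Summarize revisions between paired preliminary and final estimates."""
    # Percent change from preliminary to final, one value per year.
    pct_changes = [100.0 * (f - p) / p for p, f in zip(preliminary, final)]
    return {
        # Magnitude: mean absolute percent change (assumed definition).
        "average_percent_change": sum(abs(c) for c in pct_changes) / len(pct_changes),
        # Direction: years in which the preliminary estimate was above or below the final.
        "years_preliminary_above": sum(1 for p, f in zip(preliminary, final) if p > f),
        "years_preliminary_below": sum(1 for p, f in zip(preliminary, final) if p < f),
    }


# Invented seeded-area figures for three years.
summary = revision_summary(preliminary=[100.0, 200.0, 100.0], final=[95.0, 190.0, 102.0])
```

With these invented figures the yearly changes are -5%, -5% and +2%, giving an average magnitude of 4% and a preliminary estimate that was above the final in two of three years, which corresponds to a mostly downward revision.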

Data quality

The June seeded area estimates are based on level indicators obtained from a probability survey of farming operations. The potential error introduced by sampling can be estimated from the sample itself by using a statistical measure called the "coefficient of variation" (c.v.). Over repeated surveys, 95 times out of 100, the relative difference between a sample estimate and what would have been obtained from an enumeration of all farming operations would be less than twice the c.v. This range of values is referred to as the "confidence interval". While published estimates may not exactly equal the level indicators due to the validation process, these estimates do remain within the confidence interval of the survey level indicators. For the June Farm Survey, c.v.s range from 1% to 10% for the major crops. Coefficients of variation for specialty crops and small areas are usually within 11% to 25%.
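The "twice the c.v." rule translates into a simple calculation. The sketch below assumes the c.v. is the standard error expressed as a percentage of the estimate, which is the usual definition; the function names and the figures are hypothetical.

```python
def cv_percent(estimate, standard_error):
    """Coefficient of variation: standard error as a percentage of the estimate."""
    return 100.0 * standard_error / estimate


def confidence_interval(estimate, cv_pct):
    """Approximate 95% interval: estimate plus or minus twice the c.v."""
    half_width = 2.0 * (cv_pct / 100.0) * estimate
    return estimate - half_width, estimate + half_width


# Made-up example: an estimate of 1,000 with a standard error of 50.
cv = cv_percent(1000.0, 50.0)
low, high = confidence_interval(1000.0, cv)
```

For this made-up estimate the c.v. is 5%, so the interval runs from roughly 900 to 1,100, that is, 10% on either side of the estimate.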

For the different types of special crops, the estimates contained in this publication have been assigned a letter to indicate their c.v. (expressed as a percentage). The letter symbols represent the following c.v. ranges:

Data confidentiality

Data confidentiality is ensured under the Statistics Act, which prohibits the divulging of individual or aggregated data where individuals or businesses might be identified.
