Whole Farm Database Reference Manual

Archived Content

Information identified as archived is provided for reference, research or recordkeeping purposes. It is not subject to the Government of Canada Web Standards and has not been altered or updated since it was archived. Please "contact us" to request a format other than those available.

About Whole Farm Database

Introduction
Background
Database structure
Selection criteria
Standard output formats
Data quality
Confidentiality

Introduction

Meeting users' needs

The Whole Farm Database (WFDB) is the product of a joint venture by Agriculture and Agri-Food Canada and Statistics Canada. It was developed to meet the growing demand from users of agricultural statistics for more disaggregated data at the farm level. Potential users range from public sector policy-makers to industry analysts and private sector decision-makers. The WFDB gives users the means to evaluate agricultural policies and programs and to analyze the viability, stability, and competitiveness of various farm businesses.

All the information in a single base

The essence of the WFDB is the integration, into a single base, of the agricultural data available from administrative and survey sources. This approach reduces response burden, eliminates duplication and makes the most of existing resources, while offering users access to a wider range of disaggregated physical and financial data at the farm level than ever before.

A guide to the Whole Farm Database

This manual was designed to familiarize potential users of the WFDB with the structure and quality of its data, as well as with other WFDB products and services.

Background

Since the mid-1920s, the Agriculture Division of Statistics Canada has been publishing data series depicting provincial levels and trends within the agriculture industry. Although timely and reliable, these series have not always satisfied the growing demand for more disaggregated farm level data.

In 1991, Agriculture and Agri-Food Canada obtained the funding to launch the Farm Level Data Project (FLDP) and provide the data necessary for monitoring the financial position of farm businesses; assessing the impact of changing policies, programs and economic conditions on farms; and administering and evaluating agricultural programs.

To meet this goal, the Agriculture Division of Statistics Canada and Agriculture and Agri-Food Canada launched the Whole Farm Database (WFDB), an essential component of the FLDP. The ultimate objective of this base is to provide a set of physical and financial data at the farm level that is as complete as possible (see note 1). This is achieved by integrating data from various existing surveys and administrative sources to produce disaggregated statistics by farm type, revenue class and sub-provincial geographic region.

Database structure

The WFDB currently draws on five major data sources that produce annual data at the farm level: the Taxation Data Program (TDP), the National Farm Survey (NFS), the June Crops Survey (JCS), the July Livestock Survey (JLS), and the Farm Financial Survey (FFS). The WFDB can be used to produce estimates for selected domains for farms with reported annual revenues of $10,000 or more.

The following components have been targeted for each source for the stated reference years:

  • Taxation Data Program (note 2): 1987 to 2009
    • detailed revenues and expenses
    • additions and disposals of assets (note 3)
    • operator off-farm income (unincorporated only) (note 4)
    • operator off-farm income (unincorporated and incorporated) (note 5)
    • farm family off-farm income (note 6)
  • National Farm Survey (notes 7 and 8): 1988 to 1992
    • cropland acreages
    • livestock inventories
    • certain financial components (note 9)
  • June Crops Survey (note 10): 1993 to 2009
    • crops - seeded area
  • July Livestock Survey (note 10): 1993 to 2009
    • livestock inventories
  • Farm Financial Survey (note 11): 1987 to 2009 (note 12)
    • capital investments and sales (note 13)
    • assets and liabilities
    • long-term capital borrowed (note 14)

The sampling and methodology for each of these data sources are summarized in Appendix C.

Selection criteria

The WFDB can produce tables using, as selection criteria, any of the variables available from each data source. However, the production of cross-tabulations using the various data sources is limited to a group of descriptive variables, known as the "core" variables. Financial data from the Taxation Data Program, for example, could not be tabulated by number of animals on a farm because this particular data source does not have any variables on livestock inventories.

The "core" variables are divided into three information groupings: regional information, farm activities, and farm operator. It is the "core" variables that enable the structured disaggregation of the WFDB. As these variables are common to most of the data sources, the database can offer tabulations with the same selection criteria (e.g., revenues and expenses from the TDP, livestock inventories from the JLS, and assets and liabilities from the FFS could all be produced for the same census division, farm type, and revenue class). Although most of the core variables are available for each record, disaggregations will be somewhat limited by the data quality and the number of respondents for a given variable. For example, data cannot be disaggregated by a single postal code; however, postal codes are included on the individual records, which allows a user to specify a group of codes to create a custom region.
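The custom-region idea can be sketched in a few lines of Python. This is only an illustration: the record layout, the field names (`postal_code`, `farm_type`, `revenue`), the postal codes and the figures are all invented for the example and are not the WFDB's actual schema or data.

```python
from collections import defaultdict

# Hypothetical farm-level records; every value here is made up for illustration.
RECORDS = [
    {"postal_code": "X0X 0A1", "farm_type": "grain", "revenue": 150000.0},
    {"postal_code": "X0X 0A2", "farm_type": "dairy", "revenue": 320000.0},
    {"postal_code": "X0X 0A2", "farm_type": "grain", "revenue": 90000.0},
    {"postal_code": "X0X 0B9", "farm_type": "grain", "revenue": 60000.0},
]


def revenue_by_farm_type(records, postal_codes):
    """Total revenue by farm type for a custom region given as a group of postal codes."""
    region = set(postal_codes)
    totals = defaultdict(float)
    for record in records:
        # Keep only the records that fall inside the user-defined region.
        if record["postal_code"] in region:
            totals[record["farm_type"]] += record["revenue"]
    return dict(totals)


custom_region = ["X0X 0A1", "X0X 0A2"]
print(revenue_by_farm_type(RECORDS, custom_region))
```

In practice the region would need to contain enough respondents for the resulting estimates to pass the quality and confidentiality checks described later in this manual.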

The "core" variables are defined in Appendix A.

Standard output formats

The WFDB can provide estimates for every variable collected from all data sources. To maintain data quality and consistency, a specific series of agricultural variables was developed for each data source. These standard output formats were developed to provide thorough coverage of the agriculture sector on an annual basis (see Tables 1 to 5).

Most of the variables used from the NFS, JCS, JLS and FFS data sources were drawn directly from the survey questionnaires. These variables are defined in the questionnaires (see Appendix F) and interviewer training manuals; both are available from Agriculture Division. In contrast, TDP variables used in the standard outputs are predominantly custom aggregates of farm tax filer data (see Appendix B for descriptions).

Data quality

Sampling errors

All of the estimates produced by the WFDB are derived from samples, making them subject to sampling errors. Such errors occur when observations are based only on a sample and not on the population as a whole. The size and design of the sample, the variability of the characteristic of interest in the population, and the estimation method all affect data quality. In sample surveys, inference is made about the entire population based on data obtained from a part of the population; the results are therefore likely to differ from those that a complete census taken under the same survey conditions would have produced. The most important feature of probability sampling is that the sampling error can be measured from the sample itself.

Each estimate released through the WFDB is assigned a coefficient of variation (c.v.) as a measure of its quality. The c.v. is an objective statistical measure, obtainable because the sample is random, of the variation between an estimate and its "true" value; it indicates the degree of confidence that can be placed in a particular estimate. Users must determine whether an estimate with a high c.v. is appropriate for their use.

The following rating system is suggested when using figures within a specific c.v. range.

Coefficients of variation rating system

Symbol  Coefficient of variation range  Meaning
A       0.00% to 4.99%                  excellent
B       5.00% to 9.99%                  very good
C       10.00% to 14.99%                good
D       15.00% to 24.99%                acceptable
E       25.00% to 34.99%                use with caution
F       35.00% and over                 too unreliable to be published
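The rating system can be expressed as a simple lookup. The sketch below is a minimal illustration, not part of the WFDB itself; the function name `cv_rating` is hypothetical, and it assumes the c.v. is supplied in percent and treats each band's upper bound as exclusive (so 4.995% is rated "A").

```python
def cv_rating(cv_percent: float) -> str:
    """Return the suggested quality symbol for a c.v. expressed in percent."""
    if cv_percent < 0:
        raise ValueError("expected a non-negative c.v. in percent")
    # (exclusive upper bound, symbol) pairs taken from the rating table.
    bands = [(5.0, "A"), (10.0, "B"), (15.0, "C"), (25.0, "D"), (35.0, "E")]
    for upper_bound, symbol in bands:
        if cv_percent < upper_bound:
            return symbol
    # 35.00% and over: too unreliable to be published.
    return "F"


for value in (3.2, 12.0, 27.5, 40.0):
    print(value, cv_rating(value))
```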

The c.v., defined as the standard error divided by the sample estimate, is not always a good indicator of precision. This is particularly true for a variable that can take both positive and negative values. In that case, the standard error of the estimate tends to be large while the estimate itself tends to be small or close to zero, which results in a high c.v. The estimate might therefore be near the exact population value and, at the same time, be rated as unreliable. The variables net operating income, net operating income adjusted for capital cost allowance (CCA), net market income and net market income adjusted for CCA are in this situation, and the c.v.'s calculated for them are therefore not used. To give an indication of their precision, these variables have been assigned a data quality symbol based on the c.v.'s of the variables from which they are derived.
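A small numerical sketch shows why the c.v. misbehaves for such variables. It assumes a simple random sample and an unweighted mean, which is a simplification of the actual WFDB estimation methods; the figures are invented, and the absolute value of the estimate is used in the denominator so that the c.v. stays positive.

```python
from math import sqrt
from statistics import mean, stdev


def cv_of_mean(values):
    """c.v. in percent of a sample mean: standard error / |estimate| * 100."""
    estimate = mean(values)
    standard_error = stdev(values) / sqrt(len(values))
    return 100.0 * standard_error / abs(estimate)


# Invented figures (thousands of dollars) for four sampled farms.
revenues = [100.0, 120.0, 90.0, 110.0]   # always positive
net_income = [10.0, -8.0, 6.0, -7.0]     # mixed signs, mean close to zero

print(f"revenues:   {cv_of_mean(revenues):.1f}%")
print(f"net income: {cv_of_mean(net_income):.1f}%")
```

The revenue c.v. comes out at roughly 6%, whereas the net income c.v. is far above 100% even though its standard error is smaller in absolute terms, simply because the estimate in the denominator is close to zero.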

For example, while net operating income values may fluctuate around zero, we have two distinct components (total operating revenues and total operating expenses) for which we can calculate c.v.'s. Data quality symbols are assigned as follows:

  1. When the c.v. of both components is below 35.00% and the c.v. of at least one of the two components is between 25.00% and 34.99%, the symbol "E" is assigned. This symbol means that the estimate should be used with caution.
  2. When the c.v. of at least one component is equal to or greater than 35.00%, the symbol "F" is assigned. This symbol means that the estimate is too unreliable to be published.
  3. When the c.v. of both components is below 25.00%, no symbol is assigned.

The quality of the estimates not accompanied by a data quality symbol is assessed to be "acceptable or better."
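The three rules amount to rating the derived variable by the worse of its two component c.v.'s. A minimal sketch follows; the function name `derived_quality_symbol` is hypothetical and both c.v.'s are assumed to be given in percent.

```python
from typing import Optional


def derived_quality_symbol(cv_revenues: float, cv_expenses: float) -> Optional[str]:
    """Quality symbol for a derived variable, based on its two component c.v.'s (percent)."""
    worst = max(cv_revenues, cv_expenses)
    if worst >= 35.0:
        return "F"   # at least one component is 35.00% or more
    if worst >= 25.0:
        return "E"   # both below 35.00%, at least one at 25.00% or more
    return None      # both below 25.00%: no symbol, "acceptable or better"


print(derived_quality_symbol(12.0, 28.0))
print(derived_quality_symbol(12.0, 40.0))
print(derived_quality_symbol(12.0, 18.0))
```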

Variables for which a c.v. cannot be calculated have been handled in a similar manner. The c.v. for the variables total income (including or excluding taxable capital gains) and total income adjusted for capital cost allowance (including or excluding taxable capital gains) cannot be evaluated. Total income is the sum of off-farm income and net operating income and is calculated in two different steps.

Non-sampling errors

Non-sampling errors can occur whether a sample is used or a complete census of the population is taken. Such errors can be introduced at various stages of data processing (such as coding, data entry, editing, weighting or tabulation) and include errors introduced inadvertently by respondents. Such errors are reduced through extensive edits and data analysis; however, there are some limitations. In Saskatchewan, due to the unreliability of the TDP Census Subdivision (CSD) breakdowns, Census Agricultural Regions (CARs) cannot be reliably determined. In addition, until the 1992 taxation year, the TDP was unable to assign farm types to certain crop farms in Quebec; these farms were classified as "unspecified crop farms." This limitation has been addressed by subject-matter specialists. Since the 1993 taxation year, the "unspecified crop" revenues have been allocated to the crop type. Finally, TDP estimates for the Peace River Region in British Columbia are not available for 1988 and 1989.

Confidentiality

Statistics Canada maintains a strict level of confidentiality. All tabulated data are subject to restrictions prior to release. Several computerized checks are performed on all data cells to prevent the publication or disclosure of any information deemed confidential.

For each tabulation produced, the estimated number of farms is rounded to base 5 and the estimates of the other variables within that table are adjusted by a variable factor. The estimated number of farm families is rounded to base 10. The estimated number of farm operators is rounded to base 5 in the series on farm operators operating a single unincorporated agricultural holding, and to base 10 in the series on farm operators operating incorporated or unincorporated agricultural holdings. If the degree of detail required to answer a user request creates confidentiality concerns, the affected data or the entire table will be suppressed.

This method preserves the confidentiality of the data, without jeopardizing the quality of the actual estimates.
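Rounding to a base can be illustrated as follows. This sketch assumes conventional rounding to the nearest multiple of the base, with ties rounded up; the manual does not say which rounding rule is actually applied, and the adjustment of the other variables by a variable factor is not described in enough detail to be reproduced here. The function name `round_to_base` is hypothetical.

```python
def round_to_base(count: int, base: int) -> int:
    """Round a non-negative count to the nearest multiple of base (ties round up)."""
    if count < 0 or base <= 0:
        raise ValueError("count must be non-negative and base must be positive")
    return base * ((count + base // 2) // base)


farms = 1234      # estimated number of farms: base 5
families = 1234   # estimated number of farm families: base 10

print(round_to_base(farms, 5))      # 1235
print(round_to_base(families, 10))  # 1230
```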


Notes

  1. For more information, please refer to: Mario Ménard, Denis Chartrand and Dave Culver. January 1992. Report on User Consultations and Proposed Whole Farm Database Tabulations. This report is available from the Agriculture Division of Statistics Canada.
  2. The TDP did not cover the Prairie provinces from 1987 to 1989. In addition, data for 1988 and 1989 are not available for the BC Peace River Region.
  3. The statistical series on additions and disposals of assets does not cover the unincorporated sector from 1996 to 1999. The series was discontinued after the 1999 data.
  4. Unincorporated sector only. Farm operators involved in a single farm operation.
  5. Farm operators operating incorporated and unincorporated agricultural holdings. Data available since 1993.
  6. Unincorporated sector only. Farm families involved in a single farm operation, 1989 to 2008 only.
  7. Since the 1993 reference year, the NFS has been replaced by the June Crops Survey and the July Livestock Survey.
  8. The NFS did not cover Newfoundland and Labrador for 1992.
  9. Prairie provinces only. These data are not available for 1992.
  10. The JCS did not cover Newfoundland and Labrador from 1993 to 1997. This province was excluded from the JLS in 1993 and 1994.
  11. The FFS was originally collected by Farm Credit Canada (previously Farm Credit Corporation) under the name "Farm Survey." The dates are for reference years. Data are available on a biennial basis from 1987 to 2001 and available on an annual basis since 2002.
  12. There was an FFS in 1992 that covered only the Prairie provinces and British Columbia.
  13. Capital investments and sales data are collected biennially beginning in reference year 2005.
  14. This series was discontinued after 2006 data.