Concepts, methodology and data quality

This survey measures, on a monthly basis, the quantities of lumber produced and shipped by Canadian manufacturers. The target population includes all sawmills in Canada classified under North American Industry Classification System (NAICS) code 321111.

General Methodology

This is a sample survey with a cross-sectional design.

The frame used for sampling purposes is the Statistics Canada Business Register. The statistical unit is the establishment. The survey population includes all sawmill establishments above certain thresholds that vary by province and by reference year.

Data are collected each month from survey respondents using a mail-out / mail-back process as well as an electronic questionnaire. Data capture and preliminary editing are performed simultaneously to ensure validity of the data. Businesses from which no response has been received, or whose data may contain errors, are followed up by telephone, email or fax.

To estimate the contribution of units below the sampling thresholds, the system derives ratios from Goods and Services Tax (GST) files using a statistical model. The model accounts for the difference between units above the threshold and those below it, as well as the time lag between the reference period of the survey and the reference period of the GST file.

Missing data for the current month are imputed automatically using a number of statistical techniques that use survey data collected during the current cycle as well as auxiliary information sources. These auxiliary sources include survey data from a previous cycle (historical), donor questionnaires and administrative data. Opening stocks are set equal to the value of the closing stocks from the previous month. Closing stocks are calculated by adding production to opening stocks and then subtracting shipments and waste values. The option exists for the subject matter analyst to manually override these imputations with better estimates based on pertinent knowledge about the industry or the business.
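The stock identity described above (opening stocks carried forward from the previous month, closing stocks derived from production, shipments and waste) can be illustrated with a minimal sketch. The class name, function name and figures below are hypothetical and chosen only for illustration; they are not taken from the survey's processing system.

```python
from dataclasses import dataclass


@dataclass
class MonthlyReport:
    """Hypothetical container for one establishment's monthly quantities."""
    production: float
    shipments: float
    waste: float


def roll_stocks(previous_closing: float, report: MonthlyReport) -> tuple[float, float]:
    """Return (opening, closing) stocks for the current month.

    Opening stocks equal the previous month's closing stocks.
    Closing stocks = opening + production - shipments - waste.
    """
    opening = previous_closing
    closing = opening + report.production - report.shipments - report.waste
    return opening, closing


# Illustrative figures only (units are whatever the questionnaire uses).
opening, closing = roll_stocks(
    1200.0, MonthlyReport(production=500.0, shipments=450.0, waste=20.0)
)
print(opening, closing)
```

With these illustrative inputs, closing stocks work out to 1200 + 500 - 450 - 20 = 1230.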

As part of the estimation process, survey data are weighted and combined with administrative data to produce final industry estimates.
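As a rough illustration of how weighted survey data and an administrative component might be combined, the sketch below sums weight times reported value over sampled units and adds a separately estimated total for units below the sampling thresholds. The additive combination, the function name and all figures are assumptions made for this example; the source does not specify the actual estimator.

```python
def industry_estimate(
    sampled: list[tuple[float, float]],
    below_threshold_total: float,
) -> float:
    """Combine a weighted survey total with an administrative-data total.

    sampled: (sampling_weight, reported_value) pairs for surveyed units.
    below_threshold_total: assumed contribution of units below the sampling
        thresholds, e.g. derived from administrative (GST) data.
    """
    survey_total = sum(weight * value for weight, value in sampled)
    # Assumption: the two components are simply added together.
    return survey_total + below_threshold_total


# Illustrative figures only.
units = [(1.0, 800.0), (2.5, 120.0), (2.5, 100.0)]
estimate = industry_estimate(units, below_threshold_total=150.0)
print(estimate)
```

Here the weighted survey total is 800 + 300 + 250 = 1350, and the combined estimate is 1500.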

Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Direct disclosure may occur when the value in a tabulation cell is composed of a few respondents or when the cell is dominated by a few companies. Residual disclosure may occur when confidential information can be derived indirectly by piecing together information from different sources or data series.
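The two direct-disclosure conditions above (too few respondents in a cell, or a cell dominated by a few companies) can be sketched as a simple check. The thresholds used here (at least 3 respondents, and the top 2 contributors accounting for no more than 80% of the cell) are hypothetical placeholders; the actual confidentiality rules are not stated in this document.

```python
def is_sensitive(
    values: list[float],
    min_respondents: int = 3,   # hypothetical threshold
    top_n: int = 2,             # hypothetical number of dominant units
    max_share: float = 0.8,     # hypothetical dominance limit
) -> bool:
    """Flag a tabulation cell that may lead to direct disclosure."""
    contributions = sorted((v for v in values if v > 0), reverse=True)

    # Condition 1: the cell is composed of too few respondents.
    if len(contributions) < max(min_respondents, 1):
        return True

    # Condition 2: a few companies dominate the cell total.
    total = sum(contributions)
    return sum(contributions[:top_n]) / total > max_share


print(is_sensitive([100.0, 90.0]))               # too few respondents
print(is_sensitive([40.0, 30.0, 20.0, 10.0]))    # top two hold 70%
print(is_sensitive([900.0, 50.0, 30.0, 20.0]))   # top two hold 95%
```

A check of this kind covers direct disclosure only; residual disclosure requires examining related cells and series together.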

Under normal circumstances, data are collected, captured, edited, tabulated and published within six to eight weeks after the end of the reference month.

Revisions

Each month, preliminary estimates are provided for the reference month, and revised estimates, based on late responses, are provided for the previous month.

Once every year (normally in July), the monthly Sawmills series are revised. These revisions incorporate any data that may have been received after the close of the collection cycle during the previous reference year.

The revised estimates are published in CANSIM.

Data accuracy

While considerable efforts have been taken to ensure high standards throughout all stages of collection and processing, the resulting estimates are inevitably subject to a certain degree of non-sampling error. These errors are not related to sampling and may occur for various reasons including non-response, inaccurate reporting and processing. Errors relating to non-response can be measured. All attempts are made to control inaccurate reporting and processing errors.

Non-response error

Some respondents may be unable to provide data for a variety of reasons (e.g. fire, theft, strike or economic hardship), while others may be late in responding. To minimize non-response, delinquent respondents are followed up rigorously by phone, email or fax. Data for non-responding units are imputed using industry trends and other related information. For questionnaires received after the end of the monthly collection cycle, data are revised in accordance with the revision policy.

Non-response error is measured as the non-response rate: the number of non-responses divided by the total number of expected responses for the units in the sample.
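The ratio described above can be sketched in a few lines. The function name and the counts are hypothetical and serve only to illustrate the arithmetic.

```python
def non_response_rate(expected: int, received: int) -> float:
    """Share of expected responses that were not received.

    expected: number of responses expected from units in the sample.
    received: number of responses actually received.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    if not 0 <= received <= expected:
        raise ValueError("received must be between 0 and expected")
    return (expected - received) / expected


# Illustrative counts only: 30 of 200 sampled units did not respond.
rate = non_response_rate(expected=200, received=170)
print(f"{rate:.1%}")
```

With 200 expected responses and 170 received, the rate is 30 / 200, or 15%.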

Inaccurate response

Inaccuracy may result from poor questionnaire design, from respondents' inability to provide the requested information, or from misinterpretation of the survey questions. To reduce such errors, the format and wording of the questionnaire are reviewed periodically and modified based on feedback from survey respondents and data users. Respondents are also reminded of the importance of their contribution and of the accuracy of reported information.

Processing errors

These errors may occur at various stages in the processing of survey data such as data entry, verification, editing and tabulation. Data are examined for such errors using automated edits along with an analytical review by subject matter experts. Several checks are performed on the collected data to verify internal consistency and comparability over time.