Executive summary

Care is needed when estimating economic relationships with the Capital, Labour, Energy, Materials and Services (KLEMS) database. Although the dataset comprises high-quality estimates of economic variables constructed from the supply and use tables of the National Accounts, aberrant observations, such as outliers and leverage points, are an important feature of the dataset.

Aberrant observations can occur for several reasons. First, changes in classification and methodology can lead to discontinuities over time. Second, disaggregation into finer industry classifications produces data that may be less coherent. Third, macroeconomic or other shocks may cause abrupt, unusual movements in the underlying data sources.

This paper takes a first step toward addressing aberrant observations. It is broadly divided into two parts. The first part examines how pre-tests for unit roots or stationarity perform when applied to the KLEMS data. The results imply that, because of the complexity of the data and the presence of aberrant observations, commonly applied tests, whether used individually or aggregated over the panel of industries, may not support reliable inference.

The second part focuses on how aberrant observations affect parametric total factor productivity (TFP) growth estimates, and on how to identify them. When aberrant observations are present, commonly applied estimation methods such as Ordinary Least Squares (OLS) can yield distorted parameter estimates. This paper examines the size of that distortion by also employing an estimation technique that is less sensitive to unusual observations. The results of the two estimation methods are then compared to illustrate the impact of unusual data points.
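The comparison described above can be illustrated with a minimal sketch. It is not the paper's procedure: the paper uses an S-estimator, whereas this example swaps in a brute-force approximation to Least Median of Squares because it fits in a few lines of standard-library Python. The data are synthetic (a hypothetical log-TFP series growing at 2% a year with two aberrant final observations), not KLEMS data, and the names `ols_fit` and `lms_fit` are introduced here purely for illustration.

```python
from itertools import combinations
from statistics import median


def ols_fit(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def lms_fit(x, y):
    """Approximate least median of squares: try the line through every
    pair of points and keep the one whose median squared residual is
    smallest. Returns (a, b)."""
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # slope undefined for a vertical line
        b = (y[j] - y[i]) / (x[j] - x[i])
        a = y[i] - b * x[i]
        crit = median((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
        if best is None or crit < best[0]:
            best = (crit, a, b)
    return best[1], best[2]


# Synthetic series (assumption, not KLEMS data): 20 years of log TFP with
# a true trend of 0.02 per year and small deterministic noise.
years = list(range(20))
log_tfp = [0.02 * t + 0.001 * ((7 * t) % 5 - 2) for t in years]

# Two aberrant observations at the end of the sample, standing in for a
# shock; end-of-sample points have high leverage on the trend slope.
log_tfp[18] -= 0.3
log_tfp[19] -= 0.3

ols_a, ols_b = ols_fit(years, log_tfp)
lms_a, lms_b = lms_fit(years, log_tfp)

print(f"OLS trend growth: {ols_b:.4f}")
print(f"LMS trend growth: {lms_b:.4f}")
```

In this toy setting the OLS slope is pulled well below the true 0.02 by the two end-of-sample points, while the median-based fit stays close to it. The pairwise search costs on the order of n³ operations, so it is only practical for short series like this one.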

Over the course of the second part of the paper a number of questions pertinent to dealing with aberrant observations are addressed:

  • What techniques are available for dealing with aberrant observations in the KLEMS database?

Several estimators that are insensitive (or less sensitive) to aberrant observations are available, including Least Median of Squares, M-estimators, Least Trimmed Squares and S-estimators. Each limits, in its own way, the influence that aberrant observations have on the fitted parameters. In this paper an S-estimator is used because it is insensitive to aberrant observations in both the dependent and the independent variables.

  • How do aberrant observations affect OLS estimates, and does this matter economically?

OLS is found to underestimate TFP growth relative to the S-estimator, by as much as 0.35 percentage points on average and by as much as 4.3 percentage points for particular industries. These magnitudes are non-trivial, particularly when compounded over 43 years.

  • Are aberrant observations equally distributed across years?

Analysis of the timing of aberrant observations shows that they tend to cluster around certain events. In particular, the first oil shock in the early 1970s, the 1980–1981 recessions and the 1990 recession are shown to be important sources of aberrant observations.

  • What causes aberrant observations?

Macroeconomic shocks are the most important source of aberrant observations. It appears that when industries are placed under stress, either by rapidly rising input costs in the face of inelastic demand or by falling aggregate demand, their response function differs from that of 'regular' expansionary periods. The responses vary widely across industries and, for the purposes of estimating TFP, represent adjustments that are likely beyond the scope of traditional assumptions regarding productivity growth.

  • How important are aberrant observations in the KLEMS dataset?

Aberrant observations are an important feature of the KLEMS dataset. In this paper, which uses a Cobb-Douglas production function, up to 21% of sample observations are found to be aberrant in some way. Failure to account for these observations can affect parameter estimates and lead to inaccurate inference. Researchers who use the KLEMS dataset need to consider whether special techniques should be employed to take the aberrant observations into account.
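The remark above about compounding over 43 years can be made concrete with a short calculation. This is a rough sketch under stated assumptions: it treats each reported gap as a constant annual difference applied as a multiplicative factor every year, which the summary does not assert, and because the reported figures are "as much as" values the results are upper bounds. The function name `compound_gap` is hypothetical.

```python
def compound_gap(annual_gap_pp, years):
    """Ratio between two TFP levels after `years` when one grows
    `annual_gap_pp` percentage points faster each year (assumed constant)."""
    return (1 + annual_gap_pp / 100) ** years


# Gaps and horizon quoted in the summary; constancy is an assumption.
average_gap = compound_gap(0.35, 43)   # roughly 1.16
industry_gap = compound_gap(4.3, 43)   # roughly 6

print(f"Average case:  level ratio after 43 years = {average_gap:.2f}")
print(f"Extreme case:  level ratio after 43 years = {industry_gap:.2f}")
```

Under these assumptions, an average shortfall of 0.35 percentage points a year accumulates to a gap of roughly 16% in the implied TFP level, and the largest industry-level figure implies a level several times higher or lower.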

    View the publication Estimating TFP in the Presence of Outliers and Leverage Points: An Examination of the KLEMS Dataset in PDF format.