The Canadian Consumer Price Index Reference Paper
Chapter 9 – Reliability and Uncertainty

9.1 The Consumer Price Index (CPI) is widely used and trusted by Canadians. The index is never revised, which means it can be used to settle contracts without concern that those contracts may have to be reopened at a later time. The index release dates are typically announced a year in advance and firmly adhered to. The data usually become available three weeks after the price observation period. The index is available in considerable detail and without charge from Statistics Canada.^Note

9.2 As a sample-based statistic, the CPI, like all such statistics, cannot estimate with 100% accuracy the underlying (but unobserved) ‘true’ value it aims to measure. Nevertheless, the size of any statistical error or bias associated with the CPI is likely to be small enough to be within the range of tolerance of most users.

9.3 This chapter is about the error and bias properties of the CPI. Error refers to non-systematic inaccuracies that can be introduced at any stage of estimation. Errors that are systematic, meaning they lead to consistent over- or under-estimation of the phenomenon being measured, are called biases.^Note

9.4 The goal of this chapter is to inform users about the various ways in which statistical and non-statistical error enters the CPI and the steps taken by Statistics Canada to minimize it. The chapter is organized under two main themes: the error associated with the estimation of indices at the lower level, and the error entering into the calculation of the CPI at the upper level.

Error at the Lower Level of Consumer Price Index Calculation

9.5 Since most elementary price indices are derived from statistical samples, they are subject to sampling error. These indices necessarily have sampling variance^Note and may also have statistical bias, although efforts are made to minimize any such bias. Other things being equal, a larger sample size should yield a smaller sampling variance for a given elementary index.
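The relationship between sample size and sampling variance can be illustrated with a small simulation. The sketch below is only an illustration under simplifying assumptions that do not hold for the CPI itself: it assumes a hypothetical population of price relatives, simple random sampling with replacement, and an arithmetic mean as the estimator. The function name, the population and all parameter values are invented for the example.

```python
import random
import statistics


def simulated_index_variance(population, sample_size, replications=2000, seed=0):
    """Variance of the sample-mean price relative across repeated random samples."""
    rng = random.Random(seed)
    estimates = [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(replications)
    ]
    return statistics.pvariance(estimates)


# Hypothetical population of price relatives (current price / previous price).
population_rng = random.Random(42)
population = [population_rng.gauss(1.02, 0.05) for _ in range(10_000)]

# Same population, two different sample sizes.
var_small = simulated_index_variance(population, sample_size=10)
var_large = simulated_index_variance(population, sample_size=100)
```

Under these assumptions the variance obtained with 100 sampled prices is markedly smaller than the variance obtained with 10.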

9.6 Most of Statistics Canada’s surveys have samples drawn randomly from a frame of all in-scope units. Information about the number and size of units in the statistical population makes it possible to analyze the sample properties and to estimate the variance and bias associated with any calculated estimates. If this were the case for the CPI, it would be possible to report, for each elementary price index, a corresponding estimate of its sampling variance and bias. However, no comprehensive frame of all consumer products is available, and for this reason it is generally not possible to estimate the variance and bias of elementary price indices.

9.7 For a small number of elementary aggregates in the CPI – notably drivers’ licenses, passenger and vehicle registration fees – a single price rules the market within each geographical stratum. As a result, these elementary price indices do not have sampling error.

9.8 There are also cases where, although prices vary, information is available on virtually all consumer transactions and therefore estimates of price change have minimal sampling error. An example of such a case is the tuition fees index in the CPI where data are available on prices and enrolment by program for every university.

9.9 In the CPI there are also some elementary price indices that are not calculated via sampling and price observation, but rather by imputation.^Note For the most part these elementary aggregates are individually small residual groupings of products that serve to make the classification exhaustive. The statistical error of these imputed elementary price indices would be similar to those of the donor indices. Since many of these imputed elementary price indices are “catch-all” categories, which are individually small and dispersed across the CPI basket, it is unlikely this estimation method results in a significant increase in the error of the CPI.

9.10 In an ideal, simple situation an elementary aggregate would refer to a group of homogeneous products, and accurate information on the prices and quantities of all consumer transactions would be available in a timely manner. In such a case, the average transaction price (unit value) for one month divided by the average transaction price in the previous month would provide an accurate estimate of price change for the elementary aggregate.
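The unit value calculation for this ideal case can be sketched as follows. The sketch assumes that every transaction is observed as a (price, quantity) pair and that all products are homogeneous; the function names and the transaction data are hypothetical.

```python
def unit_value(transactions):
    """Average transaction price: total expenditure divided by total quantity."""
    total_value = sum(price * quantity for price, quantity in transactions)
    total_quantity = sum(quantity for _, quantity in transactions)
    return total_value / total_quantity


def unit_value_index(previous_month, current_month):
    """Ratio of the current month's unit value to the previous month's."""
    return unit_value(current_month) / unit_value(previous_month)


# Hypothetical transactions for one homogeneous product: (price, quantity).
previous_month = [(2.00, 100), (2.20, 50)]
current_month = [(2.10, 80), (2.30, 70)]

index = unit_value_index(previous_month, current_month)
```

With these invented figures the unit value rises from 310/150 to 329/150, so the index is 329/310, or roughly a 6.1% increase.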

9.11 In reality, product classes are rarely fully homogeneous and full transaction information is rarely available. For this reason elementary price indices must be estimated using sampling methods.^Note Additionally, elementary aggregates in the CPI usually include many varieties of competing products and outlets entering and exiting the market. Because of these complexities, which are common to most elementary aggregates, there is potential for error at the lower level of CPI estimation.

9.12 The general sampling approach for the CPI involves three stages.^Note Two of the stages (geography and outlets) use full or partial frames for the selection of sampling units. However, except for scanner data, there is no comprehensive frame for all products that consumers buy. Therefore, in the vast majority of cases, the third stage, in which representative products (RP) are designated, is done judgmentally. Sampling error can be introduced at any of the stages of the sample selection process. The potential for sampling error is greater in the selection of outlets than in the selection of geographies; it is greatest for product selection because there is no comprehensive frame from which to select units for sampling. Since the CPI sample is selected using some partial frames and judgmental methods it is not possible to estimate accurately the sampling error of elementary price indices.

9.13 Error at the lower level of the CPI may arise because of delays in introducing new products into the sample. The matched-model approach used in the CPI requires comparison of identical product offers (PO) over time, and as a result there is a delay between the time when new products appear in the market and the time when their price movements are captured in the CPI. This type of error can never be completely eliminated while continuing to use the matched-model approach. However, such errors can be mitigated with improved and timely sample management.^Note

9.14 As with new products, a delay in the introduction of new outlets into the CPI sample can be a source of error. In a competitive retail market, new outlets appear from time to time offering different levels of services or prices. As a result consumers may change where they shop. The CPI does not immediately capture price movements resulting from changes in the retail landscape because the outlet sample is not redrawn every month. As a result, error from outlet substitution can occur.^Note To counteract this type of error, the CPI outlet sample needs to be refreshed frequently to capture price movements in new outlets.

9.15 Bias arising from the emergence of new outlets occurs when new stores enter the market offering lower prices, thereby inducing consumers to switch outlets. Again this is a difficult source of potential bias to avoid completely, but efforts are made to refresh the outlet sample periodically to minimize this kind of bias.

9.16 Other types of error associated with the estimation of elementary price indices include various processing and clerical errors. Error can arise from the corrections and adjustments that are made to collected prices. The different methods used to adjust for quality change in the CPI can be imperfect and as such represent a source of error. However, the methods used are continuously reviewed to ensure that the most appropriate quality adjustment methods are applied. Previous studies have found little indication that quality adjustments across the elementary aggregates of the CPI are consistently biased in an upward or downward direction.^Note

9.17 Clerical errors might occur when POs are being recorded by the price collection agents. However, efforts are made to minimize errors of this kind. The Computer-Assisted Personal Interview (CAPI) devices used by the price collection agents employ automatic tolerance checks and alert the agents to any suspicious values as they are transcribed. In addition, once the data have been transmitted to Statistics Canada headquarters they are subjected to further verification and analysis. When unusual prices or price movements are detected, subject matter experts sometimes send a “Request for Additional Information” memo back to the price collection agents to obtain additional explanatory information on the PO. The same scrutiny is applied to prices coming from scanner data, web-scraped data and other administrative data sources.
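A tolerance check of the general kind described above might look like the following sketch. This is not the logic used in the CAPI devices, which this chapter does not describe; the function name and the 30% threshold are hypothetical and chosen only to illustrate the idea of flagging a price for review.

```python
def flag_suspicious(previous_price, current_price, tolerance=0.30):
    """Return True when a recorded price should be flagged for review.

    A price is flagged if either price is non-positive or if the relative
    change from the previous price exceeds the (hypothetical) tolerance.
    """
    if previous_price <= 0 or current_price <= 0:
        return True
    relative_change = abs(current_price / previous_price - 1.0)
    return relative_change > tolerance


# Hypothetical recorded prices for three product offers.
small_change = flag_suspicious(10.00, 10.50)   # 5% change, within tolerance
large_change = flag_suspicious(10.00, 15.00)   # 50% change, flagged
zero_price = flag_suspicious(10.00, 0.00)      # invalid price, flagged
```

A flagged value is not necessarily wrong; in the process described above it would simply prompt confirmation or a request for additional information.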

9.18 CPI calculations are carried out with computer software, which largely eliminates the possibility of arithmetic errors. However, the potential for programming errors is present. In 2001 a new methodology was introduced for the traveller accommodation elementary price index and errors were made in programming the algorithm for the new method. When the error was discovered and corrected a few years later, it was found to have caused a downward bias in the elementary index. Since that time, more rigorous testing procedures have been put in place to ensure such errors do not go undetected.

9.19 The CPI sampling strategy makes use of sample frames when selecting geographical collection areas, outlets and some products.^Note These frames can be subject to different types of error. For instance, there are likely to be delays in updating the outlet frame to include new in-scope units and to remove units no longer in scope. In addition, information about the size of individual units – typically sales data – is also subject to possible error. Some of this information comes from administrative data sources such as tax records and some is derived from other Statistics Canada surveys. In either case, the unit size information is typically subject to both sampling and non-sampling error.

Error at the Upper Level of Consumer Price Index Calculation

9.20 The calculation of price indices at the upper level of the CPI is accomplished using the Lowe price index formula. To estimate any particular aggregate index, a weighted average of its component elementary price indices is calculated. There are two possible sources of error in this calculation, both of which relate to the basket weights used in the aggregation. The first refers to errors that could be present in the expenditure estimates used as weights. The second is substitution bias which stems from the use of the Lowe formula at the upper level.
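The weighted-average step can be sketched as follows. The sketch takes the basket weights as given and does not show how they are derived or price-updated; the product classes, index values and weights are hypothetical.

```python
def aggregate_index(elementary_indices, basket_weights):
    """Weighted arithmetic average of elementary price indices."""
    total_weight = sum(basket_weights.values())
    weighted_sum = sum(
        basket_weights[name] * elementary_indices[name] for name in basket_weights
    )
    return weighted_sum / total_weight


# Hypothetical elementary indices (price reference period = 100) and weights.
elementary_indices = {"food": 104.0, "shelter": 102.0, "transportation": 110.0}
basket_weights = {"food": 0.2, "shelter": 0.5, "transportation": 0.3}

all_items = aggregate_index(elementary_indices, basket_weights)
```

With these invented values the aggregate is 0.2 × 104 + 0.5 × 102 + 0.3 × 110 = 104.8. Both sources of error named above act through `basket_weights`: either the weights themselves are estimated with error, or they no longer reflect current spending patterns.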

9.21 The Survey of Household Spending (SHS), the Canadian System of National Accounts’ (CSNA) estimates of Household Final Consumption Expenditures (HFCE) and additional data sources used to derive the CPI basket weights are all subject to both sampling and non-sampling error.^Note Statistical errors in the CPI basket weights can have an important effect on measured price change for aggregate price indices in the short run. However, empirical studies suggest variations in the basket weights are unlikely to have a big impact on the calculation of the all-items CPI over longer periods of time.^Note

9.22 The other potential source of error with respect to the basket weights is referred to as upper-level substitution bias. This bias arises because of the use of the Lowe formula, which is an asymmetrically weighted fixed-basket price index. Because the weights are obtained from a year that precedes the price reference period, the expenditures are not likely to be fully representative of consumer spending patterns in the price observation periods. This is because consumers tend to adjust their spending habits in response to changes in relative prices, buying more of the products whose prices have fallen or risen less rapidly, while reducing their purchases of products whose prices have increased the most. In other words, they substitute away from relatively more expensive products towards relatively cheaper ones. The asymmetrically weighted fixed-basket formula of the CPI does not account for these types of changes in consumer spending until a basket update is performed.

9.23 In contrast to the Lowe formula, there are five known symmetrically weighted price index formulae^Note which are theoretically free from upper-level substitution bias. These index formulae use expenditures from both the price reference period 0 and the price observation period t and therefore account for product substitutions that consumers may make. In this regard they are representative of consumer spending for the periods in which price change is being calculated.

9.24 While it would be preferable to calculate the CPI using a symmetrically weighted price index formula, current period expenditure weights are not available to support a timely production of the CPI. The non-revision policy^Note of the CPI also does not facilitate the use of forecasted current period expenditures in the calculation of the official CPI.

9.25 While maintaining the use of the Lowe formula, steps are taken to reduce upper level substitution bias by updating the expenditure weights frequently and implementing them with minimal time lag. Statistics Canada took a major step forward in this regard when it switched from a four-year basket update cycle to a two-year cycle with the release of the 2011 basket in March 2013, and then to an annual basket update with the release of the 2021 basket in June 2022. In addition, the lag with which the new basket was implemented was reduced from 18 months to 13 months.^Note

9.26 Upper-level substitution bias in the CPI can be estimated “after the fact” for past periods by comparing the results of the official CPI calculated with the Lowe formula to those calculated using one of the five symmetrically weighted indices, once expenditure data become available.^Note
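Such an after-the-fact comparison can be sketched as follows. The sketch uses the Fisher index as one well-known symmetrically weighted formula; this chapter does not state which formula is used for the official comparison, so that choice is an assumption. All prices and quantities are hypothetical, with b denoting the weight reference period, 0 the price reference period and t the price observation period.

```python
from math import sqrt


def basket_cost(prices, quantities):
    """Cost of buying the given quantities at the given prices."""
    return sum(p * q for p, q in zip(prices, quantities))


def lowe_index(p0, pt, qb):
    """Fixed-basket index using quantities from an earlier period b."""
    return basket_cost(pt, qb) / basket_cost(p0, qb)


def fisher_index(p0, pt, q0, qt):
    """Geometric mean of the Laspeyres and Paasche indices."""
    laspeyres = basket_cost(pt, q0) / basket_cost(p0, q0)
    paasche = basket_cost(pt, qt) / basket_cost(p0, qt)
    return sqrt(laspeyres * paasche)


# Hypothetical two-product economy: the first product doubles in price and
# consumers shift their purchases towards the second product.
p0, pt = [1.0, 1.0], [2.0, 1.0]
qb, q0, qt = [11, 9], [10, 10], [5, 12]

lowe = lowe_index(p0, pt, qb)
fisher = fisher_index(p0, pt, q0, qt)
estimated_bias = lowe - fisher
```

In this invented example the Lowe index exceeds the Fisher index, which is the direction of difference expected when consumers substitute away from products whose relative prices have risen.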

Notes

Note

Availability of the CPI data from Statistics Canada is discussed in Chapter 2.
