# The Canadian Consumer Price Index Reference Paper

Chapter 9 – Reliability and Uncertainty

**9.1** The Consumer Price Index (CPI) is widely used and trusted by Canadians. The index is never revised, which means it can be used to settle contracts without concern that those contracts may have to be reopened at a later time. The index release dates are typically announced a year in advance and firmly adhered to. The data usually become available three weeks after the price observation period. The index is available in considerable detail and without charge from Statistics Canada.

**9.2** As a sample-based statistic, the CPI, like all such statistics, cannot estimate with complete accuracy the underlying (but unobserved) ‘true’ value it aims to measure. Nevertheless, any statistical error or bias associated with the CPI is likely to be small enough to fall within the range of tolerance of most users.

**9.3** This chapter is about the error and bias properties of the CPI. Error refers to non-systematic inaccuracies that can be introduced at any stage of estimation. Errors that are systematic, meaning they lead to consistent over- or under-estimation of the phenomenon being measured, are called biases.

**9.4** The goal of this chapter is to inform users about the various ways in which statistical and non-statistical error enters the CPI and the steps taken by Statistics Canada to minimize that error. The chapter is organized under two main themes: the error associated with the estimation of indices at the lower level, and the error entering into the calculation of the CPI at the upper level.

## Error at the Lower Level of Consumer Price Index Calculation

**9.5** Since most elementary price indices are derived from statistical samples, they are subject to sampling error. Estimates of this kind always have sampling variance and may also have statistical bias, although efforts are made to minimize any such bias. Other things being equal, a larger sample size should yield a smaller sampling variance for a given elementary index.
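
The inverse relationship between sample size and sampling variance can be illustrated with a small simulation. The sketch below is not Statistics Canada's method: it assumes, purely for illustration, that price relatives are independent draws from a normal distribution and that the elementary index is their simple arithmetic mean. The function name and all parameter values are hypothetical.

```python
import random
import statistics


def simulated_index_variance(sample_size, replications=500, seed=0):
    """Variance of a simulated elementary index across repeated samples.

    Assumption (illustrative only): each price relative is an independent
    draw from a normal distribution with mean 1.02 and standard deviation
    0.10, and the index is the arithmetic mean of the sampled relatives.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    estimates = []
    for _ in range(replications):
        relatives = [rng.gauss(1.02, 0.10) for _ in range(sample_size)]
        estimates.append(statistics.fmean(relatives))
    # Spread of the index estimate over the repeated samples
    return statistics.variance(estimates)


variance_small_sample = simulated_index_variance(sample_size=25)
variance_large_sample = simulated_index_variance(sample_size=400)
print(variance_small_sample, variance_large_sample)
```

Under these assumptions the variance of the mean equals the variance of a single relative divided by the sample size, so the second figure should be roughly one sixteenth of the first.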

**9.6** Most of Statistics Canada’s surveys have samples drawn randomly from a frame of all in-scope units. Information about the number and size of units in the statistical population makes it possible to analyze the sample properties and to estimate the variance and bias associated with any calculated estimates. If this were the case for the CPI, it would be possible to report, for each elementary price index, a corresponding estimate of its sampling variance and bias. However, no comprehensive frame of all consumer products is available, and for this reason it is generally not possible to estimate the variance and bias of elementary price indices.

**9.7** For a small number of elementary aggregates in the CPI – notably drivers’ licenses, passenger and vehicle registration fees – a single price rules the market within each geographical stratum. As a result, these elementary price indices do not have sampling error.

**9.8** There are also cases where, although prices vary, information is available on virtually all consumer transactions, and therefore estimates of price change have minimal sampling error. An example is the tuition fees index in the CPI, for which data are available on prices and enrolment by program for every university.

**9.9** In the CPI there are also some elementary price indices that are not calculated via sampling and price observation, but rather by imputation. For the most part these elementary aggregates are individually small residual groupings of products that serve to make the classification exhaustive. The statistical error of these imputed elementary price indices would be similar to that of the donor indices. Since many of these imputed elementary price indices are “catch-all” categories, which are individually small and dispersed across the CPI basket, it is unlikely that this estimation method results in a significant increase in the error of the CPI.

**9.10** In an ideal, simple situation an elementary aggregate would refer to a group of homogeneous products, and accurate information on the prices and quantities of all consumer transactions would be available in a timely manner. In such a case, the average transaction price (unit value) for one month divided by the average transaction price in the previous month would provide an accurate estimate of price change for the elementary aggregate.
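
The unit value calculation described above can be sketched in a few lines. The transaction data, the function names and the scaling of the result to a base of 100 are illustrative assumptions, not figures or conventions taken from the CPI.

```python
def unit_value(transactions):
    """Average transaction price: total value divided by total quantity.

    `transactions` is a list of (price, quantity) pairs for one month.
    """
    total_value = sum(price * quantity for price, quantity in transactions)
    total_quantity = sum(quantity for _, quantity in transactions)
    if total_quantity <= 0:
        raise ValueError("total quantity must be positive")
    return total_value / total_quantity


def unit_value_index(current_month, previous_month):
    """Month-over-month price change as a ratio of unit values (base 100)."""
    return 100.0 * unit_value(current_month) / unit_value(previous_month)


# Hypothetical transactions for a homogeneous product: (price, quantity)
previous_month = [(2.00, 100), (2.20, 50)]
current_month = [(2.10, 90), (2.30, 60)]

index = unit_value_index(current_month, previous_month)
print(round(index, 2))
```

Because the unit value weights each price by the quantity sold at that price, the ratio is only meaningful as a measure of price change when the products are homogeneous, which is the condition the paragraph sets out.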

**9.11** In reality, product classes are rarely fully homogeneous and full transaction information is rarely available. For this reason elementary price indices must be estimated using sampling methods. Additionally, elementary aggregates in the CPI usually include many varieties of competing products and outlets entering and exiting the market. Because of these complexities, which are common to most elementary aggregates, there is potential for error at the lower level of CPI estimation.

**9.12** The general sampling approach for the CPI involves three stages. Two of the stages (geography and outlets) use full or partial frames for the selection of sampling units. However, except for scanner data, there is no comprehensive frame for all products that consumers buy. Therefore, in the vast majority of cases, the third stage, in which representative products (RP) are designated, is done judgmentally. Sampling error can be introduced at any of the stages of the sample selection process. The potential for sampling error is greater in the selection of outlets than in the selection of geographies; it is greatest for product selection because there is no comprehensive frame from which to select units for sampling. Since the CPI sample is selected using some partial frames and judgmental methods, it is not possible to estimate accurately the sampling error of elementary price indices.

**9.13** Error at the lower level of the CPI may arise because of delays in introducing new products into the sample. The matched-model approach used in the CPI requires comparison of identical product offers (PO) over time, and as a result there is a delay between the time when new products appear in the market and when their corresponding price movements are captured in the CPI. This type of error can never be completely eliminated while continuing to use the matched-model approach. However, such errors can be mitigated with improved and timely sample management.
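
A minimal sketch of the matched-model idea follows. It assumes, for illustration only, that the elementary index is a geometric mean of price relatives over the product offers priced in both periods; the source text does not specify the averaging formula, and the product offer identifiers and prices are hypothetical.

```python
import math


def matched_model_index(previous_prices, current_prices):
    """Price change over product offers observed in both periods (base 100).

    Assumption (illustrative): an unweighted geometric mean of price
    relatives. Offers present in only one period are left out of the
    comparison entirely.
    """
    matched_ids = sorted(set(previous_prices) & set(current_prices))
    if not matched_ids:
        raise ValueError("no product offers are matched across the two periods")
    log_relatives = [
        math.log(current_prices[po] / previous_prices[po]) for po in matched_ids
    ]
    index = 100.0 * math.exp(sum(log_relatives) / len(log_relatives))
    return index, matched_ids


previous_prices = {"PO-A": 10.0, "PO-B": 20.0, "PO-C": 5.0}
current_prices = {"PO-A": 11.0, "PO-B": 20.0, "PO-D": 7.0}  # PO-D is new

index, matched_ids = matched_model_index(previous_prices, current_prices)
print(matched_ids, round(index, 2))
```

In this example the new offer `PO-D` contributes nothing to the measured price change until it has been priced in two consecutive periods, which is the delay the paragraph describes.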

**9.14** As with new products, a delay in the introduction of new outlets into the CPI sample can be a source of error. In a competitive retail market, new outlets appear from time to time offering different levels of services or prices. As a result, consumers may change where they shop. The CPI does not immediately capture price movements resulting from changes in the retail landscape because the outlet sample is not redrawn every month. As a result, error from outlet substitution can occur. To counteract this type of error, the CPI outlet sample needs to be refreshed frequently to capture price movements in new outlets.

**9.15** Bias arising from the emergence of new outlets occurs when new stores enter the market offering lower prices, thereby inducing consumers to switch outlets. Again, this source of potential bias is difficult to avoid completely, but efforts are made to refresh the outlet sample periodically to minimize it.

**9.16** Other types of error associated with the estimation of elementary price indices include various processing and clerical errors. Error can arise from the various corrections and adjustments that are made to collected prices. The different methods used to adjust for quality change in the CPI can be imperfect and as such represent a source of error. However, efforts are made to review the methods continuously and to ensure that the most appropriate quality adjustment methods are applied. Previous studies have found little indication that quality adjustments across the elementary aggregates of the CPI are consistently biased in an upward or downward direction.

**9.17** Clerical errors might occur when POs are being recorded by the price collection agents. However, efforts are made to minimize errors of this kind. The Computer-Assisted Personal Interview (CAPI) devices used by the price collection agents employ automatic tolerance checks and alert the agents to any suspicious values as they are transcribed. In addition, once the data have been transmitted to Statistics Canada headquarters they are subjected to further verification and analysis. When unusual prices or price movements are detected, subject matter experts sometimes send a “Request for Additional Information” memo back to the price collection agents to obtain additional explanatory information on the PO. The same scrutiny is applied to prices coming from scanner data, web-scraped data or other administrative data sources.
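
A tolerance check of the kind mentioned above might look like the following sketch. The rule (flag any price that is non-positive or that moves by more than a set proportion relative to the previous observation), the 50% threshold and the sample records are all assumptions made for illustration; the actual checks built into the CAPI devices are not described in this chapter.

```python
def is_suspicious(previous_price, current_price, tolerance=0.5):
    """Return True when a price observation should be queried.

    Hypothetical rule: flag non-positive prices, and flag any relative
    change from the previous price that exceeds `tolerance` (0.5 = 50%).
    """
    if previous_price <= 0 or current_price <= 0:
        return True
    relative_change = abs(current_price / previous_price - 1.0)
    return relative_change > tolerance


# Hypothetical records: (product offer id, previous price, current price)
observations = [
    ("PO-A", 4.00, 4.20),    # 5% increase, within tolerance
    ("PO-B", 4.00, 40.00),   # possible misplaced decimal point
    ("PO-C", 12.50, 0.00),   # zero price, likely a keying error
]

flagged = [po for po, prev, cur in observations if is_suspicious(prev, cur)]
print(flagged)
```

A flag here is only a prompt for review: a large movement may be genuine, which is why flagged values are followed up rather than rejected automatically.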

**9.18** CPI calculations are carried out with computer software, which largely eliminates the possibility of arithmetic errors. However, the potential for programming errors is present. In 2001 a new methodology was introduced for the traveller accommodation elementary price index and errors were made in programming the algorithm for the new method. When the error was discovered and corrected a few years later, it was found to have caused a downward bias in the elementary index. Since that time, more rigorous testing procedures have been put in place to ensure such errors do not go undetected.

**9.19** The CPI sampling strategy makes use of sample frames when selecting geographical collection areas, outlets and some products. These frames can be subject to different types of error. For instance, there are likely to be delays in updating the outlet frame to include new in-scope units and to remove units no longer in scope. In addition, information about the size of individual units – typically sales data – is also subject to possible error. Some of this information comes from administrative data sources such as tax records and some is derived from other Statistics Canada surveys. In either case, the unit size information is typically subject to both sampling and non-sampling error.

## Error at the Upper Level of Consumer Price Index Calculation

**9.20** The calculation of price indices at the upper level of the CPI is accomplished using the Lowe price index formula. To estimate any particular aggregate index, a weighted average of its component elementary price indices is calculated. There are two possible sources of error in this calculation, both of which relate to the basket weights used in the aggregation. The first refers to errors that could be present in the expenditure estimates used as weights. The second is substitution bias, which stems from the use of the Lowe formula at the upper level.
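
The aggregation step can be sketched as a weighted arithmetic mean of elementary indices. The component names, index values and weights below are invented for illustration, and the sketch assumes the basket weights have already been expressed in the form the Lowe formula requires; it does not reproduce how Statistics Canada derives or updates those weights.

```python
def aggregate_index(elementary_indices, basket_weights):
    """Weighted arithmetic mean of elementary price indices.

    Both arguments are dictionaries keyed by component name. Weights are
    normalized inside the function, so they need not sum to one.
    """
    if set(elementary_indices) != set(basket_weights):
        raise ValueError("indices and weights must cover the same components")
    total_weight = sum(basket_weights.values())
    weighted_sum = sum(
        basket_weights[name] * elementary_indices[name] for name in basket_weights
    )
    return weighted_sum / total_weight


# Hypothetical components of an aggregate
elementary_indices = {"food": 105.0, "shelter": 102.0, "transport": 110.0}
basket_weights = {"food": 0.2, "shelter": 0.5, "transport": 0.3}

aggregate = aggregate_index(elementary_indices, basket_weights)
print(round(aggregate, 2))
```

Both sources of error named in the paragraph enter through `basket_weights`: the weights may be mis-estimated, and they stay fixed while consumers' actual spending patterns shift.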

**9.21** The Survey of Household Spending (SHS), the Canadian System of National Accounts’ (CSNA) estimates of Household Final Consumption Expenditures (HFCE) and additional data sources used to derive the CPI basket weights are all subject to both sampling and non-sampling error. Statistical errors in the CPI basket weights can have an important effect on measured price change for aggregate price indices in the short run. However, empirical studies suggest that variations in the basket weights are unlikely to have a large impact on the calculation of the all-items CPI over longer periods of time.

**9.22** The other potential source of error with respect to the basket weights is referred to as upper-level substitution bias. This bias arises because of the use of the Lowe formula, which is an asymmetrically weighted fixed-basket price index. Because the weights are obtained from a year that precedes the price reference period, the expenditures are not likely to be fully representative of consumer spending patterns in the price observation periods. This is because consumers tend to adjust their spending habits in response to changes in relative prices, buying more of the products whose prices have fallen or risen less rapidly, while reducing their purchases of products whose prices have increased the most. In other words, they substitute away from relatively more expensive products towards relatively cheaper ones. The asymmetrically weighted fixed-basket formula of the CPI does not account for these types of changes in consumer spending until a basket update is performed.

**9.23** In contrast to the Lowe formula, there are five known symmetrically weighted price index formulae that are theoretically free from upper-level substitution bias. These index formulae use expenditures from both the price reference period 0 and the price observation period *t* and therefore account for product substitutions that consumers may make. In this regard they are representative of consumer spending for the periods in which price change is being calculated.

**9.24** While it would be preferable to calculate the CPI using a symmetrically weighted price index formula, current period expenditure weights are not available to support a timely production of the CPI. The non-revision policy of the CPI also does not facilitate the use of forecasted current period expenditures in the calculation of the official CPI.

**9.25** While maintaining the use of the Lowe formula, steps are taken to reduce upper-level substitution bias by updating the expenditure weights frequently and implementing them with minimal time lag. Statistics Canada took a major step forward in this regard when it switched from a four-year basket update cycle to a two-year cycle with the release of the 2011 basket in March 2013, and then to an annual basket update with the release of the 2021 basket in June 2022. In addition, the lag with which the new basket was implemented was reduced from 18 months to 13 months.

**9.26** Upper-level substitution bias in the CPI can be estimated “after the fact” for past periods by comparing the results of the official CPI calculated with the Lowe formula to those calculated using one of the five symmetrically weighted indices, once expenditure data become available.
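
The comparison can be sketched with a toy two-product economy. The example uses the Fisher index, a well-known symmetrically weighted formula, on the assumption that it is among the five referred to in this chapter; the chapter itself does not name them. All prices and quantities are invented, and the basket quantities are set equal to the period 0 quantities to keep the arithmetic simple.

```python
import math


def basket_cost(prices, quantities):
    """Cost of buying the given quantities at the given prices."""
    return sum(p * q for p, q in zip(prices, quantities))


def lowe_index(prices_0, prices_t, basket_quantities):
    """Fixed-basket index: the same basket priced in periods 0 and t."""
    return 100.0 * basket_cost(prices_t, basket_quantities) / basket_cost(
        prices_0, basket_quantities
    )


def fisher_index(prices_0, prices_t, quantities_0, quantities_t):
    """Geometric mean of the Laspeyres and Paasche indices."""
    laspeyres = basket_cost(prices_t, quantities_0) / basket_cost(prices_0, quantities_0)
    paasche = basket_cost(prices_t, quantities_t) / basket_cost(prices_0, quantities_t)
    return 100.0 * math.sqrt(laspeyres * paasche)


# Hypothetical data: product 1 rises in price, consumers shift to product 2
prices_0 = [1.0, 1.0]
prices_t = [1.5, 1.0]
quantities_0 = [10.0, 10.0]
quantities_t = [6.0, 14.0]
basket_quantities = quantities_0  # simplifying assumption

lowe = lowe_index(prices_0, prices_t, basket_quantities)
fisher = fisher_index(prices_0, prices_t, quantities_0, quantities_t)
estimated_bias = lowe - fisher  # in index points
print(round(lowe, 2), round(fisher, 2), round(estimated_bias, 2))
```

In this constructed case the fixed-basket index exceeds the symmetric one because it ignores the shift towards the product whose price did not rise. The size of the gap depends entirely on the invented data and says nothing about the magnitude of the bias in the actual CPI.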
