2 Horvitz-Thompson estimators and the SPAR index
Jan de Haan and Rens Hendriks
The typical aim of survey sampling is to estimate the
total or (arithmetic) mean of some variable for a finite population. In a
housing context we may want to estimate the total value of the housing stock
in, say, period 0. Let $U^0$ denote the housing stock of size $N^0$ and $p_i^0$ the value of house $i$. The target to be estimated is
$$V^0 = \sum_{i \in U^0} p_i^0. \qquad (2.1)$$
Suppose we have a sample $S^0$ consisting of $n^0$ houses sold in the base period. If the houses
were selected by simple random sampling from the housing stock $U^0$, where each house had the same inclusion
probability $n^0 / N^0$, then the Horvitz-Thompson estimator
$$\hat{V}^0 = \frac{N^0}{n^0} \sum_{i \in S^0} p_i^0 \qquad (2.2)$$
is an unbiased estimator of (2.1); see, e.g., Cochran (1977).
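The unbiasedness claim can be checked by brute force on a toy population. The following is a minimal sketch, assuming simple random sampling without replacement; the house values and the function name `ht_total` are made up for illustration and do not come from the paper.

```python
from fractions import Fraction
from itertools import combinations


def ht_total(sample_values, population_size):
    """Horvitz-Thompson estimate of a population total under simple random
    sampling without replacement. Each house has inclusion probability n/N,
    so each sampled value receives the weight N/n."""
    n = len(sample_values)
    return Fraction(population_size, n) * sum(sample_values)


# Hypothetical base period stock (values in thousands); illustrative only.
stock_values = [180, 220, 250, 310, 640]
N = len(stock_values)
true_total = sum(stock_values)

# Average the estimator over every possible sample of size 2.
estimates = [ht_total(s, N) for s in combinations(stock_values, 2)]
mean_estimate = sum(estimates) / len(estimates)

print(true_total, mean_estimate)
```

Individual estimates can be far from the true total, but their average over all equally likely samples coincides with it, which is what unbiasedness means here.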
A natural target, though not the only possibility, for a house price index would be the value
change of a fixed housing stock. Conditioning on the base period stock has two implications: additions to the stock
(mostly newly-built houses) should be excluded and the price changes of
existing properties should be adjusted for quality changes, i.e., for the impact of depreciation,
renovations and extensions. For convenience we assume that such quality changes
are negligible. In that case the target price index going from the base period
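0 to the comparison period $t$ is defined as
$$P^{0t} = \frac{\sum_{i \in U^0} p_i^t}{\sum_{i \in U^0} p_i^0}, \qquad (2.3)$$
with obvious notation ($p_i^t$ is the period $t$ value of house $i$ in the base period stock $U^0$). Suppose that we also have a
sample $S^t$ consisting of $n^t$ houses sold in period $t$ and assume that it is an independent random
draw from the base period stock. The ratio of the Horvitz-Thompson estimators
in both periods, which reduces to the ratio of the sample means over $S^t$ and the base period sample $S^0$ (of size $n^0$) because the stock size cancels,
$$\hat{P}^{0t} = \frac{\sum_{i \in S^t} p_i^t / n^t}{\sum_{i \in S^0} p_i^0 / n^0}, \qquad (2.4)$$
might seem a natural estimator of our target index
(2.3). However, if the samples $S^0$ and $S^t$ are independently drawn, the variance of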
estimator (2.4) can be substantial. Moreover, an estimated ratio such as (2.4)
has a bias that depends on the variance of the denominator and the covariance of
the numerator and the denominator (Cochran 1977). From an index number
perspective the issue at stake is that the mix of properties traded in period $t$ differs from that in period 0. That is, we are
not comparing like with like.
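The mix problem can be made concrete with a small numerical sketch. Everything below is an illustrative assumption rather than data from the paper: a hypothetical stock of four houses whose values all rise by exactly 10%, with cheaper houses sold in the base period and more expensive ones sold in the comparison period.

```python
def ratio_of_sample_means(prices_t, prices_0):
    """Ratio of the mean sale price in period t to the mean sale price in
    period 0, computed from two unrelated samples."""
    mean_t = sum(prices_t) / len(prices_t)
    mean_0 = sum(prices_0) / len(prices_0)
    return mean_t / mean_0


# Hypothetical base period values (in thousands) of the whole stock.
base_value = {"A": 200.0, "B": 300.0, "C": 400.0, "D": 500.0}
# Assumption: every house appreciates by exactly 10%.
value_t = {house: 1.10 * v for house, v in base_value.items()}

# Target index: value change of the fixed stock.
target_index = sum(value_t.values()) / sum(base_value.values())

# Different houses are sold in each period.
sample_0 = ["A", "B"]
sample_t = ["C", "D"]
estimate = ratio_of_sample_means(
    [value_t[h] for h in sample_t],
    [base_value[h] for h in sample_0],
)

print(round(target_index, 4), round(estimate, 4))
```

Although every house rose by 10%, the ratio of sample means reports a near doubling, purely because the period $t$ sample contains more expensive houses.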
The standard approach to estimating price indexes relies
on the matched model methodology where prices $p_i^0$ and $p_i^t$ are observed for a fixed panel of items. The
use of panel data ensures that like is compared with like and will reduce the
variance of the ratio estimator because $p_i^0$ and $p_i^t$ are typically positively correlated. However,
unless the samples $S^0$ and $S^t$ are extraordinarily large, there will only be
few matched houses, if any. Hence, while prices $p_i^t$ are observed for the houses belonging to $S^t$, for most of those houses the base period
prices $p_i^0$ are 'missing'. What may be available instead
are government assessments $a_i^0$. We could use these as base period values and
construct the following (pseudo) matched-model estimator of house price change:
$$\frac{\sum_{i \in S^t} p_i^t / n^t}{\sum_{i \in S^t} a_i^0 / n^t}. \qquad (2.5)$$
A problem associated with estimator (2.5) is that the
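base period index number will differ from 1 because the appraisals $a_i^0$ differ from the selling prices $p_i^0$. Rescaling (2.5) by dividing it by its base
period value is an obvious solution, yielding
$$\hat{P}_{\mathrm{SPAR}}^{0t} = \frac{\sum_{i \in S^t} p_i^t / n^t}{\sum_{i \in S^t} a_i^0 / n^t} \cdot \frac{\sum_{i \in S^0} a_i^0 / n^0}{\sum_{i \in S^0} p_i^0 / n^0} = \frac{\sum_{i \in S^t} p_i^t / n^t}{\sum_{i \in S^0} p_i^0 / n^0} \left[ \frac{\sum_{i \in S^0} a_i^0 / n^0}{\sum_{i \in S^t} a_i^0 / n^t} \right]. \qquad (2.6)$$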
Note that the rescaling factor is stochastic, as it
is a ratio of sample means for the base period, and will increase the variance
of (2.6) as compared to the estimator given by (2.5), depending on the
correlations between the appraisals and the selling prices. Details can be
found in de Haan (2007). But we cannot circumvent rescaling since a price index
that does not start at the value 1 would be meaningless.
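The rescaling step can be sketched in a few lines. This is a minimal illustration, not the production method of any statistical agency. The function names and all figures are hypothetical, and the data assume that appraisals sit at exactly 80% of base period value while every house appreciates by 10%.

```python
def mean(values):
    return sum(values) / len(values)


def appraisal_ratio(prices, appraisals):
    """Unscaled estimator: mean sale price over mean base period appraisal
    for the same set of sold houses."""
    return mean(prices) / mean(appraisals)


def spar_index(prices_t, appraisals_t, prices_0, appraisals_0):
    """Unscaled estimator for period t divided by its base period value."""
    return appraisal_ratio(prices_t, appraisals_t) / appraisal_ratio(
        prices_0, appraisals_0
    )


# Hypothetical base period sales (prices and appraisals in thousands).
prices_0 = [200.0, 300.0]
appraisals_0 = [160.0, 240.0]

# Hypothetical period t sales of two other, more expensive houses.
prices_t = [440.0, 550.0]
appraisals_t = [320.0, 400.0]

unscaled_base = appraisal_ratio(prices_0, appraisals_0)
base_index = spar_index(prices_0, appraisals_0, prices_0, appraisals_0)
index_t = spar_index(prices_t, appraisals_t, prices_0, appraisals_0)

print(unscaled_base, base_index, round(index_t, 4))
```

In this toy setting the unscaled estimator starts at 1.25 rather than 1, whereas the rescaled index starts at 1 and recovers the assumed 10% price change even though different houses are sold in the two periods. With real appraisals, which are only imperfectly related to sale prices, the recovery would not be exact.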
Expression (2.6) is called a Sale Price Appraisal Ratio
(SPAR) index. The SPAR method has been applied in the Netherlands
since January 2008 to measure the price change of owner-occupied dwellings. As mentioned earlier, we assume that
the SPAR index aims at tracking the price change of the housing stock, which is a measure of the change in wealth. In the
context of the Harmonized Index of Consumer Prices, on the other hand, the house
price index should measure the price change of the houses sold during the base period (Makaronidis and Hayes 2006;
Eurostat 2010). Under the latter concept there would be no sampling involved if
all transactions are recorded and used in the compilation of the index, as is
the case in the Netherlands.
The second expression on the right-hand side of (2.6)
writes the SPAR index as the product of two factors, the ratio of sample means
and a factor between brackets. As the SPAR index is essentially based on the
matched model methodology (using base period appraisals instead of sale
prices), this factor adjusts the ratio of sample means for changes in the
quality mix of the samples that occur between period 0 and period $t$. A potential problem is that the SPAR index is not a panel-type estimator. A SPAR time
series, say for periods $t = 0, \ldots, T$, might therefore suffer from short-term
volatility due to mix changes, especially when the number of sales is low.
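The two-factor reading of the SPAR index can be sketched as follows. The function name `spar_factors` and the figures are hypothetical; the data assume appraisals at 80% of base period value and a uniform 10% price rise, with more expensive houses sold in period $t$ than in period 0.

```python
def mean(values):
    return sum(values) / len(values)


def spar_factors(prices_t, appraisals_t, prices_0, appraisals_0):
    """Return the two factors of the SPAR index: the ratio of sample mean
    prices, and the mix adjustment given by the ratio of mean base period
    appraisals in the period 0 sample to those in the period t sample."""
    ratio_of_means = mean(prices_t) / mean(prices_0)
    mix_adjustment = mean(appraisals_0) / mean(appraisals_t)
    return ratio_of_means, mix_adjustment


# Hypothetical sales (prices and base period appraisals in thousands).
prices_0, appraisals_0 = [200.0, 300.0], [160.0, 240.0]
prices_t, appraisals_t = [440.0, 550.0], [320.0, 400.0]

ratio_of_means, mix_adjustment = spar_factors(
    prices_t, appraisals_t, prices_0, appraisals_0
)
spar = ratio_of_means * mix_adjustment

print(round(ratio_of_means, 4), round(mix_adjustment, 4), round(spar, 4))
```

Here the mix adjustment is well below 1 because the period $t$ sample has higher average appraisals, and it pulls the inflated ratio of sample means back to the assumed price change. With few sales per period, both factors are computed from small samples, which is where the short-term volatility mentioned above would come from.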