4 Empirical illustration

Jan de Haan and Rens Hendriks


For the empirical study we used two data sets from different sources. The first data set contains the sale prices of nearly all transactions of existing houses (excluding newly-built houses) in the Netherlands between January 2003 and March 2009 as registered by the Dutch land registry office. The total number of observations amounts to 1,126,242 or approximately 15 thousand per month. The sales were recorded at the time the final agreement was made at the notary's office, on average six weeks after the preliminary sale was agreed on. The second data set contains the government appraisals, relating to January 2003, for all owner-occupied dwellings in the housing stock. Because addresses are available in both data sets, we know the sale price and the appraisal value for each transaction. Because the type of dwelling is also available, we were able to stratify by dwelling type and location.

The first thing we did was run unstratified OLS regressions of selling prices on appraisals, using model (3.8), for all 75 months. A selection of the results is listed in Table 4.1; detailed empirical material is available from the authors upon request. Not surprisingly, the coefficients $\hat{\beta}^t$ are different from zero at very low significance levels. In most cases the intercepts $\hat{\alpha}^t$ differ significantly from zero at the 5% level. Roughly 80 to 90% of the variation in selling prices is 'explained' by the variation in appraisals, as shown by the $R^2$ values. In other words, the correlation coefficient between selling prices and base period appraisals ranges from 0.89 to 0.95.
Figure 4.1 shows that $R^2$ diminishes slightly over time. As mentioned earlier, one of the reasons could be that different segments of the market exhibit different price changes. We were a bit surprised to find though that $R^2$ is not the highest in January 2003, being the appraisal reference period.
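The monthly regressions described above can be illustrated with a minimal sketch. It assumes that model (3.8) is a simple linear regression of selling price on base period appraisal with an intercept, which is what the reported intercepts and slopes suggest; the function name `ols_fit` and the data are hypothetical and are not taken from the Dutch data sets.

```python
from math import sqrt


def ols_fit(appraisals, prices):
    """Fit price = alpha + beta * appraisal by OLS; return (alpha, beta, R^2)."""
    n = len(prices)
    mean_a = sum(appraisals) / n
    mean_p = sum(prices) / n
    s_aa = sum((a - mean_a) ** 2 for a in appraisals)
    s_pp = sum((p - mean_p) ** 2 for p in prices)
    s_ap = sum((a - mean_a) * (p - mean_p) for a, p in zip(appraisals, prices))
    beta = s_ap / s_aa
    alpha = mean_p - beta * mean_a
    # In a regression with one regressor, R^2 equals the squared correlation.
    r_squared = s_ap ** 2 / (s_aa * s_pp)
    return alpha, beta, r_squared


# Hypothetical sales for one month (made-up numbers, for illustration only).
appraisals = [150_000, 200_000, 250_000, 300_000, 400_000]
prices = [160_000, 195_000, 270_000, 310_000, 405_000]

alpha, beta, r_squared = ols_fit(appraisals, prices)
print(f"alpha={alpha:.2f}  beta={beta:.3f}  R^2={r_squared:.3f}")

# R^2 values of 0.80 and 0.90 correspond to correlations of about 0.89 and 0.95.
print(round(sqrt(0.80), 2), round(sqrt(0.90), 2))
```

The last line shows why the two statements in the text are equivalent: with a single regressor, the correlation coefficient is the square root of $R^2$ in absolute value.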

Table 4.1
Regression results

Month              Alpha  t (Alpha)  Beta  t (Beta)  R squared
January 2003    1,900.49       2.26  0.98    275.19       0.87
January 2004    5,039.16       5.96  1.01    269.26       0.88
January 2005   -2,555.12       2.43  1.08    237.54       0.84
January 2006    1,282.14       1.41  1.11    286.39       0.87
January 2007   -7,567.99       6.36  1.19    243.72       0.83
January 2008   11,007.39       8.48  1.26    231.93       0.83
January 2009   16,677.31       9.83  1.30    184.24       0.81

Figure 4.1  R squared values

Based on the above regression results, we computed GREG price index numbers according to equation (3.10). From January 2003 until mid-2008 house prices in the Netherlands increased by some 25% but then started to fall, probably due to the financial and economic crisis. Importantly, as Figure 4.2 makes clear, the GREG index turns out to be a lot smoother than the simple ratio of sample means, which is precisely what the index was designed for.
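The difference between the two measures can be sketched as follows. The sketch assumes that the GREG estimate of the mean price in a period is the fitted regression line evaluated at the mean base period appraisal of the whole housing stock, and that the index is the ratio of these estimates for the current and base periods; equation (3.10) is not reproduced in this section, so this form is an assumption. The helper names and all numbers are hypothetical.

```python
def ols_slope(appraisals, prices):
    """OLS slope of price on appraisal (regression with an intercept)."""
    n = len(prices)
    mean_a = sum(appraisals) / n
    mean_p = sum(prices) / n
    s_aa = sum((a - mean_a) ** 2 for a in appraisals)
    s_ap = sum((a - mean_a) * (p - mean_p) for a, p in zip(appraisals, prices))
    return s_ap / s_aa


def greg_mean_price(prices, appraisals, stock_mean_appraisal):
    """Sample mean price, adjusted for the gap between the stock's mean
    appraisal and the mean appraisal of the houses that happened to be sold."""
    n = len(prices)
    mean_p = sum(prices) / n
    mean_a = sum(appraisals) / n
    beta = ols_slope(appraisals, prices)
    return mean_p + beta * (stock_mean_appraisal - mean_a)


# Hypothetical stock with a mean base period appraisal of 200.
stock_mean_appraisal = 200.0

# Base period: sold houses are representative of the stock.
base_appraisals = [100.0, 200.0, 300.0]
base_prices = [100.0, 200.0, 300.0]

# Later period: prices are 10% higher, but more expensive houses were sold.
later_appraisals = [200.0, 300.0, 400.0]
later_prices = [220.0, 330.0, 440.0]

sample_means_ratio = (sum(later_prices) / 3) / (sum(base_prices) / 3)
greg_index = (
    greg_mean_price(later_prices, later_appraisals, stock_mean_appraisal)
    / greg_mean_price(base_prices, base_appraisals, stock_mean_appraisal)
)

print(f"ratio of sample means: {sample_means_ratio:.3f}")  # 1.650
print(f"GREG index:            {greg_index:.3f}")          # 1.100
```

In this toy example the ratio of sample means mixes the 10% price change with the shift in the composition of the sample, whereas the regression adjustment removes the compositional part.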

Figure 4.2  GREG index and ratio of sample means

Figure 4.3 compares the GREG index with the SPAR index. In general the trends of both indexes are very similar, although there appears to be a small difference towards the end of the period. Figure 4.4 shows that the month-to-month changes in the GREG and SPAR indexes do not differ much either, the GREG index being just slightly less volatile. We can therefore conclude that, at the nationwide level, both methods generate more or less equal results. Note that the SPAR index in Figures 4.3 and 4.4 is not the official SPAR index published by Statistics Netherlands. We computed a fixed base index using appraisals for January 2003 only, whereas the official index is a chained index based on appraisals for various reference periods; see also Section 5.3.
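A fixed base SPAR index and the month-to-month changes plotted in Figure 4.4 can be sketched as follows. The sketch assumes the usual sale price appraisal ratio form, in which the ratio of mean sale price to mean base period appraisal of the houses sold in a period is divided by the same ratio for the base period; the function names and the numbers are hypothetical.

```python
def spar_index(prices_t, appraisals_t, prices_0, appraisals_0):
    """Fixed base SPAR index: sale price appraisal ratio in period t
    divided by the sale price appraisal ratio in the base period."""
    # A ratio of means over the same sample equals a ratio of sums.
    ratio_t = sum(prices_t) / sum(appraisals_t)
    ratio_0 = sum(prices_0) / sum(appraisals_0)
    return ratio_t / ratio_0


def monthly_pct_changes(index_series):
    """Month-to-month percentage changes of an index series."""
    return [
        100.0 * (current / previous - 1.0)
        for previous, current in zip(index_series, index_series[1:])
    ]


# Hypothetical sales: base period, and a later period with 10% higher prices.
base_appraisals = [100.0, 200.0, 300.0]
base_prices = [100.0, 200.0, 300.0]
later_appraisals = [200.0, 300.0, 400.0]
later_prices = [220.0, 330.0, 440.0]

spar = spar_index(later_prices, later_appraisals, base_prices, base_appraisals)
print(f"SPAR index: {spar:.3f}")  # 1.100

# Hypothetical index series for three consecutive months.
changes = monthly_pct_changes([100.0, 102.0, 100.98])
print([round(c, 2) for c in changes])  # [2.0, -1.0]
```

Because every sale is compared with its own base period appraisal, a shift towards more expensive houses does not distort the index in this example.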

Figure 4.3  GREG and SPAR indexes

Figure 4.4  GREG and SPAR: month-to-month percentage changes

Next we stratified the data by twelve provinces and five types of dwellings, ran OLS regressions per month for the resulting 60 strata and calculated GREG indexes as well as sample means ratios. Figure 4.5 displays the results for one stratum, apartments in the province of Friesland. Due to the relatively low number of observations there are some dramatic spikes, for instance in September 2009 when the ratio of sample means increases by 50%. Again, the GREG index is smoother than the ratio of sample means (but still very volatile) and strikingly similar to the SPAR index. The same picture emerges for the other strata, so we do not present those results.

Figure 4.5  GREG and SPAR indexes and ratio of sample means; apartments in the province of Friesland

Finally, using the stratum results, we computed stratified GREG indexes for the whole country according to equation (3.13), where the base period appraisal shares serve as stock value weights. As can be seen from Figure 4.6, there are hardly any differences between the stratified and unstratified GREG indexes, suggesting that sample selection bias is not a major issue. Figure 4.6 also shows a second alternative GREG price index, computed according to equation (3.15), which is based on OLS regressions of the dummy variable model (3.14). Again, the differences with the original GREG index appear to be small.
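The aggregation step can be sketched as a weighted average of stratum indexes. The sketch assumes that equation (3.13) is a weighted arithmetic mean in which each stratum's weight is its share in the total base period appraisal value of the housing stock; the equation itself is not reproduced in this section, and the function name, stratum labels and numbers are hypothetical.

```python
def stratified_index(stratum_indexes, base_appraisal_totals):
    """Weighted arithmetic mean of stratum indexes, weighted by each
    stratum's share in the total base period appraisal value."""
    total_value = sum(base_appraisal_totals[k] for k in stratum_indexes)
    return sum(
        base_appraisal_totals[k] / total_value * stratum_indexes[k]
        for k in stratum_indexes
    )


# Hypothetical example with two strata instead of the full stratification.
stratum_indexes = {"apartments": 1.10, "terraced houses": 1.20}
base_appraisal_totals = {"apartments": 300.0, "terraced houses": 700.0}

national_index = stratified_index(stratum_indexes, base_appraisal_totals)
print(f"stratified index: {national_index:.3f}")  # 1.170
```

Weighting by stock value rather than by the number of sales keeps strata with few transactions from being underrepresented in the national figure.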

Figure 4.6  GREG, stratified GREG and dummy variable GREG indexes

It should be noted that even within strata some houses are still more likely to sell than others, in particular during the crisis after 2008, so that some sample selection bias in the GREG and SPAR indexes will remain. The direction and magnitude of this bias could only be predicted if data on property characteristics were available to estimate the likelihood of houses being sold. Also, as was mentioned earlier, too detailed a stratification will increase both the sampling variance and sample bias if the number of houses sold is extremely low, and may raise rather than reduce the mean square error of the estimators.
