Bayesian small area demography
Section 2. Mortality rates for Māori

2.1 The estimation problem

Mortality rates are a fundamental measure of human welfare, as well as a major performance indicator for the health sector. Mortality rates are also a key input for population forecasts, and for the life insurance industry.

New Zealand’s national statistical office (Stats NZ), publishes estimates of mortality rates for Māori by sex and by one-year age groups. These rates are “super-population” estimates. Super-population mortality rates measure the underlying risk of dying. They can be contrasted with finite-population rates, which measure the actual number of deaths divided by the actual population at risk. Suppose, for instance, that no 6-year-old Māori die in a particular year. The finite-population mortality rate is exactly zero, but the underlying risk of dying, and hence the super-population mortality rate, is presumably non-zero.

To derive death rates we need death counts and measures of population at risk. New Zealand’s death registration system is efficient and complete, and reporting of ethnicity on the death registrations is generally reliable (Bryant and Howard, 2017), so data on death counts can be treated as error-free. Finding appropriate measures of population at risk is more challenging. Population at risk is measured using person-years. For instance, if a person spends 9 months in New Zealand during the period of interest, then that person contributes 0.75 person-years to the population at risk. Ideally, population at risk would be obtained by summing up person-years contributed by each person in the population. However, such data can be difficult to obtain. Instead, demographers typically approximate population at risk using population count multiplied by length of period. Population counts for Māori in New Zealand are relatively accurate for census years (Bryant, Dunstan, Graham, Matheson-Dunning, Shrosbree and Speirs, 2016), but become less accurate away from census years, because it is not possible to tell, from international migration data, how many Māori are entering and leaving the country. In addition, Stats NZ does not treat ethnicity as a characteristic that is fixed at birth, but rather as an aspect of personal identity that individuals can change over their lifetimes.

In response to the difficulties in estimating Māori population counts outside census years, Stats NZ focuses on periods centered on census years. Censuses are normally carried out every 5 years in New Zealand, though the 2011 census was postponed until 2013 because of an earthquake. The standard approach to mortality estimation is to use three-year periods, centered on a census year, such as 2012-2014. Using a three-year period gives larger numbers of death counts in each age-sex cell, and hence more stable estimates, than would be the case with single-year periods. To approximate the population at risk over a three-year period, Stats NZ uses the population count at the middle of the period, that is on June 30 of the census year, multiplied by 3.

To give an idea of the modelling challenge, Figure 2.1 shows direct estimates of mortality rates on a log scale for Māori males in 2012-2014, for single-year age groups $0, 1, \dots, 100 + .$ Direct estimates of mortality rates are simply death counts for each age-sex cell divided by the population at risk for that cell. The diameter of each circle in Figure 2.1 is proportional to the square root of the number of deaths. Altogether, there were 9,170 deaths during the period, with the largest cell consisting of 130 deaths, two cells having 0 deaths, and a median death count of 27. The Māori male population on June 30, 2013 was 328,000, giving a population at risk of $328,000 \times 3= 984,000$ person-years.

Figure 2.1 Direct estimates of mortality rates for Māori males in 2012-2014, by single year of age. The diameter of the circles are proportional to the square root of the number of deaths during the period

Description for Figure 2.1

This figure presents a scatter plot of the direct estimates of mortality rates for Mori males in 2012-2014, by single year of age. The death rate is on the y-axis, in the log scale, going from 1e-04 to 1. Age is on the x-axis going from 0 to 100. Death rates are relatively high during the first year of life, before falling to very low levels. Somewhere between age 10 and age 20, death rates rise sharply, and then climb slowly up to about age 35. The death rates between ages 40 and 90 appear to fall on a straight increasing line. The trends in death rates are difficult to pin down above age 90. The death rate for 99-year-olds is over 1.

Estimating underlying death rates between ages 40 and 90 is relatively easy. There are plenty of data, and, when shown on a log scale, the rates appear to fall on a straight line.

Somewhere between age 10 and age 20, death rates rise sharply, and then climb slowly up to about age 35. Many countries have a similar pattern of unusually high mortality rates in the late teenage years and 20s, particularly among males. The phenomenon is referred to as the “accident hump” (meaning, mainly, car accidents), though in many places, including New Zealand, it would be more accurate to call it an accident and suicide hump.

Death rates are relatively high during the first year of life, before falling to very low levels. Exactly how low these rates go is difficult to tell, because death counts are small and the associated direct estimates are highly erratic. The same problem also exists above age 90, where trends in death rates are difficult to pin down.

The death rate for 99-year-olds is over 1. This implies that the number of deaths of 99-year-olds is greater than the (approximate) number of person-years lived during the period 2012-2014 by 99-year-olds. Rates, unlike probabilities, have no upper bound. Consider, for instance, a population consisting of one person, who dies 9 months into a one-year period. The implied death rate for that period is $1 / 0 .75 \approx 1 .33 .$

2.2 The model

2.2.1 Model specification

Our input data are death counts $y_{a s t}$ and population at risk $n_{a s t} .$ Subscript $a$ denotes age group; subscript $s$ denotes sex; and subscript $t$ denotes time, taking values 2005-2007 and 2012-2014. Using two periods allows us to borrow strength across time, and also to study change over time.

We model death counts as draws from a Poisson distribution,

$y_{a s t} \sim Poisson (γ_{a s t} n_{a s t}), (2.1)$

where $γ_{a s t}$ is the super-population mortality rate. We calculate $n_{a s t}$ by multiplying the population at June 30 in the census years by 3, and treat it as error-free. The main goal of the modelling is to estimate $γ_{a s t} .$

Traditionally, demographers have ignored the fact that, even after knowing the population at risk and the underlying death rate, the actual number of deaths is still random and therefore uncertain. With large cell counts, such as for national populations, this uncertainty is small, so ignoring it is sensible. With small cell counts, however, this uncertainty is substantial, and needs to be accounted for. We do this by treating $y_{a s t}$ as a random draw from a Poisson distribution.

We add to (2.1) assumptions about how $γ_{a s t}$ is likely to vary. In Bayesian terminology, we specify a prior model for the $γ_{a s t} .$ Because $γ_{a s t}$ is positive with no upper bound, we specify the model on a log scale. We assume that $γ_{a s t}$ varies systematically by age, sex, and time, with age patterns potentially differing between females and males,

$\log γ_{a s t} = β^{0} + β_{a}^{age} + β_{s}^{sex} + β_{t}^{time} + β_{a s}^{age:sex} + e_{a s t} . (2.2)$

Here, $β^{0}$ is an intercept, capturing the overall level of log mortality rates, $β_{a}^{age}$ is an age effect, capturing variation across age, $β_{s}^{sex}$ is a sex effect, capturing variation between sexes, $β_{t}^{time}$ is a time effect, capturing common time trends, and $β_{a s}^{age:sex}$ is an age-sex interaction, capturing variation between sexes in the age pattern. The presence of the error term $e_{a s t},$ implies that we do not expect our prior model to predict $\log γ_{a s t}$ with complete accuracy. Standard generalized linear models do not have an equivalent term, and thus are implicitly making stronger assumptions about the correctness of the model. We assume that the error term $e_{a s t}$ has a normal distribution with mean 0 and variance $σ^{2} .$ The higher the value of $σ^{2},$ the less the implied accuracy of the prior model.

The most importance source of variation in mortality rates is age. As is apparent in Figure 2.1, mortality rates for people in the 90s are three or four orders of magnitude higher than mortality rates for young children. It is therefore crucial for accurate estimation that we capture the main features of the age pattern.

We model age effects using an approach originally developed for modelling change over time rather than age, a “local trend” model (Prado and West, 2010, pages 119-121),

$β_{a}^{age} = α_{a}^{age} + 1 (a =0) ψ + u_{a}^{age}, (2.3)$

$α_{a}^{age} = α_{a - 1}^{age} + δ_{a - 1}^{age} + v_{a}^{age}, (2.4)$

$δ_{a}^{age} = δ_{a - 1}^{age} + w_{a}^{age} . (2.5)$

Use of time series models to capture variation over age is relatively common in statistical demography. The fundamental idea is that values for neighboring age groups, like values for neighboring time periods, are more highly correlated than values for age groups or time periods that are distant from one another.

Equation (2.3) says that age effects are a combination of underlying level, captured by $α_{a}^{age},$ and age-specific idiosyncratic effects, captured by error term $u_{a}^{age} .$ Age group 0 typically has much higher mortality rates than those for other young age groups, reflecting the special risks faced by infants. This extra mortality is modelled by parameter $ψ .$ Equation (2.4) says that the level effect at age $a$ equals the level effect at age $a - 1,$ plus an increment $δ_{a - 1}^{age},$ plus an idiosyncratic error $v_{a}^{age} .$ Equation (2.5) says that the increment at age $a,$ $δ_{a}^{age},$ equals the increment at age $a - 1,$ $δ_{a - 1}^{age},$ plus an idiosyncratic error $w_{a}^{age} .$ Under a local trend model, age effects are expected to rise or fall linearly, but the slope of the line can change, or even reverse direction, over the whole length of the age pattern. Our priors for the starting values of $α^{age}$ and $δ^{age}$ are $α_{0}^{age} \sim N (0, 10^{2})$ and $δ_{0}^{age} \sim N (0, 1) .$

The age-sex interaction term $β_{a s}^{age:sex}$ measures variation between sexes in the age pattern for mortality. We use a “local level” model (Prado and West, 2010, pages 119-121),

$β_{a s}^{age:sex} = α_{a s}^{age:sex} + u_{a s}^{age:sex}, (2.6)$

$α_{a s}^{age:sex} = α_{a - 1, s}^{age:sex} + v_{a s}^{age:sex}, (2.7)$

$s \in {Female, Male} .$ This model expresses the idea that, after accounting for age effects and sex effects, the residuals for mortality rates will be similar between neighbouring age groups, within each sex. The lack of a trend term $(δ)$ implies that we do not expect these residuals to systematically trend upwards or downwards across the age range. We assume that any systematic trend will be shared by both sexes, and hence will be accounted for by the trend term in the age effect. Our prior for the starting value of $α^{age:sex}$ is $α_{0}^{age:sex} \sim N (0, 10^{2}) .$

We use a simple model for sex effects,

$β_{s}^{sex} \sim N (0, 1), (2.8)$

$s \in {Female, Male} .$ This implies that we expect that the mean difference between mortality rates for sex $s$ and the average mortality rates for both sexes to lie within the range $(- 2, 2)$ on a log scale. The variance of the female-male differences is $1 + 1=2,$ so we expect this difference to lie within the range $(- 2 \sqrt{2}, 2 \sqrt{2})$ on a log scale , or (0.06, 16.9) on the original scale, which is a very large range compared to actual sex differences. This is an example of a “weakly informative” prior, in that it understates the actual strength of existing scientific knowledge (Gelman, Jakulin, Pittau and Su, 2008). Weakly informative priors provide many of the benefits of strong priors, by ruling out implausible values, and speeding up computations. However, they are much more convenient, since they do not require the analyst to precisely summarize external information about the parameter in question, which can be difficult.

As there are only two time periods, and hence insufficient information to warrant a complicated model for time effects, we simply assume that $β_{t}^{time} \sim N (0, 1) .$ We assume $β^{0} \sim N (0, 10^{2}) .$

All the error terms in our model $(e_{a s t}, u_{a}^{age}, v_{a}^{age}, w_{a}^{age}, u_{a s}^{age:sex},$ and $v_{a s}^{age:sex})$ have normal distributions with mean 0. The standard deviation parameters for the error terms $e_{a s t}, u_{a}^{age}, v_{a}^{age}$ and $w_{a}^{age}$ all have a $half - t$ distribution, with 7 degrees of freedom and scale parameter 1. Figure 2.2 shows a $half - t$ distribution with 7 degrees of freedom and scale parameter 1. The distribution puts a 65% probability on values below 1, and a 2% probability on values exceeding 3.

In practice, we expect the standard deviation of our error terms to be well under 1. The standard deviation governs the size of age-to-age, sex-to-sex, or time-to-time differences in rates. A standard deviation of 1 implies that we would often see differences of 100% or more, which we do not see in practice. Our prior for standard deviations is therefore weakly informative.

The standard deviation parameters for the error terms $u_{a s}^{age:sex}$ and $v_{a s}^{age:sex}$ in the age-sex interaction have a $half - t$ distribution, with 7 degrees of freedom and scale parameter 0.5. We use a smaller scale for the interaction on the principle that interactions are typically smaller in size than main effects (Gelman et al., 2008).

Figure 2.2 A half-t distribution with 7 degrees of freedom and scale 1

Description for Figure 2.2

Figure showing a half distribution with 7 degrees of freedom and scale 1. The density is on the y-axis, going from 0 to 0.8. The values are on the x-axis goes from 0 to 4. The distribution puts a 65% probability on values below 1, and a 2% probability on values exceeding 3. The density is close to 0.8 for a value of 0 and then decrease to close to 0 for values of 3 or more.

2.2.2 Model output

The output from a Bayesian analysis is a sample from the posterior distribution of all of the parameters conditional on the data. The parameters include mortality rates, disaggregated by age, sex, and time, parameters in the model for age effects, parameters in the model for sex effects, and so on. Our main interest lies in the mortality rates. The mortality rates are well identified from the data. The main effects and interactions in the prior model, in contrast, are only weakly identified. We discuss the identification issue further in Appendix A.

We simulate draws from the posterior distribution using 4 independent chains, each with a burnin of 100,000 and production of 100,000. We keep every 250^th iteration from each chain, yielding a combined sample of $S = 1,600$ draws. We monitor convergence using potential scale reduction factors (Gelman, Carlin, Stern, Dunson, Vehtari and Rubin, 2014, Section 11.4). The calculations are done in our own open source R package demest. Sample code is shown in Appendix B.

For any given parameter, we use the median of its posterior draws as the point estimate, and use the $100 p %$ $(0< p <1)$ credible interval formed by the $[50 (1 - p)]$ and $[50 + 50 p]$ percentiles of its posterior draws to measure the uncertainty. For instance, a 95% credible interval with $p = 0 .95$ is formed by the 2.5% and 97.5% percentiles of the posterior draws.

The posterior draws can easily be used to construct estimates of functions of the model parameters, together with measures of uncertainty. In the study of mortality, a particularly important example is life expectancy at birth. Life expectancy is a complicated nonlinear function of age-specific mortality rates (Preston, Heuveline and Guillot, 2001). Let $f (γ)$ denote the nonlinear function that produces life expectancy from a set of age-specific mortality rates $γ .$ If $γ^{(1)}, \dots, γ^{(S)}$ represent a sample from the posterior distribution of $γ,$ then $f (γ^{(1)}), \dots, f (γ^{(S)})$ form a sample from the posterior distribution of life expectancy. We can summarize this sample to get point estimates and credible intervals of life expectancy.

Our approach is fully Bayesian in that, in addition to specifying a prior for $γ_{a s t},$ we also specify priors for hyper-parameters, such as $σ^{2},$ that govern the prior for $γ_{a s t} .$ Inferences about the hyper-parameters are made together with inferences about $γ_{a s t},$ using the joint posterior distribution. An alternative approach, known as Empirical Bayes, is to construct point estimates for the hyper-parameters and make inferences about $γ_{a s t}$ conditional on these point estimates (Rao and Molina, 2015, Chapter 9).

Empirical Bayes approaches can be less computationally intensive than fully Bayesian ones, which means they sometimes scale better to large datasets. They can, however, be difficult to implement with complicated models containing many levels, such as ours. Using probability distributions, rather than point estimates, for hyper-parameters also leads to a more complete representation of uncertainty.

2.3 Results

Figure 2.3 shows results from the model. The light blue band represents 95% credible intervals. If the assumptions of the model are met, then each vertical slice of the band has a 95% probability of containing the true value for $γ_{a s t} .$ The pale line in the middle is the median of the posterior distribution, which can be used for point estimates. The black circles are the direct estimates from Figure 2.1.

Figure 2.3 Modelled estimates of mortality rates for Māori males in 2012-2014, by single year of age. The light blue band represents 95% credible intervals, and the white line represents posterior medians. The black circles represent direct estimates

Description for Figure 2.3

This figure graphs the modelled estimates of mortality rates for Mori males in 2012-2014, by single year of age. There is a band that represents 95% credible intervals, and a line that represents posterior medians. The circles represent direct estimates from Figure 2.1. The death rate is on the y-axis, in the log scale, going from 1e-04 to 1. Age is on the x-axis going from 0 to 100. The model makes no attempt to smooth away the spike in mortality at age 0. Around age 10, estimates are, on a log scale, less precise than for most other age groups, reflecting the small cell counts for children. Around age 18, the slope changes abruptly, marking the beginning of the accident hump. Over ages 40-80, the age pattern obtained from the model is approximately linear. The model successfully smooths through the random variation in the direct estimates. Beyond age 90, uncertainty increases steadily, as death counts become smaller and smaller. The posterior median suggests an apparent plateauing in mortality rates beyond age 90.

The age pattern obtained from the model is approximately linear over ages 40-80. The model successfully smooths through the random variation in the direct estimates. Around age 18, however, the slope changes abruptly, marking the beginning of the accident hump. The smoothness at ages 40 and over does not come at the expense of an ability to detect local changes in the teenage years. The model also makes no attempt to smooth away the spike in mortality at age 0. This is a result of the inclusion of a covariate for age 0: models that do not have this term do partly smooth away the spike (results not shown).

Estimates around age 10 are, on a log scale, less precise than for most other age groups, reflecting the small cell counts for children. In other words, the model produces uncertainty measures that reflect local availability of data.

Uncertainty also increases steadily beyond age 90, as death counts become smaller and smaller. The posterior median suggests little increase in death rates beyond age 90. The apparent plateauing in mortality rates may be genuine: Māori males who survive to age 90 may systematically differ from ones who do not, so that the flat mortality for people at high ages reflects a kind of selection effect (Vaupel, Manton and Stallard, 1979). However, it is also possible that the plateauing reflects problems with the input data, such as inaccurate recording of ages of very old people.

Figure 2.4 shows life expectancies derived from the model. Unlike Figures 2.1 and 2.3, it shows results for both sexes and both periods. Female life expectancy at birth increased from 75.1 years, with a 95% credible interval of (74.8, 75.5), in 2005-2007 to 76.7 years, with a credible interval of (76.4, 77.0), in 2012-2014. The corresponding estimates for males are 70.8 (70.5, 71.1) and 72.5 (72.2, 72.9). It is still rare in demography for life expectancies to be accompanied by uncertainty measures. Using Bayesian methods, however, uncertainty measures can be calculated routinely.

Figure 2.4 Modelled estimates of life expectancy at birth, for Māori, 2005-2007 and 2012-2014. The light blue bands represent 95% credible intervals, and the white lines represent posterior medians

Description for Figure 2.4

Figure presenting the modelled estimates of life expectancy at birth, for Mori, for two periods: 2005-2007 and 2012-2014, by sex. There is a line representing the posterior median, by period and by sex. Around each median, a 95% credible intervals is provided. The age is on the y axis, going from 72 to 76. The two periods, 2005-2007 and 2012-2014 are on the x-axis. Female life expectancy at birth increased from 75.1 years, with a 95% credible interval of (74.8, 75.5), in 2005-2007 to 76.7 years, with a credible interval of (76.4, 77.0), in 2012-2014. The corresponding estimates for males are 70.8 (70.5, 71.1) and 72.5 (72.2, 72.9).

ISSN : 1492-0921

Editorial policy

Survey Methodology publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves. All papers will be refereed. However, the authors retain full responsibility for the contents of their papers and opinions expressed are not necessarily those of the Editorial Board or of Statistics Canada.

Submission of Manuscripts

Survey Methodology is published twice a year in electronic format. Authors are invited to submit their articles in English or French in electronic form, preferably in Word to the Editor, (statcan.smj-rte.statcan@canada.ca, Statistics Canada, 150 Tunney’s Pasture Driveway, Ottawa, Ontario, Canada, K1A 0T6). For formatting instructions, please see the guidelines provided in the journal and on the web site (www.statcan.gc.ca/SurveyMethodology).

Note of appreciation

Canada owes the success of its statistical system to a long-standing partnership between Statistics Canada, the citizens of Canada, its businesses, governments and other institutions. Accurate and timely statistical information could not be produced without their continued co-operation and goodwill.

Standards of service to the public

Statistics Canada is committed to serving its clients in a prompt, reliable and courteous manner. To this end, the Agency has developed standards of service which its employees observe in serving its clients.

Copyright

Published by authority of the Minister responsible for Statistics Canada.

Use of this publication is governed by the Statistics Canada Open Licence Agreement.

Catalogue No. 12-001-X

Frequency: Semi-annual

Ottawa

Date modified:: 2019-05-07

Language selection

Search and menus

Search

Bayesian small area demography
Section 2. Mortality rates for Māori

2.1 The estimation problem

2.2 The model

2.2.1 Model specification

2.2.2 Model output

2.3 Results

Bayesian small area demography Section 2. Mortality rates for Māori

2.1 The estimation problem

2.2 The model

2.2.1 Model specification

2.2.2 Model output

2.3 Results

Editorial policy

Submission of Manuscripts

Note of appreciation

Standards of service to the public

Copyright

Bayesian small area demography
Section 2. Mortality rates for Māori