Bayesian small area demography
Section 4. Discussion

Despite the increasing popularity of Bayesian methods in the research community, national statistical agencies and policy analysts have been wary of these methods. National statistical agencies are particularly concerned about two aspects of Bayesian methods: their use of prior distributions, and their complexity.

The use of prior distributions to represent external information is indeed a distinctive feature of Bayesian analyses. Little (2012) has argued that national statistical agencies should use “noninformative” priors, which avoid the impression of subjectivity, and which form a bridge to classical methods, in that they often lead to similar results. Among Bayesian statisticians, however, weakly informative priors have been gradually displacing noninformative priors as the default for most analyses. Compared with noninformative priors, weakly informative priors can stabilize estimates, and speed up calculations. But because they rule out only the most implausible values, they are generally no more controversial, and require little more work or justification, than noninformative priors.

However, in cases where the data to hand do not permit sufficiently strong answers to the questions of interest, there may be advantages to using priors that are strongly, rather than weakly, informative. In our obesity example, for instance, it may be possible to improve on both of our forecasting models by specifying priors for the standard deviation parameters that accurately reflect likely year-on-year variation in obesity rates.

If statistical agencies were to use strongly informative priors, they would need to spell these priors out clearly, justify their choices, and test sensitivity to alternative choices. But, in most cases, this would be an improvement on current practice. Current practice with analyses such as population forecasts is often to apply informal adjustments, or to retrospectively adjust assumptions, until a plausible result is obtained. Bayesian methods provide analysts with a more transparent and systematic way of bringing in external information and expert judgement.

Objections about Bayesian models being complicated are partly true. Many Bayesian models are complicated, in that, like the models presented in this paper, they use many layers and many parameters. At the same time, however, the individual components of these models are often simple and intuitive. To make sense of our model for mortality rates, for instance, we can start with the likelihood, move on to the prior model, and then consider the priors for main effects and interactions one by one. With this divide-and-conquer approach, even complicated models are accessible. Moreover, the main assumptions behind the models can often be described in nontechnical language, even if the mathematical techniques cannot.

Similarly, the traditional objection that Bayesian modelling requires advanced computing skills is gradually losing force. Packages such as ours allow analysts to fit specific classes of demographic estimation models relatively easily. General-purpose Bayesian programming languages such as Stan (Carpenter, Gelman, Hoffman, Lee, Goodrich, Betancourt, Brubaker, Guo, Li and Riddell, 2016) offer greater flexibility in exchange for slightly more programming effort. These tools allow practitioners to easily fit complicated Bayesian models.

Acknowledgements

The views expressed are those of the authors, and should not be attributed to Peking University, Stats NZ, or any other organization.

Appendix A

Identification of our model

In the prior model, each main effect or interaction includes all possible categories of the classifying dimensions. For instance, the sex effect includes separate female and male effects, and the age-sex interaction includes effects for every possible combination of age and sex. Because all of our priors are proper (i.e., are genuine probability distributions that integrate to 1), the posterior distribution is proper. All parameters are therefore identified in the broadest sense, and a Bayesian analysis can be carried out.

The main effects and interactions are, however, only weakly identified. For instance, adding a value $\lambda$ to the female and male effects $\beta^{\text{sex}}_{\text{Female}}$ and $\beta^{\text{sex}}_{\text{Male}}$, and subtracting $\lambda$ from the intercept $\beta^{0}$, will produce exactly the same expected value for the $\gamma_{ast}$ as the original parameter settings. The data do not allow us to distinguish between the two possibilities. Identification is achieved entirely through the differences in prior densities for the original and shifted parameters.
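
The invariance can be checked directly. As a sketch, assume that the prior mean is the sum of the terms in the formula of Figure B.1, on the scale on which the effects are added. The symbols $\mu_{ast}$, $\beta^{\text{age}}_{a}$, $\beta^{\text{age:sex}}_{as}$ and $\beta^{\text{period}}_{t}$ are notation assumed here for illustration. Writing $\tilde{\mu}_{ast}$ for the prior mean under the shifted parameters, the two occurrences of $\lambda$ cancel:

```latex
\begin{align*}
\mu_{ast} &= \beta^{0} + \beta^{\text{age}}_{a} + \beta^{\text{sex}}_{s}
             + \beta^{\text{age:sex}}_{as} + \beta^{\text{period}}_{t}, \\
\tilde{\mu}_{ast} &= (\beta^{0} - \lambda) + \beta^{\text{age}}_{a}
             + (\beta^{\text{sex}}_{s} + \lambda)
             + \beta^{\text{age:sex}}_{as} + \beta^{\text{period}}_{t}
             = \mu_{ast}.
\end{align*}
```

Because the equality holds for every $a$, $s$ and $t$, the likelihood is identical under the two parameter settings, and only the priors on $\beta^{0}$ and the sex effects can favour one over the other.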

The $\gamma_{ast}$, in contrast, cannot be arbitrarily shifted without affecting the likelihood $\text{Poisson}(y_{ast} \mid \gamma_{ast} n_{ast})$. In other words, the $\gamma_{ast}$ are well identified from the data. Shifting the values of the main effects and interactions does not affect inferences about standard deviation terms, as inferences about standard deviations depend on variation across effects, rather than absolute levels. The standard deviation terms are therefore also well identified.

In this paper we only report the $\gamma_{ast}$. In some applications, however, the main effects and interactions are also of interest. In such cases, one approach is to systematically shift the parameter estimates to achieve identification (Gelman, 2005).
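
One simple form of such a shift, assumed here for illustration and not necessarily the exact procedure intended, is to centre each batch of effects on zero and move the removed means into the intercept. The base R sketch below applies this to a single hypothetical posterior draw from a model with only an intercept and age and sex main effects. All numerical values and the function names `linear_predictor` and `centre_effects` are invented for the example.

```r
# Hypothetical posterior draw (values invented for illustration only)
beta0    <- -4.0
beta_age <- c("0" = 1.2, "1-4" = -0.9, "5+" = 0.3)
beta_sex <- c(Female = 0.55, Male = 0.85)

# Linear predictor for every age-sex combination (rows = age, columns = sex)
linear_predictor <- function(b0, b_age, b_sex) {
  b0 + outer(b_age, b_sex, "+")
}

# Centre each batch of effects on zero and add the removed means to the intercept
centre_effects <- function(b0, b_age, b_sex) {
  list(beta0    = b0 + mean(b_age) + mean(b_sex),
       beta_age = b_age - mean(b_age),
       beta_sex = b_sex - mean(b_sex))
}

centred <- centre_effects(beta0, beta_age, beta_sex)
before  <- linear_predictor(beta0, beta_age, beta_sex)
after   <- linear_predictor(centred$beta0, centred$beta_age, centred$beta_sex)

# The shift leaves the linear predictor unchanged ...
same_predictor <- isTRUE(all.equal(before, after))
# ... and leaves the spread of each batch of effects unchanged
same_spread <- isTRUE(all.equal(sd(beta_sex), sd(centred$beta_sex)))
print(c(same_predictor = same_predictor, same_spread = same_spread))
```

In practice the same shift would be applied to every posterior draw before summarizing. Interactions would need their row and column means removed as well, which this sketch does not attempt.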

Appendix B

R code

We have developed a set of R packages for implementing Bayesian small area demographic estimation and forecasting. The packages are available at github.com/statisticsnz/R. Package dembase contains data structures for demographic data and functions for manipulating these data structures. The basic data structure is a “demographic array”, which, in addition to the counts or rates themselves, also holds metadata such as age groups or time periods, and units of measurement. Bayesian estimation and forecasting are carried out by functions in package demest. The estimation functions use metadata from the demographic arrays to assign sensible default values. As a result, complex models can be specified and run relatively simply. For instance, the key parts of the code for our model in the mortality example are set out in Figure B.1. Package demlife contains tools for creating life tables and extracting life table functions.

Figure B.1 R code to specify and run the mortality model, using package demest

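library(demest)  # provides Model(), Poisson(), DLM(), Covariates() and estimateModel()

# Specify the likelihood, plus priors for the age effect and the age-sex interaction
model <- Model(y ~ Poisson(mean ~ age * sex + period),
               age ~ DLM(covariates = Covariates(infant = TRUE), damp = NULL),
               age:sex ~ DLM(trend = NULL, damp = NULL),
               jump = 0.05)

# File in which the results are stored
filename <- "out/mortality_model.est"

# Run the model
estimateModel(model = model,
              y = deaths,
              exposure = 3 * population,
              filename = filename,
              nBurnin = 100000,
              nSim = 100000,
              nChain = 4,
              nThin = 250)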
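
The MCMC settings in Figure B.1 imply a particular number of stored posterior draws. The short calculation below is a sketch that assumes `nSim` counts post-burn-in iterations per chain and that every `nThin`-th iteration is kept. That interpretation is an assumption, not something stated in the text. The variable names are chosen for the example.

```r
# Settings copied from Figure B.1
n_burnin <- 100000
n_sim    <- 100000
n_chain  <- 4
n_thin   <- 250

# Assumed interpretation: each chain runs n_burnin + n_sim iterations,
# and every n_thin-th post-burn-in iteration is retained
iterations_per_chain <- n_burnin + n_sim
draws_per_chain      <- n_sim / n_thin
total_draws          <- draws_per_chain * n_chain

print(c(iterations_per_chain = iterations_per_chain,
        draws_per_chain      = draws_per_chain,
        total_draws          = total_draws))
```

Under these assumptions, heavy thinning keeps the output file small while still drawing on long chains.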

References

Alho, J., and Spencer, B. (2006). Statistical Demography and Forecasting. Springer Science & Business Media.

Bijak, J., and Bryant, J. (2016). Bayesian demography 250 years after Bayes. Population Studies, 70, 1, 1-19.

Bryant, J., and Howard, A. (2017). Estimating Infant Mortality by Ethnicity: New Methods for Dealing with Inconsistent Ethnic Reporting and Small Numbers. Statistics New Zealand, Working Paper No. 17-01.

Bryant, J., Dunstan, K., Graham, P., Matheson-Dunning, N., Shrosbree, E. and Speirs, R. (2016). Measuring Uncertainty in the 2013-Base Estimated Resident Population. Statistics New Zealand, Working Paper No. 16-04.

Carpenter, B., Gelman, A., Hoffman, M., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M.A., Guo, J., Li, P. and Riddell, A. (2016). Stan: A probabilistic programming language. Journal of Statistical Software, 20, 1-37.

Chen, C., Wakefield, J. and Lumley, T. (2014). The use of sampling weights in Bayesian hierarchical models for small area estimation. Spatial and Spatio-Temporal Epidemiology, 11, 33-43.

Gelman, A. (2005). Analysis of variance: Why it is more important than ever (with discussion). The Annals of Statistics, 33, 1, 1-53.

Gelman, A., Jakulin, A., Pittau, M.G. and Su, Y.-S. (2008). A weakly informative default prior distribution for logistic and other regression models. The Annals of Applied Statistics, 2, 4, 1360-1383.

Gelman, A., Carlin, J.B., Stern, H.S., Dunson, D.B., Vehtari, A. and Rubin, D.B. (2014). Bayesian Data Analysis, Third Edition. Boca Raton: Chapman and Hall/CRC.

Gerland, P., Raftery, A.E., Ševčíková, H., Li, N., Gu, D., Spoorenberg, T., Alkema, L., Fosdick, B.K., Chunn, J., Lalic, N., Bay, G., Buettner, T., Heilig, G.K. and Wilmoth, J. (2014). World population stabilization unlikely this century. Science, 346, 6206, 234-237.

Little, R.J. (2012). Calibrated Bayes, an alternative inferential paradigm for official statistics. Journal of Official Statistics, 28, 3, 309.

Ministry of Health (2013). New Zealand health survey: Annual update of key findings 2012/13. Technical report, Ministry of Health.

Pfeffermann, D. (2013). New important developments in small area estimation. Statistical Science, 28, 1, 40-68.

Prado, R., and West, M. (2010). Time Series: Modeling, Computation, and Inference. Boca Raton: Chapman and Hall/CRC.

Preston, S., Heuveline, P. and Guillot, M. (2001). Demography: Modelling and Measuring Population Processes. Blackwell, Oxford.

Rao, J.N.K., and Molina, I. (2015). Small Area Estimation, Second Edition. New York: John Wiley & Sons, Inc.

United Nations General Assembly (2015). Transforming our world: The 2030 agenda for sustainable development. Available at https://www.unfpa.org/resources/transforming-our-world-2030-agenda-sustainable-development.

Vaupel, J.W., Manton, K.G. and Stallard, E. (1979). The impact of heterogeneity in individual frailty on the dynamics of mortality. Demography, 16, 3, 439-454.

