Predicting transitions into and out of poverty using machine learning

Articles and reports: 11-522-X202100100003
Description:

The increasing size and richness of digital data allow for modeling more complex relationships and interactions, which is the strength of machine learning. Here we applied gradient boosting to the Dutch system of social statistical datasets to estimate transition probabilities into and out of poverty. Individual estimates are reasonable, but the main advantages of the approach in combination with SHAP and global surrogate models are the simultaneous ranking of hundreds of features by their importance, detailed insight into their relationship with the transition probabilities, and the data-driven identification of subpopulations with relatively high and low transition probabilities. In addition, we decompose the difference in feature importance between the general population and a subpopulation into a frequency effect and a feature effect. We caution against misinterpretation and discuss future directions.
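The description does not spell out how the difference in feature importance is split into a frequency effect and a feature effect. As a rough illustration only, the sketch below assumes a symmetric (Kitagawa-style) decomposition for a single categorical feature, with importance measured as the mean absolute SHAP value. This is an assumption and not necessarily the authors' formulation. The function name, the category labels, and all numbers are hypothetical toy values, not results from the paper.

```python
from typing import Dict, Tuple


def decompose_importance_gap(
    freq_general: Dict[str, float],
    shap_general: Dict[str, float],
    freq_sub: Dict[str, float],
    shap_sub: Dict[str, float],
) -> Tuple[float, float]:
    """Split (importance in subpopulation - importance in general population)
    into a frequency effect and a feature effect.

    Assumption: importance = sum over categories of
    (share of the category) * (mean |SHAP| within the category).
    """
    frequency_effect = 0.0
    feature_effect = 0.0
    for category in freq_general:
        p_g, p_s = freq_general[category], freq_sub[category]
        m_g, m_s = shap_general[category], shap_sub[category]
        # Change in category shares, weighted by the average per-category effect.
        frequency_effect += (p_s - p_g) * (m_s + m_g) / 2
        # Change in per-category effect, weighted by the average category share.
        feature_effect += (m_s - m_g) * (p_s + p_g) / 2
    return frequency_effect, feature_effect


# Hypothetical toy inputs: category shares and mean |SHAP| per category.
freq_general = {"employed": 0.9, "unemployed": 0.1}
shap_general = {"employed": 0.02, "unemployed": 0.10}
freq_sub = {"employed": 0.6, "unemployed": 0.4}
shap_sub = {"employed": 0.03, "unemployed": 0.12}

importance_general = sum(freq_general[c] * shap_general[c] for c in freq_general)
importance_sub = sum(freq_sub[c] * shap_sub[c] for c in freq_sub)
frequency_effect, feature_effect = decompose_importance_gap(
    freq_general, shap_general, freq_sub, shap_sub
)

print("gap:", importance_sub - importance_general)
print("frequency effect:", frequency_effect)
print("feature effect:", feature_effect)
```

Under this assumed form the two effects add up exactly to the importance gap, so a feature can matter more in a subpopulation either because its high-impact categories are more common there or because the same categories carry a larger effect.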

Key Words: Classification; Explainability; Gradient boosting; Life event; Risk factors; SHAP decomposition.

Issue Number: 2021001
Author(s): Burger, Joep; van der Laan, Jan
Main Product: Statistics Canada International Symposium Series: Proceedings
Format: PDF
Release date: October 15, 2021