# Multiple imputation of missing values in household data with structural zeros

# 2.  Review of the NDPMPM model

Hu et al. (2018) present the NDPMPM model including motivation for how it can preserve associations across variables and account for structural zeros. Here, we summarize the model without detailed motivations, referring the reader to Hu et al. (2018) for more information. We begin with notation needed to understand the model and the Gibbs sampler, assuming complete data. The presentation closely follows that in Hu et al. (2018).

## 2.1  Notation and model specification

Suppose the data contain $n$ households. Each household $i=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}n$ contains ${n}_{i}$ individuals, so that there are ${\sum }_{i=1}^{n}\text{\hspace{0.17em}}{n}_{i}=N$ individuals in the data. Let ${X}_{ik}\in \left\{1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{d}_{k}\right\}$ be the value of categorical variable $k$ for household $i,$ which is assumed to be identical for all ${n}_{i}$ individuals in household $i,$ where $k=p+1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}p+q.$ Let ${X}_{ijk}\in \left\{1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{d}_{k}\right\}$ be the value of categorical variable $k$ for person $j$ in household $i,$ where $j=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{n}_{i}$ and $k=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}p.$ Let ${X}_{i}=\left({X}_{i\left(p+1\right)},\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{X}_{i\left(p+q\right)},\text{\hspace{0.17em}}{X}_{i11},\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{X}_{i{n}_{i}p}\right)$ include all household-level and individual-level variables for the ${n}_{i}$ individuals in household $i.$

Let $\mathcal{H}$ be the set of all household sizes that are possible in the population. For all $h\in \mathcal{H},$ let ${\mathcal{C}}_{h}$ represent the set of all combinations of individual-level and household-level variables for households of size $h$ , including impossible combinations; that is, ${\mathcal{C}}_{h}={\prod }_{k=p+1}^{p+q}\left\{1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{d}_{k}\right\}{\prod }_{j=1}^{h}\text{\hspace{0.17em}}{\prod }_{k=1}^{p}\left\{1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{d}_{k}\right\}.$ Let ${\mathcal{S}}_{h}\subset {\mathcal{C}}_{h}$ represent the set of impossible combinations, i.e., those that are structural zeros, for households of size $h.$ These include combinations of variables within any individual, e.g., a three year old person cannot be a spouse, or across individuals in the same household, e.g., a person cannot be older than his biological parents. Let $\mathcal{C}={\cup }_{h\in \mathcal{H}}\text{\hspace{0.17em}}{\mathcal{C}}_{h}$ and $\mathcal{S}={\cup }_{h\in \mathcal{H}}\text{\hspace{0.17em}}{\mathcal{S}}_{h}.$

Although the NDPMPM model we use restricts the support of ${X}_{i}$ to $\mathcal{C}-\mathcal{S},$ it is helpful for understanding the model to begin with no restrictions on the support of ${X}_{i}.$ Each household $i$ belongs to one of $F$ classes representing latent household types. For $i=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}n,$ let ${G}_{i}\in \left\{1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}F\right\}$ indicate the household class for household $i.$ Let ${\pi }_{g}=\mathrm{Pr}\left({G}_{i}=g\right)$ be the probability that household $i$ belongs to class $g.$ Within any class, all household-level variables follow independent, multinomial distributions. For any $k\in \left\{p+1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}p+q\right\}$ and any $c\in \left\{1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{d}_{k}\right\},$ let ${\lambda }_{gc}^{\left(k\right)}=\mathrm{Pr}\left({X}_{ik}=c\text{\hspace{0.17em}}|\text{\hspace{0.17em}}{G}_{i}=g\right)$ for any class $g,$ where ${\lambda }_{gc}^{\left(k\right)}$ is the same value for every household in class $g.$ Let $\pi =\left\{{\pi }_{1},\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{\pi }_{F}\right\},$ and $\lambda =\left\{{\lambda }_{gc}^{\left(k\right)}\text{ }:\text{\hspace{0.17em}}\text{\hspace{0.17em}}c=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{d}_{k};\text{\hspace{0.17em}}\text{\hspace{0.17em}}k=p+1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}p+q;\text{\hspace{0.17em}}\text{\hspace{0.17em}}g=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}F\right\}.$

Within each household class, each individual belongs to one of $S$ individual-level latent classes. For $i=1,\dots ,n$ and $j=1,\dots ,{n}_{i},$ let ${M}_{ij}$ represent the individual-level latent class of individual $j$ in household $i.$ Let ${\omega }_{gm}=\mathrm{Pr}\left({M}_{ij}=m\mid {G}_{i}=g\right)$ be the probability that individual $j$ in household $i$ belongs to individual-level class $m$ nested within household-level class $g.$ Within any individual-level class, all individual-level variables follow independent, multinomial distributions. For any $k\in \left\{1,\dots ,p\right\}$ and any $c\in \left\{1,\dots ,{d}_{k}\right\},$ let ${\varphi }_{gmc}^{\left(k\right)}=\mathrm{Pr}\left({X}_{ijk}=c\mid \left({G}_{i},{M}_{ij}\right)=\left(g,m\right)\right)$ for the class pair $\left(g,m\right),$ where ${\varphi }_{gmc}^{\left(k\right)}$ is the same value for every individual in the class pair $\left(g,m\right).$ Let $\omega =\left\{{\omega }_{gm}:g=1,\dots ,F;\ m=1,\dots ,S\right\},$ and $\varphi =\left\{{\varphi }_{gmc}^{\left(k\right)}:c=1,\dots ,{d}_{k};\ k=1,\dots ,p;\ m=1,\dots ,S;\ g=1,\dots ,F\right\}.$

For purposes of the Gibbs sampler in Section 2.2, it is useful to distinguish values of ${X}_{i}$ that satisfy all the structural zero constraints from those that do not. Let the superscript "1" indicate that a random variable has support only on $\mathcal{C}-\mathcal{S}.$ For example, ${X}_{i}^{1}$ represents data for a household with values restricted to $\mathcal{C}-\mathcal{S},$ i.e., not an impossible household, whereas ${X}_{i}$ represents data for a household with any values in $\mathcal{C}.$ Let ${\mathcal{X}}^{1}$ be the observed data comprising $n$ households, that is, a realization of $\left({X}_{1}^{1},\dots ,{X}_{n}^{1}\right).$ The kernel of the NDPMPM, $\mathrm{Pr}\left({\mathcal{X}}^{1}\mid \theta \right),$ is

$L\left({\mathcal{X}}^{1}|\text{\hspace{0.17em}}\theta \right)\text{\hspace{0.17em}}=\prod _{i=1}^{n}\sum _{h\in \mathcal{H}}\text{\hspace{0.17em}}1\left\{{n}_{i}=h\right\}1\left\{{X}_{i}^{1}\notin {\mathcal{S}}_{h}\right\}\left[\sum _{g=1}^{F}\text{\hspace{0.17em}}{\pi }_{g}\prod _{k=p+1}^{p+q}\text{\hspace{0.17em}}{\lambda }_{g{X}_{ik}^{1}}^{\left(k\right)}\prod _{j=1}^{h}\sum _{m=1}^{S}\text{\hspace{0.17em}}{\omega }_{gm}\prod _{k=1}^{p}\text{\hspace{0.17em}}{\varphi }_{gm{X}_{ijk}^{1}}^{\left(k\right)}\right],\text{ }\text{ }\left(2.1\right)$

where $\theta$ includes all the parameters, and $1\left\{\cdot \right\}$ equals one when the condition inside the braces is true and equals zero otherwise.
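To make the structure of the bracketed term in (2.1) concrete, the following is a minimal NumPy sketch of one household's likelihood contribution. All dimensions and parameter values (two household classes, two individual classes, one household-level and one individual-level variable) are invented for illustration; they are not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions and parameters (all values made up for demonstration).
F, S = 2, 2                                  # household- and individual-level latent classes
pi = np.array([0.6, 0.4])                    # pi_g = Pr(G_i = g)
omega = np.array([[0.7, 0.3], [0.2, 0.8]])   # omega_{gm} = Pr(M_ij = m | G_i = g)
lam = np.array([[0.5, 0.5], [0.9, 0.1]])     # lambda_{gc}: one household variable, d_k = 2
phi = rng.dirichlet(np.ones(3), size=(F, S)) # phi_{gmc}: one individual variable, d_k = 3

def household_likelihood(x_hh, x_ind):
    """Bracketed term of (2.1): sum over g of pi_g * lambda factor * per-member mixture factors."""
    total = 0.0
    for g in range(F):
        term = pi[g] * lam[g, x_hh]          # household-level factor
        for xj in x_ind:                     # one mixture factor per household member
            term *= sum(omega[g, m] * phi[g, m, xj] for m in range(S))
        total += term
    return total

# A household of size 2: household variable takes category 0,
# the two members take categories 1 and 2 of the individual variable.
L = household_likelihood(0, [1, 2])
```

Summing over $m$ inside the per-member product, rather than over all joint class assignments, is what keeps the computation linear in household size.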

For all $h\in \mathcal{H},$ let ${n}_{1h}={\sum }_{i=1}^{n}\text{\hspace{0.17em}}1\left\{{n}_{i}=h\right\}$ be the number of households of size $h$ in ${\mathcal{X}}^{1}$ and ${\pi }_{0h}\left(\theta \right)=\mathrm{Pr}\left({X}_{i}\in {\mathcal{S}}_{h}\text{\hspace{0.17em}}|\text{\hspace{0.17em}}\theta \right).$ As stated in Hu et al. (2018), the normalizing constant in the likelihood in (2.1) is ${\prod }_{h\in \mathcal{H}}{\left(1-{\pi }_{0h}\left(\theta \right)\right)}^{{n}_{1h}}\text{​}.$ Therefore, the posterior distribution is

$\mathrm{Pr}\left(\theta \text{\hspace{0.17em}}|\text{\hspace{0.17em}}{\mathcal{X}}^{1}\text{​},\text{\hspace{0.17em}}T\left(\mathcal{S}\right)\right)\propto \mathrm{Pr}\left({\mathcal{X}}^{1}\text{\hspace{0.17em}}|\text{\hspace{0.17em}}\theta \right)\mathrm{Pr}\left(\theta \right)=\frac{1}{{\prod }_{h\in \mathcal{H}}{\left(1-{\pi }_{0h}\left(\theta \right)\right)}^{{n}_{1h}}}L\left({\mathcal{X}}^{1}\text{\hspace{0.17em}}|\text{\hspace{0.17em}}\theta \right)\mathrm{Pr}\left(\theta \right)\text{ }\text{ }\left(2.2\right)$

where $T\left(\mathcal{S}\right)$ emphasizes that the density is for the NDPMPM with support restricted to $\mathcal{C}-\mathcal{S}.$

The likelihood in (2.1) can be written as a generative model of the form

$\begin{array}{ll}{X}_{ik}\text{\hspace{0.17em}}|\text{\hspace{0.17em}}{G}_{i},\text{\hspace{0.17em}}\lambda \sim \hfill & \text{Discrete}\left({\lambda }_{{G}_{i}1}^{\left(k\right)},\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{\lambda }_{{G}_{i}{d}_{k}}^{\left(k\right)}\right)\hfill \\ \hfill & \forall i=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}n\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{and}\text{\hspace{0.17em}}\text{\hspace{0.17em}}k=p+1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}p+q\text{ }\text{ }\text{ }\left(2.3\right)\hfill \end{array}$

$\begin{array}{ll}{X}_{ijk}|{G}_{i},\text{\hspace{0.17em}}{M}_{ij},\text{\hspace{0.17em}}\varphi ,\text{\hspace{0.17em}}{n}_{i}\sim \hfill & \text{Discrete}\left({\varphi }_{{G}_{i}{M}_{ij}1}^{\left(k\right)},\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{\varphi }_{{G}_{i}{M}_{ij}{d}_{k}}^{\left(k\right)}\right)\hfill \\ \hfill & \forall i=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}n\text{\hspace{0.17em}},\text{\hspace{0.17em}}j=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{n}_{i}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{and}\text{\hspace{0.17em}}\text{\hspace{0.17em}}k=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}p\text{ }\text{ }\text{ }\left(2.4\right)\hfill \end{array}$

$\begin{array}{ll}{G}_{i}\text{\hspace{0.17em}}|\text{\hspace{0.17em}}\pi \sim \hfill & \text{Discrete}\left({\pi }_{1},\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{\pi }_{F}\right)\hfill \\ \hfill & \forall i=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}n\text{ }\text{ }\text{ }\text{ }\text{ }\text{ }\left(2.5\right)\hfill \end{array}$

$\begin{array}{ll}{M}_{ij}\text{\hspace{0.17em}}|\text{\hspace{0.17em}}{G}_{i},\text{\hspace{0.17em}}\omega ,\text{\hspace{0.17em}}{n}_{i}\sim \hfill & \text{Discrete}\left({\omega }_{{G}_{i}1},\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{\omega }_{{G}_{i}S}\right)\hfill \\ \hfill & \forall i=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}n\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{and}\text{\hspace{0.17em}}\text{\hspace{0.17em}}j=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{n}_{i}\text{ }\text{ }\text{ }\text{ }\left(2.6\right)\hfill \end{array}$

where the Discrete distribution refers to the multinomial distribution with sample size equal to one. We restrict the support of each ${X}_{i}$ to ensure the model assigns zero probability to all combinations in $\mathcal{S},$ as desired. The model in (2.3) to (2.6) can also be used without restricting the support to $\mathcal{C}-\mathcal{S},$ which ignores all structural zeros. While not appropriate for the joint distribution of household data, this model turns out to be useful for the Gibbs sampler. We refer to the generative model in (2.3) to (2.6) with support on all of $\mathcal{C}$ as the untruncated NDPMPM. For contrast, we call the model in (2.1) the truncated NDPMPM.
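The generative steps (2.3) to (2.6) can be sketched directly: draw a household class, then each member's class, then the categorical values. This is an illustrative NumPy simulation of the untruncated NDPMPM with invented dimensions (one household-level and one individual-level variable) and randomly generated parameters.

```python
import numpy as np

rng = np.random.default_rng(1)
F, S, h = 3, 4, 2                  # latent class counts and household size (illustrative)
d_hh, d_ind = 5, 6                 # category counts for one household / one individual variable

# Randomly generated parameters standing in for pi, omega, lambda, phi.
pi = rng.dirichlet(np.ones(F))                      # Pr(G_i = g), as in (2.5)
omega = rng.dirichlet(np.ones(S), size=F)           # Pr(M_ij = m | G_i = g), as in (2.6)
lam = rng.dirichlet(np.ones(d_hh), size=F)          # household-level probabilities, (2.3)
phi = rng.dirichlet(np.ones(d_ind), size=(F, S))    # individual-level probabilities, (2.4)

def sample_household(h):
    """Draw one household of size h from the untruncated generative model."""
    g = rng.choice(F, p=pi)                          # G_i | pi
    x_hh = rng.choice(d_hh, p=lam[g])                # X_ik | G_i, lambda
    members = []
    for _ in range(h):
        m = rng.choice(S, p=omega[g])                # M_ij | G_i, omega
        members.append(rng.choice(d_ind, p=phi[g, m]))  # X_ijk | G_i, M_ij, phi
    return g, x_hh, members

g, x_hh, members = sample_household(h)
```

Note that nothing here enforces structural zeros; a draw may land in $\mathcal{S}$, which is exactly the behavior the truncated model removes.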

For prior distributions, we follow the recommendations of Hu et al. (2018). We use independent uniform Dirichlet distributions as priors for $\lambda$ and $\varphi ,$ and the truncated stick-breaking representation of the Dirichlet process as priors for $\pi$ and $\omega$ (Sethuraman, 1994; Dunson and Xing, 2009; Si and Reiter, 2013; Manrique-Vallier and Reiter, 2014),

${\lambda }_{g}^{\left(k\right)}=\left({\lambda }_{g1}^{\left(k\right)},\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{\lambda }_{g{d}_{k}}^{\left(k\right)}\right)\sim \text{Dirichlet}\left(1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}1\right)\text{ }\text{ }\text{ }\text{ }\text{ }\left(2.7\right)$

${\varphi }_{gm}^{\left(k\right)}=\left({\varphi }_{gm1}^{\left(k\right)},\dots ,{\varphi }_{gm{d}_{k}}^{\left(k\right)}\right)\sim \text{Dirichlet}\left(1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}1\right)\text{ }\text{ }\text{ }\text{ }\text{ }\left(2.8\right)$

${\pi }_{g}={u}_{g}\prod _{f<g}\left(1-{u}_{f}\right)\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}g=1,\dots ,F\text{ }\text{ }\text{ }\text{ }\left(2.9\right)$

${u}_{g}\sim \text{Beta}\left(1,\text{\hspace{0.17em}}\alpha \right)\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}g=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}F-1,\text{\hspace{0.17em}}{u}_{F}=1\text{ }\text{ }\text{ }\text{ }\left(2.10\right)$

$\alpha \sim \text{Gamma}\left(\text{0}\text{.25},\text{\hspace{0.17em}}\text{0}\text{.25}\right)\text{ }\text{ }\text{ }\text{ }\text{ }\left(2.11\right)$

${\omega }_{gm}={v}_{gm}\prod _{s<m}\left(1-{v}_{gs}\right)\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}m=1,\dots ,S;\text{\hspace{0.17em}}g=1,\dots ,F\text{ }\text{ }\text{ }\text{ }\left(2.12\right)$

${v}_{gm}\sim \text{Beta}\left(1,\text{\hspace{0.17em}}{\beta }_{g}\right)\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}m=1,\dots ,\text{\hspace{0.17em}}S-1,\text{\hspace{0.17em}}{v}_{gS}=1\text{ }\text{ }\text{ }\text{ }\left(2.13\right)$

${\beta }_{g}\sim \text{Gamma}\left(\text{0}\text{.25},\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{0}\text{.25}\right).\text{ }\text{ }\text{ }\text{ }\left(2.14\right)$

We set the parameters for the Dirichlet distributions in (2.7) and (2.8) to ${1}_{{d}_{k}}$ (a ${d}_{k}$-dimensional vector of ones) and the parameters for the Gamma distributions in (2.11) and (2.14) to 0.25 to represent vague prior specifications. We also set ${\beta }_{g}=\beta$ for computational expedience. For further discussion on prior specifications, see Hu et al. (2018).
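The truncated stick-breaking construction in (2.9) and (2.10) can be written in a few lines. This is an illustrative NumPy sketch (the truncation level $F$ and concentration $\alpha$ are arbitrary values chosen for demonstration): draw $u_g \sim \text{Beta}(1,\alpha)$ for $g < F$, set $u_F = 1$, and break off pieces of the remaining stick.

```python
import numpy as np

rng = np.random.default_rng(2)

def stick_break(F, alpha):
    """Truncated stick-breaking weights: pi_g = u_g * prod_{f<g} (1 - u_f)."""
    u = rng.beta(1.0, alpha, size=F)
    u[-1] = 1.0                      # u_F = 1 guarantees the weights sum to one
    pi = np.empty(F)
    remaining = 1.0                  # length of stick not yet assigned
    for g in range(F):
        pi[g] = u[g] * remaining
        remaining *= 1.0 - u[g]
    return pi

pi = stick_break(F=10, alpha=1.0)
```

Smaller $\alpha$ concentrates mass on the first few sticks, so in practice many of the $F$ classes receive negligible weight, which is how the truncated Dirichlet process adapts the effective number of classes.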

Conceptually, the latent household-level classes can be interpreted as clusters of households with similar compositions, e.g., households with children or households in which no one is related. Similarly, the latent individual-level classes can be interpreted as clusters of individuals with similar characteristics, e.g., older male spouses or young female children. However, for purposes of imputation, we do not care much about interpreting the classes, as they serve mainly to induce dependence across variables and individuals in the joint distribution.

It is important to select $F$ and $S$ to be large enough to ensure accurate estimation of the joint distribution. However, we also do not want to make $F$ and $S$ so large as to produce many empty classes in the model estimation. Allowing many empty classes increases computational running time without any corresponding increase in estimation accuracy. This can be especially problematic in the Gibbs sampler for the truncated NDPMPM, as these empty classes can introduce mass in regions of the space where impossible combinations are likely to be generated. This slows down the convergence of the Gibbs sampler.

We therefore recommend following the strategy in Hu et al. (2018) when setting $\left(F,S\right).$ Analysts can start with moderate values for both, say between 10 and 15, in initial tuning runs. After convergence, analysts examine posterior samples of the latent class indicators to check how many individual-level and household-level latent classes are occupied. Such checks can provide evidence that larger values of $F$ and $S$ are needed. If the number of occupied household-level classes hits $F,$ we suggest increasing $F.$ If the number of occupied individual-level classes hits $S,$ we suggest increasing $F$ first, and then increasing $S$ (possibly in addition to $F$) if increasing $F$ alone does not suffice. When these checks do not provide evidence that larger values of $F$ and $S$ are needed, analysts need not increase the number of classes, as doing so is not expected to improve the accuracy of the estimation. We note that similar logic is used in other mixture model contexts (Walker, 2007; Si and Reiter, 2013; Manrique-Vallier and Reiter, 2014; Murray and Reiter, 2016).
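The occupancy check described above reduces to counting distinct class labels in a posterior sample of the indicators. A minimal sketch, with simulated indicators standing in for an actual MCMC draw:

```python
import numpy as np

rng = np.random.default_rng(8)
F = 15                                      # truncation level for household classes (illustrative)
# Simulated draw of household-class indicators G_i, concentrated on 5 of the F classes.
G = rng.choice(5, size=1000)

occupied = len(np.unique(G))                # number of occupied household-level classes
truncation_binding = occupied == F          # True would suggest increasing F
```

In practice one would apply this check across many posterior draws (and analogously for the individual-level indicators against $S$), increasing the truncation level only if the occupied count repeatedly hits it.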

## 2.2  MCMC sampler for the NDPMPM

Hu et al. (2018) use a data augmentation strategy (Manrique-Vallier and Reiter, 2014) to estimate the posterior distribution in (2.2). They assume that the observed data ${\mathcal{X}}^{1},$ which include only feasible households, is a subset of a hypothetical sample $\mathcal{X}$ of $\left(n+{n}_{0}\right)$ households generated directly from the untruncated NDPMPM. That is, $\mathcal{X}$ is generated on the support $\mathcal{C},$ where all combinations are possible and structural zero rules are not enforced, but we only observe the sample of $n$ households ${\mathcal{X}}^{1}$ that satisfy the structural zero rules; we do not observe the sample of ${n}_{0}$ households ${\mathcal{X}}^{0}=\mathcal{X}-{\mathcal{X}}^{1}$ that fail the rules.

We use the strategy of Hu et al. (2018) and augment the data as follows. For each $h\in \mathcal{H},$ we simulate households from the untruncated NDPMPM, stopping when the number of simulated feasible households of size $h$ matches ${n}_{1h}.$ We replace the simulated feasible households in $\mathcal{X}$ with ${\mathcal{X}}^{1},$ in effect assuming that $\mathcal{X}$ already contains ${\mathcal{X}}^{1}$ so that we only need to generate the part ${\mathcal{X}}^{0}$ that falls in $\mathcal{S}.$ Given a draw of $\mathcal{X},$ we draw $\theta$ from the posterior distribution defined by the untruncated NDPMPM, treating $\mathcal{X}$ as the observed data. This posterior distribution can be estimated using a blocked Gibbs sampler (Ishwaran and James, 2001; Si and Reiter, 2013).

We now present the full MCMC sampler for fitting the truncated NDPMPM. Let ${G}^{0}$ and ${M}^{0}$ be vectors of the latent class membership indicators for the households in ${\mathcal{X}}^{0},$ and let ${n}_{0h}$ be the number of households of size $h$ in ${\mathcal{X}}^{0},$ with ${n}_{0}={\sum }_{h}{n}_{0h}.$ In each full conditional, let "$-$" represent conditioning on all other variables and parameters in the model. At each MCMC iteration, we do the following steps.

• S1. Set ${\mathcal{X}}^{0}={G}^{0}={M}^{0}=\varnothing .$ For each $h\in \mathcal{H},$ repeat the following:
1. Set ${t}_{0}=0$ and ${t}_{1}=0.$
2. Sample ${G}_{i}^{0}\in \left\{1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}F\right\}\sim \text{Discrete}\left({\pi }_{1}^{**}\text{​},\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{\pi }_{F}^{**}\right)$ where ${\pi }_{g}^{**}\text{\hspace{0.17em}}\propto \text{\hspace{0.17em}}{\lambda }_{gh}^{\left(k\right)}{\pi }_{g}$ and $k$ is the index for the household-level variable “household size”.
3. For $j=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}h,$ sample ${M}_{ij}^{0}\in \left\{1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}S\right\}\sim \text{Discrete}\left({\omega }_{{G}_{i}^{0}1},\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{\omega }_{{G}_{i}^{0}S}\right).$
4. Set ${X}_{ik}^{0}=h,$ where ${X}_{ik}^{0}$ corresponds to the variable for household size. Sample the remaining household-level and individual-level values using the likelihoods in (2.3) and (2.4). Set the household’s simulated value to ${X}_{i}^{0}.$
5. If ${X}_{i}^{0}\in {\mathcal{S}}_{h},$ let ${t}_{0}={t}_{0}+1,$ ${\mathcal{X}}^{0}={\mathcal{X}}^{0}\cup {X}_{i}^{0},$ ${G}^{0}={G}^{0}\cup {G}_{i}^{0}$ and ${M}^{0}={M}^{0}\cup \left\{{M}_{i1}^{0},\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{M}_{ih}^{0}\right\}.$ Otherwise set ${t}_{1}={t}_{1}+1.$
6. If ${t}_{1}<{n}_{1h},$ return to step 2. Otherwise, set ${n}_{0h}={t}_{0}.$
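Step S1 is essentially a rejection loop: keep simulating households of size $h$ from the untruncated model until the count of feasible draws reaches ${n}_{1h}$, collecting the infeasible draws as ${\mathcal{X}}^{0}$. The sketch below uses invented dimensions and a toy structural-zero rule (no household may contain two members with category 0) standing in for ${\mathcal{S}}_{h}$; it is illustration only, not the paper's rule set.

```python
import numpy as np

rng = np.random.default_rng(3)
F, S, d = 2, 2, 3                               # illustrative dimensions
pi_star_star = np.array([0.5, 0.5])             # pi** for this household size (made up)
omega = rng.dirichlet(np.ones(S), size=F)       # omega_{gm}
phi = rng.dirichlet(np.ones(d), size=(F, S))    # phi_{gmc} for one individual variable

def is_structural_zero(members):
    """Toy stand-in for membership in S_h."""
    return sum(v == 0 for v in members) >= 2

def augment(h, n_1h):
    """Collect X^0 for households of size h, stopping once n_1h feasible draws occur."""
    X0, t1 = [], 0
    while t1 < n_1h:
        g = rng.choice(F, p=pi_star_star)       # household class, as in S1 step 2
        members = [rng.choice(d, p=phi[g, rng.choice(S, p=omega[g])])
                   for _ in range(h)]           # member classes and values, steps 3-4
        if is_structural_zero(members):
            X0.append((g, members))             # infeasible draw joins X^0, step 5
        else:
            t1 += 1                             # feasible draw only counts toward n_1h
    return X0

X0 = augment(h=3, n_1h=50)
```

The number of infeasible draws ${n}_{0h}$ is random; when ${\pi }_{0h}\left(\theta \right)$ is large, this loop dominates the sampler's running time, which is why empty classes that put mass near $\mathcal{S}$ are costly.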
• S2. For observations in ${\mathcal{X}}^{1}\text{​},$
1. Sample ${G}_{i}\in \left\{1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}F\right\}\sim \text{Discrete}\left({\pi }_{1}^{*},\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}{\pi }_{F}^{*}\right)$ for $i=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}n,$ where

${\pi }_{g}^{*}=\mathrm{Pr}\left({G}_{i}=g\mid -\right)=\frac{{\pi }_{g}{\prod }_{k=p+1}^{p+q}{\lambda }_{g{X}_{ik}^{1}}^{\left(k\right)}{\prod }_{j=1}^{{n}_{i}}{\sum }_{m=1}^{S}{\omega }_{gm}{\prod }_{k=1}^{p}{\varphi }_{gm{X}_{ijk}^{1}}^{\left(k\right)}}{{\sum }_{f=1}^{F}{\pi }_{f}{\prod }_{k=p+1}^{p+q}{\lambda }_{f{X}_{ik}^{1}}^{\left(k\right)}{\prod }_{j=1}^{{n}_{i}}{\sum }_{m=1}^{S}{\omega }_{fm}{\prod }_{k=1}^{p}{\varphi }_{fm{X}_{ijk}^{1}}^{\left(k\right)}}$

for $g=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}F\text{​}.$ Set ${G}_{i}^{1}={G}_{i}.$
2. Sample ${M}_{ij}\in \left\{1,\dots ,S\right\}\sim \text{Discrete}\left({\omega }_{{G}_{i}^{1}1}^{*},\dots ,{\omega }_{{G}_{i}^{1}S}^{*}\right)$ for $i=1,\dots ,n$ and $j=1,\dots ,{n}_{i},$ where

${\omega }_{{G}_{i}^{1}m}^{*}=\mathrm{Pr}\left({M}_{ij}=m\text{\hspace{0.17em}}|\text{\hspace{0.17em}}-\right)=\frac{{\omega }_{{G}_{i}^{1}m}{\prod }_{k=1}^{p}\text{\hspace{0.17em}}{\varphi }_{{G}_{i}^{1}m{X}_{ijk}^{1}}^{\left(k\right)}}{{\sum }_{s=1}^{S}\text{\hspace{0.17em}}{\omega }_{{G}_{i}^{1}s}{\prod }_{k=1}^{p}\text{\hspace{0.17em}}{\varphi }_{{G}_{i}^{1}s{X}_{ijk}^{1}}^{\left(k\right)}}$

for $m=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}S.$ Set ${M}_{ij}^{1}={M}_{ij}.$
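The two full conditionals in S2 can be sketched for a single household: ${\pi }_{g}^{*}$ multiplies ${\pi }_{g}$ by the household's likelihood factors, and ${\omega }^{*}$ is the analogous per-member conditional given the drawn ${G}_{i}$. Dimensions and parameters below are illustrative NumPy placeholders, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(4)
F, S, d_hh, d_ind = 3, 2, 4, 5                 # illustrative dimensions
pi = rng.dirichlet(np.ones(F))
lam = rng.dirichlet(np.ones(d_hh), size=F)     # one household-level variable
omega = rng.dirichlet(np.ones(S), size=F)
phi = rng.dirichlet(np.ones(d_ind), size=(F, S))  # one individual-level variable

x_hh, x_ind = 1, [0, 3]                        # observed categories for one household of size 2

# pi*_g: unnormalized full conditional for G_i, as in S2 step (a)
unnorm = np.array([
    pi[g] * lam[g, x_hh] * np.prod([
        sum(omega[g, m] * phi[g, m, xj] for m in range(S)) for xj in x_ind
    ])
    for g in range(F)
])
pi_star = unnorm / unnorm.sum()
G_i = rng.choice(F, p=pi_star)

# omega*_m for each member given the drawn G_i, as in S2 step (b)
M_i = []
for xj in x_ind:
    w = omega[G_i] * phi[G_i, :, xj]
    M_i.append(rng.choice(S, p=w / w.sum()))
```

Because the household-level conditional marginalizes over each member's ${M}_{ij}$, the ${G}_{i}$ draw does not depend on the current member-class assignments, which is what allows the two draws to be done in sequence.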
• S3. Set ${u}_{F}=1.$ Sample

${u}_{g}\text{\hspace{0.17em}}|\text{\hspace{0.17em}}-\sim \text{Beta}\left(1+{U}_{g},\text{\hspace{0.17em}}\alpha +\sum _{f=g+1}^{F}{U}_{f}\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\pi }_{g}={u}_{g}\prod _{f<g}\left(1-{u}_{f}\right)$

where

${U}_{g}=\sum _{i=1}^{n}\text{\hspace{0.17em}}1\left({G}_{i}^{1}=g\right)+\sum _{i=1}^{{n}_{0}}\text{\hspace{0.17em}}1\left({G}_{i}^{0}=g\right)$

for $g=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}F-1.$
• S4. Set ${v}_{gS}=1$ for $g=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}F.$ Sample

${v}_{gm}\text{\hspace{0.17em}}|\text{\hspace{0.17em}}-\sim \text{Beta}\left(1+{V}_{gm},\text{\hspace{0.17em}}\beta +\sum _{s=m+1}^{S}{V}_{gs}\right),\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\omega }_{gm}={v}_{gm}\prod _{s<m}\left(1-{v}_{gs}\right)$

where

${V}_{gm}=\sum _{i=1}^{n}\sum _{j=1}^{{n}_{i}}1\left({M}_{ij}^{1}=m,\text{\hspace{0.17em}}{G}_{i}^{1}=g\right)+\sum _{i=1}^{{n}_{0}}\sum _{j=1}^{{n}_{i}}1\left({M}_{ij}^{0}=m,\text{\hspace{0.17em}}{G}_{i}^{0}=g\right)$

for $m=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}S-1$ and $g=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}F.$
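Steps S3 and S4 are conjugate Beta updates for the stick-breaking weights: given class occupancy counts over all households (observed plus augmented), each stick is drawn from a Beta whose parameters add the counts at and beyond that stick. This sketch implements the S3 form for $\pi$ with invented counts; the S4 update for each row of $\omega$ is analogous with $\beta$ and ${V}_{gm}$.

```python
import numpy as np

rng = np.random.default_rng(5)

def update_pi(U, alpha):
    """Draw sticks u_g ~ Beta(1 + U_g, alpha + sum_{f>g} U_f) and rebuild pi."""
    F = len(U)
    u = np.ones(F)                          # u_F stays fixed at 1
    for g in range(F - 1):
        u[g] = rng.beta(1.0 + U[g], alpha + U[g + 1:].sum())
    pi = np.empty(F)
    remaining = 1.0
    for g in range(F):
        pi[g] = u[g] * remaining            # pi_g = u_g * prod_{f<g} (1 - u_f)
        remaining *= 1.0 - u[g]
    return pi, u

U = np.array([40, 25, 10, 3, 0])            # hypothetical class occupancies U_g
pi, u = update_pi(U, alpha=0.5)
```

Classes with zero occupancy shrink toward the prior, so the posterior weights on unused classes decay quickly down the stick.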
• S5. Sample

${\lambda }_{g}^{\left(k\right)}\text{\hspace{0.17em}}|\text{\hspace{0.17em}}-\sim \text{Dirichlet}\left(1+{\eta }_{g1}^{\left(k\right)},\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}1+{\eta }_{g{d}_{k}}^{\left(k\right)}\right)$

where

${\eta }_{gc}^{\left(k\right)}=\sum _{i:\ {G}_{i}^{1}=g}1\left({X}_{ik}^{1}=c\right)+\sum _{i:\ {G}_{i}^{0}=g}1\left({X}_{ik}^{0}=c\right)$

for $g=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}F$ and $k=p+1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}p+q.$
• S6. Sample

${\varphi }_{gm}^{\left(k\right)}\text{\hspace{0.17em}}|\text{\hspace{0.17em}}-\sim \text{Dirichlet}\left(1+{\nu }_{gm1}^{\left(k\right)},\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}1+{\nu }_{gm{d}_{k}}^{\left(k\right)}\right)$

where

${\nu }_{gmc}^{\left(k\right)}=\sum _{i,j:\ {G}_{i}^{1}=g,\ {M}_{ij}^{1}=m}1\left({X}_{ijk}^{1}=c\right)+\sum _{i,j:\ {G}_{i}^{0}=g,\ {M}_{ij}^{0}=m}1\left({X}_{ijk}^{0}=c\right)$

for $g=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}F,$ $m=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}S$ and $k=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}p.$
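Steps S5 and S6 are standard Dirichlet-multinomial conjugate updates: with a uniform Dirichlet$(1,\dots,1)$ prior, the posterior simply adds the pooled category counts to the prior ones. A minimal sketch for one $\left({\lambda }_{g}^{\left(k\right)}\right)$ update, with hypothetical counts; the ${\varphi }_{gm}^{\left(k\right)}$ update in S6 has the same form with counts ${\nu }_{gmc}^{\left(k\right)}$.

```python
import numpy as np

rng = np.random.default_rng(6)

def update_lambda(eta):
    """Draw lambda_g^(k) | - ~ Dirichlet(1 + eta_{g1}, ..., 1 + eta_{g d_k})."""
    return rng.dirichlet(1.0 + np.asarray(eta, dtype=float))

eta = np.array([12, 0, 7, 3])               # hypothetical counts over d_k = 4 categories
lam_gk = update_lambda(eta)
```

Pooling counts from both ${\mathcal{X}}^{1}$ and ${\mathcal{X}}^{0}$ is what makes these conditionals consistent with the untruncated model that the augmentation targets.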
• S7. Sample

$\alpha \text{\hspace{0.17em}}|\text{\hspace{0.17em}}-\sim \text{Gamma}\left({a}_{\alpha }+F-1,\text{\hspace{0.17em}}{b}_{\alpha }-\sum _{g=1}^{F-1}\text{\hspace{0.17em}}\text{log}\left(1-{u}_{g}\right)\right).$

• S8. Sample

$\beta \text{\hspace{0.17em}}|\text{\hspace{0.17em}}-\sim \text{Gamma}\left({a}_{\beta }+F\times \left(S-1\right),\text{\hspace{0.17em}}{b}_{\beta }-\sum _{m=1}^{S-1}\text{\hspace{0.17em}}\sum _{g=1}^{F}\text{\hspace{0.17em}}\text{log}\left(1-{v}_{gm}\right)\right).$
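The concentration updates in S7 and S8 are Gamma draws built from the current sticks. A sketch of the S7 update for $\alpha$ with hypothetical stick values (the $\beta$ update in S8 is the same pattern, pooling the $\log\left(1-{v}_{gm}\right)$ terms over $g$ and $m$); note that NumPy parameterizes the Gamma by shape and scale, so we pass the reciprocal of the rate.

```python
import numpy as np

rng = np.random.default_rng(7)

def update_alpha(u, a=0.25, b=0.25):
    """Draw alpha | - ~ Gamma(a + F - 1, b - sum_g log(1 - u_g)) given the F-1 free sticks."""
    shape = a + len(u)                          # a_alpha + (F - 1)
    rate = b - np.log1p(-u).sum()               # b_alpha - sum_g log(1 - u_g)
    return rng.gamma(shape, scale=1.0 / rate)   # NumPy uses shape/scale, not shape/rate

u = np.array([0.3, 0.5, 0.2, 0.6])              # hypothetical sticks u_1,...,u_{F-1} with F - 1 = 4
alpha = update_alpha(u)
```

Since each $\log\left(1-{u}_{g}\right)$ is negative, the rate always exceeds ${b}_{\alpha}$, so the draw is well defined.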

This Gibbs sampler is implemented in the R software package “NestedCategBayesImpute” (Wang, Akande, Hu, Reiter and Barrientos, 2016). The software can be used to generate synthetic versions of the original data, but it requires all data to be complete.

