Are deep learning models superior for missing data imputation in surveys? Evidence from an empirical comparison
Section 3. Simulation-based evaluation of imputation methods

Methods for missing data imputation are usually evaluated via real-data based simulations (van Buuren, 2018). Namely, one creates missing values from a complete dataset according to a missing data mechanism (Little and Rubin, 2014), imputes the missing values by a specific method, and then compares these imputed values with the original “true” values based on some metrics.

We first give a quick review of Rubin’s MI combination rules. Let $Q$ be the target estimand in the population, and let $q^{(l)}$ and $u^{(l)}$ be the point and variance estimates of $Q$ based on the $l$th imputed dataset, respectively. The MI point estimate of $Q$ is $\bar{q}_L = \sum_{l=1}^{L} q^{(l)} / L$, and the corresponding estimate of the variance is $T_L = (1 + 1/L)\, b_L + \bar{u}_L$, where $b_L = \sum_{l=1}^{L} (q^{(l)} - \bar{q}_L)^2 / (L - 1)$ and $\bar{u}_L = \sum_{l=1}^{L} u^{(l)} / L$. The confidence interval of $Q$ is constructed using $(\bar{q}_L - Q) \sim t_{\nu}(0, T_L)$, where $t_{\nu}$ is a $t$-distribution with $\nu = (L - 1)\,(1 + \bar{u}_L / [(1 + 1/L)\, b_L])^2$ degrees of freedom.
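The combination rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `rubin_combine` and the toy inputs are assumptions made for the example, and only the standard library is used.

```python
import math
from statistics import mean, variance

def rubin_combine(q, u):
    """Combine L point estimates q and variance estimates u with Rubin's rules.

    Hypothetical helper written for illustration; returns (q_bar, T, nu).
    """
    L = len(q)
    if L < 2 or len(u) != L:
        raise ValueError("need L >= 2 point estimates and as many variances")
    q_bar = mean(q)                  # MI point estimate
    u_bar = mean(u)                  # average within-imputation variance
    b = variance(q)                  # between-imputation variance (divisor L - 1)
    T = (1 + 1 / L) * b + u_bar      # total variance
    # Degrees of freedom; treated as infinite when all imputations agree (b == 0).
    nu = math.inf if b == 0 else (L - 1) * (1 + u_bar / ((1 + 1 / L) * b)) ** 2
    return q_bar, T, nu

# Toy inputs (made up): three imputed-data estimates of a proportion.
q_bar, T, nu = rubin_combine([0.50, 0.52, 0.48], [0.010, 0.010, 0.010])
```

A confidence interval then follows as $\bar{q}_L \pm t_{\nu, 1-\alpha/2} \sqrt{T_L}$. The $t$ quantile is left out of the sketch because the standard library does not provide one; a routine such as `scipy.stats.t.ppf` could supply it.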

The first step in our simulation-based evaluation procedure is choosing a dataset with all values observed, which is taken as the “population”. We then choose a set of target estimands $Q$ and compute their values from this population data, which are taken as the “ground truth”. The estimands are usually summary statistics of the variables or parameters in a downstream analysis model, e.g., a coefficient in a regression model (Tang, Song, Belin and Unützer, 2005; Huque, Carlin, Simpson and Lee, 2018). Second, we randomly draw without replacement $H$ samples of size $n$ from the population data, and in each sample $(h = 1, \ldots, H)$ create missing data according to a specific missing data mechanism and pre-fixed proportion of missingness. Third, for each simulated sample with missing data, we create $L$ imputed datasets using the imputation method under consideration and construct the point and interval estimate of each estimand using Rubin’s rules. Lastly, we compute performance metrics of each estimand from the quantities obtained in the previous step.
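The four steps can be sketched as a small simulation loop. Everything specific in the sketch is an assumption made for illustration: the toy binary population, the MCAR mechanism, and the random hot-deck imputer are placeholders and not the mechanisms or imputation methods compared in this paper.

```python
import random
from statistics import mean, variance

def run_simulation(population, H, n, miss_rate, L, seed=0):
    """Return (Q, results); results holds one (q_bar, T) pair per simulated sample."""
    rng = random.Random(seed)
    Q = mean(population)                       # step 1: ground truth from the population
    results = []
    for _ in range(H):
        sample = rng.sample(population, n)     # step 2: draw without replacement
        # Assumed MCAR mechanism: each value goes missing with probability miss_rate.
        observed = [y for y in sample if rng.random() >= miss_rate]
        if not observed:
            raise ValueError("every value in the sample went missing")
        n_missing = n - len(observed)
        q, u = [], []
        for _ in range(L):                     # step 3: create L imputed datasets
            # Placeholder imputer: random hot deck drawn from the observed values.
            imputed = observed + rng.choices(observed, k=n_missing)
            p = mean(imputed)
            q.append(p)
            u.append(p * (1 - p) / n)          # simple variance estimate of a proportion
        b = variance(q)                        # between-imputation variance
        results.append((mean(q), (1 + 1 / L) * b + mean(u)))
    return Q, results                          # step 4 computes metrics from these

population = [1] * 3000 + [0] * 7000           # toy binary "population"
Q, results = run_simulation(population, H=20, n=200, miss_rate=0.3, L=5)
```

Seeding the generator keeps the simulated samples reproducible, so different imputation methods can be compared on the same draws. The variance estimate ignores the finite population correction, which is a simplification of the sketch.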

In the empirical application, we select a large complete subsample from the American Community Survey (ACS), a national survey that bears the hallmarks of many big survey data, as our population. Since discrete variables are prevalent in the ACS, as well as in most survey data, we focus on the marginal probabilities of binary and categorical variables; e.g., a categorical variable with $K$ categories has $K - 1$ estimands. To evaluate how well the imputation methods preserve the multivariate distributional properties, similar to Akande et al. (2017), we also consider the bivariate probabilities of all two-way combinations of categories in binary and categorical variables. Another useful metric is the finite-sample pairwise correlations between continuous variables. For continuous variables, the common estimands are mean, median or variance. To facilitate meaningful comparisons of the results between the categorical and continuous variables, we propose to discretize each continuous variable into $K$ categories based on the sample quantiles. 
We then evaluate these binned continuous variables as categorical variables based on the aforementioned estimands of marginal and bivariate probabilities.
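The binning step can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the cut points come from the default rule of Python's `statistics.quantiles`, which may differ from the quantile definition used in the paper.

```python
from bisect import bisect_right
from collections import Counter
from statistics import quantiles

def bin_by_quantiles(values, K):
    """Map each value to a bin label in 0..K-1 using K - 1 sample quantile cut points."""
    cuts = quantiles(values, n=K)
    return [bisect_right(cuts, v) for v in values]

def marginal_probabilities(labels, K):
    """Share of observations falling in each of the K bins."""
    counts = Counter(labels)
    return [counts[k] / len(labels) for k in range(K)]

values = list(range(1, 101))                 # toy continuous variable
labels = bin_by_quantiles(values, K=4)
probs = marginal_probabilities(labels, K=4)
```

Once binned, the variable can be handled like any other categorical variable, so its marginal and bivariate probabilities are directly comparable with those of the native categorical variables.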

For each estimand $Q$, we consider three metrics. The first metric focuses on bias. To accommodate close-to-zero estimands that are prevalent in probabilities of categorical variables, we consider the absolute standardized bias (ASB) of each estimand $Q$:

$$\text{ASB} = \sum_{h=1}^{H} \left| \bar{q}_L^{(h)} - Q \right| \Big/ (H \cdot Q), \tag{3.1}$$

where $\bar{q}_L^{(h)}$ is the MI point estimate of $Q$ in simulation $h$.
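A minimal sketch of (3.1), with a hypothetical function name and made-up numbers, shows why the standardization matters for small probabilities: the same absolute error is a much larger relative error when $Q$ is close to zero.

```python
def absolute_standardized_bias(q_bars, Q):
    """ASB as in (3.1): mean absolute deviation of the MI estimates, divided by Q."""
    if Q == 0:
        raise ValueError("ASB is undefined for Q = 0")
    H = len(q_bars)
    return sum(abs(q - Q) for q in q_bars) / (H * Q)

# Both cases have an average absolute error of 0.01 (toy numbers).
asb_small_Q = absolute_standardized_bias([0.03, 0.01], Q=0.02)   # rare category
asb_large_Q = absolute_standardized_bias([0.51, 0.49], Q=0.50)   # common category
```

Here the rare category has an ASB of about 0.5 and the common category about 0.02, even though the raw errors are identical.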

The second metric is the relative mean squared error (Rel.MSE), which is the ratio between the MSE of estimating $Q$ from the imputed data and that from the sampled data before introducing the missing data:

$$\text{Rel.MSE} = \frac{\sum_{h=1}^{H} \left( \bar{q}_L^{(h)} - Q \right)^2}{\sum_{h=1}^{H} \left( \tilde{Q}^{(h)} - Q \right)^2}, \tag{3.2}$$

where $\bar{q}_L^{(h)}$ is defined earlier, and $\tilde{Q}^{(h)}$ is the prototype estimator of $Q$, i.e., the point estimate from the complete sampled data in simulation $h$.
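As a sketch of (3.2), the following hypothetical helper takes the MI estimates and the complete-sample estimates from the same $H$ simulations. The inputs are toy values, not results from the paper.

```python
def relative_mse(q_bars, q_tildes, Q):
    """Rel.MSE as in (3.2): squared error after imputation over squared error before missingness."""
    if len(q_bars) != len(q_tildes):
        raise ValueError("both estimate lists must cover the same H simulations")
    numerator = sum((q - Q) ** 2 for q in q_bars)
    denominator = sum((qt - Q) ** 2 for qt in q_tildes)
    if denominator == 0:
        raise ValueError("complete-sample estimates have zero squared error")
    return numerator / denominator

# Toy numbers: imputation doubles the typical error, so the ratio is about 4.
rel_mse = relative_mse(q_bars=[0.12, 0.08], q_tildes=[0.11, 0.09], Q=0.10)
```

A value near 1 indicates that imputation adds little error beyond the sampling error already present in the complete sample; larger values indicate a loss attributable to the missing data and the imputation method.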

The third metric is coverage rate, which is the proportion of the $\alpha\%$ (e.g., 95%) confidence intervals, denoted by $\text{CI}_h^{\alpha}$ $(h = 1, \ldots, H)$, in the $H$ simulations that contain the true $Q$:

$$\text{Coverage} = \sum_{h=1}^{H} \mathbf{1}\left\{ Q \in \text{CI}_h^{\alpha} \right\} \Big/ H. \tag{3.3}$$
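A sketch of (3.3) follows. It assumes each interval is supplied as a `(lower, upper)` pair and treats the interval as closed; both the function name and that convention are choices made for the example.

```python
def coverage_rate(intervals, Q):
    """Coverage as in (3.3): share of intervals (lower, upper) that contain Q."""
    if not intervals:
        raise ValueError("need at least one interval")
    hits = sum(1 for lower, upper in intervals if lower <= Q <= upper)
    return hits / len(intervals)

# Toy intervals from four simulations; two of them contain Q = 0.10.
coverage = coverage_rate(
    [(0.08, 0.12), (0.11, 0.15), (0.05, 0.105), (0.01, 0.09)], Q=0.10
)
```

With a nominal 95% level, a coverage rate well below 0.95 suggests that the intervals are too narrow or centered on a biased estimate.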

We recommend conducting a large number of simulations (e.g., $H \geq 100$) to obtain reliable estimates of MSE and coverage. This would not be a problem for deep learning algorithms, which can typically be completed in seconds even with large sample sizes. However, it can be computationally prohibitive for the MICE algorithms when each simulated dataset is large (e.g., $n = 100{,}000$ in some of our simulations). When one has to rely on only a few simulations, or even a single one, for evaluation, we propose a modified metric of bias. Specifically, for each categorical variable or binned continuous variable $j$, we define the weighted absolute bias (WAB) as the sum of the absolute bias weighted by the true marginal probability in each category:

$$\text{Weighted absolute bias} = \sum_{k=1}^{K} Q_{jk} \left| \bar{q}_{jk}^{(h)} - Q_{jk} \right|, \tag{3.4}$$

where $K$ is the total number of categories, $Q_{jk}$ is the population marginal probability of category $k$ in variable $j$, and $\bar{q}_{jk}^{(h)}$ is its corresponding point estimate in simulation $h$. We can also average the weighted absolute bias over a number of repeatedly simulated samples.
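The weighted absolute bias in (3.4) can be sketched for one variable and one simulated sample as below. The function name and the three-category example are assumptions made for illustration.

```python
def weighted_absolute_bias(Q_j, q_bar_j):
    """WAB as in (3.4) for one variable j: sum over categories of Q_jk * |q_bar_jk - Q_jk|."""
    if len(Q_j) != len(q_bar_j):
        raise ValueError("need one estimate per category")
    return sum(Qk * abs(qk - Qk) for Qk, qk in zip(Q_j, q_bar_j))

# Toy variable with K = 3 categories in a single simulation.
wab = weighted_absolute_bias(Q_j=[0.5, 0.3, 0.2], q_bar_j=[0.45, 0.33, 0.22])

# Averaging over several simulated samples, when more than one is available.
wab_mean = sum(
    weighted_absolute_bias([0.5, 0.3, 0.2], est)
    for est in ([0.45, 0.33, 0.22], [0.50, 0.30, 0.20])
) / 2
```

Because each category's error is weighted by its population probability, errors in common categories contribute more to the metric than errors in rare ones.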

The above procedure and metrics differ from the common practice in the machine learning literature. For example, many machine learning papers on missing data imputation conduct simulations on benchmark datasets, but these data often have vastly different structure and features from survey data and thus are less informative for the goal of this paper. One such dataset is the Breast Cancer dataset in the UCI Machine Learning Repository (Dua and Graff, 2017), which has only 569 sample units and no categorical variables. Also, these simulations are usually based on repeatedly creating missing values at random in a single dataset rather than on drawing repeated samples from a population, and thus fail to account for the sampling mechanism. Moreover, these evaluations often use metrics focusing on accuracy of individual predictions rather than distributional features. Specifically, the most commonly used metrics are the root mean squared error (RMSE) and accuracy (Gondara and Wang, 2018; Yoon, Jordon and Schaar, 2018; Lu et al., 2020). Both metrics can be defined in an overall or variable-specific fashion, but the machine learning literature usually focuses on the overall version. The overall RMSE is defined as

$$\text{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \sum_{j} M_{ij} \left( \hat{Y}_{ij} - Y_{ij} \right)^2}{\sum_{i=1}^{n} \sum_{j} M_{ij}}}, \tag{3.5}$$

where $Y_{ij}$ is the value of continuous variable $j$ for individual $i$ in the complete data before introducing missing data, and $\hat{Y}_{ij}$ is the corresponding imputed value. For non-missing values (i.e., $M_{ij} = 0$), $Y_{ij} = \hat{Y}_{ij}$, so only the imputed entries ($M_{ij} = 1$) contribute to (3.5). The (overall) accuracy is defined for categorical variables: it is the proportion of imputed values that equal the corresponding original “true” value:

$$\text{Accuracy} = \frac{\sum_{i=1}^{n} \sum_{j \in S_{\text{cat}}} M_{ij}\, \mathbf{1}\left( \hat{Y}_{ij} = Y_{ij} \right)}{\sum_{i=1}^{n} \sum_{j \in S_{\text{cat}}} M_{ij}}, \tag{3.6}$$

where $S_{\text{cat}}$ is the set of categorical variables.
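For comparison with the estimand-based metrics, the overall RMSE (3.5) and accuracy (3.6) can be sketched as follows. The sketch assumes that $M_{ij} = 1$ marks an entry that was set to missing and then imputed, which is the reading under which both formulas average over imputed cells. The data are stored as lists of rows, and the function names are hypothetical.

```python
import math

def overall_rmse(Y, Y_hat, M):
    """RMSE as in (3.5) over continuous entries with M[i][j] == 1 (imputed cells)."""
    n_imputed = sum(m for row in M for m in row)
    if n_imputed == 0:
        raise ValueError("no imputed entries")
    squared_error = sum(
        m * (y_hat - y) ** 2
        for row_y, row_hat, row_m in zip(Y, Y_hat, M)
        for y, y_hat, m in zip(row_y, row_hat, row_m)
    )
    return math.sqrt(squared_error / n_imputed)

def overall_accuracy(Y, Y_hat, M):
    """Accuracy as in (3.6) over categorical entries with M[i][j] == 1."""
    n_imputed = sum(m for row in M for m in row)
    if n_imputed == 0:
        raise ValueError("no imputed entries")
    matches = sum(
        m * (y_hat == y)
        for row_y, row_hat, row_m in zip(Y, Y_hat, M)
        for y, y_hat, m in zip(row_y, row_hat, row_m)
    )
    return matches / n_imputed

M = [[0, 1], [1, 0]]                                   # toy missingness pattern
rmse = overall_rmse([[1.0, 2.0], [3.0, 4.0]], [[1.0, 4.0], [6.0, 4.0]], M)
acc = overall_accuracy([["a", "b"], ["c", "a"]], [["a", "b"], ["a", "a"]], M)
```

Both functions return one number for the whole dataset, which is why they cannot distinguish an imputation method that preserves joint distributions from one that does not.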

A number of caveats are in order for the RMSE and accuracy metrics. First, they are usually computed on a single imputed sample as an overall measure of an imputation method, but this ignores the uncertainty of imputations. Second, both RMSE and accuracy are single-value summaries and do not capture the multivariate distributional features of the data. Third, RMSE does not adjust for the different scales of variables and can be easily dominated by a few outliers; also, it is often computed without differentiating between continuous and categorical variables. Lastly, when there are multiple ($L$) imputed datasets, a common approach is to use the mean of the $L$ imputed values as $\hat{Y}_{ij}$ in (3.5), but the statistical meaning of the resulting metrics is opaque. This is particularly problematic for categorical variables. For these reasons, we warn against using the overall RMSE and accuracy as the only metrics for comparing imputation methods, and one should exercise caution when interpreting them.
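The second caveat can be made concrete with a constructed toy example, which is not taken from the paper's experiments. Imputing every missing value with the most common category scores higher on accuracy than an imputation that reproduces the true category shares, even though it destroys the marginal distribution.

```python
from collections import Counter

def accuracy(truth, imputed):
    """Share of imputed values equal to the true values."""
    return sum(t == y for t, y in zip(truth, imputed)) / len(truth)

def share(values, category):
    """Marginal share of one category."""
    return Counter(values)[category] / len(values)

# Ten values that were set to missing; the true share of "a" is 0.6.
truth = ["a"] * 6 + ["b"] * 4
mode_imputed = ["a"] * 10                       # always impute the most common category
mixed_imputed = ["a", "a", "a", "b", "b", "b", "a", "a", "a", "b"]  # keeps the 0.6 share

acc_mode, share_mode = accuracy(truth, mode_imputed), share(mode_imputed, "a")
acc_mixed, share_mixed = accuracy(truth, mixed_imputed), share(mixed_imputed, "a")
```

In this example the mode imputation has accuracy 0.6 but an imputed share of 1.0 for category "a", while the second imputation has accuracy 0.4 and recovers the true share of 0.6. A metric based on marginal probabilities, such as the weighted absolute bias, would rank the two the other way round.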

