Model-assisted calibration of non-probability sample survey data using adaptive LASSO
Section 4. Simulation study

We design a simulation to evaluate the finite-sample properties of $\hat{T}_y^{\text{LASSO}}$ and the asymptotic variance estimates of $\hat{T}_y^{\text{LASSO}}$, $v^{\text{LASSO}}$ and $v_g^{\text{LASSO}}$. We also consider a naive bootstrap estimator, $v_{\text{boot}}^{\text{LASSO}}$, obtained by drawing 500 samples with replacement from each simulation sample, as an alternative variance estimator of $\hat{T}_y^{\text{LASSO}}$.
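The naive bootstrap described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `naive_bootstrap_variance`, the assumption that each resample has the same size as the simulation sample, and the stand-in estimator (a simple expansion total rather than the LASSO calibration estimator) are all ours.

```python
import numpy as np

def naive_bootstrap_variance(sample, estimator, n_boot=500, seed=0):
    """Variance of `estimator` over n_boot resamples drawn with replacement.

    ASSUMPTION: each resample has the same size n as the original sample.
    """
    rng = np.random.default_rng(seed)
    n = sample.shape[0]
    replicates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)      # row indices, with replacement
        replicates[b] = estimator(sample[idx])
    return replicates.var(ddof=1)

# Stand-in estimator (hypothetical): expansion total N * mean(y).
N = 100_000
rng = np.random.default_rng(1)
y = rng.normal(loc=1.0, scale=2.0, size=250)
v_boot = naive_bootstrap_variance(y, lambda s: N * s.mean())
```

In the simulation itself, `estimator` would be replaced by the full LASSO calibration procedure, refitted on every resample.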

To simulate non-probability samples, we generate samples with unequal selection probabilities but set the design weights to $\mathbf{d}^{A} = N/n$. We also consider $\hat{T}_y^{\text{GREG}}$ (the traditional calibration estimator) and $\hat{T}_y^{\text{HT}}$ (the pure design-based Horvitz-Thompson estimator). Because $\hat{T}_y^{\text{LASSO}}$ performs both variable selection and estimation, we implement backward stepwise selection to choose the working model for GREG. Although there is no theoretical justification for using stepwise variable selection, Skinner and Silva (1997) have shown that, given two auxiliary variables, a stepwise procedure can improve the efficiency of the GREG estimator.
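For orientation, the two comparison estimators with constant weights $N/n$ can be sketched as below. This assumes the standard linear GREG form (weighted total plus a regression adjustment to known population totals) and omits the backward stepwise selection step; the function names and the toy population are hypothetical.

```python
import numpy as np

def ht_total(y, N):
    """Expansion estimator of the total with constant weights d = N / n."""
    return (N / y.shape[0]) * y.sum()

def greg_total(y, X, N, Tx):
    """Linear GREG total with constant weights d = N / n.

    X is the n-by-p sample covariate matrix (no intercept column) and
    Tx holds the p known population totals of the covariates.
    """
    n = y.shape[0]
    d = np.full(n, N / n)
    Xa = np.column_stack([np.ones(n), X])     # add an intercept column
    Ta = np.concatenate([[N], Tx])            # the intercept's population total is N
    w = np.sqrt(d)
    # Weighted least squares for the working model coefficients.
    beta = np.linalg.lstsq(Xa * w[:, None], y * w, rcond=None)[0]
    return d @ y + (Ta - d @ Xa) @ beta

# Toy check: when y is exactly linear in X, GREG recovers the true total.
rng = np.random.default_rng(0)
N_pop = 1_000
X_pop = rng.normal(size=(N_pop, 2))
y_pop = 1.0 + X_pop @ np.array([2.0, -1.0])
s = rng.choice(N_pop, size=50, replace=False)
greg = greg_total(y_pop[s], X_pop[s], N_pop, X_pop.sum(axis=0))
ht = ht_total(y_pop[s], N_pop)
```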
We are interested in the performance of each estimator under (1) populations with different signal-to-noise ratios (SNR), (2) independent, informative, and biased sampling schemes, and (3) small and large sample sizes. The signal-to-noise ratio is calculated according to the definitions in Czanner, Sarma, Eden and Brown (2008). We set two levels of correlation (low/high) between covariates, crossed with two levels of effect size (low/high) of the covariates. We set the low/high and high/low populations to have the same SNR in order to understand the influence of correlation and effect size on each estimator's performance at a fixed SNR. Three sampling schemes are used to draw samples: simple random sampling without replacement, SRS; Poisson sampling with selection probabilities proportional to covariates, $\text{POI}(X)$; and Poisson sampling with selection probabilities proportional to covariates and the outcome, $\text{POI}(X+Y)$. $\text{POI}(X+Y)$ sampling simulates the self-selection bias of non-probability samples, where the propensity of a respondent to participate in a study relates to the analysis variable. We consider two sample sizes: 250 and 1,000. Thus we have a total of $2 \times 2 \times 3 \times 2 = 24$ experimental groups.
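The factorial layout can be enumerated directly. The labels below are illustrative names for the factor levels, not identifiers from the paper.

```python
from itertools import product

correlation = ["low", "high"]              # correlation between covariates
effect_size = ["low", "high"]              # size of the non-zero coefficients
scheme = ["SRS", "POI(X)", "POI(X+Y)"]     # sampling scheme
sample_size = [250, 1000]

# One tuple per experimental group: (correlation, effect size, scheme, n).
groups = list(product(correlation, effect_size, scheme, sample_size))
```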

4.1  Population

To create collinearity among covariates, we follow an auto-decay correlation structure commonly used in LASSO-related simulations (Tibshirani, 1996): $\text{cor}(X_i, X_j) = \rho^{|i-j|}$, $i = 1, \ldots, p$. We generate a population of size $N = 100{,}000$ from a multivariate normal distribution with mean $\mathbf{0}_{(p \times 1)}$ and covariance $\Sigma^{\rho}$, $p = 40$. The continuous outcome variable is generated by the regression model:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_{40} x_{i40} + N(0, 3).$$

The binary outcome variable is generated by the logistic regression model:

$$
\begin{aligned}
\phi_i &= \text{expit}(\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_{40} x_{i40}), \qquad \text{expit}(u) = (1 + \exp(-u))^{-1} \\
y_i &= \text{bernoulli}(\phi_i).
\end{aligned}
$$

We set $\rho = 0.15$ for the low-correlation population and $\rho = 0.73$ for the high-correlation population. For both continuous and binary outcome variables:

$$
\begin{array}{lll}
\text{Low effect-size} & \boldsymbol{\beta}^{(1)} & := \beta_{12} \ldots \beta_{19},\ \beta_{32} \ldots \beta_{39} = 0.45 \\
\text{High effect-size} & \boldsymbol{\beta}^{(1)} & := \beta_{12} \ldots \beta_{19},\ \beta_{32} \ldots \beta_{39} = 0.74.
\end{array}
$$

For continuous $\mathbf{y}$: $\beta_0 = 1$; for binary $\mathbf{y}$: $\beta_0 = 0.4$. The rest of the $\beta_i = 0$. Out of 41 regression parameters, 16 are non-zero and 25 are zero.
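The population described in this subsection can be generated along the following lines. This is a sketch under stated assumptions: the function name `make_population` and its arguments are ours; the error term $N(0, 3)$ is read as having standard deviation 3, which is our interpretation (it appears consistent with the SRS variances reported in Table 4.1, but the source does not say so); and the binary outcome uses the standard expit, $1/(1 + \exp(-u))$.

```python
import numpy as np

def make_population(rho, effect, N=100_000, p=40, binary=False, seed=0):
    """Covariates with cor(X_i, X_j) = rho^|i-j| and a sparse linear predictor."""
    rng = np.random.default_rng(seed)
    idx = np.arange(p)
    Sigma = rho ** np.abs(idx[:, None] - idx[None, :])   # auto-decay correlation
    L = np.linalg.cholesky(Sigma)
    X = rng.standard_normal((N, p)) @ L.T                # rows ~ MVN(0, Sigma)

    beta = np.zeros(p)
    beta[11:19] = effect      # beta_12 ... beta_19 (1-based indexing in the text)
    beta[31:39] = effect      # beta_32 ... beta_39
    beta0 = 0.4 if binary else 1.0
    eta = beta0 + X @ beta

    if binary:
        phi = 1.0 / (1.0 + np.exp(-eta))                 # standard expit
        y = rng.binomial(1, phi)
    else:
        # ASSUMPTION: N(0, 3) is taken to mean standard deviation 3.
        y = eta + rng.normal(0.0, 3.0, size=N)
    return X, y, beta

# Example: low correlation, low effect size, continuous outcome (small N for speed).
X, y, beta = make_population(rho=0.15, effect=0.45, N=2_000)
Xb, yb, _ = make_population(rho=0.73, effect=0.74, N=2_000, binary=True)
```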

4.2  Sampling schemes

Three sampling schemes are used to generate the sample:

  1. Simple random sampling (SRS): selection probabilities $= n/N$.
  2. Poisson sampling with probabilities proportional to $\mathbf{X}$, $\text{POI}(X)$.

$$
\begin{cases}
\text{continuous } \mathbf{y}: & \pi_i \propto 0.4 + 0.4 x_{i5} + 0.4 x_{i15} + 0.4 x_{i25} + 0.4 x_{i35} \\
\text{binary } \mathbf{y}: & \text{logit}(\pi_i) = 0.4 + 0.4 x_{i5} + 0.4 x_{i15} + 0.4 x_{i25} + 0.4 x_{i35}.
\end{cases}
$$

  3. Poisson sampling with probabilities proportional to $\mathbf{X}$ and $\mathbf{y}$, $\text{POI}(X+Y)$.

$$
\begin{cases}
\text{continuous } \mathbf{y}: & \pi_i \propto 0.4 + 0.4 x_{i5} + 0.4 x_{i15} + 0.4 x_{i25} + 0.4 x_{i35} + 0.5 y_i \\
\text{binary } \mathbf{y}: & \text{logit}(\pi_i) \propto 0.4 + 0.4 x_{i5} + 0.4 x_{i15} + 0.4 x_{i25} + 0.4 x_{i35} + y_i.
\end{cases}
$$
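A Poisson draw with probabilities proportional to a size measure might look like the sketch below. The source does not say how the linear size measure is made positive or how the probabilities are scaled; here we assume, purely for illustration, that non-positive measures are shifted to be positive and that the probabilities are scaled to an expected sample size of $n$. The function name `poisson_sample` is hypothetical.

```python
import numpy as np

def poisson_sample(size_measure, n, rng):
    """Include each unit independently with probability pi_i.

    ASSUMPTIONS (not from the source): a non-positive size measure is shifted
    to be strictly positive, and pi is scaled so that sum(pi) is about n.
    """
    m = np.asarray(size_measure, dtype=float)
    if m.min() <= 0:
        m = m - m.min() + 1e-3
    pi = np.minimum(1.0, n * m / m.sum())
    selected = np.flatnonzero(rng.random(m.shape[0]) < pi)
    return selected, pi

# POI(X) for a continuous outcome, using x_5, x_15, x_25, x_35 (0-based columns).
rng = np.random.default_rng(0)
N, n = 20_000, 250
X = rng.standard_normal((N, 40))
lin = 0.4 + 0.4 * X[:, [4, 14, 24, 34]].sum(axis=1)
sampled, pi = poisson_sample(lin, n, rng)
```

For POI(X+Y), the term `0.5 * y` would be added to `lin`. For the binary case, one option (again an assumption) is to pass the expit of the linear predictor as the size measure. Note that the realized sample size is random under Poisson sampling.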

4.3  Evaluation metrics

We evaluate the empirical bias, variance, and RMSE of each estimator of the total. We evaluate the asymptotic variance estimates and the bootstrap variance estimates by the coverage of their nominal 95% confidence intervals and by their %bias relative to the empirical variance. We use the normal approximation to generate confidence intervals. We calculate %bias as $\%\text{bias} = 100\,[v - \text{var}(\hat{T}_y^{\text{LASSO}})] / \text{var}(\hat{T}_y^{\text{LASSO}})$, where $\text{var}(\hat{T}_y^{\text{LASSO}})$ is the empirical variance obtained from the simulation samples.
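These metrics can be computed from the simulation replicates roughly as follows. The function `evaluate` is a hypothetical helper, and we assume that $v$ in the %bias formula is the average of the variance estimates across simulation samples, which the text does not state explicitly.

```python
import numpy as np

def evaluate(estimates, var_estimates, true_total, z=1.959964):
    """Empirical bias, variance, RMSE, 95% coverage, and %bias of a variance estimator."""
    est = np.asarray(estimates, dtype=float)
    v = np.asarray(var_estimates, dtype=float)
    emp_var = est.var(ddof=1)                          # empirical variance over replicates
    half = z * np.sqrt(v)                              # normal-approximation half-width
    covered = (est - half <= true_total) & (true_total <= est + half)
    return {
        "bias": est.mean() - true_total,
        "var": emp_var,
        "rmse": np.sqrt(np.mean((est - true_total) ** 2)),
        "coverage": covered.mean(),
        # ASSUMPTION: v is averaged over the simulation samples.
        "pct_bias": 100.0 * (v.mean() - emp_var) / emp_var,
    }

# Toy check: an unbiased estimator with a correct variance estimate.
rng = np.random.default_rng(0)
est = rng.normal(100.0, 2.0, size=1_000)
res = evaluate(est, np.full(1_000, 4.0), true_total=100.0)
```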

4.4  Simulation results

The simulation results are based on $S = 1{,}000$ simulated samples for each experimental group. Table 4.1 lists the bias, variance, and root mean square error of each estimator under the different experimental designs for estimating the total of a continuous outcome variable. Table 4.2 lists the corresponding results for estimating the total of a binary outcome variable.

4.4.1  Root mean square error

Under SRS, all estimators are approximately unbiased, and LASSO and GREG perform about equally well, both improving on HT. $\text{POI}(X)$ and $\text{POI}(X+Y)$ induce biased samples by selecting cases with larger covariate values with higher probabilities. Under $\text{POI}(X+Y)$, the selection also favors cases with larger outcome values. The absolute bias of LASSO decreases relative to GREG as SNR increases. This improvement is more dramatic in the binary case than in the continuous case, especially for $\text{POI}(X+Y)$. In terms of RMSE, LASSO offers only a marginal improvement over GREG for estimating totals of continuous outcome variables. The improvement is more noticeable, about 3%, when there are highly correlated predictors in the model. For the binary setting, there is a substantial improvement in RMSE for LASSO over GREG as SNR increases, with reductions of about 20% for the $\text{POI}(X)$ setting and nearly 50% for the $\text{POI}(X+Y)$ setting when SNR is large. In particular, the low/high and high/low population types have the same SNR, so the difference in performance between LASSO and GREG can be attributed to correlation or effect size. LASSO performs better in both bias and RMSE in the high/low population type, suggesting that LASSO has a stronger advantage over GREG when there are highly correlated predictors in the model. This in turn suggests that, in the presence of multicollinearity, LASSO selects variables better than the stepwise variable selection procedure used for GREG.
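As a check on the percentages quoted above, the relative reduction can be computed from the rmse columns of Tables 4.1 and 4.2. The formula and the choice of table cells are our reading of the text, not something the source states.

```latex
\[
\text{reduction} =
  \frac{\text{RMSE}_{\text{GREG}} - \text{RMSE}_{\text{LASSO}}}{\text{RMSE}_{\text{GREG}}}
\]
% Binary outcome, high/high population, n = 1,000 (Table 4.2):
\[
\text{POI}(X):\ \frac{1.1 - 0.9}{1.1} \approx 18\%,
\qquad
\text{POI}(X+Y):\ \frac{1.8 - 1.0}{1.8} \approx 44\%.
\]
% Continuous outcome, high/low population, n = 250, SRS (Table 4.1):
\[
\frac{20.6 - 20.0}{20.6} \approx 2.9\%.
\]
```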

Table 4.1
Simulation summary for continuous outcome: total, bias, and RMSE $\times 10^3$; variance $\times 10^6$

| Population (correlation/effect size) | n | Sampling scheme | HT bias | HT var | HT rmse | GREG bias | GREG var | GREG rmse | LASSO bias | LASSO var | LASSO rmse |
|---|---|---|---|---|---|---|---|---|---|---|---|
| low/low (T = 100.8, SNR = 0.47) | 250 | SRS | 0.5 | 546 | 23.3 | 0.9 | 425 | 20.6 | 0.9 | 428 | 20.7 |
| | 250 | POI(X) | 12.4 | 525 | 26.0 | -0.6 | 446 | 21.1 | -0.4 | 441 | 21.0 |
| | 250 | POI(X+Y) | 19.4 | 519 | 29.9 | 4.6 | 443 | 21.5 | 4.7 | 431 | 21.3 |
| | 1,000 | SRS | 0.2 | 129 | 11.4 | 0.3 | 94 | 9.6 | 0.3 | 94 | 9.7 |
| | 1,000 | POI(X) | 12.6 | 129 | 17.0 | -0.1 | 91 | 9.5 | -0.2 | 92 | 9.6 |
| | 1,000 | POI(X+Y) | 19.7 | 128 | 22.7 | 4.9 | 91 | 10.7 | 5.0 | 91 | 10.7 |
| low/high (T = 101.4, SNR = 1.26) | 250 | SRS | 0.4 | 849 | 29.1 | 0.9 | 415 | 20.4 | 1.0 | 417 | 20.4 |
| | 250 | POI(X) | 21.1 | 818 | 35.6 | -1.3 | 434 | 20.9 | -1.0 | 432 | 20.8 |
| | 250 | POI(X+Y) | 31.7 | 817 | 42.7 | 3.7 | 427 | 21.0 | 4.0 | 427 | 21.1 |
| | 1,000 | SRS | 0.0 | 200 | 14.1 | 0.3 | 94 | 10.0 | 0.3 | 93 | 9.7 |
| | 1,000 | POI(X) | 21.1 | 199 | 25.4 | -0.1 | 91 | 9.6 | -0.2 | 90 | 9.6 |
| | 1,000 | POI(X+Y) | 31.7 | 196 | 34.6 | 4.9 | 91 | 10.7 | 4.8 | 89 | 10.6 |
| high/low (T = 101.8, SNR = 1.26) | 250 | SRS | 0.1 | 941 | 30.7 | 1.0 | 421 | 20.6 | 1.0 | 399 | 20.0 |
| | 250 | POI(X) | 50.2 | 895 | 58.5 | -0.7 | 434 | 20.8 | -1.6 | 402 | 20.1 |
| | 250 | POI(X+Y) | 57.8 | 872 | 64.9 | 4.0 | 435 | 21.2 | 3.0 | 399 | 20.2 |
| | 1,000 | SRS | 0.0 | 218 | 14.8 | 0.3 | 94 | 9.7 | 0.3 | 93 | 9.6 |
| | 1,000 | POI(X) | 50.6 | 210 | 53.0 | -0.1 | 93 | 9.7 | -0.5 | 91 | 9.6 |
| | 1,000 | POI(X+Y) | 58.2 | 209 | 59.9 | 4.7 | 95 | 10.8 | 4.2 | 92 | 10.5 |
| high/high (T = 103.1, SNR = 3.41) | 250 | SRS | -0.4 | 1,897 | 43.6 | 0.8 | 436 | 20.9 | 1.0 | 407 | 20.2 |
| | 250 | POI(X) | 83.3 | 1,826 | 93.7 | -0.8 | 435 | 20.9 | -1.5 | 406 | 20.2 |
| | 250 | POI(X+Y) | 96.4 | 1,779 | 105.3 | 3.7 | 428 | 21.0 | 3.0 | 404 | 20.3 |
| | 1,000 | SRS | -0.2 | 444 | 21.0 | 0.3 | 93 | 9.7 | 0.3 | 93 | 9.7 |
| | 1,000 | POI(X) | 83.6 | 424 | 86.1 | -0.2 | 93 | 9.7 | -0.5 | 91 | 9.6 |
| | 1,000 | POI(X+Y) | 96.9 | 423 | 99.0 | 4.4 | 94 | 10.6 | 4.1 | 92 | 10.4 |

Table 4.2
Simulation summary for binary outcome: total, bias, and RMSE $\times 10^3$; variance $\times 10^6$

| Population (correlation/effect size) | n | Sampling scheme | HT bias | HT var | HT rmse | GREG bias | GREG var | GREG rmse | LASSO bias | LASSO var | LASSO rmse |
|---|---|---|---|---|---|---|---|---|---|---|---|
| low/low (T = 56.2, SNR = 0.51) | 250 | SRS | 0.0 | 10.2 | 3.2 | 0.0 | 7.2 | 2.7 | 0.0 | 7.0 | 2.7 |
| | 250 | POI(X) | 2.6 | 10.0 | 4.1 | 0.2 | 8.0 | 2.8 | 0.1 | 7.8 | 2.8 |
| | 250 | POI(X+Y) | 4.9 | 9.8 | 5.8 | 2.0 | 8.1 | 3.5 | 1.8 | 7.8 | 3.3 |
| | 1,000 | SRS | -0.0 | 2.7 | 1.6 | 0.0 | 1.7 | 1.3 | 0.0 | 1.6 | 1.3 |
| | 1,000 | POI(X) | 2.5 | 2.4 | 2.9 | 0.0 | 1.8 | 1.3 | -0.0 | 1.7 | 1.3 |
| | 1,000 | POI(X+Y) | 4.7 | 2.3 | 5.0 | 1.8 | 1.8 | 2.2 | 1.6 | 1.7 | 2.1 |
| low/high (T = 54.4, SNR = 1.10) | 250 | SRS | -0.0 | 10.8 | 3.3 | 0.0 | 6.1 | 2.5 | 0.1 | 5.4 | 2.3 |
| | 250 | POI(X) | 3.0 | 10.2 | 4.4 | 0.1 | 6.1 | 2.5 | 0.1 | 5.8 | 2.4 |
| | 250 | POI(X+Y) | 5.3 | 9.8 | 6.2 | 1.6 | 6.2 | 2.9 | 1.3 | 5.8 | 2.8 |
| | 1,000 | SRS | -0.0 | 2.7 | 1.6 | 0.0 | 1.3 | 1.1 | 0.0 | 1.1 | 1.0 |
| | 1,000 | POI(X) | 2.9 | 2.4 | 3.3 | 0.0 | 1.4 | 1.2 | -0.1 | 1.2 | 1.1 |
| | 1,000 | POI(X+Y) | 5.2 | 2.2 | 5.4 | 1.4 | 1.4 | 1.8 | 1.1 | 1.2 | 1.6 |
| high/low (T = 54.2, SNR = 1.10) | 250 | SRS | -0.0 | 10.3 | 3.2 | 0.0 | 5.8 | 2.4 | 0.1 | 4.9 | 2.2 |
| | 250 | POI(X) | 6.6 | 9.6 | 7.3 | 0.3 | 6.2 | 2.5 | -0.2 | 4.8 | 2.2 |
| | 250 | POI(X+Y) | 8.6 | 9.3 | 9.1 | 1.8 | 6.3 | 3.1 | 0.9 | 4.9 | 2.4 |
| | 1,000 | SRS | -0.0 | 2.5 | 1.6 | 0.0 | 1.2 | 1.1 | 0.0 | 1.0 | 1.0 |
| | 1,000 | POI(X) | 6.6 | 2.2 | 6.7 | 0.2 | 1.4 | 1.2 | -0.2 | 1.1 | 1.1 |
| | 1,000 | POI(X+Y) | 8.5 | 2.1 | 8.7 | 1.6 | 1.4 | 2.0 | 1.0 | 1.0 | 1.4 |
| high/high (T = 52.8, SNR = 2.75) | 250 | SRS | -0.1 | 10.2 | 3.1 | -0.0 | 5.2 | 2.3 | 0.1 | 3.8 | 1.9 |
| | 250 | POI(X) | 7.1 | 9.8 | 7.8 | 0.3 | 5.7 | 2.4 | -0.2 | 3.6 | 1.9 |
| | 250 | POI(X+Y) | 9.1 | 9.4 | 9.6 | 1.5 | 5.7 | 2.8 | 0.5 | 3.7 | 2.0 |
| | 1,000 | SRS | -0.1 | 2.5 | 1.6 | -0.0 | 1.1 | 1.0 | 0.0 | 0.6 | 0.8 |
| | 1,000 | POI(X) | 7.1 | 2.2 | 7.2 | 0.2 | 1.3 | 1.1 | -0.2 | 0.7 | 0.9 |
| | 1,000 | POI(X+Y) | 9.1 | 2.2 | 9.2 | 1.4 | 1.2 | 1.8 | 0.5 | 0.7 | 1.0 |

4.4.2  LASSO variance estimates

Tables 4.3 and 4.4 list the coverage of nominal 95% confidence intervals and the percent bias for each of the two asymptotic closed-form variance estimators developed in this research, as well as for the naive bootstrap variance estimate of the LASSO calibration estimator.

For continuous outcomes, bootstrap variances have coverages that are consistently close to 95% under the SRS and POI(X) sampling schemes for both sample sizes. Under the POI(X+Y) sampling scheme, there is very modest undercoverage in Table 4.3, mainly at n = 1,000 (92.1% to 93.4%). The closed-form variances have coverages that are sensitive to both sample size and sampling scheme, with smaller samples tending to undercover, particularly under the POI(X+Y) sampling scheme. The difference in coverage between small and large sample sizes is expected, since the variance estimates are asymptotic and improve with larger samples. In terms of bias of the variance estimators, there is evidence that bias decreases as SNR increases. With the same SNR, both the asymptotic closed-form and the bootstrap variances have smaller bias given predictors with high correlations relative to predictors with high effect sizes. Closed-form variances tend to underestimate the empirical variance, especially when the sample size is small. Overall, there is very little difference between the two closed-form variance estimates. The bootstrap variance tends to overestimate the empirical variance, but its absolute bias is generally smaller than that of the closed-form variance estimates.
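For reference, the naive bootstrap compared above resamples units with replacement from each simulation sample (500 resamples in this study) and takes the variance of the re-computed estimates. The sketch below shows that idea for a generic estimator; the function `naive_bootstrap_variance` and the `estimator` callback are hypothetical placeholders, and the LASSO calibration estimator itself is not reproduced here.

```python
import numpy as np

def naive_bootstrap_variance(sample, estimator, n_boot=500, seed=0):
    """Naive bootstrap variance of `estimator` evaluated on `sample`.

    sample    : array whose first axis indexes sampled units
    estimator : hypothetical callable mapping a resampled array to a number
    n_boot    : number of bootstrap resamples (500 in the study)
    seed      : seed for reproducibility (an assumption of this sketch)
    """
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample)
    n = sample.shape[0]
    replicates = np.empty(n_boot)
    for b in range(n_boot):
        # Draw n unit indices with replacement, ignoring any design information
        idx = rng.integers(0, n, size=n)
        replicates[b] = estimator(sample[idx])
    # Variance of the bootstrap replicates is the variance estimate
    return replicates.var(ddof=1)

if __name__ == "__main__":
    # Hypothetical usage: expansion estimator of a total with population size N
    N = 10_000
    y = np.random.default_rng(1).normal(50.0, 10.0, size=250)
    print(naive_bootstrap_variance(y, lambda s: N * s.mean()))
```

The resampling here treats units as exchangeable, which is what makes the procedure "naive": it does not attempt to mimic the selection mechanism.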

For binary outcomes, both asymptotic closed-form and bootstrap variance estimates are sensitive to sample size, sampling scheme, and SNR. Bootstrap variance coverages are consistently close to 95% under SRS and POI(X) for both sample sizes and all population types, but range from 75% to 94% under POI(X+Y). Under POI(X+Y), the bootstrap variance coverages are better with sample size 250 than with sample size 1,000, where the bias becomes a larger part of the RMSE, and better with high-correlation populations than with low-correlation populations. In terms of coverage, closed-form variances show a similar trend under POI(X+Y) as the bootstrap: better coverage with smaller samples than with larger samples, and better coverage with high-correlation populations than with low-correlation populations. Under SRS and POI(X), closed-form variance coverage improves as sample size increases. In terms of bias, both bootstrap and closed-form variances have smaller bias with larger sample sizes. Holding sample size fixed, closed-form variance estimates have larger bias as SNR increases. The same trend is not observed in the bootstrap variance estimates. Similar to the continuous outcome results, closed-form variances tend to underestimate the empirical variance, especially when the sample size is small. Unlike the continuous outcome results, there is evidence that the g-weighted closed-form variance estimates have better bias properties than the unweighted closed-form variance estimates. The bootstrap variance tends to overestimate the empirical variance. However, its biases are much smaller than those of the closed-form variance estimates.
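One way to see why coverage can deteriorate at the larger sample size when the bias becomes a larger part of the RMSE is a simple normal approximation. The sketch below assumes the point estimator is normally distributed around the truth plus a fixed bias, with a correctly estimated standard error; the function names and the numbers are hypothetical and are not taken from the study.

```python
from math import erf, sqrt

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def wald_coverage(bias, se, z=1.96):
    """Coverage of estimate +/- z*se when estimate ~ Normal(truth + bias, se^2).

    The interval covers the truth when -z <= (estimate - truth)/se <= z,
    and (estimate - truth)/se is Normal(bias/se, 1) under the assumption.
    """
    shift = bias / se
    return normal_cdf(z - shift) - normal_cdf(-z - shift)

if __name__ == "__main__":
    # Hypothetical setting: the bias stays fixed while the standard error
    # halves as n goes from 250 to 1,000
    for n, se in [(250, 2.0), (1000, 1.0)]:
        print(n, round(100 * wald_coverage(bias=1.0, se=se), 1))
```

With an unbiased estimator the function returns about 95%. Holding the bias fixed while the standard error shrinks increases the ratio of bias to standard error, so coverage falls even if the variance estimator itself is accurate.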

Table 4.3
95% nominal coverage and %bias of variance estimates for LASSO
Continuous outcome; value columns after scheme: coverage (first three), %bias (last three)
Population n scheme $v^{\mathrm{LASSO}}$ $v_g^{\mathrm{LASSO}}$ $v_{\mathrm{boot}}^{\mathrm{LASSO}}$ $v^{\mathrm{LASSO}}$ $v_g^{\mathrm{LASSO}}$ $v_{\mathrm{boot}}^{\mathrm{LASSO}}$
low/low 250 SRS 91.7% 91.8% 95.4% -22.6% -22.3% 2.9%
POI(X) 91.2% 91.2% 96.1% -25.1% -24.5% 5.7%
POI(X+Y) 89.6% 89.9% 95.4% -23.5% -22.8% 7.9%
1,000 SRS 93.2% 93.2% 93.8% -7.3% -7.2% -0.3%
POI(X) 94.0% 93.9% 95.5% -5.7% -5.3% 6.6%
POI(X+Y) 90.0% 90.1% 92.1% -4.9% -4.4% 7.9%
low/high 250 SRS 91.5% 91.5% 95.7% -22.6% -22.3% 6.2%
POI(X) 90.9% 91.2% 96.4% -25.4% -24.9% 8.8%
POI(X+Y) 90.0% 90.2% 95.1% -24.5% -23.7% 9.9%
1,000 SRS 93.4% 93.5% 94.3% -6.6% -6.5% -0.1%
POI(X) 94.1% 94.2% 95.9% -4.0% -3.5% 7.6%
POI(X+Y) 90.7% 90.7% 92.7% -2.9% -2.3% 9.6%
high/low 250 SRS 92.3% 92.2% 95.4% -17.4% -17.1% 2.0%
POI(X) 92.5% 92.6% 95.8% -17.9% -16.1% 6.4%
POI(X+Y) 91.2% 91.8% 96.5% -17.4% -15.4% 7.1%
1,000 SRS 93.5% 93.5% 94.4% -6.5% -6.4% -0.9%
POI(X) 94.1% 94.0% 95.4% -5.0% -3.1% 5.7%
POI(X+Y) 91.9% 92.3% 93.4% -6.0% -3.9% 5.0%
high/high 250 SRS 92.3% 92.3% 95.2% -19.6% -19.3% 2.2%
POI(X) 92.0% 92.3% 96.1% -19.6% -17.8% 7.4%
POI(X+Y) 91.2% 91.8% 95.6% -19.1% -16.9% 8.3%
1,000 SRS 93.4% 93.4% 94.5% -6.5% -6.4% -0.7%
POI(X) 94.0% 94.5% 95.6% -4.7% -2.8% 6.7%
POI(X+Y) 92.2% 92.4% 93.4% -5.6% -3.3% 6.1%
Table 4.4
95% nominal coverage and %bias of variance estimates for LASSO
Binary outcome; value columns after scheme: coverage (first three), %bias (last three)
Population n scheme $v^{\mathrm{LASSO}}$ $v_g^{\mathrm{LASSO}}$ $v_{\mathrm{boot}}^{\mathrm{LASSO}}$ $v^{\mathrm{LASSO}}$ $v_g^{\mathrm{LASSO}}$ $v_{\mathrm{boot}}^{\mathrm{LASSO}}$
low/low 250 SRS 89.8% 90.0% 95.9% -28.1% -27.8% 9.2%
POI(X) 88.1% 88.6% 96.7% -37.3% -35.3% 9.2%
POI(X+Y) 79.0% 79.9% 91.2% -38.7% -35.9% 8.0%
1,000 SRS 92.8% 92.8% 93.5% -11.9% -11.8% -3.5%
POI(X) 92.0% 92.8% 95.7% -17.9% -15.5% 1.0%
POI(X+Y) 68.6% 69.6% 74.6% -18.5% -14.9% 0.5%
low/high 250 SRS 86.8% 87.0% 94.9% -37.7% -37.3% 11.3%
POI(X) 85.4% 86.1% 95.5% -42.9% -41.2% 14.4%
POI(X+Y) 78.7% 80.1% 92.6% -44.0% -41.3% 14.4%
1,000 SRS 94.4% 94.3% 95.2% -5.5% -5.4% 5.8%
POI(X) 91.8% 92.1% 94.9% -20.5% -18.6% -1.8%
POI(X+Y) 76.8% 77.8% 82.9% -20.4% -16.9% -1.3%
high/low 250 SRS 89.2% 89.1% 94.4% -28.5% -28.1% 0.4%
POI(X) 89.0% 90.1% 95.5% -31.9% -25.3% 12.7%
POI(X+Y) 85.7% 88.4% 93.8% -33.9% -25.4% 10.9%
1,000 SRS 93.9% 93.9% 95.6% -6.3% -6.2% 3.5%
POI(X) 92.6% 93.4% 94.8% -16.5% -9.2% 1.9%
POI(X+Y) 83.3% 85.4% 88.1% -15.0% -5.0% 5.2%
high/high 250 SRS 82.8% 82.8% 93.8% -44.6% -44.3% -6.4%
POI(X) 83.6% 85.5% 95.1% -44.3% -39.4% 3.8%
POI(X+Y) 82.9% 85.1% 93.8% -45.1% -38.4% 4.6%
1,000 SRS 94.3% 94.4% 96.1% -7.8% -7.6% 6.3%
POI(X) 91.3% 92.2% 94.0% -20.0% -13.8% 0.2%
POI(X+Y) 86.3% 88.6% 91.5% -18.1% -9.2% 2.8%
