# A comparison between nonparametric estimators for finite population distribution functions 5. Simulation studyA comparison between nonparametric estimators for finite population distribution functions 5. Simulation study

In this section we analyze some simulation results. Our goal is to compare efficiency with respect to the sample design of the distribution function estimators introduced in Section 2 and of the variance estimators of Section 4. The simulation results refer to simple random without replacement sampling and to Poisson sampling with unequal inclusion probabilities. As a benchmark, we included also the Horvitz-Thompson distribution function estimator

${\stackrel{^}{F}}_{\pi }\left(t\right):=\frac{1}{N}\sum _{j\in s}{\pi }_{j}^{-1}I\left({y}_{j}\le t\right)$

and the corresponding variance estimator

$\stackrel{˜}{V}\left({\stackrel{^}{F}}_{\pi }\left(t\right)\right):=\frac{1}{{N}^{2}}\sum _{i,j\in s}\frac{{\pi }_{i,j}-{\pi }_{i}{\pi }_{j}}{{\pi }_{i,j}{\pi }_{i}{\pi }_{j}}I\left({y}_{i}\le t\right)I\left({y}_{j}\le t\right)$

in the simulation study.

We considered both artificial and real populations. The former were obtained by generating $N=\text{1,000}$ values ${x}_{i}$ from i.i.d. uniform random variables with support on the interval $\left(0,1\right)$ and by combining them with three types of regression function $m\left(x\right)$ and two types of error components ${\epsilon }_{i}.$ The regression functions are (i) $m\left(x\right)=0$ (flat), (ii) $m\left(x\right)=10x$ (linear) and (iii) $m\left(x\right)=10{x}^{1/4}$ (concave), while the error components ${\epsilon }_{i}$ are either independent realizations from a unique Student $t$ distribution with $\nu =5$ d.o.f., or independent realizations from $N$ different shifted noncentral Student $t$ distributions with $\nu =5$ d.o.f. and with noncentrality parameters given by $\mu =15{x}_{i}.$ The shifts applied to the error components in the latter case make sure that the means of the noncentral Student $t$ distributions from which they were generated are zero. The artificial populations are shown in Figure 5.1 to 5.3. As for the real populations, we took the $MU284$ Population of Sweden Municipalities of Särndal et al. (1992) (population size $N=284\right)$ and considered the natural logarithm of $RMT85=$ Revenues from the 1985 municipal taxation (in millions of kronor) as study variable $Y,$ and the natural logarithm of either $P85=1985$ population (in thousands) or $REV84=$ Real estate values according to 1984 assessment (in millions of kronor) as auxiliary variable $X.$ The real populations are shown in Figure 5.4.

Description of Figure 5.1

Figure made of two scatter plots $\left(y$ versus $x\right),$ each one illustrating an artificial population. The first graph is the population generated from ${y}_{i}={\epsilon }_{i},$ where ${\epsilon }_{i}\sim$ i.i.d. Student $t$ with $\nu =5.$ The y-axis goes from -4 to 8 and the x-axis goes from 0.0 to 1.0. The scatter plot is centered around $y=0.$ The second graph is the population generated from ${y}_{i}={\epsilon }_{i}$ and ${\epsilon }_{i}\sim$ indep. noncentral Student $t$ with $\nu =5$ and $\mu =15{x}_{i}.$ The y-axis goes from -10 to 40 and the x-axis goes from 0.0 to 1.0. The scatter plot is concentrated around $y=0$ for small values of $x.$ The variation increases when $x$ increases.

Description of Figure 5.2

Figure made of two scatter plots $\left(y$ versus $x\right),$ each one illustrating an artificial population. The first graph is the population generated from ${y}_{i}=10{x}_{i}+{\epsilon }_{i},$ where ${\epsilon }_{i}\sim$ i.i.d. Student $t$ with $\nu =5.$ The y-axis goes from 0 to 10 and the x-axis goes from 0.0 to 1.0. The scatter plot is showing an increasing linear relationship between $x$ and $y.$ The second graph is the population generated from ${y}_{i}={\epsilon }_{i}$ and ${\epsilon }_{i}\sim$ indep. noncentral Student $t$ with $\nu =5$ and $\mu =15{x}_{i}.$ The y-axis goes from 0 to 50 and the x-axis goes from 0.0 to 1.0. The scatter plot is showing an increasing linear relationship between $x$ and $y.$ The variation increases when x increases.

Description of Figure 5.3

Figure made of two scatter plots $\left(y$ versus $x\right),$ each one illustrating an artificial population. The first graph is the population generated from ${y}_{i}=10{x}_{i}^{1/4}+{\epsilon }_{i},$ where ${\epsilon }_{i}\sim$ i.i.d. Student $t$ with $\nu =5.$ The y-axis goes from 0 to 15 and the x-axis goes from 0.0 to 1.0. The scatter plot is showing an increasing concave relationship between $x$ and $y.$ The second graph is the population generated from ${y}_{i}={\epsilon }_{i}$ and ${\epsilon }_{i}\sim$ indep. noncentral Student $t$ with $\nu =5$ and $\mu =15{x}_{i}.$ The y-axis goes from 0 to 50 and the x-axis goes from 0.0 to 1.0. The scatter plot is showing an increasing concave relationship between $x$ and $y.$ The variation increases when $x$ increases.

Description of Figure 5.4

Figure made of two scatter plots $\left(y$ versus $x\right),$ each one illustrating a real population, $MU284$ Population of Sweden Municipalities of Särndal et al (1992). On the first graph, ${y}_{i}=\mathrm{ln}RMT{85}_{i}$ for the ${i}^{\text{th}}$ municipality and ${x}_{i}=\mathrm{ln}P{85}_{i}.$ The y-axis goes from 3 to 9 and the x-axis goes from 1 to 6. The scatter plot is showing an increasing linear relationship between $x$ and $y.$ On the second graph, ${y}_{i}=\mathrm{ln}RMT{85}_{i}$ for the ${i}^{\text{th}}$ municipality and ${x}_{i}=\mathrm{ln}REV{84}_{i}.$ The y-axis goes from 3 to 9 and the x-axis goes from 6 to 11. The scatter plot is showing a more variable increasing linear relationship between $x$ and $y.$

From each population we selected independently $B=\text{1,000}$ samples. When sampling from the artificial populations we set the sample size equal to $n=100$ in case of simple random without replacement sampling and, in case of Poisson sampling, we set the expected sample size equal to ${n}^{*}=100$ and made the sample inclusion probabilities proportional to the standard deviations of the shifted noncentral Student  $t$ distributions of above. When sampling from the real populations, we set the sample size equal to $n=30$ in case of simple random without replacement sampling. In case of Poisson sampling, we set the expected sample size equal to ${n}^{*}=30$ and made the sample inclusion probabilities proportional to the absolute values of the residuals from the linear least squares regressions of the population ${y}_{i}$ values on the population ${x}_{i}$ values.

As for the definition of the nonparametric estimators, we used the Epanechnikov kernel function $K\left(u\right):=0.75\left(1-{u}^{2}\right)$ with $\lambda =0.15$ or $\lambda =0.3$ for the samples taken from the artificial populations, and the Gaussian kernel function $K\left(u\right):=1/\sqrt{2\pi }{e}^{-\left(1/2\right){u}^{2}}$ with $\lambda =1$ or $\lambda =2$ for the samples taken from the real populations. In the tables with the simulation results the nonparametric estimators corresponding to the small and large bandwidth values are identified with an $s$ (small) or an $l$ (large) in the subscript. We resorted to the Gaussian kernel function for the samples taken from the real populations to avoid singularity problems that occur in case of holes in the sampled set of ${x}_{i}-$ values. Such holes are much more likely to occur with the real populations than with the artificial ones, because the distributions of the auxiliary variables are asymmetric in the former. In fact, in the artificial populations the nonparametric estimators were well-defined for all the $B=\text{1,000}$ samples selected according to the simple random without replacement sampling design. For the Poisson sampling design, on the other hand, 47 among the $B=\text{1,000}$ simulated samples were such that the nonparametric estimators with the small bandwidth value could not be computed and just one of these samples was such that the nonparametric estimators with the large bandwidth value were undefined. The simulation results referring to the nonparametric estimators in Tables 5.2 and 5.5 account only for the samples where they were well-defined and thus they are based on a little less than $B=\text{1,000}$ realizations.

Tables 5.1 to 5.4 report the simulated bias (BIAS) and the simulated root mean square error (RMSE) for each distribution function estimator at different levels of $t$ at which ${F}_{N}\left(t\right)$ has been estimated: based, for example, on the values ${\stackrel{˜}{F}}_{b}\left(t\right),$ $b=1,2,\dots ,B,$ taken on by the estimator $\stackrel{˜}{F}\left(t\right),$

and

The RMSE’s show that the estimators based on the modified fitted values are usually more efficient. In sampling from the real populations the gain in RMSE is sometimes quite large. As expected, the model-based estimators tend to be more efficient than the generalized difference estimators in case of simple random without replacement sampling when both types of estimator are approximately unbiased. Under the Poisson sampling scheme the BIAS of the model-based estimators increases, but nonetheless they remain competitive. More variability in the sample inclusion probabilities would certainly change this outcome, because it would increase the BIAS of the model-based estimators. The simulation results should therefore not be seen to be in contrast with Johnson, Breidt and Opsomer (2008) who argue in favor of generalized difference estimators (called model-assisted estimators in their paper) as “a good overall choice for distribution function estimators”.

$t={F}_{N}^{-1}\left(0.05\right)$ $t={F}_{N}^{-1}\left(0.25\right)$ $t={F}_{N}^{-1}\left(0.50\right)$ $t={F}_{N}^{-1}\left(0.75\right)$ $t={F}_{N}^{-1}\left(0.95\right)$ BIAS RMSE BIAS RMSE BIAS 6 216 -3 433 31 512 23 434 12 207 15 219 10 430 0 502 -10 429 3 213 6 209 -30 411 22 484 22 414 3 200 15 214 -9 409 10 477 1 407 -10 207 6 213 8 425 24 504 -4 430 8 207 6 210 10 417 22 494 -8 422 6 206 8 213 9 426 25 503 -5 432 5 206 7 210 10 417 23 494 -6 424 4 206 7 208 11 411 19 489 -5 417 6 200 26 225 33 376 8 477 26 419 33 209 52 236 23 374 -5 475 38 421 29 213 20 195 -29 351 -89 471 11 407 30 202 36 201 -11 357 -94 473 28 410 21 204 8 211 11 370 -7 473 4 415 16 211 5 208 8 367 -5 468 5 411 16 212 11 210 11 372 -11 475 4 416 15 210 7 208 11 368 -7 468 8 412 15 211 1 211 1 391 -6 477 8 399 18 210 32 201 25 275 13 250 -14 264 -36 217 114 250 152 304 12 236 -180 312 -86 242 -50 165 12 226 51 216 26 230 13 172 -46 155 -14 199 69 195 23 211 17 156 -5 186 4 275 15 248 11 269 -2 201 -5 184 7 274 17 250 5 269 -2 196 -10 180 5 275 16 245 14 266 -1 200 -9 176 3 272 15 242 13 262 -1 194 -7 203 14 413 37 472 17 405 1 206 24 204 23 351 27 403 26 382 29 208 94 242 135 372 51 392 13 380 15 212 55 182 -9 301 -18 368 -23 359 37 202 124 210 -31 278 -63 363 -8 356 48 200 -2 194 -4 349 11 401 18 377 13 208 -2 190 -5 345 12 398 17 374 11 209 0 191 -5 352 14 401 20 376 13 207 -1 189 -6 344 13 397 18 375 12 209 -4 205 -5 401 21 470 24 401 14 207 81 207 44 316 17 384 -2 376 23 203 138 258 183 356 35 367 -50 374 8 208 7 146 -14 274 16 352 -8 358 15 197 9 144 10 246 -2 323 -18 339 24 186 3 175 3 319 10 383 17 374 10 203 0 178 5 316 11 380 17 370 8 202 1 167 5 320 12 383 17 374 9 203 -1 164 6 316 13 379 20 368 8 201 4 209 11 412 25 477 27 422 10 200 59 234 95 402 66 455 51 395 26 208 94 259 190 441 147 467 98 400 16 212 30 184 33 343 -123 435 -34 385 40 203 57 201 58 331 -148 437 2 382 34 203 1 205 7 386 12 449 17 392 13 208 -1 204 0 385 9 445 20 389 11 209 3 201 8 389 7 449 13 392 14 207 0 198 6 383 9 446 19 390 13 208 0 205 -2 399 9 463 25 398 14 208
$t={F}_{N}^{-1}\left(0.05\right)$ $t={F}_{N}^{-1}\left(0.25\right)$ $t={F}_{N}^{-1}\left(0.50\right)$ $t={F}_{N}^{-1}\left(0.75\right)$ $t={F}_{N}^{-1}\left(0.95\right)$ BIAS RMSE BIAS RMSE BIAS -10 252 -11 593 -22 738 -20 743 6 357 -1 237 9 543 -15 621 -5 590 11 302 22 244 -29 485 -3 555 9 515 -17 297 14 238 -10 492 -5 564 14 524 -1 283 -6 247 0 579 -27 724 -40 736 3 349 -2 231 11 526 -1 598 -10 566 7 285 23 248 23 505 -4 562 -27 531 -20 304 12 240 20 504 1 573 -13 538 -6 287 -6 220 -7 543 -37 741 -44 929 -48 1,058 17 164 30 411 4 749 14 590 15 190 47 173 19 383 -1 602 57 498 15 187 21 175 -7 378 -89 554 -11 473 3 192 29 152 -3 367 -99 555 27 481 3 184 1 159 10 406 -11 737 -5 579 -2 194 1 158 9 388 -5 586 14 482 -1 192 14 186 27 409 -3 562 -17 487 -10 200 3 160 22 399 -11 566 -5 482 -2 193 -3 162 -7 451 -31 738 -29 980 -55 1,067 8 461 21 561 -12 259 -18 218 -30 164 78 429 183 451 2 248 -161 261 -79 189 -69 306 12 340 10 267 15 199 6 143 -59 294 4 302 56 205 15 172 17 124 -25 441 4 560 -10 257 9 219 5 153 -14 372 35 410 -10 262 4 219 5 151 -31 333 -2 386 -29 294 4 227 -1 161 -20 339 15 372 -10 259 11 215 4 151 -15 385 3 746 -37 917 -35 1,004 -48 1,070 -4 516 30 671 7 453 11 344 6 182 63 409 129 539 61 421 9 341 1 180 44 300 -29 433 -45 422 -47 345 12 180 107 314 -41 420 -60 397 -22 323 31 171 -27 502 8 667 -8 450 0 344 -8 185 -10 364 16 510 11 425 -2 345 -7 182 -6 325 -9 479 -25 447 -14 356 -10 187 -7 332 -9 489 -5 426 -3 344 -6 182 -16 349 -2 705 -21 886 -42 1,013 -61 1,069 36 497 47 629 9 418 -11 320 15 191 56 393 186 490 43 383 -48 308 13 184 -29 276 -19 383 -18 380 -43 335 -1 204 -29 274 10 355 7 336 -29 290 23 179 -30 475 12 630 4 421 7 317 6 191 -42 336 31 452 11 390 8 312 8 186 -31 306 5 429 -18 406 -14 344 -8 210 -28 308 14 424 7 387 5 315 7 191 -15 380 10 739 -23 891 -37 993 -47 1,064 24 308 69 687 53 690 38 406 2 188 47 301 131 553 139 561 91 393 -2 186 15 237 2 435 -135 513 -59 411 12 186 27 235 18 435 -149 506 -5 374 13 179 -28 274 -8 673 4 688 3 403 -10 191 -29 251 -12 512 17 541 7 395 -9 188 -3 255 -12 481 -7 536 -20 422 -12 196 -12 251 -16 489 2 538 -4 399 -9 189 -10 267 -8 608 -4 860 -38 1,009 -63 1,066
$t={F}_{N}^{-1}\left(0.05\right)$ $t={F}_{N}^{-1}\left(0.25\right)$ $t={F}_{N}^{-1}\left(0.50\right)$ $t={F}_{N}^{-1}\left(0.75\right)$ $t={F}_{N}^{-1}\left(0.95\right)$ BIAS RMSE BIAS RMSE RBIAS 133 421 339 625 180 529 -265 490 -187 439 52 380 67 588 45 555 -63 469 -87 370 8 81 -154 203 90 130 62 123 6 54 28 66 -170 212 69 112 57 109 2 50 -28 300 -24 497 8 483 -48 421 -38 319 -28 326 -96 569 -52 544 3 466 1 319 26 177 -11 302 0 244 1 308 -18 102 29 179 -10 302 -2 243 -1 308 -21 104 22 388 -10 771 9 864 5 731 -43 394 143 449 303 643 138 554 -217 543 -166 446 62 395 62 611 36 582 -49 519 -71 376 -11 204 -32 300 -101 328 42 285 31 155 36 183 -40 288 -149 345 6 261 34 122 5 340 -22 548 4 557 -30 498 -23 332 -2 349 -78 599 -36 588 10 522 8 331 24 303 7 446 -6 494 2 439 -13 209 29 304 4 443 -6 495 -1 432 -18 192 34 395 1 766 16 880 9 744 -37 398
Table 5.4
Real populations (population size $N=284\right).$ BIAS and RMSE of distribution function estimators under Poisson sampling with inclusion probabilities proportional to the absolute value of the residuals of the linear regression of the population ${y}_{i}-$ values on the population ${x}_{i}-$ values. Expected size ${n}^{*}=30$
Table summary
This table displays the results of Real populations (population size XXXX BIAS and RMSE of distribution function estimators under Poisson sampling with inclusion probabilities proportional to the absolute value of the residuals of the population XXXX values on the population XXXX values. Expected size XXXX. The information is grouped by (appearing as row headers), XXXX, BIAS , RMSE , RMSE and RBIAS , calculated using MU284 population with XXXX and XXXX units of measure (appearing as column headers).
$t={F}_{N}^{-1}\left(0.05\right)$ $t={F}_{N}^{-1}\left(0.25\right)$ $t={F}_{N}^{-1}\left(0.50\right)$ $t={F}_{N}^{-1}\left(0.75\right)$ $t={F}_{N}^{-1}\left(0.95\right)$
BIAS RMSE BIAS RMSE RBIAS RMSE BIAS RMSE BIAS RMSE
MU284 population with $Y=\mathrm{ln}RMT85$ and $X=\mathrm{ln}P85$
${\stackrel{^}{F}}_{s}\left(t\right)$ 204 420 485 668 239 519 -412 626 -90 317
${\stackrel{^}{F}}_{l}\left(t\right)$ 180 424 417 684 319 614 -239 548 -148 348
${\stackrel{^}{F}}_{s}^{*}\left(t\right)$ -41 97 -118 199 132 178 40 140 -71 104
${\stackrel{^}{F}}_{l}^{*}\left(t\right)$ 11 70 -147 211 63 128 -25 122 -85 106
${\stackrel{˜}{F}}_{s}\left(t\right)$ 24 360 30 649 0 675 -68 614 58 368
${\stackrel{˜}{F}}_{l}\left(t\right)$ 9 390 -63 737 -64 774 -7 682 75 414
${\stackrel{˜}{F}}_{s}^{*}\left(t\right)$ 16 184 -14 307 36 283 16 323 -11 103
${\stackrel{˜}{F}}_{l}^{*}\left(t\right)$ 25 187 -15 312 30 286 14 328 -11 112
${\stackrel{˜}{F}}_{\pi }\left(t\right)$ 40 445 73 1,983 12 2,498 -43 3,094 -49 3,341
MU284 population with $Y=\mathrm{ln}RMT85$ and $X=\mathrm{ln}REV84$
${\stackrel{^}{F}}_{s}\left(t\right)$ 349 660 1,185 1,373 890 1,059 458 654 -32 270
${\stackrel{^}{F}}_{l}\left(t\right)$ 287 601 1,003 1,236 771 989 484 695 42 263
${\stackrel{^}{F}}_{s}^{*}\left(t\right)$ 317 453 739 866 761 879 624 701 159 207
${\stackrel{^}{F}}_{l}^{*}\left(t\right)$ 364 471 720 842 718 824 572 647 96 158
${\stackrel{˜}{F}}_{s}\left(t\right)$ 35 488 82 818 -31 772 7 634 -8 326
${\stackrel{˜}{F}}_{l}\left(t\right)$ 22 500 3 878 -98 852 40 704 27 354
${\stackrel{˜}{F}}_{s}^{*}\left(t\right)$ 37 317 32 498 -13 513 32 412 7 157
${\stackrel{˜}{F}}_{l}^{*}\left(t\right)$ 51 313 30 498 -30 518 12 411 -10 149
${\stackrel{˜}{F}}_{\pi }\left(t\right)$ 32 671 19 1,658 -172 2,354 -173 2,787 -191 2,935

Consider finally the simulation results referring to the variance estimators of Section 4. Tables 5.5 to 5.8 report the relative bias (RBIAS) and the relative root mean square error (RRMSE) for each of them. For example, based on the variance estimates ${\stackrel{˜}{V}}_{b}\left(\stackrel{˜}{F}\left(t\right)\right),$ $b=1,2,\dots ,B,$ obtained from the estimator $\stackrel{˜}{V}\left(\stackrel{˜}{F}\left(t\right)\right),$

and

where

${V}_{B}\left(\stackrel{˜}{F}\left(t\right)\right):=\frac{1}{B}\sum _{b=1}^{B}{\left({\stackrel{˜}{F}}_{b}\left(t\right)-{F}_{N}\left(t\right)\right)}^{2}.$

As a benchmark, we report also the RBIAS and RRMSE of the estimator

$\stackrel{˜}{V}\left({\stackrel{˜}{F}}_{\pi }\left(t\right)\right):=\frac{1}{{N}^{2}}\sum _{i,j\in s}\frac{{\pi }_{i,j}-{\pi }_{i}{\pi }_{j}}{{\pi }_{i,j}{\pi }_{i}{\pi }_{j}}I\left({y}_{i}\le t\right)I\left({y}_{j}\le t\right).$

for the variance of the Horvitz-Thompson estimator.

$t={F}_{N}^{-1}\left(0.05\right)$ $t={F}_{N}^{-1}\left(0.25\right)$ $t={F}_{N}^{-1}\left(0.50\right)$ $t={F}_{N}^{-1}\left(0.75\right)$ $t={F}_{N}^{-1}\left(0.95\right)$ RBIAS RRMSE RBIAS RRMSE RBIAS -1,092 32,442 -1,249 3,895 -1,714 3,077 -1,536 3,828 -824 34,601 -576 31,726 -603 3,838 -1,122 3,374 -951 3,758 -441 33,055 -1,091 32,579 -1,292 3,914 -1,708 3,085 -1,640 3,828 -802 34,809 -556 31,881 -622 3,857 -1,148 3,361 -1,025 3,749 -425 33,184 42 30,952 57 3,928 -592 3,776 -287 3,825 551 33,462 -1,900 29,622 50 4,707 -917 3,557 -998 3,695 -1,480 29,417 -1,359 29,623 535 4,572 -395 3,881 -527 3,736 -1,277 28,267 -1,832 30,119 -101 4,710 -991 3,530 -1,077 3,704 -1,398 29,927 -1,362 29,713 465 4,559 -420 3,865 -591 3,718 -1,236 28,489 -351 29,132 1,096 4,215 -78 4,074 574 4,067 -638 29,507 -2,170 11,624 -1,027 2,480 -816 3,274 -1,424 2,583 -1,946 8,681 -1,534 11,605 -529 2,632 -148 2,975 -859 2,590 -1,151 9,015 -1,765 12,107 -1,108 2,529 -714 3,366 -1,318 2,660 -1,905 8,658 -1,062 11,948 -671 2,735 -212 3,291 -762 2,785 -1,048 8,590 254 31,545 -52 3,726 136 4,152 267 3,992 35 30,264 -1,642 25,809 -855 3,541 -1,076 3,038 -1,081 3,030 -1,361 21,157 -950 25,692 -323 3,509 -597 3,312 -617 3,164 -1,124 20,231 -1,385 26,406 -997 3,505 -1,089 3,045 -1,096 3,033 -1,310 21,393 -832 26,212 -292 3,556 -614 3,317 -716 3,154 -1,135 20,286 105 29,621 507 3,857 209 4,244 425 3,910 -337 29,082 -2,465 30,612 -1,121 4,594 -1,512 3,183 -1,958 3,076 -863 19,720 -1,780 28,103 -663 4,420 -1,092 3,319 -1,491 3,140 -439 18,985 -2,052 33,980 -1,150 4,619 -1,537 3,217 -1,948 3,127 -954 19,637 -1,194 33,573 -691 4,472 -1,124 3,368 -1,438 3,228 -357 19,245 -81 30,001 9 3,756 -110 3,996 -598 3,661 440 32,455 -1,873 29,437 -758 3,759 -621 3,476 -709 3,599 -1,298 27,679 -1,267 28,511 -284 3,661 -131 3,758 -321 3,552 -1,075 26,790 -1,710 30,670 -928 3,741 -628 3,510 -777 3,603 -1,245 27,972 -939 30,486 -270 3,764 -171 3,803 -375 3,581 -1,014 26,926 178 29,640 599 3,816 533 4,324 590 3,874 -404 28,917
$t={F}_{N}^{-1}\left(0.05\right)$ $t={F}_{N}^{-1}\left(0.25\right)$ $t={F}_{N}^{-1}\left(0.50\right)$ $t={F}_{N}^{-1}\left(0.75\right)$ $t={F}_{N}^{-1}\left(0.95\right)$ RBIAS RRMSE RBIAS RRMSE RBIAS -3,306 65,777 -4,248 8,032 -5,093 4,242 -6,258 4,844 -5,652 32,037 -2,048 47,035 -2,656 4,705 -2,434 3,116 -3,310 3,939 -3,092 29,380 -3,362 36,855 -2,488 4,409 -1,910 3,147 -2,869 3,910 -4,329 23,247 -2,696 39,509 -2,076 4,450 -1,768 3,163 -2,648 3,811 -3,244 26,343 113 129,637 259 15,120 618 6,327 193 5,429 273 6,097 -740 125,975 -2,522 14,864 -5,466 3,658 -4,896 6,691 -1,551 83,262 -391 83,047 -1,503 8,946 -2,428 4,099 -2,228 5,526 -1,154 54,680 -3,260 58,072 -2,649 7,661 -2,260 3,936 -2,795 5,011 -2,116 48,739 -716 77,935 -2,000 7,979 -1,934 4,235 -2,279 5,243 -1,243 52,531 666 251,134 -564 26,553 -87 7,344 -2 6,029 407 6,610 -6,801 7,898 -6,470 4,281 -1,059 22,596 -398 32,401 -1,650 72,632 -4,978 5,826 -2,898 4,473 -603 9,530 206 15,226 -1,157 40,466 -4,520 6,691 -2,710 4,213 -3,245 6,723 -1,156 12,681 -2,458 32,907 -4,226 6,206 -1,674 5,062 -978 7,874 55 12,781 -1,283 33,737 -707 47,550 118 7,214 609 4,409 743 4,628 435 4,800 -7,398 8,847 -6,235 3,667 -2,493 8,171 -1,051 16,299 -1,440 71,943 -4,548 9,463 -3,136 3,282 -1,187 4,246 -832 7,638 -982 45,182 -3,902 11,727 -2,808 3,409 -2,411 3,501 -1,721 6,737 -1,671 41,389 -3,598 10,771 -2,610 3,462 -1,284 3,988 -852 7,008 -972 43,017 146 57,044 -42 8,708 520 4,784 214 4,686 390 5,085 -7,731 8,568 -6,597 3,484 -2,442 7,775 -903 16,067 -1,967 56,480 -4,611 9,378 -2,990 3,252 -874 4,119 -347 7,420 -1,310 35,051 -4,747 11,909 -2,679 3,298 -1,896 3,272 -2,248 5,747 -3,382 27,222 -4,223 10,380 -2,100 3,494 -788 3,731 -550 5,975 -1,795 29,856 -428 47,038 -206 7,350 641 4,504 738 4,708 487 4,943 -4,936 40,696 -6,111 4,579 -5,549 4,035 -1,864 14,381 -1,509 84,892 -3,004 29,404 -2,764 3,962 -2,436 3,606 -1,234 7,357 -1,103 53,875 -4,328 27,704 -2,516 4,235 -2,671 3,332 -2,586 5,955 -1,939 47,601 -3,454 28,267 -2,263 4,160 -2,329 3,574 -1,433 6,682 -1,171 50,985 152 98,607 663 12,879 15 5,376 20 5,080 429 5,619
$t={F}_{N}^{-1}\left(0.05\right)$ $t={F}_{N}^{-1}\left(0.25\right)$ $t={F}_{N}^{-1}\left(0.50\right)$ $t={F}_{N}^{-1}\left(0.75\right)$ $t={F}_{N}^{-1}\left(0.95\right)$ RBIAS RRMSE RBIAS RRMSE RBIAS -2,853 16,809 -1,700 3,037 -1,554 2,984 -1,100 4,633 -5,503 16,257 -1,110 16,374 -1,827 2,760 -1,683 2,847 -927 4,387 -3,016 18,685 -1,043 19,081 -91 7,728 -448 9,120 -484 7,715 -1,877 65,298 -424 18,971 104 7,819 -382 9,110 -301 7,799 -1,058 62,968 -186 29,720 -603 3,901 31 3,971 500 4,383 -74 28,418 -2,283 16,303 -1,450 3,538 -945 3,526 -1,071 4,300 -4,832 19,401 -1,095 16,755 -1,427 3,181 -938 3,390 -780 4,051 -2,753 20,551 -1,737 14,642 -298 5,648 -546 5,282 -736 5,679 -3,564 38,344 -1,174 14,111 -27 5,856 -422 5,452 -228 5,974 -1,433 43,923 -307 28,421 -460 3,963 -344 3,850 112 4,235 -401 27,987
$t={F}_{N}^{-1}\left(0.05\right)$ $t={F}_{N}^{-1}\left(0.25\right)$ $t={F}_{N}^{-1}\left(0.50\right)$ $t={F}_{N}^{-1}\left(0.75\right)$ $t={F}_{N}^{-1}\left(0.95\right)$ RBIAS RRMSE RBIAS RRMSE RBIAS -3,502 26,342 -1,841 14,037 -2,691 12,087 -3,415 9,674 -5,932 26,823 -2,159 27,610 -1,782 14,010 -2,840 12,002 -3,186 10,177 -4,455 26,802 -434 22,455 515 15,503 -506 31,296 -1,460 23,496 -2,649 78,527 -80 22,921 677 15,575 -280 33,294 -1,283 26,612 -1,597 72,166 -294 361,991 522 75,891 43 48,764 -241 36,354 90 32,354 -5,220 18,699 -3,667 8,749 -3,222 7,537 -3,018 9,279 -4,955 44,597 -4,254 20,765 -3,100 9,180 -3,435 7,231 -3,196 8,540 -3,461 43,206 -2,938 18,922 -1,110 11,828 -1,265 8,726 -1,040 10,963 -3,682 89,262 -1,938 19,997 -699 12,641 -1,003 9,305 -599 11,545 -1,558 98,798 -143 128,401 493 33,934 -255 18,473 -91 17,904 327 16,463

As can be seen from the simulation results, the variance estimators suffer from large variability. This problem is shared by the variance estimator for the Horvitz-Thompson estimator, which occasionally exhibits extremely large RRMSE’s. It is further interesting to note that while the RBIAS of the variance estimators for the generalized difference estimators is almost always negative and at times rather large in absolute value, the RBIAS of the variance estimator for the Horvitz-Thompson estimator is in most of the considered cases positive.

## Acknowledgements

This research was partially supported by the FAR 2014-ATE-0200 grant from University of Milano-Bicocca.

## Appendix

Let $\beta$ denote a sequence of real numbers. Throughout this appendix we shall indicate by ${O}_{{i}_{1},{i}_{2},\dots ,{i}_{k}}\left(\beta \right)$ rest terms that may depend on ${x}_{{i}_{1}},{x}_{{i}_{2}},\dots ,{x}_{{i}_{k}}$ and that are of the same order as the sequence $\beta$ uniformly for ${i}_{1},{i}_{2},\dots ,{i}_{k}\in U.$ Formally, $R\left({x}_{{i}_{1}},{x}_{{i}_{2}},\dots ,{x}_{{i}_{k}}\right)={O}_{{i}_{1},{i}_{2},\dots ,{i}_{k}}\left(\beta \right)$ if

$\underset{{i}_{1},{i}_{2},\dots ,{i}_{k}\in \text{\hspace{0.17em}}U}{\mathrm{sup}}|\text{\hspace{0.17em}}R\left({x}_{{i}_{1}},{x}_{{i}_{2}},\dots ,{x}_{{i}_{k}}\right)\text{\hspace{0.17em}}|=O\left(\beta \right).$

Moreover, to simplify the notation, we shall write ${m}_{i}$ in place of $m\left({x}_{i}\right)$ and ${\sigma }_{i}^{2}$ in place of ${\sigma }^{2}\left({x}_{i}\right).$

### Bias of the model-based Kuo estimator

$\begin{array}{ll}E\left(\stackrel{^}{F}\left(t\right)-{F}_{N}\left(t\right)\right)\hfill & =E\left(\frac{1}{N}\sum _{i\notin s}\sum _{j\in s}{w}_{i,j}\left[I\left({\epsilon }_{j}\le t-{m}_{j}\right)-I\left({\epsilon }_{i}\le t-{m}_{i}\right)\right]\right)\hfill \\ \hfill & =\frac{1}{N}\sum _{i\notin s}\sum _{j\in s}{w}_{i,j}\left[G\left(t-{m}_{j}|{x}_{j}\right)-G\left(t-{m}_{i}|{x}_{i}\right)\right]\hfill \\ \hfill & =\frac{1}{2N}\sum _{i\notin s}\left[{G}^{\left(2,0\right)}\left(t-{m}_{i}|{x}_{i}\right){\left({m}_{i}\right)}^{2}-{G}^{\left(1,0\right)}\left(t-{m}_{i}|{x}_{i}\right){m}_{i}\hfill \\ \hfill & \text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}-2{G}^{\left(1,1\right)}\left(t-{m}_{i}|{x}_{i}\right){m}_{i}+{G}^{\left(0,2\right)}\left(t-{m}_{i}|{x}_{i}\right)\right]\sum _{j\in s}{w}_{i,j}{\left({x}_{j}-{x}_{i}\right)}^{2}+o\left({\lambda }^{2}\right)\hfill \\ \hfill & ={\lambda }^{2}\frac{N-n}{N}\frac{{\mu }_{2}}{2{\mu }_{0}}{\int }_{a}^{b}\left[{G}^{\left(2,0\right)}\left(t-m\left(x\right)\text{\hspace{0.17em}}|x\right){\left({m}^{\prime }\left(x\right)\right)}^{2}-{G}^{\left(1,0\right)}\left(t-m\left(x\right)\text{\hspace{0.17em}}|x\right){{m}^{\prime \prime }}^{}\left(x\right)\hfill \\ \hfill & \text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}-2{G}^{\left(1,1\right)}\left(t-m\left(x\right)\text{\hspace{0.17em}}|x\right){m}^{\prime }\left(x\right)+{G}^{\left(0,2\right)}\left(t-m\left(x\right)\text{\hspace{0.17em}}|x\right)\right]{h}_{\text{\hspace{0.17em}}\overline{s}}\left(x\right)dx+o\left({\lambda }^{2}\right).\hfill \end{array}$

### Bias of the generalized difference Kuo estimator

Write

$\begin{array}{ll}\stackrel{˜}{F}\left(t\right)-{F}_{N}\left(t\right)\hfill & =\frac{1}{N}\left\{\sum _{i\notin s}\sum _{j\in s}{\stackrel{˜}{w}}_{i,j}\left[I\left({\epsilon }_{j}\le t-{m}_{j}\right)-I\left({\epsilon }_{i}\le t-{m}_{i}\right)\right]\hfill \\ \hfill & \text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}+\sum _{i\in s}\left(1-\frac{1}{{\pi }_{i}}\right)\text{\hspace{0.17em}}\sum _{j\in s}{\stackrel{˜}{w}}_{i,j}\left[I\left({\epsilon }_{j}\le t-{m}_{j}\right)-I\left({\epsilon }_{i}\le t-{m}_{i}\right)\right]\right\}.\hfill \end{array}$

Similar steps as those seen for $\stackrel{^}{F}\left(t\right)$ show that

$\begin{array}{ll}E\left(\stackrel{˜}{F}\left(t\right)-{F}_{N}\left(t\right)\right)\hfill & ={\lambda }^{2}\frac{N-n}{N}\frac{{\mu }_{2}}{2{\mu }_{0}}{\int }_{a}^{b}\left[{G}^{\left(2,0\right)}\left(t-m\left(x\right)\text{\hspace{0.17em}}|x\right){\left({m}^{\prime }\left(x\right)\right)}^{2}-{G}^{\left(1,0\right)}\left(t-m\left(x\right)\text{\hspace{0.17em}}|x\right){{m}^{\prime \prime }}^{}\left(x\right)\hfill \\ \hfill & \text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}-2{G}^{\left(1,1\right)}\left(t-m\left(x\right)\text{\hspace{0.17em}}|x\right){m}^{\prime }\left(x\right)+{G}^{\left(0,2\right)}\left(t-m\left(x\right)\text{\hspace{0.17em}}|x\right)\right]h\left(x\right)dx+o\left({\lambda }^{2}\right),\hfill \end{array}$

where

$h\left(x\right):={h}_{\text{\hspace{0.17em}}\overline{s}}\left(x\right)+\left(1-{\pi }^{-1}\left(x\right)\right){h}_{s}\left(x\right).$

### Variance of the model-based Kuo estimator

$\begin{array}{ll}\text{var}\left(\stackrel{^}{F}\left(t\right)-{F}_{N}\left(t\right)\right)\hfill & =\text{var}\left(\frac{1}{N}\sum _{i\notin s}\sum _{j\in s}{w}_{i,j}I\left({\epsilon }_{j}\le t-{m}_{j}\right)-\frac{1}{N}\sum _{i\notin s}I\left({y}_{i}\le t\right)\right)\hfill \\ \hfill & =\frac{1}{{N}^{2}}\sum _{{i}_{1}\notin s}\sum _{{i}_{2}\notin s}\sum _{j\in s}{w}_{{i}_{1},j}{w}_{{i}_{2},j}\left[G\left(t-{m}_{j}|\text{\hspace{0.17em}}{x}_{j}\right)-{G}^{2}\left(t-{m}_{j}|\text{\hspace{0.17em}}{x}_{j}\right)\right]\hfill \\ \hfill & \text{\hspace{0.17em}}\text{\hspace{0.17em}}+\frac{1}{{N}^{2}}\sum _{i\notin s}\left[G\left(t-{m}_{i}|\text{\hspace{0.17em}}{x}_{i}\right)-{G}^{2}\left(t-{m}_{i}|\text{\hspace{0.17em}}{x}_{i}\right)\right]\hfill \\ \hfill & ={A}_{1}+{A}_{2},\hfill \end{array}$

where

$\begin{array}{ll}{A}_{1}\hfill & :=\frac{1}{{N}^{2}}\sum _{{i}_{1}\notin s}\sum _{{i}_{2}\notin s}\sum _{j\in s}{w}_{{i}_{1},j}{w}_{{i}_{2},j}\left[G\left(t-{m}_{j}|\text{\hspace{0.17em}}{x}_{j}\right)-{G}^{2}\left(t-{m}_{j}|\text{\hspace{0.17em}}{x}_{j}\right)\right]\hfill \\ \hfill & =\frac{1}{{N}^{2}}\sum _{j\in s}\left[G\left(t-{m}_{j}|\text{\hspace{0.17em}}{x}_{j}\right)-{G}^{2}\left(t-{m}_{j}|\text{\hspace{0.17em}}{x}_{j}\right)\right]{\left(\sum _{i\notin s}{w}_{i,j}\right)}^{2}\hfill \\ \hfill & =\frac{1}{n}{\left(\frac{N-n}{N}\right)}^{2}{\int }_{a}^{b}\left[G\left(t-m\left(x\right)\text{\hspace{0.17em}}|\text{\hspace{0.17em}}x\right)-{G}^{2}\left(t-m\left(x\right)\text{\hspace{0.17em}}|\text{\hspace{0.17em}}x\right)\right]\left[{h}_{\text{\hspace{0.17em}}\overline{s}}\left(x\right)/{h}_{s}\left(x\right)\right]{h}_{\text{\hspace{0.17em}}\overline{s}}\left(x\right)dx\hfill \\ \hfill & \text{\hspace{0.17em}}\text{\hspace{0.17em}}+O\left({\left(n\lambda \right)}^{-1}\alpha \right)\hfill \end{array}$

and

$\begin{array}{ll}{A}_{2}\hfill & :=\frac{1}{{N}^{2}}\sum _{i\notin s}\left[G\left(t-{m}_{i}|\text{\hspace{0.17em}}{x}_{i}\right)-{G}^{2}\left(t-{m}_{i}|\text{\hspace{0.17em}}{x}_{i}\right)\right]\hfill \\ \hfill & =\frac{1}{N-n}{\left(\frac{N-n}{N}\right)}^{2}{\int }_{a}^{b}\left[G\left(t-m\left(x\right)\text{\hspace{0.17em}}|\text{\hspace{0.17em}}x\right)-{G}^{2}\left(t-m\left(x\right)\text{\hspace{0.17em}}|\text{\hspace{0.17em}}x\right)\right]{h}_{\text{\hspace{0.17em}}\overline{s}}\left(x\right)dx+O\left({n}^{-1}\alpha \right).\hfill \end{array}$

Thus,

$\begin{array}{ll}\text{var}\left(\stackrel{^}{F}\left(t\right)-{F}_{N}\left(t\right)\right)\hfill & =\frac{1}{n}{\left(\frac{N-n}{N}\right)}^{2}{\int }_{a}^{b}\left[G\left(t-m\left(x\right)\text{\hspace{0.17em}}|\text{\hspace{0.17em}}x\right)-{G}^{2}\left(t-m\left(x\right)\text{\hspace{0.17em}}|\text{\hspace{0.17em}}x\right)\right]\left[{h}_{\text{\hspace{0.17em}}\overline{s}}\left(x\right)/{h}_{s}\left(x\right)\right]{h}_{\text{\hspace{0.17em}}\overline{s}}\left(x\right)dx\hfill \\ \hfill & +\frac{1}{N-n}{\left(\frac{N-n}{N}\right)}^{2}{\int }_{a}^{b}\left[G\left(t-m\left(x\right)\text{\hspace{0.17em}}|\text{\hspace{0.17em}}x\right)-{G}^{2}\left(t-m\left(x\right)\text{\hspace{0.17em}}|\text{\hspace{0.17em}}x\right)\right]{h}_{\text{\hspace{0.17em}}\overline{s}}\left(x\right)dx+O\left({\left(n\lambda \right)}^{-1}\alpha \right).\hfill \end{array}$

### Variance of the generalized difference Kuo estimator

Note that

$\stackrel{˜}{F}\left(t\right)-{F}_{N}\left(t\right)=\frac{1}{N}\left\{\sum _{j\in s}I\left({y}_{j}\le t\right)\left[\sum _{i\notin s}{\stackrel{˜}{w}}_{i,j}-\sum _{i\in s}{\stackrel{˜}{w}}_{i,j}\left({\pi }_{i}^{-1}-1\right)+\left({\pi }_{j}^{-1}-1\right)\right]-\sum _{i\notin s}I\left({y}_{i}\le t\right)\right\}$

so that

$\begin{array}{ll}\text{var}\left(\stackrel{˜}{F}\left(t\right)-{F}_{N}\left(t\right)\right)\hfill & =\text{var}\left(\frac{1}{N}\sum _{j\in s}I\left({y}_{j}\le t\right)\left[\sum _{i\notin s}{\stackrel{˜}{w}}_{i,j}+\left({\pi }_{j}^{-1}-1\right)-\sum _{i\in s}{\stackrel{˜}{w}}_{i,j}\left({\pi }_{i}^{-1}-1\right)\right]\right)\hfill \\ \hfill & \text{\hspace{0.17em}}\text{\hspace{0.17em}}+\text{var}\left(\frac{1}{N}\sum _{i\notin s}I\left({y}_{i}\le t\right)\right)\hfill \\ \hfill & ={B}_{1}+{A}_{2},\hfill \end{array}$

where ${A}_{2}$ is the same as in the variance of $\stackrel{^}{F}\left(t\right),$ and where

$\begin{array}{ll}{B}_{1}\hfill & :=\text{var}\left(\frac{1}{N}\sum _{j\in s}I\left({y}_{j}\le t\right)\left[\sum _{i\notin s}{\stackrel{˜}{w}}_{i,j}+\left({\pi }_{j}^{-1}-1\right)-\sum _{i\in s}{\stackrel{˜}{w}}_{i,j}\left({\pi }_{i}^{-1}-1\right)\right]\right)\hfill \\ \hfill & =\frac{1}{{N}^{2}}\sum _{j\in s}\left[G\left(t-{m}_{j}|\text{\hspace{0.17em}}{x}_{j}\right)-{G}^{2}\left(t-{m}_{j}|\text{\hspace{0.17em}}{x}_{j}\right)\right]{\left[\sum _{i\notin s}{\stackrel{˜}{w}}_{i,j}+\left({\pi }_{j}^{-1}-1\right)-\sum _{i\in s}{\stackrel{˜}{w}}_{i,j}\left({\pi }_{i}^{-1}-1\right)\right]}^{2}\hfill \\ \hfill & =\frac{1}{{N}^{2}}\sum _{j\in s}\left[G\left(t-{m}_{j}|\text{\hspace{0.17em}}{x}_{j}\right)-{G}^{2}\left(t-{m}_{j}|\text{\hspace{0.17em}}{x}_{j}\right)\right]{\left[\sum _{i\notin s}{\stackrel{˜}{w}}_{i,j}+\left({\pi }_{j}^{-1}-1\right)\left(1-\sum _{i\in s}{\stackrel{˜}{w}}_{i,j}\right)\right]}^{2}+O\left(\lambda {n}^{-1}\right)\hfill \\ \hfill & =\frac{1}{n}{\left(\frac{N-n}{N}\right)}^{2}{\int }_{a}^{b}\left[G\left(t-m\left(x\right)\text{\hspace{0.17em}}|\text{\hspace{0.17em}}x\right)-{G}^{2}\left(t-m\left(x\right)\text{\hspace{0.17em}}|\text{\hspace{0.17em}}x\right)\right]\left[{h}_{\text{\hspace{0.17em}}\overline{s}}\left(x\right)/{h}_{s}\left(x\right)\right]{h}_{\text{\hspace{0.17em}}\overline{s}}\left(x\right)dx\hfill \\ \hfill & \text{\hspace{0.17em}}\text{\hspace{0.17em}}+O\left({\left(n\lambda \right)}^{-1}\alpha +\lambda {n}^{-1}\right)\hfill \\ \hfill & ={A}_{1}+O\left({\left(n\lambda \right)}^{-1}\alpha +\lambda {n}^{-1}\right).\hfill \end{array}$

Thus,

$\text{var}\left(\stackrel{˜}{F}\left(t\right)-{F}_{N}\left(t\right)\right)=\text{var}\left(\stackrel{^}{F}\left(t\right)-{F}_{N}\left(t\right)\right)+O\left({\left(n\lambda \right)}^{-1}\alpha +\lambda {n}^{-1}\right).$

## Bias of the model-based estimator with modified fitted values

Let ${\stackrel{^}{\stackrel{^}{m}}}_{i}:={\sum }_{k\in s}{w}_{i,k}{m}_{k},$ ${c}_{i,j}:=1-{w}_{j,j}+{w}_{i,j}$ and

${d}_{i,j}:=\frac{1}{{c}_{i,j}}\left[\left(1-{c}_{i,j}\right)\left(t-{m}_{i}\right)+\left({\stackrel{^}{\stackrel{^}{m}}}_{j}-{m}_{j}\right)-\left({\stackrel{^}{\stackrel{^}{m}}}_{i}-{m}_{i}\right)+\sum _{k\in s,k\ne j}\left({w}_{j,k}-{w}_{i,k}\right){\epsilon }_{k}\right].$

Observe that ${w}_{i,j}={O}_{i,j}\left({\left(n\lambda \right)}^{-1}\right)$ so that

${y}_{j}-{\stackrel{^}{m}}_{j}\le t-{\stackrel{^}{m}}_{i}$

is (asymptotically, as soon as ${c}_{i,j}>0\right)$ equivalent to

${\epsilon }_{j}\le t-{m}_{i}+{d}_{i,j}.$

Since ${d}_{i,j}$ does not depend on ${\epsilon }_{j},$ it follows that

$\begin{array}{ll}E\left(I\left({y}_{j}-{\stackrel{^}{m}}_{j}\le t-{\stackrel{^}{m}}_{i}\right)\right)\hfill & =E\left(I\left({\epsilon }_{j}\le t-{m}_{i}+{d}_{i,j}\right)\right)\hfill \\ \hfill & =E\left(E\left(I\left({\epsilon }_{j}\le t-{m}_{i}+{d}_{i,j}\right)\text{\hspace{0.17em}}|\text{\hspace{0.17em}}{\epsilon }_{k},k\ne j\right)\right)\hfill \\ \hfill & =E\left(G\left(t-{m}_{i}+{d}_{i,j}|\text{\hspace{0.17em}}{x}_{j}\right)\right).\hfill \end{array}\text{ }\text{ }\text{ }\text{ }\text{ }\left(A.1\right)$

Now, using the fact that

${d}_{i,j}=\left(1-{c}_{i,j}\right)\left(t-{m}_{i}\right)+\left({\stackrel{^}{\stackrel{^}{m}}}_{j}-{m}_{j}\right)-\left({\stackrel{^}{\stackrel{^}{m}}}_{i}-{m}_{i}\right)+\sum _{k\in s,k\ne j}\left({w}_{j,k}-{w}_{i,k}\right){\epsilon }_{k}+R\left({d}_{i,j}\right),\text{ }\text{ }\text{ }\left(A.2\right)$

where

${E}^{1/4}\left(\text{\hspace{0.17em}}{|R\left({d}_{i,j}\right)\text{\hspace{0.17em}}|}^{\text{\hspace{0.17em}}4}\right)={O}_{i,j}\left(\lambda {n}^{-1}+{\left(n\lambda \right)}^{-3/2}\right),\text{ }\text{ }\text{ }\text{ }\text{ }\left(A.3\right)$

it is seen from (A.1) that

$\begin{array}{ll}E\left(I\left({y}_{j}-{\stackrel{^}{m}}_{j}\le t-{\stackrel{^}{m}}_{i}\right)\right)\hfill & =E\left(G\left(t-{m}_{i}+{d}_{i,j}\right)\text{\hspace{0.17em}}|\text{\hspace{0.17em}}{x}_{j}\right)\hfill \\ \hfill & =G\left(t-{m}_{i}|\text{\hspace{0.17em}}{x}_{j}\right)+{G}^{\left(1,0\right)}\left(t-{m}_{i}|\text{\hspace{0.17em}}{x}_{j}\right)E\left({d}_{i,j}\right)\hfill \\ \hfill & \text{\hspace{0.17em}}\text{\hspace{0.17em}}+\frac{1}{2}{G}^{\left(2,0\right)}\left(t-{m}_{i}|\text{\hspace{0.17em}}{x}_{j}\right)E\left({d}_{i,j}^{2}\right)+{o}_{i,j}\left({\lambda }^{4}+{\left(n\lambda \right)}^{-1}\right).\hfill \end{array}\text{ }\text{ }\text{ }\text{ }\left(A.4\right)$

Thus,

$\begin{array}{ll}E\left({\stackrel{^}{F}}^{*}\text{​}\left(t\right)-{F}_{N}\left(t\right)\right)\hfill & =E\left(\frac{1}{N}\sum _{i\notin s}\sum _{j\in s}{w}_{i,j}\left(I\left({y}_{j}-{\stackrel{^}{m}}_{j}\le t-{\stackrel{^}{m}}_{i}\right)-I\left({y}_{i}\le t\right)\right)\right)\hfill \\ \hfill & =\frac{1}{N}\sum _{i\notin s}\sum _{j\in s}{w}_{i,j}\left[G\left(t-{m}_{i}|\text{\hspace{0.17em}}{x}_{j}\right)-G\left(t-{m}_{i}|\text{\hspace{0.17em}}{x}_{i}\right)\right]\hfill \\ \hfill & \text{\hspace{0.17em}}\text{\hspace{0.17em}}+\frac{1}{N}\sum _{i\notin s}\sum _{j\in s}{w}_{i,j}{G}^{\left(1,0\right)}\left(t-{m}_{i}|\text{\hspace{0.17em}}{x}_{j}\right)E\left({d}_{i,j}\right)\hfill \\ \hfill & \text{\hspace{0.17em}}\text{\hspace{0.17em}}+\frac{1}{2N}\sum _{i\notin s}\sum _{j\in s}{w}_{i,j}{G}^{\left(2,0\right)}\left(t-{m}_{i}|\text{\hspace{0.17em}}{x}_{j}\right)E\left({d}_{i,j}^{2}\right)+o\left({\lambda }^{4}+{\left(n\lambda \right)}^{-1}\right)\hfill \\ \hfill & :={C}_{1}+{C}_{2}+{C}_{3}+o\left({\lambda }^{4}+{\left(n\lambda \right)}^{-1}\right).\hfill \end{array}\text{ }\text{ }\text{ }\left(A.5\right)$

Consider first ${C}_{1}$ and note that

$\begin{array}{ll}{C}_{1}\hfill & :=\frac{1}{N}\sum _{i\notin s}\sum _{j\in s}{w}_{i,j}\left[G\left(t-{m}_{i}|\text{\hspace{0.17em}}{x}_{j}\right)-G\left(t-{m}_{i}|\text{\hspace{0.17em}}{x}_{i}\right)\right]\hfill \\ \hfill & =\frac{1}{2N}\sum _{i\notin s}{G}^{\left(0,2\right)}\left(t-{m}_{i}|\text{\hspace{0.17em}}{x}_{i}\right)\sum _{j\in s}{w}_{i,j}{\left({x}_{j}-{x}_{i}\right)}^{2}+o\left({\lambda }^{2}\right)\hfill \\ \hfill & ={\lambda }^{2}\frac{N-n}{N}\frac{{\mu }_{2}}{{\mu }_{0}}{\int }_{a}^{b}{G}^{\left(0,2\right)}\left(t-m\left(x\right)\text{\hspace{0.17em}}|\text{\hspace{0.17em}}x\right){h}_{\text{\hspace{0.17em}}\overline{s}}\left(x\right)dx+o\left({\lambda }^{2}\right).\hfill \end{array}$

Consider next ${C}_{2}.$ (A.2) and (A.3) imply that

$\begin{array}{ll}E\left({d}_{i,j}\right)\hfill & =\left(1-{c}_{i,j}\right)\left(t-{m}_{i}\right)+\left({\stackrel{^}{\stackrel{^}{m}}}_{j}-{m}_{j}\right)-\left({\stackrel{^}{\stackrel{^}{m}}}_{i}-{m}_{i}\right)+{O}_{i,j}\left(\lambda {n}^{-1}+{\left(n\lambda \right)}^{-3/2}\right)\hfill \\ \hfill & =\left({w}_{j,j}-{w}_{i,j}\right)\left(t-{m}_{i}\right)+{m}_{j}\sum _{k\in s}{w}_{j,k}{\left({x}_{k}-{x}_{j}\right)}^{2}-{m}_{i}\sum _{k\in s}{w}_{i,k}{\left({x}_{k}-{x}_{i}\right)}^{2}\hfill \\ \hfill & \text{\hspace{0.17em}}\text{\hspace{0.17em}}+{o}_{i,j}\left({\lambda }^{2}\right)+{O}_{i,j}\left(\lambda {n}^{-1}+{\left(n\lambda \right)}^{-3/2}\right)\hfill \\ \hfill & =\left({w}_{j,j}-{w}_{i,j}\right)\left(t-{m}_{i}\right)+\left({m}_{j}-{m}_{i}\right)\sum _{k\in s}{w}_{j,k}{\left({x}_{k}-{x}_{j}\right)}^{2}\hfill \\ \hfill & \text{\hspace{0.17em}}\text{\hspace{0.17em}}+{m}_{i}\left(\sum _{k\in s}{w}_{j,k}{\left({x}_{k}-{x}_{j}\right)}^{2}-\sum _{k\in s}{w}_{i,k}{\left({x}_{k}-{x}_{i}\right)}^{2}\right)\hfill \\ \hfill & \text{\hspace{0.17em}}\text{\hspace{0.17em}}+{o}_{i,j}\left({\lambda }^{2}\right)+{O}_{i,j}\left(\lambda {n}^{-1}+{\left(n\lambda \right)}^{-3/2}\right)\hfill \end{array}$

so that

${C}_{2}={C}_{2,a}+{C}_{2,b}+{C}_{2,c}+o\left({\lambda }^{2}\right)+O\left(\lambda {n}^{-1}+{\left(n\lambda \right)}^{-3/2}\right),$

where

$\begin{array}{ll}{C}_{2,a}\hfill & :=\frac{1}{N}\sum _{i\notin s}\sum _{j\in s}{w}_{i,j}{G}^{\left(1,0\right)}\left(t-{m}_{i}|\text{\hspace{0.17em}}{x}_{j}\right)\left({w}_{j,j}-{w}_{i,j}\right)\left(t-{m}_{i}\right)\hfill \\ \hfill & =\frac{1}{N}\sum _{i\notin s}{G}^{\left(1,0\right)}\left(t-{m}_{i}|\text{\hspace{0.17em}}{x}_{i}\right)\left(t-{m}_{i}\right)\sum _{j\in s}{w}_{i,j}\left({w}_{j,j}-{w}_{i,j}\right)+O\left({n}^{-1}\right)\hfill \\ \hfill & =\frac{1}{n\lambda }\frac{N-n}{N}\frac{K\left(0\right)-\kappa }{{\mu }_{0}}{\int }_{a}^{b}{G}^{\left(1,0\right)}\left(t-m\left(x\right)\text{\hspace{0.17em}}|\text{\hspace{0.17em}}x\right)\left(t-m\left(x\right)\right)\left[{h}_{\text{\hspace{0.17em}}\overline{s}}\left(x\right)/{h}_{s}\left(x\right)\right]dx\hfill \\ \hfill & \text{\hspace{0.17em}}\text{\hspace{0.17em}}+O\left({\left(n\lambda \right)}^{-1}{\lambda }^{-1}\alpha +{n}^{-1}\right)\hfill \end{array}$

with $\kappa :={\int }_{-1}^{1}{K}^{2}\left(u\right)du,$

$\begin{array}{ll}{C}_{2,b}\hfill & :=\frac{1}{N}\sum _{i\notin s}\sum _{j\in s}{w}_{i,j}{G}^{\left(1,0\right)}\left(t-{m}_{i}|\text{\hspace{0.17em}}{x}_{j}\right)\left({m}_{j}-{m}_{i}\right)\sum _{k\in s}{w}_{j,k}{\left({x}_{k}-{x}_{j}\right)}^{2}\hfill \\ \hfill & =o\left({}^{}\hfill \end{array}$