Comparison of the conditional bias and Kokic and Bell methods for Poisson and stratified sampling
Section 2. The processing of influential units by winsorization following the approach of Kokic and Bell


In this section, we present the method initially proposed by Kokic and Bell (1994), which applies to samples selected through stratified simple random sampling, and an extension of this method to the case of samples selected through Poisson sampling.

2.1 Case of stratified simple random sampling

Consider a finite population $U$ of size $N$ and a variable of interest $X$ observed on a sample $S$ of fixed size $n,$ for which we seek to estimate the population total $T (X) = \sum_{i \in U} X_{i} .$ The approach of Kokic and Bell (1994) is based on the following hypotheses:

$X$ is a non-negative variable;
$S$ is selected according to a stratified simple random sampling design $P,$ with strata $U_{h} , h =1, \dots, H .$ In each stratum of size $N_{h},$ a sample $S_{h}$ of size $n_{h}$ is selected by simple random sampling without replacement. The expectation with respect to the sampling design will be denoted $E_{p}$ hereafter;
in each stratum $U_{h},$ the values of $X$ in the population are derived from random variables $X_{h i}$ that are independent and identically distributed according to a law $L_{h}$ (or of the same model $m)$ with expectation $μ_{h} .$ The expectation and the variance with respect to this model will be denoted $E_{m}$ and $V_{m}$ respectively hereafter;
we have, for each stratum $U_{h},$ $N_{h}$ realizations ${\overset{⌣}{X}}_{h i}$ of the variable $X$ derived from the same law $L_{h}$ but independent of the sample $S_{h} .$

In this context, Kokic and Bell (1994) propose applying a Type II winsorization; they associate with each stratum $U_{h}$ a threshold $K_{h}$ independent of the sample $S$ and define the winsorized variable $\tilde{X},$ for $i \in S,$ by:

${\tilde{X}}_{h i} = \begin{cases} X_{h i} & \text{if } X_{h i} < K_{h} \\ \frac{n_{h}}{N_{h}} X_{h i} + (1 - \frac{n_{h}}{N_{h}}) K_{h} & \text{if } X_{h i} \geq K_{h} . \end{cases}$

The winsorized estimator of the total of $X$ is then the expansion estimator of the total of the winsorized variable $\tilde{X}$: $\hat{T} (\tilde{X}) = \sum_{h =1}^{H} \frac{N_{h}}{n_{h}} \sum_{i \in S_{h}} {\tilde{X}}_{h i} .$
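As an illustration, the Type II winsorization and the expansion estimator defined above can be sketched in Python. This is a minimal sketch, not the authors' implementation: the function names `winsorize_type2` and `winsorized_total` and the dictionary-based layout of the inputs are assumptions made for the example, and the thresholds $K_{h}$ are taken as given.

```python
def winsorize_type2(x, n_h, N_h, K_h):
    """Type II winsorization of one sampled value in stratum h."""
    if x < K_h:
        return x
    f = n_h / N_h  # sampling fraction n_h / N_h
    return f * x + (1 - f) * K_h


def winsorized_total(samples, N, K):
    """Expansion estimator of the total of the winsorized variable.

    samples: dict stratum -> list of sampled values (hypothetical layout)
    N:       dict stratum -> stratum size N_h
    K:       dict stratum -> threshold K_h, assumed known here
    """
    total = 0.0
    for h, values in samples.items():
        n_h = len(values)
        winsorized = (winsorize_type2(x, n_h, N[h], K[h]) for x in values)
        total += N[h] / n_h * sum(winsorized)
    return total
```

With a single stratum of size 6, a sample of 3 values (1, 2, 10) and a threshold of 4, only the value 10 is modified, and it is pulled halfway toward the threshold because the sampling fraction is 1/2.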

The thresholds $K_{h}$ are determined so as to obtain the estimator $\hat{T} (\tilde{X})$ with the lowest mean square error with respect to both the sampling design and the law of $X$ in each stratum, i.e.,

${(K_{h}^{*})}_{h =1, \dots, H} \in {Argmin}_{{(K_{h})}_{h =1, \dots, H}} E_{m} E_{P} {{[\hat{T} (\tilde{X}) - T (X)]}^{2}} .$

The optimal thresholds must therefore protect the winsorized estimator on average over all possible samples in the population, and on average on the law of the variable of interest, i.e., on average over all the possible populations considering the law of $X .$

Kokic and Bell (1994) place themselves in an asymptotic framework by considering a set of populations, sampling designs and samples indexed by $ν \in ℕ$ such that:

$\forall ν \in ℕ, \forall h =1, \dots, H, n_{h ν} >1;$
$N_{ν}, n_{ν} \underset{ν \to + \infty}{\to} + \infty;$
$\exists ϵ \in ] 0, 1 / 2 [, \forall ν \in ℕ, \forall h =1, \dots, H, ϵ < \frac{n_{h ν}}{N_{h ν}} <1 - ϵ;$
the number of strata $H$ is fixed.

They also propose denoting by $J_{h i} = I (X_{h i} \geq K_{h})$ the winsorization indicator. To lighten the notation, in the rest of the article we will omit the index $ν,$ as well as the index $i$ in the expression of the expectations and variances $E_{m}$ and $V_{m}$ of the random variables $X_{h i}$ and $J_{h i}$ under the law of $X$ in stratum $h .$ Insofar as these variables are assumed to be independent and identically distributed in each stratum, $E_{m} (X_{h i}),$ for example, is indeed the same regardless of the observation considered.

In this context, Kokic and Bell (1994) show that, at the optimum and asymptotically, all the thresholds are linked to one another by the relation:

$(\frac{N_{h}}{n_{h}} - 1) (K_{h} - μ_{h}) \sim - B (2.1)$

with $B = \sum_{h =1}^{H} N_{h} (1 - \frac{n_{h}}{N_{h}}) [K_{h} E_{m} (J_{h}) - E_{m} (X_{h} J_{h})]$ the bias of the winsorized estimator. The notation $\sim$ corresponds to an asymptotic equivalence when $n_{ν}$ tends toward infinity (which is equivalent to saying when $ν$ tends toward infinity).

If we denote $X_{h i}^{*} = (\frac{N_{h}}{n_{h}} - 1) (X_{h i} - μ_{h})$ and $L = - B,$ then, at the optimum and given (2.1), $J_{h i} = J_{h i}^{*} = I (X_{h i}^{*} \geq L)$ and the bias $B$ is the opposite of the zero-point of the function $F$ defined by:

$F (L) = L \left\{ 1 + \sum_{h =1}^{H} n_{h} E_{m} (J_{h}^{*}) \right\} - \sum_{h =1}^{H} n_{h} E_{m} (X_{h}^{*} J_{h}^{*}) . (2.2)$
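The step from (2.1) to (2.2) can be made explicit. The following is only a sketch, which treats the asymptotic equivalence (2.1) as an equality and uses the notation already introduced:

```latex
% From (2.1) with L = -B, and from the definition of X^*_{hi}:
K_h = \mu_h + \frac{L}{N_h / n_h - 1},
\qquad
X_{hi} = \mu_h + \frac{X_{hi}^{*}}{N_h / n_h - 1}.

% Hence, since J_h = J_h^* at the optimum:
K_h E_m(J_h) - E_m(X_h J_h)
  = \frac{L \, E_m(J_h^{*}) - E_m(X_h^{*} J_h^{*})}{N_h / n_h - 1}.

% Since N_h (1 - n_h / N_h) / (N_h / n_h - 1) = n_h, the bias becomes:
B = \sum_{h=1}^{H} n_h \left[ L \, E_m(J_h^{*}) - E_m(X_h^{*} J_h^{*}) \right].

% Setting B = -L gives F(L) = 0:
L \left\{ 1 + \sum_{h=1}^{H} n_h E_m(J_h^{*}) \right\}
  - \sum_{h=1}^{H} n_h E_m(X_h^{*} J_h^{*}) = 0.
```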

Determining the zero-point of the function $F$ requires estimates of $μ_{h},$ $E_{m} (J_{h}^{*})$ and $E_{m} (X_{h}^{*} J_{h}^{*}) .$ To obtain them, Kokic and Bell (1994) rely on observations of the variable $X$ in each stratum. These observations must come from a source independent of the sample, since the derivation of formulas (2.1) and (2.2) relies on the thresholds $K_{h}$ being independent of the sample $S .$

If we assume that for each stratum $h$ we have $p_{h}$ realizations ${\overset{⌣}{X}}_{h i}$ of $X,$ then we can estimate $F$ by:

$\begin{array}{ll} \hat{F} (L) & = L \left\{ 1 + \sum_{h =1}^{H} n_{h} \frac{\sum_{i =1}^{p_{h}} I ({\overset{⌣}{X}}_{h i}^{*} \geq L)}{p_{h}} \right\} \\ & \quad - \sum_{h =1}^{H} n_{h} \frac{\sum_{i =1}^{p_{h}} {\overset{⌣}{X}}_{h i}^{*} I ({\overset{⌣}{X}}_{h i}^{*} \geq L)}{p_{h}} \quad (2.3) \end{array}$

with

${\overset{⌣}{X}}_{h i}^{*} = (\frac{N_{h}}{n_{h}} - 1) ({\overset{⌣}{X}}_{h i} - \frac{\sum_{j =1}^{p_{h}} {\overset{⌣}{X}}_{h j}}{p_{h}})$

and estimate the optimal bias $B$ as the opposite of the zero-point of $\hat{F} .$

Now, $\hat{F}$ is an increasing, piecewise linear function, and it therefore admits only one zero-point. This zero-point can be located simply by denoting ${\overset{⌣}{X}}_{(i)}^{*}$ the values of ${\overset{⌣}{X}}_{h i}^{*}$ sorted in ascending order and by calculating $\hat{F} ({\overset{⌣}{X}}_{(1)}^{*}),$ $\hat{F} ({\overset{⌣}{X}}_{(2)}^{*}), \dots$ until the sign of $\hat{F}$ changes.

Indeed, $\hat{F} ({\overset{⌣}{X}}_{(1)}^{*}) = {\overset{⌣}{X}}_{(1)}^{*} + \sum_{h =1}^{H} n_{h} \frac{\sum_{i =1}^{p_{h}} ({\overset{⌣}{X}}_{(1)}^{*} - {\overset{⌣}{X}}_{h i}^{*})}{p_{h}}$ is negative because ${\overset{⌣}{X}}_{(1)}^{*}$ is by definition lower than all the other ${\overset{⌣}{X}}_{h i}^{*}$ and because ${\overset{⌣}{X}}_{(1)}^{*}$ is itself negative, since $\frac{\sum_{j =1}^{p_{h}} {\overset{⌣}{X}}_{h j}^{*}}{p_{h}} =0$ in each stratum. Conversely, $\hat{F} ({\overset{⌣}{X}}_{(p)}^{*}) = {\overset{⌣}{X}}_{(p)}^{*} \geq 0$ for similar reasons, where $p = \sum_{h =1}^{H} p_{h} .$

Denoting by $j$ the index such that $\hat{F} ({\overset{⌣}{X}}_{(j)}^{*}) \leq 0$ and $\hat{F} ({\overset{⌣}{X}}_{(j + 1)}^{*}) \geq 0,$ $B$ can be estimated by linear interpolation, i.e., by

$\hat{B} = - \frac{{\overset{⌣}{X}}_{(j)}^{*} \hat{F} ({\overset{⌣}{X}}_{(j + 1)}^{*}) - {\overset{⌣}{X}}_{(j + 1)}^{*} \hat{F} ({\overset{⌣}{X}}_{(j)}^{*})}{\hat{F} ({\overset{⌣}{X}}_{(j + 1)}^{*}) - \hat{F} ({\overset{⌣}{X}}_{(j)}^{*})} . (2.4)$
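The search for the zero-point of $\hat{F}$ can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `estimate_bias_stratified` and the dictionary layout are hypothetical, the interpolation is the standard zero of the straight line through the two bracketing points, and the thresholds are recovered by treating the asymptotic relation (2.1) as an equality, with $μ_{h}$ replaced by the mean of the independent realizations.

```python
def estimate_bias_stratified(aux, n, N):
    """Estimate the bias B and the thresholds K_h from independent data.

    aux: dict stratum -> list of realizations independent of the sample
    n:   dict stratum -> sample size n_h
    N:   dict stratum -> stratum size N_h
    Returns (B_hat, K_hat) where K_hat is a dict stratum -> threshold.
    """
    mean, star = {}, {}
    for h, vals in aux.items():
        mean[h] = sum(vals) / len(vals)
        scale = N[h] / n[h] - 1
        # Centred and rescaled values X*_{hi}
        star[h] = [scale * (v - mean[h]) for v in vals]

    def F_hat(L):
        # Equation (2.3), rewritten as L + sum_h (n_h / p_h) sum_i (L - X*) I(X* >= L)
        val = L
        for h, vals in star.items():
            val += n[h] / len(vals) * sum(L - v for v in vals if v >= L)
        return val

    knots = sorted(v for vals in star.values() for v in vals)
    root = knots[-1]  # fallback if no sign change is bracketed
    for lo, hi in zip(knots, knots[1:]):
        f_lo, f_hi = F_hat(lo), F_hat(hi)
        if f_lo <= 0 <= f_hi:
            # Zero of the line through (lo, f_lo) and (hi, f_hi)
            root = lo if f_hi == f_lo else lo - f_lo * (hi - lo) / (f_hi - f_lo)
            break

    B_hat = -root
    # Relation (2.1) used as an equality: K_h = mu_h + L / (N_h / n_h - 1)
    K_hat = {h: mean[h] + root / (N[h] / n[h] - 1) for h in aux}
    return B_hat, K_hat
```

Because $\hat{F}$ is continuous and linear between two consecutive sorted values, the interpolated point is an exact zero of $\hat{F}$ on the bracketing interval.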

2.2 Extension to the case of the Poisson sampling design

We now consider the situation in which the sampling design $P$ by which $S$ is selected is a Poisson sampling design, in which each unit $i$ of the population belongs to the sample with a probability $π_{i} >0.$ We are still interested in estimating the population total $T (X) = \sum_{i \in U} X_{i}$ of a variable $X .$ The extension of the Kokic and Bell method to this sampling design assumes:

that $X$ is a non-negative variable;
that it is possible to partition the population and the sample into subpopulations $U_{h}$ and $S_{h}$ in which all the values $d_{h i} X_{h i}$ are independent realizations from the same model verifying:
- $\forall h =1, \dots, H, \forall i \in U_{h}, d_{h i} X_{h i} = μ_{h} + ϵ_{h i}, (2.5)$
- with
- $\begin{array}{ll} E_{m} (ϵ_{h i}) & =0 \\ V_{m} (ϵ_{h i}) & = σ_{h}^{2} < + \infty \end{array}$
- where $E_{m}$ and $V_{m}$ denote the expectation and variance with respect to the model (2.5).

In this context, we propose, as in the original method applied to stratified simple random sampling, associating a threshold $K_{h}, h =1, \dots, H$ with each part $S_{h}, h =1, \dots, H$ and defining:

the winsorized variable $\tilde{X}$ by
- ${\tilde{X}}_{h i} = \begin{cases} X_{h i} & \text{if } d_{h i} X_{h i} \leq K_{h} \\ \frac{X_{h i}}{d_{h i}} + (1 - \frac{1}{d_{h i}}) \frac{K_{h}}{d_{h i}} & \text{if } d_{h i} X_{h i} > K_{h}, \end{cases} (2.6)$
- where $d_{h i} = \frac{1}{π_{i}}$ is the weight of the unit $i$ in part $h .$
the winsorized estimator of the total of $X$ as the usual expansion estimator of the total of $\tilde{X}$:
- $\hat{T} (\tilde{X}) = \sum_{h =1}^{H} \sum_{i \in S_{h}} d_{h i} {\tilde{X}}_{h i} . (2.7)$
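Definitions (2.6) and (2.7) can be illustrated by a short Python sketch. The function names `winsorize_poisson` and `winsorized_total_poisson` and the representation of the sample as a list of `(h, x, pi)` tuples are assumptions made for the example; the thresholds $K_{h}$ are taken as given.

```python
def winsorize_poisson(x, d, K_h):
    """Winsorized value (2.6) for a unit with value x and weight d = 1 / pi."""
    if d * x <= K_h:
        return x
    return x / d + (1 - 1 / d) * K_h / d


def winsorized_total_poisson(sample, K):
    """Expansion estimator (2.7) of the total of the winsorized variable.

    sample: list of (h, x, pi) tuples for the selected units (hypothetical layout)
    K:      dict part -> threshold K_h, assumed known here
    """
    total = 0.0
    for h, x, pi in sample:
        d = 1 / pi  # design weight of the unit
        total += d * winsorize_poisson(x, d, K[h])
    return total
```

Note that the comparison with the threshold bears on the weighted value $d_{h i} X_{h i},$ not on $X_{h i}$ itself, so a moderate value with a large weight can be winsorized.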

In the article by Kokic and Bell (1994), the subpopulations with which the thresholds are associated are the sampling strata, which satisfy two properties: the draws are independent between strata, and the authors postulate an identical population model for all observations in the same stratum. In the case of Poisson sampling, the draws are by nature independent between units.

The strong hypothesis underlying model (2.5) is that values $X_{h i}$ multiplied by weights $d_{h i}$ are assumed to have constant expectation in each stratum. This means that the inclusion probabilities within each stratum are defined proportionally to the variable of interest $X .$ In practice, these inclusion probabilities are often defined proportionally to a known auxiliary variable that is strongly correlated with $X,$ which makes it possible to be close to the hypothesis underlying model (2.5).

Note also that model (2.5) is the one under which the Horvitz-Thompson estimator is optimal in the sense of minimizing the mean square error with respect to the model.

In the following, the random variables $d_{h i} X_{h i}$ being assumed to be independent and identically distributed within each stratum, we will denote $Z_{h i} = d_{h i} X_{h i} .$

We also place ourselves in the same asymptotic framework as Kokic and Bell (1994) by adapting the hypothesis on the inclusion probabilities:

$\forall h =1, \dots, H, \exists (λ_{1 h}, λ_{2 h}) \in {] 0, 1 [}^{2} \text{ such that } \forall i \in U_{h} , \min (π_{i}) > λ_{1 h} \text{ and } \max (π_{i}) < λ_{2 h} . (2.8)$

As in the approach presented in the previous section, the thresholds $K_{h}$ are determined so as to minimize the mean square error of the winsorized estimator $\hat{T} (\tilde{X})$ with respect to both the model of the variable $X$ and the sampling design $P,$ i.e., on average across all possible populations, given the super-population model applied to $X$ and on average for all samples drawn from these populations, given the sampling design $P :$

${(K_{h}^{*})}_{h =1, \dots, H} \in {Argmin}_{{(K_{h})}_{h =1, \dots, H}} E_{m} E_{P} {{[\hat{T} (\tilde{X}) - T (X)]}^{2}} .$

It is possible to show (see Appendix A) that at the optimum and asymptotically, denoting as previously $J_{h i} = I (Z_{h i} > K_{h})$ and omitting the index $i$ in the expression of expectations and variances under model (2.5) of the variables $Z_{h i}$ and $J_{h i} :$

$\forall h =1, \dots, H, K_{h} \sim - \frac{A_{h}}{C_{h} + D_{h}} B (2.9)$

with

$\begin{array}{ll} A_{h} & = \sum_{i \in U_{h}} \frac{1}{d_{h i}} (1 - \frac{1}{d_{h i}}) \\ C_{h} & = \sum_{i \in U_{h}} {(\frac{1}{d_{h i}})}^{2} {(1 - \frac{1}{d_{h i}})}^{2} \\ D_{h} & = \sum_{i \in U_{h}} \frac{1}{d_{h i}} {(1 - \frac{1}{d_{h i}})}^{3} \end{array}$

and

$B = \sum_{h =1}^{H} A_{h} [K_{h} E_{m} (J_{h}) - E_{m} (J_{h} Z_{h})] . (2.10)$

$B$ is the bias of the winsorized estimator $\hat{T} (\tilde{X}) .$ At the optimum, the threshold $K_{h}$ is therefore asymptotically equal to the opposite of the bias multiplied by the positive term $\frac{A_{h}}{C_{h} + D_{h}} .$
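The coefficients $A_{h},$ $C_{h}$ and $D_{h}$ depend only on the inclusion probabilities $π_{i} = 1 / d_{h i}$ of the units of $U_{h},$ so they can be computed before any data are collected. A minimal sketch follows; the function name `threshold_coefficients` is hypothetical.

```python
def threshold_coefficients(pi):
    """Coefficients A_h, C_h, D_h for one part h.

    pi: inclusion probabilities of all population units in U_h
    """
    A = sum(p * (1 - p) for p in pi)
    C = sum(p ** 2 * (1 - p) ** 2 for p in pi)
    D = sum(p * (1 - p) ** 3 for p in pi)
    return A, C, D
```

When all units of a part share the same inclusion probability $π,$ the sum $C_{h} + D_{h}$ factorizes as $N_{h} π {(1 - π)}^{2},$ so the multiplier $\frac{A_{h}}{C_{h} + D_{h}}$ reduces to $\frac{1}{1 - π} .$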

If we denote $L = - B$ and $X_{h i}^{*} = \frac{C_{h} + D_{h}}{A_{h}} Z_{h i},$ then asymptotically $J_{h i} = J_{h i}^{*} = I (X_{h i}^{*} > L)$ using relation (2.9).

By substituting equivalence relation (2.9) into formula (2.10) defining $B,$ we obtain that, at the optimum and asymptotically, $B$ is the opposite of the zero-point of the function $F$ defined by:

$F (L) = L (1 + \sum_{h =1}^{H} \frac{A_{h}^{2}}{C_{h} + D_{h}} E_{m} (J_{h}^{*})) - \sum_{h =1}^{H} \frac{A_{h}^{2}}{C_{h} + D_{h}} E_{m} (J_{h}^{*} X_{h}^{*}) . (2.11)$

As in the previous section, we finally assume that we have, for each subpopulation $h,$ $p_{h}$ realizations ${\overset{⌣}{X}}_{h i}$ drawn from the law of $X$ and independent of the sample $S .$ With these observations, we can estimate $F$ by:

$\hat{F} (L) = L (1 + \sum_{h =1}^{H} \frac{A_{h}^{2}}{C_{h} + D_{h}} \frac{\sum_{i =1}^{p_{h}} I ({\overset{⌣}{X}}_{h i}^{*} > L)}{p_{h}}) - \sum_{h =1}^{H} \frac{A_{h}^{2}}{C_{h} + D_{h}} \frac{\sum_{i =1}^{p_{h}} {\overset{⌣}{X}}_{h i}^{*} I ({\overset{⌣}{X}}_{h i}^{*} > L)}{p_{h}} (2.12)$

and estimate $B$ by the opposite of the zero-point of $\hat{F} .$

We will denote ${\overset{⌣}{X}}_{(j)}^{*}$ the values of the ${\overset{⌣}{X}}_{h i}^{*}$ placed in ascending order. Between two successive values ${\overset{⌣}{X}}_{(j)}^{*}$ and ${\overset{⌣}{X}}_{(j + 1)}^{*},$ the indicators $I ({\overset{⌣}{X}}_{h i}^{*} > L),$ as functions of $L,$ remain constant, so that $\hat{F}$ is linear in $L$ on this interval, with a positive slope. $\hat{F}$ is therefore a piecewise linear and increasing function of $L .$

In addition, $\hat{F} (0) = - \sum_{h =1}^{H} \frac{A_{h}^{2}}{C_{h} + D_{h}} \frac{\sum_{i =1}^{p_{h}} {\overset{⌣}{X}}_{h i}^{*}}{p_{h}} \leq 0$ and, when $L$ exceeds ${\overset{⌣}{X}}_{(p)}^{*},$ with $p = \sum_{h =1}^{H} p_{h},$ $\hat{F} (L) = L \geq 0.$ The zero-point of $\hat{F}$ can therefore be determined using a method similar to that proposed by Kokic and Bell (1994) in the case of stratified simple random sampling:

calculate $\hat{F} (0),$ $\hat{F} ({\overset{⌣}{X}}_{(1)}^{*}),$ $\hat{F} ({\overset{⌣}{X}}_{(2)}^{*}), \dots, \hat{F} ({\overset{⌣}{X}}_{(p)}^{*});$
identify the value $j$ such that $\hat{F} ({\overset{⌣}{X}}_{(j)}^{*}) \leq 0$ and $\hat{F} ({\overset{⌣}{X}}_{(j + 1)}^{*}) \geq 0,$ with the convention ${\overset{⌣}{X}}_{(0)}^{*} =0;$
$B$ is then estimated by linear interpolation, as in the previous section:
- $\hat{B} = - \frac{{\overset{⌣}{X}}_{(j)}^{*} \hat{F} ({\overset{⌣}{X}}_{(j + 1)}^{*}) - {\overset{⌣}{X}}_{(j + 1)}^{*} \hat{F} ({\overset{⌣}{X}}_{(j)}^{*})}{\hat{F} ({\overset{⌣}{X}}_{(j + 1)}^{*}) - \hat{F} ({\overset{⌣}{X}}_{(j)}^{*})} .$
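The three steps above can be sketched in Python. This is a sketch under explicit assumptions rather than the authors' implementation: the function name `estimate_bias_poisson` and the input layout are hypothetical, the independent realizations are assumed to be supplied as weighted values $Z = d X$ so that ${\overset{⌣}{X}}^{*} = \frac{C_{h} + D_{h}}{A_{h}} Z,$ and the thresholds are recovered by treating relation (2.9) as an equality.

```python
def estimate_bias_poisson(z_aux, coef):
    """Estimate the bias B and the thresholds K_h under Poisson sampling.

    z_aux: dict part -> list of realizations of Z = d * X, independent of the sample
    coef:  dict part -> (A_h, C_h, D_h)
    Returns (B_hat, K_hat) where K_hat is a dict part -> threshold.
    """
    weight = {h: A * A / (C + D) for h, (A, C, D) in coef.items()}
    ratio = {h: (C + D) / A for h, (A, C, D) in coef.items()}
    # Rescaled values X*_{hi} = (C_h + D_h) / A_h * Z_{hi}
    star = {h: [ratio[h] * z for z in vals] for h, vals in z_aux.items()}

    def F_hat(L):
        # Equation (2.12), rewritten as L + sum_h (w_h / p_h) sum_i (L - X*) I(X* > L)
        val = L
        for h, vals in star.items():
            val += weight[h] / len(vals) * sum(L - v for v in vals if v > L)
        return val

    # Step 1: evaluation grid, starting at 0 (the convention X*_(0) = 0)
    knots = [0.0] + sorted(v for vals in star.values() for v in vals)
    root = knots[-1]  # fallback if no sign change is bracketed
    # Step 2: find the bracketing pair; step 3: interpolate linearly
    for lo, hi in zip(knots, knots[1:]):
        f_lo, f_hi = F_hat(lo), F_hat(hi)
        if f_lo <= 0 <= f_hi:
            root = lo if f_hi == f_lo else lo - f_lo * (hi - lo) / (f_hi - f_lo)
            break

    B_hat = -root
    # Relation (2.9) used as an equality: K_h = A_h / (C_h + D_h) * L
    K_hat = {h: root / ratio[h] for h in z_aux}
    return B_hat, K_hat
```

For example, with one part whose coefficients are $(A, C, D) = (1, 0.25, 0.25)$ and weighted realizations 2, 2, 2 and 18, the rescaled values are 1, 1, 1 and 9, the sign change is bracketed between 1 and 9, and the interpolated zero-point is 3.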

ISSN : 1492-0921

Editorial policy

Survey Methodology publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves. All papers will be refereed. However, the authors retain full responsibility for the contents of their papers and opinions expressed are not necessarily those of the Editorial Board or of Statistics Canada.

Submission of Manuscripts

Survey Methodology is published twice a year in electronic format. Authors are invited to submit their articles in English or French in electronic form, preferably in Word to the Editor, (statcan.smj-rte.statcan@canada.ca, Statistics Canada, 150 Tunney’s Pasture Driveway, Ottawa, Ontario, Canada, K1A 0T6). For formatting instructions, please see the guidelines provided in the journal and on the web site (www.statcan.gc.ca/SurveyMethodology).

Note of appreciation

Canada owes the success of its statistical system to a long-standing partnership between Statistics Canada, the citizens of Canada, its businesses, governments and other institutions. Accurate and timely statistical information could not be produced without their continued co-operation and goodwill.

Standards of service to the public

Statistics Canada is committed to serving its clients in a prompt, reliable and courteous manner. To this end, the Agency has developed standards of service which its employees observe in serving its clients.

Copyright

Published by authority of the Minister responsible for Statistics Canada.

Use of this publication is governed by the Statistics Canada Open Licence Agreement.

Catalogue No. 12-001-X

Frequency: Semi-annual

Ottawa

Date modified: 2018-12-20
