2. Definitions and notation
Piero Demetrio Falorsi and Paolo Righi
In this section, we introduce the concepts of estimation domain and planned domain, which play a key role in the framework presented herein.
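Let $U$ be the reference population of $N$ elements and let $U_d$ $(d = 1, \ldots, D)$ be an estimation domain, i.e., a generic sub-population of $U$ with $N_d$ elements, for which separate estimates must be calculated. Let $y_{rk}$ denote the value of the $r$-th $(r = 1, \ldots, R)$ variable of interest attached to the $k$-th population unit and let $\gamma_{dk}$ denote the domain membership indicator for unit $k$, defined as
$$
\gamma_{dk} = \begin{cases} 1 & \text{if } k \in U_d, \\ 0 & \text{otherwise.} \end{cases} \qquad (2.1)
$$
We assume that the $\gamma_{dk}$ values are available in the sampling frame and that more than one value $\gamma_{dk}$ $(d = 1, \ldots, D)$ can be 1 for each unit $k$; therefore, the estimation domains can overlap.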
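The parameters of interest are the $D \times R$ domain totals
$$
t_{(dr)} = \sum_{k \in U} y_{rk} \, \gamma_{dk} \qquad (d = 1, \ldots, D; \; r = 1, \ldots, R). \qquad (2.2)
$$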
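As a small illustration of how the domain totals follow from the membership indicators, the sketch below uses a toy population with a single variable of interest and two overlapping domains. The data values and the names `y`, `gamma` and `domain_totals` are assumptions made for the example only; they do not come from the paper.

```python
# Toy population (illustrative values, not from the paper):
# one variable of interest and two overlapping estimation domains.
y = [10.0, 20.0, 30.0, 40.0]      # y[k]: value of the variable for unit k
gamma = [                          # gamma[d][k] = 1 if unit k belongs to domain d
    [1, 1, 0, 0],                  # domain 0 contains units 0 and 1
    [0, 1, 1, 1],                  # domain 1 contains units 1, 2 and 3
]


def domain_totals(y, gamma):
    """Return, for each domain d, the sum over all units k of y[k] * gamma[d][k]."""
    return [sum(g_k * y_k for g_k, y_k in zip(g_d, y)) for g_d in gamma]


totals = domain_totals(y, gamma)
print(totals)  # [30.0, 90.0]
```

Because unit 1 belongs to both domains, the two domain totals add up to more than the population total of 100, which is what overlapping estimation domains imply.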
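Let $p(\cdot)$ be a single-stage without replacement sampling design and let $\boldsymbol{\pi} = (\pi_1, \ldots, \pi_N)'$ be the vector of inclusion probabilities. Let $s$ be the sample selected with probability $p(s)$. Denote by $U_h$ $(h = 1, \ldots, H)$ the subpopulation of size $N_h = \sum_{k \in U} \delta_{hk}$, where $\delta_{hk} = 1$ if $k \in U_h$ and $\delta_{hk} = 0$ otherwise.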
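We focus on fixed size sampling designs, which are those satisfying
$$
\sum_{k \in s} \boldsymbol{\delta}_k = \mathbf{n}, \qquad (2.3)
$$
where $\boldsymbol{\delta}_k = (\delta_{1k}, \ldots, \delta_{Hk})'$ and $\mathbf{n} = (n_1, \ldots, n_H)'$ is the vector of integer numbers defining the sample sizes fixed at the design stage. Since the sample size $n_h$ corresponding to $U_h$ does not vary among sample selections, the subpopulation $U_h$ will be referred to as a planned domain in the sequel. A necessary but not sufficient condition for ensuring that (2.3) is satisfied is that the vector $\boldsymbol{\pi}$ is such that
$$
\sum_{k \in U} \pi_k \, \boldsymbol{\delta}_k = \mathbf{n}.
$$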
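The difference between the fixed size requirement and the necessary condition on the inclusion probabilities can be checked numerically. The following sketch assumes a toy frame of six units and two overlapping planned domains. The inclusion probabilities, the planned sizes and the helper names `expected_sizes` and `realized_sizes` are all hypothetical.

```python
from fractions import Fraction

# delta[k] is the vector of planned-domain indicators for unit k
# (illustrative frame: N = 6 units, H = 2 overlapping planned domains).
delta = [(1, 0), (1, 0), (1, 1), (1, 1), (0, 1), (0, 1)]
pi = [Fraction(1, 2)] * 6   # hypothetical inclusion probabilities
n = [2, 2]                  # hypothetical planned sample sizes


def expected_sizes(pi, delta):
    """Sum over the population of pi[k] * delta[k], one entry per planned domain."""
    H = len(delta[0])
    return [sum(p * d[h] for p, d in zip(pi, delta)) for h in range(H)]


def realized_sizes(sample, delta):
    """Sum of delta[k] over the sampled units, one entry per planned domain."""
    H = len(delta[0])
    return [sum(delta[k][h] for k in sample) for h in range(H)]


necessary_ok = expected_sizes(pi, delta) == n     # condition on the probabilities
good = realized_sizes({0, 2, 4}, delta) == n      # this sample has the planned sizes
bad = realized_sizes({0, 1, 4}, delta) == n       # this sample does not
print(necessary_ok, good, bad)  # True True False
```

The probabilities satisfy the necessary condition, yet the sample {0, 1, 4} misses the planned size in the second domain. Whether such samples can occur depends on the design $p(\cdot)$, which is why the condition is necessary but not sufficient.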
In our setting, the planned domains can overlap; therefore, the unit $k$ may have more than one value $\delta_{hk} = 1$ (for $h = 1, \ldots, H$).
Let us suppose that the $\boldsymbol{\delta}_k$ values are known, and available in the sampling frame, for all population units. We suppose furthermore that
the $H \times H$ matrix $\sum_{k \in U} \boldsymbol{\delta}_k \boldsymbol{\delta}_k'$ is non-singular.
The planned domains and their relationship with the estimation domains play a central role in our generalized framework. We assume that the estimation domains may be defined as an aggregation of complete planned domains, which ensures that the expected sample size in the estimation domain $U_d$, say $n_d$, can be obtained as a simple aggregation of the expected sample sizes of the planned domains that are included within it. Finally, let $\hat{t}_{(dr)} = \sum_{k \in s} y_{rk} \, \gamma_{dk} / \pi_k$ be the Horvitz-Thompson (HT) estimator of $t_{(dr)}$.
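For concreteness, the HT estimator of a domain total weights each sampled value by the inverse of its inclusion probability and keeps only the units that belong to the domain. The sketch below assumes a toy population of four units; the data, the sample and the function name `ht_domain_total` are illustrative and not taken from the paper.

```python
# Illustrative data (not from the paper): four units, two overlapping domains.
y = [10.0, 20.0, 30.0, 40.0]        # variable of interest
pi = [0.5, 0.5, 0.25, 0.25]         # hypothetical inclusion probabilities
gamma = [
    [1, 1, 0, 0],                   # domain 0 contains units 0 and 1
    [0, 1, 1, 1],                   # domain 1 contains units 1, 2 and 3
]
sample = [1, 2]                     # labels of the units selected in the sample


def ht_domain_total(sample, y, pi, gamma_d):
    """Horvitz-Thompson estimate: sum over sampled k of y[k] * gamma_d[k] / pi[k]."""
    return sum(y[k] * gamma_d[k] / pi[k] for k in sample)


estimates = [ht_domain_total(sample, y, pi, g_d) for g_d in gamma]
print(estimates)  # [40.0, 160.0]
```

Unit 1 contributes to both estimates because it belongs to both domains, while unit 2 contributes only to the second.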
An example from business surveys. Suppose that the survey estimates must be calculated separately for three domain types: region (with 20 modalities), economic activity (2 modalities: goods and services) and enterprise size (3 modalities: small, medium and large enterprises). That is, there are $D = 20 + 2 + 3 = 25$ possible overlapping estimation domains. The planned domains can be defined in different ways.
Option 1. The single planned domain $U_h$ is identified by a specific intersection of the categories of the estimation domains. In this case $H = 20 \times 2 \times 3 = 120$ planned domains are defined. They represent a specific partition of $U$. The planned domains do not overlap and $\sum_{h=1}^{H} \delta_{hk} = 1$ for every unit $k$.
Option 2. The planned domains $U_h$ coincide with the estimation domains. Therefore, $H = D = 25$ and the $\boldsymbol{\delta}_k$ are defined as vectors with three 1's, so that $\sum_{h=1}^{H} \delta_{hk} = 3$. Recall that the planned domains overlap.
Option 3. The planned domains $U_h$ are defined as (i) region by economic activity and (ii) economic activity by enterprise size; then, $H = (20 \times 2) + (2 \times 3) = 46$, with $\sum_{h=1}^{H} \delta_{hk} = 2$.
Other intermediate
relationships among estimation and planned domains are possible.
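The three options of the business survey example can be enumerated to count the planned domains and the number of planned domains each enterprise belongs to. The sketch below is only an illustration: the domain labels, the sample enterprise and the helper names `planned_domains` and `delta_vector` are assumptions, and only the modality counts (20 regions, 2 activities, 3 sizes) come from the example.

```python
from itertools import product

REGIONS = range(20)
ACTIVITIES = ("goods", "services")
SIZES = ("small", "medium", "large")


def planned_domains(option):
    """List hypothetical labels of the planned domains under each option."""
    if option == 1:   # full cross-classification (strata)
        return [("r*a*s", r, a, s) for r, a, s in product(REGIONS, ACTIVITIES, SIZES)]
    if option == 2:   # planned domains coincide with the estimation domains
        return ([("r", r) for r in REGIONS]
                + [("a", a) for a in ACTIVITIES]
                + [("s", s) for s in SIZES])
    if option == 3:   # region by activity, plus activity by size
        return ([("r*a", r, a) for r, a in product(REGIONS, ACTIVITIES)]
                + [("a*s", a, s) for a, s in product(ACTIVITIES, SIZES)])
    raise ValueError("option must be 1, 2 or 3")


def delta_vector(unit, option):
    """Indicator vector of planned-domain membership for one enterprise."""
    r, a, s = unit
    member = {
        1: {("r*a*s", r, a, s)},
        2: {("r", r), ("a", a), ("s", s)},
        3: {("r*a", r, a), ("a*s", a, s)},
    }[option]
    return [1 if dom in member else 0 for dom in planned_domains(option)]


unit = (7, "services", "medium")   # an arbitrary enterprise
summary = {opt: (len(planned_domains(opt)), sum(delta_vector(unit, opt)))
           for opt in (1, 2, 3)}
print(summary)  # {1: (120, 1), 2: (25, 3), 3: (46, 2)}
```

Each pair gives the number of planned domains and the number of 1's in the enterprise's indicator vector, so only Option 1 yields a partition of the population.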
It is emphasised that the planned domains represent the basis for defining broad classes of sampling designs. For instance, stratified sampling designs require that the planned domains do not overlap, i.e., $\sum_{h=1}^{H} \delta_{hk} = 1$ for every unit $k$, and each $U_h$ is referred to as a stratum. Therefore, Option 1 in the example above leads us to define a stratified sampling design. Furthermore, the strata defined as in Option 1 are the basis of the so-called "multi-way stratified sampling design" (Winkler 2001).
If $\sum_{h=1}^{H} \delta_{hk} > 1$, the sample sizes of the planned domains identified in Option 1 (strata) are not strictly controlled. Nevertheless, the sample sizes are still controlled at an aggregated level. In Option 2 of the example above, the sample sizes are controlled only for the estimation domains, while in Option 3 the sample sizes are controlled for the subsets of two different partitions, defined by (i) region by economic activity and (ii) economic activity by enterprise size. On the basis of Winkler's definition, we denote the designs using these types of planned domains as Incomplete multi-way Stratified Sampling (ISS) designs.