# 4. Optimal solutions

Sun Woong Kim, Steven G. Heeringa and Peter W. Solenberger

Given the set of $L$ possible arrays in $\mathfrak{B}$ , consider the subset ${\mathfrak{B}}^{\text{'}}\text{\hspace{0.17em}}(\subseteq \mathfrak{B}\text{\hspace{0.17em}})$ where

$$p\left({B}_{k}\right)>0.$$

A **solution
set** to a controlled selection problem
$A$
denoted as

$$\left\{\left({B}_{k},p\left({B}_{k}\right)\right),\text{\hspace{0.17em}}\text{\hspace{0.05em}}{B}_{k}\in {\mathfrak{B}}^{\text{'}}\right\}$$

is the set of the arrays that have the required positive selection probabilities $\left(p\left({B}_{k}\right)>0\right)$ . This solution set, or simply a “solution” to the controlled selection problem, is usually obtained by an algorithm to control the constraints in (3.1) through (3.6). As described in the introduction, since Goodman and Kish (1950), many algorithms for obtaining solutions to controlled selection problems have been developed.

Until Groves and Hess (1975) suggested a computer algorithm, most solutions were manually obtained in a process that resembled solving a mathematical puzzle. Furthermore, for most problems it is possible that there is more than one solution set that meets the constraints. Since the 1980s, the computer-intensive controlled selection algorithms using transportation theory, network flow, integer programming, and LP have been developed. These algorithms may depend on highly specialized software or may be programmed to run in major software systems.

However, previous
solutions ranging from manual to computer-intensive algorithms have rarely been
compared empirically using a standard set of performance criteria. Therefore,
we begin with the description of a concept called **optimal solution sets**, or more simply, **optimal solutions**.

The controlled selection
problem
$A$
is only one array, but there may be many
possible arrays in
$\mathfrak{B}$
.
Also, only one array
${B}_{k}$
from any solution to
$A$
is randomly chosen by
$p\left({B}_{k}\right)$
as the basis for choosing the stratified
sample. In general then, we might define an optimal solution as that satisfying
the following **requirements** (R1 and
R2):

**R1**.
The solution is obtained based on appropriate and objective measurements of the **closeness** between
$A$
and every single array
${B}_{k}$
in
$\mathfrak{B}$
.

**R2**.
The solution, as much as possible, maximizes the probabilities of selection
over the **arrays nearest*** *to* *
$A$
under such measurements as referenced in R1.

The remainder of this section will address how to
specify R1 and R2 for optimal solutions. First, in order to define **closeness** in R1, a real number
$d({B}_{k}:A)$
representing the distance between
$A$
and
${B}_{k}$
,
can be considered, where
$d$
is a distance function that satisfies the following
axioms:

Axiom (iii) is termed the **triangle inequality axiom**. Distance
functions satisfying (i), (ii), and (iii) can be defined by using the two-ordered
$RC\text{-tuples}$
$\left({a}_{11},{a}_{12},\cdot \cdot \cdot ,{a}_{RC}\right)$
and
$\left({b}_{11k},{b}_{12k},\cdot \cdot \cdot ,{b}_{RCk}\right)$
for
$A$
and
${B}_{k}$
.
We first define the ordinary or **Euclidean
distance*** *(**2-norm distance**):

$${d}_{2}\left({B}_{k},A\right)={\left[{\displaystyle \sum _{i=1}^{R}{\displaystyle \sum _{j=1}^{C}{\left({b}_{ijk}-{a}_{ij}\right)}^{2}}}\right]}^{\frac{1}{2}},\text{}k=1,\dots ,L.\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}(4.1)\text{\hspace{1em}}$$

This function is probably the most familiar measure to define the distance between ${B}_{k}$ and $A$ .

Also, we can define the
function called the **Chebyshev distance **(**infinite
norm distance**):

$${d}_{\infty}\left({B}_{k},A\right)=\mathrm{max}\left\{\left|{b}_{ijk}-{a}_{ij}\right|:i=1,\dots ,R,\text{}j=1,\dots ,C\right\},\text{}k=1,\dots ,L.\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}(4.2)$$

These distance functions give rise to distinct distance spaces. Owing to (3.2), for any ${B}_{k}$ , the following holds.

$$0\le {d}_{2}\left({B}_{k},A\right)<{\left(RC\right)}^{1/2}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}(4.3)$$

and

$$0\le {d}_{\infty}\left({B}_{k},A\right)<1.\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}(4.4)$$

For instance, for the $3\times 3$ array in Problem 2.1 and the $8\times 3$ array in Problem 2.3, $0<{d}_{2}\left({B}_{k},A\right)<3$ and $0<{d}_{2}\left({B}_{k},A\right)<4.9$ , respectively.

Second, as mentioned in
R2, regarding the **arrays nearest** to
$A$
under such measurements described in R1,
consider the set of arrays in
$\mathfrak{B}$
having the minimum
$\text{\hspace{0.17em}}{d}_{2}$
or
${d}_{\infty}$
value from
$A$
. Let
${\mathfrak{B}}_{2}(\subseteq {\mathfrak{B}}^{\prime})$
be the set of the arrays having the minimum
$\text{\hspace{0.17em}}{d}_{2}$
value from
$A$
and
${\mathfrak{B}}_{\infty}(\subseteq {\mathfrak{B}}^{\prime})$
be the set of the arrays having the minimum
${d}_{\infty}$
value from
$A$
.

Assuming that all
possible arrays in
$\mathfrak{B}$
are known, we define **optimum arrays** as follows.

**Definition**. The arrays in
${\mathfrak{B}}_{2}\cup {\mathfrak{B}}_{\infty}$
are called the **optimum arrays**.

Note that in the new algorithm for controlled selection to be described in Section 6, $\text{\hspace{0.17em}}{d}_{2}$ or $\text{\hspace{0.17em}}{d}_{\infty}$ are chosen based on preference. We avoid defining the intersection of ${\mathfrak{B}}_{2}$ and ${\mathfrak{B}}_{\infty}$ as the optimum arrays because this may exclude the other arrays not in ${\mathfrak{B}}_{2}\cap {\mathfrak{B}}_{\infty}$ with the same minimum $\text{\hspace{0.17em}}{d}_{2}$ ( $\text{\hspace{0.17em}}{d}_{\infty}$ ) value. We illustrate below that there may exist a very small number of optimum arrays relative to the number of all possible arrays in $\mathfrak{B}$ for any $A$ . The details of how to find all possible arrays will be described in Section 6 and Section 7.

**Illustrations**.

For Problem 2.1 through Problem 2.4, it is noted that ${\mathfrak{B}}_{2}\subseteq {\mathfrak{B}}_{\infty}$ . Thus, it is possible to use ${d}_{\infty}$ only in illustrating the optimum arrays.

For Problem 2.1, there are six possible arrays satisfying (3.1), (3.2), (3.3), and (3.4). That is, $\mathfrak{B}=\left\{{B}_{k},\text{\hspace{0.05em}}\text{\hspace{0.17em}}k=1,\dots ,6\right\}\text{\hspace{0.05em}}\text{\hspace{0.05em}}$ , as given in Table 4.1. There exists only one optimum array, ${B}_{2}$ , with the minimum value of ${d}_{\infty}=0.5$ .

$A$ | ${B}_{1}$ | ${B}_{2}$ | ${B}_{3}$ | ${B}_{4}$ | ${B}_{5}$ | ${B}_{6}$ |
---|---|---|---|---|---|---|

$$\begin{array}{ccc}0.8& 0.5& 0.7\\ 0.7& 0.8& 0.5\\ 0.5& 0.7& 0.8\end{array}$$ | $$\begin{array}{ccc}0& 1& 1\\ 1& 0& 1\\ 1& 1& 0\end{array}$$ | $$\begin{array}{ccc}1& 0& 1\\ 1& 1& 0\\ 0& 1& 1\end{array}$$ | $$\begin{array}{ccc}1& 1& 0\\ 0& 1& 1\\ 1& 0& 1\end{array}$$ | $$\begin{array}{ccc}0& 1& 1\\ 1& 1& 0\\ 1& 0& 1\end{array}$$ | $$\begin{array}{ccc}1& 0& 1\\ 0& 1& 1\\ 1& 1& 0\end{array}$$ | $$\begin{array}{ccc}1& 1& 0\\ 1& 0& 1\\ 0& 1& 1\end{array}$$ |

${d}_{\infty}$ | 0.8 | 0.5 | 0.7 | 0.8 | 0.8 | 0.8 |

Problem 2.2 has 30 possible arrays, and there are three optimum arrays, shown in Table 4.2

Problem 2.3 has 141 possible arrays. There are six optimum arrays, where each array has the same ${d}_{\infty}=0.6$ . One of them is given in Table 4.3.

There are 159 possible arrays for Problem 2.4, and there is only one optimum array given in Table 4.4.

Accordingly, based on the
definition of optimum arrays as well as
$\text{\hspace{0.17em}}{d}_{2}$
and
${d}_{\infty}$
satisfying the axioms (i), (ii), and (iii), we
suggest the following **specifications** (S1 and S2) of R1 and R2 of optimal solutions:

**S1**.
The solution is based on the values of the distance
$\text{\hspace{0.17em}}{d}_{2}\left({d}_{\infty}\right)$
between
$A$
and every single array
${B}_{k}$
in
$\mathfrak{B}$
.

**S2**.
The solution maximizes the probabilities of selection of optimum arrays.

S1 and S2 will be the rudiments of a new algorithm presented in Section 6, and in the next section we turn into the discussion on the previous algorithms from the viewpoint of optimal solutions.

## Report a problem on this page

Is something not working? Is there information outdated? Can't find what you're looking for?

Please contact us and let us know how we can help you.

- Date modified: