# Unequal probability inverse sampling Section 2. Formalization of the problem

The following notation is used:

• $U:$ a population of $N$ enterprises, i.e., $U=\left\{1,\dots ,i,\dots ,N\right\}$ ($U$ may denote the population of enterprises in an economic region),
• $L:$ the list of occupations,
• $M:$ the number of occupations in the list, i.e., the size of $L,$
• ${F}_{i}:$ the list of occupations in enterprise $i,$ with ${F}_{i}\subset L,$
• ${D}_{i}:$ the list of occupations absent from enterprise $i,$ with ${D}_{i}\subset L,$ ${F}_{i}\cup {D}_{i}=L$ and ${D}_{i}\cap {F}_{i}=\varnothing ,$
• $M{p}_{i}:$ the number of occupations in enterprise $i,$ i.e., the size of ${F}_{i},$
• $r:$ the number of distinct occupations to be obtained in each enterprise,
• ${X}_{i}:$ the number of failures before the $r$ occupations are obtained in enterprise $i,$ when the occupations are selected using a given design.

The main objective is to estimate the average wage for an occupation in the total population. Let ${y}_{ik}$ be the average wage for occupation $k$ in enterprise $i,$ and let ${z}_{ik}$ be the number of employees with occupation $k$ in enterprise $i.$ The objective is to estimate the average wage for occupation $k$ given by

${\overline{Y}}_{k}=\frac{\sum _{i\in U\,|\,{F}_{i}\ni k}{z}_{ik}{y}_{ik}}{\sum _{i\in U\,|\,{F}_{i}\ni k}{z}_{ik}}.$
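As a minimal sketch of this definition, the following Python function computes ${\overline{Y}}_{k}$ from complete population data. The function name, the dictionary layout and the toy figures are assumptions made for illustration only; they do not come from the paper.

```python
def population_mean_wage(k, enterprises):
    """Employee-weighted average wage for occupation k.

    enterprises maps an enterprise label i to a dict that maps each
    occupation present in i (the set F_i) to a pair (z_ik, y_ik).
    This layout is an illustrative assumption.
    """
    numerator = 0.0
    denominator = 0.0
    for occupations in enterprises.values():
        if k in occupations:  # only enterprises with k in F_i contribute
            z_ik, y_ik = occupations[k]
            numerator += z_ik * y_ik
            denominator += z_ik
    if denominator == 0:
        raise ValueError("occupation k is absent from every enterprise")
    return numerator / denominator


# Hypothetical toy population of three enterprises.
toy_population = {
    1: {"welder": (10, 50.0), "clerk": (5, 40.0)},
    2: {"welder": (30, 60.0)},
    3: {"clerk": (20, 45.0)},
}
welder_mean = population_mean_wage("welder", toy_population)
clerk_mean = population_mean_wage("clerk", toy_population)
```

In the toy data, the welder mean is $(10\times 50+30\times 60)/40=57.5,$ since enterprise 3 has no welders and is excluded from both sums.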

Assume that a sample of enterprises ${S}_{1}$ is selected from $U$ using some given design with inclusion probabilities ${\pi }_{1i}.$ In enterprise $i,$ a sample of occupations ${S}_{i}$ is selected using one of the designs described above with inclusion probability ${\pi }_{k\text{\hspace{0.17em}}|\text{\hspace{0.17em}}i}.$ If the design is with replacement, ${\pi }_{k\text{\hspace{0.17em}}|\text{\hspace{0.17em}}i}$ represents the expected number of times that occupation $k$ is selected in enterprise $i.$

${\overline{Y}}_{k}$ can be estimated using a “ratio” type estimator (Hájek 1971):

${\widehat{\overline{Y}}}_{k}=\frac{\sum _{i\in {S}_{1}\,|\,\left({S}_{i}\cap {F}_{i}\right)\ni k}\frac{{z}_{ik}{y}_{ik}}{{\pi }_{1i}{\pi }_{k\,|\,i}}}{\sum _{i\in {S}_{1}\,|\,\left({S}_{i}\cap {F}_{i}\right)\ni k}\frac{{z}_{ik}}{{\pi }_{1i}{\pi }_{k\,|\,i}}}.$
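The ratio estimator can be sketched in Python as follows, assuming for the moment that both inclusion probabilities are known. The record layout, the function name and the numerical values are hypothetical and serve only to show how the weights $1/\left({\pi }_{1i}{\pi }_{k\,|\,i}\right)$ enter the numerator and the denominator.

```python
def hajek_mean_wage(k, sample):
    """Ratio-type (Hajek) estimate of the average wage for occupation k.

    sample is a list with one record per enterprise i in S_1. Each record
    holds 'pi1' (pi_{1i}) and 'occupations', a dict mapping each occupation
    in S_i intersected with F_i to a triple (z_ik, y_ik, pi_{k|i}).
    This layout is an illustrative assumption.
    """
    numerator = 0.0
    denominator = 0.0
    for record in sample:
        if k in record["occupations"]:
            z_ik, y_ik, pi_k_given_i = record["occupations"][k]
            weight = 1.0 / (record["pi1"] * pi_k_given_i)
            numerator += weight * z_ik * y_ik
            denominator += weight * z_ik
    if denominator == 0:
        raise ValueError("occupation k was not observed in the sample")
    return numerator / denominator


# Hypothetical two-enterprise sample with made-up probabilities.
toy_sample = [
    {"pi1": 0.5, "occupations": {"welder": (10, 50.0, 0.4)}},
    {"pi1": 0.25, "occupations": {"welder": (30, 60.0, 0.5)}},
]
welder_estimate = hajek_mean_wage("welder", toy_sample)
```

Here the two weights are $1/0.2=5$ and $1/0.125=8,$ so the estimate equals $16900/290.$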

Computing this estimator therefore requires the probability that an occupation is selected in an enterprise. With an inverse-type design, however, this probability is unknown and must be estimated in order to estimate ${\overline{Y}}_{k}.$ Since the inclusion probabilities appear in the denominator, it is preferable to estimate the inverses of ${\pi }_{k\text{\hspace{0.17em}}|\text{\hspace{0.17em}}i}.$ In an enterprise, an occupation’s probability of being selected decreases as the number of occupations in that enterprise increases. In addition, the probability depends on the inverse sampling design used in each enterprise.
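The dependence on the number of occupations can be illustrated with a small Monte Carlo sketch. It assumes one particular design that may differ from those considered in the paper: occupations are drawn from $L$ one at a time, with equal probabilities and without replacement, until $r$ occupations present in the enterprise are obtained, and a failure is taken to mean drawing an occupation from ${D}_{i}.$ All function names and parameter values are hypothetical.

```python
import random


def inverse_sample(occupation_list, present, r, rng):
    """One inverse-sampling run under the assumed equal-probability design.

    Returns the r selected occupations present in the enterprise and the
    number of failures X_i (draws that fall in D_i) before the r-th success.
    """
    if r > len(present):
        raise ValueError("r exceeds the number of occupations present")
    order = list(occupation_list)
    rng.shuffle(order)  # a random order equals sequential draws without replacement
    selected, failures = [], 0
    for occupation in order:
        if occupation in present:
            selected.append(occupation)
            if len(selected) == r:
                break
        else:
            failures += 1
    return selected, failures


def estimate_inclusion_probability(k, occupation_list, present, r,
                                   n_rep=20000, seed=0):
    """Monte Carlo estimate of pi_{k|i} for an occupation k in F_i."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        selected, _failures = inverse_sample(occupation_list, present, r, rng)
        hits += k in selected
    return hits / n_rep


L = list(range(20))             # M = 20 occupations in the list
few_present = set(range(4))     # enterprise with 4 occupations
many_present = set(range(10))   # enterprise with 10 occupations
p_few = estimate_inclusion_probability(0, L, few_present, r=2)
p_many = estimate_inclusion_probability(0, L, many_present, r=2)
```

Under this assumed design, symmetry gives an inclusion probability of $r$ divided by the number of occupations present, so the two estimates should be close to $2/4$ and $2/10.$ Other inverse designs, in particular unequal probability ones, would give different values.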
