Development of a small area estimation system at Statistics Canada

Section 2. Core notation and background


We first introduce the notation needed to define the various small area estimators included in the production system. Let $U$ denote a population of size $N.$ This population is partitioned into $M$ mutually exclusive and exhaustive areas, where each area $U_{i} \subset U, i = 1, \dots, M,$ has $N_{i}$ observations. A sample, $s,$ of size $n$ is drawn from the population using a well-defined probability mechanism $p (s),$ and the resulting sample is split into areas $s_{i} = s \cap U_{i}, i = 1, \dots, M.$ Note that, for some of the areas, the realized sample size $n_{i}$ may be zero. The set of $m$ $(m \leq M)$ areas where $n_{i}$ is strictly greater than 0 will be denoted as $A.$ The set of the remaining areas, where $n_{i}$ is equal to 0, will be denoted as $\bar{A}.$
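As a minimal sketch of this notation, the following Python snippet splits a sample into areas and identifies the sets $A$ and $\bar{A}.$ The function name `split_areas`, the toy population and the selected sample are hypothetical and serve only as an illustration.

```python
from collections import Counter


def split_areas(area_of_unit, sample, all_areas):
    """Return the realized sample sizes n_i, the sampled areas A
    (n_i > 0) and the unsampled areas A_bar (n_i = 0)."""
    counts = Counter(area_of_unit[j] for j in sample)
    n = {i: counts.get(i, 0) for i in all_areas}
    A = [i for i in all_areas if n[i] > 0]
    A_bar = [i for i in all_areas if n[i] == 0]
    return n, A, A_bar


# Hypothetical toy population: N = 8 units partitioned into M = 4 areas.
area_of_unit = {1: "a", 2: "a", 3: "b", 4: "b", 5: "b", 6: "c", 7: "d", 8: "d"}
all_areas = ["a", "b", "c", "d"]

# Hypothetical realized sample of size n = 4 (unit labels).
sample = [2, 3, 5, 8]

n, A, A_bar = split_areas(area_of_unit, sample, all_areas)
print(n)          # realized sample size n_i for every area
print(A, A_bar)   # areas with and without sampled units
```

In this toy example area `c` receives no sampled unit, so it belongs to $\bar{A}$ and $m = 3 < M = 4.$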

Let $π_{j} = \sum_{\{s : j \in s\}} p (s), j \in U,$ be the inclusion probabilities, where $\sum_{\{s : j \in s\}}$ denotes summation over all samples $s$ containing unit $j.$ We denote the sampling weight for unit $j$ as $d_{j},$ where $d_{j} = π_{j}^{- 1}.$ The final weight associated with unit $j$ will be denoted as $w_{j}.$ This weight will normally be the product of the original design weight $d_{j}$ and an adjustment factor that reflects the incorporation of available auxiliary data (via regression or calibration), as well as non-response adjustments. Note that the auxiliary data used in the adjustment factor are not necessarily the same as those used for small area estimation.
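The definition of $π_{j}$ can be checked by brute force on a very small design. The sketch below assumes simple random sampling without replacement of 2 units out of 4, so that every possible sample has the same probability. The adjustment factors `g` are made-up values standing in for a non-response or calibration adjustment; they are not taken from the production system.

```python
from fractions import Fraction
from itertools import combinations


def inclusion_probabilities(design):
    """Sum p(s) over all samples s that contain unit j, for every unit j."""
    pi = {}
    for s, p in design.items():
        for j in s:
            pi[j] = pi.get(j, Fraction(0)) + p
    return pi


# Assumed toy design: simple random sampling without replacement,
# n = 2 units out of N = 4, so each of the 6 possible samples has p(s) = 1/6.
units = [1, 2, 3, 4]
samples = [frozenset(c) for c in combinations(units, 2)]
design = {s: Fraction(1, len(samples)) for s in samples}

pi = inclusion_probabilities(design)   # inclusion probabilities pi_j
d = {j: 1 / pi[j] for j in units}      # design weights d_j = 1 / pi_j

# Hypothetical adjustment factors (illustrative values only).
g = {1: Fraction(5, 4), 2: Fraction(1), 3: Fraction(1), 4: Fraction(3, 4)}
w = {j: d[j] * g[j] for j in units}    # final weights w_j = d_j * g_j

print(pi)
print(d)
print(w)
```

Exact fractions are used so that the enumerated probabilities can be compared with the known value $n / N$ for this design without rounding error.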

The objective of a small area estimation system is to estimate a population parameter $θ_{i}$ (e.g., a mean or a total) for each area $i$ for a given variable of interest $y$ when some area sample sizes $n_{i}$ are too small to use direct estimation procedures. A direct estimator of $θ_{i}$ is one that uses values of the variable of interest, $y,$ strictly from the sample units in area $i.$ However, a major disadvantage of such estimators is that unacceptably large standard errors may result; this is especially true if the area sample size is small. Small area procedures use indirect estimators that borrow strength across areas by using models that link all areas through some common parameters. Indirect estimators will be efficient (i.e., increase the effective sample size and thus decrease the standard error) if the model holds for each area. Departures from the model will result in reduced accuracy. A wide variety of indirect estimators is available; a good summary is provided in Rao and Molina (2015).
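To make the idea of a direct estimator concrete, the sketch below computes a weighted total and a weighted (ratio-type) mean for each area using only the sampled units of that area. The text above does not prescribe this particular form; it is assumed here for illustration, and the function name `direct_estimates` and the data are hypothetical.

```python
def direct_estimates(records):
    """records: iterable of (area, weight, y) for sampled units.
    Returns {area: (estimated total, estimated mean)} using only the
    sampled units that fall in each area."""
    sums = {}
    for area, w, y in records:
        wy, ww = sums.get(area, (0.0, 0.0))
        sums[area] = (wy + w * y, ww + w)
    # Total: sum of w_j * y_j; mean: that total divided by the sum of w_j.
    return {a: (wy, wy / ww) for a, (wy, ww) in sums.items()}


# Hypothetical sampled units: (area, final weight w_j, value y_j).
records = [
    ("a", 10.0, 3.0),
    ("a", 10.0, 5.0),
    ("b", 20.0, 7.0),   # area "b" has a single sampled unit
]

est = direct_estimates(records)
for area, (total_hat, mean_hat) in est.items():
    print(area, total_hat, mean_hat)
```

Area `b` illustrates the problem described above: its estimate rests on a single observation, so it is highly variable, and an area with no sampled unit would receive no direct estimate at all.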

Small area estimators are classified as area level or unit level depending on the level at which the modeling is performed. Area level small area estimators are based on models linking a given parameter of interest to area-specific auxiliary variables. Unit level small area estimators are based on models linking the variable of interest to unit-specific auxiliary variables. Area level small area estimators are computed if the unit level data are not available. They can also be computed if the unit level data are available by aggregating them to the appropriate area level. This might be useful in practice because the area level small area estimators may be less prone to outliers than their unit level counterparts.
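The aggregation step mentioned above can be sketched as follows: unit level records are collapsed into one record per area, which could then serve as input to an area level model. Unweighted sample means are used purely for simplicity; the function name `aggregate_to_area`, the auxiliary variable `x` and the data are hypothetical.

```python
def aggregate_to_area(unit_records):
    """unit_records: iterable of (area, y, x) for sampled units.
    Returns {area: (n_i, mean of y, mean of x)}."""
    acc = {}
    for area, y, x in unit_records:
        n, sy, sx = acc.get(area, (0, 0.0, 0.0))
        acc[area] = (n + 1, sy + y, sx + x)
    return {a: (n, sy / n, sx / n) for a, (n, sy, sx) in acc.items()}


# Hypothetical unit level data: (area, variable of interest y, auxiliary x).
unit_records = [
    ("a", 2.0, 10.0),
    ("a", 4.0, 30.0),
    ("b", 6.0, 50.0),
]

area_data = aggregate_to_area(unit_records)
print(area_data)   # one record per area: (n_i, ybar_i, xbar_i)
```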

ISSN : 1492-0921

Editorial policy

Survey Methodology publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves. All papers will be refereed. However, the authors retain full responsibility for the contents of their papers and opinions expressed are not necessarily those of the Editorial Board or of Statistics Canada.

Submission of Manuscripts

Survey Methodology is published twice a year in electronic format. Authors are invited to submit their articles in English or French in electronic form, preferably in Word to the Editor, (statcan.smj-rte.statcan@canada.ca, Statistics Canada, 150 Tunney’s Pasture Driveway, Ottawa, Ontario, Canada, K1A 0T6). For formatting instructions, please see the guidelines provided in the journal and on the web site (www.statcan.gc.ca/SurveyMethodology).

Note of appreciation

Canada owes the success of its statistical system to a long-standing partnership between Statistics Canada, the citizens of Canada, its businesses, governments and other institutions. Accurate and timely statistical information could not be produced without their continued co-operation and goodwill.

Standards of service to the public

Statistics Canada is committed to serving its clients in a prompt, reliable and courteous manner. To this end, the Agency has developed standards of service which its employees observe in serving its clients.

Copyright

Published by authority of the Minister responsible for Statistics Canada.

Use of this publication is governed by the Statistics Canada Open Licence Agreement.

Catalogue No. 12-001-X

Frequency: Semi-annual

Ottawa

Date modified: 2019-05-07
