# 8.0 Guidelines for release

Microdata users should apply the rules for assessing data quality, below, to all estimates they produce, and retain only those that satisfy the release criteria. Estimates that do not satisfy the release criteria are not reliable.

## 8.1 Introduction

The guidelines for release and publication use the concept of sampling variability to determine whether estimates obtained from the microdata are reliable. Sampling variability is the error in the estimates that arises because we survey a sample rather than the entire population. The standard error, and the related concepts of coefficient of variation and confidence interval, indicate the magnitude of the sampling variability.

The standard error and coefficient of variation do not measure any systematic biases in the survey data which might affect the estimate. Rather, they are based on the assumption that the sampling errors follow a normal probability distribution. Subject to this assumption, it is possible to estimate the extent to which different samples that have the same design and the same number of observations would give different results. This indicates the margin of error that is likely to be included in the estimates derived from our single sample.
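As an illustration of how these measures relate, the following minimal sketch computes a coefficient of variation and an approximate confidence interval from an estimate and its standard error. The function names and the example figures are hypothetical and do not come from the survey; the interval uses the conventional multiplier of 1.96 for a 95% level, which relies on the normality assumption described above.

```python
def coefficient_of_variation(estimate, standard_error):
    """Return the coefficient of variation as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("The CV is undefined for an estimate of zero")
    return 100.0 * standard_error / abs(estimate)


def confidence_interval(estimate, standard_error, z=1.96):
    """Return an approximate (lower, upper) interval around the estimate.

    Assumes the sampling error is approximately normally distributed;
    z=1.96 corresponds to a 95% confidence level.
    """
    margin = z * standard_error
    return estimate - margin, estimate + margin


# Hypothetical figures, for illustration only.
estimate = 50_000.0
standard_error = 2_500.0
cv_percent = coefficient_of_variation(estimate, standard_error)
lower, upper = confidence_interval(estimate, standard_error)
print(f"CV: {cv_percent:.1f}%, 95% CI: ({lower:.0f}, {upper:.0f})")
```

Neither quantity reflects systematic bias; both describe sampling variability only.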

For a detailed description of the measures of sampling variability, see A. Satin and W. Shastry, Survey Sampling: A Non-Mathematical Guide, Statistics Canada, Catalogue 12-602E.

## 8.2 Minimum sizes of estimates for release

In general, the smaller the sample, the greater the sampling variability. Likewise, estimates of small population subgroups are less reliable than estimates of large population subgroups. The minimum allowable sizes of estimates, also called the release cut-offs, are a quick rule for determining whether an estimate can be released, before applying the more rigorous test that uses the coefficient of variation. The release cut-offs are calculated specifically for the Survey of Financial Security, based on the sample size and the sample design.

The cut-off for the unweighted count must be satisfied:

- Unweighted count: The number of observations must be at least 25. If the unweighted count is less than 25, then the weighted estimate should not be released regardless of the value of its coefficient of variation.
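The unweighted-count rule can be applied mechanically before any estimate is examined further. The sketch below is one possible way to do so, assuming the weights and a domain indicator are held in plain Python lists; the function name and data layout are hypothetical. Passing this check is only the first step, since the coefficient of variation must still be assessed.

```python
MIN_UNWEIGHTED_COUNT = 25  # release cut-off for the unweighted count


def weighted_total_if_releasable(weights, in_domain):
    """Return the weighted total for a subgroup, or None if it is suppressed.

    weights:   survey weight of each observation
    in_domain: True for observations that belong to the subgroup of interest
    """
    selected = [w for w, flag in zip(weights, in_domain) if flag]
    # The cut-off applies to the number of observations, not to their weights.
    if len(selected) < MIN_UNWEIGHTED_COUNT:
        return None
    return sum(selected)


# Hypothetical data: 30 observations, each with a weight of 100.
weights = [100.0] * 30
too_few = [True] * 24 + [False] * 6
enough = [True] * 25 + [False] * 5
print(weighted_total_if_releasable(weights, too_few))
print(weighted_total_if_releasable(weights, enough))
```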

## 8.3 Hypothesis tests provided by statistical software packages

Microdata users should be aware that the results of hypothesis tests (such as the p-values accompanying t statistics or Pearson statistics) that are provided automatically by standard statistical software packages are incorrect for data from surveys with a complex sample design, such as the Survey of Financial Security (SFS). Such packages calculate these test results under the assumption of simple random sampling. That is, they do not take into account the special sample design features of the SFS, such as stratification, clustering, and unequal selection probabilities. While many of the standard packages can account for the unequal selection probabilities when producing estimates by allowing the use of weights, they do not properly take the sample design into account when producing the variance estimates that form part of most test statistics.

To perform hypothesis tests, a two-step method can be employed with standard statistical software to form the test statistics. First, estimate the characteristics of interest using the weights provided on the microdata file. Second, obtain approximate variance estimates of these characteristics by rerunning the same software procedure used to produce the estimates, but with a scaled weight: the original weight divided by the average of the original weights of all the observations used in the computation. The quantities calculated in the two steps can then be combined to form test statistics. Note that this method provides only rough approximations of the standard errors.
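The two-step method can be sketched in code for the simple case of testing a weighted mean against a fixed value. This is an illustration under stated assumptions, not the procedure of any particular package: the variance formula below treats the scaled weights the way a frequency-weighted procedure typically would, and all function names are hypothetical. As the text notes, the resulting standard error is only a rough approximation.

```python
import math


def weighted_mean(values, weights):
    """Weighted mean of the values."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


def scaled_weights(weights):
    """Divide each weight by the average weight, so the scaled weights
    sum to the number of observations rather than to the population size."""
    average = sum(weights) / len(weights)
    return [w / average for w in weights]


def approx_standard_error(values, weights):
    """Step 2: approximate standard error of the mean from scaled weights.

    Assumption: the software treats the weights as frequency weights, so the
    sum of the scaled weights plays the role of the sample size.
    """
    sw = scaled_weights(weights)
    n = sum(sw)
    mean = weighted_mean(values, sw)
    variance = sum(w * (x - mean) ** 2 for x, w in zip(values, sw)) / (n - 1)
    return math.sqrt(variance / n)


def approx_t_statistic(values, weights, null_value):
    """Combine step 1 (estimate with original weights) and step 2."""
    estimate = weighted_mean(values, weights)
    return (estimate - null_value) / approx_standard_error(values, weights)
```

Scaling does not change the point estimate, because multiplying every weight by a constant leaves a weighted mean unchanged. It only prevents the software from treating the sum of the weights as the sample size when it computes the variance.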

Users of the SFS public use microdata file (PUMF) cannot readily obtain better design-based variance estimates by using statistical software specifically designed for survey data. The design information required by such software is not currently available on the SFS data file because of confidentiality considerations. However, Statistics Canada can produce better variance estimates on a cost-recovery basis.