Concepts and Methods Guide
6. Weighting
6.1 Initial weights
In a sample survey each selected person represents not only themselves but also others who were not sampled. Consequently, a weight is associated with each person selected to indicate the number of people they represent. This weight must be used for all estimates. For example, in a simple random sample of 2% of the population, each person represents 50 people in the population. The initial weight is then adjusted for such things as non-response and gaps between the characteristics of the sample and the known totals for the target population (post-stratification). The weighting process for the Aboriginal Peoples Survey (APS) consists of seven steps.
In this chapter, the term “census” will be used to refer to the long-form census questionnaire.
6.2 Adjustment for units not sent to collection
Some sampled units were not sent to collection for various reasons. Among these units were:
- units for which three members of the same household had already been selected;
- units selected for the top-up sample (see subsection 3.2.3) that were in a household in which some members had already been selected (due to operational requirements, individuals in these households who were selected for the top-up sample were not eligible for collection);
- units without a name or date of birth;
- duplicates identified by overlap of names, birthdates and addresses following sample selection; and
- units from the Whapmagoostui Reserve selected by mistake, as mentioned in subsection 3.2.3.
In the first three cases, the weights of the removed units were set to zero and the weights of the remaining units were increased proportionally (ratio adjustment) within each combination of region, Aboriginal group and age group. The weights of duplicates and of units from the Whapmagoostui Reserve were simply set to zero. A total of 405 units were not sent to collection.
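The ratio adjustment described above can be sketched as follows. This is a minimal illustration, not the production procedure: the field names ("region", "group", "age_group", "weight", "removed") and the function name are hypothetical, and the example values are made up.

```python
from collections import defaultdict

def ratio_adjust(units, class_keys=("region", "group", "age_group")):
    """Zero the weights of removed units and scale up the retained units
    so that each adjustment class keeps its original weight total.
    All field names here are illustrative assumptions."""
    total = defaultdict(float)  # weight of all units, per class
    kept = defaultdict(float)   # weight of retained units, per class
    for u in units:
        cls = tuple(u[k] for k in class_keys)
        total[cls] += u["weight"]
        if not u["removed"]:
            kept[cls] += u["weight"]

    adjusted = []
    for u in units:
        cls = tuple(u[k] for k in class_keys)
        if u["removed"]:
            new_weight = 0.0
        else:
            # Ratio factor: class total divided by retained total.
            new_weight = u["weight"] * total[cls] / kept[cls]
        adjusted.append({**u, "weight": new_weight})
    return adjusted

# Made-up example: three units in one class, one not sent to collection.
example_units = [
    {"region": "ON", "group": "Métis", "age_group": "15-24", "weight": 50.0, "removed": False},
    {"region": "ON", "group": "Métis", "age_group": "15-24", "weight": 50.0, "removed": False},
    {"region": "ON", "group": "Métis", "age_group": "15-24", "weight": 50.0, "removed": True},
]
example_adjusted = ratio_adjust(example_units)
```

In this example the two retained units absorb the weight of the removed unit, so the class total is unchanged.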
6.3 Non-response adjustment
Two adjustments were made for two types of non-response: selected people with whom no contact was made (non-contact: 3,994 people) and people contacted who did not (or could not) provide information about themselves (non-response with contact: 5,874 people). The second type of non-response is mainly associated with refusals or “disguised” refusals. An example of a disguised refusal might be a person contacted several times who continually postpones the interview. Two adjustments were made since the characteristics of the people who could not be reached are often different from those who refused to respond when contacted.
Weights were first adjusted for non-contact cases then for non-response with contact. In the rest of the document, the term “non-response” will be used for both types of non-response.
Each non-response adjustment was done in three steps. In the first step, a logistic regression model was used to predict the response probability (probability of getting a response) for each unit (both responding and non-responding units) from a series of explanatory variables. These explanatory variables are divided into two groups. The first group consists of the “person” or “household” characteristics from the 2016 Census for the person selected (e.g., Aboriginal group of the person selected, number of people in the household of the person selected, etc.). The second group of explanatory variables consists of collection variables called “paradata”. The number of attempts made to contact a person or whether tracing was required are examples of paradata. The paradata were found to be good predictors of response or non-response since many of these variables measure the effort to contact a person or to obtain a response from a contacted person. For example, individuals for whom many contact attempts were required to establish initial contact can be considered to be very similar to individuals for whom no contact was made despite numerous attempts.
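As an illustration of the first step, the sketch below fits a logistic regression by plain gradient ascent and returns a predicted response probability. It is only a toy under stated assumptions: the single explanatory variable (number of contact attempts), the data, the learning rate and the function names are all hypothetical, and an actual survey would use a statistical package and many more variables.

```python
import math

def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Fit an intercept and coefficients by batch gradient ascent
    on the log-likelihood (illustrative, not an optimized fitter)."""
    beta = [0.0] * (len(X[0]) + 1)
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for xi, yi in zip(X, y):
            row = [1.0] + list(xi)  # leading 1.0 is the intercept term
            p = _sigmoid(sum(b * v for b, v in zip(beta, row)))
            for j, v in enumerate(row):
                grad[j] += (yi - p) * v
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta

def predict_response(beta, xi):
    """Predicted response probability for one unit."""
    row = [1.0] + list(xi)
    return _sigmoid(sum(b * v for b, v in zip(beta, row)))

# Made-up paradata: number of contact attempts, and whether the unit responded.
attempts = [[1], [1], [2], [2], [3], [3], [4], [4]]
responded = [1, 1, 1, 0, 1, 0, 0, 0]
beta_hat = fit_logistic(attempts, responded)
```

With these made-up data, units requiring more attempts receive a lower predicted response probability.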
In the second step, respondents and non-respondents with similar response probabilities were grouped into adjustment classes using cluster analysis. A simulation was carried out to determine the approximately optimal number of classes and the minimum number of respondents per class. A response rate was then calculated for each class from its respondents and non-respondents, weighted using the weights from the previous adjustment step.
In the third step, the weights of the responding units within each class were adjusted using the inverse of the weighted response rate in that class. The weights of the non-responding units were set to zero.
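The second and third steps can be sketched as below. Note one deliberate substitution: the text forms classes by cluster analysis, whereas this sketch simply sorts units by predicted probability and cuts them into equal-size classes, which is a simpler stand-in. The field names ("weight", "p_hat", "responded") and the example values are hypothetical.

```python
def nonresponse_adjust(units, n_classes=2):
    """Return adjusted weights, in input order.
    Classes are equal-size groups of units sorted by predicted response
    probability (a stand-in for the cluster analysis in the text)."""
    order = sorted(range(len(units)), key=lambda i: units[i]["p_hat"])
    size = -(-len(units) // n_classes)  # ceiling division
    new_weights = [0.0] * len(units)    # non-respondents stay at zero
    for start in range(0, len(order), size):
        members = order[start:start + size]
        w_all = sum(units[i]["weight"] for i in members)
        w_resp = sum(units[i]["weight"] for i in members if units[i]["responded"])
        if w_resp == 0:
            raise ValueError("class has no respondents; merge it with a neighbour")
        rate = w_resp / w_all  # weighted response rate of the class
        for i in members:
            if units[i]["responded"]:
                # Respondent weight times the inverse of the weighted rate.
                new_weights[i] = units[i]["weight"] / rate
    return new_weights

# Made-up example: four units, two classes.
nr_units = [
    {"weight": 10.0, "p_hat": 0.2, "responded": False},
    {"weight": 30.0, "p_hat": 0.3, "responded": True},
    {"weight": 20.0, "p_hat": 0.8, "responded": True},
    {"weight": 20.0, "p_hat": 0.9, "responded": True},
]
nr_weights = nonresponse_adjust(nr_units)
```

In the low-probability class the weighted response rate is 30/40 = 0.75, so the respondent's weight rises from 30 to 40; the other class has a rate of 1 and its weights are unchanged.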
It is important to note that at this stage, all units considered to be out-of-scope were classified as respondents, because all the information required to determine that they were out-of-scope was obtained from these individuals. The weights of these units were set to zero in the second post-stratification (see section 6.5) and these units were removed from the analytical file. Retaining them until that step makes it possible to internally produce weighted estimates of different groups of units outside the target population. This will be very useful, for example, in estimating certain parameters at the time of the next survey.
6.4 Adjustment for partial respondents
Partial respondents are individuals who reported Aboriginal identity in the APS but who did not provide sufficient information to meet the definition of respondent, as defined in chapter 5. There were 362 partial respondents, which should have little impact on the estimates.
A ratio adjustment was carried out by region, Aboriginal group and age group, as measured in the 2016 Census. Only the weights of respondents with Aboriginal identity were increased to reflect the removal of partial respondents (the weights of out-of-scope units, including non-Aboriginal individuals on the APS, were not adjusted), knowing that these partial respondents had reported Aboriginal identity. The weights of partial respondents were then set to zero.
6.5 Post-stratification
Post-stratification ensures that the sum of the adjusted weights for the responding units corresponds to the census estimates, according to different groups called post-strata.
Two separate post-stratifications were carried out for the APS. The first one aimed to adjust the weights to the Aboriginal identity or ancestry population estimated by the census, by post-stratum, using the identity and ancestry variables from the RDB (see subsection 3.1.3) at the time of sample selection (and not the variables measured by the APS, which were the subject of the second post-stratification). The post-strata were defined from certain combinations of region, Aboriginal type (identity or ancestry only), Aboriginal group (Status First Nations, Non-status First Nations, Métis, Inuit, other) and age group. The distinction between Status and Non-status First Nations was used only for the provinces between Ontario and British Columbia. The census estimates on which the APS weights were adjusted correspond exactly to the APS coverage, i.e., the Aboriginal identity or ancestry-only population aged 15 and over as of January 15, 2017, excluding people living on reserve and some First Nations communities in the territories.
The weights were adjusted according to the ratio of the weighted census estimate to the weighted APS sample estimate for each post-stratum. This ensures that the sample did not underrepresent or overrepresent certain combinations of Aboriginal groups, regions and age groups from the census.
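A minimal sketch of this adjustment, assuming each unit carries a hypothetical "post_stratum" label and that the census estimates are supplied as a dictionary of control totals (all names and values are illustrative):

```python
from collections import defaultdict

def post_stratify(units, control_totals):
    """Scale weights so that each post-stratum sums to its control total.
    Returns the new weights in input order."""
    sample_totals = defaultdict(float)
    for u in units:
        sample_totals[u["post_stratum"]] += u["weight"]
    return [
        # Adjustment factor: census estimate / weighted sample estimate.
        u["weight"] * control_totals[u["post_stratum"]] / sample_totals[u["post_stratum"]]
        for u in units
    ]

# Made-up example: post-stratum A is underrepresented, B overrepresented.
ps_units = [
    {"post_stratum": "A", "weight": 40.0},
    {"post_stratum": "A", "weight": 60.0},
    {"post_stratum": "B", "weight": 25.0},
]
ps_weights = post_stratify(ps_units, {"A": 120.0, "B": 20.0})
```

Here the weights in post-stratum A are multiplied by 120/100 = 1.2 and the weight in B by 20/25 = 0.8.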
Given that the responses to the questions defining the Aboriginal identity population (see subsection 3.1.1) could differ between the APS and the census, a second post-stratification was carried out. It should be noted that the APS questions defining the Aboriginal identity population were slightly different from those asked in the census (see Table 1 in section 2 and subsection 3.1.1). The second post-stratification ensured that the Aboriginal identity population estimated from the APS questions corresponded to the Aboriginal identity population defined according to the census within each post-stratum. Unlike the first post-stratification, the second one was not a classical post-stratification, in which weights are readjusted to take account of underrepresentation or overrepresentation of certain groups in the sample. Instead, because the answers to the questions on Aboriginal identity in the APS may have differed from those obtained in the census for a variety of reasons (see section 8.1), this post-stratification ensured that the APS Aboriginal identity population counts were the same as those obtained from the census. After this step, only respondents with an Aboriginal identity according to the APS had positive weights. The weights of out-of-scope units had been set to zero.
The second post-strata were formed from specific combinations of region, Aboriginal identity group (Status First Nations, Non-status First Nations, Métis, Inuit) and age group. Since it was impossible to preserve the multiple identity counts between the APS and the census estimates (counts too small or discrepancies too large), individuals who reported an identity of First Nations and Métis, First Nations and Inuit, or First Nations, Métis and Inuit were combined with individuals who reported a First Nations identity during the second post-stratification. Individuals who reported a Métis and Inuit identity were combined with Métis. Moreover, unlike the 2012 APS, the 2017 APS did not impute respondents from the group “Status Indian or member of a First Nation/Indian band only” as being part of the First Nations group. Individuals reporting being a Status Indian or member of a First Nation/Indian band but not self-reporting as Aboriginal were kept as a separate fourth identity group (Aboriginal responses not included elsewhere), ensuring consistency with census concepts. However, during the second post-stratification, individuals in this group were combined with the First Nations group: there were few units in the “Aboriginal responses not included elsewhere” group, and including them in the First Nations group improves comparability between the 2017 APS estimates and the 2012 APS estimates.
6.6 Adjustment for extreme weights – Sigma gap method
Once the above weight adjustments were completed, some weights could have very large values compared to others. This could have created problems during estimation if the units with very large weights also had very different characteristics from the units with smaller weights. A method referred to as the “sigma gap” method was used to detect extreme weights within each post-stratum, the post-strata being closely linked to the survey’s domains of estimation (see subsection 3.2.1). Bernier and Nobrega (1998) describe one application of the sigma gap method. The method used here detected extreme weights by sorting the weights in descending order and calculating the difference between each pair of successive weights. This difference was compared to n times the standard deviation of the weights within the post-stratum. If the difference exceeded that threshold, the larger of the two weights was declared extreme. Once a weight was declared extreme, all larger weights in its post-stratum were automatically considered extreme as well. Those weights were then truncated to the value of the first weight that was not extreme, and the mass of the truncated weights was redistributed within the post-stratum using a ratio adjustment.
After examining a number of scenarios, a value of 2.5 was selected for n. This value identified an acceptable number of extreme weights: most weights that would have been intuitively identified as extreme in a manual review were identified by the sigma gap method with n = 2.5, and only a small number of weights per post-stratum were identified as extreme, leaving the vast majority of the weights calculated in the previous steps unchanged.
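The sigma gap procedure can be sketched for a single post-stratum as follows. Several details are assumptions, because the text does not specify them: the population standard deviation is used, every gap in the sorted list is checked (so the lowest qualifying gap sets the cut-off), and the ratio adjustment that redistributes the truncated mass is applied to all units in the post-stratum. The function name and example values are hypothetical.

```python
import statistics

def sigma_gap_trim(weights, n=2.5):
    """Truncate extreme weights in one post-stratum, then rescale all
    weights so the post-stratum total is preserved (assumed details)."""
    if len(weights) < 2:
        return list(weights)
    threshold = n * statistics.pstdev(weights)  # assumption: population SD
    desc = sorted(weights, reverse=True)
    cutoff = None  # value of the first weight that is not extreme
    for larger, smaller in zip(desc, desc[1:]):
        if larger - smaller > threshold:
            # 'larger' and everything above it is extreme.
            cutoff = smaller
    if cutoff is None:
        return list(weights)  # no gap exceeded the threshold
    trimmed = [min(w, cutoff) for w in weights]
    factor = sum(weights) / sum(trimmed)  # ratio adjustment
    return [w * factor for w in trimmed]

# Made-up example: nine weights of 10 and one weight of 100.
sg_weights = sigma_gap_trim([10.0] * 9 + [100.0])
```

In this example the standard deviation is 27, so the threshold is 67.5; the gap of 90 between 100 and 10 exceeds it, the weight of 100 is truncated to 10, and the ratio adjustment then brings every weight to 19 so that the total of 190 is preserved.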
6.7 Addition of units not enrolled under the Nunavut Agreement
In the last weighting step, a total of 274 responding units were added to the APS sample. These units were selected for the APS - Nunavut Inuit Supplement and were ultimately determined to be out-of-scope for that survey because they were not enrolled under the Nunavut Agreement. Consequently, these units were excluded from the APS - Nunavut Inuit Supplement sample. Nonetheless, as they had completed the APS questionnaire, it would have been unfortunate to lose their information. Therefore, these 274 units were added to the APS sample with a weight of one. The APS sample units in Nunavut that were not enrolled under the Nunavut Agreement were then reweighted within the second post-stratification adjustment groups in order to maintain the previously achieved control totals.