Statistics Canada
Private and Public Investment in Canada, Intentions

2008

61-205-X




Methodology

Introduction

The Capital Expenditures Survey (CES) produces data on investment made in Canada, in all types of Canadian industries. These data are gathered twice a year, at two very specific times. This permits follow-up on intentions and achievements in terms of investment, on an annual basis. A single sample is used to collect data for three different fiscal years. An initial questionnaire is mailed to sample units in March of fiscal year Y. It collects actual data for fiscal year Y-1, which has just ended. A second questionnaire is then mailed to the same units in October of fiscal year Y. That questionnaire collects preliminary actual data for fiscal year Y, which will end in a few months, and intentions data for fiscal year Y+1. The sample is selected in November of fiscal year Y-1.

Just as one sample is used to collect data for three different fiscal years, one fiscal year is covered by three different samples. One sample produces intentions data for fiscal year Y. One year later, a second sample produces preliminary actual data for fiscal year Y. One year further on, a third sample produces actual data for fiscal year Y.

In February of year Y, Investment and Capital Stock Division (ICSD) publishes the results of the Survey on Actual Data (SA) for fiscal year Y-2, the Survey on Preliminary Actual Data (SPA) for fiscal year Y-1, and the Survey on Intentions (SI) for fiscal year Y.
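The collection and publication cycle described above can be summarized in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical and are not part of any Statistics Canada system.

```python
def reference_years_published(publication_year):
    """Reference year covered by each survey released in February of year Y."""
    return {
        "SA": publication_year - 2,   # Survey on Actual Data
        "SPA": publication_year - 1,  # Survey on Preliminary Actual Data
        "SI": publication_year,       # Survey on Intentions
    }


def reference_years_collected(mailing_year):
    """Reference years collected from one sample, mailed in March and October of year Y."""
    return {
        "March": {"actual": mailing_year - 1},
        "October": {
            "preliminary actual": mailing_year,
            "intentions": mailing_year + 1,
        },
    }
```

For example, `reference_years_published(2008)` maps the February 2008 release to actual data for 2006, preliminary actual data for 2007 and intentions for 2008.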

In the SI and SPA surveys, the variables of interest are capital expenditures on new construction (CC) and capital expenditures on new machinery and new equipment (CM). In the SA survey, we add repair expenditures on construction (RC) as well as repair expenditures on machinery and equipment (RM). In addition, the SA survey produces more detailed estimates for new capital. In fact, capital expenditures by type of assets are also available in the publication catalogue no. 61-223-X Capital Expenditures by Type of Asset.

Methodology by industrial sector

As in any survey covering several industrial sectors, the CES methodology differs from one sector to another, and the full details cannot be covered in one section. Under the North American Industry Classification System (NAICS), the methodology is divided among the industrial sectors as follows:

Sector 11, sub-sectors 111 and 112 (Crop and Animal Production Industries):

  1. The survey is conducted by Agriculture Division (AD) which adds investment questions to some of their surveys of farmers. The data are processed by AD and the estimates are re-integrated into the bi-annual publication. Refer to "Non-surveyed data" in "Data quality, concepts and methodology — Sources" for more details.

Sector 11, sub-sector 114 (Fishing, Hunting and Trapping Industry) and sector 23 (Construction Industry):

  1. There is no survey. The data published are based on economic indicators. For more details, refer to "Non-surveyed data" in "Data quality, concepts and methodology — Sources".

Sector 91, sub-sector 913 (Local Governments):

  1. The survey is conducted by Public Institutions Division (PID) which uses this opportunity to request the distribution of investment expenditures by function for their own publication "Public Sector Finance". The data, however, are processed by ICSD and usually are in the same format as most of the data gathered by ICSD. For more details on the sampling methodology, see Pandher (1995). It should be noted that in the case of Quebec, a special arrangement provides investment values for the province.

Sector 21, sub-sectors 211 (Crude Petroleum and Natural Gas) and 212 (Mining), and sector 91, sub-sectors 911, 912 and 914 (Federal Government, Provincial and Territorial Governments and Aboriginal Government):

  1. A model-based sampling methodology has been retained for these industries. With only a few exceptions, the subsequent treatment is the same as for the rest of the sample. For more details, see Lacroix (1991).

Sector 21 Canadian industry 213119 (Other support activities for mining), sector 55 Canadian industry 551114 (Head-office), and sector 81, sub-sector 814 (Private households):

  1. There are no surveys and no estimates for these sectors.

Other industrial sectors:

  1. The methodology used will be described in this section, in particular a model-assisted estimation method.

The next sections therefore deal primarily with the methodology used for sampling, calendarization, imputation and estimation in these other sectors. For the industrial sectors listed before that last point, the methodology is described in the reference documents cited.

Survey frame

The frame consists primarily of the Business Register (BR) developed by Statistics Canada. Business Register Division (BRD) is responsible for maintenance and updating of the register. The register is used by a large number of surveys that in turn provide it with feedback to ensure that the latest changes in the business world are incorporated into the BR as quickly as possible.

The BR contains the units required to establish our final survey frame. They are arranged hierarchically as follows: Enterprise - Company - Establishment - Location. An enterprise may comprise several companies, each of which may have several establishments that in turn may operate in several locations. This so-called “statistical” structure is in fact a model of the operational structure described by the enterprise itself. Based on the information available for each level of the operational structure, we define the corresponding statistical structure. For example, to be considered an establishment, a respondent must be able to supply the BR with the wages and rates of pay, income and major inputs in the operational process.

For units that are part of the non-integrated portion (NIP) of the BR, the statistical structure is linear: an enterprise is related to a single company, a single establishment and a single location. In the integrated portion (IP), the structure may be linear but usually is more complex. Figure 1 illustrates both structures.

Figure 1
Statistical structures

The sampling unit selected for the Capital Expenditures Survey is the establishment, which best corresponds to the gathering and disclosure of investment data. For more details on the BR, refer to Cuthill (1996).

When the sample is drawn in November, a new “image” is taken from the BR. With the new Unified Enterprise Survey, the BR has improved its coverage, so the “image” is now more complete and up to date. Since the Capital Expenditures Survey is part of the unified survey, it uses this new image for sampling.

Since the questionnaires are mailed out in the following March and October, and given the dynamic nature of businesses, we can be certain that new projects will start up after the sample is selected. To be sure that major investments are not “overlooked”, units are added to the sample even after the first mailing when the project is deemed important enough. These “new projects”, as they are called, are identified from newspapers, company reports or lists of building permits. They are sampled with certainty and allow us to avoid gross under-estimation of the value of investment in their industries.

It should be noted that certain units that we want to have in the sample, such as new projects, have incomplete information. Income, which is known for all other units on the frame, may be unknown for these units. Since income is used in a range of processes (imputation, estimation, etc.), these units are grouped together to be dealt with separately during data processing.

Grouping

Before sampling begins, all private-sector units outside the mining and manufacturing industries are grouped using the following method. All establishments operating in the same province, in the same six-digit industry code and under the same enterprise are grouped into a single super-establishment. The income of the super-establishment is the sum of the income of the establishments that comprise it, while the remaining information is taken from the head of the group: the head office where possible, or otherwise the establishment with the highest income. For the public sector, all the units are in the sample.
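The grouping rule above can be sketched as follows. The record layout (keys such as `naics6`, `enterprise` and `is_head_office`) is a hypothetical one chosen for illustration; the actual Business Register fields are not described in this section.

```python
from collections import defaultdict


def build_super_establishments(establishments):
    """Group establishments by (province, NAICS-6, enterprise).

    Each input record is a dict with the hypothetical keys
    id, province, naics6, enterprise, income and is_head_office.
    """
    groups = defaultdict(list)
    for est in establishments:
        key = (est["province"], est["naics6"], est["enterprise"])
        groups[key].append(est)

    supers = {}
    for key, members in groups.items():
        # Head of the group: the head office where possible,
        # otherwise the establishment with the highest income.
        head = next((m for m in members if m["is_head_office"]), None)
        if head is None:
            head = max(members, key=lambda m: m["income"])
        supers[key] = {
            "income": sum(m["income"] for m in members),
            "head_id": head["id"],
            "member_ids": [m["id"] for m in members],
        }
    return supers
```

Keying the result by the grouping triple makes it easy to check that each super-establishment carries the summed income of its members.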

Once the new universe is constructed with the new super-establishments, all units with income of less than a certain limit are eliminated from the frame unless they constitute head offices or laboratories, in which case the units are chosen with certainty. This procedure is instituted to avoid “losing” these units, which generate practically no income, but might account for substantial investment.

The limit below which units are not surveyed is determined as a function of province and industry. It varies from $100,000 to $850,000 depending on the size of the units within the industry and province grouping. The limit is calculated in such a way that a maximum of 10% of the total revenue in the group is excluded from sampling. This reduces the response burden for small units, in keeping with the bureau's guidelines. The non-covered portion is estimated using administrative data when they are available (refer to "Estimation" for more details).
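One simple way to find such a limit is sketched below. It assumes that units with income strictly below the limit are excluded and that the limit is chosen among the observed income values; the function name and these tie-breaking details are assumptions, not the production rule.

```python
def exclusion_limit(incomes, max_share=0.10):
    """Largest observed income L such that the units with income
    strictly below L hold at most max_share of the group's total income."""
    total = sum(incomes)
    ordered = sorted(incomes)
    limit = ordered[0]  # excluding nothing always satisfies the rule
    below = 0           # running income of units strictly below the candidate
    i = 0
    for candidate in sorted(set(incomes)):
        while i < len(ordered) and ordered[i] < candidate:
            below += ordered[i]
            i += 1
        if below <= max_share * total:
            limit = candidate
        else:
            break
    return limit
```

With incomes of 50,000, 60,000, 80,000, 400,000 and 1,410,000, the three smallest units hold 190,000 of a total of 2,000,000 (9.5%), so the sketch returns a limit of 400,000.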

When all groups have been assembled and the small units have been eliminated, the survey population is ready for stratification.

Sampling

The sampling is divided into the three traditional parts: stratification, allocation and selection. These are described in the following text.

Stratification

The sample is first stratified by geographic location, by industrial classification and, to meet new needs, by country of control. The geographic division is based on the 13 provinces and territories, with no other refinement (no infra-provincial stratification). Twelve countries of control were considered in the stratification this year: Canada, USA, Germany, Japan, France, Great Britain, Sweden, Italy, Netherlands, China, Hong Kong and Australia. The remaining countries were grouped together. For the industrial stratification, the 1997 NAICS is used at the level required for estimation purposes. If, for example, the most disaggregated level published for a certain industry corresponds to the 3-digit NAICS, this will be the stratification level. It should be noted that for the remainder of the section, the 6-digit NAICS will be abbreviated as NAICS-6, the 5-digit NAICS as NAICS-5, and so forth.

Text table 1 shows, by industry, the most disaggregated possible publication levels for provincial and Canadian estimates.

 
Text table 1: Most disaggregated publication levels

| Industry sector | NAICS sector code | NAICS publication level (digits) |
|---|---|---|
| Agriculture, forestry, fishing and hunting | 11 | 3 |
| Mining and oil and gas extraction | 21 | 3 to 6 |
| Utilities | 22 | 4 |
| Manufacturing (NAICS-3 316 and 323) | 31-33 | 3 and 4 |
| Wholesale trade | 41 | 3 |
| Retail trade | 44-45 | 3 |
| Transportation and warehousing | 48-49 | 3 |
| Information and cultural industries | 51 | 3 |
| Finance and insurance | 52 | 3 |
| Real estate and rental and leasing | 53 | 4 |
| Professional, scientific and technical services | 54 | 4 |
| Management of companies and enterprises | 55 | 2 |
| Administrative and support, waste management and remediation services | 56 | 3 |
| Education services | 61 | 4 |
| Health care and social assistance | 62 | 3 |
| Arts, entertainment and recreation | 71 | 3 |
| Accommodations and food services | 72 | 3 |
| Other services | 81 | 3 |
| Public administration | 91 | 3 |

All provincial publication levels are at the sector level except for the Manufacturing industry where it is at the NAICS-3 level for four provinces: Québec, Ontario, Alberta and British Columbia.

Allocation

Once the initial stratification is in place, we use the revenue variable to compute the coefficient of variation (CV) to be targeted in each initial stratum, so that the CV set for the most disaggregated publication level is reached (see "Estimation" for more information on the CV). In our case, publication is by province and by the industrial classification levels defined previously. An example helps to clarify the situation.

Assume that we want to publish estimates for sector 72 (Accommodations and Food Services), which corresponds to NAICS-3 at the Canada level and the whole industry at the Province / Territory level. We then construct text table 2, in which the number of provinces has been reduced to 3 and the number of NAICS-3 for the industry as a whole is 2, specifically the sub-sectors (SS) 721 and 722.

 
Text table 2: Cross publication for sector 72

| | Province 1 | Province 2 | Province 3 | CV |
|---|---|---|---|---|
| SS721 | ... | ... | ... | 15% |
| SS722 | ... | ... | ... | 15% |
| CV | 15% | 15% | 15% | ... |

The initial stratification corresponds to each cell in text table 2 and the marginals correspond to the estimates we wish to publish. If, for example, we wish to publish estimates with a target CV of 15%, we must first compute the CV to be targeted for each cell, so that the marginal CVs are met.

Before we can compute the CV required at the cell level to reach the CV set for the marginals, we must adjust the marginal CVs. We cannot obtain 15% CVs in both directions, because when we set the variance in one direction to obtain the targeted CV, we automatically set the variance (and thus the CV) for the other direction and are “subject to” the resulting CV. Knowing that the CVs in both directions cannot simultaneously equal the targeted CV (unless by chance), we have chosen to minimize the distance from the marginal CVs to the target CV. In one direction we then obtain a resulting CV greater than the target CV, and in the other a CV less than the target CV. This is done by minimizing the distance between the resulting CVs and the target CV under the constraint that the variances must be the same in both directions. In mathematical terms:

Figure 2 [equation image not reproduced]

where CV_A and CV_B represent the CVs attainable in the two directions, CV_c represents the target CV, and V_A and V_B represent the variances in the two directions.
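The adjustment can be illustrated with a short Python sketch. Since the original equation is not reproduced here, the sketch rests on assumptions of our own: the distance is taken to be the sum of squared differences, every marginal in a given direction receives the same CV, and the marginal estimates within a direction are independent, so that V_A is the sum of (CV_A × Y)² over the marginal totals Y in direction A, and likewise for V_B. The function name is hypothetical.

```python
from math import sqrt


def new_target_cvs(totals_a, totals_b, target_cv):
    """Marginal CVs closest to target_cv (squared distance) subject to V_A = V_B.

    totals_a, totals_b: marginal revenue totals in directions A and B.
    """
    sum_sq_a = sum(y * y for y in totals_a)
    sum_sq_b = sum(y * y for y in totals_b)
    # V_A = V_B  =>  CV_B = k * CV_A
    k = sqrt(sum_sq_a / sum_sq_b)
    # Minimize (CV_A - c)^2 + (k * CV_A - c)^2 over CV_A.
    cv_a = target_cv * (1 + k) / (1 + k * k)
    return cv_a, k * cv_a
```

Under these assumptions one direction always ends up above the target and the other below it, which is the pattern described in the text.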

Let us call the resulting CV the new target CV. In the preceding example, we could end up with new target CVs as in text table 3.

 
Text table 3: New target CVs (closest to the targeted CV)

| | Province 1 | Province 2 | Province 3 | CV |
|---|---|---|---|---|
| SS721 | ... | ... | ... | 11% |
| SS722 | ... | ... | ... | 11% |
| CV | 18% | 18% | 18% | ... |

To reach the new target CV, we must compute what the targeted CVs should be for each of the initial strata by using a raking ratio algorithm as described in Latouche (1988).

Using the letters A and B again to designate the two directions (A the geographic direction and B the industrial direction, for example), we recompute the cell CVs until the combination of the CVs on the same line or in the same column is close enough to the target CV for the corresponding marginal.

Figure 3 [equation image not reproduced]

where:

  r denotes the current iteration,
  r-1 denotes the preceding iteration,
  i. denotes the marginal in direction A,
  .j denotes the marginal in direction B,
  ij denotes a crossover of directions A and B and
  Y corresponds to the total for the income variable for a given group.

The algorithm stops when the convergence criterion (0.1%) is met or after a maximum of 10 iterations. It should be noted here that the algorithm converges very quickly and is almost certain to reach the targeted CV for the marginals. Text table 4 illustrates the result of the iterative procedure.
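The iterative procedure can be sketched as a classical raking (iterative proportional fitting) of cell variances. This is an illustration under our own assumptions, not the formula of Latouche (1988): cells are treated as independent, so a marginal variance is the sum of its cell variances; the starting cell variances are all equal; every cell total is positive; and the two sets of marginal targets imply the same overall variance. The function name is hypothetical.

```python
from math import sqrt


def rake_cell_cvs(cell_totals, row_cvs, col_cvs, tol=0.001, max_iter=10):
    """Return cell CVs whose row and column combinations match the marginal CVs.

    cell_totals[i][j] is the revenue total Y_ij of cell (i, j).
    """
    n_rows, n_cols = len(cell_totals), len(cell_totals[0])
    row_totals = [sum(row) for row in cell_totals]
    col_totals = [sum(cell_totals[i][j] for i in range(n_rows)) for j in range(n_cols)]
    # Target variances of the marginal totals: (CV * Y)^2.
    row_var = [(cv * y) ** 2 for cv, y in zip(row_cvs, row_totals)]
    col_var = [(cv * y) ** 2 for cv, y in zip(col_cvs, col_totals)]

    var = [[1.0] * n_cols for _ in range(n_rows)]  # assumed starting values
    for _ in range(max_iter):
        for i in range(n_rows):  # fit the rows
            factor = row_var[i] / sum(var[i])
            var[i] = [v * factor for v in var[i]]
        for j in range(n_cols):  # fit the columns
            factor = col_var[j] / sum(var[i][j] for i in range(n_rows))
            for i in range(n_rows):
                var[i][j] *= factor
        worst = max(abs(sum(var[i]) / row_var[i] - 1.0) for i in range(n_rows))
        if worst <= tol:  # convergence criterion of 0.1%
            break
    return [[sqrt(var[i][j]) / cell_totals[i][j] for j in range(n_cols)]
            for i in range(n_rows)]
```

The default `tol` and `max_iter` mirror the 0.1% criterion and the limit of 10 iterations quoted above.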

 
Text table 4: Cell CVs after iteration

| | Province 1 | Province 2 | Province 3 | CV |
|---|---|---|---|---|
| SS721 | 20% | 23% | 24% | 11% |
| SS722 | 17% | 20% | 21% | 11% |
| CV | 18% | 18% | 18% | ... |

Now that the CV is set for each of the initial strata (these correspond to the cells in the preceding table), we can split each of them into two strata: large, in which units are sampled with certainty, and small, in which sampling is conducted under a probability scheme so that the new target CV can be attained. The preferred method for splitting cells in two is that advanced by Hidiroglou (1986), which has the merit of minimizing the sample size while attaining the target CV. The technique is simple: start with the equation that gives the CV for the initial stratum

Figure 4 [equation image not reproduced]

It can be rewritten to isolate n(t), the total number of units to be sampled based on t, the number of units sampled with certainty:

Figure 5 [equation image not reproduced]

We must then find the minimum of this function. This is done through an iterative process that, once it has converged, yields two results: the dividing value that separates the initial stratum into two final strata, and the sample size for each of these strata. There will be t units in the take-all stratum and n(t) - t units to be selected in the take-some stratum. This process takes the minimum number of units needed to attain the target CV.
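The search for the minimum can be sketched in Python. The sketch is not the iterative algorithm of Hidiroglou (1986): it assumes simple random sampling without replacement in the take-some stratum, for which the variance of the estimated total is (N - t)² (1/(n - t) - 1/(N - t)) S², solves this for n(t) given the allowed variance (CV × Y)², and simply tries every possible t. The function name and the rule of keeping at least one take-some unit are assumptions.

```python
from math import ceil


def take_all_split(values, target_cv):
    """Return (t, n_total, cutoff): the number of take-all units, the total
    sample size and the smallest value in the take-all stratum."""
    ys = sorted(values, reverse=True)
    big_n = len(ys)
    allowed_var = (target_cv * sum(ys)) ** 2  # (CV * Y)^2
    best_t, best_n = big_n, big_n             # a census always meets the target
    for t in range(big_n):
        rest = ys[t:]                         # candidate take-some stratum
        m = len(rest)
        mean = sum(rest) / m
        s2 = sum((y - mean) ** 2 for y in rest) / (m - 1) if m > 1 else 0.0
        denom = allowed_var + m * s2
        n_some = m * m * s2 / denom if denom > 0 else 0.0
        n_total = t + max(1, ceil(n_some))    # keep at least one take-some unit
        if n_total < best_n:
            best_t, best_n = t, n_total
    cutoff = ys[best_t - 1] if best_t > 0 else None
    return best_t, best_n, cutoff
```

For one unit with revenue 1,000 and nine units with revenue 10, a target CV of 5% is met by taking the large unit with certainty and a single small unit, whereas sampling without a take-all stratum would require all ten units.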

It is highly likely that we will not obtain the precise target CV for the cells. The CV reached is usually close, but for some cells may be as much as 2% below the target CV. The effect of this is a slight change in the CVs targeted for the marginals. Text table 5 reproduces the results from text table 4 following application of Hidiroglou’s algorithm.

 
Text table 5: Final cell CVs after iterations

| | Province 1 | Province 2 | Province 3 | CV |
|---|---|---|---|---|
| SS721 | 20.10% | 22.80% | 24% | 10.80% |
| SS722 | 17.20% | 21.50% | 20.40% | 11.70% |
| CV | 18.10% | 18.90% | 17.80% | ... |

Once this step is complete, we can then proceed with the actual selection of the sample.

Selection

For the take-some strata, selection is based on a simple random process under the constraint of minimizing the overlap with the Unified Enterprise Survey (UES); for more details on this survey, see Simard et al. (2001). A minimum sampling fraction of 1% and a minimum of 3 sampled units per stratum are imposed. In the take-all strata, all units are sampled with certainty.
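The two minimum constraints can be sketched as follows. The sketch uses plain simple random sampling and does not attempt to reproduce the minimization of overlap with the UES; the function names are hypothetical.

```python
import random
from math import ceil


def take_some_sample_size(population_size, required_size,
                          min_fraction=0.01, min_units=3):
    """Apply the 1% minimum sampling fraction and the 3-unit minimum."""
    n = max(required_size, ceil(min_fraction * population_size), min_units)
    return min(n, population_size)  # cannot sample more units than exist


def select_take_some(unit_ids, required_size, seed=None):
    """Simple random sample without replacement from one take-some stratum."""
    rng = random.Random(seed)
    ids = list(unit_ids)
    return rng.sample(ids, take_some_sample_size(len(ids), required_size))
```

Here `required_size` stands for the n(t) - t units obtained at the allocation step.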

Data editing

Once the sample has been selected, a questionnaire is mailed out and respondents are urged to complete and return it. Units that have not responded are subject to mail and telephone follow-up to ensure the data is obtained. A special effort is made for units in the take-all strata.

Once the data have been captured, edits are applied to each establishment. Several consistency rules ensure that if some fields are coded, all related fields are also coded. For example, we check that the sum of the parts equals the whole and that certain cells are properly filled out.

Some edits focus directly on investment data. For example, if historical data are available, some tolerance rules are applied.

When no historical data are available, all respondents reporting investment of $10,000,000 or more are the subject of thorough checks. It should be noted that these rules are subject to change.

Finally, a large number of qualitative (rather than quantitative) editing rules are also in place. For more details on editing rules, see Corneau (1995).

Calendarization

Once the data have been collected and edited, we can proceed with calendarization. This process generates data for the January to December period of the reference year when the respondent has reported data for another period. To reduce the response burden, we accept that the respondent provides data on a fiscal-year basis. For a given reference year, the fiscal period must end between January 1 of that year and March 31 of the following year.

Calendarization prevents the production of estimates linked to many different fiscal periods. The main idea is relatively simple: first “break” the annual data into monthly data, extrapolate if needed, and then sum the monthly values that make up the year of interest to get the respondent's calendarized data.

The method developed by Cholette (1984) is used to “break” the data into monthly portions and extrapolate. The method is similar to a benchmarking technique. We can summarize the algorithm in the following manner:

We are trying to minimize the function

Figure 6 [equation image not reproduced]

in such a way that the sum of the monthly values (xm) over the fiscal period is equal to the respondent’s reported data.

The series of zm correspond to known auxiliary information about the respondent such as its cycle or trend. For the survey, this option is not used and the series is simply a constant value which corresponds to minimizing the month to month change (while the fiscal total is still respected).

The available number of months (T) on which the minimization function is calculated depends on the historical information of the respondent. However, since usually a respondent gets at least two questionnaires covering two distinct calendar years, T should at least be equal to 24. Periods that are not covered by the fiscal data (at the beginning and at the end of the series) are extrapolated using the last (or the first) calculated monthly value. The rest of the process can be applied on both calendar and fiscal data of the respondents.
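The constant-series case can be sketched in pure Python. Because the original criterion is not reproduced here, the sketch assumes that it reduces to the sum of squared month-to-month changes and solves the resulting constrained least-squares problem through its Lagrangian (KKT) linear system. It also assumes consecutive 12-month fiscal periods; the function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def smooth_monthly(fiscal_totals, months=12):
    """Monthly series with minimal squared month-to-month change whose
    sums over consecutive fiscal periods equal the reported totals."""
    k = len(fiscal_totals)
    t_len = k * months
    size = t_len + k
    kkt = [[0.0] * size for _ in range(size)]
    for t in range(t_len - 1):  # gradient of sum (x[t+1] - x[t])^2
        kkt[t][t] += 2.0
        kkt[t + 1][t + 1] += 2.0
        kkt[t][t + 1] -= 2.0
        kkt[t + 1][t] -= 2.0
    for p in range(k):          # one sum constraint per fiscal period
        for t in range(p * months, (p + 1) * months):
            kkt[t_len + p][t] = 1.0
            kkt[t][t_len + p] = 1.0
    rhs = [0.0] * t_len + [float(v) for v in fiscal_totals]
    return solve_linear(kkt, rhs)[:t_len]


def calendar_year_total(monthly, start_month):
    """Sum January to December of the year after the series starts.

    start_month is the calendar month (1 = January) of monthly[0]; months
    outside the series are extrapolated with the first or last value.
    """
    jan = 13 - start_month
    last = len(monthly) - 1
    return sum(monthly[min(max(i, 0), last)] for i in range(jan, jan + 12))
```

For a respondent whose fiscal years run from April to March, `calendar_year_total(smooth_monthly([120.0, 240.0]), 4)` returns a calendar-year value lying between the two fiscal totals.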

Outlier detection

Once the reported data are on a calendar basis, we proceed with the detection of outliers. Detection may be conducted at four levels, beginning at the most disaggregated. If there are not at least 25 units at this level, we proceed to the next level. As many as three variables may be involved in defining these levels: industrial level, size and geographic area.

There are three size categories: take-all stratum with known income, take-all stratum with unknown income, and take-some stratum.

With respect to geographic areas, units are located in large provinces (Que., Ont., Alta. and B.C.), mid-sized provinces (N.S., N.B., Man. and Sask.), or small provinces (P.E.I., Y.T., N.W.T., Nvt. and N.L.).

The four detection levels are:

Level 1: NAICS-3 * Size * Que., Ont., Alta., B.C., small and mid-sized provinces (separated)
Level 2: NAICS-3 * Size * large provinces and small and mid-sized provinces (together)
Level 3: NAICS-3 * Size * Canada
Level 4: Sector * Canada

When publication is at the sector level for an industry, detection begins at the most aggregated level, that is, level 4.

In addition, the outlier detection module is run before and after imputation. After imputation, this is done with the imputed data and permits detection of outliers among the imputed data.

The Hidiroglou-Berthelot (1986) method is used to detect outliers. Establishment “i” is considered an outlier if either of the following two conditions holds:

  1. Yi < M - C*DQ1
  2. Yi > M + C*DQ3

where:

  DQ1 = Max(M - Q1, |A*M|),
  DQ3 = Max(Q3 - M, |A*M|),
  M is the median (the point at which exactly 50% of establishments lie on either side),
  Q1 is the first quartile (25% of establishments are smaller and 75% are larger),
  Q3 is the third quartile (75% of establishments are smaller and 25% are larger),
  A and C take the values of 0.5 and 20 respectively.

Four ratios are used to detect outliers: calendarized CC over revenue, calendarized CM over revenue, CC over revenue and CM over revenue. If an establishment is found to be an outlier for one of these ratios, it is automatically considered an outlier for both investment variables, CC and CM, both calendarized and fiscal. In the case of the SA, the same procedure is carried out for the RC and RM variables as for the CC and CM variables.
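The two conditions can be applied to a list of ratios as in the sketch below. The quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, since the text does not specify one, and the minimum of 25 units per detection group is not enforced here.

```python
from statistics import median, quantiles


def detect_outliers(ratios, a=0.5, c=20.0):
    """Return the positions of the ratios flagged by the two conditions above."""
    m = median(ratios)
    q1, _, q3 = quantiles(ratios, n=4, method="inclusive")
    dq1 = max(m - q1, abs(a * m))
    dq3 = max(q3 - m, abs(a * m))
    lower, upper = m - c * dq1, m + c * dq3
    return [i for i, r in enumerate(ratios) if r < lower or r > upper]
```

In practice the function would be run once for each of the four ratios, and an establishment flagged for any one of them would be treated as an outlier for both CC and CM.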

Imputation

Records found to be outliers are not imputed since the consistency rules have already been applied and the investment reported by the respondent is deemed valid. These records are simply excluded from calculation of the average during imputation of non-respondents. Moreover, if some of the establishments found to be outliers form part of the take-some strata, they are moved up to the take-all strata with known revenues and the selection probability for residual units is recomputed.

For records to be imputed, three imputation methods are used to proceed with evaluation of the missing data. There is no partial imputation: the two variables of interest, CC and CM (RC and RM are added in the case of the SA) are available or missing for each establishment. The three methods therefore allow us to impute all of the variables in parallel. The first method is simply the substitution with the historical value. For the following surveys, we use the historical value as long as that value is available for the same reference year:

$Y_{i,t,s} = Y_{i,t,s-1}$

where i is the establishment, t is the reference year, s is the current survey, s-1 is the most recent preceding survey for which data were reported, and Y is the variable of interest.

For the Survey on Intentions (SI), since it is the first survey for a given reference year and therefore no historical data are available for the same year, we use historical information from the previous year:

$Y_{i,t,s} = Y_{i,t-1,s-1}$

where t-1 is the previous reference year.

We should note that this last imputation is also used for the variables RC and RM since these variables are required only for the Survey on Actual Data, so no historical value is available for the same reference year.

In both cases, the imputation is done (whenever possible) before the calendarization process. Hence data imputed from a period that could be different from the calendar year are calendarized as well.

The second method is used when no historical value is available for a unit. In this case, we impute using the current ratio method:

Figure 7 [equation image not reproduced]

where x is revenue.

The third method is used for units with no historical value and unknown revenue. In this case, we impute using the average of current values:

Figure 8 [equation image not reproduced]

An important factor when computing the imputed value is the level at which imputation is conducted. Imputation is conducted in a given group only if the group includes at least 10 establishments for which the questionnaire is complete and if these represent at least 25% of the units in the group.
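The three methods and the group constraints can be combined in a small sketch. The ratio and mean formulas below are the usual textbook forms (revenue multiplied by the ratio of donor totals, and the plain donor average); they are assumed to match the equations that are not reproduced here. The function name and argument layout are hypothetical, and donors are assumed to exclude outliers.

```python
def impute(historical, revenue, donors, group_size):
    """Return (value, method) for one establishment.

    donors: list of (y, x) pairs for complete respondents in the imputation
    group, where x (revenue) may be None. group_size: units in the group.
    """
    if historical is not None:
        return historical, "historical"
    # Group constraints: at least 10 complete records and 25% of the units.
    if len(donors) < 10 or len(donors) < 0.25 * group_size:
        return None, "move to a more aggregated group"
    pairs = [(y, x) for y, x in donors if x]
    if revenue is not None and pairs:
        ratio = sum(y for y, _ in pairs) / sum(x for _, x in pairs)
        return revenue * ratio, "ratio"
    return sum(y for y, _ in donors) / len(donors), "mean"
```

A `None` value signals that the caller should retry with the next, more aggregated imputation group.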

Imputation groups

The initial imputation group corresponds to the stratum used for sampling once it is updated with the new data gathered. If one of the preceding constraints (10 units, 25% of units) is not met, we move to a more aggregated imputation group within the same industrial group and in the same size group, but in which all provinces are combined. As in outlier detection, the possible sizes are take-all stratum with known income, take-all stratum with unknown income and take-some stratum.

If the constraints still are not met, the industries are grouped. For example, all NAICS-6s from a given NAICS-5 are combined. We remain at the Canada level and within the same size group. The most aggregated level we can reach groups all NAICS-3s in a given sector, at the Canada level, for one size group; at this last level, the take-all strata with known and unknown revenues are combined. Two examples will provide a better understanding.

If an establishment in the Canadian mining industry 212114 in Ontario that is part of the take-some group is to be imputed, we obtain the following sequence:

  1. 212114 - Ontario - take-some stratum
  2. 212114 - Canada - take-some stratum
  3. 21211 - Canada - take-some stratum
  4. 2121 - Canada - take-some stratum
  5. 212 - Canada - take-some stratum
  6. 21 - Mining and Oil and Gas Extraction sector - Canada - take-some stratum

If an establishment in sector 55 (Management of Companies and Enterprises) in Quebec that is part of the take-all group with unknown revenues is to be imputed, we obtain the following sequence:

  1. Sector 55-Quebec-take-all stratum (unknown revenues)
  2. Sector 55-Canada-take-all stratum (unknown revenues)
  3. Sector 55-Canada-take-all stratum (known and unknown revenues)
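The two sequences above can be generated by a short function. This is a sketch with hypothetical names; it treats the first two digits of the NAICS code as the sector, which would need adjusting for sectors published under a range of codes such as 31-33.

```python
def imputation_groups(naics, province, size):
    """Ordered list of (industry, geography, size) imputation groups."""
    groups = [(naics, province, size), (naics, "Canada", size)]
    code = naics
    while len(code) > 2:  # drop one digit at a time down to the sector
        code = code[:-1]
        groups.append((code, "Canada", size))
    if size.startswith("take-all"):
        # Last level: known and unknown revenues are combined.
        groups.append((code, "Canada", "take-all (known and unknown revenues)"))
    return groups
```

Calling `imputation_groups("212114", "Ontario", "take-some")` reproduces the six levels of the first example.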

We should also point out that a record imputed at a disaggregated level can be used to compute the averages during imputation of another record at a more aggregated level. For example, if we manage to impute all records for Alberta at the first imputation level and must move to the next level for records from New Brunswick, these will be imputed at the Canadian level and the imputed Alberta records will be used in computing the averages at the Canadian level.

Once the missing values for establishments are imputed, we can move on to the estimation stage.

Estimation

The ratio estimator is used for estimation with revenue being the auxiliary variable. This method ensures that the final weight multiplied by the income for each unit in the sample matches the known total for the income variable for the entire population in the group. The groups used in this instance correspond to the lowest industry level published within a single size group at the Canadian level. The difference from the original stratum is the grouping at the Canadian level. The following example provides a better understanding.

For an establishment for which the stratum corresponds to NAICS-3 323 of the Manufacturing sector in Nova Scotia for the take-some stratum, we use the estimation group

  1. 323 - Canada - take-some stratum

During the survey, an establishment may be reclassified into a new industry or province. This new classification is used to define the domain of publication and it is this classification that will determine where the investments will appear in the final table. The following example provides a better understanding.

If an establishment sampled in Quebec under NAICS-3 411 is found in Ontario under NAICS-3 444, it will have the following characteristics:

  1. stratum: 411 - Quebec
  2. group for computing outliers: 444 - Ontario
  3. initial imputation group: 444 - Ontario
  4. estimation group: 411 - Canada
  5. domain of publication: 444 - Ontario

Figure 9 [equation image not reproduced]

where:

  x is the auxiliary variable (revenue),
  h denotes the stratum,
  g denotes the estimation group,
  d denotes the domain of publication,
  n denotes the sample size,
  N denotes the population size,
  s denotes the sample,
  P denotes the population,
  w denotes the final weight,
  D denotes the sample weight,
  G denotes the control weight ("G-weight"),
  y is the variable of interest (investment) and
  p denotes the selection probability.

Note that the G-weight calculation is done in such a way that the final weight wi cannot be lower than one. In doing that, we ensure that a respondent’s value will be at least that value once it is weighted.
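A minimal sketch of the ratio estimator for one estimation group follows. It uses the standard forms D = N_h / n_h for the sample weight and G = (population revenue total) / (weighted sample revenue) for the control weight, which are assumed to correspond to the equation that is not reproduced here. It does not implement the adjustment that keeps the final weight at or above one, and the record keys are hypothetical.

```python
def ratio_estimates(sample, population_revenue):
    """Final weights and domain estimates for one estimation group g.

    sample: list of dicts with keys y, x, N_h, n_h and domain.
    population_revenue: known revenue total of the group's population.
    """
    design = [u["N_h"] / u["n_h"] for u in sample]  # D_i = 1 / p_i
    weighted_x = sum(d * u["x"] for d, u in zip(design, sample))
    g_weight = population_revenue / weighted_x      # G for the group
    weights = [d * g_weight for d in design]        # w_i = D_i * G
    estimates = {}
    for w, u in zip(weights, sample):
        estimates[u["domain"]] = estimates.get(u["domain"], 0.0) + w * u["y"]
    return weights, estimates
```

By construction, the final weights multiplied by revenue and summed over the sample reproduce the known revenue total of the group.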

Estimation of variance and calculation of CV

Variance is estimated using Taylor's linearization formula for the ratio estimator, as described in Estevao (1991). Using the same notation as before:

Figure 10 [equation image not reproduced]

Estimation adjustment for the non-surveyed portion

Administrative data are used, when available, for the non-observed portion of the survey.

For the survey on actual data, administrative data from the three previous years is used for creating a model to derive capital expenditures.

For surveys on intentions and preliminary actual data, there is no administrative data covering the reference periods for these surveys. The non-surveyed portion is estimated using the surveyed trend between actual data, intentions and preliminary actual data, which is applied to the estimation of the non-observed portion that has been calculated for the survey on actual data.

On average, estimating the non-observed portion contributes 2% to the total estimation.

Quality indicator

When the estimates are published, a scale distinguishes between levels of accuracy. It combines the effect of sampling (since we did not conduct a census) and the imputation rate (each imputation, other than historical imputation, adds to the uncertainty of the results). The scale is presented in text table 6.

 
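Text table 6: Quality indicator interpretation

| CV | Imputation rate 0.00 to 0.10 | Imputation rate 0.10 to 0.33 | Imputation rate 0.33 to 0.60 | Imputation rate 0.60 and more |
|---|---|---|---|---|
| 0.00 to 0.05 | A | B | C | F |
| 0.05 to 0.10 | B | C | D | F |
| 0.10 to 0.15 | C | D | E | F |
| 0.15 to 0.25 | D | E | F | F |
| 0.25 to 0.50 | E | F | F | F |
| 0.50 and more | F | F | F | F |

Note(s):
A Excellent; B Very good; C Good; D Acceptable; E Use with caution; F Too unreliable to be published.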

Due to some technical considerations, the quality indicator will not be implemented for the present publication.
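For reference, the scale in text table 6 can be expressed as a simple lookup. The table does not say which class a boundary value such as 0.05 belongs to; the sketch assumes that each class includes its lower bound, and the function name is hypothetical.

```python
from bisect import bisect_right

CV_BOUNDS = [0.05, 0.10, 0.15, 0.25, 0.50]  # upper bounds of the CV classes
IMPUTATION_BOUNDS = [0.10, 0.33, 0.60]      # upper bounds of the imputation classes
GRID = ["ABCF", "BCDF", "CDEF", "DEFF", "EFFF", "FFFF"]


def quality_indicator(cv, imputation_rate):
    """Letter from text table 6 (lower bound of each class assumed inclusive)."""
    row = bisect_right(CV_BOUNDS, cv)
    col = bisect_right(IMPUTATION_BOUNDS, imputation_rate)
    return GRID[row][col]
```

For instance, an estimate with a CV of 0.12 and an imputation rate of 0.20 falls in the third row and second column of the table, which gives D.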

Confidentiality

Some confidentiality rules obviously are used to suppress any information that might lead to disclosure of the data supplied by a respondent. These rules allow Statistics Canada to comply with its mandate of non-disclosure of information supplied by respondents. The rules themselves are confidential and are not available for consultation.

Sampling error and non-sampling error

The difference between an estimate based on sample data and the value obtained by surveying the entire population is called the sampling error. This difference varies with sample size, expenditure variability, sampling scheme, and estimation method. In general, the larger a sample, the smaller its sampling error. If the population is very heterogeneous, a larger sample size is required to produce a reliable estimate. The sampling error is measured by a quantity known as the standard deviation. The latter indicates the expected variability of the estimate that will be produced if the expenditures are sampled repeatedly. The actual value of the standard deviation is unknown, but it can be estimated from the sample.

Another measure of precision is the coefficient of variation (CV). The CV is simply the standard deviation expressed as a percentage of the estimate. Hence it is a relative measure of precision and can be used for comparisons across industries or provinces. The smaller the CV, the more reliable the estimate. (See "Data quality, concepts and methodology — Quality measures" section).

Another kind of error is non-sampling error. Although every effort is made to keep such errors to a minimum, they always exist. They are not taken into account in computing the CV, nor are they measured by the CV. Measures such as response rate, coverage rate and imputation rate can be used as indicators of the possible extent of non-sampling errors.