Archived Content

Information identified as archived is provided for reference, research or recordkeeping purposes.

Administrative data use

Scope and purpose

Administrative records are data collected for the purpose of carrying out various non-statistical programs. For example, administrative records are maintained to regulate the flow of goods and people across borders, to respond to the legal requirements of registering particular events such as births and deaths, and to administer benefits such as pensions or obligations like taxation. As such, the records are collected with a specific decision-taking purpose in mind, and so the identity of the unit corresponding to a given record is crucial. In contrast, in the case of statistical records, on the basis of which no action concerning an individual is intended or even allowed, the identity of individuals is of no interest once the database has been finalized.

Administrative records present a number of advantages to a statistical agency and to analysts. Demands for statistics on all aspects of our lives, our society and our economy continue to grow. These demands often occur in a climate of tight budgetary constraints. Statistical agencies also share with many respondents a growing concern over the mounting burden of response to surveys. Respondents may also react negatively if they feel they have already provided similar information (e.g., revenue) to administrative programs and surveys. Administrative records, because they already exist, do not require the cost of direct data collection nor do they impose a further burden on respondents. It is important to note that the explosion of technology has also permitted statistical agencies to overcome the limitations caused by the processing of large datasets. For all these reasons, administrative records are becoming increasingly usable and are being used for statistical purposes.

Statistical uses of administrative records include (i) use for survey frames, directly as the frame or to supplement an existing frame, (ii) replacement of data collection (e.g., use of taxation data for small businesses in lieu of seeking survey data for them), (iii) use in editing and imputation, (iv) direct tabulation, (v) indirect use in estimation (e.g., as auxiliary information in calibration estimation, benchmarking or calendarisation), and (vi) survey evaluation, including data confrontation (e.g., comparison of survey estimates with estimates from a related administrative program).
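As an illustration of use (v), the simplest form of calibration is a ratio adjustment: survey weights are scaled so that the weighted total of an auxiliary variable matches a total known from the administrative source. The sketch below is a minimal example under assumed inputs; the function name `ratio_calibrate`, the sample values and the administrative total are all hypothetical and are not taken from any Statistics Canada system.

```python
def ratio_calibrate(weights, aux_values, admin_total):
    """Scale design weights so that the weighted total of the auxiliary
    variable equals the total known from the administrative source."""
    weighted_total = sum(w * x for w, x in zip(weights, aux_values))
    if weighted_total == 0:
        raise ValueError("weighted auxiliary total is zero; cannot calibrate")
    factor = admin_total / weighted_total
    return [w * factor for w in weights]


# Hypothetical survey sample: design weights and the revenue found on the
# matching tax record for each sampled unit (illustrative values only).
design_weights = [10.0, 10.0, 20.0]
tax_revenue = [100.0, 200.0, 50.0]

# Hypothetical population total of revenue from the full tax file.
known_tax_total = 4400.0

calibrated_weights = ratio_calibrate(design_weights, tax_revenue, known_tax_total)

# After calibration, the weighted revenue total reproduces the known total
# (up to floating-point rounding).
calibrated_total = sum(w * x for w, x in zip(calibrated_weights, tax_revenue))
```

Calibration methods used in production generally handle several auxiliary variables at once; the single-variable ratio adjustment is shown only because it makes the principle easy to see.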


It is Statistics Canada's policy to use administrative records whenever they present a cost-effective alternative to direct data collection. As with any data acquisition program, consideration of the use of administrative records for statistical purposes is a matter of balancing the costs and benefits. Administrative records start with a huge advantage: they avoid further data collection costs and respondent burden, provided the coverage and the conceptual framework of the administrative data are compatible with the target population. Depending on the use, it is often valuable to combine an administrative source with another source of information.

The use of administrative records may raise concerns about the privacy of the information in the public domain. These concerns are even more important when the administrative records are linked to other sources of data. The Policy on Informing Survey Respondents (Statistics Canada, 1998a) requires that Statistics Canada provide all respondents with information such as the purpose of the survey, the confidentiality protection, the record linkage plans and the identity of the parties to any agreements to share the information provided by those respondents. Record linkage must be in compliance with the Agency's Policy on Record Linkage (Statistics Canada, 1996a). In particular, all requests for record linkage must be submitted to the Confidentiality and Legislation Committee and approved by the Policy Committee.

The use of administrative data may require the statistical agency to implement only some, usually just a few, of the survey steps discussed in previous sections. This is because many of the survey steps (e.g., direct collection and data capture) are performed by the administrative organization. As a result, guidelines beyond those previously presented are needed to suggest ways to compensate for any differences in the quality goals of the source organization (e.g., to compensate for the outgoing quality from the data capture, which is often uncontrolled).

One must keep in mind the fundamental reason for the existence of these administrative records: they are the result of an administrative program that was put in place for administrative reasons. Often the statistical uses of these records were unknown when the program was implemented, and the statistical agency invariably has limited influence on the development of the program. For that reason, any decision related to the use of administrative records must be preceded by an assessment of such records in terms of their coverage, content, concepts and definitions, the quality assurance and control procedures put in place by the administrative program to ensure their quality, the frequency of the data, the timeliness with which the statistical agency receives the data, and the stability of the program over time. Obviously, the cost of obtaining the administrative records is also a key factor in the decision whether to use such records.

  • Many of the guidelines in earlier sections are applicable to administrative records. Sampling and data capture guidelines (see sections on Sampling and on Data collection and capture operations) will be relevant if administrative records exist only on paper and have to be coded and captured. These guidelines will also be of value for administrative data available in electronic form, including EDI and EDR. Note that these data, because they exist in electronic form, may be inherently less stable and subject to additional errors arising from data treatment and transmission processes at source. Editing and dissemination guidelines (see sections on Editing and on Data dissemination) apply to all cases where a file of individual administrative records is obtained or created for subsequent processing and analysis.

  • Consider privacy implications of the publication of information from administrative records. Although the Statistics Act provides Statistics Canada with the authority to access administrative records for statistical purposes, this use may not have been foreseen by the original suppliers of information (Statistics Canada, 1970). Therefore, programs should be prepared to explain and justify the public value and innocuous nature of this secondary use.

  • Collaborate with the designers of new or redesigned administrative systems. This can help in building statistical requirements into administrative systems from the start. Such opportunities are rare, but when they happen, the eventual statistical value of the statistical agency’s participation can far exceed the time expended on the exercise.

  • Maintain continuing liaison with the provider of administrative records. Liaison with the provider is necessary at the beginning of the use of administrative records. However, it is even more important to keep in close contact with the supplier at all times so that the statistical agency is not surprised by any impending changes, and can even influence them. Feedback to the supplier of statistical information and of weaknesses found in the data can be of value to the supplier, leading to a strengthening of the administrative source.

  • Understand the context under which the administrative organization created the administrative program (e.g., legislation, objectives, and needs). It has a profound impact on (i) the universe covered, (ii) the contents, (iii) the concepts and definitions used, (iv) the frequency and timeliness, (v) the quality of the recorded information, and (vi) the stability over time.

  • Study each data item in the administrative records that is planned to be used for statistical purposes. Investigate its quality. Understand the concepts, definitions and procedures underlying its collection and processing by the administrative organization. Some of the items might be of very poor quality and thus might not be fit for use. For example, the quality of classification coding (e.g., occupation, industrial activity, geography) might not be sufficient for some statistical uses or might limit its use.

  • Like data collected by means of a survey, administrative data are also subject to partial and total nonresponse. In some instances, the lack of timeliness in obtaining all administrative data introduces greater nonresponse. Some guidelines provided in the section on Response and nonresponse will thus apply. Unless nonrespondents can be followed up and responses obtained, develop an imputation or a weight-adjustment procedure to deal with this nonresponse (see sections on Imputation and on Estimation). Administrative sources are sometimes outdated. Therefore, as part of the imputation process, give special attention to the identification of active and/or inactive units. Some imputation or transformation may also be required in cases where some of the units report the data at a different frequency (e.g., weekly or quarterly) than the one desired (e.g., monthly).

  • Keep in mind that if the information that individuals or businesses provide to the administrative source can cause them gains or losses, there may be biases in the information supplied. Special studies may be needed in order to assess and understand these sources of error.

  • Document the nature and quality of the administrative data once assessed. Documentation helps statisticians decide the uses to which the administrative data are best suited. Choose appropriate methodologies for the statistical program based on administrative data and inform users of the methodology and data quality.

  • Keep in mind that the longevity of the source of administrative data and its continued scope are usually entirely in the hands of the administrative organization. The administrative considerations that originally dictated the concepts, definitions, coverage, frequency, timeliness and other attributes of the administrative program may, over time, undergo changes that distort time series derived from the administrative source. Be aware of such changes, and deal with their impact on the statistical program.

  • Implement continuous or periodic assessment of incoming data quality. Assurance that data quality is being maintained is important because the statistical agency does not control the data collection process. This assessment may consist of implementing additional safeguards and controls (e.g., the use of statistical quality control methods and procedures, edit rules) when receiving the data, comparisons with other sources or sample follow-up studies.

  • When record linkage of administrative records is necessary (e.g., for tracing respondents, for supplementing survey data, or for data analysis), conform to the Agency's Policy on Record Linkage. Privacy concerns that may arise when a single administrative record source is used are multiplied when linkage is made to other sources. In such cases, the subjects may not be aware that information supplied on two separate occasions is being combined. The Policy on Record Linkage is designed to ensure that the public value of each record linkage truly outweighs any intrusion on privacy that it represents.

  • It is not always easy to combine an administrative source with another source of information. This is especially true when a common matching key for both sources is not available and record linkage techniques are used. In this case, select the type of linkage methodology (i.e., exact matching or statistical matching) in accordance with the objectives of the statistical program. When the purpose is frame creation and maintenance, edit and imputation or weighting, exact matching is appropriate. When the sources are linked for performing some data analyses that are impossible otherwise, consider statistical matching, i.e., matching of records with similar statistical properties (see Cox and Boruch, 1988; Scheuren and Winkler, 1993; Kovacevic, 1999).

  • When record linkage is to be performed, make appropriate use of existing software. Statistics Canada’s Generalized Record Linkage Software is but one example of a number of well-documented packages.

  • When data from more than one administrative source are combined, pay additional attention to reconcile potential differences in their concepts, definitions, reference dates, coverage, and the data quality standards applied at each data source. Examples are education data sources, health and crime reports, and registries of births, marriages, licenses, and registered vehicles, which are provided by various organizations and government agencies.

  • Some administrative data are longitudinal in nature (e.g., income tax, goods and services tax). When records from different reference periods are linked, they are very rich data mines for researchers. Remain especially vigilant when creating such longitudinal and person-oriented databases, as their use raises very serious privacy concerns. Use the identifier with care, as a unit may change identifiers over time. Track down such changes to ensure proper temporal data analysis. In some instances the same unit may have two or more identifiers for the same reference period, thus introducing duplication in the administrative file. If this occurs, develop an unduplication mechanism.

  • Administrative information is sometimes used to replace a set of questions that would otherwise be asked of the respondent. In this instance, permission from the respondent may have to be obtained. Follow the Policy on Informing Survey Respondents in this regard. When consent is not obtained, put collection procedures in place for the equivalent survey questions to be asked of the respondents.

  • Administrative files are often very large and their use can sometimes lead to significant processing costs and timeliness issues. Depending on the need, make use of a random sample from large administrative files to reduce costs.
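One way to draw such a random sample without loading the whole file into memory is reservoir sampling, which selects a simple random sample of fixed size in a single pass. The sketch below is illustrative only: the function name `reservoir_sample` and its parameters are hypothetical, and the records are assumed to arrive as any Python iterable (for example, lines read from a file).

```python
import random


def reservoir_sample(records, k, seed=None):
    """Draw a simple random sample of k records, without replacement,
    from an iterable whose length is not known in advance.

    Returns the sample and the number of records read, so that a design
    weight of n_seen / k can be attached to each sampled record.
    """
    rng = random.Random(seed)
    sample = []
    n_seen = 0
    for i, record in enumerate(records):
        n_seen = i + 1
        if i < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Replace a reservoir slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint includes both endpoints
            if j < k:
                sample[j] = record
    return sample, n_seen


# Hypothetical usage: sample 100 records from a stream of 10,000 record ids.
sample, n_seen = reservoir_sample(range(10_000), k=100, seed=1)
design_weight = n_seen / len(sample)
```

If the file holds fewer than k records, the function simply returns all of them. Estimates produced from the sample then carry sampling error, which should be weighed against the savings in processing cost.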


