Appendix 2: Glossary of Terms
The descendants of the original inhabitants of North America. The Canadian Constitution recognizes three groups of Aboriginal people – First Nations (or North American Indian people, consisting of Status and non-Status Indians), Métis and Inuit. These are three separate peoples with unique heritages, languages, cultural practices and spiritual beliefs.
A Statistics Canada microdata set for a given survey, available for use in Research Data Centres (RDCs) across Canada. RDCs provide researchers with access, in a secure university setting, to microdata from population and household surveys. The centres are staffed by Statistics Canada employees. They are operated under the provisions of the Statistics Act in accordance with all the confidentiality rules and are accessible only to researchers with approved projects who have been sworn in under the Statistics Act as 'deemed employees.'
The bootstrap method is an approach for estimating error in a dataset related to sampling. Sampling introduces error because data are not taken from the entire population, but only a sub-section, called a sample, which is then used to make estimates for the whole population. There are several methods for estimating the level of sampling error. The bootstrap method usually selects a number of subsamples from the main sample and produces estimates for each subsample. The sampling error is estimated as a function of the observed differences between estimates from the different subsamples.
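The resampling idea described above can be sketched in a few lines of Python. This is a minimal illustration of a simple bootstrap with replacement, not the procedure Statistics Canada uses for the APS (which is not described here); the function name `bootstrap_standard_error`, the number of replicates and the sample values are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_standard_error(sample, estimator, n_replicates=500, seed=0):
    """Approximate the sampling error of `estimator` by resampling `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    replicate_estimates = []
    for _ in range(n_replicates):
        # Draw a subsample of the same size as the sample, with replacement.
        subsample = [sample[rng.randrange(n)] for _ in range(n)]
        replicate_estimates.append(estimator(subsample))
    # The spread of the subsample estimates approximates the sampling error.
    return statistics.stdev(replicate_estimates)


# Hypothetical sample of ages (illustrative values only).
ages = [23, 35, 41, 29, 52, 38, 47, 31, 26, 44]
se_of_mean = bootstrap_standard_error(ages, statistics.mean)
```

The more the subsample estimates differ from one another, the larger the estimated sampling error.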
Census metropolitan area (CMA) and Census agglomeration (CA)
Area consisting of one or more neighbouring municipalities situated around a major urban core. A census metropolitan area must have a total population of at least 100,000 of which 50,000 or more live in the urban core. A census agglomeration must have an urban core population of at least 10,000.
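The two population thresholds above can be expressed as a small rule. The sketch below encodes only those thresholds; the function name `classify_area` is hypothetical, and any other delineation criteria that may apply in practice are deliberately left out.

```python
def classify_area(total_population, urban_core_population):
    """Apply only the population thresholds quoted in the definition."""
    if total_population >= 100_000 and urban_core_population >= 50_000:
        return "CMA"
    if urban_core_population >= 10_000:
        return "CA"
    return None  # meets neither threshold


print(classify_area(150_000, 60_000))  # CMA
print(classify_area(40_000, 12_000))   # CA
```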
Census subdivision (CSD)
This is the general term for municipalities (as determined by provincial/territorial legislation) or areas treated as municipal equivalents for statistical purposes (e.g., Indian reserves, Indian settlements and unorganized territories).
Census of population
A census is the collection of information about all units in a population, sometimes also called a 100% sample survey. Under the Statistics Act of 1971, it is a statutory requirement to conduct a nationwide census every five years. The Census of Population provides information needed by community groups, businesses and governments to develop plans for education and training, seniors' housing, day care, fire protection, public transport, and many other programs.
As used in demography, a number of people having a common characteristic, for example, all persons in a given population who were born in 1940, or all persons suffering from a particular disease.
This is a term used within Statistics Canada to describe information that is subject to the secrecy provisions of the Statistics Act. Information is deemed confidential either because it directly identifies a responding unit, for example, by name, or because it could permit specific responding units to be identified, even when the data is stripped of identifiers, due to the information's detail or its geographical structure or format.
Confidentiality denotes an implied trust relationship between the person providing the information and the individual or organization collecting it. This relationship is built on the assurance that the information will not be disclosed without the person's permission. Under the Statistics Act, information that would identify an individual, business or institution cannot be disclosed without their knowledge or consent.
Coverage is the extent to which every person or unit intended for inclusion in a survey or census is in fact counted and counted only once. Coverage errors refer to when persons or units of the survey or census are missed (under-coverage) or over-counted (over-coverage). Studies are often conducted by Statistics Canada to provide estimates of under-coverage and over-coverage of a given survey or census or to examine related issues. For example, Statistics Canada has studied and analyzed the extent to which cell-phone use affects coverage for telephone surveys.
CV – Coefficient of variation
In a sample survey, results from the sample are used to estimate what the findings would be if the whole population were to be measured. In this process of estimation, some level of error is inevitable. The coefficient of variation (CV) is a way of expressing the sampling error associated with an estimate. First a standard error or 'average' error of the estimate is calculated. The CV is obtained by dividing the standard error of the estimate by the estimate itself and expressing the resulting fraction as a percentage. The lower the CV, the higher the data quality (see Margin of error).
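As a worked illustration of the calculation just described, the following sketch divides a standard error by its estimate and expresses the result as a percentage. The function name and the figures are made up for the example.

```python
def coefficient_of_variation(estimate, standard_error):
    """Return the CV as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("the CV is undefined for an estimate of zero")
    return abs(100.0 * standard_error / estimate)


# Hypothetical estimate of 2,000 persons with a standard error of 100.
cv = coefficient_of_variation(2000, 100)
print(cv)  # 5.0
```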
A degree or level of confidence that the data and statistical information are "fit for use". The particular issues of quality or fitness for use that must be addressed by Statistics Canada are relevance, accuracy, timeliness, accessibility, interpretability and coherence.
An organized and sorted list of facts or information about a set of individuals, households, businesses, or other relevant units. A Statistics Canada dataset is usually generated by a survey or administrative data, stored on a computer, and organized in such a way that it may be accessed easily by a wide variety of statistical application programs.
The process of providing statistical products and services to the general public and to specific data users. Statistics Canada disseminates data and analysis in the form of survey results, research reports, technical papers, periodical magazines, census products, and research compendia. Online products date from 1996 to the present. Historical material can be located using the Library Catalogue. Statistics Canada information is also distributed to an approved network of depository libraries.
The objective of dissemination activities is to provide relevant information in a timely fashion, in useful formats, and through accessible channels. Activities in place to support the dissemination of products include client consultation services, marketing, promotions, user-training and other client services.
A new variable constructed by applying logical or mathematical operations to one or more existing variables in order to meet particular data needs. For example, an age variable can be derived from date of birth information. As another example, a derived variable could be obtained called 'presence of a chronic health condition' based on whether or not a respondent answered 'yes' at least once to a series of questions asking about specific chronic health conditions such as asthma, diabetes, heart disease, etc.
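Both examples above can be sketched in code. The function names, the dictionary layout of the answers and the sample dates are assumptions made for illustration; the reference date of October 31, 2006 is the one used elsewhere in this glossary for the APS.

```python
from datetime import date


def age_on(birth_date, reference_date):
    """Derive age in completed years from date of birth."""
    years = reference_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in the reference year.
    if (reference_date.month, reference_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def has_chronic_condition(answers):
    """Derive a yes/no flag: True if any condition question was answered 'yes'."""
    return any(answer == "yes" for answer in answers.values())


reference = date(2006, 10, 31)
print(age_on(date(1992, 11, 1), reference))  # 13
print(has_chronic_condition({"asthma": "no", "diabetes": "yes", "heart disease": "no"}))  # True
```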
Editing is a process that ensures survey data are accurate, complete and consistent. A set of editing rules or conditions is applied to a dataset. Data which do not meet the conditions are examined and corrected where appropriate.
In a sample survey, results from the sample are used to estimate what the findings would be if the whole population were to be measured. The accuracy of such an estimate is a measure of how much the estimate differs from the correct or "true" figure. Departures from true figures are known as errors. Errors can arise from many sources, but can be grouped into a few broad categories: coverage errors, non-response errors, response errors, processing errors and sampling errors.
Non-response errors occur when it proves impossible to obtain a complete questionnaire from a person, household, or organization. Although certain adjustments for missing data can be made during processing, non-response means some loss of accuracy is inevitable.
Response errors indicate that a response may not be entirely accurate. The respondent may have misinterpreted the question or may not know the answer, especially if it is given for an absent household member, for example.
Sampling error refers to the fact that the results of the weighted sample differ somewhat from the results that would have been obtained from the total population. The difference is known as sampling error. The actual sampling error is of course unknown, but it is possible to calculate an "average" value, known as the "standard error".
A term that came into common usage in the 1970s to replace the word "Indian," which many people found offensive. Although the term First Nations is widely used, no legal definition of it exists. Among its uses, the term "First Nations peoples" refers to the North American Indian people in Canada, both Status and Non-Status. Many people have also adopted the term "First Nation" to replace the word "band" in the name of their community.
A list, map, or conceptual specification of the units comprising the survey population from which persons can be selected. For example, a telephone or city directory, or a list of members of a particular association or group.
The number of times an event or item occurs in a dataset.
A chart or table showing how often each value or range of values of a variable appear in a dataset. It is sometimes called a one-way frequency table to indicate that the distribution contains counts for one variable only.
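A one-way frequency table can be built by counting how often each value occurs. The sketch below uses Python's standard `collections.Counter`; the response values are invented for the example.

```python
from collections import Counter

# Hypothetical answers to a single survey question.
responses = ["yes", "no", "yes", "yes", "don't know", "no"]

# One count per distinct value of the variable.
frequency_table = Counter(responses)

for value, count in frequency_table.most_common():
    print(f"{value}: {count}")
```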
Imputation involves replacing either missing or invalid data with valid data. This is normally performed using predetermined rules or with the use of data from a 'statistical neighbour', that is, another responding unit with similar characteristics. Imputation is often combined with data editing.
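One simple way to picture imputation from a statistical neighbour is to copy the missing value from the most similar unit that did respond. The sketch below measures similarity on a single numeric field; this matching rule, the function name and the records are all assumptions for illustration and do not describe the method used for any particular survey.

```python
def impute_from_neighbour(records, target_field, match_field):
    """Fill missing `target_field` values from the closest donor on `match_field`."""
    donors = [r for r in records if r[target_field] is not None]
    if not donors:
        raise ValueError("no responding unit available to act as a donor")
    completed = []
    for record in records:
        if record[target_field] is None:
            # The donor is the responding unit closest on the matching field.
            donor = min(donors, key=lambda d: abs(d[match_field] - record[match_field]))
            record = {**record, target_field: donor[target_field]}
        completed.append(record)
    return completed


# Hypothetical records; the third unit did not report hours worked.
units = [
    {"age": 30, "hours_worked": 40},
    {"age": 62, "hours_worked": 10},
    {"age": 33, "hours_worked": None},
]
imputed = impute_from_neighbour(units, "hours_worked", "age")
print(imputed[2]["hours_worked"])  # 40
```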
A unit that meets all criteria for the survey. For the APS, in the provinces, all Aboriginal individuals living off reserve, aged 6 to 14 years of age as of October 31, 2006 were in scope for the children and youth component, and all Aboriginal individuals aged 15 and older as of October 31, 2006 were in scope for the adult component. In the territories, all Aboriginal individuals living on- and off-reserve aged 6 to 14 years of age as of October 31, 2006 were in scope for the children and youth component, and all Aboriginal individuals aged 15 and older as of October 31, 2006 were in scope for the adult component.
The Canadian federal legislation, first passed in 1876, that sets out certain federal government obligations, and regulates the management of Indian reserve lands. The act has been amended several times, most recently in 1985.
A group of North American Indian people for whom lands have been set apart and money is held by the Crown. Each band has its own governing band council, usually consisting of one or more chiefs, and several councillors. Community members choose the chief and councillors by election, or sometimes through traditional custom. The members of a band generally share common values, traditions and practices rooted in their ancestral heritage. Today, many bands prefer to be known as First Nations.
Organization of results from Statistics Canada activities, including data files, databases, tables, graphs, maps, and text. This organization can be either pre-defined (standard information product) or made in response to special requests (customized information product). Information products can be made available on either print or electronic media.
Interpretability reflects the ease with which the user may understand, properly use and analyze the data or information. The degree of interpretability is largely determined by: the adequacy of definitions on concepts, target populations and variables; terminology underlying the data; and information on any limitations of the data.
Inuit Nunaat is the homeland of Inuit of Canada. It includes communities in Nunatsiavut (Northern coastal Labrador), Nunavik (Northern Quebec), the territory of Nunavut and the Inuvialuit region (Northwest Territories). These regions collectively encompass the area traditionally used and occupied by Inuit in Canada.
The singular form of the word Inuit (i.e. 'a person').
Margin of error
In a sample survey, results from the sample are used to estimate what the findings would be if the whole population were to be measured. In this process of estimation, some level of error is inevitable. The margin of error, a measure used to build confidence intervals, serves as a rough indicator of the precision of an estimate. For example, pollsters often say that a certain percentage of the population, plus or minus the margin of error (expressed in percentage points), is likely to vote for a certain candidate, 19 times out of 20. To calculate the margin of error, which in this example corresponds to a 95% confidence interval, the pollster would use the equivalent of plus or minus two standard errors of the estimate (see Standard error).
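The "plus or minus two standard errors" rule in the polling example can be sketched as follows. The function names and figures are made up for illustration, and the multiplier of 2 is the approximation used in the text above.

```python
def margin_of_error(standard_error, multiplier=2.0):
    """Margin of error as a multiple of the standard error."""
    return multiplier * standard_error


def confidence_interval(estimate, standard_error, multiplier=2.0):
    """Lower and upper bounds: the estimate plus or minus the margin of error."""
    moe = margin_of_error(standard_error, multiplier)
    return estimate - moe, estimate + moe


# Hypothetical poll: 40% support with a standard error of 1.5 percentage points.
print(margin_of_error(1.5))            # 3.0
print(confidence_interval(40.0, 1.5))  # (37.0, 43.0)
```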
People of mixed North American Indian and European ancestry who identify themselves as Métis people, as distinct from North American Indian people, Inuit or non-Aboriginal people. The Métis have a unique culture that draws on their diverse ancestral origins, such as Scottish, French, Ojibway and Cree.
Files of records pertaining to individual responding units.
North American Indian
A term that describes all Aboriginal people in Canada who are not Inuit or Métis. North American Indian peoples are one of three groups of people recognized as Aboriginal in the Constitution Act, 1982. This also refers to First Nations people consisting of status and non-status Indians.
Data collected for a given variable about a particular responding unit. Examples include the specific values for a responding unit on characteristics such as age, gender or marital status—the observations might be '77', 'woman' and 'widowed'.
Out of scope
A sampled unit that does not meet all criteria for being surveyed. For the APS, in the provinces, a person could be out of scope by, for example, being less than 6 years of age or by being non-Aboriginal or by living on reserve. In the territories, a person could be out of scope by being less than 6 years of age or by being non-Aboriginal.
The complete group of units to which survey results are to apply. These units may be persons, households, businesses, institutions, etc. The term "Target Population" is often used to refer to all potentially surveyed units, as defined in a clear, precise way by the survey study. This is the population for which information is wanted.
A postcensal survey is one where surveyed units are selected based upon their responses to the Census of Population. These surveys are generally conducted shortly after the Census data have been processed.
A proportion refers to how many responses fall into a given response category in relation to the total responses. It is calculated by dividing the frequency of the response category by the total number of responses to the question.
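The division described above is short enough to show directly. The function name and the responses are invented for the example, and the sketch is unweighted.

```python
def proportion(responses, category):
    """Frequency of `category` divided by the total number of responses."""
    if not responses:
        raise ValueError("no responses to the question")
    return responses.count(category) / len(responses)


answers = ["yes", "no", "yes", "yes"]
print(proportion(answers, "yes"))  # 0.75
```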
PUMF - public use microdata file
Public use microdata files provide access to responding units so that users can conduct their own research or analysis. They involve a non-identifiable data set containing characteristics pertaining to the units of the survey (e.g., individuals, households or businesses). All such datasets have been authorized for release to the public by the Statistics Canada Microdata Release Committee. The dataset contains no confidential information in that individual identifiers have been removed and any data combination or geography which could potentially reveal the identity of a responding unit has been modified.
Research data centres (RDCs)
The Research Data Centre program provides researchers with access, in a secure setting governed by Statistics Canada, to microdata from population and household surveys. The RDC program is part of an initiative by Statistics Canada, the Social Sciences and Humanities Research Council (SSHRC) and university consortia to help strengthen Canada's social research capacity and to support the policy research community. The program is also supported by the Canada Foundation for Innovation (CFI) and the Canadian Institutes of Health Research (CIHR).
The respondent is the person providing the information for the surveyed unit, which could be a person, household, business or institution. In the case of APS, the respondents are the parent or guardian of the selected children and youth aged 6 to 14 years, and the adult aged 15 and older for the Adult component.
The responding unit refers to the surveyed unit for which a response is obtained. In the case of the APS, it would be the child/youth aged 6 to 14 years for whom a response is obtained from the parent or guardian. This term is defined to distinguish it from the term "respondent", which in the case of the APS refers to the parent or guardian providing the information for the child/youth. For the adult component of the APS (persons aged 15 and older), the responding unit is the same as the adult respondent.
The proportion of a sample for which a response to a questionnaire is obtained, usually expressed as a percentage. Non-response covers those who refused to participate as well as persons whom the survey was unable to reach.
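As a small illustration, an unweighted response rate can be computed from the number of sampled units and the number that responded. The function name and counts are assumptions for the example.

```python
def response_rate(n_responded, n_sampled):
    """Percentage of sampled units for which a questionnaire was obtained."""
    if n_sampled <= 0:
        raise ValueError("the sample must contain at least one unit")
    return 100.0 * n_responded / n_sampled


# Hypothetical survey: 800 of 1,000 sampled units responded.
print(response_rate(800, 1000))  # 80.0
```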
Rural areas include all territory lying outside urban areas. An urban area has a minimum population concentration of 1,000 persons and a population density of at least 400 persons per square kilometre, based on the current census population count. Taken together, urban and rural areas cover all of Canada. Rural population includes all population living in the rural fringes of census metropolitan areas (CMAs) and census agglomerations (CAs), as well as population living in rural areas outside CMAs and CAs.
A set of specifications that describe the sampling elements of a survey in detail. These elements include population, frame, surveyed units, sample size, sample selection and estimation method.
The process of selecting some part of a population to observe so as to estimate something of interest about the whole population. Examples of different sampling methods include simple random sampling, stratified random sampling, cluster sampling and multi-stage sampling.
Sampling or sampled unit
The unit selected by the sample design and from which measurements are taken for a survey. Examples include persons, households, families or businesses. For APS, the sampling unit is the person.
Standard deviation measures the dispersion of a data set around the mean. It is the most widely-used measure of dispersion. Mathematically, the standard deviation is the square root of variance.
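The relationship between standard deviation and variance can be checked with Python's standard `statistics` module. The data values are made up, and the population forms (`pstdev`, `pvariance`) are used here as an assumption, since the definition above does not say whether a sample correction is intended.

```python
import math
import statistics

# Hypothetical observations for one variable.
values = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(values)       # population variance
standard_deviation = statistics.pstdev(values)  # population standard deviation

print(standard_deviation)   # 2.0
print(math.sqrt(variance))  # 2.0
```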
In a sample survey, results from the sample are used to estimate what the findings would be if the whole population were to be measured. Sampling error refers to the fact that the results of the weighted sample differ somewhat from the results that would have been obtained from the total population. The difference is known as sampling error. The actual sampling error is of course unknown, but it is possible to calculate an "average" value, known as the "standard error".
An Act regarding statistics of Canada. Includes the definition of Statistics Canada's mandate: ''There shall continue to be a statistics bureau under the Minister, to be known as Statistics Canada, the duties of which are:
- to collect, compile, analyze, abstract and publish statistical information relating to the commercial, industrial, financial, social, economic and general activities and condition of the people;
- to collaborate with departments of government in the collection, compilation and publication of statistical information, including statistics derived from the activities of those departments;
- to take the census of population of Canada and the census of agriculture of Canada as provided in this Act;
- to promote the avoidance of duplication in the information collected by departments of government; and
- generally, to promote and develop integrated social and economic statistics pertaining to the whole of Canada and to each of the provinces thereof and to coordinate plans for the integration of those statistics.''
The process by which particular data are prevented from being released based on criteria designed to protect confidentiality. 'Cell' suppression refers to procedures used to protect sensitive tabular data from disclosure; a cell being an individual entry in a table. For the APS, data were also suppressed for reasons of data quality (CV larger than 33.3%).
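The data-quality rule mentioned for the APS (suppress when the CV is larger than 33.3%) can be sketched as a simple check. The function name, the constant name and the use of `None` to stand for a suppressed cell are assumptions for illustration; confidentiality-based suppression is not modelled.

```python
CV_SUPPRESSION_THRESHOLD = 33.3  # percent, as stated for the APS


def publishable_estimate(estimate, standard_error):
    """Return the estimate, or None when its CV exceeds the threshold."""
    if estimate == 0:
        raise ValueError("the CV is undefined for an estimate of zero")
    cv = abs(100.0 * standard_error / estimate)
    return None if cv > CV_SUPPRESSION_THRESHOLD else estimate


print(publishable_estimate(100, 10))  # 100  (CV of 10%)
print(publishable_estimate(100, 40))  # None (CV of 40%, suppressed)
```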
The selected unit from which measurements are taken for a sample survey or a Census. Examples include persons, households, families or businesses. For the APS, the surveyed units (which are also the sampled units, since the APS is a sample survey) are the children and youth aged 6 to 14 years and the adults aged 15 and older.
An urban area has a minimum population concentration of 1,000 persons and a population density of at least 400 persons per square kilometre, based on the current census population count. All territory outside urban areas is classified as rural. Taken together, urban and rural areas cover all of Canada. The urban population includes all population living in the urban cores, secondary urban cores and urban fringes of census metropolitan areas (CMAs) and census agglomerations (CAs), as well as the population living in urban areas outside CMAs and CAs.
These guides accompany Statistics Canada survey datasets, such as analytical files and Public Use Microdata Files (PUMF), providing the detailed technical information required to use the data appropriately. The guide typically contains important information to know prior to data analysis: weighting variables to use, procedures related to the estimate of variance, and precautions to take in the dissemination of the data.
A measure of dispersion for a given characteristic or variable in a dataset. It indicates how much variability exists for that characteristic. Technically, it is calculated as the average squared deviation from the mean of each observation in the data set for a particular variable.
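The "average squared deviation from the mean" can be computed step by step. The sketch below follows that wording literally, dividing by the number of observations; the function name and data are invented for the example.

```python
def variance(values):
    """Average squared deviation of each observation from the mean."""
    if not values:
        raise ValueError("at least one observation is required")
    mean = sum(values) / len(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / len(values)


observations = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(observations))  # 4.0
```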
A weight is the average number of units in the population that a unit in the survey represents. Examples of a unit include a person or a household. Weights are applied to responding units in a sample database in order to ensure that, when making inferences from the survey data to population parameters, estimates of characteristics for the total population are obtained.
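To illustrate how weights turn sample responses into population estimates, the sketch below sums the weights of responding units with a given characteristic. The record layout, field names and weight values are hypothetical.

```python
def weighted_total(records, flag):
    """Estimated number of population units with the characteristic."""
    return sum(r["weight"] for r in records if r[flag])


def weighted_proportion(records, flag):
    """Estimated share of the population with the characteristic."""
    total_weight = sum(r["weight"] for r in records)
    return weighted_total(records, flag) / total_weight


# Each responding unit represents `weight` units in the population.
respondents = [
    {"has_condition": True, "weight": 120.0},
    {"has_condition": False, "weight": 80.0},
    {"has_condition": True, "weight": 200.0},
    {"has_condition": False, "weight": 100.0},
]
print(weighted_total(respondents, "has_condition"))       # 320.0
print(weighted_proportion(respondents, "has_condition"))  # 0.64
```

Note that the unweighted proportion in this example would be 0.5, which is why the user guides stress using the weighting variables.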