Health Reports
Data profile: The Statistics Canada Biobank

by Natalie Cross, Janine Clarke, Carol Perez-Iratxeta and Audra Nagasawa

Release date: November 16, 2022

DOI: https://www.doi.org/10.25318/82-003-x202201100003-eng

Abstract

Introduction

The Statistics Canada Biobank (Biobank) is a valuable source of nationally representative health information. It contains biospecimens collected from the Canadian Health Measures Survey (CHMS) and the Canadian COVID-19 Antibody and Health Survey (CCAHS). Both surveys are voluntary and aim to collect a variety of important health information from Canadians to create nationally representative estimates. This information is collected through questionnaires, physical measures, and self-administered sample collection. Biospecimens collected as part of the CHMS and CCAHS from consenting participants include whole blood, plasma, serum, urine, DNA samples, and dried blood spots. These samples are stored as part of the Biobank for future health research. Canadian researchers can apply to the Biobank program to use this nationally representative source of biospecimens. Results obtained from their research can also be combined with a wide variety of health and lifestyle information collected as part of the CHMS and CCAHS, making the Biobank a rich source of health-related information that can fill data gaps on the health concerns that are important to Canadians. This data resource profile provides an overview of the Biobank to inform researchers and data users about the program and how it can be used as a resource for the advancement of health-related research.

Keywords

Biobank, biospecimens, Canada, health measures

Authors

Natalie Cross, Janine Clarke, Carol Perez-Iratxeta and Audra Nagasawa (audra.nagasawa@statcan.gc.ca) are with the Centre for Direct Health Measures, Statistics Canada.

Introduction

The Statistics Canada Biobank is a valuable, yet underutilized, source of nationally representative health information. The main purpose of the Biobank is to accelerate future research projects and to build health monitoring opportunities on a representative sample of Canadians.

There are multiple key advantages to biobanking. Notably, it provides a time- and cost-effective model to perform health research, as the samples are readily available for analysis and do not need to be collected from another group of participants. Ultimately, biobanking provides an invaluable resource that can help Canadians benefit from advances in science and medicine.

This data resource profile is intended to provide the academic community with details about this valuable resource so more researchers might take advantage of it for their projects.

Background—Canadian Health Measures Survey

The CHMS is an ongoing, voluntary survey that began collection in 2007. The survey can be conducted in English or French. However, participants who are unable to communicate in either official language can request an interpreter to complete the survey in their preferred language. Each CHMS collection cycle lasts approximately two years, which is enough time for survey collection to take place at 16 different sampling sites across five Canadian regions (the Atlantic provinces, Quebec, Ontario, the Prairie provinces, and British Columbia). These sites are chosen randomly based on the sampling framework outlined in the CHMS Integrated Metadatabase (IMDB) web page.Note 1 The CHMS was designed to create national baseline data to evaluate and monitor the major health concerns of Canadians. The CHMS collects information on the health and lifestyle habits of Canadians living in the 10 provinces, with certain exclusions outlined in the Data resource description section below. Cycle 1 of the CHMS collected information on Canadians aged 6 to 79 years, Cycles 2 to 6 collected information on those aged 3 to 79 years and Cycle 7 will collect information on those aged 1 to 79 years. Some of the topics covered in this survey include physical activity, body composition, blood pressure, bone density, vision, and environmental contaminants in blood, urine, and hair.

CHMS data are collected in two parts. The first part consists of a household survey, and the second part involves a visit to a mobile examination centre (MEC). Both components of the survey collect complementary information, with the goal of improving health programs and services in Canada. Because the information is collected under the authority of the Statistics Act, participant information must be kept strictly confidential. For more details on what is collected in the CHMS, refer to www.statcan.gc.ca/eng/survey/household/5071/informationsheet.

Background—Canadian COVID-19 Antibody and Health Survey

In response to the COVID-19 pandemic, the Canadian COVID-19 Antibody and Health Survey (CCAHS) was developed and collection started in November 2020. Like the CHMS, this survey is available in both official languages. The CCAHS is a voluntary survey aimed at expanding what is known about SARS-CoV-2, including its impact on the health of Canadians. The target population of the CCAHS in the first reference period (November 2020 to April 2021) was individuals aged 1 year and older living in the 10 provinces and 3 territories. For the second reference period (April to August 2022), the CCAHS covered individuals aged 18 years and older living in the 10 provinces. Certain exclusions to the population coverage are outlined in the Data resource description section below. More information about the sampling method can also be found at the CCAHS IMDB web page.Note 20

Like the CHMS, CCAHS data are collected in two parts. The first part of the survey is an electronic questionnaire about a participant’s general health and exposure to SARS-CoV-2, which takes about 20 minutes to complete. The questionnaire must be completed first, as written consent is required to complete the second part of the survey. This second component involves two self-administered tests to determine the presence of COVID-19 antibodies and active COVID-19 infections: an at-home finger prick blood test called a dried blood spot (DBS) test, and a PCR (polymerase chain reaction) saliva test. CCAHS data are collected under the authority of the Statistics Act; thus, respondent information is kept strictly confidential.

Background – Statistics Canada Biobank

The Biobank is integral to the CHMS and CCAHS. It stores whole blood, plasma, serum, urine, DNA samples, and dried blood spot samples from consenting participants at the Public Health Agency of Canada’s National Microbiology Laboratory (NML) in Winnipeg, Manitoba. The stored biospecimens can be used in future health studies led by Canadian academic and government researchers, allowing for expansion and elaboration on the information already collected as part of these surveys. Data from a biobank project could also be combined with survey content, further improving the richness of the data collected in the CHMS and CCAHS. One of the most valuable aspects of the Biobank is that it is a cost-effective source of biospecimens representative of the Canadian population. Current and past biobank research projects are detailed in Table 1. Biobank project descriptions can also be found at https://www.statcan.gc.ca/eng/microdata/biobank/projects


Table 1
Summary of completed and ongoing Statistics Canada Biobank projects

| Project title | Cycle(s) | Matrix | Project description |
| --- | --- | --- | --- |
| Improved delivery of respiratory health care services using a metabolomic approach | 1, 2 | Urine | Apply metabolomic approaches to differentiate preschool children with asthma from other causes of wheeze-like illness and to differentiate adults with asthma from chronic obstructive pulmonary disease. Ultimately, this would help to develop a diagnostic urine test to improve diagnosis accuracy. |
| Testing of potential interfering substances in human serum on the liaison 25OHD assay | 4 | Serum | Investigate cholesterol’s interference on measured vitamin D and use the results from this investigation to standardize Canadian Health Measures Survey Cycle 4 vitamin D estimates. |
| Measuring the immunity of Canadians to measles and varicella and assessing the risk of epidemics (iCARE) | 2, 3 | Serum | Examine the level of population immunity to measles and varicella (chickenpox), two vaccine-preventable diseases in Canada. |
| Genetic modifiers of folate, vitamin B12 and homocysteine status in a cross-sectional study of the Canadian population | 1 | DNA | Identify associations between a number of genetic variants that are common in the Canadian population, with folate and vitamin B12 status. |
| Biomonitoring of environmental chemicals in samples of the Canadian Health Measures Survey Biobank | 3, 4, 5, 6 | Urine and serum | Measure high-priority environmental chemicals and their metabolites in the blood and urine of Statistics Canada biobank participants. |
| Canadian biomonitoring data, reference ranges and associations with health outcomes of priority metals and trace elements to inform risk assessments under the Chemicals Management Plan | 2 | Blood | Generate biomonitoring data, examine associations with health outcomes and establish reference ranges to inform risk assessments for priority metals or trace elements. |
| Fatty acid reference ranges from the Canadian Health Measures Survey | 1, 2 | Plasma | Determine the reference ranges (normal values) of plasma fatty acids of Canadians to provide an appropriate clinical interpretation of laboratory test results. The study will also serve to establish healthy target levels of fatty acids and determine whether certain concentrations are associated with deficiency or increased risk of chronic disease. |
| Evaluation of diagnostic assays for COVID-19 caused by SARS-CoV-2 | 1, 5 | Serum, whole blood, plasma and urine | Evaluate multiple commercially available blood tests that detect antibodies raised in patients who have been exposed to SARS-CoV-2 and antigen capture tests to add to the repertoire of available diagnostic assays for COVID-19. The Statistics Canada biobank samples will provide a source of pre-outbreak samples. |
| Estimating hepatitis C virus and hepatitis B virus prevalence among the general population in Canada | 5, 6 | Serum | Estimate the hepatitis C and hepatitis B disease burden in the Canadian population. |
| Genome-wide genotyping of the Canadian Health Measures Survey | 2 to 5 | DNA | Genotype DNA samples from consenting Canadian Health Measures Survey participants. The second part of this study aims to perform a genome-wide association study to identify genetic determinants of the levels of environmental toxins. |
| Generation of reference intervals for neurological biomarker analytes in plasma | 5 | Plasma | Determine the reference intervals of brain-derived proteins that are emerging as promising biomarkers for neurological conditions. These references could be used in the future to assess the extent of brain damage from blood samples. |

Other biorepositories

Aside from Statistics Canada, other agencies and organizations within Canada and in other countries have found a need for the type of data a biorepository can provide. Within Canada, the Canadian Partnership for Tomorrow’s Health (CanPath)Note 2 and the Maternal-Infant Research on Environmental Chemicals (MIREC)Note 3 are two other biorepositories available to researchers. CanPath is a voluntary survey that collects genomic, clinical, behavioural and environmental information from over 300,000 Canadians aged 30 to 74 years, with 160,000 participants providing blood samples, and MIREC has followed 2,001 women throughout and after their pregnancies. The Centers for Disease Control and Prevention (CDC) in the United States runs a comparable program called the Biospecimen Program as a part of the National Health and Nutrition Examination Survey that has been ongoing since 1988.Note 4 A much larger program in size and scope is the UK Biobank in the United Kingdom, which involves 500,000 participants.Note 5

The mandates of the Canadian, U.S. and U.K. programs are all quite similar: use the stored biospecimens for health-related research with the ultimate goal of improving human health.Note 6, Note 7, Note 8 A key difference between the Statistics Canada Biobank and the biobanks run by CanPath, MIREC, the CDC and the UK Biobank is that these other programs include a longitudinal or follow-up element, where a cohort of participants is followed over time and repeatedly contributes to the biobank program. These biorepositories have been instrumental in many areas of research, including diabetes;Note 9, Note 10, Note 11, Note 12 hypertension;Note 13, Note 14, Note 15 and environmental toxins, including metals,Note 16, Note 17 pesticides,Note 18 and phthalates.Note 19 All this research supports the shared goal of improving health and health services within these countries and around the world.

Data resource description

Collection and storage methods and available samples

CHMS participants who go to the MEC can consent to having blood and urine samples collected, as long as each test is deemed appropriate for their personal health condition. With the participant’s consent, the Biobank will also store samples of whole blood, plasma, serum, buffy coat, urine, and DNA. These biospecimens are shipped weekly from the MEC and are stored in -80°C freezers at the NML to ensure the long-term integrity of the samples.

As of January 2022, six CHMS collection cycles have been completed. Cycle 7 was delayed because of the COVID-19 pandemic and other operational reasons. In each cycle, approximately 5,700 participants aged 3 to 79 years consented to the storage of their samples for use in future research. A summary of the approximate sample sizes, age ranges and available amounts for each of the available biological matrices for each CHMS collection cycle can be found in Table 2. Note that Cycle 7 will also include participants aged 1 to 2 years.


Table 2
Overview of the approximate sample sizes, age ranges and amount available for each of the biological matrices that are available for each Canadian Health Measures Survey collection cycle

| Matrix | Age range (years) | Sample size (number) | Amount available |
| --- | --- | --- | --- |
| Serum | 3 to 79 | 5,700 | 0.5 mL |
| Serum | 3 to 39 | 3,500 | 0.5 mL |
| Serum | 6 to 79 | 5,100 | 0.5 mL |
| Serum | 12 to 79 | 4,100 | 0.5 mL, 1.0 mL |
| Serum | 20 to 79 | 3,100 | 0.5 mL, 1.0 mL |
| Plasma | 3 to 79 | 5,700 | 0.5 mL |
| Plasma | 6 to 79 | 5,100 | 0.5 mL |
| Plasma | 12 to 79 | 4,100 | 0.5 mL |
| Whole blood | 6 to 79 | 5,100 | 1.0 mL |
| Whole blood | 12 to 79 | 4,100 | 1.0 mL |
| Urine | 3 to 79 | 5,700 | 1.0 mL, 2.0 mL, 4.5 mL |
| Urine | 6 to 79 | 5,100 | 1.0 mL, 4.5 mL |
| DNA | 14 to 79 | 3,700 | 1.0 µg |

CCAHS participants receive by mail a survey invitation and an antibody test kit to collect dried blood samples; some participants also receive a PCR saliva test kit. Since both test kits are self-administered, and the questionnaire is available online, participants can complete the full survey from home. The test kits are returned in the enclosed prepaid postage envelope for shipment to the NML. DBS kits are then sent to a reference laboratory for analysis. With the participant’s consent, leftover DBS samples will be stored at the NML in -80°C freezers for use in future research projects. After Cycle 1 of the CCAHS, almost 10,000 DBS samples are in storage. An additional 32,000 samples are estimated to be available after Cycle 2.

Target population

The target population of Cycle 1 of the CHMS was Canadians living in the 10 provinces aged 6 to 79 years. For cycles 2 to 6, the age was 3 to 79 years, and for Cycle 7 it will be 1 to 79 years. The target population of the CCAHS for the November 2020 to April 2021 reference period was individuals aged 1 and older living in the 10 provinces and the capitals of the three territories. For the April to August 2022 reference period, the target population was individuals 18 and older living in the 10 provinces. Excluded from these target populations are people living on reserves and other Indigenous settlements in the provinces, full-time members of the Canadian Armed Forces living on military bases, the institutionalized population, and residents of certain remote regions. Individuals living in the three territories are excluded from the CHMS and the second reference period of the CCAHS. Together, these exclusions represent approximately 3% to 4% of the target population. Further information on the sampling strategy can be found on the CHMS IMDB web pageNote 1 and on the CCAHS IMDB web page.Note 20

Governance

The Biobank Advisory Committee (BAC) is responsible for the governance of the Biobank and plays an important role in the approval process of project applications to the Biobank. The BAC, which is composed of academic and federal researchers, ensures that research proposals are appropriate from an ethics and governance perspective and are scientifically sound. Because Biobank samples are a limited resource, the intent, methodology, feasibility, research ethics and relevance of each project are thoroughly evaluated. Thus, BAC members are responsible for advising Statistics Canada on the best use of the stored biospecimens to serve CHMS and CCAHS objectives.

Data resource use

Summary of projects

Table 1 provides a summary of ongoing and completed Biobank projects. Some projects have made it to the publication of research findings stage, including one of the first Biobank projects to be approved. This project looked at the genetic modifiers of folate, vitamin B12, and homocysteine status. By measuring the strength of the association between selected genetic polymorphisms and vitamin B12 and folate status, this project pinpointed the potential relevance of specific genes involved in vitamin absorption or uptake, transport and metabolism.Note 21 Another Biobank project published its findings on the reference ranges of metals and trace elements present in CHMS blood samples.Note 22 Of the 12 analytes measured in this study, 8 (cerium, lanthanum, neodymium, praseodymium, yttrium, germanium, tellurium and titanium) were detected above the method reporting limit (MRL) in less than 1% of the Canadian population and 3 (aluminum, bismuth and chromium) were detected above the MRL in 3% to 5% of the Canadian population. In contrast, lithium was detected above the MRL in 66% of the population. Recently, a project attempted to measure and characterize the immune status of the Canadian population to measles and varicella. Although this project is still ongoing, the measles part of the project has concluded, and suggests that measles immunity in Canada could be below the required threshold to sustain elimination.Note 23

Some Biobank projects have analyzed multiple cycles of biobank samples (Table 1) to increase the sample size for greater statistical power or to consider variation over time. The biomonitoring of environmental chemicals project and the genome-wide genotyping project each use biospecimens from four different cycles. The biomonitoring of environmental chemicals project provides complementary information and expands on the biomonitoring component of the CHMS, which measures environmental chemicals, including pesticides, metals, fluoride and phenols. As part of the genome-wide genotyping project (refer to Table 1), DNA from CHMS respondents aged 14 to 79 years in Cycles 2 through 5 has been genotyped. This project will greatly expand the scientific utility of the CHMS cohort and will contribute to the understanding of disease outside the scope of this program, as the genotyping data have the potential to be used in many other studies.

Strengths and limitations

Multiple matrices available

Researchers applying to use Biobank samples can choose among several matrices: whole blood, serum, plasma, urine, DNA samples, and DBS. Having these options broadens the scope of potential projects (for example, some assays can only be performed using a certain matrix) and ensures researchers are able to choose the most stable and appropriate matrix for their analytes of interest.

Representative of the Canadian population

The Biobank offers researchers a unique resource that would otherwise be nearly impossible to obtain: a source of biospecimens representative of the Canadian population. Being representative makes the results of the CHMS, the CCAHS, and Biobank studies relevant to the Canadian population. As such, survey and Biobank data are useful for producing Canada-wide estimates of the analytes of interest. This is a key advantage over non-statistically sampled cohorts, like CanPath, because participants who voluntarily complete health-oriented surveys tend to be more health conscious and lead healthier lifestyles than the overall population. However, producing regional estimates is not recommended: the sampling method was not designed for this purpose, and doing so could result in extreme sampling variability or unstable estimates of sampling variability. This could be a limitation for stakeholders looking for health information at the regional level to fulfill their role in Canada’s health care system. A potential exception is the combination of multiple cycles, which might allow for the production of estimates for the largest regions of Ontario or Quebec.
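To illustrate why representativeness depends on weighting, the following is a minimal Python sketch of a weighted proportion. It assumes each respondent record carries a survey weight, meaning the number of people in the population that the respondent represents. The function name, the record layout and the data are hypothetical and made up for illustration; they are not taken from the CHMS or CCAHS files.

```python
def weighted_proportion(records):
    """Estimate the population share for which `detected` is True.

    Each record is a (detected, weight) pair. The weight is a hypothetical
    survey weight: the number of people the respondent represents.
    """
    total_weight = sum(weight for _, weight in records)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    detected_weight = sum(weight for detected, weight in records if detected)
    return detected_weight / total_weight


# Made-up records: (analyte detected above the reporting limit, survey weight)
sample = [(True, 1000.0), (False, 3000.0), (True, 500.0), (False, 500.0)]

# The unweighted share treats every respondent equally.
unweighted = sum(1 for detected, _ in sample if detected) / len(sample)
# The weighted share reflects how many people each respondent represents.
weighted = weighted_proportion(sample)

print(f"unweighted: {unweighted:.2f}, weighted: {weighted:.2f}")
```

In this toy data the two figures differ (0.50 unweighted versus 0.30 weighted), which is why a simple count of samples should not be read as a population estimate. Variance estimation for a complex survey design is a separate step that this sketch does not cover.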

Relatively small sample sizes are not suited for rare conditions or some population comparisons

The complete set of samples for each cycle is a few thousand. This may be sufficient for estimates about commonly occurring health conditions, but the sample size may not be large enough to capture the prevalence of rare conditions. The power to detect statistically significant results may also be low. Rare conditions may be missed entirely in the collection, and would therefore not be represented in the samples. In some cases, however, combining cycles can increase statistical power and may help produce statistically sound estimates for uncommon health conditions.
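The effect of combining cycles can be sketched with a rough calculation. The Python example below assumes simple random sampling, which is a simplification: the surveys use a complex design, so real standard errors would generally differ. The prevalence value is an arbitrary assumption, and the per-cycle sample size of 5,700 is the approximate figure quoted earlier in this profile.

```python
import math


def standard_error(prevalence, n):
    """Standard error of a proportion under simple random sampling."""
    return math.sqrt(prevalence * (1.0 - prevalence) / n)


ASSUMED_PREVALENCE = 0.01  # arbitrary illustrative value for an uncommon condition
SAMPLES_PER_CYCLE = 5700   # approximate per-cycle sample size from this profile

for cycles in (1, 2, 4):
    n = cycles * SAMPLES_PER_CYCLE
    expected_cases = ASSUMED_PREVALENCE * n
    se = standard_error(ASSUMED_PREVALENCE, n)
    print(f"{cycles} cycle(s): n={n}, expected cases={expected_cases:.0f}, SE={se:.5f}")
```

Under these assumptions the standard error shrinks with the square root of the sample size, so pooling four cycles halves it relative to a single cycle. Pooling also presumes the quantity of interest is comparable across the cycles being combined.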

Cross-sectional and not longitudinal

Because the CHMS and CCAHS are cross-sectional surveys, they provide a snapshot of the health of Canadians at a particular moment in time. A benefit of the cross-sectional design is that it avoids the sample attrition and participation bias that can affect longitudinal studies, which rely on the same group of participants taking part in each cycle.

Content changes cycle to cycle

A large portion of the content of the CHMS and the CCAHS is consistent from cycle to cycle. This is the core content, which includes variables such as a complete blood count, blood chemistry, blood metals, cholesterol, creatinine, chronic conditions, and vaccination status. Having content repeated in each cycle allows researchers to look at the changes in these variables over time. The variables that are consistently included in each cycle were chosen to meet Canada’s priority health concerns, such as cardiovascular health.

However, some survey content varies from cycle to cycle. Changes have been made to certain sociodemographic variables in the CHMS, including those related to income, job classification, population group, Indigenous identity, marital status and gender (which was added to Cycle 7). With the CCAHS, information on COVID-19 vaccination changed as the vaccine became widely available between the two collection periods. Another portion of the surveys consists of rotating or buy-in variables, which, for the CHMS, include nutrition markers such as vitamins D and B12, ferritin and folate, and environmental contaminants such as BPA (bisphenol A), phthalates, plasticizers, and parabens. This allows for the collection of a wider variety of content while managing the extra costs and participant burden of including extra variables. For the CHMS, some of this content, although not included in every cycle, is included in paired cycles. This is beneficial to the Biobank, as some research projects require the analysis of multiple cycles of samples to attain a sufficient sample size. In addition, some of the content that was not measured during Cycles 1 to 6 is now being measured in the stored samples to fill data gaps (see the Biomonitoring of environmental chemicals in samples of the Canadian Health Measures Survey Biobank project in Table 1).

Data resource access

Description of the application process

The overall application process is outlined in Table 3. Bona fide scientists can submit their Biobank project applications at any time. Both Canadian and international researchers can gain access to Biobank samples; however, the Biobank samples cannot leave Canada, and the data must be accessed through research data centres (RDCs) located in Canada. Additionally, international researchers need to be based in a Canadian laboratory with a co-investigator who is either a Canadian citizen or a permanent resident of Canada. The application should be submitted through the Microdata Access Portal (https://www.statcan.gc.ca/rdc-cdr/eng). More information on the application process can be found at https://www.statcan.gc.ca/eng/microdata/data-centres/access, and questions about the application process can be directed to the Statistics Canada Biobank coordinator (statcan.biobankinfo-infobiobanque.statcan@statcan.gc.ca).


Table 3
Summary of a Statistics Canada Biobank project process

| Phase | Description of phase |
| --- | --- |
| Researcher application | Application form with details on proposed research project; curricula vitae of researchers; proof of funding and ethics review; scientific peer review arrangements |
| Phase 0: Approval | Initial screening by Statistics Canada Biobank proposal review group; feasibility evaluation; Statistics Canada Biobank Advisory Committee review; iterative feedback provided to the researcher on their project proposal based on project reviews and evaluations; proof of funding and ethics approval; security clearance (Tier 1 or Tier 2 projects) |
| Phase 1: Delivery | Publication of plain language summary on the Statistics Canada Biobank website; contract negotiations and costing estimate; researchers deemed as Statistics Canada employees and take the confidentiality oath as per the Statistics Act; laboratory audit; delivery of biospecimens |
| Laboratory analysis | Compliance with the Statistics Canada Biobank quality assurance and quality control strategy; CHMS quality consultation; the laboratory communicates analysis results with Statistics Canada through a secure internet portal |
| Phase 2: Data processing, review and dissemination | The data file is processed and formatted; a data dictionary and user guide are prepared for the research data centre (RDC); data quality control and assessment; possible weighting of results for nationally representative estimates; creation and verification of the dissemination file; data file is released to the RDC, where it can be merged with other CHMS variables; one-year data exclusivity period after delivery to RDC (opportunity to publish first) |
| Phase 3: Destruction of samples and data | One year after publication, samples should be destroyed |

In addition to the review and approval of the CHMS and CCAHS’s research ethics board (REB), applicants are required to complete a separate review with the REBs at their affiliated institutions, and to provide the corresponding ethics certificate for their planned research to Statistics Canada. Researchers are also required to submit proof of funding.

Each project proposal is assessed for its feasibility, quality, importance and impact. An initial feasibility assessment is conducted by the Biobank proposal review group. Iterative feedback is provided to the researcher before the application is reviewed by the BAC. Following approval from the BAC, contract negotiations can begin between Statistics Canada and the applicant’s institution in preparation for the delivery of biospecimen samples.

While the proposal is being assessed, an application for the security clearance of the researchers who will be involved in the project is initiated. Any person who is providing laboratory services to Statistics Canada, including biobank-related research, must become a deemed employee of Statistics Canada. More information on the deeming process can be found at https://www.statcan.gc.ca/en/microdata/data-centres/faq.

Researchers and their facilities must meet certain security requirements to access, use, and store the Biobank biospecimens. In terms of security requirements, access to Biobank biospecimens is based on a tiered access system, depending on the confidentiality disclosure risk. Projects fall under one of two categories: Tier 1 or Tier 2. Tier 1 projects are deemed to have low disclosure risks. This applies to most blood and urine biomarkers. Security requirements for Tier 1 projects are handled by Statistics Canada and are outlined in the contract agreement signed between Statistics Canada and the applicant’s laboratory or institution. Tier 2 projects are those with high disclosure risks, and have inherently disclosable biomarkers, such as genetic data or microbiome profiles. Security requirements for Tier 2 projects are handled by Public Services and Procurement Canada’s contract security program.

Research data centres

RDCsNote 24 provide a rich repository of Statistics Canada microdata, and include social and business surveys, administrative data and linked data. RDCs allow academic and government researchers to access data while ensuring that Statistics Canada maintains confidentiality and data privacy. Statistics Canada employees are also available at RDCs to provide support. Researchers will access biobank, CHMS and CCAHS microdata mostly at RDCs, and it is at RDCs where they will be able to merge their data with requested variables from the CHMS or the CCAHS. More information on RDCs can be found at https://www.statcan.gc.ca/eng/rdc/index.
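The merge step described above can be pictured as a keyed join between laboratory results and survey variables. The Python sketch below is only an illustration: the key name `respondent_id`, the field names and the values are all hypothetical, and the actual file layouts and tools available in an RDC are not described in this profile.

```python
def merge_on_id(lab_results, survey_records, key="respondent_id"):
    """Attach survey variables to each lab result that has a matching record.

    Lab rows with no matching survey record are skipped in this sketch.
    """
    survey_by_id = {row[key]: row for row in survey_records}
    merged = []
    for lab_row in lab_results:
        survey_row = survey_by_id.get(lab_row[key])
        if survey_row is None:
            continue  # no survey record for this identifier
        merged.append({**survey_row, **lab_row})
    return merged


# Made-up laboratory results from a hypothetical biobank project
lab_results = [
    {"respondent_id": 101, "analyte_ug_per_l": 0.42},
    {"respondent_id": 102, "analyte_ug_per_l": 1.10},
    {"respondent_id": 999, "analyte_ug_per_l": 0.05},
]

# Made-up survey variables requested for the same project
survey_records = [
    {"respondent_id": 101, "age": 34, "cycle": 5},
    {"respondent_id": 102, "age": 61, "cycle": 5},
]

merged = merge_on_id(lab_results, survey_records)
for row in merged:
    print(row)
```

Skipping unmatched rows is a design choice made for brevity here; in practice a researcher would likely want to count and investigate any results that fail to link.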

Conclusions

The Statistics Canada Biobank is a valuable resource for health research in Canada. Although several projects have successfully used Biobank biospecimens, there is still potential to expand its use to other health-related topics. It offers researchers a representative, cross-sectional source of samples in several different matrices to allow for best-fit given certain instrumental or biological considerations. Although the Biobank is less suitable to create regional estimates or estimates on rare health conditions, it is possible in some cases to overcome this limitation by combining multiple cycles to increase sample size and statistical power. Researchers can gain access not only to the Biobank biospecimens but also to the linked survey variables available in the RDCs. Ultimately, the purpose of the Biobank is to support health-related research projects, and create health monitoring opportunities to benefit the health of Canadians.
