Statistical methods

Results

All (2,299) (10 to 20 of 2,299 results)

  • Articles and reports: 11-522-X202200100005
    Description: Sampling variance smoothing is an important topic in small area estimation. In this paper, we propose sampling variance smoothing methods for small area proportion estimation. In particular, we consider the generalized variance function and design effect methods for sampling variance smoothing. We evaluate and compare the smoothed sampling variances and small area estimates based on the smoothed variance estimates through analysis of survey data from Statistics Canada. The results from real data analysis indicate that the proposed sampling variance smoothing methods work very well for small area estimation.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100006
    Description: The Australian Bureau of Statistics (ABS) is committed to improving access to more microdata, while ensuring privacy and confidentiality is maintained, through its virtual DataLab which supports researchers to undertake complex research more efficiently. Currently, the DataLab research outputs need to follow strict rules to minimise disclosure risks for clearance. However, the clerical-review process is not cost effective and has potential to introduce errors. The increasing number of statistical outputs from different projects can potentially introduce differencing risks even though these outputs from different projects have met the strict output rules. The ABS has been exploring the possibility of providing automatic output checking using the ABS cellkey methodology to ensure that all outputs across different projects are protected consistently to minimise differencing risks and reduce costs associated with output checking.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100007
    Description: With the availability of larger and more diverse data sources, Statistical Institutes in Europe are inclined to publish statistics on smaller groups than they used to do. Moreover, high impact global events like the Covid crisis and the situation in Ukraine may also ask for statistics on specific subgroups of the population. Publishing on small, targeted groups not only raises questions on statistical quality of the figures, it also raises issues concerning statistical disclosure risk. The principle of statistical disclosure control does not depend on the size of the groups the statistics are based on. However, the risk of disclosure does depend on the group size: the smaller a group, the higher the risk. Traditional ways to deal with statistical disclosure control and small group sizes include suppressing information and coarsening categories. These methods essentially increase the (mean) group sizes. More recent approaches include perturbative methods that have the intention to keep the group sizes small in order to preserve as much information as possible while reducing the disclosure risk sufficiently. In this paper we will mention some European examples of special focus group statistics and discuss the implications on statistical disclosure control. Additionally, we will discuss some issues that the use of perturbative methods brings along: its impact on disclosure risk and utility as well as the challenges in proper communication thereof.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100008
    Description: The publication of more disaggregated data can increase transparency and provide important information on underrepresented groups. Developing more readily available access options increases the amount of information available to and produced by researchers. Increasing the breadth and depth of the information released allows for a better representation of the Canadian population, but also puts a greater responsibility on Statistics Canada to do this in a way that preserves confidentiality, and thus it is helpful to develop tools which allow Statistics Canada to quantify the risk from the additional data granularity. In an effort to evaluate the risk of a database reconstruction attack on Statistics Canada’s published Census data, this investigation follows the strategy of the US Census Bureau, who outlined a method to use a Boolean satisfiability (SAT) solver to reconstruct individual attributes of residents of a hypothetical US Census block, based just on a table of summary statistics. The technique is expanded to attempt to reconstruct a small fraction of Statistics Canada’s Census microdata. This paper will discuss the findings of the investigation, the challenges involved in mounting a reconstruction attack, and the effect of an existing confidentiality measure in mitigating these attacks. Furthermore, the existing strategy is compared to other potential methods used to protect data – in particular, releasing tabular data perturbed by some random mechanism, such as those suggested by differential privacy.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100009
    Description: Education and training is acknowledged as fundamental for the development of a society. It is a complex multidimensional phenomenon whose determinants are attributable to several interrelated family and socio-economic conditions. To respond to the demand for statistical information supporting policymaking and its monitoring and evaluation, the Italian National Statistical Institute (Istat) is renewing its education and training statistical production system by implementing a new thematic statistical register. The register will be part of the Istat Integrated System of Registers, making it possible to relate education and training to other relevant phenomena, e.g. transition to work.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100010
    Description: Growing Up in Québec is a longitudinal population survey that began in the spring of 2021 at the Institut de la statistique du Québec. Among the children targeted by this longitudinal follow-up, some will experience developmental difficulties at some point in their lives. Those same children often have characteristics associated with higher sample attrition (low-income family, parents with a low level of education). This article describes the two main challenges we encountered when trying to ensure sufficient representativeness of these children, in both the overall results and the subpopulation analyses.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100011
    Description: In 2021, Statistics Canada initiated the Disaggregated Data Action Plan, a multi-year initiative to support more representative data collection methods, enhance statistics on diverse populations to allow for intersectional analyses, and support government and societal efforts to address known inequalities and bring considerations of fairness and inclusion into decision making. As part of this initiative, we are building the Survey Series on People and their Communities, a new probabilistic panel specifically designed to collect data that can be disaggregated according to racialized group. This new tool will allow us to address data gaps and emerging questions related to diversity. This paper will give an overview of the design of the Survey Series on People and their Communities.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100012
    Description: At Statistics Netherlands (SN), two partly independent intra-annual turnover index series are available for some economic sectors: a monthly series based on survey data, and a quarterly series based on value added tax data for the smaller units and re-used survey data for the other units. SN aims to benchmark the monthly turnover index series to the quarterly census data on a quarterly basis. This cannot currently be done because the tax data has a different quarterly pattern: the turnover is relatively large in the fourth quarter of the year and smaller in the first quarter. With the current study we aim to describe this deviating quarterly pattern at micro level. In the past we developed a mixture model using absolute turnover levels that could explain part of the quarterly patterns. Because the absolute turnover levels differ between the two series, in the current study we use a model based on relative quarterly turnover levels within a year.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100013
    Description: Respondents to typical household surveys tend to significantly underreport their potential use of food aid distributed by associations. This underreporting is most likely related to the social stigma felt by people experiencing great financial difficulty. As a result, survey estimates of the number of recipients of that aid are much lower than the direct counts from the associations, which in turn tend to overestimate because of double counting. Through its adapted protocol, the Enquête Aide alimentaire (EAA), conducted in France in late 2021 at a sample of sites run by food aid distribution associations, controls for the biases that affect the other sources and determines to what extent this aid is used.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100014
    Description: Ethnic minorities are often underrepresented in survey research, due to the challenges many researchers face in including these populations. While some studies discuss several methods in comparison, few have directly compared these methods empirically, leaving researchers seeking to include ethnic minorities in their studies unsure of their best options. In this article, I briefly review the methodological and ethical reasons for increasing ethnic minority representation in social science research, as well as challenges of doing so. I then present findings from ten studies which empirically compare methods of sampling and/or recruiting ethnic minority individuals. Finally, I discuss some implications for future research.
    Release date: 2024-03-25
Data (9) (9 results)

No content available at this time.

Analysis (1,874) (0 to 10 of 1,874 results)

  • Articles and reports: 75F0002M2024005
    Description: The Canadian Income Survey (CIS) has introduced improvements to the methods and data sources used to produce income and poverty estimates with the release of its 2022 reference year estimates. Foremost among these improvements is a significant increase in the sample size for a large subset of the CIS content. The weighting methodology was also improved and the target population of the CIS was changed from persons aged 16 years and over to persons aged 15 years and over. This paper describes the changes made and presents the approximate net result of these changes on the income estimates and data quality of the CIS using 2021 data. The changes described in this paper highlight the ways in which data quality has been improved while having little impact on key CIS estimates and trends.
    Release date: 2024-04-26

  • Journals and periodicals: 75F0002M
    Description: This series provides detailed documentation on income developments, including survey design issues, data quality evaluation and exploratory research.
    Release date: 2024-04-26

  • Stats in brief: 11-001-X202411338008
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2024-04-22

  • Articles and reports: 18-001-X2024001
    Description: This study applies small area estimation (SAE) and a new geographic concept called Self-contained Labor Area (SLA) to the Canadian Survey on Business Conditions (CSBC) with a focus on remote work opportunities in rural labor markets. Through SAE modelling, we estimate the proportions of businesses, classified by general industrial sector (service providers and goods producers), that would primarily offer remote work opportunities to their workforce.
    Release date: 2024-04-22

  • Articles and reports: 11-522-X202200100001
    Description: Record linkage aims at identifying record pairs related to the same unit and observed in two different data sets, say A and B. Fellegi and Sunter (1969) suggest testing whether each record pair is generated from the set of matched or unmatched pairs. The decision function consists of the ratio between m(y) and u(y), the probabilities of observing a comparison y of a set of k>3 key identifying variables in a record pair under the assumptions that the pair is a match or a non-match, respectively. These parameters are usually estimated by means of the EM algorithm, using as data the comparisons on all the pairs of the Cartesian product Ω=A×B. These observations (on the comparisons and on the pairs’ status as match or non-match) are assumed to be generated independently of other pairs, an assumption that characterizes most of the literature on record linkage and is implemented in software tools (e.g. RELAIS, Cibella et al. 2012). On the contrary, comparisons y and matching status in Ω are deterministically dependent. As a result, estimates of m(y) and u(y) based on the EM algorithm are usually bad. This fact jeopardizes the effective application of the Fellegi-Sunter method, as well as the automatic computation of quality measures and the possibility of applying efficient methods for model estimation on linked data (e.g. regression functions), as in Chambers et al. (2015). We propose to explore Ω by a set of samples, each one drawn so as to preserve independence of comparisons among the selected record pairs. Simulations are encouraging.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100002
    Description: The authors used the Splink probabilistic linkage package developed by the UK Ministry of Justice, to link census data from England and Wales to itself to find duplicate census responses. A large gold standard of confirmed census duplicates was available meaning that the results of the Splink implementation could be quality assured. This paper describes the implementation and features of Splink, gives details of the settings and parameters that we used to tune Splink for our particular project, and gives the results that we obtained.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100003
    Description: Estimation at fine levels of aggregation is necessary to better describe society. Small area estimation model-based approaches that combine sparse survey data with rich data from auxiliary sources have been proven useful to improve the reliability of estimates for small domains. Considered here is a scenario where small area model-based estimates, produced at a given aggregation level, needed to be disaggregated to better describe the social structure at finer levels. For this scenario, an allocation method was developed to implement the disaggregation, overcoming challenges associated with data availability and model development at such fine levels. The method is applied to adult literacy and numeracy estimation at the county-by-group-level, using data from the U.S. Program for the International Assessment of Adult Competencies. In this application the groups are defined in terms of age or education, but the method could be applied to estimation of other equity-deserving groups.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100004
    Description: In accordance with Statistics Canada’s long-term Disaggregated Data Action Plan (DDAP), several initiatives have been implemented into the Labour Force Survey (LFS). One of the more direct initiatives was a targeted increase in the size of the monthly LFS sample. Furthermore, a regular Supplement program was introduced, where an additional series of questions are asked to a subset of LFS respondents and analyzed in a monthly or quarterly production cycle. Finally, the production of modelled estimates based on Small Area Estimation (SAE) methodologies resumed for the LFS and will include a wider scope with more analytical value than what had existed in the past. This paper will give an overview of these three initiatives.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100005
    Description: Sampling variance smoothing is an important topic in small area estimation. In this paper, we propose sampling variance smoothing methods for small area proportion estimation. In particular, we consider the generalized variance function and design effect methods for sampling variance smoothing. We evaluate and compare the smoothed sampling variances and small area estimates based on the smoothed variance estimates through analysis of survey data from Statistics Canada. The results from real data analysis indicate that the proposed sampling variance smoothing methods work very well for small area estimation.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100006
    Description: The Australian Bureau of Statistics (ABS) is committed to improving access to more microdata, while ensuring privacy and confidentiality is maintained, through its virtual DataLab which supports researchers to undertake complex research more efficiently. Currently, the DataLab research outputs need to follow strict rules to minimise disclosure risks for clearance. However, the clerical-review process is not cost effective and has potential to introduce errors. The increasing number of statistical outputs from different projects can potentially introduce differencing risks even though these outputs from different projects have met the strict output rules. The ABS has been exploring the possibility of providing automatic output checking using the ABS cellkey methodology to ensure that all outputs across different projects are protected consistently to minimise differencing risks and reduce costs associated with output checking.
    Release date: 2024-03-25
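
The Fellegi-Sunter decision function described in entry 11-522-X202200100001 above, the ratio between m(y) and u(y), can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the common simplification that the comparison fields are conditionally independent given match status, so that the log ratio is a sum of per-field weights. The function names, the per-field probabilities and the two thresholds are hypothetical.

```python
import math


def match_weight(agreements, m_probs, u_probs):
    """Log2 of m(y)/u(y) for one record pair.

    agreements: list of booleans, one per comparison field (the vector y).
    m_probs[k]: assumed P(field k agrees | pair is a match).
    u_probs[k]: assumed P(field k agrees | pair is a non-match).
    Conditional independence across fields is an assumption of this sketch.
    """
    weight = 0.0
    for agrees, m_k, u_k in zip(agreements, m_probs, u_probs):
        if agrees:
            weight += math.log2(m_k / u_k)
        else:
            weight += math.log2((1.0 - m_k) / (1.0 - u_k))
    return weight


def classify(weight, lower, upper):
    """Three-way decision using two hypothetical thresholds."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "clerical review"


if __name__ == "__main__":
    # Hypothetical parameters for three fields (e.g. surname, birth year, postcode).
    m = [0.9, 0.8, 0.7]
    u = [0.1, 0.2, 0.3]
    w = match_weight([True, True, False], m, u)
    print(round(w, 3), classify(w, lower=-2.0, upper=3.0))
```

In practice the m and u values are estimated rather than fixed by hand, for example with the EM algorithm mentioned in the abstract; the sketch only shows how the ratio is turned into a decision once they are available.
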
Reference (363) (60 to 70 of 363 results)

  • Notices and consultations: 13-605-X201400414107
    Description:

    Beginning in November 2014, International Trade in goods data will be provided on a Balance of Payments (BOP) basis for additional country detail. In publishing this data, BOP-based exports to and imports from 27 countries, referred to as Canada’s Principal Trading Partners (PTPs), will be highlighted for the first time. BOP-based trade in goods data will be available for countries such as China and Mexico, Brazil and India, South Korea, and our largest European Union trading partners, in response to substantial demand for information on these countries in recent years. Until now, Canada’s geographical trading patterns have been examined almost exclusively through analysis of Customs-based trade data. Moreover, BOP trade in goods data for these countries will be available alongside the now quarterly Trade in Services data as well as annual Foreign Direct Investment data for many of these Principal Trading Partners, facilitating country-level international trade and investment analysis using fully comparable data. The objective of this article is to introduce these new measures. This note will first walk users through the key BOP concepts, most importantly the concept of change in ownership. This will serve to familiarize analysts with the Balance of Payments framework for analyzing country-level data, in contrast to Customs-based trade data. Second, some preliminary analysis will be reviewed to illustrate the concepts, with provisional estimates for BOP-based trade with China serving as the principal example. Lastly, we will outline the expansion of quarterly trade in services to generate new estimates of trade for the PTPs and discuss future work in trade statistics.

    Release date: 2014-11-04

  • Surveys and statistical programs – Documentation: 11-522-X201300014258
    Description:

    The National Fuel Consumption Survey (FCS) was created in 2013 and is a quarterly survey that is designed to analyze distance driven and fuel consumption for passenger cars and other vehicles weighing less than 4,500 kilograms. The sampling frame consists of vehicles extracted from the vehicle registration files, which are maintained by provincial ministries. For collection, FCS uses car chips for a part of the sampled units to collect information about the trips and the fuel consumed. There are numerous advantages to using this new technology, for example, reduction in response burden, collection costs and effects on data quality. For the quarters in 2013, the sampled units were surveyed 95% via paper questionnaires and 5% with car chips, and in Q1 2014, 40% of sampled units were surveyed with car chips. This study outlines the methodology of the survey process, examines the advantages and challenges in processing and imputation for the two collection modes, presents some initial results and concludes with a summary of the lessons learned.

    Release date: 2014-10-31

  • Surveys and statistical programs – Documentation: 11-522-X201300014259
    Description:

    In an effort to reduce response burden on farm operators, Statistics Canada is studying alternative approaches to telephone surveys for producing field crop estimates. One option is to publish harvested area and yield estimates in September as is currently done, but to calculate them using models based on satellite and weather data, and data from the July telephone survey. However before adopting such an approach, a method must be found which produces estimates with a sufficient level of accuracy. Research is taking place to investigate different possibilities. Initial research results and issues to consider are discussed in this paper.

    Release date: 2014-10-31

  • Surveys and statistical programs – Documentation: 11-522-X201300014260
    Description:

    The Survey of Employment, Payrolls and Hours (SEPH) produces monthly estimates and determines the month-to-month changes for variables such as employment, earnings and hours at detailed industrial levels for Canada, the provinces and territories. In order to improve the efficiency of collection activities for this survey, an electronic questionnaire (EQ) was introduced in the fall of 2012. Given the timeframe allowed for this transition as well as the production calendar of the survey, a conversion strategy was developed for the integration of this new mode. The goal of the strategy was to ensure a good adaptation of the collection environment and also to allow the implementation of a plan of analysis that would evaluate the impact of this change on the results of the survey. This paper will give an overview of the conversion strategy, the different adjustments that were made during the transition period and the results of various evaluations that were conducted. For example, the impact of the integration of the EQ on the collection process, the response rate and the follow-up rate will be presented. In addition, the effect that this new collection mode has on the survey estimates will also be discussed. More specifically, the results of a randomized experiment that was conducted in order to determine the presence of a mode effect will be presented.

    Release date: 2014-10-31

  • Surveys and statistical programs – Documentation: 11-522-X201300014269
    Description:

    The Census Overcoverage Study (COS) is a critical post-census coverage measurement study. Its main objective is to produce estimates of the number of people erroneously enumerated, by province and territory, study the characteristics of individuals counted multiple times and identify possible reasons for the errors. The COS is based on the sampling and clerical review of groups of connected records that are built by linking the census response database to an administrative frame, and to itself. In this paper we describe the new 2011 COS methodology. This methodology has incorporated numerous improvements including a greater use of probabilistic record-linkage, the estimation of linking parameters with an Expectation-Maximization (E-M) algorithm, and the efficient use of household information to detect more overcoverage cases.

    Release date: 2014-10-31

  • Surveys and statistical programs – Documentation: 11-522-X201300014278
    Description:

    In January and February 2014, Statistics Canada conducted a test aiming at measuring the effectiveness of different collection strategies using an online self-reporting survey. Sampled units were contacted using mailed introductory letters and asked to complete the online survey without any interviewer contact. The objectives of this test were to measure the take-up rates for completing an online survey, and to profile the respondents/non-respondents. Different samples and letters were tested to determine the relative effectiveness of the different approaches. The results of this project will be used to inform various social surveys that are preparing to include an internet response option in their surveys. The paper will present the general methodology of the test as well as results observed from collection and the analysis of profiles.

    Release date: 2014-10-31

  • Surveys and statistical programs – Documentation: 11-522-X201300014285
    Description:

    The 2011 National Household Survey (NHS) is a voluntary survey that replaced the traditional mandatory long-form questionnaire of the Canadian census of population. The NHS sampled about 30% of Canadian households and achieved a design-weighted response rate of 77%. In comparison, the last census long form was sent to 20% of households and achieved a response rate of 94%. Based on the long-form data, Statistics Canada traditionally produces two public use microdata files (PUMFs): the individual PUMF and the hierarchical PUMF. Both give information on individuals, but the hierarchical PUMF provides extra information on the household and family relationships between the individuals. To produce two PUMFs, based on the NHS data, that cover the whole country evenly and that do not overlap, we applied a special sub-sampling strategy. Difficulties in the confidentiality analyses have increased because of the numerous new variables, the more detailed geographic information and the voluntary nature of the NHS. This paper describes the 2011 PUMF methodology and how it balances the requirements for more information and for low risk of disclosure.

    Release date: 2014-10-31

  • Surveys and statistical programs – Documentation: 11-522-X201300014290
    Description:

    This paper describes a new module that will project families and households by Aboriginal status using the Demosim microsimulation model. The methodology being considered would assign a household/family headship status annually to each individual and would use the headship rate method to calculate the number of annual families and households by various characteristics and geographies associated with Aboriginal populations.

    Release date: 2014-10-31

  • Surveys and statistical programs – Documentation: 13-605-X201400214100
    Description:

    Canadian international merchandise trade data are released monthly and may be revised in subsequent releases as new information becomes available. These data are released approximately 35 days following the close of the reference period and represent one of the timeliest economic indicators produced by Statistics Canada. Given their timeliness, some of the data are not received in time and need to be estimated or modelled. This is the case for imports and exports of crude petroleum and natural gas. More specifically, at the time of release, energy trade data are based on an incomplete set of information and are revised as Statistics Canada and National Energy Board information becomes available in the subsequent months. Due to the increasing importance of energy imports and exports and the timeliness of the data, the revisions to energy prices and volumes are having an increasingly significant impact on the monthly revision to Canada’s trade balance. This note explains how the estimates in the initial release are made when data sources are not yet available, and how the original data are adjusted in subsequent releases.

    Release date: 2014-10-03

  • Surveys and statistical programs – Documentation: 99-011-X2011002
    Description:

    The 2011 NHS Aboriginal Peoples Technical Report deals with: (1) Aboriginal ancestry, (2) Aboriginal identity, (3) Registered Indian status and (4) First Nation/Indian band membership.

    The report contains explanations of concepts, data quality, historical comparability and comparability to other sources, as well as information on data collection, processing and dissemination.

    Release date: 2014-05-28
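
The headship rate method mentioned in entry 11-522-X201300014290 above can be sketched in its classical aggregate form: the projected number of households is the sum, over population groups, of the group size multiplied by the share of that group who head a household. This is an illustration only and not the Demosim module, which according to the abstract would assign headship status to each simulated individual annually. The function name, group labels, counts and rates are hypothetical.

```python
def project_households(population, headship_rates):
    """Aggregate headship rate method.

    population: dict mapping group label -> projected number of persons.
    headship_rates: dict mapping group label -> assumed share of persons
        in that group who head a household (between 0 and 1).
    Returns the projected number of households.
    """
    missing = set(population) - set(headship_rates)
    if missing:
        raise KeyError(f"no headship rate for groups: {sorted(missing)}")
    return sum(population[g] * headship_rates[g] for g in population)


if __name__ == "__main__":
    # Hypothetical age groups, counts and rates, for illustration only.
    pop = {"15-34": 1000, "35-64": 2000, "65+": 500}
    rates = {"15-34": 0.30, "35-64": 0.55, "65+": 0.60}
    print(project_households(pop, rates))
```
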
