Results

All (389) (0 to 10 of 389 results)

  • Articles and reports: 12-001-X201900100003
    Description:

    In this short article, I will attempt to provide some highlights of my chancy life as a Statistician in chronological order spanning over sixty years, 1954 to present.

    Release date: 2019-05-07

  • Articles and reports: 62F0014M2019002
    Description:

    This paper describes a new methodology that Statistics Canada has adopted to measure the rent index.

    Release date: 2019-02-27

  • Articles and reports: 12-001-X201800254953
    Description:

    Sample coordination seeks to create a probabilistic dependence between the selection of two or more samples drawn from the same population or from overlapping populations. Positive coordination increases the expected sample overlap, while negative coordination decreases it. There are numerous applications for sample coordination with varying objectives. A spatially balanced sample is a sample that is well-spread in some space. Forcing a spread within the selected samples is a general and very efficient variance reduction technique for the Horvitz-Thompson estimator. The local pivotal method and the spatially correlated Poisson sampling are two general schemes for achieving well-spread samples. We aim to introduce coordination for these sampling methods based on the concept of permanent random numbers. The goal is to coordinate such samples while preserving spatial balance. The proposed methods are motivated by examples from forestry, environmental studies, and official statistics.

    Release date: 2018-12-20

  • Articles and reports: 12-001-X201800254958
    Description:

    Domains (or subpopulations) with small sample sizes are called small areas. Traditional direct estimators for small areas do not provide adequate precision because the area-specific sample sizes are small. On the other hand, demand for reliable small area statistics has greatly increased. Model-based indirect estimators of small area means or totals are currently used to address difficulties with direct estimation. These estimators are based on linking models that borrow information across areas to increase the efficiency. In particular, empirical best (EB) estimators under area level and unit level linear regression models with random small area effects have received a lot of attention in the literature. Model mean squared error (MSE) of EB estimators is often used to measure the variability of the estimators. Linearization-based estimators of model MSE as well as jackknife and bootstrap estimators are widely used. On the other hand, National Statistical Agencies are often interested in estimating the design MSE of EB estimators in line with traditional design MSE estimators associated with direct estimators for large areas with adequate sample sizes. Estimators of design MSE of EB estimators can be obtained for area level models but they tend to be unstable when the area sample size is small. Composite MSE estimators are proposed in this paper and they are obtained by taking a weighted sum of the design MSE estimator and the model MSE estimator. Properties of the MSE estimators under the area level model are studied in terms of design bias, relative root mean squared error and coverage rate of confidence intervals. The case of a unit level model is also examined under simple random sampling within each area. Results of a simulation study show that the proposed composite MSE estimators provide a good compromise in estimating the design MSE.

    Release date: 2018-12-20

  • Public use microdata: 82M0020X
    Description: The Canadian Tobacco, Alcohol and Drugs Survey (CTADS) is a biennial general population survey of tobacco, alcohol and drug use among Canadians aged 15 years and older, with the primary focus on 15- to 24-year-olds. The CTADS is a telephone survey conducted by Statistics Canada on behalf of Health Canada.
    Release date: 2018-11-01

  • Surveys and statistical programs – Documentation: 98-306-X
    Description:

    This report describes sampling, weighting and estimation procedures used in the 2016 Census of Population. It provides operational and theoretical justifications for them, and presents the results of the evaluations of these procedures.

    Release date: 2018-09-11

  • Surveys and statistical programs – Documentation: 84-538-X
    Geography: Canada
    Description:

    This document presents the methodology underlying the production of the life tables for Canada, provinces and territories, from reference period 1980/1982 and onward.

    Release date: 2018-06-28

  • Articles and reports: 12-001-X201800154925
    Description:

    This paper develops statistical inference based on a superpopulation model in a finite population setting using ranked set samples (RSS). The samples are constructed without replacement. It is shown that the sample mean of RSS is model unbiased and has smaller mean square prediction error (MSPE) than the MSPE of a simple random sample mean. Using an unbiased estimator of MSPE, the paper also constructs a prediction confidence interval for the population mean. A small-scale simulation study shows that the estimator is as good as a simple random sample (SRS) estimator for poor ranking information. On the other hand, it has higher efficiency than the SRS estimator when the quality of ranking information is good and the cost ratio of obtaining a single unit in RSS and SRS is not very high. The simulation study also indicates that coverage probabilities of prediction intervals are very close to the nominal coverage probabilities. The proposed inferential procedure is applied to a real data set.

    Release date: 2018-06-21

  • Articles and reports: 12-001-X201800154926
    Description:

    This paper investigates the linearization and bootstrap variance estimation for the Gini coefficient and the change between Gini indexes at two periods of time. For the one-sample case, we use the influence function linearization approach suggested by Deville (1999), the without-replacement bootstrap suggested by Gross (1980) for simple random sampling without replacement, and the with-replacement bootstrap of primary sampling units described in Rao and Wu (1988) for multistage sampling. To obtain a two-sample variance estimator, we use the linearization technique by means of partial influence functions (Goga, Deville and Ruiz-Gazen, 2009). We also develop an extension of the studied bootstrap procedures for two-dimensional sampling. The two approaches are compared on simulated data.

    Release date: 2018-06-21

  • Articles and reports: 12-001-X201800154928
    Description:

    A two-phase process was used by the Substance Abuse and Mental Health Services Administration to estimate the proportion of US adults with serious mental illness (SMI). The first phase was the annual National Survey on Drug Use and Health (NSDUH), while the second phase was a random subsample of adult respondents to the NSDUH. Respondents to the second phase of sampling were clinically evaluated for serious mental illness. A logistic prediction model was fit to this subsample with the SMI status (yes or no) determined by the second-phase instrument treated as the dependent variable and related variables collected on the NSDUH from all adults as the model’s explanatory variables. Estimates were then computed for SMI prevalence among all adults and within adult subpopulations by assigning an SMI status to each NSDUH respondent based on comparing his (her) estimated probability of having SMI to a chosen cut point on the distribution of the predicted probabilities. We investigate alternatives to this standard cut point estimator such as the probability estimator. The latter assigns an estimated probability of having SMI to each NSDUH respondent. The estimated prevalence of SMI is the weighted mean of those estimated probabilities. Using data from NSDUH and its subsample, we show that, although the probability estimator has a smaller mean squared error when estimating SMI prevalence among all adults, it has a greater tendency to be biased at the subpopulation level than the standard cut point estimator.

    Release date: 2018-06-21

Data (17) (0 to 10 of 17 results)

  • Public use microdata: 89M0017X
    Description:

    The public use microdata file from the 2010 Canada Survey of Giving, Volunteering and Participating is now available. This file contains information collected from nearly 15,000 respondents aged 15 and over residing in private households in the provinces. The public use microdata file provides provincial-level information about the ways in which Canadians donate money and in-kind gifts to charitable and nonprofit organizations, volunteer their time to these organizations, and provide help directly to others. Socio-demographic, income and labour force data are also included on the file.

    Release date: 2012-05-04

  • Public use microdata: 12M0022X
    Description:

    This package was designed to enable users to access and manipulate the microdata file for Cycle 22 (2008) of the General Social Survey (GSS). It contains information on the objectives, methodology and estimation procedures, as well as guidelines for releasing estimates based on the survey. Cycle 22 collected data from persons 15 years and over living in private households in Canada, excluding residents of the Yukon, Northwest Territories and Nunavut; and full-time residents of institutions. The survey covered a range of topics such as social networks, and social and civic participation. Information was also collected on major changes in respondents' lives in the last 12 months, the resources they used during these transitions and unmet needs for help. Questions were also asked on trust, sense of belonging, volunteering and unpaid work.

    Release date: 2010-03-05

  • Public use microdata: 82M0023X
    Description:

    The Participation and Activity Limitation Survey (PALS) is a post-censal survey of adults with disabilities, including any person whose everyday activities are limited because of a physical condition or health problem.

    The survey covers themes such as activity limitations, help with everyday activities, education, employment status, social participation and economic characteristics.

    Release date: 2009-05-26

  • Public use microdata: 12M0021X
    Description:

    This package was designed to enable users to access and manipulate the microdata file for the 21st cycle (2007) of the General Social Survey (GSS). It contains information on the objectives, methodology and estimation procedures, as well as guidelines for releasing estimates based on the survey. Cycle 21 of the GSS collected data from persons aged 45 years and over living in private households in the 10 provinces of Canada. The survey covered a wide range of topics such as well-being, family composition, retirement decisions and plans, care giving and care receiving experiences, social networks and housing.

    Release date: 2009-05-04

  • Public use microdata: 89M0021X
    Description:

    The Aboriginal Peoples Survey (APS) provides data on the social and economic conditions of Aboriginal people in Canada. Its specific purpose was to identify the needs of Aboriginal people focusing on issues such as health, schooling and language. The survey was designed and implemented in partnership with national Aboriginal organizations.

    This product contains information for the Aboriginal child and youth population (under 15 years) living in off-reserve areas.

    Release date: 2006-05-25

  • Public use microdata: 12M0016X
    Geography: Province or territory
    Description:

    Cycle 16 of the GSS is the second cycle (after cycle 11) to collect information on social support for older Canadians, introducing modules on preparations for retirement and retirement experience. The GSS is an annual telephone survey covering the non-institutionalized population in the 10 provinces. Respondents were randomly selected from a list of individuals aged 45 and over who had responded to another Statistics Canada survey. Data were collected over an 11-month period from February to December 2002. The representative sample had about 25,000 respondents. The response rate was almost 84%.

    The main objective of the 2002 GSS was to provide data on the aging population. However, the survey allows detailed analysis of characteristics of family and friends who provide care to seniors; characteristics of seniors receiving formal and informal care; links to broader determinants of health (such as income, education and social networks); and people's retirement plans and experiences.

    Release date: 2005-11-28

  • Public use microdata: 12M0017X
    Geography: Canada
    Description:

    Topics covered include social contact with friends and relatives, unpaid help given and received, volunteering and charitable giving, civic engagement, political engagement, religious participation, trust and reciprocity. Cycle 17 of the General Social Survey is the first cycle to collect detailed information on social engagement in Canada.

    The target population for Cycle 17 is all persons 15 years of age and older in Canada, excluding residents of the Yukon, Northwest Territories and Nunavut, and full-time residents of institutions.

    Release date: 2004-11-05

  • Public use microdata: 12M0015X
    Geography: Canada
    Description:

    Cycle 15 of the General Social Survey (GSS) is the third cycle to collect detailed information on family life in Canada. The previous GSS cycles that collected family data were Cycles 5 and 10. Topics include demographic characteristics such as age, sex, and marital status; family origin of parents; brothers and sisters; marriages of respondent; common-law unions of respondent; fertility and family intentions; values and attitudes; education history; work history; main activity and other characteristics.

    The target population for Cycle 15 of the GSS is all persons 15 years of age and older in Canada, excluding residents of the Yukon, Northwest Territories, and Nunavut, and full-time residents of institutions.

    Release date: 2003-04-04

  • Public use microdata: 12M0014X
    Geography: Province or territory
    Description:

    This report presents a brief overview of the information collected in Cycle 14 of the General Social Survey (GSS). Cycle 14 is the first cycle to collect detailed information on access to and use of information communication technology in Canada. Topics include general use of technology and computers, technology in the workplace, development of computer skills, frequency of Internet and E-mail use, non-users and security and information on the Internet. The target population of the GSS is all individuals aged 15 and over living in a private household in one of the ten provinces.

    Release date: 2001-06-29

Analysis (301) (0 to 10 of 301 results)

  • Articles and reports: 12-001-X201800154929
    Description:

    The U.S. Census Bureau is investigating nonrespondent subsampling strategies for usage in the 2017 Economic Census. Design constraints include a mandated lower bound on the unit response rate, along with targeted industry-specific response rates. This paper presents research on allocation procedures for subsampling nonrespondents, conditional on the subsampling being systematic. We consider two approaches: (1) equal-probability sampling and (2) optimized allocation with constraints on unit response rates and sample size with the objective of selecting larger samples in industries that have initially lower response rates. We present a simulation study that examines the relative bias and mean squared error for the proposed allocations, assessing each procedure’s sensitivity to the size of the subsample, the response propensities, and the estimation procedure.

    Release date: 2018-06-21

  • Articles and reports: 12-001-X201800154963
    Description:

    The probability-sampling-based framework has dominated survey research because it provides precise mathematical tools to assess sampling variability. However, increasing costs and declining response rates are expanding the use of non-probability samples, particularly in general population settings, where samples of individuals pulled from web surveys are becoming increasingly cheap and easy to access. But non-probability samples are at risk for selection bias due to differential access, degrees of interest, and other factors. Calibration to known statistical totals in the population provides a means of potentially diminishing the effect of selection bias in non-probability samples. Here we show that model calibration using adaptive LASSO can yield a consistent estimator of a population total as long as a subset of the true predictors is included in the prediction model, thus allowing large numbers of possible covariates to be included without risk of overfitting. We show that the model calibration using adaptive LASSO provides improved estimation with respect to mean square error relative to standard competitors such as generalized regression (GREG) estimators when a large number of covariates are required to determine the true model, with effectively no loss in efficiency over GREG when smaller models will suffice. We also derive closed-form variance estimators of population totals, and compare their behavior with bootstrap estimators. We conclude with a real-world example using data from the National Health Interview Survey.

    Release date: 2018-06-21

  • Articles and reports: 62F0014M2017002
    Description:

    This document offers information on changes to the Mortgage Interest Cost Index (MICI), which is one of the Consumer Price Index (CPI) components. It describes the new approach for estimating MICI price movements.

    Release date: 2017-11-17

Reference (95) (0 to 10 of 95 results)

  • Surveys and statistical programs – Documentation: 91-528-X
    Description:

    This manual provides detailed descriptions of the data sources and methods used by Statistics Canada to estimate population. They comprise postcensal and intercensal population estimates; base population; births and deaths; immigration; emigration; non-permanent residents; interprovincial migration; subprovincial estimates of population; population estimates by age, sex and marital status; and census family estimates. A glossary of principal terms is contained at the end of the manual, followed by the standard notation used.

    Until now, literature on the methodological changes for estimates calculations has always been spread throughout various Statistics Canada publications and background papers. This manual provides users of demographic statistics with a comprehensive compilation of the current procedures used by Statistics Canada to prepare population and family estimates.

    Release date: 2015-11-17

  • Surveys and statistical programs – Documentation: 71F0031X2015001
    Geography: Canada
    Description:

    This paper introduces and explains modifications made to the Labour Force Survey estimates in January 2015. Some of these modifications include the adjustment of all LFS estimates to reflect population counts based on the 2011 Census, as well as updates to the 2011 Geography classification system.

    Release date: 2015-01-28

  • Surveys and statistical programs – Documentation: 99-002-X2011001
    Description:

    This report describes sampling and weighting procedures used in the 2011 National Household Survey. It provides operational and theoretical justifications for them, and presents the results of the evaluation studies of these procedures.

    Release date: 2015-01-28

  • Surveys and statistical programs – Documentation: 71F0031X
    Description:

    This paper introduces and explains modifications made to the Labour Force Survey estimates.

    Release date: 2015-01-28

  • Surveys and statistical programs – Documentation: 12-001-X201400111886
    Description:

    A Bayes linear estimator for a finite population is obtained from a two-stage regression model, specified only by the means and variances of some model parameters associated with each stage of the hierarchy. Many common design-based estimators found in the literature can be obtained as particular cases. A new ratio estimator is also proposed for the practical situation in which auxiliary information is available. The same Bayes linear approach is proposed for estimating proportions for multiple categorical data associated with finite population units, which is the main contribution of this work. A numerical example is provided to illustrate it.

    Release date: 2014-06-27

  • Surveys and statistical programs – Documentation: 12-001-X201300211888
    Description:

    When the study variables are functional and storage capacities are limited or transmission costs are high, using survey techniques to select a portion of the observations of the population is an interesting alternative to using signal compression techniques. In this context of functional data, our focus in this study is on estimating the mean electricity consumption curve over a one-week period. We compare different estimation strategies that take account of a piece of auxiliary information such as the mean consumption for the previous period. The first strategy consists in using a simple random sampling design without replacement, then incorporating the auxiliary information into the estimator by introducing a functional linear model. The second approach consists in incorporating the auxiliary information into the sampling designs by considering unequal probability designs, such as stratified and pi designs. We then address the issue of constructing confidence bands for these estimators of the mean. When effective estimators of the covariance function are available and the mean estimator satisfies a functional central limit theorem, it is possible to use a fast technique for constructing confidence bands, based on the simulation of Gaussian processes. This approach is compared with bootstrap techniques that have been adapted to take account of the functional nature of the data.

    Release date: 2014-01-15

  • Surveys and statistical programs – Documentation: 12-001-X201300111828
    Description:

    A question that commonly arises in longitudinal surveys is the issue of how to combine differing cohorts of the survey. In this paper we present a novel method for combining different cohorts, and using all available data, in a longitudinal survey to estimate parameters of a semiparametric model, which relates the response variable to a set of covariates. The procedure builds upon the Weighted Generalized Estimation Equation method for handling missing waves in longitudinal studies. Our method is set up under a joint-randomization framework for estimation of model parameters, which takes into account the superpopulation model as well as the survey design randomization. We also propose a design-based, and a joint-randomization, variance estimation method. To illustrate the methodology we apply it to the Survey of Doctorate Recipients, conducted by the U.S. National Science Foundation.

    Release date: 2013-06-28

  • Surveys and statistical programs – Documentation: 12-001-X201100211606
    Description:

    This paper introduces a U.S. Census Bureau special compilation by presenting four other papers of the current issue: three papers from authors Tillé, Lohr and Thompson as well as a discussion paper from Opsomer.

    Release date: 2011-12-21