Results

All (138) (130 to 140 of 138 results)

  • Articles and reports: 12-001-X198700114468
    Description:

    The National Agricultural Statistics Service, U.S. Department of Agriculture, conducts yield surveys for a variety of field crops in the United States. While field sampling procedures for various crops differ, the same basic survey design is used for all crops. The survey design and current estimators are reviewed. Alternative estimators of yield and production and of the variance of the estimators are presented. Current estimators and alternative estimators are compared, both theoretically and in a Monte Carlo simulation.

    Release date: 1987-06-15
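
    The abstract above compares estimators both theoretically and by Monte Carlo simulation. A minimal sketch of that simulation pattern in Python, with synthetic yield data and two generic per-acre yield estimators (illustrative assumptions only; these are not the NASS estimators):

        import numpy as np

        rng = np.random.default_rng(42)

        # Synthetic population of fields: acreage and total production.
        acres = rng.gamma(shape=4.0, scale=50.0, size=1000)
        production = acres * np.clip(rng.normal(40.0, 8.0, size=1000), 0, None)
        true_yield = production.sum() / acres.sum()      # population per-acre yield

        def mean_of_ratios(a, p):
            return np.mean(p / a)                        # average field-level yield

        def ratio_of_means(a, p):
            return p.sum() / a.sum()                     # combined ratio estimator

        reps, n = 5000, 50
        est1 = np.empty(reps)
        est2 = np.empty(reps)
        for r in range(reps):
            idx = rng.choice(acres.size, size=n, replace=False)
            est1[r] = mean_of_ratios(acres[idx], production[idx])
            est2[r] = ratio_of_means(acres[idx], production[idx])

        for name, e in [("mean of ratios", est1), ("ratio of means", est2)]:
            bias = e.mean() - true_yield
            rmse = np.sqrt(np.mean((e - true_yield) ** 2))
            print(f"{name}: bias={bias:+.3f}, rmse={rmse:.3f}")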

  • Articles and reports: 12-001-X198500214372
    Description:

    The use of a multivariate clustering algorithm to perform stratification for the Labour Force Survey is described. The algorithm developed by Friedman and Rubin (1967) is modified to allow the formation of geographically contiguous strata and to delineate heterogeneous but compact primary sampling units (PSUs) within these strata. Studies dealing with stratification variables, stratification robustness over time, and type of stratification are described.

    Release date: 1985-12-16
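
    Friedman and Rubin’s (1967) algorithm is a hill-climbing “exchange” procedure: units are reassigned between groups whenever the move improves the clustering criterion. A minimal sketch of that exchange step on a within-group sum-of-squares criterion, omitting the geographic-contiguity modification the paper describes (the naive full recomputation of the criterion is deliberate, for clarity):

        import numpy as np

        rng = np.random.default_rng(0)

        def within_ss(X, labels, k):
            """Total within-group sum of squares for a partition."""
            return sum(((X[labels == g] - X[labels == g].mean(axis=0)) ** 2).sum()
                       for g in range(k) if np.any(labels == g))

        def exchange_cluster(X, k=3, max_passes=20):
            """Friedman-Rubin-style hill climbing: move one unit at a time."""
            labels = rng.integers(0, k, size=len(X))   # random initial partition
            best = within_ss(X, labels, k)
            for _ in range(max_passes):
                improved = False
                for i in range(len(X)):
                    home = labels[i]
                    for g in range(k):
                        if g == home:
                            continue
                        labels[i] = g                      # tentative move
                        score = within_ss(X, labels, k)    # recomputed naively
                        if score < best:
                            best, home, improved = score, g, True
                        else:
                            labels[i] = home               # undo the move
                if not improved:
                    break
            return labels, best

        X = rng.normal(size=(60, 2))       # stand-in stratification variables
        labels, score = exchange_cluster(X)
        print("group sizes:", np.bincount(labels), "within-group SS:", round(score, 2))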

  • Articles and reports: 12-001-X198400114351
    Description:

    Most sample surveys conducted by organizations such as Statistics Canada or the U.S. Bureau of the Census employ complex designs. The design-based approach to statistical inference, typically the institutional standard of inference for simple population statistics such as means and totals, may be extended to parameters of analytic models as well. Most of this paper focuses on applying design-based inference to such models, but rationales are offered for using model-based alternatives in some instances, which helps explain the author’s observation that both modes of inference are used in practice at his own institution.

    Within the design-based approach to inference, the paper briefly describes experience with linear regression analysis. Recently, variance computations for a number of surveys of the Census Bureau have been implemented through “replicate weighting”; the principal application has been for variances of simple statistics, but this technique also facilitates variance computation for virtually any complex analytic model. Finally, approaches and experience with log-linear models are reported.

    Release date: 1984-06-15
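
    “Replicate weighting” computes the variance of a statistic by re-estimating it under many sets of perturbed weights. A minimal delete-one-group jackknife sketch (a generic illustration with toy data, not the Census Bureau’s production system); because the same loop re-runs whatever estimator is plugged in, the technique extends to virtually any complex analytic model:

        import numpy as np

        rng = np.random.default_rng(1)
        y = rng.lognormal(mean=3.0, sigma=0.5, size=200)   # toy survey responses
        w = np.full(y.size, 50.0)                          # toy base weights
        G = 20
        groups = np.arange(y.size) % G                     # random replicate groups

        def weighted_mean(y, w):
            return np.sum(w * y) / np.sum(w)

        theta = weighted_mean(y, w)

        # Delete-one-group jackknife: zero out group g, rescale the rest.
        replicates = []
        for g in range(G):
            wr = w.copy()
            wr[groups == g] = 0.0
            wr[groups != g] *= G / (G - 1)
            replicates.append(weighted_mean(y, wr))
        replicates = np.array(replicates)

        var_jk = (G - 1) / G * np.sum((replicates - theta) ** 2)
        print(f"estimate={theta:.2f}, jackknife SE={np.sqrt(var_jk):.3f}")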

  • Articles and reports: 12-001-X198400114352
    Description:

    The paper presents estimation methods for complex survey designs, including estimation of the mean, the ratio and the regression coefficient. The standard errors are estimated by several methods: the ordinary least squares procedure, the stratified weighted sample procedure, the stratified unit weight procedure, etc. Large-sample theory and the conditions for applying it are also presented.

    Release date: 1984-06-15
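
    A minimal sketch of one case the abstract covers: the ratio estimator with a Taylor-linearization standard error under simple random sampling (textbook formulas on invented data, not necessarily the paper’s exact procedures):

        import numpy as np

        def ratio_estimate(y, x, N):
            """Ratio R = sum(y)/sum(x) with a linearization SE under SRS."""
            n = len(y)
            r = y.sum() / x.sum()
            e = y - r * x                          # linearization residuals
            var_r = (1 - n / N) * e.var(ddof=1) / (n * x.mean() ** 2)
            return r, np.sqrt(var_r)

        rng = np.random.default_rng(7)
        x = rng.gamma(3.0, 10.0, size=40)          # auxiliary variable
        y = 2.5 * x + rng.normal(0, 5, size=40)    # study variable
        r, se = ratio_estimate(y, x, N=2000)
        print(f"R-hat={r:.3f}, SE={se:.3f}")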

  • Articles and reports: 12-001-X198300114335
    Description:

    The Canadian Labour Force Survey is a household survey conducted each month for the purpose of producing point-in-time estimates of the number of persons employed, unemployed and not in the labour force. The survey has a rotating panel design in which all individuals in a sampled household location are interviewed each month for six consecutive months. In the past, little use has been made of this longitudinal structure, although considerable interest has been expressed in the month-to-month gross flows (transitions) amongst the labour force status categories. In this paper we discuss methods being considered by Statistics Canada for the production of gross flow estimates from a model-based perspective.

    Release date: 1983-06-15
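
    The observed gross flows are simply a cross-tabulation of consecutive-month labour force states. A minimal sketch of tabulating a flow matrix and the implied transition rates from toy panel data (the model-based adjustments the paper discusses are out of scope here):

        import numpy as np

        # Labour force states: 0 = employed, 1 = unemployed, 2 = not in labour force
        states = ["E", "U", "N"]
        rng = np.random.default_rng(3)

        # Toy panel: status of 10,000 individuals in two consecutive months.
        month1 = rng.choice(3, size=10_000, p=[0.60, 0.05, 0.35])
        flip = rng.random(10_000) < 0.08           # 8% change status at random
        month2 = np.where(flip, rng.choice(3, size=10_000), month1)

        flows = np.zeros((3, 3), dtype=int)
        np.add.at(flows, (month1, month2), 1)      # cross-tabulate transitions

        rates = flows / flows.sum(axis=1, keepdims=True)   # row-conditional rates
        for i, s in enumerate(states):
            print(s, flows[i], rates[i].round(3))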

  • Articles and reports: 12-001-X197900254833
    Description:

    This paper looks at the current state of development of social statistics in Canada. Some key concepts related to statistics and social information are defined and discussed. The availability and analysis of administrative data is highlighted, along with the need for social surveys. Suggestions are made about the types of data analysis needed for the development of social decision models to meet policy requirements. Finally, an outline of priorities for future work toward the effective use of social statistics is given.

    Release date: 1979-12-14

  • Articles and reports: 12-001-X197800154831
    Description: The impact on linear statistics of the sample design used in obtaining survey data is the subject of much of sampling literature. Recently, more attention has been paid to the design’s impact on non-linear statistics; the major factor inhibiting these investigations has been the problem of estimating at least the first two moments of such statistics. The present article examines the problem of estimating the variances of non-linear statistics from complex samples, in the light of existing literature. The behaviour of the chi-square statistic computed from a complex sample to test hypotheses of goodness of fit or independence is studied. Alternative tests are developed and their properties studied in simulation experiments.
    Release date: 1978-06-15
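
    One simple remedy now standard in this literature (a first-order design-effect correction in the spirit of Rao-Scott, offered as an illustration rather than as the paper’s specific tests) deflates the Pearson statistic by an average design effect; the design effect used below is an assumed value:

        import numpy as np
        from scipy import stats

        def pearson_chi2(observed):
            """Pearson X^2 for independence in a two-way table."""
            row = observed.sum(axis=1, keepdims=True)
            col = observed.sum(axis=0, keepdims=True)
            expected = row * col / observed.sum()
            return ((observed - expected) ** 2 / expected).sum()

        table = np.array([[120, 80],
                          [ 60, 90]])
        x2 = pearson_chi2(table)
        df = (table.shape[0] - 1) * (table.shape[1] - 1)

        # Under a complex design, X^2 is inflated on average; a first-order
        # correction divides by an estimated average design effect.
        deff = 1.8                                 # assumed value for illustration
        x2_adj = x2 / deff
        print(f"naive X^2={x2:.2f}, corrected X^2={x2_adj:.2f}, "
              f"p={stats.chi2.sf(x2_adj, df):.4f}")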

  • Articles and reports: 12-001-X197500254825
    Description:

    Random rounding is a technique for ensuring the confidentiality of aggregate statistics. When all the components of a total are randomly rounded independently, and the total itself is also randomly rounded, substantial discrepancies may arise when the published data are aggregated. This paper presents a procedure that avoids such discrepancies while still protecting confidentiality.

    Release date: 1975-12-15
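
    A minimal sketch of unbiased random rounding to base 5, illustrating the discrepancy problem the paper addresses: when cells and total are rounded independently, the rounded cells need not add to the rounded total. The paper’s contribution is to constrain the rounding so published components aggregate consistently; this sketch shows only the unconstrained version that creates the problem.

        import numpy as np

        rng = np.random.default_rng(11)

        def random_round(x, base=5):
            """Unbiased random rounding: round up with probability remainder/base."""
            lower = (x // base) * base
            remainder = x - lower
            up = rng.random(x.shape) < remainder / base
            return lower + base * up

        cells = rng.integers(0, 100, size=8)       # toy table cells
        total = cells.sum()

        r_cells = random_round(cells.astype(float))
        r_total = random_round(np.array([float(total)]))[0]

        print("rounded cells sum to:      ", int(r_cells.sum()))
        print("independently rounded total:", int(r_total))
        print("discrepancy:               ", int(r_cells.sum() - r_total))
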
Stats in brief (3) (3 results)

  • Stats in brief: 89-20-00062022004
    Description:

    Gathering, exploring, analyzing and interpreting data are essential steps in producing information that benefits society, the economy and the environment. In this video, we will discuss the importance of considering data ethics throughout the process of producing statistical information.

    As a prerequisite to this video, make sure to watch the video titled “Data Ethics: An introduction”, also available in Statistics Canada’s data literacy training catalogue.

    Release date: 2022-10-17

  • Stats in brief: 89-20-00062022001
    Description:

    Gathering, exploring, analyzing and interpreting data are essential steps in producing information that benefits society, the economy and the environment. To properly conduct these processes, data ethics must be upheld in order to ensure the appropriate use of data.

    Release date: 2022-05-24

  • Stats in brief: 89-20-00062022002
    Description:

    This video will break down what it means to be FAIR in terms of data and metadata, and how each pillar of FAIR serves to guide data users and producers alike as they navigate their way through the data journey, in order to gain maximum long-term value.

    Release date: 2022-05-24
Articles and reports (134) (20 to 30 of 134 results)

  • Articles and reports: 12-001-X201900100005
    Description:

    Small area estimation using area-level models can sometimes benefit from covariates that are observed subject to random errors, such as covariates that are themselves estimates drawn from another survey. Given estimates of the variances of these measurement (sampling) errors for each small area, one can account for the uncertainty in such covariates using measurement error models (e.g., Ybarra and Lohr, 2008). Two types of area-level measurement error models have been examined in the small area estimation literature. The functional measurement error model assumes that the underlying true values of the covariates with measurement error are fixed but unknown quantities. The structural measurement error model assumes that these true values follow a model, leading to a multivariate model for the covariates observed with error and the original dependent variable. We compare and contrast these two models with the alternative of simply ignoring measurement error when it is present (naïve model), exploring the consequences for prediction mean squared errors of using an incorrect model under different underlying assumptions about the true model. Comparisons done using analytic formulas for the mean squared errors, assuming model parameters are known, yield some surprising results. We also illustrate results with a model fitted to data from the U.S. Census Bureau’s Small Area Income and Poverty Estimates (SAIPE) Program.

    Release date: 2019-05-07
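
    A minimal numeric sketch of the contrast drawn above, using the Ybarra and Lohr (2008) modification of the area-level shrinkage weight to account for covariate sampling variance (model parameters treated as known; all values invented for illustration):

        import numpy as np

        # Area-level model: theta_i = beta * x_i + v_i,  y_i = theta_i + e_i.
        # The covariate is observed as xhat_i = x_i + u_i with Var(u_i) = C_i.
        beta, sigma_v2 = 2.0, 4.0                  # assumed-known model parameters
        psi = np.array([9.0, 4.0, 1.0])            # sampling variances of y_i
        C = np.array([2.0, 0.5, 0.0])              # covariate measurement variances
        y = np.array([21.0, 18.5, 20.2])           # direct estimates
        xhat = np.array([10.0, 9.0, 10.1])         # error-prone covariates

        # The naive shrinkage weight ignores C_i; the adjusted weight inflates
        # the model variance by beta^2 * C_i (Ybarra and Lohr, 2008).
        gamma_naive = sigma_v2 / (sigma_v2 + psi)
        gamma_adj = (sigma_v2 + beta**2 * C) / (sigma_v2 + beta**2 * C + psi)

        pred_naive = gamma_naive * y + (1 - gamma_naive) * beta * xhat
        pred_adj = gamma_adj * y + (1 - gamma_adj) * beta * xhat
        print("naive:   ", pred_naive.round(2))
        print("adjusted:", pred_adj.round(2))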

  • Articles and reports: 12-001-X201800254961
    Description:

    In business surveys, it is common to collect economic variables with highly skewed distributions. In this context, winsorization is frequently used to address the problem of influential values. In stratified simple random sampling, there are two methods for selecting the thresholds involved in winsorization. This article comprises two parts. The first reviews the notation and the concept of a winsorized estimator. The second details the two methods, extends them to the case of Poisson sampling, and then compares them on simulated data sets and on the labour cost and structure of earnings survey carried out by INSEE.

    Release date: 2018-12-20
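
    A minimal sketch of winsorized estimation of a total by simple threshold capping, with arbitrarily chosen thresholds (selecting the threshold well is precisely what the two methods compared in the paper do):

        import numpy as np

        def winsorized_total(y, w, K):
            """Weighted total with values capped at threshold K."""
            y_star = np.minimum(y, K)              # winsorize influential values
            return np.sum(w * y_star)

        rng = np.random.default_rng(5)
        y = rng.lognormal(mean=4.0, sigma=1.5, size=300)   # skewed economic variable
        w = rng.uniform(5, 20, size=300)                   # design weights

        for K in [np.inf, np.quantile(y, 0.99), np.quantile(y, 0.95)]:
            print(f"K={K:10.1f}  total={winsorized_total(y, w, K):12.0f}")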

  • Articles and reports: 11-633-X2018016
    Description:

    Record linkage has been identified as a potential mechanism to add treatment information to the Canadian Cancer Registry (CCR). The purpose of the Canadian Cancer Treatment Linkage Project (CCTLP) pilot is to add surgical treatment data to the CCR. The Discharge Abstract Database (DAD) and the National Ambulatory Care Reporting System (NACRS) were linked to the CCR, and surgical treatment data were extracted. The project was funded through the Cancer Data Development Initiative (CDDI) of the Canadian Partnership Against Cancer (CPAC).

    The CCTLP was developed as a feasibility study in which patient records from the CCR would be linked to surgical treatment records in the DAD and NACRS databases, maintained by the Canadian Institute for Health Information. The target cohort to whom surgical treatment data would be linked was patients aged 19 or older registered on the CCR (2010 through 2012). The linkage was completed in Statistics Canada’s Social Data Linkage Environment (SDLE).

    Release date: 2018-03-27

  • Articles and reports: 12-001-X201700254888
    Description:

    We discuss developments in sample survey theory and methods covering the past 100 years. Neyman’s 1934 landmark paper laid the theoretical foundations for the probability sampling approach to inference from survey samples. Classical sampling books by Cochran, Deming, Hansen, Hurwitz and Madow, Sukhatme, and Yates, which appeared in the early 1950s, expanded and elaborated the theory of probability sampling, emphasizing unbiasedness, model-free features, and designs that minimize variance for a fixed cost. During the period 1960-1970, theoretical foundations of inference from survey data received attention, with the model-dependent approach generating considerable discussion. Introduction of general-purpose statistical software led to the use of such software with survey data, which led to the design of methods specifically for complex survey data. At the same time, weighting methods, such as regression estimation and calibration, became practical and design consistency replaced unbiasedness as the requirement for standard estimators. A bit later, computer-intensive resampling methods also became practical for large-scale survey samples. Improved computer power led to more sophisticated imputation for missing data, use of more auxiliary data, some treatment of measurement errors in estimation, and more complex estimation procedures. A notable use of models was in the expanded use of small area estimation. Future directions in research and methods will be influenced by budgets, response rates, timeliness, improved data collection devices, and availability of auxiliary data, some of which will come from “Big Data”. Survey taking will be impacted by changing cultural behavior and by a changing physical-technical environment.

    Release date: 2017-12-21

  • Articles and reports: 82-003-X201700614829
    Description:

    POHEM-BMI is a microsimulation tool that includes a model of adult body mass index (BMI) and a model of childhood BMI history. This overview describes the development of BMI prediction models for adults and of childhood BMI history, and compares projected BMI estimates with those from nationally representative survey data to establish validity.

    Release date: 2017-06-21

  • Articles and reports: 11-633-X2017006
    Description:

    This paper describes a method of imputing missing postal codes in a longitudinal database. The 1991 Canadian Census Health and Environment Cohort (CanCHEC), which contains information on individuals from the 1991 Census long-form questionnaire linked with T1 tax return files for the 1984-to-2011 period, is used to illustrate and validate the method. The cohort contains up to 28 consecutive fields for postal code of residence, but because of frequent gaps in postal code history, missing postal codes must be imputed. To validate the imputation method, two experiments were devised where 5% and 10% of all postal codes from a subset with full history were randomly removed and imputed.

    Release date: 2017-03-13
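
    A minimal sketch of one simple gap-filling rule for a postal code history, filling an internal run of missing values only when the same code brackets the gap (a hypothetical rule for illustration; the paper’s imputation method is more elaborate). The validation design described above is easy to replicate against such a rule: blank out 5% or 10% of known codes at random, impute, and compare.

        def impute_postal_history(history):
            """Fill internal gaps (None) when the same code brackets the gap.

            A hypothetical rule for illustration; the paper's method differs.
            """
            filled = list(history)
            i = 0
            while i < len(filled):
                if filled[i] is None:
                    j = i
                    while j < len(filled) and filled[j] is None:
                        j += 1                     # end of the gap
                    before = filled[i - 1] if i > 0 else None
                    after = filled[j] if j < len(filled) else None
                    if before is not None and before == after:
                        filled[i:j] = [before] * (j - i)
                    i = j
                else:
                    i += 1
            return filled

        history = ["K1A0B1", None, None, "K1A0B1", None, "M5V2T6"]
        print(impute_postal_history(history))
        # -> ['K1A0B1', 'K1A0B1', 'K1A0B1', 'K1A0B1', None, 'M5V2T6']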

  • Articles and reports: 11-633-X2017005
    Description:

    Hospitalization rates are among the most commonly reported statistics related to health-care service use. The variety of methods for calculating confidence intervals for these and other health-related rates suggests a need to classify, compare and evaluate these methods. Zeno is a tool developed to calculate confidence intervals of rates based on several formulas available in the literature. This report describes the contents of the main sheet of the Zeno Tool and indicates which formulas are appropriate, based on users’ assumptions and scope of analysis.

    Release date: 2017-01-19
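
    Two textbook interval formulas for an event rate, of the kind such a tool catalogues (generic formulas on invented counts, not necessarily those implemented in Zeno):

        import numpy as np
        from scipy import stats

        def rate_ci_normal(events, person_years, alpha=0.05):
            """Normal approximation for a rate = events / person-years."""
            rate = events / person_years
            se = np.sqrt(events) / person_years
            z = stats.norm.ppf(1 - alpha / 2)
            return rate - z * se, rate + z * se

        def rate_ci_poisson_exact(events, person_years, alpha=0.05):
            """'Exact' (Garwood) interval from chi-square quantiles."""
            lo = stats.chi2.ppf(alpha / 2, 2 * events) / 2 if events > 0 else 0.0
            hi = stats.chi2.ppf(1 - alpha / 2, 2 * (events + 1)) / 2
            return lo / person_years, hi / person_years

        events, py = 45, 12_500.0          # e.g., hospitalizations, person-years
        print("normal: (%.5f, %.5f)" % rate_ci_normal(events, py))
        print("exact:  (%.5f, %.5f)" % rate_ci_poisson_exact(events, py))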

  • Articles and reports: 11-633-X2016003
    Description:

    Large national mortality cohorts are used to estimate mortality rates for different socioeconomic and population groups, and to conduct research on environmental health. In 2008, Statistics Canada created a cohort linking the 1991 Census to mortality. The present study describes a linkage of the 2001 Census long-form questionnaire respondents aged 19 years and older to the T1 Personal Master File and the Amalgamated Mortality Database. The linkage tracks all deaths over a 10.6-year period (to the end of 2011).

    Release date: 2016-10-26

  • Articles and reports: 12-001-X201600114546
    Description:

    Adjusting the base weights using weighting classes is a standard approach for dealing with unit nonresponse. A common method is to create nonresponse adjustments that are weighted by the inverse of the assumed response propensity of respondents within weighting classes under a quasi-randomization approach. Little and Vartivarian (2003) questioned the value of weighting the adjustment factor. In practice the models assumed are misspecified, so it is critical to understand the impact that weighting might have in this case. This paper describes the effects on nonresponse-adjusted estimates of means and totals for the population and for domains computed using the weighted and unweighted inverse of the response propensities in stratified simple random sample designs. The performance of these estimators under different conditions, such as different sample allocations, response mechanisms, and population structures, is evaluated. The findings show that, for the scenarios considered, the weighted adjustment has substantial advantages for estimating totals, and that using an unweighted adjustment may lead to serious biases except in very limited cases. Furthermore, unlike the unweighted estimates, the weighted estimates are not sensitive to how the sample is allocated.

    Release date: 2016-06-22
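
    A minimal sketch of the two adjustments compared above: within each weighting class, the response propensity is estimated either as an unweighted response rate or as a base-weighted one, and respondents’ weights are divided by it (toy data; class membership and response are simulated at random):

        import numpy as np

        rng = np.random.default_rng(9)
        n = 1000
        w = rng.uniform(1, 10, size=n)             # base design weights
        cls = rng.integers(0, 4, size=n)           # weighting-class membership
        resp = rng.random(n) < 0.6                 # toy response indicator

        w_unw = w.copy()
        w_wtd = w.copy()
        for c in np.unique(cls):
            in_c = cls == c
            p_unweighted = resp[in_c].mean()                      # plain rate
            p_weighted = w[in_c & resp].sum() / w[in_c].sum()     # base-weighted
            w_unw[in_c & resp] /= p_unweighted
            w_wtd[in_c & resp] /= p_weighted

        # Nonrespondents get zero weight under both adjustments. The weighted
        # adjustment reproduces each class's base-weight total exactly.
        w_unw[~resp] = 0.0
        w_wtd[~resp] = 0.0
        print("adjusted weight sums:", w_unw.sum().round(1), w_wtd.sum().round(1))
        print("original weight sum: ", w.sum().round(1))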

  • Articles and reports: 11-522-X201700014722
    Description:

    The U.S. Census Bureau is researching ways to incorporate administrative data in decennial census and survey operations. Critical to this work is an understanding of the coverage of the population by administrative records. Using federal and third party administrative data linked to the American Community Survey (ACS), we evaluate the extent to which administrative records provide data on foreign-born individuals in the ACS and employ multinomial logistic regression techniques to evaluate characteristics of those who are in administrative records relative to those who are not. We find that overall, administrative records provide high coverage of foreign-born individuals in our sample for whom a match can be determined. The odds of being in administrative records are found to be tied to the processes of immigrant assimilation – naturalization, higher English proficiency, educational attainment, and full-time employment are associated with greater odds of being in administrative records. These findings suggest that as immigrants adapt and integrate into U.S. society, they are more likely to be involved in government and commercial processes and programs for which we are including data. We further explore administrative records coverage for the two largest race/ethnic groups in our sample – Hispanic and non-Hispanic single-race Asian foreign born, finding again that characteristics related to assimilation are associated with administrative records coverage for both groups. However, we observe that neighborhood context impacts Hispanics and Asians differently.

    Release date: 2016-03-24
Journals and periodicals (1) (1 result)

  • Journals and periodicals: 84F0013X
    Geography: Canada, Province or territory
    Description:

    This study was initiated to test the validity of probabilistic linkage methods used at Statistics Canada. It compared the results of data linkages on infant deaths in Canada with infant death data from Nova Scotia and Alberta. It also compared the availability of fetal deaths on the national and provincial files.

    Release date: 1999-10-08
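
    Probabilistic linkage of the kind validated in this study typically scores candidate record pairs with field-level agreement weights in the Fellegi-Sunter style. A minimal sketch with invented m- and u-probabilities (the comparison fields and values here are assumptions, not those of the study):

        import math

        # Assumed m- (match) and u- (non-match) agreement probabilities per field.
        fields = {
            "surname":    {"m": 0.95, "u": 0.01},
            "birth_date": {"m": 0.98, "u": 0.003},
            "postal":     {"m": 0.90, "u": 0.05},
        }

        def pair_weight(agreements):
            """Sum of log2 likelihood ratios over the compared fields."""
            total = 0.0
            for f, agree in agreements.items():
                m, u = fields[f]["m"], fields[f]["u"]
                total += math.log2(m / u) if agree else math.log2((1 - m) / (1 - u))
            return total

        # High scores suggest a link; low scores suggest a non-link.
        print(pair_weight({"surname": True, "birth_date": True, "postal": False}))
        print(pair_weight({"surname": False, "birth_date": False, "postal": True}))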