Inference and foundations

Results

All (82)

All (82) (60 to 70 of 82 results)

  • Surveys and statistical programs – Documentation: 11-522-X19990015650
    Description:

    The U.S. Manufacturing Plant Ownership Change Database (OCD) was constructed using plant-level data taken from the Census Bureau's Longitudinal Research Database (LRD). It contains data on all manufacturing plants that have experienced ownership change at least once during the period 1963-92. This paper reports the status of the OCD and discusses its research possibilities. For an empirical demonstration, data taken from the database are used to study the effects of ownership changes on plant closure.

    Release date: 2000-03-02

  • Articles and reports: 11-522-X19990015654
    Description:

    A meta-analysis was performed to estimate the proportion of liver carcinogens, the proportion of chemicals carcinogenic at any site, and the corresponding proportion of anticarcinogens among chemicals tested in 397 long-term cancer bioassays conducted by the U.S. National Toxicology Program. Although the estimator used was negatively biased, the study provided persuasive evidence for a larger proportion of liver carcinogens (0.43, 90% CI: 0.35, 0.51) than was identified by the NTP (0.28). A larger proportion of chemicals carcinogenic at any site was also estimated (0.59, 90% CI: 0.49, 0.69) than was identified by the NTP (0.51), although this excess was not statistically significant. A larger proportion of anticarcinogens (0.66) was estimated than carcinogens (0.59). Despite the negative bias, it was estimated that 85% of the chemicals were either carcinogenic or anticarcinogenic at some site in some sex-species group. This suggests that most chemicals tested at high enough doses will cause some sort of perturbation in tumor rates.

    Release date: 2000-03-02

  • Surveys and statistical programs – Documentation: 11-522-X19990015658
    Description:

    Radon, a naturally occurring gas found at some level in most homes, is an established risk factor for human lung cancer. The U.S. National Research Council (1999) has recently completed a comprehensive evaluation of the health risks of residential exposure to radon, and developed models for projecting radon lung cancer risks in the general population. This analysis suggests that radon may play a role in the etiology of 10-15% of all lung cancer cases in the United States, although these estimates are subject to considerable uncertainty. In this article, we present a partial analysis of uncertainty and variability in estimates of lung cancer risk due to residential exposure to radon in the United States using a general framework for the analysis of uncertainty and variability that we have developed previously. Specifically, we focus on estimates of the age-specific excess relative risk (ERR) and lifetime relative risk (LRR), both of which vary substantially among individuals.

    Release date: 2000-03-02

  • Articles and reports: 92F0138M2000003
    Description:

    Statistics Canada's interest in a common delineation of the north for statistical analysis purposes evolved from research to devise a classification to further differentiate the largely rural and remote areas that make up 96% of Canada's land area. That research led to the establishment of the census metropolitan area and census agglomeration influenced zone (MIZ) concept. When applied to census subdivisions, the MIZ categories did not work as well in northern areas as in the south. Therefore, the Geography Division set out to determine a north-south divide that would differentiate the north from the south independent of any standard geographic area boundaries.

    This working paper describes the methodology used to define a continuous line across Canada to separate the north from the south, as well as lines marking transition zones on both sides of the north-south line. It also describes the indicators selected to derive the north-south line and makes comparisons to alternative definitions of the north. The resulting classification of the north complements the MIZ classification. Together, census metropolitan areas, census agglomerations, MIZ and the North form a new Statistical Area Classification (SAC) for Canada.

    Two related Geography working papers (catalogue no. 92F0138MPE) provide further details about the MIZ classification. Working paper no. 2000-1 (92F0138MPE00001) briefly describes MIZ and includes tables of selected socio-economic characteristics from the 1991 Census tabulated by the MIZ categories, and working paper no. 2000-2 (92F0138MPE00002) describes the methodology used to define the MIZ classification.

    Release date: 2000-02-03

  • Articles and reports: 62F0014M1998013
    Geography: Canada
    Description:

    The reference population for the Consumer Price Index (CPI) has been represented, since the 1992 updating of the basket of goods and services, by families and unattached individuals living in private urban or rural households. The official CPI is a measure of the average percentage change over time in the cost of a fixed basket of goods and services purchased by Canadian consumers.

    Because of the broadly defined target population of the CPI, the measure has been criticised for failing to reflect the inflationary experiences of certain socio-economic groups. This study examines this question for three sub-groups of the reference population of the CPI. It is an extension of earlier studies on the subject done at Statistics Canada.

    In this document, analytical consumer price indexes for the sub-groups are compared to the analytical index for the whole population calculated at the national geographic level.

    The findings tend to agree with those of earlier Statistics Canada studies on sub-groups in the CPI reference population. Those studies have consistently concluded that a consumer price index established for a given sub-group does not differ substantially from the index for the whole reference population.

    Release date: 1999-05-13

  • Geographic files and documentation: 92F0138M1993001
    Geography: Canada
    Description:

    The Geography Divisions of Statistics Canada and the U.S. Bureau of the Census have commenced a cooperative research program in order to foster an improved and expanded perspective on geographic areas and their relevance. One of the major objectives is to determine a common geographic area to form a geostatistical basis for cross-border research, analysis and mapping.

    This report, which represents the first stage of the research, provides a list of comparable pairs of Canadian and U.S. standard geographic areas based on current definitions. Statistics Canada and the U.S. Bureau of the Census have two basic types of standard geographic entities: legislative/administrative areas (called "legal" entities in the U.S.) and statistical areas.

    The preliminary pairing of geographic areas is based on face-value definitions only. The definitions are based on the June 4, 1991 Census of Population and Housing for Canada and the April 1, 1990 Census of Population and Housing for the U.S.A. The important aspect is the overall conceptual comparability, not the precise numerical thresholds used for delineating the areas.

    Data users should use this report as a general guide to compare the census geographic areas of Canada and the United States, and should be aware that differences in settlement patterns and population levels preclude a precise one-to-one relationship between conceptually similar areas. The geographic areas compared in this report provide a framework for further empirical research and analysis.

    Release date: 1999-03-05

  • Surveys and statistical programs – Documentation: 12-001-X19970013101
    Description:

    In the main body of statistics, sampling is often disposed of by assuming a sampling process that selects random variables such that they are independent and identically distributed (IID). Important techniques, like regression and contingency table analysis, were developed largely in the IID world; hence, adjustments are needed to use them in complex survey settings. Rather than adjust the analysis, however, what is new in the present formulation is to draw a second sample from the original sample. In this second sample, the first set of selections is inverted, so as to yield at the end a simple random sample. Of course, to employ this two-step process to draw a single simple random sample from the usually much larger complex survey would be inefficient, so multiple simple random samples are drawn and a way to base inferences on them is developed. Not all original samples can be inverted, but many practical special cases are discussed which cover a wide range of practices.

    Release date: 1997-08-18

  • Surveys and statistical programs – Documentation: 12-001-X19970013102
    Description:

    The selection of auxiliary variables is considered for regression estimation in finite populations under a simple random sampling design. This problem is a basic one for model-based and model-assisted survey sampling approaches and is of practical importance when the number of variables available is large. An approach is developed in which a mean squared error estimator is minimised. This approach is compared to alternative approaches using a fixed set of auxiliary variables, a conventional significance test criterion, a condition number reduction approach and a ridge regression approach. The proposed approach is found to perform well in terms of efficiency. It is noted that the variable selection approach affects the properties of standard variance estimators and thus leads to a problem of variance estimation.

    Release date: 1997-08-18

  • Surveys and statistical programs – Documentation: 12-001-X19960022980
    Description:

    In this paper, we study a confidence interval estimation method for a finite population average when some auxiliary information is available. As demonstrated by Royall and Cumberland in a series of empirical studies, naive use of existing methods to construct confidence intervals for population averages may result in very poor conditional coverage probabilities, conditional on the sample mean of the covariate. When this happens, we propose to transform the data to improve the precision of the normal approximation. The transformed data are then used to make inference on the original population average, and the auxiliary information is incorporated into the inference directly, or by calibration with empirical likelihood. Our approach is design-based. We apply our approach to six real populations and find that when transformation is needed, our approach performs well compared to the usual regression method.

    Release date: 1997-01-30

  • Articles and reports: 91F0015M1996001
    Geography: Canada
    Description:

    This paper describes the methodology for fertility projections used in the 1993-based population projections by age and sex for Canada, provinces and territories, 1993-2016. A new version of the parametric model known as the Pearsonian Type III curve was applied for projecting the age pattern of fertility. The Pearsonian Type III model is considered an improvement over the Type I used in past projections. This is because the Type III curve better portrays both the distribution of the age-specific fertility rates and the estimates of births. Since the 1993-based population projections are the first official projections to incorporate the net census undercoverage in the population base, it has been necessary to recalculate fertility rates based on the adjusted population estimates. This recalculation resulted in lowering the historical series of age-specific and total fertility rates, 1971-1993. The three sets of fertility assumptions and projections were developed with these adjusted annual fertility rates.

    It is hoped that this paper will provide valuable information about the technical and analytical aspects of the current fertility projection model. Discussions on the current and future levels and age pattern of fertility in Canada, provinces and territories are also presented in the paper.

    Release date: 1996-08-02
Analysis (69)

Analysis (69) (60 to 70 of 69 results)

  • Articles and reports: 12-001-X199200214487
    Description:

    This paper reviews the idea of robustness for randomisation and model-based inference for descriptive and analytic surveys. The lack of robustness for model-based procedures can be partially overcome by careful design. In this paper a robust model-based approach to analysis is proposed based on smoothing methods.

    Release date: 1992-12-15

  • Articles and reports: 12-001-X199200214488
    Description:

    In many finite population sampling problems, the design that is optimal in the sense of minimizing the variance of the best linear unbiased estimator under a particular working model is bad in the sense of robustness: it leaves the estimator extremely vulnerable to bias if the working model is incorrect. However, there are some important models under which one design provides both efficiency and robustness. We present a theorem that identifies such models and their optimal designs.

    Release date: 1992-12-15

  • Articles and reports: 12-001-X199100214504
    Description:

    Simple or marginal quota surveys are analyzed using two methods: (1) behaviour modelling (superpopulation model) and prediction estimation, and (2) sample modelling (simple restricted random sampling) and estimation derived from the sample distribution. In both cases the limitations of the theory used to establish the variance formulas and estimates when measuring totals are described. An extension of the quota method (non-proportional quotas) is also briefly described and analyzed. In some cases, this may provide a very significant improvement in survey precision. The advantages of the quota method are compared with those of random sampling. The latter remains indispensable in the case of large scale surveys within the framework of Official Statistics.

    Release date: 1991-12-16

  • Articles and reports: 12-001-X199100114521
    Description:

    Marginal and approximate conditional likelihoods are given for the correlation parameters in a normal linear regression model with correlated errors. This general likelihood approach is applied to obtain marginal and approximate conditional likelihoods for the correlation parameters in sampling on successive occasions under both simple random sampling on each occasion and more complex surveys.

    Release date: 1991-06-14

  • Articles and reports: 12-001-X199000114560
    Description:

    Early developments in sampling theory and methods largely concentrated on efficient sampling designs and associated estimation techniques for population totals or means. More recently, the theoretical foundations of survey based estimation have also been critically examined, and formal frameworks for inference on totals or means have emerged. During the past 10 years or so, rapid progress has also been made in the development of methods for the analysis of survey data that take account of the complexity of the sampling design. The scope of this paper is restricted to an overview and appraisal of some of these developments.

    Release date: 1990-06-15

  • Articles and reports: 12-001-X198900214568
    Description:

    The paper describes a Monte Carlo study of simultaneous confidence interval procedures for k > 2 proportions, under a model of two-stage cluster sampling. The procedures investigated include: (i) standard multinomial intervals; (ii) Scheffé intervals based on sample estimates of the variances of cell proportions; (iii) Quesenberry-Hurst intervals adapted for clustered data using Rao and Scott’s first and second order adjustments to X^2; (iv) simple Bonferroni intervals; (v) Bonferroni intervals based on transformations of the estimated proportions; (vi) Bonferroni intervals computed using the critical points of Student’s t. In several realistic situations, actual coverage rates of the multinomial procedures were found to be seriously depressed compared to the nominal rate. The best performing intervals, from the point of view of coverage rates and coverage symmetry (an extension of an idea due to Jennings), were the t-based Bonferroni intervals derived using log and logit transformations. Of the Scheffé-like procedures, the best performance was provided by Quesenberry-Hurst intervals in combination with first-order Rao-Scott adjustments.

    Release date: 1989-12-15

  • Articles and reports: 12-001-X198500114364
    Description:

    Conventional methods of inference in survey sampling are critically examined. The need for conditioning the inference on recognizable subsets of the population is emphasized. A number of real examples involving random sample sizes are presented to illustrate inferences conditional on the realized sample configuration and associated difficulties. The examples include the following: estimation of (a) population mean under simple random sampling; (b) population mean in the presence of outliers; (c) domain total and domain mean; (d) population mean with two-way stratification; (e) population mean in the presence of non-responses; (f) population mean under general designs. The conditional bias and the conditional variance of estimators of a population mean (or a domain mean or total), and the associated confidence intervals, are examined.

    Release date: 1985-06-14

  • Articles and reports: 12-001-X198400114351
    Description:

    Most sample surveys conducted by organizations such as Statistics Canada or the U.S. Bureau of the Census employ complex designs. The design-based approach to statistical inference, typically the institutional standard of inference for simple population statistics such as means and totals, may be extended to parameters of analytic models as well. Most of this paper focuses on application of design-based inferences to such models, but rationales are offered for use of model-based alternatives in some instances, by way of explanation for the author’s observation that both modes of inference are used in practice at his own institution.

    Within the design-based approach to inference, the paper briefly describes experience with linear regression analysis. Recently, variance computations for a number of surveys of the Census Bureau have been implemented through “replicate weighting”; the principal application has been for variances of simple statistics, but this technique also facilitates variance computation for virtually any complex analytic model. Finally, approaches and experience with log-linear models are reported.

    Release date: 1984-06-15

  • Articles and reports: 12-001-X198100214319
    Description:

    The problems associated with making analytical inferences from data based on complex sample designs are reviewed. A basic issue is the definition of the parameter of interest and whether it is a superpopulation model parameter or a finite population parameter. General methods based on a generalized Wald statistic and its modifications, or on modifications of classical test statistics, are discussed. More detail is given on specific methods for linear models and regression and for categorical data analysis.

    Release date: 1981-12-15