
Results

All (14) (0 to 10 of 14 results)

  • Articles and reports: 12-001-X201700254888
    Description:

    We discuss developments in sample survey theory and methods covering the past 100 years. Neyman’s 1934 landmark paper laid the theoretical foundations for the probability sampling approach to inference from survey samples. Classical sampling books by Cochran, Deming, Hansen, Hurwitz and Madow, Sukhatme, and Yates, which appeared in the early 1950s, expanded and elaborated the theory of probability sampling, emphasizing unbiasedness, model-free features, and designs that minimize variance for a fixed cost. During the period 1960-1970, theoretical foundations of inference from survey data received attention, with the model-dependent approach generating considerable discussion. Introduction of general-purpose statistical software led to the use of such software with survey data, which in turn led to the design of methods specifically for complex survey data. At the same time, weighting methods, such as regression estimation and calibration, became practical and design consistency replaced unbiasedness as the requirement for standard estimators. A bit later, computer-intensive resampling methods also became practical for large-scale survey samples. Improved computer power led to more sophisticated imputation for missing data, use of more auxiliary data, some treatment of measurement errors in estimation, and more complex estimation procedures. A notable use of models was in the expanded use of small area estimation. Future directions in research and methods will be influenced by budgets, response rates, timeliness, improved data collection devices, and availability of auxiliary data, some of which will come from “Big Data”. Survey taking will be impacted by changing cultural behavior and by a changing physical-technical environment.

    Release date: 2017-12-21

  • Articles and reports: 11-522-X201300014266
    Description:

    Monitors and self-reporting are two methods of measuring energy expended in physical activity, where monitor devices typically have much smaller error variances than do self-reports. The Physical Activity Measurement Survey was designed to compare the two procedures, using replicate observations on the same individual. The replicates permit calibrating the personal report measurement to the monitor measurement and make it possible to estimate components of the measurement error variances. Estimates of the variance components of measurement error in monitor and self-report energy expenditure are given for females in the Physical Activity Measurement Survey.

    Release date: 2014-10-31

  • Articles and reports: 11-536-X200900110808
    Description:

    Let auxiliary information be available for use in the design of a survey sample. Let the sample selection procedure consist of selecting a probability sample, rejecting the sample if the sample mean of an auxiliary variable is not within a specified distance of the population mean, and continuing until a sample is accepted. It is proven that the large sample properties of the regression estimator for the rejective sample are the same as those of the regression estimator for the original selection procedure. Likewise, the usual estimator of variance for the regression estimator is appropriate for the rejective sample. In a Monte Carlo experiment, the large sample properties hold for relatively small samples. Also, the Monte Carlo results are in agreement with the theoretical orders of approximation. The efficiency effect of the described rejective sampling is o(n⁻¹) relative to regression estimation without rejection, but the effect can be important for particular samples.

    Release date: 2009-08-11
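    The rejective selection procedure described in this abstract — draw, check the auxiliary sample mean against the population mean, and redraw until a sample is accepted — can be sketched in a few lines. This is an illustrative Python sketch under simple random sampling without replacement; the function name, tolerance argument, and retry cap are assumptions of the sketch, not part of the paper.

    ```python
    import random

    def rejective_sample(x, n, tol, max_tries=10000, seed=0):
        """Repeatedly draw simple random samples of size n from the auxiliary
        values x, rejecting any sample whose mean is more than tol away from
        the population mean, until one is accepted. Hypothetical helper."""
        rng = random.Random(seed)
        pop_mean = sum(x) / len(x)
        for _ in range(max_tries):
            idx = rng.sample(range(len(x)), n)
            if abs(sum(x[i] for i in idx) / n - pop_mean) <= tol:
                return idx  # indices of the accepted sample
        raise RuntimeError("no sample accepted within max_tries")
    ```

    By construction, any sample the procedure returns has an auxiliary mean within the stated distance of the population mean, which is what drives the efficiency result in the abstract.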

  • Articles and reports: 12-001-X200800110619
    Description:

    Small area prediction based on random effects, called EBLUP, is a procedure for constructing estimates for small geographical areas or small subpopulations using existing survey data. The total of the small area predictors is often forced to equal the direct survey estimate and such predictors are said to be calibrated. Several calibrated predictors are reviewed and a criterion that unifies the derivation of these calibrated predictors is presented. The predictor that is the unique best linear unbiased predictor under the criterion is derived and the mean square error of the calibrated predictors is discussed. Implicit in the imposition of the restriction is the possibility that the small area model is misspecified and the predictors are biased. Augmented models with one additional explanatory variable for which the usual small area predictors achieve the self-calibrated property are considered. Simulations demonstrate that calibrated predictors have slightly smaller bias compared to those of the usual EBLUP predictor. However, if the bias is a concern, a better approach is to use an augmented model with an added auxiliary variable that is a function of area size. In the simulation, the predictors based on the augmented model had smaller MSE than EBLUP when the incorrect model was used for prediction. Furthermore, there was a very small increase in MSE relative to EBLUP if the auxiliary variable was added to the correct model.

    Release date: 2008-06-26
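    The calibration constraint described in this abstract — forcing the small area predictors to sum to the direct survey estimate — can be illustrated with the simplest such adjustment, a proportional (ratio) benchmark. This is a generic sketch only; it is not the paper's best linear unbiased calibrated predictor.

    ```python
    def benchmark_ratio(predictors, direct_total):
        """Scale a list of small-area predictors by a common factor so that
        their total equals the direct survey estimate. Illustrative sketch;
        the function name is an assumption, not from the paper."""
        factor = direct_total / sum(predictors)
        return [p * factor for p in predictors]
    ```

    Any such forced agreement trades some mean square error for coherence with the direct estimate, which is the trade-off the paper analyzes.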

  • Articles and reports: 11-522-X200600110417
    Description:

    The coefficients of regression equations are often parameters of interest for health surveys and such surveys are usually of complex design with differential sampling rates. We give estimators for the regression coefficients for complex surveys that are superior to ordinary expansion estimators under the subject matter model, but also retain desirable design properties. Theoretical and Monte Carlo properties are presented.

    Release date: 2008-03-17
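    As a concrete reference point for regression estimation of a survey mean, here is a minimal, unweighted sketch: the sample mean of y is adjusted by the fitted slope times the gap between the known population mean of x and its sample mean. This generic textbook form is an assumption of the sketch, not the complex-design estimators the paper develops.

    ```python
    def regression_estimator(y, x, pop_mean_x):
        """Simple regression estimator of the mean of y using one auxiliary
        variable x with known population mean. Assumes x varies in the
        sample (nonzero sum of squares)."""
        n = len(y)
        ybar = sum(y) / n
        xbar = sum(x) / n
        sxx = sum((xi - xbar) ** 2 for xi in x)
        b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
        return ybar + b * (pop_mean_x - xbar)
    ```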

  • Articles and reports: 12-001-X20050029041
    Description:

    Hot deck imputation is a procedure in which missing items are replaced with values from respondents. A model supporting such procedures is the model in which response probabilities are assumed equal within imputation cells. An efficient version of hot deck imputation is described for the cell response model and a computationally efficient variance estimator is given. An approximation to the fully efficient procedure in which a small number of values are imputed for each nonrespondent is described. Variance estimation procedures are illustrated in a Monte Carlo study.

    Release date: 2006-02-17
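    The cell hot deck described in this abstract can be sketched as follows: within each imputation cell, a missing item is replaced by a value drawn from the cell's respondents. The names are hypothetical, and the sketch omits the fractional refinement the paper describes, in which several donor values are imputed per nonrespondent.

    ```python
    import random

    def hot_deck_impute(values, cells, seed=0):
        """Cell hot deck: each missing item (None) is replaced by a value
        drawn at random from respondents in the same imputation cell.
        Assumes every cell contains at least one respondent."""
        rng = random.Random(seed)
        donors = {}
        for v, c in zip(values, cells):
            if v is not None:
                donors.setdefault(c, []).append(v)
        return [v if v is not None else rng.choice(donors[c])
                for v, c in zip(values, cells)]
    ```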

  • Articles and reports: 12-001-X20050018091
    Description:

    Procedures for constructing vectors of nonnegative regression weights are considered. A vector of regression weights in which the initial weights are the inverse of the approximate conditional inclusion probabilities is introduced. Through a simulation study, the weighted regression weights, quadratic programming weights, raking ratio weights, weights from a logit procedure, and likelihood-type weights are compared.

    Release date: 2005-07-21
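    Of the weight-construction methods compared in this abstract, raking ratio weighting is the easiest to sketch: base weights are repeatedly scaled so that weighted group totals match known margins, and the result stays nonnegative whenever the base weights are. A minimal Python version for two categorical margins follows; the function name, fixed iteration count, and omission of a convergence check are assumptions of the sketch.

    ```python
    def rake_weights(base_w, groups_a, groups_b, totals_a, totals_b, iters=50):
        """Raking ratio adjustment: alternately scale weights so that the
        weighted totals of two categorical variables match known margins."""
        w = list(base_w)
        for _ in range(iters):
            for groups, totals in ((groups_a, totals_a), (groups_b, totals_b)):
                cur = {}
                for wi, g in zip(w, groups):
                    cur[g] = cur.get(g, 0.0) + wi
                # multiply each weight by (known margin) / (current total)
                w = [wi * totals[g] / cur[g] for wi, g in zip(w, groups)]
        return w
    ```

    Because each step multiplies weights by a positive ratio, raking cannot produce the negative weights that unconstrained regression weighting sometimes does — the motivation for the comparison in the paper.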

  • Articles and reports: 12-001-X20000015176
    Description:

    A components-of-variance approach and an estimated covariance error structure were used in constructing predictors of adjustment factors for the 1990 Decennial Census. The variability of the estimated covariance matrix is the suspected cause of certain anomalies that appeared in the regression estimation and in the estimated adjustment factors. We investigate alternative prediction methods and propose a procedure that is less influenced by variability in the estimated covariance matrix. The proposed methodology is applied to a data set composed of 336 adjustment factors from the 1990 Post Enumeration Survey.

    Release date: 2000-08-30

  • Articles and reports: 12-001-X199400114429
    Description:

    A regression weight generation procedure is applied to the 1987-1988 Nationwide Food Consumption Survey of the U.S. Department of Agriculture. Regression estimation was used because of the large nonresponse in the survey. The regression weights are generalized least squares weights modified so that all weights are positive and so that large weights are smaller than the least squares weights. It is demonstrated that the regression estimator has the potential for large reductions in mean square error relative to the simple direct estimator in the presence of nonresponse.

    Release date: 1994-06-15

  • Articles and reports: 12-001-X199000214537
    Description:

    Repeated surveys in which a portion of the units are observed at more than one time point and some units are not observed at some time points are of primary interest. Least squares estimation for such surveys is reviewed. Included in the discussion are estimation procedures in which existing estimates are not revised when new data become available. Also considered are techniques for the estimation of longitudinal parameters, such as gross change tables. Estimation for a repeated survey of land use conducted by the U.S. Soil Conservation Service is described. The effects of measurement error on gross change estimates are illustrated and it is shown that survey designs constructed to enable estimation of the parameters of the measurement error process can be very efficient.

    Release date: 1990-12-14
