Results

All (33) (0 to 10 of 33 results)

  • Survey Quality (Archived)
    Articles and reports: 12-001-X201200211751
    Description:

    Survey quality is a multi-faceted concept that originates from two different development paths. One path is the total survey error paradigm, which rests on four pillars providing principles that guide survey design, survey implementation, survey evaluation, and survey data analysis. We should design surveys so that the mean squared error of an estimate is minimized given budget and other constraints. It is important to take all known error sources into account, to monitor major error sources during implementation, to periodically evaluate major error sources and combinations of these sources after the survey is completed, and to study the effects of errors on the survey analysis. In this context, survey quality can be measured by the mean squared error, controlled by observations made during implementation, and improved by evaluation studies (the standard decomposition is sketched after this entry). The paradigm has both strengths and weaknesses. One strength is that research can be defined by error sources; one weakness is that most total survey error assessments are incomplete, in the sense that it is not possible to include the effects of all the error sources.

    The second path is influenced by ideas from the quality management sciences. These sciences concern business excellence in providing products and services, with a focus on customers and competition from other providers. These ideas have had a great influence on many statistical organizations. One effect is the acceptance among data providers that product quality cannot be achieved without sufficient underlying process quality, and that process quality cannot be achieved without good organizational quality. These levels can be controlled and evaluated by service level agreements, customer surveys, paradata analysis using statistical process control, and organizational assessment using business excellence models or other sets of criteria. All levels can be improved by conducting improvement projects chosen by means of priority functions. The ultimate goal of improvement projects is that the processes involved should gradually approach a state where they are error-free. Of course, this might be an unattainable goal, albeit one to strive for.

    It is not realistic to hope for continuous measurements of the total survey error using the mean squared error. Instead, one can hope that continuous quality improvement using management science ideas and statistical methods can minimize biases and other survey process problems, so that the variance becomes an approximation of the mean squared error. If that can be achieved, we have made the two development paths approximately coincide.

    Release date: 2012-12-19
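
    The total survey error path described above rests on minimizing the mean squared error. As a reminder of what that measure is (a textbook decomposition, not a result specific to this article), for an estimator \hat{\theta} of a parameter \theta:

      \mathrm{MSE}(\hat{\theta}) \;=\; \operatorname{Var}(\hat{\theta}) + \bigl[\operatorname{Bias}(\hat{\theta})\bigr]^{2}

    This is the sense in which, if continuous improvement drives the bias terms toward zero, the variance alone becomes an approximation of the mean squared error.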

  • Articles and reports: 12-001-X201200211752
    Description:

    Coca is a bush native to the Amazon rainforest from which cocaine, an illegal alkaloid, is extracted. Asking farmers about the extent of their coca cultivation areas is considered a sensitive question in remote coca-growing regions of Peru. As a consequence, farmers tend not to participate in surveys, do not respond to the sensitive question(s), or underreport their individual coca cultivation areas. Because there is a political and policy interest in accurately and reliably measuring coca-growing areas, survey methodologists need to determine how to encourage response and truthful reporting of sensitive questions related to coca growing. Specific survey strategies applied in our case study included establishing trust with farmers, assuring confidentiality, matching interviewer and respondent characteristics, changing the format of the sensitive question(s), and not enforcing absolute isolation of respondents during the survey. The survey results were validated using satellite data. They suggest that farmers tend to underreport their coca areas, reporting only 35% to 40% of their true extent.

    Release date: 2012-12-19

  • Articles and reports: 12-001-X201200211753
    Description:

    Nonresponse in longitudinal studies often occurs in a nonmonotone pattern. In the Survey of Industrial Research and Development (SIRD), it is reasonable to assume that the nonresponse mechanism is past-value-dependent in the sense that the response propensity of a study variable at time point t depends on response status and observed or missing values of the same variable at time points prior to t. Since this nonresponse is nonignorable, the parametric likelihood approach is sensitive to the specification of parametric models on both the joint distribution of variables at different time points and the nonresponse mechanism. The nonmonotone nonresponse also limits the application of inverse propensity weighting methods. By discarding all observed data from a subject after its first missing value, one can create a dataset with a monotone ignorable nonresponse and then apply established methods for ignorable nonresponse. However, discarding observed data is not desirable and it may result in inefficient estimators when many observed data are discarded. We propose to impute nonrespondents through regression under imputation models carefully created under the past-value-dependent nonresponse mechanism. This method does not require any parametric model on the joint distribution of the variables across time points or the nonresponse mechanism. Performance of the estimated means based on the proposed imputation method is investigated through some simulation studies and empirical analysis of the SIRD data.

    Release date: 2012-12-19

  • Articles and reports: 12-001-X201200211754
    Description:

    The propensity-score-adjustment approach is commonly used to handle selection bias in survey sampling applications, including unit nonresponse and undercoverage. The propensity score is computed using auxiliary variables observed throughout the sample. We discuss some asymptotic properties of propensity-score-adjusted estimators and derive optimal estimators based on a regression model for the finite population. An optimal propensity-score-adjusted estimator can be implemented using an augmented propensity model. Variance estimation is discussed, and the results from two simulation studies are presented.

    Release date: 2012-12-19
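
    As a rough illustration of basic propensity-score adjustment for unit nonresponse (not the optimal or augmented estimators derived in the article), the sketch below fits a logistic response model on auxiliary variables and reweights respondents by the inverse estimated propensity. All data and variable names are simulated placeholders.

      # Minimal propensity-score-adjustment sketch; simulated data, illustrative names.
      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(0)
      n = 1000
      x = rng.normal(size=(n, 2))                                      # auxiliary variables, observed for all units
      responded = rng.random(n) < 1 / (1 + np.exp(-(0.3 + x[:, 0])))   # response indicator
      y = 2 + x[:, 0] + rng.normal(size=n)                             # study variable, used only for respondents
      base_weight = np.full(n, 5.0)                                    # design weights

      # 1. Estimate response propensities from the auxiliary variables.
      phat = LogisticRegression().fit(x, responded).predict_proba(x)[:, 1]

      # 2. Adjust respondent design weights by the inverse estimated propensity.
      adj_weight = base_weight[responded] / phat[responded]

      # 3. Propensity-score-adjusted estimate of the population mean.
      print(np.sum(adj_weight * y[responded]) / np.sum(adj_weight))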

  • Articles and reports: 12-001-X201200211755
    Description:

    Non-response in longitudinal studies is addressed by assessing the accuracy of response propensity models constructed to discriminate between and predict different types of non-response. Particular attention is paid to summary measures derived from receiver operating characteristic (ROC) curves and logit rank plots. The ideas are applied to data from the UK Millennium Cohort Study. The results suggest that the ability to discriminate between and predict non-respondents is not high. Weights generated from the response propensity models lead to only small adjustments in employment transitions. Conclusions are drawn in terms of the potential of interventions to prevent non-response.

    Release date: 2012-12-19
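
    A minimal sketch of the kind of discrimination summary mentioned above: fit a response-propensity model and report the area under the ROC curve, where an AUC near 0.5 indicates little ability to discriminate non-respondents. The data and model are generic simulated placeholders, not the Millennium Cohort Study analysis.

      # Summarize a propensity model's discrimination with the ROC AUC (simulated data).
      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics import roc_auc_score

      rng = np.random.default_rng(1)
      n = 2000
      x = rng.normal(size=(n, 3))                             # covariates from an earlier wave
      p = 1 / (1 + np.exp(-(-0.5 + 0.4 * x[:, 0])))           # weak true dependence on x
      nonresponse = rng.random(n) < p                         # 1 = non-respondent at the next wave

      propensity = LogisticRegression().fit(x, nonresponse).predict_proba(x)[:, 1]
      print("AUC:", roc_auc_score(nonresponse, propensity))   # close to 0.5 = poor discrimination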

  • Articles and reports: 12-001-X201200211756
    Description:

    We propose a new approach to small area estimation based on joint modelling of means and variances. The proposed model and methodology not only improve small area estimators but also yield "smoothed" estimators of the true sampling variances. Maximum likelihood estimation of model parameters is carried out using the EM algorithm because of the non-standard form of the likelihood function. Confidence intervals for small area parameters are derived using a more general decision-theoretic approach, rather than the traditional approach based on minimizing squared error loss. Numerical properties of the proposed method are investigated via simulation studies and compared with competing methods in the literature. Theoretical justification for the effective performance of the resulting estimators and confidence intervals is also provided.

    Release date: 2012-12-19

  • Articles and reports: 12-001-X201200211757
    Description:

    Collinearities among explanatory variables in linear regression models affect estimates from survey data just as they do in non-survey data. Undesirable effects include unnecessarily inflated standard errors, spuriously low or high t-statistics, and parameter estimates with illogical signs. The available collinearity diagnostics are not generally appropriate for survey data because the variance estimators they incorporate do not properly account for stratification, clustering, and survey weights. In this article, we derive condition indexes and variance decompositions to diagnose collinearity problems in complex survey data. The adapted diagnostics are illustrated with data based on a survey of health characteristics.

    Release date: 2012-12-19
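
    For orientation, the sketch below computes conventional condition indexes from the column-scaled design matrix in the usual Belsley manner, ignoring weights, clusters and strata; the article's contribution is precisely the design-adjusted version, which is not reproduced here. Data are simulated.

      # Conventional (non-survey-adjusted) condition indexes via the SVD of the scaled X.
      import numpy as np

      rng = np.random.default_rng(2)
      n = 500
      x1 = rng.normal(size=n)
      x2 = x1 + rng.normal(scale=0.05, size=n)           # nearly collinear with x1
      X = np.column_stack([np.ones(n), x1, x2])

      Xs = X / np.linalg.norm(X, axis=0)                 # scale each column to unit length
      sv = np.linalg.svd(Xs, compute_uv=False)           # singular values
      print(sv.max() / sv)                               # condition indexes; values above ~30 flag trouble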

  • Articles and reports: 12-001-X201200211758
    Description:

    This paper develops two Bayesian methods for inference about finite population quantiles of continuous survey variables from unequal probability sampling. The first method estimates cumulative distribution functions of the continuous survey variable by fitting a number of probit penalized spline regression models on the inclusion probabilities. The finite population quantiles are then obtained by inverting the estimated distribution function. This method is quite computationally demanding. The second method predicts non-sampled values by assuming a smoothly-varying relationship between the continuous survey variable and the probability of inclusion, by modeling both the mean function and the variance function using splines. The two Bayesian spline-model-based estimators yield a desirable balance between robustness and efficiency. Simulation studies show that both methods yield smaller root mean squared errors than the sample-weighted estimator and the ratio and difference estimators described by Rao, Kovar, and Mantel (RKM 1990), and are more robust to model misspecification than the regression through the origin model-based estimator described in Chambers and Dunstan (1986). When the sample size is small, the 95% credible intervals of the two new methods have closer to nominal confidence coverage than the sample-weighted estimator.

    Release date: 2012-12-19

  • Articles and reports: 12-001-X201200211759
    Description:

    A benefit of multiple imputation is that it allows users to make valid inferences using standard methods with simple combining rules. Existing combining rules for multivariate hypothesis tests fail when the sampling error is zero. This paper proposes modified tests for use with finite population analyses of multiply imputed census data for the applications of disclosure limitation and missing data and evaluates their frequentist properties through simulation.

    Release date: 2012-12-19
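
    For context, this is a sketch of the standard (unmodified) combining rules for a scalar quantity across m imputed datasets; the article's point is that the multivariate analogues of such rules fail when the sampling error is zero, which is what motivates the modified tests. The numbers are illustrative.

      # Rubin's standard combining rules for a scalar estimate from m imputations.
      import numpy as np

      est = np.array([10.2, 9.8, 10.5, 10.1, 9.9])       # point estimates, one per completed dataset
      var = np.array([0.40, 0.38, 0.41, 0.39, 0.42])     # within-imputation variances
      m = len(est)

      qbar = est.mean()                                  # combined point estimate
      ubar = var.mean()                                  # average within-imputation variance
      b = est.var(ddof=1)                                # between-imputation variance
      print(qbar, ubar + (1 + 1 / m) * b)                # combined estimate and total variance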

  • Surveys and statistical programs – Documentation: 13-605-X201200511748
    Description:

    This note provides users with a reconciliation between Canadian and American measures of household disposable income, debt and the household credit market debt to disposable income ratio.

    Release date: 2012-12-03
Data (0) (0 results)

No content available at this time.

Analysis (28) (10 to 20 of 28 results)

  • Articles and reports: 12-002-X201200111642
    Description:

    It is generally recommended that weighted estimation approaches be used when analyzing data from a long-form census microdata file. Since such data files are now available in the RDCs, there is a need to provide researchers there with more information about doing weighted estimation with these files. The purpose of this paper is to provide some of this information, in particular how the weight variables were derived for the census microdata files and what weight should be used for different units of analysis. For the 1996, 2001 and 2006 censuses, the same weight variable is appropriate regardless of whether people, families or households are being studied. For the 1991 census, recommendations are more complex: a different weight variable is required for households than for people and families, and additional restrictions apply to obtain the correct weight value for families.

    Release date: 2012-10-25

  • Articles and reports: 82-003-X201200311707
    Geography: Canada
    Description:

    This study compares waist circumference measured using World Health Organization and National Institutes of Health protocols to determine if the results differ significantly, and whether equations can be developed to allow comparison between waist circumference taken at the two different measurement sites.

    Release date: 2012-09-20

  • Articles and reports: 82-003-X201200311693
    Geography: Canada
    Description:

    This study describes an area-based method of calculating standardized, comparable hospitalization rates for areas with varying concentrations of foreign-born, at national and subnational levels.

    Release date: 2012-07-18

  • Articles and reports: 12-001-X201200111680
    Description:

    Survey data are potentially affected by interviewer falsifications with data fabrication being the most blatant form. Even a small number of fabricated interviews might seriously impair the results of further empirical analysis. Besides reinterviews, some statistical approaches have been proposed for identifying this type of fraudulent behaviour. With the help of a small dataset, this paper demonstrates how cluster analysis, which is not commonly employed in this context, might be used to identify interviewers who falsify their work assignments. Several indicators are combined to classify 'at risk' interviewers based solely on the data collected. This multivariate classification seems superior to the application of a single indicator such as Benford's law.

    Release date: 2012-06-27
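
    A sketch of one single-indicator check mentioned above: comparing the first-digit distribution of an interviewer's reported values with Benford's law using a chi-square statistic. The article's argument is that combining several indicators through cluster analysis beats relying on any one of them, and that step is not shown; the data are simulated.

      # First-digit (Benford) check for one interviewer's reported values (simulated).
      import numpy as np
      from scipy.stats import chisquare

      rng = np.random.default_rng(3)
      values = rng.lognormal(mean=8, sigma=1.5, size=400)          # reported amounts

      first_digit = (values // 10 ** np.floor(np.log10(values))).astype(int)
      observed = np.bincount(first_digit, minlength=10)[1:10]      # counts of digits 1..9
      benford = np.log10(1 + 1 / np.arange(1, 10))                 # Benford's expected proportions

      stat, p = chisquare(observed, f_exp=benford * observed.sum())
      print(stat, p)                                               # small p would flag a deviation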

  • Articles and reports: 12-001-X201200111681
    Description:

    This paper focuses on the application of graph theory to the development and testing of survey research instruments. A graph-theoretic approach offers several advantages over conventional approaches in the structure and features of a specifications system for research instruments, especially for large, computer-assisted instruments. One advantage is the ability to verify the connectedness of all components; a second is the ability to simulate an instrument. This approach also allows for the generation of measures to describe an instrument, such as the number of routes and paths. The concept of a 'basis' is discussed in the context of software testing. A basis is the smallest set of paths within an instrument that covers all link-and-node pairings. These paths may be used as an economical and comprehensive set of test cases for instrument testing.

    Release date: 2012-06-27
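
    A toy sketch of the graph-theoretic idea described above, with a hypothetical five-node routing structure: represent questions and routing as a directed graph, verify that all components are connected, and enumerate routes through the instrument. The networkx calls are standard; the example instrument is invented, not taken from the article.

      # Questionnaire routing as a directed graph: connectedness check and route enumeration.
      import networkx as nx

      G = nx.DiGraph()
      G.add_edges_from([
          ("Q1", "Q2"), ("Q1", "Q3"),                    # skip pattern: Q1 routes to Q2 or Q3
          ("Q2", "Q4"), ("Q3", "Q4"),
          ("Q4", "END"),
      ])

      print(nx.is_weakly_connected(G))                   # is every component reachable?
      print(list(nx.all_simple_paths(G, "Q1", "END")))   # distinct routes through the instrument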

  • Articles and reports: 12-001-X201200111682
    Description:

    Sample allocation issues are studied in the context of estimating sub-population (stratum or domain) means as well as the aggregate population mean under stratified simple random sampling. A non-linear programming method is used to obtain "optimal" sample allocation to strata that minimizes the total sample size subject to specified tolerances on the coefficient of variation of the estimators of strata means and the population mean. The resulting total sample size is then used to determine sample allocations for the methods of Costa, Satorra and Ventura (2004) based on compromise allocation and Longford (2006) based on specified "inferential priorities". In addition, we study sample allocation to strata when reliability requirements for domains, cutting across strata, are also specified. Performance of the three methods is studied using data from Statistics Canada's Monthly Retail Trade Survey (MRTS) of single establishments.

    Release date: 2012-06-27
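
    A rough sketch, under stratified simple random sampling, of the kind of non-linear programming allocation described above: minimize the total sample size subject to tolerances on the coefficients of variation of the stratum means and the overall mean. The stratum inputs and tolerances are invented, the continuous solution would still be rounded up in practice, and the article's comparisons with the Costa, Satorra and Ventura (2004) and Longford (2006) allocations are not reproduced.

      # NLP allocation: minimize total n subject to CV tolerances (stratified SRS, invented inputs).
      import numpy as np
      from scipy.optimize import minimize

      N = np.array([2000.0, 5000.0, 800.0])              # stratum population sizes
      S = np.array([15.0, 40.0, 120.0])                  # stratum standard deviations
      ybar = np.array([50.0, 90.0, 400.0])               # stratum means
      W = N / N.sum()
      cv_stratum_tol, cv_overall_tol = 0.10, 0.05        # specified tolerances

      def cv_stratum(n):                                 # CV of each stratum mean under SRS
          return np.sqrt((1 - n / N) * S**2 / n) / ybar

      def cv_overall(n):                                 # CV of the overall stratified mean
          return np.sqrt(np.sum(W**2 * (1 - n / N) * S**2 / n)) / np.sum(W * ybar)

      cons = [{"type": "ineq", "fun": lambda n: cv_stratum_tol - cv_stratum(n)},
              {"type": "ineq", "fun": lambda n: cv_overall_tol - cv_overall(n)}]
      res = minimize(lambda n: n.sum(), x0=np.full(3, 50.0),
                     bounds=[(2, Nh) for Nh in N], constraints=cons)
      print(np.ceil(res.x), res.x.sum())                 # allocation and minimized total size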

  • Articles and reports: 12-001-X201200111683
    Description:

    We consider alternatives to poststratification for doubly classified data in which at least one of the two-way cells is too small to allow poststratification based upon this double classification. In our study data set, the expected count in the smallest cell is 0.36. One approach is simply to collapse cells. This is likely, however, to destroy the double classification structure. Our alternative approaches allow one to maintain the original double classification of the data. The approaches are based upon the calibration study by Chang and Kott (2008). We choose weight adjustments dependent upon the marginal classifications (but not the full cross-classification) to minimize an objective function of the differences between the population counts of the two-way cells and their sample estimates. In the terminology of Chang and Kott (2008), if the row and column classifications have I and J cells respectively, this results in IJ benchmark variables and I + J - 1 model variables. We study the performance of these estimators by selecting simulated simple random samples from the 2005 Quarterly Census of Employment and Wages, which is maintained by the Bureau of Labor Statistics. We use the double classification of state and industry group. In our study, the calibration approaches introduced an asymptotically trivial bias, but reduced the MSE, compared to the unbiased estimator, by as much as 20% for a small sample.

    Release date: 2012-06-27

  • Articles and reports: 12-001-X201200111684
    Description:

    Many business surveys provide estimates for the monthly turnover for the major Standard Industrial Classification codes. This includes estimates for the change in the level of the monthly turnover compared to 12 months ago. Because business surveys often use overlapping samples, the turnover estimates in consecutive months are correlated. This makes the variance calculations for a change less straightforward. This article describes a general variance estimation procedure. The procedure allows for yearly stratum corrections when establishments move into other strata according to their actual sizes. The procedure also takes into account sample refreshments, births and deaths. The paper concludes with an example of the variance for the estimated yearly growth rate of the monthly turnover of Dutch supermarkets.

    Release date: 2012-06-27
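
    The reason overlapping samples make the variance of a change less straightforward is the covariance term in the standard identity below; this is textbook material, not the paper's full procedure, which additionally handles stratum corrections, sample refreshments, births and deaths. For estimated turnover levels \hat{Y}_t and \hat{Y}_{t-12}:

      \operatorname{Var}(\hat{Y}_t - \hat{Y}_{t-12})
        = \operatorname{Var}(\hat{Y}_t) + \operatorname{Var}(\hat{Y}_{t-12})
          - 2\operatorname{Cov}(\hat{Y}_t, \hat{Y}_{t-12})

    and, by Taylor linearization, for the yearly growth rate \hat{R} = \hat{Y}_t / \hat{Y}_{t-12}:

      \operatorname{Var}(\hat{R}) \approx \hat{R}^{2}
        \left( \frac{\operatorname{Var}(\hat{Y}_t)}{\hat{Y}_t^{2}}
             + \frac{\operatorname{Var}(\hat{Y}_{t-12})}{\hat{Y}_{t-12}^{2}}
             - \frac{2\operatorname{Cov}(\hat{Y}_t, \hat{Y}_{t-12})}{\hat{Y}_t \hat{Y}_{t-12}} \right)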

  • Articles and reports: 12-001-X201200111685
    Description:

    Survey data are often used to fit linear regression models. The values of covariates used in modeling are not controlled as they might be in an experiment. Thus, collinearity among the covariates is an inevitable problem in the analysis of survey data. Although many books and articles have described the collinearity problem and proposed strategies to understand, assess and handle its presence, the survey literature has not provided appropriate diagnostic tools to evaluate its impact on regression estimation when the survey complexities are considered. We have developed variance inflation factors (VIFs) that measure the amount by which the variances of parameter estimators are increased by having non-orthogonal predictors. The VIFs are appropriate for survey-weighted regression estimators and account for complex design features, e.g., weights, clusters, and strata. Illustrations of these methods are given using a probability sample from a household survey of health and nutrition.

    Release date: 2012-06-27
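
    For comparison with the survey-weighted VIFs developed in the article, the sketch below computes only the textbook, unweighted VIF, 1 / (1 - R^2), from regressing each predictor on the others; it does not account for weights, clusters or strata. Data are simulated placeholders.

      # Conventional (unweighted) variance inflation factors from auxiliary regressions.
      import numpy as np

      rng = np.random.default_rng(4)
      n = 1000
      x1 = rng.normal(size=n)
      x2 = 0.9 * x1 + rng.normal(scale=0.5, size=n)      # correlated with x1
      x3 = rng.normal(size=n)
      X = np.column_stack([x1, x2, x3])

      def vif(X, j):
          """VIF of column j: regress it on the remaining columns plus an intercept."""
          others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
          beta, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
          r2 = 1 - (X[:, j] - others @ beta).var() / X[:, j].var()
          return 1 / (1 - r2)

      print([round(vif(X, j), 2) for j in range(X.shape[1])])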

  • Articles and reports: 12-001-X201200111686
    Description:

    We present a generalized estimating equations approach for estimating the concordance correlation coefficient and the kappa coefficient from sample survey data. The estimates and their accompanying standard errors need to correctly account for the sampling design. Weighted measures of the concordance correlation coefficient and the kappa coefficient, along with the variances of these measures accounting for the sampling design, are presented. We use the Taylor series linearization method and the jackknife procedure for estimating the standard errors of the resulting parameter estimates. Body measurement and oral health data from the Third National Health and Nutrition Examination Survey are used to illustrate this methodology.

    Release date: 2012-06-27
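
    A minimal sketch of a survey-weighted point estimate of the kappa coefficient for two raters' categorical ratings: build a weight-adjusted agreement table and apply kappa = (p_o - p_e) / (1 - p_e). The article estimates such measures within a generalized estimating equations framework and obtains design-based standard errors by Taylor linearization or the jackknife; those steps are not shown here, and the data are simulated.

      # Weighted kappa point estimate from a weighted cross-classification (simulated data).
      import numpy as np

      rng = np.random.default_rng(5)
      n = 800
      rater1 = rng.integers(0, 3, size=n)
      rater2 = np.where(rng.random(n) < 0.7, rater1, rng.integers(0, 3, size=n))
      w = rng.uniform(0.5, 3.0, size=n)                  # survey weights

      table = np.zeros((3, 3))
      np.add.at(table, (rater1, rater2), w)              # weighted agreement table
      table /= table.sum()

      p_o = np.trace(table)                              # weighted observed agreement
      p_e = table.sum(axis=1) @ table.sum(axis=0)        # agreement expected by chance
      print((p_o - p_e) / (1 - p_e))                     # kappa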
Reference (5) (5 results)

  • Surveys and statistical programs – Documentation: 13-605-X201200511748
    Description:

    This note provides users with a reconciliation between Canadian and American measures of household disposable income, debt and the household credit market debt to disposable income ratio.

    Release date: 2012-12-03

  • Notices and consultations: 13-605-X201200111671
    Description:

    Macroeconomic data for Canada, including Canada's National Accounts (gross domestic product (GDP), saving and net worth), Balance of International Payments (current and capital account surplus or deficit and International Investment Position) and Government Financial Statistics (government deficit and debt) are based on international standards. These international standards are set on a coordinated basis by international organizations including the United Nations, the Organisation for Economic Co-operation and Development (OECD), the International Monetary Fund (IMF), the World Bank and Eurostat, with input from experts around the world. Canada has always played an important role in the development and updating of these standards as they have transformed from the crude guidelines of the early to mid 20th century to the fully articulated standards that exist today.

    The purpose of this document is to introduce a new presentation of the quarterly National Accounts (Income and Expenditure Accounts, Financial Flow Accounts and National Balance Sheet Accounts) that will be published with the conversion of the Canadian National Accounts to the latest international standard - System of National Accounts 2008.

    Release date: 2012-05-30

  • Notices and consultations: 62F0026M2012001
    Geography: Province or territory
    Description:

    This report describes the quality indicators produced for the 2010 Survey of Household Spending. These quality indicators, such as coefficients of variation, nonresponse rates, slippage rates and imputation rates, help users interpret the survey data.

    Release date: 2012-04-25

  • Notices and consultations: 62F0026M2012002
    Description:

    Starting with the 2010 survey year, the Survey of Household Spending (SHS) has used a different collection methodology from previous surveys. The new methodology combines a questionnaire and a diary of expenses. Also, data collection is now continuous throughout the year. This note provides information to users and prospective users of data from the SHS about the methodological differences between the redesigned SHS and the former SHS.

    Release date: 2012-04-25

  • Surveys and statistical programs – Documentation: 98-302-X
    Description:

    The Overview of the Census is a reference document covering each phase of the Census of Population and Census of Agriculture. It provides an overview of the 2011 Census from legislation governing the census to content determination, collection, processing, data quality assessment and data dissemination. It also traces the history of the census from the early days of New France to the present.

    In addition, the Overview of the Census informs users about the steps taken to protect confidential information, along with steps taken to verify the data and minimize errors. It also provides information on the possible uses of census data and covers the different levels of geography and the range of products and services available.

    The Overview of the Census may be useful to both new and experienced users who wish to familiarize themselves with and find specific information about the 2011 Census. The first part covers the Census of Population, while the second is about the Census of Agriculture.

    Release date: 2012-02-08