Results

All (33)

  • Survey Quality
    Articles and reports: 12-001-X201200211751
    Description:

    Survey quality is a multi-faceted concept that originates from two different development paths. One path is the total survey error paradigm, which rests on four pillars providing principles that guide survey design, survey implementation, survey evaluation, and survey data analysis. We should design surveys so that the mean squared error of an estimate is minimized given budget and other constraints. It is important to take all known error sources into account, to monitor major error sources during implementation, to periodically evaluate major error sources and combinations of these sources after the survey is completed, and to study the effects of errors on the survey analysis. In this context, survey quality can be measured by the mean squared error, controlled by observations made during implementation, and improved by evaluation studies. The paradigm has both strengths and weaknesses. One strength is that research can be organized by error source; one weakness is that most total survey error assessments are incomplete, in the sense that it is not possible to include the effects of all the error sources.

    The second path is influenced by ideas from the quality management sciences. These sciences concern business excellence in providing products and services, with a focus on customers and competition from other providers. These ideas have had a great influence on many statistical organizations. One effect is the acceptance among data providers that product quality cannot be achieved without sufficient underlying process quality, and process quality cannot be achieved without good organizational quality. These levels can be controlled and evaluated through service level agreements, customer surveys, paradata analysis using statistical process control, and organizational assessment using business excellence models or other sets of criteria. All levels can be improved by conducting improvement projects chosen by means of priority functions.
The ultimate goal of improvement projects is that the processes involved should gradually approach a state where they are error-free. Of course, this might be an unattainable goal, albeit one to strive for. It is not realistic to hope for continuous measurements of the total survey error using the mean squared error. Instead one can hope that continuous quality improvement using management science ideas and statistical methods can minimize biases and other survey process problems so that the variance becomes an approximation of the mean squared error. If that can be achieved we have made the two development paths approximately coincide.

    Release date: 2012-12-19
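The mean squared error criterion at the heart of the total survey error paradigm can be illustrated with a toy simulation (all numbers hypothetical, not from the article): an estimator's MSE decomposes exactly into its variance plus its squared bias.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative sketch: simulate a survey mean with a hypothetical systematic
# error (e.g. consistent underreporting) and check MSE = variance + bias^2.
true_mean = 100.0
n, reps = 200, 5000
bias_shift = 2.0  # hypothetical systematic error

estimates = np.array([
    rng.normal(true_mean + bias_shift, 15.0, n).mean() for _ in range(reps)
])

bias = estimates.mean() - true_mean
variance = estimates.var()                  # ddof=0, so the identity is exact
mse = ((estimates - true_mean) ** 2).mean()

assert np.isclose(mse, variance + bias**2)  # MSE = variance + bias^2
print(f"bias={bias:.2f}, variance={variance:.2f}, mse={mse:.2f}")
```

The decomposition is why minimizing variance alone is not enough: a low-variance but biased design can still have a large MSE, which is the abstract's motivation for monitoring all major error sources.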

  • Articles and reports: 12-001-X201200211752
    Description:

    Coca is a bush native to the Amazon rainforest from which cocaine, an illegal alkaloid, is extracted. Asking farmers about the extent of their coca cultivation is considered a sensitive question in remote coca-growing regions of Peru. As a consequence, farmers tend not to participate in surveys, not to respond to the sensitive question(s), or to underreport their individual coca cultivation areas. Because there is political and policy interest in measuring coca-growing areas accurately and reliably, survey methodologists need to determine how to encourage response and truthful reporting of sensitive questions related to coca growing. Specific survey strategies applied in our case study included establishing trust with farmers, assuring confidentiality, matching interviewer and respondent characteristics, changing the format of the sensitive question(s), and not enforcing absolute isolation of respondents during the survey. The survey results were validated using satellite data. They suggest that farmers underreport their coca areas, reporting only 35% to 40% of their true extent.

    Release date: 2012-12-19

  • Articles and reports: 12-001-X201200211753
    Description:

    Nonresponse in longitudinal studies often occurs in a nonmonotone pattern. In the Survey of Industrial Research and Development (SIRD), it is reasonable to assume that the nonresponse mechanism is past-value-dependent in the sense that the response propensity of a study variable at time point t depends on response status and observed or missing values of the same variable at time points prior to t. Since this nonresponse is nonignorable, the parametric likelihood approach is sensitive to the specification of parametric models on both the joint distribution of variables at different time points and the nonresponse mechanism. The nonmonotone nonresponse also limits the application of inverse propensity weighting methods. By discarding all observed data from a subject after the first missing value, one can create a dataset with a monotone ignorable nonresponse and then apply established methods for ignorable nonresponse. However, discarding observed data is not desirable, and it may result in inefficient estimators when many observed values are discarded. We propose to impute nonrespondents through regression under imputation models carefully created under the past-value-dependent nonresponse mechanism. This method does not require any parametric model on the joint distribution of the variables across time points or the nonresponse mechanism. Performance of the estimated means based on the proposed imputation method is investigated through simulation studies and empirical analysis of the SIRD data.

    Release date: 2012-12-19
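A minimal sketch of the regression-imputation idea (deliberately simplified: here missingness is ignorable and the model conditions only on the previous wave, whereas the paper builds imputation models under the past-value-dependent mechanism; all variable names and numbers are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two waves of a longitudinal variable; wave t is partly missing.
n = 1000
y_prev = rng.normal(50, 10, n)                   # wave t-1, fully observed
y_curr = 5 + 0.9 * y_prev + rng.normal(0, 4, n)  # wave t, partly missing

missing = rng.random(n) < 0.3
observed = ~missing

# Fit y_t ~ y_{t-1} on respondents only (ordinary least squares via lstsq).
X = np.column_stack([np.ones(observed.sum()), y_prev[observed]])
beta, *_ = np.linalg.lstsq(X, y_curr[observed], rcond=None)

# Fill in model predictions for nonrespondents.
y_imputed = y_curr.copy()
y_imputed[missing] = beta[0] + beta[1] * y_prev[missing]

print(f"estimated mean after imputation: {y_imputed.mean():.2f}")
```

In this toy setup missingness does not depend on the values themselves, so a simple regression works; the paper's contribution is constructing the imputation models so they remain valid when the response propensity depends on past values and response history.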

  • Articles and reports: 12-001-X201200211754
    Description:

    The propensity-score-adjustment approach is commonly used to handle selection bias in survey sampling applications, including unit nonresponse and undercoverage. The propensity score is computed using auxiliary variables observed throughout the sample. We discuss some asymptotic properties of propensity-score-adjusted estimators and derive optimal estimators based on a regression model for the finite population. An optimal propensity-score-adjusted estimator can be implemented using an augmented propensity model. Variance estimation is discussed and the results from two simulation studies are presented.

    Release date: 2012-12-19
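The basic mechanics of propensity-score adjustment for unit nonresponse can be sketched as follows (toy data, not the paper's setup; the grouped propensity estimate stands in for a model-based score):

```python
import numpy as np

rng = np.random.default_rng(7)

# Response depends on an auxiliary x observed for every sampled unit,
# so the respondents-only mean of y is biased.
n = 20000
x = rng.normal(0, 1, n)
y = 10 + 3 * x + rng.normal(0, 1, n)   # true population mean of y is 10
p_true = 1 / (1 + np.exp(-(0.5 + x)))  # response propensity rises with x
respond = rng.random(n) < p_true

# Estimate propensities within quintiles of x (a simple grouped estimator).
cuts = np.quantile(x, [0.2, 0.4, 0.6, 0.8])
group = np.searchsorted(cuts, x)
p_hat = np.array([respond[group == g].mean() for g in range(5)])

w = 1 / p_hat[group[respond]]          # inverse-propensity weights
naive_mean = y[respond].mean()
ipw_mean = np.average(y[respond], weights=w)

print(f"naive={naive_mean:.2f}, adjusted={ipw_mean:.2f}")
```

Weighting respondents by the inverse of their estimated response probability removes most of the selection bias; the paper goes further, deriving the optimal form of such estimators under a regression model and an augmented propensity model.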

  • Articles and reports: 12-001-X201200211755
    Description:

    Non-response in longitudinal studies is addressed by assessing the accuracy of response propensity models constructed to discriminate between and predict different types of non-response. Particular attention is paid to summary measures derived from receiver operating characteristic (ROC) curves and logit rank plots. The ideas are applied to data from the UK Millennium Cohort Study. The results suggest that the ability to discriminate between and predict non-respondents is not high. Weights generated from the response propensity models lead to only small adjustments in employment transitions. Conclusions are drawn in terms of the potential of interventions to prevent non-response.

    Release date: 2012-12-19
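The ROC-based summary the abstract relies on can be computed directly from propensity scores and response outcomes; a sketch using the rank (Mann-Whitney) formulation of the AUC, with purely hypothetical data:

```python
import numpy as np

def roc_auc(scores, labels):
    """AUC via the rank (Mann-Whitney) formulation; assumes continuous
    scores, so ties are not specially handled."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels.astype(bool)
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Toy illustration: a weakly predictive score yields an AUC only modestly
# above 0.5, echoing the paper's finding of limited discriminative power.
rng = np.random.default_rng(11)
x = rng.normal(0, 1, 5000)
nonresponse = (rng.random(5000) < 1 / (1 + np.exp(-0.4 * x))).astype(int)
print(f"AUC = {roc_auc(x, nonresponse):.2f}")
```

An AUC of 0.5 means the score cannot distinguish respondents from nonrespondents at all, and 1.0 means perfect separation; values near 0.6, as in this sketch, indicate the weak discrimination the abstract describes.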

  • Articles and reports: 12-001-X201200211756
    Description:

    We propose a new approach to small area estimation based on joint modelling of means and variances. The proposed model and methodology not only improve small area estimators but also yield "smoothed" estimators of the true sampling variances. Maximum likelihood estimation of model parameters is carried out using the EM algorithm due to the non-standard form of the likelihood function. Confidence intervals of small area parameters are derived using a more general decision theory approach, unlike the traditional way based on minimizing the squared error loss. Numerical properties of the proposed method are investigated via simulation studies and compared with competing methods in the literature. Theoretical justification for the effective performance of the resulting estimators and confidence intervals is also provided.

    Release date: 2012-12-19
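The gain from small area modelling can be illustrated with a highly simplified composite (shrinkage) estimator, using an area-level model with known variance components. This is a basic textbook sketch, not the paper's joint mean-variance model fitted by EM; all numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# Each area's direct estimate is pulled toward the overall mean, more
# strongly when its sampling variance is large.
m = 400
sigma_v2 = 4.0                                  # between-area variance (assumed known)
D = rng.uniform(1.0, 9.0, m)                    # known area-level sampling variances
theta = rng.normal(20.0, np.sqrt(sigma_v2), m)  # true area means
direct = theta + rng.normal(0.0, np.sqrt(D))    # direct survey estimates

gamma = sigma_v2 / (sigma_v2 + D)               # shrinkage weight on the direct estimate
composite = gamma * direct + (1 - gamma) * direct.mean()

mse_direct = ((direct - theta) ** 2).mean()
mse_composite = ((composite - theta) ** 2).mean()
print(f"MSE direct={mse_direct:.2f}, composite={mse_composite:.2f}")
```

In practice the variance components are unknown; the paper's point is that jointly modelling means and variances both improves the area estimates and smooths the noisy sampling-variance estimates that feed the shrinkage weights.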

  • Articles and reports: 12-001-X201200211757
    Description:

    Collinearities among explanatory variables in linear regression models affect estimates from survey data just as they do in non-survey data. Undesirable effects are unnecessarily inflated standard errors, spuriously low or high t-statistics, and parameter estimates with illogical signs. The available collinearity diagnostics are not generally appropriate for survey data because the variance estimators they incorporate do not properly account for stratification, clustering, and survey weights. In this article, we derive condition indexes and variance decompositions to diagnose collinearity problems in complex survey data. The adapted diagnostics are illustrated with data based on a survey of health characteristics.

    Release date: 2012-12-19
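For context, the classical (unweighted) condition-index diagnostic the article builds on can be sketched as below. Note this simple version ignores stratification, clustering, and survey weights, which is exactly the gap the article addresses; the data are hypothetical.

```python
import numpy as np

def condition_indexes(X):
    """Classical condition indexes: scale each column of the design matrix
    to unit length, then take ratios of the largest singular value to each
    singular value."""
    Z = X / np.linalg.norm(X, axis=0)
    s = np.linalg.svd(Z, compute_uv=False)
    return s.max() / s

rng = np.random.default_rng(5)
n = 500
x1 = rng.normal(0, 1, n)
x2 = x1 + rng.normal(0, 0.01, n)   # nearly collinear with x1
x3 = rng.normal(0, 1, n)
X = np.column_stack([np.ones(n), x1, x2, x3])

# A large condition index (a common rule of thumb is > 30) flags a damaging
# near-dependency among the columns.
print(condition_indexes(X).max())
```

The accompanying variance decompositions attribute each coefficient's variance to the singular values, showing which coefficients a given near-dependency harms; the article derives survey-design-consistent versions of both quantities.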

  • Articles and reports: 12-001-X201200211758
    Description:

    This paper develops two Bayesian methods for inference about finite population quantiles of continuous survey variables from unequal probability sampling. The first method estimates cumulative distribution functions of the continuous survey variable by fitting a number of probit penalized spline regression models on the inclusion probabilities. The finite population quantiles are then obtained by inverting the estimated distribution function. This method is quite computationally demanding. The second method predicts non-sampled values by assuming a smoothly-varying relationship between the continuous survey variable and the probability of inclusion, by modeling both the mean function and the variance function using splines. The two Bayesian spline-model-based estimators yield a desirable balance between robustness and efficiency. Simulation studies show that both methods yield smaller root mean squared errors than the sample-weighted estimator and the ratio and difference estimators described by Rao, Kovar, and Mantel (RKM 1990), and are more robust to model misspecification than the regression through the origin model-based estimator described in Chambers and Dunstan (1986). When the sample size is small, the 95% credible intervals of the two new methods have closer to nominal confidence coverage than the sample-weighted estimator.

    Release date: 2012-12-19
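The baseline sample-weighted quantile estimator that the paper's Bayesian spline methods are compared against (not the proposed method itself) works by inverting the weighted empirical CDF; a minimal sketch with hypothetical data:

```python
import numpy as np

def weighted_quantile(y, w, q):
    """Invert the design-weighted empirical CDF at probability q."""
    order = np.argsort(y)
    y_sorted, w_sorted = y[order], w[order]
    cdf = np.cumsum(w_sorted) / w_sorted.sum()
    idx = min(np.searchsorted(cdf, q), len(y_sorted) - 1)
    return y_sorted[idx]

# Units with large weights pull the quantile toward their values.
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = np.array([1.0, 1.0, 1.0, 1.0, 6.0])   # last unit represents 60% of the population
print(weighted_quantile(y, w, 0.5))
```

The paper's first method replaces this step-function CDF with penalized spline estimates fitted on the inclusion probabilities before inverting, which is what buys the robustness-efficiency balance described in the abstract.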

  • Articles and reports: 12-001-X201200211759
    Description:

    A benefit of multiple imputation is that it allows users to make valid inferences using standard methods with simple combining rules. Existing combining rules for multivariate hypothesis tests fail when the sampling error is zero. This paper proposes modified tests for use with finite population analyses of multiply imputed census data for the applications of disclosure limitation and missing data and evaluates their frequentist properties through simulation.

    Release date: 2012-12-19
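As background, the standard scalar combining rules that the paper generalizes (and whose multivariate analogues fail when the sampling error is zero) look like this; the example values are hypothetical:

```python
import numpy as np

def rubin_combine(estimates, variances):
    """Rubin's rules for m multiply imputed analyses: pooled point estimate
    and total variance T = U_bar + (1 + 1/m) * B."""
    m = len(estimates)
    q_bar = np.mean(estimates)
    u_bar = np.mean(variances)        # average within-imputation variance
    b = np.var(estimates, ddof=1)     # between-imputation variance
    t = u_bar + (1 + 1 / m) * b
    return q_bar, t

q, t = rubin_combine([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(q, t)
```

In a census application the within-imputation variances are zero, so the usual reference distributions for multivariate tests degenerate; that is the situation the paper's modified tests are designed to handle.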

  • Surveys and statistical programs – Documentation: 13-605-X201200511748
    Description:

    This note provides users with a reconciliation between Canadian and American measures of household disposable income, debt and the household credit market debt to disposable income ratio.

    Release date: 2012-12-03
Data (0)

No content available at this time.

Analysis (28)

  • Articles and reports: 12-001-X201200111687
    Description:

    To create public use files from large scale surveys, statistical agencies sometimes release random subsamples of the original records. Random subsampling reduces file sizes for secondary data analysts and reduces risks of unintended disclosures of survey participants' confidential information. However, subsampling does not eliminate risks, so that alteration of the data is needed before dissemination. We propose to create disclosure-protected subsamples from large scale surveys based on multiple imputation. The idea is to replace identifying or sensitive values in the original sample with draws from statistical models, and release subsamples of the disclosure-protected data. We present methods for making inferences with the multiple synthetic subsamples.

    Release date: 2012-06-27

  • Articles and reports: 12-001-X201200111688
    Description:

    We study the problem of nonignorable nonresponse in a two dimensional contingency table which can be constructed for each of several small areas when there is both item and unit nonresponse. In general, the provision for both types of nonresponse with small areas introduces significant additional complexity in the estimation of model parameters. For this paper, we conceptualize the full data array for each area to consist of a table for complete data and three supplemental tables for missing row data, missing column data, and missing row and column data. For nonignorable nonresponse, the total cell probabilities are allowed to vary by area, cell and these three types of "missingness". The underlying cell probabilities (i.e., those which would apply if full classification were always possible) for each area are generated from a common distribution and their similarity across the areas is parametrically quantified. Our approach is an extension of the selection approach for nonignorable nonresponse investigated by Nandram and Choi (2002a, b) for binary data; this extension creates additional complexity because of the multivariate nature of the data coupled with the small area structure. As in that earlier work, the extension is an expansion model centered on an ignorable nonresponse model so that the total cell probability is dependent upon which of the categories is the response. Our investigation employs hierarchical Bayesian models and Markov chain Monte Carlo methods for posterior inference. The models and methods are illustrated with data from the third National Health and Nutrition Examination Survey.

    Release date: 2012-06-27

  • Articles and reports: 12-001-X201200111689
    Description:

    When there is unit (whole-element) nonresponse in a survey sample drawn using probability-sampling principles, a common practice is to divide the sample into mutually exclusive groups in such a way that it is reasonable to assume that each sampled element in a group is equally likely to be a survey nonrespondent. In this way, unit response can be treated as an additional phase of probability sampling, with the inverse of the estimated probability of unit response within a group serving as an adjustment factor when computing the final weights for the group's respondents. If the goal is to estimate the population mean of a survey variable that roughly behaves as if it were a random variable with a constant mean within each group regardless of the original design weights, then incorporating the design weights into the adjustment factors will usually be more efficient than not incorporating them. In fact, if the survey variable behaved exactly like such a random variable, then the estimated population mean computed with the design-weighted adjustment factors would be nearly unbiased in some sense (i.e., under the combination of the original probability-sampling mechanism and a prediction model) even when the sampled elements within a group are not equally likely to respond.

    Release date: 2012-06-27
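The design-weighted weighting-class adjustment described above can be sketched with a tiny hypothetical sample: within each group, respondents' design weights are inflated by the inverse of the design-weighted response rate.

```python
import numpy as np

d = np.array([10.0, 10.0, 20.0, 20.0, 30.0, 30.0])  # design weights
group = np.array([0, 0, 0, 1, 1, 1])                # response-homogeneity groups
respond = np.array([True, False, True, True, True, False])

adjusted = d.copy()
for g in np.unique(group):
    in_g = group == g
    # design-weighted response rate within the group
    rate = d[in_g & respond].sum() / d[in_g].sum()
    adjusted[in_g & respond] = d[in_g & respond] / rate
adjusted[~respond] = 0.0  # nonrespondents drop out of estimation

# The adjusted weights preserve each group's total design weight.
for g in np.unique(group):
    in_g = group == g
    assert np.isclose(adjusted[in_g].sum(), d[in_g].sum())
print(adjusted)
```

Incorporating the design weights into the response rate (rather than using the unweighted respondent count) is the choice the article argues is usually more efficient under the stated prediction model.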

  • Articles and reports: 75F0002M2012002
    Description:

    In order to provide a holistic, complete picture of low income, Statistics Canada uses three complementary low income lines: the Low Income Cut-offs (LICOs), the Low Income Measures (LIMs) and the Market Basket Measure (MBM). While the first two lines were developed by Statistics Canada, the MBM is based on concepts developed by Human Resources and Skills Development Canada. Though these measures differ from one another, they give a generally consistent picture of low income status over time. None of these measures is the best. Each contributes its own perspective and its own strengths to the study of low income, so that cumulatively, the three provide a better understanding of the phenomenon of low income as a whole. These measures are not measures of poverty, but strictly measures of low income.

    Release date: 2012-06-18

  • Articles and reports: 82-003-X201200211648
    Geography: Canada
    Description:

    This analysis uses information from the 2007 to 2009 Canadian Health Measures Survey to examine moderate-to-vigorous physical activity, sedentary behaviour and sleep duration in children aged 6 to 11. The objective was to compare and contrast findings from these data collection methods and explore differences in their associations with health markers in children.

    Release date: 2012-04-18

  • Articles and reports: 82-003-X201200111633
    Geography: Canada
    Description:

    This paper explains the methodology for creating Geozones: area-based thresholds of population characteristics, derived from census data, that can be used in the analysis of social or economic differences in health and health service utilization.

    Release date: 2012-03-21

  • Articles and reports: 11F0019M2012340
    Geography: Canada
    Description:

    This paper studies the effect of selective attrition on estimates of immigrant earnings growth based on repeated cross-sectional data in Canada. Longitudinal tax data linked to immigrant landing records are used in order to estimate the change in immigrant earnings and the immigrant-Canadian-born earnings gap. The results are compared with those from repeated cross-sectional data. This approach eliminates differences in results that may stem from variation in collection modes and procedures across datasets.

    Release date: 2012-02-28

  • Articles and reports: 82-003-X201200111625
    Geography: Canada
    Description:

    This study compares estimates of the prevalence of cigarette smoking based on self-report with estimates based on urinary cotinine concentrations. The data are from the 2007 to 2009 Canadian Health Measures Survey, which included self-reported smoking status and the first nationally representative measures of urinary cotinine.

    Release date: 2012-02-15
Reference (5)

  • Surveys and statistical programs – Documentation: 13-605-X201200511748
    Description:

    This note provides users with a reconciliation between Canadian and American measures of household disposable income, debt and the household credit market debt to disposable income ratio.

    Release date: 2012-12-03

  • Notices and consultations: 13-605-X201200111671
    Description:

    Macroeconomic data for Canada, including Canada's National Accounts (gross domestic product (GDP), saving and net worth), Balance of International Payments (current and capital account surplus or deficit and International Investment Position) and Government Financial Statistics (government deficit and debt) are based on international standards. These international standards are set on a coordinated basis by international organizations including the United Nations, the Organisation for Economic Cooperation and Development (OECD), the International Monetary Fund (IMF), the World Bank and Eurostat, with input from experts around the world. Canada has always played an important role in the development and updating of these standards as they have transformed from the crude guidelines of the early to mid 20th century to the fully articulated standards that exist today.

    The purpose of this document is to introduce a new presentation of the quarterly National Accounts (Income and Expenditure Accounts, Financial Flow Accounts and National Balance Sheet Accounts) that will be published with the conversion of the Canadian National Accounts to the latest international standard - System of National Accounts 2008.

    Release date: 2012-05-30

  • Notices and consultations: 62F0026M2012001
    Geography: Province or territory
    Description:

    This report describes the quality indicators produced for the 2010 Survey of Household Spending. These quality indicators, such as coefficients of variation, nonresponse rates, slippage rates and imputation rates, help users interpret the survey data.

    Release date: 2012-04-25
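Two of the quality indicators named above have standard definitions that can be stated compactly (these formulas are the conventional ones, not taken from the report itself):

```python
def coefficient_of_variation(estimate, standard_error):
    """CV as a percentage: the standard error relative to the estimate."""
    return 100.0 * standard_error / estimate

def nonresponse_rate(eligible_units, responding_units):
    """Share of eligible units that did not respond, as a percentage."""
    return 100.0 * (eligible_units - responding_units) / eligible_units

# e.g. an estimate of 2500 with a standard error of 150 has CV = 6%
print(coefficient_of_variation(2500.0, 150.0))
print(nonresponse_rate(1000, 850))
```

A small CV indicates a precise estimate; together with nonresponse, slippage, and imputation rates, it helps users judge how much weight a published figure can bear.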

  • Notices and consultations: 62F0026M2012002
    Description:

    Starting with the 2010 survey year, the Survey of Household Spending (SHS) has used a different collection methodology from previous surveys. The new methodology combines a questionnaire and a diary of expenses. Also, data collection is now continuous throughout the year. This note provides information to users and prospective users of data from the SHS about the methodological differences between the redesigned SHS and the former SHS.

    Release date: 2012-04-25

  • Surveys and statistical programs – Documentation: 98-302-X
    Description:

    The Overview of the Census is a reference document covering each phase of the Census of Population and Census of Agriculture. It provides an overview of the 2011 Census from legislation governing the census to content determination, collection, processing, data quality assessment and data dissemination. It also traces the history of the census from the early days of New France to the present.

    In addition, the Overview of the Census informs users about the steps taken to protect confidential information, along with steps taken to verify the data and minimize errors. It also provides information on the possible uses of census data and covers the different levels of geography and the range of products and services available.

    The Overview of the Census may be useful to both new and experienced users who wish to familiarize themselves with and find specific information about the 2011 Census. The first part covers the Census of Population, while the second is about the Census of Agriculture.

    Release date: 2012-02-08