Results

All (124) (0 to 10 of 124 results)

  • Articles and reports: 12-001-X200800210754
    Description:

    The context of the discussion is the increasing incidence of international surveys, of which one is the International Tobacco Control (ITC) Policy Evaluation Project, which began in 2002. The ITC country surveys are longitudinal, and their aim is to evaluate the effects of policy measures being introduced in various countries under the WHO Framework Convention on Tobacco Control. The challenges of organization, data collection and analysis in international surveys are reviewed and illustrated. Analysis is an increasingly important part of the motivation for large-scale cross-cultural surveys. The fundamental challenge for analysis is to discern the real response (or lack of response) to policy change, separating it from the effects of data collection mode, differential non-response, external events, time-in-sample, culture, and language. Two problems relevant to statistical analysis are discussed. The first problem is the question of when and how to analyze pooled data from several countries, in order to strengthen conclusions that might be generally valid. While in some cases this seems to be straightforward, there are differing opinions on the extent to which pooling is possible and reasonable. It is suggested that for formal comparisons, random effects models are of conceptual use. The second problem is to find models of measurement across cultures and data collection modes which will enable calibration of continuous, binary and ordinal responses, and produce comparisons from which extraneous effects have been removed. It is noted that hierarchical models provide a natural way of relaxing requirements of model invariance across groups.

    Release date: 2008-12-23

  • Articles and reports: 12-001-X200800210755
    Description:

    Dependent interviewing (DI) is used in many longitudinal surveys to "feed forward" data from one wave to the next. Though it is a promising technique which has been demonstrated to enhance data quality in certain respects, relatively little is known about how it is actually administered in the field. This research seeks to address this issue through behavior coding. Various styles of DI were employed in the English Longitudinal Study of Ageing (ELSA) in January 2006, and recordings were made of pilot field interviews. These recordings were analysed to determine whether the questions (particularly the DI aspects) were administered appropriately and to explore the respondent's reaction to the fed-forward data. Of particular interest was whether respondents confirmed or challenged the previously-reported information, whether the prior wave data came into play when respondents were providing their current-wave answers, and how any discrepancies were negotiated by the interviewer and respondent. Also of interest was the effectiveness of the various styles of DI. For example, in some cases the prior wave data was brought forward and respondents were asked to explicitly confirm it; in other cases the previous data was read and respondents were asked if the situation was still the same. Results indicate varying levels of compliance in terms of initial question-reading, and suggest that some styles of DI may be more effective than others.

    Release date: 2008-12-23

  • Articles and reports: 12-001-X200800210756
    Description:

    In longitudinal surveys nonresponse often occurs in a pattern that is not monotone. We consider estimation of time-dependent means under the assumption that the nonresponse mechanism is last-value-dependent. Since the last value itself may be missing when nonresponse is nonmonotone, the nonresponse mechanism under consideration is nonignorable. We propose an imputation method by first deriving some regression imputation models according to the nonresponse mechanism and then applying nonparametric regression imputation. We assume that the longitudinal data follow a Markov chain with finite second-order moments. No other assumption is imposed on the joint distribution of longitudinal data and their nonresponse indicators. A bootstrap method is applied for variance estimation. Some simulation results and an example concerning the Current Employment Survey are presented.

    Release date: 2008-12-23
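    The bootstrap variance step mentioned in this abstract can be illustrated with a minimal sketch. This is a plain resampling bootstrap of a sample mean, not the paper's full procedure (which also accounts for the imputation model); the data values are invented.

    ```python
    import random
    import statistics

    def bootstrap_variance(sample, n_reps=1000, seed=1):
        """Estimate the variance of the sample mean by resampling the
        data with replacement and taking the variance of the replicate
        means (a simple SRS bootstrap, for illustration only)."""
        rng = random.Random(seed)
        n = len(sample)
        means = []
        for _ in range(n_reps):
            resample = [sample[rng.randrange(n)] for _ in range(n)]
            means.append(sum(resample) / n)
        return statistics.variance(means)

    values = [12.0, 15.5, 9.8, 14.2, 11.1, 13.7, 10.4, 16.0]
    print(bootstrap_variance(values))  # close to s^2/n for these data
    ```

    For these eight values the replicate variance lands near the textbook estimate s²/n, which is the sanity check one would apply before trusting the method on a more complex design.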

  • Articles and reports: 12-001-X200800210757
    Description:

    Sample weights can be calibrated to reflect the known population totals of a set of auxiliary variables. Predictors of finite population totals calculated using these weights have low bias if these variables are related to the variable of interest, but can have high variance if too many auxiliary variables are used. This article develops an "adaptive calibration" approach, where the auxiliary variables to be used in weighting are selected using sample data. Adaptively calibrated estimators are shown to have lower mean squared error and better coverage properties than non-adaptive estimators in many cases.

    Release date: 2008-12-23
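    The calibration idea in this abstract can be sketched for the simplest case of one auxiliary variable with a linear (GREG-type) adjustment. The function and values below are illustrative, not the paper's adaptive procedure.

    ```python
    def calibrate_weights(d, x, x_total):
        """Linear calibration on one auxiliary variable: adjust design
        weights d so the weighted sample total of x exactly matches the
        known population total x_total."""
        lam = (x_total - sum(dk * xk for dk, xk in zip(d, x))) \
              / sum(dk * xk * xk for dk, xk in zip(d, x))
        return [dk * (1.0 + lam * xk) for dk, xk in zip(d, x)]

    d = [10.0, 10.0, 10.0, 10.0]   # design weights (illustrative)
    x = [2.0, 3.0, 5.0, 4.0]       # auxiliary variable on the sample
    w = calibrate_weights(d, x, x_total=150.0)
    # The calibrated weighted total hits the benchmark of 150 exactly.
    print(sum(wk * xk for wk, xk in zip(w, x)))
    ```

    The adaptive part of the paper is deciding *which* auxiliaries enter this adjustment; the sketch only shows the mechanics of hitting one benchmark.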

  • Articles and reports: 12-001-X200800210758
    Description:

    We propose a method for estimating the variance of estimators of changes over time, a method that takes account of all the components of these estimators: the sampling design, treatment of non-response, treatment of large companies, correlation of non-response from one wave to another, the effect of using a panel, robustification, and calibration using a ratio estimator. This method, which serves to determine the confidence intervals of changes over time, is then applied to the Swiss survey of value added.

    Release date: 2008-12-23

  • Articles and reports: 12-001-X200800210759
    Description:

    The analysis of stratified multistage sample data requires the use of design information such as stratum and primary sampling unit (PSU) identifiers, or associated replicate weights, in variance estimation. In some public release data files, such design information is masked in an effort to reduce disclosure risk while still allowing the user to obtain valid variance estimates. For example, in area surveys with a limited number of PSUs, the original PSUs are split and/or recombined to construct pseudo-PSUs with swapped second- or subsequent-stage sampling units. Such PSU masking methods, however, distort the clustering structure of the sample design, yielding biased variance estimates, possibly with systematic patterns between the two variance estimates obtained from the unmasked and masked PSU identifiers. Some previous work observed patterns in the ratio of the masked and unmasked variance estimates when plotted against the unmasked design effect. This paper investigates the effect of PSU masking on variance estimates under cluster sampling with respect to various aspects, including the clustering structure and the degree of masking. We also seek a PSU masking strategy, based on swapping subsequent-stage sampling units, that helps reduce the resulting biases of the variance estimates. For illustration, we used data from the National Health Interview Survey (NHIS) with some artificial modification. The proposed strategy performs very well in reducing the biases of variance estimates. Both theory and empirical results indicate that the effect of PSU masking on variance estimates is modest with minimal swapping of subsequent-stage sampling units. The proposed masking strategy has been applied to the 2003-2004 National Health and Nutrition Examination Survey (NHANES) data release.

    Release date: 2008-12-23
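    To see why recombining PSUs distorts variance estimates, consider the standard with-replacement ("ultimate cluster") approximation for a single stratum. This is a textbook formula, not the paper's procedure, and the PSU totals below are invented.

    ```python
    def ultimate_cluster_variance(psu_totals):
        """With-replacement approximation to the variance of an estimated
        total from one stratum: (a / (a - 1)) * sum of squared deviations
        of the weighted PSU totals z_i around their mean."""
        a = len(psu_totals)
        zbar = sum(psu_totals) / a
        return a / (a - 1) * sum((z - zbar) ** 2 for z in psu_totals)

    true_psus = [10.0, 20.0, 30.0, 40.0]  # weighted totals of the real PSUs
    masked = [30.0, 70.0]                 # pseudo-PSUs formed by recombining pairs
    print(ultimate_cluster_variance(true_psus))  # about 666.7
    print(ultimate_cluster_variance(masked))     # 1600.0
    ```

    Collapsing four PSUs into two pseudo-PSUs more than doubles the variance estimate in this toy case, which is the kind of systematic bias the masking strategies in the paper aim to control.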

  • Articles and reports: 12-001-X200800210760
    Description:

    The design of a stratified simple random sample without replacement from a finite population deals with two main issues: defining a rule to partition the population into strata, and allocating sampling units to the selected strata. This article examines a tree-based strategy that addresses these issues jointly when the survey is multipurpose and multivariate information, quantitative or qualitative, is available. Strata are formed through a hierarchical divisive algorithm that selects finer and finer partitions by minimizing, at each step, the sample allocation required to achieve the precision levels set for each surveyed variable. In this way, large numbers of constraints can be satisfied without drastically increasing the sample size, and without discarding variables selected for stratification or reducing the number of their class intervals. Furthermore, the algorithm tends not to define empty or almost-empty strata, thus avoiding the need to collapse strata. The procedure was applied to redesign the Italian Farm Structure Survey. The results indicate that the gain in efficiency achieved using our strategy is nontrivial. For a given sample size, the procedure achieves the required precision using a number of strata that is usually a very small fraction of the number available when combining all possible classes of the covariates.

    Release date: 2008-12-23

  • Articles and reports: 12-001-X200800210761
    Description:

    Optimum stratification is the method of choosing the best boundaries that make strata internally homogeneous, given some sample allocation. To make the strata internally homogeneous, they should be constructed so that the strata variances for the characteristic under study are as small as possible. This can be achieved effectively when the distribution of the main study variable is known, by cutting the range of the distribution at suitable points to create the strata. If the frequency distribution of the study variable is unknown, it may be approximated from past experience or from prior knowledge obtained in a recent study. In this paper the problem of finding Optimum Strata Boundaries (OSB) is considered as the problem of determining Optimum Strata Widths (OSW). The problem is formulated as a Mathematical Programming Problem (MPP), which minimizes the variance of the estimated population parameter under Neyman allocation subject to the restriction that the sum of the widths of all the strata equals the total range of the distribution. The distributions of the study variable are taken to be continuous, with Triangular and Standard Normal density functions. The formulated MPPs, which turn out to be multistage decision problems, are solved using the dynamic programming technique proposed by Bühler and Deutler (1975). Numerical examples are presented to illustrate the computational details. The results are also compared with the method of Dalenius and Hodges (1959) using an example with a normal distribution.

    Release date: 2008-12-23
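    The Dalenius and Hodges (1959) method used as the benchmark above is the classical cumulative-square-root-of-frequency (cum √f) rule. A minimal sketch, assuming a histogram of the study variable (function and data are illustrative, not from the paper):

    ```python
    import math

    def cum_sqrt_f_boundaries(freqs, bin_edges, n_strata):
        """Dalenius-Hodges cum-sqrt(f) rule: accumulate sqrt(frequency)
        over the histogram bins, cut that cumulative scale into n_strata
        equal pieces, and return the bin edges where the cuts fall."""
        csf, total = [], 0.0
        for f in freqs:
            total += math.sqrt(f)
            csf.append(total)
        step = total / n_strata
        bounds, k = [], 1
        for i, c in enumerate(csf):
            if k < n_strata and c >= k * step:
                bounds.append(bin_edges[i + 1])
                k += 1
        return bounds

    # A symmetric histogram over [0, 5] split into two strata: the cut
    # falls just past the heavy middle bin.
    print(cum_sqrt_f_boundaries([10, 40, 90, 40, 10], [0, 1, 2, 3, 4, 5], 2))
    ```

    The paper's dynamic-programming formulation replaces this heuristic with an exact minimization over strata widths, which is why the comparison is of interest.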

  • Articles and reports: 12-001-X200800210762
    Description:

    This paper treats optimum allocation in multivariate stratified sampling as a nonlinear integer matrix optimisation problem. As a particular case, a nonlinear multi-objective integer optimisation problem is studied. A fully detailed example illustrating some of the proposed techniques is provided at the end of the work.

    Release date: 2008-12-23
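    The single-variable starting point for this allocation problem is Neyman allocation, which can be sketched as follows. The naive rounding here is exactly what the paper's integer formulation avoids; the stratum sizes and standard deviations are invented.

    ```python
    def neyman_allocation(n, sizes, sds):
        """Neyman allocation for one study variable: allocate the sample
        of size n across strata proportionally to N_h * S_h, the stratum
        size times the stratum standard deviation (naive rounding)."""
        products = [N * S for N, S in zip(sizes, sds)]
        total = sum(products)
        return [round(n * p / total) for p in products]

    # Small strata with high variability can still receive large allocations.
    print(neyman_allocation(100, sizes=[500, 300, 200], sds=[2.0, 5.0, 10.0]))
    ```

    With several study variables, each variable implies a different allocation of this kind, and the multi-objective integer problem in the paper is about reconciling them.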

  • Articles and reports: 12-001-X200800210763
    Description:

    The present work illustrates a sampling strategy for obtaining planned sample sizes for domains belonging to different partitions of the population, while guaranteeing that the sampling errors of the domain estimates remain below given thresholds. The strategy, which covers the multivariate multi-domain case, is useful when the overall sample size is bounded and the standard solution is not feasible, since a stratified sample with strata given by the cross-classification of the variables defining the different partitions would require more strata than the overall sample size. The proposed strategy is based on a balanced sampling selection technique and on GREG-type estimation. Its main advantage is computational feasibility, which allows one to easily implement an overall small-area strategy that considers the sampling design and the estimator jointly and improves the efficiency of the direct domain estimators. An empirical simulation on real population data with different domain estimators shows the empirical properties of the examined sampling strategy.

    Release date: 2008-12-23
Data (0) (0 results)

No content available at this time.

Analysis (118) (0 to 10 of 118 results)
Reference (7) (7 results)

  • Surveys and statistical programs – Documentation: 75F0002M2008005
    Description:

    The Survey of Labour and Income Dynamics (SLID) is a longitudinal survey initiated in 1993. The survey was designed to measure changes in the economic well-being of Canadians as well as the factors affecting these changes. Sample surveys are subject to sampling errors. To account for these errors, each estimate presented in the "Income Trends in Canada" series comes with a quality indicator based on the coefficient of variation. However, other factors must also be considered to ensure data are properly used. Statistics Canada puts considerable time and effort into controlling errors at every stage of the survey and into maximizing fitness for use. Nevertheless, the survey design and the data processing could restrict that fitness for use. It is Statistics Canada's policy to furnish users with measures of data quality so that they can interpret the data properly. This report summarizes the set of quality measures for SLID data. Among the measures included in the report are sample composition and attrition rates, sampling errors, coverage errors in the form of slippage rates, response rates, tax permission and tax linkage rates, and imputation rates.

    Release date: 2008-08-20

  • Surveys and statistical programs – Documentation: 13-017-X
    Description: This guide focuses on the Income and Expenditure Accounts. It provides an overview, an outline of the concepts and definitions, an explanation of the sources of information and statistical methods, a glossary of terms, and a broad compilation of other facts about the accounts.
    Release date: 2008-06-30

  • Surveys and statistical programs – Documentation: 89-631-X
    Description:

    This report highlights the latest developments and rationale behind recent cycles of the General Social Survey (GSS). Starting with an overview of the GSS mandate and historic cycle topics, we then focus on two recent cycles related to families in Canada: Family Transitions (2006) and Family, Social Support and Retirement (2007). Finally, we give a summary of what is to come in the 2008 GSS on Social Networks, and describe a special project to mark 'Twenty Years of GSS'.

    The survey collects data over a twelve month period from the population living in private households in the 10 provinces. For all cycles except Cycles 16 and 21, the population aged 15 and older has been sampled. Cycles 16 and 21 sampled persons aged 45 and older.

    Cycle 20 (GSS 2006) is the fourth cycle of the GSS to collect data on families (the first three cycles on the family were in 1990, 1995 and 2001). Cycle 20 covers much the same content as previous cycles on families with some sections revised and expanded. The data enable analysts to measure conjugal and fertility history (chronology of marriages, common-law unions, and children), family origins, children's home leaving, fertility intentions, child custody as well as work history and other socioeconomic characteristics. Questions on financial support agreements or arrangements (for children and the ex-spouse or ex-partner) for separated and divorced families have been modified. Also, sections on social networks, well-being and housing characteristics have been added.

    Release date: 2008-05-27

  • Surveys and statistical programs – Documentation: 75F0002M1992001
    Description:

    Starting in 1994, the Survey of Labour and Income Dynamics (SLID) will follow individuals and families for at least six years, tracking their labour market experiences, changes in income and family circumstances. An initial proposal for the content of SLID, entitled "Content of the Survey of Labour and Income Dynamics : Discussion Paper", was distributed in February 1992.

    That paper served as a background document for consultation with and a review by interested users. The content underwent significant change during this process. Based upon the revised content, a large-scale test of SLID will be conducted in February and May 1993.

    The present document outlines the income and wealth content to be tested in May 1993. It is a continuation of SLID Research Paper Series 92-01A, which outlines the demographic and labour content used in the January/February 1993 test.

    Release date: 2008-02-29

  • Surveys and statistical programs – Documentation: 75F0002M1992002
    Description:

    When a survey respondent is asked to recall various events, it is known that the quality of the responses diminishes as the length of recall increases. On the other hand, increasing the frequency of data collection increases both the costs of collection and the burden on the respondents. The paper examines options which attempt to strike a reasonable balance between these factors. As it relates to this decision, the paper also describes how the sample has been designed to ensure that it remains representative of the target population, both for a given year and over time.

    The conclusion is that, at this time, SLID should collect labour data in January to cover the previous calendar year and to collect income data in May, again to cover the previous calendar year.

    Release date: 2008-02-29

  • Surveys and statistical programs – Documentation: 75F0002M1992007
    Description:

    A Preliminary Interview will be conducted on the first panel of SLID, in January 1993, as a supplement to the Labour Force Survey. The first panel is made up of about 20,000 households that are rotating out of the Labour Force Survey in January and February, 1993.

    The purpose of this document is to provide a description of the purpose of the SLID Preliminary Interview and the question wordings to be used.

    Release date: 2008-02-29

  • Surveys and statistical programs – Documentation: 82-225-X200701010508
    Description:

    The Record Linkage Overview describes the process used in annual internal record linkage of the Canadian Cancer Registry. The steps include: preparation; pre-processing; record linkage; post-processing; analysis and resolution; resolution entry; and, resolution processing.

    Release date: 2008-01-18
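    The matching step of such a record linkage process can be illustrated with a toy deterministic sketch. The actual Canadian Cancer Registry linkage also involves probabilistic matching and manual resolution; all field names below are invented for illustration.

    ```python
    def link_records(registry, new_reports, key_fields=("surname", "dob")):
        """Toy deterministic linkage: index the registry on an exact key
        and split incoming reports into matched and unmatched lists."""
        index = {tuple(r[f] for f in key_fields): r for r in registry}
        matched, unmatched = [], []
        for rec in new_reports:
            key = tuple(rec[f] for f in key_fields)
            (matched if key in index else unmatched).append(rec)
        return matched, unmatched

    registry = [{"surname": "Lee", "dob": "1950-01-01", "record_id": 1}]
    reports = [{"surname": "Lee", "dob": "1950-01-01"},
               {"surname": "Roy", "dob": "1960-02-02"}]
    matched, unmatched = link_records(registry, reports)
    print(len(matched), len(unmatched))  # 1 1
    ```

    In practice the unmatched residue is what feeds the analysis-and-resolution steps listed in the overview.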