Results

All (24) (0 to 10 of 24 results)

  • Articles and reports: 12-001-X20000015174
    Description:

Computation is an integral part of statistical analysis in general and survey sampling in particular. The kinds of analyses that can be carried out depend upon the computational power available. The general development of sampling theory is traced in connection with technological developments in computation.

    Release date: 2000-08-30

  • Articles and reports: 12-001-X20000015176
    Description:

    A components-of-variance approach and an estimated covariance error structure were used in constructing predictors of adjustment factors for the 1990 Decennial Census. The variability of the estimated covariance matrix is the suspected cause of certain anomalies that appeared in the regression estimation and in the estimated adjustment factors. We investigate alternative prediction methods and propose a procedure that is less influenced by variability in the estimated covariance matrix. The proposed methodology is applied to a data set composed of 336 adjustment factors from the 1990 Post Enumeration Survey.

    Release date: 2000-08-30

  • Articles and reports: 12-001-X20000015178
    Description:

Longitudinal observations consist of repeated measurements on the same units over a number of occasions, with fixed or varying time spells between the occasions. Each vector observation can therefore be viewed as a time series, usually of short length. Analyzing the measurements for all the units permits the fitting of low-order time series models, despite the short lengths of the individual series (one such model is sketched after this list).

    Release date: 2000-08-30

  • Articles and reports: 12-001-X20000015179
    Description:

This paper suggests estimating the conditional mean squared error of small area estimators to evaluate their accuracy. This mean squared error is conditional in the sense that it measures the variability with respect to the sampling design for a particular realization of the smoothing model underlying the small area estimators. An unbiased estimator of the conditional mean squared error is easily constructed using Stein's Lemma for the expectation of normal random variables (stated after this list).

    Release date: 2000-08-30

  • Articles and reports: 12-001-X20000015180
    Description:

Imputation is a common procedure to compensate for nonresponse in surveys. Using auxiliary data, imputation may produce estimators that are more efficient than those constructed by ignoring nonrespondents and re-weighting. We study and compare the mean squared errors of survey estimators based on data imputed using three different imputation techniques: the commonly used ratio imputation method (sketched after this list) and two cold deck imputation methods that are frequently adopted in economic area surveys conducted by the U.S. Census Bureau and the U.S. Bureau of Labor Statistics.

    Release date: 2000-08-30

  • Articles and reports: 12-001-X20000015181
    Description:

    Samples from hidden and hard-to-access human populations are often obtained by procedures in which social links are followed from one respondent to another. Inference from the sample to the larger population of interest can be affected by the link-tracing design and the type of data it produces. The population with its social network structure can be modeled as a stochastic graph with a joint distribution of node values representing characteristics of individuals and arc indicators representing social relationships between individuals.

    Release date: 2000-08-30

  • Articles and reports: 12-001-X20000015182
    Description:

To better understand the impact of imposing a restricted region on calibration weights, the author reviews the latter's asymptotic behaviour. Necessary and sufficient conditions are provided for the existence of a solution to the calibration equation with weights within given intervals (the problem is stated after this list).

    Release date: 2000-08-30

  • Articles and reports: 12-001-X20000015184
    Description:

Survey statisticians frequently use superpopulation linear regression models. The Gauss-Markov theorem, assuming fixed regressors or conditioning on observed values of regressors, asserts that the standard estimators of regression coefficients are best linear unbiased (the setup is recalled after this list).

    Release date: 2000-08-30

  • Journals and periodicals: 53F0003X
    Geography: Canada
    Description:

    For several years, urban transit ridership in Canada has been declining. In the late 1990s, ridership began to stabilize but at a level well below the peaks reached in previous years. Many have postulated reasons for the decline, including the dominance of the automobile, changes in work locations and hours, increasing fares, decreasing subsidies and increasing suburbanization.

Using data from approximately 85 Canadian urban transit service providers over a period of eight years, this paper outlines the empirical results of an analysis of the factors that have affected urban transit ridership. Among the key goals of this project was the development of measures of fare elasticity (the generic elasticity specification is sketched after this list).

    Demographic, socio-economic and level of service variables were used in the research to explain changes in ridership. A variety of dummy variables was also used to account for structural differences.

    The paper concludes with an examination of major Canadian cities that carry the majority of all commuters in the country.

    Release date: 2000-06-06

  • Surveys and statistical programs – Documentation: 11-522-X19990015644
    Description:

One method of enriching survey data is to supplement information collected directly from the respondent with information obtained from administrative systems. The aims of such a practice include collecting data that might not otherwise be possible, providing better quality information for data items that respondents may not be able to report accurately (or at all), reducing respondent load, and maximising the utility of information held in administrative systems. Given the direct link with administrative information, the data set resulting from such techniques is potentially a powerful basis for policy-relevant analysis and evaluation. However, the processes involved in effectively combining data from different sources raise a number of challenges which need to be addressed by the parties involved. These include issues associated with privacy, data linking, data quality, estimation, and dissemination.

    Release date: 2000-03-02
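
For the longitudinal entry 12-001-X20000015178 above, a minimal example of a low-order model that pooling makes estimable is a first-order autoregression with unit-specific levels; this particular form is an illustration, not necessarily the one used in the paper:

$$ y_{it} = \alpha_i + \phi\, y_{i,t-1} + \varepsilon_{it}, \qquad \varepsilon_{it} \sim N(0, \sigma^2), $$

where each unit i contributes only a few occasions t, but the common parameters $\phi$ and $\sigma^2$ are estimated from all units jointly.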
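
The construction in 12-001-X20000015179 above rests on Stein's Lemma, which for a normal random variable reads:

$$ Z \sim N(\mu, \sigma^2), \quad E\,|g'(Z)| < \infty \;\Longrightarrow\; E\big[g(Z)(Z - \mu)\big] = \sigma^2\, E\big[g'(Z)\big]. $$

The lemma lets an expectation involving the unknown mean be replaced by one estimable from the data, which is what makes an unbiased estimator of the conditional mean squared error tractable.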
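
Of the three techniques compared in 12-001-X20000015180 above, ratio imputation is standard enough to sketch without the paper's details. A minimal sketch in Python, assuming a single auxiliary variable x observed for all units; the function and data are illustrative, not taken from the paper:

    import numpy as np

    def ratio_impute(y, x):
        # Replace missing y_i with R_hat * x_i, where R_hat is the
        # ratio of respondent totals: R_hat = sum(y_resp) / sum(x_resp).
        y = np.asarray(y, dtype=float)
        x = np.asarray(x, dtype=float)
        resp = ~np.isnan(y)                    # respondent mask
        r_hat = y[resp].sum() / x[resp].sum()  # estimated ratio
        y_imp = y.copy()
        y_imp[~resp] = r_hat * x[~resp]        # impute nonrespondents
        return y_imp

    # Toy data: y observed for four of six units, x known for all.
    y = [10.0, np.nan, 8.0, np.nan, 12.0, 9.0]
    x = [5.0, 6.0, 4.0, 3.0, 6.0, 4.5]
    print(ratio_impute(y, x))  # the two gaps become 12.0 and 6.0

Cold deck imputation, by contrast, fills gaps with values from a source outside the current survey, so it cannot be illustrated without that source.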
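
The calibration problem studied in 12-001-X20000015182 above can be stated in the usual notation, assuming design weights $d_k$ for the sampled units $k \in s$ and known auxiliary totals $\mathbf{X}$:

$$ \text{find } w_k,\ k \in s, \text{ such that } \sum_{k \in s} w_k\, \mathbf{x}_k = \mathbf{X} \quad \text{and} \quad L\, d_k \le w_k \le U\, d_k, $$

and the paper's question is when such weights exist for given bounds L and U.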
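
The setting of 12-001-X20000015184 above is the standard linear model; in the usual notation:

$$ y = X\beta + \varepsilon, \qquad E[\varepsilon] = 0, \qquad \operatorname{Var}(\varepsilon) = \sigma^2 I, $$

under which the Gauss-Markov theorem says the ordinary least squares estimator $\hat{\beta} = (X^\top X)^{-1} X^\top y$ has minimum variance among linear unbiased estimators of $\beta$.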
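
Fare elasticity, the key measure developed in 53F0003X above, is conventionally read off a log-log demand specification; the paper's exact model is not given here, so this is the generic form:

$$ \ln Q = \alpha + \varepsilon_P \ln P + \cdots, \qquad \varepsilon_P = \frac{\partial \ln Q}{\partial \ln P}, $$

so that $\varepsilon_P$ is the percentage change in ridership Q associated with a one percent change in fare P.
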
Data (0) (0 results)

No content available at this time.

Analysis (14)

  • Articles and reports: 12-001-X19990024880
    Description:

J.N.K. Rao gives an overview of the methods and models used for small area estimation. This is an update of his previous overview (Ghosh and Rao, 1994, Statistical Science). He first presents a general discussion of small area models, making a distinction between area-level models and unit-level models. He then describes developments in the three main approaches to inference based on these models, EBLUP, EB and HB, and gives several examples of recent applications (the classic area-level model behind these approaches is recalled after this entry). Finally, he presents an interesting discussion identifying the gaps and areas that require further research.

    Release date: 2000-03-01
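
The area-level model at the centre of small area estimation, as surveyed in 12-001-X19990024880, is the classic Fay-Herriot form, stated here as background (the overview itself covers many variants):

$$ y_i = \theta_i + e_i, \qquad \theta_i = \mathbf{x}_i^\top \beta + v_i, \qquad v_i \sim N(0, \sigma_v^2), \quad e_i \sim N(0, \psi_i), $$

for which the EBLUP is $\hat{\theta}_i = \gamma_i y_i + (1-\gamma_i)\mathbf{x}_i^\top\hat{\beta}$ with $\gamma_i = \hat{\sigma}_v^2 / (\hat{\sigma}_v^2 + \psi_i)$, shrinking the direct estimate toward the regression prediction.
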
Reference (10)

  • Surveys and statistical programs – Documentation: 11-522-X19990015672
    Description:

Data fusion as discussed here means creating a set of data on variables not jointly observed, from two different sources. Suppose for instance that observations are available for (X,Z) on one set of individuals and for (Y,Z) on a different set. Each of X, Y and Z may be a vector variable. The main purpose is to gain insight into the joint distribution of (X,Y) using Z as a so-called matching variable. First, however, an attempt is made to recover as much information as possible on the joint distribution of (X,Y,Z) from the distinct sets of data. Such fusions can only be done at the cost of imposing some distributional properties on the fused data, namely conditional independence given the matching variables. Fused data are typically discussed from the point of view of how appropriate this underlying assumption is. Here we give a different perspective: how can distributions be estimated when only observations from certain marginal distributions are available? The problem can be solved by applying the maximum entropy criterion. We show in particular that data created by fusing different sources can be interpreted as a special case of this situation. Thus, we derive the needed assumption of conditional independence as a consequence of the type of data available (the resulting factorization is shown after this list).

    Release date: 2000-03-02

  • Surveys and statistical programs – Documentation: 11-522-X19990015674
    Description:

The effect of the environment on health is of increasing concern, in particular the effects of the release of industrial pollutants into the air, the ground and water. An assessment of the risks to public health of any particular pollution source is often made using the routine health, demographic and environmental data collected by government agencies. These datasets have important differences in sampling geography and in sampling epochs, which affect the epidemiological analyses that draw them together. In the UK, health events are recorded for individuals, giving cause codes, a date of diagnosis or death, and using the unit postcode as a geographical reference. In contrast, small area demographic data are recorded only at the decennial census, and released as area-level data in areas distinct from postcode geography. Environmental exposure data may be available at yet another resolution, depending on the type of exposure and the source of the measurements.

    Release date: 2000-03-02

  • Surveys and statistical programs – Documentation: 11-522-X19990015680
    Description:

To augment the amount of available information, data from different sources are increasingly being combined. These databases are often combined using record linkage methods. When there is no unique identifier, a probabilistic linkage is used: a record on a first file is associated with a probability of being linked to a record on a second file, and a decision is then taken on whether a possible link is a true link or not. This usually requires a non-negligible amount of manual resolution, so it is legitimate to ask whether manual resolution can be reduced or even eliminated. This issue is addressed here by trying to estimate a total (or a mean) of one population using a sample selected from another population linked somehow to the first. In other words, having two populations linked through probabilistic record linkage, we try to avoid any decision concerning the validity of links and still produce an unbiased estimate of a total of one of the two populations. To achieve this goal, we suggest the use of the Generalised Weight Share Method (GWSM) described by Lavallée (1995) (a common form of the GWSM weight is shown after this list).

    Release date: 2000-03-02

  • Surveys and statistical programs – Documentation: 11-522-X19990015682
    Description:

The application of dual system estimation (DSE) to matched Census/Post Enumeration Survey (PES) data in order to measure net undercount is well understood (Hogan, 1993). However, this approach has so far not been used to measure net undercount in the UK; the 2001 PES in the UK will use this methodology. This paper presents the general approach to design and estimation for this PES (the 2001 Census Coverage Survey). The estimation combines DSE (the basic estimator is shown after this list) with standard ratio and regression estimation. A simulation study using census data from the 1991 Census of England and Wales demonstrates that the ratio model is in general more robust than the regression model.

    Release date: 2000-03-02

  • Surveys and statistical programs – Documentation: 11-522-X19990015684
    Description:

Often, the same information is gathered almost simultaneously for several different surveys. In France, this practice is institutionalized for household surveys that have a common set of demographic variables, i.e., employment, residence and income. These variables are important co-factors for the variables of interest in each survey and, if used carefully, can reinforce the estimates derived from each survey. Techniques for calibrating uncertain data apply naturally in this context. This involves finding the best unbiased estimator of the common variables and calibrating each survey on that estimator (the two-estimator case is shown after this list). The estimator thus obtained in each survey is always a linear estimator whose weights are easily explained, and whose variance, and variance estimate, can be obtained without new difficulty. To supplement the list of regression estimators, this technique can also be seen as a ridge-regression estimator, or as a Bayesian-regression estimator.

    Release date: 2000-03-02

  • Surveys and statistical programs – Documentation: 11-522-X19990015686
    Description:

The U.S. Consumer Expenditure Survey uses two instruments, a diary and an in-person interview, to collect data on many categories of consumer expenditures. Consequently, it is important to use these data efficiently to estimate mean expenditures and related parameters. Three options are: (1) use only data from the diary source; (2) use only data from the interview source; and (3) use generalized least squares, or related methods, to combine the diary and interview data (option (3) is sketched after this list). Historically, the U.S. Bureau of Labor Statistics has focused on options (1) and (2) for estimation at the five- or six-digit Universal Classification Code level. Evaluation and possible implementation of option (3) depends on several factors, including possible measurement biases in the diary and interview data; the empirical magnitude of these biases, relative to the standard errors of customary mean estimators; and the degree of homogeneity of these biases across strata and periods. This paper reviews some issues related to options (1) through (3); describes a relatively simple generalized least squares method for implementation of option (3); and discusses the need for diagnostics to evaluate the feasibility and relative efficiency of the generalized least squares method.

    Release date: 2000-03-02

  • Surveys and statistical programs – Documentation: 11-522-X19990015690
    Description:

The artificial sample was generated in two steps. The first step, based on a master panel, was a Multiple Correspondence Analysis (MCA) carried out on basic variables. "Dummy" individuals were then generated randomly using the distribution of each "significant" factor in the analysis (this step is sketched after this list). Finally, for each individual, a value was generated for each basic variable most closely linked to one of the previous factors. This method ensured that sets of variables were drawn independently. The second step consisted of grafting on other databases, subject to certain property requirements: each added variable was generated on the basis of its estimated distribution, using a generalized linear model for common variables and those already added. The same procedure was then used to graft the other samples. This method was applied to generate an artificial sample from two surveys, and the result was validated using sample comparison tests. The results were positive, demonstrating the feasibility of the method.

    Release date: 2000-03-02

  • Surveys and statistical programs – Documentation: 11-522-X19990015692
    Description:

Electricity rates that vary by time of day have the potential to significantly increase economic efficiency in the energy market. A number of utilities have undertaken economic studies of time-of-use rate schemes for their residential customers. This paper uses meta-analysis to examine the impact of time-of-use rates on electricity demand, pooling the results of thirty-eight separate programs (the pooling is sketched after this list). There are four key findings. First, very large peak to off-peak price ratios are needed to significantly affect peak demand. Second, summer peak rates are relatively effective compared to winter peak rates. Third, permanent time-of-use rates are relatively effective compared to experimental ones. Fourth, demand charges rival ordinary time-of-use rates in terms of impact.

    Release date: 2000-03-02

  • Surveys and statistical programs – Documentation: 11-522-X19990015694
    Description:

We use data on 14 populations of coho salmon to estimate critical parameters that are vital for management of fish populations. Parameter estimates from individual data sets are inefficient and can be highly biased, and we investigate methods to overcome these problems. Combining data sets using nonlinear mixed effects models (the generic form is shown after this list) provides more useful results; however, questions of influence and robustness are raised. For comparison, robust estimates are obtained. Model-robustness is also explored using a family of alternative functional forms. Our results allow ready calculation of the limits of exploitation and may help to prevent extinction of fish stocks. Similar methods can be applied in other contexts where parameter estimation is part of a larger decision-making process.

    Release date: 2000-03-02
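
The conditional-independence property derived in 11-522-X19990015672 above can be written directly: with marginal data on (X,Z) and (Y,Z) only, the maximum entropy joint distribution consistent with both factorizes as

$$ f(x, y, z) = f(z)\, f(x \mid z)\, f(y \mid z), $$

that is, X and Y are conditionally independent given the matching variable Z.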
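
One common statement of the Generalised Weight Share Method used in 11-522-X19990015680 above (the notation here is a standard one, not necessarily the paper's): each unit j of population B receives the weight

$$ w_j = \frac{\sum_{i \in s_A} d_i\, l_{ij}}{\sum_{i \in U_A} l_{ij}}, $$

where $d_i$ is the design weight of unit i sampled from population A and $l_{ij}$ indicates a link between i and j; a total of B is then estimated by $\hat{Y} = \sum_j w_j y_j$, without ever deciding which individual links are true.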
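
The dual system estimator applied in 11-522-X19990015682 above is, in its basic form, the classical two-list (Petersen) estimator:

$$ \hat{N} = \frac{n_1\, n_2}{m}, $$

where $n_1$ is the census count, $n_2$ the PES count, and $m$ the number of individuals matched in both; the paper combines this with ratio and regression estimation.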
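
The "best unbiased estimator" idea in 11-522-X19990015684 above reduces, for two independent unbiased estimates of the same quantity, to the familiar inverse-variance combination:

$$ \hat{\theta} = \lambda\, \hat{\theta}_1 + (1-\lambda)\, \hat{\theta}_2, \qquad \lambda = \frac{V_2}{V_1 + V_2}, $$

which has variance $V_1 V_2 / (V_1 + V_2)$, smaller than either input variance.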
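
Option (3) of 11-522-X19990015686 above, combining diary and interview data by generalized least squares, amounts to GLS estimation of a common mean. A minimal sketch in Python; the function, figures and covariance are illustrative assumptions, not the Bureau's:

    import numpy as np

    def gls_combine(estimates, cov):
        # GLS estimate of a common mean from several unbiased
        # estimators with known covariance matrix V: with design
        # matrix X = 1, beta_hat = (1'V^-1 1)^-1 1'V^-1 y.
        y = np.asarray(estimates, dtype=float)
        v_inv = np.linalg.inv(np.asarray(cov, dtype=float))
        ones = np.ones_like(y)
        var = 1.0 / (ones @ v_inv @ ones)  # variance of the GLS estimate
        est = var * (ones @ v_inv @ y)     # the GLS estimate itself
        return est, var

    # Hypothetical diary and interview means for one expenditure
    # category, with assumed variances and a small covariance.
    est, var = gls_combine([120.0, 135.0], [[25.0, 5.0], [5.0, 64.0]])
    print(est, var)  # pulled toward the more precise diary estimate

The measurement biases discussed in the paper would enter as offsets to these means, which is exactly why the diagnostics it calls for are needed before such a combination is trusted.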
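
The first step described in 11-522-X19990015690 above, drawing each significant MCA factor independently, can be sketched as follows; the function and data are hypothetical, and the MCA itself and the second, grafting step are not shown:

    import numpy as np

    rng = np.random.default_rng(1)

    def draw_independent_factors(factor_scores, n_dummy):
        # Generate "dummy" individuals by drawing each retained factor
        # independently from its empirical distribution; the independent
        # draws are what ensure sets of variables are drawn independently.
        n_obs, n_fac = factor_scores.shape
        cols = [rng.choice(factor_scores[:, k], size=n_dummy)
                for k in range(n_fac)]
        return np.column_stack(cols)

    # Toy stand-in for the scores of three "significant" MCA factors.
    scores = rng.normal(size=(500, 3))
    dummies = draw_independent_factors(scores, n_dummy=10)
    print(dummies.shape)  # (10, 3)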
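
The pooling in 11-522-X19990015692 above is a meta-analysis across programs; in the standard fixed-effect form (the paper's exact weighting is not given here), the pooled impact is

$$ \hat{\theta} = \frac{\sum_i w_i\, \hat{\theta}_i}{\sum_i w_i}, \qquad w_i = \frac{1}{\hat{\sigma}_i^2}, $$

with variance $1 / \sum_i w_i$, so the more precise programs dominate the pooled estimate.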
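
The nonlinear mixed effects models used in 11-522-X19990015694 above have the generic form, stated as background (the paper's spawner-recruit function f is not reproduced here):

$$ y_{ij} = f(x_{ij};\, \beta + b_i) + \varepsilon_{ij}, \qquad b_i \sim N(0, \Sigma), \quad \varepsilon_{ij} \sim N(0, \sigma^2), $$

so each of the 14 populations i shares the fixed parameters $\beta$ while its own deviation $b_i$ is shrunk toward zero, which is how combining the data sets stabilizes the individual estimates.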