Articles and reports: 12-001-X20060029551


When selecting a survey sample, one sometimes lacks a frame containing the desired collection units but has another frame whose units are linked in some way to the list of collection units. One can then select a sample from the available frame and use the links between the two populations to produce an estimate for the desired target population. This approach is called Indirect Sampling.

Estimation for a target population surveyed by Indirect Sampling can be a major challenge, in particular if the links between the units of the two populations are not one-to-one. The problem arises mainly from the difficulty of associating a selection probability, or an estimation weight, with the surveyed units of the target population. To solve this type of estimation problem, the Generalized Weight Share Method (GWSM) was developed by Lavallée (1995) and Lavallée (2002). The GWSM provides an estimation weight for every surveyed unit of the target population.

This paper first describes Indirect Sampling, which forms the foundation of the GWSM. Second, an overview of the GWSM is given in which we formulate the GWSM in a theoretical framework using matrix notation. Third, we present some properties of the GWSM, such as unbiasedness and transitivity. Fourth, we consider the special case where the links between the two populations are expressed by indicator variables. Fifth, some typical special linkages are studied to assess their impact on the GWSM. Finally, we consider the problem of optimality. We obtain optimal weights in a weak sense (for specific values of the variable of interest), and conditions under which these weights are also optimal in a strong sense and independent of the variable of interest.
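
The weight-sharing idea in this abstract can be illustrated with a minimal sketch, assuming the special case of 0/1 indicator links and target units grouped into clusters. The function names, the link matrix, and the data are hypothetical and are not taken from the paper; the paper's matrix formulation and optimal weights are not reproduced here.

```python
def gwsm_weights(links, sampled_probs, clusters):
    """Share frame weights among linked target units (illustrative only).

    links[j][k]   -- 1 if frame unit j is linked to target unit k, else 0
    sampled_probs -- {j: selection probability} for the sampled frame units only
    clusters      -- list of lists of target-unit indices surveyed together
    """
    weights = {}
    for cluster in clusters:
        # Initial weight: links reaching the cluster from sampled frame units,
        # each weighted by the inverse selection probability.
        initial = sum(links[j][k] / pi
                      for j, pi in sampled_probs.items()
                      for k in cluster)
        if initial == 0:
            continue  # cluster not reached by the sample
        # Total number of links reaching the cluster from the whole frame.
        total_links = sum(row[k] for row in links for k in cluster)
        for k in cluster:
            weights[k] = initial / total_links
    return weights


def gwsm_total(links, sampled_probs, clusters, y):
    """Weighted estimate of the target-population total of y."""
    w = gwsm_weights(links, sampled_probs, clusters)
    return sum(w[k] * y[k] for k in w)


# Hypothetical data: 3 frame units, 4 target units in 2 clusters.
LINKS = [[1, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 1]]
CLUSTERS = [[0, 1], [2, 3]]
Y = [10.0, 20.0, 30.0, 40.0]

# Frame unit 1 was sampled with probability 0.5.
estimate = gwsm_total(LINKS, {1: 0.5}, CLUSTERS, Y)
```

In this sketch every unit in a cluster receives the same weight, and a cluster linked to several frame units does not need its own selection probability, which is the difficulty the abstract describes.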

Release date: 2006-12-21

Articles and reports: 11-522-X20040018752


This paper outlines some possible applications of the permanent sample of households ready to respond with respect to surveying difficult-to-reach population groups.

Release date: 2005-10-27

Articles and reports: 11-522-X20040018756


This paper evaluates several approaches that have been used to construct or augment frames for a variety of Statistics Canada surveys. On the basis of these experiences, some good practices for frame construction and use are proposed.

Release date: 2005-10-27

Articles and reports: 12-001-X20040027756


It is usually discovered in the data collection phase of a survey that some units in the sample are ineligible even if the frame information has indicated otherwise. For example, in many business surveys a nonnegligible proportion of the sampled units will have ceased trading since the latest update of the frame. This information may be fed back to the frame and used in subsequent surveys, thereby making forthcoming samples more efficient by avoiding sampling ineligible units. On the first of two survey occasions, we assume that all ineligible units in the sample (or set of samples) are detected and excluded from the frame. On the second occasion, a subsample of the eligible part is observed again. The subsample may be augmented with a fresh sample that will contain both eligible and ineligible units. We investigate what effect the process of feeding back information on ineligibility may have on survey estimation, and derive an expression for the bias that can occur as a result of feeding back. The focus is on estimation of the total using the common expansion estimator. An estimator that is nearly unbiased in the presence of feedback is obtained. This estimator relies on the availability of consistent estimates of the number of eligible and ineligible units in the population.

Release date: 2005-02-03

Articles and reports: 11-522-X20030017596


This paper discusses the measurement problems that affected the Demographic Analysis (DA), a coverage measurement program used for Census 2000.

Release date: 2005-01-26

Articles and reports: 12-001-X20030026777


The Accuracy and Coverage Evaluation survey was conducted to estimate the coverage in the 2000 U.S. Census. After field procedures were completed, several types of missing data had to be addressed to apply dual-system estimation. Some housing units were not interviewed. Two noninterview adjustments were devised from the same set of interviews, one for each of two points in time. In addition, the resident, match, or enumeration status of some respondents was not determined. Methods applied in the past were replaced to accommodate a tighter schedule to compute and verify the estimates. This paper presents the extent of missing data in the survey, describes the procedures applied, comparing them to past and current alternatives, and provides analytical summaries of the procedures, including comparisons of dual-system estimates of population under alternatives. Because the resulting levels of missing data were low, it appears that alternative procedures would not have affected the results substantially. However, some changes in the estimates are noted.

Release date: 2004-01-27

Articles and reports: 12-001-X20030026780


Coverage errors and other coverage issues related to population censuses are examined in the light of the recent literature. In particular, when the census counts of persons are matched with the corresponding post-enumeration survey counts, the aggregated results in a dual record system setting can provide some coverage error statistics.

In this paper, coverage error issues are evaluated and alternative solutions are discussed in the light of the results from the latest Population Census of Turkey. Using the census and post-enumeration survey data, a regional comparison of census coverage was also made and showed greater variability among regions. Some methodological remarks are also made on possible improvements to the current enumeration procedures.

Release date: 2004-01-27

Articles and reports: 12-001-X20020026431


When stand-alone sampling frames that list all establishments and their measures of size are available, establishment surveys typically use the Hansen-Hurwitz (HH) PPS (probability proportional to size) estimator to estimate the volume of transactions that establishments have with populations. This paper proposes the network sampling (NS) version of the HH estimator as a potential competitor of the PPS estimator. The NS estimator depends on the population survey-generated establishment frame that lists households and their selection probabilities in a population sample survey, and the number of transactions, if any, of each household with each establishment. A statistical model is developed in this paper to compare the efficiencies of the HH and NS estimators in single-stage and two-stage establishment sample surveys assuming the stand-alone sampling frame and the population survey-generated frame are flawless in coverage and size measures.
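
For readers unfamiliar with the Hansen-Hurwitz estimator mentioned above, here is a minimal sketch of its standard form for a with-replacement PPS sample: the average of each sampled value divided by its per-draw selection probability. The function name and the figures are hypothetical, and the sketch does not implement the network sampling (NS) version proposed in the paper.

```python
def hansen_hurwitz_total(values, draw_probs):
    """Hansen-Hurwitz estimate of a population total.

    values     -- observed y for each draw (with replacement)
    draw_probs -- per-draw selection probability of the unit drawn,
                  e.g. its size measure divided by the total size
    """
    if not values or len(values) != len(draw_probs):
        raise ValueError("values and draw_probs must be non-empty and equal in length")
    if any(p <= 0 for p in draw_probs):
        raise ValueError("draw probabilities must be positive")
    # Each draw gives an unbiased estimate y / p of the total; average them.
    return sum(y / p for y, p in zip(values, draw_probs)) / len(values)


# Hypothetical sample of three draws; here y is exactly proportional to size,
# so every draw yields the same estimate.
hh_estimate = hansen_hurwitz_total([5.0, 10.0, 5.0], [0.25, 0.5, 0.25])
```

When the variable of interest is roughly proportional to the size measure, the per-draw ratios vary little, which is why PPS designs are attractive for establishment surveys.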

Release date: 2003-01-29

Articles and reports: 11-522-X20010016248


This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

The Sawmill Survey is a voluntary census of sawmills in Great Britain. It is limited to fixed mills using domestically-grown timber. Three approaches to assess the coverage of this survey are described:

(1) A sample survey of the sawmilling industry from the UK's business register, excluding businesses already sampled in the Sawmill Survey, is used to assess the undercoverage in the list of known sawmills; (2) a non-response follow-up, using local knowledge of regional officers of the Forestry Commission, is used to estimate the sawmills that do not respond (mostly the smaller mills); and (3) a survey of small-scale sawmills and mobile sawmills (many of these businesses are micro-enterprises) is conducted to analyse their significance.

These three approaches are synthesized to give an estimate of the coverage of the original survey compared with the total activity identified, and to estimate the importance of micro-enterprises to the sawmilling industry in Great Britain.

Release date: 2002-09-12

Articles and reports: 11-522-X20010016266


The key measure of Census quality is the level of response achieved. In recent censuses around the world, this level has been in the high nineties percentage range. This was also true of the 1991 Census in Britain (98%). However, what was particularly noticeable about this Census was the differential response rate and the difficulty in effectively measuring this rate. The United Kingdom set up the One Number Census program in order to research and develop a more effective methodology to measure and account for under-enumeration in the 2001 Census. The key element in this process is the Census Coverage Survey - a significantly larger and redesigned post-enumeration survey.

This paper describes the planning and design of the Census Coverage Survey with particular emphasis on the implementation of the proposed field methodology. It also provides a high-level overview of the success of this survey.

Release date: 2002-09-12

Articles and reports: 12-001-X201500114149


This paper introduces a general framework for deriving the optimal inclusion probabilities for a variety of survey contexts in which disseminating survey estimates of pre-established accuracy for a multiplicity of both variables and domains of interest is required. The framework can define either standard stratified or incomplete stratified sampling designs. The optimal inclusion probabilities are obtained by minimizing costs through an algorithm that guarantees the bounding of sampling errors at the domain level, assuming that the domain membership variables are available in the sampling frame. The target variables are unknown, but can be predicted with suitable super-population models. The algorithm properly takes this model uncertainty into account. Some experiments based on real data show the empirical properties of the algorithm.

Release date: 2015-06-29

Articles and reports: 12-001-X201400214128


Users, funders and providers of official statistics want estimates that are “wider, deeper, quicker, better, cheaper” (channeling Tim Holt, former head of the UK Office for National Statistics), to which I would add “more relevant” and “less burdensome”. Since World War II, we have relied heavily on the probability sample survey as the best we could do - and that best being very good - to meet these goals for estimates of household income and unemployment, self-reported health status, time use, crime victimization, business activity, commodity flows, consumer and business expenditures, and more. Faced with secularly declining unit and item response rates and evidence of reporting error, we have responded in many ways, including the use of multiple survey modes, more sophisticated weighting and imputation methods, adaptive design, cognitive testing of survey items, and other means to maintain data quality. For statistics on the business sector, in order to reduce burden and costs, we long ago moved away from relying solely on surveys to produce needed estimates, but, to date, we have not done that for household surveys, at least not in the United States. I argue that we can and must move from a paradigm of producing the best estimates possible from a survey to that of producing the best possible estimates to meet user needs from multiple data sources. Such sources include administrative records and, increasingly, transaction and Internet-based data. I provide two examples - household income and plumbing facilities - to illustrate my thesis. I suggest ways to inculcate a culture of official statistics that focuses on the end result of relevant, timely, accurate and cost-effective statistics and treats surveys, along with other data sources, as means to that end.

Release date: 2014-12-19

Articles and reports: 12-001-X201100111443


Dual frame telephone surveys are becoming common in the U.S. because of the incompleteness of the landline frame as people transition to cell phones. This article examines nonsampling errors in dual frame telephone surveys. Even though nonsampling errors are ignored in much of the dual frame literature, we find that under some conditions substantial biases may arise in dual frame telephone surveys due to these errors. We specifically explore biases due to nonresponse and measurement error in these telephone surveys. To reduce the bias resulting from these errors, we propose dual frame sampling and weighting methods. The compositing factor for combining the estimates from the two frames is shown to play an important role in reducing nonresponse bias.
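
The role of the compositing factor can be seen in a minimal sketch of the standard composite dual frame estimator, in which the overlap domain (here, households with both landline and cell service) is estimated from each frame and the two estimates are blended. The function name, the argument names, and the numbers are hypothetical; the sketch is not the sampling and weighting method proposed in the article.

```python
def composite_dual_frame_total(total_a_only, total_b_only,
                               overlap_from_a, overlap_from_b, lam):
    """Composite estimate of a total from two overlapping frames.

    total_a_only   -- estimated total for units found only in frame A
    total_b_only   -- estimated total for units found only in frame B
    overlap_from_a -- overlap-domain total estimated from the frame A sample
    overlap_from_b -- overlap-domain total estimated from the frame B sample
    lam            -- compositing factor in [0, 1] given to the frame A estimate
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("compositing factor must lie in [0, 1]")
    overlap = lam * overlap_from_a + (1.0 - lam) * overlap_from_b
    return total_a_only + total_b_only + overlap


# Hypothetical estimates: landline-only 400, cell-only 300, and two
# competing estimates (520 and 480) of the dual-service overlap.
dual_frame_estimate = composite_dual_frame_total(400.0, 300.0, 520.0, 480.0, 0.5)
```

If one frame's overlap estimate is more affected by nonresponse, moving the compositing factor toward the other frame reduces its influence on the combined estimate.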

Release date: 2011-06-29

Articles and reports: 11-522-X200800010972


Release date: 2009-12-03

Articles and reports: 11-522-X200800010979


Prior to 2006, the Canadian Census of Population relied on field staff to deliver questionnaires to all dwellings in Canada. For the 2006 Census, an address frame was created to cover almost 70% of dwellings in Canada, and these questionnaires were delivered by Canada Post. For the 2011 Census, Statistics Canada aims to expand this frame further, with a target of delivering questionnaires by mail to between 80% and 85% of dwellings. Mailing questionnaires for the Census raises a number of issues, among them: ensuring returned questionnaires are counted in the right area, creating an up-to-date address frame that includes all new growth, and determining which areas are unsuitable for having questionnaires delivered by mail. Changes to the address frame update procedures for 2011, most notably the decision to use purely administrative data as the frame wherever possible and conduct field update exercises only where deemed necessary, provide a new set of challenges for the 2011 Census.

Release date: 2009-12-03

Articles and reports: 11-522-X200600110420


Most major survey research organizations in the United States and Canada do not include wireless telephone numbers when conducting random-digit-dialed (RDD) household telephone surveys. In this paper, we offer the most up-to-date estimates available from the U.S. National Center for Health Statistics and Statistics Canada concerning the prevalence and demographic characteristics of the wireless-only population. We then present data from the U.S. National Health Interview Survey on the health and health care access of wireless-only adults, and we examine the potential for coverage bias when health research is conducted using RDD surveys that exclude wireless telephone numbers.

Release date: 2008-03-17

Articles and reports: 12-001-X200700210497


Coverage deficiencies are estimated and analysed for the 2000 population census in Switzerland. For the undercoverage component, the estimation is based on a sample independent of the census and a match with the census. For the overcoverage component, the estimation is based on a sample drawn from the census list and a match with the rest of the census. The over- and undercoverage components are then combined to obtain an estimate of the resulting net coverage. This estimate is based on a capture-recapture model, named the dual system, combined with a synthetic model. The estimators are calculated for the full population and different subgroups, with a variance estimated by a stratified jackknife. The coverage analyses are supplemented by a study of matches between the independent sample and the census in order to determine potential errors of measurement and location in the census data.
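
The capture-recapture (dual system) idea named in this abstract can be sketched in its simplest textbook form: if a census and an independent sample both count the same population, the share of sample members found in the census estimates the census coverage. The function name and the counts are hypothetical, and the sketch omits the overcoverage correction, the synthetic model, and the jackknife variance described in the paper.

```python
def dual_system_estimate(census_count, sample_count, matched_count):
    """Basic dual-system (capture-recapture) estimate of population size.

    census_count  -- persons correctly enumerated in the census
    sample_count  -- persons found by the independent sample
    matched_count -- sample persons who were also found in the census
    """
    if matched_count <= 0:
        raise ValueError("at least one match is needed")
    if matched_count > min(census_count, sample_count):
        raise ValueError("matches cannot exceed either count")
    # matched / sample estimates the census coverage rate, so dividing the
    # census count by that rate estimates the true population size.
    return census_count * sample_count / matched_count


# Hypothetical area: 900 census records, 100 sampled persons, 90 matched.
population_estimate = dual_system_estimate(900, 100, 90)
undercoverage_rate = 1.0 - 900 / population_estimate
```

With these hypothetical counts the estimated undercoverage is about 10%, the share of sampled persons who were not matched to the census.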

Release date: 2008-01-03

Articles and reports: 11-522-X20050019449


The literature on Multiple Frame estimation theory concentrates mainly on the Dual Frame case and is only rarely concerned with the important practical issue of variance estimation. Using a multiplicity approach, a fixed-weights Single Frame estimator for Multiple Frame Surveys is proposed.

Release date: 2007-03-02

Articles and reports: 11-522-X20050019452


The redesign of the Dutch Business Register was started for both technical and statistical reasons. The major changes in the new register are the use of the new Dutch Basic Business Register as the source for legal and local units, the inclusion of administrative units in the register and a new automated algorithm to derive the statistical frame from administrative sources.

Release date: 2007-03-02

Articles and reports: 11-522-X20050019454


The goal of the BR Redesign Project is to simplify, optimize, and harmonize its processes and methods. This paper provides an overview of the BR Redesign with emphasis on the issues that affect the methodology of business surveys.

Release date: 2007-03-02

## Reference (5 results)

Surveys and statistical programs – Documentation: 12-001-X19980024353


This paper studies response errors in the Current Population Survey of the U.S. Bureau of the Census and assesses their impact on the unemployment rates published by the Bureau of Labor Statistics. The measurement of these error rates is obtained from reinterview data, using an extension of the Hui and Walter (1980) procedure for the evaluation of diagnostic tests. Unlike prior studies, which assumed that the reconciled reinterview yields the true status, the method estimates the error rates in both interviews. Using these estimated error rates, we show that the misclassification in the original survey creates a cyclical effect on the reported estimated unemployment rates. In particular, the degree of underestimation increases when true unemployment is high. As there was insufficient data to distinguish between a model assuming that the misclassification rates are the same throughout the business cycle, and one that allows the error rates to differ in periods of low, moderate and high unemployment, our findings should be regarded as preliminary. Nonetheless, they indicate that the relationship between the models used to assess the accuracy of diagnostic tests, and those measuring misclassification rates of survey data, deserves further study.

Release date: 1999-01-14

Surveys and statistical programs – Documentation: 12-001-X19980013906


In sample surveys, the units contained in the sampling frame ideally have a one-to-one correspondence with the elements in the target population under study. In many cases, however, the frame has a many-to-many structure. That is, a unit in the frame may be associated with multiple target population elements and a target population element may be associated with multiple frame units. Such was the case in a building characteristics survey in which the frame was a list of street addresses, but the target population was commercial buildings. The frame was messy because a street address corresponded either to a single building, multiple buildings, or part of a building. In this paper, we develop estimators and formulas for their variances in both simple and stratified random sampling designs when the frame has a many-to-many structure.

Release date: 1998-07-31

Surveys and statistical programs – Documentation: 12-001-X19980013912


Efficient estimates of population size and totals based on information from multiple list frames and an independent area frame are considered. This work is an extension of the methodology proposed by Hartley (1962), which considers two general frames. A main disadvantage of list frames is that they are typically incomplete. In this paper, we propose several methods to address frame deficiencies. A joint list-area sampling design incorporates multiple frames and achieves full coverage of the target population. For each combination of frames, we present the appropriate notation, likelihood function, and parameter estimators. Results from a simulation study that compares the various properties of the proposed estimators are also presented.

Release date: 1998-07-31

Surveys and statistical programs – Documentation: 12-001-X19980013913


Temporary mobility is hypothesized to contribute toward within-household coverage error since it may affect an individual's determination of "usual residence" - a concept commonly applied when listing persons as part of a household-based survey or census. This paper explores a typology of temporary mobility patterns and how they relate to the identification of usual residence. Temporary mobility is defined by the pattern of movement away from, but usually back to, a single residence over a two- to three-month reference period. The typology is constructed using two dimensions: the variety of places visited and the frequency of visits made. Using data from the U.S. Living Situation Survey (LSS) conducted in 1993, four types of temporary mobility patterns are identified. In particular, two groups exhibiting patterns of repeat visit behavior were found to contain more of the types of people who tend to be missed during censuses and surveys. Log-linear modeling indicates spent away and demographic characteristics.

Release date: 1998-07-31

Surveys and statistical programs – Documentation: 12-001-X19970023620


Since France has no population registers, population censuses are the basis for its socio-demographic information system. However, between two censuses, some data must be updated, in particular at a high level of geographic detail, especially since censuses are tending, for various reasons, to be less frequent. In 1993, the Institut National de la Statistique et des Études Économiques (INSEE) set up a team whose objective was to propose a system to substantially improve the existing mechanism for making small area population estimates. Its task was twofold: to prepare an efficient and robust synthesis of the information available from different administrative sources, and to assemble a sufficient number of "good" sources. The "multi-source" system that it designed, which is reported on here, is flexible and reliable, without being overly complex.

Release date: 1998-03-12
