Replication variance estimation after sample-based calibration
Section 1. Introduction
Variance estimation methods for complex surveys include linearization and replication methods. Replication methods have several practical advantages: multiple weight adjustments, such as nonresponse adjustments and calibration, are readily incorporated into the estimates; detailed design information does not need to be released in the public-use datasets; and data users can readily obtain variance estimates for wide classes of estimators without the need for derivations. There are numerous replication methods in use, with the appropriate choice of method dictated by the sampling design and the estimation objectives of the survey. We refer to Wolter (2007) for an overview of replication variance estimation methods.
The problem we address in this article is how to incorporate calibration into replication variance estimation when the calibration control totals are themselves random and their variance is also estimated by a replication method. This problem arose because we were working with two surveys on the same topic and for the same target population, for which we were tasked with producing a unified set of estimates.
The first survey is the 2016 National Survey of Fishing, Hunting, and Wildlife-Associated Recreation (FHWAR). This survey, conducted by the U.S. Census Bureau, used successive difference replication (SDR), which is a variant of balanced repeated replication (BRR). SDR was originally proposed in Fay and Train (1995) and is frequently used for Census Bureau surveys. The second survey is the 2016 50-state Survey of FHWAR, conducted by the Rockville Institute, the nonprofit affiliate of Westat. This survey used Delete-A-Group Jackknife (DAGJK) as the replication method (Kott, 2001).
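As a rough illustration of how the two replication methods turn replicate estimates into a variance estimate, the sketch below uses the generic form v = c * sum_r (theta_r - theta)^2, with c = 4/R for SDR and c = (R - 1)/R for DAGJK. These factors follow the conventions commonly given for the two methods and are an assumption here, not a statement of what either FHWAR survey implemented; the function name and the toy numbers are hypothetical.

```python
import numpy as np

def replication_variance(full_estimate, replicate_estimates, method):
    """Generic replication variance: factor * sum of squared deviations.

    The factors are the commonly cited conventions (assumed here):
    4/R for successive difference replication (SDR) and
    (R - 1)/R for the delete-a-group jackknife (DAGJK).
    """
    reps = np.asarray(replicate_estimates, dtype=float)
    n_reps = reps.shape[0]
    if method == "sdr":
        factor = 4.0 / n_reps
    elif method == "dagjk":
        factor = (n_reps - 1) / n_reps
    else:
        raise ValueError(f"unknown replication method: {method!r}")
    # Deviations are taken from the full-sample estimate, not the replicate mean.
    return factor * np.sum((reps - full_estimate) ** 2)

# Toy (made-up) replicate estimates around a full-sample estimate of 10.
toy_replicates = [9.0, 11.0, 10.0, 10.0]
v_sdr = replication_variance(10.0, toy_replicates, "sdr")
v_dagjk = replication_variance(10.0, toy_replicates, "dagjk")
```

The only difference between the two calls is the multiplier, which is why replicates built under different methods cannot simply be pooled without accounting for their respective factors.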
The two 2016 FHWAR surveys were fielded concurrently using different modes of data collection, specifically to allow for comparison between the two and for subsequent reconciliation of the estimates. The National survey used a combination of telephone and in-person data collection and had a sample size sufficient to produce estimates at the census division level. The 50-state survey was a mail-based survey and, as its name implies, had a sample size sufficient to produce estimates at the state level. However, the difference in mode, together with differences in other aspects of survey implementation, subsampling strategies and estimation methods, led to substantial and often statistically significant differences in the estimates, with typically higher estimates in the 50-state survey than in the National survey. See Fish and Wildlife Service and Census Bureau (2018) and Rockville Institute (2018) for more details about the two FHWAR surveys.
As noted above, we were responsible for developing a calibration approach to “align” the estimates from the two surveys, in the sense of producing estimates at the state level based on the 50-state survey but compatible with those obtained from the National survey. This, in turn, would make it possible to compare the 2016 state-level estimates to those from prior iterations of the National survey, which has been conducted since 1955 and whose results have been directly comparable since 1991. One of the key steps in reconciling the estimates involved calibrating the demographic composition of the 50-state survey to that of the National survey, given that the latter was considered the “gold standard” in this application. To this end, a set of demographic estimates from the National survey was used as control totals for calibration of the 50-state survey. Because these control totals are themselves estimates, however, it was necessary to make sure that their variability is reflected in the variance estimates of the calibrated 50-state survey estimates. This is an application of sample-based calibration (calibrating to random control totals). Sample-based calibration is typically seen in multi-phase surveys, in which the samples and the estimation methods can be coordinated. In the current setting, the two surveys are independent and have two sets of replicates created using different replication methods.
There is a limited literature on how to account for sample-based calibration in replicate variance estimation. Fuller (1998) developed a replication variance estimator for two-phase samples, in which the phase two estimates are calibrated to phase one control totals. In this approach, the phase two replicates are modified by adjustments derived from the spectral decomposition of the phase one estimated variance-covariance matrix of the control totals. Dever and Valliant (2010) and Dever and Valliant (2016) studied weight calibration to estimated control totals under a scenario where a (benchmark) survey is used to calibrate another (analytic) survey, which is more closely related to our setting. In the latter article, their simulation studies were developed for a generalized regression estimator, and linearization and jackknife replication variance estimation methods were compared. For the jackknife replication, the authors compared the performance of the Fuller (1998) adjustment and two adjustments based on draws from a multivariate normal distribution: one using the full variance-covariance matrix of the control totals, and one using only the diagonal of this matrix. The latter approach had been proposed by Nadimpalli, Judkins and Chu (2004), but no theoretical justification was provided. The method was motivated by considering the asymptotic distribution of the estimated control totals, which is then used to generate “synthetic” versions of these estimates for use as replicate control totals.
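To make the normal-draw idea concrete, the following is a minimal sketch of generating “synthetic” replicate control totals from a multivariate normal distribution centered at the estimated control totals, using either the full variance-covariance matrix or only its diagonal. The function name, the seed handling and the toy inputs are hypothetical, and any rescaling of the draws needed to match a particular replication variance factor is not described in the text above and is deliberately left out.

```python
import numpy as np

def synthetic_control_totals(control_totals, cov, n_replicates,
                             diagonal_only=False, seed=0):
    """Draw one vector of synthetic control totals per replicate.

    Assumption: draws are taken as N(control_totals, cov) with no further
    rescaling; a real application may need to adjust the spread to the
    replication method of the survey being calibrated.
    """
    rng = np.random.default_rng(seed)
    totals = np.asarray(control_totals, dtype=float)
    cov = np.asarray(cov, dtype=float)
    if diagonal_only:
        # Keep the variances and drop all covariances between control totals.
        cov = np.diag(np.diag(cov))
    # Result has shape (n_replicates, number of control totals).
    return rng.multivariate_normal(totals, cov, size=n_replicates)

# Toy (made-up) control totals and covariance matrix.
toy_totals = [1000.0, 2500.0, 400.0]
toy_cov = [[900.0, 100.0, 50.0],
           [100.0, 1600.0, 80.0],
           [50.0, 80.0, 400.0]]
draws_full = synthetic_control_totals(toy_totals, toy_cov, n_replicates=5)
draws_diag = synthetic_control_totals(toy_totals, toy_cov, n_replicates=5,
                                      diagonal_only=True)
```

Because the draws are unbounded, nothing in this sketch prevents a synthetic total from being negative or otherwise inconsistent with the other totals when a control total is small relative to its standard error.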
In this paper, we describe an approach to modify the replicates of the survey to be calibrated by using the replicates from the control survey directly. We show how this method can be used even when the replication methods and/or the number of replicates differ between the two surveys. Interestingly, Kott (2005) already made a brief mention of an approach that likewise uses the replicates directly, in the special case of both surveys using DAGJK with the same number of replicates. Unlike the methods in Fuller (1998) and Nadimpalli et al. (2004), these approaches do not require explicit calculation of the variance-covariance matrix of the control survey, greatly simplifying implementation in practice. In addition, they use valid calibrated totals, unlike the methods relying on draws from a normal distribution, which can result in unstable or even infeasible calibrated totals.
More generally, methods for harmonizing estimates from two surveys can be viewed as an application of statistical data integration (SDI; Lahiri, 2020), a set of methods used to combine multiple data sources to create improved or new estimates compared to what can be obtained from the separate datasets. While they did not use the term SDI, Lohr and Raghunathan (2017) give an overview of the state-of-the-art tools available to perform most of the commonly encountered SDI activities. In a typical SDI application, the goal is the optimal combination of the information in the multiple data sources, which almost always involves creating an estimator that is different from those that are obtained from the separate sources. Methods to achieve this can be design-based, as in multi-frame estimation (Lohr and Rao, 2006) and composite regression estimation (Merkouris, 2004), or model-based (e.g., Raghunathan, Xie, Schenker, Parsons, Davis, Dodd and Feuer, 2007). Sample-based calibration falls in the design-based category, but also aims to reproduce the estimates from one of the data sources exactly.
The remainder of the paper is organized as follows. The proposed method is developed under the setting of regression estimation in Section 2. Raking is another common calibration method and the one used for the two surveys of interest, so we extend the results to this setting in Section 3. In Section 4 we illustrate both the Fuller (1998) method and the proposed method using data from the two 2016 surveys of FHWAR. Section 5 provides overall conclusions.