Weighted censored quantile regression
Section 4. Performance analysis
We conducted extensive simulation studies to compare the performance of our
proposed EL based weighted censored quantile regression estimator with that of
the standard censored quantile regression estimator. For our simulations, we
used the models discussed in Tang and Leng (2012).
The simulation models used to generate the logarithmic event time and the
logarithmic censoring time for each subject are given in Table 4.1 under four
Cases (i)-(iv).
Table 4.1
Four simulation models to generate event and censoring times
| Cases |
Models |
Error Distribution |
| (i) |
|
|
|
| (ii) |
|
|
|
| (iii) |
|
|
|
| (iv) |
|
|
|
In Cases (i) and (ii), the event times and censoring times
are generated from homoscedastic models, whereas in Cases (iii) and (iv) we
consider heteroscedastic models, in order to examine the efficiency gain of our
proposed method over the standard censored quantile regression. We set the
parameter values as
and selected
to maintain a censoring proportion of approximately 30%
in each case. We generated the explanatory variables from a zero-mean
bivariate normal distribution with covariance,
We considered two different ways to compute the EL
based probability weights. In numerical study -I, we compute
based on the auxiliary information related to
the failure time,
whereas in numerical study -II,
are computed using the observed survival time,
In numerical study -II, we employ the
synthetic variable approach (Koul et al., 1981; Qin and Jing, 2001; Li and
Wang, 2003) to compute the EL based data driven probability weights.
4.1 Numerical study -I
To compute
first we need to have a known population
parameter,
or its estimate. We considered a linear
relation between
and
with slopes
and
and intercept
as the auxiliary information. We estimated
using standard linear regression (least squares) based on a large, finite
population of size 10,000.
We need to generate censoring times as well to compute the event indicator,
and survival time,
to estimate the censored quantile regression
parameters. To fit the weighted censored quantile regression model given in (1.2),
we generated another
observations
with
using the same models given in Table 4.1.
We considered sample sizes of 100 and 200 and three quantiles,
0.25, 0.50 and 0.75. For our proposed method, we estimated
using the estimating function,
defined based on the normal equations of the
linear least squares method as,
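To make the role of the estimating function concrete, the following is a minimal sketch of how EL probability weights can be obtained from a single scalar estimating function. It assumes the standard empirical likelihood form, in which the weights maximize the product of the p_i subject to the weights summing to one and the weighted sum of the g_i being zero. The function name `el_weights`, the bisection solver and the toy values are illustrative assumptions, not the authors' implementation. With the three normal equations used in this study, g is a vector and a multivariate root-finder would be needed instead.

```python
def el_weights(g, tol=1e-12, max_iter=200):
    """EL weights p_i = 1 / (n * (1 + lam * g_i)) for a scalar estimating function.

    lam solves sum_i g_i / (1 + lam * g_i) = 0, found here by bisection
    (an assumed solver choice for this sketch).
    """
    n = len(g)
    g_min, g_max = min(g), max(g)
    if not (g_min < 0.0 < g_max):
        raise ValueError("zero must lie strictly inside the range of g")

    # All 1 + lam * g_i stay positive only for lam in (-1/g_max, -1/g_min).
    lo, hi = -1.0 / g_max, -1.0 / g_min
    width = hi - lo
    lo += 1e-10 * width
    hi -= 1e-10 * width

    def score(lam):
        return sum(gi / (1.0 + lam * gi) for gi in g)

    # score() is strictly decreasing on the interval, so bisection applies.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break

    lam = 0.5 * (lo + hi)
    return [1.0 / (n * (1.0 + lam * gi)) for gi in g]


if __name__ == "__main__":
    # Hypothetical log event times and a hypothetical known population mean.
    t = [1.2, 0.4, 2.1, 1.7, 0.9]
    mu0 = 1.1
    g = [ti - mu0 for ti in t]
    p = el_weights(g)
    print(sum(p), sum(pi * gi for pi, gi in zip(p, g)))
```

When the sample already satisfies the constraint exactly (the g_i sum to zero), the multiplier is zero and every weight equals 1/n.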
For a given quantile,
the true value of the censored quantile
regression parameters
are estimated from the population of size,
10,000.
In general, under a linear model assumption, the true value of the censored
quantile regression slope parameters are the same as the
(i.e.,
But for the intercept, it is
where
is the error distribution. We conducted 1,000
simulations and computed the mean bias, standard error (SE) and 95% coverage
probability (CP) of the parameter estimates for each sample size,
using 250 bootstrap samples. We compared the performance of our proposed method
(CQR-EL1) with that of the standard censored quantile regression (CQR) model.
The simulation results for Cases (i)-(iv) are presented in Tables 4.2 to 4.5,
respectively, with
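The summary measures reported in the tables can be computed from the simulation replicates along the following lines. This is a minimal sketch: it assumes that SE is the empirical standard deviation of the estimates across replicates and that each 95% interval is the estimate plus or minus 1.96 bootstrap standard errors. The source does not spell out either choice, and the function name `summarize` is hypothetical.

```python
from statistics import mean, stdev


def summarize(estimates, boot_ses, true_value, z=1.96):
    """Return (bias, SE, CP in percent) for one parameter over all replicates.

    estimates : one point estimate per simulation replicate
    boot_ses  : the bootstrap standard error obtained in each replicate
    """
    bias = mean(estimates) - true_value
    se = stdev(estimates)  # assumed definition: Monte Carlo standard deviation
    covered = [abs(est - true_value) <= z * s
               for est, s in zip(estimates, boot_ses)]
    cp = 100.0 * sum(covered) / len(covered)
    return bias, se, cp


if __name__ == "__main__":
    # Four made-up replicates, for illustration only.
    print(summarize([0.9, 1.1, 1.0, 1.2], [0.1, 0.1, 0.1, 0.05], 1.0))
```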
Table 4.2
Bias, SE and CP of regression parameters for Case (i) model with independent covariates
| | n | | CQR 0.25 | CQR 0.50 | CQR 0.75 | CQR-EL1 0.25 | CQR-EL1 0.50 | CQR-EL1 0.75 |
| Bias | 100 | | 0.0042 | 0.0170 | 0.0647 | 0.0027 | 0.0180 | 0.0771 |
| | | | 0.0029 | 0.0035 | 0.0094 | -0.0014 | -0.0048 | 0.0030 |
| | | | -0.0049 | -0.0141 | -0.0100 | -0.0047 | -0.0124 | -0.0171 |
| | 200 | | 0.0218 | 0.0298 | 0.0501 | 0.0199 | 0.0322 | 0.0635 |
| | | | 0.0016 | 0.0026 | 0.0057 | 0.0008 | 0.0028 | 0.0048 |
| | | | -0.0020 | -0.0032 | -0.0078 | -0.0010 | 0.0001 | -0.0071 |
| SE | 100 | | 0.1449 | 0.1404 | 0.2268 | 0.1103 | 0.1086 | 0.2110 |
| | | | 0.1533 | 0.1515 | 0.2141 | 0.1159 | 0.1109 | 0.2000 |
| | | | 0.1519 | 0.1525 | 0.2198 | 0.1149 | 0.1109 | 0.2082 |
| | 200 | | 0.0973 | 0.0929 | 0.1292 | 0.0720 | 0.0703 | 0.1221 |
| | | | 0.1040 | 0.1029 | 0.1341 | 0.0746 | 0.0718 | 0.1173 |
| | | | 0.1041 | 0.1027 | 0.1354 | 0.0752 | 0.0717 | 0.1177 |
| CP | 100 | | 93.3 | 93.4 | 95.7 | 95.8 | 96.6 | 97.0 |
| | | | 94.7 | 95.8 | 96.5 | 95.1 | 96.2 | 97.9 |
| | | | 96.0 | 96.3 | 96.4 | 96.4 | 96.4 | 96.9 |
| | 200 | | 92.3 | 91.9 | 92.7 | 92.7 | 92.5 | 94.8 |
| | | | 94.5 | 96.2 | 95.0 | 95.0 | 95.5 | 96.9 |
| | | | 93.6 | 95.0 | 95.2 | 94.2 | 94.9 | 95.8 |
Table 4.3
Bias, SE and CP of regression parameters for Case (ii) model with independent covariates
| | n | | CQR 0.25 | CQR 0.50 | CQR 0.75 | CQR-EL1 0.25 | CQR-EL1 0.50 | CQR-EL1 0.75 |
| Bias | 100 | | 0.0105 | 0.0288 | 0.1088 | 0.0119 | 0.0270 | 0.1062 |
| | | | 0.0063 | 0.0214 | 0.0169 | 0.0005 | 0.0102 | 0.0066 |
| | | | 0.0164 | 0.0096 | -0.0170 | 0.0152 | 0.0079 | -0.0184 |
| | 200 | | 0.0267 | 0.0355 | 0.0821 | 0.0276 | 0.0340 | 0.0760 |
| | | | 0.0006 | -0.0032 | 0.0050 | 0.0042 | 0.0032 | 0.0024 |
| | | | 0.0112 | 0.0025 | 0.0051 | 0.0029 | -0.0038 | -0.0057 |
| SE | 100 | | 0.1871 | 0.1538 | 0.2980 | 0.1522 | 0.1304 | 0.2914 |
| | | | 0.1946 | 0.1664 | 0.2698 | 0.1555 | 0.1318 | 0.2480 |
| | | | 0.1955 | 0.1676 | 0.2733 | 0.1556 | 0.1327 | 0.2543 |
| | 200 | | 0.1235 | 0.1029 | 0.1621 | 0.0998 | 0.0871 | 0.1556 |
| | | | 0.1301 | 0.1146 | 0.1663 | 0.1010 | 0.0893 | 0.1473 |
| | | | 0.1315 | 0.1149 | 0.1671 | 0.1023 | 0.0897 | 0.1465 |
| CP | 100 | | 95.5 | 93.1 | 94.7 | 96.2 | 94.8 | 97.2 |
| | | | 95.6 | 93.5 | 96.4 | 95.7 | 95.6 | 97.8 |
| | | | 95.9 | 95.4 | 96.4 | 96.0 | 95.0 | 97.2 |
| | 200 | | 93.1 | 91.2 | 94.0 | 93.0 | 93.8 | 95.7 |
| | | | 95.0 | 95.5 | 95.4 | 94.8 | 95.5 | 96.2 |
| | | | 95.5 | 95.7 | 95.5 | 95.0 | 95.2 | 96.3 |
Table 4.4
Bias, SE and CP of regression parameters for Case (iii) model with independent covariates
| | n | | CQR 0.25 | CQR 0.50 | CQR 0.75 | CQR-EL1 0.25 | CQR-EL1 0.50 | CQR-EL1 0.75 |
| Bias | 100 | | 0.0062 | 0.0088 | 0.0224 | 0.0055 | 0.0085 | 0.0254 |
| | | | 0.0042 | 0.0051 | 0.0076 | 0.0034 | 0.0016 | 0.0057 |
| | | | -0.0038 | -0.0039 | -0.0069 | -0.0013 | 0.0003 | -0.0010 |
| | 200 | | 0.0064 | 0.0072 | 0.0167 | 0.0064 | 0.0089 | 0.0195 |
| | | | 0.0012 | 0.0038 | 0.0033 | -0.0006 | -0.0003 | -0.0014 |
| | | | -0.0015 | -0.0031 | -0.0017 | -0.0004 | 0.0002 | 0.0023 |
| SE | 100 | | 0.0472 | 0.0466 | 0.0767 | 0.0349 | 0.0338 | 0.0737 |
| | | | 0.0566 | 0.0570 | 0.0796 | 0.0424 | 0.0411 | 0.0708 |
| | | | 0.0567 | 0.0575 | 0.0807 | 0.0425 | 0.0418 | 0.0720 |
| | 200 | | 0.0313 | 0.0301 | 0.0402 | 0.0225 | 0.0213 | 0.0345 |
| | | | 0.0371 | 0.0377 | 0.0489 | 0.0276 | 0.0267 | 0.0402 |
| | | | 0.0367 | 0.0376 | 0.0488 | 0.0270 | 0.0267 | 0.0401 |
| CP | 100 | | 94.4 | 95.0 | 96.1 | 94.3 | 96.0 | 97.1 |
| | | | 95.0 | 95.2 | 95.5 | 95.2 | 95.3 | 97.4 |
| | | | 96.6 | 96.7 | 97.3 | 95.4 | 96.6 | 98.0 |
| | 200 | | 94.1 | 93.4 | 94.9 | 93.2 | 94.0 | 94.1 |
| | | | 94.0 | 94.9 | 96.0 | 93.0 | 95.1 | 95.9 |
| | | | 94.6 | 95.0 | 95.3 | 94.4 | 95.3 | 94.8 |
Table 4.5
Bias, SE and CP of regression parameters for Case (iv) model with independent covariates
| | n | | CQR 0.25 | CQR 0.50 | CQR 0.75 | CQR-EL1 0.25 | CQR-EL1 0.50 | CQR-EL1 0.75 |
| Bias | 100 | | 0.0066 | 0.0097 | 0.0364 | 0.0048 | 0.0076 | 0.0273 |
| | | | 0.0031 | 0.0039 | 0.0041 | 0.0026 | 0.0043 | 0.0036 |
| | | | 0.0008 | -0.0009 | -0.0018 | 0.0008 | -0.0035 | -0.0028 |
| | 200 | | 0.0083 | 0.0089 | 0.0243 | 0.0100 | 0.0103 | 0.0258 |
| | | | -0.0020 | 0.0016 | 0.0017 | -0.0022 | -0.0008 | -0.0018 |
| | | | 0.0008 | -0.0012 | -0.0031 | 0.0026 | 0.0012 | 0.0004 |
| SE | 100 | | 0.0600 | 0.0507 | 0.1103 | 0.0466 | 0.0407 | 0.1038 |
| | | | 0.0667 | 0.0592 | 0.0993 | 0.0514 | 0.0468 | 0.0885 |
| | | | 0.0677 | 0.0600 | 0.1014 | 0.0525 | 0.0470 | 0.0921 |
| | 200 | | 0.0395 | 0.0327 | 0.0521 | 0.0305 | 0.0260 | 0.0464 |
| | | | 0.0429 | 0.0386 | 0.0568 | 0.0331 | 0.0298 | 0.0491 |
| | | | 0.0429 | 0.0389 | 0.0580 | 0.0331 | 0.0301 | 0.0501 |
| CP | 100 | | 93.5 | 95.0 | 97.7 | 94.7 | 95.5 | 97.8 |
| | | | 95.6 | 96.6 | 97.0 | 96.0 | 96.3 | 97.3 |
| | | | 96.0 | 96.2 | 97.3 | 95.8 | 96.7 | 97.0 |
| | 200 | | 93.0 | 93.9 | 94.9 | 93.5 | 93.4 | 94.1 |
| | | | 95.6 | 95.8 | 94.7 | 94.5 | 95.2 | 95.4 |
| | | | 94.5 | 95.9 | 95.5 | 94.5 | 96.0 | 95.2 |
From Tables 4.2-4.5, we see that our proposed estimator has approximately zero
bias. Comparing the SEs of CQR-EL1 and CQR shows that the SE of CQR-EL1 is
markedly smaller for all the parameters at every quantile. For example,
throughout this paper we use the scenario with sample size 100 and quantile
0.25 for comparison purposes. From Table 4.2, for CQR, SE
of
is 0.1533 and for CQR-EL1, SE of
is reduced to 0.1159. When the sample size is
increased to 200, SE of
under our proposed method is further reduced to
0.0746. Compared with the nominal level of 95%, CQR-EL1 provides approximately
95% coverage, and the coverage becomes more stable as the sample size
increases. Similar conclusions can be reached for Case (ii) (results are in
Table 4.3), even though the failure time there follows a heavier-tailed
distribution than in Case (i). For example, SE of
using CQR is 0.1946, whereas it is only 0.1555
for the CQR-EL1 based estimate. We also observed that the SEs are higher
in Case (ii) than in Case (i).
In Cases (iii) and (iv), the error depends on the covariates. The simulation
results for these cases (Tables 4.4 and 4.5) are similar to those for the cases
where the error is independent of the covariates. For example, in Case (iii)
(Table 4.4), SE of
is 0.0566 and 0.0424 for CQR and CQR-EL1
respectively. Similarly, in Case (iv) (Table 4.5), SE of
is 0.0667 and 0.0514 for CQR and CQR-EL1,
respectively. Here too, the SEs of the estimates are slightly higher in
Case (iv) than in Case (iii) because of the heavy-tailed distribution assumed
for the failure time.
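The size of the efficiency gain can be expressed as a relative reduction in SE. The short sketch below uses only the four pairs of SE values cited in the text (sample size 100, quantile 0.25); the helper name `relative_reduction` is illustrative.

```python
def relative_reduction(se_cqr, se_el):
    """Percentage reduction in SE when moving from CQR to the EL-weighted fit."""
    return 100.0 * (se_cqr - se_el) / se_cqr


# (CQR SE, CQR-EL1 SE) pairs quoted in the text for Cases (i)-(iv).
quoted = {
    "Case (i)": (0.1533, 0.1159),
    "Case (ii)": (0.1946, 0.1555),
    "Case (iii)": (0.0566, 0.0424),
    "Case (iv)": (0.0667, 0.0514),
}

for case, (se_cqr, se_el) in quoted.items():
    print(case, round(relative_reduction(se_cqr, se_el), 1))
```

For these four pairs the reduction works out to roughly 20% to 25%.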
4.2 Numerical study -II
In most survival data with random right
censoring, the observed data are the triplet
We consider a linear relationship between the
survival time
and the covariates as the auxiliary
information. Here we cannot use the EL estimating function,
defined in (4.1) because of the censoring.
However, other methods available in the literature account for
right censoring in linear regression.
Koul et al. (1981) introduced a synthetic data
approach by transforming the survival time,
to a synthetic variable,
as
where
is the censoring indicator and
is the distribution of the censoring time.
if
is independent of both
and
When
is unknown, we can replace it with its
Kaplan-Meier estimator. The estimator of
using the Kaplan-Meier (Kaplan and Meier,
1958) estimator is
where
are ordered and the corresponding censoring
indicator is
We can estimate
as
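For orientation, the synthetic data transformation of Koul et al. (1981) and the Kaplan-Meier estimator of the censoring distribution are usually written as below. The notation is assumed for this sketch and may differ from the symbols used in equations (4.2) and (4.3): Y_i is the observed (log) survival time, \delta_i the censoring indicator, G the distribution function of the censoring time, and Y_{(1)} \le \dots \le Y_{(n)} the ordered observations with indicators \delta_{(i)}, assuming no ties.

```latex
% Synthetic response (Koul et al., 1981); notation assumed for illustration
Y_i^{*} = \frac{\delta_i \, Y_i}{1 - G(Y_i)}, \qquad
E\left(Y_i^{*} \mid X_i\right) = E\left(T_i \mid X_i\right)

% Kaplan-Meier estimator of the censoring distribution (no ties assumed)
1 - \hat{G}(t) = \prod_{i:\, Y_{(i)} \le t}
  \left( \frac{n - i}{n - i + 1} \right)^{1 - \delta_{(i)}}
```

The conditional mean identity holds when the censoring time is independent of both the failure time and the covariates, which is the condition stated in the text.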
Qin and Jing (2001) and Li and Wang (2003)
independently provided the estimating function to compute the EL based data
driven probabilities as
We can compute the
and
using the sample analogues of (4.2) and (4.3)
respectively.
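The sample version of the synthetic variable can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: it uses the usual product form of the Kaplan-Meier estimator for the censoring distribution, assumes there are no tied times, and the function name `synthetic_responses` is hypothetical.

```python
def synthetic_responses(y, delta):
    """Return Y*_i = delta_i * Y_i / (1 - G_hat(Y_i)) for each observation.

    y     : observed (log) survival times, min(event time, censoring time)
    delta : 1 if the event was observed, 0 if the observation was censored
    G_hat is the Kaplan-Meier estimator of the censoring distribution.
    """
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    surv = 1.0              # running estimate of 1 - G_hat(t)
    y_star = [0.0] * n      # censored observations keep the value 0
    for rank, i in enumerate(order, start=1):
        if delta[i] == 0:
            # A censored time is an "event" for the censoring distribution.
            surv *= (n - rank) / (n - rank + 1)
        else:
            y_star[i] = y[i] / surv
    return y_star


if __name__ == "__main__":
    # Small made-up sample: the second observation is censored.
    print(synthetic_responses([1.0, 2.0, 3.0, 4.0], [1, 0, 1, 1]))
```

Uncensored observations that come after a censored one are inflated, which compensates on average for the zeros assigned to censored observations. With no censoring the synthetic responses equal the observed times.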
To compute
we consider a linear relation between
and
with slopes
and
and intercept
We estimate
using (4.4) based on a large, finite
population with size,
10,000.
To fit the weighted censored quantile regression model given in (1.2), we
generate another
observations
with
using the same models given in Table 4.1.
For our proposed method, we estimate
using the estimating function,
given in (4.5).
As in numerical study -I, we present results
based on 1,000 simulations and report the bias, standard error (SE) and
empirical coverage probability (CP) at the nominal level of 95%, based on 250
bootstrap samples. The simulation results for this study are summarized in
Tables 4.6-4.9.
As with the population information related to the failure time
(numerical study -I), the conclusions are similar for uncorrelated covariates.
From Tables 4.6-4.9 we see that our proposed method (CQR-EL2) provides
approximately unbiased estimates for every sample size and quantile. In terms
of coverage probability, both CQR and CQR-EL2 provide approximately 95%
coverage. In nearly all settings, the standard error of the CQR-EL2 parameter
estimates is smaller than that of the CQR parameter estimates; the main
exceptions are at the 0.75 quantile with sample size 100 in Cases (iii) and
(iv) (Tables 4.8 and 4.9), where the SE of CQR-EL2 is slightly larger. Taking
Case (i) as the basic model, both CQR-EL2 and CQR have noticeably higher SEs in
Case (ii) because of the heavy-tailed distribution of the observed survival
time. When the error depends on the covariates (Cases (iii) and (iv)), the SE
of CQR-EL2 is considerably smaller.
We also conducted a large number of simulations with
correlated covariates (correlation 0.5), and we constructed weights based on a
simple relationship with only one covariate, for both numerical studies. The
results of these simulations are omitted to save space. The conclusions are
similar to those for the uncorrelated covariate cases.
In numerical study -I, we noticed that there is a slight
reduction in SE of
using heteroscedastic models for CQR-EL1. But
use of the estimating function,
(CQR-EL2), does not reduce the SE of
under heteroscedastic models. Since we
utilized only partial population information in relation to
the standard error of
and
reduced for CQR-EL2 compared to CQR. The
standard error of
was not changed.
Our simulation studies reveal that auxiliary information
greatly enhances the efficiency of estimation, if the population information
related to both
and
is available. If the population information is
only related to
the efficiency gain is limited to
and
only. However, under heteroscedastic models,
the efficiency of estimating
slightly improved in numerical study -I, but
not in numerical study -II.
Table 4.6
Bias, SE and CP of regression parameters for Case (i) model with independent covariates
| | n | | CQR 0.25 | CQR 0.50 | CQR 0.75 | CQR-EL2 0.25 | CQR-EL2 0.50 | CQR-EL2 0.75 |
| Bias | 100 | | 0.0042 | 0.0170 | 0.0647 | 0.0217 | 0.0275 | 0.0720 |
| | | | 0.0029 | 0.0035 | 0.0094 | -0.0491 | -0.0411 | -0.0090 |
| | | | -0.0049 | -0.0141 | -0.0100 | 0.0116 | -0.0029 | -0.0194 |
| | 200 | | 0.0218 | 0.0298 | 0.0501 | 0.0220 | 0.0323 | 0.0562 |
| | | | 0.0016 | 0.0026 | 0.0057 | -0.0295 | -0.0273 | -0.0119 |
| | | | -0.0020 | -0.0032 | -0.0078 | 0.0034 | 0.0053 | -0.0011 |
| SE | 100 | | 0.1449 | 0.1404 | 0.2268 | 0.1273 | 0.1233 | 0.2160 |
| | | | 0.1533 | 0.1515 | 0.2141 | 0.1475 | 0.1416 | 0.2075 |
| | | | 0.1519 | 0.1525 | 0.2198 | 0.1416 | 0.1414 | 0.2162 |
| | 200 | | 0.0973 | 0.0929 | 0.1292 | 0.0840 | 0.0798 | 0.1239 |
| | | | 0.1040 | 0.1029 | 0.1341 | 0.0970 | 0.0921 | 0.1278 |
| | | | 0.1041 | 0.1027 | 0.1354 | 0.0957 | 0.0936 | 0.1304 |
| CP | 100 | | 93.3 | 93.4 | 95.7 | 94.3 | 96.1 | 96.8 |
| | | | 94.7 | 95.8 | 96.5 | 94.6 | 96.1 | 96.9 |
| | | | 96.0 | 96.3 | 96.4 | 95.4 | 95.4 | 97.4 |
| | 200 | | 92.3 | 91.9 | 92.7 | 92.9 | 92.3 | 94.3 |
| | | | 94.5 | 96.2 | 95.0 | 95.3 | 95.3 | 94.8 |
| | | | 93.6 | 95.0 | 95.2 | 93.5 | 94.9 | 95.9 |
Table 4.7
Bias, SE and CP of regression parameters for Case (ii) model with independent covariates
| | n | | CQR 0.25 | CQR 0.50 | CQR 0.75 | CQR-EL2 0.25 | CQR-EL2 0.50 | CQR-EL2 0.75 |
| Bias | 100 | | 0.0105 | 0.0288 | 0.1088 | 0.0306 | 0.0461 | 0.1139 |
| | | | 0.0063 | 0.0214 | 0.0169 | -0.0841 | -0.0503 | -0.0216 |
| | | | 0.0164 | 0.0096 | -0.0170 | 0.0329 | 0.0260 | -0.0094 |
| | 200 | | 0.0267 | 0.0355 | 0.0821 | 0.0419 | 0.0508 | 0.0921 |
| | | | 0.0006 | -0.0032 | 0.0050 | -0.0022 | -0.0010 | -0.0188 |
| | | | 0.0112 | 0.0025 | 0.0051 | 0.0251 | 0.0137 | 0.0133 |
| SE | 100 | | 0.1871 | 0.1538 | 0.2980 | 0.1619 | 0.1379 | 0.2768 |
| | | | 0.1946 | 0.1664 | 0.2698 | 0.1863 | 0.1595 | 0.2548 |
| | | | 0.1955 | 0.1676 | 0.2733 | 0.1787 | 0.1549 | 0.2632 |
| | 200 | | 0.1235 | 0.1029 | 0.1621 | 0.1048 | 0.0900 | 0.1551 |
| | | | 0.1301 | 0.1146 | 0.1663 | 0.1214 | 0.1052 | 0.1575 |
| | | | 0.1315 | 0.1149 | 0.1671 | 0.1185 | 0.1044 | 0.1606 |
| CP | 100 | | 95.5 | 93.1 | 94.7 | 95.9 | 94.2 | 97.5 |
| | | | 95.6 | 93.5 | 96.4 | 94.8 | 93.3 | 96.7 |
| | | | 95.9 | 95.4 | 96.4 | 94.2 | 94.2 | 96.3 |
| | 200 | | 93.1 | 91.2 | 94.0 | 93.5 | 93.0 | 94.7 |
| | | | 95.0 | 95.5 | 95.4 | 94.5 | 94.0 | 94.9 |
| | | | 95.5 | 95.7 | 95.5 | 94.8 | 94.5 | 95.4 |
Table 4.8
Bias, SE and CP of regression parameters for Case (iii) model with independent covariates
| | n | | CQR 0.25 | CQR 0.50 | CQR 0.75 | CQR-EL2 0.25 | CQR-EL2 0.50 | CQR-EL2 0.75 |
| Bias | 100 | | 0.0062 | 0.0088 | 0.0224 | 0.0127 | 0.0146 | 0.0302 |
| | | | 0.0042 | 0.0051 | 0.0076 | -0.0071 | -0.0043 | 0.0021 |
| | | | -0.0038 | -0.0039 | -0.0069 | 0.0018 | 0.0017 | -0.0040 |
| | 200 | | 0.0064 | 0.0072 | 0.0167 | 0.0094 | 0.0105 | 0.0197 |
| | | | 0.0012 | 0.0038 | 0.0033 | -0.0042 | -0.0026 | -0.0007 |
| | | | -0.0015 | -0.0031 | -0.0017 | 0.0009 | -0.0003 | 0.0015 |
| SE | 100 | | 0.0472 | 0.0466 | 0.0767 | 0.0448 | 0.0445 | 0.0801 |
| | | | 0.0566 | 0.0570 | 0.0796 | 0.0541 | 0.0549 | 0.0830 |
| | | | 0.0567 | 0.0575 | 0.0807 | 0.0538 | 0.0558 | 0.0833 |
| | 200 | | 0.0313 | 0.0301 | 0.0402 | 0.0292 | 0.0283 | 0.0396 |
| | | | 0.0371 | 0.0377 | 0.0489 | 0.0348 | 0.0356 | 0.0484 |
| | | | 0.0367 | 0.0376 | 0.0488 | 0.0344 | 0.0359 | 0.0488 |
| CP | 100 | | 94.4 | 95.0 | 96.1 | 93.9 | 94.7 | 96.9 |
| | | | 95.0 | 95.2 | 95.5 | 94.6 | 94.7 | 96.3 |
| | | | 96.6 | 96.7 | 97.3 | 95.8 | 96.4 | 97.3 |
| | 200 | | 94.1 | 93.4 | 94.9 | 93.9 | 93.8 | 94.9 |
| | | | 94.0 | 94.9 | 96.0 | 94.1 | 94.3 | 95.0 |
| | | | 94.6 | 95.0 | 95.3 | 94.0 | 95.4 | 94.3 |
Table 4.9
Bias, SE and CP of regression parameters for Case (iv) model with independent covariates
| | n | | CQR 0.25 | CQR 0.50 | CQR 0.75 | CQR-EL2 0.25 | CQR-EL2 0.50 | CQR-EL2 0.75 |
| Bias | 100 | | 0.0066 | 0.0097 | 0.0364 | 0.0189 | 0.0169 | 0.0419 |
| | | | 0.0031 | 0.0039 | 0.0041 | -0.0138 | -0.0073 | -0.0000 |
| | | | 0.0008 | -0.0009 | -0.0018 | 0.0074 | 0.0060 | 0.0024 |
| | 200 | | 0.0083 | 0.0089 | 0.0243 | 0.0124 | 0.0119 | 0.0273 |
| | | | -0.0020 | 0.0016 | 0.0017 | -0.0097 | -0.0051 | -0.0032 |
| | | | 0.0008 | -0.0012 | -0.0031 | 0.0019 | 0.0004 | -0.0020 |
| SE | 100 | | 0.0600 | 0.0507 | 0.1103 | 0.0548 | 0.0486 | 0.1159 |
| | | | 0.0667 | 0.0592 | 0.0993 | 0.0618 | 0.0581 | 0.1018 |
| | | | 0.0677 | 0.0600 | 0.1014 | 0.0616 | 0.0578 | 0.1066 |
| | 200 | | 0.0395 | 0.0327 | 0.0521 | 0.0359 | 0.0304 | 0.0516 |
| | | | 0.0429 | 0.0386 | 0.0568 | 0.0397 | 0.0364 | 0.0558 |
| | | | 0.0429 | 0.0389 | 0.0580 | 0.0397 | 0.0368 | 0.0579 |
| CP | 100 | | 93.5 | 95.0 | 97.7 | 92.9 | 95.2 | 97.6 |
| | | | 95.6 | 96.6 | 97.0 | 94.2 | 95.5 | 97.4 |
| | | | 96.0 | 96.2 | 97.3 | 96.3 | 97.0 | 97.6 |
| | 200 | | 93.0 | 93.9 | 94.9 | 93.3 | 94.2 | 95.8 |
| | | | 95.6 | 95.8 | 94.7 | 94.0 | 95.5 | 95.2 |
| | | | 94.5 | 95.9 | 95.5 | 94.9 | 96.0 | 94.7 |
Note that the value of the auxiliary parameter
plays a big role in the efficiency of the weighted censored quantile regression
parameter estimates. If the estimate of the auxiliary parameter based on the
present study data and the value from a previous study (or a known value) are
very close, then all weights will be close to 1/n, where n is the sample size,
and the solutions to (1.1) and (1.2) are essentially the same. If data from
previous studies are not available, we can make use of the data available in
the present study to estimate the value of the auxiliary parameter. In this
case, if the dimensions of the auxiliary parameter and the estimating equation
are the same, then all weights will be equal to 1/n and the solutions to (1.1)
and (1.2) remain the same. However, if the dimension of the estimating equation
is greater than that of the auxiliary parameter, the weights are no longer
equal to 1/n and this scheme provides an efficiency gain
over the conventional QR estimates (Tang and Leng, 2012).
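The reason the weights collapse to equal weights in the just-identified case follows directly from the usual form of the EL weights. The short derivation below is a sketch in assumed notation (g_i for the estimating function evaluated at observation i, \theta for the auxiliary parameter, \lambda for the Lagrange multiplier), which may differ from the symbols used elsewhere in the paper.

```latex
% EL weights for an estimating function g (standard form; notation assumed)
p_i = \frac{1}{n\left\{1 + \lambda^{\top} g_i(\theta)\right\}}, \qquad
\sum_{i=1}^{n} \frac{g_i(\theta)}{1 + \lambda^{\top} g_i(\theta)} = 0 .

% Just-identified case: \hat{\theta} solves the sample estimating equation
\sum_{i=1}^{n} g_i(\hat{\theta}) = 0
\;\Longrightarrow\; \lambda = 0 \text{ solves the equation above}
\;\Longrightarrow\; p_i = \frac{1}{n}, \quad i = 1, \dots, n .
```

When the estimating equation has more components than the parameter, no value of \theta generally makes the sample sum exactly zero, so \lambda is nonzero and the weights differ from 1/n.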
4.3 Case example
The North Central Cancer Treatment Group (NCCTG) was
initiated by a group of physicians from the north central region of the
United States of America and the Mayo Clinic in Rochester, Minnesota.
The NCCTG conducted this study to determine whether the conclusions from the
patient-completed questionnaire were independent of those already obtained by
the patient’s physician (Loprinzi, Laurie, Wieand, Krook, Novotny,
Kugler, Bartel, Law, Bateman and Klatt, 1994). They used the performance scores
(ECOG and Karnofsky) to assess the patient’s daily activities. The dataset is
available in the “survival” package of the R software and contains records for
228 patients. Because some of the variables are incomplete, we limited the
dataset to 167 observations. To illustrate our proposed method, we focus
instead on the effect of the following covariates on the observed survival time
at different quantiles: “age”, the patient’s age in years; “sex”
(Male = 1, Female = 2); “ph.ecog”, the ECOG performance score rated by the
physician (0 = good, 5 = dead); “meal.cal”, calories consumed at meals; and
“wt.loss”, weight loss in the last six months. After removing the incomplete
patient records, the only ECOG scores remaining were 0, 1 and 2. We defined two
dummy variables for “ph.ecog” as follows.
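A two-indicator coding of a three-level score can be sketched as follows. This assumes that score 0 is the reference level and that the two indicators correspond to the rows labelled ECOG1 and ECOG2 in Table 4.10; the exact definition used in the paper is not reproduced here, and the function name `ecog_dummies` is hypothetical.

```python
def ecog_dummies(ph_ecog):
    """Map ECOG scores in {0, 1, 2} to (ECOG1, ECOG2) indicator pairs.

    Assumed coding: score 0 is the reference level,
    ECOG1 = 1 when the score is 1, ECOG2 = 1 when the score is 2.
    """
    pairs = []
    for score in ph_ecog:
        if score not in (0, 1, 2):
            raise ValueError(f"unexpected ECOG score: {score!r}")
        pairs.append((int(score == 1), int(score == 2)))
    return pairs


if __name__ == "__main__":
    print(ecog_dummies([0, 1, 2, 1]))
```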
To demonstrate the usefulness of our proposed method, we
randomly selected 100 of the 167 complete observations and treated them as
the data available from a previous study. As the available auxiliary
information, we assumed that there exists a linear relation between the
logarithm of the observed survival time and all the continuous explanatory
variables (age, meal.cal and wt.loss). We estimated the
by the least squares method based on these 100
observations, where the response is the synthetic variable defined by (4.2).
We then computed the EL based data driven probability weights for the present
study data points (67 observations). After computing the weights, we estimated
the weighted censored quantile regression parameters using the method of Peng
and Huang (2008) with all the covariates. For the present study data, the
censoring proportion is 0.283. Interestingly, we could estimate the regression
parameters using CQR up to the
quantile, whereas we could estimate up to the
quantile using CQR-EL2. Along with the
estimates for the quantiles 0.25, 0.50 and 0.75, we also report the standard
error (SE) and 95% confidence limits, based on 250 bootstrap samples, in
Table 4.10.
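A bootstrap standard error and 95% confidence limits of the kind reported in Table 4.10 can be sketched generically as follows. The sketch assumes a nonparametric bootstrap with percentile limits, which the source does not specify, and it uses the sample median as a stand-in estimator; in the case study the estimator would be the weighted censored quantile regression fit on the resampled observations. The function name `bootstrap_se_ci` and the toy data are hypothetical.

```python
import random
from statistics import median, stdev


def bootstrap_se_ci(data, estimator, n_boot=250, level=0.95, seed=0):
    """Bootstrap SE and percentile confidence limits for estimator(data)."""
    rng = random.Random(seed)
    n = len(data)
    reps = []
    for _ in range(n_boot):
        # Resample n observations with replacement and re-estimate.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        reps.append(estimator(resample))
    reps.sort()
    alpha = 1.0 - level
    lower = reps[int((alpha / 2.0) * (n_boot - 1))]
    upper = reps[int((1.0 - alpha / 2.0) * (n_boot - 1))]
    return stdev(reps), (lower, upper)


if __name__ == "__main__":
    # 67 made-up values standing in for the present study data.
    toy = [float(v) for v in range(1, 68)]
    se, (lo, hi) = bootstrap_se_ci(toy, median)
    print(se, lo, hi)
```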
Table 4.10
Estimates, SE and 95% CI for regression parameters of NCCTG lung cancer data
| | | CQR 0.25 | CQR 0.50 | CQR 0.75 | CQR-EL2 0.25 | CQR-EL2 0.50 | CQR-EL2 0.75 |
| Estimates | Intercept | 5.4777 | 4.2651 | 5.5380 | 4.7531 | 4.1729 | 6.4258 |
| | Age | -0.0168 | 0.0179 | 0.0040 | -0.0047 | 0.0202 | -0.0032 |
| | Sex | 0.7201 | 0.6180 | 0.4181 | 0.7606 | 0.6638 | 0.3651 |
| | ECOG1 | -0.7059 | -0.5449 | -0.2029 | -0.5701 | -0.5355 | -0.2884 |
| | ECOG2 | -0.8677 | -0.9402 | -0.8336 | -1.1584 | -1.0612 | -1.0192 |
| | MealCal | 0.0004 | 0.0001 | 0.0001 | 0.0004 | 0.0001 | -0.0000 |
| | WtLoss | -0.0007 | -0.0084 | -0.0023 | -0.0023 | -0.0100 | -0.0135 |
| SE | Intercept | 1.9235 | 1.4314 | 1.7494 | 1.6628 | 1.4149 | 1.4666 |
| | Age | 0.0277 | 0.0188 | 0.0225 | 0.0256 | 0.0184 | 0.0176 |
| | Sex | 0.5610 | 0.3389 | 0.3716 | 0.5374 | 0.3317 | 0.2809 |
| | ECOG1 | 0.6521 | 0.3436 | 0.3375 | 0.6498 | 0.3493 | 0.2434 |
| | ECOG2 | 1.0317 | 0.5410 | 0.6061 | 0.9336 | 0.5413 | 0.3879 |
| | MealCal | 0.0009 | 0.0006 | 0.0008 | 0.0009 | 0.0006 | 0.0005 |
| | WtLoss | 0.0181 | 0.0128 | 0.0231 | 0.0157 | 0.0124 | 0.0100 |
| CI | Intercept | (1.6, 9.14) | (2.38, 8) | (2.08, 8.94) | (1.79, 8.31) | (2.32, 7.87) | (3.14, 8.89) |
| | Age | (-0.07, 0.04) | (-0.04, 0.04) | (-0.04, 0.05) | (-0.06, 0.04) | (-0.03, 0.04) | (-0.03, 0.04) |
| | Sex | (-0.45, 1.74) | (0, 1.33) | (-0.13, 1.33) | (-0.39, 1.71) | (-0.04, 1.27) | (-0.07, 1.03) |
| | ECOG1 | (-1.75, 0.81) | (-1.15, 0.2) | (-0.97, 0.35) | (-1.86, 0.69) | (-1.18, 0.19) | (-0.78, 0.18) |
| | ECOG2 | (-2.88, 1.16) | (-2, 0.12) | (-2.11, 0.26) | (-2.83, 0.83) | (-2.13, -0.01) | (-1.73, -0.21) |
| | WtLoss | (-0.04, 0.03) | (-0.03, 0.02) | (-0.05, 0.04) | (-0.04, 0.02) | (-0.03, 0.01) | (-0.04, 0) |
From Table 4.10, we see that the standard errors of
the estimates for the intercept and the continuous variable parameters are
generally reduced because we used auxiliary information related to
them. For the remaining variables, a reduction in standard error can also be
seen in most cases, even though we did not use any auxiliary information
related to them. In the censored quantile regression with the EL based data
driven probability weights, we see narrower 95% confidence limits for most of
the variables compared to those using the standard censored quantile regression.