4 Consistency of PPL-BIC
Chen Xu, Jiahua Chen and Harold Mantel
We now investigate the asymptotic behavior of the
PPL-BIC procedure under the joint randomization framework. Suppose there is a
sequence of finite populations, say with Each is an independent and identically distributed
(i.i.d.) sample of size from a super-population modeled by (2.1) with
random variable Within each a sample of size is drawn according to some sampling scheme. We
assume that both and increase to infinity as with the sampling fraction bounded by some constant For simplicity of notation, we will drop the
index in the following discussion.
Without loss of generality, we assume that the first coefficients are nonzero and denote the true
value of by with Also, we use to denote the true model to be identified. We establish the selection
consistency of PPL-BIC in two steps. In the first step we show that, for
appropriate choices of the PPL can consistently identify the true so that with probability tending to 1. In the second
step, we verify that BIC (3.4) consistently selects over
For the asymptotic analysis, we define and associate with to make a sequence. Under the joint randomization
framework, the claim of step 1 is established by the following theorem.
Theorem 1 Under regularity conditions on
model (2.1) and other requirements specified in the online supplement, if as then there exists a local maximizer of the penalized pseudo-likelihood function
(3.5) such that
with $\|\cdot\|$ denoting the Euclidean norm.
The consistency result in Theorem 1 holds for popular
nonconvex penalty functions. For example, for the penalty with consistency holds if for the SCAD penalty, consistency holds if and It also implies that with probability tending
to 1, the true model is included in which serves as a prerequisite for the
selection consistency of BIC over
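To make the SCAD example concrete, the following is a minimal sketch of the SCAD penalty in its standard Fan and Li form. The function name `scad_penalty`, the argument names, and the default shape parameter `a=3.7` are assumptions for illustration; the exact parameterization used in this paper is not shown in this section and may differ.

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty in the standard Fan-Li form (assumed parameterization).

    theta : scalar or array of coefficients
    lam   : tuning parameter, lam > 0
    a     : shape parameter, a > 2 (3.7 is the conventional default)
    """
    t = np.abs(np.asarray(theta, dtype=float))
    # Region 1: |theta| <= lam, the penalty is linear like the lasso.
    linear = lam * t
    # Region 2: lam < |theta| <= a*lam, a quadratic piece bends the penalty over.
    quadratic = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))
    # Region 3: |theta| > a*lam, the penalty is constant, so large
    # coefficients receive no additional shrinkage.
    flat = lam**2 * (a + 1) / 2
    return np.where(t <= lam, linear, np.where(t <= a * lam, quadratic, flat))
```

The three pieces join continuously at `lam` and `a * lam`. The flat tail is what makes the penalty nonconvex and is the usual reason given for its reduced bias on large coefficients.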
We now establish the consistency of using BIC on with a specified that satisfies Theorem 1. Following the
notation used in Section 3.2, let be the model corresponding to a PPL estimator and let be the range of under consideration. We define two collections
of candidate models as follows:
- Over-fitted models:
- Under-fitted models:
The notation denotes that there is at least one differing
element between two sets, so that is the collection of candidate models that
do not include all variables in the true model. Then, can be partitioned accordingly into
By Theorem 1, we have shown that Therefore, the selection consistency of BIC
over is achieved if BIC is able to identify from any model with We use the following theorem to establish this
consistency result.
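The partition described in this paragraph can be written out as follows. The symbols are assumptions chosen for illustration and may not match the paper's own notation in (4.1): $s_0$ stands for the true model, $S$ for the collection of candidate models, and $S_+$ and $S_-$ for the over-fitted and under-fitted collections.

```latex
\begin{align*}
S_+ &= \{\, s \in S : s \supset s_0,\ s \neq s_0 \,\}, \\
S_- &= \{\, s \in S : s \not\supset s_0 \,\}, \\
S   &= \{ s_0 \} \cup S_+ \cup S_- \quad \text{whenever } s_0 \in S .
\end{align*}
```

Under this reading, and assuming the criterion is minimized, selection consistency amounts to showing that, with probability tending to 1, the BIC value at $s_0$ is smaller than its value at every model in $S_+ \cup S_-$.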
Theorem 2 Under the same conditions as in Theorem 1,
where and are defined in (4.1).
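The two-step argument can be illustrated numerically. The sketch below is not the paper's procedure: it replaces the PPL solution path with a hand-picked candidate list, and it uses a generic design-weighted Gaussian BIC-type score rather than the criterion (3.4), whose exact form is not reproduced in this section. All names (`weighted_bic`, `candidates`, the stratum sizes, the coefficient values) are hypothetical.

```python
import numpy as np

def weighted_bic(y, X, w, support):
    """BIC-type score from a design-weighted Gaussian fit (assumed form, not (3.4))."""
    n = len(y)
    cols = sorted(support)
    sw = np.sqrt(w)
    # Weighted least squares: rescale rows by sqrt(weight), then solve by OLS.
    beta_hat, *_ = np.linalg.lstsq(X[:, cols] * sw[:, None], y * sw, rcond=None)
    resid = y - X[:, cols] @ beta_hat
    sigma2 = np.sum(w * resid**2) / np.sum(w)  # weighted residual variance
    return n * np.log(sigma2) + len(cols) * np.log(n)

rng = np.random.default_rng(0)

# Finite population drawn i.i.d. from a linear super-population model.
N, p = 20000, 6
true_support = {0, 1}
beta = np.zeros(p)
beta[[0, 1]] = [2.0, -1.5]
X_pop = rng.normal(size=(N, p))
y_pop = X_pop @ beta + rng.normal(size=N)

# Stratified simple random sampling with unequal sampling fractions.
idx_a = rng.choice(5000, size=250, replace=False)           # stratum A: units 0..4999
idx_b = 5000 + rng.choice(15000, size=250, replace=False)   # stratum B: units 5000..19999
idx = np.concatenate([idx_a, idx_b])
w = np.concatenate([np.full(250, 5000 / 250), np.full(250, 15000 / 250)])
X, y = X_pop[idx], y_pop[idx]

# Hypothetical candidate list standing in for the models on a PPL path.
candidates = [set(range(k)) for k in range(1, p + 1)] + [{1, 2, 3}]
scores = [weighted_bic(y, X, w, s) for s in candidates]
selected = candidates[int(np.argmin(scores))]

for s, score in zip(candidates, scores):
    if s == true_support:
        label = "true"
    elif s > true_support:      # proper superset of the true model
        label = "over-fitted"
    else:                       # misses at least one true variable
        label = "under-fitted"
    print(sorted(s), label, round(float(score), 1))
print("selected:", sorted(selected))
```

In this setup the under-fitted candidates are penalized through a much larger residual variance, while the over-fitted candidates differ from the true model mainly through the `log(n)` term per extra variable. This mirrors the separate treatment of the two collections in Theorem 2.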