An alternative way of estimating a cumulative logistic model with complex survey data
Section 1. Introduction: Fitting a regression model with complex survey data

The goal of this paper is to show an alternative way of estimating a cumulative logistic model (also called the ordinal logistic model or the ordinal regression model), that is, a regression model with a categorical dependent variable having more than two ordered categories, given complex survey data. As we shall see, the standard estimation methods cannot be implemented with most conventional “design-based” software, such as SAS (SAS Institute Inc., 2015), except when the “parallel-lines assumption” holds.

The standard “design-based” framework for fitting a regression model to survey data was introduced by Fuller (1975) for linear regression and by Binder (1983) more generally. This framework treats the finite population as a realization of independent trials from a conceptual population. A maximum-likelihood regression estimator could, in principle, be computed from the finite-population values. The goal in the Fuller/Binder framework is to estimate the conceptual maximum-likelihood estimator, or its limit as the population grows arbitrarily large, from survey data. Skinner (1989) refers to this as the “pseudo-maximum-likelihood” approach.

Kott (2018) describes an alternative model-based approach to estimating regression models with complex survey data, dubbed “design-sensitive” robust model-based estimation. Following Kott (2007), the standard model in this approach is defined as:

$$ y_k = f(\mathbf{x}_k^T \boldsymbol{\beta}) + \varepsilon_k, \quad \text{where } E(\varepsilon_k \mid \mathbf{x}_k) = 0. \qquad (1.1) $$

Although apparently very general, there is a key restriction imposed by the standard model in equation (1.1): $E(\varepsilon_k) = 0$ no matter the value of $\mathbf{x}_k$. This assumption can fail, in which case the standard model is not appropriate for the population being analyzed.

In the extended model, $E(\varepsilon_k \mid \mathbf{x}_k) = 0$ in equation (1.1) is replaced by $E(\mathbf{x}_k \varepsilon_k) = \mathbf{0}$. Unlike the standard model, the more general, robust extended model rarely fails.

With an independent identically distributed (iid) population $U$ of $N$ elements, it is easy to see that

$$ \operatorname{plim} \left\{ N^{-1} \sum_{U} \left[ y_k - f(\mathbf{x}_k^T \boldsymbol{\beta}) \right] \mathbf{x}_k \right\} = \mathbf{0} $$

under the extended model. Given a complex sample $S$ with weights $\{w_k\}$, each (nearly) equal to the inverse of the corresponding element’s selection probability,

$$ \operatorname{plim} \left\{ N^{-1} \sum_{S} w_k \left[ y_k - f(\mathbf{x}_k^T \boldsymbol{\beta}) \right] \mathbf{x}_k \right\} = \mathbf{0} \qquad (1.2) $$

under mild conditions on the sampling design. The parenthetical “nearly” needs to be added when the weights include adjustments for unit nonresponse or coverage errors in the frame which the analyst assumes have been accounted for in an asymptotically unbiased manner. Calibration weight adjustments for statistical efficiency are another reason to add “nearly”.

Whether the analyst assumes the standard or the extended model holds in the population, solving for $\mathbf{b}$ in the weighted estimating equation (Godambe and Thompson, 1986)

$$ \sum_{S} w_k \left[ y_k - f(\mathbf{x}_k^T \mathbf{b}) \right] \mathbf{x}_k = \mathbf{0} \qquad (1.3) $$

provides a consistent estimator for $\boldsymbol{\beta}$ under mild conditions.
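As an illustration of how equation (1.3) can be solved numerically, the following is a minimal sketch that assumes a logistic mean function $f$ and uses Newton-Raphson iteration. The function names (`logistic`, `solve_linear`, `score`, `solve_weighted_ee`) and the toy data are hypothetical and are not taken from the paper or from any survey package; the sketch returns point estimates only and does not attempt design-based variance estimation.

```python
import math

def logistic(eta):
    """Logistic mean function f(eta) = exp(eta) / (1 + exp(eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = s / M[i][i]
    return z

def score(b, X, y, w):
    """Left-hand side of (1.3): sum over S of w_k [y_k - f(x_k'b)] x_k."""
    p = len(b)
    g = [0.0] * p
    for xk, yk, wk in zip(X, y, w):
        resid = yk - logistic(sum(xj * bj for xj, bj in zip(xk, b)))
        for j in range(p):
            g[j] += wk * resid * xk[j]
    return g

def solve_weighted_ee(X, y, w, iters=50, tol=1e-10):
    """Newton-Raphson solution of the weighted estimating equation (1.3)."""
    p = len(X[0])
    b = [0.0] * p
    for _ in range(iters):
        g = score(b, X, y, w)
        # H is minus the Jacobian of the score: sum of w_k f'(x_k'b) x_k x_k'
        H = [[0.0] * p for _ in range(p)]
        for xk, wk in zip(X, w):
            mu = logistic(sum(xj * bj for xj, bj in zip(xk, b)))
            d = wk * mu * (1.0 - mu)
            for i in range(p):
                for j in range(p):
                    H[i][j] += d * xk[i] * xk[j]
        step = solve_linear(H, g)
        b = [bj + sj for bj, sj in zip(b, step)]
        if max(abs(s) for s in step) < tol:
            break
    return b

# Toy data (hypothetical): intercept plus one binary covariate.
X = [[1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1]]
y = [1, 0, 0, 1, 1, 0]
w = [1.0, 2.0, 1.0, 1.0, 1.0, 2.0]
b_hat = solve_weighted_ee(X, y, w)
```

Because the toy model is saturated, the fitted probabilities reproduce the weighted proportions in each covariate group, which gives a simple check that the weights enter the solution as equation (1.3) prescribes.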

The pseudo-maximum-likelihood estimating equation in Binder is

$$ \sum_{S} w_k \frac{f'(\mathbf{x}_k^T \mathbf{b})}{v_k} \left[ y_k - f(\mathbf{x}_k^T \mathbf{b}) \right] \mathbf{x}_k = \mathbf{0}, $$

where $v_k = E(\varepsilon_k^2 \mid \mathbf{x}_k)$. For logistic, Poisson, and ordinary least squares (OLS) linear regression, $f'(\mathbf{x}_k^T \boldsymbol{\beta}) / v_k = 1$. This equality may not hold for general least squares (GLS) linear regression, however, even when the elements are uncorrelated. It also need not hold for a cumulative logistic regression model.
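The equality for the logistic and Poisson cases can be checked directly. The short derivation below is a sketch under the usual assumptions (binary $y_k$ for the logistic model, a Poisson-distributed $y_k$ with log link for the Poisson model); the shorthand $\eta_k = \mathbf{x}_k^T \boldsymbol{\beta}$ is introduced here for convenience and is not notation from the paper.

```latex
% Logistic regression with binary y_k, writing \eta_k = \mathbf{x}_k^T \boldsymbol{\beta}:
\begin{align*}
f(\eta_k)  &= \frac{\exp(\eta_k)}{1 + \exp(\eta_k)}, \\
f'(\eta_k) &= f(\eta_k)\,[1 - f(\eta_k)], \\
v_k        &= E(\varepsilon_k^2 \mid \mathbf{x}_k) = f(\eta_k)\,[1 - f(\eta_k)]
\quad\Longrightarrow\quad \frac{f'(\eta_k)}{v_k} = 1.
\end{align*}
% Poisson regression with log link (variance equals mean):
\[
f(\eta_k) = \exp(\eta_k), \quad f'(\eta_k) = \exp(\eta_k), \quad v_k = \exp(\eta_k)
\quad\Longrightarrow\quad \frac{f'(\eta_k)}{v_k} = 1.
\]
```

For OLS with a constant error variance $\sigma^2$, $f' = 1$ and $v_k = \sigma^2$, so the ratio is the constant $1/\sigma^2$; it can be divided out of the estimating equation without changing the solution, which is equivalent to a ratio of 1.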

The cumulative logistic model is a multinomial logistic regression model for $L$ categories with a natural ordering (e.g., always, frequently, sometimes, never). Being in the first category is assumed to fit a logistic model. Being in either the first or second category is assumed to fit a logistic model. Being in the first, second, or third category is assumed to fit a logistic model, and so forth.

The general cumulative logistic model is (splitting out the intercept from the rest of the covariates)

$$ E(y_{\ell k} \mid \mathbf{x}_k) = \frac{\exp(\alpha_\ell + \mathbf{x}_k^T \boldsymbol{\beta}_\ell)}{1 + \exp(\alpha_\ell + \mathbf{x}_k^T \boldsymbol{\beta}_\ell)} \quad \text{for } \ell = 1, \ldots, L-1, $$

where $y_{\ell k} = 1$ when $k$ is in one of the first $\ell$ categories, 0 otherwise.
The parallel-lines assumption is that $\boldsymbol{\beta}_\ell = \boldsymbol{\beta}$ for all integer values of $\ell$ less than $L$, with each such value having its own intercept $(\alpha_\ell)$. The cumulative logistic model under the parallel-lines assumption is often called a proportional-odds model. We will call it the “simple cumulative logistic model,” although it is more commonly referred to as the cumulative logistic model (or the ordinal logistic model).
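The definition of the cumulative indicators $y_{\ell k}$ can be made concrete with a small sketch. It assumes the ordered categories are coded as integers 1 through $L$; the function name `cumulative_indicators` is hypothetical.

```python
def cumulative_indicators(category, L):
    """Return [y_1k, ..., y_{L-1,k}], where y_lk = 1 if the element is in
    one of the first l categories and 0 otherwise.

    Assumes categories are coded 1, 2, ..., L in their natural order.
    """
    if not 1 <= category <= L:
        raise ValueError("category must be an integer between 1 and L")
    return [1 if category <= l else 0 for l in range(1, L)]

# Example with L = 4 ordered categories (e.g., always, frequently,
# sometimes, never): an element in the second category.
indicators = cumulative_indicators(2, 4)
```

An element in the last category has all $L - 1$ indicators equal to 0, while an element in the first category has all of them equal to 1.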

Finding the $a_\ell$ and $\mathbf{b}_\ell$ that satisfy the estimating equation:

$$ \sum_{k \in S} w_k \left[ y_{\ell k} - \frac{\exp(a_\ell + \mathbf{x}_k^T \mathbf{b}_\ell)}{1 + \exp(a_\ell + \mathbf{x}_k^T \mathbf{b}_\ell)} \right] \begin{bmatrix} 1 \\ \mathbf{x}_k \end{bmatrix} = \mathbf{0} \quad \text{for } \ell = 1, \ldots, L-1 \qquad (1.4) $$

can be used for estimating the general cumulative logistic model. This is not the pseudo-maximum-likelihood estimating equation in the surveylogistic routine in SAS/STAT 14.1 (An (2002, page 7) discusses the multivariate pseudo-maximum-likelihood estimating equation fit by this procedure), the logistic routine in SUDAAN 11 (Research Triangle Institute, 2012) or the gologit2 routine in STATA (Williams, 2005) for the simple cumulative logistic model. Only the STATA routine allows the $\mathbf{b}_\ell$ to vary.

Given $L$ nominal categories and complex survey data, SAS and SUDAAN can fit the general multinomial logistic model,

$$ E(y_{\ell k} \mid \mathbf{x}_k) = \frac{\exp(\alpha_\ell + \mathbf{x}_k^T \boldsymbol{\beta}_\ell)}{1 + \sum_{j=1}^{L-1} \exp(\alpha_j + \mathbf{x}_k^T \boldsymbol{\beta}_j)} \quad \text{for } \ell = 1, \ldots, L-1, $$

with $y_{\ell k} = 1$ when $k$ is in the $\ell^{\text{th}}$ category, 0 otherwise; this is not the same thing as the general cumulative logistic model, which these programs cannot estimate with complex survey data.

In what follows, we introduce a modest example of a simple cumulative logistic model. Given complex survey data, we fit the model both with the pseudo-maximum-likelihood technique and with equation (1.4). The latter is accomplished by creating a data set with $L - 1$ observations for each respondent $k$ (note that $y_{1k}, \ldots, y_{L-1,k}$ are in the same primary sampling unit). We follow Kott (2018) and call this fitting method the “design-sensitive” technique, even though, strictly speaking, it is model based. Moreover, the pseudo-maximum-likelihood approach is also sensitive to the design weights and other aspects of the sampling design.
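The stacked data set described above can be sketched as follows. This is an illustrative reading of the construction, not the authors' code: each respondent is expanded into $L - 1$ binary records, one per cut point $\ell$, each carrying the respondent's weight and primary sampling unit. The record layout, the field names (`id`, `psu`, `weight`, `category`, `x`, `cut`, `cut_dummies`) and the toy respondents are all hypothetical. For the simple model, the cut-point dummies are assumed to play the role of the intercepts $a_\ell$ while the covariates share one slope vector.

```python
def stack_records(respondents, L):
    """Expand each respondent into L-1 binary records, one per cut point.

    Each respondent is a dict with keys "id", "psu", "weight",
    "category" (coded 1..L) and "x" (list of covariate values).
    """
    stacked = []
    for r in respondents:
        for l in range(1, L):
            stacked.append({
                "id": r["id"],
                "psu": r["psu"],          # all L-1 rows stay in the same PSU
                "weight": r["weight"],    # w_k is repeated for every cut point
                "cut": l,
                "y": 1 if r["category"] <= l else 0,   # cumulative indicator y_lk
                "cut_dummies": [1 if j == l else 0 for j in range(1, L)],
                "x": list(r["x"]),
            })
    return stacked

# Toy respondents (hypothetical) with L = 4 ordered categories.
respondents = [
    {"id": 1, "psu": "A", "weight": 120.0, "category": 2, "x": [0.5]},
    {"id": 2, "psu": "B", "weight": 80.0, "category": 4, "x": [1.5]},
]
stacked = stack_records(respondents, L=4)
```

The stacked rows can then be passed to a weighted binary logistic routine with `cut_dummies` and `x` as regressors. Keeping the PSU identifier on every row lets the variance estimator treat the $L - 1$ rows from one respondent as belonging to the same cluster rather than as independent observations.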

The article goes on to test the parallel-lines assumption. A simple example is presented in Section 2. Section 3 concludes with a discussion.

