An alternative way of estimating a cumulative logistic model with complex survey data
Section 2. A simple example


The National Survey on Drug Use and Health (NSDUH) is an annual survey of the civilian, noninstitutionalized population aged 12 or older living in the United States. Using NSDUH data from 2006 to 2010, we focus on a survey question given to adolescents (12-17) who received depression treatment in the past year:

During the past 12 months, how much has treatment or counseling helped you?

The viable responses were: Not at all (1); A little (2); Some (3); A lot (4); or Extremely (5).

We discarded missing and invalid responses both to this question and to the question of whether the respondent received depression treatment in the past year. We will return to this practice in the discussion section.

Using SAS, we estimated the following simple cumulative logistic model:

$E(y_{lk} \mid x_{k}) = \frac{\exp(α_{l} + \mathit{meds}_{k} β)}{1 + \exp(α_{l} + \mathit{meds}_{k} β)}$ for $l = 1, \dots, L - 1,$ (2.1)

where $\mathit{meds}_{k} = 1$ when respondent $k$ was taking medication for depression (0 otherwise). We fit the model with both pseudo-maximum-likelihood and the design-sensitive technique. For pseudo-maximum-likelihood estimation, we reversed the order of the responses, with $y_{1k} = 1$ when $k$ responded that treatment (or counseling) helped extremely, $y_{2k} = 1$ when $k$ responded that treatment helped extremely or a lot, $y_{3k} = 1$ when $k$ responded that treatment helped more than a little, and $y_{4k} = 1$ when $k$ responded that treatment helped at least a little. Finally, $y_{5k} = 1 - y_{4k} = 1$ when $k$ responded that treatment did not help at all. In SAS, this meant dependent variable $Y$ was set equal to 1 when treatment helped extremely, to 2 when treatment helped a lot, $\dots,$ and to 5 when treatment didn’t help at all.
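The response coding described above can be sketched in a few lines of Python. This is only an illustration, not the authors' SAS code (which is in the appendix): the function names `reverse_code` and `cumulative_indicators` are hypothetical, and the sketch assumes the survey responses are stored as integers from 1 (Not at all) to 5 (Extremely).

```python
def reverse_code(response):
    """Map the survey code (1 = Not at all, ..., 5 = Extremely)
    to Y (1 = Extremely, ..., 5 = Not at all)."""
    if response not in (1, 2, 3, 4, 5):
        raise ValueError("missing or invalid responses are discarded before this step")
    return 6 - response


def cumulative_indicators(y, n_levels=5):
    """Return [y_1, ..., y_{L-1}], where y_l = 1 when Y <= l."""
    return [1 if y <= l else 0 for l in range(1, n_levels)]


# A respondent who answered "Some" (survey code 3) has Y = 3.
y = reverse_code(3)
indicators = cumulative_indicators(y)
print(y, indicators)
```

Because the indicators are cumulative, once one of them equals 1 every later one equals 1 as well. That nesting is what lets equation (2.1) share a single $β$ across all $L - 1$ levels.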

For the design-sensitive technique, we created four observations from each respondent $k$ in a new data set. The $i^{\text{th}}$ observation was labeled $C = i,$ where C is a class (categorical) variable added to the model statement in SAS, and carried a dependent variable (D) equal to $y_{ik}$ in equation (2.1). We needed to add EVENT = “1” after D in the model statement because we were modeling when $D = 1.$
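The data expansion just described might look like the following Python sketch. It is a hedged illustration rather than the authors' SAS data step: the function name `expand_respondent`, the dictionary keys, and the example values are all hypothetical. It also assumes that each respondent's analysis weight is simply copied to the four new rows.

```python
def expand_respondent(respondent_id, y, meds, weight):
    """Create four stacked rows (C = 1..4) from one respondent.

    y is the reversed response (1 = Extremely, ..., 5 = Not at all),
    so D = y_ik = 1 when Y <= i.
    """
    return [
        {"id": respondent_id, "C": i, "D": 1 if y <= i else 0,
         "meds": meds, "weight": weight}
        for i in range(1, 5)
    ]


# Two made-up respondents (illustrative values, not NSDUH data).
respondents = [
    {"id": 1, "y": 2, "meds": 1, "weight": 1500.0},
    {"id": 2, "y": 5, "meds": 0, "weight": 900.0},
]

stacked = []
for r in respondents:
    stacked.extend(expand_respondent(r["id"], r["y"], r["meds"], r["weight"]))

for row in stacked:
    print(row)
```

In this layout a binary logistic regression of D on the class variable C and meds has one intercept-type term per level and a single slope for meds, the same structure as equation (2.1).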

SAS code for both estimation techniques is in the appendix. The NSDUH data set we used had 60 variance strata with two variance primary sampling units (PSUs) in each and analysis weights based on the probabilities of selection and unit response.

The parameter estimates from our pseudo-maximum-likelihood and design-sensitive SAS runs are displayed in Tables 2.1 and 2.2, respectively. In Table 2.1, Intercept $= i$ is the pseudo-maximum-likelihood estimate of $α_{i}$ in equation (2.1). The sum of the Intercept and $C = i$ in Table 2.2 is the design-sensitive estimate for $α_{i}$ when $i = 1, 2,$ or $3,$ while the design-sensitive estimate for $α_{4}$ is the Intercept in Table 2.2 minus the sum $[C = 1] + [C = 2] + [C = 3].$ Finally (and more simply), meds in both tables estimates $β.$

In all cases, estimates of the same parameter from the two tables are close. The estimated percent increase in the odds of reaching each level of satisfaction with treatment due to having taken drugs for depression (the estimate for $β$) is roughly 45% (in our discussion of the results of the logistic regressions, we treat differences of the log odds as equal to percent differences in the odds, even though this is only approximately true). That near equality suggests that the parallel-lines assumption is not violated by our NSDUH data.
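To give a sense of how rough the log-odds approximation is, the short Python sketch below compares it with the exact percent change in the odds, $100(e^{β} - 1).$ It uses only the meds estimates reported in Tables 2.1 and 2.2. The variable names are ours, and the comparison is an illustration, not part of the authors' analysis.

```python
import math

beta_pml = 0.4516   # meds estimate, Table 2.1
beta_ds = 0.4498    # meds estimate, Table 2.2

for label, beta in [("pseudo-maximum-likelihood", beta_pml),
                    ("design-sensitive", beta_ds)]:
    odds_ratio = math.exp(beta)          # multiplicative effect on the odds
    exact_pct = 100 * (odds_ratio - 1)   # exact percent increase in the odds
    approx_pct = 100 * beta              # log-odds difference read as a percent
    print(f"{label}: approximation = {approx_pct:.1f}%, exact = {exact_pct:.1f}%")
```

For both estimates the exact increase in the odds is near 57%, so the 45% reading is a convenient shorthand for the log-odds difference, not an exact percent change.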

Table 2.1
Pseudo-maximum-likelihood estimates for the simple cumulative logistic model
Parameter	Estimate	Standard Error	t Value	Pr > |t|
Intercept 1	-2.2917	0.0913	-25.10	< 0.0001
Intercept 2	-0.7617	0.0685	-11.11	< 0.0001
Intercept 3	0.2511	0.0624	4.02	0.0002
Intercept 4	1.3695	0.0739	18.53	< 0.0001
meds	0.4516	0.0965	4.68	< 0.0001
NOTE: The degrees of freedom for the t tests is 60.

Table 2.2
Design-sensitive estimates for the simple cumulative logistic model
Parameter	Estimate	Standard Error	t Value	Pr > |t|
Intercept	-0.3591	0.0583	-6.16	< 0.0001
C 1	-1.9329	0.0592	-32.63	< 0.0001
C 2	-0.4039	0.0356	-11.33	< 0.0001
C 3	0.6087	0.0392	15.52	< 0.0001
meds	0.4498	0.0955	4.71	< 0.0001
NOTE: The degrees of freedom for the t tests is 60.
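The correspondence between the two parameterizations can be checked with a few lines of arithmetic. The Python sketch below copies the estimates from Tables 2.1 and 2.2 and applies the rule stated in the text (Intercept plus $C = i$ for $i = 1, 2, 3,$ and Intercept minus the sum of the three C terms for $i = 4$). The variable names are ours.

```python
# Estimates copied from Tables 2.1 and 2.2.
pml_intercepts = [-2.2917, -0.7617, 0.2511, 1.3695]   # Table 2.1: Intercept 1..4
ds_intercept = -0.3591                                # Table 2.2: Intercept
ds_c = [-1.9329, -0.4039, 0.6087]                     # Table 2.2: C 1, C 2, C 3

# alpha_i = Intercept + [C = i] for i = 1, 2, 3;
# alpha_4 = Intercept - ([C = 1] + [C = 2] + [C = 3]).
ds_alphas = [ds_intercept + c for c in ds_c]
ds_alphas.append(ds_intercept - sum(ds_c))

for i, (pml, ds) in enumerate(zip(pml_intercepts, ds_alphas), start=1):
    print(f"alpha_{i}: PML = {pml:.4f}, design-sensitive = {ds:.4f}, "
          f"diff = {ds - pml:+.4f}")
```

Each of the four differences is smaller than 0.002 in absolute value, which is the sense in which the two sets of intercept estimates are close.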

The parallel-lines assumption can be tested directly by adding a class variable M to the design-sensitive data set with

$\begin{array}{l} M = 1 \text{ when } C = 1 \text{ and } \mathit{meds} = 1, \\ M = 2 \text{ when } C = 2 \text{ and } \mathit{meds} = 1, \\ M = 3 \text{ when } C = 3 \text{ and } \mathit{meds} = 1, \text{ and} \\ M = 4 \text{ otherwise}. \end{array}$
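The definition of M can be written as a small function. The Python sketch below is illustrative only. The name `m_level` is hypothetical, and the authors built the variable in SAS.

```python
def m_level(c, meds):
    """Return M: equal to C when meds = 1 and C is 1, 2, or 3; otherwise 4."""
    if meds == 1 and c in (1, 2, 3):
        return c
    return 4


# Every combination of C (1..4) and meds (0 or 1).
for meds in (0, 1):
    for c in range(1, 5):
        print(f"C = {c}, meds = {meds} -> M = {m_level(c, meds)}")
```

Rows with meds = 0 all fall in the catch-all level M = 4, so M can only shift the linear predictor for medicated respondents at the first three levels.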

When added to the model statement in SAS, the class variable M captures the differing impacts of taking medication for depression in the previous year on the levels of satisfaction with treatment. For example, the estimated percent increase in the odds of being extremely pleased by treatment due to having taken drugs for depression during the year is, according to Table 2.3, 0.3816 (from $\mathit{meds}$) plus 0.0717 (from M = 1), or 45.33%. The other percent increases are lower, but none is significantly different from the others, as the extremely low F value for M in Table 2.4 shows. In addition, none of the $t$-values for an M in Table 2.3 is significant even at the 0.5 level (10 times larger than the standard 0.05 level).
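The arithmetic behind the 45.33% figure can be reproduced from Table 2.3. The Python sketch below applies the same additive reading (meds plus the M term) to M = 2 and M = 3. Extending it beyond M = 1 is our assumption, since the text works through only the first case. The fourth level is left out because its value depends on how SAS parameterizes the class variable M, which this section does not state.

```python
# Estimates copied from Table 2.3.
meds = 0.3816
m = {1: 0.0717, 2: 0.0234, 3: -0.0236}

# Additive reading used in the text for M = 1; applying it to
# M = 2 and M = 3 is an assumption made for illustration.
effects = {i: meds + m_i for i, m_i in m.items()}

for i, effect in effects.items():
    print(f"C = {i}: meds + [M = {i}] = {effect:.4f} "
          f"(about {100 * effect:.2f}%)")
```

Under that reading the sums for M = 2 and M = 3 are both below the 0.4533 obtained for M = 1, in line with the statement that the other percent increases are lower.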

Table 2.3
Estimating the general cumulative logistic model
Parameter	Estimate	Standard Error	t Value	Pr > |t|
Intercept	-0.2919	0.1270	-2.30	0.0251
C 1	-1.9636	0.0806	-24.37	< 0.0001
C 2	-0.4104	0.0440	-9.33	< 0.0001
C 3	0.6202	0.0490	12.66	< 0.0001
Meds	0.3816	0.1452	2.63	0.0109
M 1	0.0717	0.1273	0.56	0.5754
M 2	0.0234	0.0652	0.36	0.7215
M 3	-0.0236	0.0719	-0.33	0.7439
NOTE: The degrees of freedom for the t tests is 60.

Table 2.4
F tests for the general cumulative logistic model
Effect	F Value	Num DF	Den DF	Pr > F
C	280.39	3	58	< 0.0001
Meds	6.91	1	60	0.0109
M	0.16	3	58	0.9239

ISSN : 1492-0921

Editorial policy

Survey Methodology publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves. All papers will be refereed. However, the authors retain full responsibility for the contents of their papers and opinions expressed are not necessarily those of the Editorial Board or of Statistics Canada.

Submission of Manuscripts

Survey Methodology is published twice a year in electronic format. Authors are invited to submit their articles in English or French in electronic form, preferably in Word, to the Editor (statcan.smj-rte.statcan@canada.ca, Statistics Canada, 150 Tunney’s Pasture Driveway, Ottawa, Ontario, Canada, K1A 0T6). For formatting instructions, please see the guidelines provided in the journal and on the web site (www.statcan.gc.ca/SurveyMethodology).

Note of appreciation

Canada owes the success of its statistical system to a long-standing partnership between Statistics Canada, the citizens of Canada, its businesses, governments and other institutions. Accurate and timely statistical information could not be produced without their continued co-operation and goodwill.

Standards of service to the public

Statistics Canada is committed to serving its clients in a prompt, reliable and courteous manner. To this end, the Agency has developed standards of service which its employees observe in serving its clients.

Copyright

Published by authority of the Minister responsible for Statistics Canada.

Use of this publication is governed by the Statistics Canada Open Licence Agreement.

Catalogue No. 12-001-X

Frequency: Semi-annual

Ottawa

Date modified: 2019-07-04
