# Appendix I: Multivariate logistic regression


For binary logistic regression, goodness of fit is measured by a transformation of the maximized likelihood (L) such that,

(A1) Goodness of Fit = -2 log(L), where L (likelihood) = Π_{y_{i} = 1} p_{i} · Π_{y_{i} = 0} (1 - p_{i}).

The difference in -2 log(L) between two nested models is approximately
distributed as a χ2 with degrees of freedom equal to the number of added
variables. Since the likelihood is a product of probabilities and thus cannot
exceed 1, its log will be negative. As the fit improves, the maximized
likelihood increases toward 1, its log also increases (although it remains
negative) and -2 log(L) becomes smaller. An additional variable improves the
model fit if it reduces the -2 log likelihood (Table I). The initial 'control' model included social and demographic characteristics
(x_{1}, x_{2}, and x_{4} to x_{6}) such as age, income and location, found to be of importance
to Internet use. The 'full' model adds online behaviours (depth, breadth
and experience - x_{3}, x_{7} and x_{8}) as well as credit card concern, x_{9}. The 'final'
model treats experience as a categorical variable; all variables remain significant
while the overall fit of the model is improved (Table I).
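The calculation behind equation (A1) and the model comparison can be sketched as follows. The outcomes and fitted probabilities here are illustrative placeholders, not the values behind Table I:

```python
import math

# Hypothetical outcomes (1 = uses the Internet) and fitted probabilities
# from two nested models; the numbers are illustrative only.
y = [1, 0, 1, 1, 0, 1, 0, 1]
p_control = [0.7, 0.4, 0.6, 0.8, 0.3, 0.5, 0.5, 0.6]   # 'control' model
p_full    = [0.9, 0.2, 0.8, 0.9, 0.1, 0.7, 0.3, 0.8]   # 'full' model

def neg2_log_likelihood(y, p):
    """-2 log(L), with L = prod(p_i over y_i = 1) * prod(1 - p_i over y_i = 0)."""
    log_l = sum(math.log(pi) if yi == 1 else math.log(1 - pi)
                for yi, pi in zip(y, p))
    return -2 * log_l

d_control = neg2_log_likelihood(y, p_control)
d_full = neg2_log_likelihood(y, p_full)

# The drop in -2 log(L) from adding variables is the likelihood-ratio
# statistic, approximately chi-square with df = number of added variables.
lr_stat = d_control - d_full
```

Because the full model assigns higher probabilities to the observed outcomes, its -2 log(L) is smaller, mirroring the improvement reported in Table I.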

The Nagelkerke statistic is a pseudo R^{2} that attempts to provide
a logistic analogy to the R^{2} in Ordinary Least Squares (OLS). Although
the Nagelkerke statistic varies from 0 to 1, as does R^{2} in
OLS, it does not indicate the proportion of variance explained by the predictors
(UCLA, 2004). Rather, it indicates the proportion of unaccounted variance
that is reduced by adding variables to the model compared to the null model
(i.e. just the constant). The Nagelkerke value increased from 0.102 in
the control model to 0.300 in the final model.
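The Nagelkerke statistic rescales the Cox and Snell pseudo R^{2} so that its maximum is 1. A minimal sketch of the calculation, using assumed log-likelihoods and sample size (not the values behind the 0.102 and 0.300 figures):

```python
import math

# Assumed inputs for illustration only.
n = 200            # sample size
ll_null = -138.6   # log-likelihood of the null (constant-only) model
ll_model = -110.2  # log-likelihood of the fitted model

# Cox and Snell pseudo R^2: 1 - (L_null / L_model)^(2/n)
r2_cox_snell = 1 - math.exp((2.0 / n) * (ll_null - ll_model))

# Nagelkerke divides by the maximum attainable Cox and Snell value,
# so the rescaled statistic can reach 1.
r2_max = 1 - math.exp((2.0 / n) * ll_null)
r2_nagelkerke = r2_cox_snell / r2_max
```

As in the text, the result indicates the proportional reduction in unaccounted variation relative to the null model, not variance explained in the OLS sense.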

For continuous variables, the interpretation of slope coefficients is similar to that of OLS regression. For discrete predictor variables, the regression coefficient (B) equals the log odds ratio of Internet use between the two levels of the predictor. Odds are defined as p/q or p/(1 - p), where p is the probability of the event and q = (1 - p). The log odds ratio is defined as:

(A2) ln{[p1/(1 - p1)] / [p0/(1 - p0)]}.
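Equation (A2) can be worked through with hypothetical probabilities of Internet use for the two levels of a binary predictor (the values 0.8 and 0.5 are illustrative, not from the study):

```python
import math

# Hypothetical probabilities of Internet use for the two levels
# of a binary predictor (x = 1 vs x = 0); values are illustrative.
p1, p0 = 0.8, 0.5

odds1 = p1 / (1 - p1)   # odds of use when x = 1
odds0 = p0 / (1 - p0)   # odds of use when x = 0

# Equation (A2): the log odds ratio, which equals the coefficient B.
log_odds_ratio = math.log(odds1 / odds0)

# exp(B) recovers the odds ratio, the form most software reports.
odds_ratio = math.exp(log_odds_ratio)
```

Here the odds are 4.0 and 1.0 respectively, so B = ln(4) ≈ 1.39 and exp(B) = 4: the odds of Internet use are four times higher for the x = 1 group.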
