Health Reports
Evaluating the psychometric properties of the parent-rated Strengths and Difficulties Questionnaire in a nationally representative sample of Canadian children and adolescents aged 6 to 17 years

by Matt D. Hoffmann, Justin J. Lang, Michelle D. Guerrero, Jameason D. Cameron, Gary S. Goldfield, Heather M. Orpana and Margaret de Groh

Release date: August 19, 2020

DOI: https://www.doi.org/10.25318/82-003-x202000800002-eng

Mental health difficulties are prevalent among children and adolescents across the world.Note 1 This is particularly important from a public health perspective because many psychological and behavioural disorders often begin in childhood.Note 2 The World Health Organization has cited poor mental health as one of the leading causes of disability and economic burden.Note 3 Between 1996 to 1997 and 2009 to 2010, the use of health care services for mental illness in Canada increased by 35% for children and 44% for adolescents. However, because not all children and adolescents with mental health difficulties access health care services,Note 4 estimates based on administrative data from health care services may underestimate the prevalence of mental health difficulties in these populations. Therefore, identifying brief measures of mental health in children and adolescents that demonstrate evidence of validity is critical to informing screening, population-level surveillance and prevention strategies.

The Strengths and Difficulties Questionnaire (SDQ) is a brief, widely used measure of children’s and adolescents’ social, emotional and behavioural difficulties,Note 5Note 6 or mental health difficulties.Note 7 The SDQ was designed to be completed by parents and teachers of children and adolescents aged 4 to 16, or by 11- to 16-year-olds themselves.Note 5Note 8 Researchers have extended this age range to include 17-year-olds.Note 9Note 10Note 11

In Canada, the SDQ has been completed by parents or guardians of children and adolescents as part of the Canadian Health Measures Survey (CHMS). For the purposes of this study, the SDQ completed by parents or guardians as part of the CHMS is hereafter referred to as the parent-rated SDQ. Despite its inclusion in the CHMS, the psychometric properties of the parent-rated SDQ have yet to be evaluated with a nationally representative sample of Canadian children and adolescents.

The SDQ is a 25-item survey that consists of five subscales.Note 5 Four of the subscales (emotional symptoms, conduct problems, peer problems and hyperactivity) assess aspects of mental health difficulties. The fifth subscale, prosocial behaviour, represents one of the few available measures of positive mental health among children and adolescents. The results of several exploratory factor analyses have largely supported the five-factor structure of the parent-rated SDQ.Note 6Note 12Note 13 However, certain studies that used confirmatory factor analysis (CFA) have found mixed results concerning the factorial validity of the five-factor structure.Note 7Note 14Note 15Note 16Note 17Note 18Note 19Note 20

For instance, CFA studies using nationally representative parent-rated SDQ data have reported inadequate model fit statistics for the five-factor structure in GermanyNote 19 and Ireland,Note 17 but have supported the five-factor structure in the United StatesNote 14 and (after including some correlated error terms) in Great Britain.Note 7 These mixed findings indicate that the factorial validity of the parent-rated SDQ may differ by country.

Building on previous research, the CFA study from Great Britain also tested an alternative, theoretically grounded three-factor structure that comprised “internalizing” (emotional symptoms and peer problems items), “externalizing” (conduct problems and hyperactivity items), and prosocial factors. This structure was found to have a generally poor model fit.Note 7 However, a similar model that involved higher-order internalizing and externalizing factors, alongside the prosocial factor, had a nearly identical model fit to the five-factor SDQ structure.Note 7 Goodman et al.Note 7 suggested that models that incorporate the broader internalizing and externalizing factors may be more appropriate than the five-factor structure for evaluating large samples of the general population that are at low risk of mental health difficulties.

In Canada, researchers recently explored the factor structure of the parent-rated SDQ using a community sample of 501 children aged 6 to 9.Note 21 The results of the CFA indicated that the original five-factor SDQ model fit the data well. In general, the subscale scores had acceptable internal consistency when calculated using alpha coefficients, and strong internal consistency based on composite reliability scores. Although these results support the five-factor structure of the parent-rated SDQ with a Canadian sample,Note 21 researchers have yet to conduct a large-scale investigation of the factorial validity of this measure with Canadian children and adolescents.

To this end, the purpose of this study was to examine the psychometric properties of the parent-rated SDQ with a nationally representative sample of Canadian children and adolescents aged 6 to 17. Specific objectives included (1) assessing the factorial validity of the original five-factor SDQ structure, as well as the Goodman et al.Note 7 first-order (three-factor) and higher-order internalizing and externalizing models; (2) assessing the reliability of the SDQ’s subscale scores; and (3) testing for measurement invariance across groups (i.e., male vs. female, children vs. adolescents, English questionnaire vs. French questionnaire).

Data and methods

Participants

The data used are from a subsample of participants aged 6 to 17 from cycle 1 (2007 to 2009), cycle 2 (2009 to 2011), cycle 3 (2012 to 2013) and cycle 4 (2014 to 2015) of the CHMS. The CHMS is an ongoing cross-sectional survey used to collect nationally representative health and wellness data on Canadians. CHMS data are collected from individuals living in households in the 10 provinces. The CHMS does not collect data from individuals living in the three territories, those living on reserves and Aboriginal settlements, full-time members of the Canadian Forces, institutionalized individuals, and individuals living in certain remote areas.Note 22 Individuals excluded from the CHMS represent roughly 4% of the target population.Note 22 Data collection for the CHMS consists of a household interview (demographic questions and a general health interview), followed by an in-person visit to a mobile examination centre for clinical and laboratory tests. The SDQ is completed as part of the household interview. A detailed overview of the CHMS sampling methodology and survey procedures is available elsewhere.Note 23 Sampling and bootstrap weights are provided by Statistics Canada.

Canadians aged 6 to 79 participated in cycle 1 of the CHMS, and Canadians aged 3 to 79 participated in cycles 2, 3 and 4. In total, 21,827 individuals (51.1% female) aged 6 to 79 participated in cycles 1 through 4 of the CHMS. The overall combined response rate for all four cycles for Canadians aged 6 to 79 was 53.1%.Note 24

The SDQ was completed for children and adolescents aged 6 to 17 in cycles 1 and 2 of the CHMS, and for children and adolescents aged 4 to 17 in cycles 3 and 4. For consistency across cycles, the lower age limit for this study was 6 years. Therefore, the total sample in this study included 7,451 individuals aged 6 to 17 (49.3% female). This was a representative sample of children and adolescents based on Canadian demographics, and individuals were selected through a multistage stratified random sampling procedure.Note 22 Participants with missing data generally had no valid SDQ responses. Thus, the sample was further reduced using listwise deletion so that only individuals with complete data for at least one of the original five SDQ factors were retained. Further information regarding sample size retained for each of the five original SDQ factors is provided in Table 1.

All CFA-related analyses in this study included data from 6,960 individuals (491 individuals were excluded, 6.6% of the total sample), with the exception of testing for survey language invariance, which comprised 6,904 individuals (547 individuals were excluded, 7.3% of the total sample). All CFA-related analyses used pairwise deletion to handle missing data.

Statistics Canada obtained ethics approval for the CHMS from the Health Canada and Public Health Agency of Canada Research Ethics Board. Participation in the CHMS was voluntary. Informed consent was provided by all respondents (i.e., parents or guardians of children and adolescents) prior to participating in the CHMS.Note 25

Parent-rated Strengths and Difficulties Questionnaire

A parent or guardian answered all SDQNote 5 items by reflecting on their child’s behaviour over the past six months. As previously noted, the SDQ includes five subscales: emotional symptoms (five items), conduct problems (five items), peer problems (five items), hyperactivity (five items) and prosocial behaviour (five items). All items are scored on a three-point Likert-type scale: 0 (not true), 1 (somewhat true) and 2 (certainly true). Higher scores on the four difficulties subscales reflect greater difficulties in those areas, whereas higher scores on the prosocial behaviour subscale indicate a strength. Some items within the difficulties subscales are reverse-scored because of positive wording. For the Goodman et al.Note 7 alternative first-order and higher-order factor structures, SDQ items are allocated to the respective factors.
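The scoring rule described above can be sketched in a few lines. This is only an illustration, not the CHMS processing code: the item labels, the choice of which items are reverse-scored and the responses are all hypothetical, and the reversal rule (2 minus the response) is the usual convention for a 0-to-2 scale rather than something stated in this article.

```python
from typing import Dict, Iterable

VALID_RESPONSES = (0, 1, 2)  # not true, somewhat true, certainly true


def score_subscale(responses: Dict[str, int],
                   items: Iterable[str],
                   reverse_items: Iterable[str] = ()) -> int:
    """Sum item responses into a subscale score, flipping reverse-scored items."""
    reverse = set(reverse_items)
    total = 0
    for item in items:
        value = responses[item]
        if value not in VALID_RESPONSES:
            raise ValueError(f"{item}: expected 0, 1 or 2, got {value!r}")
        # Positively worded difficulty items are flipped (assumed rule: 2 - value)
        # so that a higher score always means greater difficulty.
        total += (2 - value) if item in reverse else value
    return total


# Hypothetical item labels, reverse-scored set and responses (illustration only).
hyperactivity_items = ["hyp_1", "hyp_2", "hyp_3", "hyp_4", "hyp_5"]
hypothetical_reverse = ["hyp_4", "hyp_5"]
example_responses = {"hyp_1": 2, "hyp_2": 1, "hyp_3": 0, "hyp_4": 2, "hyp_5": 0}

hyperactivity_score = score_subscale(
    example_responses, hyperactivity_items, hypothetical_reverse
)
print(hyperactivity_score)
```

With five items scored 0 to 2, each subscale score falls between 0 and 10.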

Statistical analyses

Descriptive statistics and polychoric correlations were calculated in SAS EG 7.1 (SAS Institute Inc., Cary, North Carolina, United States). All analyses in SAS incorporated sampling weights to account for the CHMS’s complex survey design. Polychoric correlations were calculated using normalized weights. Bootstrap weights were used to estimate variance, using the balanced repeated replication method, with degrees of freedom set to 46.Note 24

CFAs were conducted using the latent variable modelling program Mplus, version 7 (Muthén & Muthén, Los Angeles, California, United States). Because CFA is a construct-confirming technique, analyses were conducted using unweighted data.Note 26Note 27 The weighted least squares mean and variance-adjusted estimator was used, which is appropriate for ordered categorical (ordinal) variables, whereby normality is not assumed.Note 28 The following fit indices were used to evaluate model fit: comparative fit index (CFI), Tucker-Lewis index (TLI) and root mean square error of approximation (RMSEA). Model fit was deemed acceptable if CFI and TLI values were ≥0.90 and RMSEA was ≤0.08.Note 29Note 30 Model fit was considered good if CFI and TLI values were ≥0.95 and RMSEA was ≤0.06.Note 31
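The cut-offs above can be expressed as a small per-index classifier. This is a minimal sketch: the function names and example values are hypothetical, and the label "not acceptable" for values that miss the lower cut-off is an assumption. Each index is judged separately, which matches how the results are reported (for example, acceptable on CFI and TLI but good on RMSEA).

```python
def classify_incremental_index(value: float) -> str:
    """Classify a CFI or TLI value (higher is better)."""
    if value >= 0.95:
        return "good"
    if value >= 0.90:
        return "acceptable"
    return "not acceptable"


def classify_rmsea(value: float) -> str:
    """Classify an RMSEA value (lower is better)."""
    if value <= 0.06:
        return "good"
    if value <= 0.08:
        return "acceptable"
    return "not acceptable"


# Hypothetical fit statistics, for illustration only.
example_fit = {"CFI": 0.96, "TLI": 0.92, "RMSEA": 0.07}

example_labels = {
    "CFI": classify_incremental_index(example_fit["CFI"]),
    "TLI": classify_incremental_index(example_fit["TLI"]),
    "RMSEA": classify_rmsea(example_fit["RMSEA"]),
}
print(example_labels)
```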

The chi-square statistic (χ²) was also reported, although it was not used to assess model fit because of its well-documented sensitivity to large sample sizes.Note 32 Factor loadings were required to be 0.32 or higher for items to be interpreted. For loadings, >0.71 was considered excellent, >0.63 was very good, >0.55 was good, >0.45 was fair, and between 0.32 and 0.45 was poor.Note 33Note 34
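The loading bands above translate directly into a lookup. In this sketch, the function name and the label for loadings below 0.32 are hypothetical. It also assumes positive standardized loadings and treats a loading of exactly 0.45 as "poor", following the wording of the bands.

```python
def classify_loading(loading: float) -> str:
    """Map a standardized factor loading to the descriptive bands in the text."""
    if loading > 0.71:
        return "excellent"
    if loading > 0.63:
        return "very good"
    if loading > 0.55:
        return "good"
    if loading > 0.45:
        return "fair"
    if loading >= 0.32:
        return "poor"
    # Below the 0.32 minimum, the item would not be interpreted.
    return "not interpreted"


# Hypothetical loadings, for illustration only.
example_loadings = [0.90, 0.64, 0.56, 0.46, 0.45, 0.30]
example_bands = [classify_loading(x) for x in example_loadings]
print(example_bands)
```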

The model fit statistics for the three competing SDQ models (i.e., five-factor, three-factor and higher-order model) were examined to compare the adequacy of the fit of the different models. Models can be considered indistinguishable when the change in CFI is less than 0.010 and the change in RMSEA is less than 0.015.Note 35

Changes in model fit statistics were also inspected to test measurement invariance. There is support for a more constrained model when the CFI decreases by less than 0.010 and the RMSEA increases by less than 0.015.Note 35 Configural, metric and scalar invariance were tested to determine whether the underlying factor structure of the SDQ was the same across groups. Configural invariance assesses the fit of the factor structure when there are no invariance constraints imposed across groups. Metric invariance assesses the invariance of factor loadings across groups. Scalar invariance assesses the invariance of both factor loadings and item intercepts across groups. Metric invariance was assessed after configural invariance was established, and scalar invariance was tested after metric invariance was established. There is strong measurement invariance when all three tests demonstrate invariance across groups.Note 36
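The sequential decision rule described above can be sketched as follows. The fit values and function names are hypothetical, and the sketch assumes the configural model has already been judged to fit acceptably. It applies only the change-in-CFI and change-in-RMSEA criteria, not a full invariance analysis.

```python
from typing import Dict, Tuple

CFI_DROP_LIMIT = 0.010    # CFI must decrease by less than this
RMSEA_RISE_LIMIT = 0.015  # RMSEA must increase by less than this

Fit = Tuple[float, float]  # (CFI, RMSEA)


def supports_constraints(less_constrained: Fit, more_constrained: Fit) -> bool:
    """True when the more constrained model is supported by both criteria."""
    # Rounding guards against floating-point noise in the subtraction.
    cfi_drop = round(less_constrained[0] - more_constrained[0], 6)
    rmsea_rise = round(more_constrained[1] - less_constrained[1], 6)
    return cfi_drop < CFI_DROP_LIMIT and rmsea_rise < RMSEA_RISE_LIMIT


def highest_invariance_level(fits: Dict[str, Fit]) -> str:
    """Step from configural to metric to scalar, stopping at the first failure."""
    level = "configural"
    for previous, current in (("configural", "metric"), ("metric", "scalar")):
        if not supports_constraints(fits[previous], fits[current]):
            break
        level = current
    return level


# Hypothetical (CFI, RMSEA) pairs for one grouping variable.
hypothetical_fits = {
    "configural": (0.925, 0.047),
    "metric": (0.924, 0.046),
    "scalar": (0.921, 0.046),
}
supported_level = highest_invariance_level(hypothetical_fits)
print(supported_level)
```

A result of "scalar" corresponds to what the text calls strong measurement invariance.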

In this study, these tests were performed with respect to sex (male vs. female), age (children [6 to 9 years] vs. adolescents [10 to 17 years]) and survey language (English vs. French). Age group classifications were based on guidelines from the World Health Organization,Note 37 which defines adolescents as individuals aged 10 to 19 years.

Finally, standardized factor loadings were used to calculate composite reliability scores for the latent factors.Note 38 Composite reliability was calculated because of increasing criticisms surrounding the use of the alpha coefficient as a measure of reliability.Note 39Note 40
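The article does not print the formula it used. A commonly used expression for composite reliability from standardized loadings λ, assuming uncorrelated error terms, is (Σλ)² / [(Σλ)² + Σ(1 − λ²)]. The sketch below implements that expression as an assumption about the calculation, with hypothetical loadings.

```python
from typing import Sequence


def composite_reliability(loadings: Sequence[float]) -> float:
    """Composite reliability from standardized loadings (assumed formula).

    Assumes each item's error variance is 1 - loading**2 and that error
    terms are uncorrelated.
    """
    if not loadings:
        raise ValueError("at least one loading is required")
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1.0 - lam ** 2 for lam in loadings)
    return squared_sum / (squared_sum + error_variance)


# Hypothetical standardized loadings for a five-item factor.
hypothetical_loadings = [0.70, 0.70, 0.70, 0.70, 0.70]
cr = composite_reliability(hypothetical_loadings)
print(round(cr, 2))
```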

Results

Because the model fit statistics for the five-factor model were not inferior to those of the alternative models (see results below)—and because the five-factor model is the original, empirically derived SDQ model—descriptive statistics, correlations, internal consistencies (composite reliability), item-by-item factor loadings and invariance testing results are reported for the five-factor model only.

Descriptive statistics for the five SDQ factors are reported in Table 1. Among the four difficulties subscales, males and females scored lowest (best) on conduct problems and highest (worst) on hyperactivity. Children and adolescents scored lowest (best) on peer problems and conduct problems, respectively, while both children and adolescents scored highest (worst) on hyperactivity. Participants whose parents completed the SDQ in English scored lowest (best) on conduct problems and highest (worst) on hyperactivity, while those whose parents completed the SDQ in French scored lowest (best) on peer problems and highest (worst) on hyperactivity.

In terms of prosocial behaviour, males had a mean score of 8.85 and females had a mean score of 9.31. Children had a mean score of 9.08 and adolescents had a mean score of 9.06. Participants whose parents completed the SDQ in English had a mean score of 9.10, and those whose parents completed the SDQ in French had a mean score of 8.95.

Polychoric correlations among the five SDQ factors are reported in Table 2. Correlations were positive among the four difficulties factors, while the prosocial factor correlated negatively with the four difficulties factors.

According to the CFA results, the five-factor model fit was acceptable based on CFI (0.923) and TLI (0.913) values, and good based on the RMSEA (0.048; 90% confidence interval [CI] [0.047 to 0.049]), χ² (265) = 4,523.73, p < 0.001. Similarly, the higher-order model fit was acceptable based on CFI (0.920) and TLI (0.910) values, and good based on the RMSEA (0.049; 90% CI [0.047 to 0.050]), χ² (268) = 4,688.34, p < 0.001. Although the higher-order model fit appeared to be slightly inferior to the five-factor model fit, it could not be empirically distinguished from the fit of the five-factor model (∆ CFI = -0.003; ∆ TLI = -0.003; ∆ RMSEA = +0.001).

Finally, CFA results indicated that the three-factor model failed to reach acceptable model fit based on CFI (0.883) and TLI (0.871) values, but that it had good model fit based on the RMSEA (0.058; 90% CI [0.057 to 0.060]), χ² (272) = 6,718.60, p < 0.001. The three-factor model fit was generally inferior to the five-factor model fit (∆ CFI = -0.040; ∆ TLI = -0.042; ∆ RMSEA = +0.010) and the higher-order model fit (∆ CFI = -0.037; ∆ TLI = -0.039; ∆ RMSEA = +0.009).
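The reported differences can be reproduced from the fit statistics given in this section. The sketch below uses only the CFI, TLI and RMSEA values reported above. The function names are hypothetical, and the "indistinguishable" check simply applies the thresholds stated in the methods (a CFI change below 0.010 and an RMSEA change below 0.015) to absolute differences.

```python
from typing import Dict

# CFI, TLI and RMSEA values as reported in the results.
reported_fit = {
    "five_factor": {"CFI": 0.923, "TLI": 0.913, "RMSEA": 0.048},
    "higher_order": {"CFI": 0.920, "TLI": 0.910, "RMSEA": 0.049},
    "three_factor": {"CFI": 0.883, "TLI": 0.871, "RMSEA": 0.058},
}


def fit_deltas(candidate: str, reference: str) -> Dict[str, float]:
    """Candidate minus reference for each index, rounded to three decimals."""
    return {
        index: round(reported_fit[candidate][index] - reported_fit[reference][index], 3)
        for index in ("CFI", "TLI", "RMSEA")
    }


def indistinguishable(candidate: str, reference: str) -> bool:
    """Apply the change-in-CFI and change-in-RMSEA thresholds from the methods."""
    deltas = fit_deltas(candidate, reference)
    return abs(deltas["CFI"]) < 0.010 and abs(deltas["RMSEA"]) < 0.015


higher_vs_five = fit_deltas("higher_order", "five_factor")
three_vs_five = fit_deltas("three_factor", "five_factor")
three_vs_higher = fit_deltas("three_factor", "higher_order")

print(higher_vs_five, indistinguishable("higher_order", "five_factor"))
print(three_vs_five, indistinguishable("three_factor", "five_factor"))
print(three_vs_higher, indistinguishable("three_factor", "higher_order"))
```

Under these thresholds, the three-factor model is separated from the other two by its CFI change alone, because its RMSEA change stays below 0.015.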

All standardized factor loadings for the five-factor model were significant (p < 0.001) and ranged from 0.45 to 0.90 (Table 3). Composite reliability scores provided strong support for the internal consistency of all five factors (all scores ≥ 0.79; Table 3). For the higher-order model, all first-order factor loadings were significant (p < 0.001) and ranged from 0.45 to 0.90. All second-order factor loadings were significant (p < 0.001) and ranged from 0.75 to 0.98. All factor loadings for the three-factor model were significant (p < 0.001) and ranged from 0.40 to 0.89.

The results of measurement invariance testing for age (children vs. adolescents), sex (male vs. female) and survey language (English vs. French) provided evidence of satisfactory model fit (Table 4). All changes in fit indices pertaining to configural, metric and scalar invariance were well within acceptable ranges.Note 35 Therefore, there was support for strong measurement invariance across sex, age and language groups.

Discussion

The results of this study provide evidence for the factorial validity and reliability of the parent-rated SDQ with a nationally representative sample of Canadian children and adolescents. Specifically, the overall soundness of the original five-factor modelNote 5 was supported using the parent-rated SDQ for children and adolescents aged 6 to 17. The five-factor model was also found to be invariant with respect to sex, age and survey language, highlighting the measure’s robustness across different subsamples of the Canadian youth population. The findings of this study complement over two decades of international research examining the factor structure of the parent-rated SDQ.Note 6Note 13Note 14Note 16

When examining the five-factor structure of the parent-rated SDQ, some CFA studies that used nationally representative samples reported poor model fit statistics on particular fit indices.Note 17Note 19 However, the results of all three (standard) fit indices in this study supported the five-factor structure. The five-factor SDQ structure was also confirmed in a recent CFA study with a community sample of Canadian childrenNote 21 and in other CFA studies that used nationally representative samples from the United StatesNote 14 and Great Britain.Note 7 Similar to this study, a study by He et al.Note 14 that used a sample of American adolescents found support for the five-factor model without needing to correlate errors. In contrast, Goodman et al.Note 7 found support for the five-factor model with a sample of children and adolescents from Great Britain, but only after correlating some error terms.

This study’s findings also support the Goodman et al.Note 7 higher-order internalizing and externalizing model. This makes sense since the higher-order model contains the same underlying factor structure as the five-factor model. The higher-order model thus appears to be a viable alternative for researchers who want to examine parent-rated SDQ data from an internalizing and externalizing perspective. In contrast, this study found that the three-factor internalizing and externalizing model was not a good fit to the data, according to certain fit indices. These results align with those of Goodman et al.,Note 7 who concluded that a first-order model that included internalizing and externalizing factors was not a viable simplification of the five-factor model. This study’s results are also congruent with those of McCrory and Layte,Note 17 who found that the three-factor model was particularly problematic with a nationally representative sample of children from Ireland. In sum, the findings from this study suggest that the internalizing and externalizing factors demonstrate evidence of factorial validity in the context of the higher-order model, but not the first-order (three-factor) model.

Strengths and limitations

This study used data from a nationally representative sample of Canadian children and adolescents that spanned four CHMS cycles (2007 to 2015). The study included Canadian youth aged 6 to 17, resulting in an adequate representation of both children and adolescents. Parent-rated SDQ data demonstrated acceptable model fit using a stringent confirmatory technique (i.e., CFA) that allowed for no cross-loadings on unintended factors. Exploratory structural equation modelling, which is a more flexible technique because it allows for cross-loadings on unintended factors, was not required to achieve acceptable model fit in this study. Furthermore, the majority of SDQ items had factor loadings that could be considered “very good.”Note 33Note 34

Despite these strengths, the results of this study provide evidence of the SDQ’s factorial validity only. Other types of construct validity were not assessed, such as criterion or concurrent validity. Moreover, test–retest reliability could not be examined because such data were not collected in the CHMS. Finally, the extent to which parent-rated SDQ scores were similar across parent raters (i.e., mother vs. father) could not be assessed because such data were not available for cycles 1 to 4 of the CHMS. However, previous research does suggest moderate to large correlations between scores for SDQs completed by mothers and fathers.Note 41Note 42

Conclusion

Consistent with findings from several studies across the globe,Note 43 this study’s findings support the parent-rated SDQ as a psychometrically sound tool for assessing mental health difficulties for Canadian children and adolescents in the general population. The SDQ also has the added benefit of assessing prosocial behaviour, a measure of positive mental health among children and adolescents. In addition to demonstrating evidence of factorial validity, reliability and measurement invariance, the parent-rated SDQ is also relatively brief in comparison with typical psychosocial measures. Normative parent-rated SDQ data for the general Canadian youth population are not yet available; this is an important area for future research. Establishing criterion-referenced cut points among Canadian children and adolescents would help clinicians and health care practitioners identify individuals who may be at risk for mental health difficulties.

References