Evaluation of the Strengths and Difficulties Questionnaire
The 2006 Aboriginal Children's Survey (ACS) provides information on the health, development and well-being of First Nations, Métis and Inuit children under 6 years of age and living off reserve in urban, rural, and northern locations in Canada.
A technical advisory group (TAG) consisting of Aboriginal and non-Aboriginal educators, researchers, and other professionals in early child development provided guidance for the development of the survey (Aboriginal Children's Survey, 2006: Concepts and Methods Guide). Through discussions with the TAG, the 2006 ACS included a widely used measure, the Strengths and Difficulties Questionnaire (SDQ; Goodman, 1997), to provide information on children's behaviours and feelings. The Strengths and Difficulties Questionnaire is a brief behavioural screening questionnaire consisting of 25 items grouped into five subscales assessing different aspects of children's behaviours, emotions, and relationships (Goodman, 1997). The five subscales include one subscale of strengths and four subscales of difficulties. In the 2006 ACS the SDQ items were based on the child's parent or guardian's responses for children aged 2 to 5 years.
The Strengths and Difficulties Questionnaire is designed to be used as a screening tool in clinical assessment, to assess treatment outcomes, and as a research tool (Goodman, 2001). At present, no survey instruments exist that have been specifically designed to provide information on the behavioural characteristics of Aboriginal children in Canada. Yet existing standardized instruments that have been developed for, and validated on, general populations may not be appropriate to provide information on the behavioural characteristics of Aboriginal children. Salient markers of development may not be the same for Aboriginal and non-Aboriginal populations because of different cultural practices, which are not considered in most standard instruments. For example, storytelling is important in many Aboriginal cultures and children's abilities to understand and tell stories may be an important factor in pro-social behaviour, but is not included in standard instruments. Furthermore, different world views or language barriers may lead to different interpretations of the questions by some Aboriginal respondents.
One of the purposes of the 2006 ACS was to provide information on the development and well-being of Aboriginal children in Canada. In order to undertake such research it is necessary to ensure that the outcome measures used are relevant for Aboriginal children and that the dimensions identified in the scale apply to this group. Recognizing differences between Aboriginal groups, analyses reporting on the development and well-being of Aboriginal children often provide information for each Aboriginal group separately. As such, the validity and reliability of both existing and developed measures needs to be assessed separately for First Nations, Métis, and Inuit children. Examining Aboriginal groups separately will allow for an assessment of whether the scales may be appropriate for one Aboriginal group but not another.
The purpose of this document is twofold. The first is to determine if the SDQ subscales developed by Goodman (1997) are valid and reliable for Aboriginal children. The second aim is to determine if SDQ items can be grouped into different subscales that have higher validity and reliability for Aboriginal children.
This document is divided into four parts. The first part describes the SDQ instrument administered in the 2006 ACS. The second part describes the methodology used to assess the validity and reliability of the original Goodman (1997) SDQ subscales as well as the methodology used to construct and validate an alternative set of subscales using the SDQ. The third part discusses the results and the subscales produced by the empirical analyses and the construct validity for Aboriginal children. The fourth part concludes with a discussion of the findings and implications of the results.
The 2006 Aboriginal Children's Survey (ACS) is a postcensal survey in which the sample of children was based on responses to the 2006 census questionnaire (Aboriginal Children's Survey, 2006: Concepts and Methods Guide). The sample was drawn from the population of children whose response met one of the following criteria: had Aboriginal ancestors; were self-identified as North American Indian, Métis, or Inuit; had treaty or registered Indian status; or had Indian band membership. The Aboriginal Children's Survey was administered to a sample of First Nations children living off reserve, Métis children, and Inuit children. Although children living in Indian settlements and reserves in the 10 provinces were not included in the 2006 ACS, all First Nations, Métis and Inuit children living in the territories were included in the target population. The 2006 Aboriginal Children's Survey assessed a total of 13,921 children who were under the age of 6 years as of October 31, 2006. Among selected respondents, the response rate for the 2006 ACS was 81.1%. The child's parent or guardian responded to the survey items on behalf of the child. In 89.3% of cases, this was the birth mother or father.
The Strengths and Difficulties Questionnaire (SDQ), an instrument developed by Robert Goodman (1997), was designed to assess children's social and emotional behaviour. The original version of the SDQ was designed to be completed by the parent or teacher of 4- to 16-year-old children (Goodman, 1997). Another version of the SDQ was designed to be completed by the parent or guardian of 3- to 4-year-old children, and this was the version used in the 2006 ACS. The child's parent or guardian was asked to respond to a series of questions about the child's behaviour and emotions (e.g., 'Is he/she considerate of other people's feelings?') on a three-point Likert scale using the responses 'not true', 'somewhat true' or 'certainly true'. In the version for 3- to 4-year-olds, 22 of the 25 items were identical to the original version for 4- to 16-year-olds, but three items were changed to be more appropriate for younger children.
The Strengths and Difficulties Questionnaire consists of 25 items which are grouped into five subscales: (1) pro-social, (2) hyperactivity-inattention, (3) emotional symptoms, (4) conduct problems, and (5) peer problems (see table 4 for individual items). Each subscale consists of five items and the values are summed to create a total score. Four of the subscale sub-totals are summed to create a 'total difficulties' score.
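The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the ACS derivation: it assumes each item is already coded numerically (0, 1, 2 for 'not true', 'somewhat true', 'certainly true') and that reverse-worded items have already been recoded. The subscale labels, the function name and the example responses are hypothetical.

```python
# Subscale names are illustrative labels, not ACS variable names.
DIFFICULTY_SUBSCALES = (
    "hyperactivity_inattention",
    "emotional_symptoms",
    "conduct_problems",
    "peer_problems",
)
ITEMS_PER_SUBSCALE = 5


def score_sdq(responses):
    """Sum item scores per subscale and derive the total difficulties score.

    responses maps a subscale name to its list of five item scores.
    Assumption: items are coded 0, 1, 2 and reverse-worded items are
    already recoded.
    """
    subscale_totals = {}
    for name, items in responses.items():
        if len(items) != ITEMS_PER_SUBSCALE:
            raise ValueError(f"{name} needs exactly {ITEMS_PER_SUBSCALE} items")
        subscale_totals[name] = sum(items)
    # Only the four difficulties subscales enter the total; pro-social does not.
    total_difficulties = sum(subscale_totals[name] for name in DIFFICULTY_SUBSCALES)
    return subscale_totals, total_difficulties


# Hypothetical responses for one child.
child = {
    "pro_social": [2, 2, 1, 2, 2],
    "hyperactivity_inattention": [1, 0, 1, 2, 0],
    "emotional_symptoms": [0, 0, 1, 0, 0],
    "conduct_problems": [1, 0, 0, 0, 1],
    "peer_problems": [0, 1, 0, 0, 0],
}
totals, total_difficulties = score_sdq(child)
print(totals["pro_social"], total_difficulties)
```

The pro-social subscale is kept out of the total because it is the single strengths subscale; the other four are the difficulties subscales.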
These five subscales have been internationally validated and have been found to have, overall, good psychometric properties (Goodman, 2001; Rothenberger and Woerner, 2004; Woerner et al., 2004; Muris et al., 2003; Palmieri and Smith, 2007). While most studies have found satisfactory reliability for the SDQ, there are exceptions. A validation study of the SDQ for Arab children in Gaza found a low Cronbach's Alpha across all subscales for children aged 3, 6, 11, and 16 years (Thabet, Stretch and Vostanis, 2000). The lowest Cronbach's Alpha was 0.18 for peer problems and the highest was 0.65 for pro-social, both of which fell below the 0.70 cut-off for reliability. The authors concluded that the factor structure of the SDQ may not be appropriate for children in the sample. They also suggested that particular items may have a different meaning or interpretation among Arab parent respondents.
Existing validation studies are limited for the current purposes because they are typically conducted on non-Aboriginal populations and usually do not focus on preschool aged children exclusively. One study specifically assessed the validity of the SDQ for Aboriginal children in Australia aged 4 to 17 years (Zubrick et al., 2006). Pilot testing of the SDQ among an Aboriginal population found that the response categories ('not true', 'somewhat true' or 'certainly true') were not well understood and were altered to 'no', 'sometimes' or 'yes'. This study found that while the overall SDQ was acceptable, the peer problems subscale had low reliability. However, the extent to which the alteration of the item responses influenced the reliability of the scales is not known.
Most studies assessing the validity and reliability of the SDQ have focused on older children. For example, Muris et al. (2003) found that the SDQ was valid for 9- to 15-year-old Dutch children. Goodman (2001) assessed the reliability and validity of the SDQ for 5- to 15-year-old children. Palmieri and Smith (2007) assessed the reliability of the SDQ for custodial grandparents of children between the ages of 4 and 16 years. No studies were found which specifically assess the validity and reliability of the SDQ for children under 6 years old. Furthermore, no studies were found assessing the validity and reliability of the 3- to 4-year-old version of the SDQ used in the 2006 ACS.
The administration of the Strengths and Difficulties Questionnaire (SDQ) in the 2006 ACS differed from the usual procedure in three ways that may also influence item responses and the validity of the subscales. First, the SDQ for 3- to 4-year-olds was developed as a paper-and-pencil questionnaire to be completed by the parent, but in the 2006 ACS it was administered by interview: in person in Inuit communities and in regions of the Northwest Territories outside Yellowknife, and by telephone in other regions of Canada (Aboriginal Children's Survey, 2006: Concepts and Methods Guide). Second, the version of the SDQ used in the 2006 ACS was developed for 3- to 4-year-olds but administered to 2- to 5-year-olds. Third, the 2006 ACS was translated into seven Aboriginal languages, and translators and/or interpreters were hired when requests for other Aboriginal languages were made (Aboriginal Children's Survey, 2006: Concepts and Methods Guide). It is not known if the translation of the ACS into different languages may have altered the meaning of some SDQ items. In all cases, the SDQ component of the ACS questionnaire was completed in English or French by field staff. If a respondent had difficulty understanding a particular question, a translated version was read or an interpreter was used. However, no information was recorded indicating whether a particular interview or question was administered in an Aboriginal language or whether a translator was used. Therefore, it is impossible to test the effect of the language of administration for this population.
Data used for the current analysis are for First Nations, Métis and Inuit children who were part of the Aboriginal identity population between the ages of 2 and 5 years living off reserve (7,255). Children were grouped by the three Aboriginal identity populations as reported by the child's parent or guardian (First Nations, Métis, and Inuit). Respondents had the option of reporting multiple Aboriginal identities (e.g., Métis and Inuit) for the child. In the present analyses, multiple identity responses were included. As such, the sum of the sample sizes for each Aboriginal group is greater than the total sample size as some children are included in multiple groups. After excluding cases with missing data on all items (141 in total), the total sample size was 7,111, with 3,462 off-reserve First Nations children, 2,570 Métis children, and 1,181 Inuit children. In total, 2.57% of the children (or 181) had multiple Aboriginal identities. Among children with responses to the SDQ, 86.11% had no missing values for any item while 7.90% were missing one item, 2.40% were missing two items and 2.76% were missing three or more items. The percentage of children missing at least one SDQ item decreased as the age of the child increased. For example, 17.29% of 2-year-olds were missing at least one item, compared to 11.55% of 5-year-olds. It is possible that parents of young children had difficulties responding to some SDQ items. The percentage of children missing at least one SDQ item varied by the education of the parental respondent: 20.65% of children whose parent did not complete high school were missing at least one SDQ item while 9.84% of children whose parent completed postsecondary education were missing at least one SDQ item. There was some variation in missing responses by Aboriginal group. Among children with responses to the SDQ, at least one item was missing for 10.89% of off-reserve First Nations children, 8.83% of Métis children and 34.46% of Inuit children.
To assess the construct validity and reliability of the Goodman (1997) scale, confirmatory factor analysis (CFA) was used. Confirmatory factor analysis is a structural equation modelling technique used to examine the extent to which a given factor structure fits the sample data (Tabachnick and Fidell, 2001). As such, it can be used to determine how well a hypothesized model, or the existing dimensions of a scale, fit the sample data. There are several measures available to assess the goodness of fit of a set of factors using CFA. The Comparative Fit Index (CFI; Bentler, 1990), the Tucker and Lewis Index (TLI; Tucker and Lewis, 1973) and the Root Mean Square Error of Approximation (RMSEA; Browne and Cudeck, 1993) were used to assess the goodness of fit of the Goodman (1997) subscales. Comparative Fit Index and Tucker and Lewis Index values of 0.90 or greater indicate an acceptable fit (Bentler, 1990; Tucker and Lewis, 1973) while for the RMSEA, values of 0.06 or less indicate a good fit (Hu and Bentler, 1999). The chi-square (χ2) test statistic was not used to assess goodness of fit because of its sensitivity to sample size (Browne and Cudeck, 1993).
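The cut-offs named above can be applied mechanically once the fit statistics are available. The sketch below only encodes the decision rule (CFI and TLI of 0.90 or greater, RMSEA of 0.06 or less); it does not estimate the indices, which in this study came from the CFA software. The function name and the example values are hypothetical and are not taken from the report's tables.

```python
CFI_TLI_MIN = 0.90  # acceptable fit at or above this value
RMSEA_MAX = 0.06    # good fit at or below this value


def assess_fit(cfi, tli, rmsea):
    """Compare each fit statistic with its cut-off and report the outcome."""
    return {
        "cfi_ok": cfi >= CFI_TLI_MIN,
        "tli_ok": tli >= CFI_TLI_MIN,
        "rmsea_ok": rmsea <= RMSEA_MAX,
    }


# Illustrative values only.
result = assess_fit(cfi=0.88, tli=0.93, rmsea=0.05)
print(result)
```

Reporting the three checks separately, rather than a single pass or fail, mirrors how the indices can disagree for the same model.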
While the Comparative Fit Index, Tucker and Lewis Index, and Root Mean Square Error of Approximation assess the goodness of fit of a set of factors, or scales, they do not assess the reliability of an individual factor or subscale. Raykov's (1997) composite reliability coefficient (CRC) was used to assess the internal consistency of the items comprising each of the five Goodman (1997) subscales. The composite reliability coefficient is used in CFA and is analogous to Cronbach's Alpha, which is a commonly used measure of internal consistency (Hatcher, 1994). Cronbach's Alpha assumes that a factor explains an equal amount of variance in each of the items reflecting that factor, and thus often underestimates reliability (Raykov, 1997). The composite reliability coefficient provides a more accurate reliability estimate because it relaxes this restrictive assumption (Raykov, 1997). The composite reliability coefficient statistic, ρ, represents the proportion of variance explained by the factor within a set of observed items. A ρ statistic of 0.70 or more suggests satisfactory composite reliability for the items comprising a factor (Fornell and Larcker, 1981).
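As an illustration of how ρ relates to the factor loadings, the sketch below uses the commonly cited form of composite reliability: the squared sum of the loadings divided by that quantity plus the summed error variances. It assumes standardized loadings, so that each item's error variance is one minus its squared loading; that assumption, the function name and the example loadings are not from the report, and the estimation in this study was carried out in the CFA software.

```python
def composite_reliability(loadings):
    """Composite reliability from standardized factor loadings.

    Assumption: loadings are standardized, so the error variance of an
    item is 1 - loading**2.
    """
    if not loadings:
        raise ValueError("at least one loading is required")
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1.0 - value ** 2 for value in loadings)
    return squared_sum / (squared_sum + error_variance)


# Hypothetical five-item subscale with equal loadings of 0.7.
rho = composite_reliability([0.7, 0.7, 0.7, 0.7, 0.7])
print(round(rho, 3), rho >= 0.70)
```

Because each item contributes its own loading, items that reflect the factor more strongly are not averaged down to the level of weaker items.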
After using the confirmatory factor analysis to assess the construct validity of the existing dimensions in the derived Goodman (1997) SDQ subscales, the next step was to conduct an exploratory factor analysis (EFA) for the 25 SDQ items to determine whether an alternative factor structure may yield a better fit than the Goodman (1997) subscales. The exploratory factor analysis is a statistical technique used to determine if a set of core factors exists for a set of items. This technique was originally used by Goodman (1997) to derive the subscales. An exploratory factor analysis with a promax rotation, which assumes that the identified factors are correlated, was selected because measures were expected to be correlated (Tabachnick and Fidell, 2001). The exploratory factor analysis produces factor loadings for each variable, which represent the extent to which the variable reflects the underlying factor. Generally, factor loadings between 0.32 and 0.44 are poor, between 0.45 and 0.54 are fair, between 0.55 and 0.62 are good, between 0.63 and 0.70 are very good and 0.71 or greater are excellent (Tabachnick and Fidell, 2001). Some studies select 0.3 as a cut-off while others select 0.4; in this study it was decided to only retain items with factor loadings equal to or greater than 0.35.
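The loading bands and the retention rule above translate directly into a small lookup. The sketch below is illustrative only: it treats each band as starting at its lower bound and compares the absolute value of the loading, which is an assumption made here so that negatively signed loadings are judged by their size. The function names are hypothetical.

```python
RETENTION_CUTOFF = 0.35  # items at or above this loading are retained

# Lower bound of each descriptive band, highest first.
LOADING_BANDS = (
    (0.71, "excellent"),
    (0.63, "very good"),
    (0.55, "good"),
    (0.45, "fair"),
    (0.32, "poor"),
)


def describe_loading(loading):
    """Return the descriptive band for a factor loading."""
    size = abs(loading)  # assumption: the sign of the loading is ignored
    for lower_bound, label in LOADING_BANDS:
        if size >= lower_bound:
            return label
    return "below 0.32"


def is_retained(loading):
    """Apply the 0.35 retention cut-off used in this study."""
    return abs(loading) >= RETENTION_CUTOFF


for value in (0.72, 0.50, 0.33):
    print(value, describe_loading(value), is_retained(value))
```

Note that a loading can fall in the 'poor' band and still be dropped, since the band starts at 0.32 while retention requires 0.35.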
After the factors were identified, the items comprising each factor were evaluated to ensure the groupings were theoretically meaningful (e.g., groupings of similar items). The next step was to perform a CFA to assess how the alternative factor structure fit the data. The confirmatory factor analysis was conducted separately for off-reserve First Nations, Métis, and Inuit children. The Comparative Fit Index, Tucker and Lewis Index, and Root Mean Square Error of Approximation (RMSEA) were used to assess the fit of the alternative factor structure. Descriptive statistics were also computed.
While the version of the SDQ used in the ACS was intended for 3- to 4-year-olds, it was used to assess children aged 2 to 5 years in the ACS. It is unclear whether the items in the SDQ are appropriate for 2-year-old children as the SDQ was not designed for this age group. Because the version of the SDQ used in the ACS (3- to 4-year-olds) only differs slightly (3 of the 25 questions are altered) from the version for 4- to 16-year-olds, it was expected that the questionnaire would be appropriate for 5-year-olds. To determine if the inclusion of 2-year-olds and possibly 5-year-olds influenced the reliability and validity of the subscales, CFA was conducted for multiple age groupings. First, CFA was conducted on all children assessed using the SDQ (2- to 5-year-olds). Second, confirmatory factor analysis was conducted for children aged 3 to 5 years to determine if the inclusion of 2-year-olds influenced the CFA results. Finally, because the SDQ version used in the ACS was intended for 3- to 4-year-olds, a CFA was conducted for this age group to determine if this restriction would result in higher reliability and validity.
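The three nested age groupings can be sketched as simple filters over the same set of records. This is a hypothetical illustration of the sample definitions only; the record layout, the 'age' key and the function name are assumptions, and the CFA itself is not shown.

```python
# Inclusive age ranges, in years, for the three analysis samples.
AGE_GROUPINGS = {
    "2 to 5 years": (2, 5),
    "3 to 5 years": (3, 5),
    "3 to 4 years": (3, 4),
}


def split_by_age_grouping(children):
    """Return one list of records per age grouping.

    children is a list of dicts with an integer 'age' key (hypothetical
    layout). The groupings overlap, so a child can appear in several.
    """
    return {
        label: [child for child in children if low <= child["age"] <= high]
        for label, (low, high) in AGE_GROUPINGS.items()
    }


# Hypothetical records, one child per year of age.
sample = [{"id": i, "age": age} for i, age in enumerate((2, 3, 4, 5))]
groups = split_by_age_grouping(sample)
print({label: len(records) for label, records in groups.items()})
```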
Preparation of the data and descriptive statistics were conducted in SAS version 9.1 (for Windows) and the CFA and EFA were conducted using MPLUS (version 5.0; Muthén and Muthén, 1998-2007). The advantage of MPLUS over SAS for CFA and EFA is that it allows the SDQ items to be specified as ordered categorical variables rather than as continuous variables. Responses to the SDQ questions were not continuous as they were based on a three-point Likert scale. Normalized survey sampling weights were used in all analyses to account for the unequal probability of selection among respondents, unit non-response, and post-stratification. Cases with missing data were included in the CFA.
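The report does not define how the sampling weights were normalized. One common convention, assumed here purely for illustration, is to rescale the weights so that they sum to the number of respondents, which leaves their relative sizes unchanged while keeping the effective sample size on the scale of the actual sample. The function name and example weights are hypothetical.

```python
def normalize_weights(weights):
    """Rescale weights so they sum to the number of cases.

    Assumption: 'normalized' means each weight is divided by the mean
    weight. The report does not state the formula that was used.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    count = len(weights)
    return [weight * count / total for weight in weights]


# Hypothetical survey weights for four respondents.
normalized = normalize_weights([10.0, 20.0, 30.0, 40.0])
print(normalized, sum(normalized))
```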
To assess the construct validity of the original Goodman (1997) subscales for the different age groups for each Aboriginal group, CFA was used and table 1 shows the goodness of fit statistics. For children aged 2 to 5 years, the CFI was well below the 0.90 cut-off for Inuit but close to the cut-off for Métis. The Tucker and Lewis Index was above the 0.90 cut-off for Métis and near the cut-off for off-reserve First Nations children. The root mean square error of approximation was at or below the cut-off of 0.06 for Métis children, and this was consistent across all age groupings, indicating that the original SDQ subscales may be valid for Métis children. For off-reserve First Nations children the RMSEA was slightly above the 0.06 cut-off for all age groupings. The root mean square error of approximation was 0.062 for 3- to 4-year-old children compared to 0.065 for 2- to 5-year-old children, suggesting comparability among the different age ranges. These results suggest that the original subscales may not be appropriate for Inuit children, but they may be appropriate for off-reserve First Nations and Métis children.
To assess the internal reliability of the original five Goodman (1997) subscales, the CRC statistic 'ρ' was calculated. Results for children aged 2 to 5 years, aged 3 to 5 years, and aged 3 to 4 years are in table 2. For off-reserve First Nations and Métis children, the ρ statistics for the pro-social, hyperactivity-inattention, emotional symptoms, and conduct problems subscales were at or above the 0.70 cut-off, indicating that the subscale items have good internal consistency. For Inuit children, the pro-social, hyperactivity-inattention, and peer problems subscales had ρ statistics very close to or above the 0.70 cut-off. The ρ statistic for emotional symptoms (0.65) was close to the cut-off for Inuit children aged 2 to 5 years but declined to 0.61 for children aged 3 to 4 years. This suggests that the emotional symptoms subscale may not have satisfactory reliability among Inuit children. The composite reliability coefficient for the peer problems subscale did not reach the 0.70 cut-off for off-reserve First Nations (ρ = 0.61), Métis (ρ = 0.62) or Inuit (ρ = 0.57) children. While four of the five subscales were reliable for off-reserve First Nations and Métis children, three were reliable for Inuit children. The results showed that the peer problems subscale has the lowest reliability of the five subscales, especially for Inuit children.
To examine whether there was a more relevant factor structure for Aboriginal children than that produced by the original Goodman (1997) subscales, the next step was to conduct an EFA for the 25 SDQ items, using a Robust Weighted Least Squares estimator. Because Goodman (1997) established a five-factor structure, which has been confirmed in several different populations, a five-factor EFA was conducted using the ACS data to determine whether the same five factors would emerge. An initial factor analysis identified five factors with an eigenvalue greater than 1.0, confirming that a five-factor model would indeed fit the data (Kaiser, 1960). A five-factor EFA was then conducted using a promax rotation because this rotation assumes that the factors are not independent. Table 3 shows the factor loadings for each of the five extracted factors. While most items had a factor loading equal to or greater than 0.35, the item K01S 'Picked on or bullied by other children' only had a factor loading of 0.33 and was subsequently excluded.
Items were assigned to a factor based on an empirical approach in which an item was assigned to the factor on which it had its highest loading, provided that loading reached the 0.35 cut-off. While five factors were extracted, factor four only had two items reaching the 0.35 cut-off: K01W 'Gets along better with adults than children' and K01F 'Rather solitary, prefers to play alone'. Because a factor with only one or two items is likely to have low reliability as a subscale, this factor was dropped. To confirm this, Cronbach's Alpha, a measure of internal consistency, was calculated and the value was below the minimum 0.70 cut-off (alpha = 0.48), confirming that factor four has poor internal consistency (Nunnally, 1978).
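The assignment rule described above can be sketched as follows. This is a hypothetical illustration, not the analysis code: the item labels and loadings are invented rather than taken from table 3, factors are indexed from zero, and comparing absolute loadings is an assumption. A factor is kept only if at least three items are assigned to it, reflecting the decision to drop factors with one or two items.

```python
def assign_items(loadings, cutoff=0.35, min_items=3):
    """Assign each item to its highest-loading factor, then screen factors.

    loadings maps an item label to a list of loadings, one per factor.
    Returns (kept, dropped, unassigned): factors with enough items,
    factors with too few items, and items below the cut-off everywhere.
    """
    factors = {}
    unassigned = []
    for item, row in loadings.items():
        # Assumption: loadings are compared by absolute size.
        best = max(range(len(row)), key=lambda index: abs(row[index]))
        if abs(row[best]) >= cutoff:
            factors.setdefault(best, []).append(item)
        else:
            unassigned.append(item)
    kept = {f: items for f, items in factors.items() if len(items) >= min_items}
    dropped = {f: items for f, items in factors.items() if len(items) < min_items}
    return kept, dropped, unassigned


# Invented loadings for six items on two factors.
example = {
    "item_a": [0.70, 0.10],
    "item_b": [0.62, 0.05],
    "item_c": [0.55, 0.20],
    "item_d": [0.12, 0.48],
    "item_e": [0.08, 0.41],
    "item_f": [0.33, 0.20],
}
kept, dropped, unassigned = assign_items(example)
print(kept, dropped, unassigned)
```

In this invented example the second factor attracts only two items and is dropped, and the last item is excluded because its highest loading falls short of 0.35.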
The four retained factors were assigned the same names as the factors in the Goodman (1997) subscales based on the number of similar items shared between the factors but have the prefix 'EFA'. Table 4 compares the items of the original Goodman (1997) subscales to the items comprising the EFA subscales. The items comprising emotional symptoms are identical for both the original Goodman (1997) subscale and the alternative EFA subscale. While the Goodman (1997) pro-social subscale includes 5 items, 10 items load onto the EFA pro-social subscale. The Goodman (1997) hyperactivity-inattention subscale contains five items; however, only three of these items load onto the alternative EFA hyperactivity-inattention subscale. The items K01U 'Can stop and think things out before acting' and K01Y 'Good attention span, sees work through to the end', which are in the Goodman (1997) hyperactivity-inattention subscale load onto the EFA pro-social subscale. However, K01Y 'Good attention span, sees work through to the end' also loads onto the EFA hyperactivity-inattention subscale but the factor score is lower than for the EFA pro-social subscale. The conduct problems subscale was similar for the Goodman subscale and EFA subscale with the exception of K01G 'Generally well-behaved, usually does what adults request', which loaded onto the EFA pro-social subscale. While this item also loaded onto the EFA conduct problems subscale, the factor score was lower than that for the EFA pro-social subscale.
The confirmatory factor analysis (CFA) was then used to evaluate the fit of the four-factor structure to the ACS data. The goodness of fit statistics for children aged 2 to 5 years, aged 3 to 5 years, and aged 3 to 4 years are presented in table 5. For all three Aboriginal groups the RMSEA is below the 0.06 cut-off, indicating an acceptable fit of the four factors, and this is consistent for all age groups. The Tucker and Lewis Index reached the 0.90 cut-off for off-reserve First Nations, Métis, and Inuit groups. For children aged 2 to 5 years the CFI reached the 0.90 cut-off for Inuit (0.919) but was just short of the cut-off for off-reserve First Nations (0.866) and Métis (0.897). Results were similar for children aged 3 to 5 years and aged 3 to 4 years.
The composite reliability coefficient (CRC), which assessed the reliability of the four subscales produced by the EFA, is presented in table 6. All subscales reach the 0.70 cut-off, indicating satisfactory reliability, with the exception of the emotional symptoms subscale for Inuit children. The composite reliability coefficient was nearly identical for all age groups. However, as with the Goodman emotional symptoms scale, the CRC for Inuit children was lower for those aged 3 to 4 years (ρ = 0.61) than for those aged 2 to 5 years (ρ = 0.65).
Construction of the exploratory factor analysis (EFA) Strengths and Difficulties Questionnaire (SDQ) subscales
Using the results from the exploratory factor analysis (EFA) a new set of subscales was constructed from the SDQ items. Measures were constructed for children with valid data on at least 80% of the items comprising a single subscale. Items comprising each factor were summed and divided by the number of items to produce a mean score ranging from 1 to 3. Descriptive statistics for the SDQ factors produced using EFA are in table 7. The mean scores are similar across the three Aboriginal groups.
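The construction rule above can be sketched for a single child and a single subscale. The sketch assumes items are coded 1 to 3, that missing responses are represented by None, and that the sum is divided by the number of valid items so the mean stays in the 1 to 3 range; the report does not state how the divisor was handled when items were missing, and the function name is hypothetical.

```python
def subscale_mean(item_values, min_valid_percent=80):
    """Mean item score for one subscale, or None if too many items are missing.

    item_values holds scores of 1 to 3, with None for a missing response.
    Assumption: the sum is divided by the number of valid items.
    """
    if not item_values:
        return None
    valid = [value for value in item_values if value is not None]
    # Integer comparison avoids rounding issues at exactly 80%.
    if len(valid) * 100 < min_valid_percent * len(item_values):
        return None
    return sum(valid) / len(valid)


# Four of five items answered: exactly 80% valid, so a score is produced.
print(subscale_mean([1, 2, 3, None, 2]))
# Three of five items answered: 60% valid, so no score is produced.
print(subscale_mean([1, None, None, 2, 3]))
```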
The objectives of this paper were to evaluate the validity and reliability of the Strengths and Difficulties Questionnaire (SDQ) subscales developed by Goodman (1997) for Aboriginal children surveyed in the Canadian ACS. The Comparative Fit Index (CFI) goodness of fit statistics for the Goodman (1997) scale did not reach the 0.90 cut-off for off-reserve First Nations, Métis or Inuit children, and the TLI and the RMSEA only reached their respective cut-offs for Métis children. However, the TLI for off-reserve First Nations was close to the 0.90 cut-off. These findings indicate that the SDQ subscales may not be a good fit for Aboriginal children, especially Inuit children and, to a lesser extent, off-reserve First Nations children. Based on these results, it is recommended that researchers exercise caution when using the SDQ subscales in analyses with the ACS.
The composite reliability coefficient (CRC) was used to assess the internal consistency of the items comprising each of the five subscales. The results showed that the pro-social and hyperactivity-inattention subscales met the 0.70 cut-off for off-reserve First Nations, Métis, and Inuit children. For Inuit children, the emotional symptoms subscale did not meet the 0.70 cut-off for the CRC, indicating that this subscale has low reliability for Inuit children. The peer problems subscale had low reliability across all Aboriginal groups. Similarly, a study assessing the reliability and validity of the SDQ for Aboriginal children in Australia also found low reliability for this subscale (Zubrick et al., 2006). Goodman's 2001 validation study of the SDQ also found that the reliability of the peer problems scale was low among parent respondents (with a Cronbach's Alpha of 0.57). Based on these findings, it is recommended that the peer problems subscale not be used for Aboriginal children. It is also recommended that researchers exercise caution when using the emotional symptoms subscale for Inuit children, especially when analyses are limited to only 3- to 4-year-olds.
The version of the SDQ used in the 2006 ACS was designed for 3- to 4-year-old children but administered to 2- to 5-year-old children. To examine if the validity of the SDQ subscales was influenced by the inclusion of 2- and 5-year-olds, parallel analyses were run for children aged 3 to 5 years and children aged 3 to 4 years. For the most part, results were similar for all age groupings, indicating that the inclusion of 2- and 5-year-old children did not have a large impact on the results.
The exploratory factor analysis identified four factors (each with more than two items) from the 25 SDQ items. The confirmatory factor analysis was conducted on these four factors and the RMSEA indicated an acceptable model fit for the three Aboriginal groups. While the Tucker and Lewis Index reached the 0.90 cut-off for all Aboriginal groups, the CFI only reached the 0.90 cut-off for Inuit but was very close to the cut-off for off-reserve First Nations and Métis. These findings indicate that the subscales produced by the EFA have higher validity than the Goodman (1997) SDQ subscales for Aboriginal children. Researchers may wish to use these subscales to analyze the behavioural characteristics of Aboriginal children.
The composite reliability coefficient was also used to assess the internal consistency of the items comprising each factor produced by the EFA. The composite reliability coefficient met the 0.70 cut-off for all subscales and each Aboriginal group with the exception of emotional symptoms for Inuit children. The emotional symptoms subscale produced by the EFA was identical to the Goodman scale. Researchers should similarly use caution when using this scale to assess Inuit children, especially when the sample is limited to 3- to 4-year-old children. Overall, these results indicate that the individual subscales are reliable for off-reserve First Nations, Métis, and Inuit children.
There are several factors related to the administration of the SDQ that may limit its validity for Aboriginal children. First, the Strengths and Difficulties Questionnaire was administered to 2- to 5-year-olds though the version of the questionnaire used was a version developed for 3- and 4-year-olds. However, the confirmatory factor analysis showed consistent results across the three age categories, indicating that the inclusion of both 2- and 5-year-olds did not have a large impact on the validity and reliability of the SDQ. Nevertheless, some items on the SDQ may not be appropriate or relevant for 2-year-old children. This may explain the higher rate of missing data for 2-year-old children. Second, the Strengths and Difficulties Questionnaire was designed to be completed using pencil-and-paper but was completed using telephone or in-person interviews. While this may have influenced the responses to the SDQ items, the extent to which this difference may have influenced results is not known. Third, different world views and translation of the SDQ into Aboriginal languages may have influenced the interpretation of and responses to the items.
The results of this study suggest several areas for future research. This study used EFA to produce an alternative set of factors which had higher validity than the Goodman (1997) scales. However, the applicability of the 25 SDQ items for providing relevant information on Aboriginal children's behaviour was not assessed. It is possible that this instrument, designed for children in the general population, is not a valid measure of Aboriginal children's behaviour. It is possible that the SDQ does not tap the most relevant behaviours important for Aboriginal children's development. Future research should focus on the development of a reliable and valid instrument specifically designed to provide information on the behaviour of Aboriginal children. The development of such an instrument would benefit from discussions with early childhood development experts, Aboriginal researchers, educators, and parents of Aboriginal children.
The Goodman (1997) emotional symptoms subscales showed good reliability for off-reserve First Nations and Métis children but reliability was lower for Inuit children. Future research should examine these differences more carefully by assessing the extent to which these differences reflect the diversity between Aboriginal groups or cultural or language issues related to the administration of the questionnaire. Given the diversity of Aboriginal communities, future research should also determine if a pan-Aboriginal instrument is appropriate or if a separate instrument for First Nations, Métis and Inuit children may serve as a better assessment tool. Future research should also examine the high rate of missing data for SDQ items among Inuit respondents. It is not known if this is due to administration of the questionnaire (in-person interviewing in Inuit communities), socio-demographic characteristics, issues related to language barriers or translation of the questionnaire, different world views, or other issues.
The findings of this study demonstrate that instruments of child development and well-being developed for the general population cannot necessarily be directly applied to Aboriginal children, and that researchers should use caution when using such instruments in reporting on characteristics of Aboriginal children.
An objective of this study was to evaluate if the SDQ scales developed by Goodman (1997) are valid and reliable for Aboriginal children. The results of the CFA suggest that researchers should use caution when using the original Goodman (1997) subscales to analyze the behavioural characteristics of Aboriginal children, especially for Inuit children. Analysis examining each individual scale showed low reliability for the Goodman (1997) peer problems scale across all Aboriginal groups. Based on this finding, it is recommended that this scale not be used in analyses of Aboriginal children from the 2006 ACS. Results also showed lower reliability for the Goodman (1997) emotional symptoms subscale for Inuit children. It is therefore recommended that caution be used when using these subscales in analyses dealing with Inuit children.
An alternative set of subscales was produced using EFA and the results demonstrated that these modified subscales have higher reliability and validity than the Goodman (1997) scales. It should be noted, however, that these analyses have not been able to address issues of predictive or external validity of the SDQ for Aboriginal children. Based on these findings researchers may wish to use these alternate subscales when using the SDQ from the 2006 ACS. The results of this study show that measures of children's health and well-being developed for the general population are not necessarily valid measures for Aboriginal children. Future research should examine the predictive or external validity of the SDQ as well as the development of instruments specific to the description of Aboriginal children's outcomes.