Findings
The 2004 Canadian Community Health Survey (CCHS) Nutrition was the first national survey of the eating habits of the Canadian population since the early 1970s. One of the objectives of the 2004 CCHS was to determine the intake of energy (calories), macronutrients (fats, proteins and carbohydrates) and micronutrients (vitamins and minerals) for different groups.
While every effort was made to ensure that the data were accurate—from questionnaire design through sampling and interviewer training to raw data verification and validation—the CCHS, like most nutrition surveys, was subject to under-reporting.1-6 For a number of reasons (forgetfulness, social desirability, self-image, fear of being negatively judged5,6), respondents may deliberately or inadvertently report that they ate and drank less than they actually did.
The CCHS used a well-established collection instrument, the Automated Multiple-Pass Method (AMPM),7,8 to maximize respondents' recall of what they consumed the day before they were interviewed. However, as reported in a companion article,9 under-reporting among the population aged 12 or older still amounted to an average of about 10% of total energy intake. Under-reporting was associated with a number of characteristics, notably, body mass index, age, sex, and physical activity.
The under-reporting of energy intake and of specific nutrients has implications for the analysis of CCHS data. For instance, relationships between the amount and types of food consumed and body mass index are obscured by under-reporting. This article aims to address such issues by identifying "plausible" respondents.
Methods
Data source
The 2004 Canadian Community Health Survey (CCHS)–Nutrition collected information about the food and nutrient intake of the household population at the national and provincial levels. It excluded members of the regular Canadian Forces; residents of the three territories; people living on Indian reserves, in institutions, and in some remote areas; and all residents (military and civilian) of Canadian Forces bases. Detailed descriptions of the CCHS design, sample and interview procedures are available in a published report.10
A total of 35,107 people completed an initial 24-hour dietary recall; a subsample of 10,786 completed a second recall three to ten days later. Response rates were 76.5% and 72.8%, respectively. The energy and nutrient composition of the foods reported during each recall was determined according to Health Canada's Canadian Nutrient File (2001b Supplement).11
The original intention of the CCHS was to weigh and measure all respondents aged 2 or older, but for various reasons, the weight and height of around 40% of them were not measured. To adjust for this non-response, another survey weight was created, based on respondent classes with similar socio-demographic characteristics. Because of the bias that has been observed between self-reported and measured data,12,13 measured height and weight are preferable to self-reports. Therefore, respondents with measured height and weight data (with the appropriate survey weight) were used for this analysis.
This study pertains to 16,190 respondents aged 12 or older who answered the physical activity questions. Women who were pregnant or breastfeeding, people of very low weight (body mass index less than 18.5 kg/m²), and respondents with no or invalid dietary intakes were excluded.
Identifying plausible respondents
Identifying plausible respondents requires establishing lower and upper cut-offs for their total predicted energy expenditure; that is, a range for the amount of energy they could be expected to expend to remain at their measured weight. The total predicted energy expenditure of CCHS respondents was determined with the equations developed by the Institute of Medicine (IOM),14 which model the energy expenditure derived from a doubly labelled water study based on age, height and weight (body mass index), and level of physical activity. Details about the derivation of physical activity from the CCHS and about the IOM equations are presented in the accompanying article on energy under-reporting.9
Every CCHS respondent was identified as a plausible respondent, an under-reporter or an over-reporter, based on a comparison of their total predicted energy expenditure with their reported energy intake. Goldberg et al.15 were the first to suggest such an approach, by creating a confidence interval for physical activity level (PAL) based on three coefficients of variation (CV): that of subjects' energy intake (CVwEI), that reflecting the accuracy of the measurement of their basal metabolic rate (CVwB), and that of the total variation in physical activity level (CVtP). Black16 developed a practical guide for using the cut-offs, and explained the method's limitations. McCrory et al. went further with a direct comparison of total predicted energy expenditure and measured energy intake. In an initial study,17 the model of total energy expenditure used a limited database of only 93 individuals, whereas a second18 used the IOM equations, which were developed with information from more than 700 individuals. Both studies assumed the "low active" physical activity category for all individuals. In the report, What America Drinks,19 McCrory's method was modified to produce larger intervals for plausible intakes by assuming four different levels of physical activity for every individual.
For the present analysis, McCrory's intervals for the four levels of physical activity were applied to CCHS respondents according to the amount of activity each of them reported. That is, the interval applied to CCHS respondents depended on whether they were sedentary, low active, active or very active.
The confidence interval for the ratio of measured energy intake (rEI) to the predicted energy requirement (pER) was constructed from the standard deviation (SD), defined as follows:
$$SD = \sqrt{\frac{CV_{rEI}^{2}}{d} + CV_{pER}^{2} + CV_{mTEE}^{2}}$$
where $CV_{rEI}$ represents the intra-individual variation of energy intake; $d$, the number of days of recall; $CV_{pER}$, the error in predicted energy requirements; and $CV_{mTEE}$, the day-to-day variation and the measurement error for total energy expenditure based on doubly labelled water.
Black and Cole20 estimated $CV_{mTEE}$ at 8.2%, which was used in the present study. $CV_{rEI}$ and $CV_{pER}$ came from the CCHS data.
$CV_{rEI}$ came from the respondents who provided two dietary recalls, based on the formula:
$$CV_{rEI} = \sqrt{\frac{\sum_{i=1}^{n} CV_{i}^{2}}{n}}$$
where $CV_{i}$ is the CV calculated for every individual and $n$ is the number of individuals with two recalls.
$CV_{pER}$ was obtained by dividing the average standard error of individual predictions for a group by the average prediction of energy expenditure for that group.
The CCHS obtained two dietary recalls for approximately 30% of the sample, but only the first is used in the subsequent analysis. Therefore, d=1, and an average SD value of 35% is used (Table 1).
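As a rough illustration of how the SD could be assembled, the following Python sketch assumes the three coefficients of variation are combined in quadrature, with the intake term divided by the number of recall days, and that the intake CV is pooled as the root mean square of the individual CVs. The function names and the placeholder inputs are hypothetical; only the 8.2% figure comes from the text, and the CCHS component values are not reproduced here.

```python
import math


def pooled_cv(recall_pairs):
    """Pool within-person CVs of energy intake from pairs of recalls (kcal).

    Assumption: the pooled CV is the root mean square of the individual CVs.
    """
    squared_cvs = []
    for day1, day2 in recall_pairs:
        mean = (day1 + day2) / 2.0
        # The sample SD of two values equals |day1 - day2| / sqrt(2).
        sd = abs(day1 - day2) / math.sqrt(2.0)
        squared_cvs.append((sd / mean) ** 2)
    return math.sqrt(sum(squared_cvs) / len(squared_cvs))


def combined_sd(cv_rei, cv_per, cv_mtee=0.082, d=1):
    """Combine the three CVs (as fractions) into a single SD.

    cv_mtee defaults to the 8.2% cited in the text; d is the number of
    recall days used in the analysis.
    """
    return math.sqrt(cv_rei ** 2 / d + cv_per ** 2 + cv_mtee ** 2)


# Placeholder inputs for illustration only, not the CCHS estimates.
example_sd = combined_sd(cv_rei=0.30, cv_per=0.10)
```

Because the intake term is divided by d, using both recalls (d=2) would give a smaller SD and therefore a narrower plausibility interval than the single-recall case used in the study.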
Table 1
Estimation of standard deviation (SD), by age group and sex, household population aged 12 or older, 2004
Because the energy intake distribution is skewed, the confidence intervals were constructed in the log scale, and the cut-offs were exponentiated. The confidence interval for the energy intake to energy expenditure ratio (EI:EE) for plausible respondents is
EI:EE ∈ [exp(−α × SD); exp(α × SD)]
A multiplicative factor α can be applied to the SD to widen or narrow the confidence interval. This study uses only the multiplicative factor of 1. With α = 1 and an SD of 35%, the limits are exp(−0.35) ≈ 0.70 and exp(0.35) ≈ 1.42.
Respondents whose reported energy intake was less than 70% of their predicted energy expenditure were classified as under-reporters; if the figure was more than 142% of their predicted energy expenditure, they were classified as over-reporters. Plausible respondents were those whose energy intake was 70% to 142% of their predicted energy expenditure. The representativeness of this sample of plausible respondents was assessed by comparing their socio-demographic characteristics with those of the total sample.
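The classification rule can be sketched in a few lines of Python. This is an illustration under the values stated in the text (an SD of 35% and a multiplicative factor of 1), not the survey's production code; the function names and the example intakes are hypothetical.

```python
import math


def cutoffs(sd=0.35, alpha=1.0):
    """Return the lower and upper EI:EE cut-offs built in the log scale."""
    return math.exp(-alpha * sd), math.exp(alpha * sd)


def classify(reported_ei, predicted_ee, sd=0.35, alpha=1.0):
    """Label a respondent from reported intake and predicted expenditure (kcal)."""
    lower, upper = cutoffs(sd, alpha)
    ratio = reported_ei / predicted_ee
    if ratio < lower:
        return "under-reporter"
    if ratio > upper:
        return "over-reporter"
    return "plausible"


lower, upper = cutoffs()
print(round(lower, 2), round(upper, 2))  # 0.7 1.42
print(classify(1500, 2500))  # ratio 0.60 -> under-reporter
```

Because the limits are built in the log scale, they are symmetric in ratio terms (their product is 1) rather than in percentage points.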
Analytical techniques
When plausible respondents had been identified, under-reporting of energy and nutrient intake was determined by dividing estimates for plausible respondents by estimates for all respondents.
Linear regression was used to demonstrate the impact of under-reporting on the relationship between reported energy intake and weight for the total population and for plausible respondents. Logistic regression was used to determine the impact of under-reporting on modelling the characteristics of obese people. The bootstrap method, which takes the complex design of the CCHS into consideration,21-23 was used to estimate the confidence intervals of estimated ratios and odds ratios. The significance level was set at p < 0.05.
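A minimal sketch of bootstrap variance estimation with replicate weights follows. It assumes that a set of bootstrap weights is available alongside the main survey weight, and that the variance is taken as the mean squared deviation of the replicate estimates around the full-sample estimate. The article does not spell out these details, so both are assumptions, as are all function and variable names.

```python
import math


def weighted_mean(values, weights):
    """Survey-weighted mean of a list of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def bootstrap_variance(estimator, main_weights, replicate_weights):
    """Point estimate and bootstrap variance for any weighted estimator.

    estimator: function taking one weight vector and returning a number.
    replicate_weights: list of weight vectors, one per bootstrap replicate.
    """
    point = estimator(main_weights)
    replicates = [estimator(w) for w in replicate_weights]
    variance = sum((r - point) ** 2 for r in replicates) / len(replicates)
    return point, variance


def normal_interval(point, variance, z=1.96):
    """Approximate 95% confidence interval from the bootstrap variance."""
    half_width = z * math.sqrt(variance)
    return point - half_width, point + half_width


# Toy data for illustration only (not CCHS values).
intake = [1.0, 2.0, 3.0]
main = [1.0, 1.0, 1.0]
reps = [[2.0, 1.0, 0.0], [0.0, 1.0, 2.0]]
point, var = bootstrap_variance(lambda w: weighted_mean(intake, w), main, reps)
```

The same helper accepts a ratio estimator, for example the weighted mean intake of plausible respondents divided by that of all respondents, by passing a different function as `estimator`.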
Definitions
Three types of covariates were included in this study: lifestyle risk factors, health status, and socio-demographic characteristics.
Body mass index (BMI) is calculated by dividing weight in kilograms by the square of height in metres. In this analysis, the BMI categories for adults were based on Health Canada guidelines.24 People whose BMI was between 18.5 kg/m² and 24.99 kg/m² were normal weight; between 25 kg/m² and 29.99 kg/m², overweight; and 30 kg/m² or more, obese. For adolescents aged 12 to 17, the categories defined by Cole et al.25 were used.
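The adult calculation and categories can be sketched as follows. The function names are hypothetical, a BMI of exactly 30 is treated as obese (an assumption about the boundary), and the age- and sex-specific cut-offs used for adolescents are not reproduced.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms over height in metres squared."""
    return weight_kg / height_m ** 2


def adult_bmi_category(value):
    """Adult BMI category; values under 18.5 were excluded from the study."""
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "normal weight"
    if value < 30:
        return "overweight"
    return "obese"


# A 70 kg adult who is 1.75 m tall has a BMI of about 22.9.
print(adult_bmi_category(bmi(70, 1.75)))  # normal weight
```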
Four levels of leisure-time physical activity were determined: sedentary, low active, active, and very active.
Alcohol consumption refers to the 12 months before the CCHS interview.
Fruit and vegetable consumption was based on the reported usual frequency of consumption, not the 24-hour recall. It represents the number of times per day respondents consumed fruit and vegetables, not the amount of food consumed.
Smokers are those who reported that they smoke daily or occasionally.
The variables related to health status were self-reported health (excellent, very good, good, fair and poor) and the presence of at least one chronic condition.
The socio-demographic variables were: sex and age, based on the IOM dietary reference groups; highest level of education in the household (less than secondary graduation, secondary graduation, some postsecondary, postsecondary graduation); household income from all sources and accounting for household size (low, low/middle, middle, middle/high, and high); employment status the week before the interview; immigrant and Aboriginal status; and province of residence.
Results
One-third under-report
If an analysis uses only data for plausible respondents, the cost in terms of sample size may be high. Based on the confidence interval of 70% to 142% around the ratio of reported energy intake to predicted energy expenditure, 9,196 (57%) of CCHS respondents were identified as plausible respondents; 5,388 (33%) as under-reporters; and 1,606 (10%) as over-reporters.
The characteristics of plausible respondents did not differ significantly from those of the total population (Appendix Table A). However, significant differences between plausible respondents and under- and over-reporters emerged in relation to BMI, physical activity, highest level of education in the household, and province. These differences persisted in a logistic regression model (data not shown).
Association between reported energy intake and weight
The biological relationship between energy intake and weight is obvious: if weight is to be maintained, long-term energy expenditure must match long-term energy intake. The higher the weight, the higher the energy expenditure, and the greater the energy intake. Thus, in theory, the regression coefficients between weight and predicted energy expenditure requirements, and between weight and energy intake should be the same.
Table 2 shows the slope of weight in the model of total predicted energy expenditure. Because predicted energy expenditure depends on body weight, the relationship would be expected to be strong. In fact, this is borne out, with the R² ranging from 0.51 to 0.77 (data not shown).
Table 2 also shows the slope of weight in the model of energy intake for all respondents and for plausible respondents. For all respondents, not only are all the slopes significantly different from the slope for total energy expenditure, but for 7 of the 12 age/sex groups, the slope is negative, indicating that the higher their measured weight, the lower their reported energy intake. In these models, R² never exceeds 0.04 (data not shown).
Table 2
Slope of weight variable in modelling predicted energy expenditure or energy intake for all respondents and for plausible respondents, by age group and sex, household population aged 12 or older, Canada excluding territories, 2004
By contrast, for plausible respondents, the slope of weight is always positive. All slopes are closer to the theoretical biological relationship (the higher the weight, the greater the energy intake), and there is no significant difference from the energy expenditure model for 4 of the 12 age/sex groups. Therefore, when based on plausible respondents, the quality of the energy intake model improves, with the R² ranging from 0.02 to 0.24 (data not shown).
Impact on reporting of nutrient intake
A comparison of the average reported consumption of a nutrient for all respondents with that for plausible respondents provides an estimate of the extent to which consumption of that nutrient is under-reported (Table 3). For example, based on the reported calorie intake of all respondents versus plausible respondents, the average rate of energy under-reporting is 8.1%, which is close to the 9.6% estimated in the accompanying article.9 Under-reporting of fat and sugar consumption amounts to 9.3% and 9.6%, respectively. Calcium (8.3%) and alcohol (8.8%) are also under-reported. (Negative values indicate that intake of the nutrient is over-reported by the total population.)
Table 3
Under-reporting of selected nutrients, household population aged 12 or older, Canada excluding territories, 2004
The ratio of reported energy intake to energy expenditure (EI:EE) is higher for plausible respondents (0.98) than for all respondents (0.90) (Table 4). Regardless of age and sex, the ratios for plausible respondents are higher, ranging from 0.96 to 1.02, compared with 0.84 to 1.01 for the total population. Even among obese people, who tend to under-report, the ratio is 0.96 for those identified as plausible respondents, compared with 0.79 for all respondents.
Table 4
Ratio of energy intake and predicted energy expenditure of all respondents and plausible respondents, by body mass index (BMI) category, age group and sex, household population aged 12 or older, Canada excluding territories, 2004
Impact on modelling obesity
The substantial under-reporting of food and beverage consumption in the CCHS has implications for analysis of the data. For example, the relationship between energy intake (calories consumed) and obesity among people aged 18 or older differs depending on whether the results are based on all respondents or on plausible respondents. Based on all respondents, no significant association emerges between calorie consumption and obesity (Table 5). However, the use of only plausible respondents yields a positive and significant association between obesity and calories consumed for both sexes.
Table 5
Adjusted odds ratios relating energy intake and selected characteristics to obesity among all respondents and among plausible respondents, household population aged 18 or older, Canada excluding territories, 2004
Discussion
The main strength of this study is the large sample size. Even excluding under- and over-reporters, the number of plausible respondents is large enough and representative enough to permit a detailed analysis. As well, the availability of measured height and weight data obviated the need to account for the response bias associated with self-reports of these variables. Finally, the analysis incorporates a physical activity variable and considers several levels of activity when predicting energy expenditure requirements.
It would have been possible to assume, as was done by McCrory et al.17 and Huang et al.,18 that all respondents were "low active." This would result in a slightly smaller sample of plausible respondents (56%), but 96% of CCHS respondents would still be classified in the same category (under-reporter, over-reporter or plausible respondent) as in the present study. The difference lies in the representativeness of the sample, which would be slightly less if a single physical activity level had been used.
Another option for estimating physical activity would have been to use the method employed in What America Drinks,19 whereby the lower confidence interval limit is set based on the assumption of a sedentary level of physical activity, and the upper confidence interval limit is set assuming a very active level of physical activity. The result is that a much larger proportion of the CCHS sample—74%—would be considered to be plausible respondents, including every plausible respondent identified in the present study. The drawback is that this method is less effective in correcting the distortion in the biological relationship between energy intake and body weight, and the relationship between energy intake and obesity in the logistic model is weaker.
An alternative is the Goldberg cut-offs for physical activity level (PAL) using the ratio of energy intake and predicted basal metabolic rates from the Schofield equations.26 Using the same SD, with a basic comparative PAL of 1.55 and a multiplicative factor of 1, yields a slightly smaller sample of plausible respondents (54%) than in the present study. While the classification of respondents as under- and over-reporters and plausible respondents is the same for 90% of the sample, the representativeness of the plausible respondents is not as good.
The confidence intervals in this study were constructed in the log scale and exponentiated, whereas McCrory et al.17 and Huang et al.18 used a symmetrical ±35% confidence interval. Application of the latter to the CCHS data would shift the interval downward, so that the plausible group would include more respondents who under-report and fewer who over-report, although 92% of respondents would still be classified in the same category as in the present study. The difference is in the level of correction of the energy intake to energy expenditure ratio (EI:EE). Instead of increasing the average EI:EE ratio for plausible respondents to 0.98, as is the case when using the log scale, a symmetrical confidence interval yields an average EI:EE ratio of 0.94.
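The two constructions can be compared with a short worked calculation. It assumes the symmetrical interval is read as 1 ± 0.35 around a ratio of 1, which is an interpretation rather than something the text states explicitly.

```latex
\begin{align*}
\text{log scale:}   \quad & [\,e^{-0.35},\; e^{0.35}\,] \approx [\,0.70,\; 1.42\,] \\
\text{symmetrical:} \quad & [\,1 - 0.35,\; 1 + 0.35\,] = [\,0.65,\; 1.35\,]
\end{align*}
```

Under this reading, both limits of the symmetrical interval are lower than those of the log-scale interval, which is consistent with the lower average EI:EE ratio (0.94 rather than 0.98) among the respondents it retains.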
Huang et al.18 used virtually the same method as that employed in the present study to identify plausible respondents in the 1994-1996 Continuing Survey of Food Intakes by Individuals (CSFII). They identified plausible respondents through an optimal confidence interval of ±30.8% corresponding to 1.4 times the standard deviation (SD). McCrory et al.17 used somewhat different predictive equations, but their confidence interval of ±1 SD corresponded to ±30%. In both cases, their confidence interval is similar to the one obtained in this analysis. The difference stems from the fact that the CSFII has a smaller CV for the energy intakes, compared with the Canadian population.
Limitations
A major limitation of the present study is that predicted energy expenditure requirements are based on a model constructed by the Institute of Medicine using a sample of about 700 people of different ages and levels of physical activity; very active people and some specific ages are under-represented in that sample.
The level of physical activity measured in the CCHS is self-reported. As well, it refers only to leisure time; work- or transportation-related physical activity is not included.
Excluding under- and over-reporters from an analysis also excludes some plausible respondents who happened to eat much more or much less than usual on the dietary recall day. Even so, on the basis of their socio-demographic characteristics, those who were identified as plausible respondents are representative of the total population.
Conclusion
It is essential to consider under-reporting in analyses of data from nutrition surveys, especially when examining relationships between diet and variables that are highly correlated with under-reporting, such as body mass index. This study shows that just over half of respondents to the Canadian Community Health Survey reported food and beverage consumption that was "plausible," given their height, weight, and level of physical activity. Fully a third were "under-reporters," in that what they reported that they ate and drank could not sustain their measured weight. Under-reporting was particularly common among people who were obese, and as a consequence, under-reporting tends to distort analyses of the relationship between energy intake and obesity.
This study confirms the findings of others, specifically McCrory et al.17 and Huang et al.,18 on the impact of under-reporting on the biological relationship between energy intake and weight, as well as on models related to body mass index. And because of the large CCHS sample, these results can be confirmed for specific age and sex groups.
This study also validates the companion article on under-reporting.9 The characteristics of under-reporters, compared with those of plausible respondents, are essentially the same as with the other method. Average energy under-reporting is slightly less in this analysis, but still in line with the results in the other article.
The technique employed in the present study has numerous applications. It can be used to analyze models related to body mass index and to estimate the level of under-reporting of nutrients, certain foods, and food groups. The results show that some foods are particularly susceptible to under-reporting. Future studies could attempt to better identify foods that are under-reported and occasions when the food consumed may not be reported.