Overview of the NCI Method
In collaboration with colleagues from numerous institutions, the National Cancer Institute (NCI) has developed a unified framework to predict usual dietary intakes of episodically-consumed or ubiquitously-consumed dietary constituents using two or more 24-hour recalls for at least a subset of a sample. This method can be used for a variety of general applications, which are summarized in Table 1.
The NCI method provides one way of estimating usual intake, but it is not the only method available.
Like the ISUF Method, the premise of the NCI method is that usual intake is equal to the probability of consumption on a given day times the average amount consumed on a "consumption day." The exact methods used for dietary components that are consumed nearly every day by nearly everyone (ubiquitously consumed) differ from those used for dietary components that are not (episodically consumed). In general, the former category refers to nutrients and the latter category refers to foods.
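The premise above can be illustrated with a small calculation. This is only a sketch: the function name and the example numbers are hypothetical, and the NCI method estimates both quantities from a statistical model rather than taking them as known inputs.

```python
def usual_intake(prob_consume: float, mean_amount_on_consumption_day: float) -> float:
    """Usual intake = P(consumption on a given day) * mean amount on a consumption day."""
    if not 0.0 <= prob_consume <= 1.0:
        raise ValueError("prob_consume must be between 0 and 1")
    if mean_amount_on_consumption_day < 0:
        raise ValueError("amount must be non-negative")
    return prob_consume * mean_amount_on_consumption_day


# Hypothetical episodically consumed food: eaten on 1 day in 4, 2.0 units per consumption day.
episodic = usual_intake(0.25, 2.0)

# Hypothetical ubiquitously consumed nutrient: probability of consumption is taken to be 1.
ubiquitous = usual_intake(1.0, 70.0)

print(episodic, ubiquitous)
```

For a ubiquitously consumed constituent the probability term drops out, so usual intake reduces to the mean daily amount.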
For episodically-consumed dietary constituents, a two-part model with correlated person-specific effects is used to model usual intake. The first part of the model estimates the probability of consuming an episodically-consumed dietary constituent using logistic regression with a person-specific random effect. The second part of the model specifies the consumption-day amount using linear regression on a transformed scale, also with a person-specific random effect. The person-specific effects represent the deviation of the individual’s probability of consumption and amount of intake from the population mean. Because these effects are specific to individuals, they vary only between individuals; therefore, they capture the between-person variation of usual intake in the population. The two parts of the model, probability and consumption-day amount, are linked by allowing the two person-specific effects to be correlated and by including common covariates (e.g., age, sex) in both parts of the model. Intake data from 24-hour recalls provide the values for the dependent variable; one or more covariates may be incorporated into the statistical model. The resulting estimated model parameters can then be used to estimate the final products, depending on the application of interest.
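To make the structure of the two-part model concrete, the sketch below simulates recall data from such a model. It does not fit the model and is not the MIXTRAN macro. All parameter values are made up for illustration, no covariates are included, and the log transformation is an assumption chosen for simplicity (the text above only says the amount is modeled on a transformed scale).

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_recalls = 2000, 2

# Illustrative (assumed) parameters, not NCI estimates.
beta_prob = -0.5   # intercept of the probability part (logit scale)
beta_amt = 1.0     # intercept of the amount part (log scale, assumed transformation)
sd_u1, sd_u2, rho = 1.0, 0.5, 0.4   # SDs and correlation of the person-specific effects
sd_within = 0.6    # within-person SD of the amount on the log scale

# Correlated person-specific effects: u[:, 0] for probability, u[:, 1] for amount.
cov = [[sd_u1**2, rho * sd_u1 * sd_u2],
       [rho * sd_u1 * sd_u2, sd_u2**2]]
u = rng.multivariate_normal([0.0, 0.0], cov, size=n_people)

# Part 1: each person's probability of consumption on a given day.
p = 1.0 / (1.0 + np.exp(-(beta_prob + u[:, 0])))
consumed = rng.random((n_people, n_recalls)) < p[:, None]

# Part 2: amount on consumption days, linear on the log scale.
log_amt = beta_amt + u[:, [1]] + rng.normal(0.0, sd_within, (n_people, n_recalls))
recalls = np.where(consumed, np.exp(log_amt), 0.0)

# Each person's usual intake: probability times expected consumption-day amount
# (the sd_within**2 / 2 term is the log-normal back-transformation).
usual = p * np.exp(beta_amt + u[:, 1] + sd_within**2 / 2.0)

print("share of zero recalls:", (recalls == 0).mean())
print("mean usual intake:", usual.mean())
```

The simulated person-specific effects vary only between people, while the residual in the amount part varies from recall to recall, which mirrors the split between between-person and within-person variation described above.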
For a ubiquitously-consumed dietary constituent, the probability part of the model is not needed because the probability of consumption is assumed to be one. With a ubiquitously-consumed dietary constituent, however, zero intakes may occasionally occur. In this case, they are set to one-half of the minimum value in the sample.
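The zero-handling rule can be sketched as follows. The function name is hypothetical, and the sketch assumes that "minimum value in the sample" means the smallest positive reported intake.

```python
def replace_zero_intakes(intakes):
    """Replace zero intakes with one-half of the smallest positive intake in the sample.

    Assumption: the "minimum value" is the smallest value greater than zero.
    """
    positives = [x for x in intakes if x > 0]
    if not positives:
        raise ValueError("sample contains no positive intakes")
    half_min = min(positives) / 2.0
    return [x if x > 0 else half_min for x in intakes]


# Hypothetical reported intakes with two zeros.
print(replace_zero_intakes([0.0, 4.0, 10.0, 0.0]))
```

A replacement of this kind keeps every value positive, which matters when the amount is modeled on a transformed scale that is undefined at zero.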
Evidence for the validity of the NCI method, as it relates to estimating the distribution of usual intakes of episodically-consumed dietary constituents, has been published in a series of papers in the Journal of the American Dietetic Association. Simulation studies show that the NCI method for foods is an improvement over existing methods, including the two-day mean and the ISUF method (Tooze et al., 2006). Methodology for estimating usual food intakes for use in a regression analysis (for example, to examine relationships between diet and health) has been published in Biometrics (Kipnis et al., 2009). Analyses establishing the validity of the method to estimate the distribution of usual intakes of ubiquitously-consumed dietary constituents indicate that the NCI method provides better estimates than a two-day mean and generally gives results very similar to those of the ISUF method (Tooze et al., 2010), although a thorough comparison across a wide variety of ubiquitously-consumed dietary constituents has not been done.
Covariates, which are incorporated into the NCI method through the two regression models, may be used to explain some of the variation of usual intake both between and within individuals. Fitting a categorical variable allows the mean usual intake to vary for each level of the category. For example, incorporating interview sequence as a covariate in the modeling allows the estimated intake to be shifted by a constant amount, depending on whether the data were gathered on day 1 or day 2. Including this covariate would adjust for any tendency of day 2 recalls to report lower intake than day 1 recalls.
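The idea of a constant shift for interview sequence can be illustrated with an indicator variable in a regression. The sketch below is a deliberate simplification: it uses ordinary least squares on made-up values and omits the person-specific random effects that the NCI models include.

```python
import numpy as np

# Hypothetical transformed-scale intakes for four people on day 1 and day 2.
day1 = np.array([2.0, 2.4, 1.8, 2.2])
day2 = np.array([1.8, 2.2, 1.6, 2.0])

y = np.concatenate([day1, day2])
is_day2 = np.concatenate([np.zeros(day1.size), np.ones(day2.size)])

# Design matrix: intercept plus an indicator for the second interview.
X = np.column_stack([np.ones_like(y), is_day2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, day2_shift = coef

print("day 1 mean:", intercept)
print("shift on day 2:", day2_shift)
```

The coefficient on the indicator is the constant amount by which day 2 recalls differ from day 1 recalls in this toy data set; a categorical covariate with more levels would add one indicator per additional level.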
In general, two broad classes of covariates may be incorporated into the NCI models: covariates that vary between individuals, and covariates that may vary within an individual over time. Some examples of this first class of covariates include demographics or other personal characteristics. These types of covariates may be used to explain variation in usual intakes or make estimates for a subpopulation of interest. The latter class of covariates may (or may not) differ from day 1 to day 2 of the recalls. They include weekend effects, interview season effects, and any other variables that may vary from day to day. The purpose of incorporating these variables into the model is to adjust for differences due to known day-to-day variation, such as different eating patterns on weekends versus weekdays or by season, and differences in reporting that may occur with the repeated administration of the dietary recall.
In general, estimating the distribution of usual intake does not require the use of covariates to explain variation between persons. This is because the total between-person variation (both explained and unexplained) is reflected in the estimates. However, incorporating covariates may be useful for defining subpopulations for which you would like to estimate the distribution of usual intake of a dietary component. Covariates also may be used to reflect the distribution of usual intake in the week or year, rather than in the NHANES sample. For example, by incorporating a weekend covariate, it is possible to obtain estimates that reflect differences in intake by weekend and weekday (and by season). In applications relating usual intake to other variables, explaining the unknown variation in usual intake is beneficial, and is necessary for some procedures, such as regression calibration.
When estimating the usual intake distribution, the NCI method adjusts in two ways for whether a recall reflects weekend or weekday consumption. First, the weekend vs. weekday indicator variable is incorporated as a covariate in the modeling. Second, the separate estimates of intake for weekend and weekday are weighted (by 3/7 and 4/7, respectively) when estimating the distribution of usual intake. This ensures that the estimates reflect overall intake in the population rather than intake on the sampled days.
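The weighting step amounts to a weighted average of the two estimates. A minimal sketch, with a hypothetical function name and made-up intake values:

```python
WEEKEND_WEIGHT = 3 / 7  # share of the week counted as weekend
WEEKDAY_WEIGHT = 4 / 7  # share of the week counted as weekday


def weekly_usual_intake(weekend_intake: float, weekday_intake: float) -> float:
    """Combine weekend and weekday intake estimates using the 3/7 and 4/7 weights."""
    return WEEKEND_WEIGHT * weekend_intake + WEEKDAY_WEIGHT * weekday_intake


# Hypothetical estimates: higher intake on weekend days than on weekdays.
print(weekly_usual_intake(weekend_intake=2.8, weekday_intake=2.1))
```

Because the weights are fixed at each day type's share of the week, the combined estimate does not depend on how many weekend and weekday recalls happened to be collected.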
Because an additional post-stratification step is performed for the dietary weights to balance recalls across days of the week, it may seem as though the NCI method is “overadjusting” for weekend and weekday consumption. However, the dietary weights are created to balance the sampling over the week. The weekend/weekday covariate is used to model different levels of consumption by weekend/weekday and to estimate the distribution of usual intake accordingly.
The NCI method involves using two or more 24-hour recalls as well as covariates, which may include data from an FFQ such as the NHANES 2003-2006 Food Frequency Questionnaire (formerly called the Food Propensity Questionnaire). A frequency instrument can improve the power to detect relationships between dietary intakes as predictor variables and other variables. The magnitude of improvement depends on the proportion of zeroes in the dietary constituent, with the FFQ having a greater impact on dietary constituents with a large number of zero intakes (i.e., episodically-consumed foods). However, when applying the NCI method to estimate usual intake distributions, satisfactory results can generally be obtained without the FFQ as a covariate.
For a particular episodically-consumed dietary constituent such as a food, if the food is not reported on the 24-hour recall, it is assumed that the food was not consumed on that day. The NCI method assumes that the 24-hour recall is an unbiased instrument for measuring usual intake on the original scale: that it does not misclassify the respondent's consumption, and that it provides an unbiased measure of the amount consumed on a consumption day. In other words, it assumes that the amount consumed is prone only to random error, not systematic error.
Many studies have found misreporting of energy intake on both 24-hour recalls and food frequency instruments, almost always in the direction of underreporting. This suggests that some foods are underreported. Furthermore, studies using recovery biomarkers (i.e., biomarkers that recover intake without systematic bias, such as those for energy and protein) provide evidence of intake-related bias and person-specific bias on 24-hour recalls. However, without a recovery biomarker for each dietary constituent of interest, it is not possible to correct for systematic errors on the 24-hour recall.
If only a limited number of repeated 24-hour recalls are available, reliable separation between non-consumers, irregular consumers, and always-consumers is not possible. Therefore, in the absence of extra information about ever- vs. never-consumption, the NCI method does not estimate the proportion of non-consumers/always-consumers of a given food.
An overview of the steps involved in fitting the NCI method is provided below, and an example of the applications of interest is found in Table 1. The first step of any of the applications of the NCI method is to fit the statistical model. Then, depending on the application of interest, the parameters are used in different ways.
Step 1: Fit a two-part statistical model with correlated person-specific effects. Then, use the estimated model parameters to complete Step 2.
Step 2: Estimate final products depending on application of interest
Table 1. Statistical tasks and applications addressed by the NCI method and the SAS macros available for each task.
Statistical Task | Typical Application | SAS Macro |
---|---|---|
Estimating the mean and distribution of intake for a population | | MIXTRAN, DISTRIB |
Estimating the mean and distribution of intake for a subpopulation | | MIXTRAN, DISTRIB |
Estimating individual food intake to make etiologic inferences | | MIXTRAN, DISTRIB, INDIVINT |
Approximating the effects of individual covariates on food intake | | MIXTRAN |
Macros for fitting the NCI Method are available on the NCI website. Version 1.1 of the macros was used in this tutorial. We recommend that you check this website for macro updates before starting any analysis. Additional details regarding the macros and additional examples may also be found on the website.
Dodd KW, Guenther PM, Freedman LS, Subar AF, Kipnis V, Midthune D, Tooze JA, Krebs-Smith SM. Statistical methods for estimating usual intake of nutrients and foods: a review of the theory. Journal of the American Dietetic Association 2006;106(10):1640-1650.
Kipnis V, Midthune D, Buckman DW, Dodd KW, Guenther PM, Krebs-Smith SM, Subar AF, Tooze JA, Carroll RJ, Freedman LS. Modeling data with excess zeros and measurement error: application to evaluating relationships between episodically consumed foods and health outcomes. Biometrics 2009;65:1003-1010.
Subar AF, Dodd KW, Guenther PM, Kipnis V, Midthune D, McDowell M, Tooze JA, Freedman LS, Krebs-Smith SM. The Food Propensity Questionnaire (FPQ): concept, development and validation for use as a covariate in a model to estimate usual food intake. Journal of the American Dietetic Association 2006;106(10):1556-1563.
Tooze JA, Kipnis V, Buckman DW, Carroll RJ, Freedman LS, Guenther PM, Krebs-Smith SM, Subar AF, Dodd KW. A mixed-effects model approach for estimating the distribution of usual intake of nutrients: the NCI method. Statistics in Medicine 2010;29(27):2857-2868.
Tooze JA, Midthune D, Dodd KW, Krebs-Smith SM, Subar AF, Carroll RJ, Kipnis V. A new statistical method for estimating the distribution of usual intake of episodically consumed foods. Journal of the American Dietetic Association 2006;106(10):1575-1587.