Key Concepts About Missing Data in NHANES II

Missing values may distort your analysis results. You must evaluate the extent of missing data in your dataset to determine whether the data are usable without additional re-weighting for item non-response. As a general rule, if 10% or less of the data for a variable are missing from your analytic dataset, it is usually acceptable to continue your analysis without further evaluation or adjustment. However, if more than 10% of the data for a variable are missing, you may need to determine whether the missing values are distributed equally across socio-demographic characteristics, and decide whether imputation of missing values or use of adjusted weights is necessary. (Please see Analytic Guidelines for more information.)
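The 10% rule of thumb above can be checked with a few lines of code. The following is a minimal sketch in Python, not part of the NHANES II documentation. It assumes missing values have already been read in as `None`, and the function names and the `threshold` parameter are hypothetical.

```python
def missing_fraction(values):
    """Return the share of observations that are missing (None)."""
    if not values:
        raise ValueError("no observations to evaluate")
    return sum(v is None for v in values) / len(values)


def needs_evaluation(values, threshold=0.10):
    """True when more than `threshold` of the observations are missing.

    The 0.10 default mirrors the general rule in the text; it is a
    guideline, not a hard cutoff.
    """
    return missing_fraction(values) > threshold


# Hypothetical variable with 1 missing value out of 4 observations.
example = [120, None, 135, 128]
print(missing_fraction(example))   # share of missing values
print(needs_evaluation(example))   # whether further evaluation is suggested
```

A variable flagged this way is only a candidate for further review; whether imputation or adjusted weights are needed still depends on how the missing values are distributed.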

When you review the codebooks of NHANES II data, you should note that NHANES II assigns missing values in the following way:

However, other types of data should also be treated as unavailable for analysis. When a sample person refuses to answer a question, the interview runs out of time, or an answer is unobtainable for some other reason, the response is assigned a value of "8," "88," "888," or "8888," depending on the number of digits in the variable value range. A "Don't know" response is assigned a value of "9," "99," "999," or "9999," again depending on the number of digits in the variable value range.

If you fail to identify these other types of missing data and treat the assigned values for "Blank but applicable" or "Don't know" responses as real values, you will get distorted results in your statistical analyses. Therefore, it is important to recode "Blank but applicable" or "Don't know" responses as missing values (either as a period (.) for numeric variables or as a blank for character variables).
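The recoding step can be sketched as follows. This is an illustrative Python example rather than the tutorial's own code. It assumes each variable's field width (number of digits) is taken from the codebook, and it represents a missing value as `None`. The function name `recode_unavailable` and the sample data are hypothetical.

```python
def recode_unavailable(value, width):
    """Return None when `value` is the all-8s or all-9s code for a field
    that is `width` digits wide; otherwise return the value unchanged.

    Assumption: `width` comes from the codebook. Confirm there that the
    all-8s and all-9s codes are not legitimate values for the variable.
    """
    blank_but_applicable = int("8" * width)  # e.g. 88 for a 2-digit field
    dont_know = int("9" * width)             # e.g. 99 for a 2-digit field
    if value is None or value in (blank_but_applicable, dont_know):
        return None
    return value


# Hypothetical 2-digit variable: 88 and 99 become missing, 8 stays a real value.
raw = [45, 88, 8, 99, 62]
cleaned = [recode_unavailable(v, width=2) for v in raw]
print(cleaned)
```

Passing the width explicitly keeps a genuine value of 8 in a two-digit field from being mistaken for the one-digit "Blank but applicable" code.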

Unavailable for Analysis Codes
NHANES II codes Description Action
. (period)