
Clean & Recode Data


Purpose

You must clean and recode NHANES data before you can use NHANES variables in your analyses. NHANES data may need to be cleaned if the dataset contains missing data, skip patterns, or outliers. You may also need to recode data to define new variables or new values for existing variables.

Task 1: Identify, Recode, and Evaluate Missing Data

Missing values may distort your analysis results. You must evaluate the extent of missing data in your dataset to determine whether the data are useable without additional re-weighting for item non-response.
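As a rough illustration of this task, the sketch below recodes special response codes to missing and then reports the share of missing values. It is not the tutorial's own procedure: the variable, the values, and the codes 7777 and 9999 (standing in for "refused" and "don't know") are assumptions made for the example, so check the codebook of each variable for its actual codes.

```python
def recode_missing(values, missing_codes):
    """Replace special codes (e.g. 'refused', "don't know") with None."""
    return [None if v in missing_codes else v for v in values]


def missing_fraction(values):
    """Return the share of records that have no usable value."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


# Hypothetical raw values for one variable; 7777 and 9999 are assumed codes.
raw = [120, 7777, 135, None, 9999, 110, 142, 128]

cleaned = recode_missing(raw, missing_codes={7777, 9999})
share_missing = missing_fraction(cleaned)

print(f"{share_missing:.1%} of records are missing")
```

Recoding first matters because a code such as 9999 would otherwise be treated as a real measurement and inflate means and totals.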

Task 2: Check for Skip Patterns and Explain How They Affect Results

The significance of a skip pattern depends on the question leading to the skip pattern, the questions within that skip pattern, and the variables you intend to analyze.

 

IMPORTANT NOTE

If you fail to check for skip patterns, your analysis may cover only a portion of the population instead of the entire study population.
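The effect of a skip pattern on the denominator can be sketched as follows. The field names `ever_smoked` and `smokes_now`, the coding (1 = yes, 2 = no), and the records are hypothetical, and the estimates are unweighted for simplicity. Respondents who answer "no" to the leading question are never asked the follow-up question, so their follow-up value is blank by design rather than truly missing.

```python
# Hypothetical records: 1 = yes, 2 = no, None = not asked (skipped) or missing.
records = [
    {"id": 1, "ever_smoked": 1, "smokes_now": 1},
    {"id": 2, "ever_smoked": 2, "smokes_now": None},
    {"id": 3, "ever_smoked": 1, "smokes_now": 2},
    {"id": 4, "ever_smoked": 2, "smokes_now": None},
    {"id": 5, "ever_smoked": 1, "smokes_now": 1},
]


def current_smoker(rec):
    """Return 1 (yes), 0 (no), or None (truly missing)."""
    if rec["ever_smoked"] == 2:
        return 0  # skipped by design: never smoked, so not a current smoker
    if rec["smokes_now"] == 1:
        return 1
    if rec["smokes_now"] == 2:
        return 0
    return None  # asked, but no usable answer


# Naive estimate: uses only respondents who were asked the follow-up question.
answered = [r for r in records if r["smokes_now"] is not None]
naive_rate = sum(r["smokes_now"] == 1 for r in answered) / len(answered)

# Estimate that accounts for the skip pattern: denominator is everyone.
flags = [current_smoker(r) for r in records]
known = [f for f in flags if f is not None]
full_rate = sum(known) / len(known)

print(f"among those asked: {naive_rate:.2f}; among all respondents: {full_rate:.2f}")
```

The first figure describes only people who have ever smoked, while the second describes the whole sample. Which one you need depends on the variables you intend to analyze.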

 

Task 3: Check Distributions and Describe the Impact of Influential Outliers

Before you analyze your data, check the distribution of each continuous variable, assess whether it is approximately normal, and identify any outliers.
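One simple way to screen a continuous variable is to compare the mean with the median and to flag values that fall outside the interquartile-range (IQR) fences. The sketch below uses that rule as an assumption of this example, not as the method prescribed by the tutorial, and the values are made up. It also ignores sample weights, which affect how influential an outlier is in a weighted analysis.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical measurements for one continuous variable.
values = [22.1, 24.5, 25.0, 26.3, 27.8, 28.4, 29.9, 31.2, 33.0, 64.7]

mean_value = statistics.mean(values)
median_value = statistics.median(values)
outliers = iqr_outliers(values)

# A mean well above the median suggests a right-skewed distribution.
print(f"mean={mean_value:.2f} median={median_value:.2f} outliers={outliers}")
```

A flagged value is not automatically an error. Review it against the documentation and judge whether to keep it, and describe how your results change with and without it.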

Task 4: Recode Variables

Recoding is an important step for preparing an analytical dataset. You may want to recode variables to create new variables that fit your analytic needs.
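As a minimal sketch of recoding, the function below collapses a continuous age variable into categories. The cut points and labels are hypothetical and chosen only for illustration, so pick groupings that fit your own analytic needs.

```python
def age_group(age_years):
    """Map age in years to a hypothetical analytic age category."""
    if age_years is None:
        return None  # keep missing values missing in the new variable
    if age_years < 20:
        return "under 20"
    if age_years < 40:
        return "20-39"
    if age_years < 60:
        return "40-59"
    return "60 and over"


# Hypothetical ages, including one missing value.
ages = [8, 19, 20, 39, 40, 59, 60, 85, None]
groups = [age_group(a) for a in ages]

for age, group in zip(ages, groups):
    print(age, "->", group)
```

Write the recoded values to a new variable rather than overwriting the original, so you can cross-check the two and confirm that every case landed in the intended category.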

 

Contact Us:
  • National Center for Health Statistics
    3311 Toledo Rd
    Hyattsville, MD 20782
  • 1 (800) 232-4636
  • cdcinfo@cdc.gov
  • Page last updated: May 7, 2013