Review Data & Create New Variables
Purpose:
Before you analyze NHANES environmental chemical data, you may need to review the data and create new variables. The data may need to be adjusted if variables have missing values or outliers. Depending on the purpose of your analysis, you may also need to create new variables (e.g., a categorical variable based on the limit of detection).
Task 1: Identify, Recode, and Evaluate Missing Data
Missing values may distort your analysis results. You must evaluate the extent of missing data in your dataset to determine whether the data are useable without additional re-weighting for item non-response.
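As a rough illustration of evaluating the extent of missing data, the sketch below computes the unweighted percentage of missing values for one variable. The variable name `chem_conc`, the record layout, and the use of `None` to mark a missing result are assumptions made for this example; they are not NHANES names or codes.

```python
# Hypothetical records: None marks a missing laboratory result.
records = [
    {"id": 1, "chem_conc": 0.82},
    {"id": 2, "chem_conc": None},
    {"id": 3, "chem_conc": 1.40},
    {"id": 4, "chem_conc": 0.35},
    {"id": 5, "chem_conc": None},
]


def percent_missing(rows, var):
    """Return the unweighted percentage of rows where `var` is missing (None)."""
    if not rows:
        raise ValueError("no rows to evaluate")
    n_missing = sum(1 for row in rows if row.get(var) is None)
    return 100.0 * n_missing / len(rows)


pct = percent_missing(records, "chem_conc")
print(f"chem_conc: {pct:.1f}% missing")
```

A high percentage would suggest looking more closely at who is missing before deciding whether re-weighting for item non-response is needed.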
Task 2: Check Distributions and Describe the Impact of Influential Outliers
Before you analyze environmental chemical data, check the distribution of the data, assess normality, identify outliers, and determine how those outliers might affect your analysis.
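One common way to screen for candidate outliers is the interquartile-range (IQR) rule. The sketch below is only an illustration of that general technique, not necessarily the procedure this tutorial uses; the sample values and the multiplier `k=1.5` are assumptions, and the calculation ignores survey weights.

```python
import statistics

# Hypothetical concentrations; the last value is deliberately extreme.
values = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 12.0]


def iqr_outliers(data, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]


flagged = iqr_outliers(values)
print("Candidate outliers:", flagged)
```

Flagged values are candidates for review, not automatic deletions; rerunning the analysis with and without them shows how much influence they have.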
Task 3: Check for Data Symmetry
Many statistical procedures are based on the assumption that data are normally distributed, and therefore, symmetrically distributed. However, the distributions of environmental chemical concentrations in blood or urine are often skewed.
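A minimal sketch of a symmetry check follows: it compares the sample skewness of a variable before and after a natural-log transformation, a transformation often applied to skewed concentration data. The data values are invented for illustration, and the skewness formula used here is the simple moment-based version, which is one of several definitions.

```python
import math
import statistics


def skewness(data):
    """Moment-based sample skewness: m3 / m2**1.5 (0 for symmetric data)."""
    n = len(data)
    mean = statistics.fmean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed concentrations (all must be > 0 to take logs).
raw = [0.2, 0.3, 0.5, 0.8, 1.3, 2.1, 3.4, 5.5, 8.9, 14.4]
logged = [math.log(x) for x in raw]

print(f"skewness (raw):    {skewness(raw):.2f}")
print(f"skewness (logged): {skewness(logged):.2f}")
```

A skewness much closer to zero after transformation suggests the log scale is a better fit for procedures that assume symmetry.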
Task 4: Create New Variables
Recoding is an important step for preparing an analytic dataset. You may want to recode variables or create new variables that fit your analytic needs.
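For example, a new categorical variable might flag whether each result falls below a detection limit. The sketch below assumes a single hypothetical limit `LOD = 0.5` and hypothetical field names (`chem_conc`, `detect_cat`); real detection limits differ by chemical and survey cycle and should be taken from the data documentation.

```python
LOD = 0.5  # hypothetical detection limit, for illustration only


def detection_category(conc, lod):
    """Classify a concentration relative to the detection limit."""
    if conc is None:
        return None  # keep missing values missing
    return "below LOD" if conc < lod else "at or above LOD"


# Hypothetical records with one missing result.
records = [
    {"id": 1, "chem_conc": 0.35},
    {"id": 2, "chem_conc": 0.82},
    {"id": 3, "chem_conc": None},
]

# Derive the new variable without altering the original one.
for row in records:
    row["detect_cat"] = detection_category(row["chem_conc"], LOD)

for row in records:
    print(row["id"], row["detect_cat"])
```

Leaving the original variable untouched and writing the result to a new field makes it easy to check the recode against the source values.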
Contact Us:
- National Center for Health Statistics
3311 Toledo Rd
Hyattsville, MD 20782
- 1 (800) 232-4636
- cdcinfo@cdc.gov