Data Normality and Transformations
Purpose
Environmental chemical data usually are not normally distributed. The distributions are often skewed to the right and often include values below the limit of detection (LOD). They may also include extreme values or outliers.
Descriptive statistics and graphical assessments are used to check for data normality. Appropriate transformations of the variable can be applied if needed. Determining normality and transforming variables where necessary are key steps in performing statistical analyses of environmental chemical exposure data.
Task 1: Check for Data Normality
Descriptive statistics of normality are useful in examining whether your variable is normally distributed or highly skewed. Plotting will help you visualize the skewness of your data distribution. These are important assessments before conducting hypothesis testing with environmental chemical variables.
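As a rough illustration of the kind of descriptive check described above, the following Python sketch computes moment-based skewness and excess kurtosis using only the standard library. The data are simulated, not real measurements: the variable name conc and the lognormal parameters are assumptions made for this example, and the tutorial's own software and procedures may differ.

```python
import random
import statistics


def skewness(values):
    """Moment-based sample skewness, m3 / m2**1.5 (0 for symmetric data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis, m4 / m2**2 - 3 (0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


# Hypothetical right-skewed "concentrations" (simulated for illustration only).
random.seed(0)
conc = [random.lognormvariate(0.0, 1.0) for _ in range(5000)]

print("mean:           ", statistics.mean(conc))
print("median:         ", statistics.median(conc))
print("skewness:       ", skewness(conc))
print("excess kurtosis:", excess_kurtosis(conc))
```

For right-skewed data such as these, the mean exceeds the median and the skewness is clearly positive. Values near zero for both statistics would be more consistent with a normal distribution.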
Task 2: Transformation of Variables to Approximate a Normal Distribution
When there is evidence of data skewness, one option is to transform the variable. Commonly used transformations with environmental chemical data include log(x), square-root(x), arc-sine(x), 1/x, exp(x), squaring (i.e., x²), cubing (i.e., x³), etc.
- Key Concepts about Transforming a Variable to Approximate a Normal Distribution
- How to Transform a Single Variable
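The effect of a transformation on skewness can be sketched in a few lines of Python. This is a minimal example on simulated data, not the tutorial's own procedure: the variable names and the lognormal parameters are assumptions, and the example assumes every value is strictly positive, since log(x) is undefined for zero or negative values.

```python
import math
import random


def skewness(values):
    """Moment-based sample skewness, m3 / m2**1.5 (0 for symmetric data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed "concentrations" (simulated for illustration only).
random.seed(1)
conc = [random.lognormvariate(0.0, 1.0) for _ in range(5000)]

# Two of the transformations listed above; both require positive inputs here.
log_conc = [math.log(v) for v in conc]
sqrt_conc = [math.sqrt(v) for v in conc]

print("skewness, untransformed:", skewness(conc))
print("skewness, square-root:  ", skewness(sqrt_conc))
print("skewness, natural log:  ", skewness(log_conc))
```

In this simulated case the square-root transformation reduces the right skew only partly, while the log transformation brings the skewness close to zero. With real data, the appropriate transformation should be chosen by re-checking the descriptive statistics and plots after transforming.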