Assess Normality and Estimate Percentiles, Geometric Means, and Proportions
Purpose
NHANES data are often used to provide national estimates on important public health issues. This module shows how to generate the descriptive statistics most often used to obtain these estimates from NHANES data. Topics covered include checking frequency distributions and normality, and generating percentiles, geometric means, and proportions.
Task 1: Check Frequency Distribution and Normality
It is highly recommended that you examine the frequency distribution and normality of the data before starting any analysis. These descriptive statistics are useful in determining whether parametric or non-parametric methods are appropriate to use, and whether you need to recode or transform data to account for extreme values and outliers.
- Key Concepts about Checking Frequency Distribution and Normality
- How to Check Frequency Distribution and Normality in SAS
- Download Sample Code and Dataset
Because the procedure is the same as it would be for any dataset, links to the Continuous NHANES Web Tutorial are provided here for your convenience. If you use the code, please be sure to change it to match the variables in your environmental analytic dataset.
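As a rough illustration of the kind of check described in this task, the following Python sketch compares the mean with the median and computes a moment-based skewness before and after a log transformation. It is not the tutorial's SAS code: the measurements are hypothetical, the helper name `sample_skewness` is an assumption made for this example, and survey weights are ignored for simplicity.

```python
import math
import statistics


def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (unweighted)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical right-skewed measurements (illustrative only, not NHANES data).
values = [0.5, 0.7, 0.8, 1.0, 1.1, 1.3, 1.6, 2.4, 4.9, 12.0]
logged = [math.log(v) for v in values]

raw_skew = sample_skewness(values)
log_skew = sample_skewness(logged)

# A mean well above the median is a quick sign of a long upper tail.
print("mean:", statistics.mean(values), "median:", statistics.median(values))
print("skewness (raw):", round(raw_skew, 2))
print("skewness (log):", round(log_skew, 2))
```

A skewness near zero suggests a roughly symmetric distribution. In this sketch the log transformation reduces the skewness, which is the sort of evidence that might lead you to transform a variable before analysis.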
Task 2: Generate Percentiles
Percentiles are used to indicate the relative position of an individual within a given dataset. Frequency distributions and percentiles can also be used to describe the characteristics and shape of a distribution and to check for outliers.
Although SAS has commands for calculating weighted percentile estimates, it does not have commands that directly produce standard errors for the percentiles. This tutorial therefore does not provide SAS sample programs for percentiles and their standard errors; please refer to the SUDAAN program instead.
- Key Concepts about Percentiles
- How to Generate Percentiles in SUDAAN
- Download Sample Code and Dataset
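To make the idea of a weighted percentile concrete, here is a minimal Python sketch. It uses one simple definition (the smallest value at which the cumulative share of weight reaches p percent); SUDAAN and SAS may interpolate differently, so results need not match theirs exactly. The function name, values, and weights are hypothetical, and the sketch does not produce standard errors, which require the survey design information.

```python
def weighted_percentile(values, weights, p):
    """Smallest value whose cumulative weight share is at least p percent."""
    if not 0 < p <= 100:
        raise ValueError("p must be in the interval (0, 100]")
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")

    pairs = sorted(zip(values, weights))      # order observations by value
    total = sum(w for _, w in pairs)
    target = total * p / 100.0                # weight that must be accumulated

    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= target:
            return value
    return pairs[-1][0]                       # guard against rounding error


# Hypothetical measurements and sample weights (not NHANES data).
concentrations = [0.8, 1.2, 2.5, 4.0, 9.5]
sample_weights = [1500, 3000, 2500, 2000, 1000]

median = weighted_percentile(concentrations, sample_weights, 50)
p95 = weighted_percentile(concentrations, sample_weights, 95)
print("weighted median:", median)
print("weighted 95th percentile:", p95)
```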
Task 3: Generate Geometric Means
A geometric mean provides a better estimate of central tendency than the arithmetic mean for data that are distributed with a "long tail" at the upper end of the distribution, which is very common in the measurement of environmental chemicals in blood or urine.
- Key Concepts about Generating Geometric Means
- How to Generate Geometric Means
- Download Sample Code and Dataset
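A geometric mean can be computed as the exponential of the (weighted) mean of the log-transformed values. The Python sketch below illustrates that calculation under stated assumptions: the function name, measurements, and weights are hypothetical, and no standard errors are produced, since those depend on the survey design.

```python
import math


def weighted_geometric_mean(values, weights):
    """exp( sum(w * ln(x)) / sum(w) ); requires strictly positive values."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean is defined only for positive values")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    log_mean = sum(w * math.log(v) for v, w in zip(values, weights)) / total_weight
    return math.exp(log_mean)


# Hypothetical right-skewed measurements and sample weights (not NHANES data).
concentrations = [0.8, 1.2, 2.5, 4.0, 9.5]
sample_weights = [1500, 3000, 2500, 2000, 1000]

geometric = weighted_geometric_mean(concentrations, sample_weights)
arithmetic = sum(w * v for v, w in zip(concentrations, sample_weights)) / sum(sample_weights)

print("weighted geometric mean: ", round(geometric, 3))
print("weighted arithmetic mean:", round(arithmetic, 3))
```

For data with a long upper tail, the geometric mean falls below the arithmetic mean because the few large values have less influence on the log scale.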
Task 4: Generate Proportions
Proportions are used for prevalence estimates of an event or trait (e.g., the prevalence of high blood pressure (HBP) among persons in the United States).
- Key Concepts about Proportions
- How to Generate Proportions Using SUDAAN
- How to Generate Proportions Using SAS Survey Procedures
- Download Sample Code and Dataset
Because the procedure is the same as it would be for any dataset, links to the Continuous NHANES Web Tutorial are provided here for your convenience. If you use the code, please be sure to change it to match the variables in your environmental analytic dataset.
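As a simple illustration of a weighted prevalence estimate, the Python sketch below divides the summed weights of persons with the trait by the summed weights of everyone in the sample. The indicator values, weights, and function name are hypothetical, and the sketch gives only the point estimate; SUDAAN or the SAS survey procedures are still needed for design-based standard errors.

```python
def weighted_proportion(indicators, weights):
    """Share of total weight carried by observations where the indicator is true."""
    if len(indicators) != len(weights):
        raise ValueError("indicators and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weight_with_trait = sum(w for flag, w in zip(indicators, weights) if flag)
    return weight_with_trait / total_weight


# Hypothetical HBP indicator and sample weights (not NHANES data).
has_hbp = [True, False, False, True, False]
sample_weights = [1500, 3000, 2500, 2000, 1000]

prevalence = weighted_proportion(has_hbp, sample_weights)
unweighted = sum(has_hbp) / len(has_hbp)

print("weighted prevalence:  ", prevalence)
print("unweighted proportion:", unweighted)
```

The weighted and unweighted figures differ here because the sampled persons represent different numbers of people in the population, which is why the sample weights should not be ignored.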
- Page last updated: May 3, 2013
- Page last reviewed: May 3, 2013
- Content source: CDC/National Center for Health Statistics
- Page maintained by: NCHS/NHANES