Preparing a Physical Activity Questionnaire Dataset
Purpose
Module 5 illustrates the basic principles for preparing a physical activity questionnaire dataset. We encourage you to approach the following tasks in sequence so that you replicate the steps you will need to take when you conduct an analysis of NHANES data. To help guide you through this process, we’ve created a SAS program titled “PAQMSTR.SAS” that demonstrates how to prepare a physical activity analytic dataset. We’ll also use this program to conduct analyses, such as calculating total physical activity minutes/week and MET-minutes/week, and comparing summary variables to national recommendations for physical activity.
When you have completed the tutorial and want to prepare a dataset for your own analysis, you may choose to look at activity energy expenditure within a different timeframe (e.g., daily) or you may want to perform an analysis that assesses the physical activity questionnaire data in a different way. You can still use this SAS program and tutorial as a guide for preparing your dataset and building your own analysis.
Task 1: Locate Variables
Physical activity questionnaire data files and supporting documentation are stored in the Questionnaire section of the NHANES website. This task will teach you how to identify physical activity questionnaire variables, appropriate sample weights, and their file locations.
Step 1: Identify Physical Activity Questionnaire Variables and File Locations
- Key Concepts about Identifying Physical Activity Questionnaire Variables and File Locations
- How to Identify Physical Activity Questionnaire Variables and File Locations
Step 2: Identify Correct Sampling Weights and File Locations
- Key Concepts about Identifying Correct Sampling Weights and File Locations
- How to Identify Correct Sampling Weights and File Locations
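As a rough illustration of Step 2: when several two-year cycles are pooled, the two-year weight is commonly rescaled by the number of cycles combined. The Python sketch below is not part of PAQMSTR.SAS. The function name, the derived column name, and the weight values are hypothetical, and the sketch assumes every pooled cycle carries a two-year interview weight such as WTINT2YR. The 1999-2002 cycles are released with their own four-year weight and need separate handling, so check the weighting documentation before relying on this pattern.

```python
def combined_weight(two_year_weight, n_cycles):
    """Rescale a two-year sample weight for an analysis pooling n_cycles cycles."""
    if n_cycles < 1:
        raise ValueError("n_cycles must be at least 1")
    return two_year_weight / n_cycles


# Hypothetical records: SEQN is the respondent ID, WTINT2YR the interview weight.
records = [
    {"SEQN": 1, "WTINT2YR": 12000.0},
    {"SEQN": 2, "WTINT2YR": 30000.0},
]

# Assumption: two two-year cycles are being combined into a four-year dataset.
for rec in records:
    rec["WT_COMBINED"] = combined_weight(rec["WTINT2YR"], n_cycles=2)
```

Keeping the original two-year weight alongside the rescaled one makes it easier to check the calculation later.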
Task 2: Download Data
To organize your data most effectively, it is helpful to create folders in which to save your data files, documentation, and extracted SAS datasets.
Step 1: Create a Directory
Step 2: Download Data Files and Supporting Documentation
- Key Concepts about Downloading Data Files and Supporting Documentation
- How to Download Data Files and Documentation
Step 3: Extract and Save Data Files
Task 3: Append & Merge Datasets
Typically, an NHANES physical activity questionnaire analytic dataset will include variables from multiple types of data files collected during two or more cycles. You will need to merge the data to bring together variables for the same participants from multiple physical activity questionnaire files, and append the data to combine records from multiple survey cycles.
- Key Concepts about Merging & Appending NHANES Data for Physical Activity Questionnaire Analyses
- How to Merge & Append NHANES Data for Physical Activity Questionnaire Analyses
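The difference between appending and merging can be sketched in a few lines of Python. This is an illustration only, not the tutorial's SAS code: the helper names, the PA_MIN variable, and all data values are hypothetical. SEQN is the NHANES respondent sequence number, and the sketch assumes each SEQN appears at most once per file.

```python
def append_cycles(*cycles):
    """Stack the records of several survey cycles into one list (an append)."""
    combined = []
    for cycle in cycles:
        combined.extend(cycle)
    return combined


def merge_on_seqn(left, right):
    """Merge two record lists on SEQN, keeping every record from `left`."""
    right_by_id = {rec["SEQN"]: rec for rec in right}
    merged = []
    for rec in left:
        row = dict(rec)                               # copy so inputs stay untouched
        row.update(right_by_id.get(rec["SEQN"], {}))  # add matching variables, if any
        merged.append(row)
    return merged


# Hypothetical demographic and questionnaire records for two cycles.
demo_cycle1 = [{"SEQN": 1, "RIDAGEYR": 34}, {"SEQN": 2, "RIDAGEYR": 51}]
paq_cycle1 = [{"SEQN": 1, "PA_MIN": 30}]
demo_cycle2 = [{"SEQN": 3, "RIDAGEYR": 27}]
paq_cycle2 = [{"SEQN": 3, "PA_MIN": 45}]

# Append like files across cycles first, then merge the file types on SEQN.
demo = append_cycles(demo_cycle1, demo_cycle2)
paq = append_cycles(paq_cycle1, paq_cycle2)
analytic = merge_on_seqn(demo, paq)
```

Respondent 2 has no questionnaire record in this example, so the merged row for that SEQN simply lacks PA_MIN. A real analysis would decide explicitly how to treat such records.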
Task 4: Review Data & Create New Variables
Before you can use the variables in the physical activity questionnaire dataset, you may need to review the data and create new variables. For example, you may need to adjust the NHANES data if the dataset has missing data or outliers. Depending on the purpose of your analysis, you may also need to create new variables (e.g., a summary variable that combines household/yard, travel, and leisure activity to determine physical activity guideline adherence).
Step 1: Identify, Recode, and Evaluate Missing Data
- Key Concepts about Identifying, Recoding, and Evaluating Missing Data
- How to Identify, Recode, and Evaluate Missing Data
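A minimal sketch of the recoding idea in Step 1, written in Python rather than SAS: NHANES questionnaire variables typically reserve codes made up of 7s and 9s for "Refused" and "Don't know", and these are usually set to missing before analysis. The variable name ACT_MINUTES, the specific codes, and the helper names are hypothetical; the actual codes for each variable should be taken from its codebook.

```python
def recode_missing(records, special_codes):
    """Return copies of the records with refused/don't-know codes set to None.

    special_codes maps a variable name to the set of codes that the codebook
    lists as 'Refused' or 'Don't know' for that variable.
    """
    cleaned = []
    for rec in records:
        row = dict(rec)
        for var, codes in special_codes.items():
            if row.get(var) in codes:
                row[var] = None
        cleaned.append(row)
    return cleaned


def count_missing(records, var):
    """Count records where `var` is absent or None."""
    return sum(1 for rec in records if rec.get(var) is None)


# Hypothetical raw data: 77777 = refused, 99999 = don't know (assumed codes).
raw = [
    {"SEQN": 1, "ACT_MINUTES": 30},
    {"SEQN": 2, "ACT_MINUTES": 99999},
    {"SEQN": 3, "ACT_MINUTES": 77777},
    {"SEQN": 4, "ACT_MINUTES": None},
]
cleaned = recode_missing(raw, {"ACT_MINUTES": {77777, 99999}})
n_missing = count_missing(cleaned, "ACT_MINUTES")
```

Comparing the missing count before and after recoding is a quick way to evaluate how much data the special codes account for.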
Step 2: Create New Variables
- Key Concepts about Creating New Variables
- How to Create New Variables to Describe Household/Yard Physical Activity
- How to Create New Variables to Describe Transportation Physical Activity
- How to Create New Variables to Describe Leisure-time Physical Activity
- How to Create New Variables to Describe Physical Activity Guideline Adherence – Minutes of Weekly Physical Activity
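The summary variables described in Step 2 can be sketched as follows. This Python example is illustrative and not taken from PAQMSTR.SAS. It assumes activity is reported as a number of times over the past 30 days with a duration per occasion, that one vigorous minute counts as two moderate minutes, and that the weekly target is 150 moderate-equivalent minutes. All function names and input values are hypothetical.

```python
GUIDELINE_MINUTES = 150  # assumed weekly target, in moderate-equivalent minutes


def weekly_minutes(times_past_30_days, minutes_each_time):
    """Convert a 30-day recall into average minutes per week."""
    return times_past_30_days * minutes_each_time * 7 / 30


def met_minutes(minutes, met_value):
    """MET-minutes are minutes of activity multiplied by the activity's MET value."""
    return minutes * met_value


def moderate_equivalent_minutes(moderate_min, vigorous_min):
    """Combine intensities, counting each vigorous minute twice (assumption)."""
    return moderate_min + 2 * vigorous_min


def meets_guideline(moderate_min, vigorous_min):
    """True when weekly moderate-equivalent minutes reach the assumed target."""
    return moderate_equivalent_minutes(moderate_min, vigorous_min) >= GUIDELINE_MINUTES


# Hypothetical respondent: 12 moderate sessions of 30 minutes in the past 30 days,
# plus 40 vigorous minutes per week.
moderate = weekly_minutes(times_past_30_days=12, minutes_each_time=30)
adherent = meets_guideline(moderate, vigorous_min=40)
moderate_met_minutes = met_minutes(moderate, met_value=4.0)
```

Household/yard, transportation, and leisure-time minutes would each be converted to weekly minutes in the same way before being summed into the overall summary variable.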
Step 3: Check Distributions and Describe the Impact of Influential Outliers
- Key Concepts about Outliers in NHANES Data
- How to Identify and Describe the Impact of Influential Outliers
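One simple way to describe the impact of extreme values, sketched here in Python, is to compare a summary statistic with and without them. The cutoff, the data, and the function name are hypothetical, and the means are unweighted for brevity; an actual NHANES analysis would apply the sample weights.

```python
from statistics import mean


def outlier_impact(values, cutoff):
    """Compare the mean with and without values above `cutoff`."""
    kept = [v for v in values if v <= cutoff]
    if not kept:
        raise ValueError("cutoff excludes every value")
    return {
        "n_flagged": len(values) - len(kept),
        "mean_all": mean(values),
        "mean_without_flagged": mean(kept),
    }


# Hypothetical weekly activity minutes; 2400 is an implausibly large report.
minutes_per_week = [0.0, 30.0, 60.0, 90.0, 120.0, 150.0, 2400.0]

# Assumed cutoff: 1680 minutes/week, i.e. 4 hours of activity every day.
impact = outlier_impact(minutes_per_week, cutoff=1680.0)
```

A large gap between the two means suggests that a handful of records is driving the estimate, which is worth reporting whether or not those records are ultimately excluded.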
Task 5: Format & Label Variables
Formats and labels are user-defined tools that provide a convenient way to describe variables in your SAS output. Although adding formats or labels to your variables is optional, it is often helpful when reviewing the output from your analyses.
- Key Concepts about Defining Formats and Labeling Variables
- How to Define Formats and Label Variables
Task 6: Save a Dataset
In this task, you will learn how to create a permanent dataset in a SAS library. This will allow you to save the temporary dataset that you have been working with as a permanent file on your computer so you can continue your work at a later time.