Here are the steps to keep selected variables from the NHANES II data files:
As described in the Data Structure and Contents module, all of the NHANES II data files contain demographic, sample design, and weighting variables. It is highly recommended that you take these variables from the Health History Questionnaire files: one file covers ages 6 months to 11 years and the other covers ages 12 to 74 years. These files include all survey participants who completed the household interview.
Because you are interested in only a subset of the variables, you can use the keep= dataset option to select the relevant variables. A data step produces no printed output, so you will need to check the SAS log to make sure that each step completed successfully. Additionally, you can use SAS Explorer to confirm that the new datasets (Lab, MDExam, Youth, Adult, Anthro, and Suppl) are in your WORK library. Note that the SAS dataset names used here are those created in the "Download Data Files" module.
Statements | Explanation |
---|---|
libname NH2 "C:\NHANES II\DATA" ; | Use the libname statement to refer to the data folder. |
data lab; | Use the data step to create a dataset for your laboratory data (lab). |
set NH2.lab (keep=SEQN N2LB0421 N2LB0426); | Use the set statement to bring in the laboratory file. Use the name you gave the dataset when you created it as a permanent SAS dataset in the previous module, "Download Data Files". Use the keep= option to select the variables of interest. IMPORTANT NOTE: Notice that the keep= list includes a variable named SEQN. SEQN stands for sequence number and should be included whenever datasets are appended. SEQN is a unique identifier for each observation (participant) in NHANES. Every time you extract variables from an NHANES II data file, you should include the SEQN variable in your selection. Failing to do so will lead to problems if you want to sort or merge your data files at a later time. See Keep & Merge Module Task 2 for more information on merging. |
data mdexam; | Use the data step to create a dataset for your examination data (mdexam). |
set NH2.mdexam (keep=SEQN N2PE0411 N2PE0771 N2PE0414 N2PE0774); | Use the set statement to bring in the examination file. Use the keep= option to select the variables of interest. |
data adult; | Use the data step to create a dataset for your adult questionnaire data (adult). |
set NH2.adult (keep=SEQN N2AH0625 N2AH0626 N2AH1059 N2AH1060 N2AH1067 N2AH1068 N2AH1069 N2AH0491 N2AH0495 N2AH1089 N2AH0062 N2AH0064 N2AH0047 N2AH0045 N2AH0055 N2AH0056 N2AH0326 N2AH0260 N2AH0324 N2AH0282); | Use the set statement to bring in the adult questionnaire file. Use the keep= option to select the variables of interest. |
data youth; | Use the data step to create a dataset for your youth questionnaire data (youth). |
set NH2.youth (keep=SEQN N2CH0062 N2CH0064 N2CH0047 N2CH0045 N2CH0055 N2CH0056 N2CH0326 N2CH0324 N2CH0282); | Use the set statement to bring in the youth questionnaire file. Use the keep= option to select the variables of interest. |
data anthro; | Use the data step to create a dataset for your anthropometric data (anthro). |
set NH2.anthro (keep=SEQN N2BM0412 N2BM0418); | Use the set statement to bring in the anthropometric file. Use the keep= option to select the variables of interest. |
data suppl; | Use the data step to create a dataset for your supplemental health questionnaire data (suppl). |
set NH2.suppl (keep=SEQN N2SH0785); | Use the set statement to bring in the supplemental health questionnaire file. Use the keep= option to select the variables of interest. |
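
Put together, the statements in the table form one short data step per file. The sketch below shows two of the six steps (laboratory and supplemental) as they might appear in a program. It assumes the permanent datasets are stored in the folder named in the libname statement, and it adds a run statement to close each step, which the table does not show.

```sas
/* Point the NH2 libref at the folder holding the permanent datasets.
   The path comes from the table above; adjust it to your own location. */
libname NH2 "C:\NHANES II\DATA";

/* Create WORK.lab containing only SEQN and the two laboratory variables. */
data lab;
    set NH2.lab (keep=SEQN N2LB0421 N2LB0426);
run;  /* run is an addition here; it ends the step explicitly */

/* Create WORK.suppl containing only SEQN and the one supplemental variable. */
data suppl;
    set NH2.suppl (keep=SEQN N2SH0785);
run;
```

The remaining four datasets (mdexam, adult, youth, anthro) follow the same pattern with their own variable lists. Placing keep= on the set statement means only the listed variables are read from the source file.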
After keeping the variables, it is recommended that you check the contents again to make sure that each new dataset contains the correct variables and the expected total number of variables.
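A minimal sketch of such a check, assuming the lab and mdexam datasets were created in the WORK library as described above:

```sas
/* List the variables in the new laboratory dataset.
   Three variables are expected: SEQN, N2LB0421, and N2LB0426. */
proc contents data=lab;
run;

/* Repeat for the examination dataset, which should hold five variables. */
proc contents data=mdexam;
run;
```

The same call can be repeated for adult, youth, anthro, and suppl, comparing each variable count with the length of the corresponding keep= list.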
Highlighted results of the proc contents procedure on the new dataset are: