How to Define Formats and Label Variables
Name, Define, and Apply Custom Formats
To create custom formats for your dataset, use the FORMAT procedure (PROC FORMAT). In a VALUE statement, you first assign a name to the format and then pair each value with the descriptive text that should represent it. Note that each text label must be enclosed in matching quotation marks, either single or double, in order to be applied properly.
The sample code below, which comes from the “PAQMSTR” program, shows how to name and define a custom format. The data step first creates the variable ADHERENCE from PAG_MINW, and PROC FORMAT then defines the format ADHERENCEF. (Note that you can assign any name you choose, so long as it meets the SAS specifications for a valid format name. See a SAS manual for more information.) This format defines values 1 through 3, with each value representing the level of adherence to the 2008 Physical Activity Guidelines for Americans.
Sample Code
data paq;
set paq;
/* Missing PAG_MINW fails every test, so ADHERENCE stays missing */
if (. < PAG_MINW < 150) then ADHERENCE = 1;
else if (150 <= PAG_MINW < 300) then ADHERENCE = 2;
else if (PAG_MINW >= 300) then ADHERENCE = 3;
run;
/* Name the format, then map each coded value to its display text */
proc format;
value ADHERENCEF
1 = "Below"
2 = "Meets"
3 = "Exceeds" ;
run;
After you have named and defined a format, apply it to selected variables using the FORMAT statement in the data step of your code. Applying a format to a variable allows you to determine how the values will look in the output (e.g., Adherence group 1 will be represented by the text “Below”). When assigning formats to variables, note that format names always come directly after variable names and MUST end with a period, as shown in the sample code below.
Sample Code
data paq;
set paq;
/* Variable name first, then the format name ending in a period */
format ADHERENCE ADHERENCEF. ;
run;
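A format can also be attached inside a procedure, in which case it affects only that procedure's output. The sketch below illustrates this with PROC FREQ as a quick check that each code displays the intended text. It is not part of the “PAQMSTR” program: the dataset name demo and the PAG_MINW values are invented for illustration, and the format definition is repeated so that the example runs on its own.

```sas
/* Hypothetical toy data: these PAG_MINW values are made up */
data demo;
  input PAG_MINW;
  if (. < PAG_MINW < 150) then ADHERENCE = 1;
  else if (150 <= PAG_MINW < 300) then ADHERENCE = 2;
  else if (PAG_MINW >= 300) then ADHERENCE = 3;
  datalines;
0
120
150
299
300
.
;
run;

/* Same format definition as in the sample code above */
proc format;
  value ADHERENCEF
    1 = "Below"
    2 = "Meets"
    3 = "Exceeds";
run;

/* The FORMAT statement here applies only to this PROC FREQ output */
proc freq data=demo;
  tables ADHERENCE / missing;
  format ADHERENCE ADHERENCEF.;
run;
```

The MISSING option keeps the observation with a missing PAG_MINW in the table, which makes it easy to see that it was not assigned to any adherence group.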
Apply Labels to Variables
Variables are given a text description using a LABEL statement. One way to do this is by using a SAS data step, as shown below in the sample code from the “PAQMSTR” program. User-defined labels must always be enclosed in matching single or double quotation marks.
Sample Code
data paq;
set paq;
/* Attach a descriptive label to the ADHERENCE variable */
label ADHERENCE = "Level of adherence to 2008 Physical Activity Guidelines for Americans" ;
run;
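A data step rewrites the entire dataset in order to attach a label. As an alternative sketch, again not taken from the “PAQMSTR” program, PROC DATASETS can change the label in place, and PROC CONTENTS can then confirm that it was stored. The dataset paq_demo and its single observation are hypothetical stand-ins for your own data.

```sas
/* Hypothetical one-row dataset used only to demonstrate labeling */
data paq_demo;
  ADHERENCE = 2;
run;

/* MODIFY updates variable attributes without rereading the data */
proc datasets library=work nolist;
  modify paq_demo;
  label ADHERENCE = "Level of adherence to 2008 Physical Activity Guidelines for Americans";
quit;

/* The variable listing in the output includes a Label column */
proc contents data=paq_demo;
run;
```

This approach may be preferable for large datasets, since only the descriptor information changes and the observations themselves are not copied.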