A SAS Program for the WHO Growth Charts (ages 0 to <2 years)
The purpose of this SAS program is to calculate percentiles and z-scores (standard deviations) for BMI, weight, height, skinfold thicknesses (triceps and subscapular), arm circumference, and head circumference, based on the WHO Growth Charts, for a child’s sex and age from birth up to 2 years of age. Weight-for-height z-scores and percentiles are also calculated. Observations that contain extreme values (absolute z-scores above 5 or 6) are flagged as being biologically implausible. Although WHO provides several macros and a PC program for these calculations, this SAS program follows the same steps as the SAS program for the CDC growth charts. Additional details about the ages for which the various z-scores and percentiles are calculated are given in Table 2 (below).
The SAS program, WHO-source-code.sas (files are below, in step #1), calculates these z-scores and percentiles based on reference values in WHOref_d.sas7bdat. This reference data set combines values from several WHO datasets described in http://www.who.int/childgrowth/software/readme_sas.pdf [PDF-152KB]. If you’re not using SAS, you can download WHOref_d.csv [CSV-160KB] and create a program based on who-source-code.sas [SAS-6KB] to do the necessary calculations.
Instructions for SAS users
Step 1: Download the SAS program (who-source-code.sas [SAS-6KB]) and the reference data file (WHOref_d.sas7bdat). Do not alter these files, but move them to a directory (folder) that SAS can access. If you are using Chrome or Firefox, right click to save the who-source-code.sas file.
For the following example, the files have been saved in c:\sas\growth charts\who\data.
Step 2: Create a libname statement in your SAS program to point at the folder location of ‘WHOref_d.sas7bdat’. An example would be:
libname refdir 'c:\sas\growth charts\who\data';
Note the SAS code expects this name to be refdir; do not change this name.
Step 3: Set your existing dataset containing height, weight, sex, age and other variables into a temporary dataset, named mydata. Variables in your dataset should be renamed and coded as follows:
Table 1
Variable | Description |
---|---|
agedays | Child’s age in days; must be present. If this value is not an integer, the program rounds to the nearest whole number. If age is known only to the completed number of weeks (e.g., 5 weeks of age would represent any number of days between 35 and 41), multiply by 7 and consider adding 4 (median number of days in a week). If age is known only to the completed number of months, multiply by 365.25/12, and consider adding 15. |
sex | Coded as 1 for boys and 2 for girls. |
height | Recumbent length in cm. If standing height (rather than recumbent length) was recorded, add 0.7 cm to the values (see http://www.ncbi.nlm.nih.gov/pubmed/?term=16817681, and http://www.who.int/nutrition/media_page/tr_summary_english.pdf [PDF-89KB]). |
weight | Weight (kg) |
bmi | BMI (weight (kg) / height (m)²). If your data doesn’t contain BMI, the program calculates it. If BMI is present in your data, the program will not overwrite it. |
headcir | Head circumference (cm) |
armcir | Arm circumference (cm) |
tsf | Triceps skinfold thickness (mm) |
ssf | Subscapular skinfold thickness (mm) |
Z-scores and percentiles for variables that are not in mydata will be coded as missing (.) in the output dataset (named _whodata). Sex (coded as 1 for boys and 2 for girls) and agedays must be in mydata. It’s unlikely that the SAS code will overwrite other variables in your dataset, but you should avoid having variable names that begin with an underscore, such as _bmi.
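As an illustration of Table 1, the DATA step below sketches how a source dataset might be renamed and recoded into mydata. The input dataset name (survey), its variable names (age_months, gender, ht_cm, wt_kg, standing), and their codings are hypothetical; substitute the names and codes used in your own data.

```sas
/* Sketch only: 'survey' and its variable names and codes are hypothetical. */
data mydata;
  set survey;

  /* Age known only in completed months: convert to days and add 15 */
  agedays = round(age_months * 365.25 / 12 + 15);

  /* Recode sex to 1 = boys, 2 = girls (assumes gender is 'M' or 'F') */
  if gender = 'M' then sex = 1;
  else if gender = 'F' then sex = 2;

  /* Recumbent length in cm; add 0.7 cm if standing height was recorded */
  /* (assumes standing = 1 marks a standing measurement) */
  if standing = 1 then height = ht_cm + 0.7;
  else height = ht_cm;

  /* Weight in kg */
  weight = wt_kg;
run;
```

Because bmi is not created here, the program would calculate it from weight and height, as described in Table 1.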
Step 4: Copy and paste the following line into your SAS program after the line (or lines) in step #3.
%include 'c:\sas\growth charts\who\data\WHO-source-code.sas'; run;
If necessary, change this statement to point at the folder containing the downloaded ‘WHO-source-code.sas’ file. This tells your SAS program to run the statements in ‘WHO-source-code.sas’.
Step 5: Submit the %include statement. This will create a dataset, named _whodata, which contains all of your original variables along with z-scores, percentiles, and flags for extreme values. The names and descriptions of these new variables in _whodata are in Table 2.
Table 2: Z-Scores, percentiles, and extreme values (biologically implausible, BIV) in output dataset, _whodata
Description | Percentile | Z-score | Flag for extreme values | Cutoff for low z-score (flag coded as -1) | Cutoff for high z-score (flag coded as +1) |
---|---|---|---|---|---|
Weight-for-age for children between 1 and 731 (inclusive) days of age | wapct | waz | _bivwt | < -6 | > 5 |
Height-for-age for children between 1 and 731 days of age | hapct | haz | _bivht | < -6 | > 6 |
Weight-for-height for children with heights between 45 and 110 cm | whpct | whz | _bivwh | < -5 | > 5 |
BMI-for-age for children between 1 and 731 days of age. Note that for children under 2 y of age, weight-for-height, not BMI-for-age, is recommended. | bmipct | bmiz | _bivbmi | < -5 | > 5 |
Head circumference-for-age for children between 0 and 731 days of age | headcpct | headcz | _bivhc | < -5 | > 5 |
Arm circumference-for-age for children between 91 and 731 days of age | armcpct | armcz | _bivac | < -5 | > 5 |
Subscapular skinfold thickness-for-age for children between 91 and 731 days of age | ssfpct | ssfz | _bivssf | < -5 | > 5 |
Triceps skinfold thickness-for-age for children between 91 and 731 days of age | tsfpct | tsfz | _bivtsf | < -5 | > 5 |
Step 6: Examine the new dataset, _whodata, with PROC MEANS or some other procedure to verify that the z-scores and other variables have been created. If a variable in Table 1 was not in your original dataset (e.g., arm circumference), the output dataset will indicate that all values for the percentiles and z-scores of this variable are missing. If values for other variables are unexpectedly missing, make sure that you’ve renamed and recoded variables as indicated in Table 1 and that your SAS dataset is named mydata. The program should not modify your original data, but will add new variables to your original dataset.
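One possible way to carry out this check is sketched below. It uses only the output variable names listed in Table 2; the choice of statistics and of which flags to tabulate is an assumption and can be adapted.

```sas
/* Count non-missing and missing z-scores and look at their ranges */
proc means data=_whodata n nmiss min mean max;
  var waz haz whz bmiz headcz armcz ssfz tsfz;
run;

/* Tabulate some of the BIV flags, including missing values */
proc freq data=_whodata;
  tables _bivwt _bivht _bivwh _bivbmi / missing;
run;
```

A z-score variable whose N is 0 usually means the corresponding measurement was absent from mydata or was not named as in Table 1.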
Example SAS code corresponding to steps 2 to 6. You can simply cut and paste these lines into a SAS program, but you’ll need to change the libname and %include statements to point at the folders containing the downloaded files.
libname refdir 'c:\sas\growth charts\who\data';
data mydata; set whatever-your-original-dataset-is-named; run;
%include 'c:\sas\growth charts\who\data\WHO-source-code.sas';
proc means data=_whodata; run;
Additional Information
Z-scores are calculated as
Z = [ ((value / M)**L) - 1] / (S * L) ,
in which ‘value’ is the child’s BMI, weight, height, etc. The L, M, and S values are in WHOref_d.sas7bdat and vary according to the child’s sex and age or according to the child’s sex and height. Percentiles are then calculated from the z-scores (for example, a z-score of 1.96 corresponds to the 97.5th percentile). For more information on the LMS method, see http://www.ncbi.nlm.nih.gov/pmc/articles/PMC27365/
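The formula can be tried on a single observation with a short DATA step. This is a minimal sketch, not the code in WHO-source-code.sas: the dataset name (lms_example) and the value, L, M, and S numbers are made up for illustration, and any further adjustments the actual program may apply are not reproduced.

```sas
/* Sketch only: the L, M, and S values below are made up for illustration; */
/* real values come from WHOref_d.sas7bdat.                                 */
data lms_example;
  value = 10.2;   /* hypothetical weight in kg */
  L = -0.2;       /* hypothetical L value */
  M = 9.6;        /* hypothetical M value */
  S = 0.11;       /* hypothetical S value */

  /* z-score from the LMS formula given in the text */
  z = (((value / M)**L) - 1) / (S * L);

  /* percentile from the standard normal distribution function */
  pct = 100 * probnorm(z);
run;

proc print data=lms_example; run;
```

PROBNORM returns the standard normal cumulative probability, which is why a z-score of 1.96 maps to roughly the 97.5th percentile.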
Extreme or Biologically Implausible Values
The SAS code also flags extreme values (biologically implausible values, or BIV) according to the WHO criteria at http://www.who.int/childgrowth/software/readme_sas.pdf [PDF-152KB]. Each variable has a BIV flag that is coded as -1 (an extremely low z-score), +1 (an extremely high z-score), or 0 (the z-score is between the low and high cut-points). These BIV flags, along with other variables that are in the output dataset, _whodata, are shown in Table 2.
The z-scores in the output data set, _whodata, can also be used to construct other cut-points for extreme (or biologically implausible) values. For example, if the distribution of weight in your data is strongly skewed to the right, you might use bmiz > 7 (rather than bmiz > 5) as the cut-point for extremely high BMI-for-age. This could be recoded as:
if -5 <= bmiz <= 7 then _bivbmi=0; *plausible;
else if bmiz > 7 then _bivbmi=1; *high BIV;
else if . < bmiz < -5 then _bivbmi= -1; *low BIV;
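Statements like these have to run inside a DATA step. The sketch below shows one way to do that while leaving the original _bivbmi flag untouched; the dataset name (whodata2) and the new flag name (bivbmi_alt) are hypothetical.

```sas
/* Sketch only: 'whodata2' and 'bivbmi_alt' are hypothetical names. */
data whodata2;
  set _whodata;

  if missing(bmiz) then bivbmi_alt = .;     /* no z-score, no flag */
  else if bmiz < -5 then bivbmi_alt = -1;   /* low BIV */
  else if bmiz > 7 then bivbmi_alt = 1;     /* high BIV, alternative cut-point */
  else bivbmi_alt = 0;                      /* plausible */
run;
```

Writing the result to a new variable keeps both the WHO-based flag and the alternative flag available for comparison.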
There are also two overall indicators of extreme values in the output dataset: _bivlow and _bivhigh. These variables indicate whether any measurement is extremely high (_bivhigh=1) or extremely low (_bivlow=1). If a child does not have an extreme value for any measurement, both variables are coded as 0.
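As a sketch of how these two indicators might be used, the code below tabulates them and then keeps only the children with no extreme value. The output dataset name (clean) is hypothetical, and whether to exclude flagged observations at all is an analytic decision that this example does not settle.

```sas
/* Tabulate the overall BIV indicators, alone and cross-classified */
proc freq data=_whodata;
  tables _bivlow _bivhigh _bivlow*_bivhigh / missing;
run;

/* Keep only children with no extreme value; 'clean' is a hypothetical name */
data clean;
  set _whodata;
  if _bivlow = 0 and _bivhigh = 0;
run;
```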
- Page last reviewed: July 21, 2017
- Page last updated: July 21, 2017