In this task, you will use the chi-square test to determine whether age group and osteoporosis treatment status are independent of each other.
The SURVEYFREQ procedure (PROC SURVEYFREQ) is used in SAS to examine the relationship between two categorical variables and to obtain chi-square statistics. Use the STRATA statement to specify the stratum variable (SDMVSTRA) to account for the design effects of stratification. Use the CLUSTER statement to specify the PSU variable (SDMVPSU) to account for the design effects of clustering. Use the WEIGHT statement to specify the sample weight (WTINT2YR) to account for unequal probabilities of selection and non-response. Use the WHERE statement to specify the subpopulation of interest, here respondents aged 20 and over (RIDAGEYR >= 20).
Use the TABLE statement to create a cross-tabulation of the categorical variables age group (AGEGRP) and osteoporosis treatment status (TREATOSTEO). The options that follow the forward slash (/) instruct SAS to output the column percent (COL), row percent (ROW), Wald chi-square (WCHISQ), and Wald log-linear chi-square (WLLCHISQ), and to suppress the standard deviation (NOSTD) and weighted sums (NOWT). The CHISQ option requests the Rao-Scott chi-square and the CHISQ1 option requests the Rao-Scott modified chi-square. Use the FORMAT statement to apply the SAS formats to the table variables.
*-------------------------------------------------------------------------;
* Use the PROC SURVEYFREQ procedure to perform a chi-square test in SAS.  ;
* This test will be used to determine whether age group and treatment for ;
* osteoporosis are independent of each other in respondents aged 20 and   ;
* over.                                                                   ;
*-------------------------------------------------------------------------;
* No DATA= option is given, so SAS analyzes the most recently created     ;
* data set.                                                               ;
proc surveyfreq;
    strata  SDMVSTRA;
    cluster SDMVPSU;
    weight  WTINT2YR;
    where   RIDAGEYR >= 20;
    table   AGEGRP*TREATOSTEO / col row nostd nowt wchisq wllchisq chisq chisq1;
    format  AGEGRP AGEGRP. TREATOSTEO YESNO.;
run;
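The FORMAT statement above relies on two user-defined formats, AGEGRP. and YESNO., whose definitions are not shown. A minimal sketch of how such formats could be defined is given here. The numeric codes (1, 2, 3 for the age groups and 1, 2 for Yes/No) are assumptions made for illustration; only the labels are taken from the output.

```sas
* Hypothetical format definitions. The numeric codes are assumed;    ;
* the labels match those displayed in the output table.              ;
proc format;
    value AGEGRP
        1 = '20-39'
        2 = '40-59'
        3 = '>= 60';
    value YESNO
        1 = 'Yes'
        2 = 'No';
run;
```

A PROC FORMAT step like this has to run before the SURVEYFREQ step in the same SAS session, unless the formats are stored in a permanent format catalog.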
For complex survey data such as NHANES, the Rao-Scott F-adjusted chi-square statistic is recommended because it yields a more conservative interpretation than the Wald chi-square.
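If only the recommended statistic is needed, the TABLE statement can be reduced to the CHISQ option, which produces the Rao-Scott chi-square test together with its F value. The following is a sketch under stated assumptions: the data set name analysis_data is hypothetical, and the formats AGEGRP. and YESNO. are assumed to be defined already.

```sas
* Sketch: request only the Rao-Scott chi-square test.                ;
* analysis_data is a hypothetical data set name.                     ;
proc surveyfreq data=analysis_data;
    strata  SDMVSTRA;                 /* stratification variable      */
    cluster SDMVPSU;                  /* primary sampling unit        */
    weight  WTINT2YR;                 /* interview sample weight      */
    where   RIDAGEYR >= 20;           /* respondents aged 20 and over */
    table   AGEGRP*TREATOSTEO / row col nostd nowt chisq;
    format  AGEGRP AGEGRP. TREATOSTEO YESNO.;
run;
```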
The SURVEYFREQ Procedure

            Data Summary

Number of Strata                  15
Number of Clusters                30
Number of Observations          5041
Sum of Weights             205284669

                Table of AGEGRP by treatOSTEO

                                              Row     Column
AGEGRP   treatOSTEO   Frequency   Percent   Percent   Percent
--------------------------------------------------------------
20-39    Yes                  2    0.0924    0.2375    2.2097
         No                1738   38.8105   99.7625   40.5042
         Total             1740   38.9029   100.000
--------------------------------------------------------------
40-59    Yes                 36    1.0062    2.6126   24.0624
         No                1358   37.5077   97.3874   39.1446
         Total             1394   38.5139   100.000
--------------------------------------------------------------
>= 60    Yes                227    3.0831   13.6521   73.7279
         No                1662   19.5001   86.3479   20.3512
         Total             1889   22.5832   100.000
--------------------------------------------------------------
Total    Yes                265    4.1817             100.000
         No                4758   95.8183             100.000
         Total             5023   100.000
--------------------------------------------------------------
                    Frequency Missing = 18

     Rao-Scott Chi-Square Test

Pearson Chi-Square        341.6678
Design Correction           0.6712

Rao-Scott Chi-Square      509.0778
DF                               2
Pr > ChiSq                  <.0001

F Value                   254.5389
Num DF                           2
Den DF                          30
Pr > F                      <.0001

        Sample Size = 5023

 Rao-Scott Modified Chi-Square Test

Pearson Chi-Square        341.6678
Design Correction           1.5353

Rao-Scott Chi-Square      222.5434
DF                               2
Pr > ChiSq                  <.0001

F Value                   111.2717
Num DF                           2
Den DF                          30
Pr > F                      <.0001

        Sample Size = 5023

       Wald Chi-Square Test

Chi-Square                 91.2484

F Value                    45.6242
Num DF                           2
Den DF                          15
Pr > F                      <.0001

Adj F Value                42.5826
Num DF                           2
Den DF                          14
Pr > Adj F                  <.0001

        Sample Size = 5023

  Wald Log-Linear Chi-Square Test

Chi-Square               1216.9520

F Value                   608.4760
Num DF                           2
Den DF                          15
Pr > F                      <.0001

Adj F Value               567.9109
Num DF                           2
Den DF                          14
Pr > Adj F                  <.0001

        Sample Size = 5023
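
The statistics reported above are related by simple arithmetic, which can be checked by hand. The sketch below assumes the usual definitions: the Rao-Scott chi-square is the Pearson chi-square divided by the design correction, each F value is the chi-square divided by its degrees of freedom, and the adjusted Wald F uses s = (number of clusters) - (number of strata) = 30 - 15 = 15 and k = (R-1)(C-1) = 2. The symbols Q_P, D, Q_RS, Q_W, s, and k are notation introduced here for illustration.

```latex
\begin{align*}
Q_{RS} &= \frac{Q_P}{D} = \frac{341.6678}{0.6712} \approx 509.04
  && \text{(reported: 509.0778; $D$ is displayed rounded)} \\
F_{RS} &= \frac{Q_{RS}}{k} = \frac{509.0778}{2} = 254.5389 \\
F_{W} &= \frac{Q_W}{k} = \frac{91.2484}{2} = 45.6242 \\
F_{W,\mathrm{adj}} &= \frac{s - k + 1}{k\,s}\, Q_W
  = \frac{14}{30} \times 91.2484 \approx 42.5826
\end{align*}
```

The same relationships reproduce the other two tests: 341.6678 / 1.5353 is approximately 222.54 for the Rao-Scott modified chi-square, and (14/30) x 1216.9520 is approximately 567.9109 for the adjusted Wald log-linear F. The denominator degrees of freedom follow the same pattern: 15 for the Wald F (s), 14 for the adjusted Wald F (s - k + 1), and 30 for the Rao-Scott F (k x s).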
Highlights from the output include: