On Fri, Aug 19, 2011 at 7:25 AM, Simon Kiss <sjkiss at gmail.com>
wrote:> Dear list colleagues,
> I'm trying to come up with a test question for undergraduates to
illustrate comparison of means from a complex survey design. The data for the
example looks roughly like this:
>
> mytest<-data.frame(harper=rnorm(500, mean=60, sd=1),
party=sample(c("BQ", "NDP", "Conservative",
"Liberal", "None", NA), size=500, replace=TRUE),
natwgt=sample(c(0.88, 0.99, 1.43, 1.22, 1.1), size=500, replace=TRUE),
gender=sample(c("Male", "Female"), size=500, replace=TRUE))
>
> Using svyby I can get the means for each group of interest (primarily the
party variable), but I can't get further to actually do the comparison of
means. ?I saw a reference on the help listserv to the effect that the survey
package does not do ttests and that one should use svyglm. ?However, that was in
2009 and I see that there's a command, svytteset in the package which seems
to be on point. ?However, when I've tried that command I can't get it to
work: it returns the following error message:
>
> t = NaN, df = 3255, p-value = NA
> alternative hypothesis: true difference in mean is not equal to 0
> sample estimates:
> difference in mean
> ? ? ? ? ?38.80387
It would have been helpful to give your code.
mytest<-data.frame(harper=rnorm(500, mean=60, sd=1),
party=sample(c("BQ", "NDP", "Conservative",
"Liberal", "None", NA),
size=500, replace=TRUE), natwgt=sample(c(0.88, 0.99, 1.43, 1.22, 1.1),
size=500, replace=TRUE), gender=sample(c("Male", "Female"),
size=500,
replace=TRUE))
des<-svydesign(id=~1, weight=~natwgt,data=mytest)
> svyttest(harper~gender, design=des)
Design-based t-test
data: harper ~ gender
t = 0.7678, df = 498, p-value = 0.443
alternative hypothesis: true difference in mean is not equal to 0
sample estimates:
difference in mean
0.06621512
You can compare one party to all others:
> svyttest(harper~I(party=="Liberal"), design=des)
Design-based t-test
data: harper ~ I(party == "Liberal")
t = -1.3194, df = 498, p-value = 0.1876
alternative hypothesis: true difference in mean is not equal to 0
sample estimates:
difference in mean
-0.1432304
Or subset down to just two parties to compare:
> svyttest(harper~I(party=="Liberal"), design=subset(des,party %in%
c("Liberal","Conservative")))
Design-based t-test
data: harper ~ I(party == "Liberal")
t = -0.5622, df = 181, p-value = 0.5747
alternative hypothesis: true difference in mean is not equal to 0
sample estimates:
difference in mean
-0.08031931
Or convert the design to use replicate weights, so you then can
compute contrasts after svyby()
> rdes<-as.svrepdesign(des)
> support<-svyby(~harper,~party,svymean,design=rdes,covmat=TRUE)
> support
party harper se
BQ BQ 59.89231 0.10544441
Conservative Conservative 59.89364 0.10824393
Liberal Liberal 59.81332 0.09534826
NDP NDP 59.98576 0.11232908
None None 60.07644 0.10327210> svycontrast(support, quote(BQ-Conservative))
nlcon SE
contrast -0.0013296 0.1511> svycontrast(support, quote(Conservative-Liberal))
nlcon SE
contrast 0.080319 0.1442
Though if I were doing this analysis I'd fit a linear regression model.
-thomas
--
Thomas Lumley
Professor of Biostatistics
University of Auckland