thr3ads.net - R help - [R] Comparison of means in survey package [Aug 2011]

If this information is useful, please help other people find it:
Share via:

Simon Kiss

2011-Aug-18 19:25 UTC

[R] Comparison of means in survey package

Dear list colleagues,
I'm trying to come up with a test question for undergraduates to illustrate
comparison of means from a complex survey design. The data for the example looks
roughly like this:

mytest<-data.frame(harper=rnorm(500, mean=60, sd=1),
party=sample(c("BQ", "NDP", "Conservative",
"Liberal", "None", NA), size=500, replace=TRUE),
natwgt=sample(c(0.88, 0.99, 1.43, 1.22, 1.1), size=500, replace=TRUE),
gender=sample(c("Male", "Female"), size=500, replace=TRUE))

Using svyby I can get the means for each group of interest (primarily the party
variable), but I can't get further to actually do the comparison of means. 
I saw a reference on the help listserv to the effect that the survey package
does not do ttests and that one should use svyglm.  However, that was in 2009
and I see that there's a command, svytteset in the package which seems to be
on point.  However, when I've tried that command I can't get it to work:
it returns the following error message:

t = NaN, df = 3255, p-value = NA 
alternative hypothesis: true difference in mean is not equal to 0 
sample estimates:
difference in mean 
          38.80387 

This is from my data, not the code above.  Would there also be a way just to do
the comparison of means test between two subgroups of a factor, and not just on
all factor levels?

Using 2.13 on mac os 10.6 and the latest version of survey package.

Yours, Simon Kiss


*********************************
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 905 746 7606

Thomas Lumley

2011-Aug-18 21:19 UTC

head link

[R] Comparison of means in survey package

On Fri, Aug 19, 2011 at 7:25 AM, Simon Kiss <sjkiss at gmail.com>
wrote:> Dear list colleagues,
> I'm trying to come up with a test question for undergraduates to
illustrate comparison of means from a complex survey design. The data for the
example looks roughly like this:
>
> mytest<-data.frame(harper=rnorm(500, mean=60, sd=1),
party=sample(c("BQ", "NDP", "Conservative",
"Liberal", "None", NA), size=500, replace=TRUE),
natwgt=sample(c(0.88, 0.99, 1.43, 1.22, 1.1), size=500, replace=TRUE),
gender=sample(c("Male", "Female"), size=500, replace=TRUE))
>
> Using svyby I can get the means for each group of interest (primarily the
party variable), but I can't get further to actually do the comparison of
means. ?I saw a reference on the help listserv to the effect that the survey
package does not do ttests and that one should use svyglm. ?However, that was in
2009 and I see that there's a command, svytteset in the package which seems
to be on point. ?However, when I've tried that command I can't get it to
work: it returns the following error message:
>
> t = NaN, df = 3255, p-value = NA
> alternative hypothesis: true difference in mean is not equal to 0
> sample estimates:
> difference in mean
> ? ? ? ? ?38.80387
It would have been helpful to give your code.

mytest<-data.frame(harper=rnorm(500, mean=60, sd=1),
party=sample(c("BQ", "NDP", "Conservative",
"Liberal", "None", NA),
size=500, replace=TRUE), natwgt=sample(c(0.88, 0.99, 1.43, 1.22, 1.1),
size=500, replace=TRUE), gender=sample(c("Male", "Female"),
size=500,
replace=TRUE))

des<-svydesign(id=~1, weight=~natwgt,data=mytest)
> svyttest(harper~gender, design=des)
	Design-based t-test

data:  harper ~ gender
t = 0.7678, df = 498, p-value = 0.443
alternative hypothesis: true difference in mean is not equal to 0
sample estimates:
difference in mean
        0.06621512

You can compare one party to all others:
> svyttest(harper~I(party=="Liberal"), design=des)
	Design-based t-test

data:  harper ~ I(party == "Liberal")
t = -1.3194, df = 498, p-value = 0.1876
alternative hypothesis: true difference in mean is not equal to 0
sample estimates:
difference in mean
        -0.1432304

Or subset down to just two parties to compare:
> svyttest(harper~I(party=="Liberal"), design=subset(des,party %in%
c("Liberal","Conservative")))

	Design-based t-test

data:  harper ~ I(party == "Liberal")
t = -0.5622, df = 181, p-value = 0.5747
alternative hypothesis: true difference in mean is not equal to 0
sample estimates:
difference in mean
       -0.08031931

Or convert the design to use replicate weights, so you then can
compute contrasts after svyby()
> rdes<-as.svrepdesign(des)
> support<-svyby(~harper,~party,svymean,design=rdes,covmat=TRUE)
> support                    party   harper         se
BQ                     BQ 59.89231 0.10544441
Conservative Conservative 59.89364 0.10824393
Liberal           Liberal 59.81332 0.09534826
NDP                   NDP 59.98576 0.11232908
None                 None 60.07644 0.10327210> svycontrast(support, quote(BQ-Conservative))              nlcon     SE
contrast -0.0013296 0.1511> svycontrast(support, quote(Conservative-Liberal))            nlcon     SE
contrast 0.080319 0.1442

Though if I were doing this analysis I'd fit a linear regression model.

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

Seemingly Similar Threads

Search for more maybe matching threads

R help - Aug 2011 - Comparison of means in survey package

[R] Comparison of means in survey package

[R] Comparison of means in survey package

Seemingly Similar Threads