Simon Kiss
2011-Mar-30 15:01 UTC
[R] sampling design runs with no errors but returns empty data set
Dear colleagues,

I'm working with the 2008 Canada Election Studies (http://www.queensu.ca/cora/_files/_CES/CES2008.sav.zip), trying to construct a weighted national sample using the survey package. Three weights are included in the national survey (a household weight, a provincial weight and a national weight which is a product of the first two).

In the following code I removed variables with missing national weights and tried to construct the sample from advice I've gleaned from the documentation for the survey package and other help requests. There are no errors, but the data frame (weight_test) contains no

What am I missing?

Yours, Simon Kiss

P.S. The code is only reproducible if the data set is downloadable. I'm not sure

    ces<-read.spss(file.choose(), to.data.frame=TRUE, use.value.labels=FALSE)
    missing_data<-subset(ces1, !is.na(ces08_NATWGT))
    weight_test<-svydesign(id=~0, weights=~ces08_NATWGT, data=missing_data)

Note: this is some reproducible code that creates a data set that is a very stripped down version of what I'm working with, but with this, the svydesign function appears to work properly.

    mydat<-data.frame(ces08_HHWGT=runif(3000, 0.5, 5),
                      ces08_PROVWGT=runif(3000, 0.6, 1.2),
                      party=sample(c("NDP", "BQ", "Lib", "Con"), 3000, replace=TRUE),
                      age=sample(seq(18, 72,1), 3000, replace=TRUE),
                      income=sample(seq(21,121,1), 3000, replace=TRUE))
    mydat$ces08_NATWGT<-mydat$ces08_HHWGT*mydat$ces08_PROVWGT
    weight_test<-svydesign(id=~1, weights=~ces08_NATWGT, data=mydat)

*********************************
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 519 761 7606
Thomas Lumley
2011-Mar-30 20:01 UTC
[R] sampling design runs with no errors but returns empty data set
On Thu, Mar 31, 2011 at 4:01 AM, Simon Kiss <sjkiss at gmail.com> wrote:
> Dear colleagues,
> I'm working with the 2008 Canada Election Studies (http://www.queensu.ca/cora/_files/_CES/CES2008.sav.zip), trying to construct a weighted national sample using the survey package.
> Three weights are included in the national survey (a household weight, a provincial weight and a national weight which is a product of the first two).
> In the following code I removed variables with missing national weights and tried to construct the sample from advice I've gleaned from the documentation for the survey package and other help requests.
> There are no errors, but the data frame (weight_test) contains no
> What am I missing?
> Yours, Simon Kiss
> P.S. The code is only reproducible if the data set is downloadable. I'm not sure
>
> ces<-read.spss(file.choose(), to.data.frame=TRUE, use.value.labels=FALSE)
> missing_data<-subset(ces1, !is.na(ces08_NATWGT))
> weight_test<-svydesign(id=~0, weights=~ces08_NATWGT, data=missing_data)
>

The code isn't reproducible even with the data. The code refers to a data frame ces1, which isn't defined, and to a variable ces08_NATWGT that isn't in the data set.

However, a bit of Googling suggests that the variable CES08_NA is probably the one you mean, giving the following code

    library(survey)
    library(foreign)
    ces<-read.spss("CES2008.sav", to.data.frame=TRUE, use.value.labels=FALSE)
    missing_data<-subset(ces, !is.na(CES08_NA))
    weight_test<-svydesign(id=~0, weights=~CES08_NA, data=missing_data)

which seems to produce a perfectly reasonable survey design object.

    > weight_test
    Independent Sampling design (with replacement)
    svydesign(id = ~0, weights = ~CES08_NA, data = missing_data)
    > dim(weight_test)
    [1] 3257  531
    > svymean(~factor(GENDER),weight_test)
                       mean   SE
    factor(GENDER)1 0.47362 0.01
    factor(GENDER)5 0.52638 0.01

Since you don't say how you concluded the object contained no, I don't know what you were seeing.
Note that weight_test is not supposed to be a data frame. It's a survey design object.

   -thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
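
To illustrate the point that a survey design object is not a data frame, here is a minimal sketch of how one might check the weight variable and then look inside the object. It uses synthetic data modelled on the stripped-down example from the original post rather than the CES file, and the names weight_var, complete_cases and design are hypothetical, chosen only for this illustration.

```r
library(survey)

# Synthetic stand-in for the CES data (an assumption for illustration only).
set.seed(1)
n <- 3000
mydat <- data.frame(
  ces08_HHWGT   = runif(n, 0.5, 5),
  ces08_PROVWGT = runif(n, 0.6, 1.2),
  party         = sample(c("NDP", "BQ", "Lib", "Con"), n, replace = TRUE),
  age           = sample(18:72, n, replace = TRUE)
)
mydat$ces08_NATWGT <- mydat$ces08_HHWGT * mydat$ces08_PROVWGT

# Confirm the weight column really exists under the expected name
# before it is used in subset() or svydesign().
weight_var <- "ces08_NATWGT"
stopifnot(weight_var %in% names(mydat))

# Drop rows with a missing national weight, then build the design.
complete_cases <- subset(mydat, !is.na(ces08_NATWGT))
design <- svydesign(id = ~1, weights = ~ces08_NATWGT, data = complete_cases)

# The result is a survey design object, so data-frame style inspection
# has to go through its accessors.
class(design)               # survey design classes, not "data.frame"
is.data.frame(design)       # FALSE
dim(design)                 # number of observations and variables
head(design$variables)      # the data frame stored inside the design
summary(weights(design))    # the sampling weights attached to the design

# Weighted estimates come from the svy* functions.
svymean(~age, design)
svymean(~factor(party), design)
```

Printing the design shows only a short description of the sampling scheme and the call, which may be why the object looks empty at first glance; the data are held in design$variables.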