I am trying to create a loop that will perform a series of analyses. I am using geeglm from geepack, which fails if there are any null values. Creating a subset solves this, but I do not seem to be able to set the subset dynamically based on a changing variable.

    while (j <= y.num) {

        strSubset = as.character(df.IV$IV[j])   # Gives column name in quotes
        df.data.sub = subset(df.data, strSubset >= 0)

        # subset dataset is not created

        # analyses on subset take place

        j = j + 1
    }

If I type the variable name in the formula it works, so I assume that I am not creating the variable in a manner that allows it to be evaluated in the subset function. Any help would be greatly appreciated!

Michael

--
View this message in context: http://r.789695.n4.nabble.com/Create-subset-using-variable-tp4316812p4316812.html
Sent from the R help mailing list archive at Nabble.com.
On Jan 21, 2012, at 3:18 PM, pansophy wrote:

> I am trying to create a loop that will perform a series of analyses. I am
> using geeglm from geepack, which fails if there are any null values.
> Creating a subset solves this, but do not seem to be able to set the
> subset dynamically based on a changing variable.

This is an exercise in guesswork since you have not provided the data structures that you are accessing.

> while (j <= y.num) {

If you want to avoid NULL values in sequences you can use seq_along().

>     strSubset = as.character(df.IV$IV[j])   #Gives column name in quotes

Is there really a dataframe named 'df.IV' or are you perhaps an expatriate from another programming locale where the "." is an accessor operator?

>     df.data.sub = subset(df.data, strSubset>=0)
>
>     #subset dataset is not created

`subset` uses nonstandard evaluation. It's very handy for interactive work but for programming you cannot use it in the manner you imagine. Try instead:

    df.data.sub = df.data[ df.data[["IV"]] >= 0, ]

Or perhaps:

    df.data.sub = df.data[ df.data[[strSubset]] >= 0, ]

Although I'm not sure what `strSubset` will evaluate to in your situation.

>     # analyses on subset take place
>
>     j = j + 1
> }
>
> If I type the variable name in the formula it works, so I assume that I am
> not creating the variable in a manner that allows it to be evaluated in the
> subset function.

That much is certain.

--
David Winsemius, MD
West Hartford, CT
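A minimal sketch of the bracket-indexing idea from the reply above, run on made-up data. The data frame `df.data`, its columns, and the vector `iv.names` (standing in for `df.IV$IV`) are assumptions for illustration, since the original data structures were never posted. It wraps the comparison in `which()` because a plain logical index containing NA returns all-NA rows rather than dropping them.

```r
# Hypothetical data; the poster's real df.data was not shown.
df.data <- data.frame(
  id   = 1:6,
  age  = c(18, NA, 20, 21, 22, 23),
  dose = c(1.5, 2.0, NA, 0.5, 3.0, 2.5)
)

# Column names to loop over (stands in for df.IV$IV in the question).
iv.names <- c("age", "dose")

n.rows <- integer(0)
for (nm in iv.names) {
  # [[nm]] looks the column up by the string held in nm.
  # which() keeps only rows where the test is TRUE, so NA rows are dropped.
  keep <- which(df.data[[nm]] >= 0)
  df.data.sub <- df.data[keep, ]

  # analyses on df.data.sub would go here
  n.rows[nm] <- nrow(df.data.sub)
}

print(n.rows)
```

Looping over the names directly with `for` also removes the need to initialise and increment `j` by hand.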
pansophy <mjs2134 <at> columbia.edu> writes:

> I am trying to create a loop that will perform a series of analyses. I am
> using geeglm from geepack, which fails if there are any null values.
> Creating a subset solves this, but do not seem to be able to set the subset
> dynamically based on a changing variable.
>
> while (j <= y.num) {
>
>     strSubset = as.character(df.IV$IV[j])   #Gives column name in quotes
>     df.data.sub = subset(df.data, strSubset>=0)
>
>     #subset dataset is not created
>
>     # analyses on subset take place
>
>     j = j + 1
> }
>
> If I type the variable name in the formula it works, so I assume that I am
> not creating the variable in a manner that allows it to be evaluated in the
> subset function. Any help would be greatly appreciated!
>
> Michael

I think you want to use the double bracket with your data frame to access the column of interest. To remove null values you could use the na.omit() function, assuming that what you call null values are represented as NAs:

    while (j <= y.num) {

        strSubset = as.character(df.IV$IV[j])   # Gives column name in quotes
        df.data.sub = df.data[[strSubset]]

        df.data.sub = na.omit(df.data.sub)            # if null values are given as NA
        df.data.sub = df.data.sub[df.data.sub >= 0]   # if null values are < 0

        #subset dataset is not created

        # analyses on subset take place

        j = j + 1
    }

Hope that helps,
Ken
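One thing worth checking with the double-bracket approach is what it returns: `df.data[[strSubset]]` is a plain vector, so the other columns are no longer attached. The sketch below, on made-up data (the data frame and the `model.vars` vector are assumptions, not from the thread), contrasts that with filtering rows so the whole data frame survives, which is probably what a model-fitting call needs.

```r
# Hypothetical data standing in for df.data.
df.data <- data.frame(
  id     = c(1, 1, 2, 2, 3, 3),
  age    = c(18, NA, 20, 21, 22, 23),
  height = c(58, 59, 60, 61, NA, 63)
)

strSubset <- "age"

# Extracting the column gives a plain vector, so the other columns are lost.
age.only <- na.omit(df.data[[strSubset]])

# Filtering rows on the same column keeps the whole data frame.
df.data.sub <- df.data[!is.na(df.data[[strSubset]]), ]

# complete.cases() drops rows with an NA in any of the listed columns.
model.vars <- c("age", "height")
df.data.cc <- df.data[complete.cases(df.data[, model.vars]), ]

str(age.only)
dim(df.data.sub)
dim(df.data.cc)
```

Which of the two row filters is appropriate depends on which variables enter the model; that choice is not settled anywhere in the thread.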
For example:

    # dataset
    age <- 18:29
    height <- 58:69
    df.ex = data.frame(age = age, height = height)
    df.ex[4, 1] <- NA

    # dataset of columns that will be used for analysis
    values <- c("age", "height")
    df.variables = data.frame(values)

    # Age column has a null (NA) value. The row must be removed for the analysis to run

    # explicit creation
    df.ex.sub.explicit <- subset(df.ex, age >= 0)
    dim(df.ex.sub.explicit)   # 11 obs of 2 variables

    # passing the column name in a variable
    i = 1
    strFilter = as.character(df.variables$values[i])
    df.ex.sub.passvar <- subset(df.ex, strFilter >= 0)
    dim(df.ex.sub.passvar)    # 12 obs of 2 variables

--
View this message in context: http://r.789695.n4.nabble.com/Create-subset-using-variable-tp4316812p4317196.html
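A short sketch of why the second call keeps all 12 rows, reusing the example data from the message above. The explanation in the comments is my reading of how `subset()` evaluates its condition, not something stated in the thread, and the string comparison assumes the usual collation order in which digits sort before letters.

```r
# Same example data as in the message above.
df.ex <- data.frame(age = 18:29, height = 58:69)
df.ex[4, "age"] <- NA

strFilter <- "age"

# Inside subset(), strFilter is not a column of df.ex, so R finds the
# character variable instead. The string "age" is compared with 0, which
# coerces 0 to "0"; the result is a single logical value, recycled over
# every row, so no row is ever filtered on the age column itself.
strFilter >= 0

# Look the column up by name instead; which() drops the NA row.
df.ex.sub.passvar <- df.ex[which(df.ex[[strFilter]] >= 0), ]
dim(df.ex.sub.passvar)
```

With the column looked up by name, the row containing the NA age is removed, matching the result of the explicit `subset(df.ex, age >= 0)` call.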
pansophy <mjs2134 <at> columbia.edu> writes:

> I am trying to create a loop that will perform a series of analyses. I am
> using geeglm from geepack, which fails if there are any null values.
> Creating a subset solves this, but do not seem to be able to set the subset
> dynamically based on a changing variable.
>
> while (j <= y.num) {
>
>     strSubset = as.character(df.IV$IV[j])   #Gives column name in quotes
>     df.data.sub = subset(df.data, strSubset>=0)
>
>     #subset dataset is not created
>
>     # analyses on subset take place
>
>     j = j + 1
> }

I've answered this on Stack Overflow. While I don't think there is any formal policy against cross-posting, I (and at least a few other participants in both lists) deprecate cross-posting because it leads to redundant effort.

Ben Bolker