Philip Robinson
2012-Mar-01 00:42 UTC
[R] identifying a column name correctly to use in a formula
Hi,
I have a large matrix (SNPs) that I want to cycle over with logistic
regression with interaction terms. I have made a loop, but I am struggling
to pass the name of each column to the formula in a form the formula can
use. It errors because the name is not evaluated properly.
(Below is a pilot using only columns 7 to 33; my actual data has 200,000 columns.)
My attempts:
for (i in 7:33) {
  label <- colnames(n)[i]
  model1 <- glm(AS~label*interaction,family=binomial("logit"),data=n)
  X <- summary(model1)$coefficients[2,1]
  Y <- c(label,X)
  vector <- rbind(vector,Y)
} #variable lengths differ
Error in model.frame.default(formula = AS ~ label, data = n,
drop.unused.levels = TRUE) :
variable lengths differ (found for 'label')
#This is because it is trying to do logistic regression on a character string
for (i in 7:33) {
  label <- eval(colnames(n)[i])
  model1 <- glm(AS~label*interaction,family=binomial("logit"),data=n)
  X <- summary(model1)$coefficients[2,1]
  Y <- c(label,X)
  vector <- rbind(vector,Y)
} #variable lengths differ
Error in model.frame.default(formula = AS ~ label, data = n,
drop.unused.levels = TRUE) :
variable lengths differ (found for 'label')
#same as above
for (i in 7:33) {
  label <- as.name(colnames(n)[i])
  model1 <- glm(AS~label*interaction,family=binomial("logit"),data=n)
  X <- summary(model1)$coefficients[2,1]
  Y <- c(label,X)
  vector <- rbind(vector,Y)
}
Error in model.frame.default(formula = AS ~ label, data = n,
drop.unused.levels = TRUE) :
invalid type (symbol) for variable 'label'
#not sure what this error is
for (i in 7:33) {
  label <- eval(as.name(colnames(n)[i]))
  model1 <- glm(AS~label*interaction,family=binomial("logit"),data=n)
  X <- summary(model1)$coefficients[2,1]
  Y <- c(label,X)
  vector <- rbind(vector,Y)
}
# Error in eval(expr, envir, enclos) : object 'B1' not found
B1 is the name of the first column. It is a column of n, not an object in
the workspace, and that seems to be why it is causing an error.
for (i in 7:33) {
  label <- as.formula(colnames(n)[i])
  model1 <- glm(AS~label*interaction,family=binomial("logit"),data=n)
  X <- summary(model1)$coefficients[2,1]
  Y <- c(label,X)
  vector <- rbind(vector,Y)
}
Error in eval(expr, envir, enclos) : object 'B1' not found
#same as above
for (i in 7:33) {
  label <- eval(as.formula(colnames(n)[i]))
  model1 <- glm(AS~label*interaction,family=binomial("logit"),data=n)
  X <- summary(model1)$coefficients[2,1]
  Y <- c(label,X)
  vector <- rbind(vector,Y)
}
Error in eval(expr, envir, enclos) : object 'B1' not found
#same as above
Any help would be appreciated.
Thanks
Philip
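
The first two attempts above fail for the same reason, which a small sketch may make clearer: a formula is stored unevaluated, so the name `label` is kept as a name and its value is never substituted. The example below uses only base R; the column names `AS`, `B1` and `interaction` are taken from the post, and no data is needed because the formulas are only inspected, not fitted.

```r
label <- "B1"

# A formula is stored unevaluated: its terms are names, not values
f_literal <- AS ~ label * interaction
print(all.vars(f_literal))   # "AS" "label" "interaction"

# Building the text first puts the column name itself into the formula
f_built <- as.formula(paste("AS ~", label, "* interaction"))
print(all.vars(f_built))     # "AS" "B1" "interaction"
```

In the first formula glm() looks for a variable literally called `label`, finds the length-one character string in the workspace, and reports that variable lengths differ.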
Rui Barradas
2012-Mar-01 03:24 UTC
[R] identifying a column name correctly to use in a formula
Hello,

> I have a large matrix (SNPs) that I want to cycle over with logistic
> regression with interaction terms. I have made a loop but I am struggling
> to identify to the formula the name of the column in a way which is
> meaningful to the formula. It errors because it is not evaluated properly.

You must first write the formula in full, using 'paste'. Try

DF <- data.frame(Resp=rnorm(10), B=rnorm(10), C=rnorm(10), Interaction=rnorm(10))
#DF

for(i in 2:3){
    cname <- colnames(DF)[i]
    #
    # In 3 steps to be more readable
    Regr <- paste(cname, "Interaction", sep="*")
    fmlaText <- paste("Resp", Regr, sep="~")
    # After step 2 it's already printable
    print(fmlaText)
    # Step 3: transform it into a formula object
    fmla <- as.formula(fmlaText)
    model1 <- glm(fmla, data=DF)
    print(summary(model1))
}

Hope this helps,

Rui Barradas
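
As a rough sketch of how the same paste/as.formula idea might carry over to the original logistic-regression loop: the data below are simulated, and the names `DF`, `covar`, `snps` and `estimates` are assumptions made for illustration (`covar` stands in for the poster's `interaction` column). It also stores results in a preallocated vector instead of growing one with rbind(), and picks the coefficient by name rather than by row position.

```r
# Simulated stand-in for the poster's data; all names here are hypothetical
set.seed(1)
n_obs <- 200
DF <- data.frame(
  AS    = rbinom(n_obs, 1, 0.5),   # binary outcome
  covar = rnorm(n_obs),            # the term interacted with each SNP
  B1    = rbinom(n_obs, 2, 0.3),   # SNP genotypes coded 0/1/2
  B2    = rbinom(n_obs, 2, 0.4),
  B3    = rbinom(n_obs, 2, 0.2)
)
snps <- c("B1", "B2", "B3")

# Preallocate the result instead of growing it inside the loop
estimates <- setNames(rep(NA_real_, length(snps)), snps)

for (snp in snps) {
  # Write the formula in full as text, then convert it
  fmla <- as.formula(paste("AS ~", snp, "* covar"))
  fit  <- glm(fmla, family = binomial("logit"), data = DF)
  # Look the SNP main effect up by name rather than by row number
  estimates[snp] <- coef(summary(fit))[snp, "Estimate"]
}

print(estimates)
```

With 200,000 columns, preallocating in this way avoids repeatedly copying the result, though the model fits themselves will still dominate the run time.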
R. Michael Weylandt
2012-Mar-01 04:04 UTC
[R] identifying a column name correctly to use in a formula
Your method of constructing a formula is funny: is there a term called
"interaction" or do you mean an interaction in the statistical sense?
Once you've sorted that out, I'd think the easiest way to proceed is to use
as.formula() to construct your formula programmatically and then to
pass that to glm(). Something like
form <- as.formula(paste("AS ~ ", colnames(n)[i], sep = ""))
glm(form, data = n, family = binomial("logit"))
Michael
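
If the interaction is meant in the statistical sense, one possible variant of the suggestion above is reformulate() from the stats package, which assembles a formula from term labels without pasting the whole string by hand. This is only a sketch: the data frame `n` below is simulated, and `covar` is a hypothetical name for whichever column the SNP is interacted with.

```r
# Simulated data; `covar` is a hypothetical second variable
set.seed(2)
n <- data.frame(
  AS    = rbinom(100, 1, 0.5),
  covar = rnorm(100),
  B1    = rbinom(100, 2, 0.3)
)
snp <- "B1"

# Main effects plus the statistical interaction, spelled out term by term
form <- reformulate(c(snp, "covar", paste(snp, "covar", sep = ":")),
                    response = "AS")
print(form)

fit <- glm(form, family = binomial("logit"), data = n)
print(rownames(coef(summary(fit))))
```

Spelling the terms out as `B1 + covar + B1:covar` fits the same model as `B1 * covar`.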