I have two data sets:

File1.txt:
Name id1 id2 id3 ...
N1 0 1 0 ...
N2 0 1 1 ...
N3 1 1 -1 ...
...

File2.txt:
Group id1 id2 id3 ...
G1 1.22 1.34 2.44 ...
G2 2.33 2.56 2.56 ...
G3 1.56 1.99 1.46 ...
...

I would like to do:

x1<-c(0,1,0,...)
y1<-c(1.22,1.34, 2.44, ...)
z1<-data.frame(x,y)
summary(glm(y1~x1,data=z1)

But when I try the same thing by reading the data sets from the two files:

e <- read.table("file1.txt", header=TRUE,row.names=1)
g <- read.table("file2.txt", header=TRUE,row.names=1)
e1<-exp[1,]
g1<-geno[1,]
d1<-data.frame(g, e)
summary(glm(e1 ~ g1, data=d1))

the error message is:

Error in model.frame(formula, rownames, variables, varnames, extras, extranames, :
        invalid variable type
Execution halted

Thanks in advance,

Ying
You have several inconsistencies in your example, so it will be difficult to figure out what you are trying to accomplish.

Hu, Ying (NIH/NCI) wrote:
> e <- read.table("file1.txt", header=TRUE,row.names=1)
> g <- read.table("file2.txt", header=TRUE,row.names=1)
> e1<-exp[1,]

What's "exp"? Also it's dangerous to use an R function as a variable name. Most of the time R can tell the difference, but in some cases it cannot.

> g1<-geno[1,]

What's "geno"?

> d1<-data.frame(g, e)

d1 is now e and g cbind'ed together?

> summary(glm(e1 ~ g1, data=d1))

Are "e1" and "g1" elements of "d1"? From what you've told us, I don't know where the error is occurring. Also, if you are having errors, you can more easily isolate the problem by doing:

fit <- glm(e1 ~ g1, data = d1)
summary(fit)

This will at least tell you the problem is in your call to "glm" and not "summary.glm".

--sundar

P.S. Please (re-)read the POSTING GUIDE. Most of the time you will figure out problems such as these on your own during the process of creating a reproducible example.
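A reproducible version of the question could look roughly like the sketch below. It assumes only the three rows and three id columns shown in the post (the "..." parts are left out), and it uses inline text as a stand-in for file1.txt and file2.txt. The names file1_text, file2_text and exp_result are made up for illustration.

```r
# Stand-ins for file1.txt and file2.txt: only the rows and columns that
# appear in the post, since the rest of the data was not shown.
file1_text <- "Name id1 id2 id3
N1 0 1 0
N2 0 1 1
N3 1 1 -1"
file2_text <- "Group id1 id2 id3
G1 1.22 1.34 2.44
G2 2.33 2.56 2.56
G3 1.56 1.99 1.46"

# read.table() accepts a 'text' argument, so no files are needed.
e <- read.table(text = file1_text, header = TRUE, row.names = 1)
g <- read.table(text = file2_text, header = TRUE, row.names = 1)

# 'exp' was never assigned in the posted code, so R finds the built-in
# exponential function, which cannot be indexed like a table.
exp_result <- tryCatch(exp[1, ], error = function(err) conditionMessage(err))
print(exp_result)

# The objects that were actually read in are 'e' and 'g'.
str(e)
str(g)
```

Running the posted lines against small, visible data like this shows quickly that "exp" and "geno" do not refer to the tables that were read in.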
Thanks for your help.

# read the two data sets
e <- as.matrix(read.table("file1.txt", header=TRUE,row.names=1))
g <- as.matrix(read.table("file2.txt", header=TRUE,row.names=1))
# solution
d1<-data.frame(g[1,], e[1,])
fit<-glm(e[1,] ~ g[1,], data=d1)
summary(fit)

I am not sure that is the best solution.

Thanks again,

Ying

-----Original Message-----
From: Gavin Simpson [mailto:gavin.simpson at ucl.ac.uk]
Sent: Wednesday, August 17, 2005 7:01 PM
To: Sundar Dorai-Raj
Cc: Hu, Ying (NIH/NCI); r-help at stat.math.ethz.ch
Subject: Re: [R] do glm with two data sets

On Wed, 2005-08-17 at 17:22 -0500, Sundar Dorai-Raj wrote:
> Hu, Ying (NIH/NCI) wrote:
> > summary(glm(e1 ~ g1, data=d1))
> >
> > the error message is
> > Error in model.frame(formula, rownames, variables, varnames, extras,
> > extranames, :
> >         invalid variable type
> > Execution halted

Hi Ying,

That error message is likely caused by having a data.frame on the right hand side (rhs) of the formula.
You can't have a data.frame on the rhs of a formula and g1 is still a data frame even if you only choose the first row, e.g.:

dat <- as.data.frame(matrix(100, 10, 10))
class(dat[1, ])
[1] "data.frame"

You could try:

glm(e1 ~ ., data=g1[1, ])

and see if that works, but as Sundar notes, your post is a little difficult to follow, so this may not do what you were trying to achieve.

HTH

Gav

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson                     [T] +44 (0)20 7679 5522
ENSIS Research Fellow             [F] +44 (0)20 7679 7565
ENSIS Ltd. & ECRC                 [E] gavin.simpsonATNOSPAMucl.ac.uk
UCL Department of Geography       [W] http://www.ucl.ac.uk/~ucfagls/cv/
26 Bedford Way                    [W] http://www.ucl.ac.uk/~ucfagls/
London.  WC1H 0AP.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
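The point about a single row of a data frame can be checked with a small sketch like the one below. It rebuilds the three rows shown in the original post as data frames (the "..." columns are left out) and uses unlist() as one possible way to turn a row into a plain numeric vector before fitting. The names resp and pred are illustrative, and this is not necessarily what the poster ended up using.

```r
# The rows shown in the post, rebuilt as data frames (elided columns omitted).
e <- data.frame(id1 = c(0, 0, 1), id2 = c(1, 1, 1), id3 = c(0, 1, -1),
                row.names = c("N1", "N2", "N3"))
g <- data.frame(id1 = c(1.22, 2.33, 1.56), id2 = c(1.34, 2.56, 1.99),
                id3 = c(2.44, 2.56, 1.46),
                row.names = c("G1", "G2", "G3"))

# Selecting one row of a data frame still gives a data frame.
class(g[1, ])

# unlist() flattens that one-row data frame into a named numeric vector,
# which is a valid variable in a model formula.
resp <- unlist(e[1, ])
pred <- unlist(g[1, ])
class(pred)

fit <- glm(resp ~ pred)
summary(fit)
```

Converting the whole table once with as.matrix(), so that row indexing returns a vector directly, is an alternative with the same effect.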
You are right.

# read the two data sets
e <- as.matrix(read.table("file1.txt", header=TRUE,row.names=1))
g <- as.matrix(read.table("file2.txt", header=TRUE,row.names=1))
# solution 2
summary(glm(e[1,] ~ g[1,]))
summary(glm(e[1,] ~ g[2,]))
...

They work very well. If I put them in a loop, such as

for (i in 1:50){
  for (j in 1:50){
    cat("file1 row:", i, "file2 row:", j, "\n")
    print(summary(glm(e[i,] ~ g[j,])))
  }
}

why do I have to use "print" to print the results? Without "print":

for (i in 1:50){
  for (j in 1:50){
    cat("file1 row:", i, "file2 row:", j, "\n")
    summary(glm(e[i,] ~ g[j,]))
  }
}

the glm results are not shown.

Thanks a lot.

Ying

-----Original Message-----
From: Gavin Simpson [mailto:gavin.simpson at ucl.ac.uk]
Sent: Thursday, August 18, 2005 11:00 AM
To: Hu, Ying (NIH/NCI)
Cc: Sundar Dorai-Raj; r-help at stat.math.ethz.ch
Subject: RE: [R] do glm with two data sets

On Thu, 2005-08-18 at 10:38 -0400, Hu, Ying (NIH/NCI) wrote:
> Thanks for your help.
>
> # read the two data sets
> e <- as.matrix(read.table("file1.txt", header=TRUE,row.names=1))
> g <- as.matrix(read.table("file2.txt", header=TRUE,row.names=1))
> # solution
> d1<-data.frame(g[1,], e[1,])

This is redundant, as:

> fit<-glm(e[1,] ~ g[1,], data=d1)

and:

fit <- glm(e[1, ] ~ g[1, ])

are equivalent - you don't need data = d1 in this case, e.g.:

e <- matrix(c(0, 1, 0, 0, 1, 1, 1, 1, -1), ncol = 3, byrow = TRUE)
e
g <- matrix(c(1.22, 1.34, 2.44, 2.33, 2.56, 2.56, 1.56, 1.99, 1.46),
            ncol = 3, byrow = TRUE)
g
fit <- glm(e[1, ] ~ g[1, ])
fit

works fine.

> summary(fit)
>
> I am not sure that is the best solution.

This seems a strange way of doing this. Why not:

pred <- g[1, ]
resp <- e[1, ]
fit <- glm(resp ~ pred)
fit

and do your subsetting outside the glm call - makes things clearer, no? Unless you plan to do many glm()s, one per row of your two matrices.
If that is the case, then there are better ways of approaching this.

> Thanks again,
>
> Ying

HTH

G
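On the "print" question: R only auto-prints the value of an expression typed at the top level. Inside a for loop (or a function body) the value of summary() is computed and then discarded unless it is printed or stored. The thread does not say which "better ways" were meant for fitting one model per pair of rows, so the following is only a sketch of one possibility: keep the fitted models in a list and print or extract from them afterwards. It reuses the 3 x 3 matrices from the reply above. The row names and the names row_pairs, fits and slopes are assumptions made for this example.

```r
# The example matrices from the reply, with row names added for labelling.
e <- matrix(c(0, 1, 0, 0, 1, 1, 1, 1, -1), ncol = 3, byrow = TRUE,
            dimnames = list(c("N1", "N2", "N3"), NULL))
g <- matrix(c(1.22, 1.34, 2.44, 2.33, 2.56, 2.56, 1.56, 1.99, 1.46),
            ncol = 3, byrow = TRUE,
            dimnames = list(c("G1", "G2", "G3"), NULL))

# All (row of e, row of g) combinations, one per line.
row_pairs <- expand.grid(i = seq_len(nrow(e)), j = seq_len(nrow(g)))

# Fit one model per combination and keep the fitted objects in a list.
fits <- lapply(seq_len(nrow(row_pairs)), function(k) {
  resp <- e[row_pairs$i[k], ]
  pred <- g[row_pairs$j[k], ]
  glm(resp ~ pred)
})
names(fits) <- paste(rownames(e)[row_pairs$i], rownames(g)[row_pairs$j],
                     sep = "~")

# Stored results can be inspected later; here the slope of each fit.
slopes <- sapply(fits, function(fit) coef(fit)[["pred"]])
print(slopes)

# Inside a loop nothing is auto-printed, so print() is still needed
# whenever output should appear on the console.
for (fit_name in names(fits)[1:2]) {
  cat("model:", fit_name, "\n")
  print(summary(fits[[fit_name]]))
}
```

Storing the fits keeps the fitting step separate from the reporting step, which is convenient when only a few quantities (coefficients, p-values) are needed from each of the many models.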