Dear List,

Please bear with a poor newbie, who might be doing everything backwards (I was brought up in pure math).

I want to make a simple multi-linear regression on a set of data. I did some experiments, and if x is a 4 by 2 array and y is a 4 by 1 array, I can do a linear regression by lm(y~x).

Now I have a tab-delimited text file with 10 rows of 300 measurements and another file with 10 rows of one value each. When I read in those files using read.delim(), I get data frames, and apparently I can no longer do the multi-linear regression.

Is there a way to convert the data frames into arrays, or am I going the wrong way about this?

Sincerely
Thomas Poulsen
TAPO (Thomas Agersten Poulsen) wrote:
> Is there a way to convert the data frames into arrays, or am I
> going the wrong way about this?

This is possible, but the behaviour depends on the datatype, e.g. numeric or character. Simply look for ?as.matrix or ?as.array

Thomas P.
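As a minimal sketch of the conversion mentioned above (the data frame `d` and its column names are made up for illustration and merely stand in for what read.delim() might return), as.matrix() gives a numeric matrix when every column is numeric, and a character matrix as soon as one column is character:

```r
# Hypothetical stand-in for a data frame returned by read.delim().
d <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(2.5, 1.0, 4.0, 3.5))

# All columns are numeric, so the result is a numeric matrix.
m <- as.matrix(d)
is.numeric(m)     # TRUE
dim(m)            # 4 2

# Adding one character column turns the whole matrix into character.
d$label <- c("a", "b", "c", "d")
m2 <- as.matrix(d)
is.character(m2)  # TRUE
```

This is why it is worth checking the column types (for example with str(d)) before converting.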
Dear Thomas,

In fact, the more common way to fit a linear regression in R is to use variables in a data frame (or list) along with a model formula specifying the model. All of this is explained in the Introduction to R manual that is distributed with R: see, in particular, Sec. 6.3 on data frames, Sec. 7 on reading data from files, and Sec. 11 on statistical models.

Given two data frames, say d1 and d2, the first containing, e.g., observations on variables x1 and x2 and the second on y, one could do lm(y ~ x1 + x2, data=c(d1, d2)) or lm(y ~ x1 + x2, data=data.frame(d1, d2)).

That said, it's not altogether clear to me what it is that you're trying to do. Are there 10 observations on 300 variables in the first data frame, constituting the predictors, and 10 observations on 1 variable in the second data frame, constituting the response? If so, you have many more predictors than observations, and it's not reasonable to perform a regression. Of course, I may not have this straight.

I hope this helps,
John

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of TAPO
> (Thomas Agersten Poulsen)
> Sent: Friday, May 28, 2004 2:11 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Converting data frame to array?
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
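The data-frame approach described in this reply can be sketched roughly as follows. The simulated values, the seed, and the use of `y ~ .` are illustrative assumptions, not part of the original message; only the names d1, d2, x1, x2 and y come from the reply.

```r
set.seed(1)

# Hypothetical data: d1 holds the predictors, d2 holds the response.
d1 <- data.frame(x1 = rnorm(10), x2 = rnorm(10))
d2 <- data.frame(y = 1 + 2 * d1$x1 - d1$x2 + rnorm(10, sd = 0.1))

# Combine column-wise so that y, x1 and x2 live in one data frame.
d <- data.frame(d1, d2)

# The formula names the variables; `data` says where to find them.
fit <- lm(y ~ x1 + x2, data = d)
coef(fit)

# With many predictor columns, `y ~ .` means "y on every other column".
fit_all <- lm(y ~ ., data = d)
```

Here fit and fit_all describe the same model, since d contains no columns other than y, x1 and x2.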
Dear John,

Thank you for your helpful answer. I was obviously being stupid, as I have, as you point out, more predictors than observations.

What I was hoping to get was some sort of an "explaining linear combination" of my predictors: which predictors are important for the results I see (if any) and which are irrelevant.

Any hints on how to achieve that?

Cheers
Thomas

-----Original Message-----
From: John Fox [mailto:jfox at mcmaster.ca]
Sent: 29 May 2004 01:24
To: TAPO (Thomas Agersten Poulsen)
Cc: r-help at stat.math.ethz.ch
Subject: RE: [R] Converting data frame to array?
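The difficulty of having more predictors than observations can be seen in a small sketch. The data below are random numbers with the shape described in the thread (10 rows, 300 predictors); the seed and the variable names are assumptions made for illustration, and this does not answer the question of which predictors matter.

```r
set.seed(42)
n <- 10    # observations
p <- 300   # predictors

# Hypothetical data with the dimensions described in the thread.
X <- as.data.frame(matrix(rnorm(n * p), nrow = n))
y <- rnorm(n)
d <- data.frame(y = y, X)

fit <- lm(y ~ ., data = d)

# lm() can estimate at most n coefficients (including the intercept);
# the remaining ones are reported as NA.
n_estimated <- sum(!is.na(coef(fit)))
n_dropped   <- sum(is.na(coef(fit)))

# Residual degrees of freedom: n minus the number of estimated coefficients.
df.residual(fit)
```

Because the response here is pure noise, any apparent fit comes only from having as many free coefficients as observations, which is why such a fit says nothing about which predictors are relevant.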