Hi,

I have the following question about creating data frames. I want to
create a data frame with 2 components: a vector and a matrix.

Let me use a simple example:

y <- rnorm(10)
x <- matrix(rnorm(150), nrow=10)

Now if I do

dd <- data.frame(x=x, y=y)

I get a data frame with 16 columns, but if, according to the documentation,
I do

dd <- data.frame(x=I(x), y=y)

then str(dd) gives:

'data.frame': 10 obs. of 2 variables:
 $ x: AsIs [1:10, 1:15] 0.700073.... -0.44371.... -0.46625.... 0.977337.... 0.509786.... ...
 $ y: num 0.4676 -1.4343 -0.3671 0.0637 -0.231 ...

This looks and works OK.

Now, there exists a CRAN package called pls. It has a yarn data set in
it.

> data(yarn)
> str(yarn)
'data.frame': 28 obs. of 3 variables:
 $ NIR    : num [1:28, 1:268] 3.07 3.07 3.08 3.08 3.1 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ density: num 100 80.2 79.5 60.8 60 ...
 $ train  : logi TRUE TRUE TRUE TRUE TRUE TRUE ...

This looks almost the same, except the matrix component in my example has
the AsIs instead of num.

Is this just some older behavior of the data.frame function producing this
difference? If not, how can I get my data frame (dd) to look like yarn?

I read the help pages for data.frame and as.data.frame and found this
paragraph

If a list is supplied, each element is converted to a column in the data
frame. Similarly, each column of a matrix is converted separately. This
can be overridden if the object has a class which has a method for
as.data.frame: two examples are matrices of class "model.matrix" (which
are included as a single column) and list objects of class "POSIXlt" which
are coerced to class "POSIXct".
If I do

> methods(as.data.frame)
 [1] as.data.frame.aovproj*        as.data.frame.array
 [3] as.data.frame.AsIs            as.data.frame.character
 [5] as.data.frame.complex         as.data.frame.data.frame
 [7] as.data.frame.Date            as.data.frame.default
 [9] as.data.frame.difftime        as.data.frame.factor
[11] as.data.frame.ftable*         as.data.frame.integer
[13] as.data.frame.list            as.data.frame.logical
[15] as.data.frame.logLik*         as.data.frame.matrix
[17] as.data.frame.model.matrix    as.data.frame.numeric
[19] as.data.frame.numeric_version as.data.frame.ordered
[21] as.data.frame.POSIXct         as.data.frame.POSIXlt
[23] as.data.frame.raw             as.data.frame.table
[25] as.data.frame.ts              as.data.frame.vector

so it looks like there is a matrix method for as.data.frame. The question
then is how can I override the default behavior for the matrix object
(converting columns separately).

Any hint will be appreciated,

Andy

__________________________________
Andy Jaworski
518-1-01
Process Laboratory
3M Corporate Research Laboratory
-----
E-mail: apjaworski@mmm.com
Tel: (651) 733-6092
Fax: (651) 736-3122
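The three constructions discussed above can be compared side by side. The following is a minimal sketch in base R; dd_split, dd_asis and dd_plain are illustrative names, and it is only an assumption that a data frame such as yarn was built by assigning the matrix to an existing data frame, as the last variant does.

```r
set.seed(1)                              # reproducible random numbers
y <- rnorm(10)
x <- matrix(rnorm(150), nrow = 10)

# 1. Passing the matrix directly: each matrix column becomes its own variable.
dd_split <- data.frame(x = x, y = y)
ncol(dd_split)                           # 15 matrix columns plus y

# 2. Protecting the matrix with I(): one matrix column of class "AsIs".
dd_asis <- data.frame(x = I(x), y = y)
inherits(dd_asis$x, "AsIs")

# 3. Assigning the matrix to an existing data frame: one plain matrix column,
#    without the "AsIs" class.
dd_plain <- data.frame(y = y)
dd_plain$x <- x
str(dd_plain)
```

In the third variant str() should report the x component as a numeric matrix rather than AsIs, which is the layout shown for yarn.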
Andy,

Did you run into any kind of trouble?
I'm asking because I'm maintaining a package for spectroscopic data that
heavily uses "I (spectra.matrix)" ...

However, once you have the matrix safe inside the data.frame, you can
delete the "AsIs":

> a <- matrix (1:9, 3)
> str (a)
 int [1:3, 1:3] 1 2 3 4 5 6 7 8 9
> df <- data.frame (a = I (a))
> str (df)
'data.frame': 3 obs. of 1 variable:
 $ a: 'AsIs' int [1:3, 1:3] 1 2 3 4 5 6 7 8 9
> df$a <- unclass (df$a)
> str (df)
'data.frame': 3 obs. of 1 variable:
 $ a: int [1:3, 1:3] 1 2 3 4 5 6 7 8 9
> df$a
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
> dim (df)
[1] 3 1

However, I don't know whether something can now trigger a conversion to
data.frame that the AsIs would have stopped.

Cheers,

Claudia

--
Claudia Beleites
Dipartimento dei Materiali e delle Risorse Naturali
Università degli Studi di Trieste
Via Alfonso Valerio 6/a
I-34127 Trieste

phone: +39 0 40 5 58-37 68
email: cbeleites at units.it
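On the open question of what happens once the "AsIs" is removed, one case that can be checked directly is row indexing. This is a minimal sketch assuming base R only; sub is an illustrative name, and other operations (merging, binding, conversion functions) are not covered by it.

```r
a <- matrix(1:9, 3)
df <- data.frame(a = I(a))
df$a <- unclass(df$a)                # drop the "AsIs" class, keep the matrix

# Row indexing with drop = FALSE returns a data frame again.
sub <- df[2:3, , drop = FALSE]

is.matrix(sub$a)                     # is the column still a matrix?
dim(sub$a)                           # rows 2 and 3, all three matrix columns
dim(sub)                             # the data frame still has one variable
```

drop = FALSE is used because indexing a one-variable data frame by rows would otherwise return the column itself instead of a data frame.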
apjaworski at mmm.com wrote:
>
> Thanks for the quick reply.
>
> No, I did not run into any problems so far. I have been using the PLS
> package and the modelling functions seem to work just fine.
>
> In fact, even if I let the data.frame convert the x matrix to separate
> columns, the "y ~ x" modeling syntax still seems to work fine.
>

I don't see that behaviour:

library (pls) # provides plsr ()
rm (x) # make sure there is no leftover x in the workspace
mat <- matrix (1 : 9, 3)
df <- data.frame (y = 1 : 3, x = mat)
str (df)
df
coef (plsr (y ~ x, data = df, ncomp = 1)) # error
coef (plsr (y ~ x.1 + x.2 + x.3, data = df, ncomp = 1)) # works

df$x <- I (-mat)
str (df)
df
coef (plsr (y ~ x, data = df, ncomp = 1)) # works

Claudia

PS: May I be curious: what kind of data do you analyze with PLS?

> Thanks again,
>
> Andy
>
> From: Claudia Beleites <cbeleites at units.it>
> To: apjaworski at mmm.com
> Cc: r-help at r-project.org
> Date: 03/12/2010 02:13 PM
> Subject: Re: [R] Data frame question
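The lookup of x in the formula can also be illustrated without pls. The sketch below uses model.frame () from the stats package as a stand-in, on the assumption that plsr () resolves formula variables the same way; split_df, matrix_df and f are illustrative names. The formula is given an otherwise empty environment so that a leftover x in the workspace cannot be picked up.

```r
mat <- matrix(1:9, 3)
split_df  <- data.frame(y = 1:3, x = mat)      # variables y, x.1, x.2, x.3
matrix_df <- data.frame(y = 1:3, x = I(mat))   # variables y, x (a matrix)

# A formula whose environment contains no x, only base R functions.
f <- y ~ x
environment(f) <- new.env(parent = baseenv())

# With split columns there is no variable called x to find.
split_result <- tryCatch(model.frame(f, data = split_df),
                         error = function(e) e)
inherits(split_result, "error")

# With a matrix column, x is found in the data frame itself.
mf <- model.frame(f, data = matrix_df)
dim(mf$x)
```

One possible explanation for "y ~ x" appearing to work with split columns is that an ordinary formula falls back to the workspace, where an x matrix from earlier steps may still exist.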