Onur Uncu
2012-Jun-10 10:41 UTC
[R] Data.frames can not hold objects...What can be done in the following scenario?
R-Help community,

I understand that data.frames can hold elements of type double, string etc but NOT objects (such as a matrix etc). This is not convenient for me in the following situation. I have a function that takes 2 inputs and returns a vector:

testfun <- function (x,y) seq(x,y,1)

I have a data.frame defined as follows:

testframe<-data.frame(xvalues=c(2,3),yvalues=c(4,5))

I would like to apply testfun to every row of testframe and then create a new column in the data.frame which holds the returned vectors as objects. Why do I want this? Because the returned vectors are an intermediate step towards further calculations. It would be great to keep adding new columns to the data.frame with the intermediate objects. But this is not possible since data.frames can not hold objects as elements. What do you suggest as an elegant solution in this scenario? Thank you for any help!

I would love to hear if forum
Duncan Murdoch
2012-Jun-10 11:02 UTC
[R] Data.frames can not hold objects...What can be done in the following scenario?
On 12-06-10 6:41 AM, Onur Uncu wrote:
> R-Help community,
>
> I understand that data.frames can hold elements of type double, string
> etc but NOT objects (such as a matrix etc).

That is incorrect. Dataframes can hold list vectors. For example:

A <- data.frame(x = 1:3)
A$y <- list(matrix(1, 2,2), matrix(2, 3,3), matrix(3,4,4))

A[1,2] will now extract the 2x2 matrix, A[2,2] will extract the 3x3, etc.

Duncan Murdoch
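A minimal sketch of the list-column example above, assuming only base R. One detail worth checking when running it: with single brackets, `A[1, 2]` returns the cell still wrapped in a one-element list, whereas `A[[1, 2]]` or `A$y[[1]]` returns the stored matrix itself. The names `cell` and `m` are introduced here purely for illustration.

```r
# Build a data frame with a list column, as in the reply above.
A <- data.frame(x = 1:3)
A$y <- list(matrix(1, 2, 2), matrix(2, 3, 3), matrix(3, 4, 4))

# Single-bracket indexing keeps the list wrapper around the cell.
cell <- A[1, 2]
is.list(cell)     # TRUE
dim(cell[[1]])    # 2 2

# Double-bracket indexing returns the stored object itself.
m <- A[[2, 2]]
is.matrix(m)      # TRUE
dim(m)            # 3 3

# Going through the column first gives the same object.
identical(A$y[[3]], matrix(3, 4, 4))  # TRUE
```

Which form is more convenient depends on whether the next step expects a list or the object itself.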
Onur Uncu
2012-Jun-10 11:29 UTC
[R] Data.frames can not hold objects...What can be done in the following scenario?
Thank you Duncan. A follow-up question is, how can I achieve the desired result in the earlier email? (i.e. Add the resulting vectors as a new column to the existing data.frame?) I tried the following:

testframe$newcolumn<-apply(testframe,1,function(x)testfun(x[1],x[2]))

but I am getting the following error:

Error in `$<-.data.frame`(`*tmp*`, "vecss", value = c(2, 3, 4, 3, 4, 5 :
  replacement has 3 rows, data has 2

Thanks for the help.
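A minimal sketch of why the assignment above fails, assuming the `testfun` and `testframe` definitions from the original question. Both rows happen to produce vectors of length 3, so `apply()` simplifies its result into a 3 x 2 matrix, and a 3-row matrix cannot become a column of a 2-row data frame. One possible workaround, not necessarily the one the posters had in mind, is `mapply()` with `SIMPLIFY = FALSE`, which always returns a list with one element per row. The name `res` is hypothetical.

```r
testfun <- function(x, y) seq(x, y, 1)
testframe <- data.frame(xvalues = c(2, 3), yvalues = c(4, 5))

# Each row yields a length-3 vector, so apply() returns a 3 x 2 matrix
# (one column per row of testframe) instead of a list.
res <- apply(testframe, 1, function(r) testfun(r[1], r[2]))
dim(res)  # 3 2

# mapply() with SIMPLIFY = FALSE keeps one list element per row,
# which has the right length to be stored as a list column.
testframe$newcolumn <- mapply(testfun, testframe$xvalues, testframe$yvalues,
                              SIMPLIFY = FALSE)
is.list(testframe$newcolumn)  # TRUE
testframe$newcolumn[[2]]      # the vector for the second row
```

If the rows produced vectors of different lengths, `apply()` would return a list rather than a matrix, so the error depends on the data as well as on the code.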
Rui Barradas
2012-Jun-12 15:15 UTC
[R] Data.frames can not hold objects...What can be done in the following scenario?
Hello,

You're right, to put lists or vectors as elements of data frames is not the best practice. Note, however, that the opposite is not true; it's common and good practice to have data frames and other objects as list elements, especially if they are in some way related. If, for instance, we have several files, each of them with the same structure but with measurements taken at different sites or dates, it's common to read them into a list of data frames.

In your case, maybe it would be better to create a list, not another column of the data.frame.

testlist <- lapply(...etc...)

Like this the extra information would be kept in a more flexible structure, but sharing the index number.

Anyway, all general rules have exceptions, and I don't dislike the original one. It does make sense.

Rui Barradas

On 11-06-2012 23:55, Onur Uncu wrote:
> Thank you Rui! You have been very helpful to me.
>
> I was told in another R forum today that it is bad programming
> practice to put lists/vectors as elements into data.frames. I am very
> new to R programming and I am trying to figure out elegant ways of
> writing code. This is why I asked the below questions... Thank you for
> your help.
>
> On Mon, Jun 11, 2012 at 11:35 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>> Hello,
>>
>> There are also other possibilities. What I believe is the easiest is to go
>> back to the beginning, i.e., have the function return a vector as before,
>> and then use lapply on the data.frame's rows.
>>
>> testfun <- function (x, y) seq(x, y, 1)
>>
>> testframe$newcolumn <- lapply(1:nrow(testframe), function(i)
>>     testfun(testframe[i, 1], testframe[i, 2]))
>> class(testframe$newcolumn) # [1] "list"
>>
>> testframe$newcolumn[[1]] # a vector, no longer a list
>> testframe$newcolumn[[1]][2] # 2nd element of that vector
>>
>> The main point is that data.frames are lists of a special kind: they
>> implement the statistical concept of variables and their observations, the
>> columns and the rows. And like all lists, their elements can be any R object
>> including lists.
>>
>> Rui Barradas
>>
>> On 11-06-2012 23:02, R. Michael Weylandt wrote:
>>> It is possible to chain together uses of `[[` -- e.g.,
>>>
>>> x <- list(1:5, list(letters[1:5], list(LETTERS[1:5])))
>>>
>>> x[[c(1,2)]] # 2L
>>> x[[c(2,1,3)]] # "c"
>>> x[[c(2,2,1,3)]] # "C"
>>>
>>> which is sometimes useful.
>>>
>>> Best,
>>> Michael
>>>
>>> On Mon, Jun 11, 2012 at 4:35 PM, Onur Uncu <onuruncu at gmail.com> wrote:
>>>> Rui and the R-help team,
>>>>
>>>> In Rui's helpful answer below, the function returns a list as output.
>>>> When we apply() the function to the data.frame, dataframe$newcolumn
>>>> has 2 layers of list before we can access each vector's elements. For
>>>> instance, dataframe$newcolumn[[1]][[1]] is a vector whereas
>>>> dataframe$newcolumn and dataframe$newcolumn[[1]] are lists. Is there a
>>>> solution that involves fewer layers of lists? I am just trying to
>>>> understand the R language better.
>>>>
>>>> Thank you.
>>>>
>>>> On Sun, Jun 10, 2012 at 3:18 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>>>>> Hello,
>>>>>
>>>>> What you need is to have your function return a list, not a vector.
>>>>> Like this
>>>>>
>>>>> testfun <- function (x, y) list(seq(x, y, 1))
>>>>>
>>>>> testframe<-data.frame(xvalues=c(2,3),yvalues=c(4,5))
>>>>>
>>>>> testframe$newcolumn <- apply(testframe, 1, function(x) testfun(x[1], x[2]))
>>>>> class(testframe$newcolumn) # [1] "list"
>>>>>
>>>>> Then you access lists and list elements.
>>>>>
>>>>> testframe$newcolumn[[1]] # a list with just one element
>>>>> testframe$newcolumn[[1]][[1]] # that element, a vector
>>>>> testframe$newcolumn[[1]][[1]][2] # the vector's 2nd element
>>>>>
>>>>> Since you want the function to return vectors in order to do further
>>>>> computations, you'll access those vectors by varying the list index,
>>>>>
>>>>> testframe$newcolumn[[1]][[1]] # first list, its only vector
>>>>> testframe$newcolumn[[2]][[1]] # second list, its only vector
>>>>>
>>>>> Etc.
>>>>>
>>>>> Hope this helps,
>>>>>
>>>>> Rui Barradas
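The two storage options discussed in this message (a separate list sharing the row index, or a list column) can be compared side by side. This is a minimal sketch, assuming the `testfun` and `testframe` definitions from the original question. The body of the `lapply()` call is one plausible way to fill in the elided `testlist` line and is an assumption, as are the column names `total` and `n`.

```r
testfun <- function(x, y) seq(x, y, 1)
testframe <- data.frame(xvalues = c(2, 3), yvalues = c(4, 5))

# Option 1: a separate list whose i-th element belongs to row i of testframe.
testlist <- lapply(seq_len(nrow(testframe)), function(i)
  testfun(testframe$xvalues[i], testframe$yvalues[i]))

# Option 2: the same vectors stored as a list column (a single layer of list).
testframe$newcolumn <- testlist

# Either way, later steps that yield one number per row fit into ordinary
# columns. 'total' and 'n' are hypothetical names for this illustration.
testframe$total <- vapply(testlist, sum, numeric(1))
testframe$n <- lengths(testframe$newcolumn)

testframe$total  # 9 12
testframe$n      # 3 3
```

Keeping the intermediate vectors in `testlist` leaves the data frame with atomic columns only, while the list column keeps everything in one object; which trade-off is preferable depends on how the later calculations are written.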