I'd like to have a data.frame structured something like the following:

d <- data.frame (
    x=list( c(1,2), c(5,2), c(9,1) ),
    y=c( 1, -1, -1)
)

The reason is this: 'd' is the training data for a machine learning algorithm. d$x is the independent data, and d$y is the dependent data.

In general my machine learning code will work where each element of d$x is a vector of one or more real numbers. So for instance, the same code should work when d$x[1] = 42, or when d$x[1] = (42, 3, 5). All that matters is that all elements within d$x are lists/vectors of the same length.

Does anyone know if/how I can get a data.frame set up like that?

Thanks,
Christian
Why do you need to use a data frame? A list will give you the flexibility you want:

d <- list(
    x=list( c(1,2), c(5,2), c(9,1) ),
    y=c( 1, -1, -1)
)

Then you can access the individual elements:

> d$x
[[1]]
[1] 1 2

[[2]]
[1] 5 2

[[3]]
[1] 9 1

> d$y
[1]  1 -1 -1
> d$x[[1]]
[1] 1 2

-Christos

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of
> Christian Convey
> Sent: Monday, February 12, 2007 11:29 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Can a data.frame column contain lists/arrays?
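A minimal sketch of how the list version might feed a learning routine, assuming (as the question states) that every element of d$x has the same length. The names `lens` and `X` are introduced here for illustration and are not from the thread.

```r
# Training data held in a plain list, as in the reply above.
d <- list(
  x = list(c(1, 2), c(5, 2), c(9, 1)),
  y = c(1, -1, -1)
)

# Check the stated invariant: all feature vectors share one length,
# and there is one response value per feature vector.
lens <- vapply(d$x, length, integer(1))
stopifnot(length(unique(lens)) == 1, length(d$x) == length(d$y))

# Stack the feature vectors into a matrix, one row per observation.
X <- do.call(rbind, d$x)
dim(X)   # 3 2
```

The same code runs unchanged when each element of d$x is a single number, in which case X has one column.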
--- Christian Convey <christian.convey at gmail.com> wrote:

> Does anyone know if/how I can get a data.frame set
> up like that?
>
> Thanks,
> Christian

I doubt it. A data.frame is a special kind of list. You should be able to do anything you want with a list. Have a look at the "Lists and data frames" chapter of An Introduction to R.
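To illustrate the point that a data.frame is itself a list, here is a short check in base R. The data frame `df` and its column names are made up for the example.

```r
# A small, hypothetical data frame with two ordinary columns.
df <- data.frame(a = 1:2, y = c(1, -1))

is.list(df)                  # TRUE: a data frame is stored as a list of columns
length(df)                   # 2: one list element per column
identical(df[["a"]], df$a)   # TRUE: list-style access returns the column
```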
> Does anyone know if/how I can get a data.frame set up like that?

You certainly can, although it requires a little work. A data.frame is a list of vectors, each of the same length, and a list is a type of vector. I use this structure fairly often in my own work, and find it quite useful.

However, the data.frame and as.data.frame functions try to be helpful at converting lists to regular columns, so you must first create your data.frame and then add the column which is a list:

> df <- data.frame(a=1:2)
> df$b <- list(1:5, 6:10)
> df
  a              b
1 1  1, 2, 3, 4, 5
2 2 6, 7, 8, 9, 10
> str(df)
'data.frame':   2 obs. of  2 variables:
 $ a: int  1 2
 $ b:List of 2
  ..$ : int  1 2 3 4 5
  ..$ : int  6 7 8 9 10

but

> data.frame(a=1:2, b = list(1:5, 6:10))
Error in data.frame(a = 1:2, b = list(1:5, 6:10)) :
        arguments imply differing number of rows: 2, 5

Note that it is possible to create structures like this which do not print, but still contain valid objects:

> df$b <- list(lm(mpg~wt, data=mtcars), lm(mpg~vs, data=mtcars))
> df
Error in unlist(x, recursive, use.names) : argument not a list
> summary(df[1,2])
     Length Class Mode
[1,] 12     lm    list
> summary(df[1,2][[1]])

Call:
lm(formula = mpg ~ wt, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max
-4.5432 -2.3647 -0.1252  1.4096  6.8727

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
wt           -5.3445     0.5591  -9.559 1.29e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.046 on 30 degrees of freedom
Multiple R-Squared: 0.7528,     Adjusted R-squared: 0.7446
F-statistic: 91.38 on 1 and 30 DF,  p-value: 1.294e-10

There are some functions in the reshape package, in particular stamp, which make this a bit easier for particular types of data.

Regards,
Hadley
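As an alternative not mentioned in the thread, base R's I() can protect a list so that data.frame() keeps it as a single column in one call. The sketch below applies it to the data from the original question; the variable names `x`, `y`, and `d` are chosen for the example.

```r
# Training data from the original question.
x <- list(c(1, 2), c(5, 2), c(9, 1))
y <- c(1, -1, -1)

# I() marks the list "as is", so data.frame() stores it as one list column
# instead of trying to expand each element into its own column.
d <- data.frame(x = I(x), y = y)

nrow(d)            # 3
d$x[[2]]           # feature vector for row 2: 5 2
sapply(d$x, sum)   # per-row sums of the feature vectors: 3 7 10
```

This gives the same structure as creating the data.frame first and assigning the list column afterwards; which form to use is mostly a matter of taste.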