Hi experts,

I am trying to write a very flexible method that allows me to add a new column to an existing data frame. This is what I have so far:

    add.column <- function(df, new.col, name) {
        n.row <- dim(df)[1]
        length(new.col) <- n.row
        names(new.col) <- name
        return(cbind(df, new.col))
    }

    df <- NULL
    df <- data.frame(a=c(1,2,3))
    df
    # correct: added NA to new column
    df <- add.column(df,c(1,2),'myNewColumn2')
    df
    # problem: not added, data frame should be extended with NAs
    add.column(df,c(1,2,3,4),'myNewColumn3')
    df

However, there are two problems:

1) The column name is not renamed accurately but always set to 'new.col'. Surely this could be done outside the function, but it would be better if it's self-contained.
2) It does not work for cases where new.col has more elements than the data frame has rows. In such cases, I would like to add NAs to the data frame if it has fewer rows.

Any ideas to solve this?

Ralf

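A minimal sketch of why problem 1 occurs, assuming the behaviour Ralf reports: `names(new.col) <- name` labels the elements of the vector, while `cbind` takes the column name from the argument expression. The helper `add.named` is hypothetical (not from the thread) and only illustrates assigning the column by name instead.

```r
df <- data.frame(a = c(1, 2, 3))
new.col <- c(1, 2, NA)

# names<- labels the vector's elements, not the future column
names(new.col) <- "myNewColumn2"
out <- cbind(df, new.col)
colnames(out)   # "a" "new.col"

# Hypothetical helper (not from the thread): assign the column by name
add.named <- function(df, new.col, name) {
  length(new.col) <- nrow(df)  # pad with NA (or truncate) to the row count
  df[[name]] <- new.col        # the column is stored under the string in `name`
  df
}

fixed <- add.named(df, c(1, 2), "myNewColumn2")
colnames(fixed)
```

This covers only the naming problem; a vector longer than the data frame is still truncated here.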
On Tue, Aug 3, 2010 at 5:32 PM, Ralf B <ralf.bierig at gmail.com> wrote:
> Hi experts,
>
> I am trying to write a very flexible method that allows me to add a
> new column to an existing data frame. This is what I have so far:

The existing way is fairly flexible, just name the new data...

    > mydf <- data.frame(a = c(1, 2, 3) )
    > cbind(mydf, c = c(4, 5, 6) )
      a c
    1 1 4
    2 2 5
    3 3 6

I do not have any good suggestions for extending the data with NAs if the row lengths do not match.

Josh

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

Try this:

    new.col <- 1:2
    transform(df, myNewColumn2 = new.col[1:nrow(df)])

On Tue, Aug 3, 2010 at 9:32 PM, Ralf B <ralf.bierig@gmail.com> wrote:
> Hi experts,
>
> I am trying to write a very flexible method that allows me to add a
> new column to an existing data frame.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

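The suggestion above relies on the fact that indexing a vector past its end returns NA. A small sketch of that behaviour follows; the variable names `padded`, `too.long` and `res` are illustrative only. Note that the same indexing silently drops extra elements when the vector is longer than the data frame, so it does not address problem 2.

```r
df <- data.frame(a = c(1, 2, 3))
new.col <- 1:2

# Indexing beyond the end of a vector yields NA, so this pads to nrow(df)
padded <- new.col[seq_len(nrow(df))]
res <- transform(df, myNewColumn2 = padded)

# A longer vector is cut down to nrow(df) by the same indexing
too.long <- c(1, 2, 3, 4)[seq_len(nrow(df))]
length(too.long)
```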
On Aug 3, 2010, at 8:32 PM, Ralf B wrote:

> However, there are two problems:
>
> 1) The column name is not renamed accurately but always set to
> 'new.col'. Surely this could be done outside the function, but it
> would be better if it's self-contained.

Try this:

    add.col <- function(df, vec, namevec) {
        length(vec) <- nrow(df)   # pads with NA's
        df[, namevec] <- vec      # names new col properly
        df
    }

> 2) It does not work for cases where new.col is longer than the length
> of the data frame. In such cases, I would like to add NA's to the data
> frame if it has fewer rows.

Don't have a compact answer to this. (Tried re-dimensioning with "dim() <-" but it was not accepted by the interpreter.) Would need to add a test at the beginning and then pad with rows of NA's using rbind before adding the column as above.

    add.col <- function(df, vec, namevec) {
        if (nrow(df) < length(vec)) {
            # pads rows if needed
            df <- rbind(df, matrix(NA, length(vec) - nrow(df), ncol(df),
                                   dimnames = list(NULL, names(df))))
        }
        length(vec) <- nrow(df)   # pads with NA's
        df[, namevec] <- vec      # names new col properly
        return(df)
    }

> Any ideas to solve this?

Has not been tested with columns of varying types.

--
David.
West Hartford, CT

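Since the function above is described as untested with columns of varying types, here is a hedged variant exercised on a numeric and a character column. It swaps in a different padding technique, row indexing past the end of the data frame (which returns all-NA rows), instead of `rbind` with a matrix. The name `add.col.idx` and the sample data are assumptions for illustration.

```r
# Sketch: pad rows by indexing past nrow(df) instead of rbind-ing a matrix
add.col.idx <- function(df, vec, namevec) {
  n <- max(nrow(df), length(vec))
  df <- df[seq_len(n), , drop = FALSE]  # out-of-range rows come back as NA
  rownames(df) <- NULL                  # discard the "NA" row labels
  length(vec) <- n                      # pad the vector with NA if shorter
  df[[namevec]] <- vec
  df
}

mixed <- data.frame(a = c(1, 2, 3), b = c("x", "y", "z"),
                    stringsAsFactors = FALSE)

longer  <- add.col.idx(mixed, c(10, 20, 30, 40), "myNewColumn3")
shorter <- add.col.idx(mixed, c(10, 20), "myNewColumn2")
```

Each existing column keeps its own type because the NA rows are produced per column rather than coming from a single logical matrix.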
Sometimes we try to make things behave the way we think they ought to and find it surprisingly difficult. Later we discover that our original premise was flawed and we wasted our time trying to force fit our ideas to work.

Since all of the i-th elements of the columns of a data table are supposed to correspond to each other, the information content of adding a value corresponding to a bunch of NAs is practically zero, so that the value of the effort put into doing so is practically zero as well.

If your information relationships are really this sparse you may be better off assembling your data in "melted" form to begin with (cf. "melt" in package "reshape"). Alternatively, perhaps you should be creating multiple data frames (and optionally using "merge" to bring them all together later).

If there is no implied correspondence between the i-th elements in each vector, perhaps you should just be using named elements in a list to hold the vectors instead of using a data frame.

"Ralf B" <ralf.bierig at gmail.com> wrote:
>2) It does not work for cases where new.col is longer than the length
>of the data frame. In such cases, I would like to add NA's to the data
>frame if it has fewer rows.

---------------------------------------------------------------------------
Jeff Newmiller
DCN:<jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
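The two alternatives mentioned above can be sketched roughly as follows. The key column `id` is a hypothetical addition (the original data has no key), and the object names are illustrative only.

```r
# Option 1: no row-wise correspondence, so keep the vectors in a named list
cols <- list(a = c(1, 2, 3), myNewColumn3 = c(1, 2, 3, 4))
lengths(cols)   # each element keeps its own length

# Option 2: separate data frames with an explicit key, combined with merge;
# all = TRUE keeps unmatched rows and fills the gaps with NA
left  <- data.frame(id = 1:3, a = c(1, 2, 3))
right <- data.frame(id = 1:4, myNewColumn3 = c(1, 2, 3, 4))
both  <- merge(left, right, by = "id", all = TRUE)
```

With an explicit key, the NA in `a` for the unmatched row says that no value exists for that `id`, rather than being a side effect of padding.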