Hi Folks:

Here's the situation:

> m <- cbind(x=letters[1:3], y = letters[4:6])
> m
     x   y
[1,] "a" "d"
[2,] "b" "e"
[3,] "c" "f"

## m is a 2 column character matrix

> d <- data.frame(a=1:3,b=4:6)
> d$c <- m
> d
  a b c.x c.y
1 1 4   a   d
2 2 5   b   e
3 3 6   c   f

## But please note (as was remarked in a thread here a couple of months ago)
> ncol(d)
[1] 3

## d is a ** 3 ** column data frame

Now what I wish to do is programmatically convert d to a 4 column
frame with names c("a","b","x","y"). Of course:

1. The column classes/modes must be preserved (character going to
factor and numeric remaining numeric).

2. I assume that I do not know a priori which of d's
components/columns are matrices and which are vectors.

3. There may be many more columns which are vectors or matrices than
just the three in this little example.

I can easily and sensibly accomplish these 3 tasks, but the problem is
that I run afoul of data frame column naming procedures in doing so,
about which the data.frame Help page says rather enigmatically:

"How the names of the data frame are created is complex, and the rest
of this paragraph is only the basic story." Indeed!
(This, of course, is shorthand for "Go look at the source if you want
to know!" )

Anyway, AFAICT from the Help, any "simple" approach to conversion
using data.frame results in "c.x" and "c.y" for the names of the last
two columns. I **can** get what I want by explicitly constructing the
vector of names via the following ugly hack; my question is, can it be
improved?

> dd <- do.call(data.frame,d)

> dd
  a b c.x c.y
1 1 4   a   d
2 2 5   b   e
3 3 6   c   f

> ncol(dd)
[1] 4

> cnames <- sapply(d,colnames)
> cnames
$a
NULL

$b
NULL

$c
[1] "x" "y"

> names(dd) <- unlist(ifelse(sapply(cnames,is.null),names(d),cnames))
##Yuck!

> dd
  a b x y
1 1 4 a d
2 2 5 b e
3 3 6 c f

Cheers to all,
Bert

-- 
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
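One possible tidy-up of the name-building step in the post above, offered as a sketch rather than a definitive answer. It keeps the do.call(data.frame, ...) expansion and only replaces the sapply/ifelse construction. The helper name flat_names is hypothetical, and the sketch assumes every column of d is either an atomic vector or a matrix. It uses lapply because sapply may simplify its result to a matrix when every column is a matrix of the same width. stringsAsFactors = TRUE is passed explicitly because the character-to-factor default depends on the R version.

```r
# Example data from the post: a data frame holding a two-column matrix.
m <- cbind(x = letters[1:3], y = letters[4:6])
d <- data.frame(a = 1:3, b = 4:6)
d$c <- m

# Hypothetical helper: one name per *expanded* column of d.
# Assumes each column is an atomic vector or a matrix.
flat_names <- function(d) {
  unlist(lapply(names(d), function(nm) {
    col <- d[[nm]]
    if (!is.matrix(col)) {
      return(nm)                               # plain vector: keep its name
    }
    if (is.null(colnames(col))) {
      paste(nm, seq_len(ncol(col)), sep = ".") # unnamed matrix: c.1, c.2, ...
    } else {
      colnames(col)                            # named matrix: use its colnames
    }
  }))
}

# Expand matrix columns into ordinary columns (names come out as c.x, c.y),
# converting character data to factors explicitly.
dd <- do.call(data.frame, c(as.list(d), stringsAsFactors = TRUE))

# Overwrite the generated names with the ones built above.
names(dd) <- flat_names(d)
```

Note that nothing here guards against duplicate names: if two matrices share a column name, dd ends up with repeated column names.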
Hi Bert,

maybe I'm missing the point, but

dd <- cbind(d, m)

does 1, 2 and 3 as desired:

n <- data.frame(nx=letters[7:9], ny = 7:9)
str(cbind(d,m,n))
t <- letters[7:9]
str(cbind(d,t))

cheers.

On 06.09.2012 at 18:03, Bert Gunter wrote:
> Hi Folks:
>
> Here's the situation:
> [...]

-- 
Eik Vettorazzi
Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf
Martinistr. 52
20246 Hamburg
T ++49/40/7410-58243
F ++49/40/7410-57790
On Sep 6, 2012, at 9:03 AM, Bert Gunter wrote:

> >> d <- data.frame(a=1:3,b=4:6)
> >> d$c <- m
> [...]
> >> ncol(d)
> > [1] 3
>
> ## d is a ** 3 ** column data frame

I guess this means you are not the one performing the d$c <- m step?
If you are in control of that step, you can get different (and more
to your liking) behavior with 'cbind.data.frame':

> cbind(d, m)
  a b x y
1 1 4 a d
2 2 5 b e
3 3 6 c f
> ncol( cbind(d, m) )
[1] 4

> Now what I wish to do is programmatically convert d to a 4 column
> frame with names c("a","b","x","y").
> [...]

David Winsemius, MD
Alameda, CA, USA
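If the d$c <- m step has already happened and is not under your control, the same cbind idea can be applied column by column. The following is only a sketch under stated assumptions: flatten_df is a hypothetical name, every column of d is assumed to be an atomic vector or a matrix, and non-default row names of d are not carried over.

```r
# Hypothetical helper: rebuild a data frame so that every matrix column
# becomes a set of ordinary columns named after the matrix's colnames.
flatten_df <- function(d) {
  pieces <- lapply(names(d), function(nm) {
    col <- d[[nm]]
    if (is.matrix(col)) {
      out <- as.data.frame(col, stringsAsFactors = TRUE)
      # Unnamed matrix: fall back to "<name>.<index>" style names.
      if (is.null(colnames(col))) {
        names(out) <- paste(nm, seq_len(ncol(col)), sep = ".")
      }
    } else {
      out <- data.frame(col, stringsAsFactors = TRUE)
      names(out) <- nm                 # plain vector keeps its original name
    }
    out
  })
  # cbind on data frames keeps the piece names as they are.
  do.call(cbind, pieces)
}

# The example from the thread.
m <- cbind(x = letters[1:3], y = letters[4:6])
d <- data.frame(a = 1:3, b = 4:6)
d$c <- m

flat <- flatten_df(d)
str(flat)
```

Compared with fixing up the names after do.call(data.frame, d), this never produces the "c.x" style names in the first place, at the cost of building one small data frame per column.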