Jill Hollenbach
2009-Aug-12 01:48 UTC
[R] paste first row string onto every string in column
Hi,
I am trying to edit a data frame such that the string in the first line is appended onto the beginning of each element in the subsequent rows. The data looks like this:

> df
     V1    V2    V3    V4
1 DPA1* DPA1* DPB1* DPB1*
2  0103  0104  0401  0601
3  0103  0103  0301  0402
.
.

and what I want is this:

> dfnew
         V1        V2        V3        V4
1     DPA1*     DPA1*     DPB1*     DPB1*
2 DPA1*0103 DPA1*0104 DPB1*0401 DPB1*0601
3 DPA1*0103 DPA1*0103 DPB1*0301 DPB1*0402

Any help is much appreciated, I am new to this and struggling.
Jill

___
Jill Hollenbach, PhD, MPH
Assistant Staff Scientist
Center for Genetics
Children's Hospital Oakland Research Institute
jhollenbach at chori.org

--
View this message in context: http://www.nabble.com/paste-first-row-string-onto-every-string-in-column-tp24928720p24928720.html
Sent from the R help mailing list archive at Nabble.com.
Hi Jill,

Completely not elegant, but may be useful. Of course other colleagues will solve this with a one-line command :-)

cheers
milton

df <- read.table(stdin(), header=TRUE, sep=",")
V1,V2,V3,V4
DPA1*,DPA1*,DPB1*,DPB1*
0103,0104,0401,0601
0103,0103,0301,0402

df.new <- as.matrix(df)

# for every row after the first, paste the column's first-row string in front
for (i in 2:dim(df)[1])
{
  for (j in 1:dim(df)[2])
  {
    df.new[i,j] <- paste(as.character(df[1,j]), as.character(df[i,j]), sep="")
  }
}

df.new <- data.frame(df.new)

df
df.new

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
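The double loop can also be written without explicit loops. The following is only a sketch and not part of the original reply: it assumes the data are read from an inline string with colClasses = "character" (so that "0103" keeps its leading zero), and it uses rep() to repeat each column's first-row prefix once per remaining row.

```r
# Example data from the original post, read as character columns.
df <- read.table(text = "V1,V2,V3,V4
DPA1*,DPA1*,DPB1*,DPB1*
0103,0104,0401,0601
0103,0103,0301,0402", header = TRUE, sep = ",", colClasses = "character")

m <- as.matrix(df)

# m[-1, ] is stored column by column, so each prefix in m[1, ] has to be
# repeated (nrow(m) - 1) times in a row to line up with it.
m[-1, ] <- paste0(rep(m[1, ], each = nrow(m) - 1), m[-1, ])

df.new <- data.frame(m, stringsAsFactors = FALSE)
df.new
```

The matrix route works here because every column is character; with mixed column types as.matrix() would coerce everything to character first.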
Patrick Connolly
2009-Aug-12 07:51 UTC
[R] paste first row string onto every string in column
as.data.frame(lapply(df, function(x) paste(x[1], x[-1], sep = "")))

There are a few ideas in there that will get you started. We could add a bit more to get the first row you want in one line, but you'll be able to work that out.

That will end up as a data frame of factors (with the stringsAsFactors default of the time). You might need to do something else with it after that stage.

HTH

--
Patrick Connolly
Great minds discuss ideas
Average minds discuss events
Small minds discuss people
..... Eleanor Roosevelt
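One possible way to fill in the two points left open above (keeping the first row, and avoiding factors) is sketched here. It is an assumption about what was intended, not code from the thread: the first element of each column is put back with c(), and stringsAsFactors = FALSE is passed explicitly so the result holds character columns regardless of the R version.

```r
# Example data from the original post, read as character columns.
df <- read.table(text = "V1,V2,V3,V4
DPA1*,DPA1*,DPB1*,DPB1*
0103,0104,0401,0601
0103,0103,0301,0402", header = TRUE, sep = ",", colClasses = "character")

# For each column: keep element 1 as is, prefix it onto all the others.
dfnew <- as.data.frame(
  lapply(df, function(x) c(x[1], paste(x[1], x[-1], sep = ""))),
  stringsAsFactors = FALSE
)
dfnew
```

If the columns of df are factors rather than character, wrapping x in as.character() inside the function keeps c() from combining factor codes.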
Inchallah Yarab
2009-Aug-12 09:52 UTC
[R] Re : paste first row string onto every string in column
Try this: save your data in C:/ and run the following.

(data <- read.csv2("c:/jill.csv", sep=","))
(C <- data[1,])
A <- numeric(4)
B <- numeric(4)
for (i in 1:4){
  A[i] <- paste(data[1,i], data[2,i])
  B[i] <- paste(data[1,i], data[3,i])
}
A
B
(data1 <- rbind(as.character(A), as.character(B)))
(data2 <- rbind(C, data1))

Normally that gives:

> (data <- read.csv2("c:/jill.csv", sep=","))
     V1    V2    V3    V4
1 DPA1* DPA1* DPB1* DPB1*
2   103   104   401   601
3   103   103   301   402
> (C<-data[1,])
     V1    V2    V3    V4
1 DPA1* DPA1* DPB1* DPB1*
> A <- numeric(4)
> B <- numeric(4)
> for (i in 1 :4){
+ A[i] <- paste(data[1,i],data[2,i])
+ B[i] <- paste(data[1,i],data[3,i])
+ }
> A
[1] "DPA1* 103" "DPA1* 104" "DPB1* 401" "DPB1* 601"
> B
[1] "DPA1* 103" "DPA1* 103" "DPB1* 301" "DPB1* 402"
>
> (data1 <- rbind(as.character(A),as.character(B)))
     [,1]        [,2]        [,3]        [,4]
[1,] "DPA1* 103" "DPA1* 104" "DPB1* 401" "DPB1* 601"
[2,] "DPA1* 103" "DPA1* 103" "DPB1* 301" "DPB1* 402"
> (data2 <- rbind(C,data1))
         V1        V2        V3        V4
1     DPA1*     DPA1*     DPB1*     DPB1*
2 DPA1* 103 DPA1* 104 DPB1* 401 DPB1* 601
3 DPA1* 103 DPA1* 103 DPB1* 301 DPB1* 402
>

Hope that helps!!!

inchallah yarab
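The transcript shows a space inside each result ("DPA1* 103") because paste() separates its arguments with sep = " " by default, whereas the requested output has none. The sketch below repeats the same loop with sep = "". It is an illustration rather than the poster's code: the inline csv_text string stands in for the c:/jill.csv file, and colClasses = "character" is an added assumption so that the values are handled as text.

```r
# Stand-in for the CSV file, using the values from the original question.
csv_text <- "V1,V2,V3,V4
DPA1*,DPA1*,DPB1*,DPB1*
0103,0104,0401,0601
0103,0103,0301,0402"
data <- read.csv(text = csv_text, colClasses = "character")

paste(data[1, 1], data[2, 1])            # "DPA1* 0103"  (default sep = " ")
paste(data[1, 1], data[2, 1], sep = "")  # "DPA1*0103"

# Same loop as above, with sep = "" and character() instead of numeric().
A <- character(ncol(data))
B <- character(ncol(data))
for (i in seq_len(ncol(data))) {
  A[i] <- paste(data[1, i], data[2, i], sep = "")
  B[i] <- paste(data[1, i], data[3, i], sep = "")
}
A
B
```

Because A and B are hard-wired to rows 2 and 3, this form only suits a three-row table; the lapply() approach elsewhere in the thread handles any number of rows.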
Let's start with something simple and relatively easy to understand, since you're new to this.

First, here's an example of the core of the idea:

> paste('a',1:4)
[1] "a 1" "a 2" "a 3" "a 4"

Make it a little closer to your situation:

> paste('a*',1:4, sep='')
[1] "a*1" "a*2" "a*3" "a*4"

Sometimes it helps to save the number of rows in your dataframe in a new variable

nr <- nrow(df)

Then, for your first column, the "a*" in the above example is

df$V1[1]

For the 1:4 in the example, you use

df$V1[ 2:nr]

Put it together and you have:

dfnew <- df
dfnew$V1[ 2:nr] <- paste( dfnew$V1[1], dfnew$V1[ 2:nr], sep='' )

But you can use "-1" instead of "2:nr", and you get

dfnew$V1[ -1 ] <- paste( dfnew$V1[1], dfnew$V1[ -1], sep='' )

That's how you can do it one column at a time. Since you have only four columns, just do the same thing to V2, V3, and V4.

But if you want a more general method, one that works no matter how many columns you have, and no matter what they are named, then you can use lapply() to loop over the columns. This is what Patrick Connolly suggested, which is

as.data.frame(lapply(df, function(x) paste(x[1], x[-1], sep = "")))

Note, though, that this will do it to all columns, so if you ever happen to have a dataframe where you don't want to do all columns, you'll have to be a little trickier with the lapply() solution.

-Don

--
--------------------------------------
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
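As a rough illustration of the "trickier" case, the sketch below applies the column-at-a-time idea to V1 and then uses lapply() on a chosen subset of columns only. The cols vector is a hypothetical selection made up for this example, and the data are assumed to be character columns. If the columns are factors, convert them with as.character() first, because assigning a string that is not an existing level into a factor produces NA with a warning.

```r
# Example data from the original post, read as character columns.
df <- read.table(text = "V1,V2,V3,V4
DPA1*,DPA1*,DPB1*,DPB1*
0103,0104,0401,0601
0103,0103,0301,0402", header = TRUE, sep = ",", colClasses = "character")

dfnew <- df

# One column at a time: prefix the first element onto all the others.
dfnew$V1[-1] <- paste(dfnew$V1[1], dfnew$V1[-1], sep = "")

# Only some columns: loop over a subset and assign back into those columns.
cols <- c("V2", "V4")  # hypothetical selection
dfnew[cols] <- lapply(dfnew[cols],
                      function(x) c(x[1], paste(x[1], x[-1], sep = "")))

dfnew  # V3 is left exactly as it was in df
```

Assigning into dfnew[cols] rather than rebuilding the data frame keeps the untouched columns and the original column order.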