I have a data frame

sID <- c("a", "1,2,3", "b", "4,5,6")
rID <- c("shr1125", "bwr331", "bwr330", "vjhr1022")

tmp <- data.frame(cbind(sID,rID))

but I need to split tmp$sID into three different columns, filling locations where tmp$sID has only one value with NA.

I can split tmp$sID by the comma

tmp.1 <- strsplit(tmp$sID, ",")

but I can't figure out how to convert the resulting list into a data frame.

Ideally, tmp will become four columns wide, something like

sID.a  sID.b  sID.c  rID
NA     NA     a      shr1125
1      2      3      bwr331
NA     NA     b      bwr330
4      5      6      vjhr1022

Thoughts or suggestions?

I tried

havecomma <- grep(',', tmp$sID)

for( i in 1:nrow(tmp)){
  if (!(tmp[i,] %in% havecomma)){
    tmp$sID[i] <- paste(', ,', tmp$sID[i], sep="")
  }
}

and thought that I might be able to force the list into a data frame once each component had three items, but it just seemed to apply the paste() function to everything, which gave me a list with varying numbers of items.

I'm stuck.

Thanks for your help -

SR

Steven H. Ranney
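A note on the loop above: the condition tests the contents of row i (tmp[i,]) against havecomma, which holds row indices, so the membership test appears never to succeed and paste() runs for every row. A minimal sketch of the loop with the test on the index i instead, assuming sID is stored as character (stringsAsFactors = FALSE is set explicitly here, and the ",," prefix is an assumption chosen so that strsplit() yields two empty leading fields):

```r
sID <- c("a", "1,2,3", "b", "4,5,6")
rID <- c("shr1125", "bwr331", "bwr330", "vjhr1022")
tmp <- data.frame(sID, rID, stringsAsFactors = FALSE)

# Row indices whose sID contains a comma
havecomma <- grep(",", tmp$sID)

# Test the row index i, not the row contents tmp[i, ]
for (i in seq_len(nrow(tmp))) {
  if (!(i %in% havecomma)) {
    # Prefix two empty fields so every sID has three comma-separated parts
    tmp$sID[i] <- paste0(",,", tmp$sID[i])
  }
}

tmp$sID
```

After this, every element of strsplit(tmp$sID, ",") has three entries, with "" rather than NA in the padded positions.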
Try,

sID <- c("a", "1,2,3", "b", "4,5,6")
tmp1 <- strsplit(sID,',')
tmp2 <- lapply(tmp1, function(x) if (length(x)==1) c('','',x) else x )
tmp3 <- matrix(unlist(tmp2),ncol=3, byrow=TRUE)

rID <- c("shr1125", "bwr331", "bwr330", "vjhr1022")
newdf <- data.frame(cbind(tmp3,rID))

You'll need to name the first three columns.

As an aside, note that you don't need the cbind in your
  data.frame(cbind(sID,rID))
because
  data.frame(sID,rID)
does just as well. But cbind is needed in my example, because tmp3 is a matrix.

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 8/13/13 12:09 PM, "Steven Ranney" <steven.ranney at gmail.com> wrote:
>I have a dataFrame
>[...]
>
>______________________________________________
>R-help at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
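The split-pad-bind idea above fills the gaps with empty strings and leaves the columns unnamed. A minimal sketch of a variant that pads with NA, as the original question asked, and names the columns in the same step; the names parts, width, padded and mat are illustrative, and lengths() assumes R 3.2.0 or later:

```r
sID <- c("a", "1,2,3", "b", "4,5,6")
rID <- c("shr1125", "bwr331", "bwr330", "vjhr1022")

parts <- strsplit(sID, ",")
width <- max(lengths(parts))   # number of pieces in the longest element

# Left-pad each element with NA so that all have `width` entries
padded <- lapply(parts, function(x) c(rep(NA_character_, width - length(x)), x))

# Stack the equal-length vectors row by row into a character matrix
mat <- do.call(rbind, padded)
colnames(mat) <- paste("sID", letters[seq_len(width)], sep = ".")

newdf <- data.frame(mat, rID, stringsAsFactors = FALSE)
newdf
```

In this sketch the matrix is passed to data.frame() directly, so rID is not routed through cbind(). All three sID columns stay character here; apply type.convert() or as.integer() to a column if numeric values are wanted.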
Hi,

You could try:

tmp[,1]<- as.character(tmp[,1])
tmp[,1][-grep(",",tmp[,1])]<-paste0(",,",tmp[,1][-grep(",",tmp[,1])])
tmp2<-data.frame(read.table(text=tmp[,1],sep=",",header=FALSE,stringsAsFactors=FALSE),rID=tmp[,2],stringsAsFactors=FALSE)
colnames(tmp2)[1:3]<-paste("sID",letters[1:3],sep=".")
tmp2
#  sID.a sID.b sID.c      rID
#1    NA    NA     a  shr1125
#2     1     2     3   bwr331
#3    NA    NA     b   bwr330
#4     4     5     6 vjhr1022

BTW,
data.frame(sID,rID,stringsAsFactors=FALSE) #cbind is not needed. In this case, it is okay,
#    sID      rID
#1     a  shr1125
#2 1,2,3   bwr331
#3     b   bwr330
#4 4,5,6 vjhr1022

#But if they were of different class:
str(data.frame(cbind(sID,Col2=1:4),stringsAsFactors=FALSE))
#'data.frame':    4 obs. of  2 variables:
# $ sID : chr  "a" "1,2,3" "b" "4,5,6"
# $ Col2: chr  "1" "2" "3" "4"
str(data.frame(sID,Col2=1:4,stringsAsFactors=FALSE))
#'data.frame':    4 obs. of  2 variables:
# $ sID : chr  "a" "1,2,3" "b" "4,5,6"
# $ Col2: int  1 2 3 4

A.K.

----- Original Message -----
From: Steven Ranney <steven.ranney at gmail.com>
To: "r-help at r-project.org" <r-help at r-project.org>
Cc:
Sent: Tuesday, August 13, 2013 3:09 PM
Subject: [R] Convert list with missing values to dataFrame

[...]
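The read.table() approach above prefixes a fixed ",," to every element without a comma. A minimal sketch of a generalisation that prepends one comma per missing field, so elements with two pieces would be padded as well; n_fields, width, padded and parsed are illustrative names, and strrep() assumes R 3.3.0 or later:

```r
sID <- c("a", "1,2,3", "b", "4,5,6")
rID <- c("shr1125", "bwr331", "bwr330", "vjhr1022")

n_fields <- lengths(strsplit(sID, ","))
width <- max(n_fields)

# Prepend one comma per missing field (vectorised; no grep() indexing needed)
padded <- paste0(strrep(",", width - n_fields), sID)

# Empty fields in columns that otherwise parse as numbers are read as NA
parsed <- read.table(text = padded, sep = ",", header = FALSE,
                     stringsAsFactors = FALSE)
names(parsed) <- paste("sID", letters[seq_len(width)], sep = ".")

tmp2 <- data.frame(parsed, rID, stringsAsFactors = FALSE)
tmp2
```

Counting fields also sidesteps one edge case of negative grep() indexing: when no element contains a comma, grep() returns integer(0), and indexing with -integer(0) selects nothing rather than everything.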