Hi all,

I'm trying to do some data manipulation in R, but I'm a bit stuck. I have to warn you, I'm a real R noob.

I have, for example, this file:

V1           V2      V3        V4     V5                 V6
1:156706559  rs8658  dbSNP_52  C/G/A  C=2996/G=7762/A=0  31.8803/20.2782/27.849
1:69116      none    none      A/G    A=1/G=611          0.0/0.2747/0.1634
1:69134      none    none      G/A    G=8/A=724          1.9108/0.4785/1.0929
1:69270      none    none      G/A    G=1896/A=888       10.2394/42.6562/31.8966

The format that I want this data in is:

V1  V2         V3      V4        V5  V6  V7    V8   V9
1   156706559  rs8658  dbSNP_52  C   A   2996  0    27.849
1   156706559  rs8658  dbSNP_52  G   A   7762  0    27.849
1   69116      none    none      A   G   1     611  0.1634
1   69134      none    none      G   A   8     724  10.929
1   69270      none    none      G   A   1896  888  318.966

So the first step is to separate column V1 by ":". This was done pretty easily. After that, column V4 has to be separated by "/". This was a bit trickier, since some rows are longer than others, but I managed to do it with the code below. It's probably a really lousy way to do it, but it worked. (Don't pay too much attention to the column numbers; my original file has more columns.)

splittingAllele <- function(y) {
  ##### Splitting column 4 into variant and normal allele
  r <- strsplit(y$V4, "/")
  d <- NULL
  d <- as.list(d)
  # Collect the last allele of every row (the normal allele)
  for (x in 1:length(r)) {
    d <- rbind(d, r[[x]][length(r[[x]])])
  }
  d <- as.character(unlist(d))
  d <- as.data.frame(d)
  y[,28] <- d
  y[,28] <- as.character(y[,28])
  # Drop the last allele and its "/" from column 4, leaving the variant allele(s)
  f <- as.data.frame(substr(y[,4], 1, nchar(y[,4])-2))
  test3 <- y[,c(1:3)]
  test3[,4] <- f
  test3[,5:28] <- y[,c(28,5:27)]
  # Repeat each row once per remaining variant allele
  r <- strsplit(as.character(test3[,4]), "/")
  p1 <- cbind(unlist(r), rep(as.character(test3[,1]), sapply(r, length)))
  p2 <- cbind(unlist(r), rep(as.character(test3[,2]), sapply(r, length)))
  p3 <- cbind(unlist(r), rep(as.character(test3[,3]), sapply(r, length)))
  p5 <- cbind(unlist(r), rep(as.character(test3[,5]), sapply(r, length)))
  p8 <- cbind(unlist(r), rep(as.character(test3[,8]), sapply(r, length)))
  p9 <- cbind(unlist(r), rep(as.character(test3[,9]), sapply(r, length)))
  test4 <- cbind(p1[,2], p2[,2], p3[,2], p3[,1], p5[,2], p8[,2], p9[,2])
  test4 <- as.data.frame(test4)
  test5 <- test4[!duplicated(test4),]
  return(test5)
}

Now I want to separate column V5, but I'm stuck here. I think I can almost use the exact same code as before, but I can't figure it out.

Any help, please? Thank you in advance!

--
View this message in context: http://r.789695.n4.nabble.com/data-manipulation-tp4288663p4288663.html
Sent from the R help mailing list archive at Nabble.com.
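
For illustration, here is one way the whole reshaping, including the V5 split, might be sketched in base R. It is only a sketch under several assumptions that the post does not confirm: the last allele in V4 is the normal allele, every other allele yields one output row, the counts in V5 are looked up by allele name, and V9 is the last "/"-separated value of V6. The names `dat`, `split_row` and `result` are hypothetical, and the example uses the six-column sample rather than the wider original file.

```r
# Sample input copied from the post, kept as character columns
dat <- data.frame(
  V1 = c("1:156706559", "1:69116", "1:69134", "1:69270"),
  V2 = c("rs8658", "none", "none", "none"),
  V3 = c("dbSNP_52", "none", "none", "none"),
  V4 = c("C/G/A", "A/G", "G/A", "G/A"),
  V5 = c("C=2996/G=7762/A=0", "A=1/G=611", "G=8/A=724", "G=1896/A=888"),
  V6 = c("31.8803/20.2782/27.849", "0.0/0.2747/0.1634",
         "1.9108/0.4785/1.0929", "10.2394/42.6562/31.8966"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: expand one input row into one row per variant allele.
# Assumes each row lists at least two alleles in V4.
split_row <- function(row) {
  pos     <- strsplit(row$V1, ":", fixed = TRUE)[[1]]
  alleles <- strsplit(row$V4, "/", fixed = TRUE)[[1]]
  counts  <- strsplit(row$V5, "/", fixed = TRUE)[[1]]
  freqs   <- strsplit(row$V6, "/", fixed = TRUE)[[1]]

  # Turn "C=2996" into a named number: name "C", value 2996
  count_val <- as.numeric(sub("^.*=", "", counts))
  names(count_val) <- sub("=.*$", "", counts)

  n       <- length(alleles)
  variant <- alleles[-n]   # assumption: all but the last allele are variants
  normal  <- alleles[n]    # assumption: the last allele is the normal one

  # Length-one values are recycled to one row per variant allele
  data.frame(
    V1 = pos[1], V2 = as.integer(pos[2]), V3 = row$V2, V4 = row$V3,
    V5 = variant, V6 = normal,
    V7 = unname(count_val[variant]),
    V8 = unname(count_val[normal]),
    V9 = as.numeric(freqs[length(freqs)]),
    stringsAsFactors = FALSE
  )
}

result <- do.call(rbind, lapply(seq_len(nrow(dat)), function(i) split_row(dat[i, ])))
rownames(result) <- NULL
print(result)
```

Looking the counts up by allele name, rather than by position, means the order of the entries in V5 does not have to match the order in V4. For a large file, a vectorised approach or a package such as data.table would likely be faster than building one small data frame per row.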