Hi,
Try this:
dat1<- read.table(text="
V1,V2,V3,V4,V5,V6,V7
chr1,564563,564598,564588 564589,1336,+,134
chr1,564620,564649,564644 564645,94,+,10
chr1,565369,565404,565371 565372,217,+,8
chr1,565463,565541,565480 565481,1214,+,15
chr1,565653,565697,565662 565663,1031,+,28
chr1,565861,565922,565883 565884,316,+,12
",sep=",",header=TRUE,stringsAsFactors=FALSE)
library(reshape2)
dat2<-with(dat1,{cbind(dat1[,-4],colsplit(V4,pattern=" ",
names=c("peak_start","peak_end")))})
dat2
#     V1     V2     V3   V5 V6  V7 peak_start peak_end
#1  chr1 564563 564598 1336  + 134     564588   564589
#2  chr1 564620 564649   94  +  10     564644   564645
#3  chr1 565369 565404  217  +   8     565371   565372
#4  chr1 565463 565541 1214  +  15     565480   565481
#5  chr1 565653 565697 1031  +  28     565662   565663
#6  chr1 565861 565922  316  +  12     565883   565884
library(data.table)
datNew<- data.table(dat2)
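If the table is large and the two new columns should sit where V4 was, a data.table-only variant may be worth trying. This is only a sketch on a toy table: the object name `clusters` and the column names are taken from your post, the helper variables `pos` and `old` are made up for illustration, and it assumes a data.table version that provides tstrsplit().

```r
library(data.table)

# Toy stand-in for the real table; column names follow the post.
clusters <- data.table(
  V1 = c("chr1", "chr1"),
  V2 = c(564563L, 564620L),
  V3 = c(564598L, 564649L),
  V4 = c("564588 564589", "564644 564645"),
  V5 = c(1336L, 94L)
)

pos <- match("V4", names(clusters))   # remember where V4 sits

# Split V4 on the space and add both halves by reference (no copy of the table).
clusters[, c("peak_start", "peak_end") :=
           tstrsplit(V4, " ", fixed = TRUE, type.convert = TRUE)]
clusters[, V4 := NULL]                # drop the original column

# Move the new columns into V4's old position.
old <- setdiff(names(clusters), c("peak_start", "peak_end"))
setcolorder(clusters, append(old, c("peak_start", "peak_end"), after = pos - 1))
```

Because `:=` and setcolorder() modify the table in place, this avoids the full copy that cbind() makes, which may matter with several hundred columns.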
A.K.
----- Original Message -----
From: "deconstructed.morning at gmail.com" <deconstructed.morning at gmail.com>
To: smartpink111 at yahoo.com
Cc:
Sent: Sunday, March 10, 2013 5:48 PM
Subject: Re: splitting column into two
Hello,
I saw your solution for this question and I want to ask what I should do when I
have a very large file that looks like this:
> clusters<-data.table(CTSS[,
grep("V1$|V2$|V3$|V4$|V5$|V6$|V7$", names(CTSS))])
> head(clusters)
     V1     V2     V3            V4   V5 V6  V7
1: chr1 564563 564598 564588 564589 1336  + 134
2: chr1 564620 564649 564644 564645   94  +  10
3: chr1 565369 565404 565371 565372  217  +   8
4: chr1 565463 565541 565480 565481 1214  +  15
5: chr1 565653 565697 565662 565663 1031  +  28
6: chr1 565861 565922 565883 565884  316  +  12
What I want is to replace column V4, which contains two numbers separated by a
space, with two numeric columns. I have tried this:
new <- cbind(CTSS,colsplit(CTSS$V4, ' ', c('peak_start',
'peak_end')) )
but instead of replacing the column, it keeps V4 as it is and adds the two new
columns at the end of the table (after 625 columns). Please let me know if you
have a better solution.
Thank you,
Nanami
<quote author='arun kirshna'>
Hi,
Maybe this helps:
dat1<-read.table(text="
0111 0214 0203 0404 1112 0513 0709 1010 0915 0813
0112 0314 0204 0504 1132 0543 0789 1020 0965 0823
",sep="",header=FALSE,colClasses=rep("character",10))
res<-do.call(data.frame,lapply(dat1,function(x)
do.call(rbind,lapply(strsplit(x,""),function(y)
c(paste0(y[1],y[2]),paste0(y[3],y[4]))))))
colnames(res)<-paste0("V",1:20)
res
#  V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20
#1 01 11 02 14 02 03 04 04 11  12  05  13  07  09  10  10  09  15  08  13
#2 01 12 03 14 02 04 05 04 11  32  05  43  07  89  10  20  09  65  08  23
A.K.
</quote>
Quoted from:
http://r.789695.n4.nabble.com/splitting-column-into-two-tp4656108p4656111.html