Dear All,
I've successfully imported my synteny data into R using the scan command. My
results are shown below. My main problem is how to combine the column names
with the data (splt): I tried cbind, but a warning message occurred. I have
realized that the splt data has only 5 columns instead of 6. Please help me
with this!
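As an illustration of the column mismatch, here is a minimal sketch on a made-up two-row list. The names `toy`, `n_fields`, `padded` and `mat` are hypothetical, and the short column names are stand-ins rather than the real header. It counts the fields in each element and pads the short ones with NA so the rows can be bound into one matrix; whether padding is the right fix for the real file is an assumption.

```r
# Toy stand-in for splt: one element with 6 fields, one with only 5
toy <- list(
  c("CS 1 (73): ", "a", "b", "c", "d", "e"),
  c("CSO 3.1 (6): ", "a", "b", "c", "d")
)

# Count the fields in each element to find the ragged rows
n_fields <- sapply(toy, length)
print(n_fields)

# Pad every element with NA up to the longest length
n_max <- max(n_fields)
padded <- lapply(toy, function(v) {
  length(v) <- n_max  # extending a vector fills the new slots with NA
  v
})

# Bind the equal-length rows into a character matrix and name the columns
mat <- do.call(rbind, padded)
colnames(mat) <- c("id", "ref_loc", "size", "density", "tested_loc", "breakpoint")
print(mat)
```

Padding only hides the problem: an NA in the last column shows which rows lost a tab separator and still need to be split by hand or by pattern.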
I want my data to be numeric, with proper columns and column names. I also
want to replace CS with 1 and CSO with 0, and to remove all punctuation and
non-numeric characters from the data.
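The conversion described above could be sketched roughly as follows. The sample strings are copied from the first two fields of the splt output; `ids`, `loc`, `is_cs`, `coords` and `res` are hypothetical names, and the regular expressions assume every field looks like these samples.

```r
# Sample fields copied from the first two columns of the splt output
ids <- c("CS 1 (73): ", "CSO 3.1 (6): ")
loc <- c(" cfa1: [ 3251712 - 24126920 ] ", " cfa1: [ 24823786 - 27113036 ] ")

# Recode the block type: 1 for CS, 0 for CSO
is_cs <- ifelse(grepl("^CSO", ids), 0, 1)

# Drop the chromosome label (everything up to the colon), then pull out
# each run of digits and convert it to a number
no_label <- sub("^[^:]*:", "", loc)
coords <- lapply(regmatches(no_label, gregexpr("[0-9]+", no_label)), as.numeric)

# One row per block: type, start, end
res <- cbind(is_cs = is_cs, do.call(rbind, coords))
colnames(res) <- c("is_cs", "start", "end")
print(res)
```

The label is stripped first because "cfa1" itself contains a digit, which would otherwise be picked up as a coordinate.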
My original data is attached. Your kind help is highly appreciated,
and thanks in advance.
Cheers,
Anisah
1) For the column names:
nms<-scan("C:/Users/user/Documents/cfa-1.txt",sep="\t",nlines=1,skip=10,what=character(0))
Read 6 items
> nms
[1] "CS(O) id (number of marker/anchor) "
[2] " Location(s) on reference "
[3] "CS(O) size"
[4] "CS(O) density on reference chromosome"
[5] "Location(s) on tested "
[6] "Breakpoints CS(O) locations (denstiy of marker/anchor)"
2) My data:
x<-scan("C:/Users/user/Documents/cfa-1.txt",sep="\n",skip=12,what=character(0))
Read 21 items
> splt<-strsplit(x,"\t")
> splt
[[1]]
[1] "CS 1 (73): " " cfa1: [ 3251712 -
24126920 ] "
[3] " 20875208 " " 3 "
[5] " hsa18: [ 132170848 - 50139168 ] " "] 24126920, 24153560 [(8
) "
[[2]]
[1] "CS 2 (3): " " cfa1: [ 24153560 -
24265894 ] "
[3] " 112334 " " 27 "
[5] " hsa18: [ 50105060 - 49934572 ] " "] 24265894, 24823786 [(7
) "
[[3]]
[1] "CSO 3.1 (6): "
[2] " cfa1: [ 24823786 - 27113036 ] "
[3] " 2289250 "
[4] " 3 "
[5] " hsa18: [ 48121156 - 46579500 ]- Decreasing order - ] 27113036,
27418228 [ (13)"
[[4]]
[1] "CSO 3.2 (4): "
[2] " cfa1: [ 27418228 - 27578150 ] "
[3] " 159922 "
[4] " 25 "
[5] " hsa18: [ 13872043 - 13208795 ]- Decreasing order - ] 27578150,
28055666 [(9 ) "
[[5]]
[1] "CS 4 (4): " " cfa1: [ 28055666 -
28835230 ] "
[3] " 779564 " " 5 "
[5] " hsa6: [ 132311008 - 133132200 ] " "] 28835230, 29482792 [(7
) "
[[6]]
[1] "CS 5 (46): " " cfa1: [ 29482792 -
40120672 ] "
[3] " 10637880 " " 4 "
[5] " hsa6: [ 133604208 - 146227152 ] " "] 40120672, 40539680 [(8
) "
[[7]]
[1] "CS 6 (9): " " cfa1: [ 40539680 -
43339444 ] "
[3] " 2799764 " " 3 "
[5] " hsa6: [ 146390608 - 149867328 ] " "] 43339444, 43390788
[(13 ) "
[[8]]
[1] "CSO 7.1 (74): "
[2] " cfa1: [ 43390788 - 59714992 ] "
[3] " 16324204 "
[4] " 5 "
[5] " hsa6: [ 149929104 - 169714432 ]- Increasing order -] 59714992,
59864308 [ (15)"
[[9]]
[1] "CSO 7.2 (52): "
[2] " cfa1: [ 59864308 - 72417520 ] "
[3] " 12553212 "
[4] " 4 "
[5] " hsa6: [ 116707976 - 131508152 ]- Increasing order - "
[6] "] 72417520, 73256040 [(7 ) "
[[10]]
[1] "CSO 8.1 (12): "
[2] " cfa1: [ 73256040 - 75192808 ] "
[3] " 1936768 "
[4] " 6 "
[5] " hsa9: [ 98441680 - 96360824 ]- Decreasing order - "
[6] "] 75192808, 75272528 [ "
[7] " (6 )"
[[11]]
[1] "CSO 8.2 (56): "
[2] " cfa1: [ 75272528 - 91881664 ] "
[3] " 16609136 "
[4] " 3 "
[5] " hsa9: [ 89530256 - 70341312 ]- Decreasing order - "
[6] "] 91881664, 92281272 [ "
[7] " (5 )"
[[12]]
[1] "CSO 8.3 (22): "
[2] " cfa1: [ 92281272 - 96913624 ] "
[3] " 4632352 "
[4] " 5 "
[5] " hsa9: [ 261625 - 5755076 ]- Increasing order - "
[6] "] 96913624, 98067040 [ "
[7] " (5 )"
[[13]]
[1] "CSO 8.4 (15): "
[2] " cfa1: [ 98067040 - 100692560 ] 2625520 "
[3] " 6 "
[4] " hsa9: [ 93833248 - 89771184 ]- Decreasing order - ] 100692560,
101013264 [ "
[5] "(13 )"
[[14]]
[1] "CSO 8.5 (18): "
[2] " cfa1: [ 101013264 - 102120080 ] 1106816 "
[3] " 16 "
[4] " hsa9: [ 95832896 - 94012312 ]- Decreasing order -]
102271920,102458192 [(25 ) "
[[15]]
[1] "CS 9 (55): "
[2] " cfa1: [ 102458192 - 105936824 ] 3478632 "
[3] " 16 "
[4] " hsa19: [ 63765096 - 59618416 ] "
[5] "] 105936824, 106097392 [(35 ) "
[[16]]
[1] "CSO 10.1 (81): "
[2] " cfa1: [ 106097392 - 110263696 ] 4166304 "
[3] " 19 "
[4] " hsa19: [ 59386008 - 54256216 ]- Decreasing order - "
[5] "] 110263696,110288752 [ (60 )"
[[17]]
[1] "CSO 10.2 (18): "
[2] " cfa1: [ 110288752 - 110567608 ] 278856 "
[3] " 65 "
[4] " hsa19: [ 54163196 - 53814360 ]- Decreasing order - "
[5] "] 110567608,110575576 [ (50 )"
[[18]]
[1] "CSO 10.3 (60): "
[2] " cfa1: [ 110575576 - 112727048 ] 2151472 "
[3] " 28 "
[4] " hsa19: [ 53649284 - 50959884 ]- Decreasing order - "
[5] "] 112727048,112775144 [(40 ) "
[[19]]
[1] "CS 11 (173): "
[2] " cfa1: [ 112775144 - 119848336 ] 7073192 "
[3] " 24 "
[4] " hsa19: [ 50887772 - 40849556 ] "
[5] "] 119848336, 119880560 [(55 ) "
[[20]]
[1] "CS 12 (33): "
[2] " cfa1: [ 119880560 - 121690672 ] 1810112 "
[3] " 18 "
[4] " hsa19: [ 40824500 - 38556448 ] "
[5] "] 121690672, 121820640 [(16 ) "
[[21]]
[1] "CS 13 (22): "
[2] " cfa1: [ 121820640 - 124798800 ] 2978160 "
[3] " 7 "
[4] " hsa19: [ 38391408 - 34709332 ] "
[5] "] 124798800, Telomere [(-NA-) "
Name: cfa-1.txt
Url:
https://stat.ethz.ch/pipermail/r-help/attachments/20080106/dfa48c83/attachment-0002.txt