Hi all,

I have a column that has the following format:

    chr1:564588..564589,+

and I want to extract only the coordinates. I have tried writing a regular expression, but I couldn't figure out how I should write it. Does anyone know?

Thank you,
Best,
Nanami
If I understand what you want (which I may very well not), you could use something like this. If this is an example of your type of data:

    x <- "564589,+"
    x <- substr(x, 1, 6)
    as.numeric(x)

Please try to post something more thorough if you would like a better answer.

Sam

--
View this message in context: http://r.789695.n4.nabble.com/extract-data-from-a-column-tp3609890p3610030.html
Sent from the R help mailing list archive at Nabble.com.
So if I am given some data that look like this:

    > head(CTSS)
        V1     V2     V3                    V4   V5 V6 V7
    1 chr1 564563 564598 chr1:564588..564589,+ 1336
    2 chr1 564620 564649 chr1:564644..564645,+   94
    3 chr1 565369 565404 chr1:565371..565372,+  217
    4 chr1 565463 565541 chr1:565480..565481,+ 1214
    5 chr1 565653 565697 chr1:565662..565663,+ 1031
    6 chr1 565861 565922 chr1:565883..565884,+  316

what I would like to do is to obtain two columns from column V4, like:

    start_coord stop_coord
         564588     564589
         564644     564645

I will try what you suggested and see if it works.

Thank you,
Best,
Nanami
I figured it out:

    x <- sub("^.*:([[:digit:]]+)..([[:digit:]]+).*", "\\1 \\2", CTSS$V4)
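For anyone who wants the two separate numeric columns asked for earlier in the thread, here is a minimal sketch of one way to get there. It is not the poster's exact code: the vector V4 is a hypothetical stand-in holding a few values copied from the sample data, the names pattern and coords are made up for illustration, and the two dots are escaped so they match literal dots instead of any character.

```r
# Hypothetical stand-in for CTSS$V4, using values copied from the thread
V4 <- c("chr1:564588..564589,+",
        "chr1:564644..564645,+",
        "chr1:565371..565372,+")

# Same idea as the sub() call above, but "\\.\\." matches two literal dots
pattern <- "^.*:([[:digit:]]+)\\.\\.([[:digit:]]+).*$"

# Pull each capture group out separately and convert it to a number
coords <- data.frame(
  start_coord = as.numeric(sub(pattern, "\\1", V4)),
  stop_coord  = as.numeric(sub(pattern, "\\2", V4))
)

print(coords)
```

With the real data, replacing V4 by CTSS$V4 should give one row of coordinates per row of CTSS. Note that sub() returns its input unchanged when the pattern does not match, so as.numeric() would produce NA (with a warning) for any malformed entries.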