Stephen HonKit Wong
2017-Jul-21 19:22 UTC
[R] a difficult situation, how to do this using base function.
Hello,

I have the following data frame with many rows:

data.frame(match.start=c(5,10,100,200),range.coordinates=c("1000-1050","1500-1555","5000-5050,6000-6180","100-150,200-260,600-900"))

match.start  range.coordinates
5            1000-1050
10           1500-1555
100          5000-5050,6000-6180
200          100-150,200-260,600-900

For each row, I want to test whether the value in column "match.start" (e.g. 100 on the 3rd row) is less than the accumulated range (e.g. for 5000-5050,6000-6180 the accumulated range is 50, 230), and then update the match start as 6000 + (100 - 50) = 6050. The result is put in a third column:

match.start  range.coordinates        match.start.updated
5            1000-1050                1005
10           1500-1555                1510
100          5000-5050,6000-6180      6050
200          100-150,200-260,600-900  690

Many thanks.

[[alternative HTML version deleted]]
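The arithmetic in the question can be checked by hand for the third row. The following is only a sketch of that single calculation in base R; the variable names (starts, ends, cum_end, cum_start, seg) are made up for illustration and do not come from the original post.

```r
# Worked arithmetic for the third row of the question:
# match.start = 100, range.coordinates = "5000-5050,6000-6180".
starts <- c(5000, 6000)
ends   <- c(5050, 6180)
match_start <- 100

widths    <- ends - starts               # length of each range: 50, 180
cum_end   <- cumsum(widths)              # accumulated range: 50, 230
cum_start <- c(0, head(cum_end, -1))     # accumulated length before each range: 0, 50

# First range whose accumulated length is greater than match_start
seg <- which(match_start < cum_end)[1]

# Start of that range plus the offset left over after the earlier ranges
updated <- starts[seg] + (match_start - cum_start[seg])
updated
```

Here 100 is not less than 50 but is less than 230, so the second range is selected and the result is 6000 + (100 - 50) = 6050, as stated in the question.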
Jeff Newmiller
2017-Jul-22 00:38 UTC
[R] a difficult situation, how to do this using base function.
This is a plain-text email list. Please learn how to configure your email client to send plain text, because what YOU saw before you sent the message is not what WE saw after it bounced through the mailing list, and that can lead to misunderstandings.

If at all possible, you should augment your table with an additional table to contain the range.coordinates, or replace the range.coordinates column with a list of tables. I have parsed your format into tables on the fly, but this is inefficient and fragile.

######
DF <- data.frame( match.start=c( 5, 10, 100, 200 )
                , range.coordinates = c( "1000-1050"
                                       , "1500-1555"
                                       , "5000-5050,6000-6180"
                                       , "100-150,200-260,600-900"
                                       )
                , stringsAsFactors = FALSE
                )

lookupFunction <- function( ms, rc ) {
  # split "a-b,c-d" on commas, then on "-", giving one numeric
  # row per range with columns V1 (start) and V2 (end)
  rcdf <- as.data.frame( lapply( as.data.frame( t( as.data.frame( strsplit( strsplit( rc, ",", fixed=TRUE )[[ 1 ]], "-" ) ) ), stringsAsFactors=FALSE ), as.numeric ) )
  # accumulated range length at the end of each range
  rcdf$V3 <- with( rcdf, cumsum( V2-V1 ) )
  # accumulated range length before each range
  rcq <- c( 0, rcdf$V3 )
  # index of the range that ms falls into
  idx <- findInterval( ms, rcdf$V3 ) + 1
  # start of that range plus the remaining offset
  rcdf$V1[ idx ] + ms - rcq[ idx ]
}

# apply the lookup row by row
DF$match.start.updated <- unlist( lapply( seq.int( nrow( DF ) )
                                        , function( i ) {
                                            lookupFunction( DF$match.start[ i ]
                                                          , DF$range.coordinates[ i ] )
                                          }
                                        )
                                )
######

---------------------------------------------------------------------------
Jeff Newmiller
DCN:<jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
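The reply suggests keeping the ranges as a list of tables instead of re-parsing the strings on every lookup. A minimal base-R sketch of that idea follows. The helper names parse_ranges and map_offset, the column names start and end, and the choice to return NA when match.start exceeds the total range length are all assumptions made for illustration; none of them appear in the thread.

```r
DF <- data.frame(
  match.start = c(5, 10, 100, 200),
  range.coordinates = c("1000-1050", "1500-1555",
                        "5000-5050,6000-6180",
                        "100-150,200-260,600-900"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: parse one "a-b,c-d" string into a table of ranges.
parse_ranges <- function(rc) {
  parts <- strsplit(strsplit(rc, ",", fixed = TRUE)[[1]], "-", fixed = TRUE)
  m <- matrix(as.numeric(unlist(parts)), ncol = 2, byrow = TRUE)
  data.frame(start = m[, 1], end = m[, 2])
}

# Parse once and keep the resulting tables in a list column.
DF$ranges <- lapply(DF$range.coordinates, parse_ranges)

# Hypothetical helper: map an offset into the concatenated ranges back to
# a coordinate. Returns NA if the offset lies beyond the last range
# (an assumption; the thread does not say what should happen then).
map_offset <- function(ms, ranges) {
  cum_end   <- cumsum(ranges$end - ranges$start)
  cum_start <- c(0, head(cum_end, -1))
  idx <- findInterval(ms, cum_end) + 1
  if (idx > nrow(ranges)) return(NA_real_)
  ranges$start[idx] + ms - cum_start[idx]
}

DF$match.start.updated <- mapply(map_offset, DF$match.start, DF$ranges)
DF[, c("match.start", "range.coordinates", "match.start.updated")]
```

With the four example rows this should reproduce the values requested in the question (1005, 1510, 6050, 690). Once the ranges are parsed, the lookup can be repeated for new match.start values without splitting the strings again.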