I have some data coming from SQL sources that I wish to relate in various ways. For reasons only known to our IT people, this can't be done in SQL at present.

I am looking for an R'ish technique for looking up new columns on a data frame. As a simple, hardwired example I have tried the following:

    # This gives me two columns, one the lookup value and the second one
    # the result column, ie my lookup table.
    stcl = read.csv("stockclass.csv")
    stockclass = as.vector(stcl$stock_class)
    # This gives me what appears to be a dictionary or map
    names(stockclass) = as.vector(stcl$stock_group)

    getstockclass = function(stock_group) {
        try(stockclass[[stock_group]], TRUE)
    }

    csg$stk_class = factor(sapply(csg$stock_group, getstockclass))

I need the try since if there is a missing value I get an exception.

I also tried something along the lines of (from memory):

    getstockclass = function(stock_group) {
        stcl[which(stcl$stock_group == stock_group),]$stock_class
    }

These work but I just wanted to check if there was an inbuilt way to do this kind of thing in R? I searched on "join" without much luck.

Really what I would like is a generic function that:
- Takes 2 data frames,
- Some kind of specification on which column(s) to join
- Outputs the joined frames, or perhaps a vector which is an index vector that I can use on the second data frame.

I don't really want to reinvent SQL and my data sets are not huge.

cheers
All together now: ?merge :-)

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch on behalf of Paul Sorenson
Sent: Mon 1/24/2005 10:34 PM
To: r-help at stat.math.ethz.ch
Subject: [R] lookups and joins

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
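For what it's worth, here is a rough sketch of how merge() might be applied to the lookup described in the question, together with match() for the "index vector" variant that was asked about. The two small data frames are made-up stand-ins (the real stcl comes from stockclass.csv and csg from the SQL sources), and the qty column is purely hypothetical.

```r
# Hypothetical stand-ins for the poster's data; the real tables come from
# stockclass.csv and the SQL sources.
stcl <- data.frame(
  stock_group = c("A", "B", "C"),
  stock_class = c("raw", "finished", "spare"),
  stringsAsFactors = FALSE
)
csg <- data.frame(
  stock_group = c("B", "A", "Z", "B"),
  qty = c(10, 5, 7, 2),            # hypothetical extra column
  stringsAsFactors = FALSE
)

# Left join: keep every row of csg; groups missing from the lookup table
# get NA for stock_class. merge() sorts by the key by default, so the row
# order of the result can differ from csg.
joined <- merge(csg, stcl, by = "stock_group", all.x = TRUE)

# Index-vector alternative: match() gives the row positions in the lookup
# table (NA where there is no match) and keeps the row order of csg.
idx <- match(csg$stock_group, stcl$stock_group)
csg$stk_class <- factor(stcl$stock_class[idx])
```

With all.x = TRUE the unmatched group "Z" is kept with an NA class, so no try() is needed. If the key columns had different names in the two frames, by.x and by.y could be used instead of by.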