If you are willing to entertain another approach, have a look at ?cut. By
defining the 'breaks' argument appropriately, you can easily create a
factor that tells you which values should be looked up and which accepted
as is. If I understand correctly, this seems to be what you want. If I have
not, just ignore and wait for a more useful reply.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Tue, Jan 19, 2021 at 10:24 AM Steven Rigatti <sjrigatti at gmail.com> wrote:

> I am having some problems with what seems like a pretty simple issue. I
> have some data where I want to convert numbers. Specifically, this is
> cancer data and the size of tumors is encoded using millimeter
> measurements. However, if the actual measurement is not available the
> coding may imply a less specific range of sizes. For instance numbers 0-89
> may indicate size in mm, but 90 indicates "greater than 90 mm" , 91
> indicates "1 to 2 cm", etc. So, I want to translate 91 to 90, 92 to 15,
> etc.
>
> I have many such tables so I would like to be able to write a function
> which takes as input a threshold over which new values need to be looked
> up, and the new lookup table, returning the new values.
>
> I successfully wrote the function:
>
> translate_seer_numeric <- function(var, upper, lookup) {
>   names(lookup) <- c('old','new')
>   names(var) <- 'old'
>   var <- as.data.frame(var)
>   lookup2 <- data.frame(old = c(1:upper),
>                         new = c(1:upper))
>   lookup3 <- rbind(lookup, lookup2)
>   print(var)
>   res <- left_join(var, lookup3, by = 'old') %>%
>     select(new)
>
>   res
>
> }
>
> test1 <- data.frame(old = c(99,95,93, 8))
> lup <- data.frame(bif = c(93, 95, 99),
>                   new = c(3, 5, NA))
> translate_seer_numeric(test1, 90, lup)
>
> The above test generates the desired output:
>
>   old
> 1  99
> 2  95
> 3  93
> 4   8
>   new
> 1  NA
> 2   5
> 3   3
> 4   8
>
> My problem comes when I try to put this in line with pipes and the mutate
> function:
>
> test1 %>%
>   mutate(varb = translate_seer_numeric(var = old, 90, lup))
> ####
> Error: Problem with `mutate()` input `varb`.
> x Join columns must be present in data.
> x Problem with `old`.
> i Input `varb` is `translate_seer_numeric(var = test1$old, 90, lup)`.
>
> Thoughts??
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
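For readers who have not used ?cut this way, here is a minimal sketch of the idea, not a full solution. The threshold of 90, the sample values and the labels "as_is" and "lookup" are assumptions chosen to match the test data in the question.

```r
# Illustrative values only; `upper` and the two labels are assumptions.
sizes <- c(99, 95, 93, 8)
upper <- 90

# Two bins: (-Inf, upper] is accepted as is, (upper, Inf] needs a lookup.
# cut() uses right-closed intervals by default, so `upper` itself is kept.
grp <- cut(sizes, breaks = c(-Inf, upper, Inf), labels = c("as_is", "lookup"))

# Count how many values fall into each bin
table(grp)
```

The resulting factor can then be used to subset or split the data so that only the "lookup" values are passed through the translation table.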
It's not that I can't get the output I want. I was able to do that. It is
just that I can't make it pipeable - I get that weird error message that I
don't understand.

On Tue, Jan 19, 2021 at 1:34 PM Bert Gunter <bgunter.4567 at gmail.com> wrote:

> If you are willing to entertain another approach, have a look at ?cut. By
> defining the 'breaks' argument appropriately, you can easily create a
> factor that tells you which values should be looked up and which accepted
> as is. If I understand correctly, this seems to be what you want. If I have
> not, just ignore and wait for a more useful reply.
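On the error itself: inside mutate() the function receives the bare column old, which is a numeric vector and not a data frame. names(var) <- 'old' then only names the first element, and as.data.frame(var) yields a column called var, so the join cannot find old. One possible way around this is a function that takes a vector and returns a vector. The sketch below uses base R match() in place of the join; the name translate_seer_vec is hypothetical, and it assumes that values at or below upper are kept unchanged and that unmatched values above it become NA.

```r
# Hypothetical vector-in, vector-out variant (the name is an assumption).
translate_seer_vec <- function(x, upper, lookup) {
  # Position of each x in the first lookup column (NA when there is no match)
  idx <- match(x, lookup[[1]])
  # Values at or below `upper` are kept; the rest take the looked-up value
  ifelse(x <= upper, x, lookup[[2]][idx])
}

# Test data from the original post
test1 <- data.frame(old = c(99, 95, 93, 8))
lup <- data.frame(bif = c(93, 95, 99),
                  new = c(3, 5, NA))

# Base R equivalent of adding the column inside a pipeline
test1$varb <- translate_seer_vec(test1$old, 90, lup)
test1
```

Because it returns a plain vector of the same length as its input, it should also work as test1 %>% mutate(varb = translate_seer_vec(old, 90, lup)) once dplyr is loaded.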
Second this. There is also the findInterval function, which omits the
factor attributes and just returns integers that can be used in lookup
tables.

On January 19, 2021 10:33:59 AM PST, Bert Gunter <bgunter.4567 at gmail.com> wrote:

>If you are willing to entertain another approach, have a look at ?cut. By
>defining the 'breaks' argument appropriately, you can easily create a
>factor that tells you which values should be looked up and which accepted
>as is. If I understand correctly, this seems to be what you want. If I have
>not, just ignore and wait for a more useful reply.
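A rough sketch of the findInterval variant, under the same assumptions as the question's test data (threshold 90, made-up labels). left.open = TRUE is used so that the threshold itself falls in the lower bin.

```r
# Illustrative values only; `upper` and the labels are assumptions.
sizes <- c(99, 95, 93, 8)
upper <- 90

# 0 when sizes <= upper, 1 when sizes > upper (intervals are left-open here)
bin <- findInterval(sizes, upper, left.open = TRUE)

# The integer result can index directly into a small table of actions
action <- c("as_is", "lookup")[bin + 1L]
action
```

With more break points in the second argument, the same integer index could select among several lookup tables instead of two labels.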