Hello I hope that there is a simple solution to this apparently complex problem. Any help will be much appreciated: I have a dataframe with Left and Right readings (that is, elements in each row are paired). For instance, Left Right [1] 9 8 [2] 4 3 [3] 2 1 [4] 6 5 [5] 3 1 [6] 4 1 [7] 3 2 [8] 4 2 [9] 10 8 [10] 9 10 I need to produce a new data frame where the values are transformed according to a look-up table such as input output [1] 5 1 [2] 10 1 [3] 4 2 [4] 8 3 [5] 6 5 [6] 5 6 [7] 7 6 [8] 2 7 [9] 9 7 [10] 10 7 [11] 2 8 So [1, ] in the new dataframe would be 7 3. Quite simple so far, but what makes things complicated is the multiple outputs for a single input. In this example, 10 corresponds to 1 and 7 so [9, ] in the input dataframe must yield two rows in its output counterpart: 1 3 and 7 3. Likewise the output for [10, ] should be 7 1 and 7 7. In addition, given that 3 and 1 are missing as inputs the output for [5, ] should be NA NA. Thank you very much for your time. Juan Antonio Balbuena -- Dr. Juan A. Balbuena Marine Zoology Unit Cavanilles Institute of Biodiversity and Evolutionary Biology University of Valencia [1]http://www.uv.es/~balbuena P.O. Box 22085 [2]http://www.uv.es/cavanilles/zoomarin/index.htm 46071 Valencia, Spain [3]http://cetus.uv.es/mullpardb/index.html e-mail: [4]j.a.balbuena at uv.es tel. +34 963 543 658 fax +34 963 543 733 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ NOTE! For shipments by EXPRESS COURIER use the following street address: C/ Catedr??tico Jos?? Beltr??n 2, 46980 Paterna (Valencia), Spain. ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ References 1. http://www.uv.es/%7Ebalbuena 2. http://www.uv.es/cavanilles/zoomarin/index.htm 3. http://cetus.uv.es/mullpardb/index.html 4. mailto:j.a.balbuena at uv.es
Perhaps this will help. Jean df <- structure(list(Left = c(9L, 4L, 2L, 6L, 3L, 4L, 3L, 4L, 10L, 9L), Right = c(8L, 3L, 1L, 5L, 1L, 1L, 2L, 2L, 8L, 10L)), .Names = c("Left", "Right"), class = "data.frame", row.names = 1:10) lookup <- structure(list(input = c(5L, 10L, 4L, 8L, 6L, 5L, 7L, 2L, 9L, 10L, 2L), output = c(1L, 1L, 2L, 3L, 5L, 6L, 6L, 7L, 7L, 7L, 8L)), .Names = c("input", "output"), class = "data.frame", row.names = 1:10) df2 <- merge(df, lookup, by.x="Left", by.y="input", all.x=TRUE) names(df2)[names(df2)=="output"] <- "Leftout" df3 <- merge(df2, lookup, by.x="Right", by.y="input", all.x=TRUE) names(df3)[names(df3)=="output"] <- "Rightout" Right Left Leftout Rightout 1 1 2 7 NA 2 1 2 8 NA 3 1 3 NA NA 4 1 4 2 NA 5 2 3 NA 7 6 2 3 NA 8 7 2 4 2 7 8 2 4 2 8 9 3 4 2 NA 10 5 6 5 6 11 5 6 5 1 12 8 9 7 3 13 8 10 1 3 14 8 10 7 3 15 10 9 7 1 16 10 9 7 7 On Thu, Jul 25, 2013 at 10:13 AM, Juan Antonio Balbuena <j.a.balbuena@uv.es>wrote:> > Hello > I hope that there is a simple solution to this apparently complex > problem. > Any help will be much appreciated: > I have a dataframe with Left and Right readings (that is, elements in > each > row are paired). For instance, > Left Right > [1] 9 8 > [2] 4 3 > [3] 2 1 > [4] 6 5 > [5] 3 1 > [6] 4 1 > [7] 3 2 > [8] 4 2 > [9] 10 8 > [10] 9 10 > I need to produce a new data frame where the values are transformed > according to a look-up table such as > input output > [1] 5 1 > [2] 10 1 > [3] 4 2 > [4] 8 3 > [5] 6 5 > [6] 5 6 > [7] 7 6 > [8] 2 7 > [9] 9 7 > [10] 10 7 > [11] 2 8 > So [1, ] in the new dataframe would be 7 3. Quite simple so far, but > what > makes things complicated is the multiple outputs for a single input. In > this > example, 10 corresponds to 1 and 7 so [9, ] in the input dataframe must > yield two rows in its output counterpart: 1 3 and 7 3. Likewise the > output > for [10, ] should be 7 1 and 7 7. In addition, given that 3 and 1 are > missing as inputs the output for [5, ] should be NA NA. > Thank you very much for your time. > Juan Antonio Balbuena > > -- > > Dr. Juan A. Balbuena > Marine Zoology Unit > Cavanilles Institute of Biodiversity and Evolutionary Biology > University of > Valencia > [1]http://www.uv.es/~balbuena > P.O. Box 22085 > [2]http://www.uv.es/cavanilles/zoomarin/index.htm > 46071 Valencia, Spain > [3]http://cetus.uv.es/mullpardb/index.html > e-mail: [4]j.a.balbuena@uv.es tel. +34 963 543 658 fax +34 963 > 543 733 > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > NOTE! For shipments by EXPRESS COURIER use the following street address: > C/ Catedrático José Beltrán 2, 46980 Paterna (Valencia), Spain. > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > References > > 1. http://www.uv.es/%7Ebalbuena > 2. http://www.uv.es/cavanilles/zoomarin/index.htm > 3. http://cetus.uv.es/mullpardb/index.html > 4. mailto:j.a.balbuena@uv.es > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >[[alternative HTML version deleted]]
It would be helpful if you included the expected output for your example, but I think the following does what you want by using merge() for each lookup: f0 <- function(inputDF, lookupDF) { tmp1 <- merge(inputDF, lookupDF, by.x="Left", by.y="input",all.x=TRUE) tmp2 <- merge(tmp1, lookupDF, by.x="Right", by.y="input", all.x=TRUE) with(tmp2, data.frame(ID=ID, Right=output.x, Left=output.y)[order(ID), ]) } # Your example data with an ID column added to track where the output rows came from myInputDF <- data.frame( ID = 1:10, Left = c(9, 4, 2, 6, 3, 4, 3, 4, 10, 9), Right = c(8, 3, 1, 5, 1, 1, 2, 2, 8, 10)) myLookupDF <- data.frame( input = c(5, 10, 4, 8, 6, 5, 7, 2, 9, 10, 2), output = c(1, 1, 2, 3, 5, 6, 6, 7, 7, 7, 8)) f0(myInputDF, myLookupDF) # ID Right Left # 12 1 7 3 # 9 2 2 NA # 1 3 7 NA # 2 3 8 NA # 10 4 5 6 # 11 4 5 1 # 3 5 NA NA # 4 6 2 NA # 5 7 NA 7 # 6 7 NA 8 # 7 8 2 7 # 8 8 2 8 # 13 9 1 3 # 14 9 7 3 # 15 10 7 1 # 16 10 7 7 Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com> -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf > Of Juan Antonio Balbuena > Sent: Thursday, July 25, 2013 8:13 AM > To: r-help at r-project.org > Subject: [R] transform dataframe with look-up table > > > Hello > I hope that there is a simple solution to this apparently complex problem. > Any help will be much appreciated: > I have a dataframe with Left and Right readings (that is, elements in each > row are paired). For instance, > Left Right > [1] 9 8 > [2] 4 3 > [3] 2 1 > [4] 6 5 > [5] 3 1 > [6] 4 1 > [7] 3 2 > [8] 4 2 > [9] 10 8 > [10] 9 10 > I need to produce a new data frame where the values are transformed > according to a look-up table such as > input output > [1] 5 1 > [2] 10 1 > [3] 4 2 > [4] 8 3 > [5] 6 5 > [6] 5 6 > [7] 7 6 > [8] 2 7 > [9] 9 7 > [10] 10 7 > [11] 2 8 > So [1, ] in the new dataframe would be 7 3. Quite simple so far, but what > makes things complicated is the multiple outputs for a single input. In this > example, 10 corresponds to 1 and 7 so [9, ] in the input dataframe must > yield two rows in its output counterpart: 1 3 and 7 3. Likewise the output > for [10, ] should be 7 1 and 7 7. In addition, given that 3 and 1 are > missing as inputs the output for [5, ] should be NA NA. > Thank you very much for your time. > Juan Antonio Balbuena > > -- > > Dr. Juan A. Balbuena > Marine Zoology Unit > Cavanilles Institute of Biodiversity and Evolutionary Biology > University of > Valencia > [1]http://www.uv.es/~balbuena > P.O. Box 22085 > [2]http://www.uv.es/cavanilles/zoomarin/index.htm > 46071 Valencia, Spain > [3]http://cetus.uv.es/mullpardb/index.html > e-mail: [4]j.a.balbuena at uv.es tel. +34 963 543 658 fax +34 963 543 733 > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > NOTE! For shipments by EXPRESS COURIER use the following street address: > C/ Catedr??tico Jos?? Beltr??n 2, 46980 Paterna (Valencia), Spain. > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > References > > 1. http://www.uv.es/%7Ebalbuena > 2. http://www.uv.es/cavanilles/zoomarin/index.htm > 3. http://cetus.uv.es/mullpardb/index.html > 4. mailto:j.a.balbuena at uv.es > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
On 7/25/2013 8:13 AM, Juan Antonio Balbuena wrote:> > Hello > I hope that there is a simple solution to this apparently complex problem. > Any help will be much appreciated: > I have a dataframe with Left and Right readings (that is, elements in each > row are paired). For instance, > Left Right > [1] 9 8 > [2] 4 3 > [3] 2 1 > [4] 6 5 > [5] 3 1 > [6] 4 1 > [7] 3 2 > [8] 4 2 > [9] 10 8 > [10] 9 10 > I need to produce a new data frame where the values are transformed > according to a look-up table such as > input output > [1] 5 1 > [2] 10 1 > [3] 4 2 > [4] 8 3 > [5] 6 5 > [6] 5 6 > [7] 7 6 > [8] 2 7 > [9] 9 7 > [10] 10 7 > [11] 2 8 > So [1, ] in the new dataframe would be 7 3. Quite simple so far, but what > makes things complicated is the multiple outputs for a single input. In this > example, 10 corresponds to 1 and 7 so [9, ] in the input dataframe must > yield two rows in its output counterpart: 1 3 and 7 3. Likewise the output > for [10, ] should be 7 1 and 7 7. In addition, given that 3 and 1 are > missing as inputs the output for [5, ] should be NA NA. > Thank you very much for your time. > Juan Antonio Balbuenamerge can handle both of these requirements. First, making the two datasets reproducible: Start <- data.frame(Left=c(9,4,2,6,3,4,3,4,10,9), Right=c(8,3,1,5,1,1,2,2,8,10)) transformer <- data.frame(input=c(5,10,4,8,6,5,7,2,9,10,2), output=c(1,1,2,3,5,6,6,7,7,7,8)) Then add a marker of the original row numbers so that the work can be checked more easily later (not really needed for the calculations): Start$rownum <- seq_len(nrow(Start)) Two merge statements with the columns specified and all.x set to TRUE (to keep cases even without a match): End <- merge(merge(Start, transformer, by.x="Left", by.y="input", all.x=TRUE), transformer, by.x="Right", by.y="input", all.x=TRUE) Then we can look at the output, resorted by the original row numbers: End[order(End$rownum),] which gives Right Left rownum output.x output.y 12 8 9 1 7 3 9 3 4 2 2 NA 1 1 2 3 7 NA 2 1 2 3 8 NA 10 5 6 4 5 6 11 5 6 4 5 1 3 1 3 5 NA NA 4 1 4 6 2 NA 5 2 3 7 NA 7 6 2 3 7 NA 8 7 2 4 8 2 7 8 2 4 8 2 8 13 8 10 9 1 3 14 8 10 9 7 3 15 10 9 10 7 1 16 10 9 10 7 7> -- > > Dr. Juan A. Balbuena > Marine Zoology Unit > Cavanilles Institute of Biodiversity and Evolutionary Biology > University of > Valencia > [1]http://www.uv.es/~balbuena > P.O. Box 22085 > [2]http://www.uv.es/cavanilles/zoomarin/index.htm > 46071 Valencia, Spain > [3]http://cetus.uv.es/mullpardb/index.html > e-mail: [4]j.a.balbuena at uv.es tel. +34 963 543 658 fax +34 963 543 733 > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > NOTE! For shipments by EXPRESS COURIER use the following street address: > C/ Catedr??tico Jos?? Beltr??n 2, 46980 Paterna (Valencia), Spain. > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > References > > 1. http://www.uv.es/%7Ebalbuena > 2. http://www.uv.es/cavanilles/zoomarin/index.htm > 3. http://cetus.uv.es/mullpardb/index.html > 4. mailto:j.a.balbuena at uv.es >-- Brian S. Diggs, PhD Senior Research Associate, Department of Surgery Oregon Health & Science University
Here's an approach that seems to work. I added an 11th case to your data since you did not have a case where both Left and Right had multiple values in the lookup table. This creates an id value so that we can merge left and right separately and then merge them back together: # Create test data frames Left <- c(9, 4, 2, 6, 3, 4, 3, 4, 10, 9, 2) Right <- c(8, 3, 1, 5, 1, 1, 2, 2, 8, 10, 5) ID <- 1:11 Pair <- data.frame(ID, Left, Right) input <- c(5, 10, 4, 8, 6, 5, 7, 2, 9, 10, 2) output <- c(1, 1, 2, 3, 5, 6, 6, 7, 7, 7, 8) Lookup <- data.frame(input, output) # Merges Lout <- merge(Pair, Lookup, by.x="Left", by.y="input", all.x=TRUE)[,c("ID", "Left", "output")] Rout <- merge(Pair, Lookup, by.x="Right", by.y="input", all.x=TRUE)[, c("ID", "Right", "output")] names(Rout)[3] <- "outputR" names(Lout)[3] <- "outputL" merge(Lout, Rout, all=TRUE)[,c(1, 2, 4, 3, 5)] ID Left Right outputL outputR 1 1 9 8 7 3 2 2 4 3 2 NA 3 3 2 1 7 NA 4 3 2 1 8 NA 5 4 6 5 5 6 6 4 6 5 5 1 7 5 3 1 NA NA 8 6 4 1 2 NA 9 7 3 2 NA 7 10 7 3 2 NA 8 11 8 4 2 2 8 12 8 4 2 2 7 13 9 10 8 1 3 14 9 10 8 7 3 15 10 9 10 7 1 16 10 9 10 7 7 17 11 2 5 7 6 18 11 2 5 7 1 19 11 2 5 8 6 20 11 2 5 8 1 ------------------------------------- David L Carlson Associate Professor of Anthropology Texas A&M University College Station, TX 77840-4352 -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Juan Antonio Balbuena Sent: Thursday, July 25, 2013 10:13 AM To: r-help at r-project.org Subject: [R] transform dataframe with look-up table Hello I hope that there is a simple solution to this apparently complex problem. Any help will be much appreciated: I have a dataframe with Left and Right readings (that is, elements in each row are paired). For instance, Left Right [1] 9 8 [2] 4 3 [3] 2 1 [4] 6 5 [5] 3 1 [6] 4 1 [7] 3 2 [8] 4 2 [9] 10 8 [10] 9 10 I need to produce a new data frame where the values are transformed according to a look-up table such as input output [1] 5 1 [2] 10 1 [3] 4 2 [4] 8 3 [5] 6 5 [6] 5 6 [7] 7 6 [8] 2 7 [9] 9 7 [10] 10 7 [11] 2 8 So [1, ] in the new dataframe would be 7 3. Quite simple so far, but what makes things complicated is the multiple outputs for a single input. In this example, 10 corresponds to 1 and 7 so [9, ] in the input dataframe must yield two rows in its output counterpart: 1 3 and 7 3. Likewise the output for [10, ] should be 7 1 and 7 7. In addition, given that 3 and 1 are missing as inputs the output for [5, ] should be NA NA. Thank you very much for your time. Juan Antonio Balbuena -- Dr. Juan A. Balbuena Marine Zoology Unit Cavanilles Institute of Biodiversity and Evolutionary Biology University of Valencia [1]http://www.uv.es/~balbuena P.O. Box 22085 [2]http://www.uv.es/cavanilles/zoomarin/index.htm 46071 Valencia, Spain [3]http://cetus.uv.es/mullpardb/index.html e-mail: [4]j.a.balbuena at uv.es tel. +34 963 543 658 fax +34 963 543 733 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ NOTE! For shipments by EXPRESS COURIER use the following street address: C/ Catedr??tico Jos?? Beltr??n 2, 46980 Paterna (Valencia), Spain. ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ References 1. http://www.uv.es/%7Ebalbuena 2. http://www.uv.es/cavanilles/zoomarin/index.htm 3. http://cetus.uv.es/mullpardb/index.html 4. mailto:j.a.balbuena at uv.es ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.