Hello again... sorry to be posting yet again, but I hadn't anticipated this 
problem.
I am now trying to put the names found in one column of data frame 1 (let's 
call it df.1[,1]) into a list, taken from the rows where the values in df.1[,2] 
match values in a column of another data frame (df.2[3]).
I tried to write this function so that it puts the names in a list (called 
Iffy) where the 2 criteria (df.1[141] and df.2[21]) match, but I think it's 
too complex for a beginner R-enthusiast:
ify<-function(x,y,a,b,c) if(x[[,a]]==y[[,b]]) {list(x[[,c]])} else {NULL}
Iffy<-apply(  df.1,  1,  FUN=ify,  x=df.1,  y=df.2,  a=2,  b=3,  c=1  )
But this didn't work... Error in FUN(newX[, i], ...) : unused argument(s) 
(newX[, i])
Here is a dataset that replicates the problem. You'll notice the "h" 
criteria values are different between the two data frames, so it 
should produce a list of the 9 letters where the two criteria columns 
match (a,b,c,d,e,f,g,i,j):
df.1<-data.frame(rep(letters[1:10]))
colnames(df.1)[1]<-("Letters")
set.seed(1)
df.1$numb1<-rnorm(10,1,1)
df.1$extra.col<-c(1,2,3,4,5,6,7,8,9,10)
df.1$id<-c("CG234","CG232","CG441","CG128","CG125","CG182","CG982","CG541","CG282","CG154")
df.1
df.2<-data.frame(rep(letters[1:10]))
colnames(df.2)[1]<-("Letters")
set.seed(1)
df.2$extra.col<-c(1,2,3,4,5,6,7,8,9,10)
df.2$numb1<-rnorm(10,1,1)
df.2$id<-c("CG234","CG232","CG441","CG128","CG125","CG182","CG982","CG541","CG282","CG154")
df.2[8,3]<-12
df.1
df.2
Your patience is much appreciated,
Rob
R. Michael Weylandt <michael.weylandt@gmail.com>
2011-Nov-16  13:34 UTC
[R] create list of names where two df contain == values
I'm not at a computer now, so I can't take a close look at it, but I think the match() function can be helpful here. I'll try to get back to you with a fuller answer later.
Michael
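A minimal sketch of how the match() suggestion might look on the example data, next to a plain row-by-row comparison. The column names follow the original post; the variable names vals, same_row and pos are only illustrative. Note that match() asks whether a value occurs anywhere in the other column, which is not quite the same question as "equal in the same row", although both give the same answer here.

```r
# Rebuild a reduced version of the example data.
# stringsAsFactors = FALSE keeps Letters as character in any R version.
set.seed(1)
vals <- rnorm(10, 1, 1)
df.1 <- data.frame(Letters = letters[1:10], numb1 = vals,
                   stringsAsFactors = FALSE)
df.2 <- data.frame(Letters = letters[1:10], extra.col = 1:10, numb1 = vals,
                   stringsAsFactors = FALSE)
df.2[8, 3] <- 12   # make the "h" row differ, as in the original post

# Row-by-row comparison: is row i of df.1 equal to row i of df.2?
same_row <- df.1$numb1 == df.2$numb1
df.1$Letters[same_row]

# match(): position of each df.1 value in df.2$numb1, or NA if it is absent
pos <- match(df.1$numb1, df.2$numb1)
df.1$Letters[!is.na(pos)]
```

Exact comparison works here only because both columns come from the same rnorm() draw; for values computed in different ways a tolerance-based comparison is usually safer.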
David Winsemius
2011-Nov-16  14:04 UTC
[R] create list of names where two df contain == values
On Nov 16, 2011, at 8:03 AM, Rob Griffin wrote:
> ify<-function(x,y,a,b,c) if(x[[,a]]==y[[,b]]) {list(x[[,c]])} else {NULL}
When you are building a helper function for use with apply, you should realize that the function will be getting a vector, not a list. The construction "[[,a]]" looks pretty strange as well. Generally column selection is done with one of "[[a]]" or "[ , a]". I am not absolutely sure that you cannot have "[[,]]" but I was under the impression you could not. And you shouldn't be returning NULLs if what you really want are NAs.
> Iffy<-apply( df.1, 1, FUN=ify, x=df.1, y=df.2, a=2, b=3, c=1 )
So a single vector will be assigned to the x argument in the ify function and the rest of the arguments will be populated from the other arguments. You do NOT need to supply an "x" argument in that list, and if you do so you will throw an error. Furthermore, you cannot expect the apply function to keep track of which row it's on for indexing a different data.frame. The mapply function might be used for this purpose, but I am going to suggest a much cleaner solution below.
> But this didn't work... Error in FUN(newX[, i], ...) : unused argument(s) (newX[, i])
>
> Here is a dataset that replicates the problem, you'll notice the "h" criteria values are different between the two dataframes and therefore it would produce a list of the 9 letters where the two criteria columns matched (a,b,c,d,e,f,g,i,j):
If you know that df.1 and df.2 have the same number of rows, then use the ifelse function, which is designed to work on vectors. The if ... else construct is NOT:
> ifelse( df.1[,2] == df.2[,3], as.character(df.1[,1]), NA )
 [1] "a" "b" "c" "d" "e" "f" "g" NA  "i" "j"
The reason as.character was needed lies in the fact that you constructed df.1[,1] as a factor variable. As I understand it, ifelse tries to make it numeric to match the datatype of the comparison. I've never understood this, frankly. Maybe someone can educate me.
If you wanted a function that allowed you to specify the columns and data frames, then consider this:
ret3.m1.eq.n2 <- function(df1, df2, col1, col2, col3){
    ifelse( df1[,col1] == df2[,col2], as.character(df1[,col3]), NA )
}
David Winsemius, MD
West Hartford, CT
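On the as.character() puzzle raised above, a small sketch may help. According to ?ifelse, the result takes its shape and attributes from the test and only the data values from yes or no, so a factor's levels and class are not carried over and its integer codes are what remain. The names f, keep and names_where_equal below are purely illustrative; the wrapper is the same idea as the function above under a hypothetical name.

```r
f    <- factor(c("a", "b", "c"))
keep <- c(TRUE, FALSE, TRUE)

codes  <- ifelse(keep, f, NA)                 # integer codes: 1 NA 3
labels <- ifelse(keep, as.character(f), NA)   # "a" NA "c"

# Hypothetical wrapper: return column col3 of df1 where
# df1[, col1] equals df2[, col2] row by row, and NA elsewhere.
names_where_equal <- function(df1, df2, col1, col2, col3) {
  ifelse(df1[, col1] == df2[, col2], as.character(df1[, col3]), NA)
}

d1 <- data.frame(id = factor(c("x", "y", "z")), v = c(1, 2, 3))
d2 <- data.frame(v = c(1, 5, 3))
res <- names_where_equal(d1, d2, col1 = 2, col2 = 1, col3 = 1)
```

Since R 4.0.0, data.frame() no longer converts strings to factors by default, so on a current installation the letters column in the original example would already be character; the as.character() call is harmless in that case.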
Dennis Murphy
2011-Nov-16  15:03 UTC
[R] create list of names where two df contain == values
Hi:
I think you're overthinking this problem. As is usually the case in R,
a vectorized solution is clearer and provides more easily understood
code.
It's not obvious to me exactly what you want, so we'll try a couple of
variations on the same idea. Equality of floating point numbers is a
difficult computational problem (see R FAQ 7.31), but if it makes
sense to define a threshold difference between floating-point numbers that
practically equates to zero, then you're in business. In your example,
the difference in numb1 for letter h in the two data frames is far
from zero, so define 'equal' to be an absolute difference < 10^-6. Then:
# Return the entire matching data frame
df.1[abs(df.1$numb1 - df.2$numb1) < 0.000001, ]
   Letters     numb1 extra.col    id
1        a 0.3735462         1 CG234
2        b 1.1836433         2 CG232
3        c 0.1643714         3 CG441
4        d 2.5952808         4 CG128
5        e 1.3295078         5 CG125
6        f 0.1795316         6 CG182
7        g 1.4874291         7 CG982
9        i 1.5757814         9 CG282
10       j 0.6946116        10 CG154
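The floating-point caveat mentioned above (R FAQ 7.31) can be seen in a very small sketch. The values 0.1, 0.2 and 0.3 are just the usual textbook illustration, not taken from the example data.

```r
x <- 0.1 + 0.2
y <- 0.3

exact   <- x == y                     # FALSE: the two doubles differ in the last bits
within  <- abs(x - y) < 1e-6          # TRUE: equal within an absolute tolerance
approx  <- isTRUE(all.equal(x, y))    # TRUE: base R's tolerance-based comparison
```

all.equal() returns a description of the difference rather than FALSE when the values differ, which is why it is wrapped in isTRUE() when used as a condition.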
# Return the matching letters only as a vector:
df.1[abs(df.1$numb1 - df.2$numb1) < 0.000001, 'Letters' ]
If you want the latter object to remain a data frame, use drop = FALSE
as an extra argument after 'Letters'. If you want to create a list
object such that each letter comprises a different list component,
then the following will do - the as.character() part coerces the
factor Letters into a character object:
as.list(as.character(df.1[abs(df.1$numb1 - df.2$numb1) < 0.000001,
             'Letters' ]))
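To make the three result shapes above concrete, here is a short sketch on a reduced version of the example data (only the Letters and numb1 columns). Letters is built explicitly as a factor to mirror the behaviour in the thread; the names eq, v, d and l are illustrative.

```r
set.seed(1)
df.1 <- data.frame(Letters = factor(letters[1:10]), numb1 = rnorm(10, 1, 1))
set.seed(1)
df.2 <- data.frame(Letters = factor(letters[1:10]), numb1 = rnorm(10, 1, 1))
df.2[8, "numb1"] <- 12   # make the "h" row differ

# Rows where the two numb1 columns agree within the tolerance
eq <- abs(df.1$numb1 - df.2$numb1) < 1e-6

v <- df.1[eq, "Letters"]                 # factor vector of matching letters
d <- df.1[eq, "Letters", drop = FALSE]   # one-column data frame
l <- as.list(as.character(v))            # list with one letter per component

class(v); class(d); length(l)
```

Which shape is most convenient depends on what happens next: the vector is easiest for further subsetting, the data frame keeps the row names, and the list suits lapply()-style processing.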
HTH,
Dennis