vaneet
2012-Feb-10 16:15 UTC
[R] problem subsetting data frame with variable instead of constant
Hello,

I've encountered a very weird issue with subset(), or maybe there is something I don't know about it: is it true that when you subset on the columns of a data frame you can only use constants (0.1, 2.3, 2.2) and not variables?

Here's a look at my data frame, called 'ea.cad.pwr':

  > ea.cad.pwr[1:5,]
     MAF   OR  POWER
  1 0.02 0.01 0.9999
  2 0.02 0.02 0.9998
  3 0.02 0.03 0.9997
  4 0.02 0.04 0.9995
  5 0.02 0.05 0.9993

Here are my subset lines, which find no rows:

  power1 = subset(ea.cad.pwr, MAF == maf1 & OR == odds)
  power2 = subset(ea.cad.pwr, MAF == maf2 & OR == odds)

When maf1 = 0.2 and odds = 1.2, it finds nothing. I know for a fact that there's a row with these values:

  > ea.cad.pwr[1430:1432,]
       MAF   OR  POWER
  1430 0.2 0.58 0.9996
  1431 0.2 1.20 0.3092
  1432 0.2 1.22 0.3914

The code runs in a loop, and in every previous iteration subset() works fine. In this iteration some different lines are executed that set these variables:

  maf1 = maf.adj - 0.01
  maf2 = maf.adj + 0.01

Basically, maf.adj is always a 2-decimal number (in this case 0.21), and I'm computing the values on either side of it, 0.01 away (0.2 and 0.22), in case maf.adj isn't in the table. maf.adj is read from another data frame. When I use it directly to subset it always works fine, but when I do this innocent subtraction, for some reason it doesn't work. If I rewrite the statements like this, it works:

  power1 = subset(ea.cad.pwr, MAF == 0.2 & OR == odds)
  power2 = subset(ea.cad.pwr, MAF == 0.22 & OR == odds)

Even if I write this first:

  maf1 = 0.2

and then:

  power1 = subset(ea.cad.pwr, MAF == maf1 & OR == odds)

it works as well! That's what's really confusing: how can this subtraction mess everything up? Please help if you can. Thank you!

Vaneet

--
View this message in context: http://r.789695.n4.nabble.com/problem-subsetting-data-frame-with-variable-instead-of-constant-tp4376759p4376759.html
Sent from the R help mailing list archive at Nabble.com.
Petr Savicky
2012-Feb-10 16:27 UTC
[R] problem subsetting data frame with variable instead of constant
Hi. This may be a rounding problem.
Try

  0.3 - 0.1 == 0.2
  [1] FALSE

Explicit rounding to a not too large number of decimal digits can help.

  round(0.3 - 0.1, digits=7) == 0.2
  [1] TRUE

See also FAQ 7.31 or http://rwiki.sciviews.org/doku.php?id=misc:r_accuracy

Hope this helps.

Petr Savicky.
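Applied to the subsetting in the question, the rounding idea might look like the sketch below. It is only an illustration: the three-row data frame is a toy stand-in built from the rows quoted in the question, and the choice of 7 digits simply follows the example above rather than anything specific to the real data.

```r
# Toy stand-in for ea.cad.pwr (assumption: only the three quoted rows)
ea.cad.pwr <- data.frame(MAF   = c(0.2, 0.2, 0.2),
                         OR    = c(0.58, 1.20, 1.22),
                         POWER = c(0.9996, 0.3092, 0.3914))

maf.adj <- 0.21
odds    <- 1.2
maf1    <- maf.adj - 0.01   # close to, but not exactly, the double 0.2

# Exact comparison: no row matches, because maf1 != 0.2 in floating point
exact <- subset(ea.cad.pwr, MAF == maf1 & OR == odds)

# Round both sides to the same number of digits before comparing
power1 <- subset(ea.cad.pwr,
                 round(MAF, 7) == round(maf1, 7) &
                 round(OR, 7)  == round(odds, 7))

nrow(exact)    # 0
nrow(power1)   # 1
```

Rounding both the column and the computed value keeps the comparison symmetric, so it does not matter which side picked up the floating-point error.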
Sarah Goslee
2012-Feb-10 16:28 UTC
[R] problem subsetting data frame with variable instead of constant
This is likely a representation issue, as in R FAQ 7.31.

?"==" suggests that using identical and all.equal is a better strategy:

  x1 <- 0.5 - 0.3
  x2 <- 0.3 - 0.1
  x1 == x2                           # FALSE on most machines
  identical(all.equal(x1, x2), TRUE) # TRUE everywhere

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
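One caveat when carrying this over to subset(): all.equal() compares two whole objects and returns a single result, so it cannot be used directly as an element-wise row filter. A minimal sketch of a tolerance-based filter follows. The helper name near_equal is hypothetical (not part of base R), it uses an absolute tolerance rather than the relative one all.equal() applies, and the data frame is again a toy stand-in built from the rows quoted in the question.

```r
# Toy stand-in for ea.cad.pwr (assumption: only the three quoted rows)
ea.cad.pwr <- data.frame(MAF   = c(0.2, 0.2, 0.2),
                         OR    = c(0.58, 1.20, 1.22),
                         POWER = c(0.9996, 0.3092, 0.3914))

maf1 <- 0.21 - 0.01   # close to, but not exactly, the double 0.2
odds <- 1.2

# Hypothetical helper: element-wise comparison within an absolute tolerance.
# The default mirrors the order of magnitude of all.equal()'s default.
near_equal <- function(x, y, tol = sqrt(.Machine$double.eps)) {
  abs(x - y) < tol
}

# Scalar check in the style suggested by ?"=="
scalar_ok <- isTRUE(all.equal(maf1, 0.2))

# Vectorised row filter for the data frame
power1 <- subset(ea.cad.pwr, near_equal(MAF, maf1) & near_equal(OR, odds))
```

The tolerance should be much smaller than the spacing of the table values (0.01 here), so that neighbouring rows are never matched by accident.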
vaneet
2012-Feb-10 16:38 UTC
[R] problem subsetting data frame with variable instead of constant
Thanks guys, both those solutions work. I really appreciate the help!