vaneet
2012-Feb-10 16:15 UTC
[R] problem subsetting data frame with variable instead of constant
Hello,
I've encountered a very weird issue with subset(), or maybe there is
something I don't know about it: when you're subsetting based on the
columns of a data frame, can you only use constants (0.1, 2.3, 2.2)
and not variables?
Here's a look at my data frame called 'ea.cad.pwr':
> ea.cad.pwr[1:5,]
   MAF   OR  POWER
1 0.02 0.01 0.9999
2 0.02 0.02 0.9998
3 0.02 0.03 0.9997
4 0.02 0.04 0.9995
5 0.02 0.05 0.9993
Here are my subset lines, which find no rows:
power1 = subset(ea.cad.pwr, MAF == maf1 & OR == odds)
power2 = subset(ea.cad.pwr, MAF == maf2 & OR == odds)
Now when maf1 = 0.2 and odds = 1.2 it finds nothing. I know for a fact that
there's a row with these values:
> ea.cad.pwr[1430:1432,]
     MAF   OR  POWER
1430 0.2 0.58 0.9996
1431 0.2 1.20 0.3092
1432 0.2 1.22 0.3914
My code runs in a loop, and in every previous iteration subset() worked
fine. In this iteration, though, some different lines are executed that
set these variables. Here they are:
maf1 = maf.adj - 0.01
maf2 = maf.adj + 0.01
Basically maf.adj is always a 2-decimal number (in this case 0.21), and
I'm computing the numbers around it at a distance of 0.01 (0.2, 0.22) in
case maf.adj isn't in the table. maf.adj is read from another data frame.
When I use it directly to subset, it always works fine, but when I do this
innocent subtraction, for some reason it doesn't work. If I rewrite the
statements like this, it works:
power1 = subset(ea.cad.pwr, MAF == 0.2 & OR == odds)
power2 = subset(ea.cad.pwr, MAF == 0.22 & OR == odds)
Even if I write this first:
maf1 = 0.2
Then:
power1 = subset(ea.cad.pwr, MAF == maf1 & OR == odds)
It works as well! That's what's really confusing: how can this subtraction
mess everything up? Please help if you can. Thank you!
Vaneet
--
View this message in context:
http://r.789695.n4.nabble.com/problem-subsetting-data-frame-with-variable-instead-of-constant-tp4376759p4376759.html
Sent from the R help mailing list archive at Nabble.com.
Petr Savicky
2012-Feb-10 16:27 UTC
[R] problem subsetting data frame with variable instead of constant
Hi.
This may be a rounding problem. Try
0.3 - 0.1 == 0.2
[1] FALSE
Explicit rounding to a not too large number of decimal digits can help.
round(0.3 - 0.1, digits=7) == 0.2
[1] TRUE
See also FAQ 7.31 or http://rwiki.sciviews.org/doku.php?id=misc:r_accuracy
Hope this helps.
Petr Savicky.
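To connect the rounding suggestion above to the original subset() call, here is a minimal sketch. The three-row data frame is a toy stand-in built from the rows quoted in the question, not the real table, and rounding to 2 digits assumes (as the question states) that the MAF values in the table have two decimals.

```r
# Toy stand-in for ea.cad.pwr, using the three rows quoted in the question
ea.cad.pwr <- data.frame(MAF   = c(0.2, 0.2, 0.2),
                         OR    = c(0.58, 1.20, 1.22),
                         POWER = c(0.9996, 0.3092, 0.3914))
maf.adj <- 0.21
odds    <- 1.2

# The computed value is not the same double as the literal 0.2
maf1 <- maf.adj - 0.01
print(maf1 == 0.2)              # FALSE on IEEE 754 doubles
print(sprintf("%.17g", maf1))   # shows the digits R normally hides

# So the exact comparison inside subset() matches no rows
no.match <- subset(ea.cad.pwr, MAF == maf1 & OR == odds)
print(nrow(no.match))

# Rounding the computed value back to 2 decimals restores the match
maf1   <- round(maf.adj - 0.01, digits = 2)
power1 <- subset(ea.cad.pwr, MAF == maf1 & OR == odds)
print(power1)
```

Rounding works here because the table values and the rounded value are both the nearest double to the same 2-decimal number. If the table itself held computed values, rounding both sides would be the safer variant.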
Sarah Goslee
2012-Feb-10 16:28 UTC
[R] problem subsetting data frame with variable instead of constant
This is likely a representation issue, as in R FAQ 7.31.
The help page ?"==" suggests that using identical() and all.equal() is a
better strategy:
x1 <- 0.5 - 0.3
x2 <- 0.3 - 0.1
x1 == x2 # FALSE on most machines
identical(all.equal(x1, x2), TRUE) # TRUE everywhere
Sarah
--
Sarah Goslee
http://www.functionaldiversity.org
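One caveat when carrying the all.equal() idea into subset(): all.equal() returns a single answer for the whole comparison, so it cannot be used directly as a row-by-row filter. A minimal sketch of an elementwise alternative follows. The helper name near_equal and the absolute tolerance of 1e-8 are assumptions made for illustration, and the data frame is a toy stand-in built from the rows quoted in the question.

```r
# Toy stand-in for ea.cad.pwr, using the three rows quoted in the question
ea.cad.pwr <- data.frame(MAF   = c(0.2, 0.2, 0.2),
                         OR    = c(0.58, 1.20, 1.22),
                         POWER = c(0.9996, 0.3092, 0.3914))
maf1 <- 0.21 - 0.01   # computed value, slightly off from the literal 0.2
odds <- 1.2

# Hypothetical helper: elementwise "equal within an absolute tolerance".
# The tolerance is an arbitrary choice, far smaller than the 0.01 grid step.
near_equal <- function(x, y, tol = 1e-8) abs(x - y) < tol

# Returns one TRUE/FALSE per row, so it works as a subset() condition
power1 <- subset(ea.cad.pwr, near_equal(MAF, maf1) & near_equal(OR, odds))
print(power1)
```

Compared with rounding, this does not require knowing how many decimals the table uses, only that distinct values are further apart than the tolerance.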
vaneet
2012-Feb-10 16:38 UTC
[R] problem subsetting data frame with variable instead of constant
Thanks guys, both those solutions work. I really appreciate the help!