Thomas.Salvesen at SYNGENTA.COM
2011-Mar-07 07:08 UTC
[R] rowSums - am I getting something wrong?
I am trying to construct a data set with some sequences, for example:

a = seq(0,1,0.1)

m = matrix(nrow = 1331, ncol = 3)
m[,1] = rep(a,121)
m[,2] = rep(a,11,each = 11)
m[,3] = rep(a,1,each = 121)

I realize that there may be better ways of doing this, but this approach demonstrates the problem I'm having.

I then want to get the sum of each row and delete any row with a sum greater than 1. But I have a problem with rows containing any combination of the values 0.6, 0.3 and 0.1: the sum of these is clearly 1, yet a request for the rows with a sum greater than 1 returns rows with these values. Row 161 is the first row containing these values:

[161,] 0.6 0.3 0.1

which(rowSums(m)>1)

[53] 119 120 121 132 142 143 152 153 154 161 162

As far as I can tell this only affects combinations of 0.6, 0.3 and 0.1 (though I haven't checked every value in the matrix).

If I try the following:

q=rowSums(m)
which(q>1)

[53] 119 120 121 132 142 143 152 153 154 161 162

But if I add and subtract 1 from this:

q=q+1
q=q-1
which(q>1)

[53] 119 120 121 132 142 143 152 153 154 162

What exactly is going on here? I don't have the problem with other combinations (e.g. 0.7, 0.2, 0.1). I assume that there is something about the data format that I don't understand, but when I made a data frame of the matrix I found the same effect.

Any help would be great.

Tom
Hi Tom,

That's once again the floating point number issue: see R FAQ 7.31.

Look at this:

sum(m[161,])
[1] 1
sum(m[161,])==1
[1] FALSE
sum(m[161,])-1
[1] 2.220446e-16

So the values stored in row 161, which print as 0.6, 0.3 and 0.1, do indeed sum to slightly more than 1.

Try this instead:

round(sum(m[161,]))==1
[1] TRUE

HTH,
Ivan

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
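To see where the extra 2.220446e-16 comes from, it can help to print the stored values with more digits than R shows by default. The following is a minimal sketch, assuming ordinary IEEE 754 double arithmetic; the names row161, s and roundtrip are illustrative and do not appear in the thread. It also sketches why adding and then subtracting 1 makes row 161 drop out of the result.

```r
# Values stored in row 161 of m: they come from seq(), not from typed literals.
a <- seq(0, 1, 0.1)
row161 <- c(a[7], a[4], a[2])

# Print 17 significant digits, enough to tell neighbouring doubles apart.
print(sprintf("%.17g", row161))            # values produced by seq()
print(sprintf("%.17g", c(0.6, 0.3, 0.1)))  # typed literals, for comparison

s <- sum(row161)
print(s > 1)                          # TRUE, as reported in the thread
print(s - 1 == .Machine$double.eps)   # the excess is one spacing step above 1

# Between 2 and 4 adjacent doubles are twice as far apart as between 1 and 2,
# so s + 1 cannot keep the tiny excess; subtracting 1 then leaves exactly 1.
roundtrip <- (s + 1) - 1
print(roundtrip == 1)                 # TRUE, so row 161 is no longer "> 1"
```

The add-and-subtract step only hides the excess by accident of rounding, so it is better read as an explanation of the symptom than as a fix.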
rex.dwyer at syngenta.com
2011-Mar-07 14:28 UTC
[R] rowSums - am I getting something wrong?
Hi Thomas,

Several of us explained this in different ways just last week, so you might search the archive. Floating point numbers are an approximate representation of real numbers. Many values that can be expressed exactly in powers of 10, such as 0.1, can't be expressed exactly in powers of 2. So the sum 0.6+0.3+0.1 is NOT clearly 1.0. You can use signif and round to overcome this.

> a = seq(0,1,0.1)
> a
 [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> a[7]-0.6
[1] 1.110223e-16
> 1-(a[4]+a[7]+a[2])
[1] -2.220446e-16
> b = rev(seq(1,0,-0.1))
> b
 [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> a-b
 [1] 0.000000e+00 2.775558e-17 5.551115e-17 1.110223e-16 1.110223e-16
 [6] 0.000000e+00 1.110223e-16 1.110223e-16 0.000000e+00 0.000000e+00
[11] 0.000000e+00
> round(a-b,10)
 [1] 0 0 0 0 0 0 0 0 0 0 0
> round(a,10)-round(b,10)
 [1] 0 0 0 0 0 0 0 0 0 0 0

The first commandment of floating point programming is
THOU SHALT NOT TEST WHETHER TWO FP NUMBERS ARE EQUAL

HTH
Rex

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
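Applied to the original task of dropping rows whose sum exceeds 1, the advice in the replies can be sketched in two ways. This is a hedged illustration, not a method given in the thread: the tolerance tol = 1e-8 and the names kept_tol, k and kept_int are assumptions made here. The first variant compares against 1 with a small tolerance; the second builds the grid in whole tenths, where small integers are represented exactly, and only rescales at the end.

```r
# Grid from the original post.
a <- seq(0, 1, 0.1)
m <- cbind(rep(a, 121), rep(a, 11, each = 11), rep(a, 1, each = 121))

# Variant 1: compare with a tolerance instead of exactly.
tol <- 1e-8                       # assumed tolerance; pick one suited to the data
q <- rowSums(m)
kept_tol <- m[q <= 1 + tol, , drop = FALSE]

# Variant 2: work in whole tenths (0..10), filter exactly, rescale afterwards.
# expand.grid() varies its first column fastest, matching the row order of m.
k <- as.matrix(expand.grid(0:10, 0:10, 0:10))
kept_int <- k[rowSums(k) <= 10, , drop = FALSE] / 10

print(nrow(kept_tol))
print(nrow(kept_int))

# For equality tests, all.equal() compares within a tolerance.
print(isTRUE(all.equal(sum(m[161, ]), 1)))
```

Both variants keep row 161. The integer variant avoids choosing a tolerance at all, which may be preferable when the grid step is known in advance.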