Hello R-help,

I noticed the following surprising behavior when using %in% to find elements in a vector generated using seq().

# weird!!!
> c(7.7, 7.8, 7.9) %in% seq(4, 8, by=0.1)
[1]  TRUE FALSE  TRUE

# OK now
> c(7.7, 7.8, 7.9) %in% round(seq(4, 8, by=0.1), 1)
[1] TRUE TRUE TRUE

# wait, how is this different?
> c(7.7, 7.8, 7.9) %in% seq(7, 8, by=0.1)
[1] TRUE TRUE TRUE

Is there an obvious reason for this behavior which I am missing? Seems like a bug to me...

Thanks in advance!
Vadim

> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] graphics  grDevices utils     datasets  stats     methods   base

other attached packages:
[1] cluster_1.13.1  nlme_3.1-97     lattice_0.19-13

loaded via a namespace (and not attached):
[1] grid_2.12.0
Not weird at all!!! You need to understand how computers do arithmetic. See R FAQ 7.31.

-- Bert

On Fri, Nov 12, 2010 at 10:55 AM, Vadim Patsalo <patsalov at gmail.com> wrote:
> Hello R-help,
>
> I noticed the following surprising behavior when using %in% to find elements in a vector generated using seq().
>
> # weird!!!
>> c(7.7, 7.8, 7.9) %in% seq(4, 8, by=0.1)
> [1]  TRUE FALSE  TRUE
>
> # OK now
>> c(7.7, 7.8, 7.9) %in% round(seq(4, 8, by=0.1), 1)
> [1] TRUE TRUE TRUE
>
> # wait, how is this different?
>> c(7.7, 7.8, 7.9) %in% seq(7, 8, by=0.1)
> [1] TRUE TRUE TRUE
>
> Is there an obvious reason for this behavior which I am missing? Seems like a bug to me...
>
> Thanks in advance!
> Vadim

--
Bert Gunter
Genentech Nonclinical Biostatistics
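For readers following the thread, here is a minimal sketch of the point behind FAQ 7.31. It assumes IEEE 754 double precision, which R uses on common platforms; the variable names are illustrative only. Most decimal fractions such as 0.1 have no exact binary representation, so a value built by arithmetic can differ in its last bits from the "same" value typed as a literal, and an exact comparison then fails.

```r
# Two values that print identically at default precision can still differ
# in their last binary digit.
a <- 0.1 + 0.2
b <- 0.3

exact_equal <- (a == b)                  # exact comparison of the stored doubles
near_equal  <- isTRUE(all.equal(a, b))   # comparison within a small tolerance

print(exact_equal)
print(near_equal)
print(sprintf("%.17f", c(a, b)))         # reveal the digits hidden by default printing

# The same inspection applied to the vectors from the original post.
# Whether the exact match succeeds can depend on the platform and on how
# seq() builds each element, so no output is claimed here.
x <- c(7.7, 7.8, 7.9)
s <- seq(4, 8, by = 0.1)
print(sprintf("%.17f", x))
print(sprintf("%.17f", s[abs(s - 7.8) < 1e-8]))
```

Printing with 17 significant digits is usually enough to see why two doubles that look alike are not identical. This would also explain why seq(7, 8, by=0.1) behaved differently from seq(4, 8, by=0.1) in the original post: the elements are reached by different arithmetic, so the rounding errors differ.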
Patrick and Bert,

Thank you both for your replies to my question. I see how my naïve expectations run afoul of floating-point arithmetic. However, I still believe there is an underlying problem.

It seems to me that when asked,

> c(7.7, 7.8, 7.9) %in% seq(4, 8, by=0.1)
> [1]  TRUE FALSE  TRUE

R should return TRUE in all instances. %in% is testing set membership... in that way, shouldn't it be using all.equal() (instead of the implicit '=='), as Patrick suggests in 'The R Inferno'?

Is there a convenient way to test set membership using all.equal()? In particular, can you do it (conveniently) when the lengths of the numeric vectors are different?

Thanks again for your reply!
Vadim

On Nov 13, 2010, at 5:46 AM, Patrick Burns wrote:
> See Circle 1 of 'The R Inferno'.
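One possible answer to the question above, offered as a sketch rather than as an established idiom: a small helper that tests membership within a tolerance. The name near_in and its tol argument are hypothetical and not part of base R. Note that it uses an absolute difference, whereas all.equal() by default compares relative differences, so the two are not equivalent for very large or very small magnitudes.

```r
# Hypothetical helper (not part of base R): for each element of x, return TRUE
# if it lies within `tol` of at least one element of `table`.
# x and table may have different lengths.
near_in <- function(x, table, tol = sqrt(.Machine$double.eps)) {
  vapply(x, function(xi) any(abs(xi - table) <= tol), logical(1))
}

x <- c(7.7, 7.8, 7.9)
s <- seq(4, 8, by = 0.1)

res <- near_in(x, s)
print(res)
```

The default tolerance, sqrt(.Machine$double.eps), is roughly 1.5e-8, the same order as the default used by all.equal(). The helper does length(x) * length(table) comparisons, which is fine for short vectors; for long ones, rounding both sides to a fixed number of digits before calling %in%, along the lines of the round() workaround in the original post, may be the cheaper route.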