A client came into our consulting center with some data that had been damaged by somebody who opened it in MS Excel. The columns were supposed to be integer valued, 0 through 5, but some of the values were mysteriously damaged. There were scores like 1.18329322 and such in there. Until he tracks down the original data and finds out what went wrong, he wants to take all fractional valued scores and convert them to NA.

As a quick hack, I suggest an approach using %%

> x <- c(1,2,3,1.1,2.12131, 2.001)
> x %% 1
[1] 0.00000 0.00000 0.00000 0.10000 0.12131 0.00100
> which(x %% 1 > 0)
[1] 4 5 6
> xbad <- which(x %% 1 > 0)
> x[xbad] <- NA
> x
[1]  1  2  3 NA NA NA

I worry about whether x %% 1 may ever return a non-zero result for an integer because of rounding error.

Is there a recommended approach?

What about zapsmall on the left, but what on the right of >?

which( zapsmall(x %% 1) > 0 )

Thanks in advance

-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas
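The rounding-error worry in the question can be illustrated with a small sketch. It assumes the usual double-precision arithmetic; the variable names `exact`, `above` and `below` are made up for the illustration. Whole numbers that are stored exactly, such as 0 through 5, leave a remainder of exactly zero, while values that are only supposed to be whole because they came out of a calculation can miss on either side.

```r
# Whole numbers stored exactly as doubles leave a remainder of exactly zero.
exact <- c(0, 1, 2, 3, 4, 5)
all(exact %% 1 == 0)

# Results of floating-point arithmetic can miss a whole number slightly.
above <- sqrt(2)^2         # slightly more than 2
below <- (0.1 + 0.7) * 10  # slightly less than 8

above %% 1 > 0             # TRUE: the remainder is tiny but not zero
below %% 1                 # close to 1, not close to 0

# zapsmall() removes the tiny remainder, but the remainder close to 1
# is left alone, so `below` is still flagged as fractional.
zapsmall(c(above, below) %% 1) > 0
```

Note that zapsmall() rounds relative to the largest absolute value in the vector it receives, so its effect on a small remainder depends on what else is in that vector.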
Hi Paul,

What about using:

x[x != as.integer(x)] <- NA

I cannot think of a situation offhand where this would fail to turn every non-integer to missing.

I wonder if there is really a point to this? Can the client proceed with data analysis with any degree of confidence when an unknown mechanism has altered data in unknown ways? Could Excel have sometimes changed one integer to another (e.g., 4s became 1.18whatever, but 3s became 1s or....)?

Cheers,

Josh

On Sat, Aug 13, 2011 at 12:42 PM, Paul Johnson <pauljohn32 at gmail.com> wrote:
> I worry about whether x %% 1 may ever return a non zero result for an
> integer because of rounding error.
>
> Is there a recommended approach?
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
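A minimal sketch of the as.integer() comparison on the example vector from the original post, with one already missing value appended as an assumption about what real data might contain. The name `bad` is made up for the illustration. The comparison is exact, so it does not tolerate rounding error, and as.integer() returns NA with a warning for values outside the integer range, which is not a concern for scores of 0 through 5.

```r
x <- c(1, 2, 3, 1.1, 2.12131, 2.001, NA)

# as.integer() truncates toward zero, so any value with a fractional
# part differs from its truncated version.
bad <- x != as.integer(x)

# which() drops the NA that the comparison produces for the missing entry.
x[which(bad)] <- NA
x
```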
How about something like:

If(round(x)!=x){zap}

not exactly working code but might help

Ken

On Aug 13, 2554 BE, at 3:42 PM, Paul Johnson <pauljohn32 at gmail.com> wrote:
> Until he tracks down the original data and finds out what
> went wrong, he wants to take all fractional valued scores and convert
> to NA.
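One way the round() idea might look as working, vectorized code, applied to every column of a data frame. The data frame `scores`, its column names and the helper `not_whole` are hypothetical stand-ins, since the thread does not show the client's data.

```r
# Hypothetical stand-in for the client's data: two score columns, 0 through 5.
scores <- data.frame(
  item1 = c(0, 3, 1.18329322, 5),
  item2 = c(2, 2.5, 4, 1)
)

# Vectorized form of the round() idea: TRUE where a value is not whole.
# %in% TRUE treats an NA comparison (an already missing score) as FALSE.
not_whole <- function(v) (round(v) != v) %in% TRUE

# Apply to every column while keeping the data frame structure.
scores[] <- lapply(scores, function(v) replace(v, not_whole(v), NA))
scores
```

Like the as.integer() comparison, this is an exact test, so a value that misses a whole number by rounding error would also be set to NA.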
Actually

sapply(x %% 1, function(x) isTRUE(all.equal(x, 0)))

seems to be the way to go.

Uwe Ligges

On 14.08.2011 07:17, Ken wrote:
> How about something like:
> If(round(x)!=x){zap} not exactly working code but might help
>
> Ken
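A sketch of how the all.equal() expression might be used to set the flagged values to NA, next to a vectorized alternative that is not from the thread: measuring the distance to the nearest whole number. The names `is_whole`, `is_whole2` and the tolerance `tol` are assumptions made for the illustration, and the last two elements of `x` are added to show values that miss a whole number by rounding error.

```r
x <- c(1, 2, 3, 1.1, 2.12131, 2.001, sqrt(2)^2, (0.1 + 0.7) * 10)

# The expression from the message above: TRUE where the remainder is
# numerically zero under the default tolerance of all.equal().
is_whole <- sapply(x %% 1, function(r) isTRUE(all.equal(r, 0)))
is_whole

# Alternative (an assumption, not from the thread): compare each value
# with its nearest whole number, using an absolute tolerance.
tol <- 1e-8  # hypothetical tolerance; choose one that suits the data
is_whole2 <- abs(x - round(x)) < tol
is_whole2

x[!is_whole2] <- NA
x
```

The two checks agree except on the last element. It sits just below 8, so its remainder is close to 1 rather than 0 and the remainder-based check does not treat it as whole, while the distance-based check handles both sides of a whole number the same way.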