Hi,

I want to clean my data frame based on the age column: I want to delete the rows where the difference between consecutive elements, (i+1) - i, is an integer. I used

a <- diff(df$age)
for(i in a){if(is.integer(a) == true){df <- df[-a,]
}}

but it doesn't work. Any ideas?

Thanks in advance,
Bayan
Hi Bayan,

In your code, 'a' is a vector and is.integer(a) is a logical of length 1. It tests how the whole vector is stored, not whether each element is a whole number, so it is FALSE for any vector held as double, even if only one element has a fractional part. (R coerces all the elements of a to the same type.) Note also that the logical constant in R is TRUE, not true.

You need to decide whether something "close enough" to an integer is to be considered an integer, e.g. within a distance of 0.000001 = 1e-6.

a <- diff(df$age)
df <- df[ c( TRUE, abs( a - round(a,0) ) > 1e-6 ), ]

I added the 'TRUE' at the beginning to always keep the first row of df. If you prefer to always keep the last row then move the TRUE to the end.

HTH,
Eric

On Tue, Sep 26, 2017 at 12:50 PM, bayan sardini <sardinibayan at gmail.com> wrote:
> Hi
>
> I want to clean my data frame, based on the age column, whereas i want to
> delete the rows that the difference between its elements (i+1)-i= integer.
> i used
>
> a <- diff(df$age)
> for(i in a){if(is.integer(a) == true){df <- df[-a,]
> }}
>
> but, it doesn't work, any ideas
>
> Thanks in advance
> Bayan
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
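As a minimal sketch of the distinction in Eric's reply, the following compares is.integer() with an element-wise "close enough to a whole number" test. The ages are made up, and both the helper name is_whole and the tolerance of 1e-6 are assumptions chosen for illustration.

```r
# Made-up ages, stored as double (values chosen to be exact in binary)
age <- c(21.5, 22.5, 23.0, 25.25, 26.25)
a <- diff(age)   # differences between consecutive ages: 1, 0.5, 2.25, 1

# is.integer() reports the storage type of the whole vector,
# so it returns a single value, not one value per element
type_check <- is.integer(a)

# Hypothetical helper: TRUE where a value lies within tol of a whole number
is_whole <- function(x, tol = 1e-6) abs(x - round(x)) < tol

# One logical per difference
whole <- is_whole(a)
```

Here type_check is a single FALSE because the differences are stored as double, while whole flags the first and last differences, which is the per-element answer the original loop was after.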
Hi Bayan,

Your question seems to imply that the "age" column contains floating point numbers, e.g.

df
height weight  age
   170     72 21.5
...

If this is so, you will only find an integer in diff(age) if two adjacent numbers happen to have the same decimal fraction _and_ the subtraction does not produce a very small decimal remainder due to one or both of the numbers being unable to be represented exactly in binary notation, as Eric pointed out. This seems an unusual criterion for discarding values. Perhaps if you explain why an integer result is undesirable it would help.

It can be done:

d <- diff(df$age)
badrows <- which(abs(d - round(d)) < 1e-6)  # near-integer differences
if (length(badrows) > 0) df <- df[-badrows, ]

OR

if (length(badrows) > 0) df <- df[-(badrows + 1), ]

if you want to delete the second rather than the first age.

Jim
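The two variants Jim describes can be sketched end to end on a small data frame. The column names follow his example, but the values, the tolerance tol, and the names hits, df_first and df_second are all invented for illustration.

```r
# Made-up data; only the column names come from the thread
df <- data.frame(
  height = c(170, 165, 180, 175, 160),
  weight = c(72, 60, 85, 78, 55),
  age    = c(21.5, 22.5, 23.0, 25.25, 26.25)
)

tol <- 1e-6                             # assumed tolerance
d <- diff(df$age)                       # d[i] is age[i + 1] - age[i]
hits <- which(abs(d - round(d)) < tol)  # positions of near-integer differences

# Logical masks over the row numbers; these stay valid when hits is empty
rows <- seq_len(nrow(df))
drop_first  <- rows %in% hits           # earlier row of each flagged pair
drop_second <- rows %in% (hits + 1)     # later row of each flagged pair

df_first  <- df[!drop_first, ]
df_second <- df[!drop_second, ]
```

Logical masks are used instead of negative indices because df[-integer(0), ] selects zero rows rather than all of them, so a negative index needs a guard whenever no difference is flagged.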