arun
2013-Nov-15 23:32 UTC
[R] delete rows with duplicate numbers of opposite signs in the same column
Hi,

Try:

library(plyr)

fun1 <- function(dat) {
  # fun2 drops every value whose opposite-signed counterpart
  # occurs in the same vector
  fun2 <- function(x) {
    indx <- x < 0
    x1 <- x[!indx] %in% abs(x[indx])
    x2 <- abs(x[indx]) %in% x[!indx]
    x3 <- rbind(x[!indx][x1], x[indx][x2])
    x[!x %in% x3]
  }
  # With more than one value column, process each column separately
  # (paired with Customer) and return a list of data frames
  if (length(colnames(dat)) > 2) {
    lapply(colnames(dat)[-1], function(x) {
      dat1 <- cbind(dat[1], dat[x])
      ddply(dat1, .(Customer), colwise(fun2))
    })
  } else {
    ddply(dat, .(Customer), colwise(fun2))
  }
}

dat1 <- structure(list(Customer = c("A", "A", "A", "B", "B", "B"),
  Consumption = c(100L, -100L, 150L, 20L, 30L, -30L)),
  .Names = c("Customer", "Consumption"),
  class = "data.frame", row.names = c(NA, -6L))

dat2 <- structure(list(Customer = c("A", "A", "A", "B", "B", "B"),
  Consumption = c(100L, -100L, 150L, 20L, 30L, -30L),
  Column2 = c(30, -30, 40, 80, -40, 40)),
  .Names = c("Customer", "Consumption", "Column2"),
  row.names = c(NA, -6L), class = "data.frame")

dat3 <- structure(list(Customer = c("A", "A", "A", "B", "B", "B", "B", "B"),
  Consumption = c(100, -100, 150, 20, 30, -30, 20, -40),
  Column2 = c(30, 40, -30, -40, 80, 40, 20, -60)),
  .Names = c("Customer", "Consumption", "Column2"),
  row.names = c(NA, 8L), class = "data.frame")

fun1(dat1)
fun1(dat2)
fun1(dat3)

A.K.


Hi guys,

I am working on a dataset in which some values are duplicated with opposite signs in the same column. Those pairs of opposites are errors, and I have to delete them. For example:

Customer    Consumption
A                   100
A                  -100
A                   150
B                    20
B                    30
B                   -30

I have to get rid of those opposites for each customer (Consumption is one of the 13 variables in the dataset).

This question has troubled me for a long time, and I really have no idea how to approach it. Can anyone help me out or give me a hint? I would really appreciate it.
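For comparison, the same idea can be sketched in base R without plyr. This is only a rough sketch and not the method from the reply above: `drop_opposites` is a hypothetical helper name, and it assumes that a whole row should be dropped whenever its value in one chosen column has an opposite-signed match within the same customer. Zeros are kept, since a zero has no opposite-signed partner.

```r
# Hypothetical helper (not part of the original post): remove rows whose
# value in column `value` has an opposite-signed match within the same group.
drop_opposites <- function(dat, value, group = "Customer") {
  # ave() applies FUN to the values of each group and returns the results
  # in the original row order; the logical flags come back coerced to 0/1.
  has_opposite <- ave(dat[[value]], dat[[group]],
                      FUN = function(v) v != 0 & (-v) %in% v)
  dat[!as.logical(has_opposite), , drop = FALSE]
}

# Example data matching the table in the question
example <- data.frame(Customer = c("A", "A", "A", "B", "B", "B"),
                      Consumption = c(100L, -100L, 150L, 20L, 30L, -30L),
                      stringsAsFactors = FALSE)

res <- drop_opposites(example, "Consumption")
print(res)
```

On the example table this should leave one row for customer A (150) and one for customer B (20). Because matching is done per customer, a 30 for one customer and a -30 for another would not cancel each other.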