Jim Lemon
2015-Jun-06 07:50 UTC
[R] if else statement for rain data to define zero for dry and one to wet
Hi rosalinazairimah,
I think the problem is that you are using "if" instead of "ifelse". Try this:

wet_dry<-function(x,thresh=0.1) {
 for(column in 1:dim(x)[2]) x[,column]<-ifelse(x[,column]>=thresh,1,0)
 return(x)
}
wet_dry(dt)

and see what you get.

Also, why can I read your message perfectly while everybody else can't?

Jim

>> -----Original Message-----
>> From: roslinaump at gmail.com
>> Sent: Fri, 5 Jun 2015 16:49:08 +0800
>> To: r-help at r-project.org
>> Subject: [R] if else statement for rain data to define zero for dry and
>> one to wet
>>
>> Dear r-users,
>>
>> I have a set of rain data:
>>
>>   X1950 X1951 X1952 X1953 X1954 X1955 X1956 X1957 X1958 X1959 X1960 X1961
>>   X1962
>> 1   0.0   0.0  14.3   0.0  13.5  13.2   4.0     0   3.3     0     0   0.0
>> 2   0.0   0.0  21.9   0.0  10.9   6.6   2.1     0   0.0     0     0   0.0
>> 3  25.3   6.7  18.6   0.8   2.3   0.0   8.0     0   0.0     0     0  11.0
>> 4  12.7   3.4  37.2   0.9   8.4   0.0   5.8     0   0.0     0     0   5.5
>> 5   0.0   0.0  58.3   3.6  21.1   4.2   3.0     0   0.0     0     0  15.9
>>
>> I would like to go through each column and define each cell with value
>> greater than 0.1 mm will be 1 and else zero. Hence I would like to attach
>> the rain data and the category side by side:
>>
>>   1950 state
>> 1  0.0     0
>> 2  0.0     0
>> 3 25.3     1
>> 4 12.7     1
>> 5  0.0     0
>>
>> ...
>>
>> This is my code:
>>
>> wet_dry <- function(dt)
>> { cl <- length(dt)
>>   tresh <- 0.1
>>
>>   for (i in 1:cl)
>>   { xi <- dt[,i]
>>     if (xi < tresh ) 0 else 1
>>   }
>>   dd <- cbind(dt,xi)
>>   dd
>> }
>>
>> wet_dry(dt)
>>
>> Results:
>>
>> > wet_dry(dt)
>>   X1950 X1951 X1952 X1953 X1954 X1955 X1956 X1957 X1958 X1959 X1960 X1961
>> 1   0.0   0.0  14.3   0.0  13.5  13.2   4.0   0.0   3.3   0.0   0.0   0.0
>> 2   0.0   0.0  21.9   0.0  10.9   6.6   2.1   0.0   0.0   0.0   0.0   0.0
>> 3  25.3   6.7  18.6   0.8   2.3   0.0   8.0   0.0   0.0   0.0   0.0  11.0
>> 4  12.7   3.4  37.2   0.9   8.4   0.0   5.8   0.0   0.0   0.0   0.0   5.5
>>   X1962 X1963 X1964 X1965 X1966 X1967 X1968 X1969 X1970 X1971 X1972 X1973
>> 1   4.2   0.0   2.2   0.0   4.4   5.1     0   7.2   0.0   0.0   0.0   5.1
>> 2   8.4   0.0   4.0   0.0   4.9   0.7     0   0.0   0.0   0.0   0.0   5.4
>> 3   4.2   0.0   2.0   0.0  14.2  17.1     0   0.0   0.0   0.0   0.0   2.1
>> 4   0.0   0.0   5.4   0.0   6.4  14.9     0  10.1   2.9 143.4   0.0   6.1
>>   X1974 X1975 X1976 X1977
>> 1     0   0.0     0   0.3
>> 2     0   3.3     0   0.3
>> 3     0   1.7     0   4.4
>> 4     0   0.0     0  33.5
>>
>> It does not work and give me the original data. Why is that?
>>
>> Thank you so much for your help.
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
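The original function returns the raw data for two reasons: "if" tests a single value rather than a whole column, and the value of the "if" expression is never stored, so cbind(dt, xi) only appends the last raw column. The following is a minimal sketch of the side-by-side layout asked for in the original post, not anyone's posted solution. The helper name wet_dry_side_by_side, the "_state" suffix, and the three-column sample (rebuilt from the first three columns of the posted data) are assumptions made for illustration. It uses the strict "greater than 0.1" rule from the question.

```r
# Small sample rebuilt from the first three columns of the posted data.
dt <- data.frame(
  X1950 = c(0.0, 0.0, 25.3, 12.7, 0.0),
  X1951 = c(0.0, 0.0, 6.7, 3.4, 0.0),
  X1952 = c(14.3, 21.9, 18.6, 37.2, 58.3)
)

# Hypothetical helper: place a 0/1 wet indicator next to each rain column.
wet_dry_side_by_side <- function(x, thresh = 0.1) {
  # Comparing the whole data frame at once yields a logical matrix;
  # multiplying by 1L converts TRUE/FALSE to integer 1/0.
  state <- 1L * (x > thresh)
  colnames(state) <- paste0(colnames(x), "_state")
  out <- cbind(x, state)
  # Interleave the columns so each year is followed by its indicator.
  out[, as.vector(rbind(seq_along(x), ncol(x) + seq_along(x)))]
}

res <- wet_dry_side_by_side(dt)
print(res)
```

Calling wet_dry_side_by_side(dt, thresh = 1) would apply a different cut-off without editing the function body.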
roslinazairimah zakaria
2015-Jun-06 09:14 UTC
[R] if else statement for rain data to define zero for dry and one to wet
Thank you, Jim.
Dennis Murphy
2015-Jun-06 20:55 UTC
[R] if else statement for rain data to define zero for dry and one to wet
I'm sorry, but I have to take issue with this particular use case of
ifelse(). When the goal is to generate a logical vector, ifelse() is
very inefficient. It's better to apply a logical condition directly to
the object in question and multiply the result by 1 to make it
numeric/integer rather than logical.

To illustrate this, consider the following toy example. The function
f1 replicates the suggestion to apply ifelse() columnwise (with the
additional overhead of preallocating storage for the result), whereas
the function f2 applies the logical condition on the matrix itself
using vectorization, with the recognition that a matrix is an atomic
vector with a dim attribute.

set.seed(5290)

# 1000 x 1000 matrix
m <- matrix(sample(c(0, 0.05, 0.2), 1e6, replace = TRUE), ncol = 1000)

f1 <- function(mat)
{
    newmat <- matrix(NA, ncol = ncol(mat), nrow = nrow(mat))
    for(i in seq_len(ncol(mat)))
        newmat[, i] <- ifelse(mat[, i] > 0.1, 1, 0)
    newmat
}

f2 <- function(mat) 1 * (mat > 0.1)

On my system, I got

> system.time(m1 <- f1(m))
   user  system elapsed
   0.14    0.00    0.14

> system.time(m2 <- f2(m))
   user  system elapsed
   0.01    0.00    0.01

> identical(m1, m2)
[1] TRUE

The all too common practice of using ifelse(condition, 1, 0) on an
atomic vector is easily replaced by 1 * (condition), where the result
of condition is a logical atomic object coerced to numeric.

To reduce memory, one should better define f2 as

f2 <- function(mat) 1L * (mat > 0.1)

but doing so in this example no longer creates identical objects since

> typeof(m1)
[1] "double"

Thus, f1 is not only inefficient in terms of execution time, it's also
inefficient in terms of storage.

Given several recent warnings in this forum about the inefficiency of
ifelse() and the dozens of times I've seen the idiom implemented in f1
as a solution over the last several years (to which I have likely
contributed in my distant past as an R-helper), I felt compelled to
say something about this practice, which BTW extends not just to 0/1
return values but to 0/x return values, where x is a nonzero real number.

Dennis
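The equivalences described above can be checked on a tiny matrix. This is a sketch under stated assumptions: the matrix values, the variable names, and the amount x = 2.5 are made up for illustration and are not taken from the thread's timing example.

```r
# Small deterministic matrix (values chosen arbitrarily for illustration).
m <- matrix(c(0, 0.05, 0.2, 0.2, 0, 0.05), ncol = 2)

wet_ifelse <- ifelse(m > 0.1, 1, 0)  # the idiom under discussion
wet_double <- 1 * (m > 0.1)          # logical coerced to double
wet_int    <- 1L * (m > 0.1)         # logical coerced to integer

# Same values and same storage type as the ifelse() result.
same_as_ifelse <- identical(wet_ifelse, wet_double)

# The integer variant holds the same 0/1 values in a smaller type.
int_type <- typeof(wet_int)

# 0/x variant: return x where the condition holds and 0 elsewhere.
x <- 2.5  # hypothetical nonzero amount
scaled <- x * (m > 0.1)
same_scaled <- identical(scaled, ifelse(m > 0.1, x, 0))

print(same_as_ifelse)
print(int_type)
print(same_scaled)
```

The multiplication form relies on x being finite, since an infinite x times 0 would not give 0.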
William Dunlap
2015-Jun-06 21:48 UTC
[R] if else statement for rain data to define zero for dry and one to wet
Your f1() has an unneeded for loop in it.
   f1a <- function(mat) ifelse(mat > 0.1, 1, 0)
would do the same thing in a bit less time.

However, I think that a simple
   mat > 0.1
would be preferable. The resulting TRUEs and FALSEs are easier to
interpret than the 1s and 0s that f1a() produces, and arithmetic
functions treat TRUE as 1 and FALSE as 0 internally. E.g.,
   mean(mat > 0.1)
gives the proportion of wet(tish) days.

Bill Dunlap
TIBCO Software
wdunlap tibco.com
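As a small illustration of working with the logical result directly, the sketch below summarises wet days on a three-column sample rebuilt from the posted data. The object names (rain, wet, overall, per_year, wet_days) are assumptions for this example only.

```r
# Sample rebuilt from the first three columns of the posted rain data.
rain <- data.frame(
  X1950 = c(0.0, 0.0, 25.3, 12.7, 0.0),
  X1951 = c(0.0, 0.0, 6.7, 3.4, 0.0),
  X1952 = c(14.3, 21.9, 18.6, 37.2, 58.3)
)

# Logical matrix: TRUE marks a wet day, FALSE a dry day.
wet <- rain > 0.1

overall  <- mean(wet)      # proportion of wet days over all cells
per_year <- colMeans(wet)  # proportion of wet days in each column
wet_days <- colSums(wet)   # number of wet days in each column

print(overall)
print(per_year)
print(wet_days)
```

Because TRUE counts as 1 in arithmetic, no conversion to 0/1 is needed before summarising.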