Zembower, Kevin
2006-Oct-26 19:55 UTC
[R] Newbie: Better way to do compound conditionals in subset?
There must be a better way to select the rows after 22-Apr-2004 and before 01-Sep-2004 with a temperature below 65 than this:

> before2sw1 <- subset(energy.data, as.Date(start, format="%d-%b-%y") < as.Date("01-Sep-04", format = "%d-%b-%y"))
> before2sw2 <- subset(before2sw1, as.Date(start, format="%d-%b-%y") > as.Date("22-Apr-04", format = "%d-%b-%y"), select=c(therms,temp,days))
> before2sw <- subset(before2sw2, temp < 65)

Is it also possible to combine in this step:

attach(before2sw)
before2sw.HDD <- therms / (65 - temp) * days

My data looks like this:

> head(energy.data)
      start therms   gas KWHs elect temp days
1 10-Jun-98      9 16.84  613 63.80   75   40
2 20-Jul-98      6 15.29  721 74.21   76   29
3 18-Aug-98      7 15.73  597 62.22   76   29
4 16-Sep-98     42 35.81  460 43.98   70   33
5 19-Oct-98    105 77.28  314 31.45   57   29
6 17-Nov-98    106 77.01  342 33.86   48   30

Thanks for your suggestions and advice. I'm continuing to enjoy learning R.

-Kevin

Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland 21202
410-659-6139
Jeffrey Robert Spies
2006-Oct-26 20:32 UTC
[R] Newbie: Better way to do compound conditionals in subset?
I would personally use the following method (example using the iris data included with R):

data(iris)
tSelect <- (iris$Sepal.Length > 6.0 & iris$Sepal.Length < 6.2 & iris$Sepal.Width == 3.0)
tSelectedData <- iris[tSelect,]

Then you can simply work with tSelectedData for whatever equation you use, i.e.:

tSelectedData$Sepal.Length - tSelectedData$Sepal.Width

Of course you could write all of this on one line, but that doesn't read well.

Hope that helps,

Jeff.
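Applied to the energy data from the original question, the same logical-vector approach might look like the sketch below. The rows are made up for illustration (the posted head() output only covers 1998), and the names start_date, lower, upper, in_window and selected are hypothetical. Parsing month abbreviations with %b assumes an English locale.

```r
# Made-up rows in the same layout as energy.data; the values are illustrative only.
energy.data <- data.frame(
  start  = c("19-Mar-04", "20-Apr-04", "21-May-04", "22-Jun-04", "21-Jul-04", "20-Sep-04"),
  therms = c(90, 60, 20, 8, 6, 15),
  temp   = c(45, 55, 62, 72, 64, 60),
  days   = c(30, 32, 31, 32, 29, 31),
  stringsAsFactors = FALSE
)

# Parse the dates once instead of inside every condition.
start_date <- as.Date(energy.data$start, format = "%d-%b-%y")
lower <- as.Date("22-Apr-04", format = "%d-%b-%y")
upper <- as.Date("01-Sep-04", format = "%d-%b-%y")

# One logical vector combines all three conditions with '&'.
in_window <- start_date > lower & start_date < upper & energy.data$temp < 65

# Keep the matching rows and only the columns needed for the calculation.
selected <- energy.data[in_window, c("therms", "temp", "days")]
selected$HDD <- selected$therms / (65 - selected$temp) * selected$days
```

Naming the logical vector keeps the row filter readable and lets it be reused, for example with sum(in_window) to count the matching rows.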
Marc Schwartz
2006-Oct-26 22:51 UTC
[R] Newbie: Better way to do compound conditionals in subset?
Well, the first thing that I would do is to convert 'start' to a Date:

energy.data$start <- as.Date(energy.data$start, format = "%d-%b-%y")

Then create your cutoff dates:

Start <- as.Date("22-Apr-04", format = "%d-%b-%y")
End <- as.Date("01-Sep-04", format = "%d-%b-%y")

Then you can do this:

NewDF <- subset(energy.data, (start > Start) & (start < End) & (temp < 65), select = c(therms, temp, days))

and then add:

NewDF$HDD <- with(NewDF, (therms / (65 - temp) * days))

You can also do the calculations and add a new column to the full data frame and simply add 'HDD' to the 'select' argument in subset().

Note that if you don't want to modify the original data frame, which is something that I tend to avoid for a variety of reasons, you can copy it to another first and then run the above steps on the copy for subsequent analysis.

You want to generally avoid using attach(), as it can have deleterious and not immediately evident side-effects. Review the Details section of ?attach and note some of the effects seen in the Examples there.
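One of the attach() side-effects mentioned above can be seen in a small sketch. The data frame toy and the variables attached_temp and with_temp are hypothetical names, and the example assumes no variable called temp already exists in the workspace.

```r
# Hypothetical two-row data frame; assumes no global variable named 'temp' exists.
toy <- data.frame(temp = c(50, 60))

attach(toy)                # copies the columns onto the search path
toy$temp <- toy$temp + 1   # changes the data frame, not the attached copy
attached_temp <- temp      # picks up the stale attached copy: 50 60
detach(toy)                # remove the copy from the search path again

with_temp <- with(toy, temp)  # with() reads the current data frame: 51 61
```

Because attach() works on a copy, later changes to the data frame are silently ignored by bare column names, whereas with() always evaluates against the data frame as it is at the time of the call.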
HTH,

Marc Schwartz
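The steps from this reply can be put together in one place, roughly as sketched below. The helper select_heating_rows() and its arguments are hypothetical, the 2004 rows are made up (only the two 1998 rows come from the posted head() output), and %b again assumes an English locale. Since R passes the data frame by value, the conversion of 'start' inside the function leaves the caller's data frame unmodified.

```r
# Hypothetical helper: works on a local copy, so the caller's data frame is untouched.
select_heating_rows <- function(df, from, to, base_temp = 65) {
  df$start <- as.Date(df$start, format = "%d-%b-%y")  # modifies the local copy only
  from <- as.Date(from, format = "%d-%b-%y")
  to   <- as.Date(to,   format = "%d-%b-%y")

  # All three conditions in a single subset() call.
  out <- subset(df, start > from & start < to & temp < base_temp,
                select = c(therms, temp, days))

  # with() supplies the column names without attach().
  out$HDD <- with(out, therms / (base_temp - temp) * days)
  out
}

# Two rows from the posted data plus three made-up 2004 rows.
energy.data <- data.frame(
  start  = c("19-Oct-98", "17-Nov-98", "21-May-04", "21-Jul-04", "20-Sep-04"),
  therms = c(105, 106, 20, 6, 15),
  temp   = c(57, 48, 62, 76, 60),
  days   = c(29, 30, 31, 29, 31),
  stringsAsFactors = FALSE
)

result <- select_heating_rows(energy.data, "22-Apr-04", "01-Sep-04")
```

Only the 21-May-04 row falls inside the date window with a temperature below 65, so result has a single row, and energy.data$start is still a character column afterwards.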