Message: 63
Date: Wed, 26 Jan 2005 04:28:51 +0000 (UTC)
From: Gabor Grothendieck <ggrothendieck at myway.com>
Subject: Re: [R] chron: parsing dates into a data frame using a forloop
To: r-help at stat.math.ethz.ch
Message-ID: <loom.20050126T052153-333 at post.gmane.org>
Content-Type: text/plain; charset=us-ascii

Benjamin M. Osborne <Benjamin.Osborne <at> uvm.edu> writes:

: I have one data frame with a column of dates and I want to fill another data
: frame with one column of dates, one of years, one of months, one of a unique
: combination of year and month, and one of days, but R seems to have some
: problems with this.  My initial data frame looks like this (ignore the NAs in
: the other fields):
:
: > mans[1:10,]
:        date loc snow.new prcp tmin snow.dep tmax
: 1  11/01/54   2       NA   NA   NA       NA   NA
: 2  11/02/54   2       NA   NA   NA       NA   NA
: 3  11/03/54   2       NA   NA   NA       NA   NA
: 4  11/04/54   2       NA   NA   NA       NA   NA
: 5  11/05/54   2       NA   NA   NA       NA   NA
: 6  11/06/54   2       NA   NA   NA       NA   NA
: 7  11/07/54   2       NA   NA   NA       NA   NA
: 8  11/08/54   2       NA   NA   NA       NA   NA
: 9  11/09/54   2       NA   NA   NA       NA   NA
: 10 11/10/54   2       NA   NA   NA       NA   NA
: >
:
: The code and resultant data frame look like this:
:
: > for(i in 1:10){
: + mans.met$date[i]<-mans$date[i]
: + mans.met$year[i]<-years(mans.met$date[i])
: + mans.met$month[i]<-months(mans.met$date[i])
: + mans.met$yearmo[i]<-cut(mans.met$date[i], "months")
: + mans.met$day[i]<-days(mans.met$date[i])
: + }
: > mans.met[1:10,]
:        date year month yearmo day snow.new snow.dep prcp tmin tmax tmean
: 1  11/01/54    1    11      1   1       NA       NA   NA   NA   NA    NA
: 2  11/02/54    1    11      1   2       NA       NA   NA   NA   NA    NA
: 3  11/03/54    1    11      1   3       NA       NA   NA   NA   NA    NA
: 4  11/04/54    1    11      1   4       NA       NA   NA   NA   NA    NA
: 5  11/05/54    1    11      1   5       NA       NA   NA   NA   NA    NA
: 6  11/06/54    1    11      1   6       NA       NA   NA   NA   NA    NA
: 7  11/07/54    1    11      1   7       NA       NA   NA   NA   NA    NA
: 8  11/08/54    1    11      1   8       NA       NA   NA   NA   NA    NA
: 9  11/09/54    1    11      1   9       NA       NA   NA   NA   NA    NA
: 10 11/10/54    1    11      1  10       NA       NA   NA   NA   NA    NA
: >
:
: The problem seems to be with assigning within the forloop, or making the
: assignment into a data frame, since:
:
: > years(mans.met$date[5])
: [1] 1954
: Levels: 1954
: > test<-years(mans.met$date[5])
: > test
: [1] 1954
: Levels: 1954
: >
: > months(mans.met$date[5])
: [1] Nov
: 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec
: > test<-months(mans.met$date[5])
: > test
: [1] Nov
: 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec
: >
: > cut(mans.met$date[3], "months")
: [1] Nov 54
: Levels: Nov 54
: > test<-cut(mans.met$date[3], "months")
: > test
: [1] Nov 54
: Levels: Nov 54
: >
: > days(mans.met$date[4])
: [1] 4
: 31 Levels: 1 < 2 < 3 < 4 < 5 < 6 < 7 < 8 < 9 < 10 < 11 < 12 < 13 < ... < 31
: > test<-days(mans.met$date[4])
: > test
: [1] 4
: 31 Levels: 1 < 2 < 3 < 4 < 5 < 6 < 7 < 8 < 9 < 10 < 11 < 12 < 13 < ... < 31
: >
:
: Any suggestions will be appreciated.
: -Ben Osborne

I guess you set up mans.met as numeric columns and when you
assign your factors to numeric variables you get
the underlying codes.  Note that if f is a factor then as.numeric(f)
gives the codes underlying the factor whereas as.character(f) gives
the labels.

It would be better not to use a loop at all.  I don't know whether you
want or not want factors but at any rate here is something you could
try.  It creates data frame df2 without a loop.

df2 <- data.frame(date = mans$date, yearmo = as.character(cut(mans$date, "m")))
df2 <- cbind(df2, month.day.year(mans$date))

Finally, do you really want this redundant representation?  I would tend to
go with just storing the dates and computing any of the other quantities
on-the-fly as needed.

##########
The reason for the redundancy is that I will want to summarize these 50 years of
daily time series data by month, so that records that share each unique year
and month in the mans.met$yearmo column will be summed or averaged, etc. into a
new row in another data frame (mans.monthly, having
nrow=length(unique(mans.met$yearmo))).  The way I would do this is again using
a forloop, but the loop won't recognize:

for (i in 1:(length(unique(mans.met$yearmo[i])))){

What I really need to know is why I can call any ith of
unique(mans.met$yearmo[i])
by itself, but not in a loop.

Or, perhaps there is an even easier way to extract the year and month from the
date column on the fly to compute these summaries?

Thanks,
Ben Osborne

--
Botany Department
University of Vermont
109 Carrigan Drive
Burlington, VT 05405

benjamin.osborne at uvm.edu
phone: 802-656-0297
fax: 802-656-0440
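The remark about factor codes versus labels can be illustrated with a small base-R sketch. It is only an illustration under stated assumptions: the factor f is a hypothetical stand-in for what years(mans$date) returns, and met is an assumed data frame with a numeric column preallocated, which may or may not match how mans.met was actually built.

```r
# Hypothetical stand-in for years(mans$date): a factor with one level, "1954".
f <- factor(c("1954", "1954", "1954"))

as.numeric(f)                # integer codes underlying the factor: 1 1 1
as.character(f)              # the labels: "1954" "1954" "1954"
as.numeric(as.character(f))  # labels converted to numbers: 1954 1954 1954

# Assumption: the target column was preallocated as numeric.
met <- data.frame(year = numeric(3))

# Element-wise assignment of a factor into a numeric vector keeps only
# the underlying code, which matches the 1s seen in the year column.
for (i in 1:3) met$year[i] <- f[i]
met$year                     # 1 1 1

# Assigning the converted labels in one vectorised step avoids the problem.
met$year <- as.numeric(as.character(f))
```

Converting through as.character() before as.numeric() is the usual way to recover the printed values of a factor whose labels are numbers.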
Benjamin M. Osborne <Benjamin.Osborne <at> uvm.edu> writes:
: ##########
: The reason for the redundancy is that I will want to summarize these 50 years of
: daily time series data by month, so that records that share each unique year
: and month in the mans.met$yearmo column will be summed or averaged, etc. into a
: new row in another data frame(mans.monthly, having
: nrow=length(unique(mans.met$yearmo))).  The way I would do this is again using
: a forloop, but the loop won't recognize
:
: for (i in 1:(length(unique(mans.met$yearmo[i])))){

This seems circular.  You are defining i in terms of i.

: What I really need to know is why I can call any ith of
: unique(mans.met$yearmo[i])
: by itself, but not in a loop.
:
: Or, perhaps there is an even easier way to extract the year and month from the
: date column on the fly to compute these summaries?

Look at ?aggregate, ?by and ?tapply.  e.g.

aggregate(mans[,-1], list(cut(mans$date, "m")), mean)
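The monthly summary suggested above can be sketched in base R without the chron package. This is only an illustration under assumptions: the toy mans data frame and its values are made up, the dates are stored as Date rather than chron objects, and format(date, "%Y-%m") is swapped in for chron's cut(mans$date, "m") to build the year-month key. The names yearmo, mans.monthly, keys and n.days are chosen for the example.

```r
# Hypothetical toy data standing in for the poster's 'mans' data frame.
mans <- data.frame(
  date = as.Date("1954-11-01") + 0:60,          # 1 Nov 1954 to 31 Dec 1954
  tmin = seq(-5, by = 0.1, length.out = 61),
  tmax = seq(5, by = 0.1, length.out = 61)
)

# Year-month key computed on the fly; no extra column is stored in 'mans'.
yearmo <- format(mans$date, "%Y-%m")

# Monthly means of every column except the date. na.rm = TRUE is passed
# on to mean() so a missing observation does not make the whole month NA.
mans.monthly <- aggregate(mans[, -1], list(yearmo = yearmo), mean, na.rm = TRUE)

# The same grouping for a single column with tapply().
tmin.monthly <- tapply(mans$tmin, yearmo, mean, na.rm = TRUE)

# If a loop is still wanted, compute the unique keys once, outside the
# loop, so that the loop bounds do not depend on i.
keys <- unique(yearmo)
n.days <- integer(length(keys))
for (i in seq_along(keys)) {
  n.days[i] <- sum(yearmo == keys[i])   # number of daily records in month i
}
```

With real data containing many NAs, as in the printed rows of mans, whether na.rm = TRUE is appropriate depends on how incomplete months should be treated.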