Hi All,

I have a data set with 11000 rows and 19 columns. I need to summarize the data on two of those columns: Date and Weight.

A snapshot of the data:

Date         CHG_WT
13/03/2015   0
31/03/2015   0
15/03/2015   0
17/03/2015   770
17/03/2015   3,730
11/3/2015    70
11/3/2015    10
19/03/2015   500

I need to summarize this data to show the day-wise trend of weight. I have tried splitting and truncating the date, and I have seen several options on the web (the zoo package, ISO weeks, etc.), but I am not sure how to get to this analysis.

Could you experts please suggest how to achieve this?

Thanks, Shivi

--
View this message in context: http://r.789695.n4.nabble.com/Summarizing-data-based-on-Date-tp4708328.html
Sent from the R help mailing list archive at Nabble.com.
Hi

Is your Date really a Date, or is it character? What is the result of str(Date)?

If you want to get summaries by date you can use ?aggregate

However, in this case I strongly recommend that you show us your data with dput(yourdata) and explain, using that example, what summary you want.

I may be completely wrong, but maybe

aggregate(CHG_WT, list(format(Date, "%d")), sum)

can get you the required values.

Cheers
Petr

> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Shivi82
> Sent: Monday, June 08, 2015 10:08 AM
> To: r-help at r-project.org
> Subject: [R] Summarizing data based on Date
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
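The suggestion above (check the column types, then aggregate on a formatted date) might look roughly like the sketch below. The data frame is a hypothetical stand-in built from the snapshot in the original post, and reading "3,730" as 3730 with a thousands separator is an assumption, not something the thread confirms.

```r
# Hypothetical sample mirroring the snapshot in the original post.
# 3730 assumes "3,730" used a thousands separator; that reading is a guess.
test <- data.frame(
  Date   = c("13/03/2015", "31/03/2015", "15/03/2015", "17/03/2015",
             "17/03/2015", "11/3/2015", "11/3/2015", "19/03/2015"),
  CHG_WT = c(0, 0, 0, 770, 3730, 70, 10, 500),
  stringsAsFactors = FALSE
)

str(test)  # Date is still character at this point

# Parse day/month/year strings into the Date class
test$Date <- as.Date(test$Date, format = "%d/%m/%Y")

# Sum of weight per day of month; naming the list element names the column
by_day <- aggregate(test$CHG_WT, list(day = format(test$Date, "%d")), sum)
by_day
```

Grouping on format(Date, "%d") lumps together the same day number from different months, so for data spanning several months it may be safer to group on the Date column itself.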
Hi Petr,

Thanks for the explanation below. I tried the code you supplied; however, it seems my date is a factor, hence it is not working.

The error I got from the code was:

Error: unexpected symbol in:
"final<-aggregate(test$CHG_WT,list(format(test$CR_DT,"%d"),sum)
final"

str(test$CR_DT) gives: Factor with 31 levels
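Two separate things seem to be going on in this message. The "unexpected symbol" error is a syntax error: the call as typed has one closing parenthesis too few, so R reads the next line as part of the same expression. The factor column is a second issue. A minimal sketch of both fixes follows, where cr_dt is a hypothetical stand-in for test$CR_DT and the day/month/year format is assumed from the original snapshot.

```r
# Hypothetical factor column standing in for test$CR_DT
cr_dt <- factor(c("13/03/2015", "17/03/2015", "11/3/2015", "17/03/2015"))

# With the parentheses balanced, the attempted call would read:
#   final <- aggregate(test$CHG_WT, list(format(test$CR_DT, "%d")), sum)

# Convert the factor labels to Date so that format() has a date to work with
cr_date <- as.Date(as.character(cr_dt), format = "%d/%m/%Y")
class(cr_date)
format(cr_date, "%d")
```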
Hi Petr,

I researched a lot on the net and in the R manual, and based on that I revised my code to:

test$CR_DT <- as.Date(test$CR_DT, '%d-%b-%y')
iii <- aggregate(test$CHG_WT, list(format(test$CR_DT, "%m")), FUN = sum)

However, it still gives me the error below:

Error in Summary.factor(c(1L, 1L, 1L, 3286L, 1646L, 3241L, 1L, 1L, 1307L, :
  'sum' not meaningful for factors

Could you guide me on how to achieve the desired output? Thanks.
What does the following command print out?

str(test)

The error message indicates that test$CHG_WT is not numeric.

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Shivi82
Sent: Tuesday, June 9, 2015 7:01 AM
To: r-help at r-project.org
Subject: Re: [R] Summarizing data based on Date
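If str(test) does show CHG_WT as a factor, one possible cause (only a guess, based on the "3,730" in the original snapshot) is that some weights contain thousands separators, so the column was not read as numeric. A sketch of the usual conversion, with chg_wt as a hypothetical stand-in for test$CHG_WT:

```r
# Hypothetical weights as they might arrive from a CSV with thousands separators
chg_wt <- factor(c("0", "770", "3,730", "70"))

# as.numeric() on a factor returns the internal level codes, not the values
as.numeric(chg_wt)

# Go through character and strip the commas before converting
chg_num <- as.numeric(gsub(",", "", as.character(chg_wt), fixed = TRUE))
chg_num
sum(chg_num)
```

Once the column is numeric, sum() inside aggregate() no longer hits Summary.factor.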
Hi,

As David said, have a look at str(test). You have a factor in there, or else that weird "list(format(test$CR_DT,"%m"))" command in aggregate() is mucking things up. What is "list(format(test$CR_DT,"%m"))" intended to do? No, a quick test says it is mucking something else up and not giving us the factor problem.

Here is your sample data and what I think you are trying to do. Note that the data is supplied in dput() format, which is the preferred way to supply sample data to the R-help list. See http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example and http://adv-r.had.co.nz/Reproducibility.html for more information. I used lubridate's dmy() function rather than as.Date() to format the dates.

dat1 <- structure(list(dd = structure(c(1426204800, 1427760000,
    1426377600, 1426550400, 1426550400, 1426032000, 1426032000,
    1426723200), tzone = "UTC", class = c("POSIXct", "POSIXt")),
    wt = c(0, 0, 0, 770, 3.73, 70, 10, 500)),
    .Names = c("dd", "wt"), row.names = c(NA, -8L), class = "data.frame")

str(dat1)

aggregate(dat1$wt, list(dat1$dd), sum)

John Kane
Kingston ON Canada

> -----Original Message-----
> From: shivibhatia at ymail.com
> Sent: Tue, 9 Jun 2015 05:01:23 -0700 (PDT)
> To: r-help at r-project.org
> Subject: Re: [R] Summarizing data based on Date
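For readers who have not used dput() before, the sketch below shows why it is the preferred way to share data: it prints an R expression that rebuilds the object, types included. The small data frame here is hypothetical and is not the poster's data.

```r
# Hypothetical three-row sample with a Date column and a numeric column
dat <- data.frame(
  dd = as.Date(c("2015-03-13", "2015-03-17", "2015-03-17")),
  wt = c(0, 770, 70)
)

# dput() prints an R expression that rebuilds the object
txt <- capture.output(dput(dat))
cat(txt, sep = "\n")

# Pasting that text back into R recreates the data frame, classes included
dat_copy <- eval(parse(text = txt))
all.equal(dat, dat_copy)
```

Because column classes travel with the dput() output, helpers can see immediately whether a column is a factor, character, Date, or numeric.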
Hi All,

I am able to get the desired result. Thanks for extending help.

While reading the csv file I made some changes:

test <- read.csv("Testdata.csv", header = TRUE, stringsAsFactors = FALSE, strip.white = TRUE)

With this, character variables were not changed to factors.

Then the aggregation was simple:

aggregate(test$CHG_WT, list(test$CR_DT), sum)

However, the output is not sorted by date, and the column names come out very differently:

    Group.1      x
1  1-Mar-15 909791
2 10-Mar-15 822436
3 11-Mar-15 848609
4 12-Mar-15 924842
5 13-Mar-15 895270
6 14-Mar-15  93238
7  2-Mar-15 731600

Can you all please suggest why the column names are so different and how I could sort by date? I added a sort option to the above syntax:

aggregate(test$CHG_WT, list(test$CR_DT), sum, sort(test$CR_DT, decreasing = TRUE))

But it gave me an error:

Error in FUN(X[[i]], ...) : invalid 'type' (character) of argument

Thanks All.
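The ordering in that output (10-Mar-15 before 2-Mar-15) is what alphabetical sorting of character strings produces. One way around it, sketched below on hypothetical rows in the same "d-Mon-yy" layout, is to convert CR_DT to Date before aggregating. The "%b" code depends on the locale, so the sketch sets LC_TIME to "C" on the assumption that the month abbreviations are English.

```r
# "%b" is locale-dependent; this assumes English month abbreviations
invisible(Sys.setlocale("LC_TIME", "C"))

# Hypothetical rows in the same layout as the output above
test <- data.frame(
  CR_DT  = c("1-Mar-15", "10-Mar-15", "2-Mar-15", "10-Mar-15", "2-Mar-15"),
  CHG_WT = c(100, 200, 300, 400, 500),
  stringsAsFactors = FALSE
)

# As character, "10-Mar-15" sorts before "2-Mar-15"
sort(unique(test$CR_DT))

# Convert first, then aggregate: the groups are now ordered as dates
test$CR_DT <- as.Date(test$CR_DT, format = "%d-%b-%y")
daily <- aggregate(test$CHG_WT, list(CR_DT = test$CR_DT), sum)
daily
```

Naming the list element (CR_DT = ...) also replaces the default Group.1 column name.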
Hi Shivi

I think the names issue is just that those are aggregate()'s default names. Just rename them using ?names

For the 'sort' problem, AFAIK you cannot sort within aggregate(), at least not the way you are doing it, nor do you want to use sort(). You need ?order for what you want to do with a data.frame; sort() is for vectors.

Does this do what you want?

dat1 <- structure(list(dd = structure(c(1426204800, 1427760000,
    1426377600, 1426550400, 1426550400, 1426032000, 1426032000,
    1426723200), tzone = "UTC", class = c("POSIXct", "POSIXt")),
    wt = c(0, 0, 0, 770, 3.73, 70, 10, 500)),
    .Names = c("dd", "wt"), row.names = c(NA, -8L), class = "data.frame")

str(dat1)

dat2 <- aggregate(dat1$wt, list(dat1$dd), sum)
names(dat2) <- c("dd", "wt")
dat2[order(dat2$dd), ]

John Kane
Kingston ON Canada

> -----Original Message-----
> From: shivibhatia at ymail.com
> Sent: Tue, 9 Jun 2015 22:51:47 -0700 (PDT)
> To: r-help at r-project.org
> Subject: Re: [R] Summarizing data based on Date
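As a variation on the rename-then-order approach above, the formula interface of aggregate() keeps the original column names, and order() accepts decreasing = TRUE when the newest dates should come first. This is a sketch on hypothetical data, not a replacement for the code in the message.

```r
# Hypothetical sample with a Date column and a numeric weight
dat1 <- data.frame(
  dd = as.Date(c("2015-03-13", "2015-03-17", "2015-03-17", "2015-03-11")),
  wt = c(0, 770, 70, 10)
)

# The formula interface keeps the original column names (dd, wt)
dat2 <- aggregate(wt ~ dd, data = dat1, FUN = sum)

# order() returns row indices; use them to reorder the data frame
newest_first <- dat2[order(dat2$dd, decreasing = TRUE), ]
newest_first
```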
Hi

I (wrongly) understood that Shivi82 wanted to summarise on month values. That is why I used format(test$CR_DT, "%m"), which gives you the month number, and the list() is required by aggregate().

The whole problem was in test$CHG_WT, which seems to be a factor (for whatever reason).

Cheers
Petr

> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of John Kane
> Sent: Tuesday, June 09, 2015 6:42 PM
> To: Shivi82; r-help at r-project.org
> Subject: Re: [R] Summarizing data based on Date
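For completeness, the month-level summary described above could be sketched as follows. It assumes CR_DT is already a Date and CHG_WT is numeric; the rows are hypothetical.

```r
# Hypothetical rows spanning three months
test <- data.frame(
  CR_DT  = as.Date(c("2015-02-27", "2015-03-01", "2015-03-15", "2015-04-02")),
  CHG_WT = c(50, 100, 200, 400)
)

# format(..., "%m") yields the two-digit month; list() wraps the grouping
# variable because aggregate() expects its 'by' argument to be a list
monthly <- aggregate(test$CHG_WT,
                     list(month = format(test$CR_DT, "%m")),
                     FUN = sum)
monthly
```

If the data covered more than one year, grouping on format(test$CR_DT, "%Y-%m") would keep the same month of different years apart.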
Thank you, John, for spending time on this query and helping out. It really helped me, and I am finally able to achieve the desired results.

Thanks a ton to all the others as well for spending time and furnishing a solution.

Regards, Shivi