Dear all, I have this set of data. I would like to sum the EnergykWh according date sequences.> head(dt1,20) StationName date time EnergykWh1 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 12:09 4.680496 2 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 19:50 6.272414 3 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 20:22 1.032782 4 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 8:25 11.004884 5 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 14:23 10.096824 6 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 18:17 6.658797 7 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 21:46 4.808874 8 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 10:19 1.469384 9 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 12:12 2.996239 10 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 14:12 0.303222 11 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 16:22 4.988339 12 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:16 8.131804 13 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:19 0.117156 14 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 20:24 3.285669 15 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 9:54 1.175608 16 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 12:16 3.677487 17 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 13:53 1.068393 18 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 19:03 8.820755 19 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 22:00 8.138583 20 PALO ALTO CA / CAMBRIDGE #1 1/18/2016 8:58 9.057500 I have tried this: library(dplyr) sums <- dt1 %>% group_by(date) %>% summarise(EnergykWh = sum(EnergykWh)) head(sums,20) The date is not by daily sequence but by year sequence.> head(sums,20)# A tibble: 20 ? 2date EnergykWh <chr> <dbl> 1 1/1/2017 25.3 2 1/1/2018 61.0 3 1/1/2019 0.627 4 1/1/2020 10.7 5 1/10/2017 69.4 6 1/10/2018 54.5 7 1/10/2019 49.1 8 1/10/2020 45.9 9 1/11/2017 73.9 10 1/11/2018 53.3 11 1/11/2019 93.5 12 1/11/2020 66.7 13 1/12/2017 78.6 14 1/12/2018 42.2 15 1/12/2019 22.7 16 1/12/2020 80.9 17 1/13/2017 85.6 18 1/13/2018 46.4 19 1/13/2019 40.0 20 1/13/2020 121. Thank you very much for any help given. -- *Roslinazairimah Zakaria* *Tel: +609-5492370; Fax. No.+609-5492766* *Email: roslinazairimah at ump.edu.my <roslinazairimah at ump.edu.my>; roslinaump at gmail.com <roslinaump at gmail.com>* Faculty of Industrial Sciences & Technology University Malaysia Pahang Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia [[alternative HTML version deleted]]
date appears to be a character variable, and R is treating it as such. str(dt1) might give you some insight. Or the dplyr equivalent glimpse(dt1) I think R did what you asked, but if you want to be able to order records by date, in temporal order, you need to tell R that it is a date: library(dplyr) sums <- dt1 %>% mutate(realDate = as.Date(date, format = "%m/%d/%Y") %>% group_by(realDate) %>% summarise(EnergykWh = sum(EnergykWh)) --Chris Ryan roslinazairimah zakaria wrote:> Dear all, > > I have this set of data. I would like to sum the EnergykWh according date > sequences. > >> head(dt1,20) StationName date time EnergykWh > 1 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 12:09 4.680496 > 2 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 19:50 6.272414 > 3 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 20:22 1.032782 > 4 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 8:25 11.004884 > 5 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 14:23 10.096824 > 6 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 18:17 6.658797 > 7 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 21:46 4.808874 > 8 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 10:19 1.469384 > 9 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 12:12 2.996239 > 10 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 14:12 0.303222 > 11 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 16:22 4.988339 > 12 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:16 8.131804 > 13 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:19 0.117156 > 14 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 20:24 3.285669 > 15 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 9:54 1.175608 > 16 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 12:16 3.677487 > 17 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 13:53 1.068393 > 18 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 19:03 8.820755 > 19 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 22:00 8.138583 > 20 PALO ALTO CA / CAMBRIDGE #1 1/18/2016 8:58 9.057500 > > I have tried this: > library(dplyr) > sums <- dt1 %>% > group_by(date) %>% > summarise(EnergykWh = sum(EnergykWh)) > > head(sums,20) > > The date is not by daily sequence but by year sequence. > >> head(sums,20)# A tibble: 20 ? 2 > date EnergykWh > <chr> <dbl> 1 1/1/2017 25.3 2 1/1/2018 61.0 3 > 1/1/2019 0.627 4 1/1/2020 10.7 5 1/10/2017 69.4 6 > 1/10/2018 54.5 7 1/10/2019 49.1 8 1/10/2020 45.9 9 > 1/11/2017 73.9 10 1/11/2018 53.3 11 1/11/2019 93.5 12 > 1/11/2020 66.7 13 1/12/2017 78.6 14 1/12/2018 42.2 15 > 1/12/2019 22.7 16 1/12/2020 80.9 17 1/13/2017 85.6 18 > 1/13/2018 46.4 19 1/13/2019 40.0 20 1/13/2020 121. > > > > Thank you very much for any help given. > >
How about send a 'dput' of some sample data. My guess is that your date is 'character' and not 'Date'. Thanks Jim Holtman *Data Munger Guru* *What is the problem that you are trying to solve?Tell me what you want to do, not how you want to do it.* On Thu, Nov 2, 2023 at 4:24?PM roslinazairimah zakaria <roslinaump at gmail.com> wrote:> Dear all, > > I have this set of data. I would like to sum the EnergykWh according date > sequences. > > > head(dt1,20) StationName date time EnergykWh > 1 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 12:09 4.680496 > 2 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 19:50 6.272414 > 3 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 20:22 1.032782 > 4 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 8:25 11.004884 > 5 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 14:23 10.096824 > 6 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 18:17 6.658797 > 7 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 21:46 4.808874 > 8 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 10:19 1.469384 > 9 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 12:12 2.996239 > 10 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 14:12 0.303222 > 11 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 16:22 4.988339 > 12 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:16 8.131804 > 13 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:19 0.117156 > 14 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 20:24 3.285669 > 15 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 9:54 1.175608 > 16 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 12:16 3.677487 > 17 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 13:53 1.068393 > 18 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 19:03 8.820755 > 19 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 22:00 8.138583 > 20 PALO ALTO CA / CAMBRIDGE #1 1/18/2016 8:58 9.057500 > > I have tried this: > library(dplyr) > sums <- dt1 %>% > group_by(date) %>% > summarise(EnergykWh = sum(EnergykWh)) > > head(sums,20) > > The date is not by daily sequence but by year sequence. > > > head(sums,20)# A tibble: 20 ? 2 > date EnergykWh > <chr> <dbl> 1 1/1/2017 25.3 2 1/1/2018 61.0 3 > 1/1/2019 0.627 4 1/1/2020 10.7 5 1/10/2017 69.4 6 > 1/10/2018 54.5 7 1/10/2019 49.1 8 1/10/2020 45.9 9 > 1/11/2017 73.9 10 1/11/2018 53.3 11 1/11/2019 93.5 12 > 1/11/2020 66.7 13 1/12/2017 78.6 14 1/12/2018 42.2 15 > 1/12/2019 22.7 16 1/12/2020 80.9 17 1/13/2017 85.6 18 > 1/13/2018 46.4 19 1/13/2019 40.0 20 1/13/2020 121. > > > > Thank you very much for any help given. > > > -- > *Roslinazairimah Zakaria* > *Tel: +609-5492370; Fax. No.+609-5492766* > > *Email: roslinazairimah at ump.edu.my <roslinazairimah at ump.edu.my>; > roslinaump at gmail.com <roslinaump at gmail.com>* > Faculty of Industrial Sciences & Technology > University Malaysia Pahang > Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >[[alternative HTML version deleted]]
Thank you very much for your help. It is very much appreciated. On Fri, Nov 3, 2023 at 7:23?AM roslinazairimah zakaria <roslinaump at gmail.com> wrote:> Dear all, > > I have this set of data. I would like to sum the EnergykWh according date > sequences. > > > head(dt1,20) StationName date time EnergykWh > 1 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 12:09 4.680496 > 2 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 19:50 6.272414 > 3 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 20:22 1.032782 > 4 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 8:25 11.004884 > 5 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 14:23 10.096824 > 6 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 18:17 6.658797 > 7 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 21:46 4.808874 > 8 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 10:19 1.469384 > 9 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 12:12 2.996239 > 10 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 14:12 0.303222 > 11 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 16:22 4.988339 > 12 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:16 8.131804 > 13 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:19 0.117156 > 14 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 20:24 3.285669 > 15 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 9:54 1.175608 > 16 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 12:16 3.677487 > 17 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 13:53 1.068393 > 18 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 19:03 8.820755 > 19 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 22:00 8.138583 > 20 PALO ALTO CA / CAMBRIDGE #1 1/18/2016 8:58 9.057500 > > I have tried this: > library(dplyr) > sums <- dt1 %>% > group_by(date) %>% > summarise(EnergykWh = sum(EnergykWh)) > > head(sums,20) > > The date is not by daily sequence but by year sequence. > > > head(sums,20)# A tibble: 20 ? 2 > date EnergykWh > <chr> <dbl> 1 1/1/2017 25.3 2 1/1/2018 61.0 3 1/1/2019 0.627 4 1/1/2020 10.7 5 1/10/2017 69.4 6 1/10/2018 54.5 7 1/10/2019 49.1 8 1/10/2020 45.9 9 1/11/2017 73.9 10 1/11/2018 53.3 11 1/11/2019 93.5 12 1/11/2020 66.7 13 1/12/2017 78.6 14 1/12/2018 42.2 15 1/12/2019 22.7 16 1/12/2020 80.9 17 1/13/2017 85.6 18 1/13/2018 46.4 19 1/13/2019 40.0 20 1/13/2020 121. > > > > Thank you very much for any help given. > > > -- > *Roslinazairimah Zakaria* > *Tel: +609-5492370; Fax. No.+609-5492766* > > *Email: roslinazairimah at ump.edu.my <roslinazairimah at ump.edu.my>; > roslinaump at gmail.com <roslinaump at gmail.com>* > Faculty of Industrial Sciences & Technology > University Malaysia Pahang > Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia >-- *Roslinazairimah Zakaria* *Tel: +609-5492370; Fax. No.+609-5492766* *Email: roslinazairimah at ump.edu.my <roslinazairimah at ump.edu.my>; roslinaump at gmail.com <roslinaump at gmail.com>* Faculty of Industrial Sciences & Technology University Malaysia Pahang Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia [[alternative HTML version deleted]]
Hi, I tried this: # extract date from the time stamp dt1 <- cbind(as.Date(dt$EndDate, format="%m/%d/%Y"), dt$EnergykWh) head(dt1) colnames(dt1) <- c("date", "EnergykWh") and my dt1 becomes these, the dates are replace by numbers. dt1 <- cbind(as.Date(dt$EndDate, format="%m/%d/%Y"), dt$EnergykWh) dput(head(dt1)) colnames(dt1) <- c("date", "EnergykWh") dput(head(dt1))> dput(head(dt1))structure(c(16814, 16814, 16814, 16815, 16815, 16815, 4.680496,6.272414, 1.032782, 11.004884, 10.096824, 6.658797), dim = c(6L, 2L), dimnames = list(NULL, c("date", "EnergykWh"))) Then I tried this: library(dplyr) dt1 %>% group_by(date) %>% summarise(EnergykWh.sum = sum(EnergykWh)) and got this errors dt1 %>%+ group_by(date) %>%+ summarise(EnergykWh.sum sum(EnergykWh))Error in UseMethod("group_by") : no applicable method for 'group_by' applied to an object of class "c('matrix', 'array', 'double', 'numeric')" On Fri, Nov 3, 2023 at 7:23?AM roslinazairimah zakaria <roslinaump at gmail.com> wrote:> Dear all, > > I have this set of data. I would like to sum the EnergykWh according date > sequences. > > > head(dt1,20) StationName date time EnergykWh > 1 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 12:09 4.680496 > 2 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 19:50 6.272414 > 3 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 20:22 1.032782 > 4 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 8:25 11.004884 > 5 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 14:23 10.096824 > 6 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 18:17 6.658797 > 7 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 21:46 4.808874 > 8 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 10:19 1.469384 > 9 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 12:12 2.996239 > 10 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 14:12 0.303222 > 11 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 16:22 4.988339 > 12 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:16 8.131804 > 13 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:19 0.117156 > 14 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 20:24 3.285669 > 15 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 9:54 1.175608 > 16 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 12:16 3.677487 > 17 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 13:53 1.068393 > 18 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 19:03 8.820755 > 19 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 22:00 8.138583 > 20 PALO ALTO CA / CAMBRIDGE #1 1/18/2016 8:58 9.057500 > > I have tried this: > library(dplyr) > sums <- dt1 %>% > group_by(date) %>% > summarise(EnergykWh = sum(EnergykWh)) > > head(sums,20) > > The date is not by daily sequence but by year sequence. > > > head(sums,20)# A tibble: 20 ? 2 > date EnergykWh > <chr> <dbl> 1 1/1/2017 25.3 2 1/1/2018 61.0 3 1/1/2019 0.627 4 1/1/2020 10.7 5 1/10/2017 69.4 6 1/10/2018 54.5 7 1/10/2019 49.1 8 1/10/2020 45.9 9 1/11/2017 73.9 10 1/11/2018 53.3 11 1/11/2019 93.5 12 1/11/2020 66.7 13 1/12/2017 78.6 14 1/12/2018 42.2 15 1/12/2019 22.7 16 1/12/2020 80.9 17 1/13/2017 85.6 18 1/13/2018 46.4 19 1/13/2019 40.0 20 1/13/2020 121. > > > > Thank you very much for any help given. > > > -- > *Roslinazairimah Zakaria* > *Tel: +609-5492370; Fax. No.+609-5492766* > > *Email: roslinazairimah at ump.edu.my <roslinazairimah at ump.edu.my>; > roslinaump at gmail.com <roslinaump at gmail.com>* > Faculty of Industrial Sciences & Technology > University Malaysia Pahang > Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia >-- *Roslinazairimah Zakaria* *Tel: +609-5492370; Fax. No.+609-5492766* *Email: roslinazairimah at ump.edu.my <roslinazairimah at ump.edu.my>; roslinaump at gmail.com <roslinaump at gmail.com>* Faculty of Industrial Sciences & Technology University Malaysia Pahang Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia [[alternative HTML version deleted]]