Hi, I tried this: # extract date from the time stamp dt1 <- cbind(as.Date(dt$EndDate, format="%m/%d/%Y"), dt$EnergykWh) head(dt1) colnames(dt1) <- c("date", "EnergykWh") and my dt1 becomes these, the dates are replace by numbers. dt1 <- cbind(as.Date(dt$EndDate, format="%m/%d/%Y"), dt$EnergykWh) dput(head(dt1)) colnames(dt1) <- c("date", "EnergykWh") dput(head(dt1))> dput(head(dt1))structure(c(16814, 16814, 16814, 16815, 16815, 16815, 4.680496,6.272414, 1.032782, 11.004884, 10.096824, 6.658797), dim = c(6L, 2L), dimnames = list(NULL, c("date", "EnergykWh"))) Then I tried this: library(dplyr) dt1 %>% group_by(date) %>% summarise(EnergykWh.sum = sum(EnergykWh)) and got this errors dt1 %>%+ group_by(date) %>%+ summarise(EnergykWh.sum sum(EnergykWh))Error in UseMethod("group_by") : no applicable method for 'group_by' applied to an object of class "c('matrix', 'array', 'double', 'numeric')" On Fri, Nov 3, 2023 at 7:23?AM roslinazairimah zakaria <roslinaump at gmail.com> wrote:> Dear all, > > I have this set of data. I would like to sum the EnergykWh according date > sequences. > > > head(dt1,20) StationName date time EnergykWh > 1 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 12:09 4.680496 > 2 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 19:50 6.272414 > 3 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 20:22 1.032782 > 4 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 8:25 11.004884 > 5 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 14:23 10.096824 > 6 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 18:17 6.658797 > 7 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 21:46 4.808874 > 8 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 10:19 1.469384 > 9 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 12:12 2.996239 > 10 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 14:12 0.303222 > 11 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 16:22 4.988339 > 12 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:16 8.131804 > 13 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:19 0.117156 > 14 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 20:24 3.285669 > 15 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 9:54 1.175608 > 16 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 12:16 3.677487 > 17 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 13:53 1.068393 > 18 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 19:03 8.820755 > 19 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 22:00 8.138583 > 20 PALO ALTO CA / CAMBRIDGE #1 1/18/2016 8:58 9.057500 > > I have tried this: > library(dplyr) > sums <- dt1 %>% > group_by(date) %>% > summarise(EnergykWh = sum(EnergykWh)) > > head(sums,20) > > The date is not by daily sequence but by year sequence. > > > head(sums,20)# A tibble: 20 ? 2 > date EnergykWh > <chr> <dbl> 1 1/1/2017 25.3 2 1/1/2018 61.0 3 1/1/2019 0.627 4 1/1/2020 10.7 5 1/10/2017 69.4 6 1/10/2018 54.5 7 1/10/2019 49.1 8 1/10/2020 45.9 9 1/11/2017 73.9 10 1/11/2018 53.3 11 1/11/2019 93.5 12 1/11/2020 66.7 13 1/12/2017 78.6 14 1/12/2018 42.2 15 1/12/2019 22.7 16 1/12/2020 80.9 17 1/13/2017 85.6 18 1/13/2018 46.4 19 1/13/2019 40.0 20 1/13/2020 121. > > > > Thank you very much for any help given. > > > -- > *Roslinazairimah Zakaria* > *Tel: +609-5492370; Fax. No.+609-5492766* > > *Email: roslinazairimah at ump.edu.my <roslinazairimah at ump.edu.my>; > roslinaump at gmail.com <roslinaump at gmail.com>* > Faculty of Industrial Sciences & Technology > University Malaysia Pahang > Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia >-- *Roslinazairimah Zakaria* *Tel: +609-5492370; Fax. No.+609-5492766* *Email: roslinazairimah at ump.edu.my <roslinazairimah at ump.edu.my>; roslinaump at gmail.com <roslinaump at gmail.com>* Faculty of Industrial Sciences & Technology University Malaysia Pahang Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia [[alternative HTML version deleted]]
Cbind is not a very good tool for adding columns to a data frame. Either use explicit column referencing like dt1$x <- new_data_vector but you do have to make sure the new data vector has the same number of values and in the same order as the other data in dt1. The transition from multiple records per day to one record per day would make the starting data and ending data incompatible. library(dplyr) dt1 <- structure(list(StationName = c("PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1"), date = c("1/14/2016", "1/14/2016", "1/14/2016", "1/15/2016", "1/15/2016", "1/15/2016", "1/15/2016", "1/16/2016", "1/16/2016", "1/16/2016", "1/16/2016", "1/16/2016", "1/16/2016", "1/16/2016", "1/17/2016", "1/17/2016", "1/17/2016", "1/17/2016", "1/17/2016", "1/18/2016"), time = c("12:09", "19:50", "20:22", "8:25", "14:23", "18:17", "21:46", "10:19", "12:12", "14:12", "16:22", "19:16", "19:19", "20:24", "9:54", "12:16", "13:53", "19:03", "22:00", "8:58"), EnergykWh = c(4.680496, 6.272414, 1.032782, 11.004884, 10.096824, 6.658797, 4.808874, 1.469384, 2.996239, 0.303222, 4.988339, 8.131804, 0.117156, 3.285669, 1.175608, 3.677487, 1.068393, 8.820755, 8.138583, 9.0575)), row.names = c(NA, 20L), class = "data.frame") ans <- ( dt1 %>% mutate( Dt = as.Date( date, format = "%m/%d/%Y" ) ) %>% group_by( StationName, Dt ) %>% summarize( DailyEnergykWh = sum( EnergykWh ) ) ) On November 3, 2023 2:51:20 AM PDT, roslinazairimah zakaria <roslinaump at gmail.com> wrote:>Hi, >I tried this: ># extract date from the time stamp >dt1 <- cbind(as.Date(dt$EndDate, format="%m/%d/%Y"), dt$EnergykWh) >head(dt1) >colnames(dt1) <- c("date", "EnergykWh") >and >my dt1 becomes these, the dates are replace by numbers. > >dt1 <- cbind(as.Date(dt$EndDate, format="%m/%d/%Y"), dt$EnergykWh) >dput(head(dt1)) >colnames(dt1) <- c("date", "EnergykWh") >dput(head(dt1)) > > >> dput(head(dt1))structure(c(16814, 16814, 16814, 16815, 16815, 16815, 4.680496, >6.272414, 1.032782, 11.004884, 10.096824, 6.658797), dim = c(6L, >2L), dimnames = list(NULL, c("date", "EnergykWh"))) > >Then I tried this: >library(dplyr) >dt1 %>% > group_by(date) %>% > summarise(EnergykWh.sum = sum(EnergykWh)) >and got this errors > >dt1 %>%+ group_by(date) %>%+ summarise(EnergykWh.sum >sum(EnergykWh))Error in UseMethod("group_by") : > no applicable method for 'group_by' applied to an object of class >"c('matrix', 'array', 'double', 'numeric')" > > > >On Fri, Nov 3, 2023 at 7:23?AM roslinazairimah zakaria <roslinaump at gmail.com> >wrote: > >> Dear all, >> >> I have this set of data. I would like to sum the EnergykWh according date >> sequences. >> >> > head(dt1,20) StationName date time EnergykWh >> 1 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 12:09 4.680496 >> 2 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 19:50 6.272414 >> 3 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 20:22 1.032782 >> 4 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 8:25 11.004884 >> 5 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 14:23 10.096824 >> 6 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 18:17 6.658797 >> 7 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 21:46 4.808874 >> 8 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 10:19 1.469384 >> 9 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 12:12 2.996239 >> 10 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 14:12 0.303222 >> 11 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 16:22 4.988339 >> 12 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:16 8.131804 >> 13 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:19 0.117156 >> 14 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 20:24 3.285669 >> 15 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 9:54 1.175608 >> 16 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 12:16 3.677487 >> 17 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 13:53 1.068393 >> 18 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 19:03 8.820755 >> 19 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 22:00 8.138583 >> 20 PALO ALTO CA / CAMBRIDGE #1 1/18/2016 8:58 9.057500 >> >> I have tried this: >> library(dplyr) >> sums <- dt1 %>% >> group_by(date) %>% >> summarise(EnergykWh = sum(EnergykWh)) >> >> head(sums,20) >> >> The date is not by daily sequence but by year sequence. >> >> > head(sums,20)# A tibble: 20 ? 2 >> date EnergykWh >> <chr> <dbl> 1 1/1/2017 25.3 2 1/1/2018 61.0 3 1/1/2019 0.627 4 1/1/2020 10.7 5 1/10/2017 69.4 6 1/10/2018 54.5 7 1/10/2019 49.1 8 1/10/2020 45.9 9 1/11/2017 73.9 10 1/11/2018 53.3 11 1/11/2019 93.5 12 1/11/2020 66.7 13 1/12/2017 78.6 14 1/12/2018 42.2 15 1/12/2019 22.7 16 1/12/2020 80.9 17 1/13/2017 85.6 18 1/13/2018 46.4 19 1/13/2019 40.0 20 1/13/2020 121. >> >> >> >> Thank you very much for any help given. >> >> >> -- >> *Roslinazairimah Zakaria* >> *Tel: +609-5492370; Fax. No.+609-5492766* >> >> *Email: roslinazairimah at ump.edu.my <roslinazairimah at ump.edu.my>; >> roslinaump at gmail.com <roslinaump at gmail.com>* >> Faculty of Industrial Sciences & Technology >> University Malaysia Pahang >> Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia >> > >-- Sent from my phone. Please excuse my brevity.
Is this what you are after? library(tidyverse) library(lubridate) input <- structure(list(StationName = c("PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1"), date = c("1/14/2016", "1/14/2016", "1/14/2016", "1/15/2016", "1/15/2016", "1/15/2016", "1/15/2016", "1/16/2016", "1/16/2016", "1/16/2016", "1/16/2016", "1/16/2016", "1/16/2016", "1/16/2016", "1/17/2016", "1/17/2016", "1/17/2016", "1/17/2016", "1/17/2016", "1/18/2016"), time = c("12:09", "19:50", "20:22", "8:25", "14:23", "18:17", "21:46", "10:19", "12:12", "14:12", "16:22", "19:16", "19:19", "20:24", "9:54", "12:16", "13:53", "19:03", "22:00", "8:58"), EnergykWh = c(4.680496, 6.272414, 1.032782, 11.004884, 10.096824, 6.658797, 4.808874, 1.469384, 2.996239, 0.303222, 4.988339, 8.131804, 0.117156, 3.285669, 1.175608, 3.677487, 1.068393, 8.820755, 8.138583, 9.0575)), row.names = c(NA, 20L), class = "data.frame") # convert date from character to Date byDate <- input |> mutate(newdate = mdy(date)) |> group_by(newdate) |> summarise(total = sum(EnergykWh)) byDate ## # A tibble: 5 ? 2 ## newdate total ## <date> <dbl> ## 1 2016-01-14 12.0 ## 2 2016-01-15 32.6 ## 3 2016-01-16 21.3 ## 4 2016-01-17 22.9 ## 5 2016-01-18 9.06 Thanks Jim Holtman *Data Munger Guru* *What is the problem that you are trying to solve?Tell me what you want to do, not how you want to do it.* On Fri, Nov 3, 2023 at 2:51?AM roslinazairimah zakaria <roslinaump at gmail.com> wrote:> Hi, > I tried this: > # extract date from the time stamp > dt1 <- cbind(as.Date(dt$EndDate, format="%m/%d/%Y"), dt$EnergykWh) > head(dt1) > colnames(dt1) <- c("date", "EnergykWh") > and > my dt1 becomes these, the dates are replace by numbers. > > dt1 <- cbind(as.Date(dt$EndDate, format="%m/%d/%Y"), dt$EnergykWh) > dput(head(dt1)) > colnames(dt1) <- c("date", "EnergykWh") > dput(head(dt1)) > > > > dput(head(dt1))structure(c(16814, 16814, 16814, 16815, 16815, 16815, > 4.680496, > 6.272414, 1.032782, 11.004884, 10.096824, 6.658797), dim = c(6L, > 2L), dimnames = list(NULL, c("date", "EnergykWh"))) > > Then I tried this: > library(dplyr) > dt1 %>% > group_by(date) %>% > summarise(EnergykWh.sum = sum(EnergykWh)) > and got this errors > > dt1 %>%+ group_by(date) %>%+ summarise(EnergykWh.sum > sum(EnergykWh))Error in UseMethod("group_by") : > no applicable method for 'group_by' applied to an object of class > "c('matrix', 'array', 'double', 'numeric')" > > > > On Fri, Nov 3, 2023 at 7:23?AM roslinazairimah zakaria < > roslinaump at gmail.com> > wrote: > > > Dear all, > > > > I have this set of data. I would like to sum the EnergykWh according date > > sequences. > > > > > head(dt1,20) StationName date time EnergykWh > > 1 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 12:09 4.680496 > > 2 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 19:50 6.272414 > > 3 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 20:22 1.032782 > > 4 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 8:25 11.004884 > > 5 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 14:23 10.096824 > > 6 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 18:17 6.658797 > > 7 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 21:46 4.808874 > > 8 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 10:19 1.469384 > > 9 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 12:12 2.996239 > > 10 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 14:12 0.303222 > > 11 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 16:22 4.988339 > > 12 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:16 8.131804 > > 13 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:19 0.117156 > > 14 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 20:24 3.285669 > > 15 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 9:54 1.175608 > > 16 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 12:16 3.677487 > > 17 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 13:53 1.068393 > > 18 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 19:03 8.820755 > > 19 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 22:00 8.138583 > > 20 PALO ALTO CA / CAMBRIDGE #1 1/18/2016 8:58 9.057500 > > > > I have tried this: > > library(dplyr) > > sums <- dt1 %>% > > group_by(date) %>% > > summarise(EnergykWh = sum(EnergykWh)) > > > > head(sums,20) > > > > The date is not by daily sequence but by year sequence. > > > > > head(sums,20)# A tibble: 20 ? 2 > > date EnergykWh > > <chr> <dbl> 1 1/1/2017 25.3 2 1/1/2018 61.0 3 > 1/1/2019 0.627 4 1/1/2020 10.7 5 1/10/2017 69.4 6 > 1/10/2018 54.5 7 1/10/2019 49.1 8 1/10/2020 45.9 9 > 1/11/2017 73.9 10 1/11/2018 53.3 11 1/11/2019 93.5 12 > 1/11/2020 66.7 13 1/12/2017 78.6 14 1/12/2018 42.2 15 > 1/12/2019 22.7 16 1/12/2020 80.9 17 1/13/2017 85.6 18 > 1/13/2018 46.4 19 1/13/2019 40.0 20 1/13/2020 121. > > > > > > > > Thank you very much for any help given. > > > > > > -- > > *Roslinazairimah Zakaria* > > *Tel: +609-5492370; Fax. No.+609-5492766* > > > > *Email: roslinazairimah at ump.edu.my <roslinazairimah at ump.edu.my>; > > roslinaump at gmail.com <roslinaump at gmail.com>* > > Faculty of Industrial Sciences & Technology > > University Malaysia Pahang > > Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia > > > > > -- > *Roslinazairimah Zakaria* > *Tel: +609-5492370; Fax. No.+609-5492766* > > *Email: roslinazairimah at ump.edu.my <roslinazairimah at ump.edu.my>; > roslinaump at gmail.com <roslinaump at gmail.com>* > Faculty of Industrial Sciences & Technology > University Malaysia Pahang > Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >[[alternative HTML version deleted]]