Stefano Sofia
2016-Jun-10 10:45 UTC
[R] create an empty data frame and then fill in it (and then evaluate the mean of semi-hourly data for each day)
Thank you for your answer. Very clear.
(I don't like the second solution either.)
Let me then ask a final question.
From an initial data frame with semi-hourly data (df_snow, with two columns, data_POSIX of type "POSIXct" "POSIXt" and snow of type "numeric"), I need to evaluate the mean of snow for each day.

    data_POSIX           snow
    2004-11-01 00:00:00  50
    2004-11-01 00:30:00  55
    2004-11-01 01:00:00  60
    ...

I first created a new column of type "Date"

    df_snow$day <- as.Date(df_snow$data_POSIX,"%Y-%m-%d")

then I created a new data frame called df_snow_day to store the mean of data grouped by day:

    list_days <- unique(df_snow$day)
    df_snow_day <- data.frame(day=list_days)

Finally I applied lapply in this way:

    df_snow_day$snow <- lapply(df_snow_day$day, function(x) round(mean(df_snow$snow[df_snow$day == x], na.rm=T)))

This does not work. I do not understand why the class of df_snow_day$snow is of type list either:

         day  snow
    NA   <NA> NULL
    NA.1 <NA> NULL
    NA.2 <NA> NULL

Where is my mistake?

Thank you for all your help
Stefano

_____________________________________________
From: Duncan Murdoch [murdoch.duncan at gmail.com]
Sent: Thursday, 9 June 2016 12:36
To: Stefano Sofia; r-help at r-project.org
Subject: Re: [R] create an empty data frame and then fill in it

On 09/06/2016 6:22 AM, Stefano Sofia wrote:
> Dear R list users,
> sorry for this simple question, but I already spent many efforts to solve it.
>
> I create an empty data frame called df_year like
>
> df_year <- data.frame(day=as.Date(character()), hs_MteBove=integer(), hs_MtePrata=integer(), hs_Pintura=integer(), hs_Pizzo=integer(), hs_Sassotetto=integer(), hs_Sibilla=integer(), stringsAsFactors=FALSE)
>
> and then I start to fill in it with
>
> df_year$day <- seq(as.Date("2004-11-01-00-00","%Y-%m-%d"), as.Date("2005-05-01-00-00","%Y-%m-%d"), by="day")
>
> but I get the following error:
> "replacement has 182 rows, data has 0"
>
> Where is my silly mistake?

Your dataframe has 0 rows, so you can't put a 182 row vector into the first column.

Unlike vectors, dataframes won't grow if you make assignments beyond the end of the rows.

There are at least a couple of solutions:

1. Don't create columns until you have data ready for them. You can wait to create the dataframe until your "day" column is ready:

    df_year <- data.frame(day = seq(...))

As you compute other columns of the same length, you can add them, e.g.

    df_year$hs_MteBove <- ...

2. Create your columns with the right length from the beginning:

    df_year <- data.frame(day = rep(as.Date(NA), 182), ...)

I don't like this solution as much.

Duncan Murdoch

________________________________
IMPORTANT NOTICE: This e-mail message is intended to be received only by persons entitled to receive the confidential information it may contain. E-mail messages to clients of Regione Marche may contain information that is confidential and legally privileged. Please do not read, copy, forward, or store this message unless you are an intended recipient of it. If you have received this message in error, please forward it to the sender and delete it completely from your computer system.
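A minimal sketch of Duncan's first suggestion, for readers who want to try it. The date range and column name come from the thread; the NA placeholder values, the `empty` data frame, and the `res` variable are assumptions added for illustration, since no measurements appear in the original messages.

```r
# Build the data frame from the column that is already complete.
day <- seq(as.Date("2004-11-01"), as.Date("2005-05-01"), by = "day")
df_year <- data.frame(day = day)

# Add further columns of the same length as they become available.
# NA_integer_ is only a placeholder (assumption): the thread shows no real values.
df_year$hs_MteBove <- rep(NA_integer_, nrow(df_year))

str(df_year)

# For contrast: assigning a longer vector into a zero-row data frame fails.
empty <- data.frame(day = as.Date(character()))
res <- try(empty$day <- day, silent = TRUE)
inherits(res, "try-error")
```

Creating the data frame from `day` means its row count is fixed by real data, so later columns only need to match `nrow(df_year)`.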
PIKAL Petr
2016-Jun-10 10:56 UTC
[R] create an empty data frame and then fill in it (and then evaluate the mean of semi-hourly data for each day)
Hi Sofia,

    df_snow_day <- aggregate(df_snow$snow, list(df_snow$day), mean, na.rm=TRUE)

should automagically give you the required data frame.

Regards
Petr

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
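A small self-contained sketch of the aggregate() approach, under stated assumptions: the first three snow values are the ones quoted in the thread, while the second day and its values are invented so that there is more than one group. It passes a named list to `by` so the result columns are called `day` and `snow`; the unnamed form in the message above produces `Group.1` and `x` instead.

```r
# Three half-hourly readings from the thread plus two invented ones (assumption)
# for a second day, including a missing value.
df_snow <- data.frame(
  day  = as.Date(c("2004-11-01", "2004-11-01", "2004-11-01",
                   "2004-11-02", "2004-11-02")),
  snow = c(50, 55, 60, 70, NA)
)

# One row per day; na.rm = TRUE is forwarded to mean().
df_snow_day <- aggregate(df_snow["snow"],
                         by = list(day = df_snow$day),
                         FUN = mean, na.rm = TRUE)
df_snow_day
```

The result is an ordinary data frame with one numeric column of daily means, so no separate step is needed to pre-create `df_snow_day`.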
Duncan Murdoch
2016-Jun-10 11:03 UTC
[R] create an empty data frame and then fill in it (and then evaluate the mean of semi-hourly data for each day)
On 10/06/2016 6:45 AM, Stefano Sofia wrote:
> Finally I applied lapply in this way:
> df_snow_day$snow <- lapply(df_snow_day$day, function(x) round(mean(df_snow$snow[df_snow$day == x], na.rm=T)))
>
> This does not work. I do not understand why the class of df_snow_day$snow is of type list either:

lapply() returns a list. Petr's solution is probably better, but you could likely get what you want using vapply() instead:

    df_snow_day$snow <- vapply(df_snow_day$day, function(x) round(mean(df_snow$snow[df_snow$day == x], na.rm=T)), 0)

The 0 at the end is an example of the numeric function result you want, so that vapply() knows to create a numeric vector.

Duncan Murdoch
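To make the lapply()/vapply() difference concrete, here is a minimal sketch. The data values for the second day and the helper name `daily_mean` are assumptions for illustration; `numeric(1)` plays the same role as the `0` template in the message above.

```r
# Toy data: first day uses the values from the thread, second day is invented.
df_snow <- data.frame(
  day  = as.Date(c("2004-11-01", "2004-11-01", "2004-11-01",
                   "2004-11-02", "2004-11-02")),
  snow = c(50, 55, 60, 70, 72)
)
days <- unique(df_snow$day)

# Hypothetical helper: rounded mean of all readings on day d.
daily_mean <- function(d) {
  round(mean(df_snow$snow[df_snow$day == d], na.rm = TRUE))
}

as_list   <- lapply(days, daily_mean)              # always a list
as_vector <- vapply(days, daily_mean, numeric(1))  # plain numeric vector

class(as_list)
class(as_vector)

# A numeric vector is what a data frame column normally holds.
df_snow_day <- data.frame(day = days, snow = as_vector)
df_snow_day
```

vapply() also checks that every result has the declared type and length, so a function that unexpectedly returns something else stops with an error instead of silently producing a list column.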
William Dunlap
2016-Jun-10 15:01 UTC
[R] create an empty data frame and then fill in it (and then evaluate the mean of semi-hourly data for each day)
> Finally I applied lapply in this way:
> df_snow_day$snow <- lapply(df_snow_day$day, function(x) round(mean(df_snow$snow[df_snow$day == x], na.rm=T)))
>
> This does not work. I do not understand why the class of df_snow_day$snow is of type list either:

lapply()'s output is always a list.

> I first created a new column of type "Date"
> df_snow$day <- as.Date(df_snow$data_POSIX,"%Y-%m-%d")

If 'data_POSIX' is of class "POSIXct" that line gives a warning because the second argument to as.Date.POSIXct is the time zone ('tz'). Perhaps your data_POSIX column is really character. I made my df_snow as follows:

    txt <- c("data_POSIX\tsnow", "2004-11-01 00:00:00\t50", "2004-11-01 00:30:00\t55", "2004-11-01 01:00:00\t60")
    df_snow <- read.table(sep="\t", text=txt, header=TRUE, colClasses=c("POSIXct","numeric"))
    str(df_snow)
    'data.frame': 3 obs. of 2 variables:
     $ data_POSIX: POSIXct, format: "2004-11-01 00:00:00" ...
     $ snow      : num 50 55 60

and as.Date gave:

    > as.Date(df_snow$data_POSIX,"%Y-%m-%d")
    [1] "2004-11-01" "2004-11-01" "2004-11-01"
    Warning message:
    In as.POSIXlt.POSIXct(x, tz = tz) : unknown timezone '%Y-%m-%d'

Also, converting POSIXct objects to Date objects is usually the wrong thing to do, as the time zone in the POSIXct object is ignored (I think UTC is assumed):

    > ct <- as.POSIXct(sprintf("2016-%02d-%02d %02d:%02d", 2:5, 22:25, 15:18, 45:48), tz="US/Pacific")
    > data.frame(ct,as.Date(ct)) # note day-of-month mismatches
                       ct as.Date.ct.
    1 2016-02-22 15:45:00  2016-02-22
    2 2016-03-23 16:46:00  2016-03-23
    3 2016-04-24 17:47:00  2016-04-25
    4 2016-05-25 18:48:00  2016-05-26

You can convert to a POSIXlt object and pull out the day-of-month or day-of-year:

    > as.POSIXlt(ct)$mday
    [1] 22 23 24 25
    > as.POSIXlt(ct)$yday
    [1] 52 82 114 145

I can never remember which helper functions are available for this sort of thing. Many people like the ones in the lubridate package.
Bill Dunlap
TIBCO Software
wdunlap tibco.com
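One way to avoid the day mismatch described above is to pass the time zone explicitly when converting, then group on the resulting local date. This is only a sketch under assumptions: the zone name "America/Los_Angeles" (used here in place of "US/Pacific"), the two timestamps, and the snow values are chosen for illustration and are not from the original data.

```r
# Two readings on the same local calendar day in the assumed time zone.
tz_local <- "America/Los_Angeles"
df <- data.frame(
  ct   = as.POSIXct(c("2016-04-24 16:30", "2016-04-24 17:47"), tz = tz_local),
  snow = c(50, 60)
)

# Supplying tz makes as.Date() use the local calendar day
# rather than the method's default zone.
df$day <- as.Date(df$ct, tz = tz_local)

# Daily mean grouped on the local date.
daily <- aggregate(df["snow"], by = list(day = df$day), FUN = mean)
daily
```

Both readings fall on 24 April local time, so they end up in a single group; converting without `tz` could place the later reading on the following day, as the table in the message above shows.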