John Fox
2022-Jun-03 00:26 UTC
[R] rbind of multiple data frames by column name, when each data frame can contain different columns
Dear Stefano,
I don't believe that your question has been answered.
You can use merge(), twice:
------- snip --------
> merge(merge(df1, df2, all=TRUE), df3, all=TRUE)
data_POSIX Station2_Hs Station2_Hn Station2_flag Station3_Hs Station3_Hn
1 2000-12-01 20 0 0 NA NA
2 2000-12-02 20 0 0 NA NA
3 2000-12-03 30 10 0 NA NA
4 2000-12-04 30 0 1 NA NA
5 2000-12-05 0 5 0 NA NA
6 2001-12-01 NA NA NA 20 0
7 2001-12-02 NA NA NA 20 0
8 2001-12-03 NA NA NA 30 10
9 2001-12-04 NA NA NA 30 0
10 2001-12-05 NA NA NA 0 5
11 2002-12-01 50 20 0 20 0
12 2002-12-02 60 20 0 20 0
13 2002-12-03 70 20 0 30 10
14 2002-12-04 NA NA NA 30 0
15 2002-12-05 NA NA NA 0 5
Station3_flag Station1_Hs Station1_Hn Station1_flag
1 NA 30 10 0
2 NA 40 20 0
3 NA 50 10 0
4 NA NA NA NA
5 NA 55 5 0
6 0 50 20 0
7 0 60 20 0
8 0 70 20 0
9 1 NA NA NA
10 0 NA NA NA
11 0 NA NA NA
12 0 NA NA NA
13 0 NA NA NA
14 1 NA NA NA
15 0 NA NA NA
------- snip --------
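With one data frame per season from 2000 to 2022, nesting merge() calls by hand becomes unwieldy. A minimal sketch of the same idea applied to a list with Reduce() follows; the small data frames d1, d2, d3, the column name "day" and the object name "merged" are made up for illustration and are not your data:

```r
# Small stand-ins for the seasonal data frames (illustrative values only).
d1 <- data.frame(day = as.Date("2000-12-01") + 0:1,
                 Station1_Hs = c(30, 40), Station2_Hs = c(20, 20))
d2 <- data.frame(day = as.Date("2001-12-01") + 0:1,
                 Station1_Hs = c(50, 60), Station3_Hs = c(20, 30))
d3 <- data.frame(day = as.Date("2002-12-01") + 0:1,
                 Station2_Hs = c(50, 60), Station3_Hs = c(20, 20))
df_list <- list(d1, d2, d3)

# Fold the list with a full outer merge. By default merge() joins on all
# columns that the two data frames have in common.
merged <- Reduce(function(x, y) merge(x, y, all = TRUE), df_list)
print(merged)
```

Because merge() matches on every shared column, this behaves like a row-bind only when the data frames have no dates in common, as in your example.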
The columns aren't in the order that you specified but, if that's
important, you can simply reorder them.
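One possible way to reorder, sketched under the assumption that the station names and the Hs/Hn/flag suffixes are known in advance. The one-row data frame below only mimics the column order of the merge() result (values copied from its first row), and the object names are illustrative:

```r
# Stand-in for the merged result: same column order as the merge() output.
merged <- data.frame(
  data_POSIX = as.Date("2000-12-01"),
  Station2_Hs = 20, Station2_Hn = 0, Station2_flag = 0,
  Station3_Hs = NA, Station3_Hn = NA, Station3_flag = NA,
  Station1_Hs = 30, Station1_Hn = 10, Station1_flag = 0
)

# Build the desired column order: date first, then Hs, Hn, flag per station.
stations <- c("Station1", "Station2", "Station3")
vars <- c("Hs", "Hn", "flag")
wanted <- c("data_POSIX",
            paste(rep(stations, each = length(vars)), vars, sep = "_"))

# intersect() keeps the order of 'wanted' and skips any column that is absent.
merged <- merged[, intersect(wanted, names(merged))]
print(names(merged))
```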
I hope this helps,
John
--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/
On 2022-06-02 2:12 a.m., Stefano Sofia wrote:
> Dear R-list users,
>
> for each winter season from 2000 to 2022 I have a data frame collecting, for
> different weather stations, snowpack height (Hs), snowfall in the last 24h (Hn)
> and a validation flag.
>
> Suppose I have the following three data frames:
>
>
> df1 <- data.frame(
>   data_POSIX = seq(as.POSIXct("2000-12-01", format="%Y-%m-%d", tz="Etc/GMT-1"),
>                    as.POSIXct("2000-12-05", format="%Y-%m-%d", tz="Etc/GMT-1"),
>                    by="1 days"),
>   Station1_Hs = c(30, 40, 50, NA, 55),
>   Station1_Hn = c(10, 20, 10, NA, 5),
>   Station1_flag = c(0, 0, 0, NA, 0),
>   Station2_Hs = c(20, 20, 30, 30, 0),
>   Station2_Hn = c(0, 0, 10, 0, 5),
>   Station2_flag = c(0, 0, 0, 1, 0))
>
>
> df2 <- data.frame(
>   data_POSIX = seq(as.POSIXct("2001-12-01", format="%Y-%m-%d", tz="Etc/GMT-1"),
>                    as.POSIXct("2001-12-05", format="%Y-%m-%d", tz="Etc/GMT-1"),
>                    by="1 days"),
>   Station1_Hs = c(50, 60, 70, NA, NA),
>   Station1_Hn = c(20, 20, 20, NA, NA),
>   Station1_flag = c(0, 0, 0, NA, NA),
>   Station3_Hs = c(20, 20, 30, 30, 0),
>   Station3_Hn = c(0, 0, 10, 0, 5),
>   Station3_flag = c(0, 0, 0, 1, 0))
>
>
> df3 <- data.frame(
>   data_POSIX = seq(as.POSIXct("2002-12-01", format="%Y-%m-%d", tz="Etc/GMT-1"),
>                    as.POSIXct("2002-12-05", format="%Y-%m-%d", tz="Etc/GMT-1"),
>                    by="1 days"),
>   Station2_Hs = c(50, 60, 70, NA, NA),
>   Station2_Hn = c(20, 20, 20, NA, NA),
>   Station2_flag = c(0, 0, 0, NA, NA),
>   Station3_Hs = c(20, 20, 30, 30, 0),
>   Station3_Hn = c(0, 0, 10, 0, 5),
>   Station3_flag = c(0, 0, 0, 1, 0))
>
>
> As you can see, each data frame can have different stations loaded.
>
> I would need to call rbind matching data frames by column name (i.e. by
> station name), keeping in mind that the number of stations loaded in each data
> frame may differ. The result should be
>
> data_POSIX Station1_Hs Station1_Hn Station1_flag Station2_Hs Station2_Hn Station2_flag Station3_Hs Station3_Hn Station3_flag
> 2000-12-01 30 10 0 20 0 0 NA NA NA
> 2000-12-02 40 20 0 20 0 0 NA NA NA
> 2000-12-03 50 10 0 30 10 0 NA NA NA
> 2000-12-04 NA NA NA 30 0 1 NA NA NA
> 2000-12-05 55 5 0 0 5 0 NA NA NA
> 2001-12-01 50 20 0 NA NA NA 20 0 0
> 2001-12-02 60 20 0 NA NA NA 20 0 0
> 2001-12-03 70 20 0 NA NA NA 30 10 0
> 2001-12-04 NA NA NA NA NA NA 30 0 1
> 2001-12-05 NA NA NA NA NA NA 0 5 0
> 2002-12-01 NA NA NA 50 20 0 20 0 0
> 2002-12-02 NA NA NA 60 20 0 20 0 0
> 2002-12-03 NA NA NA 70 20 0 30 10 0
> 2002-12-04 NA NA NA NA NA NA 30 0 1
> 2002-12-05 NA NA NA NA NA NA 0 5 0
>
> I tried this code
>
> df_list <- list(df1, df2, df3)
> allNms <- unique(unlist(lapply(df_list, names)))
> do.call(rbind, c(lapply(df_list, function(x) data.frame(c(x,
>   sapply(setdiff(allNms, names(x)), function(y) NA)))), make.row.names=FALSE))
>
> but I get this error:
> Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, :
>   arguments imply differing number of rows
>
> Could someone please help me?
>
>
> Thank you for your attention
>
> Stefano
>
>
> (oo)
> --oOO--( )--OOo--------------------------------------
> Stefano Sofia PhD
> Civil Protection - Marche Region - Italy
> Meteo Section
> Snow Section
> Via del Colle Ameno 5
> 60126 Torrette di Ancona, Ancona (AN)
> Uff: +39 071 806 7743
> E-mail: stefano.sofia at regione.marche.it
> ---Oo---------oO----------------------------------------