thr3ads.net - R help - [R] merged data frame with <NA> [Sep 2020]

Stefano Sofia

2020-Sep-29 07:59 UTC

[R] merged data frame with <NA>

Dear R users,
I'm struggling with a simple "merge".

I have an external file called df.txt like

data_POSIX, event
2005-11-14 02:30:00, "start"
2005-11-14 11:30:00, "end"

I load it with

df1 <- read.table(file="df.txt", header=TRUE, sep=",",
                  dec = ".", stringsAsFactors=FALSE)
df1$data_POSIX <- as.POSIXct(df1$data_POSIX, format="%Y-%m-%d %H:%M:%S",
                             tz="Etc/GMT-1")

Then I create a new data frame df2:

day_1 <- as.POSIXct("2005-11-14-00-00",
format="%Y-%m-%d-%H-%M", tz="Etc/GMT-1")
day_2 <- as.POSIXct("2005-11-14-12-00",
format="%Y-%m-%d-%H-%M", tz="Etc/GMT-1")
df2 <- data.frame(data_POSIX=seq(day_1, day_2, by="30 min"))

Finally

df3 <- merge(df2, df1, by=c("data_POSIX"), all.x=TRUE)

gives

           data_POSIX event
1  2005-11-14 00:00:00    <NA>
2  2005-11-14 00:30:00    <NA>
3  2005-11-14 01:00:00    <NA>
4  2005-11-14 01:30:00    <NA>
5  2005-11-14 02:00:00    <NA>
6  2005-11-14 02:30:00  start
7  2005-11-14 03:00:00    <NA>
8  2005-11-14 03:30:00    <NA>
9  2005-11-14 04:00:00    <NA>
10 2005-11-14 04:30:00    <NA>
11 2005-11-14 05:00:00    <NA>
12 2005-11-14 05:30:00    <NA>
13 2005-11-14 06:00:00    <NA>
14 2005-11-14 06:30:00    <NA>
15 2005-11-14 07:00:00    <NA>
16 2005-11-14 07:30:00    <NA>
17 2005-11-14 08:00:00    <NA>
18 2005-11-14 08:30:00    <NA>
19 2005-11-14 09:00:00    <NA>
20 2005-11-14 09:30:00    <NA>
21 2005-11-14 10:00:00    <NA>
22 2005-11-14 10:30:00    <NA>
23 2005-11-14 11:00:00    <NA>
24 2005-11-14 11:30:00    end
25 2005-11-14 12:00:00    <NA>

Why is there <NA> instead of NA?
And why does

df3$pch[df3$event == "start"] <- 24

give a whole column of NA and not 24 at row 6?


         (oo)
--oOO--( )--OOo----------------
Stefano Sofia PhD
Civil Protection - Marche Region
Meteo Section
Snow Section
Via del Colle Ameno 5
60126 Torrette di Ancona, Ancona
Uff: 071 806 7743
E-mail: stefano.sofia at regione.marche.it
---Oo---------oO----------------


Ivan Calandra

2020-Sep-29 08:09 UTC

Hi Stefano,

For the merge() part, I'll leave it to more expert users (I rarely use 
merge(), and every time I need it, it's painful...).

To know why <NA> instead of NA, check the results with str(df3); I guess 
it is not the mode you expected.
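
A minimal sketch of that check, assuming the event column is of type character. The data frame "toy" and its values are made up for illustration and are not Stefano's file. When a data frame is printed, a missing string is shown as <NA> so that it cannot be confused with the literal text "NA"; the value itself is an ordinary NA.

```r
# Toy data frame (hypothetical values) with a character column
# holding one missing value and one literal string "NA".
toy <- data.frame(id = 1:3,
                  event = c(NA, "start", "NA"),
                  stringsAsFactors = FALSE)

str(toy)          # event is chr; its first element is a real NA
print(toy)        # the missing value prints as <NA>, the string "NA" as NA
is.na(toy$event)  # TRUE FALSE FALSE
```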

For more details, you should provide the file, or better a reproducible 
example using dput().
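
As a small illustration of why dput() helps, here is a sketch on a made-up data frame. The leading space in " start" is an assumption chosen for the example: it is invisible in a normal print but shows up in the dput() output, because the strings are written with their quotes.

```r
# Hypothetical data: one value carries a leading space.
toy <- data.frame(event = c(" start", NA), stringsAsFactors = FALSE)

print(toy)   # the leading space is hard to spot here
dput(toy)    # the quoted string " start" makes it visible
```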

For the second part, your syntax was not correct (subsetting a column 
for elements based on a column that is not part of the subset!). And 
there is no column "pch" in your example. Try:
df3[df3$event == "start", "event"] <- 24
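
A small sketch, again on made-up data, of how the comparison behaves when event contains NA: the logical index then contains NA as well, and which() is one common way to keep only the TRUE positions before assigning. Creating the pch column first is an assumption about what the original code intended.

```r
# Toy data (hypothetical values), not the merged df3 from the post.
toy <- data.frame(event = c(NA, "start", "end"), stringsAsFactors = FALSE)

idx <- toy$event == "start"  # NA TRUE FALSE: comparing with NA yields NA
toy$pch <- NA_real_          # create the new column explicitly
toy$pch[which(idx)] <- 24    # which() drops the NA and FALSE positions
toy
```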

HTH,
Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra


Carlos Ortega

2020-Sep-29 08:57 UTC

Hi All,

I recreated a new "synthetic" df1 based on df2 and I could merge:

> day_1 <- as.POSIXct("2005-11-14-00-00", format="%Y-%m-%d-%H-%M", tz="Etc/GMT-1")
> day_2 <- as.POSIXct("2005-11-14-12-00", format="%Y-%m-%d-%H-%M", tz="Etc/GMT-1")
> df2 <- data.frame(data_POSIX=seq(day_1, day_2, by="30 min"))
>
> df1 <- data.frame(
+                   data_POSIX = df2[sample(1:nrow(df2), 15, replace = FALSE),],
+                   event = sample(c('start', 'end'), 15, replace = TRUE)
+ )
> df3 <- merge(df2, df1, by=c("data_POSIX"), all.x=TRUE)
> df3
            data_POSIX event
1  2005-11-14 00:00:00   end
2  2005-11-14 00:30:00 start
3  2005-11-14 01:00:00  <NA>
4  2005-11-14 01:30:00   end
5  2005-11-14 02:00:00   end
6  2005-11-14 02:30:00 start
7  2005-11-14 03:00:00   end
8  2005-11-14 03:30:00   end
9  2005-11-14 04:00:00 start
10 2005-11-14 04:30:00 start
11 2005-11-14 05:00:00  <NA>
12 2005-11-14 05:30:00  <NA>
13 2005-11-14 06:00:00  <NA>
14 2005-11-14 06:30:00 start
15 2005-11-14 07:00:00  <NA>
16 2005-11-14 07:30:00  <NA>
17 2005-11-14 08:00:00  <NA>
18 2005-11-14 08:30:00  <NA>
19 2005-11-14 09:00:00 start
20 2005-11-14 09:30:00 start
21 2005-11-14 10:00:00  <NA>
22 2005-11-14 10:30:00   end
23 2005-11-14 11:00:00  <NA>
24 2005-11-14 11:30:00   end
25 2005-11-14 12:00:00 start

Regards,
Carlos Ortega
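
One possible explanation for the difference from Stefano's result, offered as a sketch and not confirmed in the thread: the sample lines of df.txt have a space after each comma, and read.table() with sep="," keeps that space in character fields by default, so the stored value would be " start" and not "start". The text= input below only mimics the lines shown in the original post. trimws() removes the whitespace after reading; the strip.white argument of read.table() is an alternative that may do the same at read time.

```r
# Mimics the df.txt lines from the original post (space after each comma).
txt <- 'data_POSIX, event
2005-11-14 02:30:00, "start"
2005-11-14 11:30:00, "end"'

raw <- read.table(text = txt, header = TRUE, sep = ",",
                  stringsAsFactors = FALSE)
raw$event == "start"       # no match: the values carry a leading space

# Trim leading/trailing whitespace, then compare again.
fixed <- raw
fixed$event <- trimws(fixed$event)
fixed$event == "start"     # now the first row matches
```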

