Hi R users,
I want to reshape a data frame from long to wide. I thought it would be easy
with the tidyverse spread() function, but it did not work. Can you help me?
Thank you,
Ding
The data frame test1 is the long file and test2 is the wide file I want to get:
test1 <- data.frame(vntr1 = c("v1", "v1", "v2", "v2", "v2", "v2"),
                    val   = c(0.98, 0.02, 0.59, 0.12, 0.11, 0.04))
test2 <- data.frame(vntr1 = c("v1", "v2"),
                    a1 = c(0.98, 0.5693),
                    a2 = c(0.02, 0.12),
                    a3 = c(NA, 0.11),
                    a4 = c(NA, 0.04))
The following code does not work:
test2 <- test1 %>% spread(vntr1, val)
Error: Each row of output must be identified by a unique combination of keys.
Keys are shared for 6 rows:
* 1, 2
* 3, 4, 5, 6
Do you need to create unique ID with tibble::rowid_to_column()?
Call `rlang::last_error()` to see a backtrace
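A note on the error above: spread(vntr1, val) uses vntr1 as the key whose values become column names, and each key value occurs in more than one row with no other column to tell those rows apart. The following is a minimal base-R sketch for checking this on the example data; the names key_counts and has_duplicate_keys are illustrative only and do not come from the original post.

```r
# Example data from the question
test1 <- data.frame(
  vntr1 = c("v1", "v1", "v2", "v2", "v2", "v2"),
  val   = c(0.98, 0.02, 0.59, 0.12, 0.11, 0.04)
)

# Count how many rows share each key value
key_counts <- table(test1$vntr1)
print(key_counts)

# Any count above 1 means the key alone cannot identify a single row,
# so an extra index column is needed before going from long to wide
has_duplicate_keys <- any(key_counts > 1)
print(has_duplicate_keys)
```

If has_duplicate_keys is TRUE, a within-group index has to be added before reshaping.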
Hello,
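It's a bit more complicated than you have coded it: the rows within each
vntr1 first need an index that can supply the new column names.
I will use pivot_wider(), which is now the natural way of doing it.
library(dplyr)
library(tidyr)

test1 %>%
  group_by(vntr1) %>%
  mutate(group = row_number()) %>%   # index within each vntr1
  ungroup() %>%
  pivot_wider(
    id_cols = "vntr1",
    names_from = "group",
    names_prefix = "a",
    values_from = "val"
  )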
Hope this helps,
Rui Barradas
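The group_by() / mutate(group = row_number()) step in the reply above numbers the rows within each vntr1 value, and that counter is what makes every output cell unique. As a rough sketch, the same counter can be built in base R with ave(), assuming the rows are already in the order in which they should fill a1, a2, and so on.

```r
# Example data from the question
test1 <- data.frame(
  vntr1 = c("v1", "v1", "v2", "v2", "v2", "v2"),
  val   = c(0.98, 0.02, 0.59, 0.12, 0.11, 0.04)
)

# Running index within each vntr1 group: restarts at 1 for every new key
test1$group <- ave(seq_along(test1$vntr1), test1$vntr1, FUN = seq_along)
print(test1)
```

The group column plays the same role as the one created by row_number() in the pipeline.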
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Hi Rui,
Thanks a lot.
I got the error below, even though I have loaded library(tidyverse).
Ding
Error in pivot_wider(., id_cols = "vntr1", names_from = "group", names_prefix = "a", :
  could not find function "pivot_wider"
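A likely cause, stated as an assumption since the poster's package versions are not given: pivot_wider() was added in tidyr 1.0.0, so an older tidyr does not provide it even when library(tidyverse) loads without complaint. A small sketch for checking this follows; the helper name has_pivot_wider is made up for illustration.

```r
# Hypothetical helper: TRUE only if tidyr is installed and recent enough
# to provide pivot_wider() (introduced in tidyr 1.0.0)
has_pivot_wider <- function() {
  requireNamespace("tidyr", quietly = TRUE) &&
    utils::packageVersion("tidyr") >= "1.0.0"
}

if (!has_pivot_wider()) {
  message("tidyr >= 1.0.0 not found; consider install.packages(\"tidyr\")")
}
```

After updating tidyr and restarting R, the pivot_wider() call should be found.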
Hi Ding,
If you are still having trouble, perhaps:
library(prettyR)
stretch_df(test1, "vntr1", "val")
Jim
Hi,
[For a non-tidyverse solution:]
Your problem is ambiguous without a 'time' variable; e.g., why should
the answer not be
test2 <- data.frame(vntr1 = c("v1", "v2"),
                    a1 = c(NA, 0.5693),
                    a2 = c(0.02, 0.12),
                    a3 = c(NA, 0.11),
                    a4 = c(0.98, 0.04))
? If you do add an artificial time variable, say using
test1 <- transform(test1,
    time = unsplit(lapply(split(vntr1, vntr1), seq_along), vntr1))
to give
> test1
  vntr1  val time
1    v1 0.98    1
2    v1 0.02    2
3    v2 0.59    1
4    v2 0.12    2
5    v2 0.11    3
6    v2 0.04    4
then either reshape() or dcast() easily gives you what you want:
> reshape(test1, v.names = "val", idvar = "vntr1",
          direction = "wide", timevar = "time")
  vntr1 val.1 val.2 val.3 val.4
1    v1  0.98  0.02    NA    NA
3    v2  0.59  0.12  0.11  0.04
> reshape2::dcast(test1, vntr1 ~ time, value.var="val")
  vntr1    1    2    3    4
1    v1 0.98 0.02   NA   NA
2    v2 0.59 0.12 0.11 0.04
-Deepayan
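The reshape() result above has columns val.1 to val.4, whereas the question asked for a1 to a4. The following sketch reuses the transform() and reshape() calls from the reply and then renames the columns; the renaming with sub() and the object name wide are additions for illustration, not part of the original reply.

```r
# Example data from the question
test1 <- data.frame(
  vntr1 = c("v1", "v1", "v2", "v2", "v2", "v2"),
  val   = c(0.98, 0.02, 0.59, 0.12, 0.11, 0.04)
)

# Artificial time variable: position of each row within its vntr1 group
test1 <- transform(test1,
    time = unsplit(lapply(split(vntr1, vntr1), seq_along), vntr1))

# Long to wide with base R
wide <- reshape(test1, v.names = "val", idvar = "vntr1",
                timevar = "time", direction = "wide")

# Rename val.1 ... val.4 to a1 ... a4 and reset the row names
names(wide) <- sub("^val\\.", "a", names(wide))
rownames(wide) <- NULL
print(wide)
```

Groups with fewer rows than the longest group are padded with NA, as in the desired test2.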
Hi Deepayan,
Thank you very much! Yes, your method also works very well. I had thought about
creating an extra time variable, but did not know how to do it.
Ding