Hi R users,

I want to do a data reshape from long to wide, I thought it was easy using tidyverse spread function, but it did not work well. Can you help me?

Thank you,

Ding

test1 data frame is long file and test2 is the wide file I want to get

    test1 <- data.frame(vntr1 = c("v1", "v1", "v2", "v2", "v2", "v2"),
                        val = c(0.98, 0.02, 0.59, 0.12, 0.11, 0.04))

    test2 <- data.frame(vntr1 = c("v1", "v2"),
                        a1 = c(0.98, 0.5693),
                        a2 = c(0.02, 0.12),
                        a3 = c(NA, 0.11),
                        a4 = c(NA, 0.04))

the following code does not work

    test2 <- test1 %>% spread(vntr1, val)

    Error: Each row of output must be identified by a unique combination of keys.
    Keys are shared for 6 rows:
    * 1, 2
    * 3, 4, 5, 6
    Do you need to create unique ID with tibble::rowid_to_column()?
    Call `rlang::last_error()` to see a backtrace

----------------------------------------------------------------------
-SECURITY/CONFIDENTIALITY WARNING-

This message and any attachments are intended solely for the individual or entity to which they are addressed. This communication may contain information that is privileged, confidential, or exempt from disclosure under applicable law (e.g., personal health information, research data, financial information). Because this e-mail has been sent without encryption, individuals other than the intended recipient may be able to view the information, forward it to others or tamper with the information without the knowledge or consent of the sender. If you are not the intended recipient, or the employee or person responsible for delivering the message to the intended recipient, any dissemination, distribution or copying of the communication is strictly prohibited. If you received the communication in error, please notify the sender immediately by replying to this message and deleting the message and any accompanying files from your system. If, due to the security risks, you do not wish to receive further communications via e-mail, please reply to this message and inform the sender that you do not wish to receive further e-mail from the sender. (LCP301)

Hello,

It's a bit more complicated than you have coded it. I will use pivot_wider, it's now the natural way of doing it.

    library(dplyr)
    library(tidyr)

    test1 %>%
      group_by(vntr1) %>%
      mutate(group = row_number()) %>%   # index each value within its vntr1 group
      ungroup() %>%
      pivot_wider(
        id_cols = "vntr1",
        names_from = "group",
        names_prefix = "a",              # new columns become a1, a2, ...
        values_from = "val"
      )

Hope this helps,

Rui Barradas

At 19:57 on 03/04/20, Yuan Chun Ding wrote:
> Hi R users,
>
> I want to do a data reshape from long to wide, I thought it was easy using tidyverse spread function, but it did not work well. Can you help me?
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Hi Rui,

Thanks a lot. I got this error, even though I have loaded library(tidyverse).

Ding

    Error in pivot_wider(., id_cols = "vntr1", names_from = "group", names_prefix = "a",  :
      could not find function "pivot_wider"

________________________________________
From: Rui Barradas [ruipbarradas at sapo.pt]
Sent: Friday, April 3, 2020 12:08 PM
To: Yuan Chun Ding; r-help mailing list
Subject: Re: [R] a simple reshape
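
The "could not find function" error above suggests that the installed tidyr predates version 1.0.0, the release in which pivot_wider() was introduced. That is an inference from the error message, not something stated in the thread, and updating tidyr would be the direct fix. As a hedged sketch for an older installation, the same within-group index can be passed to spread() instead. The column name group and the "a" prefix follow Rui's code; the object name wide is only illustrative.

```r
# Sketch for tidyr versions without pivot_wider() (assumed: tidyr < 1.0.0).
library(dplyr)
library(tidyr)

test1 <- data.frame(
  vntr1 = c("v1", "v1", "v2", "v2", "v2", "v2"),
  val   = c(0.98, 0.02, 0.59, 0.12, 0.11, 0.04)
)

wide <- test1 %>%
  group_by(vntr1) %>%
  mutate(group = paste0("a", row_number())) %>%  # a1, a2, ... within each vntr1
  ungroup() %>%
  spread(key = group, value = val)               # one column per within-group index

print(wide)
```

The original spread(vntr1, val) call failed because nothing distinguished the rows within each vntr1 value; the added group column supplies that missing key.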

Hi Ding,

If you are still having trouble, perhaps:

    library(prettyR)
    stretch_df(test1, "vntr1", "val")

Jim

Hi,

[For a non-tidyverse solution:]

Your problem is ambiguous without a 'time' variable; e.g., why should the answer not be

    test2 <- data.frame(vntr1 = c("v1", "v2"),
                        a1 = c(NA, 0.5693),
                        a2 = c(0.02, 0.12),
                        a3 = c(NA, 0.11),
                        a4 = c(0.98, 0.04))

?

If you do add an artificial time variable, say using

    test1 <- transform(test1,
                       time = unsplit(lapply(split(vntr1, vntr1), seq_along), vntr1))

to give

    > test1
      vntr1  val time
    1    v1 0.98    1
    2    v1 0.02    2
    3    v2 0.59    1
    4    v2 0.12    2
    5    v2 0.11    3
    6    v2 0.04    4

then either reshape() or dcast() easily gives you what you want:

    > reshape(test1, v.names = "val", idvar = "vntr1", direction = "wide", timevar = "time")
      vntr1 val.1 val.2 val.3 val.4
    1    v1  0.98  0.02    NA    NA
    3    v2  0.59  0.12  0.11  0.04

    > reshape2::dcast(test1, vntr1 ~ time, value.var="val")
      vntr1    1    2    3    4
    1    v1 0.98 0.02   NA   NA
    2    v2 0.59 0.12 0.11 0.04

-Deepayan
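
For readers who find the split/lapply/unsplit construction hard to follow, the artificial time variable can also be built with base R's ave(), which applies a function within each group and returns the results in the original row order. This is an alternative sketch, not the code posted in the thread; the object name wide and the final renaming step (to match the a1..a4 names in the requested test2) are assumptions added for illustration.

```r
# Base R only: build a within-group counter with ave(), then reshape to wide.
test1 <- data.frame(
  vntr1 = c("v1", "v1", "v2", "v2", "v2", "v2"),
  val   = c(0.98, 0.02, 0.59, 0.12, 0.11, 0.04)
)

# seq_along() is applied separately to the rows of each vntr1 value,
# so the counter restarts at 1 for every group.
test1$time <- ave(seq_along(test1$vntr1), test1$vntr1, FUN = seq_along)

wide <- reshape(test1, v.names = "val", idvar = "vntr1",
                timevar = "time", direction = "wide")

# reshape() names the new columns val.1, val.2, ...; rename them to a1, a2, ...
names(wide) <- sub("^val\\.", "a", names(wide))

print(wide)
```

The counter simply follows the order in which rows appear in test1, which is the same implicit assumption the time variable above makes; if the rows carry a meaningful order of their own, the data should be sorted on it first.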

Hi Deepayan,

Thank you very much! Yes, your method also works very well. I thought about creating an extra time variable, but did not know how to do it.

Ding

________________________________________
From: Deepayan Sarkar [deepayan.sarkar at gmail.com]
Sent: Saturday, April 4, 2020 3:10 AM
To: Yuan Chun Ding
Cc: r-help mailing list
Subject: Re: [R] a simple reshape