Massimo Bressan
2018-Jun-06 08:13 UTC
[R] aggregate and list elements of variables in data.frame
# Given the following reproducible, simplified example:

t <- data.frame(id = 1:10, A = c(123, 345, 123, 678, 345, 123, 789, 345, 123, 789))
t

# I need to get the following result:

r <- data.frame(unique_A = c(123, 345, 678, 789), list_id = c('1,3,6,9', '2,5,8', '4', '7,10'))
r

# i.e. aggregate over the variable "A" and list all elements of the variable "id" that share the same corresponding value of "A".
# Any help with that?

# So far I have only managed to "aggregate" and "count", like this:

library(sqldf)
sqldf('select count(*) as count_id, A as unique_A from t group by A')

library(dplyr)
t %>% group_by(unique_A = A) %>% summarise(count_id = n())

# Thank you
Ivan Calandra
2018-Jun-06 08:21 UTC
[R] aggregate and list elements of variables in data.frame
Hi Massimo,

Something along these lines could help you, I guess:

t$A <- factor(t$A)
sapply(levels(t$A), function(x) which(t$A == x))

You can then play with the output using paste().

Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra
Massimo Bressan
2018-Jun-07 08:09 UTC
[R] aggregate and list elements of variables in data.frame
Thanks for the help. I'm posting the complete solution here:

t <- data.frame(id = 1:10, A = c(123, 345, 123, 678, 345, 123, 789, 345, 123, 789))
# convert A to a factor so that its levels are the unique values
t$A <- factor(t$A)
# for each level, collect the row indices where A equals that level
l <- sapply(levels(t$A), function(x) which(t$A == x))
# collapse each index vector into a single comma-separated string
r <- data.frame(list_id = unlist(lapply(l, paste, collapse = ", ")))
# move the level names from the row names into a proper column
r <- cbind(unique_A = row.names(r), r)
row.names(r) <- NULL
r

Best

--
Massimo Bressan
ARPAV
Agenzia Regionale per la Prevenzione e Protezione Ambientale del Veneto
Dipartimento Provinciale di Treviso
Via Santa Barbara, 5/a
31100 Treviso, Italy
tel: +39 0422 558545
fax: +39 0422 558516
e-mail: massimo.bressan at arpa.veneto.it
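One caveat about the sapply() call in the solution above: sapply() simplifies its result to a matrix whenever every group happens to have the same number of rows, and the following lapply(l, paste, ...) step then runs over single matrix cells instead of over groups. The sketch below illustrates this with a small hypothetical data frame `d` (not from the thread) and shows one possible workaround, `simplify = FALSE` together with vapply(). It is an illustration under those assumptions, not the poster's code.

```r
# Hypothetical data in which every value of A occurs exactly twice
d <- data.frame(id = 1:4, A = factor(c(10, 10, 20, 20)))

# With equal group sizes, sapply() returns a matrix rather than a list
m <- sapply(levels(d$A), function(x) which(d$A == x))
is.matrix(m)

# simplify = FALSE always yields a list named after the levels
l <- sapply(levels(d$A), function(x) which(d$A == x), simplify = FALSE)

# Collapse each index vector into one string per group
r <- data.frame(unique_A = names(l),
                list_id = unname(vapply(l, paste, character(1), collapse = ", ")),
                stringsAsFactors = FALSE)
r
```

Like the original, this still returns row indices, which match "id" only when id runs from 1 to the number of rows.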
Massimo Bressan
2018-Jun-07 12:21 UTC
[R] aggregate and list elements of variables in data.frame
Sorry, but on looking further at the example I realised that the posted solution is not quite what I need: I do not need to get back the indices but rather the corresponding values of column "id".

# Please consider this new example:

t <- data.frame(id = c(18, 91, 20, 68, 54, 27, 26, 15, 4, 97), A = c(123, 345, 123, 678, 345, 123, 789, 345, 123, 789))
t

# I need to get this result:

r <- data.frame(unique_A = c(123, 345, 678, 789), list_id = c('18,20,27,4', '91,54,15', '68', '26,97'))
r

# Any help with this, please?
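For the revised request above (values of "id" rather than row indices), one base R possibility is aggregate() with paste() as the summary function. This is a minimal sketch, not a solution posted in the thread. It reuses the thread's data and column names, and relies on aggregate() passing the extra `collapse` argument through to paste().

```r
# Second example from the thread: id no longer equals the row index
t <- data.frame(id = c(18, 91, 20, 68, 54, 27, 26, 15, 4, 97),
                A  = c(123, 345, 123, 678, 345, 123, 789, 345, 123, 789))

# Group id by A and collapse each group into one comma-separated string
r <- aggregate(id ~ A, data = t, FUN = paste, collapse = ",")

# Rename the columns to match the requested layout
names(r) <- c("unique_A", "list_id")
r
```

Because paste() works on the "id" values themselves, the result does not depend on the row order matching the ids. Note that naming the data frame `t` masks the base function t() in the session; a different name avoids that.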
Eik Vettorazzi
2018-Jun-08 07:45 UTC
[R] aggregate and list elements of variables in data.frame
Hi,

if you are willing to use dplyr, you can do it all in one line of code:

library(dplyr)
df <- data.frame(id = 1:10, A = c(123, 345, 123, 678, 345, 123, 789, 345, 123, 789))
df %>% group_by(unique_A = A) %>% summarise(list_id = paste(id, collapse = ", ")) -> r

Cheers

--
Eik Vettorazzi
Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf
Martinistrasse 52
building W 34
20246 Hamburg
Phone: +49 (0) 40 7410 - 58243
Fax: +49 (0) 40 7410 - 57790
Web: www.uke.de/imbe