thr3ads.net - R help - [R] aggregate and list elements of variables in data.frame [Jun 2018]


Massimo Bressan

2018-Jun-06 08:13 UTC

[R] aggregate and list elements of variables in data.frame

#given the following reproducible and simplified example 

t<-data.frame(id=1:10,A=c(123,345,123,678,345,123,789,345,123,789)) 
t 

#I need to get the following result 

r<-data.frame(unique_A=c(123, 345, 678,
789),list_id=c('1,3,6,9','2,5,8','4','7,10'))
r 

# i.e. aggregate over the variable "A" and list all elements of the
# variable "id" that share the same corresponding value of "A"
# any help with that?

# so far I've only managed to "aggregate" and "count", like:

library(sqldf) 
sqldf('select count(*) as count_id, A as unique_A from t group by A') 

library(dplyr) 
t%>%group_by(unique_A=A) %>% summarise(count_id = n()) 

# thank you 



Ivan Calandra

2018-Jun-06 08:21 UTC


Hi Massimo,

Something along those lines could help you I guess:
t$A <- factor(t$A)
sapply(levels(t$A), function(x) which(t$A==x))

You can then play with the output using paste()

Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra
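
Editor's note: the paste() step hinted at above can be sketched roughly as follows. This is only one possible reading of the suggestion, not Ivan's own code. It assumes the first example's data, stored in a data frame named `d` (a name chosen here so that base::t() is not masked), and it uses split() rather than which(), so the groups hold the id values themselves instead of row positions.

```r
# First example from the thread; the name `d` is an assumption of this sketch
d <- data.frame(id = 1:10,
                A  = c(123, 345, 123, 678, 345, 123, 789, 345, 123, 789))

# split() returns a named list: one vector of id values per distinct A
groups <- split(d$id, d$A)

# collapse each group into a single comma-separated string
list_id <- vapply(groups, paste, character(1), collapse = ",")

# assemble the result; unique_A is character here because it comes from names()
r <- data.frame(unique_A = names(list_id),
                list_id  = unname(list_id),
                stringsAsFactors = FALSE)
r
```

Because split() works on the values of id, the same sketch should carry over unchanged to data where id is not simply the row number.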


Massimo Bressan

2018-Jun-07 08:09 UTC


thanks for the help 

I'm posting here the complete solution 

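t<-data.frame(id=1:10,A=c(123,345,123,678,345,123,789,345,123,789))
t$A <- factor(t$A)
# row positions for each level of A; simplify=FALSE always returns a named list
l<-sapply(levels(t$A), function(x) which(t$A==x), simplify=FALSE)
# collapse each set of positions into one comma-separated string
r<-data.frame(list_id=unlist(lapply(l, paste, collapse = ", ")))
# move the level names from the row names into a proper column
r<-cbind(unique_A=row.names(r),r)
row.names(r)<-NULL
r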

best 



-- 

------------------------------------------------------------ 
Massimo Bressan 

ARPAV 
Agenzia Regionale per la Prevenzione e 
Protezione Ambientale del Veneto 

Dipartimento Provinciale di Treviso 
Via Santa Barbara, 5/a 
31100 Treviso, Italy 

tel: +39 0422 558545 
fax: +39 0422 558516 
e-mail: massimo.bressan at arpa.veneto.it 
------------------------------------------------------------ 


Massimo Bressan

2018-Jun-07 12:21 UTC


Sorry, but on looking further at the example I realised that the posted
solution is not quite what I need: I do not need to get back the row
'indices' but instead the corresponding values of column "id" for each value of "A"

#please consider this new example 

t<-data.frame(id=c(18,91,20,68,54,27,26,15,4,97),A=c(123,345,123,678,345,123,789,345,123,789))
t 

# I need to get this result 
r<-data.frame(unique_A=c(123, 345, 678,
789),list_id=c('18,20,27,4','91,54,15','68','26,97'))
r 

# any help for this, please? 






Eik Vettorazzi

2018-Jun-08 07:45 UTC

head link

[R] aggregate and list elements of variables in data.frame

Hi,
if you are willing to use dplyr, you can do it all in one line of code:

library(dplyr)
df<-data.frame(id=1:10,A=c(123,345,123,678,345,123,789,345,123,789))

df%>%group_by(unique_A=A)%>%summarise(list_id=paste(id,collapse=", "))->r

cheers
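
Editor's note: for readers who prefer to stay in base R, the same grouping can be sketched with aggregate(). This is a minimal sketch, not part of the original reply. It assumes the second example from the thread (arbitrary id values) and stores it in a data frame named `d`, a name chosen here so that base::t() is not masked.

```r
# Second example from the thread; the name `d` is an assumption of this sketch
d <- data.frame(id = c(18, 91, 20, 68, 54, 27, 26, 15, 4, 97),
                A  = c(123, 345, 123, 678, 345, 123, 789, 345, 123, 789))

# for each distinct value of A, collapse the matching id values into one string
r <- aggregate(id ~ A, data = d, FUN = function(x) paste(x, collapse = ","))

# rename the columns to match the layout requested in the thread
names(r) <- c("unique_A", "list_id")
r
```

Both this sketch and the dplyr version paste the id values themselves, so they do not depend on id being equal to the row number.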


-- 
Eik Vettorazzi

Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf

Martinistrasse 52
building W 34
20246 Hamburg

Phone: +49 (0) 40 7410 - 58243
Fax:   +49 (0) 40 7410 - 57790
Web: www.uke.de/imbe
--

