thr3ads.net - R help - [R] aggregate and list elements of variables in data.frame [Jun 2018]

Massimo Bressan

2018-Jun-07 12:21 UTC

[R] aggregate and list elements of variables in data.frame

Sorry, but on looking further at the example I just realised that the posted
solution is not quite what I need: I do not need to get back the 'indices'
but rather the values of column id that correspond to each value of column A.

#please consider this new example 

t<-data.frame(id=c(18,91,20,68,54,27,26,15,4,97),A=c(123,345,123,678,345,123,789,345,123,789))
t 

# I need to get this result 
r<-data.frame(unique_A=c(123, 345, 678, 789),list_id=c('18,20,27,4','91,54,15','68','26,97'))
r 

# any help for this, please? 





From: "Massimo Bressan" <massimo.bressan at arpa.veneto.it>
To: "r-help" <R-help at r-project.org>
Sent: Thursday, 7 June 2018 10:09:55
Subject: Re: aggregate and list elements of variables in data.frame

thanks for the help 

I'm posting here the complete solution 

t<-data.frame(id=1:10,A=c(123,345,123,678,345,123,789,345,123,789)) 
t$A <- factor(t$A) 
l<-sapply(levels(t$A), function(x) which(t$A==x)) 
r<-data.frame(list_id=unlist(lapply(l, paste, collapse = ", "))) 
r<-cbind(unique_A=row.names(r),r) 
row.names(r)<-NULL 
r 

best 



From: "Massimo Bressan" <massimo.bressan at arpa.veneto.it>
To: "r-help" <R-help at r-project.org>
Sent: Wednesday, 6 June 2018 10:13:10
Subject: aggregate and list elements of variables in data.frame

#given the following reproducible and simplified example 

t<-data.frame(id=1:10,A=c(123,345,123,678,345,123,789,345,123,789)) 
t 

#I need to get the following result 

r<-data.frame(unique_A=c(123, 345, 678, 789),list_id=c('1,3,6,9','2,5,8','4','7,10'))
r 

# i.e. aggregate over the variable "A" and list all elements of the
# variable "id" that have the same corresponding value of "A"
#any help for that?

#so far I've just managed to "aggregate" and "count", like:

library(sqldf) 
sqldf('select count(*) as count_id, A as unique_A from t group by A') 

library(dplyr) 
t%>%group_by(unique_A=A) %>% summarise(count_id = n()) 

# thank you 


-- 

------------------------------------------------------------ 
Massimo Bressan 

ARPAV 
Agenzia Regionale per la Prevenzione e 
Protezione Ambientale del Veneto 

Dipartimento Provinciale di Treviso 
Via Santa Barbara, 5/a 
31100 Treviso, Italy 

tel: +39 0422 558545 
fax: +39 0422 558516 
e-mail: massimo.bressan at arpa.veneto.it 
------------------------------------------------------------ 



Ivan Calandra

2018-Jun-07 12:28 UTC

Using which() to subset t$id should do the trick:

sapply(levels(t$A), function(x) t$id[which(t$A==x)])
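
Note that levels(t$A) is only non-NULL when t$A is a factor, as in the earlier solution; in the new example A is numeric. A minimal base R sketch that avoids the factor step, assuming the new example data from this thread (split() and vapply() are swapped in here for illustration and are not what was posted above):

```r
# New example from the thread: ids are arbitrary values, not row positions
t <- data.frame(id = c(18, 91, 20, 68, 54, 27, 26, 15, 4, 97),
                A  = c(123, 345, 123, 678, 345, 123, 789, 345, 123, 789))

# split() groups the id values by A; no factor conversion is needed
ids_by_A <- split(t$id, t$A)

# Collapse each group of ids into one comma-separated string
r <- data.frame(
  unique_A = as.numeric(names(ids_by_A)),
  list_id  = unname(vapply(ids_by_A, paste, character(1), collapse = ",")),
  stringsAsFactors = FALSE
)
r
```

The groups come back sorted by the value of A, and the ids within each group keep their original row order.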

Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra


Ben Tupper

2018-Jun-07 12:47 UTC

Hi,

Does this do what you want?  I had to change the id values to something more
obvious.  It uses tibbles, which allow a column to be a list.

library(tibble)
library(dplyr)
x       <- tibble(id=LETTERS[1:10],
                A=c(123,345,123,678,345,123,789,345,123,789))
uA      <- unique(x$A)
idx     <- lapply(uA, function(v) which(x$A %in% v))
vals    <- lapply(idx, function(index) x$id[index])

r <- tibble(unique_A = uA, list_idx = idx, list_vals = vals)

> r
# A tibble: 4 x 3
  unique_A list_idx  list_vals
     <dbl> <list>    <list>   
1     123. <int [4]> <chr [4]>
2     345. <int [3]> <chr [3]>
3     678. <int [1]> <chr [1]>
4     789. <int [2]> <chr [2]>
> r$list_idx[1][[1]]
[1] 1 3 6 9
> r$list_vals[1][[1]]
[1] "A" "C" "F" "I"
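
For comparison, a plain data.frame can hold a list column as well, provided the column is assigned after the data.frame has been created. A minimal sketch under that assumption, using split() in place of the lapply()/which() steps above (the variable names simply mirror the example):

```r
x <- data.frame(id = LETTERS[1:10],
                A  = c(123, 345, 123, 678, 345, 123, 789, 345, 123, 789),
                stringsAsFactors = FALSE)

# One character vector of ids per distinct value of A
vals <- split(x$id, x$A)

# Build the data.frame first, then attach the list as a column
r <- data.frame(unique_A = as.numeric(names(vals)))
r$list_vals <- unname(vals)

# Each cell of list_vals holds a whole vector
r$list_vals[[1]]
```

Keeping the ids as a list column leaves them available as vectors for later processing; collapsing them with paste() is only needed for display.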


Cheers,
ben


Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org

Ecological Forecasting: https://eco.bigelow.org/

Massimo Bressan

2018-Jun-07 13:27 UTC

thank you for the help

this is my solution, based on your valuable hint, but without the need for
a 'tibble'

x<-data.frame(id=LETTERS[1:10], A=c(123,345,123,678,345,123,789,345,123,789))
uA<-unique(x$A) 
idx<-lapply(uA, function(v) which(x$A %in% v)) 
vals<- lapply(idx, function(index) x$id[index]) 
data.frame(unique_A = uA, list_vals=unlist(lapply(vals, paste, collapse = ", ")))
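
Since the subject line asks about aggregating, the same table can probably also be obtained with base R's aggregate(), which passes extra arguments such as collapse on to FUN. A sketch using the numeric-id example from earlier in the thread; renaming the columns to unique_A and list_id is only there to mirror the requested output:

```r
t <- data.frame(id = c(18, 91, 20, 68, 54, 27, 26, 15, 4, 97),
                A  = c(123, 345, 123, 678, 345, 123, 789, 345, 123, 789))

# Group id by A and paste each group into one comma-separated string
r <- aggregate(id ~ A, data = t, FUN = paste, collapse = ",")

# aggregate() returns the grouping column first, then the summarised one
names(r) <- c("unique_A", "list_id")
r
```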

best 


