Massimo Bressan
2018-Jun-07 12:21 UTC
[R] aggregate and list elements of variables in data.frame
Sorry, but on looking further at the example I realised that the posted
solution is not quite what I need: I do not need to get back the row
indices, but rather the values of column id that correspond to each value of column A.
#please consider this new example
t<-data.frame(id=c(18,91,20,68,54,27,26,15,4,97),A=c(123,345,123,678,345,123,789,345,123,789))
t
# I need to get this result
r<-data.frame(unique_A=c(123, 345, 678,
789),list_id=c('18,20,27,4','91,54,15','68','26,97'))
r
# any help for this, please?
From: "Massimo Bressan" <massimo.bressan at arpa.veneto.it>
To: "r-help" <R-help at r-project.org>
Sent: Thursday, 7 June 2018 10:09:55
Subject: Re: aggregate and list elements of variables in data.frame
thanks for the help
I'm posting here the complete solution
t<-data.frame(id=1:10,A=c(123,345,123,678,345,123,789,345,123,789))
t$A <- factor(t$A)
l<-sapply(levels(t$A), function(x) which(t$A==x))
r<-data.frame(list_id=unlist(lapply(l, paste, collapse = ", ")))
r<-cbind(unique_A=row.names(r),r)
row.names(r)<-NULL
r
best
From: "Massimo Bressan" <massimo.bressan at arpa.veneto.it>
To: "r-help" <R-help at r-project.org>
Sent: Wednesday, 6 June 2018 10:13:10
Subject: aggregate and list elements of variables in data.frame
#given the following reproducible and simplified example
t<-data.frame(id=1:10,A=c(123,345,123,678,345,123,789,345,123,789))
t
#I need to get the following result
r<-data.frame(unique_A=c(123, 345, 678,
789),list_id=c('1,3,6,9','2,5,8','4','7,10'))
r
# i.e. aggregate over the variable "A" and list all elements of the
# variable "id" that share the same corresponding value of "A"
#any help for that?
#so far I've just managed to "aggregate" and "count", like:
library(sqldf)
sqldf('select count(*) as count_id, A as unique_A from t group by A')
library(dplyr)
t%>%group_by(unique_A=A) %>% summarise(count_id = n())
# thank you
--
------------------------------------------------------------
Massimo Bressan
ARPAV
Agenzia Regionale per la Prevenzione e
Protezione Ambientale del Veneto
Dipartimento Provinciale di Treviso
Via Santa Barbara, 5/a
31100 Treviso, Italy
tel: +39 0422 558545
fax: +39 0422 558516
e-mail: massimo.bressan at arpa.veneto.it
------------------------------------------------------------
Ivan Calandra
2018-Jun-07 12:28 UTC
[R] aggregate and list elements of variables in data.frame
Using which() to subset t$id should do the trick:
sapply(levels(t$A), function(x) t$id[which(t$A==x)])
Ivan
--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra
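Note that levels(t$A) returns NULL for a numeric column, so the hint above only works once A has been converted to a factor, as in the earlier "complete solution". A minimal sketch of the hint applied to the new example, under that assumption, might look like the following. The final data.frame() step is just one possible way to reach the requested layout and is not taken from the thread.

```r
# New example from the thread: ids are no longer 1..10
t <- data.frame(id = c(18, 91, 20, 68, 54, 27, 26, 15, 4, 97),
                A  = c(123, 345, 123, 678, 345, 123, 789, 345, 123, 789))

# levels() needs a factor, so convert A first
t$A <- factor(t$A)

# For each level of A, pick the id values (not the row indices)
l <- sapply(levels(t$A), function(x) t$id[which(t$A == x)])

# Collapse each group into one comma-separated string
r <- data.frame(unique_A = names(l),
                list_id  = unname(vapply(l, paste, character(1), collapse = ",")),
                stringsAsFactors = FALSE)
r
```

Because the groups have different sizes, sapply() cannot simplify its result and returns a named list, which is why the names can be reused as the unique_A column.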
Ben Tupper
2018-Jun-07 12:47 UTC
[R] aggregate and list elements of variables in data.frame
Hi,
Does this do what you want? I had to change the id values to something more
obvious. It uses tibbles which allow each variable to be a list.
library(tibble)
library(dplyr)
x <- tibble(id=LETTERS[1:10],
A=c(123,345,123,678,345,123,789,345,123,789))
uA <- unique(x$A)
idx <- lapply(uA, function(v) which(x$A %in% v))
vals <- lapply(idx, function(index) x$id[index])
r <- tibble(unique_A = uA, list_idx = idx, list_vals = vals)
> r
# A tibble: 4 x 3
  unique_A list_idx  list_vals
     <dbl> <list>    <list>
1     123. <int [4]> <chr [4]>
2     345. <int [3]> <chr [3]>
3     678. <int [1]> <chr [1]>
4     789. <int [2]> <chr [2]>
> r$list_idx[1]
[[1]]
[1] 1 3 6 9
> r$list_vals[1]
[[1]]
[1] "A" "C" "F" "I"
Cheers,
ben
Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org
Ecological Forecasting: https://eco.bigelow.org/
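As a side note on the tibble approach above: the two lapply() passes (first indices, then values) could probably be replaced by a single call to base R's split(). This is only a sketch using the same letter ids. One difference to be aware of is that split() orders the groups by the sorted values of A, whereas unique() keeps order of first appearance; for this data the two orders happen to coincide.

```r
x <- data.frame(id = LETTERS[1:10],
                A  = c(123, 345, 123, 678, 345, 123, 789, 345, 123, 789),
                stringsAsFactors = FALSE)

# split() groups the id values by A in one call;
# the result is a list named after the distinct values of A
vals <- split(x$id, x$A)

names(vals)
vals[["123"]]
```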
Massimo Bressan
2018-Jun-07 13:27 UTC
[R] aggregate and list elements of variables in data.frame
Thank you for the help.
This is my solution, based on your valuable hint but without going through a 'tibble':
x<-data.frame(id=LETTERS[1:10], A=c(123,345,123,678,345,123,789,345,123,789))
uA<-unique(x$A)
idx<-lapply(uA, function(v) which(x$A %in% v))
vals<- lapply(idx, function(index) x$id[index])
data.frame(unique_A = uA, list_vals=unlist(lapply(vals, paste, collapse = ", ")))
best
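Since the subject line asks about aggregating, it may be worth noting that base R's aggregate() can likely produce the requested table directly. The sketch below is not from the thread; it uses the numeric ids from the second example, and the column renaming at the end is only there to match the requested names.

```r
t <- data.frame(id = c(18, 91, 20, 68, 54, 27, 26, 15, 4, 97),
                A  = c(123, 345, 123, 678, 345, 123, 789, 345, 123, 789))

# Group id by A and collapse each group into one comma-separated string
r <- aggregate(id ~ A, data = t,
               FUN = function(v) paste(v, collapse = ","))

# Rename the columns to match the layout requested in the thread
names(r) <- c("unique_A", "list_id")
r
```

The rows come back sorted by A, and within each group the ids keep the order in which they appear in the original data.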