thr3ads.net - R help - [R] How to extract same columns from identical dataframes in a list? [Feb 2016]

Wolfgang Waser

2016-Feb-09 09:03 UTC

[R] How to extract same columns from identical dataframes in a list?

Hi,

sorry if my description was too short / unclear.
> I have a list of 7 data frames, each data frame having 24 rows (hour of
> the day) and 5 columns (weeks) with a total of 5 x 24 values
[1]
	week1	week2	week3	...
1	x	a	m	...
2	y	b	n
3	z	c	o
.	.	.	.
.	.	.	.
.	.	.	.
24	.	.	.

[2]
	week1 week2 week3 ...
1	x2	a2	m2	...
2	y2	b2	n2
3	z2	c2	o2
.	.	.	.
.	.	.	.
.	.	.	.
24	.	.	.

[3]
...

.
.
.

[7]
...

I now would like to extract e.g. all week2 columns of all data frames in
the list and combine them in a new data frame using cbind.

new data frame

week2 ([1])	week2 ([2])	week2 ([3])	...
a		a2		.
b		b2		.
c		c2		.
.
.
.

I will then do further row-wise calculations using e.g. apply(x,1,mean),
the result being a vector of 24 values.

I have not found a way to extract specific columns of the data frames in
a list.

As mentioned I can use

sapply(list_of_dataframes,"[",1:24)

which will pick the first 24 values (first column) of each data frame in
the list and arrange them as an array of 24 rows and 7 columns (7 data
frames are in the list).
To pick the second column (week2) using sapply I have to use the next 24
values from 25 to 48:

sapply(list_of_dataframes,"[",25:48)

It seems that sapply treats the data frames in the list as vectors. I
can of course extract all consecutive weeks using consecutive blocks of
24 values, but this seems cumbersome.

The question remains, how to select specific columns from data frames in
a list, e.g. all columns 3 of all data frames in the list.
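
The "blocks of 24 values" behaviour described above is what R does when a matrix is indexed with a single vector: the elements are counted column by column. A minimal sketch, assuming one list element is a 24 x 5 matrix (the object `m` and its contents are made up for illustration):

```r
# Hypothetical stand-in for one list element: a 24 x 5 matrix.
m <- matrix(seq_len(24 * 5), nrow = 24,
            dimnames = list(NULL, paste0("week", 1:5)))

# A single index vector walks the matrix column by column,
# so elements 25:48 are exactly the second column.
by_position <- m[25:48]
by_column   <- m[, "week2"]
identical(by_position, unname(by_column))

# A data frame treats a single index as a column selector instead:
df <- as.data.frame(m)
ncol(df[2])   # one column (week2), still a data frame
```

If `sapply(list_of_dataframes, "[", 1:24)` returns the first column, the list elements are therefore probably matrices rather than data frames.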

Reformatting (unlist(), dim()) in one data frame with one column for
each week does not help, since I'm not calculating colMeans etc, but
row-wise calculations using apply(x,1,FUN) ("applying a function to
margins of an array or matrix").
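
For comparison, a three-dimensional array can also serve row-wise calculations. The following is only a sketch under assumed data (7 simulated data frames of 24 hours x 5 weeks; the names `dfs`, `arr`, `week2_hourly` and `all_hourly` are invented here):

```r
set.seed(1)  # only to make the illustration repeatable

# Hypothetical data: 7 data frames, each 24 hours x 5 weeks.
dfs <- replicate(7,
                 as.data.frame(matrix(rnorm(24 * 5), nrow = 24,
                                      dimnames = list(NULL, paste0("week", 1:5)))),
                 simplify = FALSE)

# Stack into a 24 x 5 x 7 array (hour x week x data frame).
arr <- array(unlist(dfs, use.names = FALSE),
             dim = c(24, 5, length(dfs)),
             dimnames = list(as.character(1:24), paste0("week", 1:5), NULL))

# Hourly (row-wise) mean of week2 across the 7 data frames: 24 values.
week2_hourly <- rowMeans(arr[, "week2", ])

# The same for every week at once: a 24 x 5 matrix.
all_hourly <- apply(arr, c(1, 2), mean)
```

Here `arr[, "week2", ]` is the 24 x 7 matrix of all week2 columns, so any `apply(x, 1, FUN)` call can be used on it directly.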

Thanks for your help and suggestions!

Wolfgang

On 08/02/16 18:00, Dénes Tóth wrote:
> Hi,
> 
> Although you did not provide any reproducible example, it seems you
> store the same type of values in your data.frames. If this is true, it
> is much more efficient to store your data in an array:
> 
> mylist <- list(a = data.frame(week1 = rnorm(24), week2 = rnorm(24)),
>                b = data.frame(week1 = rnorm(24), week2 = rnorm(24)))
> 
> myarray <- unlist(mylist, use.names = FALSE)
> dim(myarray) <- c(nrow(mylist$a), ncol(mylist$a), length(mylist))
> dimnames(myarray) <- list(hour = rownames(mylist$a),
>                           week = colnames(mylist$a),
>                           other = names(mylist))
> # now you can do:
> mean(myarray[, "week1", "a"])
> 
> # or:
> colMeans(myarray)
> 
> 
> Cheers,
>   Denes
> 
> 
> On 02/08/2016 02:33 PM, Wolfgang Waser wrote:
>> Hello,
>>
>> I have a list of 7 data frames, each data frame having 24 rows (hour of
>> the day) and 5 columns (weeks) with a total of 5 x 24 values
>>
>> I would like to combine all 7 columns of week 1 (and 2 ...) in a
>> separate data frame for hourly calculations, e.g.
>>> apply(new.data.frame,1,mean)
>>
>> In some way sapply (lapply) works, but I cannot directly select columns
>> of the original data frames in the list. As a workaround I have to
>> select a range of values:
>>
>>> sapply(list_of_dataframes,"[",1:24)
>>
>> Values 1:24 give the first column, 25:48 the second and so on.
>>
>> Is there an easier / more direct way to select for specific columns
>> instead of selecting a range of values, avoiding loops?
>>
>>
>> Cheers,
>>
>> Wolfgang
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
> 
-- 
Frankenförder Forschungsgesellschaft mbH
Dr. Wolfgang Waser
Wissenschaftsbereich Berlin
Chausseestraße 10
10115 Berlin
Tel.:  +49(0)30 2809 1936
Fax.:  +49(0)30 2809 1940
E-Mail: waser at frankenfoerder-fg.de

Frankenförder Forschungsgesellschaft mbH (FFG)
Sitz: Luckenwalde, Amtsgericht Potsdam, HRB: 6499
Geschäftsführerin: Dipl. Agraring. Doreen Sparborth
Tel.: +49(0)30 2809 1931, E-Mail: info at frankenfoerder-fg.de
http://www.frankenfoerder-fg.de

S Ellison

2016-Feb-09 14:46 UTC

Does
do.call('cbind', list_of_dataframes)

do what you want?

S Ellison

Ulrik Stervbo

2016-Feb-09 14:49 UTC

Hi Wolfgang,

If you use cbind in the do.call you get something close to what you want.
You can then use grep to get just the columns you are interested in (I
assume they all have the same name in the original data.frames)

dfa <- data.frame(cola = c(6:10), colb = c(1:5), colc = c(11:15))
dfb <- data.frame(cola = c(6:10), colb = c(1:5), colc = c(11:15))

df.lst <- list(dfa.name = dfa, dfb.name = dfb)

# Bind together
all.df <- do.call(cbind, df.lst)
colnames(all.df)
# Get just those with colb in the column name
all.df <- all.df[, grep("colb", colnames(all.df))]
# And calculate the means
rowMeans(all.df)

This also works if you want to use more than one column:
all.df <- do.call(cbind, df.lst)
all.df <- all.df[, grep("colb|colc", colnames(all.df))]
rowMeans(all.df)

You can still use plyr:
library(plyr)

all.df <- ldply(df.lst, function(cur.df){return(cur.df$colb)}, .id = "org.df")
all.df$org.df <- NULL
colMeans(all.df)

but if you want to extract more than one column from each data.frame it
becomes a little tricky.
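
One possible way to handle several columns in base R is to loop over the wanted column names and build one matrix per name. This is a sketch only; `wanted`, `per.col` and `row.means` are names made up for the illustration:

```r
dfa <- data.frame(cola = 6:10, colb = 1:5, colc = 11:15)
dfb <- data.frame(cola = 6:10, colb = 1:5, colc = 11:15)
df.lst <- list(dfa.name = dfa, dfb.name = dfb)

wanted <- c("colb", "colc")  # hypothetical choice of columns

# One matrix per wanted column; each matrix has one column per data frame.
per.col <- lapply(setNames(wanted, wanted), function(nm) {
  sapply(df.lst, function(cur.df) cur.df[[nm]])
})

# Row-wise means, computed separately for each wanted column.
row.means <- lapply(per.col, rowMeans)
```

Selecting by exact name with `[[` avoids the partial matches that a `grep` pattern could pick up when column names overlap.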

HTH
Ulrik

peter dalgaard

2016-Feb-09 15:19 UTC

Like this?
> l <- replicate(3,data.frame(w1=sample(1:4),w2=sample(1:4)), simplify=FALSE)
> l
[[1]]
  w1 w2
1  2  2
2  3  3
3  1  1
4  4  4

[[2]]
  w1 w2
1  3  4
2  2  2
3  1  3
4  4  1

[[3]]
  w1 w2
1  1  4
2  4  3
3  2  1
4  3  2

> sapply(l,"[[",2)
     [,1] [,2] [,3]
[1,]    2    4    4
[2,]    3    2    3
[3,]    1    3    1
[4,]    4    1    2

Or even
> sapply(l,"[",,2)
     [,1] [,2] [,3]
[1,]    2    4    4
[2,]    3    2    3
[3,]    1    3    1
[4,]    4    1    2


Notice that if dd[1:24] gives you the 1st column, then dd is not a data frame
but rather a matrix, and indexing semantics are different. In that case, for
some unspeakable reason, the empty index does not work and you'll need
something like
> l <- replicate(3,cbind(w1=sample(1:4),w2=sample(1:4)), simplify=FALSE)
> sapply(l,"[",T,2)
     [,1] [,2] [,3]
[1,]    4    3    2
[2,]    1    1    4
[3,]    3    2    3
[4,]    2    4    1

Or, brute-force-and-ignorance:
> sapply(l, function(e) e[, 2])
     [,1] [,2] [,3]
[1,]    4    3    2
[2,]    1    1    4
[3,]    3    2    3
[4,]    2    4    1
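
Since the right idiom depends on what the list actually holds, it may help to check the element types first. The sketch below uses made-up objects (`l.df`, `l.mat`, `pick`) and suggests that the explicit function form gives the same result for data frames and matrices:

```r
set.seed(42)  # only so the illustration is repeatable
l.df  <- replicate(3, data.frame(w1 = sample(1:4), w2 = sample(1:4)),
                   simplify = FALSE)
l.mat <- lapply(l.df, as.matrix)   # same values, stored as matrices

# Check what the list really holds before choosing an indexing idiom.
sapply(l.df, is.data.frame)
sapply(l.mat, is.matrix)

# Explicit two-index subsetting by column name works for both kinds.
pick <- function(e) e[, "w2"]
from.df  <- sapply(l.df, pick)
from.mat <- sapply(l.mat, pick)

# Row-wise calculation on the combined columns.
rowMeans(from.mat)
```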





-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

Wolfgang Waser

2016-Feb-10 09:04 UTC

Hi,

sapply(l,"[",T,2)

and

sapply(l, function(e) e[, 2])


work fine!


Thanks a lot!

Why is the second version "brute force and ignorance"? Is one of the
versions to be preferred? If so, which and why (very briefly, please)?


Results of the other options mentioned:
> sapply(l,"[[",2)
results in a single vector of length 7

> sapply(l,"[",,2)
Error in lapply(X = X, FUN = FUN, ...) :
argument is missing, with no default

These versions probably don't work due to the "data frames" in the list
actually being matrices.
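
The single vector of length 7 is consistent with that explanation: `[[` with one index returns a whole column from a data frame but only one element from a matrix. A small sketch with invented values (`m`, `df` and `res` are illustrative names):

```r
m  <- cbind(w1 = c(2, 3, 1, 4), w2 = c(4, 2, 3, 1))  # a matrix
df <- as.data.frame(m)                               # same values as a data frame

df[[2]]   # second column of the data frame (4 values)
m[[2]]    # second element of the matrix (a single value)

# With matrices in the list, sapply therefore returns one value per element.
l <- list(m, m, m)
res <- sapply(l, "[[", 2)
length(res)
```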


I'm not enough of a programmer to always make complete sense of the R
help pages. Should I have found this information in the sapply R help
page? Where else could I check before pestering the R mailing list
(which, of course, provides quick and valuable answers)?


Cheers,

Wolfgang




