thr3ads.net - R help - [R] How to extract same columns from identical dataframes in a list? [Feb 2016]

Wolfgang Waser

2016-Feb-08 13:33 UTC

[R] How to extract same columns from identical dataframes in a list?

Hello,

I have a list of 7 data frames, each data frame having 24 rows (hour of
the day) and 5 columns (weeks) with a total of 5 x 24 values.

I would like to combine all 7 columns of week 1 (and 2 ...) in a
separate data frame for hourly calculations, e.g.
> apply(new.data.frame,1,mean)

In some way sapply (lapply) works, but I cannot directly select columns
of the original data frames in the list. As a workaround I have to
select a range of values:
> sapply(list_of_dataframes,"[",1:24)
Values 1:24 give the first column, 25:48 the second and so on.

Is there an easier / more direct way to select for specific columns
instead of selecting a range of values, avoiding loops?


Cheers,

Wolfgang

Ulrik Stervbo

2016-Feb-08 15:33 UTC

Hi Wolfgang,

I'm not sure exactly what you want, but ldply() from the plyr package can
help you make a data.frame from a list of data.frames:

library(plyr)

dfa <- data.frame(cola = LETTERS[1:5], colb = c(1:5))
dfb <- data.frame(cola = LETTERS[1:5], colb = c(1:5))

df.lst <- list(dfa.name = dfa, dfb.name = dfb)

# If you want to use column number
ldply(df.lst, function(cur.df){return(cur.df[, 2])})

# If the column name is always the same
ldply(df.lst, function(cur.df){return(cur.df$colb)})

# Use the entire data.frame
ldply(df.lst, function(cur.df){return(cur.df)})

# The latter can also be done with do.call
do.call(rbind, df.lst)

Hope this helps,
Ulrik

Jeff Newmiller

2016-Feb-08 16:26 UTC

Or

do.call( rbind, list.of.df )

from base R (without some of the robust behaviors that ldply implements).
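
To make the difference concrete, here is a minimal sketch in base R. The two small data frames and the names dfa, dfb and week2 are made up for illustration; they are not the original poster's data. do.call(rbind, ...) stacks the data frames on top of each other, whereas the layout asked for in the question puts one column from each data frame side by side.

```r
# Two small hypothetical data frames with identical columns.
dfa <- data.frame(week1 = 1:3, week2 = 4:6)
dfb <- data.frame(week1 = 7:9, week2 = 10:12)
list.of.df <- list(a = dfa, b = dfb)

# rbind stacks the data frames row-wise: 6 rows, 2 columns.
stacked <- do.call(rbind, list.of.df)

# One column from each data frame, side by side: 3 rows, 2 columns.
week2 <- sapply(list.of.df, function(d) d$week2)
```

Which of the two is appropriate depends on whether the later calculation runs over rows of one column or over whole stacked data frames.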
-- 
Sent from my phone. Please excuse my brevity.


Dénes Tóth

2016-Feb-08 17:00 UTC

Hi,

Although you did not provide any reproducible example, it seems you 
store the same type of values in your data.frames. If this is true, it 
is much more efficient to store your data in an array:

# Example data: a list of two data frames with 24 rows (hours) each
mylist <- list(a = data.frame(week1 = rnorm(24), week2 = rnorm(24)),
               b = data.frame(week1 = rnorm(24), week2 = rnorm(24)))

# Flatten to a vector, then reshape to hour x week x data frame
myarray <- unlist(mylist, use.names = FALSE)
dim(myarray) <- c(nrow(mylist$a), ncol(mylist$a), length(mylist))
dimnames(myarray) <- list(hour = rownames(mylist$a),
                          week = colnames(mylist$a),
                          other = names(mylist))
# now you can do:
mean(myarray[, "week1", "a"])

# or (means over the hour dimension, one value per week and data frame):
colMeans(myarray)


Cheers,
   Denes


Wolfgang Waser

2016-Feb-09 09:03 UTC

Hi,

Sorry if my description was too short / unclear.
> I have a list of 7 data frames, each data frame having 24 rows (hour of
> the day) and 5 columns (weeks) with a total of 5 x 24 values
[1]
      week1  week2  week3  ...
1     x      a      m      ...
2     y      b      n
3     z      c      o
.     .      .      .
.     .      .      .
.     .      .      .
24    .      .      .

[2]
      week1  week2  week3  ...
1     x2     a2     m2     ...
2     y2     b2     n2
3     z2     c2     o2
.     .      .      .
.     .      .      .
.     .      .      .
24    .      .      .

[3]
...

.
.
.

[7]
...

I now would like to extract e.g. all week2 columns of all data frames in
the list and combine them in a new data frame using cbind.

new data frame

week2 ([1])   week2 ([2])   week2 ([3])   ...
a             a2            .
b             b2            .
c             c2            .
.
.
.

I will then do further row-wise calculations using e.g. apply(x,1,mean),
the result being a vector of 24 values.

I have not found a way to extract specific columns of the data frames in
a list.

As mentioned I can use

sapply(list_of_dataframes,"[",1:24)

which will pick the first 24 values (first column) of each data frame in
the list and arrange them as an array of 24 rows and 7 columns (7 data
frames are in the list).
To pick the second column (week2) using sapply I have to use the next 24
values from 25 to 48:

sapply(list_of_dataframes,"[",25:48)

It seems that sapply treats the data frames in the list as vectors. I
can of course extract all consecutive weeks using consecutive blocks of
24 values, but this seems cumbersome.

The question remains, how to select specific columns from data frames in
a list, e.g. all columns 3 of all data frames in the list.
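
One way to do this in base R is sketched below. The simulated list stands in for the real data (the object contents and the week1 ... week5 column names are assumptions, since the thread does not show the actual objects). Note also that a single index in `[` selects columns of a true data frame, so the 1:24 workaround presumably relies on the list elements being matrices; the anonymous-function form below works for both.

```r
# Hypothetical stand-in for the list described above: 7 data frames,
# each with 24 rows (hours) and 5 columns (week1 ... week5).
set.seed(1)
list_of_dataframes <- lapply(1:7, function(i) {
  as.data.frame(matrix(rnorm(24 * 5), nrow = 24,
                       dimnames = list(NULL, paste0("week", 1:5))))
})

# Extract column "week2" from every data frame; sapply binds the
# 7 vectors column-wise into a 24 x 7 matrix.
week2 <- sapply(list_of_dataframes, "[[", "week2")

# The same selection by column position instead of by name.
week2_by_pos <- sapply(list_of_dataframes, function(d) d[, 2])

# Row-wise (hourly) calculation across the 7 data frames.
hourly_mean <- apply(week2, 1, mean)
```

The result is a plain matrix, which apply(x,1,FUN) accepts directly; wrap it in as.data.frame() if a data frame is needed.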

Reformatting (unlist(), dim()) in one data frame with one column for
each week does not help, since I'm not calculating colMeans etc, but
row-wise calculations using apply(x,1,FUN) ("applying a function to
margins of an array or matrix").
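
For what it is worth, row-wise calculations appear to be possible on the array form as well. The following is a sketch only, reusing the two-data-frame layout from the earlier reply with simulated values; the names week2, hourly_week2 and hourly_all are introduced here for illustration.

```r
# Small hypothetical list in the same shape as in the earlier reply.
set.seed(1)
mylist <- list(a = data.frame(week1 = rnorm(24), week2 = rnorm(24)),
               b = data.frame(week1 = rnorm(24), week2 = rnorm(24)))

# Build a 24 x 2 x 2 array: hour x week x data frame.
myarray <- unlist(mylist, use.names = FALSE)
dim(myarray) <- c(nrow(mylist$a), ncol(mylist$a), length(mylist))
dimnames(myarray) <- list(hour = rownames(mylist$a),
                          week = colnames(mylist$a),
                          other = names(mylist))

# week2 of every data frame as a 24 x 2 matrix, then a row-wise function.
week2 <- myarray[, "week2", ]
hourly_week2 <- apply(week2, 1, mean)

# The same row-wise function for every week at once: 24 x 2 (hour x week).
hourly_all <- apply(myarray, c(1, 2), mean)
```

Any function can replace mean in the apply() calls, so the array is not limited to colMeans-style summaries.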

Thanks for your help and suggestions!

Wolfgang

-- 
Frankenförder Forschungsgesellschaft mbH
Dr. Wolfgang Waser
Wissenschaftsbereich Berlin
Chausseestraße 10
10115 Berlin
Tel.:  +49(0)30 2809 1936
Fax.:  +49(0)30 2809 1940
E-Mail: waser at frankenfoerder-fg.de

Frankenförder Forschungsgesellschaft mbH (FFG)
Sitz: Luckenwalde, Amtsgericht Potsdam, HRB: 6499
Geschäftsführerin: Dipl. Agraring. Doreen Sparborth
Tel.: +49(0)30 2809 1931, E-Mail: info at frankenfoerder-fg.de
http://www.frankenfoerder-fg.de
