thr3ads.net - R help - [R] Split a dataframe by rownames and/or colnames [Feb 2015]

Tim Richter-Heitmann

2015-Feb-20 17:33 UTC

[R] Split a dataframe by rownames and/or colnames

Dear List,

Consider this example

df <- data.frame(matrix(rnorm(9*9), ncol=9))
names(df) <- c("c_1", "d_1", "e_1",
               "a_p", "b_p", "c_p",
               "1_o1", "2_o1", "3_o1")
row.names(df) <- names(df)


indx <- gsub(".*_", "", names(df))

I can split the data frame by column, using the index given by the part
of each column name after the underscore "_":

list2env(
   setNames(
     lapply(split(colnames(df), indx), function(x) df[x]),
     paste('df', sort(unique(indx)), sep="_")),
   envir=.GlobalEnv)

However, I changed my mind and now want to split by row names instead.
Exchanging colnames with rownames does not work; it gives exactly the
same output (9 rows x 3 columns). I could use
as.data.frame(t(df_x)),
but that does not seem elegant.
What would be the solution for splitting the data frame by rows?

Thank you very much!

-- 
Tim Richter-Heitmann

Bert Gunter

2015-Feb-20 18:25 UTC

I think ?tapply and friends (?by, ?aggregate, ?ave) are what you want.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll
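
For readers following up on this pointer: a minimal sketch of how by() and aggregate() could be applied to row groups, assuming the end goal is a per-group computation rather than separate data frames. The seed, the grouping variable grp and the use of column means are illustrative assumptions, not part of the original post.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(matrix(rnorm(9 * 9), ncol = 9))
names(df) <- c("c_1", "d_1", "e_1",
               "a_p", "b_p", "c_p",
               "1_o1", "2_o1", "3_o1")
row.names(df) <- names(df)

# Group label for each row: the part of the row name after the underscore
grp <- sub(".*_", "", rownames(df))

# by() splits the rows by grp and applies the function to each piece
col_means <- by(df, grp, function(d) colMeans(d))

# aggregate() computes the same group means but returns one data frame
agg <- aggregate(df, by = list(group = grp), FUN = mean)
```

Here col_means[["p"]] holds the column means of rows a_p, b_p and c_p, and agg has one row per group.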





David Winsemius

2015-Feb-20 19:36 UTC

On Feb 20, 2015, at 9:33 AM, Tim Richter-Heitmann wrote:
> What would be the solution for splitting the dataframe by rows?

The split.data.frame method seems to work perfectly well with a rownames-derived
index argument:

> split(df, sub(".+_","", rownames(df) ) )
$`1`
      c_1   d_1  e_1   a_p   b_p   c_p  1_o1 2_o1  3_o1
c_1 -0.11 -0.04 1.33 -0.87 -0.16 -0.25 -0.75 0.34  0.14
d_1 -0.62 -0.94 0.80 -0.78 -0.70  0.74  0.11 1.44 -0.33
e_1  0.98 -0.83 0.48  0.19 -0.32 -1.01  1.28 1.04 -2.16

$o1
       c_1   d_1   e_1   a_p   b_p   c_p  1_o1  2_o1  3_o1
1_o1 -0.93 -0.02  0.69 -0.67  1.04  1.04 -1.50 -0.36  0.50
2_o1  0.02 -0.16 -0.09 -1.50 -0.02 -1.04  1.07 -0.45  1.56
3_o1 -1.42  0.88 -0.05  0.85 -1.35  0.21  1.35  0.92 -0.76

$p
      c_1   d_1   e_1   a_p  b_p   c_p  1_o1  2_o1  3_o1
a_p -1.35  0.91 -0.58 -0.63 0.94 -1.13  0.71  0.25  0.82
b_p -0.25 -0.73 -0.41 -1.71 1.28  0.19 -0.35  1.74 -0.93
c_p -0.01 -1.11 -0.12  0.58 1.51  0.03 -0.99 -0.23 -0.03

David Winsemius
Alameda, CA, USA
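
A note on why swapping colnames for rownames in the original code changed nothing: df[x] with a character vector always selects columns, whatever the source of x, whereas df[x, ] selects rows. The sketch below illustrates this on the thread's example; the fixed seed and the names by_col, by_row and by_split are assumptions added for illustration.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(matrix(rnorm(9 * 9), ncol = 9))
names(df) <- c("c_1", "d_1", "e_1",
               "a_p", "b_p", "c_p",
               "1_o1", "2_o1", "3_o1")
row.names(df) <- names(df)

indx <- sub(".*_", "", rownames(df))

# df[x] treats x as column names, even though x came from rownames(df)
by_col <- lapply(split(rownames(df), indx), function(x) df[x])

# df[x, , drop = FALSE] treats x as row names
by_row <- lapply(split(rownames(df), indx), function(x) df[x, , drop = FALSE])

# split() on the data frame itself yields the same row groups
by_split <- split(df, indx)

dim(by_col[["p"]])  # 9 3
dim(by_row[["p"]])  # 3 9
```

The row and column names coincide in this example, which is why the column-selecting version ran without error instead of failing on unknown column names.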

Tim Richter-Heitmann

2015-Feb-23 12:03 UTC

Thank you very much for the line; it does the split as suggested.
However, I want to release all the data frames into the environment
(later on, some dozen lines of code will be run on each data frame, and
I don't know how to do that with lapply or a for loop, so I do it
separately):

list2env(split(df, sub(".+_","", rownames(df))),
         envir=.GlobalEnv)

However, some of the data frames now have numeric names and cannot
easily be accessed because of it.
How would the line be altered to add a "df_" prefix to each of the
data frame names resulting from list2env?

Thank you very much!
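
The thread as archived contains no reply to this follow-up. One possible approach, sketched under the assumption that renaming the list elements before calling list2env is acceptable, is shown below. The names pieces, process_one and results are hypothetical, and process_one is only a stand-in for the "dozen lines of code" mentioned above.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(matrix(rnorm(9 * 9), ncol = 9))
names(df) <- c("c_1", "d_1", "e_1",
               "a_p", "b_p", "c_p",
               "1_o1", "2_o1", "3_o1")
row.names(df) <- names(df)

# Split by the row-name suffix, then prefix each list name with "df_"
pieces <- split(df, sub(".+_", "", rownames(df)))
names(pieces) <- paste0("df_", names(pieces))  # "df_1" "df_o1" "df_p"

# Release the renamed data frames into the global environment
list2env(pieces, envir = .GlobalEnv)

# Alternative: keep the list and apply the same steps to every piece.
# process_one is a hypothetical placeholder for the per-data-frame code.
process_one <- function(d) colMeans(d)
results <- lapply(pieces, process_one)
```

Keeping the pieces in a list and wrapping the per-data-frame steps in a function avoids having to address each data frame by name.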




-- 
Tim Richter-Heitmann (M.Sc.)
PhD Candidate



International Max-Planck Research School for Marine Microbiology
University of Bremen
Microbial Ecophysiology Group (AG Friedrich)
FB02 - Biologie/Chemie
Leobener Straße (NW2 A2130)
D-28359 Bremen
Tel.: 0049(0)421 218-63062
Fax: 0049(0)421 218-63069
