thr3ads.net - R help - [R] column selection for aggregate() [Jan 2010]


Ivan Calandra

2010-Jan-18 14:53 UTC

[R] column selection for aggregate()

Hi everybody!

I'm working in R today, so I have a lot of questions (you may have
noticed that this is my 3rd email today). I'm new to R, so please excuse
the "spam"!

I have a dataset "ssfa" with many rows and the column names are:
 > names(ssfa)
 [1] "SPECSHOR"  "BONE"      "TO_POS"    "MEASUREM"  "FACETTE"   "SHEARFAC"
 [7] "ENA_BA"    "SEL_FACET" "SEL_MEAS"  "Asfc"      "Smc"       "epLsar"
[13] "HAsfc4"    "HAsfc9"    "HAsfc16"   "HAsfc25"   "HAsfc36"   "HAsfc49"
[19] "HAsfc64"   "HAsfc81"   "HAsfc100"  "HAsfc121"  "Tfv"       "Ftfv"

I want to aggregate like this:
ssfamean <- aggregate(ssfa[c("Asfc", "Smc", "epLsar", "HAsfc4", "HAsfc9",
                             "HAsfc16", "HAsfc25", "HAsfc36", "HAsfc49",
                             "HAsfc64", "HAsfc81", "HAsfc100", "HAsfc121",
                             "Tfv", "Ftfv")],
                      ssfa[c("SPECSHOR", "BONE", "TO_POS", "FACETTE",
                             "SHEARFAC", "ENA_BA")],
                      mean)

As you can see, the call is very long because I have many variables.
Basically, I want to select all numerical variables (columns 10 to 24) and
all categorical variables except MEASUREM, SEL_FACET and SEL_MEAS, without
having to write each of them out. I would also like to avoid typing the
names; selecting by index would be nice.
I tried with:
 > ssfamean <- aggregate(ssfa[c(ssfa[[10]]:ssfa[[24]])],
                         ssfa[c("SPECSHOR", "BONE", "TO_POS", "FACETTE",
                                "SHEARFAC", "ENA_BA")],
                         mean)
but it obviously doesn't work (well, "obviously"...)

Could anyone help me on this?
Thanks in advance
Ivan


b k

2010-Jan-18 15:03 UTC


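Numeric column indexing?

e.g.:
ssfa[10:24]    # data frame containing columns 10 to 24
ssfa[[10]]     # a single column as a vector ([[ ]] accepts only one index)

col_index <- match("Asfc", names(ssfa))   # position of the "Asfc" column
total_num_cols <- ncol(ssfa)              # index of the last column
ssfa[col_index:total_num_cols]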
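
Carrying the index idea over to aggregate(), a minimal sketch could look like the one below. It uses a small made-up data frame called toy (a hypothetical stand-in, not the real ssfa), so the column positions 1, 2, 4 and 5 are assumptions that only fit this toy layout.

```r
# Toy stand-in for ssfa (hypothetical data; only the column layout matters)
set.seed(1)
toy <- data.frame(
  SPECSHOR = rep(c("a", "b"), each = 4),
  BONE     = rep(c("x", "y"), times = 4),
  MEASUREM = 1:8,        # a column we want to leave out of the grouping
  Asfc     = rnorm(8),
  Smc      = rnorm(8)
)

group_cols <- c(1, 2)    # grouping columns, selected by position
value_cols <- 4:5        # numeric columns to average, selected by position

# A data frame is a list, so toy[group_cols] can be passed directly as 'by'
toy_mean <- aggregate(toy[value_cols], by = toy[group_cols], FUN = mean)
print(toy_mean)
```

For ssfa itself, and assuming the column order shown in names(ssfa) above, the equivalent call would presumably be aggregate(ssfa[10:24], ssfa[c(1:3, 5:7)], mean), since columns 4, 8 and 9 are the ones to skip.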


I'm new to R too, but that's how I would approach it.

Ben K.


Gabor Grothendieck

2010-Jan-18 15:47 UTC


Try summaryBy in the doBy package. For example, using the built-in CO2
data set, summarize each numeric variable by each factor except for the
factors Plant and Type:

library(doBy)
summaryBy(. ~ ., data = subset(CO2, select = - c(Plant, Type)))
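
If doBy is not installed, roughly the same result can be sketched in base R with the formula interface of aggregate(). This is only an approximation of the call above: the grouping factor (Treatment) has to be named explicitly on the right-hand side, while the dot on the left-hand side stands for all remaining columns. The name co2_mean is just an illustrative choice.

```r
# Drop the factors we do not want to group by, as in the summaryBy example
co2_sub <- subset(CO2, select = -c(Plant, Type))

# '.' on the left means "every column not used on the right",
# here the numeric columns conc and uptake
co2_mean <- aggregate(. ~ Treatment, data = co2_sub, FUN = mean)
print(co2_mean)
```

With several grouping factors they would be joined with +, e.g. . ~ Treatment + Type, which is the part summaryBy's right-hand dot fills in automatically.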


