Hi everybody!

I'm working in R today, so I have a lot of questions (you may have noticed that this is my 3rd email today). I'm new to R, so please excuse the "spam"!

I have a dataset "ssfa" with many rows, and its column names are:

> names(ssfa)
 [1] "SPECSHOR"  "BONE"      "TO_POS"    "MEASUREM"  "FACETTE"   "SHEARFAC"
 [7] "ENA_BA"    "SEL_FACET" "SEL_MEAS"  "Asfc"      "Smc"       "epLsar"
[13] "HAsfc4"    "HAsfc9"    "HAsfc16"   "HAsfc25"   "HAsfc36"   "HAsfc49"
[19] "HAsfc64"   "HAsfc81"   "HAsfc100"  "HAsfc121"  "Tfv"       "Ftfv"

I want to aggregate it this way:

ssfamean <- aggregate(ssfa[c("Asfc", "Smc", "epLsar", "HAsfc4",
    "HAsfc9", "HAsfc16", "HAsfc25", "HAsfc36", "HAsfc49", "HAsfc64",
    "HAsfc81", "HAsfc100", "HAsfc121", "Tfv", "Ftfv")],
    ssfa[c("SPECSHOR", "BONE", "TO_POS", "FACETTE", "SHEARFAC", "ENA_BA")],
    mean)

As you can see, the call is very long because I have many variables. Basically, I want to select all numerical variables (columns 10 to 24) and all categorical variables except MEASUREM, SEL_FACET and SEL_MEAS, without having to write each of them out. I would also like to avoid typing the names; using the indexes would be nice.

I tried:

> ssfamean <- aggregate(ssfa[c(ssfa[[10]]:ssfa[[24]])],
    ssfa[c("SPECSHOR", "BONE", "TO_POS", "FACETTE", "SHEARFAC", "ENA_BA")],
    mean)

but it obviously doesn't work (well, "obviously"...).

Could anyone help me with this?

Thanks in advance,
Ivan
On Mon, Jan 18, 2010 at 9:53 AM, Ivan Calandra <ivan.calandra@uni-hamburg.de> wrote:
> I would also like to avoid writing the names, the
> indexes would be nice.
> I tried with:
> > ssfamean <- aggregate(ssfa[c(ssfa[[10]]:ssfa[[24]])],
> ssfa[c("SPECSHOR", "BONE", "TO_POS", "FACETTE", "SHEARFAC", "ENA_BA")],
> mean)
> but it obviously doesn't work (well "obviously"...)

Numeric column indexing? E.g.:

ssfa[10:24]    # data frame (list of columns) including the column names
# Note: ssfa[[10:24]] does not work here; [[ ]] extracts a single column,
# e.g. ssfa[[10]] returns the values of "Asfc" as a plain vector.

col_index <- match("Asfc", names(ssfa))    # position of the first numeric column
total_num_cols <- ncol(ssfa)               # position of the last column
ssfa[col_index:total_num_cols]

I'm new to R as well, but that's how I would approach it.

Ben K.
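To tie the index-based suggestion back to the original aggregate() call, here is a minimal sketch. It assumes a made-up stand-in for "ssfa" (the real data were not posted), so only the 24 column names come from the thread; the values and the helper names num_idx and grp_idx are hypothetical.

```r
# Toy stand-in for the poster's data: same 24 column names, made-up values.
set.seed(1)
col_names <- c("SPECSHOR", "BONE", "TO_POS", "MEASUREM", "FACETTE", "SHEARFAC",
               "ENA_BA", "SEL_FACET", "SEL_MEAS", "Asfc", "Smc", "epLsar",
               "HAsfc4", "HAsfc9", "HAsfc16", "HAsfc25", "HAsfc36", "HAsfc49",
               "HAsfc64", "HAsfc81", "HAsfc100", "HAsfc121", "Tfv", "Ftfv")
n <- 12
cats <- as.data.frame(replicate(9, sample(c("a", "b"), n, replace = TRUE)),
                      stringsAsFactors = FALSE)   # 9 categorical columns
nums <- as.data.frame(matrix(rnorm(n * 15), nrow = n))  # 15 numeric columns
ssfa <- cbind(cats, nums)
names(ssfa) <- col_names

# Select columns by position instead of typing every name.
num_idx <- 10:24                        # numeric variables, Asfc to Ftfv
grp_idx <- setdiff(1:9, c(4, 8, 9))     # drop MEASUREM, SEL_FACET, SEL_MEAS

# Single brackets keep a data frame (a list of columns), which is what
# aggregate() expects for both the data and the grouping variables.
ssfamean <- aggregate(ssfa[num_idx], by = ssfa[grp_idx], FUN = mean)
```

The original attempt failed because ssfa[[10]]:ssfa[[24]] builds a sequence from the data values in columns 10 and 24 rather than from the column positions.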
Try summaryBy in the doBy package. For example, using the built-in CO2 data set, summarize each numeric variable by each factor except for the factors Plant and Type:

library(doBy)
summaryBy(. ~ ., data = subset(CO2, select = -c(Plant, Type)))

On Mon, Jan 18, 2010 at 9:53 AM, Ivan Calandra <ivan.calandra at uni-hamburg.de> wrote:
> As you can see, it is very long since I have many variables. Basically I
> want to select all numerical variables (10 to 24), and all categorical
> variables except MEASUREM, SEL_FACET and SEL_MEAS without having to
> write each of them. I would also like to avoid writing the names, the
> indexes would be nice.
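For readers who would rather stay in base R, the same idea (every numeric variable summarized by every remaining factor) can be sketched with aggregate() by detecting column types. This is an alternative under stated assumptions, not the doBy method itself; the names is_num, drop_cols, grp and co2mean are illustrative only.

```r
# Detect numeric columns by type instead of listing names or positions.
is_num <- sapply(CO2, is.numeric)        # TRUE for conc and uptake

# Grouping variables: all non-numeric columns except the ones to leave out.
drop_cols <- c("Plant", "Type")
grp <- setdiff(names(CO2)[!is_num], drop_cols)   # leaves "Treatment"

# Mean of every numeric column within each level of the remaining factor(s).
co2mean <- aggregate(CO2[is_num], by = CO2[grp], FUN = mean)
```

Applied to the "ssfa" data, drop_cols would presumably hold "MEASUREM", "SEL_FACET" and "SEL_MEAS", provided the categorical columns are stored as factors or character vectors rather than numbers.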