I'm trying to write a function to find the means over factors of the responses in a mlm (something I would do easily in SAS with PROC SUMMARY). The not-working stub of a function to do what I want is below, and my problem is that I don't know how to call aggregate (or some other function) in the context of terms in a linear model extracted from a lm/mlm object. means.mlm <- function(mod, terms, data) { responses <-dimnames(mod$fitted.values)[[2]] if (missing(terms)) terms <- mod$terms n.terms <- length(terms) for (term in 1:n.terms){ label <- attr(terms[term], "term.labels") cat(label,":\n") means <- aggregate(data, list(label), FUN=mean) print(means) } } Here is a sample context-- a randomized block design with two crossed factors (Contour, Depth) and 9 responses (pH, N, ... Conduc). [An R-readable version of the data is appended at the bottom.] > soils <- read.delim("soils.dat") > str(soils) 'data.frame': 48 obs. of 14 variables: $ Group : int 1 1 1 1 2 2 2 2 3 3 ... $ Contour: Factor w/ 3 levels "Depression","Slope",..: 3 3 3 3 3 3 3 3 3 3 ... $ Depth : Factor w/ 4 levels "0-10","10-30",..: 1 1 1 1 2 2 2 2 3 3 ... $ Gp : Factor w/ 12 levels "D0","D1","D3",..: 9 9 9 9 10 10 10 10 11 11 ... $ Block : int 1 2 3 4 1 2 3 4 1 2 ... $ pH : num 5.4 5.65 5.14 5.14 5.14 5.1 4.7 4.46 4.37 4.39 ... $ N : num 0.188 0.165 0.26 0.169 0.164 0.094 0.1 0.112 0.112 0.058 ... $ Dens : num 0.92 1.04 0.95 1.1 1.12 1.22 1.52 1.47 1.07 1.54 ... $ P : int 215 208 300 248 174 129 117 170 121 115 ... $ Ca : num 16.4 12.3 13.0 11.9 14.2 ... $ Mg : num 7.65 5.15 5.68 7.88 8.12 ... $ K : num 0.72 0.71 0.68 1.09 0.7 0.81 0.39 0.7 0.74 0.77 ... $ Na : num 1.14 0.94 0.6 1.01 2.17 2.67 3.32 3.76 5.74 5.85 ... $ Conduc : num 1.09 1.35 1.41 1.64 1.85 3.18 4.16 5.14 5.73 6.45 ... > I fit a mlm model, > attach(soils) > soils.mod <- lm(cbind(pH,N,Dens,P,Ca,Mg,K,Na,Conduc) ~ Block + Contour*Depth) and then I want to get tables of the means over the factors in the model terms. From the console, I can get the means for a factor or model term (but with annoying warning messages > (m1<-aggregate(soils, list(Contour), mean)) Group.1 Group Contour Depth Gp Block pH N Dens P Ca Mg 1 Depression 10.5 NA NA NA 2.5 4.691875 0.0873125 1.343125 188.1875 7.101250 8.98625 2 Slope 6.5 NA NA NA 2.5 4.746250 0.1059375 1.332500 159.4375 8.109375 8.32000 3 Top 2.5 NA NA NA 2.5 4.570000 0.1125625 1.271875 150.8750 8.877500 8.08750 K Na Conduc 1 0.378125 5.823125 6.945625 2 0.415625 6.103125 6.963750 3 0.605000 4.872500 5.856250 Warning messages: 1: argument is not numeric or logical: returning NA in: mean.default(X[[1]], ...) 2: argument is not numeric or logical: returning NA in: mean.default(X[[2]], ...) ... (suppressed) ... 9: argument is not numeric or logical: returning NA in: mean.default(X[[3]], ...) > When I call my function, I get: > means.mlm(soils.mod, data=soils) Block : Error in FUN(X[[1]], ...) : arguments must have same length > traceback() 6: stop("arguments must have same length") 5: FUN(X[[1]], ...) 4: lapply(x, tapply, by, FUN, ..., simplify = FALSE) 3: aggregate.data.frame(data, list(label), FUN = mean) 2: aggregate(data, list(label), FUN = mean) 1: means.mlm(soils.mod, data = soils) > The soils data: soils <- structure(list(Group = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 11, 11, 11, 11, 12, 12, 12, 12), Contour = structure(c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), .Label = c("Depression", "Slope", "Top"), class = "factor"), Depth = structure(c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4), .Label = c("0-10", "10-30", "30-60", "60-90"), class = "factor"), Gp = structure(c(9, 9, 9, 9, 10, 10, 10, 10, 11, 11, 11, 11, 12, 12, 12, 12, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4), .Label = c("D0", "D1", "D3", "D6", "S0", "S1", "S3", "S6", "T0", "T1", "T3", "T6"), class = "factor"), Block = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4), pH = c(5.4, 5.65, 5.14, 5.14, 5.14, 5.1, 4.7, 4.46, 4.37, 4.39, 4.17, 3.89, 3.88, 4.07, 3.88, 3.74, 5.11, 5.46, 5.61, 5.85, 4.57, 5.11, 4.78, 6.67, 3.96, 4, 4.12, 4.99, 3.8, 3.96, 3.93, 4.02, 5.24, 5.2, 5.3, 5.67, 4.46, 4.91, 4.79, 5.36, 3.94, 4.52, 4.35, 4.64, 3.82, 4.24, 4.22, 4.41 ), N = c(0.188, 0.165, 0.26, 0.169, 0.164, 0.094, 0.1, 0.112, 0.112, 0.058, 0.078, 0.07, 0.077, 0.046, 0.055, 0.053, 0.247, 0.298, 0.145, 0.186, 0.102, 0.097, 0.122, 0.083, 0.059, 0.05, 0.086, 0.048, 0.049, 0.036, 0.048, 0.039, 0.194, 0.256, 0.136, 0.127, 0.087, 0.092, 0.047, 0.095, 0.054, 0.051, 0.032, 0.065, 0.038, 0.035, 0.03, 0.058), Dens = c(0.92, 1.04, 0.95, 1.1, 1.12, 1.22, 1.52, 1.47, 1.07, 1.54, 1.26, 1.42, 1.25, 1.54, 1.53, 1.4, 0.94, 0.96, 1.1, 1.2, 1.37, 1.3, 1.3, 1.42, 1.53, 1.5, 1.55, 1.46, 1.48, 1.28, 1.42, 1.51, 1, 0.78, 1, 1.13, 1.24, 1.47, 1.46, 1.26, 1.6, 1.53, 1.55, 1.46, 1.4, 1.47, 1.56, 1.58), P = c(215, 208, 300, 248, 174, 129, 117, 170, 121, 115, 112, 117, 127, 91, 91, 79, 261, 300, 242, 229, 156, 139, 214, 132, 98, 115, 148, 97, 108, 103, 109, 100, 445, 380, 259, 248, 276, 158, 121, 195, 148, 115, 82, 152, 105, 100, 97, 130), Ca = c(16.35, 12.25, 13.02, 11.92, 14.17, 8.55, 8.74, 9.49, 8.85, 4.73, 6.29, 6.61, 6.41, 3.82, 4.98, 5.86, 13.25, 12.3, 9.66, 13.78, 8.58, 8.58, 8.22, 12.68, 4.8, 5.06, 6.16, 7.49, 3.82, 4.78, 4.93, 5.66, 12.27, 11.39, 9.96, 9.12, 7.24, 7.37, 6.99, 8.59, 4.85, 6.34, 5.99, 4.43, 4.65, 4.56, 5.29, 4.58), Mg = c(7.65, 5.15, 5.68, 7.88, 8.12, 6.92, 8.16, 9.16, 10.35, 6.91, 7.95, 9.76, 10.96, 6.61, 8, 10.14, 7.55, 7.5, 6.76, 7.12, 9.92, 8.69, 7.75, 9.56, 10, 8.91, 7.58, 9.38, 8.8, 7.29, 7.47, 8.84, 6.27, 7.55, 8.08, 7.04, 9.4, 10.57, 9.91, 8.66, 9.62, 9.78, 9.73, 10.54, 9.85, 8.95, 8.37, 9.46), K = c(0.72, 0.71, 0.68, 1.09, 0.7, 0.81, 0.39, 0.7, 0.74, 0.77, 0.26, 0.41, 0.56, 0.5, 0.23, 0.41, 0.61, 0.68, 0.63, 0.62, 0.63, 0.42, 0.32, 0.55, 0.36, 0.28, 0.16, 0.4, 0.24, 0.24, 0.14, 0.37, 0.72, 0.78, 0.45, 0.55, 0.43, 0.59, 0.3, 0.48, 0.18, 0.34, 0.22, 0.22, 0.18, 0.33, 0.14, 0.14), Na = c(1.14, 0.94, 0.6, 1.01, 2.17, 2.67, 3.32, 3.76, 5.74, 5.85, 5.3, 8.3, 9.67, 7.67, 8.78, 11.04, 1.86, 2, 1.01, 3.09, 3.67, 4.7, 3.07, 8.3, 6.52, 7.91, 6.39, 9.7, 9.57, 9.67, 9.65, 10.54, 1.02, 1.63, 1.97, 1.43, 4.17, 5.07, 5.15, 4.17, 7.2, 8.52, 7.02, 7.61, 10.15, 10.51, 8.27, 9.28 ), Conduc = c(1.09, 1.35, 1.41, 1.64, 1.85, 3.18, 4.16, 5.14, 5.73, 6.45, 8.37, 9.21, 10.64, 10.07, 11.26, 12.15, 2.61, 1.98, 0.76, 2.85, 3.24, 4.63, 3.67, 8.1, 7.72, 9.78, 9.07, 9.13, 11.57, 11.42, 13.32, 11.57, 0.75, 2.2, 2.27, 0.67, 5.08, 6.37, 6.82, 3.65, 10.14, 9.74, 8.6, 9.09, 12.26, 11.29, 9.51, 12.69)), .Names = c("Group", "Contour", "Depth", "Gp", "Block", "pH", "N", "Dens", "P", "Ca", "Mg", "K", "Na", "Conduc" ), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48")) -- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html Toronto, ONT M3J 1P3 CANADA
I am not sure I understand the question -- is it how to get the same result you did but without warnings? The warnings are, of course, there since you are trying to take means of factors and its not clear what that really is supposed to do. At any rate you could eliminate the factors and then take the means of what is left or you could convert the factors to numeric if that makes any sense. Here are a few alternatives using aggregate and summaryBy from the doBy package aggregate(data.matrix(soils), soils["Contour"], mean) # same but removing factors aggregate(data.matrix(soils[,-(2:4)]), soils["Contour"], mean) library(doBy) summaryBy(. ~ Contour, soils, keep.names = TRUE) # same but removing factors summaryBy(.~ Contour, soils[,-(3:4)], keep.names = TRUE) On 11/21/06, Michael Friendly <friendly at yorku.ca> wrote:> I'm trying to write a function to find the means over factors of the > responses in a mlm (something I would do easily in SAS with PROC SUMMARY). > > The not-working stub of a function to do what I want is below, > and my problem is that I don't know how to call aggregate (or > some other function) in the context of terms in a linear model > extracted from a lm/mlm object. > > means.mlm <- function(mod, terms, data) { > responses <-dimnames(mod$fitted.values)[[2]] > if (missing(terms)) terms <- mod$terms > n.terms <- length(terms) > > for (term in 1:n.terms){ > label <- attr(terms[term], "term.labels") > cat(label,":\n") > means <- aggregate(data, list(label), FUN=mean) > print(means) > } > } > > Here is a sample context-- a randomized block design > with two crossed factors (Contour, Depth) and 9 responses (pH, N, ... > Conduc). [An R-readable version of the data is appended at the bottom.] > > > soils <- read.delim("soils.dat") > > str(soils) > 'data.frame': 48 obs. of 14 variables: > $ Group : int 1 1 1 1 2 2 2 2 3 3 ... > $ Contour: Factor w/ 3 levels "Depression","Slope",..: 3 3 3 3 3 3 3 3 > 3 3 ... > $ Depth : Factor w/ 4 levels "0-10","10-30",..: 1 1 1 1 2 2 2 2 3 3 ... > $ Gp : Factor w/ 12 levels "D0","D1","D3",..: 9 9 9 9 10 10 10 10 > 11 11 ... > $ Block : int 1 2 3 4 1 2 3 4 1 2 ... > $ pH : num 5.4 5.65 5.14 5.14 5.14 5.1 4.7 4.46 4.37 4.39 ... > $ N : num 0.188 0.165 0.26 0.169 0.164 0.094 0.1 0.112 0.112 > 0.058 ... > $ Dens : num 0.92 1.04 0.95 1.1 1.12 1.22 1.52 1.47 1.07 1.54 ... > $ P : int 215 208 300 248 174 129 117 170 121 115 ... > $ Ca : num 16.4 12.3 13.0 11.9 14.2 ... > $ Mg : num 7.65 5.15 5.68 7.88 8.12 ... > $ K : num 0.72 0.71 0.68 1.09 0.7 0.81 0.39 0.7 0.74 0.77 ... > $ Na : num 1.14 0.94 0.6 1.01 2.17 2.67 3.32 3.76 5.74 5.85 ... > $ Conduc : num 1.09 1.35 1.41 1.64 1.85 3.18 4.16 5.14 5.73 6.45 ... > > > I fit a mlm model, > > > attach(soils) > > soils.mod <- lm(cbind(pH,N,Dens,P,Ca,Mg,K,Na,Conduc) ~ Block + > Contour*Depth) > > and then I want to get tables of the means over the factors in the model > terms. From the console, I can get the means for a factor or model > term (but with annoying warning messages > > > (m1<-aggregate(soils, list(Contour), mean)) > Group.1 Group Contour Depth Gp Block pH N Dens > P Ca Mg > 1 Depression 10.5 NA NA NA 2.5 4.691875 0.0873125 1.343125 > 188.1875 7.101250 8.98625 > 2 Slope 6.5 NA NA NA 2.5 4.746250 0.1059375 1.332500 > 159.4375 8.109375 8.32000 > 3 Top 2.5 NA NA NA 2.5 4.570000 0.1125625 1.271875 > 150.8750 8.877500 8.08750 > K Na Conduc > 1 0.378125 5.823125 6.945625 > 2 0.415625 6.103125 6.963750 > 3 0.605000 4.872500 5.856250 > Warning messages: > 1: argument is not numeric or logical: returning NA in: > mean.default(X[[1]], ...) > 2: argument is not numeric or logical: returning NA in: > mean.default(X[[2]], ...) > ... (suppressed) ... > 9: argument is not numeric or logical: returning NA in: > mean.default(X[[3]], ...) > > > > When I call my function, I get: > > > means.mlm(soils.mod, data=soils) > Block : > Error in FUN(X[[1]], ...) : arguments must have same length > > traceback() > 6: stop("arguments must have same length") > 5: FUN(X[[1]], ...) > 4: lapply(x, tapply, by, FUN, ..., simplify = FALSE) > 3: aggregate.data.frame(data, list(label), FUN = mean) > 2: aggregate(data, list(label), FUN = mean) > 1: means.mlm(soils.mod, data = soils) > > > > The soils data: > > soils <- > structure(list(Group = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, > 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 9, > 9, 9, 9, 10, 10, 10, 10, 11, 11, 11, 11, 12, 12, 12, 12), Contour > structure(c(3, > 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, > 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, > 1, 1, 1, 1, 1), .Label = c("Depression", "Slope", "Top"), class > "factor"), > Depth = structure(c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, > 4, 4, 4, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, > 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4), .Label = c("0-10", > "10-30", "30-60", "60-90"), class = "factor"), Gp = structure(c(9, > 9, 9, 9, 10, 10, 10, 10, 11, 11, 11, 11, 12, 12, 12, 12, > 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 1, 1, 1, > 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4), .Label = c("D0", > "D1", "D3", "D6", "S0", "S1", "S3", "S6", "T0", "T1", "T3", > "T6"), class = "factor"), Block = c(1, 2, 3, 4, 1, 2, 3, > 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, > 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, > 2, 3, 4), pH = c(5.4, 5.65, 5.14, 5.14, 5.14, 5.1, 4.7, 4.46, > 4.37, 4.39, 4.17, 3.89, 3.88, 4.07, 3.88, 3.74, 5.11, 5.46, > 5.61, 5.85, 4.57, 5.11, 4.78, 6.67, 3.96, 4, 4.12, 4.99, > 3.8, 3.96, 3.93, 4.02, 5.24, 5.2, 5.3, 5.67, 4.46, 4.91, > 4.79, 5.36, 3.94, 4.52, 4.35, 4.64, 3.82, 4.24, 4.22, 4.41 > ), N = c(0.188, 0.165, 0.26, 0.169, 0.164, 0.094, 0.1, 0.112, > 0.112, 0.058, 0.078, 0.07, 0.077, 0.046, 0.055, 0.053, 0.247, > 0.298, 0.145, 0.186, 0.102, 0.097, 0.122, 0.083, 0.059, 0.05, > 0.086, 0.048, 0.049, 0.036, 0.048, 0.039, 0.194, 0.256, 0.136, > 0.127, 0.087, 0.092, 0.047, 0.095, 0.054, 0.051, 0.032, 0.065, > 0.038, 0.035, 0.03, 0.058), Dens = c(0.92, 1.04, 0.95, 1.1, > 1.12, 1.22, 1.52, 1.47, 1.07, 1.54, 1.26, 1.42, 1.25, 1.54, > 1.53, 1.4, 0.94, 0.96, 1.1, 1.2, 1.37, 1.3, 1.3, 1.42, 1.53, > 1.5, 1.55, 1.46, 1.48, 1.28, 1.42, 1.51, 1, 0.78, 1, 1.13, > 1.24, 1.47, 1.46, 1.26, 1.6, 1.53, 1.55, 1.46, 1.4, 1.47, > 1.56, 1.58), P = c(215, 208, 300, 248, 174, 129, 117, 170, > 121, 115, 112, 117, 127, 91, 91, 79, 261, 300, 242, 229, > 156, 139, 214, 132, 98, 115, 148, 97, 108, 103, 109, 100, > 445, 380, 259, 248, 276, 158, 121, 195, 148, 115, 82, 152, > 105, 100, 97, 130), Ca = c(16.35, 12.25, 13.02, 11.92, 14.17, > 8.55, 8.74, 9.49, 8.85, 4.73, 6.29, 6.61, 6.41, 3.82, 4.98, > 5.86, 13.25, 12.3, 9.66, 13.78, 8.58, 8.58, 8.22, 12.68, > 4.8, 5.06, 6.16, 7.49, 3.82, 4.78, 4.93, 5.66, 12.27, 11.39, > 9.96, 9.12, 7.24, 7.37, 6.99, 8.59, 4.85, 6.34, 5.99, 4.43, > 4.65, 4.56, 5.29, 4.58), Mg = c(7.65, 5.15, 5.68, 7.88, 8.12, > 6.92, 8.16, 9.16, 10.35, 6.91, 7.95, 9.76, 10.96, 6.61, 8, > 10.14, 7.55, 7.5, 6.76, 7.12, 9.92, 8.69, 7.75, 9.56, 10, > 8.91, 7.58, 9.38, 8.8, 7.29, 7.47, 8.84, 6.27, 7.55, 8.08, > 7.04, 9.4, 10.57, 9.91, 8.66, 9.62, 9.78, 9.73, 10.54, 9.85, > 8.95, 8.37, 9.46), K = c(0.72, 0.71, 0.68, 1.09, 0.7, 0.81, > 0.39, 0.7, 0.74, 0.77, 0.26, 0.41, 0.56, 0.5, 0.23, 0.41, > 0.61, 0.68, 0.63, 0.62, 0.63, 0.42, 0.32, 0.55, 0.36, 0.28, > 0.16, 0.4, 0.24, 0.24, 0.14, 0.37, 0.72, 0.78, 0.45, 0.55, > 0.43, 0.59, 0.3, 0.48, 0.18, 0.34, 0.22, 0.22, 0.18, 0.33, > 0.14, 0.14), Na = c(1.14, 0.94, 0.6, 1.01, 2.17, 2.67, 3.32, > 3.76, 5.74, 5.85, 5.3, 8.3, 9.67, 7.67, 8.78, 11.04, 1.86, > 2, 1.01, 3.09, 3.67, 4.7, 3.07, 8.3, 6.52, 7.91, 6.39, 9.7, > 9.57, 9.67, 9.65, 10.54, 1.02, 1.63, 1.97, 1.43, 4.17, 5.07, > 5.15, 4.17, 7.2, 8.52, 7.02, 7.61, 10.15, 10.51, 8.27, 9.28 > ), Conduc = c(1.09, 1.35, 1.41, 1.64, 1.85, 3.18, 4.16, 5.14, > 5.73, 6.45, 8.37, 9.21, 10.64, 10.07, 11.26, 12.15, 2.61, > 1.98, 0.76, 2.85, 3.24, 4.63, 3.67, 8.1, 7.72, 9.78, 9.07, > 9.13, 11.57, 11.42, 13.32, 11.57, 0.75, 2.2, 2.27, 0.67, > 5.08, 6.37, 6.82, 3.65, 10.14, 9.74, 8.6, 9.09, 12.26, 11.29, > 9.51, 12.69)), .Names = c("Group", "Contour", "Depth", "Gp", > "Block", "pH", "N", "Dens", "P", "Ca", "Mg", "K", "Na", "Conduc" > ), class = "data.frame", row.names = c("1", "2", "3", "4", "5", > "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", > "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", > "28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", > "39", "40", "41", "42", "43", "44", "45", "46", "47", "48")) > > -- > Michael Friendly Email: friendly AT yorku DOT ca > Professor, Psychology Dept. > York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 > 4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html > Toronto, ONT M3J 1P3 CANADA > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >
Dear Mike, I don't have time this morning to work this out in detail, but perhaps the following will help: model.response(model.frame(soils.mod)) will give you the response-variable matrix for the model. variable.names(model.frame(soils.mod)) will give you the names of the variables in the model, with the lhs first. If you need more information about the structure of the model, you'll find that in terms(soils.mod) (but I think that you've already discovered that). I hope this helps, John -------------------------------- John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox --------------------------------> -----Original Message----- > From: r-help-bounces at stat.math.ethz.ch > [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of > Michael Friendly > Sent: Tuesday, November 21, 2006 11:21 AM > To: R-Help > Subject: [R] means over factors in mlm terms > > I'm trying to write a function to find the means over factors > of the responses in a mlm (something I would do easily in SAS > with PROC SUMMARY). > > The not-working stub of a function to do what I want is > below, and my problem is that I don't know how to call > aggregate (or some other function) in the context of terms in > a linear model extracted from a lm/mlm object. > > means.mlm <- function(mod, terms, data) { > responses <-dimnames(mod$fitted.values)[[2]] > if (missing(terms)) terms <- mod$terms > n.terms <- length(terms) > > for (term in 1:n.terms){ > label <- attr(terms[term], "term.labels") > cat(label,":\n") > means <- aggregate(data, list(label), FUN=mean) > print(means) > } > } > > Here is a sample context-- a randomized block design with two > crossed factors (Contour, Depth) and 9 responses (pH, N, ... > Conduc). [An R-readable version of the data is appended at > the bottom.] > > > soils <- read.delim("soils.dat") > > str(soils) > 'data.frame': 48 obs. of 14 variables: > $ Group : int 1 1 1 1 2 2 2 2 3 3 ... > $ Contour: Factor w/ 3 levels "Depression","Slope",..: 3 3 > 3 3 3 3 3 3 > 3 3 ... > $ Depth : Factor w/ 4 levels "0-10","10-30",..: 1 1 1 1 2 > 2 2 2 3 3 ... > $ Gp : Factor w/ 12 levels "D0","D1","D3",..: 9 9 9 9 > 10 10 10 10 > 11 11 ... > $ Block : int 1 2 3 4 1 2 3 4 1 2 ... > $ pH : num 5.4 5.65 5.14 5.14 5.14 5.1 4.7 4.46 4.37 4.39 ... > $ N : num 0.188 0.165 0.26 0.169 0.164 0.094 0.1 0.112 0.112 > 0.058 ... > $ Dens : num 0.92 1.04 0.95 1.1 1.12 1.22 1.52 1.47 1.07 1.54 ... > $ P : int 215 208 300 248 174 129 117 170 121 115 ... > $ Ca : num 16.4 12.3 13.0 11.9 14.2 ... > $ Mg : num 7.65 5.15 5.68 7.88 8.12 ... > $ K : num 0.72 0.71 0.68 1.09 0.7 0.81 0.39 0.7 0.74 0.77 ... > $ Na : num 1.14 0.94 0.6 1.01 2.17 2.67 3.32 3.76 5.74 5.85 ... > $ Conduc : num 1.09 1.35 1.41 1.64 1.85 3.18 4.16 5.14 > 5.73 6.45 ... > > > I fit a mlm model, > > > attach(soils) > > soils.mod <- lm(cbind(pH,N,Dens,P,Ca,Mg,K,Na,Conduc) ~ Block + > Contour*Depth) > > and then I want to get tables of the means over the factors > in the model terms. From the console, I can get the means > for a factor or model term (but with annoying warning messages > > > (m1<-aggregate(soils, list(Contour), mean)) > Group.1 Group Contour Depth Gp Block pH N > Dens > P Ca Mg > 1 Depression 10.5 NA NA NA 2.5 4.691875 0.0873125 1.343125 > 188.1875 7.101250 8.98625 > 2 Slope 6.5 NA NA NA 2.5 4.746250 0.1059375 1.332500 > 159.4375 8.109375 8.32000 > 3 Top 2.5 NA NA NA 2.5 4.570000 0.1125625 1.271875 > 150.8750 8.877500 8.08750 > K Na Conduc > 1 0.378125 5.823125 6.945625 > 2 0.415625 6.103125 6.963750 > 3 0.605000 4.872500 5.856250 > Warning messages: > 1: argument is not numeric or logical: returning NA in: > mean.default(X[[1]], ...) > 2: argument is not numeric or logical: returning NA in: > mean.default(X[[2]], ...) > ... (suppressed) ... > 9: argument is not numeric or logical: returning NA in: > mean.default(X[[3]], ...) > > > > When I call my function, I get: > > > means.mlm(soils.mod, data=soils) > Block : > Error in FUN(X[[1]], ...) : arguments must have same length > > traceback() > 6: stop("arguments must have same length") > 5: FUN(X[[1]], ...) > 4: lapply(x, tapply, by, FUN, ..., simplify = FALSE) > 3: aggregate.data.frame(data, list(label), FUN = mean) > 2: aggregate(data, list(label), FUN = mean) > 1: means.mlm(soils.mod, data = soils) > > > > The soils data: > > soils <- > structure(list(Group = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, > 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, > 9, 9, 9, 9, 10, 10, 10, 10, 11, 11, 11, 11, 12, 12, 12, 12), > Contour = structure(c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, > 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), .Label = > c("Depression", "Slope", "Top"), class = "factor"), > Depth = structure(c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, > 4, 4, 4, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, > 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4), .Label > = c("0-10", > "10-30", "30-60", "60-90"), class = "factor"), Gp = > structure(c(9, > 9, 9, 9, 10, 10, 10, 10, 11, 11, 11, 11, 12, 12, 12, 12, > 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 1, 1, 1, > 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4), .Label = c("D0", > "D1", "D3", "D6", "S0", "S1", "S3", "S6", "T0", "T1", "T3", > "T6"), class = "factor"), Block = c(1, 2, 3, 4, 1, 2, 3, > 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, > 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, > 2, 3, 4), pH = c(5.4, 5.65, 5.14, 5.14, 5.14, 5.1, 4.7, 4.46, > 4.37, 4.39, 4.17, 3.89, 3.88, 4.07, 3.88, 3.74, 5.11, 5.46, > 5.61, 5.85, 4.57, 5.11, 4.78, 6.67, 3.96, 4, 4.12, 4.99, > 3.8, 3.96, 3.93, 4.02, 5.24, 5.2, 5.3, 5.67, 4.46, 4.91, > 4.79, 5.36, 3.94, 4.52, 4.35, 4.64, 3.82, 4.24, 4.22, 4.41 > ), N = c(0.188, 0.165, 0.26, 0.169, 0.164, 0.094, 0.1, 0.112, > 0.112, 0.058, 0.078, 0.07, 0.077, 0.046, 0.055, 0.053, 0.247, > 0.298, 0.145, 0.186, 0.102, 0.097, 0.122, 0.083, 0.059, 0.05, > 0.086, 0.048, 0.049, 0.036, 0.048, 0.039, 0.194, 0.256, 0.136, > 0.127, 0.087, 0.092, 0.047, 0.095, 0.054, 0.051, 0.032, 0.065, > 0.038, 0.035, 0.03, 0.058), Dens = c(0.92, 1.04, 0.95, 1.1, > 1.12, 1.22, 1.52, 1.47, 1.07, 1.54, 1.26, 1.42, 1.25, 1.54, > 1.53, 1.4, 0.94, 0.96, 1.1, 1.2, 1.37, 1.3, 1.3, 1.42, 1.53, > 1.5, 1.55, 1.46, 1.48, 1.28, 1.42, 1.51, 1, 0.78, 1, 1.13, > 1.24, 1.47, 1.46, 1.26, 1.6, 1.53, 1.55, 1.46, 1.4, 1.47, > 1.56, 1.58), P = c(215, 208, 300, 248, 174, 129, 117, 170, > 121, 115, 112, 117, 127, 91, 91, 79, 261, 300, 242, 229, > 156, 139, 214, 132, 98, 115, 148, 97, 108, 103, 109, 100, > 445, 380, 259, 248, 276, 158, 121, 195, 148, 115, 82, 152, > 105, 100, 97, 130), Ca = c(16.35, 12.25, 13.02, 11.92, 14.17, > 8.55, 8.74, 9.49, 8.85, 4.73, 6.29, 6.61, 6.41, 3.82, 4.98, > 5.86, 13.25, 12.3, 9.66, 13.78, 8.58, 8.58, 8.22, 12.68, > 4.8, 5.06, 6.16, 7.49, 3.82, 4.78, 4.93, 5.66, 12.27, 11.39, > 9.96, 9.12, 7.24, 7.37, 6.99, 8.59, 4.85, 6.34, 5.99, 4.43, > 4.65, 4.56, 5.29, 4.58), Mg = c(7.65, 5.15, 5.68, 7.88, 8.12, > 6.92, 8.16, 9.16, 10.35, 6.91, 7.95, 9.76, 10.96, 6.61, 8, > 10.14, 7.55, 7.5, 6.76, 7.12, 9.92, 8.69, 7.75, 9.56, 10, > 8.91, 7.58, 9.38, 8.8, 7.29, 7.47, 8.84, 6.27, 7.55, 8.08, > 7.04, 9.4, 10.57, 9.91, 8.66, 9.62, 9.78, 9.73, 10.54, 9.85, > 8.95, 8.37, 9.46), K = c(0.72, 0.71, 0.68, 1.09, 0.7, 0.81, > 0.39, 0.7, 0.74, 0.77, 0.26, 0.41, 0.56, 0.5, 0.23, 0.41, > 0.61, 0.68, 0.63, 0.62, 0.63, 0.42, 0.32, 0.55, 0.36, 0.28, > 0.16, 0.4, 0.24, 0.24, 0.14, 0.37, 0.72, 0.78, 0.45, 0.55, > 0.43, 0.59, 0.3, 0.48, 0.18, 0.34, 0.22, 0.22, 0.18, 0.33, > 0.14, 0.14), Na = c(1.14, 0.94, 0.6, 1.01, 2.17, 2.67, 3.32, > 3.76, 5.74, 5.85, 5.3, 8.3, 9.67, 7.67, 8.78, 11.04, 1.86, > 2, 1.01, 3.09, 3.67, 4.7, 3.07, 8.3, 6.52, 7.91, 6.39, 9.7, > 9.57, 9.67, 9.65, 10.54, 1.02, 1.63, 1.97, 1.43, 4.17, 5.07, > 5.15, 4.17, 7.2, 8.52, 7.02, 7.61, 10.15, 10.51, 8.27, 9.28 > ), Conduc = c(1.09, 1.35, 1.41, 1.64, 1.85, 3.18, 4.16, 5.14, > 5.73, 6.45, 8.37, 9.21, 10.64, 10.07, 11.26, 12.15, 2.61, > 1.98, 0.76, 2.85, 3.24, 4.63, 3.67, 8.1, 7.72, 9.78, 9.07, > 9.13, 11.57, 11.42, 13.32, 11.57, 0.75, 2.2, 2.27, 0.67, > 5.08, 6.37, 6.82, 3.65, 10.14, 9.74, 8.6, 9.09, 12.26, 11.29, > 9.51, 12.69)), .Names = c("Group", "Contour", "Depth", > "Gp", "Block", "pH", "N", "Dens", "P", "Ca", "Mg", "K", "Na", "Conduc" > ), class = "data.frame", row.names = c("1", "2", "3", "4", > "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", > "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", > "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", > "36", "37", "38", "39", "40", "41", "42", "43", "44", "45", > "46", "47", "48")) > > -- > Michael Friendly Email: friendly AT yorku DOT ca > Professor, Psychology Dept. > York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 > 4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html > Toronto, ONT M3J 1P3 CANADA > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
On 11/21/06, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:> I am not sure I understand the question -- is it how to get the same > result you did but without warnings? The warnings are, of course, > there since you are trying to take means of factors and its not clear > what that really is supposed to do. > > At any rate you could eliminate the factors and then take the means > of what is left or you could convert the factors to numeric if that makes > any sense. Here are a few alternatives using aggregate and > summaryBy from the doBy package > > > aggregate(data.matrix(soils), soils["Contour"], mean) > > # same but removing factors > aggregate(data.matrix(soils[,-(2:4)]), soils["Contour"], mean) > > library(doBy) > summaryBy(. ~ Contour, soils, keep.names = TRUE) > > # same but removing factors > summaryBy(.~ Contour, soils[,-(3:4)], keep.names = TRUE)Hmm. It seems summaryBy automatically removes the factors from the left hand side so the last two are actually the same.
## Thank you, Michael, for the example. I believe you are looking ## for the natural extension to mlm of the proj function. ## Here is the lapply workaround from which that extension ## might be written ## ## Rich sapply(soils, class) soils$Group <- factor(soils$Group) soils$Block <- factor(soils$Block) sapply(soils, class) result <- list() for (i in names(soils)[6:14]) result[[i]] <- aov(soils[[i]] ~ Block + Contour*Depth, data=soils) proj(result[[1]]) lapply(result, proj) model.tables(result[[1]], type="means") lapply(result, model.tables, type="means") ## Exercise for the reader ## 1. defined proj.lm to be proj.aov when it makes sense ## 2. extend the above to the proj.mlm method ## 3. collapse the projections to tables ## (this is actually how model.tables is written). ## 4. extend model.tables to the lm and mlm when it makes sense.