thr3ads.net - R help - [R] sapply() related query [Jun 2009]

Girish A.R.

2009-Jun-17 15:06 UTC

[R] sapply() related query

Hi folks,

I'm trying to consolidate the outputs (of anova() and lrm()) from
multiple runs of single-variable logistic regression. Here's how the
output looks:
------------------------------------------------------------
            y ~ x1      y ~ x2     y ~ x3     y ~ x4
Chi-Square  0.1342152   1.573538   1.267291   1.518200
d.f.        2           2          2          1
P           0.9350946   0.4553136  0.5306538  0.2178921
R2          0.01003342  0.1272791  0.0954126  0.1184302
------------------------------------------------------------
The problem arises when there are many more variables (15+): the table
becomes too wide to read, so it would be nice if the output were
transposed, with one row per variable.

Reproducible code is included below. I tried the transpose function,
t(), but it didn't seem to work. If there is a neater way of getting
the desired output, I'd appreciate that as well.

==========================================
Lines <- "y   x1  x2  x3  x4
0   m   1   0   7
1   t   2   1   13
0   f   1   2   18
1   t   1   2   16
1   f   3   0   16
0   t   3   1   16
0   t   1   1   16
0   t   2   1   16
1   t   3   2   14
0   t   1   0   9
0   t   1   0   10
1   m   1   0   4
0   f   2   2   18
1   f   1   1   12
0   t   2   0   13
0   t   1   1  16
1   t   1   2   7
0   f   2   1   18"

my.data <- read.table(textConnection(Lines), header = TRUE)
my.data$x1 <- as.factor(my.data$x1)
my.data$x2 <- as.factor(my.data$x2)
my.data$x3 <- as.factor(my.data$x3)
my.data$y <- as.logical(my.data$y)

sapply(paste("y ~", names(my.data)[2:dim(my.data)[2]]),
function(f){tab <- cbind(as.data.frame(t(anova(lrm(as.formula(f),data
= my.data,x=T,y=T))[1,])),
as.data.frame(t(lrm(as.formula(f),data = my.data,x=T,y=T)$stats[10])))
})
================================
Thanks,

- Girish

> sessionInfo()
R version 2.9.0 (2009-04-17)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] splines   grid      stats     graphics  grDevices utils     datasets  methods
[9] base

other attached packages:
 [1] xtable_1.5-5       fBasics_2100.77    timeSeries_2100.83 timeDate_290.85
 [5] gplots_2.7.1       caTools_1.9        bitops_1.0-4.1     gdata_2.4.2
 [9] gtools_2.6.1       Design_2.2-0       survival_2.35-4    RWinEdt_1.8-1
[13] coda_0.13-4        verification_1.29  CircStats_0.2-3    boot_1.2-36
[17] fields_5.02        spam_0.15-4        waveslim_1.6.1     lmtest_0.9-24
[21] zoo_1.5-6          psychometric_2.1   multilevel_2.3     MASS_7.2-47
[25] nlme_3.1-92        languageR_0.953    zipfR_0.6-5        lme4_0.999375-31
[29] Matrix_0.999375-29 lattice_0.17-25    ggplot2_0.8.3      reshape_0.8.3
[33] plyr_0.1.8         proto_0.3-8        Hmisc_3.6-0        doBy_3.9
[37] car_1.2-14

loaded via a namespace (and not attached):
[1] cluster_1.12.0   Formula_0.1-3    kinship_1.1.0-22 plm_1.1-4        sandwich_2.2-1
[6] tools_2.9.0

Marc Schwartz

2009-Jun-17 16:01 UTC

[R] sapply() related query

On Jun 17, 2009, at 10:06 AM, Girish A.R. wrote:
> Hi folks,
>
> I'm trying to consolidate the outputs (of anova() and lrm()) from
> multiple runs of single-variable logistic regression. Here's how the
> output looks:
> ------------------------------------------------------------
>             y ~ x1      y ~ x2     y ~ x3     y ~ x4
> Chi-Square  0.1342152   1.573538   1.267291   1.518200
> d.f.        2           2          2          1
> P           0.9350946   0.4553136  0.5306538  0.2178921
> R2          0.01003342  0.1272791  0.0954126  0.1184302
> ------------------------------------------------------------
> The problem arises when there are many more variables (15+): the table
> becomes too wide to read, so it would be nice if the output were
> transposed, with one row per variable.
>
> Reproducible code is included below. I tried the transpose function,
> t(), but it didn't seem to work. If there is a neater way of getting
> the desired output, I'd appreciate that as well.
>
> ==========================================
> Lines <- "y   x1  x2  x3  x4
> 0   m   1   0   7
> 1   t   2   1   13
> 0   f   1   2   18
> 1   t   1   2   16
> 1   f   3   0   16
> 0   t   3   1   16
> 0   t   1   1   16
> 0   t   2   1   16
> 1   t   3   2   14
> 0   t   1   0   9
> 0   t   1   0   10
> 1   m   1   0   4
> 0   f   2   2   18
> 1   f   1   1   12
> 0   t   2   0   13
> 0   t   1   1  16
> 1   t   1   2   7
> 0   f   2   1   18"
>
> my.data <- read.table(textConnection(Lines), header = TRUE)
> my.data$x1 <- as.factor(my.data$x1)
> my.data$x2 <- as.factor(my.data$x2)
> my.data$x3 <- as.factor(my.data$x3)
> my.data$y <- as.logical(my.data$y)
>
> sapply(paste("y ~", names(my.data)[2:dim(my.data)[2]]),
> function(f){tab <- cbind(as.data.frame(t(anova(lrm(as.formula(f),data
> = my.data,x=T,y=T))[1,])),
> as.data.frame(t(lrm(as.formula(f),data = my.data,x=T,y=T)$stats[10])))
> })
> ================================
> Thanks,
>
> - Girish

You can try something like this:

library(Design)

my.func <- function(x)
{
   mod <- lrm(my.data$y ~ x)
   data.frame(t(anova(mod)[1, ]), R2 = mod$stats[10])
}

 > t(sapply(my.data[, -1], my.func))
    Chi.Square d.f. P         R2
x1 0.1342152  2    0.9350946 0.01003342
x2 1.573538   2    0.4553136 0.1272791
x3 1.267291   2    0.5306538 0.0954126
x4 1.518200   1    0.2178921 0.1184302
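
One thing to be aware of with the result above: because my.func() returns a data frame, sapply() simplifies to a matrix whose cells are list elements, so the columns cannot be sorted or rounded directly. If a regular data frame is wanted, one option is do.call(rbind, lapply(...)). The sketch below shows the pattern in base R only. Note the assumptions: glm() is swapped in for lrm() so that it runs without Design, the helper name one.var is hypothetical, and the statistic is a likelihood-ratio chi-square, so the values will not match the Wald statistics from anova(lrm()) shown above.

```r
# Same data as in the original post, read via the `text` argument of read.table()
Lines <- "y x1 x2 x3 x4
0 m 1 0 7
1 t 2 1 13
0 f 1 2 18
1 t 1 2 16
1 f 3 0 16
0 t 3 1 16
0 t 1 1 16
0 t 2 1 16
1 t 3 2 14
0 t 1 0 9
0 t 1 0 10
1 m 1 0 4
0 f 2 2 18
1 f 1 1 12
0 t 2 0 13
0 t 1 1 16
1 t 1 2 7
0 f 2 1 18"
my.data <- read.table(text = Lines, header = TRUE)

# Treat x1..x3 as categorical predictors, as in the original post
fac <- c("x1", "x2", "x3")
my.data[fac] <- lapply(my.data[fac], factor)

# Hypothetical helper (not from the thread): fit y ~ x with glm() and
# return a one-row data frame holding the likelihood-ratio test.
# Assumption: glm() stands in for lrm(); this is an LR test, not a Wald test.
one.var <- function(x, y) {
  fit <- glm(y ~ x, family = binomial)
  lr  <- fit$null.deviance - fit$deviance   # LR chi-square vs. the null model
  df  <- fit$df.null - fit$df.residual      # parameters added by x
  data.frame(Chi.Square = lr, d.f. = df,
             P = pchisq(lr, df, lower.tail = FALSE))
}

# sapply() + t(): prints nicely, but the matrix is of mode "list"
m <- t(sapply(my.data[, -1], one.var, y = my.data$y))

# lapply() + rbind: an ordinary data frame with numeric columns
res <- do.call(rbind, lapply(my.data[, -1], one.var, y = my.data$y))
res[order(res$P), ]   # e.g. sort the variables by p-value
```

With 15+ variables the second form is usually easier to work with afterwards, since each column keeps its numeric type and can be passed straight to order(), round(), or xtable().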


I am not sure what your end game might be, but I would urge caution
if this is a step in any approach to variable selection for
subsequent model development...

HTH,

Marc Schwartz
