thr3ads.net - R help - [R] functions on rows or columns of two (or more) arrays [Aug 2011]

Jim Bouldin

2011-Aug-04 21:17 UTC

[R] functions on rows or columns of two (or more) arrays

I realize this should be simple, but even after reading the help pages
several times, I still cannot decide which of the myriad "apply" functions
addresses it.  I simply want to apply a function to all the rows
(or columns) of the same index from two (or more) identically sized arrays
(or data frames).

For example:
> a=matrix(1:50,nrow=10)
> a2=floor(jitter(a,amount=50))
> a
      [,1] [,2] [,3] [,4] [,5]
 [1,]    1   11   21   31   41
 [2,]    2   12   22   32   42
 [3,]    3   13   23   33   43
 [4,]    4   14   24   34   44
 [5,]    5   15   25   35   45
 [6,]    6   16   26   36   46
 [7,]    7   17   27   37   47
 [8,]    8   18   28   38   48
 [9,]    9   19   29   39   49
[10,]   10   20   30   40   50
> a2
      [,1] [,2] [,3] [,4] [,5]
 [1,]   31   56  -29  -13   10
 [2,]   38   61   71   55    9
 [3,]  -29   38   47   12   38
 [4,]   12    2   43   39   93
 [5,]  -43   23  -23   62    1
 [6,]  -13   61   55   11    2
 [7,]  -42    1   38   12    8
 [8,]  -13   -6  -18   16   95
 [9,]  -19   -2   78   33    1
[10,]   20  -16  -11   19   17

If I try the following, for example:
apply(a,1,function(x) lm(a~a2))

I get 10 identical repeats (except for the list indexer) of the following:

[[1]]

Call:
lm(formula = a ~ a2)

Coefficients:
             [,1]       [,2]       [,3]       [,4]       [,5]
(Intercept)   8.372135  18.372135  28.372135  38.372135  48.372135
a21          -0.006163  -0.006163  -0.006163  -0.006163  -0.006163
a22          -0.093390  -0.093390  -0.093390  -0.093390  -0.093390
a23           0.009315   0.009315   0.009315   0.009315   0.009315
a24          -0.015143  -0.015143  -0.015143  -0.015143  -0.015143
a25          -0.026761  -0.026761  -0.026761  -0.026761  -0.026761

...Which is clearly very wrong, in a number of ways.  If I try by columns:
apply(a,2,function(x) lm(a~a2))
...I get exactly the same result.

So, which is the appropriate apply-type function when two arrays (or
d.f.'s?) are involved like this? Or is none of them appropriate, and some
other approach needed (other than looping, which I can do but assume is
not optimal)?
Thanks for any help.
-- 
Jim Bouldin, PhD
Research Ecologist


R. Michael Weylandt

2011-Aug-04 21:29 UTC

[R] functions on rows or columns of two (or more) arrays

I hope someone experienced with the plyr package comes along and helps,
because this sounds like what it does well, but for your specific example
something like this works:

A = rbind(a,a2)
q = apply(A,2,function(x) {lm(x[1:nrow(a)] ~ x[-(1:nrow(a))])})

but yeah, that's pretty rough so I hope someone can come up with something
more elegant.

If nothing else, I think that idea can be made to work in most
circumstances: put it together, then break it apart inside the function
passed to apply.
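
The example above works column by column. A row-wise variant of the same "put it together, then break it apart" idea might look like the sketch below. The names AB, nc and row_fits are illustrative and not from the original post, and set.seed() is added only so the sketch is repeatable.

```r
set.seed(1)  # assumption: fixed seed, only so the random jitter is repeatable
a  <- matrix(1:50, nrow = 10)
a2 <- floor(jitter(a, amount = 50))

# Bind the two matrices side by side, so each row of AB holds
# row i of a followed by row i of a2
AB <- cbind(a, a2)
nc <- ncol(a)

# Inside the function, split each combined row back into its two halves
row_fits <- apply(AB, 1, function(x) {
  lm(x[seq_len(nc)] ~ x[-seq_len(nc)])
})

length(row_fits)  # one fitted model per row
```

Because the function returns a list-like lm object, apply() hands back a plain list of models rather than trying to simplify them into a matrix.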

Michael Weylandt


Florent D.

2011-Aug-04 23:56 UTC

[R] functions on rows or columns of two (or more) arrays

The apply function also works with multi-dimensional arrays. I think
you can achieve what you want with a 3D array:

aaa <- array(NA, dim = c(2, dim(a)))
aaa[1,,] <- a
aaa[2,,] <- a2
apply(aaa, 3, function(x)lm(x[1,]~x[2,]))
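
To make the role of the margin explicit, here is a small sketch under the same setup. It assumes a and a2 are defined as in the original question; set.seed(), col_fits and row_fits are additions for illustration. With the array laid out as 2 x rows x columns, MARGIN = 3 hands the function one slice per column, and MARGIN = 2 hands it one slice per row.

```r
set.seed(1)  # assumption: fixed seed, only so the random jitter is repeatable
a  <- matrix(1:50, nrow = 10)
a2 <- floor(jitter(a, amount = 50))

# First dimension indexes the source matrix (1 = a, 2 = a2)
aaa <- array(NA_real_, dim = c(2, dim(a)))
aaa[1, , ] <- a
aaa[2, , ] <- a2

# MARGIN = 3: each x is a 2 x 10 slice holding one column of a and of a2
col_fits <- apply(aaa, 3, function(x) lm(x[1, ] ~ x[2, ]))

# MARGIN = 2: each x is a 2 x 5 slice holding one row of a and of a2
row_fits <- apply(aaa, 2, function(x) lm(x[1, ] ~ x[2, ]))

length(col_fits)  # number of columns
length(row_fits)  # number of rows
```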

Dennis Murphy

2011-Aug-05 05:19 UTC

[R] functions on rows or columns of two (or more) arrays

Hi:

Here's one approach:

a=matrix(1:50,nrow=10)
a2=floor(jitter(a,amount=50))

# Write a function that combines row k of each matrix
# into a data frame and fits a linear model
regfn <- function(k) {
     rdf <- data.frame(x = a[k, ], y = a2[k, ])
     lm(y ~ x, data = rdf)
   }

# Use lapply() to run regfn() on each row index of a and a2 in turn:
modlist <- lapply(seq_len(nrow(a)), regfn)

# I prefer plyr for extraction of output from a list of models.
# Here are a few examples:

library('plyr')
# Extract the R^2 values
ldply(modlist, function(m) summary(m)$r.squared)
# Extract the residuals
laply(modlist, function(m) resid(m))
# Extract the estimated model coefficients
ldply(modlist, function(m) coef(m))
# Extract the coefficient summary tables as a list
llply(modlist, function(m) summary(m)$coefficients)

In the anonymous functions, the argument m refers to an arbitrary lm
object, so you can do to it what you would with any given lm object;
all you're doing is abstracting the process.
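
If plyr is not available, roughly the same extraction can be sketched in base R with vapply() and sapply(). This is an illustration rather than part of the original reply: set.seed(), r2 and coefs are assumed names, and regfn() is repeated from above so the block runs on its own.

```r
set.seed(1)  # assumption: fixed seed, only so the random jitter is repeatable
a  <- matrix(1:50, nrow = 10)
a2 <- floor(jitter(a, amount = 50))

# Same row-wise model function as in the reply above
regfn <- function(k) {
  rdf <- data.frame(x = a[k, ], y = a2[k, ])
  lm(y ~ x, data = rdf)
}
modlist <- lapply(seq_len(nrow(a)), regfn)

# R^2 values: vapply() checks that each model yields exactly one number
r2 <- vapply(modlist, function(m) summary(m)$r.squared, numeric(1))

# Coefficients: sapply() returns one column per model, so transpose
# to get one row per model
coefs <- t(sapply(modlist, coef))

r2
coefs
```

vapply() is used for the R^2 values because it fails loudly if a model ever returns something other than a single number, whereas sapply() would silently change the shape of its result.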

HTH,
Dennis

