thr3ads.net - R help - [R] linear regression on groups of consecutive rows of a matrix [Nov 2009]

Jim Bouldin

2009-Nov-24 20:25 UTC

[R] linear regression on groups of consecutive rows of a matrix

I want to perform linear regression on groups of consecutive rows--say 5 to
10 such--of two matrices.  There are many such potential groups because the
matrices have thousands of rows. The matrices are both of the form:
> shp[1:5,16:20]
      SL495B SL004C SL005C SL005A SL017A
-2649   1.06   0.56     NA     NA     NA
-2648   0.97   0.57     NA     NA     NA
-2647   0.46   0.30     NA     NA     NA
-2646   0.92   0.48     NA     NA     NA
-2645   0.82   0.48     NA     NA     NA

That is, they both have NA values, and non-NA values, in the same matrix
positions.  In my attempts so far, I have had two problems.  First, in
using the split function (which I assume is essential here), I am unable to
split the matrices by groups of rows (say rows 1 to 5, 6 to 10, etc):
> shp_split = split(shp,row(shp))
will split the matrix by rows but not by groups thereof. Stumped.
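
One possible way to get groups of consecutive rows, offered as a sketch only: the toy matrix `m` stands in for shp, and the block size of 5 is an arbitrary assumption. The idea is to split the row indices by a block label, rather than splitting the matrix values themselves.

```r
# Toy stand-in for shp (hypothetical data): 12 rows, 3 columns.
m <- matrix(seq_len(36), nrow = 12,
            dimnames = list(-2649:-2638, c("A", "B", "C")))

block_size <- 5  # assumed group width

# Label each row with its block number: 1,1,1,1,1,2,2,2,2,2,3,3
block_id <- (seq_len(nrow(m)) - 1) %/% block_size + 1

# Split the row indices by block, then pull each block out as a sub-matrix.
row_groups <- split(seq_len(nrow(m)), block_id)
blocks <- lapply(row_groups, function(i) m[i, , drop = FALSE])
```

The last block is shorter whenever the number of rows is not a multiple of the block size (here it has 2 rows), so it may need to be dropped or merged before fitting anything.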

Second, I cannot seem to get rid of the NA values, which would prevent the
regression even if I could figure out how to split the matrices correctly,
e.g.:
> shp_split = split(shp,row(shp))
> shp_split = shp_split[!is.na(shp_split)]
> shp_split[1]
$`1`
  [1] 0.68 0.28 0.43 0.47 0.64 0.40 0.69 0.56 0.62 0.40 1.01 0.67 0.17 1.36
1.84 1.06 0.56   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA
  NA   NA   NA etc
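
A likely reason the NA values survive: is.na() applied to a list tests each list element as a whole, and an element holding several values is reported as FALSE, so the subsetting removes nothing. A minimal sketch of filtering inside each element instead, using a small made-up list (`rows` is hypothetical, not the real shp_split):

```r
# Hypothetical split result: a list of numeric vectors, some containing NA.
rows <- list(`1` = c(0.68, 0.28, NA, NA), `2` = c(1.06, 0.56, NA))

# is.na() on a list looks at each element as a whole; elements with more
# than one value come back FALSE, so nothing would be dropped.
is.na(rows)

# Remove the NA values inside each element instead.
clean <- lapply(rows, function(v) v[!is.na(v)])
```

Note that this drops values position by position, so two matrices are only kept aligned if their NA positions match exactly.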

If I solve these problems, will I in fact be able to perform individual
linear regressions on the (numerous) collections of 5 to 10 rows?

Thanks as always for any insight.


Jim Bouldin
Research Ecologist
Department of Plant Sciences, UC Davis
Davis CA, 95616
530-554-1740

David Winsemius

2009-Nov-24 20:52 UTC


Perhaps along these lines. First you need to decide what your group width
is, so the second number inside the extraction call will be that number
minus 1:

for (x in seq(1, 1000, by = 6)) {
     temp <- na.omit(as.data.frame(shp[x:(x+5), ]))  # Need the parens in x:(x+5)
     lm(formula, data = temp)   # 'formula' is a placeholder for your model
    }

Or depending on what you actually meant:

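for (x in seq(1, 1000, by = 5)) {
     rows <- x:(x+4)                            # parens needed: x:x+4 is (x:x)+4
     keep <- colSums(is.na(shp[rows, ])) == 0   # columns with no NA in this group
     temp <- as.data.frame(shp[rows, keep, drop = FALSE])
     lm(formula, data = temp)   # 'formula' is a placeholder for your model
    }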
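
A for loop like the ones above discards each lm() result unless it is stored somewhere. One way to keep every fit, sketched on made-up data: the data frame `d`, its columns `x` and `y`, and the width of 5 are all assumptions for illustration only.

```r
set.seed(1)
# Hypothetical data frame standing in for the real data: y depends on x.
d <- data.frame(x = rnorm(30))
d$y <- 2 * d$x + rnorm(30, sd = 0.1)

width  <- 5                                        # assumed group width
starts <- seq(1, nrow(d) - width + 1, by = width)  # first row of each group

# One fit per group of consecutive rows, kept in a list.
fits <- lapply(starts, function(s) lm(y ~ x, data = d[s:(s + width - 1), ]))

# Slope of each group's regression.
slopes <- vapply(fits, function(f) coef(f)[["x"]], numeric(1))
```

Stopping `starts` at `nrow(d) - width + 1` keeps the last group from running past the end of the data.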

But I do feel compelled to ask: Do you really get meaningful  
information from lm applied to 5 cases? Especially when the predictors  
used may not be the same from subset to subset???

-- 
David

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

Jim Bouldin

2009-Nov-24 21:06 UTC


> But I do feel compelled to ask: Do you really get meaningful  
> information from lm applied to 5 cases? Especially when the predictors  
> used may not be the same from subset to subset???
Thanks again for your help, David.  Your question is a good one. It's a bit
complicated but here are the basics. The predictors are the same between
subsets, in the sense that, for each group of rows (which represent tree
ring years), the predictors and predictands are always from the same set of
trees, even though that set changes slightly between consecutive subsets. 
Typically there will be 20+ observations per year (row), so for 5 rows I
have n = 100+.  For my purposes (removing the effect of tree size on ring
width for small groups of years) that is more than good enough.
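
To make the pooling concrete, here is a rough sketch on simulated data. The matrix names `size` and `width`, their dimensions, and the linear relationship between them are all assumptions for illustration; only the layout (years in rows, trees in columns, NA in the same positions in both matrices) follows the description above.

```r
set.seed(42)
# Hypothetical stand-ins for the two matrices: 10 years (rows) x 20 trees.
size  <- matrix(runif(200, 5, 50), nrow = 10)
width <- 1.5 - 0.02 * size + matrix(rnorm(200, sd = 0.1), nrow = 10)

# Shared NA positions: trees 16-20 have no data in years 1-5.
size[1:5, 16:20]  <- NA
width[1:5, 16:20] <- NA

rows <- 1:5  # one group of consecutive years

# Flatten the group so that every tree-year is one observation.
d <- data.frame(size  = as.vector(size[rows, ]),
                width = as.vector(width[rows, ]))

# lm() drops incomplete cases by default (na.action = na.omit).
fit <- lm(width ~ size, data = d)
n_used <- nobs(fit)   # 5 years x 15 trees with data
```

The residuals of `fit` would then be the ring widths with the size effect removed for that group of years.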

Now to try out your suggestion...
Jim

Jim Bouldin, PhD
Research Ecologist
Department of Plant Sciences, UC Davis
Davis CA, 95616
530-554-1740
