Hi all,
I have a data set that looks a bit like this.
feed1
     RFU Site Vial Time     lnRFU
1  44448    1    1   10 10.702075
2  47521    1    1   20 10.768927
3  42905    1    1   30 10.66674
4  46867    1    1   40 10.755069
5  42995    1    1   50 10.668839
6  43074    1    1   60 10.670675
7  41195    1    1   70 10.626072
8  47090    1    2   10 10.759816
9  48100    1    2   20 10.781037
10 43215    1    2   30 10.673943
11 39656    1    2   40 10.587998
12 38799    1    2   50 10.566150
13 38424    1    2   60 10.556438
14 35240    1    2   70 10.469937
15 46427    1    3   10 10.745636
16 46418    1    3   20 10.745443
17 42095    1    3   30 10.647684
......
There are 5 columns of data, three levels of "Site", 10 "Vials" per site,
and measurements were taken at 10 min intervals from 10 to 70 min. I am primarily
interested in the relationship between "Time" and "lnRFU" to calculate the
rate at which lnRFU declines over time. I have a nice plot using ggplot2
code that looks like this:
p<-ggplot(data=feed1,aes(x=Time,y=lnRFU))
p+geom_point(size=4)+facet_grid(Site~Vial)+geom_smooth(method="lm")
The graph is useful to visualize the changes over time and grouped by both
Site and Vial, but I also need the slopes of the linear regressions for each
Vial, within a Site. This is where I run into a problem. I want to run a
linear regression of lnRFU as a function of Time grouped by both Site and
Vial. It's easy to visualize this comparison in ggplot using facet_grid(),
but I'm not sure how to do a similar comparison/analysis within lm().
I imagine something like
fit<-lm(lnRFU~Time | Vial * Site, data=feed1)
in which I group by both Vial and Site, but obviously this code doesn't
work. Does anyone have an idea for how to do a linear regression with two
grouping variables? Do I have to go back and combine Vial and Site into a
single grouping variable or can I leave the dataframe the way it is? I'm
trying to imagine a means of accomplishing the same type of thing that
facet_grid does when it allows you to plot the data as a function of two
"grouping" variables.
Thanks for your time. I greatly appreciate it.
Nate Miller
You can do something like this
# drop = TRUE skips Vial/Site combinations that have no rows
sp <- split(feed1, list(feed1$Vial, feed1$Site), drop = TRUE)
seq.model <- lapply(sp, function(x) lm(lnRFU ~ Time, data = x))
Then, extract whatever you want from seq.model
Weidong Gu
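To make the "extract whatever you want" step concrete, here is a minimal sketch that pulls the Time slope out of each fitted model. The feed1 data frame below is simulated purely so the example runs on its own (the real data has three sites and ten vials per site), so the numbers it produces are not from the original data set.

```r
# Hypothetical stand-in for feed1: 2 sites x 3 vials x 7 time points,
# simulated only so the example is self-contained.
set.seed(1)
feed1 <- expand.grid(Time = seq(10, 70, by = 10), Vial = 1:3, Site = 1:2)
feed1$lnRFU <- 10.8 - 0.004 * feed1$Time + rnorm(nrow(feed1), sd = 0.03)

# One data frame per Vial-by-Site combination; names look like "1.1", "2.1", ...
sp <- split(feed1, list(feed1$Vial, feed1$Site), drop = TRUE)

# Fit the same regression within each piece.
seq.model <- lapply(sp, function(x) lm(lnRFU ~ Time, data = x))

# Pull the Time coefficient (the slope) out of each fit.
slopes <- sapply(seq.model, function(m) coef(m)[["Time"]])
slopes
```

The result is a named numeric vector, with one slope per Vial.Site combination, which can be converted to a data frame if the grouping variables are needed as separate columns.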
Hi:
You're kind of on the right track, but there is no conditioning
formula in lm(); it's not lattice :) This is relatively easy to do
with the plyr package, though:
library('plyr')
# Generate a list of models - the subsetting variables (Site, Vial) are
# used to generate the data splits and the function is run on a generic
# data split d, assumed to be a data frame.
mlist <- dlply(feed1, .(Site, Vial), function(d) lm(lnRFU ~ Time, data = d))
# To get the set of model coefficients, take the list mlist as the input
# data (argument 1) and then for each generic component 'm', extract its
# coefficients:
ldply(mlist, function(m) coef(m))
For your test data, only Vial varies - the example code below reflects that:
> mlist <- dlply(feed1, .(Vial), function(d) lm(lnRFU ~ Time, data = d))
> length(mlist) # three component model objects
[1] 3
> ldply(mlist, function(m) coef(m))
  Vial (Intercept)         Time
1    1    10.75440 -0.001508621
2    2    10.83171 -0.005095100
3    3    10.81087 -0.004897600
This idea can be generalized: if you want to pull out similar pieces
of output from each model, run ldply() on the list of models and
create a utility function that outputs, for a generic model object m,
what you want to have returned. Common choices include R^2 values,
tables of coefficients (as lists instead of data frames) or residuals
and predicted values. The game is to write the function so that it
takes a model object (here, m, which is itself a list) as input and
returns a data frame as output. You can also extract output from
summary(m) in a similar way, using m as the input object.
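As an illustration of such a utility function, here is a sketch that returns the intercept, slope, slope standard error, and R^2 for one model as a one-row data frame. The function name model_summary and the simulated feed1 are made up for the example, and the models are built and stacked with base R (split/lapply/rbind) rather than plyr so the snippet has no package dependencies; with plyr loaded, ldply(mlist, model_summary) should play the same role as the last step.

```r
# Hypothetical stand-in for feed1, simulated so the sketch is self-contained.
set.seed(42)
feed1 <- expand.grid(Time = seq(10, 70, by = 10), Vial = 1:3, Site = 1:2)
feed1$lnRFU <- 10.8 - 0.004 * feed1$Time + rnorm(nrow(feed1), sd = 0.03)

# Base-R construction of the list of per-group models (one per Site.Vial).
groups <- split(feed1, list(feed1$Site, feed1$Vial), drop = TRUE)
mlist <- lapply(groups, function(d) lm(lnRFU ~ Time, data = d))

# Utility function (the name is invented for this example):
# takes one fitted lm object, returns a one-row data frame.
model_summary <- function(m) {
  s <- summary(m)
  data.frame(
    intercept = coef(m)[["(Intercept)"]],
    slope     = coef(m)[["Time"]],
    slope_se  = s$coefficients["Time", "Std. Error"],
    r_squared = s$r.squared
  )
}

# Stack the one-row results into a single data frame, one row per group.
results <- do.call(rbind, lapply(mlist, model_summary))
results
```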
HTH,
Dennis
At 02:15 23/08/2011, Nathan Miller wrote:
> I imagine something like
>
> fit<-lm(lnRFU~Time | Vial * Site, data=feed1)
See comment in-line.
I think you will find lmList from nlme helpful here, try
library(nlme)
fitlist <- lmList(lnRFU~Time | Vial * Site, data=feed1)
Untested, but should be OK. There are a number of helper functions in
nlme which operate on lmList objects to give handy plots and more.
Michael Dewey
info at aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html
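The lmList call above is flagged as untested. nlme's documented notation for nested grouping is g1/g2, so a variant worth trying is Site/Vial ("Vial within Site"). The sketch below assumes that notation and uses a simulated feed1 (not the original data) so that it runs on its own; whether the Vial * Site form is accepted is not checked here.

```r
library(nlme)

# Hypothetical stand-in for feed1, simulated so the sketch is self-contained.
# Site and Vial are stored as factors, as grouping variables usually are.
set.seed(7)
feed1 <- expand.grid(Time = seq(10, 70, by = 10), Vial = 1:3, Site = 1:2)
feed1$lnRFU <- 10.8 - 0.004 * feed1$Time + rnorm(nrow(feed1), sd = 0.03)
feed1$Vial <- factor(feed1$Vial)
feed1$Site <- factor(feed1$Site)

# Site/Vial means "Vial nested within Site"; with no level argument,
# lmList fits one regression per innermost group.
fitlist <- lmList(lnRFU ~ Time | Site/Vial, data = feed1)

# One row per Site/Vial group, with intercept and slope columns.
slopes <- coef(fitlist)
slopes
```

Because the grouping stays in the formula, the data frame does not need a hand-built combined Site-and-Vial column, which addresses the original question about leaving the data frame as it is.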