Hi all, I have a data set that looks a bit like this. feed1 RFU Site Vial Time lnRFU 1 44448 1 1 10 10.702075 2 47521 1 1 20 10.768927 3 42905 1 1 30 10.66674 4 46867 1 1 40 10.755069 5 42995 1 1 50 10.668839 6 43074 1 1 60 10.670675 7 41195 1 1 70 10.626072 8 47090 1 2 10 10.759816 9 48100 1 2 20 10.781037 10 43215 1 2 30 10.673943 11 39656 1 2 40 10.587998 12 38799 1 2 50 10.566150 13 38424 1 2 60 10.556438 14 35240 1 2 70 10.469937 15 46427 1 3 10 10.745636 16 46418 1 3 20 10.745443 17 42095 1 3 30 10.647684 ...... There are 5 columns of data, three levels of "Site", 10 "Vials" per site, and measurements were taken at 10 min intervals from 10-70.. I am primarily interested in the relationship between "Time" and "lnRFU" to calculate the rate at which lnRFU declines over time. I have a nice plot using a ggplot2 code that looks like this p<-ggplot(data=feed1,aes(x=Time,y=lnRFU)) p+geom_point(size=4)+facet_grid(Site~Vial)+geom_smooth(method="lm") The graph is useful to visualize the changes over time and grouped by both Site and Vial, but I also need the slopes of the linear regressions for each Vial, within a Site. This is where I run into a problem. I want to run a linear regression of lnRFU as a function of Time grouped by both Site and Vial. Its easy to visualize this comparison in ggplot using facet_grid(), but I'm not sure how to do a similar comparison/analysis within lm() I imagine something like fit<-lm(lnRFU~Time | Vial * Site, data=feed1) in which I group by both Vial and Site, but obviously this code doesn't work. Does anyone have an idea for how to do a linear regression with two grouping variables? Do I have to go back and combine Vial and Site into a single grouping variable or can I leave the dataframe the way it is? I'm trying to imagine a means of accomplishing the same type of thing that facet_grid does when it allows you to plot the data as a function of two "grouping" variables. Thanks for you time. I greatly appreciate it. Nate Miller [[alternative HTML version deleted]]
You can do something like this sp<-split(dat, list(dat$Vial,dat$Site)) seq.model<-lapply(sp, function(x) lm(x$InRFU~x$Time)) Then, extract whatever you want from seq.model Weidong Gu On Mon, Aug 22, 2011 at 9:15 PM, Nathan Miller <natemiller77 at gmail.com> wrote:> Hi all, > > I have a data set that looks a bit like this. > > feed1 > ? ? ?RFU Site Vial Time ? ? ? lnRFU > 1 ? 44448 ? ?1 ? ?1 ? 10 ?10.702075 > 2 ? 47521 ? ?1 ? ?1 ? 20 ?10.768927 > 3 ? 42905 ? ?1 ? ?1 ? 30 ?10.66674 > 4 ? 46867 ? ?1 ? ?1 ? 40 ?10.755069 > 5 ? 42995 ? ?1 ? ?1 ? 50 ?10.668839 > 6 ? 43074 ? ?1 ? ?1 ? 60 ?10.670675 > 7 ? 41195 ? ?1 ? ?1 ? 70 ?10.626072 > 8 ? 47090 ? ?1 ? ?2 ? 10 ?10.759816 > 9 ? 48100 ? ?1 ? ?2 ? 20 ?10.781037 > 10 ?43215 ? ?1 ? ?2 ? 30 ?10.673943 > 11 ?39656 ? ?1 ? ?2 ? 40 ?10.587998 > 12 ?38799 ? ?1 ? ?2 ? 50 ?10.566150 > 13 ?38424 ? ?1 ? ?2 ? 60 10.556438 > 14 35240 1 2 70 ?10.469937 > 15 ?46427 ? ?1 ? ?3 ? 10 ?10.745636 > 16 46418 1 3 20 ?10.745443 > 17 ?42095 ? ?1 ? ?3 ? 30 ?10.647684 > ...... > There are 5 columns of data, three levels of "Site", 10 "Vials" per site, > and measurements were taken at 10 min intervals from 10-70.. I am primarily > interested in the relationship between "Time" and "lnRFU" to calculate the > rate at which lnRFU declines over time. I have a nice plot using a ggplot2 > code that looks like this > > p<-ggplot(data=feed1,aes(x=Time,y=lnRFU)) > p+geom_point(size=4)+facet_grid(Site~Vial)+geom_smooth(method="lm") > > The graph is useful to visualize the changes over time and grouped by both > Site and Vial, but I also need the slopes of the linear regressions for each > Vial, within a Site. This is where I run into a problem. I want to run a > linear regression of lnRFU as a function of Time grouped by both Site and > Vial. Its easy to visualize this comparison in ggplot using facet_grid(), > but I'm not sure how to do a similar comparison/analysis within lm() > > I imagine something like > > fit<-lm(lnRFU~Time | Vial * Site, data=feed1) > > ?in which I group by both Vial and Site, but obviously this code doesn't > work. Does anyone have an idea for how to do a linear regression with two > grouping variables? Do I have to go back and combine Vial and Site into a > single grouping variable or can I leave the dataframe the way it is? I'm > trying to imagine a means of accomplishing the same type of thing that > facet_grid does when it allows you to plot the data as a function of two > "grouping" variables. > > Thanks for you time. I greatly appreciate it. > > Nate Miller > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >
Hi: You're kind of on the right track, but there is no conditioning formula in lm(); it's not lattice :) This is relatively easy to do with the plyr package, though: library('plyr') # Generate a list of models - the subsetting variables (Site, Vial) are # used to generate the data splits and the function is run on a generic # data split d, assumed to be a data frame. mlist <- dlply(feed1, .(Site, Vial), function(d) lm(lnRFU ~ Time, data = d)) # To get the set of model coefficients, take the list mlist as the input # data (argument 1) and then for each generic component 'm', extract its # coefficients: ldply(mlist, function(m) coef(m)) For your test data, only Vial varies - the example code below reflects that:> mlist <- dlply(feed1, .(Vial), function(d) lm(lnRFU ~ Time, data = d)) > length(mlist) # three component model objects[1] 3> ldply(mlist, function(m) coef(m))Vial (Intercept) Time 1 1 10.75440 -0.001508621 2 2 10.83171 -0.005095100 3 3 10.81087 -0.004897600 This idea can be generalized: if you want to pull out similar pieces of output from each model, run ldply() on the list of models and create a utility function that outputs, for a generic model object m, what you want to have returned. Common choices include R^2 values, tables of coefficients (as lists instead of data frames) or residuals and predicted values. The game is to write the function so that it takes a [list] model object (here, m) as input and a data frame as output. You can also extract output from summary(m) in a similar way, using m as the input object. HTH, Dennis On Mon, Aug 22, 2011 at 6:15 PM, Nathan Miller <natemiller77 at gmail.com> wrote:> Hi all, > > I have a data set that looks a bit like this. > > feed1 > ? ? ?RFU Site Vial Time ? ? ? lnRFU > 1 ? 44448 ? ?1 ? ?1 ? 10 ?10.702075 > 2 ? 47521 ? ?1 ? ?1 ? 20 ?10.768927 > 3 ? 42905 ? ?1 ? ?1 ? 30 ?10.66674 > 4 ? 46867 ? ?1 ? ?1 ? 40 ?10.755069 > 5 ? 42995 ? ?1 ? ?1 ? 50 ?10.668839 > 6 ? 43074 ? ?1 ? ?1 ? 60 ?10.670675 > 7 ? 41195 ? ?1 ? ?1 ? 70 ?10.626072 > 8 ? 47090 ? ?1 ? ?2 ? 10 ?10.759816 > 9 ? 48100 ? ?1 ? ?2 ? 20 ?10.781037 > 10 ?43215 ? ?1 ? ?2 ? 30 ?10.673943 > 11 ?39656 ? ?1 ? ?2 ? 40 ?10.587998 > 12 ?38799 ? ?1 ? ?2 ? 50 ?10.566150 > 13 ?38424 ? ?1 ? ?2 ? 60 10.556438 > 14 35240 1 2 70 ?10.469937 > 15 ?46427 ? ?1 ? ?3 ? 10 ?10.745636 > 16 46418 1 3 20 ?10.745443 > 17 ?42095 ? ?1 ? ?3 ? 30 ?10.647684 > ...... > There are 5 columns of data, three levels of "Site", 10 "Vials" per site, > and measurements were taken at 10 min intervals from 10-70.. I am primarily > interested in the relationship between "Time" and "lnRFU" to calculate the > rate at which lnRFU declines over time. I have a nice plot using a ggplot2 > code that looks like this > > p<-ggplot(data=feed1,aes(x=Time,y=lnRFU)) > p+geom_point(size=4)+facet_grid(Site~Vial)+geom_smooth(method="lm") > > The graph is useful to visualize the changes over time and grouped by both > Site and Vial, but I also need the slopes of the linear regressions for each > Vial, within a Site. This is where I run into a problem. I want to run a > linear regression of lnRFU as a function of Time grouped by both Site and > Vial. Its easy to visualize this comparison in ggplot using facet_grid(), > but I'm not sure how to do a similar comparison/analysis within lm() > > I imagine something like > > fit<-lm(lnRFU~Time | Vial * Site, data=feed1) > > ?in which I group by both Vial and Site, but obviously this code doesn't > work. Does anyone have an idea for how to do a linear regression with two > grouping variables? Do I have to go back and combine Vial and Site into a > single grouping variable or can I leave the dataframe the way it is? I'm > trying to imagine a means of accomplishing the same type of thing that > facet_grid does when it allows you to plot the data as a function of two > "grouping" variables. > > Thanks for you time. I greatly appreciate it. > > Nate Miller > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >
At 02:15 23/08/2011, Nathan Miller wrote:>Hi all,See comment in-line>I have a data set that looks a bit like this. > >feed1 > RFU Site Vial Time lnRFU >1 44448 1 1 10 10.702075 >2 47521 1 1 20 10.768927 >3 42905 1 1 30 10.66674 >4 46867 1 1 40 10.755069 >5 42995 1 1 50 10.668839 >6 43074 1 1 60 10.670675 >7 41195 1 1 70 10.626072 >8 47090 1 2 10 10.759816 >9 48100 1 2 20 10.781037 >10 43215 1 2 30 10.673943 >11 39656 1 2 40 10.587998 >12 38799 1 2 50 10.566150 >13 38424 1 2 60 10.556438 >14 35240 1 2 70 10.469937 >15 46427 1 3 10 10.745636 >16 46418 1 3 20 10.745443 >17 42095 1 3 30 10.647684 >...... >There are 5 columns of data, three levels of "Site", 10 "Vials" per site, >and measurements were taken at 10 min intervals from 10-70.. I am primarily >interested in the relationship between "Time" and "lnRFU" to calculate the >rate at which lnRFU declines over time. I have a nice plot using a ggplot2 >code that looks like this > >p<-ggplot(data=feed1,aes(x=Time,y=lnRFU)) >p+geom_point(size=4)+facet_grid(Site~Vial)+geom_smooth(method="lm") > >The graph is useful to visualize the changes over time and grouped by both >Site and Vial, but I also need the slopes of the linear regressions for each >Vial, within a Site. This is where I run into a problem. I want to run a >linear regression of lnRFU as a function of Time grouped by both Site and >Vial. Its easy to visualize this comparison in ggplot using facet_grid(), >but I'm not sure how to do a similar comparison/analysis within lm() > >I imagine something like > >fit<-lm(lnRFU~Time | Vial * Site, data=feed1)I think you will find lmList from nlme helpful here, try library(nlme) fitlist <- lmList(lnRFU~Time | Vial * Site, data=feed1) Untested, but should be OK. There are a number of helper functions in nlme which operate on lmList objects to give handy plots and more.> in which I group by both Vial and Site, but obviously this code doesn't >work. Does anyone have an idea for how to do a linear regression with two >grouping variables? Do I have to go back and combine Vial and Site into a >single grouping variable or can I leave the dataframe the way it is? I'm >trying to imagine a means of accomplishing the same type of thing that >facet_grid does when it allows you to plot the data as a function of two >"grouping" variables. > >Thanks for you time. I greatly appreciate it. > >Nate Miller > > [[alternative HTML version deleted]]Michael Dewey info at aghmed.fsnet.co.uk http://www.aghmed.fsnet.co.uk/home.html
Reasonably Related Threads
- (Lattice) How to improve the readability of a bwplot, i.e. separating groups somehow
- indexing within the function "aggregate"
- Applying function with separate dataframe (calibration file) supplying some inputs
- Degrees of Freedom Not Allocated to Residuals in Reduced Model
- Calculating difference between values in data frame based on separate column