Please, I need some help using R to analyze my data. What I would like to do is repeat the same basic process (e.g. linear regression between wood density and distance from pith) for at least 240 data subsets within the main data frame. Within the main data frame, these subsets are defined by three variables, namely species, individual and core (i.e. 20 species, at least 6 individuals of each species, and 2 cores from each individual).

Although I can write the code to carry out this process for a single subset, I have been unable to instruct R to carry it out automatically for each of these subsets (perhaps using loops). To illustrate what I have done so far: with the code below I was able to run a regression analysis for core ‘a’ of individual 1 in the species “Apeime”. But rather than do this 240 times, I would like to tell R to repeat the process automatically, using loops or any method that works.

Code:

RG2 <- BCI[BCI$Species == "APEIME" & BCI$Individual == 1 & BCI$Core == "a", ]
plot(x = RG2$DP..cm., y = RG2$WD..g.cm3, xlab = "Distance from pith cm", main = "APEIME1a", ylab = "Wood density g/cm3")
RG2lm <- lm(RG2$WD..g.cm3 ~ RG2$DP..cm.)
summary(RG2lm)

Thanks,
Oyomoare
If you assume that the variance is the same in all your subsets, you can do an lm analysis with your subset classification as a factor. You could also analyze the interaction between factors and between factors and your numeric independent variable. You also should consider repeated measurement methods since you are taking 2 cores from the same individuals.

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
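As a rough illustration of the single-model suggestion above (one lm fit with the subset classification as a factor, interacting with the numeric predictor), here is a minimal sketch. The data are simulated stand-ins, not the real BCI data; the column names follow the original post, while the species label "SPECB" and the combined factor name Sp_ind_core are assumptions made for the example.

```r
## Simulated stand-in for BCI: values are made up purely for illustration
set.seed(1)
BCI <- expand.grid(DP..cm. = 1:8,
                   Core = c("a", "b"),
                   Individual = 1:3,
                   Species = c("APEIME", "SPECB"),  # "SPECB" is hypothetical
                   stringsAsFactors = FALSE)
BCI$WD..g.cm3 <- 0.4 + 0.01 * BCI$DP..cm. + rnorm(nrow(BCI), sd = 0.02)

## One factor identifying each species/individual/core subset
BCI$Sp_ind_core <- factor(paste(BCI$Species, BCI$Individual, BCI$Core, sep = "_"))

## Single model: separate intercept and slope for every subset,
## with one common residual variance across all subsets
fit <- lm(WD..g.cm3 ~ Sp_ind_core * DP..cm., data = BCI)

## The interaction row tests whether slopes differ among subsets
print(anova(fit))
```

This treats every subset as an unrelated group; it does not by itself account for the two cores coming from the same individual, which is the repeated-measures concern raised in the reply.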
From: Oyomoare Osazuwa-Peters <oyomoare@yahoo.com>
Subject: Re: [R] Help!
To: "Erich Neuwirth" <erich.neuwirth@univie.ac.at>
Date: Monday, September 20, 2010, 5:16 PM

Thanks for responding to my request for help. I understand what you mean about the repeated measurements methods for the two cores. The thing, though, is that to answer my research question, the data I really need is the radial gradient for each core, i.e. the slope from a regression of the response variable (WD) on the predictor (DP). Then I can begin to test for the effects of species, individual and core using an appropriate test (likely a nested ANOVA).

For now I am in the initial process of getting the radial gradients, and I am having problems with the code that would instruct R to do it all at once. My main problem is that when I define the subsetting indices to be species, individual and core at the same time for the whole data frame, so that R performs the operation for each of the 240 data subsets automatically, it doesn't work. But it works when I define only a single subset of the data, as I showed in my first mail.

Oyomoare

--- On Mon, 9/20/10, Erich Neuwirth <erich.neuwirth@univie.ac.at> wrote:

From: Erich Neuwirth <erich.neuwirth@univie.ac.at>
Subject: Re: [R] Help!
To: r-help@r-project.org
Date: Monday, September 20, 2010, 5:02 PM

> If you assume that the variance is the same in all your subsets, you can do an lm analysis with your subset classification as a factor. You could also analyze the interaction between factors and between factors and your numeric independent variable. You also should consider repeated measurement methods since you are taking 2 cores from the same individuals.
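The goal described above, one radial gradient (regression slope) per core collected for a later nested analysis, could be sketched roughly as follows. The data are simulated, the species label "SPECB" is hypothetical, and the helper name slope_of is an assumption for this example; only the column names come from the original post.

```r
## Simulated stand-in for BCI: values are made up purely for illustration
set.seed(2)
BCI <- expand.grid(DP..cm. = 1:8,
                   Core = c("a", "b"),
                   Individual = 1:3,
                   Species = c("APEIME", "SPECB"),  # "SPECB" is hypothetical
                   stringsAsFactors = FALSE)
BCI$WD..g.cm3 <- 0.4 + 0.01 * BCI$DP..cm. + rnorm(nrow(BCI), sd = 0.02)

## Hypothetical helper: fit one regression and return only its slope
slope_of <- function(d) coef(lm(WD..g.cm3 ~ DP..cm., data = d))[["DP..cm."]]

## Split on all three variables at once; drop = TRUE discards
## species/individual/core combinations that do not occur in the data
groups <- split(BCI, list(BCI$Species, BCI$Individual, BCI$Core), drop = TRUE)

## One row per core, keeping the identifiers needed for a nested analysis
gradients <- data.frame(
  Species    = sapply(groups, function(d) d$Species[1]),
  Individual = sapply(groups, function(d) d$Individual[1]),
  Core       = sapply(groups, function(d) d$Core[1]),
  slope      = sapply(groups, slope_of),
  row.names  = NULL
)
print(head(gradients))
```

Keeping Species, Individual and Core as separate columns next to the slope means the resulting data frame can be passed directly to whatever test of species, individual and core effects is chosen afterwards.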
You could do most of this with the function lmList in the nlme package, but since you want both plots and summaries, you might as well do it in a more flexible loop. How about something like this:

Code:

## This makes a single factor to define your groups
BCI <- within(BCI,
    Sp_ind_core <- factor(paste(Species, Individual, Core, sep = "_")))

## to receive your plots (one numbered file per subset):
jpeg(filename = "BCI_plot_%03d.jpg") ## or whatever...

## to receive your printed summaries:
sink("BCI_output.txt")

## a function to do all the work for one subset
## (column names DP..cm. and WD..g.cm3 as in the original post):
action <- function(data) {
    group <- as.character(data$Sp_ind_core[1])
    plot(WD..g.cm3 ~ DP..cm., data,
         xlab = "Distance from pith cm",
         main = group,
         ylab = "Wood density g/cm3")
    modl <- lm(WD..g.cm3 ~ DP..cm., data)
    cat("\n\n Subset: ", group, "\n")
    print(summary(modl))
    invisible(modl)
}

### now for the loop: one call to action() per subset
result <- lapply(split(BCI, BCI$Sp_ind_core), action)

## Finally tidy up: stop diverting output and close the graphics device
sink()
dev.off()

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Oyomoare Osazuwa-Peters
Sent: Tuesday, 21 September 2010 7:47 AM
To: r-help at r-project.org
Subject: [R] Help!
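For comparison, the lmList route mentioned in the reply above might look roughly like this when only the fitted coefficients are needed and no plots. This is a sketch on simulated data, not the real BCI data; the species label "SPECB" is hypothetical, and nlme is assumed to be installed (it is a recommended package that usually comes with R).

```r
library(nlme)  # provides lmList(); assumed to be installed

## Simulated stand-in for BCI: values are made up purely for illustration
set.seed(3)
BCI <- expand.grid(DP..cm. = 1:8,
                   Core = c("a", "b"),
                   Individual = 1:3,
                   Species = c("APEIME", "SPECB"),  # "SPECB" is hypothetical
                   stringsAsFactors = FALSE)
BCI$WD..g.cm3 <- 0.4 + 0.01 * BCI$DP..cm. + rnorm(nrow(BCI), sd = 0.02)

## Same grouping factor as in the reply above
BCI$Sp_ind_core <- factor(paste(BCI$Species, BCI$Individual, BCI$Core, sep = "_"))

## One separate regression per level of the grouping factor
fits <- lmList(WD..g.cm3 ~ DP..cm. | Sp_ind_core, data = BCI)

## coef() returns a data frame with one row per subset:
## an intercept column and a slope column named after the predictor
coefs <- coef(fits)
print(head(coefs))
```

The slope column of coefs then holds one radial gradient per species/individual/core combination, with the subset labels as row names.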