Dear R Helpers,

I have a fairly large data frame (150,000 variables, 10,000 entries for each) and have to run a regression on each of the variables, recording the p-values. I wrote a function and apply it with sapply. The function looks something like this:

    calcpval <- function(x) {
      # Three nested models fitted to the data frame m
      modela <- lm(apples ~ age, data = m)
      modelb <- lm(apples ~ age + ageSquared, data = m)
      modelc <- lm(apples ~ age + ageSquared + bmi, data = m)

      # p-value of the F test from each pairwise model comparison
      # (row 2 of the anova table, column "Pr(>F)")
      p_main   <- anova(modela, modelb)[2, "Pr(>F)"]
      p_main_i <- anova(modela, modelc)[2, "Pr(>F)"]
      p_i      <- anova(modelb, modelc)[2, "Pr(>F)"]

      return(c(p_main, p_main_i, p_i))
    }

This whole thing is terribly slow. I observed that it runs faster when I split the file into smaller pieces. What other suggestions could you make to speed it up (say, days instead of weeks)?

Thank you and best regards,
Georg.

*****************
Georg Ehret, Johns Hopkins Medicine