Hello,

I have a big data frame where consecutive time points and the corresponding observed values for each subject (ID) are on one line. I want to compute the linear slope for each subject. I would like to use apply, but I do not know how to express the corresponding function. An example using a loop follows.

#
# create dummy data set; there are missing values
a <- c(1,2,3,4, 1,1,1,1, 2,2,3,3, 3,4,NA,4, 5,5,5,5,
       2.1,2.2,2.3,2.4, 2.3,2.4,2.6,2.6, 2.5,2.6,2.9,3,
       2.6,NA,3.2,4)
a <- matrix(a, nrow=4)
aa <- as.data.frame(a)
names(aa) <- c("ID","X1","X2","X3","X4","Y1","Y2","Y3","Y4")
#
# I want the regression coefficients of the Y on the X for each ID
#
sl <- rep(NA,4)
for(i in 1:4) {
  x1 <- a[i,2:5]
  y1 <- a[i,6:9]
  sl[i] <- lm(y1 ~ x1)$coef[2]
}
sl
#
# I would like to use apply on the data.frame aa but with which function?
#
sl <- apply(aa,1,FUN) # FUN = ??
#

Thanks for any help

R.Heberto Ghezzo Ph.D.
Montreal - Canada

Hi,

On Wed, Aug 1, 2012 at 9:06 AM, R Heberto Ghezzo, Dr <heberto.ghezzo at mcgill.ca> wrote:
> # I would like to use apply on the data.frame aa but with which function?
> #
> sl <- apply(aa,1,FUN) # FUN = ??

You could do it as a one-liner, but it's a lot more understandable if you write your own function.

myfun <- function(a) {
  x1 <- a[2:5]
  y1 <- a[6:9]
  lm(y1 ~ x1)$coef[2]
}

Then you can pass that function to apply:

sl <- apply(aa,1,myfun)

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
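
As a quick check of the named-function approach, the sketch below rebuilds the example data and compares the apply() result with the original loop. The names row_slope, sl_apply and sl_loop are made up for this illustration, and coef() is used instead of $coef only as a matter of style. Note that apply() converts the data frame to a matrix first, so each row arrives in the function as a numeric vector.

```r
# Example data from the original post (4 subjects, 4 time points each)
a <- c(1,2,3,4, 1,1,1,1, 2,2,3,3, 3,4,NA,4, 5,5,5,5,
       2.1,2.2,2.3,2.4, 2.3,2.4,2.6,2.6, 2.5,2.6,2.9,3,
       2.6,NA,3.2,4)
aa <- as.data.frame(matrix(a, nrow = 4))
names(aa) <- c("ID","X1","X2","X3","X4","Y1","Y2","Y3","Y4")

# Hypothetical helper: slope of Y on X for a single row.
# lm() drops incomplete (x, y) pairs by default (na.action = na.omit).
row_slope <- function(row) {
  x <- row[2:5]
  y <- row[6:9]
  unname(coef(lm(y ~ x))[2])
}

# One slope per subject via apply()
sl_apply <- apply(aa, 1, row_slope)

# Reference: the explicit loop from the original post
sl_loop <- rep(NA_real_, nrow(aa))
for (i in seq_len(nrow(aa))) {
  x1 <- unlist(aa[i, 2:5])
  y1 <- unlist(aa[i, 6:9])
  sl_loop[i] <- coef(lm(y1 ~ x1))[2]
}

# Both approaches should agree
print(all.equal(unname(sl_apply), sl_loop))
```

If the real data frame also holds non-numeric columns, as.matrix() would turn every row into character, so it is probably safer to pass only the numeric columns to apply().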
"R Heberto Ghezzo, Dr" <heberto.ghezzo@mcgill.ca> wrote on 08/01/2012 08:06:30 AM:> > Hello, > I have a big data frame where consecutive time dates and > corresponding observed values for each subject (ID) are on a line. I > want to compute the linear slope for each subject. I would like to > use apply but I do > not know how to express the corresponding function. An example using > a loop follows > # > # create dummy data set There are missing values > a <- c(1,2,3,4, 1,1,1,1, 2,2,3,3, 3,4,NA,4, 5,5,5,5, > 2.1,2.2,2.3,2.4, 2.3,2.4,2.6,2.6, 2.5,2.6,2.9,3, > 2.6,NA,3.2,4) > a <- matrix(a, nr=4) > aa <- as.data.frame(a) > names(aa) <- c("ID","X1","X2","X3","X4","Y1","Y2","Y3","Y4") > # > # I want the regression coefficientes of the Y on the X for each ID > # > sl <- rep(NA,4) > for(i in 1:4) { > x1 <- a[i,2:5] > y1 <- a[i,6:9] > sl[i] <- lm(y1 ~ x1)$coef[2] > } > sl > # > # I would like to use apply on the data.frame aa but with whichfunction?> # > sl <- apply(aa,1,FUN) # FUN = ?? > #Thanks for providing example data! Very helpful. You would have to write your own function to operate on each row of the data frame. For example, sl <- apply(aa, 1, function(dat) lm(dat[6:9] ~ dat[2:5])$coef[2]) I'm not sure how much faster this will be for your big data frame. Jean> Thanks for any help > > R.Heberto Ghezzo Ph.D. > Montreal - Canada[[alternative HTML version deleted]]

Hi,

maybe working with a data.frame in long format is an option. Then you can use e.g. lmList, and so on up to mixed models, depending on the final goals of your analysis (e.g. checking for differential slopes).

vmat <- matrix(c("X1","X2","X3","X4","Y1","Y2","Y3","Y4"), nrow=2, byrow=TRUE)
aa.l <- reshape(aa, idvar="ID", direction="long", varying=vmat, v.names=c("X","Y"))
library(nlme)
(ll <- lmList(Y~X|ID, aa.l, na.action=na.omit))
summary(ll)

Cheers.

On 01.08.2012 15:06, R Heberto Ghezzo, Dr wrote:
> I have a big data frame where consecutive time dates and corresponding observed values for each subject (ID) are on a line. I want to compute the linear slope for each subject.
> [...]
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Eik Vettorazzi
Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf
Martinistr. 52
20246 Hamburg
T ++49/40/7410-58243
F ++49/40/7410-57790
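
If the goal is still a plain vector of per-subject slopes, they can probably be pulled out of the lmList fit with coef(). The sketch below rebuilds the example data, passes varying as a list of name vectors (intended to be equivalent to the vmat matrix above), and compares the result with one lm() per row. The names slopes_long and sl_lm are made up for this illustration; it assumes the nlme package is installed.

```r
library(nlme)

# Example data from the original post
a <- c(1,2,3,4, 1,1,1,1, 2,2,3,3, 3,4,NA,4, 5,5,5,5,
       2.1,2.2,2.3,2.4, 2.3,2.4,2.6,2.6, 2.5,2.6,2.9,3,
       2.6,NA,3.2,4)
aa <- as.data.frame(matrix(a, nrow = 4))
names(aa) <- c("ID","X1","X2","X3","X4","Y1","Y2","Y3","Y4")

# Wide -> long: one row per subject and time point, columns ID, time, X, Y
aa.l <- reshape(aa, idvar = "ID", direction = "long",
                varying = list(c("X1","X2","X3","X4"),
                               c("Y1","Y2","Y3","Y4")),
                v.names = c("X","Y"))

# One linear model per subject; rows with missing X or Y are dropped
ll <- lmList(Y ~ X | ID, data = aa.l, na.action = na.omit)

# coef() returns one row per ID with columns "(Intercept)" and "X"
slopes_long <- coef(ll)[["X"]]

# Reference: one lm() call per row of the wide data
sl_lm <- unname(apply(aa, 1, function(dat) coef(lm(dat[6:9] ~ dat[2:5]))[2]))

print(all.equal(slopes_long, sl_lm))
```

The long format costs an extra reshaping step, but the same aa.l data frame can then be passed on to mixed-model functions such as lme() if a joint analysis of the slopes is wanted.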