Hello,
I have a big data frame in which each line holds, for one subject (ID), consecutive
time points and the corresponding observed values. I want to compute the linear
slope for each subject. I would like to use apply, but I do
not know how to write the corresponding function. An example using a loop
follows.
#
# create dummy data set; there are missing values
a <- c(1,2,3,4, 1,1,1,1, 2,2,3,3, 3,4,NA,4, 5,5,5,5,
2.1,2.2,2.3,2.4, 2.3,2.4,2.6,2.6, 2.5,2.6,2.9,3,
2.6,NA,3.2,4)
a <- matrix(a, nrow=4)
aa <- as.data.frame(a)
names(aa) <- c("ID","X1","X2","X3","X4","Y1","Y2","Y3","Y4")
#
# I want the regression coefficients of the Y on the X for each ID
#
sl <- rep(NA,4)
for(i in 1:4) {
  x1 <- a[i,2:5]
  y1 <- a[i,6:9]
  sl[i] <- lm(y1 ~ x1)$coef[2]
}
sl
#
# I would like to use apply on the data.frame aa but with which function?
#
sl <- apply(aa,1,FUN) # FUN = ??
#
Thanks for any help
R.Heberto Ghezzo Ph.D.
Montreal - Canada
Hi,

On Wed, Aug 1, 2012 at 9:06 AM, R Heberto Ghezzo, Dr <heberto.ghezzo at mcgill.ca> wrote:
> # I would like to use apply on the data.frame aa but with which function?
> sl <- apply(aa,1,FUN) # FUN = ??

You could do it as a one-liner, but it's a lot more understandable if you write your own function.

myfun <- function(a) {
  x1 <- a[2:5]
  y1 <- a[6:9]
  lm(y1 ~ x1)$coef[2]
}

Then you can pass that function to apply:

sl <- apply(aa,1,myfun)

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
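Since the example data contain missing values, a slightly more defensive variant of the function above may be useful. The following is only a sketch: the name slope_or_na and its xcols/ycols arguments are made up for illustration, and it assumes the X values sit in columns 2:5 and the Y values in columns 6:9, as in the example.

```r
# Example data from the original post (4 subjects, 4 time points each)
a <- c(1,2,3,4, 1,1,1,1, 2,2,3,3, 3,4,NA,4, 5,5,5,5,
       2.1,2.2,2.3,2.4, 2.3,2.4,2.6,2.6, 2.5,2.6,2.9,3,
       2.6,NA,3.2,4)
aa <- as.data.frame(matrix(a, nrow = 4))
names(aa) <- c("ID","X1","X2","X3","X4","Y1","Y2","Y3","Y4")

# Hypothetical helper (not from the thread): slope of y on x for one row.
# Returns NA when fewer than two distinct complete (x, y) pairs are available,
# instead of letting lm() fail or return an undefined slope.
slope_or_na <- function(row, xcols = 2:5, ycols = 6:9) {
  x <- as.numeric(row[xcols])
  y <- as.numeric(row[ycols])
  ok <- complete.cases(x, y)          # pairs with neither x nor y missing
  if (length(unique(x[ok])) < 2) return(NA_real_)
  unname(coef(lm(y[ok] ~ x[ok]))[2])  # second coefficient is the slope
}

sl <- apply(aa, 1, slope_or_na)
sl
```

A row with no usable pairs then yields NA rather than stopping the whole apply call.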
"R Heberto Ghezzo, Dr" <heberto.ghezzo@mcgill.ca> wrote on 08/01/2012 08:06:30 AM:> > Hello, > I have a big data frame where consecutive time dates and > corresponding observed values for each subject (ID) are on a line. I > want to compute the linear slope for each subject. I would like to > use apply but I do > not know how to express the corresponding function. An example using > a loop follows > # > # create dummy data set There are missing values > a <- c(1,2,3,4, 1,1,1,1, 2,2,3,3, 3,4,NA,4, 5,5,5,5, > 2.1,2.2,2.3,2.4, 2.3,2.4,2.6,2.6, 2.5,2.6,2.9,3, > 2.6,NA,3.2,4) > a <- matrix(a, nr=4) > aa <- as.data.frame(a) > names(aa) <- c("ID","X1","X2","X3","X4","Y1","Y2","Y3","Y4") > # > # I want the regression coefficientes of the Y on the X for each ID > # > sl <- rep(NA,4) > for(i in 1:4) { > x1 <- a[i,2:5] > y1 <- a[i,6:9] > sl[i] <- lm(y1 ~ x1)$coef[2] > } > sl > # > # I would like to use apply on the data.frame aa but with whichfunction?> # > sl <- apply(aa,1,FUN) # FUN = ?? > #Thanks for providing example data! Very helpful. You would have to write your own function to operate on each row of the data frame. For example, sl <- apply(aa, 1, function(dat) lm(dat[6:9] ~ dat[2:5])$coef[2]) I'm not sure how much faster this will be for your big data frame. Jean> Thanks for any help > > R.Heberto Ghezzo Ph.D. > Montreal - Canada[[alternative HTML version deleted]]
Hi,
Maybe working with a data.frame in long format is an option. Then you
can use e.g. lmList, and go on up to mixed models, depending on the
final goals of your analysis (e.g. checking for differential slopes).
vmat <- matrix(c("X1","X2","X3","X4","Y1","Y2","Y3","Y4"), nrow=2, byrow=TRUE)
aa.l <- reshape(aa, idvar="ID", direction="long", varying=vmat, v.names=c("X","Y"))
library(nlme)
(ll<-lmList(Y~X|ID,aa.l,na.action=na.omit))
summary(ll)
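If only the per-subject slopes are needed, they can also be pulled from the long-format data with base R alone. A minimal sketch, assuming the example data from the original post. It passes varying as a list of column-name vectors rather than the matrix used above, and the names long and sl_long are chosen here for illustration.

```r
# Example data from the original post
a <- c(1,2,3,4, 1,1,1,1, 2,2,3,3, 3,4,NA,4, 5,5,5,5,
       2.1,2.2,2.3,2.4, 2.3,2.4,2.6,2.6, 2.5,2.6,2.9,3,
       2.6,NA,3.2,4)
aa <- as.data.frame(matrix(a, nrow = 4))
names(aa) <- c("ID","X1","X2","X3","X4","Y1","Y2","Y3","Y4")

# Wide to long: one row per subject and time point, with columns ID, time, X, Y
long <- reshape(aa, idvar = "ID", direction = "long",
                varying = list(c("X1","X2","X3","X4"),
                               c("Y1","Y2","Y3","Y4")),
                v.names = c("X", "Y"))

# One lm fit per subject; keep only the coefficient of X (the slope).
# lm drops rows with missing X or Y under the default na.action.
sl_long <- vapply(split(long, long$ID),
                  function(d) unname(coef(lm(Y ~ X, data = d))["X"]),
                  numeric(1))
sl_long
```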
cheers.
--
Eik Vettorazzi
Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf
Martinistr. 52
20246 Hamburg
T ++49/40/7410-58243
F ++49/40/7410-57790