I have a data frame like the following:

ID  TIME  DV
1   0     0.880146038
1   1     0.88669051
1   3     0.610784702
1   5     0.756046666
2   0     0.456263368
2   1     0.369991537
2   3     0.508798346
2   5     0.441037014
3   0     0.854905349
3   1     0.960457553
3   3     0.609434409
3   5     0.655006334
.   .     .
.   .     .

I would like to generate another column with normalized values of DV: for
each ID, normalize to the value at TIME 0. I was able to do the
normalization with two nested loops, but they took a long time to run, and
I don't know whether there is a better way to do it. I tried the following
code with only one loop, but it did not work. Can anyone help with this?
Thanks,

IDS <- unique(data.frame$ID)
for (WhichID in IDS)
{
  subset <- data.frame[, "ID"] == WhichID
  DVREAL <- data.frame[subset, "DV"]
  DVNORM[WhichID] <- DVREAL/DVREAL[1]
}
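For reference, the loop version can be made to work by pre-allocating the
result and indexing it with the row subset rather than with the ID value.
This is only a sketch: it assumes the data frame is named Data (data.frame
is also the name of a base R function) and that each ID's first row is the
TIME = 0 record; the replies below avoid the loop entirely.

## sketch of a corrected single-loop version (assumes a data frame "Data")
DVNORM <- numeric(nrow(Data))
for (WhichID in unique(Data$ID)) {
  rows <- Data$ID == WhichID           # logical index of this subject's rows
  DVREAL <- Data$DV[rows]
  DVNORM[rows] <- DVREAL / DVREAL[1]   # divide by the first (TIME = 0) value
}
Data$DVNORM <- DVNORM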
Assuming that the entries for each subject are ordered with respect to TIME
and that each subject has a measurement at TIME = 0, you could use the
following:

Data <- read.table(textConnection("ID TIME DV
1 0 0.880146038
1 1 0.88669051
1 3 0.610784702
1 5 0.756046666
2 0 0.456263368
2 1 0.369991537
2 3 0.508798346
2 5 0.441037014
3 0 0.854905349
3 1 0.960457553
3 3 0.609434409
3 5 0.655006334"), header = TRUE)
closeAllConnections()

Data$DVn <- with(Data, ave(DV, ID, FUN = function (x) x/x[1]))
Data

If either of the above assumptions doesn't hold, you'll have to tweak it,
for example along the lines of the sketch after this message. I hope it
helps.

Best,
Dimitris

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/
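If the rows are not sorted by TIME, or the TIME = 0 record is not
guaranteed to come first, one way to tweak it is to pick out the TIME == 0
value explicitly. A sketch along the same lines, assuming each ID has
exactly one TIME == 0 record:

## normalize by the value at TIME == 0 regardless of row order
## (assumes exactly one TIME == 0 row per ID)
baseline <- with(Data, ave(ifelse(TIME == 0, DV, NA), ID,
                           FUN = function(x) x[!is.na(x)][1]))
Data$DVn <- Data$DV / baseline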
Thanks for the help, it works very well.
Hi,

You have probably already got a solution, but you can also try this:

Dat <- read.table(text="
ID  TIME  DV
1  0  0.880146038
1  1  0.88669051
1  3  0.610784702
1  5  0.756046666
2  0  0.456263368
2  1  0.369991537
2  3  0.508798346
2  5  0.441037014
3  0  0.854905349
3  1  0.960457553
3  3  0.609434409
3  5  0.655006334
", sep = "", header = TRUE)

SpDV <- function(x, n) split(x, sort(rank(x) %% n))
DVlist <- SpDV(Dat$DV, length(unique(Dat$ID)))
Dat$DVn <- cbind(unlist(lapply(DVlist, FUN = function(x) x/x[1])))

> Dat
   ID TIME        DV       DVn
1   1    0 0.8801460 1.0000000
2   1    1 0.8866905 1.0074357
3   1    3 0.6107847 0.6939584
4   1    5 0.7560467 0.8590014
5   2    0 0.4562634 1.0000000
6   2    1 0.3699915 0.8109166
7   2    3 0.5087983 1.1151418
8   2    5 0.4410370 0.9666281
9   3    0 0.8549053 1.0000000
10  3    1 0.9604576 1.1234665
11  3    3 0.6094344 0.7128677
12  3    5 0.6550063 0.7661741

A.K.
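Note that splitting on sort(rank(x) %% n) simply carves DV into equal-sized
consecutive blocks, so it assumes every ID has the same number of rows and
that the rows are grouped by ID in order. A sketch that keys the split on
ID itself, so unequal group sizes are handled as well (still assuming each
ID's first row is the TIME = 0 record):

## split by ID, normalize each block to its first value, and let unsplit()
## put the results back in the original row order; DVn2 is just a new name
## so the result can be compared against DVn
Dat$DVn2 <- unsplit(lapply(split(Dat$DV, Dat$ID), function(x) x / x[1]), Dat$ID)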