I have a data frame like the following:
ID TIME DV
1 0 0.880146038
1 1 0.88669051
1 3 0.610784702
1 5 0.756046666
2 0 0.456263368
2 1 0.369991537
2 3 0.508798346
2 5 0.441037014
3 0 0.854905349
3 1 0.960457553
3 3 0.609434409
3 5 0.655006334
. . .
. . .
I would like to generate another column with the normalized values of DV:
for each ID, normalize to the value at TIME 0.
I was able to do the normalization with two nested loops, but they took a
long time to run, so I am wondering whether there is a better way to do it.
I have the following code with only one loop, but it did not work. Can
anyone help with this? Thanks,
IDS <- unique(data.frame$ID)
for (WhichID in IDS)
{
    subset <- data.frame[, "ID"] == WhichID
    DVREAL <- data.frame[subset, "DV"]
    DVNORM[WhichID] <- DVREAL / DVREAL[1]
}
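The loop above fails because DVNORM is never created before it is indexed, and even if it were, DVNORM[WhichID] is a single element, so the whole per-subject vector cannot be stored in it. A minimal corrected sketch, assuming the data frame is actually called dat (data.frame above presumably stands in for its real name) and that each ID's first row is its TIME 0 measurement:

DVNORM <- numeric(nrow(dat))                        # pre-allocate the result vector
for (WhichID in unique(dat$ID)) {
    rows <- dat$ID == WhichID                       # rows belonging to this subject
    DVNORM[rows] <- dat$DV[rows] / dat$DV[rows][1]  # divide by the subject's first (TIME 0) value
}
dat$DVNORM <- DVNORM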
Assuming that the entries for each subject are ordered with respect to
TIME and that each subject has a measurement at TIME = 0, you could use
the following:
Data <- read.table(textConnection("ID TIME DV
1 0 0.880146038
1 1 0.88669051
1 3 0.610784702
1 5 0.756046666
2 0 0.456263368
2 1 0.369991537
2 3 0.508798346
2 5 0.441037014
3 0 0.854905349
3 1 0.960457553
3 3 0.609434409
3 5 0.655006334"), header = TRUE)
closeAllConnections()
Data$DVn <- with(Data, ave(DV, ID, FUN = function (x) x/x[1]))
Data
If either of the above assumptions doesn't hold, you'll have to tweak
it.
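For example, a minimal sketch of such a tweak, assuming only that each ID has exactly one row with TIME == 0 (the rows need not be ordered), could be:

t0 <- Data[Data$TIME == 0, c("ID", "DV")]           # baseline (TIME 0) value for each subject
Data$DVn <- Data$DV / t0$DV[match(Data$ID, t0$ID)]  # divide every row by its subject's baseline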
I hope it helps.
Best,
Dimitris
--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/
Thanks for the help, it works very well.
Hi,

You have probably already got a solution, but anyway, you can try this:

Dat <- read.table(text="ID TIME DV
1 0 0.880146038
1 1 0.88669051
1 3 0.610784702
1 5 0.756046666
2 0 0.456263368
2 1 0.369991537
2 3 0.508798346
2 5 0.441037014
3 0 0.854905349
3 1 0.960457553
3 3 0.609434409
3 5 0.655006334
", sep = "", header = TRUE)

SpDV <- function(x, n) split(x, sort(rank(x) %% n))
DVlist <- SpDV(Dat$DV, length(unique(Dat$ID)))
Dat$DVn <- cbind(unlist(lapply(DVlist, FUN = function(x) x/x[1])))

> Dat
   ID TIME        DV       DVn
1   1    0 0.8801460 1.0000000
2   1    1 0.8866905 1.0074357
3   1    3 0.6107847 0.6939584
4   1    5 0.7560467 0.8590014
5   2    0 0.4562634 1.0000000
6   2    1 0.3699915 0.8109166
7   2    3 0.5087983 1.1151418
8   2    5 0.4410370 0.9666281
9   3    0 0.8549053 1.0000000
10  3    1 0.9604576 1.1234665
11  3    3 0.6094344 0.7128677
12  3    5 0.6550063 0.7661741

A.K.
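Note that the rank-based split above relies on every ID having the same number of consecutive rows. A sketch of the same split/lapply idea keyed on ID instead, assuming only that the rows are grouped by ID in increasing ID order, would be:

DVlist <- split(Dat$DV, Dat$ID)                        # one element per subject
Dat$DVn <- unlist(lapply(DVlist, function(x) x/x[1]),  # normalize within each subject
                  use.names = FALSE)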