hi all,
I want to get a cumsum according to the order of some variable.
However, it doesn't work.
For example,
**********************
test<-data.frame(cbind(x=c(3,5,2,6,7),y=c(8,1,4,9,0)))
test[order(test$x),]$sumy<-cumsum(test[order(test$x),]$y)
**********************
R complains:
Warning message:
In `[<-.data.frame`(`*tmp*`, order(test$x), , value = list(x = c(2,  :
  provided 3 variables to replace 2 variables.

while the following
***********************
test$sumy<-cumsum(test[order(test$x),]$y)
***********************
gives
  x y sumy
1 3 8    4
2 5 1   12
3 2 4   13
4 6 9   22
5 7 0   22

Shouldn't it give

  x y sumy
1 3 8   12
2 5 1   13
3 2 4    4
4 6 9   22
5 7 0   22

What am I missing here?
thanks,
Guang
> test<-data.frame(cbind(x=c(3,5,2,6,7),y=c(8,1,4,9,0)))
> test[order(test$x),]$sumy<-cumsum(test[order(test$x),]$y)

is asking a bit too much of R. If you add the line
   test$sumy <- numeric(nrow(test))
between those lines you get what you want.

Here are the details. The nested replacement expression
   dataFrame[subscript, ]$component <- value
is treated as the sequence of expressions
   TMP <- dataFrame[subscript, ]
   TMP$component <- value
   dataFrame[subscript, ] <- TMP
If dataFrame has no column named 'component' then the last
expression involves trying to replace some rows of the n-column
data.frame dataFrame by the contents of the (n+1)-column data.frame
TMP and R will not do that.

By the way, drop the cbind from
   data.frame(cbind(x=..., y=...))
and just use
   data.frame(x=..., y=...)
The cbind() slows things down, wastes memory, and can give
surprising results (when x and y have different classes).

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
> Behalf Of cowboy
> Sent: Friday, July 27, 2012 1:23 PM
> To: r-help at r-project.org
> Subject: [R] why order doesn't work?
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
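The three-step expansion Bill describes can be traced by hand. The following is only a sketch using the poster's data; the names TMP and idx are illustrative and are not anything R uses internally.

```r
# Data from the original post, built without cbind() as Bill suggests.
test <- data.frame(x = c(3, 5, 2, 6, 7), y = c(8, 1, 4, 9, 0))
idx  <- order(test$x)          # row positions sorted by x

# Step 1: extract the selected rows (a 2-column data frame).
TMP <- test[idx, ]
# Step 2: add the new column to that copy (now 3 columns).
TMP$sumy <- cumsum(TMP$y)
# Step 3 would be  test[idx, ] <- TMP,  i.e. a 3-column value assigned
# into a 2-column target, which is the source of the warning.
ncol(test)   # 2
ncol(TMP)    # 3

# Creating the column first makes both sides 3 columns wide,
# so the nested replacement from the question then goes through.
test$sumy <- numeric(nrow(test))
test[idx, ]$sumy <- cumsum(test[idx, ]$y)
test$sumy    # 12 13 4 22 22
```

The last line gives the values the poster expected, in the original row order.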
order is working just fine.
Apparently one cannot create a new column with the $ notation when it is
applied to a row subset of a data.frame, as in test[rows,]$sumy <- value.
This will work

test[order(test$x),"sumy"]<-cumsum(test[order(test$x),]$y)

Berend

--
View this message in context: http://r.789695.n4.nabble.com/why-order-doesn-t-work-tp4638149p4638152.html
Sent from the R help mailing list archive at Nabble.com.
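For comparison, here is a small sketch of the bracket form Berend suggests, run on the poster's data. The variable idx and the column name big are made up for illustration.

```r
test <- data.frame(x = c(3, 5, 2, 6, 7), y = c(8, 1, 4, 9, 0))
idx  <- order(test$x)          # 3 1 2 4 5

# One call to `[<-.data.frame` with a row index and a column name;
# the column "sumy" does not exist yet and is created by the assignment.
test[idx, "sumy"] <- cumsum(test$y[idx])
test$sumy    # 12 13 4 22 22

# Plain `$` on the whole data frame can also create a column;
# it is only the nested form test[rows, ]$new <- value that fails.
test$big <- test$sumy > 10
```

The difference from the failing statement is that rows and column are given in a single subscript, so no intermediate data frame with an extra column has to be written back.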
This has nothing to do with order(). The following also reproduces the behavior:

> test <- data.frame(x=letters[1:5],y = 1:5)
> test[1:5,]$z <- 11:15
Warning message:
In `[<-.data.frame`(`*tmp*`, 1:5, , value = list(x = 1:5, y = 1:5,  :
  provided 3 variables to replace 2 variables
> test
  x y
1 a 1
2 b 2
3 c 3
4 d 4
5 e 5
> test$z <- 11:15
> test
  x y  z
1 a 1 11
2 b 2 12
3 c 3 13
4 d 4 14
5 e 5 15

I don't know what's going on here, but it appears that it has
something to do (again!) with the $ convenience syntax, which is best
avoided, it seems. Instead of the rather convoluted syntax of the
first statement, note that

> test <- data.frame(x=letters[1:5],y = 1:5)
> test[1:5,"z"] <- 11:15
> test
  x y  z
1 a 1 11
2 b 2 12
3 c 3 13
4 d 4 14
5 e 5 15

... works fine.

Have you read "An Intro to R" where indexing is discussed? If not, why
not? If so, please follow it. It will help you avoid such
difficulties.

-- Bert
--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Hi,

I guess you want to get the cumsum of y according to the order of x:

test1<-test[order(test$x),]
test2<-within(test1,{cumsumy<-cumsum(y)})
test2
#  x y cumsumy
#3 2 4       4
#1 3 8      12
#2 5 1      13
#4 6 9      22
#5 7 0      22

A.K.
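A.K.'s result is sorted by x, whereas the original question asked for the sums in the input row order. One possible way to get back there is sketched below. It assumes the data frame has the default row names 1..n, so that the row names of the sorted copy record the original positions; the names sorted and restored are illustrative.

```r
test <- data.frame(x = c(3, 5, 2, 6, 7), y = c(8, 1, 4, 9, 0))

# Sorted copy with the running sum, as in A.K.'s reply.
sorted <- within(test[order(test$x), ], cumsumy <- cumsum(y))

# Assumption: default row names, so "3", "1", "2", "4", "5" are the
# original row positions. Ordering by them restores the input order.
restored <- sorted[order(as.integer(rownames(sorted))), ]
restored$cumsumy   # 12 13 4 22 22
```

If the data frame carries custom row names this trick does not apply, and assigning through a row index as shown earlier in the thread is the safer route.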