Hi all,

I want to get a cumsum according to the order of some variable. However, it doesn't work. For example,
**********************
test<-data.frame(cbind(x=c(3,5,2,6,7),y=c(8,1,4,9,0)))
test[order(test$x),]$sumy<-cumsum(test[order(test$x),]$y)
**********************
R complains:
Warning message:
In `[<-.data.frame`(`*tmp*`, order(test$x), , value = list(x = c(2,  :
  provided 3 variables to replace 2 variables

while the following
**********************
test$sumy<-cumsum(test[order(test$x),]$y)
**********************
gives
  x y sumy
1 3 8    4
2 5 1   12
3 2 4   13
4 6 9   22
5 7 0   22

Shouldn't it give
  x y sumy
1 3 8   12
2 5 1   13
3 2 4    4
4 6 9   22
5 7 0   22

What am I missing here?

Thanks,
Guang
  > test<-data.frame(cbind(x=c(3,5,2,6,7),y=c(8,1,4,9,0)))
  > test[order(test$x),]$sumy<-cumsum(test[order(test$x),]$y)
is asking a bit too much of R. If you add the line
  test$sumy <- numeric(nrow(test))
between those lines you get what you want.

Here are the details. The nested replacement expression
  dataFrame[subscript, ]$component <- value
is treated as the sequence of expressions
  TMP <- dataFrame[subscript, ]
  TMP$component <- value
  dataFrame[subscript, ] <- TMP
If dataFrame has no column named 'component', then the last expression tries to replace some rows of the n-column data.frame dataFrame with the contents of the (n+1)-column data.frame TMP, and R will not do that.

By the way, drop the cbind from
  data.frame(cbind(x=..., y=...))
and just use
  data.frame(x=..., y=...)
The cbind() slows things down, wastes memory, and can give surprising results (when x and y have different classes).

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of cowboy
> Sent: Friday, July 27, 2012 1:23 PM
> To: r-help at r-project.org
> Subject: [R] why order doesn't work?
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
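Bill's three-step expansion can be run by hand to see where things go wrong. The following is only a sketch of that reading, not the code R runs internally; the names `o`, `tmp` and `msg` are made up for the illustration, and the data frame is built without cbind() as suggested above.

```r
test <- data.frame(x = c(3, 5, 2, 6, 7), y = c(8, 1, 4, 9, 0))
o <- order(test$x)                 # row numbers in increasing order of x

# Hand-written version of: test[o, ]$sumy <- cumsum(test[o, ]$y)
tmp <- test[o, ]                   # 2 columns, rows sorted by x
tmp$sumy <- cumsum(tmp$y)          # tmp now has 3 columns

# Writing 3 columns back into a 2-column data frame triggers the warning.
# tryCatch() turns it into a string and abandons the assignment.
msg <- tryCatch({
  test[o, ] <- tmp
  "no warning"
}, warning = function(w) conditionMessage(w))
print(msg)
print(ncol(test))                  # still 2: test was not modified

# The fix from the message above: create the column first, so that
# 3 columns are written back into 3 columns.
test$sumy <- numeric(nrow(test))
test[o, ]$sumy <- cumsum(test[o, ]$y)
print(test)
```

With the column created beforehand, sumy lines up with the original row order, which is the result the question asked for.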
order is working just fine. Apparently one cannot create a new column in a data.frame with the $ notation when it is applied to a row subset like this.

This will work:
  test[order(test$x),"sumy"]<-cumsum(test[order(test$x),]$y)

Berend

--
View this message in context: http://r.789695.n4.nabble.com/why-order-doesn-t-work-tp4638149p4638152.html
Sent from the R help mailing list archive at Nabble.com.
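As a quick check of Berend's suggestion, here is a small sketch on the data from the question. The helper name `o` is only there to avoid calling order() twice and is not part of the original code.

```r
test <- data.frame(x = c(3, 5, 2, 6, 7), y = c(8, 1, 4, 9, 0))
o <- order(test$x)                     # 3 1 2 4 5

# Row index plus column name in a single `[<-` call: the new column
# is created and filled in one step, so no column-count mismatch occurs.
test[o, "sumy"] <- cumsum(test$y[o])
print(test)
```

Because rows and the new column are named in the same call, there is no intermediate copy with an extra column to write back.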
This has nothing to do with order(). The following also reproduces the behavior:

> test <- data.frame(x=letters[1:5],y = 1:5)
> test[1:5,]$z <- 11:15
Warning message:
In `[<-.data.frame`(`*tmp*`, 1:5, , value = list(x = 1:5, y = 1:5,  :
  provided 3 variables to replace 2 variables
> test
  x y
1 a 1
2 b 2
3 c 3
4 d 4
5 e 5
> test$z <- 11:15
> test
  x y  z
1 a 1 11
2 b 2 12
3 c 3 13
4 d 4 14
5 e 5 15

I don't know what's going on here, but it appears that it has something to do (again!) with the $ convenience syntax, which is best avoided, it seems. Instead of the rather convoluted syntax of the first statement, note that

> test <- data.frame(x=letters[1:5],y = 1:5)
> test[1:5,"z"] <- 11:15
> test
  x y  z
1 a 1 11
2 b 2 12
3 c 3 13
4 d 4 14
5 e 5 15

... works fine.

Have you read "An Intro to R" where indexing is discussed? If not, why not? If so, please follow it. It will help you avoid such difficulties.

-- Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
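Bert's example mixes a character column with an integer one, which is the kind of data where the cbind() wrapper that Bill advised against earlier in the thread gives surprising results. A minimal sketch, with `via_cbind` and `direct` as made-up names:

```r
# cbind() builds a matrix first, and a matrix holds a single type,
# so the integers are converted to text before data.frame() sees them.
via_cbind <- data.frame(cbind(x = letters[1:5], y = 1:5))
direct    <- data.frame(x = letters[1:5], y = 1:5)

print(is.numeric(via_cbind$y))   # y is no longer a numeric column
print(is.numeric(direct$y))      # y is still numeric here
str(via_cbind)
str(direct)
```

Whether the converted columns show up as character or factor depends on the R version, but in neither case is y numeric any more. With two numeric columns, as in the original question, the detour through cbind() happens to be harmless.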
Hi,
I guess you want to get the cumsum of y according to the order of x:
test1<-test[order(test$x),]
test2<-within(test1,{cumsumy<-cumsum(y)})
test2
#  x y cumsumy
#3 2 4       4
#1 3 8      12
#2 5 1      13
#4 6 9      22
#5 7 0      22
A.K.
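The approach above returns the rows sorted by x. If the original row order should be kept, the running sum has to be written back through the same index that was used to sort. The sketch below also shows why the second attempt in the question came out misaligned; `o`, `wrong` and `sumy` are illustrative names only.

```r
test <- data.frame(x = c(3, 5, 2, 6, 7), y = c(8, 1, 4, 9, 0))
o <- order(test$x)            # 3 1 2 4 5

# What test$sumy <- cumsum(test[order(test$x), ]$y) does: the values are
# computed in sorted order but stored in rows 1..5 as they stand.
wrong <- cumsum(test$y[o])

# Assigning through the same index puts each running total back on the
# row it belongs to.
sumy <- numeric(nrow(test))
sumy[o] <- cumsum(test$y[o])
test$sumy <- sumy

print(wrong)
print(test)
```

Here `wrong` reproduces the 4, 12, 13, 22, 22 column reported in the question, while `test$sumy` follows the original row order.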
----- Original Message -----
From: cowboy <dgecon at gmail.com>
To: r-help at r-project.org
Cc:
Sent: Friday, July 27, 2012 4:22 PM
Subject: [R] why order doesn't work?