James Splinter
2010-Oct-22 19:20 UTC
[R] Problem with Aggregate - Sum, limit on number of criteria
Hello,

It appears there is a limit on the number of criteria that can be put into the aggregate sum function. (It looks like it is 32.)

My code is:

HSfirst=aggregate(count, list(P2010W,P2009S,P2009W,P2008S,P2008W,P2007S,P2007W,P2006S,P2006W,pcom,W2010W,W2009S,W2009W,W2008S,W2008W,W2007S,W2007W,W2006S,W2006W,wcom,cd,f,g,m,urb,nourb,nourb2,nourb3,nourb4,eight,thirty,fifty,sixty,seventy,xover),sum)
names(HSfirst)=c("P2010W","P2009S","P2009W","P2008S","P2008W","P2007S","P2007W","P2006S","P2006W","pcom","W2010W","W2009S","W2009W","W2008S","W2008W","W2007S","W2007W","W2006S","W2006W","wcom","cd","f","g","m","urb","nourb","nourb2","nourb3","nourb4","eight","thirty","fifty","sixty","seventy","xover","count")
summary(HSfirst)

Some output from the summary command:

     sixty      seventy       xover        count
 Min.   :0   Min.   :0   Min.   :0   Min.   :    1.00
 1st Qu.:0   1st Qu.:0   1st Qu.:0   1st Qu.:    1.00
 Median :0   Median :0   Median :0   Median :    1.00
 Mean   :0   Mean   :0   Mean   :0   Mean   :   30.35
 3rd Qu.:0   3rd Qu.:0   3rd Qu.:0   3rd Qu.:    2.00
 Max.   :0   Max.   :0   Max.   :0   Max.   :25016.00

Originally I had a problem with the variable named "xover" coming out as a column of zeros after the data was put through the aggregation. However, I verified that this variable is not a column of zeros immediately BEFORE it goes into the aggregation, and it is all zeros afterwards. To test the theory, I created nourb, nourb2, nourb3 and nourb4 (which are just duplicates of urb) and added them; the result was that the last variables turned into columns of zeros.

The error message is:

Warning messages:
1: In grp * nlevels(ind) : NAs produced by integer overflow
2: In grp * nlevels(ind) : NAs produced by integer overflow
3: In grp * nlevels(ind) : NAs produced by integer overflow
4: In grp * nlevels(ind) : NAs produced by integer overflow

I tried searching for this error message but didn't find anything. I have very little programming background, but my colleague reviewed the source code and said that he didn't find anything putting a limit on the number of criteria, even though it seemingly is 32. I was hoping someone could verify whether or not there is a limit. Also, if there is a limit, what action can we take to fix the code?

If any more information is needed, please ask.

Thanks,

James
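The warning points at integer arithmetic rather than at a hard-coded cap on the number of grouping variables. Below is a minimal sketch of that idea, assuming the `grp * nlevels(ind)` in the warning is a running group index that gets multiplied by each variable's number of levels. The names `by_vars`, `n_cells` and `idx` are made up for illustration, and the 35 two-level factors are a stand-in, not the real data.

```r
# R integers are 32-bit; results above this value become NA with a warning.
int_max <- .Machine$integer.max          # 2^31 - 1

# Hypothetical stand-in for the grouping list: 35 two-level variables.
by_vars <- replicate(35, factor(c(0, 1)), simplify = FALSE)

# Number of possible level combinations. prod() returns a double,
# so this calculation itself cannot overflow.
n_cells <- prod(vapply(by_vars, nlevels, integer(1)))

n_cells > int_max                        # the combinations do not fit in an integer

# The same running product in integer arithmetic turns into NA part-way through.
idx <- 1L
for (v in by_vars) {
  idx <- suppressWarnings(idx * nlevels(v))
}
is.na(idx)
```

If this is what happens inside the aggregation, the point at which columns go wrong would depend on how many levels each variable has, not on a fixed count of variables.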
Uwe Ligges
2010-Oct-23 18:15 UTC
[R] Problem with Aggregate - Sum, limit on number of criteria
This is fixed in recent versions of R:

------------------------------------------------------------------------
r52862 | hornik | 2010-09-01 18:21:49 -0400 (Wed, 01 Sep 2010) | 1 line
Changed paths:
   M /trunk/src/library/stats/R/aggregate.R

Avoid integer overflows.

Uwe Ligges

On 22.10.2010 21:20, James Splinter wrote:
> Hello,
>
> It appears there is a limit in the number of criteria that can be put into
> the Aggregate sum function. (It looks like it is 32).
> [...]
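For an installation that cannot be upgraded, one possible workaround is to collapse the grouping variables into a single key before summing, so that only combinations that actually occur are indexed. This is a sketch under assumptions, not the change made in r52862: the toy data frame `dat`, its columns `a`, `b`, `count`, and the `"|"` separator are all hypothetical, and the separator must not occur in the grouping values.

```r
# Toy data; the column names are hypothetical stand-ins.
dat <- data.frame(
  a     = c(0, 0, 1, 1, 0),
  b     = c(1, 1, 0, 0, 1),
  count = c(2, 3, 4, 5, 6)
)
group_cols <- c("a", "b")

# One character key per row. Only observed combinations get a level,
# so no product over all possible level combinations is formed.
key <- do.call(paste, c(dat[group_cols], sep = "|"))

# Sum the counts within each observed combination.
sums <- tapply(dat$count, key, sum)

# Keep one row of grouping values per key and attach the matching sum.
first_row <- !duplicated(key)
result <- dat[first_row, group_cols, drop = FALSE]
result$count <- as.vector(sums[key[first_row]])
rownames(result) <- NULL
result
```

The same pattern should extend to any number of grouping columns by listing them in `group_cols`.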