James Splinter
2010-Oct-22 19:20 UTC
[R] Problem with Aggregate - Sum, limit on number of criteria
Hello,
There appears to be a limit on the number of grouping criteria that can be
passed to aggregate() when summing. (The limit looks like 32.)
My code is:
HSfirst <- aggregate(count,
    list(P2010W, P2009S, P2009W, P2008S, P2008W, P2007S, P2007W, P2006S,
         P2006W, pcom, W2010W, W2009S, W2009W, W2008S, W2008W, W2007S,
         W2007W, W2006S, W2006W, wcom, cd, f, g, m, urb, nourb, nourb2,
         nourb3, nourb4, eight, thirty, fifty, sixty, seventy, xover),
    sum)
names(HSfirst) <- c("P2010W", "P2009S", "P2009W", "P2008S", "P2008W",
    "P2007S", "P2007W", "P2006S", "P2006W", "pcom", "W2010W", "W2009S",
    "W2009W", "W2008S", "W2008W", "W2007S", "W2007W", "W2006S", "W2006W",
    "wcom", "cd", "f", "g", "m", "urb", "nourb", "nourb2", "nourb3",
    "nourb4", "eight", "thirty", "fifty", "sixty", "seventy", "xover",
    "count")
summary(HSfirst)
Some output from the summary command:
     sixty      seventy       xover          count
 Min.   :0   Min.   :0   Min.   :0   Min.   :    1.00
 1st Qu.:0   1st Qu.:0   1st Qu.:0   1st Qu.:    1.00
 Median :0   Median :0   Median :0   Median :    1.00
 Mean   :0   Mean   :0   Mean   :0   Mean   :   30.35
 3rd Qu.:0   3rd Qu.:0   3rd Qu.:0   3rd Qu.:    2.00
 Max.   :0   Max.   :0   Max.   :0   Max.   :25016.00
Originally I had a problem with the variable named "xover" coming out as a
column of zeros after the data was put through the aggregation. However, I
verified that this variable is not a column of zeros immediately BEFORE it
goes into the aggregation, and it is all zeros afterwards. To test the
theory, I created nourb, nourb2, nourb3 and nourb4 (which are just
duplicates of urb) and added them. The result was that the last variables
in the list turned into columns of zeros.
The messages R prints are:
Warning messages:
1: In grp * nlevels(ind) : NAs produced by integer overflow
2: In grp * nlevels(ind) : NAs produced by integer overflow
3: In grp * nlevels(ind) : NAs produced by integer overflow
4: In grp * nlevels(ind) : NAs produced by integer overflow
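The warning points at the expression grp * nlevels(ind), which suggests that a combined group index is built by multiplying together the level counts of the grouping variables. If so, the limit is on the product of the level counts rather than on the number of criteria: R integers top out at .Machine$integer.max (2^31 - 1), so about 31 two-level variables are enough to overflow. A minimal sketch of that arithmetic, assuming every grouping variable has exactly two levels (an assumption for illustration, not something stated in the post):

```r
# Hypothetical setup: 35 grouping variables, each assumed to have 2 levels.
n_levels <- rep(2L, 35)

# Multiply the level counts one at a time in integer arithmetic, mimicking
# the kind of product the warning's "grp * nlevels(ind)" hints at.
grp <- 1L
steps_ok <- 0L
for (k in n_levels) {
  grp <- suppressWarnings(grp * k)  # integer overflow yields NA plus a warning
  if (is.na(grp)) break
  steps_ok <- steps_ok + 1L
}

.Machine$integer.max          # 2147483647, i.e. 2^31 - 1
steps_ok                      # 30: the 31st doubling overflows
prod(as.numeric(n_levels))    # 2^35 as a double, far beyond integer range
```

With variables that have more than two levels the overflow would arrive after fewer criteria, so "32" is better read as a symptom than as a hard-coded limit.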
I tried searching for this warning message but didn't find anything. I have
very little programming background, but my colleague reviewed the source
code and said that he didn't find anything putting a limit on the number of
criteria, even though it seemingly is 32. I was hoping someone could confirm
whether or not there is a limit and, if there is one, what we can do to fix
the code.
If any more information is needed, please ask.
Thanks,
James
Uwe Ligges
2010-Oct-23 18:15 UTC
[R] Problem with Aggregate - Sum, limit on number of criteria
This is fixed in recent versions of R:
------------------------------------------------------------------------
r52862 | hornik | 2010-09-01 18:21:49 -0400 (Wed, 01 Sep 2010) | 1 line
Changed paths:
M /trunk/src/library/stats/R/aggregate.R
Avoid integer overflows.
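For an installation that predates that revision and cannot be upgraded, one possible workaround is to collapse all grouping variables into a single key per row, so that no product of level counts is ever formed. The following is only a sketch on toy data: by_vars, key, sums and result are hypothetical names, and the simulated columns merely stand in for the 35 vectors in the original call.

```r
# Toy stand-in for the poster's data: 35 indicator columns and a count.
set.seed(1)
n <- 200
by_vars <- as.data.frame(matrix(rbinom(n * 35, 1, 0.1), nrow = n))
count <- rpois(n, 3) + 1L

# One character key per row encodes the combination of all 35 variables.
key <- do.call(paste, c(by_vars, sep = "."))

# Sum 'count' within each observed combination. Only one grouping factor
# is involved, so there is no product of level counts to overflow.
sums <- tapply(count, key, sum)

# Recover the grouping columns from the first row seen for each key and
# attach the matching sum by name.
first_row <- !duplicated(key)
result <- cbind(by_vars[first_row, , drop = FALSE],
                count = as.vector(sums[key[first_row]]))

head(result[, c("V1", "V35", "count")])
```

The rows of result come out in order of first appearance rather than sorted by group, so the ordering may differ from what aggregate() returns.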
Uwe Ligges