Hi all,

I'm hoping someone more knowledgeable in Fortran than I am can chime in with an opinion.

I'm the maintainer of the flashClust package, which implements fast hierarchical clustering. The Fortran code fails when the number of clustered objects is larger than about 46300. My guess is that this is because the code uses the following construct:

    IOFFSET=J+(I-1)*N-(I*(I+1))/2

where N is the number of clustered objects and I, J can vary between 1 and N. The result is used as an index to access an array (namely the distance structure).

When N is more than 46341 (or 2^16/sqrt(2), i.e. the square root of 2^31), the expressions I*(I+1) and (I-1)*N can overflow and turn negative, messing up the indexing and crashing the code and the entire R session with a segmentation fault.

My solution is to convert the integers to double precision values, calculate the index, and convert back as follows:

    XI = DBLE(I)
    IOFFSET=J+NINT( (XI-1)*N - (XI*(XI+1))/2)

I'm wondering if there's a better way, something along the lines of the unsigned and/or long integers that are available in C?

Thanks,

Peter
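The 46341 threshold mentioned above can be checked with a short sketch. It assumes the compiler's default INTEGER is 32 bits and that a kind with at least 18 decimal digits exists; the names overflow_demo, i8 and prod are illustrative only and do not come from flashClust. The product is computed in the wider kind so that the check itself never overflows.

```fortran
program overflow_demo
  implicit none
  ! i8: assumed name for an integer kind with at least 18 decimal digits
  integer, parameter :: i8 = selected_int_kind(18)
  integer :: i
  integer(i8) :: prod

  do i = 46340, 46341
     ! Widen both factors before multiplying so the product cannot overflow
     prod = int(i, i8) * int(i + 1, i8)
     ! Last column is T when I*(I+1) no longer fits in a default integer
     write(*,*) i, prod, prod > int(huge(i), i8)
  end do
end program overflow_demo
```

With a 32-bit default integer, the comparison should come out false for I = 46340 and true for I = 46341, which matches the point at which the clustering code starts to fail.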
Peter Langfelder wrote:
> Hi all,
>
> I'm hoping someone more knowledgeable in Fortran than I can chime in
> with opinion.
>
> I'm the maintainer of the flashClust package that implements fast
> hierarchical clustering. The fortran code fails when the number of
> clustered objects is larger than about 46300. My guess is that this is
> because the code uses the following construct:

2-byte (16-bit) signed integers have a range from -32768 to +32767. So, it looks like you may be using 2-byte integers, and 46,300 would definitely cause an overflow with 16-bit integers.

I haven't used Fortran for a long time, but there could be a compiler switch that forces all integers to be 2 bytes, or a specific declaration that says I, J, N, IOFFSET are only 2-byte (16-bit) integers.

I'm guessing, but you might try a specification like

    INTEGER*4 I, J, N, IOFFSET

assuming INTEGER*4 is legal with your Fortran compiler:
http://gcc.gnu.org/onlinedocs/gfortran/Old_002dstyle-kind-specifications.html#Old_002dstyle-kind-specifications

efg

Earl F Glynn
Overland Park, KS
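One way to settle whether 2-byte or 4-byte integers are in play is to ask the compiler directly. The sketch below uses only the standard inquiry intrinsics bit_size and huge; the program name int_range is made up for illustration, and the program has to be built with the same compiler and flags as the package for the answer to be relevant.

```fortran
program int_range
  implicit none
  integer :: i   ! default integer; only its type is inspected, not its value

  ! Number of bits in the default integer kind
  write(*,*) 'bits in default integer:', bit_size(i)
  ! Largest value the default integer kind can hold
  write(*,*) 'largest default integer:', huge(i)
end program int_range
```

A result of 16 bits would support the 2-byte explanation, while 32 bits would point to overflow in the intermediate products instead.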
On Mon, 7 Feb 2011, Berend Hasselman <bhh at xs4all.nl> wrote:
> The overflow is not caused by 16 bits integers.
> I'm quite sure the OP is using 32 bit integers.
> The overflow is caused by the multiplication N*(i-1) and/or i*(i+1).

> In Fortran there's not much you can do about this unless your compiler
> supports larger integers.

Most modern Fortran compilers offer larger integers. The selected_int_kind() function can be used to find the appropriate integer KIND for your compiler. Most, like gfortran, use kind=8 for long integers:

    integer (kind=8) :: i16
    write(*,*) huge(i16)

    9223372036854775807

-- 
David Duffy (MBBS PhD)
email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax: -0101
Epidemiology Unit, Queensland Institute of Medical Research
300 Herston Rd, Brisbane, Queensland 4029, Australia GPG 4D0B994A
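Applied to the index formula from the original post, the larger-integer suggestion might look like the sketch below. This is not the flashClust code: the module offset_mod, the function ioffset64 and the kind parameter i8 are hypothetical names, and the sketch assumes selected_int_kind(18) returns a valid kind (it returns -1 when the compiler has no such integer). The kind is requested by decimal range instead of hard-coding kind=8, since kind numbers are compiler specific.

```fortran
module offset_mod
  implicit none
  ! i8: assumed name for an integer kind with at least 18 decimal digits
  integer, parameter :: i8 = selected_int_kind(18)
contains
  ! Offset for element (i, j), following the formula quoted in the thread.
  ! Every operand is widened before any multiplication takes place.
  function ioffset64(i, j, n) result(off)
    integer, intent(in) :: i, j, n
    integer(i8) :: off, li, lj, ln
    li = int(i, i8)
    lj = int(j, i8)
    ln = int(n, i8)
    off = lj + (li - 1_i8) * ln - (li * (li + 1_i8)) / 2_i8
  end function ioffset64
end module offset_mod

program offset_demo
  use offset_mod
  implicit none
  ! Arguments chosen so that (I-1)*N exceeds the 32-bit range
  write(*,*) ioffset64(50000, 50001, 60000)
end program offset_demo
```

For these arguments the offset should evaluate to 1749965001, even though the intermediate term (I-1)*N is about 3.0e9 and would not fit in a 32-bit integer. Compared with the DBLE/NINT workaround, this stays in exact integer arithmetic throughout. If the surrounding code needs a default integer, the result can be converted back with int(off) after checking that it does not exceed huge() of a default integer.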