Thanks for the replies and for confirming my suspicion.

Interestingly, src/include/S.h uses a trick:

    #define longint int

and so does the nlme package (within src/init.c).

On 08/15/2018 02:47 PM, Hervé Pagès wrote:
> No segfault but a BIG warning from the compiler. That's because
> dereferencing the pointer inside your myfunc() function will
> produce an int that is not predictable i.e. it is system-dependent.
> Its value will depend on sizeof(long int) (which is not
> guaranteed to be 8) and on the endianness of the system.
>
> Also if the pointer you pass in the call to the function is
> an array of long ints, then pointer arithmetic inside your myfunc()
> won't necessarily take you to the array element that you'd expect.
>
> Note that there are very specific situations where you can actually
> do this kind of things e.g. in the context of writing a callback
> function to pass to qsort(). See 'man 3 qsort' if you are on a Unix
> system. In that case pointers to void and explicit casts should
> be used. If done properly, this is portable code and the compiler won't
> issue warnings.
>
> H.
>
>
> On 08/15/2018 07:05 AM, Brian Ripley wrote:
>>
>>
>>> On 15 Aug 2018, at 12:48, Duncan Murdoch <murdoch.duncan at gmail.com>
>>> wrote:
>>>
>>>> On 15/08/2018 7:08 AM, Benjamin Tyner wrote:
>>>> Hi
>>>> In my R package, imagine I have a C function defined:
>>>>     void myfunc(int *x) {
>>>>         // some code
>>>>     }
>>>> but when I call it, I pass it a pointer to a longint instead of a
>>>> pointer to an int. Could this practice potentially result in a
>>>> segfault?
>>>
>>> I don't think the passing would cause a segfault, but "some code"
>>> might be expecting a positive number, and due to the type error you
>>> could pass in a positive longint and have it interpreted as a
>>> negative int.
>>
>> Are you thinking only of a little-endian system? A 32-bit lookup of
>> a pointer to a 64-bit area could read the wrong half and get a
>> completely different value.
>>
>>>
>>> Duncan Murdoch
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_r-2Ddevel&d=DwIFAg&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=ERck0y30d00Np6hqTNYfjusx1beZim0OrKe9O4vkUxU&s=x1gI9ACZol7WbaWQ7Ocv60csJFJClZotWkJIMwUdjIc&e=
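
A minimal sketch in C of the two cases discussed above; it is not code from the thread. The first part illustrates why reading a long through an int-sized window gives a system-dependent result, and it uses memcpy() so that the sketch itself stays well-defined. The second part shows the qsort() callback pattern with pointers to void and explicit casts. All function names here (int_read_of_long, is_little_endian, compare_long, sort_longs) are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns what an int-sized read at the address of a
 * long would yield. memcpy() copies the first sizeof(int) bytes, which keeps
 * this sketch well-defined; dereferencing a long * cast to int * would not be. */
int int_read_of_long(const long *p)
{
    int out;
    memcpy(&out, p, sizeof out);
    return out;
}

/* Hypothetical helper: returns 1 if the least significant byte is stored
 * first (little-endian), 0 otherwise. */
int is_little_endian(void)
{
    const unsigned int probe = 1u;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* qsort() comparator for long. The callback receives pointers to void and
 * casts them back to the element type the caller actually stored. */
int compare_long(const void *a, const void *b)
{
    const long x = *(const long *) a;
    const long y = *(const long *) b;
    return (x > y) - (x < y);   /* avoids the overflow that x - y could cause */
}

/* Sorts n longs in ascending order; the element size is passed explicitly,
 * so qsort() steps through the array with the correct stride. */
void sort_longs(long *values, size_t n)
{
    qsort(values, n, sizeof values[0], compare_long);
}
```

Assuming a 4-byte int and an 8-byte long, int_read_of_long() applied to a long holding 5 returns 5 on a little-endian system and 0 on a big-endian one, because the read lands on the high half. Where long and int have the same size, both return 5.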
Note that include/S.h contains

    /*
       This is a legacy header and no longer documented.
       Code using it should be converted to use R.h
    */
    ...
    /* is this a good idea? - conflicts with many versions of f2c.h */
    # define longint int

S.h was meant to be used while converting C code written for S or S+
to R. S/S+ "integers" are represented as C "long ints", whose size
depends on the architecture, while R "integers" are represented as
32-bit C "ints". "longint" was invented to hide this difference.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Aug 15, 2018 at 5:32 PM, Benjamin Tyner <btyner at gmail.com> wrote:
> Thanks for the replies and for confirming my suspicion.
>
> Interestingly, src/include/S.h uses a trick:
>
>     #define longint int
>
> and so does the nlme package (within src/init.c).
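
To illustrate the point about longint, here is a small sketch that assumes the same macro definition quoted from S.h. The routine legacy_sum is hypothetical; it only shows how code written against the longint name ends up operating on plain int once the macro is in effect.

```c
/* Same definition as the one quoted from S.h in this thread. */
#define longint int

/* Hypothetical legacy-style routine in which every argument is a pointer.
 * After macro expansion all three parameters are pointers to int. */
void legacy_sum(const longint *x, const longint *n, longint *result)
{
    longint total = 0;
    for (longint i = 0; i < *n; i++)
        total += x[i];
    *result = total;
}
```

With the macro in place the caller and the callee agree on the element type, so the pointer mismatch from the original question does not arise. Without it, on a platform where long int is wider than int, the same source would walk the array with the wrong stride.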
On 15 August 2018 at 20:32, Benjamin Tyner wrote:
| Thanks for the replies and for confirming my suspicion.
|
| Interestingly, src/include/S.h uses a trick:
|
|     #define longint int
|
| and so does the nlme package (within src/init.c).

As Bill Dunlap already told you, this is a) ancient and b) was concerned with
the int as 16 bit to 32 bit transition period. I.e. a long time ago. Old C
programmers remember.

You should preferably not even use 'long int' on the other side but rely on
the fact that all compilers nowadays allow you to specify exactly what size is
used via int64_t (long), int32_t (int), ... and the unsigned cousins (which R
does not have). So please receive the value as an int64_t and then cast it to
an int32_t -- which corresponds to R's notion of an integer on every platform.

And please note that that conversion is lossy. If you must keep 64 bits then
the bit64 package by Jens Oehlschlaegel is good and e.g. fully supported inside
data.table. We use it for 64-bit integers as nanosecond timestamps in our
nanotime package (which has some converters).

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
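
The advice to receive the value as an int64_t and then narrow it could be sketched as follows. The helper name narrow_to_int32 is hypothetical. It refuses out-of-range values rather than truncating them silently, and it also rejects INT32_MIN on the assumption that the result is headed for an R integer vector, where the most negative int is reserved for NA_integer_.

```c
#include <stdint.h>

/* Hypothetical helper: narrows a 64-bit value to 32 bits.
 * Returns 1 and stores the value in *out if it fits, 0 otherwise.
 * INT32_MIN is treated as out of range because R uses that bit pattern
 * for NA_integer_ (an assumption about where the value is going). */
int narrow_to_int32(int64_t value, int32_t *out)
{
    if (value <= INT32_MIN || value > INT32_MAX)
        return 0;
    *out = (int32_t) value;
    return 1;
}
```

A caller would check the return value and decide what a failure means, for example storing a missing value or keeping the full 64 bits by another route.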
On 08/16/2018 05:12 AM, Dirk Eddelbuettel wrote:
>
> On 15 August 2018 at 20:32, Benjamin Tyner wrote:
> | Thanks for the replies and for confirming my suspicion.
> |
> | Interestingly, src/include/S.h uses a trick:
> |
> |     #define longint int
> |
> | and so does the nlme package (within src/init.c).
>
> As Bill Dunlap already told you, this is a) ancient and b) was concerned with
> the int as 16 bit to 32 bit transition period. I.e. a long time ago. Old C
> programmers remember.
>
> You should preferably not even use 'long int' on the other side but rely on
> the fact that all compilers nowadays allow you to specify exactly what size is
> used via int64_t (long), int32_t (int), ... and the unsigned cousins (which R
> does not have). So please receive the value as an int64_t and then cast it to
> an int32_t -- which corresponds to R's notion of an integer on every platform.

Only on Intel platforms int is 32 bits. Strictly speaking int is only
required to be >= 16 bits. Who knows what the size of an int is on
the Sunway TaihuLight for example ;-)

H.

>
> And please note that that conversion is lossy. If you must keep 64 bits then
> the bit64 package by Jens Oehlschlaegel is good and e.g. fully supported inside
> data.table. We use it for 64-bit integers as nanosecond timestamps in our
> nanotime package (which has some converters).
>
> Dirk
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319
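
One way to make an assumption about the width of int explicit, rather than rely on it silently, is a compile-time check. This is a sketch that assumes a C11 compiler (for _Static_assert); int_width_bits is a hypothetical helper and not part of any R header.

```c
#include <limits.h>

/* Refuse to compile on a platform where int does not have the 32-bit range
 * this code assumes. Requires C11. */
_Static_assert(INT_MAX == 2147483647, "this code assumes a 32-bit int");

/* Hypothetical helper: storage size of int in bits on the current platform. */
int int_width_bits(void)
{
    return (int) (sizeof(int) * CHAR_BIT);
}
```

On a platform with a 16-bit int the build stops with the message above instead of producing code that misreads its arguments.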