bill at insightful.com
2008-Aug-07 09:24 UTC
[Rd] memory leak in sub("[range]", ...) when #ifndef _LIBC (PR#11946)
Full_Name: Bill Dunlap
Version: R version 2.8.0 Under development (unstable) (2008-07-05 r46037)
OS: Linux
Submission from: (NULL) (76.28.245.14)


valgrind finds some memory leaks in R when I use sub() with
a range in the regular expression:

% R --debugger=valgrind --debugger-args=--leak-check=full --quiet --vanilla
==28643== Memcheck, a memory error detector.
==28643== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==28643== Using LibVEX rev 1658, a library for dynamic binary translation.
==28643== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==28643== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
==28643== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==28643== For more details, rerun with: -v
==28643==
> for(i in 1:1000)sub("[0-9]","*","17")
> q()
==28643==
==28643== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 38 from 2)
==28643== malloc/free: in use at exit: 12,607,663 bytes in 7,918 blocks.
==28643== malloc/free: 61,907 allocs, 53,989 frees, 54,692,481 bytes allocated.
==28643== For counts of detected errors, rerun with: -v
==28643== searching for pointers to 7,918 not-freed blocks.
==28643== checked 12,620,188 bytes.
==28643==
==28643== 82 bytes in 4 blocks are definitely lost in loss record 15 of 44
==28643==    at 0x40046EE: malloc (vg_replace_malloc.c:149)
==28643==    by 0x3200FF9: xmalloc (in /a/devlnx3206.insightful.com/opt/builds/R/devel/LX/46036/dist/lib/R/lib/libreadline.so.4)
==28643==    by 0x31EC0D5: readline_internal_teardown (in /a/devlnx3206.insightful.com/opt/builds/R/devel/LX/46036/dist/lib/R/lib/libreadline.so.4)
==28643==    by 0x31FD992: rl_callback_read_char (in /a/devlnx3206.insightful.com/opt/builds/R/devel/LX/46036/dist/lib/R/lib/libreadline.so.4)
==28643==    by 0x80E7634: Rstd_ReadConsole (sys-std.c:905)
==28643==    by 0x8057EA9: Rf_ReplIteration (main.c:205)
==28643==    by 0x80581C6: R_ReplConsole (main.c:306)
==28643==    by 0x805845C: run_Rmainloop (main.c:966)
==28643==    by 0x80566B5: main (Rmain.c:33)
==28643==
==28643==
==28643== 7,996 bytes in 1,999 blocks are definitely lost in loss record 35 of 44
==28643==    at 0x40046EE: malloc (vg_replace_malloc.c:149)
==28643==    by 0x4005B9A: realloc (vg_replace_malloc.c:306)
==28643==    by 0x80A5E22: parse_expression (regex.c:5202)
==28643==    by 0x80A5FDF: parse_branch (regex.c:4707)
==28643==    by 0x80A60AA: parse_reg_exp (regex.c:4666)
==28643==    by 0x80A64A8: Rf_regcomp (regex.c:4635)
==28643==    by 0x8110AE0: do_gsub (character.c:1355)
==28643==    by 0x80653BC: do_internal (names.c:1129)
==28643==    by 0x815EF17: Rf_eval (eval.c:461)
==28643==    by 0x8160BD3: do_begin (eval.c:1174)
==28643==    by 0x815EF17: Rf_eval (eval.c:461)
==28643==    by 0x816203C: Rf_applyClosure (eval.c:667)
==28643==
==28643== LEAK SUMMARY:
==28643==    definitely lost: 8,078 bytes in 2,003 blocks.
==28643==      possibly lost: 0 bytes in 0 blocks.
==28643==    still reachable: 12,599,585 bytes in 5,915 blocks.
==28643==         suppressed: 0 bytes in 0 blocks.
==28643== Reachable blocks (those to which a pointer was found) are not shown.
==28643== To see them, rerun with: --show-reachable=yes

The flagged memory block is the range_ends component of mbcset.
I think that range_starts was also being leaked, but valgrind was
combining the two.

It looks like the cpp macro _LIBC is not defined when I compile
R in this Linux box. regex.c defines range_ends and range_starts
as different types, depending on the value of _LIBC, and it allocates
space for them in either case. However, free_charset() was only
freeing these things if _LIBC was defined. The following change
to regex.c:free_charset() seems to take care of the problem.

% svn diff regex.c
Index: regex.c
==================================================================
--- regex.c (revision 46038)
+++ regex.c (working copy)
@@ -6240,9 +6240,9 @@
 # ifdef _LIBC
   re_free (cset->coll_syms);
   re_free (cset->equiv_classes);
+# endif
   re_free (cset->range_starts);
   re_free (cset->range_ends);
-# endif
   re_free (cset->char_classes);
   re_free (cset);
 }

> version
               _
platform       i686-pc-linux-gnu
arch           i686
os             linux-gnu
system         i686, linux-gnu
status         Under development (unstable)
major          2
minor          8.0
year           2008
month          07
day            05
svn rev        46037
language       R
version.string R version 2.8.0 Under development (unstable) (2008-07-05 r46037)
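The pattern described in the report can be reproduced in isolation. What follows is only a minimal sketch, not the code in regex.c: the struct charset_sketch, the allocation counter live_blocks, and every function name below are hypothetical and exist purely for illustration. It assumes, as the report states, that the range arrays are allocated unconditionally while the buggy free routine releases them only when _LIBC is defined.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the charset structure in regex.c.
   Field names follow the report; types and sizes are illustrative only. */
typedef struct {
    int *range_starts;   /* allocated whether or not _LIBC is defined */
    int *range_ends;     /* likewise */
    int *char_classes;
} charset_sketch;

/* Illustrative bookkeeping: number of blocks allocated but not yet freed. */
static int live_blocks = 0;

static void *tracked_alloc(size_t n) {
    void *p = malloc(n);
    if (p != NULL) live_blocks++;
    return p;
}

static void tracked_free(void *p) {
    if (p != NULL) { live_blocks--; free(p); }
}

static charset_sketch *alloc_charset_sketch(void) {
    charset_sketch *cset = tracked_alloc(sizeof *cset);
    if (cset == NULL) return NULL;
    cset->range_starts = tracked_alloc(4 * sizeof(int));
    cset->range_ends   = tracked_alloc(4 * sizeof(int));
    cset->char_classes = tracked_alloc(4 * sizeof(int));
    return cset;
}

/* Mirrors the reported bug: the range arrays are freed only under _LIBC. */
static void free_charset_buggy(charset_sketch *cset) {
    assert(cset != NULL);
#ifdef _LIBC
    tracked_free(cset->range_starts);
    tracked_free(cset->range_ends);
#endif
    tracked_free(cset->char_classes);
    tracked_free(cset);
}

/* Mirrors the proposed patch: the range arrays are freed unconditionally. */
static void free_charset_fixed(charset_sketch *cset) {
    assert(cset != NULL);
    tracked_free(cset->range_starts);
    tracked_free(cset->range_ends);
    tracked_free(cset->char_classes);
    tracked_free(cset);
}

/* Allocates one charset, releases it with free_fn, and returns how many
   blocks were left behind. With the buggy routine the leak is real. */
static int blocks_leaked_by(void (*free_fn)(charset_sketch *)) {
    int before = live_blocks;
    charset_sketch *cset = alloc_charset_sketch();
    if (cset == NULL) return -1;
    free_fn(cset);
    return live_blocks - before;
}
```

In a build where _LIBC is not defined and every allocation succeeds, blocks_leaked_by(free_charset_buggy) reports two stranded blocks per charset (range_starts and range_ends), while blocks_leaked_by(free_charset_fixed) reports none. That is in line with the roughly two lost blocks per sub() call that valgrind shows above (1,999 blocks for 1,000 iterations).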
Prof Brian Ripley
2008-Aug-07 09:47 UTC
[Rd] memory leak in sub("[range]", ...) when #ifndef _LIBC (PR#11946)
For the record: this is now fixed.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595