Pierre-Olivier Gaillard
2010-Mar-11 15:03 UTC
[dtrace-discuss] Replacing libumem's debug with dtrace?
Hi all,

I have a program that crashes. From the stack I suspect it's trying to read freed memory, probably because it's trying to clean up some structures twice. Unfortunately I can't run the program under Purify, because it's a 32-bit program and it runs out of its 4GB address space under Purify. So I tried libumem, which I think should be able to find the error, but libumem is running out of memory too.

I am now trying to reduce the logging some more in the hope that the program will run:

    LD_PRELOAD="libumem.so.1"
    UMEM_LOGGING='transaction'
    UMEM_DEBUG='audit,guards,verbose'

My other idea is to emulate the feature I want from these tools with dtrace and a huge log file:

- trace all malloc and free calls, including stack, address and size
- wait for the crash
- search the log for areas containing the pointers involved in the crash

This seems pretty daunting, and I wonder whether I could reduce libumem's memory usage instead. I don't mind running the program a couple of times. Are there ways to use dtrace that would allow me to disable the libumem options that use the most memory?

Thanks for your help,
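The log-search step in the list above could be prototyped offline once a trace exists. What follows is only a minimal sketch in C, assuming the trace output has already been parsed into an in-memory array of events in time order. The names mem_event and find_block are hypothetical and are not part of dtrace, libumem or any other tool mentioned in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* One record per traced malloc() or free() call, in time order. */
enum ev_kind { EV_MALLOC, EV_FREE };

struct mem_event {
    enum ev_kind kind;
    uintptr_t    addr;  /* pointer returned by malloc or passed to free */
    size_t       size;  /* allocation size; unused for EV_FREE */
};

/*
 * Find the most recent allocation whose address range covers `target`.
 * Returns the index of that EV_MALLOC record, or -1 if no allocation
 * in the log covers the address.  *free_idx receives the index of the
 * first EV_FREE that released that allocation, or -1 if the block was
 * still live at the end of the log.
 */
static long
find_block(const struct mem_event *log, size_t n, uintptr_t target,
    long *free_idx)
{
    long alloc_idx = -1;

    *free_idx = -1;
    for (size_t i = 0; i < n; i++) {
        const struct mem_event *e = &log[i];

        if (e->kind == EV_MALLOC &&
            target >= e->addr && target - e->addr < e->size) {
            /* A newer allocation covers the address; start over. */
            alloc_idx = (long)i;
            *free_idx = -1;
        } else if (e->kind == EV_FREE && alloc_idx >= 0 &&
            *free_idx < 0 && e->addr == log[alloc_idx].addr) {
            /* First free of the covering block. */
            *free_idx = (long)i;
        }
    }
    return alloc_idx;
}
```

Calling find_block() with the faulting address taken from the core dump would give the index of the covering allocation and, if that block had already been released, the index of the free that released it. The stacks logged alongside those two records are the ones of interest.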
Michael Schuster
2010-Mar-11 15:18 UTC
[dtrace-discuss] Replacing libumem's debug with dtrace?
On 11.03.10 16:03, Pierre-Olivier Gaillard wrote:
> My other idea is to try to emulate the feature I want from these tools with
> dtrace and a huge log file:
> - trace all malloc and free, including stack, address and size

I'd add a trace that is triggered when the program dumps core (elfcore, e.g., IIRC) and print ustack() there.

> - wait for the crash

Can you examine the core dump?

Michael
-- 
Michael Schuster http://blogs.sun.com/recursion
Recursion, n.: see 'Recursion'
Pierre-Olivier Gaillard
2010-Mar-11 15:48 UTC
[dtrace-discuss] Replacing libumem's debug with dtrace?
Hi Michael,

Yes, I can look at the core dump and get a stack. That's why I wanted to use Purify: this really looks like a situation where Purify would report an FMR (Free Memory Read). The FMR would show me:

- the current stack, where the application read from free memory
- the stack that freed the block
- the stack that first allocated it

I thought libumem could replace Purify:

- cause the app to crash on first access to free memory (thanks to the guards?)
- log the transaction that freed the block and its stack
- log the transaction that allocated the block

So I was hoping to get into mdb after the crash and retrieve the above information, but libumem runs out of memory before the problem occurs. I suppose that libumem's debug features have a big memory overhead, but I could not figure out what to disable to avoid running out of memory before I can reproduce my problem.

This is why I am considering dtrace, as its overhead is easy to control. I think I'll start by adapting Chris Gerhard's blog entry:
http://blogs.sun.com/chrisg/entry/usind_dtrace_to_find_double

My crash stack is in a function that deallocates objects, so I suppose that even though we don't reach a double free, we are probably calling the function twice.

Thanks,
James Carlson
2010-Mar-11 16:02 UTC
[dtrace-discuss] Replacing libumem's debug with dtrace?
Pierre-Olivier Gaillard wrote:
> So I was hoping to get into mdb after the crash and retrieve the above
> information but libumem runs out of memory before the problem occurs. I
> suppose that libumem's debug features have a big memory overhead but I
> could not figure out what to disable to avoid running out of memory
> before I can reproduce my problem.

One option would be to recompile as a 64-bit application, so you have fewer limits to worry about.

If you suspect a double free (or at least a use-after-free), and you can't use libumem, one possible technique is to use the object itself to gather debug information. Change your object free function so that, before calling free(), it writes some recognizable pattern to the object (setting some oft-used pointer inside the structure to NULL is usually a good bet), and then use the backtrace() function to store a backtrace in any remaining space inside the object. Something like this (assuming a large-enough object):

    void **optr = (void **)obj;

    /* the object must have room for two markers plus a few frames */
    assert(sizeof (*obj) >= 10 * sizeof (*optr));
    /* fires if either marker is already present, i.e. freed before */
    assert(optr[0] != NULL && optr[1] != (void *)0x12345678);
    optr[0] = NULL;
    optr[1] = (void *)0x12345678; /* a pattern to recognize */
    /* store the caller's stack in the rest of the object */
    backtrace(optr + 2, sizeof (*obj) / sizeof (*optr) - 2);

That way, when you dump, you'll be able to see where the first free came from.

-- 
James Carlson 42.703N 71.076W <carlsonj at workingcode.com>
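For readers who want to try the stamping idea outside the original program, here is a self-contained sketch of it in C. The struct widget type and the widget_* function names are hypothetical stand-ins for the real object and its free routine, and the code assumes a platform that provides backtrace() in <execinfo.h>. One caveat that is an observation rather than part of the post: some allocators reuse the first words of a released block for their own bookkeeping, so the markers are only dependable in a core dump taken before the memory is recycled.

```c
#include <assert.h>
#include <execinfo.h>   /* backtrace(); assumed available on this platform */
#include <stdlib.h>

#define FREED_PATTERN ((void *)0x12345678)  /* marker value from the post */

/* Hypothetical object type; only its size and first two slots matter. */
struct widget {
    void *owner;        /* slot 0: a pointer that is normally non-NULL */
    void *aux;          /* slot 1: overwritten with FREED_PATTERN */
    void *payload[14];  /* remaining space, reused for the backtrace */
};

/* Returns nonzero if widget_stamp() has already marked this object. */
static int
widget_is_stamped(const struct widget *w)
{
    void *const *optr = (void *const *)w;

    return optr[0] == NULL && optr[1] == FREED_PATTERN;
}

/*
 * Mark the object as freed and store the current call stack in the
 * space after the two markers.  Returns the number of frames stored.
 */
static int
widget_stamp(struct widget *w)
{
    void **optr = (void **)w;
    int slots = (int)(sizeof (*w) / sizeof (*optr)) - 2;

    optr[0] = NULL;
    optr[1] = FREED_PATTERN;
    return backtrace(optr + 2, slots);
}

/* Free routine: complain on a second free, otherwise stamp and release. */
static void
widget_free(struct widget *w)
{
    /* Deliberately inspects the object; on a double free this reads
     * memory that was already released, which is the bug being hunted. */
    assert(!widget_is_stamped(w));
    (void) widget_stamp(w);
    free(w);
}
```

In a debugger, the words following the pattern in a stamped object are return addresses from the first free, which can be resolved to symbols to see who released the object.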
Semih Cemiloglu
2010-Mar-11 22:59 UTC
[dtrace-discuss] Replacing libumem's debug with dtrace?
Please try the undocumented libumem settings:

    export UMEM_DEBUG=audit,contents,guards,verbose,firewall=1
    export UMEM_LOGGING=transaction,fail
    export UMEM_OPTIONS=backend=mmap

For details see:
http://blogs.sun.com/peteh/entry/hidden_features_of_libumem_firewalls

Kind regards
Semih Cemiloglu
Pierre-Olivier Gaillard
2010-Mar-12 16:05 UTC
[dtrace-discuss] Replacing libumem's debug with dtrace?
Hi Semih,

Thanks a lot. Is one of the options going to reduce the memory overhead? I can't reproduce the issue with libumem's debug features enabled, as I run out of memory long before the point in the program where the problem happens.

Anyway, thanks a lot for this information.
Semih Cemiloglu
2010-Mar-14 22:58 UTC
[dtrace-discuss] Replacing libumem's debug with dtrace?
Hi Pierre,

These options will not reduce libumem's memory requirements, since they still require guard pads for the allocated memory. I advise you to trim your test cases so that you can identify the defect with reduced memory requirements.

Your other options are the watchmalloc and mtmalloc libraries and the bcheck (bounds checking) script. The following are notes from my engineering notebook, demonstrating their typical use. In practice, however, I have found libumem to be the easiest and most helpful tool for identifying defects.

Kind regards
Semih Cemiloglu

* Bounds checking via bcheck
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    $ bcheck <flags> <app> <parameters>

Possible options for bcheck:

    -leaks   Check for memory leaks (default)
    -access  Also perform access checking on the program
    -memuse  Also perform memory use checking on the program
    -all     Combine all

See: man rtc_api

dbx usage:

    $ dbx -C bad_mem
    ...
    (dbx) check -all
    ...
    (dbx) cont
    Read from unallocated (rua): ...
    (dbx) dbxenv rtc_auto_suppress on
    (dbx) cont
    ...
    (dbx) dbxenv rtc_auto_continue on
    (dbx) dbxenv rtc_error_log_file_name /tmp/rtc.log
    (dbx) cont
    ...
    (dbx) quit

* watchmalloc
~~~~~~~~~~~~~

Solaris provides an alternative memory library, watchmalloc.so, which helps in detecting use of previously freed memory. It also ensures that previously freed memory is not returned to the application for reuse.

    export LD_PRELOAD=watchmalloc.so.1
    export MALLOC_DEBUG=WATCH
    export MALLOC_DEBUG=WATCH,RW,STOP

The MALLOC_DEBUG environment variable determines whether watchmalloc checks for writes outside allocated memory, or reads outside allocated memory. Possible values:

    WATCH  Checks for writes past the end of allocated memory, or
           writes to previously deallocated memory. This will cause a
           program execution slowdown on the order of ten to 100 times.
    RW     Checks for reads past the end of allocated memory, or reads
           of previously deallocated memory. This will cause a program
           execution slowdown on the order of 1000 times.

See: man watchmalloc

* mtmalloc
~~~~~~~~~~

The mtmalloc library has options, accessible through the programmatic mallocctl interface, that control whether allocated and freed memory is overwritten with a pattern, so that accesses to uninitialized data, or to data after it has been freed, are more apparent.

See: man mtmalloc
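To make the pattern-overwriting idea from the mtmalloc notes concrete, here is a small portable sketch in C of what such a debug mode does in principle. It is not mtmalloc's implementation and does not use the mallocctl interface; the dbg_* names and the two fill bytes are hypothetical, chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define INIT_BYTE  0xBA  /* hypothetical fill for freshly allocated memory */
#define FREED_BYTE 0xDE  /* hypothetical fill for released memory */

/* Header kept in front of each block so the size is known at free time. */
union dbg_hdr {
    size_t      size;
    max_align_t align;  /* keeps the user pointer suitably aligned */
};

/* Allocate `size` bytes and fill them with INIT_BYTE. */
static void *
dbg_malloc(size_t size)
{
    union dbg_hdr *h;

    if (size > SIZE_MAX - sizeof (*h))
        return NULL;            /* header + size would overflow */
    h = malloc(sizeof (*h) + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    memset(h + 1, INIT_BYTE, size);
    return h + 1;
}

/* Overwrite the user-visible part of a block with FREED_BYTE. */
static void
dbg_poison(void *p)
{
    union dbg_hdr *h = (union dbg_hdr *)p - 1;

    memset(p, FREED_BYTE, h->size);
}

/* Poison the block, then hand it back to the underlying allocator. */
static void
dbg_free(void *p)
{
    if (p == NULL)
        return;
    dbg_poison(p);
    free((union dbg_hdr *)p - 1);
}
```

With wrappers like these, a pointer read from a released object shows up as a run of the freed fill byte rather than a plausible-looking stale value, which tends to make a use-after-free fault sooner and stand out in a core dump. The cost is one header per allocation plus the time spent filling.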