
J.J. Shore

2006-Oct-30 16:59 UTC

[dtrace-discuss] Why does gethostbyname_r appear to leak?

I am running a very simple multithreaded program (TestThread.C) which calls
gethostbyname_r in several threads. My analysis of this program with both truss
and DTrace suggests that it has a small leak. However, if I alter the program to
have many more threads and run for much longer, it never runs out of memory and
its memory use does not keep growing. Can anybody explain what I am missing in
my analysis, or whether there are bugs in both DTrace and truss that I am
unaware of?

How to repeat my truss test.

1. Take the attached tar file and place in a directory and unpack.
prompt> tar xvf leak.tar
2. Run ctest.ksh
prompt> ctest.ksh
3. Look at truss.final

The ctest.ksh script compiles the program and then runs it through truss to find
all the places where memory is allocated and released. This output is then
filtered down to just libc calls and the results are stored in truss.libcalls.
truss.libcalls is then filtered to give a list of lightweight processes
(excluding thread 1, which is the main program). For each LWP the script greps
out all the returns from malloc and prints the return value. Finally, for each of
these addresses, a search is made for the last occurrence of that address in the
original truss output. That line is written to the file truss.final if it is not
a call to free. Invariably truss.final contains a list showing that the last
operation on several addresses is a malloc or calloc rather than a free as
expected, which suggests that the program is leaking.
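
For readers without the attachment, the "last operation per address" check described above can be sketched roughly as follows. This is not the ctest.ksh script itself. It assumes the truss output has already been parsed into simple (thread, operation, address) records; that record format and the function name last_op_not_free are hypothetical and used only for illustration.

```python
# Hypothetical, simplified event records: (thread_id, operation, address).
# Real truss output is text and would have to be parsed into this form first.

def last_op_not_free(events, main_thread=1):
    """Return {address: (thread, op)} for every address handed out by
    malloc/calloc in a non-main thread whose last recorded operation
    is not a free."""
    allocated = set()   # addresses returned to non-main threads
    last_op = {}        # address -> (thread, op) of the most recent event
    for tid, op, addr in events:
        if op in ("malloc", "calloc") and tid != main_thread:
            allocated.add(addr)
        last_op[addr] = (tid, op)
    return {a: last_op[a] for a in sorted(allocated)
            if last_op[a][1] != "free"}


events = [
    (2, "malloc", 0x1000),
    (3, "malloc", 0x2000),
    (2, "free",   0x1000),
    (3, "calloc", 0x1000),  # same address handed out again after the free
    (2, "free",   0x2000),  # freed by a different thread than the allocator
]

suspects = last_op_not_free(events)
for addr, (tid, op) in suspects.items():
    print(f"{addr:#x}: last op {op} in thread {tid}")
```

Note that a check of this kind only says that the trace ended with an allocation on that address. A buffer that is still legitimately in use when the trace stops is reported in exactly the same way as a real leak.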

How to repeat my dtrace test.

1. Run leak in the background.
prompt> leak &
509
prompt>

2. Within 10 seconds, start watch_malloc_sizes.d, passing in the PID of leak and
a time interval, thus:
prompt> ./watch_malloc_sizes.d 509 5s
where 509 is the PID taken from above.

The DTrace script tries to match each allocation with a call to free, even one
made in another thread. Every <interval> seconds it displays an interim report,
and when the program finishes it provides an overall summary. On my machine
it suggests, in the same way as truss does, that the program has a leak.
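
The matching idea can be illustrated with a small sketch. This is not the watch_malloc_sizes.d script, which is only available from the attachment. It is a minimal model of the bookkeeping, assuming allocation and free events arrive as plain function calls. The class name AllocationTracker and its methods are hypothetical.

```python
class AllocationTracker:
    """Track outstanding allocations by address, regardless of which
    thread eventually frees them."""

    def __init__(self):
        self.outstanding = {}     # address -> (allocating thread, size)
        self.unmatched_frees = 0  # frees for addresses never seen allocated

    def on_alloc(self, tid, addr, size):
        self.outstanding[addr] = (tid, size)

    def on_free(self, tid, addr):
        if addr == 0:
            return  # free(NULL) is a no-op
        # The freeing thread does not have to be the allocating thread.
        if self.outstanding.pop(addr, None) is None:
            self.unmatched_frees += 1

    def report(self):
        """Bytes still outstanding, grouped by allocating thread."""
        per_thread = {}
        for tid, size in self.outstanding.values():
            per_thread[tid] = per_thread.get(tid, 0) + size
        return per_thread


tracker = AllocationTracker()
tracker.on_alloc(2, 0x1000, 32)
tracker.on_alloc(3, 0x2000, 64)
tracker.on_free(3, 0x1000)       # thread 3 frees thread 2's buffer
interim = tracker.report()       # interim report
tracker.on_alloc(2, 0x3000, 16)
tracker.on_free(3, 0x2000)
tracker.on_free(2, 0x9000)       # free of an address never seen allocated
final = tracker.report()         # overall summary
print(interim, final, tracker.unmatched_frees)
```

In a model like this, any free event that the tracer fails to record leaves its buffer in the outstanding table, so it shows up as a leak in the report even though the program released it.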

I am not convinced that the case is so straightforward, partly because when I
extend the program into an infinite loop with many more threads it does not run
out of memory, and partly because the list of threads that leak seems to vary,
which suggests that some events are somehow being missed by truss and DTrace. If
the program is made single-threaded then there are no leaks.

I would like to know what is really going on, so if you have read this far,
thanks for your time.
 
 
This message posted from opensolaris.org

Sean McGrath - Sun Microsystems Ireland

2006-Oct-30 19:06 UTC


[dtrace-discuss] Why does gethostbyname_r appear to leak?

J.J. Shore stated:
< I am running a very simple multithreaded program (TestThread.C) which calls
< gethostbyname_r in several threads. My analysis of this program with both truss
< and DTrace suggests that it has a small leak. However, if I alter the program to
< have many more threads and run for much longer, it never runs out of memory and
< its memory use does not keep growing. Can anybody explain what I am missing in
< my analysis, or whether there are bugs in both DTrace and truss that I am
< unaware of?
< 
< How to repeat my truss test.
< 
< 1. Take the attached tar file and place in a directory and unpack.
< prompt> tar xvf leak.tar
< 2. Run ctest.ksh
< prompt> ctest.ksh
< 3. Look at truss.final

   It seems the mailing list software stripped the attachment.

   What version of Solaris Express or Update is this? I know there was
   a bug fix that went into a recent Solaris Express build that fixed a
   leak in gethostbyname_r.

Sean.

J.J. Shore

2006-Oct-31 08:40 UTC


[dtrace-discuss] Re: Why does gethostbyname_r appear to leak?

I see that attachments are not copied over when a post is cross-threaded, and
that this tool will not let me attach files to a CC thread. The attachment can
be viewed at opensolaris.org/jive/thread.jspa?threadID=16436&tstart=0

I am using the following Solaris version:-
hawea> uname -a
SunOS hawea 5.10 Generic_118822-25 sun4u sparc SUNW,Sun-Fire-V440
 
 

J.J. Shore

2006-Oct-31 11:11 UTC


[dtrace-discuss] Re: Why does gethostbyname_r appear to leak?

Following up on a suggestion I received by e-mail, I have tried using libumem,
as follows, to see if there is a leak. The result suggests there is no leak.

i. In one terminal:-
export UMEM_DEBUG=default;
export UMEM_LOGGING=transaction;
export LD_PRELOAD=libumem.so.1;
leak

ii. In a separate terminal:-
gcore $(pgrep leak)
mdb core.xxx
::findleaks
CACHE     LEAKED   BUFCTL CALLER
----------------------------------------------------------------------
   Total       0 buffers, 0 bytes

I modified the program to have an infinite loop before doing this.
 
 
