https://bugzilla.samba.org/show_bug.cgi?id=2766
------- Additional Comments From paul@debian.org 2005-06-02 08:47 -------
It seems to be looping in userspace (no system calls, hence strace doesn't show
anything). Could you try ltrace instead of strace, that should show what library
functions (if any) are being called which may help.
--
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug, or are watching the QA contact.
https://bugzilla.samba.org/show_bug.cgi?id=2766
------- Additional Comments From cs@emlix.com 2005-06-02 09:27 -------
Here is the ltrace and strace output again (up to the segfault).
I hope this helps a bit further.
backup:~ # ltrace -p 2227 -S
--- SIGSTOP (Stopped (signal)) ---
--- SIGSTOP (Stopped (signal)) ---
SYS__newselect(1, 0xbfffc6d0, 0xbfffc650, 0, 0xbfffc648 <unfinished ...>
backup:~ # ltrace -S -p 22276
Cannot attach to pid 22276: No such process
backup:~ # ltrace -S -p 2226
--- SIGSTOP (Stopped (signal)) ---
--- SIGSTOP (Stopped (signal)) ---
--- SIGCHLD (Child exited) ---
waitpid(-1, 0xbfff763c, 1 <unfinished ...>
SYS_waitpid(-1, 0xbfff763c, 1, 1, 0x0809e728) = 0
<... waitpid resumed> ) = 0
SYS_sigreturn(0xb7fc8ff4, 0xbfff763c, 1, 1, 0x0809e728) = 5
backup:~ # ltrace -S -p 2227
--- SIGSTOP (Stopped (signal)) ---
--- SIGSTOP (Stopped (signal)) ---
SYS__newselect(1, 0xbfffc6d0, 0xbfffc650, 0, 0xbfffc648 <unfinished ...>
backup:~ # strace -p 2227
Process 2227 attached - interrupt to quit
--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
select(1, [0], [], NULL, {21, 780000} <unfinished ...>
Process 2227 detached
backup:~ # strace -p 2226
Process 2226 attached - interrupt to quit
--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
--- SIGCHLD (Child exited) @ 0 (0) ---
waitpid(-1, 0xbfff763c, WNOHANG) = 0
sigreturn() = ? (mask now [])
Process 2226 detached
backup:~ # strace -p 2226
Process 2226 attached - interrupt to quit
Process 2226 detached
backup:~ # strace -p 2226
Process 2226 attached - interrupt to quit
Process 2226 detached
backup:~ # strace -p 2227
Process 2227 attached - interrupt to quit
select(1, [0], [], NULL, {0, 188000}) = 0 (Timeout)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
Process 2227 detached
backup:~ # strace -p 2227
attach: ptrace(PTRACE_ATTACH, ...): No such process
backup:~ # strace -p 2226
attach: ptrace(PTRACE_ATTACH, ...): No such process
https://bugzilla.samba.org/show_bug.cgi?id=2766
------- Additional Comments From cs@emlix.com 2005-06-08 06:23 -------
Hello again,
here's another 'wild' rsync. Same symptoms as before. What puzzles me is that
after running ltrace, strace does show some output, until both processes
segfault. (The final strace commands were run in parallel.)
backup:~ # ps axf
PID TTY STAT TIME COMMAND
7416 ? S 0:00 | | \_ /bin/bash /root/bin/backup.external /backup/admin/log/backup_2005-06-07-external.log
9065 ? R 581:58 | | \_ /root/bin/rsync -v --archive --delete /backup/pserv/ daily.0/pserv
9066 ? S 0:00 | | \_ /root/bin/rsync -v --archive --delete /backup/pserv/ daily.0/pserv
backup:~ # netstat -anp|grep rsync
unix 3 [ ] STREAM CONNECTED 2344808 9066/rsync
unix 3 [ ] STREAM CONNECTED 2344807 9065/rsync
unix 3 [ ] STREAM CONNECTED 2344806 9065/rsync
unix 3 [ ] STREAM CONNECTED 2344805 9066/rsync
backup:~ # strace -p 9065
Process 9065 attached - interrupt to quit
Process 9065 detached
backup:~ # strace -p 9066
Process 9066 attached - interrupt to quit
select(1, [0], [], NULL, {44, 568000} <unfinished ...>
Process 9066 detached
backup:~ # ltrace -p 9065 -S
backup:~ # ltrace -p 9066 -S
SYS__newselect(1, 0xbfffc700, 0xbfffc680, 0, 0xbfffc678) = 0
__errno_location() = 0xb7eb6600
select(1, 0xbfffc700, 0xbfffc680, 0, 0xbfffc678 <unfinished ...>
SYS__newselect(1, 0xbfffc700, 0xbfffc680, 0, 0xbfffc678 <unfinished ...>
backup:~ # strace -p 9065
Process 9065 attached - interrupt to quit
--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
--- SIGCHLD (Child exited) @ 0 (0) ---
waitpid(-1, 0xbffee24c, WNOHANG) = 0
sigreturn() = ? (mask now [])
--- SIGCHLD (Child exited) @ 0 (0) ---
waitpid(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV}], WNOHANG) = 9066
waitpid(-1, 0xbffee25c, WNOHANG) = -1 ECHILD (No child processes)
sigreturn() = ? (mask now [])
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
Process 9065 detached
backup:~ # strace -p 9066
Process 9066 attached - interrupt to quit
--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
select(1, [0], [], NULL, {38, 186000}) = 0 (Timeout)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
Process 9066 detached
https://bugzilla.samba.org/show_bug.cgi?id=2766
wayned@samba.org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
------- Additional Comments From wayned@samba.org 2005-06-08 09:07 -------
The best thing you can do is to run a version with debug symbols intact: rsync
builds with debug symbols by default, but strips them from the installed
version, so either run a non-stripped binary from an rsync build dir, or make
sure that the stripped version that is running is based on the non-stripped
version (so that it differs only in the stripped symbols). Then, when the
processes hang, attach gdb to each process using a non-stripped rsync binary as
the first arg and the process number as the second arg. You'll then be able to
run "bt" (backtrace) and see exactly where rsync is in the code. You can look
around, printing out any variables via the "p" command, etc.
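
As a rough sketch of the gdb step described above: the helper name rsync_bt_cmd and the ./rsync build-dir path are assumptions made for illustration, and the PIDs are the ones from the earlier ps listing. The helper only prints the gdb command line (binary as first arg, PID as second) so it can be reviewed before being run as root against the hung processes.

```shell
# Hypothetical helper: print a gdb command that attaches to a running
# process and dumps a backtrace non-interactively.
#   $1 = path to a non-stripped rsync binary (assumed: ./rsync in the build dir)
#   $2 = PID of the hung rsync process
# Note: paths containing spaces are not quoted in the printed command.
rsync_bt_cmd() {
    binary=$1
    pid=$2
    # -batch exits gdb after the commands run; -ex bt runs "bt" once attached.
    printf 'gdb -batch -ex bt %s %s\n' "$binary" "$pid"
}

# Print one command per process of the hung pair.
for pid in 9065 9066; do
    rsync_bt_cmd ./rsync "$pid"
done
```

For an interactive session, dropping -batch and -ex bt leaves you at the gdb prompt, where "bt" and "p" can be typed by hand as described above.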