2017-12-06 15:52 GMT+01:00 George Joseph <gjoseph at digium.com>:
>
>
> On Tue, Dec 5, 2017 at 9:20 AM, Olivier <oza.4h07 at gmail.com> wrote:
>
>> Hello,
>>
>> I carefully read [1] which details how backtrace files can be produced.
>>
>> Maybe this seems natural to some, but how can I go one step further and
>> check that the produced XXX-thread1.txt, XXX-brief.txt, ... files are OK?
>>
>> In other words, where can I find an example of how to use one of those
>> files and check for myself that, if a system ever fails, I won't have to
>> wait for another failure to provide the required data to support teams?
>>
>
> It's a great question but I could spend a week answering it and not
> scratch the surface. :)
>
Thanks very much for trying, anyway ;-)
> It's not a straightforward thing unless you know the code in question.
> The most common failure is a segmentation fault (segfault or SEGV).
>
True! I experienced segfaults lately, and I could not configure the
platform I used then (Debian Jessie) to produce core files in a directory
Asterisk can write into.
Now, with Debian Stretch, I can produce a core file at will (with a kill -s
SIGSEGV <processid>).
I checked that ast_coredumper worked OK, as it produced thread.txt files and
so on.
Ideally, I would like to go one step further: check now that a future .txt
file would be "workable" (and not be told "you should have compiled with
option XXX or configured with option YYY").
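
On the core-file side, one thing that can be checked ahead of time is whether the running process is allowed to write a core at all. Below is a minimal sketch, assuming a Linux system where /proc/<processid>/limits lists a "Max core file size" row with soft and hard columns. The function names are hypothetical and not part of Asterisk or its scripts.

```python
def core_soft_limit(limits_text):
    """Return the soft 'Max core file size' value found in the text of
    /proc/<pid>/limits, or None if the row is missing."""
    label = "Max core file size"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # Remaining columns are: soft limit, hard limit, units.
            fields = line[len(label):].split()
            return fields[0] if fields else None
    return None


def cores_allowed(limits_text):
    """True when the soft limit is anything other than 0 (assumption:
    a soft limit of 0 means no core file is written)."""
    soft = core_soft_limit(limits_text)
    return soft is not None and soft != "0"
```

To use it, read /proc/<processid>/limits for the Asterisk process and pass the text to cores_allowed(). Where the kernel writes the file is a separate setting (/proc/sys/kernel/core_pattern), which this sketch does not inspect.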
> In that case, the thread1.txt file is the place to start. Since most of
> the objects passed around are really pointers to objects, the most obvious
> cause would be a 0x0 for a value. So for instance "chan=0x0". That would
> be a pointer to a channel object that was not set when it probably should
> have been. Unfortunately, it's not only 0x0 that could cause a segv.
> Anytime a program tries to access memory it doesn't own, that signal is
> raised. So let's say there's a 256 byte buffer which the process owns. If
> there's a bug somewhere that causes the program to try and access bytes
> beyond the end of the buffer, you MAY get a segv if that process doesn't
> also own that memory. In this case, the backtrace won't show anything
> obvious because the pointers all look valid. There probably would be an
> index variable (i or ix, etc) that may be set to 257 but you'd have to know
> that the buffer was only 256 bytes to realize that that was the issue.
>
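
The "look for a 0x0 value" step can be sketched as a small scan over thread1.txt. This is only a rough illustration, assuming frames start with "#<number>" and arguments are printed as name=value the way the excerpts in this thread show. The function name is hypothetical, and a hit is only a hint: some arguments are legitimately NULL.

```python
import re

# A frame line starts with "#<number>" followed by whitespace.
FRAME_RE = re.compile(r"^#(\d+)\s")
# An argument whose value is exactly 0x0 (0x000055cd... does not match).
NULL_ARG_RE = re.compile(r"\b(\w+)=0x0\b")


def find_null_args(backtrace_text):
    """Return (frame_number, argument_name) pairs for every argument
    printed as 0x0. frame_number is None before the first frame line."""
    hits = []
    frame = None
    for line in backtrace_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frame = int(match.group(1))
        for name in NULL_ARG_RE.findall(line):
            hits.append((frame, name))
    return hits
```

It does nothing for the out-of-bounds case described above, where every pointer looks valid and only knowledge of the code reveals the problem.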
So, with an artificial kill -s SIGSEGV <processid>, does the output below
prove I have workable .txt files (having .txt files that let people find
the root cause of the issue is another story, as we probably can only hope
for the best here)?
# head core-brief.txt
!@!@!@! brief.txt !@!@!@!
Thread 38 (Thread 0x7f2aa5dd0700 (LWP 992)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1 0x000055cdcb69ae84 in __ast_cond_timedwait (filename=0x55cdcb7d4910 "threadpool.c", lineno=1131, func=0x55cdcb7d4ea8 <__PRETTY_FUNCTION__.8978> "worker_idle", cond_name=0x55cdcb7d4b7f "&worker->cond", mutex_name=0x55cdcb7d4b71 "&worker->lock", cond=0x7f2abc000978, t=0x7f2abc0009a8, abstime=0x7f2aa5dcfc30) at lock.c:668
#2 0x000055cdcb75d153 in worker_idle (worker=0x7f2abc000970) at threadpool.c:1131
#3 0x000055cdcb75ce61 in worker_start (arg=0x7f2abc000970) at threadpool.c:1022
#4 0x000055cdcb769a8c in dummy_start (data=0x7f2abc000a80) at utils.c:1238
#5 0x00007f2aeddad494 in start_thread (arg=0x7f2aa5dd0700) at pthread_create.c:333
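
One rough way to judge whether output like this is "workable" is to count how many frames carry a function name, a source file and line, and real argument values. The sketch below rests on assumptions, not on anything the Asterisk tools provide: it treats "?? ()" as a frame gdb could not resolve and "<optimized out>" as a value lost to compiler optimization, and the function names are hypothetical. It also rejoins frames that were wrapped across lines.

```python
import re

FRAME_START = re.compile(r"^#\d+\s")
# A frame that ends in "at <file>:<line>" has source information.
SOURCE_LOC = re.compile(r"\bat\s+\S+:\d+\s*$")


def split_frames(text):
    """Return one string per frame, joining wrapped continuation lines."""
    frames = []
    open_frame = False
    for raw in text.splitlines():
        line = raw.strip()
        if FRAME_START.match(line):
            frames.append(line)
            open_frame = True
        elif open_frame and line and not line.startswith("Thread "):
            frames[-1] += " " + line
        else:
            open_frame = False
    return frames


def summarize(text):
    """Count frames, unresolved frames, frames with file:line, and
    occurrences of '<optimized out>'."""
    frames = split_frames(text)
    return {
        "frames": len(frames),
        "unresolved": sum("?? (" in f for f in frames),
        "with_source": sum(bool(SOURCE_LOC.search(f)) for f in frames),
        "optimized_out": sum(f.count("<optimized out>") for f in frames),
    }
```

A file where nearly every frame has a source location, and few or no frames are unresolved or optimized out, is probably usable. Whether it is enough to find a root cause is, as said above, another story.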
> Deadlocks are even harder to troubleshoot. For that, you need to look at
> full.txt to see where the threads are stuck and find the 1 thread that's
> holding the lock that the others are stuck on.
>
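
The first half of that search, seeing which threads are stuck waiting on a lock, can be sketched as follows. It assumes that a blocked thread shows a lock-wait function in frame #0. The set of function names is a guess to adapt (__lll_lock_wait is glibc's low-level lock wait), and the helper name is hypothetical. Finding the thread that actually holds the lock still means reading full.txt by hand.

```python
import re

THREAD_RE = re.compile(r"^Thread (\d+) \(")
# Frame #0, with or without a leading "0x... in " address.
TOP_FRAME_RE = re.compile(r"^#0\s+(?:0x[0-9a-f]+ in )?(\S+) \(")
# Assumption: functions that indicate a thread is blocked on a mutex.
LOCK_WAIT_FUNCS = {"__lll_lock_wait", "pthread_mutex_lock"}


def threads_waiting_on_locks(text):
    """Return the thread numbers whose top frame is a lock-wait function,
    in the order they appear in the file."""
    waiting = []
    current = None
    for line in text.splitlines():
        thread = THREAD_RE.match(line)
        if thread:
            current = int(thread.group(1))
            continue
        frame = TOP_FRAME_RE.match(line)
        if frame and current is not None:
            # Drop symbol-version suffixes such as "@@GLIBC_2.3.2".
            func = frame.group(1).split("@")[0]
            if func in LOCK_WAIT_FUNCS:
                waiting.append(current)
    return waiting
```

Idle threads such as the pthread_cond_timedwait worker shown in the brief.txt excerpt are not reported, since waiting on a condition variable is normal for a thread pool.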
> Sorry. I wish I had a better answer because it'd help a lot if folks
> could do more investigation themselves.
>
>
>
>
>
>>
>> Best regards
>>
>> [1] https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace
>>
>> --
>> _____________________________________________________________________
>> -- Bandwidth and Colocation Provided by http://www.api-digital.com --
>>
>> Check out the new Asterisk community forum at:
>> https://community.asterisk.org/
>>
>> New to Asterisk? Start here:
>> https://wiki.asterisk.org/wiki/display/AST/Getting+Started
>>
>> asterisk-users mailing list
>> To UNSUBSCRIBE or update options visit:
>> http://lists.digium.com/mailman/listinfo/asterisk-users
>>
>
>
>
> --
> George Joseph
> Digium, Inc. | Software Developer
> 445 Jan Davis Drive NW - Huntsville, AL 35806 - US
> Check us out at: www.digium.com & www.asterisk.org
>