Artem Russakovskii
2019-Feb-01 17:03 UTC
[Gluster-users] Message repeated over and over after upgrade from 4.1 to 5.3: W [dict.c:761:dict_ref] (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) [0x7fd966fcd329] -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) [0x7fd9671deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) [0x7fd9731ea218] ) 2-dict: dict is NULL [Invalid argument]
Hi,

The first (and so far only) crash happened at 2am the next day after we
upgraded, on only one of four servers and only to one of two mounts.

I have no idea what caused it, but yeah, we do have a pretty busy site
(apkmirror.com), and it caused a disruption for any uploads or downloads
from that server until I woke up and fixed the mount.

I wish I could be more helpful but all I have is that stack trace.

I'm glad it's a blocker and will hopefully be resolved soon.

On Thu, Jan 31, 2019, 7:26 PM Amar Tumballi Suryanarayan <atumball at redhat.com> wrote:

> Hi Artem,
>
> Opened https://bugzilla.redhat.com/show_bug.cgi?id=1671603 (ie, as a
> clone of other bugs where recent discussions happened), and marked it as a
> blocker for glusterfs-5.4 release.
>
> We already have fixes for log flooding - https://review.gluster.org/22128,
> and are the process of identifying and fixing the issue seen with crash.
>
> Can you please tell if the crashes happened as soon as upgrade ? or was
> there any particular pattern you observed before the crash.
>
> -Amar
>
>
> On Thu, Jan 31, 2019 at 11:40 PM Artem Russakovskii <archon810 at gmail.com>
> wrote:
>
>> Within 24 hours after updating from rock solid 4.1 to 5.3, I already got
>> a crash which others have mentioned in
>> https://bugzilla.redhat.com/show_bug.cgi?id=1313567 and had to unmount,
>> kill gluster, and remount:
>>
>>
>> [2019-01-31 09:38:04.317604] W [dict.c:761:dict_ref]
>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329)
>> [0x7fcccafcd329]
>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5)
>> [0x7fcccb1deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58)
>> [0x7fccd705b218] ) 2-dict: dict is NULL [Invalid argument]
>> [2019-01-31 09:38:04.319308] W [dict.c:761:dict_ref]
>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329)
>> [0x7fcccafcd329]
>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5)
>> [0x7fcccb1deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58)
>> [0x7fccd705b218] ) 2-dict: dict is NULL [Invalid argument]
>> [2019-01-31 09:38:04.320047] W [dict.c:761:dict_ref]
>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329)
>> [0x7fcccafcd329]
>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5)
>> [0x7fcccb1deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58)
>> [0x7fccd705b218] ) 2-dict: dict is NULL [Invalid argument]
>> [2019-01-31 09:38:04.320677] W [dict.c:761:dict_ref]
>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329)
>> [0x7fcccafcd329]
>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5)
>> [0x7fcccb1deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58)
>> [0x7fccd705b218] ) 2-dict: dict is NULL [Invalid argument]
>> The message "I [MSGID: 108031]
>> [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data1-replicate-0:
>> selecting local read_child SITE_data1-client-3" repeated 5 times between
>> [2019-01-31 09:37:54.751905] and [2019-01-31 09:38:03.958061]
>> The message "E [MSGID: 101191]
>> [event-epoll.c:671:event_dispatch_epoll_worker] 2-epoll: Failed to dispatch
>> handler" repeated 72 times between [2019-01-31 09:37:53.746741] and
>> [2019-01-31 09:38:04.696993]
>> pending frames:
>> frame : type(1) op(READ)
>> frame : type(1) op(OPEN)
>> frame : type(0) op(0)
>> patchset: git://git.gluster.org/glusterfs.git
>> signal received: 6
>> time of crash:
>> 2019-01-31 09:38:04
>> configuration details:
>> argp 1
>> backtrace 1
>> dlfcn 1
>> libpthread 1
>> llistxattr 1
>> setfsid 1
>> spinlock 1
>> epoll.h 1
>> xattr.h 1
>> st_atim.tv_nsec 1
>> package-string: glusterfs 5.3
>> /usr/lib64/libglusterfs.so.0(+0x2764c)[0x7fccd706664c]
>> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7fccd7070cb6]
>> /lib64/libc.so.6(+0x36160)[0x7fccd622d160]
>> /lib64/libc.so.6(gsignal+0x110)[0x7fccd622d0e0]
>> /lib64/libc.so.6(abort+0x151)[0x7fccd622e6c1]
>> /lib64/libc.so.6(+0x2e6fa)[0x7fccd62256fa]
>> /lib64/libc.so.6(+0x2e772)[0x7fccd6225772]
>> /lib64/libpthread.so.0(pthread_mutex_lock+0x228)[0x7fccd65bb0b8]
>> /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so(+0x32c4d)[0x7fcccbb01c4d]
>> /usr/lib64/glusterfs/5.3/xlator/protocol/client.so(+0x65778)[0x7fcccbdd1778]
>> /usr/lib64/libgfrpc.so.0(+0xe820)[0x7fccd6e31820]
>> /usr/lib64/libgfrpc.so.0(+0xeb6f)[0x7fccd6e31b6f]
>> /usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fccd6e2e063]
>> /usr/lib64/glusterfs/5.3/rpc-transport/socket.so(+0xa0b2)[0x7fccd0b7e0b2]
>> /usr/lib64/libglusterfs.so.0(+0x854c3)[0x7fccd70c44c3]
>> /lib64/libpthread.so.0(+0x7559)[0x7fccd65b8559]
>> /lib64/libc.so.6(clone+0x3f)[0x7fccd62ef81f]
>> ---------
>>
>> Do the pending patches fix the crash or only the repeated warnings? I'm
>> running glusterfs on OpenSUSE 15.0 installed via
>> http://download.opensuse.org/repositories/home:/glusterfs:/Leap15-5/openSUSE_Leap_15.0/,
>> not too sure how to make it core dump.
>>
>> If it's not fixed by the patches above, has anyone already opened a
>> ticket for the crashes that I can join and monitor? This is going to create
>> a massive problem for us since production systems are crashing.
>>
>> Thanks.
>>
>> Sincerely,
>> Artem
>>
>> --
>> Founder, Android Police <http://www.androidpolice.com>, APK Mirror
>> <http://www.apkmirror.com/>, Illogical Robot LLC
>> beerpla.net | +ArtemRussakovskii
>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR
>> <http://twitter.com/ArtemR>
>>
>>
>> On Wed, Jan 30, 2019 at 6:37 PM Raghavendra Gowdappa <rgowdapp at redhat.com>
>> wrote:
>>
>>>
>>>
>>> On Thu, Jan 31, 2019 at 2:14 AM Artem Russakovskii <archon810 at gmail.com>
>>> wrote:
>>>
>>>> Also, not sure if related or not, but I got a ton of these "Failed to
>>>> dispatch handler" in my logs as well. Many people have been commenting
>>>> about this issue here
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1651246.
>>>>
>>>
>>> https://review.gluster.org/#/c/glusterfs/+/22046/ addresses this.
>>>
>>>
>>>> ==> mnt-SITE_data1.log <==
>>>>> [2019-01-30 20:38:20.783713] W [dict.c:761:dict_ref]
>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329)
>>>>> [0x7fd966fcd329]
>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5)
>>>>> [0x7fd9671deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58)
>>>>> [0x7fd9731ea218] ) 2-dict: dict is NULL [Invalid argument]
>>>>> ==> mnt-SITE_data3.log <==
>>>>> The message "E [MSGID: 101191]
>>>>> [event-epoll.c:671:event_dispatch_epoll_worker] 2-epoll: Failed to dispatch
>>>>> handler" repeated 413 times between [2019-01-30 20:36:23.881090] and
>>>>> [2019-01-30 20:38:20.015593]
>>>>> The message "I [MSGID: 108031]
>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data3-replicate-0:
>>>>> selecting local read_child SITE_data3-client-0" repeated 42 times between
>>>>> [2019-01-30 20:36:23.290287] and [2019-01-30 20:38:20.280306]
>>>>> ==> mnt-SITE_data1.log <==
>>>>> The message "I [MSGID: 108031]
>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data1-replicate-0:
>>>>> selecting local read_child SITE_data1-client-0" repeated 50 times between
>>>>> [2019-01-30 20:36:22.247367] and [2019-01-30 20:38:19.459789]
>>>>> The message "E [MSGID: 101191]
>>>>> [event-epoll.c:671:event_dispatch_epoll_worker] 2-epoll: Failed to dispatch
>>>>> handler" repeated 2654 times between [2019-01-30 20:36:22.667327] and
>>>>> [2019-01-30 20:38:20.546355]
>>>>> [2019-01-30 20:38:21.492319] I [MSGID: 108031]
>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data1-replicate-0:
>>>>> selecting local read_child SITE_data1-client-0
>>>>> ==> mnt-SITE_data3.log <==
>>>>> [2019-01-30 20:38:22.349689] I [MSGID: 108031]
>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data3-replicate-0:
>>>>> selecting local read_child SITE_data3-client-0
>>>>> ==> mnt-SITE_data1.log <==
>>>>> [2019-01-30 20:38:22.762941] E [MSGID: 101191]
>>>>> [event-epoll.c:671:event_dispatch_epoll_worker] 2-epoll: Failed to dispatch
>>>>> handler
>>>>
>>>>
>>>> I'm hoping raising the issue here on the mailing list may bring some
>>>> additional eyeballs and get them both fixed.
>>>>
>>>> Thanks.
>>>>
>>>> Sincerely,
>>>> Artem
>>>>
>>>> --
>>>> Founder, Android Police <http://www.androidpolice.com>, APK Mirror
>>>> <http://www.apkmirror.com/>, Illogical Robot LLC
>>>> beerpla.net | +ArtemRussakovskii
>>>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR
>>>> <http://twitter.com/ArtemR>
>>>>
>>>>
>>>> On Wed, Jan 30, 2019 at 12:26 PM Artem Russakovskii <archon810 at gmail.com>
>>>> wrote:
>>>>
>>>>> I found a similar issue here:
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1313567. There's a
>>>>> comment from 3 days ago from someone else with 5.3 who started seeing the
>>>>> spam.
>>>>>
>>>>> Here's the command that repeats over and over:
>>>>> [2019-01-30 20:23:24.481581] W [dict.c:761:dict_ref]
>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329)
>>>>> [0x7fd966fcd329]
>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5)
>>>>> [0x7fd9671deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58)
>>>>> [0x7fd9731ea218] ) 2-dict: dict is NULL [Invalid argument]
>>>>>
>>>
>>> +Milind Changire <mchangir at redhat.com> Can you check why this message
>>> is logged and send a fix?
>>>
>>>
>>>>> Is there any fix for this issue?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Sincerely,
>>>>> Artem
>>>>>
>>>>> --
>>>>> Founder, Android Police <http://www.androidpolice.com>, APK Mirror
>>>>> <http://www.apkmirror.com/>, Illogical Robot LLC
>>>>> beerpla.net | +ArtemRussakovskii
>>>>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR
>>>>> <http://twitter.com/ArtemR>
>>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
> --
> Amar Tumballi (amarts)
Artem Russakovskii
2019-Feb-02 20:14 UTC
[Gluster-users] Message repeated over and over after upgrade from 4.1 to 5.3: W [dict.c:761:dict_ref] (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) [0x7fd966fcd329] -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) [0x7fd9671deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) [0x7fd9731ea218] ) 2-dict: dict is NULL [Invalid argument]
The fuse crash happened again yesterday, to another volume. Are there any
mount options that could help mitigate this?

In the meantime, I set up a monit (https://mmonit.com/monit/) task to watch
and restart the mount, which works and recovers the mount point within a
minute. Not ideal, but a temporary workaround.

By the way, the way to reproduce this "Transport endpoint is not connected"
condition for testing purposes is to kill -9 the right "glusterfs
--process-name fuse" process.
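
For anyone who wants to script a check for that condition, here is a minimal sketch in Python. It assumes a dead FUSE mount fails `stat()` with `ENOTCONN`, which is the errno behind the "Transport endpoint is not connected" text on Linux. The function name `mount_is_disconnected` is made up for illustration and is not part of gluster or monit.

```python
import errno
import os


def mount_is_disconnected(path: str) -> bool:
    """Return True if stat() on `path` fails with ENOTCONN.

    Assumption: a FUSE mount whose client process has died reports
    ENOTCONN ("Transport endpoint is not connected") on access.
    Any other outcome, including a missing path, returns False.
    """
    try:
        os.stat(path)
    except OSError as exc:
        return exc.errno == errno.ENOTCONN
    return False
```

Calling `mount_is_disconnected("/mnt/glusterfs_data1")` from a cron job or a test harness would tell a crashed mount apart from a path that is simply missing, since only the former maps to `ENOTCONN`.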

monit check:

check filesystem glusterfs_data1 with path /mnt/glusterfs_data1
  start program = "/bin/mount /mnt/glusterfs_data1"
  stop program = "/bin/umount /mnt/glusterfs_data1"
  if space usage > 90% for 5 times within 15 cycles
    then alert else if succeeded for 10 cycles then alert

stack trace:
[2019-02-01 23:22:00.312894] W [dict.c:761:dict_ref]
(-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329)
[0x7fa0249e4329]
-->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5)
[0x7fa024bf5af5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58)
[0x7fa02cf5b218] ) 0-dict: dict is NULL [Invalid argument]
[2019-02-01 23:22:00.314051] W [dict.c:761:dict_ref]
(-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329)
[0x7fa0249e4329]
-->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5)
[0x7fa024bf5af5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58)
[0x7fa02cf5b218] ) 0-dict: dict is NULL [Invalid argument]
The message "E [MSGID: 101191]
[event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
handler" repeated 26 times between [2019-02-01 23:21:20.857333] and
[2019-02-01 23:21:56.164427]
The message "I [MSGID: 108031] [afr-common.c:2543:afr_local_discovery_cbk]
0-SITE_data3-replicate-0: selecting local read_child SITE_data3-client-3"
repeated 27 times between [2019-02-01 23:21:11.142467] and [2019-02-01
23:22:03.474036]
pending frames:
frame : type(1) op(LOOKUP)
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 6
time of crash:
2019-02-01 23:22:03
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 5.3
/usr/lib64/libglusterfs.so.0(+0x2764c)[0x7fa02cf6664c]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7fa02cf70cb6]
/lib64/libc.so.6(+0x36160)[0x7fa02c12d160]
/lib64/libc.so.6(gsignal+0x110)[0x7fa02c12d0e0]
/lib64/libc.so.6(abort+0x151)[0x7fa02c12e6c1]
/lib64/libc.so.6(+0x2e6fa)[0x7fa02c1256fa]
/lib64/libc.so.6(+0x2e772)[0x7fa02c125772]
/lib64/libpthread.so.0(pthread_mutex_lock+0x228)[0x7fa02c4bb0b8]
/usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so(+0x5dc9d)[0x7fa025543c9d]
/usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so(+0x70ba1)[0x7fa025556ba1]
/usr/lib64/glusterfs/5.3/xlator/protocol/client.so(+0x58f3f)[0x7fa0257dbf3f]
/usr/lib64/libgfrpc.so.0(+0xe820)[0x7fa02cd31820]
/usr/lib64/libgfrpc.so.0(+0xeb6f)[0x7fa02cd31b6f]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fa02cd2e063]
/usr/lib64/glusterfs/5.3/rpc-transport/socket.so(+0xa0b2)[0x7fa02694e0b2]
/usr/lib64/libglusterfs.so.0(+0x854c3)[0x7fa02cfc44c3]
/lib64/libpthread.so.0(+0x7559)[0x7fa02c4b8559]
/lib64/libc.so.6(clone+0x3f)[0x7fa02c1ef81f]
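
To put numbers on how noisy a log like the one above is, here is a rough Python sketch that tallies the `dict_ref` warnings and the "repeated N times" summaries. The helper name `summarize` and the counter labels are hypothetical, and the regexes only assume the message layout visible in the excerpt above; they are not taken from gluster itself.

```python
import re
from collections import Counter

# Matches summary lines such as:
#   The message "E [MSGID: 101191] ... handler" repeated 26 times between ...
REPEATED = re.compile(
    r'The message "[A-Z] \[MSGID: (?P<msgid>\d+)\].*?" repeated (?P<count>\d+) times'
)
# Matches the start of each individual dict_ref warning entry.
DICT_REF = re.compile(r"\] W \[dict\.c:\d+:dict_ref\]")


def summarize(log_text: str) -> Counter:
    """Count dict_ref warnings and sum the 'repeated N times' totals per MSGID."""
    # Entries are wrapped across several lines, so collapse whitespace first.
    flat = " ".join(log_text.split())
    counts = Counter()
    for match in REPEATED.finditer(flat):
        counts["MSGID " + match.group("msgid")] += int(match.group("count"))
    counts["dict_ref warnings"] += len(DICT_REF.findall(flat))
    return counts
```

Feeding it the contents of a mount log, for example `summarize(open(path).read())` with `path` pointing at the log file, gives a per-message total that can be compared before and after applying the log flooding patch.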
Sincerely,
Artem
--
Founder, Android Police <http://www.androidpolice.com>, APK Mirror
<http://www.apkmirror.com/>, Illogical Robot LLC
beerpla.net | +ArtemRussakovskii
<https://plus.google.com/+ArtemRussakovskii> | @ArtemR
<http://twitter.com/ArtemR>