Artem Russakovskii
2019-Feb-12 04:53 UTC
[Gluster-users] Message repeated over and over after upgrade from 4.1 to 5.3: W [dict.c:761:dict_ref] (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) [0x7fd966fcd329] -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) [0x7fd9671deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) [0x7fd9731ea218] ) 2-dict: dict is NULL [Invalid argument]
Great job identifying the issue! Any ETA on the next release with the logging and crash fixes in it? On Mon, Feb 11, 2019, 7:19 PM Raghavendra Gowdappa <rgowdapp at redhat.com> wrote:> > > On Mon, Feb 11, 2019 at 3:49 PM Jo?o Ba?to < > joao.bauto at neuro.fchampalimaud.org> wrote: > >> Although I don't have these error messages, I'm having fuse crashes as >> frequent as you. I have disabled write-behind and the mount has been >> running over the weekend with heavy usage and no issues. >> > > The issue you are facing will likely be fixed by patch [1]. Me, Xavi and > Nithya were able to identify the corruption in write-behind. > > [1] https://review.gluster.org/22189 > > >> I can provide coredumps before disabling write-behind if needed. I opened >> a BZ report <https://bugzilla.redhat.com/show_bug.cgi?id=1671014> with >> the crashes that I was having. >> >> *Jo?o Ba?to* >> --------------- >> >> *Scientific Computing and Software Platform* >> Champalimaud Research >> Champalimaud Center for the Unknown >> Av. Bras?lia, Doca de Pedrou?os >> 1400-038 Lisbon, Portugal >> fchampalimaud.org <https://www.fchampalimaud.org/> >> >> >> Artem Russakovskii <archon810 at gmail.com> escreveu no dia s?bado, >> 9/02/2019 ?(s) 22:18: >> >>> Alright. I've enabled core-dumping (hopefully), so now I'm waiting for >>> the next crash to see if it dumps a core for you guys to remotely debug. >>> >>> Then I can consider setting performance.write-behind to off and >>> monitoring for further crashes. >>> >>> Sincerely, >>> Artem >>> >>> -- >>> Founder, Android Police <http://www.androidpolice.com>, APK Mirror >>> <http://www.apkmirror.com/>, Illogical Robot LLC >>> beerpla.net | +ArtemRussakovskii >>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR >>> <http://twitter.com/ArtemR> >>> >>> >>> On Fri, Feb 8, 2019 at 7:22 PM Raghavendra Gowdappa <rgowdapp at redhat.com> >>> wrote: >>> >>>> >>>> >>>> On Sat, Feb 9, 2019 at 12:53 AM Artem Russakovskii <archon810 at gmail.com> >>>> wrote: >>>> >>>>> Hi Nithya, >>>>> >>>>> I can try to disable write-behind as long as it doesn't heavily impact >>>>> performance for us. Which option is it exactly? I don't see it set in my >>>>> list of changed volume variables that I sent you guys earlier. >>>>> >>>> >>>> The option is performance.write-behind >>>> >>>> >>>>> Sincerely, >>>>> Artem >>>>> >>>>> -- >>>>> Founder, Android Police <http://www.androidpolice.com>, APK Mirror >>>>> <http://www.apkmirror.com/>, Illogical Robot LLC >>>>> beerpla.net | +ArtemRussakovskii >>>>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR >>>>> <http://twitter.com/ArtemR> >>>>> >>>>> >>>>> On Fri, Feb 8, 2019 at 4:57 AM Nithya Balachandran < >>>>> nbalacha at redhat.com> wrote: >>>>> >>>>>> Hi Artem, >>>>>> >>>>>> We have found the cause of one crash. Unfortunately we have not >>>>>> managed to reproduce the one you reported so we don't know if it is the >>>>>> same cause. >>>>>> >>>>>> Can you disable write-behind on the volume and let us know if it >>>>>> solves the problem? If yes, it is likely to be the same issue. >>>>>> >>>>>> >>>>>> regards, >>>>>> Nithya >>>>>> >>>>>> On Fri, 8 Feb 2019 at 06:51, Artem Russakovskii <archon810 at gmail.com> >>>>>> wrote: >>>>>> >>>>>>> Sorry to disappoint, but the crash just happened again, so >>>>>>> lru-limit=0 didn't help. >>>>>>> >>>>>>> Here's the snippet of the crash and the subsequent remount by monit. 
>>>>>>> >>>>>>> >>>>>>> [2019-02-08 01:13:05.854391] W [dict.c:761:dict_ref] >>>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) >>>>>>> [0x7f4402b99329] >>>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) >>>>>>> [0x7f4402daaaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) >>>>>>> [0x7f440b6b5218] ) 0-dict: dict is NULL [In >>>>>>> valid argument] >>>>>>> The message "I [MSGID: 108031] >>>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 0-<SNIP>_data1-replicate-0: >>>>>>> selecting local read_child <SNIP>_data1-client-3" repeated 39 times between >>>>>>> [2019-02-08 01:11:18.043286] and [2019-02-08 01:13:07.915604] >>>>>>> The message "E [MSGID: 101191] >>>>>>> [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch >>>>>>> handler" repeated 515 times between [2019-02-08 01:11:17.932515] and >>>>>>> [2019-02-08 01:13:09.311554] >>>>>>> pending frames: >>>>>>> frame : type(1) op(LOOKUP) >>>>>>> frame : type(0) op(0) >>>>>>> patchset: git://git.gluster.org/glusterfs.git >>>>>>> signal received: 6 >>>>>>> time of crash: >>>>>>> 2019-02-08 01:13:09 >>>>>>> configuration details: >>>>>>> argp 1 >>>>>>> backtrace 1 >>>>>>> dlfcn 1 >>>>>>> libpthread 1 >>>>>>> llistxattr 1 >>>>>>> setfsid 1 >>>>>>> spinlock 1 >>>>>>> epoll.h 1 >>>>>>> xattr.h 1 >>>>>>> st_atim.tv_nsec 1 >>>>>>> package-string: glusterfs 5.3 >>>>>>> /usr/lib64/libglusterfs.so.0(+0x2764c)[0x7f440b6c064c] >>>>>>> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7f440b6cacb6] >>>>>>> /lib64/libc.so.6(+0x36160)[0x7f440a887160] >>>>>>> /lib64/libc.so.6(gsignal+0x110)[0x7f440a8870e0] >>>>>>> /lib64/libc.so.6(abort+0x151)[0x7f440a8886c1] >>>>>>> /lib64/libc.so.6(+0x2e6fa)[0x7f440a87f6fa] >>>>>>> /lib64/libc.so.6(+0x2e772)[0x7f440a87f772] >>>>>>> /lib64/libpthread.so.0(pthread_mutex_lock+0x228)[0x7f440ac150b8] >>>>>>> >>>>>>> /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so(+0x5dc9d)[0x7f44036f8c9d] >>>>>>> >>>>>>> /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so(+0x70ba1)[0x7f440370bba1] >>>>>>> >>>>>>> /usr/lib64/glusterfs/5.3/xlator/protocol/client.so(+0x58f3f)[0x7f4403990f3f] >>>>>>> /usr/lib64/libgfrpc.so.0(+0xe820)[0x7f440b48b820] >>>>>>> /usr/lib64/libgfrpc.so.0(+0xeb6f)[0x7f440b48bb6f] >>>>>>> /usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7f440b488063] >>>>>>> >>>>>>> /usr/lib64/glusterfs/5.3/rpc-transport/socket.so(+0xa0b2)[0x7f44050a80b2] >>>>>>> /usr/lib64/libglusterfs.so.0(+0x854c3)[0x7f440b71e4c3] >>>>>>> /lib64/libpthread.so.0(+0x7559)[0x7f440ac12559] >>>>>>> /lib64/libc.so.6(clone+0x3f)[0x7f440a94981f] >>>>>>> --------- >>>>>>> [2019-02-08 01:13:35.628478] I [MSGID: 100030] >>>>>>> [glusterfsd.c:2715:main] 0-/usr/sbin/glusterfs: Started running >>>>>>> /usr/sbin/glusterfs version 5.3 (args: /usr/sbin/glusterfs --lru-limit=0 >>>>>>> --process-name fuse --volfile-server=localhost --volfile-id=/<SNIP>_data1 >>>>>>> /mnt/<SNIP>_data1) >>>>>>> [2019-02-08 01:13:35.637830] I [MSGID: 101190] >>>>>>> [event-epoll.c:622:event_dispatch_epoll_worker] 0-epoll: Started thread >>>>>>> with index 1 >>>>>>> [2019-02-08 01:13:35.651405] I [MSGID: 101190] >>>>>>> [event-epoll.c:622:event_dispatch_epoll_worker] 0-epoll: Started thread >>>>>>> with index 2 >>>>>>> [2019-02-08 01:13:35.651628] I [MSGID: 101190] >>>>>>> [event-epoll.c:622:event_dispatch_epoll_worker] 0-epoll: Started thread >>>>>>> with index 3 >>>>>>> [2019-02-08 01:13:35.651747] I [MSGID: 101190] >>>>>>> [event-epoll.c:622:event_dispatch_epoll_worker] 0-epoll: Started thread >>>>>>> 
with index 4 >>>>>>> [2019-02-08 01:13:35.652575] I [MSGID: 114020] >>>>>>> [client.c:2354:notify] 0-<SNIP>_data1-client-0: parent translators are >>>>>>> ready, attempting connect on transport >>>>>>> [2019-02-08 01:13:35.652978] I [MSGID: 114020] >>>>>>> [client.c:2354:notify] 0-<SNIP>_data1-client-1: parent translators are >>>>>>> ready, attempting connect on transport >>>>>>> [2019-02-08 01:13:35.655197] I [MSGID: 114020] >>>>>>> [client.c:2354:notify] 0-<SNIP>_data1-client-2: parent translators are >>>>>>> ready, attempting connect on transport >>>>>>> [2019-02-08 01:13:35.655497] I [MSGID: 114020] >>>>>>> [client.c:2354:notify] 0-<SNIP>_data1-client-3: parent translators are >>>>>>> ready, attempting connect on transport >>>>>>> [2019-02-08 01:13:35.655527] I [rpc-clnt.c:2042:rpc_clnt_reconfig] >>>>>>> 0-<SNIP>_data1-client-0: changing port to 49153 (from 0) >>>>>>> Final graph: >>>>>>> >>>>>>> >>>>>>> Sincerely, >>>>>>> Artem >>>>>>> >>>>>>> -- >>>>>>> Founder, Android Police <http://www.androidpolice.com>, APK Mirror >>>>>>> <http://www.apkmirror.com/>, Illogical Robot LLC >>>>>>> beerpla.net | +ArtemRussakovskii >>>>>>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR >>>>>>> <http://twitter.com/ArtemR> >>>>>>> >>>>>>> >>>>>>> On Thu, Feb 7, 2019 at 1:28 PM Artem Russakovskii < >>>>>>> archon810 at gmail.com> wrote: >>>>>>> >>>>>>>> I've added the lru-limit=0 parameter to the mounts, and I see it's >>>>>>>> taken effect correctly: >>>>>>>> "/usr/sbin/glusterfs --lru-limit=0 --process-name fuse >>>>>>>> --volfile-server=localhost --volfile-id=/<SNIP> /mnt/<SNIP>" >>>>>>>> >>>>>>>> Let's see if it stops crashing or not. >>>>>>>> >>>>>>>> Sincerely, >>>>>>>> Artem >>>>>>>> >>>>>>>> -- >>>>>>>> Founder, Android Police <http://www.androidpolice.com>, APK Mirror >>>>>>>> <http://www.apkmirror.com/>, Illogical Robot LLC >>>>>>>> beerpla.net | +ArtemRussakovskii >>>>>>>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR >>>>>>>> <http://twitter.com/ArtemR> >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Feb 6, 2019 at 10:48 AM Artem Russakovskii < >>>>>>>> archon810 at gmail.com> wrote: >>>>>>>> >>>>>>>>> Hi Nithya, >>>>>>>>> >>>>>>>>> Indeed, I upgraded from 4.1 to 5.3, at which point I started >>>>>>>>> seeing crashes, and no further releases have been made yet. 
>>>>>>>>> >>>>>>>>> volume info: >>>>>>>>> Type: Replicate >>>>>>>>> Volume ID: ****SNIP**** >>>>>>>>> Status: Started >>>>>>>>> Snapshot Count: 0 >>>>>>>>> Number of Bricks: 1 x 4 = 4 >>>>>>>>> Transport-type: tcp >>>>>>>>> Bricks: >>>>>>>>> Brick1: ****SNIP**** >>>>>>>>> Brick2: ****SNIP**** >>>>>>>>> Brick3: ****SNIP**** >>>>>>>>> Brick4: ****SNIP**** >>>>>>>>> Options Reconfigured: >>>>>>>>> cluster.quorum-count: 1 >>>>>>>>> cluster.quorum-type: fixed >>>>>>>>> network.ping-timeout: 5 >>>>>>>>> network.remote-dio: enable >>>>>>>>> performance.rda-cache-limit: 256MB >>>>>>>>> performance.readdir-ahead: on >>>>>>>>> performance.parallel-readdir: on >>>>>>>>> network.inode-lru-limit: 500000 >>>>>>>>> performance.md-cache-timeout: 600 >>>>>>>>> performance.cache-invalidation: on >>>>>>>>> performance.stat-prefetch: on >>>>>>>>> features.cache-invalidation-timeout: 600 >>>>>>>>> features.cache-invalidation: on >>>>>>>>> cluster.readdir-optimize: on >>>>>>>>> performance.io-thread-count: 32 >>>>>>>>> server.event-threads: 4 >>>>>>>>> client.event-threads: 4 >>>>>>>>> performance.read-ahead: off >>>>>>>>> cluster.lookup-optimize: on >>>>>>>>> performance.cache-size: 1GB >>>>>>>>> cluster.self-heal-daemon: enable >>>>>>>>> transport.address-family: inet >>>>>>>>> nfs.disable: on >>>>>>>>> performance.client-io-threads: on >>>>>>>>> cluster.granular-entry-heal: enable >>>>>>>>> cluster.data-self-heal-algorithm: full >>>>>>>>> >>>>>>>>> Sincerely, >>>>>>>>> Artem >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Founder, Android Police <http://www.androidpolice.com>, APK Mirror >>>>>>>>> <http://www.apkmirror.com/>, Illogical Robot LLC >>>>>>>>> beerpla.net | +ArtemRussakovskii >>>>>>>>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR >>>>>>>>> <http://twitter.com/ArtemR> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, Feb 6, 2019 at 12:20 AM Nithya Balachandran < >>>>>>>>> nbalacha at redhat.com> wrote: >>>>>>>>> >>>>>>>>>> Hi Artem, >>>>>>>>>> >>>>>>>>>> Do you still see the crashes with 5.3? If yes, please try mount >>>>>>>>>> the volume using the mount option lru-limit=0 and see if that helps. We are >>>>>>>>>> looking into the crashes and will update when have a fix. >>>>>>>>>> >>>>>>>>>> Also, please provide the gluster volume info for the volume in >>>>>>>>>> question. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> regards, >>>>>>>>>> Nithya >>>>>>>>>> >>>>>>>>>> On Tue, 5 Feb 2019 at 05:31, Artem Russakovskii < >>>>>>>>>> archon810 at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> The fuse crash happened two more times, but this time monit >>>>>>>>>>> helped recover within 1 minute, so it's a great workaround for now. >>>>>>>>>>> >>>>>>>>>>> What's odd is that the crashes are only happening on one of 4 >>>>>>>>>>> servers, and I don't know why. >>>>>>>>>>> >>>>>>>>>>> Sincerely, >>>>>>>>>>> Artem >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Founder, Android Police <http://www.androidpolice.com>, APK >>>>>>>>>>> Mirror <http://www.apkmirror.com/>, Illogical Robot LLC >>>>>>>>>>> beerpla.net | +ArtemRussakovskii >>>>>>>>>>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR >>>>>>>>>>> <http://twitter.com/ArtemR> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Sat, Feb 2, 2019 at 12:14 PM Artem Russakovskii < >>>>>>>>>>> archon810 at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> The fuse crash happened again yesterday, to another volume. Are >>>>>>>>>>>> there any mount options that could help mitigate this? 
>>>>>>>>>>>> >>>>>>>>>>>> In the meantime, I set up a monit (https://mmonit.com/monit/) >>>>>>>>>>>> task to watch and restart the mount, which works and recovers the mount >>>>>>>>>>>> point within a minute. Not ideal, but a temporary workaround. >>>>>>>>>>>> >>>>>>>>>>>> By the way, the way to reproduce this "Transport endpoint is >>>>>>>>>>>> not connected" condition for testing purposes is to kill -9 the right >>>>>>>>>>>> "glusterfs --process-name fuse" process. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> monit check: >>>>>>>>>>>> check filesystem glusterfs_data1 with path /mnt/glusterfs_data1 >>>>>>>>>>>> start program = "/bin/mount /mnt/glusterfs_data1" >>>>>>>>>>>> stop program = "/bin/umount /mnt/glusterfs_data1" >>>>>>>>>>>> if space usage > 90% for 5 times within 15 cycles >>>>>>>>>>>> then alert else if succeeded for 10 cycles then alert >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> stack trace: >>>>>>>>>>>> [2019-02-01 23:22:00.312894] W [dict.c:761:dict_ref] >>>>>>>>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) >>>>>>>>>>>> [0x7fa0249e4329] >>>>>>>>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) >>>>>>>>>>>> [0x7fa024bf5af5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) >>>>>>>>>>>> [0x7fa02cf5b218] ) 0-dict: dict is NULL [Invalid argument] >>>>>>>>>>>> [2019-02-01 23:22:00.314051] W [dict.c:761:dict_ref] >>>>>>>>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) >>>>>>>>>>>> [0x7fa0249e4329] >>>>>>>>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) >>>>>>>>>>>> [0x7fa024bf5af5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) >>>>>>>>>>>> [0x7fa02cf5b218] ) 0-dict: dict is NULL [Invalid argument] >>>>>>>>>>>> The message "E [MSGID: 101191] >>>>>>>>>>>> [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch >>>>>>>>>>>> handler" repeated 26 times between [2019-02-01 23:21:20.857333] and >>>>>>>>>>>> [2019-02-01 23:21:56.164427] >>>>>>>>>>>> The message "I [MSGID: 108031] >>>>>>>>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 0-SITE_data3-replicate-0: >>>>>>>>>>>> selecting local read_child SITE_data3-client-3" repeated 27 times between >>>>>>>>>>>> [2019-02-01 23:21:11.142467] and [2019-02-01 23:22:03.474036] >>>>>>>>>>>> pending frames: >>>>>>>>>>>> frame : type(1) op(LOOKUP) >>>>>>>>>>>> frame : type(0) op(0) >>>>>>>>>>>> patchset: git://git.gluster.org/glusterfs.git >>>>>>>>>>>> signal received: 6 >>>>>>>>>>>> time of crash: >>>>>>>>>>>> 2019-02-01 23:22:03 >>>>>>>>>>>> configuration details: >>>>>>>>>>>> argp 1 >>>>>>>>>>>> backtrace 1 >>>>>>>>>>>> dlfcn 1 >>>>>>>>>>>> libpthread 1 >>>>>>>>>>>> llistxattr 1 >>>>>>>>>>>> setfsid 1 >>>>>>>>>>>> spinlock 1 >>>>>>>>>>>> epoll.h 1 >>>>>>>>>>>> xattr.h 1 >>>>>>>>>>>> st_atim.tv_nsec 1 >>>>>>>>>>>> package-string: glusterfs 5.3 >>>>>>>>>>>> /usr/lib64/libglusterfs.so.0(+0x2764c)[0x7fa02cf6664c] >>>>>>>>>>>> >>>>>>>>>>>> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7fa02cf70cb6] >>>>>>>>>>>> /lib64/libc.so.6(+0x36160)[0x7fa02c12d160] >>>>>>>>>>>> /lib64/libc.so.6(gsignal+0x110)[0x7fa02c12d0e0] >>>>>>>>>>>> /lib64/libc.so.6(abort+0x151)[0x7fa02c12e6c1] >>>>>>>>>>>> /lib64/libc.so.6(+0x2e6fa)[0x7fa02c1256fa] >>>>>>>>>>>> /lib64/libc.so.6(+0x2e772)[0x7fa02c125772] >>>>>>>>>>>> /lib64/libpthread.so.0(pthread_mutex_lock+0x228)[0x7fa02c4bb0b8] >>>>>>>>>>>> >>>>>>>>>>>> /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so(+0x5dc9d)[0x7fa025543c9d] >>>>>>>>>>>> >>>>>>>>>>>> 
/usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so(+0x70ba1)[0x7fa025556ba1] >>>>>>>>>>>> >>>>>>>>>>>> /usr/lib64/glusterfs/5.3/xlator/protocol/client.so(+0x58f3f)[0x7fa0257dbf3f] >>>>>>>>>>>> /usr/lib64/libgfrpc.so.0(+0xe820)[0x7fa02cd31820] >>>>>>>>>>>> /usr/lib64/libgfrpc.so.0(+0xeb6f)[0x7fa02cd31b6f] >>>>>>>>>>>> >>>>>>>>>>>> /usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fa02cd2e063] >>>>>>>>>>>> >>>>>>>>>>>> /usr/lib64/glusterfs/5.3/rpc-transport/socket.so(+0xa0b2)[0x7fa02694e0b2] >>>>>>>>>>>> /usr/lib64/libglusterfs.so.0(+0x854c3)[0x7fa02cfc44c3] >>>>>>>>>>>> /lib64/libpthread.so.0(+0x7559)[0x7fa02c4b8559] >>>>>>>>>>>> /lib64/libc.so.6(clone+0x3f)[0x7fa02c1ef81f] >>>>>>>>>>>> >>>>>>>>>>>> Sincerely, >>>>>>>>>>>> Artem >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Founder, Android Police <http://www.androidpolice.com>, APK >>>>>>>>>>>> Mirror <http://www.apkmirror.com/>, Illogical Robot LLC >>>>>>>>>>>> beerpla.net | +ArtemRussakovskii >>>>>>>>>>>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR >>>>>>>>>>>> <http://twitter.com/ArtemR> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Feb 1, 2019 at 9:03 AM Artem Russakovskii < >>>>>>>>>>>> archon810 at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi, >>>>>>>>>>>>> >>>>>>>>>>>>> The first (and so far only) crash happened at 2am the next day >>>>>>>>>>>>> after we upgraded, on only one of four servers and only to one of two >>>>>>>>>>>>> mounts. >>>>>>>>>>>>> >>>>>>>>>>>>> I have no idea what caused it, but yeah, we do have a pretty >>>>>>>>>>>>> busy site (apkmirror.com), and it caused a disruption for any >>>>>>>>>>>>> uploads or downloads from that server until I woke up and fixed the mount. >>>>>>>>>>>>> >>>>>>>>>>>>> I wish I could be more helpful but all I have is that stack >>>>>>>>>>>>> trace. >>>>>>>>>>>>> >>>>>>>>>>>>> I'm glad it's a blocker and will hopefully be resolved soon. >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Jan 31, 2019, 7:26 PM Amar Tumballi Suryanarayan < >>>>>>>>>>>>> atumball at redhat.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Artem, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Opened https://bugzilla.redhat.com/show_bug.cgi?id=1671603 >>>>>>>>>>>>>> (ie, as a clone of other bugs where recent discussions happened), and >>>>>>>>>>>>>> marked it as a blocker for glusterfs-5.4 release. >>>>>>>>>>>>>> >>>>>>>>>>>>>> We already have fixes for log flooding - >>>>>>>>>>>>>> https://review.gluster.org/22128, and are the process of >>>>>>>>>>>>>> identifying and fixing the issue seen with crash. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Can you please tell if the crashes happened as soon as >>>>>>>>>>>>>> upgrade ? or was there any particular pattern you observed before the crash. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> -Amar >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Thu, Jan 31, 2019 at 11:40 PM Artem Russakovskii < >>>>>>>>>>>>>> archon810 at gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Within 24 hours after updating from rock solid 4.1 to 5.3, I >>>>>>>>>>>>>>> already got a crash which others have mentioned in >>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1313567 and had >>>>>>>>>>>>>>> to unmount, kill gluster, and remount: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> [2019-01-31 09:38:04.317604] W [dict.c:761:dict_ref] >>>>>>>>>>>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) >>>>>>>>>>>>>>> [0x7fcccafcd329] >>>>>>>>>>>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) >>>>>>>>>>>>>>> [0x7fcccb1deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) >>>>>>>>>>>>>>> [0x7fccd705b218] ) 2-dict: dict is NULL [Invalid argument] >>>>>>>>>>>>>>> [2019-01-31 09:38:04.319308] W [dict.c:761:dict_ref] >>>>>>>>>>>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) >>>>>>>>>>>>>>> [0x7fcccafcd329] >>>>>>>>>>>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) >>>>>>>>>>>>>>> [0x7fcccb1deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) >>>>>>>>>>>>>>> [0x7fccd705b218] ) 2-dict: dict is NULL [Invalid argument] >>>>>>>>>>>>>>> [2019-01-31 09:38:04.320047] W [dict.c:761:dict_ref] >>>>>>>>>>>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) >>>>>>>>>>>>>>> [0x7fcccafcd329] >>>>>>>>>>>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) >>>>>>>>>>>>>>> [0x7fcccb1deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) >>>>>>>>>>>>>>> [0x7fccd705b218] ) 2-dict: dict is NULL [Invalid argument] >>>>>>>>>>>>>>> [2019-01-31 09:38:04.320677] W [dict.c:761:dict_ref] >>>>>>>>>>>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) >>>>>>>>>>>>>>> [0x7fcccafcd329] >>>>>>>>>>>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) >>>>>>>>>>>>>>> [0x7fcccb1deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) >>>>>>>>>>>>>>> [0x7fccd705b218] ) 2-dict: dict is NULL [Invalid argument] >>>>>>>>>>>>>>> The message "I [MSGID: 108031] >>>>>>>>>>>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data1-replicate-0: >>>>>>>>>>>>>>> selecting local read_child SITE_data1-client-3" repeated 5 times between >>>>>>>>>>>>>>> [2019-01-31 09:37:54.751905] and [2019-01-31 09:38:03.958061] >>>>>>>>>>>>>>> The message "E [MSGID: 101191] >>>>>>>>>>>>>>> [event-epoll.c:671:event_dispatch_epoll_worker] 2-epoll: Failed to dispatch >>>>>>>>>>>>>>> handler" repeated 72 times between [2019-01-31 09:37:53.746741] and >>>>>>>>>>>>>>> [2019-01-31 09:38:04.696993] >>>>>>>>>>>>>>> pending frames: >>>>>>>>>>>>>>> frame : type(1) op(READ) >>>>>>>>>>>>>>> frame : type(1) op(OPEN) >>>>>>>>>>>>>>> frame : type(0) op(0) >>>>>>>>>>>>>>> patchset: git://git.gluster.org/glusterfs.git >>>>>>>>>>>>>>> signal received: 6 >>>>>>>>>>>>>>> time of crash: >>>>>>>>>>>>>>> 2019-01-31 09:38:04 >>>>>>>>>>>>>>> configuration details: >>>>>>>>>>>>>>> argp 1 >>>>>>>>>>>>>>> backtrace 1 >>>>>>>>>>>>>>> dlfcn 1 >>>>>>>>>>>>>>> libpthread 1 >>>>>>>>>>>>>>> llistxattr 1 >>>>>>>>>>>>>>> setfsid 1 >>>>>>>>>>>>>>> spinlock 1 >>>>>>>>>>>>>>> epoll.h 1 >>>>>>>>>>>>>>> xattr.h 1 >>>>>>>>>>>>>>> st_atim.tv_nsec 1 >>>>>>>>>>>>>>> package-string: glusterfs 5.3 >>>>>>>>>>>>>>> /usr/lib64/libglusterfs.so.0(+0x2764c)[0x7fccd706664c] >>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7fccd7070cb6] >>>>>>>>>>>>>>> /lib64/libc.so.6(+0x36160)[0x7fccd622d160] >>>>>>>>>>>>>>> /lib64/libc.so.6(gsignal+0x110)[0x7fccd622d0e0] >>>>>>>>>>>>>>> /lib64/libc.so.6(abort+0x151)[0x7fccd622e6c1] >>>>>>>>>>>>>>> /lib64/libc.so.6(+0x2e6fa)[0x7fccd62256fa] >>>>>>>>>>>>>>> /lib64/libc.so.6(+0x2e772)[0x7fccd6225772] >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> /lib64/libpthread.so.0(pthread_mutex_lock+0x228)[0x7fccd65bb0b8] >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so(+0x32c4d)[0x7fcccbb01c4d] >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> /usr/lib64/glusterfs/5.3/xlator/protocol/client.so(+0x65778)[0x7fcccbdd1778] >>>>>>>>>>>>>>> /usr/lib64/libgfrpc.so.0(+0xe820)[0x7fccd6e31820] >>>>>>>>>>>>>>> /usr/lib64/libgfrpc.so.0(+0xeb6f)[0x7fccd6e31b6f] >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> /usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fccd6e2e063] >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> /usr/lib64/glusterfs/5.3/rpc-transport/socket.so(+0xa0b2)[0x7fccd0b7e0b2] >>>>>>>>>>>>>>> /usr/lib64/libglusterfs.so.0(+0x854c3)[0x7fccd70c44c3] >>>>>>>>>>>>>>> /lib64/libpthread.so.0(+0x7559)[0x7fccd65b8559] >>>>>>>>>>>>>>> /lib64/libc.so.6(clone+0x3f)[0x7fccd62ef81f] >>>>>>>>>>>>>>> --------- >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Do the pending patches fix the crash or only the repeated >>>>>>>>>>>>>>> warnings? I'm running glusterfs on OpenSUSE 15.0 installed via >>>>>>>>>>>>>>> http://download.opensuse.org/repositories/home:/glusterfs:/Leap15-5/openSUSE_Leap_15.0/, >>>>>>>>>>>>>>> not too sure how to make it core dump. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> If it's not fixed by the patches above, has anyone already >>>>>>>>>>>>>>> opened a ticket for the crashes that I can join and monitor? This is going >>>>>>>>>>>>>>> to create a massive problem for us since production systems are crashing. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Sincerely, >>>>>>>>>>>>>>> Artem >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> Founder, Android Police <http://www.androidpolice.com>, APK >>>>>>>>>>>>>>> Mirror <http://www.apkmirror.com/>, Illogical Robot LLC >>>>>>>>>>>>>>> beerpla.net | +ArtemRussakovskii >>>>>>>>>>>>>>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR >>>>>>>>>>>>>>> <http://twitter.com/ArtemR> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Wed, Jan 30, 2019 at 6:37 PM Raghavendra Gowdappa < >>>>>>>>>>>>>>> rgowdapp at redhat.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Thu, Jan 31, 2019 at 2:14 AM Artem Russakovskii < >>>>>>>>>>>>>>>> archon810 at gmail.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Also, not sure if related or not, but I got a ton of these >>>>>>>>>>>>>>>>> "Failed to dispatch handler" in my logs as well. Many people have been >>>>>>>>>>>>>>>>> commenting about this issue here >>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1651246. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> https://review.gluster.org/#/c/glusterfs/+/22046/ >>>>>>>>>>>>>>>> addresses this. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ==> mnt-SITE_data1.log <=>>>>>>>>>>>>>>>>>> [2019-01-30 20:38:20.783713] W [dict.c:761:dict_ref] >>>>>>>>>>>>>>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) >>>>>>>>>>>>>>>>>> [0x7fd966fcd329] >>>>>>>>>>>>>>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) >>>>>>>>>>>>>>>>>> [0x7fd9671deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) >>>>>>>>>>>>>>>>>> [0x7fd9731ea218] ) 2-dict: dict is NULL [Invalid argument] >>>>>>>>>>>>>>>>>> ==> mnt-SITE_data3.log <=>>>>>>>>>>>>>>>>>> The message "E [MSGID: 101191] >>>>>>>>>>>>>>>>>> [event-epoll.c:671:event_dispatch_epoll_worker] 2-epoll: Failed to dispatch >>>>>>>>>>>>>>>>>> handler" repeated 413 times between [2019-01-30 20:36:23.881090] and >>>>>>>>>>>>>>>>>> [2019-01-30 20:38:20.015593] >>>>>>>>>>>>>>>>>> The message "I [MSGID: 108031] >>>>>>>>>>>>>>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data3-replicate-0: >>>>>>>>>>>>>>>>>> selecting local read_child SITE_data3-client-0" repeated 42 times between >>>>>>>>>>>>>>>>>> [2019-01-30 20:36:23.290287] and [2019-01-30 20:38:20.280306] >>>>>>>>>>>>>>>>>> ==> mnt-SITE_data1.log <=>>>>>>>>>>>>>>>>>> The message "I [MSGID: 108031] >>>>>>>>>>>>>>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data1-replicate-0: >>>>>>>>>>>>>>>>>> selecting local read_child SITE_data1-client-0" repeated 50 times between >>>>>>>>>>>>>>>>>> [2019-01-30 20:36:22.247367] and [2019-01-30 20:38:19.459789] >>>>>>>>>>>>>>>>>> The message "E [MSGID: 101191] >>>>>>>>>>>>>>>>>> [event-epoll.c:671:event_dispatch_epoll_worker] 2-epoll: Failed to dispatch >>>>>>>>>>>>>>>>>> handler" repeated 2654 times between [2019-01-30 20:36:22.667327] and >>>>>>>>>>>>>>>>>> [2019-01-30 20:38:20.546355] >>>>>>>>>>>>>>>>>> [2019-01-30 20:38:21.492319] I [MSGID: 108031] >>>>>>>>>>>>>>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data1-replicate-0: >>>>>>>>>>>>>>>>>> selecting local read_child SITE_data1-client-0 >>>>>>>>>>>>>>>>>> ==> mnt-SITE_data3.log <=>>>>>>>>>>>>>>>>>> [2019-01-30 20:38:22.349689] I [MSGID: 108031] >>>>>>>>>>>>>>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data3-replicate-0: >>>>>>>>>>>>>>>>>> selecting local read_child SITE_data3-client-0 >>>>>>>>>>>>>>>>>> ==> mnt-SITE_data1.log <=>>>>>>>>>>>>>>>>>> [2019-01-30 20:38:22.762941] E [MSGID: 101191] >>>>>>>>>>>>>>>>>> [event-epoll.c:671:event_dispatch_epoll_worker] 2-epoll: Failed to dispatch >>>>>>>>>>>>>>>>>> handler >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I'm hoping raising the issue here on the mailing list may >>>>>>>>>>>>>>>>> bring some additional eyeballs and get them both fixed. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Sincerely, >>>>>>>>>>>>>>>>> Artem >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>> Founder, Android Police <http://www.androidpolice.com>, APK >>>>>>>>>>>>>>>>> Mirror <http://www.apkmirror.com/>, Illogical Robot LLC >>>>>>>>>>>>>>>>> beerpla.net | +ArtemRussakovskii >>>>>>>>>>>>>>>>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR >>>>>>>>>>>>>>>>> <http://twitter.com/ArtemR> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Wed, Jan 30, 2019 at 12:26 PM Artem Russakovskii < >>>>>>>>>>>>>>>>> archon810 at gmail.com> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I found a similar issue here: >>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1313567. 
>>>>>>>>>>>>>>>>>> There's a comment from 3 days ago from someone else with 5.3 who started >>>>>>>>>>>>>>>>>> seeing the spam. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Here's the command that repeats over and over: >>>>>>>>>>>>>>>>>> [2019-01-30 20:23:24.481581] W [dict.c:761:dict_ref] >>>>>>>>>>>>>>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) >>>>>>>>>>>>>>>>>> [0x7fd966fcd329] >>>>>>>>>>>>>>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) >>>>>>>>>>>>>>>>>> [0x7fd9671deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) >>>>>>>>>>>>>>>>>> [0x7fd9731ea218] ) 2-dict: dict is NULL [Invalid argument] >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> +Milind Changire <mchangir at redhat.com> Can you check why >>>>>>>>>>>>>>>> this message is logged and send a fix? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Is there any fix for this issue? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Sincerely, >>>>>>>>>>>>>>>>>> Artem >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>> Founder, Android Police <http://www.androidpolice.com>, APK >>>>>>>>>>>>>>>>>> Mirror <http://www.apkmirror.com/>, Illogical Robot LLC >>>>>>>>>>>>>>>>>> beerpla.net | +ArtemRussakovskii >>>>>>>>>>>>>>>>>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR >>>>>>>>>>>>>>>>>> <http://twitter.com/ArtemR> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>>> Gluster-users mailing list >>>>>>>>>>>>>>>>> Gluster-users at gluster.org >>>>>>>>>>>>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-users >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>> Gluster-users mailing list >>>>>>>>>>>>>>> Gluster-users at gluster.org >>>>>>>>>>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-users >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> Amar Tumballi (amarts) >>>>>>>>>>>>>> >>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>> Gluster-users mailing list >>>>>>>>>>> Gluster-users at gluster.org >>>>>>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-users >>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>> Gluster-users mailing list >>>>> Gluster-users at gluster.org >>>>> https://lists.gluster.org/mailman/listinfo/gluster-users >>>> >>>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >> >>-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20190211/dee7dea3/attachment.html>
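The two workarounds discussed in the quoted thread come down to one volume option and one mount option. A minimal sketch of both, assuming a volume named SITE_data1 mounted at /mnt/SITE_data1 as in the logs above (substitute your own volume name and mount point):

# Diagnostic step: turn off the write-behind translator implicated in the
# corruption, then confirm the setting took effect.
gluster volume set SITE_data1 performance.write-behind off
gluster volume get SITE_data1 performance.write-behind

# Earlier suggestion from the thread: remount the FUSE client with the
# inode LRU limit disabled via the lru-limit mount option.
umount /mnt/SITE_data1
mount -t glusterfs -o lru-limit=0 localhost:/SITE_data1 /mnt/SITE_data1

Disabling write-behind trades some write performance for stability, so it is only meant to confirm the diagnosis until the patched release ships.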
Raghavendra Gowdappa
2019-Feb-12 05:34 UTC
[Gluster-users] Message repeated over and over after upgrade from 4.1 to 5.3: W [dict.c:761:dict_ref] (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) [0x7fd966fcd329] -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) [0x7fd9671deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) [0x7fd9731ea218] ) 2-dict: dict is NULL [Invalid argument]
On Tue, Feb 12, 2019 at 10:24 AM Artem Russakovskii <archon810 at gmail.com> wrote:

> Great job identifying the issue!
>
> Any ETA on the next release with the logging and crash fixes in it?

I've marked the write-behind corruption as a blocker for release-6. The logging fixes are already in the codebase.
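Since getting a usable core from the next crash comes up repeatedly in the quoted thread, here is a rough sketch of one way to make sure the FUSE client actually leaves one behind. The paths, and the assumption that resource limits propagate from the mounting shell to the glusterfs process, are illustrative rather than confirmed in the thread:

# Write core files to a predictable location.
sysctl -w kernel.core_pattern=/var/tmp/core.%e.%p.%t

# Lift the core-size limit, then remount so the glusterfs client process
# inherits it from this shell.
ulimit -c unlimited
umount /mnt/SITE_data1 && mount /mnt/SITE_data1

# After the next abort (signal 6, as in the traces above), extract a
# backtrace for the bug report.
gdb /usr/sbin/glusterfs /var/tmp/core.glusterfs.<pid>.<timestamp> \
    -ex 'thread apply all bt full' -ex 'quit'

On distributions where systemd-coredump owns kernel.core_pattern, coredumpctl can be used to retrieve the dump instead of overriding the pattern.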
>>>>>>> >>>>>>> >>>>>>> regards, >>>>>>> Nithya >>>>>>> >>>>>>> On Fri, 8 Feb 2019 at 06:51, Artem Russakovskii <archon810 at gmail.com> >>>>>>> wrote: >>>>>>> >>>>>>>> Sorry to disappoint, but the crash just happened again, so >>>>>>>> lru-limit=0 didn't help. >>>>>>>> >>>>>>>> Here's the snippet of the crash and the subsequent remount by monit. >>>>>>>> >>>>>>>> >>>>>>>> [2019-02-08 01:13:05.854391] W [dict.c:761:dict_ref] >>>>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) >>>>>>>> [0x7f4402b99329] >>>>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) >>>>>>>> [0x7f4402daaaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) >>>>>>>> [0x7f440b6b5218] ) 0-dict: dict is NULL [In >>>>>>>> valid argument] >>>>>>>> The message "I [MSGID: 108031] >>>>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 0-<SNIP>_data1-replicate-0: >>>>>>>> selecting local read_child <SNIP>_data1-client-3" repeated 39 times between >>>>>>>> [2019-02-08 01:11:18.043286] and [2019-02-08 01:13:07.915604] >>>>>>>> The message "E [MSGID: 101191] >>>>>>>> [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch >>>>>>>> handler" repeated 515 times between [2019-02-08 01:11:17.932515] and >>>>>>>> [2019-02-08 01:13:09.311554] >>>>>>>> pending frames: >>>>>>>> frame : type(1) op(LOOKUP) >>>>>>>> frame : type(0) op(0) >>>>>>>> patchset: git://git.gluster.org/glusterfs.git >>>>>>>> signal received: 6 >>>>>>>> time of crash: >>>>>>>> 2019-02-08 01:13:09 >>>>>>>> configuration details: >>>>>>>> argp 1 >>>>>>>> backtrace 1 >>>>>>>> dlfcn 1 >>>>>>>> libpthread 1 >>>>>>>> llistxattr 1 >>>>>>>> setfsid 1 >>>>>>>> spinlock 1 >>>>>>>> epoll.h 1 >>>>>>>> xattr.h 1 >>>>>>>> st_atim.tv_nsec 1 >>>>>>>> package-string: glusterfs 5.3 >>>>>>>> /usr/lib64/libglusterfs.so.0(+0x2764c)[0x7f440b6c064c] >>>>>>>> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7f440b6cacb6] >>>>>>>> /lib64/libc.so.6(+0x36160)[0x7f440a887160] >>>>>>>> /lib64/libc.so.6(gsignal+0x110)[0x7f440a8870e0] >>>>>>>> /lib64/libc.so.6(abort+0x151)[0x7f440a8886c1] >>>>>>>> /lib64/libc.so.6(+0x2e6fa)[0x7f440a87f6fa] >>>>>>>> /lib64/libc.so.6(+0x2e772)[0x7f440a87f772] >>>>>>>> /lib64/libpthread.so.0(pthread_mutex_lock+0x228)[0x7f440ac150b8] >>>>>>>> >>>>>>>> /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so(+0x5dc9d)[0x7f44036f8c9d] >>>>>>>> >>>>>>>> /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so(+0x70ba1)[0x7f440370bba1] >>>>>>>> >>>>>>>> /usr/lib64/glusterfs/5.3/xlator/protocol/client.so(+0x58f3f)[0x7f4403990f3f] >>>>>>>> /usr/lib64/libgfrpc.so.0(+0xe820)[0x7f440b48b820] >>>>>>>> /usr/lib64/libgfrpc.so.0(+0xeb6f)[0x7f440b48bb6f] >>>>>>>> /usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7f440b488063] >>>>>>>> >>>>>>>> /usr/lib64/glusterfs/5.3/rpc-transport/socket.so(+0xa0b2)[0x7f44050a80b2] >>>>>>>> /usr/lib64/libglusterfs.so.0(+0x854c3)[0x7f440b71e4c3] >>>>>>>> /lib64/libpthread.so.0(+0x7559)[0x7f440ac12559] >>>>>>>> /lib64/libc.so.6(clone+0x3f)[0x7f440a94981f] >>>>>>>> --------- >>>>>>>> [2019-02-08 01:13:35.628478] I [MSGID: 100030] >>>>>>>> [glusterfsd.c:2715:main] 0-/usr/sbin/glusterfs: Started running >>>>>>>> /usr/sbin/glusterfs version 5.3 (args: /usr/sbin/glusterfs --lru-limit=0 >>>>>>>> --process-name fuse --volfile-server=localhost --volfile-id=/<SNIP>_data1 >>>>>>>> /mnt/<SNIP>_data1) >>>>>>>> [2019-02-08 01:13:35.637830] I [MSGID: 101190] >>>>>>>> [event-epoll.c:622:event_dispatch_epoll_worker] 0-epoll: Started thread >>>>>>>> with index 1 >>>>>>>> [2019-02-08 01:13:35.651405] I 
[MSGID: 101190] >>>>>>>> [event-epoll.c:622:event_dispatch_epoll_worker] 0-epoll: Started thread >>>>>>>> with index 2 >>>>>>>> [2019-02-08 01:13:35.651628] I [MSGID: 101190] >>>>>>>> [event-epoll.c:622:event_dispatch_epoll_worker] 0-epoll: Started thread >>>>>>>> with index 3 >>>>>>>> [2019-02-08 01:13:35.651747] I [MSGID: 101190] >>>>>>>> [event-epoll.c:622:event_dispatch_epoll_worker] 0-epoll: Started thread >>>>>>>> with index 4 >>>>>>>> [2019-02-08 01:13:35.652575] I [MSGID: 114020] >>>>>>>> [client.c:2354:notify] 0-<SNIP>_data1-client-0: parent translators are >>>>>>>> ready, attempting connect on transport >>>>>>>> [2019-02-08 01:13:35.652978] I [MSGID: 114020] >>>>>>>> [client.c:2354:notify] 0-<SNIP>_data1-client-1: parent translators are >>>>>>>> ready, attempting connect on transport >>>>>>>> [2019-02-08 01:13:35.655197] I [MSGID: 114020] >>>>>>>> [client.c:2354:notify] 0-<SNIP>_data1-client-2: parent translators are >>>>>>>> ready, attempting connect on transport >>>>>>>> [2019-02-08 01:13:35.655497] I [MSGID: 114020] >>>>>>>> [client.c:2354:notify] 0-<SNIP>_data1-client-3: parent translators are >>>>>>>> ready, attempting connect on transport >>>>>>>> [2019-02-08 01:13:35.655527] I [rpc-clnt.c:2042:rpc_clnt_reconfig] >>>>>>>> 0-<SNIP>_data1-client-0: changing port to 49153 (from 0) >>>>>>>> Final graph: >>>>>>>> >>>>>>>> >>>>>>>> Sincerely, >>>>>>>> Artem >>>>>>>> >>>>>>>> -- >>>>>>>> Founder, Android Police <http://www.androidpolice.com>, APK Mirror >>>>>>>> <http://www.apkmirror.com/>, Illogical Robot LLC >>>>>>>> beerpla.net | +ArtemRussakovskii >>>>>>>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR >>>>>>>> <http://twitter.com/ArtemR> >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Feb 7, 2019 at 1:28 PM Artem Russakovskii < >>>>>>>> archon810 at gmail.com> wrote: >>>>>>>> >>>>>>>>> I've added the lru-limit=0 parameter to the mounts, and I see it's >>>>>>>>> taken effect correctly: >>>>>>>>> "/usr/sbin/glusterfs --lru-limit=0 --process-name fuse >>>>>>>>> --volfile-server=localhost --volfile-id=/<SNIP> /mnt/<SNIP>" >>>>>>>>> >>>>>>>>> Let's see if it stops crashing or not. >>>>>>>>> >>>>>>>>> Sincerely, >>>>>>>>> Artem >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Founder, Android Police <http://www.androidpolice.com>, APK Mirror >>>>>>>>> <http://www.apkmirror.com/>, Illogical Robot LLC >>>>>>>>> beerpla.net | +ArtemRussakovskii >>>>>>>>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR >>>>>>>>> <http://twitter.com/ArtemR> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, Feb 6, 2019 at 10:48 AM Artem Russakovskii < >>>>>>>>> archon810 at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Hi Nithya, >>>>>>>>>> >>>>>>>>>> Indeed, I upgraded from 4.1 to 5.3, at which point I started >>>>>>>>>> seeing crashes, and no further releases have been made yet. 
>>>>>>>>>> >>>>>>>>>> volume info: >>>>>>>>>> Type: Replicate >>>>>>>>>> Volume ID: ****SNIP**** >>>>>>>>>> Status: Started >>>>>>>>>> Snapshot Count: 0 >>>>>>>>>> Number of Bricks: 1 x 4 = 4 >>>>>>>>>> Transport-type: tcp >>>>>>>>>> Bricks: >>>>>>>>>> Brick1: ****SNIP**** >>>>>>>>>> Brick2: ****SNIP**** >>>>>>>>>> Brick3: ****SNIP**** >>>>>>>>>> Brick4: ****SNIP**** >>>>>>>>>> Options Reconfigured: >>>>>>>>>> cluster.quorum-count: 1 >>>>>>>>>> cluster.quorum-type: fixed >>>>>>>>>> network.ping-timeout: 5 >>>>>>>>>> network.remote-dio: enable >>>>>>>>>> performance.rda-cache-limit: 256MB >>>>>>>>>> performance.readdir-ahead: on >>>>>>>>>> performance.parallel-readdir: on >>>>>>>>>> network.inode-lru-limit: 500000 >>>>>>>>>> performance.md-cache-timeout: 600 >>>>>>>>>> performance.cache-invalidation: on >>>>>>>>>> performance.stat-prefetch: on >>>>>>>>>> features.cache-invalidation-timeout: 600 >>>>>>>>>> features.cache-invalidation: on >>>>>>>>>> cluster.readdir-optimize: on >>>>>>>>>> performance.io-thread-count: 32 >>>>>>>>>> server.event-threads: 4 >>>>>>>>>> client.event-threads: 4 >>>>>>>>>> performance.read-ahead: off >>>>>>>>>> cluster.lookup-optimize: on >>>>>>>>>> performance.cache-size: 1GB >>>>>>>>>> cluster.self-heal-daemon: enable >>>>>>>>>> transport.address-family: inet >>>>>>>>>> nfs.disable: on >>>>>>>>>> performance.client-io-threads: on >>>>>>>>>> cluster.granular-entry-heal: enable >>>>>>>>>> cluster.data-self-heal-algorithm: full >>>>>>>>>> >>>>>>>>>> Sincerely, >>>>>>>>>> Artem >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Founder, Android Police <http://www.androidpolice.com>, APK >>>>>>>>>> Mirror <http://www.apkmirror.com/>, Illogical Robot LLC >>>>>>>>>> beerpla.net | +ArtemRussakovskii >>>>>>>>>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR >>>>>>>>>> <http://twitter.com/ArtemR> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Wed, Feb 6, 2019 at 12:20 AM Nithya Balachandran < >>>>>>>>>> nbalacha at redhat.com> wrote: >>>>>>>>>> >>>>>>>>>>> Hi Artem, >>>>>>>>>>> >>>>>>>>>>> Do you still see the crashes with 5.3? If yes, please try mount >>>>>>>>>>> the volume using the mount option lru-limit=0 and see if that helps. We are >>>>>>>>>>> looking into the crashes and will update when have a fix. >>>>>>>>>>> >>>>>>>>>>> Also, please provide the gluster volume info for the volume in >>>>>>>>>>> question. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> regards, >>>>>>>>>>> Nithya >>>>>>>>>>> >>>>>>>>>>> On Tue, 5 Feb 2019 at 05:31, Artem Russakovskii < >>>>>>>>>>> archon810 at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> The fuse crash happened two more times, but this time monit >>>>>>>>>>>> helped recover within 1 minute, so it's a great workaround for now. >>>>>>>>>>>> >>>>>>>>>>>> What's odd is that the crashes are only happening on one of 4 >>>>>>>>>>>> servers, and I don't know why. >>>>>>>>>>>> >>>>>>>>>>>> Sincerely, >>>>>>>>>>>> Artem >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Founder, Android Police <http://www.androidpolice.com>, APK >>>>>>>>>>>> Mirror <http://www.apkmirror.com/>, Illogical Robot LLC >>>>>>>>>>>> beerpla.net | +ArtemRussakovskii >>>>>>>>>>>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR >>>>>>>>>>>> <http://twitter.com/ArtemR> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Sat, Feb 2, 2019 at 12:14 PM Artem Russakovskii < >>>>>>>>>>>> archon810 at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> The fuse crash happened again yesterday, to another volume. >>>>>>>>>>>>> Are there any mount options that could help mitigate this? 
>>>>>>>>>>>>> >>>>>>>>>>>>> In the meantime, I set up a monit (https://mmonit.com/monit/) >>>>>>>>>>>>> task to watch and restart the mount, which works and recovers the mount >>>>>>>>>>>>> point within a minute. Not ideal, but a temporary workaround. >>>>>>>>>>>>> >>>>>>>>>>>>> By the way, the way to reproduce this "Transport endpoint is >>>>>>>>>>>>> not connected" condition for testing purposes is to kill -9 the right >>>>>>>>>>>>> "glusterfs --process-name fuse" process. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> monit check: >>>>>>>>>>>>> check filesystem glusterfs_data1 with path /mnt/glusterfs_data1 >>>>>>>>>>>>> start program = "/bin/mount /mnt/glusterfs_data1" >>>>>>>>>>>>> stop program = "/bin/umount /mnt/glusterfs_data1" >>>>>>>>>>>>> if space usage > 90% for 5 times within 15 cycles >>>>>>>>>>>>> then alert else if succeeded for 10 cycles then alert >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> stack trace: >>>>>>>>>>>>> [2019-02-01 23:22:00.312894] W [dict.c:761:dict_ref] >>>>>>>>>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) >>>>>>>>>>>>> [0x7fa0249e4329] >>>>>>>>>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) >>>>>>>>>>>>> [0x7fa024bf5af5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) >>>>>>>>>>>>> [0x7fa02cf5b218] ) 0-dict: dict is NULL [Invalid argument] >>>>>>>>>>>>> [2019-02-01 23:22:00.314051] W [dict.c:761:dict_ref] >>>>>>>>>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) >>>>>>>>>>>>> [0x7fa0249e4329] >>>>>>>>>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) >>>>>>>>>>>>> [0x7fa024bf5af5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) >>>>>>>>>>>>> [0x7fa02cf5b218] ) 0-dict: dict is NULL [Invalid argument] >>>>>>>>>>>>> The message "E [MSGID: 101191] >>>>>>>>>>>>> [event-epoll.c:671:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch >>>>>>>>>>>>> handler" repeated 26 times between [2019-02-01 23:21:20.857333] and >>>>>>>>>>>>> [2019-02-01 23:21:56.164427] >>>>>>>>>>>>> The message "I [MSGID: 108031] >>>>>>>>>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 0-SITE_data3-replicate-0: >>>>>>>>>>>>> selecting local read_child SITE_data3-client-3" repeated 27 times between >>>>>>>>>>>>> [2019-02-01 23:21:11.142467] and [2019-02-01 23:22:03.474036] >>>>>>>>>>>>> pending frames: >>>>>>>>>>>>> frame : type(1) op(LOOKUP) >>>>>>>>>>>>> frame : type(0) op(0) >>>>>>>>>>>>> patchset: git://git.gluster.org/glusterfs.git >>>>>>>>>>>>> signal received: 6 >>>>>>>>>>>>> time of crash: >>>>>>>>>>>>> 2019-02-01 23:22:03 >>>>>>>>>>>>> configuration details: >>>>>>>>>>>>> argp 1 >>>>>>>>>>>>> backtrace 1 >>>>>>>>>>>>> dlfcn 1 >>>>>>>>>>>>> libpthread 1 >>>>>>>>>>>>> llistxattr 1 >>>>>>>>>>>>> setfsid 1 >>>>>>>>>>>>> spinlock 1 >>>>>>>>>>>>> epoll.h 1 >>>>>>>>>>>>> xattr.h 1 >>>>>>>>>>>>> st_atim.tv_nsec 1 >>>>>>>>>>>>> package-string: glusterfs 5.3 >>>>>>>>>>>>> /usr/lib64/libglusterfs.so.0(+0x2764c)[0x7fa02cf6664c] >>>>>>>>>>>>> >>>>>>>>>>>>> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7fa02cf70cb6] >>>>>>>>>>>>> /lib64/libc.so.6(+0x36160)[0x7fa02c12d160] >>>>>>>>>>>>> /lib64/libc.so.6(gsignal+0x110)[0x7fa02c12d0e0] >>>>>>>>>>>>> /lib64/libc.so.6(abort+0x151)[0x7fa02c12e6c1] >>>>>>>>>>>>> /lib64/libc.so.6(+0x2e6fa)[0x7fa02c1256fa] >>>>>>>>>>>>> /lib64/libc.so.6(+0x2e772)[0x7fa02c125772] >>>>>>>>>>>>> >>>>>>>>>>>>> /lib64/libpthread.so.0(pthread_mutex_lock+0x228)[0x7fa02c4bb0b8] >>>>>>>>>>>>> >>>>>>>>>>>>> 
/usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so(+0x5dc9d)[0x7fa025543c9d] >>>>>>>>>>>>> >>>>>>>>>>>>> /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so(+0x70ba1)[0x7fa025556ba1] >>>>>>>>>>>>> >>>>>>>>>>>>> /usr/lib64/glusterfs/5.3/xlator/protocol/client.so(+0x58f3f)[0x7fa0257dbf3f] >>>>>>>>>>>>> /usr/lib64/libgfrpc.so.0(+0xe820)[0x7fa02cd31820] >>>>>>>>>>>>> /usr/lib64/libgfrpc.so.0(+0xeb6f)[0x7fa02cd31b6f] >>>>>>>>>>>>> >>>>>>>>>>>>> /usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fa02cd2e063] >>>>>>>>>>>>> >>>>>>>>>>>>> /usr/lib64/glusterfs/5.3/rpc-transport/socket.so(+0xa0b2)[0x7fa02694e0b2] >>>>>>>>>>>>> /usr/lib64/libglusterfs.so.0(+0x854c3)[0x7fa02cfc44c3] >>>>>>>>>>>>> /lib64/libpthread.so.0(+0x7559)[0x7fa02c4b8559] >>>>>>>>>>>>> /lib64/libc.so.6(clone+0x3f)[0x7fa02c1ef81f] >>>>>>>>>>>>> >>>>>>>>>>>>> Sincerely, >>>>>>>>>>>>> Artem >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> Founder, Android Police <http://www.androidpolice.com>, APK >>>>>>>>>>>>> Mirror <http://www.apkmirror.com/>, Illogical Robot LLC >>>>>>>>>>>>> beerpla.net | +ArtemRussakovskii >>>>>>>>>>>>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR >>>>>>>>>>>>> <http://twitter.com/ArtemR> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Fri, Feb 1, 2019 at 9:03 AM Artem Russakovskii < >>>>>>>>>>>>> archon810 at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>> >>>>>>>>>>>>>> The first (and so far only) crash happened at 2am the next >>>>>>>>>>>>>> day after we upgraded, on only one of four servers and only to one of two >>>>>>>>>>>>>> mounts. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I have no idea what caused it, but yeah, we do have a pretty >>>>>>>>>>>>>> busy site (apkmirror.com), and it caused a disruption for >>>>>>>>>>>>>> any uploads or downloads from that server until I woke up and fixed the >>>>>>>>>>>>>> mount. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I wish I could be more helpful but all I have is that stack >>>>>>>>>>>>>> trace. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'm glad it's a blocker and will hopefully be resolved soon. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Thu, Jan 31, 2019, 7:26 PM Amar Tumballi Suryanarayan < >>>>>>>>>>>>>> atumball at redhat.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Artem, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Opened https://bugzilla.redhat.com/show_bug.cgi?id=1671603 >>>>>>>>>>>>>>> (ie, as a clone of other bugs where recent discussions happened), and >>>>>>>>>>>>>>> marked it as a blocker for glusterfs-5.4 release. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> We already have fixes for log flooding - >>>>>>>>>>>>>>> https://review.gluster.org/22128, and are the process of >>>>>>>>>>>>>>> identifying and fixing the issue seen with crash. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Can you please tell if the crashes happened as soon as >>>>>>>>>>>>>>> upgrade ? or was there any particular pattern you observed before the crash. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -Amar >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Thu, Jan 31, 2019 at 11:40 PM Artem Russakovskii < >>>>>>>>>>>>>>> archon810 at gmail.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Within 24 hours after updating from rock solid 4.1 to 5.3, >>>>>>>>>>>>>>>> I already got a crash which others have mentioned in >>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1313567 and >>>>>>>>>>>>>>>> had to unmount, kill gluster, and remount: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> [2019-01-31 09:38:04.317604] W [dict.c:761:dict_ref] >>>>>>>>>>>>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) >>>>>>>>>>>>>>>> [0x7fcccafcd329] >>>>>>>>>>>>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) >>>>>>>>>>>>>>>> [0x7fcccb1deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) >>>>>>>>>>>>>>>> [0x7fccd705b218] ) 2-dict: dict is NULL [Invalid argument] >>>>>>>>>>>>>>>> [2019-01-31 09:38:04.319308] W [dict.c:761:dict_ref] >>>>>>>>>>>>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) >>>>>>>>>>>>>>>> [0x7fcccafcd329] >>>>>>>>>>>>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) >>>>>>>>>>>>>>>> [0x7fcccb1deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) >>>>>>>>>>>>>>>> [0x7fccd705b218] ) 2-dict: dict is NULL [Invalid argument] >>>>>>>>>>>>>>>> [2019-01-31 09:38:04.320047] W [dict.c:761:dict_ref] >>>>>>>>>>>>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) >>>>>>>>>>>>>>>> [0x7fcccafcd329] >>>>>>>>>>>>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) >>>>>>>>>>>>>>>> [0x7fcccb1deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) >>>>>>>>>>>>>>>> [0x7fccd705b218] ) 2-dict: dict is NULL [Invalid argument] >>>>>>>>>>>>>>>> [2019-01-31 09:38:04.320677] W [dict.c:761:dict_ref] >>>>>>>>>>>>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) >>>>>>>>>>>>>>>> [0x7fcccafcd329] >>>>>>>>>>>>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) >>>>>>>>>>>>>>>> [0x7fcccb1deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) >>>>>>>>>>>>>>>> [0x7fccd705b218] ) 2-dict: dict is NULL [Invalid argument] >>>>>>>>>>>>>>>> The message "I [MSGID: 108031] >>>>>>>>>>>>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data1-replicate-0: >>>>>>>>>>>>>>>> selecting local read_child SITE_data1-client-3" repeated 5 times between >>>>>>>>>>>>>>>> [2019-01-31 09:37:54.751905] and [2019-01-31 09:38:03.958061] >>>>>>>>>>>>>>>> The message "E [MSGID: 101191] >>>>>>>>>>>>>>>> [event-epoll.c:671:event_dispatch_epoll_worker] 2-epoll: Failed to dispatch >>>>>>>>>>>>>>>> handler" repeated 72 times between [2019-01-31 09:37:53.746741] and >>>>>>>>>>>>>>>> [2019-01-31 09:38:04.696993] >>>>>>>>>>>>>>>> pending frames: >>>>>>>>>>>>>>>> frame : type(1) op(READ) >>>>>>>>>>>>>>>> frame : type(1) op(OPEN) >>>>>>>>>>>>>>>> frame : type(0) op(0) >>>>>>>>>>>>>>>> patchset: git://git.gluster.org/glusterfs.git >>>>>>>>>>>>>>>> signal received: 6 >>>>>>>>>>>>>>>> time of crash: >>>>>>>>>>>>>>>> 2019-01-31 09:38:04 >>>>>>>>>>>>>>>> configuration details: >>>>>>>>>>>>>>>> argp 1 >>>>>>>>>>>>>>>> backtrace 1 >>>>>>>>>>>>>>>> dlfcn 1 >>>>>>>>>>>>>>>> libpthread 1 >>>>>>>>>>>>>>>> llistxattr 1 >>>>>>>>>>>>>>>> setfsid 1 >>>>>>>>>>>>>>>> spinlock 1 >>>>>>>>>>>>>>>> epoll.h 1 >>>>>>>>>>>>>>>> xattr.h 1 >>>>>>>>>>>>>>>> st_atim.tv_nsec 1 >>>>>>>>>>>>>>>> package-string: glusterfs 5.3 >>>>>>>>>>>>>>>> 
/usr/lib64/libglusterfs.so.0(+0x2764c)[0x7fccd706664c] >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7fccd7070cb6] >>>>>>>>>>>>>>>> /lib64/libc.so.6(+0x36160)[0x7fccd622d160] >>>>>>>>>>>>>>>> /lib64/libc.so.6(gsignal+0x110)[0x7fccd622d0e0] >>>>>>>>>>>>>>>> /lib64/libc.so.6(abort+0x151)[0x7fccd622e6c1] >>>>>>>>>>>>>>>> /lib64/libc.so.6(+0x2e6fa)[0x7fccd62256fa] >>>>>>>>>>>>>>>> /lib64/libc.so.6(+0x2e772)[0x7fccd6225772] >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> /lib64/libpthread.so.0(pthread_mutex_lock+0x228)[0x7fccd65bb0b8] >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so(+0x32c4d)[0x7fcccbb01c4d] >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> /usr/lib64/glusterfs/5.3/xlator/protocol/client.so(+0x65778)[0x7fcccbdd1778] >>>>>>>>>>>>>>>> /usr/lib64/libgfrpc.so.0(+0xe820)[0x7fccd6e31820] >>>>>>>>>>>>>>>> /usr/lib64/libgfrpc.so.0(+0xeb6f)[0x7fccd6e31b6f] >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> /usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fccd6e2e063] >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> /usr/lib64/glusterfs/5.3/rpc-transport/socket.so(+0xa0b2)[0x7fccd0b7e0b2] >>>>>>>>>>>>>>>> /usr/lib64/libglusterfs.so.0(+0x854c3)[0x7fccd70c44c3] >>>>>>>>>>>>>>>> /lib64/libpthread.so.0(+0x7559)[0x7fccd65b8559] >>>>>>>>>>>>>>>> /lib64/libc.so.6(clone+0x3f)[0x7fccd62ef81f] >>>>>>>>>>>>>>>> --------- >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Do the pending patches fix the crash or only the repeated >>>>>>>>>>>>>>>> warnings? I'm running glusterfs on OpenSUSE 15.0 installed via >>>>>>>>>>>>>>>> http://download.opensuse.org/repositories/home:/glusterfs:/Leap15-5/openSUSE_Leap_15.0/, >>>>>>>>>>>>>>>> not too sure how to make it core dump. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> If it's not fixed by the patches above, has anyone already >>>>>>>>>>>>>>>> opened a ticket for the crashes that I can join and monitor? This is going >>>>>>>>>>>>>>>> to create a massive problem for us since production systems are crashing. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Sincerely, >>>>>>>>>>>>>>>> Artem >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> Founder, Android Police <http://www.androidpolice.com>, APK >>>>>>>>>>>>>>>> Mirror <http://www.apkmirror.com/>, Illogical Robot LLC >>>>>>>>>>>>>>>> beerpla.net | +ArtemRussakovskii >>>>>>>>>>>>>>>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR >>>>>>>>>>>>>>>> <http://twitter.com/ArtemR> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Wed, Jan 30, 2019 at 6:37 PM Raghavendra Gowdappa < >>>>>>>>>>>>>>>> rgowdapp at redhat.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Thu, Jan 31, 2019 at 2:14 AM Artem Russakovskii < >>>>>>>>>>>>>>>>> archon810 at gmail.com> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Also, not sure if related or not, but I got a ton of >>>>>>>>>>>>>>>>>> these "Failed to dispatch handler" in my logs as well. Many people have >>>>>>>>>>>>>>>>>> been commenting about this issue here >>>>>>>>>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1651246. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> https://review.gluster.org/#/c/glusterfs/+/22046/ >>>>>>>>>>>>>>>>> addresses this. 
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ==> mnt-SITE_data1.log <==
>>>>>>>>>>>>>>>>>>> [2019-01-30 20:38:20.783713] W [dict.c:761:dict_ref] (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) [0x7fd966fcd329] -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) [0x7fd9671deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) [0x7fd9731ea218] ) 2-dict: dict is NULL [Invalid argument]
>>>>>>>>>>>>>>>>>>> ==> mnt-SITE_data3.log <==
>>>>>>>>>>>>>>>>>>> The message "E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 2-epoll: Failed to dispatch handler" repeated 413 times between [2019-01-30 20:36:23.881090] and [2019-01-30 20:38:20.015593]
>>>>>>>>>>>>>>>>>>> The message "I [MSGID: 108031] [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data3-replicate-0: selecting local read_child SITE_data3-client-0" repeated 42 times between [2019-01-30 20:36:23.290287] and [2019-01-30 20:38:20.280306]
>>>>>>>>>>>>>>>>>>> ==> mnt-SITE_data1.log <==
>>>>>>>>>>>>>>>>>>> The message "I [MSGID: 108031] [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data1-replicate-0: selecting local read_child SITE_data1-client-0" repeated 50 times between [2019-01-30 20:36:22.247367] and [2019-01-30 20:38:19.459789]
>>>>>>>>>>>>>>>>>>> The message "E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 2-epoll: Failed to dispatch handler" repeated 2654 times between [2019-01-30 20:36:22.667327] and [2019-01-30 20:38:20.546355]
>>>>>>>>>>>>>>>>>>> [2019-01-30 20:38:21.492319] I [MSGID: 108031] [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data1-replicate-0: selecting local read_child SITE_data1-client-0
>>>>>>>>>>>>>>>>>>> ==> mnt-SITE_data3.log <==
>>>>>>>>>>>>>>>>>>> [2019-01-30 20:38:22.349689] I [MSGID: 108031] [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data3-replicate-0: selecting local read_child SITE_data3-client-0
>>>>>>>>>>>>>>>>>>> ==> mnt-SITE_data1.log <==
>>>>>>>>>>>>>>>>>>> [2019-01-30 20:38:22.762941] E [MSGID: 101191] [event-epoll.c:671:event_dispatch_epoll_worker] 2-epoll: Failed to dispatch handler
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'm hoping raising the issue here on the mailing list may bring some additional eyeballs and get them both fixed.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Sincerely,
>>>>>>>>>>>>>>>>>> Artem
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>> Founder, Android Police <http://www.androidpolice.com>, APK Mirror <http://www.apkmirror.com/>, Illogical Robot LLC
>>>>>>>>>>>>>>>>>> beerpla.net | +ArtemRussakovskii <https://plus.google.com/+ArtemRussakovskii> | @ArtemR <http://twitter.com/ArtemR>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Jan 30, 2019 at 12:26 PM Artem Russakovskii <archon810 at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I found a similar issue here: https://bugzilla.redhat.com/show_bug.cgi?id=1313567.
>>>>>>>>>>>>>>>>>>> There's a comment from 3 days ago from someone else with 5.3 who started seeing the spam.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Here's the message that repeats over and over:
>>>>>>>>>>>>>>>>>>> [2019-01-30 20:23:24.481581] W [dict.c:761:dict_ref] (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) [0x7fd966fcd329] -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) [0x7fd9671deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) [0x7fd9731ea218] ) 2-dict: dict is NULL [Invalid argument]
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> +Milind Changire <mchangir at redhat.com> Can you check why this message is logged and send a fix?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Is there any fix for this issue?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Sincerely,
>>>>>>>>>>>>>>>>>>> Artem
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> Founder, Android Police <http://www.androidpolice.com>, APK Mirror <http://www.apkmirror.com/>, Illogical Robot LLC
>>>>>>>>>>>>>>>>>>> beerpla.net | +ArtemRussakovskii <https://plus.google.com/+ArtemRussakovskii> | @ArtemR <http://twitter.com/ArtemR>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Amar Tumballi (amarts)
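For readers landing on this thread and wondering, as asked above, how to make the fuse client produce a core dump: below is a minimal sketch of one common approach. The mount point, server name, and core directory are assumptions for illustration, not details taken from this thread.

# Sketch only: allow core dumps for a glusterfs fuse client started from
# this shell. Paths and names below are placeholders, not values from the
# thread.

# Allow unlimited core file size for processes launched from this shell.
ulimit -c unlimited

# Send cores to a predictable location (the path is an example).
mkdir -p /var/tmp/cores
sysctl -w kernel.core_pattern=/var/tmp/cores/core.%e.%p.%t

# Remount the volume so a fresh glusterfs client process inherits the limit.
# SERVER and SITE_data1 stand in for the real server and volume names.
umount /mnt/SITE_data1
mount -t glusterfs SERVER:/SITE_data1 /mnt/SITE_data1

# After the next crash, a core file should appear under /var/tmp/cores; it
# can then be inspected with gdb against the glusterfs binary plus the
# matching debuginfo packages.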