David Spisla
2019-May-17 07:50 UTC
[Gluster-users] Brick-Xlators crashes after Set-RO and Read
Hello Vijay,
thank you for the clarification. Yes, there is an unconditional dereference
of stbuf. It seems plausible that this causes the crash. I think a check
like this should help:
    if (buf == NULL) {
        goto out;
    }
    map_atime_from_server(this, buf);
Is there a reason why buf can be NULL?
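For context, here is a simplified sketch (not the actual code) of how the
guard could sit in worm_lookup_cbk, also bailing out when op_ret signals an
error, as you pointed out. It assumes the standard lookup callback signature
visible in the backtrace; the unwind at the end is only illustrative:

    /* Sketch only: assumes the usual glusterfs xlator headers; the real
     * callback does more on the success path. */
    int32_t
    worm_lookup_cbk(call_frame_t *frame, void *cookie, xlator_t *this,
                    int32_t op_ret, int32_t op_errno, inode_t *inode,
                    struct iatt *buf, dict_t *xdata, struct iatt *postparent)
    {
        /* On a failed LOOKUP (op_ret < 0) the server sends no valid iatt,
         * so buf and postparent arrive as NULL; skip the atime mapping
         * instead of dereferencing them. */
        if (op_ret < 0 || buf == NULL)
            goto out;

        map_atime_from_server(this, buf);

    out:
        STACK_UNWIND_STRICT(lookup, frame, op_ret, op_errno, inode, buf,
                            xdata, postparent);
        return 0;
    }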
Regards
David Spisla
On Fri, May 17, 2019 at 01:51 Vijay Bellur <vbellur at redhat.com> wrote:
> Hello David,
>
> From the backtrace it looks like stbuf is NULL in map_atime_from_server()
> as worm_lookup_cbk has got an error (op_ret = -1, op_errno = 13). Can you
> please check if there is an unconditional dereference of stbuf in
> map_atime_from_server()?
>
> Regards,
> Vijay
>
> On Thu, May 16, 2019 at 2:36 AM David Spisla <spisla80 at gmail.com> wrote:
>
>> Hello Vijay,
>>
>> yes, we are using custom patches. It's a helper function, which is
>> defined in xlator_helper.c and used in worm_lookup_cbk.
>> Do you think this could be the problem? The function only manipulates
>> the atime in struct iatt.
>>
>> Regards
>> David Spisla
>>
>> On Thu, May 16, 2019 at 10:05 Vijay Bellur <vbellur at redhat.com> wrote:
>>
>>> Hello David,
>>>
>>> Do you have any custom patches in your deployment? I looked up v5.5 but
>>> could not find the following functions referred to in the core:
>>>
>>> map_atime_from_server()
>>> worm_lookup_cbk()
>>>
>>> Neither do I see xlator_helper.c in the codebase.
>>>
>>> Thanks,
>>> Vijay
>>>
>>>
>>> #0  map_atime_from_server (this=0x7fdef401af00, stbuf=0x0) at ../../../../xlators/lib/src/xlator_helper.c:21
>>>         __FUNCTION__ = "map_to_atime_from_server"
>>> #1  0x00007fdef39a0382 in worm_lookup_cbk (frame=frame@entry=0x7fdeac0015c8, cookie=<optimized out>, this=0x7fdef401af00, op_ret=op_ret@entry=-1, op_errno=op_errno@entry=13, inode=inode@entry=0x0, buf=0x0, xdata=0x0, postparent=0x0) at worm.c:531
>>>         priv = 0x7fdef4075378
>>>         ret = 0
>>>         __FUNCTION__ = "worm_lookup_cbk"
>>>
>>> On Thu, May 16, 2019 at 12:53 AM David Spisla <spisla80 at gmail.com> wrote:
>>>
>>>> Hello Vijay,
>>>>
>>>> I could reproduce the issue. After doing a simple DIR listing from
>>>> Win10 PowerShell, all brick processes crash. It's not the same scenario
>>>> mentioned before, but the crash report in the brick logs is the same.
>>>> Attached you find the backtrace.
>>>>
>>>> Regards
>>>> David Spisla
>>>>
>>>> On Tue, May 7, 2019 at 20:08 Vijay Bellur <vbellur at redhat.com> wrote:
>>>>
>>>>> Hello David,
>>>>>
>>>>> On Tue, May 7, 2019 at 2:16 AM David Spisla <spisla80 at gmail.com> wrote:
>>>>>
>>>>>> Hello Vijay,
>>>>>>
>>>>>> how can I create such a core file? Or will it be created
>>>>>> automatically if a gluster process crashes?
>>>>>> Maybe you can give me a hint and I will try to get a backtrace.
>>>>>>
>>>>>
>>>>> Generation of a core file is dependent on the system configuration.
>>>>> `man 5 core` contains useful information to generate a core file in a
>>>>> directory. Once a core file is generated, you can use gdb to get a
>>>>> backtrace of all threads (using "thread apply all bt full").
>>>>>
>>>>>
>>>>>> Unfortunately this bug is not easy to reproduce because it appears
>>>>>> only sometimes.
>>>>>>
>>>>>
>>>>> If the bug is not easy to reproduce, having a backtrace from the
>>>>> generated core would be very useful!
>>>>>
>>>>> Thanks,
>>>>> Vijay
>>>>>
>>>>>
>>>>>>
>>>>>> Regards
>>>>>> David Spisla
>>>>>>
>>>>>> On Mon, May 6, 2019 at 19:48 Vijay Bellur <vbellur at redhat.com> wrote:
>>>>>>
>>>>>>> Thank you for the report, David. Do you have core files available on
>>>>>>> any of the servers? If yes, would it be possible for you to provide a
>>>>>>> backtrace?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Vijay
>>>>>>>
>>>>>>> On Mon, May 6, 2019 at 3:09 AM David Spisla <spisla80 at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello folks,
>>>>>>>>
>>>>>>>> we have a client application (runs on Win10) which does some FOPs
>>>>>>>> on a gluster volume which is accessed via SMB.
>>>>>>>>
>>>>>>>> *Scenario 1* is a READ operation which reads all files successively
>>>>>>>> and checks if the files' data was correctly copied. While doing
>>>>>>>> this, all brick processes crash and one finds this crash report in
>>>>>>>> every brick log:
>>>>>>>>
>>>>>>>>> CTX_ID:a0359502-2c76-4fee-8cb9-365679dc690e-GRAPH_ID:0-PID:32934-HOST:XX-XXXXX-XX-XX-PC_NAME:shortterm-client-2-RECON_NO:-0, gfid: 00000000-0000-0000-0000-000000000001, req(uid:2000,gid:2000,perm:1,ngrps:1), ctx(uid:0,gid:0,in-groups:0,perm:700,updated-fop:LOOKUP, acl:-) [Permission denied]
>>>>>>>>> pending frames:
>>>>>>>>> frame : type(0) op(27)
>>>>>>>>> frame : type(0) op(40)
>>>>>>>>> patchset: git://git.gluster.org/glusterfs.git
>>>>>>>>> signal received: 11
>>>>>>>>> time of crash:
>>>>>>>>> 2019-04-16 08:32:21
>>>>>>>>> configuration details:
>>>>>>>>> argp 1
>>>>>>>>> backtrace 1
>>>>>>>>> dlfcn 1
>>>>>>>>> libpthread 1
>>>>>>>>> llistxattr 1
>>>>>>>>> setfsid 1
>>>>>>>>> spinlock 1
>>>>>>>>> epoll.h 1
>>>>>>>>> xattr.h 1
>>>>>>>>> st_atim.tv_nsec 1
>>>>>>>>> package-string: glusterfs 5.5
>>>>>>>>> /usr/lib64/libglusterfs.so.0(+0x2764c)[0x7f9a5bd4d64c]
>>>>>>>>> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7f9a5bd57d26]
>>>>>>>>> /lib64/libc.so.6(+0x361a0)[0x7f9a5af141a0]
>>>>>>>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0xb910)[0x7f9a4ef0e910]
>>>>>>>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x8118)[0x7f9a4ef0b118]
>>>>>>>>> /usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0x128d6)[0x7f9a4f1278d6]
>>>>>>>>> /usr/lib64/glusterfs/5.5/xlator/features/access-control.so(+0x575b)[0x7f9a4f35975b]
>>>>>>>>> /usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0xb3b3)[0x7f9a4f1203b3]
>>>>>>>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x85b2)[0x7f9a4ef0b5b2]
>>>>>>>>> /usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7f9a5bdd7b6c]
>>>>>>>>> /usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7f9a5bdd7b6c]
>>>>>>>>> /usr/lib64/glusterfs/5.5/xlator/features/upcall.so(+0xf548)[0x7f9a4e8cf548]
>>>>>>>>> /usr/lib64/libglusterfs.so.0(default_lookup_resume+0x1e2)[0x7f9a5bdefc22]
>>>>>>>>> /usr/lib64/libglusterfs.so.0(call_resume+0x75)[0x7f9a5bd733a5]
>>>>>>>>> /usr/lib64/glusterfs/5.5/xlator/performance/io-threads.so(+0x6088)[0x7f9a4e6b7088]
>>>>>>>>> /lib64/libpthread.so.0(+0x7569)[0x7f9a5b29f569]
>>>>>>>>> /lib64/libc.so.6(clone+0x3f)[0x7f9a5afd69af]
>>>>>>>>>
>>>>>>>> *Scenario 2*: The application just sets Read-Only on each file
>>>>>>>> successively. After the 70th file was set, all the bricks crash and,
>>>>>>>> again, one can read this crash report in every brick log:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [2019-05-02 07:43:39.953591] I [MSGID: 139001] [posix-acl.c:263:posix_acl_log_permit_denied] 0-longterm-access-control: client: CTX_ID:21aa9c75-3a5f-41f9-925b-48e4c80bd24a-GRAPH_ID:0-PID:16325-HOST:XXX-X-X-XXX-PC_NAME:longterm-client-0-RECON_NO:-0, gfid: 00000000-0000-0000-0000-000000000001, req(uid:2000,gid:2000,perm:1,ngrps:1), ctx(uid:0,gid:0,in-groups:0,perm:700,updated-fop:LOOKUP, acl:-) [Permission denied]
>>>>>>>>>
>>>>>>>>> pending frames:
>>>>>>>>>
>>>>>>>>> frame : type(0) op(27)
>>>>>>>>>
>>>>>>>>> patchset: git://git.gluster.org/glusterfs.git
>>>>>>>>>
>>>>>>>>> signal received: 11
>>>>>>>>>
>>>>>>>>> time of crash:
>>>>>>>>>
>>>>>>>>> 2019-05-02 07:43:39
>>>>>>>>>
>>>>>>>>> configuration details:
>>>>>>>>>
>>>>>>>>> argp 1
>>>>>>>>>
>>>>>>>>> backtrace 1
>>>>>>>>>
>>>>>>>>> dlfcn 1
>>>>>>>>>
>>>>>>>>> libpthread 1
>>>>>>>>>
>>>>>>>>> llistxattr 1
>>>>>>>>>
>>>>>>>>> setfsid 1
>>>>>>>>>
>>>>>>>>> spinlock 1
>>>>>>>>>
>>>>>>>>> epoll.h 1
>>>>>>>>>
>>>>>>>>> xattr.h 1
>>>>>>>>>
>>>>>>>>> st_atim.tv_nsec 1
>>>>>>>>>
>>>>>>>>> package-string: glusterfs 5.5
>>>>>>>>> /usr/lib64/libglusterfs.so.0(+0x2764c)[0x7fbb3f0b364c]
>>>>>>>>> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7fbb3f0bdd26]
>>>>>>>>> /lib64/libc.so.6(+0x361e0)[0x7fbb3e27a1e0]
>>>>>>>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0xb910)[0x7fbb32257910]
>>>>>>>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x8118)[0x7fbb32254118]
>>>>>>>>> /usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0x128d6)[0x7fbb324708d6]
>>>>>>>>> /usr/lib64/glusterfs/5.5/xlator/features/access-control.so(+0x575b)[0x7fbb326a275b]
>>>>>>>>> /usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0xb3b3)[0x7fbb324693b3]
>>>>>>>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x85b2)[0x7fbb322545b2]
>>>>>>>>> /usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7fbb3f13db6c]
>>>>>>>>> /usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7fbb3f13db6c]
>>>>>>>>> /usr/lib64/glusterfs/5.5/xlator/features/upcall.so(+0xf548)[0x7fbb31c18548]
>>>>>>>>> /usr/lib64/libglusterfs.so.0(default_lookup_resume+0x1e2)[0x7fbb3f155c22]
>>>>>>>>> /usr/lib64/libglusterfs.so.0(call_resume+0x75)[0x7fbb3f0d93a5]
>>>>>>>>> /usr/lib64/glusterfs/5.5/xlator/performance/io-threads.so(+0x6088)[0x7fbb31a00088]
>>>>>>>>> /lib64/libpthread.so.0(+0x7569)[0x7fbb3e605569]
>>>>>>>>> /lib64/libc.so.6(clone+0x3f)[0x7fbb3e33c9ef]
>>>>>>>>>
>>>>>>>>
>>>>>>>> This happens on a 3-node Gluster v5.5 cluster on two different
>>>>>>>> volumes, but both volumes have the same settings:
>>>>>>>>
>>>>>>>>> Volume Name: shortterm
>>>>>>>>> Type: Replicate
>>>>>>>>> Volume ID: 5307e5c5-e8a1-493a-a846-342fb0195dee
>>>>>>>>> Status: Started
>>>>>>>>> Snapshot Count: 0
>>>>>>>>> Number of Bricks: 1 x 3 = 3
>>>>>>>>> Transport-type: tcp
>>>>>>>>> Bricks:
>>>>>>>>> Brick1: fs-xxxxx-c1-n1:/gluster/brick4/glusterbrick
>>>>>>>>> Brick2: fs-xxxxx-c1-n2:/gluster/brick4/glusterbrick
>>>>>>>>> Brick3: fs-xxxxx-c1-n3:/gluster/brick4/glusterbrick
>>>>>>>>> Options Reconfigured:
>>>>>>>>> storage.reserve: 1
>>>>>>>>> performance.client-io-threads: off
>>>>>>>>> nfs.disable: on
>>>>>>>>> transport.address-family: inet
>>>>>>>>> user.smb: disable
>>>>>>>>> features.read-only: off
>>>>>>>>> features.worm: off
>>>>>>>>> features.worm-file-level: on
>>>>>>>>> features.retention-mode: enterprise
>>>>>>>>> features.default-retention-period: 120
>>>>>>>>> network.ping-timeout: 10
>>>>>>>>> features.cache-invalidation: on
>>>>>>>>> features.cache-invalidation-timeout: 600
>>>>>>>>> performance.nl-cache: on
>>>>>>>>> performance.nl-cache-timeout: 600
>>>>>>>>> client.event-threads: 32
>>>>>>>>> server.event-threads: 32
>>>>>>>>> cluster.lookup-optimize: on
>>>>>>>>> performance.stat-prefetch: on
>>>>>>>>> performance.cache-invalidation: on
>>>>>>>>> performance.md-cache-timeout: 600
>>>>>>>>> performance.cache-samba-metadata: on
>>>>>>>>> performance.cache-ima-xattrs: on
>>>>>>>>> performance.io-thread-count: 64
>>>>>>>>> cluster.use-compound-fops: on
>>>>>>>>> performance.cache-size: 512MB
>>>>>>>>> performance.cache-refresh-timeout: 10
>>>>>>>>> performance.read-ahead: off
>>>>>>>>> performance.write-behind-window-size: 4MB
>>>>>>>>> performance.write-behind: on
>>>>>>>>> storage.build-pgfid: on
>>>>>>>>> features.utime: on
>>>>>>>>> storage.ctime: on
>>>>>>>>> cluster.quorum-type: fixed
>>>>>>>>> cluster.quorum-count: 2
>>>>>>>>> features.bitrot: on
>>>>>>>>> features.scrub: Active
>>>>>>>>> features.scrub-freq: daily
>>>>>>>>> cluster.enable-shared-storage: enable
>>>>>>>>>
>>>>>>>>>
>>>>>>>> Why can this happen to all brick processes? I don't understand the
>>>>>>>> crash report. The FOPs are nothing special, and after restarting the
>>>>>>>> brick processes everything works fine and our application succeeded.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> David Spisla
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Gluster-users mailing list
>>>>>>>> Gluster-users at gluster.org
>>>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>>>
>>>>>>>
Niels de Vos
2019-May-17 08:21 UTC
[Gluster-users] Brick-Xlators crashes after Set-RO and Read
On Fri, May 17, 2019 at 09:50:28AM +0200, David Spisla wrote:
> Hello Vijay,
> thank you for the clarification. Yes, there is an unconditional
> dereference of stbuf. It seems plausible that this causes the crash. I
> think a check like this should help:
>
>     if (buf == NULL) {
>         goto out;
>     }
>     map_atime_from_server(this, buf);
>
> Is there a reason why buf can be NULL?

It seems LOOKUP returned an error (errno=13: EACCES: Permission denied).
This is probably something you need to handle in worm_lookup_cbk. There
can be many reasons for a FOP to return an error; why it happened in this
case is a little difficult to say without (much) more details.

HTH,
Niels