David Spisla
2019-May-06 10:08 UTC
[Gluster-users] Brick-Xlators crashes after Set-RO and Read
Hello folks,

we have a client application (runs on Win10) which does some FOPs on a gluster volume which is accessed by SMB.

*Scenario 1* is a READ operation which reads all files successively and checks if the files' data was copied correctly. While doing this, all brick processes crash, and every brick log contains this crash report:

> CTX_ID:a0359502-2c76-4fee-8cb9-365679dc690e-GRAPH_ID:0-PID:32934-HOST:XX-XXXXX-XX-XX-PC_NAME:shortterm-client-2-RECON_NO:-0, gfid: 00000000-0000-0000-0000-000000000001, req(uid:2000,gid:2000,perm:1,ngrps:1), ctx(uid:0,gid:0,in-groups:0,perm:700,updated-fop:LOOKUP, acl:-) [Permission denied]
> pending frames:
> frame : type(0) op(27)
> frame : type(0) op(40)
> patchset: git://git.gluster.org/glusterfs.git
> signal received: 11
> time of crash:
> 2019-04-16 08:32:21
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 5.5
> /usr/lib64/libglusterfs.so.0(+0x2764c)[0x7f9a5bd4d64c]
> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7f9a5bd57d26]
> /lib64/libc.so.6(+0x361a0)[0x7f9a5af141a0]
> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0xb910)[0x7f9a4ef0e910]
> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x8118)[0x7f9a4ef0b118]
> /usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0x128d6)[0x7f9a4f1278d6]
> /usr/lib64/glusterfs/5.5/xlator/features/access-control.so(+0x575b)[0x7f9a4f35975b]
> /usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0xb3b3)[0x7f9a4f1203b3]
> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x85b2)[0x7f9a4ef0b5b2]
> /usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7f9a5bdd7b6c]
> /usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7f9a5bdd7b6c]
> /usr/lib64/glusterfs/5.5/xlator/features/upcall.so(+0xf548)[0x7f9a4e8cf548]
> /usr/lib64/libglusterfs.so.0(default_lookup_resume+0x1e2)[0x7f9a5bdefc22]
> /usr/lib64/libglusterfs.so.0(call_resume+0x75)[0x7f9a5bd733a5]
> /usr/lib64/glusterfs/5.5/xlator/performance/io-threads.so(+0x6088)[0x7f9a4e6b7088]
> /lib64/libpthread.so.0(+0x7569)[0x7f9a5b29f569]
> /lib64/libc.so.6(clone+0x3f)[0x7f9a5afd69af]

*Scenario 2*: The application just sets Read-Only on each file successively. After the 70th file was set, all brick processes crash, and again every brick log contains this crash report:

> [2019-05-02 07:43:39.953591] I [MSGID: 139001] [posix-acl.c:263:posix_acl_log_permit_denied] 0-longterm-access-control: client: CTX_ID:21aa9c75-3a5f-41f9-925b-48e4c80bd24a-GRAPH_ID:0-PID:16325-HOST:XXX-X-X-XXX-PC_NAME:longterm-client-0-RECON_NO:-0, gfid: 00000000-0000-0000-0000-000000000001, req(uid:2000,gid:2000,perm:1,ngrps:1), ctx(uid:0,gid:0,in-groups:0,perm:700,updated-fop:LOOKUP, acl:-) [Permission denied]
> pending frames:
> frame : type(0) op(27)
> patchset: git://git.gluster.org/glusterfs.git
> signal received: 11
> time of crash:
> 2019-05-02 07:43:39
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 5.5
> /usr/lib64/libglusterfs.so.0(+0x2764c)[0x7fbb3f0b364c]
> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7fbb3f0bdd26]
> /lib64/libc.so.6(+0x361e0)[0x7fbb3e27a1e0]
> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0xb910)[0x7fbb32257910]
> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x8118)[0x7fbb32254118]
> /usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0x128d6)[0x7fbb324708d6]
> /usr/lib64/glusterfs/5.5/xlator/features/access-control.so(+0x575b)[0x7fbb326a275b]
> /usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0xb3b3)[0x7fbb324693b3]
> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x85b2)[0x7fbb322545b2]
> /usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7fbb3f13db6c]
> /usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7fbb3f13db6c]
> /usr/lib64/glusterfs/5.5/xlator/features/upcall.so(+0xf548)[0x7fbb31c18548]
> /usr/lib64/libglusterfs.so.0(default_lookup_resume+0x1e2)[0x7fbb3f155c22]
> /usr/lib64/libglusterfs.so.0(call_resume+0x75)[0x7fbb3f0d93a5]
> /usr/lib64/glusterfs/5.5/xlator/performance/io-threads.so(+0x6088)[0x7fbb31a00088]
> /lib64/libpthread.so.0(+0x7569)[0x7fbb3e605569]
> /lib64/libc.so.6(clone+0x3f)[0x7fbb3e33c9ef]

This happens on a 3-node Gluster v5.5 cluster on two different volumes, but both volumes have the same settings:

> Volume Name: shortterm
> Type: Replicate
> Volume ID: 5307e5c5-e8a1-493a-a846-342fb0195dee
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: fs-xxxxx-c1-n1:/gluster/brick4/glusterbrick
> Brick2: fs-xxxxx-c1-n2:/gluster/brick4/glusterbrick
> Brick3: fs-xxxxx-c1-n3:/gluster/brick4/glusterbrick
> Options Reconfigured:
> storage.reserve: 1
> performance.client-io-threads: off
> nfs.disable: on
> transport.address-family: inet
> user.smb: disable
> features.read-only: off
> features.worm: off
> features.worm-file-level: on
> features.retention-mode: enterprise
> features.default-retention-period: 120
> network.ping-timeout: 10
> features.cache-invalidation: on
> features.cache-invalidation-timeout: 600
> performance.nl-cache: on
> performance.nl-cache-timeout: 600
> client.event-threads: 32
> server.event-threads: 32
> cluster.lookup-optimize: on
> performance.stat-prefetch: on
> performance.cache-invalidation: on
> performance.md-cache-timeout: 600
> performance.cache-samba-metadata: on
> performance.cache-ima-xattrs: on
> performance.io-thread-count: 64
> cluster.use-compound-fops: on
> performance.cache-size: 512MB
> performance.cache-refresh-timeout: 10
> performance.read-ahead: off
> performance.write-behind-window-size: 4MB
> performance.write-behind: on
> storage.build-pgfid: on
> features.utime: on
> storage.ctime: on
> cluster.quorum-type: fixed
> cluster.quorum-count: 2
> features.bitrot: on
> features.scrub: Active
> features.scrub-freq: daily
> cluster.enable-shared-storage: enable

Why can this happen to all brick processes? I don't understand the crash report. The FOPs are nothing special, and after restarting the brick processes everything works fine and our application succeeds.

Regards
David Spisla
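P.S. To make scenario 2 more concrete, it boils down to roughly the following. This is only a simplified sketch of what our application does, not the real code, and the drive letter is just an example for the SMB mapping of the volume:

#!/usr/bin/env python3
# Simplified sketch of scenario 2 (not the real application):
# walk the SMB-mounted volume and set every file read-only, one after another.
# On Windows, os.chmod() with stat.S_IREAD clears the write permission,
# which corresponds to setting the read-only attribute on the share.
import os
import stat
import sys

share = sys.argv[1]  # e.g. "Z:\\" if the gluster volume is mapped to drive Z:

count = 0
for root, _dirs, files in os.walk(share):
    for name in files:
        path = os.path.join(root, name)
        os.chmod(path, stat.S_IREAD)  # make the file read-only
        count += 1
        print(count, "set read-only:", path)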
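P.P.S. To make the raw offsets in the backtraces above a bit more readable, they can be resolved against the installed xlator libraries on a brick node, for example along these lines. This is only a minimal sketch: it assumes binutils/addr2line is available and that the .so paths from the log exist on the node, and without the matching glusterfs debuginfo packages most frames will still come back as "??":

#!/usr/bin/env python3
# Resolve the unnamed ".so(+0xOFFSET)" frames of a brick-log backtrace to
# function names by feeding the offset to addr2line. Frames that already
# carry a symbol name (e.g. gf_print_trace+0x306) are skipped.
import re
import subprocess
import sys

FRAME = re.compile(r"(/\S+)\(\+(0x[0-9a-fA-F]+)\)")

for line in sys.stdin:                      # pipe the brick log into stdin
    match = FRAME.search(line)
    if not match:
        continue
    lib, offset = match.groups()
    result = subprocess.run(
        ["addr2line", "-f", "-C", "-e", lib, offset],
        capture_output=True, text=True,
    )
    print(lib, offset, "->", " ".join(result.stdout.split()))

Run as e.g. "python3 resolve_frames.py < /var/log/glusterfs/bricks/<brick>.log" (script name and log path are only examples).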
Vijay Bellur
2019-May-06 17:48 UTC
[Gluster-users] Brick-Xlators crashes after Set-RO and Read
Thank you for the report, David. Do you have core files available on any of the servers? If yes, would it be possible for you to provide a backtrace?

Regards,
Vijay

On Mon, May 6, 2019 at 3:09 AM David Spisla <spisla80 at gmail.com> wrote:

> Hello folks,
>
> we have a client application (runs on Win10) which does some FOPs on a
> gluster volume which is accessed by SMB.
>
> [...]
>
> Why can this happen to all brick processes? I don't understand the crash
> report. The FOPs are nothing special, and after restarting the brick
> processes everything works fine and our application succeeds.
>
> Regards
> David Spisla
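In case it helps with collecting that: something along these lines can be used to pull a full backtrace of all threads out of a core file. This is only a rough sketch; the glusterfsd path and the core file location are placeholders that depend on your distribution and core_pattern/systemd-coredump setup, and installing the glusterfs debuginfo packages will make the frames readable:

#!/usr/bin/env python3
# Dump "thread apply all bt full" for a glusterfsd core file via gdb.
# Usage: python3 core_bt.py /usr/sbin/glusterfsd /path/to/core > backtrace.txt
import subprocess
import sys

binary, core = sys.argv[1], sys.argv[2]

result = subprocess.run(
    ["gdb", "--batch",
     "-ex", "set pagination off",
     "-ex", "thread apply all bt full",
     binary, core],
    capture_output=True, text=True,
)
sys.stdout.write(result.stdout)
sys.stderr.write(result.stderr)

Running gdb interactively on the same binary and core and issuing "thread apply all bt full" gives the same output.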