David Spisla
2019-May-07 09:15 UTC
[Gluster-users] Brick-Xlators crashes after Set-RO and Read
Hello Vijay,

how can I create such a core file? Or will it be created automatically if a
gluster process crashes? Maybe you can give me a hint and I will try to get a
backtrace. Unfortunately this bug is not easy to reproduce because it appears
only sometimes.

Regards
David Spisla

On Mon, May 6, 2019 at 7:48 PM Vijay Bellur <vbellur at redhat.com> wrote:

> Thank you for the report, David. Do you have core files available on any
> of the servers? If yes, would it be possible for you to provide a
> backtrace?
>
> Regards,
> Vijay
>
> On Mon, May 6, 2019 at 3:09 AM David Spisla <spisla80 at gmail.com> wrote:
>
>> Hello folks,
>>
>> we have a client application (runs on Win10) which does some FOPs on a
>> gluster volume which is accessed by SMB.
>>
>> *Scenario 1* is a READ operation which reads all files successively and
>> checks whether the file data was copied correctly. While doing this, all
>> brick processes crash and one finds this crash report in every brick log:
>>
>>> CTX_ID:a0359502-2c76-4fee-8cb9-365679dc690e-GRAPH_ID:0-PID:32934-HOST:XX-XXXXX-XX-XX-PC_NAME:shortterm-client-2-RECON_NO:-0, gfid: 00000000-0000-0000-0000-000000000001, req(uid:2000,gid:2000,perm:1,ngrps:1), ctx(uid:0,gid:0,in-groups:0,perm:700,updated-fop:LOOKUP, acl:-) [Permission denied]
>>> pending frames:
>>> frame : type(0) op(27)
>>> frame : type(0) op(40)
>>> patchset: git://git.gluster.org/glusterfs.git
>>> signal received: 11
>>> time of crash:
>>> 2019-04-16 08:32:21
>>> configuration details:
>>> argp 1
>>> backtrace 1
>>> dlfcn 1
>>> libpthread 1
>>> llistxattr 1
>>> setfsid 1
>>> spinlock 1
>>> epoll.h 1
>>> xattr.h 1
>>> st_atim.tv_nsec 1
>>> package-string: glusterfs 5.5
>>> /usr/lib64/libglusterfs.so.0(+0x2764c)[0x7f9a5bd4d64c]
>>> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7f9a5bd57d26]
>>> /lib64/libc.so.6(+0x361a0)[0x7f9a5af141a0]
>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0xb910)[0x7f9a4ef0e910]
>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x8118)[0x7f9a4ef0b118]
>>> /usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0x128d6)[0x7f9a4f1278d6]
>>> /usr/lib64/glusterfs/5.5/xlator/features/access-control.so(+0x575b)[0x7f9a4f35975b]
>>> /usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0xb3b3)[0x7f9a4f1203b3]
>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x85b2)[0x7f9a4ef0b5b2]
>>> /usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7f9a5bdd7b6c]
>>> /usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7f9a5bdd7b6c]
>>> /usr/lib64/glusterfs/5.5/xlator/features/upcall.so(+0xf548)[0x7f9a4e8cf548]
>>> /usr/lib64/libglusterfs.so.0(default_lookup_resume+0x1e2)[0x7f9a5bdefc22]
>>> /usr/lib64/libglusterfs.so.0(call_resume+0x75)[0x7f9a5bd733a5]
>>> /usr/lib64/glusterfs/5.5/xlator/performance/io-threads.so(+0x6088)[0x7f9a4e6b7088]
>>> /lib64/libpthread.so.0(+0x7569)[0x7f9a5b29f569]
>>> /lib64/libc.so.6(clone+0x3f)[0x7f9a5afd69af]
>>
>> *Scenario 2*: The application just sets Read-Only on each file
>> successively.
>> After the 70th file was set, all the bricks crash and, again, one can read
>> this crash report in every brick log:
>>
>>> [2019-05-02 07:43:39.953591] I [MSGID: 139001] [posix-acl.c:263:posix_acl_log_permit_denied] 0-longterm-access-control: client: CTX_ID:21aa9c75-3a5f-41f9-925b-48e4c80bd24a-GRAPH_ID:0-PID:16325-HOST:XXX-X-X-XXX-PC_NAME:longterm-client-0-RECON_NO:-0, gfid: 00000000-0000-0000-0000-000000000001, req(uid:2000,gid:2000,perm:1,ngrps:1), ctx(uid:0,gid:0,in-groups:0,perm:700,updated-fop:LOOKUP, acl:-) [Permission denied]
>>> pending frames:
>>> frame : type(0) op(27)
>>> patchset: git://git.gluster.org/glusterfs.git
>>> signal received: 11
>>> time of crash:
>>> 2019-05-02 07:43:39
>>> configuration details:
>>> argp 1
>>> backtrace 1
>>> dlfcn 1
>>> libpthread 1
>>> llistxattr 1
>>> setfsid 1
>>> spinlock 1
>>> epoll.h 1
>>> xattr.h 1
>>> st_atim.tv_nsec 1
>>> package-string: glusterfs 5.5
>>> /usr/lib64/libglusterfs.so.0(+0x2764c)[0x7fbb3f0b364c]
>>> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7fbb3f0bdd26]
>>> /lib64/libc.so.6(+0x361e0)[0x7fbb3e27a1e0]
>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0xb910)[0x7fbb32257910]
>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x8118)[0x7fbb32254118]
>>> /usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0x128d6)[0x7fbb324708d6]
>>> /usr/lib64/glusterfs/5.5/xlator/features/access-control.so(+0x575b)[0x7fbb326a275b]
>>> /usr/lib64/glusterfs/5.5/xlator/features/locks.so(+0xb3b3)[0x7fbb324693b3]
>>> /usr/lib64/glusterfs/5.5/xlator/features/worm.so(+0x85b2)[0x7fbb322545b2]
>>> /usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7fbb3f13db6c]
>>> /usr/lib64/libglusterfs.so.0(default_lookup+0xbc)[0x7fbb3f13db6c]
>>> /usr/lib64/glusterfs/5.5/xlator/features/upcall.so(+0xf548)[0x7fbb31c18548]
>>> /usr/lib64/libglusterfs.so.0(default_lookup_resume+0x1e2)[0x7fbb3f155c22]
>>> /usr/lib64/libglusterfs.so.0(call_resume+0x75)[0x7fbb3f0d93a5]
>>> /usr/lib64/glusterfs/5.5/xlator/performance/io-threads.so(+0x6088)[0x7fbb31a00088]
>>> /lib64/libpthread.so.0(+0x7569)[0x7fbb3e605569]
>>> /lib64/libc.so.6(clone+0x3f)[0x7fbb3e33c9ef]
>>
>> This happens on a 3-node Gluster v5.5 cluster on two different volumes.
>> But both volumes have the same settings:
>>
>>> Volume Name: shortterm
>>> Type: Replicate
>>> Volume ID: 5307e5c5-e8a1-493a-a846-342fb0195dee
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x 3 = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: fs-xxxxx-c1-n1:/gluster/brick4/glusterbrick
>>> Brick2: fs-xxxxx-c1-n2:/gluster/brick4/glusterbrick
>>> Brick3: fs-xxxxx-c1-n3:/gluster/brick4/glusterbrick
>>> Options Reconfigured:
>>> storage.reserve: 1
>>> performance.client-io-threads: off
>>> nfs.disable: on
>>> transport.address-family: inet
>>> user.smb: disable
>>> features.read-only: off
>>> features.worm: off
>>> features.worm-file-level: on
>>> features.retention-mode: enterprise
>>> features.default-retention-period: 120
>>> network.ping-timeout: 10
>>> features.cache-invalidation: on
>>> features.cache-invalidation-timeout: 600
>>> performance.nl-cache: on
>>> performance.nl-cache-timeout: 600
>>> client.event-threads: 32
>>> server.event-threads: 32
>>> cluster.lookup-optimize: on
>>> performance.stat-prefetch: on
>>> performance.cache-invalidation: on
>>> performance.md-cache-timeout: 600
>>> performance.cache-samba-metadata: on
>>> performance.cache-ima-xattrs: on
>>> performance.io-thread-count: 64
>>> cluster.use-compound-fops: on
>>> performance.cache-size: 512MB
>>> performance.cache-refresh-timeout: 10
>>> performance.read-ahead: off
>>> performance.write-behind-window-size: 4MB
>>> performance.write-behind: on
>>> storage.build-pgfid: on
>>> features.utime: on
>>> storage.ctime: on
>>> cluster.quorum-type: fixed
>>> cluster.quorum-count: 2
>>> features.bitrot: on
>>> features.scrub: Active
>>> features.scrub-freq: daily
>>> cluster.enable-shared-storage: enable
>>
>> Why can this happen to all brick processes? I don't understand the crash
>> report. The FOPs are nothing special, and after restarting the brick
>> processes everything works fine and our application succeeds.
>>
>> Regards
>> David Spisla
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
Vijay Bellur
2019-May-07 18:08 UTC
[Gluster-users] Brick-Xlators crashes after Set-RO and Read
Hello David,

On Tue, May 7, 2019 at 2:16 AM David Spisla <spisla80 at gmail.com> wrote:

> Hello Vijay,
>
> how can I create such a core file? Or will it be created automatically if
> a gluster process crashes? Maybe you can give me a hint and I will try to
> get a backtrace.

Generation of a core file depends on the system configuration. `man 5 core`
contains useful information on getting core files written to a directory of
your choice. Once a core file is generated, you can use gdb to get a
backtrace of all threads (using "thread apply all bt full").
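For example, something along these lines should work on a brick node (a rough
sketch only: the /var/crash directory, the core file name and the
/usr/sbin/glusterfsd path are placeholders that depend on your distribution
and settings; on systemd-based systems the core limit for the glusterd/brick
services is usually configured via LimitCORE= in the unit file rather than
ulimit):

  # allow core files of unlimited size for processes started from here
  ulimit -c unlimited

  # have the kernel write cores to a dedicated directory instead of the
  # working directory of the crashing process
  mkdir -p /var/crash
  sysctl -w kernel.core_pattern='/var/crash/core.%e.%p'

  # restart the brick processes, reproduce the crash, then extract a full
  # backtrace of all threads from the core with gdb in batch mode
  # ("core.glusterfsd.12345" stands for whatever file actually shows up)
  gdb -batch -ex 'thread apply all bt full' \
      /usr/sbin/glusterfsd /var/crash/core.glusterfsd.12345 > backtrace.txt

If they are available for your distribution, installing the glusterfs
debuginfo/dbgsym packages beforehand makes the resulting backtrace much more
readable.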
> Unfortunately this bug is not easy to reproduce because it appears only
> sometimes.

If the bug is not easy to reproduce, having a backtrace from the generated
core would be very useful!

Thanks,
Vijay