Daniel Maher
2008-Dec-16 16:52 UTC
[Gluster-users] 1.4rc3 client crashed with sig 11 + core
Hello all,

I upgraded my AFR cluster (two clients, two servers) to 1.4rc3 today. As per the advice of Mr. Avati, as well as from the list, I've moved towards a client-side AFR (for the time being) in order to see if it helps with failover and recovery.

Unfortunately, I hit my first client crash about 30 minutes in! A signal 11, no less, along with a nice 9.4MB core dump. The relevant section from the glusterfs.log of the client that crashed is below:

--------------------
frame : type(1) op(32)
frame : type(1) op(32)
frame : type(1) op(32)
frame : type(1) op(13)
frame : type(1) op(13)
frame : type(0) op(0)

Signal received: 11
configuration details:argp 1
bdb->cursor->get 1
db.h 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
tv_nsec 1
package-string: glusterfs 1.4.0rc3
--------------------

Client config : http://glusterfs.pastebin.com/m50af4d49
Server config : http://glusterfs.pastebin.com/d23a1fa7e

Client details :

# uname -s -r -p
Linux 2.6.24.4 x86_64
# cat /etc/redhat-release
Fedora release 8 (Werewolf)
# rpm -qa | egrep "(gluster|fuse)"
fuse-libs-2.7.3glfs10-1
fuse-2.7.3glfs10-1
glusterfs-1.4.0rc3-1

Has anybody else had 1.4 (rc3 or other) sig 11 on them? Any idea why?

-- 
Daniel Maher <dma+gluster AT witbe DOT net>
Anand Avati
2008-Dec-16 16:59 UTC
[Gluster-users] 1.4rc3 client crashed with sig 11 + core
Can you please get us a bt from the core dump?
Daniel Maher
2008-Dec-16 17:13 UTC
[Gluster-users] 1.4rc3 client crashed with sig 11 + core
Anand Avati wrote:
> Can you please get us a bt from the core dump?

(gdb) backtrace
#0 0x00002b97940e1d2f in ra_frame_return () from /usr/lib64/glusterfs/1.4.0rc3/xlator/performance/read-ahead.so
#1 0x00002b97940e1dce in ra_waitq_return () from /usr/lib64/glusterfs/1.4.0rc3/xlator/performance/read-ahead.so
#2 0x00002b97940e21bf in ra_fault_cbk () from /usr/lib64/glusterfs/1.4.0rc3/xlator/performance/read-ahead.so
#3 0x00002b9793ec312d in afr_readv_cbk () from /usr/lib64/glusterfs/1.4.0rc3/xlator/cluster/afr.so
#4 0x00002b9793cab0f1 in client_readv_cbk () from /usr/lib64/glusterfs/1.4.0rc3/xlator/protocol/client.so
#5 0x00002b9793c9baaf in protocol_client_interpret () from /usr/lib64/glusterfs/1.4.0rc3/xlator/protocol/client.so
#6 0x00002b9793c9bc51 in protocol_client_pollin () from /usr/lib64/glusterfs/1.4.0rc3/xlator/protocol/client.so
#7 0x00002b9793ca2b3a in notify () from /usr/lib64/glusterfs/1.4.0rc3/xlator/protocol/client.so
#8 0x00002aaaaaaadabe in ?? () from /usr/lib64/glusterfs/1.4.0rc3/transport/socket.so
#9 0x00002b9793a10985 in ?? ()
#10 0x00002b97939f0998 in ?? ()
#11 0x0000000000000000 in ?? ()
(gdb)

-- 
Daniel Maher <dma+gluster AT witbe DOT net>
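For readers following along, the backtrace can be read mechanically: the innermost frames (#0 to #2) all sit in read-ahead.so, which points at the read-ahead translator. Below is a minimal Python sketch of that reading, not a GlusterFS tool. The names `parse_frames` and `innermost_translator` are made up for illustration, and the regex assumes gdb frame lines shaped like the ones posted above (only four of which are copied into the sample).

```python
import re

# Matches gdb frame lines shaped like the ones in this thread, e.g.
#   #0 0x00002b97940e1d2f in ra_frame_return () from /path/to/read-ahead.so
# The "from <library>" part is optional because some frames lack it.
FRAME_RE = re.compile(
    r"#(?P<num>\d+)\s+0x[0-9a-fA-F]+\s+in\s+(?P<func>\S+)\s+\(\)"
    r"(?:\s+from\s+(?P<lib>\S+))?"
)


def parse_frames(backtrace):
    """Return a list of (frame number, function, library or None)."""
    return [
        (int(m.group("num")), m.group("func"), m.group("lib"))
        for m in FRAME_RE.finditer(backtrace)
    ]


def innermost_translator(frames):
    """Return the translator owning the innermost xlator frame, or None."""
    for _num, _func, lib in sorted(frames, key=lambda frame: frame[0]):
        if lib and "/xlator/" in lib:
            # ".../xlator/performance/read-ahead.so" -> "performance/read-ahead"
            return lib.split("/xlator/", 1)[1].rsplit(".so", 1)[0]
    return None


# A few frames copied from the backtrace posted in this thread.
BACKTRACE = """
#0 0x00002b97940e1d2f in ra_frame_return () from /usr/lib64/glusterfs/1.4.0rc3/xlator/performance/read-ahead.so
#3 0x00002b9793ec312d in afr_readv_cbk () from /usr/lib64/glusterfs/1.4.0rc3/xlator/cluster/afr.so
#4 0x00002b9793cab0f1 in client_readv_cbk () from /usr/lib64/glusterfs/1.4.0rc3/xlator/protocol/client.so
#9 0x00002b9793a10985 in ?? ()
"""

frames = parse_frames(BACKTRACE)
print(innermost_translator(frames))  # prints "performance/read-ahead"
```

The outer frames (afr, protocol/client) are only the call path that delivered the read reply; the frame where the signal was raised is the one that matters when deciding which translator to suspect.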
Anand Avati
2008-Dec-16 17:18 UTC
[Gluster-users] 1.4rc3 client crashed with sig 11 + core
Daniel, please remove read-ahead. We too found this issue and the fix is already in the repo. It will be shipped in the next rc. Apologies for the inconvenience.
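Removing read-ahead from a client volfile means deleting its volume block and pointing whatever used it as a subvolume at the volume it wrapped. The sketch below shows that rewiring in Python under stated assumptions: the actual client config from this thread lives on pastebin and is not reproduced here, so the volume names (`afr`, `readahead`, `writebehind`, `remote1`, `remote2`) and the helper `drop_translator` are hypothetical. It handles a single layer of removed translators and drops any text outside volume blocks, so treat it as an illustration rather than a migration tool.

```python
def drop_translator(volfile, xlator_type):
    """Remove volumes of one type and rewire their parents to their subvolumes."""
    blocks, current = [], None
    for line in volfile.splitlines():
        words = line.split()
        if words[:1] == ["volume"]:
            current = {"name": words[1], "type": None, "subvolumes": [], "lines": []}
        if current is None:
            continue  # text outside a volume block is dropped in this sketch
        current["lines"].append(line)
        if words[:1] == ["type"]:
            current["type"] = words[1]
        elif words[:1] == ["subvolumes"]:
            current["subvolumes"] = words[1:]
        elif words[:1] == ["end-volume"]:
            blocks.append(current)
            current = None

    # Map each removed volume name to the subvolumes it wrapped.
    removed = {b["name"]: b["subvolumes"] for b in blocks if b["type"] == xlator_type}

    out = []
    for block in blocks:
        if block["name"] in removed:
            continue
        for line in block["lines"]:
            words = line.split()
            if words[:1] == ["subvolumes"]:
                indent = line[: len(line) - len(line.lstrip())]
                names = []
                for name in words[1:]:
                    names.extend(removed.get(name, [name]))
                line = indent + "subvolumes " + " ".join(names)
            out.append(line)
        out.append("")
    return "\n".join(out).rstrip() + "\n"


# Hypothetical fragment of a client volfile; remote1 and remote2 stand in for
# protocol/client volumes that would be defined earlier in a real file.
VOLFILE = """
volume afr
  type cluster/afr
  subvolumes remote1 remote2
end-volume

volume readahead
  type performance/read-ahead
  subvolumes afr
end-volume

volume writebehind
  type performance/write-behind
  subvolumes readahead
end-volume
"""

result = drop_translator(VOLFILE, "performance/read-ahead")
print(result)
```

In practice the same change is usually made by hand: delete the read-ahead block and edit the one `subvolumes` line that named it. The client needs to be remounted afterwards for the new volfile to take effect.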
-------- Original Message --------
> Date: Tue, 16 Dec 2008 18:13:16 +0100
> From: Daniel Maher <dma+gluster at witbe.net>
> To: Anand Avati <avati at zresearch.com>
> CC: "gluster-users at gluster.org" <gluster-users at gluster.org>
> Subject: Re: [Gluster-users] 1.4rc3 client crashed with sig 11 + core

> Anand Avati wrote:
> > Can you please get us a bt from the core dump?

I too have a crash with 1.4rc3. I get the crash when using readahead. I will post a backtrace as soon as I get back in the office.