Alpha Electronics
2009-May-29 19:31 UTC
[Gluster-users] Could be the bug of Glusterfs? The file system is unstable and hang
We are testing GlusterFS before recommending it to enterprise clients. We found that the file system always hangs after running for about 2 days. After killing the server-side process and restarting it, everything goes back to normal.

Here are the spec and the errors logged:

GlusterFS version: v2.0.1

Client volume:

volume brick_1
  type protocol/client
  option transport-type tcp/client
  option remote-port 7777 # Non-default port
  option remote-host server1
  option remote-subvolume brick
end-volume

volume brick_2
  type protocol/client
  option transport-type tcp/client
  option remote-port 7777 # Non-default port
  option remote-host server2
  option remote-subvolume brick
end-volume

volume bricks
  type cluster/distribute
  subvolumes brick_1 brick_2
end-volume

Errors logged on the client side in /var/log/glusterfs.log:

[2009-05-29 14:58:55] E [client-protocol.c:292:call_bail] brick_1: bailing out frame LK(28) frame sent = 2009-05-29 14:28:54. frame-timeout = 1800
[2009-05-29 14:58:55] W [fuse-bridge.c:2284:fuse_setlk_cbk] glusterfs-fuse: 106850788: ERR => -1 (Transport endpoint is not connected)

Error logged on server:

[2009-05-29 14:59:15] E [client-protocol.c:292:call_bail] brick_2: bailing out frame LK(28) frame sent = 2009-05-29 14:29:05. frame-timeout = 1800
[2009-05-29 14:59:15] W [fuse-bridge.c:2284:fuse_setlk_cbk] glusterfs-fuse: 106850860: ERR => -1 (Transport endpoint is not connected)

About 1 hour later, these error messages were logged on the server side in /var/log/messages:

May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] lib/util_sock.c:write_data(564)
May 29 16:04:16 server2 winbindd[3649]: write_data: write failure. Error = Connection reset by peer
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] libsmb/clientgen.c:write_socket(158)
May 29 16:04:16 server2 winbindd[3649]: write_socket: Error writing 104 bytes to socket 18: ERRNO = Connection reset by peer
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] libsmb/clientgen.c:cli_send_smb(188)
May 29 16:04:16 server2 winbindd[3649]: Error writing 104 bytes to client. -1 (Connection reset by peer)
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] libsmb/cliconnect.c:cli_session_setup_spnego(859)
May 29 16:04:16 server2 winbindd[3649]: Kinit failed: Cannot contact any KDC for requested realm
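For anyone comparing the call_bail entries in the client log above against the configured frame-timeout, here is a minimal Python sketch that pulls the two timestamps out of such a line and computes how long the frame waited. The regular expression is inferred only from the two sample lines in this message, and the names parse_bail, waited_s and timeout_s are made up for illustration; none of this is part of GlusterFS.

```python
import re
from datetime import datetime

# Layout inferred from the two call_bail lines quoted in this thread only;
# other GlusterFS versions may format the message differently.
BAIL_RE = re.compile(
    r"\[(?P<logged>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] E "
    r"\[client-protocol\.c:\d+:call_bail\] (?P<volume>\S+): "
    r"bailing out frame (?P<op>\w+)\(\d+\) "
    r"frame sent = (?P<sent>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\. "
    r"frame-timeout = (?P<timeout>\d+)"
)
FMT = "%Y-%m-%d %H:%M:%S"


def parse_bail(line):
    """Return volume, operation, seconds waited and timeout, or None if no match."""
    m = BAIL_RE.search(line)
    if m is None:
        return None
    logged = datetime.strptime(m["logged"], FMT)
    sent = datetime.strptime(m["sent"], FMT)
    return {
        "volume": m["volume"],
        "op": m["op"],
        "waited_s": int((logged - sent).total_seconds()),
        "timeout_s": int(m["timeout"]),
    }


sample = (
    "[2009-05-29 14:58:55] E [client-protocol.c:292:call_bail] brick_1: "
    "bailing out frame LK(28) frame sent = 2009-05-29 14:28:54. frame-timeout = 1800"
)
info = parse_bail(sample)
print(info)
```

For the brick_1 line the frame was sent at 14:28:54 and bailed at 14:58:55, so the LK request sat unanswered for the full 1800-second timeout before the client gave up on it.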
Vahriç Muhtaryan
2009-May-30 08:43 UTC
[Gluster-users] Could be the bug of Glusterfs? The file system is unstable and hang
Hello,

I installed the same new version as you and am running tests to see whether it behaves as it should. We have the same configuration, but I got a different error: I couldn't create a directory or a file (it gave "Invalid argument"), and I saw that one of the servers gave an error like the one below. Still testing ....

pending frames:
frame : type(1) op(WRITE)

patchset: 5c1d9108c1529a1155963cb1911f8870a674ab5b
signal received: 6
configuration details:argp 1
backtrace 1
db.h 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 2.0.1
[0xfa9420]
/lib/libc.so.6(abort+0x101)[0x218691]
/lib/libc.so.6[0x24f24b]
/lib/libc.so.6[0x2570f1]
/lib/libc.so.6(cfree+0x90)[0x25abc0]
/usr/local/lib/glusterfs/2.0.1/transport/socket.so(__socket_reset+0x3e)[0xc8155e]
/usr/local/lib/glusterfs/2.0.1/transport/socket.so(socket_event_poll_err+0x3b)[0xc8303b]
/usr/local/lib/glusterfs/2.0.1/transport/socket.so(socket_event_handler+0x8b)[0xc833bb]
/usr/local/lib/libglusterfs.so.0[0x9820ca]
/usr/local/lib/libglusterfs.so.0(event_dispatch+0x21)[0x980fb1]
glusterfsd(main+0xdf3)[0x804b1a3]
/lib/libc.so.6(__libc_start_main+0xdc)[0x203e8c]
glusterfsd[0x8049911]
---------
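To make a crash dump like the one above easier to compare with others, here is a small Python sketch that splits each backtrace frame into module, symbol, offset and address. The frame layout "module(symbol+offset)[address]" is assumed from the frames posted in this thread, and parse_frames is a hypothetical helper name, not a GlusterFS or glibc tool.

```python
import re

# One frame looks like "module(symbol+0xoff)[0xaddr]"; the module and the
# "(symbol+offset)" part are both optional in the dumps shown in this thread.
FRAME_RE = re.compile(
    r"(?P<module>[^\[\(\s]*)"
    r"(?:\((?P<symbol>[^+\)]+)\+(?P<offset>0x[0-9a-f]+)\))?"
    r"\[(?P<addr>0x[0-9a-f]+)\]"
)


def parse_frames(text):
    """Return one dict per whitespace-separated token that looks like a frame."""
    frames = []
    for token in text.split():
        m = FRAME_RE.fullmatch(token)
        if m:
            frames.append(m.groupdict())
    return frames


trace = """
[0xfa9420]
/lib/libc.so.6(abort+0x101)[0x218691]
/lib/libc.so.6(cfree+0x90)[0x25abc0]
/usr/local/lib/glusterfs/2.0.1/transport/socket.so(__socket_reset+0x3e)[0xc8155e]
glusterfsd(main+0xdf3)[0x804b1a3]
"""
frames = parse_frames(trace)
named = [f["symbol"] for f in frames if f["symbol"]]
print(named)
```

Read innermost frame first, the named symbols show abort being reached from cfree, which was called from __socket_reset. That narrows where to look, but the sketch itself says nothing about the root cause.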
jvanwanrooy at chatventure.nl
2009-May-30 19:18 UTC
[Gluster-users] Could be the bug of Glusterfs? The file system is unstable and hang
Hi,

We just ran into a problem with the same result: a client crash. We use two storage bricks with replication in the client. We stopped the first storage brick, which caused the client to crash. Please take a look at my log below. Does anyone know what causes this?

Best regards,
Jasper

[2009-05-30 21:02:22] E [saved-frames.c:165:saved_frames_unwind] brick1: forced unwinding frame type(1) op(FINODELK)
[2009-05-30 21:02:22] E [saved-frames.c:165:saved_frames_unwind] brick1: forced unwinding frame type(1) op(FINODELK)
[2009-05-30 21:02:22] D [socket.c:1229:socket_submit] brick1: not connected (priv->connected = 255)
[2009-05-30 21:02:22] N [client-protocol.c:6248:notify] brick1: disconnected
[2009-05-30 21:02:22] E [socket.c:744:socket_connect_finish] brick1: connection to 172.23.120.210:6996 failed (Connection refused)
pending frames:
frame : type(1) op(READ)

patchset: 5c1d9108c1529a1155963cb1911f8870a674ab5b
signal received: 11
configuration details:argp 1
backtrace 1
db.h 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 2.0.1
/lib64/libc.so.6[0x381e830280]
/usr/lib64/glusterfs/2.0.1/xlator/performance/read-ahead.so(ra_readv+0x58)[0x2b8c0d2a1888]
/usr/lib64/glusterfs/2.0.1/xlator/cluster/replicate.so(afr_readv+0x173)[0x2b8c0d4b8203]
/usr/lib64/libfuse.so.2[0x2b8c0d8fdf39]
/usr/lib64/glusterfs/2.0.1/xlator/mount/fuse.so[0x2b8c0d6deffd]
/lib64/libpthread.so.0[0x381f806367]
/lib64/libc.so.6(clone+0x6d)[0x381e8d2f7d]
---------
[2009-05-30 21:02:23] N [client-protocol.c:6248:notify] brick1: disconnected

Jasper van Wanrooy - Chatventure BV
Technical Manager
T: +31 (0) 6 47 248 722
E: jvanwanrooy at chatventure.nl
W: www.chatventure.nl
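When a client log is long, it can help to tally which functions emitted the entries around a disconnect like the one above. The following Python sketch does that for lines in the "[timestamp] LEVEL [file:line:function] volume: message" layout seen in this thread. The pattern is inferred from these samples only, and parse_line and count_by_function are hypothetical names used for illustration.

```python
import re
from collections import Counter

# Layout inferred from the log lines quoted in this thread:
# [timestamp] LEVEL [source-file:line:function] volume: message
LINE_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<volume>[^:]+): (?P<msg>.*)"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None


def count_by_function(lines):
    """Count matching log entries per (level, function) pair."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry:
            counts[(entry["level"], entry["func"])] += 1
    return counts


log = """\
[2009-05-30 21:02:22] E [saved-frames.c:165:saved_frames_unwind] brick1: forced unwinding frame type(1) op(FINODELK)
[2009-05-30 21:02:22] E [saved-frames.c:165:saved_frames_unwind] brick1: forced unwinding frame type(1) op(FINODELK)
[2009-05-30 21:02:22] D [socket.c:1229:socket_submit] brick1: not connected (priv->connected = 255)
[2009-05-30 21:02:22] N [client-protocol.c:6248:notify] brick1: disconnected
[2009-05-30 21:02:22] E [socket.c:744:socket_connect_finish] brick1: connection to 172.23.120.210:6996 failed (Connection refused)
pending frames:
"""
counts = count_by_function(log.splitlines())
for key, n in counts.most_common():
    print(key, n)
```

Lines that do not follow the layout, such as the "pending frames:" header of the crash dump, are simply skipped.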