Vikas R
2011-Jan-13 11:44 UTC
[Gluster-users] distribute-replicate setup GFS Client crashed
Hi there,

I'm running glusterfs version 3.1.0. The client crashed after some time with the stack below.

[2011-01-13 08:33:49.230976] I [afr-common.c:2568:afr_notify] replicate-1: Subvolume 'distribute-1' came back up; going online.
[2011-01-13 08:33:49.499909] I [afr-open.c:393:afr_openfd_sh] replicate-1: data self-heal triggered. path: /streaming/set3/work/reduce.12.1294902171.dplog.temp, reason: Replicate up down flush, data lock is held
[2011-01-13 08:33:49.500500] E [afr-self-heal-common.c:1214:sh_missing_entries_create] replicate-1: no missing files - /streaming/set3/work/reduce.12.1294902171.dplog.temp. proceeding to metadata check
[2011-01-13 08:33:49.501906] E [afr-common.c:110:afr_set_split_brain] replicate-1: invalid argument: inode
[2011-01-13 08:33:49.501919] I [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk] replicate-1: background data self-heal completed on /streaming/set3/work/reduce.12.1294902171.dplog.temp
[2011-01-13 08:33:49.531838] I [dht-common.c:402:dht_revalidate_cbk] distribute-1: linkfile found in revalidate for /streaming/set3/work/mapped/dpabort/multiple_reduce.flash_pl.2.1294901929.1.172.26.98.59.2.map.10
[2011-01-13 08:33:50.396055] W [fuse-bridge.c:2765:fuse_setlk_cbk] glusterfs-fuse: 2230985: ERR => -1 (Invalid argument)

pending frames:
frame : type(1) op(FLUSH)
frame : type(1) op(FLUSH)
frame : type(1) op(LK)

patchset: v3.1.0
signal received: 11
time of crash: 2011-01-13 08:33:50
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.1.0

[0xffffe400]
/usr/local/akamai/lib/glusterfs/3.1.0/xlator/cluster/afr.so(afr_internal_lock_finish+0x8b)[0xf6187eeb]
/usr/local/akamai/lib/glusterfs/3.1.0/xlator/cluster/afr.so(afr_post_nonblocking_inodelk_cbk+0x4f)[0xf61884ef]
/usr/local/akamai/lib/glusterfs/3.1.0/xlator/cluster/afr.so(afr_nonblocking_inodelk_cbk+0x28f)[0xf61a0dbf]
/usr/local/akamai/lib/glusterfs/3.1.0/xlator/cluster/dht.so(dht_finodelk_cbk+0x82)[0xf61c3102]
/usr/local/akamai/lib/glusterfs/3.1.0/xlator/protocol/client.so(client3_1_finodelk_cbk+0xbe)[0xf61f1c1e]
/usr/local/akamai/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xc2)[0xf7736c42]
/usr/local/akamai/lib/libgfrpc.so.0(rpc_clnt_notify+0xa2)[0xf7736e62]
/usr/local/akamai/lib/libgfrpc.so.0(rpc_transport_notify+0x35)[0xf77314c5]
/usr/local/akamai/lib/glusterfs/3.1.0/rpc-transport/socket.so(socket_event_poll_in+0x50)[0xf5f08500]
/usr/local/akamai/lib/glusterfs/3.1.0/rpc-transport/socket.so(socket_event_handler+0x15b)[0xf5f0867b]
/usr/local/akamai/lib/libglusterfs.so.0[0xf7771cff]
/usr/local/akamai/lib/libglusterfs.so.0(event_dispatch+0x21)[0xf7770a21]
glusterfsc(main+0x48c)[0x804c45c]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xdc)[0xf75d718c]
glusterfsc[0x804a631]

The config files are attached.

tx
Vikas

-------------- next part --------------
## file auto generated by /usr/local/bin/glusterfs-volgen (export.vol)
# Cmd line:
# $ /usr/local/bin/glusterfs-volgen --name gfs 172.24.0.68:/ghostcache/home/hsawhney/gfs/ 172.24.0.222:/ghostcache/home/hsawhney/gfs/

volume posix1
  type storage/posix
  option directory /ghostcache/gfs-export/
end-volume

volume locks1
  type features/locks
  subvolumes posix1
end-volume

#volume quota1
#  type features/quota
#  #option disk-usage-limit 100MB
#  subvolumes locks1
#end-volume

volume brickex
  type performance/io-threads
  option thread-count 4
  subvolumes locks1
end-volume

volume server-tcp
  type protocol/server
  option transport-type tcp
  option auth.addr.brickex.allow *
  option transport.socket.listen-port 6996
  option transport.socket.nodelay on
  subvolumes brickex
end-volume

-------------- next part --------------
# file auto generated by /usr/local/bin/glusterfs-volgen (mount.vol)
# Cmd line:
# $ /usr/local/bin/glusterfs-volgen --name gfs 172.24.0.68:/ghostcache/home/hsawhney/gfs/ 172.24.0.222:/ghostcache/home/hsawhney/gfs/

# TRANSPORT-TYPE tcp
volume 172.26.98.55-1
  type protocol/client
  option transport-type tcp
  option remote-host 172.26.98.55
  option transport.socket.nodelay on
  option transport.remote-port 6996
  option remote-subvolume brickex
end-volume

volume 172.26.98.56-1
  type protocol/client
  option transport-type tcp
  option remote-host 172.26.98.56
  option transport.socket.nodelay on
  option transport.remote-port 6996
  option remote-subvolume brickex
end-volume

volume 172.26.98.57-1
  type protocol/client
  option transport-type tcp
  option remote-host 172.26.98.57
  option transport.socket.nodelay on
  option transport.remote-port 6996
  option remote-subvolume brickex
end-volume

volume 172.26.98.59-1
  type protocol/client
  option transport-type tcp
  option remote-host 172.26.98.59
  option transport.socket.nodelay on
  option transport.remote-port 6996
  option remote-subvolume brickex
end-volume

#volume 172.26.98.61-1
#  type protocol/client
#  option transport-type tcp
#  option remote-host 172.26.98.61
#  option transport.socket.nodelay on
#  option transport.remote-port 6996
#  option remote-subvolume brickex
#end-volume

#volume 172.26.98.62-1
#  type protocol/client
#  option remote-host 172.26.98.62
#  option transport.socket.nodelay on
#  option transport.remote-port 6996
#  option remote-subvolume brickex
#end-volume

volume distribute-1
  type cluster/dht
  subvolumes 172.26.98.55-1 172.26.98.56-1
end-volume

volume distribute-2
  type cluster/dht
  subvolumes 172.26.98.57-1 172.26.98.59-1
end-volume

#volume distribute-3
#  type cluster/dht
#  subvolumes 172.26.98.61-1 172.26.98.62-1
#end-volume

volume replicate-1
  type cluster/afr
  option lookup-unhashed yes
  subvolumes distribute-1 distribute-2
  #subvolumes distribute-1 distribute-2 distribute-3
end-volume

#volume stripe
#  type cluster/stripe
#  option block-size 1MB
#  subvolumes replicate-1 replicate-2 replicate-3
#end-volume

volume writebehind
  type performance/write-behind
  option cache-size 4MB
  subvolumes replicate-1
end-volume

volume io-cache
  type performance/io-cache
  option cache-size 64MB                    # default is 32MB
  #option priority *.h:3,*.html:2,*:1      # default is '*:0'
  option cache-timeout 2                    # default is 1 second
  subvolumes writebehind
end-volume

volume stat-prefetch
  type performance/stat-prefetch
  subvolumes io-cache
end-volume
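
Not part of the original report, but since the trace only shows raw frame addresses, here is a rough sketch of how one could reproduce with the attached vol files and capture a symbolic backtrace next time. The vol-file locations, mount point, and installed binary path are assumptions (the trace suggests a custom prefix under /usr/local/akamai), so adjust to the actual install:

```shell
# Sketch only -- paths below are assumptions, not taken from the report.

# On each server, start the brick from the generated export.vol:
glusterfsd -f /etc/glusterfs/export.vol

# On the client, allow core dumps before mounting, so the next
# segfault (signal 11 in the trace) leaves a core file behind:
ulimit -c unlimited
glusterfs -f /etc/glusterfs/mount.vol /mnt/gfs

# After a crash, load the core into gdb against the same binary
# to turn the raw addresses into file:line frames (needs -g builds):
gdb /usr/local/akamai/sbin/glusterfs core
# (gdb) bt
```

A symbolized `bt` from the core would show exactly where inside afr_internal_lock_finish the segfault happens, which is usually what the developers ask for first.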