As a follow-up: I have shut down the "broken" server in the pair since it kept crashing. The working one is running on its own, but gluster dies every 10 minutes or so. It seems 1.4pre5 doesn't like being an AFR client all on its own? I'm going to see if it works with only itself in the AFR subvolumes list.

2008-09-23 07:24:00 E [afr.c:3434:afr_statfs_cbk] home: (child=home2) op_ret=-1 op_errno=107(Transport endpoint is not connected)
2008-09-23 07:24:03 E [afr.c:3434:afr_statfs_cbk] home: (child=home2) op_ret=-1 op_errno=107(Transport endpoint is not connected)
2008-09-23 07:24:28 E [afr.c:4759:afr_create_cbk] home: (path=/glusterfile/tmp/1222179868.H882395P21565.HOSTNAME child=home2) op_ret=-1 op_errno=107(Transport endpoint is not connected)
pending frames:

Signal received: 11
configuration details:argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
tv_nsec 1
package-string: glusterfs 1.4.0pre5
/lib64/libc.so.6[0x300d0322a0]
/usr/local/lib/glusterfs/1.4.0pre5/xlator/cluster/afr.so(afr_incver_internal_incver_cbk+0x38)[0xe5250c]
/usr/local/lib/glusterfs/1.4.0pre5/xlator/protocol/client.so(client_incver+0xb9)[0xa29072]
/usr/local/lib/glusterfs/1.4.0pre5/xlator/cluster/afr.so(afr_incver_internal_lock_cbk+0x6d8)[0xe52d75]
/usr/local/lib/libglusterfs.so.0[0x125c5b]
/usr/local/lib/libglusterfs.so.0(mop_lock_impl+0x103)[0x12a6aa]
/usr/local/lib/libglusterfs.so.0(default_lock+0x126)[0x125d88]
/usr/local/lib/glusterfs/1.4.0pre5/xlator/cluster/afr.so(afr_incver_internal_fd+0x33a)[0xe530d1]
/usr/local/lib/glusterfs/1.4.0pre5/xlator/cluster/afr.so(afr_close+0x26d)[0xe5c64d]
/usr/local/lib/glusterfs/1.4.0pre5/xlator/mount/fuse.so[0x7299fa8]
/lib64/libfuse.so.2[0x10824b2]
/usr/local/lib/glusterfs/1.4.0pre5/xlator/mount/fuse.so[0x729cc35]
/lib64/libpthread.so.0[0x300dc0729a]
/lib64/libc.so.6(clone+0x6d)[0x300d0e439d]
---------

At 07:09 AM 9/23/2008, Keith Freedman wrote:
>I had a pair of servers running 1.4pre5 in AFR.
>They've been running fine for over a week, and suddenly today one of
>them has decided it will just crash any time it tries to AFR a file.
>
>The strange thing is, it seems to get updates from the other server.
>It's not up long enough to do any thorough testing, but when I do
>this from the "good" server:
>echo `hostname` `date` > /gluster/shared/file
>I can read the correct hostname and date from the "bad" server, but
>when I do the same thing on the "bad" server, it crashes instantly.
>
>Running FC9 with default fuse:
>fuse-2.7.4-8_10.fc9.i386
>
>I'm going to re-install fuse thinking that perhaps something got
>corrupted, but it's odd that it happened while the servers have been
>going just fine for days.
>
>I turned on debugging and here's what it's producing.
>Where the log ends is where the server crashed while I was tailing
>the logfile:
>2008-09-23 06:56:31 D [inode.c:311:__inode_retire] fuse/inode: retiring inode(0) lru=21/0 active=21 purge=29
>2008-09-23 06:56:31 D [fuse-bridge.c:437:fuse_lookup] glusterfs-fuse: 223: LOOKUP /uservideo/public_html/Guests/Images/Misc/.htaccess
>2008-09-23 06:56:31 D [inode.c:443:__inode_create] fuse/inode: create inode(0)
>2008-09-23 06:56:31 D [inode.c:268:__inode_activate] fuse/inode: activating inode(0), lru=21/0 active=22 purge=29
>2008-09-23 06:56:31 D [fuse-bridge.c:857:fuse_err_cbk] glusterfs-fuse: 222: FLUSH() ERR => 0
>2008-09-23 06:56:31 D [fuse-bridge.c:1599:fuse_release] glusterfs-fuse: 224: CLOSE 0x8fecf58
>2008-09-23 06:56:31 D [fuse-bridge.c:562:fuse_getattr] glusterfs-fuse: 225: FGETATTR 20971566 (/user2/public_html/shopping/var/run/classes/kernel/Profiler.php/0x8fece28)
>2008-09-23 06:56:31 D [fuse-bridge.c:496:fuse_attr_cbk] glusterfs-fuse: 225: FSTAT() /user2/public_html/shopping/var/run/classes/kernel/Profiler.php => 20971566
>2008-09-23 06:56:31 D [fuse-bridge.c:398:fuse_entry_cbk] glusterfs-fuse: 223: LOOKUP() /uservideo/public_html/Guests/Images/Misc/.htaccess => -1 (No such file or directory)
>2008-09-23 06:56:31 D [inode.c:311:__inode_retire] fuse/inode: retiring inode(0) lru=21/0 active=21 purge=30
>2008-09-23 06:56:31 D [inode.c:268:__inode_activate] fuse/inode: activating inode(11010602), lru=20/0 active=22 purge=30
>2008-09-23 06:56:31 D [fuse-bridge.c:1429:fuse_open] glusterfs-fuse: 226: OPEN /uservideo/public_html/Guests/Images/Misc/userLogo.jpg
>2008-09-23 06:56:31 D [fuse-bridge.c:857:fuse_err_cbk] glusterfs-fuse: 224: CLOSE() ERR => 0
>2008-09-23 06:56:31 D [fuse-bridge.c:1572:fuse_flush] glusterfs-fuse: 227: FLUSH 0x8fece28
>2008-09-23 06:56:31 D [fuse-bridge.c:603:fuse_fd_cbk] glusterfs-fuse: 226: OPEN() /uservideo/public_html/Guests/Images/Misc/userLogo.jpg => 0x8fecd50
>2008-09-23 06:56:31 D [fuse-bridge.c:1487:fuse_readv] glusterfs-fuse: 228: READ (0x8fecd50, size=4096, offset=0)
>2008-09-23 06:56:31 D [fuse-bridge.c:857:fuse_err_cbk] glusterfs-fuse: 227: FLUSH() ERR => 0
>2008-09-23 06:56:31 D [fuse-bridge.c:1455:fuse_readv_cbk] glusterfs-fuse: 228: READ => 3513/4096,0/3513
>2008-09-23 06:56:31 D [fuse-bridge.c:1599:fuse_release] glusterfs-fuse: 229: CLOSE 0x8fece28
>2008-09-23 06:56:31 D [fuse-bridge.c:437:fuse_lookup] glusterfs-fuse: 230: LOOKUP /user2/public_html/shopping/var/run/classes/kernel/Database.php
>2008-09-23 06:56:31 D [inode.c:443:__inode_create] fuse/inode: create inode(0)
>2008-09-23 06:56:31 D [inode.c:268:__inode_activate] fuse/inode: activating inode(0), lru=20/0 active=23 purge=30
>2008-09-23 06:56:31 D [fuse-bridge.c:1572:fuse_flush] glusterfs-fuse: 231: FLUSH 0x8fecd50
>2008-09-23 06:56:31 D [fuse-bridge.c:857:fuse_err_cbk] glusterfs-fuse: 229: CLOSE() ERR => 0
>2008-09-23 06:56:31 D [inode.c:287:__inode_passivate] fuse/inode: passivating inode(20971566) lru=21/0 active=22 purge=30
>2008-09-23 06:56:31 D [fuse-bridge.c:370:fuse_entry_cbk] glusterfs-fuse: 230: LOOKUP() /user2/public_html/shopping/var/run/classes/kernel/Database.php => 20971567
>2008-09-23 06:56:31 D [inode.c:287:__inode_passivate] fuse/inode: passivating inode(20971567) lru=22/0 active=21 purge=30
>2008-09-23 06:56:31 D [fuse-bridge.c:857:fuse_err_cbk] glusterfs-fuse: 231: FLUSH() ERR => 0
>2008-09-23 06:56:31 D [fuse-bridge.c:1599:fuse_release] glusterfs-fuse: 232: CLOSE 0x8fecd50
>2008-09-23 06:56:31 D [fuse-b
>
>Here's some more from when the server rebooted:
>+-----
>2008-09-23 07:04:39 D [spec.y:194:new_section] parser: New node for 'home1'
>2008-09-23 07:04:39 D [xlator.c:289:xlator_set_type] xlator: attempt to load file /usr/local/lib/glusterfs/1.4.0pre5/xlator/storage/posix.so
>2008-09-23 07:04:39 D [spec.y:219:section_type] parser: Type:home1:storage/posix
>2008-09-23 07:04:39 D [spec.y:285:section_option] parser: Option:home1:directory:/gluster/home
>2008-09-23 07:04:39 D [spec.y:367:section_end] parser: end:home1
>2008-09-23 07:04:39 D [spec.y:194:new_section] parser: New node for 'posix-locks-home1'
>2008-09-23 07:04:39 D [xlator.c:289:xlator_set_type] xlator: attempt to load file /usr/local/lib/glusterfs/1.4.0pre5/xlator/features/posix-locks.so
>2008-09-23 07:04:39 D [xlator.c:318:xlator_set_type] posix-locks-home1: dlsym(notify) on /usr/local/lib/glusterfs/1.4.0pre5/xlator/features/posix-locks.so: undefined symbol: notify -- neglecting
>2008-09-23 07:04:39 D [spec.y:219:section_type] parser: Type:posix-locks-home1:features/posix-locks
>2008-09-23 07:04:39 D [spec.y:285:section_option] parser: Option:posix-locks-home1:mandatory:on
>2008-09-23 07:04:39 D [spec.y:352:section_sub] parser: child:posix-locks-home1->home1
>2008-09-23 07:04:39 D [spec.y:367:section_end] parser: end:posix-locks-home1
>2008-09-23 07:04:39 D [spec.y:194:new_section] parser: New node for 'home2'
>2008-09-23 07:04:39 D [xlator.c:289:xlator_set_type] xlator: attempt to load file /usr/local/lib/glusterfs/1.4.0pre5/xlator/protocol/client.so
>2008-09-23 07:04:39 D [spec.y:219:section_type] parser: Type:home2:protocol/client
>2008-09-23 07:04:39 D [spec.y:285:section_option] parser: Option:home2:transport-type:tcp/client
>2008-09-23 07:04:39 D [spec.y:285:section_option] parser: Option:home2:remote-host:72.36.173.218
>2008-09-23 07:04:39 D [spec.y:285:section_option] parser: Option:home2:remote-subvolume:posix-locks-home1
>2008-09-23 07:04:39 D [spec.y:285:section_option] parser: Option:home2:transport-timeout:10
>2008-09-23 07:04:39 D [spec.y:367:section_end] parser: end:home2
>2008-09-23 07:04:39 D [spec.y:194:new_section] parser: New node for 'server'
>2008-09-23 07:04:39 D [xlator.c:289:xlator_set_type] xlator: attempt to load file /usr/local/lib/glusterfs/1.4.0pre5/xlator/protocol/server.so
>2008-09-23 07:04:39 D [spec.y:219:section_type] parser: Type:server:protocol/server
>2008-09-23 07:04:39 D [spec.y:285:section_option] parser: Option:server:transport-type:tcp/server
>2008-09-23 07:04:39 D [spec.y:352:section_sub] parser: child:server->posix-locks-home1
>2008-09-23 07:04:39 D [spec.y:285:section_option] parser: Option:server:auth.addr.posix-locks-home1.allow:72.36.173.218,127.0.0.1
>2008-09-23 07:04:39 D [spec.y:367:section_end] parser: end:server
>2008-09-23 07:04:39 D [spec.y:194:new_section] parser: New node for 'home'
>2008-09-23 07:04:39 D [xlator.c:289:xlator_set_type] xlator: attempt to load file /usr/local/lib/glusterfs/1.4.0pre5/xlator/cluster/afr.so
>2008-09-23 07:04:39 D [xlator.c:324:xlator_set_type] home: strict option validation is not enforced -- neglecting
>2008-09-23 07:04:39 D [spec.y:219:section_type] parser: Type:home:cluster/afr
>2008-09-23 07:04:39 D [spec.y:285:section_option] parser: Option:home:read-subvolume:posix-locks-home1
>2008-09-23 07:04:39 D [spec.y:352:section_sub] parser: child:home->posix-locks-home1
>2008-09-23 07:04:39 D [spec.y:352:section_sub] parser: child:home->home2
>2008-09-23 07:04:39 D [spec.y:367:section_end] parser: end:home
>2008-09-23 07:04:39 D [xlator.c:289:xlator_set_type] xlator: attempt to load file /usr/local/lib/glusterfs/1.4.0pre5/xlator/mount/fuse.so
>2008-09-23 07:04:39 D [xlator.c:324:xlator_set_type] fuse: strict option validation is not enforced -- neglecting
>2008-09-23 07:04:39 D [glusterfs.c:771:main] glusterfs: running in pid 1145
>2008-09-23 07:04:39 D [fuse-options.c:140:fuse_options_validate] fuse-options: using mount-point = /home
>2008-09-23 07:04:39 D [fuse-options.c:147:fuse_options_validate] fuse-options: using attr-timeout = 1
>2008-09-23 07:04:39 D [fuse-options.c:159:fuse_options_validate] fuse-options: using entry-timeout = 1
>2008-09-23 07:04:39 D [fuse-options.c:171:fuse_options_validate] fuse-options: using direct-io-mode = 1
>2008-09-23 07:04:39 D [client-protocol.c:4383:init] home2: setting transport-timeout to 10
>2008-09-23 07:04:39 D [transport.c:104:transport_load] transport: attempt to load file /usr/local/lib/glusterfs/1.4.0pre5/transport/socket.so
>2008-09-23 07:04:39 D [client-protocol.c:4427:init] home2: defaulting limits.transaction-size to 268435456
>2008-09-23 07:04:39 D [afr.c:6397:init] home: self-heal is enabled (default)
>2008-09-23 07:04:39 D [afr.c:6421:init] home: config: reads will be done on posix-locks-home1
>2008-09-23 07:04:39 D [afr.c:6309:notify] home: GF_EVENT_CHILD_UP from posix-locks-home1
>2008-09-23 07:04:39 D [afr.c:6241:afr_check_xattr_cbk] home: 'posix-locks-home1' supports Extended attribute
>2008-09-23 07:04:39 D [inode.c:928:inode_table_new] fuse: creating new inode table with lru_limit=0
>2008-09-23 07:04:39 D [inode.c:443:__inode_create] fuse/inode: create inode(0)
>2008-09-23 07:04:39 D [client-protocol.c:4653:notify] home2: got GF_EVENT_PARENT_UP, attempting connect on transport
>2008-09-23 07:04:39 D [transport.c:104:transport_load] transport: attempt to load file /usr/local/lib/glusterfs/1.4.0pre5/transport/socket.so
>2008-09-23 07:04:39 E [name.c:344:af_inet_server_get_local_sockaddr] server: getaddrinfo failed (Name or service not known)
>2008-09-23 07:04:39 W [common-utils.c:158:gf_print_bytes] glusterfs: Total data (in bytes): transfered (0), received (0)
>pending frames:
>
>Signal received: 11
>configuration details:argp 1
>backtrace 1
>dlfcn 1
>fdatasync 1
>libpthread 1
>llistxattr 1
>setfsid 1
>spinlock 1
>epoll.h 1
>xattr.h 1
>tv_nsec 1
>package-string: glusterfs 1.4.0pre5