Marc Seeger
2013-Mar-17 13:25 UTC
[Gluster-users] Errors during dbench run (rename failed)
Hi,

We just ran into dbench dying on one of our test runs.
We run one dbench instance on each of 2 machines.
We use the following parameters: dbench 6 -t 60 -D $DIRECTORY (host specific, they each write in a separate one)
The directories are on a mountpoint connected using glusterfs 3.3.1 (3.3.1-ubuntu1~lucid8 from https://launchpad.net/~semiosis/+archive/ubuntu-glusterfs-3.3)

This is how dbench died:

I, [2013-03-16T05:34:03.176890 #13121] INFO -- : [710] rename /mnt/gfs/something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/NEWPCB.PPT /mnt/gfs/something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/PPTB1E4.TMP failed (No such file or directory) - expected NT_STATUS_OK

These are the logs at the time. They are a bit noisy; the matching message is emphasised using *****:

[2013-03-16 05:34:03.082813] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f274
[2013-03-16 05:34:03.082813] W [client3_1-fops.c:1595:client3_1_entrylk_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.082813] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/PPTB1E4.TMP (b49d6051-93f6-4eca-b161-865a5bea964b)
[2013-03-16 05:34:03.082813] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f4cc
[2013-03-16 05:34:03.082813] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f6c0
[2013-03-16 05:34:03.082813] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f468
[2013-03-16 05:34:03.082813] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f33c
[2013-03-16 05:34:03.082813] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.092814] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/ZD16.BMP (73e3b099-48cd-4e76-8049-c64bf8f63500)
[2013-03-16 05:34:03.092814] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/NEWPCB.PPT (ba53fb9f-0648-4794-aaa9-bba9331b52cb)
[2013-03-16 05:34:03.092814] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/PCBENCHM.PPT (a0c96e9a-4d4a-4984-9892-ff0b2ecbb7e3)
[2013-03-16 05:34:03.092814] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client4/~dmtmp/PWRPNT/PPTB1E4.TMP (2b8f1677-6376-4286-a381-8f4897bc9f4a)
[2013-03-16 05:34:03.092814] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f594
[2013-03-16 05:34:03.092814] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f3a0
[2013-03-16 05:34:03.092814] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f2d8
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client4/~dmtmp/PWRPNT/ZD16.BMP (eafa5f6a-fe12-4b9c-a5b9-386f2ff2123f)
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client4/~dmtmp/PWRPNT/NEWPCB.PPT (8c99ede1-3782-49f0-b544-00f4ec3beb9b)
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client4/~dmtmp/PWRPNT/PCBENCHM.PPT (a725ede8-bc10-42a1-9622-55afad13f9f7)
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.112816] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.132819] W [client3_1-fops.c:1595:client3_1_entrylk_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.132819] W [client3_1-fops.c:1595:client3_1_entrylk_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.132819] W [client3_1-fops.c:1595:client3_1_entrylk_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.132819] W [client3_1-fops.c:1595:client3_1_entrylk_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.142820] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/NEWPCB.PPT (ba53fb9f-0648-4794-aaa9-bba9331b52cb)
[2013-03-16 05:34:03.142820] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f788
[2013-03-16 05:34:03.142820] W [client3_1-fops.c:418:client3_1_open_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/NEWPCB.PPT (ba53fb9f-0648-4794-aaa9-bba9331b52cb)
[2013-03-16 05:34:03.142820] W [client3_1-fops.c:881:client3_1_flush_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.142820] W [client3_1-fops.c:2546:client3_1_opendir_cbk] 0-remote9: remote operation failed: No such file or directory. Path: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT (6512393c-65b8-4d86-ae78-8a12eb2be395)
[2013-03-16 05:34:03.172824] W [client3_1-fops.c:1595:client3_1_entrylk_cbk] 0-remote9: remote operation failed: No such file or directory
[2013-03-16 05:34:03.172824] W [fuse-bridge.c:1516:fuse_rename_cbk] 0-glusterfs-fuse: 11218: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/NEWPCB.PPT -> /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/PPTB1E4.TMP => -1 (No such file or directory)
[2013-03-16 05:34:03.232831] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f404
[2013-03-16 05:34:03.242832] I [afr-open.c:318:afr_openfd_fix_open_cbk] 0-replicate0: fd for /something.example.com_1363412031/clients/client0/~dmtmp/PWRPNT/PCBENCHM.PPT opened successfully on subvolume remote9
[2013-03-16 05:34:03.252834] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f5f8
[2013-03-16 05:34:03.262835] I [afr-open.c:318:afr_openfd_fix_open_cbk] 0-replicate0: fd for /something.example.com_1363412031/clients/client3/~dmtmp/PWRPNT/PCBENCHM.PPT opened successfully on subvolume remote9

***********************
[2013-03-16 05:34:03.172824] W [fuse-bridge.c:1516:fuse_rename_cbk] 0-glusterfs-fuse: 11218: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/NEWPCB.PPT -> /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/PPTB1E4.TMP => -1 (No such file or directory)
***********************

[2013-03-16 05:34:03.232831] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f404
[2013-03-16 05:34:03.242832] I [afr-open.c:318:afr_openfd_fix_open_cbk] 0-replicate0: fd for /something.example.com_1363412031/clients/client0/~dmtmp/PWRPNT/PCBENCHM.PPT opened successfully on subvolume remote9
[2013-03-16 05:34:03.252834] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f5f8
[2013-03-16 05:34:03.262835] I [afr-open.c:318:afr_openfd_fix_open_cbk] 0-replicate0: fd for /something.example.com_1363412031/clients/client3/~dmtmp/PWRPNT/PCBENCHM.PPT opened successfully on subvolume remote9
[2013-03-16 05:36:21.547011] C [client-handshake.c:126:rpc_client_ping_timer_expired] 0-remote8: server 10.245.15.65:24007 has not responded in the last 42 seconds, disconnecting.
[2013-03-16 05:36:21.547011] C [client-handshake.c:126:rpc_client_ping_timer_expired] 0-remote9: server 10.196.239.242:24007 has not responded in the last 42 seconds, disconnecting.
[2013-03-16 05:36:21.547011] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib/libgfrpc.so.0(rpc_clnt_notify+0x78) [0x7f1adab0a048] (-->/usr/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb0) [0x7f1adab09d00] (-->/usr/lib/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7f1adab0976e]))) 0-remote8: forced unwinding frame type(GlusterFS 3.1) op(LOOKUP(27)) called at 2013-03-16 05:35:10.750385 (xid=0x18942x)
[2013-03-16 05:36:21.547011] W [client3_1-fops.c:2630:client3_1_lookup_cbk] 0-remote8: remote operation failed: Transport endpoint is not connected. Path: / (00000000-0000-0000-0000-000000000001)
[2013-03-16 05:36:21.547011] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib/libgfrpc.so.0(rpc_clnt_notify+0x78) [0x7f1adab0a048] (-->/usr/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb0) [0x7f1adab09d00] (-->/usr/lib/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7f1adab0976e]))) 0-remote8: forced unwinding frame type(GlusterFS 3.1) op(LOOKUP(27)) called at 2013-03-16 05:35:18.191110 (xid=0x18943x)
[2013-03-16 05:36:21.547011] W [client3_1-fops.c:2630:client3_1_lookup_cbk] 0-remote8: remote operation failed: Transport endpoint is not connected. Path: / (00000000-0000-0000-0000-000000000001)
[2013-03-16 05:36:21.547011] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib/libgfrpc.so.0(rpc_clnt_notify+0x78) [0x7f1adab0a048] (-->/usr/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb0) [0x7f1adab09d00] (-->/usr/lib/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7f1adab0976e]))) 0-remote8: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2013-03-16 05:35:39.543151 (xid=0x18944x)

Anybody have an idea what could cause such errors?
The rpc_client_ping_timer_expired timeouts seem a bit strange. They occur after the failure, and we do test networking problems in an earlier test, so they might just be left over from that.

Cheers,
Marc
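Since the excerpt above is noisy, here is a minimal sketch of how such a client log could be condensed. It assumes every entry starts with the "[timestamp] LEVEL [file:line:function]" header seen in the excerpt; the helper names (parse_line, summarize, fuse_failures) are made up for this illustration and are not part of GlusterFS or dbench.

```python
import re
from collections import Counter

# Header layout assumed from the excerpt:
# [timestamp] LEVEL [source.c:line:function] rest-of-message
LOG_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<src>[^:\]]+):(?P<lineno>\d+):(?P<func>[^\]]+)\] (?P<msg>.*)$"
)


def parse_line(line):
    """Return the header fields of one log line as a dict, or None if it does not match."""
    m = LOG_RE.match(line.strip())
    return m.groupdict() if m else None


def summarize(lines):
    """Count log entries per (level, function) to see which callbacks dominate."""
    counts = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec is not None:
            counts[(rec["level"], rec["func"])] += 1
    return counts


def fuse_failures(lines):
    """Keep only fuse-bridge.c entries whose message reports a -1 result."""
    hits = []
    for line in lines:
        rec = parse_line(line)
        if rec is not None and rec["src"] == "fuse-bridge.c" and "=> -1" in rec["msg"]:
            hits.append(rec)
    return hits


# Three lines copied from the excerpt above.
SAMPLE = [
    "[2013-03-16 05:34:03.082813] I [afr-inode-write.c:428:afr_open_fd_fix] 0-replicate0: Opening fd 0x7f1adb67f274",
    "[2013-03-16 05:34:03.172824] W [client3_1-fops.c:1595:client3_1_entrylk_cbk] 0-remote9: remote operation failed: No such file or directory",
    "[2013-03-16 05:34:03.172824] W [fuse-bridge.c:1516:fuse_rename_cbk] 0-glusterfs-fuse: 11218: /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/NEWPCB.PPT -> /something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/PPTB1E4.TMP => -1 (No such file or directory)",
]

for (level, func), n in sorted(summarize(SAMPLE).items()):
    print(level, func, n)
for rec in fuse_failures(SAMPLE):
    print(rec["ts"], rec["func"], rec["msg"])
```

Feeding the full client log through fuse_failures would surface the failed rename without the surrounding afr_open_fd_fix and flush_cbk noise, and summarize shows at a glance which callbacks produced most of the warnings.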
Pranith Kumar K
2013-Mar-18 09:02 UTC
[Gluster-users] Errors during dbench run (rename failed)
On 03/17/2013 06:55 PM, Marc Seeger wrote:
> This is how dbench died:
>
> I, [2013-03-16T05:34:03.176890 #13121] INFO -- : [710] rename /mnt/gfs/something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/NEWPCB.PPT /mnt/gfs/something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/PPTB1E4.TMP failed (No such file or directory) - expected NT_STATUS_OK
> ...
> Anybody have an idea what could cause such errors?

Hi,

If obtaining of entry locks fails for any of the bricks in a replica subvolume, rename used to fail. This bug is fixed in 3.4alpha.

Pranith.
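As an illustration of the explanation above, the following toy model (not GlusterFS code) contrasts a rename that requires the entry lock on every brick with one that tolerates a lock failure on some bricks. The function names are hypothetical, and the "at least one brick" rule for the fixed behaviour is an assumption; the thread only states that a lock failure on any brick used to fail the rename.

```python
# Toy model only: this is not GlusterFS code. Each element of lock_results
# says whether one brick of the replica subvolume granted the entry lock.
from typing import Sequence


def rename_allowed_strict(lock_results: Sequence[bool]) -> bool:
    """Behaviour described in the thread: a lock failure on any brick fails the rename."""
    return len(lock_results) > 0 and all(lock_results)


def rename_allowed_tolerant(lock_results: Sequence[bool]) -> bool:
    """Assumed fixed behaviour: continue when at least one brick granted the lock."""
    return any(lock_results)


# In the log excerpt the entrylk call failed on remote9 with "No such file
# or directory"; success on the other brick is an assumption for this demo.
locks = [True, False]
print("strict:", rename_allowed_strict(locks))      # prints "strict: False"
print("tolerant:", rename_allowed_tolerant(locks))  # prints "tolerant: True"
```

Under this model the strict rule turns a single missing directory entry on one replica into a failed rename for the application, which matches the ENOENT that dbench reported.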
Hans Lambermont
2013-Mar-18 09:06 UTC
[Gluster-users] Errors during dbench run (rename failed)
Pranith Kumar K wrote on 20130318:
> On 03/17/2013 06:55 PM, Marc Seeger wrote:
> >This is how dbench died:
> >I, [2013-03-16T05:34:03.176890 #13121] INFO -- : [710] rename /mnt/gfs/something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/NEWPCB.PPT /mnt/gfs/something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/PPTB1E4.TMP failed (No such file or directory) - expected NT_STATUS_OK
...
> hi, If obtaining of entry locks fails for any of the bricks in a
> replica subvolume, rename used to fail. This bug is fixed in 3.4alpha.

Cool ! Will this be backported to the 3.3 branch ?

regards,
Hans Lambermont
--
Hans Lambermont | Senior Architect
(t) +31407370104 (w) www.shapeways.com
Pranith Kumar K
2013-Mar-18 09:19 UTC
[Gluster-users] Errors during dbench run (rename failed)
On 03/18/2013 02:36 PM, Hans Lambermont wrote:
> Pranith Kumar K wrote on 20130318:
>
>> On 03/17/2013 06:55 PM, Marc Seeger wrote:
>>> This is how dbench died:
>>> I, [2013-03-16T05:34:03.176890 #13121] INFO -- : [710] rename /mnt/gfs/something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/NEWPCB.PPT /mnt/gfs/something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/PPTB1E4.TMP failed (No such file or directory) - expected NT_STATUS_OK
> ...
>> hi, If obtaining of entry locks fails for any of the bricks in a
>> replica subvolume, rename used to fail. This bug is fixed in 3.4alpha.
> Cool ! Will this be backported to the 3.3 branch ?
>
> regards,
> Hans Lambermont

That fix is a complete refactor of entry-locks. Let me check if I can come up with a small patch to fix just this case.

Pranith.
Pranith Kumar K
2013-Mar-19 06:31 UTC
[Gluster-users] Errors during dbench run (rename failed)
On 03/18/2013 02:49 PM, Pranith Kumar K wrote:
> On 03/18/2013 02:36 PM, Hans Lambermont wrote:
>> Pranith Kumar K wrote on 20130318:
>>
>>> On 03/17/2013 06:55 PM, Marc Seeger wrote:
>>>> This is how dbench died:
>>>> I, [2013-03-16T05:34:03.176890 #13121] INFO -- : [710] rename
>>>> /mnt/gfs/something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/NEWPCB.PPT
>>>> /mnt/gfs/something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/PPTB1E4.TMP
>>>> failed (No such file or directory) - expected NT_STATUS_OK
>> ...
>>> hi, If obtaining of entry locks fails for any of the bricks in a
>>> replica subvolume, rename used to fail. This bug is fixed in 3.4alpha.
>> Cool ! Will this be backported to the 3.3 branch ?
>>
>> regards,
>> Hans Lambermont
> That fix is a complete refactor of entry-locks. Let me check if I can
> come up with a small patch to fix just this case.
>
> Pranith.

Here is the fix: http://review.gluster.com/4689

Pranith.
Hans Lambermont
2013-Mar-19 09:09 UTC
[Gluster-users] Errors during dbench run (rename failed)
Pranith Kumar K wrote on 20130319:
> On 03/18/2013 02:49 PM, Pranith Kumar K wrote:
>>On 03/18/2013 02:36 PM, Hans Lambermont wrote:
>>>Pranith Kumar K wrote on 20130318:
>>>>On 03/17/2013 06:55 PM, Marc Seeger wrote:
>>>>>This is how dbench died:
>>>>>I, [2013-03-16T05:34:03.176890 #13121] INFO -- : [710]
>>>>>rename /mnt/gfs/something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/NEWPCB.PPT /mnt/gfs/something.example.com_1363412031/clients/client2/~dmtmp/PWRPNT/PPTB1E4.TMP
>>>>>failed (No such file or directory) - expected NT_STATUS_OK
>>>...
>>>>hi, If obtaining of entry locks fails for any of the bricks in a
>>>>replica subvolume, rename used to fail. This bug is fixed in 3.4alpha.
>>>Cool ! Will this be backported to the 3.3 branch ?
>>That fix is a complete refactor of entry-locks. Let me check if I
>>can come up with a small patch to fix just this case.
>
> Here is the fix:
> http://review.gluster.com/4689

Great, thanks !

regards,
Hans Lambermont
--
Hans Lambermont | Senior Architect
(t) +31407370104 (w) www.shapeways.com