Walter Deignan
2017-Sep-19 21:10 UTC
[Gluster-users] "Input/output error" on mkdir for PPC64 based client
I recently compiled the 3.10-5 client from source on a few PPC64 systems running RHEL 7.3. They are mounting a Gluster volume which is hosted on more traditional x86 servers.

Everything seems to be working properly except for creating new directories from the PPC64 clients. The mkdir command gives an "Input/output error" and for the first few minutes the new directory is inaccessible. I checked the backend bricks and confirmed the directory was created properly on all of them. After waiting for 2-5 minutes the directory magically becomes accessible.

This inaccessible directory issue only appears from the client which created it. When creating the directory from client #1 I can immediately see it with no errors from client #2.

Using a pre-compiled 3.10-5 package on an x86 client doesn't show the issue.

I poked around bugzilla but couldn't seem to find anything which matches this.

[root@mqdev1 hafsdev1_gv0]# ls -lh
total 8.0K
drwxrwxr-x. 4 mqm mqm 4.0K Sep 19 15:47 data
drwxr-xr-x. 2 root root 4.0K Sep 19 15:47 testdir
[root@mqdev1 hafsdev1_gv0]# mkdir testdir2
mkdir: cannot create directory ‘testdir2’: Input/output error
[root@mqdev1 hafsdev1_gv0]# ls
ls: cannot access testdir2: No such file or directory
data testdir testdir2
[root@mqdev1 hafsdev1_gv0]# ls -lht
ls: cannot access testdir2: No such file or directory
total 8.0K
drwxr-xr-x. 2 root root 4.0K Sep 19 15:47 testdir
drwxrwxr-x. 4 mqm mqm 4.0K Sep 19 15:47 data
d?????????? ? ? ? ? ? testdir2
[root@mqdev1 hafsdev1_gv0]# cd testdir2
-bash: cd: testdir2: No such file or directory

*Wait a few minutes...*

[root@mqdev1 hafsdev1_gv0]# ls -lht
total 12K
drwxr-xr-x. 2 root root 4.0K Sep 19 15:50 testdir2
drwxr-xr-x. 2 root root 4.0K Sep 19 15:47 testdir
drwxrwxr-x. 4 mqm mqm 4.0K Sep 19 15:47 data
[root@mqdev1 hafsdev1_gv0]#

My volume config...

[root@dc-hafsdev1a bricks]# gluster volume info

Volume Name: gv0
Type: Replicate
Volume ID: a2d37705-05cb-4700-8ed8-2cb89376faf0
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: dc-hafsdev1a.ulinedm.com:/gluster/bricks/brick1/data
Brick2: dc-hafsdev1b.ulinedm.com:/gluster/bricks/brick1/data
Brick3: dc-hafsdev1c.ulinedm.com:/gluster/bricks/brick1/data
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
network.ping-timeout: 2
features.bitrot: on
features.scrub: Active
cluster.server-quorum-ratio: 51%

-Walter Deignan
-Uline IT, Systems Architect
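The "2-5 minutes" in the report above could be measured rather than estimated. The following is a minimal sketch, not something from the original post: it creates a directory, records the errno if mkdir fails, and then polls until the directory can be stat'ed. The function name, the timeout, and the polling interval are all illustrative assumptions.

```python
import errno
import os
import time


def mkdir_and_wait(path, timeout=600.0, interval=5.0):
    """Create `path`, then poll until it can be stat'ed.

    Returns (mkdir_errno, seconds_until_visible). mkdir_errno is 0 when
    mkdir succeeded; seconds_until_visible is None if the timeout expired.
    """
    mkdir_errno = 0
    start = time.monotonic()
    try:
        os.mkdir(path)
    except OSError as exc:
        # On the affected clients the report suggests this would be EIO.
        mkdir_errno = exc.errno
    while time.monotonic() - start < timeout:
        try:
            os.stat(path)
            return mkdir_errno, time.monotonic() - start
        except OSError:
            # Still "No such file or directory" from this client; retry.
            time.sleep(interval)
    return mkdir_errno, None


if __name__ == "__main__":
    # Hypothetical mount point; substitute the real FUSE mount path.
    err, waited = mkdir_and_wait("/mnt/hafsdev1_gv0/testdir2")
    name = errno.errorcode.get(err, "OK") if err else "OK"
    print(f"mkdir result: {name}, visible after: {waited}")
```

Running it on both the PPC64 client and an x86 client would give comparable numbers for the same volume.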
Amar Tumballi
2017-Sep-20 18:23 UTC
[Gluster-users] "Input/output error" on mkdir for PPC64 based client
Looks like it is an issue with architecture compatibility in the RPC layer (i.e., with the XDRs and how they are used). Take a look at the logs of the client process where you saw the errors; they could give some hints. If you don't understand the logs, share them and we will try to look into it.

-Amar

--
Amar Tumballi (amarts)
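To illustrate the kind of architecture mismatch suggested above: XDR (RFC 4506) puts integers on the wire most significant byte first, so a field that is properly encoded reads the same on any host, while a field copied from memory without conversion only works when both ends share a byte order. The sketch below is plain Python, not GlusterFS code, and it assumes the PPC64 clients are big-endian (the thread does not say whether they are ppc64 or ppc64le). All function names are made up for the example.

```python
import struct


def xdr_encode_u32(value):
    """XDR unsigned int: 4 bytes, most significant byte first."""
    return struct.pack(">I", value)


def xdr_decode_u32(data):
    """Decode a 4-byte XDR unsigned int, independent of host byte order."""
    return struct.unpack(">I", data)[0]


def raw_copy_u32(value, host_byteorder):
    """Bytes a host would send if it copied the in-memory integer as-is."""
    return value.to_bytes(4, host_byteorder)


if __name__ == "__main__":
    value = 1
    # Properly encoded: both architectures agree.
    print(xdr_decode_u32(xdr_encode_u32(value)))
    # Unconverted field, little-endian sender and receiver: looks fine.
    print(int.from_bytes(raw_copy_u32(value, "little"), "little"))
    # Unconverted field, big-endian sender, little-endian receiver: wrong.
    print(int.from_bytes(raw_copy_u32(value, "big"), "little"))
```

The second case is why a bug of this type can stay hidden as long as every client and server is x86.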
Walter Deignan
2017-Sep-20 18:27 UTC
[Gluster-users] "Input/output error" on mkdir for PPC64 based client
I put the share into debug mode and then repeated the process from a ppc64 client and an x86 client. Weirdly the client logs were almost identical.

Here's the ppc64 gluster client log of attempting to create a folder...

-------------
[2017-09-20 13:34:23.344321] D [rpc-clnt-ping.c:93:rpc_clnt_remove_ping_timer_locked] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xfdf60)[0x3fff9ec56fe0] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked-0x26060)[0x3fff9ebd9e20] (--> /usr/lib64/libgfrpc.so.0(+0x1a614)[0x3fff9ebda614] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_submit-0x29300)[0x3fff9ebd69b0] (--> /usr/lib64/glusterfs/3.10.5/xlator/protocol/client.so(+0x182e0)[0x3fff939182e0] ))))) 0-: 10.50.80.102:49152: ping timer event already removed
[2017-09-20 13:34:23.345149] D [rpc-clnt-ping.c:93:rpc_clnt_remove_ping_timer_locked] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xfdf60)[0x3fff9ec56fe0] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked-0x26060)[0x3fff9ebd9e20] (--> /usr/lib64/libgfrpc.so.0(+0x1a614)[0x3fff9ebda614] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_submit-0x29300)[0x3fff9ebd69b0] (--> /usr/lib64/glusterfs/3.10.5/xlator/protocol/client.so(+0x182e0)[0x3fff939182e0] ))))) 0-: 10.50.80.103:49152: ping timer event already removed
[2017-09-20 13:34:23.345977] D [rpc-clnt-ping.c:93:rpc_clnt_remove_ping_timer_locked] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xfdf60)[0x3fff9ec56fe0] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked-0x26060)[0x3fff9ebd9e20] (--> /usr/lib64/libgfrpc.so.0(+0x1a614)[0x3fff9ebda614] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_submit-0x29300)[0x3fff9ebd69b0] (--> /usr/lib64/glusterfs/3.10.5/xlator/protocol/client.so(+0x182e0)[0x3fff939182e0] ))))) 0-: 10.50.80.104:49152: ping timer event already removed
[2017-09-20 13:34:23.346070] D [MSGID: 0] [dht-common.c:1002:dht_revalidate_cbk] 0-gv0-dht: revalidate lookup of / returned with op_ret 0 [Structure needs cleaning]
[2017-09-20 13:34:23.347612] D [MSGID: 0] [dht-common.c:2699:dht_lookup] 0-gv0-dht: Calling fresh lookup for /tempdir3 on gv0-replicate-0
[2017-09-20 13:34:23.348013] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-client-1 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.348013] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-client-0 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.348083] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-client-2 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.348132] D [MSGID: 0] [afr-common.c:2264:afr_lookup_done] 0-stack-trace: stack-address: 0x3fff88001080, gv0-replicate-0 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.348166] D [MSGID: 0] [dht-common.c:2284:dht_lookup_cbk] 0-gv0-dht: fresh_lookup returned for /tempdir3 with op_ret -1 [No such file or directory]
[2017-09-20 13:34:23.348195] D [MSGID: 0] [dht-common.c:2297:dht_lookup_cbk] 0-gv0-dht: Entry /tempdir3 missing on subvol gv0-replicate-0
[2017-09-20 13:34:23.348220] D [MSGID: 0] [dht-common.c:2068:dht_lookup_everywhere] 0-gv0-dht: winding lookup call to 1 subvols
[2017-09-20 13:34:23.348551] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-client-1 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.348551] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-client-0 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.348613] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-client-2 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.348639] D [MSGID: 0] [afr-common.c:2264:afr_lookup_done] 0-stack-trace: stack-address: 0x3fff88001080, gv0-replicate-0 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.348665] D [MSGID: 0] [dht-common.c:1870:dht_lookup_everywhere_cbk] 0-gv0-dht: returned with op_ret -1 and op_errno 2 (/tempdir3) from subvol gv0-replicate-0
[2017-09-20 13:34:23.348697] D [MSGID: 0] [dht-common.c:1535:dht_lookup_everywhere_done] 0-gv0-dht: STATUS: hashed_subvol gv0-replicate-0 cached_subvol null
[2017-09-20 13:34:23.348740] D [MSGID: 0] [dht-common.c:1596:dht_lookup_everywhere_done] 0-gv0-dht: There was no cached file and unlink on hashed is not skipped /tempdir3
[2017-09-20 13:34:23.348783] D [MSGID: 0] [dht-common.c:1599:dht_lookup_everywhere_done] 0-stack-trace: stack-address: 0x3fff88001080, gv0-dht returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.348817] D [MSGID: 0] [write-behind.c:2392:wb_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-write-behind returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.348845] D [MSGID: 0] [io-cache.c:267:ioc_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-io-cache returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.348875] D [MSGID: 0] [quick-read.c:447:qr_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-quick-read returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.348906] D [MSGID: 0] [md-cache.c:1048:mdc_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-md-cache returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.348955] D [MSGID: 0] [io-stats.c:2186:io_stats_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.348994] D [fuse-resolve.c:61:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/tempdir3: failed to resolve (No such file or directory)
[2017-09-20 13:34:23.349589] D [MSGID: 0] [dht-common.c:2699:dht_lookup] 0-gv0-dht: Calling fresh lookup for /tempdir3 on gv0-replicate-0
[2017-09-20 13:34:23.349917] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff8c063900, gv0-client-1 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.349961] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff8c063900, gv0-client-2 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.350055] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff8c063900, gv0-client-0 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.350076] D [MSGID: 0] [afr-common.c:2264:afr_lookup_done] 0-stack-trace: stack-address: 0x3fff8c063900, gv0-replicate-0 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.350099] D [MSGID: 0] [dht-common.c:2284:dht_lookup_cbk] 0-gv0-dht: fresh_lookup returned for /tempdir3 with op_ret -1 [No such file or directory]
[2017-09-20 13:34:23.350124] D [MSGID: 0] [dht-common.c:2297:dht_lookup_cbk] 0-gv0-dht: Entry /tempdir3 missing on subvol gv0-replicate-0
[2017-09-20 13:34:23.350146] D [MSGID: 0] [dht-common.c:2068:dht_lookup_everywhere] 0-gv0-dht: winding lookup call to 1 subvols
[2017-09-20 13:34:23.350473] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff8c063900, gv0-client-1 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.350473] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff8c063900, gv0-client-0 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.350528] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff8c063900, gv0-client-2 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.350553] D [MSGID: 0] [afr-common.c:2264:afr_lookup_done] 0-stack-trace: stack-address: 0x3fff8c063900, gv0-replicate-0 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.350583] D [MSGID: 0] [dht-common.c:1870:dht_lookup_everywhere_cbk] 0-gv0-dht: returned with op_ret -1 and op_errno 2 (/tempdir3) from subvol gv0-replicate-0
[2017-09-20 13:34:23.350610] D [MSGID: 0] [dht-common.c:1535:dht_lookup_everywhere_done] 0-gv0-dht: STATUS: hashed_subvol gv0-replicate-0 cached_subvol null
[2017-09-20 13:34:23.350635] D [MSGID: 0] [dht-common.c:1596:dht_lookup_everywhere_done] 0-gv0-dht: There was no cached file and unlink on hashed is not skipped /tempdir3
[2017-09-20 13:34:23.350660] D [MSGID: 0] [dht-common.c:1599:dht_lookup_everywhere_done] 0-stack-trace: stack-address: 0x3fff8c063900, gv0-dht returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.350693] D [MSGID: 0] [write-behind.c:2392:wb_lookup_cbk] 0-stack-trace: stack-address: 0x3fff8c063900, gv0-write-behind returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.350724] D [MSGID: 0] [io-cache.c:267:ioc_lookup_cbk] 0-stack-trace: stack-address: 0x3fff8c063900, gv0-io-cache returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.350756] D [MSGID: 0] [quick-read.c:447:qr_lookup_cbk] 0-stack-trace: stack-address: 0x3fff8c063900, gv0-quick-read returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.350790] D [MSGID: 0] [md-cache.c:1048:mdc_lookup_cbk] 0-stack-trace: stack-address: 0x3fff8c063900, gv0-md-cache returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.350818] D [MSGID: 0] [io-stats.c:2186:io_stats_lookup_cbk] 0-stack-trace: stack-address: 0x3fff8c063900, gv0 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.351228] D [MSGID: 0] [dht-common.c:2699:dht_lookup] 0-gv0-dht: Calling fresh lookup for /tempdir3 on gv0-replicate-0
[2017-09-20 13:34:23.351573] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-client-1 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.351573] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-client-0 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.351637] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-client-2 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.351664] D [MSGID: 0] [afr-common.c:2264:afr_lookup_done] 0-stack-trace: stack-address: 0x3fff88001080, gv0-replicate-0 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.351688] D [MSGID: 0] [dht-common.c:2284:dht_lookup_cbk] 0-gv0-dht: fresh_lookup returned for /tempdir3 with op_ret -1 [No such file or directory]
[2017-09-20 13:34:23.351715] D [MSGID: 0] [dht-common.c:2297:dht_lookup_cbk] 0-gv0-dht: Entry /tempdir3 missing on subvol gv0-replicate-0
[2017-09-20 13:34:23.351738] D [MSGID: 0] [dht-common.c:2068:dht_lookup_everywhere] 0-gv0-dht: winding lookup call to 1 subvols
[2017-09-20 13:34:23.352069] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-client-0 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.352074] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-client-1 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.352120] D [MSGID: 0] [client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-client-2 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.352145] D [MSGID: 0] [afr-common.c:2264:afr_lookup_done] 0-stack-trace: stack-address: 0x3fff88001080, gv0-replicate-0 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.352169] D [MSGID: 0] [dht-common.c:1870:dht_lookup_everywhere_cbk] 0-gv0-dht: returned with op_ret -1 and op_errno 2 (/tempdir3) from subvol gv0-replicate-0
[2017-09-20 13:34:23.352192] D [MSGID: 0] [dht-common.c:1535:dht_lookup_everywhere_done] 0-gv0-dht: STATUS: hashed_subvol gv0-replicate-0 cached_subvol null
[2017-09-20 13:34:23.352212] D [MSGID: 0] [dht-common.c:1596:dht_lookup_everywhere_done] 0-gv0-dht: There was no cached file and unlink on hashed is not skipped /tempdir3
[2017-09-20 13:34:23.352231] D [MSGID: 0] [dht-common.c:1599:dht_lookup_everywhere_done] 0-stack-trace: stack-address: 0x3fff88001080, gv0-dht returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.352264] D [MSGID: 0] [write-behind.c:2392:wb_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-write-behind returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.352296] D [MSGID: 0] [io-cache.c:267:ioc_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-io-cache returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.352326] D [MSGID: 0] [quick-read.c:447:qr_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-quick-read returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.352361] D [MSGID: 0] [md-cache.c:1048:mdc_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0-md-cache returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.352392] D [MSGID: 0] [io-stats.c:2186:io_stats_lookup_cbk] 0-stack-trace: stack-address: 0x3fff88001080, gv0 returned -1 error: No such file or directory [No such file or directory]
[2017-09-20 13:34:23.352424] D [fuse-resolve.c:61:fuse_resolve_entry_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001/tempdir3: failed to resolve (No such file or directory)
[2017-09-20 13:34:23.352749] D [MSGID: 0] [dht-diskusage.c:96:dht_du_info_cbk] 0-gv0-dht: subvolume 'gv0-replicate-0': avail_percent is: 99.00 and avail_space is: 21425758208 and avail_inodes is: 99.00
[2017-09-20 13:34:23.353086] D [MSGID: 0] [afr-transaction.c:1934:afr_post_nonblocking_entrylk_cbk] 0-gv0-replicate-0: Non blocking entrylks done. Proceeding to FOP
[2017-09-20 13:34:23.353722] D [MSGID: 0] [dht-selfheal.c:1879:dht_selfheal_layout_new_directory] 0-gv0-dht: chunk size = 0xffffffff / 20466 = 209858.658018
[2017-09-20 13:34:23.353748] D [MSGID: 0] [dht-selfheal.c:1920:dht_selfheal_layout_new_directory] 0-gv0-dht: assigning range size 0xffffffff to gv0-replicate-0
[2017-09-20 13:34:23.353897] D [MSGID: 0] [afr-lk-common.c:448:transaction_lk_op] 0-gv0-replicate-0: lk op is for a transaction
[2017-09-20 13:34:23.354052] D [MSGID: 0] [afr-transaction.c:1883:afr_post_nonblocking_inodelk_cbk] 0-gv0-replicate-0: Non blocking inodelks done. Proceeding to FOP
[2017-09-20 13:34:23.354453] D [MSGID: 0] [afr-lk-common.c:448:transaction_lk_op] 0-gv0-replicate-0: lk op is for a transaction
[2017-09-20 13:34:23.354969] D [MSGID: 109036] [dht-common.c:9527:dht_log_new_layout_for_dir_selfheal] 0-gv0-dht: Setting layout of /tempdir3 with [Subvol_name: gv0-replicate-0, Err: -1 , Start: 0 , Stop: 4294967295 , Hash: 1 ],
[2017-09-20 13:34:23.355226] D [MSGID: 0] [afr-transaction.c:1883:afr_post_nonblocking_inodelk_cbk] 0-gv0-replicate-0: Non blocking inodelks done. Proceeding to FOP
[2017-09-20 13:34:23.355714] D [MSGID: 0] [afr-lk-common.c:448:transaction_lk_op] 0-gv0-replicate-0: lk op is for a transaction

-Walter Deignan
-Uline IT, Systems Architect
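Since the ppc64 and x86 client logs were described as "almost identical", one way to find the part that is not identical is to strip the fields that always differ (timestamps, stack addresses) and compare what remains. The sketch below is an assumption-laden helper, not a Gluster tool: the regular expression is inferred only from the debug lines in this message, and `normalize` and `summarize` are made-up names.

```python
import re
from collections import Counter

# Line shape inferred from the log in this thread:
# [timestamp] LEVEL [MSGID: n] [file.c:line:function] rest-of-message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"(?:\[MSGID: (?P<msgid>\d+)\] )?"
    r"\[(?P<src>[^\]:]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<rest>.*)$"
)
HEX = re.compile(r"0x[0-9a-f]+")


def normalize(line):
    """Drop the timestamp and mask hex values; None if the line does not parse."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    rest = HEX.sub("0xHEX", match.group("rest"))
    return f'{match.group("level")} {match.group("func")} {rest}'


def summarize(lines):
    """Count each normalized message so two logs can be compared."""
    return Counter(n for n in map(normalize, lines) if n is not None)


if __name__ == "__main__":
    # Hypothetical file names for the two captured client logs.
    with open("ppc64-client.log") as f:
        ppc = summarize(f)
    with open("x86-client.log") as f:
        x86 = summarize(f)
    # Counter subtraction keeps only messages that are more frequent on the left.
    for msg, count in (ppc - x86).items():
        print(f"only/more on ppc64 ({count}x): {msg}")
```

Masking every hex value also hides constants such as `0xffffffff`, which is acceptable for a first pass but worth remembering when reading the output.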
Niels de Vos
2017-Sep-21 14:16 UTC
[Gluster-users] "Input/output error" on mkdir for PPC64 based client
On Tue, Sep 19, 2017 at 04:10:26PM -0500, Walter Deignan wrote:
> I poked around bugzilla but couldn't seem to find anything which matches
> this.

Maybe https://bugzilla.redhat.com/show_bug.cgi?id=951903 ? Some of the details have also been captured in https://github.com/gluster/glusterfs/issues/203

Capturing a tcpdump and opening it up with Wireshark may help in identifying if GFIDs are mixed up. Possibly some are also mentioned in different logs, but that might be more difficult to find.

HTH,
Niels
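When comparing GFIDs taken from different places (a packet capture, client logs, the bricks), it helps to render them all in the same textual form first. A GFID is a 16-byte UUID; the client log earlier in the thread prints the root as 00000000-0000-0000-0000-000000000001. The following is a small sketch for that comparison; the function names and the idea of collecting raw bytes per source are assumptions for illustration, not part of any Gluster tooling.

```python
import uuid

# Root GFID as it appears in the client log earlier in this thread.
ROOT_GFID = uuid.UUID("00000000-0000-0000-0000-000000000001")


def gfid_from_bytes(raw):
    """Render 16 raw GFID bytes in the dashed form used in the logs."""
    if len(raw) != 16:
        raise ValueError(f"expected 16 bytes, got {len(raw)}")
    return uuid.UUID(bytes=raw)


def gfid_mismatches(observations):
    """Compare GFIDs observed for the same path from several sources.

    `observations` maps a source label (hypothetical, e.g. "brick1" or
    "client capture") to 16 raw bytes. Returns {} when all sources agree,
    otherwise a mapping of label -> dashed GFID string.
    """
    seen = {name: gfid_from_bytes(raw) for name, raw in observations.items()}
    if len(set(seen.values())) <= 1:
        return {}
    return {name: str(gfid) for name, gfid in seen.items()}


if __name__ == "__main__":
    # Made-up bytes purely to show the output shape.
    print(gfid_mismatches({
        "brick1": bytes(15) + b"\x01",
        "client capture": b"\x01" + bytes(15),
    }))
```

On a brick, the raw bytes for a path could presumably be read as root from the `trusted.gfid` extended attribute (for example with `os.getxattr`); the values seen on the wire would have to be copied out of the capture by hand.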