elvinas.piliponis at barclays.com
2013-May-02 12:27 UTC
[Gluster-users] GlusterFS mount does not list directory content until parent directory is listed
Hello,

I have spotted strange behaviour of a GlusterFS FUSE mount. I am unable to list files in a directory until its parent directory has been listed. However, if I list a file by its full path, it is listed on some client nodes.

Example:

localadmin@ldgpsua00000038:~$ ls -al /var/lib/nova/instances/_base/
ls: cannot access /var/lib/nova/instances/_base/: No such file or directory
localadmin@ldgpsua00000038:~$ ls -al /var/lib/nova/instances/
total 32271483
drwxr-xr-x 413 nova nova 622592 May 2 10:48 .
drwxr-xr-x 9 nova nova 4096 Nov 14 13:37 ..
drwxrwxr-x 2 nova nova 102406 Apr 30 12:19 _base
drwxr-xr-x 2 root root 132 Mar 12 13:58 _base.unused
-rw-r--r-- 1 root root 224182 Mar 15 11:42 --help
drwxr-xr-x 2 root root 226 Mar 29 03:42 instance-000001eb
drwxr-xr-x 2 root root 208 Mar 29 03:42 instance-000001ec
drwxrwxr-x 2 nova nova 226 Mar 29 03:33 instance-00000296
..... [skipped lots of lines]...
drwxr-xr-x 2 root root 226 Mar 29 03:18 win-src8
drwxr-xr-x 2 root root 226 Mar 29 03:18 win-src9
localadmin@ldgpsua00000038:~$ ls -al /var/lib/nova/instances/_base/
total 8523873965
drwxrwxr-x 2 nova nova 102406 Apr 30 12:19 .
drwxr-xr-x 413 nova nova 622592 May 2 10:48 ..
-rw-r--r-- 1 nova nova 75161927680 Apr 24 19:51 049a236d5e3b297288507788369c705d3d46c17a
-rw-r--r-- 1 libvirt-qemu kvm 75161927680 Apr 24 20:14 049a236d5e3b297288507788369c705d3d46c17a_70
-rw-r--r-- 1 nova nova 75161927680 Apr 9 12:37 055c0e6190c5d5629c0b1e8c1865a60d14005471
-rw-r--r-- 1 libvirt-qemu kvm 75161927680 Apr 9 12:49 055c0e6190c5d5629c0b1e8c1865a60d14005471_70
...[skipped lots of lines]....
-rw-r--r-- 1 nova nova 75161927680 Mar 25 13:13 ff8ad6c675c84df6f70f9bd0ac04fb4189b3c899
-rw-r--r-- 1 libvirt-qemu kvm 75161927680 Mar 25 13:46 ff8ad6c675c84df6f70f9bd0ac04fb4189b3c899_70
localadmin@ldgpsua00000038:~$

Before the parent directory was relisted, the log file for the mount point was full of:

--------------------------
[2013-05-02 12:20:01.376593] I [dht-common.c:596:dht_revalidate_cbk] 3-glustervmstore-dht: mismatching layouts for /
[2013-05-02 12:20:51.975861] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in /_base. holes=0 overlaps=2
[2013-05-02 12:20:52.077131] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in /_base. holes=0 overlaps=2
[2013-05-02 12:20:52.096745] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in <gfid:bec8d3c8-fb57-4e0f-88da-cf28c7e5fadc>. holes=0 overlaps=2
[2013-05-02 12:20:52.096840] W [fuse-resolve.c:152:fuse_resolve_gfid_cbk] 0-fuse: bec8d3c8-fb57-4e0f-88da-cf28c7e5fadc: failed to resolve (Invalid argument)
[2013-05-02 12:20:52.096868] E [fuse-bridge.c:352:fuse_lookup_resume] 0-fuse: failed to resolve path (null)
[2013-05-02 12:20:52.102258] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in /_base. holes=0 overlaps=2
[2013-05-02 12:20:52.118880] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in <gfid:bec8d3c8-fb57-4e0f-88da-cf28c7e5fadc>. holes=0 overlaps=2
[2013-05-02 12:20:52.118936] W [fuse-resolve.c:152:fuse_resolve_gfid_cbk] 0-fuse: bec8d3c8-fb57-4e0f-88da-cf28c7e5fadc: failed to resolve (Invalid argument)
[2013-05-02 12:20:52.118958] E [fuse-bridge.c:352:fuse_lookup_resume] 0-fuse: failed to resolve path (null)
[2013-05-02 12:20:54.550788] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in /_base. holes=0 overlaps=2
[2013-05-02 12:20:54.651945] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in /_base. holes=0 overlaps=2
[2013-05-02 12:20:54.666836] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in <gfid:bec8d3c8-fb57-4e0f-88da-cf28c7e5fadc>. holes=0 overlaps=2
[2013-05-02 12:20:54.666897] W [fuse-resolve.c:152:fuse_resolve_gfid_cbk] 0-fuse: bec8d3c8-fb57-4e0f-88da-cf28c7e5fadc: failed to resolve (Invalid argument)
[2013-05-02 12:20:54.666919] E [fuse-bridge.c:352:fuse_lookup_resume] 0-fuse: failed to resolve path (null)
[2013-05-02 12:20:54.672033] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in /_base. holes=0 overlaps=2
[2013-05-02 12:20:54.692809] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in <gfid:bec8d3c8-fb57-4e0f-88da-cf28c7e5fadc>. holes=0 overlaps=2
[2013-05-02 12:20:54.692869] W [fuse-resolve.c:152:fuse_resolve_gfid_cbk] 0-fuse: bec8d3c8-fb57-4e0f-88da-cf28c7e5fadc: failed to resolve (Invalid argument)
[2013-05-02 12:20:54.692891] E [fuse-bridge.c:352:fuse_lookup_resume] 0-fuse: failed to resolve path (null)
-------------

After the relist, the errors changed to info messages:

[2013-05-02 12:21:22.145739] I [dht-common.c:596:dht_revalidate_cbk] 3-glustervmstore-dht: mismatching layouts for /
[2013-05-02 12:21:22.145839] I [dht-layout.c:698:dht_layout_dir_mismatch] 3-glustervmstore-dht: subvol: glustervmstore-replicate-6; inode layout - 1561806288 - 1757032073; disk layout - 0 - 330382098
[2013-05-02 12:21:22.145865] I [dht-common.c:596:dht_revalidate_cbk] 3-glustervmstore-dht: mismatching layouts for /
[2013-05-02 12:21:22.145991] I [dht-layout.c:698:dht_layout_dir_mismatch] 3-glustervmstore-dht: subvol: glustervmstore-replicate-7; inode layout - 1757032074 - 1952257859; disk layout - 1636178016 - 1840700267
[2013-05-02 12:21:22.146021] I [dht-common.c:596:dht_revalidate_cbk] 3-glustervmstore-dht: mismatching layouts for /
[2013-05-02 12:21:22.146177] I [dht-layout.c:698:dht_layout_dir_mismatch] 3-glustervmstore-dht: subvol: glustervmstore-replicate-8; inode layout - 1952257860 - 2147483645; disk layout - 1840700268 - 2045222519
[2013-05-02 12:21:22.146277] I [dht-common.c:596:dht_revalidate_cbk] 3-glustervmstore-dht: mismatching layouts for /
[2013-05-02 12:21:22.146419] I [dht-layout.c:698:dht_layout_dir_mismatch] 3-glustervmstore-dht: subvol: glustervmstore-replicate-10; inode layout - 2733161004 - 2928386789; disk layout - 2658789276 - 2863311527

I am using the Semiosis packages on Ubuntu 12.04:

ii glusterfs-client 3.3.1-ubuntu1~precise8 clustered file-system (client package)
ii glusterfs-common 3.3.1-ubuntu1~precise8 GlusterFS common libraries and translator modules
ii glusterfs-server 3.3.1-ubuntu1~precise8 clustered file-system (server package)

This morning I recovered from a cluster lockup in which one node got stuck with "CPU #X stuck on task for more than XY seconds". For some reason Gluster blindly attempted to continue I/O operations, although the log showed lots of RPC connection errors to the stuck node. The node allowed connections to be initiated, but nothing happened afterwards. This stalled all I/O operations on the shared file system. Potentially this caused the issue above.

However, I have run the volume heal full command several times. No split-brain files were listed, and only several heal-failed files were listed, in the / directory of the GlusterFS volume. Most likely these are stale records, as at least some of those files were deleted some time ago.

Any ideas where to look further?

Thank you
______________________________________________________________________________
Elvinas Piliponis | UNIX Engineer | GTIS UNIX Engineering
Tel +370 5 251 1218, 7 2249 1218 | Mobile +370 656 69249 | Email elvinas.piliponis at barclays.com
Barclays, GreenHall 9th floor 09.E4.1, Upės g. 21, Vilnius, Lithuania LT-081218
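
The dht_layout_dir_mismatch entries quoted in this message are easier to compare once the ranges are pulled out per subvolume. Below is a rough Python sketch that does this. The regular expression is derived only from the log lines quoted here, so it is an assumption that other GlusterFS versions format the line the same way, and the helper names (parse_mismatches, ranges_overlap) are made up for illustration.

```python
import re

# Pattern derived from the dht_layout_dir_mismatch lines quoted in this
# message; other GlusterFS versions may format the line differently.
MISMATCH_RE = re.compile(
    r"subvol: (?P<subvol>\S+); "
    r"inode layout - (?P<inode_start>\d+) - (?P<inode_stop>\d+); "
    r"disk layout - (?P<disk_start>\d+) - (?P<disk_stop>\d+)"
)


def parse_mismatches(lines):
    """Yield (subvol, inode_range, disk_range) for each mismatch line."""
    for line in lines:
        m = MISMATCH_RE.search(line)
        if m:
            yield (
                m.group("subvol"),
                (int(m.group("inode_start")), int(m.group("inode_stop"))),
                (int(m.group("disk_start")), int(m.group("disk_stop"))),
            )


def ranges_overlap(a, b):
    """True if two inclusive (start, stop) ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    # One of the lines quoted above, used as sample input.
    sample = [
        "[2013-05-02 12:21:22.145839] I "
        "[dht-layout.c:698:dht_layout_dir_mismatch] 3-glustervmstore-dht: "
        "subvol: glustervmstore-replicate-6; "
        "inode layout - 1561806288 - 1757032073; disk layout - 0 - 330382098",
    ]
    for subvol, inode_range, disk_range in parse_mismatches(sample):
        relation = "overlap" if ranges_overlap(inode_range, disk_range) else "disjoint"
        print(subvol, inode_range, disk_range, relation)
```

Feeding it the client mount log instead of the sample list (for example, the lines of the open log file) would give one row per mismatch entry, which makes it simpler to see which subvolumes hold a cached layout that differs from the one on disk.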
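
To check which client nodes show the behaviour, the sequence from the example (look up the child directory, list the parent, look up the child again) can be scripted. This is a minimal Python sketch that uses only ordinary filesystem calls and assumes nothing about GlusterFS internals. The function name probe_lookup is hypothetical, and the path is the one from this report.

```python
import os


def probe_lookup(child_dir):
    """Check whether child_dir becomes visible only after its parent is listed.

    Returns a (visible_before, visible_after) pair of booleans.
    """
    parent = os.path.dirname(child_dir.rstrip("/")) or "/"

    # First attempt: a plain lookup of the child, like the failing `ls`.
    visible_before = os.path.isdir(child_dir)

    # Listing the parent is what made the child appear in the report.
    try:
        os.listdir(parent)
    except OSError:
        pass  # parent itself is unreachable; the second check will show it

    visible_after = os.path.isdir(child_dir)
    return visible_before, visible_after


if __name__ == "__main__":
    before, after = probe_lookup("/var/lib/nova/instances/_base/")
    print("visible before parent listing:", before)
    print("visible after parent listing: ", after)
```

A result of (False, True) on a client would match the symptom described here, while (True, True) would suggest that client already had a usable lookup for the directory.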