Christopher Anderlik
2011-Apr-29 06:42 UTC
[Gluster-users] strange behavior with glusterfs 3.2.0
hello.

we use glusterfs 3.2.0:
2 glusterfs servers with SLES 11.1
and several clients which access the gfs volumes.

configuration:

info
----
type=2
count=2
status=1
sub_count=2
version=1
transport-type=0
volume-id=05168b54-6a5c-4aa3-91ee-63d16976c6cd
brick-0=10.0.1.xxx:-glusterstorage-macm03
brick-1=10.0.1.xxy:-glusterstorage-macm03

macm03-fuse.vol
---------------
volume macm03-client-0
    type protocol/client
    option remote-host 10.0.1.xxx
    option remote-subvolume /glusterstorage/macm03
    option transport-type tcp
end-volume

volume macm03-client-1
    type protocol/client
    option remote-host 10.0.1.xxy
    option remote-subvolume /glusterstorage/macm03
    option transport-type tcp
end-volume

volume macm03-replicate-0
    type cluster/replicate
    subvolumes macm03-client-0 macm03-client-1
end-volume

volume macm03-write-behind
    type performance/write-behind
    subvolumes macm03-replicate-0
end-volume

volume macm03-read-ahead
    type performance/read-ahead
    subvolumes macm03-write-behind
end-volume

volume macm03-io-cache
    type performance/io-cache
    subvolumes macm03-read-ahead
end-volume

volume macm03-quick-read
    type performance/quick-read
    subvolumes macm03-io-cache
end-volume

volume macm03-stat-prefetch
    type performance/stat-prefetch
    subvolumes macm03-quick-read
end-volume

volume macm03
    type debug/io-stats
    subvolumes macm03-stat-prefetch
end-volume

macm03.10.0.1.xxx.glusterstorage-macm03.vol
-------------------------------------------
volume macm03-posix
    type storage/posix
    option directory /glusterstorage/macm03
end-volume

volume macm03-access-control
    type features/access-control
    subvolumes macm03-posix
end-volume

volume macm03-locks
    type features/locks
    subvolumes macm03-access-control
end-volume

volume macm03-io-threads
    type performance/io-threads
    subvolumes macm03-locks
end-volume

volume /glusterstorage/macm03
    type debug/io-stats
    subvolumes macm03-io-threads
end-volume

volume macm03-server
    type protocol/server
    option transport-type tcp
    option auth.addr./glusterstorage/macm03.allow *
    subvolumes /glusterstorage/macm03
end-volume

macm03.10.0.1.xxy.glusterstorage-macm03.vol
-------------------------------------------
volume macm03-posix
    type storage/posix
    option directory /glusterstorage/macm03
end-volume

volume macm03-access-control
    type features/access-control
    subvolumes macm03-posix
end-volume

volume macm03-locks
    type features/locks
    subvolumes macm03-access-control
end-volume

volume macm03-io-threads
    type performance/io-threads
    subvolumes macm03-locks
end-volume

volume /glusterstorage/macm03
    type debug/io-stats
    subvolumes macm03-io-threads
end-volume

volume macm03-server
    type protocol/server
    option transport-type tcp
    option auth.addr./glusterstorage/macm03.allow *
    subvolumes /glusterstorage/macm03
end-volume

client
------
the client has mounted the volume via fstab like this:

server:/macm03  /srv/www/GFS  glusterfs  defaults,_netdev  0 0

now we have noticed some strange behavior and i have some questions:

1) files with size 0
we find many files with size 0 (almost all files in one directory have size 0). in the server log we only find the following. what does this mean?
[2011-04-28 23:52:00.630869] I [server-resolve.c:580:server_resolve] 0-macm03-server: pure path resolution for /xxx/preview/4aa76fa541413.jpg (LOOKUP)
[2011-04-28 23:52:00.637384] I [server-resolve.c:580:server_resolve] 0-macm03-server: pure path resolution for /xxx/preview/4aa76fa541413.jpg (UNLINK)
[2011-04-28 23:52:00.693183] I [server-resolve.c:580:server_resolve] 0-macm03-server: pure path resolution for /xxx/preview/4aa76fa541413.jpg (LOOKUP)
[2011-04-28 23:52:00.711092] I [server-resolve.c:580:server_resolve] 0-macm03-server: pure path resolution for /xxx/preview/4aa76fa541413.jpg (MKNOD)
[2011-04-28 23:52:00.746289] I [server-resolve.c:580:server_resolve] 0-macm03-server: pure path resolution for /xxx/preview/4aa76fa541413.jpg (SETATTR)
[2011-04-28 23:52:16.373532] I [server-resolve.c:580:server_resolve] 0-macm03-server: pure path resolution for /xxx/preview/4aa76fa541413.jpg (LOOKUP)

2) then the client is self-healing meta-data all the time... (because the file has size 0 on one of the servers???)
we have also triggered self-healing several times as described here:
http://europe.gluster.org/community/documentation/index.php/Gluster_3.1:_Triggering_Self-Heal_on_Replicate

[2011-04-29 07:55:27.188743] I [afr-common.c:581:afr_lookup_collect_xattr] 0-macm03-replicate-0: data self-heal is pending for /videos12/29640/preview/4aadf4b757de6.jpg.
[2011-04-29 07:55:27.188829] I [afr-common.c:735:afr_lookup_done] 0-macm03-replicate-0: background meta-data data self-heal triggered. path: /videos12/29640/preview/4aadf4b757de6.jpg
[2011-04-29 07:55:27.194446] W [dict.c:437:dict_ref] (-->/opt/glusterfs/3.2.0/lib64/glusterfs/3.2.0/xlator/protocol/client.so(client3_1_fstat_cbk+0x2bb) [0x2aaaaafe833b] (-->/opt/glusterfs/3.2.0/lib64/glusterfs/3.2.0/xlator/cluster/replicate.so(afr_sh_data_fstat_cbk+0x17d) [0x2aaaab11c9ad] (-->/opt/glusterfs/3.2.0/lib64/glusterfs/3.2.0/xlator/cluster/replicate.so(afr_sh_data_fix+0x1fc) [0x2aaaab11c64c]))) 0-dict: dict is NULL

3) on some of the clients we then cannot access the whole directory:

# dir xxx/preview/
/bin/ls: reading directory xxx/preview/: File descriptor in bad state
total 0

in the logs we find this:

[2011-04-29 08:36:17.224301] W [afr-common.c:634:afr_lookup_self_heal_check] 0-macm03-replicate-0: /videos12/30181: gfid different on subvolume
[2011-04-29 08:36:17.241330] I [afr-common.c:680:afr_lookup_done] 0-macm03-replicate-0: entries are missing in lookup of /xxx/preview.
[2011-04-29 08:36:17.241373] I [afr-common.c:735:afr_lookup_done] 0-macm03-replicate-0: background meta-data data entry self-heal triggered. path: /xxx/preview
[2011-04-29 08:36:17.243160] I [afr-self-heal-metadata.c:595:afr_sh_metadata_lookup_cbk] 0-macm03-replicate-0: path /videos12/30181/preview on subvolume macm03-client-0 => -1 (No such file or directory)
[2011-04-29 08:36:17.302228] I [afr-dir-read.c:120:afr_examine_dir_readdir_cbk] 0-macm03-replicate-0: /videos12/30181/preview: failed to do opendir on macm03-client-0
[2011-04-29 08:36:17.303836] I [afr-dir-read.c:174:afr_examine_dir_readdir_cbk] 0-macm03-replicate-0: entry self-heal triggered. path: /xxx/preview, reason: checksums of directory differ, forced merge option set

4) sometimes, when we unmount the glusterfs volume on a client and mount it again, we can access the directory that was in a bad state before -> and then self-healing also works as it should. but sometimes even a remount does not help.

any help would be appreciated.
thank you very very much!

thx
christopher
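For reference, the self-heal trigger on the page linked under point 2) is essentially a full stat() walk of the volume through a client mount: every lookup gives the replicate translator a chance to schedule a background heal. A minimal sketch — the mount point /srv/www/GFS comes from the fstab line in the post above; everything else is generic:

```shell
#!/bin/sh
# Crawl a glusterfs client mount and stat every entry.  On a
# cluster/replicate volume these lookups are what let the client
# detect pending heals and trigger background self-heal.
MOUNT=/srv/www/GFS   # client mount point from the fstab line above

find "$MOUNT" -noleaf -print0 | xargs --null stat > /dev/null
```

The same pattern can be pointed at a single suspect directory first (e.g. the affected preview directory) to heal it without crawling the whole volume.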
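Regarding the "gfid different on subvolume" warning: one way to see exactly how the two copies disagree is to dump the trusted.* xattrs of the affected path directly on both bricks and compare the output. A sketch, to be run as root on each server; the file path is just the example from the logs, and getfattr comes from the attr package — treat the whole thing as an assumption, not something the thread confirms:

```shell
#!/bin/sh
# Run as root on each glusterfs server, directly against the brick
# directory (not the client mount).  Compare the two servers' output:
# differing trusted.gfid values correspond to the "gfid different on
# subvolume" warning, and non-zero trusted.afr.* changelog counters
# indicate self-heal pending against the other replica.
BRICK=/glusterstorage/macm03
P=videos12/29640/preview/4aadf4b757de6.jpg   # example path from the logs

ls -l "$BRICK/$P"
getfattr -m . -d -e hex "$BRICK/$P"
```

Which copy is the good one still has to be decided by hand before removing the bad copy and re-triggering self-heal.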
Christopher Anderlik
2011-May-02 06:20 UTC
[Gluster-users] strange behavior with glusterfs 3.2.0
hi.

hm, so it seems that nobody else has these problems.
is there any possibility to get professional support for glusterfs? does someone have a link/contact?

thank you
christopher

-- 
Mag. Christopher Anderlik
Leiter Technik
________________________________________________________________________________

Xidras GmbH
Stockern 47
3744 Stockern
Austria

Tel:   0043 2983 201 30 5 01
Fax:   0043 2983 201 30 5 01 9
Email: christopher.anderlik at xidras.com
Web:   http://www.xidras.com

FN 317036 f | Landesgericht Krems | ATU64485024
________________________________________________________________________________
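As a footnote to point 1) of the original message: before deciding which replica to trust, it can help to count the zero-byte files on each brick directly and see whether one server holds them all. A sketch under the same assumptions as above (brick path taken from the volfiles; run on both servers and compare):

```shell
#!/bin/sh
# Run on each glusterfs server against the brick directory; reports
# how many zero-byte regular files the brick holds, plus a few
# example paths, so the two servers' copies can be compared.
BRICK=/glusterstorage/macm03

find "$BRICK" -type f -size 0 | wc -l          # total count
find "$BRICK" -type f -size 0 | head -n 20     # a few example paths
```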