Christopher Anderlik
2011-Apr-29 06:42 UTC
[Gluster-users] strange behavior with glusterfs 3.2.0
hello.
we use glusterfs 3.2.0:
2 glusterfs servers running SLES 11.1
and several clients which access the glusterfs volumes.
configuration:
info
----
type=2
count=2
status=1
sub_count=2
version=1
transport-type=0
volume-id=05168b54-6a5c-4aa3-91ee-63d16976c6cd
brick-0=10.0.1.xxx:-glusterstorage-macm03
brick-1=10.0.1.xxy:-glusterstorage-macm03
macm03-fuse.vol
---------------
volume macm03-client-0
    type protocol/client
    option remote-host 10.0.1.xxx
    option remote-subvolume /glusterstorage/macm03
    option transport-type tcp
end-volume

volume macm03-client-1
    type protocol/client
    option remote-host 10.0.1.xxy
    option remote-subvolume /glusterstorage/macm03
    option transport-type tcp
end-volume

volume macm03-replicate-0
    type cluster/replicate
    subvolumes macm03-client-0 macm03-client-1
end-volume

volume macm03-write-behind
    type performance/write-behind
    subvolumes macm03-replicate-0
end-volume

volume macm03-read-ahead
    type performance/read-ahead
    subvolumes macm03-write-behind
end-volume

volume macm03-io-cache
    type performance/io-cache
    subvolumes macm03-read-ahead
end-volume

volume macm03-quick-read
    type performance/quick-read
    subvolumes macm03-io-cache
end-volume

volume macm03-stat-prefetch
    type performance/stat-prefetch
    subvolumes macm03-quick-read
end-volume

volume macm03
    type debug/io-stats
    subvolumes macm03-stat-prefetch
end-volume
macm03.10.0.1.xxx.glusterstorage-macm03.vol
-------------------------------------------
volume macm03-posix
    type storage/posix
    option directory /glusterstorage/macm03
end-volume

volume macm03-access-control
    type features/access-control
    subvolumes macm03-posix
end-volume

volume macm03-locks
    type features/locks
    subvolumes macm03-access-control
end-volume

volume macm03-io-threads
    type performance/io-threads
    subvolumes macm03-locks
end-volume

volume /glusterstorage/macm03
    type debug/io-stats
    subvolumes macm03-io-threads
end-volume

volume macm03-server
    type protocol/server
    option transport-type tcp
    option auth.addr./glusterstorage/macm03.allow *
    subvolumes /glusterstorage/macm03
end-volume
macm03.10.0.1.xxy.glusterstorage-macm03.vol
-------------------------------------------
volume macm03-posix
    type storage/posix
    option directory /glusterstorage/macm03
end-volume

volume macm03-access-control
    type features/access-control
    subvolumes macm03-posix
end-volume

volume macm03-locks
    type features/locks
    subvolumes macm03-access-control
end-volume

volume macm03-io-threads
    type performance/io-threads
    subvolumes macm03-locks
end-volume

volume /glusterstorage/macm03
    type debug/io-stats
    subvolumes macm03-io-threads
end-volume

volume macm03-server
    type protocol/server
    option transport-type tcp
    option auth.addr./glusterstorage/macm03.allow *
    subvolumes /glusterstorage/macm03
end-volume
client
------
the client has mounted the volume via fstab like this:
server:/macm03 /srv/www/GFS glusterfs defaults,_netdev 0 0
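for reference, the equivalent manual mount (assuming the glusterfs mount helper from the 3.2.0 install is in the path) would be something like:

mount -t glusterfs server:/macm03 /srv/www/GFS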
now we have noticed some strange behavior and i have some questions:
1) files with size 0
we find many files with size 0. in the server log we only find the entries
below. what does this mean?
(almost all files in one directory have size 0.)
[2011-04-28 23:52:00.630869] I [server-resolve.c:580:server_resolve]
0-macm03-server: pure path
resolution for /xxx/preview/4aa76fa541413.jpg (LOOKUP)
[2011-04-28 23:52:00.637384] I [server-resolve.c:580:server_resolve]
0-macm03-server: pure path
resolution for /xxx/preview/4aa76fa541413.jpg (UNLINK)
[2011-04-28 23:52:00.693183] I [server-resolve.c:580:server_resolve]
0-macm03-server: pure path
resolution for /xxx/preview/4aa76fa541413.jpg (LOOKUP)
[2011-04-28 23:52:00.711092] I [server-resolve.c:580:server_resolve]
0-macm03-server: pure path
resolution for /xxx/preview/4aa76fa541413.jpg (MKNOD)
[2011-04-28 23:52:00.746289] I [server-resolve.c:580:server_resolve]
0-macm03-server: pure path
resolution for /xxx/preview/4aa76fa541413.jpg (SETATTR)
[2011-04-28 23:52:16.373532] I [server-resolve.c:580:server_resolve]
0-macm03-server: pure path
resolution for /xxx/preview/4aa76fa541413.jpg (LOOKUP)
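to check whether such a file really differs between the two servers, we can look at the backend copies directly on the bricks. a rough sketch (assuming getfattr from the attr package is installed; the trusted.afr.* xattrs should contain the replicate pending counters):

# run on each server, directly on the brick (not through the mount):
ls -l /glusterstorage/macm03/xxx/preview/4aa76fa541413.jpg
getfattr -d -m . -e hex /glusterstorage/macm03/xxx/preview/4aa76fa541413.jpg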
2) then the client is self-healing meta-data all the time... (because the file
has size 0 on one of the servers???)
but we also triggered self-heal manually several times, as described here:
http://europe.gluster.org/community/documentation/index.php/Gluster_3.1:_Triggering_Self-Heal_on_Replicate
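per that page this is roughly a recursive stat over the mount point, i.e. something like (with /srv/www/GFS as the mount point):

find /srv/www/GFS -noleaf -print0 | xargs --null stat >/dev/null

nevertheless the client log keeps showing entries like: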
[2011-04-29 07:55:27.188743] I [afr-common.c:581:afr_lookup_collect_xattr]
0-macm03-replicate-0:
data self-heal is pending for /videos12/29640/preview/4aadf4b757de6.jpg.
[2011-04-29 07:55:27.188829] I [afr-common.c:735:afr_lookup_done]
0-macm03-replicate-0: background
meta-data data self-heal triggered. path:
/videos12/29640/preview/4aadf4b757de6.jpg
[2011-04-29 07:55:27.194446] W [dict.c:437:dict_ref]
(-->/opt/glusterfs/3.2.0/lib64/glusterfs/3.2.0/xlator/protocol/client.so(client3_1_fstat_cbk+0x2bb)
[0x2aaaaafe833b]
(-->/opt/glusterfs/3.2.0/lib64/glusterfs/3.2.0/xlator/cluster/replicate.so(afr_sh_data_fstat_cbk+0x17d)
[0x2aaaab11c9ad]
(-->/opt/glusterfs/3.2.0/lib64/glusterfs/3.2.0/xlator/cluster/replicate.so(afr_sh_data_fix+0x1fc)
[0x2aaaab11c64c]))) 0-dict: dict is NULL
3)
on some of the clients we then cannot access the whole directory:
# dir xxx/preview/
/bin/ls: reading directory xxx/preview/: File descriptor in bad state
total 0
in the logs we find this:
[2011-04-29 08:36:17.224301] W [afr-common.c:634:afr_lookup_self_heal_check]
0-macm03-replicate-0:
/videos12/30181: gfid different on subvolume
[2011-04-29 08:36:17.241330] I [afr-common.c:680:afr_lookup_done]
0-macm03-replicate-0: entries are
missing in lookup of /xxx/preview.
[2011-04-29 08:36:17.241373] I [afr-common.c:735:afr_lookup_done]
0-macm03-replicate-0: background
meta-data data entry self-heal triggered. path: /xxx/preview
[2011-04-29 08:36:17.243160] I
[afr-self-heal-metadata.c:595:afr_sh_metadata_lookup_cbk]
0-macm03-replicate-0: path /videos12/30181/preview on subvolume macm03-client-0
=> -1 (No such file
or directory)
[2011-04-29 08:36:17.302228] I [afr-dir-read.c:120:afr_examine_dir_readdir_cbk]
0-macm03-replicate-0: /videos12/30181/preview: failed to do opendir on
macm03-client-0
[2011-04-29 08:36:17.303836] I [afr-dir-read.c:174:afr_examine_dir_readdir_cbk]
0-macm03-replicate-0: entry self-heal triggered. path: /xxx/preview, reason:
checksums of directory
differ, forced merge option set
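the "gfid different on subvolume" warning can be checked on the bricks by comparing the trusted.gfid xattr of the affected directory on both servers, e.g. (again assuming getfattr is available):

# run on both servers; different values mean the replicas disagree about the directory's identity
getfattr -n trusted.gfid -e hex /glusterstorage/macm03/videos12/30181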
4)
sometimes when we unmount the glusterfs volume on a client and remount it, we
can access the directory which was in a bad state before -> and then
self-healing also works as it should.
but sometimes a remount does not help either.
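the remount itself is nothing special and just relies on the fstab entry above (with umount -l as a fallback when the mount point is busy):

umount /srv/www/GFS || umount -l /srv/www/GFS
mount /srv/www/GFS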
any help would be appreciated.
thank you very very much!
thx
christopher
Christopher Anderlik
2011-May-02 06:20 UTC
[Gluster-users] strange behavior with glusterfs 3.2.0
hi.
hm, so it seems that nobody else has these problems.
is there any possibility to get professional support for glusterfs? does someone have a link/contact?
thank you
christopher