John Mark Walker
2011-Jul-14 17:36 UTC
[Gluster-users] Alert: GlusterFS 3.2.2 Release for GFID Mismatch
GlusterFS Alert -
Problem: GFID Mismatch
Severity: 7 (out of 10) - Loss of service but ultimately no loss of data
PREVENTION: To *prevent* the issue, please install GlusterFS 3.2.2. If
you're using 3.1.x, upgrade to 3.1.5.
Download 3.2.2 here: http://download.gluster.com/pub/gluster/glusterfs/LATEST/
Download 3.1.5 here:
http://download.gluster.com/pub/gluster/glusterfs/3.1/LATEST/
FIX:
To check for mismatched GFIDs, please review your client logs and grep for the
words:
"gfid different" or "gfid differs"
If you see either of these conditions, simply upgrading will not fix the
problem. You will need to use our tools here: https://github.com/vikasgorur/gfid
See details below for instructions. Upgrading will not fix the issue if
you've already experienced GFID mismatches.
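The log check described above can be sketched as a small shell function. This is only an illustration: the function name and the /var/log/glusterfs location in the usage comment are assumptions, so point it at wherever your client logs actually live.

```shell
# Print every log line that mentions a GFID mismatch, prefixed with the
# name of the log file it came from. $1 is a log file or a directory of logs.
# Exit status is 0 if at least one match was found, non-zero otherwise.
find_gfid_mismatches() {
    grep -rEH 'gfid (different|differs)' "$1"
}

# Usage (the log directory is an assumption, adjust as needed):
#   find_gfid_mismatches /var/log/glusterfs || echo "no mismatch messages found"
```

If this prints nothing, the client logs do not contain either phrase and upgrading should be sufficient.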
DETAILS:
Over the last 3 weeks we have seen a growing number of GlusterFS implementations
experiencing an issue where mismatched GFIDs are appearing within the
filesystem.
Each file/directory on a Gluster volume has a unique 128-bit number associated
with it called the GFID. This is true regardless of Gluster configuration
(distribute or distribute/replicate). One inode, one GFID. The GFID is stored on
the backend as the value of the extended attribute "trusted.gfid".
Under normal circumstances, the value of this attribute is the same on all the
backend bricks. However, certain conditions can cause the value on one or more
of the bricks to differ from that on the other bricks. This causes the GlusterFS
client to become confused and throw errors. This applies to both the 3.1.4 and
3.2.1 versions of the filesystem, and previous versions in those series. It
can happen with the native GlusterFS client, NFS, or CIFS.
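To see what a mismatch looks like on disk, the "trusted.gfid" attribute can be read with getfattr (run as root on the brick server). The sketch below is not part of the gfid tools; the function names are made up for illustration, and it assumes the bricks being compared are all visible as directories on the same host, which is often not the case. On a multi-server setup, run the first function on each server and compare the values by hand.

```shell
# Print the hex-encoded trusted.gfid of one backend path, or nothing if the
# path is missing or has no such attribute.
gfid_of() {
    getfattr -n trusted.gfid -e hex --absolute-names "$1" 2>/dev/null \
        | sed -n 's/^trusted\.gfid=//p'
}

# compare_gfid RELATIVE_PATH BRICK_ROOT...
# Prints the GFID found on each brick and returns 1 if two bricks disagree.
compare_gfid() {
    rel=$1; shift
    first=""
    status=0
    for brick in "$@"; do
        gfid=$(gfid_of "$brick/$rel")
        echo "$brick: ${gfid:-<no gfid>}"
        [ -z "$gfid" ] && continue          # file not present on this brick
        if [ -z "$first" ]; then
            first=$gfid
        elif [ "$gfid" != "$first" ]; then
            status=1                        # mismatch against the first value seen
        fi
    done
    return $status
}

# Usage (paths are examples only):
#   compare_gfid some/dir/file /export/brick1 /export/brick2
```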
PREVENTION:
To prevent this issue from occurring, please upgrade immediately to 3.1.5, or
3.2.2. This will not correct the issue should it already be present in your
cluster.
FIX:
***IMPORTANT***
To check for mismatched GFIDs, please review your client logs and grep for the
words:
"gfid different" or "gfid differs"
If you see either of these conditions, simply upgrading will not fix the
problem. You will need to download tools here:
https://github.com/vikasgorur/gfid
Follow the instructions in the README:
https://github.com/vikasgorur/gfid/blob/master/README
Here's the quick-start version:
1. The first step is to construct the master list of all files:
# cd /export/brick1
# find . > brick1.txt
... (do for all bricks)
# cat brick1.txt brick2.txt... | sort -u > master_list.txt
2. Then we need to get the gfid's of all the inodes from these bricks:
# cd /export/brick1
# gfid-list /path/to/master_list.txt > brick1.gfid
... (do for all bricks)
3. Identify the mismatched inodes:
# gfid-mismatch brick1.gfid brick2.gfid brick3.gfid brick4.gfid
4. Stop the affected volume, then delete the mismatched gfid's:
# gluster volume stop <affected volume>
# gfid-mismatch brick1.gfid brick2.gfid brick3.gfid brick4.gfid | cut -f1 -d: > mismatched.txt
# cd /export/brick1
# gfid-delete /path/to/mismatched.txt
Repeat for the other bricks.
5. Check logs
'gfid-delete' will produce a log with one entry for each file, which is
either:
usr/bin/factor: removed OK
OR
usr/bin/vim: No such file or directory
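For a large brick, reading that log entry by entry is tedious. A minimal sketch of a summary, assuming the 'gfid-delete' output has been saved to a file (the function name and the file name in the usage comment are hypothetical) and that every entry ends in one of the two messages shown above:

```shell
# summarize_gfid_delete_log LOGFILE
# Counts the two expected outcomes and prints any line that matches neither.
summarize_gfid_delete_log() {
    awk '
        /: removed OK$/                { removed++; next }
        /: No such file or directory$/ { missing++; next }
        NF                             { other++; print "unexpected: " $0 }
        END { printf "removed=%d missing=%d other=%d\n", removed, missing, other }
    ' "$1"
}

# Usage (the log file name is an example only):
#   summarize_gfid_delete_log brick1.delete.log
```

Any "unexpected" line is worth looking at by hand before the volume is started again.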
IMPORTANT NOTE: The deletion of gfid's must be done ONLY ON A STOPPED
VOLUME.
Deleting the gfid's on a running volume with mounted clients will cause more
problems instead of solving them.
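Step 1 of the quick-start can be scripted. One thing to watch: running "find . > brick1.txt" from inside the brick directory creates brick1.txt in the brick itself, and the file then shows up in its own listing. The sketch below writes the lists to a separate working directory instead. The function name is made up, and it assumes every brick is reachable as a local directory on one host; otherwise run the find step on each server and copy the per-brick lists together before sorting.

```shell
# build_master_list WORKDIR BRICK_ROOT...
# Writes WORKDIR/brickN.txt for each brick and WORKDIR/master_list.txt.
# WORKDIR must be outside all bricks.
build_master_list() {
    workdir=$1; shift
    mkdir -p "$workdir"
    n=0
    for brick in "$@"; do
        n=$((n + 1))
        # List paths relative to the brick root so that the same file has
        # the same name on every brick.
        (cd "$brick" && find .) > "$workdir/brick$n.txt"
    done
    # Merge the per-brick lists and drop duplicates.
    cat "$workdir"/brick*.txt | sort -u > "$workdir/master_list.txt"
}

# Usage (paths are examples only):
#   build_master_list /root/gfid-work /export/brick1 /export/brick2
```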
Please feel free to contact me directly with any questions.
gluster1206 at akxnet.de
2011-Jul-15 15:00 UTC
[Gluster-users] Alert: GlusterFS 3.2.2 Release for GFID Mismatch
On 14.07.2011 19:36, John Mark Walker wrote:

Well... after having installed that version, my system is DOWN and broken.
Apache reports "Access denied" although the file is accessible and has proper
rights. Or the file simply does not exist, which never caused any harm before.

[2011-07-15 16:58:47.494602] I [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-www-client-1: remote operation failed: Permission denied
[2011-07-15 16:58:47.494716] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 645442: LOOKUP() /clients/client23/web78/web/.htaccess => -1 (Permission denied)
[2011-07-15 16:58:47.496399] I [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-www-client-0: remote operation failed: Permission denied
[2011-07-15 16:58:47.497217] I [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-www-client-1: remote operation failed: Permission denied
[2011-07-15 16:58:47.497707] I [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-www-client-0: remote operation failed: Permission denied
[2011-07-15 16:58:47.498199] I [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-www-client-1: remote operation failed: Permission denied
[2011-07-15 16:58:47.498258] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 645444: LOOKUP() /clients/client23/web78/web/error => -1 (Permission denied)
[2011-07-15 16:58:47.499366] I [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-www-client-0: remote operation failed: Permission denied
[2011-07-15 16:58:47.499576] I [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-www-client-1: remote operation failed: Permission denied
[2011-07-15 16:58:47.499634] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 645446: LOOKUP() /clients/client23/web78/web/error => -1 (Permission denied)
[2011-07-15 16:58:47.502940] I [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-www-client-0: remote operation failed: Permission denied
[2011-07-15 16:58:47.503405] I [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-www-client-1: remote operation failed: Permission denied
[2011-07-15 16:58:47.503466] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 645451: LOOKUP() /clients/client23/web78/web/.htaccess => -1 (Permission denied)
[2011-07-15 16:58:55.406148] I [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-www-client-0: remote operation failed: Permission denied
[2011-07-15 16:58:55.406507] I [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-www-client-1: remote operation failed: Permission denied
[2011-07-15 16:58:55.406566] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 647556: LOOKUP() /clients/client23/web78/web/.htaccess => -1 (Permission denied)
[2011-07-15 16:58:55.409952] I [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-www-client-0: remote operation failed: Permission denied
[2011-07-15 16:58:55.410355] I [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-www-client-1: remote operation failed: Permission denied
[2011-07-15 16:58:55.410411] W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 647560: LOOKUP() /clients/client23/web78/web/.htaccess => -1 (Permission denied)