tbenzvi at 3vgeomatics.com
2014-Dec-17 01:07 UTC
[Gluster-users] Hundreds of duplicate files
Hi everyone,

We have noticed some extremely odd behaviour with our distributed Gluster volume: duplicate files (same name, same or different content) are being created and stored on multiple bricks. The only consistent clue is that one of the duplicate files has the sticky bit set. I am hoping someone will be able to shed some light on why this is happening and how we can restore the volume, as there appear to be hundreds of such files. I will try to provide as much pertinent information as I can.

We have a 130TB Gluster volume consisting of two 20TB bricks on server1, and three 40TB bricks on server2 which were added at a later date (and rebalancing was done). The volume is mounted on server1 and accessed only through this server, but by many users. Both servers went down due to power loss several days ago, after which this problem was first noticed. We ran a rebalance command on the volume, but this has not fixed the problem.

Gluster volume info:
Volume Name: safari
Type: Distribute
Volume ID: d48d0e6b-4389-4c2c-8fd1-cd2854121eda
Status: Started
Number of Bricks: 5
Transport-type: tcp
Bricks:
Brick1: server1:/data/glusterfs/safari/brick00/brick
Brick2: server1:/data/glusterfs/safari/brick01/brick
Brick3: server2:/data/glusterfs/safari/brick02/brick
Brick4: server2:/data/glusterfs/safari/brick03/brick
Brick5: server2:/data/glusterfs/safari/brick04/brick

Size information:
/dev/sdc         37T   16T   22T  42%  /data/glusterfs/safari/brick02
/dev/sdd         37T   16T   22T  42%  /data/glusterfs/safari/brick03
/dev/sde         37T   17T   21T  45%  /data/glusterfs/safari/brick04
/dev/md126       11T  7.7T  2.8T  74%  /data/glusterfs/safari/brick00
/dev/md124       11T  8.0T  2.5T  77%  /data/glusterfs/safari/brick01
server2:/safari 130T   63T   68T  48%  /sar

Example 1:
-Two files with the same name exist in one directory
-They have different contents and attributes
-A file listing on the mounted volume shows the same inode
-The newer file has the sticky bit set
-Neither file is corrupted; both can be viewed by using the absolute path (on the bricks)

File listing on the mounted volume:
13036730497538635177 -rw-rw-r-T 1 jon users 924 Dec 15 10:42 RSLC_tab
13036730497538635177 -rw-rw-r-- 1 jon users 418 Mar 18 2013 RSLC_tab

Listing of the files on the bricks:
8925798411 -rw-rw-r-T+ 2 jon users 924 Dec 15 10:42 /data/glusterfs/safari/brick00/brick/complete/shm/rs2/ottawa/mf6_asc/stack_org/RSLC_tab
51541886672 -rw-rw-r--+ 2 1002 users 418 Mar 18 2013 /data/glusterfs/safari/brick02/brick/complete/shm/rs2/ottawa/mf6_asc/stack_org/RSLC_tab

Example 2:
-Two files with the same name exist in one directory
-They have the same content and attributes
-No sticky bit is set when looking at the file listing on the mounted volume
-The sticky bit is set for one file when looking at the file listing on the bricks
-Files are corrupted

File listing on the mounted volume:
13012555852904096080 -rw-rw-r-- 1 tom users 2393848 Dec 8 2013 ifg_lr/20130226_20130813.diff.phi.ras
13012555852904096080 -rw-rw-r-- 1 tom users 2393848 Dec 8 2013 ifg_lr/20130226_20130813.diff.phi.ras

Listing of the files on the bricks:
17058578 -rw-rw-r-T+ 2 tom users 2393848 Dec 13 17:11 /data/glusterfs/safari/brick00/brick/rsc/rs2/calgary/u22_dsc/stack_org/ifg_lr/20130226_20130813.diff.phi.ras
57986922129 -rw-rw-r--+ 2 1010 users 2393848 Dec 8 2013 /data/glusterfs/safari/brick02/brick/rsc/rs2/calgary/u22_dsc/stack_org/ifg_lr/20130226_20130813.diff.phi.ras

Additionally, only some files in this directory are duplicated. The duplicated files are corrupted (they cannot be viewed as raster images, the original file type). The files which are not duplicated are not corrupted.
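A listing like the ones above could be gathered for the whole volume rather than one file at a time. The following is only a minimal sketch, assuming it is run on a host where the brick directories are locally readable; the function name `find_duplicates` is made up for illustration, and the script only reads metadata and changes nothing. It reports every relative path that exists as a regular file on more than one brick, with its size and whether the sticky bit is set.

```python
import os
import stat
import sys
from collections import defaultdict


def find_duplicates(brick_roots):
    """Map each relative path present on more than one brick to its copies."""
    seen = defaultdict(list)
    for root in brick_roots:
        for dirpath, dirnames, filenames in os.walk(root):
            # Skip Gluster's internal .glusterfs tree at the top of each brick.
            if dirpath == root and ".glusterfs" in dirnames:
                dirnames.remove(".glusterfs")
            for name in filenames:
                full = os.path.join(dirpath, name)
                st = os.lstat(full)
                if not stat.S_ISREG(st.st_mode):
                    continue  # ignore symlinks, sockets, etc.
                rel = os.path.relpath(full, root)
                seen[rel].append({
                    "brick": root,
                    "size": st.st_size,
                    "sticky": bool(st.st_mode & stat.S_ISVTX),
                })
    return {rel: copies for rel, copies in seen.items() if len(copies) > 1}


if __name__ == "__main__":
    # Brick directories are passed on the command line, e.g.
    # /data/glusterfs/safari/brick00/brick /data/glusterfs/safari/brick02/brick
    for rel, copies in sorted(find_duplicates(sys.argv[1:]).items()):
        print(rel)
        for c in copies:
            flag = "T" if c["sticky"] else "-"
            print(f"  {flag} {c['size']:>12} {c['brick']}")
```

Since the bricks here live on two servers, such a scan would have to be run per server and the results merged, or run over a mount that exposes all brick directories.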
File command: (notice duplicate and singleton files)
ifg_lr/20091021_20100218.diff.phi.ras: Sun raster image data, 1208 x 1981, 8-bit, RGB colormap
ifg_lr/20091021_20101016.diff.phi.ras: data
ifg_lr/20091021_20101016.diff.phi.ras: data
ifg_lr/20091021_20101109.diff.phi.ras: Sun raster image data, 1208 x 1981, 8-bit, RGB colormap
ifg_lr/20091021_20101203.diff.phi.ras: Sun raster image data, 1208 x 1981, 8-bit, RGB colormap
ifg_lr/20091021_20101227.diff.phi.ras: Sun raster image data, 1208 x 1981, 8-bit, RGB colormap
ifg_lr/20091021_20110120.diff.phi.ras: Sun raster image data, 1208 x 1981, 8-bit, RGB colormap
ifg_lr/20091021_20110213.diff.phi.ras: data
ifg_lr/20091021_20110213.diff.phi.ras: data
ifg_lr/20091021_20110309.diff.phi.ras: data
ifg_lr/20091021_20110309.diff.phi.ras: sticky data
ifg_lr/20091021_20110402.diff.phi.ras: Sun raster image data, 1208 x 1981, 8-bit, RGB colormap

Information from Gluster log file:

Additionally, the log is full of thousands of lines like the following (possibly one for each directory?), dating back several months:

[2014-12-12 11:10:10.257950] I [dht-layout.c:726:dht_layout_dir_mismatch] 3-safari-dht: /rsc/tsx/lasvegas/spot_asc/stack/ifg_lr - disk layout missing
[2014-12-12 11:10:10.257988] I [dht-common.c:623:dht_revalidate_cbk] 3-safari-dht: mismatching layouts for /rsc/tsx/lasvegas/spot_asc/stack/ifg_lr
[2014-12-12 11:10:13.042362] I [dht-layout.c:726:dht_layout_dir_mismatch] 3-safari-dht: /rsc/tsx/lasvegas/spot_dsc/stack/ifg_lr - disk layout missing
[2014-12-12 11:10:13.042395] I [dht-common.c:623:dht_revalidate_cbk] 3-safari-dht: mismatching layouts for /rsc/tsx/lasvegas/spot_dsc/stack/ifg_lr
[2014-12-12 11:10:15.685876] I [dht-layout.c:726:dht_layout_dir_mismatch] 3-safari-dht: /rsc/tsx/lasvegas/spot_dsc/stack/ifg_lr - disk layout missing
[2014-12-12 11:10:15.685921] I [dht-common.c:623:dht_revalidate_cbk] 3-safari-dht: mismatching layouts for /rsc/tsx/lasvegas/spot_dsc/stack/ifg_lr
[2014-12-12 11:10:19.028518] I [dht-layout.c:726:dht_layout_dir_mismatch] 3-safari-dht: /rsc/tsx/lasvegas/spot_asc/stack/ifg_lr - disk layout missing

There are also 1394 of the following errors in the last year; several (but not all) of them seem to correspond to duplicate files:

[2014-12-12 22:55:57.180486] W [client-rpc-fops.c:1994:client3_3_setattr_cbk] 0-safari-client-1: remote operation failed: Operation not permitted
[2014-12-12 22:55:57.180514] E [dht-linkfile.c:213:dht_linkfile_setattr_cbk] 0-safari-dht: setattr of uid/gid on /freeport/tsx/miami/sm_asc/stack/ifg_lr/20140930_20141102.diff.phi.ras :<gfid:00000000-0000-0000-0000-000000000000> failed (Operation not permitted)

Thanks,
Tom
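To check how many of the setattr errors really correspond to duplicate files, the affected paths could be pulled out of the log and compared against a list of known duplicates. This is a rough sketch only: the regular expressions are derived from the two message formats quoted above, it assumes the real log holds one entry per line (unlike the wrapped excerpt), and `summarize_log` is a name invented for illustration.

```python
import re
import sys
from collections import Counter

# Patterns modelled on the log lines quoted in this message.
LAYOUT_RE = re.compile(
    r"dht_layout_dir_mismatch\] \S+: (?P<path>.+) - disk layout missing"
)
SETATTR_RE = re.compile(
    r"dht_linkfile_setattr_cbk\] \S+: setattr of uid/gid on (?P<path>.+?) :<gfid:"
)


def summarize_log(lines):
    """Count 'disk layout missing' directories and failed-setattr file paths."""
    layout_missing = Counter()
    setattr_failed = Counter()
    for line in lines:
        m = LAYOUT_RE.search(line)
        if m:
            layout_missing[m.group("path")] += 1
            continue
        m = SETATTR_RE.search(line)
        if m:
            setattr_failed[m.group("path")] += 1
    return layout_missing, setattr_failed


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log file path is supplied by the caller; no default is assumed.
    with open(sys.argv[1], errors="replace") as fh:
        layouts, failures = summarize_log(fh)
    print(f"{len(layouts)} directories with 'disk layout missing'")
    print(f"{len(failures)} files with failed setattr of uid/gid")
    for path in sorted(failures):
        print(path)
```

The printed paths are relative to the volume root, so they can be compared directly with relative paths found on the bricks.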
On Wed, Dec 17, 2014 at 4:07 AM, <tbenzvi at 3vgeomatics.com> wrote:
> Hi everyone, we have noticed some extremely odd behaviour with our
> distributed Gluster volume where duplicate files (same name, same or
> different content) are being created and stored on multiple bricks. The only
> consistent clue is that one of the duplicate files has the sticky bit set. I
> am hoping someone will be able to shed some light on why this is happening
> and how we can restore the volume as there appear to be hundreds of such
> files.

Tom, can you please stop the glusterfs daemons (glusterd, glusterfsd), unmount the brick devices, and check the filesystems for consistency (fsck or xfs_check)?
If there are errors on the filesystems, you will need to remove the duplicate files manually, leaving one "good" copy of each. I don't know what to do with the .glusterfs brick subdirectory if fsck "fixes" it (probably just remove it).
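Before removing anything by hand, it may help to know whether the copies of a given file are byte-identical or genuinely different, since both cases were reported. The following is a minimal, read-only sketch; `group_by_content` is a hypothetical helper, and it deliberately deletes nothing, leaving the choice of the "good" copy to a human.

```python
import hashlib
import sys
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_content(paths):
    """Group the given file paths by content digest."""
    groups = defaultdict(list)
    for path in paths:
        groups[sha256_of(path)].append(path)
    return dict(groups)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the brick-side paths of one duplicated file as arguments.
    groups = group_by_content(sys.argv[1:])
    verdict = "identical" if len(groups) == 1 else "different"
    print(f"copies are {verdict}")
    for digest, members in groups.items():
        print(digest[:16], *members)
```

If all copies land in one group, any of them holds the same bytes; if they split into several groups, the copies have to be inspected individually before one is kept.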