I'm wondering if this is from a corrupted failed rebalance. In a directory that has duplicates, do

    setfattr -n trusted.distribute.fix.layout -v 1 .

If that fixes it, do a rebalance...fix-layout

On December 27, 2014 12:38:01 PM PST, tbenzvi at 3vgeomatics.com wrote:
> Ok, I am really tearing my hair out here. I tried doing this manually
> for several other files just to be sure. And in these cases it removed
> the duplicate file from the directory listing, but the file can still
> not be read.. Reading directly from the brick works fine.
>
> --------- Original Message ---------
> Subject: Re: [Gluster-users] Hundreds of duplicate files
> From: "Joe Julian" <joe at julianfamily.org>
> Date: 12/27/14 12:01 pm
> To: gluster-users at gluster.org
>
> Should be safe.
>
> Here's what I've done in the past to clean up rogue dht link files (not
> that yours looked rogue though):
>
>     find $brick_root -type f -size 0 -perm 1000 -exec /bin/rm {} \;
>
> On 12/27/2014 11:09 AM, tbenzvi at 3vgeomatics.com wrote:
> Moving the file with linkto attribute worked! Just one copy of the file
> is retained in the listing and can be read without problems.
> I will write a script to remove these rogue link files from the bricks
> - any risks associated with this?
>
> Thanks everyone for your help, of course if anyone could explain how
> this happened I would love to hear it..
>
> Tom
>
> --------- Original Message ---------
> Subject: Re: [Gluster-users] Hundreds of duplicate files
> From: "Vijay Bellur" <vbellur at redhat.com>
> Date: 12/27/14 9:12 am
> To: tbenzvi at 3vgeomatics.com, gluster-users at gluster.org
>
> > On 12/27/2014 01:11 PM, tbenzvi at 3vgeomatics.com wrote:
> > > Thanks for your continued help Joe.
> > > A demonstration of the problem, in this case I was able to open the file
> > > in vim (a text file) without any issues, however sometimes duplicated
> > > text files open in vim as one line consisting of @ characters, and
> > > binary data files can also not be opened correctly for reading.
> > > However the duplicate listing is still an issue. Note that Dec 13 was
> > > the date of a server crash.
> > >
> > > [root at jongoo ~]# ll /sar/complete/vancouver/refdem/tif2flt.pro*
> > > -rw-rw-r-T 1 parwant users 1712 Dec 13 19:02 tif2flt.pro
> > > -rw-rw-r-- 1 parwant users 1712 Jun 17 2010 tif2flt.pro
> > >
> > > A few minutes later doing the same listing.. sticky bit disappeared and
> > > modification date changed
> > >
> > > [root at jongoo ~]# ll /sar/complete/vancouver/refdem/tif2flt.pro*
> > > -rw-rw-r-- 1 parwant users 1712 Jun 17 2010 /sar/complete/vancouver/refdem/tif2flt.pro
> > > -rw-rw-r-- 1 parwant users 1712 Jun 17 2010 /sar/complete/vancouver/refdem/tif2flt.pro
> > >
> > > [root at jongoo ~]# getfattr -m . -d -e hex /data/glusterfs/safari/brick00/brick/complete/vancouver/refdem/tif2flt.pro
> > > getfattr: Removing leading '/' from absolute path names
> > > # file: data/glusterfs/safari/brick00/brick/complete/vancouver/refdem/tif2flt.pro
> > > system.posix_acl_access=0x0200000001000600ffffffff04000600ffffffff10000600ffffffff20000400ffffffff
> > > trusted.SGI_ACL_FILE=0x0000000400000001ffffffff0006000000000004ffffffff0006000000000010ffffffff0006000000000020ffffffff00040000
> > > trusted.gfid=0xdfe13dc088bf4a779488ef72f0a879cd
> > > trusted.glusterfs.dht.linkto=0x7361666172692d636c69656e742d3300
> > >
> > > [root at ndovu ~]# getfattr -m . -d -e hex /data/glusterfs/safari/brick03/brick/complete/vancouver/refdem/tif2flt.pro
> > > getfattr: Removing leading '/' from absolute path names
> > > # file: data/glusterfs/safari/brick03/brick/complete/vancouver/refdem/tif2flt.pro
> > > system.posix_acl_access=0x0200000001000600ffffffff04000600ffffffff10000600ffffffff20000400ffffffff
> > > trusted.SGI_ACL_FILE=0x0000000400000001ffffffff0006000000000004ffffffff0006000000000010ffffffff0006000000000020ffffffff00040000
> > > trusted.gfid=0xdfe13dc088bf4a779488ef72f0a879cd
> >
> > Is rebalance running on this volume right now? If not, can you please
> > move out the file copy with "trusted.glusterfs.dht.linkto" attribute out
> > of the brick directory
> > (/data/glusterfs/safari/brick00/brick/complete/vancouver/refdem/tif2flt.pro)
> > to an alternate location & check the behavior?
> >
> > Thanks,
> > Vijay
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
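A note on the getfattr output quoted above: the trusted.glusterfs.dht.linkto value is printed as hex because of the "-e hex" option. The sketch below decodes it, to show which subvolume the link file on brick00 names. The function name decode_linkto is made up for this illustration and is not part of any Gluster tool; the only input is the value copied from the thread.

```python
def decode_linkto(hex_value: str) -> str:
    """Decode an xattr value as printed by `getfattr -e hex`.

    Hypothetical helper for illustration only: strips the leading "0x",
    converts the hex digits to bytes, and drops the trailing NUL byte.
    """
    if hex_value.lower().startswith("0x"):
        hex_value = hex_value[2:]
    raw = bytes.fromhex(hex_value)
    return raw.rstrip(b"\x00").decode("ascii")


# Value copied from the brick00 getfattr output in this thread.
LINKTO = "0x7361666172692d636c69656e742d3300"
print(decode_linkto(LINKTO))  # safari-client-3
```

The decoded name, safari-client-3, is the subvolume the link file points at. That would be consistent with the data copy sitting on brick03, assuming the client subvolumes are numbered in brick order; the thread does not show the volume layout, so treat that mapping as a guess.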
tbenzvi at 3vgeomatics.com
2014-Dec-27 23:18 UTC
[Gluster-users] Hundreds of duplicate files
That didn't fix it, unfortunately. In fact, I've done a full rebalance after initially discovering the problem and after updating Gluster, but nothing changed.

I don't know too much about how Gluster works internally; is it possible to compute the hash for each duplicate filename, figure out which brick it belongs on and find where it actually resides, then recreate the link file or update the linkto attribute? Assuming broken link files are the problem.

--------- Original Message ---------
Subject: Re: [Gluster-users] Hundreds of duplicate files
From: "Joe Julian" <joe at julianfamily.org>
Date: 12/27/14 1:55 pm
To: tbenzvi at 3vgeomatics.com, gluster-users at gluster.org

I'm wondering if this is from a corrupted failed rebalance. In a directory that has duplicates, do "setfattr -n trusted.distribute.fix.layout -v 1 ." If that fixes it, do a rebalance...fix-layout
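For anyone who wants to review what the find one-liner quoted earlier in this thread would match before deleting anything, here is a minimal dry-run sketch. It applies the same three tests as that command (regular file, size 0, permission bits exactly 1000) and only prints paths. The function names are invented for this example, and it does not check the trusted.glusterfs.dht.linkto xattr, so a match is only a candidate.

```python
import os
import stat
import sys


def looks_like_link_file(st_mode: int, st_size: int) -> bool:
    """Same tests as `find -type f -size 0 -perm 1000`.

    True only for a zero-length regular file whose permission bits
    are exactly 1000 octal (sticky bit set, nothing else).
    """
    return (
        stat.S_ISREG(st_mode)
        and st_size == 0
        and stat.S_IMODE(st_mode) == 0o1000
    )


def find_candidates(brick_root: str):
    """Yield paths under brick_root that pass the test above.

    Dry run: nothing is removed. lstat is used so symlinks are not followed.
    """
    for dirpath, _dirnames, filenames in os.walk(brick_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if looks_like_link_file(st.st_mode, st.st_size):
                yield path


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python list_link_candidates.py /path/to/brick
    for candidate in find_candidates(sys.argv[1]):
        print(candidate)
```

Run it against the brick directory, not the client mount. The -rw-rw-r-T entry in the listing above was seen through the mount and would not pass the exact-mode test.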