RASTELLI Alessandro
2015-Jan-07 16:22 UTC
[Gluster-users] Input/Output Error when deleting folder
It worked... partially :)
Now I can access the folders again, but I can't delete them because there are a couple of files in them (which I don't need).
The files exist only on nodes 1, 2, 4 and 5, but not on node 3:

[root@gluster02-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218/Rec_218_1_part_14656.ts
getfattr: Removing leading '/' from absolute path names
# file: brick1/recorder/Rec218/Rec_218_1_part_14656.ts
trusted.ec.config=0x0000080a02000200
trusted.ec.size=0x0000000034400000
trusted.ec.version=0x0000000000001a20
trusted.gfid=0x8d5da5a1cd1949618a5b96657857ceb6

[root@gluster03-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218/Rec_218_1_part_14656.ts
getfattr: /brick1/recorder/Rec218/Rec_218_1_part_14656.ts: No such file or directory

How do I proceed?
Thanks

-----Original Message-----
From: Xavier Hernandez [mailto:xhernandez at datalab.es]
Sent: Wednesday, 7 January 2015 16:45
To: RASTELLI Alessandro
Cc: gluster-users at gluster.org; CAZZANIGA Stefano; UBERTINI Gabriele; TECHNOLOGY - Supporto Sistemi OTT e Cloud; ORLANDO Luca
Subject: Re: [Gluster-users] Input/Output Error when deleting folder

Sorry, the command should be:

setfattr -n trusted.ec.version -v 0x0000000000000001 <brick path>/Rec218

On 01/07/2015 04:34 PM, RASTELLI Alessandro wrote:
> See my answers below:
>
> 1.
> [root@gluster03-mi ~]# ls -l /brick1/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4
> ls: cannot access /brick1/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4: No such file or directory
> [root@gluster03-mi ~]# ls -l /brick1/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd
> lrwxrwxrwx 1 root root 55 Dec 17 17:37 /brick1/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd -> ../../00/00/00000000-0000-0000-0000-000000000001/Rec218
> [root@gluster03-mi ~]# ls -l /brick2/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4
> ls: cannot access /brick2/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4: No such file or directory
> [root@gluster03-mi ~]# ls -l /brick2/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd
> lrwxrwxrwx 1 root root 55 Dec 17 17:37 /brick2/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd -> ../../00/00/00000000-0000-0000-0000-000000000001/Rec218
>
> 2.
> /Rec218 is supposed to be empty (or rather, I don't need to restore the files).
> I stopped the volume, but when executing the command I get an error:
> [root@gluster01-mi ~]# setfattr -n trusted.ec.version -v 0x1 /brick1/recorder/Rec218
> bad input encoding
>
> Regards
> A.
>
> -----Original Message-----
> From: Xavier Hernandez [mailto:xhernandez at datalab.es]
> Sent: Wednesday, 7 January 2015 16:08
> To: RASTELLI Alessandro
> Cc: gluster-users at gluster.org; CAZZANIGA Stefano; UBERTINI Gabriele; TECHNOLOGY - Supporto Sistemi OTT e Cloud; ORLANDO Luca
> Subject: Re: [Gluster-users] Input/Output Error when deleting folder
>
> I see two problems here:
>
> 1. Something very strange has happened on gluster03-mi. It contains the directory, but it's not the same one that is on the other bricks (8 bricks have gfid a9d904af-0d9e-4018-acb2-881bd8b3c2e4, while that node has gfid bda849fc-a556-469e-ad84-ed074f2c1bcd).
>
> Whatever has happened here has affected both bricks of that node in the same way.
>
> What do these commands return on gluster03-mi?
>
> ls -l /brick1/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4
> ls -l /brick1/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd
>
> ls -l /brick2/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4
> ls -l /brick2/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd
>
> 2. It seems that node gluster04-mi has been stopped (or rebooted, or has failed) while an operation that modifies the directory contents was being executed, so it has lost an update and it's out of sync (both bricks on the same server have missed one update, so it seems clear that it's not a brick problem but a server problem).
>
> The global result of all this is that you have 4 failed bricks on a configuration that only supports 2 failed bricks.
>
> BTW, having two or more bricks on the same server is not recommended because a single server failure causes multiple bricks to be lost. In this case a directory can be recovered, but if this happens to a file, it won't be 100% recoverable.
>
> Are there any files inside /Rec218?
>
> If you are going to delete the directory and all its contents, and the brick contents on gluster03-mi are the same as on the other servers, the following commands should be safe (otherwise let me know before doing anything):
>
> Before starting you must be sure that nothing is creating or deleting entries inside /Rec218. It would be even better if this could be done with the volume stopped.
>
> On each brick (including gluster03-mi):
> setfattr -n trusted.ec.version -v 0x1 <brick path>/Rec218
>
> On bricks in gluster03-mi:
> setfattr -n trusted.gfid -v 0xa9d904af0d9e4018acb2881bd8b3c2e4 <brick path>/Rec218
> setfattr -n trusted.glusterfs.dht -v 0x000000010000000000000000ffffffff <brick path>/Rec218
>
> On client:
> Check that the directory is accessible and its contents seem ok. If so:
> rm -rf <mount point>/Rec218
>
> If you have a way to reproduce this situation, let me know.
>
> Xavi
>
> On 01/07/2015 03:31 PM, RASTELLI Alessandro wrote:
>> [root@gluster01-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218
>> getfattr: Removing leading '/' from absolute path names
>> # file: brick1/recorder/Rec218
>> trusted.ec.version=0x000000000000693a
>> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>
>> [root@gluster01-mi ~]# getfattr -m. -e hex -d /brick2/recorder/Rec218
>> getfattr: Removing leading '/' from absolute path names
>> # file: brick2/recorder/Rec218
>> trusted.ec.version=0x000000000000693a
>> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>
>> [root@gluster02-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218
>> getfattr: Removing leading '/' from absolute path names
>> # file: brick1/recorder/Rec218
>> trusted.ec.version=0x000000000000693a
>> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>
>> [root@gluster02-mi ~]# getfattr -m. -e hex -d /brick2/recorder/Rec218
>> getfattr: Removing leading '/' from absolute path names
>> # file: brick2/recorder/Rec218
>> trusted.ec.version=0x000000000000693a
>> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>
>> [root@gluster03-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218
>> getfattr: Removing leading '/' from absolute path names
>> # file: brick1/recorder/Rec218
>> trusted.gfid=0xbda849fca556469ead84ed074f2c1bcd
>>
>> [root@gluster03-mi ~]# getfattr -m. -e hex -d /brick2/recorder/Rec218
>> getfattr: Removing leading '/' from absolute path names
>> # file: brick2/recorder/Rec218
>> trusted.gfid=0xbda849fca556469ead84ed074f2c1bcd
>>
>> [root@gluster04-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218
>> getfattr: Removing leading '/' from absolute path names
>> # file: brick1/recorder/Rec218
>> trusted.ec.version=0x0000000000006939
>> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>
>> [root@gluster04-mi ~]# getfattr -m. -e hex -d /brick2/recorder/Rec218
>> getfattr: Removing leading '/' from absolute path names
>> # file: brick2/recorder/Rec218
>> trusted.ec.version=0x0000000000006939
>> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>
>> [root@gluster05-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218
>> getfattr: Removing leading '/' from absolute path names
>> # file: brick1/recorder/Rec218
>> trusted.ec.version=0x000000000000693a
>> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>
>> [root@gluster05-mi ~]# getfattr -m. -e hex -d /brick2/recorder/Rec218
>> getfattr: Removing leading '/' from absolute path names
>> # file: brick2/recorder/Rec218
>> trusted.ec.version=0x000000000000693a
>> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
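
Editorial sketch, not part of the original exchange: the `bad input encoding` error reported for `-v 0x1` appears consistent with setfattr wanting hex values written as whole bytes, and every trusted.ec.version value in this thread is 16 hex digits (8 bytes). The two helpers below only build strings in the shapes seen above: a zero-padded 8-byte hex value, and the `.glusterfs/<aa>/<bb>/<gfid>` path that the `ls -l` commands use. The function names are hypothetical, and the path layout is inferred solely from the paths quoted in this thread.

```shell
# Hypothetical helper: print a counter as the 8-byte (16 hex digit) value
# used for trusted.ec.version in the getfattr dumps above.
ec_version_hex() {
    printf '0x%016x\n' "$1"
}

# Hypothetical helper: turn a trusted.gfid hex value into the relative
# .glusterfs path shape seen in this thread. Assumes a 32-digit hex gfid.
gfid_backend_path() {
    hex=${1#0x}                                  # drop the leading 0x
    uuid=$(printf '%s\n' "$hex" |
        sed -E 's/^(.{8})(.{4})(.{4})(.{4})(.{12})$/\1-\2-\3-\4-\5/')
    d1=$(printf '%s' "$hex" | cut -c1-2)         # first directory level
    d2=$(printf '%s' "$hex" | cut -c3-4)         # second directory level
    printf '.glusterfs/%s/%s/%s\n' "$d1" "$d2" "$uuid"
}

ec_version_hex 1
gfid_backend_path 0xa9d904af0d9e4018acb2881bd8b3c2e4
```

The first call prints the corrected value from the message above; the second prints the relative form of the a9/d9 path that was checked on gluster03-mi. Neither helper touches a brick, so they are safe to run anywhere.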
Xavier Hernandez
2015-Jan-07 17:14 UTC
[Gluster-users] Input/Output Error when deleting folder
If that file is missing only from gluster03-mi, and it has the same attributes on all remaining bricks, self-heal should recover it automatically.

Are there differences in the extended attributes of the file on the bricks that have it?
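
Editorial sketch, not part of the original exchange: one way to answer the question about attribute differences is to save the `getfattr -m. -e hex -d` output from each brick into a text file and compare the attribute lines. The function name and file names below are hypothetical; the sketch only compares text and does not talk to Gluster.

```shell
# Hypothetical helper: compare saved getfattr dumps, one file per brick.
# Prints a line for every dump whose trusted.* attributes differ from the
# first dump, and returns 1 if any difference was found.
# Variables are global because plain POSIX sh has no `local`.
compare_xattr_dumps() {
    first=1
    status=0
    for f in "$@"; do
        # Keep only the attribute lines so the "# file:" header is ignored.
        attrs=$(grep '^trusted\.' "$f" | sort || true)
        if [ "$first" -eq 1 ]; then
            ref=$attrs
            ref_file=$f
            first=0
        elif [ "$attrs" != "$ref" ]; then
            echo "DIFFERS: $f (compared with $ref_file)"
            status=1
        fi
    done
    return "$status"
}
```

A possible use, with made-up file names: run getfattr on the file on each brick that has it, redirect each result to something like `gluster01-brick1.txt`, then call `compare_xattr_dumps gluster0*-brick*.txt`. No output and a zero exit status would mean the dumps carry identical attributes.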