Xavier Hernandez
2015-Jan-07 15:08 UTC
[Gluster-users] Input/Output Error when deleting folder
I see two problems here:
1. Something very strange has happened on gluster03-mi. It contains
the directory, but it's not the same one that is on the other bricks
(8 bricks have gfid a9d904af-0d9e-4018-acb2-881bd8b3c2e4, while the
two bricks on that node have gfid bda849fc-a556-469e-ad84-ed074f2c1bcd).
Those two bricks are also missing the trusted.ec.version and
trusted.glusterfs.dht attributes. Whatever happened here has affected
both bricks of that node in the same way.
What do these commands return on gluster03-mi?
ls -l /brick1/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4
ls -l /brick1/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd
ls -l /brick2/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4
ls -l /brick2/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd
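The paths in these commands follow directly from the gfid: the first two hex bytes become two directory levels under .glusterfs, followed by the gfid written as a UUID. A minimal sketch of that mapping, inferred only from the paths and trusted.gfid values shown in this thread (the helper name gfid_path is hypothetical):

```shell
# Hypothetical helper: build the .glusterfs entry path for a brick
# from a trusted.gfid value as printed by "getfattr -e hex".
gfid_path() {
    brick=$1
    hex=${2#0x}                                  # drop the 0x prefix
    a=$(printf '%s\n' "$hex" | cut -c1-2)        # first directory level
    b=$(printf '%s\n' "$hex" | cut -c3-4)        # second directory level
    # Insert dashes in the 8-4-4-4-12 UUID layout.
    uuid=$(printf '%s\n' "$hex" |
        sed 's/^\(.\{8\}\)\(.\{4\}\)\(.\{4\}\)\(.\{4\}\)\(.\{12\}\)$/\1-\2-\3-\4-\5/')
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$a" "$b" "$uuid"
}

gfid_path /brick1/recorder 0xa9d904af0d9e4018acb2881bd8b3c2e4
gfid_path /brick1/recorder 0xbda849fca556469ead84ed074f2c1bcd
```

Running it for both gfids and both bricks reproduces the four paths listed above.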
2. It seems that node gluster04-mi was stopped (or rebooted, or
failed) while an operation that modifies the directory contents was
being executed, so it has lost an update and is out of sync: its
trusted.ec.version is 0x6939 while the other bricks have 0x693a (both
bricks on the same server have missed one update, so it seems clear
that it's not a brick problem but a server problem).
The overall result is that you have 4 failed bricks in a
configuration that only tolerates 2 failed bricks.
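The count of 4 can be checked from the getfattr output quoted further down. The sketch below is only an illustration: it compares each brick against the most common (gfid, ec.version) pair, which is an assumption made for this example and not how the ec translator itself decides which bricks are good. The table is copied from the quoted output, with "-" standing for a missing attribute.

```shell
# One line per brick: <brick> <trusted.gfid> <trusted.ec.version>
xattrs() {
cat <<'EOF'
gluster01-mi:/brick1 a9d904af0d9e4018acb2881bd8b3c2e4 000000000000693a
gluster01-mi:/brick2 a9d904af0d9e4018acb2881bd8b3c2e4 000000000000693a
gluster02-mi:/brick1 a9d904af0d9e4018acb2881bd8b3c2e4 000000000000693a
gluster02-mi:/brick2 a9d904af0d9e4018acb2881bd8b3c2e4 000000000000693a
gluster03-mi:/brick1 bda849fca556469ead84ed074f2c1bcd -
gluster03-mi:/brick2 bda849fca556469ead84ed074f2c1bcd -
gluster04-mi:/brick1 a9d904af0d9e4018acb2881bd8b3c2e4 0000000000006939
gluster04-mi:/brick2 a9d904af0d9e4018acb2881bd8b3c2e4 0000000000006939
gluster05-mi:/brick1 a9d904af0d9e4018acb2881bd8b3c2e4 000000000000693a
gluster05-mi:/brick2 a9d904af0d9e4018acb2881bd8b3c2e4 000000000000693a
EOF
}

# Most common (gfid, version) pair across all bricks.
majority=$(xattrs | awk '{print $2, $3}' | sort | uniq -c | sort -rn |
    head -n 1 | awk '{print $2, $3}')

# Bricks whose pair differs from the majority.
failed=$(xattrs | awk -v m="$majority" '($2 " " $3) != m {print $1}')

printf 'Bricks that differ from the majority:\n%s\n' "$failed"
```

This lists both bricks of gluster03-mi and both bricks of gluster04-mi, which is two more than the configuration tolerates.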
BTW, having two or more bricks on the same server is not recommended
because a single server failure causes multiple bricks to be lost. In
this case a directory can be recovered, but if this happens to a file,
it won't be 100% recoverable.
Are there any files inside /Rec218?
If you are going to delete the directory and all its contents, and the
brick contents on gluster03-mi are the same as on the other servers,
the following commands should be safe (otherwise let me know before
doing anything):
Before starting, you must be sure that nothing is creating or deleting
entries inside /Rec218. It would be even better if this could be done
with the volume stopped.
On each brick (including gluster03-mi):
setfattr -n trusted.ec.version -v 0x1 <brick path>/Rec218
On bricks in gluster03-mi:
setfattr -n trusted.gfid -v 0xa9d904af0d9e4018acb2881bd8b3c2e4 <brick path>/Rec218
setfattr -n trusted.glusterfs.dht -v 0x000000010000000000000000ffffffff <brick path>/Rec218
On a client:
Check that the directory is accessible and its contents seem OK. If so:
rm -rf <mount point>/Rec218
If you have a way to reproduce this situation, let me know.
Xavi
On 01/07/2015 03:31 PM, RASTELLI Alessandro wrote:
> [root@gluster01-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218
> getfattr: Removing leading '/' from absolute path names
> # file: brick1/recorder/Rec218
> trusted.ec.version=0x000000000000693a
> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>
> [root@gluster01-mi ~]# getfattr -m. -e hex -d /brick2/recorder/Rec218
> getfattr: Removing leading '/' from absolute path names
> # file: brick2/recorder/Rec218
> trusted.ec.version=0x000000000000693a
> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>
>
> [root@gluster02-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218
> getfattr: Removing leading '/' from absolute path names
> # file: brick1/recorder/Rec218
> trusted.ec.version=0x000000000000693a
> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>
> [root@gluster02-mi ~]# getfattr -m. -e hex -d /brick2/recorder/Rec218
> getfattr: Removing leading '/' from absolute path names
> # file: brick2/recorder/Rec218
> trusted.ec.version=0x000000000000693a
> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>
>
> [root@gluster03-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218
> getfattr: Removing leading '/' from absolute path names
> # file: brick1/recorder/Rec218
> trusted.gfid=0xbda849fca556469ead84ed074f2c1bcd
>
> [root@gluster03-mi ~]# getfattr -m. -e hex -d /brick2/recorder/Rec218
> getfattr: Removing leading '/' from absolute path names
> # file: brick2/recorder/Rec218
> trusted.gfid=0xbda849fca556469ead84ed074f2c1bcd
>
>
> [root@gluster04-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218
> getfattr: Removing leading '/' from absolute path names
> # file: brick1/recorder/Rec218
> trusted.ec.version=0x0000000000006939
> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>
> [root@gluster04-mi ~]# getfattr -m. -e hex -d /brick2/recorder/Rec218
> getfattr: Removing leading '/' from absolute path names
> # file: brick2/recorder/Rec218
> trusted.ec.version=0x0000000000006939
> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>
>
> [root@gluster05-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218
> getfattr: Removing leading '/' from absolute path names
> # file: brick1/recorder/Rec218
> trusted.ec.version=0x000000000000693a
> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>
> [root@gluster05-mi ~]# getfattr -m. -e hex -d /brick2/recorder/Rec218
> getfattr: Removing leading '/' from absolute path names
> # file: brick2/recorder/Rec218
> trusted.ec.version=0x000000000000693a
> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
RASTELLI Alessandro
2015-Jan-07 15:34 UTC
[Gluster-users] Input/Output Error when deleting folder
See my answers below:
1.
[root@gluster03-mi ~]# ls -l /brick1/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4
ls: cannot access /brick1/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4: No such file or directory
[root@gluster03-mi ~]# ls -l /brick1/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd
lrwxrwxrwx 1 root root 55 Dec 17 17:37 /brick1/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd -> ../../00/00/00000000-0000-0000-0000-000000000001/Rec218
[root@gluster03-mi ~]# ls -l /brick2/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4
ls: cannot access /brick2/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4: No such file or directory
[root@gluster03-mi ~]# ls -l /brick2/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd
lrwxrwxrwx 1 root root 55 Dec 17 17:37 /brick2/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd -> ../../00/00/00000000-0000-0000-0000-000000000001/Rec218
2.
/Rec218 is supposed to be empty (or rather, I don't need to restore its files).
I stopped the volume, but when executing the command I get an error:
[root@gluster01-mi ~]# setfattr -n trusted.ec.version -v 0x1 /brick1/recorder/Rec218
bad input encoding
Regards
A.
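A likely explanation for the "bad input encoding" error is the value 0x1 itself: setfattr appears to decode a 0x value as pairs of hex digits, so an odd number of digits is rejected. The sketch below assumes that trusted.ec.version should be written with the same 8-byte width that getfattr prints for it in this thread; that width and both helper names are assumptions, not something confirmed by the messages here.

```shell
# Hypothetical check: does the value look like 0x followed by an even
# number of hex digits?
is_even_hex() {
    case $1 in
        0x*) ;;
        *) return 1 ;;
    esac
    digits=${1#0x}
    case $digits in
        ''|*[!0-9a-fA-F]*) return 1 ;;
    esac
    [ $(( ${#digits} % 2 )) -eq 0 ]
}

# Hypothetical helper: format an integer as a 0x-prefixed 8-byte value,
# matching the width of trusted.ec.version in the getfattr output.
ec_version_hex() {
    printf '0x%016x\n' "$1"
}

is_even_hex 0x1 || echo "0x1 has an odd number of hex digits"
value=$(ec_version_hex 1)
is_even_hex "$value" && echo "candidate value: $value"
```

The padded value would then replace 0x1 in the setfattr command, still with the volume stopped and only after confirming the expected format.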