Shawn Heisey
2013-Nov-09 03:23 UTC
[Gluster-users] Failed rebalance - lost files, inaccessible files, permission issues
I'm starting a new thread on this, because I have more concrete information than I did the first time around. The full rebalance log from the machine where I started the rebalance can be found at the following link. It is slightly redacted - one search/replace was made to replace an identifying word with REDACTED.

https://dl.dropboxusercontent.com/u/97770508/mdfs-rebalance-redacted.zip

The existing servers are running version 3.3.1-11.el6.x86_64, from kkeithley's epel repository. Those servers have CentOS 6.3 on them.

The newer servers are running version 3.3.1-15.el6.x86_64, and are CentOS 6.4, fully upgraded as of October 28th, 2013. Both sets of servers have contrib, plus, epel, rpmforge, and kkeithley's glusterfs repo.

One of our developers went through the log linked above and some of the source code. I am reproducing his detailed comments below. I have not looked at the log super-closely, except to produce a list of files that failed to migrate.

What I'm hoping is that this information can either point us at some proof (bug, committed patch, etc.) that the problem is fixed in 3.4.1, or we can use it to file a new bug. I'm hoping that either an upgrade will fix it or that a workaround can be found.

I'm still hoping to hire someone to look things over. Where can I find some good resources for this? I tried sending a message to Redhat Consulting, but something may have gone wrong with that process, because it's been two days with no response.

Full quote from our developer:

------------------------------
Preface: I know what happened to the files and it's not what I thought it was. I don't know the exact cause, but we're closer.

Here's where I throw away my vague hunch of two days ago. I just realized that all the ZUMA files I saw on the new bricks were simply links created when users tried to access the files. We did indeed rebalance the files in chronological order of their uploads. That was a two-day-long wrong-tree barking session because I didn't understand the architecture.

When I looked at the individual cases of lost or corrupted files, one thing kept staring me in the face until I recognized it:

[2013-11-02 03:56:36.472170] I [dht-rebalance.c:647:dht_migrate_file] 0-mdfs-dht: /REDACTED/mdfs/AKG/akgphotos/docs/000/002: attempting to move from mdfs-replicate-2 to mdfs-replicate-12
[2013-11-02 03:56:36.472186] I [dht-rebalance.c:647:dht_migrate_file] 0-mdfs-dht: /REDACTED/mdfs/AKG/akgphotos/docs/000/002: attempting to move from mdfs-replicate-2 to mdfs-replicate-12
[2013-11-02 03:56:36.480567] I [dht-rebalance.c:647:dht_migrate_file] 0-mdfs-dht: /REDACTED/mdfs/AKG/akgphotos/docs/000/002: attempting to move from mdfs-replicate-2 to mdfs-replicate-12

Three simultaneous processes on the same file! Of course that would have undefined results, and be the cause of all our problems. NFS may not be related after all.
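These collisions are easy to pull out of the full rebalance log mechanically. A quick sketch in Python that flags any path with more than one "attempting to move" entry; the default log file name is a placeholder and the pattern is inferred from the excerpts above, so treat both as assumptions:

#!/usr/bin/env python3
# Quick sketch: flag any file that appears in more than one
# "attempting to move" entry in a rebalance log -- the signature of
# concurrent migrations of the same file.
import re
import sys
from collections import defaultdict

log = sys.argv[1] if len(sys.argv) > 1 else "mdfs-rebalance.log"
pattern = re.compile(
    r"^\[(?P<ts>[0-9:. -]+)\] I \[dht-rebalance\.c:\d+:dht_migrate_file\] "
    r"\S+: (?P<path>.+): attempting to move from (?P<src>\S+) to (?P<dst>\S+)")

attempts = defaultdict(list)
with open(log, errors="replace") as fh:
    for line in fh:
        m = pattern.match(line)
        if m:
            attempts[m.group("path")].append(
                (m.group("ts"), m.group("src"), m.group("dst")))

for path, hits in sorted(attempts.items()):
    if len(hits) > 1:
        print("%s: %d migration attempts" % (path, len(hits)))
        for ts, src, dst in hits:
            print("    %s  %s -> %s" % (ts, src, dst))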
Tediously scrolling through the error log I found mostly errors where it refused to copy files from a more empty brick to a fuller brick, which makes perfect sense. The wheels started falling off about 26 hours into the rebalance.

[2013-10-29 23:13:17.193108] C [client-handshake.c:126:rpc_client_ping_timer_expired] 0-mdfs-client-1: server 10.116.0.22:24025 has not responded in the last 42 seconds, disconnecting.
[2013-10-29 23:13:17.200616] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x78) [0x36de60f808] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb0) [0x36de60f4c0] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x36de60ef2e]))) 0-mdfs-client-1: forced unwinding frame type(GlusterFS 3.1) op(STAT(1)) called at 2013-10-29 23:12:20.641732 (xid=0x1763733x)
[2013-10-29 23:13:17.200659] W [client3_1-fops.c:474:client3_1_stat_cbk] 0-mdfs-client-1: remote operation failed: Transport endpoint is not connected
[2013-10-29 23:13:17.200756] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x78) [0x36de60f808] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb0) [0x36de60f4c0] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x36de60ef2e]))) 0-mdfs-client-1: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2013-10-29 23:12:35.184481 (xid=0x1763734x)
[2013-10-29 23:13:17.200792] W [client-handshake.c:275:client_ping_cbk] 0-mdfs-client-1: timer must have expired
[2013-10-29 23:13:17.217477] I [client.c:2090:client_rpc_notify] 0-mdfs-client-1: disconnected
.
.
.
[2013-10-29 23:13:28.198386] I [client-handshake.c:1636:select_server_supported_programs] 0-mdfs-client-1: Using Program GlusterFS 3.3.1, Num (1298437), Version (330)
[2013-10-29 23:13:28.198803] I [client-handshake.c:1433:client_setvolume_cbk] 0-mdfs-client-1: Connected to 10.116.0.22:24025, attached to remote volume '/bricks/d00v00/mdfs'.
[2013-10-29 23:13:28.198826] I [client-handshake.c:1445:client_setvolume_cbk] 0-mdfs-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2013-10-29 23:13:28.205629] I [client-handshake.c:1282:client_post_handshake] 0-mdfs-client-1: 42745 fds open - Delaying child_up until they are re-opened
[2013-10-29 23:13:28.205782] E [inode.c:1090:__inode_path] (-->/usr/lib64/glusterfs/3.3.1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7fb74bdf1c90] (-->/usr/lib64/glusterfs/3.3.1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7fb74bdf1907] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x36dde2ca1a]))) 0-: Assertion failed: 0
[2013-10-29 23:13:28.205852] W [inode.c:1091:__inode_path] (-->/usr/lib64/glusterfs/3.3.1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7fb74bdf1c90] (-->/usr/lib64/glusterfs/3.3.1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7fb74bdf1907] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x36dde2ca1a]))) 0-mdfs-client-1: invalid inode
[2013-10-29 23:13:28.205881] W [client-handshake.c:1108:protocol_client_reopendir] 0-mdfs-client-1: couldn't build path from inode 00000000-0000-0000-0000-000000000000
[2013-10-29 23:13:28.205944] E [inode.c:1090:__inode_path] (-->/usr/lib64/glusterfs/3.3.1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7fb74bdf1c90] (-->/usr/lib64/glusterfs/3.3.1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7fb74bdf1907] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x36dde2ca1a]))) 0-: Assertion failed: 0
[2013-10-29 23:13:28.205992] W [inode.c:1091:__inode_path] (-->/usr/lib64/glusterfs/3.3.1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7fb74bdf1c90] (-->/usr/lib64/glusterfs/3.3.1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7fb74bdf1907] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x36dde2ca1a]))) 0-mdfs-client-1: invalid inode
[2013-10-29 23:13:28.206009] W [client-handshake.c:1108:protocol_client_reopendir] 0-mdfs-client-1: couldn't build path from inode 00000000-0000-0000-0000-000000000000

The assertion failures actually go on for thousands of lines even though only 3 seconds have elapsed. Eventually it overcame whatever that handshake error means. Not long after:

[2013-10-29 23:13:49.611069] I [dht-rebalance.c:647:dht_migrate_file] 0-mdfs-dht: /REDACTED/mdfs/KPA/kpacontentminepix/docs/008/058: attempting to move from mdfs-replicate-1 to mdfs-replicate-6
[2013-10-29 23:13:49.611582] I [dht-rebalance.c:647:dht_migrate_file] 0-mdfs-dht: /REDACTED/mdfs/KPA/kpacontentminepix/docs/008/058: attempting to move from mdfs-replicate-1 to mdfs-replicate-6

and there's our first lost file. From that point on, we were trashing files with simultaneous copy processes. I think a process hung up, and a new one was spawned, and when the first process woke back up it didn't know enough to stop.
------------------------------

Shawn
Anand Avati
2013-Nov-09 08:47 UTC
[Gluster-users] Failed rebalance - lost files, inaccessible files, permission issues
Shawn,

Thanks for the detailed info. I have not yet looked into your logs, but will do so soon. There have been patches to rebalance which do fix issues related to ownership, but I am not (yet) sure about bugs which caused data loss.

One question I have is about these two lines:

[2013-10-29 23:13:49.611069] I [dht-rebalance.c:647:dht_migrate_file] 0-mdfs-dht: /REDACTED/mdfs/KPA/kpacontentminepix/docs/008/058: attempting to move from mdfs-replicate-1 to mdfs-replicate-6
[2013-10-29 23:13:49.611582] I [dht-rebalance.c:647:dht_migrate_file] 0-mdfs-dht: /REDACTED/mdfs/KPA/kpacontentminepix/docs/008/058: attempting to move from mdfs-replicate-1 to mdfs-replicate-6

Are they from the same log file or from separate log files? If they are from the same log, then it might be that you need http://review.gluster.org/4300 (available in 3.4).

It might also be that the "permission issues" are a cascaded effect of the same underlying problem: the temporary file created by rebalance has different permissions while the migration is in progress, and failures might have left files in that state.
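One rough way to see whether any files were left in that in-between state is to scan the bricks directly for entries that still carry a sticky-bit-only mode or a trusted.glusterfs.dht.linkto xattr. Whether those are exactly the right markers for 3.3 is an assumption here, and the brick path below is just the one that appears in your log; a minimal sketch:

#!/usr/bin/env python3
# Rough sketch: walk a brick and report regular files that still look like
# DHT link/migration placeholders -- mode ---------T (sticky bit only) or a
# trusted.glusterfs.dht.linkto xattr.  Run it against the brick directory
# itself, not the FUSE mount.  Needs Python 3.3+ for os.getxattr.
import os
import stat
import sys

brick = sys.argv[1] if len(sys.argv) > 1 else "/bricks/d00v00/mdfs"

for root, dirs, files in os.walk(brick):
    dirs[:] = [d for d in dirs if d != ".glusterfs"]   # skip gluster's gfid tree
    for name in files:
        path = os.path.join(root, name)
        st = os.lstat(path)
        if not stat.S_ISREG(st.st_mode):
            continue
        sticky_only = stat.S_IMODE(st.st_mode) == 0o1000
        try:
            linkto = os.getxattr(path, "trusted.glusterfs.dht.linkto")
        except OSError:
            linkto = None
        if sticky_only or linkto is not None:
            print(path, oct(stat.S_IMODE(st.st_mode)),
                  linkto.decode(errors="replace") if linkto else "-")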
Avati
Shawn Heisey
2013-Nov-26 22:51 UTC
[Gluster-users] Failed rebalance - lost files, inaccessible files, permission issues
Here's what our own developer had to say about this:

On 11/8/2013 8:23 PM, Shawn Heisey wrote:
> and there's our first lost file. From that point on, we were trashing
> files with simultaneous copy processes. I think a process hung up, and
> a new one was spawned, and when the first process woke back up it didn't
> know enough to stop.

I duplicated our production setup as closely as possible on a testbed. There are some differences, mostly due to the fact that the testbed is built from inexpensive desktop-class machines with nowhere near as much CPU, RAM, or disk as the server-class production hardware.

The rebalance ultimately failed because it was killed by the Linux OOM killer, similar to what happened in production. Here it was due to actual memory usage, rather than hitting the system-wide max open file descriptor limit as it did in production.
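System-wide descriptor usage is at least easy to watch on the next attempt: /proc/sys/fs/file-nr reports allocated handles, allocated-but-unused handles, and the fs.file-max limit, so a small poller (the interval is arbitrary) shows how close a rebalance gets to the ceiling. A sketch:

#!/usr/bin/env python3
# Small sketch: poll system-wide file handle usage while a rebalance runs.
# /proc/sys/fs/file-nr holds three numbers: allocated handles, allocated
# but unused handles, and the fs.file-max limit.
import time

while True:
    with open("/proc/sys/fs/file-nr") as fh:
        allocated, unused, maximum = (int(x) for x in fh.read().split())
    print("%s  file handles in use: %d / %d"
          % (time.strftime("%H:%M:%S"), allocated - unused, maximum))
    time.sleep(30)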
It does not appear that I have any lost files, permission errors, or inaccessible files on the testbed. I did not see the critical error showing the ping-timeout, though.

I cannot account for the ping timeout that the production system saw, which we suspect caused bug 859387 to surface. There are no kernel log messages on any host aside from the gluster ping-timeout itself, no errors in the NIC statistics, no messages in the switch logs, and no errors on the switch interfaces. On 10.100.116.22, the host that the critical log entry says timed out, the logfile etc-glusterfs-glusterd.vol.log-20131103 shows activity every few seconds throughout the entire 42-second period.

https://bugzilla.redhat.com/show_bug.cgi?id=859387

I would like to recreate the circumstances of the bug on the testbed and see whether I hit problems similar to what I did in production, but the bug report gives no details for reproducing it other than bringing a brick down and then bringing it back up. It doesn't say HOW to bring a brick down. Would sending a SIGSTOP to the brick process do it, followed by a SIGCONT after an interval greater than 42 seconds?

Thanks,
Shawn
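As a starting point for that experiment, pausing a brick's glusterfsd with SIGSTOP for longer than the 42-second ping-timeout and then resuming it might be enough; a rough sketch, where the brick pid has to be supplied and whether this is a faithful stand-in for a brick actually going down is exactly the open question:

#!/usr/bin/env python3
# Sketch: freeze a brick process for longer than the 42-second ping-timeout,
# then resume it.  Pass the glusterfsd pid (and optionally a pause in
# seconds) on the command line.
import os
import signal
import sys
import time

pid = int(sys.argv[1])
pause = int(sys.argv[2]) if len(sys.argv) > 2 else 60   # seconds, > 42

os.kill(pid, signal.SIGSTOP)
print("sent SIGSTOP to %d, sleeping %d seconds" % (pid, pause))
try:
    time.sleep(pause)
finally:
    os.kill(pid, signal.SIGCONT)
    print("sent SIGCONT to %d" % pid)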