Adrian Gruntkowski
2015-Oct-28 10:57 UTC
[Gluster-users] Copy operation freezes. Lots of locks in state BLOCKED (3-node setup with 1 arbiter)
Hello Pranith,

Thank you for the prompt reaction. I didn't get back to this until now, because I had other problems to deal with.

Is there a chance that it will get released this or next month? If not, I will probably have to resort to compiling on my own.

Regards,
Adrian

2015-10-26 12:37 GMT+01:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
>
> On 10/23/2015 10:10 AM, Ravishankar N wrote:
>>
>> On 10/21/2015 05:55 PM, Adrian Gruntkowski wrote:
>>> Hello,
>>>
>>> I'm trying to track down a problem with my setup (version 3.7.3 on Debian stable).
>>>
>>> I have a couple of volumes set up in a 3-node configuration, with 1 brick as an arbiter for each.
>>>
>>> There are 4 volumes set up in a cross-over across 3 physical servers, like this:
>>>
>>>      -------------------->[ GigabitEthernet switch ]<--------------------
>>>      |                               ^                                  |
>>>      |                               |                                  |
>>>      V                               V                                  V
>>>  /--------------------------\  /--------------------------\  /--------------------------\
>>>  | web-rep                  |  | cluster-rep              |  | mail-rep                 |
>>>  |                          |  |                          |  |                          |
>>>  | vols:                    |  | vols:                    |  | vols:                    |
>>>  |  system_www1             |  |  system_www1             |  |  system_www1 (arbiter)   |
>>>  |  data_www1               |  |  data_www1               |  |  data_www1 (arbiter)     |
>>>  |  system_mail1 (arbiter)  |  |  system_mail1            |  |  system_mail1            |
>>>  |  data_mail1 (arbiter)    |  |  data_mail1              |  |  data_mail1              |
>>>  \--------------------------/  \--------------------------/  \--------------------------/
>>>
>>> Now, after a fresh boot-up, everything seems to be running fine.
>>> Then I start copying big files (KVM disk images) from the local disk to the gluster mounts.
>>> In the beginning it seems to run fine (although iowait seems to go so high that it clogs up
>>> IO operations at some moments, but that's an issue for later). After some time the transfer
>>> freezes; then, after some (long) time, it advances in a short burst, only to freeze again.
>>> Another interesting thing is that I see a constant flow of network traffic on the interfaces
>>> dedicated to gluster, even when there's a "freeze".
>>> I have done a "gluster volume statedump" at the time of the transfer (the file is copied
>>> from the local disk on cluster-rep onto a local mount of the "system_www1" volume).
>>> I've observed the following section in the dump for the cluster-rep node:
>>>
>>> [xlator.features.locks.system_www1-locks.inode]
>>> path=/images/101/vm-101-disk-1.qcow2
>>> mandatory=0
>>> inodelk-count=12
>>> lock-dump.domain.domain=system_www1-replicate-0:self-heal
>>> inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 18446744073709551610, owner=c811600cd67f0000, client=0x7fbe100df280, connection-id=cluster-vm-3603-2015/10/21-10:35:54:596929-system_www1-client-0-0-0, granted at 2015-10-21 11:36:22
>>> lock-dump.domain.domain=system_www1-replicate-0
>>> inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=2195849216, len=131072, pid = 18446744073709551610, owner=c811600cd67f0000, client=0x7fbe100df280, connection-id=cluster-vm-3603-2015/10/21-10:35:54:596929-system_www1-client-0-0-0, granted at 2015-10-21 11:37:45
>>> inodelk.inodelk[1](ACTIVE)=type=WRITE, whence=0, start=9223372036854775805, len=1, pid = 18446744073709551610, owner=c811600cd67f0000, client=0x7fbe100df280, connection-id=cluster-vm-3603-2015/10/21-10:35:54:596929-system_www1-client-0-0-0, granted at 2015-10-21 11:36:22
>>
>> From the statedump, it looks like the self-heal daemon had taken locks to heal the file,
>> due to which the locks attempted by the client (mount) are in blocked state.
>> In arbiter volumes the client (mount) takes full locks (start=0, len=0) for every write(),
>> as opposed to normal replica volumes, which take range locks (i.e. appropriate start,len
>> values) for that write(). This is done to avoid network split-brains.
>> So in normal replica volumes, clients can still write to a file while a heal is going on,
>> as long as the offsets don't overlap. This is not the case with arbiter volumes.
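The statedump inspection described above can be scripted. A minimal sketch follows; the `gluster volume statedump` command and the `/var/run/gluster/` dump location are the stock 3.7.x defaults, and the counting is demonstrated on an inline excerpt of the dump above rather than a live dump file:

```shell
# On a live node one would first trigger the dump (requires a running cluster):
#   gluster volume statedump system_www1
# and then inspect the newest dump file under /var/run/gluster/ on each brick node.
# Here the same counting is shown against an inline excerpt:
dump=$(cat <<'EOF'
[xlator.features.locks.system_www1-locks.inode]
path=/images/101/vm-101-disk-1.qcow2
inodelk-count=12
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0
inodelk.inodelk[1](ACTIVE)=type=WRITE, whence=0, start=9223372036854775805, len=1
inodelk.inodelk[2](BLOCKED)=type=WRITE, whence=0, start=0, len=0
inodelk.inodelk[3](BLOCKED)=type=WRITE, whence=0, start=0, len=0
EOF
)
# Count granted vs. blocked inode locks; "full" locks are those with start=0, len=0,
# which is what arbiter-volume clients take for every write().
active=$(printf '%s\n' "$dump" | grep -c '(ACTIVE)')
blocked=$(printf '%s\n' "$dump" | grep -c '(BLOCKED)')
full=$(printf '%s\n' "$dump" | grep -c 'start=0, len=0')
echo "active=$active blocked=$blocked full-range=$full"
```

A growing BLOCKED count against a long-lived self-heal lock is exactly the pattern visible in the dump quoted above.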
>> You can look at the client or glustershd logs to see if there are messages that indicate
>> healing of a file, something along the lines of "Completed data selfheal on xxx".
>
> hi Adrian,
>       Thanks for taking the time to send this mail. I raised this as a bug @
> https://bugzilla.redhat.com/show_bug.cgi?id=1275247; the fix is posted for review @
> http://review.gluster.com/#/c/12426/
>
> Pranith
>
>>> inodelk.inodelk[2](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 0, owner=c4fd2d78487f0000, client=0x7fbe100e1380, connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0, blocked at 2015-10-21 11:37:45
>>> inodelk.inodelk[3](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 0, owner=dc752e78487f0000, client=0x7fbe100e1380, connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0, blocked at 2015-10-21 11:37:45
>>> inodelk.inodelk[4](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 0, owner=34832e78487f0000, client=0x7fbe100e1380, connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0, blocked at 2015-10-21 11:37:45
>>> inodelk.inodelk[5](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 0, owner=d44d2e78487f0000, client=0x7fbe100e1380, connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0, blocked at 2015-10-21 11:37:45
>>> inodelk.inodelk[6](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 0, owner=306f2e78487f0000, client=0x7fbe100e1380, connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0, blocked at 2015-10-21 11:37:45
>>> inodelk.inodelk[7](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 0, owner=8c902e78487f0000, client=0x7fbe100e1380, connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0, blocked at 2015-10-21 11:37:45
>>> inodelk.inodelk[8](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 0, owner=782c2e78487f0000, client=0x7fbe100e1380, connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0, blocked at 2015-10-21 11:37:45
>>> inodelk.inodelk[9](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 0, owner=1c0b2e78487f0000, client=0x7fbe100e1380, connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0, blocked at 2015-10-21 11:37:45
>>> inodelk.inodelk[10](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 0, owner=24332e78487f0000, client=0x7fbe100e1380, connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0, blocked at 2015-10-21 11:37:45
>>>
>>> There seem to be multiple locks in BLOCKED state, which doesn't look normal to me.
>>> The other 2 nodes have only 2 ACTIVE locks at the same time.
>>>
>>> Below is the "gluster volume info" output:
>>>
>>> # gluster volume info
>>>
>>> Volume Name: data_mail1
>>> Type: Replicate
>>> Volume ID: fc3259a1-ddcf-46e9-ae77-299aaad93b7c
>>> Status: Started
>>> Number of Bricks: 1 x 3 = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: cluster-rep:/GFS/data/mail1
>>> Brick2: mail-rep:/GFS/data/mail1
>>> Brick3: web-rep:/GFS/data/mail1
>>> Options Reconfigured:
>>> performance.readdir-ahead: on
>>> cluster.quorum-count: 2
>>> cluster.quorum-type: fixed
>>> cluster.server-quorum-ratio: 51%
>>>
>>> Volume Name: data_www1
>>> Type: Replicate
>>> Volume ID: 0c37a337-dbe5-4e75-8010-94e068c02026
>>> Status: Started
>>> Number of Bricks: 1 x 3 = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: cluster-rep:/GFS/data/www1
>>> Brick2: web-rep:/GFS/data/www1
>>> Brick3: mail-rep:/GFS/data/www1
>>> Options Reconfigured:
>>> performance.readdir-ahead: on
>>> cluster.quorum-type: fixed
>>> cluster.quorum-count: 2
>>> cluster.server-quorum-ratio: 51%
>>>
>>> Volume Name: system_mail1
>>> Type: Replicate
>>> Volume ID: 0568d985-9fa7-40a7-bead-298310622cb5
>>> Status: Started
>>> Number of Bricks: 1 x 3 = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: cluster-rep:/GFS/system/mail1
>>> Brick2: mail-rep:/GFS/system/mail1
>>> Brick3: web-rep:/GFS/system/mail1
>>> Options Reconfigured:
>>> performance.readdir-ahead: on
>>> cluster.quorum-type: none
>>> cluster.quorum-count: 2
>>> cluster.server-quorum-ratio: 51%
>>>
>>> Volume Name: system_www1
>>> Type: Replicate
>>> Volume ID: 147636a2-5c15-4d9a-93c8-44d51252b124
>>> Status: Started
>>> Number of Bricks: 1 x 3 = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: cluster-rep:/GFS/system/www1
>>> Brick2: web-rep:/GFS/system/www1
>>> Brick3: mail-rep:/GFS/system/www1
>>> Options Reconfigured:
>>> performance.readdir-ahead: on
>>> cluster.quorum-type: none
>>> cluster.quorum-count: 2
>>> cluster.server-quorum-ratio: 51%
>>>
>>> The issue does not occur when I get rid of the 3rd arbiter brick.
>>
>> What do you mean by 'getting rid of'? Killing the 3rd brick process of the volume?
>>
>> Regards,
>> Ravi
>>
>>> If there's any additional information that is missing and that I could provide,
>>> please let me know.
>>>
>>> Greetings,
>>> Adrian
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-users

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20151028/13cd15f4/attachment.html>
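On the last point, "getting rid of" the 3rd arbiter brick: one way this is typically done is to shrink the volume back to plain replica 2. The following is a hypothetical sketch, not a command taken from the thread; the brick path is read off the volume info above, where mail-rep holds the 3rd (arbiter) brick of system_www1:

```shell
# Hypothetical: remove the arbiter (3rd) brick of system_www1,
# reducing the volume to a 2-way replica. Run against a live cluster.
gluster volume remove-brick system_www1 replica 2 \
    mail-rep:/GFS/system/www1 force
```

Whether Adrian shrank the volume this way or simply killed the 3rd brick process is exactly what Ravi's question above is asking.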
Pranith Kumar Karampuri
2015-Oct-28 11:02 UTC
[Gluster-users] Copy operation freezes. Lots of locks in state BLOCKED (3-node setup with 1 arbiter)
On 10/28/2015 04:27 PM, Adrian Gruntkowski wrote:
> Hello Pranith,
>
> Thank you for the prompt reaction. I didn't get back to this until now,
> because I had other problems to deal with.
>
> Are there chances that it will get released this or next month? If not,
> I will probably have to resort to compiling on my own.

I am planning to get this into 3.7.6, which is to be released by the end of this month. I guess in 4-5 days :-). I will update you.

Pranith
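Ravi's earlier suggestion, checking the client or glustershd logs for self-heal completion messages, is a one-line grep. A sketch: the log path below is the usual default on such installs, and the sample lines are approximations for illustration, not real log output:

```shell
# On a live node:
#   grep 'Completed data selfheal' /var/log/glusterfs/glustershd.log
# Demonstrated here on approximated sample lines:
log=$(cat <<'EOF'
[2015-10-21 11:36:22] I 0-system_www1-replicate-0: Completed data selfheal on <gfid>
[2015-10-21 11:37:40] I 0-system_www1-replicate-0: performing entry selfheal on <gfid>
EOF
)
# Count only the completed *data* self-heals, the message Ravi refers to.
heals=$(printf '%s\n' "$log" | grep -c 'Completed data selfheal')
echo "completed data selfheals: $heals"
```

Matches around the freeze timestamps would confirm that a heal was holding the full lock while the client writes sat BLOCKED.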
Adrian Gruntkowski
2015-Nov-04 15:40 UTC
[Gluster-users] Copy operation freezes. Lots of locks in state BLOCKED (3-node setup with 1 arbiter)
Hello,

I have applied Pranith's patch myself on the current 3.7.5 release and rebuilt the packages. Unfortunately, the issue is still there :( It behaves exactly the same.

Regards,
Adrian

2015-10-28 12:02 GMT+01:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
> On 10/28/2015 04:27 PM, Adrian Gruntkowski wrote:
>> Are there chances that it will get released this or next month? If not,
>> I will probably have to resort to compiling on my own.
>
> I am planning to get this in for 3.7.6 which is to be released by end of
> this month. I guess in 4-5 days :-). I will update you
>
> Pranith
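For anyone reproducing this, applying the change under review (12426) on top of the 3.7.5 sources might look roughly like the following. This is a hedged sketch: the clone URL, the trailing patchset number in the Gerrit ref, and the Debian packaging step are assumptions, not stated in the thread:

```shell
git clone https://github.com/gluster/glusterfs.git
cd glusterfs
git checkout v3.7.5
# Gerrit change 12426; the patchset number ("/1") is a guess:
git fetch http://review.gluster.com/glusterfs refs/changes/26/12426/1
git cherry-pick FETCH_HEAD
# Rebuild Debian packages (assumes debian/ packaging and build-deps are present):
dpkg-buildpackage -us -uc
```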