On 03/17/2015 02:14 AM, Jonathan Heese wrote:
> Hello,
>
> So I resolved my previous issue with split-brains and the lack of
> self-healing by dropping my installed glusterfs* packages from 3.6.2
> to 3.5.3, but now I've picked up a new issue, which actually makes
> normal use of the volume practically impossible.
>
> A little background for those not already paying close attention:
> I have a 2 node 2 brick replicating volume whose purpose in life is to
> hold iSCSI target files, primarily to provide datastores to a
> VMware ESXi cluster. The plan is to put a handful of image files on
> the Gluster volume, mount them locally on both Gluster nodes, and run
> tgtd on both, pointed to the image files on the mounted gluster
> volume. Then the ESXi boxes will use multipath (active/passive) iSCSI
> to connect to the nodes, with automatic failover in case of planned or
> unplanned downtime of the Gluster nodes.
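(For reference, a rough sketch of that layout. The volume name "gvol0", the
brick paths, and the second node name "duchess" are placeholders, not taken
from the original message; the /mnt/gluster_disk mount point is inferred from
the mount log path quoted further down:)

    # on either node: create and start the 2-brick replicated volume
    gluster volume create gvol0 replica 2 \
        duke:/bricks/brick1/gvol0 duchess:/bricks/brick1/gvol0
    gluster volume start gvol0

    # on each node: FUSE-mount the volume locally so tgtd can reach the image files
    mount -t glusterfs localhost:/gvol0 /mnt/gluster_disk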
>
> In my most recent round of testing with 3.5.3, I'm seeing a massive
> failure to write data to the volume after about 5-10 minutes, so I've
> simplified the scenario a bit (to minimize the variables) to: both
> Gluster nodes up, only one node (duke) mounted and running tgtd, and
> just regular (single path) iSCSI from a single ESXi server.
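(In other words, something like the following on duke; the target IQN and
image file name are placeholders:)

    # /etc/tgt/targets.conf on duke, pointing at an image on the FUSE mount
    <target iqn.2015-03.net.example:datastore1>
        backing-store /mnt/gluster_disk/datastore1.img
    </target>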
>
> About 5-10 minutes into migrating a VM onto the test datastore,
> /var/log/messages on duke gets blasted with a ton of messages exactly
> like this:
>
> Mar 15 22:24:06 duke tgtd: bs_rdwr_request(180) io error 0x1781e00 2a
> -1 512 22971904, Input/output error
>
>
> And /var/log/glusterfs/mnt-gluster_disk.log gets blasted with a ton of
> messages exactly like this:
>
> [2015-03-16 02:24:07.572279] W [fuse-bridge.c:2242:fuse_writev_cbk]
> 0-glusterfs-fuse: 635299: WRITE => -1 (Input/output error)
>
>
Are there any messages in the mount log from AFR about split-brain just
before the above line appears?
Does `gluster v heal <VOLNAME> info` show any files? Performing I/O on
files that are in split-brain fails with EIO.
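For example, substituting your volume name:

    gluster volume heal <VOLNAME> info
    gluster volume heal <VOLNAME> info split-brain
    grep -iE 'split-brain|afr' /var/log/glusterfs/mnt-gluster_disk.log

If heal info lists any files, those are the likely source of the EIO errors.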
-Ravi
> And the write operation from VMware's side fails as soon as these
> messages start.
>
>
> I don't see any other errors (in the log files I know of) indicating
> the root cause of these i/o errors. I'm sure that this is not enough
> information to tell what's going on, but can anyone help me figure out
> what to look at next?
>
>
> I've also considered using Dan Lambright's libgfapi gluster module for
> tgtd (or something similar) to avoid going through FUSE, but I'm not
> sure whether that would be irrelevant to this problem, since I'm not
> 100% sure if it lies in FUSE or elsewhere.
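(For what it's worth, if tgtd is built with the glfs backing store, the
targets.conf change is roughly the following; the exact backing-store path
syntax here is from memory, so please verify it against the documentation
shipped with your tgt version:)

    <target iqn.2015-03.net.example:datastore1>
        bs-type glfs
        # format is roughly <volume>@<gluster server>:<image file>; check tgt's glfs docs
        backing-store gvol0@duke:datastore1.img
    </target>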
>
>
> Thanks!
>
>
> /Jon Heese/
> /Systems Engineer/
> *INetU Managed Hosting*
> P: 610.266.7441 x 261
> F: 610.266.7434
> www.inetu.net <https://www.inetu.net/>
>
> /** This message contains confidential information, which also may be
> privileged, and is intended only for the person(s) addressed above.
> Any unauthorized use, distribution, copying or disclosure of
> confidential and/or privileged information is strictly prohibited. If
> you have received this communication in error, please erase all copies
> of the message and its attachments and notify the sender immediately
> via reply e-mail. **/
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users