Hi!
I just gave GlusterFS a try and experienced two problems. First some background:
- I want to set up a file server with synchronous replication between
branch offices, similar to Windows DFS-Replication. The goal is _not_
high-availability or cluster-scaleout, but just having all files locally
available at each branch office.
- To test GlusterFS, I installed two virtual machines in different
locations, Ubuntu 12.04, with the GlusterFS 3.3 packages from the PPA.
- Both machines shall be server and client, and export the GlusterFS
volume via Samba.
- I set up a file system in replica mode according to the quick start
guide (except that I used ext4 instead of xfs for the brick, because I have
had bad experiences with xfs).
- I mounted the filesystem on both machines as localhost:/gv0, and shared
the mount via Samba.
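
For illustration, the quick start steps described above would look roughly
like the following sketch. The host names `server1` and `server2`, the brick
path `/export/brick1` and the mount point `/mnt/gv0` are placeholders, not the
actual values from this setup. The script only prints the commands and does not
run them.

```python
import shlex


def quickstart_commands(local, peer, volume="gv0",
                        brick="/export/brick1", mountpoint="/mnt/gv0"):
    """Return the shell commands for a two-node replica volume, in order.

    All default values are placeholders chosen for this example.
    """
    return [
        # Run once on `local` to add the other node to the trusted pool.
        ["gluster", "peer", "probe", peer],
        # One brick per host; "replica 2" keeps a full copy on each of them.
        ["gluster", "volume", "create", volume, "replica", "2",
         f"{local}:{brick}", f"{peer}:{brick}"],
        ["gluster", "volume", "start", volume],
        # Run on each node: mount the volume through the local server.
        ["mount", "-t", "glusterfs", f"localhost:/{volume}", mountpoint],
    ]


if __name__ == "__main__":
    for cmd in quickstart_commands("server1", "server2"):
        print(shlex.join(cmd))
```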
At first it seemed to work fine (copying files from/to the share, files appeared
instantly on the other host), until I did some robustness tests:
I severed the connection between the two hosts to provoke a split-brain
scenario, just to see what happens. I expected both hosts to work, but on one of
them the GlusterFS volume froze. After restarting the glusterfs-server service,
it came back.
Then I intentionally created a conflicting file on each host.
After reconnecting the hosts, I got "Input/Output error" on both the
conflicting file and the volume root inode. I found this guide:
http://blog.oneiroi.co.uk/linux/gluster-resolving-a-split-brain-in-a-replicated-setup/
It fixed the problem for me, but having to manually repair the filesystem
whenever a branch office link goes down does not feel very trustworthy. Is there
some auto-conflict-resolving feature (last one wins, or renaming conflicting
files)?
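
To make clear what "last one wins" means here, a minimal sketch of that policy
follows. It is not a GlusterFS feature or API, just an illustration of the
behaviour being asked about. The function name and the two paths are
hypothetical, and it assumes the clocks on both hosts are reasonably in sync.

```python
import os


def last_writer_wins(copy_a, copy_b):
    """Return (winner, loser), picking the copy with the newer mtime.

    On a tie, copy_a wins. This only compares timestamps; it does not
    look at file contents and does not touch either file.
    """
    mtime_a = os.stat(copy_a).st_mtime_ns
    mtime_b = os.stat(copy_b).st_mtime_ns
    if mtime_b > mtime_a:
        return copy_b, copy_a
    return copy_a, copy_b
```

The "renaming" variant would keep the loser under a different name instead of
discarding it, which is safer when clocks cannot be trusted.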
Then I took a look at the performance, and copied an ISO image (~700 MB) to the
filesystem. That worked fine, until I ran md5sum on it from both hosts. While
one node took a few seconds (as I expected), the other one took several
minutes. Then I found out that it read the file over the WAN link from the
distant host instead of from its local copy. It should have had enough time (one
hour) to replicate the file across both hosts...
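
The read-time comparison can be reproduced with something like the sketch
below, run on each host against the same file on the mounted volume. The
function name and chunk size are arbitrary choices for this example. Note that
a second run on the same host may be served from cache and look faster.

```python
import hashlib
import time


def timed_md5(path, chunk_size=1024 * 1024):
    """Return (hex digest, elapsed seconds) for reading and hashing `path`."""
    digest = hashlib.md5()
    start = time.monotonic()
    with open(path, "rb") as handle:
        # Read in chunks so a 700 MB image is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest(), time.monotonic() - start
```

Matching digests with very different elapsed times on the two hosts would
point to the same symptom: identical data, but one side reading it remotely.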
By the way, I also wanted to try geo-replication (which might suffice for my
needs with a tight-enough schedule), but I was not able to create a volume with
only one brick.
So I wonder: What did I do wrong?
Thanks
Martin