Vincent Thomasset
2011-Apr-19 15:33 UTC
[Gluster-users] read consistency amongst bricks / sync writes
Hi,

This is a newbie question, as I am relatively new to glusterfs, but nonetheless... :)

I believe file propagation between the bricks is asynchronous, and I was wondering whether a write to a file and a subsequent read of that same file could happen on different bricks, so that the following scenario would occur:

- write on brick a
- read on brick b (file not found)

Or, put differently, would it be possible to force sync writes across all bricks (with the obvious performance hit) to avoid that scenario?

This is using glusterfs 3.1.1.

Thank you,
Vincent
Burnash, James
2011-Apr-19 15:40 UTC
[Gluster-users] [SPAM?] read consistency amongst bricks / sync writes
Hi Vincent.

I believe that if your volumes are set up as mirrored (replicated or distributed-replicated), the file will exist on as many bricks as the number of copies (replica) requested when the volume was created. E.g.

    volume create NEW-VOLNAME replica 2 NEW-BRICK ...

In the case of a distributed-replicated volume, if the file is not found on server b, a call is made to server a to get it, and the file is passed to the gluster client without the client having any knowledge of where it was actually located.

Clear as mud? :-)

James Burnash, Unix Engineering
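To make the "replica 2" idea concrete, here is a toy Python model of how a file can end up on exactly as many bricks as the replica count. It is only a sketch under stated assumptions: the brick names are hypothetical, the function names are invented for illustration, and the hash is a stand-in, not the algorithm gluster actually uses to place files. The one property it borrows from gluster is that bricks are grouped into replica sets in the order they are listed.

```python
import hashlib


def build_replica_sets(bricks, replica):
    """Group bricks into replica sets, in the order the bricks were listed."""
    if replica < 1 or len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]


def bricks_for(path, replica_sets):
    """Pick one replica set for a path; every brick in that set holds a copy."""
    digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
    # Toy placement hash for illustration only, NOT gluster's real one.
    index = int(digest, 16) % len(replica_sets)
    return replica_sets[index]


# Hypothetical brick names, four bricks with two copies of every file.
bricks = ["server-a:/export", "server-b:/export",
          "server-c:/export", "server-d:/export"]
replica_sets = build_replica_sets(bricks, replica=2)
holders = bricks_for("/docs/report.txt", replica_sets)
print(replica_sets)
print(holders)
```

In this model a client never needs to know which pair holds the file; it only asks for the path, which matches the location transparency described above.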
Whit Blauvelt
2011-Apr-19 15:46 UTC
[Gluster-users] read consistency amongst bricks / sync writes
On Tue, Apr 19, 2011 at 05:33:58PM +0200, Vincent Thomasset wrote:
> This is a newbie question as i am relatively new to glusterfs but
> nonetheless... :)
>
> I believe the file propagation is asynchronous between the bricks and was
> wondering whether a subsequent write to a file and read of that same file could
> possibly happen on different bricks, so that the following scenario
> would happen:
>
> - write on brick a
> - read on brick b (file not found)

I'm new too, but we've been discussing this recently here. My understanding is that files are written synchronously as long as you address your gluster file system through a client (either gluster or nfs), and not directly through the local mount of the underlying ext3/ext4 file system.

The exception would be when one of the systems is inaccessible. In that case it will take "healing" later (that is, running through the file system with "find" or the like through a gluster client) to bring the system that was unavailable up to date.

Hopefully if I don't quite have this right, the more knowledgeable will step in.

Whit
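For anyone who would rather script the "find or the like" pass, here is a minimal Python sketch of the same idea: walk a directory tree and stat every entry. It assumes, as described above, that looking up each file through a gluster client mount is what prompts the healing; the function name is made up for illustration, and the mount point in the comment is hypothetical.

```python
import os


def stat_everything(mount_point):
    """Walk mount_point, lstat each entry, and return how many were touched."""
    touched = 0
    for dirpath, dirnames, filenames in os.walk(mount_point):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))
                touched += 1
            except OSError:
                # Entry vanished or is unreadable; keep walking.
                pass
    return touched


# Hypothetical usage, pointing at a client mount rather than a brick directory:
#     stat_everything("/mnt/glustervol")
```

Run against a brick's backing directory instead of a client mount, this would only touch the local file system, so under the assumption above it would not heal anything.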
Whit Blauvelt
2011-Apr-19 16:21 UTC
[Gluster-users] read consistency amongst bricks / sync writes
On Tue, Apr 19, 2011 at 06:00:28PM +0200, Vincent Thomasset wrote:
> My idea is currently to run it on the same machines in the cluster that
> actually need storage, so that each machine would indeed use a local
> mount, so that i could reuse the existing, and large, storage space
> available without hassle (one of the things that make glusterfs pretty
> cool to deploy on existing clusters IMO).
>
> But i don't see how this could be a problem, isn't the glusterfs mount the
> so called native client method or did i get the doc wrong ?

You can have the local mount be directly as ext3/4, in which case gluster doesn't replicate simultaneously when you write to it. Or you can also have the local mount be "-t glusterfs", in which case it does. At least in my testing.

So I have a machine that's half a gluster mirror with a directory at /mnt/somedir that's formatted ext4 and given to gluster. Then gluster is exporting the filesystem as /somedir (because I reused the name), and it can be locally mounted with:

    mount -t glusterfs localhost:/somedir /mnt/tmp

Now if I look at my filesystems with df I get

    /dev/mapper/volname-somedir  309637120  118580172  175328308  41%  /mnt/somedir
    localhost:/somedir           309637120  118580224  175328256  41%  /mnt/tmp

(That first is because gluster's using a filesystem on an lvm in my case.)

Both /mnt/somedir and /mnt/tmp have the same files in them. But anything I write to /mnt/tmp/ gets instantly replicated to the mirror, because it's going through the gluster client handling the mount. Anything I write to /mnt/somedir does not get instantly replicated, because it's mounted directly, without gluster being aware of what I'm doing until something triggers gluster later to look at that particular file.

Best,
Whit
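A rough way to repeat this kind of test is a small read-after-write probe: create a file under one path and check straight away whether it shows up under another. The Python below is only a sketch; the function name is invented for illustration, and it knows nothing about gluster itself. It just compares two directories that are both visible on the machine where it runs.

```python
import os
import uuid


def visible_after_write(write_dir, read_dir):
    """Create a uniquely named file in write_dir and report whether it is
    immediately visible under read_dir. The probe file is removed afterwards."""
    name = "consistency-probe-" + uuid.uuid4().hex
    written = os.path.join(write_dir, name)
    with open(written, "w") as handle:
        handle.write("probe\n")
        handle.flush()
        os.fsync(handle.fileno())  # push the data out before checking
    try:
        return os.path.exists(os.path.join(read_dir, name))
    finally:
        os.remove(written)


# Usage with the paths from the message above (write via the gluster client
# mount, then look in the directory backing the local brick):
#     visible_after_write("/mnt/tmp", "/mnt/somedir")
```

To see whether the mirror has the file, the same probe would have to read from a path on the other server, so treat a local result as a sanity check rather than proof of replication.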