Anand Avati
2010-Feb-08 03:57 UTC
[Gluster-users] [Gluster-devel] Another Data Corruption Report
Gordan, and others using a config like Bug 542
(http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=542):

This corruption issue can show up (but not always) when you have loaded both
io-cache and write-behind below replicate (syntactically before replicate in
the volfile) and a self-heal of a file bigger than 131072 bytes happens.
Gordan, we believe this is why your corruption observations are strongly
correlated with server reconnections.

Please use write-behind and io-cache on top of replicate (the "normal" way,
the way glusterfs-volgen would generate), and you will not face this problem.
I believe the reason for using io-cache and write-behind below replicate is to
improve self-heal performance. For that we suggest using the 3.0.x release,
which has background self-healing and diff-based self-healing.

Please read http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=542#c4
for details about the internals of the corruption. In summary, loading the
performance translators (specifically io-cache) in the normal location will
give you a quick remedy for the bug.

Thanks,
Avati

On Sat, Jan 16, 2010 at 12:00 AM, Gordan Bobic <gordan at bobich.net> wrote:
> I've just observed another case of corruption similar to what I reported a
> while back with .viminfo files getting corrupted (in that case, by somehow
> being clobbered by a shared library fragment from what I could tell).
>
> This time, however, it was much more sinister, although probably the same
> failure mode. I have just seen a file in a CVS repository residing on
> GlusterFS be replaced - by another file in the same directory in the same
> CVS repository! One of my header files somehow got replaced by the Makefile
> in the same directory. Reviewing "cvs log" indicates that the entire file on
> the CVS side was clobbered by the Makefile - there is no indication (e.g.
> from cvs log) that it was accidentally copied over and committed in.
>
> I'm sure I don't have to stress just how mind bogglingly dangerous (as in
> data corruption/loss dangerous) this is.
>
> Observed with 2.0.9, the volume is AFR.
>
> Not sure if it is in any way relevant to this particular bug report, but
> whenever I do cvs update on the glfs-backed repository, I get this sort of
> thing in the glfs log:
>
> [2010-01-15 18:05:11] E [posix.c:3156:do_xattrop] home-store: getxattr
> failed on /cvs/Project/C/#cvs.lock while doing xattrop: No such file or
> directory
>
> Gordan
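
For reference, the ordering recommended in the reply above (write-behind and
io-cache on top of replicate) might look like the client volfile fragment
below. This is a minimal sketch, not taken from the thread: the host names
(server1, server2), the volume names, and the remote subvolume name "brick"
are assumptions, and all tuning options are omitted.

```
# Hypothetical two-server setup; host and volume names are placeholders.
volume client1
  type protocol/client
  option transport-type tcp
  option remote-host server1
  option remote-subvolume brick
end-volume

volume client2
  type protocol/client
  option transport-type tcp
  option remote-host server2
  option remote-subvolume brick
end-volume

# replicate sits directly on the protocol/client volumes.
volume replicate
  type cluster/replicate
  subvolumes client1 client2
end-volume

# Performance translators are defined after replicate in the file,
# so they are loaded on top of it.
volume writebehind
  type performance/write-behind
  subvolumes replicate
end-volume

volume iocache
  type performance/io-cache
  subvolumes writebehind
end-volume
```

The layout described as triggering the bug would instead define io-cache and
write-behind volumes over each protocol/client volume and list those as the
subvolumes of replicate.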