Hi,

Consider the typical split brain situation: reading from a file gets
EIO, and the logs say:

[2011-08-21 13:38:54.607590] W [afr-open.c:168:afr_open]
0-gfs-replicate-0: failed to open as split brain seen, returning EIO
[2011-08-21 13:38:54.607895] W [fuse-bridge.c:585:fuse_fd_cbk]
0-glusterfs-fuse: 1371456: OPEN()
/manu/netbsd/usr/src/gnu/dist/groff/doc/Makefile.sub => -1
(Input/output error)

On the backend I have two versions of the file, one with size 0, the
other with a decent size. I removed the one with size zero and ran
ls -l on the client to trigger self heal: the log does not say it
started, and I still get EIO when accessing the file. Obviously that
answer is cached somewhere.

Is it possible to handle that situation without unmounting and
remounting the volume, that is, without restarting the glusterfs
client?

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu at netbsd.org
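One way to confirm what AFR has recorded about the two copies is to
inspect the trusted.* xattrs directly on the bricks. A minimal sketch,
assuming brick paths of /export/brick{1,2} and replica children named
gfs-client-0 / gfs-client-1 (guessed from the 0-gfs-replicate-0 log
prefix; substitute your actual layout):

  # Run on the servers, against the backend copies, not the mount point.
  # The /export/brick{1,2} paths below are examples only.
  getfattr -d -m . -e hex \
      /export/brick1/manu/netbsd/usr/src/gnu/dist/groff/doc/Makefile.sub
  getfattr -d -m . -e hex \
      /export/brick2/manu/netbsd/usr/src/gnu/dist/groff/doc/Makefile.sub

If each copy carries non-zero trusted.afr.gfs-client-* pending counters
accusing the other brick, AFR sees a split brain; and if the
trusted.gfid values differ, the two bricks do not even agree on the
file's identity.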
What version of glusterfs? I've seen this in 3.1.2 and 3.1.3.

On Aug 21, 2011, at 6:53 AM, Emmanuel Dreyfus wrote:
> Consider the typical split brain situation: reading from file gets
> EIO [...] Is it possible to handle that situation without unmounting
> and remounting the volume, that is, without restarting glusterfs
> client?

Luis E. Cerezo
http://www.luiscerezo.org
http://twitter.com/luiscerezo
http://flickr.com/photos/luiscerezo
photos for sale: http://photos.luiscerezo.org
Voice: 412 223 7396
Hi Emmanuel,

With the master branch code, this test case works fine for me. The
information that the file went into a split brain situation is stored
in the inode. You can check the code of the function:

  afr_set_split_brain (xlator_t *this, inode_t *inode, gf_boolean_t set)

pranith @ /mnt/client2
13:06:00 :) $ ls abc
ls: cannot access abc: Input/output error

pranith @ /mnt/client2
13:06:35 :( $ sudo getfattr -d -m . /tmp/{1,2}/abc
getfattr: Removing leading '/' from absolute path names
# file: tmp/1/abc
trusted.gfid=0s1aS1Nib4QrmwS0gnCdIP4g=

# file: tmp/2/abc
trusted.gfid=0sq3um4Gh2TN+zkO9ULMpgkw=

pranith @ /mnt/client2
13:06:53 :) $ sudo rm -f /tmp/1/abc

pranith @ /mnt/client2
13:07:03 :) $ ls abc
abc

pranith @ /mnt/client2
13:07:08 :) $ sudo getfattr -d -m . /tmp/{1,2}/abc
getfattr: Removing leading '/' from absolute path names
# file: tmp/1/abc
trusted.afr.vol-client-0=0sAAAAAAAAAAAAAAAA
trusted.afr.vol-client-1=0sAAAAAAAAAAAAAAAA
trusted.gfid=0sq3um4Gh2TN+zkO9ULMpgkw=

# file: tmp/2/abc
trusted.afr.vol-client-0=0sAAAAAAAAAAAAAAAA
trusted.afr.vol-client-1=0sAAAAAP////8AAAAA
trusted.gfid=0sq3um4Gh2TN+zkO9ULMpgkw=

Pranith

On 08/21/2011 05:23 PM, Emmanuel Dreyfus wrote:
> Consider the typical split brain situation: reading from file gets
> EIO [...] Is it possible to handle that situation without unmounting
> and remounting the volume, that is, without restarting glusterfs
> client?
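For anyone decoding that transcript: getfattr prints binary xattr
values base64-encoded with a 0s prefix, so they can be inspected with
standard tools. A small sketch, assuming GNU base64 and od, and
assuming the usual AFR changelog layout of three big-endian 32-bit
pending counters (data, metadata, entry):

  $ echo 'AAAAAP////8AAAAA' | base64 -d | od -An -tx1
   00 00 00 00 ff ff ff ff 00 00 00 00

If I read the layout right, the non-zero middle word is pending state
recorded against vol-client-1 on the surviving copy at that instant.
The trusted.gfid values decode the same way to 16-byte UUIDs: in the
first getfattr run the two bricks report different gfids for abc,
which is why this is a gfid split brain and the EIO persists until one
copy is removed.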