Tiago Carmona
2011-Oct-17 14:11 UTC
[Gluster-users] Problems with running long jobs on a replicated volume.
First of all, hi guys. My name is Tiago Carmona and I'm a DevOps-to-be at Unicamp in Brazil. I started using GlusterFS not long ago, but I'm loving it. I would also like to say thanks for all the help I've gotten on IRC.

I'm having a problem running long jobs on a replicated volume. When I run a long job (like a chmod -R on my mount root), I get many "NFS stale handler" errors, and after some time my mount point goes down with a "Transport endpoint is not connected" error, so I need to umount and mount it again. I think my error is similar to the one at http://gluster.org/pipermail/gluster-users/2011-April/007192.html, from this list. Does anyone know what may be causing this?

I'm running glusterfs on two Gentoo machines. Version info below:

glusterfs 3.2.3 built on Sep 4 2011 10:12:37
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>

Many thanks for everything,
Tiago Carmona
Peter Linder
2011-Oct-17 15:04 UTC
[Gluster-users] Problems with running long jobs on a replicated volume.
Perhaps it is similar to the problem I have; see http://bugs.gluster.com/show_bug.cgi?id=3712

I will try, perhaps tonight, to leave my find command running and see if that eventually breaks the mount point.
Tiago Carmona
2011-Oct-25 12:02 UTC
[Gluster-users] Problems with running long jobs on a replicated volume.
I'm still having this problem. Does anyone have any thoughts about this error?

Thanks,
Tiago Carmona

On Mon, Oct 17, 2011 at 1:20 PM, Tiago Carmona <carmona.tiago at gmail.com> wrote:
> Peter,
>
> It seems, at least on my volume, that the find command doesn't break it, as I've successfully run a self-heal on it.
>
> But yeah, the problem seems to be related. Has anyone else had a problem like this?
>
> Thanks,
> Tiago Carmona
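
For anyone trying to reproduce the failure described in this thread, the following is a minimal sketch of a long-running traversal that records which errors appear and how often. It is not taken from any of the posters; the function name `walk_and_stat` and the mount path in the usage note are hypothetical. It assumes the two messages quoted above correspond to the standard errno values ESTALE (stale file handle) and ENOTCONN ("Transport endpoint is not connected"), and it only reads metadata, so unlike chmod -R it does not modify the volume.

```python
import errno
import os
from collections import Counter


def walk_and_stat(root):
    """lstat every entry under root and count failures by errno name.

    Returns (checked, errors): the number of entries that were stat'ed
    successfully, and a Counter keyed by errno name (e.g. "ESTALE").
    """
    errors = Counter()

    def record(exc):
        # Map the numeric errno to its symbolic name where one exists.
        errors[errno.errorcode.get(exc.errno, str(exc.errno))] += 1

    checked = 0
    # onerror receives the OSError raised when a directory cannot be listed.
    for dirpath, dirnames, filenames in os.walk(root, onerror=record):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))
                checked += 1
            except OSError as exc:
                record(exc)
    return checked, errors
```

Calling `walk_and_stat("/mnt/gluster")` (a hypothetical mount point) and inspecting the returned counter for "ESTALE" and "ENOTCONN" entries would show whether stale-handle errors precede the point at which the mount drops.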