Pranith Kumar Karampuri
2016-Oct-05 14:02 UTC
[Gluster-users] Regression caused to gfapi applications with enabling client-io-threads by default
On Wed, Oct 5, 2016 at 2:00 PM, Soumya Koduri <skoduri at redhat.com> wrote:

> Hi,
>
> With http://review.gluster.org/#/c/15051/, performance/client-io-threads
> is enabled by default. But with that we see a regression in the
> nfs-ganesha application when it tries to un-/re-export any glusterfs
> volume. The same applies to any gfapi application that calls glfs_fini().
>
> More details and the RCA can be found at [1].
>
> In short, the iot-worker threads spawned when the above option is enabled
> are not cleaned up as part of io-threads-xlator->fini(), and those
> threads can end up accessing invalid/freed memory after glfs_fini().
>
> The actual fix is for io-threads-xlator->fini() to clean up those threads
> before exiting. But since those threads' IDs are currently not stored,
> the fix could be quite intricate and take a while. So until then, to
> avoid crashing all existing applications, I suggest keeping this option
> disabled by default and documenting the implications of enabling it as a
> known issue in the release notes.
>
> I sent a patch to revert the commit - http://review.gluster.org/#/c/15616/ [2]

Good catch! I think the correct fix would be to make sure all threads die
as part of PARENT_DOWN then?

> Comments/Suggestions are welcome.
>
> Thanks,
> Soumya
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1380619#c11
> [2] http://review.gluster.org/#/c/15616/

--
Pranith
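For reference, the failing sequence in a gfapi consumer looks roughly like
the sketch below. It is illustrative only: "testvol" and "server1" are
placeholders, and performance/client-io-threads is an option set on the
volume, not in the application.

    /* Sketch: any gfapi consumer that calls glfs_fini() is affected when
     * client-io-threads is enabled on the volume. */
    #include <glusterfs/api/glfs.h>

    int
    main (void)
    {
        glfs_t *fs = glfs_new ("testvol");            /* placeholder volume */
        if (!fs)
            return 1;

        glfs_set_volfile_server (fs, "tcp", "server1", 24007);
        if (glfs_init (fs) != 0)
            return 1;

        /* ... I/O on the volume ... */

        /* glfs_fini() tears the graph down, but the iot-worker threads
         * spawned by io-threads are not joined in its fini(); if one of
         * them runs after this point it can touch freed memory and crash
         * the process (this is what nfs-ganesha hits on un-/re-export). */
        return glfs_fini (fs);
    }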
Soumya Koduri
2016-Oct-06 06:06 UTC
[Gluster-users] Regression caused to gfapi applications with enabling client-io-threads by default
On 10/05/2016 07:32 PM, Pranith Kumar Karampuri wrote:
>
> On Wed, Oct 5, 2016 at 2:00 PM, Soumya Koduri <skoduri at redhat.com> wrote:
>
>     Hi,
>
>     With http://review.gluster.org/#/c/15051/, performance/client-io-threads
>     is enabled by default. But with that we see a regression in the
>     nfs-ganesha application when it tries to un-/re-export any glusterfs
>     volume. The same applies to any gfapi application that calls
>     glfs_fini().
>
>     More details and the RCA can be found at [1].
>
>     In short, the iot-worker threads spawned when the above option is
>     enabled are not cleaned up as part of io-threads-xlator->fini(), and
>     those threads can end up accessing invalid/freed memory after
>     glfs_fini().
>
>     The actual fix is for io-threads-xlator->fini() to clean up those
>     threads before exiting. But since those threads' IDs are currently
>     not stored, the fix could be quite intricate and take a while. So
>     until then, to avoid crashing all existing applications, I suggest
>     keeping this option disabled by default and documenting the
>     implications of enabling it as a known issue in the release notes.
>
>     I sent a patch to revert the commit -
>     http://review.gluster.org/#/c/15616/ [2]
>
> Good catch! I think the correct fix would be to make sure all threads
> die as part of PARENT_DOWN then?

From my understanding, these threads should be cleaned up as part of
xlator->fini(). I am not sure whether it needs to be handled for PARENT_DOWN
as well - do we re-spawn the threads as part of PARENT_UP then?

Till that part gets fixed, can we turn this option back off by default to
avoid the regressions on the master and release-3.9 branches?

Thanks,
Soumya

>     Comments/Suggestions are welcome.
>
>     Thanks,
>     Soumya
>
>     [1] https://bugzilla.redhat.com/show_bug.cgi?id=1380619#c11
>     [2] http://review.gluster.org/#/c/15616/
>
> --
> Pranith
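To make the fini()-versus-PARENT_DOWN discussion concrete, the bookkeeping
that is currently missing looks roughly like the sketch below. It is not the
actual io-threads implementation: worker_pool_t and its helpers are made-up
names, and the spawn-time setup of the IDs, mutex and condition variable is
not shown.

    /* Illustrative sketch only. The xlator can only join workers whose
     * pthread IDs were recorded when they were spawned. */
    #include <pthread.h>
    #include <stdlib.h>

    typedef struct {
        pthread_t      *tids;       /* IDs recorded at spawn time */
        size_t          count;
        int             stopping;   /* checked by the workers */
        pthread_mutex_t lock;
        pthread_cond_t  cond;       /* workers sleep here waiting for work */
    } worker_pool_t;

    static void
    pool_stop (worker_pool_t *pool)
    {
        size_t i;

        /* Ask the workers to exit and wake them all up. */
        pthread_mutex_lock (&pool->lock);
        pool->stopping = 1;
        pthread_cond_broadcast (&pool->cond);
        pthread_mutex_unlock (&pool->lock);

        /* Join every recorded thread so none of them can touch the
         * xlator's private data after it has been freed. */
        for (i = 0; i < pool->count; i++)
            pthread_join (pool->tids[i], NULL);

        free (pool->tids);
        pool->tids  = NULL;
        pool->count = 0;
    }

Whether such a stop routine is driven only from the xlator's fini(), or also
on PARENT_DOWN with a matching re-spawn on PARENT_UP, is exactly the open
question in this thread; either way, joining the workers requires storing
their thread IDs, which is the part that is missing today.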