Hi,
we are using SLES10 Patchlevel 3 with 12 nodes hosting Tomcat application servers.
The cluster had been running for some time (about 200 days) without problems.

Recently we needed to shut down the cluster for maintenance and experienced very long umount times for the filesystem. It took something like 45 minutes per node and filesystem (12 x 45 minutes shutdown time).
As a result the planned downtime had to be extended ;-) .

Is there any tuning option or the like to make those umounts faster, or is this something we have to live with?

Thanks for your help.
If you need more information let me know.

Marc.

Some info on the configuration:
---------------------------X8-----------------------------------
# /sbin/modinfo ocfs2
filename:     /lib/modules/2.6.16.60-0.54.5-smp/kernel/fs/ocfs2/ocfs2.ko
license:      GPL
author:       Oracle
version:      1.4.1-1-SLES
description:  OCFS2 1.4.1-1-SLES Wed Jul 23 18:33:42 UTC 2008 (build f922955d99ef972235bd0c1fc236c5ddbb368611)
srcversion:   986DD1EE4F5ABD8A44FF925
depends:      ocfs2_dlm,jbd,ocfs2_nodemanager
supported:    yes
vermagic:     2.6.16.60-0.54.5-smp SMP gcc-4.1
atix at CAS12:~> /sbin/modinfo ocfs2_dlm
filename:     /lib/modules/2.6.16.60-0.54.5-smp/kernel/fs/ocfs2/dlm/ocfs2_dlm.ko
license:      GPL
author:       Oracle
version:      1.4.1-1-SLES
description:  OCFS2 DLM 1.4.1-1-SLES Wed Jul 23 18:33:42 UTC 2008 (build f922955d99ef972235bd0c1fc236c5ddbb368611)
srcversion:   16FE87920EA41CA613E6609
depends:      ocfs2_nodemanager
supported:    yes
vermagic:     2.6.16.60-0.54.5-smp SMP gcc-4.1
parm:         dlm_purge_interval_ms:int
parm:         dlm_purge_locks_max:int
# rpm -qa ocfs2*
ocfs2-tools-1.4.0-0.9.9
ocfs2console-1.4.0-0.9.9
---------------------------X8-----------------------------------
The kernel version is 2.6.16.60-0.54.5-smp

______________________________________________________________________________

Marc Grimme

E-Mail: grimme at atix.de

ATIX Informationstechnologie und Consulting AG | Einsteinstrasse 10 |
85716 Unterschleissheim | www.atix.de
Enterprise Linux einfach online kaufen: www.linux-subscriptions.com
Registergericht: Amtsgericht München, Registernummer: HRB 168930,
USt.-Id.: DE209485962 | Vorstand: Thomas Merz (Vors.), Marc Grimme,
Mark Hlawatschek, Jan R. Bergrath | Vorsitzender des Aufsichtsrats: Dr. Martin Buss
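On the tuning question: the modinfo output above lists two o2dlm module parameters, dlm_purge_interval_ms and dlm_purge_locks_max, which by their names control how often and how many unused lock resources the DLM purges per pass. Below is a minimal sketch of how one might inspect and set them; the sysfs path, the modprobe file and the example values are assumptions, and whether changing them actually shortens umount is not established in this thread.

# Show the current values, if the kernel exports them via sysfs
# (this depends on the parameter permissions compiled into the module).
for p in dlm_purge_interval_ms dlm_purge_locks_max; do
    f=/sys/module/ocfs2_dlm/parameters/$p
    if [ -r "$f" ]; then
        echo "$p = $(cat "$f")"
    else
        echo "$p is not exported via sysfs"
    fi
done

# Hypothetical load-time settings (SLES10 typically reads
# /etc/modprobe.conf.local); the values below are examples only and
# take effect when the ocfs2_dlm module is next loaded:
#   options ocfs2_dlm dlm_purge_interval_ms=5000 dlm_purge_locks_max=16384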
Sunil Mushran
2011-Jul-06 16:37 UTC
[Ocfs2-users] Slow umounts on SLES10 patchlevel 3 ocfs2
umount is a two-step process. First the fs frees the inodes. Then the o2dlm
takes stock of all active resources and migrates the ones that are still in
use. This typically takes some time, but I have never heard of it taking
45 minutes. I guess it could, though, if one has a lot of resources. Let's
start by getting a count.

This will dump the number of cluster locks held by the fs.
# for vol in /sys/kernel/debug/ocfs2/*
do
  count=$(wc -l ${vol}/locking_state | cut -f1 -d' ');
  echo "$(basename ${vol}): ${count} locks" ;
done;

This will dump the number of lock resources known to the dlm.
# for vol in /sys/kernel/debug/o2dlm/*
do
  count=$(grep -c "^NAME:" ${vol}/locking_state);
  echo "$(basename ${vol}): ${count} resources" ;
done;

The debugfs needs to be mounted for this to work.
mount -t debugfs none /sys/kernel/debug

Sunil

On 07/06/2011 08:20 AM, Marc Grimme wrote:
> Hi,
> we are using SLES10 Patchlevel 3 with 12 nodes hosting Tomcat application servers.
> The cluster had been running for some time (about 200 days) without problems.
>
> Recently we needed to shut down the cluster for maintenance and experienced very long umount times for the filesystem. It took something like 45 minutes per node and filesystem (12 x 45 minutes shutdown time).
> As a result the planned downtime had to be extended ;-) .
>
> Is there any tuning option or the like to make those umounts faster, or is this something we have to live with?
>
> Thanks for your help.
> If you need more information let me know.
>
> Marc.
>
> [configuration details snipped]
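For convenience, the two loops above can be wrapped into one small script that also mounts debugfs first if it is not mounted yet; this is only a repackaging of the commands Sunil posted:

#!/bin/sh
# Count ocfs2 cluster locks and o2dlm lock resources per mounted volume.

# Mount debugfs if it is not mounted already.
grep -q ' /sys/kernel/debug ' /proc/mounts || \
    mount -t debugfs none /sys/kernel/debug

# Cluster locks held by the filesystem.
for vol in /sys/kernel/debug/ocfs2/*; do
    [ -r "$vol/locking_state" ] || continue
    echo "$(basename "$vol"): $(wc -l < "$vol/locking_state") locks"
done

# Lock resources known to the dlm.
for vol in /sys/kernel/debug/o2dlm/*; do
    [ -r "$vol/locking_state" ] || continue
    echo "$(basename "$vol"): $(grep -c '^NAME:' "$vol/locking_state") resources"
done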
So I now have two figures from two different clusters. Both are quite slow
during restarts, and both have two filesystems mounted.

Cluster 1 (the one that took very long last time):

Cluster locks held by filesystem:
1788AD39151A4E76997420D62A778E65: 274258 locks
1EFA64C36FD54AB48B734A99E7F45A73: 576842 locks

Cluster resources held by filesystem:
1788AD39151A4E76997420D62A778E65: 214545 resources
1EFA64C36FD54AB48B734A99E7F45A73: 469319 resources

Second cluster (also takes quite long):

Cluster locks held by filesystem:
1EDBCFF0CAB24D0CAE91CB2DA241E8CA: 717186 locks
585462C2FA5A428D913A3CBDBC77E116: 68 locks

Cluster resources held by filesystem:
1EDBCFF0CAB24D0CAE91CB2DA241E8CA: 587471 resources
585462C2FA5A428D913A3CBDBC77E116: 20 resources

Let me know if you need more information.

Thanks
Marc.

----- "Sunil Mushran" <sunil.mushran at oracle.com> wrote:
> It was designed to run in prod envs.
>
> On 07/07/2011 12:21 AM, Marc Grimme wrote:
> > Sunil,
> > can I query those figures during runtime of a productive cluster?
> > Or might it influence the availability or performance in any way?
> >
> > Thanks for your help.
> > Marc.
> > ----- "Sunil Mushran" <sunil.mushran at oracle.com> wrote:
> >
> >> umount is a two-step process. First the fs frees the inodes. Then the
> >> o2dlm takes stock of all active resources and migrates the ones that
> >> are still in use. This typically takes some time, but I have never
> >> heard of it taking 45 minutes.
> >>
> >> [diagnostic commands and quoted original message snipped]
--
______________________________________________________________________________

Marc Grimme
Tel: +49 89 4523538-14
Fax: +49 89 9901766-0
E-Mail: grimme at atix.de
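To put counts like the above next to each node before a planned shutdown, the same o2dlm count can be collected over ssh; a minimal sketch, with hypothetical node names and assuming key-based ssh access and debugfs already mounted on every node:

#!/bin/sh
# Hypothetical member list; replace with the real cluster node names.
NODES="cas01 cas02 cas03"

for node in $NODES; do
    echo "=== $node ==="
    ssh "$node" '
        for vol in /sys/kernel/debug/o2dlm/*; do
            [ -r "$vol/locking_state" ] || continue
            echo "$(basename "$vol"): $(grep -c "^NAME:" "$vol/locking_state") resources"
        done
    '
done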