I have two servers, both running 4.10 built within a few days of each
other (Aug 5 for venus, Aug 7 for neptune) ... both running jail
environments, one with ~60 jails running, the other with ~80 ... the one
with 60 has been up for ~25 days now, and is on the verge of running out
of vnodes:

Aug 31 20:58:00 venus root: debug.numvnodes: 519920 - debug.freevnodes: 11058 - debug.vnlru_nowhere: 256463 - vlrup
Aug 31 20:59:01 venus root: debug.numvnodes: 519920 - debug.freevnodes: 13155 - debug.vnlru_nowhere: 256482 - vlrup
Aug 31 21:00:03 venus root: debug.numvnodes: 519920 - debug.freevnodes: 13092 - debug.vnlru_nowhere: 256482 - vlruwt

while the other one has been up for only ~1 day, but is using a lot
fewer vnodes, for more processes:

Aug 31 20:58:00 neptune root: debug.numvnodes: 344062 - debug.freevnodes: 208655 - debug.vnlru_nowhere: 0 - vlruwt
Aug 31 20:59:00 neptune root: debug.numvnodes: 344062 - debug.freevnodes: 208602 - debug.vnlru_nowhere: 0 - vlruwt
Aug 31 21:00:03 neptune root: debug.numvnodes: 344062 - debug.freevnodes: 208319 - debug.vnlru_nowhere: 0 - vlruwt

I've tried shutting down all of the VMs on venus, and umount'd all of
the unionfs mounts, as well as the one nfs mount we have ... the above
numbers are from after the VMs (and mounts) were recreated ...

Now, my understanding of vnodes is that for every file opened, a vnode
is created ... in my case, since I'm using unionfs, there are two vnodes
per file ... is it possible that there are 'stale' vnodes that aren't
being freed up?  Is there some way of 'viewing' the vnode structure?

For instance, fstat shows:

venus# fstat | wc -l
   19531

So, obviously it isn't just open files that I'm dealing with here, for
even if I double that, it is nowhere near 519920 ...

So, where else are the vnodes going?  Is there a 'leak'?  What can I
look at to try and narrow this down / provide more information?

Even some way of determining a specific process that is sucking back a
lot of them, so I can move that to a different machine ... ?

Help?

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
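[Log lines in the format above are easy to produce from cron.  A minimal
sketch of such a logger follows: the three sysctl names are the real
4.x ones quoted above, but the script itself, and the use of ps to read
the vnlru kernel thread's wait channel for the trailing vlruwt/vlrup
token, are assumptions rather than the script actually running on venus.]

    #!/bin/sh
    # Hypothetical reconstruction of the vnode logger (a sketch, not
    # the original).  debug.numvnodes, debug.freevnodes and
    # debug.vnlru_nowhere are real FreeBSD 4.x sysctls; the last log
    # field is assumed to be the vnlru kernel thread's wait channel
    # (vlruwt = waiting for work, vlrup = actively reclaiming).
    num=$(sysctl -n debug.numvnodes)
    free=$(sysctl -n debug.freevnodes)
    nowhere=$(sysctl -n debug.vnlru_nowhere)
    wchan=$(ps -axo wchan,comm | awk '$2 ~ /vnlru/ { print $1; exit }')
    logger "debug.numvnodes: $num - debug.freevnodes: $free - debug.vnlru_nowhere: $nowhere - $wchan"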
As a follow up, looking at vmstat -m ... specifically the work that
David did on separating the union vs regular vnodes:

         Type  InUse  MemUse HighUse   Limit  Requests Limit Limit Size(s)
  UNION mount     60      2K      3K 204800K       162     0     0 32
       undcac      0      0K      1K 204800K 343638713     0     0 16
       unpath  13146    227K   1025K 204800K  43541149     0     0 16,32,64,128
  Export Host      1      1K      1K 204800K       164     0     0 256
       vnodes    141      7K      8K 204800K       613     0     0 16,32,64,128,256

Why does 'vnodes' show only 141 InUse?  Or, in this case, should I be
looking at:

     FFS node 496600 124150K 127870K 204800K 401059293     0     0 256

496k FFS nodes, if I'm reading right?  vs neptune, which is showing only:

     FFS node 300433  75109K  80257K 204800K   3875307     0     0 256

On Tue, 31 Aug 2004, Marc G. Fournier wrote:

> [original message quoted in full; see above]

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
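[To see whether the FFS node pool tracks debug.numvnodes over time,
rather than comparing single snapshots from two machines, a loop like
the following can be left running.  This is a sketch only; the awk
field position assumes the 4.x vmstat -m column order shown above
(Type, InUse, MemUse, ...).]

    #!/bin/sh
    # Sample the "FFS node" InUse count from vmstat -m alongside
    # debug.numvnodes once a minute.  If both grow together while the
    # fstat count stays flat, the growth is in cached or unreclaimed
    # vnodes rather than open files.
    while :; do
        ffs=$(vmstat -m | awk '/FFS node/ { print $3 }')
        num=$(sysctl -n debug.numvnodes)
        echo "$(date) FFS nodes: $ffs numvnodes: $num"
        sleep 60
    done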
On Tue, Aug 31, 2004 at 09:21:09PM -0300, Marc G. Fournier wrote:
> I have two servers, both running 4.10 built within a few days of each
> other (Aug 5 for venus, Aug 7 for neptune) ... both running jail
> environments, one with ~60 jails running, the other with ~80 ... the
> one with 60 has been up for ~25 days now, and is on the verge of
> running out of vnodes:
>
> Aug 31 20:58:00 venus root: debug.numvnodes: 519920 - debug.freevnodes: 11058 - debug.vnlru_nowhere: 256463 - vlrup
> Aug 31 20:59:01 venus root: debug.numvnodes: 519920 - debug.freevnodes: 13155 - debug.vnlru_nowhere: 256482 - vlrup
> Aug 31 21:00:03 venus root: debug.numvnodes: 519920 - debug.freevnodes: 13092 - debug.vnlru_nowhere: 256482 - vlruwt
>
> [..]
>
> Now, my understanding of vnodes is that for every file opened, a vnode
> is created ... in my case, since I'm using unionfs, there are two
> vnodes per file ... is it possible that there are 'stale' vnodes that
> aren't being freed up?  Is there some way of 'viewing' the vnode
> structure?
>
> For instance, fstat shows:
>
> venus# fstat | wc -l
>    19531

You can also try 'pstat -f | more' from the user side.

> So, obviously it isn't just open files that I'm dealing with here, for
> even if I double that, it is nowhere near 519920 ...

You might want to set up for remote kernel debugging and peek around
the system / further examine vnode structures.  (If you have physical
access to two machines you can set up a null modem cable.)

> So, where else are the vnodes going?  Is there a 'leak'?  What can I
> look at to try and narrow this down / provide more information?

If the use count isn't decremented (to zero), vnodes won't be placed on
the freelist.  Perhaps something isn't calling vrele() where it should
in unionfs?  You should check the reference counts (v_usecount and
v_holdcnt) on some of the suspect vnodes.

Any specific things you might suspect as a possible cause?  Any
messages preceding the ones you listed above?

If you can escape to the debugger, some things to try are:

  show page
  show lockedvn

You could do a dump for later examination if you are forced to reboot
the machine (after trying to unmount).

> Even some way of determining a specific process that is sucking back a
> lot of them, so I can move that to a different machine ... ?

While this only works for open file entries, you can get a top 10 by
using:

  fstat | perl -ane '
      $sum{$F[1]}++;                       # tally entries by command name
      END { print "$_: $sum{$_}\n"
            for sort { $sum{$b} <=> $sum{$a} } keys %sum }
  ' | head -10

--
Allan Fields, AFRSL - http://afields.ca
 2D4F 6806 D307 0889 6125 C31D F745 0D72 39B4 5541
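[The per-command tally above can also be grouped by fstat's MOUNT
column, assumed here to be field 5 of the default output, to see
whether a single unionfs mount accounts for most of the open-file
entries.  A sketch:]

    # Top 10 mount points by number of open-file entries; the awk
    # field position is an assumption about fstat's column layout.
    fstat -m | awk 'NR > 1 { count[$5]++ }
        END { for (m in count) printf "%7d %s\n", count[m], m }' |
        sort -rn | head -10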
On Tue, Aug 31, 2004, Marc G. Fournier wrote:
> I have two servers, both running 4.10 built within a few days of each
> other (Aug 5 for venus, Aug 7 for neptune) ... both running jail
> environments, one with ~60 jails running, the other with ~80 ... the
> one with 60 has been up for ~25 days now, and is on the verge of
> running out of vnodes:
>
> Aug 31 20:58:00 venus root: debug.numvnodes: 519920 - debug.freevnodes: 11058 - debug.vnlru_nowhere: 256463 - vlrup
> [..]
> Aug 31 21:00:03 neptune root: debug.numvnodes: 344062 - debug.freevnodes: 208319 - debug.vnlru_nowhere: 0 - vlruwt
>
> [..]
>
> So, where else are the vnodes going?  Is there a 'leak'?  What can I
> look at to try and narrow this down / provide more information?

First of all, use 'fstat -m' to ensure that you're counting all open
files.  Second, I believe it is normal for the number of files that
fstat reports to be lower than the number of vnodes actually allocated,
since unreferenced vnodes (which don't show up in fstat) are cached.

It's a bit worrisome that debug.vnlru_nowhere is large on one machine
and 0 on the other.  That number says that the system tried to reclaim
some unreferenced vnodes, but didn't find any.  I don't know whether
this indicates a leak, or merely a vnode-intensive process.

> Even some way of determining a specific process that is sucking back a
> lot of them, so I can move that to a different machine ... ?

'fstat -mu' or 'fstat -mp' might be helpful in tracking down which user
or process is eating up your vnodes.
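[Building on the fstat -mp suggestion, a ranked per-process tally is a
small variation on the earlier perl one-liner; this is a sketch, with
CMD and PID assumed to be fields 2 and 3 of fstat's output.]

    # Top 10 processes by open-file entries, labelled CMD[PID].
    fstat -m | awk 'NR > 1 { count[$2 "[" $3 "]"]++ }
        END { for (p in count) printf "%7d %s\n", count[p], p }' |
        sort -rn | head -10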