Hi,
(I'm not on this list; please CC me in future replies.
My apologies to those who get this message twice; I had a typo in the
To: header and had to re-send it to the list.)
Sorry for reviving such an old thread, but I've run into this problem
lately as well, on an 11.1-RELEASE-p1 jailhost mounting NFSv4.0 mounts
into jails.
On 24.09.2015 23:17, Rick Macklem wrote:
> Frank de Bot wrote:
>> Rick Macklem wrote:
>>> Frank de Bot wrote:
>>>> Rick Macklem wrote:
>>>>> Frank de Bot wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On a 10.1-RELEASE-p9 server I have several NFS mounts used for a
>>>>>> jail. Because it's a server only to test, there is a low load.
>>>>>> But the [nfscl] process is hogging a CPU after a while. This
>>>>>> happens pretty fast, within 1 or 2 days. I'm noticing the high
>>>>>> CPU of the process when I want to do some tests after a little
>>>>>> while (those 1 or 2 days).
Here's my ps ax | grep nfscl:
# ps ax | grep nfscl
11111 - DL 932:08.74 [nfscl]
11572 - DL 442:27.42 [nfscl]
30396 - DL 933:44.13 [nfscl]
35902 - DL 442:08.70 [nfscl]
40881 - DL 938:56.04 [nfscl]
43276 - DL 932:38.88 [nfscl]
49178 - DL 934:24.77 [nfscl]
56314 - DL 935:21.55 [nfscl]
60085 - DL 936:37.11 [nfscl]
71788 - DL 933:10.96 [nfscl]
82001 - DL 934:45.76 [nfscl]
86222 - DL 931:42.94 [nfscl]
92353 - DL 1186:53.38 [nfscl]
21105 20 S+ 0:00.00 grep nfscl
And this is on a 12-core machine with Hyperthreading:
# uptime
7:28PM up 11 days, 4:50, 4 users, load averages: 25.49, 21.91, 20.25
Most of this load is being generated by the nfscl threads.
>>>>>> My jail.conf looks like:
>>>>>>
>>>>>> exec.start = "/bin/sh /etc/rc";
>>>>>> exec.stop = "/bin/sh /etc/rc.shutdown";
>>>>>> exec.clean;
>>>>>> mount.devfs;
>>>>>> exec.consolelog = "/var/log/jail.$name.log";
>>>>>> #mount.fstab = "/usr/local/etc/jail.fstab.$name";
>>>>>>
>>>>>> test01 {
>>>>>>     host.hostname = "test01_hosting";
>>>>>>     ip4.addr = somepublicaddress;
>>>>>>     ip4.addr += someprivateaddress;
>>>>>>
>>>>>>     mount = "10.13.37.2:/tank/hostingbase /opt/jails/test01
>>>>>>         nfs nfsv4,minorversion=1,pnfs,ro,noatime 0 0";
>>>>>>     mount += "10.13.37.2:/tank/hosting/test /opt/jails/test01/opt
>>>>>>         nfs nfsv4,minorversion=1,pnfs,noatime 0 0";
>>>>>>
>>>>>>     path = "/opt/jails/test01";
>>>>>> }
>>>>>>
>>>>>> The last test was with NFS 4.1; I also tried NFS 4.0 with the
>>>>>> same result. In the read-only nfs share there are symbolic links
>>>>>> pointing to the read-write share for logging, storing .run files,
>>>>>> etc. When I monitor my network interface with tcpdump, there is
>>>>>> little nfs traffic; only when I try to access the shares is there
>>>>>> activity.
>>>>>>
>>>>>> What is causing nfscl to run around in circles, hogging the CPU
>>>>>> (it makes the system slow to respond too), or how can I find out
>>>>>> what the cause is?
>>>>>>
>>>>> Well, the nfscl does server->client RPCs referred to as callbacks.
>>>>> I have no idea what the implications of running it in a jail are,
>>>>> but I'd guess that these server->client RPCs get blocked somehow,
>>>>> etc...
>>>>> (The NFSv4.0 mechanism requires a separate IP address that the
>>>>> server can connect to on the client. For NFSv4.1, it should use
>>>>> the same TCP connection as is used for the client->server RPCs.
>>>>> The latter seems like it should work, but there is probably some
>>>>> glitch.)
>>>>>
>>>>> ** Just run without the nfscl daemon (it is only needed for
>>>>> delegations or pNFS).
>>>>
>>>> How can I disable the nfscl daemon?
>>>>
>>> Well, the daemon for the callbacks is called nfscbd.
>>> You should check via "ps ax" to see if you have it running.
>>> (For NFSv4.0 you probably don't want it running, but for NFSv4.1 you
>>> do need it. pNFS won't work at all without it, but unless you have a
>>> server that supports pNFS, it won't work anyhow. Unless your server
>>> is a clustered Netapp Filer, you should probably not have the "pnfs"
>>> option.)
>>>
>>> To run the "nfscbd" daemon you can set:
>>>     nfscbd_enable="TRUE"
>>> in your /etc/rc.conf, which will start it on boot.
>>> Alternatively, just type "nfscbd" as root.
>>>
>>> The "nfscl" thread is always started when an NFSv4 mount is done.
>>> It does an assortment of housekeeping things, including a Renew op
>>> to make sure the lease doesn't expire. If for some reason the jail
>>> blocks these Renew RPCs, it will try to do them over and over and
>>> ..., because having the lease expire is bad news for NFSv4. How
>>> could you tell? Well, capturing packets between the client and
>>> server, then looking at them in wireshark is probably the only way.
>>> (Or maybe a large count for Renew in the output from "nfsstat -e".)
>>>
>>> "nfscbd" is optional for NFSv4.0. Without it, you simply don't do
>>> callbacks/delegations.
>>> For NFSv4.1 it is pretty much required, but doesn't need a separate
>>> server->client TCP connection.
>>> --> I'd enable it for NFSv4.1, but disable it for NFSv4.0, at least
>>> as a starting point.
>>>
>>> And as I said before, none of this is tested within jails, so I
>>> have no idea what effect the jails have. Someone who understands
>>> jails might have some insight w.r.t. this?
>>>
>>> rick
>>>
>>
>> Since last time I haven't tried to use pnfs and just stuck with
>> nfsv4.0. nfscbd is not running. The server is now running 10.2. The
>> number of renews is not very high (56k; getattr, for example, is
>> 283M). Viewed with wireshark, the renew calls look good; the nfs
>> status is ok.
>>
>> Is there a way to know what [nfscl] is busy with?
>>
> Not that I can think of. When I do "ps axHl" I see it in DL state and
> not doing much of anything. (You could try setting "sysctl
> vfs.nfs.debuglevel=4", but I don't think you'll see anything syslog'd
> that is useful?)
> This is what I'd expect for an NFSv4.0 mount without the nfscbd
> running.
nfscbd is not running.
I've done a bit of digging with kgdb, by attaching to a thread of one
of the nfscl's and grabbing a backtrace:
--8<---- snip
(kgdb) info threads
[...]
678 Thread 101151 (PID=82001: nfscl) 0xffffffff80a9780a in
sched_switch ()
[...]
(kgdb) thread 678
[Switching to thread 678 (Thread 101151)]#0 0xffffffff80a9780a in
sched_switch ()
(kgdb) bt
#0 0xffffffff80a9780a in sched_switch ()
#1 0xffffffff80a75a35 in mi_switch ()
#2 0xffffffff80abaa82 in sleepq_timedwait ()
#3 0xffffffff80a75435 in _sleep ()
#4 0xffffffff8095eb7d in nfscl_renewthread ()
#5 0xffffffff80985204 in start_nfscl ()
#6 0xffffffff80a2f815 in fork_exit ()
#7 0xffffffff80ec3b7e in fork_trampoline ()
#8 0x0000000000000000 in ?? ()
--8<---- snap
So as you said, the nfscl thread is mostly hanging in a sleep. However,
approximately every 30-50 seconds (this corresponds quite well with the
timing of the renew calls on the wire) the CPU usage spikes to 100% on
all 24 threads.
dtrace(1m)'ing during those spikes shows a huge number (on the order of
10^5) of calls to nfscl_procdoesntexist and nfscl_cleanup_common:
Calls
nfscl_postop_attr 12
nfscl_loadattrcache 17
nfscl_request 22
nfscl_deleggetmodtime 26
nfscl_reqstart 26
nfscl_mustflush 30
nfscl_nodeleg 34
nfscl_cleanup_common 12005
nfscl_procdoesntexist 12315
Times (summed, ns)
nfscl_loadattrcache 41949
nfscl_deleggetmodtime 63014
nfscl_postop_attr 65046
nfscl_reqstart 77851
nfscl_procdoesntexist 30187728
nfscl_request 423855128
nfscl_nodeleg 841033772
nfscl_mustflush 4838119753
nfscl_cleanup_common 6436207841
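To put those aggregates in perspective, dividing the summed times by
the call counts gives the average cost per call of the two hot
functions. A quick back-of-the-envelope sketch, using only the numbers
above:

```shell
#!/bin/sh
# Average cost per call for the two hottest functions, computed from
# the dtrace call counts and summed times (nanoseconds) shown above.
awk 'BEGIN {
    printf "nfscl_procdoesntexist: %.0f ns/call\n", 30187728 / 12315
    printf "nfscl_cleanup_common:  %.0f ns/call\n", 6436207841 / 12005
}'
```

So nfscl_cleanup_common averages roughly half a millisecond per call,
and its ~6.4 summed CPU-seconds inside one 5-second tick window is
already more than a full core, which fits the observed spikes.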
For reference, here's the dtrace(1m) script I'm using:
--8<---- snip
#!/usr/sbin/dtrace -Cs
#pragma D option quiet
fbt::nfscl_*:entry {
@entries[probefunc] = count();
traceme[probefunc] = 1;
ts[probefunc] = timestamp;
}
fbt::nfscl_*:return
/traceme[probefunc]/
{
@times[probefunc] = sum(timestamp - ts[probefunc]);
}
tick-5sec
{
printa(@times);
clear(@times);
printa(@entries);
clear(@entries);
}
--8<---- snap
For completeness, I guess I *should* synchronize this with the for(;;)
loop in nfscl_renewthread and count the actual number of process
terminations, but OTOH, I am extremely sure (just by observing the
number of running processes and PIDs) that it's nowhere near those counts.
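For a rough sense of scale (assuming the aggregates above come from a
single 5-second tick window, as the script's tick-5sec clause
suggests):

```shell
#!/bin/sh
# 12005 nfscl_cleanup_common calls in one 5-second window works out to:
awk 'BEGIN { printf "%.0f calls/sec\n", 12005 / 5 }'
```

That is about 2400 calls per second, orders of magnitude beyond any
plausible rate of process terminations on this host.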
Digging a little deeper, I see that pfind(9), or more precisely
pfind_locked(), is used in nfscl_procdoesntexist(). What I do not
understand yet is how pfind(9) interacts with jails. Maybe someone(TM)
could shed some light on this?
> Basically, when the nfscbd isn't running the server shouldn't issue
> any delegations, because it shouldn't see a callback path
> (server->client TCP connection). Also, if you are using a FreeBSD NFS
> server, it won't issue delegations unless you've enabled that, which
> isn't the default.
>
> Check to see your Delegs in "nfsstat -e" is 0.
It is.
> If it is, then all the nfscl should be doing is waking up once per
> second and doing very little except a Renew RPC once every 30-60sec.
> (A fraction of what the server's lease duration is.)
A tcpdump shows at least one renew call/reply pair per minute.
> The only thing I can think of that might cause it to run a lot would
> be some weirdness related to the TOD clock. It msleep()s for hz and
> also checks for time_uptime (which should be in resolution of
> seconds) != the previous time.
> (If the msleep()s were waking up too frequently, then it would loop
> around doing not much of anything, over and over and over again...)
>
>
>> I do understand nfs + jails could have issues, but I like to understand
>> them.
>>
Fabian
[1]
https://lists.freebsd.org/pipermail/freebsd-hackers/2017-June/thread.html#51200