thr3ads.net - freebsd stable - post ino64: lockd no runs? [Jun 2017]

David Wolfskill

2017-Jun-11 18:08 UTC

post ino64: lockd no runs?

On Sun, Jun 04, 2017 at 08:57:44AM -0400, Michael Butler wrote:
> It seems that {rpc.}lockd no longer runs after the ino64 changes on any
> of my systems after a full rebuild of src and ports. No log entries
> offer any insight as to why :-(
> 
> 	imb
I don't tend to use NFS on my systems that are running head, so I
haven't had occasion to test this as stated.

However, I just completed my weekly update of the "production" systems
here at home, running stable/11.  And I find that lockd seems to be ...
claiming that all is well, but declining to run (for long).

To the best of my knowledge, that was not the case until this last
update, which was from:

FreeBSD albert.catwhisker.org 11.1-PRERELEASE FreeBSD 11.1-PRERELEASE #316
r319566M/319569:1100514: Sun Jun  4 03:54:41 PDT 2017
root@freebeast.catwhisker.org:/common/S1/obj/usr/src/sys/ALBERT  amd64

to

FreeBSD albert.catwhisker.org 11.1-BETA1 FreeBSD 11.1-BETA1 #322
r319823M/319823:1100514: Sun Jun 11 03:56:10 PDT 2017
root@freebeast.catwhisker.org:/common/S1/obj/usr/src/sys/ALBERT  amd64

The "glaringly obvious" symptom in my case is that I am now unable
to (directly) save an email message from within mutt(1) by appending
it to an NFS-resident file.  (Saving it to a local file, then using
cat(1) to append that to the NFS-resident file & removing the local
copy works....)

After a few variations on a theme of:

albert(11.1)[5] sudo service lockd restart
lockd not running?
Starting lockd.
albert(11.1)[6] echo $?
0
albert(11.1)[7] service lockd status
lockd is not running.

I finally(!) thought to ask ktrace what's going on (as tailing
/var/log/messages was completely unproductive, even after enabling
rc_debug).

So I tried: "sudo ktrace -di service lockd restart"; upon examination
of the output of kdump(1), I see that the trace ends with:

  ...
  2811 rpc.lockd NAMI  "/var/run/logpriv"
  2786 sh       CALL  read(0xa,0x627fc0,0x400)
  2786 sh       GIO   fd 10 read 0 bytes
       ""
  2811 rpc.lockd RET   connect 0
  2786 sh       RET   read 0
  2811 rpc.lockd CALL  sendto(0x3,0x7fffffffe2c0,0x27,0,0,0)
  2786 sh       CALL  exit(0)
  2811 rpc.lockd GIO   fd 3 wrote 39 bytes
       "<30>Jun 11 15:43:10 rpc.lockd: Starting"
  2811 rpc.lockd RET   sendto 39/0x27
  2811 rpc.lockd CALL  sigaction(SIGALRM,0x7fffffffec20,0)
  2811 rpc.lockd RET   sigaction 0
  2811 rpc.lockd CALL  nlm_syscall(0,0x1e,0x4,0x801015040)
  2811 rpc.lockd RET   nlm_syscall -1 errno 14 Bad address
  2811 rpc.lockd CALL  sigprocmask(SIG_BLOCK,0x800830c78,0x7fffffffea40)
  2811 rpc.lockd RET   sigprocmask 0
  2811 rpc.lockd CALL  sigprocmask(SIG_SETMASK,0x800830c8c,0)
  2811 rpc.lockd RET   sigprocmask 0
  2811 rpc.lockd CALL  sigprocmask(SIG_BLOCK,0x800830c78,0x7fffffffe5b0)
  2811 rpc.lockd RET   sigprocmask 0
  2811 rpc.lockd CALL  sigprocmask(SIG_SETMASK,0x800830c8c,0)
  2811 rpc.lockd RET   sigprocmask 0
  2811 rpc.lockd CALL  sigprocmask(SIG_BLOCK,0x800830c78,0x7fffffffe5b0)
  2811 rpc.lockd RET   sigprocmask 0
  2811 rpc.lockd CALL  sigprocmask(SIG_SETMASK,0x800830c8c,0)
  2811 rpc.lockd RET   sigprocmask 0
  2811 rpc.lockd CALL  sigprocmask(SIG_BLOCK,0x800830c78,0x7fffffffe5b0)
  2811 rpc.lockd RET   sigprocmask 0
  2811 rpc.lockd CALL  sigprocmask(SIG_SETMASK,0x800830c8c,0)
  2811 rpc.lockd RET   sigprocmask 0
  2811 rpc.lockd CALL  exit(0x1)
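
In a long trace like the one above, the failing call is easy to miss among the successful ones. The following is only a minimal sketch, assuming the kdump(1) output has been saved as text and that failed returns always take the form "RET <syscall> -1 errno <n> <message>", as in the nlm_syscall line. The function name and the sample input are illustrative, not part of any FreeBSD tool.

```python
import re

# Matches kdump(1) "RET" records that report a failure, for example:
#   2811 rpc.lockd RET   nlm_syscall -1 errno 14 Bad address
# The layout is assumed from the trace in this message.
FAILED_RET = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<comm>\S+)\s+RET\s+"
    r"(?P<syscall>\S+)\s+-1\s+errno\s+(?P<errno>\d+)\s+(?P<msg>.*)$"
)


def failed_syscalls(lines):
    """Return (pid, command, syscall, errno, message) for each failed RET record."""
    failures = []
    for line in lines:
        m = FAILED_RET.match(line)
        if m:
            failures.append(
                (int(m["pid"]), m["comm"], m["syscall"],
                 int(m["errno"]), m["msg"].strip())
            )
    return failures


# A few lines copied from the trace above, used as sample input.
SAMPLE = """\
  2811 rpc.lockd RET   sigaction 0
  2811 rpc.lockd CALL  nlm_syscall(0,0x1e,0x4,0x801015040)
  2811 rpc.lockd RET   nlm_syscall -1 errno 14 Bad address
  2811 rpc.lockd CALL  sigprocmask(SIG_BLOCK,0x800830c78,0x7fffffffea40)
  2811 rpc.lockd RET   sigprocmask 0
""".splitlines()

for failure in failed_syscalls(SAMPLE):
    print(failure)
```

Run against the sample, this reports only the nlm_syscall failure; the successful sigaction and sigprocmask returns are skipped.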

Then, when I tried to send this message, I started getting more whines
from mutt(1).  I finally gave up and rebooted from the previous
environment:

FreeBSD albert.catwhisker.org 11.1-PRERELEASE FreeBSD 11.1-PRERELEASE #316
r319566M/319569:1100514: Sun Jun  4 03:54:41 PDT 2017
root@freebeast.catwhisker.org:/common/S1/obj/usr/src/sys/ALBERT  amd64

and lockd is running:

albert(11.1-P)[2] service lockd status
lockd is running as pid 629.
albert(11.1-P)[3] 

so mutt(1) is not pitching a hissy-fit every time I try to save or
send a message.

In light of the above, I have Bcced: this message to current@ (where
the thread originated) and sent it (and set replies) to stable@.

I have a test system, last updated to stable/11 as of mid-October
last year; lockd was running on it, as well (which is why I tried
going back to last week's image).  I'm happy to update it to points
where lockd may be broken, if it might help figure out what's broken
and how to fix it.

Peace,
david
-- 
David H. Wolfskill				david at catwhisker.org
Looking forward to telling Mr. Trump: "You're fired!"

See http://www.catwhisker.org/~david/publickey.gpg for my public key.

Cy Schubert

2017-Jun-11 18:47 UTC

In message <20170611172022.GA3184 at albert.catwhisker.org>,
David Wolfskill writes:
> I have a test system, last updated to stable/11 as of mid-October
> last year; lockd was running on it, as well (which is why I tried
> going back to last week's image).  I'm happy to update it to points
> where lockd may be broken, if it might help figure out what's broken
> and how to fix it.
I'm running lockd on recent -CURRENT systems. No issues so far. Locking 
works as expected.



-- 
Cheers,
Cy Schubert <Cy.Schubert at cschubert.com>
FreeBSD UNIX:  <cy at FreeBSD.org>   Web:  http://www.FreeBSD.org

	The need of the many outweighs the greed of the few.

Konstantin Belousov

2017-Jun-11 18:58 UTC

On Sun, Jun 11, 2017 at 11:12:25AM -0700, David Wolfskill wrote:
>   2811 rpc.lockd CALL  nlm_syscall(0,0x1e,0x4,0x801015040)
>   2811 rpc.lockd RET   nlm_syscall -1 errno 14 Bad address
If you revert r319614 on stable/11, does the problem go away?

John Baldwin

2017-Jun-12 17:14 UTC

On Sunday, June 11, 2017 11:12:25 AM David Wolfskill wrote:
>   2811 rpc.lockd CALL  nlm_syscall(0,0x1e,0x4,0x801015040)
>   2811 rpc.lockd RET   nlm_syscall -1 errno 14 Bad address
This is a really good clue.  nlm_syscall is dying with EFAULT.  The last
argument is a pointer to an array of char * pointers, and the only way
I can see it dying is if it fails to copyin() one of the strings pointed
to by those pointers.  You could try running rpc.lockd under gdb from
ports and setting a breakpoint on 'nlm_syscall' and then printing out
'addr_count' and 'p addrs@(addr_count * 2)'.
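
To make the numbers in the quoted trace concrete, here is a small sketch that decodes the hex arguments of the nlm_syscall call and names errno 14. The names addr_count and addrs come from the paragraph above. The names debug_level and grace_period are an assumption based on my recollection of the FreeBSD syscall prototype, so check them against the source tree before relying on them.

```python
import errno

# addr_count and addrs are named in the message above; debug_level and
# grace_period are ASSUMED names for the first two arguments.
ARG_NAMES = ("debug_level", "grace_period", "addr_count", "addrs")


def decode_call(args_text):
    """Map the comma-separated kdump arguments onto the assumed names."""
    values = [int(token, 0) for token in args_text.split(",")]
    return dict(zip(ARG_NAMES, values))


# Argument list copied from the trace: nlm_syscall(0,0x1e,0x4,0x801015040)
call = decode_call("0,0x1e,0x4,0x801015040")

print("grace_period:", call["grace_period"])
print("addr_count:", call["addr_count"])
# The gdb expression above inspects addr_count * 2 strings.
print("strings to inspect:", call["addr_count"] * 2)
print("errno 14 is", errno.errorcode[14])
```

With addr_count equal to 4, the suggested gdb expression covers eight strings, and a bad pointer in any one of them would be enough to produce the EFAULT.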

Unfortunately I'm not able to reproduce the failure on a test machine
I have running head post-ino64.

-- 
John Baldwin
