I have an Ultra 10 client running Sol10 U3 that has a zfs pool set up on the extra space of the internal IDE disk. There's just the one fs and it is shared with the sharenfs property. When this system reboots, nfs/server ends up getting disabled, and this is the error from the SMF logs:

[ Apr 16 08:41:22 Executing start method ("/lib/svc/method/nfs-server start") ]
[ Apr 16 08:41:24 Method "start" exited with status 0 ]
[ Apr 18 10:59:23 Executing start method ("/lib/svc/method/nfs-server start") ]
Assertion failed: pclose(fp) == 0, file ../common/libzfs_mount.c, line 380, function zfs_share

If I re-enable nfs/server after the system is up, it's fine. The system was recently upgraded to use zfs, and this has happened on the last two reboots. We have lots of other systems that share nfs through zfs fine, and I didn't see a similar problem on the list. Any ideas?

Ben

This message posted from opensolaris.org
Could it be an ordering problem, with NFS trying to start before zfs is mounted? Just a guess, of course. I'm not real savvy in either realm.

HTH,
Mike

Michael Lee, Area System Support Engineer, Sun Microsystems, Inc.
I have seen a previous discussion with the same error. I don't think a solution was posted, though. The libzfs_mount.c source indicates that the 'share' command returned a non-zero status but reported no error message. Can you run 'share' manually after a fresh boot? There may be some insight if it fails, though as you describe it, share should work without problems.

-Robert

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libzfs/common/libzfs_mount.c?r=789
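To make the "non-zero but no error" case concrete, here is a minimal sketch of running a command through popen(), keeping its first line of output, and decoding what pclose() reports. It is not the libzfs code; run_and_report() is a hypothetical helper, and the commands passed to it are placeholders for whatever you want to check (for example, share run by hand after boot).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

/*
 * Hypothetical helper, not part of libzfs.
 * Runs `cmd` through popen(), copies its first line of output (if any)
 * into `line`, and returns the command's exit status.
 * Returns -1 if the command could not be started or did not exit normally.
 */
int run_and_report(const char *cmd, char *line, size_t len)
{
    assert(cmd != NULL && line != NULL && len > 0);
    line[0] = '\0';

    FILE *fp = popen(cmd, "r");
    if (fp == NULL)
        return -1;

    /* Keep the first line, minus its trailing newline. */
    if (fgets(line, (int)len, fp) != NULL)
        line[strcspn(line, "\n")] = '\0';

    /* Drain the rest so the child is not killed by SIGPIPE on close. */
    char scratch[256];
    while (fgets(scratch, sizeof scratch, fp) != NULL)
        ;

    /* pclose() returns a wait status, not a plain exit code. */
    int status = pclose(fp);
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A command that prints nothing and exits non-zero leaves `line` empty while the return value is non-zero, which matches the situation described above: a failure with no message to go with it.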
It does seem like an ordering problem, but nfs/server should be starting up late enough with SMF dependencies. I need to see if I can duplicate the problem on a test system...
I just rebooted this host this morning and the same thing happened again. I have the core file from zfs.

[ Apr 26 07:47:01 Executing start method ("/lib/svc/method/nfs-server start") ]
Assertion failed: pclose(fp) == 0, file ../common/libzfs_mount.c, line 380, function zfs_share
Abort - core dumped

Why would nfs/server be disabled instead of going into maintenance with this error?
I was able to duplicate this problem on a test Ultra 10. I put in a workaround by adding a service that depends on /milestone/multi-user-server which does a 'zfs share -a'. It's strange this hasn't happened on other systems, but maybe it's related to slower systems...

Ben
On Thu, 26 Apr 2007, Ben Miller wrote:
> I just rebooted this host this morning and the same thing happened again. I have the core file from zfs.
>
> [ Apr 26 07:47:01 Executing start method ("/lib/svc/method/nfs-server start") ]
> Assertion failed: pclose(fp) == 0, file ../common/libzfs_mount.c, line 380, function zfs_share
> Abort - core dumped

For there to be no output between the 'executing start method' line and the assertion means that the popen() succeeded, but the fgets() failed. It's possible that fgets() was interrupted and failed with EINTR, which isn't currently being handled by the code in zfs_share_nfs(). The code I'm looking at starts here:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libzfs/common/libzfs_mount.c#454

It'd be nice to see a truss or dtrace of this to help narrow it down.

Regards,
markm
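For illustration only, this is roughly what handling an interrupted fgets() could look like. It is a sketch under the assumption that EINTR really is the cause, which the thread has not confirmed; fgets_retry() is a hypothetical name and is not taken from libzfs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical wrapper, not part of libzfs.
 * Behaves like fgets(), but retries when the underlying read was
 * interrupted by a signal. Returns buf on success, NULL on EOF or
 * on any error other than EINTR.
 */
char *fgets_retry(char *buf, int len, FILE *fp)
{
    assert(buf != NULL && len > 0 && fp != NULL);

    for (;;) {
        errno = 0;
        if (fgets(buf, len, fp) != NULL)
            return buf;

        if (ferror(fp) && errno == EINTR) {
            clearerr(fp);   /* drop the error flag and try again */
            continue;
        }
        return NULL;        /* EOF or a real error */
    }
}
```

Depending on the C library, characters read before the interruption may be discarded, so a retry loop like this is only a partial answer. It also does nothing about a child command that genuinely exits non-zero.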
I just threw in a truss in the SMF script and rebooted the test system and it failed again. The truss output is at http://www.eecis.udel.edu/~bmiller/zfs.truss-Apr27-2007

Thanks,
Ben
Dennis Clarke
2007-Apr-27 17:09 UTC
[zfs-discuss] Re: Re: ZFS disables nfs/server on a host
On 4/27/07, Ben Miller <miller at eecis.udel.edu> wrote:
> I just threw in a truss in the SMF script and rebooted the test system and it failed again.
> The truss output is at http://www.eecis.udel.edu/~bmiller/zfs.truss-Apr27-2007

324: read(7, 0x000CA00C, 5120) = 0
324: llseek(7, 0, SEEK_CUR) Err#29 ESPIPE
324: close(7) = 0
324: waitid(P_PID, 331, 0xFFBFE740, WEXITED|WTRAPPED) = 0

llseek(7, 0, SEEK_CUR) returns Err#29 ESPIPE. So what does that mean? From the man page:

ERRORS
The llseek() function will fail if:
ESPIPE The fildes argument is associated with a pipe or FIFO.

Dunno if that helps.

Dennis
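The ESPIPE by itself is probably not the failure: a stream opened with popen() sits on a pipe, and any seek on a pipe fails this way. A small sketch that reproduces the same errno with plain lseek() on a fresh pipe (the function names are made up for the example):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns the errno produced by seeking on the read end of a new pipe,
 * 0 if the seek unexpectedly succeeds, or -1 if the pipe cannot be created.
 */
int seek_errno_on_pipe(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    errno = 0;
    off_t pos = lseek(fds[0], 0, SEEK_CUR);
    int err = (pos == (off_t)-1) ? errno : 0;

    close(fds[0]);
    close(fds[1]);
    return err;
}

/* Usage: aborts if a pipe ever turns out to be seekable. */
void check_pipe_is_unseekable(void)
{
    assert(seek_errno_on_pipe() == ESPIPE);
}
```

If that reading is right, the more telling lines in the truss excerpt are the read() returning 0 (end of file with no output from the child) and the waitid() that collects the child's status.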
I seem to be having nfs server issues myself, on a fresh install of b62 with a mirrored root pool. I try and I try, but my nfs/server service always reverts to disabled. Unfortunately I have no ufs slices to try out a basic nfs configuration. No matter how I try to share out a directory or filesystem (share, zfs share, or just editing dfstab and enabling my nfs-server service), I get the following in my nfs server log:

[ Apr 27 20:44:36 Executing start method ("/lib/svc/method/nfs-server start") ]
No NFS filesystems are shared
[ Apr 27 20:44:36 Method "start" exited with status 0 ]
[ Apr 27 20:44:36 Stopping because service disabled. ]
[ Apr 27 20:44:36 Executing stop method ("/lib/svc/method/nfs-server stop 112") ]
Bad System Call - core dumped
[ Apr 27 20:44:43 Method "stop" exited with status 0 ]

I have also hit the zfs share assertion from this thread, but I forget exactly what I did to invoke it. Is this a known bug or am I running into something weird?

cheers,
-o