Hello, I've got a real doozie. We recently implemented a b89 box as a ZFS/NFS/CIFS server. The NFS client is HP-UX (11.23).

What's happening is that when our DBA edits a file on the NFS mount with vi, it will not save.

I took vi out of the mix by doing 'touch /nfs/file1' and then 'echo abc > /nfs/file1', and the echo just sat there while the NFS server's CPU went up to 50% (one full core).

This nfsstat output is most troubling (I zeroed the counters and only tried to echo data into a file, so these are the numbers for the roughly two minutes before I Ctrl-C'ed the echo command):

Version 3: (11242416 calls)
null          getattr       setattr       lookup        access        readlink
0 0%          5600958 49%   5600895 49%   19 0%         9 0%          0 0%
read          write         create        mkdir         symlink       mknod
0 0%          40494 0%      5 0%          0 0%          0 0%          0 0%
remove        rmdir         rename        link          readdir       readdirplus
3 0%          0 0%          0 0%          0 0%          0 0%          7 0%
fsstat        fsinfo        pathconf      commit
12 0%         0 0%          0 0%          14 0%

That's a lot of getattr and setattr! Does anyone have any advice on where I should start to figure out what is going on? truss, dtrace, snoop... so many choices!

Thanks,

-Andy
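For reference, a minimal sketch of how server-side stats like the above can be zeroed and re-read on the Solaris box (assuming the stock nfsstat; -z needs root):

    # on the ZFS/NFS server, as root
    nfsstat -z          # zero the kernel NFS statistics
    # ... reproduce the problem from the HP-UX client:
    #   touch /nfs/file1
    #   echo abc > /nfs/file1
    nfsstat -s          # dump server-side RPC/NFS call counts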
Andy Lubel wrote:

> That's a lot of getattr and setattr! Does anyone have any advice on
> where I should start to figure out what is going on? truss, dtrace,
> snoop... so many choices!

snoop would be a fine place to start. This'll tell us what the server is responding to all those getattr/setattr calls with (a rough capture recipe follows).

- Lisa
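A rough capture recipe on the Solaris server, assuming the stock snoop and a made-up client hostname:

    # capture traffic to/from the HP-UX client ("hpux-client" is a placeholder)
    snoop -o /tmp/nfs.cap host hpux-client

    # reproduce the hang from the client, Ctrl-C the capture, then
    # replay just the RPC/NFS conversation from the capture file:
    snoop -i /tmp/nfs.cap rpc nfs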
Andy Lubel wrote:

> I've got a real doozie. We recently implemented a b89 box as a
> ZFS/NFS/CIFS server. The NFS client is HP-UX (11.23).
>
> What's happening is that when our DBA edits a file on the NFS mount
> with vi, it will not save.
>
> I took vi out of the mix by doing 'touch /nfs/file1' and then 'echo abc
> > /nfs/file1', and the echo just sat there while the NFS server's CPU
> went up to 50% (one full core).

Hi Andy,

This sounds familiar: you may be hitting something I diagnosed last year. Run snoop and see if it loops like this:

10920   0.00013 141.240.193.235 -> 141.240.193.27  NFS C GETATTR3 FH=6614
10921   0.00007 141.240.193.27  -> 141.240.193.235 NFS R GETATTR3 OK
10922   0.00017 141.240.193.235 -> 141.240.193.27  NFS C SETATTR3 FH=6614
10923   0.00007 141.240.193.27  -> 141.240.193.235 NFS R SETATTR3 Update synch mismatch
10924   0.00017 141.240.193.235 -> 141.240.193.27  NFS C GETATTR3 FH=6614
10925   0.00023 141.240.193.27  -> 141.240.193.235 NFS R GETATTR3 OK
10926   0.00026 141.240.193.235 -> 141.240.193.27  NFS C SETATTR3 FH=6614
10927   0.00009 141.240.193.27  -> 141.240.193.235 NFS R SETATTR3 Update synch mismatch

If you see this, you've hit what we filed as Sun bug ID 6538387, "HP-UX automount NFS client hangs for ZFS filesystems". It's an HP-UX bug, fixed in HP-UX 11.31. The synopsis is that HP-UX gets bitten by the nanosecond timestamp resolution on ZFS. Part of the CREATE handshake is for the server to send the create time as a 'guard' against almost-simultaneous creates; the client has to send it back in the SETATTR to complete the file creation. HP-UX has only microsecond resolution in its VFS, so the 'guard' value is not sent back accurately and the server rejects it; lather, rinse, repeat. The spec, RFC 1813, discusses this in section 3.3.2.

You can use NFSv2 in the short term until you get that update (a sample client mount follows below).

If you see something different, by all means send us a snoop.

Rob T
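A sketch of forcing v2 on the HP-UX side; hostnames and paths are made up, and the exact option spelling should be checked against mount_nfs(1M) on your HP-UX release:

    # on the HP-UX client, force NFSv2 for this mount until 11.31 is an option
    mount -F nfs -o vers=2 nearline.host:/export/dbfiles /nfs

    # if the automounter is in use, the same option would go in the map entry,
    # e.g.  dbfiles  -vers=2  nearline.host:/export/dbfiles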
That was it!

hpux-is-old.com -> nearline.host   NFS C GETATTR3 FH=F6B3
nearline.host   -> hpux-is-old.com NFS R GETATTR3 OK
hpux-is-old.com -> nearline.host   NFS C SETATTR3 FH=F6B3
nearline.host   -> hpux-is-old.com NFS R SETATTR3 Update synch mismatch
hpux-is-old.com -> nearline.host   NFS C GETATTR3 FH=F6B3
nearline.host   -> hpux-is-old.com NFS R GETATTR3 OK
hpux-is-old.com -> nearline.host   NFS C SETATTR3 FH=F6B3
nearline.host   -> hpux-is-old.com NFS R SETATTR3 Update synch mismatch
hpux-is-old.com -> nearline.host   NFS C GETATTR3 FH=F6B3
nearline.host   -> hpux-is-old.com NFS R GETATTR3 OK
hpux-is-old.com -> nearline.host   NFS C SETATTR3 FH=F6B3
nearline.host   -> hpux-is-old.com NFS R SETATTR3 Update synch mismatch
hpux-is-old.com -> nearline.host   NFS C GETATTR3 FH=F6B3

It is too bad our silly hardware only allows us to go to 11.23. That's OK though; in a couple of months we will be dumping this server in favor of new x4600s.

Thanks for the help,

-Andy


On Jun 5, 2008, at 6:19 PM, Robert Thurlow wrote:

> If you see this, you've hit what we filed as Sun bug ID 6538387,
> "HP-UX automount NFS client hangs for ZFS filesystems". It's an
> HP-UX bug, fixed in HP-UX 11.31. [...]
> You can use NFSv2 in the short term until you get that update.
On Jun 6, 2008, at 11:22 AM, Andy Lubel wrote:

> That was it!
> [...]
> It is too bad our silly hardware only allows us to go to 11.23.
> That's OK though; in a couple of months we will be dumping this server
> in favor of new x4600s.

Update:

We tried NFSv2 and the speed was terrible, but the getattr/setattr issue was gone. So what I'm looking at doing now is to create a raw volume, format it with UFS, mount it locally, then share it over NFS (roughly the recipe sketched below). Luckily we will only have to do it this way for a few months; I don't like the extra layer, and the block device isn't as fast as we hoped (I get about 400 MB/s on the ZFS filesystem and 180 MB/s using the UFS-formatted local disk).
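A rough sketch of that workaround on the Solaris server; the pool, volume, mount point and client names are made up:

    # carve a zvol out of the existing pool
    zfs create -V 500g tank/dbvol

    # put UFS on it and mount it locally
    newfs /dev/zvol/rdsk/tank/dbvol
    mkdir -p /export/dbvol
    mount -F ufs /dev/zvol/dsk/tank/dbvol /export/dbvol

    # export it over NFS for the HP-UX client
    # (add matching vfstab/dfstab entries to make it persistent)
    share -F nfs -o rw=hpux-is-old.com /export/dbvol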
I just hope I'm not breaking any rules by implementing this workaround that will come back to haunt me later.

-Andy
On Jun 9, 2008, at 12:28 PM, Andy Lubel wrote:

> Update:
>
> We tried NFSv2 and the speed was terrible, but the getattr/setattr
> issue was gone. So what I'm looking at doing now is to create a raw
> volume, format it with UFS, mount it locally, then share it over NFS.
Tried this today, and although things appear to function correctly, the performance seems to be steadily degrading. Am I getting burnt by double caching? If so, what is the best way to work around my sad situation? I tried directio for the UFS volume and it made things even worse (the mount option I tried is shown below).

The only thing I know to do next is destroy one of my ZFS pools and go back to SVM until we can get some newer NFS clients writing to this nearline box. It pains me deeply!!

TIA,

-Andy
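For the record, the directio attempt looked roughly like this, using the same made-up names as before:

    # remount the UFS-on-zvol filesystem with directio,
    # bypassing the UFS page cache for reads and writes
    umount /export/dbvol
    mount -F ufs -o forcedirectio /dev/zvol/dsk/tank/dbvol /export/dbvol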
On Mon, Jun 9, 2008 at 3:14 PM, Andy Lubel <andy.lubel at gtsi.com> wrote:

> Tried this today, and although things appear to function correctly, the
> performance seems to be steadily degrading. Am I getting burnt by
> double caching? If so, what is the best way to work around my sad
> situation? I tried directio for the UFS volume and it made things even
> worse.

AFAIK, you're doing the best that you can within the constraints of ZFS. If you want to use NFSv3 with your clients, you'll need to use UFS as the back end.

You can check the size of the caches to see if that's the problem. If you just want to take a shot in the dark, and if this is the only filesystem in your zpool, either reduce the size of the ZFS ARC or reduce the size of the UFS cache (an example of capping the ARC follows).

-B

--
Brandon High bhigh at freaks.com
"The good is the enemy of the best." - Nietzsche
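A minimal sketch of checking the ARC size and capping it via /etc/system (the 1 GB value is only an illustration, and the cap takes effect after a reboot):

    # current ARC size, in bytes
    kstat -p zfs:0:arcstats:size

    # cap the ARC at ~1 GB by adding this line to /etc/system, then reboot
    # (size it for your workload; this value is just an example)
    set zfs:zfs_arc_max = 0x40000000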
Brandon High wrote:

> AFAIK, you're doing the best that you can within the constraints of
> ZFS. If you want to use NFSv3 with your clients, you'll need to use
> UFS as the back end.

Just a clarification: NFSv3 isn't a problem in general, to my knowledge; Andy has a problem with an HP-UX client bug that he can't seem to get a fix for.

Rob T
On Mon, Jun 9, 2008 at 10:44 PM, Robert Thurlow <robert.thurlow at sun.com> wrote:

> Just a clarification: NFSv3 isn't a problem in general, to my
> knowledge; Andy has a problem with an HP-UX client bug that he can't
> seem to get a fix for.

It was clear to me when I wrote it, but by "your clients" I meant Andy's buggy hosts, which can't be upgraded to a version with the fix in place. ZFS works fine with other NFSv3 clients (or v2, or v4), AFAIK.

-B

--
Brandon High bhigh at freaks.com
"The good is the enemy of the best." - Nietzsche