I was really hoping for some option other than ZIL_DISABLE, but I finally gave up the fight. Some people suggested that NFSv4 would help over NFSv3, but it didn't... at least not enough to matter.

ZIL_DISABLE was the solution, sadly. I'm running B43/x86 and hoping to get up to B48 or so soonish (I BFU'd it straight to B48 last night and bricked it).

Here are the times. This is an untar (gtar xfj) of SIDEkick (http://www.cuddletech.com/blog/pivot/entry.php?id=491) over NFSv4 onto a 20 TB RAIDZ2 ZFS pool:

ZIL Enabled:
real 1m26.941s

ZIL Disabled:
real 0m5.789s

I'll update this post again when I finally get B48 or newer on the system and try it. Thanks to everyone for their suggestions.

benr.
Ben Rockwood wrote:
> I was really hoping for some option other than ZIL_DISABLE, but I finally gave up the fight. Some people suggested that NFSv4 would help over NFSv3, but it didn't... at least not enough to matter.
>
> ZIL_DISABLE was the solution, sadly. I'm running B43/x86 and hoping to get up to B48 or so soonish (I BFU'd it straight to B48 last night and bricked it).
>
> Here are the times. This is an untar (gtar xfj) of SIDEkick (http://www.cuddletech.com/blog/pivot/entry.php?id=491) over NFSv4 onto a 20 TB RAIDZ2 ZFS pool:
>
> ZIL Enabled:
> real 1m26.941s
>
> ZIL Disabled:
> real 0m5.789s
>
> I'll update this post again when I finally get B48 or newer on the system and try it. Thanks to everyone for their suggestions.

I imagine what's happening is that tar is a single-threaded application and it's basically doing: open, asynchronous write, close. This will go really fast locally. But NFS, because of the way it does cache consistency, must make sure on CLOSE that the writes are on stable storage, so it does a COMMIT, which basically turns your asynchronous write into a synchronous write. That means you basically have a single-threaded app doing synchronous writes, at roughly half a disk rotational latency per write.

Check out mount_nfs(1M) and the 'nocto' option. It might be OK for you to relax the cache consistency for the client's mount while you untar the file(s), then remount without the 'nocto' option once you're done.

Another option is to run multiple untars together. I'm guessing that you've got I/O to spare from ZFS's point of view.

eric
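For readers following along, here is a minimal sketch of the access pattern eric describes; the helper name and arguments are illustrative only, not taken from gtar's source:

    #include <sys/types.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* One file per archive member: open, buffered write, close.
     * On a local filesystem the writes stay asynchronous; over NFS the
     * close() forces the client to flush the dirty pages and issue a
     * COMMIT, so a single-threaded loop pays a synchronous stable-storage
     * round trip for every file it extracts. */
    static int extract_one(const char *path, const char *buf, size_t len)
    {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            return -1;
        ssize_t n = write(fd, buf, len);   /* asynchronous, cached */
        int rc = close(fd);                /* NFS: writeback + COMMIT happen here */
        return (n == (ssize_t)len && rc == 0) ? 0 : -1;
    }

Mounting the client with 'nocto' relaxes that close-time check, but as the next message points out, the file and directory creates themselves remain synchronous at the server.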
On Tue, eric kustarz wrote:
> Ben Rockwood wrote:
> > [original report and ZIL enabled/disabled timings snipped]
>
> I imagine what's happening is that tar is a single-threaded application and it's basically doing: open, asynchronous write, close. This will go really fast locally. But NFS, because of the way it does cache consistency, must make sure on CLOSE that the writes are on stable storage, so it does a COMMIT, which basically turns your asynchronous write into a synchronous write. That means you basically have a single-threaded app doing synchronous writes, at roughly half a disk rotational latency per write.
>
> Check out mount_nfs(1M) and the 'nocto' option. It might be OK for you to relax the cache consistency for the client's mount while you untar the file(s), then remount without the 'nocto' option once you're done.

This will not correct the problem, because tar is extracting and therefore creating files and directories; those creates will be synchronous at the NFS server, and there is no way to change that behavior at the client.

Spencer

> Another option is to run multiple untars together. I'm guessing that you've got I/O to spare from ZFS's point of view.
>
> eric
eric kustarz <eric.kustarz at sun.com> wrote:
> I imagine what's happening is that tar is a single-threaded application and it's basically doing: open, asynchronous write, close. This will go really fast locally. But NFS, because of the way it does cache consistency, must make sure on CLOSE that the writes are on stable storage, so it does a COMMIT, which basically turns your asynchronous write into a synchronous write. That means you basically have a single-threaded app doing synchronous writes, at roughly half a disk rotational latency per write.

star writes files in a way that guarantees consistency; GNU tar and Sun tar do not. If you would like to see the difference, compare 'star x' with 'star x -no-fsync'.

Jörg
Spencer Shepler writes:
> On Tue, eric kustarz wrote:
> > [eric's explanation of the per-file COMMIT and the 'nocto' suggestion snipped]
>
> This will not correct the problem, because tar is extracting and therefore creating files and directories; those creates will be synchronous at the NFS server, and there is no way to change that behavior at the client.
>
> Spencer

Thanks for that (I also thought nocto would help; now I see it won't).

I would add that this is not a bug or a deficiency in the implementation. Any NFS implementation tweak that makes 'tar x' go as fast as direct attached storage will lead to silent data corruption (tar x succeeds, but the files don't checksum OK).

Interestingly, the quality of service of 'tar x' is higher over NFS than over direct attached storage, since with direct attach there is no guarantee that the files have reached stable storage, whereas with NFS there is.

Net net, for single-threaded 'tar x', data integrity considerations force NFS to provide a high-quality, slow service. For direct attach we don't have those data integrity issues, and the community has managed to get by with the lower-quality, higher-speed service.

> > Another option is to run multiple untars together. I'm guessing that you've got I/O to spare from ZFS's point of view.
> >
> > eric

Or maybe a threaded tar? To re-emphasise Eric's point, this type of slowness affects single-threaded loads. There is lots of headroom for higher performance by using concurrency.

-r
Roch <Roch.Bourbonnais at Sun.COM> wrote:
> I would add that this is not a bug or a deficiency in the implementation. Any NFS implementation tweak that makes 'tar x' go as fast as direct attached storage will lead to silent data corruption (tar x succeeds, but the files don't checksum OK).
>
> Interestingly, the quality of service of 'tar x' is higher over NFS than over direct attached storage, since with direct attach there is no guarantee that the files have reached stable storage, whereas with NFS there is.

Why do you believe this?

Neither Sun tar nor GNU tar calls fsync, which is the only way to enforce data integrity over NFS.

If you would like to test this, use star. Star by default calls fsync before it closes a written file in x mode. To switch this off, use star -no-fsync.

> Net net, for single-threaded 'tar x', data integrity considerations force NFS to provide a high-quality, slow service. For direct attach we don't have those data integrity issues, and the community has managed to get by with the lower-quality, higher-speed service.

What do you have in mind?

A tar that calls fsync in detached threads?

Jörg
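To illustrate the pattern Jörg is describing, here is a hedged sketch (my own helper, not star's actual code): call fsync() before close() and treat a failure of either call as a failed extract:

    #include <sys/types.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Write, fsync, then close; only if all three succeed is the file
     * known to have reached the server's stable storage over NFS (or the
     * platters on a local filesystem that honors the flush). */
    static int store_file(int fd, const char *buf, size_t len)
    {
        if (write(fd, buf, len) != (ssize_t)len)
            return -1;
        if (fsync(fd) != 0)   /* surfaces ENOSPC, EIO, ... while recovery is still possible */
            return -1;
        return close(fd);     /* still check close(); it can report errors too */
    }

Skipping the fsync() (the -no-fsync case) leaves error detection to whatever close() happens to report.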
Joerg Schilling writes:
> Roch <Roch.Bourbonnais at Sun.COM> wrote:
> > [quality-of-service comparison snipped]
>
> Why do you believe this?
>
> Neither Sun tar nor GNU tar calls fsync, which is the only way to enforce data integrity over NFS.

I tend to agree with this, although I'd say that in practice, from a performance perspective, calling fsync should be more relevant for direct attach. For NFS, don't close-to-open and other aspects of the protocol already enforce much more synchronous operation? For tar x over NFS I'd bet the fsync will be an over-the-wire op (say 0.5 ms) but will not add an additional I/O latency (5 ms) to each file extract.

My target for a single-threaded 'tar x' of small files is to be able to run over NFS at one file per I/O latency, no matter what the backend FS is. I guess that 'star -yes-fsync' over direct attach should behave the same? Or do you have concurrency in there... see below.

> If you would like to test this, use star. Star by default calls fsync before it closes a written file in x mode. To switch this off, use star -no-fsync.
>
> > Net net, for single-threaded 'tar x', data integrity considerations force NFS to provide a high-quality, slow service. For direct attach we don't have those data integrity issues, and the community has managed to get by with the lower-quality, higher-speed service.
>
> What do you have in mind?
>
> A tar that calls fsync in detached threads?

You tell me? We have two issues:

Can we make 'tar x' over direct attach safe (fsync) and POSIX compliant while staying close to current performance characteristics? In other words, do we have the POSIX leeway to extract files in parallel?

For NFS, can we make 'tar x' fast and reliable while keeping a principle of least surprise for users on this non-POSIX FS?

-r
Spencer Shepler
2006-Oct-09 23:58 UTC
[nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
On Tue, Roch wrote:
> Joerg Schilling writes:
> > Why do you believe this?
> >
> > Neither Sun tar nor GNU tar calls fsync, which is the only way to enforce data integrity over NFS.
>
> I tend to agree with this, although I'd say that in practice, from a performance perspective, calling fsync should be more relevant for direct attach. For NFS, don't close-to-open and other aspects of the protocol already enforce much more synchronous operation? For tar x over NFS I'd bet the fsync will be an over-the-wire op (say 0.5 ms) but will not add an additional I/O latency (5 ms) to each file extract.

The close-to-open behavior of NFS clients is what ensures that the file data is on stable storage when close() returns. The metadata requirements of NFS are what ensure that file creations, removals, renames, etc. are on stable storage when the server sends a response.

So, unless the NFS server is behaving badly, the NFS client has synchronous behavior, and for some that means more "safe", but usually it also means slower than local access.

> My target for a single-threaded 'tar x' of small files is to be able to run over NFS at one file per I/O latency, no matter what the backend FS is. I guess that 'star -yes-fsync' over direct attach should behave the same? Or do you have concurrency in there... see below.
>
> You tell me? We have two issues:
>
> Can we make 'tar x' over direct attach safe (fsync) and POSIX compliant while staying close to current performance characteristics? In other words, do we have the POSIX leeway to extract files in parallel?
>
> For NFS, can we make 'tar x' fast and reliable while keeping a principle of least surprise for users on this non-POSIX FS?

Having tar create/write/close files concurrently would be a big win over NFS mounts on almost any system.

Spencer
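A rough sketch of the concurrency Spencer is suggesting, using POSIX threads. All names here are hypothetical, and a real tar or star would also have to feed the workers from the serial archive stream and collect their errors, as discussed later in the thread:

    #include <sys/types.h>
    #include <fcntl.h>
    #include <pthread.h>
    #include <stdlib.h>
    #include <unistd.h>

    struct extract_job {
        char   *path;
        char   *data;
        size_t  len;
    };

    /* Each worker runs the whole create/write/fsync/close sequence for one
     * file, so several synchronous NFS round trips (CREATE, WRITE, COMMIT)
     * are outstanding at once instead of being strictly serialized. */
    static void *extract_worker(void *arg)
    {
        struct extract_job *j = arg;
        long err = 0;
        int fd = open(j->path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

        if (fd < 0) {
            err = 1;
        } else {
            if (write(fd, j->data, j->len) != (ssize_t)j->len || fsync(fd) != 0)
                err = 1;
            if (close(fd) != 0)
                err = 1;
        }
        free(j->data);
        free(j->path);
        free(j);
        return (void *)err;   /* the dispatcher collects this via pthread_join() */
    }

The dispatcher would pthread_create() a bounded number of these workers and pthread_join() them all before declaring the extract successful.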
This is correct, based on our experience with ZFS. When using NFSv3, you have COMMITs that come down the wire to the ZFS/NFS server, which forces a sync or flush to disk. To get around this issue without compromising data integrity, you can effectively get some NVRAM by adding a battery-backed hardware RAID controller to the NFS server and turning on write-back caching for the controller. This speeds things up for tar and small files.

There's an ER open to add NVRAM support to ZFS, but I don't know the state of that.
Frank Batschulat (Home)
2006-Oct-10 04:46 UTC
[nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
On Tue, 10 Oct 2006 01:25:36 +0200, Roch <Roch.Bourbonnais at Sun.COM> wrote:
> You tell me? We have two issues:
>
> Can we make 'tar x' over direct attach safe (fsync) and POSIX compliant while staying close to current performance characteristics? In other words, do we have the POSIX leeway to extract files in parallel?

Why fsync(3C)? It is usually more heavyweight than opening the file with O_SYNC, and both provide POSIX synchronized file integrity completion.

---
frankB
Sanjeev Bagewadi
2006-Oct-10 05:01 UTC
[nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
I think the original point about NFS being better with respect to data making it to the disk was this: NFS follows sync-on-close semantics. You will not see an explicit fsync() being called by tar...

-- Sanjeev.

Frank Batschulat (Home) wrote:
> Why fsync(3C)? It is usually more heavyweight than opening the file with O_SYNC, and both provide POSIX synchronized file integrity completion.
Roch <Roch.Bourbonnais at Sun.COM> wrote:
> > Neither Sun tar nor GNU tar calls fsync, which is the only way to enforce data integrity over NFS.
>
> I tend to agree with this, although I'd say that in practice, from a performance perspective, calling fsync should be more relevant for direct attach. For NFS, don't close-to-open and other aspects of the protocol already enforce much more synchronous operation? For tar x over NFS I'd bet the fsync will be an over-the-wire op (say 0.5 ms) but will not add an additional I/O latency (5 ms) to each file extract.

I have never tested the performance aspects over NFS; in my experience, calling fsync is simply the only way to guarantee detection of file write problems.

> My target for a single-threaded 'tar x' of small files is to be able to run over NFS at one file per I/O latency, no matter what the backend FS is. I guess that 'star -yes-fsync' over direct attach should behave the same? Or do you have concurrency in there... see below.

No, star has not changed in the last 15 years:

- One process (the second) reads/writes the archive. In copy mode, this process extracts the internal stream.
- One process (the first) does the file I/O and the archive generation.

> > If you would like to test this, use star. Star by default calls fsync before it closes a written file in x mode. To switch this off, use star -no-fsync.
> >
> > What do you have in mind?
> >
> > A tar that calls fsync in detached threads?
>
> You tell me? We have two issues:
>
> Can we make 'tar x' over direct attach safe (fsync) and POSIX compliant while staying close to current performance characteristics? In other words, do we have the POSIX leeway to extract files in parallel?

What does fsync have to do with POSIX? When I introduced the fsync calls seven years ago, I ran some performance tests, and on UFS calling fsync reduced extract performance by 10-20%, which looks OK to me.

> For NFS, can we make 'tar x' fast and reliable while keeping a principle of least surprise for users on this non-POSIX FS?

Someone should start with 'star -x' vs. 'star -x -no-fsync' tests and report timings...

Jörg
Joerg Schilling
2006-Oct-12 10:23 UTC
[nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
Spencer Shepler <spencer.shepler at sun.com> wrote:
> The close-to-open behavior of NFS clients is what ensures that the file data is on stable storage when close() returns.

In the 1980s this was definitely not the case. When did this change?

> The metadata requirements of NFS are what ensure that file creations, removals, renames, etc. are on stable storage when the server sends a response.
>
> So, unless the NFS server is behaving badly, the NFS client has synchronous behavior, and for some that means more "safe", but usually it also means slower than local access.

In any case, calling fsync before close does not seem to be a problem.

> Having tar create/write/close files concurrently would be a big win over NFS mounts on almost any system.

Do you have an idea of how to do this?

Jörg
Joerg Schilling
2006-Oct-12 10:27 UTC
[nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
"Frank Batschulat (Home)" <Frank.Batschulat at Sun.COM> wrote:> On Tue, 10 Oct 2006 01:25:36 +0200, Roch <Roch.Bourbonnais at Sun.COM> wrote: > > > You tell me ? We have 2 issues > > > > can we make ''tar x'' over direct attach, safe (fsync) > > and posix compliant while staying close to current > > performance characteristics ? In other words do we > > have the posix leeway to extract files in parallel ? > > why fsync(3C) ? it is usually more heavy weight then > opening the file with O_SYNC - and both provide > POSIX synchronized file integrity completion.I believe that I did run tests that show that fsync is better. J?rg -- EMail:joerg at schily.isdn.cs.tu-berlin.de (home) J?rg Schilling D-13353 Berlin js at cs.tu-berlin.de (uni) schilling at fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
Spencer Shepler
2006-Oct-12 14:18 UTC
[nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
On Thu, Joerg Schilling wrote:
> Spencer Shepler <spencer.shepler at sun.com> wrote:
> > The close-to-open behavior of NFS clients is what ensures that the file data is on stable storage when close() returns.
>
> In the 1980s this was definitely not the case. When did this change?

It has not changed. NFS clients have always flushed (written) modified file data to the server before returning from the application's close(). The NFS client also asks that the data be committed to disk in this case.

> In any case, calling fsync before close does not seem to be a problem.

Not for the NFS client, because the default behavior has the same effect as fsync()/close().

> > Having tar create/write/close files concurrently would be a big win over NFS mounts on almost any system.
>
> Do you have an idea of how to do this?

My naive thought would be to have multiple threads that create and write file data upon extraction. This multithreaded behavior would provide better overall throughput for an extraction, given NFS's response-time characteristics: more outstanding requests result in better throughput. It isn't only the writing of file data to disk that is the overhead of the extraction; the creation of the directories and files must also be committed to disk in the case of NFS. That is the other part that makes things slower than local access.

Spencer
Joerg Schilling
2006-Oct-12 14:35 UTC
[nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
Spencer Shepler <spencer.shepler at sun.com> wrote:
> On Thu, Joerg Schilling wrote:
> > In the 1980s this was definitely not the case. When did this change?
>
> It has not changed. NFS clients have always flushed (written) modified file data to the server before returning from the application's close(). The NFS client also asks that the data be committed to disk in this case.

This is definitely wrong.

Our developers lost many files in the 1980s when the NFS file server filled up the exported filesystem while several NFS clients were trying to write back edited files at the same time.

vi at that time did not call fsync, and for this reason it did not notice that a file could not be written back properly.

What happened: all clients called statfs() and assumed that there was still space on the server, so they all allowed blocks to be put into the local client's buffer cache. vi called close, but the client only noticed the out-of-space problem after close had returned, so vi did not notice that the file was damaged and allowed the user to quit.

Some time later, Sun enhanced vi to first call fsync() and then call close(); only if both return 0 is the file guaranteed to be on the server. Sun also told us to write applications this way in order to prevent lost file content.

> > > Having tar create/write/close files concurrently would be a big win over NFS mounts on almost any system.
> >
> > Do you have an idea of how to do this?
>
> My naive thought would be to have multiple threads that create and write file data upon extraction. This multithreaded behavior would provide better overall throughput for an extraction, given NFS's response-time characteristics: more outstanding requests result in better throughput. It isn't only the writing of file data to disk that is the overhead of the extraction; the creation of the directories and files must also be committed to disk in the case of NFS. That is the other part that makes things slower than local access.

Doing this with tar (which fetches the data from a serial data stream) would only make sense if there were threads whose only task is to wait for the final fsync()/close().

It would also make error handling harder, since a problem may be detected late, while another large file is being extracted. star could not just quit with an error message; it would need to delay the error-caused exit.

Jörg
Spencer Shepler
2006-Oct-12 15:24 UTC
[nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
On Thu, Joerg Schilling wrote:
> Spencer Shepler <spencer.shepler at sun.com> wrote:
> > It has not changed. NFS clients have always flushed (written) modified file data to the server before returning from the application's close(). The NFS client also asks that the data be committed to disk in this case.
>
> This is definitely wrong.
>
> [the vi/statfs/out-of-space history snipped]
>
> Some time later, Sun enhanced vi to first call fsync() and then call close(); only if both return 0 is the file guaranteed to be on the server. Sun also told us to write applications this way in order to prevent lost file content.

I didn't comment on the error conditions that can occur while the data is being written at close(). What you describe is the preferred method of obtaining any errors that occur during the writing of data. This is because the NFS client writes asynchronously, and the only way the application can retrieve the error information is from the fsync() or close() call. At close() it is too late to recover, so fsync() can be used to obtain any asynchronous error state.

This doesn't change the fact that upon close() the NFS client will write data back to the server. This is done to meet the close-to-open semantics of NFS.

> Doing this with tar (which fetches the data from a serial data stream) would only make sense if there were threads whose only task is to wait for the final fsync()/close().
>
> It would also make error handling harder, since a problem may be detected late, while another large file is being extracted. star could not just quit with an error message; it would need to delay the error-caused exit.

Sure, I can see that it would be difficult.
My point is that tar is not only waiting on the fsync()/close() but also on file and directory creation. There is a longer delay not only because of the network latency but also because of the latency of writing the filesystem data to stable storage. Parallel requests will tend to overcome the delay/bandwidth issues. Not easy, but it can be an advantage with respect to performance.

Spencer
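A back-of-envelope illustration of why the parallelism matters; the latencies and file count below are assumed round numbers for a spinning-disk server, not measurements from this thread:

    roughly 2 synchronous server ops per file (create + commit) x ~5 ms each  = ~10 ms per file
    ~20,000 small files x ~10 ms                                              = ~200 s single-threaded
    the same work spread across 8 concurrent extract streams                  = ~25 s, if the server keeps up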
Anton B. Rang
2006-Oct-12 21:42 UTC
[zfs-discuss] Re: [nfs-discuss] Re: Re: NFS Performance and Tar
fsync() should theoretically be better, because O_SYNC requires that each write() write not only the data but also the inode and all indirect blocks back to disk.
Neil Perrin
2006-Oct-13 04:56 UTC
[zfs-discuss] Re: [nfs-discuss] Re: Re: NFS Performance and Tar
As far as ZFS performance is concerned, O_DSYNC and O_SYNC are equivalent. This is because ZFS saves all POSIX-layer transactions (e.g. WRITE, SETATTR, RENAME...) in the log, so both metadata and data are always re-created if a replay is needed.

Anton B. Rang wrote on 10/12/06 15:42:
> fsync() should theoretically be better, because O_SYNC requires that each write() write not only the data but also the inode and all indirect blocks back to disk.
Joerg Schilling
2006-Oct-13 09:46 UTC
[nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
Spencer Shepler <spencer.shepler at sun.com> wrote:
> I didn't comment on the error conditions that can occur while the data is being written at close(). What you describe is the preferred method of obtaining any errors that occur during the writing of data. This is because the NFS client writes asynchronously, and the only way the application can retrieve the error information is from the fsync() or close() call. At close() it is too late to recover, so fsync() can be used to obtain any asynchronous error state.
>
> This doesn't change the fact that upon close() the NFS client will write data back to the server. This is done to meet the close-to-open semantics of NFS.

Your wording did not match reality; this is why I wrote what I did. You wrote that upon close() the client will first do something similar to an fsync on that file. The problem is that this is done asynchronously, and the close() return value does not contain an indication of whether the fsync succeeded.

> Sure, I can see that it would be difficult. My point is that tar is not only waiting on the fsync()/close() but also on file and directory creation. There is a longer delay not only because of the network latency but also because of the latency of writing the filesystem data to stable storage. Parallel requests will tend to overcome the delay/bandwidth issues. Not easy, but it can be an advantage with respect to performance.

I see no simple way to let tar implement concurrency with respect to these problems.

In star, it would be possible to create detached threads that work independently on small files whose total size is smaller than the size of the FIFO. That would, however, make the code much more complex.

Jörg
Spencer Shepler
2006-Oct-13 14:01 UTC
[nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
On Fri, Joerg Schilling wrote:
> Spencer Shepler <spencer.shepler at sun.com> wrote:
> > This doesn't change the fact that upon close() the NFS client will write data back to the server. This is done to meet the close-to-open semantics of NFS.
>
> Your wording did not match reality; this is why I wrote what I did. You wrote that upon close() the client will first do something similar to an fsync on that file. The problem is that this is done asynchronously, and the close() return value does not contain an indication of whether the fsync succeeded.

Sorry, but the code in Solaris behaves as I described. Upon the application closing the file, modified data is written to the server. The client waits for completion of those writes. If there is an error, it is returned to the caller of close().

Spencer
Jeff Victor
2006-Oct-13 14:08 UTC
[nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
Spencer Shepler wrote:
> On Fri, Joerg Schilling wrote:
> > Your wording did not match reality; this is why I wrote what I did. You wrote that upon close() the client will first do something similar to an fsync on that file. The problem is that this is done asynchronously, and the close() return value does not contain an indication of whether the fsync succeeded.
>
> Sorry, but the code in Solaris behaves as I described. Upon the application closing the file, modified data is written to the server. The client waits for completion of those writes. If there is an error, it is returned to the caller of close().

Are you talking about the client end of NFS as implemented in Solaris, or about "application clients" like vi?

It seems to me that you are talking about Solaris, and Joerg is talking about vi (and other applications).

-- Jeff Victor
Joerg Schilling
2006-Oct-13 14:11 UTC
[nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
Spencer Shepler <spencer.shepler at sun.com> wrote:
> Sorry, but the code in Solaris behaves as I described. Upon the application closing the file, modified data is written to the server. The client waits for completion of those writes. If there is an error, it is returned to the caller of close().

So is this Solaris specific, or why are people warned against depending on the close() return code alone?

Jörg
Joerg Schilling
2006-Oct-13 14:12 UTC
[nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
Jeff Victor <Jeff.Victor at Sun.COM> wrote:
> Are you talking about the client end of NFS as implemented in Solaris, or about "application clients" like vi?
>
> It seems to me that you are talking about Solaris, and Joerg is talking about vi (and other applications).

I am talking about the syscall interface to applications.

Jörg
Spencer Shepler
2006-Oct-13 14:12 UTC
[nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
On Fri, Jeff Victor wrote:
> Are you talking about the client end of NFS as implemented in Solaris, or about "application clients" like vi?
>
> It seems to me that you are talking about Solaris, and Joerg is talking about vi (and other applications).

The NFS client.

Spencer
Spencer Shepler
2006-Oct-13 14:22 UTC
[nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
On Fri, Joerg Schilling wrote:
> So is this Solaris specific, or why are people warned against depending on the close() return code alone?

All Unix NFS clients that I know of behave the way I described. I believe the point of the warning about relying on close() is that by the time the application receives the error it is too late to recover. If the application uses fsync() and receives an error, it can warn the user, who may be able to do something about it (your example of ENOSPC is a very good one). Space can be freed, the fsync() can be done again, and the client will again push the writes to the server, this time successfully.

If an application doesn't care about recovery but wants the error reported back to the user, then close() is sufficient.

Spencer
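A sketch of the recovery Spencer describes; the helper and its free_space callback are hypothetical. Because the client still holds the dirty pages after a failed writeback, a later fsync() can push them again once the condition (for example ENOSPC) has been cleared:

    #include <errno.h>
    #include <unistd.h>

    /* Try to get the cached writes onto the server's stable storage.
     * On ENOSPC, give the caller a chance to free space and retry the
     * fsync(); any other error, or a failed cleanup, is reported. */
    static int flush_with_retry(int fd, int (*free_space)(void))
    {
        for (;;) {
            if (fsync(fd) == 0)
                return close(fd);           /* success path */
            if (errno != ENOSPC || free_space() != 0)
                break;                      /* give up: unrecoverable */
        }
        (void)close(fd);                    /* best effort; error already noted */
        return -1;
    }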
Anton B. Rang
2006-Oct-13 19:45 UTC
[zfs-discuss] Re: [nfs-discuss] Re: Re: NFS Performance and Tar
For what it's worth, close-to-open consistency was added to Linux NFS in the 2.4.20 kernel (late 2002 timeframe). This might be the source of some of the confusion.
Roch
2006-Oct-14 01:41 UTC
[zfs-discuss] Re: [nfs-discuss] Re: Re: NFS Performance and Tar
The high-order bit here is that write(); write(); fsync(); can be executed with a single I/O latency (during the fsync), whereas using O_*DSYNC requires two I/O latencies (one for each write).

-r

Neil Perrin writes:
> As far as ZFS performance is concerned, O_DSYNC and O_SYNC are equivalent. This is because ZFS saves all POSIX-layer transactions (e.g. WRITE, SETATTR, RENAME...) in the log, so both metadata and data are always re-created if a replay is needed.
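Two hypothetical fragments that make Roch's point concrete: with O_DSYNC each write() must reach stable storage before it returns, while buffered writes followed by a single fsync() pay the synchronous latency once (error checks on the writes are trimmed for brevity):

    #include <sys/types.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Variant A: O_DSYNC -- every write() waits for stable storage,
     * so two writes cost two synchronous I/O latencies. */
    static int write_dsync(const char *path,
                           const char *a, size_t alen,
                           const char *b, size_t blen)
    {
        int fd = open(path, O_WRONLY | O_CREAT | O_DSYNC, 0644);
        if (fd < 0)
            return -1;
        (void)write(fd, a, alen);          /* waits */
        (void)write(fd, b, blen);          /* waits again */
        return close(fd);
    }

    /* Variant B: buffered writes plus one fsync() -- the synchronous
     * latency is paid once, when the fsync() flushes both writes. */
    static int write_then_fsync(const char *path,
                                const char *a, size_t alen,
                                const char *b, size_t blen)
    {
        int fd = open(path, O_WRONLY | O_CREAT, 0644);
        if (fd < 0)
            return -1;
        (void)write(fd, a, alen);          /* cached */
        (void)write(fd, b, blen);          /* cached */
        (void)fsync(fd);                   /* single synchronous flush */
        return close(fd);
    }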
Erblichs
2006-Oct-14 01:45 UTC
[zfs-discuss] zfs_vfsops.c : zfs_vfsinit() : line 1179: Src inspection
Group,

If there is a bad vfs ops template, why wouldn't you just return(error) instead of trying to create the vnode ops template? My suggestion: after the cmn_err(), return(error).

Mitchell Erblich
Richard L. Hamilton
2007-Jan-10 16:25 UTC
[nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
Assuming multiple threads extracting multiple files simultaneously from a single archive, won't the results be indeterminate if a quota or out-of-space condition comes into play? That is, normally the sequentially earlier file might get the space while the later one wouldn't; but if multithreaded, either (or both) could fail to get all they needed. That doesn't even consider I/O errors on either the archive or the files being extracted.

What about programs that postprocess the messages from tar? (There must be some, meant to give it a "friendlier" interface or for other purposes.)
This is an old topic, discussed many times at length. However, I still wonder whether there are any workarounds to this issue other than disabling the ZIL, since it makes ZFS over NFS almost unusable (a whole order of magnitude slower). My understanding is that the ball is in NFS's hands because of ZFS's design. The testing results are below.

Solaris 10u3 AMD64 server with a Mac client over gigabit Ethernet. The filesystem is on a 6-disk raidz1 pool; the test is the performance of untarring (with bzip2) the Linux 2.6.21 source code. The archive is stored locally and extracted remotely.

Locally
-------
tar xfvj linux-2.6.21.tar.bz2
real 4m4.094s, user 0m44.732s, sys 0m26.047s

star xfv linux-2.6.21.tar.bz2
real 1m47.502s, user 0m38.573s, sys 0m22.671s

Over NFS
--------
tar xfvj linux-2.6.21.tar.bz2
real 48m22.685s, user 0m45.703s, sys 0m59.264s

star xfv linux-2.6.21.tar.bz2
real 49m13.574s, user 0m38.996s, sys 0m35.215s

star -no-fsync -x -v -f linux-2.6.21.tar.bz2
real 49m32.127s, user 0m38.454s, sys 0m36.197s

The performance seems pretty bad; let's see how other protocols fare.

Over Samba
----------
tar xfvj linux-2.6.21.tar.bz2
real 4m34.952s, user 0m44.325s, sys 0m27.404s

star xfv linux-2.6.21.tar.bz2
real 4m2.998s, user 0m44.121s, sys 0m29.214s

star -no-fsync -x -v -f linux-2.6.21.tar.bz2
real 4m13.352s, user 0m44.239s, sys 0m29.547s

Over AFP
--------
tar xfvj linux-2.6.21.tar.bz2
real 3m58.405s, user 0m43.132s, sys 0m40.847s

star xfv linux-2.6.21.tar.bz2
real 19m44.212s, user 0m38.535s, sys 0m38.866s

star -no-fsync -x -v -f linux-2.6.21.tar.bz2
real 3m21.976s, user 0m42.529s, sys 0m39.529s

Samba and AFP are much faster, except for the fsync'ed star over AFP. Is this a ZFS or an NFS issue?

Over NFS to non-ZFS drive
-------------------------
tar xfvj linux-2.6.21.tar.bz2
real 5m0.211s, user 0m45.330s, sys 0m50.118s

star xfv linux-2.6.21.tar.bz2
real 3m26.053s, user 0m43.069s, sys 0m33.726s

star -no-fsync -x -v -f linux-2.6.21.tar.bz2
real 3m55.522s, user 0m42.749s, sys 0m35.294s

It looks like ZFS is the culprit here. The untarring is much faster to a single 80 GB UFS drive than to the 6-disk raidz array over NFS.

Cheers,
Siegfried

PS. Getting netatalk to compile on amd64 Solaris required some changes, since i386 wasn't being defined anymore, and somehow it thought the architecture was sparc64 for some linking steps.
Hi Siegfried, just making sure you had seen this:

http://blogs.sun.com/roch/entry/nfs_and_zfs_a_fine

You have very fast NFS-to-non-ZFS runs. That seems possible only if the hosting OS did not sync the data when NFS required it, or if the drive in question had some fast write cache. If the drive did have some FWC and ZFS was still slow using it, that would be the cache-flushing issue mentioned in the blog entry. But maybe there is also something to be learned from the Samba and AFP results...

Takeaways:

  ZFS and NFS just work together.

  ZFS has an open issue with some storage arrays (the issue is *not* related to NFS); it is being worked on and will need collaboration from storage vendors.

  NFS is slower than direct attached storage. It can be very, very much slower on single-threaded loads. There are many ways to work around the slowness, but most are just not safe for your data.

-r

Siegfried Nikolaivich writes:
> This is an old topic, discussed many times at length. However, I still wonder whether there are any workarounds to this issue other than disabling the ZIL, since it makes ZFS over NFS almost unusable (a whole order of magnitude slower).
>
> [benchmark results snipped; see the previous message]
>
> It looks like ZFS is the culprit here. The untarring is much faster to a single 80 GB UFS drive than to the 6-disk raidz array over NFS.
> Over NFS to non-ZFS drive
> -------------------------
> tar xfvj linux-2.6.21.tar.bz2
> real 5m0.211s, user 0m45.330s, sys 0m50.118s
>
> star xfv linux-2.6.21.tar.bz2
> real 3m26.053s, user 0m43.069s, sys 0m33.726s
>
> star -no-fsync -x -v -f linux-2.6.21.tar.bz2
> real 3m55.522s, user 0m42.749s, sys 0m35.294s
>
> It looks like ZFS is the culprit here. The untarring is much faster to a single 80 GB UFS drive than to the 6-disk raidz array over NFS.

Comparing a ZFS pool made out of a single disk to a single UFS filesystem would be a fair comparison.

What does your storage look like?

eric
On Jun 12, 2007, at 12:57 AM, Roch - PAE wrote:
> Hi Siegfried, just making sure you had seen this:
>
> http://blogs.sun.com/roch/entry/nfs_and_zfs_a_fine
>
> [...]
>
> Takeaways:
>
>   ZFS and NFS just work together.
>
>   ZFS has an open issue with some storage arrays (the issue is *not* related to NFS); it is being worked on and will need collaboration from storage vendors.
>
>   NFS is slower than direct attached storage. It can be very, very much slower on single-threaded loads.

Roch knows this, but just to point it out for others following the discussion... In this case (single-threaded file creates) NFS is slower. However, NFS can run at 1 GbE wire speed, which can be faster than your disks (depending on how many spindles you have and whether you've striped them for performance).

> There are many ways to work around the slowness, but most are just not safe for your data.

Yeah, the Samba numbers were interesting... so I guess it's OK in CIFS for the client to be out of sync with the server? That is, I wonder how they handle the case where the client creates a file, the server replies OK without the data/metadata going to stable storage, the server crashes, comes back up, and the created file is not on stable storage, but the client (and its app) thinks it exists... I would really like to know the details of CIFS behavior compared to NFS.

eric
eric kustarz wrote:
>> Over NFS to non-ZFS drive
>> -------------------------
>> [Siegfried's non-ZFS timings snipped]
>>
>> It looks like ZFS is the culprit here. The untarring is much faster to a single 80 GB UFS drive than to the 6-disk raidz array over NFS.
>
> Comparing a ZFS pool made out of a single disk to a single UFS filesystem would be a fair comparison.

Right, and to be fairer you need to ensure that the disk write cache is disabled (format -e) when testing UFS, as UFS does no flushing of the cache.
On 12-Jun-07, at 9:02 AM, eric kustarz wrote:
> Comparing a ZFS pool made out of a single disk to a single UFS filesystem would be a fair comparison.
>
> What does your storage look like?

The storage looks like:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c0t0d0  ONLINE       0     0     0
            c0t1d0  ONLINE       0     0     0
            c0t2d0  ONLINE       0     0     0
            c0t4d0  ONLINE       0     0     0
            c0t5d0  ONLINE       0     0     0
            c0t6d0  ONLINE       0     0     0

All disks are local SATA/300 drives on the marvell SATA framework card. The SATA drives are consumer drives with 16 MB of cache.

I agree it's not a fair comparison, especially with raidz over 6 drives. However, a performance difference of 10x is fairly large.

I do not have a single drive available to test ZFS with and compare against UFS, but I have done similar tests in the past with one ZFS drive (without write cache, etc.) vs. a UFS drive of the same brand and size. The difference was still on the order of 10x slower for the ZFS drive over NFS. What could cause such a large difference? Is there a way to measure NFS COMMIT latency?

Cheers,
Siegfried
On Jun 13, 2007, at 9:22 PM, Siegfried Nikolaivich wrote:
> On 12-Jun-07, at 9:02 AM, eric kustarz wrote:
>> Comparing a ZFS pool made out of a single disk to a single UFS filesystem would be a fair comparison.
>>
>> What does your storage look like?
>
> [pool layout snipped]
>
> All disks are local SATA/300 drives on the marvell SATA framework card. The SATA drives are consumer drives with 16 MB of cache.
>
> I agree it's not a fair comparison, especially with raidz over 6 drives. However, a performance difference of 10x is fairly large.
>
> I do not have a single drive available to test ZFS with and compare against UFS, but I have done similar tests in the past with one ZFS drive (without write cache, etc.) vs. a UFS drive of the same brand and size. The difference was still on the order of 10x slower for the ZFS drive over NFS. What could cause such a large difference? Is there a way to measure NFS COMMIT latency?

You should do the comparison on a single drive. For ZFS, enable the write cache, as it's safe to do so. For UFS, disable the write cache. Make sure you're on non-debug bits.

eric