Hi,

I deployed ZFS on our mailserver recently, hoping for eternal peace after
running on UFS and moving files around with each TB added.

It is a mailserver - its mdirs are on a ZFS pool:

                             capacity     operations    bandwidth
pool                       used  avail   read  write   read  write
-------------------------  -----  -----  -----  -----  -----  -----
mailstore                  3.54T  2.08T    280    295  7.10M  5.24M
  mirror                    590G   106G     34     31   676K   786K
    c6t3d0                     -      -     14     16   960K   773K
    c8t22260001552EFE2Cd0      -      -     16     18  1.06M   786K
  mirror                    613G  82.9G     51     37  1.44M   838K
    c6t3d1                     -      -     20     19  1.57M   824K
    c5t1d1                     -      -     20     24  1.40M   838K
    c8t227C0001559A761Bd0      -      -      5    101   403K  4.63M
  mirror                    618G  78.3G    133     60  6.23M   361K
    c6t3d2                     -      -     40     27  3.21M   903K
    c4t2d0                     -      -     23     81  1.91M  2.98M
    c8t221200015599F2CFd0      -      -      6    108   442K  4.71M
  mirror                    613G  83.2G    110     51  3.66M   337K
    c6t3d3                     -      -     36     25  2.72M   906K
    c5t2d1                     -      -     29     65  1.80M  2.92M
  mirror                    415G  29.0G     30     28   460K   278K
    c6t3d4                     -      -     11     19   804K   268K
    c4t1d2                     -      -     15     22   987K   278K
  mirror                    255G   441G     26     49   536K  1.02M
    c8t22110001552F3C46d0      -      -     12     27   835K  1.02M
    c8t224B0001559BB471d0      -      -     12     29   835K  1.02M
  mirror                    257G   439G     32     52   571K  1.04M
    c8t22480001552D7AF8d0      -      -     14     28  1003K  1.04M
    c4t1d0                     -      -     14     32  1002K  1.04M
  mirror                    251G   445G     28     53   543K  1.02M
    c8t227F0001552CB892d0      -      -     13     28   897K  1.02M
    c8t22250001559830A5d0      -      -     13     30   897K  1.02M
  mirror                   17.4G   427G     22     38   339K   393K
    c8t22FA00015529F784d0      -      -      9     19   648K   393K
    c5t2d2                     -      -      9     23   647K   393K

It is 3x dual-iSCSI + 2x dual-SCSI DAS arrays (RAID0, 13x250).

I have a problem, however:
The 2 SCSI arrays were able to handle the mail traffic fine with UFS on them.
The new config with 3 additional arrays seems to have a problem under ZFS.
Writes are waiting 10-15 seconds to get to disk, so the queue fills very
quickly; reads are quite OK.
I assume this is the problem of ZFS preferring reads to writes.

I also see in 'zpool iostat -v 1' that writes are issued to disk only once
every 10 seconds or so, and then it is ~2000 requests in one second.
Reads are sustained at about 800 req/s.

Is there a way to tune this read/write ratio? Is this a known problem?

I tried to change vq_max_pending as suggested by Eric in
http://blogs.sun.com/erickustarz/entry/vq_max_pending
but it made no change in this write behaviour.

iostat shows about 20-30 ms asvc_t, 0 %w, and about 30 %b on all drives, so
they do not seem saturated (before, with UFS, they were at 90 %b, 1 %w).

System is Sol 10 U2, Sun x4200, 4GB RAM.

Please, if you could give me some hint to really make this work, as the way
back to UFS is almost impossible on a live system.

-- Ivan Debnár
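A rough way to confirm that the stalls are synchronous-write waits rather
than device saturation is to time fsync() directly with DTrace. The
following is only a sketch and assumes fsync(3C) is serviced by the fdsync
syscall on this Solaris build:

    dtrace -n '
    syscall::fdsync:entry { self->ts = timestamp; }
    syscall::fdsync:return /self->ts/ {
        /* latency distribution per calling process */
        @lat["fsync latency (ns)", execname] = quantize(timestamp - self->ts);
        self->ts = 0;
    }'

If the distribution shows multi-second outliers that line up with the
10-second write bursts in 'zpool iostat -v 1', the delay is in the
transaction-group sync path rather than in the disks themselves.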
Ivan,

What mail clients use your mail server?  You may be seeing the effects of:

    6440499 zil should avoid txg_wait_synced() and use dmu_sync() to issue
            parallel IOs when fsyncing

This bug was fixed in Nevada build 43, and I don't think it made it into s10
update 2.  It will, of course, be in update 3 and be available in a patch at
some point.

Ivan Debnár wrote:
> I deployed ZFS on our mailserver recently, hoping for eternal peace after
> running on UFS and moving files around with each TB added.
> [...]
> Writes are waiting 10-15 seconds to get to disk, so the queue fills very
> quickly; reads are quite OK.
> I assume this is the problem of ZFS preferring reads to writes.
> [...]
Hi, thanks for the reply.

The load is like this:
  - 20 msg/s incoming
  - 400 simultaneous IMAP connections (select, search, fetch-env)
  - 60 new websessions/s
  - 100 simultaneous POP3 connections

Is there a way to get that "patch" to try?  Things are really getting worse
down here :-(

It might make sense, since the mail server is probably syncing files before
releasing them from the queue.

Any workaround?

Ivan

-----Original Message-----
From: Mark.Maybee at Sun.COM [mailto:Mark.Maybee at Sun.COM]
Sent: Thursday, September 07, 2006 4:54 PM
To: Ivan Debnár
Cc: zfs-discuss at opensolaris.org
Subject: Re: [zfs-discuss] Performance problem of ZFS ( Sol 10U2 )

What mail clients use your mail server?  You may be seeing the effects of:

    6440499 zil should avoid txg_wait_synced() and use dmu_sync() to issue
            parallel IOs when fsyncing

This bug was fixed in Nevada build 43, and I don't think it made it into s10
update 2.  It will, of course, be in update 3 and be available in a patch at
some point.
[...]
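Sketch only, not from the thread: a quick way to track whether the fix has
reached a given machine once Sun announces the patch. The patch number for
6440499 is not named here, so PATCHID below is a placeholder to fill in
later:

    uname -v                        # running kernel build
    cat /etc/release                # Solaris 10 update level
    PATCHID="TBD"                   # placeholder; set once the patch number is announced
    showrev -p | grep "$PATCHID"    # prints nothing until the patch is installed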
Ivan Debnár wrote:
> I deployed ZFS on our mailserver recently, hoping for eternal peace after
> running on UFS and moving files around with each TB added.
> [...]
> Writes are waiting 10-15 seconds to get to disk, so the queue fills very
> quickly; reads are quite OK.

Are those synchronous writes or asynchronous?  If both, what are the
percentages of each?

Neil just putback a fix into snv_48 for:

    6413510 zfs: writing to ZFS filesystem slows down fsync() on other files
            in the same FS

Basically the fsync/synchronous writes end up doing more work than they
should - instead of writing the data and meta-data for just the file you're
trying to fsync, you will write (and wait for) other files' data & meta-data
too.

eric
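One way to answer the synchronous/asynchronous question is to count write(2)
calls against fsync-style calls issued by the mail server over fixed
intervals. This is only a sketch; "CGServer" is a guess at the CommuniGate
Pro process name and should be replaced with whatever ps actually reports:

    dtrace -n '
    syscall::write:entry  /execname == "CGServer"/ { @c["write"]  = count(); }
    syscall::fdsync:entry /execname == "CGServer"/ { @c["fdsync"] = count(); }
    tick-10s { printa(@c); trunc(@c); }'

The ratio of the two counters over a few 10-second windows gives a rough
feel for how much of the write load is synchronous.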
Hi, thanks for the response.

As this is a closed-source mailserver (CommuniGate Pro), I can't give a 100%
answer, but the writes that I see taking too much time (15-30 secs) are
writes from the temp queue to the final storage, and from my understanding
they are sync, so the queue manager can guarantee they are on solid storage.

Apart from that, however, the interactive access to the mailstore modifies
filenames a lot (the read/opened/deleted flags are part of the filename of
each file); the content of the files is not changed any more.  Also, moves
between directories and deletions are only directory operations (I don't
know whether those are sync or not - you know the OS internals better).

From the description of the error you mentioned, and also

    6440499 zil should avoid txg_wait_synced() and use dmu_sync() to issue
            parallel IOs when fsyncing

I think that this may also be my case.

So my question is: I run Sol10U2 - is there a way to quickly test the new
ZFS without reinstalling the whole system?
Please say there is....

Ivan

-----Original Message-----
From: eric kustarz [mailto:eric.kustarz at sun.com]
Sent: Thursday, September 07, 2006 8:39 PM
To: Ivan Debnár
Cc: zfs-discuss at opensolaris.org
Subject: Re: [zfs-discuss] Performance problem of ZFS ( Sol 10U2 )

Are those synchronous writes or asynchronous?  If both, what are the
percentages of each?

Neil just putback a fix into snv_48 for:

    6413510 zfs: writing to ZFS filesystem slows down fsync() on other files
            in the same FS
[...]
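On the "quickly test a newer ZFS without reinstalling" question, one route
available on Solaris 10 is Live Upgrade: create an alternate boot
environment, upgrade only that BE to a newer build, and reboot into it,
leaving the current root untouched as a fallback. A rough sketch; the slice
and install-image paths are placeholders for whatever is actually spare on
the x4200:

    lucreate -c s10u2 -n zfs_test -m /:/dev/dsk/c0t0d0s4:ufs   # alternate BE on a spare slice
    luupgrade -u -n zfs_test -s /net/installserver/latest      # upgrade only that BE
    luactivate zfs_test                                        # select it for the next boot
    init 6                                                     # reboot into it; 'luactivate s10u2' to fall back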
Ivan Debnár wrote:
> From the description of the error you mentioned, and also
>
>     6440499 zil should avoid txg_wait_synced() and use dmu_sync() to issue
>             parallel IOs when fsyncing
>
> I think that this may also be my case.
>
> So my question is: I run Sol10U2 - is there a way to quickly test the new
> ZFS without reinstalling the whole system?
> Please say there is....

Ivan,

Patches with the bug fixes should become available in a couple of weeks.
We will let you know as soon as they are available.

-Mark
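Once those patches appear, they can be added either to the running root with
patchadd or, continuing the Live Upgrade route sketched earlier, to the
alternate boot environment so the live mail server is untouched until the
reboot. Sketch only; PATCH-ID and the patch directory are placeholders:

    luupgrade -t -n zfs_test -s /var/tmp/patches PATCH-ID   # patch the alternate BE
    patchadd /var/tmp/patches/PATCH-ID                      # or patch the live root (needs a maintenance window)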