Hey all!

Recently I've decided to implement OpenSolaris as a target for BackupExec.

The server I've converted into a "Storage Appliance" is an IBM x3650 M2 w/ ~4TB of onboard storage via ~10 local SATA drives, and I'm using OpenSolaris snv_134. I'm using a QLogic 4Gb FC HBA w/ the QLT driver and presented an 8TB sparse volume to the host, since dedup and compression are turned on for the zpool.

When writes begin, I see anywhere from 4.5GB/Min to 5.5GB/Min, and then it drops off quickly (I mean down to 1GB/Min or less). I've already swapped out the card, cable, and port with no results. I have since ensured that every piece of equipment on the box had its firmware updated. While doing so, I installed Windows Server 2008 to flash all the firmware (IBM doesn't have a Solaris installer).

While in Server 2008, I decided to just attempt a backup via a share on the 1Gb/s copper connection. I saw speeds of up to 5.5GB/Min consistently, and they were sustained throughout 3 days of testing. Today I decided to move back to OpenSolaris with confidence. All writes began at 5.5GB/Min and quickly dropped off.

In my troubleshooting efforts, I have also dropped the fiber connection and made it an iSCSI target, with no performance gains. I have let the onboard RAID controller do the RAID portion instead of creating a zpool of multiple disks, with no performance gains. And I have created the target LUN using both rdsk and dsk paths.

I did notice today, though, that there is a direct correlation between the ARC memory usage and speed. Using arcstat.pl, as soon as arcsz hits 1G (half of the c column [commit?]), my throughput hits the floor (i.e. 600MB/Min or less). I can't figure it out. I tried every configuration possible.

-- This message posted from opensolaris.org
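To put the GB/Min figures above next to link speeds, they can be converted to MB/s. This is only a rough sketch: it assumes decimal units (1 GB = 1000 MB) and ignores protocol overhead, and the function names are made up for illustration.

```python
# Convert backup throughput from GB/min to MB/s.
# Assumption: decimal units (1 GB = 1000 MB); overhead is ignored.

def gb_per_min_to_mb_per_s(gb_per_min: float) -> float:
    """GB per minute -> MB per second."""
    return gb_per_min * 1000.0 / 60.0

def link_mb_per_s(gbit_per_s: float) -> float:
    """Raw line rate of a link in MB/s (no protocol or encoding overhead)."""
    return gbit_per_s * 1000.0 / 8.0

# Rates reported in the post above.
rates = {"initial": 5.5, "degraded": 1.0, "floor": 0.6}
for label, rate in rates.items():
    print(f"{label}: {rate} GB/min = {gb_per_min_to_mb_per_s(rate):.1f} MB/s")

print(f"1 Gb/s link ceiling: {link_mb_per_s(1.0):.0f} MB/s")
```

Under those assumptions the initial rate is roughly 92 MB/s, below the 125 MB/s raw ceiling of a 1 Gb/s link, while the floor is about 10 MB/s.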
Khushil Dep
2010-Nov-15 23:35 UTC
[zfs-discuss] ZFS - Sudden decrease in write performance
Set your txg_synctime_ms to 0x3000 and retest please?
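For reference, the hexadecimal value suggested here can be read in decimal as follows. This sketch assumes the value is meant in milliseconds, as the tunable's name suggests.

```python
# 0x3000 is hexadecimal; interpret it as a millisecond count (assumption
# based on the "_ms" suffix of the tunable name).
synctime_ms = int("0x3000", 16)

print(synctime_ms)           # 12288
print(synctime_ms / 1000.0)  # 12.288 (seconds)
```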
Louis Carreiro
2010-Nov-16 00:20 UTC
[zfs-discuss] ZFS - Sudden decrease in write performance
Almost! It seems like it held out a bit further than last time. Now "arcsz" hits 2G (matching 'c'), but it still drops off. It started at 5.6GB/Min and fell off to less than 700MB/Min.

A snippet of my arcstat.pl output looks like the following:

    Time      read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz  c
    19:14:31  14K   283   1      283   2    0     0    283   1    2G     2G
    19:14:32  45K   120   0      102   0    18    0    120   0    2G     2G
    19:14:33  9K    228   2      213   2    15    0    223   2    2G     2G
    19:14:34  14K   285   2      274   2    11    0    285   2    2G     2G
    19:14:35  14K   294   1      276   2    18    0    294   1    2G     2G

The above is what it looks like when my speed falls off. Is txg_synctime_ms something I can tweak, or is what you suggested a normal value? I've read a few articles that have mentioned values lower than 12288 ms.
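One way to line up the slowdown with the ARC filling is to parse the arcstat.pl rows and flag samples where arcsz has reached c. The following is a minimal sketch, not part of arcstat.pl: the column layout is taken from the output quoted above, while the helper names and the treatment of K/M/G as powers of 1024 are assumptions.

```python
# Parse arcstat.pl-style rows and flag samples where arcsz has reached c.
# Assumptions: K/M/G suffixes are powers of 1024, and the first column is a
# timestamp. Abbreviated values are rounded, so "2G >= 2G" is only approximate.

SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def to_number(field: str) -> int:
    """Turn '14K' or '2G' into an integer; plain digits pass through."""
    if field[-1] in SUFFIXES:
        return int(float(field[:-1]) * SUFFIXES[field[-1]])
    return int(field)

def parse_arcstat(text: str) -> list:
    """Return one dict per data row, keyed by the header names."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    samples = []
    for row in rows:
        sample = {"Time": row[0]}
        for name, field in zip(header[1:], row[1:]):
            sample[name] = to_number(field)
        samples.append(sample)
    return samples

SAMPLE = """
Time      read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz  c
19:14:31  14K   283   1      283   2    0     0    283   1    2G     2G
19:14:32  45K   120   0      102   0    18    0    120   0    2G     2G
"""

for s in parse_arcstat(SAMPLE):
    at_target = s["arcsz"] >= s["c"]
    print(s["Time"], "ARC at target" if at_target else "ARC below target")
```

Feeding it a longer capture alongside timestamps from the backup job would show whether the throughput drop really coincides with arcsz reaching c.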
Khushil Dep
2010-Nov-16 00:27 UTC
[zfs-discuss] ZFS - Sudden decrease in write performance
That controls ZFS breathing. I'm on a phone writing this, so I hope you won't mind me pointing you to listware.net/201005/opensolaris-zfs/115564-zfs-discuss-small-stalls-slowing-down-rsync-from-holding-network-saturation-every-5-seconds.html
Khushil Dep
2010-Nov-16 00:37 UTC
[zfs-discuss] ZFS - Sudden decrease in write performance
Points to check are iostat, fsstat, zilstat, mpstat, and prstat. Check for sw interrupt sharing, and disable ohci.
Louis Carreiro
2010-Nov-16 01:46 UTC
[zfs-discuss] ZFS - Sudden decrease in write performance
Thanks for pointing me towards that site! Saying that "txg_synctime_ms" controls ZFS's breathing was how I was thinking about it. Great way to describe it!

Unfortunately, setting txg_synctime_ms to 1000 or even 1 didn't make an improvement. I tried adding "disable-ohci=true" to the GRUB boot menu via SSH, and the box didn't come back from its reboot, so I'm not going to be able to do much more tonight (I'm working remotely).

I do notice that when the ARC size reaches capacity, that's when things slow down. Also, it never appears to drop after I kill the IO. If I stop all IO, arcstat shows every number dropping except arcsz. Should arcsz drop at all?
Richard L. Hamilton
2010-Nov-20 15:43 UTC
[zfs-discuss] ZFS - Sudden decrease in write performance
arc-discuss doesn't have anything specifically to do with ZFS; in particular, it has nothing to do with the ZFS ARC. Just an unfortunate overlap of acronyms. Cross-posted to zfs-discuss, where this probably belongs.
This is probably the same write throttle issue which has been discussed at length on this list. Check the archives for zfs write throttle, especially posts from Jun-Jul 2009.

Suresh