Hi all,

I've noticed something strange in the throughput in my zpool between different snv builds, and I'm not sure if it's an inherent difference in the build or a kernel parameter that differs between the builds. I've set up two similar machines and this happens with both of them. Each system has 16 2TB Samsung HD203WI drives (total) directly connected to two LSI 3081E-R 1068e cards with IT firmware, in one raidz3 vdev.

In both computers, after a fresh installation of snv 134, the throughput peaks at about 300 MB/s during a scrub or something like "dd if=/dev/zero bs=1024k of=bigfile".

If I bfu to snv 138, I then get throughput of about 700 MB/s with both a scrub and a single-thread dd.

I assumed at first this was some sort of bug or regression in 134 that made it slow. However, I've now also tested, from the fresh 134 installation, compiling the OS/Net build 143 from the Mercurial repository and booting into it, after which the dd throughput is still only about 300 MB/s, just like snv 134. The scrub throughput in 143 is even slower, rarely surpassing 150 MB/s. I wonder if the scrubbing being extra slow here is related to the additional statistics displayed during the scrub that didn't used to be shown.

Is there some kind of debug option that might be enabled in the 134 build and persist if I compile snv 143, but which would be off if I installed 138 through bfu? If not, it makes me think the bfu to 138 is changing the configuration somewhere to make it faster, rather than fixing a bug or toggling a debug flag. Does anyone have any idea what might be happening? One thing I haven't tried is bfu'ing to 138 and, from that faster-working snv 138, installing the snv 143 build, which might produce a 143 that performs faster if it's simply a configuration parameter. I'm not sure offhand whether installing source-compiled ON builds from a bfu'd rpool is supported, although I suppose it's simple enough to try.

Thanks,
Chad Cantwell
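The single-thread write test described above can be sketched as follows (a minimal sketch: the file path and block count are placeholder values, and on a real pool you would write into a ZFS filesystem and then read the scrub rate from `zpool status`):

```shell
# Single-threaded sequential write, as in the post: 1 MiB blocks from /dev/zero.
# /tmp/bigfile and count=16 are placeholder values for illustration; on the
# pool in question the output file would live on the raidz3 filesystem.
dd if=/dev/zero of=/tmp/bigfile bs=1024k count=16 2>/dev/null

# Verify how much was written (16 * 1048576 bytes).
wc -c < /tmp/bigfile
```

On Solaris one would then start `zpool scrub <pool>` and watch the rate reported by `zpool status <pool>`.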
fyi, everyone, I have some more info here. In short, Rich Lowe's 142 works correctly (fast) on my hardware, while both my compilations (snv 143, snv 144) and also the Nexenta 3 RC2 kernel (134 with backports) are horribly slow.

I finally got around to trying Rich Lowe's snv 142 compilation in place of my own compilation of 143 (and later 144, not mentioned below), and unlike my own two compilations, his works very fast again on my same zpool (scrubbing avg increased from the low 100s to over 400 MB/s within a few minutes after booting into this copy of 142). I should note that since my original message, I also tried booting from a Nexenta Core 3.0 RC2 ISO after realizing it had zpool 26 support backported into 134, and it was in fact able to read my zpool despite the upgraded version. Running a scrub from the F2 shell on the Nexenta CD was also slow, just like the 143 and 144 that I compiled.

So, there seem to be two possibilities. Either (and this seems unlikely) there is a problem introduced post-142 which slows things down, and it occurred in 143 and 144 and was brought back to 134 with Nexenta's backports, or else (more likely) there is something different or wrong with how I'm compiling the kernel that makes the hardware not perform up to its specifications with a zpool, and possibly the Nexenta 3 RC2 ISO has the same problem as my own compilations.

Chad

On Tue, Jul 06, 2010 at 03:08:50PM -0700, Chad Cantwell wrote:
> Hi all,
>
> I've noticed something strange in the throughput in my zpool between
> different snv builds, and I'm not sure if it's an inherent difference
> in the build or a kernel parameter that differs between the builds.
> [...]

_______________________________________________
zfs-discuss mailing list
zfs-discuss at opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
James C. McPherson
2010-Jul-20 00:54 UTC
[zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
On 20/07/10 10:40 AM, Chad Cantwell wrote:
> fyi, everyone, I have some more info here. In short, Rich Lowe's 142
> works correctly (fast) on my hardware, while both my compilations
> (snv 143, snv 144) and also the Nexenta 3 RC2 kernel (134 with
> backports) are horribly slow.
> [...]

So - what's your env file contents, which closedbins are you using, which crypto bits are you using, and what changeset is your own workspace synced with?

James C. McPherson
--
Oracle
http://www.jmcp.homeunix.com/blog
On Mon, Jul 19, 2010 at 5:40 PM, Chad Cantwell <chad at iomail.org> wrote:
> fyi, everyone, I have some more info here. In short, Rich Lowe's 142
> works correctly (fast) on my hardware, while both my compilations
> (snv 143, snv 144) and also the Nexenta 3 RC2 kernel (134 with
> backports) are horribly slow.
> [...]

I'm surprised you're even getting 400MB/s on the "fast" configurations, with only 16 drives in a raidz3 configuration. To me, 16 drives in raidz3 (a single vdev) would do about 150MB/sec, as your "slow" speeds suggest.

--
Brent Jones
brent at servuhome.net
Garrett D'Amore
2010-Jul-20 01:11 UTC
[zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
On Mon, 2010-07-19 at 17:40 -0700, Chad Cantwell wrote:
> fyi, everyone, I have some more info here. In short, Rich Lowe's 142
> works correctly (fast) on my hardware, while both my compilations
> (snv 143, snv 144) and also the Nexenta 3 RC2 kernel (134 with
> backports) are horribly slow.
> [...]

The idea that it's a regression introduced into NCP 3 RC2 is not very far-fetched at all. It certainly could stand some more analysis.

	- Garrett
On Tue, Jul 20, 2010 at 10:54:44AM +1000, James C. McPherson wrote:
> So - what's your env file contents, which closedbins are you using,
> which crypto bits are you using, and what changeset is your own
> workspace synced with?

The procedure I followed was basically what is outlined here:
http://insanum.com/blog/2010/06/08/how-to-build-opensolaris

using the Sun Studio 12 compilers for ON and 12u1 for lint.

For each build (143, 144) I cloned the exact tag for that build, i.e.:

  # hg clone ssh://anon at hg.opensolaris.org/hg/onnv/onnv-gate onnv-b144
  # cd onnv-b144
  # hg update onnv_144

Then I downloaded the corresponding closed and crypto bins from
http://dlc.sun.com/osol/on/downloads/b143 or
http://dlc.sun.com/osol/on/downloads/b144

The only environment variables I modified from the default opensolaris.sh file were the basic ones: GATE, CODEMGR_WS, STAFFER, and ON_CRYPTO_BINS, to point to my work directory for the build, my username, and the relevant crypto bin:

  $ egrep -e "^GATE|^CODEMGR_WS|^STAFFER|^ON_CRYPTO_BINS" opensolaris.sh
  GATE=onnv-b144; export GATE
  CODEMGR_WS="/work/compiling/$GATE"; export CODEMGR_WS
  STAFFER=chad; export STAFFER
  ON_CRYPTO_BINS="$CODEMGR_WS/on-crypto-latest.$MACH.tar.bz2"

I suppose the easiest way for me to confirm whether there is a regression or my compiling is flawed is to compile snv_142 using the same procedure and see if it works as well as Rich Lowe's copy or is slow like my other compilations.

Chad
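How those settings expand can be sketched in isolation (a sketch only: the values come from the message above, and MACH=i386 is an assumed value for an x86 build, since MACH is normally set elsewhere in the env file):

```shell
# Reproduce the four modified opensolaris.sh settings and show how
# ON_CRYPTO_BINS expands. MACH=i386 is assumed here for an x86 build.
GATE=onnv-b144; export GATE
CODEMGR_WS="/work/compiling/$GATE"; export CODEMGR_WS
STAFFER=chad; export STAFFER
MACH=i386
ON_CRYPTO_BINS="$CODEMGR_WS/on-crypto-latest.$MACH.tar.bz2"

# The crypto tarball is expected at the root of the workspace directory.
echo "$ON_CRYPTO_BINS"
```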
On Mon, Jul 19, 2010 at 06:00:04PM -0700, Brent Jones wrote:
> I'm surprised you're even getting 400MB/s on the "fast" configurations,
> with only 16 drives in a raidz3 configuration. To me, 16 drives in
> raidz3 (a single vdev) would do about 150MB/sec, as your "slow" speeds
> suggest.

With which drives and controllers? For a single dd thread writing a large file from /dev/zero to fill up a new zpool, in this configuration I can sustain over 700 MB/s for the duration of the process and can fill up the ~26T of usable space overnight. This is with two 8-port LSI 1068e controllers and no expanders.

RAIDZ operates similarly to regular RAID, and you should get striped speeds for sequential access, minus any inefficiencies and processing time for the parity. 16 disks in raidz3 is 13 disks' worth of striping, so at ~700 MB/s I'm getting about 50% efficiency after the parity calculations etc., which is fine with me. I understand that some people need higher-performance random I/O to many places at once, and I think this is where more vdevs have an advantage. Sequential read/write over a single vdev is actually quite good in ZFS in my experience, and on par with or better than most hardware RAID cards, so if you have hardware that works well in OpenSolaris and no bottlenecks in the bus or CPU (I'm not sure how much CPU is needed for good ZFS performance, but most of my OpenSolaris machines are Harpertown Xeons or better), you really should be getting better than 150 MB/s.

Chad
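The striping arithmetic above can be sketched as follows (the ~110 MB/s per-disk streaming rate is an assumed figure for illustration, not a number from this thread):

```shell
# Back-of-the-envelope raidz3 sequential throughput for a single vdev.
disks=16
parity=3            # raidz3 dedicates three disks' worth of space to parity
per_disk_mb=110     # assumed per-disk streaming rate in MB/s (illustrative)

data_disks=$((disks - parity))
ideal=$((data_disks * per_disk_mb))
echo "data disks: $data_disks"
echo "ideal streaming rate: ${ideal} MB/s"
# The observed ~700 MB/s is roughly half of this ideal, which matches the
# ~50% efficiency figure quoted in the message above.
```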
On Mon, Jul 19, 2010 at 07:01:54PM -0700, Chad Cantwell wrote:
> The procedure I followed was basically what is outlined here:
> http://insanum.com/blog/2010/06/08/how-to-build-opensolaris
> [...]
> I suppose the easiest way for me to confirm whether there is a
> regression or my compiling is flawed is to compile snv_142 using the
> same procedure and see if it works as well as Rich Lowe's copy or is
> slow like my other compilations.

I've just compiled and booted into snv_142, and I experienced the same slow dd and scrubbing as I did with my 143 and 144 compilations and with the Nexenta 3 RC2 CD. So, this would seem to indicate a build environment/process flaw rather than a regression.

Chad
Robert Milkowski
2010-Jul-20 07:39 UTC
[zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
On 20/07/2010 07:59, Chad Cantwell wrote:
> I've just compiled and booted into snv_142, and I experienced the same
> slow dd and scrubbing as I did with my 143 and 144 compilations and
> with the Nexenta 3 RC2 CD. So, this would seem to indicate a build
> environment/process flaw rather than a regression.

Are you sure it is not a debug vs. non-debug issue?

--
Robert Milkowski
http://milek.blogspot.com
Roy Sigurd Karlsbakk
2010-Jul-20 11:42 UTC
[zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
> I'm surprised you're even getting 400MB/s on the "fast" configurations,
> with only 16 drives in a raidz3 configuration. To me, 16 drives in
> raidz3 (a single vdev) would do about 150MB/sec, as your "slow" speeds
> suggest.

That'll be for random I/O. His I/O here is sequential, so the I/O is spread over the drives.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
Yes, I think this might have been it. I missed the NIGHTLY_OPTIONS variable in opensolaris.sh and I think it was compiling a debug build. I'm not sure what the ramifications of this are, or how much slower a debug build should be, but I'm recompiling a release build now, so hopefully all will be well.

Thanks,
Chad

On Tue, Jul 20, 2010 at 08:39:42AM +0100, Robert Milkowski wrote:
> Are you sure it is not a debug vs. non-debug issue?
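For context, whether an ON nightly run produces a DEBUG build is controlled by the flags in NIGHTLY_OPTIONS. The check below is a sketch: the option string shown is illustrative, and the `D` flag (which historically requested a DEBUG build from nightly) is the assumption being tested for:

```shell
# Sketch: detect whether an opensolaris.sh-style NIGHTLY_OPTIONS string
# requests a DEBUG build via the "D" flag.
NIGHTLY_OPTIONS="-FnCDlmprt"    # illustrative value, not from this thread
case "$NIGHTLY_OPTIONS" in
  *D*) build_type="DEBUG" ;;
  *)   build_type="non-DEBUG (release)" ;;
esac
echo "build type: $build_type"
```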
No, this wasn't it. A non-debug build with the same NIGHTLY_OPTIONS as Rich Lowe's 142 build is still very slow...

On Tue, Jul 20, 2010 at 09:52:10AM -0700, Chad Cantwell wrote:
> Yes, I think this might have been it. I missed the NIGHTLY_OPTIONS
> variable in opensolaris.sh and I think it was compiling a debug build.
> [...]
On Tue, Jul 20, 2010 at 10:29 AM, Chad Cantwell <chad at iomail.org> wrote:
> No, this wasn't it. A non-debug build with the same NIGHTLY_OPTIONS as
> Rich Lowe's 142 build is still very slow...
> [...]

Could it somehow not be compiling 64-bit support?

--
Brent Jones
brent at servuhome.net
On Tue, Jul 20, 2010 at 10:45:58AM -0700, Brent Jones wrote:
> Could it somehow not be compiling 64-bit support?
>
> --
> Brent Jones
> brent at servuhome.net

I thought about that, but it says when it boots up that it is 64-bit, and I'm able to run
64-bit binaries. I wonder if it's compiling for the wrong processor optimization, though?
Maybe if it is missing some of the newer SSEx instructions, the zpool checksum checking is
slowed down significantly? I don't know how to check for this, though, and it seems strange
it would slow things down this significantly. I'd expect even a non-SSE-enabled binary to
be able to calculate a few hundred MB of checksums per second on a 2.5+ GHz processor.

Chad
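[Editor's note: the arithmetic behind ZFS's fletcher4 checksum (the default data checksum in these builds, assuming checksum= was left at its default) really is trivial, which supports the point above that SSE alone shouldn't account for a 2x-4x gap. A minimal Python sketch of the algorithm, four 64-bit accumulators updated once per little-endian 32-bit word; the real kernel code is optimized C, so this is illustrative only:]

```python
import struct

def fletcher4(data: bytes):
    """Fletcher-4 over little-endian 32-bit words: four 64-bit
    accumulators, one addition each per word (mod 2^64)."""
    a = b = c = d = 0
    mask = (1 << 64) - 1
    for (w,) in struct.iter_unpack("<I", data):
        a = (a + w) & mask
        b = (b + a) & mask
        c = (c + b) & mask
        d = (d + c) & mask
    return a, b, c, d

print(fletcher4(b"\x01\x00\x00\x00\x02\x00\x00\x00"))  # -> (3, 4, 5, 6)
```

Four integer adds per 4-byte word is cheap enough that even scalar (non-SSE) compiled code should sustain hundreds of MB/s on the CPUs discussed here, so a wrong -xarch setting seems an unlikely culprit.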
Garrett D'Amore
2010-Jul-20 20:29 UTC
[zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
So the next question is, let's figure out what richlowe did differently. ;-)

- Garrett
Marcelo H Majczak
2010-Jul-20 21:10 UTC
[zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
If I can help narrow the variables: I compiled both 137 and 144 (137 is the minimum req. to build 144) using the same recommended compiler and lint, nightly options, etc. 137 works fine, but 144 suffers the slowness reported. System-wise, I'm using only the 32-bit non-debug version on an "old" single-core/thread Pentium M laptop.

What I notice is that the zpool_$pool daemon has a lot more threads (136 total, iirc), so something changed there, but not necessarily related to the problem. It also seems to be issuing a lot more writes to rpool, though I can't tell what. In my case it causes a lot of read contention, since my rpool is a USB flash device with no cache. iostat says something like up to 10w/20r per second. Up to 137 the performance has been enough, so far, for my purposes on this laptop.
--
This message posted from opensolaris.org
Bill Sommerfeld
2010-Jul-20 21:31 UTC
[zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
On 07/20/10 14:10, Marcelo H Majczak wrote:
> It also seems to be issuing a lot more
> writing to rpool, though I can't tell what. In my case it causes a
> lot of read contention since my rpool is a USB flash device with no
> cache. iostat says something like up to 10w/20r per second. Up to 137
> the performance has been enough, so far, for my purposes on this
> laptop.

If pools are more than about 60-70% full, you may be running into 6962304.

Workaround: add the following to /etc/system, run bootadm update-archive, and reboot:

-----cut here-----
* Work around 6962304
set zfs:metaslab_min_alloc_size=0x1000
* Work around 6965294
set zfs:metaslab_smo_bonus_pct=0xc8
-----cut here-----

No guarantees, but it's helped a few systems..

- Bill
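[Editor's note: the standard way to see whether a pool is near the ~60-70% fill level mentioned above is `zpool list -H -o capacity <pool>`, which prints a value like "67%". A small hedged sketch that parses that field; the helper names here (`parse_capacity`, `pool_capacity_pct`) are illustrative, not part of any tool:]

```python
import subprocess

def parse_capacity(field: str) -> int:
    """Turn zpool's capacity field, e.g. "67%", into an int."""
    return int(field.strip().rstrip("%"))

def pool_capacity_pct(pool: str) -> int:
    """Fill percentage via `zpool list -H -o capacity <pool>`."""
    out = subprocess.run(
        ["zpool", "list", "-H", "-o", "capacity", pool],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_capacity(out)

# In practice: pool_capacity_pct("tank") >= 60 suggests trying the tunables above.
print(parse_capacity("67%"))
```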
Garrett D'Amore
2010-Jul-20 21:34 UTC
[zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
Your config makes me think this is an atypical ZFS configuration. As a result, I'm not as concerned. But I think the multithreading/concurrency may be the biggest concern here. Perhaps the compilers are doing something different that causes significant cache issues. (Perhaps the compilers themselves are in need of an update?)

- Garrett

On Tue, 2010-07-20 at 14:10 -0700, Marcelo H Majczak wrote:
> If I can help narrow the variables, I compiled both 137 and 144 (137 is
> minimum req. to build 144) using the same recommended compiler and lint,
> nightly options etc. 137 works fine but 144 suffers the slowness reported. [...]
> I thought about that but it says when it boots up that it is 64-bit, and I'm able to run
> 64-bit binaries. I wonder if it's compiling for the wrong processor optimization though? [...]
>
> Chad

Would it be possible to do a closer comparison between Rich Lowe's fast 142 build and your slow 142 build? For example, run a diff on the source, build options, and build scripts. If the build settings are close enough, a comparison of the generated binaries might be a faster way to narrow things down (if the optimizations are different, then a resultant binary comparison probably won't be useful).

You said previously that:
> The procedure I followed was basically what is outlined here:
> http://insanum.com/blog/2010/06/08/how-to-build-opensolaris
>
> using the SunStudio 12 compilers for ON and 12u1 for lint.

Are these the same compiler versions Rich Lowe used? Maybe there is a compiler optimization bug. Rich Lowe's build readme doesn't tell us which compiler he used:
http://genunix.org/dist/richlowe/README.txt

> I suppose the easiest way for me to confirm if there is a regression or if my
> compiling is flawed is to just try compiling snv_142 using the same procedure
> and see if it works as well as Rich Lowe's copy or if it's slow like my other
> compilations.
>
> Chad

Another older compilation guide:
http://hub.opensolaris.org/bin/view/Community+Group+tools/building_opensolaris
I built in the normal fashion, with the CBE compilers
(cc: Sun C 5.9 SunOS_i386 Patch 124868-10 2009/04/30), and 12u1 lint.

I'm not subscribed to zfs-discuss, but have you established whether the
problematic build is DEBUG? (The bits I uploaded were non-DEBUG.)

-- Rich

Haudy Kazemi wrote:
> Would it be possible to do a closer comparison between Rich Lowe's fast 142
> build and your slow 142 build? For example run a diff on the source, build
> options, and build scripts. [...]
Garrett D'Amore
2010-Jul-21 16:12 UTC
[zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
On Wed, 2010-07-21 at 02:21 -0400, Richard Lowe wrote:
> I built in the normal fashion, with the CBE compilers
> (cc: Sun C 5.9 SunOS_i386 Patch 124868-10 2009/04/30), and 12u1 lint.
>
> I'm not subscribed to zfs-discuss, but have you established whether the
> problematic build is DEBUG? (The bits I uploaded were non-DEBUG.)

That would make a *huge* difference. DEBUG bits have zero optimization, and also have a great number of sanity tests included that are absent from the non-DEBUG bits. If these are expensive checks on a hot code path, it can have a very nasty impact on performance.

Now that said, I *hope* the bits that Nexenta delivered were *not* DEBUG. But I've seen at least one bug that makes me think we might be delivering DEBUG binaries. I'll check into it.

-- Garrett
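[Editor's note: a rough illustration of why per-iteration sanity checks hurt a hot path. This is a Python analogy, not the kernel mechanism: the kernel's DEBUG-only ASSERT()s are compiled out of non-DEBUG bits, much as Python's assert statements vanish under -O. The sketch runs the same loop with and without a per-word check:]

```python
import timeit

def checksum_hot_path(words, debug=False):
    """Sum 32-bit words mod 2^64; in 'DEBUG' mode, run one sanity
    check per word, mimicking an ASSERT() on a hot code path."""
    total = 0
    for w in words:
        if debug:
            # DEBUG-only sanity test, absent from non-DEBUG builds
            assert 0 <= w < 2**32, "word out of range"
        total = (total + w) & 0xFFFFFFFFFFFFFFFF
    return total

words = list(range(100_000))
fast = timeit.timeit(lambda: checksum_hot_path(words), number=10)
slow = timeit.timeit(lambda: checksum_hot_path(words, debug=True), number=10)
print(f"non-DEBUG {fast:.3f}s  DEBUG {slow:.3f}s")
```

In C the gap is larger still, since non-DEBUG also enables optimization while the checks disappear entirely at compile time.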
Hi,

My bits were originally debug because I didn't know any better. I thought I had then recompiled without debug to test again, but I didn't realize until just now that the packages end up in a different directory (nightly vs. nightly-nd), so I believe after compiling non-debug I just reinstalled the debug bits. I'm about to test again with an actual non-debug 142, and after that a non-debug 145, which just came out.

Thanks,
Chad

On Wed, Jul 21, 2010 at 02:21:51AM -0400, Richard Lowe wrote:
> I built in the normal fashion, with the CBE compilers
> (cc: Sun C 5.9 SunOS_i386 Patch 124868-10 2009/04/30), and 12u1 lint.
>
> I'm not subscribed to zfs-discuss, but have you established whether the
> problematic build is DEBUG? (The bits I uploaded were non-DEBUG.)
>
> -- Rich
It does seem to be faster now that I really installed the non-debug bits. I let it resume a scrub after reboot, and while it's not as fast as it usually is (280-300 MB/s vs. 500+), I assume it's just presently checking a part of the filesystem with smaller files, thus reducing the speed, since it's well past the prior limitation. I tested 142 non-debug briefly, until the scrub reached at least 250 MB/s, and then booted into 145 non-debug, where I'm letting the scrub finish now. I'll test the Nexenta disc again to be sure it was slow, since I don't recall exactly how much time I gave it in my prior tests for the scrub to reach its normal speed, although I can't do that until this evening when I'm home again.

Chad

On Wed, Jul 21, 2010 at 09:44:42AM -0700, Chad Cantwell wrote:
> My bits were originally debug because I didn't know any better. I thought I had then
> recompiled without debug to test again, but I didn't realize until just now the packages
> end up in a different directory (nightly vs nightly-nd), so I believe after compiling
> non-debug I just reinstalled the debug bits. I'm about to test again with an actual
> non-debug 142, and after that a non-debug 145 which just came out. [...]
Hi Garrett,

Since my problem did turn out to be a debug kernel on my compilations, I booted back into the Nexenta 3 RC2 CD and let a scrub run for about half an hour, to see if I just hadn't waited long enough the first time around. It never made it past 159 MB/s. I finally rebooted into my 145 non-debug kernel, and within a few seconds of reimporting the pool the scrub was up to ~400 MB/s, so it does indeed seem like the Nexenta CD kernel is either in debug mode or something else is slowing it down.

Chad

On Wed, Jul 21, 2010 at 09:12:35AM -0700, Garrett D'Amore wrote:
> That would make a *huge* difference. DEBUG bits have zero optimization,
> and also have a great number of sanity tests included that are absent
> from the non-DEBUG bits. If these are expensive checks on a hot code
> path, it can have a very nasty impact on performance.
>
> Now that said, I *hope* the bits that Nexenta delivered were *not*
> DEBUG. But I've seen at least one bug that makes me think we might be
> delivering DEBUG binaries. I'll check into it.
>
> -- Garrett