Other test, same setup.

SOLARIS 10:

zpool/a: filesystem containing over 10 million subdirs, each containing 10 files of about 1k
zpool/b: empty filesystem

rsync -avx /zpool/a/* /zpool/b

time: 14 hours (iostat showing %b = 100 for each LUN in the zpool)

FreeBSD:

/vol1/a: dir containing over 10 million subdirs, each containing 10 files of about 1k
/vol1/b: empty dir

rsync -avx /vol1/a/* /vol1/b

time: 1h 40m !!

Also, a zone running on zpool/zone1 was almost completely unusable because of the I/O load.
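For reference, a minimal sketch of how such a run can be timed and observed, assuming the Solaris-side paths above (the population of the tree is elided, since the thread doesn't show how it was built):

    # /zpool/a already holds ~10 million subdirs of ~1k files
    time rsync -avx /zpool/a/ /zpool/b/   # trailing-slash form copies the same tree without
                                          # the shell expanding millions of names from /zpool/a/*
    iostat -xn 5                          # in a second terminal: watch %b per LUN during the copy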
That is bad!!!! Such a big time difference... 14 hrs vs less than 2 hrs... Did you have the same hardware setup? I did not follow the thread...

Chris

On Sun, 17 Sep 2006, Gino Ruopolo wrote:

> SOLARIS 10:
>
> rsync -avx /zpool/a/* /zpool/b
> time: 14 hours (iostat showing %b = 100 for each LUN in the zpool)
>
> FreeBSD:
>
> rsync -avx /vol1/a/* /vol1/b
> time: 1h 40m !!
>
> Also, a zone running on zpool/zone1 was almost completely unusable because of the I/O load.
> [...]
Hi Gino,

Can you post the 'zpool status' for each pool and 'zfs get all' for each fs? Any interesting data in the dmesg output?

-r

Gino Ruopolo writes:

> SOLARIS 10:
>
> rsync -avx /zpool/a/* /zpool/b
> time: 14 hours (iostat showing %b = 100 for each LUN in the zpool)
>
> FreeBSD:
>
> rsync -avx /vol1/a/* /vol1/b
> time: 1h 40m !!
> [...]
Hi Chris,

Both servers had the same setup: OS on a local hw RAID mirror, the other filesystems on a SAN.

We found really bad performance, but also that under that heavy I/O the zfs pool was something like frozen. I mean, a zone living on the same zpool was completely unusable because of the I/O load. We use FSS, but CPU load was really low under the tests.

During the tests on FreeBSD we found I/O on the stressed filesystem slow, but not frozen!

later,
Gino
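A quick way to separate CPU starvation from I/O starvation in a case like this, using standard Solaris tools (the 5-second interval is arbitrary):

    prstat -Z 5    # per-zone CPU summary; low CPU% with an unresponsive zone points away from FSS
    iostat -xn 5   # %b near 100 with high svc_t points at saturated devices instead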
> Hi Gino,
>
> Can you post the 'zpool status' for each pool and 'zfs get all'
> for each fs? Any interesting data in the dmesg output?

Sure.

1) Nothing in dmesg (are you thinking about shared IRQs?)

2) Only using one pool for the tests:

# zpool status
  pool: zpool1
 state: ONLINE
 scrub: none requested
config:

        NAME                                       STATE     READ WRITE CKSUM
        zpool1                                     ONLINE       0     0     0
          raidz                                    ONLINE       0     0     0
            c4t60001FE100118DB000091190724700C7d0  ONLINE       0     0     0
            c4t60001FE100118DB000091190724700C9d0  ONLINE       0     0     0
            c4t60001FE100118DB000091190724700CBd0  ONLINE       0     0     0
            c4t60001FE100118DB000091190724700CCd0  ONLINE       0     0     0

errors: No known data errors

3)

NAME    PROPERTY       VALUE                  SOURCE
zpool1  type           filesystem             -
zpool1  creation       Sat Aug 26 16:45 2006  -
zpool1  used           330G                   -
zpool1  available      206G                   -
zpool1  referenced     7.86G                  -
zpool1  compressratio  1.28x                  -
zpool1  mounted        yes                    -
zpool1  quota          none                   default
zpool1  reservation    none                   default
zpool1  recordsize     128K                   default
zpool1  mountpoint     /zpool1                default
zpool1  sharenfs       ro,anon=0              local
zpool1  checksum       on                     default
zpool1  compression    off                    default
zpool1  atime          on                     default
zpool1  devices        on                     default
zpool1  exec           on                     default
zpool1  setuid         on                     default
zpool1  readonly       off                    default
zpool1  zoned          off                    default
zpool1  snapdir        hidden                 default
zpool1  aclmode        groupmask              default
zpool1  aclinherit     secure                 default

Same setup on all the filesystems.

thanks
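Worth noting in that output: compression is currently off, but compressratio is 1.28x, which suggests at least some of the data was written while compression was enabled (the ratio reflects blocks as stored, not the current property setting). A narrower query for just the properties relevant to this thread:

    zfs get compression,compressratio,atime,recordsize zpool1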
Looks like you have compression turned on?
> Looks like you have compression turned on?

We made tests with compression on and off and found almost no difference. CPU load was under 3% ...
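For completeness, this is the kind of toggle involved, assuming the pool above; note that changing the property only affects newly written blocks, so existing data keeps whatever compression it was written with:

    zfs set compression=on zpool1
    zfs set compression=off zpool1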
Update ...

iostat output during "zpool scrub":

                     extended device statistics
device       r/s    w/s   Mr/s   Mw/s  wait  actv  svc_t  %w  %b
sd34         2.0  395.2    0.1    0.6   0.0  34.8   87.7   0 100
sd35        21.0  312.2    1.2    2.9   0.0  26.0   78.0   0  79
sd36        20.0    1.0    1.2    0.0   0.0   0.7   31.4   0  13
sd37        20.0    1.0    1.0    0.0   0.0   0.7   35.1   0  21

sd34 is always at 100% ...
> Update ...
>
> iostat output during "zpool scrub":
>
> device       r/s    w/s   Mr/s   Mw/s  wait  actv  svc_t  %w  %b
> sd34         2.0  395.2    0.1    0.6   0.0  34.8   87.7   0 100
> sd35        21.0  312.2    1.2    2.9   0.0  26.0   78.0   0  79
> sd36        20.0    1.0    1.2    0.0   0.0   0.7   31.4   0  13
> sd37        20.0    1.0    1.0    0.0   0.0   0.7   35.1   0  21
>
> sd34 is always at 100% ...

  pool: zpool1
 state: ONLINE
 scrub: scrub in progress, 0.13% done, 72h39m to go
config:

        NAME                                       STATE     READ WRITE CKSUM
        zpool1                                     ONLINE       0     0     0
          raidz                                    ONLINE       0     0     0
            c4t60001FE100118DB000091190724700C7d0  ONLINE       0     0     0
            c4t60001FE100118DB000091190724700C9d0  ONLINE       0     0     0
            c4t60001FE100118DB000091190724700CBd0  ONLINE       0     0     0
            c4t60001FE100118DB000091190724700CCd0  ONLINE       0     0     0

72 hours?? Isn't that too much for 370GB of data?
On 9/22/06, Gino Ruopolo <ginoruopolo at hotmail.com> wrote:

> pool: zpool1
> state: ONLINE
> scrub: scrub in progress, 0.13% done, 72h39m to go
> [...]
>
> 72 hours?? Isn't that too much for 370GB of data?

For what it's worth, I've found that usually, within the first ~5m or so of starting a scrub, the time estimate is disproportionate to the actual time the scrub will take.

- Rich
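A simple way to watch the estimate settle over time, if you want to check that on this pool (the interval is arbitrary):

    while :; do zpool status zpool1 | grep scrub; sleep 600; done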
On 9/22/06, Gino Ruopolo <ginoruopolo at hotmail.com> wrote:

> Update ...
>
> iostat output during "zpool scrub":
>
> device       r/s    w/s   Mr/s   Mw/s  wait  actv  svc_t  %w  %b
> sd34         2.0  395.2    0.1    0.6   0.0  34.8   87.7   0 100
> sd35        21.0  312.2    1.2    2.9   0.0  26.0   78.0   0  79
> sd36        20.0    1.0    1.2    0.0   0.0   0.7   31.4   0  13
> sd37        20.0    1.0    1.0    0.0   0.0   0.7   35.1   0  21
>
> sd34 is always at 100% ...

What is strange is that this is almost all writes. Do you have the rsync running at this time? A scrub alone should not look like this.

I have also observed some strange behavior on a 4-disk raidz, which may be related. It is possible to saturate a single disk while all the others in the same vdev are completely idle. It is very easy to reproduce, so try the following: create a filesystem with a 4k recordsize on a 4-disk raidz. Now copy a large file to it while observing 'iostat -xnz 5'.

This is the worst case I have been able to produce, but the imbalance is apparent even with an untar at the default recordsize. Interestingly, it is always the last disk in the set which is busy. This behavior does not occur with a 3-disk raidz, nor is it as bad with other record sizes.

Chris
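A sketch of that reproduction as commands; the 4k recordsize and the iostat invocation come from the description above, while the pool name, disk names, and file size are hypothetical:

    zpool create tank raidz c0t0d0 c0t1d0 c0t2d0 c0t3d0            # hypothetical disks
    zfs create tank/test
    zfs set recordsize=4k tank/test
    dd if=/dev/zero of=/tank/test/bigfile bs=1024k count=4096 &    # any large sequential write will do
    iostat -xnz 5                                                  # watch for one disk pegged, the rest idle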
Other example, rsyncing from/to the same zpool:

device       r/s    w/s   Mr/s   Mw/s  wait  actv  svc_t  %w  %b
c6          25.0  276.5    1.3    3.8   1.9  16.5   61.1   0 135
sd44         6.0  158.3    0.3    0.4   1.9  15.5  106.2  33 100
sd45         6.0   37.1    0.3    1.1   0.0   0.3    6.5   0  10
sd46         8.0   42.1    0.4    1.1   0.0   0.4    7.3   0  15
sd47         5.0   39.1    0.3    1.1   0.0   0.3    7.3   0  10

sd44 is always at 100% and performance is really, really low. Using 3 LUNs or 4 LUNs in the zpool makes no difference.

Any suggestions?
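One way to cross-check the sd-level numbers above against how ZFS itself is spreading the load across the vdev members (pool name from earlier in the thread):

    zpool iostat -v zpool1 5   # per-device read/write ops and bandwidth inside the pool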