Still kicking around this idea and didn't see it addressed in any of the threads before the forum closed.

If one made an all-SSD pool, would a log/cache drive just slow you down? Would a ZIL slow you down? Thinking of rotating MLC drives with SandForce controllers every few years to avoid losing a drive to a "sorry, no more writes allowed" scenario.

Thanks
Mark
On 10/28/11 07:04 PM, Mark Wolek wrote:
> Still kicking around this idea and didn't see it addressed in any of the threads before the forum closed.
>
> If one made an all-SSD pool, would a log/cache drive just slow you down? Would a ZIL slow you down?

I would guess not; you would still be spreading your IOPS. I haven't tried an all-SSD pool, but I have tried adding a lump of spinning rust as a log to a pool of identical drives, and it did give a small improvement to NFS performance.

-- Ian.
On 10/28/11 00:04, Mark Wolek wrote:
> Still kicking around this idea and didn't see it addressed in any of the threads before the forum closed. If one made an all-SSD pool, would a log/cache drive just slow you down? Would a ZIL slow you down? Thinking of rotating MLC drives with SandForce controllers every few years to avoid losing a drive to a "sorry, no more writes allowed" scenario. Thanks Mark

Interesting question. I don't think there's a straightforward answer. Oracle uses write-optimised log devices and read-optimised cache devices in its appliances. However, assuming all the SSDs are the same, I suspect neither a log nor a cache device would help:

Log: If there is a separate log, it is used exclusively for ZIL writes and can be written in parallel with the periodic TXG commit writes to the other pool devices. If that device were instead part of the pool, the ZIL code would spread the load among all pool devices, but would compete with TXG commit writes. My gut feeling is that this latter option would perform better, though. I think, a long time ago, I experimented with designating one disk out of the pool as a log and saw degraded synchronous performance. That seems to be the equivalent of your SSD question.

Cache: Similarly, for cache devices the reads would compete with TXG commit writes, but otherwise performance ought to be higher.

Neil.
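For readers unfamiliar with the terminology above, a minimal sketch of how log and cache devices are designated; the pool and device names here are placeholders, not anything from the thread:

    zpool create whirl c1t0d0 c1t1d0 log c1t2d0   # pool of two data disks plus a dedicated log (slog)
    zpool add whirl cache c1t3d0                  # add an L2ARC cache device to the existing pool
    zpool status whirl                            # shows separate "logs" and "cache" sections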
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Mark Wolek
>
> Still kicking around this idea and didn't see it addressed in any of the threads before the forum closed.
>
> If one made an all-SSD pool, would a log/cache drive just slow you down? Would a ZIL slow you down? Thinking of rotating MLC drives with SandForce controllers every few years to avoid losing a drive to a "sorry, no more writes allowed" scenario.

Even if you have an all-HDD pool, you benefit by adding an HDD as a log device. Why? Because the log device is dedicated to ONLY sync-mode writes, and when you're doing sync-mode writes, you want low latency. If the primary disks in the pool are busy doing other things, that means additional latency before they can respond to a sync-mode write and stick something in the ZIL.

The same argument applies to SSDs. Even if your pool is all SSD, yes, you benefit by adding a dedicated log device. The benefit won't be as dramatic, of course, as if your pool were HDD with an SSD for the log... but it's something.

As for cache... It is conceivable that you might get some benefit from cache for the same reason: when other disks are busy, you might be able to get data out of the cache devices and see some acceleration. But cache devices carry a not insignificant maintenance overhead, keeping track of them and populating/expiring data in them. I think you probably wouldn't get much benefit from a cache device.

I think you would probably benefit more by parallelizing your main pool further. For example, instead of making your pool from a dozen 256G disks, you might use two dozen 128G disks. Or instead of using mirrors, you might use 3-way mirrors. Etc.
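As a rough illustration of the "dedicated to ONLY sync-mode writes" point, a hedged sketch of how one might experiment; pool and dataset names are placeholders, and the sync property assumes a build recent enough to have it:

    zpool add tank log c0t6d0       # attach a spare device to an existing pool as a dedicated log
    zfs set sync=always tank/test   # force every write on this dataset through the ZIL (for testing)
    zfs get sync tank/test          # confirm the setting
    zfs inherit sync tank/test      # revert to the default ("standard") behaviour afterwards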
On 10/28/11 00:54, Neil Perrin wrote:
> I think, a long time ago, I experimented with designating one disk out of the pool as a log and saw degraded synchronous performance. That seems to be the equivalent of your SSD question.

Did some quick tests with disks to check whether my memory was correct. 'sb' is a simple program that spawns a number of threads to fill a file of a given size using non-zero writes of a specified size. Bandwidth is also reported.

1. Simple 2-disk system: 32KB synchronous writes filling 1GB with 20 threads

zpool create whirl <2 disks>; zfs set recordsize=32k whirl
st1 -n /whirl/f -f 1073741824 -b 32768 -t 20
Elapsed time 95s, 10.8MB/s

zpool create whirl <disk> log <disk>; zfs set recordsize=32k whirl
st1 -n /whirl/f -f 1073741824 -b 32768 -t 20
Elapsed time 151s, 6.8MB/s

2. Higher-end 6-disk system: 32KB synchronous writes filling 1GB with 100 threads

zpool create whirl <6 disks>; zfs set recordsize=32k whirl
st1 -n /whirl/f -f 1073741824 -b 32768 -t 100
Elapsed time 33s, 31MB/s

zpool create whirl <5 disks> log <1 disk>; zfs set recordsize=32k whirl
st1 -n /whirl/f -f 1073741824 -b 32768 -t 100
Elapsed time 147s, 7.0MB/s

and, for interest:

zpool create whirl <5 disks> log <SSD>; zfs set recordsize=32k whirl
st1 -n /whirl/f -f 1073741824 -b 32768 -t 100
Elapsed time 8s, 129MB/s

3. Higher-end system, smaller writes: 2KB synchronous writes filling 128MB with 100 threads

zpool create whirl <6 disks>; zfs set recordsize=1k whirl
st1 -n /whirl/f -f 134217728 -b 2048 -t 100
Elapsed time 16s, 8.2MB/s

zpool create whirl <5 disks> log <1 disk>; zfs set recordsize=1k whirl
ds8 -n /whirl/f -f 134217728 -b 2048 -t 100
Elapsed time 24s, 5.5MB/s
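For anyone re-running a test along these lines, one simple way (not mentioned in the thread) to confirm the slog is really absorbing the synchronous writes is to watch per-vdev activity while the benchmark runs; the pool name follows the examples above:

    zpool iostat -v whirl 1
    # the devices listed under "logs" should show nearly all of the write
    # operations during a purely synchronous workload, while the main vdevs
    # mostly see the periodic TXG commit bursts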
Having the log disk slowed it down a lot in your tests (when it wasn't an SSD): 30MB/s vs 7. Is this also a 100% write / 100% sequential workload? Forcing sync?

It's gotten to the point where I can buy a 120G SSD for less than or the same price as a 146G SAS disk... Sure, the MLC drives have a limited lifetime, but at $150 (and dropping) just replace them every few years to be safe and work out a rotation/rebuild cycle; it's tempting...

I suppose if we do end up buying all SSDs it becomes really easy to test whether we should use a log or not!
On 10/28/11 11:21, Mark Wolek wrote:
> Having the log disk slowed it down a lot in your tests (when it wasn't an SSD): 30MB/s vs 7. Is this also a 100% write / 100% sequential workload? Forcing sync?

100% synchronous writes. The writes are random, but ZFS will write them sequentially on disk.

> It's gotten to the point where I can buy a 120G SSD for less than or the same price as a 146G SAS disk... Sure, the MLC drives have a limited lifetime, but at $150 (and dropping) just replace them every few years to be safe and work out a rotation/rebuild cycle; it's tempting... I suppose if we do end up buying all SSDs it becomes really easy to test whether we should use a log or not!

I would highly recommend some form of zpool redundancy (mirroring or raidz).

Neil.
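A minimal sketch of what a redundant all-SSD layout with a mirrored log might look like; all names are placeholders and this is not a configuration anyone in the thread tested:

    zpool create tank mirror c0t0d0 c0t1d0 mirror c0t2d0 c0t3d0 \
        log mirror c0t4d0 c0t5d0
    zpool status tank   # two data mirrors plus a mirrored "logs" section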
On Oct 27, 2011, at 11:04 PM, Mark Wolek wrote:
> Still kicking around this idea and didn't see it addressed in any of the threads before the forum closed.
>
> If one made an all-SSD pool, would a log/cache drive just slow you down? Would a ZIL slow you down?

In general, a slog makes sense when its latency is significantly better than access to the main pool. For an order-of-magnitude difference, it is a no-brainer. For less significant differences, it can be debatable.

> Thinking of rotating MLC drives with SandForce controllers every few years to avoid losing a drive to a "sorry, no more writes allowed" scenario.

Different SSDs have different performance behaviours. If you don't pay attention to the details, you might find your pool is faster than your slog :-)

-- richard

--
ZFS and performance consulting
http://www.RichardElling.com
LISA '11, Boston, MA, December 4-9
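One hedged way to check whether a candidate slog really has significantly better latency than the pool devices is to watch per-device service times under a synchronous write load with plain Solaris iostat; the interpretation below is a general rule of thumb, not something stated in the thread:

    iostat -xn 1
    # compare the wsvc_t/asvc_t columns (wait and active service time, in ms)
    # for the candidate log device against the main pool devices; an
    # order-of-magnitude gap is what makes a dedicated slog clearly worthwhile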
On 10/28/2011 01:04 AM, Mark Wolek wrote:
> before the forum closed.

Did I miss something?

Karl
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Karl Rossing
>
> On 10/28/2011 01:04 AM, Mark Wolek wrote:
> > before the forum closed.
> Did I miss something?

Yes. The forums no longer exist. It's only mailman email now.
On Tue, 1 Nov 2011, Edward Ned Harvey wrote:
>> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Karl Rossing
>>
>> On 10/28/2011 01:04 AM, Mark Wolek wrote:
>>> before the forum closed.
>> Did I miss something?
>
> Yes. The forums no longer exist. It's only mailman email now.

I notice that mail activity has diminished substantially since the forums were shut down. Apparently they were still in use.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
> From: Bob Friesenhahn [mailto:bfriesen at simple.dallas.tx.us]
>
> I notice that mail activity has diminished substantially since the
> forums were shut down. Apparently they were still in use.

I'm sure nobody thought they were unused. I'm sure it was a cost-saving measure. Jive forums start at $20k/yr, assuming you want just a vanilla config (which opensolaris didn't), plus the man-hours to maintain it and keep it consistent with mailman. I recently looked into implementing a community portal modeled on the opensolaris community (forums + email coexisting blissfully), but all the competitors were in the same price range. You can do either one by itself extremely well for free. You can do both poorly for free, or you can do both very well for big bucks. That's what opensolaris was doing.

Seems coincidental that this change came very nearly one year after "the change," doesn't it? Somebody's budget got slashed, I bet... and the Jive renewal came due nearly a year later.
On Tue, Nov 01, 2011 at 06:17:57PM -0400, Edward Ned Harvey wrote:
> You can do both poorly for free, or you can do both very well for big bucks.
> That's what opensolaris was doing.

That mess was costing someone money and was considered very well done? Good riddance.

-- Dan.