We have a setup with ZFS/ESX/NFS and I am looking to move our ZIL to a solid state drive.

So far I am looking into this one:
http://www.newegg.com/Product/Product.aspx?Item=N82E16820167013

Does anyone have any experience with this drive as a poor man's Logzilla? Also, what have other people done for mounting these kinds of SSDs inside a PowerEdge case? Any other suggestions?

-D

--
HUGE
David Stahl
Sr. Systems Administrator
718 233 9164 / F 718 625 5157
www.hugeinc.com
Hi David,

We are using them in our Sun X4540 filers. We are actually using 2 SSDs per pool, to improve throughput (since the logbias feature isn't in an official release of OpenSolaris yet). I kind of wish they made an 8G or 16G part, since the 32G capacity is kind of a waste.

We had to go the NewEgg route though. We tried to buy some Sun-branded disks from Sun, but that's a different story. To summarize, we had to buy the NewEgg parts to ensure a project stayed on schedule.

Generally, we've been pretty pleased with them. Occasionally, we've had an SSD that wasn't behaving well. Looks like you can replace log devices now though... :)

We use the 2.5" to 3.5" SATA adapter from IcyDock, in a Sun X4540 drive sled. If you can attach a standard SATA disk to a Dell sled, this approach would most likely work for you as well. The only issue with using third-party parts is that the support organizations involved for the software/hardware will make it very clear that such a configuration is quite unsupported. That said, we've had pretty good luck with them.

-Greg

--
Greg Mason
System Administrator
High Performance Computing Center
Michigan State University

HUGE | David Stahl wrote:
> We have a setup with ZFS/ESX/NFS and I am looking to move our ZIL to a
> solid state drive.
> So far I am looking into this one:
> http://www.newegg.com/Product/Product.aspx?Item=N82E16820167013
> Does anyone have any experience with this drive as a poor man's Logzilla?
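For readers following along, adding SSDs as log (slog) devices is a one-line zpool operation; a minimal sketch, with "tank" and the c2t*d0 device names as placeholders for your own pool and SSDs:

  # two independent log devices (log writes are spread across them)
  zpool add tank log c2t0d0 c2t1d0

  # or a mirrored slog, trading some throughput for redundancy
  zpool add tank log mirror c2t0d0 c2t1d0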
Hello Greg,

I'm curious how much performance benefit you gain from the ZIL accelerator. Have you measured that? If not, do you have a gut feel about how much it helped? Also, for what kind of applications does it help?

(I know it helps with synchronous writes. I'm looking for real world answers like: "Our XYZ application was running like a dog and we added an SSD for ZIL and the response time improved by X%.")

Of course, I would welcome a reply from anyone who has experience with this, not just Greg.

Monish

----- Original Message -----
From: "Greg Mason" <gmason at msu.edu>
To: "HUGE | David Stahl" <dstahl at hugeinc.com>
Cc: "zfs-discuss" <zfs-discuss at opensolaris.org>
Sent: Thursday, August 20, 2009 4:04 AM
Subject: Re: [zfs-discuss] Ssd for zil on a dell 2950

> We are using them in our Sun X4540 filers. We are actually using 2 SSDs
> per pool, to improve throughput (since the logbias feature isn't in an
> official release of OpenSolaris yet).
Does un-tarring something count? It is what I used for our tests.

I tested with the ZIL disabled, a ZIL cache on /tmp/zil, a CF card (300x) and a cheap SSD. Waiting for X25-E SSDs to arrive to test those:

http://mail.opensolaris.org/pipermail/zfs-discuss/2009-July/030183.html

If you want a quick answer, disable the ZIL (you need to unmount/mount, export/import or reboot) on your ZFS volume and try it. That is the theoretical maximum. You can get close to it using various technologies, SSD and all that.

I am no expert on this; I knew nothing about it 2 weeks ago.

But for our provisioning engine untarring Movable Type for customers, going from 5 mins to 45 secs is quite an improvement. I could get that to 11 seconds theoretically (ZIL disabled).

Lund

Monish Shah wrote:
> I'm curious how much performance benefit you gain from the ZIL
> accelerator. Have you measured that? If not, do you have a gut feel
> about how much it helped? Also, for what kind of applications does it
> help?

--
Jorgen Lundman       | <lundman at lundman.net>
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500      (cell)
Japan                | +81 (0)3 -3375-1767      (home)
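For anyone who wants to try that quick ZIL-off comparison, a minimal sketch of how it was typically done on the OpenSolaris builds of this era; zil_disable is a global tunable (it affects every pool on the box), so only use it on a test system, and the dataset must be remounted for it to take effect:

  # disable the ZIL (test systems only!)
  echo zil_disable/W0t1 | mdb -kw

  # remount the dataset so the setting takes effect
  zfs umount tank/test && zfs mount tank/test

  # re-enable when done
  echo zil_disable/W0t0 | mdb -kw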
Something our users do quite a bit of is untarring archives with a lot of small files. Many small, quick writes are also one of the many workloads our users have.

Real-world test: our old Linux-based NFS server allowed us to unpack a particular tar file (the source for Boost 1.37) in around 2-4 minutes, depending on load. This machine wasn't special at all, but it had fancy SGI disk on the back end, and was using the Linux-specific async NFS option.

We turned up our X4540s, and this same tar unpack took over 17 minutes! We disabled the ZIL for testing, and we dropped this to under 1 minute. With the X25-E as a slog, we were able to run this test in 2-4 minutes, same as the old storage.

That said, I strongly recommend using Richard Elling's zilstat. He's posted about it previously on this list. It will help you determine whether adding a slog device will help your workload or not. I didn't know about this script at the time of our testing, so it ended up being some trial and error, running various tests on different hardware setups (which meant creating and destroying quite a few pools).

-Greg

Jorgen Lundman wrote:
> Does un-tarring something count? It is what I used for our tests.
>
> I tested with the ZIL disabled, a ZIL cache on /tmp/zil, a CF card (300x)
> and a cheap SSD. Waiting for X25-E SSDs to arrive to test those:
>
> http://mail.opensolaris.org/pipermail/zfs-discuss/2009-July/030183.html
>
> If you want a quick answer, disable the ZIL (you need to unmount/mount,
> export/import or reboot) on your ZFS volume and try it. That is the
> theoretical maximum.
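On the zilstat suggestion: it is a DTrace-based script, so it has to run as root, and the exact filename and options depend on the version you download; a hedged sketch of a typical run:

  # sample ZIL activity every second for 10 samples; sustained non-zero
  # ops/bytes here suggest a slog would help the workload
  ./zilstat 1 10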
Greg Mason wrote:
> That said, I strongly recommend using Richard Elling's zilstat. He's
> posted about it previously on this list. It will help you determine
> whether adding a slog device will help your workload or not.

How about the bug "removing slog not possible"? What if this slog fails? Is there a plan for such a situation (the pool becomes inaccessible in this case)?

--
Roman
> How about the bug "removing slog not possible"? What if this slog fails?
> Is there a plan for such a situation (the pool becomes inaccessible in
> this case)?

You can "zpool replace" a bad slog device now.

-Greg
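A minimal sketch of that operation, with "tank" and the device names as placeholders; it swaps the failed log device for a new one without touching the data vdevs:

  # replace the bad slog c9d0 with a new SSD c10d0
  zpool replace tank c9d0 c10d0

  # confirm the log vdev comes back online
  zpool status tank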
Greg Mason wrote:
>> How about the bug "removing slog not possible"? What if this slog fails?
>> Is there a plan for such a situation (the pool becomes inaccessible in
>> this case)?
>
> You can "zpool replace" a bad slog device now.

And I can testify that it works as described.

Steve

--
Stephen Green           //  Stephen.Green at sun.com
Principal Investigator  \\  http://blogs.sun.com/searchguy
The AURA Project        //  Voice: +1 781-442-0926
Sun Microsystems Labs   \\  Fax: +1 781-442-0399
Stephen Green wrote:
> Greg Mason wrote:
>> You can "zpool replace" a bad slog device now.
>
> And I can testify that it works as described.

I meant this situation (and even if the slog is mirrored, it still might happen):

root@zsan0:~# zpool status zsan0store
  pool: zsan0store
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zsan0store  DEGRADED     0     0     0
          raidz2    ONLINE       0     0     0
            c7t0d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
            c7t2d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0
            c7t4d0  ONLINE       0     0     0
            c7t5d0  ONLINE       0     0     0
            c7t6d0  ONLINE       0     0     0
            c7t7d0  ONLINE       0     0     0
        logs
          c9d0      FAULTED      0     0     0  corrupted data

errors: No known data errors

root@zsan0:~# zpool detach zsan0store c9d0
cannot detach c9d0: only applicable to mirror and replacing vdevs

root@zsan0:~# zpool remove zsan0store c9d0
cannot remove c9d0: only inactive hot spares or cache devices can be removed

--
Roman
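For context, a sketch of how a slog is usually mirrored so that a single SSD failure doesn't leave the pool degraded like the above; device names are placeholders, and it obviously has to be done before the lone log device faults:

  # attach a second SSD to the existing log device, converting it to a mirror
  zpool attach zsan0store c9d0 c10d0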
On 08/20/09 06:41, Greg Mason wrote:
> Something our users do quite a bit of is untarring archives with a lot
> of small files. Many small, quick writes are also one of the many
> workloads our users have.
>
> Real-world test: our old Linux-based NFS server allowed us to unpack a
> particular tar file (the source for Boost 1.37) in around 2-4 minutes,
> depending on load. This machine wasn't special at all, but it had fancy
> SGI disk on the back end, and was using the Linux-specific async NFS
> option.

I'm glad you mentioned this option. It turns all synchronous requests from the client into async, allowing the server to return immediately without making the data stable. This is the equivalent of setting zil_disable. Async used to be the default behaviour. It must have been a shock to Linux users when NFS suddenly slowed down after synchronous became the default! I wonder what the perf numbers were without the async option.

> We turned up our X4540s, and this same tar unpack took over 17 minutes!
> We disabled the ZIL for testing, and we dropped this to under 1 minute.
> With the X25-E as a slog, we were able to run this test in 2-4 minutes,
> same as the old storage.

That's pretty impressive. So with an X25-E slog, ZFS is as fast synchronously as your previous hardware was asynchronously, but with no risk of data corruption. Of course the hardware is different, so it's not really apples to apples.

Neil.
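For anyone unfamiliar with the Linux option being referred to, it is set per export in /etc/exports; a hedged sketch with placeholder paths and networks:

  # /etc/exports on the Linux server
  # 'async' lets the server acknowledge writes before they hit stable storage;
  # 'sync' (the default in modern nfs-utils) keeps the NFS durability guarantee
  /export/fast  192.168.1.0/24(rw,async)
  /export/safe  192.168.1.0/24(rw,sync)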
On Aug 22, 2009, at 5:21 PM, Neil Perrin <Neil.Perrin at Sun.COM> wrote:
> That's pretty impressive. So with an X25-E slog, ZFS is as fast
> synchronously as your previous hardware was asynchronously, but with no
> risk of data corruption. Of course the hardware is different, so it's
> not really apples to apples.

There was a thread not too long ago, either on the xfs mailing list or the mysql mailing list, that talked about the Intel X25-E and its on-board cache. The cache ignores flushes, but isn't persistent on power failure. Pulling the drive during a sync write caused data corruption. You can disable the write-back cache on these, but the performance is nowhere near as good with it disabled.

-Ross
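If anyone wants to reproduce that, the drive's volatile write cache is usually toggled with hdparm on Linux (on Solaris, format -e has an equivalent cache menu); the device name below is a placeholder, and some SSD firmware reportedly ignores the setting:

  # show the current write-cache setting
  hdparm -W /dev/sdb

  # disable the volatile write cache
  hdparm -W0 /dev/sdb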
On Aug 22, 2009, at 7:33 PM, Ross Walker <rswwalker at gmail.com> wrote:> On Aug 22, 2009, at 5:21 PM, Neil Perrin <Neil.Perrin at Sun.COM> wrote: > >> >> >> On 08/20/09 06:41, Greg Mason wrote: >>> Something our users do quite a bit of is untarring archives with a >>> lot of small files. Also, many small, quick writes are also one of >>> the many workloads our users have. >>> Real-world test: our old Linux-based NFS server allowed us to >>> unpack a particular tar file (the source for boost 1.37) in around >>> 2-4 minutes, depending on load. This machine wasn''t special at >>> all, but it had fancy SGI disk on the back end, and was using the >>> Linux-specific async NFS option. >> >> I''m glad you mentioned this option. It turns all synchronous requests >> from the client into async allowing the server to immediately return >> without making the data stable. This is the equivalent of setting >> zil_disable. >> Async used to be the default behaviour. It must have been a shock >> to Linux >> users when suddenly NFS slowed down when synchronous became the >> default! >> I wonder what the perf numbers were without the async option. >> >>> We turned up our X4540s, and this same tar unpack took over 17 >>> minutes! We disabled the ZIL for testing, and we dropped this to >>> under 1 minute. With the X25-E as a slog, we were able to run this >>> test in 2-4 minutes, same as the old storage. >> >> That''s pretty impressive. So with a X25-E slog ZFS is as fast >> synchronously as >> your previously hardware was asynchronously - but with no risk of >> data corruption. >> Of course the hardware is different so it''s not really apples to >> apples. > > There was a thread not too along ago either on the xfs mailing list > or mysql mailing list that talked about the Intel X25-E and it''s on > board cache. The cache ignores flushes, but isn''t persistent on > power failure. Pulling the drive during a sync write caused data > corruption. You can disable the write back cache of these, but the > performance is no where near as good with it disabled.Here is the blog post: http://www.mysqlperformanceblog.com/2009/03/02/ssd-xfs-lvm-fsync-write-cache-barrier-and-lost-transactions/ -Ross
Ross Walker wrote:
> There was a thread not too long ago, either on the xfs mailing list or
> the mysql mailing list, that talked about the Intel X25-E and its
> on-board cache. The cache ignores flushes, but isn't persistent on
> power failure. Pulling the drive during a sync write caused data
> corruption. You can disable the write-back cache on these, but the
> performance is nowhere near as good with it disabled.
>
> Here is the blog post:
>
> http://www.mysqlperformanceblog.com/2009/03/02/ssd-xfs-lvm-fsync-write-cache-barrier-and-lost-transactions/

Hang on - reading that, his initial results were 50 writes a second with the default XFS write barriers, which to me implies that the drive is honouring the cache flush. The fact that the write rate jumps so significantly when he turns off barriers, but continues with O_DIRECT and innodb_flush_log_at_trx_commit=1, to me just says that XFS is returning success on writes as soon as the data has been handed to the drive - not when the drive has flushed its cache to make it persistent. Given that we told XFS to turn off write barriers, isn't it doing what it's told? Why are we expecting data to be consistent across power loss or device removal?

Couldn't this just be XFS only actually requesting cache flushes when barriers are enabled?

T
On Sun, Aug 23 at 14:11, Tristan Ball wrote:
> Hang on - reading that, his initial results were 50 writes a second with
> the default XFS write barriers, which to me implies that the drive is
> honouring the cache flush.
>
> Couldn't this just be XFS only actually requesting cache flushes when
> barriers are enabled?

Yea, I parsed the article the same way.

50 IOPS with XFS barriers (crappy by any standard); 5300 cache-enabled IOPS or 1200 cache-disabled IOPS with the X25-E. And he tried yanking the power while doing work with the cache disabled and it didn't lose any transactions.

Seems like it was behaving as expected, unless I misunderstood something.

--
Eric D. Mudama
edmudama at mail.bounceswoosh.org
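For reference, the barrier knob being debated is just an XFS mount option; a sketch with a placeholder device and mount point (disabling barriers is only sane on storage with a non-volatile or battery-backed cache):

  # default: barriers enabled, XFS issues cache-flush requests at commit points
  mount -t xfs /dev/sdb1 /data

  # barriers disabled, as in the faster-but-unsafe runs discussed above
  mount -t xfs -o nobarrier /dev/sdb1 /data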
On Aug 23, 2009, at 12:11 AM, Tristan Ball <Tristan.Ball at leica-microsystems.com> wrote:
> Hang on - reading that, his initial results were 50 writes a second with
> the default XFS write barriers, which to me implies that the drive is
> honouring the cache flush.
>
> Couldn't this just be XFS only actually requesting cache flushes when
> barriers are enabled?

I think it's more an illustration that write barriers on Linux need a little work; even with flushes it should do a lot better than 50 IOPS.

O_DIRECT does just that, with or without barriers: it flushes on each write, with an ever so slight delay to allow the queue to coalesce writes.

A barrier is more to enforce order and persistence when IO is async.

-Ross
On Aug 23, 2009, at 9:59 AM, Ross Walker <rswwalker at gmail.com> wrote:
> O_DIRECT does just that, with or without barriers: it flushes on each
> write, with an ever so slight delay to allow the queue to coalesce
> writes.

My bad - O_DIRECT does NOT do that. It just goes direct to the driver, bypassing the page cache. That allows for low-latency IO and arbitrary IO sizes for throughput (instead of page-sized IO), but it doesn't enforce persistence.

> A barrier is more to enforce order and persistence when IO is async.

I suspect that since XFS can use an internal or external log like ZFS does, when a barrier is issued it is issued across all devices in the file system, because XFS doesn't know about the actual physical layout the way ZFS does, and that is why the IOPS are so low with XFS barriers.

-Ross
On Sun, 23 Aug 2009, Ross Walker wrote:
>> O_DIRECT does just that, with or without barriers: it flushes on each
>> write, with an ever so slight delay to allow the queue to coalesce
>> writes.
>
> My bad - O_DIRECT does NOT do that. It just goes direct to the driver,
> bypassing the page cache. That allows for low-latency IO and arbitrary
> IO sizes for throughput (instead of page-sized IO), but it doesn't
> enforce persistence.

Right. And Solaris does not support O_DIRECT. It does provide a directio() function which requests similar functionality (as a hint), but zfs does not support direct I/O.

Linux O_DIRECT basically requires an application rewrite to use it, and its precise function tends to change across major kernel releases.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
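A small Linux-side illustration of the distinction being drawn here, using GNU dd with placeholder filenames: oflag=direct only bypasses the page cache, and durability still requires an explicit sync at the end:

  # direct I/O: skips the page cache, but data may still sit in the drive's
  # volatile write cache when dd exits
  dd if=/dev/zero of=/data/testfile bs=4k count=1000 oflag=direct

  # direct I/O followed by fsync, asking the storage stack to make the
  # data stable before dd reports success
  dd if=/dev/zero of=/data/testfile bs=4k count=1000 oflag=direct conv=fsync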