Adam N. Copeland
2008-Oct-21 13:00 UTC
[zfs-discuss] Tuning ZFS for Sun Java Messaging Server
We're using a rather large (3.8TB) ZFS volume for our mailstores on a JMS setup. Does anybody have any tips for tuning ZFS for JMS? I'm looking for even the most obvious tips, as I am a bit of a novice.

Thanks,
Adam
Robert Milkowski
2008-Oct-21 23:28 UTC
[zfs-discuss] Tuning ZFS for Sun Java Messaging Server
Hello Adam,

Tuesday, October 21, 2008, 2:00:46 PM, you wrote:

ANC> We're using a rather large (3.8TB) ZFS volume for our mailstores on a
ANC> JMS setup. Does anybody have any tips for tuning ZFS for JMS? I'm
ANC> looking for even the most obvious tips, as I am a bit of a novice. Thanks,

Well, it's kind of a broad topic and it depends on the specific environment. So do not tune for the sake of tuning - try to understand your problem first. Nevertheless, you should consider things like (in random order):

1. RAID level - you will probably end up with relatively small random I/Os, so generally avoid RAID-Z. Of course it could be that RAID-Z in your environment is perfectly fine.

2. Depending on your workload and disk subsystem, a ZFS slog on SSD could help to improve performance.

3. Disable atime updates on the ZFS file system.

4. Enabling compression like lzjb could in theory help - it depends on how well your data compresses, how much CPU you have left, and whether you are mostly I/O bound.

5. ZFS recordsize - probably not, as in most cases when you read anything from an email you will probably read the entire mail anyway. Nevertheless, it could easily be checked with dtrace.

6. IIRC JMS keeps an index/db file per mailbox - so just maybe an L2ARC on a large SSD would help, assuming it would nicely cache these files. This would need to be simulated/tested.

7. Disabling vdev pre-fetching in ZFS could help - see the ZFS Evil Tuning Guide.

Except for #3 and maybe #7, first identify what your problem is and what you are trying to fix.

--
Best regards,
Robert Milkowski
mailto:milek at task.gda.pl
http://milek.blogspot.com
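For reference, items #2 through #6 above correspond to standard zfs(1M) properties and zpool(1M) vdev types. A minimal sketch of the relevant commands, assuming a hypothetical pool and filesystem named mailpool/store and hypothetical device names (c1t9d0, c1t10d0); adjust to the real layout and verify that the installed zpool version supports log and cache vdevs:

    # 3. disable atime updates on the mailstore filesystem
    zfs set atime=off mailpool/store

    # 4. enable lightweight lzjb compression (only worth it if the data compresses and CPU is spare)
    zfs set compression=lzjb mailpool/store

    # 5. inspect recordsize; only change it if measurement suggests a better value
    zfs get recordsize mailpool/store

    # 2. add a separate intent log (slog) on an SSD
    zpool add mailpool log c1t9d0

    # 6. add an L2ARC cache device on a large SSD
    zpool add mailpool cache c1t10d0

Note that atime=off takes effect immediately, while compression and recordsize changes only apply to data written after the property is set.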
Richard Elling
2008-Oct-22 20:56 UTC
[zfs-discuss] Tuning ZFS for Sun Java Messaging Server
As it happens, I'm currently involved with a project doing some performance analysis for this... but it is currently a WIP. Comments below.

Robert Milkowski wrote:
> 1. RAID level - you will probably end up with relatively small random
> I/Os, so generally avoid RAID-Z. Of course it could be that RAID-Z in
> your environment is perfectly fine.

There are some write latency-sensitive areas that will begin to cause consternation for large loads. Storage tuning is very important in this space. In our case, we're using an ST6540 array which has a decent write cache and fast back-end.

> 2. Depending on your workload and disk subsystem, a ZFS slog on SSD
> could help to improve performance.

My experiments show that this is not the main performance issue for large message volumes.

> 3. Disable atime updates on the ZFS file system.

Agree. JMS doesn't use it, so it just means extra work.

> 4. Enabling compression like lzjb could in theory help - it depends on
> how well your data compresses, how much CPU you have left, and whether
> you are mostly I/O bound.

We have not experimented with this yet, but know that some of the latency-sensitive writes are files with a small number of bytes, which will not compress to be less than one disk block. [opportunities for cleverness are here :-)]

There may be a benefit for the message body, but in my tests we are not concentrating on that at this time.

> 5. ZFS recordsize - probably not, as in most cases when you read
> anything from an email you will probably read the entire mail anyway.
> Nevertheless, it could easily be checked with dtrace.

This does not seem to be an issue.

> 6. IIRC JMS keeps an index/db file per mailbox - so just maybe an L2ARC
> on a large SSD would help, assuming it would nicely cache these files.
> This would need to be simulated/tested.

This does not seem to be an issue, but in our testing the message stores have plenty of memory, and hence, ARC size is on the order of tens of GBytes.

> 7. Disabling vdev pre-fetching in ZFS could help - see the ZFS Evil
> Tuning Guide.

My experiments showed no benefit from disabling pre-fetch. However, there are multiple layers of pre-fetching at play when you are using an array, and we haven't done a complete analysis on this yet. It is clear that we are not bandwidth limited, so prefetching may not hurt.

> Except for #3 and maybe #7, first identify what your problem is and
> what you are trying to fix.

Yep.
-- richard
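On the dtrace point (#5): one quick, non-authoritative way to see whether the workload really issues small random I/Os is to aggregate the size of physical I/Os with the io provider (run as root; press Ctrl-C to print the histogram):

    dtrace -n 'io:::start { @["I/O size (bytes)"] = quantize(args[0]->b_bcount); }'

If the distribution is dominated by small transfers, RAID-Z and a large recordsize are more likely to hurt; if whole messages are read in one or two large I/Os, the defaults are probably fine.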
Adam N. Copeland
2008-Oct-24 18:54 UTC
[zfs-discuss] Tuning ZFS for Sun Java Messaging Server
Thanks for the replies.

It appears the problem is that we are I/O bound. We have our SAN guy looking into possibly moving us to faster spindles. In the meantime, I wanted to implement whatever was possible to give us breathing room. Turning off atime certainly helped, but we are definitely not completely out of the drink yet.

I also found that disabling the ZFS cache flush as per the Evil Tuning Guide was a huge boon, considering we're on a battery-backed (non-Sun) SAN.

Thanks,
Adam
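For concreteness, the cache flush tuning Adam refers to is the zfs_nocacheflush tunable from the Evil Tuning Guide. A sketch of both forms, on the assumption that the array cache really is non-volatile; on plain disks this risks losing the last transactions on power failure:

    # live, until the next reboot
    echo zfs_nocacheflush/W0t1 | mdb -kw

    # persistent, in /etc/system (takes effect at the next boot)
    set zfs:zfs_nocacheflush = 1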
Torrey McMahon
2008-Oct-24 18:57 UTC
[zfs-discuss] Tuning ZFS for Sun Java Messaging Server
You may want to ask your SAN vendor if they have a setting you can make to no-op the cache flush. That way you don't have to worry about the flush behavior if you change/add different arrays.

Adam N. Copeland wrote:
> I also found that disabling the ZFS cache flush as per the Evil Tuning
> Guide was a huge boon, considering we're on a battery-backed (non-Sun) SAN.
Richard Elling
2008-Oct-24 19:10 UTC
[zfs-discuss] Tuning ZFS for Sun Java Messaging Server
Adam N. Copeland wrote:
> I also found that disabling the ZFS cache flush as per the Evil Tuning
> Guide was a huge boon, considering we're on a battery-backed (non-Sun) SAN.

Really? Which OS version are you on? This should have been fixed in Solaris 10 5/08 (it is a fix in the [s]sd driver). Caveat: there may be some devices which do not properly negotiate the SYNC_NV bit. In my tests, using Solaris 10 5/08, disabling the cache flush made zero difference.
-- richard
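For anyone checking whether they already have that fix, the installed update is visible in the release file:

    cat /etc/release

Solaris 10 5/08 is update 5, the release Richard mentions as carrying the [s]sd change.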
Torrey McMahon
2008-Oct-24 19:34 UTC
[zfs-discuss] Tuning ZFS for Sun Java Messaging Server
Richard Elling wrote:
> Really? Which OS version are you on? This should have been fixed in
> Solaris 10 5/08 (it is a fix in the [s]sd driver). Caveat: there may be
> some devices which do not properly negotiate the SYNC_NV bit. In my
> tests, using Solaris 10 5/08, disabling the cache flush made zero
> difference.

PSARC 2007/053

If I read through the code correctly... if the array doesn't respond to the device inquiry, you haven't made an entry in sd.conf for the array, or it isn't hard-coded in the sd.c table - I think there are only two in that state - then you'd have to disable the cache flush.
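Purely as an illustration of the sd.conf route: the vendor/product string below is made up and must match the array's INQUIRY data exactly (padded to 8 and 16 characters), and the property syntax should be checked against the sd(7D) documentation for the Solaris update actually installed:

    # /kernel/drv/sd.conf - illustrative entry only
    sd-config-list = "VENDOR  PRODUCT-ID      ", "cache-nonvolatile:true";

The intent of such an entry is that the sd driver treats the LUN's write cache as non-volatile, so ZFS's cache-flush requests stop hurting on that device without resorting to the pool-wide zfs_nocacheflush setting.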