Hi, folks,

I'm attempting to run an e-mail server on Xen. The e-mail system is Novell
GroupWise, and it serves about 250 users. The disk volume for the e-mail is
on my SAN, and I've attached the FC LUN to my Xen host, then used the
"phy:/dev..." method to forward the disk through to the domU. I'm running
into an issue with high I/O wait on the box (~250%) and large load averages
(20-40 for the 1/5/15 minute averages). I was wondering if anyone has ideas
on tuning the domU to handle this - is there a better way to forward the
disk device through, should I try using an iSCSI software initiator in the
domU, or is it just a bad idea to put an I/O load like this in a domU?
Unfortunately, mapping the entire FC card through to the domU isn't really
an option - the FC card accesses other SAN volumes for the Xen host, so it
needs to be present in dom0.

I'm running Xen 3.2.0 on SLES 10 SP2, on a Dell PowerEdge R610 server. The
FC HBA is a QLE2462, dual-channel 4Gb FC card. Any help, hints, etc., are
greatly appreciated!

-Nick
250 users normally is no big deal for an e-mail server, even a virtualized
one, though I don't know how GroupWise behaves.

I suggest you change your domU I/O scheduler to minimize the dom0-domU I/O
latency impact (BLAH is your domU block device):

$ echo deadline > /sys/block/BLAH/queue/scheduler

and play with the settings inside /sys/block/BLAH/queue/.

As for dom0, I don't know your storage and RAID setup, so it might (or
might not) be a good idea to try to reduce the latency between dom0 and the
storage (BLAH is each of your FC device paths - sda, sdb ... sdaa, sdab,
etc.):

$ echo noop > /sys/block/BLAH/queue/scheduler

On Wednesday 26 August 2009 13:01:13 Nick Couchman wrote:
> I'm attempting to run an e-mail server on Xen. The e-mail system is
> Novell GroupWise, and it serves about 250 users. [...] I was wondering if
> anyone has ideas on tuning the domU to handle this [...]

-- 
Daniel Mealha Cabrita
Divisao de Suporte Tecnico
AINFO / Reitoria / UTFPR
http://www.utfpr.edu.br
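(For reference, a minimal sketch of the knobs involved, assuming a domU
disk named xvdb and a 2.6-era kernel; the exact files and defaults vary by
kernel build:)

# see which schedulers are available and which one is active
$ cat /sys/block/xvdb/queue/scheduler
noop anticipatory deadline [cfq]

# switch to deadline; the deadline-specific tunables then appear
$ echo deadline > /sys/block/xvdb/queue/scheduler
$ ls /sys/block/xvdb/queue/iosched
fifo_batch  front_merges  read_expire  write_expire  writes_starved

# e.g. tighten the read deadline (milliseconds); the value here is an
# illustration, not a recommendation
$ echo 250 > /sys/block/xvdb/queue/iosched/read_expire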
> I'm attempting to run an e-mail server on Xen. The e-mail system is
> Novell GroupWise, and it serves about 250 users. [...] is it just a bad
> idea to put an I/O load like this in a domU?

If this turns out to be a global issue, I'd certainly like to hear about
it. I recently load-tested a postfix+cyrus domU with 6 SATA-backed spools
and 6 FC-backed meta partitions for about 300,000 IMAP accounts and
consistently delivered around 100 messages/sec to them. That load was
obviously all I/O-bound, but at what I'd consider to be an acceptable
delivery rate (delivery seems to be the most performance-challenging
operation, at least with Cyrus). I did see similar load averages, though.
This was with a RHEL 5 domU, a CentOS 5 dom0, and phy: mappings.

John

-- 
John Madden
Sr UNIX Systems Engineer
Ivy Tech Community College of Indiana
jmadden@ivytech.edu
On Wed, Aug 26, 2009 at 10:01:13AM -0600, Nick Couchman wrote:
> Hi, folks,
> The disk volume for the e-mail is on my SAN, and I've attached the FC LUN
> to my Xen host

Does the SAN LUN perform OK from dom0 without any domUs running?

> I'm running into an issue with high I/O wait on the box (~250%) and large
> load averages (20-40 for the 1/5/15 minute average).

Do you have iowait on dom0, or only in domU? Try running "iostat 1" in both
dom0 and domU.

Also, have you dedicated a CPU core only for dom0?

> I'm running Xen 3.2.0 on SLES 10 SP2, on a Dell PowerEdge R610 server.
> The FC HBA is a QLE2462, dual-channel 4Gb FC card. Any help, hints, etc.,
> are greatly appreciated!

You could also try updating to SLES11.

-- Pasi
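(For reference, a sketch of one way to dedicate a core to dom0 with the Xen
3.x tools; this assumes a 4-core box and is illustrative only:)

# pin dom0 (domain id 0), vcpu 0, onto physical cpu 0
$ xm vcpu-pin 0 0 0

# keep guests off cpu 0 by putting this in each domU config file:
#   cpus = "1-3"

# verify the resulting placement
$ xm vcpu-list

# alternatively, the hypervisor boot line can cap dom0's vcpus, e.g.
# dom0_max_vcpus=1 on the xen.gz entry in the bootloader config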
Pasi,

1) Yes, it seems to, although I've not run my e-mail server inside my dom0
yet - that's not really possible.
2) Just on domU - dom0 seems fine.
3) No, I haven't played around much with pinning dom0 or any of the domUs
to certain CPU cores.
4) Updating to SLES11 is in the future; however, doing this requires that I
do it to all my production Xen nodes concurrently, since the OCFS2
filesystem doesn't really play nice with other versions. I can either have
it mounted on the SLES10 box(es) or the SLES11 box(es), but not both at the
same time.

Thanks!
-Nick

>>> On 2009/08/26 at 11:32, Pasi Kärkkäinen <pasik@iki.fi> wrote:
> Does the SAN LUN perform OK from dom0 without any domUs running?
> [...]
> You could also try updating to SLES11.
John,

What filesystem did you use for this test in the domU for the e-mail
storage? I'm currently running XFS on the volume where the GroupWise data
sits, and I'm wondering if the filesystem isn't tuned properly. Could you
give me a run-down of what filesystem you used, and what parameters you
used for creating the filesystem (block size, inode size, etc.)?

Thanks!
-Nick

>>> On 2009/08/26 at 11:32, John Madden <jmadden@ivytech.edu> wrote:
> I recently load-tested a postfix+cyrus domU with 6 SATA-backed spools and
> 6 FC-backed meta partitions for about 300,000 IMAP accounts and
> consistently delivered around 100 messages/sec to them. [...]
On Wed, 2009-08-26 at 11:41 -0600, Nick Couchman wrote:
> What filesystem did you use for this test in the domU for the e-mail
> storage? I'm currently running XFS on the volume where the GroupWise
> data sits, and I'm wondering if the filesystem isn't tuned properly.
> Could you give me a run-down of what filesystem you used, and what
> parameters you used for creating the filesystem (block size, inode
> size, etc.)?

ext3, always. xfs et al may be better depending on the filesystem use, but
I've found ext3 to always be reliable and performant enough.

`mke2fs -j -O dir_index -T news /dev/vg/lv`

John
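(A sketch of what that buys you, plus the mount side; the device and mount
point are examples:)

# -j adds the ext3 journal, -O dir_index enables hashed directory lookups
# (a big win for large mail directories), and -T news selects a dense inode
# ratio - roughly one inode per 4 kB block - suited to lots of small files
$ mke2fs -j -O dir_index -T news /dev/vg/lv

# mount with noatime so every message read doesn't cost a metadata write
$ mount -o noatime /dev/vg/lv /mnt/mail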
I take that back... iostat in dom0 shows similar results to iostat in domU.
As suggested by another person, I've changed the dom0 elevator to noop and
the domU to deadline. I'm playing with tuning some of the parameters for
the deadline scheduler now.

-Nick
On Wed, Aug 26, 2009 at 12:57 PM, Nick Couchman <Nick.Couchman@seakr.com> wrote:
> I've changed the dom0 elevator to noop and the domU to deadline.

Shouldn't that be the other way around?

-- 
Javier
Doesn't really seem to make a difference which way I do it... I still see
pretty intense disk I/O. Here is some sample output from iostat in the
domU:

Device:  rrqm/s  wrqm/s      r/s    w/s    rkB/s   wkB/s  avgrq-sz  avgqu-sz  await  svctm  %util
xvdb      12.20    0.00  1217.40  26.20  9197.60  530.80     15.65     29.66  23.47   0.80 100.00
xvdb      18.40    0.00  1121.20  19.60  8737.60  691.50     16.53     32.97  29.13   0.88 100.00
xvdb      27.80    0.00  1241.40  29.20  8158.40  377.90     13.44     42.59  33.73   0.79 100.00
xvdb      31.60    0.00  1256.60  35.00  9426.40  424.00     15.25     42.06  32.44   0.77 100.00
xvdb      57.68    0.00  1250.50  17.76  8588.42  352.99     14.10     51.36  40.60   0.79  99.80

The avgqu-sz is anywhere from 11 to 75, and the await is anywhere from 20
to 50. %util is always around 100.

-Nick

>>> On 2009/08/26 at 12:00, Javier Guerra <javier@guerrag.com> wrote:
> Shouldn't that be the other way around?
Hmmm... I may have to test with ext3 and the -T news parameter to see how
that works.

-Nick

>>> On 2009/08/26 at 11:54, John Madden <jmadden@ivytech.edu> wrote:
> ext3, always. [...]
> `mke2fs -j -O dir_index -T news /dev/vg/lv`
On Wednesday 26 August 2009 15:00:39 Javier Guerra wrote:
> On Wed, Aug 26, 2009 at 12:57 PM, Nick Couchman <Nick.Couchman@seakr.com> wrote:
> > I've changed the dom0 elevator to noop and the domU to deadline.
>
> shouldn't that be the other way around?

I see what you mean, but it's really what I meant (as described in another
message). The noop@dom0 is to lower the (already high) I/O latency and
leave all optimizations to the storage, at least in theory. In _my_ SAN,
though, it's still worth using a local I/O scheduler (although noop is OK).
If you don't have at least a decent hardware RAID controller with write
cache enabled, then it's a no-no.

About deadline@domU, that's perhaps even more peculiar. Apparently, a domU
pestering dom0 too often with I/O requests degrades performance, at least
in Xen 3.0.x.

-- 
Daniel
On Wed, Aug 26, 2009 at 12:07:55PM -0600, Nick Couchman wrote:
> Doesn't really seem to make a difference which way I do it... I still see
> pretty intense disk I/O.
> [...]
> The avgqu-sz is anywhere from 11 to 75, and the await is anywhere from 20
> to 50. %util is always around 100.

Well.. it seems your SAN LUN is the problem. Have you checked the load from
the FC storage array?

Or else the problem is in your FC HBA. Have you verified the FC link is at
full speed?

Are the FC switches OK?

Do you have an up-to-date HBA driver in dom0? Are the HBA/switch/storage
firmwares up-to-date?

-- Pasi
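(For reference, a sketch of checking the link from a Linux dom0 via the FC
transport class in sysfs; the host number is an example - there will be one
entry per HBA port, and the exact output strings vary by kernel:)

$ cat /sys/class/fc_host/host1/speed
4 Gbit
$ cat /sys/class/fc_host/host1/port_state
Online

# the qla2xxx driver also exposes its driver and firmware versions
$ cat /sys/class/scsi_host/host1/driver_version
$ cat /sys/class/scsi_host/host1/fw_version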
On Wed, Aug 26, 2009 at 11:01 PM, Nick Couchman <Nick.Couchman@seakr.com> wrote:
> I'm attempting to run an e-mail server on Xen. The e-mail system is
> Novell GroupWise, and it serves about 250 users. [...] I'm running into
> an issue with high I/O wait on the box (~250%) and large load averages
> (20-40 for the 1/5/15 minute average).

Just to be clear: can a native system handle your load? Try iostat on both
dom0 and domU. My guess is that you're I/O bound, and even moving it to a
native physical server won't help, since the bottleneck is in the disk.

> I was wondering if anyone has ideas on tuning the domU to handle this -
> is there a better way to forward the disk device through, should I try
> using an iSCSI software initiator in the domU,

Some past threads on this list suggest otherwise. iSCSI in domU gives worse
performance compared to (for example) iSCSI in dom0 and passing the disk
through using phy:/.

> or is it just a bad idea to put an I/O load like this in a domU?

If it works on a native system it should work in a domU.

-- 
Fajar
So here are some details on the SAN LUN... the SAN is a Compellent SAN
attached to my FC switch (McData Sphereon 4700, now the Brocade M4700) with
4 x 2Gb FC connections. The dom0 uses the QLE2462 adapter, with a single
4Gb connection hooked up. I did find that there is a later driver available
- I'll try to switch to that when I get a chance. One interesting thing I
found is that the adapter appears to be in a 4x PCIe slot, which means the
max bandwidth for the card is 2.5Gbps. I'm not sure if this is a QLogic
issue or if I need to move the card to a different slot in my Dell
PowerEdge R610 chassis, but it looks like I'm being limited to 2/3 or so of
the speed of the FC connection by my PCIe bus. It's using a 4Gbps
point-to-point connection, with a frame size of 2048. Any hints on whether
any of that needs tuning would be great.

I'm not really sure that bandwidth is the issue - perhaps latency more than
that. I don't think the amount of data is what's causing the problem;
rather, it's the number of transactions that the e-mail system is trying to
do on the volume. The file sizes are actually pretty small - 1 to 4 kB on
average - so I think it's the large number of these files it has to read
rather than streaming a large amount of data. Both the SAN and the iostat
output on dom0 and domU indicate somewhere between 5000 and 20000 kB/s read
rates - that's somewhere around 40Mb/s to 160Mb/s, which is well within the
capability of the FC connection. The SAN is indicating between 500 and 1500
I/O requests per second, which I assume is what's causing the problem.

Again, any tips on what to look at next would be greatly appreciated!
Thanks for all the advice so far!

-Nick
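(As an aside, a sketch of one way to confirm the negotiated PCIe link from
dom0; the bus address is an example - find the real one with "lspci | grep
-i qlogic". LnkCap is what the card supports, LnkSta is what was actually
negotiated:)

$ lspci -vv -s 06:00.0 | grep -E 'LnkCap|LnkSta'
        LnkCap: Port #0, Speed 2.5GT/s, Width x4 ...
        LnkSta: Speed 2.5GT/s, Width x4 ...

# if the Width in LnkSta is lower than in LnkCap, the card is running in a
# degraded link and moving it to another slot may help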
I think my previous native system, from which I migrated this VM, was
handling the load *better* than this one is, but unfortunately it's not a
one-to-one comparison. The previous system was a PowerEdge 2650 with 2 x
CPUs and 4GB of RAM, and a single 4Gb FC connection to the SAN. That system
was seeing some I/O issues, but was also seeing some pretty severe CPU
load. Now I've eliminated the CPU bottleneck by moving it into a VM that
has 4 x CPUs and 4GB of RAM, but I seem to be up against the I/O bottleneck
now.

Due to the complexity of the software installation, switching the load over
to the dom0 on the box really isn't an option for testing. But it seems
that you're probably right, since iostat in both dom0 and domU show very
similar statistics.

My real question is, what can I do to alleviate it? Is it really a SAN
issue? Will tuning the filesystem (even if that means recreating the
filesystem) help reduce the number of I/O operations per second? I guess I
have a few things to investigate, and I may file a support case with the
SAN vendor and request some assistance from them.

Thanks!
-Nick

>>> On 2009/08/27 at 03:36, "Fajar A. Nugraha" <fajar@fajar.net> wrote:
> Just to be clear: can a native system handle your load? Try iostat on
> both dom0 and domU. My guess is that you're I/O bound, and even moving it
> to a native physical server won't help, since the bottleneck is in the
> disk. [...]
> I'm not really sure that bandwidth is the issue - perhaps latency more
> than that. [...] The SAN is indicating between 500 and 1500 I/O requests
> per second, which I assume is what's causing the problem.

What does the backend inside the SAN look like? Look into the amount of
cache, number of spindles, RAID level used, what else is using those
spindles, etc. 500-1500 iops isn't a lot for a "SAN" in general, but given
that your FC disks are going to get around 200 worst-case iops each, you'd
still need quite a few of them to push 1500 continuously (with your cache
picking up some of the spikes). And that depends on workload (read/write,
random or not, block size) and RAID type.

In case you haven't already, I'd look into the usual filesystem performance
guides and do things like turning off atime and that lot. My feeling on
this is that you're going to need to drive down those iops numbers. What
were your results from trying something other than xfs?

John
On Thu, Aug 27, 2009 at 8:54 PM, Nick Couchman <Nick.Couchman@seakr.com> wrote:
> But, it seems that you're probably right, since iostat in both dom0 and
> domU show very similar statistics.
>
> My real question is, what can I do to alleviate it? Is it really a SAN
> issue?

If dom0 iostat says near 100% usage, then yes, most probably it's a storage
issue.

> Will tuning the filesystem (even if that means recreating the filesystem)
> help reduce the number of I/O operations per second? I guess I have a few
> things to investigate, and I may file a support case with the SAN vendor
> and request some assistance from them.

How many disks do you have in your SAN? How many disks are in use
exclusively by this system? A typical SATA disk handles < 100 random IOPS,
so that might be the issue, and increasing the number of disks (and
configuring them to be used evenly) seems to be the solution.

As to how to reduce the number of IOPS, well, I'm not really sure there's a
way to do it that doesn't involve changing your application. Some things to
try:
- If the load is bursty, then usually adding more write-back memory cache
  in the SAN/SCSI controller helps.
- If it's mostly temporary files, then using something like ext4, which has
  delayed allocation, should help.
- Another method would be switching to zfs and adding some SSD for ZIL, but
  that belongs in a different list :P

-- 
Fajar
Let's see... the SAN has two controllers with a 4GB cache in each
controller. Each controller has a single 4 x 2Gb FC card. Two of those
ports go to the switch; the other two create redundant loops with the disk
arrays (going from the controller to one disk array, then to the next disk
array, then to the second controller). The disks are FCATA disks; there are
30 active disks (with 2 hot spares). The SAN does RAID across the disks on
a per-volume basis, and my e-mail volume is using a RAID10 configuration.

I've done most of the filesystem tuning I can without completely rebuilding
the filesystem - atime is turned off. I've also adjusted the elevator per
previous suggestions and played with some of the tuning parameters for the
elevators. I haven't got around to trying something other than XFS yet -
it's going to take a while to sync over the stuff from the existing FS to
an EXT3 or something similar. I'm also contacting the SAN vendor to get
their help with the situation.

-Nick

>>> On 2009/08/27 at 08:15, John Madden <jmadden@ivytech.edu> wrote:
> What does the backend inside the SAN look like? Look into the amount of
> cache, number of spindles, RAID level used, what else is using those
> spindles, etc. [...]
On Thu, Aug 27, 2009 at 07:46:46AM -0600, Nick Couchman wrote:
> So here are some details on the SAN LUN... the SAN is a Compellent SAN
> attached to my FC switch (McData Sphereon 4700, now the Brocade M4700)
> with 4 x 2Gb FC connections. [...] One interesting thing I found is that
> the adapter appears to be in a 4x PCIe slot, which means the max
> bandwidth for the card is 2.5Gbps. [...]

OK. I don't think the pci-e slot is your problem.

> I'm not really sure that bandwidth is the issue - perhaps latency more
> than that. [...] The SAN is indicating between 500 and 1500 I/O requests
> per second, which I assume is what's causing the problem.

What's the size of those requests? 4 kB? 1500 IOPS * 4 kB/IO == 6000 kB/sec
(6 MB/sec).

What kind of disk drives are you using on the Compellent storage array, on
the RAID set for this LUN? 1500 random IOPS requires at least 10x 7200 rpm
SATA disks (if using SATA). Each 7200 rpm SATA disk can do at most around
150 random IOPS; each 15k rpm SAS disk can do at most ~300 random IOPS.
It's easy maths.

A big write-back cache in the storage array will help, though.

-- Pasi
On Thu, 2009-08-27 at 08:25 -0600, Nick Couchman wrote:
> Let's see... the SAN has two controllers with a 4GB cache in each
> controller. [...] The disks are FCATA disks; there are 30 active disks
> (with 2 hot spares). The SAN does RAID across the disks on a per-volume
> basis, and my e-mail volume is using a RAID10 configuration.

FCATA? Well, that isn't going to help your situation any. But 30 spindles
is a good start. How many are in your particular RAID 10 group? It's
sounding like 1500 iops might be all this guy can handle (ATA, maybe 120
iops each; RAID 10, so you're using at most 14 of those disks; that'd give
you 1680 iops max -- for reads -- half that for writes).

> tuning parameters for the elevators. I haven't got around to trying
> something other than XFS yet - it's going to take a while to sync over
> the stuff from the existing FS to an EXT3 or something similar. I'm also
> contacting the SAN vendor to get their help with the situation.

Shut down, rsync, remount, start up. I don't think your SAN vendor could
really help here...?

John
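(To make the arithmetic above explicit, under the usual rough assumptions -
~120 random IOPS per 7200 rpm ATA spindle, and RAID 10 where a read can be
served by either mirror but every write hits two drives:)

  reads:   14 drives x 120 IOPS        = ~1680 IOPS max
  writes:  (14 drives x 120 IOPS) / 2  = ~840 IOPS max

So a workload sustaining 500-1500 mostly-random read IOPS is already near
the theoretical ceiling of a 14-spindle RAID 10 group on those disks.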
Nick,

Do you mean the GroupWise data volume is on one RAID10 comprised of 30
disks dedicated to GroupWise data? Or that this one RAID volume is
contending with other volumes using the disks on the SAN?

I'm not familiar with how GroupWise works - does the ideal deployment
suggest separate sets of spindles for temp files, database, and transaction
logs?

Is the RAID block/chunk/stripe size aligned with the xfs sunit/swidth
parameters? Are the xfs block boundaries aligned with the RAID blocks?

Is that 4GB of write-back cache? What is the write-back delay? How fast are
the drives, in rpm?

> Date: Thu, 27 Aug 2009 08:25:08 -0600
> From: "Nick Couchman" <Nick.Couchman@seakr.com>
> Subject: Re: [Xen-users] Xen and I/O Intensive Loads
>
> Let's see... the SAN has two controllers with a 4GB cache in each
> controller. [...] The SAN does RAID across the disks on a per-volume
> basis, and my e-mail volume is using a RAID10 configuration. [...]
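(For anyone following along, a sketch of what aligning xfs to the RAID
geometry looks like at mkfs time; the 64 kB stripe unit and 15-disk stripe
width here are purely illustrative - the array's real values are what
matter:)

# su = the RAID chunk size, sw = data-bearing spindles per stripe
$ mkfs.xfs -d su=64k,sw=15 /dev/xvdb

# equivalent form in 512-byte sectors (sunit = 64k/512, swidth = sunit*15)
$ mkfs.xfs -d sunit=128,swidth=1920 /dev/xvdb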
Yeah, it may be time to invest in some real drives :-). It's hard to tell
how many disks are being used for the RAID 10 - the information isn't
readily available in the user interface for the SAN. I'm working with the
vendor right now, so maybe they can help me out. Also, I wasn't suggesting
the SAN vendor would help with the rsync/startup/shutdown, just that maybe
they can tell me whether the performance I'm seeing on the SAN is what I
should expect or not. I may have to purchase a chassis of FC disks that run
at 10K or 15K RPM - the current FCATA drives are 7200 RPM.

-Nick

>>> On 2009/08/27 at 08:30, John Madden <jmadden@ivytech.edu> wrote:
> FCATA? Well, that isn't going to help your situation any. But 30 spindles
> is a good start. How many are in your particular RAID 10 group? [...]
You might have seen my other replies, but here you go... 30 disks total
active in the SAN, with 2 hot spares. FCATA drives, 7200 RPM. I can't
really modify the SAN controllers much - they have 4GB of read and write
cache. I suppose I can ask the vendor if they go any higher than that, but
I think that's about the highest it goes. I may be able to purchase some
SSD, but that gets real $$$$ real fast, and I'd rather try out some faster
drives - maybe real FC at 10K or 15K RPM.

-Nick

>>> On 2009/08/27 at 08:18, "Fajar A. Nugraha" <fajar@fajar.net> wrote:
> How many disks do you have in your SAN? How many disks are in use
> exclusively by this system? A typical SATA disk handles < 100 random
> IOPS, so that might be the issue [...]
I'm not sure what the request size is - the block size on the filesystem is
4K, and the inode size is 512 bytes. The drives are FCATA, 7200 RPM - 30
active spindles plus two hot spares. I don't know how Compellent does the
striping, so I don't know if my RAID10 volume is striped across all 30
active spindles or if they choose a subset of those.

-Nick

>>> On 2009/08/27 at 08:28, Pasi Kärkkäinen <pasik@iki.fi> wrote:
> What's the size of those requests? 4 kB? 1500 IOPS * 4 kB/IO == 6000
> kB/sec (6 MB/sec).
> [...]
Oliver,

The way the Compellent system works is that it does a per-volume RAID. So
there are 30 disks presented to the SAN controllers as a JBOD, and then
each volume is assigned one or more RAID levels, and the controller stripes
the data and moves it between RAID levels. The GroupWise data volume is
configured as RAID10 only, but it does contend with other volumes on the
same set of disks.

GroupWise does not use separate disks or volumes for temporary data,
databases, logs, etc. - everything is kept in the same filesystem, and
there really isn't much documentation on whether it's possible, or how, to
separate those things.

I'm not sure about the RAID block/chunk/stripe size - the user interface on
the controller doesn't really lend itself well to those sorts of detailed
customizations. I'll have to dig a little to see about that. The drives are
FCATA, 7200 RPM, and the 4GB cache is for read and write - I'm not sure if
they do write-through or write-back - I'll check on that.

-Nick

>>> On 2009/08/27 at 09:11, "Oliver Wilcock" <oliver@owch.ca> wrote:
> Do you mean the GroupWise data volume is on one RAID10 comprised of 30
> disks dedicated to GroupWise data? Or that this one RAID volume is
> contending with other volumes using the disks on the SAN? [...]
On Thu, Aug 27, 2009 at 01:08:42PM -0600, Nick Couchman wrote:
> You might have seen my other replies, but here you go... 30 disks total
> active in the SAN, with 2 hot spares. FCATA drives, 7200 RPM. [...] I may
> be able to purchase some SSD, but that gets real $$$$ real fast, and I'd
> rather try out some faster drives - maybe real FC at 10K or 15K RPM.

If you're IOPS limited then definitely go for 15K drives. You'll get 50%
more IOPS from them.

-- Pasi