Hello,

I have a setup with several Xen 3.2.1 dom0s (Debian). Most of the VMs work fine, but I have load peaks on some PostgreSQL instances. PostgreSQL itself has been tuned quite carefully, I think, but during some jobs the I/O load is very high and Postgres becomes really slow.

I am using FC LUNs from a NetApp SAN (with fast 15,000 rpm FC disks) and LVM, as follows: the FC LUNs are in a volume group (I add new LUNs to the VG when more space is needed) and the VG is split across several database servers (Xen guests) using LVs.

So the disk config for my VM looks like this:

    disk = [ 'phy:/dev/xendata/postgresql-syslog:xvdb1:w' ]

What can be done to improve disk I/O and throughput?

--
<http://www.horoa.net> Alexandre Chapellon
Open-source systems and network engineering.
Follow me on twitter: @alxgomz <http://www.twitter.com/alxgomz>
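For reference, a minimal sketch of that LVM layout (the VG and LV names match the config above; the multipath LUN device names and the LV size are only examples):

    # LUNs from the NetApp go into one volume group
    pvcreate /dev/mapper/netapp-lun0
    vgcreate xendata /dev/mapper/netapp-lun0
    # when more space is needed, another LUN extends the same VG
    vgextend xendata /dev/mapper/netapp-lun1
    # one LV per database guest, handed to the domU as a phy: device
    lvcreate -L 200G -n postgresql-syslog xendata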
On Mon, Nov 28, 2011 at 4:03 PM, Alexandre Chapellon <a.chapellon@horoa.net> wrote:
> disk = [ 'phy:/dev/xendata/postgresql-syslog:xvdb1:w' ]
>
> What can be done to improve disk I/O and throughput?

Well, for starters, are you sure this is a Xen issue? Try something simple:

- mount the LV on dom0 (use "xm block-attach 0", or kpartx and pvscan/vgchange if necessary)
- do a chroot and start postgres
- give it some load (e.g. sysbench)
- clean up (unmount, xm block-detach, etc.)
- repeat the sysbench run in the domU and compare (a simplified sketch follows below)

--
Fajar
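A simplified variant of that comparison, using sysbench's fileio mode directly against the mounted filesystem so no chroot is needed (a sketch only: the mount point, file size and run time are examples, and the guest must be shut down before its LV is mounted in dom0):

    # dom0: with the guest stopped, mount its data LV
    mkdir -p /mnt/pgtest && mount /dev/xendata/postgresql-syslog /mnt/pgtest
    cd /mnt/pgtest
    sysbench --test=fileio --file-total-size=4G --file-test-mode=rndrw prepare
    sysbench --test=fileio --file-total-size=4G --file-test-mode=rndrw --max-time=120 run
    sysbench --test=fileio --file-total-size=4G cleanup
    cd / && umount /mnt/pgtest
    # boot the guest again and repeat the same three sysbench commands inside
    # the domU; a large gap between the two runs points at the Xen block layer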
Dear Alexandre,

On Monday, 28 November 2011 at 10:03:23, Alexandre Chapellon wrote:
> disk = [ 'phy:/dev/xendata/postgresql-syslog:xvdb1:w' ]

This is typically correct.

> What can be done to improve disk I/O and throughput?

I don't know much about the NetApp SAN you describe, but there are usually several possible reasons for database performance problems on SAN storage.

One of the major bottlenecks I have often seen is a very limited transaction rate (number of I/O requests per second) on the SAN once its cache is full, even when the SAN offers high bandwidth/throughput for writing large files etc. What kind of RAID configuration do you use within your SAN? Are there other applications/users sharing the same disk space / disks in parallel? How many "free" disk heads are available for your pgsql? What kind of filesystems do you use?

You could run some storage benchmarking tools to track down the source of the bottleneck (make sure your benchmark fills up or bypasses the SAN cache, or the results will not be usable).

At the Xen level itself I don't see any further optimization options. The typical things to review for optimization are:

- reduce swappiness, or disable swapping entirely (a small sketch follows below)
- optimize RAM usage (buffering)
- (if possible) give pgsql exclusive access to the physical disks

In several cases filesystem behaviour / suboptimal block sizes can hurt performance too.

hth
best regards,

Niels.

--
---
Niels Dettenbach
Syndicat IT&Internet
http://www.syndicat.com/
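A small sketch of the swapping point, assuming a Debian domU (the value is only a starting point, not a recommendation for every workload):

    # check and lower the kernel's tendency to swap
    cat /proc/sys/vm/swappiness
    sysctl -w vm.swappiness=10                       # or 0 to swap only as a last resort
    echo "vm.swappiness = 10" >> /etc/sysctl.conf    # make it persistent across reboots
    # RAM usage / buffering on the PostgreSQL side is controlled in postgresql.conf
    # (shared_buffers, effective_cache_size); leave room for the OS page cache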
On 28/11/2011 10:36, Niels Dettenbach wrote:
> Dear Alexandre,
>
> On Monday, 28 November 2011 at 10:03:23, Alexandre Chapellon wrote:
>> disk = [ 'phy:/dev/xendata/postgresql-syslog:xvdb1:w' ]
> This is typically correct.
>
>> What can be done to improve disk I/O and throughput?
> I don't know much about the NetApp SAN you describe, but there are usually
> several possible reasons for database performance problems on SAN storage.
>
> One of the major bottlenecks I have often seen is a very limited transaction rate
> (number of I/O requests per second) on the SAN once its cache is full, even when the
> SAN offers high bandwidth/throughput for writing large files etc. What kind of RAID
> configuration do you use within your SAN? Are there other applications/users sharing
> the same disk space / disks in parallel? How many "free" disk heads are available for
> your pgsql? What kind of filesystems do you use?

The RAID is RAID-DP over 13 FC disks (15,000 rpm), and the LV containing the Postgres data is striped across 2 RAID groups of that kind (26 disks). The filesystem is ext4.

I am not very familiar with benchmarking tools; so far I have just used tiobench and monitoring tools like iotop or iostat... The thing is, I am not sure how to read the results.

> You could run some storage benchmarking tools to track down the source of the
> bottleneck (make sure your benchmark fills up or bypasses the SAN cache, or the
> results will not be usable).
>
> At the Xen level itself I don't see any further optimization options. The typical
> things to review for optimization are:
>
> - reduce swappiness, or disable swapping entirely
> - optimize RAM usage (buffering)
> - (if possible) give pgsql exclusive access to the physical disks
>
> In several cases filesystem behaviour / suboptimal block sizes can hurt performance too.

The thread is now off-topic, but if you don't mind I'd be glad to hear about good sources on how to deal with filesystem block size calculation, partition alignment and their implications for LVM...

Regards

> hth
> best regards,
>
> Niels.

--
<http://www.horoa.net> Alexandre Chapellon
Open-source systems and network engineering.
Follow me on twitter: @alxgomz <http://www.twitter.com/alxgomz>
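A quick way to check how the existing PVs are aligned, and how ext4 can be told about the stripe geometry (a sketch: the LUN device name is an example, the stride/stripe-width values must be recalculated from the real NetApp chunk size and data-disk count, and mkfs of course only applies to a new, empty LV):

    # data area start of each PV, in sectors (ideally a multiple of the array stripe size)
    pvs -o pv_name,pe_start --units s
    # if a partition table sits between the LUN and the PV, check its start sector too
    fdisk -lu /dev/mapper/netapp-lun0
    # example: 64 KiB chunks and 13 data disks -> stride = 64k/4k = 16, stripe-width = 16*13
    mkfs.ext4 -E stride=16,stripe-width=208 /dev/xendata/some-new-lv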
"Very high" and "very slow" sounds like "i pay many buks but got small potion". How many IOPS postgre generates from dom0 point of view? (see statistics in vbd/tap device in /sys). How do you did you check netapp storage performance? FC is not synonym for ''fast work'' and netapp ether can''t do magic if postgre creates a thousands of cold random read operations. I''ll like to propose to starts from stat gathering. At least atop in domU with postgre with enabled logging. On 28.11.2011 13:03, Alexandre Chapellon wrote:> Hello, > > I have a setup with several xen 3.2.1 dom0s (debian). Most of the VMs > works nice but I have some load pick problem on some postgresql > instances. PGSQL has been tuned quite nicely I guess, but during some > jobs, IO load is very high and postgres seems really slow. > I am using FC LUNs fomr a NetApp SAN (with fast FC 15000t/m disks) and > LVM as follow: > > FC LUNs are in a volume group (I had new LUNs on the VG when mre space > is needed) and VG is splited across several database servers (Xen > guests) using LVs. > So my disk config for my VM look like this: > > disk =[ ''phy:/dev/xendata/postgresql-syslog:xvdb1:w''] > > What can be done to improve disk IO and throuput? >
On 29/11/2011 04:14, George Shuklin wrote:
> "Very high" and "very slow" sounds like "I pay many bucks but get only a small
> portion".
>
> How many IOPS does Postgres generate from the dom0 point of view? (See the
> statistics of the vbd/tap device in /sys.) How did you check the NetApp
> storage performance?

I have found some queries that were bringing the system to its knees. Some of them are statistics collection, and adding the right index simply solved the problem. I never looked in /sys on dom0 to get information about IOPS. Instead I used iostat in the domU, and it showed ~5000 reads/s during the statistics collection. If I compare with what I see in /sys on the dom0 (looking at the stat file of the dm- block device, not the underlying devices), I see ~1200 writes/s and ~100 reads/s when things are OK.

I still have one database purge job that hurts performance, but it only runs once a week... I'll wait until next Monday to watch that stat file, and will post some values here.

Regards.

> FC is not a synonym for "fast", and the NetApp can't do magic either if
> Postgres issues thousands of cold random read operations.
>
> I'd propose to start with stat gathering: at the very least, atop in
> the domU running Postgres, with logging enabled.

--
<http://www.horoa.net> Alexandre Chapellon
Open-source systems and network engineering.
Follow me on twitter: @alxgomz <http://www.twitter.com/alxgomz>
> -----Original Message-----
> From: xen-users-bounces@lists.xensource.com [mailto:xen-users-
> bounces@lists.xensource.com] On Behalf Of Alexandre Chapellon
> Sent: Tuesday, November 29, 2011 4:27 AM
>
> On 29/11/2011 04:14, George Shuklin wrote:
> > "Very high" and "very slow" sounds like "I pay many bucks but get only a small
> > portion".
> >
> > How many IOPS does Postgres generate from the dom0 point of view? (See the
> > statistics of the vbd/tap device in /sys.) How did you check the NetApp
> > storage performance?
> I have found some queries that were bringing the system to its knees. Some of them are
> statistics collection, and adding the right index simply solved the problem. I never looked in
> /sys on dom0 to get information about IOPS. Instead I used iostat in the domU, and it
> showed ~5000 reads/s during the statistics collection.
> If I compare with what I see in /sys on the dom0 (looking at the stat file of the dm- block
> device, not the underlying devices), I see ~1200 writes/s and ~100 reads/s when things
> are OK.

Mechanical disks are slow. The rules for disk performance haven't really changed with virtualization: lots of RAM, lots of buffering.

In other words, avoid disk accesses like the plague, especially random I/O to physical drives.

Solid-state storage helps, but you can often achieve the same effect with lots of RAM, on the cheap.

I've fought similar issues on our virtualized clusters. After many cycles of tuning and monitoring, I came to a couple of conclusions. One, a 16-disk storage array isn't enough for 30 guests: despite plenty of capacity and bandwidth, random I/O is still the problem, which can only be mitigated with more spindles (we'd normally have at least 2 per physical host, but in our virtual cluster we've allocated one-fourth of that).

Two, rotating media are 1970's technology, good for little more than archival, and ripe for replacement. I'm keeping an eye on price/capacity for solid-state storage.

One little tip for Linux users: mount guest filesystems with "noatime" whenever you can. You'll be glad you did.

-Jeff
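A minimal example of that last tip (the device name matches the disk line earlier in the thread; the mount point is only an assumption, adjust it to wherever the Postgres data actually lives):

    # in the domU's /etc/fstab: no access-time updates on the Postgres volume
    /dev/xvdb1   /var/lib/postgresql   ext4   noatime,nodiratime   0   2
    # or test the effect first on a live system
    mount -o remount,noatime /var/lib/postgresql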
On Tuesday, 29 November 2011 at 14:10:45, Jeff Sturm wrote:
> Mechanical disks are slow. The rules for disk performance haven't really
> changed with virtualization: lots of RAM, lots of buffering.

ack

> In other words, avoid disk accesses like the plague, especially random I/O
> to physical drives.

ack.

> Solid-state storage helps, but you can often achieve the same effect with
> lots of RAM, on the cheap.

This is not correct in all cases, as it still depends heavily on the SSD model (wear leveling etc.) and, not least, on the application's disk usage profile (i.e. heavily parallel accesses etc.). This means a high-quality, application-specific, optimized SAS RAID can still be significantly "faster" than many SSDs.

> random I/O is still the problem, which can only be mitigated with more
> spindles (we'd normally have at least 2 per physical host, but in our virtual
> cluster we've allocated one-fourth of that).

Yes, but in addition: a "long" SAN path can slow down random I/O transaction rates too, compared to "directly" attached disks. Another "ugly" point with a SAN can be parallel access from other users / SAN clients to the same physical disks, or even swapping onto them from the same systems.

> Two, rotating media are 1970's technology, good for little more than
> archival, and ripe for replacement. I'm keeping an eye on price/capacity for
> solid-state storage.

SSDs are still not a generally better / faster solution in every case (even if you disregard the price), especially for databases. Most database software developers / projects are still in the process of optimizing their products for SSD storage, just as they spent decades optimizing for rotating disks, while the SSD manufacturers do their part of the development. See, for example, the developer discussions around MySQL etc. on this topic...

hth
best regards,

Niels.

--
---
Niels Dettenbach
Syndicat IT&Internet
http://www.syndicat.com/
Use flashcache. In my tests it works very well (about 15% degradation compared to fileio/write-back mode for iSCSI, which I regard as the fastest possible, but risky, storage access mode).

On 29.11.2011 18:10, Jeff Sturm wrote:
> Solid-state storage helps, but you can often achieve the same effect with
> lots of RAM, on the cheap.
>
> Two, rotating media are 1970's technology, good for little more than
> archival, and ripe for replacement. I'm keeping an eye on price/capacity for
> solid-state storage.
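A sketch of how flashcache could be wired into this setup (assumptions: the flashcache module and userspace tools are already built and installed, /dev/sdb is a hypothetical local SSD in the dom0, and write-back mode is acceptable despite its risk):

    # create a write-back cache device in front of the guest's LV
    modprobe flashcache
    flashcache_create -p back pgcache /dev/sdb /dev/xendata/postgresql-syslog
    # then point the guest at the cached device instead of the raw LV:
    # disk = [ 'phy:/dev/mapper/pgcache:xvdb1:w' ]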
> -----Original Message-----
> From: xen-users-bounces@lists.xensource.com [mailto:xen-users-
> bounces@lists.xensource.com] On Behalf Of Niels Dettenbach
> Sent: Tuesday, November 29, 2011 10:08 AM
>
> > Solid-state storage helps, but you can often achieve the same effect
> > with lots of RAM, on the cheap.
> This is not correct in all cases, as it still depends heavily on the SSD model (wear
> leveling etc.) and, not least, on the application's disk usage profile (i.e. heavily
> parallel accesses etc.).

Yeah. SSD isn't a cure-all, yet. For what I need (high throughput on small random read requests), SSD looks like it may be a winner. Haven't done a lot of testing yet, though.

> This means a high-quality, application-specific, optimized SAS RAID
> can still be significantly "faster" than many SSDs.

Sure, it's possible. Though it can be depressingly hard to find software that optimizes disk accesses well. MySQL is particularly bad; Oracle fares better.

> Another "ugly"
> point with a SAN can be parallel access from other users / SAN clients to the same
> physical disks, or even swapping onto them from the same systems.

That's a big problem with virtualization too (bringing us back on topic). Each host may do a good job of optimizing its own disk accesses, but run them all at once off the same SAN and you can end up with a huge mess.

-Jeff