Hello,

I have a setup with several Xen 3.2.1 dom0s (Debian). Most of the VMs work fine, but I have load peaks on some PostgreSQL instances. PostgreSQL itself has been tuned quite carefully, I think, but during some jobs the I/O load is very high and Postgres becomes really slow.

I am using FC LUNs from a NetApp SAN (with fast 15,000 rpm FC disks) and LVM, as follows: the FC LUNs are in a volume group (I add new LUNs to the VG when more space is needed) and the VG is split across several database servers (Xen guests) using LVs.

So the disk config for my VM looks like this:

    disk = [ 'phy:/dev/xendata/postgresql-syslog:xvdb1:w' ]

What can be done to improve disk I/O and throughput?

--
<http://www.horoa.net> Alexandre Chapellon
Open-source systems and network engineering.
Follow me on twitter: @alxgomz <http://www.twitter.com/alxgomz>
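For reference, a minimal sketch of that LVM layout (the VG and LV names match the config above; the multipath LUN device names and the LV size are only examples):

    # LUNs from the NetApp go into one volume group
    pvcreate /dev/mapper/netapp-lun0
    vgcreate xendata /dev/mapper/netapp-lun0
    # when more space is needed, another LUN extends the same VG
    vgextend xendata /dev/mapper/netapp-lun1
    # one LV per database guest, handed to the domU as a phy: device
    lvcreate -L 200G -n postgresql-syslog xendata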
On Mon, Nov 28, 2011 at 4:03 PM, Alexandre Chapellon <a.chapellon@horoa.net> wrote:
> disk = [ 'phy:/dev/xendata/postgresql-syslog:xvdb1:w' ]
>
> What can be done to improve disk I/O and throughput?

Well, for starters, are you sure this is a Xen issue? Try something simple:

- mount the LV on dom0 (use "xm block-attach 0", or kpartx and pvscan/vgchange if necessary)
- do a chroot and start postgres
- give it some load (e.g. sysbench)
- clean up (unmount, xm block-detach, etc.)
- repeat the sysbench run in the domU and compare (a simplified sketch follows below)

--
Fajar
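A simplified variant of that comparison, using sysbench's fileio mode directly against the mounted filesystem so no chroot is needed (a sketch only: the mount point, file size and run time are examples, and the guest must be shut down before its LV is mounted in dom0):

    # dom0: with the guest stopped, mount its data LV
    mkdir -p /mnt/pgtest && mount /dev/xendata/postgresql-syslog /mnt/pgtest
    cd /mnt/pgtest
    sysbench --test=fileio --file-total-size=4G --file-test-mode=rndrw prepare
    sysbench --test=fileio --file-total-size=4G --file-test-mode=rndrw --max-time=120 run
    sysbench --test=fileio --file-total-size=4G cleanup
    cd / && umount /mnt/pgtest
    # boot the guest again and repeat the same three sysbench commands inside
    # the domU; a large gap between the two runs points at the Xen block layer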
Dear Alexandre,

On Monday, 28 November 2011 at 10:03:23, Alexandre Chapellon wrote:
> disk = [ 'phy:/dev/xendata/postgresql-syslog:xvdb1:w' ]

This is typically correct.

> What can be done to improve disk I/O and throughput?

I don't know much about the NetApp SAN you describe, but there are usually several possible reasons for database performance problems on SAN storage.

One of the major bottlenecks I have often seen is a very limited transaction rate (number of I/O requests per second) on the SAN once its cache is full, even when the SAN offers high bandwidth/throughput for writing large files etc. What kind of RAID configuration do you use within your SAN? Are there other applications/users sharing the same disk space / disks in parallel? How many "free" disk heads are available for your pgsql? What kind of filesystems do you use?

You could run some storage benchmarking tools to track down the source of the bottleneck (make sure your benchmark fills up or bypasses the SAN cache, or the results will not be usable).

At the Xen level itself I don't see any further optimization options. The typical things to review for optimization are:

- reduce swappiness, or disable swapping entirely (a small sketch follows below)
- optimize RAM usage (buffering)
- (if possible) give pgsql exclusive access to the physical disks

In several cases filesystem behaviour / suboptimal block sizes can hurt performance too.

hth
best regards,

Niels.

--
---
Niels Dettenbach
Syndicat IT&Internet
http://www.syndicat.com/
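A small sketch of the swapping point, assuming a Debian domU (the value is only a starting point, not a recommendation for every workload):

    # check and lower the kernel's tendency to swap
    cat /proc/sys/vm/swappiness
    sysctl -w vm.swappiness=10                       # or 0 to swap only as a last resort
    echo "vm.swappiness = 10" >> /etc/sysctl.conf    # make it persistent across reboots
    # RAM usage / buffering on the PostgreSQL side is controlled in postgresql.conf
    # (shared_buffers, effective_cache_size); leave room for the OS page cache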
On 28/11/2011 10:36, Niels Dettenbach wrote:
> Dear Alexandre,
>
> On Monday, 28 November 2011 at 10:03:23, Alexandre Chapellon wrote:
>> disk = [ 'phy:/dev/xendata/postgresql-syslog:xvdb1:w' ]
> This is typically correct.
>
>> What can be done to improve disk I/O and throughput?
> I don't know much about the NetApp SAN you describe, but there are usually
> several possible reasons for database performance problems on SAN storage.
>
> One of the major bottlenecks I have often seen is a very limited transaction rate
> (number of I/O requests per second) on the SAN once its cache is full, even when the
> SAN offers high bandwidth/throughput for writing large files etc. What kind of RAID
> configuration do you use within your SAN? Are there other applications/users sharing
> the same disk space / disks in parallel? How many "free" disk heads are available for
> your pgsql? What kind of filesystems do you use?

The RAID is RAID-DP over 13 FC disks (15,000 rpm), and the LV containing the Postgres data is striped across 2 RAID groups of that kind (26 disks). The filesystem is ext4.

I am not very familiar with benchmarking tools; so far I have just used tiobench and monitoring tools like iotop or iostat... The thing is, I am not sure how to read the results.

> You could run some storage benchmarking tools to track down the source of the
> bottleneck (make sure your benchmark fills up or bypasses the SAN cache, or the
> results will not be usable).
>
> At the Xen level itself I don't see any further optimization options. The typical
> things to review for optimization are:
>
> - reduce swappiness, or disable swapping entirely
> - optimize RAM usage (buffering)
> - (if possible) give pgsql exclusive access to the physical disks
>
> In several cases filesystem behaviour / suboptimal block sizes can hurt performance too.

The thread is now off-topic, but if you don't mind I'd be glad to hear about good sources on how to deal with filesystem block size calculation, partition alignment and their implications for LVM...

Regards

> hth
> best regards,
>
> Niels.

--
<http://www.horoa.net> Alexandre Chapellon
Open-source systems and network engineering.
Follow me on twitter: @alxgomz <http://www.twitter.com/alxgomz>
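A quick way to check how the existing PVs are aligned, and how ext4 can be told about the stripe geometry (a sketch: the LUN device name is an example, the stride/stripe-width values must be recalculated from the real NetApp chunk size and data-disk count, and mkfs of course only applies to a new, empty LV):

    # data area start of each PV, in sectors (ideally a multiple of the array stripe size)
    pvs -o pv_name,pe_start --units s
    # if a partition table sits between the LUN and the PV, check its start sector too
    fdisk -lu /dev/mapper/netapp-lun0
    # example: 64 KiB chunks and 13 data disks -> stride = 64k/4k = 16, stripe-width = 16*13
    mkfs.ext4 -E stride=16,stripe-width=208 /dev/xendata/some-new-lv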
"Very high" and "very slow" sounds like "i pay many buks but got small potion". How many IOPS postgre generates from dom0 point of view? (see statistics in vbd/tap device in /sys). How do you did you check netapp storage performance? FC is not synonym for ''fast work'' and netapp ether can''t do magic if postgre creates a thousands of cold random read operations. I''ll like to propose to starts from stat gathering. At least atop in domU with postgre with enabled logging. On 28.11.2011 13:03, Alexandre Chapellon wrote:> Hello, > > I have a setup with several xen 3.2.1 dom0s (debian). Most of the VMs > works nice but I have some load pick problem on some postgresql > instances. PGSQL has been tuned quite nicely I guess, but during some > jobs, IO load is very high and postgres seems really slow. > I am using FC LUNs fomr a NetApp SAN (with fast FC 15000t/m disks) and > LVM as follow: > > FC LUNs are in a volume group (I had new LUNs on the VG when mre space > is needed) and VG is splited across several database servers (Xen > guests) using LVs. > So my disk config for my VM look like this: > > disk =[ ''phy:/dev/xendata/postgresql-syslog:xvdb1:w''] > > What can be done to improve disk IO and throuput? >
On 29/11/2011 04:14, George Shuklin wrote:
> "Very high" and "very slow" sounds like "I pay many bucks but get only a small
> portion".
>
> How many IOPS does Postgres generate from the dom0 point of view? (See the
> statistics of the vbd/tap device in /sys.) How did you check the NetApp
> storage performance?

I have found some queries that were bringing the system to its knees. Some of them are statistics collection, and adding the right index simply solved the problem. I never looked in /sys on dom0 to get information about IOPS. Instead I used iostat in the domU, and it showed ~5000 reads/s during the statistics collection. If I compare with what I see in /sys on the dom0 (looking at the stat file of the dm- block device, not the underlying devices), I see ~1200 writes/s and ~100 reads/s when things are OK.

I still have one database purge job that hurts performance, but it only runs once a week... I'll wait until next Monday to watch that stat file, and will post some values here.

Regards.

> FC is not a synonym for "fast", and the NetApp can't do magic either if
> Postgres issues thousands of cold random read operations.
>
> I'd propose to start with stat gathering: at the very least, atop in
> the domU running Postgres, with logging enabled.

--
<http://www.horoa.net> Alexandre Chapellon
Open-source systems and network engineering.
Follow me on twitter: @alxgomz <http://www.twitter.com/alxgomz>
> -----Original Message-----
> From: xen-users-bounces@lists.xensource.com [mailto:xen-users-
> bounces@lists.xensource.com] On Behalf Of Alexandre Chapellon
> Sent: Tuesday, November 29, 2011 4:27 AM
>
> On 29/11/2011 04:14, George Shuklin wrote:
> > "Very high" and "very slow" sounds like "I pay many bucks but get only a small
> > portion".
> >
> > How many IOPS does Postgres generate from the dom0 point of view? (See the
> > statistics of the vbd/tap device in /sys.) How did you check the NetApp
> > storage performance?
> I have found some queries that were bringing the system to its knees. Some of them are
> statistics collection, and adding the right index simply solved the problem. I never looked in
> /sys on dom0 to get information about IOPS. Instead I used iostat in the domU, and it
> showed ~5000 reads/s during the statistics collection.
> If I compare with what I see in /sys on the dom0 (looking at the stat file of the dm- block
> device, not the underlying devices), I see ~1200 writes/s and ~100 reads/s when things
> are OK.

Mechanical disks are slow. The rules for disk performance haven't really changed with virtualization: lots of RAM, lots of buffering.

In other words, avoid disk accesses like the plague, especially random I/O to physical drives.

Solid-state storage helps, but you can often achieve the same effect with lots of RAM, on the cheap.

I've fought similar issues on our virtualized clusters. After many cycles of tuning and monitoring, I came to a couple of conclusions. One, a 16-disk storage array isn't enough for 30 guests: despite plenty of capacity and bandwidth, random I/O is still the problem, which can only be mitigated with more spindles (we'd normally have at least 2 per physical host, but in our virtual cluster we've allocated one-fourth of that).

Two, rotating media are 1970's technology, good for little more than archival, and ripe for replacement. I'm keeping an eye on price/capacity for solid-state storage.

One little tip for Linux users: mount guest filesystems with "noatime" whenever you can. You'll be glad you did.

-Jeff
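A minimal example of that last tip (the device name matches the disk line earlier in the thread; the mount point is only an assumption, adjust it to wherever the Postgres data actually lives):

    # in the domU's /etc/fstab: no access-time updates on the Postgres volume
    /dev/xvdb1   /var/lib/postgresql   ext4   noatime,nodiratime   0   2
    # or test the effect first on a live system
    mount -o remount,noatime /var/lib/postgresql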
On Tuesday, 29 November 2011 at 14:10:45, Jeff Sturm wrote:
> Mechanical disks are slow. The rules for disk performance haven't really
> changed with virtualization: lots of RAM, lots of buffering.

ack

> In other words, avoid disk accesses like the plague, especially random I/O
> to physical drives.

ack.

> Solid-state storage helps, but you can often achieve the same effect with
> lots of RAM, on the cheap.

This is not correct in all cases, as it still depends heavily on the SSD model (wear leveling etc.) and, not least, on the application's disk usage profile (i.e. heavily parallel accesses etc.). This means a high-quality, application-specific, optimized SAS RAID can still be significantly "faster" than many SSDs.

> random I/O is still the problem, which can only be mitigated with more
> spindles (we'd normally have at least 2 per physical host, but in our virtual
> cluster we've allocated one-fourth of that).

Yes, but in addition: a "long" SAN path can slow down random I/O transaction rates too, compared to "directly" attached disks. Another "ugly" point with a SAN can be parallel access from other users / SAN clients to the same physical disks, or even swapping onto them from the same systems.

> Two, rotating media are 1970's technology, good for little more than
> archival, and ripe for replacement. I'm keeping an eye on price/capacity for
> solid-state storage.

SSDs are still not a generally better / faster solution in every case (even if you disregard the price), especially for databases. Most database software developers / projects are still in the process of optimizing their products for SSD storage, just as they spent decades optimizing for rotating disks, while the SSD manufacturers do their part of the development. See, for example, the developer discussions around MySQL etc. on this topic...

hth
best regards,

Niels.

--
---
Niels Dettenbach
Syndicat IT&Internet
http://www.syndicat.com/
Use flashcache. In my tests it works very well (about 15% degradation compared to fileio/write-back mode for iSCSI, which I regard as the fastest possible, but risky, storage access mode).

On 29.11.2011 18:10, Jeff Sturm wrote:
> Solid-state storage helps, but you can often achieve the same effect with
> lots of RAM, on the cheap.
>
> Two, rotating media are 1970's technology, good for little more than
> archival, and ripe for replacement. I'm keeping an eye on price/capacity for
> solid-state storage.
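A sketch of how flashcache could be wired into this setup (assumptions: the flashcache module and userspace tools are already built and installed, /dev/sdb is a hypothetical local SSD in the dom0, and write-back mode is acceptable despite its risk):

    # create a write-back cache device in front of the guest's LV
    modprobe flashcache
    flashcache_create -p back pgcache /dev/sdb /dev/xendata/postgresql-syslog
    # then point the guest at the cached device instead of the raw LV:
    # disk = [ 'phy:/dev/mapper/pgcache:xvdb1:w' ]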
> -----Original Message-----
> From: xen-users-bounces@lists.xensource.com [mailto:xen-users-
> bounces@lists.xensource.com] On Behalf Of Niels Dettenbach
> Sent: Tuesday, November 29, 2011 10:08 AM
>
> > Solid-state storage helps, but you can often achieve the same effect
> > with lots of RAM, on the cheap.
> This is not correct in all cases, as it still depends heavily on the SSD model (wear
> leveling etc.) and, not least, on the application's disk usage profile (i.e. heavily
> parallel accesses etc.).

Yeah. SSD isn't a cure-all, yet. For what I need (high throughput on small random read requests), SSD looks like it may be a winner. Haven't done a lot of testing yet, though.

> This means a high-quality, application-specific, optimized SAS RAID
> can still be significantly "faster" than many SSDs.

Sure, it's possible. Though it can be depressingly hard to find software that optimizes disk accesses well. MySQL is particularly bad; Oracle fares better.

> Another "ugly"
> point with a SAN can be parallel access from other users / SAN clients to the same
> physical disks, or even swapping onto them from the same systems.

That's a big problem with virtualization too (bringing us back on topic). Each host may do a good job of optimizing its own disk accesses, but run them all at once off the same SAN and you can end up with a huge mess.

-Jeff