Hello

in dom0:

root@nas-1:/etc/xen# cat /etc/xen/mogilefs-images.nas-1.lo.ufanet.ru.cfg
...
disk = [
        'file:/home/xen/domains/mogilefs-images.nas-1.lo.ufanet.ru/disk.img,xvda2,w',
        'file:/home/xen/domains/mogilefs-images.nas-1.lo.ufanet.ru/swap.img,xvda1,w',
        'phy:/dev/nas-1/mogilefs-images,xvdb1,w',
        'file:/mnt/ext4,xvdb2,w',
       ]
...

Here, /dev/nas-1/mogilefs-images is a logical volume in LVM, and /mnt/ext4
is a file located on another logical volume in the same LVM. The LVM
volume group sits on a single physical volume (software RAID6, 12 HDDs).

in domU (mogilefs-images.nas-1.lo.ufanet.ru):

root@mogilefs-images:~# mount /dev/xvdb1 /mnt/lvm
root@mogilefs-images:~# mount /dev/xvdb2 /mnt/file
root@mogilefs-images:~# dd if=/dev/zero of=/mnt/lvm/test.file bs=40960 count=50000
50000+0 records in
50000+0 records out
2048000000 bytes (2.0 GB) copied, 71.1815 s, 28.8 MB/s
root@mogilefs-images:~# dd if=/dev/zero of=/mnt/file/test.file bs=40960 count=50000
50000+0 records in
50000+0 records out
2048000000 bytes (2.0 GB) copied, 17.3065 s, 118 MB/s

Why such a big difference in the I/O rate, when both are physically on the same device?

--
Best regards,
Ruslan Sharipov
Head of Software Development and Support, OAO "Ufanet"
Fajar A. Nugraha
2012-Jun-13 04:52 UTC
Re: Big difference in IO between "file" and "phy"
On Wed, Jun 13, 2012 at 11:00 AM, Руслан Шарипов <ufaweb@gmail.com> wrote:
> Here, /dev/nas-1/mogilefs-images is a logical volume in LVM, and /mnt/ext4
> is a file located on another logical volume in the same LVM. The LVM
> volume group sits on a single physical volume (software RAID6, 12 HDDs).
>
> Why such a big difference in the I/O rate, when both are physically on the same device?

That's a known issue.

Depending on your configuration, dom0 kernel version, and type of test,
file:/ can be MUCH faster or MUCH slower than phy:/, due to the
interaction of the dom0 filesystem, the loopback driver, and the dom0
kernel cache.

But it shouldn't matter in a production environment, as you're NOT
supposed to use file:/ anyway, since in some circumstances it can cause
data corruption.
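You can see the loop devices that file:/ sets up by running losetup in
dom0 while the domU is up; the output below is only illustrative (the
device and inode numbers will differ on your box):

root@nas-1:~# losetup -a
/dev/loop0: [0801]:131074 (/home/xen/domains/mogilefs-images.nas-1.lo.ufanet.ru/disk.img)
/dev/loop1: [0801]:131075 (/home/xen/domains/mogilefs-images.nas-1.lo.ufanet.ru/swap.img)
/dev/loop2: [0801]:131076 (/mnt/ext4)

Every write to those disks goes through the loop driver and the dom0
page cache, which is where both the speed difference and the corruption
risk come from.

--
Fajar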
Руслан Шарипов
2012-Jun-13 07:59 UTC
Re: Big difference in IO between "file" and "phy"
2012/6/13 Fajar A. Nugraha <list@fajar.net>:
> That's a known issue.
>
> Depending on your configuration, dom0 kernel version, and type of test,
> file:/ can be MUCH faster or MUCH slower than phy:/, due to the
> interaction of the dom0 filesystem, the loopback driver, and the dom0
> kernel cache.
>
> But it shouldn't matter in a production environment, as you're NOT
> supposed to use file:/ anyway, since in some circumstances it can cause
> data corruption.

OK, but what does it depend on, and how can this problem be solved?

dom0:

root@nas-1:/etc# uname -a
Linux nas-1 3.2.0-2-amd64 #1 SMP Fri Jun 1 17:49:08 UTC 2012 x86_64 GNU/Linux
root@nas-1:/etc# xm info
host                   : nas-1
release                : 3.2.0-2-amd64
version                : #1 SMP Fri Jun 1 17:49:08 UTC 2012
machine                : x86_64
nr_cpus                : 4
nr_nodes               : 1
cores_per_socket       : 4
threads_per_core       : 1
cpu_mhz                : 3006
hw_caps                : 178bf3ff:efd3fbff:00000000:00001710:00802001:00000000:000037ff:00000000
virt_caps              : hvm
total_memory           : 16383
free_memory            : 14024
free_cpus              : 0
xen_major              : 4
xen_minor              : 1
xen_extra              : .2
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
xen_commandline        : placeholder dom0_mem=2048M dom0_max_vcpus=1 dom0_vcpus_pin
cc_compiler            : gcc version 4.6.3 (Debian 4.6.3-4)
cc_compile_by          : waldi
cc_compile_domain      : debian.org
cc_compile_date        : Sun May 6 18:08:50 UTC 2012
xend_config_format     : 4
root@nas-1:/etc# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  2047     1     r-----    780.4
mogilefs-images.nas-1.lo.ufanet.ru           6   128     1     -b----     27.5

--
Best regards,
Ruslan Sharipov
Fajar A. Nugraha
2012-Jun-13 08:12 UTC
Re: Big difference in IO between "file" and "phy"
On Wed, Jun 13, 2012 at 2:59 PM, Руслан Шарипов <ufaweb@gmail.com> wrote:
> 2012/6/13 Fajar A. Nugraha <list@fajar.net>
>> That's a known issue.
>>
>> Depending on your configuration, dom0 kernel version, and type of test,
>> file:/ can be MUCH faster or MUCH slower than phy:/, due to the
>> interaction of the dom0 filesystem, the loopback driver, and the dom0
>> kernel cache.
>>
>> But it shouldn't matter in a production environment, as you're NOT
>> supposed to use file:/ anyway, since in some circumstances it can cause
>> data corruption.
>
> OK, but what does it depend on

- how large is your test size, relative to dom0 memory
- is the test I/O random or sequential
- does the test use sync/direct I/O (see the dd example below)
- which fs was used
- how well does the fs handle the underlying storage "lying" about
  sync completion

> and how can this problem be solved?

Don't use file:/.
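As a quick illustration of the sync/direct point (the path is the one
from your test; the size is only an example), you can make dd bypass
the page cache:

# write 4 GiB with direct I/O, bypassing the domU page cache
dd if=/dev/zero of=/mnt/lvm/test.file bs=1M count=4096 oflag=direct

The numbers you get that way are usually much closer to what the disks
can really do -- though with file:/ the dom0 loop driver may still be
caching behind your back.

--
Fajar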
Руслан Шарипов
2012-Jun-13 08:17 UTC
Re: Big difference in IO between "file" and "phy"
2012/6/13 Fajar A. Nugraha <list@fajar.net>:
> On Wed, Jun 13, 2012 at 2:59 PM, Руслан Шарипов <ufaweb@gmail.com> wrote:
>> 2012/6/13 Fajar A. Nugraha <list@fajar.net>
>>> That's a known issue.
>>>
>>> Depending on your configuration, dom0 kernel version, and type of test,
>>> file:/ can be MUCH faster or MUCH slower than phy:/, due to the
>>> interaction of the dom0 filesystem, the loopback driver, and the dom0
>>> kernel cache.
>>>
>>> But it shouldn't matter in a production environment, as you're NOT
>>> supposed to use file:/ anyway, since in some circumstances it can cause
>>> data corruption.
>>
>> OK, but what does it depend on
>
> - how large is your test size, relative to dom0 memory

test size = 2 GByte, memory in dom0 = 2 GByte

> - is the test I/O random or sequential

sequential

> - does the test use sync/direct I/O

direct

> - which fs was used

ext4

> - how well does the fs handle the underlying storage "lying" about
> sync completion

not understood

>> and how can this problem be solved?
>
> Don't use file:/

OK, so it only remains to solve the problem of the slow speed when using phy:/ :)

--
Best regards,
Ruslan Sharipov
Fajar A. Nugraha
2012-Jun-13 08:25 UTC
Re: Big difference in IO between "file" and "phy"
On Wed, Jun 13, 2012 at 3:17 PM, Руслан Шарипов <ufaweb@gmail.com> wrote:
>> - how large is your test size, relative to dom0 memory
>
> test size = 2 GByte, memory in dom0 = 2 GByte

The image will be cached in dom0 memory, making your test invalid. You
need a dataset at least twice the RAM size to make sure caching is not
a factor.

>> - is the test I/O random or sequential
>
> sequential

It's kind of useless these days to do sequential testing, since most
loads (e.g. db, web server) will be random I/O. The exception is if
you're writing/reading large streamed video files or images (e.g.
downloading an ISO/VM image).

>> - does the test use sync/direct I/O
>
> direct

Do you know what direct I/O is? IIRC dd does not use direct or sync
I/O by default, so your test will be buffered.

Try fio: you can tell it to do random or sequential I/O, sync or
buffered, specify the data size, etc. (see the example below).

>> Don't use file:/
>
> OK, so it only remains to solve the problem of the slow speed when using phy:/ :)

Your throughput looks normal for RAID5/6. Anything above that would
mean the cache comes into play. Now if you were using striped RAID10,
the story might be different.
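For example, something like this would be a much more honest benchmark
(file name, size, and runtime here are only illustrative; note the
4 GiB data size, twice your dom0 RAM):

fio --name=randwrite --filename=/mnt/lvm/fio.test --rw=randwrite \
    --bs=4k --size=4g --direct=1 --ioengine=libaio --runtime=60 \
    --group_reporting

Change --rw to write/read/randread to cover the other access patterns.

--
Fajar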
Florian Heigl
2012-Jun-14 22:51 UTC
Re: Big difference in IO between "file" and "phy"
Hi,

> Do you know what direct I/O is? IIRC dd does not use direct or sync
> I/O by default, so your test will be buffered.

For the dd testing, add conv=fdatasync to get real numbers, i.e.:

dd if=/dev/zero of=mytestfile bs=1024k count=1024 conv=fdatasync

file: uses the Linux loop driver. Many people have lost data in that
mode because of write-caching in the host, i.e. if the dom0 crashes or
something, you're likely to end up with corrupted virtual machines.
Also, the loop device is f*****ing slow; it just sometimes appears
faster because of the caching. (NetBSD had a much better loop device.)

If you want fs-based storage, switch over to tap:aio or whatever it's
called (example config below).

A more modern option is the LVMTHIN target for device mapper, but that
requires some good LVM knowledge.
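Going from memory, so double-check the exact syntax against your Xen
version, but the file: lines from the original config would become
something like:

disk = [
        'tap:aio:/home/xen/domains/mogilefs-images.nas-1.lo.ufanet.ru/disk.img,xvda2,w',
        'tap:aio:/home/xen/domains/mogilefs-images.nas-1.lo.ufanet.ru/swap.img,xvda1,w',
        'phy:/dev/nas-1/mogilefs-images,xvdb1,w',
        'tap:aio:/mnt/ext4,xvdb2,w',
       ]

blktap does its own direct/async I/O instead of going through the loop
driver and the dom0 page cache.

Florian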
Ian Campbell
2012-Jun-19 10:08 UTC
Re: Big difference in IO between "file" and "phy"
On Thu, 2012-06-14 at 23:51 +0100, Florian Heigl wrote:
> A more modern option is the LVMTHIN target for device mapper, but that
> requires some good LVM knowledge.

This sounds like it might be an interesting project, do you have a link?
Google just came up with a bunch of linux-lvm threads..

Cheers,
Ian.
Fajar A. Nugraha
2012-Jun-19 10:18 UTC
Re: Big difference in IO between "file" and "phy"
On Tue, Jun 19, 2012 at 5:08 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Thu, 2012-06-14 at 23:51 +0100, Florian Heigl wrote:
>> A more modern option is the LVMTHIN target for device mapper, but that
>> requires some good LVM knowledge.
>
> This sounds like it might be an interesting project, do you have a link?
> Google just came up with a bunch of linux-lvm threads..

Probably http://lwn.net/Articles/465740/

However, looking at the (frighteningly) high amount of dmsetup involved
and the somewhat sparse documentation (what does 20971520 refer to?),
I personally prefer to use zfsonlinux's zvol (which can create thin
volumes as well) for now.

--
Fajar
Fajar A. Nugraha
2012-Jul-10 12:47 UTC
Re: Big difference in IO between "file" and "phy"
On Tue, Jun 19, 2012 at 5:18 PM, Fajar A. Nugraha <list@fajar.net> wrote:
> On Tue, Jun 19, 2012 at 5:08 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> On Thu, 2012-06-14 at 23:51 +0100, Florian Heigl wrote:
>>> A more modern option is the LVMTHIN target for device mapper, but that
>>> requires some good LVM knowledge.
>>
>> This sounds like it might be an interesting project, do you have a link?
>> Google just came up with a bunch of linux-lvm threads..
>
> Probably http://lwn.net/Articles/465740/
>
> However, looking at the (frighteningly) high amount of dmsetup involved
> and the somewhat sparse documentation (what does 20971520 refer to?),
> I personally prefer to use zfsonlinux's zvol (which can create thin
> volumes as well) for now.

FWIW, RHEL 6.3 has apparently incorporated thin snapshots into its lvm2
package as a tech preview:
https://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/6.3_Technical_Notes/storage_and_fs_tp.html
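With the lvm2 tooling the raw dmsetup plumbing is hidden. A minimal
sketch, assuming the volume group from this thread (the pool/LV names
and sizes are made up):

# create a 100G thin pool inside volume group "nas-1"
lvcreate -L 100G -T nas-1/thinpool

# carve a 40G thin (sparse) volume out of the pool for a domU
lvcreate -V 40G -T nas-1/thinpool -n mogilefs-thin

# which can then be exported to the guest as usual:
# disk = [ 'phy:/dev/nas-1/mogilefs-thin,xvdb1,w' ]

--
Fajar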