Hi list,

first let me please describe our current storage solution. We have two
storage machines with about 12 TB of disk space. Each domU has one LV on
each storage box, exported via iSCSI, and the domU assembles the
connected targets into a software RAID array. The main advantage is that
one storage server can fail and no domU is affected, but two big
disadvantages are a lot more administration effort and sometimes obscure
problems with the software RAID arrays.

Now my question: is there any other storage option that keeps the
advantage of this solution but minimizes the administration effort?

I am thankful for any hint or field report.

Regards,
Jan
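For illustration, the export side of the setup Jan describes might look
roughly like this on each storage box. The VG, LV and IQN names are made
up, and iSCSI Enterprise Target (which comes up later in the thread) is
assumed as the target implementation:

    # one LV per domU on each storage box
    lvcreate -L 20G -n guest1 vg0

    # export it via iSCSI, /etc/ietd.conf:
    Target iqn.2009-03.de.example:box1.guest1
        Lun 0 Path=/dev/vg0/guest1,Type=blockio

The domU then logs in to one such target per box and mirrors the two
imported disks with mdadm.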
On Wed, 11 Mar 2009, Jan Marquardt wrote:

> Now my question: is there any other storage option that keeps the
> advantage of this solution but minimizes the administration effort?

Abstract your disks and iSCSI exports, then use ZFS on two pools; this
will minimize the administration. Implement hardware failover so the
second box will only mount and export the disks if the first box is
offline.

Stefan
The first thing I'd do is move all the volume management from domU to
dom0. IOW: the iSCSI initiator and RAID (I guess it's RAID1) should be
on dom0, and the domU configs should refer to the resultant block
devices.

All the administration would be done on dom0, since the domUs would
simply see a 'normal' block device. You'll also see some performance
advantage, maybe quite significant.

--
Javier
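A minimal sketch of what Javier suggests, done in dom0 (the IQNs,
portals and device names are assumptions):

    # dom0 logs in to both storage boxes
    iscsiadm -m node -T iqn.2009-03.de.example:box1.guest1 -p 10.0.0.1 --login
    iscsiadm -m node -T iqn.2009-03.de.example:box2.guest1 -p 10.0.0.2 --login

    # dom0 mirrors the two imported disks
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdc /dev/sdd

    # the domU config then just refers to the resulting block device:
    # disk = [ 'phy:/dev/md0,xvda,w' ]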
On Wed, 2009-03-11 at 10:54 -0500, Javier Guerra wrote:
> The first thing I'd do is move all the volume management from domU to
> dom0.
>
> IOW: the iSCSI initiator and RAID (I guess it's RAID1) should be on
> dom0, and the domU configs should refer to the resultant block devices.

Agreed. You could even potentially move the mirroring down to the
storage nodes (mirrored nbd/etc. devices) and HA the iSCSI target
service itself to reduce dom0's work, although that would depend on you
being comfortable with iSCSI moving around during a storage node
failure, which may be a risk factor.

John

--
John Madden
Sr. UNIX Systems Engineer
Ivy Tech Community College of Indiana
jmadden@ivytech.edu
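A rough sketch of what John describes, on the active storage node. The
hostnames, port and device names are assumptions, and this is one way to
wire it up, not a recipe:

    # import the peer box's LV over nbd (peer runs nbd-server on port 2000)
    nbd-client box2.example.com 2000 /dev/nbd0

    # mirror the local LV with the remote one; RAID now lives below iSCSI
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/vg0/guest1 /dev/nbd0

    # export the mirror as a single target, /etc/ietd.conf:
    Target iqn.2009-03.de.example:guest1
        Lun 0 Path=/dev/md0,Type=blockio

The dom0 initiator then sees one target, and a failover tool only has to
move that target's IP between the storage nodes.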
Honestly, I'd probably suggest you discard the idea entirely and switch
to using Heartbeat to manage the iSCSI resources. Alternatively, I
believe you could also use iSCSI multipath pointing at two targets
presenting the same DRBD-backed volume.

Consider your failure scenario: if you have a storage node go offline in
your current configuration for any real length of time, then when it
becomes available again, all of the nodes will begin to resync their
arrays simultaneously. With a single domU, you'll just consume the vast
majority of either your disk I/O or network I/O. However, if you had a
dozen guests, and they all start to rebuild their RAID1s from the same
source SAN to the same destination SAN, through the same network link
(in and out), at the same time, things are probably going to grind to an
absolute halt.

RAID is nearly always best when it is used right above the physical
disks. The more layers you put between the RAID and the disks, the more
bad things(tm) seem to occur.

Best Regards,
Nathan Eisenberg
Atlas Networks, LLC
Phone: 206-577-3078
support@atlasnetworks.us
www.atlasnetworks.us
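A minimal sketch of the multipath variant Nathan mentions (the portal
addresses are assumptions, and both portals must really present the same
DRBD-backed LUN, or you will corrupt data):

    # log in to the same target through both portals
    iscsiadm -m node -T iqn.2009-03.de.example:guest1 -p 10.0.0.1 --login
    iscsiadm -m node -T iqn.2009-03.de.example:guest1 -p 10.0.0.2 --login

    # dm-multipath coalesces the two paths into one device
    multipath -ll    # shows the resulting /dev/mapper/<wwid> device

The domU config then points at the multipath device, and a dead path is
simply dropped instead of surfacing as an I/O error.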
> IOW: the iSCSI initiator and RAID (I guess it's RAID1) should be on
> dom0, and the domU configs should refer to the resultant block devices.

This is one solution we are discussing at the moment, but I think it
would be a lot smarter to get the RAID functionality on a layer between
the hard disks and the iSCSI targets, as advised by Nathan.

> Agreed. You could even potentially move the mirroring down to the
> storage nodes (mirrored nbd/etc. devices) and HA the iSCSI target
> service itself to reduce dom0's work, although that would depend on
> you being comfortable with iSCSI moving around during a storage node
> failure, which may be a risk factor.

I think we would have to reboot each domU after a failure in this case,
wouldn't we? The goal is to have domUs which are not affected by the
failure of one storage server.

> If you have a storage node go offline in your current configuration
> for any real length of time, when it becomes available again, all of
> the nodes will begin to resync the array simultaneously. [...]

This is of course also one reason why I want to change the current
setup.

> Abstract your disks and iSCSI exports, then use ZFS on two pools; this
> will minimize the administration.

ZFS seems to be very nice, but sadly we are not using Solaris and don't
want to use it with FUSE under Linux. Nevertheless, does anyone use ZFS
under Linux who can share his/her experiences?

Regards,
Jan
> I think we would have to reboot each domU after a failure in this
> case, wouldn't we? The goal is to have domUs which are not affected by
> the failure of one storage server.

Ideally, the initiator in dom0 would handle the failover of the target
successfully and the domU would never know anything had happened. I've
done things like entirely restart the target daemon without initiators
caring, but if it's gone long enough while I/O is going on, bad things
will happen.

> > Abstract your disks and iSCSI exports, then use ZFS on two pools;
> > this will minimize the administration.
>
> ZFS seems to be very nice, but sadly we are not using Solaris and
> don't want to use it with FUSE under Linux.

I wouldn't pay too much attention to the ZFS fanbois anyway. It does
good things, but hardly anything truly useful that you can't find in
LVM, and it doesn't really change anything in this case.

John

--
John Madden
Sr. UNIX Systems Engineer
Ivy Tech Community College of Indiana
jmadden@ivytech.edu
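How long the dom0 initiator masks a target outage before I/O errors
reach the domU is governed by open-iscsi's timeouts. A hedged example in
/etc/iscsi/iscsid.conf (the values are illustrative, not
recommendations):

    # seconds to wait for the session to come back before failing I/O
    # upward; raise it so a failover completes before the domU sees errors
    node.session.timeo.replacement_timeout = 120

    # how aggressively the connection is health-checked
    node.conn[0].timeo.noop_out_interval = 5
    node.conn[0].timeo.noop_out_timeout = 10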
PCextreme B.V. - Wido den Hollander | 2009-Mar-13 13:47 UTC | Re: [Xen-users] Storage alternatives
Hi,

Let me drop into this discussion. I have been running iSCSI on the dom0
for about two years now, and that works fine.

Making your iSCSI target highly available can be done by using DRBD and
Heartbeat; I have written a small howto (in Dutch), so if you would like
to have it, contact me.

Failovers on the iSCSI target side go unnoticed by the domU; on the dom0
you get an iSCSI connection error, but that's it.

This whole setup can be created with Open-iSCSI, iSCSI Enterprise
Target, DRBD and Heartbeat.

-
With kind regards,

Wido den Hollander
Head of System Administration / CSO
Phone Support Netherlands: 0900 9633 (45 cpm)
Phone Support Belgium: 0900 70312 (45 cpm)
Phone Direct: (+31) (0)20 50 60 104
Fax: +31 (0)20 50 60 111
E-mail: support@pcextreme.nl
Website: http://www.pcextreme.nl
Knowledge base: http://support.pcextreme.nl/
Network status: http://nmc.pcextreme.nl

On Fri, 2009-03-13 at 09:37 -0400, John Madden wrote:
> Ideally, the initiator in dom0 would handle the failover of the target
> successfully and the domU would never know anything had happened. [...]
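A minimal sketch of the kind of setup Wido describes. The resource
names, addresses and the iSCSI target init script name are assumptions;
his actual howto may well differ:

    # /etc/drbd.conf -- one replicated resource backing the iSCSI LUNs
    resource r0 {
        protocol C;
        on san1 { device /dev/drbd0; disk /dev/vg0/luns;
                  address 192.168.1.1:7788; meta-disk internal; }
        on san2 { device /dev/drbd0; disk /dev/vg0/luns;
                  address 192.168.1.2:7788; meta-disk internal; }
    }

    # /etc/ha.d/haresources -- Heartbeat promotes DRBD, floats the target
    # IP, and starts the iSCSI target on whichever node is alive
    san1 drbddisk::r0 IPaddr::10.0.0.10/24 iscsi-target

The dom0 initiators only ever talk to the floating IP, which is why a
failover shows up there as nothing worse than a connection error.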
On Fri, Mar 13, 2009 at 8:22 AM, Jan Marquardt <jm@artfiles.de> wrote:
>> IOW: the iSCSI initiator and RAID (I guess it's RAID1) should be on
>> dom0, and the domU configs should refer to the resultant block
>> devices.
>
> This is one solution we are discussing at the moment, but I think it
> would be a lot smarter to get the RAID functionality on a layer
> between the hard disks and the iSCSI targets, as advised by Nathan.

Yep, that further reduces duplicated traffic. I didn't mention this just
because I'm not too familiar with DRBD, and because I thought it's too
different from your current setup, so you might want to go in steps. If
it's easier to redo it all again, this is the better idea.

> I think we would have to reboot each domU after a failure in this
> case, wouldn't we? The goal is to have domUs which are not affected by
> the failure of one storage server.

That's one reason to put all storage management as low on the stack as
possible. In this case, dom0 should be the only one noticing the
movement, and any failover detection (either RAID1, multipath, IP
migration, etc.) should finish at dom0. The domUs won't feel a thing
(unless it takes so long that you get timeouts).

> ZFS seems to be very nice, but sadly we are not using Solaris and
> don't want to use it with FUSE under Linux.

ZFS gets you some nice ways to rethink storage, but in these cases it's
(mostly) the same as any other well-thought-out scheme.

--
Javier
Just to add to this thread: we are using DRBD here, spread across local
storage on two machines, with plans for failover coming soon. Basically
each machine has two NICs pointing at each other for the replication,
and I haven't experienced any I/O-related issues. We have everything in
file-based images, so in an extreme emergency we can export them to
another pool of servers that isn't set up this way. You can surely use
LVM on top of DRBD (we are using XFS and it is fantastic). In a test bed
I had Heartbeat and DRBD working perfectly to provide automatic failover
of the Xen resources.

We decided on DRBD for two reasons: one, we can easily change which
server mounts which DRBD device (for maintenance, one server can have
both mounted), and two, backups. I have written a backup script that
kills the connection between the DRBD arrays, tells the server the DRBD
was secondary on to go primary and mount, and then copies the VMs to an
iSCSI or NFS mount while the array is down. The script then reconnects
the DRBD and resyncs, never killing the online VMs (I still have to play
with the resync rate to find optimal settings for 10+ VMs).

So yeah, DRBD is pretty sweet, so if you can connect multiple iSCSI
targets in dom0 using DRBD, you are golden.

Tait
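Tait's script isn't posted, but a rough sketch of the sequence he
describes might look like this (the resource name, mount point and rsync
destination are all made up):

    #!/bin/sh
    # run on the node where r0 is currently Secondary
    drbdadm disconnect r0      # split the pair; Primary keeps serving VMs
    drbdadm primary r0         # promote the now-standalone copy
                               # (older DRBD may need a --force here)
    mount /dev/drbd0 /mnt/snap
    rsync -a /mnt/snap/vm-images/ backuphost:/backups/vm-images/
    umount /mnt/snap
    drbdadm secondary r0
    drbdadm connect r0         # reconnect; DRBD resyncs the changed blocks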
Hello! Do you mind if I drop in on your discussion?

What speeds are you seeing (reads and writes) on your dom0? I'm setting
up something similar with IET and open-iscsi, and am trying to tune it
for performance. I'd appreciate any numbers you would be willing to
share. Also, what is your hardware setup? Hardware RAID, software RAID,
network setup?

Cheers
cc

On Fri, Mar 13, 2009 at 6:47 AM, PCextreme B.V. - Wido den Hollander
<wido@pcextreme.nl> wrote:
> Making your iSCSI target highly available can be done by using DRBD
> and Heartbeat; I have written a small howto (in Dutch) [...]

--
Chris Chen <muffaleta@gmail.com>
"I want the kind of six pack you can't drink." -- Micah
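For anyone following along, these are the knobs people usually start
with in the IET/open-iscsi combination Chris mentions. The values are
illustrative only, not measured recommendations:

    # initiator side, /etc/iscsi/iscsid.conf
    node.session.cmds_max = 128
    node.session.queue_depth = 64
    node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144

    # target side, /etc/ietd.conf (per target)
    MaxRecvDataSegmentLength 262144
    Wthreads 8

Jumbo frames on a dedicated storage VLAN tend to matter at least as much
as any of these settings.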