Hi Everyone,

I am considering a setup where I export an iSCSI target to a Xen node. This Xen node will then use the iSCSI block device as an LVM PV and create lots of LVs for DomU use.

I was wondering if anyone could make me aware of any special considerations I would need to take. I've posted a similar question to the LVM list to ask for further tips more specific to LVM.

Am I barking up the wrong tree here? I know it would be very easy to just use an NFS server and image files, but this will be for large-scale DomU hosting, so that isn't really an option. Additionally, if I wanted to make the LVM VG visible to multiple Xen nodes, is it just a matter of running CLVM on each Xen node? Please keep in mind that only one Xen node will be using an LV at any one time (so no need for GFS, I believe).

Any help or tips would be appreciated.

Thanks
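For concreteness, here is a minimal sketch of that layout, assuming open-iscsi and LVM2 on the Xen node; the portal address, IQN, and volume names below are made-up examples, not anything prescribed in the thread:

  # On the Xen node (dom0): log in to the target, then layer LVM on top of it
  iscsiadm -m discovery -t sendtargets -p 192.168.1.10
  iscsiadm -m node -T iqn.2011-04.example:xen-storage -p 192.168.1.10 --login

  # Suppose the iSCSI disk appears as /dev/sdb
  pvcreate /dev/sdb                      # the iSCSI block device becomes the LVM PV
  vgcreate vg_domu /dev/sdb              # one VG holding all DomU volumes
  lvcreate -L 10G -n vm01-disk vg_domu   # one LV per DomU

  # A DomU config then points straight at the LV, e.g.:
  #   disk = [ 'phy:/dev/vg_domu/vm01-disk,xvda,w' ]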
Why not create one iSCSI LUN per VM disk instead of carving them up on the hypervisor? That's more typical, and a more typical state of affairs in Linux is your friend. Also, if you exported one big PV you would have just one LUN queue shared by every VBD, instead of one LUN queue per VBD, and that becomes a problem at scale.

- Jonathan
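For comparison, a rough sketch of the one-LUN-per-VM-disk approach on the initiator side; the IQNs and device paths are again only examples, and stable /dev/disk/by-path names are assumed:

  # Each DomU disk is its own LUN on the storage server; dom0 just logs in and
  # hands the raw device to the guest, with no LVM layer in dom0.
  iscsiadm -m discovery -t sendtargets -p 192.168.1.10
  iscsiadm -m node -T iqn.2011-04.example:vm01-disk0 -p 192.168.1.10 --login

  ls /dev/disk/by-path/ | grep vm01
  #   ip-192.168.1.10:3260-iscsi-iqn.2011-04.example:vm01-disk0-lun-0

  # DomU config:
  #   disk = [ 'phy:/dev/disk/by-path/ip-192.168.1.10:3260-iscsi-iqn.2011-04.example:vm01-disk0-lun-0,xvda,w' ]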
We're talking hundreds, if not thousands, of DomUs here. Will iSCSI on Linux scale to these large numbers?

Thanks
Agreed, that's how we do it as well: one LUN for one host. In HA situations there is still no issue, as long as you use clustering software to determine which hypervisor runs the DomU.

B.

On 04/24/11 20:13, Jonathan Dye wrote:
> Why not create one iSCSI LUN per VM disk instead of carving them up on the hypervisor? [...]
That is completely dependent on your hardware specs and your DomUs' properties. It sounds like a lot, though. I seem to remember that some time ago you also said you wanted to run at least 100 DomUs on one hypervisor; maybe this is again pushing it. With a decent RAID and 10GbE or InfiniBand you can go a long way, though. You should also consider using SCST instead of IET, as it is faster.

B.

On 04/24/11 20:31, Jonathan Tripathy wrote:
> We're talking hundreds, if not thousands, of DomUs here. Will iSCSI on Linux scale to these large numbers?
Let me pose it to you this way. Say the queue depth is 32 for your iSCSI-based PV, which would be pretty typical. Would you want "hundreds if not thousands" of VMs sharing that same queue? Also, please note you will need a LOT of spindles to support a thousand VMs, unless you are doing diskless. Is this a compute cluster or a consolidation/cloud project?

- Jonathan
+1 for InfiniBand, but if anyone knows how to rescan the LUNs without breaking connectivity to all the LUNs with SRP in Linux, let me know o_O

- jonathan
Hi Guys,

Please forget the "thousands" number. We would have thousands of DomUs in total, but spread over multiple storage servers, so never mind that scale.

If I were exporting "one big LUN" per Xen node, it would contain at most 80 DomU LVs (in real-world usage, closer to 50). Furthermore, each LUN would be exported from a separate RAID array. Each storage server would contain x RAID arrays, where x equals both the number of Xen nodes served and the number of exported LUNs.

Of course, if I went with one LUN per DomU, then each storage server would contain 80x LUNs (closer to 50x, though).

With these numbers, any idea which is better?

Thanks
I think you had better take one target with several LUNs on it (one per DomU); that would make more sense. If you don't do that and instead use just one LUN for several DomUs, you need to create LVM LVs on the newly created disk for each DomU on the hypervisor side, which does not really sound comfortable. You would also close off any path to HA, unless you introduce some locking system, since every hypervisor would be trying to write to the LUN.

B.
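If the storage server ran IET, one target per Xen node with one LUN per DomU might look roughly like the ietd.conf fragment below; the IQN and LV names are assumptions, and SCST uses its own configuration format:

  # /etc/ietd.conf (IET) -- one target per Xen node, one LUN per DomU
  Target iqn.2011-04.example:storage.xen-node1
          # each LUN backed by an LV created on the storage server
          Lun 0 Path=/dev/vg_storage/domu-web01,Type=blockio
          Lun 1 Path=/dev/vg_storage/domu-db01,Type=blockio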
Thanks Bart, very helpful info.

I agree with you about the LVM PV issue; it is indeed very uncomfortable. I am looking into CLVM (clustered LVM), though, but it isn't very well documented.

So the current idea is one target per Xen node (hence one target per RAID array on the storage server), and one LUN per DomU. Is it easy enough to expand and shrink LUNs? That was the advantage of LVM that I loved. I guess I would run LVM on the storage server and export the LVs?

Thanks
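For what it is worth, on a RHEL/CentOS-style cluster stack, enabling CLVM on each Xen node sharing the VG is roughly the sketch below; package, service, and volume names are examples and vary by distribution, and a working cluster manager with fencing is assumed:

  lvmconf --enable-cluster           # sets locking_type = 3 in /etc/lvm/lvm.conf
  service cman start                 # cluster membership/fencing must already be configured
  service clvmd start                # cluster-wide LVM locking daemon

  # Create the shared VG as clustered so metadata changes are coordinated:
  vgcreate -c y vg_shared /dev/sdb
  lvcreate -L 10G -n vm01-disk vg_shared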
So, Linux storage servers then. If I might interject again, I would suggest you try Nexenta or Solaris 11 Express. If not, try a NAS appliance like FreeNAS or Openfiler; one of them is likely to have done a better job than you will attempting to reproduce it. If you're brave, try clustered storage with Ceph, since that's the way everything is headed anyway (i.e. the way of Isilon, Lustre, GPFS and the like). After all reasonable options fail, roll your own with LVM.

IMO, making a storage server out of Linux is inferior because the volume management, filesystem, and RAID are stratified instead of engineered together. If you use any modern Solaris-kernel-based distribution, like the ones named above, with ZFS, then I think you'll find that it can fill your network connection with storage traffic without tweaking. The downside is you have to be careful about hardware selection.

- Jonathan
Correct. If you create the LUNs on top of LVM LVs on the storage server, you have no issue regarding shrinking or expanding DomUs. Mind you, this obviously cannot happen "live": you need to bring down your DomU, bring down the LUN, and then resize the LV. Next you need to make changes to the partition table on the exported LUN.

B.
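A rough outline of that grow procedure under the same assumptions (LV-backed LUN on the storage server, open-iscsi on the Xen node; names are examples, and a shrink would additionally require reducing the filesystem first):

  # 1. Storage server: stop exporting the LUN, grow the backing LV, re-export it
  lvextend -L +5G /dev/vg_storage/domu-web01

  # 2. Xen node: with the DomU shut down, rescan (or log out and back in) so the new size is seen
  iscsiadm -m node -T iqn.2011-04.example:storage.xen-node1 -R

  # 3. Adjust the partition table on the LUN if it is partitioned (fdisk/parted),
  #    then grow the filesystem, e.g. from inside the DomU after booting it:
  #      resize2fs /dev/xvda1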
I concur: in terms of performance, Linux-based iSCSI might not be the fastest, but in terms of what you are familiar with and what is flexible, it might be a good choice again.

Also, it might be worth looking into ATAoE (ATA over Ethernet). Not popular, but I'm told it is fast as hell.

B.

On 04/24/11 22:01, Jonathan Dye wrote:
> So, Linux storage servers then. If I might interject again, I would suggest you try Nexenta or Solaris 11 Express. [...]
Or FCoE, for that matter. I know some very large deployments that are moving that way.

- Jonathan
Well, I'm very familiar with LVM and shrinking and extending LVs and filesystems. Been doing this for ages.

I would like to use Openfiler, however I'd like to script this, so maybe Linux is still the best option?

And just to confirm, will Linux iSCSI be ok with hundreds of LUNs? Assume network and spindle hardware is ok.

Thanks

-----Original Message-----
From: Bart Coninckx [mailto:bart.coninckx@telenet.be]
Sent: Sun 24/04/2011 21:04
To: Jonathan Dye
Cc: Jonathan Tripathy; xen-users@lists.xensource.com
Subject: Re: [Xen-users] Shared Storage

I concur: in terms of performance, Linux-based iSCSI might not be the fastest, but in terms of what you are familiar with or what is flexible, it might be a good choice again.

Also, it might be worth looking into ATAoE. Not popular, but I'm told it is fast as hell.

B.

On 04/24/11 22:01, Jonathan Dye wrote:
> So, Linux storage servers then. If I might interject again, I would suggest you try Nexenta or Solaris 11 Express. If not, try a NAS appliance like FreeNAS or Openfiler - one of the Linux-based ones is likely to have done a better job than you will attempting to reproduce it. If you're brave, try clustered storage with Ceph, since that's the way everything is headed anyway (i.e. the way of Isilon, Lustre, GPFS and the like). After all reasonable options fail, roll your own with LVM. IMO, making a storage server out of Linux is inferior because the volume management, filesystem, and RAID are stratified instead of engineered together. If you use any modern Solaris-kernel-based distribution, like the ones named above, and ZFS, then I think you'll find that it can fill your network connection with storage traffic without tweaking. The downside is you have to be careful about hardware selection.
>
> - Jonathan
>
> ----- Original Message -----
> From: "Jonathan Tripathy" <jonnyt@abpni.co.uk>
> To: "Bart Coninckx" <bart.coninckx@telenet.be>, xen-users@lists.xensource.com
> Sent: Sunday, April 24, 2011 1:43:46 PM
> Subject: RE: [Xen-users] Shared Storage
>
> Thanks Bart. Very helpful info.
>
> I agree with you about the LVM PV issue. It is indeed very uncomfortable. I am looking into CLVM (Clustered LVM) though, however this isn't very well documented.
>
> So the current idea is one target per Xen node (hence one target per RAID array on the storage server), and one LUN per DomU. Is it easy enough to expand and shrink LUNs? This was the advantage of LVM that I loved. I guess I would run LVM on the storage server and export the LVs?
>
> Thanks
>
> -----Original Message-----
> From: Bart Coninckx [mailto:bart.coninckx@telenet.be]
> Sent: Sun 24/04/2011 20:40
> To: Jonathan Tripathy
> Cc: Jonathan Dye; Xen List
> Subject: Re: [Xen-users] Shared Storage
>
> I think you had better take one target and then several LUNs on it (one per DomU); that would make more sense. If you don't do that and use just one LUN for several DomUs, you need to create LVM LVs on the newly created disk for each DomU on the hypervisor side, which does not really sound comfortable. You would also close any path to HA, unless you maybe introduce some locking system, since every hypervisor would want to write to the LUN.
>
> B.
>
> On 04/24/11 21:35, Jonathan Tripathy wrote:
>> Hi Guys,
>>
>> Please forget the "thousands" number. We would have thousands of DomUs, but this would be spread over multiple storage servers, so never mind about that scale.
>>
>> If I was exporting "one big LUN" per Xen node, it would contain at most 80 DomU LVs (in real-world usage, closer to 50). Furthermore, each LUN would be exported from a separate RAID array. Each storage server would contain x RAID arrays, where x equals the number of Xen nodes and the number of exported LUNs.
>>
>> Of course, if I went with one LUN per DomU, then each storage server would contain 80x LUNs (closer to 50x though).
>>
>> With these numbers, any idea which is better?
>>
>> Thanks
>>
>> -----Original Message-----
>> From: Bart Coninckx [mailto:bart.coninckx@telenet.be]
>> Sent: Sun 24/04/2011 19:36
>> To: Jonathan Tripathy
>> Cc: Jonathan Dye; Xen List
>> Subject: Re: [Xen-users] Shared Storage
>>
>> That is completely dependent on your hardware specs and DomUs' properties. It sounds like a lot, though. I seem to remember some time ago you also stated you wanted to run at least 100 DomUs on one hypervisor; maybe this is again pushing it. With a decent RAID and 10gbit or InfiniBand you can go a long way, though. You should also consider using SCST instead of IET, as it is faster.
>>
>> B.

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
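(For illustration only: scripting the "LV on the storage server, exported as a LUN" idea might look roughly like this if the storage server uses IET. The volume group name, IQN and sizes below are made up, and SCST has a different admin interface.)

  # on the storage server: carve an LV for a DomU, grow it later if needed
  lvcreate -L 10G -n domu42-disk vg_storage
  lvextend -L +5G /dev/vg_storage/domu42-disk   # the guest filesystem still needs resizing

  # export the LV as LUN 0 of a new IET target (tid must be unused)
  ietadm --op new --tid=42 --params Name=iqn.2011-04.example.com:domu42
  ietadm --op new --tid=42 --lun=0 --params Path=/dev/vg_storage/domu42-disk,Type=blockio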
There is no such thing as "Linux iSCSI"; you have several implementations. I dare say that IET is the best-known one, but not the fastest one. That would be SCST.

B.

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
On 04/24/2011 04:24 PM, Jonathan Tripathy wrote:
> Well, I'm very familiar with LVM and shrinking and extending LVs and
> filesystems. Been doing this for ages.
>
> I would like to use Openfiler, however I'd like to script this, so maybe
> Linux is still the best option?
>
> And just to confirm, Linux iSCSI will be ok with hundreds of LUNs?
> Assume network and spindle hardware is ok.

If you can serve the LUNs to the box, you'll be fine. But keep in mind the 256-per-target/channel limit inherent to SCSI. And why you'd want to manage hundreds of disks instead of hundreds of LVs is beyond me -- management at the LVM layer is a lot easier than duplicating the work of disk labeling, multipathing, etc.

Oh, and with hundreds of VMs, the spindle will always be your bottleneck.

John

--
John Madden
Sr UNIX Systems Engineer / Office of Technology
Ivy Tech Community College of Indiana
jmadden@ivytech.edu

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
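(A minimal sketch of managing at the LVM layer as described above, assuming the imported iSCSI LUN shows up in dom0 as /dev/sdb; the VG and guest names are placeholders.)

  # dom0: put the whole imported LUN under LVM once...
  pvcreate /dev/sdb
  vgcreate vg_domu /dev/sdb

  # ...then hand out one LV per guest instead of one LUN per guest
  lvcreate -L 20G -n guest01-root vg_domu

  # example Xen disk line for the guest config
  # disk = [ 'phy:/dev/vg_domu/guest01-root,xvda,w' ]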
Hi John,

I would honestly prefer to manage hundreds of LVs instead of hundreds of LUNs. I'm just concerned about the iSCSI bottleneck (if any) if I were to create an LVM VG using a single iSCSI LUN for about 50 - 100 LVs. Any advice is appreciated.

Thanks

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
> I would honestly prefer to manage hundreds of LVs instead of hundreds of
> LUNs. I'm just concerned about the iSCSI bottleneck (if any) if I were
> to create an LVM VG using a single iSCSI LUN for about 50 - 100 LVs. Any
> advice is appreciated.

I'd be more concerned over iSCSI itself being able to scale based on your workload, especially if you're doing it over GbE. Even 50 VMs each doing relatively little, but doing it concurrently, could cause problems given the nature of iSCSI (TCP overhead, latency of Ethernet, etc). My feel for this is that the fewer-LUN-more-LV route would be more efficient because you'll have fewer block device queues and multipath call-outs and such, but that's just a guess on my part. Test it out, see which one is better. If there's no difference, go with the one that's easier to manage.

John

--
John Madden
Sr UNIX Systems Engineer / Office of Technology
Ivy Tech Community College of Indiana
jmadden@ivytech.edu

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
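(A rough way to run that comparison -- purely a sketch; the device paths are placeholders, the writes destroy whatever is on them, and a real test should use something like fio with several concurrent jobs rather than a single dd stream.)

  # layout A: one LV carved from the single big LUN
  dd if=/dev/zero of=/dev/vg_domu/testlv bs=1M count=4096 oflag=direct

  # layout B: one dedicated LUN per guest
  dd if=/dev/zero of=/dev/sdc bs=1M count=4096 oflag=direct

  # repeat with several writers in parallel to mimic many DomUs hitting storage at once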
On 25/04/2011 18:34, John Madden wrote:
>> I would honestly prefer to manage hundreds of LVs instead of hundreds of
>> LUNs. I'm just concerned about the iSCSI bottleneck (if any) if I were
>> to create an LVM VG using a single iSCSI LUN for about 50 - 100 LVs. Any
>> advice is appreciated.
>
> I'd be more concerned over iSCSI itself being able to scale based on
> your workload, especially if you're doing it over GbE. Even 50 VMs
> doing relatively little though concurrently could cause problems given
> the nature of iSCSI (TCP overhead, latency of ethernet, etc).

When you say this, are you referring to the situation where I would have 1 LUN per VM (i.e. 50 LUNs)? I would be doing this over GbE, however I would be using trunking.

> My feel for this is that the fewer-LUN-more-LV route would be more
> efficient because you'll have fewer block device queues and multipath
> call-outs and such but that's just a guess on my part.

Interesting. I have had a comment on this list suggesting the opposite.

> Test it out, see which one is better. If there's no difference, go
> with the one that's easier to manage.

Hands down, managing LVM is my number one choice. Ideally I would just like to set up the iSCSI connections once and just leave it.

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
> Hands down, managing LVM is my number one choice. Ideally I would just
> like to set up the iSCSI connections once and just leave it

Yeah. iSCSI a few LUNs from your SAN, cLVM across your nodes (do the iSCSI in dom0), create your LVs and you're done.

This is really only half the picture though, and touches on another level of storage concepts. What does your backend disk and cache look like? In my clusters, I create two storage pools, one for "fast disk" and the other for "slow disk," then add LUNs from the SAN appropriately. You should get as granular as you can in performance and use-case terms, though, to keep the right IOs on the right disks, but that may not be practical with your SAN (e.g., if you just have 64 spindles in a single RAID-10 or some dumb JBOD or something).

I guess the message is to think about how you're laying out your data and then align that with how you lay out your disks. You may squeeze out an extra 5% by going with multiple LUNs versus a single LUN and another 30% by going with FC instead of multi-GbE, but you can gain even more by utilizing the limited I/O of a spindle more effectively.

John

--
John Madden
Sr UNIX Systems Engineer / Office of Technology
Ivy Tech Community College of Indiana
jmadden@ivytech.edu

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
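(Roughly what that looks like on each dom0 with open-iscsi and LVM -- a sketch only; the portal address, IQN and VG name are placeholders, and the clustered flag assumes cLVM and the cluster stack are already configured.)

  # log in to the SAN from dom0
  iscsiadm -m discovery -t sendtargets -p 192.168.10.5
  iscsiadm -m node -T iqn.2011-04.example.com:pool0 -p 192.168.10.5 --login

  # first node only: create the PV and a clustered VG on the new LUN
  pvcreate /dev/sdb
  vgcreate -c y vg_pool0 /dev/sdb

  # any node: carve per-DomU LVs as needed
  lvcreate -L 20G -n guest01-root vg_pool0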
Thanks for the excellent advice John. Very much appreciated. While I'm not able to disclose our disk setup (for commercial reasons), I am confident that what I have in mind is good for us, as we have been doing this in a non-shared manner (i.e. disks local to the Dom0) for quite some time. But yes, as you say, iSCSI will allow for a little bit of "fine tuning".

I also need to sanity test CLVM and see how well (or how badly) it handles lost iSCSI connections, propagating LVM metadata changes to other nodes, etc.

Now onto some testing to see what works out best...

Cheers

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
I don't have any information to add, but this discussion made me think about how XCP/XenServer handle the LUNs and LVs.

It's common practice to have one large LUN with several LVs, one per VM -- and that's completely supported by XCP. I *believe* XCP handles all the locking mechanisms to prevent two hosts from trying to access the same LV (which exists within the large shared LUN) at the same time, for example during a live migration.

It does not look like XCP uses cLVM to share the LVM information across all nodes -- does anyone have any deeper information on this? All the docs from Citrix/Xen mention is that you can simply have a single LUN shared across all nodes (using SR type LVMoiSCSI), but I have not had a chance to test live migration under those circumstances yet.

Am I missing something here? Is it possible to do live migrations using SR type LVMoiSCSI? The reason I ask is because the discussion made me think it would not be possible.

Best regards,
Eduardo.

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
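(For reference, creating such an SR on XCP/XenServer looks roughly like the following. This is a sketch only: the target address, IQN and SCSIid are placeholders, and xe sr-probe is the usual way to discover the real values.)

  # probe the target to list IQNs/LUNs (the output describes what it finds)
  xe sr-probe type=lvmoiscsi device-config:target=192.168.10.5

  # create a shared LVM-over-iSCSI SR for the pool
  xe sr-create name-label="iSCSI pool0" type=lvmoiscsi shared=true \
    device-config:target=192.168.10.5 \
    device-config:targetIQN=iqn.2011-04.example.com:pool0 \
    device-config:SCSIid=360000000000000000000000000000001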
>> I would honestly prefer to manage hundreds of LVs instead of hundreds of
>> LUNs. I'm just concerned about the iSCSI bottleneck (if any) if I were
>> to create an LVM VG using a single iSCSI LUN for about 50 - 100 LVs. Any
>> advice is appreciated.
>
> I'd be more concerned over iSCSI itself being able to scale based on
> your workload, especially if you're doing it over GbE. Even 50 VMs
> doing relatively little though concurrently could cause problems given
> the nature of iSCSI (TCP overhead, latency of ethernet, etc). My feel
> for this is that the fewer-LUN-more-LV route would be more efficient
> because you'll have fewer block device queues and multipath call-outs
> and such but that's just a guess on my part. Test it out, see which one
> is better. If there's no difference, go with the one that's easier to
> manage.

TCP offload helps a lot in the case of iSCSI. You get the ability to send ~60KB of data at once, and your packets are checksummed for free. I don't actually know what Linux support for RSS is like, but if it is any good you also get automatic distribution of rx workload across multiple CPUs too, although that only works if you have multiple TCP connections in flight at once (e.g. the multiple LUN scenario).

I don't know what iSCSI offload involves... maybe that can help further.

James

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
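(To see and toggle what a NIC is offloading, something like the following works with ethtool; eth0 is a placeholder and exactly which knobs exist depends on the driver.)

  # show current offload settings (TSO, checksumming, scatter-gather, ...)
  ethtool -k eth0

  # enable segmentation and checksum offload if the driver supports them
  ethtool -K eth0 tso on tx on rx on sg on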
> Am I missing something here? Is it possible to do live migrations
> using SR type LVMoiSCSI? The reason I ask is because the discussion
> made me think it would not be possible.

I don't know what "SR" means, but yes, you can do live migrations even without cLVM. All cLVM does is ensure consistency in LVM metadata changes across the cluster. Xen itself prevents the source and destination dom0's from trashing the disk during live migration. Other multi-node trashings are left to lock managers like OCFS2, GFS, not-being-dumb, etc.

John

--
John Madden
Sr UNIX Systems Engineer / Office of Technology
Ivy Tech Community College of Indiana
jmadden@ivytech.edu

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
> TCP offload helps a lot in the case of iSCSI. You get the ability to
> send ~60KB of data at once, and your packets are checksummed for free. I
> don't actually know what Linux support for RSS is like but if it is any
> good you also get automatic distribution of rx workload across multiple
> CPUs too, but that only works if you have multiple TCP connections in
> flight at once (eg the multiple LUN scenario).
>
> I don't know what iSCSI offload involves... maybe that can help further.

Offload is great, but it's still TCP. The protocol itself is heavy on overhead, so your 1gbit pipe ends up being, you know, .8gbit storage and .2gbit protocol (I don't know what the exact numbers are). Offload keeps this away from your CPUs but doesn't do anything on the pipe itself. iSCSI's great when you can't afford FC or FCoE (which still isn't as good as FC) or when you need to do things over a WAN, but it has its cost.

John

--
John Madden
Sr UNIX Systems Engineer / Office of Technology
Ivy Tech Community College of Indiana
jmadden@ivytech.edu

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
On Tue, Apr 26, 2011 at 12:03 PM, John Madden <jmadden@ivytech.edu> wrote:
> so your 1gbit pipe ends up being, you know, .8gbit storage and .2gbit
> protocol (I don't know what the exact numbers are)

more like .96gbit storage and .04gbit protocol.... or .993/.007 on jumbo frames

--
Javier

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
On 26/04/2011 18:53, Javier Guerra Giraldez wrote:
> On Tue, Apr 26, 2011 at 12:03 PM, John Madden <jmadden@ivytech.edu> wrote:
>> so your 1gbit pipe ends up being, you know, .8gbit storage and .2gbit
>> protocol (I don't know what the exact numbers are)
>
> more like .96gbit storage and .04gbit protocol.... or .993/.007 on jumbo frames

Are you using TCP and/or iSCSI offload in your NIC? Those seem like pretty good numbers.

This reminds me of a question that I was going to ask. I'm spec'ing up a Dell server and they have a "Broadcom® NetXtreme II 5709 Dual Port 1GbE NIC with TOE and iSCSI Offload". Does anyone have any idea how to enable the iSCSI offload under Linux? Or does it "just work"?

Thanks

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
On Tue, Apr 26, 2011 at 3:21 PM, Jonathan Tripathy <jonnyt@abpni.co.uk> wrote:
> On 26/04/2011 18:53, Javier Guerra Giraldez wrote:
>> more like .96gbit storage and .04gbit protocol.... or .993/.007 on jumbo
>> frames
>
> Are you using TCP and/or iSCSI offload in your NIC? Those seem like pretty
> good numbers.

that's the protocol overhead; what John guessed would be .8/.2. He's right in that no amount of offloading or CPU power would improve on that, but his numbers are way off.

--
Javier

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Eduardo Bragatto
2011-Apr-26 20:42 UTC
Re: [Xen-users] BCM5709 iSCSI Offload -- was: Shared Storage
On Apr 26, 2011, at 5:21 PM, Jonathan Tripathy wrote:
> This reminds me of a question that I was going to ask. I'm spec'ing
> up a Dell server and they have a "Broadcom® NetXtreme II 5709 Dual
> Port 1GbE NIC with TOE and iSCSI Offload". Does anyone have any idea
> how to enable the iSCSI offload under Linux? Or does it "just work"?

Linux has a module named bnx2i which enables iSCSI offload, and on CentOS it "just works" (as it's automatically loaded). However, it fails to work on all my NICs, even though they are known to support iSCSI offload (I even downloaded the latest vendor driver but it doesn't work); see dmesg output below. In my case those are actually quad port, though, but the same model (5709), from Dell PowerEdge R710 servers.

Loading the NIC's driver:

bnx2: Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.0.18c (Sep 13, 2010)
eth0: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem d6000000, IRQ 24, node addr xxxxxxxxxxxxxxxx
eth1: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem d8000000, IRQ 25, node addr xxxxxxxxxxxxxxxx
eth2: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem da000000, IRQ 26, node addr xxxxxxxxxxxxxxxx
eth3: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem dc000000, IRQ 27, node addr xxxxxxxxxxxxxxxx

Loading iSCSI offload:

bnx2i: Broadcom NetXtreme II iSCSI Driver bnx2i v2.1.3b (Oct 06, 2010)
iscsi: registered transport (bnx2i)
bnx2i: dev eth0 does not support iSCSI
bnx2i: dev eth1 does not support iSCSI
bnx2i: dev eth2 does not support iSCSI
bnx2i: dev eth3 does not support iSCSI

On XCP the module bnx2i is not available and I didn't even try to install it. If anyone has any info on that, it would be very much appreciated.

Best regards,
Eduardo.

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
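(When bnx2i does claim the ports, open-iscsi exposes them as extra initiator interfaces and the session has to be bound to one of them; roughly as below. The interface name, portal and IQN are placeholders.)

  # list iSCSI interfaces; offload-capable ports show up as bnx2i.<mac address>
  iscsiadm -m iface

  # bind discovery and login to the offload interface instead of the default "tcp" transport
  iscsiadm -m discovery -t sendtargets -p 192.168.10.5 -I bnx2i.00:10:18:aa:bb:cc
  iscsiadm -m node -T iqn.2011-04.example.com:pool0 -p 192.168.10.5 \
    -I bnx2i.00:10:18:aa:bb:cc --login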
On 04/26/2011 04:31 PM, Javier Guerra Giraldez wrote:
> that's the protocol overhead; what John guessed would be .8/.2. He's
> right in that no amount of offloading or CPU power would improve on
> that, but his numbers are way off.

Some good summary numbers are near the bottom here (PDF):

http://www.google.com/url?sa=t&source=web&cd=1&ved=0CCIQFjAA&url=http%3A%2F%2Fwww.rainiersolutions.com%2FRainierLibrary%2FiSCSI%2520SAN%2520Performance.pdf&rct=j&q=iscsi%20throughput%20gigabit&ei=NC63TaGhI8n50gGJl4kF&usg=AFQjCNGqiaj31yc3XYue97OL9KpPyo_zDg&cad=rja

With an MTU of 1500, they're claiming 94.93% payload. On GbE, I don't believe I've seen more than ~100MB/s on any protocol, hence my estimate of 20% overhead.

John

--
John Madden
Sr UNIX Systems Engineer / Office of Technology
Ivy Tech Community College of Indiana
jmadden@ivytech.edu

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
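(Back-of-envelope arithmetic behind those percentages -- a sketch only, assuming standard Ethernet framing overhead of 38 bytes per frame plus 20 bytes IP and 20 bytes TCP, and ignoring the iSCSI PDU header, which is amortised over many frames.)

  # payload fraction = TCP payload / bytes on the wire per frame
  echo "scale=4; 1460 / (1500 + 38)" | bc   # MTU 1500 -> .9492
  echo "scale=4; 8960 / (9000 + 38)" | bc   # MTU 9000 -> .9913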
On Apr 26, 2011, at 5:47 PM, John Madden wrote:
> With an MTU of 1500, they're claiming 94.93% payload. On GbE, I
> don't believe I've seen more than ~100MB/s on any protocol, hence my
> estimate of 20% overhead.

In sequential writes I get nearly all of the 125MB/s using iSCSI on BCM5709 without the iSCSI offload module enabled (better yet, the module is loaded but reporting that my cards do not support it, although they do).

It fluctuates from normally 120MB/s to rarely 125MB/s.

The switch my servers are connected to also contains an iSCSI offload engine, which is enabled, but I don't know the details about that (I don't even understand why the switch would care about the iSCSI payload in the first place)... However, that might be the reason I get literally 125MB/s at times.

Best regards,
Eduardo.

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
On Apr 26, 2011, at 6:37 PM, Eduardo Bragatto wrote:
> In sequential writes I get nearly all of the 125MB/s using iSCSI on
> BCM5709 without the iSCSI offload module enabled.
>
> It fluctuates from normally 120MB/s to rarely 125MB/s.

Oops, my bad. I decided to double-check after I posted, and after doing a 10GB sequential write I realized the software I used to test was using 10^9 for GB instead of 1024^3, resulting in higher rates than in reality.

After redoing the math using the correct amount of data transferred, I got 113MB/s, which is what the document defines as the maximum theoretical throughput.

Best regards,
Eduardo

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
> Offload is great, but it's still TCP. The protocol itself is heavy on
> overhead, so your 1gbit pipe ends up being, you know, .8gbit storage and
> .2gbit protocol (I don't know what the exact numbers are). Offload
> keeps this away from your CPUs but doesn't do anything on the pipe
> itself.

For your numbers to make any sense at all, that would be 1800 bytes of header for a 9000 byte packet. That doesn't seem even remotely right. Even 180 bytes of header (0.02) seems a bit excessive in terms of raw numbers, while still being pretty efficient in terms of ratio (0.98).

James

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
SR is the Xen Cloud Platform storage manager daemon, I think.

But yes, cLVM is not required to build a setup where a single large LUN is exported to multiple hypervisors, as long as you manage the LVM metadata on a single host. If you need to manage it on multiple hosts, make sure you script running an lvscan on the other hosts to switch the logical volumes to active.

If you are running a RHEL environment I highly suggest looking into cLVM and the rest of the RHEL cluster suite, as it makes a lot of what you are trying to do a lot easier. If not, there is plenty of room left for hackery. :)

For those that have suggested InfiniBand, I would also put my vote behind it. Our solutions are developed on InfiniBand and are some of the fastest in the world (or fastest in the case of cloud storage), and we have yet to saturate the bandwidth of bonded DDR, which is 40gbit. Price per port it is not that far from 10GbE, but much more useful and resilient.

Joseph.

--
Kind regards,
Joseph.

Founder | Director
Orion Virtualisation Solutions | www.orionvm.com.au | Phone: 1300 56 99 52 | Mobile: 0428 754 846

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
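(The "lvscan on the other hosts" step might look something like this -- a sketch with hypothetical VG/LV names; with cLVM running, this propagation and activation is handled for you.)

  # host A: create a new LV on the shared LUN
  lvcreate -L 20G -n guest07-root vg_pool0

  # host B (no cLVM): re-read metadata and activate the new LV before starting the guest there
  vgscan
  lvscan
  lvchange -ay vg_pool0/guest07-root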
I really like InfiniBand. However, it is not supported with XCP.

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
I am not 100% familiar with the internals of XCP, but after taking a glance it's based off a CentOS 5.4 kernel, I believe, which is OFED compatible. You could simply install the OFED RPM and have full InfiniBand support. IPoIB is fine for iSCSI-based storage etc.

Joseph.

--
Kind regards,
Joseph.

Founder | Director
Orion Virtualisation Solutions | www.orionvm.com.au | Phone: 1300 56 99 52 | Mobile: 0428 754 846

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
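(Bringing up IPoIB so the normal iSCSI initiator can run over it is roughly as follows -- a sketch; the module names are the standard OFED ones and the address is a placeholder.)

  # load the IPoIB driver and give the IB port an IP address
  modprobe ib_ipoib
  ifconfig ib0 10.0.0.11 netmask 255.255.255.0 up

  # optionally switch to connected mode and a large MTU for better throughput
  echo connected > /sys/class/net/ib0/mode
  ifconfig ib0 mtu 65520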
The only way I know of to get InfiniBand working with XCP is by manually doing it using the DDK. By default, XCP does not support InfiniBand, even though it is based off an OS that does support InfiniBand.

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Has anyone successfully been able to share the InfiniBand card with one or more DomUs? We have the ConnectX-2 cards from Mellanox, which are claimed to have the shared-channel capability that can be shared with one or more virtual machines.

Also, has anyone been able to hook up IB-dedicated storage to a Xen solution, dom0 or domU -- if so, what make and model?

Steve Timm

--
------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
timm@fnal.gov  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Group Leader.
Lead of FermiCloud project.

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
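(One way people get an IB HCA into a domU today is plain PCI passthrough of a whole function, not the ConnectX shared-channel feature -- a rough sketch only, with a made-up PCI address; the host driver name depends on the card, and the pciback module may be called xen-pciback on pvops kernels.)

  # dom0: unbind the HCA from its driver and hand it to pciback
  modprobe pciback
  echo 0000:05:00.0 > /sys/bus/pci/drivers/mlx4_core/unbind
  echo 0000:05:00.0 > /sys/bus/pci/drivers/pciback/new_slot
  echo 0000:05:00.0 > /sys/bus/pci/drivers/pciback/bind

  # domU config: pass the device through
  # pci = [ '0000:05:00.0' ]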
Hi Steven,

What do you mean by IB-dedicated storage? We use IB to power shared storage for our VMs; you can see an overview of our technology stack here: http://orionvm.com.au/cloud-services/Our-Technology/

If you mean SRP or iSER, then yes and yes, but there are very few good SRP initiators and even fewer good targets. Keep an eye on the LIO (Linux-iSCSI.org) project, as it is merging with mainline in the 2.6.38 or .39 merge window and will provide SRP target support out of the box. The SRP initiator included in the OFED stack is somewhat suboptimal if you need to dynamically manage LUNs on dom0s, but not too bad if you have a single LUN with many LVs, as most of the setups in this thread entail.

In terms of channel I/O virtualisation: we are looking into this, and if I am successful I will post a howto on our techblog and forward it to the list.

Joseph.

--
Kind regards,
Joseph.

Founder | Director
Orion Virtualisation Solutions | www.orionvm.com.au | Phone: 1300 56 99 52 | Mobile: 0428 754 846

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
What I mean by IB dedicated storage is a hardware storage array that I could plug into my existing IB switch without buying an ib-to-fibrechannel hybrid switch or having a machine to bridge between. Having the SRP drivers actually work to be able to read it would be a plus too. Steve On Thu, 28 Apr 2011, Joseph Glanville wrote:> Hi Steven > > What do you mean by IB dedicated storage? > We use IB to power shared storage for our VMs, you can see an overview of > our technology stack here: > http://orionvm.com.au/cloud-services/Our-Technology/ > <http://orionvm.com.au/cloud-services/Our-Technology/> > If you mean SRP or iSER then yes and yes, but there is very few good > SRP initiators and even fewer good targets. > Keep an eye on the LIO (Linux-iSCSI.org) project as it is merging with > mainline in the 2.6.38 or .39 merge window and will provide SRP target > support out of the box. > The SRP initiator included in the OFED stack is somewhat suboptimal if you > need to dynamically manage LUNs on dom0''s but not too bad if you have a > single LUN with many LVMs as most of the setups in this thread entail. > > In terms of Channel I/O Virtualisation. We are looking into this and if I am > successful I will post a howto on our techblog and forward it to the list. > > Joseph. > > > On 28 April 2011 03:29, Steven Timm <timm@fnal.gov> wrote: > >> Has anyone successfully been able to share the infiniband >> card with one or more domU''s? we have the connectx2 cards >> from mellanox which are claimed to have the shared channel capacity >> which can be shared with one or more virtual machines. >> >> Also, has anyone been able to hook up IB-dedicated storage >> to a Xen solution, dom0 or domU--if so, what make and model? >> >> Steve Timm >> >> >> >> >> On Thu, 28 Apr 2011, Joseph Glanville wrote: >> >> I am not 100% familiar with the internals of XCP but after taking a glance >>> it''s based off a Centos 5.4 kernel I believe which is OFED compatible. >>> You could simply install the OFED RPM and have full Infiniband support. >>> IPoIB is fine for iSCSI based storage etc. >>> >>> Joseph. >>> >>> On 27 April 2011 22:44, <admin@xenhive.com> wrote: >>> >>> I really like InfiniBand. However, it is not supported with XCP. >>>> >>>> >>>> >>>> -----Original Message----- >>>> *From:* xen-users-bounces@lists.xensource.com [mailto: >>>> xen-users-bounces@lists.xensource.com] *On Behalf Of *Joseph Glanville >>>> *Sent:* Wednesday, April 27, 2011 6:28 AM >>>> *To:* xen-users@lists.xensource.com >>>> *Subject:* [Xen-users] Shared Storage >>>> >>>> >>>> >>>> SR is the Xen Cloud Platform storage manager deamon I think. >>>> >>>> >>>> But yes, cLVM is not required to build a setup where a single large LUN >>>> is >>>> exported to multiple hypervisors as long as you manage the LVM metadata >>>> on a >>>> single host. If you need to manage it on multiple hosts make sure you >>>> script >>>> running an lvscan on the other hosts to switch the logical volumes to >>>> active. >>>> >>>> If you are running a RHEL environment I highly suggest looking into cLVM >>>> and the rest of the RHEL cluster suite as it makes alot of what you are >>>> trying to do alot easier. If not there is plenty of room left for >>>> hackery. >>>> :) >>>> >>>> For those that have suggested Infiniband I would also put my vote behind >>>> it. Our solutions are developed on Infiniband and are some of the fastest >>>> in >>>> the world (or fastest in the case of cloud storage) and we are yet to >>>> saturate the bandwith of bonded DDR which is 40gbit. 
Price per port it is >>>> not that far from 10GbE but much more useful and resilient. >>>> >>>> Joseph. >>>> >>>> >>>> >>>> On 27 April 2011 03:01, John Madden <jmadden@ivytech.edu> wrote: >>>> >>>> Am I missing something here? Is it possible to do live migrations >>>> using SR type LVMoiSCSI? The reason I ask is because the discussion >>>> made me think it would not be possible. >>>> >>>> >>>> >>>> I don''t know what "SR" means but yes, you can do live migrations even >>>> without cLVM. All cLVM does is ensure consistency in LVM metadata >>>> changes >>>> across the cluster. Xen itself prevents the source and destination >>>> dom0''s >>>> from trashing the disk during live migration. Other multi-node trashings >>>> are left to lock managers like OCFS2, GFS, not-being-dumb, etc. >>>> >>>> >>>> >>>> John >>>> >>>> >>>> >>>> >>>> >>>> -- >>>> John Madden >>>> Sr UNIX Systems Engineer / Office of Technology >>>> Ivy Tech Community College of Indiana >>>> jmadden@ivytech.edu >>>> >>>> _______________________________________________ >>>> >>>> Xen-users mailing list >>>> Xen-users@lists.xensource.com >>>> http://lists.xensource.com/xen-users >>>> >>>> >>>> >>>> -- >>>> >>>> Kind regards, >>>> >>>> Joseph. >>>> >>>> * * >>>> >>>> Founder | Director >>>> >>>> *Orion Virtualisation Solutions* | www.orionvm.com.au | Phone: 1300 56 >>>> 99 >>>> 52 | Mobile: 0428 754 846 >>>> >>>> >>>> >>>> >>>> -- >>>> >>>> Kind regards, >>>> >>>> Joseph. >>>> >>>> * * >>>> >>>> Founder | Director >>>> >>>> *Orion Virtualisation Solutions* | www.orionvm.com.au | Phone: 1300 56 >>>> 99 >>>> 52 | Mobile: 0428 754 846 >>>> >>>> _______________________________________________ >>>> Xen-users mailing list >>>> Xen-users@lists.xensource.com >>>> http://lists.xensource.com/xen-users >>>> >>>> >>> >>> >>> >>> >> -- >> ------------------------------------------------------------------ >> Steven C. Timm, Ph.D (630) 840-8525 >> timm@fnal.gov http://home.fnal.gov/~timm/ >> Fermilab Computing Division, Scientific Computing Facilities, >> Grid Facilities Department, FermiGrid Services Group, Group Leader. >> Lead of FermiCloud project. >> > > > >-- ------------------------------------------------------------------ Steven C. Timm, Ph.D (630) 840-8525 timm@fnal.gov http://home.fnal.gov/~timm/ Fermilab Computing Division, Scientific Computing Facilities, Grid Facilities Department, FermiGrid Services Group, Group Leader. Lead of FermiCloud project. _______________________________________________ Xen-users mailing list Xen-users@lists.xensource.com http://lists.xensource.com/xen-users
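[For readers following the single-big-LUN setup discussed above: below is a minimal sketch of the "script an lvscan on the other hosts" idea Joseph describes, assuming a plain (non-clustered) LVM volume group shared over one iSCSI or SRP LUN. The volume group name vg_domu and the LV-name argument are illustrative placeholders, not anything mandated by Xen or LVM.]

    #!/bin/sh
    # refresh-lv.sh -- run on the destination dom0 before starting or
    # live-migrating a guest whose LV was created or resized on another host.
    set -e
    VG=vg_domu              # illustrative volume group on the shared LUN
    LV=$1                   # e.g. "guest42-disk", passed by a migration wrapper

    pvscan                  # re-read PV labels from the shared LUN
    vgscan                  # pick up LVM metadata written by the other dom0
    lvchange -ay "$VG/$LV"  # activate only this guest's LV on this host
    lvs "$VG"               # confirm the LV is now visible and active

[Once the guest has moved, the source host can run the matching lvchange -an so that stale activations do not linger on hosts that no longer need the volume.]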
I haven't used any dedicated storage before; I have, however, used SRP quite a lot. Any "dedicated" device you find is probably using either the Linux SCST SRP target or a Solaris-based COMSTAR target, both of which are very reliable. Most SANs are built this way. If you are looking for a device with good SRP support, though, I would recommend Scalable Informatics: http://www.scalableinformatics.com - particularly their JackRabbit devices, which are excellent value for money. As I noted earlier, however, the inflexibility of the current SRP initiators is more of a problem. ib_srp in any recent Linux kernel will allow you to connect to any SRP target. Joseph. On 28 April 2011 12:51, Steven Timm <timm@fnal.gov> wrote:> > What I mean by IB dedicated storage is a hardware storage array > that I could plug into my existing IB switch without > buying an ib-to-fibrechannel hybrid switch or having a machine > to bridge between. Having the SRP drivers actually work to be > able to read it would be a plus too. > > Steve > > > > On Thu, 28 Apr 2011, Joseph Glanville wrote: > > Hi Steven >> >> What do you mean by IB dedicated storage? >> We use IB to power shared storage for our VMs, you can see an overview of >> our technology stack here: >> http://orionvm.com.au/cloud-services/Our-Technology/ >> <http://orionvm.com.au/cloud-services/Our-Technology/> >> If you mean SRP or iSER then yes and yes, but there is very few good >> SRP initiators and even fewer good targets. >> Keep an eye on the LIO (Linux-iSCSI.org) project as it is merging with >> mainline in the 2.6.38 or .39 merge window and will provide SRP target >> support out of the box. >> The SRP initiator included in the OFED stack is somewhat suboptimal if you >> need to dynamically manage LUNs on dom0''s but not too bad if you have a >> single LUN with many LVMs as most of the setups in this thread entail. >> >> In terms of Channel I/O Virtualisation. We are looking into this and if I >> am >> successful I will post a howto on our techblog and forward it to the list. >> >> Joseph. >> >> >> On 28 April 2011 03:29, Steven Timm <timm@fnal.gov> wrote: >> >> Has anyone successfully been able to share the infiniband >>> card with one or more domU''s? we have the connectx2 cards >>> from mellanox which are claimed to have the shared channel capacity >>> which can be shared with one or more virtual machines. >>> >>> Also, has anyone been able to hook up IB-dedicated storage >>> to a Xen solution, dom0 or domU--if so, what make and model? >>> >>> Steve Timm >>> >>> >>> >>> >>> On Thu, 28 Apr 2011, Joseph Glanville wrote: >>> >>> I am not 100% familiar with the internals of XCP but after taking a >>> glance >>> >>>> it''s based off a Centos 5.4 kernel I believe which is OFED compatible. >>>> You could simply install the OFED RPM and have full Infiniband support. >>>> IPoIB is fine for iSCSI based storage etc. >>>> >>>> Joseph. >>>> >>>> On 27 April 2011 22:44, <admin@xenhive.com> wrote: >>>> >>>> I really like InfiniBand. However, it is not supported with XCP. >>>> >>>>> >>>>> >>>>> >>>>> -----Original Message----- >>>>> *From:* xen-users-bounces@lists.xensource.com [mailto: >>>>> xen-users-bounces@lists.xensource.com] *On Behalf Of *Joseph Glanville >>>>> *Sent:* Wednesday, April 27, 2011 6:28 AM >>>>> *To:* xen-users@lists.xensource.com >>>>> *Subject:* [Xen-users] Shared Storage >>>>> >>>>> >>>>> >>>>> SR is the Xen Cloud Platform storage manager deamon I think.
>>>>> >>>>> >>>>> But yes, cLVM is not required to build a setup where a single large LUN >>>>> is >>>>> exported to multiple hypervisors as long as you manage the LVM metadata >>>>> on a >>>>> single host. If you need to manage it on multiple hosts make sure you >>>>> script >>>>> running an lvscan on the other hosts to switch the logical volumes to >>>>> active. >>>>> >>>>> If you are running a RHEL environment I highly suggest looking into >>>>> cLVM >>>>> and the rest of the RHEL cluster suite as it makes alot of what you are >>>>> trying to do alot easier. If not there is plenty of room left for >>>>> hackery. >>>>> :) >>>>> >>>>> For those that have suggested Infiniband I would also put my vote >>>>> behind >>>>> it. Our solutions are developed on Infiniband and are some of the >>>>> fastest >>>>> in >>>>> the world (or fastest in the case of cloud storage) and we are yet to >>>>> saturate the bandwith of bonded DDR which is 40gbit. Price per port it >>>>> is >>>>> not that far from 10GbE but much more useful and resilient. >>>>> >>>>> Joseph. >>>>> >>>>> >>>>> >>>>> On 27 April 2011 03:01, John Madden <jmadden@ivytech.edu> wrote: >>>>> >>>>> Am I missing something here? Is it possible to do live migrations >>>>> using SR type LVMoiSCSI? The reason I ask is because the discussion >>>>> made me think it would not be possible. >>>>> >>>>> >>>>> >>>>> I don''t know what "SR" means but yes, you can do live migrations even >>>>> without cLVM. All cLVM does is ensure consistency in LVM metadata >>>>> changes >>>>> across the cluster. Xen itself prevents the source and destination >>>>> dom0''s >>>>> from trashing the disk during live migration. Other multi-node >>>>> trashings >>>>> are left to lock managers like OCFS2, GFS, not-being-dumb, etc. >>>>> >>>>> >>>>> >>>>> John >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> John Madden >>>>> Sr UNIX Systems Engineer / Office of Technology >>>>> Ivy Tech Community College of Indiana >>>>> jmadden@ivytech.edu >>>>> >>>>> _______________________________________________ >>>>> >>>>> Xen-users mailing list >>>>> Xen-users@lists.xensource.com >>>>> http://lists.xensource.com/xen-users >>>>> >>>>> >>>>> >>>>> -- >>>>> >>>>> Kind regards, >>>>> >>>>> Joseph. >>>>> >>>>> * * >>>>> >>>>> Founder | Director >>>>> >>>>> *Orion Virtualisation Solutions* | www.orionvm.com.au | Phone: 1300 56 >>>>> 99 >>>>> 52 | Mobile: 0428 754 846 >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> >>>>> Kind regards, >>>>> >>>>> Joseph. >>>>> >>>>> * * >>>>> >>>>> Founder | Director >>>>> >>>>> *Orion Virtualisation Solutions* | www.orionvm.com.au | Phone: 1300 56 >>>>> 99 >>>>> 52 | Mobile: 0428 754 846 >>>>> >>>>> _______________________________________________ >>>>> Xen-users mailing list >>>>> Xen-users@lists.xensource.com >>>>> http://lists.xensource.com/xen-users >>>>> >>>>> >>>>> >>>> >>>> >>>> >>>> -- >>> ------------------------------------------------------------------ >>> Steven C. Timm, Ph.D (630) 840-8525 >>> timm@fnal.gov http://home.fnal.gov/~timm/ >>> Fermilab Computing Division, Scientific Computing Facilities, >>> Grid Facilities Department, FermiGrid Services Group, Group Leader. >>> Lead of FermiCloud project. >>> >>> >> >> >> >> > -- > ------------------------------------------------------------------ > Steven C. Timm, Ph.D (630) 840-8525 > timm@fnal.gov http://home.fnal.gov/~timm/ > Fermilab Computing Division, Scientific Computing Facilities, > Grid Facilities Department, FermiGrid Services Group, Group Leader. > Lead of FermiCloud project. 
>-- * Kind regards, Joseph. Founder | Director Orion Virtualisation Solutions* | www.orionvm.com.au | Phone: 1300 56 99 52 | Mobile: 0428 754 846 _______________________________________________ Xen-users mailing list Xen-users@lists.xensource.com http://lists.xensource.com/xen-users
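[To make the ib_srp point above concrete, here is a minimal sketch of attaching an SRP target with the in-kernel initiator, assuming an OFED-style install where ibsrpdm is available. The HCA name (mlx4_0), the port number, and the target parameters are placeholders, not values taken from this thread.]

    modprobe ib_srp                 # load the in-kernel SRP initiator
    ibsrpdm -c                      # print visible targets, one per line,
                                    #   already in add_target format
    # paste one line of that output into the initiator, for example
    # (replace the ... values and the srp-mlx4_0-1 device name with your own):
    echo "id_ext=...,ioc_guid=...,dgid=...,pkey=ffff,service_id=..." \
        > /sys/class/infiniband_srp/srp-mlx4_0-1/add_target
    # the target LUNs then show up as ordinary SCSI disks (watch dmesg)
    # and can be used as an LVM PV or handed to a domU as a phy: device

[If targets come and go, srp_daemon from the OFED stack can automate the discovery and add_target step instead of doing it by hand.]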