Tomonari Horikoshi
2007-May-16 09:08 UTC
[Xen-devel] [RFC] pv-scsi driver (scsiback/scsifront)
Hi all.

We developed a pv-scsi driver, using Fujita-san's scsi-driver and blkback as references.
(see http://www.xensource.com/files/xensummit_4/Xen_Summit_8_Matsumoto.pdf)

The pv-scsi driver's features are as follows:
  * Guest has dedicated SCSI-HBAs of Dom0.
  * Guest can send scsi_cdb to the HBAs.
  * Guest recognises the HBAs from hostno of xenstore.

Currently, we are developing an FC version based on this.

* Future work:
  * implement python code
  * performance tuning
  * attach, detach
  * suspend, resume

* We are wondering about:
  * We used "scsihost" as the xenstore node name. Is it suitable?
  * We are considering this config file format:
      scsihost = ['fc,0', 'scsi,1', 'type,num']
        type = "fc" or "scsi"
        num  = scsi host number on Dom0
    Do you have any comments?
  * We have no idea how to implement the suspend/resume feature.
    ex. Physical HBA mapping for resumed guest.
        Pending I/O.
        The WWN within FC mode for resumed guest.
        Influence of migration.
        ...
    Could you give us suggestions about this?

Best regards,
Tomonari Horikoshi

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
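As an illustration of the proposed config format, the following is a minimal sketch of how 'type,num' entries could be validated. It is not taken from the posted driver or from the Xen tools; the names parse_scsihost, ScsiHostEntry and VALID_TYPES are hypothetical and chosen only for this example.

```python
from typing import List, NamedTuple

# The two HBA types named in the proposal.
VALID_TYPES = {"fc", "scsi"}


class ScsiHostEntry(NamedTuple):
    hba_type: str  # "fc" or "scsi"
    host_no: int   # SCSI host number on Dom0


def parse_scsihost(entries: List[str]) -> List[ScsiHostEntry]:
    """Parse a list of 'type,num' strings from a scsihost setting."""
    parsed = []
    for entry in entries:
        parts = [p.strip() for p in entry.split(",")]
        if len(parts) != 2:
            raise ValueError(f"expected 'type,num', got {entry!r}")
        hba_type, num = parts
        if hba_type not in VALID_TYPES:
            raise ValueError(f"unknown HBA type {hba_type!r} in {entry!r}")
        if not num.isdecimal():
            raise ValueError(f"host number must be a non-negative integer in {entry!r}")
        parsed.append(ScsiHostEntry(hba_type, int(num)))
    return parsed
```

With this sketch, parse_scsihost(['fc,0', 'scsi,1']) yields one entry per HBA, while the literal placeholder 'type,num' is rejected because "type" is not a known HBA type.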
James Smart
2007-May-16 13:10 UTC
Re: [Xen-devel] [RFC] pv-scsi driver (scsiback/scsifront)
Tomonari Horikoshi wrote:
> Hi all.
>
> We developed a pv-scsi driver, using Fujita-san's scsi-driver
> and blkback as references.
> (see http://www.xensource.com/files/xensummit_4/Xen_Summit_8_Matsumoto.pdf)
>
> The pv-scsi driver's features are as follows:
>   * Guest has dedicated SCSI-HBAs of Dom0.
>   * Guest can send scsi_cdb to the HBAs.
>   * Guest recognises the HBAs from hostno of xenstore.
>
> Currently, we are developing an FC version based on this.
>
> * Future work:
>   * implement python code
>   * performance tuning
>   * attach, detach
>   * suspend, resume
>
> * We are wondering about:
>   * We used "scsihost" as the xenstore node name. Is it suitable?
>   * We are considering this config file format:
>       scsihost = ['fc,0', 'scsi,1', 'type,num']
>         type = "fc" or "scsi"
>         num  = scsi host number on Dom0

I would expect "fc" does not need to be specified, unless there are FC-isms exposed to the guest.

>     Do you have any comments?
>   * We have no idea how to implement the suspend/resume feature.
>     ex. Physical HBA mapping for resumed guest.
>         Pending I/O.
>         The WWN within FC mode for resumed guest.

The WWN is a whole different issue - and I'm going to want to make sure that whatever you do here is consistent with FC NPIV virtual ports instantiated in Dom 0. See:
http://marc.info/?l=linux-scsi&m=117768770720886&w=2

>         Influence of migration.
>         ...
>     Could you give us suggestions about this?
>
> Best regards,
> Tomonari Horikoshi
> We developed a pv-scsi driver, using Fujita-san's scsi-driver
> and blkback as references.
> (see http://www.xensource.com/files/xensummit_4/Xen_Summit_8_Matsumoto.pdf)

This is good work, and we'd certainly like to get it polished and in mainline -- thanks!

> The pv-scsi driver's features are as follows:
>   * Guest has dedicated SCSI-HBAs of Dom0.

Is it really the case that you must dedicate a HBA to the guest? Surely we can extend it to enable an individual LUN to be mapped through to a guest, translating the host:bus:id etc accordingly?

>   * We are considering this config file format:
>       scsihost = ['fc,0', 'scsi,1', 'type,num']
>         type = "fc" or "scsi"

Why do you need to select between fc and scsi?

>   * We have no idea how to implement the suspend/resume feature.
>     ex. Physical HBA mapping for resumed guest.
>         Pending I/O.
>         The WWN within FC mode for resumed guest.
>         Influence of migration.
>
>     Could you give us suggestions about this?

The blkfront/back code is obviously a good crib for this. Basically, the front end driver needs to store enough information to be able to reissue any uncompleted requests across a migration. This is accomplished with a 'shadow ring'. The frontend needs to be capable of reconnecting if the backend goes out of state connected, and then reissue the requests. For migration write-after-write safety, the backend shouldn't close until all outstanding requests have either been completed or aborted.

One other thing I notice is that you've kept the blk ring protocol's 11 page limit per request. The blkring kinda gets away with this because (at least in principle) we could merge consecutive requests in blkback to create larger IOs. I guess we could do that with scsi requests too, but I'd feel more comfortable if we didn't mess with the request stream that the guest is generating. We probably need to make the number of pages in an SG list variable, and hence have variable sized requests across the ring. We should definitely make the ring multi-page too.
What do you think?

Thanks,
Ian
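The 'shadow ring' idea described above can be sketched as follows. This is a simplified illustration only, not the blkfront implementation: the class ShadowRing and its method names are hypothetical, and a real frontend would also have to re-grant pages and renegotiate the ring with the backend on reconnect.

```python
from collections import OrderedDict


class ShadowRing:
    """Keeps a private copy of every request that has been issued but not
    yet completed, so the frontend can reissue them after a reconnect."""

    def __init__(self, ring_size):
        self.ring_size = ring_size
        self._pending = OrderedDict()  # request id -> copy of the request
        self._next_id = 0

    def issue(self, request):
        """Record a request before it is placed on the shared ring."""
        if len(self._pending) >= self.ring_size:
            raise RuntimeError("ring full")
        req_id = self._next_id
        self._next_id += 1
        self._pending[req_id] = dict(request)  # copy: caller may reuse its dict
        return req_id

    def complete(self, req_id):
        """Forget a request once the backend has responded to it."""
        del self._pending[req_id]  # raises KeyError for an unknown id

    def requests_to_reissue(self):
        """After a reconnect, return uncompleted requests in issue order."""
        return [(req_id, dict(req)) for req_id, req in self._pending.items()]
```

Keeping the copies in issue order means the reissued stream preserves the ordering the guest originally generated, which matters for the write-after-write concern raised above.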
akira hayakawa
2007-May-18 02:48 UTC
[Xen-devel] Re: [RFC] pv-scsi driver (scsiback/scsifront)
Thank you for your comment.
I work in the same group as Tomonari Horikoshi.

> Tomonari Horikoshi wrote:
>> Hi all.
>>
>> We developed a pv-scsi driver, using Fujita-san's scsi-driver
>> and blkback as references.
>> (see http://www.xensource.com/files/xensummit_4/Xen_Summit_8_Matsumoto.pdf)
>>
>> The pv-scsi driver's features are as follows:
>>   * Guest has dedicated SCSI-HBAs of Dom0.
>>   * Guest can send scsi_cdb to the HBAs.
>>   * Guest recognises the HBAs from hostno of xenstore.
>>
>> Currently, we are developing an FC version based on this.
>>
>> * Future work:
>>   * implement python code
>>   * performance tuning
>>   * attach, detach
>>   * suspend, resume
>>
>> * We are wondering about:
>>   * We used "scsihost" as the xenstore node name. Is it suitable?
>>   * We are considering this config file format:
>>       scsihost = ['fc,0', 'scsi,1', 'type,num']
>>         type = "fc" or "scsi"
>>         num  = scsi host number on Dom0
> I would expect "fc" does not need to be specified, unless there
> are FC-isms exposed to the guest.

We want to use SAN management software on the guest OS. The software works on native (no VM) Linux. So we think it is necessary to show the guest OS whether the HBA card is FC or SCSI, in the same way as on native Linux.

>>     Do you have any comments?
>>   * We have no idea how to implement the suspend/resume feature.
>>     ex. Physical HBA mapping for resumed guest.
>>         Pending I/O.
>>         The WWN within FC mode for resumed guest.
> The WWN is a whole different issue - and I'm going to want to make
> sure that whatever you do here is consistent with FC NPIV virtual
> ports instantiated in Dom 0. See:
> http://marc.info/?l=linux-scsi&m=117768770720886&w=2

We think it is unknown whether the WWN will have the same value when a guest resumes, because the WWN may already be in use by another guest.

Best Regards,
Akira Hayakawa
James Smart
2007-May-18 13:08 UTC
[Xen-devel] Re: [RFC] pv-scsi driver (scsiback/scsifront)
>> I would expect "fc" does not need to be specified, unless there
>> are FC-isms exposed to the guest.
>
> We want to use SAN management software on the guest OS. The software
> works on native (no VM) Linux. So we think it is necessary to show
> the guest OS whether the HBA card is FC or SCSI, in the same way as
> on native Linux.

Well - depends on what/how your SAN mgmt works. If it's straight SCSI, then it would be fine - but you can't talk to anything non-SCSI and not enumerated by the HBA. If it's layered on hbaapi, it does mean you want to talk FC, not just SCSI, and now things change significantly.

>>>     Do you have any comments?
>>>   * We have no idea how to implement the suspend/resume feature.
>>>     ex. Physical HBA mapping for resumed guest.
>>>         Pending I/O.
>>>         The WWN within FC mode for resumed guest.
>> The WWN is a whole different issue - and I'm going to want to make
>> sure that whatever you do here is consistent with FC NPIV virtual
>> ports instantiated in Dom 0. See:
>> http://marc.info/?l=linux-scsi&m=117768770720886&w=2
>
> We think it is unknown whether the WWN will have the same value when
> a guest resumes, because the WWN may already be in use by another guest.

This confuses me greatly. WWNs are how FC ports are known - which controls their SAN visibility and device access. If it's changing for the VM, unless you have everything seeing everything (i.e. no SAN zoning or LUN masking, which is very very rare in a production environment) then whether you see your storage is questionable. For this reason, regular NPIV will be adding the WWNs as a resource of the guest, much like the ethernet MAC addresses. And, if it is bound to the guest, it matches the model needed for suspend/resume, although there are challenges for discovery and enumeration.

Additionally, I certainly hope you are keeping far more control on how WWNs are allocated and used. There is that small part about uniqueness that has to be maintained or the fabric will show very nasty issues.
-- james
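To make the point about binding WWNs to a guest concrete, here is a minimal sketch of a registry that treats a WWN as a per-guest resource (much like a MAC address) and refuses to hand the same WWN to two guests. It is purely illustrative: WwnRegistry, normalize_wwn and the guest names are hypothetical and do not correspond to any existing Xen or NPIV interface. The only fact assumed is that a WWN is a 64-bit value, usually written as 16 hex digits, optionally separated by colons.

```python
import re

_WWN_RE = re.compile(r"[0-9a-f]{16}")  # 64 bits as 16 hex digits


def normalize_wwn(wwn):
    """Return the WWN as 16 lowercase hex digits without separators."""
    compact = wwn.replace(":", "").lower()
    if not _WWN_RE.fullmatch(compact):
        raise ValueError(f"not a 64-bit WWN: {wwn!r}")
    return compact


class WwnRegistry:
    """Binds each WWN to exactly one guest so it survives suspend/resume."""

    def __init__(self):
        self._owner = {}  # normalized WWN -> guest name

    def bind(self, guest, wwn):
        """Bind a WWN to a guest; rebinding to the same guest is a no-op."""
        key = normalize_wwn(wwn)
        owner = self._owner.get(key)
        if owner is not None and owner != guest:
            raise ValueError(f"WWN {key} is already bound to {owner!r}")
        self._owner[key] = guest
        return key

    def wwns_for(self, guest):
        """List the WWNs a guest should present when it resumes."""
        return sorted(w for w, g in self._owner.items() if g == guest)
```

Because the binding is keyed on the guest rather than on the physical HBA, a resumed guest would look up the same WWN it had before, which is the model described above for keeping zoning and LUN masking intact.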
akira hayakawa
2007-May-22 13:58 UTC
RE: [Xen-devel] [RFC] pv-scsi driver (scsiback/scsifront)
Thank you for your comment.
I'm sorry for the delayed response.
I work in the same group as Tomonari Horikoshi.

>> We developed a pv-scsi driver, using Fujita-san's scsi-driver
>> and blkback as references.
>> (see http://www.xensource.com/files/xensummit_4/Xen_Summit_8_Matsumoto.pdf)
>
> This is good work, and we'd certainly like to get it polished and in
> mainline -- thanks!
>
>> The pv-scsi driver's features are as follows:
>>   * Guest has dedicated SCSI-HBAs of Dom0.
>
> Is it really the case that you must dedicate a HBA to the guest?

Yes, and we are planning to use the NPIV function so that more than one guest can use a HBA.

> Surely we can extend it to enable an individual LUN to be mapped
> through to a guest, translating the host:bus:id etc accordingly?
>
>>   * We are considering this config file format:
>>       scsihost = ['fc,0', 'scsi,1', 'type,num']
>>         type = "fc" or "scsi"
>
> Why do you need to select between fc and scsi?

Please refer to "Re: [RFC] pv-scsi driver (scsiback/scsifront)" (our response to James.Smart@Emulex.Com).

>>   * We have no idea how to implement the suspend/resume feature.
>>     ex. Physical HBA mapping for resumed guest.
>>         Pending I/O.
>>         The WWN within FC mode for resumed guest.
>>         Influence of migration.
>>
>>     Could you give us suggestions about this?
>
> The blkfront/back code is obviously a good crib for this. Basically, the
> front end driver needs to store enough information to be able to reissue
> any uncompleted requests across a migration. This is accomplished with a
> 'shadow ring'. The frontend needs to be capable of reconnecting if the
> backend goes out of state connected, and then reissue the requests. For
> migration write-after-write safety, the backend shouldn't close until
> all outstanding requests have either been completed or aborted.

Thank you so much, your advice is very helpful to us.

> One other thing I notice is that you've kept the blk ring protocol's 11
> page limit per request. The blkring kinda gets away with this because
> (at least in principle) we could merge consecutive requests in blkback
> to create larger IOs. I guess we could do that with scsi requests too,
> but I'd feel more comfortable if we didn't mess with the request stream
> that the guest is generating. We probably need to make the number of
> pages in an SG list variable, and hence have variable sized requests
> across the ring. We should definitely make the ring multi-page too.
>
> What do you think?

I agree with you. We will definitely try to make the ring multi-page.

> Thanks,
> Ian

Best Regards,
Aki