Ian Campbell
2010-Jul-27 15:58 UTC
[Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc
Currently the configuration syntax available in a domain configuration
has several ways of specifying devices, some of which have slightly
unexpected semantics wrt whether or not an emulated device is created,
what the major number in xenstore is etc. Some also expose details of
the guest OS's choice of major number (or rather expose Linux's choice
to all guests AFAICT).
In an attempt to clean this up, or at least make the strange behaviour
more explicit, I'd like to propose some extensions to the dXpY syntax
supported by libxl such that the other existing ways of specifying
devices become syntactic sugar for specific well defined configurations
in the new syntax, whilst preserving backwards compatibility.
I hope that the following will also form the basis for a future document
(gasp!) describing the available syntax, which combinations are valid
etc (unless someone can point me to an existing document I can update).
Virtual Disk Configuration
--------------------------
A virtual disk is defined in the guest configuration file as
d<X>p<Y>
where <X> is the disk number and <Y> is the partition number. In
addition a number of options can be specified.
p0 indicates the entire disk.
Device number encoding in xenstore
----------------------------------
Given a disk specified as dXpY the device encoding used in xenstore has
two potential formats, legacy and extended. Both of these are already
defined and implemented in guest frontend drivers.
The extended encoding is generally preferred but for backwards
compatibility the legacy format must still be supported.
The legacy encoding is (major and minor 8 bits each):
(major << 8) | minor
The extended encoding is (disk == 20 bits, partition == 8 bits):
(1 << 28) | (disk << 8) | partition
Note that the extended encoding for d0p0..d0p255 overlaps in the minor
number space with the legacy encodings of d0p0..d15p15 and therefore
these must not be used simultaneously.
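The two encodings, and one reading of the overlap noted above, can be sketched as follows. This is a minimal illustration rather than libxl code: the function names are hypothetical, and the range checks assume the field widths implied by the shifts (partition in the low 8 bits, disk in bits 8..27).

```python
EXTENDED_FLAG = 1 << 28  # bit 28 marks the extended format


def legacy_encode(major: int, minor: int) -> int:
    """Legacy format: 8-bit major and 8-bit minor."""
    if not (0 <= major < 256 and 0 <= minor < 256):
        raise ValueError("major and minor must each fit in 8 bits")
    return (major << 8) | minor


def extended_encode(disk: int, partition: int) -> int:
    """Extended format: flag bit, disk from bit 8 up, partition in the low 8 bits."""
    if not (0 <= disk < (1 << 20) and 0 <= partition < 256):
        raise ValueError("disk or partition out of range")
    return EXTENDED_FLAG | (disk << 8) | partition


# Extended d0p16 and legacy d1p0 (minor = (1 << 4) | 0) share their low 8 bits,
# so a consumer that looks only at the minor cannot tell them apart.
ext_d0p16 = extended_encode(0, 16)
leg_d1p0 = legacy_encode(202, (1 << 4) | 0)
same_minor = (ext_d0p16 & 0xFF) == (leg_d1p0 & 0xFF)
```

Here `same_minor` is true, which is the kind of clash the note warns about when both encodings are in use at once.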
Configuration Options
---------------------
Each disk dXpY can optionally be followed by one or more of the
following key value pairs (precise syntax TBD, but comma separated is
common in similar situations).
Option keys and values with a _ prefix are for internal use only and are
used only to provide legacy semantics for syntactic sugar and must not
otherwise be used.
pv = true | false
    Should a PV backend/frontend pair be created in xenstore
    to correspond to this device.
    Default: true for HVM guests, ignored for PV guests
    (treated as true)
extended = true | false
    Request use of extended device encoding in xenstore.
    extended = false is only valid for d0..d15 (as d16+
    cannot be represented in the legacy encoding)
    When extended = false and in the absence of a specific
    _vdevice configuration option (see below) the encoding
    will use major==202 and minor=="(disk << 4) | partition".
    Default: false for d0p0..d0p255, false if _vdevice
    option present (see below), otherwise true.
emul = none | ide[01].[01] | _ide[01].[01] | ...
    none = No emulated device to be created.
    ide[01].[01] = Emulate IDE device. First [01] =>
    primary, secondary. Second [01] => master, slave
    _ide[01].[01] = As per ide[01].[01] however emulation is
    enabled iff no other disk is explicitly configured with
    emulation.
    In the future sata<X>.<Y> or similar might be added
    here.
    Default: none for HVM guests, ignored for PV guests
    (treated as none)
_vdevice = <N>:<M> | <Q>
    Enforce use of legacy device encoding in xenstore with
    the given major:minor or explicit value.
    Default: unset, encoding determined by "extended" option
    (see above)
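How the extended and _vdevice options might combine into a single xenstore number can be sketched like this. The function name `resolve_devid` is hypothetical, and rejecting partitions above 15 in legacy mode is an assumption of this sketch (the option text only restricts the disk number); it is not a description of what libxl does.

```python
from typing import Optional, Tuple, Union


def resolve_devid(disk: int, partition: int,
                  extended: Optional[bool] = None,
                  vdevice: Union[None, int, Tuple[int, int]] = None) -> int:
    """Pick the xenstore device number for dXpY under the proposed defaults."""
    if vdevice is not None:
        # _vdevice forces the legacy encoding: either major:minor or a raw value.
        if isinstance(vdevice, tuple):
            major, minor = vdevice
            return (major << 8) | minor
        return vdevice
    if extended is None:
        # Proposed default: legacy for d0, extended for every other disk.
        extended = disk != 0
    if extended:
        return (1 << 28) | (disk << 8) | partition
    if disk > 15 or partition > 15:
        # Assumption: a larger partition would spill into the next disk's minors.
        raise ValueError("legacy encoding only covers d0..d15 with p0..p15")
    return (202 << 8) | (disk << 4) | partition
```

For example, `resolve_devid(1, 0)` takes the extended path, while `resolve_devid(1, 0, extended=False)` yields major 202, minor 16.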
Backward compatible disk configuration
--------------------------------------
Given the above configuration options several short hands are defined
for backwards compatibility with existing configuration files and
guests.
These will be implemented by a straight textual substitution before
parsing the configuration.
hda => d0p0,pv=true,emul=ide0.0,_vdevice=3:0
hdb => d1p0,pv=true,emul=ide0.1,_vdevice=3:64
hdc => d2p0,pv=true,emul=ide1.0,_vdevice=22:0
hdd => d3p0,pv=true,emul=ide1.1,_vdevice=22:64
xvda => d0p0,pv=true,emul=_ide0.0,_vdevice=202:0
xvdb => d1p0,pv=true,emul=_ide0.1,_vdevice=202:16
xvdc => d2p0,pv=true,emul=_ide1.0,_vdevice=202:32
xvdd => d3p0,pv=true,emul=_ide1.1,_vdevice=202:48
xvde => d4p0,pv=true,emul=none,_vdevice=202:64
...
xvdp => d15p0,pv=true,emul=none,_vdevice=202:240
xvdq => d16p0,pv=true,emul=none
...
xvdz => d25p0,pv=true,emul=none
xvda[1..15] => d0p[1..15],pv=true,emul=_ide0.0,_vdevice=202:[1..15]
xvdb[1..15] => etc
Note that all the above are Linux (guest) specific.
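The textual substitution for the xvd shorthands could look roughly like the sketch below. The helper name `expand_xvd` is hypothetical; the minor is computed as (disk << 4) | partition, following the rule given for extended = false, and only single-letter names (xvda..xvdz) are handled.

```python
import re


def expand_xvd(name: str) -> str:
    """Expand e.g. 'xvdb' or 'xvda3' into the explicit dXpY option string."""
    m = re.fullmatch(r"xvd([a-z])(\d*)", name)
    if m is None:
        raise ValueError(f"not a single-letter xvd shorthand: {name!r}")
    disk = ord(m.group(1)) - ord("a")
    partition = int(m.group(2) or 0)  # no suffix means the whole disk (p0)
    # Only xvda..xvdd fall back to an emulated IDE device.
    emul = f"_ide{disk // 2}.{disk % 2}" if disk < 4 else "none"
    spec = f"d{disk}p{partition},pv=true,emul={emul}"
    if disk <= 15 and partition <= 15:
        # Representable in the legacy encoding under major 202.
        spec += f",_vdevice=202:{(disk << 4) | partition}"
    return spec
```

With this sketch `expand_xvd("xvda")` gives the xvda line from the list, and names from xvdq onwards carry no _vdevice, so they would take the extended encoding.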
The sd* syntax is not covered. It's unclear if this is used in the wild
or what the existing semantics of emul= are for SCSI devices. If someone
cares to investigate the existing behaviour then it can be added.
Otherwise it is expected that additions will not be made to this set of
shorthands and that new functionality (e.g. emulation types) will be
available only via the explicit syntax.
(is there any non-Linux specific syntax used by other guest OSes which
needs to be supported?)
Implementation notes
--------------------
The behaviour specified by the emul=_ide[01].[01] syntax is currently
implemented by qemu (effectively as a workaround for users forgetting to
specify any emulated disks). I propose that as part of implementing this
new syntax we push responsibility for these semantics up into libxl.
libxl currently uses the legacy encoding for devices specified as xvd or
dXpY iff the particular configuration can be represented using the
legacy format (e.g. for d0p0..d15p15 or xvda..xvdp) in order to (1)
avoid the clash between the extended representation of d0p0 and the
legacy representations of d1..d15 and (2) to provide compatibility with
guests which do not support the extended device encoding.
The proposal above suggests instead that d1+ should be encoded using the
extended format unless overridden using the extended=false option or one
of the shorthands which uses the _vdevice option. Only d0 would default
to legacy encoding.
This (1) avoids the clash in minor numbers since d0 is the only disk
which can clash with legacy encodings and (2) provides compatibility
with old guests through their use of the xvd* syntax.
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Ian Jackson
2010-Jul-27 16:50 UTC
Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc
Ian Campbell writes ("[Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc"):
> In an attempt to clean this up, or at least make the strange behaviour
> more explicit, I'd like to propose some extensions to the dXpY syntax
> supported by libxl such that the other existing ways of specifying
> devices become syntactic sugar for specific well defined configurations
> in the new syntax, whilst preserving backwards compatibility.
Urgh. I don't like this at all. I have a completely different
conceptual model. I guess I'll have to write it up.
Ian.
Pasi Kärkkäinen
2010-Jul-27 20:41 UTC
Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc
On Tue, Jul 27, 2010 at 04:58:10PM +0100, Ian Campbell wrote:
> The sd* syntax is not covered. It's unclear if this is used in the wild
> or what the existing semantics of emul= are for SCSI devices. If someone
> cares to investigate the existing behaviour then it can be added.
sd* devices are still often used for Xen PV domUs..
(yeah, people should use xvd*, but many people still have sd*).
-- Pasi
Ian Campbell
2010-Jul-28 09:45 UTC
Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc
On Tue, 2010-07-27 at 21:41 +0100, Pasi Kärkkäinen wrote:
> > The sd* syntax is not covered. It's unclear if this is used in the
> > wild or what the existing semantics of emul= are for SCSI devices.
> > If someone cares to investigate the existing behaviour then it can
> > be added.
>
> sd* devices are still often used for Xen PV domUs..
> (yeah, people should use xvd*, but many people still have sd*).
Thanks. We can therefore add to the shorthands (emul=none unless anyone
knows better):
sda => d0p0,pv=true,emul=none,_vdevice=8:0
sdb => d1p0,pv=true,emul=none,_vdevice=8:16
sdc => d2p0,pv=true,emul=none,_vdevice=8:32
sdd => d3p0,pv=true,emul=none,_vdevice=8:48
...
sdp => d15p0,pv=true,emul=none,_vdevice=8:240
sdq => d16p0,pv=true,emul=none,_vdevice=65:0
sdr => d17p0,pv=true,emul=none,_vdevice=65:16
... etc through:
... major 65 (sdq ->sdaf) ...
... major 66 (sdag->sdav) ...
... major 67 (sdaw->sdbl) ...
... major 68 (sdbm->sdcb) ...
... major 69 (sdcc->sdcr) ...
... major 70 (sdcs->sddh) ...
... major 71 (sddi->sddx)
sddx => d127p0,pv=true,emul=none,_vdevice=71:240
(perhaps supporting all these is overkill ;-))
Ian.
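The pattern behind the sd* shorthands (16 disks per Linux SCSI disk major, minors stepping by 16) can be generated rather than listed by hand. The sketch below is illustrative only; `sd_name` and `sd_shorthand` are hypothetical helper names, and the major list is the Linux SCSI disk majors 8 and 65..71.

```python
SD_MAJORS = [8, 65, 66, 67, 68, 69, 70, 71]  # 16 whole disks per major


def sd_name(disk: int) -> str:
    """Linux-style name for a zero-based disk index (0 -> sda, 26 -> sdaa)."""
    letters = ""
    n = disk
    while True:
        letters = chr(ord("a") + n % 26) + letters
        n = n // 26 - 1
        if n < 0:
            break
    return "sd" + letters


def sd_shorthand(disk: int) -> str:
    """Build the shorthand line for whole disk number `disk`."""
    if not 0 <= disk < 16 * len(SD_MAJORS):
        raise ValueError("only disks 0..127 are covered by these majors")
    major = SD_MAJORS[disk // 16]
    minor = (disk % 16) << 4  # 16 minors per disk, partition 0
    return f"{sd_name(disk)} => d{disk}p0,pv=true,emul=none,_vdevice={major}:{minor}"
```

Calling `sd_shorthand(16)` reproduces the sdq line, and `sd_shorthand(127)` the final sddx line.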
Paolo Bonzini
2010-Jul-28 12:31 UTC
[Xen-devel] Re: [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc
On 07/27/2010 05:58 PM, Ian Campbell wrote:
> The sd* syntax is not covered. It's unclear if this is used in the wild
> or what the existing semantics of emul= are for SCSI devices. If someone
> cares to investigate the existing behaviour then it can be added.
I don't know what semantics xl uses for SCSI devices, but I know that
we've seen bugs about SCSI emulation so it is sometimes used, and these
are the semantics that it should use given your IDE example:
sda => d0p0,pv=true,emul=scsi0.0,_vdevice=8:0
sdb => d1p0,pv=true,emul=scsi0.1,_vdevice=8:16
where the first number is the bus and the second is the unit as passed
to -drive. The second number goes from 0 to 7 (that's what QEMU does
at least).
Paolo
Ian Jackson
2010-Jul-28 16:05 UTC
Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc
Ian Campbell writes ("[Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc"):
> Virtual Disk Configuration
I don't agree with this interpretation. In February I posted a
draft spec which provided a different interpretation of events:
http://lists.xensource.com/archives/html/xen-devel/2010-02/msg00183.html
Below is a version of that which has been enhanced to answer the
questions raised by this conversation.
Xen guest interface
-------------------
A Xen guest can be provided with block devices. These are always
provided as Xen VBDs; for HVM guests they may also be provided as
emulated IDE or SCSI disks.
The abstract interface involves specifying, for each block device:
* Nominal disk type: Xen virtual disk (aka xvd*, the default); SCSI
  (sd*); IDE (hd*).
  For HVM guests, each whole-disk hd* and sd* device is made
  available _both_ via emulated IDE resp. SCSI controller, _and_ as a
  Xen VBD. The HVM guest is entitled to assume that the IDE or SCSI
  disks available via the emulated IDE controller target the same
  underlying devices as the corresponding Xen VBD (ie, multipath).
  For PV guests every device is made available to the guest only as a
  Xen VBD. For these domains the type is advisory, for use by the
  guest's device naming scheme.
  The Xen interface does not specify what name a device should have
  in the guest (nor what major/minor device number it should have in
  the guest, if the guest has such a concept).
* Disk number, which is a nonnegative integer,
  conventionally starting at 0 for the first disk.
* Partition number, which is a nonnegative integer where by
  convention partition 0 indicates the "whole disk".
  Normally for any disk _either_ partition 0 should be supplied in
  which case the guest is expected to treat it as they would a native
  whole disk (for example by putting or expecting a partition table
  or disk label on it);
  _Or_ only non-0 partitions should be supplied in which case the
  guest should expect storage management to be done by the host and
  treat each vbd as it would a partition or slice or LVM volume (for
  example by putting or expecting a filesystem on it).
Non-whole disk devices cannot be passed through to HVM guests via
the emulated IDE or SCSI controllers.
Configuration file syntax
-------------------------
The config file syntaxes are, for example
d0 d0p0    xvda      Xen virtual disk 0 partition 0 (whole disk)
d1p2       xvdb2     Xen virtual disk 1 partition 2
d536p37    xvdtq37   Xen virtual disk 536 partition 37
sdb3                 SCSI disk 1 partition 3
hdc2                 IDE disk 2 partition 2
The d*p* syntax is not supported by xm/xend.
To cope with guests which predate this scheme we therefore preserve
the existing facility to specify the xenstore numerical value directly
by putting a single number (hex, decimal or octal) in the domain
config file instead of the disk identifier.
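A parser for the named forms above might look like the following sketch. The function name `parse_vdev` and the returned tuple shape are assumptions made for illustration; the single-number form mentioned in the last paragraph is deliberately left out.

```python
import re
from typing import Tuple


def parse_vdev(name: str) -> Tuple[str, int, int]:
    """Return (type, disk, partition) for names like d1p2, xvdtq37, sdb3, hdc2."""
    m = re.fullmatch(r"d(\d+)(?:p(\d+))?", name)
    if m:
        # dX is shorthand for dXp0, i.e. the whole disk.
        return ("xvd", int(m.group(1)), int(m.group(2) or 0))
    m = re.fullmatch(r"(xvd|sd|hd)([a-z]+)(\d*)", name)
    if m is None:
        raise ValueError(f"unrecognised disk name: {name!r}")
    disk = 0
    for ch in m.group(2):
        # Bijective base 26: a..z are 0..25, then aa is 26, ab is 27, ...
        disk = disk * 26 + (ord(ch) - ord("a") + 1)
    return (m.group(1), disk - 1, int(m.group(3) or 0))
```

The letter suffix is treated as a bijective base-26 number, which is what makes xvdtq come out as disk 536.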
Concrete encoding in the VBD interface (in xenstore)
----------------------------------------------------
The information above is encoded in the concrete interface as an
integer (in a canonical decimal format in xenstore), whose value
encodes the information above as follows:
1 << 28 | disk << 8 | partition      xvd, disks or partitions 16 onwards
202 << 8 | disk << 4 | partition     xvd, disks and partitions up to 15
8 << 8 | disk << 4 | partition       sd, disks and partitions up to 15
3 << 8 | disk << 6 | partition       hd, disks 0..1, partitions 0..63
22 << 8 | (disk-2) << 6 | partition  hd, disks 2..3, partitions 0..63
2 << 28 onwards                      reserved for future use
other values less than 1 << 28       deprecated / reserved
The 1<<28 format handles disks up to (1<<20)-1 and partitions up to
255. It will be used only where the 202<<8 format does not have
enough bits.
Guests MAY support any subset of the formats above except that if they
support 1<<28 they MUST also support 202<<8. PV-on-HVM drivers MUST
support at least one of 3<<8 or 8<<8; 3<<8 is recommended.
Some software has provided essentially Linux-specific encodings for
SCSI disks beyond disk 15 partition 15, and IDE disks beyond disk 3
partition 63. These vbds, and the corresponding encoded integers, are
deprecated.
Guests SHOULD ignore numbers that they do not understand or
recognise. They SHOULD check supplied numbers for validity.
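The encoding table can be written down as a single function. This is a sketch under the table's stated ranges; the name `encode_vbd` and the string type tags are hypothetical, and the deprecated high-numbered sd/hd encodings are simply rejected.

```python
def encode_vbd(kind: str, disk: int, partition: int) -> int:
    """Encode (type, disk, partition) as the xenstore integer from the table."""
    if disk < 0 or partition < 0:
        raise ValueError("disk and partition must be nonnegative")
    if kind == "xvd":
        if disk < 16 and partition < 16:
            return (202 << 8) | (disk << 4) | partition
        if disk < (1 << 20) and partition < 256:
            # Used only where the 202 << 8 format does not have enough bits.
            return (1 << 28) | (disk << 8) | partition
    elif kind == "sd":
        if disk < 16 and partition < 16:
            return (8 << 8) | (disk << 4) | partition
    elif kind == "hd":
        if disk < 2 and partition < 64:
            return (3 << 8) | (disk << 6) | partition
        if disk < 4 and partition < 64:
            return (22 << 8) | ((disk - 2) << 6) | partition
    raise ValueError(f"no non-deprecated encoding for {kind} d{disk}p{partition}")
```

For instance `encode_vbd("hd", 2, 2)` uses major 22 with the disk stored as disk-2, matching the fifth row of the table.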
Notes on Linux as a guest
-------------------------
Very old Linux guests (PV and PV-on-HVM) are able to "steal" the
device numbers and names normally used by the IDE and SCSI
controllers, so that writing "hda1" in the config file results in
/dev/hda1 in the guest. These systems interpret the xenstore integer
as
major << 8 | minor
where major and minor are the Linux-specific device numbers. Some old
configurations may depend on deprecated high-numbered SCSI and IDE
disks. This does not work in recent versions of Linux.
So for Linux PV guests, users are recommended to supply xvd* devices
only. Modern PV drivers will map these to identically-named devices
in the guest.
For Linux HVM guests using PV-on-HVM drivers, users are recommended to
supply as few hd* devices as possible and use pure xvd* devices for
the rest. Modern PV-on-HVM drivers will map the hd* devices to
/dev/xvdHDa etc.
Some Linux HVM guests with broken PV-on-HVM drivers do not cope
properly if both hda and hdc are supplied, nor with both hda and xvda,
because they map the bottom 8 bits of the xenstore integer
directly to the Linux guest's device number and throw away the rest;
they can crash due to minor number clashes.
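The clash is easy to see numerically. In the sketch below `broken_minor` is a hypothetical stand-in for what such a driver effectively computes; the three constants use the encodings listed earlier for whole disks.

```python
def broken_minor(xenstore_id: int) -> int:
    """What a broken driver keeps: only the bottom 8 bits."""
    return xenstore_id & 0xFF


hda = (3 << 8) | (0 << 6)     # hd, disk 0, whole disk
hdc = (22 << 8) | (0 << 6)    # hd, disk 2 (stored as disk - 2), whole disk
xvda = (202 << 8) | (0 << 4)  # xvd, disk 0, whole disk

# All three collapse to the same guest minor once the major is thrown away.
clash = broken_minor(hda) == broken_minor(hdc) == broken_minor(xvda)
```

All three values reduce to minor 0, so `clash` is true even though the full xenstore integers differ.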
Ian.
Jeremy Fitzhardinge
2010-Jul-28 16:45 UTC
Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc
On 07/28/2010 09:05 AM, Ian Jackson wrote:
> For Linux HVM guests using PV-on-HVM drivers, users are recommended to
> supply as few hd* devices as possible and use pure xvd* devices for
> the rest. Modern PV-on-HVM drivers will map the hd* devices to
> /dev/xvdHDa etc.
I think we've decided to make blkfront register pv versions of emulated
devices as hdX/sdX rather than using xvdHD. We don't do this in pv
domains.
J
Ian Jackson
2010-Jul-29 14:50 UTC
Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc
Jeremy Fitzhardinge writes ("Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc"):
> On 07/28/2010 09:05 AM, Ian Jackson wrote:
> > For Linux HVM guests using PV-on-HVM drivers, users are recommended to
> > supply as few hd* devices as possible and use pure xvd* devices for
> > the rest. Modern PV-on-HVM drivers will map the hd* devices to
> > /dev/xvdHDa etc.
>
> I think we've decided to make blkfront register pv versions of emulated
> devices as hdX/sdX rather than using xvdHD. We don't do this in pv domains.
Stealing the major number from the ide and scsi drivers, or just the
name ?
What if the domain has real sd* devices too ? (pvscsi, pvusb + usb
mass storage, passthrough, ...)
Ian.
Stefano Stabellini
2010-Jul-29 15:07 UTC
Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc
On Thu, 29 Jul 2010, Ian Jackson wrote:
> Jeremy Fitzhardinge writes ("Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc"):
> > On 07/28/2010 09:05 AM, Ian Jackson wrote:
> > > For Linux HVM guests using PV-on-HVM drivers, users are recommended to
> > > supply as few hd* devices as possible and use pure xvd* devices for
> > > the rest. Modern PV-on-HVM drivers will map the hd* devices to
> > > /dev/xvdHDa etc.
> >
> > I think we've decided to make blkfront register pv versions of emulated
> > devices as hdX/sdX rather than using xvdHD. We don't do this in pv domains.
>
> Stealing the major number from the ide and scsi drivers, or just the
> name ?
Both
> What if the domain has real sd* devices too ? (pvscsi, pvusb + usb
> mass storage, passthrough, ...)
Clashes are theoretically possible but very hard to produce in practice.
We are "stealing" device names only for emulated IDE and SCSI disks, and
emulated SCSI disks don't even work at the moment. So you would need to
pass through an IDE controller whose disks are configured as hd* (most
distros use sd* for IDE disks).
I think we are doing exactly what the user asked us to: setting up an
hdX device; in these very unlikely scenarios the user knows what he is
doing and can change the configuration.
Ian Jackson
2010-Jul-29 15:45 UTC
Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc
Stefano Stabellini writes ("Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc"):
> On Thu, 29 Jul 2010, Ian Jackson wrote:
> > What if the domain has real sd* devices too ? (pvscsi, pvusb + usb
> > mass storage, passthrough, ...)
>
> Clashes are theoretically possible but very hard to produce in practice.
> We are "stealing" device names only for emulated IDE and SCSI disks, and
> emulated SCSI disks don't even work at the moment. So you would need to
> passthrough an IDE controller whose disks are configured as hd* (most
> distros use sd* for IDE disks).
There are definitely people who are using emulated scsi disks; perhaps
they just haven't updated yet.
> I think we are doing exactly what the user asked us to: setting up an
> hdX device; in these very unlikely scenarios the user knows what he is
> doing and can change the configuration.
Well, no, they can't, because their bootloader probably doesn't
understand anything besides what they're actually using.
Certainly stealing the major number for scsi disks seems quite
dangerous. pv-usb is hardly that unlikely a scenario.
Ian.
Stefano Stabellini
2010-Jul-29 15:59 UTC
Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc
On Thu, 29 Jul 2010, Ian Jackson wrote:
> Stefano Stabellini writes ("Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc"):
> > On Thu, 29 Jul 2010, Ian Jackson wrote:
> > > What if the domain has real sd* devices too ? (pvscsi, pvusb + usb
> > > mass storage, passthrough, ...)
> >
> > Clashes are theoretically possible but very hard to produce in practice.
> > We are "stealing" device names only for emulated IDE and SCSI disks, and
> > emulated SCSI disks don't even work at the moment. So you would need to
> > passthrough an IDE controller whose disks are configured as hd* (most
> > distros use sd* for IDE disks).
>
> There are definitely people who are using emulated scsi disks; perhaps
> they just haven't updated yet.
I am not so sure about that
> > I think we are doing exactly what the user asked us to: setting up an
> > hdX device; in these very unlikely scenarios the user knows what he is
> > doing and can change the configuration.
>
> Well, no, they can't, because their bootloader probably doesn't
> understand anything besides what they're actually using.
they only have to change the device name, not the device class
> Certainly stealing the major number for scsi disks seems quite
> dangerous. pv-usb is hardly that unlikely a scenario.
we are not doing that for pvusb
Ian Jackson
2010-Jul-29 16:09 UTC
Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc
Stefano Stabellini writes ("Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc"):
> On Thu, 29 Jul 2010, Ian Jackson wrote:
> > Well, no, they can't, because their bootloader probably doesn't
> > understand anything besides what they're actually using.
>
> they only have to change the device name, not the device class
Surely you can't steal only one minor number ?
> > Certainly stealing the major number for scsi disks seems quite
> > dangerous. pv-usb is hardly that unlikely a scenario.
>
> we are not doing that for pvusb
pv-usb => usb mass storage => scsi disks
Ian.
Stefano Stabellini
2010-Jul-29 16:14 UTC
Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc
On Thu, 29 Jul 2010, Ian Jackson wrote:
> Stefano Stabellini writes ("Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc"):
> > On Thu, 29 Jul 2010, Ian Jackson wrote:
> > > Well, no, they can't, because their bootloader probably doesn't
> > > understand anything besides what they're actually using.
> >
> > they only have to change the device name, not the device class
>
> Surely you can't steal only one minor number ?
yes, that's what we do.
> > > Certainly stealing the major number for scsi disks seems quite
> > > dangerous. pv-usb is hardly that unlikely a scenario.
> >
> > we are not doing that for pvusb
>
> pv-usb => usb mass storage => scsi disks
I mean there is no such thing as pv-usb.
Jeremy Fitzhardinge
2010-Jul-29 16:29 UTC
Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc
On 07/29/2010 09:14 AM, Stefano Stabellini wrote:
> On Thu, 29 Jul 2010, Ian Jackson wrote:
>> Stefano Stabellini writes ("Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc"):
>>> On Thu, 29 Jul 2010, Ian Jackson wrote:
>>>> Well, no, they can't, because their bootloader probably doesn't
>>>> understand anything besides what they're actually using.
>>> they only have to change the device name, not the device class
>> Surely you can't steal only one minor number ?
> yes, that's what we do.
More than one minor, surely? One for each device.
>>>> Certainly stealing the major number for scsi disks seems quite
>>>> dangerous. pv-usb is hardly that unlikely a scenario.
>>>
>>> we are not doing that for pvusb
>> pv-usb => usb mass storage => scsi disks
>
> I mean there is no such thing as pv-usb.
Well, it hasn't been ported to pvops yet. I've been getting promises of
patches any month now for a couple of years.
I wonder if blkfront could register itself with the scsi subsystem
rather than directly as a block device?
J
Ian Jackson
2010-Jul-29 16:34 UTC
Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc
Jeremy Fitzhardinge writes ("Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc"):
> I wonder if blkfront could register itself with the scsi subsystem
> rather than directly as a block device?
I bet that would mean it would have to deal with SCSI command blocks
and stuff, so I doubt it.
Ian.
Jeremy Fitzhardinge
2010-Jul-29 16:37 UTC
Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc
On 07/29/2010 09:34 AM, Ian Jackson wrote:
> Jeremy Fitzhardinge writes ("Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc"):
>> I wonder if blkfront could register itself with the scsi subsystem
>> rather than directly as a block device?
> I bet that would mean it would have to deal with SCSI command blocks
> and stuff, so I doubt it.
Well, random ide/ata devices are now part of scsi via libata, and they
presumably can not deal with raw scsi commands.
J
Alan Cox
2010-Jul-29 17:54 UTC
Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc
On Thu, 29 Jul 2010 09:37:22 -0700 Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> On 07/29/2010 09:34 AM, Ian Jackson wrote:
> > Jeremy Fitzhardinge writes ("Re: [Xen-devel] [RFC] Virtual disk configuration, PV vs. emulated, backward compatibility etc"):
> >> I wonder if blkfront could register itself with the scsi subsystem
> >> rather than directly as a block device?
> > I bet that would mean it would have to deal with SCSI command blocks
> > and stuff, so I doubt it.
>
> Well, random ide/ata devices are now part of scsi via libata, and they
> presumably can not deal with raw scsi commands.
ATAPI devices speak SCSI (or a sort of Pidgin SCSI anyway), and libata
translates SCSI<->ATA for disks, so in the Linux world you can throw
arbitrary *valid* SCSI at them and you should get valid and correct
behaviour for a SCSI disk. If not, it's a bug.
Alan