[crossbow-discuss] Updated Crossbow virtualization architecture document [Aug 2007]

Nicolas Droux

2007-Aug-28 06:40 UTC

[crossbow-discuss] Updated Crossbow virtualization architecture document

Folks,

I just posted an updated Crossbow virtualization architecture  
document. The new revision is available at:

http://opensolaris.org/os/project/crossbow/Docs/crossbow-virt.pdf

The main changes are the addition of support for multiple MAC  
addresses per client, and an explicit separation between  
consolidation private and project private MAC API entry points. See  
in particular the updated section 4.3 and chapter 5.

Nicolas.

-- 
Nicolas Droux - Solaris Core OS - Sun Microsystems, Inc.
droux at sun.com - http://blogs.sun.com/droux

James Carlson

2007-Aug-28 13:44 UTC

Nicolas Droux writes:
> http://opensolaris.org/os/project/crossbow/Docs/crossbow-virt.pdf

I have a few questions about this.  I've also read through as much of
the crossbow-discuss archives as seemed to be related to these
topics, and didn't find answers there.

  - Why are bandwidth, CPU control, and MAC address assignment
    exclusively a VNIC feature, at least at the administrative level?
    Section 4.7 seems to say that MAC instances will get these
    features, so shouldn't this be "modify-dev" instead?

    Needing to create a "dummy" VNIC on top of a regular interface
    just to interpose these new features seems like an implementation
    artifact.

  - I assume we need a redesign of the VLAN code in order to get
    per-VLAN bandwidth control.  Is that redesign part of Crossbow, or
    is it some later project?  In reading the archives, it seems that
    it's been proposed as part of Crossbow, but in reading this
    document it seems to be part of something else.

  - If per-VLAN control appears, do the units of administration
    change?  Does it then become reasonable to talk about bandwidth
    and CPU control using "set-linkprop"?

  - Do bandwidth and CPU controls rely on squeues?  If so, then VNICs
    may not be able to control utilization from non-IP traffic, such
    as with bridging.

  - I'm not sure I understand the (undocumented? -- not in summary)
    "-F" option for move-vnic.  If I'm using a factory address on one
    NIC and I move a VNIC to another NIC, does this cause the VNIC to
    continue using the _same_ address but just on a new NIC?

    If so, how is duplication avoided if that factory address is ever
    reused from the original NIC?

    I would have expected that a VNIC using a factory address would
    just get a *new* address during a forced move to a new NIC.
    Changing MAC address during reconfiguration doesn't seem like a
    disaster to me -- in fact, it seems expected.  Why should it try
    to retain the address?

  - For showing statistics with "show-vnic -s", are these the same as
    "show-link -s"?  If so, wouldn't the existing "show-link -s" do
    the job?

  - What do "up" and "down" mean?  Are these equivalent to controlling
    the "RUNNING" bit from user space (i.e., some way of marking link
    up and link down manually)?  Or are they something else?  Should
    regular MAC instances (other than VNICs) have the ability to be
    set administratively up and down?

    What would happen if VNICs were always "up?"

  - What happens if a NIC is oversubscribed by the amount of bandwidth
    configured for the VNICs?  Is the result proportionate (and thus
    "fair") allocation, or do they compete on some other grounds?

    What kind of bandwidth control exists here?  How granular is it,
    and what effects do clients see from restricted bandwidth?  Are
    packets dropped (they have to be, if bandwidth limits apply to
    forwarded traffic)?  If so, is it tail drop or something more
    sophisticated?

  - Can a VNIC be built atop another non-anchor VNIC?  (Seems like the
    answer is "yes.")

  - When VNICs share rings due to a lack of hardware resources, what
    happens when the client of one VNIC is using polling and the
    client of the other one is not?

    Won't one client end up blanking the interrupts for another?

  - Instead of adding more arguments to mac_open() to handle priority
    and bandwidth, I'd suggest making these separate calls.  You'll
    need the separate call anyway to implement the "modify" mechanism.

  - What exactly does exclusive MAC access do?  If mac_exclusive_set
    is called, are other client requests blocked (sleeping)?  Or are
    they rejected (return error)?  Or are they just let through, and
    all clients are expected to bracket requests with exclusive
    set/clear calls?

  - MAC_UNICAST_AUTO seems unnecessary to me.  Why not just call first
    with MAC_UNICAST_FACTORY and, if that fails, call again with
    MAC_UNICAST_RANDOM?  Doing that would even have better
    functionality as MAC_UNICAST_AUTO seems to omit the possibility of
    desiring a particular factory address when available.

    I think having MAC_UNICAST_AUTO in the mix ends up pushing some of
    the control-path complexity out of the user space and into the
    kernel.  It'd be better to simplify the kernel parts.

  - What sorts of privileges are required to create and administer
    VNICs?  Are these things that can be delegated to non-global
    zones?

  - Why is [V]NIC the right level of bandwidth control?  If I want to
    give a zone 100Mbps worth of bandwidth, but I'm giving it multiple
    VNICs, how do I do that -- can the bandwidth control logic do
    accounting based on multiple interfaces (aggregate control, rather
    than individual interface control)?

    If I have application-level controls, such as HTTP virtual servers
    or a sendmail configuration handling multiple domains, how can I
    control bandwidth for those things?  Won't the application need to
    be involved?

-- 
James Carlson, Solaris Networking              <james.d.carlson at sun.com>
Sun Microsystems / 1 Network Drive         71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

Nicolas Droux

2007-Aug-28 22:47 UTC

Jim,

Thanks for the comments.

James Carlson wrote:
> Nicolas Droux writes:
>> http://opensolaris.org/os/project/crossbow/Docs/crossbow-virt.pdf
> 
> I have a few questions about this.  I've also read through as much of
> the crossbow-discuss archives as seemed to be related to these
> topics, and didn't find answers there.
> 
>   - Why are bandwidth, CPU control, and MAC address assignment
>     exclusively a VNIC feature, at least at the administrative level?
>     Section 4.7 seems to say that MAC instances will get these
>     features, so shouldn't this be "modify-dev" instead?
Bandwidth control, CPU mapping, and fanout are not exclusive to VNICs. They
will be expressed as properties and will apply to non-VNIC data-links
as well. This will be described in detail in another upcoming document.
I'll see what I can do to make that clearer in the virtualization
document I sent out for review.

From the administration interface point of view, there are two ways to
associate properties with data-links. For data-links that are created
through a dladm subcommand like create-vnic, the initial set of
properties can be specified during the creation of the data-link itself
through a dedicated option. In addition, the properties can be set on
any data-link through the set-linkprop subcommand. The former allows the
administrator to create a VNIC with bandwidth control in a single
command instead of having to go through a two-step dance.
> 
>     Needing to create a "dummy" VNIC on top of a regular interface
>     just to interpose these new features seems like an implementation
>     artifact.
No, that won't be needed; see above.
> 
>   - I assume we need a redesign of the VLAN code in order to get
>     per-VLAN bandwidth control.  Is that redesign part of Crossbow, or
>     is it some later project?  In reading the archives, it seems that
>     it's been proposed as part of Crossbow, but in reading this
>     document it seems to be part of something else.
Yes, we're currently planning to move VLAN processing down to the MAC
layer itself, and the VLAN processing currently in the DLS layer will be 
removed. This still needs to be properly documented.
> 
>   - If per-VLAN control appears, do the units of administration
>     change?  Does it then become reasonable to talk about bandwidth
>     and CPU control using "set-linkprop"?
Yes, the properties will apply to VLAN data-links as well, see above.
> 
>   - Do bandwidth and CPU controls rely on squeues?  If so, then VNICs
>     may not be able to control utilization from non-IP traffic, such
>     as with bridging.
There is a level of bandwidth control done by squeue, but there's also
bandwidth control done by the MAC layer itself, which is useful when
there's a need to do bandwidth control before fanout to multiple CPUs at
the MAC layer, and also for non-IP protocols, or when the MAC is being
used by a virtual machine's back-end drivers in the host OS. See also
Sunay's writeup at
http://www.opensolaris.org/os/project/crossbow/Design_softringset.txt
for more details on this topic.
>   - I'm not sure I understand the (undocumented? -- not in summary)
>     "-F" option for move-vnic.  If I'm using a factory address on one
>     NIC and I move a VNIC to another NIC, does this cause the VNIC to
>     continue using the _same_ address but just on a new NIC?
> 
>     If so, how is duplication avoided if that factory address is ever
>     reused from the original NIC?
> 
>     I would have expected that a VNIC using a factory address would
>     just get a *new* address during a forced move to a new NIC.
>     Changing MAC address during reconfiguration doesn't seem like a
>     disaster to me -- in fact, it seems expected.  Why should it try
>     to retain the address?
I was trying to allow the system administrator to minimize the impact on
the existing MAC address assignment when a VNIC is moved off a device
and back again. But I agree that it's not optimal. If the folks on
this list feel that the MAC address changing is not an issue, I've no
problem using the simpler scheme of assigning a new MAC address to the
VNIC/MAC client.
>   - For showing statistics with "show-vnic -s", are these the same as
>     "show-link -s"?  If so, wouldn't the existing "show-link -s" do
>     the job?
Agreed, show-link -s should do fine here.
>   - What do "up" and "down" mean?  Are these equivalent to controlling
>     the "RUNNING" bit from user space (i.e., some way of marking link
>     up and link down manually)?  Or are they something else?  Should
>     regular MAC instances (other than VNICs) have the ability to be
>     set administratively up and down?
> 
>     What would happen if VNICs were always "up?"
Here it means causing the VNIC MACs to register with the framework. The
same functionality already exists for link aggregations. Meem suggested
init-vnic instead, which would be fine with me and would avoid potential
confusion with ifconfig up. I still need to update that part of the
document.
> 
>   - What happens if a NIC is oversubscribed by the amount of bandwidth
>     configured for the VNICs?  Is the result proportionate (and thus
>     "fair") allocation, or do they compete on some other grounds?
In that case it will depend on other factors such as the type of 
traffic, the CPU(s) processing that traffic, etc.
> 
>     What kind of bandwidth control exists here?  How granular is it,
>     and what effects do clients see from restricted bandwidth?  Are
>     packets dropped (they have to be, if bandwidth limits apply to
>     forwarded traffic)?  If so, is it tail drop or something more
>     sophisticated?
In general, if an SRS or flow is assigned its own hardware ring, then the
polling thread will poll packets directly from the ring, and there's no
dropping from the host. Packets will be polled from the rings when
allowed as per bandwidth limits and consumption. The polling thread is
scheduled every tick, and we compute a maximum number of bytes per tick.

If more than one SRS/squeue share a ring, there's no polling of the
ring. Instead, traffic will be interrupt driven, and packets will be
deposited on queues associated with the SRS/squeue. Packets are then
pulled from these queues based on bandwidth limits. If the maximum
number of packets in these queues is exceeded, then there's tail drop.
Again, see the SRS design doc.
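
To make the two behaviours above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: a tick rate of 100 per second, and the names `bytes_per_tick` and `BoundedQueue` are hypothetical, not Crossbow identifiers. The real SRS logic is kernel code and may differ.

```python
from collections import deque


def bytes_per_tick(limit_bps, ticks_per_sec=100):
    """Byte budget a poller may consume in one tick (tick rate is assumed)."""
    return limit_bps // 8 // ticks_per_sec


class BoundedQueue:
    """Software queue with tail drop: arrivals are discarded once it is full."""

    def __init__(self, max_packets):
        self.max_packets = max_packets
        self.packets = deque()      # each entry is a packet size in bytes
        self.dropped = 0

    def enqueue(self, size):
        if len(self.packets) >= self.max_packets:
            self.dropped += 1       # tail drop: the newest arrival is lost
            return False
        self.packets.append(size)
        return True

    def drain_one_tick(self, budget):
        """Pull packets until the next one would exceed the byte budget."""
        sent = 0
        while self.packets and sent + self.packets[0] <= budget:
            sent += self.packets.popleft()
        return sent
```

With a 100 Mbps limit and 100 ticks per second, the budget works out to 125,000 bytes per tick; anything arriving while the queue is full is simply counted as dropped.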
>   - Can a VNIC be built atop another non-anchor VNIC?  (Seems like the
>     answer is "yes.")
Correct.
> 
>   - When VNICs share rings due to a lack of hardware resources, what
>     happens when the client of one VNIC is using polling and the
>     client of the other one is not?
> 
>     Won't one client end up blanking the interrupts for another?

If there's one ring shared by multiple VNICs, traffic arrival will be
interrupt based, and after software classification, traffic will be 
deposited to software rings.

If there are multiple hardware rings but only one interrupt, then the 
driver does not disable the hardware interrupt. Instead, it takes note 
of the request from the stack to not interrupt for specific rings. When 
a hardware interrupt is received, it avoids consuming packets from these 
rings, and continues delivering traffic to the MAC layer otherwise. 
Again, see the document on SRS and bandwidth control for more details.
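
The multiple-rings, single-interrupt case can be pictured with a toy model. This is a sketch only; the class and method names are hypothetical and do not come from any driver or from the MAC layer:

```python
class SharedInterruptDriver:
    """Toy model of a NIC with several receive rings but one interrupt line."""

    def __init__(self, ring_count):
        self.rings = [[] for _ in range(ring_count)]
        self.polled_rings = set()   # rings the stack asked us not to interrupt for

    def receive(self, ring, packet):
        self.rings[ring].append(packet)

    def set_polling(self, ring, enabled):
        # The hardware interrupt stays enabled; we only remember the request.
        if enabled:
            self.polled_rings.add(ring)
        else:
            self.polled_rings.discard(ring)

    def on_interrupt(self):
        """Deliver packets from every ring except those in polling mode."""
        delivered = []
        for index, ring in enumerate(self.rings):
            if index in self.polled_rings:
                continue            # left for the polling thread to pick up
            delivered.extend(ring)
            ring.clear()
        return delivered

    def poll(self, ring):
        """Called by the stack's polling thread for a ring in polling mode."""
        packets = list(self.rings[ring])
        self.rings[ring].clear()
        return packets
```

In this model a client that polls ring 0 does not blank interrupts for the client on ring 1: the interrupt handler skips ring 0 and keeps delivering from ring 1.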
>   - Instead of adding more arguments to mac_open() to handle priority
>     and bandwidth, I'd suggest making these separate calls.  You'll
>     need the separate call anyway to implement the "modify" mechanism.
Having the parameters specified in mac_open() is useful since it allows
these parameters to be specified when the resources are allocated to
the MAC client. This avoids allocating a set of default resources and
then immediately changing these resources through a separate modify
mechanism. If we can specify them through two or three arguments, I don't
think this should be an issue.
>   - What exactly does exclusive MAC access do?  If mac_exclusive_set
>     is called, are other client requests blocked (sleeping)?  Or are
>     they rejected (return error)?  Or are they just let through, and
>     all clients are expected to bracket requests with exclusive
>     set/clear calls?
This is basically the equivalent of the
mac_active_set()/mac_active_clear() we have in Nevada today. I'm looking
into whether the same semantics could be implemented indirectly through
the mac_unicst_set() with the primary MAC address, since there's only
one and it can be assigned only to one MAC client.
> 
>   - MAC_UNICAST_AUTO seems unnecessary to me.  Why not just call first
>     with MAC_UNICAST_FACTORY and, if that fails, call again with
>     MAC_UNICAST_RANDOM?  Doing that would even have better
>     functionality as MAC_UNICAST_AUTO seems to omit the possibility of
>     desiring a particular factory address when available.
The intent was for AUTO to allow the slot to be specified. That option 
should allow the slot number to be specified via addr_slot.
>     I think having MAC_UNICAST_AUTO in the mix ends up pushing some of
>     the control-path complexity out of the user space and into the
>     kernel.  It'd be better to simplify the kernel parts.
This is very simple logic we're talking about here; I don't see the
problem doing that selection in kernel space. In addition, it avoids
having two system calls per VNIC created on top of NICs which do not
provide multiple factory MAC addresses.
>   - What sorts of privileges are required to create and administer
>     VNICs?  Are these things that can be delegated to non-global
>     zones?
Basically the same that are needed for administering other data-links,
i.e. sys_net_config and net_rawaccess. In a zones environment, data-link
administration is limited to the global zone.
>   - Why is [V]NIC the right level of bandwidth control?  If I want to
>     give a zone 100Mbps worth of bandwidth, but I'm giving it multiple
>     VNICs, how do I do that -- can the bandwidth control logic do
>     accounting based on multiple interfaces (aggregate control, rather
>     than individual interface control)?
No, the bandwidth control is on a per-interface or per-flow basis.
This is because the bandwidth is basically controlled by polling on a 
per ring (software or hardware) basis, not across a set of rings.
>     If I have application-level controls, such as HTTP virtual servers
>     or a sendmail configuration handling multiple domains, how can I
>     control bandwidth for those things?  Won't the application need to
>     be involved?
Then you will use flowadm(1M), which we are also introducing as part of
Crossbow and which will be described separately. My document focuses on the
virtualization aspects of the project.

Nicolas.

-- 
Nicolas Droux - Solaris Networking - Sun Microsystems, Inc.
droux at sun.com - http://blogs.sun.com/droux

James Carlson

2007-Aug-29 13:27 UTC

Nicolas Droux writes:
> From the administration interface point of view, there are two ways to
> associate properties with data-links. For data-links that are created
> through a dladm subcommand like create-vnic, the initial set of
> properties can be specified during the creation of the data-link itself
> through a dedicated option. In addition the properties can be set on
> any data-link through the set-linkprop subcommand. The former allows the
> administrator to create a VNIC with bandwidth control in a single
> command instead of having to go through a two step dance.
Does this mean that the same properties will be accessible via both
"modify-vnic" and "set-linkprop"?

I can understand wanting to set some initial properties at create
time, but it seems odd that the new general properties are segregated
into VNIC-specific commands.
> >   - Do bandwidth and CPU controls rely on squeues?  If so, then VNICs
> >     may not be able to control utilization from non-IP traffic, such
> >     as with bridging.
> 
> There is a level of bandwidth control done by squeue, but there's also a
> bandwidth control done by the MAC layer itself. Which is useful when
> there's a need to do bandwidth control before fanout to multiple CPUs at
> the MAC layer, and also for non-IP protocols, or when the MAC is being
> used by a virtual machines back-end drivers in the host OS. See also
> Sunay's writeup at
> http://www.opensolaris.org/os/project/crossbow/Design_softringset.txt 
> for more details on this topic.
I had found and read that document before writing my comment.

I still don't quite see the relationship here.  What are the
responsibilities of the two mechanisms (the MAC layer and the
squeues)?

To put the question in another way: suppose I have a non-IP protocol
using a VNIC with a bandwidth control set on it.  What happens?  Are
there features that were related to squeues that I won't be able to
use?  If so, then what are those features?

Or, to put it another way still: are there things that non-IP
protocols should or could be doing in order to "cooperate" with this
bandwidth control so that they behave as well as IP's squeues will?
> I was trying to allow the system administrator to minimize the impact on 
> the existing MAC address assignment when moving a VNIC to be moved off 
> and back to a device. But I agree that it's not optimal. If the folks on
> this list feel that the MAC address changing is not an issue, I've no
> problem using the simpler scheme of reassigning a new MAC address to the 
> VNIC/MAC client.
If I (as a system administrator) say "factory" as part of the
configuration of the interface, then I'd expect to get a
factory-supplied address.  My expectation would be that when the
factory-supplied components are swapped out underneath, the address
changes.

Having the factory-supplied address come unmoored from the device
itself seems odd to me, and almost certain to cause trouble.  I
suppose it could be possible to create an "adopt the factory address
and treat it as though it were my own statically-configured address"
option, but I'd certainly want to see it come with adequate warnings
about the dangers and a clear user interface (not "factory" but
"steal-from-factory" ;-}).  I'm not sure that it'd be administratively
interesting, though.
> >   - What do "up" and "down" mean?  Are these equivalent to controlling
> >     the "RUNNING" bit from user space (i.e., some way of marking link
> >     up and link down manually)?  Or are they something else?  Should
> >     regular MAC instances (other than VNICs) have the ability to be
> >     set administratively up and down?
> > 
> >     What would happen if VNICs were always "up?"
> 
> Here it means causing the VNIC MACs to register with the framework. The 
> same functionality already exists for link aggregations. Meem suggested 
> init-vnic instead, which would be fine to me and avoid potential 
> confusions with ifconfig up. I still need to update that part of the 
> document.
Ah, ok.  Yes, that would make this a lot clearer.
> >   - What happens if a NIC is oversubscribed by the amount of bandwidth
> >     configured for the VNICs?  Is the result proportionate (and thus
> >     "fair") allocation, or do they compete on some other grounds?
> 
> In that case it will depend on other factors such as the type of 
> traffic, the CPU(s) processing that traffic, etc.
I suggest putting more effort into characterizing this, because
oversubscribing is a common and fairly well understood way to balance
risk versus utilization and occurs often in handling failure scenarios
(such as with aggregation).

I've seen similar schemes for access servers (most have proprietary
RADIUS extensions for setting bandwidth limits), and the usual way
this works is that once the link is saturated, the configured limits
become shares.  Thus, the clients are all hurt in proportion to the
amount of bandwidth they're given.
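
As a rough illustration of the "limits become shares" behaviour, a minimal Python sketch follows. The function name and the Mbps units are assumptions made for the example; this shows the access-server model described here, not what Crossbow does:

```python
def allocate(link_mbps, limits_mbps):
    """Return per-client bandwidth; scale limits down when oversubscribed."""
    total = sum(limits_mbps.values())
    if total <= link_mbps:
        # Not oversubscribed: every client simply gets its configured limit.
        return dict(limits_mbps)
    # Oversubscribed: treat each limit as a share of the saturated link.
    return {name: limit * link_mbps / total
            for name, limit in limits_mbps.items()}
```

For a 1000 Mbps link with limits of 600, 600 and 300, the total of 1500 exceeds the link, so each client is scaled by 1000/1500 and receives 400, 400 and 200 respectively.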
> >     What kind of bandwidth control exists here?  How granular is it,
> >     and what effects do clients see from restricted bandwidth?  Are
> >     packets dropped (they have to be, if bandwidth limits apply to
> >     forwarded traffic)?  If so, is it tail drop or something more
> >     sophisticated?
> 
> In general if a SRS or flow is assigned its own hardware ring, then the 
> polling thread will poll packets directly from the ring, and there's no
> dropping from the host. Packets will be polled from the rings when
> allowed as per bandwidth limits and consumption. The polling thread is
> scheduled every tick, and we compute a maximum number of bytes per tick.
> 
> If more than one SRS/squeue share a ring, there's no polling of the
> ring. Instead, traffic will be interrupt driven, and packets will be
> deposited on queues associated with the SRS/squeue. Packets are then
> pulled from these queues based on bandwidth limits. If the maximum
> number of packets in these queues is exceeded, then there's tail drop.
> Again, see the SRS design doc.
"Tail drop" looks like the answer I was looking for.

In that case, you might want to consider (at least as an RFE)
including basic RED support here.  There can be a big difference in
behavior between hardware-imposed limits (ones that presumably affect
both the sender and receiver in most cases) and artificial limits
because the network behavior is quite different, and tail-drop is
known to cause poor TCP performance.
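
For reference, the core of classic RED is small. The sketch below shows only the basic drop-probability curve and the moving average of queue length; it omits the refinement that spaces drops out between packets, and the parameter values are placeholders rather than recommendations:

```python
def update_average(avg_qlen, current_qlen, weight=0.002):
    """Exponentially weighted moving average of the queue length."""
    return (1.0 - weight) * avg_qlen + weight * current_qlen


def red_drop_probability(avg_qlen, min_th, max_th, max_p):
    """Basic RED curve: 0 below min_th, linear up to max_p, 1 at max_th."""
    if avg_qlen < min_th:
        return 0.0
    if avg_qlen >= max_th:
        return 1.0
    return max_p * (avg_qlen - min_th) / (max_th - min_th)
```

Unlike tail drop, which discards nothing until the queue is full and then discards everything, this starts dropping a small fraction of arrivals early, so TCP senders back off at different times instead of all at once.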
> >   - When VNICs share rings due to a lack of hardware resources, what
> >     happens when the client of one VNIC is using polling and the
> >     client of the other one is not?
> > 
> >     Won't one client end up blanking the interrupts for another?
> 
> If there's one ring shared by multiple VNICs, traffic arrival will be
> interrupt based, and after software classification, traffic will be 
> deposited to software rings.
OK; that's the part I was looking for.
> >   - Instead of adding more arguments to mac_open() to handle priority
> >     and bandwidth, I'd suggest making these separate calls.  You'll
> >     need the separate call anyway to implement the "modify" mechanism.
> 
> Having the parameters specified in mac_open() is useful since they allow
> these parameters to be specified when the resources are allocated to
> the MAC client. This avoids allocating a set of default resources and
> then immediately changing these resources through a separate modify
> mechanism. If we can specify through 2-3 arguments I don't think this
> should be an issue.
I think it's much more flexible and easier to do it later.

You're going to need a function to change the values after mac_open()
time.  By supplying the same values during mac_open(), you're just
duplicating that functionality.

Worse, mac_open() is a core function, while resource control is at the
periphery.  If you need to modify mac_open() every time resource
controls are tweaked -- consider what happens when shared resources
are introduced (allowing control of multiple interfaces as a group),
or when more advanced queuing disciplines are allowed -- then this
interface will never settle down and never be appropriate as a DDI
function.

Separating these two allows you to add new control functions in the
future without having to modify every mac_open() caller.

It's as though every fcntl(2) feature needed to be supplied in
open(2).

Why is the resource allocation itself an important thing to optimize
versus the interface stability and scalability?
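
The shape being argued for can be sketched as follows. Everything here is hypothetical (the class, the property names, `mac_client_open`); it is not the real mac_open() interface, only an illustration of keeping the open call minimal and routing resource controls through one separate entry point:

```python
class MacClient:
    """Hypothetical MAC client handle with properties set after open."""

    # New controls are added here without touching the open call.
    SUPPORTED = {"priority", "bandwidth_bps"}

    def __init__(self, link_name):
        self.link_name = link_name
        self.props = {}

    def set_prop(self, name, value):
        """Single entry point for both initial setup and later modification."""
        if name not in self.SUPPORTED:
            raise KeyError(name)
        self.props[name] = value


def mac_client_open(link_name):
    # The open call takes only what is needed to identify the link.
    return MacClient(link_name)
```

A caller would open the client and then set, say, `bandwidth_bps`; adding a new control later means extending `SUPPORTED`, and existing callers of the open function are unaffected.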
> >   - What exactly does exclusive MAC access do?  If mac_exclusive_set
> >     is called, are other client requests blocked (sleeping)?  Or are
> >     they rejected (return error)?  Or are they just let through, and
> >     all clients are expected to bracket requests with exclusive
> >     set/clear calls?
> 
> This is basically the equivalent of the 
> mac_active_set()/mac_active_clear() we have in Nevada today. I'm looking
> into whether the same semantics could be implemented indirectly through
> the mac_unicst_set() with the primary MAC address, since there's only
> one and it can be assigned only to one MAC client.
I thought that the "active" flag was there to allow passive users
(such as snoop) to monitor interfaces that would otherwise be
off-bounds, such as aggregation members.  It's not clear to me how
there's an equivalent of that here.

Maybe this section just needs more explanation or a usage scenario.
> >   - MAC_UNICAST_AUTO seems unnecessary to me.  Why not just call first
> >     with MAC_UNICAST_FACTORY and, if that fails, call again with
> >     MAC_UNICAST_RANDOM?  Doing that would even have better
> >     functionality as MAC_UNICAST_AUTO seems to omit the possibility of
> >     desiring a particular factory address when available.
> 
> The intent was for AUTO to allow the slot to be specified. That option 
> should allow the slot number to be specified via addr_slot.
The document says it must be -1.
> >     I think having MAC_UNICAST_AUTO in the mix ends up pushing some of
> >     the control-path complexity out of the user space and into the
> >     kernel.  It'd be better to simplify the kernel parts.
> 
> This is very simple logic we're talking about here, I don't see the
> problem doing that selection in kernel space. In addition, it avoids 
> having two system calls per VNIC created on top of NICs which do not 
> provide multiple factory MAC addresses.
It's also duplicate logic.  Why optimize for system call counts versus
kernel code complexity?
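
The caller-side alternative (try a factory address, fall back to a random one) is only a few lines. The sketch below is a user-space model with hypothetical names (`Nic`, `alloc_factory`, `assign_unicast`); it does not use the MAC_UNICAST_* interface itself, and the example address comes from the documentation range:

```python
import random


class FactoryAddressUnavailable(Exception):
    """Raised when no (or not the requested) factory address is free."""


class Nic:
    def __init__(self, factory_addrs):
        self.free = dict(factory_addrs)      # slot number -> MAC address

    def alloc_factory(self, slot=None):
        if slot is None:
            if not self.free:
                raise FactoryAddressUnavailable("no free factory address")
            slot = min(self.free)            # any free slot will do
        if slot not in self.free:
            raise FactoryAddressUnavailable(f"slot {slot} is not available")
        return self.free.pop(slot)


def random_mac(rng):
    # Locally administered unicast: set bit 1, clear bit 0 of the first octet.
    first = (rng.randrange(256) | 0x02) & 0xFE
    octets = [first] + [rng.randrange(256) for _ in range(5)]
    return ":".join(f"{o:02x}" for o in octets)


def assign_unicast(nic, rng, slot=None):
    """Try a factory address first; fall back to a random one."""
    try:
        return "factory", nic.alloc_factory(slot)
    except FactoryAddressUnavailable:
        return "random", random_mac(rng)
```

Because the caller drives the sequence, it can also ask for a particular slot and decide for itself what to do when that slot is taken.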
> >   - What sorts of privileges are required to create and administer
> >     VNICs?  Are these things that can be delegated to non-global
> >     zones?
> 
> Basically the same that are needed for administrating other data-links, 
> i.e. sys_net_config and net_rawaccess. In a zones environment data-link 
> administration is limited to the global zone.
That latter part might not be right for IP Instances, particularly
since VNICs can be built atop other VNICs.  (Maybe that's just an
issue for the future, though.)
> >   - Why is [V]NIC the right level of bandwidth control?  If I want to
> >     give a zone 100Mbps worth of bandwidth, but I'm giving it multiple
> >     VNICs, how do I do that -- can the bandwidth control logic do
> >     accounting based on multiple interfaces (aggregate control, rather
> >     than individual interface control)?
> 
> No, the bandwidth control is on a per-interface on a per-flow basis. 
> This is because the bandwidth is basically controlled by polling on a 
> per ring (software or hardware) basis, not across a set of rings.
That's quite different from what most QoS implementations I've seen
do.  The usual model is to map interfaces and flows into a "QoS
group," which is then controlled as a single unit, as in Cisco's
"qos-group" feature and policy maps.

I'd suggest making sure that potential customers of this new bandwidth
control feature are keenly aware of the no-resource-aggregation
limitation.  It sounds like it's intended as a fundamental design
feature, and not something that might be a temporary feature
limitation that could be removed later.  (As a user, I wouldn't be
surprised to find that the controls at initial release don't match
what I actually need, but I'd be very surprised if the controls
couldn't be fixed later.)
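
To show what group-level control means in practice, here is a minimal sketch in which several interfaces draw from one shared token bucket. The classes are hypothetical and model the generic "QoS group" idea only; per the reply quoted above, Crossbow's per-ring polling does not work this way:

```python
class TokenBucket:
    """Byte-based token bucket; one instance can back a whole group."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes = rate_bps / 8.0     # refill rate in bytes per second
        self.burst = burst_bytes
        self.tokens = float(burst_bytes)

    def refill(self, elapsed_sec):
        self.tokens = min(self.burst,
                          self.tokens + self.rate_bytes * elapsed_sec)

    def try_consume(self, nbytes):
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


class Interface:
    def __init__(self, name, bucket):
        self.name = name
        self.bucket = bucket                 # may be shared with other interfaces

    def send(self, nbytes):
        """Return True if the packet fits within the group's allowance."""
        return self.bucket.try_consume(nbytes)
```

Giving a zone 100 Mbps across two VNICs then amounts to creating one bucket and handing it to both interfaces; traffic on either one uses up the same allowance.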
> >     If I have application-level controls, such as HTTP virtual servers
> >     or a sendmail configuration handling multiple domains, how can I
> >     control bandwidth for those things?  Won't the application need to
> >     be involved?
> 
> Then you will use flowadm(1M) which we are also introducing as part of 
> Crossbow, and will be described separately. My document focuses on the 
> virtualization aspects of the project.
OK.

-- 
James Carlson, Solaris Networking              <james.d.carlson at sun.com>
Sun Microsystems / 1 Network Drive         71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

Nicolas Droux

2007-Aug-29 22:15 UTC

Jim,

James Carlson wrote:
> Nicolas Droux writes:
>> From the administration interface point of view, there are two ways to
>> associate properties with data-links. For data-links that are created
>> through a dladm subcommand like create-vnic, the initial set of
>> properties can be specified during the creation of the data-link itself
>> through a dedicated option. In addition the properties can be set on
>> any data-link through the set-linkprop subcommand. The former allows the
>> administrator to create a VNIC with bandwidth control in a single
>> command instead of having to go through a two step dance.
> 
> Does this mean that the same properties will be accessible via both
> "modify-vnic" and "set-linkprop"?
> 
> I can understand wanting to set some initial properties at create
> time, but it seems odd that the new general properties are segregated
> into VNIC-specific commands.
No, only set-linkprop will be used to change these properties, not
modify-vnic. We'll send out updated man pages to reflect these changes,
and they will differ from the man pages that were published as
part of our current bits.
>>>   - Do bandwidth and CPU controls rely on squeues?  If so, then VNICs
>>>     may not be able to control utilization from non-IP traffic, such
>>>     as with bridging.
>> There is a level of bandwidth control done by squeue, but there's also a
>> bandwidth control done by the MAC layer itself. Which is useful when
>> there's a need to do bandwidth control before fanout to multiple CPUs at
>> the MAC layer, and also for non-IP protocols, or when the MAC is being
>> used by a virtual machines back-end drivers in the host OS. See also
>> Sunay's writeup at
>> http://www.opensolaris.org/os/project/crossbow/Design_softringset.txt 
>> for more details on this topic.
> 
> I had found and read that document before writing my comment.
> 
> I still don't quite see the relationship here.  What are the
> responsibilities of the two mechanisms (the mac layer and the
> squeues)?
> 
> To put the question in another way: suppose I have a non-IP protocol
> using a VNIC with a bandwidth control set on it.  What happens?  Are
> there features that were related to squeues that I won't be able to
> use?  If so, then what are those features?
The client will see a MAC which has a bandwidth limit, nothing else is 
required.
> Or, to put it another way still: are there things that non-IP
> protocols should or could be doing in order to "cooperate" with this
> bandwidth control so that they behave as well as IP's squeues will?
No, no special requirements. The bandwidth limits set on a MAC will be
enforced by the MAC layer SRS. We'll also have a flow API which will be
available to MAC clients to define bandwidth limits for services, etc,
and used by clients like IP when needed.
>> I was trying to allow the system administrator to minimize the impact on
>> the existing MAC address assignment when a VNIC is moved off
>> and back to a device. But I agree that it's not optimal. If the folks on
>> this list feel that the MAC address changing is not an issue, I've no
>> problem using the simpler scheme of reassigning a new MAC address to the
>> VNIC/MAC client.
> 
> If I (as a system administrator) say "factory" as part of the
> configuration of the interface, then I'd expect to get a
> factory-supplied address.  My expectation would be that when the
> factory-supplied components are swapped out underneath, the address changes.
Actually there are three sub-cases to this I think:

1. The administrator does not specify an address (automatic
assignment), and a factory MAC address is assigned to the VNIC. In this
case, I think it's fine to assign a different MAC address, e.g. a random
one, to the VNIC if the VNIC is moved to a NIC which does not have
available factory MAC addresses.

2. If the administrator requested a factory MAC address explicitly,
then the VNIC could be moved to a different NIC which has an available
factory MAC address. Otherwise the operation would fail unless a force
flag is set.

3. If the administrator requested a factory MAC address of a specific
slot, then there's a clear intent of using a specific MAC address of the
device underneath. In that case the move operation would fail unless a
force flag is set.
> Having the factory-supplied address come unmoored from the device
> itself seems odd to me, and almost certain to cause trouble.  I
> suppose it could be possible to create a "adopt the factory address
> and treat it as though it were my own statically-configured address"
> option, but I'd certainly want to see it come with adequate warnings
> about the dangers and a clear user interface (not "factory" but
> "steal-from-factory" ;-}).  I'm not sure that it'd be administratively
> interesting, though.
Yes, there's a risk of duplicate addresses if that option was chosen
and the source NIC ends up being recycled later; that's less than ideal.
>>>   - What happens if a NIC is oversubscribed by the amount of bandwidth
>>>     configured for the VNICs?  Is the result proportionate (and thus
>>>     "fair") allocation, or do they compete on some other grounds?
>> In that case it will depend on other factors such as the type of 
>> traffic, the CPU(s) processing that traffic, etc.
> 
> I suggest putting more effort into characterizing this, because
> oversubscribing is a common and fairly well understood way to balance
> risk versus utilization and occurs often in handling failure scenarios
> (such as with aggregation).
> 
> I've seen similar schemes for access servers (most have proprietary
> RADIUS extensions for setting bandwidth limits), and the usual way
> this works is that once the link is saturated, the configured limits
> become shares.  Thus, the clients are all hurt in proportion to the
> amount of bandwidth they're given.
The limits are really used to clamp down on bandwidth utilization by a
MAC, but they do not imply any guaranteed bandwidth. As a future
deliverable we're also planning to provide bandwidth guarantees, which is
what you seem to be referring to here.
>>>     What kind of bandwidth control exists here?  How granular is it,
>>>     and what effects do clients see from restricted bandwidth?  Are
>>>     packets dropped (they have to be, if bandwidth limits apply to
>>>     forwarded traffic)?  If so, is it tail drop or something more
>>>     sophisticated?
>> In general if a SRS or flow is assigned its own hardware ring, then the
>> polling thread will poll packets directly from the ring, and there's no
>> dropping from the host. Packets will be polled from the rings when
>> allowed as per bandwidth limits and consumption. The polling thread is
>> scheduled every tick, and we compute a maximum number of bytes per tick.
>>
>> If more than one SRS/squeue share a ring, there's no polling of the
>> ring. Instead, traffic will be interrupt driven, and packets will be
>> deposited on queues associated with the SRS/squeue. Packets are then
>> pulled from these queues based on bandwidth limits. If the maximum
>> number of packets in these queues is exceeded, then there's tail drop.
>> Again, see the SRS design doc.
> 
> "Tail drop" looks like the answer I was looking for.
> 
> In that case, you might want to consider (at least as an RFE)
> including basic RED support here.  There can be a big difference in
> behavior between hardware-imposed limits (ones that presumably affect
> both the sender and receiver in most cases) and artificial limits
> because the network behavior is quite different, and tail-drop is
> known to cause poor TCP performance.
Agreed. We still need to document our existing scheme here in more detail,
and we should discuss alternatives as part of that text.
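
For readers following along, the scheme quoted above (a byte budget computed per tick, plus tail drop once a shared queue is full) can be sketched roughly as follows. This is only an illustration in Python, not Crossbow code: the class and function names, the 100 ticks-per-second default, and the queue length are all assumptions made for the example.

```python
from collections import deque


def bytes_per_tick(limit_bps, ticks_per_sec=100):
    """Byte budget for one tick, given a limit in bits per second.

    ticks_per_sec=100 is an assumed clock rate for this sketch.
    """
    return limit_bps // 8 // ticks_per_sec


class LimitedQueue:
    """Hypothetical software queue: per-tick byte budget, tail drop when full."""

    def __init__(self, limit_bps, max_packets, ticks_per_sec=100):
        self.budget = bytes_per_tick(limit_bps, ticks_per_sec)
        self.max_packets = max_packets
        self.queue = deque()   # holds packet lengths in bytes
        self.dropped = 0

    def enqueue(self, packet_len):
        # Tail drop: a packet arriving at a full queue is discarded.
        if len(self.queue) >= self.max_packets:
            self.dropped += 1
            return False
        self.queue.append(packet_len)
        return True

    def tick(self):
        """Pull packets for one tick without exceeding the byte budget.

        Simplification: a packet larger than the whole budget would never
        be sent in this model; a real implementation has to handle that.
        """
        sent, used = [], 0
        while self.queue and used + self.queue[0] <= self.budget:
            used += self.queue[0]
            sent.append(self.queue.popleft())
        return sent
```

With a 1Mbps limit and 100 ticks per second the budget is 1250 bytes per tick, so three queued 600-byte packets leave the queue over two ticks. Tail drop is the simplest possible policy here, which is why RED comes up in the reply above as a possible RFE.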
>>>   - Instead of adding more arguments to mac_open() to handle priority
>>>     and bandwidth, I'd suggest making these separate calls.  You'll
>>>     need the separate call anyway to implement the "modify" mechanism.
>> Having the parameters specified in mac_open() is useful since they allow
>> these parameters to be specified when the resources are allocated to
>> the MAC client. This avoids allocating a set of default resources and
>> then immediately changing these resources through a separate modify
>> mechanism. If we can specify through 2-3 arguments I don't think this
>> should be an issue.
> 
> I think it's much more flexible and easier to do it later.
> 
> You're going to need a function to change the values after mac_open()
> time.  By supplying the same values during mac_open(), you're just
> duplicating that functionality.
It might be a single "piece of code" which can be called to allocate 
resources according to these parameters from both the open and modify 
functions. I think the duplication can be avoided.
> Worse, mac_open() is a core function, while resource control is at the
> periphery.  If you need to modify mac_open() every time resource
> controls are tweaked -- consider what happens when shared resources
> are introduced (allowing control of multiple interfaces as a group),
> or when more advanced queuing disciplines are allowed -- then this
> interface will never settle down and never be appropriate as a DDI
> function.
> 
> Separating these two allows you to add new control functions in the
> future without having to modify every mac_open() caller.
> 
> It's as though every fcntl(2) feature needed to be supplied in
> open(2).
> 
> Why is the resource allocation itself an important thing to optimize
> versus the interface stability and scalability?
I don't agree with the "core function" vs "periphery" argument. The
resource control is becoming an integral part of the MAC layer, and
there shouldn't be a need to do "extra steps" to enable that
functionality.

But I agree with your point about designing an API which allows more 
options to be added in the future without breaking backward 
compatibility. However I think this can be made to work without 
requiring a separate call. I'll need to take a closer look at this.
>>>   - MAC_UNICAST_AUTO seems unnecessary to me.  Why not just call first
>>>     with MAC_UNICAST_FACTORY and, if that fails, call again with
>>>     MAC_UNICAST_RANDOM?  Doing that would even have better
>>>     functionality as MAC_UNICAST_AUTO seems to omit the possibility of
>>>     desiring a particular factory address when available.
>> The intent was for AUTO to allow the slot to be specified. That option 
>> should allow the slot number to be specified via addr_slot.
> 
> The document says it must be -1.
Yes, and I need to fix the document to allow a slot number to be passed 
when that MAC address type is specified.
>>>     I think having MAC_UNICAST_AUTO in the mix ends up pushing some of
>>>     the control-path complexity out of the user space and into the
>>>     kernel.  It'd be better to simplify the kernel parts.
>> This is very simple logic we're talking about here, I don't see the
>> problem doing that selection in kernel space. In addition, it avoids
>> having two system calls per VNIC created on top of NICs which do not
>> provide multiple factory MAC addresses.
> 
> It's also duplicate logic.  Why optimize for system call counts versus
> kernel code complexity?
There's additional code in the kernel, but that logic is very simple.
>>>   - What sorts of privileges are required to create and administer
>>>     VNICs?  Are these things that can be delegated to non-global
>>>     zones?
>> Basically the same that are needed for administrating other data-links,
>> i.e. sys_net_config and net_rawaccess. In a zones environment data-link
>> administration is limited to the global zone.
> 
> That latter part might not be right for IP Instances, particularly
> since VNICs can be built atop other VNICs.  (Maybe that's just an
> issue for the future, though.)
Even with IP instances, data-link control remains in the global zone.
>>>   - Why is [V]NIC the right level of bandwidth control?  If I want to
>>>     give a zone 100Mbps worth of bandwidth, but I'm giving it multiple
>>>     VNICs, how do I do that -- can the bandwidth control logic do
>>>     accounting based on multiple interfaces (aggregate control, rather
>>>     than individual interface control)?
>> No, the bandwidth control is on a per-interface or a per-flow basis.
>> This is because the bandwidth is basically controlled by polling on a 
>> per ring (software or hardware) basis, not across a set of rings.
> 
> That's quite different from what most QoS implementations I've seen
> do.  The usual model is to map interfaces and flows into a "QoS
> group," which is then controlled as a single unit, as in Cisco's
> "qos-group" feature and policy maps.
> 
> I'd suggest making sure that potential customers of this new bandwidth
> control feature are keenly aware of the no-resource-aggregation
> limitation.  It sounds like it's intended as a fundamental design
> feature, and not something that might be a temporary feature
> limitation that could be removed later.  (As a user, I wouldn't be
> surprised to find that the controls at initial release don't match
> what I actually need, but I'd be very surprised if the controls
> couldn't be fixed later.)
Yes, this will be of course fully documented. If we find an efficient
way to do bandwidth control across multiple rings in the future, I don't
see why we wouldn't be able to make use of that functionality.

Thanks,
Nicolas.

-- 
Nicolas Droux - Solaris Networking - Sun Microsystems, Inc.
droux at sun.com - http://blogs.sun.com/droux

James Carlson

2007-Aug-30 19:01 UTC

[crossbow-discuss] Updated Crossbow virtualization architecture document

Nicolas Droux writes:
> James Carlson wrote:
> > I can understand wanting to set some initial properties at create
> > time, but it seems odd that the new general properties are segregated
> > into VNIC-specific commands.
> 
> No, only set-linkprop will be used to change these properties, not 
> modify-vnic. We'll send out updated man pages to reflect these changes,
> and they will be different than the man pages that were published as 
> part of our current bits.
OK; thanks.
> > To put the question in another way: suppose I have a non-IP protocol
> > using a VNIC with a bandwidth control set on it.  What happens?  Are
> > there features that were related to squeues that I won't be able to
> > use?  If so, then what are those features?
> 
> The client will see a MAC which has a bandwidth limit, nothing else is
> required.
That's what I wanted to know.
> > If I (as a system administrator) say "factory" as part of the
> > configuration of the interface, then I'd expect to get a
> > factory-supplied address.  My expectation would be that when the
> > factory-supplied components are swapped out underneath, the address changes.
> 
> Actually there are three sub-cases to this I think:
> 
> 1. The administrator does not specify an address (automatic
> assignment), and a factory MAC address is assigned to the VNIC. In this
> case, I think it's fine to assign a different MAC address, e.g. a random
> one, to the VNIC if the VNIC is moved to a NIC which does not have
> available factory MAC addresses.
Yes, I agree with that.  That'd be the "auto" case, and I was talking
about "factory."
> 2. If the administrator requested a factory MAC address explicitly,
> then the VNIC could be moved to a different NIC which has an available
> factory MAC address. Otherwise the operation would fail unless a force
> flag is set.
Why would "force" be useful in this case?  What exactly happens if the
operation is "forced," and why couldn't I configure the interface in
that way in the first place?

I'm very leery of administrative options that leave the system in a
state where I couldn't have configured it that way in the first
place.  In this case, I can't configure a VNIC as "factory" if there
are no addresses available, but I can "force" a "factory" VNIC into an
interface with no addresses available.

Does the configuration pop loose (become something other than
"factory") during such a forced move, or does the configuration just
become incorrect, saying "factory" but meaning something else?
> 3. If the administrator requested a factory MAC address of a specific
> slot, then there's a clear intent of using a specific MAC address of the
> device underneath. In that case the move operation would fail unless a
> force flag is set.
I still think this is too divorced from administrative expectations.

When do I move VNICs around and what do I need and expect?  I think
this document should work through some actual usage scenarios and then
come up with usable interfaces based on that, because the current
interfaces seem to be self-referential: they do what they do because
that's what they do.  The "force" flag seems particularly problematic,
as it indicates that things the administrator should be able to do
aren't doable.

The scenarios I can see are:

  - User configures VNIC for the first time on a given NIC.  What
    happens when the "factory" address desired doesn't exist or is in
    use?

  - User wants a VNIC to move from one NIC to another.  Forget about
    "forcing" the operation, and look at the need.  Why am I moving it
    from one to another and what should I expect?

  - The system needs to move a VNIC from one NIC to another (or to
    none at all!) due to DR removal of the assigned NIC.

There might be other variations here.

Here's one possible answer that I think would make a bit more sense,
at least to me, and would be much simpler.

     The "auto" keyword and the "-F" flag go away.  All configurations
     that specify "factory" are implicitly automatic: if the requested
     factory address isn't available, then you get an auto-generated
     one and perhaps a warning message.  If you really care which kind
     of address you get, then look at the MAC address -- it'll have
     the "local" flag set if it was auto-generated.

     When moving from one interface to another, if "factory" is
     selected, it's the same as configuring the interface for the very
     first time.  If the requested address is available on the new
     (destination) NIC, then it's used.  If it's not, then an
     auto-generated address is used instead.

     The system doesn't have a way to be obstinate about using a
     particular factory-assigned address, and failing otherwise.  If
     you need to have a never-changing address, then assign one
     manually or use the "random" option, as neither of these options
     relies on data supplied by the hardware itself.  Factory
     addresses are, by definition, "ephemeral" from the point of view
     of a VNIC -- they're tied to the hardware, not to the VNIC.
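
The "local" flag mentioned above is the locally-administered bit of an IEEE 802 MAC address (value 0x02 in the first octet). A minimal sketch of the check, in Python; the function name and the sample addresses are made up for illustration, and it assumes (as the proposal does) that auto-generated addresses have this bit set:

```python
def is_locally_administered(mac):
    """Return True if the locally-administered (U/L) bit is set.

    `mac` is a colon-separated string such as "02:08:20:aa:bb:cc".
    The bit is 0x02 in the first octet of the address.
    """
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)


# Hypothetical addresses, for illustration only.
print(is_locally_administered("02:08:20:aa:bb:cc"))  # bit set
print(is_locally_administered("00:14:4f:aa:bb:cc"))  # bit clear
```

A factory-supplied (universally administered) address has the bit clear, so under this proposal an administrator could tell the two kinds of address apart without any extra configuration state.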
> > Having the factory-supplied address come unmoored from the device
> > itself seems odd to me, and almost certain to cause trouble.  I
> > suppose it could be possible to create a "adopt the factory address
> > and treat it as though it were my own statically-configured address"
> > option, but I'd certainly want to see it come with adequate warnings
> > about the dangers and a clear user interface (not "factory" but
> > "steal-from-factory" ;-}).  I'm not sure that it'd be administratively
> > interesting, though.
> 
> Yes, there's a risk of duplicate addresses if that option was chosen
> and the source NIC ends up being recycled later; that's less than ideal.
Actually, it's potentially a disaster if it happens.  If moving
factory addresses around among NICs is actually an important
administrative requirement, then, in terms of ARC review, I'd feel
TCR-strong that the system _must_ prevent duplicates from forming
somehow.  Or just not include that feature.
> > I've seen similar schemes for access servers (most have proprietary
> > RADIUS extensions for setting bandwidth limits), and the usual way
> > this works is that once the link is saturated, the configured limits
> > become shares.  Thus, the clients are all hurt in proportion to the
> > amount of bandwidth they're given.
> 
> The limits are really used to clamp down on bandwidth utilization by a
> MAC, but they do not imply any guaranteed bandwidth. As a future
> deliverable we're also planning to provide bandwidth guarantees, which is
> what you seem to be referring to here.
Actually, no, that's not quite what I'm referring to.

A bandwidth limit is an upper bound.  If the user tries to send more
than that, then he'll experience delay and loss.  There's no guarantee
that he'll be able to send that much, but he won't be able to send
more.

A bandwidth guarantee is a lower bound.  It's a reservation.  The user
must always be able to get at least a given amount.  This project
doesn't supply guarantees.

Quite apart from those definitions, though, is the issue of fairness.
In this case, I *am* talking about limits, but I'm also talking about
what happens when the limit is unachievable.  In the implementations
I've seen (Cisco and Ascend are pretty good references for this), the
limit becomes a share because this sort of behavior preserves
fairness.

Suppose we have twenty users with 10Mbps limits, and one user with a
50Mbps limit.  They're all on a 100Mbps pipe.  If ten of those 10Mbps
users can together lock out all of the others from using any of the
pipe bandwidth at all, then that's an "unfair" result.

A very simple, but "fair," result would be that, in the limit with
everyone sending flat-out, the 50Mbps-limited user would get 20% of
the bandwidth, or 20Mbps.  The 10Mbps users would get the remaining
80%, or 4Mbps each.  Thus, each user would end up with 40% (which is
100/250 and 20/50 and 4/10) of his maximum.
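
The arithmetic above can be written down as a small sketch in Python. This only illustrates the "limits become shares" idea described here; it is not how Crossbow or any particular router implements it, and the function name is invented for the example:

```python
def fair_shares(limits_mbps, capacity_mbps):
    """Scale configured limits down proportionally when oversubscribed.

    If the limits fit within the link capacity they are returned as-is.
    Otherwise each limit becomes a share of the capacity in proportion
    to its configured value.
    """
    total = sum(limits_mbps)
    if total <= capacity_mbps:
        return list(limits_mbps)
    return [limit * capacity_mbps / total for limit in limits_mbps]


# One 50Mbps user and twenty 10Mbps users on a 100Mbps pipe.
shares = fair_shares([50] + [10] * 20, 100)
print(shares[0], shares[1], sum(shares))
```

With the configured limits totalling 250Mbps on a 100Mbps link, every user is scaled by 100/250, which gives the 20Mbps and 4Mbps figures in the text.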

Other results are possible, including splitting the various kinds of
users into priority classes.  I assume that's not what's going on
here, though it's not clear.

The point is that, although the answer could just be that it's
inherently unfair, and that's how it is, I don't see how an inherently
unfair system is something that people could use in practice.  Does it
make sense to do that?
> > You're going to need a function to change the values after mac_open()
> > time.  By supplying the same values during mac_open(), you're just
> > duplicating that functionality.
> 
> It might be a single "piece of code" which can be called to allocate
> resources according to these parameters from both the open and modify 
> functions. I think the duplication can be avoided.
Then the duplication is only in the API.
> > Why is the resource allocation itself an important thing to optimize
> > versus the interface stability and scalability?
> 
> I don't agree with the "core function" vs "periphery" argument. The
> resource control is becoming an integral part of the MAC layer, and
> there shouldn't be a need to do "extra steps" to enable that functionality.
Opening the device is clearly core functionality -- you can't do much
if you can't open it.  It sounds like you agree that if those
arguments weren't present, then some "default" set of resources would
need to be allocated.  Thus, I argue that the functionality isn't core
to the goal of getting access to the mac layer.

So, the disagreement is on whether every consumer needs to set up
resource controls.  I'm not sure that they do.  But if they do, aren't
there other things they also "need to" set up, and should all of those
things be mac_open() arguments?
> But I agree with your point about designing an API which allows more 
> options to be added in the future without breaking backward 
> compatibility. However I think this can be made to work without 
> requiring a separate call. I'll need to take a closer look at this.
OK.
> > The document says it must be -1.
> 
> Yes, and I need to fix the document to allow a slot number to be passed 
> when that MAC address type is specified.
OK.
> > It's also duplicate logic.  Why optimize for system call counts versus
> > kernel code complexity?
> 
> There's additional code in the kernel, but that logic is very simple.
I'll give up on this point.  I don't think the duplication is
worthwhile, even if it's "simple," as this sort of thing often leads
to trouble when alternate policies are devised, but it's something
hidden in the implementation that can be ripped back up later if
necessary.
> Yes, this will be of course fully documented. If we find an efficient
> way to do bandwidth control across multiple rings in the future, I don't
> see why we wouldn't be able to make use of that functionality.
Not just "fully documented," but the design constraint around the
units of control (being individual NIC instances) needs to be clearly
described.  Maybe I'm atypical but, as a user, this wouldn't be
obvious to me.

-- 
James Carlson, Solaris Networking              <james.d.carlson at sun.com>
Sun Microsystems / 1 Network Drive         71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

Nicolas Droux

2007-Aug-31 20:58 UTC

[crossbow-discuss] Updated Crossbow virtualization architecture document

James Carlson wrote:
> When do I move VNICs around and what do I need and expect?  I think
> this document should work through some actual usage scenarios and then
> come up with usable interfaces based on that, because the current
> interfaces seem to be self-referential: they do what they do because
> that's what they do.  The "force" flag seems particularly problematic,
> as it indicates that things the administrator should be able to do
> aren't doable.
<snip>

We need the ability to assign a factory MAC address or fail the
operation if none is available. I don't see a problem with that. There's
the "auto" (also the default) option for administrators who want to have
a factory address if one is available, but don't mind having a random
address assigned if no factory address is available.

I'm fine with not having a "force" option during the move operation. This
means that if a VNIC was created with a "factory" option, and the
destination NIC doesn't have a factory MAC address available, then the
move will fail. It also means that if a user explicitly specified a
factory MAC address slot, the move will fail.
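
As a rough illustration of the move semantics described in the two paragraphs above (no "force" option), here is a sketch in Python. Every name in it (`AddrMode`, `MoveError`, `address_after_move`) is hypothetical and invented for the example; it models only what this message states, and the slot case in particular is still under discussion later in the thread:

```python
import enum


class AddrMode(enum.Enum):
    AUTO = "auto"                  # default: factory if available, else random
    FACTORY = "factory"            # a factory address is required
    FACTORY_SLOT = "factory-slot"  # a specific factory slot was requested


class MoveError(Exception):
    """Raised when the VNIC cannot be moved to the destination NIC."""


def address_after_move(mode, dest_free_factory_addrs, random_addr):
    """Pick the VNIC's MAC address on the destination NIC, or fail.

    `dest_free_factory_addrs` lists the unused factory addresses on the
    destination NIC; `random_addr` is the fallback for the auto mode.
    """
    if mode is AddrMode.FACTORY_SLOT:
        # As stated in this message: an explicit slot makes the move fail.
        raise MoveError("explicit factory slot requested")
    if dest_free_factory_addrs:
        return dest_free_factory_addrs[0]
    if mode is AddrMode.AUTO:
        return random_addr
    raise MoveError("no factory MAC address available on destination")
```

Passing the random address in as a parameter just keeps the sketch deterministic; how such an address would actually be generated is outside what this thread specifies.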
>>> I've seen similar schemes for access servers (most have proprietary
>>> RADIUS extensions for setting bandwidth limits), and the usual way
>>> this works is that once the link is saturated, the configured limits
>>> become shares.  Thus, the clients are all hurt in proportion to the
>>> amount of bandwidth they're given.
>> The limits are really used to clamp down on bandwidth utilization by a
>> MAC, but they do not imply any guaranteed bandwidth. As a future
>> deliverable we're also planning to provide bandwidth guarantees, which is
>> what you seem to be referring to here.
> 
> Actually, no, that's not quite what I'm referring to.
> 
> A bandwidth limit is an upper bound.  If the user tries to send more
> than that, then he'll experience delay and loss.  There's no guarantee
> that he'll be able to send that much, but he won't be able to send
> more.
> 
> A bandwidth guarantee is a lower bound.  It's a reservation.  The user
> must always be able to get at least a given amount.  This project
> doesn't supply guarantees.
Right.
> Quite apart from those definitions, though, is the issue of fairness.
> In this case, I *am* talking about limits, but I'm also talking about
> what happens when the limit is unachievable.  In the implementations
> I've seen (Cisco and Ascend are pretty good references for this), the
> limit becomes a share because this sort of behavior preserves
> fairness.
> 
> Suppose we have twenty users with 10Mbps limits, and one user with a
> 50Mbps limit.  They're all on a 100Mbps pipe.  If ten of those 10Mbps
> users can together lock out all of the others from using any of the
> pipe bandwidth at all, then that's an "unfair" result.
> 
> A very simple, but "fair," result would be that, in the limit with
> everyone sending flat-out, the 50Mbps-limited user would get 20% of
> the bandwidth, or 20Mbps.  The 10Mbps users would get the remaining
> 80%, or 4Mbps each.  Thus, each user would end up with 40% (which is
> 100/250 and 20/50 and 4/10) of his maximum.
I think this is an RFE we should consider. I'll let the rest of the team
chime in if they disagree or have more to add.
>>> Why is the resource allocation itself an important thing to optimize
>>> versus the interface stability and scalability?
>> I don't agree with the "core function" vs "periphery" argument. The
>> resource control is becoming an integral part of the MAC layer, and
>> there shouldn't be a need to do "extra steps" to enable that functionality.
> 
> Opening the device is clearly core functionality -- you can't do much
> if you can't open it.  It sounds like you agree that if those
> arguments weren't present, then some "default" set of resources would
> need to be allocated.  Thus, I argue that the functionality isn't core
> to the goal of getting access to the mac layer.
If we have the hint at open time, then we can do it right there instead 
of doing a first default allocation followed by a new allocation when 
the hint is specified.
>> Yes, this will be of course fully documented. If we find an efficient
>> way to do bandwidth control across multiple rings in the future, I don't
>> see why we wouldn't be able to make use of that functionality.
> 
> Not just "fully documented," but the design constraint around the
> units of control (being individual NIC instances) needs to be clearly
> described.  Maybe I'm atypical but, as a user, this wouldn't be
> obvious to me.
Sure, we're planning to document this already.

Nicolas.

-- 
Nicolas Droux - Solaris Networking - Sun Microsystems, Inc.
droux at sun.com - http://blogs.sun.com/droux

James Carlson

2007-Sep-04 14:45 UTC

[crossbow-discuss] Updated Crossbow virtualization architecture document

Nicolas Droux writes:
> James Carlson wrote:
> 
> > When do I move VNICs around and what do I need and expect?  I think
> > this document should work through some actual usage scenarios and then
> > come up with usable interfaces based on that, because the current
> > interfaces seem to be self-referential: they do what they do because
> > that's what they do.  The "force" flag seems particularly problematic,
> > as it indicates that things the administrator should be able to do
> > aren't doable.
> 
> <snip>
> 
> We need the ability to assign a factory MAC address or fail the 
> operation if none is available.
The question I'm asking here is: "why?"

Under what circumstances does it make sense to provide this failure
mode for administrators?  How does it help rather than hinder?

That's what I'd like to see in the document -- some explanation that
shows what administrative problem is being addressed by the
functionality that's provided.  One way to do that is by providing
usage scenarios.  (Preferably ones that don't assume the outcome.
I.e., not "the user wants to make sure configuration of vnic0 fails if
a factory address in slot 2 isn't available, so ...")
> I'm fine with not having a "force" option during the move operation. This
> means that if a VNIC was created with a "factory" option, and the
> destination NIC doesn't have a factory MAC address available, then the
> move will fail. It also means that if a user explicitly specified a
> factory MAC address slot, the move will fail.
Unless there's some reason to believe that factory address slot
numbers are allocated in a common way across NICs, I think that moving
a VNIC and preserving the slot number is a sketchy idea.

It's akin to moving a zone from one machine to another and expecting
that "qfe1" will be the same interface on the same network on the
destination machine.  It might be, with sufficient advance planning.
It probably isn't though.

-- 
James Carlson, Solaris Networking              <james.d.carlson at sun.com>
Sun Microsystems / 1 Network Drive         71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

Nicolas Droux

2007-Sep-06 05:42 UTC

[crossbow-discuss] Updated Crossbow virtualization architecture document

On Sep 4, 2007, at 8:45 AM, James Carlson wrote:
> The question I'm asking here is: "why?"
>
> Under what circumstances does it make sense to provide this failure
> mode for administrators?  How does it help rather than hinder?
It provides the user the ability to preserve and enforce the  
assignment of factory MAC addresses to virtual machines in a  
consolidated environment. If the administrator specifically asks for  
a factory MAC address but none are available, then the operation  
would fail. The (default) automatic mode is also there for the users  
who don't care if a random address is assigned to the VNIC instead.
>> I'm fine with not having a "force" option during the move operation. This
>> means that if a VNIC was created with a "factory" option, and the
>> destination NIC doesn't have a factory MAC address available, then the
>> move will fail. It also means that if a user explicitly specified a
>> factory MAC address slot, the move will fail.
>
> Unless there's some reason to believe that factory address slot
> numbers are allocated in a common way across NICs, I think that moving
> a VNIC and preserving the slot number is a sketchy idea.
I was not proposing preserving the slot number. The factory slot  
number after the move could be different. But it will be a factory  
MAC address if such an address was specifically requested on the source.
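
To make the move semantics above concrete, here is a minimal Python
sketch. It is only an illustration, not Crossbow code: the Nic and Vnic
classes, move_vnic(), and MoveError are hypothetical names, and the slot
bookkeeping is an assumption.

```python
from dataclasses import dataclass, field


class MoveError(Exception):
    """Hypothetical error: the VNIC could not be moved."""


@dataclass
class Nic:
    name: str
    factory_addrs: list                     # factory MAC addresses, indexed by slot
    used_slots: set = field(default_factory=set)

    def take_factory_slot(self):
        """Reserve the first free factory slot, or return None if all are in use."""
        for slot in range(len(self.factory_addrs)):
            if slot not in self.used_slots:
                self.used_slots.add(slot)
                return slot
        return None


@dataclass
class Vnic:
    name: str
    nic: Nic
    slot: int                               # factory slot currently held on `nic`


def move_vnic(vnic, dest):
    """Move a VNIC that requested a factory address; fail if dest has none free."""
    new_slot = dest.take_factory_slot()
    if new_slot is None:
        raise MoveError(f"{dest.name}: no factory MAC address available")
    vnic.nic.used_slots.discard(vnic.slot)  # release the slot on the source NIC
    vnic.nic, vnic.slot = dest, new_slot    # the slot number is not preserved
    return dest.factory_addrs[new_slot]
```

The slot number is whatever happens to be free on the destination; only
the "factory" property of the address is preserved, and a failed move
leaves the VNIC where it was.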

Nicolas.

-- 
Nicolas Droux - Solaris Core OS - Sun Microsystems, Inc.
droux at sun.com - http://blogs.sun.com/droux

James Carlson

2007-Sep-06 11:34 UTC

[crossbow-discuss] Updated Crossbow virtualization architecture document

Nicolas Droux writes:
> On Sep 4, 2007, at 8:45 AM, James Carlson wrote:
> 
> > The question I'm asking here is: "why?"
> >
> > Under what circumstances does it make sense to provide this failure
> > mode for administrators?  How does it help rather than hinder?
> 
> It provides the user the ability to preserve and enforce the
> assignment of factory MAC addresses to virtual machines in a
> consolidated environment. If the administrator specifically asks for
> a factory MAC address but none are available, then the operation
> would fail. The (default) automatic mode is also there for the users
> who don't care if a random address is assigned to the VNIC instead.

Yes, I understand what it would do.  I still don't see why that's a
helpful operation.

It clearly provides a special failure mode.  What isn't clear is why
users of this feature would prefer to have the operation fail rather
than having the system provide a best attempt (perhaps with warnings)
instead.

What are the administrators actually doing with these MAC addresses
that causes them to prefer failure?

-- 
James Carlson, Solaris Networking              <james.d.carlson at sun.com>
Sun Microsystems / 1 Network Drive         71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

Sunay Tripathi

2007-Sep-06 17:11 UTC

[crossbow-discuss] Updated Crossbow virtualization architecture document

James Carlson wrote:
> Nicolas Droux writes:
>> On Sep 4, 2007, at 8:45 AM, James Carlson wrote:
>>
>>> The question I'm asking here is: "why?"
>>>
>>> Under what circumstances does it make sense to provide this failure
>>> mode for administrators?  How does it help rather than hinder?
>>
>> It provides the user the ability to preserve and enforce the
>> assignment of factory MAC addresses to virtual machines in a
>> consolidated environment. If the administrator specifically asks for
>> a factory MAC address but none are available, then the operation
>> would fail. The (default) automatic mode is also there for the users
>> who don't care if a random address is assigned to the VNIC instead.
>
> Yes, I understand what it would do.  I still don't see why that's a
> helpful operation.
>
> It clearly provides a special failure mode.  What isn't clear is why
> users of this feature would prefer to have the operation fail rather
> than having the system provide a best attempt (perhaps with warnings)
> instead.
>
> What are the administrators actually doing with these MAC addresses
> that causes them to prefer failure?

Factory-assigned MAC addresses are inventoried entities in some
companies. They keep track of the MAC address(es) the machine has along
with other information (like physical location etc.). SPARC machines
have a hostid, but on x86 this is the only unique way to identify the
physical machine from a packet on the network.

Random MAC addresses are random at best and come with no guarantee that
they are unique across different machines. The virtualization crowd
has adopted random MAC addresses, but a sizable set of customers are
still skeptical about duplication etc.

A user-assigned MAC address is cumbersome at best, and some customers
are not prepared to pay the overhead of assigning them and tracking
them across their data center(s).

As such, NICs which have multiple factory-assigned MAC addresses
become a very useful resource for the set of customers wanting to play
in the virtualization space (but not caring about live migration,
especially zones). They don't have to deal with user-assigned or
random addresses, and they can inventory the factory-assigned MAC
addresses just as they used to before. Yes, they get limited in the
number of VNICs they can create, but 8-16 factory-assigned MAC
addresses give them sufficient headroom to play. That's why there are
two modes: use the 'auto' flag when you don't care; but if the user
specified 'factory', then he does care, and if we can't get him a
factory MAC address, we fail the
operation.
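
The two modes can be sketched roughly as follows. This is a
hypothetical Python model, not the dladm or MAC-layer implementation:
allocate_mac(), random_mac(), and NoFactoryAddress are made-up names,
and both the "prefer a factory address in auto mode" behavior and the
way the random address is formed (locally administered bit set) are
assumptions for illustration.

```python
import random


class NoFactoryAddress(Exception):
    """Hypothetical error: 'factory' was requested but no factory address is free."""


def random_mac(rng):
    """Generate a random unicast, locally administered MAC address."""
    octets = [rng.randrange(256) for _ in range(6)]
    octets[0] = (octets[0] | 0x02) & 0xFE   # set the local bit, clear the multicast bit
    return ":".join(f"{o:02x}" for o in octets)


def allocate_mac(mode, free_factory, rng=random):
    """Pick a MAC address for a new VNIC and report where it came from.

    mode is "factory" (fail if none is free) or "auto" (assumed here to
    prefer a factory address and otherwise fall back to a random one).
    free_factory is a list of unused factory addresses; it is consumed.
    """
    if mode not in ("factory", "auto"):
        raise ValueError(f"unknown mode: {mode}")
    if free_factory:
        return free_factory.pop(0), "factory"
    if mode == "factory":
        raise NoFactoryAddress("no factory MAC address available")
    return random_mac(rng), "random"
```

With an empty pool, allocate_mac("auto", []) returns a random address,
while allocate_mac("factory", []) raises instead of creating anything.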

Perhaps there is a better administrative interface to express this and
we are open to suggestions. But hopefully you get the idea what we are
trying to achieve.

On a different note, having examples is always very helpful.

Cheers,
Sunay

-- 
Sunay Tripathi
Distinguished Engineer
Solaris Core Operating System
Sun MicroSystems Inc.

Solaris Networking:     http://www.opensolaris.org/os/community/networking
Project Crossbow:       http://www.opensolaris.org/os/project/crossbow

James Carlson

2007-Sep-06 17:17 UTC

[crossbow-discuss] Updated Crossbow virtualization architecture document

Sunay Tripathi writes:
> James Carlson wrote:
>  > What are the administrators actually doing with these MAC addresses
>  > that causes them to prefer failure?
>  >
> 
> Factory assigned MAC addresses are inventoried entities in some
> companies. They keep track of the MAC address(s) the machine has along
> with other information (like physical location etc). Sparc's have a
> hostid but on x86, this is the only unique way to identify the physical
> machines from the packet on the network.

Sure.  And you can tell which address you've got (if you care) by
using the status command.

And I'd point out that after any move, you would *need* to look at the
address on the interface, because the new physical interface likely
has a different set of MAC addresses on it, and you're going to need
to update those crufty tables (such as /etc/ethers).

I'd even see no problem with issuing a warning when the
use-random-fallback event occurs:

	Warning: you asked for a factory address, but I couldn't get
	one.  I've assigned a random address instead.  If that's not
	ok, then you'll probably want to reconfigure this interface.

(Or perhaps something more professional-looking than that.)

The problem I have is with the failure mode.  I don't see a purpose.

> Random MAC addresses are random at best and have no guarantees that
> they are unique across different machines. The virtualization crowd
> has adopted random mac address but a sizable set of customers are
> still skeptical about duplication etc.

I understand why users would want to prefer factory addresses.  I
wasn't questioning that at all.

I don't understand why they would prefer to see failure.  It doesn't
seem helpful.

Would users actually be inconvenienced if an interface worked because
it fell back to a random address, where they'd actually have an
advantage if it failed instead?

-- 
James Carlson, Solaris Networking              <james.d.carlson at sun.com>
Sun Microsystems / 1 Network Drive         71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

Sunay Tripathi

2007-Sep-06 18:00 UTC

[crossbow-discuss] Updated Crossbow virtualization architecture document

James Carlson wrote:
> Sunay Tripathi writes:
>> James Carlson wrote:
>>  > What are the administrators actually doing with these MAC addresses
>>  > that causes them to prefer failure?
>>  >
>>
>> Factory assigned MAC addresses are inventoried entities in some
>> companies. They keep track of the MAC address(s) the machine has along
>> with other information (like physical location etc). Sparc's have a
>> hostid but on x86, this is the only unique way to identify the physical
>> machines from the packet on the network.
> 
> Sure.  And you can tell which address you've got (if you care) by
> using the status command.
> 
> And I'd point out that after any move, you would *need* to look at the
> address on the interface, because the new physical interface likely
> has a different set of MAC addresses on it, and you're going to need
> to update those crufty tables (such as /etc/ethers).
> 
> I'd even see no problem with issuing a warning when the
> use-random-fallback event occurs:
> 
> 	Warning: you asked for a factory address, but I couldn't get
> 	one.  I've assigned a random address instead.  If that's not
> 	ok, then you'll probably want to reconfigure this interface.
> 
> (Or perhaps something more professional-looking than that.)

Huh? The guy only wants to deal with factory-assigned MAC addresses,
and you would still assign a random MAC address and create a VNIC??
What does the guy do after that? Run delete-vnic, since he didn't
want it in the first place?

I was with you until the earlier email, in that there might be a better
way of expressing the requirement that I am only interested in
factory-assigned MAC addresses and *don't* want to deal with random or
user-created things. But assigning a random MAC address when he asked
for factory is almost ignoring the request.

> The problem I have is with the failure mode.  I don't see a purpose.

Perhaps if you try to understand the difference between a unique
identifier (factory MAC) that is inventoried vs a randomly generated
non-unique identifier, it will be clear to you.
> 
>> Random MAC addresses are random at best and have no guarantees that
>> they are unique across different machines. The virtualization crowd
>> has adopted random mac address but a sizable set of customers are
>> still skeptical about duplication etc.
> 
> I understand why users would want to prefer factory addresses.  I
> wasn't questioning that at all.
> 
> I don't understand why they would prefer to see failure.  It doesn't
> seem helpful.

Failures happen all the time when you run out of resources. We fail a
process creation when we are out of memory. We fail a socket open when
we run out of descriptors ...

> Would users actually be inconvenienced if an interface worked because
> it fell back to a random address, where they'd actually have an
> advantage if it failed instead?

Yes. And yes. When they see a packet on the wire, they need to know
which physical machine is sending the packet and what is its location.

HTH.

Sunay



-- 
Sunay Tripathi
Distinguished Engineer
Solaris Core Operating System
Sun MicroSystems Inc.

Solaris Networking:     http://www.opensolaris.org/os/community/networking
Project Crossbow:       http://www.opensolaris.org/os/project/crossbow

James Carlson

2007-Sep-06 18:10 UTC

[crossbow-discuss] Updated Crossbow virtualization architecture document

Sunay Tripathi writes:
> James Carlson wrote:
> > 	Warning: you asked for a factory address, but I couldn't get
> > 	one.  I've assigned a random address instead.  If that's not
> > 	ok, then you'll probably want to reconfigure this interface.
> > 
> > (Or perhaps something more professional-looking than that.)
> 
> Huh? The guy only wants to deal with factory assigned MAC address
> and you would still assign a random MAC address and create a VNIC??
> What does the guy do after that? Run delete-vnic since he doesn't
> want it in the first place?

The original context of this was with a "move" operation, where
failure seems quite strange.

But, yes, that's exactly what I'd expect to see as a user.  In the
scenario you cite, I've asked for two things.  I've asked to have a
VNIC created, and I've asked that it have a factory address.

Your assertion is that if I can't get one of those two things (the
factory address), then I get nothing.  You seem to be assuming that my
request for "factory address" is more important than my request for a
VNIC, such that my request for the VNIC can be ignored or rejected.

I'm not so sure that's a useful semantic, and I'm asking whether this
sort of failure is what users *desire* to see.

> I was with you till earlier email that there might be a better way
> of expressing the requirement that I am only interested in factory
> assigned mac addresses and *don't* want to deal with random or user
> created things. But assigning a random MAC address when he asked for
> factory is almost ignoring the request.

See above.

I'm not ignoring the request.  I'm saying:

	A.  Honor the request if you can.

	B.  If you can't honor it, then at least create a usable
	    interface.

	C.  For bonus points, you can _always_ issue a warning message
	    for users who somehow think "factory > random."
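
As a rough illustration of points A through C, here is a small Python
sketch of the warn-and-continue behavior. It is hypothetical:
create_vnic_best_effort() is a made-up name, the random address format
is an assumption, and this variant warns only when the fallback is
actually taken.

```python
import secrets
import warnings


def create_vnic_best_effort(free_factory):
    """Return a MAC address for a VNIC that asked for a factory address.

    A. Honor the request when a factory address is free.
    B. Otherwise still produce a usable (random) address.
    C. Warn so the administrator knows the request was not honored.
    """
    if free_factory:
        return free_factory.pop(0)
    raw = bytearray(secrets.token_bytes(6))
    raw[0] = (raw[0] | 0x02) & 0xFE         # locally administered, unicast
    addr = ":".join(f"{b:02x}" for b in raw)
    warnings.warn(f"no factory address available; assigned random address {addr}")
    return addr
```

The caller always gets a usable address back; whether the warning is
acceptable or the interface should be reconfigured is left to the
administrator.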

> > The problem I have is with the failure mode.  I don't see a purpose.
> 
> Perhaps if you try to understand the difference between a unique
> identifier (factory MAC) that is inventoried vs a randomly generated
> non-unique identifier, it will be clear to you.

That's still not the question I'm asking.

I know the difference between random and factory assignment.

I want to know why forcing failure is preferable to warning (if
necessary) and driving on, particularly when forcing failure simply
creates brand new points of annoyance.

If you're not interested in covering that in the document, then that's
fine by me.  Just say you're rejecting my comments.  There's no need
to assume that I'm ignorant.

> > I don't understand why they would prefer to see failure.  It doesn't
> > seem helpful.
> 
> Failure happen all the time when you run out of resources. We fail a
> process creation when we are out of memory? We fail a socket open when
> we run out of descriptors ...

This is an administrative interface, not a dynamic failure mode.

> > Would users actually be inconvenienced if an interface worked because
> > it fell back to a random address, where they'd actually have an
> > advantage if it failed instead?
> 
> Yes. And yes. When they see a packet on the wire, they need to know
> which physical machine is sending the packet and what is its location.

Specifying "factory" doesn't actually tell you what address you will
necessarily have.  You still must use the status interfaces to tell
which address you've gotten.

As long as you're doing that anyway, I don't see much of a useful
difference.

-- 
James Carlson, Solaris Networking              <james.d.carlson at sun.com>
Sun Microsystems / 1 Network Drive         71.232W   Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757   42.496N   Fax +1 781 442 1677

Sunay Tripathi

2007-Sep-06 20:50 UTC

[crossbow-discuss] Updated Crossbow virtualization architecture document

James Carlson wrote:
> Sunay Tripathi writes:
>> James Carlson wrote:
>>> 	Warning: you asked for a factory address, but I couldn't get
>>> 	one.  I've assigned a random address instead.  If that's not
>>> 	ok, then you'll probably want to reconfigure this interface.
>>>
>>> (Or perhaps something more professional-looking than that.)
>> Huh? The guy only wants to deal with factory assigned MAC address
>> and you would still assign a random MAC address and create a VNIC??
>> What does the guy do after that? Run delete-vnic since he doesn't
>> want it in the first place?
> 
> The original context of this was with a "move" operation, where
> failure seems quite strange.
> 
> But, yes, that's exactly what I'd expect to see as a user.  In the
> scenario you cite, I've asked for two things.  I've asked to have a
> VNIC created, and I've asked that it have a factory address.
> 
> Your assertion is that if I can't get one of those two things (the
> factory address), then I get nothing.  You seem to be assuming that my
> request for "factory address" is more important than my request for a
> VNIC, such that my request for the VNIC can be ignored or rejected.
> 
> I'm not so sure that's a useful semantic, and I'm asking whether this
> sort of failure is what users *desire* to see.

Yes, we have about a dozen big customers that do exactly this.

>> I was with you till earlier email that there might be a better way
>> of expressing the requirement that I am only interested in factory
>> assigned mac addresses and *don't* want to deal with random or user
>> created things. But assigning a random MAC address when he asked for
>> factory is almost ignoring the request.
> 
> See above.
> 
> I'm not ignoring the request.  I'm saying:
> 
> 	A.  Honor the request if you can.
> 
> 	B.  If you can't honor it, then at least create a usable
> 	    interface.

If that was the intent, the user can use the auto flag and let the
system decide. The fact that he specified factory means he cares
about it.

Your definition of usable doesn't match how this is done today. Users
assert that if they can't match a packet to a physical machine, it
creates more problems. It is the same as saying: I don't have an IP
address, so let me snoop, see which address on the subnet is not in
use, and use that instead. One could argue that this did create a
usable interface.

Understand that randomly created MAC addresses are just that: random.
The probability of duplication in today's data centers is not zero;
customers understand that, and it's an issue for them.
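
For a rough sense of scale, the chance of at least one duplicate among
n uniformly random addresses can be estimated with the standard
birthday approximation p ~ 1 - exp(-n(n-1)/(2N)), where N is the
number of possible addresses. The Python sketch below is an
illustration only: the choice of 24 random bits (a fixed 24-bit prefix
with a random suffix) and of 46 random bits are assumptions, not
figures from the Crossbow design, and duplicate_probability() is a
made-up helper.

```python
import math


def duplicate_probability(n_addresses, random_bits):
    """Birthday approximation: P(at least one duplicate) among n random values."""
    space = 2 ** random_bits
    return -math.expm1(-n_addresses * (n_addresses - 1) / (2 * space))


if __name__ == "__main__":
    for bits in (24, 46):                   # assumed amounts of randomness
        for n in (100, 1_000, 10_000):
            p = duplicate_probability(n, bits)
            print(f"{bits} random bits, {n:>6} addresses: p ~ {p:.3g}")
```

Under the 24-bit assumption, 1,000 random addresses on one network give
a duplicate probability of roughly 3%; with 46 random bits the same
population gives a probability far below one in a million. How much
randomness a real implementation uses decides which regime applies.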
> 	C.  For bonus points, you can _always_ issue a warning message
> 	    for users who somehow think "factory > random."
> 
>>> The problem I have is with the failure mode.  I don't see a purpose.
>> Perhaps if you try to understand the difference between a unique
>> identifier (factory MAC) that is inventoried vs a randomly generated
>> non-unique identifier, it will be clear to you.
> 
> That's still not the question I'm asking.
> 
> I know the difference between random and factory assignment.
> 
> I want to know why forcing failure is preferable to warning (if
> necessary) and driving on, particularly when forcing failure simply
> creates brand new points of annoyance.

Use the *auto* flag, and don't specify a specific mode, if you don't care.

> If you're not interested in covering that in the document, then that's
> fine by me.  Just say you're rejecting my comments.  There's no need
> to assume that I'm ignorant.
> 
>>> I don't understand why they would prefer to see failure.  It doesn't
>>> seem helpful.
>> Failure happen all the time when you run out of resources. We fail a
>> process creation when we are out of memory? We fail a socket open when
>> we run out of descriptors ...
> 
> This is an administrative interface, not a dynamic failure mode.
> 
>>> Would users actually be inconvenienced if an interface worked because
>>> it fell back to a random address, where they'd actually have an
>>> advantage if it failed instead?
>> Yes. And yes. When they see a packet on the wire, they need to know
>> which physical machine is sending the packet and what is its location.
> 
> Specifying "factory" doesn't actually tell you what address you will
> necessarily have.  You still must use the status interfaces to tell
> which address you've gotten.

But if you do your inventory properly, when you see a packet on the
wire, you can map it to a physical machine. This is how it's done in
real life by some of our customers.

Sunay

-- 
Sunay Tripathi
Distinguished Engineer
Solaris Core Operating System
Sun MicroSystems Inc.

Solaris Networking:     http://www.opensolaris.org/os/community/networking
Project Crossbow:       http://www.opensolaris.org/os/project/crossbow

Nicolas Droux

2007-Sep-07 03:58 UTC

[crossbow-discuss] Updated Crossbow virtualization architecture document

On Sep 6, 2007, at 12:10 PM, James Carlson wrote:
> But, yes, that's exactly what I'd expect to see as a user.  In the
> scenario you cite, I've asked for two things.  I've asked to have a
> VNIC created, and I've asked that it have a factory address.
>
> Your assertion is that if I can't get one of those two things (the
> factory address), then I get nothing.  You seem to be assuming that my
> request for "factory address" is more important than my request for a
> VNIC, such that my request for the VNIC can be ignored or rejected.

If it's acceptable to you to have a random address assigned to your
VNIC if no factory addresses are available, then don't use the
"factory" option. Use the default "auto" option. If the administrator
specifically asks for a factory MAC address, I don't see why it would
be a problem to fail that operation if no factory MAC addresses are
available.

Nicolas.

-- 
Nicolas Droux - Solaris Core OS - Sun Microsystems, Inc.
droux at sun.com - http://blogs.sun.com/droux

Ben Rockwood

2007-Sep-13 09:09 UTC

[crossbow-discuss] Updated Crossbow virtualization architecture

To what extent will VNIC configuration be included with a Zone
configuration?  This is similar to Duckhorn: in order to make zones
portable, there needs to be little to no pre-configuration required in
the global zone.  I'm unclear if this is implied by "Functional
Specification Summary" item 7: "Allow VNICs to be plumbed by Solaris
zones".

Ideally, within a Zone definition (/etc/zones/myzone.xml) all information
required to configure the VNIC, including bandwidth attributes, MAC
settings, VLAN tags, etc., would be present, so that if a zone was cloned
or migrated to another system, pre-configuration of the VNICs needed to
support the zone wasn't necessary.

benr.
 
 
This message posted from opensolaris.org

David Edmondson

2007-Sep-13 09:51 UTC

[crossbow-discuss] Updated Crossbow virtualization architecture

On Thu, Sep 13, 2007 at 02:09:08AM -0700, Ben Rockwood wrote:
> Ideally, within a Zone definition (/etc/zones/myzone.xml) all
> information required to configure the VNIC including bandwidth
> attributes, MAC settings, VLAN tags, etc, would be present so that
> if a zone was cloned or migrated to another system pre-configuration
> of the VNICs needed to support the zones wasn't necessary.

I agree that this would be ideal. Of course, you _might_ have to
change the name of the underlying physical NIC if the two machines are
different(ly connected).

Peter Memishian

2007-Sep-13 10:25 UTC

[crossbow-discuss] Updated Crossbow virtualization architecture

 > > Ideally, within a Zone definition (/etc/zones/myzone.xml) all
 > > information required to configure the VNIC including bandwidth
 > > attributes, MAC settings, VLAN tags, etc, would be present so that
 > > if a zone was cloned or migrated to another system pre-configuration
 > > of the VNICs needed to support the zones wasn't necessary.
 > 
 > I agree that this would be ideal. Of course, you _might_ have to
 > change the name of the underlying physical NIC if the two machines are
 > different(ly connected).

Hopefully not post-Clearview-UV.

-- 
meem

Ben Rockwood

2007-Sep-14 10:15 UTC

[crossbow-discuss] Updated Crossbow virtualization architecture

If the physical NICs are different, that's a given, barring Clearview.
 
 

Nicolas Droux

2007-Sep-14 20:10 UTC

[crossbow-discuss] Updated Crossbow virtualization architecture

Ben,

Ben Rockwood wrote:
> To what extent will VNIC configuration be included with a Zone
> configuration?  This is similar to Duckhorn, in order to make zones
> portable there needs to be little to no pre-configuration required in
> the global zone.  I'm unclear if this is implied by "Functional
> Specification Summary" item 7: "Allow VNICs to be plumbed by Solaris
> zones".

Our long-term goal is to fully integrate the configuration of VNICs with
Zone configuration. For now they are done separately: you create a VNIC
with its properties using dladm(1M), and then assign the VNIC to the
zone via zonecfg(1M) (assuming the zone has its own IP instance). The
zone can then plumb vnic<x> directly. This is what we will integrate as
part of our first Crossbow putback.

For the long term, we'd like to do all of this from zonecfg(1M) itself
directly, a la Duckhorn. For example, while the zone is being
configured, you'll be able to simply specify that you want a VNIC
created on top of data-link "x0", call it "y0", with VLAN id <i>, and by
default assign it the same CPUs that are used for the zone. This will be
done as a follow-on putback.

> Ideally, within a Zone definition (/etc/zones/myzone.xml) all
> information required to configure the VNIC including bandwidth
> attributes, MAC settings, VLAN tags, etc, would be present so that if
> a zone was cloned or migrated to another system pre-configuration of
> the VNICs needed to support the zones wasn't necessary.

Yes, that would be the case once we more closely integrate VNIC and Zone 
configuration. The configuration information would be kept on a per zone 
basis, and the VNIC would be instantiated dynamically when the zone is 
booted.
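
A minimal Python sketch of that idea follows. It is purely
illustrative and assumes nothing about the eventual zonecfg(1M)
format: VnicSpec, ZoneConfig, Host, and all field names are
hypothetical, and "instantiating" a VNIC is reduced to recording it on
a stand-in host object.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VnicSpec:
    """VNIC settings stored with the zone (field names are illustrative)."""
    name: str                               # e.g. "y0"
    over: str                               # underlying data-link, e.g. "x0"
    vlan_id: Optional[int] = None
    mac_mode: str = "auto"                  # "auto" or "factory"
    max_bw_mbps: Optional[int] = None


@dataclass
class ZoneConfig:
    name: str
    vnics: list = field(default_factory=list)


class Host:
    """Stand-in for a machine: tracks which data-links and VNICs exist."""

    def __init__(self, links):
        self.links = set(links)
        self.vnics = {}

    def boot_zone(self, zone):
        """Create the zone's VNICs at boot; nothing is pre-configured."""
        missing = [v.over for v in zone.vnics if v.over not in self.links]
        if missing:
            raise LookupError(f"zone {zone.name}: no such data-link(s): {missing}")
        for spec in zone.vnics:
            self.vnics[spec.name] = spec
        return [spec.name for spec in zone.vnics]
```

Because the VNIC description travels with the zone, the same
ZoneConfig can be booted on any host that has a data-link of the
expected name; a host whose links are named differently fails at boot,
which is the case raised earlier in the thread.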

Nicolas.

-- 
Nicolas Droux - Solaris Core OS - Sun Microsystems, Inc.
droux at sun.com - http://blogs.sun.com/droux
