thr3ads.net - crossbow discuss - [crossbow-discuss] question on crossbow changes to dls_devnet

If this information is useful, please help other people find it:
Share via:

Sebastien Roy

2009-Jan-16 21:16 UTC

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

Hi Crossbow team,

I have a question on the modifications made to dls_devnet_destroy():

int
dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp, boolean_t wait)
{
	int			err;
	mac_perim_handle_t	mph;

	*idp = DATALINK_INVALID_LINKID;
	err = dls_devnet_unset(mac_name(mh), idp, wait);
	if (err != 0 && err != ENOENT)
		return (err);

	mac_perim_enter_by_mh(mh, &mph);
	err = dls_link_rele_by_name(mac_name(mh));
	mac_perim_exit(mph);

	if (err != 0)
		(void) dls_devnet_set(mac_name(mh), *idp, NULL);
	return (err);
}

Once the call to dls_devnet_unset() has succeeded, is it possible for
dls_link_rele_by_name() to fail?  I ask because the subsequent call to
dls_devnet_set() upon dls_link_rele_by_name() failure is problematic.
It this dls_devnet_set() call fails, the system is rendered hosed until
rebooted.

The Clearview iptun project is adding some code in dls_devnet_set() to
automatically set the zoneid of the link if it is created from within a
non-global zone, and I cannot assert that this code will always succeed.

It would be good to understand what the semantics of this function are
in all possible failure scenarios.  It would be great if
dls_link_rele_by_name() couldn''t fail in this case, and there should
thus be an ASSERT() that err == 0 below it.  The call to
dls_devnet_set() could then be removed.

I don''t know enough about what dls_link_rele_by_name() does with flows
to be able to assert that myself though.  Input would be appreciated.

Thanks,
-Seb

Cathy Zhou

2009-Jan-16 21:33 UTC

head link

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

On 2009?01?16? 13:16, Sebastien Roy wrote:> Hi Crossbow team,
> 
> I have a question on the modifications made to dls_devnet_destroy():
> 
> int
> dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp, boolean_t wait)
> {
> 	int			err;
> 	mac_perim_handle_t	mph;
> 
> 	*idp = DATALINK_INVALID_LINKID;
> 	err = dls_devnet_unset(mac_name(mh), idp, wait);
> 	if (err != 0 && err != ENOENT)
> 		return (err);
> 
> 	mac_perim_enter_by_mh(mh, &mph);
> 	err = dls_link_rele_by_name(mac_name(mh));
> 	mac_perim_exit(mph);
> 
> 	if (err != 0)
> 		(void) dls_devnet_set(mac_name(mh), *idp, NULL);
> 	return (err);
> }
> 
> Once the call to dls_devnet_unset() has succeeded, is it possible for
> dls_link_rele_by_name() to fail?  I ask because the subsequent call to
> dls_devnet_set() upon dls_link_rele_by_name() failure is problematic.
> It this dls_devnet_set() call fails, the system is rendered hosed until
> rebooted.
> It looks possible. As long as there are still subflows added on this link, 
it would fail.

Thanks
- Cathy
> The Clearview iptun project is adding some code in dls_devnet_set() to
> automatically set the zoneid of the link if it is created from within a
> non-global zone, and I cannot assert that this code will always succeed.
> 
> It would be good to understand what the semantics of this function are
> in all possible failure scenarios.  It would be great if
> dls_link_rele_by_name() couldn''t fail in this case, and there
should
> thus be an ASSERT() that err == 0 below it.  The call to
> dls_devnet_set() could then be removed.
> 
> I don''t know enough about what dls_link_rele_by_name() does with
flows
> to be able to assert that myself though.  Input would be appreciated.
> 
> Thanks,
> -Seb
> 
> 
> _______________________________________________
> crossbow-discuss mailing list
> crossbow-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss

Sebastien Roy

2009-Jan-16 22:34 UTC

head link

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

On Fri, 2009-01-16 at 13:33 -0800, Cathy Zhou wrote:> On 2009?01?16? 13:16, Sebastien Roy wrote:
> > Hi Crossbow team,
> > 
> > I have a question on the modifications made to dls_devnet_destroy():
> > 
> > int
> > dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp, boolean_t
wait)
> > {
> > 	int			err;
> > 	mac_perim_handle_t	mph;
> > 
> > 	*idp = DATALINK_INVALID_LINKID;
> > 	err = dls_devnet_unset(mac_name(mh), idp, wait);
> > 	if (err != 0 && err != ENOENT)
> > 		return (err);
> > 
> > 	mac_perim_enter_by_mh(mh, &mph);
> > 	err = dls_link_rele_by_name(mac_name(mh));
> > 	mac_perim_exit(mph);
> > 
> > 	if (err != 0)
> > 		(void) dls_devnet_set(mac_name(mh), *idp, NULL);
> > 	return (err);
> > }
> > 
> > Once the call to dls_devnet_unset() has succeeded, is it possible for
> > dls_link_rele_by_name() to fail?  I ask because the subsequent call to
> > dls_devnet_set() upon dls_link_rele_by_name() failure is problematic.
> > It this dls_devnet_set() call fails, the system is rendered hosed
until
> > rebooted.
> > 
> It looks possible. As long as there are still subflows added on this link, 
> it would fail.
Is there a way to fix this?  There has to be some way to freeze the
state of things once it''s been verified that teardown is safe (sort of
like what mac_disable() was intended to do).  That way, one can assert
that the various steps involved in teardown cannot fail.

-Seb

Cathy Zhou

2009-Jan-16 23:17 UTC

head link

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

On 2009?01?16? 14:34, Sebastien Roy wrote:> On Fri, 2009-01-16 at 13:33 -0800, Cathy Zhou wrote:
>> On 2009?01?16? 13:16, Sebastien Roy wrote:
>>> Hi Crossbow team,
>>>
>>> I have a question on the modifications made to
dls_devnet_destroy():
>>>
>>> int
>>> dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp, boolean_t
wait)
>>> {
>>> 	int			err;
>>> 	mac_perim_handle_t	mph;
>>>
>>> 	*idp = DATALINK_INVALID_LINKID;
>>> 	err = dls_devnet_unset(mac_name(mh), idp, wait);
>>> 	if (err != 0 && err != ENOENT)
>>> 		return (err);
>>>
>>> 	mac_perim_enter_by_mh(mh, &mph);
>>> 	err = dls_link_rele_by_name(mac_name(mh));
>>> 	mac_perim_exit(mph);
>>>
>>> 	if (err != 0)
>>> 		(void) dls_devnet_set(mac_name(mh), *idp, NULL);
>>> 	return (err);
>>> }
>>>
>>> Once the call to dls_devnet_unset() has succeeded, is it possible
for
>>> dls_link_rele_by_name() to fail?  I ask because the subsequent call
to
>>> dls_devnet_set() upon dls_link_rele_by_name() failure is
problematic.
>>> It this dls_devnet_set() call fails, the system is rendered hosed
until
>>> rebooted.
>>>
>> It looks possible. As long as there are still subflows added on this
link,
>> it would fail.
> 
> Is there a way to fix this?  There has to be some way to freeze the
> state of things once it''s been verified that teardown is safe
(sort of
> like what mac_disable() was intended to do).  That way, one can assert
> that the various steps involved in teardown cannot fail.
> Yes. I believe it is possible. Can we add a dls_devnet_disable() function, 
which checks whether we can destroy dls_devnet_t, dls_link_t (or even 
mac_impl_t?) and set the disabled flag on each of them. The disabled flag 
will be used to prevent further usage of those structures.

Thanks
- Cathy

Eric Cheng

2009-Jan-17 00:35 UTC

head link

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

I think something like this could work:

int
dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp)
{
        int                     err;
        mac_perim_handle_t      mph;

        mac_perim_enter_by_mh(mh, &mph);

        *idp = DATALINK_INVALID_LINKID;
        err = dls_devnet_unset(mac_name(mh), idp, B_FALSE);
        if (err != 0) {
                mac_perim_exit(mph);
                return (err);
        }
        err = dls_link_rele_by_name(mac_name(mh));
        ASSERT(err == 0);
        mac_perim_exit(mph);
        return (err);
}

static int
dls_devnet_unset(const char *macname, datalink_id_t *id, boolean_t wait)
{
        dls_devnet_t    *ddp;
        int             err;
        mod_hash_val_t  val;

        rw_enter(&i_dls_devnet_lock, RW_WRITER);
        if ((err = mod_hash_find(i_dls_devnet_hash,
            (mod_hash_key_t)macname, (mod_hash_val_t *)&ddp)) != 0) {
                ASSERT(err == MH_ERR_NOTFOUND);
                rw_exit(&i_dls_devnet_lock);
                return (ENOENT);
        }

        if ((err = dls_link_is_busy(macname)) != 0) {
                rw_exit(&i_dls_devnet_lock);
                return (err);
        }

        mutex_enter(&ddp->dd_mutex);
        ....
}

int
dls_link_is_busy(const char *name)
{
        dls_link_t              *dlp;

        if (mod_hash_find(i_dls_link_hash, (mod_hash_key_t)name,
            (mod_hash_val_t *)&dlp) != 0)
                return (ENOENT);

        ASSERT(MAC_PERIM_HELD(dlp->dl_mh));

        /*
         * Must fail detach if mac client is busy.
         */
        ASSERT(dlp->dl_ref > 0 && dlp->dl_mch != NULL);
        if (mac_link_has_flows(dlp->dl_mch))
                return (ENOTEMPTY);

        return (0);
}


notes:
-the DD_CONDEMNED flag has the effect of disabling the dls_devnet_t
 and ensures that subsequent calls to dls_devnet_hold_tmp() would
 fail (indirectly called by mac_link_flow_add())
 
-we do not wait in dls_devnet_unset() because this could deadlock
 if someone is holding a tmp reference. softmac_destroy() already
 does this so it should be ok to change aggr/vnic to behave the
 same way.

let me know if this makes sense.

eric

On Fri, Jan 16, 2009 at 03:17:34PM -0800, Cathy Zhou
wrote:> On 2009?01?16? 14:34, Sebastien Roy wrote:
> > On Fri, 2009-01-16 at 13:33 -0800, Cathy Zhou wrote:
> >> On 2009?01?16? 13:16, Sebastien Roy wrote:
> >>> Hi Crossbow team,
> >>>
> >>> I have a question on the modifications made to
dls_devnet_destroy():
> >>>
> >>> int
> >>> dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp,
boolean_t wait)
> >>> {
> >>> 	int			err;
> >>> 	mac_perim_handle_t	mph;
> >>>
> >>> 	*idp = DATALINK_INVALID_LINKID;
> >>> 	err = dls_devnet_unset(mac_name(mh), idp, wait);
> >>> 	if (err != 0 && err != ENOENT)
> >>> 		return (err);
> >>>
> >>> 	mac_perim_enter_by_mh(mh, &mph);
> >>> 	err = dls_link_rele_by_name(mac_name(mh));
> >>> 	mac_perim_exit(mph);
> >>>
> >>> 	if (err != 0)
> >>> 		(void) dls_devnet_set(mac_name(mh), *idp, NULL);
> >>> 	return (err);
> >>> }
> >>>
> >>> Once the call to dls_devnet_unset() has succeeded, is it
possible for
> >>> dls_link_rele_by_name() to fail?  I ask because the subsequent
call to
> >>> dls_devnet_set() upon dls_link_rele_by_name() failure is
problematic.
> >>> It this dls_devnet_set() call fails, the system is rendered
hosed until
> >>> rebooted.
> >>>
> >> It looks possible. As long as there are still subflows added on
this link,
> >> it would fail.
> > 
> > Is there a way to fix this?  There has to be some way to freeze the
> > state of things once it''s been verified that teardown is safe
(sort of
> > like what mac_disable() was intended to do).  That way, one can assert
> > that the various steps involved in teardown cannot fail.
> > 
> Yes. I believe it is possible. Can we add a dls_devnet_disable() function, 
> which checks whether we can destroy dls_devnet_t, dls_link_t (or even 
> mac_impl_t?) and set the disabled flag on each of them. The disabled flag 
> will be used to prevent further usage of those structures.
> 
> Thanks
> - Cathy
> _______________________________________________
> crossbow-discuss mailing list
> crossbow-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss

Cathy Zhou

2009-Jan-17 00:45 UTC

head link

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

> -we do not wait in dls_devnet_unset() because this could deadlock
>  if someone is holding a tmp reference. softmac_destroy() already
>  does this so it should be ok to change aggr/vnic to behave the
>  same way.
> It is fine for softmac to do that, since it is fine to prevent a device 
from detaching. But it will be strange to disallow a vnic/aggr from being 
destroyed only because someone is showing its state right now (which would 
result a temporary reference).

Thanks
- Cathy
> let me know if this makes sense.
> 
> eric
> 
> On Fri, Jan 16, 2009 at 03:17:34PM -0800, Cathy Zhou wrote:
>> On 2009?01?16? 14:34, Sebastien Roy wrote:
>>> On Fri, 2009-01-16 at 13:33 -0800, Cathy Zhou wrote:
>>>> On 2009?01?16? 13:16, Sebastien Roy wrote:
>>>>> Hi Crossbow team,
>>>>>
>>>>> I have a question on the modifications made to
dls_devnet_destroy():
>>>>>
>>>>> int
>>>>> dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp,
boolean_t wait)
>>>>> {
>>>>> 	int			err;
>>>>> 	mac_perim_handle_t	mph;
>>>>>
>>>>> 	*idp = DATALINK_INVALID_LINKID;
>>>>> 	err = dls_devnet_unset(mac_name(mh), idp, wait);
>>>>> 	if (err != 0 && err != ENOENT)
>>>>> 		return (err);
>>>>>
>>>>> 	mac_perim_enter_by_mh(mh, &mph);
>>>>> 	err = dls_link_rele_by_name(mac_name(mh));
>>>>> 	mac_perim_exit(mph);
>>>>>
>>>>> 	if (err != 0)
>>>>> 		(void) dls_devnet_set(mac_name(mh), *idp, NULL);
>>>>> 	return (err);
>>>>> }
>>>>>
>>>>> Once the call to dls_devnet_unset() has succeeded, is it
possible for
>>>>> dls_link_rele_by_name() to fail?  I ask because the
subsequent call to
>>>>> dls_devnet_set() upon dls_link_rele_by_name() failure is
problematic.
>>>>> It this dls_devnet_set() call fails, the system is rendered
hosed until
>>>>> rebooted.
>>>>>
>>>> It looks possible. As long as there are still subflows added on
this link,
>>>> it would fail.
>>> Is there a way to fix this?  There has to be some way to freeze the
>>> state of things once it''s been verified that teardown is
safe (sort of
>>> like what mac_disable() was intended to do).  That way, one can
assert
>>> that the various steps involved in teardown cannot fail.
>>>
>> Yes. I believe it is possible. Can we add a dls_devnet_disable()
function,
>> which checks whether we can destroy dls_devnet_t, dls_link_t (or even 
>> mac_impl_t?) and set the disabled flag on each of them. The disabled
flag
>> will be used to prevent further usage of those structures.
>>
>> Thanks
>> - Cathy
>> _______________________________________________
>> crossbow-discuss mailing list
>> crossbow-discuss at opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss
> _______________________________________________
> crossbow-discuss mailing list
> crossbow-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss

Eric Cheng

2009-Jan-17 01:15 UTC

head link

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

On Fri, Jan 16, 2009 at 04:45:02PM -0800, Cathy Zhou
wrote:> 
> >-we do not wait in dls_devnet_unset() because this could deadlock
> > if someone is holding a tmp reference. softmac_destroy() already
> > does this so it should be ok to change aggr/vnic to behave the
> > same way.
> >
> It is fine for softmac to do that, since it is fine to prevent a device 
> from detaching. But it will be strange to disallow a vnic/aggr from being 
> destroyed only because someone is showing its state right now (which would 
> result a temporary reference).
>
a device detach could also be triggered by the administrator via a command
(e.g. cfgadm). how is this different from, say, dladm delete-aggr?
why treat aggrs/vnics differently?

I am afraid this teardown issue will be very difficult to fix because of
locking issues if you must wait for tmp references to drop. adding a
separate _disable function will not make this easier.

eric
> Thanks
> - Cathy
> 
> >let me know if this makes sense.
> >
> >eric
> >
> >On Fri, Jan 16, 2009 at 03:17:34PM -0800, Cathy Zhou wrote:
> >>On 2009?01?16? 14:34, Sebastien Roy wrote:
> >>>On Fri, 2009-01-16 at 13:33 -0800, Cathy Zhou wrote:
> >>>>On 2009?01?16? 13:16, Sebastien Roy wrote:
> >>>>>Hi Crossbow team,
> >>>>>
> >>>>>I have a question on the modifications made to
dls_devnet_destroy():
> >>>>>
> >>>>>int
> >>>>>dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp,
boolean_t wait)
> >>>>>{
> >>>>>	int			err;
> >>>>>	mac_perim_handle_t	mph;
> >>>>>
> >>>>>	*idp = DATALINK_INVALID_LINKID;
> >>>>>	err = dls_devnet_unset(mac_name(mh), idp, wait);
> >>>>>	if (err != 0 && err != ENOENT)
> >>>>>		return (err);
> >>>>>
> >>>>>	mac_perim_enter_by_mh(mh, &mph);
> >>>>>	err = dls_link_rele_by_name(mac_name(mh));
> >>>>>	mac_perim_exit(mph);
> >>>>>
> >>>>>	if (err != 0)
> >>>>>		(void) dls_devnet_set(mac_name(mh), *idp, NULL);
> >>>>>	return (err);
> >>>>>}
> >>>>>
> >>>>>Once the call to dls_devnet_unset() has succeeded, is
it possible for
> >>>>>dls_link_rele_by_name() to fail?  I ask because the
subsequent call to
> >>>>>dls_devnet_set() upon dls_link_rele_by_name() failure
is problematic.
> >>>>>It this dls_devnet_set() call fails, the system is
rendered hosed until
> >>>>>rebooted.
> >>>>>
> >>>>It looks possible. As long as there are still subflows
added on this
> >>>>link, it would fail.
> >>>Is there a way to fix this?  There has to be some way to freeze
the
> >>>state of things once it''s been verified that teardown
is safe (sort of
> >>>like what mac_disable() was intended to do).  That way, one can
assert
> >>>that the various steps involved in teardown cannot fail.
> >>>
> >>Yes. I believe it is possible. Can we add a dls_devnet_disable() 
> >>function, which checks whether we can destroy dls_devnet_t,
dls_link_t
> >>(or even mac_impl_t?) and set the disabled flag on each of them.
The
> >>disabled flag will be used to prevent further usage of those
structures.
> >>
> >>Thanks
> >>- Cathy
> >>_______________________________________________
> >>crossbow-discuss mailing list
> >>crossbow-discuss at opensolaris.org
> >>http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss
> >_______________________________________________
> >crossbow-discuss mailing list
> >crossbow-discuss at opensolaris.org
> >http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss
>

Cathy Zhou

2009-Jan-17 02:12 UTC

head link

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

> 
> a device detach could also be triggered by the administrator via a command
> (e.g. cfgadm). how is this different from, say, dladm delete-aggr?
> why treat aggrs/vnics differently?
> Good point. I thought about that too. Yes, ideally that we should wait for 
the temporary reference count to drop for softmac as well. I think it is 
also because of locking issue which cause we choose not to wait for 
physical link.

Can you elaborate what would be the locking issues for aggr/vnic if we 
wait in dls_devnet_unset?

Thanks
- Cathy

Thirumalai Srinivasan

2009-Jan-19 02:37 UTC

head link

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

Eric,

Iam assuming you mean dls_devnet_unset() will fail the unset if there are
temporary references i.e. the dd_tref is non-zero or if dd_prop_taskid is
non-null. This is easy to implement, the only issue is the one Cathy pointed
For example a script that runs a delete vnic would fail occasionally if it
coincided with somone running a dladm show command. If this failure is
acceptable, then there is perhaps no need for even keeping track of a 
separate
field for temporary references.

(cfgadm failure is perhaps similar, but may not be such a call generator.)

> Cathy Zhou wrote:
>> Can you elaborate what would be the locking issues for aggr/vnic if we 
>> wait in dls_devnet_unset?
>>     

If you wait for the dd_tref to drop to zero, while holding the perimeter
there is the risk of deadlock. For eg. the following sequence in
drv_ioc_prop_common() would do the opposite order

if ((err = dls_devnet_hold_tmp(linkid, &dlh)) != 0)
goto done;

if ((err = mac_perim_enter_by_macname(dls_devnet_mac(dlh),
&mph)) != 0) {
goto done;
}


The other alternative would need lot more work and code reorganizion. We 
need to
separate the disable and destroy part all through, in vanity naming,
softmac, and mac, so that all disables are done first. Then we can undo the
disables if any of them fail. Only if all disables succeed, should we do the
destroys. Destroys must never fail.

Thirumalai



Eric Cheng wrote:> I think something like this could work:
>
> int
> dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp)
> {
>         int                     err;
>         mac_perim_handle_t      mph;
>
>         mac_perim_enter_by_mh(mh, &mph);
>
>         *idp = DATALINK_INVALID_LINKID;
>         err = dls_devnet_unset(mac_name(mh), idp, B_FALSE);
>         if (err != 0) {
>                 mac_perim_exit(mph);
>                 return (err);
>         }
>         err = dls_link_rele_by_name(mac_name(mh));
>         ASSERT(err == 0);
>         mac_perim_exit(mph);
>         return (err);
> }
>
> static int
> dls_devnet_unset(const char *macname, datalink_id_t *id, boolean_t wait)
> {
>         dls_devnet_t    *ddp;
>         int             err;
>         mod_hash_val_t  val;
>
>         rw_enter(&i_dls_devnet_lock, RW_WRITER);
>         if ((err = mod_hash_find(i_dls_devnet_hash,
>             (mod_hash_key_t)macname, (mod_hash_val_t *)&ddp)) != 0) {
>                 ASSERT(err == MH_ERR_NOTFOUND);
>                 rw_exit(&i_dls_devnet_lock);
>                 return (ENOENT);
>         }
>
>         if ((err = dls_link_is_busy(macname)) != 0) {
>                 rw_exit(&i_dls_devnet_lock);
>                 return (err);
>         }
>
>         mutex_enter(&ddp->dd_mutex);
>         ....
> }
>
> int
> dls_link_is_busy(const char *name)
> {
>         dls_link_t              *dlp;
>
>         if (mod_hash_find(i_dls_link_hash, (mod_hash_key_t)name,
>             (mod_hash_val_t *)&dlp) != 0)
>                 return (ENOENT);
>
>         ASSERT(MAC_PERIM_HELD(dlp->dl_mh));
>
>         /*
>          * Must fail detach if mac client is busy.
>          */
>         ASSERT(dlp->dl_ref > 0 && dlp->dl_mch != NULL);
>         if (mac_link_has_flows(dlp->dl_mch))
>                 return (ENOTEMPTY);
>
>         return (0);
> }
>
>
> notes:
> -the DD_CONDEMNED flag has the effect of disabling the dls_devnet_t
>  and ensures that subsequent calls to dls_devnet_hold_tmp() would
>  fail (indirectly called by mac_link_flow_add())
>  
> -we do not wait in dls_devnet_unset() because this could deadlock
>  if someone is holding a tmp reference. softmac_destroy() already
>  does this so it should be ok to change aggr/vnic to behave the
>  same way.
>
> let me know if this makes sense.
>
> eric
>
> On Fri, Jan 16, 2009 at 03:17:34PM -0800, Cathy Zhou wrote:
>   
>> On 2009?01?16? 14:34, Sebastien Roy wrote:
>>     
>>> On Fri, 2009-01-16 at 13:33 -0800, Cathy Zhou wrote:
>>>       
>>>> On 2009?01?16? 13:16, Sebastien Roy wrote:
>>>>         
>>>>> Hi Crossbow team,
>>>>>
>>>>> I have a question on the modifications made to
dls_devnet_destroy():
>>>>>
>>>>> int
>>>>> dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp,
boolean_t wait)
>>>>> {
>>>>> 	int			err;
>>>>> 	mac_perim_handle_t	mph;
>>>>>
>>>>> 	*idp = DATALINK_INVALID_LINKID;
>>>>> 	err = dls_devnet_unset(mac_name(mh), idp, wait);
>>>>> 	if (err != 0 && err != ENOENT)
>>>>> 		return (err);
>>>>>
>>>>> 	mac_perim_enter_by_mh(mh, &mph);
>>>>> 	err = dls_link_rele_by_name(mac_name(mh));
>>>>> 	mac_perim_exit(mph);
>>>>>
>>>>> 	if (err != 0)
>>>>> 		(void) dls_devnet_set(mac_name(mh), *idp, NULL);
>>>>> 	return (err);
>>>>> }
>>>>>
>>>>> Once the call to dls_devnet_unset() has succeeded, is it
possible for
>>>>> dls_link_rele_by_name() to fail?  I ask because the
subsequent call to
>>>>> dls_devnet_set() upon dls_link_rele_by_name() failure is
problematic.
>>>>> It this dls_devnet_set() call fails, the system is rendered
hosed until
>>>>> rebooted.
>>>>>
>>>>>           
>>>> It looks possible. As long as there are still subflows added on
this link,
>>>> it would fail.
>>>>         
>>> Is there a way to fix this?  There has to be some way to freeze the
>>> state of things once it''s been verified that teardown is
safe (sort of
>>> like what mac_disable() was intended to do).  That way, one can
assert
>>> that the various steps involved in teardown cannot fail.
>>>
>>>       
>> Yes. I believe it is possible. Can we add a dls_devnet_disable()
function,
>> which checks whether we can destroy dls_devnet_t, dls_link_t (or even 
>> mac_impl_t?) and set the disabled flag on each of them. The disabled
flag
>> will be used to prevent further usage of those structures.
>>
>> Thanks
>> - Cathy
>> _______________________________________________
>> crossbow-discuss mailing list
>> crossbow-discuss at opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss
>>     
> _______________________________________________
> crossbow-discuss mailing list
> crossbow-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss
>

Eric Cheng

2009-Jan-19 09:47 UTC

head link

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

thanks for the explanation, thiru.

I agree with you that the complete solution will take a lot more work.
I will likely tackle this as part of 6791335 as they are somewhat
related.

eric

On Sun, Jan 18, 2009 at 06:37:35PM -0800, Thirumalai Srinivasan
wrote:> 
> Eric,
> 
> Iam assuming you mean dls_devnet_unset() will fail the unset if there are
> temporary references i.e. the dd_tref is non-zero or if dd_prop_taskid is
> non-null. This is easy to implement, the only issue is the one Cathy
pointed
> For example a script that runs a delete vnic would fail occasionally if it
> coincided with somone running a dladm show command. If this failure is
> acceptable, then there is perhaps no need for even keeping track of a 
> separate
> field for temporary references.
> 
> (cfgadm failure is perhaps similar, but may not be such a call generator.)
> 
> 
> >Cathy Zhou wrote:
> >>Can you elaborate what would be the locking issues for aggr/vnic if
we
> >>wait in dls_devnet_unset?
> >>    
> 
> 
> If you wait for the dd_tref to drop to zero, while holding the perimeter
> there is the risk of deadlock. For eg. the following sequence in
> drv_ioc_prop_common() would do the opposite order
> 
> if ((err = dls_devnet_hold_tmp(linkid, &dlh)) != 0)
> goto done;
> 
> if ((err = mac_perim_enter_by_macname(dls_devnet_mac(dlh),
> &mph)) != 0) {
> goto done;
> }
> 
> 
> The other alternative would need lot more work and code reorganizion. We 
> need to
> separate the disable and destroy part all through, in vanity naming,
> softmac, and mac, so that all disables are done first. Then we can undo the
> disables if any of them fail. Only if all disables succeed, should we do
the
> destroys. Destroys must never fail.
> 
> Thirumalai
> 
> 
> 
> Eric Cheng wrote:
> >I think something like this could work:
> >
> >int
> >dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp)
> >{
> >        int                     err;
> >        mac_perim_handle_t      mph;
> >
> >        mac_perim_enter_by_mh(mh, &mph);
> >
> >        *idp = DATALINK_INVALID_LINKID;
> >        err = dls_devnet_unset(mac_name(mh), idp, B_FALSE);
> >        if (err != 0) {
> >                mac_perim_exit(mph);
> >                return (err);
> >        }
> >        err = dls_link_rele_by_name(mac_name(mh));
> >        ASSERT(err == 0);
> >        mac_perim_exit(mph);
> >        return (err);
> >}
> >
> >static int
> >dls_devnet_unset(const char *macname, datalink_id_t *id, boolean_t
wait)
> >{
> >        dls_devnet_t    *ddp;
> >        int             err;
> >        mod_hash_val_t  val;
> >
> >        rw_enter(&i_dls_devnet_lock, RW_WRITER);
> >        if ((err = mod_hash_find(i_dls_devnet_hash,
> >            (mod_hash_key_t)macname, (mod_hash_val_t *)&ddp)) != 0)
{
> >                ASSERT(err == MH_ERR_NOTFOUND);
> >                rw_exit(&i_dls_devnet_lock);
> >                return (ENOENT);
> >        }
> >
> >        if ((err = dls_link_is_busy(macname)) != 0) {
> >                rw_exit(&i_dls_devnet_lock);
> >                return (err);
> >        }
> >
> >        mutex_enter(&ddp->dd_mutex);
> >        ....
> >}
> >
> >int
> >dls_link_is_busy(const char *name)
> >{
> >        dls_link_t              *dlp;
> >
> >        if (mod_hash_find(i_dls_link_hash, (mod_hash_key_t)name,
> >            (mod_hash_val_t *)&dlp) != 0)
> >                return (ENOENT);
> >
> >        ASSERT(MAC_PERIM_HELD(dlp->dl_mh));
> >
> >        /*
> >         * Must fail detach if mac client is busy.
> >         */
> >        ASSERT(dlp->dl_ref > 0 && dlp->dl_mch !=
NULL);
> >        if (mac_link_has_flows(dlp->dl_mch))
> >                return (ENOTEMPTY);
> >
> >        return (0);
> >}
> >
> >
> >notes:
> >-the DD_CONDEMNED flag has the effect of disabling the dls_devnet_t
> > and ensures that subsequent calls to dls_devnet_hold_tmp() would
> > fail (indirectly called by mac_link_flow_add())
> > 
> >-we do not wait in dls_devnet_unset() because this could deadlock
> > if someone is holding a tmp reference. softmac_destroy() already
> > does this so it should be ok to change aggr/vnic to behave the
> > same way.
> >
> >let me know if this makes sense.
> >
> >eric
> >
> >On Fri, Jan 16, 2009 at 03:17:34PM -0800, Cathy Zhou wrote:
> >  
> >>On 2009?01?16? 14:34, Sebastien Roy wrote:
> >>    
> >>>On Fri, 2009-01-16 at 13:33 -0800, Cathy Zhou wrote:
> >>>      
> >>>>On 2009?01?16? 13:16, Sebastien Roy wrote:
> >>>>        
> >>>>>Hi Crossbow team,
> >>>>>
> >>>>>I have a question on the modifications made to
dls_devnet_destroy():
> >>>>>
> >>>>>int
> >>>>>dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp,
boolean_t wait)
> >>>>>{
> >>>>>	int			err;
> >>>>>	mac_perim_handle_t	mph;
> >>>>>
> >>>>>	*idp = DATALINK_INVALID_LINKID;
> >>>>>	err = dls_devnet_unset(mac_name(mh), idp, wait);
> >>>>>	if (err != 0 && err != ENOENT)
> >>>>>		return (err);
> >>>>>
> >>>>>	mac_perim_enter_by_mh(mh, &mph);
> >>>>>	err = dls_link_rele_by_name(mac_name(mh));
> >>>>>	mac_perim_exit(mph);
> >>>>>
> >>>>>	if (err != 0)
> >>>>>		(void) dls_devnet_set(mac_name(mh), *idp, NULL);
> >>>>>	return (err);
> >>>>>}
> >>>>>
> >>>>>Once the call to dls_devnet_unset() has succeeded, is
it possible for
> >>>>>dls_link_rele_by_name() to fail?  I ask because the
subsequent call to
> >>>>>dls_devnet_set() upon dls_link_rele_by_name() failure
is problematic.
> >>>>>It this dls_devnet_set() call fails, the system is
rendered hosed until
> >>>>>rebooted.
> >>>>>
> >>>>>          
> >>>>It looks possible. As long as there are still subflows
added on this
> >>>>link, it would fail.
> >>>>        
> >>>Is there a way to fix this?  There has to be some way to freeze
the
> >>>state of things once it''s been verified that teardown
is safe (sort of
> >>>like what mac_disable() was intended to do).  That way, one can
assert
> >>>that the various steps involved in teardown cannot fail.
> >>>
> >>>      
> >>Yes. I believe it is possible. Can we add a dls_devnet_disable() 
> >>function, which checks whether we can destroy dls_devnet_t,
dls_link_t
> >>(or even mac_impl_t?) and set the disabled flag on each of them.
The
> >>disabled flag will be used to prevent further usage of those
structures.
> >>
> >>Thanks
> >>- Cathy
> >>_______________________________________________
> >>crossbow-discuss mailing list
> >>crossbow-discuss at opensolaris.org
> >>http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss
> >>    
> >_______________________________________________
> >crossbow-discuss mailing list
> >crossbow-discuss at opensolaris.org
> >http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss
> >  
>

Sebastien Roy

2009-Jan-20 18:05 UTC

head link

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

On Mon, 2009-01-19 at 01:47 -0800, Eric Cheng wrote:> I agree with you that the complete solution will take a lot more work.
> I will likely tackle this as part of 6791335 as they are somewhat
> related.
Sounds good, and thanks for the productive discussion.

-Seb

crossbow discuss - Jan 2009 - question on crossbow changes to dls_devnet_destroy()

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()

[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()