Sebastien Roy
2009-Jan-16 21:16 UTC
[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()
Hi Crossbow team, I have a question on the modifications made to dls_devnet_destroy(): int dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp, boolean_t wait) { int err; mac_perim_handle_t mph; *idp = DATALINK_INVALID_LINKID; err = dls_devnet_unset(mac_name(mh), idp, wait); if (err != 0 && err != ENOENT) return (err); mac_perim_enter_by_mh(mh, &mph); err = dls_link_rele_by_name(mac_name(mh)); mac_perim_exit(mph); if (err != 0) (void) dls_devnet_set(mac_name(mh), *idp, NULL); return (err); } Once the call to dls_devnet_unset() has succeeded, is it possible for dls_link_rele_by_name() to fail? I ask because the subsequent call to dls_devnet_set() upon dls_link_rele_by_name() failure is problematic. It this dls_devnet_set() call fails, the system is rendered hosed until rebooted. The Clearview iptun project is adding some code in dls_devnet_set() to automatically set the zoneid of the link if it is created from within a non-global zone, and I cannot assert that this code will always succeed. It would be good to understand what the semantics of this function are in all possible failure scenarios. It would be great if dls_link_rele_by_name() couldn''t fail in this case, and there should thus be an ASSERT() that err == 0 below it. The call to dls_devnet_set() could then be removed. I don''t know enough about what dls_link_rele_by_name() does with flows to be able to assert that myself though. Input would be appreciated. Thanks, -Seb
Cathy Zhou
2009-Jan-16 21:33 UTC
[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()
On 2009?01?16? 13:16, Sebastien Roy wrote:> Hi Crossbow team, > > I have a question on the modifications made to dls_devnet_destroy(): > > int > dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp, boolean_t wait) > { > int err; > mac_perim_handle_t mph; > > *idp = DATALINK_INVALID_LINKID; > err = dls_devnet_unset(mac_name(mh), idp, wait); > if (err != 0 && err != ENOENT) > return (err); > > mac_perim_enter_by_mh(mh, &mph); > err = dls_link_rele_by_name(mac_name(mh)); > mac_perim_exit(mph); > > if (err != 0) > (void) dls_devnet_set(mac_name(mh), *idp, NULL); > return (err); > } > > Once the call to dls_devnet_unset() has succeeded, is it possible for > dls_link_rele_by_name() to fail? I ask because the subsequent call to > dls_devnet_set() upon dls_link_rele_by_name() failure is problematic. > It this dls_devnet_set() call fails, the system is rendered hosed until > rebooted. >It looks possible. As long as there are still subflows added on this link, it would fail. Thanks - Cathy> The Clearview iptun project is adding some code in dls_devnet_set() to > automatically set the zoneid of the link if it is created from within a > non-global zone, and I cannot assert that this code will always succeed. > > It would be good to understand what the semantics of this function are > in all possible failure scenarios. It would be great if > dls_link_rele_by_name() couldn''t fail in this case, and there should > thus be an ASSERT() that err == 0 below it. The call to > dls_devnet_set() could then be removed. > > I don''t know enough about what dls_link_rele_by_name() does with flows > to be able to assert that myself though. Input would be appreciated. > > Thanks, > -Seb > > > _______________________________________________ > crossbow-discuss mailing list > crossbow-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss
Sebastien Roy
2009-Jan-16 22:34 UTC
[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()
On Fri, 2009-01-16 at 13:33 -0800, Cathy Zhou wrote:> On 2009?01?16? 13:16, Sebastien Roy wrote: > > Hi Crossbow team, > > > > I have a question on the modifications made to dls_devnet_destroy(): > > > > int > > dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp, boolean_t wait) > > { > > int err; > > mac_perim_handle_t mph; > > > > *idp = DATALINK_INVALID_LINKID; > > err = dls_devnet_unset(mac_name(mh), idp, wait); > > if (err != 0 && err != ENOENT) > > return (err); > > > > mac_perim_enter_by_mh(mh, &mph); > > err = dls_link_rele_by_name(mac_name(mh)); > > mac_perim_exit(mph); > > > > if (err != 0) > > (void) dls_devnet_set(mac_name(mh), *idp, NULL); > > return (err); > > } > > > > Once the call to dls_devnet_unset() has succeeded, is it possible for > > dls_link_rele_by_name() to fail? I ask because the subsequent call to > > dls_devnet_set() upon dls_link_rele_by_name() failure is problematic. > > It this dls_devnet_set() call fails, the system is rendered hosed until > > rebooted. > > > It looks possible. As long as there are still subflows added on this link, > it would fail.Is there a way to fix this? There has to be some way to freeze the state of things once it''s been verified that teardown is safe (sort of like what mac_disable() was intended to do). That way, one can assert that the various steps involved in teardown cannot fail. -Seb
Cathy Zhou
2009-Jan-16 23:17 UTC
[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()
On 2009?01?16? 14:34, Sebastien Roy wrote:> On Fri, 2009-01-16 at 13:33 -0800, Cathy Zhou wrote: >> On 2009?01?16? 13:16, Sebastien Roy wrote: >>> Hi Crossbow team, >>> >>> I have a question on the modifications made to dls_devnet_destroy(): >>> >>> int >>> dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp, boolean_t wait) >>> { >>> int err; >>> mac_perim_handle_t mph; >>> >>> *idp = DATALINK_INVALID_LINKID; >>> err = dls_devnet_unset(mac_name(mh), idp, wait); >>> if (err != 0 && err != ENOENT) >>> return (err); >>> >>> mac_perim_enter_by_mh(mh, &mph); >>> err = dls_link_rele_by_name(mac_name(mh)); >>> mac_perim_exit(mph); >>> >>> if (err != 0) >>> (void) dls_devnet_set(mac_name(mh), *idp, NULL); >>> return (err); >>> } >>> >>> Once the call to dls_devnet_unset() has succeeded, is it possible for >>> dls_link_rele_by_name() to fail? I ask because the subsequent call to >>> dls_devnet_set() upon dls_link_rele_by_name() failure is problematic. >>> It this dls_devnet_set() call fails, the system is rendered hosed until >>> rebooted. >>> >> It looks possible. As long as there are still subflows added on this link, >> it would fail. > > Is there a way to fix this? There has to be some way to freeze the > state of things once it''s been verified that teardown is safe (sort of > like what mac_disable() was intended to do). That way, one can assert > that the various steps involved in teardown cannot fail. >Yes. I believe it is possible. Can we add a dls_devnet_disable() function, which checks whether we can destroy dls_devnet_t, dls_link_t (or even mac_impl_t?) and set the disabled flag on each of them. The disabled flag will be used to prevent further usage of those structures. Thanks - Cathy
Eric Cheng
2009-Jan-17 00:35 UTC
[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()
I think something like this could work: int dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp) { int err; mac_perim_handle_t mph; mac_perim_enter_by_mh(mh, &mph); *idp = DATALINK_INVALID_LINKID; err = dls_devnet_unset(mac_name(mh), idp, B_FALSE); if (err != 0) { mac_perim_exit(mph); return (err); } err = dls_link_rele_by_name(mac_name(mh)); ASSERT(err == 0); mac_perim_exit(mph); return (err); } static int dls_devnet_unset(const char *macname, datalink_id_t *id, boolean_t wait) { dls_devnet_t *ddp; int err; mod_hash_val_t val; rw_enter(&i_dls_devnet_lock, RW_WRITER); if ((err = mod_hash_find(i_dls_devnet_hash, (mod_hash_key_t)macname, (mod_hash_val_t *)&ddp)) != 0) { ASSERT(err == MH_ERR_NOTFOUND); rw_exit(&i_dls_devnet_lock); return (ENOENT); } if ((err = dls_link_is_busy(macname)) != 0) { rw_exit(&i_dls_devnet_lock); return (err); } mutex_enter(&ddp->dd_mutex); .... } int dls_link_is_busy(const char *name) { dls_link_t *dlp; if (mod_hash_find(i_dls_link_hash, (mod_hash_key_t)name, (mod_hash_val_t *)&dlp) != 0) return (ENOENT); ASSERT(MAC_PERIM_HELD(dlp->dl_mh)); /* * Must fail detach if mac client is busy. */ ASSERT(dlp->dl_ref > 0 && dlp->dl_mch != NULL); if (mac_link_has_flows(dlp->dl_mch)) return (ENOTEMPTY); return (0); } notes: -the DD_CONDEMNED flag has the effect of disabling the dls_devnet_t and ensures that subsequent calls to dls_devnet_hold_tmp() would fail (indirectly called by mac_link_flow_add()) -we do not wait in dls_devnet_unset() because this could deadlock if someone is holding a tmp reference. softmac_destroy() already does this so it should be ok to change aggr/vnic to behave the same way. let me know if this makes sense. eric On Fri, Jan 16, 2009 at 03:17:34PM -0800, Cathy Zhou wrote:> On 2009?01?16? 14:34, Sebastien Roy wrote: > > On Fri, 2009-01-16 at 13:33 -0800, Cathy Zhou wrote: > >> On 2009?01?16? 13:16, Sebastien Roy wrote: > >>> Hi Crossbow team, > >>> > >>> I have a question on the modifications made to dls_devnet_destroy(): > >>> > >>> int > >>> dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp, boolean_t wait) > >>> { > >>> int err; > >>> mac_perim_handle_t mph; > >>> > >>> *idp = DATALINK_INVALID_LINKID; > >>> err = dls_devnet_unset(mac_name(mh), idp, wait); > >>> if (err != 0 && err != ENOENT) > >>> return (err); > >>> > >>> mac_perim_enter_by_mh(mh, &mph); > >>> err = dls_link_rele_by_name(mac_name(mh)); > >>> mac_perim_exit(mph); > >>> > >>> if (err != 0) > >>> (void) dls_devnet_set(mac_name(mh), *idp, NULL); > >>> return (err); > >>> } > >>> > >>> Once the call to dls_devnet_unset() has succeeded, is it possible for > >>> dls_link_rele_by_name() to fail? I ask because the subsequent call to > >>> dls_devnet_set() upon dls_link_rele_by_name() failure is problematic. > >>> It this dls_devnet_set() call fails, the system is rendered hosed until > >>> rebooted. > >>> > >> It looks possible. As long as there are still subflows added on this link, > >> it would fail. > > > > Is there a way to fix this? There has to be some way to freeze the > > state of things once it''s been verified that teardown is safe (sort of > > like what mac_disable() was intended to do). That way, one can assert > > that the various steps involved in teardown cannot fail. > > > Yes. I believe it is possible. Can we add a dls_devnet_disable() function, > which checks whether we can destroy dls_devnet_t, dls_link_t (or even > mac_impl_t?) and set the disabled flag on each of them. The disabled flag > will be used to prevent further usage of those structures. > > Thanks > - Cathy > _______________________________________________ > crossbow-discuss mailing list > crossbow-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss
Cathy Zhou
2009-Jan-17 00:45 UTC
[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()
> -we do not wait in dls_devnet_unset() because this could deadlock > if someone is holding a tmp reference. softmac_destroy() already > does this so it should be ok to change aggr/vnic to behave the > same way. >It is fine for softmac to do that, since it is fine to prevent a device from detaching. But it will be strange to disallow a vnic/aggr from being destroyed only because someone is showing its state right now (which would result a temporary reference). Thanks - Cathy> let me know if this makes sense. > > eric > > On Fri, Jan 16, 2009 at 03:17:34PM -0800, Cathy Zhou wrote: >> On 2009?01?16? 14:34, Sebastien Roy wrote: >>> On Fri, 2009-01-16 at 13:33 -0800, Cathy Zhou wrote: >>>> On 2009?01?16? 13:16, Sebastien Roy wrote: >>>>> Hi Crossbow team, >>>>> >>>>> I have a question on the modifications made to dls_devnet_destroy(): >>>>> >>>>> int >>>>> dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp, boolean_t wait) >>>>> { >>>>> int err; >>>>> mac_perim_handle_t mph; >>>>> >>>>> *idp = DATALINK_INVALID_LINKID; >>>>> err = dls_devnet_unset(mac_name(mh), idp, wait); >>>>> if (err != 0 && err != ENOENT) >>>>> return (err); >>>>> >>>>> mac_perim_enter_by_mh(mh, &mph); >>>>> err = dls_link_rele_by_name(mac_name(mh)); >>>>> mac_perim_exit(mph); >>>>> >>>>> if (err != 0) >>>>> (void) dls_devnet_set(mac_name(mh), *idp, NULL); >>>>> return (err); >>>>> } >>>>> >>>>> Once the call to dls_devnet_unset() has succeeded, is it possible for >>>>> dls_link_rele_by_name() to fail? I ask because the subsequent call to >>>>> dls_devnet_set() upon dls_link_rele_by_name() failure is problematic. >>>>> It this dls_devnet_set() call fails, the system is rendered hosed until >>>>> rebooted. >>>>> >>>> It looks possible. As long as there are still subflows added on this link, >>>> it would fail. >>> Is there a way to fix this? There has to be some way to freeze the >>> state of things once it''s been verified that teardown is safe (sort of >>> like what mac_disable() was intended to do). That way, one can assert >>> that the various steps involved in teardown cannot fail. >>> >> Yes. I believe it is possible. Can we add a dls_devnet_disable() function, >> which checks whether we can destroy dls_devnet_t, dls_link_t (or even >> mac_impl_t?) and set the disabled flag on each of them. The disabled flag >> will be used to prevent further usage of those structures. >> >> Thanks >> - Cathy >> _______________________________________________ >> crossbow-discuss mailing list >> crossbow-discuss at opensolaris.org >> http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss > _______________________________________________ > crossbow-discuss mailing list > crossbow-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss
Eric Cheng
2009-Jan-17 01:15 UTC
[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()
On Fri, Jan 16, 2009 at 04:45:02PM -0800, Cathy Zhou wrote:> > >-we do not wait in dls_devnet_unset() because this could deadlock > > if someone is holding a tmp reference. softmac_destroy() already > > does this so it should be ok to change aggr/vnic to behave the > > same way. > > > It is fine for softmac to do that, since it is fine to prevent a device > from detaching. But it will be strange to disallow a vnic/aggr from being > destroyed only because someone is showing its state right now (which would > result a temporary reference). >a device detach could also be triggered by the administrator via a command (e.g. cfgadm). how is this different from, say, dladm delete-aggr? why treat aggrs/vnics differently? I am afraid this teardown issue will be very difficult to fix because of locking issues if you must wait for tmp references to drop. adding a separate _disable function will not make this easier. eric> Thanks > - Cathy > > >let me know if this makes sense. > > > >eric > > > >On Fri, Jan 16, 2009 at 03:17:34PM -0800, Cathy Zhou wrote: > >>On 2009?01?16? 14:34, Sebastien Roy wrote: > >>>On Fri, 2009-01-16 at 13:33 -0800, Cathy Zhou wrote: > >>>>On 2009?01?16? 13:16, Sebastien Roy wrote: > >>>>>Hi Crossbow team, > >>>>> > >>>>>I have a question on the modifications made to dls_devnet_destroy(): > >>>>> > >>>>>int > >>>>>dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp, boolean_t wait) > >>>>>{ > >>>>> int err; > >>>>> mac_perim_handle_t mph; > >>>>> > >>>>> *idp = DATALINK_INVALID_LINKID; > >>>>> err = dls_devnet_unset(mac_name(mh), idp, wait); > >>>>> if (err != 0 && err != ENOENT) > >>>>> return (err); > >>>>> > >>>>> mac_perim_enter_by_mh(mh, &mph); > >>>>> err = dls_link_rele_by_name(mac_name(mh)); > >>>>> mac_perim_exit(mph); > >>>>> > >>>>> if (err != 0) > >>>>> (void) dls_devnet_set(mac_name(mh), *idp, NULL); > >>>>> return (err); > >>>>>} > >>>>> > >>>>>Once the call to dls_devnet_unset() has succeeded, is it possible for > >>>>>dls_link_rele_by_name() to fail? I ask because the subsequent call to > >>>>>dls_devnet_set() upon dls_link_rele_by_name() failure is problematic. > >>>>>It this dls_devnet_set() call fails, the system is rendered hosed until > >>>>>rebooted. > >>>>> > >>>>It looks possible. As long as there are still subflows added on this > >>>>link, it would fail. > >>>Is there a way to fix this? There has to be some way to freeze the > >>>state of things once it''s been verified that teardown is safe (sort of > >>>like what mac_disable() was intended to do). That way, one can assert > >>>that the various steps involved in teardown cannot fail. > >>> > >>Yes. I believe it is possible. Can we add a dls_devnet_disable() > >>function, which checks whether we can destroy dls_devnet_t, dls_link_t > >>(or even mac_impl_t?) and set the disabled flag on each of them. The > >>disabled flag will be used to prevent further usage of those structures. > >> > >>Thanks > >>- Cathy > >>_______________________________________________ > >>crossbow-discuss mailing list > >>crossbow-discuss at opensolaris.org > >>http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss > >_______________________________________________ > >crossbow-discuss mailing list > >crossbow-discuss at opensolaris.org > >http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss >
Cathy Zhou
2009-Jan-17 02:12 UTC
[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()
> > a device detach could also be triggered by the administrator via a command > (e.g. cfgadm). how is this different from, say, dladm delete-aggr? > why treat aggrs/vnics differently? >Good point. I thought about that too. Yes, ideally that we should wait for the temporary reference count to drop for softmac as well. I think it is also because of locking issue which cause we choose not to wait for physical link. Can you elaborate what would be the locking issues for aggr/vnic if we wait in dls_devnet_unset? Thanks - Cathy
Thirumalai Srinivasan
2009-Jan-19 02:37 UTC
[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()
Eric, Iam assuming you mean dls_devnet_unset() will fail the unset if there are temporary references i.e. the dd_tref is non-zero or if dd_prop_taskid is non-null. This is easy to implement, the only issue is the one Cathy pointed For example a script that runs a delete vnic would fail occasionally if it coincided with somone running a dladm show command. If this failure is acceptable, then there is perhaps no need for even keeping track of a separate field for temporary references. (cfgadm failure is perhaps similar, but may not be such a call generator.)> Cathy Zhou wrote: >> Can you elaborate what would be the locking issues for aggr/vnic if we >> wait in dls_devnet_unset? >>If you wait for the dd_tref to drop to zero, while holding the perimeter there is the risk of deadlock. For eg. the following sequence in drv_ioc_prop_common() would do the opposite order if ((err = dls_devnet_hold_tmp(linkid, &dlh)) != 0) goto done; if ((err = mac_perim_enter_by_macname(dls_devnet_mac(dlh), &mph)) != 0) { goto done; } The other alternative would need lot more work and code reorganizion. We need to separate the disable and destroy part all through, in vanity naming, softmac, and mac, so that all disables are done first. Then we can undo the disables if any of them fail. Only if all disables succeed, should we do the destroys. Destroys must never fail. Thirumalai Eric Cheng wrote:> I think something like this could work: > > int > dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp) > { > int err; > mac_perim_handle_t mph; > > mac_perim_enter_by_mh(mh, &mph); > > *idp = DATALINK_INVALID_LINKID; > err = dls_devnet_unset(mac_name(mh), idp, B_FALSE); > if (err != 0) { > mac_perim_exit(mph); > return (err); > } > err = dls_link_rele_by_name(mac_name(mh)); > ASSERT(err == 0); > mac_perim_exit(mph); > return (err); > } > > static int > dls_devnet_unset(const char *macname, datalink_id_t *id, boolean_t wait) > { > dls_devnet_t *ddp; > int err; > mod_hash_val_t val; > > rw_enter(&i_dls_devnet_lock, RW_WRITER); > if ((err = mod_hash_find(i_dls_devnet_hash, > (mod_hash_key_t)macname, (mod_hash_val_t *)&ddp)) != 0) { > ASSERT(err == MH_ERR_NOTFOUND); > rw_exit(&i_dls_devnet_lock); > return (ENOENT); > } > > if ((err = dls_link_is_busy(macname)) != 0) { > rw_exit(&i_dls_devnet_lock); > return (err); > } > > mutex_enter(&ddp->dd_mutex); > .... > } > > int > dls_link_is_busy(const char *name) > { > dls_link_t *dlp; > > if (mod_hash_find(i_dls_link_hash, (mod_hash_key_t)name, > (mod_hash_val_t *)&dlp) != 0) > return (ENOENT); > > ASSERT(MAC_PERIM_HELD(dlp->dl_mh)); > > /* > * Must fail detach if mac client is busy. > */ > ASSERT(dlp->dl_ref > 0 && dlp->dl_mch != NULL); > if (mac_link_has_flows(dlp->dl_mch)) > return (ENOTEMPTY); > > return (0); > } > > > notes: > -the DD_CONDEMNED flag has the effect of disabling the dls_devnet_t > and ensures that subsequent calls to dls_devnet_hold_tmp() would > fail (indirectly called by mac_link_flow_add()) > > -we do not wait in dls_devnet_unset() because this could deadlock > if someone is holding a tmp reference. softmac_destroy() already > does this so it should be ok to change aggr/vnic to behave the > same way. > > let me know if this makes sense. > > eric > > On Fri, Jan 16, 2009 at 03:17:34PM -0800, Cathy Zhou wrote: > >> On 2009?01?16? 14:34, Sebastien Roy wrote: >> >>> On Fri, 2009-01-16 at 13:33 -0800, Cathy Zhou wrote: >>> >>>> On 2009?01?16? 13:16, Sebastien Roy wrote: >>>> >>>>> Hi Crossbow team, >>>>> >>>>> I have a question on the modifications made to dls_devnet_destroy(): >>>>> >>>>> int >>>>> dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp, boolean_t wait) >>>>> { >>>>> int err; >>>>> mac_perim_handle_t mph; >>>>> >>>>> *idp = DATALINK_INVALID_LINKID; >>>>> err = dls_devnet_unset(mac_name(mh), idp, wait); >>>>> if (err != 0 && err != ENOENT) >>>>> return (err); >>>>> >>>>> mac_perim_enter_by_mh(mh, &mph); >>>>> err = dls_link_rele_by_name(mac_name(mh)); >>>>> mac_perim_exit(mph); >>>>> >>>>> if (err != 0) >>>>> (void) dls_devnet_set(mac_name(mh), *idp, NULL); >>>>> return (err); >>>>> } >>>>> >>>>> Once the call to dls_devnet_unset() has succeeded, is it possible for >>>>> dls_link_rele_by_name() to fail? I ask because the subsequent call to >>>>> dls_devnet_set() upon dls_link_rele_by_name() failure is problematic. >>>>> It this dls_devnet_set() call fails, the system is rendered hosed until >>>>> rebooted. >>>>> >>>>> >>>> It looks possible. As long as there are still subflows added on this link, >>>> it would fail. >>>> >>> Is there a way to fix this? There has to be some way to freeze the >>> state of things once it''s been verified that teardown is safe (sort of >>> like what mac_disable() was intended to do). That way, one can assert >>> that the various steps involved in teardown cannot fail. >>> >>> >> Yes. I believe it is possible. Can we add a dls_devnet_disable() function, >> which checks whether we can destroy dls_devnet_t, dls_link_t (or even >> mac_impl_t?) and set the disabled flag on each of them. The disabled flag >> will be used to prevent further usage of those structures. >> >> Thanks >> - Cathy >> _______________________________________________ >> crossbow-discuss mailing list >> crossbow-discuss at opensolaris.org >> http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss >> > _______________________________________________ > crossbow-discuss mailing list > crossbow-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss >
Eric Cheng
2009-Jan-19 09:47 UTC
[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()
thanks for the explanation, thiru. I agree with you that the complete solution will take a lot more work. I will likely tackle this as part of 6791335 as they are somewhat related. eric On Sun, Jan 18, 2009 at 06:37:35PM -0800, Thirumalai Srinivasan wrote:> > Eric, > > Iam assuming you mean dls_devnet_unset() will fail the unset if there are > temporary references i.e. the dd_tref is non-zero or if dd_prop_taskid is > non-null. This is easy to implement, the only issue is the one Cathy pointed > For example a script that runs a delete vnic would fail occasionally if it > coincided with somone running a dladm show command. If this failure is > acceptable, then there is perhaps no need for even keeping track of a > separate > field for temporary references. > > (cfgadm failure is perhaps similar, but may not be such a call generator.) > > > >Cathy Zhou wrote: > >>Can you elaborate what would be the locking issues for aggr/vnic if we > >>wait in dls_devnet_unset? > >> > > > If you wait for the dd_tref to drop to zero, while holding the perimeter > there is the risk of deadlock. For eg. the following sequence in > drv_ioc_prop_common() would do the opposite order > > if ((err = dls_devnet_hold_tmp(linkid, &dlh)) != 0) > goto done; > > if ((err = mac_perim_enter_by_macname(dls_devnet_mac(dlh), > &mph)) != 0) { > goto done; > } > > > The other alternative would need lot more work and code reorganizion. We > need to > separate the disable and destroy part all through, in vanity naming, > softmac, and mac, so that all disables are done first. Then we can undo the > disables if any of them fail. Only if all disables succeed, should we do the > destroys. Destroys must never fail. > > Thirumalai > > > > Eric Cheng wrote: > >I think something like this could work: > > > >int > >dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp) > >{ > > int err; > > mac_perim_handle_t mph; > > > > mac_perim_enter_by_mh(mh, &mph); > > > > *idp = DATALINK_INVALID_LINKID; > > err = dls_devnet_unset(mac_name(mh), idp, B_FALSE); > > if (err != 0) { > > mac_perim_exit(mph); > > return (err); > > } > > err = dls_link_rele_by_name(mac_name(mh)); > > ASSERT(err == 0); > > mac_perim_exit(mph); > > return (err); > >} > > > >static int > >dls_devnet_unset(const char *macname, datalink_id_t *id, boolean_t wait) > >{ > > dls_devnet_t *ddp; > > int err; > > mod_hash_val_t val; > > > > rw_enter(&i_dls_devnet_lock, RW_WRITER); > > if ((err = mod_hash_find(i_dls_devnet_hash, > > (mod_hash_key_t)macname, (mod_hash_val_t *)&ddp)) != 0) { > > ASSERT(err == MH_ERR_NOTFOUND); > > rw_exit(&i_dls_devnet_lock); > > return (ENOENT); > > } > > > > if ((err = dls_link_is_busy(macname)) != 0) { > > rw_exit(&i_dls_devnet_lock); > > return (err); > > } > > > > mutex_enter(&ddp->dd_mutex); > > .... > >} > > > >int > >dls_link_is_busy(const char *name) > >{ > > dls_link_t *dlp; > > > > if (mod_hash_find(i_dls_link_hash, (mod_hash_key_t)name, > > (mod_hash_val_t *)&dlp) != 0) > > return (ENOENT); > > > > ASSERT(MAC_PERIM_HELD(dlp->dl_mh)); > > > > /* > > * Must fail detach if mac client is busy. > > */ > > ASSERT(dlp->dl_ref > 0 && dlp->dl_mch != NULL); > > if (mac_link_has_flows(dlp->dl_mch)) > > return (ENOTEMPTY); > > > > return (0); > >} > > > > > >notes: > >-the DD_CONDEMNED flag has the effect of disabling the dls_devnet_t > > and ensures that subsequent calls to dls_devnet_hold_tmp() would > > fail (indirectly called by mac_link_flow_add()) > > > >-we do not wait in dls_devnet_unset() because this could deadlock > > if someone is holding a tmp reference. softmac_destroy() already > > does this so it should be ok to change aggr/vnic to behave the > > same way. > > > >let me know if this makes sense. > > > >eric > > > >On Fri, Jan 16, 2009 at 03:17:34PM -0800, Cathy Zhou wrote: > > > >>On 2009?01?16? 14:34, Sebastien Roy wrote: > >> > >>>On Fri, 2009-01-16 at 13:33 -0800, Cathy Zhou wrote: > >>> > >>>>On 2009?01?16? 13:16, Sebastien Roy wrote: > >>>> > >>>>>Hi Crossbow team, > >>>>> > >>>>>I have a question on the modifications made to dls_devnet_destroy(): > >>>>> > >>>>>int > >>>>>dls_devnet_destroy(mac_handle_t mh, datalink_id_t *idp, boolean_t wait) > >>>>>{ > >>>>> int err; > >>>>> mac_perim_handle_t mph; > >>>>> > >>>>> *idp = DATALINK_INVALID_LINKID; > >>>>> err = dls_devnet_unset(mac_name(mh), idp, wait); > >>>>> if (err != 0 && err != ENOENT) > >>>>> return (err); > >>>>> > >>>>> mac_perim_enter_by_mh(mh, &mph); > >>>>> err = dls_link_rele_by_name(mac_name(mh)); > >>>>> mac_perim_exit(mph); > >>>>> > >>>>> if (err != 0) > >>>>> (void) dls_devnet_set(mac_name(mh), *idp, NULL); > >>>>> return (err); > >>>>>} > >>>>> > >>>>>Once the call to dls_devnet_unset() has succeeded, is it possible for > >>>>>dls_link_rele_by_name() to fail? I ask because the subsequent call to > >>>>>dls_devnet_set() upon dls_link_rele_by_name() failure is problematic. > >>>>>It this dls_devnet_set() call fails, the system is rendered hosed until > >>>>>rebooted. > >>>>> > >>>>> > >>>>It looks possible. As long as there are still subflows added on this > >>>>link, it would fail. > >>>> > >>>Is there a way to fix this? There has to be some way to freeze the > >>>state of things once it''s been verified that teardown is safe (sort of > >>>like what mac_disable() was intended to do). That way, one can assert > >>>that the various steps involved in teardown cannot fail. > >>> > >>> > >>Yes. I believe it is possible. Can we add a dls_devnet_disable() > >>function, which checks whether we can destroy dls_devnet_t, dls_link_t > >>(or even mac_impl_t?) and set the disabled flag on each of them. The > >>disabled flag will be used to prevent further usage of those structures. > >> > >>Thanks > >>- Cathy > >>_______________________________________________ > >>crossbow-discuss mailing list > >>crossbow-discuss at opensolaris.org > >>http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss > >> > >_______________________________________________ > >crossbow-discuss mailing list > >crossbow-discuss at opensolaris.org > >http://mail.opensolaris.org/mailman/listinfo/crossbow-discuss > > >
Sebastien Roy
2009-Jan-20 18:05 UTC
[crossbow-discuss] question on crossbow changes to dls_devnet_destroy()
On Mon, 2009-01-19 at 01:47 -0800, Eric Cheng wrote:> I agree with you that the complete solution will take a lot more work. > I will likely tackle this as part of 6791335 as they are somewhat > related.Sounds good, and thanks for the productive discussion. -Seb