yzy@clusterfs.com
2007-Jan-19 02:40 UTC
[Lustre-devel] [Bug 11013] implement hash table for client UUID/NID export lookups
Please don''t reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=11013 Created an attachment (id=9386) Please don''t reply to lustre-devel. Instead, comment in Bugzilla by using the following link: --> (https://bugzilla.lustre.org/attachment.cgi?id=9386&action=view) new patch for buffalo test New patch, passed acceptance-small.sh test in local system, the new patch is testing in buffalo.
adilger@clusterfs.com
2007-Jan-25 03:21 UTC
[Lustre-devel] [Bug 11013] implement hash table for client UUID/NID export lookups
Please don''t reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=11013 (From update of attachment 9386)>+lustre_hash_delitem_nolock(struct lustre_class_hash_body *hash_body, >+ int hashent, struct hlist_node * hash_item) >+{ >+ struct lustre_hash_operations *hop = hash_body->lchb_hash_operations; >+ >+ if (hash_item == NULL) { >+ RETURN(-ENOENT); >+ }No need for braces here.>+ >+ hlist_del(hash_item); >+ >+ hop->lustre_hash_object_refcount_put(hash_item); >+ >+#ifdef LUSTRE_HASH_DEBUG >+ hash_body->lchb_hash_tables[hashent]->lhb_item_count--; >+ CDEBUG(D_INFO, "hashname[%s] bucket[%d] has [%d] hashitem\n", >+ hash_body->hashname, hashent, >+ hash_body->lchb_hash_tables[hashent]->lhb_item_count); >+#endifYou are referencing a struct that may already be freed by the _put (if that was the last reference). \>@@ -655,41 +654,38 @@ int target_handle_connect(struct ptlrpc_> goto dont_check_exports; > > spin_lock(&target->obd_dev_lock); >+ export = lustre_hash_get_object_by_key(target->obd_uuid_hash_body, &cluuid); > >+ if (export != NULL && export->exp_connecting) { /* bug 9635, et. al. */ >+ } else if (export != NULL) { >+ } else { > export = NULL; > }Isn''t export == NULL in this case already?>+int lustre_hash_additem_unique(struct lustre_class_hash_body *hash_body, >+ void *key, struct hlist_node *actual_hnode) >+{ >+ int hashent; >+ struct lustre_hash_bucket *bucket = NULL; >+ struct lustre_hash_operations *hop = hash_body->lchb_hash_operations; >+ ENTRY; >+ >+ hashent = hop->lustre_hashfn(hash_body, key); >+ >+ /* get the hash-bucket and lock it */ >+ bucket = &hash_body->lchb_hash_tables[hashent]; >+ spin_lock(&bucket->lhb_lock); >+ >+ if ( (lustre_hash_getitem_in_bucket_nolock(hash_body, hashent, key)) != NULL) { >+ /* the added-item exist in hashtables, so cannot add it again */ >+ spin_unlock(&bucket->lhb_lock); >+ >+ CWARN("Object exist\n");This isn''t a useful error message. Either there are already error messages in the caller (that may already be true), or this needs to be made more useful like "Already found {key} in {hashname}, not adding duplicate".>+void * lustre_hash_get_object_by_key(struct lustre_class_hash_body *hash_body, >+ void *key) >+{ >+ if (hash_item_hnode == NULL) { >+ spin_unlock(&bucket->lhb_lock); /* lock the bucket */ >+ CERROR("Object don''t exist\n");Again, not a very useful error message. How often do we expect this to be hit? If very often at all, then it probably shouldn''t go to the console.>+} >+EXPORT_SYMBOL(lustre_hash_get_object_by_key);What happens if this is called when there are multiple of the same keys in the hash, as is described for the NID hash above?>+int nid_hash_key_compare(void *key, struct hlist_node *compared_hnode) >+{ >+ if (strcmp(obd_export_nid2str(export), libcfs_nid2str(*nid_key)) == 0) >+ retval = 1;How often do we expect to call obd_export_nid2str()? If this is called a LOT then it may make sense to just save the formatted NID into the export and have obd_export_nid2str() just return that pointer always.> int obd_export_evict_by_nid(struct obd_device *obd, char *nid) >+ lnet_nid_t nid_key = libcfs_str2nid(nid); >+ >+ for (; ;) {I think it is preferred to use do { } while (1)> int obd_export_evict_by_uuid(struct obd_device *obd, char *uuid) > { > struct obd_export *doomed_exp = NULL; > struct obd_uuid doomed; > int exports_evicted = 0; > > obd_str2uuid(&doomed, uuid); > >+ doomed_exp = lustre_hash_get_object_by_key(obd->obd_uuid_hash_body, >+ &doomed);Not directly a scalability (but rather a usability) issue, can you have this code skip a leading "uuid:" if present, so that it can be used similar to the "nid:" tag for evict-by-nid.>@@ -426,6 +442,16 @@ int class_cleanup(struct obd_device *obd >+ if (obd->obd_uuid_hash_body != NULL) { >+ lustre_hash_exit(obd->obd_uuid_hash_body); >+ } >+ >+ if (obd->obd_nid_hash_body != NULL) { >+ lustre_hash_exit(obd->obd_nid_hash_body); >+ }No need for braces here.>@@ -143,14 +145,27 @@ int ptlrpc_put_connection(struct ptlrpc_ >+ if (!hlist_unhashed(&c->c_hash)) { >+ lustre_hash_delitem(conn_hash_body, &peer, &c->c_hash); >+ }No need for braces. I haven''t done a full inspection, but this is generally good looking code here. What I don''t see is real unit tests for this. You need to write a test that verifies the scalability of this code. One option is to use (and possibly modify) Nathan''s loadgen program so that it only does the connection part without the IO generation. Then, time the performance of "evict-by-nid" or similar.
yzy@clusterfs.com
2007-Feb-01 02:57 UTC
[Lustre-devel] [Bug 11013] implement hash table for client UUID/NID export lookups
Please don''t reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=11013 What |Removed |Added ---------------------------------------------------------------------------- Attachment #9367 is|0 |1 obsolete| | Attachment #9386 is|0 |1 obsolete| | Created an attachment (id=9468) Please don''t reply to lustre-devel. Instead, comment in Bugzilla by using the following link: --> (https://bugzilla.lustre.org/attachment.cgi?id=9468&action=view) new patch, fixed the kernel crash, passed the acc-small.sh test new patch, fixed the kernel crash. passed the acc-small.sh in local system, submitted to buffalo to test.