Hi,

I've found some possible deadlock in fs/ocfs2/dlm/dlmmaster.c -
version 2.6.28 (probably this code is in newer versions too).
Could someone confirm this? Thank you.


fs/ocfs2/dlm/dlmmaster.c
=================

function dlm_master_request_handler: (res->spinlock <- dlm->master_lock)
-----------------------------------
spin_lock(&res->spinlock); at line 1427
spin_lock(&dlm->master_lock); at line 1475

function dlm_migrate_request_handler: (dlm->master_lock <- res->spinlock)
-------------------------------------------------------
spin_lock(&dlm->master_lock) at line 3036
spin_lock(&res->spinlock); at line 3039

caught by Stanse (http://iti.fi.muni.cz/stanse/)
Jan Kucera wrote:
> I've found some possible deadlock in fs/ocfs2/dlm/dlmmaster.c -
> version 2.6.28 (probably this code is in newer versions too).
> Could someone confirm this? Thank you.
>
>
> fs/ocfs2/dlm/dlmmaster.c
> =================
>
> function dlm_master_request_handler: (res->spinlock <- dlm->master_lock)
> -----------------------------------
> spin_lock(&res->spinlock); at line 1427
> spin_lock(&dlm->master_lock); at line 1475
>
> function dlm_migrate_request_handler: (dlm->master_lock <- res->spinlock)
> -------------------------------------------------------
> spin_lock(&dlm->master_lock) at line 3036
> spin_lock(&res->spinlock); at line 3039

So this should not happen. The first condition can only be hit if the
resource has no master and is in the process of being mastered. The
second condition will only be hit if the resource has a master and is
currently being migrated (remastered) from one node to another. The two
appear to be mutually exclusive.

But feel free to file a bugzilla so that I remember to look into it more
carefully when I have more time.
http://oss.oracle.com/bugzilla

Thanks
Sunil
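
For reference, the kind of lock-order check that produces a report like the one above can be sketched in plain user-space C. This is only an illustration under stated assumptions: the enum names and the functions record_nested(), has_inversion() and reset_orders() are hypothetical, this is not how Stanse is actually implemented, and no kernel code is involved. It records which lock was taken while another was held and reports when the same pair shows up in both orders.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical identifiers standing in for the two spinlocks in the report. */
enum lock_id { RES_SPINLOCK, MASTER_LOCK, NUM_LOCKS };

/* order[a][b] is true once some path took lock b while already holding a. */
static bool order[NUM_LOCKS][NUM_LOCKS];

/* Forget all recorded acquisition orders. */
static void reset_orders(void)
{
    memset(order, 0, sizeof(order));
}

/* Record that `inner` was acquired while `outer` was held. */
static void record_nested(enum lock_id outer, enum lock_id inner)
{
    assert(outer < NUM_LOCKS && inner < NUM_LOCKS);
    order[outer][inner] = true;
}

/* Return true if any two locks have been taken in both orders (ABBA). */
static bool has_inversion(void)
{
    for (int a = 0; a < NUM_LOCKS; a++)
        for (int b = a + 1; b < NUM_LOCKS; b++)
            if (order[a][b] && order[b][a])
                return true;
    return false;
}
```

Calling record_nested(RES_SPINLOCK, MASTER_LOCK) models the order seen in dlm_master_request_handler, and record_nested(MASTER_LOCK, RES_SPINLOCK) models the order seen in dlm_migrate_request_handler; has_inversion() returns true only after both have been recorded. A check like this looks at acquisition order alone. It has no notion of whether the two paths can actually run at the same time on the same resource, which is the condition the reply above argues does not hold.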