홍신 shin hong
2009-Nov-11 15:07 UTC
BUG? a possible race due to the absence of memory barrier
Hello. I am reporting a possible data race due to the absence of memory barriers.

I reported a similar issue before. Although the previous one turned out to be safe, please examine this issue and let me know your opinion.

In btrfs_init_new_device(), a btrfs_device object is allocated and initialized, and then linked into &root->fs_info->fs_devices->alloc_list.

It seems that a memory barrier is necessary between the initialization and the linking into the list.

If these two operations are reordered so that they take effect in the opposite order, a data race may result in which other threads read uninitialized values.

For btrfs_init_new_device(), I suspect that __btrfs_alloc_chunk() could contribute to the data race through concurrent execution.

Thank you.
Sincerely,
Shin Hong

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
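The reordering concern described in this report can be sketched in userspace C. This is only an illustration under stated assumptions, not btrfs code: struct device, published, and run_publish_demo() are hypothetical names, a single pointer slot stands in for the list, and C11 release/acquire atomics stand in for the kernel barriers (such as smp_wmb() and smp_rmb()) that a lock-free version would need.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct btrfs_device; not the kernel type. */
struct device {
    int devid;
    unsigned long total_bytes;
};

/* Hypothetical single-slot "list": readers look here for a published device. */
static _Atomic(struct device *) published;
static struct device storage;

static void *writer(void *arg)
{
    (void)arg;
    /* Step 1: initialize the object. */
    storage.devid = 7;
    storage.total_bytes = 4096;
    /* Step 2: publish it. The release store keeps the writes above
     * from being reordered after the publication. */
    atomic_store_explicit(&published, &storage, memory_order_release);
    return NULL;
}

static void *reader(void *arg)
{
    struct device *d;

    /* The acquire load pairs with the release store in writer(). */
    while ((d = atomic_load_explicit(&published, memory_order_acquire)) == NULL)
        ; /* spin until a device is published */
    *(int *)arg = (d->devid == 7 && d->total_bytes == 4096);
    return NULL;
}

/* Returns 1 if the reader observed a fully initialized object. */
int run_publish_demo(void)
{
    pthread_t w, r;
    int ok = 0, rc;

    atomic_store(&published, NULL);
    rc = pthread_create(&r, NULL, reader, &ok);
    assert(rc == 0);
    rc = pthread_create(&w, NULL, writer, NULL);
    assert(rc == 0);
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return ok;
}
```

If the store in writer() used memory_order_relaxed instead, the language would no longer guarantee that a reader seeing the pointer also sees the initialized fields, which is the kind of race the report is asking about.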
Chris Mason
2009-Nov-11 16:06 UTC
Re: BUG? a possible race due to the absence of memory barrier
On Thu, Nov 12, 2009 at 12:07:05AM +0900, 홍신 shin hong wrote:
> Hello. I am reporting a possible data race
> due to the absence of memory barriers.
>
> I reported a similar issue before. Although the previous one turned out to be safe,
> please examine this issue and let me know your opinion.
>
> In btrfs_init_new_device(), a btrfs_device object is allocated and initialized,
> and then linked into &root->fs_info->fs_devices->alloc_list.
>
> It seems that a memory barrier is necessary
> between the initialization and the linking into the list.
>
> If these two operations are reordered so that they take effect in the opposite order,
> a data race may result in which other threads read uninitialized values.
>
> For btrfs_init_new_device(), I suspect that __btrfs_alloc_chunk()
> could contribute to the data race through concurrent execution.

Thanks for searching for races in this code; it definitely has a lot of locks to go through.

In this case, btrfs_init_new_device has the chunk mutex held (from lock_chunks), and __btrfs_alloc_chunk should always be called with the chunk mutex held as well.

In general the btrfs locking tries not to rely on barriers and ordering unless a given area of the code is very performance sensitive. It's very easy for subtle bugs to creep in with barriers only, so I try to use mutexes and spinlocks everywhere that I can get away with it.

-chris
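The mutex-based approach described in this reply can be sketched as follows. This is a userspace analogue under stated assumptions, not the btrfs implementation: a pthread mutex named chunk_mutex stands in for the kernel mutex, and struct device, add_devices(), walk_devices(), and run_mutex_demo() are hypothetical names. Because both the initialize-then-link path and the list walker take the same mutex, the walker can never observe a half-initialized entry, and no explicit barrier is needed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct btrfs_device; not the kernel type. */
struct device {
    int devid;
    unsigned long total_bytes;
    struct device *next;
};

static pthread_mutex_t chunk_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct device *alloc_list;
static struct device devices[4];

/* Analogue of the device-add path: initialize and link under the mutex. */
static void *add_devices(void *arg)
{
    (void)arg;
    for (int i = 0; i < 4; i++) {
        pthread_mutex_lock(&chunk_mutex);
        devices[i].devid = i + 1;
        devices[i].total_bytes = 4096;
        devices[i].next = alloc_list;
        alloc_list = &devices[i];
        pthread_mutex_unlock(&chunk_mutex);
    }
    return NULL;
}

/* Analogue of the chunk allocator: walk the list under the same mutex. */
static void *walk_devices(void *arg)
{
    int *bad = arg;

    for (int pass = 0; pass < 1000; pass++) {
        pthread_mutex_lock(&chunk_mutex);
        for (struct device *d = alloc_list; d != NULL; d = d->next)
            if (d->devid == 0 || d->total_bytes != 4096)
                (*bad)++; /* would indicate an uninitialized entry */
        pthread_mutex_unlock(&chunk_mutex);
    }
    return NULL;
}

/* Returns the number of uninitialized entries the walker observed. */
int run_mutex_demo(void)
{
    pthread_t adder, walker;
    int bad = 0, rc;

    alloc_list = NULL;
    rc = pthread_create(&walker, NULL, walk_devices, &bad);
    assert(rc == 0);
    rc = pthread_create(&adder, NULL, add_devices, NULL);
    assert(rc == 0);
    pthread_join(adder, NULL);
    pthread_join(walker, NULL);
    return bad;
}
```

The unlock in one thread and the subsequent lock in the other already provide the required ordering, which is why a design that takes the mutex on both sides does not need a separate barrier between initialization and linking.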
홍신 shin hong
2009-Nov-12 01:14 UTC
Re: BUG? a possible race due to the absence of memory barrier
Thank you for the review.

I did not notice that lock_chunks() is a locking function. I am using my own static analysis tool to find bugs. After I registered lock_chunks() as a locking function, the bug alarm disappeared.
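Why registering a lock wrapper silences the alarm can be illustrated with a toy checker. Everything here is hypothetical: the author's actual tool is not described in this thread, the call trace is a simplified guess at the shape of btrfs_init_new_device(), and alarm_raised() and the two helper functions are names invented for this sketch. The checker only tracks whether any registered lock is held when list_add is reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if name appears in the given set of function names. */
static bool in_set(const char *name, const char *const *set, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(name, set[i]) == 0)
            return true;
    return false;
}

/* Scan a flat call sequence and raise an alarm if list_add is reached
 * while no registered lock function has an outstanding acquisition. */
bool alarm_raised(const char *const *calls, size_t ncalls,
                  const char *const *locks, size_t nlocks,
                  const char *const *unlocks, size_t nunlocks)
{
    int held = 0;

    assert(calls != NULL);
    for (size_t i = 0; i < ncalls; i++) {
        if (in_set(calls[i], locks, nlocks))
            held++;
        else if (in_set(calls[i], unlocks, nunlocks) && held > 0)
            held--;
        else if (strcmp(calls[i], "list_add") == 0 && held == 0)
            return true;
    }
    return false;
}

/* Simplified, hypothetical call trace; not taken from the kernel source. */
static const char *const trace[] = {
    "kzalloc", "lock_chunks", "list_add", "unlock_chunks"
};

/* Only the generic primitives are known: the wrapper is invisible. */
bool alarm_before_registration(void)
{
    static const char *const locks[] = {"mutex_lock", "spin_lock"};
    static const char *const unlocks[] = {"mutex_unlock", "spin_unlock"};

    return alarm_raised(trace, 4, locks, 2, unlocks, 2);
}

/* The wrappers are registered as lock and unlock functions. */
bool alarm_after_registration(void)
{
    static const char *const locks[] = {"mutex_lock", "spin_lock", "lock_chunks"};
    static const char *const unlocks[] = {"mutex_unlock", "spin_unlock", "unlock_chunks"};

    return alarm_raised(trace, 4, locks, 3, unlocks, 3);
}
```

In this sketch alarm_before_registration() reports the unprotected list_add, while alarm_after_registration() does not, mirroring the false positive described above.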