thr3ads.net - Btrfs devel - [PATCH] Btrfs: Check for NULL page in extent_range

Mitch Harder

2012-Jan-25 19:03 UTC

[PATCH] Btrfs: Check for NULL page in extent_range_uptodate

A user has encountered a NULL pointer kernel oops in btrfs when
encountering media errors.  The problem has been identified
as an unhandled NULL pointer returned from find_get_page().
This modification simply checks for a NULL page, and returns
with an error if found (the extent_range_uptodate() function
returns 1 on errors).

After testing this patch, the user reported that the NULL pointer
oops was solved.  However, there is still a remaining problem with
a thread becoming stuck in wait_on_page_locked(page) in the
read_extent_buffer_pages(...) function in extent_io.c:

       for (i = start_i; i < num_pages; i++) {
               page = extent_buffer_page(eb, i);
               wait_on_page_locked(page);
               if (!PageUptodate(page))
                       ret = -EIO;
       }

This patch leaves the issue with the locked page yet to be resolved.

Signed-off-by: Mitch Harder <mitch.harder@sabayonlinux.org>
---
 fs/btrfs/extent_io.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 9d09a4f..fcf77e1 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3909,6 +3909,8 @@ int extent_range_uptodate(struct extent_io_tree *tree,
 	while (start <= end) {
 		index = start >> PAGE_CACHE_SHIFT;
 		page = find_get_page(tree->mapping, index);
+		if (!page)
+			return 1;
 		uptodate = PageUptodate(page);
 		page_cache_release(page);
 		if (!uptodate) {
-- 
1.7.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
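
The check added by the patch can be tried outside the kernel with a small userspace model. The sketch below is only an illustration: every `mock_` name, the 4 KiB page size, and the return values on the paths the diff does not show are assumptions, not btrfs code. The one behaviour taken from the patch is that a missing page makes the function return 1 before the page is dereferenced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of this is kernel code. */
#define MOCK_PAGE_SHIFT 12                      /* assumes 4 KiB pages */
#define MOCK_PAGE_SIZE  (1ULL << MOCK_PAGE_SHIFT)
#define MOCK_NR_PAGES   8

struct mock_page {
	bool present;    /* false models find_get_page() returning NULL */
	bool uptodate;
	int refcount;
};

struct mock_mapping {
	struct mock_page pages[MOCK_NR_PAGES];
};

/* Returns a referenced page, or NULL when the index is not cached. */
static struct mock_page *mock_find_get_page(struct mock_mapping *m,
					    unsigned long index)
{
	if (index >= MOCK_NR_PAGES || !m->pages[index].present)
		return NULL;
	m->pages[index].refcount++;
	return &m->pages[index];
}

/* Drops the reference taken by mock_find_get_page(). */
static void mock_page_release(struct mock_page *page)
{
	page->refcount--;
}

/*
 * Same loop shape as the patched hunk. Returning 1 for a missing page
 * follows the patch; returning 0 for a stale page and 1 at the end are
 * assumptions, because the diff does not show that part of the function.
 */
static int mock_range_uptodate(struct mock_mapping *m,
			       unsigned long long start,
			       unsigned long long end)
{
	assert(start <= end);
	while (start <= end) {
		unsigned long index = (unsigned long)(start >> MOCK_PAGE_SHIFT);
		struct mock_page *page = mock_find_get_page(m, index);
		bool uptodate;

		if (!page)
			return 1;       /* the check added by the patch */
		uptodate = page->uptodate;
		mock_page_release(page);
		if (!uptodate)
			return 0;
		start += MOCK_PAGE_SIZE;
	}
	return 1;
}
```

Without the `if (!page)` test, the model would read `page->uptodate` through a NULL pointer, which is the userspace analogue of the reported oops.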

Vincent Vanackere

2012-Jan-30 21:41 UTC

Re: [PATCH] Btrfs: Check for NULL page in extent_range_uptodate

On Wed, Jan 25, 2012 at 20:03, Mitch Harder
<mitch.harder@sabayonlinux.org> wrote:
> A user has encountered a NULL pointer kernel oops in btrfs when
> encountering media errors.  The problem has been identified
> as an unhandled NULL pointer returned from find_get_page().
> This modification simply checks for a NULL page, and returns
> with an error if found (the extent_range_uptodate() function
> returns 1 on errors).
>
> After testing this patch, the user reported that the error with
> the NULL pointer oops was solved.  However, there is still a
> remaining problem with a thread becoming stuck in
> wait_on_page_locked(page) in the read_extent_buffer_pages(...)
> function in extent_io.c
>
>       for (i = start_i; i < num_pages; i++) {
>               page = extent_buffer_page(eb, i);
>               wait_on_page_locked(page);
>               if (!PageUptodate(page))
>                       ret = -EIO;
>       }
>
> This patch leaves the issue with the locked page yet to be resolved.
>
> Signed-off-by: Mitch Harder <mitch.harder@sabayonlinux.org>
> ---
>  fs/btrfs/extent_io.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 9d09a4f..fcf77e1 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -3909,6 +3909,8 @@ int extent_range_uptodate(struct extent_io_tree *tree,
>        while (start <= end) {
>                index = start >> PAGE_CACHE_SHIFT;
>                page = find_get_page(tree->mapping, index);
> +               if (!page)
> +                       return 1;
>                uptodate = PageUptodate(page);
>                page_cache_release(page);
>                if (!uptodate) {
> --
> 1.7.3.4
>

Hi,

If any btrfs developer could have a look at it while I can still
reproduce the situation (it won't last long, I'll send the disk to RMA
next week), I'm still interested in solving the remaining part of the
btrfs bug. Here is the trace I get with the current linux kernel
(6bc2b95ee602659c1be6fac0f6aadeb0c5c29a5d):

[  330.530015] btrfs bad tree block start 959241011200 959241011200
[  480.288046] INFO: task cat:2627 blocked for more than 120 seconds.
[  480.288050] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  480.288052] cat             D ffffffff8180c600     0  2627   2468 0x00000004
[  480.288057]  ffff8801fe135618 0000000000000086 ffff8801fe1355d8 ffff880222061650
[  480.288062]  ffff880215b5db80 ffff8801fe135fd8 ffff8801fe135fd8 ffff8801fe135fd8
[  480.288067]  ffff8802241a16e0 ffff880215b5db80 ffff8801fe1355e8 ffff88022fd93e88
[  480.288071] Call Trace:
[  480.288080]  [<ffffffff81114440>] ? __lock_page+0x70/0x70
[  480.288084]  [<ffffffff8162c0af>] schedule+0x3f/0x60
[  480.288087]  [<ffffffff8162c15f>] io_schedule+0x8f/0xd0
[  480.288091]  [<ffffffff8111444e>] sleep_on_page+0xe/0x20
[  480.288094]  [<ffffffff8162a96f>] __wait_on_bit+0x5f/0x90
[  480.288098]  [<ffffffff811145b8>] wait_on_page_bit+0x78/0x80
[  480.288102]  [<ffffffff81070c70>] ? autoremove_wake_function+0x40/0x40
[  480.288129]  [<ffffffffa005d161>] read_extent_buffer_pages+0x471/0x4d0 [btrfs]
[  480.288142]  [<ffffffffa00347b0>] ? verify_parent_transid+0x160/0x160 [btrfs]
[  480.288155]  [<ffffffffa003513a>] btree_read_extent_buffer_pages.isra.99+0x8a/0xc0 [btrfs]
[  480.288169]  [<ffffffffa00371e1>] read_tree_block+0x41/0x60 [btrfs]
[  480.288179]  [<ffffffffa001d6a3>] read_block_for_search.isra.34+0xf3/0x3d0 [btrfs]
[  480.288190]  [<ffffffffa001f930>] btrfs_search_slot+0x300/0x8a0 [btrfs]
[  480.288203]  [<ffffffffa0031ab4>] btrfs_lookup_csum+0x74/0x170 [btrfs]
[  480.288216]  [<ffffffffa0031d5f>] __btrfs_lookup_bio_sums+0x1af/0x3b0 [btrfs]
[  480.288228]  [<ffffffffa0031fb6>] btrfs_lookup_bio_sums+0x16/0x20 [btrfs]
[  480.288242]  [<ffffffffa003e650>] btrfs_submit_bio_hook+0x140/0x170 [btrfs]
[  480.288256]  [<ffffffffa00405d0>] ? btrfs_real_readdir+0x720/0x720 [btrfs]
[  480.288272]  [<ffffffffa00571aa>] submit_one_bio+0x6a/0xa0 [btrfs]
[  480.288287]  [<ffffffffa005be64>] extent_readpages+0xe4/0x100 [btrfs]
[  480.288301]  [<ffffffffa00405d0>] ? btrfs_real_readdir+0x720/0x720 [btrfs]
[  480.288315]  [<ffffffffa003eebf>] btrfs_readpages+0x1f/0x30 [btrfs]
[  480.288319]  [<ffffffff81120bef>] __do_page_cache_readahead+0x1af/0x250
[  480.288323]  [<ffffffff81120ff1>] ra_submit+0x21/0x30
[  480.288326]  [<ffffffff81121115>] ondemand_readahead+0x115/0x230
[  480.288330]  [<ffffffff81137eb9>] ? __do_fault+0x419/0x530
[  480.288333]  [<ffffffff81121311>] page_cache_sync_readahead+0x31/0x50
[  480.288337]  [<ffffffff811167d8>] generic_file_aio_read+0x438/0x780
[  480.288342]  [<ffffffff81173db2>] do_sync_read+0xd2/0x110
[  480.288346]  [<ffffffff81294113>] ? security_file_permission+0x93/0xb0
[  480.288349]  [<ffffffff81174231>] ? rw_verify_area+0x61/0xf0
[  480.288352]  [<ffffffff81174710>] vfs_read+0xb0/0x180
[  480.288355]  [<ffffffff8117482a>] sys_read+0x4a/0x90
[  480.288359]  [<ffffffff81635229>] system_call_fastpath+0x16/0x1b
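
The trace shows the task sleeping in wait_on_page_bit(), called from read_extent_buffer_pages(), which matches the loop quoted in the patch description: wait_on_page_locked() sleeps until the page's lock bit is cleared and has no timeout. The userspace sketch below models that loop to show why a lock bit that is never cleared stalls it. It is not a proposed kernel fix. The `mock_` names, the poll limit, and the -ETIMEDOUT result are all assumptions introduced for illustration; the bounded poll only stands in for the indefinite sleep so that the example terminates.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two page flags the loop looks at. */
struct mock_page {
	bool locked;     /* models the page lock bit */
	bool uptodate;   /* models PageUptodate() */
};

/*
 * Hypothetical bounded wait: checks the lock flag at most max_polls times
 * and reports whether the page was seen unlocked. Nothing in this model
 * clears the flag, which mimics a read completion that never arrives.
 */
static bool mock_wait_unlocked(const struct mock_page *page,
			       unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if (!page->locked)
			return true;
	}
	return false;
}

/*
 * Same shape as the quoted loop. Returns 0 when every page is unlocked
 * and uptodate, -EIO when an unlocked page is not uptodate, and
 * -ETIMEDOUT (an assumption of this model) when a page stays locked.
 */
static int mock_read_pages(struct mock_page *pages, size_t start_i,
			   size_t num_pages, unsigned int max_polls)
{
	int ret = 0;

	for (size_t i = start_i; i < num_pages; i++) {
		/* The real wait_on_page_locked() would sleep here indefinitely. */
		if (!mock_wait_unlocked(&pages[i], max_polls))
			return -ETIMEDOUT;
		if (!pages[i].uptodate)
			ret = -EIO;
	}
	return ret;
}
```

In this model a page that is unlocked but not uptodate yields -EIO, as in the quoted loop, while a page that stays locked never reaches the PageUptodate() test at all, which is consistent with the hung task report above.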

Mitch Harder

2012-Jan-30 23:13 UTC

Re: [PATCH] Btrfs: Check for NULL page in extent_range_uptodate

On Mon, Jan 30, 2012 at 3:41 PM, Vincent Vanackere
<vincent.vanackere@gmail.com> wrote:
> If any btrfs developer could have a look at it while I can still
> reproduce the situation (it won't last long, I'll send the disk to RMA
> next week), I'm still interested in solving the remaining part of the
> btrfs bug. Here is the trace I get with the current linux kernel
> (6bc2b95ee602659c1be6fac0f6aadeb0c5c29a5d):
>
> [...]
>
Jeff Mahoney has been working on a large overhaul of error
handling/BUG_ONs.  It is difficult to say when it will be ready, or
if it will even address this specific problem.

I'd go ahead and return the disk.  I doubt you'll be the last user to
have bad sectors, so there'll be more opportunities to see how this
issue is handled after the changes to error handling.

Vincent Vanackere

2012-Jan-31 09:06 UTC

Re: [PATCH] Btrfs: Check for NULL page in extent_range_uptodate

On Tue, Jan 31, 2012 at 00:13, Mitch Harder
<mitch.harder@sabayonlinux.org> wrote:
> Jeff Mahoney has been working on a large overhaul of error
> handling/BUG_ONs.  It is difficult to say when it will be ready, or
> if it will even address this specific problem.
>
> I'd go ahead and return the disk.  I doubt you'll be the last user to
> have bad sectors, so there'll be more opportunities to see how this
> issue is handled after the changes to error handling.

OK, I'm returning the disk now. Thanks for the help!
