thr3ads.net - Btrfs devel - [PATCH v2] Btrfs: remove superblock writing after fatal error [Aug 2012]

If this information is useful, please help other people find it:
Share via:

Stefan Behrens

2012-Aug-01 11:45 UTC

[PATCH v2] Btrfs: remove superblock writing after fatal error

With commit acce952b0, btrfs was changed to flag the filesystem with
BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
error happened like a write I/O errors of all mirrors.
In such situations, on unmount, the superblock is written in
btrfs_error_commit_super(). This is done with the intention to be able
to evaluate the error flag on the next mount. A warning is printed
in this case during the next mount and the log tree is ignored.

The issue is that it is possible that the superblock points to a root
that was not written (due to write I/O errors).
The result is that the filesystem cannot be mounted. btrfsck also does
not start and all the other btrfs-progs tools fail to start as well.
However, mount -o recovery is working well and does the right things
to recover the filesystem (i.e., don''t use the log root, clear the
free space cache and use the next mountable root that is stored in the
root backup array).

This patch removes the writing of the superblock when
BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error
flag in the mount function.

These lines can be used to reproduce the issue (using /dev/sdm):
SCRATCH_DEV=/dev/sdm
SCRATCH_MNT=/mnt
echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup create foo
ls -alLF /dev/mapper/foo
mkfs.btrfs /dev/mapper/foo
mount /dev/mapper/foo $SCRATCH_MNT
echo bar > $SCRATCH_MNT/foo
sync
echo 0 25165824 error | dmsetup reload foo
dmsetup resume foo
ls -alF $SCRATCH_MNT
touch $SCRATCH_MNT/1
ls -alF $SCRATCH_MNT
sleep 35
echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup reload foo
dmsetup resume foo
sleep 1
umount $SCRATCH_MNT
btrfsck /dev/mapper/foo
dmsetup remove foo

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
---
Changes v1 -> v2:
- Removed one large comment entirely which was not needed anymore
  after the code changes.
- Removed a warning message which was dead code since the super block
  is not written anymore when the flag BTRFS_SUPER_FLAG_ERROR is set.
- Removed the return code of btrfs_error_commit_super() since it now
  does not return any errors anymore.

 fs/btrfs/disk-io.c | 36 ++++--------------------------------
 fs/btrfs/disk-io.h |  2 +-
 2 files changed, 5 insertions(+), 33 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 502b20c..f6a1d0f 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2531,8 +2531,7 @@ retry_root_backup:
 		goto fail_trans_kthread;
 
 	/* do not make disk changes in broken FS */
-	if (btrfs_super_log_root(disk_super) != 0 &&
-	    !(fs_info->fs_state & BTRFS_SUPER_FLAG_ERROR)) {
+	if (btrfs_super_log_root(disk_super) != 0) {
 		u64 bytenr = btrfs_super_log_root(disk_super);
 
 		if (fs_devices->rw_devices == 0) {
@@ -3192,30 +3191,14 @@ int close_ctree(struct btrfs_root *root)
 	/* clear out the rbtree of defraggable inodes */
 	btrfs_run_defrag_inodes(fs_info);
 
-	/*
-	 * Here come 2 situations when btrfs is broken to flip readonly:
-	 *
-	 * 1. when btrfs flips readonly somewhere else before
-	 * btrfs_commit_super, sb->s_flags has MS_RDONLY flag,
-	 * and btrfs will skip to write sb directly to keep
-	 * ERROR state on disk.
-	 *
-	 * 2. when btrfs flips readonly just in btrfs_commit_super,
-	 * and in such case, btrfs cannot write sb via btrfs_commit_super,
-	 * and since fs_state has been set BTRFS_SUPER_FLAG_ERROR flag,
-	 * btrfs will cleanup all FS resources first and write sb then.
-	 */
 	if (!(fs_info->sb->s_flags & MS_RDONLY)) {
 		ret = btrfs_commit_super(root);
 		if (ret)
 			printk(KERN_ERR "btrfs: commit super ret %d\n", ret);
 	}
 
-	if (fs_info->fs_state & BTRFS_SUPER_FLAG_ERROR) {
-		ret = btrfs_error_commit_super(root);
-		if (ret)
-			printk(KERN_ERR "btrfs: commit super ret %d\n", ret);
-	}
+	if (fs_info->fs_state & BTRFS_SUPER_FLAG_ERROR)
+		btrfs_error_commit_super(root);
 
 	btrfs_put_block_group_cache(fs_info);
 
@@ -3437,18 +3420,11 @@ static int btrfs_check_super_valid(struct btrfs_fs_info
*fs_info,
 	if (read_only)
 		return 0;
 
-	if (fs_info->fs_state & BTRFS_SUPER_FLAG_ERROR) {
-		printk(KERN_WARNING "warning: mount fs with errors, "
-		       "running btrfsck is recommended\n");
-	}
-
 	return 0;
 }
 
-int btrfs_error_commit_super(struct btrfs_root *root)
+void btrfs_error_commit_super(struct btrfs_root *root)
 {
-	int ret;
-
 	mutex_lock(&root->fs_info->cleaner_mutex);
 	btrfs_run_delayed_iputs(root);
 	mutex_unlock(&root->fs_info->cleaner_mutex);
@@ -3458,10 +3434,6 @@ int btrfs_error_commit_super(struct btrfs_root *root)
 
 	/* cleanup FS via transaction */
 	btrfs_cleanup_transaction(root);
-
-	ret = write_ctree_super(NULL, root, 0);
-
-	return ret;
 }
 
 static void btrfs_destroy_ordered_operations(struct btrfs_root *root)
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index 95e147e..c5b00a7 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -54,7 +54,7 @@ int write_ctree_super(struct btrfs_trans_handle *trans,
 		      struct btrfs_root *root, int max_mirrors);
 struct buffer_head *btrfs_read_dev_super(struct block_device *bdev);
 int btrfs_commit_super(struct btrfs_root *root);
-int btrfs_error_commit_super(struct btrfs_root *root);
+void btrfs_error_commit_super(struct btrfs_root *root);
 struct extent_buffer *btrfs_find_tree_block(struct btrfs_root *root,
 					    u64 bytenr, u32 blocksize);
 struct btrfs_root *btrfs_read_fs_root_no_radix(struct btrfs_root *tree_root,
-- 
1.7.11.4

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Liu Bo

2012-Aug-01 12:02 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On 08/01/2012 07:45 PM, Stefan Behrens wrote:> With commit acce952b0, btrfs was changed to flag the filesystem with
> BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
> error happened like a write I/O errors of all mirrors.
> In such situations, on unmount, the superblock is written in
> btrfs_error_commit_super(). This is done with the intention to be able
> to evaluate the error flag on the next mount. A warning is printed
> in this case during the next mount and the log tree is ignored.
> 
> The issue is that it is possible that the superblock points to a root
> that was not written (due to write I/O errors).
> The result is that the filesystem cannot be mounted. btrfsck also does
> not start and all the other btrfs-progs tools fail to start as well.
> However, mount -o recovery is working well and does the right things
> to recover the filesystem (i.e., don''t use the log root, clear the
> free space cache and use the next mountable root that is stored in the
> root backup array).
> 
> This patch removes the writing of the superblock when
> BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error
> flag in the mount function.
> 
Yes, I have to admit that this can be a serious problem.

But we''ll need to send the error flag stored in the super block into
disk in the future so that the next mount can find it unstable and do
fsck by itself maybe.

If we do not write the super any more, it cannot find the flag.

So can we find a way to make both things happy?

thanks,
liubo
> These lines can be used to reproduce the issue (using /dev/sdm):
> SCRATCH_DEV=/dev/sdm
> SCRATCH_MNT=/mnt
> echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup create foo
> ls -alLF /dev/mapper/foo
> mkfs.btrfs /dev/mapper/foo
> mount /dev/mapper/foo $SCRATCH_MNT
> echo bar > $SCRATCH_MNT/foo
> sync
> echo 0 25165824 error | dmsetup reload foo
> dmsetup resume foo
> ls -alF $SCRATCH_MNT
> touch $SCRATCH_MNT/1
> ls -alF $SCRATCH_MNT
> sleep 35
> echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup reload foo
> dmsetup resume foo
> sleep 1
> umount $SCRATCH_MNT
> btrfsck /dev/mapper/foo
> dmsetup remove foo
> 
> Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
> Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
> ---
> Changes v1 -> v2:
> - Removed one large comment entirely which was not needed anymore
>   after the code changes.
> - Removed a warning message which was dead code since the super block
>   is not written anymore when the flag BTRFS_SUPER_FLAG_ERROR is set.
> - Removed the return code of btrfs_error_commit_super() since it now
>   does not return any errors anymore.
> 
>  fs/btrfs/disk-io.c | 36 ++++--------------------------------
>  fs/btrfs/disk-io.h |  2 +-
>  2 files changed, 5 insertions(+), 33 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 502b20c..f6a1d0f 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -2531,8 +2531,7 @@ retry_root_backup:
>  		goto fail_trans_kthread;
>  
>  	/* do not make disk changes in broken FS */
> -	if (btrfs_super_log_root(disk_super) != 0 &&
> -	    !(fs_info->fs_state & BTRFS_SUPER_FLAG_ERROR)) {
> +	if (btrfs_super_log_root(disk_super) != 0) {
>  		u64 bytenr = btrfs_super_log_root(disk_super);
>  
>  		if (fs_devices->rw_devices == 0) {
> @@ -3192,30 +3191,14 @@ int close_ctree(struct btrfs_root *root)
>  	/* clear out the rbtree of defraggable inodes */
>  	btrfs_run_defrag_inodes(fs_info);
>  
> -	/*
> -	 * Here come 2 situations when btrfs is broken to flip readonly:
> -	 *
> -	 * 1. when btrfs flips readonly somewhere else before
> -	 * btrfs_commit_super, sb->s_flags has MS_RDONLY flag,
> -	 * and btrfs will skip to write sb directly to keep
> -	 * ERROR state on disk.
> -	 *
> -	 * 2. when btrfs flips readonly just in btrfs_commit_super,
> -	 * and in such case, btrfs cannot write sb via btrfs_commit_super,
> -	 * and since fs_state has been set BTRFS_SUPER_FLAG_ERROR flag,
> -	 * btrfs will cleanup all FS resources first and write sb then.
> -	 */
>  	if (!(fs_info->sb->s_flags & MS_RDONLY)) {
>  		ret = btrfs_commit_super(root);
>  		if (ret)
>  			printk(KERN_ERR "btrfs: commit super ret %d\n", ret);
>  	}
>  
> -	if (fs_info->fs_state & BTRFS_SUPER_FLAG_ERROR) {
> -		ret = btrfs_error_commit_super(root);
> -		if (ret)
> -			printk(KERN_ERR "btrfs: commit super ret %d\n", ret);
> -	}
> +	if (fs_info->fs_state & BTRFS_SUPER_FLAG_ERROR)
> +		btrfs_error_commit_super(root);
>  
>  	btrfs_put_block_group_cache(fs_info);
>  
> @@ -3437,18 +3420,11 @@ static int btrfs_check_super_valid(struct
btrfs_fs_info *fs_info,
>  	if (read_only)
>  		return 0;
>  
> -	if (fs_info->fs_state & BTRFS_SUPER_FLAG_ERROR) {
> -		printk(KERN_WARNING "warning: mount fs with errors, "
> -		       "running btrfsck is recommended\n");
> -	}
> -
>  	return 0;
>  }
>  
> -int btrfs_error_commit_super(struct btrfs_root *root)
> +void btrfs_error_commit_super(struct btrfs_root *root)
>  {
> -	int ret;
> -
>  	mutex_lock(&root->fs_info->cleaner_mutex);
>  	btrfs_run_delayed_iputs(root);
>  	mutex_unlock(&root->fs_info->cleaner_mutex);
> @@ -3458,10 +3434,6 @@ int btrfs_error_commit_super(struct btrfs_root
*root)
>  
>  	/* cleanup FS via transaction */
>  	btrfs_cleanup_transaction(root);
> -
> -	ret = write_ctree_super(NULL, root, 0);
> -
> -	return ret;
>  }
>  
>  static void btrfs_destroy_ordered_operations(struct btrfs_root *root)
> diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
> index 95e147e..c5b00a7 100644
> --- a/fs/btrfs/disk-io.h
> +++ b/fs/btrfs/disk-io.h
> @@ -54,7 +54,7 @@ int write_ctree_super(struct btrfs_trans_handle *trans,
>  		      struct btrfs_root *root, int max_mirrors);
>  struct buffer_head *btrfs_read_dev_super(struct block_device *bdev);
>  int btrfs_commit_super(struct btrfs_root *root);
> -int btrfs_error_commit_super(struct btrfs_root *root);
> +void btrfs_error_commit_super(struct btrfs_root *root);
>  struct extent_buffer *btrfs_find_tree_block(struct btrfs_root *root,
>  					    u64 bytenr, u32 blocksize);
>  struct btrfs_root *btrfs_read_fs_root_no_radix(struct btrfs_root
*tree_root,
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Jan Schmidt

2012-Aug-01 13:07 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On Wed, August 01, 2012 at 14:02 (+0200), Liu Bo wrote:> On 08/01/2012 07:45 PM, Stefan Behrens wrote:
>> With commit acce952b0, btrfs was changed to flag the filesystem with
>> BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
>> error happened like a write I/O errors of all mirrors.
>> In such situations, on unmount, the superblock is written in
>> btrfs_error_commit_super(). This is done with the intention to be able
>> to evaluate the error flag on the next mount. A warning is printed
>> in this case during the next mount and the log tree is ignored.
>>
>> The issue is that it is possible that the superblock points to a root
>> that was not written (due to write I/O errors).
>> The result is that the filesystem cannot be mounted. btrfsck also does
>> not start and all the other btrfs-progs tools fail to start as well.
>> However, mount -o recovery is working well and does the right things
>> to recover the filesystem (i.e., don''t use the log root, clear
the
>> free space cache and use the next mountable root that is stored in the
>> root backup array).
>>
>> This patch removes the writing of the superblock when
>> BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error
>> flag in the mount function.
>>
> 
> Yes, I have to admit that this can be a serious problem.
> 
> But we''ll need to send the error flag stored in the super block
into
> disk in the future so that the next mount can find it unstable and do
> fsck by itself maybe.
Hum, that''s possible. However, I neither see

a) a safe way to get that flag to disk

nor

b) a situation where this flag would help. When we abort a transaction, we just
roll everything back to the last commit, i.e. a consistent state. So if we stop
writing a potentially corrupt super block, we should be fine anyway. Or am I
missing something?

-Jan
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Liu Bo

2012-Aug-01 13:31 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On 08/01/2012 09:07 PM, Jan Schmidt wrote:> On Wed, August 01, 2012 at 14:02 (+0200), Liu Bo wrote:
>> On 08/01/2012 07:45 PM, Stefan Behrens wrote:
>>> With commit acce952b0, btrfs was changed to flag the filesystem
with
>>> BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
>>> error happened like a write I/O errors of all mirrors.
>>> In such situations, on unmount, the superblock is written in
>>> btrfs_error_commit_super(). This is done with the intention to be
able
>>> to evaluate the error flag on the next mount. A warning is printed
>>> in this case during the next mount and the log tree is ignored.
>>>
>>> The issue is that it is possible that the superblock points to a
root
>>> that was not written (due to write I/O errors).
>>> The result is that the filesystem cannot be mounted. btrfsck also
does
>>> not start and all the other btrfs-progs tools fail to start as
well.
>>> However, mount -o recovery is working well and does the right
things
>>> to recover the filesystem (i.e., don''t use the log root,
clear the
>>> free space cache and use the next mountable root that is stored in
the
>>> root backup array).
>>>
>>> This patch removes the writing of the superblock when
>>> BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the
error
>>> flag in the mount function.
>>>
>>
>> Yes, I have to admit that this can be a serious problem.
>>
>> But we''ll need to send the error flag stored in the super
block into
>> disk in the future so that the next mount can find it unstable and do
>> fsck by itself maybe.
> 
> Hum, that''s possible. However, I neither see
> 
> a) a safe way to get that flag to disk
> 
> nor
> 
> b) a situation where this flag would help. When we abort a transaction, we
just
> roll everything back to the last commit, i.e. a consistent state. So if we
stop
> writing a potentially corrupt super block, we should be fine anyway. Or am
I
> missing something?
> 
I''m just wondering if we can roll everything back well, why do we need
fsck?

thanks,
liubo
> -Jan
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Arne Jansen

2012-Aug-01 13:56 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On 01.08.2012 15:31, Liu Bo wrote:> On 08/01/2012 09:07 PM, Jan Schmidt wrote:
>> On Wed, August 01, 2012 at 14:02 (+0200), Liu Bo wrote:
>>> On 08/01/2012 07:45 PM, Stefan Behrens wrote:
>>>> With commit acce952b0, btrfs was changed to flag the filesystem
with
>>>> BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a
fatal
>>>> error happened like a write I/O errors of all mirrors.
>>>> In such situations, on unmount, the superblock is written in
>>>> btrfs_error_commit_super(). This is done with the intention to
be able
>>>> to evaluate the error flag on the next mount. A warning is
printed
>>>> in this case during the next mount and the log tree is ignored.
>>>>
>>>> The issue is that it is possible that the superblock points to
a root
>>>> that was not written (due to write I/O errors).
>>>> The result is that the filesystem cannot be mounted. btrfsck
also does
>>>> not start and all the other btrfs-progs tools fail to start as
well.
>>>> However, mount -o recovery is working well and does the right
things
>>>> to recover the filesystem (i.e., don''t use the log
root, clear the
>>>> free space cache and use the next mountable root that is stored
in the
>>>> root backup array).
>>>>
>>>> This patch removes the writing of the superblock when
>>>> BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the
error
>>>> flag in the mount function.
>>>>
>>>
>>> Yes, I have to admit that this can be a serious problem.
>>>
>>> But we''ll need to send the error flag stored in the super
block into
>>> disk in the future so that the next mount can find it unstable and
do
>>> fsck by itself maybe.
>>
>> Hum, that''s possible. However, I neither see
>>
>> a) a safe way to get that flag to disk
>>
>> nor
>>
>> b) a situation where this flag would help. When we abort a transaction,
we just
>> roll everything back to the last commit, i.e. a consistent state. So if
we stop
>> writing a potentially corrupt super block, we should be fine anyway. Or
am I
>> missing something?
>>
> 
> I''m just wondering if we can roll everything back well, why do we
need fsck?
Mostly for undetected errors.
> 
> thanks,
> liubo
> 
>> -Jan
>>
> 
> --
> To unsubscribe from this list: send the line "unsubscribe
linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Stefan Behrens

2012-Aug-01 14:31 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On Wed, 01 Aug 2012 21:31:58 +0800, Liu Bo wrote:> On 08/01/2012 09:07 PM, Jan Schmidt wrote:
>> On Wed, August 01, 2012 at 14:02 (+0200), Liu Bo wrote:
>>> On 08/01/2012 07:45 PM, Stefan Behrens wrote:
>>>> With commit acce952b0, btrfs was changed to flag the filesystem
with
>>>> BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a
fatal
>>>> error happened like a write I/O errors of all mirrors.
>>>> In such situations, on unmount, the superblock is written in
>>>> btrfs_error_commit_super(). This is done with the intention to
be able
>>>> to evaluate the error flag on the next mount. A warning is
printed
>>>> in this case during the next mount and the log tree is ignored.
>>>>
>>>> The issue is that it is possible that the superblock points to
a root
>>>> that was not written (due to write I/O errors).
>>>> The result is that the filesystem cannot be mounted. btrfsck
also does
>>>> not start and all the other btrfs-progs tools fail to start as
well.
>>>> However, mount -o recovery is working well and does the right
things
>>>> to recover the filesystem (i.e., don''t use the log
root, clear the
>>>> free space cache and use the next mountable root that is stored
in the
>>>> root backup array).
>>>>
>>>> This patch removes the writing of the superblock when
>>>> BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the
error
>>>> flag in the mount function.
>>>>
>>>
>>> Yes, I have to admit that this can be a serious problem.
>>>
>>> But we''ll need to send the error flag stored in the super
block into
>>> disk in the future so that the next mount can find it unstable and
do
>>> fsck by itself maybe.
>>
>> Hum, that''s possible. However, I neither see
>>
>> a) a safe way to get that flag to disk
>>
>> nor
>>
>> b) a situation where this flag would help. When we abort a transaction,
we just
>> roll everything back to the last commit, i.e. a consistent state. So if
we stop
>> writing a potentially corrupt super block, we should be fine anyway. Or
am I
>> missing something?
>>
> 
> I''m just wondering if we can roll everything back well, why do we
need fsck?
If the disks support barriers, we roll everything back very well. The
most recent superblock on the disks always defines a consistent
filesystem state. There are only two remaining filesystem consistency
issues left that can cause inconsistent states, one is the one that the
patch in this email addresses, and the second one is that the error
result from barrier_all_devices() is ignored (which I want to change next).
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Stefan Behrens

2012-Aug-02 10:30 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On Wed, 01 Aug 2012 16:31:54 +0200, Stefan Behrens
wrote:> On Wed, 01 Aug 2012 21:31:58 +0800, Liu Bo wrote:
>> On 08/01/2012 09:07 PM, Jan Schmidt wrote:
>>> On Wed, August 01, 2012 at 14:02 (+0200), Liu Bo wrote:
>>>> On 08/01/2012 07:45 PM, Stefan Behrens wrote:
>>>>> With commit acce952b0, btrfs was changed to flag the
filesystem with
>>>>> BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a
fatal
>>>>> error happened like a write I/O errors of all mirrors.
>>>>> In such situations, on unmount, the superblock is written
in
>>>>> btrfs_error_commit_super(). This is done with the intention
to be able
>>>>> to evaluate the error flag on the next mount. A warning is
printed
>>>>> in this case during the next mount and the log tree is
ignored.
>>>>>
>>>>> The issue is that it is possible that the superblock points
to a root
>>>>> that was not written (due to write I/O errors).
>>>>> The result is that the filesystem cannot be mounted.
btrfsck also does
>>>>> not start and all the other btrfs-progs tools fail to start
as well.
>>>>> However, mount -o recovery is working well and does the
right things
>>>>> to recover the filesystem (i.e., don''t use the log
root, clear the
>>>>> free space cache and use the next mountable root that is
stored in the
>>>>> root backup array).
>>>>>
>>>>> This patch removes the writing of the superblock when
>>>>> BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of
the error
>>>>> flag in the mount function.
>>>>>
>>>>
>>>> Yes, I have to admit that this can be a serious problem.
>>>>
>>>> But we''ll need to send the error flag stored in the
super block into
>>>> disk in the future so that the next mount can find it unstable
and do
>>>> fsck by itself maybe.
>>>
>>> Hum, that''s possible. However, I neither see
>>>
>>> a) a safe way to get that flag to disk
>>>
>>> nor
>>>
>>> b) a situation where this flag would help. When we abort a
transaction, we just
>>> roll everything back to the last commit, i.e. a consistent state.
So if we stop
>>> writing a potentially corrupt super block, we should be fine
anyway. Or am I
>>> missing something?
>>>
>>
>> I''m just wondering if we can roll everything back well, why do
we need fsck?
> 
> If the disks support barriers, we roll everything back very well. The
> most recent superblock on the disks always defines a consistent
> filesystem state. There are only two remaining filesystem consistency
> issues left that can cause inconsistent states, one is the one that the
> patch in this email addresses, and the second one is that the error
> result from barrier_all_devices() is ignored (which I want to change next).
Hi Liu Bo,

Do you have any remaining objections to that patch?
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Liu Bo

2012-Aug-02 10:36 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On 08/02/2012 06:30 PM, Stefan Behrens wrote:> On Wed, 01 Aug 2012 16:31:54 +0200, Stefan Behrens wrote:
>> On Wed, 01 Aug 2012 21:31:58 +0800, Liu Bo wrote:
>>> On 08/01/2012 09:07 PM, Jan Schmidt wrote:
>>>> On Wed, August 01, 2012 at 14:02 (+0200), Liu Bo wrote:
>>>>> On 08/01/2012 07:45 PM, Stefan Behrens wrote:
>>>>>> With commit acce952b0, btrfs was changed to flag the
filesystem with
>>>>>> BTRFS_SUPER_FLAG_ERROR and switch to read-only mode
after a fatal
>>>>>> error happened like a write I/O errors of all mirrors.
>>>>>> In such situations, on unmount, the superblock is
written in
>>>>>> btrfs_error_commit_super(). This is done with the
intention to be able
>>>>>> to evaluate the error flag on the next mount. A warning
is printed
>>>>>> in this case during the next mount and the log tree is
ignored.
>>>>>>
>>>>>> The issue is that it is possible that the superblock
points to a root
>>>>>> that was not written (due to write I/O errors).
>>>>>> The result is that the filesystem cannot be mounted.
btrfsck also does
>>>>>> not start and all the other btrfs-progs tools fail to
start as well.
>>>>>> However, mount -o recovery is working well and does the
right things
>>>>>> to recover the filesystem (i.e., don''t use the
log root, clear the
>>>>>> free space cache and use the next mountable root that
is stored in the
>>>>>> root backup array).
>>>>>>
>>>>>> This patch removes the writing of the superblock when
>>>>>> BTRFS_SUPER_FLAG_ERROR is set, and removes the handling
of the error
>>>>>> flag in the mount function.
>>>>>>
>>>>>
>>>>> Yes, I have to admit that this can be a serious problem.
>>>>>
>>>>> But we''ll need to send the error flag stored in
the super block into
>>>>> disk in the future so that the next mount can find it
unstable and do
>>>>> fsck by itself maybe.
>>>>
>>>> Hum, that''s possible. However, I neither see
>>>>
>>>> a) a safe way to get that flag to disk
>>>>
>>>> nor
>>>>
>>>> b) a situation where this flag would help. When we abort a
transaction, we just
>>>> roll everything back to the last commit, i.e. a consistent
state. So if we stop
>>>> writing a potentially corrupt super block, we should be fine
anyway. Or am I
>>>> missing something?
>>>>
>>>
>>> I''m just wondering if we can roll everything back well,
why do we need fsck?
>>
>> If the disks support barriers, we roll everything back very well. The
>> most recent superblock on the disks always defines a consistent
>> filesystem state. There are only two remaining filesystem consistency
>> issues left that can cause inconsistent states, one is the one that the
>> patch in this email addresses, and the second one is that the error
>> result from barrier_all_devices() is ignored (which I want to change
next).
> 
> Hi Liu Bo,
> 
> Do you have any remaining objections to that patch?
> 
Hi Stefan,

Still I have another question:

Our metadata can be flushed into disk if we reach the limit, 32k, so we
can end up with updated metadata and the latest superblock if we do not
write the current super block.

Any ideas?

thanks,
liubo
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Arne Jansen

2012-Aug-02 11:18 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On 02.08.2012 12:36, Liu Bo wrote:> On 08/02/2012 06:30 PM, Stefan Behrens wrote:
>> On Wed, 01 Aug 2012 16:31:54 +0200, Stefan Behrens wrote:
>>> On Wed, 01 Aug 2012 21:31:58 +0800, Liu Bo wrote:
>>>> On 08/01/2012 09:07 PM, Jan Schmidt wrote:
>>>>> On Wed, August 01, 2012 at 14:02 (+0200), Liu Bo wrote:
>>>>>> On 08/01/2012 07:45 PM, Stefan Behrens wrote:
>>>>>>> With commit acce952b0, btrfs was changed to flag
the filesystem with
>>>>>>> BTRFS_SUPER_FLAG_ERROR and switch to read-only mode
after a fatal
>>>>>>> error happened like a write I/O errors of all
mirrors.
>>>>>>> In such situations, on unmount, the superblock is
written in
>>>>>>> btrfs_error_commit_super(). This is done with the
intention to be able
>>>>>>> to evaluate the error flag on the next mount. A
warning is printed
>>>>>>> in this case during the next mount and the log tree
is ignored.
>>>>>>>
>>>>>>> The issue is that it is possible that the
superblock points to a root
>>>>>>> that was not written (due to write I/O errors).
>>>>>>> The result is that the filesystem cannot be
mounted. btrfsck also does
>>>>>>> not start and all the other btrfs-progs tools fail
to start as well.
>>>>>>> However, mount -o recovery is working well and does
the right things
>>>>>>> to recover the filesystem (i.e., don''t use
the log root, clear the
>>>>>>> free space cache and use the next mountable root
that is stored in the
>>>>>>> root backup array).
>>>>>>>
>>>>>>> This patch removes the writing of the superblock
when
>>>>>>> BTRFS_SUPER_FLAG_ERROR is set, and removes the
handling of the error
>>>>>>> flag in the mount function.
>>>>>>>
>>>>>>
>>>>>> Yes, I have to admit that this can be a serious
problem.
>>>>>>
>>>>>> But we''ll need to send the error flag stored
in the super block into
>>>>>> disk in the future so that the next mount can find it
unstable and do
>>>>>> fsck by itself maybe.
>>>>>
>>>>> Hum, that''s possible. However, I neither see
>>>>>
>>>>> a) a safe way to get that flag to disk
>>>>>
>>>>> nor
>>>>>
>>>>> b) a situation where this flag would help. When we abort a
transaction, we just
>>>>> roll everything back to the last commit, i.e. a consistent
state. So if we stop
>>>>> writing a potentially corrupt super block, we should be
fine anyway. Or am I
>>>>> missing something?
>>>>>
>>>>
>>>> I''m just wondering if we can roll everything back
well, why do we need fsck?
>>>
>>> If the disks support barriers, we roll everything back very well.
The
>>> most recent superblock on the disks always defines a consistent
>>> filesystem state. There are only two remaining filesystem
consistency
>>> issues left that can cause inconsistent states, one is the one that
the
>>> patch in this email addresses, and the second one is that the error
>>> result from barrier_all_devices() is ignored (which I want to
change next).
>>
>> Hi Liu Bo,
>>
>> Do you have any remaining objections to that patch?
>>
> 
> Hi Stefan,
> 
> Still I have another question:
> 
> Our metadata can be flushed into disk if we reach the limit, 32k, so we
> can end up with updated metadata and the latest superblock if we do not
> write the current super block.
The old metadata stays valid until the new superblock is written,
so no problem here, or maybe I don''t understand your question :)
> 
> Any ideas?
> 
> thanks,
> liubo
> --
> To unsubscribe from this list: send the line "unsubscribe
linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Liu Bo

2012-Aug-02 11:34 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On 08/02/2012 07:18 PM, Arne Jansen wrote:> On 02.08.2012 12:36, Liu Bo wrote:
>> On 08/02/2012 06:30 PM, Stefan Behrens wrote:
>>> On Wed, 01 Aug 2012 16:31:54 +0200, Stefan Behrens wrote:
>>>> On Wed, 01 Aug 2012 21:31:58 +0800, Liu Bo wrote:
>>>>> On 08/01/2012 09:07 PM, Jan Schmidt wrote:
>>>>>> On Wed, August 01, 2012 at 14:02 (+0200), Liu Bo wrote:
>>>>>>> On 08/01/2012 07:45 PM, Stefan Behrens wrote:
>>>>>>>> With commit acce952b0, btrfs was changed to
flag the filesystem with
>>>>>>>> BTRFS_SUPER_FLAG_ERROR and switch to read-only
mode after a fatal
>>>>>>>> error happened like a write I/O errors of all
mirrors.
>>>>>>>> In such situations, on unmount, the superblock
is written in
>>>>>>>> btrfs_error_commit_super(). This is done with
the intention to be able
>>>>>>>> to evaluate the error flag on the next mount. A
warning is printed
>>>>>>>> in this case during the next mount and the log
tree is ignored.
>>>>>>>>
>>>>>>>> The issue is that it is possible that the
superblock points to a root
>>>>>>>> that was not written (due to write I/O errors).
>>>>>>>> The result is that the filesystem cannot be
mounted. btrfsck also does
>>>>>>>> not start and all the other btrfs-progs tools
fail to start as well.
>>>>>>>> However, mount -o recovery is working well and
does the right things
>>>>>>>> to recover the filesystem (i.e., don''t
use the log root, clear the
>>>>>>>> free space cache and use the next mountable
root that is stored in the
>>>>>>>> root backup array).
>>>>>>>>
>>>>>>>> This patch removes the writing of the
superblock when
>>>>>>>> BTRFS_SUPER_FLAG_ERROR is set, and removes the
handling of the error
>>>>>>>> flag in the mount function.
>>>>>>>>
>>>>>>>
>>>>>>> Yes, I have to admit that this can be a serious
problem.
>>>>>>>
>>>>>>> But we''ll need to send the error flag
stored in the super block into
>>>>>>> disk in the future so that the next mount can find
it unstable and do
>>>>>>> fsck by itself maybe.
>>>>>>
>>>>>> Hum, that''s possible. However, I neither see
>>>>>>
>>>>>> a) a safe way to get that flag to disk
>>>>>>
>>>>>> nor
>>>>>>
>>>>>> b) a situation where this flag would help. When we
abort a transaction, we just
>>>>>> roll everything back to the last commit, i.e. a
consistent state. So if we stop
>>>>>> writing a potentially corrupt super block, we should be
fine anyway. Or am I
>>>>>> missing something?
>>>>>>
>>>>>
>>>>> I''m just wondering if we can roll everything back
well, why do we need fsck?
>>>>
>>>> If the disks support barriers, we roll everything back very
well. The
>>>> most recent superblock on the disks always defines a consistent
>>>> filesystem state. There are only two remaining filesystem
consistency
>>>> issues left that can cause inconsistent states, one is the one
that the
>>>> patch in this email addresses, and the second one is that the
error
>>>> result from barrier_all_devices() is ignored (which I want to
change next).
>>>
>>> Hi Liu Bo,
>>>
>>> Do you have any remaining objections to that patch?
>>>
>>
>> Hi Stefan,
>>
>> Still I have another question:
>>
>> Our metadata can be flushed into disk if we reach the limit, 32k, so we
>> can end up with updated metadata and the latest superblock if we do not
>> write the current super block.
> 
> The old metadata stays valid until the new superblock is written,
> so no problem here, or maybe I don''t understand your question :)
> 
Yeah, Arne, you''re right :)

But for undetected and unexpected errors as Arne had mentioned,  I want
to keep the error flag which is able to inform users that this FS is
recommended (but not must) to do fsck at least.

thanks,
liubo
>>
>> Any ideas?
>>
>> thanks,
>> liubo
>> --
>> To unsubscribe from this list: send the line "unsubscribe
linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Arne Jansen

2012-Aug-02 11:40 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On 02.08.2012 13:34, Liu Bo wrote:> On 08/02/2012 07:18 PM, Arne Jansen wrote:
>> On 02.08.2012 12:36, Liu Bo wrote:
>>> On 08/02/2012 06:30 PM, Stefan Behrens wrote:
>>>> On Wed, 01 Aug 2012 16:31:54 +0200, Stefan Behrens wrote:
>>>>> On Wed, 01 Aug 2012 21:31:58 +0800, Liu Bo wrote:
>>>>>> On 08/01/2012 09:07 PM, Jan Schmidt wrote:
>>>>>>> On Wed, August 01, 2012 at 14:02 (+0200), Liu Bo
wrote:
>>>>>>>> On 08/01/2012 07:45 PM, Stefan Behrens wrote:
>>>>>>>>> With commit acce952b0, btrfs was changed to
flag the filesystem with
>>>>>>>>> BTRFS_SUPER_FLAG_ERROR and switch to
read-only mode after a fatal
>>>>>>>>> error happened like a write I/O errors of
all mirrors.
>>>>>>>>> In such situations, on unmount, the
superblock is written in
>>>>>>>>> btrfs_error_commit_super(). This is done
with the intention to be able
>>>>>>>>> to evaluate the error flag on the next
mount. A warning is printed
>>>>>>>>> in this case during the next mount and the
log tree is ignored.
>>>>>>>>>
>>>>>>>>> The issue is that it is possible that the
superblock points to a root
>>>>>>>>> that was not written (due to write I/O
errors).
>>>>>>>>> The result is that the filesystem cannot be
mounted. btrfsck also does
>>>>>>>>> not start and all the other btrfs-progs
tools fail to start as well.
>>>>>>>>> However, mount -o recovery is working well
and does the right things
>>>>>>>>> to recover the filesystem (i.e.,
don''t use the log root, clear the
>>>>>>>>> free space cache and use the next mountable
root that is stored in the
>>>>>>>>> root backup array).
>>>>>>>>>
>>>>>>>>> This patch removes the writing of the
superblock when
>>>>>>>>> BTRFS_SUPER_FLAG_ERROR is set, and removes
the handling of the error
>>>>>>>>> flag in the mount function.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, I have to admit that this can be a serious
problem.
>>>>>>>>
>>>>>>>> But we''ll need to send the error flag
stored in the super block into
>>>>>>>> disk in the future so that the next mount can
find it unstable and do
>>>>>>>> fsck by itself maybe.
>>>>>>>
>>>>>>> Hum, that''s possible. However, I neither
see
>>>>>>>
>>>>>>> a) a safe way to get that flag to disk
>>>>>>>
>>>>>>> nor
>>>>>>>
>>>>>>> b) a situation where this flag would help. When we
abort a transaction, we just
>>>>>>> roll everything back to the last commit, i.e. a
consistent state. So if we stop
>>>>>>> writing a potentially corrupt super block, we
should be fine anyway. Or am I
>>>>>>> missing something?
>>>>>>>
>>>>>>
>>>>>> I''m just wondering if we can roll everything
back well, why do we need fsck?
>>>>>
>>>>> If the disks support barriers, we roll everything back very
well. The
>>>>> most recent superblock on the disks always defines a
consistent
>>>>> filesystem state. There are only two remaining filesystem
consistency
>>>>> issues left that can cause inconsistent states, one is the
one that the
>>>>> patch in this email addresses, and the second one is that
the error
>>>>> result from barrier_all_devices() is ignored (which I want
to change next).
>>>>
>>>> Hi Liu Bo,
>>>>
>>>> Do you have any remaining objections to that patch?
>>>>
>>>
>>> Hi Stefan,
>>>
>>> Still I have another question:
>>>
>>> Our metadata can be flushed into disk if we reach the limit, 32k,
so we
>>> can end up with updated metadata and the latest superblock if we do
not
>>> write the current super block.
>>
>> The old metadata stays valid until the new superblock is written,
>> so no problem here, or maybe I don''t understand your question
:)
>>
> 
> Yeah, Arne, you''re right :)
> 
> But for undetected and unexpected errors as Arne had mentioned,  I want
> to keep the error flag which is able to inform users that this FS is
> recommended (but not must) to do fsck at least.
How about storing the flag in a different location than the superblock?
If the fs is in an unknown state, every write potentially makes it only
worse.
> 
> thanks,
> liubo
> 
>>>
>>> Any ideas?
>>>
>>> thanks,
>>> liubo
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe
linux-btrfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> 
> --
> To unsubscribe from this list: send the line "unsubscribe
linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Liu Bo

2012-Aug-02 11:57 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On 08/02/2012 07:40 PM, Arne Jansen wrote:> On 02.08.2012 13:34, Liu Bo wrote:
>> On 08/02/2012 07:18 PM, Arne Jansen wrote:
>>> On 02.08.2012 12:36, Liu Bo wrote:
>>>> On 08/02/2012 06:30 PM, Stefan Behrens wrote:
>>>>> On Wed, 01 Aug 2012 16:31:54 +0200, Stefan Behrens wrote:
>>>>>> On Wed, 01 Aug 2012 21:31:58 +0800, Liu Bo wrote:
>>>>>>> On 08/01/2012 09:07 PM, Jan Schmidt wrote:
>>>>>>>> On Wed, August 01, 2012 at 14:02 (+0200), Liu
Bo wrote:
>>>>>>>>> On 08/01/2012 07:45 PM, Stefan Behrens
wrote:
>>>>>>>>>> With commit acce952b0, btrfs was
changed to flag the filesystem with
>>>>>>>>>> BTRFS_SUPER_FLAG_ERROR and switch to
read-only mode after a fatal
>>>>>>>>>> error happened like a write I/O errors
of all mirrors.
>>>>>>>>>> In such situations, on unmount, the
superblock is written in
>>>>>>>>>> btrfs_error_commit_super(). This is
done with the intention to be able
>>>>>>>>>> to evaluate the error flag on the next
mount. A warning is printed
>>>>>>>>>> in this case during the next mount and
the log tree is ignored.
>>>>>>>>>>
>>>>>>>>>> The issue is that it is possible that
the superblock points to a root
>>>>>>>>>> that was not written (due to write I/O
errors).
>>>>>>>>>> The result is that the filesystem
cannot be mounted. btrfsck also does
>>>>>>>>>> not start and all the other btrfs-progs
tools fail to start as well.
>>>>>>>>>> However, mount -o recovery is working
well and does the right things
>>>>>>>>>> to recover the filesystem (i.e.,
don''t use the log root, clear the
>>>>>>>>>> free space cache and use the next
mountable root that is stored in the
>>>>>>>>>> root backup array).
>>>>>>>>>>
>>>>>>>>>> This patch removes the writing of the
superblock when
>>>>>>>>>> BTRFS_SUPER_FLAG_ERROR is set, and
removes the handling of the error
>>>>>>>>>> flag in the mount function.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes, I have to admit that this can be a
serious problem.
>>>>>>>>>
>>>>>>>>> But we''ll need to send the error
flag stored in the super block into
>>>>>>>>> disk in the future so that the next mount
can find it unstable and do
>>>>>>>>> fsck by itself maybe.
>>>>>>>>
>>>>>>>> Hum, that''s possible. However, I
neither see
>>>>>>>>
>>>>>>>> a) a safe way to get that flag to disk
>>>>>>>>
>>>>>>>> nor
>>>>>>>>
>>>>>>>> b) a situation where this flag would help. When
we abort a transaction, we just
>>>>>>>> roll everything back to the last commit, i.e. a
consistent state. So if we stop
>>>>>>>> writing a potentially corrupt super block, we
should be fine anyway. Or am I
>>>>>>>> missing something?
>>>>>>>>
>>>>>>>
>>>>>>> I''m just wondering if we can roll
everything back well, why do we need fsck?
>>>>>>
>>>>>> If the disks support barriers, we roll everything back
very well. The
>>>>>> most recent superblock on the disks always defines a
consistent
>>>>>> filesystem state. There are only two remaining
filesystem consistency
>>>>>> issues left that can cause inconsistent states, one is
the one that the
>>>>>> patch in this email addresses, and the second one is
that the error
>>>>>> result from barrier_all_devices() is ignored (which I
want to change next).
>>>>>
>>>>> Hi Liu Bo,
>>>>>
>>>>> Do you have any remaining objections to that patch?
>>>>>
>>>>
>>>> Hi Stefan,
>>>>
>>>> Still I have another question:
>>>>
>>>> Our metadata can be flushed into disk if we reach the limit,
32k, so we
>>>> can end up with updated metadata and the latest superblock if
we do not
>>>> write the current super block.
>>>
>>> The old metadata stays valid until the new superblock is written,
>>> so no problem here, or maybe I don''t understand your
question :)
>>>
>>
>> Yeah, Arne, you''re right :)
>>
>> But for undetected and unexpected errors as Arne had mentioned,  I want
>> to keep the error flag which is able to inform users that this FS is
>> recommended (but not must) to do fsck at least.
> 
> How about storing the flag in a different location than the superblock?
> If the fs is in an unknown state, every write potentially makes it only
> worse.
> 
IMO it does not make sense if we don''t write the flag into disk, and on
ext4''s side, it just tries to write the super block.

Anyway, for now, our error flag has only been stored in memory, so what
about just keep it until we find a graceful way?


thanks,
liubo

>>
>> thanks,
>> liubo
>>
>>>>
>>>> Any ideas?
>>>>
>>>> thanks,
>>>> liubo
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe
linux-btrfs" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at 
http://vger.kernel.org/majordomo-info.html
>>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe
linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Arne Jansen

2012-Aug-02 13:46 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On 02.08.2012 13:57, Liu Bo wrote:> On 08/02/2012 07:40 PM, Arne Jansen wrote:
>> On 02.08.2012 13:34, Liu Bo wrote:
>>> On 08/02/2012 07:18 PM, Arne Jansen wrote:
>>>> On 02.08.2012 12:36, Liu Bo wrote:
>>>>> On 08/02/2012 06:30 PM, Stefan Behrens wrote:
>>>>>> On Wed, 01 Aug 2012 16:31:54 +0200, Stefan Behrens
wrote:
>>>>>>> On Wed, 01 Aug 2012 21:31:58 +0800, Liu Bo wrote:
>>>>>>>> On 08/01/2012 09:07 PM, Jan Schmidt wrote:
>>>>>>>>> On Wed, August 01, 2012 at 14:02 (+0200),
Liu Bo wrote:
>>>>>>>>>> On 08/01/2012 07:45 PM, Stefan Behrens
wrote:
>>>>>>>>>>> With commit acce952b0, btrfs was
changed to flag the filesystem with
>>>>>>>>>>> BTRFS_SUPER_FLAG_ERROR and switch
to read-only mode after a fatal
>>>>>>>>>>> error happened like a write I/O
errors of all mirrors.
>>>>>>>>>>> In such situations, on unmount, the
superblock is written in
>>>>>>>>>>> btrfs_error_commit_super(). This is
done with the intention to be able
>>>>>>>>>>> to evaluate the error flag on the
next mount. A warning is printed
>>>>>>>>>>> in this case during the next mount
and the log tree is ignored.
>>>>>>>>>>>
>>>>>>>>>>> The issue is that it is possible
that the superblock points to a root
>>>>>>>>>>> that was not written (due to write
I/O errors).
>>>>>>>>>>> The result is that the filesystem
cannot be mounted. btrfsck also does
>>>>>>>>>>> not start and all the other
btrfs-progs tools fail to start as well.
>>>>>>>>>>> However, mount -o recovery is
working well and does the right things
>>>>>>>>>>> to recover the filesystem (i.e.,
don''t use the log root, clear the
>>>>>>>>>>> free space cache and use the next
mountable root that is stored in the
>>>>>>>>>>> root backup array).
>>>>>>>>>>>
>>>>>>>>>>> This patch removes the writing of
the superblock when
>>>>>>>>>>> BTRFS_SUPER_FLAG_ERROR is set, and
removes the handling of the error
>>>>>>>>>>> flag in the mount function.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yes, I have to admit that this can be a
serious problem.
>>>>>>>>>>
>>>>>>>>>> But we''ll need to send the
error flag stored in the super block into
>>>>>>>>>> disk in the future so that the next
mount can find it unstable and do
>>>>>>>>>> fsck by itself maybe.
>>>>>>>>>
>>>>>>>>> Hum, that''s possible. However, I
neither see
>>>>>>>>>
>>>>>>>>> a) a safe way to get that flag to disk
>>>>>>>>>
>>>>>>>>> nor
>>>>>>>>>
>>>>>>>>> b) a situation where this flag would help.
When we abort a transaction, we just
>>>>>>>>> roll everything back to the last commit,
i.e. a consistent state. So if we stop
>>>>>>>>> writing a potentially corrupt super block,
we should be fine anyway. Or am I
>>>>>>>>> missing something?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I''m just wondering if we can roll
everything back well, why do we need fsck?
>>>>>>>
>>>>>>> If the disks support barriers, we roll everything
back very well. The
>>>>>>> most recent superblock on the disks always defines
a consistent
>>>>>>> filesystem state. There are only two remaining
filesystem consistency
>>>>>>> issues left that can cause inconsistent states, one
is the one that the
>>>>>>> patch in this email addresses, and the second one
is that the error
>>>>>>> result from barrier_all_devices() is ignored (which
I want to change next).
>>>>>>
>>>>>> Hi Liu Bo,
>>>>>>
>>>>>> Do you have any remaining objections to that patch?
>>>>>>
>>>>>
>>>>> Hi Stefan,
>>>>>
>>>>> Still I have another question:
>>>>>
>>>>> Our metadata can be flushed into disk if we reach the
limit, 32k, so we
>>>>> can end up with updated metadata and the latest superblock
if we do not
>>>>> write the current super block.
>>>>
>>>> The old metadata stays valid until the new superblock is
written,
>>>> so no problem here, or maybe I don''t understand your
question :)
>>>>
>>>
>>> Yeah, Arne, you''re right :)
>>>
>>> But for undetected and unexpected errors as Arne had mentioned,  I
want
>>> to keep the error flag which is able to inform users that this FS
is
>>> recommended (but not must) to do fsck at least.
>>
>> How about storing the flag in a different location than the superblock?
>> If the fs is in an unknown state, every write potentially makes it only
>> worse.
>>
> 
> IMO it does not make sense if we don''t write the flag into disk,
and on
> ext4''s side, it just tries to write the super block.
> 
> Anyway, for now, our error flag has only been stored in memory, so what
> about just keep it until we find a graceful way?
Yeah, we need this patch to restore consistency. We can define a fixed
area on disk (e.g. behind the superblock) where we can write the flag
to without risking the superblock.
> 
> 
> thanks,
> liubo
> 
> 
>>>
>>> thanks,
>>> liubo
>>>
>>>>>
>>>>> Any ideas?
>>>>>
>>>>> thanks,
>>>>> liubo
>>>>> --
>>>>> To unsubscribe from this list: send the line
"unsubscribe linux-btrfs" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at 
http://vger.kernel.org/majordomo-info.html
>>>>
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe
linux-btrfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> 
> --
> To unsubscribe from this list: send the line "unsubscribe
linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

David Sterba

2012-Aug-02 13:57 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On Thu, Aug 02, 2012 at 03:46:50PM +0200, Arne Jansen
wrote:> > Anyway, for now, our error flag has only been stored in memory, so
what
> > about just keep it until we find a graceful way?
> 
> Yeah, we need this patch to restore consistency. We can define a fixed
> area on disk (e.g. behind the superblock) where we can write the flag
> to without risking the superblock.
Is it possible that sectors around the superblock are somehow damaged
and unwritable that writing to them will fail as well? Ie. some safe
distance from the superblock would be needed.

david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Arne Jansen

2012-Aug-02 14:01 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On 02.08.2012 15:57, David Sterba wrote:> On Thu, Aug 02, 2012 at 03:46:50PM +0200, Arne Jansen wrote:
>>> Anyway, for now, our error flag has only been stored in memory, so
what
>>> about just keep it until we find a graceful way?
>>
>> Yeah, we need this patch to restore consistency. We can define a fixed
>> area on disk (e.g. behind the superblock) where we can write the flag
>> to without risking the superblock.
> 
> Is it possible that sectors around the superblock are somehow damaged
> and unwritable that writing to them will fail as well? Ie. some safe
> distance from the superblock would be needed.
In about 1MB distance you''re side-by-side with the superblock, so
closer
to it might even be safer ;)
> 
> david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Jan Schmidt

2012-Aug-02 14:28 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On Thu, August 02, 2012 at 15:46 (+0200), Arne Jansen
wrote:> On 02.08.2012 13:57, Liu Bo wrote:
>> Anyway, for now, our error flag has only been stored in memory, so what
>> about just keep it until we find a graceful way?
> 
> Yeah, we need this patch to restore consistency. We can define a fixed
> area on disk (e.g. behind the superblock) where we can write the flag
> to without risking the superblock.
At least we all agree that we need this patch, fine.

We don''t yet agree that we need a place to store a "please
consider fsck" flag.
Can I please get one concrete example in which situation we

a) do detect the user should really do a file system check AND
b) do not abort the transaction to clean the mess up?

(An example on how we could fail transaction cleanup is also accepted).

If such a situation doesn''t exist, there''s no need for this
flag. The fact that
ext has such a flag doesn''t convince me, probably because I know
nothing about
ext. I can imagine that they can detect file system errors without the ability
to return to a potentially older consistent state.

Thanks,
-Jan
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

cwillu

2012-Aug-02 15:06 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On Thu, Aug 2, 2012 at 7:46 AM, Arne Jansen <sensille@gmx.net>
wrote:> On 02.08.2012 13:57, Liu Bo wrote:
>> On 08/02/2012 07:40 PM, Arne Jansen wrote:
>>> On 02.08.2012 13:34, Liu Bo wrote:
>>>> On 08/02/2012 07:18 PM, Arne Jansen wrote:
>>>>> On 02.08.2012 12:36, Liu Bo wrote:
>>>>>> On 08/02/2012 06:30 PM, Stefan Behrens wrote:
>>>>>>> On Wed, 01 Aug 2012 16:31:54 +0200, Stefan Behrens
wrote:
>>>>>>>> On Wed, 01 Aug 2012 21:31:58 +0800, Liu Bo
wrote:
>>>>>>>>> On 08/01/2012 09:07 PM, Jan Schmidt wrote:
>>>>>>>>>> On Wed, August 01, 2012 at 14:02
(+0200), Liu Bo wrote:
>>>>>>>>>>> On 08/01/2012 07:45 PM, Stefan
Behrens wrote:
>>>>>>>>>>>> With commit acce952b0, btrfs
was changed to flag the filesystem with
>>>>>>>>>>>> BTRFS_SUPER_FLAG_ERROR and
switch to read-only mode after a fatal
>>>>>>>>>>>> error happened like a write I/O
errors of all mirrors.
>>>>>>>>>>>> In such situations, on unmount,
the superblock is written in
>>>>>>>>>>>> btrfs_error_commit_super().
This is done with the intention to be able
>>>>>>>>>>>> to evaluate the error flag on
the next mount. A warning is printed
>>>>>>>>>>>> in this case during the next
mount and the log tree is ignored.
>>>>>>>>>>>>
>>>>>>>>>>>> The issue is that it is
possible that the superblock points to a root
>>>>>>>>>>>> that was not written (due to
write I/O errors).
>>>>>>>>>>>> The result is that the
filesystem cannot be mounted. btrfsck also does
>>>>>>>>>>>> not start and all the other
btrfs-progs tools fail to start as well.
>>>>>>>>>>>> However, mount -o recovery is
working well and does the right things
>>>>>>>>>>>> to recover the filesystem
(i.e., don''t use the log root, clear the
>>>>>>>>>>>> free space cache and use the
next mountable root that is stored in the
>>>>>>>>>>>> root backup array).
>>>>>>>>>>>>
>>>>>>>>>>>> This patch removes the writing
of the superblock when
>>>>>>>>>>>> BTRFS_SUPER_FLAG_ERROR is set,
and removes the handling of the error
>>>>>>>>>>>> flag in the mount function.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Yes, I have to admit that this can
be a serious problem.
>>>>>>>>>>>
>>>>>>>>>>> But we''ll need to send the
error flag stored in the super block into
>>>>>>>>>>> disk in the future so that the next
mount can find it unstable and do
>>>>>>>>>>> fsck by itself maybe.
>>>>>>>>>>
>>>>>>>>>> Hum, that''s possible. However,
I neither see
>>>>>>>>>>
>>>>>>>>>> a) a safe way to get that flag to disk
>>>>>>>>>>
>>>>>>>>>> nor
>>>>>>>>>>
>>>>>>>>>> b) a situation where this flag would
help. When we abort a transaction, we just
>>>>>>>>>> roll everything back to the last
commit, i.e. a consistent state. So if we stop
>>>>>>>>>> writing a potentially corrupt super
block, we should be fine anyway. Or am I
>>>>>>>>>> missing something?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I''m just wondering if we can roll
everything back well, why do we need fsck?
>>>>>>>>
>>>>>>>> If the disks support barriers, we roll
everything back very well. The
>>>>>>>> most recent superblock on the disks always
defines a consistent
>>>>>>>> filesystem state. There are only two remaining
filesystem consistency
>>>>>>>> issues left that can cause inconsistent states,
one is the one that the
>>>>>>>> patch in this email addresses, and the second
one is that the error
>>>>>>>> result from barrier_all_devices() is ignored
(which I want to change next).
>>>>>>>
>>>>>>> Hi Liu Bo,
>>>>>>>
>>>>>>> Do you have any remaining objections to that patch?
>>>>>>>
>>>>>>
>>>>>> Hi Stefan,
>>>>>>
>>>>>> Still I have another question:
>>>>>>
>>>>>> Our metadata can be flushed into disk if we reach the
limit, 32k, so we
>>>>>> can end up with updated metadata and the latest
superblock if we do not
>>>>>> write the current super block.
>>>>>
>>>>> The old metadata stays valid until the new superblock is
written,
>>>>> so no problem here, or maybe I don''t understand
your question :)
>>>>>
>>>>
>>>> Yeah, Arne, you''re right :)
>>>>
>>>> But for undetected and unexpected errors as Arne had mentioned,
I want
>>>> to keep the error flag which is able to inform users that this
FS is
>>>> recommended (but not must) to do fsck at least.
>>>
>>> How about storing the flag in a different location than the
superblock?
>>> If the fs is in an unknown state, every write potentially makes it
only
>>> worse.
>>>
>>
>> IMO it does not make sense if we don''t write the flag into
disk, and on
>> ext4''s side, it just tries to write the super block.
>>
>> Anyway, for now, our error flag has only been stored in memory, so what
>> about just keep it until we find a graceful way?
>
> Yeah, we need this patch to restore consistency. We can define a fixed
> area on disk (e.g. behind the superblock) where we can write the flag
> to without risking the superblock.
Is there a reason btrfs_error_commit_super couldn''t do the as treelog:
update only the first superblock via max_mirrors=1?  I''d expect that
fsck, -o recovery and so forth should all handle this correctly
already, and we even have documentation that discusses it.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Liu Bo

2012-Aug-03 03:20 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

On 08/02/2012 10:28 PM, Jan Schmidt wrote:> On Thu, August 02, 2012 at 15:46 (+0200), Arne Jansen wrote:
>> On 02.08.2012 13:57, Liu Bo wrote:
>>> Anyway, for now, our error flag has only been stored in memory, so
what
>>> about just keep it until we find a graceful way?
>>
>> Yeah, we need this patch to restore consistency. We can define a fixed
>> area on disk (e.g. behind the superblock) where we can write the flag
>> to without risking the superblock.
> 
> At least we all agree that we need this patch, fine.
> 
> We don''t yet agree that we need a place to store a "please
consider fsck" flag.
> Can I please get one concrete example in which situation we
> 
> a) do detect the user should really do a file system check AND
> b) do not abort the transaction to clean the mess up?
> 
> (An example on how we could fail transaction cleanup is also accepted).
> 
Unfortunately I don''t have such an example either.

Since we always get COW on metadata, I believe that it''s ok to just
roll
back on failure.
> If such a situation doesn''t exist, there''s no need for
this flag. The fact that
> ext has such a flag doesn''t convince me, probably because I know
nothing about
> ext. I can imagine that they can detect file system errors without the
ability
> to return to a potentially older consistent state.
> 
This error flag is also used to indicate filesystem''s error state for
transaction cleanup, so keeping it in memory is reasonable.

thanks,
liubo
> Thanks,
> -Jan
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Jan Schmidt

2012-Aug-03 04:59 UTC

head link

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

[added Josef]

Josef: Please queue this patch for rc2, we finally agreed.

On Fri, August 03, 2012 at 05:20 (+0200), Liu Bo wrote:> On 08/02/2012 10:28 PM, Jan Schmidt wrote:
>> If such a situation doesn''t exist, there''s no need
for this flag. The fact that
>> ext has such a flag doesn''t convince me, probably because I
know nothing about
>> ext. I can imagine that they can detect file system errors without the
ability
>> to return to a potentially older consistent state.
>>
> 
> This error flag is also used to indicate filesystem''s error state
for
> transaction cleanup, so keeping it in memory is reasonable.
Granted :-)

-Jan
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Btrfs devel - Aug 2012 - [PATCH v2] Btrfs: remove superblock writing after fatal error

[PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error

Re: [PATCH v2] Btrfs: remove superblock writing after fatal error