thr3ads.net - Btrfs devel - [PATCH] Btrfs: relocate csums properly with prealloc extents [Sep 2013]

Josef Bacik

2013-Sep-27 13:37 UTC

[PATCH] Btrfs: relocate csums properly with prealloc extents

A user reported a problem where they were getting csum errors when running a
balance and running systemd's journal.  This is because systemd is awesome and
fallocate()'s its log space and writes into it.  Unfortunately we assume that
when we read in all the csums for an extent that they are sequential starting at
the bytenr we care about.  This obviously isn't the case for prealloc extents,
where we could have written to the middle of the prealloc extent only, which
means the csum would be for the bytenr in the middle of our range and not the
front of our range.  Fix this by offsetting the new bytenr we are logging to
based on the original bytenr the csum was for.  With this patch I no longer see
the csum errors I was seeing.  Thanks,

Cc: stable@vger.kernel.org
Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
---
 fs/btrfs/relocation.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 5ca7ea9..b7afeaa 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -4472,6 +4472,7 @@ int btrfs_reloc_clone_csums(struct inode *inode, u64 file_pos, u64 len)
 	struct btrfs_root *root = BTRFS_I(inode)->root;
 	int ret;
 	u64 disk_bytenr;
+	u64 new_bytenr;
 	LIST_HEAD(list);
 
 	ordered = btrfs_lookup_ordered_extent(inode, file_pos);
@@ -4483,13 +4484,24 @@ int btrfs_reloc_clone_csums(struct inode *inode, u64 file_pos, u64 len)
 	if (ret)
 		goto out;
 
-	disk_bytenr = ordered->start;
 	while (!list_empty(&list)) {
 		sums = list_entry(list.next, struct btrfs_ordered_sum, list);
 		list_del_init(&sums->list);
 
-		sums->bytenr = disk_bytenr;
-		disk_bytenr += sums->len;
+		/*
+		 * We need to offset the new_bytenr based on where the csum is.
+		 * We need to do this because we will read in entire prealloc
+		 * extents but we may have written to say the middle of the
+		 * prealloc extent, so we need to make sure the csum goes with
+		 * the right disk offset.
+		 *
+		 * We can do this because the data reloc inode refers strictly
+		 * to the on disk bytes, so we don't have to worry about
+		 * disk_len vs real len like with real inodes since it's all
+		 * disk length.
+		 */
+		new_bytenr = ordered->start + (sums->bytenr - disk_bytenr);
+		sums->bytenr = new_bytenr;
 
 		btrfs_add_ordered_sum(inode, ordered, sums);
 	}
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
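
For readers following the arithmetic in the patch above, here is a minimal
userspace sketch of the two ways of assigning a relocated bytenr. It is not
kernel code: struct csum_run and both helper names are hypothetical stand-ins
for illustration, and old_start plays the role of disk_bytenr in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of a checksum run used here. */
struct csum_run {
	uint64_t bytenr; /* disk byte the checksums describe */
	uint64_t len;    /* number of bytes covered by this run */
};

/*
 * Pre-patch behaviour: pack the runs back to back starting at new_start,
 * which is only right if the first run begins at the start of the extent
 * and there are no gaps between runs.
 */
static void relocate_sequential(struct csum_run *runs, int n,
				uint64_t new_start)
{
	uint64_t next = new_start;

	for (int i = 0; i < n; i++) {
		runs[i].bytenr = next;
		next += runs[i].len;
	}
}

/*
 * Patched behaviour: keep each run at the same distance from the start
 * of the range that it had before relocation.
 */
static void relocate_offset(struct csum_run *runs, int n,
			    uint64_t old_start, uint64_t new_start)
{
	for (int i = 0; i < n; i++) {
		assert(runs[i].bytenr >= old_start);
		runs[i].bytenr = new_start + (runs[i].bytenr - old_start);
	}
}
```

With a single run that sits 64KiB into the old range, relocate_sequential()
places it at new_start, while relocate_offset() places it at
new_start + 65536, which is where the data it describes actually ends up.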

Johannes Hirte

2013-Oct-04 21:19 UTC

Re: [PATCH] Btrfs: relocate csums properly with prealloc extents

On Fri, 27 Sep 2013 09:37:00 -0400
Josef Bacik <jbacik@fusionio.com> wrote:
> A user reported a problem where they were getting csum errors when
> running a balance and running systemd's journal.  This is because
> systemd is awesome and fallocate()'s its log space and writes into
> it.  Unfortunately we assume that when we read in all the csums for
> an extent that they are sequential starting at the bytenr we care
> about.  This obviously isn't the case for prealloc extents, where we
> could have written to the middle of the prealloc extent only, which
> means the csum would be for the bytenr in the middle of our range and
> not the front of our range.  Fix this by offsetting the new bytenr we
> are logging to based on the original bytenr the csum was for.  With
> this patch I no longer see the csum errors I was seeing.  Thanks,

Any assessment of when this goes upstream? Until it hits Linus' tree it
won't appear in stable. And this seems rather important.

Hans-Kristian Bakke

2013-Oct-23 21:24 UTC

Re: [PATCH] Btrfs: relocate csums properly with prealloc extents

I was hit by this when trying to rebalance a 16TB RAID10 to 32TB
RAID10 going from 4 to 8 WD SE 4TB drives today. I cannot finish a
rebalance because of failed csum.

[10228.850910] BTRFS info (device sdq): csum failed ino 487 off 65536 csum 2566472073 private 151366068
[10228.850967] BTRFS info (device sdq): csum failed ino 487 off 69632 csum 2566472073 private 3056924305
[10228.850973] BTRFS info (device sdq): csum failed ino 487 off 593920 csum 2566472073 private 906093395
[10228.851004] BTRFS info (device sdq): csum failed ino 487 off 73728 csum 2566472073 private 2680502892
[10228.851014] BTRFS info (device sdq): csum failed ino 487 off 598016 csum 2566472073 private 1940162924
[10228.851029] BTRFS info (device sdq): csum failed ino 487 off 77824 csum 2566472073 private 2939385278
[10228.851051] BTRFS info (device sdq): csum failed ino 487 off 602112 csum 2566472073 private 645310077
[10228.851055] BTRFS info (device sdq): csum failed ino 487 off 81920 csum 2566472073 private 3600741549
[10228.851078] BTRFS info (device sdq): csum failed ino 487 off 86016 csum 2566472073 private 200201951
[10228.851091] BTRFS info (device sdq): csum failed ino 487 off 606208 csum 2566472073 private 1002916440

The system is running a scrub now and I will return with some more
details later. I do not think systemd is logging to this volume, but
the scrub will probably show which files are affected.

As this is a very serious issue for those hit by the corruption (it
basically makes it impossible to run rebalance with all its
consequences) hopefully this will go upstream soon.
I am on kernel 3.11.6 by the way.
Best regards

Hans-Kristian Bakke
Mob: 91 76 17 38



Hans-Kristian Bakke

2013-Oct-23 21:49 UTC

Re: [PATCH] Btrfs: relocate csums properly with prealloc extents

OK. btrfs scrub and dmesg are hitting me with lots of unfixable errors,
all in the same file. Example:

[13313.441091] btrfs: unable to fixup (regular) error at logical 560107954176 on dev /dev/sdn
[13321.532223] scrub_handle_errored_block: 1510 callbacks suppressed
[13321.532309] btrfs_dev_stat_print_on_error: 1510 callbacks suppressed
[13321.532314] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40016, gen 0
[13321.532420] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40017, gen 0
[13321.532545] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40018, gen 0
[13321.532605] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40019, gen 0
[13321.533039] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40020, gen 0
[13321.537519] scrub_handle_errored_block: 1508 callbacks suppressed
[13321.537525] btrfs: unable to fixup (regular) error at logical 560630136832 on dev /dev/sdq
[13321.537821] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40021, gen 0
[13321.538081] btrfs: unable to fixup (regular) error at logical 560630140928 on dev /dev/sdq
[13321.538438] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40022, gen 0
[13321.538715] btrfs: unable to fixup (regular) error at logical 560630145024 on dev /dev/sdq
[13321.539016] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40023, gen 0
[13321.539234] btrfs: unable to fixup (regular) error at logical 560630149120 on dev /dev/sdq
[13321.539522] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40024, gen 0
[13321.539739] btrfs: unable to fixup (regular) error at logical 560630153216 on dev /dev/sdq
[13321.540027] btrfs: bdev /dev/sdq errs: wr 0, rd 0, flush 0, corrupt 40025, gen 0
[13321.540242] btrfs: unable to fixup (regular) error at logical 560630157312 on dev /dev/sdq
[13321.540620] btrfs: unable to fixup (regular) error at logical 560630161408 on dev /dev/sdq
[13321.541140] btrfs: unable to fixup (regular) error at logical 560630165504 on dev /dev/sdq
[13321.541571] btrfs: unable to fixup (regular) error at logical 560630169600 on dev /dev/sdq
[13321.541931] btrfs: unable to fixup (regular) error at logical 560630173696 on dev /dev/sdq

Luckily all the corruption seems to be in a single very large file,
but in different parts of it on different disks. The file was written
by rtorrent, which has the option "system.file_allocate.set = yes"
configured.
I also have samba configured with "strict allocate = yes" because it
is recommended for best performance on extent-based filesystems. Does
that mean even samba files are vulnerable to this corruption too?
If so this could become very ugly very fast on certain systems.
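
For reference, the access pattern named in the patch description (preallocate,
then write only into the middle) looks roughly like this from userspace. This
is a minimal sketch using POSIX posix_fallocate() and pwrite(); the helper
name and its parameters are made up for illustration, and nothing in this
thread confirms that rtorrent or samba issue exactly these calls. Note also
that on filesystems without native fallocate support glibc may emulate
posix_fallocate() by writing to the file, in which case no prealloc extent is
involved.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: reserve `total` bytes for fd, then write `len`
 * bytes at offset `off`, leaving the rest of the reservation unwritten.
 * Returns 0 on success, -1 on failure.
 */
static int prealloc_then_write_middle(int fd, off_t total, off_t off,
				      const void *buf, size_t len)
{
	/* The write must fit inside the preallocated range. */
	assert(off >= 0 && (off_t)len <= total - off);

	/* posix_fallocate() returns 0 on success, an error number otherwise. */
	if (posix_fallocate(fd, 0, total) != 0)
		return -1;

	/* Only this block gets data; everything around it stays preallocated. */
	if (pwrite(fd, buf, len, off) != (ssize_t)len)
		return -1;

	return 0;
}
```

Calling it with total = 1MiB, off = 65536 and a 4096-byte buffer leaves a file
whose only written block starts 64KiB into the reserved range.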

Best regards

Hans-Kristian Bakke



David Sterba

2013-Oct-24 14:08 UTC

Re: [PATCH] Btrfs: relocate csums properly with prealloc extents - for 3.12-rc

Hi Chris,

this needs to go to 3.12, the patch is only in btrfs-next. The bug can
happen with systemd journal + balance, the fix helps quite a lot of
users out there. (https://bugzilla.kernel.org/show_bug.cgi?id=63411)

I have cherry-picked the patch to current master, applies cleanly and
the test btrfs/013 passes, here's my

Tested-by: David Sterba <dsterba@suse.cz>

david

Hans-Kristian Bakke

2013-Oct-24 16:19 UTC

Re: [PATCH] Btrfs: relocate csums properly with prealloc extents

The result of the scrubbing came back today and it was not pretty:
...
scrub done for b64daec7-6c14-4996-94b3-80c6abfa26ce
        scrub started at Wed Oct 23 23:01:22 2013 and finished after 34990 seconds
        total bytes scrubbed: 12.55TB with 3859542 errors
        error details: csum=3859542
        corrected errors: 0, uncorrectable errors: 3859542, unverified errors: 0
---

Still only two folder structures affected, but seemingly unrecoverable.
I noticed the mail asking to include it in 3.12. Yippee!
Until this is included I will have to postpone rebalancing over the
four new drives.


Best regards

Hans-Kristian Bakke



David Sterba

2013-Nov-25 16:51 UTC

Re: [PATCH] Btrfs: relocate csums properly with prealloc extents

On Fri, Sep 27, 2013 at 09:37:00AM -0400, Josef Bacik wrote:
> A user reported a problem where they were getting csum errors when running a
> balance and running systemd's journal.  This is because systemd is awesome and
> fallocate()'s its log space and writes into it.  Unfortunately we assume that
> when we read in all the csums for an extent that they are sequential starting at
> the bytenr we care about.  This obviously isn't the case for prealloc extents,
> where we could have written to the middle of the prealloc extent only, which
> means the csum would be for the bytenr in the middle of our range and not the
> front of our range.  Fix this by offsetting the new bytenr we are logging to
> based on the original bytenr the csum was for.  With this patch I no longer see
> the csum errors I was seeing.  Thanks,
>
> Cc: stable@vger.kernel.org

The patch had the right CC but I don't see it in the mail's CC list (now
added by me). I'm afraid that this never reached stable and explains why
the patch did not end up in 3.12.1.

Stable team, please add this patch to 3.12.x, the commit id is

 4577b014d1bc3db386da3246f625888fc48083a9

thanks,
david

Greg KH

2013-Nov-25 21:01 UTC

Re: [PATCH] Btrfs: relocate csums properly with prealloc extents

On Mon, Nov 25, 2013 at 05:51:16PM +0100, David Sterba wrote:
> On Fri, Sep 27, 2013 at 09:37:00AM -0400, Josef Bacik wrote:
> > A user reported a problem where they were getting csum errors when running a
> > balance and running systemd's journal.  This is because systemd is awesome and
> > fallocate()'s its log space and writes into it.  Unfortunately we assume that
> > when we read in all the csums for an extent that they are sequential starting at
> > the bytenr we care about.  This obviously isn't the case for prealloc extents,
> > where we could have written to the middle of the prealloc extent only, which
> > means the csum would be for the bytenr in the middle of our range and not the
> > front of our range.  Fix this by offsetting the new bytenr we are logging to
> > based on the original bytenr the csum was for.  With this patch I no longer see
> > the csum errors I was seeing.  Thanks,
> >
> > Cc: stable@vger.kernel.org
>
> The patch had the right CC but I don't see it in the mail's CC list (now
> added by me). I'm afraid that this never reached stable and explains why
> the patch did not end up in 3.12.1.

No, it made it to my list, I was waiting for 3.13-rc1 to come out with
this patch in it before I could queue it up.  Don't worry, it's not
lost.

thanks,

greg k-h

Goffredo Baroncelli

2013-Nov-30 07:39 UTC

Re: [PATCH] Btrfs: relocate csums properly with prealloc extents

On 2013-11-25 22:01, Greg KH wrote:
> On Mon, Nov 25, 2013 at 05:51:16PM +0100, David Sterba wrote:
>> On Fri, Sep 27, 2013 at 09:37:00AM -0400, Josef Bacik wrote:
>>> A user reported a problem where they were getting csum errors when running a
>>> balance and running systemd's journal.  This is because systemd is awesome and
>>> fallocate()'s its log space and writes into it.  Unfortunately we assume that
>>> when we read in all the csums for an extent that they are sequential starting at
>>> the bytenr we care about.  This obviously isn't the case for prealloc extents,
>>> where we could have written to the middle of the prealloc extent only, which
>>> means the csum would be for the bytenr in the middle of our range and not the
>>> front of our range.  Fix this by offsetting the new bytenr we are logging to
>>> based on the original bytenr the csum was for.  With this patch I no longer see
>>> the csum errors I was seeing.  Thanks,
>>>
>>> Cc: stable@vger.kernel.org
>>
>> The patch had the right CC but I don't see it in the mail's CC list (now
>> added by me). I'm afraid that this never reached stable and explains why
>> the patch did not end up in 3.12.1.
>
> No, it made it to my list, I was waiting for 3.13-rc1 to come out with
> this patch in it before I could queue it up.  Don't worry, it's not
> lost.

The patch landed in 3.12.2

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
